Behaviormetrics: Quantitative Approaches to Human Behavior 2
Haruhiko Ogasawara
Expository Moments for Pseudo Distributions
Behaviormetrics: Quantitative Approaches to Human Behavior Volume 2
Series Editor Akinori Okada, Professor Emeritus, Rikkyo University, Tokyo, Japan
This series covers in their entirety the elements of behaviormetrics, a term that encompasses all quantitative approaches of research to disclose and understand human behavior in the broadest sense. The term includes the concept, theory, model, algorithm, method, and application of quantitative approaches from theoretical or conceptual studies to empirical or practical application studies to comprehend human behavior. The Behaviormetrics series deals with a wide range of topics of data analysis and of developing new models, algorithms, and methods to analyze these data. The characteristics featured in the series have four aspects. The first is the variety of the methods utilized in data analysis and a newly developed method that includes not only standard or general statistical methods or psychometric methods traditionally used in data analysis, but also includes cluster analysis, multidimensional scaling, machine learning, corresponding analysis, biplot, network analysis and graph theory, conjoint measurement, biclustering, visualization, and data and web mining. The second aspect is the variety of types of data including ranking, categorical, preference, functional, angle, contextual, nominal, multi-mode multi-way, contextual, continuous, discrete, high-dimensional, and sparse data. The third comprises the varied procedures by which the data are collected: by survey, experiment, sensor devices, and purchase records, and other means. The fourth aspect of the Behaviormetrics series is the diversity of fields from which the data are derived, including marketing and consumer behavior, sociology, psychology, education, archaeology, medicine, economics, political and policy science, cognitive science, public administration, pharmacy, engineering, urban planning, agriculture and forestry science, and brain science. 
In essence, the purpose of this series is to describe the new horizons opening up in behaviormetrics — approaches to understanding and disclosing human behaviors both in the analyses of diverse data by a wide range of methods and in the development of new methods to analyze these data. Editor in Chief Akinori Okada (Rikkyo University) Managing Editors Daniel Baier (University of Bayreuth) Giuseppe Bove (Roma Tre University) Takahiro Hoshino (Keio University)
Haruhiko Ogasawara Professor Emeritus Otaru University of Commerce Otaru, Hokkaido, Japan
ISSN 2524-4027  ISSN 2524-4035 (electronic)
Behaviormetrics: Quantitative Approaches to Human Behavior
ISBN 978-981-19-3524-4  ISBN 978-981-19-3525-1 (eBook)
https://doi.org/10.1007/978-981-19-3525-1

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
Early in 2021, the following two papers were published online:

Ogasawara, H. (2021a). Unified and non-recursive formulas for moments of the normal distribution with stripe truncation. Communications in Statistics—Theory and Methods. https://doi.org/10.1080/03610926.2020.1867742.

Ogasawara, H. (2021b). A non-recursive formula for various moments of the multivariate normal distribution with sectional truncation. Journal of Multivariate Analysis. https://doi.org/10.1016/j.jmva.2021.104729.

This book is a collection of explications, extensions and generalizations of these papers, and in that sense resembles an edited volume of several new papers by a single author. The two papers deal with moments under normality with stripe and sectional truncation for the univariate and multivariate cases, respectively, where the new truncation forms include the usual single and double tail truncation as special cases. In the latter (2021b) paper, sectional truncation yields a pseudo-normal (PN) family of distributions, which can be seen as an extension of the skew-normal (SN), whose statistical aspects were derived in the seminal paper of Azzalini (1985). The PN is also seen as an extension of the closed skew-normal (CSN) family obtained by Domínguez-Molina, González-Farías and Gupta (2003) and associated papers, where the CSN includes the SN and various other distributions as special cases. The PN is the "normal" version of the "pseudo distributions" appearing in the title of this book. One of the new features of the PN over the SN and CSN is that the PN provides symmetric non-normal distributions, whereas in the SN and CSN, symmetric distributions reduce to normal ones. In this book, the author uses the term "kurtic normal (KN)," where the coined word "kurtic" stands for "mesokurtic, leptokurtic or platykurtic" as used in statistics. This new feature was made possible by employing stripe/sectional truncation.
It is known that before the advent of Azzalini (1985), there were precursors with the same probability density function as that of the SN, e.g., Birnbaum (1950) in mathematical statistics. In an associated paper in psychometrics, or more generally behaviormetrics, Birnbaum, Paulson and Andrews (1950) dealt with truncation in, e.g., entrance examinations and personnel selection, and introduced the term "general truncation," which is similar to sectional truncation; there, estimation of untruncated moments from truncated moments was a central problem.

This book consists of ten chapters focusing on moments associated with truncation. In Chap. 1, the sectionally truncated normal vector is dealt with, where the derivatives of the cumulative distribution functions with respect to the variables in the moment generating functions (mgf's) are presented. Chapters 2 and 3 give two new distributions, the real-valued Poisson and the basic parabolic cylinder distributions, respectively. These new distributions are introduced to derive, e.g., absolute moments of real-valued orders. Chapter 4 gives the closed properties and moments of the PN. In Chap. 5, the KN is discussed. Following that discussion, Chap. 6 gives a new family called the normal-normal (NN) family of distributions, which can approximate the PN using finite/infinite mixtures with the weights of normal densities. It is known that the SN-distributed variable can be decomposed into truncated and untruncated normal independent variables, a result derived by Henze (1986). In Chap. 7, it is shown that we have decompositions of the PN and NN similar to the Henze theorem. Another aspect of the decomposition concerns the mgf of the PN, where a pseudo mgf is employed for a pseudo or complex-valued variable, which considerably reduces the computation of the cumulants. Sectional truncation can also be used in the PN and NN, which is shown in Chap. 8. Chapter 9 gives some preliminary explanations of the Student t- and the pseudo t- (PT) distributions, the latter corresponding to the PN.
The last chapter deals with multivariate measures of skewness and kurtosis, with some expository explanations of the Kronecker product, the vectorizing operator and the commutation matrix. Throughout the book, proofs are not omitted and are often given by more than one method for expository purposes. Thus, readers can start with any chapter following their interests without difficulty. Some aspects are repeatedly presented in similar ways, which is also done for exposition. As mentioned earlier, this book is based on recent works of the author on moments and truncation (for associated papers, see https://www.otaru-uc.ac.jp/~emt-hogasa/), which have been motivated by communications with researchers in this field. Comments on the author's works by Professor Nicola Loperfido (Università degli Studi di Urbino "Carlo Bo") have guided the author fruitfully with his constructive suggestions. Discussions on computational aspects and software associated with the moments of the normal vector under sectional truncation with Professor Adelchi Azzalini (Università degli Studi di Padova), Professor Njål Foldnes (University of Stavanger) and Dr. Christian Galarza (Escuela Superior Politécnica del Litoral—ESPOL) are highly appreciated. I regret that I have not fulfilled the promised revision of my software associated with sectional truncation; this book is an intermediate report toward that revision. I am directly indebted to Professor Akinori Okada (Rikkyo University), the series editor of Behaviormetrics, for this book. Without his invitation and encouragement, this book would not have been realized. Last but not least, I deeply thank Professor Takahiro Terasaka (Otaru University of Commerce) for discussions on mathematical statistics and on academic environments, which have continued unchanged since my retirement in 2017.

Otaru, Japan
March 2022
Haruhiko Ogasawara
References

Azzalini A (1985) A class of distributions which includes the normal ones. Scand J Stat 12:171–178
Birnbaum ZW (1950) Effect of linear truncation on a multinormal population. Ann Math Stat 21:272–279
Birnbaum ZW, Paulson E, Andrews FC (1950) On the effect of selection performed on some coordinates of a multi-dimensional population. Psychometrika 15:191–204
Domínguez-Molina A, González-Farías G, Gupta AK (2003) The multivariate closed skew normal distribution. Technical report 03-12. Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, OH
Henze N (1986) A probabilistic representation of the 'skew-normal' distribution. Scand J Stat 13:271–275
Contents

1 The Sectionally Truncated Normal Distribution ..... 1
  1.1 Introduction ..... 1
  1.2 The Probability Density Function (PDF) and the Moment Generating Function for the Sectionally Truncated Normal Vector ..... 3
  1.3 Partial Derivatives of the Cumulative Distribution Function of the Normal Random Vector ..... 6
  1.4 Moments and Cumulants of the STN-Distributed Vector Using the MGF ..... 14
  1.5 The Product Sum of Natural Numbers and the Hermite Polynomials ..... 29
  References ..... 43

2 Normal Moments Under Stripe Truncation and the Real-Valued Poisson Distribution ..... 47
  2.1 Introduction ..... 47
  2.2 Closed Formulas for Moments of Integer-Valued Orders ..... 49
  2.3 Series Expressions of $I_k^{(r)}$ $(k = 0, 1, \ldots;\ r = 1, \ldots, R)$ for Moments of Integer-Valued Orders ..... 54
  2.4 The Real-Valued Poisson Distribution for Series Expressions of $I_k^{(r)}$ $(k = 0, 1, \ldots;\ r = 1, \ldots, R)$ for Absolute Moments ..... 56
    2.4.1 Generalization of the Poisson Distribution ..... 56
    2.4.2 The Real-Valued Poisson Distribution ..... 57
    2.4.3 Applications to the Series Expressions of the Moments of the Normal Distribution ..... 62
  2.5 Remarks ..... 67
  References ..... 68

3 The Basic Parabolic Cylinder Distribution and Its Multivariate Extension ..... 69
  3.1 Introduction ..... 69
  3.2 The BPC Distribution of the Third Kind and Its CDF ..... 70
  3.3 Moments of the BPC Distribution ..... 76
  3.4 The Mode and the Shapes of the PDFs of the BPC Distribution ..... 84
  3.5 The Multivariate BPC Distribution ..... 85
  3.6 Numerical Illustrations ..... 91
  3.7 Discussion ..... 94
  3.8 R-Functions ..... 96
    3.8.1 The R-Function wpc for the Weighted Parabolic Cylinder Function ..... 96
    3.8.2 The R-Functions bpc1n and bpc2n for the Normalizers of the Uni- and Bivariate BPC Distributions ..... 98
    3.8.3 The R-Functions dbpc1 and dbpc2 for the PDFs of the Uni- and Bivariate BPC Distributions ..... 100
    3.8.4 The R-Function bpc2d for the CDF of the Bivariate BPC Distribution ..... 100
  References ..... 102

4 The Pseudo-Normal (PN) Distribution ..... 105
  4.1 Introduction ..... 105
  4.2 The PDF of the PN Distribution ..... 106
  4.3 The Moment Generating Functions (MGFs) ..... 114
    4.3.1 The MGF of the PN-Distributed Vector ..... 114
    4.3.2 The MGF of $Y^{\mathrm T}CY$ ..... 116
    4.3.3 The MGF of $YY^{\mathrm T}$ ..... 118
  4.4 Closed Properties of the PN ..... 119
    4.4.1 The Closure of Affine Transformations of the PN-Distributed Vector ..... 119
    4.4.2 Marginal and Conditional Distributions ..... 123
    4.4.3 Independent Random Vectors and Sums ..... 124
    4.4.4 Summary ..... 125
  4.5 Moments and Cumulants of the PN ..... 125
    4.5.1 General Results for Cumulants ..... 125
    4.5.2 Moments and Cumulants When q = 1 ..... 129
  4.6 The Distribution Function of the PN ..... 142
  References ..... 146

5 The Kurtic-Normal (KN) Distribution ..... 149
  5.1 Introduction ..... 149
  5.2 The Limiting Distributions of the KN ..... 154
  5.3 Moments and Cumulants of the KN ..... 159
  References ..... 169

6 The Normal-Normal (NN) Distribution ..... 171
  6.1 Introduction ..... 171
  6.2 The MGFs of the NN ..... 174
  6.3 Closed Properties of the NN ..... 176
  6.4 Cumulants of the NN ..... 177
  6.5 Alternative Expressions of the PDF of the NN: Mixture, Convolution and Regression ..... 186
  6.6 Moment-Equating for the PN and NN ..... 198
    6.6.1 The SN and NN ..... 198
    6.6.2 The Multivariate PN and NN with Exchangeable Variables ..... 206
  Reference ..... 213

7 The Decompositions of the PN- and NN-Distributed Variables ..... 215
  7.1 Decomposition of the PN ..... 215
  7.2 Decomposition of the NN ..... 222
  7.3 Multivariate Hermite Polynomials ..... 226
  7.4 Normal-Reduced and Normal-Added PN and NN ..... 231
  References ..... 232

8 The Truncated Pseudo-Normal (TPN) and Truncated Normal-Normal (TNN) Distributions ..... 235
  8.1 Introduction ..... 235
  8.2 Moment Generating Functions for the TPN Distribution ..... 238
  8.3 Properties of the TPN ..... 241
    8.3.1 Affine Transformation of the TPN Vector ..... 241
    8.3.2 Marginal and Conditional Distributions of the TPN Vector ..... 242
  8.4 Moments and Cumulants of the TPN ..... 243
    8.4.1 A Non-recursive Formula ..... 243
    8.4.2 A Formula Using the MGF ..... 253
    8.4.3 The Case of Sectionally Truncated SN with p = q = 1 ..... 257
  8.5 The Truncated Normal-Normal Distribution ..... 262
  References ..... 264

9 The Student t- and Pseudo t- (PT) Distributions: Various Expressions of Mixtures ..... 265
  9.1 Introduction ..... 265
  9.2 The t-Distribution ..... 266
  9.3 The Multivariate t-Distribution ..... 285
  9.4 The Pseudo t (PT)-Distribution ..... 288
    9.4.1 The PDF of the PT ..... 288
    9.4.2 Moments and Cumulants of the PT ..... 295
  References ..... 298

10 Multivariate Measures of Skewness and Kurtosis ..... 299
  10.1 Preliminaries ..... 299
  10.2 Multivariate Cumulants and Multiple Commutators ..... 309
  10.3 Multivariate Measures of Skewness and Kurtosis ..... 316
    10.3.1 Multivariate Measures of Skewness ..... 317
    10.3.2 Multivariate Measures of Excess Kurtosis ..... 321
  10.4 Elimination Matrices and Non-duplicated Multivariate Skewness and Kurtosis ..... 328
  References ..... 334

Index ..... 337
Chapter 1
The Sectionally Truncated Normal Distribution
1.1 Introduction
Let a random vector $X = (X_1, \ldots, X_p)^{\mathrm T}$ be normally distributed. Suppose that $X$ is truncated such that $X$ is selected when it belongs to one of $R$ non-overlapping sections (regions), and is truncated otherwise. The sections are denoted by
\[
\bigcup_{r=1}^{R} \{a_r \le X < b_r\} = \bigcup_{r=1}^{R} \bigcap_{i=1}^{p} \{a_{ir} \le X_i < b_{ir}\}
\]
with $a_r = (a_{1r}, \ldots, a_{pr})^{\mathrm T}$ and $b_r = (b_{1r}, \ldots, b_{pr})^{\mathrm T}$ $(r = 1, \ldots, R)$. In this case, $X$ is said to be sectionally truncated under normality [32]. Note that the regions for selection may touch each other at some points or line segments, and some or all elements of $a_r$ ($b_r$) are possibly $-\infty$ ($+\infty$). When $a_r = (-\infty, \ldots, -\infty)^{\mathrm T}$ and some element(s) of $b_r$ are finite, the upper (right) tail of $X$ is truncated, while when some element(s) of $a_r$ are finite and $b_r = (+\infty, \ldots, +\infty)^{\mathrm T}$, the lower (left) tail of $X$ is truncated. These cases are said to be singly truncated. When some pair(s) of the elements of $a_r$ and $b_r$ are finite, $X$ is doubly truncated in that both tails are truncated. So far, moments under single and double truncation have been well investigated [2, 6, 7, 10, 13–15, 18–21, 28–30, 34, 36].

Sectional truncation introduced by Ogasawara [32] is an extension of single and double truncation. In the univariate case, sectional truncation gives a zebraic or tigerish probability density function (pdf); consequently, this case is called stripe truncation (for various examples under this truncation, see Fig. 1.1 and Ogasawara [31]). Note that this extension does not yield excessive extra complication over single/double truncation, because the selected regions $\bigcup_{r=1}^{R} \{a_r \le X < b_r\}$ are given by $R$ cases of double/single truncation (selection). One of the simplest cases of sectional truncation other than double/single truncation is inner truncation (see Examples 4–10 of Fig. 1.1 and Ogasawara [31, 32]), where the inner complement of the domain for selection in double truncation is discarded, with the lower and upper tails being selected. When
[Ten panels, Examples 1–10, each showing the N(0, 1) density over the horizontal axis from −3 to 3 with the truncated stripes shaded.]
Fig. 1.1 Examples of stripe truncation in N(0, 1) with truncated areas being shaded (taken from Ogasawara [31, Fig. 1] by permission)
variables stand for risks, as in actuarial science, the behavior of the variables in the tail areas is of primary interest, as the term "tail conditional expectation" [5, Sect. 5.1; 23; 24] shows. In some or many cases, the high and low values of (risk) variables, e.g., measurements of blood pressure and pulse, are focused on for medical treatment, while "normal" patients with intermediate values of the associated risk variables are excluded from treatment, giving inner truncation. Similarly, in plant and animal breeding, combinations of high and low values of some variables are typically considered candidates for improvement by selection (for statistical aspects of breeding, see Cochran [11], Herrendörfer and Tuchscherer [17], Gianola and Rosa [16]). Note that the works of G. M. Tallis, starting from the seminal paper of Tallis [36] giving the moment generating function of the normal vector under double truncation, were motivated by breeding. Though sectional truncation gives various cases of truncation, including some of elliptical truncation [3, 37], cases under radial [37] and plane [38] truncation are not covered. In the behavioral sciences, the effects of selection due to, e.g., entrance examinations for universities and personnel selection have been investigated, where the inference of the moments of variables for achievements or abilities before truncation from those after truncation is one of the typical problems [1, 4, 8, 9, 25, 33]. Note that as early as Birnbaum et al. [9, p. 193], the term "general truncation," indicating complicated conditions for selection similar to plane and sectional truncation, is used.
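As a concrete numerical illustration of stripe truncation (a sketch, not the book's code; the particular sections chosen here are hypothetical), the normalizer and pdf of a stripe-truncated standard normal can be computed directly from the univariate normal cdf:

```python
# Sketch: normalizer and pdf of a stripe-truncated N(0, 1), standard library
# only. The sections below (inner truncation keeping both tails plus a middle
# stripe) are illustrative, not taken from the book.
import math

def Phi(x):  # standard normal cdf via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):  # standard normal pdf
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# Non-overlapping sections [a_r, b_r) for selection
sections = [(-math.inf, -1.5), (-0.5, 0.5), (1.5, math.inf)]

# alpha = sum_r {Phi(b_r) - Phi(a_r)}: the total selected probability mass
alpha = sum(Phi(b) - Phi(a) for a, b in sections)

def stripe_pdf(x):
    """pdf of the stripe-truncated N(0, 1): phi(x)/alpha on S, 0 off S."""
    selected = any(a <= x < b for a, b in sections)
    return phi(x) / alpha if selected else 0.0
```

For single or double truncation, `sections` reduces to a single pair, recovering the usual truncated normal.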
1.2 The Probability Density Function (PDF) and the Moment Generating Function for the Sectionally Truncated Normal Vector
The pdf of the normally distributed $p \times 1$ vector $X$ without truncation, denoted by $X \sim N_p(\mu, \Sigma)$, is
\[
\phi_p(X = x \mid \mu, \Sigma) = \phi_p(x \mid \mu, \Sigma)
= \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)^{\mathrm T} \Sigma^{-1} (x - \mu) \right\},
\]
where $E(X) = \mu$ and $\mathrm{cov}(X) = \Sigma$ is non-singular by assumption. Suppose that $X$ is sectionally truncated with the regions for selection $\bigcup_{r=1}^{R} \{a_r \le X < b_r\}$. Define
\[
\alpha = \sum_{r=1}^{R} \int_{a_{1r}}^{b_{1r}} \cdots \int_{a_{pr}}^{b_{pr}} \phi_p(x \mid \mu, \Sigma)\, \mathrm d x_1 \cdots \mathrm d x_p
= \sum_{r=1}^{R} \int_{a_r}^{b_r} \phi_p(x \mid \mu, \Sigma)\, \mathrm d x
\equiv \int_{A}^{B} \phi_p(x \mid \mu, \Sigma)\, \mathrm d x,
\]
where $A = (a_1, \ldots, a_R)$, $a_r = (a_{1r}, \ldots, a_{pr})^{\mathrm T}$ and $B = (b_1, \ldots, b_R)$, $b_r = (b_{1r}, \ldots, b_{pr})^{\mathrm T}$ $(r = 1, \ldots, R)$. Let $S$ be a set representing the union of the $R$ regions. Then, $\alpha$ is equal to
\[
\Pr(X \in S) = \Pr\left( \bigcup_{r=1}^{R} \{a_r \le X < b_r\} \right).
\]
Using $\alpha$, we have the following pdf of the sectionally truncated normal vector $X$.

Definition 1.1 The distribution of the $p \times 1$ sectionally truncated normal (STN) vector $X$ is defined with the notation
\[
X \sim N_p(\mu, \Sigma; A, B) = N_p^{(\alpha)}(\mu, \Sigma),
\]
when its pdf is
\[
\phi_p(x \mid \mu, \Sigma; A, B) \equiv \phi_p^{(\alpha)}(x \mid \mu, \Sigma)
= \frac{\phi_p(x \mid \mu, \Sigma)}{\int_A^B \phi_p(x \mid \mu, \Sigma)\, \mathrm d x}
= \alpha^{-1} \phi_p(x \mid \mu, \Sigma).
\]

From the above definition, it is obvious that $\alpha$ is the valid normalizer of the truncated distribution. It is seen that $0 < \alpha < 1$ under some truncation and $\alpha = 1$ if and only if $X$ is untruncated.

Theorem 1.1 [32, Corollary 4] The moment generating function (mgf) of the STN vector
\[
X \sim N_p(\mu, \Sigma; A, B)
\]
is
\[
M_X(t) = \frac{\Pr(X + \Sigma t \in S \mid \mu, \Sigma)}{\Pr(X \in S \mid \mu, \Sigma)} \exp\left( \mu^{\mathrm T} t + \frac{t^{\mathrm T} \Sigma t}{2} \right)
= \frac{\Pr(X \in S \mid \mu + \Sigma t, \Sigma)}{\alpha} \exp\left( \mu^{\mathrm T} t + \frac{t^{\mathrm T} \Sigma t}{2} \right),
\]
where $\Pr(X \in S \mid \mu, \Sigma)$ is the probability of $X \in S$ when $X \sim N_p(\mu, \Sigma)$.

Proof By definition,
\[
\begin{aligned}
M_X(t) &= E\{\exp(X^{\mathrm T} t)\}
= \alpha^{-1} \int_A^B \exp(x^{\mathrm T} t) \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)^{\mathrm T} \Sigma^{-1} (x - \mu) \right\} \mathrm d x \\
&= \alpha^{-1} \int_A^B \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu - \Sigma t)^{\mathrm T} \Sigma^{-1} (x - \mu - \Sigma t) \right\} \mathrm d x \,
\exp\left( \mu^{\mathrm T} t + \frac{t^{\mathrm T} \Sigma t}{2} \right) \\
&= \alpha^{-1} \Pr(X \in S \mid \mu + \Sigma t, \Sigma) \exp\left( \mu^{\mathrm T} t + \frac{t^{\mathrm T} \Sigma t}{2} \right).
\end{aligned}
\]
Noting that
\[
\Pr(X + \Sigma t \in S \mid \mu, \Sigma)
= \Pr\left( \bigcup_{r=1}^{R} \{a_r - \Sigma t \le X < b_r - \Sigma t\} \,\Big|\, \mu, \Sigma \right)
= \Pr(X \in S \mid \mu + \Sigma t, \Sigma),
\]
the required results follow. Q.E.D.

Remark 1.1 The factor $\exp\left( \mu^{\mathrm T} t + \frac{t^{\mathrm T} \Sigma t}{2} \right)$ in the above mgf is that of $X \sim N_p(\mu, \Sigma)$.
The cumulant generating function (cgf) of the STN is
\[
K_X(t) = \ln M_X(t) = \ln \Pr(X \in S \mid \mu + \Sigma t, \Sigma) - \ln \alpha + \mu^{\mathrm T} t + \frac{t^{\mathrm T} \Sigma t}{2}.
\]
Remark 1.2 Let the mgf in Theorem 1.1 be $M_X(t) = M_{X_1}(t) M_{X_0}(t)$, where
\[
M_{X_0}(t) = \exp\left( \mu^{\mathrm T} t + \frac{t^{\mathrm T} \Sigma t}{2} \right)
\]
is the mgf of $X_0 \sim N_p(\mu, \Sigma)$, while
\[
M_{X_1}(t) = \alpha^{-1} \Pr(X \in S \mid \mu + \Sigma t, \Sigma)
\]
is the pseudo mgf of the pseudo random vector $X_1$. Though $M_{X_1}(0) = 1$ satisfies a necessary condition for an mgf, as does $M_{X_0}(0) = 1$, $M_{X_1}(t)$ is not a valid mgf for a real-valued distribution. This is seen, e.g., when $p = 1$, by noting that from the additive property of the corresponding cgf, the variance of $X$ obtained from $\mathrm d^2 K_X(t)/\mathrm d t^2 |_{t=0}$ is equal to
\[
\frac{\mathrm d^2 K_{X_1}(t)}{\mathrm d t^2}\bigg|_{t=0} + \frac{\mathrm d^2 K_{X_0}(t)}{\mathrm d t^2}\bigg|_{t=0}
= \frac{\mathrm d^2 K_{X_1}(t)}{\mathrm d t^2}\bigg|_{t=0} + \sigma^2,
\]
where $\Sigma = \sigma^2$ in the univariate case. However, the STN-distributed variable $X$ typically has a reduced variance smaller than the untruncated $\sigma^2$, especially when tail areas are truncated. In this case, the above equation gives the negative $\mathrm d^2 K_{X_1}(t)/\mathrm d t^2 |_{t=0} = \mathrm{var}(X_1)$, motivating the terms pseudo mgf and pseudo variable $X_1$. Although $X_1$ is labeled as a pseudo random vector primarily for ease of reference and for understanding the structure of $M_X(t)$ and $K_X(t)$, $M_{X_1}(t)$ is a valid factor of $M_X(t)$. Similarly, $K_{X_1}(t)$ is a valid term of $K_X(t)$. Consequently, in the above example, the correct variance of $X$ is given by the sum of the possibly negative $\mathrm{var}(X_1)$ and $\mathrm{var}(X_0) = \sigma^2$. Further, the correct cumulants of $X$ higher than the second order are obtained only from the corresponding pseudo cumulants of $X_1$, since the cumulants beyond the second order of $X_0 \sim N_p(\mu, \Sigma)$ vanish. Though the pseudo moments of $X_1$ give correct cumulants of $X$, the moments of $X_1$ do not give correct moments of $X$.
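The variance decomposition in Remark 1.2 can be illustrated numerically for $p = 1$ with double truncation. The sketch below (illustrative values, not the book's code) obtains the pseudo variance $\mathrm{var}(X_1) = \mathrm d^2 K_{X_1}(t)/\mathrm d t^2|_{t=0}$ by a finite-difference second derivative of the pseudo cgf, and shows it is negative while $\mathrm{var}(X_1) + \sigma^2$ recovers the (positive) truncated variance:

```python
# Sketch illustrating Remark 1.2 for p = 1 under double truncation
# (not the book's code): var(X) = K''_{X1}(0) + sigma^2, with K''_{X1}(0) < 0.
import math

def Phi(x):  # standard normal cdf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

mu, sig2 = 0.0, 1.0
a, b = -1.0, 1.0                       # double truncation: keep only [-1, 1)

def pr_in_s(m):                        # Pr(a <= X < b) for X ~ N(m, 1)
    return Phi(b - m) - Phi(a - m)

def k_x1(t):                           # pseudo cgf K_{X1}(t) = ln Pr(...) - ln alpha
    return math.log(pr_in_s(mu + sig2 * t)) - math.log(pr_in_s(mu))

h = 1e-3                               # central difference for K''_{X1}(0)
pseudo_var = (k_x1(h) - 2.0 * k_x1(0.0) + k_x1(-h)) / (h * h)

var_x = pseudo_var + sig2              # Remark 1.2: var(X) = var(X1) + sigma^2
```

For the standard normal truncated to $[-1, 1)$ the truncated variance is $1 - 2\phi(1)/\{\Phi(1) - \Phi(-1)\} \approx 0.291$, so the pseudo variance is about $-0.709$.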
1.3 Partial Derivatives of the Cumulative Distribution Function of the Normal Random Vector

In this section, the partial derivatives of the cumulative distribution function (cdf) of the untruncated normal random vector up to the fourth order are given, which are required to obtain the moments of the STN-distributed vector from its mgf. Let
\[
\Phi_p(c \mid \mu, \Sigma) = \int_{-\infty}^{c} \phi_p(X \mid \mu, \Sigma)\, \mathrm d X
\]
be the cdf of the untruncated distribution of $X \sim N_p(\mu, \Sigma)$ at $X = c$. Define the $p^i \times 1$ vector
\[
\Phi_p^{(i)}(c \mid \mu, \Sigma) = \frac{\partial^i}{(\partial c)^{\langle i \rangle}} \Phi_p(c \mid \mu, \Sigma) \quad (i = 1, 2, \ldots),
\]
where $(\partial c)^{\langle i \rangle} = (\partial c) \otimes \cdots \otimes (\partial c)$ ($i$ times of $\partial c$) is the $i$-fold Kronecker product of $\partial c$; $\Phi_p^{(i)}(c \mid \mu, \Sigma)$ is the vector of the $i$-th order partial derivatives of the cdf of $X \sim N_p(\mu, \Sigma)$ at $X = c$. Note that the vector consists of the $p$ univariate and $p^i - p$ cross multivariate $i$-th order derivatives. In the following, $p$ is supposed to be large enough when an associated formula is considered.

Lemma 1.1 Let $X_{(k)} = (x_1, \ldots, x_{k-1}, x_{k+1}, \ldots, x_p)^{\mathrm T}$, with $c_{(k)}$ and $\mu_{(k)}$ defined similarly, and $\phi_1(c_k \mid \mu_k, \sigma_{kk}) = \phi(c_k \mid \mu_k, \sigma_{kk})$. Then,
\[
\begin{aligned}
\frac{\partial \Phi_p(c \mid \mu, \Sigma)}{\partial c_k}
&= \phi(c_k \mid \mu_k, \sigma_{kk})\, \Phi_{p-1}\{c_{(k)} \mid \mu_{(k)} + \sigma_{(k)k} \sigma_{kk}^{-1} (c_k - \mu_k),\ \Sigma_{(k,k)|k}\} \\
&= \int_{-\infty}^{c_{(k)}} \phi_p\{(c_k, X_{(k)}^{\mathrm T})^{\mathrm T} \mid \mu, \Sigma\}\, \mathrm d X_{(k)}
\equiv f^{(1)}(c_k) \quad (k = 1, \ldots, p),
\end{aligned}
\]
where
\[
\sigma_{(k)k} = (\sigma_{1k}, \ldots, \sigma_{k-1,k}, \sigma_{k+1,k}, \ldots, \sigma_{pk})^{\mathrm T},
\]
\[
\Sigma_{(k,k)|k} = \Sigma_{(k,k)} - \sigma_{(k)k} \sigma_{kk}^{-1} \sigma_{(k)k}^{\mathrm T}
= \mathrm{cov}(X_{(k)}) - \mathrm{cov}(X_{(k)}, x_k) \{\mathrm{var}(x_k)\}^{-1} \mathrm{cov}(x_k, X_{(k)}^{\mathrm T})
\]
and the elements of $\mu$ and $\Sigma$ are reordered.

Proof Reorder the elements of $x$ as $X = (x_k, X_{(k)}^{\mathrm T})^{\mathrm T}$.
Then, we have
\[
|\Sigma| = \begin{vmatrix} \sigma_{kk} & \sigma_{(k)k}^{\mathrm T} \\ \sigma_{(k)k} & \Sigma_{(k,k)} \end{vmatrix}
= \begin{vmatrix} 1 & 0^{\mathrm T} \\ -\sigma_{kk}^{-1} \sigma_{(k)k} & I_{p-1} \end{vmatrix}
\begin{vmatrix} \sigma_{kk} & \sigma_{(k)k}^{\mathrm T} \\ \sigma_{(k)k} & \Sigma_{(k,k)} \end{vmatrix}
= \begin{vmatrix} \sigma_{kk} & \sigma_{(k)k}^{\mathrm T} \\ 0 & \Sigma_{(k,k)} - \sigma_{kk}^{-1} \sigma_{(k)k} \sigma_{(k)k}^{\mathrm T} \end{vmatrix}
= \sigma_{kk} \left| \Sigma_{(k,k)} - \sigma_{kk}^{-1} \sigma_{(k)k} \sigma_{(k)k}^{\mathrm T} \right|
\equiv \sigma_{kk} |\Sigma_{(k,k)|k}|,
\]
where $I_{p-1}$ is the $(p-1) \times (p-1)$ identity matrix. The formula for the inverse of a partitioned matrix [26, p. 11; 39, p. 16] gives
\[
\Sigma^{-1} = \begin{pmatrix} \sigma_{kk} & \sigma_{(k)k}^{\mathrm T} \\ \sigma_{(k)k} & \Sigma_{(k,k)} \end{pmatrix}^{-1}
= \begin{pmatrix}
\sigma_{kk}^{-1} + \sigma_{kk}^{-2} \sigma_{(k)k}^{\mathrm T} \Sigma_{(k,k)|k}^{-1} \sigma_{(k)k} & -\sigma_{kk}^{-1} \sigma_{(k)k}^{\mathrm T} \Sigma_{(k,k)|k}^{-1} \\
-\sigma_{kk}^{-1} \Sigma_{(k,k)|k}^{-1} \sigma_{(k)k} & \Sigma_{(k,k)|k}^{-1}
\end{pmatrix}.
\]
Using the above results, it follows that
\[
\begin{aligned}
\phi_p\{(c_k, X_{(k)}^{\mathrm T})^{\mathrm T} \mid \mu, \Sigma\}
&= \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left[ -\frac{1}{2} \{(c_k, X_{(k)}^{\mathrm T})^{\mathrm T} - \mu\}^{\mathrm T} \Sigma^{-1} \{(c_k, X_{(k)}^{\mathrm T})^{\mathrm T} - \mu\} \right] \\
&= \frac{1}{(2\pi)^{1/2} \sigma_{kk}^{1/2}} \exp\left\{ -\frac{(c_k - \mu_k)^2}{2\sigma_{kk}} \right\}
\frac{1}{(2\pi)^{(p-1)/2} |\Sigma_{(k,k)|k}|^{1/2}} \\
&\quad \times \exp\left[ -\frac{1}{2} \{X_{(k)} - \mu_{(k)} - \sigma_{(k)k} \sigma_{kk}^{-1} (c_k - \mu_k)\}^{\mathrm T} \Sigma_{(k,k)|k}^{-1} \{X_{(k)} - \mu_{(k)} - \sigma_{(k)k} \sigma_{kk}^{-1} (c_k - \mu_k)\} \right] \\
&= \phi(c_k \mid \mu_k, \sigma_{kk})\, \phi_{p-1}\{X_{(k)} \mid \mu_{(k)} + \sigma_{(k)k} \sigma_{kk}^{-1} (c_k - \mu_k),\ \Sigma_{(k,k)|k}\}.
\end{aligned}
\]
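This conditional-normal factorization can be checked numerically for $p = 2$. The sketch below (illustrative parameter values, not from the book) compares the explicit bivariate normal pdf with the factorized form $\phi(x_1 \mid \mu_1, \sigma_{11})\,\phi(x_2 \mid \mu_2 + \sigma_{21}\sigma_{11}^{-1}(x_1 - \mu_1),\ \sigma_{22} - \sigma_{21}^2/\sigma_{11})$:

```python
# Sketch (not from the book): checking the factorization used in the proof of
# Lemma 1.1 for p = 2, where phi_2(x1, x2) = phi(x1) * phi(x2 | conditional).
import math

mu1, mu2 = 0.5, -0.2
s11, s22, s12 = 1.0, 2.0, 0.8           # illustrative covariance entries

def phi1(x, m, v):                      # univariate normal pdf N(m, v)
    return math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)

def phi2(x1, x2):                       # bivariate normal pdf, explicit formula
    det = s11 * s22 - s12 * s12
    d1, d2 = x1 - mu1, x2 - mu2
    q = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def factorized(x1, x2):                 # phi(x1 | mu1, s11) * phi(x2 | conditional)
    cond_mean = mu2 + (s12 / s11) * (x1 - mu1)
    cond_var = s22 - s12 * s12 / s11    # Sigma_{(k,k)|k} for p = 2
    return phi1(x1, mu1, s11) * phi1(x2, cond_mean, cond_var)
```

The two functions agree to machine precision at any point, which is the identity the proof exploits before integrating over $X_{(k)}$.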
Consequently, we obtain

∂Φ_p(c | μ, Σ)/∂c_k
= (∂/∂c_k) ∫_{−∞}^{c_1} ⋯ ∫_{−∞}^{c_p} φ_p(X | μ, Σ) dx_1 ⋯ dx_p
= ∫_{−∞}^{c_1} ⋯ ∫_{−∞}^{c_{k−1}} ∫_{−∞}^{c_{k+1}} ⋯ ∫_{−∞}^{c_p} φ_p(x_1, …, x_{k−1}, c_k, x_{k+1}, …, x_p | μ, Σ) dx_1 ⋯ dx_{k−1} dx_{k+1} ⋯ dx_p
= ∫_{−∞}^{c_{(k)}} φ_p{(c_k, X_{(k)}^T)^T | μ, Σ} dX_{(k)}
= φ(c_k | μ_k, σ_{kk}) ∫_{−∞}^{c_{(k)}} φ_{p−1}{X_{(k)} | μ_{(k)} + σ_{(k)k} σ_{kk}^{−1}(c_k − μ_k), Σ_{(k,k)|k}} dX_{(k)}
= φ(c_k | μ_k, σ_{kk}) Φ_{p−1}{c_{(k)} | μ_{(k)} + σ_{(k)k} σ_{kk}^{−1}(c_k − μ_k), Σ_{(k,k)|k}}
≡ f_{(1)}(c_k)   (k = 1, …, p).   Q.E.D.

It is well known that Σ_{(k,k)|k} = Σ_{(k,k)} − σ_{(k)k} σ_{kk}^{−1} σ_{(k)k}^T is the covariance matrix of the conditional distribution of X_{(k)} when X_k is given as a fixed value without truncation, or equivalently the covariance matrix of the residuals of the regression of X_{(k)} on X_k. Define σ_{(k)/k} = σ_{(k)k} σ_{kk}^{−1}, which is the vector of the regression coefficients of X_{(k)} on X_k. Then, we have
Σ_{(k,k)|k} = cov(X_{(k)} − σ_{(k)/k} X_k) = cov(X_{(k)}) − cov(σ_{(k)/k} X_k) = Σ_{(k,k)} − σ_{(k)/k} σ_{kk} σ_{(k)/k}^T.

More generally, define, e.g.,

Σ_{abc/de} = Σ_{abc,de} Σ_{de,de}^{−1} = cov{(X_a, X_b, X_c)^T, (X_d, X_e)} [cov{(X_d, X_e)^T}]^{−1},

which is the matrix of the regression coefficients of (X_a, X_b, X_c) on (X_d, X_e); variations defined similarly will be used for simplicity of notation. Using this notation, we have
Σ_{abc,abc|de} = Σ_{abc,abc} − Σ_{abc,de} Σ_{de,de}^{−1} Σ_{de,abc} = Σ_{abc,abc} − Σ_{abc/de} Σ_{de,de} Σ_{abc/de}^T.

The notation f_{(1)}(c_k) = ∫_{−∞}^{c_{(k)}} φ_p{(c_k, X_{(k)}^T)^T | μ, Σ} dX_{(k)} in Lemma 1.1 will be frequently used; it indicates the marginal density of X_k = c_k multiplied by the normalizer a = ∫_{−∞}^{c} φ_p(X | μ, Σ) dX when each variable is upper-tail truncated as X_k < c_k (k = 1, …, p).

Lemma 1.2 Using the simplified notation σ_{l/k} = σ_{lk} σ_{kk}^{−1} defined earlier,

∂²Φ_p(c | μ, Σ)/∂c_k²
= −((c_k − μ_k)/σ_{kk}) f_{(1)}(c_k) − ∑_{l=1, l≠k}^p σ_{lk} σ_{kk}^{−1} ∫_{−∞}^{c_{(k,l)}} φ_p{(c_{kl}^T, X_{(k,l)}^T)^T | μ, Σ} dX_{(k,l)}
= −((c_k − μ_k)/σ_{kk}) f_{(1)}(c_k) − ∑_{l=1, l≠k}^p σ_{l/k} f_{(2)}(c_{kl})   (k = 1, …, p),

where c_{kl} = (c_k, c_l)^T and X_{(k,l)} is the (p − 2) × 1 vector obtained when the elements x_k and x_l (l ≠ k) are deleted from X.

Proof

∂²Φ_p(c | μ, Σ)/∂c_k²
= (∂/∂c_k) [ φ(c_k | μ_k, σ_{kk}) Φ_{p−1}{c_{(k)} | μ_{(k)} + σ_{(k)/k}(c_k − μ_k), Σ_{(k,k)|k}} ]
= {∂φ(c_k | μ_k, σ_{kk})/∂c_k} Φ_{p−1}{c_{(k)} | μ_{(k)} + σ_{(k)/k}(c_k − μ_k), Σ_{(k,k)|k}}
  + φ(c_k | μ_k, σ_{kk}) (∂/∂c_k) Φ_{p−1}{c_{(k)} − μ_{(k)} − σ_{(k)/k}(c_k − μ_k) | 0, Σ_{(k,k)|k}}
= −((c_k − μ_k)/σ_{kk}) φ(c_k | μ_k, σ_{kk}) Φ_{p−1}{c_{(k)} | μ_{(k)} + σ_{(k)/k}(c_k − μ_k), Σ_{(k,k)|k}}
  + φ(c_k | μ_k, σ_{kk}) (∂/∂c_k) ∫_{−∞}^{c_{(k)} − μ_{(k)} − σ_{(k)/k}(c_k − μ_k)} φ_{p−1}{X_{(k)} | 0, Σ_{(k,k)|k}} dX_{(k)}
= −((c_k − μ_k)/σ_{kk}) f_{(1)}(c_k) − φ(c_k | μ_k, σ_{kk}) ∑_{l=1, l≠k}^p σ_{l/k} φ{c_l − μ_l − σ_{l/k}(c_k − μ_k) | 0, σ_{ll|k}}
  × Φ_{p−2}{c_{(k,l)} − μ_{(k,l)} − Σ_{(k,l)/kl}(c_{kl} − μ_{kl}) | 0, Σ_{(k,l,k,l)|kl}}
= −((c_k − μ_k)/σ_{kk}) f_{(1)}(c_k) − ∑_{l=1, l≠k}^p σ_{l/k} φ_2(c_{kl} | μ_{kl}, Σ_{kl,kl}) Φ_{p−2}{c_{(k,l)} − μ_{(k,l)} − Σ_{(k,l)/kl}(c_{kl} − μ_{kl}) | 0, Σ_{(k,l,k,l)|kl}}
= −((c_k − μ_k)/σ_{kk}) f_{(1)}(c_k) − ∑_{l=1, l≠k}^p σ_{l/k} f_{(2)}(c_{kl})   (k = 1, …, p).   Q.E.D.

The notation

f_{(2)}(c_{kl}) = ∫_{−∞}^{c_{(k,l)}} φ_p{(c_{kl}^T, X_{(k,l)}^T)^T | μ, Σ} dX_{(k,l)}   (k, l = 1, …, p; k ≠ l)

indicates the bivariate marginal density of X_{kl} = (X_k, X_l)^T = c_{kl} multiplied by the normalizer a under upper-tail truncation as before.

Lemma 1.3

∂³Φ_p(c | μ, Σ)/∂c_k³
= {(c_k − μ_k)² σ_{kk}^{−2} − σ_{kk}^{−1}} f_{(1)}(c_k)
  + ∑_{l=1, l≠k}^p σ_{l/k} [(c_k − μ_k) σ_{kk}^{−1} + {Σ_{kl,kl}^{−1}(c_{kl} − μ_{kl})}_{[k]}] f_{(2)}(c_{kl})
  + ∑_{l,m=1 (k,l,m:≠)}^p σ_{l/k} (σ_{m/kl})_{[k]} ∫_{−∞}^{c_{(k,l,m)}} φ_p{(c_{klm}^T, X_{(k,l,m)}^T)^T | μ, Σ} dX_{(k,l,m)}
= {(c_k − μ_k)² σ_{kk}^{−2} − σ_{kk}^{−1}} f_{(1)}(c_k)
  + ∑_{l=1, l≠k}^p σ_{l/k} [(c_k − μ_k) σ_{kk}^{−1} + {Σ_{kl,kl}^{−1}(c_{kl} − μ_{kl})}_{[k]}] f_{(2)}(c_{kl})
  + ∑_{l,m=1 (k,l,m:≠)}^p σ_{l/k} (σ_{m/kl})_{[k]} f_{(3)}(c_{klm})   (k = 1, …, p),

where (·)_{[k]} is the element corresponding to X_k of the vector in parentheses; c_{klm} and X_{(k,l,m)} are defined similarly to c_{kl} and X_{(k,l)}, respectively; and k, l, m :≠ indicates that k, l and m are mutually distinct, i.e., k ≠ l ≠ m ≠ k.

Proof Taking the partial derivative of the result of Lemma 1.2 similarly as before, the required results follow. Q.E.D.

In Lemma 1.3,

{Σ_{kl,kl}^{−1}(c_{kl} − μ_{kl})}_{[k]} = (σ^{kk}, σ^{kl})(c_{kl} − μ_{kl})

and

(σ_{m/kl})_{[k]} = (σ_{mk}, σ_{ml})(σ^{kk}, σ^{lk})^T,

where

Σ_{kl,kl}^{−1} = (σ_{kk}, σ_{kl}; σ_{lk}, σ_{ll})^{−1} ≡ (σ^{kk}, σ^{kl}; σ^{lk}, σ^{ll}).

The quantity

f_{(3)}(c_{klm}) = ∫_{−∞}^{c_{(k,l,m)}} φ_p{(c_{klm}^T, X_{(k,l,m)}^T)^T | μ, Σ} dX_{(k,l,m)}

is the tri-variate marginal density of X_{klm} = c_{klm} multiplied by the normalizer a under upper-tail truncation. Using notations for more than three variables defined as before and similar methods, we have the following.

Lemma 1.4

∂⁴Φ_p(c | μ, Σ)/∂c_k⁴
= −{(c_k − μ_k)³ σ_{kk}^{−3} − 3(c_k − μ_k) σ_{kk}^{−2}} f_{(1)}(c_k)
  − ∑_{l=1, l≠k}^p σ_{l/k} [ {(c_k − μ_k)² σ_{kk}^{−2} − σ_{kk}^{−1}} + σ_{kk}^{−1} + σ^{kk}
    + [(c_k − μ_k) σ_{kk}^{−1} + {Σ_{kl,kl}^{−1}(c_{kl} − μ_{kl})}_{[k]}] {Σ_{kl,kl}^{−1}(c_{kl} − μ_{kl})}_{[k]} ] f_{(2)}(c_{kl})
  − ∑_{l,m=1 (k,l,m:≠)}^p σ_{l/k} [ [(c_k − μ_k) σ_{kk}^{−1} + {Σ_{kl,kl}^{−1}(c_{kl} − μ_{kl})}_{[k]}] (σ_{m/kl})_{[k]}
    + (σ_{m/kl})_{[k]} {Σ_{klm,klm}^{−1}(c_{klm} − μ_{klm})}_{[k]} ] f_{(3)}(c_{klm})
  − ∑_{l,m,n=1 (k,l,m,n:≠)}^p σ_{l/k} (σ_{m/kl})_{[k]} (σ_{n/klm})_{[k]} f_{(4)}(c_{klmn})   (k = 1, …, p).
The cross partial derivatives up to the fourth order are given below.
Lemma 1.5

∂²Φ_p(c | μ, Σ)/(∂c_k ∂c_l) = f_{(2)}(c_{kl});

∂³Φ_p(c | μ, Σ)/(∂c_k ∂c_l²)
= −{Σ_{kl,kl}^{−1}(c_{kl} − μ_{kl})}_{[l]} f_{(2)}(c_{kl}) − ∑_{m=1 (k,l,m:≠)}^p (σ_{m/kl})_{[l]} f_{(3)}(c_{klm});

∂⁴Φ_p(c | μ, Σ)/(∂c_k ∂c_l³)
= [ −σ^{ll} + {Σ_{kl,kl}^{−1}(c_{kl} − μ_{kl})}_{[l]}² ] f_{(2)}(c_{kl})
  + ∑_{m=1 (k,l,m:≠)}^p [ {Σ_{kl,kl}^{−1}(c_{kl} − μ_{kl})}_{[l]} (σ_{m/kl})_{[l]}
    + (σ_{m/kl})_{[l]} {Σ_{klm,klm}^{−1}(c_{klm} − μ_{klm})}_{[l]} ] f_{(3)}(c_{klm})
  + ∑_{m,n=1 (k,l,m,n:≠)}^p (σ_{m/kl})_{[l]} (σ_{n/klm})_{[l]} f_{(4)}(c_{klmn})   (k, l = 1, …, p; k ≠ l);

∂³Φ_p(c | μ, Σ)/(∂c_k ∂c_l ∂c_m) = f_{(3)}(c_{klm});

∂⁴Φ_p(c | μ, Σ)/(∂c_k ∂c_l ∂c_m²)
= −{Σ_{klm,klm}^{−1}(c_{klm} − μ_{klm})}_{[m]} f_{(3)}(c_{klm}) − ∑_{n=1 (k,l,m,n:≠)}^p (σ_{n/klm})_{[m]} f_{(4)}(c_{klmn})
  (k, l, m = 1, …, p; k, l, m :≠);

∂⁴Φ_p(c | μ, Σ)/(∂c_k ∂c_l ∂c_m ∂c_n) = f_{(4)}(c_{klmn})   (k, l, m, n = 1, …, p; k, l, m, n :≠).

Proof The results ∂²Φ_p(·)/(∂c_k ∂c_l), ∂³Φ_p(·)/(∂c_k ∂c_l ∂c_m) and ∂⁴Φ_p(·)/(∂c_k ∂c_l ∂c_m ∂c_n) (k, l, m, n = 1, …, p; k, l, m, n :≠) are given by Lemma 1.1 with successive differentiation. The other results are given by these results. Q.E.D.

Alternative expressions of the above results are also available. For instance, the term of −{Σ_{kl,kl}^{−1}(c_{kl} − μ_{kl})}_{[l]} f_{(2)}(c_{kl}) in ∂³Φ_p(·)/(∂c_k ∂c_l²) is given alternatively from the result of ∂²Φ_p(·)/∂c_l² in Lemma 1.2 as
−[ (c_l − μ_l)/σ_{ll} − σ_{k/l} {Σ_{kl,kl}^{−1}(c_{kl} − μ_{kl})}_{[k]} ] f_{(2)}(c_{kl})   (k, l = 1, …, p; k ≠ l),

which is shown to be equal to the corresponding result in Lemma 1.5 as follows. Lemma 1.2 gives

∂²Φ_p(c | μ, Σ)/∂c_l² = −((c_l − μ_l)/σ_{ll}) f_{(1)}(c_l) − ∑_{m=1, m≠l}^p σ_{m/l} f_{(2)}(c_{lm})   (l = 1, …, p).

Differentiating this result with respect to c_k, we have the term of f_{(2)}(c_{kl}) given above, which becomes

(1/σ_{ll}) [ −(c_l − μ_l) + σ_{kl} {σ_{ll}(c_k − μ_k) − σ_{kl}(c_l − μ_l)}/(σ_{kk}σ_{ll} − σ_{kl}²) ] f_{(2)}(c_{kl})
= (1/σ_{ll}) [ −{1 + σ_{kl}²/(σ_{kk}σ_{ll} − σ_{kl}²)} (c_l − μ_l) + σ_{kl}σ_{ll}(c_k − μ_k)/(σ_{kk}σ_{ll} − σ_{kl}²) ] f_{(2)}(c_{kl})
= [ {−σ_{kk}(c_l − μ_l) + σ_{kl}(c_k − μ_k)}/(σ_{kk}σ_{ll} − σ_{kl}²) ] f_{(2)}(c_{kl})
= −{Σ_{kl,kl}^{−1}(c_{kl} − μ_{kl})}_{[l]} f_{(2)}(c_{kl})   (k, l = 1, …, p; k ≠ l),

which is equal to the corresponding result in Lemma 1.5 and is simpler than the alternative expression. It is of interest to see that the above result corresponds to the latter case of the general formulas (w_{11}, w_{12}) W^{−1} (x_1, x_2)^T = x_1 and (w_{21}, w_{22}) W^{−1} (x_1, x_2)^T = x_2 for arbitrary x_1 and x_2, where W = (w_{11}, w_{12}; w_{21}, w_{22}) is possibly asymmetric. The above formulas are easily derived from W W^{−1} (x_1, x_2)^T = (x_1, x_2)^T.
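The first-derivative formula in Lemma 1.1 can be spot-checked numerically for a small p. The sketch below (illustrative values of μ, Σ, and c chosen by us; SciPy's multivariate cdf is itself approximate, hence the loose tolerance) compares a central finite difference of Φ₃ in c_k with the closed form φ(c_k | μ_k, σ_{kk}) Φ₂{c_{(k)} | μ_{(k)} + σ_{(k)k}σ_{kk}^{−1}(c_k − μ_k), Σ_{(k,k)|k}}:

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Illustrative setup: p = 3, differentiate with respect to c_k for k = 0 (0-based)
mu = np.array([0.2, -0.1, 0.3])
Sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 2.0, 0.4],
                  [0.3, 0.4, 1.5]])
c = np.array([0.8, 1.1, 0.5])
k = 0

# Central finite difference of Phi_p(c | mu, Sigma) in c_k
Phi_p = lambda cc: multivariate_normal(mean=mu, cov=Sigma).cdf(cc)
step = 1e-3
cp, cm = c.copy(), c.copy()
cp[k] += step
cm[k] -= step
fd = (Phi_p(cp) - Phi_p(cm)) / (2 * step)

# Closed form of Lemma 1.1
idx = [i for i in range(3) if i != k]
s_kk = Sigma[k, k]
s_k = Sigma[idx, k]                                   # sigma_(k)k
cond_mean = mu[idx] + s_k / s_kk * (c[k] - mu[k])     # mu_(k) + sigma_(k)/k (c_k - mu_k)
cond_cov = Sigma[np.ix_(idx, idx)] - np.outer(s_k, s_k) / s_kk  # Sigma_(k,k)|k
closed = norm.pdf(c[k], loc=mu[k], scale=np.sqrt(s_kk)) \
    * multivariate_normal(mean=cond_mean, cov=cond_cov).cdf(c[idx])

print(fd, closed)
```

Repeating the comparison with k = 1 or k = 2 checks the remaining components of Φ_p^{(1)}(c | μ, Σ).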
1.4 Moments and Cumulants of the STN-Distributed Vector Using the MGF
In this section, moments and cumulants of the STN-distributed vector are given using the mgf based on the results of the previous section. Define
c_{kr}^{(L_k)} = (c_r^{(L)})_k = a_{kr}^{L_k} b_{kr}^{1−L_k}

and its t-dependent version

c_{kr}^{(L_k)}(t) = {c_r^{(L)}(t)}_k = (a_r − Σt)_k^{L_k} (b_r − Σt)_k^{1−L_k}   (L_k = 0, 1; k = 1, …, p; r = 1, …, R),

where (a_r − Σt)_k^{L_k} = {(a_r − Σt)_k}^{L_k}; that is, the exponents L_k and 1 − L_k select the lower endpoint a_{kr} when L_k = 1 and the upper endpoint b_{kr} when L_k = 0. Note that (c_r^{(L)})_k is the k-th element of c_r^{(L)} and should not be confused with (c_r^{(L)})_{[k]} defined earlier, though in this case they happen to be the same.

Using the above notations and e_T ≡ exp(μ^T t + t^T Σ t/2), the mgf of the STN vector given in Theorem 1.1 becomes

M_X(t) = a^{−1} Pr(X ∈ S | μ + Σt, Σ) exp(μ^T t + t^T Σ t/2)
= a^{−1} Pr(X ∈ S | μ + Σt, Σ) e_T
= a^{−1} ∑_{r=1}^R ∑_{L_1=0}^1 ⋯ ∑_{L_p=0}^1 (−1)^{∑_{k=1}^p L_k} Φ_p[{c_{1r}^{(L_1)}(t), …, c_{pr}^{(L_p)}(t)}^T | μ, Σ] e_T    (1.1)
= a^{−1} ∑_{r=1}^R ∑_{L_1,…,L_p=0}^1 (−1)^{L_+} Φ_p{c_r^{(L)}(t) | μ, Σ} e_T,

where L_+ = ∑_{k=1}^p L_k. When t = 0, we have e_T = 1 and c_r^{(L)}(0) = c_r^{(L)} (r = 1, …, R), which give

a = ∑_{r=1}^R ∑_{L_1,…,L_p=0}^1 (−1)^{L_+} Φ_p(c_r^{(L)} | μ, Σ),

yielding M_X(0) = 1 as expected.

Remark 1.3 Note that

∂c_{kr}^{(L_k)}(t)/∂t_i = ∂{c_r^{(L)} − Σt}_k/∂t_i = −σ_{ki}   (i, k = 1, …, p; r = 1, …, R),

or in matrix form

∂c_r^{(L)}(t)^T/∂t = ∂(c_r^{(L)} − Σt)^T/∂t = −Σ,

which does not depend on the values of L_i (i = 1, …, p). Consequently, recalling the definition of the pseudo random vector X₁ in Remark 1.2, we obtain
∂^j M_{X1}(t)/(∂t)^{⟨j⟩}
= {∂^j/(∂t)^{⟨j⟩}} [ a^{−1} ∑_{r=1}^R ∑_{L_1,…,L_p=0}^1 (−1)^{L_+} Φ_p{c_r^{(L)}(t) | μ, Σ} ]
= a^{−1} ∑_{r=1}^R ∑_{L_1,…,L_p=0}^1 (−1)^{L_+} {∂c_r^{(L)}(t)^T/∂t}^{⟨j⟩} ∂^j Φ_p{c_r^{(L)}(t) | μ, Σ}/{∂c_r^{(L)}(t)}^{⟨j⟩}
= a^{−1} ∑_{r=1}^R ∑_{L_1,…,L_p=0}^1 (−1)^{L_+} (−Σ)^{⟨j⟩} ∂^j Φ_p{c_r^{(L)}(t) | μ, Σ}/{∂c_r^{(L)}(t)}^{⟨j⟩}   (j = 1, 2, …).

In this expression, the elements of ∂^j Φ_p{c_r^{(L)}(t) | μ, Σ}/{∂c_r^{(L)}(t)}^{⟨j⟩} (L_k = 0, 1; k = 1, …, p; r = 1, …, R; j = 1, …, 4) are given by Lemmas 1.1 to 1.5 when c = c_r^{(L)}(t). Using c_r^{(L)}(0) = c_r^{(L)} (r = 1, …, R) when t = 0, we obtain

E(X₁^{⟨j⟩}) = ∂^j M_{X1}(t)/(∂t)^{⟨j⟩} |_{t=0}
= a^{−1} ∑_{r=1}^R ∑_{L_1,…,L_p=0}^1 (−1)^{L_+} (−Σ)^{⟨j⟩} ∂^j Φ_p(c_r^{(L)} | μ, Σ)/(∂c_r^{(L)})^{⟨j⟩}   (j = 1, 2, …).

The above formula can be used to have the cumulants of X via the cumulants of X₁ as

E(X) = E(X₁) + μ = ∂M_{X1}(t)/∂t |_{t=0} + μ,
cov(X) = cov(X₁) + Σ = ∂²M_{X1}(t)/(∂t ∂t^T) |_{t=0} − E(X₁) E(X₁^T) + Σ,
κ_j(X) = κ_j(X₁)   (j = 3, 4),

where κ_j(X) is the p^j × 1 vector of the multivariate cumulants of X. In the following, we obtain the results using somewhat reduced expressions.

Theorem 1.2

E(X)_{i1} = a^{−1} ∑_{r=1}^R ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} f̄^{(r)}_{(1)}(c_{kr}^{(L_k)}) + μ_{i1}   (i1 = 1, …, p),
where

f̄^{(r)}_{(1)}(c_{kr}^{(L_k)}) = ∑_{L_1,…,L_{k−1},L_{k+1},…,L_p=0}^1 (−1)^{L_+ − L_k} f_{(1)}(c_{kr}^{(L_k)}).

Proof Rewrite the mgf of the STN vector (see (1.1) of Sect. 1.4) as

M_X(t) = a^{−1} ∑_{r=1}^R ∑_{L_1,…,L_p=0}^1 (−1)^{L_+} Φ_p{c_r^{(L)}(t) | μ, Σ} e_T ≡ Φ_p^{(a)} e_T.

Then,

E(X)_{i1} = ∂M_X(t)/∂t_{i1} |_{t=0} = ∂(Φ_p^{(a)} e_T)/∂t_{i1} |_{t=0}
= {∂Φ_p^{(a)}/∂t_{i1}} |_{t=0} e_T |_{t=0} + Φ_p^{(a)} |_{t=0} {μ_{i1} + (Σt)_{i1}} e_T |_{t=0}
= ∂Φ_p^{(a)}/∂t_{i1} |_{t=0} + μ_{i1}   (i1 = 1, …, p)

follows. Since

∂Φ_p^{(a)}/∂t_{i1} |_{t=0}
= a^{−1} ∑_{r=1}^R ∑_{L_1,…,L_p=0}^1 (−1)^{L_+} ∂Φ_p{c_r^{(L)}(t) | μ, Σ}/∂t_{i1} |_{t=0}
= a^{−1} ∑_{r=1}^R ∑_{L_1,…,L_p=0}^1 (−1)^{L_+} ∑_{k=1}^p [∂Φ_p{c_r^{(L)}(t) | μ, Σ}/∂c_{kr}^{(L_k)}(t)] {∂c_{kr}^{(L_k)}(t)/∂t_{i1}} |_{t=0}
= a^{−1} ∑_{r=1}^R ∑_{L_1,…,L_p=0}^1 (−1)^{L_+} ∑_{k=1}^p (−σ_{i1 k}) ∂Φ_p{c_r^{(L)}(t) | μ, Σ}/∂c_{kr}^{(L_k)}(t) |_{t=0}
= a^{−1} ∑_{r=1}^R ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} ∑_{L_1,…,L_{k−1},L_{k+1},…,L_p=0}^1 (−1)^{L_+ − L_k} ∂Φ_p{c_r^{(L)}(t) | μ, Σ}/∂c_{kr}^{(L_k)}(t) |_{t=0}
= a^{−1} ∑_{r=1}^R ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} f̄^{(r)}_{(1)}{c_{kr}^{(L_k)}(t)} |_{t=0}   (i1 = 1, …, p)
and

c_{kr}^{(L_k)}(t) |_{t=0} = c_{kr}^{(L_k)}   (k = 1, …, p; r = 1, …, R),

we obtain the required result. Q.E.D.

Note that

f̄^{(r)}_{(1)}(c_{kr}^{(L_k)}) = ∫_{a_{r(k)}}^{b_{r(k)}} φ_p{(c_{kr}^{(L_k)}, X_{(k)}^T)^T | μ, Σ} dX_{(k)}   (k = 1, …, p),

where the elements of μ and Σ are reordered to be compatible with those of (c_{kr}^{(L_k)}, X_{(k)}^T). For the moments and cumulants higher than the first, we can use

∂^s M_X(t)/(∂t_{i1} ⋯ ∂t_{is}) = ∑_{j=0}^s ∑_{(i_1,…,i_j)}^{sCj} {∂^j Φ_p^{(a)}/(∂t_{i_1} ⋯ ∂t_{i_j})} {∂^{s−j} e_T/(∂t_{i_{j+1}} ⋯ ∂t_{i_s})}   (i_j = 1, …, p; j = 0, …, s),

where ∑_{(i_1,…,i_j)}^{sCj}(·) is the sum of sCj terms choosing j elements from the s elements of t; and, when j = 0, ∂^j Φ_p^{(a)}/∂t_{i_0} ≡ Φ_p^{(a)}. For the moments of the second order,

∂²M_X(t)/(∂t_{i1} ∂t_{i2}) = {∂²Φ_p^{(a)}/(∂t_{i1} ∂t_{i2})} e_T + ∑_{(i_1,i_2)}^2 {∂Φ_p^{(a)}/∂t_{i1}} {∂e_T/∂t_{i2}} + Φ_p^{(a)} ∂²e_T/(∂t_{i1} ∂t_{i2}),

∂e_T/∂t_{i1} = {μ_{i1} + (Σt)_{i1}} e_T,
∂²e_T/(∂t_{i1} ∂t_{i2}) = [σ_{i1 i2} + {μ_{i1} + (Σt)_{i1}}{μ_{i2} + (Σt)_{i2}}] e_T   (i1, i2 = 1, …, p).

Since the corresponding results of order higher than the second tend to be complicated, in the following the pseudo mgf Φ_p^{(a)} of the pseudo random vector X₁ is used. Using a method similar to that in Theorem 1.2 and noting that

E{(X₁)_{i1}(X₁)_{i2}} = ∂²Φ_p^{(a)}/(∂t_{i1} ∂t_{i2}) |_{t=0}   (i1, i2 = 1, …, p)

with (−1)^{L_k+1+L_l+1} = (−1)^{L_k+L_l}, we obtain the pseudo second moments as follows:
Lemma 1.6 Define c̄_{kr}^{(L_k)} = c_{kr}^{(L_k)} − μ_k. Then, we have

E{(X₁)_{i1}(X₁)_{i2}}
= a^{−1} ∑_{r=1}^R [ ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} σ_{i2/k} c̄_{kr}^{(L_k)} f̄^{(r)}_{(1)}(c_{kr}^{(L_k)})
  + ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} σ_{i1 k} σ_{i2 l|k} f̄^{(r)}_{(2)}(c_{kl,r}^{(L)}) ]
= a^{−1} ∑_{r=1}^R [ ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} σ_{i2 k} σ_{kk}^{−1} c̄_{kr}^{(L_k)} f̄^{(r)}_{(1)}(c_{kr}^{(L_k)})
  + ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} (σ_{i1 k} σ_{i2 l} − σ_{i1 k} σ_{i2 k} σ_{lk} σ_{kk}^{−1}) f̄^{(r)}_{(2)}(c_{kl,r}^{(L)}) ]
  (i1, i2 = 1, …, p),

where

f̄^{(r)}_{(2)}(c_{kl,r}^{(L)}) = f̄^{(r)}_{(2)}{(c_{kr}^{(L_k)}, c_{lr}^{(L_l)})^T}
= ∫_{a_{r(k,l)}}^{b_{r(k,l)}} φ_p{(c_{kl,r}^{(L)}, X_{(k,l)}^T)^T | μ, Σ} dX_{(k,l)}
= ∑_{(L_k,L_l)=0}^1 (−1)^{L_+ − L_k − L_l} f_{(2)}(c_{kl,r}^{(L)})   (k, l = 1, …, p; k ≠ l),

and

∑_{(L_k,L_l)=0}^1 (·) = ∑_{L_1,…,L_{k−1}, L_{k+1},…,L_{l−1}, L_{l+1},…,L_p=0}^1 (·)

when k < l.
Proof

E{(X₁)_{i1}(X₁)_{i2}} = ∂²Φ_p^{(a)}{c_r^{(L)}(t) | μ, Σ}/(∂t_{i1} ∂t_{i2}) |_{t=0}
= a^{−1} (∂/∂t_{i2}) ∑_{r=1}^R ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} f̄^{(r)}_{(1)}{c_{kr}^{(L_k)}(t)} |_{t=0}
= a^{−1} (∂/∂t_{i2}) ∑_{r=1}^R ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} ∑_{(L_k)=0}^1 (−1)^{L_+ − L_k} ∫_{a_{r(k)}}^{b_{r(k)}} φ_p{(c_{kr}^{(L_k)}(t), X_{(k)}^T)^T | μ, Σ} dX_{(k)} |_{t=0}.

By Lemma 1.1, the integrand factorizes as φ(c_{kr}^{(L_k)} | μ_k, σ_{kk}) Φ_{p−1}{c_{(k)r}^{(L)} | μ_{(k)} + σ_{(k)/k} c̄_{kr}^{(L_k)}, Σ_{(k,k)|k}} over the truncation rectangle, and ∂c_{kr}^{(L_k)}(t)/∂t_{i2} = −σ_{i2 k} for every k and L_k. Differentiation of the factor φ(c_{kr}^{(L_k)} | μ_k, σ_{kk}) gives the term of f̄^{(r)}_{(1)}, while differentiation of the factor Φ_{p−1} with respect to c_{lr}^{(L_l)} (l ≠ k) and with respect to c_{kr}^{(L_k)} through the conditional mean μ_{(k)} + σ_{(k)/k} c̄_{kr}^{(L_k)} gives the terms of f̄^{(r)}_{(2)}:

E{(X₁)_{i1}(X₁)_{i2}}
= a^{−1} ∑_{r=1}^R [ ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} σ_{i2/k} c̄_{kr}^{(L_k)} f̄^{(r)}_{(1)}(c_{kr}^{(L_k)})
  + ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} σ_{i1 k} {σ_{i2 l} − (σ_{(k)/k})_l σ_{i2 k}} f̄^{(r)}_{(2)}(c_{kl,r}^{(L)}) ]
= a^{−1} ∑_{r=1}^R [ ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} σ_{i2/k} c̄_{kr}^{(L_k)} f̄^{(r)}_{(1)}(c_{kr}^{(L_k)})
  + ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} σ_{i1 k} σ_{i2 l|k} f̄^{(r)}_{(2)}(c_{kl,r}^{(L)}) ].   Q.E.D.

Using σ_{i1 k} σ_{i2 l|k} = σ_{i1 k} σ_{i2 l} − σ_{i1 k} σ_{i2 k} σ_{kk}^{−1} σ_{kl} in the last expression of Lemma 1.6, we find the symmetric property corresponding to ∂²Φ_p^{(a)}/(∂t_{i1} ∂t_{i2}) = ∂²Φ_p^{(a)}/(∂t_{i2} ∂t_{i1}). The result in Theorem 1.2,
E(X₁)_{i1} = E(X)_{i1} − μ_{i1} = a^{−1} ∑_{r=1}^R ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} f̄^{(r)}_{(1)}(c_{kr}^{(L_k)})   (i1 = 1, …, p),

gives the following.

Theorem 1.3

cov(X_{i1}, X_{i2})
= a^{−1} ∑_{r=1}^R [ ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} σ_{i2/k} c̄_{kr}^{(L_k)} f̄^{(r)}_{(1)}(c_{kr}^{(L_k)})
  + ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} σ_{i1 k} σ_{i2 l|k} f̄^{(r)}_{(2)}(c_{kl,r}^{(L)}) ]
− a^{−2} { ∑_{r=1}^R ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} f̄^{(r)}_{(1)}(c_{kr}^{(L_k)}) }
  × { ∑_{r=1}^R ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i2 k} f̄^{(r)}_{(1)}(c_{kr}^{(L_k)}) }
+ σ_{i1 i2}   (i1, i2 = 1, …, p).
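For p = 1 and R = 1 with S = (−∞, c), Theorem 1.2 and Theorem 1.3 reduce to the classical mean and variance of an upper-truncated normal, E(X) = μ − σφ(α)/Φ(α) and var(X) = σ²{1 − αφ(α)/Φ(α) − (φ(α)/Φ(α))²} with α = (c − μ)/σ. The sketch below checks these standard univariate specializations (not the general multivariate formulas) by numerical integration; the parameter values are illustrative:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

mu, sigma, c = 0.5, 1.3, 1.0
alpha = (c - mu) / sigma
a = norm.cdf(alpha)                       # normalizer a = Pr(X < c)
lam = norm.pdf(alpha) / norm.cdf(alpha)   # phi(alpha) / Phi(alpha)

# Closed forms implied by Theorems 1.2 and 1.3 when p = 1, R = 1
mean_closed = mu - sigma * lam
var_closed = sigma**2 * (1 - alpha * lam - lam**2)

# Direct numerical integration of the truncated density
dens = lambda x: norm.pdf(x, mu, sigma) / a
m0, _ = quad(dens, -np.inf, c)                    # total mass, should be 1
m1, _ = quad(lambda x: x * dens(x), -np.inf, c)   # first moment
m2, _ = quad(lambda x: x**2 * dens(x), -np.inf, c)

print(m0, m1, m2 - m1**2)
```

The negative shift −σφ(α)/Φ(α) of the mean under upper-tail truncation reappears in standardized form in (1.2) of Sect. 1.5.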
Lemma 1.7

E{(X₁)_{i1}(X₁)_{i2}(X₁)_{i3}}
= a^{−1} ∑_{r=1}^R [ ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} σ_{i2 k} σ_{i3 k} (c̄_{kr}^{(L_k)2} σ_{kk}^{−2} − σ_{kk}^{−1}) f̄^{(r)}_{(1)}(c_{kr}^{(L_k)})
  + ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} (σ_{i1 k} σ_{i2/k} σ_{i3 l|k} c̄_{kr}^{(L_k)} + σ_{i1 k} σ_{i2 l|k} σ_{i3/kl}^T c̄_{kl,r}^{(L)}) f̄^{(r)}_{(2)}(c_{kl,r}^{(L)})
  + ∑_{k,l,m=1 (k,l,m:≠)}^p ∑_{L_k,L_l,L_m=0}^1 (−1)^{L_k+L_l+L_m+1} σ_{i1 k} σ_{i2 l|k} σ_{i3 m|kl} f̄^{(r)}_{(3)}(c_{klm,r}^{(L)}) ]
  (i1, i2, i3 = 1, …, p),

where undefined notations are defined similarly as before. The corresponding symmetric expression is
E{(X₁)_{i1}(X₁)_{i2}(X₁)_{i3}}
= a^{−1} ∑_{r=1}^R [ ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} σ_{i2 k} σ_{i3 k} (c̄_{kr}^{(L_k)2} σ_{kk}^{−2} − σ_{kk}^{−1}) f̄^{(r)}_{(1)}(c_{kr}^{(L_k)})
  + ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} { −σ_{i1 k} σ_{i2 k} σ_{i3 k} σ_{kk}^{−2} σ_{kl} c̄_{kr}^{(L_k)}
    + ( −σ_{i1 k} σ_{i2 k} σ_{i3 k} σ_{kk}^{−1} σ_{kl} + ∑_{(i1,i2,i3)}^3 σ_{i1 k} σ_{i2 k} σ_{i3 l} ) (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[k]} } f̄^{(r)}_{(2)}(c_{kl,r}^{(L)})
  + ∑_{k,l,m=1 (k,l,m:≠)}^p ∑_{L_k,L_l,L_m=0}^1 (−1)^{L_k+L_l+L_m+1} { σ_{i1 k} σ_{i2 l} σ_{i3 m}
    + ( σ_{i1 k} σ_{i2 k} σ_{i3 k} σ_{kk}^{−1} σ_{kl} − ∑_{(i1,i2,i3)}^3 σ_{i1 k} σ_{i2 k} σ_{i3 l} ) (σ_{m/kl})_{[k]} } f̄^{(r)}_{(3)}(c_{klm,r}^{(L)}) ]
  (i1, i2, i3 = 1, …, p).
Proof The asymmetric expression is given by Lemma 1.6. For the correspondence of the asymmetric and symmetric expressions, the term of f̄^{(r)}_{(1)}(c_{kr}^{(L_k)}) is already symmetric. The term of f̄^{(r)}_{(2)}(c_{kl,r}^{(L)}) from Lemma 1.6 is

∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} (σ_{i1 k} σ_{i2 k} σ_{kk}^{−1} σ_{i3 l|k} c̄_{kr}^{(L_k)} + σ_{i1 k} σ_{i2 l|k} σ_{i3/kl}^T c̄_{kl,r}^{(L)}) f̄^{(r)}_{(2)}(c_{kl,r}^{(L)})
= ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} { σ_{i1 k} σ_{i2 k} σ_{kk}^{−1} (σ_{i3 l} − σ_{i3 k} σ_{kk}^{−1} σ_{kl}) c̄_{kr}^{(L_k)}
  + (σ_{i1 k} σ_{i2 l} − σ_{i1 k} σ_{i2 k} σ_{kk}^{−1} σ_{kl}) (σ_{i3 k}, σ_{i3 l}) Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)} } f̄^{(r)}_{(2)}(c_{kl,r}^{(L)})
= ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} { σ_{i1 k} σ_{i2 k} σ_{kk}^{−1} (σ_{i3 l} − σ_{i3 k} σ_{kk}^{−1} σ_{kl}) c̄_{kr}^{(L_k)}
  + (σ_{i1 k} σ_{i2 l} − σ_{i1 k} σ_{i2 k} σ_{kk}^{−1} σ_{kl}) (σ_{i3 k}, σ_{i3 l}) (σ_{ll}, −σ_{kl}; −σ_{lk}, σ_{kk}) (c̄_{kr}^{(L_k)}, c̄_{lr}^{(L_l)})^T / (σ_{kk}σ_{ll} − σ_{kl}²) } f̄^{(r)}_{(2)}(c_{kl,r}^{(L)})
= ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} [ −σ_{i1 k}σ_{i2 k}σ_{i3 k} σ_{kk}^{−2} σ_{kl} c̄_{kr}^{(L_k)}
  − σ_{i1 k}σ_{i2 k}σ_{i3 k} σ_{kk}^{−1} σ_{kl} (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[k]} + σ_{i1 k}σ_{i2 l}σ_{i3 k} (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[k]}
  + σ_{i1 k}σ_{i2 k}σ_{i3 l} { (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[k]} + σ_{kl} σ_{kk}^{−1} (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[l]} }
  − σ_{i1 k}σ_{i2 k}σ_{i3 l} σ_{kk}^{−1} σ_{kl} (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[l]} + σ_{i1 k}σ_{i2 l}σ_{i3 l} (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[l]} ] f̄^{(r)}_{(2)}(c_{kl,r}^{(L)})
= ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} { −σ_{i1 k}σ_{i2 k}σ_{i3 k} σ_{kk}^{−2} σ_{kl} c̄_{kr}^{(L_k)}
  − σ_{i1 k}σ_{i2 k}σ_{i3 k} σ_{kk}^{−1} σ_{kl} (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[k]}
  + ∑_{(i1,i2,i3)}^3 σ_{i1 k}σ_{i2 k}σ_{i3 l} (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[k]} } f̄^{(r)}_{(2)}(c_{kl,r}^{(L)}),

where the identity σ_{kk}^{−1} c̄_{kr}^{(L_k)} = (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[k]} + σ_{kl} σ_{kk}^{−1} (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[l]} is used in the first step, yielding the symmetric term, and

∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} σ_{i1 k}σ_{i2 l}σ_{i3 l} (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[l]} f̄^{(r)}_{(2)}(c_{kl,r}^{(L)})
= ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} σ_{i1 l}σ_{i2 k}σ_{i3 k} (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[k]} f̄^{(r)}_{(2)}(c_{kl,r}^{(L)})

(exchange of k and l in the sum) is used in the second. An alternative derivation is an indirect one: among the three terms to be derived in ∑_{(i1,i2,i3)}^3 σ_{i1 k}σ_{i2 k}σ_{i3 l} (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[k]}, the term σ_{i1 k}σ_{i2 l}σ_{i3 k} (Σ_{kl,kl}^{−1} c̄_{kl,r}^{(L)})_{[k]} is immediately obtained as above, and the remaining two terms are logically obtained due to the symmetry of the derivatives. Note that although this may look like a case of circular reasoning, the logic is valid since the derivation is to find the actual symmetric form rather than to prove the symmetry of partial derivatives.

For the term of f̄^{(r)}_{(3)}(c_{klm,r}^{(L)}), note that

σ_{i1 k} σ_{i2 l|k} σ_{i3 m|kl} = σ_{i1 k} (σ_{i2 l} − σ_{i2 k} σ_{kk}^{−1} σ_{kl}) (σ_{i3 m} − σ_{i3,kl}^T Σ_{kl,kl}^{−1} σ_{m,kl})
= σ_{i1 k}σ_{i2 l}σ_{i3 m} − σ_{i1 k}σ_{i2 l}σ_{i3 k} (σ^{kk}, σ^{kl}) σ_{m,kl} − σ_{i1 k}σ_{i2 l}σ_{i3 l} (σ^{lk}, σ^{ll}) σ_{m,kl}
  − {σ_{i1 k}σ_{i2 k}σ_{i3 m} σ_{kk}^{−1} σ_{kl} − σ_{i1 k}σ_{i2 k}σ_{i3 l} σ_{kk}^{−1} σ_{kl} (σ^{lk}, σ^{ll}) σ_{m,kl}}
  + σ_{i1 k}σ_{i2 k}σ_{i3 k} σ_{kk}^{−1} σ_{kl} (σ^{kk}, σ^{kl}) σ_{m,kl}.

In the above result, the term −σ_{i1 k}σ_{i2 l}σ_{i3 l} (σ^{lk}, σ^{ll}) σ_{m,kl}, after taking the sum over k and l (k ≠ l), is equal to that of the term exchanging k and l. The term in braces is

−{σ_{i1 k}σ_{i2 k}σ_{i3 m} σ_{kk}^{−1} σ_{kl} − σ_{i1 k}σ_{i2 k}σ_{i3 l} σ_{kk}^{−1} σ_{kl} (σ^{lk}, σ^{ll}) σ_{m,kl}}
= −{σ_{i1 k}σ_{i2 k}σ_{i3 m} σ_{kk}^{−1} σ_{kl} − σ_{i1 k}σ_{i2 k}σ_{i3 l} σ_{kk}^{−1} σ_{kl} (σ^{lk}, σ^{ll}) σ_{m,kl}}
  + σ_{i1 k}σ_{i2 k}σ_{i3 l} (σ^{kk}, σ^{kl}) σ_{m,kl} − σ_{i1 k}σ_{i2 k}σ_{i3 l} (σ^{kk}, σ^{kl}) σ_{m,kl}
= −σ_{i1 k}σ_{i2 k}σ_{i3 m} σ_{kk}^{−1} σ_{kl}
  + σ_{i1 k}σ_{i2 k}σ_{i3 l} {σ_{kk}^{−1} σ_{kl}(σ_{kk}σ_{lm} − σ_{kl}σ_{km}) + σ_{ll}σ_{km} − σ_{kl}σ_{lm}}/(σ_{kk}σ_{ll} − σ_{kl}²)
  − σ_{i1 k}σ_{i2 k}σ_{i3 l} (σ^{kk}, σ^{kl}) σ_{m,kl}
= −σ_{i1 k}σ_{i2 k}σ_{i3 m} σ_{kk}^{−1} σ_{kl} + σ_{i1 k}σ_{i2 k}σ_{i3 l} σ_{kk}^{−1} σ_{km} (σ_{kk}σ_{ll} − σ_{kl}²)/(σ_{kk}σ_{ll} − σ_{kl}²)
  − σ_{i1 k}σ_{i2 k}σ_{i3 l} (σ^{kk}, σ^{kl}) σ_{m,kl}
= −σ_{i1 k}σ_{i2 k}σ_{i3 m} σ_{kk}^{−1} σ_{kl} + σ_{i1 k}σ_{i2 k}σ_{i3 l} σ_{kk}^{−1} σ_{km} − σ_{i1 k}σ_{i2 k}σ_{i3 l} (σ^{kk}, σ^{kl}) σ_{m,kl}.

In the above result, the term −σ_{i1 k}σ_{i2 k}σ_{i3 m} σ_{kk}^{−1} σ_{kl}, after taking the sum over l and m (l ≠ m), is equal to that of the term exchanging l and m. Consequently, the sum of the first two terms on the right-hand side of the last equation vanishes. Then, it is found that the asymmetric expression is equal to the symmetric expression. An alternative indirect proof is to use the term −σ_{i1 k}σ_{i2 l}σ_{i3 k}(σ^{kk}, σ^{kl})σ_{m,kl} and extend it to −σ_{i1 k}σ_{i2 k}σ_{i3 l}(σ^{kk}, σ^{kl})σ_{m,kl} and −σ_{i1 l}σ_{i2 k}σ_{i3 k}(σ^{kk}, σ^{kl})σ_{m,kl}, which hold due to the symmetric property of derivatives. Q.E.D.
Theorem 1.4 The third multivariate cumulants of X_{i1}, X_{i2} and X_{i3} are

κ₃(X_{i1}, X_{i2}, X_{i3}) = κ₃{(X₁)_{i1}, (X₁)_{i2}, (X₁)_{i3}}
= E{(X₁)_{i1}(X₁)_{i2}(X₁)_{i3}} − ∑_{(i1,i2,i3)}^3 E{(X₁)_{i1}(X₁)_{i2}} E{(X₁)_{i3}}
  + 2 E{(X₁)_{i1}} E{(X₁)_{i2}} E{(X₁)_{i3}}   (i1, i2, i3 = 1, …, p),

where ∑_{(i1,i2,i3)}^3 (·) indicates the sum of three terms considering the symmetric property for i1, i2 and i3; and the expectations are given in Theorem 1.2, Lemma 1.6, and Lemma 1.7.
Lemma 1.7 gives the following asymmetric result, with undefined notations defined similarly as before.

Lemma 1.8

E{(X₁)_{i1}(X₁)_{i2}(X₁)_{i3}(X₁)_{i4}}
= a^{−1} ∑_{r=1}^R [ ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k}σ_{i2 k}σ_{i3 k}σ_{i4 k} (c̄_{kr}^{(L_k)3} σ_{kk}^{−3} − 3c̄_{kr}^{(L_k)} σ_{kk}^{−2}) f̄^{(r)}_{(1)}(c_{kr}^{(L_k)})
  + ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} { σ_{i1 k}σ_{i2 k}σ_{i3 k}σ_{i4 l|k} (c̄_{kr}^{(L_k)2} σ_{kk}^{−2} − σ_{kk}^{−1})
    − σ_{i1 k}σ_{i2/k}σ_{i3 l|k}σ_{i4 k} − σ_{i1 k}σ_{i2 l|k}σ_{i3/kl}^T σ_{i4,kl}
    + (σ_{i1 k}σ_{i2/k}σ_{i3 l|k} c̄_{kr}^{(L_k)} + σ_{i1 k}σ_{i2 l|k}σ_{i3/kl}^T c̄_{kl,r}^{(L)}) σ_{i4/kl}^T c̄_{kl,r}^{(L)} } f̄^{(r)}_{(2)}(c_{kl,r}^{(L)})
  + ∑_{k,l,m=1 (k,l,m:≠)}^p ∑_{L_k,L_l,L_m=0}^1 (−1)^{L_k+L_l+L_m+1} { (σ_{i1 k}σ_{i2/k}σ_{i3 l|k} c̄_{kr}^{(L_k)} + σ_{i1 k}σ_{i2 l|k}σ_{i3/kl}^T c̄_{kl,r}^{(L)}) σ_{i4 m|kl}
    + σ_{i1 k}σ_{i2 l|k}σ_{i3 m|kl} σ_{i4/klm}^T c̄_{klm,r}^{(L)} } f̄^{(r)}_{(3)}(c_{klm,r}^{(L)})
  + ∑_{k,l,m,n=1 (k,l,m,n:≠)}^p ∑_{L_k,L_l,L_m,L_n=0}^1 (−1)^{L_k+L_l+L_m+L_n} σ_{i1 k}σ_{i2 l|k}σ_{i3 m|kl}σ_{i4 n|klm} f̄^{(r)}_{(4)}(c_{klmn,r}^{(L)}) ]
  (i1, i2, i3, i4 = 1, …, p).
Theorem 1.5 The fourth multivariate cumulants of X_{i1}, X_{i2}, X_{i3} and X_{i4} are

κ₄(X_{i1}, X_{i2}, X_{i3}, X_{i4}) = κ₄{(X₁)_{i1}, (X₁)_{i2}, (X₁)_{i3}, (X₁)_{i4}}
= E{(X₁)_{i1}(X₁)_{i2}(X₁)_{i3}(X₁)_{i4}}
  − ∑^4 E{(X₁)_{i1}(X₁)_{i2}(X₁)_{i3}} E{(X₁)_{i4}}
  − ∑^3 E{(X₁)_{i1}(X₁)_{i2}} E{(X₁)_{i3}(X₁)_{i4}}
  + 2 ∑^6 E{(X₁)_{i1}(X₁)_{i2}} E{(X₁)_{i3}} E{(X₁)_{i4}}
  − 6 E{(X₁)_{i1}} E{(X₁)_{i2}} E{(X₁)_{i3}} E{(X₁)_{i4}}   (i1, i2, i3, i4 = 1, …, p),

where ∑^4, ∑^3 and ∑^6 indicate the sums of the distinct terms of the indicated types over the symmetric permutations of i1, …, i4;
the expectations are given in Theorems 1.2 to 1.4, Remark 1.3, Lemmas 1.6, 1.7, and 1.8. The corresponding symmetric expression becomes complicated and is not shown. The following results, given from Lemma 1.8, are provided for possible use for higher-order cumulants.

Lemma 1.9

E{(X₁)_{i1} ⋯ (X₁)_{i5}}
= a^{−1} ∑_{r=1}^R [_{(A)} ∑_{k=1}^p ∑_{L_k=0}^1 (−1)^{L_k+1} σ_{i1 k} ⋯ σ_{i5 k} (c̄_{kr}^{(L_k)4} σ_{kk}^{−4} − 6c̄_{kr}^{(L_k)2} σ_{kk}^{−3} + 3σ_{kk}^{−2}) f̄^{(r)}_{(1)}(c_{kr}^{(L_k)})
  + ∑_{k,l=1, k≠l}^p ∑_{L_k,L_l=0}^1 (−1)^{L_k+L_l} [ σ_{i1 k}σ_{i2 k}σ_{i3 k}σ_{i4 k}σ_{i5 l|k} (c̄_{kr}^{(L_k)3} σ_{kk}^{−3} − 3c̄_{kr}^{(L_k)} σ_{kk}^{−2})
    − 2σ_{i1 k}σ_{i2 k}σ_{i3 k}σ_{i4 l|k}σ_{i5 k} σ_{kk}^{−2} c̄_{kr}^{(L_k)}
    − (σ_{i1 k}σ_{i2/k}σ_{i3 l|k}σ_{i5 k} + σ_{i1 k}σ_{i2 l|k}σ_{i3/kl}^T σ_{i5,kl}) σ_{i4/kl}^T c̄_{kl,r}^{(L)}
    − (σ_{i1 k}σ_{i2/k}σ_{i3 l|k} c̄_{kr}^{(L_k)} + σ_{i1 k}σ_{i2 l|k}σ_{i3/kl}^T c̄_{kl,r}^{(L)}) σ_{i4/kl}^T σ_{i5,kl}
    + { σ_{i1 k}σ_{i2 k}σ_{i3 k}σ_{i4 l|k} (c̄_{kr}^{(L_k)2} σ_{kk}^{−2} − σ_{kk}^{−1}) − σ_{i1 k}σ_{i2/k}σ_{i3 l|k}σ_{i4 k} − σ_{i1 k}σ_{i2 l|k}σ_{i3/kl}^T σ_{i4,kl}
      + (σ_{i1 k}σ_{i2/k}σ_{i3 l|k} c̄_{kr}^{(L_k)} + σ_{i1 k}σ_{i2 l|k}σ_{i3/kl}^T c̄_{kl,r}^{(L)}) σ_{i4/kl}^T c̄_{kl,r}^{(L)} } σ_{i5/kl}^T c̄_{kl,r}^{(L)} ] f̄^{(r)}_{(2)}(c_{kl,r}^{(L)})
  + ∑_{k,l,m=1 (k,l,m:≠)}^p ∑_{L_k,L_l,L_m=0}^1 (−1)^{L_k+L_l+L_m+1} [ {_{(B)} σ_{i1 k}σ_{i2 k}σ_{i3 k}σ_{i4 l|k} (c̄_{kr}^{(L_k)2} σ_{kk}^{−2} − σ_{kk}^{−1})
      − σ_{i1 k}σ_{i2/k}σ_{i3 l|k}σ_{i4 k} − σ_{i1 k}σ_{i2 l|k}σ_{i3/kl}^T σ_{i4,kl}
      + (σ_{i1 k}σ_{i2/k}σ_{i3 l|k} c̄_{kr}^{(L_k)} + σ_{i1 k}σ_{i2 l|k}σ_{i3/kl}^T c̄_{kl,r}^{(L)}) σ_{i4/kl}^T c̄_{kl,r}^{(L)} }_{(B)} σ_{i5 m|kl}
    − (σ_{i1 k}σ_{i2/k}σ_{i3 l|k}σ_{i5/k} + σ_{i1 k}σ_{i2 l|k}σ_{i3/kl}^T σ_{i5,kl}) σ_{i4 m|kl}
    − σ_{i1 k}σ_{i2 l|k}σ_{i3 m|kl} σ_{i4/klm}^T σ_{i5,klm}
    + { (σ_{i1 k}σ_{i2/k}σ_{i3 l|k} c̄_{kr}^{(L_k)} + σ_{i1 k}σ_{i2 l|k}σ_{i3/kl}^T c̄_{kl,r}^{(L)}) σ_{i4 m|kl}
      + σ_{i1 k}σ_{i2 l|k}σ_{i3 m|kl} σ_{i4/klm}^T c̄_{klm,r}^{(L)} } σ_{i5/klm}^T c̄_{klm,r}^{(L)} ] f̄^{(r)}_{(3)}(c_{klm,r}^{(L)})
  + ∑_{k,l,m,n=1 (k,l,m,n:≠)}^p ∑_{L_k,L_l,L_m,L_n=0}^1 (−1)^{L_k+L_l+L_m+L_n} [ { (σ_{i1 k}σ_{i2/k}σ_{i3 l|k} c̄_{kr}^{(L_k)} + σ_{i1 k}σ_{i2 l|k}σ_{i3/kl}^T c̄_{kl,r}^{(L)}) σ_{i4 m|kl}
      + σ_{i1 k}σ_{i2 l|k}σ_{i3 m|kl} σ_{i4/klm}^T c̄_{klm,r}^{(L)} } σ_{i5 n|klm}
    + σ_{i1 k}σ_{i2 l|k}σ_{i3 m|kl}σ_{i4 n|klm} σ_{i5/klmn}^T c̄_{klmn,r}^{(L)} ] f̄^{(r)}_{(4)}(c_{klmn,r}^{(L)})
  + ∑_{k,l,m,n,q=1 (k,l,m,n,q:≠)}^p ∑_{L_k,L_l,L_m,L_n,L_q=0}^1 (−1)^{L_k+L_l+L_m+L_n+L_q+1} σ_{i1 k}σ_{i2 l|k}σ_{i3 m|kl}σ_{i4 n|klm}σ_{i5 q|klmn} f̄^{(r)}_{(5)}(c_{klmnq,r}^{(L)}) ]_{(A)}
  (i1, …, i5 = 1, …, p),

where, e.g., the subscripted brackets [·]_{(A)} and {·}_{(B)} are for ease of finding correspondence between opening and closing delimiters.

From the results obtained earlier, we have some rules for the computation of

E{(X₁)_{i1} ⋯ (X₁)_{is}} = ∂^s Φ_p^{(a)}/(∂t_{i1} ⋯ ∂t_{is}) |_{t=0}   (i_s = 1, …, p; s = 0, 1, …),

where ∂⁰Φ_p^{(a)}/∂t_{i0} |_{t=0} ≡ Φ_p^{(a)} |_{t=0} = 1 as addressed before.
Lemma 1.10 Suppose that we have derived the following derivative:

∂^s Φ_p^{(a)}/(∂t_{i1} ⋯ ∂t_{is}) |_{t=0}
= a^{−1} ∑_{r=1}^R [ ∑_{k_1=1}^p ∑_{L_1=0}^1 (−1)^{L_1+1} g_{s1}(c_{k_1 r}^{(L_1)}) f̄^{(r)}_{(1)}(c_{k_1 r}^{(L_1)})
  + ∑_{k_1,k_2=1, k_1≠k_2}^p ∑_{L_1,L_2=0}^1 (−1)^{L_1+L_2} g_{s2}(c_{k_1 k_2,r}^{(L)}) f̄^{(r)}_{(2)}(c_{k_1 k_2,r}^{(L)})
  + ⋯ + ∑_{k_1,…,k_s=1 (k_1,…,k_s:≠)}^p ∑_{L_1,…,L_s=0}^1 (−1)^{L_1+⋯+L_s+s} g_{ss}(c_{k_1,…,k_s,r}^{(L)}) f̄^{(r)}_{(s)}(c_{k_1,…,k_s,r}^{(L)}) ]
  (i_u = 1, …, p; u = 1, 2, …, s).

Then, we obtain

∂^{s+1} Φ_p^{(a)}/(∂t_{i1} ⋯ ∂t_{i_{s+1}}) |_{t=0}
= a^{−1} ∑_{r=1}^R [ ∑_{k_1=1}^p ∑_{L_1=0}^1 (−1)^{L_1+1} { ∂g_{s1}(c_{k_1 r}^{(L_1)})/∂t_{i_{s+1}} + g_{s1}(c_{k_1 r}^{(L_1)}) σ_{i_{s+1}/k_1} c̄_{k_1 r}^{(L_1)} } f̄^{(r)}_{(1)}(c_{k_1 r}^{(L_1)})
  + ∑_{u=2}^s ∑_{k_1,…,k_u=1 (k_1,…,k_u:≠)}^p ∑_{L_1,…,L_u=0}^1 (−1)^{L_1+⋯+L_u+u} { g_{s,u−1}(c_{k_1,…,k_{u−1},r}^{(L)}) σ_{i_{s+1} k_u|k_1,…,k_{u−1}}
    + ∂g_{su}(c_{k_1,…,k_u,r}^{(L)})/∂t_{i_{s+1}} + g_{su}(c_{k_1,…,k_u,r}^{(L)}) σ_{i_{s+1}/k_1,…,k_u}^T c̄_{k_1,…,k_u,r}^{(L)} } f̄^{(r)}_{(u)}(c_{k_1,…,k_u,r}^{(L)})
  + ∑_{k_1,…,k_{s+1}=1 (k_1,…,k_{s+1}:≠)}^p ∑_{L_1,…,L_{s+1}=0}^1 (−1)^{L_1+⋯+L_{s+1}+s+1} σ_{i1 k_1} σ_{i2 k_2|k_1} ⋯ σ_{i_{s+1} k_{s+1}|k_1,…,k_s} f̄^{(r)}_{(s+1)}(c_{k_1,…,k_{s+1},r}^{(L)}) ].

Proof The terms except the last one in brackets are given by construction. The last term with the factor f̄^{(r)}_{(s+1)}(c_{k_1,…,k_{s+1},r}^{(L)}) is obtained by induction. Q.E.D.

The closed formula for the first term using the Hermite polynomials will be given later. An application of Lemma 1.10 is given below.

Lemma 1.11 For E{(X₁)_{i1} ⋯ (X₁)_{i6}} (i1, …, i6 = 1, …, p), let

∂⁶Φ_p^{(a)}/(∂t_{i1} ⋯ ∂t_{i6}) |_{t=0}
= a^{−1} ∑_{r=1}^R ∑_{u=1}^6 ∑_{k_1,…,k_u=1 (k_1,…,k_u:≠)}^p ∑_{L_1,…,L_u=0}^1 (−1)^{L_1+⋯+L_u+u} g_{6u}(c_{k_1,…,k_u,r}^{(L)}) f̄^{(r)}_{(u)}(c_{k_1,…,k_u,r}^{(L)}).
Then, we have

g_{61}(c_{k_1 r}^{(L_1)}) = σ_{i1 k_1} ⋯ σ_{i6 k_1} (c̄_{k_1 r}^{(L_1)5} σ_{k_1 k_1}^{−5} − 10c̄_{k_1 r}^{(L_1)3} σ_{k_1 k_1}^{−4} + 15c̄_{k_1 r}^{(L_1)} σ_{k_1 k_1}^{−3});

∂g_{52}(c_{k_1 k_2,r}^{(L)})/∂t_{i6}
= −σ_{i1 k_1} ⋯ σ_{i4 k_1} σ_{i5 k_2|k_1} σ_{i6 k_1} (3c̄_{k_1 r}^{(L_1)2} σ_{k_1 k_1}^{−3} − 3σ_{k_1 k_1}^{−2})
  + 2σ_{i1 k_1}σ_{i2 k_1}σ_{i3 k_1}σ_{i4 k_2|k_1}σ_{i5 k_1}σ_{i6 k_1} σ_{k_1 k_1}^{−2}
  + (σ_{i1 k_1}σ_{i2/k_1}σ_{i3 k_2|k_1}σ_{i5 k_1} + σ_{i1 k_1}σ_{i2 k_2|k_1}σ_{i3/k_1 k_2}^T σ_{i5,k_1 k_2}) σ_{i4/k_1 k_2}^T σ_{i6,k_1 k_2}
  + (σ_{i1 k_1}σ_{i2/k_1}σ_{i3 k_2|k_1}σ_{i6 k_1} + σ_{i1 k_1}σ_{i2 k_2|k_1}σ_{i3/k_1 k_2}^T σ_{i6,k_1 k_2}) σ_{i4/k_1 k_2}^T σ_{i5,k_1 k_2}
  − { 2σ_{i1 k_1}σ_{i2 k_1}σ_{i3 k_1}σ_{i4 k_2|k_1}σ_{i6 k_1} σ_{k_1 k_1}^{−2} c̄_{k_1 r}^{(L_1)}
    + (σ_{i1 k_1}σ_{i2/k_1}σ_{i3 k_2|k_1}σ_{i6 k_1} + σ_{i1 k_1}σ_{i2 k_2|k_1}σ_{i3/k_1 k_2}^T σ_{i6,k_1 k_2}) σ_{i4/k_1 k_2}^T c̄_{k_1 k_2,r}^{(L)}
    + (σ_{i1 k_1}σ_{i2/k_1}σ_{i3 k_2|k_1} c̄_{k_1 r}^{(L_1)} + σ_{i1 k_1}σ_{i2 k_2|k_1}σ_{i3/k_1 k_2}^T c̄_{k_1 k_2,r}^{(L)}) σ_{i4/k_1 k_2}^T σ_{i6,k_1 k_2} } σ_{i5/k_1 k_2}^T c̄_{k_1 k_2,r}^{(L)}
  − g_{42}(c_{k_1 k_2,r}^{(L)}) σ_{i5/k_1 k_2}^T σ_{i6,k_1 k_2};

∂g_{53}(c_{k_1 k_2 k_3,r}^{(L)})/∂t_{i6}
= −{ 2σ_{i1 k_1}σ_{i2 k_1}σ_{i3 k_1}σ_{i4 k_2|k_1}σ_{i6 k_1} σ_{k_1 k_1}^{−2} c̄_{k_1 r}^{(L_1)}
    + (σ_{i1 k_1}σ_{i2/k_1}σ_{i3 k_2|k_1}σ_{i6 k_1} + σ_{i1 k_1}σ_{i2 k_2|k_1}σ_{i3/k_1 k_2}^T σ_{i6,k_1 k_2}) σ_{i4/k_1 k_2}^T c̄_{k_1 k_2,r}^{(L)}
    + (σ_{i1 k_1}σ_{i2/k_1}σ_{i3 k_2|k_1} c̄_{k_1 r}^{(L_1)} + σ_{i1 k_1}σ_{i2 k_2|k_1}σ_{i3/k_1 k_2}^T c̄_{k_1 k_2,r}^{(L)}) σ_{i4/k_1 k_2}^T σ_{i6,k_1 k_2} } σ_{i5 k_3|k_1 k_2}
  − { (σ_{i1 k_1}σ_{i2/k_1}σ_{i3 k_2|k_1}σ_{i6 k_1} + σ_{i1 k_1}σ_{i2 k_2|k_1}σ_{i3/k_1 k_2}^T σ_{i6,k_1 k_2}) σ_{i4 k_3|k_1 k_2}
    + σ_{i1 k_1}σ_{i2 k_2|k_1}σ_{i3 k_3|k_1 k_2} σ_{i4/k_1 k_2 k_3}^T σ_{i6,k_1 k_2 k_3} } σ_{i5/k_1 k_2 k_3}^T c̄_{k_1 k_2 k_3,r}^{(L)}
  − g_{43}(c_{k_1 k_2 k_3,r}^{(L)}) σ_{i5/k_1 k_2 k_3}^T σ_{i6,k_1 k_2 k_3};

∂g_{54}(c_{k_1 k_2 k_3 k_4,r}^{(L)})/∂t_{i6}
= −{ (σ_{i1 k_1}σ_{i2/k_1}σ_{i3 k_2|k_1}σ_{i6 k_1} + σ_{i1 k_1}σ_{i2 k_2|k_1}σ_{i3/k_1 k_2}^T σ_{i6,k_1 k_2}) σ_{i4 k_3|k_1 k_2}
    + σ_{i1 k_1}σ_{i2 k_2|k_1}σ_{i3 k_3|k_1 k_2} σ_{i4/k_1 k_2 k_3}^T σ_{i6,k_1 k_2 k_3} } σ_{i5 k_4|k_1 k_2 k_3}
  − σ_{i1 k_1}σ_{i2 k_2|k_1}σ_{i3 k_3|k_1 k_2}σ_{i4 k_4|k_1 k_2 k_3} σ_{i5/k_1 k_2 k_3 k_4}^T σ_{i6,k_1 k_2 k_3 k_4};

∂g_{55}(c_{k_1 k_2 k_3 k_4 k_5,r}^{(L)})/∂t_{i6} = 0.
1.5 The Product Sum of Natural Numbers and the Hermite Polynomials

As shown in the previous section, g_{s1}(c_{k_1 r}^{(L_1)}) is associated with the probabilist's Hermite polynomial, which is given up to the ninth order as follows:
h₀ = 1,
h₁ = x,
h₂ = x² − 1,
h₃ = x³ − 3x,
h₄ = x⁴ − 6x² + 3,
h₅ = x⁵ − 10x³ + 15x,
h₆ = x⁶ − 15x⁴ + 45x² − 15,
h₇ = x⁷ − 21x⁵ + 105x³ − 105x,
h₈ = x⁸ − 28x⁶ + 210x⁴ − 420x² + 105,
h₉ = x⁹ − 36x⁷ + 378x⁵ − 1260x³ + 945x.

Probably, the most popular definition of the Hermite polynomial is given by

h_s = h_s(x) = {(−1)^s/φ(x)} d^s φ(x)/dx^s = (−1)^s exp(x²/2) (d^s/dx^s) exp(−x²/2)   (s = 0, 1, …),

where φ(x) = φ₁(x | 0, 1) and d⁰φ(x)/dx⁰ ≡ φ(x). Recall that when p = 1, R = 1, a₁ = −∞, b₁ = x, μ = 0 and Σ = σ² = 1, the mgf of the STN-distributed variable X becomes as simple as

M_X(t) = {Φ(x − t)/Φ(x)} exp(t²/2).
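The ten polynomials listed above satisfy the standard three-term recurrence h_{s+1}(x) = x h_s(x) − s h_{s−1}(x) (a known property of the probabilist's Hermite polynomials, not stated in the text). The sketch below generates their coefficients from this recurrence and compares two of them against the list:

```python
import numpy as np

def hermite_prob(n):
    """Probabilist's Hermite polynomials h_0..h_n via h_{s+1} = x*h_s - s*h_{s-1}.

    Each polynomial is a coefficient array, lowest degree first.
    """
    polys = [np.array([1.0]), np.array([0.0, 1.0])]  # h_0 = 1, h_1 = x
    for s in range(1, n):
        xh = np.concatenate(([0.0], polys[s]))                 # x * h_s
        prev = np.concatenate((polys[s - 1], [0.0, 0.0]))      # h_{s-1}, zero-padded
        xh = np.concatenate((xh, np.zeros(len(prev) - len(xh))))
        polys.append(xh - s * prev)
    return polys

hs = hermite_prob(9)
x = 1.3
# h_4(x) should equal x^4 - 6x^2 + 3 and h_6(x) should equal x^6 - 15x^4 + 45x^2 - 15
print(np.polyval(hs[4][::-1], x), x**4 - 6 * x**2 + 3)
print(np.polyval(hs[6][::-1], x), x**6 - 15 * x**4 + 45 * x**2 - 15)
```

The same arrays reproduce all coefficients listed above, up to h₉.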
Consider the associated pseudo variable. Then, noting that h_s(x) is also given by

h_s = h_s(x) = {1/φ(x)} d^s φ(x − t)/dt^s |_{t=0},

the moments

E(X₁^s) = (d^s/dt^s) {Φ(x − t)/Φ(x)} |_{t=0}   (s = 0, 1, …)

are written as

E(X₁⁰) = E(1) = (d⁰/dt⁰) {Φ(x − t)/Φ(x)} |_{t=0} = {Φ(x − t)/Φ(x)} |_{t=0} = 1
as expected,

E(X₁) = (d/dt) {Φ(x − t)/Φ(x)} |_{t=0} = −φ(x − t)/Φ(x) |_{t=0} = −φ(x)/Φ(x),

and

E(X₁^s) = (d^s/dt^s) {Φ(x − t)/Φ(x)} |_{t=0} = −(d^{s−1}/dt^{s−1}) {φ(x − t)/Φ(x)} |_{t=0} = −h_{s−1}(x) φ(x)/Φ(x)   (s = 2, 3, …).

The negative sign when s ≥ 2 is due to (−1)^{L_1+1} = −1 (see Lemma 1.10), which holds when c_{1r}^{(L_1)} = c_{11}^{(L_1)} = a_{11}^{L_1} b_{11}^{1−L_1} = b_{11} with L₁ = 0, while a₁₁ = −∞ corresponding to L₁ = 1 does not contribute to the derivative with respect to t. The above result shows that in the simple case, using h₀(x) = 1, we obtain

E(X₁^s) = −h_{s−1}(x) φ(x)/Φ(x)   (s = 1, 2, …),   (1.2)

which is proportional to h_{s−1}(x). When s = 1, we have

E(X) = E(X₁) + μ = −φ(x)/Φ(x) + μ,
which shows a negative shift of the mean from μ due to upper-tail truncation.

Lemma 1.12 When p = 1, we have

E(X₁^s) = a^{−1} σ^s ∑_{r=1}^R ∑_{L_1=0}^1 (−1)^{L_1+1} h_{s−1}{(c_{1r}^{(L_1)} − μ)/σ} φ{(c_{1r}^{(L_1)} − μ)/σ}   (s = 1, 2, …),

where a = ∑_{r=1}^R [Φ{(b_{1r} − μ)/σ} − Φ{(a_{1r} − μ)/σ}].
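In the simplest case R = 1, a₁ = −∞, b₁ = x, μ = 0, σ = 1, Lemma 1.12 reduces to (1.2): the s-th derivative of the pseudo mgf Φ(x − t)/Φ(x) at t = 0 equals −h_{s−1}(x)φ(x)/Φ(x). A finite-difference sketch (the value of x is arbitrary):

```python
import numpy as np
from scipy.stats import norm

x = 0.7                                   # upper truncation point
g = lambda t: norm.cdf(x - t) / norm.cdf(x)   # pseudo mgf of X1

# probabilist's Hermite polynomials h_0, h_1, h_2
h = [lambda u: 1.0, lambda u: u, lambda u: u**2 - 1.0]

eps = 1e-3
# central finite-difference derivatives of g at t = 0, orders 1..3
d1 = (g(eps) - g(-eps)) / (2 * eps)
d2 = (g(eps) - 2 * g(0.0) + g(-eps)) / eps**2
d3 = (g(2 * eps) - 2 * g(eps) + 2 * g(-eps) - g(-2 * eps)) / (2 * eps**3)

ratio = norm.pdf(x) / norm.cdf(x)
closed = [-h[s](x) * ratio for s in range(3)]  # -h_{s-1}(x) phi(x)/Phi(x) for s = 1, 2, 3
print(d1, d2, d3, closed)
```

Replacing g with a sum over several standardized intervals checks the general R ≥ 1 statement of the lemma in the same way.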
Proof Note that when p = 1, Lemma 1.10 gives

E(X₁^s) = ∂^s Φ_p^{(a)}/(∂t_{i1} ⋯ ∂t_{is}) |_{t=0} = d^s Φ^{(a)}/dt^s |_{t=0}
= a^{−1} ∑_{r=1}^R ∑_{k_1=1}^p ∑_{L_1=0}^1 (−1)^{L_1+1} g_{s1}(c_{k_1 r}^{(L_1)}) f̄^{(r)}_{(1)}(c_{k_1 r}^{(L_1)})
= a^{−1} ∑_{r=1}^R ∑_{L_1=0}^1 (−1)^{L_1+1} g_{s1}(c_{1r}^{(L_1)} | μ, σ²) f̄^{(r)}_{(1)}(c_{1r}^{(L_1)} | μ, σ²)
= a^{−1} σ^s ∑_{r=1}^R ∑_{L_1=0}^1 (−1)^{L_1+1} g_{s1}{(c_{1r}^{(L_1)} − μ)/σ | 0, 1} f̄^{(r)}_{(1)}{(c_{1r}^{(L_1)} − μ)/σ | 0, 1}
= a^{−1} σ^s ∑_{r=1}^R ∑_{L_1=0}^1 (−1)^{L_1+1} h_{s−1}{(c_{1r}^{(L_1)} − μ)/σ} φ{(c_{1r}^{(L_1)} − μ)/σ},

where t_{i1} = t_1 = t; g_{s1}(· | μ, σ²) and g_{s1}(· | 0, 1) are used to show that they are functions using different parameters;

a = ∑_{r=1}^R {Φ(b_{1r} | μ, σ²) − Φ(a_{1r} | μ, σ²)}
= ∑_{r=1}^R [Φ{(b_{1r} − μ)/σ | 0, 1} − Φ{(a_{1r} − μ)/σ | 0, 1}]
= ∑_{r=1}^R [Φ{(b_{1r} − μ)/σ} − Φ{(a_{1r} − μ)/σ}];

and the validity of the use of the Hermite polynomial is given by the simple result shown earlier (see (1.2)) with scaling and extension to the case R ≥ 1, which holds since aE(X₁^s) is the sum of R terms each of which can be expressed using the Hermite polynomial. Q.E.D.

In Lemma 1.10, let

g_{s1}(c_{k_1 r}^{(L_1)}) = g_{s1}(c_{kr}^{(L)}) = g_{s1}   (s = 1, 2, …; k_1 ≡ k = 1, …, p; L_1 ≡ L = 0, 1)

for simplicity of notation. Then, by construction, we obtain
g₁₁ = σ_{i1 k},
g₂₁ = σ_{i1 k}σ_{i2 k} σ_{kk}^{−1} c̄_{kr}^{(L)},
g₃₁ = σ_{i1 k}σ_{i2 k}σ_{i3 k} (c̄_{kr}^{(L)2} σ_{kk}^{−2} − σ_{kk}^{−1}),
g₄₁ = σ_{i1 k} ⋯ σ_{i4 k} {c̄_{kr}^{(L)3} σ_{kk}^{−3} − (1 + 2) c̄_{kr}^{(L)} σ_{kk}^{−2}},
g₅₁ = σ_{i1 k} ⋯ σ_{i5 k} {c̄_{kr}^{(L)4} σ_{kk}^{−4} − (1 + 2 + 3) c̄_{kr}^{(L)2} σ_{kk}^{−3} + (1 + 2) σ_{kk}^{−2}},
g₆₁ = σ_{i1 k} ⋯ σ_{i6 k} [c̄_{kr}^{(L)5} σ_{kk}^{−5} − (1 + ⋯ + 4) c̄_{kr}^{(L)3} σ_{kk}^{−4} + {1 + 2 + 2(1 + 2 + 3)} c̄_{kr}^{(L)} σ_{kk}^{−3}],
g₇₁ = σ_{i1 k} ⋯ σ_{i7 k} [c̄_{kr}^{(L)6} σ_{kk}^{−6} − (1 + ⋯ + 5) c̄_{kr}^{(L)4} σ_{kk}^{−5}
  + {1 + 2 + 2(1 + 2 + 3) + 3(1 + ⋯ + 4)} c̄_{kr}^{(L)2} σ_{kk}^{−4} − {1 + 2 + 2(1 + 2 + 3)} σ_{kk}^{−3}],
g₈₁ = σ_{i1 k} ⋯ σ_{i8 k} [c̄_{kr}^{(L)7} σ_{kk}^{−7} − (1 + ⋯ + 6) c̄_{kr}^{(L)5} σ_{kk}^{−6}
  + {1 + 2 + 2(1 + 2 + 3) + 3(1 + ⋯ + 4) + 4(1 + ⋯ + 5)} c̄_{kr}^{(L)3} σ_{kk}^{−5}
  − [{1 + 2 + 2(1 + 2 + 3)} + 2{1 + 2 + 2(1 + 2 + 3) + 3(1 + ⋯ + 4)}] c̄_{kr}^{(L)} σ_{kk}^{−4}],
g₉₁ = σ_{i1 k} ⋯ σ_{i9 k} [c̄_{kr}^{(L)8} σ_{kk}^{−8} − (1 + ⋯ + 7) c̄_{kr}^{(L)6} σ_{kk}^{−7}
  + {1 + 2 + 2(1 + 2 + 3) + 3(1 + ⋯ + 4) + 4(1 + ⋯ + 5) + 5(1 + ⋯ + 6)} c̄_{kr}^{(L)4} σ_{kk}^{−6}
  − [{1 + 2 + 2(1 + 2 + 3)} + 2{1 + 2 + 2(1 + 2 + 3) + 3(1 + ⋯ + 4)}
    + 3{1 + 2 + 2(1 + 2 + 3) + 3(1 + ⋯ + 4) + 4(1 + ⋯ + 5)}] c̄_{kr}^{(L)2} σ_{kk}^{−5}
  + [{1 + 2 + 2(1 + 2 + 3)} + 2{1 + 2 + 2(1 + 2 + 3) + 3(1 + ⋯ + 4)}] σ_{kk}^{−4}]
  (i1, …, i9, k = 1, …, p; L = 0, 1; r = 1, …, R),
where the term with σ_{kk}^{-4} in large brackets is rewritten:
\[
\begin{aligned}
&[\{1+2+2(1+2+3)\} + 2\{1+2+2(1+2+3)+3(1+\cdots+4)\}]\,\sigma_{kk}^{-4}\\
&\quad = [1\{1(1+2)+2(1+2+3)\} + 2\{1(1+2)+2(1+2+3)+3(1+\cdots+4)\}]\,\sigma_{kk}^{-4},
\end{aligned}
\]
indicating a three-fold product sum. The above results are summarized as
\[
\begin{aligned}
g_{1-1} &= \sigma_{i_1 k},\\
g_{2-1} &= \sigma_{i_1 k}\sigma_{i_2 k}\,\sigma_{kk}^{-1}c_{kr}^{(L)},\\
g_{3-1} &= \sigma_{i_1 k}\sigma_{i_2 k}\sigma_{i_3 k}\,(c_{kr}^{(L)2}\sigma_{kk}^{-2} - \sigma_{kk}^{-1}),\\
g_{4-1} &= \sigma_{i_1 k}\cdots\sigma_{i_4 k}\,(c_{kr}^{(L)3}\sigma_{kk}^{-3} - 3c_{kr}^{(L)}\sigma_{kk}^{-2}),\\
g_{5-1} &= \sigma_{i_1 k}\cdots\sigma_{i_5 k}\,(c_{kr}^{(L)4}\sigma_{kk}^{-4} - 6c_{kr}^{(L)2}\sigma_{kk}^{-3} + 3\sigma_{kk}^{-2}),\\
g_{6-1} &= \sigma_{i_1 k}\cdots\sigma_{i_6 k}\,(c_{kr}^{(L)5}\sigma_{kk}^{-5} - 10c_{kr}^{(L)3}\sigma_{kk}^{-4} + 15c_{kr}^{(L)}\sigma_{kk}^{-3}),\\
g_{7-1} &= \sigma_{i_1 k}\cdots\sigma_{i_7 k}\,(c_{kr}^{(L)6}\sigma_{kk}^{-6} - 15c_{kr}^{(L)4}\sigma_{kk}^{-5} + 45c_{kr}^{(L)2}\sigma_{kk}^{-4} - 15\sigma_{kk}^{-3}),\\
g_{8-1} &= \sigma_{i_1 k}\cdots\sigma_{i_8 k}\,(c_{kr}^{(L)7}\sigma_{kk}^{-7} - 21c_{kr}^{(L)5}\sigma_{kk}^{-6} + 105c_{kr}^{(L)3}\sigma_{kk}^{-5} - 105c_{kr}^{(L)}\sigma_{kk}^{-4}),\\
g_{9-1} &= \sigma_{i_1 k}\cdots\sigma_{i_9 k}\,(c_{kr}^{(L)8}\sigma_{kk}^{-8} - 28c_{kr}^{(L)6}\sigma_{kk}^{-7} + 210c_{kr}^{(L)4}\sigma_{kk}^{-6} - 420c_{kr}^{(L)2}\sigma_{kk}^{-5} + 105\sigma_{kk}^{-4})\\
&\qquad (i_1, \ldots, i_9,\ k = 1, \ldots, p;\ L = 0, 1;\ r = 1, \ldots, R).
\end{aligned}
\]
It is found that the last results give
\[
g_{s-1}\big|_{\sigma_{kk}=1} = \sigma_{i_1 k}\cdots\sigma_{i_s k}\, h_{s-1}(c_{kr}^{(L)}) \quad (s = 1, \ldots, 9;\ k = 1, \ldots, p;\ L = 0, 1).
\]

Definition 1.2 The j-th order product sum of natural numbers (psnn) up to i, denoted by i^{:j}, is recursively defined as
\[
\begin{aligned}
i^{:0} &= 1, \qquad i^{:1} = i^{:} = \sum_{m=1}^{i} m,\\
i^{:2} &= 1\,(2^{:}) + 2\,(3^{:}) + \cdots + (i-1)(i^{:}) + i\{(i+1)^{:}\}\\
&= 1\cdot 2^{:} + 2\cdot 3^{:} + \cdots + (i-1)\,i^{:} + i\,(i+1)^{:},\\
i^{:3} &= 1\cdot 2^{:2} + 2\cdot 3^{:2} + \cdots + (i-1)\,i^{:2} + i\,(i+1)^{:2},\\
&\ \ \vdots\\
i^{:j} &= 1\cdot 2^{:j-1} + 2\cdot 3^{:j-1} + \cdots + (i-1)\,i^{:j-1} + i\,(i+1)^{:j-1} \quad (i = 1, 2, \ldots;\ j = 0, 1, \ldots).
\end{aligned}
\]
From the last recursive definition, we have
\[
\begin{aligned}
i^{:j} &= 1\cdot 2^{:j-1} + 2\cdot 3^{:j-1} + \cdots + (i-1)\,i^{:j-1} + i\,(i+1)^{:j-1}\\
&= (i-1)^{:j} + i\,(i+1)^{:j-1}\\
&= (i-2)^{:j} + (i-1)\,i^{:j-1} + i\,(i+1)^{:j-1}\\
&\ \ \vdots\\
&= 1^{:j} + 2\cdot 3^{:j-1} + \cdots + (i-1)\,i^{:j-1} + i\,(i+1)^{:j-1} \quad (i = 1, 2, \ldots;\ j = 1, 2, \ldots),
\end{aligned}
\]
where 1·2^{:j-1} = 1^{:j} is used. Some examples of the psnn are given:
\[
\begin{aligned}
1^{:1} &= 1,\qquad 1^{:2} = 2^{:} = 1+2 = 3,\\
1^{:3} &= 2^{:2} = 1(1+2) + 2(1+2+3) = 15,\\
1^{:4} &= 2^{:3} = 1\{1(1+2)+2(1+2+3)\} + 2\{1(1+2)+2(1+2+3)+3(1+\cdots+4)\}\\
&= 1\cdot 15 + 2\cdot 45 = 105,\\
1^{:5} &= 2^{:4} = 1[1\{1(1+2)+2(1+2+3)\} + 2\{1(1+2)+2(1+2+3)+3(1+\cdots+4)\}]\\
&\quad + 2[1\{1(1+2)+2(1+2+3)\} + 2\{1(1+2)+2(1+2+3)+3(1+\cdots+4)\}\\
&\qquad + 3\{1(1+2)+2(1+2+3)+3(1+\cdots+4)+4(1+\cdots+5)\}]\\
&= 1\cdot 105 + 2(15 + 2\cdot 45 + 3\cdot 105) = 945,\\
3^{:} &= 1+2+3 = 6,\qquad 3^{:2} = 1(1+2)+2(1+2+3)+3(1+\cdots+4) = 45,\\
3^{:3} &= 1\{1(1+2)+2(1+2+3)\} + 2\{1(1+2)+2(1+2+3)+3(1+\cdots+4)\}\\
&\quad + 3\{1(1+2)+2(1+2+3)+3(1+\cdots+4)+4(1+\cdots+5)\}\\
&= 1\cdot 15 + 2\cdot 45 + 3\cdot 105 = 420,\\
3^{:4} &= 1\cdot 2^{:3} + 2\cdot 3^{:3} + 3\cdot 4^{:3} = 1\cdot 105 + 2\cdot 420 + 3(420 + 4\cdot 210) = 4725,\\
4^{:} &= 1+\cdots+4 = 10,\qquad 4^{:2} = 1(1+2)+2(1+2+3)+3(1+\cdots+4)+4(1+\cdots+5) = 105,\\
4^{:3} &= 420 + 4\cdot 210 = 1260 \ \text{(given in } 3^{:4}\text{)},\\
5^{:} &= 1+\cdots+5 = 15,\qquad 5^{:2} = 4^{:2} + 5\cdot 6^{:} = 105 + 5\cdot 21 = 210,\\
5^{:3} &= 4^{:3} + 5\cdot 6^{:2} = 1260 + 5(3 + 12 + 3\cdot 10 + 4\cdot 15 + 5\cdot 21 + 6\cdot 28) = 1260 + 5\cdot 378 = 3150.
\end{aligned}
\]
Comparing the summarized results given before Definition 1.2 with the corresponding previous ones, it is found that the absolute values of the fixed integers, e.g., 1, 6 and 3 in g_{5-1} = σ_{i_1 k}···σ_{i_5 k}(c_{kr}^{(L)4}σ_{kk}^{-4} − 6c_{kr}^{(L)2}σ_{kk}^{-3} + 3σ_{kk}^{-2}), are given by the psnn's.
Lemma 1.13
\[
i^{:j} = \frac{(i+2j-1)!}{(2j)!!\,(i-1)!} \quad (i = 1, 2, \ldots;\ j = 0, 1, \ldots),
\]
where (·)!! is the double factorial and 0!! ≡ 1.

Proof 1 The result is given by induction. When j = 0, we have i^{:0} = 1 by definition and
\[
\frac{(i+2j-1)!}{(2j)!!\,(i-1)!} = \frac{(i-1)!}{0!!\,(i-1)!} = 1,
\]
which shows that the equation holds. Assume that the result holds for i and j. Then,
\[
\begin{aligned}
i^{:j+1} &= 1\cdot 2^{:j} + 2\cdot 3^{:j} + \cdots + (i-1)\,i^{:j} + i\,(i+1)^{:j}
= \sum_{m=1}^{i} m\,(m+1)^{:j}
= \sum_{m=1}^{i} m\,\frac{(m+2j)!}{(2j)!!\,m!}\\
&= \frac{1}{(2j)!!}\sum_{m=1}^{i} m(m+1)\cdots(m+2j)\\
&= \frac{1}{(2j)!!}\,\frac{1}{2j+2}\sum_{m=1}^{i}\left\{m(m+1)\cdots(m+2j+1) - (m-1)m\cdots(m+2j)\right\}\\
&= \frac{1}{\{2(j+1)\}!!}\,i(i+1)\cdots(i+2j+1)
= \frac{\{i+2(j+1)-1\}!}{\{2(j+1)\}!!\,(i-1)!} \quad (i = 1, 2, \ldots;\ j = 0, 1, \ldots),
\end{aligned}
\]
which shows that the result holds for i^{:j+1}. Q.E.D.

Proof 2 It is known that the Hermite polynomials are explicitly given by
\[
h_i(x) = \sum_{j=0}^{[i/2]} \frac{(-1)^j\, i!}{(i-2j)!\, j!\, 2^j}\, x^{i-2j}
= \sum_{j=0}^{[i/2]} \frac{(-1)^j\, i!}{(2j)!!\,(i-2j)!}\, x^{i-2j},
\]
where [·] is the Gauss notation of the floor function, i.e., the integer part of the quantity in brackets. Suppose that the above result is given by a method other than that in Proof 1 using, e.g., the generating function exp{tx − (t²/2)} of h_i(x) with φ(x − t) = φ(x)exp{tx − (t²/2)} (see [12, Proposition 1.4.3; 22, Eq. (7); 27, Sect. 5.6; 35, Sects. 6.14–6.15]). Since i^{:j} or (i+1)^{:j} was derived as a coefficient of h_i(x), equating the two sets of results term by term we obtain
\[
(i+1)^{:j} = \frac{(i+2j)!}{(2j)!!\,\{(i+2j)-2j\}!} = \frac{(i+2j)!}{(2j)!!\, i!} \quad (i, j = 0, 1, \ldots).
\]
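Lemma 1.13 and the coefficient identity of Proof 2 can be cross-checked computationally. The following Python sketch (function names are ours) compares the recursion of Definition 1.2 with the closed form:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def psnn(i, j):
    # recursive psnn of Definition 1.2
    return 1 if j == 0 else sum(m * psnn(m + 1, j - 1) for m in range(1, i + 1))

def double_factorial(n):
    return 1 if n <= 0 else n * double_factorial(n - 2)

def psnn_closed(i, j):
    # Lemma 1.13: i^{:j} = (i + 2j - 1)! / {(2j)!! (i - 1)!}
    return math.factorial(i + 2 * j - 1) // (double_factorial(2 * j) * math.factorial(i - 1))

for i in range(1, 8):
    for j in range(0, 6):
        assert psnn(i, j) == psnn_closed(i, j)

# Proof 2: the |coefficient| of x^{i-2j} in h_i(x) is i!/{(2j)!!(i-2j)!};
# e.g. with i = 8, j = 3 it equals (2+1)^{:3} = 3^{:3} = 420
assert psnn(3, 3) == math.factorial(8) // (double_factorial(6) * math.factorial(2)) == 420
```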
Q.E.D.

Proof 1 gives an alternative derivation of the explicit expressions of the coefficients of the Hermite polynomials. Using i^{:j}, g_{s-1} (s = 1,…,9) given earlier are rewritten as
\[
\begin{aligned}
g_{1-1} &= \sigma_{i_1 k},\\
g_{2-1} &= \sigma_{i_1 k}\sigma_{i_2 k}\,\sigma_{kk}^{-1}c_{kr}^{(L)},\\
g_{3-1} &= \sigma_{i_1 k}\sigma_{i_2 k}\sigma_{i_3 k}\,(c_{kr}^{(L)2}\sigma_{kk}^{-2} - \sigma_{kk}^{-1}),\\
g_{4-1} &= \sigma_{i_1 k}\cdots\sigma_{i_4 k}\,(c_{kr}^{(L)3}\sigma_{kk}^{-3} - 2^{:}\,c_{kr}^{(L)}\sigma_{kk}^{-2}),\\
g_{5-1} &= \sigma_{i_1 k}\cdots\sigma_{i_5 k}\,(c_{kr}^{(L)4}\sigma_{kk}^{-4} - 3^{:}\,c_{kr}^{(L)2}\sigma_{kk}^{-3} + 2^{:}\,\sigma_{kk}^{-2}),\\
g_{6-1} &= \sigma_{i_1 k}\cdots\sigma_{i_6 k}\,(c_{kr}^{(L)5}\sigma_{kk}^{-5} - 4^{:}\,c_{kr}^{(L)3}\sigma_{kk}^{-4} + 2^{:2}\,c_{kr}^{(L)}\sigma_{kk}^{-3}),\\
g_{7-1} &= \sigma_{i_1 k}\cdots\sigma_{i_7 k}\,(c_{kr}^{(L)6}\sigma_{kk}^{-6} - 5^{:}\,c_{kr}^{(L)4}\sigma_{kk}^{-5} + 3^{:2}\,c_{kr}^{(L)2}\sigma_{kk}^{-4} - 2^{:2}\,\sigma_{kk}^{-3}),\\
g_{8-1} &= \sigma_{i_1 k}\cdots\sigma_{i_8 k}\,(c_{kr}^{(L)7}\sigma_{kk}^{-7} - 6^{:}\,c_{kr}^{(L)5}\sigma_{kk}^{-6} + 4^{:2}\,c_{kr}^{(L)3}\sigma_{kk}^{-5} - 2^{:3}\,c_{kr}^{(L)}\sigma_{kk}^{-4}),\\
g_{9-1} &= \sigma_{i_1 k}\cdots\sigma_{i_9 k}\,(c_{kr}^{(L)8}\sigma_{kk}^{-8} - 7^{:}\,c_{kr}^{(L)6}\sigma_{kk}^{-7} + 5^{:2}\,c_{kr}^{(L)4}\sigma_{kk}^{-6} - 3^{:3}\,c_{kr}^{(L)2}\sigma_{kk}^{-5} + 2^{:3}\,\sigma_{kk}^{-4})
\end{aligned}
\]
(i_1, …, i_9, k = 1, …, p; L = 0, 1; r = 1, …, R).

The above results indicate the following general expressions.

Theorem 1.6 (i) When s is odd,
\[
\begin{aligned}
g_{s-1} ={}& \sigma_{i_1 k}\cdots\sigma_{i_s k}\Big\{c_{kr}^{(L)(s-1)}\sigma_{kk}^{-(s-1)}
+ \sum_{u=1}^{(s-3)/2}(-1)^u (s-2u)^{:u}\,c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)}\\
&+ (-1)^{(s-1)/2}\,2^{:(s-3)/2}\,\sigma_{kk}^{-(s-1)/2}\Big\}
\quad (i_1, \ldots, i_s,\ k = 1, \ldots, p;\ L = 0, 1;\ r = 1, \ldots, R),
\end{aligned}
\]
where for s = 1, 3, define \(\sum_{u=1}^{(s-3)/2}(\cdot) = 0\) and \(2^{:-1} = 0\) as well as \(2^{:0} = 1\).

(ii) When s is even,
\[
g_{s-1} = \sigma_{i_1 k}\cdots\sigma_{i_s k}\left\{c_{kr}^{(L)(s-1)}\sigma_{kk}^{-(s-1)}
+ \sum_{u=1}^{(s-2)/2}(-1)^u (s-2u)^{:u}\,c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)}\right\}
\]
(i_1, …, i_s, k = 1, …, p; L = 0, 1; r = 1, …, R), where for s = 2, define \(\sum_{u=1}^{(s-2)/2}(\cdot) = 0\).

(iii) When s is a natural number,
\[
g_{s-1} = \sigma_{i_1 k}\cdots\sigma_{i_s k}\sum_{u=0}^{[(s-1)/2]}(-1)^u (s-2u)^{:u}\,c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)}
\]
(i_1, …, i_s, k = 1, …, p; L = 0, 1; r = 1, …, R), where (s−2u)^{:0} = 1 as before.

Proof When s = 1,…,9, it was shown earlier that the results hold.

(i) Odd s: Assume that the result holds for an odd s with s ≥ 5, where 5 is the minimum integer among the cases when \(\sum_{u=1}^{(s-3)/2}(\cdot)\) does not vanish. Recall that \(c_{kr}^{(L)} = c_{kr}^{(L_k)}\) depends on \(\mathbf t\) through \((\mathbf c_r^{(L)} - \boldsymbol\mu - \boldsymbol\Sigma\mathbf t)_k\) (k = 1,…,p; r = 1,…,R), so that \(\partial c_{kr}^{(L)}/\partial t_{i_{s+1}} = -\sigma_{i_{s+1} k}\). Then,
\[
\begin{aligned}
g_{(s+1)-1} ={}& \frac{\partial}{\partial t_{i_{s+1}}}\Big[\sigma_{i_1 k}\cdots\sigma_{i_s k}\Big\{c_{kr}^{(L)(s-1)}\sigma_{kk}^{-(s-1)}
+ \sum_{u=1}^{(s-3)/2}(-1)^u(s-2u)^{:u}c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)}\\
&\quad + (-1)^{(s-1)/2}2^{:(s-3)/2}\sigma_{kk}^{-(s-1)/2}\Big\}\Big]\Big|_{\mathbf t=\mathbf 0}\\
&+ \sigma_{i_1 k}\cdots\sigma_{i_s k}\Big\{c_{kr}^{(L)(s-1)}\sigma_{kk}^{-(s-1)}
+ \sum_{u=1}^{(s-3)/2}(-1)^u(s-2u)^{:u}c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)}\\
&\quad + (-1)^{(s-1)/2}2^{:(s-3)/2}\sigma_{kk}^{-(s-1)/2}\Big\}\,\sigma_{i_{s+1} k}\sigma_{kk}^{-1}c_{kr}^{(L)}\\
={}& \sigma_{i_1 k}\cdots\sigma_{i_{s+1} k}\Big\{-(s-1)c_{kr}^{(L)(s-2)}\sigma_{kk}^{-(s-1)}
- \sum_{u=1}^{(s-3)/2}(-1)^u(s-2u-1)(s-2u)^{:u}c_{kr}^{(L)(s-2u-2)}\sigma_{kk}^{-(s-u-1)}\\
&\quad + c_{kr}^{(L)s}\sigma_{kk}^{-s}
+ \sum_{u=1}^{(s-3)/2}(-1)^u(s-2u)^{:u}c_{kr}^{(L)(s-2u)}\sigma_{kk}^{-(s-u)}
+ (-1)^{(s-1)/2}2^{:(s-3)/2}c_{kr}^{(L)}\sigma_{kk}^{-(s+1)/2}\Big\}\\
={}& \sigma_{i_1 k}\cdots\sigma_{i_{s+1} k}\Big[c_{kr}^{(L)s}\sigma_{kk}^{-s}
+ (-1)^1\{(s-1) + (s-2)^{:}\}c_{kr}^{(L)(s-2)}\sigma_{kk}^{-(s-1)}\\
&\quad + \sum_{u=1}^{(s-3)/2}(-1)^{u+1}(s-2u-1)(s-2u)^{:u}c_{kr}^{(L)(s-2u-2)}\sigma_{kk}^{-(s-u-1)}\\
&\quad + \sum_{u=2}^{(s-3)/2}(-1)^u(s-2u)^{:u}c_{kr}^{(L)(s-2u)}\sigma_{kk}^{-(s-u)}
+ (-1)^{(s-1)/2}2^{:(s-3)/2}c_{kr}^{(L)}\sigma_{kk}^{-(s+1)/2}\Big]\\
={}& \sigma_{i_1 k}\cdots\sigma_{i_{s+1} k}\Big[c_{kr}^{(L)s}\sigma_{kk}^{-s}
- (s-1)^{:}\,c_{kr}^{(L)(s-2)}\sigma_{kk}^{-(s-1)}\\
&\quad + \sum_{u=2}^{(s-3)/2}(-1)^u\{(s-2u+1)(s-2u+2)^{:u-1} + (s-2u)^{:u}\}c_{kr}^{(L)(s-2u)}\sigma_{kk}^{-(s-u)}\\
&\quad + (-1)^{\{(s-3)/2\}+1}\,2\cdot 3^{:(s-3)/2}\,c_{kr}^{(L)}\sigma_{kk}^{-(s+1)/2}
+ (-1)^{(s-1)/2}\,2^{:(s-3)/2}\,c_{kr}^{(L)}\sigma_{kk}^{-(s+1)/2}\Big]\\
={}& \sigma_{i_1 k}\cdots\sigma_{i_{s+1} k}\left\{c_{kr}^{(L)s}\sigma_{kk}^{-s}
+ \sum_{u=1}^{(s-1)/2}(-1)^u(s-2u+1)^{:u}\,c_{kr}^{(L)(s-2u)}\sigma_{kk}^{-(s-u)}\right\},
\end{aligned}
\]
where the following are used:
\[
(s-2u+1)(s-2u+2)^{:u-1} + (s-2u)^{:u} = (s-2u+1)^{:u},
\]
a special case of (i−1)^{:j} + i(i+1)^{:j−1} = i^{:j}, and
\[
2^{:(s-3)/2} + 2\cdot 3^{:(s-3)/2} = 1\cdot 2^{:(s-3)/2} + 2\cdot 3^{:(s-3)/2} = 2^{:(s-1)/2} = (s-2u+1)^{:u}\big|_{u=(s-1)/2},
\]
a special case of i^{:j} = 1·2^{:j−1} + 2·3^{:j−1} + ⋯ + (i−1)i^{:j−1} + i(i+1)^{:j−1} when i = 2 and j = (s−1)/2. On the right-hand side of the final equation, when odd s is replaced by even \(\bar s = s+1\), we obtain
\[
g_{\bar s - 1} = \sigma_{i_1 k}\cdots\sigma_{i_{\bar s} k}\left\{c_{kr}^{(L)(\bar s-1)}\sigma_{kk}^{-(\bar s-1)}
+ \sum_{u=1}^{(\bar s-2)/2}(-1)^u(\bar s-2u)^{:u}\,c_{kr}^{(L)(\bar s-2u-1)}\sigma_{kk}^{-(\bar s-u-1)}\right\},
\]
giving the required result of the even case.

(ii) Even s: Assume that the result holds for an even s with s ≥ 6, where 6 is the minimum integer among the cases when \(\sum_{u=2}^{(s-2)/2}(\cdot)\) does not vanish. As in the case of odd s, we have
\[
\begin{aligned}
g_{(s+1)-1} ={}& \frac{\partial}{\partial t_{i_{s+1}}}\Big[\sigma_{i_1 k}\cdots\sigma_{i_s k}\Big\{c_{kr}^{(L)(s-1)}\sigma_{kk}^{-(s-1)}
+ \sum_{u=1}^{(s-2)/2}(-1)^u(s-2u)^{:u}c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)}\Big\}\Big]\Big|_{\mathbf t=\mathbf 0}\\
&+ \sigma_{i_1 k}\cdots\sigma_{i_s k}\Big\{c_{kr}^{(L)(s-1)}\sigma_{kk}^{-(s-1)}
+ \sum_{u=1}^{(s-2)/2}(-1)^u(s-2u)^{:u}c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)}\Big\}\,\sigma_{i_{s+1} k}\sigma_{kk}^{-1}c_{kr}^{(L)}\\
={}& \sigma_{i_1 k}\cdots\sigma_{i_{s+1} k}\Big[c_{kr}^{(L)s}\sigma_{kk}^{-s}
+ (-1)^1\{(s-1)+(s-2)^{:}\}c_{kr}^{(L)(s-2)}\sigma_{kk}^{-(s-1)}\\
&\quad + \sum_{u=1}^{(s-2)/2}(-1)^{u+1}(s-2u-1)(s-2u)^{:u}c_{kr}^{(L)(s-2u-2)}\sigma_{kk}^{-(s-u-1)}\\
&\quad + \sum_{u=2}^{(s-2)/2}(-1)^u(s-2u)^{:u}c_{kr}^{(L)(s-2u)}\sigma_{kk}^{-(s-u)}\Big]\\
={}& \sigma_{i_1 k}\cdots\sigma_{i_{s+1} k}\Big[c_{kr}^{(L)s}\sigma_{kk}^{-s}
- (s-1)^{:}\,c_{kr}^{(L)(s-2)}\sigma_{kk}^{-(s-1)}\\
&\quad + \sum_{u=2}^{(s-2)/2}(-1)^u\{(s-2u+1)(s-2u+2)^{:u-1} + (s-2u)^{:u}\}c_{kr}^{(L)(s-2u)}\sigma_{kk}^{-(s-u)}\\
&\quad + (-1)^{\{(s-2)/2\}+1}\,2^{:(s-2)/2}\,\sigma_{kk}^{-s/2}\Big]\\
={}& \sigma_{i_1 k}\cdots\sigma_{i_{s+1} k}\Big\{c_{kr}^{(L)s}\sigma_{kk}^{-s}
+ \sum_{u=1}^{(s-2)/2}(-1)^u(s-2u+1)^{:u}\,c_{kr}^{(L)(s-2u)}\sigma_{kk}^{-(s-u)}
+ (-1)^{s/2}\,2^{:(s-2)/2}\,\sigma_{kk}^{-s/2}\Big\}.
\end{aligned}
\]
When even s is replaced by odd \(\bar s = s+1\), the last result becomes
\[
\sigma_{i_1 k}\cdots\sigma_{i_{\bar s} k}\Big\{c_{kr}^{(L)(\bar s-1)}\sigma_{kk}^{-(\bar s-1)}
+ \sum_{u=1}^{(\bar s-3)/2}(-1)^u(\bar s-2u)^{:u}\,c_{kr}^{(L)(\bar s-2u-1)}\sigma_{kk}^{-(\bar s-u-1)}
+ (-1)^{(\bar s-1)/2}\,2^{:(\bar s-3)/2}\,\sigma_{kk}^{-(\bar s-1)/2}\Big\},
\]
which shows the required result.

(iii) When s is even, the result is easily obtained. When s is odd, the last term in g_{s-1}/(σ_{i_1 k}···σ_{i_s k}) of (i) can be replaced by
\[
(-1)^{(s-1)/2}\,2^{:(s-3)/2}\,\sigma_{kk}^{-(s-1)/2} = (-1)^{(s-1)/2}\,1^{:(s-1)/2}\,\sigma_{kk}^{-(s-1)/2}
\]
using 1·2^{:j} = 2^{:j} = 1^{:j+1}, giving the required result. Q.E.D.
Note that the order of the polynomial g_{s-1}/(σ_{i_1 k}···σ_{i_s k}) in terms of c_{kr}^{(L)} is s − 1. When σ_{ii} = 1 (i = 1,…,p), g_{s-1}/(σ_{i_1 k}···σ_{i_s k}) becomes the Hermite polynomial of order s − 1 irrespective of the parity of s. Using Theorem 1.6, g_{s-1} (s = 1,…,9) are rewritten as
\[
\begin{aligned}
g_{1-1} &= \sigma_{i_1 k},\\
g_{2-1} &= \sigma_{i_1 k}\sigma_{i_2 k}\,(c_{kr}^{(L)}\sigma_{kk}^{-1}),\\
g_{3-1} &= \sigma_{i_1 k}\sigma_{i_2 k}\sigma_{i_3 k}\,(c_{kr}^{(L)2}\sigma_{kk}^{-2} - 1^{:1}\,\sigma_{kk}^{-1}),\\
g_{4-1} &= \sigma_{i_1 k}\cdots\sigma_{i_4 k}\,(c_{kr}^{(L)3}\sigma_{kk}^{-3} - 2^{:1}\,c_{kr}^{(L)}\sigma_{kk}^{-2}),\\
g_{5-1} &= \sigma_{i_1 k}\cdots\sigma_{i_5 k}\,(c_{kr}^{(L)4}\sigma_{kk}^{-4} - 3^{:1}\,c_{kr}^{(L)2}\sigma_{kk}^{-3} + 1^{:2}\,\sigma_{kk}^{-2}),\\
g_{6-1} &= \sigma_{i_1 k}\cdots\sigma_{i_6 k}\,(c_{kr}^{(L)5}\sigma_{kk}^{-5} - 4^{:1}\,c_{kr}^{(L)3}\sigma_{kk}^{-4} + 2^{:2}\,c_{kr}^{(L)}\sigma_{kk}^{-3}),\\
g_{7-1} &= \sigma_{i_1 k}\cdots\sigma_{i_7 k}\,(c_{kr}^{(L)6}\sigma_{kk}^{-6} - 5^{:1}\,c_{kr}^{(L)4}\sigma_{kk}^{-5} + 3^{:2}\,c_{kr}^{(L)2}\sigma_{kk}^{-4} - 1^{:3}\,\sigma_{kk}^{-3}),\\
g_{8-1} &= \sigma_{i_1 k}\cdots\sigma_{i_8 k}\,(c_{kr}^{(L)7}\sigma_{kk}^{-7} - 6^{:1}\,c_{kr}^{(L)5}\sigma_{kk}^{-6} + 4^{:2}\,c_{kr}^{(L)3}\sigma_{kk}^{-5} - 2^{:3}\,c_{kr}^{(L)}\sigma_{kk}^{-4}),\\
g_{9-1} &= \sigma_{i_1 k}\cdots\sigma_{i_9 k}\,(c_{kr}^{(L)8}\sigma_{kk}^{-8} - 7^{:1}\,c_{kr}^{(L)6}\sigma_{kk}^{-7} + 5^{:2}\,c_{kr}^{(L)4}\sigma_{kk}^{-6} - 3^{:3}\,c_{kr}^{(L)2}\sigma_{kk}^{-5} + 1^{:4}\,\sigma_{kk}^{-4})\\
&\qquad (i_1, \ldots, i_9,\ k = 1, \ldots, p;\ L = 0, 1;\ r = 1, \ldots, R),
\end{aligned}
\]
which give pleasingly regularized results. When s is odd, the absolute value of the last term in g_{s-1}/(σ_{i_1 k}···σ_{i_s k}) is
\[
1^{:(s-1)/2} = 2^{:(s-3)/2} = \frac{(s-2)!}{(s-3)!!\,1!} = \frac{(s-2)!!\,(s-3)!!}{(s-3)!!} = (s-2)!!,
\]
which becomes 1, 3, 15 and 105 when s = 3, 5, 7 and 9, respectively. It is well known that (s−2)!! is the (s−1)-th order central moment of an untruncated standard normal, which is equal to the number of distinct cases of choosing (s−1)/2 pairs from s − 1 members. Similar combinatorial interpretations for other coefficients of the Hermite polynomials are available.
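With σ_{kk} = 1, Theorem 1.6(iii) should therefore reproduce the Hermite coefficients exactly. The following Python sketch (illustrative; function names are ours) confirms this for s up to 12:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def psnn(i, j):
    # recursive psnn of Definition 1.2
    return 1 if j == 0 else sum(m * psnn(m + 1, j - 1) for m in range(1, i + 1))

def g_coeffs(s):
    """Coefficients (ascending powers of c) of g_{s-1}/(sigma...sigma) at
    sigma_kk = 1, from Theorem 1.6(iii): sum_u (-1)^u (s-2u)^{:u} c^{s-2u-1}."""
    coef = [0] * s
    for u in range((s - 1) // 2 + 1):
        coef[s - 2 * u - 1] += (-1) ** u * psnn(s - 2 * u, u)
    return coef

def hermite(n):
    # probabilists' Hermite polynomials, ascending coefficient lists
    hs = [[1], [0, 1]]
    for m in range(1, n):
        xh = [0] + hs[m]
        mh = [m * c for c in hs[m - 1]] + [0] * 2
        hs.append([a - b for a, b in zip(xh, mh[: len(xh)])])
    return hs

H = hermite(12)
for s in range(1, 13):
    assert g_coeffs(s) == H[s - 1] + [0] * (s - len(H[s - 1]))
```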
References

1. Aitken AC (1934) Note on selection from a multivariate normal population. Proc Edinb Math Soc 4:106–110
2. Arismendi JC (2013) Multivariate truncated moments. J Multivar Anal 117:41–75
3. Arismendi JC, Broda S (2017) Multivariate elliptical truncated moments. J Multivar Anal 157:29–44
4. Arnold BC, Beaver RJ, Groeneveld RA, Meeker WQ (1993) The nontruncated marginal of a truncated bivariate normal distribution. Psychometrika 58:471–488
5. Artzner P, Delbaen F, Eber JM, Heath D (1999) Coherent measures of risk. Math Financ 9:203–228
6. Azzalini A, Genz A, Miller A, Wichura MJ, Hill GW, Ge Y (2020) The multivariate normal and t distributions, and their truncated versions. R package version 2.0.2. https://CRAN.R-project.org/package=mnormt
7. Barr DR, Sherill ET (1999) Mean and variance of truncated normal distributions. Am Stat 53:357–361
8. Birnbaum ZW (1950) Effect of linear truncation on a multinormal population. Ann Math Stat 21:272–279
9. Birnbaum ZW, Paulson E, Andrews FC (1950) On the effect of selection performed on some coordinates of a multi-dimensional population. Psychometrika 15:191–204
10. Burkardt J (2014) The truncated normal distribution, pp 1–35. http://people.sc.fsu.edu/jburkardt/presentations/truncatednormal.pdf
11. Cochran WG (1951) Improvement by means of selection. In: Neyman J (ed) Proceedings of the second Berkeley symposium on mathematical statistics and probability. University of California Press, Berkeley, CA, pp 449–470
12. Dunkl CF, Xu Y (2001) Orthogonal polynomials of several variables. Cambridge University Press, Cambridge
13. Fisher RA (1931) Introduction to mathematical tables, vol 1. British Association for the Advancement of Science, pp xxvi–xxxv. Reprinted in Fisher RA (1950) Contributions to mathematical statistics, pp 517–526 with the title "The sampling error of estimated deviates, together with other illustrations of the properties and applications of the integrals and derivatives of the normal error function" and the author's note (CMS 23.xxva). Wiley, New York. https://digital.library.adelaide.edu.au/dspace/handle/2440/3860
14. Galarza CE, Kan R, Lachos VH (2020) MomTrunc: moments of folded and doubly truncated multivariate distributions. R package version 5.57. https://CRAN.R-project.org/package=MomTrunc
15. Galarza CE, Matos LA, Dey DK, Lachos VH (2022) On moments of folded and truncated multivariate extended skew-normal distributions. J Comput Graph Stat (online published). http://doi.org/10.1080/10618600.2021.2000869
16. Gianola D, Rosa GJM (2015) One hundred years of statistical developments in animal breeding. Annu Rev Anim Biosci 3:19–56
17. Herrendörfer G, Tuchscherer A (1996) Selection and breeding. J Stat Plann Infer 54:307–321
18. Horrace WC (2015) Moments of the truncated normal distribution. J Prod Anal 43:133–138
19. Kamat AR (1953) Incomplete and absolute moments of the multivariate normal distribution with some applications. Biometrika 40:20–34
20. Kamat AR (1958) Hypergeometric expansions for incomplete moments of the bivariate normal distribution. Sankhyā 20:317–320
21. Kan R, Robotti C (2017) On moments of folded and truncated multivariate normal distributions. J Comput Graph Stat 26:930–934
22. Kemp CD, Kemp AW (1965) Some properties of the "Hermite" distribution. Biometrika 52:381–394
23. Landsman Z, Makov U, Shushi T (2016) Multivariate tail conditional expectation for elliptical distributions. Insur Math Econ 70:216–223
24. Landsman Z, Makov U, Shushi T (2018) A multivariate tail covariance measure for elliptical distributions. Insur Math Econ 81:27–35
25. Lawley DN (1943) A note on Karl Pearson's selection formulae. In: Proceedings of the Royal Society of Edinburgh, vol 62 (Section A, Pt. 1), pp 28–30
26. Magnus JR, Neudecker H (1999) Matrix differential calculus with applications in statistics and econometrics, Rev. edn. Wiley, New York
27. Magnus W, Oberhettinger F, Soni RP (1966) Formulas and theorems for the special functions of mathematical physics, 3rd enlarged edn. Springer, Berlin
28. Manjunath BG, Wilhelm S (2012) Moments calculation for the doubly truncated multivariate normal density. arXiv:1206.5387v1 [stat.CO]. 23 June 2012
29. Nabeya S (1951) Absolute moments in 2-dimensional normal distribution. Ann Inst Stat Math 3:2–6
30. Nabeya S (1952) Absolute moments in 3-dimensional normal distribution. Ann Inst Stat Math 4:15–30
31. Ogasawara H (2021) Unified and non-recursive formulas for moments of the normal distribution with stripe truncation. Commun Stat Theor Methods (online published). http://doi.org/10.1080/03610926.2020.1867742
32. Ogasawara H (2021) A non-recursive formula for various moments of the multivariate normal distribution with sectional truncation. J Multivar Anal (online published). http://doi.org/10.1016/j.jmva.2021.104729
33. Pearson K (1903) On the influence of natural selection on the variability and correlation of organs. Philos Trans R Soc Lond Ser A 200:1–66
34. Pearson K, Lee A (1908) On the generalized probable error in multiple normal correlation. Biometrika 6:59–68
35. Stuart A, Ord JK (1994) Kendall's advanced theory of statistics: distribution theory, 6th edn, vol 1. Arnold, London
36. Tallis GM (1961) The moment generating function of the truncated multi-normal distribution. J R Stat Soc B 23:223–229
37. Tallis GM (1963) Elliptical and radial truncation in normal populations. Ann Math Stat 34:940–944
38. Tallis GM (1965) Plane truncation in normal populations. J R Stat Soc B 27:301–307
39. Yanai H, Takeuchi K, Takane Y (2011) Projection matrices, generalized inverse matrices, and singular value decomposition. Springer, New York
Chapter 2
Normal Moments Under Stripe Truncation and the Real-Valued Poisson Distribution
2.1 Introduction
Stripe truncation, introduced by Ogasawara [8] under normality, is a univariate case of sectional truncation having zebraic or tigerish truncation patterns. Usual single and double truncations are special cases of stripe truncation. As in sectional truncation, R intervals or stripes for selection are defined by R pairs of selection points:
\[
-\infty \le a_1 < b_1 < \cdots < a_R < b_R \le \infty.
\]
Let X ~ N(μ, σ²). When \(\bigcup_{r=1}^{R}\{a_r \le X < b_r\}\) holds, which is also denoted by X ∈ S with S being the set of the R non-overlapping interval(s), X is selected; otherwise it is truncated. Single truncation has a single selection interval [a_1 = −∞, b_1 < ∞) for upper-tail truncation or [a_1 > −∞, b_1 = ∞) for lower-tail truncation, each with R = 1. Double truncation also has a single selection interval [a_1 > −∞, b_1 < ∞) with both tails being truncated. While the tail areas are typically small ones, they can be majorities. The simplest case of stripe truncation other than single or double truncation is given by inner truncation with two selection intervals or stripes, i.e., R = 2:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 H. Ogasawara, Expository Moments for Pseudo Distributions, Behaviormetrics: Quantitative Approaches to Human Behavior 2, https://doi.org/10.1007/978-981-19-3525-1_2
\[
-\infty = a_1 < b_1 < a_2 < b_2 = \infty,
\]
which is the complementary interval of the corresponding double truncation. Let
\[
\begin{aligned}
\alpha &= \Pr\left(\bigcup_{r=1}^{R}\{a_r \le X < b_r\}\right)
= \sum_{r=1}^{R}\int_{a_r}^{b_r}\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}dx\\
&= \sum_{r=1}^{R}\int_{a_r}^{b_r}\phi_1(x\mid\mu,\sigma^2)\,dx
= \int_{\mathbf a}^{\mathbf b}\phi_1(x\mid\mu,\sigma^2)\,dx\\
&= \sum_{r=1}^{R}\int_{(a_r-\mu)/\sigma}^{(b_r-\mu)/\sigma}\phi_1(x\mid 0,1)\,dx
= \int_{\mathbf a^*}^{\mathbf b^*}\phi(x)\,dx,
\end{aligned}
\]
where \(\mathbf a = (a_1,\ldots,a_R)^{\mathrm T}\), \(\mathbf b = (b_1,\ldots,b_R)^{\mathrm T}\), \(\mathbf a^* = (a_1^*,\ldots,a_R^*)^{\mathrm T}\) and \(\mathbf b^* = (b_1^*,\ldots,b_R^*)^{\mathrm T}\) with \(a_r^* = (a_r-\mu)/\sigma\) and \(b_r^* = (b_r-\mu)/\sigma\) (r = 1,…,R). Then the probability density function (pdf) of X = x is
\[
\phi_1(x\mid\mu,\sigma^2,\mathbf a,\mathbf b) = \phi^{(\alpha)}(x\mid\mu,\sigma^2) = \phi(x\mid\mu,\sigma^2)/\alpha \quad (x \in S).
\]
When X has the above pdf, the distribution under stripe truncation is denoted by X ~ N(μ, σ²; a, b).
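Under the standardizations above, α and the stripe-truncated pdf are straightforward to compute. The following Python sketch (function names are ours; the R = 2 inner-truncation case introduced above is used as the example) illustrates this:

```python
import math

def Phi(x):
    # standard normal cdf via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def alpha(mu, sigma, a, b):
    """Selection probability for stripe truncation with intervals [a_r, b_r)."""
    return sum(Phi((br - mu) / sigma) - Phi((ar - mu) / sigma)
               for ar, br in zip(a, b))

def pdf(x, mu, sigma, a, b):
    """phi^{(alpha)}(x | mu, sigma^2): density of X ~ N(mu, sigma^2; a, b)."""
    if not any(ar <= x < br for ar, br in zip(a, b)):
        return 0.0
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma
                                     * alpha(mu, sigma, a, b))

# inner truncation (R = 2): complement of double truncation on (-1, 1)
a, b = [-math.inf, 1.0], [-1.0, math.inf]
assert abs(alpha(0.0, 1.0, a, b) - 2.0 * Phi(-1.0)) < 1e-12
assert pdf(0.0, 0.0, 1.0, a, b) == 0.0      # x = 0 lies in the truncated gap
```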
2.2 Closed Formulas for Moments of Integer-Valued Orders
When k = 0, 1,…, let
\[
I_k^{(r)} = I_k(a_r^*, b_r^*) = \int_{a_r^*}^{b_r^*} x^k \phi(x)\,dx
= \int_{0}^{b_r^*} x^k \phi(x)\,dx - \int_{0}^{a_r^*} x^k \phi(x)\,dx
= \sum_{L=0}^{1}(-1)^L \int_{0}^{c_r^{(L)}} x^k \phi(x)\,dx,
\]
where \(c_r^{(L)} = a_r^{*L}\, b_r^{*(1-L)}\) with 0⁰ = 1; and when \(c_r^{(L)}\) is negative, \(\int_0^{c^{(L)}}(\cdot)\,dx = -\int_{c^{(L)}}^{0}(\cdot)\,dx\) (r = 1,…,R) is used. Then, Ogasawara [8] gave the following result
using the above notation:

Lemma 2.1 (Ogasawara [8, Lemma 1])
\[
I_k^{(r)} = 2^{k/2}\,\Gamma\!\left(\frac{k+1}{2}\right)\frac{1}{2\sqrt{\pi}}
\sum_{L=0}^{1}(-1)^L\left[\mathrm{sign}\big(c_r^{(L)}\big)\,1\{k=2v\} + 1\{k=2v-1\}\right]
F_C\!\left(\frac{c_r^{(L)2}}{2}\,\middle|\,\frac{k+1}{2}\right)
\quad (r = 1, \ldots, R;\ k, v = 0, 1, \ldots), \tag{2.1}
\]
where Γ(·) is the gamma function; sign(x) = 1 when x ≥ 0 and sign(x) = −1 when x < 0; 1{·} is the indicator function; and F_C(x|a) is the cumulative distribution function (cdf) of the gamma distribution at x when the shape parameter is a with the unit scale parameter.

Proof The results are given by cases when k is (i) even or (ii) odd.

(i) Even k (= 2v; v = 0, 1,…): Noting that x^kφ(x) = x^{2v}φ(x) is a non-negative even function with the property
\[
\int_0^{c} x^{2v}\phi(x)\,dx = \mathrm{sign}(c)\int_0^{|c|} x^{2v}\phi(x)\,dx \quad (-\infty \le c \le \infty)
\]
and using the variable transformation y = x²/2 with dx/dy = 1/√(2y), we have
\[
\begin{aligned}
I_k^{(r)} = I_{2v}^{(r)}
&= \mathrm{sign}(b_r^*)\int_0^{|b_r^*|} x^{2v}\phi(x)\,dx - \mathrm{sign}(a_r^*)\int_0^{|a_r^*|} x^{2v}\phi(x)\,dx\\
&= \sum_{L=0}^{1}(-1)^L\,\mathrm{sign}\big(c_r^{(L)}\big)\int_0^{c_r^{(L)2}/2}\frac{(2y)^{k/2}e^{-y}}{\sqrt{2\pi}\,\sqrt{2y}}\,dy\\
&= \sum_{L=0}^{1}(-1)^L\,\frac{2^{(k/2)-1}}{\sqrt{\pi}}\,\mathrm{sign}\big(c_r^{(L)}\big)\int_0^{c_r^{(L)2}/2} y^{(k-1)/2}e^{-y}\,dy\\
&= \sum_{L=0}^{1}(-1)^L\,\frac{2^{(k/2)-1}}{\sqrt{\pi}}\,\mathrm{sign}\big(c_r^{(L)}\big)\,\gamma\!\left(\frac{c_r^{(L)2}}{2}\,\middle|\,\frac{k+1}{2}\right)\\
&= \sum_{L=0}^{1}(-1)^L\,\frac{2^{k/2}}{2\sqrt{\pi}}\,\Gamma\!\left(\frac{k+1}{2}\right)\mathrm{sign}\big(c_r^{(L)}\big)\,F_C\!\left(\frac{c_r^{(L)2}}{2}\,\middle|\,\frac{k+1}{2}\right),
\end{aligned}
\]
where γ(x|a) is the lower incomplete gamma function at x when the shape parameter is a with the unit scale parameter in the corresponding gamma distribution.

(ii) Odd k (= 2v − 1; v = 1, 2,…): Note that x^kφ(x) is an odd function when k is odd. Then, when c < 0,
\[
\int_0^{c} x^k\phi(x)\,dx = -\int_{c}^{0} x^k\phi(x)\,dx = \int_0^{|c|} x^k\phi(x)\,dx,
\]
while when c ≥ 0,
\[
\int_0^{c} x^k\phi(x)\,dx = \int_0^{|c|} x^k\phi(x)\,dx.
\]
That is, \(\int_0^{c} x^k\phi(x)\,dx = \int_0^{|c|} x^k\phi(x)\,dx\) holds irrespective of the sign of c. Consequently, when k is odd, the result of (i) can be similarly used simply by omitting sign(b_r^*) and sign(a_r^*). Then, we have
\[
I_k^{(r)} = I_{2v-1}^{(r)} = \int_0^{|b_r^*|} x^{2v-1}\phi(x)\,dx - \int_0^{|a_r^*|} x^{2v-1}\phi(x)\,dx
= \sum_{L=0}^{1}(-1)^L\,\frac{2^{k/2}}{2\sqrt{\pi}}\,\Gamma\!\left(\frac{k+1}{2}\right)F_C\!\left(\frac{c_r^{(L)2}}{2}\,\middle|\,\frac{k+1}{2}\right).
\]
The combined results of (i) and (ii) give (2.1). Q.E.D.

It is known that in (2.1), \(2^{k/2}\Gamma\{(k+1)/2\}/\sqrt{\pi} = \mathrm{E}(|X|^k)\) when untruncated X ~ N(0, 1) (k = 0, 1,…) (see Winkelbauer [11, Eq. (18)]; Pollack and Shauly-Aharonov [9, Eq. (1)]). When k = 2v (v = 0, 1,…),
\[
2^{k/2}\,\Gamma\!\left(\frac{k+1}{2}\right)\frac{1}{\sqrt{\pi}}
= \frac{2^v}{\sqrt{\pi}}\,\Gamma\!\left(v+\frac{1}{2}\right)
= \frac{2^v}{\sqrt{\pi}}\cdot\frac{2v-1}{2}\cdot\frac{2v-3}{2}\cdots\frac{3}{2}\cdot\frac{1}{2}\,\sqrt{\pi}
= (2v-1)\cdots 3\cdot 1 = (2v-1)!! = (k-1)!!,
\]
which is well known as the k-th order non-vanishing central moment of the standard normal. When k = 2v − 1 (v = 1, 2,…),
\[
2^{k/2}\,\Gamma\!\left(\frac{k+1}{2}\right)\frac{1}{\sqrt{\pi}}
= 2^{v-(1/2)}\,\Gamma(v)\,\frac{1}{\sqrt{\pi}}
= 2^{v-(1/2)}(v-1)!\,\frac{1}{\sqrt{\pi}}
= (2v-2)!!\,\sqrt{2/\pi} = (k-1)!!\,\sqrt{2/\pi}.
\]
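The closed form (2.1) can be checked against direct numerical integration. In the Python sketch below (the series implementation of the regularized gamma cdf and all function names are ours), F_C(x|a) is evaluated by the standard power series of the lower incomplete gamma function:

```python
import math

def gamma_cdf(x, a):
    """Regularized lower incomplete gamma P(a, x) = F_C(x | a), via its series
    P(a, x) = x^a e^{-x} / Gamma(a) * sum_n x^n / {a (a+1) ... (a+n)}."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= x / (a + n)
        total += term
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def I_k(k, a_star, b_star):
    """Lemma 2.1: I_k^{(r)} = int_{a*}^{b*} x^k phi(x) dx in closed form."""
    const = 2.0 ** (k / 2.0) * math.gamma((k + 1) / 2.0) / (2.0 * math.sqrt(math.pi))
    total = 0.0
    for L, c in ((0, b_star), (1, a_star)):
        w = math.copysign(1.0, c) if k % 2 == 0 else 1.0  # sign(c)1{k even} + 1{k odd}
        total += (-1) ** L * w * gamma_cdf(c * c / 2.0, (k + 1) / 2.0)
    return const * total

def I_k_numeric(k, a_star, b_star, n=20000):
    # composite Simpson integration of x^k phi(x)
    phi = lambda x: math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    h = (b_star - a_star) / n
    s = sum((1 if i in (0, n) else 4 if i % 2 else 2)
            * (a_star + i * h) ** k * phi(a_star + i * h) for i in range(n + 1))
    return s * h / 3.0

for k in range(0, 8):
    assert abs(I_k(k, -0.7, 1.8) - I_k_numeric(k, -0.7, 1.8)) < 1e-8
```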
While (k−1)!! holds for even k, (k−1)!!√(2/π) for odd k is √(2/π) ≈ 0.8 = E(|X|) times (k−1)!!, where E(|X|) is equal to the mean of the chi-distributed variable with one degree of freedom. Note that for even k, (k−1)!! = (k−1)!!E(X²) with E(X²) = 1. Using Lemma 2.1, Ogasawara [8] gave the following result.

Theorem 2.1 (Ogasawara [8, Theorem 1]) Let d be an arbitrary reference point for the deviation of X ~ N(μ, σ²; a, b). Then, we have
\[
\mathrm{E}\{(X-d)^s\} = \sigma^s \sum_{k=0}^{s} {}_s C_k\,\alpha^{-1}\left(\sum_{r=1}^{R} I_k^{(r)}\right)(-d^*)^{s-k} \quad (s = 0, 1, \ldots),
\]
where \({}_s C_k = \dfrac{s!}{(s-k)!\,k!}\) (k = 0,…,s); d* = (d − μ)/σ; (−d*)⁰ = 1 when d* = 0; and \({}_0 C_0 = 1\).
Proof Note that the moment of X ~ N(μ, σ²; a, b) is given by
\[
\mathrm{E}\{(X-d)^s \mid X \in S\} = \int_{\mathbf a}^{\mathbf b}(x-d)^s\,\phi^{(\alpha)}(x\mid\mu,\sigma^2)\,dx.
\]
Then, we have
\[
\begin{aligned}
\mathrm{E}\{(X-d)^s \mid X \in S\}
&= \alpha^{-1}\sum_{r=1}^{R}\mathrm{E}\{(X-d)^s;\ a_r \le X < b_r\}\\
&= \alpha^{-1}\sum_{r=1}^{R}\mathrm{E}\left[\{X-\mu-(d-\mu)\}^s;\ a_r \le X < b_r\right]\\
&= \sigma^s\,\alpha^{-1}\sum_{r=1}^{R}\sum_{k=0}^{s}{}_s C_k\,\mathrm{E}\!\left[\left(\frac{X-\mu}{\sigma}\right)^k;\ a_r \le X < b_r\right]\left\{-\frac{d-\mu}{\sigma}\right\}^{s-k}\\
&= \sigma^s\sum_{k=0}^{s}{}_s C_k\,\alpha^{-1}\sum_{r=1}^{R}\mathrm{E}(\tilde X^k;\ a_r^* \le \tilde X < b_r^*)\,(-d^*)^{s-k}\\
&= \sigma^s\sum_{k=0}^{s}{}_s C_k\,\alpha^{-1}\left(\sum_{r=1}^{R} I_k^{(r)}\right)(-d^*)^{s-k},
\end{aligned}
\]
where \(\tilde X = (X-\mu)/\sigma\) and E(Y; A) = E(Y·1{A}) denotes the partial (unnormalized) expectation over the event A, which gives the required result. Q.E.D.
ðxÞ Ik
¼2
k=2
2 kþ1 1 1 x k þ 1 pffiffiffi FC C 2 2 2 p2
½signð xÞ1fk ¼ 2vg þ 1fk ¼ 2v 1g ðk; v ¼ 0; 1; . . .Þ: ðrÞ
ðb Þ
ðar Þ
Note that I k ¼ I k r I k orders for stripe-truncated X.
. Then, we obtain the absolute moments of odd
Theorem 2.2 (Ogasawara [8, Theorem 2]) Let d be as in Theorem 2.1. Then, for X Nðl; r2 ; a; bÞ and odd s = 1, 3,…, EðjX d js jX 2 SÞ ¼ rs
s X k¼0
1 s Ck a
R h X r¼1
ðrÞ I k sign br d
sk
ða Þ
ðdÞ þ 2 Ik r Ik : 1 ar d\br d
Proof For X ~ N(μ, σ²; a, b) and arbitrary d, we have
\[
\begin{aligned}
\mathrm{E}(|X-d|^s \mid X \in S)
&= \alpha^{-1}\sum_{r=1}^{R}\int_{(a_r-\mu)/\sigma}^{(b_r-\mu)/\sigma}|\sigma x + \mu - d|^s\,\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right)dx\\
&= \sigma^s\,\alpha^{-1}\sum_{r=1}^{R}\int_{a_r^*}^{b_r^*}|x - d^*|^s\,\phi(x)\,dx\\
&= \sigma^s\,\alpha^{-1}\sum_{r=1}^{R}\Bigg[\int_{a_r^*}^{b_r^*}(x-d^*)^s\phi(x)\,dx\,1\{d^* < a_r^*\}
- \int_{a_r^*}^{b_r^*}(x-d^*)^s\phi(x)\,dx\,1\{b_r^* \le d^*\}\\
&\qquad + \left\{\int_{a_r^*}^{b_r^*}(x-d^*)^s\phi(x)\,dx - 2\int_{a_r^*}^{d^*}(x-d^*)^s\phi(x)\,dx\right\}1\{a_r^* \le d^* < b_r^*\}\Bigg]\\
&= \sigma^s\,\alpha^{-1}\sum_{r=1}^{R}\Bigg[\int_{a_r^*}^{b_r^*}(x-d^*)^s\phi(x)\,dx\,\mathrm{sign}(b_r^* - d^*)
- 2\int_{a_r^*}^{d^*}(x-d^*)^s\phi(x)\,dx\,1\{a_r^* \le d^* < b_r^*\}\Bigg]\\
&= \sigma^s\,\alpha^{-1}\sum_{r=1}^{R}\sum_{k=0}^{s}{}_s C_k\Bigg[\int_{a_r^*}^{b_r^*}x^k\phi(x)\,dx\,(-d^*)^{s-k}\,\mathrm{sign}(b_r^* - d^*)\\
&\qquad - 2\int_{a_r^*}^{d^*}x^k(-d^*)^{s-k}\phi(x)\,dx\,1\{a_r^* \le d^* < b_r^*\}\Bigg]\\
&= \sigma^s\sum_{k=0}^{s}{}_s C_k\,\alpha^{-1}\sum_{r=1}^{R}\Big[I_k^{(r)}\,\mathrm{sign}(b_r^* - d^*)
+ 2\big(I_k^{(a_r^*)} - I_k^{(d^*)}\big)1\{a_r^* \le d^* < b_r^*\}\Big](-d^*)^{s-k}\\
&\qquad (s = 1, 3, \ldots).
\end{aligned}
\]
Q.E.D. Note that for even s, the absolute moments are given by Theorem 2.1.
2.3 Series Expressions of I_k^{(r)} (k = 0, 1,…; r = 1,…,R) for Moments of Integer-Valued Orders

The function \(I_k^{(r)} = I_k^{(b_r^*)} - I_k^{(a_r^*)}\) (k = 0, 1,…; r = 1,…,R) derived in Lemma 2.1 plays an important role for the integer-valued moments of X ~ N(μ, σ²; a, b), as used in Theorems 2.1 and 2.2. The following series expressions of \(I_k^{(r)}\), useful for seeing its properties, are available:

Lemma 2.2 (Ogasawara [8, Eqs. 5.2 and 5.3]) \(I_k^{(r)} = I_k^{(b_r^*)} - I_k^{(a_r^*)}\) (k = 0, 1,…; r = 1,…,R) are given by cases:

(i) Even k (k = 0, 2,…; r = 1,…,R):
\[
I_k^{(r)} = \sum_{u=1}^{k/2}\frac{(k-1)!!}{(2u-1)!!}\left\{a_r^{*2u-1}\phi(a_r^*) - b_r^{*2u-1}\phi(b_r^*)\right\} + (k-1)!!\,\alpha_r,
\]
where when k = 0,
\[
I_0^{(r)} = (-1)!!\,\alpha_r, \qquad \alpha_r = \int_{a_r^*}^{b_r^*}\phi(x)\,dx,
\]
with the definition (−1)!! ≡ 1.

(ii) Odd k (k = 1, 3,…; r = 1,…,R):
\[
I_k^{(r)} = \sum_{u=0}^{(k-1)/2}\frac{(k-1)!!}{(2u)!!}\left\{a_r^{*2u}\phi(a_r^*) - b_r^{*2u}\phi(b_r^*)\right\}.
\]

Proof Integration by parts gives
\[
\begin{aligned}
I_k^{(r)} &= \int_{a_r^*}^{b_r^*} x^k\phi(x)\,dx = \int_{a_r^*}^{b_r^*} x^{k-1}\,x\phi(x)\,dx
= \left[-x^{k-1}\phi(x)\right]_{a_r^*}^{b_r^*} + (k-1)\int_{a_r^*}^{b_r^*} x^{k-2}\phi(x)\,dx\\
&= a_r^{*k-1}\phi(a_r^*) - b_r^{*k-1}\phi(b_r^*) + (k-1)\,I_{k-2}^{(r)}
\end{aligned}
\]
with \(I_0^{(r)} = \int_{a_r^*}^{b_r^*}\phi(x)\,dx = \alpha_r\) and \(I_1^{(r)} = \int_{a_r^*}^{b_r^*} x\phi(x)\,dx = \phi(a_r^*) - \phi(b_r^*)\) (k = 2, 3,…; r = 1,…,R).

(i) Even k (k = 0, 2,…; r = 1,…,R):
\[
\begin{aligned}
I_k^{(r)} &= a_r^{*k-1}\phi(a_r^*) - b_r^{*k-1}\phi(b_r^*)
+ (k-1)\left\{a_r^{*k-3}\phi(a_r^*) - b_r^{*k-3}\phi(b_r^*)\right\}\\
&\quad + (k-1)(k-3)\left\{a_r^{*k-5}\phi(a_r^*) - b_r^{*k-5}\phi(b_r^*)\right\}\\
&\quad \ \vdots\\
&\quad + (k-1)(k-3)\cdots 5\cdot 3\left\{a_r^{*}\phi(a_r^*) - b_r^{*}\phi(b_r^*)\right\}
+ (k-1)(k-3)\cdots 5\cdot 3\cdot 1\,I_0^{(r)}\\
&= \sum_{u=1}^{k/2}\frac{(k-1)!!}{(2u-1)!!}\left\{a_r^{*2u-1}\phi(a_r^*) - b_r^{*2u-1}\phi(b_r^*)\right\} + (k-1)!!\,\alpha_r.
\end{aligned}
\]
(ii) Odd k (k = 1, 3,…; r = 1,…,R):
\[
\begin{aligned}
I_k^{(r)} &= a_r^{*k-1}\phi(a_r^*) - b_r^{*k-1}\phi(b_r^*)
+ (k-1)\left\{a_r^{*k-3}\phi(a_r^*) - b_r^{*k-3}\phi(b_r^*)\right\}\\
&\quad + (k-1)(k-3)\left\{a_r^{*k-5}\phi(a_r^*) - b_r^{*k-5}\phi(b_r^*)\right\}\\
&\quad \ \vdots\\
&\quad + (k-1)(k-3)\cdots 6\cdot 4\left\{a_r^{*2}\phi(a_r^*) - b_r^{*2}\phi(b_r^*)\right\}
+ (k-1)(k-3)\cdots 6\cdot 4\cdot 2\,I_1^{(r)}\\
&= \sum_{u=1}^{(k-1)/2}\frac{(k-1)!!}{(2u)!!}\left\{a_r^{*2u}\phi(a_r^*) - b_r^{*2u}\phi(b_r^*)\right\}
+ (k-1)!!\left\{\phi(a_r^*) - \phi(b_r^*)\right\}\\
&= \sum_{u=0}^{(k-1)/2}\frac{(k-1)!!}{(2u)!!}\left\{a_r^{*2u}\phi(a_r^*) - b_r^{*2u}\phi(b_r^*)\right\},
\end{aligned}
\]
which give the required results. Q.E.D.

Note that in the even case of Lemma 2.2, the term (k−1)!!α_r is not obtained from \(\sum_{u=1}^{k/2}\frac{(k-1)!!}{(2u-1)!!}\{a_r^{*2u-1}\phi(a_r^*) - b_r^{*2u-1}\phi(b_r^*)\}\) at u = 0, but formally corresponds to u = 1/2. Ogasawara [8, Appendix A.5] gave a second proof of Lemma 2.1 using the above series expressions in Lemma 2.2. The extension of Lemmas 2.1 and 2.2 to the cases when k > −1 is real-valued, for absolute moments, will be given in the next section by introducing the new real-valued Poisson distribution, whose special
cases are the usual Poisson and the half Poisson distributions. The latter distribution was defined by Ogasawara [8].

Definition 2.1 (Ogasawara [8, Definition 3]) The distribution of a variable taking positive half-integers with the probability function
\[
\Pr(X = v + 0.5) = \frac{\lambda^{v+0.5}/\Gamma(v+1.5)}{\sum_{u=0}^{\infty}\lambda^{u+0.5}/\Gamma(u+1.5)} \quad (v = 0, 1, \ldots)
\]
is called the half Poisson distribution with the parameter λ > 0.

Ogasawara [8, Eq. 5.6] showed that when, e.g., λ = a_j^{*2}/2 for the cases of even k in Lemma 2.2,
\[
\Pr(X \le v + 0.5)
= \frac{\left\{\dfrac{\Gamma(\lambda\mid v+1.5)}{\Gamma(v+1.5)} - \dfrac{\Gamma(\lambda\mid 0.5)}{\Gamma(0.5)}\right\}e^{\lambda}}{\sum_{u=0}^{\infty}\lambda^{u+0.5}/\Gamma(u+1.5)}
= \frac{\sum_{u=1}^{v+1}\dfrac{a_j^{*2u-1}}{(2u-1)!!}\sqrt{\dfrac{2}{\pi}}\,\mathrm{sign}(a_j^*)}{\sum_{u=0}^{\infty}(a_j^{*2}/2)^{u+0.5}/\Gamma(u+1.5)}
\quad (v = 0, 1, \ldots),
\]
where Γ(λ|v+1.5) is the upper incomplete gamma function at λ, whose complete counterpart is Γ(0|v+1.5) = Γ(v+1.5).
2.4 The Real-Valued Poisson Distribution for Series Expressions of I_k^{(r)} (k = 0, 1,…; r = 1,…,R) for Absolute Moments

2.4.1 Generalization of the Poisson Distribution

The Poisson distribution is one of the most basic discrete distributions that take non-negative integers. Generalization of the distribution with emphasis on regression has been given by Consul and Jain [4, Eq. (3.1)], Consul [2], Letac and Mora [7, Example D], Consul and Famoye [3, Eq. (2.3)], Zamani and Ismail [12, Eq. (6)], and Chandra, Roy and Ghosh [1, Eq. (2)], mainly to cope with the situations showing overdispersion/underdispersion frequently encountered in practice (for the summarized mean–variance relationships of the above generalized Poisson distributions (GPDs), see Wagh and Kamalja [10, Table 1]). It is known that for the Poisson-distributed variable X with the parameter λ, its cdf at X = k (k = 0, 1,…) is given by
\[
\sum_{v=0}^{k}\lambda^v e^{-\lambda}/v! = 1 - \int_0^{\lambda} x^k e^{-x}\,dx\,/\,\Gamma(k+1)
= \int_{\lambda}^{\infty} x^k e^{-x}\,dx\,/\,\Gamma(k+1)
= \Gamma(\lambda\mid k+1)/\Gamma(k+1) \tag{2.2}
\]
(see, e.g., Johnson et al. [6, p. 372]). The GPD of Chandra et al. [1] uses a relationship similar to (2.2), yielding the following probability function:
\[
\Pr(X = v) =
\begin{cases}
\displaystyle\int_{\lambda}^{\infty} x^{\eta} e^{-x}\,dx\,/\,\Gamma(1+\eta) & (v = 0;\ 0 \le \eta < 1),\\[2mm]
\lambda^{v+\eta} e^{-\lambda}/\Gamma(v+1+\eta) & (v = 1, 2, \ldots;\ 0 \le \eta < 1).
\end{cases} \tag{2.3}
\]
The half Poisson distribution obtained by Ogasawara [8, Definition 3] is another generalization of the Poisson distribution, whose probability function was shown in Definition 2.1. It differs from the GPDs mentioned earlier in that the half Poisson distribution takes positive half-integers rather than the non-negative integers of the GPDs. The half Poisson distribution, associated with the upper incomplete gamma function Γ(λ|k+1), was introduced for the series expansion of the raw partial moments of even orders for the standard normal distribution. In the next subsection, the real-valued Poisson (r-Poisson) distribution will be defined, whose special cases are the usual and half Poisson distributions. Applications will be shown for series expressions of the absolute partial moments of real orders in the standard normal distribution.
2.4.2 The Real-Valued Poisson Distribution

Before presenting the new distribution, we provide the following lemma, which is an extension of (2.2).

Lemma 2.3 For k = 0, 1,…; 0 ≤ λ < ∞; and 0 ≤ η < 1 with 0⁰ ≡ 1, we have
\[
\frac{\Gamma(\lambda\mid k+1+\eta)}{\Gamma(k+1+\eta)} =
\begin{cases}
\displaystyle\sum_{v=0}^{k}\frac{\lambda^{v+\eta}e^{-\lambda}}{\Gamma(v+1+\eta)} + \frac{\Gamma(\lambda\mid\eta)}{\Gamma(\eta)} & (0 < \eta < 1),\\[3mm]
\displaystyle\sum_{v=0}^{k}\frac{\lambda^{v}e^{-\lambda}}{\Gamma(v+1)} & (\eta = 0).
\end{cases} \tag{2.4}
\]
Proof When $0<g<1$, we obtain

$$\begin{aligned}\frac{\Gamma(\lambda\,|\,k+1+g)}{\Gamma(k+1+g)}&=\int_{\lambda}^{\infty}x^{k+g}e^{-x}\,dx\Big/\Gamma(k+1+g)\\ &=\frac{1}{\Gamma(k+1+g)}\left\{\left[-x^{k+g}e^{-x}\right]_{\lambda}^{\infty}+(k+g)\int_{\lambda}^{\infty}x^{k-1+g}e^{-x}\,dx\right\}\\ &=\frac{\lambda^{k+g}e^{-\lambda}}{\Gamma(k+1+g)}+\frac{1}{\Gamma(k+g)}\int_{\lambda}^{\infty}x^{k-1+g}e^{-x}\,dx\\ &=\frac{\lambda^{k+g}e^{-\lambda}}{\Gamma(k+1+g)}+\frac{\lambda^{k-1+g}e^{-\lambda}}{\Gamma(k+g)}+\cdots+\frac{\lambda^{g}e^{-\lambda}}{\Gamma(1+g)}+\frac{1}{\Gamma(g)}\int_{\lambda}^{\infty}x^{g-1}e^{-x}\,dx\\ &=\sum_{v=0}^{k}\frac{\lambda^{v+g}e^{-\lambda}}{\Gamma(v+1+g)}+\frac{\Gamma(\lambda\,|\,g)}{\Gamma(g)}.\end{aligned}$$

The case of $g=0$ is derived from the above result when $g$ is temporarily relaxed to take 1:

$$\frac{\Gamma(\lambda\,|\,k+2)}{\Gamma(k+2)}=\sum_{v=0}^{k}\frac{\lambda^{v+1}e^{-\lambda}}{\Gamma(v+2)}+\frac{\Gamma(\lambda\,|\,1)}{\Gamma(1)}=\sum_{v=0}^{k}\frac{\lambda^{v+1}e^{-\lambda}}{\Gamma(v+2)}+e^{-\lambda}=\sum_{v=0}^{k+1}\frac{\lambda^{v}e^{-\lambda}}{v!}.$$

Redefining $k+1$ as $k$, we have the second result of (2.4). Alternatively, the second result is obtained by stopping the series at the second-last term, whose remainder is replaced by $\frac{1}{\Gamma(1+g)}\int_{\lambda}^{\infty}x^{g}e^{-x}\,dx=e^{-\lambda}$ when $g=0$. Q.E.D.

In Lemma 2.3, the known second result for $g=0$ is included for convenience. The definition $0^{0}\equiv 1$ is required for the second result, where the left-hand side of (2.4) is well defined as 1 when $\lambda=0$ since $\Gamma(0\,|\,k+1+g)=\Gamma(k+1+g)$, while the first term on the right-hand side of (2.4) when $g=0$ and $v=0$ should be $\lambda^{0}e^{-\lambda}/0!=1$ when $\lambda=0$. When $g=0.5$ and $k=\infty$, (2.4) gives the closed-form normalizer for the half Poisson distribution in Definition 2.1. In the above results and the remainder of this section, the expression, e.g., $k+1+g$ is used rather than $k+g+1$ to indicate that $k+1$ is an integer while $0\le g<1$ is real-valued.
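As a quick numerical sanity check (an illustration added in this edit, not part of the original derivation), the two sides of (2.4) can be compared in Python. The incomplete gamma function is evaluated with the standard power series for $\gamma(x\,|\,a)$, and the parameter values $\lambda$, $g$, $k$ are chosen arbitrarily:

```python
import math

def lower_inc_gamma(x, a):
    # gamma(x | a) = integral_0^x t^(a-1) e^(-t) dt, by the standard power series
    term = 1.0 / a
    s = term
    n = 0
    while term > 1e-16 * s:
        n += 1
        term *= x / (a + n)
        s += term
    return x ** a * math.exp(-x) * s

def upper_ratio(lam, a):
    # Gamma(lam | a) / Gamma(a), the regularized upper incomplete gamma function
    return 1.0 - lower_inc_gamma(lam, a) / math.gamma(a)

lam, g, k = 2.3, 0.4, 5          # arbitrary admissible values
lhs = upper_ratio(lam, k + 1 + g)
rhs = sum(lam ** (v + g) * math.exp(-lam) / math.gamma(v + 1 + g)
          for v in range(k + 1)) + upper_ratio(lam, g)
print(lhs, rhs)
```

The two printed numbers agree to floating-point accuracy, as the identity predicts.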
Definition 2.2 The discrete distribution taking non-negative, equally spaced real values with unit steps is defined as the real-valued Poisson (r-Poisson) distribution when its probability function is given by the following equivalent sets of expressions:

$$\begin{aligned}\Pr(X=v+g\,|\,\lambda,g)&=\frac{\lambda^{v+g}/\Gamma(v+1+g)}{\sum_{u=0}^{\infty}\lambda^{u+g}/\Gamma(u+1+g)}\quad(v=0,1,\ldots;\ \lambda>0;\ 0\le g<1)\\[1ex] &=\begin{cases}\dfrac{\lambda^{v+g}}{\Gamma(v+1+g)\,e^{\lambda}[1-\{\Gamma(\lambda\,|\,g)/\Gamma(g)\}]} & (0<g<1)\\[1.5ex] \lambda^{v}/(e^{\lambda}v!) & (g=0)\end{cases}\\[1ex] &=\begin{cases}\dfrac{\lambda^{v+g}}{e^{\lambda}\{\Gamma(v+1+g)-(g)_{v+1}\Gamma(\lambda\,|\,g)\}} & (0<g<1)\\[1.5ex] \lambda^{v}/(e^{\lambda}v!) & (g=0),\end{cases}\end{aligned}\tag{2.5}$$

where $(g)_{v+1}=g(1+g)\cdots(v+g)=\Gamma(v+1+g)/\Gamma(g)$ is the rising or ascending factorial in the Pochhammer notation.

Proof of the second expression of the normalizer $\Gamma(v+1+g)\,e^{\lambda}[1-\{\Gamma(\lambda\,|\,g)/\Gamma(g)\}]$: Using Lemma 2.3 when $0<g<1$ and $k\to\infty$, the limit $\Gamma(\lambda\,|\,k+1+g)/\Gamma(k+1+g)\to 1$ gives the required normalizer. We prove this limit by the Chebyshev inequality. Let $X^{*}$ be gamma distributed with shape parameter $k^{*}>\lambda$ and unit scale parameter. Then, using $\mathrm{E}(X^{*})=\mathrm{var}(X^{*})=k^{*}$ and the Chebyshev inequality, we have

$$\Pr(X^{*}<\lambda)=1-\frac{\Gamma(\lambda\,|\,k^{*})}{\Gamma(k^{*})}\le\frac{\mathrm{var}(X^{*})}{\{\lambda-\mathrm{E}(X^{*})\}^{2}}=\frac{k^{*}}{(\lambda-k^{*})^{2}}=\frac{k^{*}}{\lambda^{2}-2\lambda k^{*}+k^{*2}}.$$

When $k^{*}$ goes to $\infty$, the last expression approaches 0, which indicates $\Gamma(\lambda\,|\,k^{*})/\Gamma(k^{*})\to 1$. Q.E.D.

In Definition 2.2, the cases of $g=0$ and $g=0.5$ give the usual and half Poisson distributions, respectively. The first expression of (2.5) covers $g=0$ as well as $0<g<1$. The cdf of the r-Poisson distribution is given as follows.

Theorem 2.3 When X follows the r-Poisson distribution with $\lambda>0$ and $0\le g<1$, we have, for $k=0,1,\ldots$,

$$\Pr(X\le k+g\,|\,\lambda,g)=\begin{cases}\left\{\dfrac{\Gamma(\lambda\,|\,k+1+g)}{\Gamma(k+1+g)}-\dfrac{\Gamma(\lambda\,|\,g)}{\Gamma(g)}\right\}\left\{1-\dfrac{\Gamma(\lambda\,|\,g)}{\Gamma(g)}\right\}^{-1} & (0<g<1)\\[2ex] \Gamma(\lambda\,|\,k+1)/\Gamma(k+1) & (g=0).\end{cases}$$
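A small numerical check (added here; the parameter values are arbitrary): the probabilities in (2.5) should sum to one, and their partial sums should reproduce the cdf of Theorem 2.3.

```python
import math

def lower_inc_gamma(x, a):
    # gamma(x | a) by the standard power series
    term = 1.0 / a
    s = term
    n = 0
    while term > 1e-16 * s:
        n += 1
        term *= x / (a + n)
        s += term
    return x ** a * math.exp(-x) * s

def upper_ratio(lam, a):
    # Gamma(lam | a) / Gamma(a)
    return 1.0 - lower_inc_gamma(lam, a) / math.gamma(a)

lam, g = 1.7, 0.3                                   # arbitrary values
ratio = upper_ratio(lam, g)                         # Gamma(lam | g) / Gamma(g)
norm = math.exp(lam) * (1.0 - ratio)                # normalizer in (2.5)
pmf = [lam ** (v + g) / (math.gamma(v + 1 + g) * norm) for v in range(150)]
total = sum(pmf)                                    # should be ~1

k = 3                                               # cdf at k + g
cdf_series = sum(pmf[: k + 1])
cdf_closed = (upper_ratio(lam, k + 1 + g) - ratio) / (1.0 - ratio)
print(total, cdf_series, cdf_closed)
```

The truncation at 150 terms is harmless since the probabilities decay faster than exponentially.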
Proof Using Lemma 2.3, we obtain the required results. Q.E.D.

Remark 2.1 In Theorem 2.3, the second result for the usual Poisson distribution is known, as addressed earlier. The second result also follows from the first result with $\lim_{g\to+0}\{\Gamma(\lambda\,|\,g)/\Gamma(g)\}=0$, though $\Gamma(0)$ is not defined. The zero limit is again proved by the Chebyshev inequality. Let $X^{*}$ be gamma distributed with shape parameter $g$ $(0<g<\lambda)$ and unit scale parameter. Then, we obtain

$$\Pr(X^{*}>\lambda)=\frac{\Gamma(\lambda\,|\,g)}{\Gamma(g)}\le\frac{\mathrm{var}(X^{*})}{\{\lambda-\mathrm{E}(X^{*})\}^{2}}=\frac{g}{(\lambda-g)^{2}}.$$

It is found that $\lim_{g\to+0}\{g/(\lambda-g)^{2}\}=0$ gives $\lim_{g\to+0}\{\Gamma(\lambda\,|\,g)/\Gamma(g)\}=0$.
Theorem 2.4 Define $F_{\Gamma}(\lambda\,|\,g)$ as the cdf of the gamma distribution at $\lambda$ with shape parameter $g$ and unit scale parameter. Then, for the r-Poisson distribution with $\lambda>0$ and $0<g<1$, we have

$$\mathrm{E}(X\,|\,\lambda,g)=\mathrm{E}(X)=\frac{\lambda^{g}}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)\Gamma(g)}+\lambda,$$

$$\mathrm{var}(X)=\frac{\lambda^{g}(g-\lambda)}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)\Gamma(g)}-\left\{\frac{\lambda^{g}}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)\Gamma(g)}\right\}^{2}+\lambda,$$

and the moment generating function

$$M(t)=\exp\{(e^{t}-1)\lambda\}\,\frac{F_{\Gamma}(e^{t}\lambda\,|\,g)}{F_{\Gamma}(\lambda\,|\,g)}.$$
Proof

$$\begin{aligned}\mathrm{E}(X)&=\sum_{v=0}^{\infty}\frac{(v+g)\lambda^{v+g}}{\Gamma(v+1+g)\,e^{\lambda}[1-\{\Gamma(\lambda\,|\,g)/\Gamma(g)\}]}=\sum_{v=0}^{\infty}\frac{\lambda\cdot\lambda^{v-1+g}}{\Gamma(v+g)\,e^{\lambda}F_{\Gamma}(\lambda\,|\,g)}\\ &=\frac{\lambda}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)}\left\{\frac{\lambda^{-1+g}}{\Gamma(g)}+\sum_{v=1}^{\infty}\frac{\lambda^{v-1+g}}{\Gamma(v+g)}\right\}=\frac{\lambda}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)}\left\{\frac{\lambda^{-1+g}}{\Gamma(g)}+\sum_{v=0}^{\infty}\frac{\lambda^{v+g}}{\Gamma(v+1+g)}\right\},\end{aligned}$$

which gives the required expression of E(X) using Lemma 2.3 and the proof in Definition 2.2. For var(X), we have

$$\begin{aligned}\mathrm{E}\{X(X-1)\}&=\sum_{v=0}^{\infty}\frac{(v+g)(v-1+g)\lambda^{v+g}}{\Gamma(v+1+g)\,e^{\lambda}F_{\Gamma}(\lambda\,|\,g)}\\ &=\frac{1}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)}\left\{\frac{g(-1+g)\lambda^{g}}{\Gamma(1+g)}+\frac{(1+g)g\lambda^{1+g}}{\Gamma(2+g)}+\sum_{v=2}^{\infty}\frac{\lambda^{v+g}}{\Gamma(v-1+g)}\right\}\\ &=\frac{1}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)}\left\{\frac{g\lambda^{g}}{\Gamma(1+g)}(-1+g+\lambda)+\lambda^{2}\sum_{v=0}^{\infty}\frac{\lambda^{v+g}}{\Gamma(v+1+g)}\right\}\\ &=\frac{\lambda^{g}(-1+g+\lambda)}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)\Gamma(g)}+\lambda^{2},\end{aligned}$$

which gives

$$\mathrm{var}(X)=\mathrm{E}\{X(X-1)\}+\mathrm{E}(X)-\{\mathrm{E}(X)\}^{2}=\frac{\lambda^{g}(g+\lambda)}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)\Gamma(g)}+\lambda^{2}+\lambda-\left\{\frac{\lambda^{g}}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)\Gamma(g)}+\lambda\right\}^{2},$$

yielding the required result. For M(t), we have

$$\begin{aligned}M(t)=\mathrm{E}(e^{tX})&=\sum_{v=0}^{\infty}\frac{(e^{t}\lambda)^{v+g}}{\Gamma(v+1+g)\,e^{\lambda}F_{\Gamma}(\lambda\,|\,g)}\\ &=\frac{\exp(e^{t}\lambda)F_{\Gamma}(e^{t}\lambda\,|\,g)}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)}\sum_{v=0}^{\infty}\frac{(e^{t}\lambda)^{v+g}}{\Gamma(v+1+g)\exp(e^{t}\lambda)F_{\Gamma}(e^{t}\lambda\,|\,g)}\\ &=\frac{\exp(e^{t}\lambda)F_{\Gamma}(e^{t}\lambda\,|\,g)}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)},\end{aligned}$$

which is equal to the required expression. Q.E.D.

The moments can be given by M(t), though the above mean and variance are obtained by simpler methods. The known $M(t)=\exp\{(e^{t}-1)\lambda\}$ for the usual Poisson case is given by the above result using $\lim_{g\to+0}F_{\Gamma}(e^{t}\lambda\,|\,g)=\lim_{g\to+0}F_{\Gamma}(\lambda\,|\,g)=1$ (see the associated result in Remark 2.1).
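The closed forms of Theorem 2.4 can be checked against the defining series (a sketch added here, with arbitrary parameter values and a one-point check of the mgf):

```python
import math

def lower_inc_gamma(x, a):
    # gamma(x | a) by the standard power series
    term = 1.0 / a
    s = term
    n = 0
    while term > 1e-16 * s:
        n += 1
        term *= x / (a + n)
        s += term
    return x ** a * math.exp(-x) * s

def F_gamma(lam, g):
    # gamma cdf with shape g and unit scale
    return lower_inc_gamma(lam, g) / math.gamma(g)

lam, g = 2.0, 0.6                                    # arbitrary values
FG = F_gamma(lam, g)
norm = math.exp(lam) * FG
pts = [(v + g, lam ** (v + g) / (math.gamma(v + 1 + g) * norm)) for v in range(150)]

mean_series = sum(x * p for x, p in pts)
var_series = sum(x * x * p for x, p in pts) - mean_series ** 2

A = lam ** g / (math.exp(lam) * FG * math.gamma(g))
mean_closed = A + lam                                # E(X) of Theorem 2.4
var_closed = A * (g - lam) - A ** 2 + lam            # var(X) of Theorem 2.4

t = 0.3                                              # one-point mgf check
mgf_series = sum(math.exp(t * x) * p for x, p in pts)
mgf_closed = math.exp((math.exp(t) - 1.0) * lam) * F_gamma(math.exp(t) * lam, g) / FG
print(mean_closed, var_closed, mgf_closed)
```

With these values the mean exceeds λ and the variance falls below λ, which anticipates the over-/underdispersion remarks of Sect. 2.5.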
2.4.3

Applications to the Series Expressions of the Moments of the Normal Distribution

Recall the definition of $I_{k}^{(r)}$:

$$I_{k}^{(r)}=\int_{a_{r}}^{b_{r}}x^{k}\phi(x)\,dx\quad(r=1,\ldots,R;\ -\infty\le a_{1}<b_{1}<\cdots<a_{R}<b_{R}\le\infty;\ k=0,1,\ldots),$$

which is a partial moment for the r-th interval of the normal distribution under double/stripe truncation and is similar to Fisher's [5, Eq. (11)] $I_{n}$ function under single truncation:

$$I_{n}=\frac{1}{n!}\int_{x}^{\infty}(t-x)^{n}\phi(t)\,dt.$$

As in Fisher [5], we extend $I_{k}^{(r)}$ to real-valued $k>-1$ with the simplified notation

$$\begin{aligned}I_{k}&=\int_{a}^{b}x^{k}\phi(x)\,dx\quad(0\le a<b\le\infty;\ k>-1)\\ &=\frac{1}{2\sqrt{\pi}}\int_{a^{2}/2}^{b^{2}/2}2^{k/2}y^{(k-1)/2}e^{-y}\,dy\quad\left(y=x^{2}/2,\ dx/dy=1/\sqrt{2y}\right)\\ &=2^{k/2}\,\Gamma\!\left(\frac{k+1}{2}\right)\frac{1}{2\sqrt{\pi}}\left\{F_{\Gamma}\!\left(\frac{b^{2}}{2}\,\Big|\,\frac{k+1}{2}\right)-F_{\Gamma}\!\left(\frac{a^{2}}{2}\,\Big|\,\frac{k+1}{2}\right)\right\},\end{aligned}$$

where $2^{k/2}\Gamma\{(k+1)/2\}\frac{1}{\sqrt{\pi}}=\mathrm{E}(|X|^{k})$ $(k>-1)$ when $X\sim\mathrm{N}(0,1)$ (Winkelbauer [11, Eq. (18)]). Employing the notation $k+g$ $(k=-1,0,1,\ldots;\ 0\le g<1;\ k+g>-1)$ given earlier and using integration by parts, we have the recurrence expression:
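The gamma-cdf closed form of $I_k$ above can be checked against direct quadrature (my illustration, with arbitrarily chosen truncation points and a real-valued order):

```python
import math

def lower_inc_gamma(x, a):
    # gamma(x | a) by the standard power series
    term = 1.0 / a
    s = term
    n = 0
    while term > 1e-16 * s:
        n += 1
        term *= x / (a + n)
        s += term
    return x ** a * math.exp(-x) * s

def trapz(f, a, b, n=200_000):
    # simple trapezoid quadrature
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        s += f(a + i * h)
    return s * h

phi = lambda x: math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

a_, b_, k = 0.5, 2.0, 1.3                  # real-valued order k > -1
direct = trapz(lambda x: x ** k * phi(x), a_, b_)
sh = (k + 1.0) / 2.0
F = lambda y: lower_inc_gamma(y, sh) / math.gamma(sh)   # gamma cdf, unit scale
closed = 2.0 ** (k / 2.0) * math.gamma(sh) / (2.0 * math.sqrt(math.pi)) * (
    F(b_ * b_ / 2.0) - F(a_ * a_ / 2.0))
print(direct, closed)
```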
$$\begin{aligned}I_{k+g}&=\int_{a}^{b}x^{k+g}\phi(x)\,dx\\ &=\left[-x^{k-1+g}\phi(x)\right]_{a}^{b}+(k-1+g)\int_{a}^{b}x^{k-2+g}\phi(x)\,dx\\ &=a^{k-1+g}\phi(a)-b^{k-1+g}\phi(b)+(k-1+g)\,I_{k-2+g}\end{aligned}\tag{2.6}$$
$(0\le a<b\le\infty;\ k=2,3,\ldots;\ 0\le g<1)$.

(i) Even k = 2, 4, … with $0<g<1$

The case of even k with $g=0$ will be given in (ii). Under the condition of (i), repeated application of (2.6) gives

$$\begin{aligned}I_{k+g}=\int_{a}^{b}x^{k+g}\phi(x)\,dx&=a^{k-1+g}\phi(a)-b^{k-1+g}\phi(b)\\ &\quad+(k-1+g)\{a^{k-3+g}\phi(a)-b^{k-3+g}\phi(b)\}\\ &\quad+(k-1+g)(k-3+g)\{a^{k-5+g}\phi(a)-b^{k-5+g}\phi(b)\}\\ &\quad\;\;\vdots\\ &\quad+(k-1+g)(k-3+g)\cdots(5+g)(3+g)\{a^{1+g}\phi(a)-b^{1+g}\phi(b)\}\\ &\quad+(k-1+g)(k-3+g)\cdots(5+g)(3+g)(1+g)\,I_{g}\\ &=\sum_{u=1}^{k/2}\frac{(k-1+g)!!}{(2u-1+g)!!}\{a^{2u-1+g}\phi(a)-b^{2u-1+g}\phi(b)\}+(k-1+g)!!\,I_{g},\end{aligned}$$

where $(k-1+g)!!=(k-1+g)(k-3+g)\cdots(3+g)(1+g)$ is an extended double factorial for odd $k-1\ge 1$ and $0<g<1$. Define $\lambda=a^{2}/2$ and $g^{*}=(1+g)/2$. Then, using $a^{2u-1+g}=2^{u-1+g^{*}}\lambda^{u-1+g^{*}}$ and $(2u-1+g)!!=2^{u}\Gamma(u+g^{*})/\Gamma(g^{*})$, we obtain

$$\begin{aligned}\sum_{u=1}^{k/2}\frac{(k-1+g)!!}{(2u-1+g)!!}a^{2u-1+g}\phi(a)&=\frac{(k-1+g)!!}{\sqrt{2\pi}}e^{-a^{2}/2}\left\{\frac{a^{1+g}}{1+g}+\frac{a^{3+g}}{(3+g)(1+g)}+\cdots+\frac{a^{k-1+g}}{(k-1+g)!!}\right\}\\ &=(k-1+g)!!\,\frac{2^{g/2}}{2\sqrt{\pi}}\,\Gamma(g^{*})\sum_{v=0}^{(k/2)-1}\frac{\lambda^{v+g^{*}}e^{-\lambda}}{\Gamma(v+1+g^{*})}\\ &=(k-1+g)!!\,\frac{2^{g/2}}{2\sqrt{\pi}}\,\{\Gamma(g^{*})-\Gamma(\lambda\,|\,g^{*})\}\Pr\{X\le(k/2)-1+g^{*}\,|\,\lambda,g^{*}\}.\end{aligned}$$

In the last result, X is r-Poisson distributed, where $(k/2)-1$ is an integer (see Definition 2.2). Since the corresponding result for b is similarly given using $\lambda=b^{2}/2$, we have

$$\begin{aligned}I_{k+g}=\int_{a}^{b}x^{k+g}\phi(x)\,dx&=(k-1+g)!!\,\frac{2^{g/2}}{2\sqrt{\pi}}\Big[\{\Gamma(g^{*})-\Gamma(a^{2}/2\,|\,g^{*})\}\Pr\{X\le(k/2)-1+g^{*}\,|\,a^{2}/2,g^{*}\}\\ &\quad-\{\Gamma(g^{*})-\Gamma(b^{2}/2\,|\,g^{*})\}\Pr\{X\le(k/2)-1+g^{*}\,|\,b^{2}/2,g^{*}\}\Big]+(k-1+g)!!\,I_{g}.\end{aligned}$$
(ii) Even k = 2, 4, … with $g=0$

The essential results in (ii) were derived by Ogasawara [8, Sect. 5.1 (i) and A.5 (i)], where $g^{*}=(1+g)/2=1/2$ gives the half Poisson distribution. The following results are given by those in (i) with $g=0$:

$$\begin{aligned}I_{k}=\int_{a}^{b}x^{k}\phi(x)\,dx&=(k-1)!!\,\frac{1}{2\sqrt{\pi}}\Big[\{\sqrt{\pi}-\Gamma(a^{2}/2\,|\,1/2)\}\Pr\{X\le(k-1)/2\,|\,a^{2}/2,1/2\}\\ &\quad-\{\sqrt{\pi}-\Gamma(b^{2}/2\,|\,1/2)\}\Pr\{X\le(k-1)/2\,|\,b^{2}/2,1/2\}\Big]+(k-1)!!\,I_{0},\end{aligned}\tag{2.7}$$

where

$$I_{0}=\int_{a}^{b}\phi(x)\,dx=\frac{1}{2\sqrt{\pi}}\left\{\Gamma(a^{2}/2\,|\,1/2)-\Gamma(b^{2}/2\,|\,1/2)\right\}.$$
In (2.7), when $Y\sim\mathrm{N}(0,1)$, it is well known that $\mathrm{E}(Y^{k})=(k-1)!!$ $(k=2,4,\ldots)$, which is equal to $2^{k/2}\Gamma\{(k+1)/2\}/\sqrt{\pi}$ as mentioned earlier.

(iii) Odd k = 1, 3, … with $0<g<1$

Under the condition of (iii), (2.6) gives

$$\begin{aligned}I_{k+g}=\int_{a}^{b}x^{k+g}\phi(x)\,dx&=a^{k-1+g}\phi(a)-b^{k-1+g}\phi(b)\\ &\quad+(k-1+g)\{a^{k-3+g}\phi(a)-b^{k-3+g}\phi(b)\}\\ &\quad\;\;\vdots\\ &\quad+(k-1+g)(k-3+g)\cdots(4+g)(2+g)\{a^{g}\phi(a)-b^{g}\phi(b)\}\\ &\quad+(k-1+g)(k-3+g)\cdots(4+g)(2+g)g\,I_{-1+g}\\ &=\sum_{u=0}^{(k-1)/2}\frac{(k-1+g)!!}{(2u+g)!!}\{a^{2u+g}\phi(a)-b^{2u+g}\phi(b)\}+(k-1+g)!!\,I_{-1+g},\end{aligned}$$

where $(k-1+g)!!=(k-1+g)(k-3+g)\cdots(4+g)(2+g)g$ is an extended double factorial for even $k-1\ge 0$ and $0<g<1$. Define $\lambda=a^{2}/2$. Then, using $(2u+g)!!=2^{u+1}\Gamma\{u+1+(g/2)\}/\Gamma(g/2)$, we have

$$\begin{aligned}\sum_{u=0}^{(k-1)/2}\frac{(k-1+g)!!}{(2u+g)!!}a^{2u+g}\phi(a)&=\frac{(k-1+g)!!}{\sqrt{2\pi}}e^{-a^{2}/2}\left\{\frac{a^{g}}{g}+\frac{a^{2+g}}{(2+g)g}+\frac{a^{4+g}}{(4+g)(2+g)g}+\cdots+\frac{a^{k-1+g}}{(k-1+g)!!}\right\}\\ &=(k-1+g)!!\,\frac{2^{(g/2)-1}}{\sqrt{2\pi}}\,\Gamma(g/2)\sum_{v=0}^{(k-1)/2}\frac{\lambda^{v+(g/2)}e^{-\lambda}}{\Gamma\{v+1+(g/2)\}}.\end{aligned}$$
Since the result for b is similarly given, we have

$$\begin{aligned}I_{k+g}=\int_{a}^{b}x^{k+g}\phi(x)\,dx&=(k-1+g)!!\,\frac{2^{(g/2)-1}}{\sqrt{2\pi}}\Big[\{\Gamma(g/2)-\Gamma(a^{2}/2\,|\,g/2)\}\Pr\{X\le(k-1)/2+(g/2)\,|\,a^{2}/2,g/2\}\\ &\quad-\{\Gamma(g/2)-\Gamma(b^{2}/2\,|\,g/2)\}\Pr\{X\le(k-1)/2+(g/2)\,|\,b^{2}/2,g/2\}\Big]+(k-1+g)!!\,I_{-1+g},\end{aligned}\tag{2.8}$$

where X is r-Poisson distributed with $g^{*}=g/2$.
(iv) Odd k = 1, 3, … with $g=0$

The essential results in this subsection were given by Ogasawara [8, Sect. 5.1 (ii) and A.5 (ii)]. They are not given by the last result of (2.8) with $g=0$ since $\Gamma(0)$ is not defined; instead they follow from an earlier result. For clarity, we start with (2.6). Under the condition of (iv),

$$\begin{aligned}I_{k}=\int_{a}^{b}x^{k}\phi(x)\,dx&=a^{k-1}\phi(a)-b^{k-1}\phi(b)+(k-1)\{a^{k-3}\phi(a)-b^{k-3}\phi(b)\}\\ &\quad+(k-1)(k-3)\{a^{k-5}\phi(a)-b^{k-5}\phi(b)\}\\ &\quad\;\;\vdots\\ &\quad+(k-1)(k-3)\cdots 6\cdot 4\{a^{2}\phi(a)-b^{2}\phi(b)\}+(k-1)(k-3)\cdots 6\cdot 4\cdot 2\,I_{1}\\ &=\sum_{u=0}^{(k-1)/2}\frac{(k-1)!!}{(2u)!!}\{a^{2u}\phi(a)-b^{2u}\phi(b)\},\end{aligned}$$

where $I_{1}=\phi(a)-\phi(b)$ supplies the $u=0$ term. Define $\lambda=a^{2}/2$. Then, we have

$$\begin{aligned}\sum_{u=0}^{(k-1)/2}\frac{(k-1)!!}{(2u)!!}a^{2u}\phi(a)&=\frac{(k-1)!!}{\sqrt{2\pi}}e^{-a^{2}/2}\left\{1+\frac{a^{2}}{2}+\frac{a^{4}}{4\cdot 2}+\cdots+\frac{a^{k-1}}{(k-1)!!}\right\}\\ &=\frac{(k-1)!!}{\sqrt{2\pi}}\left\{e^{-\lambda}+\lambda e^{-\lambda}+\frac{\lambda^{2}e^{-\lambda}}{2!}+\cdots+\frac{\lambda^{(k-1)/2}e^{-\lambda}}{\{(k-1)/2\}!}\right\}\\ &=\frac{(k-1)!!}{\sqrt{2\pi}}\Pr\{X\le(k-1)/2\,|\,\lambda,0\},\end{aligned}$$

where the r-Poisson distributed X reduces to the usual Poisson.
Using similar results for b, we obtain

$$I_{k}=\int_{a}^{b}x^{k}\phi(x)\,dx=\frac{(k-1)!!}{\sqrt{2\pi}}\big[\Pr\{X\le(k-1)/2\,|\,a^{2}/2,0\}-\Pr\{X\le(k-1)/2\,|\,b^{2}/2,0\}\big].\tag{2.9}$$
When $Y\sim\mathrm{N}(0,1)$, (2.9) gives $\mathrm{E}(|Y|^{k})=(k-1)!!\sqrt{2/\pi}$ $(k=1,3,\ldots)$, which is equal to $2^{k/2}\Gamma\{(k+1)/2\}/\sqrt{\pi}$ as mentioned earlier.
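Equation (2.9) is easy to exercise numerically, since for odd k with $g=0$ only the ordinary Poisson cdf is needed. The sketch below (added here; interval endpoints arbitrary) compares (2.9) with direct quadrature for $k=5$:

```python
import math

def poisson_cdf(lam, kmax):
    # Pr(X <= kmax) for X ~ Poisson(lam)
    return sum(lam ** v * math.exp(-lam) / math.factorial(v) for v in range(kmax + 1))

def trapz(f, a, b, n=200_000):
    # simple trapezoid quadrature
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        s += f(a + i * h)
    return s * h

phi = lambda x: math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

a_, b_, k = 0.7, 2.5, 5                    # odd k
direct = trapz(lambda x: x ** k * phi(x), a_, b_)
dfact = 4.0 * 2.0                          # (k-1)!! for k = 5
m = (k - 1) // 2
closed = dfact / math.sqrt(2.0 * math.pi) * (
    poisson_cdf(a_ * a_ / 2.0, m) - poisson_cdf(b_ * b_ / 2.0, m))
print(direct, closed)  # both ~1.922
```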
2.5
Remarks
The mean derived in Theorem 2.4 is larger than λ, the mean of the usual Poisson case. Similarly, the variance in Theorem 2.4 shows that when λ is sufficiently large, e.g., $\lambda\gg g$, the variance is smaller than λ, indicating underdispersion. The r-Poisson distributed variable takes non-integer values when $0<g<1$. By replacing $x=v+g$ with $x=v$ $(v=0,1,\ldots)$ in Definition 2.2, we have another integer-valued GPD similar to (1.2):

$$\Pr(X=v\,|\,\lambda,g)=\begin{cases}\dfrac{\lambda^{v+g}}{\Gamma(v+1+g)\,e^{\lambda}F_{\Gamma}(\lambda\,|\,g)} & (0<g<1)\\[1.5ex] \lambda^{v}/(e^{\lambda}v!) & (g=0)\end{cases}\tag{2.10}$$
[compare (2.10) and (2.3)]. Note that the probabilities in (2.10) are proportional to $\lambda^{v+g}/\Gamma(v+1+g)$ as in the usual Poisson case, whereas (2.3) does not have this property. In this sense, (2.3) can be seen as a relatively zero-distorted distribution. We also find that (2.10) and the r-Poisson belong to the exponential family of distributions, while (2.3) does not. The GPD of (2.10) is a downward shift of the r-Poisson by g, with

$$\mathrm{E}(X)=\frac{\lambda^{g}}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)\Gamma(g)}-g+\lambda,$$

the unchanged variance

$$\mathrm{var}(X)=\frac{\lambda^{g}(g-\lambda)}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)\Gamma(g)}-\left\{\frac{\lambda^{g}}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)\Gamma(g)}\right\}^{2}+\lambda$$
as in Theorem 2.4, and the moment generating function

$$\begin{aligned}M(t)=\mathrm{E}(e^{tX})&=\sum_{v=0}^{\infty}\frac{e^{tv}\lambda^{v+g}}{\Gamma(v+1+g)\,e^{\lambda}F_{\Gamma}(\lambda\,|\,g)}\\ &=\frac{\exp(e^{t}\lambda-tg)F_{\Gamma}(e^{t}\lambda\,|\,g)}{e^{\lambda}F_{\Gamma}(\lambda\,|\,g)}\sum_{v=0}^{\infty}\frac{(e^{t}\lambda)^{v+g}}{\Gamma(v+1+g)\exp(e^{t}\lambda)F_{\Gamma}(e^{t}\lambda\,|\,g)}\\ &=\frac{\exp\{(e^{t}-1)\lambda-tg\}F_{\Gamma}(e^{t}\lambda\,|\,g)}{F_{\Gamma}(\lambda\,|\,g)}.\end{aligned}$$

Chandra et al. [1, p. 2788] suggest a real-valued distribution similar to the r-Poisson. However, their distribution is a real-valued counterpart of (2.3), with the zero-distorted property.
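The shifted GPD (2.10) can also be checked numerically (an illustration added here, with arbitrary parameter values): its probabilities sum to one, and its mean equals the r-Poisson mean minus g.

```python
import math

def lower_inc_gamma(x, a):
    # gamma(x | a) by the standard power series
    term = 1.0 / a
    s = term
    n = 0
    while term > 1e-16 * s:
        n += 1
        term *= x / (a + n)
        s += term
    return x ** a * math.exp(-x) * s

lam, g = 2.0, 0.6
FG = lower_inc_gamma(lam, g) / math.gamma(g)         # F_Gamma(lam | g)
norm = math.exp(lam) * FG
pmf = [lam ** (v + g) / (math.gamma(v + 1 + g) * norm) for v in range(150)]

total = sum(pmf)
mean_series = sum(v * p for v, p in enumerate(pmf))
A = lam ** g / (math.exp(lam) * FG * math.gamma(g))
mean_closed = A - g + lam                            # E(X) for the shifted GPD (2.10)
print(total, mean_series, mean_closed)
```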
References

1. Chandra NK, Roy D, Ghosh T (2013) A generalized Poisson distribution. Commun Stat Theor Methods 42:2786–2797
2. Consul PC (1989) Generalized Poisson distribution: properties and applications. Marcel Dekker, New York
3. Consul PC, Famoye E (1992) Generalized Poisson regression model. Commun Stat Theor Methods 21:89–109
4. Consul PC, Jain G (1973) A generalization of the Poisson distribution. Technometrics 15:791–799
5. Fisher RA (1931) Introduction to Mathematical Tables, 1, xxvi–xxxv. British Association for the Advancement of Science. Reprinted in R. A. Fisher, Contributions to Mathematical Statistics, pp 517–526, with the title "The sampling error of estimated deviates, together with other illustrations of the properties and applications of the integrals and derivatives of the normal error function" and the author's note (CMS 23.xxva). Wiley, New York (1950)
6. Johnson NL, Kotz S, Balakrishnan N (1994) Continuous univariate distributions, vol 1, 2nd edn. Wiley, New York
7. Letac G, Mora M (1990) Natural real exponential families with cubic variance functions. Ann Stat 18:1–37
8. Ogasawara H (2021) Unified and non-recursive formulas for moments of the normal distribution with stripe truncation. Commun Stat Theor Methods (published online). https://doi.org/10.1080/03610926.2020.1867742
9. Pollack M, Shauly-Aharonov M (2019) A double recursion for calculating moments of the truncated normal distribution and its connection to change detection. Methodol Comput Appl Probab 21:889–906
10. Wagh YS, Kamalja KK (2017) Comparison of methods of estimation for parameters of generalized Poisson distribution through simulation study. Commun Stat Simul Comput 46:4098–4112
11. Winkelbauer A (2014) Moments and absolute moments of the normal distribution. arXiv:1209.4340v2 [math.ST], 15 July 2014
12. Zamani H, Ismail N (2012) Functional form for the generalized Poisson regression model. Commun Stat Theor Methods 41:3666–3675
Chapter 3
The Basic Parabolic Cylinder Distribution and Its Multivariate Extension
3.1
Introduction
A variation of the parabolic cylinder distribution was given by Kostylev [11, Eq. (3.14)], where the density function when the variable X takes the value A is defined by

$$f_{D}^{(\mathrm{Ko})}(X=A\,|\,a,\sigma,p)=1(A)\,\frac{A^{a-1}\exp\{-(A^{4}-2pA^{2})/(4\sigma^{4})\}}{D_{-a/2}\{-p/(\sqrt{2}\sigma^{2})\}\,2^{(a/4)-1}\sigma^{a}\,\Gamma(a/2)\exp\{p^{2}/(8\sigma^{4})\}},\tag{3.1}$$

where $1(x)=1$ when $x\ge 0$ and $1(x)=0$ when $x<0$; $D_{\nu}(z)$ is the parabolic cylinder function in the traditional Whittaker notation [2, Chap. 12; 4, Chap. 8; 13, Chap. 8; 28, Sects. 9.24–9.25]; and $\Gamma(\cdot)$ is the gamma function. The distribution with (3.1) was derived in the context of energy detection and called the three-parametric parabolic cylinder distribution. Two other variations of the parabolic cylinder distribution were introduced by Ogasawara [17, Definitions 1 and 2], where the density functions are

$$f_{D,k}^{(1)}(X=x\,|\,g)=\frac{x^{k}\exp(-x^{2}-gx)}{2^{-(k+1)/2}\,\Gamma(k+1)\,e^{g^{2}/8}\,D_{-k-1}(g/\sqrt{2})}\quad(x>0,\ k>-1)\tag{3.2}$$

and

$$f_{D,k}^{(2)}(X=x\,|\,h)=\frac{x^{k}\exp\{-(x^{2}/2)-hx\}}{\Gamma(k+1)\,e^{h^{2}/4}\,D_{-k-1}(h)}\quad(x>0,\ k>-1).\tag{3.3}$$
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 H. Ogasawara, Expository Moments for Pseudo Distributions, Behaviormetrics: Quantitative Approaches to Human Behavior 2, https://doi.org/10.1007/978-981-19-3525-1_3
The distributions with (3.2) and (3.3) were called the basic parabolic cylinder (bpc) distributions of the first and second kinds, respectively. The two bpc distributions were introduced to obtain the absolute moments of real-valued orders in the truncated normal distribution. Note that (3.2) and (3.3) are not special cases of (3.1) under reparametrization alone, although (3.1) can give (3.2) and (3.3) when a change of variables is employed as well. The parabolic cylinder distributions using $D_{\nu}(\cdot)$ derived by Pierson and Holmes [22] and rediscovered by Mathai and Saxena [14, Theorem 7], and the generalized parabolic cylinder distribution obtained by Progri [25], are different from (3.1) to (3.3). The purposes of this chapter are to give another simple bpc distribution with an additional parameter over (3.2) and (3.3), to derive its moments and cumulative distribution function (cdf), and to give a multivariate extension. For the cdf of the bpc distribution, the weighted parabolic cylinder function is developed.
3.2
The BPC Distribution of the Third Kind and Its CDF
A new bpc distribution is defined as follows.

Definition 3.1 The probability density function (pdf) of the bpc distribution of the third kind, or simply the bpc distribution, is defined by

$$\begin{aligned}f_{D,k}^{(3)}(X=x\,|\,p,q)=f_{D,k}(x\,|\,p,q)=f_{D,k}(x)&=\frac{x^{k}\exp(-px^{2}-qx)}{\int_{0}^{\infty}t^{k}\exp(-pt^{2}-qt)\,dt}\\ &=\frac{x^{k}\exp(-px^{2}-qx)}{(2p)^{-(k+1)/2}\,\Gamma(k+1)\,e^{q^{2}/(8p)}\,D_{-k-1}(q/\sqrt{2p})}\end{aligned}\tag{3.4}$$

$(x>0,\ p>0,\ k>-1)$.

In (3.4), $1/\sqrt{2p}$, or a quantity proportional to it, is seen as an added scale parameter. The derivation of the normalizer in (3.4) is given by the integral representation of $D_{-k-1}(\cdot)$ [5, Sect. 6.3, Eq. (13); 28, Sect. 3.462, Eq. 1]. The two distributions of (3.1) and (3.4) are equivalent in that each is obtained from the other by a change of variables and reparametrization. While (3.1) was developed for a special purpose and is complicated, (3.4) is simpler. Note that the denominator of (3.4) is proportional to $\mathrm{E}(X^{k})$ $(k>-1)$ of a normally distributed variable X under single truncation.

First, we derive the cdf of the bpc distribution with (3.4). For this purpose, the following function is defined.
Definition 3.2 The weighted or incomplete parabolic cylinder function, in an extended Whittaker notation, is given by

$$D_{\nu W}(z,x)=2^{\nu/2}e^{-z^{2}/4}\left[\frac{\sqrt{\pi}}{\Gamma\{(1-\nu)/2\}}\,{}_{1}F_{1W}\!\left(-\frac{\nu}{2};\frac{1}{2};\frac{z^{2}}{2};\frac{x^{2}}{2}\right)-\frac{\sqrt{2\pi}\,z}{\Gamma(-\nu/2)}\,{}_{1}F_{1W}\!\left(\frac{1-\nu}{2};\frac{3}{2};\frac{z^{2}}{2};\frac{x^{2}}{2}\right)\right],$$

where

$${}_{1}F_{1W}(g;\xi;z;x)=\sum_{u=0}^{\infty}\frac{(g)_{u}z^{u}}{(\xi)_{u}u!}\cdot\frac{\gamma(x\,|\,g+u)}{\Gamma(g+u)}\tag{3.5}$$

is the weighted or incomplete Kummer confluent hypergeometric function defined by Ogasawara [18, Corollary 1]; $(g)_{u}=g(g+1)\cdots(g+u-1)=\Gamma(g+u)/\Gamma(g)$ with $(g)_{0}=1$ is the rising or ascending factorial; and $\gamma(x\,|\,g+u)=\int_{0}^{x}z^{g+u-1}e^{-z}\,dz$ is the lower incomplete gamma function. The subscript "W" in $D_{\nu W}(z,x)$ and ${}_{1}F_{1W}(g;\xi;z;x)$ indicates a "weighted" function for clarity. When $x=\infty$, (3.5) with $\gamma(\infty\,|\,g+u)=\Gamma(g+u)$ becomes the usual Kummer confluent hypergeometric function ${}_{1}F_{1}(g;\xi;z)$ [27, Eq. (6); 28, Sect. 9.210, Eq. 1; 2, Chap. 13]. Note that the weight $0\le\gamma(x\,|\,g+u)/\Gamma(g+u)\le 1$ in (3.5) is the cdf of the gamma distribution at $X=x$ with shape parameter $g+u$ and unit scale parameter. The term "incomplete" is used synonymously for $D_{\nu W}(z,x)$ to avoid confusion, since the term "weighted parabolic cylinder function" is also used for a weighted sum of parabolic cylinder functions [19, p. 616; see also 15, p. 99].

Then, one of the main results is given as follows.

Theorem 3.1 The cdf of the bpc distribution with (3.4), other than the integral representation, is given in three ways:

$$\begin{aligned}\Pr(X\le x\,|\,k,p,q)&=D_{-k-1,W}(q/\sqrt{2p},\sqrt{2p}\,x)\big/D_{-k-1}(q/\sqrt{2p})\quad\text{(1st expression)}\\[1ex] &=\frac{\sum_{i=0}^{\infty}\gamma\{px^{2}\,|\,(k+1+i)/2\}\,(-q/\sqrt{p})^{i}/i!}{\sum_{i=0}^{\infty}\Gamma\{(k+1+i)/2\}\,(-q/\sqrt{p})^{i}/i!}\quad\text{(2nd expression)}\\[1ex] &=\left[\Gamma\!\left(\frac{k+1}{2}\right){}_{1}F_{1W}\!\left(\frac{k+1}{2};\frac{1}{2};\frac{q^{2}}{4p};px^{2}\right)-\frac{q}{\sqrt{p}}\,\Gamma\!\left(\frac{k+2}{2}\right){}_{1}F_{1W}\!\left(\frac{k+2}{2};\frac{3}{2};\frac{q^{2}}{4p};px^{2}\right)\right]\\ &\quad\Big/\left[\Gamma\!\left(\frac{k+1}{2}\right){}_{1}F_{1}\!\left(\frac{k+1}{2};\frac{1}{2};\frac{q^{2}}{4p}\right)-\frac{q}{\sqrt{p}}\,\Gamma\!\left(\frac{k+2}{2}\right){}_{1}F_{1}\!\left(\frac{k+2}{2};\frac{3}{2};\frac{q^{2}}{4p}\right)\right]\quad\text{(3rd expression)}.\end{aligned}\tag{3.6}$$
Proof In the integral representation

$$\Pr(X\le x\,|\,k,p,q)=\frac{\int_{0}^{x}t^{k}\exp(-pt^{2}-qt)\,dt}{(2p)^{-(k+1)/2}\,\Gamma(k+1)\,e^{q^{2}/(8p)}\,D_{-k-1}(q/\sqrt{2p})},\tag{3.7}$$

the expansion $e^{-qt}=\sum_{i=0}^{\infty}(-qt)^{i}/i!$ and the variable transformation $y=pt^{2}$ with $dt/dy=1/(2\sqrt{py})$ give alternative expressions of the numerator of (3.7):

$$\begin{aligned}\int_{0}^{x}t^{k}\exp(-pt^{2}-qt)\,dt&=\sum_{i=0}^{\infty}\frac{(-q)^{i}}{i!}\int_{0}^{px^{2}}\frac{y^{(k-1+i)/2}e^{-y}}{2p^{(k+1+i)/2}}\,dy=\sum_{i=0}^{\infty}\frac{\gamma\{px^{2}\,|\,(k+1+i)/2\}}{2p^{(k+1+i)/2}}\frac{(-q)^{i}}{i!}\\ &=\frac{1}{2p^{(k+1)/2}}\sum_{i=0}^{\infty}\gamma\{px^{2}\,|\,(k+1+i)/2\}\frac{(-q/\sqrt{p})^{i}}{i!}\quad\text{(scaled 2nd expression)}\\ &=\frac{1}{2p^{(k+1)/2}}\left[\Gamma\!\left(\frac{k+1}{2}\right){}_{1}F_{1W}\!\left(\frac{k+1}{2};\frac{1}{2};\frac{q^{2}}{4p};px^{2}\right)\right.\\ &\qquad\left.-\frac{q}{\sqrt{p}}\,\Gamma\!\left(\frac{k+2}{2}\right){}_{1}F_{1W}\!\left(\frac{k+2}{2};\frac{3}{2};\frac{q^{2}}{4p};px^{2}\right)\right]\quad\text{(scaled 3rd expression)},\end{aligned}\tag{3.8}$$

where the two terms of the "3rd expression" correspond to the even and odd powers in the preceding infinite series, as in Ogasawara [18, Lemma 1]. To see this, let $w=px^{2}$ and $2\phi=-q/\sqrt{p}$. Grouping the even powers $i=2u$ and the odd powers $i=2u+1$ and using $(2u)!=2^{2u}u!\,(1/2)_{u}$, $(2u+1)!=2^{2u}u!\,(3/2)_{u}$, and $\{(k+1)/2\}_{u}/\Gamma\{(k+1)/2+u\}=1/\Gamma\{(k+1)/2\}$,

$$\begin{aligned}\sum_{i=0}^{\infty}\gamma\{w\,|\,(k+1+i)/2\}\frac{(2\phi)^{i}}{i!}&=\sum_{u=0}^{\infty}\gamma\!\left\{w\,\Big|\,\frac{k+1}{2}+u\right\}\frac{(\phi^{2})^{u}}{(1/2)_{u}u!}+2\phi\sum_{u=0}^{\infty}\gamma\!\left\{w\,\Big|\,\frac{k+2}{2}+u\right\}\frac{(\phi^{2})^{u}}{(3/2)_{u}u!}\\ &=\Gamma\!\left(\frac{k+1}{2}\right){}_{1}F_{1W}\!\left(\frac{k+1}{2};\frac{1}{2};\phi^{2};w\right)+2\phi\,\Gamma\!\left(\frac{k+2}{2}\right){}_{1}F_{1W}\!\left(\frac{k+2}{2};\frac{3}{2};\phi^{2};w\right),\end{aligned}$$

which, with $\phi^{2}=q^{2}/(4p)$ and $2\phi=-q/\sqrt{p}$, yields the "3rd expression." In the Legendre duplication formula or the multiplication theorem $\Gamma(z)\Gamma(z+0.5)=2^{1-2z}\sqrt{\pi}\,\Gamma(2z)$ [1, Eq. 6.1.18; 3, Sect. 1.2, Eq. (11)], the case $z=(k+1)/2$ gives

$$\Gamma\!\left(\frac{k+1}{2}\right)\Gamma\!\left(\frac{k+2}{2}\right)=2^{-k}\sqrt{\pi}\,\Gamma(k+1).\tag{3.9}$$

Then, Definition 3.2 with (3.9) turns the last result of (3.8) into

$$\int_{0}^{x}t^{k}\exp(-pt^{2}-qt)\,dt=(2p)^{-(k+1)/2}\,e^{q^{2}/(8p)}\,\Gamma(k+1)\,D_{-k-1,W}\!\left(\frac{q}{\sqrt{2p}},\sqrt{2p}\,x\right)\quad\text{(scaled 1st expression)}.\tag{3.10}$$

Canceling the factor $(2p)^{-(k+1)/2}e^{q^{2}/(8p)}\Gamma(k+1)$ common to (3.10) and the denominator of (3.7), we obtain the "1st expression" of (3.6). Noting that when $x=\infty$, (3.8) gives the denominator of (3.7), and canceling the factor $\{2p^{(k+1)/2}\}^{-1}$, the "2nd expression" of (3.6) follows. The "3rd expression" of (3.6) is given similarly from the last result of (3.8). Q.E.D.
The "2nd expression" of (3.6) is of use for actual computation, replacing the infinite series with a finite one when the residual is sufficiently small. The "3rd expression" will become useful when methods for the weighted or incomplete Kummer confluent hypergeometric function are developed. Comparing (3.7) and the "1st expression" of (3.6), we have an integral formula when $p=1/2$ and $q=z$:

$$D_{-k-1,W}(z,x)=\frac{e^{-z^{2}/4}}{\Gamma(k+1)}\int_{0}^{x}t^{k}\exp\!\left(-\frac{t^{2}}{2}-zt\right)dt.\tag{3.11}$$
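As a numerical illustration of the computation recommended above (added here; parameter values arbitrary), the truncated "2nd expression" of (3.6) can be compared with direct quadrature of (3.7):

```python
import math

def lower_inc_gamma(x, a):
    # gamma(x | a) by the standard power series
    term = 1.0 / a
    s = term
    n = 0
    while term > 1e-16 * s:
        n += 1
        term *= x / (a + n)
        s += term
    return x ** a * math.exp(-x) * s

def trapz(f, a, b, n=200_000):
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        s += f(a + i * h)
    return s * h

k, p, q, x = 1.5, 0.8, 0.7, 1.2
r = -q / math.sqrt(p)
num = sum(lower_inc_gamma(p * x * x, (k + 1 + i) / 2.0) * r ** i / math.factorial(i)
          for i in range(60))
den = sum(math.gamma((k + 1 + i) / 2.0) * r ** i / math.factorial(i)
          for i in range(60))
cdf_series = num / den

f = lambda t: t ** k * math.exp(-p * t * t - q * t)
cdf_direct = trapz(f, 1e-12, x) / trapz(f, 1e-12, 30.0)
print(cdf_series, cdf_direct)
```

Sixty series terms are far more than needed here; the terms decay roughly like $1/\sqrt{i!}$.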
The proof of Theorem 3.1 gives (3.11) and, consequently, another derivation of the integral representation of the usual $D_{-k-1}(z)$, other than that using the associated differential equation [4, p. 120].

Remark 3.1 Alternative expressions of $D_{-k-1,W}(z,x)$ are given as follows. Let $z=q/\sqrt{2p}$ and $p=1/2$. Then, comparing the "1st expression" of (3.10) with the "2nd and 3rd expressions" of (3.8), we obtain the following:
$$\begin{aligned}D_{-k-1,W}(z,x)&=\frac{e^{-z^{2}/4}}{\Gamma(k+1)}\int_{0}^{x}t^{k}\exp\!\left(-\frac{t^{2}}{2}-zt\right)dt\quad\text{(1st expression)}\\ &=\frac{2^{(k-1)/2}e^{-z^{2}/4}}{\Gamma(k+1)}\sum_{i=0}^{\infty}\gamma\!\left(\frac{x^{2}}{2}\,\Big|\,\frac{k+1+i}{2}\right)\frac{(-\sqrt{2}\,z)^{i}}{i!}\quad\text{(2nd expression)}\\ &=\frac{2^{(k-1)/2}e^{-z^{2}/4}}{\Gamma(k+1)}\left[\Gamma\!\left(\frac{k+1}{2}\right){}_{1}F_{1W}\!\left(\frac{k+1}{2};\frac{1}{2};\frac{z^{2}}{2};\frac{x^{2}}{2}\right)\right.\\ &\qquad\left.-\sqrt{2}\,z\,\Gamma\!\left(\frac{k+2}{2}\right){}_{1}F_{1W}\!\left(\frac{k+2}{2};\frac{3}{2};\frac{z^{2}}{2};\frac{x^{2}}{2}\right)\right]\quad\text{(3rd expression)}.\end{aligned}\tag{3.12}$$

Let $F_{\Gamma}\!\left(\frac{x^{2}}{2}\,\big|\,\frac{k+1}{2}\right)$ be the cdf of the gamma distribution at $x^{2}/2$ with shape parameter $(k+1)/2$ and unit scale parameter. Then, when $z=0$, we have

$$\begin{aligned}D_{-k-1,W}(0,x)&=\frac{1}{\Gamma(k+1)}\int_{0}^{x}t^{k}\exp(-t^{2}/2)\,dt=\frac{2^{(k-1)/2}}{\Gamma(k+1)}\gamma\!\left(\frac{x^{2}}{2}\,\Big|\,\frac{k+1}{2}\right)\\ &=\frac{2^{(k-1)/2}}{\Gamma(k+1)}\,\Gamma\!\left(\frac{k+1}{2}\right)F_{\Gamma}\!\left(\frac{x^{2}}{2}\,\Big|\,\frac{k+1}{2}\right)=2^{-(k+1)/2}\sqrt{\pi}\left\{\Gamma\!\left(\frac{k+2}{2}\right)\right\}^{-1}F_{\Gamma}\!\left(\frac{x^{2}}{2}\,\Big|\,\frac{k+1}{2}\right)\\ &=\frac{2^{(k-1)/2}}{\Gamma(k+1)}\,\Gamma\!\left(\frac{k+1}{2}\right){}_{1}F_{1W}\!\left(\frac{k+1}{2};\frac{1}{2};0;\frac{x^{2}}{2}\right)=2^{-(k+1)/2}\sqrt{\pi}\left\{\Gamma\!\left(\frac{k+2}{2}\right)\right\}^{-1}{}_{1}F_{1W}\!\left(\frac{k+1}{2};\frac{1}{2};0;\frac{x^{2}}{2}\right),\end{aligned}\tag{3.13}$$

where (3.9) gives the second equality of the last two lines.
¼ 1 F1W
kþ1 1 x2 2 ; 2 ; 0; 2
. When z = 0, the
definition 00 = 1 should be used in the “2nd and 3rd expressions” of (3.12), and in this case, the bpc distribution reduces to the chi distribution with k + 1 degrees of freedom (real-valued k [ 1), whose density function is
3 The Basic Parabolic Cylinder Distribution and Its Multivariate …
76
$$f_{\chi,k+1}(x)=\frac{x^{k}e^{-x^{2}/2}}{\int_{0}^{\infty}x^{k}e^{-x^{2}/2}\,dx}=\frac{x^{k}e^{-x^{2}/2}}{\int_{0}^{\infty}(2y)^{k/2}\dfrac{e^{-y}}{\sqrt{2y}}\,dy}=\frac{x^{k}e^{-x^{2}/2}}{2^{(k-1)/2}\int_{0}^{\infty}y^{(k-1)/2}e^{-y}\,dy}=\frac{x^{k}e^{-x^{2}/2}}{2^{(k-1)/2}\Gamma\{(k+1)/2\}},$$
which gives

$$\Pr(X\le x\,|\,k)=\gamma\!\left(\frac{x^{2}}{2}\,\Big|\,\frac{k+1}{2}\right)\Big/\Gamma\!\left(\frac{k+1}{2}\right).$$

This cdf can also be given from that of the bpc distribution:

$$\begin{aligned}\Pr(X\le x\,|\,k)&=D_{-k-1,W}(0,x)\big/D_{-k-1}(0)\\ &=\frac{2^{(k-1)/2}}{\Gamma(k+1)}\,\gamma\!\left(\frac{x^{2}}{2}\,\Big|\,\frac{k+1}{2}\right)\left\{\frac{2^{(k-1)/2}}{\Gamma(k+1)}\,\Gamma\!\left(\frac{k+1}{2}\right)\right\}^{-1}\\ &=\gamma\!\left(\frac{x^{2}}{2}\,\Big|\,\frac{k+1}{2}\right)\Big/\Gamma\!\left(\frac{k+1}{2}\right).\end{aligned}\tag{3.14}$$
The median or, more generally, the α-th $(0<\alpha<1)$ quantile of the bpc distribution is given by the inverse function of the cdf, i.e., the solution in x of $\alpha=\Pr(X\le x\,|\,k,p,q)$ using the expressions of Theorem 3.1. However, it is difficult to obtain a closed form, and some numerical procedure, e.g., the bisection method, may be employed for the solution.
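The bisection step mentioned above can be sketched as follows (an illustration added here, not the authors' code; the cdf is evaluated through the truncated "2nd expression" of (3.6), and the parameter values are arbitrary):

```python
import math

def lower_inc_gamma(x, a):
    # gamma(x | a) by the standard power series
    term = 1.0 / a
    s = term
    n = 0
    while term > 1e-16 * s:
        n += 1
        term *= x / (a + n)
        s += term
    return x ** a * math.exp(-x) * s

def bpc_cdf(x, k, p, q, terms=80):
    # cdf via the "2nd expression" of (3.6), series truncated at `terms`
    r = -q / math.sqrt(p)
    num = sum(lower_inc_gamma(p * x * x, (k + 1 + i) / 2.0) * r ** i / math.factorial(i)
              for i in range(terms))
    den = sum(math.gamma((k + 1 + i) / 2.0) * r ** i / math.factorial(i)
              for i in range(terms))
    return num / den

def bpc_quantile(alpha, k, p, q, lo=0.0, hi=20.0, iters=60):
    # bisection on the increasing cdf
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if bpc_cdf(mid, k, p, q) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

k, p, q = 1.5, 0.8, 0.7
med = bpc_quantile(0.5, k, p, q)
print(med, bpc_cdf(med, k, p, q))
```

The bracket [0, 20] is a safe ad hoc choice for these parameter values; in general the upper end should cover essentially all of the mass.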
3.3
Moments of the BPC Distribution
The moments of the bpc distribution are given by the following lemma, which is easily obtained from earlier results.

Lemma 3.1 The m-th raw moment $(m\ge 0)$ of the bpc distribution is given in three ways, as well as by the integral representation:
$$\begin{aligned}\mathrm{E}(X^{m}\,|\,k,p,q)=\mathrm{E}(X^{m})&=\frac{\int_{0}^{\infty}x^{m+k}\exp(-px^{2}-qx)\,dx}{\int_{0}^{\infty}x^{k}\exp(-px^{2}-qx)\,dx}\\ &=(2p)^{-m/2}(k+1)_{m}\,\frac{D_{-m-k-1}(q/\sqrt{2p})}{D_{-k-1}(q/\sqrt{2p})}\quad\text{(1st expression)}\\ &=p^{-m/2}\,\frac{\sum_{i=0}^{\infty}\Gamma\{(m+k+1+i)/2\}(-q/\sqrt{p})^{i}/i!}{\sum_{i=0}^{\infty}\Gamma\{(k+1+i)/2\}(-q/\sqrt{p})^{i}/i!}\quad\text{(2nd expression)}\\ &=p^{-m/2}\left[\Gamma\!\left(\frac{m+k+1}{2}\right){}_{1}F_{1}\!\left(\frac{m+k+1}{2};\frac{1}{2};\frac{q^{2}}{4p}\right)-\frac{q}{\sqrt{p}}\,\Gamma\!\left(\frac{m+k+2}{2}\right){}_{1}F_{1}\!\left(\frac{m+k+2}{2};\frac{3}{2};\frac{q^{2}}{4p}\right)\right]\\ &\quad\Big/\left[\Gamma\!\left(\frac{k+1}{2}\right){}_{1}F_{1}\!\left(\frac{k+1}{2};\frac{1}{2};\frac{q^{2}}{4p}\right)-\frac{q}{\sqrt{p}}\,\Gamma\!\left(\frac{k+2}{2}\right){}_{1}F_{1}\!\left(\frac{k+2}{2};\frac{3}{2};\frac{q^{2}}{4p}\right)\right]\quad\text{(3rd expression)},\end{aligned}\tag{3.15}$$
where $(k+1)_{m}$ is read as $\Gamma(m+k+1)/\Gamma(k+1)$ when m is a non-integer.

From the "1st expression" of Lemma 3.1, we have the following result.

Theorem 3.2 Let $D_{-k-1}\equiv D_{-k-1}(q_{p})$ with $q_{p}\equiv q/\sqrt{2p}$ when confusion does not occur. Define $\mathrm{sk}(X\,|\,k,p,q)=\mathrm{sk}(X)$ and $\mathrm{kt}(X\,|\,k,p,q)=\mathrm{kt}(X)$ as the skewness and excess kurtosis of the bpc distribution, respectively. Then, we obtain

$$\mathrm{E}(X)=\frac{(k+1)D_{-k-2}}{\sqrt{2p}\,D_{-k-1}},$$

$$\mathrm{var}(X)=\frac{k+1}{2p\,D_{-k-1}^{2}}\{(k+2)D_{-k-1}D_{-k-3}-(k+1)D_{-k-2}^{2}\},$$

$$\begin{aligned}\mathrm{sk}(X)=\frac{k+1}{(2p)^{3/2}D_{-k-1}^{3}}\{&(k+2)(k+3)D_{-k-1}^{2}D_{-k-4}-3(k+1)(k+2)D_{-k-1}D_{-k-2}D_{-k-3}\\ &+2(k+1)^{2}D_{-k-2}^{3}\}\{\mathrm{var}(X)\}^{-3/2},\end{aligned}$$

$$\begin{aligned}\mathrm{kt}(X)=\frac{k+1}{(2p)^{2}D_{-k-1}^{4}}\{&(k+2)(k+3)(k+4)D_{-k-1}^{3}D_{-k-5}-4(k+1)(k+2)(k+3)D_{-k-1}^{2}D_{-k-2}D_{-k-4}\\ &+6(k+1)^{2}(k+2)D_{-k-1}D_{-k-2}^{2}D_{-k-3}-3(k+1)^{3}D_{-k-2}^{4}\}\{\mathrm{var}(X)\}^{-2}-3.\end{aligned}\tag{3.16}$$
For $D_{\nu}(z)$, the recurrence relation

$$D_{\nu+1}(z)-zD_{\nu}(z)+\nu D_{\nu-1}(z)=0\tag{3.17}$$

is known [4, Sect. 8.2, Eq. (14); 13, Sect. 8.1.3, Recurrence relations; 28, Sect. 9.247, Eq. 1]. An alternative recurrence formula using the Miller notation is also known:

$$zU(a,z)-U(a-1,z)+(a+0.5)U(a+1,z)=0\tag{3.18}$$

[1, Eq. 19.6.4; 2, Eq. 12.8.1]. The equivalence of (3.17) and (3.18), with sign reversal, follows from $U(a,z)=D_{-a-0.5}(z)$ when $\nu=-a-0.5$. In this chapter,

$$kD_{-k-1}(z)=D_{-k+1}(z)-zD_{-k}(z)\tag{3.19}$$

from (3.17) with $\nu=-k$ is used, with $D_{\nu}\equiv D_{\nu}(q_{p})$. Employing the familiar notation σ and μ, we have $\exp(-px^{2}-qx)=\exp\{\mu^{2}/(2\sigma^{2})\}\exp\{-(x-\mu)^{2}/(2\sigma^{2})\}$ with $p=1/(2\sigma^{2})$ and $q=-\mu/\sigma^{2}$. Consequently, $q_{p}=q/\sqrt{2p}=q\sigma=-\mu/\sigma\equiv-\mu^{*}$. That is, $-q_{p}$ is the standardized mean $\mu^{*}$ under normality, which shows that $D_{\nu}=D_{\nu}(q_{p})=D_{\nu}(-\mu^{*})$ is scale-free. Define $\bar{\mathrm{E}}(X)=\sqrt{2p}\,\mathrm{E}(X)=\mathrm{E}(X)/\sigma$ and

$$R_{-k-1}=\frac{D_{-k-2}}{D_{-k-1}}=\frac{\sqrt{2p}\,\mathrm{E}(X)}{k+1}=\frac{\mathrm{E}(X)}{(k+1)\sigma}=\frac{\bar{\mathrm{E}}(X)}{k+1}.$$
Then, we have alternative expressions of (3.16) in Theorem 3.2.

Corollary 3.1 Under the notations defined earlier, we have

$$\mathrm{E}(X)=\frac{k+1}{\sqrt{2p}}R_{-k-1},$$

$$\mathrm{var}(X)=\frac{k+1}{2p}\{1-q_{p}R_{-k-1}-(k+1)R_{-k-1}^{2}\}=\sigma^{2}\{k+1+\mu^{*}\bar{\mathrm{E}}(X)-\bar{\mathrm{E}}^{2}(X)\},\tag{3.20}$$

$$\begin{aligned}\mathrm{sk}(X)&=(k+1)^{-1/2}\{-q_{p}-(2k+1-q_{p}^{2})R_{-k-1}+3(k+1)q_{p}R_{-k-1}^{2}+2(k+1)^{2}R_{-k-1}^{3}\}\\ &\qquad\big/\{1-q_{p}R_{-k-1}-(k+1)R_{-k-1}^{2}\}^{3/2}\\ &=\{(k+1)\mu^{*}-(2k+1-\mu^{*2})\bar{\mathrm{E}}(X)-3\mu^{*}\bar{\mathrm{E}}^{2}(X)+2\bar{\mathrm{E}}^{3}(X)\}\big/\{k+1+\mu^{*}\bar{\mathrm{E}}(X)-\bar{\mathrm{E}}^{2}(X)\}^{3/2},\end{aligned}$$

$$\begin{aligned}\mathrm{kt}(X)&=(k+1)^{-1}\big[k+3+q_{p}^{2}+\{(2k-1)q_{p}-q_{p}^{3}\}R_{-k-1}+(2k-2-4q_{p}^{2})(k+1)R_{-k-1}^{2}\\ &\qquad-6q_{p}(k+1)^{2}R_{-k-1}^{3}-3(k+1)^{3}R_{-k-1}^{4}\big]\big/\{1-q_{p}R_{-k-1}-(k+1)R_{-k-1}^{2}\}^{2}-3\\ &=\big[(k+1)(k+3+\mu^{*2})-\{(2k-1)\mu^{*}-\mu^{*3}\}\bar{\mathrm{E}}(X)+(2k-2-4\mu^{*2})\bar{\mathrm{E}}^{2}(X)\\ &\qquad+6\mu^{*}\bar{\mathrm{E}}^{3}(X)-3\bar{\mathrm{E}}^{4}(X)\big]\big/\{k+1+\mu^{*}\bar{\mathrm{E}}(X)-\bar{\mathrm{E}}^{2}(X)\}^{2}-3,\end{aligned}$$
where $\bar{\mathrm{E}}^{m}(X)=\{\bar{\mathrm{E}}(X)\}^{m}$.

Proof First, we provide the following alternative expressions of $\mathrm{E}(X^{m})$ $(m=2,3,4)$ using the recurrence relation (see (3.19)):

$$\begin{aligned}\mathrm{E}(X^{2})&=\frac{(k+1)(k+2)D_{-k-3}}{2p\,D_{-k-1}}=\frac{k+1}{2p\,D_{-k-1}}(D_{-k-1}-q_{p}D_{-k-2})=\frac{k+1}{2p}(1-q_{p}R_{-k-1}),\\[1ex] \mathrm{E}(X^{3})&=\frac{(k+1)(k+2)(k+3)D_{-k-4}}{(2p)^{3/2}D_{-k-1}}=\frac{(k+1)(k+2)}{(2p)^{3/2}D_{-k-1}}(D_{-k-2}-q_{p}D_{-k-3})\\ &=\frac{k+1}{(2p)^{3/2}D_{-k-1}}\{(k+2)D_{-k-2}-q_{p}(D_{-k-1}-q_{p}D_{-k-2})\}\\ &=\frac{k+1}{(2p)^{3/2}}\{-q_{p}+(k+2+q_{p}^{2})R_{-k-1}\},\end{aligned}\tag{3.21}$$

$$\begin{aligned}\mathrm{E}(X^{4})&=\frac{(k+1)_{4}D_{-k-5}}{(2p)^{2}D_{-k-1}}=\frac{(k+1)_{3}}{(2p)^{2}D_{-k-1}}(D_{-k-3}-q_{p}D_{-k-4})\\ &=\frac{k+1}{(2p)^{2}D_{-k-1}}\{(k+3)(D_{-k-1}-q_{p}D_{-k-2})-(k+2)q_{p}(D_{-k-2}-q_{p}D_{-k-3})\}\\ &=\frac{k+1}{(2p)^{2}D_{-k-1}}\{(k+3)D_{-k-1}-(2k+5)q_{p}D_{-k-2}+q_{p}^{2}(D_{-k-1}-q_{p}D_{-k-2})\}\\ &=\frac{k+1}{(2p)^{2}}\big[k+3+q_{p}^{2}-\{(2k+5)q_{p}+q_{p}^{3}\}R_{-k-1}\big].\end{aligned}$$
E(X) in (3.20) is repeated for clarity, and var(X) in (3.20) is easily derived using $\mathrm{E}(X^{2})$ shown above. For sk(X), (3.21) gives

$$\begin{aligned}\mathrm{E}[\{X-\mathrm{E}(X)\}^{3}]&=\mathrm{E}(X^{3})-3\mathrm{E}(X^{2})\mathrm{E}(X)+2\mathrm{E}^{3}(X)\\ &=\frac{k+1}{(2p)^{3/2}}\{-q_{p}+(k+2+q_{p}^{2})R_{-k-1}-3(1-q_{p}R_{-k-1})(k+1)R_{-k-1}+2(k+1)^{2}R_{-k-1}^{3}\}\\ &=\frac{k+1}{(2p)^{3/2}}\{-q_{p}-(2k+1-q_{p}^{2})R_{-k-1}+3(k+1)q_{p}R_{-k-1}^{2}+2(k+1)^{2}R_{-k-1}^{3}\}\\ &=\sigma^{3}\{(k+1)\mu^{*}-(2k+1-\mu^{*2})\bar{\mathrm{E}}(X)-3\mu^{*}\bar{\mathrm{E}}^{2}(X)+2\bar{\mathrm{E}}^{3}(X)\},\end{aligned}$$

which yields sk(X) in (3.20). For kt(X), (3.21) gives

$$\begin{aligned}\mathrm{E}[\{X-\mathrm{E}(X)\}^{4}]&=\mathrm{E}(X^{4})-4\mathrm{E}(X^{3})\mathrm{E}(X)+6\mathrm{E}(X^{2})\mathrm{E}^{2}(X)-3\mathrm{E}^{4}(X)\\ &=\frac{k+1}{(2p)^{2}}\big[k+3+q_{p}^{2}-\{(2k+5)q_{p}+q_{p}^{3}\}R_{-k-1}-4\{-q_{p}+(k+2+q_{p}^{2})R_{-k-1}\}(k+1)R_{-k-1}\\ &\qquad+6(1-q_{p}R_{-k-1})(k+1)^{2}R_{-k-1}^{2}-3(k+1)^{3}R_{-k-1}^{4}\big]\\ &=\frac{k+1}{(2p)^{2}}\big[k+3+q_{p}^{2}+\{(2k-1)q_{p}-q_{p}^{3}\}R_{-k-1}+\{-4(k+1)(k+2+q_{p}^{2})+6(k+1)^{2}\}R_{-k-1}^{2}\\ &\qquad-6q_{p}(k+1)^{2}R_{-k-1}^{3}-3(k+1)^{3}R_{-k-1}^{4}\big]\\ &=\frac{k+1}{(2p)^{2}}\big[k+3+q_{p}^{2}+\{(2k-1)q_{p}-q_{p}^{3}\}R_{-k-1}+(2k-2-4q_{p}^{2})(k+1)R_{-k-1}^{2}\\ &\qquad-6q_{p}(k+1)^{2}R_{-k-1}^{3}-3(k+1)^{3}R_{-k-1}^{4}\big]\\ &=\sigma^{4}\big[(k+1)(k+3+\mu^{*2})-\{(2k-1)\mu^{*}-\mu^{*3}\}\bar{\mathrm{E}}(X)+(2k-2-4\mu^{*2})\bar{\mathrm{E}}^{2}(X)\\ &\qquad+6\mu^{*}\bar{\mathrm{E}}^{3}(X)-3\bar{\mathrm{E}}^{4}(X)\big],\end{aligned}$$

yielding kt(X) in (3.20). Q.E.D.

It is of interest to see that the moments in Corollary 3.1 are functions of $R_{-k-1}$ or $\mathrm{E}(X)$ $(\bar{\mathrm{E}}(X))$. That is, only the two values $D_{-k-1}(q_{p})$ and $D_{-k-2}(q_{p})$ among the $D_{\nu}(q_{p})$'s are used, together with k, p, and q, owing to the recurrence relation. However, it is to be noted that the recurrence formula of (3.19), when z in (3.19) is $q_{p}$, becomes
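The mean and variance of Corollary 3.1 can be checked against direct numerical integration of the bpc density, with $D_{\nu}$ evaluated from the integral representation cited at (3.4) (a sketch added here; parameter values arbitrary):

```python
import math

def trapz(f, a, b, n=200_000):
    # simple trapezoid quadrature
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        s += f(a + i * h)
    return s * h

def D(nu, z):
    # D_nu(z) for nu < 0, via the integral representation
    # D_nu(z) = e^{-z^2/4} / Gamma(-nu) * int_0^inf t^{-nu-1} e^{-t^2/2 - z t} dt
    f = lambda t: t ** (-nu - 1.0) * math.exp(-t * t / 2.0 - z * t)
    return math.exp(-z * z / 4.0) / math.gamma(-nu) * trapz(f, 1e-12, 40.0)

k, p, q = 1.5, 0.8, 0.7
qp = q / math.sqrt(2.0 * p)

# direct moments of the bpc density
w = lambda t: t ** k * math.exp(-p * t * t - q * t)
norm = trapz(w, 1e-12, 30.0)
m1 = trapz(lambda t: t * w(t), 1e-12, 30.0) / norm
m2 = trapz(lambda t: t * t * w(t), 1e-12, 30.0) / norm

# Theorem 3.2 / Corollary 3.1 forms
R = D(-k - 2.0, qp) / D(-k - 1.0, qp)            # R_{-k-1}
mean_c = (k + 1.0) * R / math.sqrt(2.0 * p)
var_c = (k + 1.0) / (2.0 * p) * (1.0 - qp * R - (k + 1.0) * R * R)
print(m1, mean_c, m2 - m1 ** 2, var_c)
```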
3.3 Moments of the BPC Distribution
$$kD_{-k-1}(q_p) = D_{-k+1}(q_p) - q_p D_{-k}(q_p), \qquad (3.22)$$
which tends to be subject to subtractive cancellation error when $q$ (and consequently $q_p$) is positive (or equivalently $\mu$ is negative), since $D_{\nu}(\cdot) > 0$, especially when $k$ is large. Similar formulas for the moments of the truncated normally distributed variable have been developed by Pearson and Lee [20], Fisher [6], Kan and Robotti [8, Theorem 1], Galarza et al. [7] and Kirkby et al. [9, 10]. However, it is known that such recurrence formulas have similar difficulties as for (3.20) when $q_p > 0$, which was pointed out by Pollack and Shauly-Aharonov [24] and Ogasawara [17, 18]. However, when $k$ is not large, the formula gives reasonable results even when $q_p > 0$.

Remark 3.2 Some relationships between $D_{-k-1}(z)$ and Fisher's $I_n$ function and their recurrence formulas. Fisher's [6, Eqs. (11) and (12)] $I_n$ function is defined by

$$I_n = I_n(x) = \int_x^{\infty} \frac{(t-x)^n}{n!\sqrt{2\pi}}\exp(-t^2/2)\,dt = \int_0^{\infty} \frac{t^n}{n!\sqrt{2\pi}}\exp\{-(t+x)^2/2\}\,dt. \qquad (3.23)$$
When $n$ and $x$ are replaced by $k$ and $z$, respectively, the recurrence formula for $I_k = I_k(z)$ is

$$(k+2)I_{k+2}(z) + zI_{k+1}(z) - I_k(z) = 0 \qquad (k = 0, 1, \ldots) \qquad (3.24)$$
[6, Eq. (13)]. From the second definition of $I_n$ in (3.23) and the integral representation of $D_{-k-1}(z)$ (see (3.11) when $x = \infty$), it is easily found that

$$D_{-k-1}(z) = \exp(z^2/4)\sqrt{2\pi}\,I_k(z), \qquad (3.25)$$
which is known [17, Theorem 6]. Equation (3.19) is given by (3.24) with (3.25) when $k$ is replaced by $k - 2$. Note that this is another derivation of the recurrence relation for $D_{\nu}(z)$ using Fisher's $I_n$. Conversely, (3.24) is given from (3.19). Note also that the definition of $I_n$ looks to be given under $n = 0, 1, \ldots$ as stated in (3.24). Actually, as Fisher [6, p. xxviii] noted, $I_n$ holds when $n > -1$ is real-valued, with $n!$ replaced by $\Gamma(n+1)$.
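The recurrence (3.22) and the relation (3.25) can be checked numerically. The book's numerical work is in R; the following Python sketch (function names are mine, and the parameter values are only illustrative) uses SciPy's `pbdv` for $D_{\nu}(z)$ and quadrature for Fisher's $I_k$:

```python
import numpy as np
from scipy.special import pbdv, gamma
from scipy.integrate import quad

def D(v, z):
    """Whittaker parabolic cylinder function D_v(z)."""
    return pbdv(v, z)[0]

def fisher_I(k, z):
    """Fisher's I_k(z) = int_0^inf t^k/(k! sqrt(2 pi)) exp{-(t+z)^2/2} dt."""
    val = quad(lambda t: t ** k * np.exp(-(t + z) ** 2 / 2), 0, np.inf)[0]
    return val / (gamma(k + 1) * np.sqrt(2 * np.pi))

k, z = 0.5, 0.7
# (3.22): k D_{-k-1}(z) = D_{-k+1}(z) - z D_{-k}(z)
lhs, rhs = k * D(-k - 1, z), D(-k + 1, z) - z * D(-k, z)
# (3.25): D_{-k-1}(z) = exp(z^2/4) sqrt(2 pi) I_k(z)
d_direct = D(-k - 1, z)
d_via_I = np.exp(z ** 2 / 4) * np.sqrt(2 * np.pi) * fisher_I(k, z)
print(lhs, rhs, d_direct, d_via_I)
```

Both identities hold for fractional $k$ as well, which is the case of interest for the bpc distribution.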
Fisher [6, Eq. (8)] gave the following differential equation:
$$\Big(\frac{d^2}{dz^2} + z\frac{d}{dz} - k\Big)I_k(z) = 0.$$
Then, using (3.25), we obtain
$$\Big(\frac{d^2}{dz^2} + z\frac{d}{dz} - k\Big)\{\exp(-z^2/4)\,D_{-k-1}(z)\} = 0, \qquad (3.26)$$

which gives
$$\Big(\frac{d^2}{dz^2} + z\frac{d}{dz} - k\Big)\{e^{-z^2/4}D_{-k-1}(z)\} = \Big[\frac{d^2D_{-k-1}(z)}{dz^2} + (-z+z)\frac{dD_{-k-1}(z)}{dz} + \Big(\frac{z^2}{4} - \frac{1}{2} - \frac{z^2}{2} - k\Big)D_{-k-1}(z)\Big]e^{-z^2/4} = 0,$$

yielding

$$\frac{d^2D_{-k-1}(z)}{dz^2} + \Big(-k - \frac{1}{2} - \frac{z^2}{4}\Big)D_{-k-1}(z) = 0. \qquad (3.27)$$
It is known that the equation

$$\frac{d^2D_{\nu}(z)}{dz^2} + \Big(\nu + \frac{1}{2} - \frac{z^2}{4}\Big)D_{\nu}(z) = 0 \qquad (3.28)$$
is the differential equation from which the parabolic cylinder functions are derived [2, Eq. 12.2.4; 4, Sect. 8.2, Eq. (1); 13, Sect. 8.1.1, Eq. (2)]. Since (3.27) is equal to (3.28) when $k = -\nu - 1$, (3.27) is seen as another derivation of the differential equation via Fisher's differential equation for $I_k(z)$, which seems to be new. Magnus and Oberhettinger [12, p. 123] (see also [13, Sect. 8.1.1, Eq. (3)]) defined

$$u(z) = e^{-z^2/4}D_{\nu}(z) \qquad (3.29)$$
and showed its differential equation

$$\frac{d^2u(z)}{dz^2} + z\frac{du(z)}{dz} + (\nu + 1)u(z) = 0, \qquad (3.30)$$
which is equal to (3.26) when $\nu = -k - 1$. That is, (3.29) and (3.30) can be seen as rediscoveries of a scaled version of Fisher's $I_n$ and its associated differential equation.

The characteristic function is given as follows.

Theorem 3.3 Define $\phi(t|k, p, q) = \phi(t)$ as the characteristic function of the bpc distribution and $i = \sqrt{-1}$. Then, we obtain
$$\phi(t) = E(e^{itX}) = \frac{\int_0^{\infty} x^k \exp\{-px^2 - (q - it)x\}\,dx}{\int_0^{\infty} x^k \exp(-px^2 - qx)\,dx}$$
$$= \exp\Big\{\frac{(q-it)^2 - q^2}{8p}\Big\}\frac{D_{-k-1}\{(q-it)/\sqrt{2p}\}}{D_{-k-1}(q/\sqrt{2p})} \quad (\text{1st expression})$$
$$= \frac{\sum_{j=0}^{\infty} \Gamma\{(k+1+j)/2\}\{-(q-it)/\sqrt{p}\}^j/j!}{\sum_{j=0}^{\infty} \Gamma\{(k+1+j)/2\}(-q/\sqrt{p})^j/j!} \quad (\text{2nd expression}) \qquad (3.31)$$
$$= \Big[\Gamma\Big(\frac{k+1}{2}\Big){}_1F_1\Big\{\frac{k+1}{2};\frac{1}{2};\frac{(q-it)^2}{4p}\Big\} - \frac{q-it}{\sqrt{p}}\,\Gamma\Big(\frac{k+2}{2}\Big){}_1F_1\Big\{\frac{k+2}{2};\frac{3}{2};\frac{(q-it)^2}{4p}\Big\}\Big]$$
$$\Big/\Big[\Gamma\Big(\frac{k+1}{2}\Big){}_1F_1\Big\{\frac{k+1}{2};\frac{1}{2};\frac{q^2}{4p}\Big\} - \frac{q}{\sqrt{p}}\,\Gamma\Big(\frac{k+2}{2}\Big){}_1F_1\Big\{\frac{k+2}{2};\frac{3}{2};\frac{q^2}{4p}\Big\}\Big] \quad (\text{3rd expression}).$$
For confirmation, using the "2nd expression" of (3.31), we have

$$\frac{d^m\phi(t)}{dt^m}\Big|_{t=0} = \frac{\sum_{j=m}^{\infty}\Gamma\{(k+1+j)/2\}(-q/\sqrt{p})^{j-m}\,i^m p^{-m/2}/(j-m)!}{\sum_{j=0}^{\infty}\Gamma\{(k+1+j)/2\}(-q/\sqrt{p})^j/j!} = i^m p^{-m/2}\,\frac{\sum_{j=0}^{\infty}\Gamma\{(m+k+1+j)/2\}(-q/\sqrt{p})^j/j!}{\sum_{j=0}^{\infty}\Gamma\{(k+1+j)/2\}(-q/\sqrt{p})^j/j!}.$$

It is found that $i^{-m}\,d^m\phi(t)/dt^m|_{t=0}$ is equal to the "2nd expression" of (3.15). It is obvious that the moment generating function $M(t)$ is given from (3.31) when $it$ is replaced by $t$.
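The "2nd expression" of (3.31) is convenient for numerical work because the series is entire in $t$. The following Python sketch (not the book's R code; names and parameter values are mine) compares it with direct quadrature of $E(e^{itX})$:

```python
import numpy as np
from scipy.special import gammaln
from scipy.integrate import quad

def phi_series(t, k, p, q, nterms=80):
    """Characteristic function of the bpc distribution via the
    '2nd expression' of (3.31): a ratio of entire power series."""
    def series(w):  # sum_j Gamma((k+1+j)/2) w^j / j!
        return sum(np.exp(gammaln((k + 1 + j) / 2) - gammaln(j + 1)) * w ** j
                   for j in range(nterms))
    return series(-(q - 1j * t) / np.sqrt(p)) / series(-(q + 0j) / np.sqrt(p))

def phi_quad(t, k, p, q):
    """Direct quadrature of E(exp(itX)) under the bpc density."""
    g = lambda x: x ** k * np.exp(-p * x ** 2 - q * x)
    den = quad(g, 0, np.inf)[0]
    re = quad(lambda x: g(x) * np.cos(t * x), 0, np.inf)[0]
    im = quad(lambda x: g(x) * np.sin(t * x), 0, np.inf)[0]
    return (re + 1j * im) / den

k, p, q, t = 0.5, 1.0, 0.5, 1.3
print(phi_series(t, k, p, q), phi_quad(t, k, p, q))
```

The term ratio of the series decays super-exponentially, so a fixed truncation (here 80 terms) suffices for moderate arguments.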
3.4
The Mode and the Shapes of the PDFs of the BPC Distribution
A positive (local) mode of the bpc distribution, if it exists, is given by differentiating the numerator of the pdf (see (3.4)) and setting the result to zero:

$$-2px^2 - qx + k = 0, \quad \text{i.e.,} \quad x = \{-q \pm (q^2 + 8pk)^{1/2}\}/(4p). \qquad (3.32)$$
Let $f'_{D,k}(x) = df_{D,k}(x)/dx$. Then, we have the following.
Result 3.1 The mode, including local ones, and the decreasing/increasing tendencies of the density function $f_{D,k}(x)$ of the bpc distribution are given by cases:

(i) $-1 < k < 0$
(i.a) $-1 < k < -q^2/(8p)$: a real solution of (3.32) does not exist. $f_{D,k}(x)$ is a strictly decreasing function (sdf) with $f'_{D,k}(x) < 0$.
(i.b) $-1 < k = -q^2/(8p)$ and $q < 0$: $f_{D,k}(x)$ is an sdf with $f'_{D,k}(x) \le 0$. The value $x = -q/(4p)$ gives an inflection point with $f'_{D,k}(x) = 0$.
(i.c) $-1 < k = -q^2/(8p)$ and $q > 0$: $f_{D,k}(x)$ is an sdf with $f'_{D,k}(x) < 0$.
(i.d) $\max\{-1, -q^2/(8p)\} < k < 0$ and $q < 0$: $f'_{D,k}(x) = 0$ has two positive solutions. That is, $f_{D,k}(x)$ decreases from $x = 0$ to a local minimum, then increases up to a local maximum or mode, and after this point decreases. This case gives a bimodal distribution.
(i.e) $\max\{-1, -q^2/(8p)\} < k < 0$ and $q > 0$: $f'_{D,k}(x) = 0$ has two negative solutions. $f_{D,k}(x)$ is an sdf with $f'_{D,k}(x) < 0$.
(ii) $k = 0$: the pdf becomes

$$f_{D,k}(x) = f_{D,0}(x) = \exp(-px^2 - qx)\Big/\int_0^{\infty}\exp(-pt^2 - qt)\,dt,$$

which is the pdf of $N(-q/(2p), 1/(2p))$ under single truncation of $X < 0$.
(ii.a) $k = 0$, $q < 0$: $x = -q/(2p)$ gives a global mode.
(ii.b) $k = 0$, $q = 0$: $f_{D,0}(x)$ is an sdf with $f'_{D,0}(x) \le 0$ and $f'_{D,0}(0) = 0$.
(ii.c) $k = 0$, $q > 0$: $f_{D,0}(x)$ is an sdf with $f'_{D,0}(x) < 0$.
(iii) $k > 0$: $f'_{D,k}(x) = 0$ has one negative solution and one positive one. The latter solution, $x = \{-q + (q^2 + 8pk)^{1/2}\}/(4p)$, gives a global mode.

The above results will be numerically illustrated later.
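Case (iii) of Result 3.1 is easy to verify numerically; the sketch below (Python, with illustrative values, not from the book) compares the closed-form root of (3.32) with direct maximization of the log numerator:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def bpc_mode(k, p, q):
    """Positive root of -2 p x^2 - q x + k = 0 from (3.32), case (iii) k > 0:
    x = {-q + sqrt(q^2 + 8 p k)}/(4 p)."""
    return (-q + np.sqrt(q ** 2 + 8 * p * k)) / (4 * p)

k, p, q = 0.5, 1.0, -1.0               # illustrative values with k > 0
x_formula = bpc_mode(k, p, q)
# direct maximization of log{x^k exp(-p x^2 - q x)}
neg_log = lambda x: -(k * np.log(x) - p * x ** 2 - q * x)
x_numeric = minimize_scalar(neg_log, bounds=(1e-8, 10.0), method="bounded").x
print(x_formula, x_numeric)
```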
3.5 The Multivariate BPC Distribution
For the multivariate bpc distribution, we use the following notations.

$X = (X_1, \ldots, X_n)^T$: the $n$-dimensional random vector, whose realized value is $x = (x_1, \ldots, x_n)^T$ $(x_i > 0,\ i = 1, \ldots, n)$;
$k = (k_1, \ldots, k_n)^T$ $(k_i > -1,\ i = 1, \ldots, n)$: the vector of powers of $X_i$ and $x_i$ $(i = 1, \ldots, n)$ used, e.g., in $X^k = X_1^{k_1}\cdots X_n^{k_n}$ and $x^k = x_1^{k_1}\cdots x_n^{k_n}$;
$C = \{c_{ij}\}$ $(i, j = 1, \ldots, n)$: a fixed positive definite symmetric matrix;
$d = (d_1, \ldots, d_n)^T$: a fixed real vector.

Definition 3.3 Under the notations and assumptions given above, the multivariate bpc distribution is defined by its density function:

$$f_{D,k}(X = x|d, C) = f_{D,k}(x) = \frac{x^k \exp\{-(x-d)^T C^{-1}(x-d)/2\}}{\int_0^{\infty} t^k \exp\{-(t-d)^T C^{-1}(t-d)/2\}\,dt_1\cdots dt_n}, \qquad (3.33)$$

where the denominator or normalizer is

$$\exp\Big(-\frac{d^TC^{-1}d}{2}\Big)\Big\{\prod_{i=1}^n (c^{ii})^{-(k_i+1)/2}\Big\}\sum_{u=0}^{\infty}\Big[\prod_{i=1}^n 2^{-u_{i\cdot}/2}\exp\Big\{\frac{\{(C^{-1}d)_{i\mathrm{th}}\}^2}{4c^{ii}}\Big\}\Gamma(k_i+1+u_{i\cdot})\,D_{-k_i-1-u_{i\cdot}}\Big\{\frac{-(C^{-1}d)_{i\mathrm{th}}}{\sqrt{c^{ii}}}\Big\}\Big]\prod_{g<h}\frac{1}{u_{gh}!}\Big\{\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\Big\}^{u_{gh}} \quad (\text{1st expression}) \qquad (3.34)$$

Here $C^{-1} = \{c^{ij}\}$ $(i, j = 1, \ldots, n)$; $\sum_{u=0}^{\infty}(\cdot) = \sum_{u_{12}=0}^{\infty}\cdots\sum_{u_{n-1,n}=0}^{\infty}(\cdot)$; $u_{i\cdot} = \sum_{g<h} u_{gh}(\delta_{gi}+\delta_{hi}) = \sum_{g=1}^{n-1}\sum_{h=g+1}^{n} u_{gh}(\delta_{gi}+\delta_{hi})$; $\delta_{gi}$ is the Kronecker delta; $(\cdot)_{i\mathrm{th}}$ is the $i$th element of a vector; and $\prod_{g<h}$ is defined similarly to $\sum_{g<h}$.

The Proof of the Normalizer As in the univariate case, we use the variable transformations $y_i = x_i^2 c^{ii}/2$, $dx_i/dy_i = 1/\sqrt{2c^{ii}y_i}$ $(i = 1, \ldots, n)$. Then, we obtain
$$\int_0^{\infty} x^k \exp\{-(x-d)^TC^{-1}(x-d)/2\}\,dx$$
$$= \int_0^{\infty} x^k \exp\Big(-\sum_{i=1}^n \frac{x_i^2 c^{ii}}{2} - \sum_{i<j} x_i x_j c^{ij} + \sum_{i=1}^n x_i (C^{-1}d)_{i\mathrm{th}} - \frac{d^TC^{-1}d}{2}\Big)\,dx$$
$$= \exp\Big(-\frac{d^TC^{-1}d}{2}\Big)\int_0^{\infty}\Big\{\prod_{i=1}^n \frac{(2y_i/c^{ii})^{k_i/2}}{\sqrt{2c^{ii}y_i}}\Big\}\exp\Big(-\sum_{i=1}^n y_i\Big)\exp\Big\{-\sum_{i<j}\frac{2c^{ij}}{\sqrt{c^{ii}c^{jj}}}\sqrt{y_iy_j} + \sum_{i=1}^n (C^{-1}d)_{i\mathrm{th}}\sqrt{\frac{2y_i}{c^{ii}}}\Big\}\,dy$$
$$= \exp\Big(-\frac{d^TC^{-1}d}{2}\Big)\int_0^{\infty}\Big\{\prod_{i=1}^n \frac{2^{(k_i-1)/2} y_i^{(k_i-1)/2}}{(c^{ii})^{(k_i+1)/2}}\Big\}e^{-\sum_i y_i}\prod_{i<j}\sum_{u=0}^{\infty}\frac{1}{u!}\Big\{\frac{-2c^{ij}}{\sqrt{c^{ii}c^{jj}}}\sqrt{y_iy_j}\Big\}^u\prod_{i=1}^n\sum_{v=0}^{\infty}\frac{1}{v!}\Big\{\frac{(C^{-1}d)_{i\mathrm{th}}\sqrt{2y_i}}{\sqrt{c^{ii}}}\Big\}^v\,dy$$
$$= \exp\Big(-\frac{d^TC^{-1}d}{2}\Big)2^{-n/2}\prod_{i=1}^n\frac{2^{k_i/2}}{(c^{ii})^{(k_i+1)/2}}\sum_{u_{12}=0}^{\infty}\cdots\sum_{u_{n-1,n}=0}^{\infty}\sum_{v_1=0}^{\infty}\cdots\sum_{v_n=0}^{\infty}\prod_{i=1}^n\Gamma\Big(\frac{k_i+1+u_{i\cdot}+v_i}{2}\Big)\frac{1}{v_i!}\Big\{\frac{\sqrt{2}(C^{-1}d)_{i\mathrm{th}}}{\sqrt{c^{ii}}}\Big\}^{v_i}\prod_{g<h}\frac{1}{u_{gh}!}\Big\{\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\Big\}^{u_{gh}} \quad (\text{2nd expression})$$
$$= \cdots\prod_{i=1}^n\Big[\Gamma\Big(\frac{k_i+1+u_{i\cdot}}{2}\Big)\,{}_1F_1\Big\{\frac{k_i+1+u_{i\cdot}}{2};\frac{1}{2};\frac{\{(C^{-1}d)_{i\mathrm{th}}\}^2}{2c^{ii}}\Big\} + \frac{\sqrt{2}(C^{-1}d)_{i\mathrm{th}}}{\sqrt{c^{ii}}}\,\Gamma\Big(\frac{k_i+2+u_{i\cdot}}{2}\Big)\,{}_1F_1\Big\{\frac{k_i+2+u_{i\cdot}}{2};\frac{3}{2};\frac{\{(C^{-1}d)_{i\mathrm{th}}\}^2}{2c^{ii}}\Big\}\Big]\cdots \quad (\text{3rd expression})$$
$$= \exp\Big(-\frac{d^TC^{-1}d}{2}\Big)2^{-n/2}\prod_{i=1}^n\frac{2^{k_i/2}}{(c^{ii})^{(k_i+1)/2}}\sum_{u=0}^{\infty}\prod_{i=1}^n\Big[2^{(1-k_i-u_{i\cdot})/2}\exp\Big\{\frac{\{(C^{-1}d)_{i\mathrm{th}}\}^2}{4c^{ii}}\Big\}\Gamma(k_i+1+u_{i\cdot})\,D_{-k_i-1-u_{i\cdot}}\Big\{\frac{-(C^{-1}d)_{i\mathrm{th}}}{\sqrt{c^{ii}}}\Big\}\Big]\prod_{g<h}\frac{1}{u_{gh}!}\Big\{\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\Big\}^{u_{gh}} \quad (\text{1st expression}),$$

which gives the "1st expression" of (3.34) on noting $2^{(k_i-1)/2}\cdot 2^{(1-k_i-u_{i\cdot})/2} = 2^{-u_{i\cdot}/2}$. Q.E.D.

Note that the normalizer in (3.33) is seen as a scaled extension of Fisher's $I_k$ ($I_n$ when $n$ is $k$) or the parabolic cylinder function $D_{-k-1}(\cdot)$ to the $n$-variate case with $k = (k_1, \ldots, k_n)^T$ instead of $k$. Properties of the multivariate bpc distribution are shown as follows.

Theorem 3.4 The product moment of the multivariate bpc distribution is given by the ratio of the normalizers:

$$E(X_1^{m_1}\cdots X_n^{m_n}|d, C) = E(X^m)\quad (m_i \ge 0,\ i = 1, \ldots, n) = \frac{\int_0^{\infty} x^{m+k}\exp\{-(x-d)^TC^{-1}(x-d)/2\}\,dx}{\int_0^{\infty} x^k \exp\{-(x-d)^TC^{-1}(x-d)/2\}\,dx},$$

where $x^{m+k} = x^m x^k$; the denominator is given by (3.34); and the numerator is also given by (3.34) with $k$ replaced by $m + k$.
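For $n = 2$ the normalizer series can be checked against direct numerical integration. The sketch below is my own Python reduction of the multiple series to a single sum over $u = u_{12}$ (a reduction I derived for the bivariate case; the book's computations use R, and all parameter values here are illustrative):

```python
import numpy as np
from scipy.special import pbdv, gammaln
from scipy.integrate import dblquad

def bpc2_normalizer(k, d, Cinv, nterms=30):
    """Normalizer of the bivariate bpc density: single-series reduction of
    (3.34) for n = 2, using int_0^inf x^{a-1} exp(-b x^2/2 + h x) dx
    = b^{-a/2} Gamma(a) exp{h^2/(4b)} D_{-a}(-h/sqrt(b))."""
    h = Cinv @ d                        # C^{-1} d
    p11, p22, p12 = Cinv[0, 0], Cinv[1, 1], Cinv[0, 1]
    const = (np.exp(-0.5 * d @ Cinv @ d)
             * p11 ** (-(k[0] + 1) / 2) * p22 ** (-(k[1] + 1) / 2))
    total = 0.0
    for u in range(nterms):
        term = np.exp(-gammaln(u + 1)) * (-p12 / np.sqrt(p11 * p22)) ** u
        for i, pii in enumerate((p11, p22)):
            a = k[i] + u + 1
            term *= (np.exp(gammaln(a) + h[i] ** 2 / (4 * pii))
                     * pbdv(-a, -h[i] / np.sqrt(pii))[0])
        total += term
    return const * total

k = np.array([0.5, 0.5]); d = np.array([0.5, 0.5])
Cinv = np.array([[1.0, 0.3], [0.3, 1.0]])
series = bpc2_normalizer(k, d, Cinv)

def integrand(y, x):
    v = np.array([x, y]) - d
    return x ** k[0] * y ** k[1] * np.exp(-0.5 * v @ Cinv @ v)

direct = dblquad(integrand, 0, np.inf, 0, np.inf)[0]
print(series, direct)
```

The series converges geometrically at rate $|c^{12}|/\sqrt{c^{11}c^{22}} < 1$, which is guaranteed by the positive definiteness of $C^{-1}$.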
The characteristic function with $t = (t_1, \ldots, t_n)^T$ is

$$E(e^{it^TX}) = \exp\Big(it^Td - \frac{t^TCt}{2}\Big)\int_0^{\infty} x^k \exp\{-(x - d - iCt)^TC^{-1}(x - d - iCt)/2\}\,dx\Big/\int_0^{\infty} x^k \exp\{-(x-d)^TC^{-1}(x-d)/2\}\,dx,$$
(3.35)

where $\int_0^{\infty} x^k \exp\{\cdot\}\,dx$ in the numerator is given by (3.34) when $d$ is replaced with $d + iCt$.

Let $a = \int_0^{\infty} x^k \exp\{-(x-d)^TC^{-1}(x-d)/2\}\,dx$. Then, the cdf, i.e., $\Pr(X_1 \le x_1, \ldots, X_n \le x_n) \equiv \Pr(X \le x)$, is

$$\Pr(X \le x) = a^{-1}\exp\Big(-\frac{d^TC^{-1}d}{2}\Big)\Big\{\prod_{i=1}^n (c^{ii})^{-(k_i+1)/2}\Big\}\sum_{u=0}^{\infty}\Big[\prod_{i=1}^n 2^{-u_{i\cdot}/2}\exp\Big\{\frac{\{(C^{-1}d)_{i\mathrm{th}}\}^2}{4c^{ii}}\Big\}\Gamma(k_i+1+u_{i\cdot})\,D_{-k_i-1-u_{i\cdot},W}\Big\{\frac{-(C^{-1}d)_{i\mathrm{th}}}{\sqrt{c^{ii}}},\,x_i\sqrt{c^{ii}}\Big\}\Big]\prod_{g<h}\frac{1}{u_{gh}!}\Big\{\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\Big\}^{u_{gh}} \quad (\text{1st expression})$$
$$= a^{-1}\exp\Big(-\frac{d^TC^{-1}d}{2}\Big)2^{-n/2}\prod_{i=1}^n\frac{2^{k_i/2}}{(c^{ii})^{(k_i+1)/2}}\sum_{u=0}^{\infty}\sum_{v=0}^{\infty}\prod_{i=1}^n\gamma\Big(\frac{x_i^2c^{ii}}{2}\,\Big|\,\frac{k_i+1+u_{i\cdot}+v_i}{2}\Big)\frac{1}{v_i!}\Big\{\frac{\sqrt{2}(C^{-1}d)_{i\mathrm{th}}}{\sqrt{c^{ii}}}\Big\}^{v_i}\prod_{g<h}\frac{1}{u_{gh}!}\Big\{\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\Big\}^{u_{gh}} \quad (\text{2nd expression})$$
$$= a^{-1}\exp\Big(-\frac{d^TC^{-1}d}{2}\Big)2^{-n/2}\prod_{i=1}^n\frac{2^{k_i/2}}{(c^{ii})^{(k_i+1)/2}}\sum_{u=0}^{\infty}\prod_{i=1}^n\Big[\Gamma\Big(\frac{k_i+1+u_{i\cdot}}{2}\Big){}_1F_{1W}\Big\{\frac{k_i+1+u_{i\cdot}}{2};\frac{1}{2};\frac{\{(C^{-1}d)_{i\mathrm{th}}\}^2}{2c^{ii}};\frac{x_i^2c^{ii}}{2}\Big\} + \frac{\sqrt{2}(C^{-1}d)_{i\mathrm{th}}}{\sqrt{c^{ii}}}\Gamma\Big(\frac{k_i+2+u_{i\cdot}}{2}\Big){}_1F_{1W}\Big\{\frac{k_i+2+u_{i\cdot}}{2};\frac{3}{2};\frac{\{(C^{-1}d)_{i\mathrm{th}}\}^2}{2c^{ii}};\frac{x_i^2c^{ii}}{2}\Big\}\Big]\prod_{g<h}\frac{1}{u_{gh}!}\Big\{\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\Big\}^{u_{gh}} \quad (\text{3rd expression}). \qquad (3.36)$$
In Theorem 3.4, the three expressions of the cdf and their proofs parallel those of Definition 3.3 and the proof of the normalizer. That is, $D_{-k_i-1-u_{i\cdot}}\{-(C^{-1}d)_{i\mathrm{th}}/\sqrt{c^{ii}}\}$, ${}_1F_1\{\cdot;\cdot;\{(C^{-1}d)_{i\mathrm{th}}\}^2/(2c^{ii})\}$ and $\Gamma\{(k_i+1+u_{i\cdot}+v_i)/2\}$ used earlier are replaced by $D_{-k_i-1-u_{i\cdot},W}\{-(C^{-1}d)_{i\mathrm{th}}/\sqrt{c^{ii}},\,x_i\sqrt{c^{ii}}\}$, ${}_1F_{1W}\{\cdot;\cdot;\{(C^{-1}d)_{i\mathrm{th}}\}^2/(2c^{ii});\,x_i^2c^{ii}/2\}$ and $\gamma\{x_i^2c^{ii}/2\,|\,(k_i+1+u_{i\cdot}+v_i)/2\}$, respectively. Note that the cdf given above is seen as a scaled extension of the weighted/incomplete parabolic cylinder function $D_{-k-1,W}(\cdot)$ to the $n$-variate case with $k = (k_1, \ldots, k_n)^T$ instead of $k$.

Corollary 3.2 The mode with $x_i > 0$ $(i = 1, \ldots, n)$, when it exists, is given from the following equation:

$$-c^{ii}x_i^2 - \Big(\sum_{j\ne i} c^{ij}x_j\Big)x_i + (C^{-1}d)_{i\mathrm{th}}\,x_i + k_i = 0 \quad (i = 1, \ldots, n). \qquad (3.37)$$
Proof Differentiating the numerator of the density function of (3.33) and setting the result to 0, we obtain

$$\frac{\partial}{\partial x_i}\big[x^k\exp\{-(x-d)^TC^{-1}(x-d)/2\}\big] = \Big[k_i x_i^{-1}x^k - \Big\{c^{ii}x_i + \sum_{j\ne i}c^{ij}x_j - (C^{-1}d)_{i\mathrm{th}}\Big\}x^k\Big]\exp\{-(x-d)^TC^{-1}(x-d)/2\}$$
$$= \Big[-c^{ii}x_i^2 - \Big(\sum_{j\ne i}c^{ij}x_j\Big)x_i + (C^{-1}d)_{i\mathrm{th}}x_i + k_i\Big]x_i^{-1}x^k\exp\{-(x-d)^TC^{-1}(x-d)/2\} = 0 \quad (i = 1, \ldots, n),$$

which gives (3.37). Q.E.D.

For the solution of (3.37), some iterative computation, e.g., the Newton–Raphson method, can be used.

Let $X = (X_{(1)}^T, X_{(2)}^T)^T$ with $X_{(1)} = (X_1, \ldots, X_{n_1})^T$ and $X_{(2)} = (X_{n_1+1}, \ldots, X_n)^T$ $(n_1 = 1, \ldots, n-1)$. Then, the marginal distribution of $X_{(1)}$ is easily given by (3.33) when (3.33) is integrated with respect to $X_{(2)}$ over its support. Suppose $X^*_{(1)}$ is multivariate bpc-distributed with $k_{(1)}$, $d_{(1)}$ and $C_{(1)}$, where $k_{(1)} = (k_1, \ldots, k_{n_1})^T$, $d_{(1)} = (d_1, \ldots, d_{n_1})^T$ and $C_{(1)} = C_{11}$ is the $n_1 \times n_1$ submatrix of $C$ corresponding to $X_{(1)}$. Then, it can be shown that the moments of $X_{(1)}$ are generally different from those of
$X^*_{(1)}$ unless $C_{12} = O$, where

$$C = \begin{pmatrix} C_{11} & C_{12}\\ C_{21} & C_{22}\end{pmatrix},$$

which will be numerically shown later. Further, the marginal distribution of $X_{(1)}$ is shown to be not bpc-distributed unless $C_{12} = O$. Let

$$C^{-1} = \begin{pmatrix} C^{11} & C^{12}\\ C^{21} & C^{22}\end{pmatrix},$$

where $C^{22} = (C_{22} - C_{21}C_{11}^{-1}C_{12})^{-1} = C_{22}^{-1} + C_{22}^{-1}C_{21}(C_{11} - C_{12}C_{22}^{-1}C_{21})^{-1}C_{12}C_{22}^{-1}$. Define $x_{(1)}$, $x_{(2)}$, $k_{(2)}$ and $d_{(2)}$ similarly as above. Denote the normalizer of the pdf of (3.33) by $a$. Then, the marginal density of $X_{(1)}$ at $x_{(1)}$, using the notation $f_{D,k}(X_{(1)} = x_{(1)}|d, C)$ rather than $f_{D,k_{(1)}}(X_{(1)} = x_{(1)}|d_{(1)}, C_{(1)})$, is given by

$$f_{D,k}(X_{(1)} = x_{(1)}|d, C) = a^{-1}\int_0^{\infty} x^k \exp\{-(x-d)^TC^{-1}(x-d)/2\}\,dx_{(2)}$$
$$= a^{-1}\,x_{(1)}^{k_{(1)}}\exp\{-(x_{(1)}-d_{(1)})^TC_{11}^{-1}(x_{(1)}-d_{(1)})/2\}\int_0^{\infty} x_{(2)}^{k_{(2)}}\exp\big[-\{x_{(2)} - d_{(2)} - C_{21}C_{11}^{-1}(x_{(1)}-d_{(1)})\}^T C^{22}\{x_{(2)} - d_{(2)} - C_{21}C_{11}^{-1}(x_{(1)}-d_{(1)})\}/2\big]\,dx_{(2)}.$$

Since the fixed $x_{(1)}$ appears in the integral of the last result as well as outside the integral, the marginal distribution is not a bpc one. Let $f_{D,k}(X_{(2)} = x_{(2)}|x_{(1)}, d, C)$ be the pdf of the conditional distribution of $X_{(2)}$ when $X_{(1)} = x_{(1)}$ is given. Then, we obtain the following result.

Theorem 3.5 Using the notations defined earlier, we have

$$f_{D,k}(X_{(2)} = x_{(2)}|x_{(1)}, d, C) = f_{D,k_{(2)}}\{X_{(2)} = x_{(2)}\,|\,d_{(2)} + C_{21}C_{11}^{-1}(x_{(1)} - d_{(1)}),\; C_{22} - C_{21}C_{11}^{-1}C_{12}\}.$$
(3.38)
Proof

$$f_{D,k}(X_{(2)} = x_{(2)}|x_{(1)}, d, C) = \frac{f_{D,k}(X = x|d, C)}{f_{D,k}(X_{(1)} = x_{(1)}|d, C)}$$
$$= \frac{x_{(2)}^{k_{(2)}}\exp\big[-\{x_{(2)} - d_{(2)} - C_{21}C_{11}^{-1}(x_{(1)}-d_{(1)})\}^T C^{22}\{x_{(2)} - d_{(2)} - C_{21}C_{11}^{-1}(x_{(1)}-d_{(1)})\}/2\big]}{\int_0^{\infty} x_{(2)}^{k_{(2)}}\exp\big[-\{x_{(2)} - d_{(2)} - C_{21}C_{11}^{-1}(x_{(1)}-d_{(1)})\}^T C^{22}\{x_{(2)} - d_{(2)} - C_{21}C_{11}^{-1}(x_{(1)}-d_{(1)})\}/2\big]\,dx_{(2)}}$$
$$= f_{D,k_{(2)}}\{X_{(2)} = x_{(2)}\,|\,d_{(2)} + C_{21}C_{11}^{-1}(x_{(1)}-d_{(1)}),\; C_{22} - C_{21}C_{11}^{-1}C_{12}\}.$$

Q.E.D.

Equation (3.38) shows that the conditional distribution is the bpc one with $k = k_{(2)}$, $d = d_{(2)} + C_{21}C_{11}^{-1}(x_{(1)} - d_{(1)})$ and $C = (C^{22})^{-1} = C_{22} - C_{21}C_{11}^{-1}C_{12}$.
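The block-inverse identity $C^{22} = (C_{22} - C_{21}C_{11}^{-1}C_{12})^{-1}$ used above, and the conditional location $d_{(2)} + C_{21}C_{11}^{-1}(x_{(1)} - d_{(1)})$ of Theorem 3.5, are easy to check numerically; the sketch below is Python with arbitrary illustrative values (not from the book):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
C = A @ A.T + 4 * np.eye(4)            # an arbitrary positive definite C
C11, C12 = C[:2, :2], C[:2, 2:]
C21, C22 = C[2:, :2], C[2:, 2:]

# Schur complement giving the conditional "scale" matrix of Theorem 3.5
S = C22 - C21 @ np.linalg.inv(C11) @ C12
# block-inverse identity: the (2,2) block of C^{-1} equals S^{-1}
C22_sup = np.linalg.inv(C)[2:, 2:]

# conditional "location" d_(2) + C21 C11^{-1} (x_(1) - d_(1))
d = np.array([1.0, 0.5, -0.5, 2.0])
x1 = np.array([0.8, 1.2])
d_cond = d[2:] + C21 @ np.linalg.inv(C11) @ (x1 - d[:2])
print(np.max(np.abs(C22_sup - np.linalg.inv(S))), d_cond)
```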
3.6
Numerical Illustrations
For computation of the pdf, cdf and moments of the bpc distribution, the computation of $D_{-k-1,W}(z, x)$ is an important factor. Currently, this can be performed using the "2nd expression" of (3.12) based on the infinite series, or the "1st expression" of (3.12) for the integral representation with some numerical integration. When the former is used, the following stopping rule is employed: when the sum of the absolute values of the newest four added terms, divided by the current value of the function, is smaller than or equal to a predetermined criterion denoted by "eps", the computation is stopped. The newest four terms are employed considering the cases of two equal consecutive values in similar functions [21, p. 5]. The function "wpc" for the computation of $D_{-k-1,W}(z, x)$, coded in R [26], was developed using the default double precision. The criterion "eps" of convergence can be zero, which indicates the maximum machine precision and is usually attained without excessive added computation time. The R-function "wpc" and the associated functions are given at the end of this chapter. The numerical integration by the R-function "integrate" based on QUADPACK [23] is employed, using the default arguments, for comparison with the computation by "wpc". The comparison was performed for the combinations of $k = -0.5, 0, 0.5, 1(1)5$; $x = 0.5, 1, 2$, Inf with Inf $= \infty$ in R; and $z = 0(0.5)5$. That is, $8 \times 4 \times 11 = 352$ cases are used. In "wpc", the maximum machine precision, i.e., eps = 0, is used. The user cpu time required for the comparison of "wpc" and "integrate" in the set of 352 cases was 0.22 s using an Intel Core i7-6700 CPU @ 3.40 GHz. The statistics of the absolute differences of the values of $D_{-k-1,W}(z, x)$ by "wpc" and "integrate" are min = median = 0, mean = 6.0e−9 and max = 8.8e−8, where aeb $= a \times 10^{b}$.
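The power series behind "wpc" can be mirrored in Python together with the four-term stopping rule (a sketch of mine, not the book's R code). As an independent check, the series $\sum_j \gamma\{(k+1+j)/2,\,x^2/2\}(-\sqrt{2}z)^j/j!$ equals $2^{(1-k)/2}\int_0^x t^k e^{-t^2/2 - zt}\,dt$, an identity obtained by termwise integration, so quadrature gives a reference value:

```python
import numpy as np
from scipy.special import gammaln, gammainc
from scipy.integrate import quad

def wpc_series(k, z, x, eps=1e-12, mterm=500):
    """Series sum_j gamma_lower{(k+1+j)/2, x^2/2} (-sqrt(2) z)^j / j!,
    with a four-term stopping rule like the book's R-function wpc.
    Assumes z != 0 (z = 0 reduces to a single incomplete-gamma term)."""
    total, last4 = 0.0, []
    for j in range(mterm):
        a = (k + 1 + j) / 2
        term = np.exp(gammaln(a) + np.log(gammainc(a, x ** 2 / 2))
                      + j * np.log(np.sqrt(2.0) * abs(z)) - gammaln(j + 1))
        if j % 2 == 1 and z > 0:        # sign of (-sqrt(2) z)^j
            term = -term
        total += term
        last4 = (last4 + [abs(term)])[-4:]
        if j >= 3 and sum(last4) <= eps * abs(total):
            break
    return total, j + 1

k, z, x = 0.5, 1.0, 2.0
series, nterms = wpc_series(k, z, x)
integral = quad(lambda t: t ** k * np.exp(-t ** 2 / 2 - z * t), 0, x)[0]
print(series, 2 ** ((1 - k) / 2) * integral, nterms)
```

Working on the log scale, as the R code does, avoids overflow in the gamma factors for large orders.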
The statistics of the maximum powers in the power series (see $i$ in the "2nd expression" of (3.12)) when convergence is attained are min = 0, median = 29.5, mean = 37.8, max = 145. Note that min = 0 is given when z = 0, where the computation reduces to the scaled (in)complete gamma function (see (3.13)). When the somewhat relaxed convergence criterion eps = 1e−6 was used, the user cpu time for the same comparison was 0.19 s. That is, only 0.03 s was saved over the case using the maximum machine precision. The corresponding statistics for the absolute differences of the function are min = 0, median = 1.6e−12, mean = 6.0e−9, max = 8.8e−8. The statistics for the maximum powers in wpc are min = 0, median = 13.0, mean = 18.5, max = 117, which are in line with the short cpu time saved as mentioned earlier. In the case of the multivariate bpc distribution, the cdf of the bivariate case was used for illustration, where a similar comparison was made between the cdf by wpc and multivariate integration by the R-function "cubintegrate" with the method "hcubature" in the R-package "cubature" version 2.0.4 [16], used with the default arguments. The cases of $x^T$ = (0.1, 0.1), (0.8, 0.8), (2.5, 2.5), (0.1, 2.5); $k^T$ = (−0.5, −0.5), (−1/7, −1/7), (0.5, 0.5), (−0.5, 0.5), (2.5, 2.5); $c_{12}$ = −0.5, 0.5, 0.8 with $c_{11} = c_{22} = 1$; and $d^T$ = (−0.5, 0.5), (0.5, 0.5), (1, 1); i.e., $4 \times 5 \times 3 \times 3 = 180$ cases were employed. When eps = 0, the user cpu time was 60.1 s for the whole computation. The statistics of the absolute differences of the values of the cdf by wpc and cubature are min = 0, median = 7.2e−8, mean = 1.1e−6, max = 9.8e−5. Figure 3.1 shows eight univariate cases for the combinations of $k = -1/2, -1/7, 1/2$; $q = 1, -1$; and $p = 0.5, 2$. When $k < 0$, $f_{D,k}(0) = \infty$, which
Fig. 3.1 Density functions of the basic parabolic cylinder distribution. Left panel (parameter p = 0.5): Ex. 3.1: k = −1/2, q = 1; Ex. 3.2: k = −1/7, q = −1; Ex. 3.3: k = 1/2, q = 1; Ex. 3.4: k = 1/2, q = −1. Right panel (parameter p = 2): Ex. 3.5–3.8 with the same (k, q) settings.
is well seen in the case of k = −1/2. When k = −1/7 and q = −1 (Ex. 3.2), there are two (local) modes (see (i.d) of Result 3.1). The positive local mode is x = 0.83 by computation, which may be found when we look at the curve carefully. Though Ex. 3.6 with k = −1/7 and q = −1 when p = 2 gives a strictly decreasing function over the support (see (i.a) of Result 3.1), the density function is a tilted S-shaped curve with non-stationary inflection points somewhere in (0, 0.3) and (0.5, 0.9). A main difference between Ex. 3.1 to 3.4 with p = 0.5 and Ex. 3.5 to 3.8 with p = 2 is that of the scales. Though the same set q = 1, −1 is used when p = 0.5 and 2, they give some differences due to q other than scales, since $q_p = q/\sqrt{2p}$ is scale-free as mentioned earlier. That is, q = 1, −1 give different sets of sk(X) and kt(X): in Ex. 3.1 to 3.4, sk(X) (kt(X)) are 1.99 (4.95), 0.65 (0.05), 1.02 (1.11) and 0.46 (−0.05), respectively, while in Ex. 3.5 to 3.8, they are 1.79 (3.76), 0.88 (0.51), 0.90 (0.76) and 0.61 (0.14), respectively. The values of sk(X) and kt(X) were given by the recursive (see Corollary 3.1) and non-recursive (see Theorem 3.2) methods, where the differences for $D_{-k-4}(q_p)$ required for kt(X) by the two methods (see (3.22) and the 2nd expression of (3.12)) are around the maximum machine precision, i.e., less than 1e−15, and similar differences for comparison to the numerical integration (see the 1st expression of (3.12)) are less than 1e−10. The last results seem to indicate that the results by numerical integration are less accurate than those by the remaining methods in these cases. Figure 3.2 illustrates the density contours of the 9 bivariate bpc distributions. The first row (Ex. 3.9, 3.10 and 3.11), the second row (Ex. 3.12, 3.13 and 3.14) and
Fig. 3.2 Density contours of the bivariate bpc distribution (X1 = the horizontal axis, X2 = the vertical axis, c* = c12). Panel settings: Ex. 3.9–3.11: k′ = (−0.14, −0.14), d′ = (1, 1); Ex. 3.12–3.14: k′ = (−0.5, 0.5), d′ = (0.5, 0.5); Ex. 3.15–3.17: k′ = (2.5, 2.5), d′ = (−0.5, 0.5); within each triple, c* = −0.5, 0.5, 0.8.
the third row (Ex. 3.15, 3.16 and 3.17) are cases with $k^T = k' = (-1/7, -1/7)$, $d^T = d' = (1, 1)$; $k^T = (-0.5, 0.5)$, $d^T = (0.5, 0.5)$; and $k^T = (2.5, 2.5)$, $d^T = (-0.5, 0.5)$, respectively, where the value −0.14 in the figure should be read as −1/7. On the other hand, the first column (Ex. 3.9, 3.12 and 3.15), the second column (Ex. 3.10, 3.13 and 3.16) and the third column (Ex. 3.11, 3.14 and 3.17) are cases with $c^* = c_{12} = -0.5, 0.5$ and $0.8$, respectively. $c_{11} = c_{22} = 1$ is used throughout the 9 examples. That is, $c_{12}$ is a correlation coefficient when $X$ is bivariate normally distributed without truncation. In the three examples with $k^T = (-1/7, -1/7)$ and $d^T = (1, 1)$ in the first row, it can be shown that, when $x_1 \to 0$ and $x_2 \to 0$, the density function becomes infinitely large (when $x_1 \to 0$ and $x_2$ is a positive fixed value, the density goes to $\infty$ with a different speed). From the contour plot of Ex. 3.9, it is found that there is more than one (local) mode in the example. The correlation coefficients of $X_1$ and $X_2$ following the bivariate bpc distribution when $c_{12} = -0.5$ are −0.31, −0.19 and −0.18 in Ex. 3.9, 3.12 and 3.15, respectively. When $c_{12} = 0.5$, they are 0.38, 0.32 and 0.31 in Ex. 3.10, 3.13 and 3.16, respectively. When $c_{12} = 0.8$, they are 0.71, 0.66 and 0.65 in Ex. 3.11, 3.14 and 3.17, respectively. Absolute values of these coefficients seem to be reduced from the corresponding $c_{12}$'s, which may be due to lower truncation and the multiplicative factor $x^{m+k}$, as found in independent cases. As mentioned earlier, the means and variances in Fig. 3.2 are different from those of the corresponding univariate cases, i.e., $k = k_i$, $p = 1/(2c_{ii})$ and $q = -d_i/c_{ii}$ $(i = 1, 2)$. For instance, the common mean (standard deviation) of $X_1$ and $X_2$ in Ex. 3.9 is 1.11 (0.74) while the corresponding value for the univariate versions is 1.19 (0.79). In Ex.
3.17, the corresponding value for X1 is 1.87 (0.67) while the univariate version is 1.53 (0.63); the corresponding value for X2 is 2.75 (0.734) while the univariate version is 1.99 (0.729).
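The univariate means quoted for comparison can be reproduced from the moment ratio $E(X) = (k+1)D_{-k-2}(q_p)/\{\sqrt{2p}\,D_{-k-1}(q_p)\}$ (cf. Corollary 3.1). A Python sketch (illustrative parameter values, not the book's R code) checks the ratio form against quadrature:

```python
import numpy as np
from scipy.special import pbdv
from scipy.integrate import quad

def bpc_mean(k, p, q):
    """E(X) = (k+1) D_{-k-2}(q_p) / {sqrt(2p) D_{-k-1}(q_p)}, q_p = q/sqrt(2p)."""
    qp = q / np.sqrt(2 * p)
    return (k + 1) * pbdv(-k - 2, qp)[0] / (np.sqrt(2 * p) * pbdv(-k - 1, qp)[0])

k, p, q = -1 / 7, 0.5, -1.0            # illustrative parameter values
g = lambda x: x ** k * np.exp(-p * x ** 2 - q * x)
mean_quad = quad(lambda x: x * g(x), 0, np.inf)[0] / quad(g, 0, np.inf)[0]
print(bpc_mean(k, p, q), mean_quad)
```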
3.7
Discussion
(a) Non-raw absolute moments: As addressed earlier, a major application of the bpc distribution is to have absolute moments of real-valued orders for the uni- and multivariate truncated normal distributions. For the corresponding absolute moments of integer-valued orders, we can use the simple result with q = 0, e.g., in the univariate case, where the variable transformation with $X + \{q/(2p)\}$ redefined as $X$, the change of support from $[0, \infty)$ to $[q/(2p), \infty)$ and the binomial expansion of $[X + \{q/(2p)\}]^m$ should be considered. The simple result with q = 0 mentioned above is the case of the scaled (in)complete gamma function without infinite series, as addressed earlier. The above discussion suggests the problem of $E(|X + a|^m)$ $(m > -1)$ with $a$ being a real constant, which was not dealt with in earlier sections when $X$ is bpc-distributed. This can be given again by the variable transformation $X + a$ redefined as $X$, using the following procedure.
$$E(|X + a|^m) = \int_0^{\infty}|x + a|^m f_{D,k}(x|p, q)\,dx \quad (m > -1)$$
$$= \int_a^{\infty}|x|^m f_{D,k}(x|p, q - 2ap)\,dx\,\exp(qa - pa^2)$$
$$= \begin{cases}\Big\{\displaystyle\int_0^{\infty} x^m f_{D,k}(x|p, q-2ap)\,dx - \int_0^a x^m f_{D,k}(x|p, q-2ap)\,dx\Big\}\exp(qa - pa^2) & (a \ge 0),\\[6pt] \Big\{\displaystyle\int_0^{\infty} x^m f_{D,k}(x|p, q-2ap)\,dx + \int_0^{|a|} x^m f_{D,k}(x|p, -q+2ap)\,dx\Big\}\exp(qa - pa^2) & (a < 0),\end{cases}$$

where $-p(x-a)^2 - q(x-a) = -px^2 - (q-2ap)x + qa - pa^2$ and $-p(-x)^2 - (q-2ap)(-x) = -px^2 - (-q+2ap)x$ are used. When $a = -E(X|k, p, q)$, the above result gives the central absolute moment of real-valued order $m\ (> -1)$ as a special case. In the case of $E\{(X+a)^m\}$, where $m$ is a non-negative integer, the binomial expansion of $(X+a)^m$ gives the required moment, as mentioned earlier for $[X + \{q/(2p)\}]^m$.

(b) The multiple infinite series: In the multivariate bpc distribution, the formula of the normalizer (see (3.33) and its proof) includes a $\{(n^2 - n)/2\}$-fold multiple infinite series, where $(n^2 - n)/2$ is the number of the non-duplicated off-diagonal elements in $C$. When some $c_{ij}$'s are zero, $(n^2 - n)/2$ can be replaced by the number of nonzero $c_{ij}$'s $(1 \le i < j \le n)$. Since generally $c_{ij} \ne 0$, improved formulas for the multiple infinite series are desired, which is a task to be investigated in the future.

(c) Other applications of the bpc distribution: Since the shapes of the density functions of the bpc distribution can take various forms, we have possible applications of fitting the distribution to various phenomena in real data, as Kostylev [11] used a form of this distribution with a variable transformation for a physical problem. For fitting the bpc distribution to data, methods of parameter estimation and evaluation of the estimators should be derived, which are also remaining problems. Note that the uni- and multivariate bpc distributions belong to the exponential family of distributions, where p and q in the univariate case are natural parameters. As addressed earlier, the chi distribution is a special case of the bpc distribution. The relationship between these two distributions is similar to that between the chi-square and gamma distributions, since the former is a special case of the latter.
It is to be noted that the gamma distribution is given by the bpc distribution, but not vice versa, when the variable transformation $U = bpX^2$ $(b > 0)$ is used, where $X$ is bpc-distributed with $k > -1$ and q = 0. Then, it can be shown that $U$ has the gamma distribution with shape parameter $(k+1)/2$ and scale parameter $b$. When k = 0, p = 1/2 and q = 0, the bpc distribution becomes the chi distribution with 1 degree of freedom, which is equal to the normal distribution under single truncation of $X < 0$. These results show general properties of the bpc distribution.
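The gamma-distribution connection can be verified directly: with q = 0, the bpc cdf at $x_0$ must equal the Gamma$\{(k+1)/2,\ \mathrm{scale}\ b\}$ cdf at $bpx_0^2$. A small Python check (illustrative parameter values; the book itself works in R):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import gamma as gamma_dist

k, p, b = 1.2, 0.7, 2.0                # bpc with q = 0; any b > 0
g = lambda x: x ** k * np.exp(-p * x ** 2)
norm = quad(g, 0, np.inf)[0]

def bpc_cdf(x0):
    """P(X <= x0) for the bpc distribution with q = 0."""
    return quad(g, 0, x0)[0] / norm

x0 = 1.3
lhs = bpc_cdf(x0)                                               # P(X <= x0)
rhs = gamma_dist.cdf(b * p * x0 ** 2, a=(k + 1) / 2, scale=b)   # P(U <= b p x0^2)
print(lhs, rhs)
```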
3.8 R-Functions

3.8.1 The R-Function wpc for the Weighted Parabolic Cylinder Function
################ The start of function wpc ###########################
#
# R-function [wpc] version 1.0
#
# The value of the weighted parabolic cylinder function
# 2020
# Haruhiko Ogasawara
# Otaru University of Commerce, Otaru, Japan
#
wpc=function(k,z,x,eps=1e-6,mterm=500){
#
#.......................... INPUT ....................................
#
# [k] The parameter k in the subscript -k-1 for the extended
#     Whittaker notation D_(-k-1,W)(z,x) (note that the usual
#     Whittaker notation is D_(-k-1)(z))
# [z] The main argument for D_(-k-1,W)(z,x)
# [x] The second argument for D_(-k-1,W)(z,x), which gives
#     the incomplete gamma function when x < Inf, while x = Inf
#     gives the usual (complete) gamma function
# [eps] The convergence criterion using the sum of the current four
#     absolute differences divided by the absolute value
#     of the function in a power series; 1e-6 (default);
#     eps=0 indicates machine precision
# [mterm] The maximum order of terms for the variable in a power
#     series, 500 (default)
#
#......................... OUTPUT ....................................
#
# The following are returned, when e.g.,
#   abc=wpc(k,z,x,eps,mterm)
#
# [abc$value] The derived value of the weighted parabolic cylinder
#     function
# [abc$nterms] The order of terms required to have convergence
#     for the variable in a power series
# [abc$rinc] The final relative increment of the variable
#     in a power series, which should be smaller than
#     or equal to 'eps' for convergence
#.....................................................................
mseq=NULL
dseq=NULL
h0=1e100 # a large initial value
if(z != 0){
wpc=0
uv=-1
repeat{
uv=uv+1
abc=exp( lgamma((k+1+uv)/2)+log(pgamma(x^2/2,(k+1+uv)/2))
 +uv*log(abs(-sqrt(2)*z))-lfactorial(uv) )
if(uv %% 2 == 1 && sign(-z) == -1)abc=-abc
wpc=wpc+abc
h1=wpc
mseq=c(mseq,h1)
dseq=c(h1-h0,dseq)
h0=h1
msize=4
if(uv

$$\cdots \prod_{g<h}\frac{1}{u_{gh}!}\Big\{\frac{-2\,\mathrm{sign}(c_{grd}c_{hrd})\,\psi^{gh}}{\sqrt{\psi^{gg}\psi^{hh}}}\Big\}^{u_{gh}},$$

where $\Psi^{-1} = \{\psi^{ij}\}$ $(i, j = 1, \ldots, N)$; $\mathrm{sign}(x) = 1, 0$ and $-1$ when $x > 0$, $x = 0$ and $x < 0$, respectively; $1\{\cdot\}$ is the indicator function; $\sum_{L_1,\ldots,L_N=0}^{1}(\cdot) = \sum_{L_1=0}^{1}\cdots\sum_{L_N=0}^{1}(\cdot)$; $\sum_{u_{12},\ldots,u_{N-1,N}=0}^{\infty}(\cdot)$ and $\sum_{v_1,\ldots,v_N=0}^{\infty}(\cdot)$ are defined similarly; $\gamma(x|s^*)$ is the lower incomplete gamma function at $x$, i.e., $\int_0^x t^{s^*-1}e^{-t}\,dt$, with its complete version $\Gamma(s^*)$, which is the usual gamma function with a positive real-valued $s^*$; $\sum_{g<h}(\cdot) = \sum_{g=1}^{N-1}\sum_{h=g+1}^{N}(\cdot)$; $\delta_{gi}$ is the Kronecker delta; $(\cdot)_{i\mathrm{th}}$ is the $i$th element of the vector in parentheses; $\prod_{g<h}(\cdot) = (\cdot)_{g=1,h=2}\cdots(\cdot)_{g=N-1,h=N}$; and the definition $0^0 = 1$ is used when necessary.

Proof Employ the following notations:
$$\int_{A_d}^{B_d}(\cdot)\,dx \equiv \sum_{r_Y=1}^{R_Y}\sum_{r_Z=1}^{R_Z}\int_{a_{1r_Yd}}^{b_{1r_Yd}}\cdots\int_{a_{pr_Yd}}^{b_{pr_Yd}}\int_{a_{(p+1)r_Z}}^{b_{(p+1)r_Z}}\cdots\int_{a_{Nr_Z}}^{b_{Nr_Z}}(\cdot)\,dx$$
$$= \sum_{r\in R}\sum_{L_1,\ldots,L_N=0}^{1}(-1)^{L_1+\cdots+L_N}\int_0^{c_{1rd}^{(L_1)}}\cdots\int_0^{c_{Nrd}^{(L_N)}}(\cdot)\,dx \equiv \sum_{r\in R}\sum_{L=0}^{1}(-1)^{L_+}\int_0^{c_{rd}^{(L)}}(\cdot)\,dx,$$
8 The Truncated Pseudo-Normal (TPN) and Truncated …
where $\int_0^{c_{ird}^{(L_i)}}(\cdot)\,dx = -\int_{c_{ird}^{(L_i)}}^{0}(\cdot)\,dx$ if $c_{ird}^{(L_i)} < 0$ $(L_i = 0, 1$; $r = 1, \ldots, R_Y$ when $i = 1, \ldots, p$; $r = 1, \ldots, R_Z$ when $i = p+1, \ldots, N)$. Using the variable transformations $w_i = x_i^2\psi^{ii}/2$, $x_i = \mathrm{sign}(x_i)\sqrt{2w_i/\psi^{ii}}$ and $dx_i/dw_i = \mathrm{sign}(x_i)/\sqrt{2\psi^{ii}w_i}$ $(i = 1, \ldots, N)$, and noting $\mu_{id} = (\mu_d)_{i\mathrm{th}}$, (8.1) gives
ZBd
~yk N=2
ð2pÞ
Ad
¼a
1
ZBd Ad
¼
exp ð2pÞN=2 jWj1=2
Z
i
i¼1
2
X
xi xj wij
i\j
ðL Þ
!
(
p Y
ðL Þ signðcirdi Þki
i¼1
ð2wi =wii Þki =2 pffiffiffiffiffiffiffiffiffiffiffiffiffi 2wii wi
(
1 expðwTd W1 wd =2Þ X X
að2pÞN=2 jWj1=2 crd w =2 (
)
ðL Þ ðL Þ
ð1ÞL þ
r2R L¼0
ðLÞ2
Z
(
0
p Y i¼1
ðL Þ signðcirdi Þki þ 1
ðki 1Þ=2
2ðki 1Þ=2 wi
ðwii Þðki þ 1Þ=2 !
)
) N X 1 pffiffiffiffiffiffiffiffiffiffiffiffiffi exp wi 2wii wi i¼p þ 1 i¼1 " ( )u # ðL Þ ðL Þ 1 YX 2signðcirdi cjrdj Þwij pffiffiffiffiffiffiffiffiffi 1 pffiffiffiffiffiffiffiffiffiffiffi wi wj ii jj u! w w i\j u¼0 " ( )v # ðL Þ pffiffiffi N X 1 Y ðW1 gd Þith signðcirdi Þ 2 pffiffiffiffiffi 1 pffiffiffiffiffiffi dw wi v! wii i¼1 v¼0 N Y
ðL Þ
ð1ÞL 1 þ þ L N signðc1rd1 Þ signðcnrdN Þ
N X 2signðcirdi cjrdj Þwij pffiffiffiffiffiffiffiffiffi X 1 pffiffiffiffiffiffiffiffiffiffiffiffiffi exp pffiffiffiffiffiffiffiffiffiffiffi wi wi wj 2wii wi wii wjj i\j i¼p þ 1 i¼1 sffiffiffiffiffiffiffi) N X 2wi ðL i Þ ij w ljd signðcird Þ þ dw wii i;j¼1
¼
0
N Y
1 X
r2R L 1 ;...;L n ¼0
ðL Þ2 cNrdN wNN =2
0
N X x2 wii
N X 1 T 1 ij þ xi w ljd wd W wd dx 2 i;j¼1
að2pÞN=2 jWj1=2 Z
jWj
expfðx wd ÞT W1 ðx wd Þ=2gdx
~yk
expðwTd W1 wd =2Þ X ðL Þ2 c1rd1 w11 =2
1=2
8.4 Moments and Cumulants of the TPN
¼
( p expðwTd W1 wd =2Þ Y
247
)
2ki =2
!
N Y
1 pffiffiffiffiffiffi wii i¼p þ 1
ii ðki þ 1Þ=2 að4pÞN=2 jWj1=2 i¼1 ðw Þ ) ( )( p R X 1 N Y Y X ðL i Þ ki þ 1 ðL i Þ Lþ ð1Þ signðcird Þ signðcird Þ r¼1 L¼0
i¼p þ 1
i¼1 ðLÞ2
1 X
u12 ¼0
"
N Y
1 1 X X
uN1; N ¼0 v1 ¼0 fki 1 þ
wi
P g\h
1 X vN ¼0
crd w =2
Z 0
ugh ðdgi þ dhi Þ þ vi g=2
i¼1
#" e
8 9ugh 3 >
= Y 1 7 6 grd hrd qffiffiffiffiffiffiffiffiffiffiffiffiffiffi 4 5 > > u ! gg hh gh ; g\h : w w 8 9v l N >
= 1 Y d lth qffiffiffiffiffiffi lrd ; > > ; vl ! l¼1 : wll 2
N Y
wi
f1 þ
P
wi
g\h
ugh ðdgi þ dhi Þ þ vi g=2
# e
wi
dw
i¼p þ 1
R cðL 1 Þ2 w11 =2 R cðL N Þ2 wNN =2 R cðLÞ2 w =2 where 0 1rd 0 Nrd ð )dw 0 rd ð )dw. The last expression gives the required result. Q.E.D. As in Ogasawara [9], Theorem 8.5 gives raw, central, arbitrarily deviated, non-absolute, absolute and partially absolute cross moments with possible non-integer orders greater than 1 for jYid j’s when Y~id ¼ jYid j under sectional truncation. It is found that Ogasawara [9, Theorem 1] is a special case of Theorem 8.5 in that in the former result, Z is missing. The following result is also similarly obtained. Corollary 8.1 An alternative expression of Theorem 8.5 is given by ~ kÞ ðY d
$$E(\tilde{Y}_d^k) = \frac{\exp(-\boldsymbol{\psi}_d^T\Psi^{-1}\boldsymbol{\psi}_d/2)}{a(4\pi)^{N/2}|\Psi|^{1/2}}\left\{\prod_{i=1}^{p}\frac{2^{k_i/2}}{(\omega^{ii})^{(k_i+1)/2}}\right\}\left(\prod_{i=p+1}^{N}\frac{1}{\sqrt{\omega^{ii}}}\right)\sum_{r\in R}\sum_{L_1,\ldots,L_N=0}^{1}(-1)^{L_+}\left\{\prod_{i=1}^{p}\mathrm{sign}(c_{ird}^{(L_i)})^{k_i+1}\right\}\left\{\prod_{i=p+1}^{N}\mathrm{sign}(c_{ird}^{(L_i)})\right\}$$
$$\times\sum_{u_{12},\ldots,u_{N-1,N}=0}^{\infty}\prod_{i=1}^{N}\left[\Gamma\left\{\frac{1\{i\le p\}k_i+1+\sum_{g<h}u_{gh}(\delta_{gi}+\delta_{hi})}{2}\right\}\;{}_1F_{1w}\left\{\frac{1\{i\le p\}k_i+1+\sum_{g<h}u_{gh}(\delta_{gi}+\delta_{hi})}{2};\,\frac{1}{2};\,\frac{\{(\Psi^{-1}\boldsymbol{\psi}_d)_{i\mathrm{th}}\}^2}{2\omega^{ii}};\,\frac{c_{ird}^{(L_i)2}\omega^{ii}}{2}\right\}\right.$$
$$\left.+\frac{\sqrt{2}\,(\Psi^{-1}\boldsymbol{\psi}_d)_{i\mathrm{th}}\,\mathrm{sign}(c_{ird}^{(L_i)})}{\sqrt{\omega^{ii}}}\,\Gamma\left\{\frac{1\{i\le p\}k_i+2+\sum_{g<h}u_{gh}(\delta_{gi}+\delta_{hi})}{2}\right\}\;{}_1F_{1w}\left\{\frac{1\{i\le p\}k_i+2+\sum_{g<h}u_{gh}(\delta_{gi}+\delta_{hi})}{2};\,\frac{3}{2};\,\frac{\{(\Psi^{-1}\boldsymbol{\psi}_d)_{i\mathrm{th}}\}^2}{2\omega^{ii}};\,\frac{c_{ird}^{(L_i)2}\omega^{ii}}{2}\right\}\right]$$
$$\times\prod_{g<h}\frac{1}{u_{gh}!}\left\{\frac{-2\,\mathrm{sign}(c_{grd}^{(L_g)}c_{hrd}^{(L_h)})\,\omega^{gh}}{\sqrt{\omega^{gg}\omega^{hh}}}\right\}^{u_{gh}},$$
where
$${}_1F_{1w}(g;\,n;\,x;\,w^*) = \sum_{v=0}^{\infty}\frac{(g)_v\,x^v}{(n)_v\,v!}\,\frac{\gamma(w^*\,|\,g+v)}{\Gamma(g+v)}\quad(w^*\ge 0;\ x\ge 0)$$
(the weighted Kummer confluent hypergeometric function); and $(g)_v = g(g+1)\cdots(g+v-1) = \Gamma(g+v)/\Gamma(g)$ is the rising factorial.

Proof For the proof, the following lemma is used. Q.E.D.

Lemma 8.2 (Ogasawara [9, Lemma 1])
$$\sum_{v=0}^{\infty}\gamma\{w^*\,|\,(k+1+v)/2\}\,\frac{(2\phi)^v}{v!} = \Gamma\left(\frac{k+1}{2}\right){}_1F_{1w}\left(\frac{k+1}{2};\,\frac{1}{2};\,\phi^2;\,w^*\right) + 2\phi\,\Gamma\left(\frac{k+2}{2}\right){}_1F_{1w}\left(\frac{k+2}{2};\,\frac{3}{2};\,\phi^2;\,w^*\right).$$

Proof of Lemma 8.2 The infinite series on the left-hand side of the above equation is
$$\sum_{v=0}^{\infty}\gamma\{w^*|(k+1+v)/2\}\,\frac{(2\phi)^v}{v!} = \gamma\left(w^*\Big|\frac{k+1}{2}\right) + \gamma\left(w^*\Big|\frac{k+3}{2}\right)\frac{(2\phi)^2}{2!} + \gamma\left(w^*\Big|\frac{k+5}{2}\right)\frac{(2\phi)^4}{4!} + \cdots$$
$$+ 2\phi\left\{\gamma\left(w^*\Big|\frac{k+2}{2}\right) + \gamma\left(w^*\Big|\frac{k+4}{2}\right)\frac{(2\phi)^2}{3!} + \gamma\left(w^*\Big|\frac{k+6}{2}\right)\frac{(2\phi)^4}{5!} + \cdots\right\}.$$
Using $(2\phi)^{2j}/(2j)! = \phi^{2j}/\{(1/2)_j\,j!\}$ and $(2\phi)^{2j+1}/(2j+1)! = 2\phi\,\phi^{2j}/\{(3/2)_j\,j!\}$, this becomes
$$\Gamma\left(\frac{k+1}{2}\right)\left[\frac{\gamma\{w^*|(k+1)/2\}}{\Gamma\{(k+1)/2\}} + \frac{1}{1!}\frac{\gamma\{w^*|(k+3)/2\}}{\Gamma\{(k+3)/2\}}\frac{\{(k+1)/2\}_1}{(1/2)_1}\phi^2 + \frac{1}{2!}\frac{\gamma\{w^*|(k+5)/2\}}{\Gamma\{(k+5)/2\}}\frac{\{(k+1)/2\}_2}{(1/2)_2}(\phi^2)^2 + \cdots\right]$$
$$+ 2\phi\,\Gamma\left(\frac{k+2}{2}\right)\left[\frac{\gamma\{w^*|(k+2)/2\}}{\Gamma\{(k+2)/2\}} + \frac{1}{1!}\frac{\gamma\{w^*|(k+4)/2\}}{\Gamma\{(k+4)/2\}}\frac{\{(k+2)/2\}_1}{(3/2)_1}\phi^2 + \frac{1}{2!}\frac{\gamma\{w^*|(k+6)/2\}}{\Gamma\{(k+6)/2\}}\frac{\{(k+2)/2\}_2}{(3/2)_2}(\phi^2)^2 + \cdots\right],$$
which gives the required result with the definition of ${}_1F_{1w}(\cdot)$. Q.E.D.

In Corollary 8.1, when $w^*$ is $\infty$, ${}_1F_{1w}(g;\,n;\,x;\,w^*)$ becomes the usual Kummer confluent hypergeometric function:
$${}_1F_{1w}(g;\,n;\,x;\,\infty) = \sum_{v=0}^{\infty}\frac{(g)_v\,x^v}{(n)_v\,v!}\frac{\gamma(\infty\,|\,g+v)}{\Gamma(g+v)} = \sum_{v=0}^{\infty}\frac{(g)_v\,x^v}{(n)_v\,v!} = {}_1F_1(g;\,n;\,x)\quad(x\ge 0)$$
(Winkelbauer [10, Eq. (6)]; Zwillinger [11, Eq. (1) of Sect. 9.210]; DLMF [2, Chap. 13]). Note that ${}_1F_{1w}(g;\,n;\,x;\,w^*)$ is given by ${}_1F_1(g;\,n;\,x)$ when each term is weighted by $0\le\gamma(w^*|g+v)/\Gamma(g+v)\le 1$. That is, it is expected that the infinite series of ${}_1F_{1w}(g;\,n;\,x;\,w^*)$ tends to converge faster than ${}_1F_1(g;\,n;\,x)$.
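As a numerical illustration of ${}_1F_{1w}$, the series can be evaluated directly with standard-library tools only. The function names below are ad hoc for this sketch, not from the text; the regularized lower incomplete gamma $\gamma(w^*|s)/\Gamma(s)$ is computed by its usual power series, and letting $w^*$ grow recovers the ordinary Kummer function, e.g. ${}_1F_1(g;g;x)=e^x$.

```python
import math

def reg_lower_gamma(s, x, terms=500):
    # P(s, x) = gamma(x | s) / Gamma(s), via the standard power series
    # P(s, x) = e^{-x} x^s / Gamma(s+1) * sum_{n>=0} x^n / ((s+1)...(s+n))
    if x <= 0.0:
        return 0.0
    term = math.exp(s * math.log(x) - x - math.lgamma(s + 1.0))
    total = 0.0
    for n in range(terms):
        total += term
        term *= x / (s + 1.0 + n)
    return min(total, 1.0)

def f1w(g, n, x, wstar, terms=60):
    # weighted Kummer series 1F1w(g; n; x; w*): each term of 1F1 is
    # damped by the weight P(g+v, w*) in [0, 1]
    total, num, den, fact = 0.0, 1.0, 1.0, 1.0
    for v in range(terms):
        total += num / den * x ** v / fact * reg_lower_gamma(g + v, wstar)
        num *= g + v     # rising factorial (g)_{v+1}
        den *= n + v     # rising factorial (n)_{v+1}
        fact *= v + 1.0
    return total
```

For instance, `f1w(1, 1, 0.5, 50.0)` is close to $e^{0.5}$, while a small weight such as $w^*=0.5$ strictly shrinks the sum, consistent with the remark on faster convergence.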
When $p = q = 1$, i.e., $N = p+q = 2$ and $\boldsymbol{\psi}_d = \mathbf{0}$, define
$$\Psi = \begin{pmatrix}\sigma & 0\\ 0 & \sigma_x\end{pmatrix}\begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}\begin{pmatrix}\sigma & 0\\ 0 & \sigma_x\end{pmatrix},\quad c_{1rd}^{*(L_1)} = c_{1rd}^{(L_1)}/\sigma\quad\text{and}\quad c_{2rd}^{*(L_2)} = c_{2rd}^{(L_2)}/\sigma_x.$$
Denote $k_1$ and $Y_1$ by $k$ and $Y$, respectively. Then, using Lemma 8.2 in a similar way, we have the following relatively simple result similar to Ogasawara [9, Corollary 3]:

Corollary 8.2 When $k > -1$ is real-valued,
$$E(\tilde{Y}_d^k\,|\,\boldsymbol{\psi}_d = \mathbf{0}) = \frac{\sigma^k\,2^{k/2}\,(1-\rho^2)^{(k+1)/2}}{a\,4\pi}\sum_{r\in R}\sum_{L_1,L_2=0}^{1}(-1)^{L_1+L_2}\,\mathrm{sign}(c_{1rd}^{*(L_1)})^{k+1}\,\mathrm{sign}(c_{2rd}^{*(L_2)})$$
$$\times\sum_{u=0}^{\infty}\gamma\left\{\frac{c_{1rd}^{*(L_1)2}}{2(1-\rho^2)}\,\Big|\,\frac{k+1+u}{2}\right\}\gamma\left\{\frac{c_{2rd}^{*(L_2)2}}{2(1-\rho^2)}\,\Big|\,\frac{1+u}{2}\right\}\left\{\mathrm{sign}(c_{1rd}^{*(L_1)}c_{2rd}^{*(L_2)})\,2\rho\right\}^u\frac{1}{u!}$$
$$= \frac{\sigma^k\,2^{k/2}\,(1-\rho^2)^{(k+1)/2}}{a\,4\pi}\sum_{r\in R}\sum_{L_1,L_2=0}^{1}(-1)^{L_1+L_2}\,\mathrm{sign}(c_{1rd}^{*(L_1)})^{k+1}\,\mathrm{sign}(c_{2rd}^{*(L_2)})$$
$$\times\left[\Gamma\left(\frac{k+1}{2}\right)\sqrt{\pi}\;{}_2F_{1w2}\left\{\frac{k+1}{2},\frac{1}{2};\,\frac{1}{2};\,\rho^2;\,\frac{c_{1rd}^{*(L_1)2}}{2(1-\rho^2)},\,\frac{c_{2rd}^{*(L_2)2}}{2(1-\rho^2)}\right\}\right.$$
$$\left.+\,2\rho\,\Gamma\left(\frac{k+2}{2}\right)\;{}_2F_{1w2}\left\{\frac{k+2}{2},1;\,\frac{3}{2};\,\rho^2;\,\frac{c_{1rd}^{*(L_1)2}}{2(1-\rho^2)},\,\frac{c_{2rd}^{*(L_2)2}}{2(1-\rho^2)}\right\}\right],$$
where
$${}_2F_{1w2}(g_1,g_2;\,n;\,x;\,w_1^*,w_2^*) = \sum_{v=0}^{\infty}\frac{(g_1)_v(g_2)_v\,x^v}{(n)_v\,v!}\,\frac{\gamma(w_1^*\,|\,g_1+v)\,\gamma(w_2^*\,|\,g_2+v)}{\Gamma(g_1+v)\,\Gamma(g_2+v)}\quad(w_1^*\ge 0;\ w_2^*\ge 0;\ x\ge 0).$$

The last expression ${}_2F_{1w2}(g_1,g_2;\,n;\,x;\,w_1^*,w_2^*)$, with the weight $0\le\gamma(w_1^*|g_1+v)\gamma(w_2^*|g_2+v)/\{\Gamma(g_1+v)\Gamma(g_2+v)\}\le 1$ added to each term in the usual Gauss hypergeometric function ${}_2F_1(g_1,g_2;\,n;\,x) = \sum_{v=0}^{\infty}(g_1)_v(g_2)_v\,x^v/\{(n)_v\,v!\}$, was obtained by
Ogasawara [9, Corollary 3] and is called the weighted Gauss hypergeometric function. The unweighted Gauss hypergeometric function, used by Nabeya [8] and Kamat [5] in the untruncated and singly truncated bivariate cases, respectively, is a special case of the weighted counterpart.

Remark 8.3 Corollary 8.2 shows that the moments of Y do not depend on the scale or standard deviation $\sigma_x$ of $Z = Z_1$ before truncation, which was used to obtain the standardized limit $c_{2rd}^{*(L_2)} = c_{2rd}^{(L_2)}/\sigma_x$ for an interval for selection. This is expected and generally holds for Z in the TPN and PN, since the distribution of Y is unchanged as long as
$$\Pr(\mathbf{Z}\in S_Z\,|\,\boldsymbol{\eta},\Omega) = \Pr\left(\bigcup_{r=1}^{R_Z}\{\mathbf{a}_r^Z\le\mathbf{Z}<\mathbf{b}_r^Z\}\right)$$
is the same under reparametrization as
$$\Pr\{\mathbf{Z}^* = \mathrm{Diag}^{-1/2}(\Omega)\mathbf{Z}\in S_Z^*\,|\,\mathrm{Diag}^{-1/2}(\Omega)\boldsymbol{\eta},\,P\} = \Pr\left(\bigcup_{r=1}^{R_Z}\{\mathrm{Diag}^{-1/2}(\Omega)\mathbf{a}_r^Z\le\mathbf{Z}^*<\mathrm{Diag}^{-1/2}(\Omega)\mathbf{b}_r^Z\}\right)$$
$$= \Pr\left(\bigcup_{r=1}^{R_Z}\{\mathbf{a}_r^Z\le\mathbf{Z}<\mathbf{b}_r^Z\}\right),$$
where $P = \mathrm{Diag}^{-1/2}(\Omega)\,\Omega\,\mathrm{Diag}^{-1/2}(\Omega)$ is the correlation matrix with unit diagonals corresponding to $\Omega$ and $\mathrm{Diag}^{-1/2}(\Omega)$ is the diagonal matrix whose diagonal elements are $\omega_{11}^{-1/2},\ldots,\omega_{qq}^{-1/2}$. Though this reparametrization can be used without loss of generality, other methods are also possible. The SN, a special case of the PN with $\mathrm{PN}_{p,q,R}(\boldsymbol{\mu},\Sigma,\Delta,\boldsymbol{\eta},D,A,B) = \mathrm{PN}_{1,1,1}(0,1,-\lambda,0,1,-\infty,0)$, uses $D = d = 1$ rather than $\Omega = \sigma_x^2 = 1$, giving the pdf at y, $2\phi(y)\Phi(\lambda y\,|\,\eta = 0,\,d = 1) = 2\phi(y)\Phi(\lambda y)$, as addressed earlier.

It is found that the TPN and PN are latent variable models (LVMs), where hidden or latent truncation is typically used though Z may be observable. Note that even in the latter observable case, Z does not always appear in the model, e.g., $2\phi(y)\Phi(\lambda y)$ for the SN unless the expression $2\phi(y)\Pr(Z<0\,|\,{-\lambda}y,1)$ is employed. One of the properties of the LVM is the inflated dimensionality for variables
associated with the LVM. The SN model is a univariate model for Y. However, we require an added latent variable Z to describe the pdf. In psychometrics, or more generally behaviormetrics, it is known that the exploratory factor analysis (EFA) model is an LVM while the similar model of principal component analysis (PCA) is not. In the EFA model, when the number of observable variables of interest is p, the model consists of p unique factors and additional q common factors, yielding the inflated dimensionality p + q over that of the observable variables or data. On the other hand, in PCA the sum of the numbers of principal and remaining minor components is equal to p, i.e., the number of observable variables, with no inflation of the number of associated variables (for the LVM and the EFA model see, e.g., Bollen [1]; Gorsuch [4]).

A special case of the PN is given by the SN as addressed earlier. The SN under single truncation was introduced by Kim [6], while the SN under double truncation was investigated by Flecher, Allard and Naveau [3]. The truncated skew-normal (TSN) with emphasis on convolution was discussed by Krenek, Cha, Cho and Sharp [7]. A special case of Corollary 8.2 in the case of the TSN is given when $\mu_d = \mu = 0$, $\boldsymbol{\psi}_d = \boldsymbol{\psi} = \mathbf{0}$, $\Sigma = \sigma^2 = 1$, $\sigma_x^2 = 1+\lambda^2$, $\rho = -\lambda/\sqrt{1+\lambda^2}$,
$$\Psi = \begin{pmatrix}1 & -\lambda\\ -\lambda & 1+\lambda^2\end{pmatrix},\quad \Psi^{-1} = \begin{pmatrix}1+\lambda^2 & \lambda\\ \lambda & 1\end{pmatrix},\quad |\Psi| = 1,\quad a = 1/2,$$
$c_{1rd}^{(L_1)} = c_{1r}^{(L_1)}$, $c_{1r}^{*(L_1)} = c_{1r}^{(L_1)}/\sigma = c_{1r}^{(L_1)}$ $(r = 1,\ldots,R_Y)$, and $c_{2rd}^{(L_2)} = c_{2r}^{(L_2)}$, $c_{2r}^{*(L_2)} = c_{2r}^{(L_2)}/\sigma_x = c_{2r}^{(L_2)}/(1+\lambda^2)^{1/2}$ due to $b_{2r} = 0$ and $a_{2r} = -\infty$ $(r = R_Z = 1)$.
Corollary 8.3 When $k > -1$ is real-valued for the TSN,
$$E(\tilde{Y}^k) = \frac{2^{k/2}}{2\pi(1+\lambda^2)^{(k+1)/2}}\sum_{r=1}^{R_Y}\sum_{L_1=0}^{1}(-1)^{L_1}\,\mathrm{sign}(c_{1r}^{(L_1)})^{k+1}$$
$$\times\sum_{u=0}^{\infty}\gamma\left\{\frac{c_{1r}^{(L_1)2}(1+\lambda^2)}{2}\,\Big|\,\frac{k+1+u}{2}\right\}\Gamma\left(\frac{1+u}{2}\right)\left\{\mathrm{sign}(c_{1r}^{(L_1)})\frac{2\lambda}{\sqrt{1+\lambda^2}}\right\}^u\frac{1}{u!}$$
$$= \frac{2^{k/2}}{2\pi(1+\lambda^2)^{(k+1)/2}}\sum_{r=1}^{R_Y}\sum_{L_1=0}^{1}(-1)^{L_1}\,\mathrm{sign}(c_{1r}^{(L_1)})^{k+1}$$
$$\times\left[\Gamma\left(\frac{k+1}{2}\right)\sqrt{\pi}\;{}_2F_{1w2}\left\{\frac{k+1}{2},\frac{1}{2};\,\frac{1}{2};\,\frac{\lambda^2}{1+\lambda^2};\,\frac{c_{1r}^{(L_1)2}(1+\lambda^2)}{2},\,\infty\right\}\right.$$
$$\left.+\,\mathrm{sign}(c_{1r}^{(L_1)})\frac{2\lambda}{\sqrt{1+\lambda^2}}\,\Gamma\left(\frac{k+2}{2}\right)\;{}_2F_{1w2}\left\{\frac{k+2}{2},1;\,\frac{3}{2};\,\frac{\lambda^2}{1+\lambda^2};\,\frac{c_{1r}^{(L_1)2}(1+\lambda^2)}{2},\,\infty\right\}\right],$$
where
$${}_2F_{1w2}(g_1,g_2;\,n;\,x;\,w_1^*,\infty) = \sum_{v=0}^{\infty}\frac{(g_1)_v(g_2)_v\,x^v}{(n)_v\,v!}\,\frac{\gamma(w_1^*\,|\,g_1+v)}{\Gamma(g_1+v)}\quad(w_1^*\ge 0;\ x\ge 0).$$

Proof Use $\omega^{11} = 1+\lambda^2$, $\omega^{22} = 1$ and $|\Psi| = 1$. Note that $b_{2r} = 0$ does not contribute to the result and that $a_{2r} = -\infty$ $(r = R_Z = 1)$, which corresponds to $L_2 = 1$, gives $(-1)^{L_2}\,\mathrm{sign}(c_{2rd}^{(L_2)}) = 1$ and
$$\gamma\left\{\frac{c_{2rd}^{*(L_2)2}}{2(1-\rho^2)}\,\Big|\,\frac{1+u}{2}\right\} = \gamma\left(\infty\,\Big|\,\frac{1+u}{2}\right) = \Gamma\left(\frac{1+u}{2}\right)\quad(r = R_Z = 1),$$
yielding the required result. Q.E.D.
8.4.2 A Formula Using the MGF

In Chap. 4, the formulas for the moments of the PN were given using its mgf and the partial derivatives of the cdfs of the associated variables. In this subsection, the formulas based on reduced expressions for the STN in Chap. 1 are used. Recall that the mgf of the STN is
$$M_X(\mathbf{t}) = \frac{\Pr(\mathbf{X}+\Sigma\mathbf{t}\in S\,|\,\boldsymbol{\mu},\Sigma)}{\Pr(\mathbf{X}\in S\,|\,\boldsymbol{\mu},\Sigma)}\exp\left(\boldsymbol{\mu}^T\mathbf{t}+\frac{\mathbf{t}^T\Sigma\mathbf{t}}{2}\right)$$
(Theorem 1.1), while that of the TPN is
$$M_Y(\mathbf{t}) = a^{-1}\Pr\left[\{(\mathbf{Y}+\Sigma\mathbf{t})^T,(\mathbf{Z}+\Delta\Sigma\mathbf{t})^T\}^T\in S\,\Big|\,\boldsymbol{\psi},\Psi\right]\exp\left(\boldsymbol{\mu}^T\mathbf{t}+\frac{\mathbf{t}^T\Sigma\mathbf{t}}{2}\right)$$
with $a = \Pr(\mathbf{X}\in S\,|\,\boldsymbol{\psi},\Psi)$ (Theorem 8.1). Let $\mathbf{Y} = \mathbf{Y}_1 + \mathbf{Y}_0$, where $\mathbf{Y}_1$ is a pseudo random vector whose pseudo mgf is $a^{-1}\Pr[\{(\mathbf{Y}+\Sigma\mathbf{t})^T,(\mathbf{Z}+\Delta\Sigma\mathbf{t})^T\}^T\in S\,|\,\boldsymbol{\psi},\Psi]$ and $\mathbf{Y}_0\sim N_p(\boldsymbol{\mu},\Sigma)$ is assumed to be independent of $\mathbf{Y}_1$. Define
$$A_Y^* = A_Y - \Sigma\mathbf{t}\mathbf{1}_{R_Y}^T,\quad B_Y^* = B_Y - \Sigma\mathbf{t}\mathbf{1}_{R_Y}^T,\quad A_Z^* = A_Z - \Delta\Sigma\mathbf{t}\mathbf{1}_{R_Z}^T,\quad B_Z^* = B_Z - \Delta\Sigma\mathbf{t}\mathbf{1}_{R_Z}^T$$
and
$$\mathbf{a}_r^* = (A_Y^*)_{\cdot r}\big|_{\mathbf{t}=\mathbf{0}} = (A_Y)_{\cdot r},\quad \mathbf{b}_r^* = (B_Y^*)_{\cdot r}\big|_{\mathbf{t}=\mathbf{0}} = (B_Y)_{\cdot r}\quad(r = 1,\ldots,R_Y);$$
$$\mathbf{a}_{p+r}^* = (A_Z^*)_{\cdot r}\big|_{\mathbf{t}=\mathbf{0}} = (A_Z)_{\cdot r},\quad \mathbf{b}_{p+r}^* = (B_Z^*)_{\cdot r}\big|_{\mathbf{t}=\mathbf{0}} = (B_Z)_{\cdot r}\quad(r = 1,\ldots,R_Z),$$
where $(\cdot)_{\cdot r}$ is the rth column of a matrix. Let $\mathbf{a}_{r(k)}$ be $\mathbf{a}_r^*$ with the kth element being omitted. Then, as in Chap. 1 for the STN using similar notations, e.g., $c_{kr}^{(L_k)} = a_{kr}^{L_k}\,b_{kr}^{1-L_k}$, the following results are obtained:

Lemma 8.3 Define $\sigma_{d\,i_1k} = \sigma_{i_1k} = (\Sigma)_{i_1k}$ $(i_1 = 1,\ldots,p;\ k = 1,\ldots,p)$ and $\sigma_{d\,i_1k} = (\Sigma\Delta^T)_{i_1,k-p}$ $(i_1 = 1,\ldots,p;\ k = p+1,\ldots,N)$, $N = p+q$, where $(\cdot)_{i_1k}$ is the $(i_1,k)$th element of a matrix. Then
$$E(\mathbf{Y})_{i_1} = a^{-1}\sum_{k=1}^{N}\sum_{r\in R}\sum_{L_k=0}^{1}(-1)^{L_k+1}\,\sigma_{d\,i_1k}\,\bar{f}_{(1)}^{(r)}(c_{kr}^{(L_k)}) + \mu_{i_1}\quad(i_1 = 1,\ldots,p),$$
where $\sum_{r\in R}(\cdot) = \sum_{r_Y=1}^{R_Y}(\cdot)$ when $k = 1,\ldots,p$ and $\sum_{r\in R}(\cdot) = \sum_{r_Z=1}^{R_Z}(\cdot)$ when $k = p+1,\ldots,N$;
$$\bar{f}_{(1)}^{(r)}(c_{kr}^{(L_k)}) = \sum_{(L_k)=0}^{1}(-1)^{L_+-L_k}f_{(1)}^{(r)}(c_{kr}^{(L_k)}),$$
with $(L_k)$ denoting the indexes $L_1,\ldots,L_N$ other than $L_k$, and
$$f_{(1)}^{(r)}(c_{kr}^{(L_k)}) = \int_{a_{1r_Y}^*}^{b_{1r_Y}^*}\cdots\int_{a_{(k-1)r_Y}^*}^{b_{(k-1)r_Y}^*}\int_{a_{(k+1)r_Y}^*}^{b_{(k+1)r_Y}^*}\cdots\int_{a_{pr_Y}^*}^{b_{pr_Y}^*}\int_{a_{(p+1)r_Z}^*}^{b_{(p+1)r_Z}^*}\cdots\int_{a_{Nr_Z}^*}^{b_{Nr_Z}^*}\phi_N\{(c_{kr}^{(L_k)},\mathbf{x}_{(k)}^T)^T\,|\,\boldsymbol{\psi},\Psi\}\,d\mathbf{x}_{(k)}$$
$$= \int_{\mathbf{a}_{r(k)}}^{\mathbf{b}_{r(k)}}\phi_N\{(c_{kr}^{(L_k)},\mathbf{x}_{(k)}^T)^T\,|\,\boldsymbol{\psi},\Psi\}\,d\mathbf{x}_{(k)}\quad(k = 1,\ldots,p),$$
and similarly, for $k = p+1,\ldots,N$, with the kth variable of the Z block omitted,
$$f_{(1)}^{(r)}(c_{kr}^{(L_k)}) = \int_{\mathbf{a}_{r(k)}}^{\mathbf{b}_{r(k)}}\phi_{p+q}\{(c_{kr}^{(L_k)},\mathbf{x}_{(k)}^T)^T\,|\,\boldsymbol{\psi},\Psi\}\,d\mathbf{x}_{(k)}\quad(k = p+1,\ldots,N),$$
where $L_+ = L_1+\cdots+L_N$.

Lemma 8.4 Redefine $\bar{c}_{kr}^{(L_k)} = c_{kr}^{(L_k)} - \psi_k$. Then, we have
$$E\{(\mathbf{Y}_1)_{i_1}(\mathbf{Y}_1)_{i_2}\} = a^{-1}\left\{\sum_{k=1}^{N}\sum_{r\in R}\sum_{L_k=0}^{1}(-1)^{L_k+1}\,\sigma_{d\,i_1k}\,\sigma_{d\,i_2/k}\,\bar{c}_{kr}^{(L_k)}\,\bar{f}_{(1)}^{(r)}(c_{kr}^{(L_k)})\right.$$
$$\left.+\sum_{k,l=1;\,k\ne l}^{N}\sum_{r\in R}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\,\sigma_{d\,i_1k}\,\sigma_{d\,i_2l|k}\,\bar{f}_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\right\}\quad(i_1,i_2 = 1,\ldots,p),$$
where $\sigma_{d\,i_2/k} = \sigma_{d\,i_2k}\,\psi_{kk}^{-1}$; $\sigma_{d\,i_2l|k} = \sigma_{d\,i_2l} - \sigma_{d\,i_2k}\,\psi_{kk}^{-1}\psi_{kl}$;
$$f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)}) = f_{(2)}^{(r)}\{(c_{kr}^{(L_k)},c_{lr}^{(L_l)})^T\} = \int_{\mathbf{a}_{r(k,l)}}^{\mathbf{b}_{r(k,l)}}\phi_N\{(\mathbf{c}_{kl,r}^{(L)T},\mathbf{x}_{(k,l)}^T)^T\,|\,\boldsymbol{\psi},\Psi\}\,d\mathbf{x}_{(k,l)},$$
$$\bar{f}_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)}) = \sum_{(L_k,L_l)=0}^{1}(-1)^{L_+-L_k-L_l}f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\quad(k,l = 1,\ldots,N;\ k\ne l);$$
$\int_{\mathbf{a}_{r(k,l)}}^{\mathbf{b}_{r(k,l)}}(\cdot)\,d\mathbf{x}_{(k,l)}$ is the $(N-2)$-mode multiple integral for x omitting variables $x_k$ and $x_l$; and
$$\sum_{(L_k,L_l)=0}^{1}(\cdot) = \sum_{L_1,\ldots,L_{k-1},L_{k+1},\ldots,L_{l-1},L_{l+1},\ldots,L_N=0}^{1}(\cdot)$$
when $k < l$.
Lemma 8.5
$$E\{(\mathbf{Y}_1)_{i_1}(\mathbf{Y}_1)_{i_2}(\mathbf{Y}_1)_{i_3}\} = a^{-1}\left[\sum_{k=1}^{N}\sum_{r\in R}\sum_{L_k=0}^{1}(-1)^{L_k+1}\,\sigma_{d\,i_1k}\,\sigma_{d\,i_2k}\,\sigma_{d\,i_3k}\,(\bar{c}_{kr}^{(L_k)2}\psi_{kk}^{-2}-\psi_{kk}^{-1})\,\bar{f}_{(1)}^{(r)}(c_{kr}^{(L_k)})\right.$$
$$+\sum_{k,l=1;\,k\ne l}^{N}\sum_{r\in R}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\left(\sigma_{d\,i_1k}\,\sigma_{d\,i_2/k}\,\sigma_{d\,i_3l|k}\,\bar{c}_{kr}^{(L_k)}+\sigma_{d\,i_1k}\,\sigma_{d\,i_2l|k}\,\boldsymbol{\sigma}_{d\,i_3/kl}^T\,\bar{\mathbf{c}}_{kl,r}^{(L)}\right)\bar{f}_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})$$
$$\left.+\sum_{\substack{k,l,m=1\\ k,l,m:\,\ne}}^{N}\sum_{r\in R}\sum_{L_k,L_l,L_m=0}^{1}(-1)^{L_k+L_l+L_m+1}\,\sigma_{d\,i_1k}\,\sigma_{d\,i_2l|k}\,\sigma_{d\,i_3m|kl}\,\bar{f}_{(3)}^{(r)}(\mathbf{c}_{klm,r}^{(L)})\right]\quad(i_1,i_2,i_3 = 1,\ldots,p),$$
where undefined notations are defined similarly as before.

Lemma 8.6
$$E\{(\mathbf{Y}_1)_{i_1}(\mathbf{Y}_1)_{i_2}(\mathbf{Y}_1)_{i_3}(\mathbf{Y}_1)_{i_4}\} = a^{-1}\left[\sum_{k=1}^{N}\sum_{r\in R}\sum_{L_k=0}^{1}(-1)^{L_k+1}\,\sigma_{d\,i_1k}\,\sigma_{d\,i_2k}\,\sigma_{d\,i_3k}\,\sigma_{d\,i_4k}\,(\bar{c}_{kr}^{(L_k)3}\psi_{kk}^{-3}-3\bar{c}_{kr}^{(L_k)}\psi_{kk}^{-2})\,\bar{f}_{(1)}^{(r)}(c_{kr}^{(L_k)})\right.$$
$$+\sum_{k,l=1;\,k\ne l}^{N}\sum_{r\in R}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\Big\{\sigma_{d\,i_1k}\,\sigma_{d\,i_2k}\,\sigma_{d\,i_3k}\,\sigma_{d\,i_4l|k}\,(\bar{c}_{kr}^{(L_k)2}\psi_{kk}^{-2}-\psi_{kk}^{-1})$$
$$-\sigma_{d\,i_1k}\,\sigma_{d\,i_2/k}\,\sigma_{d\,i_3l|k}\,\sigma_{d\,i_4k}-\sigma_{d\,i_1k}\,\sigma_{d\,i_2l|k}\,\boldsymbol{\sigma}_{d\,i_3/kl}^T\,\boldsymbol{\sigma}_{d\,i_4,kl}$$
$$+\left(\sigma_{d\,i_1k}\,\sigma_{d\,i_2/k}\,\sigma_{d\,i_3l|k}\,\bar{c}_{kr}^{(L_k)}+\sigma_{d\,i_1k}\,\sigma_{d\,i_2l|k}\,\boldsymbol{\sigma}_{d\,i_3/kl}^T\,\bar{\mathbf{c}}_{kl,r}^{(L)}\right)\boldsymbol{\sigma}_{d\,i_4/kl}^T\,\bar{\mathbf{c}}_{kl,r}^{(L)}\Big\}\bar{f}_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})$$
$$+\sum_{\substack{k,l,m=1\\ k,l,m:\,\ne}}^{N}\sum_{r\in R}\sum_{L_k,L_l,L_m=0}^{1}(-1)^{L_k+L_l+L_m+1}\Big\{\left(\sigma_{d\,i_1k}\,\sigma_{d\,i_2/k}\,\sigma_{d\,i_3l|k}\,\bar{c}_{kr}^{(L_k)}+\sigma_{d\,i_1k}\,\sigma_{d\,i_2l|k}\,\boldsymbol{\sigma}_{d\,i_3/kl}^T\,\bar{\mathbf{c}}_{kl,r}^{(L)}\right)\sigma_{d\,i_4m|kl}$$
$$+\sigma_{d\,i_1k}\,\sigma_{d\,i_2l|k}\,\sigma_{d\,i_3m|kl}\,\boldsymbol{\sigma}_{d\,i_4/klm}^T\,\bar{\mathbf{c}}_{klm,r}^{(L)}\Big\}\bar{f}_{(3)}^{(r)}(\mathbf{c}_{klm,r}^{(L)})$$
$$\left.+\sum_{\substack{k,l,m,n=1\\ k,l,m,n:\,\ne}}^{N}\sum_{r\in R}\sum_{L_k,L_l,L_m,L_n=0}^{1}(-1)^{L_k+L_l+L_m+L_n}\,\sigma_{d\,i_1k}\,\sigma_{d\,i_2l|k}\,\sigma_{d\,i_3m|kl}\,\sigma_{d\,i_4n|klm}\,\bar{f}_{(4)}^{(r)}(\mathbf{c}_{klmn,r}^{(L)})\right]\quad(i_1,\ldots,i_4 = 1,\ldots,p),$$
where undefined notations are defined similarly as before.
In Lemmas 8.5 and 8.6, asymmetric expressions for associated variables are used. The above lemmas can be used to obtain the moments and cumulants up to the fourth order as in Theorems 1.4 and 1.5.
8.4.3 The Case of the Sectionally Truncated SN with p = q = 1

Among the cases of the TPN, when p = q = 1 with $\mathbf{X} = (Y,Z)^T$, $\boldsymbol{\psi} = \mathbf{0}$,
$$\Psi = \begin{pmatrix}\Sigma & \Sigma\Delta^T\\ \Delta\Sigma & \Omega\end{pmatrix},\quad \Sigma = \sigma^2 = 1,\quad \Delta = -\lambda,\quad D = \delta = 1,\quad \Omega = \omega_x^2 = 1+\lambda^2,$$
$$\Psi = \begin{pmatrix}1 & -\lambda\\ -\lambda & 1+\lambda^2\end{pmatrix},\quad \Psi^{-1} = \begin{pmatrix}1+\lambda^2 & \lambda\\ \lambda & 1\end{pmatrix},\quad |\Psi| = 1,$$
$A_Y = \mathbf{a}_Y^T$, $B_Y = \mathbf{b}_Y^T$, $A_Z = a_Z = -\infty$ and $B_Z = b_Z = 0$, we have the sectionally truncated skew-normal (TSN).

Remark 8.7 The pdf of the TSN is proportional to that of the SN, and is given by definition as
$$\frac{\phi(y)\Phi(\lambda y)}{\Pr(\mathbf{X}\in S\,|\,\mathbf{0},\Psi)} = \frac{\phi(y)\Phi(\lambda y)}{\sum_{r=1}^{R_Y}\int_{a_{Yr}}^{b_{Yr}}\phi(y)\int_{-\infty}^{0}\phi(z\,|\,{-\lambda}y,1)\,dz\,dy} = \frac{\phi(y)\Phi(\lambda y)}{\int_{\mathbf{a}_Y^T}^{\mathbf{b}_Y^T}\phi(y)\int_{-\infty}^{0}\phi(z\,|\,{-\lambda}y,1)\,dz\,dy} = a^{-1}\phi(y)\Phi(\lambda y),$$
and the mgf is given as a special case of Theorem 8.1:
$$M_Y(t) = a^{-1}\Pr\{(Y+t,\,Z-\lambda t)^T\in S\,|\,\mathbf{0},\Psi\}\exp(t^2/2).$$

Remark 8.8 While moments are obtained by using the pdf, the expressions are generally given by integral forms, as is the normalizer, with $A_Z = a_Z = -\infty$ and $B_Z = b_Z = 0$. For actual computation, some numerical integration or series expressions (recall Theorem 8.5) are required. For expository purposes, moments are obtained using the mgf of a pseudo variable. Let $Y = Y_1 + Y_0$, where $Y_1$ is the pseudo random variable with the pseudo mgf $M_{Y_1}(t) = a^{-1}\Pr\{(Y+t,\,Z-\lambda t)^T\in S\,|\,\mathbf{0},\Psi\}$ used earlier and $Y_0\sim N(0,1)$ independent of $Y_1$. Then,
$$E(Y_1) = \frac{d}{dt}a^{-1}\Pr\{(Y+t,\,Z-\lambda t)^T\in S\,|\,\mathbf{0},\Psi\}\Big|_{t=0}$$
$$= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}\frac{d}{dt}\phi_2\{(y-t,\,z+\lambda t)^T\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\,dz\,dy$$
$$= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^T\,\phi_2\{(y-t,\,z+\lambda t)^T\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\,dz\,dy.$$
In the above result, using
$$(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^T\big|_{t=0} = (1,-\lambda)\begin{pmatrix}1+\lambda^2 & \lambda\\ \lambda & 1\end{pmatrix}(y,z)^T = y,$$
we have
$$E(Y_1) = a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}y\,\phi_2\{(y,z)^T\,|\,\mathbf{0},\Psi\}\,dz\,dy = a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}y\,\phi(y)\int_{-\infty}^{0}\phi(z\,|\,{-\lambda}y,1)\,dz\,dy$$
$$= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}y\,\phi(y)\Phi(\lambda y)\,dy = E(Y),$$
which is expected since $E(Y) = E(Y_1) + E(Y_0) = E(Y_1)$.
$$E(Y_1^2) = \frac{d^2}{dt^2}a^{-1}\Pr\{(Y+t,\,Z-\lambda t)^T\in S\,|\,\mathbf{0},\Psi\}\Big|_{t=0}$$
$$= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}\frac{d}{dt}\Big[(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^T\,\phi_2\{(y-t,\,z+\lambda t)^T\,|\,\mathbf{0},\Psi\}\Big]\Big|_{t=0}\,dz\,dy$$
$$= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}\Big[-(1,-\lambda)\Psi^{-1}(1,-\lambda)^T + \{(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^T\}^2\Big]\phi_2\{(y-t,\,z+\lambda t)^T\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\,dz\,dy$$
$$= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}\{-(1,-\lambda)\Psi^{-1}(1,-\lambda)^T + y^2\}\,\phi_2\{(y,z)^T\,|\,\mathbf{0},\Psi\}\,dz\,dy = -1 + E(Y^2),$$
where
$$(1,-\lambda)\Psi^{-1}(1,-\lambda)^T = (1,-\lambda)\begin{pmatrix}1+\lambda^2 & \lambda\\ \lambda & 1\end{pmatrix}(1,-\lambda)^T = 1$$
is used. Then, $\mathrm{var}(Y_1) = -1 + E(Y^2) - E^2(Y_1) = -1 + E(Y^2) - E^2(Y) = \mathrm{var}(Y) - 1$, which is expected since $\mathrm{var}(Y_1) = \mathrm{var}(Y) - \mathrm{var}(Y_0) = \mathrm{var}(Y) - 1$, where
$$\mathrm{var}(Y) = a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}y^2\,\phi(y)\Phi(\lambda y)\,dy - \left\{a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}y\,\phi(y)\Phi(\lambda y)\,dy\right\}^2.$$
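The one-dimensional integral expressions above are easy to evaluate by quadrature. The following sketch (ad hoc function names; composite Simpson's rule on fixed grids) computes the normalizer a and the first two TSN moments; with an effectively untruncated interval it reproduces the known SN values, e.g. $E(Y) = \sqrt{2/\pi}\,\lambda/\sqrt{1+\lambda^2}$ and $E(Y^2) = 1$.

```python
import math

def phi(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def Phi(y):
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def simpson(f, lo, hi, m=2000):
    # composite Simpson's rule with 2*m panels
    h = (hi - lo) / (2 * m)
    s = f(lo) + f(hi)
    s += 4.0 * sum(f(lo + (2 * i + 1) * h) for i in range(m))
    s += 2.0 * sum(f(lo + 2 * i * h) for i in range(1, m))
    return s * h / 3.0

def tsn_moments(lam, intervals):
    # normalizer a and E(Y), E(Y^2) of the sectionally truncated SN with
    # density proportional to phi(y)*Phi(lam*y) on the union of intervals
    a = sum(simpson(lambda y: phi(y) * Phi(lam * y), lo, hi)
            for lo, hi in intervals)
    m1 = sum(simpson(lambda y: y * phi(y) * Phi(lam * y), lo, hi)
             for lo, hi in intervals) / a
    m2 = sum(simpson(lambda y: y * y * phi(y) * Phi(lam * y), lo, hi)
             for lo, hi in intervals) / a
    return a, m1, m2
```

For example, `tsn_moments(1.0, [(-8.0, 8.0)])` returns values near $(1/2,\,1/\sqrt{\pi},\,1)$, while truncating to $(0,\,8)$ raises the mean, as expected.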
Remark 8.9 We obtain the third and fourth moments/cumulants using the mgf.
$$E(Y_1^3) = \frac{d^3}{dt^3}a^{-1}\Pr\{(Y+t,\,Z-\lambda t)^T\in S\,|\,\mathbf{0},\Psi\}\Big|_{t=0}$$
$$= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}\frac{d}{dt}\Big[-(1,-\lambda)\Psi^{-1}(1,-\lambda)^T + \{(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^T\}^2\Big]\phi_2\{(y-t,\,z+\lambda t)^T\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\,dz\,dy$$
$$= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}\Big[-3(1,-\lambda)\Psi^{-1}(1,-\lambda)^T\,(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^T + \{(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^T\}^3\Big]\phi_2\{\cdot\}\Big|_{t=0}\,dz\,dy$$
$$= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}(-3y+y^3)\,\phi_2\{(y,z)^T\,|\,\mathbf{0},\Psi\}\,dz\,dy,$$
which gives
$$\kappa_3(Y_1) = E[\{Y_1-E(Y_1)\}^3] = E(Y_1^3) - 3E(Y_1^2)E(Y_1) + 2E^3(Y_1)$$
$$= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}(-3y+y^3)\,\phi_2\{(y,z)^T\,|\,\mathbf{0},\Psi\}\,dz\,dy - 3\{-1+E(Y^2)\}E(Y) + 2E^3(Y)$$
$$= E(Y^3) - 3E(Y^2)E(Y) + 2E^3(Y) = E[\{Y-E(Y)\}^3] = \kappa_3(Y),$$
as is expected since $Y_0\sim N(0,1)$ in $Y = Y_1+Y_0$ does not contribute to the cumulants of Y beyond the second order. The fourth moments/cumulants are also given via the mgf for confirmation.
EðY1 4 Þ ¼
Y
ar Y
1
i þ fð1; kÞW1 ðy t; z þ ktÞT g3 /2 fðy t; z þ ktÞT j0; Wgjt¼0 dz dy ¼a
1
br 0 RY Z Y Z h X 3fð1; kÞW1 ð1; kÞT g2 rY ¼1
ar Y
1
þ 6ð1; kÞW1 ð1; kÞT fð1; kÞW1 ðy t; z þ ktÞT g2 i þ fð1; kÞW1 ðy t; z þ ktÞT g4 /2 fðy t; z þ ktÞT j0; Wgjt¼0 dz dy ¼a
1
br 0 RY Z Y Z X rY ¼1
ar Y
Then, we have
1
ð3 6y2 þ y4 Þ /2 fðy; zÞT j0; Wgdz dy ¼ 3 6EðY 2 Þ þ EðY 4 Þ:
8.4 Moments and Cumulants of the TPN
261
j4 ðY1 Þ ¼ E[f Y1 EðY1 Þg4 3var2 ðY1 Þ ¼ E(Y1 4 Þ 4E(Y1 3 ÞEðY1 Þ þ 6E(Y1 2 ÞE2 ðY1 Þ 3E4 ðY1 Þ ¼ 3 6EðY 2 Þ þ EðY 4 Þ 4f3EðYÞ þ EðY 3 ÞgEðYÞ þ 6f1 þ EðY 2 ÞgE2 ðYÞ 3E4 ðYÞ 3fEðY 2 Þ E2 ðYÞ 1g2 ¼ EðY 4 Þ 4EðY 3 ÞEðYÞ þ 6EðY 2 ÞE2 ðYÞ 3E4 ðYÞ 3fEðY 2 Þ E2 ðYÞg2 ¼ E[f Y EðYÞg4 3var2 ðYÞ ¼ j4 ðYÞ; which is expected as in the third cumulant. The results of Remarks 8.8 and 8.9 partially support the mgfs of Theorem 8.1 and Remark 8.7. Remark 8.10 In this subsection, the TSN has been dealt with considering the familiarity of the SN as well as simplicity. However, when we use intervals for selection other than AZ ¼ aZ ¼ 1, BZ ¼ bZ ¼ 0 and RZ ¼ 1, the corresponding results are similarly obtained. That is, using 1 RZ vectors AZ ¼ aTZ and BZ ¼ bTZ R0 with arbitrary positive integer RZ , the results are given when 1 ðÞdz in Remarks R bT Rb P 8.7–8.9 is replaced by RrZZ¼1 arr Z ðÞdz aTZ ðÞdz. Note also that in Remarks 8.7– Z
Z
8.9, a form of the mgf of Theorem 8.1 using Z as well as Y is used for illustration. However, the form can be simplified. In the case of the TSN, the pdf shown in Remark 8.6 becomes /ðyÞUðkyÞ /ðyÞUðkyÞ ¼ R bT R 0 PrðX 2 Sj 0; WÞ Y /ðyÞ/ðzj ky; 1Þdz dy aT 1 Y
/ðyÞUðkyÞ ¼ R bT ¼ a1 /ðyÞUðkyÞ: Y /ðyÞUðkyÞdy aT Y
Consequently, the mgf is simplified as: ZbY
T
MY ðtÞ ¼ EfexpðtYÞg ¼ a1
expðtyÞ/ðyÞUðkyÞdy aTY
ZbY
T
¼ a1
/ðy tÞUðkyÞdy expðt2 =2Þ; aTY
where MY1 ðtÞ ¼ a
1
T b RY
/ðy tÞUðkyÞdy and MY0 ðtÞ ¼ expðt2 =2Þ. It is confirmed
aTY
that these simplified forms give the same results as shown in Remarks 8.8 and 8.9.
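The simplification rests on the pointwise identity $e^{ty}\phi(y) = \phi(y-t)\,e^{t^2/2}$, so the truncated-SN mgf factors into the pseudo part and $M_{Y_0}(t) = e^{t^2/2}$. A quick numerical confirmation (ad hoc names; composite Simpson quadrature):

```python
import math

def phi(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def Phi(y):
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def simpson(f, lo, hi, m=2000):
    h = (hi - lo) / (2 * m)
    s = f(lo) + f(hi)
    s += 4.0 * sum(f(lo + (2 * i + 1) * h) for i in range(m))
    s += 2.0 * sum(f(lo + 2 * i * h) for i in range(1, m))
    return s * h / 3.0

def mgf_direct(t, lam, lo, hi):
    # a^{-1} * integral of e^{ty} phi(y) Phi(lam*y) over one interval
    a = simpson(lambda y: phi(y) * Phi(lam * y), lo, hi)
    return simpson(lambda y: math.exp(t * y) * phi(y) * Phi(lam * y), lo, hi) / a

def mgf_factored(t, lam, lo, hi):
    # a^{-1} * [integral of phi(y - t) Phi(lam*y)] * exp(t^2 / 2)
    a = simpson(lambda y: phi(y) * Phi(lam * y), lo, hi)
    part = simpson(lambda y: phi(y - t) * Phi(lam * y), lo, hi)
    return part * math.exp(0.5 * t * t) / a
```

The two routines agree to machine precision for any interval, since the integrands are identical after completing the square in the exponent.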
8.5 The Truncated Normal-Normal Distribution

In Chap. 6, the normal-normal (NN) distribution was introduced, which can be used as an approximation to the corresponding PN distribution. In this section, the truncated counterpart of the NN is obtained, which can be seen as an approximation to the TPN. Recall that the pdf of the NN vector is
$$f_{Y,NN}(\mathbf{y}) = f_{p,q,R}(\mathbf{y}\,|\,\boldsymbol{\mu},\Sigma,\Delta,\boldsymbol{\eta},D,C) = \frac{\phi_p(\mathbf{y}\,|\,\boldsymbol{\mu},\Sigma)\sum_{r=1}^{R}\phi_q\{\mathbf{c}_r\,|\,\boldsymbol{\eta}+\Delta(\mathbf{y}-\boldsymbol{\mu}),D\}}{\sum_{r=1}^{R}\phi_q(\mathbf{c}_r\,|\,\boldsymbol{\eta},\Omega)}$$
$$= \frac{\sum_{r=1}^{R}\phi_q(\mathbf{c}_r\,|\,\boldsymbol{\eta},\Omega)\,\phi_p\{\mathbf{y}\,|\,\boldsymbol{\mu}+\Sigma_*\Delta^TD^{-1}(\mathbf{c}_r-\boldsymbol{\eta}),\,\Sigma_*\}}{\sum_{r=1}^{R}\phi_q(\mathbf{c}_r\,|\,\boldsymbol{\eta},\Omega)}$$
(Definition 6.1, Remark 6.5), where the last expression is a mean mixture of normal densities with the untruncated support $S = \mathbb{R}^p$. Suppose that the NN-distributed Y is sectionally truncated with the support $S = \bigcup_{r=1}^{R}\{\mathbf{a}_r\le\mathbf{y}<\mathbf{b}_r\}$. Then, we have the following set of distributions.

Definition 8.2 The truncated normal-normal (TNN) family of distributions is defined when the $p\times 1$ random vector Y takes the following pdf at y:
$$f_{Y,TNN}(\mathbf{y}) = f_{p,q,R}(\mathbf{y}\,|\,\boldsymbol{\mu},\Sigma,\Delta,\boldsymbol{\eta},D,C,A,B) = \frac{\phi_p(\mathbf{y}\,|\,\boldsymbol{\mu},\Sigma)\sum_{r=1}^{R}\phi_q\{\mathbf{c}_r\,|\,\boldsymbol{\eta}+\Delta(\mathbf{y}-\boldsymbol{\mu}),D\}}{\sum_{r=1}^{R}\phi_q(\mathbf{c}_r\,|\,\boldsymbol{\eta},\Omega)\Pr\{\mathbf{Y}\in S\,|\,\boldsymbol{\mu}+\Sigma_*\Delta^TD^{-1}(\mathbf{c}_r-\boldsymbol{\eta}),\,\Sigma_*\}}$$
$$= \frac{\sum_{r=1}^{R}\phi_q(\mathbf{c}_r\,|\,\boldsymbol{\eta},\Omega)\,\phi_p\{\mathbf{y}\,|\,\boldsymbol{\mu}+\Sigma_*\Delta^TD^{-1}(\mathbf{c}_r-\boldsymbol{\eta}),\,\Sigma_*\}}{\sum_{r=1}^{R}\phi_q(\mathbf{c}_r\,|\,\boldsymbol{\eta},\Omega)\Pr\{\mathbf{Y}\in S\,|\,\boldsymbol{\mu}+\Sigma_*\Delta^TD^{-1}(\mathbf{c}_r-\boldsymbol{\eta}),\,\Sigma_*\}}$$
$$= a^{-1}\sum_{r=1}^{R}\phi_q(\mathbf{c}_r\,|\,\boldsymbol{\eta},\Omega)\,\phi_p\{\mathbf{y}\,|\,\boldsymbol{\mu}+\Sigma_*\Delta^TD^{-1}(\mathbf{c}_r-\boldsymbol{\eta}),\,\Sigma_*\},$$
where $A = (\mathbf{a}_1,\ldots,\mathbf{a}_R)$ and $B = (\mathbf{b}_1,\ldots,\mathbf{b}_R)$. This distribution is denoted by $\mathbf{Y}\sim\mathrm{TNN}_{p,q,R}(\boldsymbol{\mu},\Sigma,\Delta,\boldsymbol{\eta},D,C,A,B)$.

Proof of the normalizer The equality of the two expressions of the pdf is given as in the NN (see Remark 6.5). The added factor in each term in the denominator of the pdfs is given by definition. Q.E.D.
Note that the normalizer is
$$a = \sum_{r=1}^{R}\phi_q(\mathbf{c}_r\,|\,\boldsymbol{\eta},\Omega)\Pr\{\mathbf{Y}\in S\,|\,\boldsymbol{\mu}+\Sigma_*\Delta^TD^{-1}(\mathbf{c}_r-\boldsymbol{\eta}),\,\Sigma_*\}$$
$$= \sum_{r=1}^{R}\phi_q(\mathbf{c}_r\,|\,\boldsymbol{\eta},\Omega)\int_{\mathbf{a}_r}^{\mathbf{b}_r}\phi_p\{\mathbf{y}\,|\,\boldsymbol{\mu}+\Sigma_*\Delta^TD^{-1}(\mathbf{c}_r-\boldsymbol{\eta}),\,\Sigma_*\}\,d\mathbf{y}.$$
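For p = q = 1, the Bayes-type identity behind Definition 8.2 and the normalizer can be checked numerically. The concrete forms $\Sigma_* = (\Sigma^{-1}+\Delta^TD^{-1}\Delta)^{-1}$ and component mean $\mu+\Sigma_*\delta(c_r-\eta)/d$ used below are the standard conditional-normal quantities and are an assumption of this sketch (they should match the Chap. 6 definitions); all names are ad hoc.

```python
import math

def npdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def ncdf(x, mean, var):
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

# p = q = 1 TNN ingredients (illustrative parameter values)
mu, sig2, delta, eta, d = 0.3, 1.0, 0.8, 0.0, 0.5
omega = d + delta * delta * sig2                 # marginal variance Omega
sig_star = 1.0 / (1.0 / sig2 + delta * delta / d)  # assumed Sigma_* form

def component_mean(c):
    return mu + sig_star * delta * (c - eta) / d

def lhs(y, c):
    # phi(y | mu, Sigma) * phi(c | eta + delta*(y - mu), d)
    return npdf(y, mu, sig2) * npdf(c, eta + delta * (y - mu), d)

def rhs(y, c):
    # phi(c | eta, Omega) * phi(y | mu_c, Sigma_*): the mean-mixture form
    return npdf(c, eta, omega) * npdf(y, component_mean(c), sig_star)

def normalizer(cs, intervals):
    # a = sum_r phi(c_r | eta, Omega) * Pr{Y in (a_r, b_r) | mu_r, Sigma_*},
    # following the displayed per-interval form
    return sum(npdf(c, eta, omega)
               * (ncdf(b, component_mean(c), sig_star)
                  - ncdf(a, component_mean(c), sig_star))
               for c, (a, b) in zip(cs, intervals))
```

The pointwise equality of `lhs` and `rhs` is exactly the completion-of-the-square step that justifies the two equivalent pdf expressions.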
Theorem 8.6 The mgf of $\mathbf{Y}\sim\mathrm{TNN}_{p,q,R}(\boldsymbol{\mu},\Sigma,\Delta,\boldsymbol{\eta},D,C,A,B)$ is
$$M_Y(\mathbf{t}) = a^{-1}\sum_{r=1}^{R}\phi_q(\mathbf{c}_r\,|\,\boldsymbol{\eta}+\Delta\Sigma\mathbf{t},\Omega)\Pr\{\mathbf{Y}\in S\,|\,\boldsymbol{\mu}+\Sigma\mathbf{t}+\Sigma_*\Delta^TD^{-1}(\mathbf{c}_r-\boldsymbol{\eta}-\Delta\Sigma\mathbf{t}),\,\Sigma_*\}\exp\left(\boldsymbol{\mu}^T\mathbf{t}+\frac{1}{2}\mathbf{t}^T\Sigma\mathbf{t}\right).$$

Proof
$$M_Y(\mathbf{t}) = a^{-1}\sum_{r=1}^{R}\int_{\mathbf{a}_r}^{\mathbf{b}_r}\exp(\mathbf{t}^T\mathbf{y})\,\phi_p(\mathbf{y}\,|\,\boldsymbol{\mu},\Sigma)\,\phi_q\{\mathbf{c}_r\,|\,\boldsymbol{\eta}+\Delta(\mathbf{y}-\boldsymbol{\mu}),D\}\,d\mathbf{y}$$
$$= a^{-1}\sum_{r=1}^{R}\phi_q(\mathbf{c}_r\,|\,\boldsymbol{\eta}+\Delta\Sigma\mathbf{t},\Omega)\Pr\{\mathbf{Y}\in S\,|\,\boldsymbol{\mu}+\Sigma\mathbf{t}+\Sigma_*\Delta^TD^{-1}(\mathbf{c}_r-\boldsymbol{\eta}-\Delta\Sigma\mathbf{t}),\,\Sigma_*\}\exp\left(\boldsymbol{\mu}^T\mathbf{t}+\frac{1}{2}\mathbf{t}^T\Sigma\mathbf{t}\right)$$
$$\times\frac{\sum_{r=1}^{R}\int_{\mathbf{a}_r}^{\mathbf{b}_r}\phi_p(\mathbf{y}\,|\,\boldsymbol{\mu}+\Sigma\mathbf{t},\Sigma)\,\phi_q\{\mathbf{c}_r\,|\,\boldsymbol{\eta}+\Delta\Sigma\mathbf{t}+\Delta(\mathbf{y}-\boldsymbol{\mu}-\Sigma\mathbf{t}),D\}\,d\mathbf{y}}{\sum_{r=1}^{R}\phi_q(\mathbf{c}_r\,|\,\boldsymbol{\eta}+\Delta\Sigma\mathbf{t},\Omega)\Pr\{\mathbf{Y}\in S\,|\,\boldsymbol{\mu}+\Sigma\mathbf{t}+\Sigma_*\Delta^TD^{-1}(\mathbf{c}_r-\boldsymbol{\eta}-\Delta\Sigma\mathbf{t}),\,\Sigma_*\}}$$
$$= a^{-1}\sum_{r=1}^{R}\phi_q(\mathbf{c}_r\,|\,\boldsymbol{\eta}+\Delta\Sigma\mathbf{t},\Omega)\Pr\{\mathbf{Y}\in S\,|\,\boldsymbol{\mu}+\Sigma\mathbf{t}+\Sigma_*\Delta^TD^{-1}(\mathbf{c}_r-\boldsymbol{\eta}-\Delta\Sigma\mathbf{t}),\,\Sigma_*\}\exp\left(\boldsymbol{\mu}^T\mathbf{t}+\frac{1}{2}\mathbf{t}^T\Sigma\mathbf{t}\right).$$
Q.E.D.

As for the pdf, the mgf includes cdfs, giving less tractable results than those of the NN.
References

1. Bollen KA (1989) Structural equations with latent variables. Wiley, New York
2. DLMF (2020) Olver FWJ, Olde Daalhuis AB, Lozier DW, Schneider BI, Boisvert RF, Clark CW, Miller BR, Saunders BV, Cohl HS, McClain MA (eds) NIST Digital Library of Mathematical Functions. National Institute of Standards and Technology, U.S. Department of Commerce. http://dlmf.nist.gov/, Release 1.0.26 of 2020-03-15
3. Flecher C, Allard D, Naveau P (2010) Truncated skew-normal distributions: moments, estimation by weighted moments and application to climate data. METRON—Int J Stat LXVIII:331–345
4. Gorsuch RL (2014) Factor analysis, Classic edn. Psychology Press, New York
5. Kamat AR (1958) Hypergeometric expansions for incomplete moments of the bivariate normal distribution. Sankhyā 20:317–320
6. Kim H-J (2004) A family of truncated skew-normal distributions. Korean Commun Stat 11:265–274
7. Krenek R, Cha J, Cho BR, Sharp JL (2017) Development of statistical convolutions of truncated normal and truncated skew normal distributions with applications. J Stat Theory Pract 11:1–25
8. Nabeya S (1951) Absolute moments in 2-dimensional normal distribution. Ann Inst Stat Math 3:2–6
9. Ogasawara H (2021) A non-recursive formula for various moments of the multivariate normal distribution with sectional truncation. J Multivar Anal (online published). https://doi.org/10.1016/j.jmva.2021.104729
10. Winkelbauer A (2014) Moments and absolute moments of the normal distribution. arXiv:1209.4340v2 [math.ST] 15 Jul 2014
11. Zwillinger D (2015) Table of integrals, series and products, 8th edn. Translated from Russian by Scripta Technica Inc. Elsevier, Amsterdam. The original Russian text by Gradshteyn IS, Ryzhik IM (1962) Fiziko-Matematicheskoy Literatury, Moscow
Chapter 9
The Student t- and Pseudo t- (PT) Distributions: Various Expressions of Mixtures

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
H. Ogasawara, Expository Moments for Pseudo Distributions, Behaviormetrics: Quantitative Approaches to Human Behavior 2, https://doi.org/10.1007/978-981-19-3525-1_9

9.1 Introduction

So far, the pseudo-normal distributions and associated results have been dealt with. This family of distributions gives varieties of pdfs based on various types of truncation, including single, double and inner truncation as special cases. Non-normal versions of the pseudo distributions can also be similarly considered, which should further extend the varieties of distributions. Recall that the pdf of the pseudo-normally distributed vector $\mathbf{Y}\sim\mathrm{PN}_{p,q,R}(\boldsymbol{\mu},\Sigma,\Delta,\boldsymbol{\eta},D,A,B)$ is
$$f_{p,q,R}(\mathbf{y}\,|\,\boldsymbol{\mu},\Sigma,\Delta,\boldsymbol{\eta},D,A,B) = \frac{\phi_p(\mathbf{y}\,|\,\boldsymbol{\mu},\Sigma)}{\Pr(\mathbf{Z}\in S\,|\,\boldsymbol{\eta},\Omega)}\Pr\{\mathbf{Z}\in S\,|\,\boldsymbol{\eta}+\Delta(\mathbf{y}-\boldsymbol{\mu}),D\},$$
where the definitions of the notations are given in Definition 4.1. We can replace $\phi_p(\mathbf{y}\,|\,\boldsymbol{\mu},\Sigma)$ by a non-normal version generically denoted by $\psi_p(\mathbf{y}\,|\,\boldsymbol{\mu},\boldsymbol{\theta}_Y)$, where $\boldsymbol{\theta}_Y$ is a vector of parameters for the random vector Y other than the location parameter vector $\boldsymbol{\mu}$. Similarly, $\Pr\{\mathbf{Z}\in S\,|\,\boldsymbol{\eta}+\Delta(\mathbf{y}-\boldsymbol{\mu}),D\}$ and $\Pr(\mathbf{Z}\in S\,|\,\boldsymbol{\eta},\Omega)$ can be replaced by non-normal counterparts denoted by $\Pr\{\mathbf{Z}\in S\,|\,\boldsymbol{\eta}+\Delta(\mathbf{y}-\boldsymbol{\mu}),\boldsymbol{\theta}_Z\}$ and $\Pr(\mathbf{Z}\in S\,|\,\boldsymbol{\omega})$, respectively, where $\boldsymbol{\theta}_Z$ is defined similarly to $\boldsymbol{\theta}_Y$ and $\boldsymbol{\omega}$ is the vector of all the parameters including $\boldsymbol{\mu},\boldsymbol{\theta}_Y,\Delta,\boldsymbol{\eta}$ and $\boldsymbol{\theta}_Z$, though $\Pr(\mathbf{Z}\in S\,|\,\boldsymbol{\omega})$ may not depend on some of the elements of $\boldsymbol{\omega}$, as in the normal case, which does not depend on $\boldsymbol{\mu}$. This reformulation gives the family of pseudo normal/non-normal distributions:
$$g_{p,q,R}(\mathbf{y}\,|\,\boldsymbol{\mu},\boldsymbol{\theta}_Y,\Delta,\boldsymbol{\eta},\boldsymbol{\theta}_Z,A,B) = \frac{\psi_p(\mathbf{y}\,|\,\boldsymbol{\mu},\boldsymbol{\theta}_Y)}{\Pr(\mathbf{Z}\in S\,|\,\boldsymbol{\omega})}\Pr\{\mathbf{Z}\in S\,|\,\boldsymbol{\eta}+\Delta(\mathbf{y}-\boldsymbol{\mu}),\boldsymbol{\theta}_Z\}.$$

Note that the type of the non-normal distribution of Y before truncation, i.e., $\psi_p(\mathbf{y}\,|\,\boldsymbol{\mu},\boldsymbol{\theta}_Y)$, can be different from that for Z. When the truncation for Z is restricted to single truncation, possibly non-normal skew distributions, known as the skew-symmetric family (see Gupta, Chang and Huang [11]; Genton [8]), have been used by Nadarajah and Kotz [21], Gupta [9], Gupta and Chang [10], Azzalini and Capitanio [1], Nadarajah and Ali [19], Nadarajah and Gupta [20], Fung and Seneta [6], Kollo and Pettere [16], Joe and Li [12], Lachos, Garay and Cabral [18], Kollo, Käärik and Selart [15] and Galarza, Matos, Castro and Lachos [7], among others. In these papers, one of the well-investigated skew non-normal distributions is the skew t-distribution with its multivariate version. In this chapter, various expressions of the mixtures for the pdfs of the t-distribution and its multivariate version are also presented as preliminaries.
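The latent-truncation structure of $g_{p,q,R}$ suggests a generic sampler: draw y from $\psi_p$, then accept it with probability $\Pr\{\mathbf{Z}\in S\,|\,\boldsymbol{\eta}+\Delta(\mathbf{y}-\boldsymbol{\mu}),\boldsymbol{\theta}_Z\}$, so that accepted draws have density proportional to the numerator. The sketch below (ad hoc names) does this for the simplest normal case p = q = 1 with single truncation $Z < 0$ and $\Delta = -\lambda$, which yields exactly the SN pdf $2\phi(y)\Phi(\lambda y)$.

```python
import math
import random

def sn_sample_by_latent_truncation(lam, size, rng):
    # accept y ~ N(0,1) with probability Pr(Z < 0 | -lam*y, 1) = Phi(lam*y);
    # accepted draws follow the skew-normal density 2*phi(y)*Phi(lam*y)
    out = []
    while len(out) < size:
        y = rng.gauss(0.0, 1.0)
        accept_prob = 0.5 * (1.0 + math.erf(lam * y / math.sqrt(2.0)))
        if rng.random() < accept_prob:
            out.append(y)
    return out

rng = random.Random(12345)
draws = sn_sample_by_latent_truncation(1.0, 200000, rng)
mean = sum(draws) / len(draws)
# SN(0, 1, lam) mean: sqrt(2/pi) * lam / sqrt(1 + lam^2)
target = math.sqrt(2.0 / math.pi) / math.sqrt(2.0)
```

With this seed, `mean` lands within a few thousandths of `target` (about 0.564), and the second raw moment is near 1, as for any SN with unit scale.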
9.2 The t-Distribution

The pdf of the Student t-distribution with $\nu$ degrees of freedom (df), using location and scale parameters $\mu$ and $\sigma$, respectively, is given by
$$g(z\,|\,\mu,\sigma,\nu) = \frac{\Gamma\{(\nu+1)/2\}}{\Gamma(\nu/2)\sqrt{\pi}\sqrt{\nu}\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2} = \frac{1}{B(\nu/2,1/2)\sqrt{\nu}\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2}$$
$$(\nu>0;\ -\infty<\mu<\infty;\ \sigma>0),$$
where $\nu$ is typically a positive integer, but can be positively real-valued, and $B(\nu/2,1/2)$ is the beta function. When $\mu = 0$ and $\sigma = 1$, we have the standard t-distribution.

It is known that when $X\sim N(0,1)$ and Y is chi-square distributed with $\nu$ df independent of X, $Z = X/\sqrt{Y/\nu}$ is t-distributed with the pdf $g(z\,|\,0,1,\nu)$ given above. Since in many textbooks the derivation of $g(z\,|\,0,1,\nu)$ is omitted, for expository purposes we give it in four similar ways:

Lemma 9.1 The pdf of the standard t-distribution is derived from any one of the composite variables $\sqrt{\nu}X/\sqrt{Y}$, $\sqrt{\nu}\sqrt{Y}X$, $\sqrt{\nu}X/Y$ and $\sqrt{\nu}YX$, when $X\sim N(0,1)$ and Y is chi-square, inverse chi-square, chi and inverse-chi distributed with $\nu$ df independent of X, respectively.
9.2 The t-Distribution
267
Proof 1 First, we derive the case of the inverse-chi distribution, whose pdf is given pffiffiffiffiffiffi when Y ¼ 1= Y and Y is chi-square or equivalently Gamma ðm=2; 1=2Þ distributed with 1/2 being the rate parameter: yðm=2Þ1 y gC ðY ¼ y jm=2; 1=2Þ ¼ gv2 ðY ¼ y jmÞ ¼ m=2 exp : 2 2 Cðm=2Þ Using dy =dy ¼ 2y3 and substituting y ¼ y2 in the above pdf, we obtain the pdf of the inverse-chi: ym1 1 gv1 ðY ¼ yjmÞ ¼ ðm=2Þ1 exp 2 : 2y 2 Cðm=2Þ Since X and Y are independent, the pdf of the joint distribution of X and Y is /ðX ¼ xÞgv1 ðY ¼ yjmÞ 2 1 x ym1 1 ¼ pffiffiffiffiffiffi exp exp 2 : 2y 2 2ðm=2Þ1 Cðm=2Þ 2p We employ the variable transformation from (X and Y) to (Z ¼ where dx=dz dx=dy det dy=dz dy=dy
!
pffiffiffi mYX and Y),
pffiffiffi pffiffiffi ! 1=ð myÞ z=ð my2 Þ pffiffiffi ¼ 1=ð myÞ: ¼ det 0 1
pffiffiffi Then, the pdf of the joint distribution of Z and Y using x ¼ z=ð myÞ and unchanged y with the Jacobian becomes m1 pffiffiffi 1 z2 y =ð myÞ 1 pffiffiffiffiffiffi exp exp 2y2 2my2 2ðm=2Þ1 Cðm=2Þ 2p ym2 z2 ¼ pffiffiffiffiffiffipffiffiffi exp 1 þ =ð2y2 Þ m 2p m2ðm=2Þ1 Cðm=2Þ ðm þ 1Þ=2 2 1 þ zm Cfðm þ 1Þ=2g 1 pffiffiffipffiffiffi ¼ 2 1=2 p mCðm=2Þ ðm þ 1Þ=21 2 Cfðm þ 1Þ=2g 1 þ zm ( 1=2 )m2 z2 z2 =ð2y2 Þ y= 1 þ exp 1 þ m m ðm þ 1Þ=2 2 1 þ zm 1 pffiffiffi ¼ Bðm=2; 1=2Þ m 2ðm þ 1Þ=21 Cfðm þ 1Þ=2g 1 þ z2 1=2 m ( )m2 2 1=2 2 z z 2 =ð2y Þ : y= 1 þ exp 1 þ m m
9 The Student t- and Pseudo t- (PT) Distributions …
268
The distribution of Z is given by the above joint distribution when y is integrated out, which is obtained by noting that 1 2ðm þ 1Þ=21 Cfðm þ 1Þ=2g
1þ
z2 1=2 m
( 1=2 )m2 z2 z2 2 y= 1 þ exp 1 þ =ð2y Þ m m
is the pdf of the scaled inverse-chi distribution with m þ 1 df, giving the pdf of Z: ðm þ 1Þ=2 1 z2 pffiffiffi 1 þ ; m Bðm=2; 1=2Þ m which is the pdf of the standard t-distribution with m df. The remaining three cases pffiffiffi pffiffiffiffi pffiffiffipffiffiffiffi pffiffiffi of mX= Y , m Y X and mX=Y are given by using variable transformation of the inverse-chi distributed variable. Q.E.D. Proof 2 We derive the case of the inverse chi-square distribution, whose pdf is given when Y ¼ 1=Y and Y is chi-square distributed. Using dy =dy ¼ y2 and substituting y ¼ y1 , we obtain the pdf of the inverse chi-square: gv2 ðY ¼ yjmÞ ¼
yðm=2Þ1 1 : exp 2y 2m=2 Cðm=2Þ
The pdf of the joint distribution of X and Y is
\[ \phi(X=x)\,g_{\chi^{-2}}(Y=y\mid\nu) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right)\frac{y^{-(\nu/2)-1}}{2^{\nu/2}\,\Gamma(\nu/2)}\exp\left(-\frac{1}{2y}\right). \]
We employ the variable transformation from $(X, Y)$ to $(Z=\sqrt{\nu}\sqrt{Y}X,\ Y)$, where the Jacobian is
\[ \frac{\partial x}{\partial z} = \frac{1}{\sqrt{\nu}\sqrt{y}}. \]
Then, the pdf of the joint distribution of Z and Y using $x=z/(\sqrt{\nu}\sqrt{y})$ and unchanged y with the Jacobian becomes
9.2 The t-Distribution
\[
\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{z^2}{2\nu y}\right)\frac{y^{-(\nu/2)-1}/(\sqrt{\nu}\sqrt{y})}{2^{\nu/2}\,\Gamma(\nu/2)}\exp\left(-\frac{1}{2y}\right)
= \frac{y^{-(\nu+3)/2}}{\sqrt{2\pi}\sqrt{\nu}\,2^{\nu/2}\,\Gamma(\nu/2)}\exp\left\{-\left(1+\frac{z^2}{\nu}\right)\Big/(2y)\right\}
\]
\[
= \frac{1}{\sqrt{\nu}\,B(\nu/2,1/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2}\cdot\frac{\left(1+z^2/\nu\right)^{(\nu+1)/2}\,y^{-(\nu+3)/2}}{2^{(\nu+1)/2}\,\Gamma\{(\nu+1)/2\}}\exp\left\{-\left(1+\frac{z^2}{\nu}\right)\Big/(2y)\right\}.
\]
The distribution of Z is given by the above joint distribution when y is integrated out, which is obtained by noting the pdf of the scaled inverse chi-square (with $\nu+1$ df and scale parameter $1+z^2/\nu$) constructed above, giving the pdf of Z as in Proof 1:
\[ \frac{1}{\sqrt{\nu}\,B(\nu/2,1/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2}, \]
which is the pdf of the standard t-distribution with ν df. The remaining three cases are given by using variable transformations of the inverse chi-square distributed variable. Q.E.D.

Proof 3 Since the derivations using the pdfs of the inverse-chi or inverse chi-square may be unusual, we give the result using the chi-square. Let Y be chi-square distributed with ν df with unchanged X. Then, the joint distribution of X and Y becomes
\[ \phi(X=x)\,g_{\chi^2}(Y=y\mid\nu) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right)\frac{y^{(\nu/2)-1}}{2^{\nu/2}\,\Gamma(\nu/2)}\exp\left(-\frac{y}{2}\right). \]
Use the variable transformation from $(X, Y)$ to $(Z = \sqrt{\nu}\,X/\sqrt{Y},\ Y)$, where the Jacobian is
\[ \frac{\partial x}{\partial z} = \sqrt{\frac{y}{\nu}}. \]
Then, the pdf of the joint distribution of Z and Y using $x = z\sqrt{y}/\sqrt{\nu}$ and unchanged y with the Jacobian becomes
\[
\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{z^2 y}{2\nu}\right)\sqrt{\frac{y}{\nu}}\,\frac{y^{(\nu/2)-1}}{2^{\nu/2}\,\Gamma(\nu/2)}\exp\left(-\frac{y}{2}\right)
\]
\[
= \frac{1}{\sqrt{\nu}\,B(\nu/2,1/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2}\cdot\frac{\left(1+z^2/\nu\right)^{(\nu+1)/2}\,y^{\{(\nu+1)/2\}-1}}{2^{(\nu+1)/2}\,\Gamma\{(\nu+1)/2\}}\exp\left\{-\left(1+\frac{z^2}{\nu}\right)\frac{y}{2}\right\},
\]
where
\[ \frac{\left(1+z^2/\nu\right)^{(\nu+1)/2}\,y^{\{(\nu+1)/2\}-1}}{2^{(\nu+1)/2}\,\Gamma\{(\nu+1)/2\}}\exp\left\{-\left(1+\frac{z^2}{\nu}\right)\frac{y}{2}\right\} \]
is the pdf of Gamma$\left(\frac{\nu+1}{2},\ \frac{1}{2}+\frac{z^2}{2\nu}\right)$ or the scaled chi-square with $\nu+1$ df and scale parameter $\left(1+\frac{z^2}{\nu}\right)^{-1}$, which is integrated out to have the pdf of Z:
\[ \frac{1}{\sqrt{\nu}\,B(\nu/2,1/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2}, \]
giving the required result. The remaining three cases are obtained using variable transformations. Q.E.D.
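The factorization in Proof 3 can be checked numerically. The sketch below (plain Python; function names and parameter values are ours, not the book's) integrates the joint pdf of (Z, Y) over y by the trapezoidal rule and compares the result with the standard t pdf.

```python
import math

def chi2_pdf(y, nu):
    # chi-square pdf with nu degrees of freedom
    return y ** (nu / 2 - 1) * math.exp(-y / 2) / (2 ** (nu / 2) * math.gamma(nu / 2))

def t_pdf(z, nu):
    # standard t pdf: {1/(sqrt(nu) B(nu/2, 1/2))} (1 + z^2/nu)^(-(nu+1)/2)
    k = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return k * (1 + z * z / nu) ** (-(nu + 1) / 2)

def marginal_pdf(z, nu, n=80000, ymax=80.0):
    # trapezoidal integration over y of the joint pdf of (Z, Y) in Proof 3:
    # phi(z sqrt(y/nu)) * sqrt(y/nu) * chi2_pdf(y, nu)
    h = ymax / n
    s = 0.0
    for i in range(1, n):  # the integrand vanishes at both endpoints
        y = i * h
        x = z * math.sqrt(y / nu)
        phi = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
        s += phi * math.sqrt(y / nu) * chi2_pdf(y, nu)
    return s * h

nu = 5.0
for z in (0.0, 1.0, 2.5):
    print(z, marginal_pdf(z, nu), t_pdf(z, nu))  # the two columns agree
```

The same check works for the other three representations after the corresponding change of variable.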
Proof 4 Finally, we give the result using the chi distribution. Let Y be chi-distributed with ν df with unchanged X. The pdf of the chi is obtained from $Y^* = \sqrt{Y}$, where Y is the corresponding chi-square distributed variable. Then, the joint distribution of X and Y becomes
\[ \phi(X=x)\,g_{\chi}(Y=y\mid\nu) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right)\frac{y^{\nu-1}}{2^{(\nu/2)-1}\,\Gamma(\nu/2)}\exp\left(-\frac{y^2}{2}\right). \]
Using the variable transformation from $(X, Y)$ to $(Z = \sqrt{\nu}\,X/Y,\ Y)$, where the Jacobian is
\[ \frac{\partial x}{\partial z} = \frac{y}{\sqrt{\nu}}, \]
the pdf of the joint distribution of Z and Y using $x = zy/\sqrt{\nu}$ and unchanged y with the Jacobian becomes
\[
\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{z^2y^2}{2\nu}\right)\frac{(y/\sqrt{\nu})\,y^{\nu-1}}{2^{(\nu/2)-1}\,\Gamma(\nu/2)}\exp\left(-\frac{y^2}{2}\right)
= \frac{y^{\nu}}{\sqrt{2\pi}\sqrt{\nu}\,2^{(\nu/2)-1}\,\Gamma(\nu/2)}\exp\left\{-\left(1+\frac{z^2}{\nu}\right)\frac{y^2}{2}\right\}
\]
\[
= \frac{1}{\sqrt{\nu}\,B(\nu/2,1/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2}\cdot\frac{\left(1+z^2/\nu\right)^{(\nu+1)/2}\,y^{\nu}}{2^{\{(\nu+1)/2\}-1}\,\Gamma\{(\nu+1)/2\}}\exp\left\{-\left(1+\frac{z^2}{\nu}\right)\frac{y^2}{2}\right\},
\]
where
\[ \frac{\left(1+z^2/\nu\right)^{(\nu+1)/2}\,y^{\nu}}{2^{\{(\nu+1)/2\}-1}\,\Gamma\{(\nu+1)/2\}}\exp\left\{-\left(1+\frac{z^2}{\nu}\right)\frac{y^2}{2}\right\} \]
is the pdf of the scaled chi distribution with $\nu+1$ df and scale parameter $\left(1+\frac{z^2}{\nu}\right)^{-1/2}$, which is integrated out to have the pdf of Z, giving the required result. The remaining three cases are obtained using variable transformations. Q.E.D.

It is found that the derivations in Proofs 1 to 4 are comparable, with a slight advantage for Proof 3 in that the gamma distribution may be familiar to most readers. However, the formulation using the inverse-chi in Proof 1 has an
advantage when we have the raw moments of the t-distribution, which are the products of the moments of the normal and inverse-chi distributions of the same orders due to their independence.

Lemma 9.2 (Kollo, Käärik and Selart [15, Lemma 1]) The raw moment of real-valued order $k > 0$ for the inverse chi-distributed variable denoted by $Z_\nu$ with ν df is given by
\[ E(Z_\nu^k) = \frac{\Gamma\{(\nu-k)/2\}}{2^{k/2}\,\Gamma(\nu/2)}, \quad \nu > k. \]
When k is a natural number,
\[ E(Z_\nu^k) = \prod_{i=0}^{k-1} E(Z_{\nu-i}), \quad \nu > k. \]

Proof Using the pdf of the inverse-chi, we have
\[
E(Z_\nu^k) = \int_0^\infty z^k\,\frac{z^{-\nu-1}}{2^{(\nu/2)-1}\,\Gamma(\nu/2)}\exp\left(-\frac{1}{2z^2}\right){\rm d}z
= \frac{2^{\{(\nu-k)/2\}-1}\,\Gamma\{(\nu-k)/2\}}{2^{(\nu/2)-1}\,\Gamma(\nu/2)}\int_0^\infty \frac{z^{-(\nu-k)-1}}{2^{\{(\nu-k)/2\}-1}\,\Gamma\{(\nu-k)/2\}}\exp\left(-\frac{1}{2z^2}\right){\rm d}z
= \frac{\Gamma\{(\nu-k)/2\}}{2^{k/2}\,\Gamma(\nu/2)},
\]
which is the first required result. The second result with k being a positive integer is given by
\[
E(Z_\nu^k) = \frac{\Gamma\{(\nu-k)/2\}}{2^{k/2}\,\Gamma(\nu/2)}
= \frac{\Gamma\{(\nu-1)/2\}}{2^{1/2}\,\Gamma(\nu/2)}\cdot\frac{\Gamma\{(\nu-2)/2\}}{2^{1/2}\,\Gamma\{(\nu-1)/2\}}\cdots\frac{\Gamma\{(\nu-k)/2\}}{2^{1/2}\,\Gamma\{(\nu-k+1)/2\}}
= \prod_{i=0}^{k-1} E(Z_{\nu-i}).
\]
Q.E.D.
Lemma 9.2 gives
\[ E(Z_\nu) = \frac{\Gamma\{(\nu-1)/2\}}{2^{1/2}\,\Gamma(\nu/2)}, \quad \nu > 1; \]
\[ E(Z_{2u+1}) = \frac{\Gamma(u)}{2^{1/2}\,\Gamma\{u+(1/2)\}} = \frac{(u-1)!}{2^{1/2}\{u-(1/2)\}\{u-(3/2)\}\cdots(1/2)\,\Gamma(1/2)} = \frac{2^{1/2}(2u-2)!!}{\sqrt{\pi}\,(2u-1)!!}, \]
\[ E(Z_{2u}) = \frac{\Gamma\{u-(1/2)\}}{2^{1/2}\,\Gamma(u)} = \frac{\{u-(3/2)\}\{u-(5/2)\}\cdots(1/2)\,\Gamma(1/2)}{2^{1/2}(u-1)!} = \frac{\sqrt{\pi}\,(2u-3)!!}{2^{1/2}(2u-2)!!} \quad (u=1,2,\ldots); \]
\[ E(Z_\nu^2) = \frac{\Gamma\{(\nu-2)/2\}}{2\,\Gamma(\nu/2)} = \frac{1}{\nu-2}, \quad \nu > 2; \]
\[ E(Z_\nu^3) = \frac{\Gamma\{(\nu-3)/2\}}{2^{3/2}\,\Gamma(\nu/2)} = \frac{\Gamma\{(\nu-1)/2\}}{2^{1/2}\,\Gamma(\nu/2)}\cdot\frac{\Gamma\{(\nu-3)/2\}}{2\,\Gamma\{(\nu-1)/2\}} = \frac{E(Z_\nu)}{\nu-3}, \quad \nu > 3; \]
\[ E(Z_\nu^4) = \frac{\Gamma\{(\nu-4)/2\}}{2^2\,\Gamma(\nu/2)} = \frac{\Gamma\{(\nu-2)/2\}}{2\,\Gamma(\nu/2)}\cdot\frac{\Gamma\{(\nu-4)/2\}}{2\,\Gamma\{(\nu-2)/2\}} = \frac{1}{(\nu-2)(\nu-4)}, \quad \nu > 4; \]
\[ E(Z_\nu^5) = \frac{\Gamma\{(\nu-5)/2\}}{2^{5/2}\,\Gamma(\nu/2)} = \frac{\Gamma\{(\nu-1)/2\}}{2^{1/2}\,\Gamma(\nu/2)}\cdot\frac{\Gamma\{(\nu-3)/2\}}{2\,\Gamma\{(\nu-1)/2\}}\cdot\frac{\Gamma\{(\nu-5)/2\}}{2\,\Gamma\{(\nu-3)/2\}} = \frac{E(Z_\nu)}{(\nu-3)(\nu-5)}, \quad \nu > 5; \]
\[ E(Z_\nu^6) = \frac{\Gamma\{(\nu-6)/2\}}{2^3\,\Gamma(\nu/2)} = \frac{\Gamma\{(\nu-2)/2\}}{2\,\Gamma(\nu/2)}\cdot\frac{\Gamma\{(\nu-4)/2\}}{2\,\Gamma\{(\nu-2)/2\}}\cdot\frac{\Gamma\{(\nu-6)/2\}}{2\,\Gamma\{(\nu-4)/2\}} = \frac{1}{(\nu-2)(\nu-4)(\nu-6)}, \quad \nu > 6.
\]
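These closed forms can be sanity-checked numerically. The sketch below (plain Python; helper names are ours) compares the gamma-function expression of Lemma 9.2 with quadrature over the inverse-chi pdf and with the closed forms, and also illustrates the product rule for raw t moments mentioned above ($E(T^k) = \nu^{k/2}E(X^k)E(Z_\nu^k)$ for $T=\sqrt{\nu}XZ_\nu$ with independent factors).

```python
import math

def inv_chi_pdf(z, nu):
    # inverse-chi pdf: z^(-nu-1) exp(-1/(2 z^2)) / (2^(nu/2 - 1) Gamma(nu/2))
    return z ** (-nu - 1) * math.exp(-1 / (2 * z * z)) / (2 ** (nu / 2 - 1) * math.gamma(nu / 2))

def moment_formula(k, nu):
    # E(Z_nu^k) = Gamma((nu - k)/2) / (2^(k/2) Gamma(nu/2)), nu > k
    return math.gamma((nu - k) / 2) / (2 ** (k / 2) * math.gamma(nu / 2))

def moment_numeric(k, nu, n=200000, zmax=12.0):
    # trapezoidal quadrature of z^k times the inverse-chi pdf
    h = zmax / n
    s = 0.0
    for i in range(1, n):
        z = i * h
        s += z ** k * inv_chi_pdf(z, nu)
    return s * h

nu = 9.0
print(moment_formula(2, nu), 1 / (nu - 2))              # E(Z^2) = 1/(nu - 2)
print(moment_formula(4, nu), 1 / ((nu - 2) * (nu - 4)))  # E(Z^4) = 1/((nu-2)(nu-4))
print(moment_numeric(3, nu), moment_formula(3, nu))      # quadrature vs formula
# raw t moments as products of normal and inverse-chi moments (independence):
print(nu * 1 * moment_formula(2, nu), nu / (nu - 2))                     # E(T^2)
print(nu ** 2 * 3 * moment_formula(4, nu), 3 * nu ** 2 / ((nu - 2) * (nu - 4)))  # E(T^4)
```

The last two lines reproduce the familiar variance ν/(ν−2) and fourth moment 3ν²/{(ν−2)(ν−4)} of the standard t-distribution.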
Theorem 9.1 When $Z_\nu$ is inverse chi-distributed with ν df, we have
\[ E(Z_\nu^{2u-1}) = \frac{E(Z_\nu)}{(\nu-3)(\nu-5)\cdots(\nu-2u+1)}, \quad \nu > 2u-1; \]
\[ E(Z_\nu^{2u}) = \frac{1}{(\nu-2)(\nu-4)\cdots(\nu-2u)}, \quad \nu > 2u \qquad (u=1,2,\ldots). \]

Proof The results follow by induction with $E(Z_\nu)$ and $E(Z_\nu^2)$ obtained earlier. Q.E.D.

Note that in Theorem 9.1, ν is real-valued. In the result of Lemma 9.1, introducing the location and scale parameters corresponding to those of $N(\mu,\sigma^2)$ for the standardized t-distribution, we have the unstandardized Student t-distribution with ν df, denoted by $St(\mu,\sigma,\nu)$, whose pdf at Z = z is
\[ \frac{1}{\sqrt{\nu}\,B(\nu/2,1/2)\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2}. \]
It is known that infinite mixtures of normal distributions with different variances give the above pdf of $St(\mu,\sigma,\nu)$ (Stuart and Ord [23, Example 5.6]; Bishop [2, Eq. (2.161)]; Kirkby, Nguyen and Nguyen [13, Eq. (9); 14, Eq. (3.1)]), where the gamma distribution for the weights in the mixtures seems to be exclusively used. In many cases, this mixture is called a scale mixture. However, it is actually a mixture of the reciprocals of squared scales, or equivalently of the reciprocals of variances in the normal case, which are also called precisions. The next lemma shows that the same results are given by the mixture of normal variances using the inverse gamma distribution.
Lemma 9.3 The pdf of $St(\mu,\sigma,\nu)$ with $\nu > 0$ being real-valued is given by (i) the mixture of the precisions of $N(\mu, Y^{-1}\sigma^2)$ when Y follows Gamma(ν/2, ν/2) with ν/2 being the shape and rate parameters, or equivalently the scaled chi-square with ν df and scale parameter $\nu^{-1}$; (ii) the mixture of the precisions of $N(\mu, Y^{-1}\nu\sigma^2)$ when Y follows Gamma(ν/2, 1/2) with ν/2 and 1/2 being the shape and rate parameters, respectively, or equivalently the chi-square with ν df. The same pdf is given by (iii) the mixture of the variances of $N(\mu, Y\sigma^2)$ when Y follows the inverse gamma distribution denoted by Inv-Gamma(ν/2, ν/2) with ν/2 being the shape and scale parameters, or equivalently the scaled inverse chi-square with ν df and scale parameter ν; (iv) the mixture of the variances of $N(\mu, Y\nu\sigma^2)$ when Y follows the inverse gamma distribution denoted by Inv-Gamma(ν/2, 1/2) with ν/2 and 1/2 being the shape and scale parameters, respectively, or equivalently the inverse chi-square with ν df.

Proof 1 We start with (i) of the precision mixture, which is given by
\[
\int_0^\infty \frac{\sqrt{y}}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{y(z-\mu)^2}{2\sigma^2}\right\}\frac{(\nu/2)^{\nu/2}\,y^{(\nu/2)-1}}{\Gamma(\nu/2)}\exp\left(-\frac{\nu y}{2}\right){\rm d}y
\]
\[
= \frac{\Gamma\{(\nu+1)/2\}}{\sqrt{2\pi}\,\Gamma(\nu/2)\,\sigma}\left(\frac{\nu}{2}\right)^{\nu/2}\left\{\frac{\nu}{2}+\frac{(z-\mu)^2}{2\sigma^2}\right\}^{-(\nu+1)/2}
\int_0^\infty \left\{\frac{\nu}{2}+\frac{(z-\mu)^2}{2\sigma^2}\right\}^{(\nu+1)/2}\frac{y^{\{(\nu+1)/2\}-1}}{\Gamma\{(\nu+1)/2\}}\exp\left[-\left\{\frac{\nu}{2}+\frac{(z-\mu)^2}{2\sigma^2}\right\}y\right]{\rm d}y
\]
\[
= \frac{\Gamma\{(\nu+1)/2\}}{\sqrt{\pi\nu}\,\Gamma(\nu/2)\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2}
= \frac{1}{\sqrt{\nu}\,B(\nu/2,1/2)\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2},
\]
which is the first required result.
The case (ii) of the precision mixture is given from (i) when the variable transformation $Y^* = \nu Y$, following the chi-square with ν df, is considered. That is, we have
\[
\int_0^\infty \frac{\sqrt{y}}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{y(z-\mu)^2}{2\sigma^2}\right\}\frac{(\nu/2)^{\nu/2}\,y^{(\nu/2)-1}}{\Gamma(\nu/2)}\exp\left(-\frac{\nu y}{2}\right){\rm d}y
= \int_0^\infty \frac{\sqrt{y^*}}{\sqrt{2\pi}\sqrt{\nu}\,\sigma}\exp\left\{-\frac{y^*(z-\mu)^2}{2\nu\sigma^2}\right\}\frac{y^{*\,(\nu/2)-1}}{2^{\nu/2}\,\Gamma(\nu/2)}\exp\left(-\frac{y^*}{2}\right){\rm d}y^*,
\]
and the same gamma-integral argument as in (i), now with shape $(\nu+1)/2$ and rate $\frac{1}{2}+\frac{(z-\mu)^2}{2\nu\sigma^2}$, gives
\[
= \frac{1}{\sqrt{\nu}\,B(\nu/2,1/2)\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2},
\]
which is the required result for (ii). The remaining results are given by the property that when Y is gamma ((scaled) chi-square) distributed, $Y^{-1}$ follows the corresponding inverse gamma (inverse (scaled) chi-square) distribution.

Proof 2 Consider (iii) of the variance mixture using Inv-Gamma(ν/2, ν/2) with ν/2 being the shape and scale parameters:
\[
\int_0^\infty \frac{1}{\sqrt{2\pi}\,\sigma\sqrt{y}}\exp\left\{-\frac{(z-\mu)^2}{2\sigma^2 y}\right\}\frac{(\nu/2)^{\nu/2}\,y^{-(\nu/2)-1}}{\Gamma(\nu/2)}\exp\left(-\frac{\nu}{2y}\right){\rm d}y
\]
\[
= \frac{\Gamma\{(\nu+1)/2\}}{\sqrt{2\pi}\,\Gamma(\nu/2)\,\sigma}\left(\frac{\nu}{2}\right)^{\nu/2}\left\{\frac{\nu}{2}+\frac{(z-\mu)^2}{2\sigma^2}\right\}^{-(\nu+1)/2}
\int_0^\infty \left\{\frac{\nu}{2}+\frac{(z-\mu)^2}{2\sigma^2}\right\}^{(\nu+1)/2}\frac{y^{-\{(\nu+1)/2\}-1}}{\Gamma\{(\nu+1)/2\}}\exp\left[-\left\{\frac{\nu}{2}+\frac{(z-\mu)^2}{2\sigma^2}\right\}\Big/y\right]{\rm d}y
\]
\[
= \frac{1}{\sqrt{\nu}\,B(\nu/2,1/2)\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2},
\]
which is the required result for (iii). The case (iv) of the variance mixture is given from (iii) when the variable transformation $Y^* = Y/\nu$, following the inverse chi-square with ν df, is considered. That is, we have the identity
\[
\int_0^\infty \frac{1}{\sqrt{2\pi}\,\sigma\sqrt{y}}\exp\left\{-\frac{(z-\mu)^2}{2\sigma^2 y}\right\}\frac{(\nu/2)^{\nu/2}\,y^{-(\nu/2)-1}}{\Gamma(\nu/2)}\exp\left(-\frac{\nu}{2y}\right){\rm d}y
= \int_0^\infty \frac{1}{\sqrt{2\pi}\sqrt{\nu}\,\sigma\sqrt{y^*}}\exp\left\{-\frac{(z-\mu)^2}{2\nu\sigma^2 y^*}\right\}\frac{y^{*\,-(\nu/2)-1}}{2^{\nu/2}\,\Gamma(\nu/2)}\exp\left(-\frac{1}{2y^*}\right){\rm d}y^*
\]
\[
= \frac{1}{\sqrt{\nu}\,B(\nu/2,1/2)\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2}.
\]
The remaining results are given by the property that when Y is inverse gamma (inverse (scaled) chi-square) distributed, $Y^{-1}$ follows the corresponding gamma ((scaled) chi-square) distribution. Q.E.D.

The above two proofs are comparable. When for some reason the variance mixture is preferred, the inverse gamma (inverse chi-square) can be used, while when the gamma (chi-square) distribution is employed, the precision mixture should be used. As in Lemma 9.1, other similar mixtures giving the same pdf of $St(\mu,\sigma,\nu)$ can be used. When the standard-deviation mixture is used, the inverse-square-root gamma can be used, which is the distribution of $1/\sqrt{Y}$ when Y follows the gamma distribution, whose special case is the inverse-chi. Similarly, the mixture of the reciprocal of the standard deviation can also be used. In this case, the square-root gamma can be used, which is the distribution of $\sqrt{Y}$ with Y as above, whose special case is the chi distribution. Actually, infinitely many types of mixtures yielding the same $St(\mu,\sigma,\nu)$ can be constructed when the distributions of powers of Y are used. Let $Y^* = Y^{(1/c)}$ $(-\infty < c < \infty,\ c\neq0)$. Then, $Y^*$ is said to have the power-gamma distribution denoted by Power-Γ(a, b, c). The pdf of this distribution is given by noting that ${\rm d}y/{\rm d}y^* = cy^{*\,c-1}$:
\[
g_{Power\Gamma}(y\mid a,b,c) = \frac{b^a\,y^{c(a-1)}\,|c|\,y^{c-1}\exp(-by^c)}{\Gamma(a)} = \frac{b^a\,|c|\,y^{ca-1}\exp(-by^c)}{\Gamma(a)}
\]
\[ (0 < y < \infty;\ 0 < a < \infty;\ 0 < b < \infty;\ -\infty < c < \infty,\ c\neq0), \]
where a and b are the shape and rate parameters for the gamma distributed Y (the same notation a as that for the normalizer used earlier is employed as long as confusion does not occur). Stacy [22, Eq. (1)] gave the generalized gamma distribution, whose pdf is
\[ g_{Generalized\Gamma}(x\mid a,d,p) = \frac{(p/a^d)\,x^{d-1}\exp\{-(x/a)^p\}}{\Gamma(d/p)} \quad (0 < x < \infty;\ 0 < a < \infty;\ 0 < d < \infty;\ 0 < p < \infty), \]
which is seen as a reparametrization of the power-gamma with $a = d/p$, $b = 1/a^p$ and $c = p$ under the restriction $0 < p < \infty$ (the a on the left-hand sides being the power-gamma shape and the a on the right Stacy's scale parameter). The power-gamma can be seen as a special case of the Amoroso distribution, whose pdf is
\[ g_{Amoroso}(x\mid a,\theta,\alpha,\beta) = \frac{|\beta/\theta|}{\Gamma(\alpha)}\left(\frac{x-a}{\theta}\right)^{\alpha\beta-1}\exp\left\{-\left(\frac{x-a}{\theta}\right)^{\beta}\right\} \]
\[ (x\geq a\ \text{if}\ \theta>0;\ x\leq a\ \text{if}\ \theta<0;\ \beta\in\mathbb{R},\ \beta\neq0;\ \alpha>0;\ \theta\in\mathbb{R},\ \theta\neq0) \]
(Crooks [5, Eq. (1)]).

Theorem 9.2 The pdf of $St(\mu,\sigma,\nu)$ is given by the mixture of the powers of the precisions (power-precisions) of $N(\mu, Y^{-c}\sigma^2)$ $(-\infty < c < \infty,\ c\neq0)$ when Y follows Power-Γ(ν/2, ν/2, c).

Proof In Proof 1 for case (i) of Lemma 9.3, replacing y by $y^*$, we have the identity:
\[
\int_0^\infty \frac{\sqrt{y^*}}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{y^*(z-\mu)^2}{2\sigma^2}\right\}\frac{(\nu/2)^{\nu/2}\,y^{*\,(\nu/2)-1}}{\Gamma(\nu/2)}\exp\left(-\frac{\nu y^*}{2}\right){\rm d}y^*
= \frac{1}{\sqrt{\nu}\,B(\nu/2,1/2)\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2}.
\]
Let $y = y^{*\,(1/c)}$ $(-\infty < c < \infty,\ c\neq0)$. Then, using ${\rm d}y^*/{\rm d}y = cy^{c-1}$, the left-hand side of the above identity becomes
\[
\int_0^\infty \frac{\sqrt{y^c}}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{y^c(z-\mu)^2}{2\sigma^2}\right\}\frac{(\nu/2)^{\nu/2}\,|c|\,y^{c(\nu/2)-1}}{\Gamma(\nu/2)}\exp\left(-\frac{\nu y^c}{2}\right){\rm d}y
= \int_0^\infty \frac{\sqrt{y^c}}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{y^c(z-\mu)^2}{2\sigma^2}\right\}g_{Power\Gamma}(y\mid\nu/2,\nu/2,c)\,{\rm d}y,
\]
which shows the required result. Q.E.D.
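Theorem 9.2 can be checked numerically for several values of c. A minimal sketch (plain Python; names and the trial values of μ, σ, ν, z are ours), where c = 1 recovers the familiar gamma precision mixture of Lemma 9.3(i) and c = −1 the inverse-gamma variance mixture:

```python
import math

def power_gamma_pdf(y, a, b, c):
    # pdf of Y = G^(1/c), G ~ Gamma(shape a, rate b): b^a |c| y^(c a - 1) exp(-b y^c) / Gamma(a)
    return b ** a * abs(c) * y ** (c * a - 1) * math.exp(-b * y ** c) / math.gamma(a)

def st_pdf(z, mu, sigma, nu):
    # pdf of St(mu, sigma, nu)
    k = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2) * sigma)
    return k * (1 + (z - mu) ** 2 / (nu * sigma ** 2)) ** (-(nu + 1) / 2)

def power_precision_mixture(z, mu, sigma, nu, c, n=200000, ymax=60.0):
    # integrate N(z | mu, y^(-c) sigma^2) * PowerGamma(y | nu/2, nu/2, c) over y
    h = ymax / n
    s = 0.0
    for i in range(1, n):
        y = i * h
        prec = y ** c  # reciprocal of the squared scale factor
        norm = (math.sqrt(prec) / (math.sqrt(2 * math.pi) * sigma)
                * math.exp(-prec * (z - mu) ** 2 / (2 * sigma ** 2)))
        s += norm * power_gamma_pdf(y, nu / 2, nu / 2, c)
    return s * h

mu, sigma, nu, z = 0.3, 1.2, 6.0, 1.5
for c in (1.0, -1.0, 2.0):
    print(c, power_precision_mixture(z, mu, sigma, nu, c), st_pdf(z, mu, sigma, nu))
```

For every c the quadrature column matches the St(μ, σ, ν) pdf, illustrating the arbitrariness of the power in the mixing distribution.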
In Theorem 9.2, the mixture of the power-precisions is dealt with. When $-c$ ($-c/2$) is redefined as c, we obtain the corresponding results for the mixtures of the power-variances (power-standard deviations). Note that when $c = -2, -1, 1, 2$, Power-Γ(ν/2, 1/2, c) becomes the inverse-chi, inverse chi-square, chi-square and chi distribution, respectively. These findings reflect a general result on arbitrary formulations of mixtures as long as valid variable transformations are used. It is sometimes pointed out that a mixture, e.g., that for $St(\mu,\sigma,\nu)$, is scale-free. In the case of $St(\mu,\sigma,\nu)$, this indicates that the weights for the mixture, i.e., the pdfs of Power-Γ(ν/2, 1/2, c), do not depend on the scale σ, the standard deviation (SD) of the associated normal density. In Theorem 9.2, Power-Γ(ν/2, ν/2, c) is used for the weight to yield the normal mixture, where ν/2 is used as a function of the scale parameter, denoted by b, of Power-Γ(ν/2, ν/2, c), i.e., $\nu/2 = 1/b^c$. For the mixture, the unit scale parameter as in Power-Γ(ν/2, 1, c), or the power-gamma dependent on the normal scale σ, can also be considered.

Corollary 9.1 The pdf of $St(\mu,\sigma,\nu)$ is given by the mixture of the power-precisions of $N(\mu, Y^{-c}\nu\sigma^2/2)$ $(-\infty < c < \infty,\ c\neq0)$ when Y follows Power-Γ(ν/2, 1, c). The same pdf is also given by the mixture of the power-precisions of $N(\mu, Y^{-c})$ $(-\infty < c < \infty,\ c\neq0)$ when Y follows Power-Γ(ν/2, νσ²/2, c).

Proof In the proof of Theorem 9.2, replacing y by $y^*$, we have the identity:
\[
\int_0^\infty \frac{\sqrt{y^{*c}}}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{y^{*c}(z-\mu)^2}{2\sigma^2}\right\}\frac{(\nu/2)^{\nu/2}\,|c|\,y^{*\,c(\nu/2)-1}}{\Gamma(\nu/2)}\exp\left(-\frac{\nu y^{*c}}{2}\right){\rm d}y^*
= \frac{1}{\sqrt{\nu}\,B(\nu/2,1/2)\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2}.
\]
Let $y^{*c} = (2/\nu)\,y^c$, i.e., ${\rm d}y^*/{\rm d}y = (2/\nu)^{1/c}$. Then the left-hand side of the above identity becomes
\[
\int_0^\infty \frac{\sqrt{2y^c/\nu}}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{y^c(z-\mu)^2}{\nu\sigma^2}\right\}\frac{|c|\,y^{c(\nu/2)-1}}{\Gamma(\nu/2)}\exp(-y^c)\,{\rm d}y
= \int_0^\infty \phi(z\mid\mu,\,y^{-c}\nu\sigma^2/2)\,g_{Power\Gamma}(y\mid\nu/2,1,c)\,{\rm d}y,
\]
which shows the first required result. For the remaining result, let $y^{*c} = \sigma^2 y^c$ with $y^*$ as above, i.e., ${\rm d}y^*/{\rm d}y = (\sigma^2)^{1/c}$. Then, we have as above
\[
\int_0^\infty \frac{\sqrt{y^{*c}}}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{y^{*c}(z-\mu)^2}{2\sigma^2}\right\}\frac{(\nu/2)^{\nu/2}\,|c|\,y^{*\,c(\nu/2)-1}}{\Gamma(\nu/2)}\exp\left(-\frac{\nu y^{*c}}{2}\right){\rm d}y^*
= \int_0^\infty \frac{\sqrt{y^c}}{\sqrt{2\pi}}\exp\left\{-\frac{y^c(z-\mu)^2}{2}\right\}\frac{(\nu\sigma^2/2)^{\nu/2}\,|c|\,y^{c(\nu/2)-1}}{\Gamma(\nu/2)}\exp\left(-\frac{\nu\sigma^2 y^c}{2}\right){\rm d}y
\]
\[
= \int_0^\infty \phi(z\mid\mu,\,y^{-c})\,g_{Power\Gamma}(y\mid\nu/2,\nu\sigma^2/2,c)\,{\rm d}y,
\]
which gives the remaining required result. Q.E.D.

So far, the pdf of $St(\mu,\sigma,\nu)$ has been derived by the variable transformation from normal X and power-gamma Y to the normal mixture Z and unchanged Y; then Z becomes t-distributed. The roles of X and Y can be exchanged.

Theorem 9.3 The pdf of St(0, 1, ν) is given by the quasi mixture of the scale parameters $\sqrt{\nu}X$ $(X > 0)$ of the scaled inverse-chi distributed variable Z with ν df when $X\sim N(0,1)$ over the domain $X > 0$.

Proof The pdf of the scaled inverse-chi distribution with ν df and scale parameter b at Z = z is given by
\[ g_{\chi^{-1}}(z\mid\nu,b) = \frac{z^{-\nu-1}\,b^{\nu}}{2^{(\nu/2)-1}\,\Gamma(\nu/2)}\exp\left(-\frac{b^2}{2z^2}\right). \]
Then, the quasi mixture in this theorem becomes
\[
\int_0^\infty \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right)\frac{z^{-\nu-1}(\sqrt{\nu}\,x)^{\nu}}{2^{(\nu/2)-1}\,\Gamma(\nu/2)}\exp\left(-\frac{\nu x^2}{2z^2}\right){\rm d}x
= \int_0^\infty \frac{z^{-\nu-1}\,\nu^{\nu/2}\,x^{\nu}}{\sqrt{2\pi}\,2^{(\nu/2)-1}\,\Gamma(\nu/2)}\exp\left\{-\left(1+\frac{\nu}{z^2}\right)\frac{x^2}{2}\right\}{\rm d}x
\]
\[
= \frac{z^{-\nu-1}\,\nu^{\nu/2}\,2^{\{(\nu+1)/2\}-1}\,\Gamma\{(\nu+1)/2\}}{\sqrt{2\pi}\,2^{(\nu/2)-1}\,\Gamma(\nu/2)\left(1+\nu/z^2\right)^{(\nu+1)/2}}
\int_0^\infty \frac{\left(1+\nu/z^2\right)^{(\nu+1)/2}\,x^{\nu}}{2^{\{(\nu+1)/2\}-1}\,\Gamma\{(\nu+1)/2\}}\exp\left\{-\left(1+\frac{\nu}{z^2}\right)\frac{x^2}{2}\right\}{\rm d}x
\]
\[
= \frac{\Gamma\{(\nu+1)/2\}}{\sqrt{\nu}\sqrt{\pi}\,\Gamma(\nu/2)}\left\{\frac{\nu/z^2}{1+\nu/z^2}\right\}^{(\nu+1)/2}
= \frac{1}{\sqrt{\nu}\,B(\nu/2,1/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2},
\]
which is the pdf of St(0, 1, ν). Q.E.D.

Remark 9.1 Note that the phrase "quasi mixture" is used in Theorem 9.3 since the truncated support $X > 0$ is used for $X\sim N(0,1)$ without doubling the pdf as in the half-normal. The above expression of the quasi mixture is also obtained by variable transformation. Let $Z = \sqrt{\nu}XY > 0$, where $X\sim N(0,1)$ with $X > 0$ and Y is standard inverse-chi distributed with ν df independent of X (when Z is negative, $Z = \sqrt{\nu}XY < 0$ can be used). Then, the pdf of Z at $z = \sqrt{\nu}xy$ is given by the variable transformation from (X and Y) to (Z and X), where X is integrated out with $y = z/(\sqrt{\nu}x)$, unchanged x and the Jacobian
\[ \det\begin{pmatrix} \partial x/\partial z & \partial x/\partial x \\ \partial y/\partial z & \partial y/\partial x \end{pmatrix} = \det\begin{pmatrix} 0 & 1 \\ 1/(\sqrt{\nu}\,x) & -z/(\sqrt{\nu}\,x^2) \end{pmatrix} = -\frac{1}{\sqrt{\nu}\,x}, \]
whose absolute value is used. Using the above result, the pdf of Z becomes
\[
\int_0^\infty \phi(x)\,g_{\chi^{-1}}\left(\frac{z}{\sqrt{\nu}\,x}\Bigm|\nu\right)\left|\frac{\partial y}{\partial z}\right|{\rm d}x
= \int_0^\infty \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right)\frac{\{z/(\sqrt{\nu}\,x)\}^{-\nu-1}/(\sqrt{\nu}\,x)}{2^{(\nu/2)-1}\,\Gamma(\nu/2)}\exp\left(-\frac{\nu x^2}{2z^2}\right){\rm d}x
\]
\[
= \int_0^\infty \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right)\frac{z^{-\nu-1}(\sqrt{\nu}\,x)^{\nu}}{2^{(\nu/2)-1}\,\Gamma(\nu/2)}\exp\left(-\frac{\nu x^2}{2z^2}\right){\rm d}x
= \int_0^\infty \phi(x)\,g_{\chi^{-1}}(z\mid\nu,\sqrt{\nu}\,x)\,{\rm d}x.
\]
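The dual representation of Theorem 9.3 also admits a direct numerical check. A sketch (plain Python; helper names are ours) integrating $\phi(x)\,g_{\chi^{-1}}(z\mid\nu,\sqrt{\nu}x)$ over $x > 0$ and comparing with the standard t pdf at positive z:

```python
import math

def scaled_inv_chi_pdf(z, nu, b):
    # scaled inverse-chi pdf with nu df and scale b
    return z ** (-nu - 1) * b ** nu * math.exp(-b * b / (2 * z * z)) / (2 ** (nu / 2 - 1) * math.gamma(nu / 2))

def t_pdf(z, nu):
    k = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return k * (1 + z * z / nu) ** (-(nu + 1) / 2)

def quasi_mixture(z, nu, n=100000, xmax=8.0):
    # integrate phi(x) * g_inv_chi(z | nu, sqrt(nu) x) over x > 0 (z > 0 assumed)
    h = xmax / n
    s = 0.0
    for i in range(1, n):
        x = i * h
        phi = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
        s += phi * scaled_inv_chi_pdf(z, nu, math.sqrt(nu) * x)
    return s * h

nu = 4.0
for z in (0.5, 1.5, 3.0):
    print(z, quasi_mixture(z, nu), t_pdf(z, nu))  # the two columns agree
```

Only half the normal mass is used for a given sign of z, matching the "quasi mixture" wording discussed in the remarks.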
Remark 9.2 In Theorem 9.2, the normal mixture using the weights of the power-gamma is given, while in Theorem 9.3 the (quasi) mixture of the scaled inverse-chi with the normal density weights is obtained. This shows a dual aspect when standard t-distributed $Z = \sqrt{\nu}XY$ with $X\sim N(0,1)$ and $Y\sim\chi^{-1}(\nu)$ is obtained by
mixtures. In the literature, to the author's knowledge, the former type of mixture has been used exclusively. This duality is associated with that of the variable transformation, as expected. That is, as shown earlier, when the transformation of (X and Y (or $Y^c$)) to (Z and Y) is used, the normal mixture is derived with Y being integrated out (see Theorem 9.2). On the other hand, when the transformation of (X and Y) to (Z and X) is used, the quasi mixture of the scales of the scaled inverse-chi with the normal density weights is obtained with X being integrated out (see Theorem 9.3).

Remark 9.3 The phrase "quasi mixture" was used in Theorem 9.3 since the integral of the weights for a positive z in Theorem 9.3 is $\int_0^\infty \phi(x)\,{\rm d}x = 1/2$ rather than 1, which is not consistent with the usual definition of a mixture. However, the weight $\sqrt{\nu}X$ for the mixture, or the scale of $Y\sim\chi^{-1}(\nu,\sqrt{\nu}X)$, is used two times, once for a positive Z = z and once for a negative z, which makes $2\int_0^\infty \phi(x)\,{\rm d}x = 1$ as in the half-normal. If this extended definition of a mixture is employed, "quasi" is unnecessary and can be parenthesized or omitted.

Corollary 9.2 The pdf of St(0, 1, ν) is given by the (quasi) mixture of the scale parameters X $(X > 0)$ of the scaled inverse-chi distribution with ν df when $X\sim N(0,\nu)$ over the domain $X > 0$.

Proof Theorem 9.3 gives the pdf of St(0, 1, ν) as
\[
\int_0^\infty \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^{*2}}{2}\right)\frac{z^{-\nu-1}(\sqrt{\nu}\,x^*)^{\nu}}{2^{(\nu/2)-1}\,\Gamma(\nu/2)}\exp\left(-\frac{\nu x^{*2}}{2z^2}\right){\rm d}x^*
= \int_0^\infty \phi(x^*)\,g_{\chi^{-1}}(z\mid\nu,\sqrt{\nu}\,x^*)\,{\rm d}x^*,
\]
where $x^*$ in place of x is used.
where x in place of x is used. Consider the variable transformation x ¼ pffiffiffi dx =dx ¼ 1= m. Then, the above result becomes Z1
/ðx Þg
v1
pffiffiffi ðzjm; mx Þdx ¼
0
Z1 0
pffiffiffi mx with
pffiffiffi 1 /ðx= mÞ pffiffiffi gv1 ðzjm; xÞdx m
Z1 ¼
/ðxj0; mÞgv1 ðzjm; xÞdx; 0
which is the required result. Q.E.D. In Corollary 9.2, the possibly non-integer value m of the df is also used as the variance of the normal weights for the inverse-chi mixture, which shows another aspect of the arbitrariness of the scales of the kernel in a mixture.
Corollary 9.3 The pdf of St(0, 1, ν) is given by the following mixtures:
(i) the mixture of the scale parameters $\sqrt{\nu X}$ of the scaled inverse-chi distribution with ν df when $X\sim\chi^2(1)$;
(ii) the mixture of the scale parameters $\sqrt{X}$ of the scaled inverse-chi distribution with ν df when $X\sim$ Gamma{1/2, 1/(2ν)} with 1/(2ν) being the rate parameter;
(iii) the mixture of the scale parameters $\sqrt{\nu X^c}$ of the scaled inverse-chi distribution with ν df when $X\sim$ Power-Γ(1/2, 1/2, c).

Proof (i) Theorem 9.3 gives the pdf of St(0, 1, ν) as
\[ \int_0^\infty \phi(x^*)\,g_{\chi^{-1}}(z\mid\nu,\sqrt{\nu}\,x^*)\,{\rm d}x^*. \]
Consider the variable transformation $x = x^{*2}$ with ${\rm d}x^*/{\rm d}x = 1/(2\sqrt{x})$. Then, the above result becomes
\[
\int_0^\infty \phi(x^*)\,g_{\chi^{-1}}(z\mid\nu,\sqrt{\nu}\,x^*)\,{\rm d}x^*
= \int_0^\infty \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x}{2}\right)\frac{1}{2\sqrt{x}}\,g_{\chi^{-1}}(z\mid\nu,\sqrt{\nu x})\,{\rm d}x
= \int_0^\infty \frac{1}{2}\,g_{\chi^2}(x\mid 1)\,g_{\chi^{-1}}(z\mid\nu,\sqrt{\nu x})\,{\rm d}x,
\]
which is the required result.
which is the required result. (ii) For the second result, consider the variable transformation x ¼ mx2 with pffiffiffiffiffiffiffi pffiffiffiffiffi x ¼ x=m and dx =dx ¼ 1=ð2 mxÞ. Then, the pdf of Stð0; 1; mÞ is Z1
pffiffiffi /ðx Þgv1 ðzjm; mx Þdx ¼
0
Z1 0
Z1 ¼ 0
Z1 ¼
x 1 pffiffiffi 1 pffiffiffiffiffiffi exp pffiffiffiffiffi gv1 ðzjm; xÞdx 2m 2 mx 2p x pffiffiffi 1 xð1=2Þ1 pffiffiffiffiffi gv1 ðzjm; xÞdx exp 2 2mCð1=2Þ 2m pffiffiffi 1 gC fxj1=2; 1=ð2mÞggv1 ðzjm; xÞdx; 2
0
which gives the second required result.
(iii) For the remaining result, consider the variable transformation $x = x^{*\,2/c}$ with $x^* = x^{c/2}$ and ${\rm d}x^*/{\rm d}x = (c/2)\,x^{(c/2)-1}$. Then, the pdf of St(0, 1, ν) is
\[
\int_0^\infty \phi(x^*)\,g_{\chi^{-1}}(z\mid\nu,\sqrt{\nu}\,x^*)\,{\rm d}x^*
= \int_0^\infty \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^c}{2}\right)\frac{|c|\,x^{(c/2)-1}}{2}\,g_{\chi^{-1}}(z\mid\nu,\sqrt{\nu x^c})\,{\rm d}x
\]
\[
= \int_0^\infty \frac{1}{2}\cdot\frac{(1/2)^{1/2}\,|c|\,x^{(c/2)-1}}{\Gamma(1/2)}\exp\left(-\frac{x^c}{2}\right)g_{\chi^{-1}}(z\mid\nu,\sqrt{\nu x^c})\,{\rm d}x
= \int_0^\infty \frac{1}{2}\,g_{Power\Gamma}(x\mid 1/2,1/2,c)\,g_{\chi^{-1}}(z\mid\nu,\sqrt{\nu x^c})\,{\rm d}x,
\]
which is the last required result. Q.E.D.

As addressed after Corollary 9.2, Corollary 9.3 shows other cases of the arbitrariness of scales in a mixture. It is found that the weights using the (power-)gamma distribution can be employed in both the normal and inverse-chi mixtures.

Corollary 9.4 The pdf of St(0, 1, ν) is given by the mixture of the second (rate) parameters of Power-Γ(ν/2, νX^c/2, −2) when $X\sim$ Power-Γ(1/2, 1/2, c).

Proof From the last result (iii) of Corollary 9.3, noting that the scaled inverse-chi distribution is a special case of the power-gamma, we have
\[
\int_0^\infty \frac{1}{2}\,g_{Power\Gamma}(x\mid 1/2,1/2,c)\,g_{\chi^{-1}}(z\mid\nu,\sqrt{\nu x^c})\,{\rm d}x
= \int_0^\infty \frac{1}{2}\,g_{Power\Gamma}(x\mid 1/2,1/2,c)\,g_{Power\Gamma}(z\mid\nu/2,\,\nu x^c/2,\,-2)\,{\rm d}x,
\]
giving the required result. Q.E.D.

The above result shows that the pdf of St(0, 1, ν) can be expressed as a power-gamma mixture with power-gamma weights, where the value of the power c in the former Power-Γ, which is also used in the second parameter of the latter Power-Γ, is arbitrary as in other cases due to the model indeterminacy of a mixture.

Remark 9.4 In many textbooks, the standard t-distributed variable is defined as $X/\sqrt{Y/\nu}$, where $X\sim N(0,1)$ and $Y\sim\chi^2(\nu)$ independent of X, as shown in Proof 3 of Lemma 9.1. On the other hand, the pdf is typically given by the precision mixture of normal distributions using the weights of the gamma distribution with the shape and rate parameters being ν/2, as shown in (i) of Proof 1 of Lemma 9.3. There seems to be some gap between these two explanations. The latter derivation of the pdf is chosen due to its simplicity, while to have the pdf from $X/\sqrt{Y/\nu}$ we have to use the joint distribution of X and Y, variable transformation
and marginalization, which gives an indirect impression. It is of some interest to see the relationship between mixture and marginalization in these two cases that are seemingly distinct. In Proof 3 of Lemma 9.1, the following joint pdf of $Z = X/\sqrt{Y/\nu}$ and Y is used before marginalization:
\[ \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{z^2 y}{2\nu}\right)\sqrt{\frac{y}{\nu}}\,\frac{y^{(\nu/2)-1}}{2^{\nu/2}\,\Gamma(\nu/2)}\exp\left(-\frac{y}{2}\right), \]
where $\sqrt{y/\nu}$ comes from the Jacobian. Then, moving this factor forward, the above pdf becomes the product of two pdfs:
\[ \left\{\frac{\sqrt{y/\nu}}{\sqrt{2\pi}}\exp\left(-\frac{z^2 y}{2\nu}\right)\right\}\left\{\frac{y^{(\nu/2)-1}}{2^{\nu/2}\,\Gamma(\nu/2)}\exp\left(-\frac{y}{2}\right)\right\},
\]
which shows the form of the (scaled) precision mixture with the chi-square weights when y is integrated out, as shown in (ii) of Lemma 9.3. That is, this marginalization can be seen as a case of an (unscaled) chi-square mixture.
9.3 The Multivariate t-Distribution
The p-variate t-distribution with ν df, denoted by $Z\sim St(\mu,\Sigma,\nu)$, is defined by the pdf at Z = z:
\[ g(z\mid\mu,\Sigma,\nu) = \frac{\Gamma\{(\nu+p)/2\}}{(\pi\nu)^{p/2}\,\Gamma(\nu/2)\,|\Sigma|^{1/2}}\left\{1+\frac{(z-\mu)^{\rm T}\Sigma^{-1}(z-\mu)}{\nu}\right\}^{-(\nu+p)/2} \]
(Cornish [4, Eq. (1)]; Kotz and Nadarajah [17, Eq. (1.1)]).

Lemma 9.4 The pdf of the p-variate t-distribution when $\mu=0$ and $\Sigma=I_p$, where $I_p$ is the $p\times p$ identity matrix, is given when Z is equal to $X/\sqrt{Y/\nu}$, $\sqrt{\nu}\sqrt{Y}X$, $\sqrt{\nu}\,YX$ or $\sqrt{\nu}\,X/Y$, where $X\sim N(0,I_p)$ and Y is chi-square, inverse chi-square, inverse-chi and chi-distributed, respectively, with ν df independent of X.

Proof We give the familiar case of $Z = X/\sqrt{Y/\nu}$. The pdf of the joint distribution of X and Y is
\[ \phi_p(X=x\mid\mu=0,\Sigma=I_p)\,g_{\chi^2}(Y=y\mid\nu) = \frac{1}{(2\pi)^{p/2}}\exp\left(-\frac{x^{\rm T}x}{2}\right)\frac{y^{(\nu/2)-1}}{2^{\nu/2}\,\Gamma(\nu/2)}\exp\left(-\frac{y}{2}\right). \]
Use the variable transformation from (X and Y) to $(Z = X/\sqrt{Y/\nu},\ Y)$ with the Jacobian $\prod_{i=1}^p \partial x_i/\partial z_i = (y/\nu)^{p/2}$. Then, noting $X = Z\sqrt{Y/\nu}$, the pdf of the joint distribution of Z and Y becomes
\[
\frac{(y/\nu)^{p/2}}{(2\pi)^{p/2}}\exp\left(-\frac{z^{\rm T}z\,y}{2\nu}\right)\frac{y^{(\nu/2)-1}}{2^{\nu/2}\,\Gamma(\nu/2)}\exp\left(-\frac{y}{2}\right)
\]
\[
= \frac{\Gamma\{(\nu+p)/2\}}{(2\pi\nu)^{p/2}\,2^{\nu/2}\,\Gamma(\nu/2)}\left(\frac{1}{2}+\frac{z^{\rm T}z}{2\nu}\right)^{-(\nu+p)/2}\left(\frac{1}{2}+\frac{z^{\rm T}z}{2\nu}\right)^{(\nu+p)/2}\frac{y^{\{(\nu+p)/2\}-1}}{\Gamma\{(\nu+p)/2\}}\exp\left\{-\left(\frac{1}{2}+\frac{z^{\rm T}z}{2\nu}\right)y\right\}
\]
\[
= \frac{\Gamma\{(\nu+p)/2\}}{(\pi\nu)^{p/2}\,\Gamma(\nu/2)}\left(1+\frac{z^{\rm T}z}{\nu}\right)^{-(\nu+p)/2}\left(\frac{1}{2}+\frac{z^{\rm T}z}{2\nu}\right)^{(\nu+p)/2}\frac{y^{\{(\nu+p)/2\}-1}}{\Gamma\{(\nu+p)/2\}}\exp\left\{-\left(\frac{1}{2}+\frac{z^{\rm T}z}{2\nu}\right)y\right\},
\]
where
\[ \left(\frac{1}{2}+\frac{z^{\rm T}z}{2\nu}\right)^{(\nu+p)/2}\frac{y^{\{(\nu+p)/2\}-1}}{\Gamma\{(\nu+p)/2\}}\exp\left\{-\left(\frac{1}{2}+\frac{z^{\rm T}z}{2\nu}\right)y\right\} \]
is the pdf of Gamma$\left(\frac{\nu+p}{2},\ \frac{1}{2}+\frac{z^{\rm T}z}{2\nu}\right)$ or the scaled chi-square with $\nu+p$ df and scale parameter $\left(1+\frac{z^{\rm T}z}{\nu}\right)^{-1}$, which is integrated out to have the pdf of Z. The remaining cases when Z is $\sqrt{\nu}\sqrt{Y}X$, $\sqrt{\nu}\,YX$ or $\sqrt{\nu}\,X/Y$ are given similarly. Q.E.D.

Note that Lemma 9.4 parallels Lemma 9.1 for the univariate t-distribution. Introducing the location and scale parameters μ and $\Sigma^{1/2}$, a symmetric matrix square root of Σ, respectively, we have
\[ g(z\mid\mu,\Sigma,\nu) = \frac{\Gamma\{(\nu+p)/2\}}{(\pi\nu)^{p/2}\,\Gamma(\nu/2)\,|\Sigma|^{1/2}}\left\{1+\frac{(z-\mu)^{\rm T}\Sigma^{-1}(z-\mu)}{\nu}\right\}^{-(\nu+p)/2}. \]
The derivation of $g(z\mid\mu,\Sigma,\nu)$ using mixtures is also given in various ways as in the univariate case of Lemma 9.3.

Lemma 9.5 The pdf of $St(\mu,\Sigma,\nu)$ with $\nu > 0$ being real-valued is given by (i) the mixture of the precisions of $N(\mu, Y^{-1}\Sigma)$ when Y follows Gamma(ν/2, ν/2) with ν/2 being the shape and rate parameters, or equivalently the scaled chi-square with ν df and scale parameter $\nu^{-1}$;
(ii) the mixture of the precisions of $N(\mu, Y^{-1}\nu\Sigma)$ when Y follows Gamma(ν/2, 1/2) with ν/2 and 1/2 being the shape and rate parameters, respectively, or equivalently the chi-square with ν df. The same pdf is given by (iii) the mixture of the covariance matrices of $N(\mu, Y\Sigma)$ when Y follows the inverse gamma distribution denoted by Inv-Gamma(ν/2, ν/2) with ν/2 being the shape and scale parameters, or equivalently the scaled inverse chi-square with ν df and scale parameter ν; (iv) the mixture of the covariance matrices of $N(\mu, Y\nu\Sigma)$ when Y follows the inverse gamma distribution denoted by Inv-Gamma(ν/2, 1/2) with ν/2 and 1/2 being the shape and scale parameters, respectively, or equivalently the inverse chi-square with ν df.

Proof For (i), we have
\[
\int_0^\infty \frac{y^{p/2}}{(2\pi)^{p/2}\,|\Sigma|^{1/2}}\exp\left\{-\frac{y\,(z-\mu)^{\rm T}\Sigma^{-1}(z-\mu)}{2}\right\}\frac{(\nu/2)^{\nu/2}\,y^{(\nu/2)-1}}{\Gamma(\nu/2)}\exp\left(-\frac{\nu y}{2}\right){\rm d}y
\]
\[
= \frac{\Gamma\{(\nu+p)/2\}}{(2\pi)^{p/2}\,\Gamma(\nu/2)\,|\Sigma|^{1/2}}\left(\frac{\nu}{2}\right)^{\nu/2}\left\{\frac{\nu}{2}+\frac{(z-\mu)^{\rm T}\Sigma^{-1}(z-\mu)}{2}\right\}^{-(\nu+p)/2}
\]
\[
\times\int_0^\infty \left\{\frac{\nu}{2}+\frac{(z-\mu)^{\rm T}\Sigma^{-1}(z-\mu)}{2}\right\}^{(\nu+p)/2}\frac{y^{\{(\nu+p)/2\}-1}}{\Gamma\{(\nu+p)/2\}}\exp\left[-\left\{\frac{\nu}{2}+\frac{(z-\mu)^{\rm T}\Sigma^{-1}(z-\mu)}{2}\right\}y\right]{\rm d}y
\]
\[
= \frac{\Gamma\{(\nu+p)/2\}}{(\pi\nu)^{p/2}\,\Gamma(\nu/2)\,|\Sigma|^{1/2}}\left\{1+\frac{(z-\mu)^{\rm T}\Sigma^{-1}(z-\mu)}{\nu}\right\}^{-(\nu+p)/2}.
\]
The result of (ii) is given by redefining $Y\nu$ in (i) as Y. The result of (iii) is obtained from (i) considering that when Y is gamma distributed, $Y^{-1}$ follows the inverse gamma. The result of (iv) is given by redefining $Y\nu^{-1}$ in (iii) as Y. Q.E.D.
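Lemma 9.5(i) can be spot-checked numerically in, say, p = 2 dimensions with Σ = I. A sketch (plain Python; names and trial values are ours):

```python
import math

def gamma_pdf(y, a, rate):
    # gamma pdf with shape a and rate parameter `rate`
    return rate ** a * y ** (a - 1) * math.exp(-rate * y) / math.gamma(a)

def mvt_pdf(z, nu, p):
    # p-variate t pdf at z with mu = 0, Sigma = I
    q = sum(v * v for v in z)
    k = math.gamma((nu + p) / 2) / ((math.pi * nu) ** (p / 2) * math.gamma(nu / 2))
    return k * (1 + q / nu) ** (-(nu + p) / 2)

def precision_mixture(z, nu, p, n=200000, ymax=100.0):
    # integrate N_p(z | 0, y^(-1) I) * Gamma(y | nu/2, rate nu/2) over y
    q = sum(v * v for v in z)
    h = ymax / n
    s = 0.0
    for i in range(1, n):
        y = i * h
        norm = (y / (2 * math.pi)) ** (p / 2) * math.exp(-y * q / 2)
        s += norm * gamma_pdf(y, nu / 2, nu / 2)
    return s * h

z, nu, p = (0.8, -0.4), 5.0, 2
print(precision_mixture(z, nu, p), mvt_pdf(z, nu, p))  # the two values agree
```

The multivariate precision mixture thus parallels the univariate check for Lemma 9.3 exactly, with the quadratic form replacing $z^2$.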
9.4 The Pseudo t (PT)-Distribution
As addressed in Sect. 9.1, the PN can be extended using non-normal distributions. One of the most extensively investigated non-normal distributions is the t-distribution. The skew multivariate t-distribution has been derived by Branco and Dey [3, Sect. 3.2] using the precision mixture of the normal with the gamma weights, or equivalently by Gupta [9, Definition 1] as
\[ Y = (Y_1,\ldots,Y_p)^{\rm T} = X/\sqrt{W/\nu} = (X_1,\ldots,X_p)^{\rm T}/\sqrt{W/\nu}, \]
where X is multivariate skew-normal distributed, whose pdf is
\[ f(X=x) = 2\,\phi_p(x\mid\mu=0,\Sigma)\,\Phi(\alpha^{\rm T}x), \quad x\in\mathbb{R}^p,\ \alpha\in\mathbb{R}^p, \]
where Φ(·) is the cumulative distribution function (cdf) of the standard normal, and W is chi-square distributed with ν df. When $\alpha=0$, we have $X\sim N(0,\Sigma)$, giving $Y\sim St(0,\Sigma,\nu)$, which is the only case of a symmetric distribution for the multivariate skew t (ST), as in the multivariate SN.
9.4.1 The PDF of the PT
Recall that the pdf of $Y\sim PN_{p,q,R}(\mu,\Sigma,\Delta,\eta,D,A,B)$ is
\[ f_{p,q,R}(y\mid\mu,\Sigma,\Delta,\eta,D,A,B) = \frac{\phi_p(y\mid\mu,\Sigma)}{\Pr(Z\in S\mid\eta,\Omega)}\,\Pr\{Z\in S\mid \eta+\Delta(y-\mu),\,D\}. \]

Definition 9.1 Let $Y\sim PN_{p,q,R}(\mu=0,\Sigma,\Delta,\eta=0,D,A,B)$. Then, the random p-vector following the pseudo t (PT)-distribution is defined as $\sqrt{\nu}\,Y/\sqrt{W^c}$, where $W\sim$ Power-Γ(ν/2, 1/2, c) independent of Y.

Gupta's [9, Definition 1] case (see also Gupta et al. [11, Sect. 4]) shown earlier is a special case when W is chi-square distributed with ν df, or equivalently $W\sim$ Power-Γ(ν/2, 1/2, 1). As shown after Theorem 9.2, when $c = -2, -1, 1, 2$, W is inverse-chi, inverse chi-square, chi-square and chi-distributed, respectively. For simplicity, consider the case $\mu=0$, $q=1$, $\eta=0$, $\Delta=\delta^{\rm T}$, $D=d$ and $\Omega=\omega=\delta^{\rm T}\Sigma\delta+d$. Then, we have
\[ \Pr\{Z\in S\mid\eta+\Delta(y-\mu),\,D\} = \Pr(Z\in S\mid \delta^{\rm T}y,\,d). \]
When Y = y is given, $Z = Z^*\sim N(\delta^{\rm T}y,\,d)$ without truncation, or $Z^*\in\mathbb{R}$.
Lemma 9.6 Suppose that $W \sim$ Power-$\Gamma\{(\nu+p)/2, b, c\}$. Then,
$$\begin{aligned}
&\mathrm E_W\{\Pr(Z \in S \mid d\sqrt{W^c/\nu}, \delta)\}
= \mathrm E_W\left[\int_a^b \frac{1}{\sqrt{2\pi\delta}}\exp\left\{-\left(z - d\sqrt{W^c/\nu}\right)^2\Big/(2\delta)\right\}\mathrm dz\right]\\
&= \left(1 + \frac{d^2/\nu}{2b\delta}\right)^{-(\nu+p)/2}\sum_{u=0}^{\infty}\left\{\frac{d}{\nu^{1/2}\delta}\left(b + \frac{d^2/\nu}{2\delta}\right)^{-1/2}\right\}^u\frac{\Gamma\{(\nu+p+u)/2\}}{\Gamma\{(\nu+p)/2\}\,u!}\,\mathrm E\{Z^u \mid Z \sim \mathrm N(0, \delta), \mathbf a, \mathbf b\},
\end{aligned}$$
where $\int_a^b(\cdot)\,\mathrm dz \equiv \sum_{r=1}^R\int_{a_r}^{b_r}(\cdot)\,\mathrm dz$ and
$$\mathrm E\{Z^u \mid Z \sim \mathrm N(0, \delta), \mathbf a, \mathbf b\} = \int_a^b \frac{z^u}{\sqrt{2\pi\delta}}\exp\left(-\frac{z^2}{2\delta}\right)\mathrm dz.$$
Proof
$$\begin{aligned}
&\int_0^\infty g_{\mathrm P\Gamma}\{w \mid (\nu+p)/2, b, c\}\Pr(Z \in S \mid d\sqrt{w^c/\nu}, \delta)\,\mathrm dw\\
&= \int_0^\infty \frac{b^{(\nu+p)/2}|c|w^{c\{(\nu+p)/2\}-1}\exp(-bw^c)}{\Gamma\{(\nu+p)/2\}}\int_a^b\frac{1}{\sqrt{2\pi\delta}}\exp\left\{-\left(z - d\sqrt{w^c/\nu}\right)^2\Big/(2\delta)\right\}\mathrm dz\,\mathrm dw\\
&= \int_a^b\int_0^\infty \frac{b^{(\nu+p)/2}|c|w^{c\{(\nu+p)/2\}-1}}{\Gamma\{(\nu+p)/2\}}\exp(-bw^c)\frac{1}{\sqrt{2\pi\delta}}\sum_{u=0}^\infty\frac{\left(zd\sqrt{w^c/\nu}\big/\delta\right)^u}{u!}\exp\left(-\frac{d^2w^c/\nu}{2\delta}\right)\exp\left(-\frac{z^2}{2\delta}\right)\mathrm dw\,\mathrm dz\\
&= \frac{b^{(\nu+p)/2}}{\Gamma\{(\nu+p)/2\}}\sum_{u=0}^\infty\frac{1}{u!}\int_a^b\frac{\{zd/(\nu^{1/2}\delta)\}^u}{\sqrt{2\pi\delta}}\int_0^\infty|c|w^{c\{(\nu+p+u)/2\}-1}\exp\left\{-\left(b + \frac{d^2/\nu}{2\delta}\right)w^c\right\}\mathrm dw\,\exp\left(-\frac{z^2}{2\delta}\right)\mathrm dz\\
&= b^{(\nu+p)/2}\sum_{u=0}^\infty\frac{\{d/(\nu^{1/2}\delta)\}^u\,\Gamma\{(\nu+p+u)/2\}}{\Gamma\{(\nu+p)/2\}\,u!}\left(b + \frac{d^2/\nu}{2\delta}\right)^{-(\nu+p+u)/2}\int_a^b\frac{z^u}{\sqrt{2\pi\delta}}\exp\left(-\frac{z^2}{2\delta}\right)\mathrm dz\\
&= \left(1 + \frac{d^2/\nu}{2b\delta}\right)^{-(\nu+p)/2}\sum_{u=0}^\infty\left\{\frac{d}{\nu^{1/2}\delta}\left(b + \frac{d^2/\nu}{2\delta}\right)^{-1/2}\right\}^u\frac{\Gamma\{(\nu+p+u)/2\}}{\Gamma\{(\nu+p)/2\}\,u!}\,\mathrm E\{Z^u \mid Z \sim \mathrm N(0, \delta), \mathbf a, \mathbf b\},
\end{aligned}$$
which is the required result. Q.E.D.

Theorem 9.4 The pdf of the pseudo t-distributed vector $\mathbf Y^* = \sqrt\nu\,\mathbf Y/\sqrt{W^c}$ at $\mathbf Y^* = \mathbf y^*$, where $\mathbf Y \sim \mathrm{PN}_{p,1,R}(\mathbf 0, \boldsymbol\Sigma, \boldsymbol\delta^{\mathrm T}, 0, \delta, \mathbf a, \mathbf b)$ and $W \sim$ Power-$\Gamma(\nu/2, 1/2, c)$ independent of $\mathbf Y$, is
$$t(\mathbf y^* \mid \mathbf 0, \boldsymbol\Sigma, \nu)\left(1 + \frac{d^2/\nu}{2b\delta}\right)^{-(\nu+p)/2}\sum_{u=0}^\infty\left\{\frac{d}{\nu^{1/2}\delta}\left(b + \frac{d^2/\nu}{2\delta}\right)^{-1/2}\right\}^u\frac{\Gamma\{(\nu+p+u)/2\}}{\Gamma\{(\nu+p)/2\}\,u!}\,\frac{\mathrm E\{Z^u \mid Z \sim \mathrm N(0, \delta), \mathbf a, \mathbf b\}}{\int_a^b\phi(z \mid 0, \omega)\,\mathrm dz},$$
where $d \equiv \boldsymbol\delta^{\mathrm T}\mathbf y^*$ and $b \equiv \{1 + \mathbf y^{*\mathrm T}(\nu\boldsymbol\Sigma)^{-1}\mathbf y^*\}/2$.
Proof The pdf of $\mathbf Y^*$, with the Jacobian $(w^c/\nu)^{p/2}$ from $\mathbf Y^*$ to $\mathbf Y$, is
$$\begin{aligned}
&\int_0^\infty g_{\mathrm P\Gamma}(w \mid \nu/2, 1/2, c)\left(\frac{w^c}{\nu}\right)^{p/2}f_{p,1,R}\left(\mathbf y^*\sqrt{w^c/\nu}\,\Big|\,\mathbf 0, \boldsymbol\Sigma, \boldsymbol\delta^{\mathrm T}, 0, \delta, \mathbf a, \mathbf b\right)\mathrm dw\\
&= \int_0^\infty g_{\mathrm P\Gamma}(w \mid \nu/2, 1/2, c)\left(\frac{w^c}{\nu}\right)^{p/2}\frac{\phi_p(\mathbf y^*\sqrt{w^c/\nu} \mid \mathbf 0, \boldsymbol\Sigma)}{\Pr(Z \in S \mid 0, \omega)}\Pr\left(Z \in S \,\Big|\, \boldsymbol\delta^{\mathrm T}\mathbf y^*\sqrt{w^c/\nu}, \delta\right)\mathrm dw\\
&= \int_0^\infty \frac{|c|w^{c(\nu/2)-1}}{2^{\nu/2}\Gamma(\nu/2)}\exp(-w^c/2)\left(\frac{w^c}{\nu}\right)^{p/2}\frac{1}{(2\pi)^{p/2}|\boldsymbol\Sigma|^{1/2}}\exp\left(-\frac{\mathbf y^{*\mathrm T}\boldsymbol\Sigma^{-1}\mathbf y^*\,w^c}{2\nu}\right)\frac{\Pr(Z \in S \mid \boldsymbol\delta^{\mathrm T}\mathbf y^*\sqrt{w^c/\nu}, \delta)}{\Pr(Z \in S \mid 0, \omega)}\mathrm dw\\
&= \frac{\Gamma\{(\nu+p)/2\}}{(\pi\nu)^{p/2}\Gamma(\nu/2)|\boldsymbol\Sigma|^{1/2}}\left(1 + \frac{\mathbf y^{*\mathrm T}\boldsymbol\Sigma^{-1}\mathbf y^*}{\nu}\right)^{-(\nu+p)/2}\\
&\qquad\times\int_0^\infty\left(\frac{1}{2} + \frac{\mathbf y^{*\mathrm T}\boldsymbol\Sigma^{-1}\mathbf y^*}{2\nu}\right)^{(\nu+p)/2}\frac{|c|w^{c(\nu+p)/2-1}}{\Gamma\{(\nu+p)/2\}}\exp\left\{-\left(\frac{1}{2} + \frac{\mathbf y^{*\mathrm T}\boldsymbol\Sigma^{-1}\mathbf y^*}{2\nu}\right)w^c\right\}\frac{\Pr(Z \in S \mid \boldsymbol\delta^{\mathrm T}\mathbf y^*\sqrt{w^c/\nu}, \delta)}{\Pr(Z \in S \mid 0, \omega)}\mathrm dw\\
&= t(\mathbf y^* \mid \mathbf 0, \boldsymbol\Sigma, \nu)\int_0^\infty g_{\mathrm P\Gamma}\left[w \,\Big|\, (\nu+p)/2, \{1 + \mathbf y^{*\mathrm T}(\nu\boldsymbol\Sigma)^{-1}\mathbf y^*\}/2, c\right]\frac{\Pr(Z \in S \mid \boldsymbol\delta^{\mathrm T}\mathbf y^*\sqrt{w^c/\nu}, \delta)}{\Pr(Z \in S \mid 0, \omega)}\,\mathrm dw\\
&= t(\mathbf y^* \mid \mathbf 0, \boldsymbol\Sigma, \nu)\left(1 + \frac{d^2/\nu}{2b\delta}\right)^{-(\nu+p)/2}\sum_{u=0}^\infty\left\{\frac{d}{\nu^{1/2}\delta}\left(b + \frac{d^2/\nu}{2\delta}\right)^{-1/2}\right\}^u\frac{\Gamma\{(\nu+p+u)/2\}}{\Gamma\{(\nu+p)/2\}\,u!}\,\frac{\mathrm E\{Z^u \mid Z \sim \mathrm N(0, \delta), \mathbf a, \mathbf b\}}{\int_a^b\phi(z \mid 0, \omega)\,\mathrm dz},
\end{aligned}$$
where Lemma 9.6 is used for the last equality, giving the required result. Q.E.D.

When p = 1 and $Z$ is normally distributed under single truncation as in the SN, we have a simplified version of Lemma 9.6.

Lemma 9.7 Suppose that $W \sim$ Power-$\Gamma\{(\nu+1)/2, b, c\}$, $a_1 = -\infty$ and $b_1 = 0$ with $R = 1$, i.e., $\Pr(Z \in S \mid \cdot) = \Pr(Z < 0 \mid \cdot)$. Then,
$$\begin{aligned}
\mathrm E_W[\Pr\{Z \in S \mid Z \sim \mathrm N(d\sqrt{W^c/\nu}, \delta)\}] &= \mathrm E_W\left[\int_{-\infty}^0\frac{1}{\sqrt{2\pi\delta}}\exp\left\{-\left(z - d\sqrt{W^c/\nu}\right)^2\Big/(2\delta)\right\}\mathrm dz\right]\\
&= \Pr\left\{Z^* < -d\sqrt{(\nu+1)/(2\nu b)}\,\Big|\,Z^* \sim \mathrm{St}(0, \delta, \nu+1)\right\}.
\end{aligned}$$

Proof
$$\int_0^\infty g_{\mathrm P\Gamma}\{w \mid (\nu+1)/2, b, c\}\Pr\{Z \in S \mid \mathrm N(d\sqrt{w^c/\nu}, \delta)\}\,\mathrm dw$$
$$\begin{aligned}
&= \int_0^\infty\frac{b^{(\nu+1)/2}|c|w^{c\{(\nu+1)/2\}-1}\exp(-bw^c)}{\Gamma\{(\nu+1)/2\}}\int_{-\infty}^0\frac{1}{\sqrt{2\pi\delta}}\exp\left\{-\left(z - d\sqrt{w^c/\nu}\right)^2\Big/(2\delta)\right\}\mathrm dz\,\mathrm dw\\
&= \int_0^\infty\frac{b^{(\nu+1)/2}|c|w^{c\{(\nu+1)/2\}-1}\exp(-bw^c)}{\Gamma\{(\nu+1)/2\}}\int_{-\infty}^{-d\sqrt{w^c/\nu}}\frac{1}{\sqrt{2\pi\delta}}\exp\{-z^2/(2\delta)\}\,\mathrm dz\,\mathrm dw\\
&= \int_0^\infty\frac{b^{(\nu+1)/2}|c|w^{c\{(\nu+1)/2\}-1}\exp(-bw^c)}{\Gamma\{(\nu+1)/2\}}\int_{-\infty}^{-d}\sqrt{\frac{w^c}{2\pi\nu\delta}}\exp\left(-\frac{z^{\#2}w^c}{2\nu\delta}\right)\mathrm dz^{\#}\,\mathrm dw\\
&= \int_{-\infty}^{-d}\frac{1}{\sqrt{2\pi\nu\delta}}\int_0^\infty\frac{b^{(\nu+1)/2}|c|w^{c\{(\nu+2)/2\}-1}}{\Gamma\{(\nu+1)/2\}}\exp\left\{-\left(b + \frac{z^{\#2}/\nu}{2\delta}\right)w^c\right\}\mathrm dw\,\mathrm dz^{\#}\\
&= \int_{-\infty}^{-d}\frac{b^{(\nu+1)/2}\,\Gamma\{(\nu+2)/2\}}{\sqrt{2\pi\nu\delta}\,\Gamma\{(\nu+1)/2\}}\left(b + \frac{z^{\#2}/\nu}{2\delta}\right)^{-(\nu+2)/2}\mathrm dz^{\#}\\
&= \int_{-\infty}^{-d}\frac{\Gamma\{(\nu+2)/2\}}{\sqrt{\pi\nu}\,\Gamma\{(\nu+1)/2\}\sqrt{2\delta b}}\left(1 + \frac{z^{\#2}}{2\delta b\nu}\right)^{-(\nu+2)/2}\mathrm dz^{\#}\\
&= \int_{-\infty}^{-d\sqrt{(\nu+1)/\nu}}\frac{\Gamma\{(\nu+2)/2\}}{\sqrt{\pi(\nu+1)}\,\Gamma\{(\nu+1)/2\}\sqrt{2\delta b}}\left(1 + \frac{z^2}{2\delta b(\nu+1)}\right)^{-(\nu+2)/2}\mathrm dz\\
&= \Pr\left\{Z^* < -d\sqrt{(\nu+1)/\nu}\,\Big|\,Z^* \sim \mathrm{St}(0, 2\delta b, \nu+1)\right\}\\
&= \Pr\left\{Z^* < -d\sqrt{(\nu+1)/(2\nu b)}\,\Big|\,Z^* \sim \mathrm{St}(0, \delta, \nu+1)\right\},
\end{aligned}$$
which is the required result. Q.E.D.

Under the same condition for $Z$ when p = 1, we have the following special case of Theorem 9.4.
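Lemma 9.7 can be checked numerically. The sketch below (not from the book; the parameter values are arbitrary) takes $c = 1$, so that Power-$\Gamma\{(\nu+1)/2, b, 1\}$ is an ordinary gamma distribution with shape $(\nu+1)/2$ and rate $b$, and compares the Monte Carlo average of $\Pr(Z < 0 \mid \mathrm N(d\sqrt{W/\nu}, \delta))$ with the Student-t probability on the right-hand side, the t cdf being obtained by direct numerical integration of its density:

```python
from math import erf, sqrt, gamma
import numpy as np

def t_cdf(x, df, scale2=1.0):
    """cdf of St(0, scale2, df) at x, by trapezoidal integration of the density."""
    c = gamma((df + 1) / 2) / (gamma(df / 2) * sqrt(df * np.pi * scale2))
    z = np.linspace(x - 60.0, x, 200_001)
    pdf = c * (1.0 + z**2 / (df * scale2)) ** (-(df + 1) / 2)
    dz = z[1] - z[0]
    return float(np.sum(pdf) * dz - 0.5 * (pdf[0] + pdf[-1]) * dz)

nu, b, d, delta = 6.0, 0.8, 1.3, 0.7
rng = np.random.default_rng(0)
w = rng.gamma(shape=(nu + 1) / 2, scale=1.0 / b, size=400_000)  # Power-Gamma((nu+1)/2, b, c=1)

# left-hand side: E_W Pr(Z < 0 | Z ~ N(d*sqrt(W/nu), delta)) = E_W Phi(-d*sqrt(W/nu)/sqrt(delta))
args = -d * np.sqrt(w / nu) / sqrt(delta)
lhs = float(np.mean([0.5 * (1.0 + erf(a / sqrt(2.0))) for a in args]))

# right-hand side: Pr(Z* < -d*sqrt((nu+1)/(2*nu*b)) | Z* ~ St(0, delta, nu+1))
rhs = t_cdf(-d * sqrt((nu + 1) / (2 * nu * b)), nu + 1, scale2=delta)
print(lhs, rhs)  # the two values agree to Monte Carlo accuracy
```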
Corollary 9.5 The pdf of the pseudo t-distributed variable $Y^* = \sqrt\nu\,Y/\sqrt{W^c}$ at $Y^* = y^*$, where $Y \sim \mathrm{PN}_{1,1,1}(0, \sigma^2, \delta, 0, \delta, -\infty, 0)$ and $W \sim$ Power-$\Gamma(\nu/2, 1/2, c)$ independent of $Y$, is
$$2\,t(y^* \mid 0, \sigma^2, \nu)\Pr\left\{Z < -\delta y^*\sqrt{\frac{\nu+1}{\nu + (y^{*2}/\sigma^2)}}\,\Big|\,Z \sim \mathrm{St}(0, \delta, \nu+1)\right\}.$$

Proof Use the second last result of the proof of Theorem 9.4 when $\mathbf y^* = y^*$, $\boldsymbol\Sigma = \sigma^2$ and $\boldsymbol\delta = \delta$ with p = 1 under the above condition. Then, Lemma 9.7 with $b = \{1 + \mathbf y^{*\mathrm T}(\nu\boldsymbol\Sigma)^{-1}\mathbf y^*\}/2 = [1 + \{y^{*2}/(\nu\sigma^2)\}]/2$ gives
$$\begin{aligned}
&\int_0^\infty g_{\mathrm P\Gamma}(w \mid \nu/2, 1/2, c)\left(\frac{w^c}{\nu}\right)^{p/2}f_{p,1,R}(\mathbf y^*\sqrt{w^c/\nu} \mid \mathbf 0, \boldsymbol\Sigma, \boldsymbol\delta^{\mathrm T}, 0, \delta, \mathbf a, \mathbf b)\,\mathrm dw\\
&= \int_0^\infty g_{\mathrm P\Gamma}(w \mid \nu/2, 1/2, c)(w^c/\nu)^{1/2}f_{1,1,1}(y^*\sqrt{w^c/\nu} \mid 0, \sigma^2, \delta, 0, \delta, -\infty, 0)\,\mathrm dw\\
&= t(y^* \mid 0, \sigma^2, \nu)\int_0^\infty g_{\mathrm P\Gamma}\left[w \,\Big|\,\frac{\nu+1}{2}, \left(1 + \frac{y^{*2}}{\nu\sigma^2}\right)\Big/2, c\right]\Pr\{Z \in S \mid \mathrm N(\delta y^*\sqrt{w^c/\nu}, \delta)\}\,\mathrm dw\Big/\Pr\{Z \in S \mid \mathrm N(0, \omega)\}\\
&= 2\,t(y^* \mid 0, \sigma^2, \nu)\Pr\left\{Z < -\delta y^*\sqrt{\left(1 + \frac{y^{*2}}{\nu\sigma^2}\right)^{-1}\frac{\nu+1}{\nu}}\,\Big|\,Z \sim \mathrm{St}(0, \delta, \nu+1)\right\}\\
&= 2\,t(y^* \mid 0, \sigma^2, \nu)\Pr\left\{Z < -\delta y^*\sqrt{\frac{\nu+1}{\nu + (y^{*2}/\sigma^2)}}\,\Big|\,Z \sim \mathrm{St}(0, \delta, \nu+1)\right\},
\end{aligned}$$
where $\int_{-\infty}^0\phi(z \mid 0, \omega = \delta^2\sigma^2 + \delta)\,\mathrm dz = 1/2$ is used, giving the required result. Q.E.D.

In Corollary 9.5, when $\sigma^2 = 1$, the conditional variance $\delta = 1$ and the coefficient $\delta = -\lambda$, we have the result for $Y \sim \mathrm{PN}_{1,1,1}(0, 1, -\lambda, 0, 1, -\infty, 0) = \mathrm{SN}$ with the pdf $2\phi(y)\Phi(\lambda y)$ at $Y = y$, which was obtained using the chi-square distributed $W$ by Gupta et al. [11, Eq. 4.2] and Gupta [9, Eq. (2.4)]. Their former expression of the pdf of the skew t (ST) is written as
$$2\,t(y^* \mid 0, 1, \nu)\Pr\left\{Z < \lambda y^*\sqrt{\frac{\nu+1}{\nu(\nu + y^{*2})}}\,\Big|\,Z \sim \mathrm{St}(0, 1, \nu+1)\right\}$$
using our notation, where the denominator $\nu(\nu + y^{*2})$, due to a possible typo, should be read as $\nu + y^{*2}$; the corrected form is obtained as a special case of Corollary 9.5. After this correction, it is confirmed that when $\nu$ goes to infinity, the pdf becomes
$$2\,\phi(y^* \mid 0, 1)\Pr\{Z < \lambda y^* \mid Z \sim \mathrm N(0, 1)\} = 2\,\phi(y^*)\Phi(\lambda y^*),$$
i.e., the pdf of the SN as expected, rather than $\phi(y^* \mid 0, 1)$ unless corrected. Note that Theorem 9.4 and Corollary 9.5 extend the pdf of the ST given by Gupta et al. to the PT under hidden sectional truncation, or to the ST using the power-gamma distribution, for generality.

Gupta et al.'s pdf quoted above is based on Gupta et al. [11, Lemma 3] and Gupta [9, Lemma 1], where the former result is
$$\mathrm E_W\{\Phi(c\sqrt W) \mid W \sim \chi^2(\nu)\} = \Pr\{Z < c\sqrt\nu \mid Z \sim \mathrm{St}(0, 1, \nu)\},\quad c \in \mathbb R.$$
Their lemma is seen as a special case of Lemma 9.7 when $c = -d/\sqrt\nu$ with $\delta = 1$, and $W \sim$ Power-$\Gamma\{(\nu+1)/2, b, c\}$ is replaced by $W \sim$ Power-$\Gamma(\nu/2, 1/2, 1)$, i.e., $b = 1/2$, or equivalently $W \sim \chi^2(\nu)$. For ease of reference and comparison, the corresponding extended expression equivalent to Lemma 9.7 is given as follows, with $c^*$ denoting their constant to avoid a clash with the power parameter $c$.

Lemma 9.8 Suppose that $W \sim$ Power-$\Gamma(\nu/2, b, c)$. Then,
$$\mathrm E_W[\Pr\{Z < c^*\sqrt{W^c} \mid Z \sim \mathrm N(0, \delta)\}] = \Pr\{Z^* < c^*\sqrt{\nu/(2b)} \mid Z^* \sim \mathrm{St}(0, \delta, \nu)\},\quad c^* \in \mathbb R.$$

Proof In Lemma 9.7, employ the reparametrization $c^* = -d/\sqrt\nu$ followed by the transformation of $W \sim$ Power-$\Gamma\{(\nu+1)/2, b, c\}$ to $W \sim$ Power-$\Gamma(\nu/2, b, c)$. Q.E.D.

Lemma 9.8 is an extension of Gupta et al.'s lemma using their expression with the added scale parameters $b$ and $\delta$, and the power gamma with the shape parameter $c$.
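The typo correction is easy to exhibit numerically. The sketch below (not from the book; grid sizes and parameter values are arbitrary) evaluates the corrected ST pdf of Corollary 9.5's SN special case on a grid and compares its implied probabilities with Monte Carlo samples of $Y^* = X/\sqrt{W/\nu}$; the misprinted denominator $\nu(\nu + y^{*2})$ produces clearly different probabilities:

```python
import math
import numpy as np

def t_pdf(x, df):
    c = math.gamma((df + 1) / 2) / (math.gamma(df / 2) * math.sqrt(df * math.pi))
    return c * (1.0 + x**2 / df) ** (-(df + 1) / 2)

nu, lam = 5.0, 2.0
u = np.linspace(-80.0, 80.0, 400_001)
du = u[1] - u[0]
T1_grid = np.cumsum(t_pdf(u, nu + 1)) * du        # cdf of St(0, 1, nu+1) on the grid
T1 = lambda x: np.interp(x, u, T1_grid)

y = np.linspace(-40.0, 40.0, 200_001)
dy = y[1] - y[0]
corrected = 2 * t_pdf(y, nu) * T1(lam * y * np.sqrt((nu + 1) / (nu + y**2)))
misprint = 2 * t_pdf(y, nu) * T1(lam * y * np.sqrt((nu + 1) / (nu * (nu + y**2))))
total = float(np.sum(corrected) * dy)              # should be ~1

# Monte Carlo samples of the ST via the normal/chi-square mixture
rng = np.random.default_rng(0)
dlt = lam / math.sqrt(1 + lam**2)
n = 300_000
x = dlt * np.abs(rng.standard_normal(n)) + math.sqrt(1 - dlt**2) * rng.standard_normal(n)
ysamp = x / np.sqrt(rng.chisquare(nu, n) / nu)
mc1 = float(np.mean(ysamp <= 1.0))
th1 = float(np.sum(corrected[y <= 1.0]) * dy)      # matches mc1
mis1 = float(np.sum(misprint[y <= 1.0]) * dy)      # does not
print(total, mc1, th1, mis1)
```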
9.4.2 Moments and Cumulants of the PT
For the derivation of the moments and cumulants of the PT, the following results are used.

Property 9.1 Let $W$ be a random variable and $\mathbf Y$ a random vector independent of $W$. Then,
$$\begin{aligned}
\mathrm E(W\mathbf Y) &= \mathrm E(W)\,\mathrm E(\mathbf Y),\\
\mathrm{cov}(W\mathbf Y) &= \mathrm E(W^2\mathbf Y\mathbf Y^{\mathrm T}) - \mathrm E(W\mathbf Y)\,\mathrm E(W\mathbf Y^{\mathrm T})\\
&= \{\mathrm{var}(W) + \mathrm E^2(W)\}\{\mathrm{cov}(\mathbf Y) + \mathrm E(\mathbf Y)\mathrm E(\mathbf Y^{\mathrm T})\} - \mathrm E^2(W)\,\mathrm E(\mathbf Y)\mathrm E(\mathbf Y^{\mathrm T})\\
&= \mathrm{var}(W)\,\mathrm{cov}(\mathbf Y) + \mathrm{var}(W)\,\mathrm E(\mathbf Y)\mathrm E(\mathbf Y^{\mathrm T}) + \mathrm E^2(W)\,\mathrm{cov}(\mathbf Y),\\
\boldsymbol\kappa_3(W\mathbf Y) &= \mathrm E\{(W\mathbf Y)^{\langle 3\rangle}\} - \Sigma^3\,\mathrm E\{(W\mathbf Y)^{\langle 2\rangle}\}\otimes\mathrm E(W\mathbf Y) + 2\,\mathrm E^{\langle 3\rangle}(W\mathbf Y)\\
&= \mathrm E(W^3)\,\mathrm E(\mathbf Y^{\langle 3\rangle}) - \mathrm E(W^2)\mathrm E(W)\,\Sigma^3\,\mathrm E(\mathbf Y^{\langle 2\rangle})\otimes\mathrm E(\mathbf Y) + 2\,\mathrm E^3(W)\,\mathrm E^{\langle 3\rangle}(\mathbf Y),\\
\boldsymbol\kappa_4(W\mathbf Y) &= \mathrm E\{(W\mathbf Y)^{\langle 4\rangle}\} - \Sigma^4\,\mathrm E\{(W\mathbf Y)^{\langle 3\rangle}\}\otimes\mathrm E(W\mathbf Y) - \Sigma^3\,\mathrm E\{(W\mathbf Y)^{\langle 2\rangle}\}\otimes\mathrm E\{(W\mathbf Y)^{\langle 2\rangle}\}\\
&\quad + 2\,\Sigma^6\,\mathrm E\{(W\mathbf Y)^{\langle 2\rangle}\}\otimes\mathrm E^{\langle 2\rangle}(W\mathbf Y) - 6\,\mathrm E^{\langle 4\rangle}(W\mathbf Y)\\
&= \mathrm E(W^4)\,\mathrm E(\mathbf Y^{\langle 4\rangle}) - \mathrm E(W^3)\mathrm E(W)\,\Sigma^4\,\mathrm E(\mathbf Y^{\langle 3\rangle})\otimes\mathrm E(\mathbf Y) - \mathrm E^2(W^2)\,\Sigma^3\,\mathrm E(\mathbf Y^{\langle 2\rangle})\otimes\mathrm E(\mathbf Y^{\langle 2\rangle})\\
&\quad + 2\,\mathrm E(W^2)\mathrm E^2(W)\,\Sigma^6\,\mathrm E(\mathbf Y^{\langle 2\rangle})\otimes\mathrm E^{\langle 2\rangle}(\mathbf Y) - 6\,\mathrm E^4(W)\,\mathrm E^{\langle 4\rangle}(\mathbf Y),
\end{aligned}$$
where $\Sigma^k$ denotes the sum over the $k$ distinct arrangements of the Kronecker factors.
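The covariance identity in Property 9.1 is easy to confirm by simulation (a sketch, not from the book; the choices of $W$ and $Y$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
w = rng.gamma(3.0, 1.0, n)       # any positive W
y = rng.normal(2.0, 1.5, n)      # any Y independent of W

lhs = np.var(w * y)
ew, vw = np.mean(w), np.var(w)
ey, vy = np.mean(y), np.var(y)
rhs = vw * vy + vw * ey**2 + ew**2 * vy   # var(W)cov(Y) + var(W)E(Y)^2 + E^2(W)cov(Y)
print(lhs, rhs)  # agree to Monte Carlo accuracy
```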
Property 9.2 Let $W$ be an inverse-chi distributed variable with $\nu$ df multiplied by $\sqrt\nu$. Then, Lemma 9.2 gives
$$\begin{aligned}
\mathrm E(W) &= \frac{\sqrt\nu\,\Gamma\{(\nu-1)/2\}}{2^{1/2}\,\Gamma(\nu/2)},\quad \nu > 1;\qquad
\mathrm E(W^2) = \frac{\nu}{\nu-2},\quad \nu > 2;\\
\mathrm{var}(W) &= \frac{\nu}{\nu-2} - \frac{\nu\,\Gamma^2\{(\nu-1)/2\}}{2\,\Gamma^2(\nu/2)},\quad \nu > 2;\\
\mathrm E(W^3) &= \frac{\nu^{3/2}\,\Gamma\{(\nu-3)/2\}}{2^{3/2}\,\Gamma(\nu/2)} = \frac{\nu^{3/2}\,\Gamma\{(\nu-1)/2\}}{2^{1/2}(\nu-3)\,\Gamma(\nu/2)},\quad \nu > 3;\qquad
\mathrm E(W^4) = \frac{\nu^2}{(\nu-2)(\nu-4)},\quad \nu > 4.
\end{aligned}$$
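Property 9.2 can be confirmed directly (a numerical sketch, not from the book): with $W = \sqrt\nu/\chi_\nu$, the Monte Carlo moments match the $\Gamma$-function formulas.

```python
import math
import numpy as np

nu = 9.0
rng = np.random.default_rng(0)
w = math.sqrt(nu) / np.sqrt(rng.chisquare(nu, 2_000_000))  # sqrt(nu) times an inverse-chi variable

ew1 = math.sqrt(nu) * math.gamma((nu - 1) / 2) / (2**0.5 * math.gamma(nu / 2))
ew2 = nu / (nu - 2)
ew3 = nu**1.5 * math.gamma((nu - 3) / 2) / (2**1.5 * math.gamma(nu / 2))
ew4 = nu**2 / ((nu - 2) * (nu - 4))
print(np.mean(w), ew1, np.mean(w**2), ew2, np.mean(w**3), ew3, np.mean(w**4), ew4)
```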
Property 9.3 Let $\mathbf Y \sim \mathrm{PN}_{p,1,R}\{\boldsymbol\mu = \mathbf 0, \boldsymbol\Sigma, \boldsymbol\Delta = \boldsymbol\delta^{\mathrm T}, \gamma = 0, \Delta = \delta, \mathbf A = (a_1, \ldots, a_R), \mathbf B = (b_1, \ldots, b_R)\}$ with $\Omega = \omega = \Delta + \boldsymbol\Delta\boldsymbol\Sigma\boldsymbol\Delta^{\mathrm T} = \delta + \boldsymbol\delta^{\mathrm T}\boldsymbol\Sigma\boldsymbol\delta$. Define $\alpha = \Pr(Z \in S \mid \gamma = 0, \Omega = \omega)$ and $\tilde{\boldsymbol\delta} = \boldsymbol\Sigma\boldsymbol\delta$. Then, Corollary 4.4 gives
$$\begin{aligned}
\mathrm E(\mathbf Y) &= -\alpha^{-1}\tilde{\boldsymbol\delta}\sum_{r=1}^R\{\phi(b_r \mid 0, \omega) - \phi(a_r \mid 0, \omega)\},\\
\mathrm{cov}(\mathbf Y) &= \tilde{\boldsymbol\delta}\tilde{\boldsymbol\delta}^{\mathrm T}\Biggl[-\alpha^{-1}\omega^{-1}\sum_{r=1}^R\{b_r\phi(b_r \mid 0, \omega) - a_r\phi(a_r \mid 0, \omega)\}\\
&\qquad\qquad - \alpha^{-2}\Bigl[\sum_{r=1}^R\{\phi(b_r \mid 0, \omega) - \phi(a_r \mid 0, \omega)\}\Bigr]^2\Biggr] + \boldsymbol\Sigma,\\
\mathrm E(\mathbf Y^{\langle 2\rangle}) &= -\alpha^{-1}\omega^{-1}\tilde{\boldsymbol\delta}^{\langle 2\rangle}\sum_{r=1}^R\{b_r\phi(b_r \mid 0, \omega) - a_r\phi(a_r \mid 0, \omega)\} + \mathrm{vec}(\boldsymbol\Sigma),\\
\mathrm E(\mathbf Y^{\langle 3\rangle}) &= -\alpha^{-1}\sum_{r=1}^R\Bigl(\tilde{\boldsymbol\delta}^{\langle 3\rangle}\bigl[\omega^{-2}\{b_r^2\phi(b_r \mid 0, \omega) - a_r^2\phi(a_r \mid 0, \omega)\} - \omega^{-1}\{\phi(b_r \mid 0, \omega) - \phi(a_r \mid 0, \omega)\}\bigr]\\
&\qquad\qquad + \Sigma^3\,\tilde{\boldsymbol\delta}\otimes\mathrm{vec}(\boldsymbol\Sigma)\{\phi(b_r \mid 0, \omega) - \phi(a_r \mid 0, \omega)\}\Bigr),\\
\mathrm E(\mathbf Y^{\langle 4\rangle}) &= -\alpha^{-1}\sum_{r=1}^R\Bigl(\tilde{\boldsymbol\delta}^{\langle 4\rangle}\bigl[\omega^{-3}\{b_r^3\phi(b_r \mid 0, \omega) - a_r^3\phi(a_r \mid 0, \omega)\} - 3\,\omega^{-2}\{b_r\phi(b_r \mid 0, \omega) - a_r\phi(a_r \mid 0, \omega)\}\bigr]\\
&\qquad\qquad + \Sigma^6\,\tilde{\boldsymbol\delta}^{\langle 2\rangle}\otimes\mathrm{vec}(\boldsymbol\Sigma)\,\omega^{-1}\{b_r\phi(b_r \mid 0, \omega) - a_r\phi(a_r \mid 0, \omega)\}\Bigr) + \Sigma^3\,\mathrm{vec}^{\langle 2\rangle}(\boldsymbol\Sigma).
\end{aligned}$$
Theorem 9.5 The moments and cumulants of the PT in Theorem 9.4 up to the fourth order are given by substituting the results of Properties 9.2 and 9.3 into the expressions of Property 9.1.

Proof Note that the PT distributed vector is given by $\mathbf Y/\sqrt{W^c/\nu}$, where $W \sim$ Power-$\Gamma(\nu/2, 1/2, c)$. When $c = -2$, $W$ is inverse chi-distributed with $\nu$ df, independent of $\mathbf Y$; then $\mathbf Y/\sqrt{W^c/\nu} = W\sqrt\nu\,\mathbf Y$, with $W\sqrt\nu$ as in Property 9.2, gives the required results. Q.E.D.
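As a small check of Theorem 9.5 in the SN special case (a sketch, not from the book): for $Y^* = \sqrt\nu\,Y/\chi_\nu$ with $Y \sim \mathrm{SN}(\lambda)$, Property 9.1 gives $\mathrm E(Y^*) = \mathrm E(W)\mathrm E(Y)$, with $\mathrm E(W)$ from Property 9.2 and $\mathrm E(Y) = \sqrt{2/\pi}\cdot\lambda/\sqrt{1+\lambda^2}$ for the SN.

```python
import math
import numpy as np

lam, nu = 2.0, 7.0
d = lam / math.sqrt(1 + lam**2)
rng = np.random.default_rng(0)
n = 1_000_000
ysn = d * np.abs(rng.standard_normal(n)) + math.sqrt(1 - d**2) * rng.standard_normal(n)  # SN(lam)
ystar = math.sqrt(nu) * ysn / np.sqrt(rng.chisquare(nu, n))                               # PT (skew t)

ew = math.sqrt(nu) * math.gamma((nu - 1) / 2) / (2**0.5 * math.gamma(nu / 2))  # E(W), Property 9.2
ey = math.sqrt(2 / math.pi) * d                                                # E(Y) for SN(lam)
print(np.mean(ystar), ew * ey)  # agree to Monte Carlo accuracy
```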
References

1. Azzalini A, Capitanio A (2003) Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J Royal Stat Soc B 65:367–389
2. Bishop CM (2006) Pattern recognition and machine learning. Springer, New York
3. Branco MD, Dey DK (2001) A general class of multivariate skew-elliptical distributions. J Multivar Anal 79:99–113
4. Cornish EA (1954) The multivariate t-distribution associated with a set of normal sample deviates. Aust J Phys 7:531–542
5. Crooks GE (2015) The Amoroso distribution. arXiv:1005.3274v2 [math.ST] 13 Jul 2015
6. Fung T, Seneta E (2010) Tail dependence for two skew t distributions. Statist Probab Lett 80:784–791
7. Galarza CE, Matos LA, Castro LM, Lachos VH (2022) Moments of the doubly truncated selection elliptical distributions with emphasis on the unified multivariate skew-t distribution. J Multivar Anal (published online). https://doi.org/10.1016/j.jmva.2021.104944
8. Genton MG (2004) Skew-symmetric and generalized skew-elliptical distributions. In: Genton MG (ed) Skew-elliptical distributions and their applications. Chapman & Hall/CRC, Boca Raton, FL, pp 81–100
9. Gupta AK (2003) Multivariate skew t-distribution. Statistics 37:359–363
10. Gupta AK, Chang F-C (2003) Multivariate skew-symmetric distributions. Appl Math Lett 16:643–646
11. Gupta AK, Chang FC, Huang WJ (2002) Some skew-symmetric models. Random Oper Stoch Equ 10:133–140
12. Joe H, Li H (2019) Tail densities of skew-elliptical distributions. J Multivar Anal 171:421–435
13. Kirkby JL, Nguyen D, Nguyen D (2019) Moments of Student's t-distribution: a unified approach. arXiv:1912.01607v1 [math.PR] 3 Dec 2019
14. Kirkby JL, Nguyen D, Nguyen D (2021) Moments of Student's t-distribution: a unified approach. arXiv:1912.01607v2 [math.PR] 26 Mar 2021
15. Kollo T, Käärik M, Selart A (2021) Multivariate skew t-distribution: asymptotics for parameter estimators and extension to skew t-copula. Symmetry 13:1059. https://doi.org/10.3390/sym13061059
16. Kollo T, Pettere G (2010) Parameter estimation and application of the multivariate skew t-copula. In: Jaworski P, Durante F, Härdle W, Rychlik T (eds) Copula theory and its applications. Springer, Berlin, pp 289–298
17. Kotz S, Nadarajah S (2004) Multivariate t distributions and their applications. Cambridge University Press, Cambridge
18. Lachos VH, Garay AM, Cabral CRB (2020) Moments of truncated scale mixtures of skew-normal distributions. Braz J Probab Stat 34:478–494
19. Nadarajah S, Ali MM (2004) A skewed truncated t distribution. Math Comput Model 40:935–939
20. Nadarajah S, Gupta AK (2005) A skewed truncated Pearson type VII distribution. J Jpn Stat Soc 35:61–71
21. Nadarajah S, Kotz S (2003) Skewed distributions generated by the normal kernel. Statist Probab Lett 65:269–277
22. Stacy EW (1962) A generalization of the gamma distribution. Ann Math Stat 33:1187–1192
23. Stuart A, Ord JK (1994) Kendall's advanced theory of statistics: distribution theory, 6th edn, vol 1. Arnold, London
Chapter 10
Multivariate Measures of Skewness and Kurtosis

10.1 Preliminaries
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. H. Ogasawara, Expository Moments for Pseudo Distributions, Behaviormetrics: Quantitative Approaches to Human Behavior 2, https://doi.org/10.1007/978-981-19-3525-1_10

In previous chapters, the (left) Kronecker product denoted by, e.g., $\mathbf A\otimes\mathbf B = \{a_{ij}\mathbf B\}$ $(i = 1, \ldots, m;\ j = 1, \ldots, n)$ (Magnus and Neudecker [14, Sect. 2, Chap. 2]) and $\mathbf A^{\langle 2\rangle} = \mathbf A\otimes\mathbf A$ was used extensively. It is also called the direct or tensor product, where $a_{ij}\mathbf B$ $(p \times q)$ is the $(i, j)$th block or submatrix of $\mathbf A\otimes\mathbf B$. In the literature, the notations $\mathbf A^{\langle k\rangle} = \otimes_{i=1}^k\mathbf A = \otimes_{i=1:k}\mathbf A = \otimes^k\mathbf A = \mathbf A^{\otimes k}$ ($k$-fold product of $\mathbf A$), with $1{:}k = (1, 2, \ldots, k)$, are used synonymously (for the first two, see Holmquist [6, 7], Kano [9, p. 182] and Meijer [16, p. 122]; for the third to fifth, see Terdik [21, 22]). The vectorizing operator
$$\mathrm{vec}(\mathbf A) = (a_{11}, a_{21}, \ldots, a_{m-1,n}, a_{mn})^{\mathrm T} = \{(\mathbf A)_1^{\mathrm T}, \ldots, (\mathbf A)_n^{\mathrm T}\}^{\mathrm T}$$
was also used along with the Kronecker product, where $(\mathbf A)_j$ $(m \times 1)$ is the $j$-th column of $\mathbf A$ [$(\mathbf A)^i$ $(1 \times n)$ denotes the $i$-th row of $\mathbf A$].

The commutation matrix (Magnus and Neudecker [12, 14, Sect. 7, Chap. 3]), or commutator (Terdik [22]), is useful when we consider multivariate moments and cumulants, especially when higher-order cases are dealt with. The commutation matrix is a special case of the commutator mathematically defined by the operator $C(\cdot)$ in $C(\mathbf A\circ\mathbf B) = \mathbf B\circ\mathbf A$, where $\circ$ indicates an operation for which $\mathbf B\circ\mathbf A$ does not commute, e.g., matrix multiplication, and $\mathbf A$ and $\mathbf B$ are mathematical quantities/symbols defined variously. In this book, "commutator" indicates the commutation matrix for short. The commutation matrix, denoted by the $mn \times mn$ matrix $\mathbf K_{mn}$, is a permutation matrix giving the relation $\mathrm{vec}(\mathbf A^{\mathrm T}) = \mathbf K_{mn}\mathrm{vec}(\mathbf A)$ for an $m \times n$ matrix $\mathbf A$, and equivalently
$$\mathbf a_1\otimes\mathbf a_2 = \mathbf K_{mn}(\mathbf a_2\otimes\mathbf a_1)$$
for $m \times 1$ and $n \times 1$ vectors $\mathbf a_1$ and $\mathbf a_2$, respectively. Note that the $mn$ elements of $\mathrm{vec}(\mathbf A^{\mathrm T})$ are a permutation of those of $\mathrm{vec}(\mathbf A)$, obtained by pre-multiplying $\mathrm{vec}(\mathbf A)$ by $\mathbf K_{mn}$.

A $k \times k$ permutation matrix $\mathbf P$ is defined by interchanging some or all of the rows or columns of the $k \times k$ identity matrix $\mathbf I_k$. That is, every row or column of $\mathbf P$ has a single element equal to 1, with the others being zero. Assume that $\mathbf P = \{p_{ij}\}$ is given by pairwise (disjoint) interchanges of rows and that the $i$-th row has its single 1 in the $j$-th column, i.e., $p_{ij} = 1$ $(i \neq j)$. In this case, we also have $p_{ji} = 1$, i.e., such a $\mathbf P$ is symmetric. The $i$-th and $j$-th rows of $\mathbf P$ in $\mathbf{PC}$, for a $k \times l$ matrix or vector $\mathbf C$, interchange the $i$-th and $j$-th rows of $\mathbf C$; the remaining rows of $\mathbf P$ play similar roles, and when $p_{ii} = 1$ the $i$-th row of $\mathbf C$ is unchanged. In $\mathrm{vec}(\mathbf A^{\mathrm T}) = \mathbf K_{mn}\mathrm{vec}(\mathbf A)$, the $\{i+(j-1)m\}$-th element of $\mathrm{vec}(\mathbf A)$ is moved to the $\{j+(i-1)n\}$-th element of $\mathrm{vec}(\mathbf A^{\mathrm T})$ $(i = 1, \ldots, m;\ j = 1, \ldots, n)$ after permutation, due to the definition of $\mathrm{vec}(\cdot)$; conversely, since $\mathbf K_{nm} = \mathbf K_{mn}^{-1}$, the $\{j+(i-1)n\}$-th element of $\mathrm{vec}(\mathbf A^{\mathrm T})$ is moved back to the $\{i+(j-1)m\}$-th element of $\mathrm{vec}(\mathbf A)$.

Some examples of $\mathbf K_{mn}$ are given for exposition. We have $\mathbf K_{11} = 1$, $\mathbf K_{12} = \mathbf K_{21} = \mathbf I_2$,
$$\mathbf K_{22} = \mathbf K_{22}^{\mathrm T} = \begin{bmatrix}\mathbf K_{22}^{(11)} & \mathbf K_{22}^{(12)}\\[2pt] \mathbf K_{22}^{(21)} & \mathbf K_{22}^{(22)}\end{bmatrix} = \begin{bmatrix}1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1\end{bmatrix},\qquad \mathbf K_{22}^2 = \mathbf I_4,$$
and $\mathbf K_{13} = \mathbf K_{31} = \mathbf I_3$ are easily obtained, where the $2 \times 2$ submatrices $\mathbf K_{22}^{(ij)}$ $(i, j = 1, 2)$ each have a single 1 in the $(j, i)$th element. Since
$$\mathrm{vec}\begin{bmatrix}1 & 3 & 5\\ 2 & 4 & 6\end{bmatrix} = [1, 2, 3, 4, 5, 6]^{\mathrm T}\quad\text{and}\quad \mathrm{vec}\begin{bmatrix}1 & 2\\ 3 & 4\\ 5 & 6\end{bmatrix} = [1, 3, 5, 2, 4, 6]^{\mathrm T},$$
we have
$$\mathbf K_{23}\begin{bmatrix}1\\2\\3\\4\\5\\6\end{bmatrix} = \begin{bmatrix}\mathbf K_{23}^{(11)} & \mathbf K_{23}^{(12)} & \mathbf K_{23}^{(13)}\\[2pt] \mathbf K_{23}^{(21)} & \mathbf K_{23}^{(22)} & \mathbf K_{23}^{(23)}\end{bmatrix}\begin{bmatrix}1\\2\\3\\4\\5\\6\end{bmatrix} = \begin{bmatrix}1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1\end{bmatrix}\begin{bmatrix}1\\2\\3\\4\\5\\6\end{bmatrix} = \begin{bmatrix}1\\3\\5\\2\\4\\6\end{bmatrix},$$
where the $3 \times 2$ submatrices $\mathbf K_{23}^{(ij)}$ $(i = 1, 2;\ j = 1, 2, 3)$ each have a single 1 in the $(j, i)$th element. Similarly, we obtain
$$\mathbf K_{32}\begin{bmatrix}1\\2\\3\\4\\5\\6\end{bmatrix} = \begin{bmatrix}\mathbf K_{32}^{(11)} & \mathbf K_{32}^{(12)}\\[2pt] \mathbf K_{32}^{(21)} & \mathbf K_{32}^{(22)}\\[2pt] \mathbf K_{32}^{(31)} & \mathbf K_{32}^{(32)}\end{bmatrix}\begin{bmatrix}1\\2\\3\\4\\5\\6\end{bmatrix} = \begin{bmatrix}1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1\end{bmatrix}\begin{bmatrix}1\\2\\3\\4\\5\\6\end{bmatrix} = \begin{bmatrix}1\\4\\2\\5\\3\\6\end{bmatrix},$$
where the $2 \times 3$ submatrices $\mathbf K_{32}^{(ij)}$ $(i = 1, 2, 3;\ j = 1, 2)$ each have a single 1 in the $(j, i)$th element. In the above results, $\mathbf K_{23} = \mathbf K_{32}^{\mathrm T}$ and $\mathbf K_{32}\mathbf K_{23} = \mathbf K_{23}\mathbf K_{32} = \mathbf I_6$, with $\mathbf K_{23} \neq \mathbf K_{23}^{\mathrm T}$ and $\mathbf K_{32} \neq \mathbf K_{32}^{\mathrm T}$. For $\mathbf K_{33}$, we have
$$\mathbf K_{33}\begin{bmatrix}1\\2\\3\\4\\5\\6\\7\\8\\9\end{bmatrix} = \begin{bmatrix}\mathbf K_{33}^{(11)} & \mathbf K_{33}^{(12)} & \mathbf K_{33}^{(13)}\\[2pt] \mathbf K_{33}^{(21)} & \mathbf K_{33}^{(22)} & \mathbf K_{33}^{(23)}\\[2pt] \mathbf K_{33}^{(31)} & \mathbf K_{33}^{(32)} & \mathbf K_{33}^{(33)}\end{bmatrix}\begin{bmatrix}1\\2\\3\\4\\5\\6\\7\\8\\9\end{bmatrix} = \begin{bmatrix}1&0&0&0&0&0&0&0&0\\ 0&0&0&1&0&0&0&0&0\\ 0&0&0&0&0&0&1&0&0\\ 0&1&0&0&0&0&0&0&0\\ 0&0&0&0&1&0&0&0&0\\ 0&0&0&0&0&0&0&1&0\\ 0&0&1&0&0&0&0&0&0\\ 0&0&0&0&0&1&0&0&0\\ 0&0&0&0&0&0&0&0&1\end{bmatrix}\begin{bmatrix}1\\2\\3\\4\\5\\6\\7\\8\\9\end{bmatrix} = \begin{bmatrix}1\\4\\7\\2\\5\\8\\3\\6\\9\end{bmatrix},$$
where the $3 \times 3$ submatrices $\mathbf K_{33}^{(ij)}$ $(i, j = 1, 2, 3)$ each have a single 1 in the $(j, i)$th element, as before. It is seen that $\mathbf K_{33} = \mathbf K_{33}^{\mathrm T}$ and $\mathbf K_{33}^2 = \mathbf I_9$.

The powers $\mathbf K_{ij}^k$ $(i \neq j;\ k = 2, 3, \ldots)$ are usually unused, since they are not commutators but mere permutation matrices; they are nevertheless of some use for seeing a property of $\mathbf K_{ij}$. An example is
$$\mathbf K_{23}^2 = \begin{bmatrix}1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1\end{bmatrix},\qquad \mathbf K_{23}^2\begin{bmatrix}1\\2\\3\\4\\5\\6\end{bmatrix} = \mathbf K_{23}\begin{bmatrix}1\\3\\5\\2\\4\\6\end{bmatrix} = \begin{bmatrix}1\\5\\4\\3\\2\\6\end{bmatrix}.$$
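The examples above are easy to reproduce. A minimal sketch (not from the book; the helper names `commutation` and `vec` are hypothetical) builds $\mathbf K_{mn}$ from its defining permutation and checks the displayed identities:

```python
import numpy as np

def commutation(m, n):
    """K_mn with vec(A.T) = K_mn @ vec(A); vec stacks columns (0-based indices)."""
    K = np.zeros((m * n, m * n), dtype=int)
    for i in range(m):
        for j in range(n):
            # element a_ij: position i + j*m in vec(A) -> position j + i*n in vec(A.T)
            K[j + i * n, i + j * m] = 1
    return K

def vec(M):
    return M.reshape(-1, order="F")   # column-major stacking

A = np.array([[1, 3, 5], [2, 4, 6]])
K23 = commutation(2, 3)
K32 = commutation(3, 2)
print(K23 @ vec(A))  # vec(A.T) = [1 3 5 2 4 6]
```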
Among $\mathbf K_{23}^k$ $(k = 1, 2, \ldots)$, the number of distinct powers is found to be at most $6!$.

Next, we obtain the explicit or algebraic expression of $\mathbf K_{mn}$. Define the $m \times n$ matrix $\mathbf E_{ij}$ whose $(i, j)$th element is 1 with the remaining elements being zero, and let $\mathbf e_{(k)}$ be the vector of an appropriate dimension whose $k$-th element is 1 with the other elements being zero.

Lemma 10.1 The commutation matrix satisfying $\mathrm{vec}(\mathbf A^{\mathrm T}) = \mathbf K_{mn}\mathrm{vec}(\mathbf A)$ is algebraically expressed in several ways:
$$\begin{aligned}
\mathbf K_{mn} &= \sum_{i=1}^m\sum_{j=1}^n\mathrm{vec}(\mathbf E_{ji})\,\mathbf e_{\{i+(j-1)m\}}^{\mathrm T} = \sum_{i,j}^{m,n}\mathbf e_{\{j+(i-1)n\}}\,\mathrm{vec}^{\mathrm T}(\mathbf E_{ij}) = \sum_{i,j}^{m,n}\mathbf e_{\{j+(i-1)n\}}\,\mathbf e_{\{i+(j-1)m\}}^{\mathrm T}\\
&= \sum_{i,j}^{m,n}\mathrm{vec}(\mathbf E_{ji})\,\mathrm{vec}^{\mathrm T}(\mathbf E_{ij}) = \sum_{i,j}^{m,n}(\mathbf e_{(i)}\otimes\mathbf e_{(j)})(\mathbf e_{(j)}\otimes\mathbf e_{(i)})^{\mathrm T} = \sum_{i,j}^{m,n}(\mathbf e_{(i)}\mathbf e_{(j)}^{\mathrm T})\otimes(\mathbf e_{(j)}\mathbf e_{(i)}^{\mathrm T})\\
&= \sum_{i,j}^{m,n}\mathbf E_{ij}\otimes\mathbf E_{ij}^{\mathrm T} = \sum_{i,j}^{m,n}\mathbf E_{ij}\otimes\mathbf E_{ji} = \sum_{i,j}^{m,n}\mathbf E_{ij}\otimes\mathbf K_{mn}^{(ij)},
\end{aligned}$$
where eðiÞ ; eðjÞ and efg are m 1; n 1 and mn 1 vectors while Eij and Eji are T m n and n m matrices, respectively; and KðijÞ mn ¼ Eji ¼ eðjÞ eðiÞ is the n m submatrix for the (i, j)th block, which has a single 1 for the (j, i)th element of the submatrix ðj ¼ 1; . . .; n; i ¼ 1; . . .; mÞ: When m = n, we have Kmm ¼
m X m X
m X vec Eji eTfi þ ðj1Þmg vec Eji eTfi þ ðj1Þmg
i¼1 j¼1
¼
m X i;j
efj þ ði1Þmg vecT Eij
i;j
10.1
Preliminaries
¼
m X
303
efj þ ði1Þmg eTfi þ ðj1Þmg ¼
m X
i;j
¼ ¼
vec Eji vecT Eij
i;j
m X
T
vec Eij vec Eji
i;j m X
eðjÞ eðiÞ
eðiÞ eðjÞ
T
¼
i;j
m X
eðiÞ eðjÞ
eðjÞ eðiÞ
T
i;j
m m X X ¼ eðjÞ eTðiÞ eðiÞ eTðjÞ ¼ eðiÞ eTðjÞ eðjÞ eTðiÞ i;j
¼
i;j
m X
Eji Eij ¼
i;j
m X
Eij Eji ¼
m X
i;j
Eij KðijÞ mm :
i;j
Proof Noting that A¼
m;n X
Eij aij
and
AT ¼
i;j
m;n X
ETij aij ¼
i;j
m;n X
Eji aij ;
i;j
we have m;n m;n X X vec AT ¼ vec Eji aij ¼ vec Eji fvecðAÞgi þ ðj1Þm i;j
¼
m;n X
i;j
vec Eji eTfi þ ðj1Þmg vecðAÞ;
i;j
where f gk is the k-th element of the vector in braces. The last result gives an P T expression Kmn ¼ m;n i;j vecðEji Þefi þ ðj1Þmg . The alternative expressions for Kmn are obtained from this expression. The results when m = n are given using Kmm ¼ KTmm . Q.E.D. P The expressions of Kmn in Lemma 10.1, e.g., Kmn ¼ m;n i;j P T vec , well show properties of this speefj þ ði1Þng eTfi þ ðj1Þmg ¼ m;n vec E E ji ij i;j cial case as a permutation matrix. The lemma also gives an alternative algebraic Pm;n ðijÞ T T expression Kmn ¼ i;j eðiÞ eðjÞ Kmn using the submatrices KðijÞ mn ¼ eðjÞ eðiÞ ði ¼ 1; :::; m; j ¼ 1; :::; nÞ, which is consistent with the findings in the examples shown earlier. Terdik [22, Eq. (1.11)] used another symbolic expression Kmn ¼ ½ETi;j i;j reflecting a procedure to have Kmn and stated that “Kmn has n m blocks of dimension m n” (p. 8), which should be read as “Kmn has m n blocks of dimension n m,” where Kmn ¼ Kmn in our notation as used by Magnus and Neudecker [14]. For this statement, see Neudecker and Wansbeek [18, p. 222]
304
10 Multivariate Measures of Skewness and Kurtosis
though they use the notation Pnm for Kmn . It is known that a permutation matrix is ortho-normal, which is found by considering that the inverse permutation corresponding to P is given by PT , yielding well-known properties: T ¼ vecðAÞ: KTmn ¼ K1 mn ¼ Knm with Knm vec A 2k þ 1 ¼ Kmm ðk ¼ 0; 1; . . .Þ Consequently, when m = n, we have K2k mm ¼ Im2 and Kmm 0 with Kmm Im2 . When m = 1 or n = 1, it follows that
K1n ¼ In or Km1 ¼ Im ; respectively, due to vecðAT Þ ¼ vecðAÞ. A formula frequently used in multivariate statistical analysis for the Kronecker product and the vec operator is vecðABCÞ ¼ CT A vecðBÞ; where A, B and C are a b; b c and c d matrices, which is proved for exposition: vecðABCÞ ¼
b X c X
n o X n o vec ðAÞi bij ðCÞj ¼ vec ðAÞi ðCÞj bij
i¼1 j¼1
o Xn T ¼ ðCÞj ðAÞi bij i;j
¼
X
CT A
i;j
¼
X
CT A
i;j
eT fvecðBÞgi þ ðj1Þb fi þ ðj1Þbg fi þ ðj1Þbg eT vecðBÞ fi þ ðj1Þbg fi þ ðj1Þbg
¼ CT A vecðBÞ:
i;j
This proof is similar to that in Lemma 10.1 (compare the proof given by, e.g., Magnus and Neudecker [14, Theorem 2, Chap. 2]). The properties given below are well-known (Magnus and Neudecker [14, Theorem 9, Chap. 3]; Terdik [22, Eqs. (1.14)–(1.17)]), where the second to the last equations are obtained by the first one, which was proved by Magnus and Neudecker [14] in an elegant but indirect method using vecðABCÞ ¼ ðCT AÞvecðBÞ repeatedly and an additional arbitrary matrix. The first proof in the following lemma is a direct one. Lemma 10.2 Let A and B be m n and p q matrices with a and b being m 1 and p 1 vectors, respectively. Then, we have
10.1
Preliminaries
305
Kpm ðA BÞ ¼ ðB AÞKqn Kpm ðA BÞKnq ¼ B A; Kpm ðA bÞ ¼ b A; Kmp ðB aÞ ¼ a B; Kpm ða BÞ ¼ B a; Kmp ðb AÞ ¼ A b: Proof 1 In the first equation, the ½fu þ ðv 1Þmg; fi þ ðj 1Þqg-th element on each side will be found (u = 1,…,m; v = 1,…,p; i = 1,…,q; j = 1,…,n). The element on the left-hand side is T eTfu þ ðv1Þmg Kpm ðA BÞefi þ ðj1Þqg ¼ eðvÞ eðuÞ Kpm ðA BÞ eðjÞ eðiÞ
p;m o T X T n ¼ eðvÞ eðuÞ eði Þ eðj Þ eðj Þ eði Þ ðAÞj ðBÞi
¼ eðuÞ eðvÞ
i ;j
T n
o ðAÞj ðBÞi ¼ auj bvi ;
T P eðj Þ eði Þ where Kpm ¼ p;m in the results of Lemma 10.1 is i ;j eði Þ eðj Þ used. On the other hand, the corresponding element on the right-hand side of the equation is similarly given by T eTfu þ ðv1Þmg ðB AÞKqn efi þ ðj1Þqg ¼ eðvÞ eðuÞ ðB AÞKqn eðjÞ eðiÞ q;n T X T ¼ ðBÞv ðAÞu eði Þ eðj Þ eðj Þ eði Þ eðjÞ eðiÞ
¼ ðBÞv ðAÞu
i ;j
T
eðiÞ eðjÞ ¼ auj bvi :
The above results show that the first equation holds. The remaining equations are given from the first one using Kqn Knq ¼ Iqn , Kn1 ¼ In and K1q ¼ Iq . Proof 2 We use the converse flow of Proof 1 for the first equation, which is a less direct proof. Note that the elements on the left (or right)-hand side of the equation are different products of the elements of A and B. Recall the definition a1 a2 ¼ Kmn ða2 a1 Þ, which indicated that Kpm ðA BÞ is the matrix whose rows are permutated ones of A B such that the fv þ ðu 1Þpg-th row of A B corresponding to the rows ðAÞu and ðBÞv in A B becomes the fu þ ðv 1Þmg-th row of Kpm ðA BÞ after permutation. This result shows that the row indexes (rows for short) of Kpm ðA BÞ are the same as the corresponding ones of B A except that the column indexes of (columns for short) B A are permutated ones of Kpm ðA BÞ, where the columns of A B are unchanged by Kpm ðA BÞ.
306
10 Multivariate Measures of Skewness and Kurtosis
Consider the right-hand side ðB AÞKqn of the equation Kpm ðA BÞ ¼ ðB AÞKqn to be derived. We find that in a similar manner the columns of ðB AÞKqn are the same as those of A B with the rows permutated and Kpm ðA BÞ. Recalling that the rows of Kpm ðA BÞ were proved to be the same as those of B A and consequently those of ðB AÞKqn , Kpm ðA BÞ ¼ ðB AÞKqn since the rows and columns of the left-hand side are equal to the corresponding ones of the right-hand side. As shown earlier, the fu þ ðv 1Þmg-th row of ðB AÞKqn is given by the direct product of ðBÞv and ðAÞu . Similarly, it is found that the fi þ ðj 1Þqg-th column of Kpm ðA BÞ is given by the direct product of ðAÞj and ðBÞi . Since Kpm ðA BÞ ¼ ðB AÞKqn with the elements being distinct products of the elements of A and B, it is found that the ½fu þ ðv 1Þmg; fi þ ðj 1Þqg-th element on each side of Kpm ðA BÞ ¼ ðB AÞKqn should be fðAÞu gTj ¼ fðAÞj gu ¼ auj times fðBÞv gTi ¼ fðBÞi gv ¼ bvi . The remaining equations are derived as in Proof 1. Q.E.D. An advantage of the proofs of Lemma 10.2 is that the explicit expressions of the elements are derived as by-products, which are summarized in the following theorem including a repeated one given by the proof of Lemma 10.2 for ease of reference. Theorem 10.1 The elements of the products of the matrices in Lemma 10.2 are fKpm ðA BÞgfu þ ðv1Þmg;fi þ ðj1Þqg ¼ fðB AÞKqn gfu þ ðv1Þmg;fi þ ðj1Þqg ¼ auj bvi ; fKpm ðA BÞKnq gfu þ ðv1Þmg;fj þ ði1Þng ¼ ðB AÞfu þ ðv1Þmg;fj þ ði1Þng ¼ auj bvi ; fKpm ðA bÞgfu þ ðv1Þmg;j ¼ ðb AÞfu þ ðv1Þmg;j ¼ auj bv ; fKmp ðB aÞgfv þ ðu1Þpg;i ¼ ða BÞfv þ ðu1Þpg;i ¼ au bvi ; fKpm ða BÞgfu þ ðv1Þmg;i ¼ ðB aÞfu þ ðv1Þmg;i ¼ au bvi ; fKmp ðb AÞgfv þ ðu1Þpg;j ¼ ðA bÞfv þ ðu1Þpg;j ¼ auj bv ðu ¼ 1; . . .; m; v ¼ 1; . . .; p; i ¼ 1; . . .; q; j ¼ 1; . . .; nÞ: Note that in the above expressions, the value auj bvi is the same in the first and second lines though the columns of the entries for this value are different. 
Similar cases are also found in the remaining lines. The following equation first derived by Neudecker and Wansbeek [18, Theorem 3.1 (i)] is useful (see also Magnus and Neudecker [14, Theorem 9, Chap. 3]). The result is proved with added algebraic expressions of the elements. Theorem 10.2 Let A and B be m n and p q matrices, respectively. Then, vecðA BÞ ¼ In Kqm Ip fvecðAÞ vecðBÞg;
10.1
Preliminaries
307
whose [v þ ðu 1Þp þ fi þ ðj 1Þq 1gmp]-th element is auj bvi ðu ¼ 1; . . .; m; j ¼ 1; . . .; n; v ¼ 1; . . .; p; i ¼ 1; . . .; qÞ. Proof It is seen that the elements of the vector on each side of the equation to be derived consist of distinct products of auj and bvi . Consider the product auj bvi on the left-hand side of the equation. We find that this product is located in the [½fv þ ðu 1Þpg; fi þ ðj 1Þqg-th element of A B. Noting that A B is a mp nq matrix, auj bvi becomes the [v þ ðu 1Þp þ fi þ ðj 1Þq 1gmp]-th element of vecðA BÞ. Note also that this location is given by the lexicographical order [j, i, u, v] with v changing fastest. On the other hand, auj bvi ’s in vecðAÞ vecðBÞ on the right-hand side of the equation are located according to the lexicographical order [j, u, i, v] ðj ¼ 1; . . .; n; u ¼ 1; . . .; m; i ¼ 1; . . .; q; v ¼ 1; . . .; pÞ with v changing fastest. Pre-multiplying vecðAÞ vecðBÞ by In Kqm Ip converts the order into [j, i, u, v]. That is, auj bvi is located in the [v þ ðu 1Þp þ fi þ ðj 1Þq 1gmp]-th element of the right-hand side of the equation after conversion, which is found to be equal to the corresponding one on the left-hand side of the equation. Q.E.D. Terdik [22, Lemma 1.1] gave the following equations, except the added third one, using vecðABCÞ¼ ðCT AÞvecðBÞ repeatedly, which are proved by a similar method employed earlier with added expressions of the elements. Theorem 10.3 Let A and B be m n and p m matrices, respectively. Then, vecðBAÞ ¼ vecT ðAÞ Inp In Kmn Ip fvecðIn Þ vecðBÞg ¼ vecT ðBÞ Inp Im Kpn Ip vec AT vec Ip ¼ vecT ðIm Þ Inp Im Kmn Ip vec AT vecðBÞ : Proof Let ai and bTj be the i-th column of A and the j-th row of B, respectively, ði ¼ 1; . . .; n; j ¼ 1; :::; pÞ. Then, ðBAÞji ¼ bTj ai is the {j þ ði 1Þp}-th element of vecðBAÞ. (i) Consider the right-hand side of the first equation to be derived:
vecT ðAÞ Inp
In Kmn Ip fvecðIn Þ vecðBÞg;
where vecðIn Þ vecðBÞ consists of the products ðIn Þik bjl ¼ dik bjl ði; k ¼ 1; . . .; n; j ¼ 1; . . .; p; l ¼ 1; . . .; mÞ with dik being the Kronecker delta. The products are located in the vector according to the lexicographical order [k, i, l, j] with j changing fastest. Pre-multiplying vecðIn Þ vecðBÞ by ðIn Kmn Ip Þ converts the order into [k, l, i, j], whose corresponding element is denoted by c½k;l;i;j ð¼dik bjl Þ. Noting that the remaining factor vecT ðAÞ Inp ¼ vecT ðAÞ In Ip is a matrix
10 Multivariate Measures of Skewness and Kurtosis
which converts the vector consisting of the $c_{[k,l,i,j]}$'s into the $np\times 1$ vector whose $\{j+(i-1)p\}$-th element is $\sum_{k=1}^n\sum_{l=1}^m a_{lk}c_{[k,l,i,j]}=\sum_{k,l}a_{lk}\delta_{ik}b_{jl}=\sum_{l=1}^m b_{jl}a_{li}=(BA)_{ji}$, which is the same element of $\mathrm{vec}(BA)$ obtained earlier. Note that the indexes $l$ and $k$ of $a_{lk}$ in $\mathrm{vec}^T(A)\otimes I_{np}$ are chosen from the first two indexes in $c_{[k,l,i,j]}$, with $l$ changing faster than $k$ in the lexicographical order.

(ii) The vector $\mathrm{vec}(A^T)\otimes\mathrm{vec}(I_p)$ on the right-hand side of the second equation consists of the products $a_{li}(I_p)_{jk}=a_{li}\delta_{jk}$ $(l=1,\ldots,m;\ i=1,\ldots,n;\ j,k=1,\ldots,p)$, which are located according to the lexicographical order $[l,i,k,j]$, denoted by $c_{[l,i,k,j]}$. Pre-multiplying by $I_m\otimes K_{pn}\otimes I_p$ permutes these elements as $c_{[l,k,i,j]}\,(=a_{li}\delta_{jk})$. Finally, the matrix $\mathrm{vec}^T(B)\otimes I_{np}=\mathrm{vec}^T(B)\otimes I_n\otimes I_p$ converts the transformed vector into the $np\times 1$ vector whose $\{j+(i-1)p\}$-th element is $\sum_{k=1}^p\sum_{l=1}^m b_{kl}c_{[l,k,i,j]}=\sum_{k,l}b_{kl}a_{li}\delta_{jk}=\sum_{l=1}^m b_{jl}a_{li}=(BA)_{ji}$, which is the required result.

(iii) The first proof: The vector $\mathrm{vec}(A^T)\otimes\mathrm{vec}(B)$ on the right-hand side of the third equation consists of the products $a_{ki}b_{jl}$ $(k,l=1,\ldots,m;\ i=1,\ldots,n;\ j=1,\ldots,p)$, located according to the lexicographical order denoted by $c_{[k,i,l,j]}$. The first transformation by $I_m\otimes K_{mn}\otimes I_p$ gives $c_{[k,l,i,j]}\,(=a_{ki}b_{jl})$, which is further converted by $\mathrm{vec}^T(I_m)\otimes I_{np}$ into the $np\times 1$ vector whose $\{j+(i-1)p\}$-th element is $\sum_{k=1}^m\sum_{l=1}^m\delta_{lk}c_{[k,l,i,j]}=\sum_{k,l}\delta_{lk}a_{ki}b_{jl}=\sum_{l=1}^m b_{jl}a_{li}=(BA)_{ji}$, which is the required result.

The second proof: Employing a method as in Terdik [22, Lemma 1.1] for the first and second equations, the third expression of $\mathrm{vec}(BA)$ is derived using $\mathrm{vec}(ABC)=(C^T\otimes A)\mathrm{vec}(B)$ and Theorem 10.2:
$$\mathrm{vec}(BA)=(A^T\otimes B)\mathrm{vec}(I_m)=\{\mathrm{vec}^T(I_m)\otimes I_{np}\}\mathrm{vec}(A^T\otimes B)$$
$$=\{\mathrm{vec}^T(I_m)\otimes I_{np}\}(I_m\otimes K_{mn}\otimes I_p)\{\mathrm{vec}(A^T)\otimes\mathrm{vec}(B)\}.$$
Q.E.D.
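The identities in Theorems 10.2 and 10.3 can be checked numerically. The following sketch is mine, not the book's (numpy assumed; `vec` and `commutation` are hypothetical helper names); it verifies $\mathrm{vec}(A\otimes B)=(I_n\otimes K_{qm}\otimes I_p)\{\mathrm{vec}(A)\otimes\mathrm{vec}(B)\}$ and the three expressions of $\mathrm{vec}(BA)$:

```python
import numpy as np

def vec(M):
    return M.flatten(order="F")            # column-major vectorization

def commutation(r, c):
    """rc x rc commutation matrix K_{rc}: K @ vec(M) = vec(M.T) for M (r x c)."""
    K = np.zeros((r * c, r * c))
    for i in range(r):
        for j in range(c):
            K[i * c + j, j * r + i] = 1.0
    return K

rng = np.random.default_rng(0)
m, n, p, q = 2, 3, 4, 2
A, B = rng.standard_normal((m, n)), rng.standard_normal((p, q))

# Theorem 10.2: vec(A ⊗ B) = (I_n ⊗ K_{qm} ⊗ I_p)(vec A ⊗ vec B)
T = np.kron(np.eye(n), np.kron(commutation(q, m), np.eye(p)))
assert np.allclose(vec(np.kron(A, B)), T @ np.kron(vec(A), vec(B)))

# Theorem 10.3: three expressions of vec(BA) for A (m x n), B2 (p x m)
B2 = rng.standard_normal((p, m))
lhs = vec(B2 @ A)
Inp = np.eye(n * p)
r1 = (np.kron(vec(A).reshape(1, -1), Inp)
      @ np.kron(np.eye(n), np.kron(commutation(m, n), np.eye(p)))
      @ np.kron(vec(np.eye(n)), vec(B2)))
r2 = (np.kron(vec(B2).reshape(1, -1), Inp)
      @ np.kron(np.eye(m), np.kron(commutation(p, n), np.eye(p)))
      @ np.kron(vec(A.T), vec(np.eye(p))))
r3 = (np.kron(vec(np.eye(m)).reshape(1, -1), Inp)
      @ np.kron(np.eye(m), np.kron(commutation(m, n), np.eye(p)))
      @ np.kron(vec(A.T), vec(B2)))
for r in (r1, r2, r3):
    assert np.allclose(lhs, r)
```

All three right-hand sides reproduce $\mathrm{vec}(BA)$ up to floating-point error.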
The above proof is given to show that the unique elements, i.e., the inner products $(BA)_{ji}=b_j^Ta_i$ $(i=1,\ldots,n;\ j=1,\ldots,p)$ in this case, can be used to derive an
equation by specifying their positions in the equation. In addition to yielding the expressions of the elements of the matrix on each side of an equation, this method seems to give some help in finding the transformation of the matrices of interest, e.g., $A$ and $B$ in the above case, when the commutator, the vec operator and the Kronecker product are used. However, when only the explicit expressions for the elements on each side of the equation are desired, the left-hand side can be used. The equation itself, without the explicit expressions of the elements, may be derived more concisely by using $\mathrm{vec}(ABC)=(C^T\otimes A)\mathrm{vec}(B)$ repeatedly, as shown in the second proof of Theorem 10.3 (iii). Note that this formula is applied to $\mathrm{vec}(BA)$ in (i), (ii) and (iii) using $BA=BAI_n=I_pBA=BI_mA$, respectively. The result (iii) is of interest since it is a formula relating a vectorized usual matrix product to the Kronecker product of two vectorized matrices.

Remark 10.1 Theorem 10.3 suggests more basic results for an $m\times n$ matrix $A$:
$$\mathrm{vec}(A)=\mathrm{vec}(AI_nI_n)=(I_n\otimes A)\mathrm{vec}(I_n)=\mathrm{vec}(I_mI_mA)=(A^T\otimes I_m)\mathrm{vec}(I_m),$$
where $a_{ij}$ is the $\{i+(j-1)m\}$-th element of $\mathrm{vec}(A)$ $(i=1,\ldots,m;\ j=1,\ldots,n)$. The corresponding elements of $(I_n\otimes A)\mathrm{vec}(I_n)$ and $(A^T\otimes I_m)\mathrm{vec}(I_m)$ are obtained by $\sum_{k,l=1}^{n,n}\delta_{jk}a_{il}\delta_{lk}=a_{ij}$ and $\sum_{k,l=1}^{m,m}a_{lj}\delta_{ik}\delta_{lk}=a_{ij}$, respectively, as expected. Note that the remaining third result $\mathrm{vec}(A)=\mathrm{vec}(I_mAI_n)=(I_n\otimes I_m)\mathrm{vec}(A)=I_{mn}\mathrm{vec}(A)$ is trivial.
10.2 Multivariate Cumulants and Multiple Commutators
Let $X$ be a $p\times 1$ random vector with $E(X)=\mu$ and $\mathrm{cov}(X)=\Sigma>0$ (positive definite). Define $Y=\Sigma^{-1/2}(X-\mu)$, where $\Sigma^{-1/2}$ is a symmetric matrix square root of $\Sigma^{-1}$. That is, we consider the standardized random vector with uncorrelated elements. The multivariate cumulants up to the fourth order are denoted by the $p^j\times 1$ vectors $\kappa_j(Y)$ $(j=1,\ldots,4)$:
$$\kappa_1(Y)=E(Y)=0,\quad \kappa_2(Y)=E(Y^{\otimes 2})=\mathrm{vec}\{\mathrm{cov}(Y)\}=\mathrm{vec}(I_p),\quad \kappa_3(Y)=E(Y^{\otimes 3}),$$
$$\kappa_4(Y)=E(Y^{\otimes 4})-\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2}),$$
where
$$\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})=E(Y\otimes Y\otimes Y^*\otimes Y^*)+E(Y\otimes Y^*\otimes Y\otimes Y^*)+E(Y\otimes Y^*\otimes Y^*\otimes Y),$$
$$E^{\otimes 2}(Y^{\otimes 2})=\{E(Y^{\otimes 2})\}^{\otimes 2}=E(Y\otimes Y\otimes Y^*\otimes Y^*),$$
$Y^*$ is an independent copy of $Y$, and $\sum^k$ as used earlier indicates the sum of $k$ similar terms chosen to make the result symmetric with respect to the positions of the random vectors in the Kronecker products, considering their combinations/permutations. While the above expression is used to explain the definition of the operator $\sum^3$, the same result is also given by
$$\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})=E^{\otimes 2}(Y^{\otimes 2})+E(Y\otimes Y^*\otimes Y\otimes Y^*)+E\{Y\otimes E(Y\otimes Y)\otimes Y\}.$$
Note that $\sum^3$ is seen as a (scaled) symmetrizer. When $p=1$, writing $Y$ for the scalar variable, we have $\sum^3 E^{\otimes 2}(Y^{\otimes 2})=3E^2(Y^2)=3$. That is, $\sum^3(\cdot)/3$ is the corresponding unscaled symmetrizer. The standard definition of the unscaled symmetrizer for $a_1\otimes\cdots\otimes a_k$, where $a_i$ is a $d_i\times 1$ vector $(i=1,\ldots,k)$, is given by
$$S_{(d_1,\ldots,d_k)}(a_1\otimes\cdots\otimes a_k)=\sum_{\{\pi(1),\ldots,\pi(k)\}\in S_k}(a_{\pi(1)}\otimes\cdots\otimes a_{\pi(k)})/k!,$$
where $\{\pi(1),\ldots,\pi(k)\}=\pi(1:k)$ is a permutation of $(1:k)=(1,\ldots,k)$, and $S_k$ with $k!$ members is the set of $k$-way permutations (Holmquist [6, p. 175]; Kano [9, Sect. 2.1]; Terdik [22, Sect. 1.3.1]). Noting that $\sum^3 E^{\otimes 2}(Y^{\otimes 2})=\sum^3 E(Y\otimes Y\otimes Y^*\otimes Y^*)$, we obtain the scaled symmetrizer $S_{2,2}$ yielding $\sum^3 E(Y\otimes Y\otimes Y^*\otimes Y^*)=S_{2,2}\,E(Y\otimes Y\otimes Y^*\otimes Y^*)$. For this purpose, the $q$-way or $q$-mode commutation matrix (commutator) is introduced (Jammalamadaka, Taufer and Terdik [8, Appendix 1.2]), which is an extension of the 2-mode commutator $K_{mp}$ defined earlier with $K_{mp}(b\otimes a)=a\otimes b$, where $a$ and $b$ are $m\times 1$ and $p\times 1$ vectors, respectively. Employing the order of the vectors in the Kronecker product and suppressing the dimension sizes, i.e., $m$ and $p$ in the above case, we write $K_{(21)}=K_{mp}$,
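A quick numerical illustration of the 2-mode commutator and of the unscaled symmetrizer for $k=2$ (my sketch, not the book's code; numpy assumed, `commutation` is a hypothetical helper name):

```python
import numpy as np

def commutation(r, c):
    """rc x rc commutation matrix: K @ kron(b, a) = kron(a, b), a in R^r, b in R^c."""
    K = np.zeros((r * c, r * c))
    for i in range(r):
        for j in range(c):
            K[i * c + j, j * r + i] = 1.0
    return K

rng = np.random.default_rng(0)
m, p = 3, 2
a, b = rng.standard_normal(m), rng.standard_normal(p)
K_mp = commutation(m, p)
assert np.allclose(K_mp @ np.kron(b, a), np.kron(a, b))   # K_{mp}(b ⊗ a) = a ⊗ b

# unscaled symmetrizer S_{(d,d)} for k = 2: average over the 2! permutations
d = 3
u, v = rng.standard_normal(d), rng.standard_normal(d)
S = 0.5 * (np.eye(d * d) + commutation(d, d))
assert np.allclose(S @ np.kron(u, v), 0.5 * (np.kron(u, v) + np.kron(v, u)))
```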
where $(21)=(2,1)$ is a permutation of the original sequence $(12)=(1,2)$. When $q=3$ we have, e.g.,
$$K_{(231)}(a_1\otimes a_2\otimes a_3)=a_{p(1)}\otimes a_{p(2)}\otimes a_{p(3)}=a_3\otimes a_1\otimes a_2,$$
where $a_1$, $a_2$ and $a_3$, with possibly distinct dimensions $d_i$ $(i=1,2,3)$, respectively, go to the second, third and first factors (vectors) of the permuted Kronecker product before permutation, due to the definition of the permutation
$$p(123)=p(1:3)=\{p(1),p(2),p(3)\}=(3,1,2)=(312)\neq(231),$$
where $p(i)$ and $i$ correspond to the $\{p(i)\}$-th and $i$-th factors in the Kronecker products before and after permutation, respectively. This definition of the multiple commutator seems somewhat intractable due to $(312)\neq(231)$ in the above case. It is natural to desire an expression of the commutator such that $K^{-1}_{(312)}(a_1\otimes a_2\otimes a_3)=a_3\otimes a_1\otimes a_2$, which is much more tractable than the expression $K_{(231)}\,(=K^{-1}_{(312)})$, since the subscript $(312)$ of $K^{-1}_{(312)}$ is the same as the desired permutation $p(123)=(312)$. This is easily obtained using the definition of $K_{(\cdot)}$: $a_1\otimes a_2\otimes a_3=K_{(312)}(a_3\otimes a_1\otimes a_2)$, which gives $K^{-1}_{(312)}(a_1\otimes a_2\otimes a_3)=a_3\otimes a_1\otimes a_2$ with $K_{(231)}=K^{-1}_{(312)}$. Note that $K^{-1}_{(312)}$ is a commutator yielding the inverse of the permutation from $(312)$ to $(1:3)$, where the existence of the inverse of $K_{(312)}$ is shown by using commutators interchanging the neighboring factors successively as
$$K_{(312)}=K_{(132)}K_{(213)}=K_{(132)}(d_1,d_3,d_2)K_{(213)}(d_3,d_1,d_2),$$
where $K_{(213)}(d_3,d_1,d_2)=K_{(21)}(d_3,d_1)\otimes I_{d_2}$ and $K_{(213)}(d_3,d_1,d_2)$ is the commutator applied to $a_3\otimes a_1\otimes a_2$, whose factors are of dimensions $d_3$, $d_1$ and $d_2$, respectively, with $K_{(21)}(d_3,d_1)$ defined similarly; this notation is employed by Terdik [22, Sect. 1.2.4]. Then we have
$$K_{(213)}(d_3,d_1,d_2)(a_3\otimes a_1\otimes a_2)=\{K_{(21)}(d_3,d_1)\otimes I_{d_2}\}(a_3\otimes a_1\otimes a_2)$$
$$=\{K_{(21)}(d_3,d_1)(a_3\otimes a_1)\}\otimes(I_{d_2}a_2)=(a_1\otimes a_3)\otimes a_2=a_1\otimes a_3\otimes a_2,$$
i.e., exchanging $a_3$ and $a_1$. Similarly, we obtain $K_{(132)}=K_{(132)}(d_1,d_3,d_2)=I_{d_1}\otimes K_{(21)}(d_3,d_2)$.
Note that notations suppressing the dimension sizes, e.g., $K_{(21)}=K_{(21)}(d_2,d_1)$, are used for simplicity when no confusion occurs. Then, using these results, we confirm that
$$K_{(312)}(a_3\otimes a_1\otimes a_2)=K_{(132)}K_{(213)}(a_3\otimes a_1\otimes a_2)=\{I_{d_1}\otimes K_{(21)}(d_3,d_2)\}(a_1\otimes a_3\otimes a_2)$$
$$=a_1\otimes\{K_{(21)}(d_3,d_2)(a_3\otimes a_2)\}=a_1\otimes a_2\otimes a_3.$$
Using the expressions of $K_{(132)}$ and $K_{(213)}$ in the process of confirmation, we obtain
$$K_{(312)}=K_{(132)}K_{(213)}=\{I_{d_1}\otimes K_{(21)}(d_3,d_2)\}\{K_{(21)}(d_3,d_1)\otimes I_{d_2}\},$$
which gives
$$K_{(231)}=K^{-1}_{(312)}=K^{-1}_{(213)}K^{-1}_{(132)}=\{K_{(21)}(d_3,d_1)\otimes I_{d_2}\}^{-1}\{I_{d_1}\otimes K_{(21)}(d_3,d_2)\}^{-1}$$
$$=\{K^{-1}_{(21)}(d_3,d_1)\otimes I_{d_2}\}\{I_{d_1}\otimes K^{-1}_{(21)}(d_3,d_2)\}=\{K_{(21)}(d_1,d_3)\otimes I_{d_2}\}\{I_{d_1}\otimes K_{(21)}(d_2,d_3)\},$$
where the existence of $K^{-1}_{(21)}(d_3,d_1)=K^{-1}_{d_1d_3}=K_{d_3d_1}$ and $K^{-1}_{(21)}(d_3,d_2)=K^{-1}_{d_2d_3}=K_{d_3d_2}$ is used with the notations of usual, or two-fold, commutators. The last result indicates the form of $K^{-1}_{(312)}$ as well as its existence. Note that Jammalamadaka et al. [8, Appendix 1.2] and Terdik [22, Appendix A.4] extensively used the notation $K^{-1}_{(\cdot)}$. To find $K_{(\cdot)}$ corresponding to $K^{-1}_{(312)}$, write the positions of 1, 2 and 3 in the subscript $(312)$ sequentially, i.e., the second, third and first positions, respectively, giving the subscript $(231)$ of $K_{(\cdot)}$ in $K^{-1}_{(312)}=K_{(231)}$. Conversely, to find $K^{-1}_{(\cdot)}$ corresponding to $K_{(231)}$, write the positions similarly, i.e., the third, first and second, giving the subscript $(312)$ of $K^{-1}_{(\cdot)}$. For the expression given in the first paragraph of this subsection, we have the corresponding explicit one using $K^{-1}_{(\cdot)}$ and the scaled symmetrizer $S_{2,2}$.

Theorem 10.4
$$S_{2,2}\{E^{\otimes 2}(Y^{\otimes 2})\}=\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})=E(Y\otimes Y\otimes Y^*\otimes Y^*)+E(Y\otimes Y^*\otimes Y\otimes Y^*)+E(Y\otimes Y^*\otimes Y^*\otimes Y)$$
$$=(I_{p^4}+K^{-1}_{(1324)}+K^{-1}_{(1342)})E(Y\otimes Y\otimes Y^*\otimes Y^*)=(I_{p^4}+K^{-1}_{(1324)}+K^{-1}_{(1342)})E^{\otimes 2}(Y^{\otimes 2}).$$
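The multiple commutators in Theorem 10.4 are permutation matrices acting on Kronecker indices, so they can be built generically. The sketch below is mine, not the book's (`perm_commutator_inv` is a hypothetical helper; its zero-based `perm` corresponds to the one-based subscript of $K^{-1}_{(\cdot)}$):

```python
import numpy as np

def perm_commutator_inv(perm, dims):
    """P with P @ (a_1⊗...⊗a_k) = a_{perm[0]}⊗...⊗a_{perm[k-1]} (perm zero-based),
    i.e. the book's K^{-1}_{(sigma)} for the one-based permutation sigma."""
    n = int(np.prod(dims))
    P = np.zeros((n, n))
    for idx in np.ndindex(*tuple(dims)):
        src = np.ravel_multi_index(idx, dims)                       # last index fastest
        dst = np.ravel_multi_index(tuple(idx[t] for t in perm),
                                   tuple(dims[t] for t in perm))
        P[dst, src] = 1.0
    return P

rng = np.random.default_rng(1)
p = 2
a = [rng.standard_normal(p) for _ in range(4)]
x = np.kron(np.kron(a[0], a[1]), np.kron(a[2], a[3]))
K1324 = perm_commutator_inv((0, 2, 1, 3), [p] * 4)   # a1⊗a2⊗a3⊗a4 -> a1⊗a3⊗a2⊗a4
K1342 = perm_commutator_inv((0, 2, 3, 1), [p] * 4)   # a1⊗a2⊗a3⊗a4 -> a1⊗a3⊗a4⊗a2
assert np.allclose(K1324 @ x, np.kron(np.kron(a[0], a[2]), np.kron(a[1], a[3])))
assert np.allclose(K1342 @ x, np.kron(np.kron(a[0], a[2]), np.kron(a[3], a[1])))
```

Applied to $E(Y\otimes Y\otimes Y^*\otimes Y^*)$, these two matrices produce the second and third symmetrizing terms of Theorem 10.4 by linearity.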
Note that $\bar S_{2,2}=S_{2,2}/3$ is symmetric and idempotent by construction (see, e.g., Kano [9, Proposition 2.2]). So far, the well-known formulas
$$(A_1\otimes\cdots\otimes A_k)(B_1\otimes\cdots\otimes B_k)=(A_1B_1)\otimes\cdots\otimes(A_kB_k),$$
together with $(A_1\otimes\cdots\otimes A_k)^T=A_1^T\otimes\cdots\otimes A_k^T$ and $(C_1\otimes\cdots\otimes C_k)^{-1}=C_1^{-1}\otimes\cdots\otimes C_k^{-1}$, have been used, where $A_i$, $B_i$ and $C_i$ are $m_i\times n_i$, $n_i\times q_i$ and $r_i\times r_i$ matrices, respectively, with the assumption of the existence of $C_i^{-1}$ $(i=1,\ldots,k)$. It is of some interest to prove the first equation using the lexicographical notation, e.g., $c_{[i,j,k,l]}$ with $l$ changing fastest, as employed earlier. The elements of the products $(A_mB_m)_{i_mj_m}$ $(m=1,\ldots,k)$ on the right-hand side give the rows indexed by $c_{[i_1,\ldots,i_k]}\,(=(A_1B_1)_{i_1\cdot}\cdots(A_kB_k)_{i_k\cdot})$ $(i_l=1,\ldots,m_l;\ l=1,\ldots,k)$, where the dots indicate possibly different arbitrary column indexes. Similarly, the columns on the right-hand side are indexed by $c_{[j_1,\ldots,j_k]}\,(=(A_1B_1)_{\cdot j_1}\cdots(A_kB_k)_{\cdot j_k})$ $(j_l=1,\ldots,q_l;\ l=1,\ldots,k)$. On the left-hand side of the equation, the first factor $A_1\otimes\cdots\otimes A_k$ has the row and column indexes $c_{[i_1,\ldots,i_k]}\,(=(A_1)_{i_1\cdot}\cdots(A_k)_{i_k\cdot})$ $(i_l=1,\ldots,m_l)$ and $c_{[t_1,\ldots,t_k]}\,(=(A_1)_{\cdot t_1}\cdots(A_k)_{\cdot t_k})$ $(t_l=1,\ldots,n_l)$, respectively. Similarly, the second factor $B_1\otimes\cdots\otimes B_k$ has the row and column indexes $c_{[t_1,\ldots,t_k]}\,(=(B_1)_{t_1\cdot}\cdots(B_k)_{t_k\cdot})$ $(t_l=1,\ldots,n_l)$ and $c_{[j_1,\ldots,j_k]}\,(=(B_1)_{\cdot j_1}\cdots(B_k)_{\cdot j_k})$ $(j_l=1,\ldots,q_l)$. Taking the product of the first and second factors, with the column indexes of the first matched to the row indexes of the second, it is found that the row and column indexes of $(A_1\otimes\cdots\otimes A_k)(B_1\otimes\cdots\otimes B_k)$ are given by $c_{[i_1,\ldots,i_k]}$ and $c_{[j_1,\ldots,j_k]}$ with elements $(A_1B_1)_{i_1j_1}\cdots(A_kB_k)_{i_kj_k}$, which shows the required result. The formula $(A_1\otimes\cdots\otimes A_k)^T=A_1^T\otimes\cdots\otimes A_k^T$ is similarly given. The remaining formula $(C_1\otimes\cdots\otimes C_k)^{-1}=C_1^{-1}\otimes\cdots\otimes C_k^{-1}$ follows from the first formula when $A_i=C_i$ and $B_i=C_i^{-1}$ $(i=1,\ldots,k)$ with $I_{r_1}\otimes\cdots\otimes I_{r_k}=I_{r_1\cdots r_k}$. The above formulas give the following summarized results for ease of reference.
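The three Kronecker-product formulas above can be confirmed numerically (my sketch, numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
# shapes: A_i (m_i x n_i), B_i (n_i x q_i), C_i (r_i x r_i), invertible a.s.
A1, A2 = rng.standard_normal((2, 3)), rng.standard_normal((4, 2))
B1, B2 = rng.standard_normal((3, 5)), rng.standard_normal((2, 2))
C1, C2 = rng.standard_normal((3, 3)), rng.standard_normal((2, 2))

# mixed-product property
assert np.allclose(np.kron(A1, A2) @ np.kron(B1, B2), np.kron(A1 @ B1, A2 @ B2))
# transpose and inverse distribute over the Kronecker product
assert np.allclose(np.kron(A1, A2).T, np.kron(A1.T, A2.T))
assert np.allclose(np.linalg.inv(np.kron(C1, C2)),
                   np.kron(np.linalg.inv(C1), np.linalg.inv(C2)))
```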
Lemma 10.3 Let $a_i$ be $d_i\times 1$ vectors $(i=1,\ldots,4)$, respectively. Then, we have
$$K^{-1}_{(1324)}(a_1\otimes a_2\otimes a_3\otimes a_4)=K^{-1}_{(1324)}(d_1,d_2,d_3,d_4)(a_1\otimes a_2\otimes a_3\otimes a_4)$$
$$=\{I_{d_1}\otimes K_{(21)}(d_2,d_3)\otimes I_{d_4}\}(a_1\otimes a_2\otimes a_3\otimes a_4)=a_1\otimes K_{(21)}(a_2\otimes a_3)\otimes a_4=a_1\otimes a_3\otimes a_2\otimes a_4,$$
$$K^{-1}_{(132)}=K^{-1}_{(132)}(d_1,d_2,d_3)=\sum_{i,j}^{d_3,d_2}\bigl(I_{d_1}\otimes E^{(d_3\times d_2)}_{ij}\otimes E^{(d_2\times d_3)}_{ji}\bigr),$$
$$K^{-1}_{(321)}=K^{-1}_{(321)}(d_1,d_2,d_3)=\sum_{i,j}^{d_3,d_1}\bigl(E^{(d_3\times d_1)}_{ij}\otimes I_{d_2}\otimes E^{(d_1\times d_3)}_{ji}\bigr),$$
$$K^{-1}_{(213)}=K^{-1}_{(213)}(d_1,d_2,d_3)=\sum_{i,j}^{d_2,d_1}\bigl(E^{(d_2\times d_1)}_{ij}\otimes E^{(d_1\times d_2)}_{ji}\otimes I_{d_3}\bigr),$$
$$K^{-1}_{(312)}=K^{-1}_{(213)}K^{-1}_{(132)}=K^{-1}_{(213)}(d_1,d_3,d_2)K^{-1}_{(132)}(d_1,d_2,d_3)=\sum_{i,j,g}^{d_3,d_1,d_2}\bigl(E^{(d_3\times d_1)}_{ij}\otimes E^{(d_1\times d_2)}_{jg}\otimes E^{(d_2\times d_3)}_{gi}\bigr)$$
with $K^{-1}_{(312)}(a_1\otimes a_2\otimes a_3)=a_3\otimes a_1\otimes a_2$, where $E^{(d_3\times d_2)}_{ij}$ is the $d_3\times d_2$ matrix whose $(i,j)$-th element is 1 with the remaining ones being zero, and similarly for the other unit matrices.
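The unit-matrix expression for $K^{-1}_{(312)}$ in Lemma 10.3 can be checked directly (my sketch, not the book's code; `E` is a hypothetical helper building the unit matrices $E^{(r\times c)}_{ij}$):

```python
import numpy as np

def E(r, c, i, j):
    """r x c unit matrix with a single 1 in position (i, j) (zero-based)."""
    M = np.zeros((r, c))
    M[i, j] = 1.0
    return M

d1, d2, d3 = 2, 3, 4
# K^{-1}_{(312)} = sum_{i,j,g} E^{(d3 x d1)}_{ij} ⊗ E^{(d1 x d2)}_{jg} ⊗ E^{(d2 x d3)}_{gi}
K312inv = sum(np.kron(E(d3, d1, i, j), np.kron(E(d1, d2, j, g), E(d2, d3, g, i)))
              for i in range(d3) for j in range(d1) for g in range(d2))

rng = np.random.default_rng(3)
a1, a2, a3 = rng.standard_normal(d1), rng.standard_normal(d2), rng.standard_normal(d3)
lhs = K312inv @ np.kron(a1, np.kron(a2, a3))
assert np.allclose(lhs, np.kron(a3, np.kron(a1, a2)))   # a1⊗a2⊗a3 -> a3⊗a1⊗a2
```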
Proof The first set of equations, for $K^{-1}_{(1324)}(\cdot)$, summarizes results explained earlier. The equations for $K^{-1}_{(132)}$, $K^{-1}_{(321)}$ and $K^{-1}_{(213)}$ are given by Lemma 10.1. The remaining result for $K^{-1}_{(312)}$ is given as follows:
$$a_3\otimes a_1\otimes a_2=K^{-1}_{(312)}(a_1\otimes a_2\otimes a_3)=K^{-1}_{(213)}K^{-1}_{(132)}(a_1\otimes a_2\otimes a_3)$$
$$=K^{-1}_{(213)}(d_1,d_3,d_2)K^{-1}_{(132)}(d_1,d_2,d_3)(a_1\otimes a_2\otimes a_3)$$
$$=\{K_{(21)}(d_1,d_3)\otimes I_{d_2}\}\{I_{d_1}\otimes K_{(21)}(d_2,d_3)\}(a_1\otimes a_2\otimes a_3)=(K_{d_3d_1}\otimes I_{d_2})(I_{d_1}\otimes K_{d_3d_2})(a_1\otimes a_2\otimes a_3).$$
Recall Lemma 10.1 using the clarified notation
$$K_{d_id_j}=\sum_{i=1}^{d_i}\sum_{j=1}^{d_j}\bigl(e^{(d_i)}_{(i)}e^{(d_j)T}_{(j)}\bigr)\otimes\bigl(e^{(d_j)}_{(j)}e^{(d_i)T}_{(i)}\bigr)=\sum_{i,j}^{d_i,d_j}\bigl(e^{(d_i)}_{(i)}e^{(d_j)T}_{(j)}\bigr)\otimes\bigl(e^{(d_j)}_{(j)}e^{(d_i)T}_{(i)}\bigr),$$
where $e^{(d_i)}_{(i)}$ is the $d_i\times 1$ vector whose $i$-th element is 1 with the remaining ones being zero, and $e^{(d_i)T}_{(i)}=(e^{(d_i)}_{(i)})^T$. Then, we obtain
$$K^{-1}_{(213)}K^{-1}_{(132)}(a_1\otimes a_2\otimes a_3)=(K_{d_3d_1}\otimes I_{d_2})(I_{d_1}\otimes K_{d_3d_2})(a_1\otimes a_2\otimes a_3)$$
$$=\Bigl\{\sum_{i,j}^{d_3,d_1}(e^{(d_3)}_{(i)}e^{(d_1)T}_{(j)})\otimes(e^{(d_1)}_{(j)}e^{(d_3)T}_{(i)})\otimes I_{d_2}\Bigr\}\Bigl\{I_{d_1}\otimes\sum_{k,l}^{d_3,d_2}(e^{(d_3)}_{(k)}e^{(d_2)T}_{(l)})\otimes(e^{(d_2)}_{(l)}e^{(d_3)T}_{(k)})\Bigr\}(a_1\otimes a_2\otimes a_3)$$
$$=\Bigl\{\sum_{i,j,g}^{d_3,d_1,d_2}(e^{(d_3)}_{(i)}e^{(d_1)T}_{(j)})\otimes(e^{(d_1)}_{(j)}e^{(d_3)T}_{(i)})\otimes(e^{(d_2)}_{(g)}e^{(d_2)T}_{(g)})\Bigr\}\Bigl\{\sum_{h,k,l}^{d_1,d_3,d_2}(e^{(d_1)}_{(h)}e^{(d_1)T}_{(h)})\otimes(e^{(d_3)}_{(k)}e^{(d_2)T}_{(l)})\otimes(e^{(d_2)}_{(l)}e^{(d_3)T}_{(k)})\Bigr\}(a_1\otimes a_2\otimes a_3)$$
$$=\sum_{i,j,g}^{d_3,d_1,d_2}\sum_{h,k,l}^{d_1,d_3,d_2}(e^{(d_3)}_{(i)}\otimes e^{(d_1)}_{(j)}\otimes e^{(d_2)}_{(g)})(e^{(d_1)}_{(j)}\otimes e^{(d_3)}_{(i)}\otimes e^{(d_2)}_{(g)})^T(e^{(d_1)}_{(h)}\otimes e^{(d_3)}_{(k)}\otimes e^{(d_2)}_{(l)})(e^{(d_1)}_{(h)}\otimes e^{(d_2)}_{(l)}\otimes e^{(d_3)}_{(k)})^T(a_1\otimes a_2\otimes a_3)$$
$$=\sum_{i,j,g}^{d_3,d_1,d_2}(e^{(d_3)}_{(i)}\otimes e^{(d_1)}_{(j)}\otimes e^{(d_2)}_{(g)})(e^{(d_1)}_{(j)}\otimes e^{(d_2)}_{(g)}\otimes e^{(d_3)}_{(i)})^T(a_1\otimes a_2\otimes a_3)$$
$$=\sum_{i,j,g}^{d_3,d_1,d_2}\bigl\{(e^{(d_3)}_{(i)}e^{(d_1)T}_{(j)})\otimes(e^{(d_1)}_{(j)}e^{(d_2)T}_{(g)})\otimes(e^{(d_2)}_{(g)}e^{(d_3)T}_{(i)})\bigr\}(a_1\otimes a_2\otimes a_3)$$
$$=\sum_{i,j,g}^{d_3,d_1,d_2}\bigl(E^{(d_3\times d_1)}_{ij}\otimes E^{(d_1\times d_2)}_{jg}\otimes E^{(d_2\times d_3)}_{gi}\bigr)(a_1\otimes a_2\otimes a_3).$$
Since the elements of $a_i$ $(i=1,2,3)$ are arbitrary, we obtain the required result. Q.E.D.

In Lemma 10.3, $K^{-1}_{(132)}$, $K^{-1}_{(321)}$ and $K^{-1}_{(213)}$ are commutators interchanging two factors (vectors) in the Kronecker product, i.e., giving single-step permutations, while $K^{-1}_{(312)}$ is a commutator interchanging neighboring factors two times, i.e., yielding a two-step permutation. The proof of Lemma 10.3 gives the following general result.

Theorem 10.5 Let $a_i$ be $d_i\times 1$ vectors $(i=1,\ldots,k)$, respectively. Suppose that $a_1\otimes\cdots\otimes a_k$ is transformed to $a_k\otimes a_1\otimes\cdots\otimes a_{k-1}$ by interchanging the neighboring two factors $k-1$ times, starting from exchanging $a_{k-1}\otimes a_k$ into $a_k\otimes a_{k-1}$. Then, the commutator for the transformation is
$$K^{-1}_{(k,1,2,\ldots,k-1)}=\sum_{i_k,i_1,\ldots,i_{k-1}}^{d_k,d_1,\ldots,d_{k-1}}\bigl(E^{(d_k\times d_1)}_{i_ki_1}\otimes E^{(d_1\times d_2)}_{i_1i_2}\otimes\cdots\otimes E^{(d_{k-1}\times d_k)}_{i_{k-1}i_k}\bigr)$$
with $K^{-1}_{(k,1,2,\ldots,k-1)}(a_1\otimes\cdots\otimes a_k)=a_k\otimes a_1\otimes\cdots\otimes a_{k-1}$.
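Theorem 10.5 can be checked for $k=4$ (my sketch, not the book's code; `E` is a hypothetical unit-matrix helper):

```python
import numpy as np

def E(r, c, i, j):
    M = np.zeros((r, c))
    M[i, j] = 1.0
    return M

dims = (2, 3, 2, 4)            # d1, d2, d3, d4
d1, d2, d3, d4 = dims
# Theorem 10.5 with k = 4: K^{-1}_{(4123)} as a sum of Kronecker products of unit matrices
K4123inv = sum(np.kron(np.kron(E(d4, d1, i4, i1), E(d1, d2, i1, i2)),
                       np.kron(E(d2, d3, i2, i3), E(d3, d4, i3, i4)))
               for i4 in range(d4) for i1 in range(d1)
               for i2 in range(d2) for i3 in range(d3))

rng = np.random.default_rng(4)
a = [rng.standard_normal(d) for d in dims]
x = np.kron(np.kron(a[0], a[1]), np.kron(a[2], a[3]))    # a1⊗a2⊗a3⊗a4
y = np.kron(np.kron(a[3], a[0]), np.kron(a[1], a[2]))    # a4⊗a1⊗a2⊗a3
assert np.allclose(K4123inv @ x, y)
```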
In the first paragraph of this section, the multivariate cumulants of the standardized random vector $Y=\Sigma^{-1/2}(X-\mu)$ are addressed. In some or many cases, the unstandardized multivariate raw moments are obtained more easily than the standardized or unstandardized central moments. In the following, we obtain the multivariate unstandardized cumulants via multivariate raw moments.

Corollary 10.1
$$\kappa_1(X)=E(X)=\mu,\quad \kappa_2(X)=\mathrm{vec}\{\mathrm{cov}(X)\}=\mathrm{vec}(\Sigma)=E(X^{\otimes 2})-\mu^{\otimes 2},$$
$$\kappa_3(X)=E\{(X-\mu)^{\otimes 3}\}=E(X^{\otimes 3})-\sum{}^3\,E(X^{\otimes 2})\otimes\mu+2\mu^{\otimes 3}$$
$$=E(X^{\otimes 3})-(I_{p^3}+K^{-1}_{(132)}+K^{-1}_{(312)})\{E(X^{\otimes 2})\otimes\mu\}+2\mu^{\otimes 3}$$
$$=E(X^{\otimes 3})-S_{2,1}\{E(X^{\otimes 2})\otimes\mu\}+2\mu^{\otimes 3},$$
$$\kappa_4(X)=E\{(X-\mu)^{\otimes 4}\}=E(X^{\otimes 4})-\sum{}^4\,E(X^{\otimes 3})\otimes\mu+\sum{}^6\,E(X^{\otimes 2})\otimes\mu^{\otimes 2}-3\mu^{\otimes 4}-\sum{}^3\,\kappa_2^{\otimes 2}(X)$$
$$=E(X^{\otimes 4})-(I_{p^4}+K^{-1}_{(1243)}+K^{-1}_{(1423)}+K^{-1}_{(4123)})\{E(X^{\otimes 3})\otimes\mu\}$$
$$\quad+(I_{p^4}+K^{-1}_{(1324)}+K^{-1}_{(1342)}+K^{-1}_{(3124)}+K^{-1}_{(3142)}+K^{-1}_{(3412)})\{E(X^{\otimes 2})\otimes\mu^{\otimes 2}\}-3\mu^{\otimes 4}$$
$$\quad-(I_{p^4}+K^{-1}_{(1324)}+K^{-1}_{(1342)})\kappa_2^{\otimes 2}(X)$$
$$=E(X^{\otimes 4})-S_{3,1}\{E(X^{\otimes 3})\otimes\mu\}+S_{2,1,1}\{E(X^{\otimes 2})\otimes\mu^{\otimes 2}\}-3\mu^{\otimes 4}-S_{2,2}\,\kappa_2^{\otimes 2}(X),$$
where $K^{-1}_{(\cdot)}=K^{-1}_{(\cdot)}(p,\ldots,p)$.
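As a numerical sanity check of the third-cumulant formula in Corollary 10.1 (my sketch, not the book's code; `perm_mat` is a hypothetical helper, and the discrete distribution below is arbitrary so that all expectations are exact finite sums):

```python
import numpy as np

def perm_mat(perm, dims):
    """P @ (a_1⊗...⊗a_k) = a_{perm[0]}⊗...⊗a_{perm[k-1]} (zero-based perm),
    i.e. the book's K^{-1}_{(sigma)} with one-based sigma."""
    n = int(np.prod(dims))
    P = np.zeros((n, n))
    for idx in np.ndindex(*tuple(dims)):
        src = np.ravel_multi_index(idx, dims)
        dst = np.ravel_multi_index(tuple(idx[t] for t in perm),
                                   tuple(dims[t] for t in perm))
        P[dst, src] = 1.0
    return P

p = 2
xs = np.array([[0., 0.], [1., 2.], [3., 1.], [2., 4.]])   # support points
ws = np.array([0.1, 0.4, 0.3, 0.2])                       # probabilities
mu = ws @ xs
m2 = sum(w * np.kron(x, x) for w, x in zip(ws, xs))                 # E(X^{⊗2})
m3 = sum(w * np.kron(np.kron(x, x), x) for w, x in zip(ws, xs))     # E(X^{⊗3})
k3 = sum(w * np.kron(np.kron(x - mu, x - mu), x - mu) for w, x in zip(ws, xs))

# kappa_3(X) = E(X^{⊗3}) - (I + K^{-1}_{(132)} + K^{-1}_{(312)}){E(X^{⊗2})⊗mu} + 2 mu^{⊗3}
S21 = np.eye(p**3) + perm_mat((0, 2, 1), [p] * 3) + perm_mat((2, 0, 1), [p] * 3)
k3_formula = m3 - S21 @ np.kron(m2, mu) + 2 * np.kron(np.kron(mu, mu), mu)
assert np.allclose(k3, k3_formula)
```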
10.3 Multivariate Measures of Skewness and Kurtosis
Skewness and kurtosis are indexes of asymmetry and peakedness (or tail-thickness), respectively, defined in various ways for scalar random variables. The most typical measures of skewness and kurtosis are based on the third and fourth standardized cumulants, respectively, where "standardized" indicates unit variance after standardization. Other indexes using the mode and quantiles for skewness are also available (Arnold and Groeneveld [1]; Ekström and Jammalamadaka [5]). Since the fourth central moment of the standard normal distribution is 3, the phrase "excess kurtosis", indicating the standardized fourth central moment minus 3, i.e., the standardized fourth cumulant, is also used. Recall that in Chap. 5 the coined word "kurtic" indicates "mesokurtic, leptokurtic or platykurtic", corresponding to zero, positive and negative excess kurtosis, respectively. The measures of skewness and (excess) kurtosis for random vectors are defined in various ways. Consider the cumulant-based ones. Note that the standardization with unit variance in the univariate case corresponds to $Y=\Sigma^{-1/2}(X-\mu)$ giving $\mathrm{cov}(Y)=I_p$. Other methods of standardization can also be considered, especially
when the original variables are of primary interest rather than their linear combinations or summarized values. For instance, let $Y^*=\mathrm{Diag}^{-1/2}(\Sigma)(X-\mu)$, where $\mathrm{Diag}^{-1/2}(\Sigma)=\mathrm{diag}(\sigma_{11}^{-1/2},\ldots,\sigma_{pp}^{-1/2})$ and $\sigma_{ii}$ is the $i$-th diagonal element of $\Sigma$ $(i=1,\ldots,p)$; then we have $\mathrm{cov}(Y^*)=P$, which is the scale-free population correlation matrix (for this type of standardization, see Ogasawara [19]). We assume that overall measures of skewness and kurtosis for random vectors are desired employing $Y=\Sigma^{-1/2}(X-\mu)$ for standardization unless otherwise stated.

10.3.1 Multivariate Measures of Skewness

Recall that the third cumulant vector is given by $\kappa_3(Y)=E(Y^{\otimes 3})$. Among the elements of $\kappa_3(Y)=E(Y^{\otimes 3})$, there are $p$ univariate third cumulants, i.e., $\kappa_3(Y_i)=E(Y_i^3)$ $(i=1,\ldots,p)$. The $3(p^2-p)$ bivariate third cumulants are defined by $\kappa_3(Y_i,Y_i,Y_j)=E(Y_i^2Y_j)$ $(i,j=1,\ldots,p;\ i\neq j)$. The remaining $p(p-1)(p-2)$ tri-variate third cumulants are given by $\kappa_3(Y_i,Y_j,Y_k)=E(Y_iY_jY_k)$ $(i,j,k=1,\ldots,p;\ i\neq j\neq k\neq i)$. It is confirmed that the numbers of elements in the three groups sum to $p+3(p^2-p)+p(p-1)(p-2)=p^3$. Among the $p^3$ elements, the $p$ univariate third cumulants are unique; $p^2-p$ bivariate third cumulants are non-duplicated; and $p(p-1)(p-2)/3!$ tri-variate third cumulants are unique, giving the number of unique third cumulants
$$p+(p^2-p)+p(p-1)(p-2)/3!=p(p+1)(p+2)/3!,$$
which is equal to the number of combinations choosing three variables from the $p$ variables, allowing repetition. One of the popular overall indexes of skewness is Mardia's [15, Eq. (2.19)] $b_{1,p}$ defined by
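The counting identities above can be confirmed by enumeration (my sketch; the unique third cumulants correspond to size-3 multisets of the $p$ variables):

```python
from math import comb
from itertools import combinations_with_replacement

for p in range(1, 8):
    total = p + 3 * (p**2 - p) + p * (p - 1) * (p - 2)      # all p^3 positions
    unique = p + (p**2 - p) + p * (p - 1) * (p - 2) // 6    # unique cumulants
    assert total == p**3
    assert unique == p * (p + 1) * (p + 2) // 6
    assert unique == len(list(combinations_with_replacement(range(p), 3)))
    assert unique == comb(p + 2, 3)                         # multiset count
```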
$$b_{1,p}=\kappa_3(Y)^T\kappa_3(Y)=E(Y^{\otimes 3})^TE(Y^{\otimes 3}),$$
which ignores the signs of the elements of $\kappa_3(Y)=E(Y^{\otimes 3})$ and uses all the $p^3$ duplicated elements of $\kappa_3(Y)$. Mòri et al. [17] defined the vector of skewness
$$E\Bigl(\sum_{i=1}^p Y_i^2\,Y\Bigr)=E(Y^TYY),$$
which includes the $p^2$ non-duplicated third cumulants consisting of the $p$ univariate and $p^2-p$ bivariate ones, ignoring the tri-variate third cumulants. Kollo [10] defined an alternative vector of skewness
$$E\Bigl(\sum_{i=1}^p\sum_{j=1}^p Y_iY_j\,Y\Bigr),$$
which includes all the $p^3$ elements of $\kappa_3(Y)$. However, Jammalamadaka et al. [8, p. 612] showed that Kollo's vector of skewness can be zero when the absolute values of the elements of $\kappa_3(Y)$ are not zero. For instance, in the case of $p=2$ with $\kappa_3(Y)=(1,-1,-1,1,-1,1,1,-1)^T$, the vector becomes
$$E\Bigl(\sum_{i=1}^p\sum_{j=1}^p Y_iY_j\,Y\Bigr)=E\{(1_p^TY)^2Y\}=E\{1_{p^2}^T(Y\otimes Y)Y\}=E\{(1_{p^2}^T\otimes I_p)Y^{\otimes 3}\}=(1_{p^2}^T\otimes I_p)\kappa_3(Y)=0,$$
where $1_p$ is the $p$-vector consisting of 1s. Jammalamadaka et al. [8] stated that this is a disadvantage of Kollo's vector of skewness. This point will be addressed later in detail. Balakrishnan et al. [2] proposed the skewness vector $T$, which was shown to be proportional to Mòri et al.'s vector by Jammalamadaka et al. [8, p. 613]:
$$T=\frac{3}{p(p+2)}\{\mathrm{vec}^T(I_p)\otimes I_p\}\kappa_3(Y)=\frac{3}{p(p+2)}E(Y^TYY),$$
since
$$E(Y^TYY)=E\{(Y^T\otimes Y^T)\mathrm{vec}(I_p)\,Y\}=E\{\mathrm{vec}^T(I_p)(Y\otimes Y)\,Y\}=\{\mathrm{vec}^T(I_p)\otimes I_p\}E(Y^{\otimes 3})=\{\mathrm{vec}^T(I_p)\otimes I_p\}\kappa_3(Y).$$
That is, $T$ is seen to be essentially the same as Mòri et al.'s vector.
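The vanishing of Kollo's vector for the cited counterexample can be verified directly (my sketch; the sign pattern of $\kappa_3(Y)$ is the one reconstructed in the text):

```python
import numpy as np

p = 2
# kappa_3(Y) in lexicographic order (k fastest): (k111, k112, k121, k122, k211, k212, k221, k222)
kappa3 = np.array([1., -1., -1., 1., -1., 1., 1., -1.])

M = np.kron(np.ones((1, p * p)), np.eye(p))   # (1^T_{p^2} ⊗ I_p), a p x p^3 matrix
kollo = M @ kappa3
assert np.allclose(kollo, np.zeros(p))        # Kollo's skewness vector vanishes
assert np.isclose(kappa3 @ kappa3, 8.0)       # ... while Mardia's b_{1,p} = 8 > 0
```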
Srivastava [20] gave a skewness index. Let $\Sigma=C\Lambda C^T=C\,\mathrm{diag}(\lambda_1,\ldots,\lambda_p)C^T$ be the spectral decomposition of $\Sigma$ with $C^TC=I_p$ and $\lambda_i>0$ $(i=1,\ldots,p)$, yielding $\Sigma^{1/2}=C\Lambda^{1/2}C^T=C\,\mathrm{diag}(\lambda_1^{1/2},\ldots,\lambda_p^{1/2})C^T$. Define
$$Y^*=\Lambda^{-1}C^T(X-\mu)=\Lambda^{-1}C^T\Sigma^{1/2}Y=\Lambda^{-1/2}C^TY.$$
Then, Srivastava's multivariate measure of skewness is given by
$$\frac{1}{p}\sum_{i=1}^p E^2[\{(Y^*)_i\}^3]=\frac{1}{p}\sum_{i=1}^p\{E(Y_i^{*3})\}^2=\frac{1}{p}\sum_{i=1}^p\lambda_i^{-3}\{c_i^{T\otimes 3}\kappa_3(Y)\}^2,$$
where $E(Y_i^{*3})=\lambda_i^{-3/2}E\{(C^TY)_i^3\}=\lambda_i^{-3/2}E\{(c_i^TY)^3\}=\lambda_i^{-3/2}c_i^{T\otimes 3}\kappa_3(Y)$ is used, with $c_i^T$ the $i$-th row of $C^T$ (i.e., $c_i$ the $i$-th column of $C$). Jammalamadaka et al. [8, p. 613] gave a derivation similar to the above result and pointed out that Srivastava's multivariate measure of skewness is not affine invariant. This non-invariance is found from $\mathrm{cov}(Y^*)=\Lambda^{-1}$. The corresponding affine invariant measure is easily obtained by
$$Y^{**}=\Lambda^{1/2}Y^*=C^TY\quad\text{with}\quad \mathrm{cov}(Y^{**})=\mathrm{cov}(Y)=I_p.$$
That is, $Y^{**}$ is an orthogonally rotated random vector of $Y$. This gives a modified Srivastava measure:
$$\frac{1}{p}\sum_{i=1}^p E^2\{(Y_i^{**})^3\}=\frac{1}{p}\sum_{i=1}^p\{c_i^{T\otimes 3}\kappa_3(Y)\}^2.$$
As addressed earlier, reconsider Jammalamadaka et al.'s criticism of Kollo's skewness vector $E\{(1_p^TY)^2Y\}=(1_{p^2}^T\otimes I_p)\kappa_3(Y)$, which becomes zero when $p=2$ and $\kappa_3(Y)=(1,-1,-1,1,-1,1,1,-1)^T$ is nonzero. It is found that the following case also gives the same result: $\kappa_3(Y)=(-1,1,1,-1,1,-1,-1,1)^T$. Let $\kappa_{ijk}=E(Y_iY_jY_k)$ $(i,j,k=1,\ldots,p)$ be the elements of $\kappa_3(Y)$, lexicographically ordered with $k$ changing fastest. Then, the necessary and sufficient condition for the Kollo vector to vanish is $\sum_{i=1}^p\sum_{j=1}^p\kappa_{ijk}=0$ $(k=1,\ldots,p)$. This condition stems from the structure of $1_{p^2}^T\otimes I_p$ in $(1_{p^2}^T\otimes I_p)\kappa_3(Y)$: the $q$-th element of $(1_{p^2}^T\otimes I_p)\kappa_3(Y)$ is the sum of every $p$-th element of $\kappa_3(Y)$ starting from the $q$-th one $(q=1,\ldots,p)$. That is, when $p=2$, the first and second elements of $(1_{p^2}^T\otimes I_p)\kappa_3(Y)$ are the sums of the odd- and even-numbered elements of $\kappa_3(Y)$, respectively, which should both be zero for the Kollo vector to vanish. It is also found that we have infinitely many such cases other than the above ones, noting that $\kappa_{112}=\kappa_{121}=\kappa_{211}$ and $\kappa_{122}=\kappa_{212}=\kappa_{221}$.
When a variable $Y_i$ in $Y$ is reflected as $-Y_i$, the associated third cumulants containing $Y_i$ an odd number of times reverse their signs, while those of the form $E(Y_i^2Y_j)$ $(j\neq i)$ are unchanged. This reversal may be employed to study the properties of multivariate third cumulants in simplified situations without loss of generality. One of the simplifications is to make the univariate third cumulants positive by sign reversal of variables where necessary. Since this seems to give substantial simplification, we give the following definition.

Definition 10.1 For variables $X_i$ $(i=1,\ldots,p)$ in a random vector $X=(X_1,\ldots,X_p)^T$, the sign reversal of $X_i$ when the univariate third cumulant $\kappa_3(X_i)<0$ is said to be the "positively skewed transformation", or "skew transformation" for short, which gives the sign reversal of the associated multivariate third cumulants.

In the two cases addressed earlier, we find that (i) $\kappa_3(Y)=(1,-1,-1,1,-1,1,1,-1)^T$ with $\kappa_3(X_1)=1$, $\kappa_3(X_2)=-1$; (ii) $\kappa_3(Y)=(-1,1,1,-1,1,-1,-1,1)^T$ with $\kappa_3(X_1)=-1$, $\kappa_3(X_2)=1$. Employ the skew transformation for one of the variables in (i) and (ii). Then, we have (i) $\kappa_3(Y)=(1,1,1,1,1,1,1,1)^T$ and (ii) $\kappa_3(Y)=(1,1,1,1,1,1,1,1)^T$, with $\kappa_3(Y_1)=\kappa_3(Y_2)=1$ for both (i) and (ii). It is of interest to find that the negative bivariate third cumulants $\kappa_{112}=\kappa_{121}=\kappa_{211}=-1$ in (i) and $\kappa_{122}=\kappa_{212}=\kappa_{221}=-1$ in (ii) all become 1 after the skew transformation. In other words, the negative cumulants before transformation were artifacts of the sign reversal of one of the variables. Note also that case (ii) before the skew transformation is given by exchanging $Y_1$ and $Y_2$ in (i), and that the two variables after the skew transformation are exchangeable as far as the multivariate cumulants up to the third order are concerned.

Employing the skew transformation so that $\kappa_3(Y_1)=\kappa_3(Y_2)=1$, let $\kappa_{112}=\kappa_{121}=\kappa_{211}=a$ and $\kappa_{122}=\kappa_{212}=\kappa_{221}=b$. Then, Kollo's skewness vector becomes $(1_{p^2}^T\otimes I_p)\kappa_3(Y)=(1+2a+b,\ 1+a+2b)^T$. It is found that the condition for the zero Kollo vector is $a=b=-1/3$. Ogasawara [19, Eq. (2.10)] gave an upper bound of $\kappa_{iij}^2$ for $Y_i$ and $Y_j$:
$$\kappa_{iij}^2\le 1+\min\{\kappa_{iijj},\ \kappa_{iiii}+1\}\quad(i\neq j)$$
as an extension of Pearson's inequality $\kappa_{iii}^2\le\kappa_{iiii}+2$, where $\kappa_{iijj}$ and $\kappa_{iiii}$ are the bivariate and univariate fourth cumulants of $(Y_i,Y_j)$ and $Y_i$, respectively, i.e., $\kappa_{iijj}=E(Y_i^2Y_j^2)-1$ and $\kappa_{iiii}=E(Y_i^4)-3$ due to $E(Y_iY_j)=0$ $(i\neq j)$. When $\min\{\kappa_{iijj},\kappa_{iiii}+1\}\ge -8/9$, which seems to be a rather weak condition, $\kappa_{iij}^2=1/9$ does not violate the upper bound, though this is not a sufficient condition for the existence of an associated distribution.
10.3.2 Multivariate Measures of Excess Kurtosis

Multivariate measures of excess kurtosis are based on the multivariate fourth cumulants shown earlier, repeated here:
$$\kappa_4(Y)=E(Y^{\otimes 4})-\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2}),$$
where
$$\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})=S_{2,2}\{E^{\otimes 2}(Y^{\otimes 2})\}=(I_{p^4}+K^{-1}_{(1324)}+K^{-1}_{(1342)})E^{\otimes 2}(Y^{\otimes 2}).$$
One of the most popular indexes is Mardia's [15] non-excess kurtosis $b_{2,p}=E\{(Y^TY)^2\}$, which is obtained from $\kappa_4(Y)$ as
$$b_{2,p}=E[\{\mathrm{vec}(Y^TY)\}^{\otimes 2}]=E[\{(Y^T\otimes Y^T)\mathrm{vec}(I_p)\}^{\otimes 2}]=E[Y^{T\otimes 4}\mathrm{vec}^{\otimes 2}(I_p)]$$
$$=E[\{\mathrm{vec}^{\otimes 2}(I_p)\}^TY^{\otimes 4}]=\{\mathrm{vec}^{\otimes 2}(I_p)\}^T\Bigl\{\kappa_4(Y)+\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})\Bigr\}.$$
Under normality, this index becomes
$$b_{2,p}=\{\mathrm{vec}^{\otimes 2}(I_p)\}^T\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})=\{\mathrm{vec}^T(I_p)E(Y^{\otimes 2})\}^2+2\{\mathrm{vec}^{\otimes 2}(I_p)\}^TE(Y\otimes Y^*\otimes Y\otimes Y^*)$$
$$=p^2+2\sum_{i,j,k,l=1}^p\delta_{ij}\delta_{kl}E(Y_iY_j^*Y_kY_l^*)=p^2+2\sum_{i,k=1}^pE(Y_iY_i^*Y_kY_k^*)=p^2+2p,$$
where $Y^*$ is an independent copy of $Y$; this value is the normalizing constant to be subtracted from $b_{2,p}$ when the corresponding excess kurtosis is used.

Remark 10.2 Jammalamadaka et al. [8, p. 614] gave the expression
$$b_{2,p}=\mathrm{vec}^T(I_{p^2})\Bigl\{\kappa_4(Y)+\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})\Bigr\},$$
which differs in form from $b_{2,p}=\{\mathrm{vec}^{\otimes 2}(I_p)\}^T\{\cdot\}$ shown earlier. The vector $\mathrm{vec}^{\otimes 2}(I_p)$ is not equal to $\mathrm{vec}(I_{p^2})$ unless $p=1$. For instance, when $p=2$,
$$\mathrm{vec}^{\otimes 2}(I_p)=\{(1,0,0,1)^T\}^{\otimes 2}=(1,0,0,1,0,0,0,0,0,0,0,0,1,0,0,1)^T,$$
whereas
$$\mathrm{vec}(I_{p^2})=\mathrm{vec}\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix}=(1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1)^T.$$
The relationship between the two vectors is given by Theorem 10.2 when $A=B=I_p$ with $m=n=p=q$:
$$\mathrm{vec}(I_{p^2})=\mathrm{vec}(I_p\otimes I_p)=(I_p\otimes K_{pp}\otimes I_p)\{\mathrm{vec}(I_p)\otimes\mathrm{vec}(I_p)\}=(I_p\otimes K_{pp}\otimes I_p)\mathrm{vec}^{\otimes 2}(I_p),$$
whose $[i+(j-1)p+\{v+(u-1)p-1\}p^2]$-th element (the $[u,v,j,i]$-th element using the lexicographical order with $i$ changing fastest) is $\delta_{uj}\delta_{vi}$ $(u,j,v,i=1,\ldots,p)$. Using the above formula, we find that Jammalamadaka et al.'s expression gives the same result as ours:
$$E\{\mathrm{vec}^T(I_{p^2})Y^{\otimes 4}\}=E[\{(I_p\otimes K_{pp}\otimes I_p)\mathrm{vec}^{\otimes 2}(I_p)\}^TY^{\otimes 4}]=E[\{\mathrm{vec}^{\otimes 2}(I_p)\}^T(I_p\otimes K_{pp}\otimes I_p)Y^{\otimes 4}]$$
$$=E[\{\mathrm{vec}^{\otimes 2}(I_p)\}^TY^{\otimes 4}]=b_{2,p},$$
where $(I_p\otimes K_{pp}\otimes I_p)(a_1\otimes a_2\otimes a_3\otimes a_4)=a_1\otimes a_3\otimes a_2\otimes a_4$, with $a_i$ $(i=1,\ldots,4)$ being $p\times 1$ vectors, is used. This suggests a remaining third expression using $Y^{\otimes 4}$ to obtain $b_{2,p}$. Note that in $\mathrm{vec}(I_{p^2})=(I_p\otimes K_{pp}\otimes I_p)\mathrm{vec}^{\otimes 2}(I_p)=K^{-1}_{(1324)}\mathrm{vec}^{\otimes 2}(I_p)$, the element $\delta_{uj}\delta_{vi}$ in the $[u,j,v,i]$-th position of $\mathrm{vec}^{\otimes 2}(I_p)$ becomes the $[u,v,j,i]$-th element of $\mathrm{vec}(I_{p^2})$ due to the interchange of $j$ and $v$. The third expression is given by exchanging the lexicographical order of $j$ and $i$ in the $[u,j,v,i]$-th element of $\mathrm{vec}^{\otimes 2}(I_p)$, which is obtained by $K^{-1}_{(1432)}\mathrm{vec}^{\otimes 2}(I_p)$. These results are summarized as
$$b_{2,p}=E[\{\mathrm{vec}^{\otimes 2}(I_p)\}^TY^{\otimes 4}]=E\{\mathrm{vec}^T(I_{p^2})Y^{\otimes 4}\}=E[\{K^{-1}_{(1432)}\mathrm{vec}^{\otimes 2}(I_p)\}^TY^{\otimes 4}].$$
Let $Y_k^{(r)}=Y_k$ be the $k$-th element of the $r$-th vector $Y^{(r)}=Y$ $(r=1,\ldots,4;\ k=1,\ldots,p)$ in $Y^{\otimes 4}=Y^{(1)}\otimes Y^{(2)}\otimes Y^{(3)}\otimes Y^{(4)}$. Then, it is found that
$$E[\{\mathrm{vec}^{\otimes 2}(I_p)\}^TY^{\otimes 4}]=\sum_{i,j=1}^pE(Y_i^{(1)}Y_i^{(2)}Y_j^{(3)}Y_j^{(4)})$$
$$=E\{\mathrm{vec}^T(I_{p^2})Y^{\otimes 4}\}=\sum_{i,j=1}^pE(Y_i^{(1)}Y_j^{(2)}Y_i^{(3)}Y_j^{(4)})$$
$$=E[\{K^{-1}_{(1432)}\mathrm{vec}^{\otimes 2}(I_p)\}^TY^{\otimes 4}]=\sum_{i,j=1}^pE(Y_i^{(1)}Y_j^{(2)}Y_j^{(3)}Y_i^{(4)})$$
$$=\sum_{i,j=1}^pE(Y_i^2Y_j^2)=\sum_{i=1}^pE(Y_i^4)+2\sum_{1\le i<j\le p}E(Y_i^2Y_j^2)=b_{2,p}.$$
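Since the three contraction vectors act on the same rank-one tensor $Y^{\otimes 4}$, each realization already satisfies the identity; a minimal numerical sketch (mine, numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(5)
p = 3
y = rng.standard_normal(p)
y4 = np.kron(np.kron(y, y), np.kron(y, y))           # Y^{⊗4} for one realization

v = np.eye(p).flatten(order="F")                     # vec(I_p)
v22 = np.kron(v, v)                                  # vec^{⊗2}(I_p)
vI2 = np.eye(p * p).flatten(order="F")               # vec(I_{p^2})

target = (y @ y) ** 2                                # (Y^T Y)^2
assert np.isclose(v22 @ y4, target)                  # first expression
assert np.isclose(vI2 @ y4, target)                  # second (Jammalamadaka et al.)
```

Taking expectations over realizations then gives $b_{2,p}$ for all expressions.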
Remark 10.3 It is to be noted that among the $p^4$ central product moments $E(Y_iY_jY_kY_l)$ $(i,j,k,l=1,\ldots,p)$, including duplicated ones, $b_{2,p}$ considers only $E(Y_{i_1}^4)$ and $E(Y_{i_1}^2Y_{i_2}^2)$, i.e., the univariate and symmetric-bivariate ones, respectively, with the latter duplicated, neglecting $E(Y_{i_1}Y_{i_2}^3)$, $E(Y_{i_1}Y_{i_2}Y_{i_3}^2)$ and $E(Y_{i_1}Y_{i_2}Y_{i_3}Y_{i_4})$ ($i_1$, $i_2$, $i_3$ and $i_4$ distinct), i.e., the asymmetric-bivariate, tri-variate and 4-variate fourth central moments, respectively.

Koziol [11] gave the index $E(Y^{\otimes 4})^TE(Y^{\otimes 4})$ as a multivariate measure of non-excess kurtosis, which is an extension of Mardia's skewness measure $b_{1,p}=E(Y^{\otimes 3})^TE(Y^{\otimes 3})$ to kurtosis. The relation of this index to $\kappa_4(Y)$ was given by Terdik [22]; it is derived below using a method different from Terdik's proof.

Theorem 10.6 (Terdik [22, Eq. (6.8)]) Koziol's multivariate measure of non-excess kurtosis $E(Y^{\otimes 4})^TE(Y^{\otimes 4})$ is given by $\kappa_4(Y)$ and the corresponding Mardia measure $b_{2,p}$ as
$$E(Y^{\otimes 4})^TE(Y^{\otimes 4})=c_{2,p}+6b_{2,p}-3p(p+2),$$
where $c_{2,p}\equiv\kappa_4^T(Y)\kappa_4(Y)$.
Proof
$$E(Y^{\otimes 4})^TE(Y^{\otimes 4})=\Bigl\{\kappa_4(Y)+\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})\Bigr\}^T\Bigl\{\kappa_4(Y)+\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})\Bigr\}$$
$$=\kappa_4^T(Y)\kappa_4(Y)+2\kappa_4^T(Y)\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})+\sum{}^3\,E^{\otimes 2T}(Y^{\otimes 2})\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})$$
$$=c_{2,p}+2\Bigl\{E(Y^{\otimes 4})-\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})\Bigr\}^T\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})+\sum{}^3\,E^{\otimes 2T}(Y^{\otimes 2})\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})$$
$$=c_{2,p}+6b_{2,p}-\sum{}^3\,E^{\otimes 2T}(Y^{\otimes 2})\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2}),$$
since $E(Y^{\otimes 4})^T\sum^3E^{\otimes 2}(Y^{\otimes 2})=3b_{2,p}$. In the above result, we have
$$\sum{}^3\,E^{\otimes 2T}(Y^{\otimes 2})\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})$$
$$=E(Y\otimes Y\otimes Y^*\otimes Y^*+Y\otimes Y^*\otimes Y\otimes Y^*+Y\otimes Y^*\otimes Y^*\otimes Y)^TE(Y\otimes Y\otimes Y^*\otimes Y^*+Y\otimes Y^*\otimes Y\otimes Y^*+Y\otimes Y^*\otimes Y^*\otimes Y)$$
$$=\{(I_{p^4}+K^{-1}_{(1324)}+K^{-1}_{(1432)})\mathrm{vec}^{\otimes 2}(I_p)\}^T(I_{p^4}+K^{-1}_{(1324)}+K^{-1}_{(1432)})\mathrm{vec}^{\otimes 2}(I_p)$$
$$=\{\mathrm{vec}^{\otimes 2}(I_p)\}^T\{3I_{p^4}+K^{-T}_{(1324)}+K^{-1}_{(1324)}+K^{-T}_{(1432)}+K^{-1}_{(1432)}+K^{-T}_{(1324)}K^{-1}_{(1432)}+K^{-T}_{(1432)}K^{-1}_{(1324)}\}\mathrm{vec}^{\otimes 2}(I_p),$$
where $K^{-T}_{(\cdot)}=(K^{-1}_{(\cdot)})^T\,(=K_{(\cdot)})$. Further,
$$\{\mathrm{vec}^{\otimes 2}(I_p)\}^T\mathrm{vec}^{\otimes 2}(I_p)=\sum_{i,j,k,l=1}^p\delta_{ij}^2\delta_{kl}^2=\sum_{i,j,k,l=1}^p\delta_{ij}\delta_{kl}=p^2$$
and, e.g.,
$$\{\mathrm{vec}^{\otimes 2}(I_p)\}^TK^{-T}_{(1324)}\mathrm{vec}^{\otimes 2}(I_p)=\{K^{-1}_{(1324)}\mathrm{vec}^{\otimes 2}(I_p)\}^T\mathrm{vec}^{\otimes 2}(I_p)=\sum_{i,j,k,l=1}^p\delta_{ik}\delta_{jl}\delta_{ij}\delta_{kl}=\sum_{i=1}^p\delta_{ii}=p.$$
This holds similarly for the other three expanded terms using $K^{-1}_{(1324)}$, $K^{-T}_{(1432)}$ and $K^{-1}_{(1432)}$. For the remaining two terms, we obtain
$$\{\mathrm{vec}^{\otimes 2}(I_p)\}^TK^{-T}_{(1324)}K^{-1}_{(1432)}\mathrm{vec}^{\otimes 2}(I_p)=\{K^{-1}_{(1324)}\mathrm{vec}^{\otimes 2}(I_p)\}^TK^{-1}_{(1432)}\mathrm{vec}^{\otimes 2}(I_p)=\sum_{i,j,k,l=1}^p\delta_{ik}\delta_{jl}\delta_{il}\delta_{kj}=\sum_{i=1}^p\delta_{ii}=p,$$
with the other one derived similarly. These results give
$$\sum{}^3\,E^{\otimes 2T}(Y^{\otimes 2})\sum{}^3\,E^{\otimes 2}(Y^{\otimes 2})=3p^2+6p=3p(p+2),$$
yielding the required expression. Q.E.D.

In Theorem 10.6, $c_{2,p}=\kappa_4^T(Y)\kappa_4(Y)$ is seen as a cumulant version of $b_{1,p}=E(Y^{\otimes 3})^TE(Y^{\otimes 3})$; it is called the "total kurtosis" by Terdik [22, Definition 6.2]. Jammalamadaka et al. [8, p. 615] used the notation $\bar\kappa_4$ for $c_{2,p}$ and gave the result corresponding to Theorem 10.6, though the last term of $E(Y^{\otimes 4})^TE(Y^{\otimes 4})=c_{2,p}+6b_{2,p}-3p(p+2)$ is replaced there by $p^2$, which may be a typo.

Remark 10.4 Since under normality $b_{2,p}=p(p+2)$, as addressed earlier, with $c_{2,p}=0$, Koziol's index becomes $3p(p+2)$. Consequently, the excess-kurtosis version of Koziol's index is defined as $E(Y^{\otimes 4})^TE(Y^{\otimes 4})-3p(p+2)$. Note that Koziol's index considers all the multivariate fourth central product moments, and consequently all the multivariate fourth cumulants, including some duplicated elements.

Cardoso [4] and Mòri et al. [17] gave the excess kurtosis matrix defined by
$$B(Y)=E(YY^TYY^T)-(p+2)I_p=E(Y^TY\,YY^T)-(p+2)I_p,$$
whose relation to $\kappa_4(Y)$ was obtained by Terdik [22], which is proved here in an expository way.

Theorem 10.7 (Terdik [22, p. 325]) The excess kurtosis matrix $B(Y)$ given by Cardoso [4] and Mòri et al. [17] is related to $\kappa_4(Y)$ as
$$\mathrm{vec}\{B(Y)\}=\{I_{p^2}\otimes\mathrm{vec}^T(I_p)\}\kappa_4(Y).$$
Proof Since $Y^TY=\mathrm{vec}^T(I_p)(Y\otimes Y)$, it follows that
$$YY^TY=Y\,\mathrm{vec}^T(I_p)(Y\otimes Y)=\{I_p\otimes\mathrm{vec}^T(I_p)\}\{Y\otimes(Y\otimes Y)\}=\{I_p\otimes\mathrm{vec}^T(I_p)\}Y^{\otimes 3}$$
(Terdik [22, p. 406]), which gives
$$\mathrm{vec}(YY^{\mathrm{T}}YY^{\mathrm{T}}) = Y\otimes[\{I_p\otimes\mathrm{vec}^{\mathrm{T}}(I_p)\}Y^{\otimes 3}] = (I_pY)\otimes[\{I_p\otimes\mathrm{vec}^{\mathrm{T}}(I_p)\}Y^{\otimes 3}] = \{I_p\otimes I_p\otimes\mathrm{vec}^{\mathrm{T}}(I_p)\}Y^{\otimes 4} = \{I_{p^2}\otimes\mathrm{vec}^{\mathrm{T}}(I_p)\}Y^{\otimes 4}.$$

Using the above formula and the definition of $B(Y)$, we have

$$\mathrm{vec}\{B(Y)\} = \mathrm{vec}\{E(Y^{\mathrm{T}}Y\,YY^{\mathrm{T}}) - (p+2)I_p\} = \{I_{p^2}\otimes\mathrm{vec}^{\mathrm{T}}(I_p)\}E(Y^{\otimes 4}) - (p+2)\mathrm{vec}(I_p) = \{I_{p^2}\otimes\mathrm{vec}^{\mathrm{T}}(I_p)\}\Bigl\{\kappa_4(Y) + \sum_3 E^{\otimes 2}(Y^{\otimes 2})\Bigr\} - (p+2)\mathrm{vec}(I_p).$$

In the above result,

$$\{I_{p^2}\otimes\mathrm{vec}^{\mathrm{T}}(I_p)\}\sum_3 E^{\otimes 2}(Y^{\otimes 2}) = \sum_{i,j,k,l=1}^{p}(e_{(i)}\otimes e_{(j)})\delta_{kl}(\delta_{ij}\delta_{kl} + \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}) = p\sum_{i,j=1}^{p}(e_{(i)}\otimes e_{(j)})\delta_{ij} + 2\sum_{i=1}^{p}(e_{(i)}\otimes e_{(i)}) = (p+2)\mathrm{vec}(I_p).$$

Consequently, we obtain $\mathrm{vec}\{B(Y)\} = \{I_{p^2}\otimes\mathrm{vec}^{\mathrm{T}}(I_p)\}\kappa_4(Y)$, which is the required result. Q.E.D.

In Theorem 10.7, $Y^{\mathrm{T}}Y = \mathrm{vec}^{\mathrm{T}}(I_p)(Y\otimes Y)$ is used. A derivation of this equation may seem unnecessary, since most readers can obtain the result easily; however, the equation involves basic properties of the vec operator, the Kronecker product and the identity matrix. Terdik [22, p. 406] gave the following explanation:

$$Y^{\mathrm{T}}Y = \sum_{i=1}^{p}Y_i^2 = \sum_{i=1}^{p}(e_{(i)}^{\mathrm{T}}Y)\otimes(e_{(i)}^{\mathrm{T}}Y) = \Bigl(\sum_{i=1}^{p}e_{(i)}^{\otimes 2\,\mathrm{T}}\Bigr)Y^{\otimes 2} = \mathrm{vec}^{\mathrm{T}}(I_p)Y^{\otimes 2}.$$

One of the basic properties of the identity matrix used in this explanation is $I_p = \sum_{i=1}^{p}e_{(i)}e_{(i)}^{\mathrm{T}}$. The formula $\mathrm{vec}(ab^{\mathrm{T}}) = b\otimes a$, due to the definitions of the vec operator and the Kronecker product, is also used. Further, $(AB)\otimes(CD) = (A\otimes C)(B\otimes D)$, when the matrix products exist, is employed. Another explanation is
$$Y^{\mathrm{T}}Y = Y^{\mathrm{T}}I_pY = (Y^{\mathrm{T}}I_pY)^{\mathrm{T}} = \{(Y^{\mathrm{T}}\otimes Y^{\mathrm{T}})\mathrm{vec}(I_p)\}^{\mathrm{T}} = \mathrm{vec}^{\mathrm{T}}(I_p)(Y\otimes Y),$$

where $\mathrm{vec}(ABC) = (C^{\mathrm{T}}\otimes A)\mathrm{vec}(B)$ and $(A\otimes B)^{\mathrm{T}} = A^{\mathrm{T}}\otimes B^{\mathrm{T}}$ are used. An extension of the above formula is

$$\sum_{i=1}^{p}Y_i^q = \sum_{i=1}^{p}(e_{(i)}^{\mathrm{T}}Y)^q = \sum_{i=1}^{p}(e_{(i)}^{\mathrm{T}}Y)^{\otimes q} = \sum_{i=1}^{p}e_{(i)}^{\otimes q\,\mathrm{T}}Y^{\otimes q}\quad(q = 1, 2, \ldots),$$

where $e_{(i)}^{\otimes q} = e_{(i)}\otimes\cdots\otimes e_{(i)}$ ($q$ times) is the $p^q\times 1$ vector whose $\{i + (i-1)p + \cdots + (i-1)p^{q-1}\}$-th element is 1, with the remaining elements being zero. When using the lexicographical order, this unit element is located at the $[i,\ldots,i]$-th position ($i$ repeated $q$ times).

Note that the excess kurtosis matrix given by Cardoso and Mòri et al. does not include the four-variate fourth moments/cumulants, i.e., those for $Y_i$, $Y_j$, $Y_k$ and $Y_l$ with $i$, $j$, $k$ and $l$ all different. It is also found that

$$\mathrm{tr}\{B(Y)\} = \mathrm{tr}\{E(YY^{\mathrm{T}}YY^{\mathrm{T}})\} - p(p+2) = \beta_{2,p} - p(p+2),$$

which is the excess kurtosis version of $\beta_{2,p}$ addressed earlier.

Kollo's [10] non-excess kurtosis matrix is defined by

$$B(Y) = \sum_{i=1}^{p}\sum_{j=1}^{p}E(Y_iY_jYY^{\mathrm{T}}),$$
whose relation to $\kappa_4(Y)$ was obtained by Jammalamadaka et al. [8, p. 616]:

$$\mathrm{vec}\{B(Y)\} = E\{(1_{p^2}^{\mathrm{T}}Y^{\otimes 2})Y^{\otimes 2}\} = E\{Y^{\otimes 2}(1_{p^2}^{\mathrm{T}}Y^{\otimes 2})\} = E\{(I_{p^2}\otimes 1_{p^2}^{\mathrm{T}})Y^{\otimes 4}\},$$

which is also written as

$$\mathrm{vec}\{B(Y)\} = E\{(1_{p^2}^{\mathrm{T}}Y^{\otimes 2})\otimes Y^{\otimes 2}\} = E\{(1_{p^2}^{\mathrm{T}}\otimes I_{p^2})Y^{\otimes 4}\}.$$

Using the former one, they gave

$$\mathrm{vec}\{B(Y)\} = (I_{p^2}\otimes 1_{p^2}^{\mathrm{T}})E(Y^{\otimes 4}) = (I_{p^2}\otimes 1_{p^2}^{\mathrm{T}})\Bigl\{\kappa_4(Y) + \sum_3 E^{\otimes 2}(Y^{\otimes 2})\Bigr\} = (I_{p^2}\otimes 1_{p^2}^{\mathrm{T}})\Bigl\{\kappa_4(Y) + \sum_3\mathrm{vec}^{\otimes 2}(I_p)\Bigr\}.$$
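The Kronecker-product and vec manipulations used above and in the proof of Theorem 10.7 can be checked numerically. A small NumPy sketch (the dimension p = 3, the seed, and the random vectors are arbitrary; column-major vec is assumed, matching the usual convention):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 3
y = rng.standard_normal(p)
a, b = rng.standard_normal(p), rng.standard_normal(p)

# vec(a b^T) = b (x) a, and Y^T Y = vec^T(I_p)(Y (x) Y)
assert np.allclose(np.outer(a, b).flatten(order='F'), np.kron(b, a))
assert np.isclose(y @ y, np.eye(p).flatten(order='F') @ np.kron(y, y))

# the two equivalent vectorized forms of Kollo's matrix, for a single draw:
# (1^T_{p^2} y^{(x)2}) y^{(x)2} = {I_{p^2} (x) 1^T_{p^2}} y^{(x)4}
#                              = {1^T_{p^2} (x) I_{p^2}} y^{(x)4}
y2 = np.kron(y, y)
y4 = np.kron(y2, y2)
ones = np.ones(p * p)
lhs = (ones @ y2) * y2
assert np.allclose(lhs, np.kron(np.eye(p * p), ones) @ y4)
assert np.allclose(lhs, np.kron(ones, np.eye(p * p)) @ y4)
```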
Based on their above result, the excess kurtosis counterpart of $\mathrm{vec}\{B(Y)\}$ is given by

$$\mathrm{vec}\{B(Y)\} - (I_{p^2}\otimes 1_{p^2}^{\mathrm{T}})\sum_3\mathrm{vec}^{\otimes 2}(I_p),$$

where the subtracted normalizing constant becomes

$$(I_{p^2}\otimes 1_{p^2}^{\mathrm{T}})\sum_3\mathrm{vec}^{\otimes 2}(I_p) = (I_p\otimes I_p\otimes 1_p^{\mathrm{T}}\otimes 1_p^{\mathrm{T}})\sum_3\mathrm{vec}^{\otimes 2}(I_p) = \sum_{i,j,k,l=1}^{p}(e_{(i)}\otimes e_{(j)})(\delta_{ij}\delta_{kl} + \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}) = \sum_{i,j=1}^{p}(e_{(i)}\otimes e_{(j)})(\delta_{ij}p + 2) = p\sum_{i=1}^{p}e_{(i)}\otimes e_{(i)} + 2\sum_{i,j=1}^{p}e_{(i)}\otimes e_{(j)} = p\,\mathrm{vec}(I_p) + 2\cdot 1_p^{\otimes 2}.$$

Using the inverse vec operation, we have the excess counterpart of the kurtosis matrix $B(Y)$ as $B(Y) - pI_p - 2\cdot 1_p1_p^{\mathrm{T}}$.
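As a numerical illustration (not from the book): under normality $\kappa_4(Y) = 0$, so Kollo's matrix should equal the subtracted constant $pI_p + 2\cdot 1_p1_p^{\mathrm{T}}$ and its excess version should be approximately zero. A Monte Carlo sketch (sample size, seed, and tolerance are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
p, n = 3, 200_000
Y = rng.standard_normal((n, p))

# empirical Kollo matrix: sum_{i,j} E(Y_i Y_j Y Y^T) = E{(1^T Y)^2 Y Y^T}
s = Y.sum(axis=1)
B = (Y * (s ** 2)[:, None]).T @ Y / n
excess = B - p * np.eye(p) - 2 * np.ones((p, p))
assert np.max(np.abs(excess)) < 0.3   # ~0 up to Monte Carlo error
```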
10.4 Elimination Matrices and Non-duplicated Multivariate Skewness and Kurtosis
Meijer [16, Sect. 2.1] and Jammalamadaka et al. [8, Sect. 5] defined the $\{p(p+1)(p+2)/3!\}\times 1$ vector of the third cumulants of distinct elements. Since the order of the non-duplicated elements in the vector is arbitrary, we employ the lexicographical order for the subscripts of the elements, so that only the distinct elements appearing earliest in this order are kept for identification. That is,

$$\kappa_{3\mathrm{D}}(Y) = \{\kappa_{ijk}\} = (\kappa_{111}, \kappa_{112}, \ldots, \kappa_{p-1,p,p}, \kappa_{ppp})^{\mathrm{T}},$$

where $\kappa_{ijk} = E(Y_iY_jY_k)$ $(1\le i\le j\le k\le p)$, using Jammalamadaka et al.'s notation. When $p = 2$, we have

$$\kappa_{3\mathrm{D}}(Y) = (\kappa_{111}, \kappa_{112}, \kappa_{122}, \kappa_{222})^{\mathrm{T}}.$$
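The length of $\kappa_{3\mathrm{D}}(Y)$ and its lexicographical ordering can be confirmed by enumerating the index triples; a short sketch using the standard library (p = 5 is arbitrary):

```python
from itertools import combinations_with_replacement

p = 5
# lexicographically ordered triples 1 <= i <= j <= k <= p
idx = list(combinations_with_replacement(range(1, p + 1), 3))
assert len(idx) == p * (p + 1) * (p + 2) // 6   # p(p+1)(p+2)/3!
assert idx[0] == (1, 1, 1) and idx[-1] == (p, p, p)
assert idx[1] == (1, 1, 2)                      # kappa_112 follows kappa_111
```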
The ratio of the number of the distinct elements to that of the duplicated ones is $p(p+1)(p+2)/(p^3\cdot 3!) = (1+p^{-1})(1+2p^{-1})/6$, which is 1/2 when $p = 2$ and approaches 1/6 as $p$ increases. Duplicated elements already occur for the second cumulant, i.e., the covariance matrix of a $p\times 1$ random vector $X$, whose vectorized version is denoted by

$$\kappa_2(X) = \mathrm{vec}\{\mathrm{cov}(X)\} = \mathrm{vec}(\Sigma).$$

The non-duplicated counterpart of $\kappa_2(X)$ is denoted by $\kappa_{2\mathrm{D}}(X)$, whose relation to $\kappa_2(X)$ is obtained using the $p^2\times\{p(p+1)/2\}$ duplication matrix $D_{2p}$ introduced by Browne [3]:

$$\kappa_2(X) = D_{2p}\kappa_{2\mathrm{D}}(X).$$

$D_{2p}$, which is also written as $D$ and $D_p$ (see, e.g., Magnus and Neudecker [13; 14, Chap. 3, Sect. 7]; Meijer [16]), has a single unit element in every row, with the remaining elements being zero. When $p = 2$,

$$\kappa_2(X) = \mathrm{vec}\begin{bmatrix}\kappa_{11} & \kappa_{12}\\ \kappa_{21} & \kappa_{22}\end{bmatrix} = \begin{bmatrix}\kappa_{11}\\ \kappa_{21}\\ \kappa_{12}\\ \kappa_{22}\end{bmatrix} = D_{2p}\kappa_{2\mathrm{D}}(X) = \begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}\kappa_{11}\\ \kappa_{12}\\ \kappa_{22}\end{bmatrix},$$

where $\kappa_{ij} = \kappa_{ij}(X) = \sigma_{ij}$ $(i, j = 1, 2)$ and $\kappa_{12} = \kappa_{21}$. For arbitrary $p$, noting that $D_{2p}$ is of full column rank, $\kappa_2(X) = D_{2p}\kappa_{2\mathrm{D}}(X)$ gives

$$(D_{2p}^{\mathrm{T}}D_{2p})^{-1}D_{2p}^{\mathrm{T}}\kappa_2(X) = D_{2p}^{+}\kappa_2(X) = \kappa_{2\mathrm{D}}(X),$$

where $D_{2p}^{+}$ is the left inverse or the Moore–Penrose generalized (g-)inverse (MP inverse) of $D_{2p}$. The MP inverse is a special case of the g-inverse, denoted by $A^{-}$ for $A$ and having the property $AA^{-}A = A$; the MP inverse $A^{+}$ satisfies the added restrictions $A^{+}AA^{+} = A^{+}$, $(AA^{+})^{\mathrm{T}} = AA^{+}$ and $(A^{+}A)^{\mathrm{T}} = A^{+}A$, as well as $AA^{+}A = A$ (see, e.g., Magnus and Neudecker [14, Chap. 2]; Yanai et al. [23, Sect. 3.3.4]). $D_{2p}^{+}$ is a special case of the MP inverse for a matrix of full column rank, which is obtained algebraically from $D_{2p}$ as shown above. When $p = 2$,

$$D_{2p}^{+} = (D_{2p}^{\mathrm{T}}D_{2p})^{-1}D_{2p}^{\mathrm{T}} = \begin{bmatrix}1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 1\end{bmatrix}^{-1}\begin{bmatrix}1 & 0 & 0 & 0\\ 0 & 1 & 1 & 0\\ 0 & 0 & 0 & 1\end{bmatrix} = \begin{bmatrix}1 & 0 & 0 & 0\\ 0 & 1/2 & 1/2 & 0\\ 0 & 0 & 0 & 1\end{bmatrix}.$$
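These relations can be sketched in NumPy for general p. The helper `duplication_matrix` below is an illustrative construction (not from the book), assuming column-major vec and the vech order running down the lower triangle, which coincides numerically with the lexicographic order for a symmetric matrix:

```python
import numpy as np

def duplication_matrix(p):
    """p^2 x p(p+1)/2 matrix D with vec(S) = D vech(S) for symmetric S."""
    D = np.zeros((p * p, p * (p + 1) // 2))
    col = 0
    for j in range(p):
        for i in range(j, p):            # vech: i >= j, down the columns
            D[j * p + i, col] = 1.0      # (i, j) position of column-major vec
            if i != j:
                D[i * p + j, col] = 1.0  # mirrored (j, i) position
            col += 1
    return D

D = duplication_matrix(2)
Dplus = np.linalg.inv(D.T @ D) @ D.T     # MP inverse (full column rank)
assert np.allclose(Dplus, [[1, 0, 0, 0], [0, .5, .5, 0], [0, 0, 0, 1]])

# D D^+ is the symmetrizer S_(2,2) = (I_4 + K)/2: symmetric and idempotent
K = np.array([[1., 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]])
assert np.allclose(D @ Dplus, (np.eye(4) + K) / 2)
assert np.allclose((D @ Dplus) @ (D @ Dplus), D @ Dplus)
```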
It is found that $D_{2p}^{+}$ is the transpose of $D_{2p}$ in which the unit values corresponding to the duplicated elements in $\kappa_2(X) = D_{2p}\kappa_{2\mathrm{D}}(X)$ are each replaced by the reciprocal of the number of duplicated elements, as expected from the formula $D_{2p}^{+} = (D_{2p}^{\mathrm{T}}D_{2p})^{-1}D_{2p}^{\mathrm{T}}$. In other words, $D_{2p}^{+}$ transforms $\mathrm{vec}(\Sigma)$ such that $\sigma_{ij}$ and $\sigma_{ji}$ are replaced by their single averaged value $\sigma_{ij} = \sigma_{ji} = (\sigma_{ij}+\sigma_{ji})/2$ $(1\le i\le j\le p)$. This property also holds when $\Sigma$ is replaced by an asymmetric matrix. Consider the matrix

$$D_{2p}D_{2p}^{+} = D_{2p}(D_{2p}^{\mathrm{T}}D_{2p})^{-1}D_{2p}^{\mathrm{T}} = \begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 1\end{bmatrix}^{-1}\begin{bmatrix}1 & 0 & 0 & 0\\ 0 & 1 & 1 & 0\\ 0 & 0 & 0 & 1\end{bmatrix} = \begin{bmatrix}1 & 0 & 0 & 0\\ 0 & 1/2 & 1/2 & 0\\ 0 & 1/2 & 1/2 & 0\\ 0 & 0 & 0 & 1\end{bmatrix} = \Biggl(I_4 + \begin{bmatrix}1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1\end{bmatrix}\Biggr)\bigg/2 = (I_4 + K_{(1324)}^{-1})/2 = S_{(d_1,d_2)} = S_{(2,2)},$$

which is an (unscaled) symmetrizer defined in the first paragraph of Sect. 10.2, where $d_1$ and $d_2$ are the numbers of the columns and rows of $A$ in $S_{(d_1,d_2)}\mathrm{vec}(A)$, or the dimension sizes of $a_1$ and $a_2$ in $S_{(d_1,d_2)}(a_1\otimes a_2)$, respectively. Note that $D_{2p}D_{2p}^{+} = S_{(2,2)}$ is a special case of the projection matrix, which is symmetric and idempotent, projecting a vector onto the space spanned by the column vectors of $D_{2p}$.

In the equation $\kappa_2(X) = D_{2p}\kappa_{2\mathrm{D}}(X)$, it is found that $D_{2p}$ is the unique solution. On the other hand, in $D_{2p}^{+}\kappa_2(X) = \kappa_{2\mathrm{D}}(X)$, $D_{2p}^{+}$ can be replaced by infinitely many other matrices. For instance, when $p = 2$, we find that

$$\begin{bmatrix}1 & 0 & 0 & 0\\ 0 & w & 1-w & 0\\ 0 & 0 & 0 & 1\end{bmatrix}\kappa_2(X) = \kappa_{2\mathrm{D}}(X)\quad(-\infty < w < \infty),$$

since $w\sigma_{21} + (1-w)\sigma_{12} = \sigma_{12} = \sigma_{21}$. Among the many solutions other than $w = 1/2$, those with $w = 1$ and $w = 0$ are of interest in that they give the vectors $(\sigma_{11}, \sigma_{21}, \sigma_{22})^{\mathrm{T}}$ and $(\sigma_{11}, \sigma_{12}, \sigma_{22})^{\mathrm{T}}$, eliminating the supra- and infra-diagonal elements, respectively, though in this case, with symmetric $\Sigma$, they are equal. When $w = 1$,
$$\begin{bmatrix}1 & 0 & 0 & 0\\ 0 & w & 1-w & 0\\ 0 & 0 & 0 & 1\end{bmatrix} = \begin{bmatrix}1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1\end{bmatrix}$$

is a special case of the $\{p(p+1)/2\}\times p^2$ elimination matrix $L_{2p}$, which is also denoted by $L_p$ in this book and by $L$ in Magnus and Neudecker [13]. Though $L_{2p}$ seems, in a narrow sense, to be restricted to the case eliminating the supra-diagonal elements, other cases eliminating some of the elements in various ways can be considered using the same notation $L_{2p}$ or $L_p$ when no confusion arises. For instance, when $w = 0$ in the above case, $L_p$ becomes the matrix eliminating the infra-diagonal elements of a matrix, as mentioned earlier. When eliminating the off-diagonal elements, $L_p$ becomes a $p\times p^2$ matrix, e.g.,

$$\begin{bmatrix}1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1\end{bmatrix}$$

in the case of $p = 2$. Finally, for standardized $Y$ when $p = 2$, we note that $\kappa_2(Y) = D_{2p}\kappa_{2\mathrm{D}}(Y) = (1, 0, 0, 1)^{\mathrm{T}}$ and $D_{2p}^{+}\kappa_2(Y) = \kappa_{2\mathrm{D}}(Y) = (1, 0, 1)^{\mathrm{T}}$.
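The $w$-family of left inverses and the off-diagonal elimination matrix can be sketched numerically (the covariance values are hypothetical):

```python
import numpy as np

Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
k2 = Sigma.flatten(order='F')            # kappa_2(X) = vec(Sigma), p = 2

# every member of the w-family recovers the same non-duplicated vector
for w in (0.0, 0.5, 1.0):
    Lw = np.array([[1., 0, 0, 0],
                   [0, w, 1 - w, 0],
                   [0, 0, 0, 1]])
    assert np.allclose(Lw @ k2, [1.0, 0.3, 1.0])

# eliminating the off-diagonal elements leaves only (sigma_11, sigma_22)
L_diag = np.array([[1., 0, 0, 0], [0, 0, 0, 1]])
assert np.allclose(L_diag @ k2, [1.0, 1.0])
```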
Consider the $p^3\times 1$ vector $\kappa_3(Y)$ of the multivariate third cumulants, including duplicated elements. Recall the corresponding vector of distinct elements, $\kappa_{3\mathrm{D}}(Y)$, shown in the first paragraph of this section. Then we have

$$\kappa_3(Y) = D_{3p}\kappa_{3\mathrm{D}}(Y)$$

(Meijer [16, Sect. 2.1]), where $D_{3p}$ is the $p^3\times\{p(p+1)(p+2)/3!\}$ triplication (or 3-way duplication) matrix of full column rank (Meijer used the notation $T_p$ for $D_{3p}$), which has a single unit element in every row, with the remaining elements being zero, as in the case of $D_{2p}$ ($= D_p$). Since $D_{3p}$ is of full column rank, we obtain

$$(D_{3p}^{\mathrm{T}}D_{3p})^{-1}D_{3p}^{\mathrm{T}}\kappa_3(Y) = D_{3p}^{+}\kappa_3(Y) = \kappa_{3\mathrm{D}}(Y),$$

where $D_{3p}^{+}$ is the $\{p(p+1)(p+2)/3!\}\times p^3$ matrix replacing the duplicated elements by their single means (Meijer used the notation $T_p^{+}$ for $D_{3p}^{+}$). As in the case of $D_{2p}$, $D_{3p}$ is the unique solution for $\kappa_3(Y) = D_{3p}\kappa_{3\mathrm{D}}(Y)$, while $D_{3p}^{+}$ in $D_{3p}^{+}\kappa_3(Y) = \kappa_{3\mathrm{D}}(Y)$ can be replaced by many other matrices.

Let $\kappa_{3\mathrm{D}}^{(l)}(Y)$ $(l = 1, 2, 3)$ be the $p\times 1$, $\{p(p-1)\}\times 1$ and $\{p(p-1)(p-2)/6\}\times 1$ vectors consisting of the uni-, bi- and tri-variate third cumulants, respectively, where the elements are located according to the corresponding lexicographic order in the selected elements of $\kappa_3(Y)$:
$$\kappa_{3\mathrm{D}}^{(1)}(Y) = \sum_{i=1}^{p}e_{(i)}e_{(i)}^{\otimes 3\,\mathrm{T}}\kappa_3(Y) \equiv L_{3\mathrm{D}}^{(1)}\kappa_3(Y) = (\kappa_{111}, \kappa_{222}, \ldots, \kappa_{ppp})^{\mathrm{T}},$$

$$\kappa_{3\mathrm{D}}^{(2)}(Y) = \sum_{1\le i<j\le p}\bigl\{e_{\{(i-1)p^2+(i-1)p+j\}}(e_{(i)}^{\otimes 2}\otimes e_{(j)})^{\mathrm{T}} + e_{\{(i-1)p^2+(j-1)p+j\}}(e_{(i)}\otimes e_{(j)}^{\otimes 2})^{\mathrm{T}}\bigr\}\kappa_3(Y) \equiv L_{3\mathrm{D}}^{(2)}\kappa_3(Y) = (\kappa_{112}, \ldots, \kappa_{p-1,p,p})^{\mathrm{T}},$$

$$\kappa_{3\mathrm{D}}^{(3)}(Y) = \sum_{1\le i<j<k\le p}e_{\{(i-1)p^2+(j-1)p+k\}}(e_{(i)}\otimes e_{(j)}\otimes e_{(k)})^{\mathrm{T}}\kappa_3(Y) \equiv L_{3\mathrm{D}}^{(3)}\kappa_3(Y) = (\kappa_{123}, \kappa_{124}, \ldots, \kappa_{1,p-1,p}, \kappa_{234}, \kappa_{235}, \ldots, \kappa_{p-2,p-1,p})^{\mathrm{T}}.$$
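The triplication matrix and the univariate selector can be sketched numerically for p = 2 (0-based indices; the distinct-cumulant values are hypothetical):

```python
import numpy as np
from itertools import product

p = 2
distinct = [(i, j, k) for i in range(p) for j in range(i, p) for k in range(j, p)]
D3 = np.zeros((p ** 3, len(distinct)))    # triplication matrix D_3p (Meijer's T_p)
for i, j, k in product(range(p), repeat=3):
    D3[i * p * p + j * p + k, distinct.index(tuple(sorted((i, j, k))))] = 1.0

k3D = np.array([0.5, -0.2, 0.1, 0.3])     # hypothetical (k111, k112, k122, k222)
k3 = D3 @ k3D                             # duplicated p^3 x 1 vector kappa_3
D3plus = np.linalg.inv(D3.T @ D3) @ D3.T  # averages the duplicated elements
assert np.allclose(D3plus @ k3, k3D)

# univariate selector L_3D^(1) built from rows e_i^{(x)3 T}
L1 = np.array([np.kron(np.kron(e, e), e) for e in np.eye(p)])
assert np.allclose(L1 @ k3, [0.5, 0.3])   # picks (k111, k222)
```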
In the above expressions, when $p = 2$, $\kappa_{3\mathrm{D}}^{(3)}(Y)$ does not exist. Define

$$\kappa_{3\mathrm{D}}^{(123)}(Y) = (L_{3\mathrm{D}}^{(1)\mathrm{T}}, L_{3\mathrm{D}}^{(2)\mathrm{T}}, L_{3\mathrm{D}}^{(3)\mathrm{T}})^{\mathrm{T}}\kappa_3(Y) \equiv L_{3\mathrm{D}}^{(123)}\kappa_3(Y) = \{\kappa_{3\mathrm{D}}^{(1)\mathrm{T}}(Y), \kappa_{3\mathrm{D}}^{(2)\mathrm{T}}(Y), \kappa_{3\mathrm{D}}^{(3)\mathrm{T}}(Y)\}^{\mathrm{T}}.$$

Then, the elements of $\kappa_{3\mathrm{D}}^{(123)}(Y)$ are permuted ones of $\kappa_{3\mathrm{D}}(Y)$. Under bi- and tri-variate independence, $\kappa_{3\mathrm{D}}^{(123)}(Y)$ becomes shorter than that without any independence.

Meijer [16, Sect. 3.1] and Jammalamadaka et al. [8, Sect. 5] defined the $\{p(p+1)(p+2)(p+3)/4!\}\times 1$ vector $\kappa_{4\mathrm{D}}(Y)$ for the fourth cumulants of distinct elements, using Jammalamadaka et al.'s notation. As in $\kappa_{3\mathrm{D}}(Y)$, the lexicographical order for the subscripts of the elements is employed for identification. That is,

$$\kappa_{4\mathrm{D}}(Y) = \{\kappa_{ijkl}\} = (\kappa_{1111}, \kappa_{1112}, \ldots, \kappa_{p-1,p,p,p}, \kappa_{pppp})^{\mathrm{T}},$$

where $\kappa_{ijkl} = E(Y_iY_jY_kY_l) - \delta_{ij}\delta_{kl} - \delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}$ $(1\le i\le j\le k\le l\le p)$. When $p = 2$, we have

$$\kappa_{4\mathrm{D}}(Y) = (\kappa_{1111}, \kappa_{1112}, \kappa_{1122}, \kappa_{1222}, \kappa_{2222})^{\mathrm{T}}.$$

The ratio of the number of the distinct elements to that of the duplicated ones is $p(p+1)(p+2)(p+3)/(p^4\cdot 4!) = (1+p^{-1})(1+2p^{-1})(1+3p^{-1})/24$, which is 5/16 when $p = 2$ and approaches 1/24 as $p$ increases. Meijer [16, Sect. 3.1] showed that

$$\kappa_4(Y) = D_{4p}\kappa_{4\mathrm{D}}(Y),$$

where $D_{4p}$ is the $p^4\times\{p(p+1)(p+2)(p+3)/4!\}$ quadruplication (or 4-way duplication) matrix of full column rank (Meijer used the notation $Q_p$ for $D_{4p}$), which has a single unit element in every row with the remaining elements being zero, as in the case of $D_{3p}$. As for $D_{2p}$ and $D_{3p}$, we have
$$(D_{4p}^{\mathrm{T}}D_{4p})^{-1}D_{4p}^{\mathrm{T}}\kappa_4(Y) = D_{4p}^{+}\kappa_4(Y) = \kappa_{4\mathrm{D}}(Y),$$

where $D_{4p}^{+}$ is the $\{p(p+1)(p+2)(p+3)/4!\}\times p^4$ matrix replacing the duplicated elements by their single means (Meijer used the notation $Q_p^{+}$ for $D_{4p}^{+}$). As before, $D_{4p}$ is the unique solution for $\kappa_4(Y) = D_{4p}\kappa_{4\mathrm{D}}(Y)$, while $D_{4p}^{+}$ in $D_{4p}^{+}\kappa_4(Y) = \kappa_{4\mathrm{D}}(Y)$ can be replaced by many other matrices.

Let $\kappa_{4\mathrm{D}}^{(m)}(Y)$ $(m = 1, 2a, 2b, 3, 4)$ be the $p\times 1$, $\{p(p-1)\}\times 1$, $\{p(p-1)/2\}\times 1$, $\{p(p-1)(p-2)/2\}\times 1$ and $\{p(p-1)(p-2)(p-3)/24\}\times 1$ vectors consisting of the uni-, asymmetric bi- (e.g., $\kappa_{1112}$), symmetric bi- (e.g., $\kappa_{1122}$), tri- (e.g., $\kappa_{1233}$) and four-variate fourth cumulants, respectively, where the elements are located according to the corresponding lexicographic order in the selected elements of $\kappa_4(Y)$:

$$\kappa_{4\mathrm{D}}^{(1)}(Y) = \sum_{i=1}^{p}e_{(i)}e_{(i)}^{\otimes 4\,\mathrm{T}}\kappa_4(Y) \equiv L_{4\mathrm{D}}^{(1)}\kappa_4(Y) = (\kappa_{1111}, \kappa_{2222}, \ldots, \kappa_{pppp})^{\mathrm{T}},$$

$$\kappa_{4\mathrm{D}}^{(2a)}(Y) = \sum_{1\le i<j\le p}\bigl\{e_{\{(i-1)p^3+(i-1)p^2+(i-1)p+j\}}(e_{(i)}^{\otimes 3}\otimes e_{(j)})^{\mathrm{T}} + e_{\{(i-1)p^3+(j-1)p^2+(j-1)p+j\}}(e_{(i)}\otimes e_{(j)}^{\otimes 3})^{\mathrm{T}}\bigr\}\kappa_4(Y) \equiv L_{4\mathrm{D}}^{(2a)}\kappa_4(Y) = (\kappa_{1112}, \kappa_{1113}, \ldots, \kappa_{p-1,p,p,p})^{\mathrm{T}},$$

$$\kappa_{4\mathrm{D}}^{(2b)}(Y) = \sum_{1\le i<j\le p}e_{\{(i-1)p^3+(i-1)p^2+(j-1)p+j\}}(e_{(i)}^{\otimes 2}\otimes e_{(j)}^{\otimes 2})^{\mathrm{T}}\kappa_4(Y) \equiv L_{4\mathrm{D}}^{(2b)}\kappa_4(Y) = (\kappa_{1122}, \kappa_{1133}, \ldots, \kappa_{p-1,p-1,p,p})^{\mathrm{T}},$$

$$\kappa_{4\mathrm{D}}^{(3)}(Y) = \sum_{1\le i<j<k\le p}\bigl\{e_{\{(i-1)p^3+(j-1)p^2+(k-1)p+k\}}(e_{(i)}\otimes e_{(j)}\otimes e_{(k)}^{\otimes 2})^{\mathrm{T}} + e_{\{(i-1)p^3+(j-1)p^2+(j-1)p+k\}}(e_{(i)}\otimes e_{(j)}^{\otimes 2}\otimes e_{(k)})^{\mathrm{T}} + e_{\{(i-1)p^3+(i-1)p^2+(j-1)p+k\}}(e_{(i)}^{\otimes 2}\otimes e_{(j)}\otimes e_{(k)})^{\mathrm{T}}\bigr\}\kappa_4(Y) \equiv L_{4\mathrm{D}}^{(3)}\kappa_4(Y) = (\kappa_{1233}, \kappa_{1244}, \ldots, \kappa_{1,2,p-1,p-1}, \kappa_{2234}, \ldots, \kappa_{p-2,p-1,p,p})^{\mathrm{T}},$$

$$\kappa_{4\mathrm{D}}^{(4)}(Y) = \sum_{1\le i<j<k<l\le p}e_{\{(i-1)p^3+(j-1)p^2+(k-1)p+l\}}(e_{(i)}\otimes e_{(j)}\otimes e_{(k)}\otimes e_{(l)})^{\mathrm{T}}\kappa_4(Y) \equiv L_{4\mathrm{D}}^{(4)}\kappa_4(Y) = (\kappa_{1234}, \kappa_{1235}, \ldots, \kappa_{1,p-2,p-1,p}, \kappa_{2345}, \ldots, \kappa_{p-3,p-2,p-1,p})^{\mathrm{T}}.$$
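The quadruplication matrix for p = 2 can be sketched in the same way as the triplication matrix (0-based indices; the distinct-cumulant values are hypothetical):

```python
import numpy as np
from itertools import product

p = 2
distinct = [(i, j, k, l) for i in range(p) for j in range(i, p)
            for k in range(j, p) for l in range(k, p)]
D4 = np.zeros((p ** 4, len(distinct)))    # quadruplication matrix D_4p (Meijer's Q_p)
for idx in product(range(p), repeat=4):
    row = sum(v * p ** (3 - n) for n, v in enumerate(idx))
    D4[row, distinct.index(tuple(sorted(idx)))] = 1.0

assert D4.shape == (16, 5)                # p(p+1)(p+2)(p+3)/4! = 5 columns
assert np.all(D4.sum(axis=1) == 1)        # a single unit element in every row
k4D = np.array([1.0, 0.2, -0.1, 0.4, 2.0])
D4plus = np.linalg.inv(D4.T @ D4) @ D4.T
assert np.allclose(D4plus @ (D4 @ k4D), k4D)   # D_4p^+ is a left inverse
```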
In the above expressions, when $p = 2$, $\kappa_{4\mathrm{D}}^{(3)}(Y)$ and $\kappa_{4\mathrm{D}}^{(4)}(Y)$ do not exist; and when $p = 3$, $\kappa_{4\mathrm{D}}^{(4)}(Y)$ does not exist. Define
$$\kappa_{4\mathrm{D}}^{(1234)}(Y) = (L_{4\mathrm{D}}^{(1)\mathrm{T}}, L_{4\mathrm{D}}^{(2a)\mathrm{T}}, L_{4\mathrm{D}}^{(2b)\mathrm{T}}, L_{4\mathrm{D}}^{(3)\mathrm{T}}, L_{4\mathrm{D}}^{(4)\mathrm{T}})^{\mathrm{T}}\kappa_4(Y) \equiv L_{4\mathrm{D}}^{(1234)}\kappa_4(Y) = \{\kappa_{4\mathrm{D}}^{(1)\mathrm{T}}(Y), \kappa_{4\mathrm{D}}^{(2a)\mathrm{T}}(Y), \kappa_{4\mathrm{D}}^{(2b)\mathrm{T}}(Y), \kappa_{4\mathrm{D}}^{(3)\mathrm{T}}(Y), \kappa_{4\mathrm{D}}^{(4)\mathrm{T}}(Y)\}^{\mathrm{T}}.$$

Then, the elements of $\kappa_{4\mathrm{D}}^{(1234)}(Y)$ are permuted ones of $\kappa_{4\mathrm{D}}(Y)$. Under bi-, tri- and four-variate independence, $\kappa_{4\mathrm{D}}^{(1234)}(Y)$ is shorter than that without any independence.

So far, various elimination matrices have been defined in this section. For the multivariate third cumulants, $L_{3\mathrm{D}}^{(1)}$, $L_{3\mathrm{D}}^{(2)}$, $L_{3\mathrm{D}}^{(3)}$ and $L_{3\mathrm{D}}^{(123)}$ are $p\times p^3$, $\{p(p-1)\}\times p^3$, $\{p(p-1)(p-2)/6\}\times p^3$ and $p^{(3)}\times p^3$ matrices, where $p^{(3)} = p(p+1)(p+2)/3!$. For the multivariate fourth cumulants, $L_{4\mathrm{D}}^{(1)}$, $L_{4\mathrm{D}}^{(2a)}$, $L_{4\mathrm{D}}^{(2b)}$, $L_{4\mathrm{D}}^{(3)}$, $L_{4\mathrm{D}}^{(4)}$ and $L_{4\mathrm{D}}^{(1234)}$ are $p\times p^4$, $\{p(p-1)\}\times p^4$, $\{p(p-1)/2\}\times p^4$, $\{p(p-1)(p-2)/2\}\times p^4$, $\{p(p-1)(p-2)(p-3)/24\}\times p^4$ and $p^{(4)}\times p^4$ matrices, where $p^{(4)} = p(p+1)(p+2)(p+3)/4!$.

Remark 10.5 Let $L$ be an $m\times n$ generic elimination matrix, including, for convenience, permutation matrices such as $L_{3\mathrm{D}}^{(123)}$ and $L_{4\mathrm{D}}^{(1234)}$. Then, by definition, $m\le n$, where $m = n$ indicates that $L$ is a permutation matrix. Suppose that $y_* = Ly$, where $y_*$ consists of some or all of the elements of $y$. Consider $L^{+}y_*$, where $L^{+}$ $(n\times m)$ is the MP inverse of $L$:

$$L^{+} = L^{\mathrm{T}}(LL^{\mathrm{T}})^{-1} = L^{\mathrm{T}}I_m^{-1} = L^{\mathrm{T}},$$

which is different from the case of an $n\times m$ duplication matrix, generically denoted by $D$, with $D^{+}\neq D^{\mathrm{T}}$. It is found that $L^{+}y_*$ is $y$ with the eliminated elements replaced by 0s. That is, $L^{+}y_*$ partially restores $y$ unless $m = n$, which corresponds to the non-existence of the inverse transformation of $L$ (Magnus and Neudecker [13, p. 427]).
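Remark 10.5 can be illustrated directly: the rows of an elimination matrix are distinct rows of an identity matrix, so $LL^{\mathrm{T}} = I_m$ and the MP inverse is just the transpose (the vector and the eliminated position below are arbitrary):

```python
import numpy as np

L = np.array([[1., 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1]])             # eliminates the third element
y = np.array([4.0, 3.0, 2.0, 1.0])

assert np.allclose(L @ L.T, np.eye(3))            # rows are orthonormal
assert np.allclose(np.linalg.pinv(L), L.T)        # L^+ = L^T
# L^T (L y) partially restores y, zeroing the eliminated element
assert np.allclose(L.T @ (L @ y), [4.0, 3.0, 0.0, 1.0])
```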
References

1. Arnold BC, Groeneveld RA (1995) Measuring skewness with respect to the mode. Am Stat 49:34–38
2. Balakrishnan N, Brito MR, Quiroz AJ (2007) A vectorial notion of skewness and its use in testing for multivariate symmetry. Commun Stat Theor Methods 36:1757–1767
3. Browne MW (1974) Generalized least-squares estimators in the analysis of covariance structures. South Afr Stat J 8:1–24. Reprinted in Aigner DJ, Goldberger AS (eds) Latent variables in socioeconomic models. North Holland, Amsterdam, pp 205–226 (1977)
4. Cardoso JF (1989) Source separation using higher order moments. In: International conference on acoustics, speech, and signal processing. IEEE, pp 2109–2112
5. Ekström M, Jammalamadaka SR (2012) A general measure of skewness. Stat Probab Lett 82:1559–1568
6. Holmquist B (1988) Moments and cumulants of the multivariate normal distribution. Stoch Anal Appl 6:273–278
7. Holmquist B (1996) The d-variate vector and Hermite polynomial of order k. Linear Algebra Appl 237/238:155–190
8. Jammalamadaka SR, Taufer E, Terdik GH (2021) On multivariate skewness and kurtosis. Sankhyā A 83:607–644
9. Kano Y (1997) Beyond third-order efficiency. Sankhyā A 59:179–197
10. Kollo T (2008) Multivariate skewness and kurtosis measures with an application in ICA. J Multivar Anal 99:2328–2338
11. Koziol JA (1989) A note on measures of multivariate kurtosis. Biom J 31:619–624
12. Magnus JR, Neudecker H (1979) The commutation matrix: some properties and applications. Ann Stat 7:381–394
13. Magnus JR, Neudecker H (1980) The elimination matrix: some lemmas and applications. SIAM J Algebraic Discrete Methods 1:422–449
14. Magnus JR, Neudecker H (1999) Matrix differential calculus with applications in statistics and econometrics, rev edn. Wiley, New York
15. Mardia KV (1970) Measures of multivariate skewness and kurtosis with applications. Biometrika 57:519–530
16. Meijer E (2005) Matrix algebra for higher order moments. Linear Algebra Appl 410:112–134
17. Mòri TF, Rohatgi VK, Székely GJ (1994) On multivariate skewness and kurtosis. Theor Probab Appl 38:547–551
18. Neudecker H, Wansbeek T (1983) Some results on commutation matrices, with statistical applications. Can J Stat 11:221–231
19. Ogasawara H (2017) Extensions of Pearson's inequality between skewness and kurtosis to multivariate cases. Stat Probab Lett 130:12–16
20. Srivastava MS (1984) A measure of skewness and kurtosis and a graphical method for assessing multivariate normality. Stat Probab Lett 2:263–267
21. Terdik G (2002) Higher order statistics and multivariate vector Hermite polynomials for nonlinear analysis of multidimensional time series. Teorija verojatnosteĭ i matematičeskaja statistika (Theor Probab Math Stat) 66:147–168
22. Terdik G (2021) Multivariate statistical methods: going beyond the linear. Springer Nature, Cham
23. Yanai H, Takeuchi K, Takane Y (2011) Projection matrices, generalized inverse matrices, and singular value decomposition. Springer, New York
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 H. Ogasawara, Expository Moments for Pseudo Distributions, Behaviormetrics: Quantitative Approaches to Human Behavior 2, https://doi.org/10.1007/978-981-19-3525-1