Expository Moments for Pseudo Distributions (Behaviormetrics: Quantitative Approaches to Human Behavior, 2) 9811935246, 9789811935244

This book provides expository derivations for moments of a family of pseudo distributions, which is an extended family of distributions.


English Pages 355 [348] Year 2023


Table of contents :
Preface
Contents
1 The Sectionally Truncated Normal Distribution
1.1 Introduction
1.2 The Probability Density Function (PDF) and the Moment Generating Function for the Sectionally Truncated Normal Vector
1.3 Partial Derivatives of the Cumulative Distribution Function of the Normal Random Vector
1.4 Moments and Cumulants of the STN-Distributed Vector Using the MGF
1.5 The Product Sum of Natural Numbers and the Hermite Polynomials
References
2 Normal Moments Under Stripe Truncation and the Real-Valued Poisson Distribution
2.1 Introduction
2.2 Closed Formulas for Moments of Integer-Valued Orders
2.3 Series Expressions of \overline{I}_{k}^{(r)} \,(k = 0,1, \ldots ;r = 1, \ldots ,R) for Moments of Integer-Valued Orders
2.4 The Real-Valued Poisson Distribution for Series Expressions of \overline{I}_{k}^{(r)} \,(k = 0,1, \ldots ;r = 1, \ldots ,R) for Absolute Moments
2.4.1 Generalization of the Poisson Distribution
2.4.2 The Real-Valued Poisson Distribution
2.4.3 Applications to the Series Expressions of the Moments of the Normal Distribution
2.5 Remarks
References
3 The Basic Parabolic Cylinder Distribution and Its Multivariate Extension
3.1 Introduction
3.2 The BPC Distribution of the Third Kind and Its CDF
3.3 Moments of the BPC Distribution
3.4 The Mode and the Shapes of the PDFs of the BPC Distribution
3.5 The Multivariate BPC Distribution
3.6 Numerical Illustrations
3.7 Discussion
3.8 R-Functions
3.8.1 The R-Function wpc for the Weighted Parabolic Cylinder Function
3.8.2 The R-Functions bpc1n and bpc2n for the Normalizers of the Uni- and Bivariate BPC Distributions
3.8.3 The R-Functions dbpc1 and dbpc2 for the PDFs of the Uni- and Bivariate BPC Distributions
3.8.4 The R-Function bpc2d for the CDF of the Bivariate BPC Distribution
References
4 The Pseudo-Normal (PN) Distribution
4.1 Introduction
4.2 The PDF of the PN Distribution
4.3 The Moment Generating Functions (MGFs)
4.3.1 The MGF of the PN-Distributed Vector
4.3.2 The MGF of {{\bf Y}}^{{{\rm T}}} {{\bf CY}}
4.3.3 The MGF of {{\bf YY}}^{{{\rm T}}}
4.4 Closed Properties of the PN
4.4.1 The Closure of Affine Transformations of the PN-Distributed Vector
4.4.2 Marginal and Conditional Distributions
4.4.3 Independent Random Vectors and Sums
4.4.4 Summary
4.5 Moments and Cumulants of the PN
4.5.1 General Results for Cumulants
4.5.2 Moments and Cumulants When q = 1
4.6 The Distribution Function of the PN
References
5 The Kurtic-Normal (KN) Distribution
5.1 Introduction
5.2 The Limiting Distributions of the KN
5.3 Moments and Cumulants of the KN
References
6 The Normal-Normal (NN) Distribution
6.1 Introduction
6.2 The MGFs of the NN
6.3 Closed Properties of the NN
6.4 Cumulants of the NN
6.5 Alternative Expressions of the PDF of the NN: Mixture, Convolution and Regression
6.6 Moment-Equating for the PN and NN
6.6.1 The SN and NN
6.6.2 The Multivariate PN and NN with Exchangeable Variables
Reference
7 The Decompositions of the PN- and NN-Distributed Variables
7.1 Decomposition of the PN
7.2 Decomposition of the NN
7.3 Multivariate Hermite Polynomials
7.4 Normal-Reduced and Normal-Added PN and NN
References
8 The Truncated Pseudo-Normal (TPN) and Truncated Normal-Normal (TNN) Distributions
8.1 Introduction
8.2 Moment Generating Functions for the TPN Distribution
8.3 Properties of the TPN
8.3.1 Affine Transformation of the TPN Vector
8.3.2 Marginal and Conditional Distributions of the TPN Vector
8.4 Moments and Cumulants of the TPN
8.4.1 A Non-recursive Formula
8.4.2 A Formula Using the MGF
8.4.3 The Case of Sectionally Truncated SN with p = q = 1
8.5 The Truncated Normal-Normal Distribution
References
9 The Student t- and Pseudo t- (PT) Distributions: Various Expressions of Mixtures
9.1 Introduction
9.2 The t-Distribution
9.3 The Multivariate t-Distribution
9.4 The Pseudo t (PT)-Distribution
9.4.1 The PDF of the PT
9.4.2 Moments and Cumulants of the PT
References
10 Multivariate Measures of Skewness and Kurtosis
10.1 Preliminaries
10.2 Multivariate Cumulants and Multiple Commutators
10.3 Multivariate Measures of Skewness and Kurtosis
10.3.1 Multivariate Measures of Skewness
10.3.2 Multivariate Measures of Excess Kurtosis
10.4 Elimination Matrices and Non-duplicated Multivariate Skewness and Kurtosis
References
Index

Behaviormetrics: Quantitative Approaches to Human Behavior 2

Haruhiko Ogasawara

Expository Moments for Pseudo Distributions

Behaviormetrics: Quantitative Approaches to Human Behavior Volume 2

Series Editor Akinori Okada, Professor Emeritus, Rikkyo University, Tokyo, Japan

This series covers in their entirety the elements of behaviormetrics, a term that encompasses all quantitative approaches of research to disclose and understand human behavior in the broadest sense. The term includes the concept, theory, model, algorithm, method, and application of quantitative approaches from theoretical or conceptual studies to empirical or practical application studies to comprehend human behavior. The Behaviormetrics series deals with a wide range of topics of data analysis and of developing new models, algorithms, and methods to analyze these data. The characteristics featured in the series have four aspects. The first is the variety of the methods utilized in data analysis and newly developed methods, which include not only standard or general statistical methods or psychometric methods traditionally used in data analysis, but also cluster analysis, multidimensional scaling, machine learning, correspondence analysis, biplot, network analysis and graph theory, conjoint measurement, biclustering, visualization, and data and web mining. The second aspect is the variety of types of data, including ranking, categorical, preference, functional, angle, contextual, nominal, multi-mode multi-way, continuous, discrete, high-dimensional, and sparse data. The third comprises the varied procedures by which the data are collected: by survey, experiment, sensor devices, purchase records, and other means. The fourth aspect of the Behaviormetrics series is the diversity of fields from which the data are derived, including marketing and consumer behavior, sociology, psychology, education, archaeology, medicine, economics, political and policy science, cognitive science, public administration, pharmacy, engineering, urban planning, agriculture and forestry science, and brain science.
In essence, the purpose of this series is to describe the new horizons opening up in behaviormetrics — approaches to understanding and disclosing human behaviors both in the analyses of diverse data by a wide range of methods and in the development of new methods to analyze these data. Editor in Chief Akinori Okada (Rikkyo University) Managing Editors Daniel Baier (University of Bayreuth) Giuseppe Bove (Roma Tre University) Takahiro Hoshino (Keio University)


Haruhiko Ogasawara Professor Emeritus Otaru University of Commerce Otaru, Hokkaido, Japan

ISSN 2524-4027 ISSN 2524-4035 (electronic)
Behaviormetrics: Quantitative Approaches to Human Behavior
ISBN 978-981-19-3524-4 ISBN 978-981-19-3525-1 (eBook)
https://doi.org/10.1007/978-981-19-3525-1

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

Early in 2021, the following two papers were published online:

Ogasawara, H. (2021a). Unified and non-recursive formulas for moments of the normal distribution with stripe truncation. Communications in Statistics—Theory and Methods. https://doi.org/10.1080/03610926.2020.1867742.

Ogasawara, H. (2021b). A non-recursive formula for various moments of the multivariate normal distribution with sectional truncation. Journal of Multivariate Analysis. https://doi.org/10.1016/j.jmva.2021.104729.

This book is a collection of explications, extensions and generalizations of these papers, and reads like an edited volume of several new papers by the same author. The two papers deal with moments under normality with stripe and sectional truncation for the univariate and multivariate cases, respectively, where the new truncation forms include the usual single and double tail truncation as special cases. In the latter (2021b) paper, sectional truncation yields the pseudo-normal (PN) family of distributions, which can be seen as an extension of the skew-normal (SN), whose statistical properties were derived in the seminal paper of Azzalini (1985). The PN is also an extension of the closed skew-normal (CSN) family obtained by Domínguez-Molina, González-Farías and Gupta (2003) and associated papers, where the CSN includes the SN and various other distributions as special cases. The PN is the "normal" version of the "pseudo distributions" appearing in the title of this book. One of the new features of the PN over the SN and CSN is that the PN provides symmetric non-normal distributions, whereas in the SN and CSN, symmetric distributions reduce to normal ones. In this book, the author uses the term "kurtic normal (KN)," where the coined word "kurtic" stands for "mesokurtic, leptokurtic or platykurtic" as used in statistics. This new feature was made possible by employing stripe/sectional truncation.
It is known that before the advent of Azzalini (1985), there were preliminary versions with the same probability density function as that of the SN, e.g., Birnbaum (1950) in mathematical statistics. In an associated paper in psychometrics, or more generally behaviormetrics, Birnbaum, Paulson and Andrews (1950) dealt with truncation in, e.g., entrance examinations and personnel selection and introduced the term "general truncation," which is similar to sectional truncation, where the estimation of untruncated moments from truncated moments was a problem.

This book consists of ten chapters, focusing on moments associated with truncation. In Chap. 1, the sectionally truncated normal vector is dealt with, where the derivatives of the cumulative distribution function with respect to the variables in the moment generating functions (mgf's) are presented. Chapters 2 and 3 give two new distributions, the real-valued Poisson and the basic parabolic cylinder distributions, respectively; these new distributions are introduced to derive, e.g., absolute moments of real-valued orders. Chapter 4 gives the closed properties and moments of the PN. In Chap. 5, the KN is discussed. Following that discussion, Chap. 6 gives the new normal-normal (NN) family of distributions, which can approximate the PN using finite/infinite mixtures with the weights of normal densities. It is known that an SN-distributed variable can be decomposed into independent truncated and untruncated normal variables, a result derived by Henze (1986). In Chap. 7, it is shown that decompositions of the PN and NN similar to the Henze theorem hold. Another aspect of the decomposition concerns the mgf of the PN, where a pseudo mgf is employed for a pseudo or complex-valued variable, which considerably reduces the computation of the cumulants. Sectional truncation can also be used in the PN and NN, which is shown in Chap. 8. Chapter 9 gives some preliminary explanations of the Student t- and the pseudo t- (PT) distributions, the latter corresponding to the PN.
The last chapter deals with multivariate measures of skewness and kurtosis, with some expository explanations of the Kronecker product, the vectorizing operator and the commutation matrix. Throughout the book, proofs are not omitted and are often given by more than one method for expository purposes, so readers can start with any chapter according to their interests without difficulty. Some aspects are repeated in similar ways, which is also done for exposition.

As mentioned earlier, this book is based on the author's recent work on moments and truncation (for associated papers, see https://www.otaru-uc.ac.jp/~emt-hogasa/), which has been motivated by communications with researchers in this field. Comments on the author's work by Professor Nicola Loperfido (Università degli Studi di Urbino "Carlo Bo") have guided the author fruitfully with his constructive suggestions. Discussions on computational aspects and software associated with the moments of the normal vector under sectional truncation with Professor Adelchi Azzalini (Università degli Studi di Padova), Professor Njål Foldnes (University of Stavanger) and Dr. Christian Galarza (Escuela Superior Politécnica del Litoral, ESPOL) are highly appreciated. I regret that I have not fulfilled the promised revision of my software associated with sectional truncation; this book is an intermediate report toward that revision. I am directly indebted to Professor Akinori Okada (Rikkyo University), the series editor of Behaviormetrics, for this book. Without his invitation and encouragement, this book would not have been realized. Last but not least, I deeply thank Professor Takahiro Terasaka (Otaru University of Commerce) for discussions on mathematical statistics and academic environments, which have continued unchanged since my retirement in 2017.

Otaru, Japan
March 2022

Haruhiko Ogasawara

References

Azzalini A (1985) A class of distributions which includes the normal ones. Scand J Stat 12:171–178
Birnbaum ZW (1950) Effect of linear truncation on a multinormal population. Ann Math Stat 21:272–279
Birnbaum ZW, Paulson E, Andrews FC (1950) On the effect of selection performed on some coordinates of a multi-dimensional population. Psychometrika 15:191–204
Domínguez-Molina A, González-Farías G, Gupta AK (2003) The multivariate closed skew normal distribution. Technical report 03-12. Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, OH
Henze N (1986) A probabilistic representation of the 'skew-normal' distribution. Scand J Stat 13:271–275


Chapter 1

The Sectionally Truncated Normal Distribution

1.1 Introduction

Let a random vector X = (X_1, \ldots, X_p)^{\rm T} be normally distributed. Suppose that X is truncated such that when X belongs to one of R non-overlapping sections (regions), X is selected; otherwise X is truncated. The sections are denoted by \bigcup_{r=1}^{R} \{a_r \le X < b_r\} = \bigcup_{r=1}^{R} \bigcap_{i=1}^{p} \{a_{ir} \le X_i < b_{ir}\} with a_r = (a_{1r}, \ldots, a_{pr})^{\rm T} and b_r = (b_{1r}, \ldots, b_{pr})^{\rm T} (r = 1, \ldots, R). In this case, X is said to be sectionally truncated under normality [32]. Note that the regions for selection may touch each other at some points or line segments, and some or all elements of a_r (b_r) are possibly -\infty (\infty). When a_r = (-\infty, \ldots, -\infty)^{\rm T} and some element(s) of b_r are finite, the upper (right) tail of X is truncated, while when some element(s) of a_r are finite and b_r = (\infty, \ldots, \infty)^{\rm T}, the lower (left) tail of X is truncated. These cases are said to be singly truncated. When some of the pairs of the elements of a_r and b_r are both finite, X is doubly truncated in that both tails are truncated. So far, moments under single and double truncation have been well investigated [2, 6, 7, 10, 13–15, 18–21, 28–30, 34, 36]. Sectional truncation introduced by Ogasawara [32] is an extension of single and double truncation. In the univariate case, sectional truncation gives a zebraic or tigerish probability density function (pdf); consequently, this case is called stripe truncation (for various examples under this truncation, see Fig. 1.1 and Ogasawara [31]). Note that this extension does not entail excessive extra complexity over single/double truncation, because the selected regions \bigcup_{r=1}^{R} \{a_r \le X < b_r\} are given by R cases of double/single truncation (selection). One of the simplest cases of sectional truncation other than double/single truncation is that of inner truncation (see Examples 4–10 of Fig. 1.1 and Ogasawara [31, 32]), where the inner complement of the domain for selection in double truncation is discarded, with the lower and upper tails being selected. When


[Figure 1.1 comprises ten panels, Examples 1–10, each showing the N(0, 1) density over the range −3 to 3 with the truncated areas shaded.]

Fig. 1.1 Examples of stripe truncation in N(0, 1) with truncated areas being shaded (taken from Ogasawara [31, Fig. 1] by permission)

variables stand for risks, as in actuarial science, the behavior of the variables in tail areas is of primary interest, as the term "tail conditional expectation" [5, Sect. 5.1; 23; 24] shows. In many cases, the high and low values of (risk) variables, e.g., measurements of blood pressure and pulse, are the focus of medical treatment, while "normal" patients with intermediate values of the associated risk variables are excluded from treatment, giving inner truncation. Similarly, in plant and animal breeding, combinations of high and low values of some variables are typically considered to offer possible improvement by selection (for statistical aspects of breeding, see Cochran [11], Herrendörfer and Tuchscherer [17], Gianola and Rosa [16]). Note that the works of G. M. Tallis, starting from the seminal paper of Tallis [36] giving the moment generating function of the normal vector under double truncation, are motivated by breeding. Though sectional truncation covers various cases of truncation, including some of elliptical truncation [3, 37], cases under radial [37] and plane [38] truncation are not covered. In the behavioral sciences, the effects of selection due to, e.g., entrance examinations for universities and personnel selection have been investigated, where the inference of the moments of variables for achievements or abilities before truncation from those after truncation is one of the typical problems [1, 4, 8, 9, 25, 33]. Note that as early as Birnbaum et al. [9, p. 193], the term "general truncation," indicating complicated conditions for selection similar to plane and sectional truncation, is used.
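As a concrete numerical illustration of the selection probability implied by stripe truncation, the following sketch computes Pr(X ∈ S) for an inner-truncated N(0, 1) using only the standard normal CDF. This is illustrative Python (not the R code given later in the book), and the interval endpoints are chosen arbitrarily for the example:

```python
from math import erf, sqrt

def std_normal_cdf(x: float) -> float:
    """CDF of N(0, 1) via the error function (standard library only)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def selection_probability(sections):
    """Pr(X in S) for X ~ N(0, 1) and S a union of disjoint intervals [a, b)."""
    return sum(std_normal_cdf(b) - std_normal_cdf(a) for a, b in sections)

# Inner truncation: the middle section (-1, 1) is discarded, both tails selected.
# +/-40 stands in for +/-infinity at double precision.
alpha = selection_probability([(-40.0, -1.0), (1.0, 40.0)])
print(round(alpha, 5))  # about 0.31731, i.e., 2 * Phi(-1)
```

Because the R selected regions are disjoint, the normalizer is just a sum of R double-truncation probabilities, matching the observation above that sectional truncation adds no essential complexity over double/single truncation.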

1.2 The Probability Density Function (PDF) and the Moment Generating Function for the Sectionally Truncated Normal Vector

The pdf of the normally distributed p \times 1 vector X without truncation, denoted by X \sim N_p(\mu, \Sigma), is

\phi_p(X = x \mid \mu, \Sigma) = \phi_p(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)^{\rm T} \Sigma^{-1} (x - \mu) \right\},

where E(X) = \mu and {\rm cov}(X) = \Sigma is non-singular by assumption. Suppose that X is sectionally truncated with the regions for selection \bigcup_{r=1}^{R} \{a_r \le X < b_r\}. Define

4

1 The Sectionally Truncated Normal Distribution



$$
\alpha \equiv \sum_{r=1}^{R}\int_{a_{1r}}^{b_{1r}}\cdots\int_{a_{pr}}^{b_{pr}} \phi_p(\mathbf{x}\mid\mu,\Sigma)\,\mathrm{d}x_1\cdots\mathrm{d}x_p
\equiv \sum_{r=1}^{R}\int_{\mathbf{a}_r}^{\mathbf{b}_r}\phi_p(\mathbf{x}\mid\mu,\Sigma)\,\mathrm{d}\mathbf{x}
\equiv \int_{\mathbf{A}}^{\mathbf{B}}\phi_p(\mathbf{x}\mid\mu,\Sigma)\,\mathrm{d}\mathbf{x},
$$

where $\mathbf{A} = (\mathbf{a}_1,\ldots,\mathbf{a}_R)$, $\mathbf{a}_r = (a_{1r},\ldots,a_{pr})^{\mathrm T}$ and $\mathbf{B} = (\mathbf{b}_1,\ldots,\mathbf{b}_R)$, $\mathbf{b}_r = (b_{1r},\ldots,b_{pr})^{\mathrm T}$ $(r = 1,\ldots,R)$. Let $S$ be a set representing the union of the $R$ regions. Then, $\alpha$ is equal to

$$
\Pr(\mathbf{X}\in S) = \Pr\left(\bigcup_{r=1}^{R}\{\mathbf{a}_r \le \mathbf{X} < \mathbf{b}_r\}\right).
$$

Using $\alpha$, we have the following pdf of the sectionally truncated normal vector $\mathbf{X}$.

Definition 1.1 The distribution of the $p \times 1$ sectionally truncated normal (STN) vector $\mathbf{X}$ is defined with the notation $\mathbf{X} \sim N_p(\mu,\Sigma,\mathbf{A},\mathbf{B}) = N_p^{(\alpha)}(\mu,\Sigma)$ when its pdf is

$$
\phi_p(\mathbf{x}\mid\mu,\Sigma,\mathbf{A},\mathbf{B})
\equiv \phi_p^{(\alpha)}(\mathbf{x}\mid\mu,\Sigma)
\equiv \frac{\phi_p(\mathbf{x}\mid\mu,\Sigma)}{\int_{\mathbf{A}}^{\mathbf{B}}\phi_p(\mathbf{x}\mid\mu,\Sigma)\,\mathrm{d}\mathbf{x}}
= \alpha^{-1}\phi_p(\mathbf{x}\mid\mu,\Sigma).
$$

From the above definition, it is obvious that $\alpha$ is the valid normalizer of the truncated distribution. It is seen that $0 < \alpha < 1$ under some truncation, and $\alpha = 1$ if and only if $\mathbf{X}$ is untruncated.

Theorem 1.1 [32, Corollary 4] The moment generating function (mgf) of the STN vector $\mathbf{X} \sim N_p(\mu,\Sigma,\mathbf{A},\mathbf{B})$ is

$$
M_{\mathbf{X}}(\mathbf{t})
= \frac{\Pr(\mathbf{X}+\Sigma\mathbf{t}\in S\mid\mu,\Sigma)}{\Pr(\mathbf{X}\in S\mid\mu,\Sigma)}\exp\left(\mu^{\mathrm T}\mathbf{t}+\frac{\mathbf{t}^{\mathrm T}\Sigma\mathbf{t}}{2}\right)
= \frac{\Pr(\mathbf{X}\in S\mid\mu+\Sigma\mathbf{t},\Sigma)}{\alpha}\exp\left(\mu^{\mathrm T}\mathbf{t}+\frac{\mathbf{t}^{\mathrm T}\Sigma\mathbf{t}}{2}\right),
$$

where $\Pr(\mathbf{X}\in S\mid\mu,\Sigma)$ is the probability of $\mathbf{X}\in S$ when $\mathbf{X}\sim N_p(\mu,\Sigma)$.

Proof By definition,

$$
\begin{aligned}
M_{\mathbf{X}}(\mathbf{t}) &= E\{\exp(\mathbf{X}^{\mathrm T}\mathbf{t})\}
= \alpha^{-1}\int_{\mathbf{A}}^{\mathbf{B}}\exp(\mathbf{x}^{\mathrm T}\mathbf{t})\,\frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}}\exp\left\{-\frac{1}{2}(\mathbf{x}-\mu)^{\mathrm T}\Sigma^{-1}(\mathbf{x}-\mu)\right\}\mathrm{d}\mathbf{x}\\
&= \alpha^{-1}\int_{\mathbf{A}}^{\mathbf{B}}\frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}}\exp\left\{-\frac{1}{2}(\mathbf{x}-\mu-\Sigma\mathbf{t})^{\mathrm T}\Sigma^{-1}(\mathbf{x}-\mu-\Sigma\mathbf{t})\right\}\mathrm{d}\mathbf{x}\cdot\exp\left(\mu^{\mathrm T}\mathbf{t}+\frac{\mathbf{t}^{\mathrm T}\Sigma\mathbf{t}}{2}\right)\\
&= \alpha^{-1}\Pr(\mathbf{X}\in S\mid\mu+\Sigma\mathbf{t},\Sigma)\exp\left(\mu^{\mathrm T}\mathbf{t}+\frac{\mathbf{t}^{\mathrm T}\Sigma\mathbf{t}}{2}\right).
\end{aligned}
$$

Noting that

$$
\Pr(\mathbf{X}+\Sigma\mathbf{t}\in S\mid\mu,\Sigma)
= \Pr\left(\bigcup_{r=1}^{R}\{\mathbf{a}_r-\Sigma\mathbf{t}\le\mathbf{X}<\mathbf{b}_r-\Sigma\mathbf{t}\}\,\Big|\,\mu,\Sigma\right)
= \Pr(\mathbf{X}\in S\mid\mu+\Sigma\mathbf{t},\Sigma),
$$

the required results follow. Q.E.D.

Remark 1.1 The factor $\exp(\mu^{\mathrm T}\mathbf{t}+\mathbf{t}^{\mathrm T}\Sigma\mathbf{t}/2)$ in the above mgf is that of $\mathbf{X}\sim N_p(\mu,\Sigma)$.

The cumulant generating function (cgf) of the STN is

$$
K_{\mathbf{X}}(\mathbf{t}) = \ln M_{\mathbf{X}}(\mathbf{t})
= \ln\Pr(\mathbf{X}\in S\mid\mu+\Sigma\mathbf{t},\Sigma) - \ln\alpha + \mu^{\mathrm T}\mathbf{t} + \frac{\mathbf{t}^{\mathrm T}\Sigma\mathbf{t}}{2}.
$$
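As a numerical sketch (not from the book, all values hypothetical), Theorem 1.1 can be checked in the simplest case $p = 1$, $R = 1$ with double truncation to $[a, b)$, comparing the closed-form mgf against a Monte Carlo estimate under rejection sampling:

```python
import math
import random
from statistics import NormalDist

# Univariate check of Theorem 1.1 (hypothetical values): X ~ N(mu, sigma^2)
# truncated to S = [a, b), where the theorem gives
#   M_X(t) = Pr(a <= X < b | mu + sigma^2 t) / Pr(a <= X < b | mu)
#            * exp(mu t + sigma^2 t^2 / 2).
mu, sigma, a, b, t = 0.5, 1.2, -0.3, 2.0, 0.4

def interval_prob(m):
    """Pr(a <= X < b) for X ~ N(m, sigma^2)."""
    d = NormalDist(m, sigma)
    return d.cdf(b) - d.cdf(a)

alpha = interval_prob(mu)                      # the normalizer of Definition 1.1
mgf_formula = (interval_prob(mu + sigma**2 * t) / alpha
               * math.exp(mu * t + sigma**2 * t**2 / 2))

# Monte Carlo estimate of E{exp(tX)} under the truncated distribution.
random.seed(0)
kept = [x for _ in range(200000) if a <= (x := random.gauss(mu, sigma)) < b]
mgf_mc = sum(math.exp(t * x) for x in kept) / len(kept)

print(round(mgf_formula, 3), round(mgf_mc, 3))
```

With 200,000 proposals the two estimates agree to roughly two decimal places.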


Remark 1.2 Let the mgf in Theorem 1.1 be $M_{\mathbf{X}}(\mathbf{t}) = M_{\mathbf{X}_1}(\mathbf{t})M_{\mathbf{X}_0}(\mathbf{t})$, where

$$
M_{\mathbf{X}_0}(\mathbf{t}) = \exp\left(\mu^{\mathrm T}\mathbf{t}+\frac{\mathbf{t}^{\mathrm T}\Sigma\mathbf{t}}{2}\right)
$$

is the mgf of $\mathbf{X}_0 \sim N_p(\mu,\Sigma)$, while

$$
M_{\mathbf{X}_1}(\mathbf{t}) = \alpha^{-1}\Pr(\mathbf{X}\in S\mid\mu+\Sigma\mathbf{t},\Sigma)
$$

is the pseudo mgf of the pseudo random vector $\mathbf{X}_1$. Though $M_{\mathbf{X}_1}(\mathbf{0}) = 1$ satisfies a necessary condition for an mgf, as does $M_{\mathbf{X}_0}(\mathbf{0}) = 1$, $M_{\mathbf{X}_1}(\mathbf{t})$ is not a valid mgf for a real-valued distribution. This is seen, e.g., when $p = 1$ by noting that, from the additive property of the corresponding cgfs, the variance of $X$ obtained from $\mathrm{d}^2K_X(t)/\mathrm{d}t^2|_{t=0}$ is equal to

$$
\frac{\mathrm{d}^2K_{X_1}(t)}{\mathrm{d}t^2}\Big|_{t=0} + \frac{\mathrm{d}^2K_{X_0}(t)}{\mathrm{d}t^2}\Big|_{t=0}
= \frac{\mathrm{d}^2K_{X_1}(t)}{\mathrm{d}t^2}\Big|_{t=0} + \sigma^2,
$$

where $\Sigma = \sigma^2$ in the univariate case. However, the STN-distributed variable $X$ typically has a variance smaller than the untruncated $\sigma^2$, especially when tail areas are truncated. In this case, the above equation gives the negative $\mathrm{d}^2K_{X_1}(t)/\mathrm{d}t^2|_{t=0} = \operatorname{var}(X_1)$, giving the pseudo mgf and pseudo variable $X_1$. Although $\mathbf{X}_1$ is labeled as a pseudo random vector primarily for ease of reference and for understanding the structure of $M_{\mathbf{X}}(\mathbf{t})$ and $K_{\mathbf{X}}(\mathbf{t})$, $M_{\mathbf{X}_1}(\mathbf{t})$ is a valid factor of $M_{\mathbf{X}}(\mathbf{t})$. Similarly, $K_{\mathbf{X}_1}(\mathbf{t})$ is a valid term of $K_{\mathbf{X}}(\mathbf{t})$. Consequently, in the above example, the correct variance of $X$ is given by the sum of the possibly negative $\operatorname{var}(X_1)$ and $\operatorname{var}(X_0) = \sigma^2$. Further, the correct cumulants of $\mathbf{X}$ higher than the second order are obtained solely from the corresponding pseudo cumulants of $\mathbf{X}_1$, since the cumulants beyond the second order of $\mathbf{X}_0 \sim N_p(\mu,\Sigma)$ vanish. Though the pseudo moments of $\mathbf{X}_1$ give correct cumulants of $\mathbf{X}$, the moments of $\mathbf{X}_1$ do not give correct moments of $\mathbf{X}$.
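The sign phenomenon in Remark 1.2 can be illustrated for $p = 1$ by a small sketch (hypothetical numbers; the closed-form mean and variance of the doubly truncated normal are standard results):

```python
import math
from statistics import NormalDist

# Illustration of Remark 1.2 for p = 1 (hypothetical values): for X truncated to
# [a, b), the closed-form moments of the doubly truncated normal give var(X), and
# var(X_1) = var(X) - sigma^2 is typically negative under tail truncation.
mu, sigma, a, b = 0.0, 1.0, -1.0, 1.5
sd = NormalDist()
za, zb = (a - mu) / sigma, (b - mu) / sigma
Z = sd.cdf(zb) - sd.cdf(za)                    # the normalizer alpha
phi = lambda z: math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

mean_X = mu + sigma * (phi(za) - phi(zb)) / Z
var_X = sigma**2 * (1 + (za * phi(za) - zb * phi(zb)) / Z
                    - ((phi(za) - phi(zb)) / Z) ** 2)
var_X1 = var_X - sigma**2                      # pseudo variance of X_1

print(round(var_X, 4), round(var_X1, 4))       # var(X) < sigma^2, so var(X1) < 0
```

Here $\operatorname{var}(X) \approx 0.416 < \sigma^2 = 1$, so $\operatorname{var}(X_1) \approx -0.584$ is indeed negative.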

1.3 Partial Derivatives of the Cumulative Distribution Function of the Normal Random Vector

In this section, the partial derivatives of the cumulative distribution function (cdf) of the untruncated normal random vector up to the fourth order are given, which are required to obtain the moments of the STN-distributed vector from its mgf. Let


$$
\Phi_p(\mathbf{c}\mid\mu,\Sigma) = \int_{-\infty}^{\mathbf{c}}\phi_p(\mathbf{X}\mid\mu,\Sigma)\,\mathrm{d}\mathbf{X}
$$

be the cdf of the untruncated distribution of $\mathbf{X}\sim N_p(\mu,\Sigma)$ at $\mathbf{X}=\mathbf{c}$. Define the $p^i \times 1$ vector

$$
\Phi_p^{(i)}(\mathbf{c}\mid\mu,\Sigma) = \frac{\partial^i}{(\partial\mathbf{c})^{\langle i\rangle}}\Phi_p(\mathbf{c}\mid\mu,\Sigma)\quad(i = 1,2,\ldots),
$$

where $(\partial\mathbf{c})^{\langle i\rangle} = (\partial\mathbf{c})\otimes\cdots\otimes(\partial\mathbf{c})$ ($i$ times of $\partial\mathbf{c}$) is the $i$-fold Kronecker product of $\partial\mathbf{c}$; $\Phi_p^{(i)}(\mathbf{c}\mid\mu,\Sigma)$ is the vector of the $i$-th order partial derivatives of the cdf of $\mathbf{X}\sim N_p(\mu,\Sigma)$ at $\mathbf{X}=\mathbf{c}$. Note that the vector consists of the $p$ univariate and $p^i - p$ cross multivariate $i$-th order derivatives. In the following, $p$ is supposed to be large enough when an associated formula is considered.

Lemma 1.1 Let $\mathbf{X}_{(k)} = (x_1,\ldots,x_{k-1},x_{k+1},\ldots,x_p)^{\mathrm T}$, with $\mathbf{c}_{(k)}$ and $\mu_{(k)}$ defined similarly, and $\phi_1(c_k\mid\mu_k,\sigma_{kk}) = \phi(c_k\mid\mu_k,\sigma_{kk})$. Then,

$$
\begin{aligned}
\frac{\partial\Phi_p(\mathbf{c}\mid\mu,\Sigma)}{\partial c_k}
&= \phi(c_k\mid\mu_k,\sigma_{kk})\,\Phi_{p-1}\{\mathbf{c}_{(k)}\mid\mu_{(k)}+\sigma_{(k)k}\sigma_{kk}^{-1}(c_k-\mu_k),\ \Sigma_{(k,k)|k}\}\\
&= \int_{-\infty}^{\mathbf{c}_{(k)}}\phi_p\{(c_k,\mathbf{X}_{(k)}^{\mathrm T})^{\mathrm T}\mid\mu,\Sigma\}\,\mathrm{d}\mathbf{X}_{(k)}
\equiv f_{(1)}(c_k)\quad(k = 1,\ldots,p),
\end{aligned}
$$

where $\sigma_{(k)k} = (\sigma_{1k},\ldots,\sigma_{k-1,k},\sigma_{k+1,k},\ldots,\sigma_{pk})^{\mathrm T}$,

$$
\Sigma_{(k,k)|k} = \Sigma_{(k,k)} - \sigma_{(k)k}\sigma_{kk}^{-1}\sigma_{(k)k}^{\mathrm T}
= \operatorname{cov}(\mathbf{X}_{(k)}) - \operatorname{cov}(\mathbf{X}_{(k)},x_k)\{\operatorname{var}(x_k)\}^{-1}\operatorname{cov}(x_k,\mathbf{X}_{(k)}^{\mathrm T}),
$$

and the elements of $\mu$ and $\Sigma$ are reordered.

Proof Reorder the elements of $\mathbf{x}$ as $\mathbf{X} = (x_k,\mathbf{X}_{(k)}^{\mathrm T})^{\mathrm T}$. Then, we have

$$
|\Sigma| = \left|\begin{pmatrix}1 & \mathbf{0}^{\mathrm T}\\ -\sigma_{kk}^{-1}\sigma_{(k)k} & \mathbf{I}_{p-1}\end{pmatrix}\begin{pmatrix}\sigma_{kk} & \sigma_{(k)k}^{\mathrm T}\\ \sigma_{(k)k} & \Sigma_{(k,k)}\end{pmatrix}\right|
= \begin{vmatrix}\sigma_{kk} & \sigma_{(k)k}^{\mathrm T}\\ \mathbf{0} & \Sigma_{(k,k)}-\sigma_{kk}^{-1}\sigma_{(k)k}\sigma_{(k)k}^{\mathrm T}\end{vmatrix}
= \sigma_{kk}\,|\Sigma_{(k,k)}-\sigma_{kk}^{-1}\sigma_{(k)k}\sigma_{(k)k}^{\mathrm T}|
\equiv \sigma_{kk}\,|\Sigma_{(k,k)|k}|,
$$

where $\mathbf{I}_{p-1}$ is the $(p-1)\times(p-1)$ identity matrix. The formula for the inverse of a partitioned matrix [26, p. 11; 39, p. 16] gives

$$
\Sigma^{-1} = \begin{pmatrix}\sigma_{kk} & \sigma_{(k)k}^{\mathrm T}\\ \sigma_{(k)k} & \Sigma_{(k,k)}\end{pmatrix}^{-1}
= \begin{pmatrix}\sigma_{kk}^{-1}+\sigma_{kk}^{-2}\sigma_{(k)k}^{\mathrm T}\Sigma_{(k,k)|k}^{-1}\sigma_{(k)k} & -\sigma_{kk}^{-1}\sigma_{(k)k}^{\mathrm T}\Sigma_{(k,k)|k}^{-1}\\ -\sigma_{kk}^{-1}\Sigma_{(k,k)|k}^{-1}\sigma_{(k)k} & \Sigma_{(k,k)|k}^{-1}\end{pmatrix}.
$$

Using the above results, it follows that

$$
\begin{aligned}
\phi_p\{(c_k,\mathbf{X}_{(k)}^{\mathrm T})^{\mathrm T}\mid\mu,\Sigma\}
&= \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}}\exp\left[-\frac{1}{2}\{(c_k,\mathbf{X}_{(k)}^{\mathrm T})^{\mathrm T}-\mu\}^{\mathrm T}\Sigma^{-1}\{(c_k,\mathbf{X}_{(k)}^{\mathrm T})^{\mathrm T}-\mu\}\right]\\
&= \frac{1}{(2\pi)^{1/2}\sigma_{kk}^{1/2}}\exp\left\{-\frac{(c_k-\mu_k)^2}{2\sigma_{kk}}\right\}\\
&\quad\times\frac{1}{(2\pi)^{(p-1)/2}|\Sigma_{(k,k)|k}|^{1/2}}\exp\Big[-\frac{1}{2}\{\mathbf{X}_{(k)}-\mu_{(k)}-\sigma_{(k)k}\sigma_{kk}^{-1}(c_k-\mu_k)\}^{\mathrm T}\\
&\qquad\times\Sigma_{(k,k)|k}^{-1}\{\mathbf{X}_{(k)}-\mu_{(k)}-\sigma_{(k)k}\sigma_{kk}^{-1}(c_k-\mu_k)\}\Big]\\
&= \phi(c_k\mid\mu_k,\sigma_{kk})\,\phi_{p-1}\{\mathbf{X}_{(k)}\mid\mu_{(k)}+\sigma_{(k)k}\sigma_{kk}^{-1}(c_k-\mu_k),\ \Sigma_{(k,k)|k}\}.
\end{aligned}
$$


Consequently, we obtain

$$
\begin{aligned}
\frac{\partial\Phi_p(\mathbf{c}\mid\mu,\Sigma)}{\partial c_k}
&= \frac{\partial}{\partial c_k}\int_{-\infty}^{c_1}\cdots\int_{-\infty}^{c_p}\phi_p(\mathbf{X}\mid\mu,\Sigma)\,\mathrm{d}x_1\cdots\mathrm{d}x_p\\
&= \int_{-\infty}^{c_1}\cdots\int_{-\infty}^{c_{k-1}}\int_{-\infty}^{c_{k+1}}\cdots\int_{-\infty}^{c_p}\phi_p(x_1,\ldots,x_{k-1},c_k,x_{k+1},\ldots,x_p\mid\mu,\Sigma)\\
&\qquad\times\mathrm{d}x_1\cdots\mathrm{d}x_{k-1}\,\mathrm{d}x_{k+1}\cdots\mathrm{d}x_p\\
&= \int_{-\infty}^{\mathbf{c}_{(k)}}\phi_p\{(c_k,\mathbf{X}_{(k)}^{\mathrm T})^{\mathrm T}\mid\mu,\Sigma\}\,\mathrm{d}\mathbf{X}_{(k)}\\
&= \phi(c_k\mid\mu_k,\sigma_{kk})\int_{-\infty}^{\mathbf{c}_{(k)}}\phi_{p-1}\{\mathbf{X}_{(k)}\mid\mu_{(k)}+\sigma_{(k)k}\sigma_{kk}^{-1}(c_k-\mu_k),\ \Sigma_{(k,k)|k}\}\,\mathrm{d}\mathbf{X}_{(k)}\\
&= \phi(c_k\mid\mu_k,\sigma_{kk})\,\Phi_{p-1}\{\mathbf{c}_{(k)}\mid\mu_{(k)}+\sigma_{(k)k}\sigma_{kk}^{-1}(c_k-\mu_k),\ \Sigma_{(k,k)|k}\}
\equiv f_{(1)}(c_k)\quad(k = 1,\ldots,p).
\end{aligned}
$$

Q.E.D.

It is well known that $\Sigma_{(k,k)|k} = \Sigma_{(k,k)}-\sigma_{(k)k}\sigma_{kk}^{-1}\sigma_{(k)k}^{\mathrm T}$ is the covariance matrix of the conditional distribution of $\mathbf{X}_{(k)}$ when $X_k$ is given as a fixed value without truncation, or equivalently the covariance matrix of the residuals of the regression of $\mathbf{X}_{(k)}$ on $X_k$. Define $\sigma_{(k)/k} = \sigma_{(k)k}\sigma_{kk}^{-1}$, which is the vector of the regression coefficients of $\mathbf{X}_{(k)}$ on $X_k$. Then, we have

$$
\Sigma_{(k,k)|k} = \operatorname{cov}(\mathbf{X}_{(k)}-\sigma_{(k)/k}X_k)
= \operatorname{cov}(\mathbf{X}_{(k)})-\operatorname{cov}(\sigma_{(k)/k}X_k)
= \Sigma_{(k,k)}-\sigma_{(k)/k}\sigma_{kk}\sigma_{(k)/k}^{\mathrm T}.
$$

More generally, define, e.g.,

$$
\Sigma_{abc/de} = \Sigma_{abc,de}\Sigma_{de,de}^{-1}
= \operatorname{cov}\{(X_a,X_b,X_c)^{\mathrm T},(X_d,X_e)\}\,[\operatorname{cov}\{(X_d,X_e)^{\mathrm T}\}]^{-1},
$$

which is the matrix of the regression coefficients of $(X_a,X_b,X_c)$ on $(X_d,X_e)$; variations defined similarly will be used for simplicity of notation. Using this notation, we have


$$
\Sigma_{abc,abc|de} = \Sigma_{abc,abc}-\Sigma_{abc,de}\Sigma_{de,de}^{-1}\Sigma_{de,abc}
= \Sigma_{abc,abc}-\Sigma_{abc/de}\Sigma_{de,de}\Sigma_{abc/de}^{\mathrm T}.
$$

The notation $f_{(1)}(c_k) = \int_{-\infty}^{\mathbf{c}_{(k)}}\phi_p\{(c_k,\mathbf{X}_{(k)}^{\mathrm T})^{\mathrm T}\mid\mu,\Sigma\}\,\mathrm{d}\mathbf{X}_{(k)}$ in Lemma 1.1 will be frequently used; it indicates the marginal density of $X_k = c_k$ multiplied by the normalizer $\alpha = \int_{-\infty}^{\mathbf{c}}\phi_p(\mathbf{X}\mid\mu,\Sigma)\,\mathrm{d}\mathbf{X}$ when each variable is upper-tail truncated as $-\infty \le X_k < c_k$ $(k = 1,\ldots,p)$.

Lemma 1.2 Using the simplified notation $\sigma_{l/k} = \sigma_{lk}\sigma_{kk}^{-1}$ defined earlier,

$$
\begin{aligned}
\frac{\partial^2\Phi_p(\mathbf{c}\mid\mu,\Sigma)}{\partial c_k^2}
&= -\frac{c_k-\mu_k}{\sigma_{kk}}f_{(1)}(c_k)
-\sum_{l=1,\,l\ne k}^{p}\sigma_{lk}\sigma_{kk}^{-1}\int_{-\infty}^{\mathbf{c}_{(k,l)}}\phi_p\{(\mathbf{c}_{kl}^{\mathrm T},\mathbf{X}_{(k,l)}^{\mathrm T})^{\mathrm T}\mid\mu,\Sigma\}\,\mathrm{d}\mathbf{X}_{(k,l)}\\
&\equiv -\frac{c_k-\mu_k}{\sigma_{kk}}f_{(1)}(c_k)
-\sum_{l=1,\,l\ne k}^{p}\sigma_{l/k}\,f_{(2)}(\mathbf{c}_{kl})\quad(k = 1,\ldots,p),
\end{aligned}
$$

where $\mathbf{c}_{kl} = (c_k,c_l)^{\mathrm T}$ and $\mathbf{X}_{(k,l)}$ is the $(p-2)\times 1$ vector obtained when the elements $x_k$ and $x_l$ $(l\ne k)$ are deleted from $\mathbf{x}$.

Proof

$$
\begin{aligned}
\frac{\partial^2\Phi_p(\mathbf{c}\mid\mu,\Sigma)}{\partial c_k^2}
&= \frac{\partial}{\partial c_k}\big[\phi(c_k\mid\mu_k,\sigma_{kk})\,\Phi_{p-1}\{\mathbf{c}_{(k)}\mid\mu_{(k)}+\sigma_{(k)/k}(c_k-\mu_k),\ \Sigma_{(k,k)|k}\}\big]\\
&= \frac{\partial\phi(c_k\mid\mu_k,\sigma_{kk})}{\partial c_k}\,\Phi_{p-1}\{\mathbf{c}_{(k)}\mid\mu_{(k)}+\sigma_{(k)/k}(c_k-\mu_k),\ \Sigma_{(k,k)|k}\}\\
&\quad+\phi(c_k\mid\mu_k,\sigma_{kk})\frac{\partial}{\partial c_k}\Phi_{p-1}\{\mathbf{c}_{(k)}-\mu_{(k)}-\sigma_{(k)/k}(c_k-\mu_k)\mid\mathbf{0},\ \Sigma_{(k,k)|k}\}\\
&= -\frac{c_k-\mu_k}{\sigma_{kk}}\phi(c_k\mid\mu_k,\sigma_{kk})\,\Phi_{p-1}\{\mathbf{c}_{(k)}\mid\mu_{(k)}+\sigma_{(k)/k}(c_k-\mu_k),\ \Sigma_{(k,k)|k}\}\\
&\quad+\phi(c_k\mid\mu_k,\sigma_{kk})\frac{\partial}{\partial c_k}\int_{-\infty}^{\mathbf{c}_{(k)}-\mu_{(k)}-\sigma_{(k)/k}(c_k-\mu_k)}\phi_{p-1}\{\mathbf{X}_{(k)}\mid\mathbf{0},\ \Sigma_{(k,k)|k}\}\,\mathrm{d}\mathbf{X}_{(k)}\\
&= -\frac{c_k-\mu_k}{\sigma_{kk}}f_{(1)}(c_k)
-\phi(c_k\mid\mu_k,\sigma_{kk})\sum_{l=1,\,l\ne k}^{p}\sigma_{l/k}\,\phi\{c_l-\mu_l-\sigma_{l/k}(c_k-\mu_k)\mid 0,\ \sigma_{ll|k}\}\\
&\qquad\times\Phi_{p-2}\{\mathbf{c}_{(k,l)}-\mu_{(k,l)}-\Sigma_{(k,l)/kl}(\mathbf{c}_{kl}-\mu_{kl})\mid\mathbf{0},\ \Sigma_{(k,l,k,l)|kl}\}\\
&= -\frac{c_k-\mu_k}{\sigma_{kk}}f_{(1)}(c_k)
-\sum_{l=1,\,l\ne k}^{p}\sigma_{l/k}\,\phi_2(\mathbf{c}_{kl}\mid\mu_{kl},\Sigma_{kl,kl})\\
&\qquad\times\Phi_{p-2}\{\mathbf{c}_{(k,l)}-\mu_{(k,l)}-\Sigma_{(k,l)/kl}(\mathbf{c}_{kl}-\mu_{kl})\mid\mathbf{0},\ \Sigma_{(k,l,k,l)|kl}\}\\
&= -\frac{c_k-\mu_k}{\sigma_{kk}}f_{(1)}(c_k)
-\sum_{l=1,\,l\ne k}^{p}\sigma_{l/k}\,f_{(2)}(\mathbf{c}_{kl})\quad(k = 1,\ldots,p).
\end{aligned}
$$

Q.E.D.

The notation

$$
f_{(2)}(\mathbf{c}_{kl}) = \int_{-\infty}^{\mathbf{c}_{(k,l)}}\phi_p\{(\mathbf{c}_{kl}^{\mathrm T},\mathbf{X}_{(k,l)}^{\mathrm T})^{\mathrm T}\mid\mu,\Sigma\}\,\mathrm{d}\mathbf{X}_{(k,l)}\quad(k,l = 1,\ldots,p;\ k\ne l)
$$

indicates the bivariate marginal density of $\mathbf{X}_{kl} = (X_k,X_l)^{\mathrm T} = \mathbf{c}_{kl}$ multiplied by the normalizer $\alpha$ under upper-tail truncation as before.

Lemma 1.3

$$
\begin{aligned}
\frac{\partial^3\Phi_p(\mathbf{c}\mid\mu,\Sigma)}{\partial c_k^3}
&= \{(c_k-\mu_k)^2\sigma_{kk}^{-2}-\sigma_{kk}^{-1}\}f_{(1)}(c_k)\\
&\quad+\sum_{l=1,\,l\ne k}^{p}\sigma_{l/k}\big[(c_k-\mu_k)\sigma_{kk}^{-1}+\{\Sigma_{kl,kl}^{-1}(\mathbf{c}_{kl}-\mu_{kl})\}_{[k]}\big]f_{(2)}(\mathbf{c}_{kl})\\
&\quad+\sum_{\substack{l,m=1\\ (k,l,m:\ne)}}^{p}\sigma_{l/k}(\sigma_{m/kl})_{[k]}\int_{-\infty}^{\mathbf{c}_{(k,l,m)}}\phi_p\{(\mathbf{c}_{klm}^{\mathrm T},\mathbf{X}_{(k,l,m)}^{\mathrm T})^{\mathrm T}\mid\mu,\Sigma\}\,\mathrm{d}\mathbf{X}_{(k,l,m)}\\
&\equiv \{(c_k-\mu_k)^2\sigma_{kk}^{-2}-\sigma_{kk}^{-1}\}f_{(1)}(c_k)
+\sum_{l=1,\,l\ne k}^{p}\sigma_{l/k}\big[(c_k-\mu_k)\sigma_{kk}^{-1}+\{\Sigma_{kl,kl}^{-1}(\mathbf{c}_{kl}-\mu_{kl})\}_{[k]}\big]f_{(2)}(\mathbf{c}_{kl})\\
&\quad+\sum_{\substack{l,m=1\\ (k,l,m:\ne)}}^{p}\sigma_{l/k}(\sigma_{m/kl})_{[k]}\,f_{(3)}(\mathbf{c}_{klm})\quad(k = 1,\ldots,p),
\end{aligned}
$$

where $(\cdot)_{[k]}$ is the element corresponding to $X_k$ of the vector in parentheses; $\mathbf{c}_{klm}$ and $\mathbf{X}_{(k,l,m)}$ are defined similarly to $\mathbf{c}_{kl}$ and $\mathbf{X}_{(k,l)}$, respectively; and $k,l,m:\ne$ indicates that $k$, $l$ and $m$ are mutually distinct, i.e., $k\ne l\ne m\ne k$.


Proof Taking the partial derivative of the result of Lemma 1.2 similarly as before, the required results follow. Q.E.D.

In Lemma 1.3,

$$
\{\Sigma_{kl,kl}^{-1}(\mathbf{c}_{kl}-\mu_{kl})\}_{[k]} = (\sigma^{kk},\sigma^{kl})(\mathbf{c}_{kl}-\mu_{kl})
\quad\text{and}\quad
(\sigma_{m/kl})_{[k]} = (\sigma_{mk},\sigma_{ml})(\sigma^{kk},\sigma^{lk})^{\mathrm T},
$$

where

$$
\Sigma_{kl,kl}^{-1} = \begin{pmatrix}\sigma_{kk} & \sigma_{kl}\\ \sigma_{lk} & \sigma_{ll}\end{pmatrix}^{-1}
= \begin{pmatrix}\sigma^{kk} & \sigma^{kl}\\ \sigma^{lk} & \sigma^{ll}\end{pmatrix}.
$$

The quantity

$$
f_{(3)}(\mathbf{c}_{klm}) = \int_{-\infty}^{\mathbf{c}_{(k,l,m)}}\phi_p\{(\mathbf{c}_{klm}^{\mathrm T},\mathbf{X}_{(k,l,m)}^{\mathrm T})^{\mathrm T}\mid\mu,\Sigma\}\,\mathrm{d}\mathbf{X}_{(k,l,m)}
$$

is the tri-variate marginal density of $\mathbf{X}_{klm} = \mathbf{c}_{klm}$ multiplied by the normalizer $\alpha$ under upper-tail truncation. Using notations for more than three variables defined as before and similar methods, we have the following.

Lemma 1.4

$$
\begin{aligned}
\frac{\partial^4\Phi_p(\mathbf{c}\mid\mu,\Sigma)}{\partial c_k^4}
&= \{-(c_k-\mu_k)^3\sigma_{kk}^{-3}+3(c_k-\mu_k)\sigma_{kk}^{-2}\}f_{(1)}(c_k)\\
&\quad+\sum_{l=1,\,l\ne k}^{p}\sigma_{l/k}\Big[-\{(c_k-\mu_k)^2\sigma_{kk}^{-2}-\sigma_{kk}^{-1}\}+\sigma_{kk}^{-1}+\sigma^{kk}\\
&\qquad-\big[(c_k-\mu_k)\sigma_{kk}^{-1}+\{\Sigma_{kl,kl}^{-1}(\mathbf{c}_{kl}-\mu_{kl})\}_{[k]}\big]\{\Sigma_{kl,kl}^{-1}(\mathbf{c}_{kl}-\mu_{kl})\}_{[k]}\Big]f_{(2)}(\mathbf{c}_{kl})\\
&\quad-\sum_{\substack{l,m=1\\ (k,l,m:\ne)}}^{p}\sigma_{l/k}\Big[\big[(c_k-\mu_k)\sigma_{kk}^{-1}+\{\Sigma_{kl,kl}^{-1}(\mathbf{c}_{kl}-\mu_{kl})\}_{[k]}\big](\sigma_{m/kl})_{[k]}\\
&\qquad+(\sigma_{m/kl})_{[k]}\{\Sigma_{klm,klm}^{-1}(\mathbf{c}_{klm}-\mu_{klm})\}_{[k]}\Big]f_{(3)}(\mathbf{c}_{klm})\\
&\quad-\sum_{\substack{l,m,n=1\\ (k,l,m,n:\ne)}}^{p}\sigma_{l/k}(\sigma_{m/kl})_{[k]}(\sigma_{n/klm})_{[k]}\,f_{(4)}(\mathbf{c}_{klmn})\quad(k = 1,\ldots,p).
\end{aligned}
$$

The cross partial derivatives up to the fourth order are given below.


Lemma 1.5

$$
\frac{\partial^2\Phi_p(\mathbf{c}\mid\mu,\Sigma)}{\partial c_k\partial c_l} = f_{(2)}(\mathbf{c}_{kl}),
$$

$$
\frac{\partial^3\Phi_p(\mathbf{c}\mid\mu,\Sigma)}{\partial c_k\partial c_l^2}
= -\{\Sigma_{kl,kl}^{-1}(\mathbf{c}_{kl}-\mu_{kl})\}_{[l]}\,f_{(2)}(\mathbf{c}_{kl})
-\sum_{\substack{m=1\\ (k,l,m:\ne)}}^{p}(\sigma_{m/kl})_{[l]}\,f_{(3)}(\mathbf{c}_{klm}),
$$

$$
\begin{aligned}
\frac{\partial^4\Phi_p(\mathbf{c}\mid\mu,\Sigma)}{\partial c_k\partial c_l^3}
&= \big[-\sigma^{ll}+\{\Sigma_{kl,kl}^{-1}(\mathbf{c}_{kl}-\mu_{kl})\}_{[l]}^2\big]f_{(2)}(\mathbf{c}_{kl})\\
&\quad+\sum_{\substack{m=1\\ (k,l,m:\ne)}}^{p}(\sigma_{m/kl})_{[l]}\Big[\{\Sigma_{kl,kl}^{-1}(\mathbf{c}_{kl}-\mu_{kl})\}_{[l]}
+\{\Sigma_{klm,klm}^{-1}(\mathbf{c}_{klm}-\mu_{klm})\}_{[l]}\Big]f_{(3)}(\mathbf{c}_{klm})\\
&\quad+\sum_{\substack{m,n=1\\ (k,l,m,n:\ne)}}^{p}(\sigma_{m/kl})_{[l]}(\sigma_{n/klm})_{[l]}\,f_{(4)}(\mathbf{c}_{klmn})\quad(k,l = 1,\ldots,p;\ k\ne l),
\end{aligned}
$$

$$
\frac{\partial^3\Phi_p(\mathbf{c}\mid\mu,\Sigma)}{\partial c_k\partial c_l\partial c_m} = f_{(3)}(\mathbf{c}_{klm}),
$$

$$
\frac{\partial^4\Phi_p(\mathbf{c}\mid\mu,\Sigma)}{\partial c_k\partial c_l\partial c_m^2}
= -\{\Sigma_{klm,klm}^{-1}(\mathbf{c}_{klm}-\mu_{klm})\}_{[m]}\,f_{(3)}(\mathbf{c}_{klm})
-\sum_{\substack{n=1\\ (k,l,m,n:\ne)}}^{p}(\sigma_{n/klm})_{[m]}\,f_{(4)}(\mathbf{c}_{klmn})
\quad(k,l,m = 1,\ldots,p;\ k,l,m:\ne),
$$

$$
\frac{\partial^4\Phi_p(\mathbf{c}\mid\mu,\Sigma)}{\partial c_k\partial c_l\partial c_m\partial c_n} = f_{(4)}(\mathbf{c}_{klmn})\quad(k,l,m,n = 1,\ldots,p;\ k,l,m,n:\ne).
$$

Proof The results $\partial^2\Phi_p(\cdot)/\partial c_k\partial c_l$, $\partial^3\Phi_p(\cdot)/\partial c_k\partial c_l\partial c_m$ and $\partial^4\Phi_p(\cdot)/\partial c_k\partial c_l\partial c_m\partial c_n$ $(k,l,m,n = 1,\ldots,p;\ k,l,m,n:\ne)$ are given by Lemma 1.1 with successive differentiation. The other results are given by these results. Q.E.D.

Alternative expressions of the above results are also available. For instance, the term $-\{\Sigma_{kl,kl}^{-1}(\mathbf{c}_{kl}-\mu_{kl})\}_{[l]}f_{(2)}(\mathbf{c}_{kl})$ in $\partial^3\Phi_p(\cdot)/\partial c_k\partial c_l^2$ is given alternatively from the result of $\partial^2\Phi_p(\cdot)/\partial c_l^2$ in Lemma 1.2 as

$$
\left\{-\frac{c_l-\mu_l}{\sigma_{ll}}+\sigma_{k/l}\{\Sigma_{kl,kl}^{-1}(\mathbf{c}_{kl}-\mu_{kl})\}_{[k]}\right\}f_{(2)}(\mathbf{c}_{kl})\quad(k,l = 1,\ldots,p;\ k\ne l),
$$

which is shown to be equal to the corresponding result in Lemma 1.5 as follows. Lemma 1.2 gives

$$
\frac{\partial^2\Phi_p(\mathbf{c}\mid\mu,\Sigma)}{\partial c_l^2}
= -\frac{c_l-\mu_l}{\sigma_{ll}}f_{(1)}(c_l)
-\sum_{m=1,\,m\ne l}^{p}\sigma_{m/l}\,f_{(2)}(\mathbf{c}_{lm})\quad(l = 1,\ldots,p).
$$

Differentiating this result with respect to $c_k$, we have the term of $f_{(2)}(\mathbf{c}_{kl})$ given above, which becomes

$$
\begin{aligned}
&-\frac{1}{\sigma_{ll}}\left\{(c_l-\mu_l)-\sigma_{kl}\frac{\sigma_{ll}(c_k-\mu_k)-\sigma_{kl}(c_l-\mu_l)}{\sigma_{kk}\sigma_{ll}-\sigma_{kl}^2}\right\}f_{(2)}(\mathbf{c}_{kl})\\
&= -\frac{1}{\sigma_{ll}}\left\{\left(1+\frac{\sigma_{kl}^2}{\sigma_{kk}\sigma_{ll}-\sigma_{kl}^2}\right)(c_l-\mu_l)-\frac{\sigma_{kl}\sigma_{ll}(c_k-\mu_k)}{\sigma_{kk}\sigma_{ll}-\sigma_{kl}^2}\right\}f_{(2)}(\mathbf{c}_{kl})\\
&= -\left\{\frac{\sigma_{kk}(c_l-\mu_l)}{\sigma_{kk}\sigma_{ll}-\sigma_{kl}^2}-\frac{\sigma_{kl}(c_k-\mu_k)}{\sigma_{kk}\sigma_{ll}-\sigma_{kl}^2}\right\}f_{(2)}(\mathbf{c}_{kl})\\
&= -\{\Sigma_{kl,kl}^{-1}(\mathbf{c}_{kl}-\mu_{kl})\}_{[l]}\,f_{(2)}(\mathbf{c}_{kl})\quad(k,l = 1,\ldots,p;\ k\ne l),
\end{aligned}
$$

which is equal to the corresponding result in Lemma 1.5 and is simpler than the alternative expression. It is of interest to see that the above result corresponds to the latter case of the general formulas $(w_{11},w_{12})\mathbf{W}^{-1}(x_1,x_2)^{\mathrm T} = x_1$ and $(w_{21},w_{22})\mathbf{W}^{-1}(x_1,x_2)^{\mathrm T} = x_2$ for arbitrary $x_1$ and $x_2$, where $\mathbf{W} = \begin{pmatrix}w_{11} & w_{12}\\ w_{21} & w_{22}\end{pmatrix}$, which is possibly asymmetric. The above formulas are easily derived by $\mathbf{W}\mathbf{W}^{-1}(x_1,x_2)^{\mathrm T} = (x_1,x_2)^{\mathrm T}$.
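As a numerical sketch (not from the book; all values hypothetical), the first results of Lemmas 1.1 and 1.5 can be checked for $p = 2$ by comparing finite differences of $\Phi_2$, computed here by simple Simpson quadrature, against the closed forms:

```python
import math
from statistics import NormalDist

# Numerical check of Lemma 1.1 and the first result of Lemma 1.5 for p = 2
# (hypothetical values): finite differences of Phi_2 vs. the closed forms.
mu = [0.5, -0.3]
S = [[1.0, 0.6], [0.6, 1.5]]
Phi = NormalDist().cdf
phi = lambda z: math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
cm = S[1][0] / S[0][0]                            # regression coefficient sigma_{2/1}
cs = math.sqrt(S[1][1] - S[0][1] ** 2 / S[0][0])  # conditional sd

def Phi2(c1, c2, n=600):
    """Phi_2((c1, c2) | mu, Sigma) by Simpson quadrature over x1."""
    lo = mu[0] - 8 * math.sqrt(S[0][0])
    h = (c1 - lo) / n
    def g(x):
        m = mu[1] + cm * (x - mu[0])
        return (phi((x - mu[0]) / math.sqrt(S[0][0])) / math.sqrt(S[0][0])
                * Phi((c2 - m) / cs))
    return (g(lo) + g(c1)
            + sum((4 if i % 2 else 2) * g(lo + i * h) for i in range(1, n))) * h / 3

c = [0.8, 0.4]
e = 1e-3
# Lemma 1.1: dPhi2/dc1 = phi(c1 | mu1, s11) * Phi{c2 | conditional mean, s22|1}
d1_num = (Phi2(c[0] + e, c[1]) - Phi2(c[0] - e, c[1])) / (2 * e)
d1_formula = (phi((c[0] - mu[0]) / math.sqrt(S[0][0])) / math.sqrt(S[0][0])
              * Phi((c[1] - mu[1] - cm * (c[0] - mu[0])) / cs))
# Lemma 1.5: d2Phi2/dc1 dc2 = f_(2)(c) = phi_2(c | mu, Sigma)
d12_num = (Phi2(c[0] + e, c[1] + e) - Phi2(c[0] + e, c[1] - e)
           - Phi2(c[0] - e, c[1] + e) + Phi2(c[0] - e, c[1] - e)) / (4 * e * e)
det = S[0][0] * S[1][1] - S[0][1] ** 2
u, v = c[0] - mu[0], c[1] - mu[1]
q = (S[1][1] * u * u - 2 * S[0][1] * u * v + S[0][0] * v * v) / det
d12_formula = math.exp(-q / 2) / (2 * math.pi * math.sqrt(det))

print(round(d1_num, 5), round(d1_formula, 5))
print(round(d12_num, 5), round(d12_formula, 5))
```

The numerical and closed-form derivatives agree to several decimal places.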

1.4 Moments and Cumulants of the STN-Distributed Vector Using the MGF

In this section, moments and cumulants of the STN-distributed vector are given using the mgf, based on the results of the previous section. Define


$$
c_{kr}^{(L_k)} = (\mathbf{c}_r^{(L)})_k = a_{kr}^{L_k}b_{kr}^{1-L_k}
\quad\text{and}\quad
c_{kr}^{*(L_k)} = (\mathbf{c}_r^{*(L)})_k = a_{kr}^{*L_k}b_{kr}^{*(1-L_k)}
= (\mathbf{a}_r^*)_k^{L_k}(\mathbf{b}_r^*)_k^{1-L_k}
= (\mathbf{a}_r-\Sigma\mathbf{t})_k^{L_k}(\mathbf{b}_r-\Sigma\mathbf{t})_k^{1-L_k}
$$

$(L_k = 0,1;\ k = 1,\ldots,p;\ r = 1,\ldots,R)$, where $(\mathbf{a}_r)_k^{L_k} = \{(\mathbf{a}_r)_k\}^{L_k}$; and $(\mathbf{c}_r^{(L)})_k$ is the $k$-th element of $\mathbf{c}_r^{(L)}$ and should not be confused with $(\mathbf{c}_r^{(L)})_{[k]}$ defined earlier, though in this case they happen to be the same.

Using the above notations and $\theta \equiv \mu^{\mathrm T}\mathbf{t}+\mathbf{t}^{\mathrm T}\Sigma\mathbf{t}/2$, the mgf of the STN vector given in Theorem 1.1 becomes

$$
\begin{aligned}
M_{\mathbf{X}}(\mathbf{t}) &= \alpha^{-1}\Pr(\mathbf{X}\in S\mid\mu+\Sigma\mathbf{t},\Sigma)\exp\left(\mu^{\mathrm T}\mathbf{t}+\frac{\mathbf{t}^{\mathrm T}\Sigma\mathbf{t}}{2}\right)
= \alpha^{-1}\Pr(\mathbf{X}\in S\mid\mu+\Sigma\mathbf{t},\Sigma)\,e^{\theta}\\
&= \alpha^{-1}\sum_{r=1}^{R}\sum_{L_1=0}^{1}\cdots\sum_{L_p=0}^{1}(-1)^{\sum_{k=1}^{p}L_k}\,\Phi_p\{(c_{1r}^{*(L_1)},\ldots,c_{pr}^{*(L_p)})^{\mathrm T}\mid\mu,\Sigma\}\,e^{\theta}\qquad(1.1)\\
&= \alpha^{-1}\sum_{r=1}^{R}\sum_{L_1,\ldots,L_p=0}^{1}(-1)^{L_+}\,\Phi_p(\mathbf{c}_r^{*(L)}\mid\mu,\Sigma)\,e^{\theta},
\end{aligned}
$$

where $L_+ = L_1+\cdots+L_p$. When $\mathbf{t} = \mathbf{0}$, we have $e^{\theta} = 1$ and $\mathbf{c}_r^{*(L)} = \mathbf{c}_r^{(L)}$ $(r = 1,\ldots,R)$, which give

$$
\alpha = \sum_{r=1}^{R}\sum_{L_1,\ldots,L_p=0}^{1}(-1)^{L_+}\,\Phi_p(\mathbf{c}_r^{(L)}\mid\mu,\Sigma),
$$

yielding $M_{\mathbf{X}}(\mathbf{0}) = 1$ as expected.

Remark 1.3 Note that

$$
\frac{\partial c_{kr}^{*(L_k)}}{\partial t_i} = \frac{\partial(\mathbf{c}_r^{(L)}-\Sigma\mathbf{t})_k}{\partial t_i} = -\sigma_{ki}\quad(i,k = 1,\ldots,p;\ r = 1,\ldots,R)
$$

or, in matrix form,

$$
\frac{\partial\mathbf{c}_r^{*(L)\mathrm T}}{\partial\mathbf{t}} = \frac{\partial(\mathbf{c}_r^{(L)}-\Sigma\mathbf{t})^{\mathrm T}}{\partial\mathbf{t}} = -\Sigma
$$

does not depend on the values of $L_i$ $(i = 1,\ldots,p)$. Consequently, recalling the definition of the pseudo random vector $\mathbf{X}_1$ in Remark 1.2, we obtain

$$
\begin{aligned}
\frac{\partial^j M_{\mathbf{X}_1}(\mathbf{t})}{(\partial\mathbf{t})^{\langle j\rangle}}
&= \frac{\partial^j}{(\partial\mathbf{t})^{\langle j\rangle}}\left\{\alpha^{-1}\sum_{r=1}^{R}\sum_{L_1,\ldots,L_p=0}^{1}(-1)^{L_+}\,\Phi_p(\mathbf{c}_r^{*(L)}\mid\mu,\Sigma)\right\}\\
&= \alpha^{-1}\sum_{r=1}^{R}\sum_{L_1,\ldots,L_p=0}^{1}(-1)^{L_+}\left\{\frac{\partial\mathbf{c}_r^{*(L)\mathrm T}}{\partial\mathbf{t}}\right\}^{\langle j\rangle}\frac{\partial^j\Phi_p(\mathbf{c}_r^{*(L)}\mid\mu,\Sigma)}{(\partial\mathbf{c}_r^{*(L)})^{\langle j\rangle}}\\
&= \alpha^{-1}\sum_{r=1}^{R}\sum_{L_1,\ldots,L_p=0}^{1}(-1)^{L_+}(-\Sigma)^{\langle j\rangle}\frac{\partial^j\Phi_p(\mathbf{c}_r^{*(L)}\mid\mu,\Sigma)}{(\partial\mathbf{c}_r^{*(L)})^{\langle j\rangle}}\quad(j = 1,2,\ldots).
\end{aligned}
$$

In this expression, the elements of

$$
\frac{\partial^j\Phi_p(\mathbf{c}_r^{*(L)}\mid\mu,\Sigma)}{(\partial\mathbf{c}_r^{*(L)})^{\langle j\rangle}}\quad(L_k = 0,1;\ k = 1,\ldots,p;\ r = 1,\ldots,R;\ j = 1,\ldots,4)
$$

are given by Lemmas 1.1 to 1.5 when $\mathbf{c} = \mathbf{c}_r^{*(L)}$. Using $\mathbf{c}_r^{*(L)} = \mathbf{c}_r^{(L)}$ $(r = 1,\ldots,R)$ when $\mathbf{t} = \mathbf{0}$, we obtain

$$
E(\mathbf{X}_1^{\langle j\rangle}) = \frac{\partial^j M_{\mathbf{X}_1}(\mathbf{t})}{(\partial\mathbf{t})^{\langle j\rangle}}\Big|_{\mathbf{t}=\mathbf{0}}
= \alpha^{-1}\sum_{r=1}^{R}\sum_{L_1,\ldots,L_p=0}^{1}(-1)^{L_+}(-\Sigma)^{\langle j\rangle}\frac{\partial^j\Phi_p(\mathbf{c}_r^{(L)}\mid\mu,\Sigma)}{(\partial\mathbf{c}_r^{(L)})^{\langle j\rangle}}\quad(j = 1,2,\ldots).
$$

The above formula can be used to obtain the cumulants of $\mathbf{X}$ via the cumulants of $\mathbf{X}_1$ as

$$
\begin{aligned}
E(\mathbf{X}) &= E(\mathbf{X}_1)+\mu = \frac{\partial M_{\mathbf{X}_1}(\mathbf{t})}{\partial\mathbf{t}}\Big|_{\mathbf{t}=\mathbf{0}}+\mu,\\
\operatorname{cov}(\mathbf{X}) &= \operatorname{cov}(\mathbf{X}_1)+\Sigma
= \frac{\partial^2 M_{\mathbf{X}_1}(\mathbf{t})}{\partial\mathbf{t}\,\partial\mathbf{t}^{\mathrm T}}\Big|_{\mathbf{t}=\mathbf{0}}-E(\mathbf{X}_1)E(\mathbf{X}_1^{\mathrm T})+\Sigma,\\
\kappa_j(\mathbf{X}) &= \kappa_j(\mathbf{X}_1)\quad(j = 3,4),
\end{aligned}
$$

where $\kappa_j(\mathbf{X})$ is the $p^j \times 1$ vector of the multivariate cumulants of $\mathbf{X}$. In the following, we obtain the results using somewhat reduced expressions.

Theorem 1.2

$$
E(\mathbf{X})_{i_1} = \alpha^{-1}\sum_{r=1}^{R}\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\,f_{(1)}^{(r)}(c_{kr}^{(L_k)})+\mu_{i_1}\quad(i_1 = 1,\ldots,p),
$$

where

$$
f_{(1)}^{(r)}(c_{kr}^{(L_k)}) = \sum_{L_1,\ldots,L_{k-1},L_{k+1},\ldots,L_p=0}^{1}(-1)^{L_+-L_k}\,f_{(1)}(c_{kr}^{(L_k)}).
$$

Proof Rewrite the mgf of the STN vector (see (1.1) of Sect. 1.4) as

$$
M_{\mathbf{X}}(\mathbf{t}) = \alpha^{-1}\sum_{r=1}^{R}\sum_{L_1,\ldots,L_p=0}^{1}(-1)^{L_+}\,\Phi_p(\mathbf{c}_r^{*(L)}\mid\mu,\Sigma)\,e^{\theta}
\equiv \Phi_p^{(\alpha)}e^{\theta}.
$$

Then,

$$
E(\mathbf{X})_{i_1} = \frac{\partial M_{\mathbf{X}}(\mathbf{t})}{\partial t_{i_1}}\Big|_{\mathbf{t}=\mathbf{0}}
= \frac{\partial\Phi_p^{(\alpha)}e^{\theta}}{\partial t_{i_1}}\Big|_{\mathbf{t}=\mathbf{0}}
= \frac{\partial\Phi_p^{(\alpha)}}{\partial t_{i_1}}\Big|_{\mathbf{t}=\mathbf{0}}e^{\theta}\big|_{\mathbf{t}=\mathbf{0}}
+\Phi_p^{(\alpha)}\big|_{\mathbf{t}=\mathbf{0}}\{\mu+(\Sigma\mathbf{t})\}_{i_1}e^{\theta}\big|_{\mathbf{t}=\mathbf{0}}
= \frac{\partial\Phi_p^{(\alpha)}}{\partial t_{i_1}}\Big|_{\mathbf{t}=\mathbf{0}}+\mu_{i_1}\quad(i_1 = 1,\ldots,p)
$$

follows. Since

$$
\begin{aligned}
\frac{\partial\Phi_p^{(\alpha)}}{\partial t_{i_1}}\Big|_{\mathbf{t}=\mathbf{0}}
&= \alpha^{-1}\sum_{r=1}^{R}\sum_{L_1,\ldots,L_p=0}^{1}(-1)^{L_+}\frac{\partial\Phi_p(\mathbf{c}_r^{*(L)}\mid\mu,\Sigma)}{\partial t_{i_1}}\Big|_{\mathbf{t}=\mathbf{0}}\\
&= \alpha^{-1}\sum_{r=1}^{R}\sum_{L_1,\ldots,L_p=0}^{1}(-1)^{L_+}\sum_{k=1}^{p}\frac{\partial\Phi_p(\mathbf{c}_r^{*(L)}\mid\mu,\Sigma)}{\partial c_{kr}^{*(L_k)}}\frac{\partial c_{kr}^{*(L_k)}}{\partial t_{i_1}}\Big|_{\mathbf{t}=\mathbf{0}}\\
&= \alpha^{-1}\sum_{r=1}^{R}\sum_{L_1,\ldots,L_p=0}^{1}(-1)^{L_+}\sum_{k=1}^{p}(-\sigma_{i_1k})\frac{\partial\Phi_p(\mathbf{c}_r^{*(L)}\mid\mu,\Sigma)}{\partial c_{kr}^{*(L_k)}}\Big|_{\mathbf{t}=\mathbf{0}}\\
&= \alpha^{-1}\sum_{r=1}^{R}\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\sum_{L_1,\ldots,L_{k-1},L_{k+1},\ldots,L_p=0}^{1}(-1)^{L_+-L_k}\,f_{(1)}(c_{kr}^{(L_k)})\\
&= \alpha^{-1}\sum_{r=1}^{R}\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\,f_{(1)}^{(r)}(c_{kr}^{(L_k)})
\end{aligned}
$$

and $c_{kr}^{*(L_k)}|_{\mathbf{t}=\mathbf{0}} = c_{kr}^{(L_k)}$ $(k = 1,\ldots,p;\ r = 1,\ldots,R)$, we obtain the required result. Q.E.D.

Note that

$$
f_{(1)}^{(r)}(c_{kr}^{(L_k)}) = \int_{\mathbf{a}_{r(k)}}^{\mathbf{b}_{r(k)}}\phi_p\{(c_{kr}^{(L_k)},\mathbf{X}_{(k)}^{\mathrm T})^{\mathrm T}\mid\mu,\Sigma\}\,\mathrm{d}\mathbf{X}_{(k)}\quad(k = 1,\ldots,p),
$$

where the elements of $\mu$ and $\Sigma$ are reordered to be compatible with those of $(c_{kr}^{(L_k)},\mathbf{X}_{(k)}^{\mathrm T})$.

For the moments and cumulants higher than the first, we can use

$$
\frac{\partial^s M_{\mathbf{X}}(\mathbf{t})}{\partial t_{i_1}\cdots\partial t_{i_s}}
= \sum_{j=0}^{s}\sum_{(i_1,\ldots,i_j)}^{{}_sC_j}\frac{\partial^j\Phi_p^{(\alpha)}}{\partial t_{i_1}\cdots\partial t_{i_j}}\,\frac{\partial^{s-j}e^{\theta}}{\partial t_{i_{j+1}}\cdots\partial t_{i_s}}\quad(i_j = 1,\ldots,p;\ j = 0,\ldots,s),
$$

where $\sum_{(i_1,\ldots,i_j)}^{{}_sC_j}(\cdot)$ is the sum of ${}_sC_j$ terms choosing $j$ elements from the $s$ elements of $\mathbf{t}$; and when $j = 0$, $\partial^j\Phi_p^{(\alpha)}/\partial t_{i_0} \equiv \Phi_p^{(\alpha)}$. For the moments of the second order,

$$
\frac{\partial^2 M_{\mathbf{X}}(\mathbf{t})}{\partial t_{i_1}\partial t_{i_2}}
= \frac{\partial^2\Phi_p^{(\alpha)}}{\partial t_{i_1}\partial t_{i_2}}
+\sum_{(i_1,i_2)}^{2}\frac{\partial\Phi_p^{(\alpha)}}{\partial t_{i_1}}\frac{\partial e^{\theta}}{\partial t_{i_2}}
+\Phi_p^{(\alpha)}\frac{\partial^2 e^{\theta}}{\partial t_{i_1}\partial t_{i_2}},
$$

with

$$
\frac{\partial e^{\theta}}{\partial t_{i_1}} = \{\mu_{i_1}+(\Sigma\mathbf{t})_{i_1}\}e^{\theta},\qquad
\frac{\partial^2 e^{\theta}}{\partial t_{i_1}\partial t_{i_2}} = \big[\sigma_{i_1i_2}+\{\mu_{i_1}+(\Sigma\mathbf{t})_{i_1}\}\{\mu_{i_2}+(\Sigma\mathbf{t})_{i_2}\}\big]e^{\theta}\quad(i_1,i_2 = 1,\ldots,p).
$$

Since the corresponding results of order higher than the second tend to be complicated, in the following the pseudo mgf $\Phi_p^{(\alpha)}$ of the pseudo random vector $\mathbf{X}_1$ is used. Using a method similar to that in Theorem 1.2 and noting that

$$
E\{(\mathbf{X}_1)_{i_1}(\mathbf{X}_1)_{i_2}\} = \frac{\partial^2\Phi_p^{(\alpha)}}{\partial t_{i_1}\partial t_{i_2}}\Big|_{\mathbf{t}=\mathbf{0}}\quad(i_1,i_2 = 1,\ldots,p)
$$

with $(-1)^{L_k+1+L_l+1} = (-1)^{L_k+L_l}$, we obtain the pseudo second moments as follows:

Lemma 1.6 Define $\bar{c}_{kr}^{(L_k)} = c_{kr}^{(L_k)}-\mu_k$. Then, we have

$$
\begin{aligned}
&E\{(\mathbf{X}_1)_{i_1}(\mathbf{X}_1)_{i_2}\}\\
&= \alpha^{-1}\sum_{r=1}^{R}\left\{\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\sigma_{i_2/k}\,\bar{c}_{kr}^{(L_k)}f_{(1)}^{(r)}(c_{kr}^{(L_k)})
+\sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\sigma_{i_1k}\sigma_{i_2l|k}\,f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\right\}\\
&= \alpha^{-1}\sum_{r=1}^{R}\left\{\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\sigma_{i_2k}\sigma_{kk}^{-1}\,\bar{c}_{kr}^{(L_k)}f_{(1)}^{(r)}(c_{kr}^{(L_k)})\right.\\
&\qquad\left.+\sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}(\sigma_{i_1k}\sigma_{i_2l}-\sigma_{i_1k}\sigma_{i_2k}\sigma_{lk}\sigma_{kk}^{-1})\,f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\right\}\quad(i_1,i_2 = 1,\ldots,p),
\end{aligned}
$$

where

$$
f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)}) = f_{(2)}^{(r)}\{(c_{kr}^{(L_k)},c_{lr}^{(L_l)})^{\mathrm T}\}
= \int_{\mathbf{a}_{r(k,l)}}^{\mathbf{b}_{r(k,l)}}\phi_p\{(\mathbf{c}_{kl,r}^{(L)\mathrm T},\mathbf{X}_{(k,l)}^{\mathrm T})^{\mathrm T}\mid\mu,\Sigma\}\,\mathrm{d}\mathbf{X}_{(k,l)}
= \sum_{(L_k,L_l)=0}^{1}(-1)^{L_+-L_k-L_l}\,f_{(2)}(\mathbf{c}_{kl,r}^{(L)})\quad(k,l = 1,\ldots,p;\ k\ne l),
$$

and

$$
\sum_{(L_k,L_l)=0}^{1}(\cdot) = \sum_{L_1,\ldots,L_{k-1},L_{k+1},\ldots,L_{l-1},L_{l+1},\ldots,L_p=0}^{1}(\cdot)
$$

when $k < l$.


Proof

$$
\begin{aligned}
E\{(\mathbf{X}_1)_{i_1}(\mathbf{X}_1)_{i_2}\}
&= \frac{\partial^2\Phi_p^{(\alpha)}(\mathbf{c}_r^{*(L)}\mid\mu,\Sigma)}{\partial t_{i_1}\partial t_{i_2}}\Big|_{\mathbf{t}=\mathbf{0}}\\
&= \frac{\partial}{\partial t_{i_2}}\,\alpha^{-1}\sum_{r=1}^{R}\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\,f_{(1)}^{(r)}(c_{kr}^{*(L_k)})\Big|_{\mathbf{t}=\mathbf{0}}\\
&= \frac{\partial}{\partial t_{i_2}}\,\alpha^{-1}\sum_{r=1}^{R}\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\int_{\mathbf{a}_{r(k)}^{*}}^{\mathbf{b}_{r(k)}^{*}}\phi_p\{(c_{kr}^{*(L_k)},\mathbf{X}_{(k)}^{\mathrm T})^{\mathrm T}\mid\mu,\Sigma\}\,\mathrm{d}\mathbf{X}_{(k)}\Big|_{\mathbf{t}=\mathbf{0}}\\
&= \alpha^{-1}\sum_{r=1}^{R}\Bigg[\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\sum_{(L_k)=0}^{1}(-1)^{L_+-L_k}\frac{\partial c_{kr}^{*(L_k)}}{\partial t_{i_2}}\frac{\partial}{\partial c_{kr}^{*(L_k)}}\Big[\phi(c_{kr}^{*(L_k)}\mid\mu_k,\sigma_{kk})\\
&\qquad\times\Phi_{p-1}\{\mathbf{c}_{(k)r}^{*(L)}\mid\mu_{(k)}+\sigma_{(k)/k}(c_{kr}^{*(L_k)}-\mu_k),\ \Sigma_{(k,k)|k}\}\Big]\\
&\qquad+\sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+1+L_l}\sigma_{i_1k}\,\phi(c_{kr}^{*(L_k)}\mid\mu_k,\sigma_{kk})\sum_{(L_k,L_l)=0}^{1}(-1)^{L_+-L_k-L_l}\\
&\qquad\times\frac{\partial c_{lr}^{*(L_l)}}{\partial t_{i_2}}\frac{\partial}{\partial c_{lr}^{*(L_l)}}\Phi_{p-1}\{\mathbf{c}_{(k)r}^{*(L)}\mid\mu_{(k)}+\sigma_{(k)/k}(c_{kr}^{*(L_k)}-\mu_k),\ \Sigma_{(k,k)|k}\}\Bigg]\Bigg|_{\mathbf{t}=\mathbf{0}}\\
&= \alpha^{-1}\sum_{r=1}^{R}\Bigg\{\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\sigma_{i_2/k}\,\bar{c}_{kr}^{(L_k)}f_{(1)}^{(r)}(c_{kr}^{(L_k)})
+\sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\sigma_{i_1k}\sigma_{i_2l}\,f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\\
&\qquad-\sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\sigma_{i_1k}(\sigma_{(k)/k})_l\,\sigma_{i_2k}\,f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\Bigg\}\\
&= \alpha^{-1}\sum_{r=1}^{R}\Bigg\{\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\sigma_{i_2/k}\,\bar{c}_{kr}^{(L_k)}f_{(1)}^{(r)}(c_{kr}^{(L_k)})
+\sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\sigma_{i_1k}\sigma_{i_2l|k}\,f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\Bigg\},
\end{aligned}
$$

where $\sum_{(L_k)=0}^{1}(\cdot)$ denotes the sum over the $L$'s other than $L_k$. Q.E.D.

Using $\sigma_{i_1k}\sigma_{i_2l|k} = \sigma_{i_1k}\sigma_{i_2l}-\sigma_{i_1k}\sigma_{i_2k}\sigma_{kk}^{-1}\sigma_{kl}$ in the last expression of Lemma 1.6, we find the symmetric property corresponding to $\partial^2\Phi_p^{(\alpha)}/\partial t_{i_1}\partial t_{i_2} = \partial^2\Phi_p^{(\alpha)}/\partial t_{i_2}\partial t_{i_1}$. The result in Theorem 1.2,

$$
E(\mathbf{X}_1)_{i_1} = E(\mathbf{X})_{i_1}-\mu_{i_1}
= \alpha^{-1}\sum_{r=1}^{R}\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\,f_{(1)}^{(r)}(c_{kr}^{(L_k)})\quad(i_1 = 1,\ldots,p),
$$

gives the following.

Theorem 1.3

$$
\begin{aligned}
\operatorname{cov}(X_{i_1},X_{i_2})
&= \alpha^{-1}\sum_{r=1}^{R}\Bigg\{\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\sigma_{i_2/k}\,\bar{c}_{kr}^{(L_k)}f_{(1)}^{(r)}(c_{kr}^{(L_k)})
+\sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\sigma_{i_1k}\sigma_{i_2l|k}\,f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\Bigg\}\\
&\quad-\alpha^{-2}\Bigg\{\sum_{r=1}^{R}\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\,f_{(1)}^{(r)}(c_{kr}^{(L_k)})\Bigg\}
\Bigg\{\sum_{r=1}^{R}\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_2k}\,f_{(1)}^{(r)}(c_{kr}^{(L_k)})\Bigg\}+\sigma_{i_1i_2}
\end{aligned}
$$

$(i_1,i_2 = 1,\ldots,p)$.
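As a numerical sketch (not from the book; all values hypothetical), Theorems 1.2 and 1.3 can be specialized to $p = 2$, $R = 1$ (a single rectangle), where $f_{(1)}^{(r)}$ and $f_{(2)}^{(r)}$ reduce to univariate normal functions and the bivariate density, and compared with a Monte Carlo rejection sample:

```python
import math
import random
from statistics import NormalDist

# Check of Theorems 1.2 and 1.3 for p = 2, R = 1 (hypothetical values):
# X ~ N_2(mu, Sigma) truncated to the rectangle [a1, b1) x [a2, b2).
mu = [0.3, -0.2]
S = [[1.0, 0.5], [0.5, 0.8]]
a = [-1.0, -1.2]
b = [1.2, 1.0]
phi = lambda z: math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
Phi = NormalDist().cdf

def f1(k, c):
    """f_(1)^{(r)}(c): marginal density of X_k at c times the conditional
    probability that the other coordinate lies in its selection interval."""
    l = 1 - k
    m = mu[l] + S[l][k] / S[k][k] * (c - mu[k])        # conditional mean
    s = math.sqrt(S[l][l] - S[l][k] ** 2 / S[k][k])    # conditional sd
    dens = phi((c - mu[k]) / math.sqrt(S[k][k])) / math.sqrt(S[k][k])
    return dens * (Phi((b[l] - m) / s) - Phi((a[l] - m) / s))

def f2(x):
    """f_(2)^{(r)} = phi_2(x | mu, Sigma) when p = 2."""
    det = S[0][0] * S[1][1] - S[0][1] ** 2
    d1, d2 = x[0] - mu[0], x[1] - mu[1]
    q = (S[1][1] * d1 * d1 - 2 * S[0][1] * d1 * d2 + S[0][0] * d2 * d2) / det
    return math.exp(-q / 2) / (2 * math.pi * math.sqrt(det))

def simpson(g, lo, hi, n=400):
    h = (hi - lo) / n
    return (g(lo) + g(hi)
            + sum((4 if i % 2 else 2) * g(lo + i * h) for i in range(1, n))) * h / 3

alpha = simpson(lambda x: f1(0, x), a[0], b[0])         # normalizer

def EX1(i):
    """E(X_1)_i of Theorem 1.2, with c_k^{(0)} = b_k and c_k^{(1)} = a_k."""
    return sum((-1) ** (L + 1) * S[i][k] * f1(k, c)
               for k in range(2) for L, c in ((0, b[k]), (1, a[k]))) / alpha

def cov_f(i, j):
    """cov(X_i, X_j) of Theorem 1.3."""
    t = 0.0
    for k in range(2):
        for L, c in ((0, b[k]), (1, a[k])):
            t += (-1) ** (L + 1) * S[i][k] * (S[j][k] / S[k][k]) * (c - mu[k]) * f1(k, c)
        l = 1 - k
        s_jl_k = S[j][l] - S[j][k] * S[k][l] / S[k][k]   # sigma_{i2 l | k}
        for Lk, ck in ((0, b[k]), (1, a[k])):
            for Ll, cl in ((0, b[l]), (1, a[l])):
                x = [0.0, 0.0]; x[k] = ck; x[l] = cl
                t += (-1) ** (Lk + Ll) * S[i][k] * s_jl_k * f2(x)
    return t / alpha - EX1(i) * EX1(j) + S[i][j]

# Monte Carlo comparison by rejection sampling.
random.seed(1)
c21 = S[1][0] / S[0][0]
sd2 = math.sqrt(S[1][1] - S[0][1] ** 2 / S[0][0])
pts = []
for _ in range(200000):
    x1 = random.gauss(mu[0], math.sqrt(S[0][0]))
    x2 = mu[1] + c21 * (x1 - mu[0]) + random.gauss(0.0, sd2)
    if a[0] <= x1 < b[0] and a[1] <= x2 < b[1]:
        pts.append((x1, x2))
n = len(pts)
mean_mc = [sum(p[i] for p in pts) / n for i in range(2)]
cov_mc = [[sum((p[i] - mean_mc[i]) * (p[j] - mean_mc[j]) for p in pts) / n
           for j in range(2)] for i in range(2)]
mean_f = [EX1(i) + mu[i] for i in range(2)]
print([round(v, 3) for v in mean_f], [round(v, 3) for v in mean_mc])
print([[round(cov_f(i, j), 3) for j in range(2)] for i in range(2)])
```

Note that the asymmetric-looking formula is numerically symmetric, `cov_f(0, 1) == cov_f(1, 0)`, in accordance with the symmetric property discussed after Lemma 1.6.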

Lemma 1.7

$$
\begin{aligned}
&E\{(\mathbf{X}_1)_{i_1}(\mathbf{X}_1)_{i_2}(\mathbf{X}_1)_{i_3}\}\\
&= \alpha^{-1}\sum_{r=1}^{R}\Bigg[\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\big(\bar{c}_{kr}^{(L_k)2}\sigma_{kk}^{-2}-\sigma_{kk}^{-1}\big)f_{(1)}^{(r)}(c_{kr}^{(L_k)})\\
&\quad+\sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\big(\sigma_{i_1k}\sigma_{i_2/k}\sigma_{i_3l|k}\,\bar{c}_{kr}^{(L_k)}
+\sigma_{i_1k}\sigma_{i_2l|k}\,\sigma_{i_3/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\\
&\quad+\sum_{\substack{k,l,m=1\\ k,l,m:\ne}}^{p}\sum_{L_k,L_l,L_m=0}^{1}(-1)^{L_k+L_l+L_m+1}\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3m|kl}\,f_{(3)}^{(r)}(\mathbf{c}_{klm,r}^{(L)})\Bigg]\quad(i_1,i_2,i_3 = 1,\ldots,p),
\end{aligned}
$$

where undefined notations are defined similarly as before. The corresponding symmetric expression is

$$
\begin{aligned}
&E\{(\mathbf{X}_1)_{i_1}(\mathbf{X}_1)_{i_2}(\mathbf{X}_1)_{i_3}\}\\
&= \alpha^{-1}\sum_{r=1}^{R}\Bigg[\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\big(\bar{c}_{kr}^{(L_k)2}\sigma_{kk}^{-2}-\sigma_{kk}^{-1}\big)f_{(1)}^{(r)}(c_{kr}^{(L_k)})\\
&\quad+\sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\Bigg\{-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\sigma_{kk}^{-2}\sigma_{kl}\,\bar{c}_{kr}^{(L_k)}\\
&\qquad+\Bigg(-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\sigma_{kk}^{-1}\sigma_{kl}+\sum_{(i_1,i_2,i_3)}^{3}\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}\Bigg)\big(\Sigma_{kl,kl}^{-1}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)_{[k]}\Bigg\}f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\\
&\quad+\sum_{\substack{k,l,m=1\\ k,l,m:\ne}}^{p}\sum_{L_k,L_l,L_m=0}^{1}(-1)^{L_k+L_l+L_m+1}\Bigg\{\sigma_{i_1k}\sigma_{i_2l}\sigma_{i_3m}\\
&\qquad+\Bigg(\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\sigma_{kk}^{-1}\sigma_{kl}-\sum_{(i_1,i_2,i_3)}^{3}\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}\Bigg)(\sigma_{m/kl})_{[k]}\Bigg\}f_{(3)}^{(r)}(\mathbf{c}_{klm,r}^{(L)})\Bigg]\quad(i_1,i_2,i_3 = 1,\ldots,p).
\end{aligned}
$$

Proof The asymmetric expression is given by Lemma 1.6. For the correspondence of the asymmetric and symmetric expressions, the term of $f_{(1)}^{(r)}(c_{kr}^{(L_k)})$ is already symmetric. The term of $f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})$ from Lemma 1.6 is

$$
\begin{aligned}
&\sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\big(\sigma_{i_1k}\sigma_{i_2k}\sigma_{kk}^{-1}\sigma_{i_3l|k}\,\bar{c}_{kr}^{(L_k)}
+\sigma_{i_1k}\sigma_{i_2l|k}\,\sigma_{i_3/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\\
&= \sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\Big\{\sigma_{i_1k}\sigma_{i_2k}\sigma_{kk}^{-1}(\sigma_{i_3l}-\sigma_{i_3k}\sigma_{kk}^{-1}\sigma_{kl})\,\bar{c}_{kr}^{(L_k)}\\
&\qquad+(\sigma_{i_1k}\sigma_{i_2l}-\sigma_{i_1k}\sigma_{i_2k}\sigma_{kk}^{-1}\sigma_{kl})(\sigma_{i_3k},\sigma_{i_3l})\Sigma_{kl,kl}^{-1}\bar{\mathbf{c}}_{kl,r}^{(L)}\Big\}f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\\
&= \sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\Bigg\{\sigma_{i_1k}\sigma_{i_2k}\sigma_{kk}^{-1}(\sigma_{i_3l}-\sigma_{i_3k}\sigma_{kk}^{-1}\sigma_{kl})\,\bar{c}_{kr}^{(L_k)}\\
&\qquad+(\sigma_{i_1k}\sigma_{i_2l}-\sigma_{i_1k}\sigma_{i_2k}\sigma_{kk}^{-1}\sigma_{kl})\frac{(\sigma_{i_3k},\sigma_{i_3l})}{\sigma_{kk}\sigma_{ll}-\sigma_{kl}^2}
\begin{pmatrix}\sigma_{ll} & -\sigma_{kl}\\ -\sigma_{lk} & \sigma_{kk}\end{pmatrix}
\begin{pmatrix}\bar{c}_{kr}^{(L_k)}\\ \bar{c}_{lr}^{(L_l)}\end{pmatrix}\Bigg\}f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\\
&= \sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\Big[-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\sigma_{kk}^{-2}\sigma_{kl}\,\bar{c}_{kr}^{(L_k)}
-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\sigma_{kk}^{-1}\sigma_{kl}\big(\Sigma_{kl,kl}^{-1}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)_{[k]}\\
&\qquad+\sigma_{i_1k}\sigma_{i_2l}\sigma_{i_3k}\big(\Sigma_{kl,kl}^{-1}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)_{[k]}
+\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}\frac{\sigma_{ll}\bar{c}_{kr}^{(L_k)}-\sigma_{kl}\bar{c}_{lr}^{(L_l)}}{\sigma_{kk}\sigma_{ll}-\sigma_{kl}^2}
+\sigma_{i_1l}\sigma_{i_2k}\sigma_{i_3k}\big(\Sigma_{kl,kl}^{-1}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)_{[k]}\Big]f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\\
&= \sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\Bigg\{-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\sigma_{kk}^{-2}\sigma_{kl}\,\bar{c}_{kr}^{(L_k)}
-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\sigma_{kk}^{-1}\sigma_{kl}\big(\Sigma_{kl,kl}^{-1}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)_{[k]}\\
&\qquad+\sum_{(i_1,i_2,i_3)}^{3}\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}\big(\Sigma_{kl,kl}^{-1}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)_{[k]}\Bigg\}f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)}),
\end{aligned}
$$

yielding the symmetric term, where

$$
\sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\sigma_{i_1k}\sigma_{i_2l}\sigma_{i_3l}\big(\Sigma_{kl,kl}^{-1}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)_{[l]}f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})
= \sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\sigma_{i_1l}\sigma_{i_2k}\sigma_{i_3k}\big(\Sigma_{kl,kl}^{-1}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)_{[k]}f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})
$$

is used. An alternative derivation is an indirect one. Among the three terms to be derived in $\sum_{(i_1,i_2,i_3)}^{3}\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}(\Sigma_{kl,kl}^{-1}\bar{\mathbf{c}}_{kl,r}^{(L)})_{[k]}$, the term $\sigma_{i_1k}\sigma_{i_2l}\sigma_{i_3k}(\Sigma_{kl,kl}^{-1}\bar{\mathbf{c}}_{kl,r}^{(L)})_{[k]}$ is immediately obtained as above; the remaining two terms are logically obtained due to the symmetry of the derivatives. Note that although this may look like a case of circular reasoning, the logic is valid since the derivation is to find the actual symmetric form rather than to prove the symmetry of partial derivatives.

For the term of $f_{(3)}^{(r)}(\mathbf{c}_{klm,r}^{(L)})$, note that

$$
\begin{aligned}
\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3m|kl}
&= \sigma_{i_1k}(\sigma_{i_2l}-\sigma_{i_2k}\sigma_{kk}^{-1}\sigma_{kl})(\sigma_{i_3m}-\sigma_{i_3,kl}^{\mathrm T}\Sigma_{kl,kl}^{-1}\sigma_{m,kl})\\
&= (\sigma_{i_1k}\sigma_{i_2l}-\sigma_{i_1k}\sigma_{i_2k}\sigma_{kk}^{-1}\sigma_{kl})\{\sigma_{i_3m}-\sigma_{i_3k}(\sigma^{kk},\sigma^{kl})\sigma_{m,kl}-\sigma_{i_3l}(\sigma^{lk},\sigma^{ll})\sigma_{m,kl}\}\\
&= \sigma_{i_1k}\sigma_{i_2l}\sigma_{i_3m}-\sigma_{i_1k}\sigma_{i_2l}\sigma_{i_3k}(\sigma^{kk},\sigma^{kl})\sigma_{m,kl}-\sigma_{i_1k}\sigma_{i_2l}\sigma_{i_3l}(\sigma^{lk},\sigma^{ll})\sigma_{m,kl}\\
&\quad-\{\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3m}\sigma_{kk}^{-1}\sigma_{kl}-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}\sigma_{kk}^{-1}\sigma_{kl}(\sigma^{lk},\sigma^{ll})\sigma_{m,kl}\}\\
&\quad+\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\sigma_{kk}^{-1}\sigma_{kl}(\sigma^{kk},\sigma^{kl})\sigma_{m,kl}.
\end{aligned}
$$

In the above result, the term $-\sigma_{i_1k}\sigma_{i_2l}\sigma_{i_3l}(\sigma^{lk},\sigma^{ll})\sigma_{m,kl}$, after taking the sum over $k$ and $l$ $(k\ne l)$, is equal to that of the term exchanging $k$ and $l$. The term in braces is

$$
\begin{aligned}
&-\{\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3m}\sigma_{kk}^{-1}\sigma_{kl}-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}\sigma_{kk}^{-1}\sigma_{kl}(\sigma^{lk},\sigma^{ll})\sigma_{m,kl}\}\\
&= -\{\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3m}\sigma_{kk}^{-1}\sigma_{kl}-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}\sigma_{kk}^{-1}\sigma_{kl}(\sigma^{lk},\sigma^{ll})\sigma_{m,kl}\}\\
&\qquad+\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}(\sigma^{kk},\sigma^{kl})\sigma_{m,kl}-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}(\sigma^{kk},\sigma^{kl})\sigma_{m,kl}\\
&= -\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3m}\sigma_{kk}^{-1}\sigma_{kl}
+\frac{\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}}{\sigma_{kk}\sigma_{ll}-\sigma_{kl}^2}\{\sigma_{kk}^{-1}\sigma_{kl}(\sigma_{kk}\sigma_{lm}-\sigma_{kl}\sigma_{km})+\sigma_{ll}\sigma_{km}-\sigma_{kl}\sigma_{lm}\}\\
&\qquad-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}(\sigma^{kk},\sigma^{kl})\sigma_{m,kl}\\
&= -\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3m}\sigma_{kk}^{-1}\sigma_{kl}
+\frac{\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}\sigma_{kk}^{-1}\sigma_{km}(\sigma_{kk}\sigma_{ll}-\sigma_{kl}^2)}{\sigma_{kk}\sigma_{ll}-\sigma_{kl}^2}
-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}(\sigma^{kk},\sigma^{kl})\sigma_{m,kl}\\
&= -\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3m}\sigma_{kk}^{-1}\sigma_{kl}+\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}\sigma_{kk}^{-1}\sigma_{km}
-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}(\sigma^{kk},\sigma^{kl})\sigma_{m,kl}.
\end{aligned}
$$

In the above result, the term $-\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3m}\sigma_{kk}^{-1}\sigma_{kl}$, after taking the sum over $l$ and $m$ $(l\ne m)$, is equal to that of the term exchanging $l$ and $m$. Consequently, the sum of the two terms on the right-hand side of the last equation vanishes. Then, it is found that the asymmetric expression is equal to the symmetric expression. An alternative indirect proof is to use the term $\sigma_{i_1k}\sigma_{i_2l}\sigma_{i_3k}(\sigma^{kk},\sigma^{kl})\sigma_{m,kl}$ and extend it to $\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3l}(\sigma^{kk},\sigma^{kl})\sigma_{m,kl}$ and $\sigma_{i_1l}\sigma_{i_2k}\sigma_{i_3k}(\sigma^{kk},\sigma^{kl})\sigma_{m,kl}$, which hold due to the symmetric property of derivatives. Q.E.D.


Theorem 1.4 The third multivariate cumulants of $X_{i_1}$, $X_{i_2}$ and $X_{i_3}$ are

$$
\begin{aligned}
\kappa_3(X_{i_1},X_{i_2},X_{i_3}) &= \kappa_3\{(\mathbf{X}_1)_{i_1},(\mathbf{X}_1)_{i_2},(\mathbf{X}_1)_{i_3}\}\\
&= E\{(\mathbf{X}_1)_{i_1}(\mathbf{X}_1)_{i_2}(\mathbf{X}_1)_{i_3}\}
-\sum_{(i_1,i_2,i_3)}^{3}E\{(\mathbf{X}_1)_{i_1}(\mathbf{X}_1)_{i_2}\}E\{(\mathbf{X}_1)_{i_3}\}\\
&\quad+2E\{(\mathbf{X}_1)_{i_1}\}E\{(\mathbf{X}_1)_{i_2}\}E\{(\mathbf{X}_1)_{i_3}\}\quad(i_1,i_2,i_3 = 1,\ldots,p),
\end{aligned}
$$

where $\sum_{(i_1,i_2,i_3)}^{3}(\cdot)$ indicates the sum of three terms considering the symmetric property for $i_1$, $i_2$ and $i_3$; the expectations are given in Theorem 1.2, Lemma 1.6, and Lemma 1.7.
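As a numerical sketch (not from the book; all values hypothetical), the identity $\kappa_3(X) = \kappa_3(X_1)$ can be checked for $p = 1$: the third pseudo cumulant is obtained by finite differences of the pseudo cgf $K_{X_1}(t) = \ln\{\alpha^{-1}\Pr(a \le X < b \mid \mu + \sigma^2 t)\}$, and the third cumulant of $X$ equals its third central moment, estimated by Monte Carlo:

```python
import math
import random
from statistics import NormalDist

# Illustration of Theorem 1.4 for p = 1 (hypothetical values).
mu, sigma, a, b = 0.2, 1.0, -0.5, 2.0

def K_X1(t):
    """Pseudo cgf K_{X1}(t) = ln Pr(a <= X < b | mu + sigma^2 t) - ln alpha."""
    d = NormalDist(mu + sigma**2 * t, sigma)
    d0 = NormalDist(mu, sigma)
    return math.log((d.cdf(b) - d.cdf(a)) / (d0.cdf(b) - d0.cdf(a)))

h = 0.05
# central-difference approximation of the third derivative at t = 0
k3_pseudo = (K_X1(2*h) - 2*K_X1(h) + 2*K_X1(-h) - K_X1(-2*h)) / (2 * h**3)

# third cumulant of X = its third central moment, by rejection sampling
random.seed(2)
xs = [x for _ in range(400000) if a <= (x := random.gauss(mu, sigma)) < b]
m = sum(xs) / len(xs)
k3_mc = sum((x - m) ** 3 for x in xs) / len(xs)

print(round(k3_pseudo, 3), round(k3_mc, 3))
```

The two estimates agree to roughly two decimal places, consistent with $\kappa_3(X) = \kappa_3(X_1)$ since the third cumulant of the normal factor $X_0$ vanishes.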

Lemma 1.7 gives the following asymmetric result, with undefined notations defined similarly as before.

Lemma 1.8

$$
\begin{aligned}
&E\{(\mathbf{X}_1)_{i_1}(\mathbf{X}_1)_{i_2}(\mathbf{X}_1)_{i_3}(\mathbf{X}_1)_{i_4}\}\\
&= \alpha^{-1}\sum_{r=1}^{R}\Bigg[\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\sigma_{i_4k}\big(\bar{c}_{kr}^{(L_k)3}\sigma_{kk}^{-3}-3\bar{c}_{kr}^{(L_k)}\sigma_{kk}^{-2}\big)f_{(1)}^{(r)}(c_{kr}^{(L_k)})\\
&\quad+\sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\Big\{\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\sigma_{i_4l|k}\big(\bar{c}_{kr}^{(L_k)2}\sigma_{kk}^{-2}-\sigma_{kk}^{-1}\big)\\
&\qquad-\sigma_{i_1k}\sigma_{i_2/k}\sigma_{i_3l|k}\sigma_{i_4k}-\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3/kl}^{\mathrm T}\sigma_{i_4,kl}\\
&\qquad+\big(\sigma_{i_1k}\sigma_{i_2/k}\sigma_{i_3l|k}\,\bar{c}_{kr}^{(L_k)}+\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)\sigma_{i_4/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\Big\}f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\\
&\quad+\sum_{\substack{k,l,m=1\\ k,l,m:\ne}}^{p}\sum_{L_k,L_l,L_m=0}^{1}(-1)^{L_k+L_l+L_m+1}\Big\{\big(\sigma_{i_1k}\sigma_{i_2/k}\sigma_{i_3l|k}\,\bar{c}_{kr}^{(L_k)}+\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)\sigma_{i_4m|kl}\\
&\qquad+\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3m|kl}\,\sigma_{i_4/klm}^{\mathrm T}\bar{\mathbf{c}}_{klm,r}^{(L)}\Big\}f_{(3)}^{(r)}(\mathbf{c}_{klm,r}^{(L)})\\
&\quad+\sum_{\substack{k,l,m,n=1\\ k,l,m,n:\ne}}^{p}\sum_{L_k,L_l,L_m,L_n=0}^{1}(-1)^{L_k+L_l+L_m+L_n}\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3m|kl}\sigma_{i_4n|klm}\,f_{(4)}^{(r)}(\mathbf{c}_{klmn,r}^{(L)})\Bigg]\\
&\qquad(i_1,i_2,i_3,i_4 = 1,\ldots,p).
\end{aligned}
$$


Theorem 1.5 The fourth multivariate cumulants of $X_{i_1}$, $X_{i_2}$, $X_{i_3}$ and $X_{i_4}$ are

$$
\begin{aligned}
\kappa_4(X_{i_1},X_{i_2},X_{i_3},X_{i_4}) &= \kappa_4\{(\mathbf{X}_1)_{i_1},(\mathbf{X}_1)_{i_2},(\mathbf{X}_1)_{i_3},(\mathbf{X}_1)_{i_4}\}\\
&= E\{(\mathbf{X}_1)_{i_1}(\mathbf{X}_1)_{i_2}(\mathbf{X}_1)_{i_3}(\mathbf{X}_1)_{i_4}\}
-\sum^{4}E\{(\mathbf{X}_1)_{i_1}(\mathbf{X}_1)_{i_2}(\mathbf{X}_1)_{i_3}\}E\{(\mathbf{X}_1)_{i_4}\}\\
&\quad-\sum^{3}E\{(\mathbf{X}_1)_{i_1}(\mathbf{X}_1)_{i_2}\}E\{(\mathbf{X}_1)_{i_3}(\mathbf{X}_1)_{i_4}\}\\
&\quad+2\sum^{6}E\{(\mathbf{X}_1)_{i_1}(\mathbf{X}_1)_{i_2}\}E\{(\mathbf{X}_1)_{i_3}\}E\{(\mathbf{X}_1)_{i_4}\}\\
&\quad-6E\{(\mathbf{X}_1)_{i_1}\}E\{(\mathbf{X}_1)_{i_2}\}E\{(\mathbf{X}_1)_{i_3}\}E\{(\mathbf{X}_1)_{i_4}\}\quad(i_1,i_2,i_3,i_4 = 1,\ldots,p),
\end{aligned}
$$

where the sums $\sum^{4}$, $\sum^{3}$ and $\sum^{6}$ are over the distinct index partitions considering the symmetric property for $i_1,\ldots,i_4$; the expectations are given in Theorems 1.2 to 1.4, Remark 1.3, Lemmas 1.6, 1.7, and 1.8. The corresponding symmetric expression becomes complicated and is not shown. The following results, given from Lemma 1.8, are provided for possible use for higher-order cumulants.

Lemma 1.9

$$
\begin{aligned}
&E\{(\mathbf{X}_1)_{i_1}\cdots(\mathbf{X}_1)_{i_5}\}\\
&= \alpha^{-1}\sum_{r=1}^{R}\Bigg[\sum_{k=1}^{p}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{i_1k}\cdots\sigma_{i_5k}\big(\bar{c}_{kr}^{(L_k)4}\sigma_{kk}^{-4}-6\bar{c}_{kr}^{(L_k)2}\sigma_{kk}^{-3}+3\sigma_{kk}^{-2}\big)f_{(1)}^{(r)}(c_{kr}^{(L_k)})\\
&\quad+\sum_{k,l=1,\,k\ne l}^{p}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\Big[\sigma_{i_1k}\cdots\sigma_{i_4k}\sigma_{i_5l|k}\big(\bar{c}_{kr}^{(L_k)3}\sigma_{kk}^{-3}-3\bar{c}_{kr}^{(L_k)}\sigma_{kk}^{-2}\big)\\
&\qquad-2\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\sigma_{i_4l|k}\sigma_{i_5k}\sigma_{kk}^{-2}\,\bar{c}_{kr}^{(L_k)}\\
&\qquad-\big(\sigma_{i_1k}\sigma_{i_2/k}\sigma_{i_3l|k}\sigma_{i_5k}+\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3/kl}^{\mathrm T}\sigma_{i_5,kl}\big)\sigma_{i_4/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\\
&\qquad-\big(\sigma_{i_1k}\sigma_{i_2/k}\sigma_{i_3l|k}\,\bar{c}_{kr}^{(L_k)}+\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)\sigma_{i_4/kl}^{\mathrm T}\sigma_{i_5,kl}\\
&\qquad+\Big\{\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\sigma_{i_4l|k}\big(\bar{c}_{kr}^{(L_k)2}\sigma_{kk}^{-2}-\sigma_{kk}^{-1}\big)
-\sigma_{i_1k}\sigma_{i_2/k}\sigma_{i_3l|k}\sigma_{i_4k}-\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3/kl}^{\mathrm T}\sigma_{i_4,kl}\\
&\qquad\quad+\big(\sigma_{i_1k}\sigma_{i_2/k}\sigma_{i_3l|k}\,\bar{c}_{kr}^{(L_k)}+\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)\sigma_{i_4/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\Big\}\sigma_{i_5/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\Big]f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\\
&\quad+\sum_{\substack{k,l,m=1\\ k,l,m:\ne}}^{p}\sum_{L_k,L_l,L_m=0}^{1}(-1)^{L_k+L_l+L_m+1}\Big[\Big\{\sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\sigma_{i_4l|k}\big(\bar{c}_{kr}^{(L_k)2}\sigma_{kk}^{-2}-\sigma_{kk}^{-1}\big)\\
&\qquad-\sigma_{i_1k}\sigma_{i_2/k}\sigma_{i_3l|k}\sigma_{i_4k}-\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3/kl}^{\mathrm T}\sigma_{i_4,kl}\\
&\qquad+\big(\sigma_{i_1k}\sigma_{i_2/k}\sigma_{i_3l|k}\,\bar{c}_{kr}^{(L_k)}+\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)\sigma_{i_4/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\Big\}\sigma_{i_5m|kl}\\
&\qquad-\big(\sigma_{i_1k}\sigma_{i_2/k}\sigma_{i_3l|k}\sigma_{i_5/k}+\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3/kl}^{\mathrm T}\sigma_{i_5,kl}\big)\sigma_{i_4m|kl}
-\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3m|kl}\,\sigma_{i_4/klm}^{\mathrm T}\sigma_{i_5,klm}\\
&\qquad+\Big\{\big(\sigma_{i_1k}\sigma_{i_2/k}\sigma_{i_3l|k}\,\bar{c}_{kr}^{(L_k)}+\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)\sigma_{i_4m|kl}\\
&\qquad\quad+\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3m|kl}\,\sigma_{i_4/klm}^{\mathrm T}\bar{\mathbf{c}}_{klm,r}^{(L)}\Big\}\sigma_{i_5/klm}^{\mathrm T}\bar{\mathbf{c}}_{klm,r}^{(L)}\Big]f_{(3)}^{(r)}(\mathbf{c}_{klm,r}^{(L)})\\
&\quad+\sum_{\substack{k,l,m,n=1\\ k,l,m,n:\ne}}^{p}\sum_{L_k,L_l,L_m,L_n=0}^{1}(-1)^{L_k+L_l+L_m+L_n}\Big[\Big\{\big(\sigma_{i_1k}\sigma_{i_2/k}\sigma_{i_3l|k}\,\bar{c}_{kr}^{(L_k)}+\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3/kl}^{\mathrm T}\bar{\mathbf{c}}_{kl,r}^{(L)}\big)\sigma_{i_4m|kl}\\
&\qquad+\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3m|kl}\,\sigma_{i_4/klm}^{\mathrm T}\bar{\mathbf{c}}_{klm,r}^{(L)}\Big\}\sigma_{i_5n|klm}
+\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3m|kl}\sigma_{i_4n|klm}\,\sigma_{i_5/klmn}^{\mathrm T}\bar{\mathbf{c}}_{klmn,r}^{(L)}\Big]f_{(4)}^{(r)}(\mathbf{c}_{klmn,r}^{(L)})\\
&\quad+\sum_{\substack{k,l,m,n,q=1\\ k,l,m,n,q:\ne}}^{p}\sum_{L_k,L_l,L_m,L_n,L_q=0}^{1}(-1)^{L_k+L_l+L_m+L_n+L_q+1}\sigma_{i_1k}\sigma_{i_2l|k}\sigma_{i_3m|kl}\sigma_{i_4n|klm}\sigma_{i_5q|klmn}\,f_{(5)}^{(r)}(\mathbf{c}_{klmnq,r}^{(L)})\Bigg]\\
&\qquad(i_1,\ldots,i_5 = 1,\ldots,p).
\end{aligned}
$$

From the results obtained earlier, we have some rules for the computation of

$$
E\{(\mathbf{X}_1)_{i_1}\cdots(\mathbf{X}_1)_{i_s}\} = \frac{\partial^s\Phi_p^{(\alpha)}}{\partial t_{i_1}\cdots\partial t_{i_s}}\Big|_{\mathbf{t}=\mathbf{0}}\quad(i_s = 1,\ldots,p;\ s = 0,1,\ldots),
$$

where $\partial^0\Phi_p^{(\alpha)}/\partial t_{i_0}|_{\mathbf{t}=\mathbf{0}} \equiv \Phi_p^{(\alpha)}|_{\mathbf{t}=\mathbf{0}} = 1$ as addressed before.
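The univariate coefficients multiplying $f_{(1)}^{(r)}$ in the successive derivatives above follow the probabilists' Hermite polynomial pattern, anticipating the closed formula mentioned below (and the topic of Sect. 1.5): the factor at order $j$ is $\sigma_{kk}^{-j/2}\mathrm{He}_j(\bar{z})$ with $\bar{z} = \bar{c}/\sqrt{\sigma_{kk}}$. A small sketch (hypothetical numbers, not from the book) checks this against the explicit factors in Lemmas 1.2 to 1.4, 1.8, 1.9, and 1.11:

```python
# Check that the f_(1)-coefficients at orders 1..5 equal
# sigma_kk^{-j/2} He_j(cbar / sqrt(sigma_kk)), up to the overall signs shown
# in the lemmas; s and cbar are hypothetical values of sigma_kk and c - mu_k.

def he(n, z):
    """Probabilists' Hermite polynomial He_n(z), He_{n+1} = z He_n - n He_{n-1}."""
    h0, h1 = 1.0, z
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, z * h1 - k * h0
    return h1

s, cbar = 1.7, 0.9
z = cbar / s ** 0.5
factors = [
    cbar / s,                                                 # order 1 (Lemma 1.2)
    cbar**2 / s**2 - 1 / s,                                   # order 2 (Lemma 1.3)
    cbar**3 / s**3 - 3 * cbar / s**2,                         # order 3 (Lemmas 1.4, 1.8)
    cbar**4 / s**4 - 6 * cbar**2 / s**3 + 3 / s**2,           # order 4 (Lemma 1.9)
    cbar**5 / s**5 - 10 * cbar**3 / s**4 + 15 * cbar / s**3,  # order 5 (Lemma 1.11)
]
hermite = [s ** (-(j + 1) / 2) * he(j + 1, z) for j in range(5)]
print(all(abs(f - h) < 1e-9 for f, h in zip(factors, hermite)))  # True
```

The recurrence makes the pattern immediate: each differentiation multiplies by $\bar{z}/\sqrt{\sigma_{kk}}$ and subtracts the derivative term, exactly as in the Hermite recursion.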


Lemma 1.10 Suppose that we have derived the following derivative:
$$ \frac{\partial^s\Phi_p^{(\mathbf a)}}{\partial t_{i_1}\cdots\partial t_{i_s}}\Big|_{\mathbf t=\mathbf 0} = a^{-1}\sum_{r=1}^{R}\Bigg[\sum_{k_1=1}^{p}\sum_{L_1=0}^{1}(-1)^{L_1+1}g_{s1}(c_{k_1r}^{(L_1)})\,f_{(1)}^{(r)}(c_{k_1r}^{(L_1)}) + \sum_{\substack{k_1,k_2=1\\ k_1\neq k_2}}^{p}\sum_{L_1,L_2=0}^{1}(-1)^{L_1+L_2}g_{s2}(c_{k_1k_2,r}^{(L)})\,f_{(2)}^{(r)}(c_{k_1k_2,r}^{(L)}) $$
$$ \qquad + \cdots + \sum_{\substack{k_1,\ldots,k_s=1\\ k_1,\ldots,k_s:\neq}}^{p}\sum_{L_1,\ldots,L_s=0}^{1}(-1)^{L_1+\cdots+L_s+s}\,g_{ss}(c_{k_1,\ldots,k_s,r}^{(L)})\,f_{(s)}^{(r)}(c_{k_1,\ldots,k_s,r}^{(L)})\Bigg] \quad (i_u = 1,\ldots,p;\ u = 1,2,\ldots,s). $$
Then, we obtain
$$ \frac{\partial^{s+1}\Phi_p^{(\mathbf a)}}{\partial t_{i_1}\cdots\partial t_{i_{s+1}}}\Big|_{\mathbf t=\mathbf 0} = a^{-1}\sum_{r=1}^{R}\Bigg[\sum_{k_1=1}^{p}\sum_{L_1=0}^{1}(-1)^{L_1+1}\Big\{\frac{\partial g_{s1}(c_{k_1r}^{(L_1)})}{\partial t_{i_{s+1}}} + g_{s1}(c_{k_1r}^{(L_1)})\,\sigma_{i_{s+1}/k_1}c_{k_1r}^{(L_1)}\Big\}\,f_{(1)}^{(r)}(c_{k_1r}^{(L_1)}) $$
$$ + \sum_{u=2}^{s}\ \sum_{\substack{k_1,\ldots,k_u=1\\ k_1,\ldots,k_u:\neq}}^{p}\sum_{L_1,\ldots,L_u=0}^{1}(-1)^{L_1+\cdots+L_u+u}\Big\{g_{s,u-1}(c_{k_1,\ldots,k_{u-1},r}^{(L)})\,\sigma_{i_{s+1}k_u|k_1,\ldots,k_{u-1}} + \frac{\partial g_{su}(c_{k_1,\ldots,k_u,r}^{(L)})}{\partial t_{i_{s+1}}} + g_{su}(c_{k_1,\ldots,k_u,r}^{(L)})\,\boldsymbol\sigma^{\mathsf T}_{i_{s+1}/k_1,\ldots,k_u}\mathbf c_{k_1,\ldots,k_u,r}^{(L)}\Big\}\,f_{(u)}^{(r)}(c_{k_1,\ldots,k_u,r}^{(L)}) $$
$$ + \sum_{\substack{k_1,\ldots,k_{s+1}=1\\ k_1,\ldots,k_{s+1}:\neq}}^{p}\sum_{L_1,\ldots,L_{s+1}=0}^{1}(-1)^{L_1+\cdots+L_{s+1}+s+1}\,\sigma_{i_1k_1}\sigma_{i_2k_2|k_1}\cdots\sigma_{i_{s+1}k_{s+1}|k_1,\ldots,k_s}\,f_{(s+1)}^{(r)}(c_{k_1,\ldots,k_{s+1},r}^{(L)})\Bigg]. $$

Proof The terms except the last one in brackets are given by construction. The last term with the factor $f_{(s+1)}^{(r)}(c_{k_1,\ldots,k_{s+1},r}^{(L)})$ is obtained by induction. Q.E.D.

The closed formula for the first term using the Hermite polynomials will be given later. An application of Lemma 1.10 is given below.

Lemma 1.11 For $E\{(X_1)_{i_1}\cdots(X_1)_{i_6}\}$ $(i_1,\ldots,i_6 = 1,\ldots,p)$, let
$$ \frac{\partial^6\Phi_p^{(\mathbf a)}}{\partial t_{i_1}\cdots\partial t_{i_6}}\Big|_{\mathbf t=\mathbf 0} = a^{-1}\sum_{r=1}^{R}\sum_{u=1}^{6}\ \sum_{\substack{k_1,\ldots,k_u=1\\ k_1,\ldots,k_u:\neq}}^{p}\sum_{L_1,\ldots,L_u=0}^{1}(-1)^{L_1+\cdots+L_u+u}\,g_{6u}(c_{k_1,\ldots,k_u,r}^{(L)})\,f_{(u)}^{(r)}(c_{k_1,\ldots,k_u,r}^{(L)}). $$

Then, we have
$$ g_{61}(c_{k_1r}^{(L_1)}) = \sigma_{i_1k_1}\cdots\sigma_{i_6k_1}\big(c_{k_1r}^{(L_1)5}\sigma_{k_1k_1}^{-5} - 10\,c_{k_1r}^{(L_1)3}\sigma_{k_1k_1}^{-4} + 15\,c_{k_1r}^{(L_1)}\sigma_{k_1k_1}^{-3}\big), $$
$$ \frac{\partial g_{52}(c_{k_1k_2,r}^{(L)})}{\partial t_{i_6}} = \sigma_{i_1k_1}\cdots\sigma_{i_4k_1}\sigma_{i_5k_2|k_1}\sigma_{i_6k_1}\big(3c_{k_1r}^{(L_1)2}\sigma_{k_1k_1}^{-3} - 3\sigma_{k_1k_1}^{-2}\big) + 2\sigma_{i_1k_1}\sigma_{i_2k_1}\sigma_{i_3k_1}\sigma_{i_4k_2|k_1}\sigma_{i_5k_1}\sigma_{i_6k_1}\sigma_{k_1k_1}^{-2} $$
$$ + (\sigma_{i_1k_1}\sigma_{i_2/k_1}\sigma_{i_3k_2|k_1}\sigma_{i_5k_1} + \sigma_{i_1k_1}\sigma_{i_2k_2|k_1}\boldsymbol\sigma^{\mathsf T}_{i_3/k_1k_2}\boldsymbol\sigma_{i_5,k_1k_2})\boldsymbol\sigma^{\mathsf T}_{i_4/k_1k_2}\boldsymbol\sigma_{i_6,k_1k_2} + (\sigma_{i_1k_1}\sigma_{i_2/k_1}\sigma_{i_3k_2|k_1}\sigma_{i_6k_1} + \sigma_{i_1k_1}\sigma_{i_2k_2|k_1}\boldsymbol\sigma^{\mathsf T}_{i_3/k_1k_2}\boldsymbol\sigma_{i_6,k_1k_2})\boldsymbol\sigma^{\mathsf T}_{i_4/k_1k_2}\boldsymbol\sigma_{i_5,k_1k_2} $$
$$ - \big\{2\sigma_{i_1k_1}\sigma_{i_2k_1}\sigma_{i_3k_1}\sigma_{i_4k_2|k_1}\sigma_{i_6k_1}\sigma_{k_1k_1}^{-2}c_{k_1r}^{(L_1)} + (\sigma_{i_1k_1}\sigma_{i_2/k_1}\sigma_{i_3k_2|k_1}\sigma_{i_6k_1} + \sigma_{i_1k_1}\sigma_{i_2k_2|k_1}\boldsymbol\sigma^{\mathsf T}_{i_3/k_1k_2}\boldsymbol\sigma_{i_6,k_1k_2})\boldsymbol\sigma^{\mathsf T}_{i_4/k_1k_2}\mathbf c_{k_1k_2,r}^{(L)} $$
$$ + (\sigma_{i_1k_1}\sigma_{i_2/k_1}\sigma_{i_3k_2|k_1}c_{k_1r}^{(L_1)} + \sigma_{i_1k_1}\sigma_{i_2k_2|k_1}\boldsymbol\sigma^{\mathsf T}_{i_3/k_1k_2}\mathbf c_{k_1k_2,r}^{(L)})\boldsymbol\sigma^{\mathsf T}_{i_4/k_1k_2}\boldsymbol\sigma_{i_6,k_1k_2}\big\}\boldsymbol\sigma^{\mathsf T}_{i_5/k_1k_2}\mathbf c_{k_1k_2,r}^{(L)} - g_{42}(c_{k_1k_2,r}^{(L)})\boldsymbol\sigma^{\mathsf T}_{i_5/k_1k_2}\boldsymbol\sigma_{i_6,k_1k_2}, $$
$$ \frac{\partial g_{53}(c_{k_1k_2k_3,r}^{(L)})}{\partial t_{i_6}} = -\big\{2\sigma_{i_1k_1}\sigma_{i_2k_1}\sigma_{i_3k_1}\sigma_{i_4k_2|k_1}\sigma_{i_6k_1}\sigma_{k_1k_1}^{-2}c_{k_1r}^{(L_1)} + (\sigma_{i_1k_1}\sigma_{i_2/k_1}\sigma_{i_3k_2|k_1}\sigma_{i_6k_1} + \sigma_{i_1k_1}\sigma_{i_2k_2|k_1}\boldsymbol\sigma^{\mathsf T}_{i_3/k_1k_2}\boldsymbol\sigma_{i_6,k_1k_2})\boldsymbol\sigma^{\mathsf T}_{i_4/k_1k_2}\mathbf c_{k_1k_2,r}^{(L)} $$
$$ + (\sigma_{i_1k_1}\sigma_{i_2/k_1}\sigma_{i_3k_2|k_1}c_{k_1r}^{(L_1)} + \sigma_{i_1k_1}\sigma_{i_2k_2|k_1}\boldsymbol\sigma^{\mathsf T}_{i_3/k_1k_2}\mathbf c_{k_1k_2,r}^{(L)})\boldsymbol\sigma^{\mathsf T}_{i_4/k_1k_2}\boldsymbol\sigma_{i_6,k_1k_2}\big\}\sigma_{i_5k_3|k_1k_2} $$
$$ - \big\{(\sigma_{i_1k_1}\sigma_{i_2/k_1}\sigma_{i_3k_2|k_1}\sigma_{i_6k_1} + \sigma_{i_1k_1}\sigma_{i_2k_2|k_1}\boldsymbol\sigma^{\mathsf T}_{i_3/k_1k_2}\boldsymbol\sigma_{i_6,k_1k_2})\sigma_{i_4k_3|k_1k_2} + \sigma_{i_1k_1}\sigma_{i_2k_2|k_1}\sigma_{i_3k_3|k_1k_2}\boldsymbol\sigma^{\mathsf T}_{i_4/k_1k_2k_3}\boldsymbol\sigma_{i_6,k_1k_2k_3}\big\}\boldsymbol\sigma^{\mathsf T}_{i_5/k_1k_2k_3}\mathbf c_{k_1k_2k_3,r}^{(L)} - g_{43}(c_{k_1k_2k_3,r}^{(L)})\boldsymbol\sigma^{\mathsf T}_{i_5/k_1k_2k_3}\boldsymbol\sigma_{i_6,k_1k_2k_3}, $$
$$ \frac{\partial g_{54}(c_{k_1k_2k_3k_4,r}^{(L)})}{\partial t_{i_6}} = -\big\{(\sigma_{i_1k_1}\sigma_{i_2/k_1}\sigma_{i_3k_2|k_1}\sigma_{i_6k_1} + \sigma_{i_1k_1}\sigma_{i_2k_2|k_1}\boldsymbol\sigma^{\mathsf T}_{i_3/k_1k_2}\boldsymbol\sigma_{i_6,k_1k_2})\sigma_{i_4k_3|k_1k_2} + \sigma_{i_1k_1}\sigma_{i_2k_2|k_1}\sigma_{i_3k_3|k_1k_2}\boldsymbol\sigma^{\mathsf T}_{i_4/k_1k_2k_3}\boldsymbol\sigma_{i_6,k_1k_2k_3}\big\}\sigma_{i_5k_4|k_1k_2k_3} $$
$$ - \sigma_{i_1k_1}\sigma_{i_2k_2|k_1}\sigma_{i_3k_3|k_1k_2}\sigma_{i_4k_4|k_1k_2k_3}\boldsymbol\sigma^{\mathsf T}_{i_5/k_1k_2k_3k_4}\boldsymbol\sigma_{i_6,k_1k_2k_3k_4}, \qquad \frac{\partial g_{55}(c_{k_1k_2k_3k_4k_5,r}^{(L)})}{\partial t_{i_6}} = 0. $$

1.5 The Product Sum of Natural Numbers and the Hermite Polynomials

As shown in the previous section, $g_{s1}(c_{k_1r}^{(L_1)})$ is associated with the probabilist's Hermite polynomial, which is given up to the ninth order as follows:

$$ h_0 = 1,\quad h_1 = x,\quad h_2 = x^2-1,\quad h_3 = x^3-3x,\quad h_4 = x^4-6x^2+3, $$
$$ h_5 = x^5-10x^3+15x,\quad h_6 = x^6-15x^4+45x^2-15,\quad h_7 = x^7-21x^5+105x^3-105x, $$
$$ h_8 = x^8-28x^6+210x^4-420x^2+105,\quad h_9 = x^9-36x^7+378x^5-1260x^3+945x. $$
Probably, the most popular definition of the Hermite polynomial is given by
$$ h_s = h_s(x) = \frac{(-1)^s}{\phi(x)}\frac{{\rm d}^s\phi(x)}{{\rm d}x^s} = (-1)^s\exp\Big(\frac{x^2}{2}\Big)\frac{{\rm d}^s}{{\rm d}x^s}\exp\Big(-\frac{x^2}{2}\Big) \quad (s = 0,1,\ldots), $$

where $\phi(x) = \phi_1(x\,|\,0,1)$ and ${\rm d}^0\phi(x)/{\rm d}x^0 \equiv \phi(x)$. Recall that when $p = 1$, $R = 1$, $a_1 = -\infty$, $b_1 = x$, $\mu = 0$ and $\Sigma = \sigma^2 = 1$, the mgf of the STN-distributed variable $X$ becomes as simple as
$$ M_X(t) = \frac{\Phi(x-t)}{\Phi(x)}\exp(t^2/2). $$
Consider the associated pseudo variable. Then, noting that $h_s(x)$ is also given by
$$ h_s = h_s(x) = \frac{1}{\phi(x)}\,\frac{{\rm d}^s\phi(x-t)}{{\rm d}t^s}\Big|_{t=0}, $$
the moments
$$ E(X_1^s) = \frac{{\rm d}^s}{{\rm d}t^s}\,\frac{\Phi(x-t)}{\Phi(x)}\Big|_{t=0} \quad (s = 0,1,\ldots) $$
are written as
$$ E(X_1^0) = E(1) = \frac{{\rm d}^0}{{\rm d}t^0}\,\frac{\Phi(x-t)}{\Phi(x)}\Big|_{t=0} = \frac{\Phi(x-t)}{\Phi(x)}\Big|_{t=0} = 1 $$

as expected,
$$ E(X_1) = \frac{{\rm d}}{{\rm d}t}\,\frac{\Phi(x-t)}{\Phi(x)}\Big|_{t=0} = -\frac{\phi(x-t)}{\Phi(x)}\Big|_{t=0} = -\frac{\phi(x)}{\Phi(x)} $$
and
$$ E(X_1^s) = \frac{{\rm d}^s}{{\rm d}t^s}\,\frac{\Phi(x-t)}{\Phi(x)}\Big|_{t=0} = -\frac{{\rm d}^{s-1}}{{\rm d}t^{s-1}}\,\frac{\phi(x-t)}{\Phi(x)}\Big|_{t=0} = -\frac{h_{s-1}(x)\phi(x)}{\Phi(x)} \quad (s = 2,3,\ldots). $$
The negative sign when $s \ge 2$ is due to $(-1)^{L_1+1} = -1$ (see Lemma 1.10), which holds when $c_{1r}^{(L_1)} = c_{11}^{(L_1)} = a_{11}^{L_1}b_{11}^{1-L_1} = b_{11}$ with $L_1 = 0$, while $a_{11} - t = -\infty$ corresponding to $L_1 = 1$ does not contribute to the derivative with respect to $t$. The above result shows that in the simple case, using $h_0(x) = 1$, we obtain
$$ E(X_1^s) = -\frac{h_{s-1}(x)\phi(x)}{\Phi(x)} \quad (s = 1,2,\ldots), \eqno(1.2) $$
which is proportional to $h_{s-1}(x)$. When $s = 1$, we have
$$ E(X) = E(X_1) + \mu = -\frac{\phi(x)}{\Phi(x)} + \mu, $$

which shows a negative shift of the mean from $\mu$ due to upper-tail truncation.

Lemma 1.12 When $p = 1$, we have
$$ E(X_1^s) = a^{-1}\sigma^s\sum_{r=1}^{R}\sum_{L_1=0}^{1}(-1)^{L_1+1}\,h_{s-1}\!\Big(\frac{c_{1r}^{(L_1)}-\mu}{\sigma}\Big)\,\phi\!\Big(\frac{c_{1r}^{(L_1)}-\mu}{\sigma}\Big) \quad (s = 1,2,\ldots), $$
where $a = \sum_{r=1}^{R}\big\{\Phi\big(\frac{b_{1r}-\mu}{\sigma}\big) - \Phi\big(\frac{a_{1r}-\mu}{\sigma}\big)\big\}$.
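Lemma 1.12 can be checked numerically. The sketch below (an illustration with ad hoc function names, not the book's R code) evaluates the Hermite-polynomial formula for the pseudo-moment $E(X_1^s)$ and compares it with an $s$-th order central finite difference of the univariate normalizer part of the STN mgf, in which the standardized selection points are shifted by $\sigma t$; it also checks the simple upper-truncation case (1.2).

```python
import math
from math import comb

def phi(x):  # standard normal pdf
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):  # standard normal cdf via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def hermite(s, x):
    # probabilist's Hermite polynomial via h_{n+1} = x h_n - n h_{n-1}
    h0, h1 = 1.0, x
    if s == 0:
        return h0
    for n in range(1, s):
        h0, h1 = h1, x * h1 - n * h0
    return h1

def pseudo_moment(s, mu, sigma, a, b):
    """E(X_1^s) of Lemma 1.12 for X ~ N(mu, sigma^2; a, b)."""
    ab = [((aj - mu) / sigma, (bj - mu) / sigma) for aj, bj in zip(a, b)]
    abar = sum(Phi(bb) - Phi(aa) for aa, bb in ab)
    tot = 0.0
    for aa, bb in ab:
        for L, c in ((0, bb), (1, aa)):  # c_{1r}^{(L)} = a^L b^{1-L}
            if math.isinf(c):
                continue  # infinite selection points do not contribute
            tot += (-1) ** (L + 1) * hermite(s - 1, c) * phi(c)
    return sigma ** s * tot / abar

def normalizer(t, mu, sigma, a, b):
    # Phi-part of the STN mgf: selection points shifted by sigma^2 t, standardized
    num = sum(Phi((bj - mu) / sigma - sigma * t) - Phi((aj - mu) / sigma - sigma * t)
              for aj, bj in zip(a, b))
    den = sum(Phi((bj - mu) / sigma) - Phi((aj - mu) / sigma) for aj, bj in zip(a, b))
    return num / den

def numeric_derivative(s, mu, sigma, a, b, h=5e-3):
    # s-th central finite difference of the normalizer at t = 0
    f = lambda t: normalizer(t, mu, sigma, a, b)
    return sum((-1) ** k * comb(s, k) * f((s / 2 - k) * h) for k in range(s + 1)) / h ** s
```

For the single upper-truncation case ($R = 1$, $a_1 = -\infty$), `pseudo_moment(1, 0, 1, [-inf], [x])` reproduces $-\phi(x)/\Phi(x)$ of (1.2).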


Proof Note that when $p = 1$, Lemma 1.10 gives
$$ E(X_1^s) = \frac{\partial^s\Phi_p^{(\mathbf a)}}{\partial t_{i_1}\cdots\partial t_{i_s}}\Big|_{\mathbf t=\mathbf 0} = \frac{\partial^s\Phi^{(\mathbf a)}}{\partial t^s}\Big|_{t=0} = a^{-1}\sum_{r=1}^{R}\sum_{k_1=1}^{p}\sum_{L_1=0}^{1}(-1)^{L_1+1}g_{s1}(c_{k_1r}^{(L_1)})\,f_{(1)}^{(r)}(c_{k_1r}^{(L_1)}) $$
$$ = a^{-1}\sum_{r=1}^{R}\sum_{L_1=0}^{1}(-1)^{L_1+1}g_{s1}(c_{1r}^{(L_1)}\,|\,\mu,\sigma^2)\,f_{(1)}^{(r)}(c_{1r}^{(L_1)}\,|\,\mu,\sigma^2) $$
$$ = a^{-1}\sigma^s\sum_{r=1}^{R}\sum_{L_1=0}^{1}(-1)^{L_1+1}g_{s1}\!\Big(\frac{c_{1r}^{(L_1)}-\mu}{\sigma}\,\Big|\,0,1\Big)\,f_{(1)}^{(r)}\!\Big(\frac{c_{1r}^{(L_1)}-\mu}{\sigma}\,\Big|\,0,1\Big) $$
$$ = a^{-1}\sigma^s\sum_{r=1}^{R}\sum_{L_1=0}^{1}(-1)^{L_1+1}h_{s-1}\!\Big(\frac{c_{1r}^{(L_1)}-\mu}{\sigma}\Big)\,\phi\!\Big(\frac{c_{1r}^{(L_1)}-\mu}{\sigma}\Big), $$
where $t_{i_1} = t_1 = t$; $g_{s1}(\cdot\,|\,\mu,\sigma^2)$ and $g_{s1}(\cdot\,|\,0,1)$ are used to show that they are functions using different parameters;
$$ a = \sum_{r=1}^{R}\big\{\Phi(b_{1r}\,|\,\mu,\sigma^2) - \Phi(a_{1r}\,|\,\mu,\sigma^2)\big\} = \sum_{r=1}^{R}\Big\{\Phi\Big(\frac{b_{1r}-\mu}{\sigma}\,\Big|\,0,1\Big) - \Phi\Big(\frac{a_{1r}-\mu}{\sigma}\,\Big|\,0,1\Big)\Big\} = \sum_{r=1}^{R}\Big\{\Phi\Big(\frac{b_{1r}-\mu}{\sigma}\Big) - \Phi\Big(\frac{a_{1r}-\mu}{\sigma}\Big)\Big\}; $$
and the validity of use of the Hermite polynomial is given by the simple result shown earlier (see (1.2)) with scaling and extension to the case $R \ge 1$, which holds since $a\,E(X_1^s)$ is the sum of $R$ terms each of which can be expressed using the Hermite polynomial. Q.E.D.

In Lemma 1.10, let
$$ g_{s1}(c_{k_1r}^{(L_1)}) = g_{s1}(c_{kr}^{(L)}) = g_{s1} \quad (s = 1,2,\ldots;\ k_1 \equiv k = 1,\ldots,p;\ L_1 \equiv L = 0,1) $$
for simplicity of notation. Then, by construction, we obtain


$$ g_{11} = \sigma_{i_1k},\qquad g_{21} = \sigma_{i_1k}\sigma_{i_2k}\,\sigma_{kk}^{-1}c_{kr}^{(L)},\qquad g_{31} = \sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\big(c_{kr}^{(L)2}\sigma_{kk}^{-2} - \sigma_{kk}^{-1}\big), $$
$$ g_{41} = \sigma_{i_1k}\cdots\sigma_{i_4k}\big\{c_{kr}^{(L)3}\sigma_{kk}^{-3} - (1+2)\,c_{kr}^{(L)}\sigma_{kk}^{-2}\big\}, $$
$$ g_{51} = \sigma_{i_1k}\cdots\sigma_{i_5k}\big\{c_{kr}^{(L)4}\sigma_{kk}^{-4} - (1+2+3)\,c_{kr}^{(L)2}\sigma_{kk}^{-3} + (1+2)\,\sigma_{kk}^{-2}\big\}, $$
$$ g_{61} = \sigma_{i_1k}\cdots\sigma_{i_6k}\big\{c_{kr}^{(L)5}\sigma_{kk}^{-5} - (1+\cdots+4)\,c_{kr}^{(L)3}\sigma_{kk}^{-4} + \{1+2+2(1+2+3)\}\,c_{kr}^{(L)}\sigma_{kk}^{-3}\big\}, $$
$$ g_{71} = \sigma_{i_1k}\cdots\sigma_{i_7k}\big\{c_{kr}^{(L)6}\sigma_{kk}^{-6} - (1+\cdots+5)\,c_{kr}^{(L)4}\sigma_{kk}^{-5} + \{1+2+2(1+2+3)+3(1+\cdots+4)\}\,c_{kr}^{(L)2}\sigma_{kk}^{-4} - \{1+2+2(1+2+3)\}\,\sigma_{kk}^{-3}\big\}, $$
$$ g_{81} = \sigma_{i_1k}\cdots\sigma_{i_8k}\big\{c_{kr}^{(L)7}\sigma_{kk}^{-7} - (1+\cdots+6)\,c_{kr}^{(L)5}\sigma_{kk}^{-6} + \{1+2+2(1+2+3)+3(1+\cdots+4)+4(1+\cdots+5)\}\,c_{kr}^{(L)3}\sigma_{kk}^{-5} $$
$$ \qquad - \big[\{1+2+2(1+2+3)\} + 2\{1+2+2(1+2+3)+3(1+\cdots+4)\}\big]\,c_{kr}^{(L)}\sigma_{kk}^{-4}\big\}, $$
$$ g_{91} = \sigma_{i_1k}\cdots\sigma_{i_9k}\big\{c_{kr}^{(L)8}\sigma_{kk}^{-8} - (1+\cdots+7)\,c_{kr}^{(L)6}\sigma_{kk}^{-7} + \{1+2+2(1+2+3)+3(1+\cdots+4)+4(1+\cdots+5)+5(1+\cdots+6)\}\,c_{kr}^{(L)4}\sigma_{kk}^{-6} $$
$$ \qquad - \big[1+2+2(1+2+3) + 2\{1+2+2(1+2+3)+3(1+\cdots+4)\} + 3\{1+2+2(1+2+3)+3(1+\cdots+4)+4(1+\cdots+5)\}\big]\,c_{kr}^{(L)2}\sigma_{kk}^{-5} $$
$$ \qquad + \big[\{1+2+2(1+2+3)\} + 2\{1+2+2(1+2+3)+3(1+\cdots+4)\}\big]\,\sigma_{kk}^{-4}\big\} \quad (i_1,\ldots,i_9,\ k = 1,\ldots,p;\ L = 0,1;\ r = 1,\ldots,R), $$


where the term with $\sigma_{kk}^{-4}$ in large brackets is rewritten:
$$ \big[\{1+2+2(1+2+3)\} + 2\{1+2+2(1+2+3)+3(1+\cdots+4)\}\big]\sigma_{kk}^{-4} = \big[1\{1(1+2)+2(1+2+3)\} + 2\{1(1+2)+2(1+2+3)+3(1+\cdots+4)\}\big]\sigma_{kk}^{-4}, $$
indicating a three-fold product sum. The above results are summarized as
$$ g_{11} = \sigma_{i_1k},\qquad g_{21} = \sigma_{i_1k}\sigma_{i_2k}\,\sigma_{kk}^{-1}c_{kr}^{(L)},\qquad g_{31} = \sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\big(c_{kr}^{(L)2}\sigma_{kk}^{-2} - \sigma_{kk}^{-1}\big), $$
$$ g_{41} = \sigma_{i_1k}\cdots\sigma_{i_4k}\big(c_{kr}^{(L)3}\sigma_{kk}^{-3} - 3c_{kr}^{(L)}\sigma_{kk}^{-2}\big),\qquad g_{51} = \sigma_{i_1k}\cdots\sigma_{i_5k}\big(c_{kr}^{(L)4}\sigma_{kk}^{-4} - 6c_{kr}^{(L)2}\sigma_{kk}^{-3} + 3\sigma_{kk}^{-2}\big), $$
$$ g_{61} = \sigma_{i_1k}\cdots\sigma_{i_6k}\big(c_{kr}^{(L)5}\sigma_{kk}^{-5} - 10c_{kr}^{(L)3}\sigma_{kk}^{-4} + 15c_{kr}^{(L)}\sigma_{kk}^{-3}\big),\qquad g_{71} = \sigma_{i_1k}\cdots\sigma_{i_7k}\big(c_{kr}^{(L)6}\sigma_{kk}^{-6} - 15c_{kr}^{(L)4}\sigma_{kk}^{-5} + 45c_{kr}^{(L)2}\sigma_{kk}^{-4} - 15\sigma_{kk}^{-3}\big), $$
$$ g_{81} = \sigma_{i_1k}\cdots\sigma_{i_8k}\big(c_{kr}^{(L)7}\sigma_{kk}^{-7} - 21c_{kr}^{(L)5}\sigma_{kk}^{-6} + 105c_{kr}^{(L)3}\sigma_{kk}^{-5} - 105c_{kr}^{(L)}\sigma_{kk}^{-4}\big), $$
$$ g_{91} = \sigma_{i_1k}\cdots\sigma_{i_9k}\big(c_{kr}^{(L)8}\sigma_{kk}^{-8} - 28c_{kr}^{(L)6}\sigma_{kk}^{-7} + 210c_{kr}^{(L)4}\sigma_{kk}^{-6} - 420c_{kr}^{(L)2}\sigma_{kk}^{-5} + 105\sigma_{kk}^{-4}\big) \quad (i_1,\ldots,i_9,\ k = 1,\ldots,p;\ L = 0,1;\ r = 1,\ldots,R). $$
It is found that the last results give
$$ g_{s1}|_{\sigma_{kk}=1} = \sigma_{i_1k}\cdots\sigma_{i_sk}\,h_{s-1}(c_{kr}^{(L)}) \quad (s = 1,\ldots,9;\ k = 1,\ldots,p;\ L = 0,1). $$
Definition 1.2 The $j$-th order product sum of natural numbers (psnn) up to $i$, denoted by $i{:}^j$, is recursively defined as
$$ i{:}^0 = 1,\qquad i{:}^1 = i{:} = \sum_{m=1}^{i} m, $$
$$ i{:}^2 = 1\cdot(2{:}) + 2\cdot(3{:}) + \cdots + (i-1)(i{:}) + i\{(i+1){:}\},\qquad i{:}^3 = 1\cdot 2{:}^2 + 2\cdot 3{:}^2 + \cdots + (i-1)\,i{:}^2 + i\,(i+1){:}^2, $$
$$ \vdots $$
$$ i{:}^j = 1\cdot 2{:}^{j-1} + 2\cdot 3{:}^{j-1} + \cdots + (i-1)\,i{:}^{j-1} + i\,(i+1){:}^{j-1} \quad (i = 1,2,\ldots;\ j = 0,1,\ldots). $$

From the last recursive definition, we have
$$ i{:}^j = 1\cdot 2{:}^{j-1} + 2\cdot 3{:}^{j-1} + \cdots + (i-1)\,i{:}^{j-1} + i\,(i+1){:}^{j-1} = (i-1){:}^j + i\,(i+1){:}^{j-1} = (i-2){:}^j + (i-1)\,i{:}^{j-1} + i\,(i+1){:}^{j-1} $$
$$ = \cdots = 1{:}^j + 2\cdot 3{:}^{j-1} + \cdots + i\,(i+1){:}^{j-1} \quad (i = 1,2,\ldots;\ j = 1,2,\ldots), $$
where $1\cdot 2{:}^{j-1} = 1{:}^j$ is used. Some examples of the psnn are given:
$$ 1{:} = 1,\qquad 1{:}^2 = 2{:} = 1+2 = 3,\qquad 1{:}^3 = 2{:}^2 = 1(1+2) + 2(1+2+3) = 15, $$
$$ 1{:}^4 = 2{:}^3 = 1\{1(1+2)+2(1+2+3)\} + 2\{1(1+2)+2(1+2+3)+3(1+\cdots+4)\} = 1\cdot 15 + 2\cdot 45 = 105, $$
$$ 1{:}^5 = 2{:}^4 = 1\cdot 2{:}^3 + 2\cdot 3{:}^3 = 1\cdot 105 + 2(15 + 2\cdot 45 + 3\cdot 105) = 945, $$
$$ 3{:} = 1+2+3 = 6,\qquad 3{:}^2 = 1(1+2)+2(1+2+3)+3(1+\cdots+4) = 45, $$
$$ 3{:}^3 = 1\cdot 2{:}^2 + 2\cdot 3{:}^2 + 3\cdot 4{:}^2 = 1\cdot 15 + 2\cdot 45 + 3\cdot 105 = 420, $$

$$ 3{:}^4 = 1\cdot 2{:}^3 + 2\cdot 3{:}^3 + 3\cdot 4{:}^3 = 1\cdot 105 + 2\cdot 420 + 3(420 + 4\cdot 210) = 4725, $$
$$ 4{:} = 1+\cdots+4 = 10,\qquad 4{:}^2 = 1(1+2)+2(1+2+3)+3(1+\cdots+4)+4(1+\cdots+5) = 105,\qquad 4{:}^3 = 420 + 4\cdot 210 = 1260\ \ (\hbox{given in }3{:}^4), $$
$$ 5{:} = 1+\cdots+5 = 15,\qquad 5{:}^2 = 4{:}^2 + 5\cdot 6{:} = 105 + 5\cdot 21 = 210, $$
$$ 5{:}^3 = 4{:}^3 + 5\cdot 6{:}^2 = 1260 + 5(3 + 12 + 3\cdot 10 + 4\cdot 15 + 5\cdot 21 + 6\cdot 28) = 1260 + 5\cdot 378 = 3150. $$
Comparing the summarized results before Definition 1.2 and the corresponding previous ones, it is found that the absolute values of the fixed integers, e.g., 1, 6 and 3 in $g_{51} = \sigma_{i_1k}\cdots\sigma_{i_5k}(c_{kr}^{(L)4}\sigma_{kk}^{-4} - 6c_{kr}^{(L)2}\sigma_{kk}^{-3} + 3\sigma_{kk}^{-2})$, are given by the psnn's.
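The recursion of Definition 1.2 can be coded directly; a minimal sketch (illustrative names, not the book's code) that reproduces the worked examples above, and also checks the closed form of Lemma 1.13 below:

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def psnn(i, j):
    """j-th order product sum of natural numbers up to i (Definition 1.2):
    i:^j = 1*2:^(j-1) + 2*3:^(j-1) + ... + i*(i+1):^(j-1), with i:^0 = 1."""
    if j == 0:
        return 1
    return sum(m * psnn(m + 1, j - 1) for m in range(1, i + 1))

def double_factorial(n):
    out = 1
    while n > 1:
        out *= n
        n -= 2
    return out  # 0!! = (-1)!! = 1 by convention

def psnn_closed(i, j):
    # Lemma 1.13: i:^j = (i + 2j - 1)! / ((2j)!! (i - 1)!)
    return factorial(i + 2 * j - 1) // (double_factorial(2 * j) * factorial(i - 1))
```

The recursion and the closed form agree, e.g., `psnn(3, 4) == psnn_closed(3, 4) == 4725`.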


Lemma 1.13
$$ i{:}^j = \frac{(i+2j-1)!}{(2j)!!\,(i-1)!} \quad (i = 1,2,\ldots;\ j = 0,1,\ldots), $$
where $(\cdot)!!$ is the double factorial and $0!! \equiv 1$.

Proof 1 The result is given by induction. When $j = 0$, we have $i{:}^0 = 1$ by definition and
$$ \frac{(i+2j-1)!}{(2j)!!\,(i-1)!} = \frac{(i-1)!}{0!!\,(i-1)!} = 1, $$
which shows that the equation holds. Assume that the result holds for $i$ and $j$. Then,
$$ i{:}^{j+1} = 1\cdot 2{:}^j + 2\cdot 3{:}^j + \cdots + (i-1)\,i{:}^j + i\,(i+1){:}^j = \sum_{m=1}^{i} m\,(m+1){:}^j = \sum_{m=1}^{i} m\,\frac{(m+2j)!}{(2j)!!\,m!} = \frac{1}{(2j)!!}\sum_{m=1}^{i} m(m+1)\cdots(m+2j) $$
$$ = \frac{1}{(2j)!!}\cdot\frac{1}{2j+2}\sum_{m=1}^{i}\big\{m(m+1)\cdots(m+2j+1) - (m-1)m\cdots(m+2j)\big\} = \frac{i(i+1)\cdots(i+2j+1)}{\{2(j+1)\}!!} = \frac{\{i+2(j+1)-1\}!}{\{2(j+1)\}!!\,(i-1)!} \quad (i = 1,2,\ldots;\ j = 0,1,\ldots), $$
which shows that the result holds for $i{:}^{j+1}$. Q.E.D.

Proof 2 It is known that the Hermite polynomials are explicitly given by
$$ h_i(x) = \sum_{j=0}^{[i/2]}\frac{i!}{(i-2j)!\,j!\,2^j}(-1)^j x^{i-2j} = \sum_{j=0}^{[i/2]}\frac{i!}{(2j)!!\,(i-2j)!}(-1)^j x^{i-2j}, $$
where $[\,\cdot\,]$ is the Gauss notation of the floor function, i.e., the integer part of the quantity in brackets. Suppose that the above result is given by a method other than that in Proof 1 using, e.g., the generating function $\exp\{tx - (t^2/2)\}$ of $h_i(x)$ with $\phi(x-t) = \phi(x)\exp\{tx - (t^2/2)\}$ (see [12, Proposition 1.4.3; 22, Eq. (7); 27, Sect. 5.6; 35, Sects. 6.14–6.15]). Since $i{:}^j$ or $(i+1){:}^j$ was derived as a coefficient of $h_i(x)$, equating the two sets of results term by term we obtain
$$ (i+1){:}^j = \frac{(i+2j)!}{(2j)!!\,\{(i+2j)-2j\}!} = \frac{(i+2j)!}{(2j)!!\,i!} \quad (i,j = 0,1,\ldots). $$

Q.E.D.

Proof 1 gives an alternative derivation of the explicit expressions of the coefficients of the Hermite polynomials. Using $i{:}^j$, $g_{s1}$ $(s = 1,\ldots,9)$ given earlier are rewritten as
$$ g_{11} = \sigma_{i_1k},\qquad g_{21} = \sigma_{i_1k}\sigma_{i_2k}\,\sigma_{kk}^{-1}c_{kr}^{(L)},\qquad g_{31} = \sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\big(c_{kr}^{(L)2}\sigma_{kk}^{-2} - \sigma_{kk}^{-1}\big), $$
$$ g_{41} = \sigma_{i_1k}\cdots\sigma_{i_4k}\big(c_{kr}^{(L)3}\sigma_{kk}^{-3} - 2{:}\,c_{kr}^{(L)}\sigma_{kk}^{-2}\big),\qquad g_{51} = \sigma_{i_1k}\cdots\sigma_{i_5k}\big(c_{kr}^{(L)4}\sigma_{kk}^{-4} - 3{:}\,c_{kr}^{(L)2}\sigma_{kk}^{-3} + 2{:}\,\sigma_{kk}^{-2}\big), $$
$$ g_{61} = \sigma_{i_1k}\cdots\sigma_{i_6k}\big(c_{kr}^{(L)5}\sigma_{kk}^{-5} - 4{:}\,c_{kr}^{(L)3}\sigma_{kk}^{-4} + 2{:}^2\,c_{kr}^{(L)}\sigma_{kk}^{-3}\big),\qquad g_{71} = \sigma_{i_1k}\cdots\sigma_{i_7k}\big(c_{kr}^{(L)6}\sigma_{kk}^{-6} - 5{:}\,c_{kr}^{(L)4}\sigma_{kk}^{-5} + 3{:}^2\,c_{kr}^{(L)2}\sigma_{kk}^{-4} - 2{:}^2\,\sigma_{kk}^{-3}\big), $$
$$ g_{81} = \sigma_{i_1k}\cdots\sigma_{i_8k}\big(c_{kr}^{(L)7}\sigma_{kk}^{-7} - 6{:}\,c_{kr}^{(L)5}\sigma_{kk}^{-6} + 4{:}^2\,c_{kr}^{(L)3}\sigma_{kk}^{-5} - 2{:}^3\,c_{kr}^{(L)}\sigma_{kk}^{-4}\big), $$
$$ g_{91} = \sigma_{i_1k}\cdots\sigma_{i_9k}\big(c_{kr}^{(L)8}\sigma_{kk}^{-8} - 7{:}\,c_{kr}^{(L)6}\sigma_{kk}^{-7} + 5{:}^2\,c_{kr}^{(L)4}\sigma_{kk}^{-6} - 3{:}^3\,c_{kr}^{(L)2}\sigma_{kk}^{-5} + 2{:}^3\,\sigma_{kk}^{-4}\big) \quad (i_1,\ldots,i_9,\ k = 1,\ldots,p;\ L = 0,1;\ r = 1,\ldots,R). $$
The above results indicate the following general expressions.

Theorem 1.6 (i) When $s$ is odd,
$$ g_{s1} = \sigma_{i_1k}\cdots\sigma_{i_sk}\Big\{c_{kr}^{(L)(s-1)}\sigma_{kk}^{-(s-1)} + \sum_{u=1}^{(s-3)/2}(-1)^u\,(s-2u){:}^u\,c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)} + (-1)^{(s-1)/2}\,2{:}^{(s-3)/2}\,\sigma_{kk}^{-(s-1)/2}\Big\} $$
$(i_1,\ldots,i_s,\ k = 1,\ldots,p;\ L = 0,1;\ r = 1,\ldots,R)$, where for $s = 1,3$, define $\sum_{u=1}^{(s-3)/2}(\cdot) = 0$ and $2{:}^{-1} = 0$ as well as $2{:}^0 = 1$.

(ii) When $s$ is even,
$$ g_{s1} = \sigma_{i_1k}\cdots\sigma_{i_sk}\Big\{c_{kr}^{(L)(s-1)}\sigma_{kk}^{-(s-1)} + \sum_{u=1}^{(s-2)/2}(-1)^u\,(s-2u){:}^u\,c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)}\Big\} $$
$(i_1,\ldots,i_s,\ k = 1,\ldots,p;\ L = 0,1;\ r = 1,\ldots,R)$, where for $s = 2$, define $\sum_{u=1}^{(s-2)/2}(\cdot) = 0$.

(iii) When $s$ is a natural number,
$$ g_{s1} = \sigma_{i_1k}\cdots\sigma_{i_sk}\sum_{u=0}^{[(s-1)/2]}(-1)^u\,(s-2u){:}^u\,c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)} $$
$(i_1,\ldots,i_s,\ k = 1,\ldots,p;\ L = 0,1;\ r = 1,\ldots,R)$, where $s{:}^0 = 1$ as before.

Proof When $s = 1,\ldots,9$, it was shown earlier that the results hold.

(i) Odd $s$: Assume that the result holds for an odd $s$ with $s \ge 5$, where 5 is the minimum integer among the cases when $\sum_{u=1}^{(s-3)/2}(\cdot)$ does not vanish. Recall that $c_{kr}^{(L)} = c_{kr}^{(L_k)} = (\mathbf c_r^{(L)} - \boldsymbol\Sigma\mathbf t)_k = (\mathbf c_r^{(L)} - \boldsymbol\mu - \boldsymbol\Sigma\mathbf t)_k$ $(k = 1,\ldots,p;\ r = 1,\ldots,R)$. Then,

$$ g_{(s+1)1} = \frac{\partial}{\partial t_{i_{s+1}}}\Big[\sigma_{i_1k}\cdots\sigma_{i_sk}\Big\{c_{kr}^{(L)(s-1)}\sigma_{kk}^{-(s-1)} + \sum_{u=1}^{(s-3)/2}(-1)^u(s-2u){:}^u c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)} + (-1)^{(s-1)/2}2{:}^{(s-3)/2}\sigma_{kk}^{-(s-1)/2}\Big\}\Big]\Big|_{\mathbf t=\mathbf 0} $$
$$ + \sigma_{i_1k}\cdots\sigma_{i_sk}\Big\{c_{kr}^{(L)(s-1)}\sigma_{kk}^{-(s-1)} + \sum_{u=1}^{(s-3)/2}(-1)^u(s-2u){:}^u c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)} + (-1)^{(s-1)/2}2{:}^{(s-3)/2}\sigma_{kk}^{-(s-1)/2}\Big\}\,\sigma_{i_{s+1}k}\sigma_{kk}^{-1}c_{kr}^{(L)} $$
$$ = \sigma_{i_1k}\cdots\sigma_{i_{s+1}k}\Big[-(s-1)c_{kr}^{(L)(s-2)}\sigma_{kk}^{-(s-1)} - \sum_{u=1}^{(s-3)/2}(-1)^u(s-2u-1)(s-2u){:}^u c_{kr}^{(L)(s-2u-2)}\sigma_{kk}^{-(s-u-1)} $$
$$ \qquad + c_{kr}^{(L)s}\sigma_{kk}^{-s} + \sum_{u=1}^{(s-3)/2}(-1)^u(s-2u){:}^u c_{kr}^{(L)(s-2u)}\sigma_{kk}^{-(s-u)} + (-1)^{(s-1)/2}2{:}^{(s-3)/2}c_{kr}^{(L)}\sigma_{kk}^{-(s+1)/2}\Big] $$
$$ = \sigma_{i_1k}\cdots\sigma_{i_{s+1}k}\Big[c_{kr}^{(L)s}\sigma_{kk}^{-s} - (s-1){:}\,c_{kr}^{(L)(s-2)}\sigma_{kk}^{-(s-1)} + \sum_{u=2}^{(s-3)/2}(-1)^u\big\{(s-2u+1)(s-2u+2){:}^{u-1} + (s-2u){:}^u\big\}c_{kr}^{(L)(s-2u)}\sigma_{kk}^{-(s-u)} $$
$$ \qquad + (-1)^{(s-1)/2}\big\{2{:}^{(s-3)/2} + 2\cdot 3{:}^{(s-3)/2}\big\}c_{kr}^{(L)}\sigma_{kk}^{-(s+1)/2}\Big] $$
$$ = \sigma_{i_1k}\cdots\sigma_{i_{s+1}k}\Big\{c_{kr}^{(L)s}\sigma_{kk}^{-s} + \sum_{u=1}^{(s-1)/2}(-1)^u(s-2u+1){:}^u\,c_{kr}^{(L)(s-2u)}\sigma_{kk}^{-(s-u)}\Big\}, $$

where the following are used:
$$ (s-2u+1)(s-2u+2){:}^{u-1} + (s-2u){:}^u = (s-2u+1){:}^u, $$
a special case of $(i-1){:}^j + i(i+1){:}^{j-1} = i{:}^j$, and
$$ 2{:}^{(s-3)/2} + 2\cdot 3{:}^{(s-3)/2} = 1\cdot(1+1){:}^{(s-3)/2} + 2\cdot(2+1){:}^{(s-3)/2} = 2{:}^{(s-1)/2} = (s-2u+1){:}^u\big|_{u=(s-1)/2}, $$
a special case of $i{:}^j = 1\cdot 2{:}^{j-1} + 2\cdot 3{:}^{j-1} + \cdots + (i-1)\,i{:}^{j-1} + i\,(i+1){:}^{j-1}$ when $i = 2$ and $j = (s-1)/2$. On the right-hand side of the final equation, when odd $s$ is replaced by even $s^* = s+1$, we obtain
$$ g_{(s+1)1} = \sigma_{i_1k}\cdots\sigma_{i_{s^*}k}\Big\{c_{kr}^{(L)(s^*-1)}\sigma_{kk}^{-(s^*-1)} + \sum_{u=1}^{(s^*-2)/2}(-1)^u(s^*-2u){:}^u\,c_{kr}^{(L)(s^*-2u-1)}\sigma_{kk}^{-(s^*-u-1)}\Big\}, $$
giving the required result of the even case.

(ii) Even $s$: Assume that the result holds for an even $s$ with $s \ge 6$, where 6 is the minimum integer among the cases when $\sum_{u=2}^{(s-2)/2}(\cdot)$ does not vanish. As in the case of odd $s$, we have
$$ g_{(s+1)1} = \frac{\partial}{\partial t_{i_{s+1}}}\Big[\sigma_{i_1k}\cdots\sigma_{i_sk}\Big\{c_{kr}^{(L)(s-1)}\sigma_{kk}^{-(s-1)} + \sum_{u=1}^{(s-2)/2}(-1)^u(s-2u){:}^u c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)}\Big\}\Big]\Big|_{\mathbf t=\mathbf 0} $$
$$ + \sigma_{i_1k}\cdots\sigma_{i_sk}\Big\{c_{kr}^{(L)(s-1)}\sigma_{kk}^{-(s-1)} + \sum_{u=1}^{(s-2)/2}(-1)^u(s-2u){:}^u c_{kr}^{(L)(s-2u-1)}\sigma_{kk}^{-(s-u-1)}\Big\}\,\sigma_{i_{s+1}k}\sigma_{kk}^{-1}c_{kr}^{(L)} $$
$$ = \sigma_{i_1k}\cdots\sigma_{i_{s+1}k}\Big[c_{kr}^{(L)s}\sigma_{kk}^{-s} + (-1)^1\{(s-1) + (s-2){:}\}c_{kr}^{(L)(s-2)}\sigma_{kk}^{-(s-1)} + \sum_{u=1}^{(s-2)/2}(-1)^{u+1}(s-2u-1)(s-2u){:}^u c_{kr}^{(L)(s-2u-2)}\sigma_{kk}^{-(s-u-1)} + \sum_{u=2}^{(s-2)/2}(-1)^u(s-2u){:}^u c_{kr}^{(L)(s-2u)}\sigma_{kk}^{-(s-u)}\Big] $$
$$ = \sigma_{i_1k}\cdots\sigma_{i_{s+1}k}\Big[c_{kr}^{(L)s}\sigma_{kk}^{-s} - (s-1){:}\,c_{kr}^{(L)(s-2)}\sigma_{kk}^{-(s-1)} + \sum_{u=2}^{(s-2)/2}(-1)^u\big\{(s-2u+1)(s-2u+2){:}^{u-1} + (s-2u){:}^u\big\}c_{kr}^{(L)(s-2u)}\sigma_{kk}^{-(s-u)} + (-1)^{\{(s-2)/2\}+1}\,2{:}^{(s-2)/2}\,\sigma_{kk}^{-s/2}\Big] $$
$$ = \sigma_{i_1k}\cdots\sigma_{i_{s+1}k}\Big\{c_{kr}^{(L)s}\sigma_{kk}^{-s} + \sum_{u=1}^{(s-2)/2}(-1)^u(s-2u+1){:}^u\,c_{kr}^{(L)(s-2u)}\sigma_{kk}^{-(s-u)} + (-1)^{s/2}\,2{:}^{(s-2)/2}\,\sigma_{kk}^{-s/2}\Big\}. $$

When even $s$ is replaced by odd $s^* = s+1$, the last result becomes
$$ \sigma_{i_1k}\cdots\sigma_{i_{s^*}k}\Big\{c_{kr}^{(L)(s^*-1)}\sigma_{kk}^{-(s^*-1)} + \sum_{u=1}^{(s^*-3)/2}(-1)^u(s^*-2u){:}^u\,c_{kr}^{(L)(s^*-2u-1)}\sigma_{kk}^{-(s^*-u-1)} + (-1)^{(s^*-1)/2}\,2{:}^{(s^*-3)/2}\,\sigma_{kk}^{-(s^*-1)/2}\Big\}, $$
which shows the required result.

(iii) When $s$ is even, the result is easily obtained. When $s$ is odd, the last term in $g_{s1}/(\sigma_{i_1k}\cdots\sigma_{i_sk})$ of (i) can be replaced by
$$ (-1)^{(s-1)/2}\,2{:}^{(s-3)/2}\,\sigma_{kk}^{-(s-1)/2} = (-1)^{(s-1)/2}\,1{:}^{(s-1)/2}\,\sigma_{kk}^{-(s-1)/2}, $$
using $1\cdot 2{:}^j = 2{:}^j = 1{:}^{j+1}$, giving the required result. Q.E.D.

Note that the order of the polynomial of $g_{s1}/(\sigma_{i_1k}\cdots\sigma_{i_sk})$ in terms of $c_{kr}^{(L)}$ is $s-1$. When $\sigma_{ii} = 1$ $(i = 1,\ldots,p)$, $g_{s1}/(\sigma_{i_1k}\cdots\sigma_{i_sk})$ becomes the Hermite polynomial of order $s-1$ irrespective of the parity of $s$. Using Theorem 1.6, $g_{s1}$ $(s = 1,\ldots,9)$ are rewritten as
$$ g_{11} = \sigma_{i_1k},\qquad g_{21} = \sigma_{i_1k}\sigma_{i_2k}\,c_{kr}^{(L)}\sigma_{kk}^{-1},\qquad g_{31} = \sigma_{i_1k}\sigma_{i_2k}\sigma_{i_3k}\big(c_{kr}^{(L)2}\sigma_{kk}^{-2} - 1{:}\,\sigma_{kk}^{-1}\big), $$
$$ g_{41} = \sigma_{i_1k}\cdots\sigma_{i_4k}\big(c_{kr}^{(L)3}\sigma_{kk}^{-3} - 2{:}\,c_{kr}^{(L)}\sigma_{kk}^{-2}\big),\qquad g_{51} = \sigma_{i_1k}\cdots\sigma_{i_5k}\big(c_{kr}^{(L)4}\sigma_{kk}^{-4} - 3{:}\,c_{kr}^{(L)2}\sigma_{kk}^{-3} + 1{:}^2\,\sigma_{kk}^{-2}\big), $$
$$ g_{61} = \sigma_{i_1k}\cdots\sigma_{i_6k}\big(c_{kr}^{(L)5}\sigma_{kk}^{-5} - 4{:}\,c_{kr}^{(L)3}\sigma_{kk}^{-4} + 2{:}^2\,c_{kr}^{(L)}\sigma_{kk}^{-3}\big),\qquad g_{71} = \sigma_{i_1k}\cdots\sigma_{i_7k}\big(c_{kr}^{(L)6}\sigma_{kk}^{-6} - 5{:}\,c_{kr}^{(L)4}\sigma_{kk}^{-5} + 3{:}^2\,c_{kr}^{(L)2}\sigma_{kk}^{-4} - 1{:}^3\,\sigma_{kk}^{-3}\big), $$
$$ g_{81} = \sigma_{i_1k}\cdots\sigma_{i_8k}\big(c_{kr}^{(L)7}\sigma_{kk}^{-7} - 6{:}\,c_{kr}^{(L)5}\sigma_{kk}^{-6} + 4{:}^2\,c_{kr}^{(L)3}\sigma_{kk}^{-5} - 2{:}^3\,c_{kr}^{(L)}\sigma_{kk}^{-4}\big), $$
$$ g_{91} = \sigma_{i_1k}\cdots\sigma_{i_9k}\big(c_{kr}^{(L)8}\sigma_{kk}^{-8} - 7{:}\,c_{kr}^{(L)6}\sigma_{kk}^{-7} + 5{:}^2\,c_{kr}^{(L)4}\sigma_{kk}^{-6} - 3{:}^3\,c_{kr}^{(L)2}\sigma_{kk}^{-5} + 1{:}^4\,\sigma_{kk}^{-4}\big) \quad (i_1,\ldots,i_9,\ k = 1,\ldots,p;\ L = 0,1;\ r = 1,\ldots,R), $$
which give pleasingly regularized results. When $s$ is odd, the absolute value of the last term in $g_{s1}/(\sigma_{i_1k}\cdots\sigma_{i_sk})$ is
$$ 1{:}^{(s-1)/2} = 2{:}^{(s-3)/2} = \frac{(s-2)!}{(s-3)!!\,1!} = \frac{(s-2)!}{(s-3)!!} = (s-2)!! \quad (s \ge 3), $$
which becomes 1, 3, 15 and 105 when $s$ = 3, 5, 7 and 9, respectively. It is well known that $(s-2)!!$ is the $(s-1)$-th order central moment of an untruncated standard normal, which is equal to the number of distinct cases of choosing $(s-1)/2$ pairs from $s-1$ members. Similar combinatorial interpretations for other coefficients of the Hermite polynomials are available.
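The claim that, with $\sigma_{kk} = 1$, the psnn coefficients of Theorem 1.6(iii) reproduce the Hermite coefficients can be verified mechanically. A minimal sketch (illustrative names only): the coefficients $(-1)^u(s-2u){:}^u$ are compared with the coefficients of $h_{s-1}$ generated by the recurrence $h_{m+1} = x h_m - m h_{m-1}$.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def psnn(i, j):
    """Product sum of natural numbers, Definition 1.2."""
    if j == 0:
        return 1
    return sum(m * psnn(m + 1, j - 1) for m in range(1, i + 1))

def g_coeffs(s):
    """Coefficients of g_{s1}/(sigma...) with sigma_kk = 1, by Theorem 1.6(iii):
    sum over u of (-1)^u (s-2u):^u c^(s-2u-1); returned as {power: coefficient}."""
    return {s - 2 * u - 1: (-1) ** u * psnn(s - 2 * u, u)
            for u in range((s - 1) // 2 + 1)}

def hermite_coeffs(n):
    """Coefficients of the probabilist's Hermite polynomial h_n,
    via h_{m+1} = x h_m - m h_{m-1}; returned as {power: coefficient}."""
    prev, cur = {0: 1}, {1: 1}
    if n == 0:
        return prev
    for m in range(1, n):
        nxt = {}
        for p, c in cur.items():      # multiply h_m by x
            nxt[p + 1] = nxt.get(p + 1, 0) + c
        for p, c in prev.items():     # subtract m h_{m-1}
            nxt[p] = nxt.get(p, 0) - m * c
        prev, cur = cur, {p: c for p, c in nxt.items() if c != 0}
    return cur
```

For instance, `g_coeffs(5)` gives `{4: 1, 2: -6, 0: 3}`, the coefficients of $h_4 = x^4 - 6x^2 + 3$.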

References

1. Aitken AC (1934) Note on selection from a multivariate normal population. Proc Edinb Math Soc 4:106–110
2. Arismendi JC (2013) Multivariate truncated moments. J Multivar Anal 117:41–75
3. Arismendi JC, Broda S (2017) Multivariate elliptical truncated moments. J Multivar Anal 157:29–44
4. Arnold BC, Beaver RJ, Groeneveld RA, Meeker WQ (1993) The nontruncated marginal of a truncated bivariate normal distribution. Psychometrika 58:471–488
5. Artzner P, Delbaen F, Eber JM, Heath D (1999) Coherent measures of risk. Math Financ 9:203–228
6. Azzalini A, Genz A, Miller A, Wichura MJ, Hill GW, Ge Y (2020) The multivariate normal and t distributions, and their truncated versions. R package version 2.0.2. https://CRAN.R-project.org/package=mnormt


7. Barr DR, Sherill ET (1999) Mean and variance of truncated normal distributions. Am Stat 53:357–361
8. Birnbaum ZW (1950) Effect of linear truncation on a multinormal population. Ann Math Stat 21:272–279
9. Birnbaum ZW, Paulson E, Andrews FC (1950) On the effect of selection performed on some coordinates of a multi-dimensional population. Psychometrika 15:191–204
10. Burkardt J (2014) The truncated normal distribution, pp 1–35. http://people.sc.fsu.edu/~jburkardt/presentations/truncatednormal.pdf
11. Cochran WG (1951) Improvement by means of selection. In: Neyman J (ed) Proceedings of the second Berkeley symposium on mathematical statistics and probability. University of California Press, Berkeley, CA, pp 449–470
12. Dunkl CF, Xu Y (2001) Orthogonal polynomials of several variables. Cambridge University Press, Cambridge
13. Fisher RA (1931) Introduction to mathematical tables, vol 1. British Association for the Advancement of Science, pp xxvi–xxxv. Reprinted in Fisher RA (1950) Contributions to mathematical statistics, pp 517–526 with the title "The sampling error of estimated deviates, together with other illustrations of the properties and applications of the integrals and derivatives of the normal error function" and the author's note (CMS 23.xxva). Wiley, New York. https://digital.library.adelaide.edu.au/dspace/handle/2440/3860
14. Galarza CE, Kan R, Lachos VH (2020) MomTrunc: moments of folded and doubly truncated multivariate distributions. R package version 5.57. https://CRAN.R-project.org/package=MomTrunc
15. Galarza CE, Matos LA, Dey DK, Lachos VH (2022) On moments of folded and truncated multivariate extended skew-normal distributions. J Comput Graph Stat (online published). http://doi.org/10.1080/10618600.2021.2000869
16. Gianola D, Rosa GJM (2015) One hundred years of statistical developments in animal breeding. Annu Rev Anim Biosci 3:19–56
17. Herrendörfer G, Tuchscherer A (1996) Selection and breeding. J Stat Plann Infer 54:307–321
18. Horrace WC (2015) Moments of the truncated normal distribution. J Prod Anal 43:133–138
19. Kamat AR (1953) Incomplete and absolute moments of the multivariate normal distribution with some applications. Biometrika 40:20–34
20. Kamat AR (1958) Hypergeometric expansions for incomplete moments of the bivariate normal distribution. Sankhyā 20:317–320
21. Kan R, Robotti C (2017) On moments of folded and truncated multivariate normal distributions. J Comput Graph Stat 26:930–934
22. Kemp CD, Kemp AW (1965) Some properties of the "Hermite" distribution. Biometrika 52:381–394
23. Landsman Z, Makov U, Shushi T (2016) Multivariate tail conditional expectation for elliptical distributions. Insur Math Econ 70:216–223
24. Landsman Z, Makov U, Shushi T (2018) A multivariate tail covariance measure for elliptical distributions. Insur Math Econ 81:27–35
25. Lawley DN (1943) A note on Karl Pearson's selection formulae. In: Proceedings of the Royal Society of Edinburgh, vol 62 (Section A, Pt. 1), pp 28–30
26. Magnus JR, Neudecker H (1999) Matrix differential calculus with applications in statistics and econometrics, Rev. edn. Wiley, New York
27. Magnus W, Oberhettinger F, Soni RP (1966) Formulas and theorems for the special functions of mathematical physics, 3rd enlarged edn. Springer, Berlin
28. Manjunath BG, Wilhelm S (2012) Moments calculation for the doubly truncated multivariate normal density. arXiv:1206.5387v1 [stat.CO]. 23 June 2012
29. Nabeya S (1951) Absolute moments in 2-dimensional normal distribution. Ann Inst Stat Math 3:2–6
30. Nabeya S (1952) Absolute moments in 3-dimensional normal distribution. Ann Inst Stat Math 4:15–30


31. Ogasawara H (2021) Unified and non-recursive formulas for moments of the normal distribution with stripe truncation. Commun Stat Theor Methods (online published). http://doi.org/10.1080/03610926.2020.1867742
32. Ogasawara H (2021) A non-recursive formula for various moments of the multivariate normal distribution with sectional truncation. J Multivar Anal (online published). http://doi.org/10.1016/j.jmva.2021.104729
33. Pearson K (1903) On the influence of natural selection on the variability and correlation of organs. Philos Trans R Soc Lond Ser A Containing Pap Math Phys Charact 200:1–66
34. Pearson K, Lee A (1908) On the generalized probable error in multiple normal correlation. Biometrika 6:59–68
35. Stuart A, Ord JK (1994) Kendall's advanced theory of statistics: distribution theory, 6th edn., vol 1. Arnold, London
36. Tallis GM (1961) The moment generating function of the truncated multi-normal distribution. J R Stat Soc B 23:223–229
37. Tallis GM (1963) Elliptical and radial truncation in normal populations. Ann Math Stat 34:940–944
38. Tallis GM (1965) Plane truncation in normal populations. J R Stat Soc B 27:301–307
39. Yanai H, Takeuchi K, Takane Y (2011) Projection matrices, generalized inverse matrices, and singular value decomposition. Springer, New York

Chapter 2
Normal Moments Under Stripe Truncation and the Real-Valued Poisson Distribution

2.1 Introduction

Stripe truncation introduced by Ogasawara [8] under normality is a univariate case of sectional truncation having zebraic or tigerish truncation patterns. Usual single and double truncations are special cases of stripe truncation. As in sectional truncation, $R$ intervals or stripes for selection are defined by $R$ pairs of selection points:
$$ -\infty \le a_1 < b_1 < \cdots < a_R < b_R \le \infty. $$
Let $X \sim N(\mu, \sigma^2)$. When $\bigcup_{r=1}^{R}\{a_r \le X < b_r\}$ holds, which is also denoted by $X \in S$ with $S$ being the set of the $R$ non-overlapping interval(s), $X$ is selected; otherwise it is truncated. Single truncation has a single selection interval $[a_1 = -\infty,\ b_1 < \infty)$ for upper-tail truncation or $[a_1 > -\infty,\ b_1 = \infty)$ for lower-tail truncation, each with $R = 1$. Double truncation also has a single selection interval $[a_1 > -\infty,\ b_1 < \infty)$ with both tails being truncated. While the tail areas are typically small ones, they can be majorities. The simplest case of stripe truncation other than single or double truncation is given by inner truncation with two selection intervals or stripes, i.e., $R = 2$:

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 H. Ogasawara, Expository Moments for Pseudo Distributions, Behaviormetrics: Quantitative Approaches to Human Behavior 2, https://doi.org/10.1007/978-981-19-3525-1_2



$$ -\infty = a_1 < b_1 < a_2 < b_2 = \infty, $$
which is the complementary interval of the corresponding double truncation. Let
$$ a = \Pr\Big(\bigcup_{r=1}^{R}\{a_r \le X < b_r\}\Big) = \sum_{r=1}^{R}\int_{a_r}^{b_r}\frac{1}{\sqrt{2\pi}\,\sigma}\exp\Big\{-\frac{(x-\mu)^2}{2\sigma^2}\Big\}\,{\rm d}x = \sum_{r=1}^{R}\int_{a_r}^{b_r}\phi_1(x\,|\,\mu,\sigma^2)\,{\rm d}x $$
$$ = \int_{\mathbf a}^{\mathbf b}\phi_1(x\,|\,\mu,\sigma^2)\,{\rm d}x = \sum_{r=1}^{R}\int_{(a_r-\mu)/\sigma}^{(b_r-\mu)/\sigma}\phi_1(x\,|\,0,1)\,{\rm d}x = \int_{\bar{\mathbf a}}^{\bar{\mathbf b}}\phi(x)\,{\rm d}x, $$
where $\mathbf a = (a_1,\ldots,a_R)^{\mathsf T}$, $\mathbf b = (b_1,\ldots,b_R)^{\mathsf T}$, $\bar{\mathbf a} = (\bar a_1,\ldots,\bar a_R)^{\mathsf T}$ and $\bar{\mathbf b} = (\bar b_1,\ldots,\bar b_R)^{\mathsf T}$ with $\bar a_r = (a_r-\mu)/\sigma$ and $\bar b_r = (b_r-\mu)/\sigma$ $(r = 1,\ldots,R)$. Then the probability density function (pdf) of $X = x$ is
$$ \phi_1(x\,|\,\mu,\sigma^2;\mathbf a,\mathbf b) = \phi^{(a)}(x\,|\,\mu,\sigma^2) = \phi(x\,|\,\mu,\sigma^2)/a. $$
When $X$ has the above pdf, the distribution under stripe truncation is denoted by
$$ X \sim N(\mu,\sigma^2;\mathbf a,\mathbf b). $$
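The stripe-truncated pdf is straightforward to compute; the sketch below is an illustration (function names are ad hoc, not the book's R functions `dbpc1` etc.): the selection probability $a$ is a sum of normal-cdf differences, and the pdf is the untruncated density divided by $a$ on the stripes and zero elsewhere.

```python
import math

def Phi(x):  # standard normal cdf via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def abar(mu, sigma, a, b):
    """Selection probability a = sum_r {Phi(b_r|mu,s2) - Phi(a_r|mu,s2)}."""
    return sum(Phi((bj - mu) / sigma) - Phi((aj - mu) / sigma)
               for aj, bj in zip(a, b))

def dstripe(x, mu, sigma, a, b):
    """pdf phi_1(x|mu, sigma^2; a, b): phi(x|mu, sigma^2)/a on the stripes,
    0 elsewhere (half-open selection intervals [a_r, b_r))."""
    if not any(aj <= x < bj for aj, bj in zip(a, b)):
        return 0.0
    dens = math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
    return dens / abar(mu, sigma, a, b)
```

By construction the pdf integrates to one over the union of the stripes.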

2.2 Closed Formulas for Moments of Integer-Valued Orders

When $k = 0, 1, \ldots$, let
$$ \bar I_k^{(r)} = \bar I_k(\bar a_r, \bar b_r) = \int_{\bar a_r}^{\bar b_r} x^k\phi(x)\,{\rm d}x = \int_{0}^{\bar b_r} x^k\phi(x)\,{\rm d}x - \int_{0}^{\bar a_r} x^k\phi(x)\,{\rm d}x = \sum_{L=0}^{1}(-1)^L\int_{0}^{\bar c_r^{(L)}} x^k\phi(x)\,{\rm d}x, $$
where $\bar c_r^{(L)} = \bar a_r^L\,\bar b_r^{1-L}$ with $0^0 = 1$; and when $\bar c_r^{(L)}$ is negative, $\int_0^{\bar c_r^{(L)}}(\cdot)\,{\rm d}x = -\int_{\bar c_r^{(L)}}^{0}(\cdot)\,{\rm d}x$ $(r = 1,\ldots,R)$ is used. Then, Ogasawara [8] gave the following result using the above notation:

Lemma 2.1 (Ogasawara [8, Lemma 1])
$$ \bar I_k^{(r)} = \frac{2^{k/2}\,\Gamma\{(k+1)/2\}}{2\sqrt{\pi}}\sum_{L=0}^{1}(-1)^L\big[{\rm sign}(\bar c_r^{(L)})\,1\{k = 2v\} + 1\{k = 2v-1\}\big]\,F_{\rm C}\Big(\frac{\bar c_r^{(L)2}}{2}\,\Big|\,\frac{k+1}{2}\Big) \quad (r = 1,\ldots,R;\ k,v = 0,1,\ldots), \eqno(2.1) $$
where $\Gamma(\cdot)$ is the gamma function; ${\rm sign}(x) = 1$ when $x \ge 0$ and ${\rm sign}(x) = -1$ when $x < 0$; $1\{\cdot\}$ is the indicator function; and $F_{\rm C}(x\,|\,\alpha)$ is the cumulative distribution function (cdf) of the gamma distribution at $x$ when the shape parameter is $\alpha$ with the unit scale parameter.

Proof The results are given by cases when $k$ is (i) even or (ii) odd.

(i) Even $k$ $(= 2v;\ v = 0,1,\ldots)$: Noting that $x^k\phi(x) = x^{2v}\phi(x)$ is a non-negative even function with the property
$$ \int_0^c x^{2v}\phi(x)\,{\rm d}x = {\rm sign}(c)\int_0^{|c|} x^{2v}\phi(x)\,{\rm d}x \quad (-\infty \le c \le \infty) $$
and using the variable transformation $y = x^2/2$ with ${\rm d}x/{\rm d}y = 1/\sqrt{2y}$, we have

$$ \bar I_k^{(r)} = \bar I_{2v}^{(r)} = {\rm sign}(\bar b_r)\int_0^{|\bar b_r|}x^{2v}\phi(x)\,{\rm d}x - {\rm sign}(\bar a_r)\int_0^{|\bar a_r|}x^{2v}\phi(x)\,{\rm d}x $$
$$ = \sum_{L=0}^{1}(-1)^L\,{\rm sign}(\bar c_r^{(L)})\int_0^{\bar c_r^{(L)2}/2}\frac{(2y)^{k/2}e^{-y}}{\sqrt{2\pi}\,\sqrt{2y}}\,{\rm d}y = \sum_{L=0}^{1}(-1)^L\,\frac{2^{(k/2)-1}}{\sqrt{\pi}}\,{\rm sign}(\bar c_r^{(L)})\int_0^{\bar c_r^{(L)2}/2}y^{(k-1)/2}e^{-y}\,{\rm d}y $$
$$ = \sum_{L=0}^{1}(-1)^L\,\frac{2^{(k/2)-1}}{\sqrt{\pi}}\,{\rm sign}(\bar c_r^{(L)})\,\gamma\Big(\frac{\bar c_r^{(L)2}}{2}\,\Big|\,\frac{k+1}{2}\Big) = \sum_{L=0}^{1}(-1)^L\,\frac{2^{k/2}}{\sqrt{\pi}}\,\Gamma\Big(\frac{k+1}{2}\Big)\,\frac{{\rm sign}(\bar c_r^{(L)})}{2}\,F_{\rm C}\Big(\frac{\bar c_r^{(L)2}}{2}\,\Big|\,\frac{k+1}{2}\Big), $$

where $\gamma(x\,|\,\alpha)$ is the lower incomplete gamma function at $x$ when the shape parameter is $\alpha$ with the unit scale parameter in the corresponding gamma distribution.

(ii) Odd $k$ $(= 2v-1;\ v = 1,2,\ldots)$: Note that $x^k\phi(x)$ is an odd function when $k$ is odd. Then, when $c < 0$,
$$ \int_0^c x^k\phi(x)\,{\rm d}x = -\int_c^0 x^k\phi(x)\,{\rm d}x = \int_0^{|c|}x^k\phi(x)\,{\rm d}x, $$
while when $c \ge 0$,
$$ \int_0^c x^k\phi(x)\,{\rm d}x = \int_0^{|c|}x^k\phi(x)\,{\rm d}x. $$
That is, $\int_0^c x^k\phi(x)\,{\rm d}x = \int_0^{|c|}x^k\phi(x)\,{\rm d}x$ holds irrespective of the sign of $c$. Consequently, when $k$ is odd, the result of (i) can be used similarly, simply by omitting ${\rm sign}(\bar b_r)$ and ${\rm sign}(\bar a_r)$. Then, we have

2.2 Closed Formulas for Moments of Integer-Valued Orders

ðrÞ Ik

¼

ðrÞ I 2v1

Zjbr j ¼

51

Zjar j x

2v1

/ð xÞ dx 

0

x2v1 /ð xÞ dx 0

!   ðLÞ2 cj k þ 1 kþ1 1 L2 FC ¼ ð1Þ pffiffiffi C : 2 2 2 2 p L¼0 1 X

k=2

The combined results of (i) and (ii) give (2.1). Q.E.D. It is known that in (2.1)   pffiffiffi 2k=2 Cfðk þ 1Þ=2g= p ¼ E j X jk when untruncated X * N (0, 1) (k = 0, 1,…) (see Winkelbauer [11, Eq. (18)]; Pollack and Shauly-Aharonov [9, Eq. (1)]). When k = 2v (v = 0, 1,…),       kþ1 1 2v 1 2v 1 3 1 pffiffiffi pffiffiffi ¼ pffiffiffi C v þ 2k=2 C p ¼ pffiffiffi v      2 2 2 2 2 p p p ¼ ð2v  1Þ      3  1 ¼ ð2v  1Þ!! ¼ ðk  1Þ!!; which is well known as the k-th order non-vanishing central moment of the standard normal. When k ¼ 2v  1 ðv ¼ 1; 2; . . .Þ, 2k=2 C

  kþ1 1 1 1 pffiffiffi ¼ 2vð1=2Þ CðvÞ pffiffiffi ¼ 2vð1=2Þ ðv  1Þ! pffiffiffi 2 p p p pffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffi ¼ ð2v  2Þ!! 2=p ¼ ðk  1Þ!! 2=p:

pffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffi While ðk  1Þ!! holds for even k, ðk  1Þ!! 2=p for odd k is 2=p ¼ : Eðj X jÞ ¼ 0:8 times ðk  1Þ!!, where Eðj X jÞ is equal to the mean of the chi-distributed variable with one degree of freedom. Note that for even k, ðk  1Þ!! ¼ ðk  1Þ!!EðX 2 Þ. Using Lemma 4.1, Ogasawara [8] gave the following result. Theorem 2.1 (Ogasawara [8, Theorem 1]). Let d be an arbitrary reference point for the deviation of X  Nðl; r2 ; a; bÞ. Then, we have EfðX  d Þs g ¼ rs

s X k¼0

where s Ck ¼ 0 and 0 C0 ¼ 1.

s k

s Ck a

1

R X

ðrÞ 

Ik

d

sk

ðs ¼ 0; 1; . . .Þ;

r¼1

! s! ¼ ðskÞ!k! ðk ¼ 0; . . .; sÞ; d ¼ ðd  lÞ=r; ðdÞ0 ¼ 1 when d ¼
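Theorem 2.1 can be exercised numerically. The sketch below uses hypothetical stripe limits and parameter values; $\alpha$ and the partial moments $I_k^{(r)}$ are obtained by quadrature, and the closed binomial-expansion formula is compared with direct integration of $(x - d)^s$ against the stripe-truncated density.

```python
# Numerical sketch of Theorem 2.1 with hypothetical stripes [a_r, b_r).
from math import comb, sqrt, pi, exp
from scipy.integrate import quad

mu, sigma, d = 1.0, 2.0, 0.5
stripes = [(-1.5, 0.0), (0.8, 2.5)]                  # illustrative stripe limits
phi = lambda x: exp(-x * x / 2) / sqrt(2 * pi)       # standard normal pdf

# standardized limits, normalizer alpha = Pr(X in S), partial moments I_k^{(r)}
std = [((lo - mu) / sigma, (hi - mu) / sigma) for lo, hi in stripes]
alpha = sum(quad(phi, at, bt)[0] for at, bt in std)
I = lambda k, at, bt: quad(lambda x: x**k * phi(x), at, bt)[0]

d_star, s = (d - mu) / sigma, 4
closed = sigma**s * sum(
    comb(s, k) / alpha * sum(I(k, at, bt) for at, bt in std) * (-d_star) ** (s - k)
    for k in range(s + 1))
# direct integration of (x - d)^s against the stripe-truncated density
direct = sum(
    quad(lambda x: (x - d) ** s * phi((x - mu) / sigma) / (sigma * alpha), lo, hi)[0]
    for lo, hi in stripes)
print(abs(closed - direct) < 1e-7)
```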


Proof. Note that the moment of $X \sim \mathrm{N}(\mu, \sigma^2; a, b)$ is given by
\[
\mathrm{E}\{(X - d)^s \mid X \in S\} = \int_a^b (x - d)^s\, \phi^{(\alpha)}(x \mid \mu, \sigma^2)\, dx.
\]
Then, we have
\[
\begin{aligned}
\mathrm{E}\{(X - d)^s \mid X \in S\} &= \sum_{r=1}^{R} \mathrm{E}\{(X - d)^s \mid a_r \le X < b_r\} \\
&= \sum_{r=1}^{R} \mathrm{E}[\{X - \mu - (d - \mu)\}^s \mid a_r \le X < b_r] \\
&= \sigma^s \sum_{r=1}^{R} \sum_{k=0}^{s} {}_s\mathrm{C}_k\, \mathrm{E}\left\{ \left(\frac{X - \mu}{\sigma}\right)^{\!k} \,\Big|\, a_r \le X < b_r \right\} \left( -\frac{d - \mu}{\sigma} \right)^{\!s-k} \\
&= \sigma^s \sum_{k=0}^{s} {}_s\mathrm{C}_k \sum_{r=1}^{R} \mathrm{E}\big( \tilde X^k \mid \tilde a_r \le \tilde X < \tilde b_r \big)\, (-d^*)^{s-k} \\
&= \sigma^s \sum_{k=0}^{s} {}_s\mathrm{C}_k\, \alpha^{-1} \sum_{r=1}^{R} I_k^{(r)}\, (-d^*)^{s-k},
\end{aligned}
\]
which gives the required result. Q.E.D.

As used in Ogasawara [8], define
\[
I_k^{(x)} = 2^{k/2}\,\Gamma\!\left(\frac{k+1}{2}\right)\frac{1}{2\sqrt{\pi}}\, F_\Gamma\!\left(\frac{x^2}{2}\,\Big|\,\frac{k+1}{2}\right) \big[\,\mathrm{sign}(x)\,1\{k = 2v\} + 1\{k = 2v - 1\}\,\big] \quad (k, v = 0, 1, \ldots).
\]
Note that $I_k^{(r)} = I_k^{(\tilde b_r)} - I_k^{(\tilde a_r)}$. Then, we obtain the absolute moments of odd orders for stripe-truncated $X$.

Theorem 2.2 (Ogasawara [8, Theorem 2]) Let $d$ be as in Theorem 2.1. Then, for $X \sim \mathrm{N}(\mu, \sigma^2; a, b)$ and odd $s = 1, 3, \ldots$,
\[
\mathrm{E}(|X - d|^s \mid X \in S) = \sigma^s \sum_{k=0}^{s} {}_s\mathrm{C}_k\, \alpha^{-1} \sum_{r=1}^{R} \Big[ I_k^{(r)}\, \mathrm{sign}(\tilde b_r - d^*) + 2\big\{ I_k^{(\tilde a_r)} - I_k^{(d^*)} \big\}\, 1\{\tilde a_r \le d^* < \tilde b_r\} \Big]\, (-d^*)^{s-k}.
\]


Proof. For $X \sim \mathrm{N}(\mu, \sigma^2; a, b)$ and arbitrary $d$, we have
\[
\begin{aligned}
\mathrm{E}(|X - d|^s \mid X \in S) &= \alpha^{-1} \sum_{r=1}^{R} \int_{(a_r - \mu)/\sigma}^{(b_r - \mu)/\sigma} |\sigma x + \mu - d|^s\, \frac{1}{\sqrt{2\pi}} \exp\!\left( -\frac{x^2}{2} \right) dx \\
&= \sigma^s \alpha^{-1} \sum_{r=1}^{R} \int_{\tilde a_r}^{\tilde b_r} |x - d^*|^s\, \phi(x)\, dx \\
&= \sigma^s \alpha^{-1} \sum_{r=1}^{R} \Bigg[ \int_{\tilde a_r}^{\tilde b_r} (x - d^*)^s \phi(x)\, dx\; 1\{d^* < \tilde a_r\} \\
&\qquad + \left\{ \int_{\tilde a_r}^{\tilde b_r} (x - d^*)^s \phi(x)\, dx - 2 \int_{\tilde a_r}^{d^*} (x - d^*)^s \phi(x)\, dx \right\} 1\{\tilde a_r \le d^* < \tilde b_r\} \\
&\qquad - \int_{\tilde a_r}^{\tilde b_r} (x - d^*)^s \phi(x)\, dx\; 1\{\tilde b_r \le d^*\} \Bigg] \\
&= \sigma^s \alpha^{-1} \sum_{r=1}^{R} \left[ \int_{\tilde a_r}^{\tilde b_r} (x - d^*)^s \phi(x)\, dx\; \mathrm{sign}(\tilde b_r - d^*) - 2 \int_{\tilde a_r}^{d^*} (x - d^*)^s \phi(x)\, dx\; 1\{\tilde a_r \le d^* < \tilde b_r\} \right] \\
&= \sigma^s \alpha^{-1} \sum_{r=1}^{R} \sum_{k=0}^{s} {}_s\mathrm{C}_k \left[ \int_{\tilde a_r}^{\tilde b_r} x^k \phi(x)\, dx\, (-d^*)^{s-k}\, \mathrm{sign}(\tilde b_r - d^*) - 2 \int_{\tilde a_r}^{d^*} x^k \phi(x)\, dx\, (-d^*)^{s-k}\, 1\{\tilde a_r \le d^* < \tilde b_r\} \right] \\
&= \sigma^s \sum_{k=0}^{s} {}_s\mathrm{C}_k\, \alpha^{-1} \sum_{r=1}^{R} \Big[ I_k^{(r)}\, \mathrm{sign}(\tilde b_r - d^*) + 2\big\{ I_k^{(\tilde a_r)} - I_k^{(d^*)} \big\}\, 1\{\tilde a_r \le d^* < \tilde b_r\} \Big]\, (-d^*)^{s-k} \quad (s = 1, 3, \ldots).
\end{aligned}
\]
Q.E.D.

Note that for even $s$, the absolute moments are given by Theorem 2.1.

2.3 Series Expressions of $I_k^{(r)}$ ($k = 0, 1, \ldots$; $r = 1, \ldots, R$) for Moments of Integer-Valued Orders

The function $I_k^{(r)} = I_k^{(b_r)} - I_k^{(a_r)}$ ($k = 0, 1, \ldots$; $r = 1, \ldots, R$) derived in Lemma 2.1 plays an important role for the integer-valued moments of $X \sim \mathrm{N}(\mu, \sigma^2; a, b)$, as used in Theorems 2.1 and 2.2. Series expressions of $I_k^{(r)}$, useful for seeing its properties, are available as follows.

Lemma 2.2 (Ogasawara [8, Eqs. 5.2 and 5.3]) $I_k^{(r)} = I_k^{(b_r)} - I_k^{(a_r)}$ ($k = 0, 1, \ldots$; $r = 1, \ldots, R$) are given by cases:

(i) Even $k$ ($k = 0, 2, \ldots$; $r = 1, \ldots, R$):
\[
I_k^{(r)} = \sum_{u=1}^{k/2} \frac{(k-1)!!}{(2u-1)!!} \big\{ a_r^{2u-1} \phi(a_r) - b_r^{2u-1} \phi(b_r) \big\} + (k-1)!!\, \alpha_r,
\]
where when $k = 0$,
\[
I_0^{(r)} = (-1)!!\, \alpha_r = \int_{a_r}^{b_r} \phi(x)\, dx
\]
with the definition $(-1)!! = 1$.

(ii) Odd $k$ ($k = 1, 3, \ldots$; $r = 1, \ldots, R$):
\[
I_k^{(r)} = \sum_{u=0}^{(k-1)/2} \frac{(k-1)!!}{(2u)!!} \big\{ a_r^{2u} \phi(a_r) - b_r^{2u} \phi(b_r) \big\}.
\]

Proof. Integration by parts gives
\[
I_k^{(r)} = \int_{a_r}^{b_r} x^k \phi(x)\, dx = \int_{a_r}^{b_r} x^{k-1}\, x \phi(x)\, dx = \big[ -x^{k-1} \phi(x) \big]_{a_r}^{b_r} + (k-1) \int_{a_r}^{b_r} x^{k-2} \phi(x)\, dx = a_r^{k-1} \phi(a_r) - b_r^{k-1} \phi(b_r) + (k-1)\, I_{k-2}^{(r)}
\]
with $I_0^{(r)} = \int_{a_r}^{b_r} \phi(x)\, dx = \alpha_r$ and $I_1^{(r)} = \int_{a_r}^{b_r} x \phi(x)\, dx = \phi(a_r) - \phi(b_r)$ ($k = 2, 3, \ldots$; $r = 1, \ldots, R$).

(i) Even $k$ ($k = 0, 2, \ldots$; $r = 1, \ldots, R$):
\[
\begin{aligned}
I_k^{(r)} &= a_r^{k-1} \phi(a_r) - b_r^{k-1} \phi(b_r) + (k-1)\big\{ a_r^{k-3} \phi(a_r) - b_r^{k-3} \phi(b_r) \big\} + (k-1)(k-3)\big\{ a_r^{k-5} \phi(a_r) - b_r^{k-5} \phi(b_r) \big\} + \cdots \\
&\quad + (k-1)(k-3)\cdots 5 \cdot 3\, \big\{ a_r \phi(a_r) - b_r \phi(b_r) \big\} + (k-1)(k-3)\cdots 5 \cdot 3 \cdot 1\, I_0^{(r)} \\
&= \sum_{u=1}^{k/2} \frac{(k-1)!!}{(2u-1)!!} \big\{ a_r^{2u-1} \phi(a_r) - b_r^{2u-1} \phi(b_r) \big\} + (k-1)!!\, \alpha_r;
\end{aligned}
\]
(ii) Odd $k$ ($k = 1, 3, \ldots$; $r = 1, \ldots, R$):
\[
\begin{aligned}
I_k^{(r)} &= a_r^{k-1} \phi(a_r) - b_r^{k-1} \phi(b_r) + (k-1)\big\{ a_r^{k-3} \phi(a_r) - b_r^{k-3} \phi(b_r) \big\} + (k-1)(k-3)\big\{ a_r^{k-5} \phi(a_r) - b_r^{k-5} \phi(b_r) \big\} + \cdots \\
&\quad + (k-1)(k-3)\cdots 6 \cdot 4\, \big\{ a_r^2 \phi(a_r) - b_r^2 \phi(b_r) \big\} + (k-1)(k-3)\cdots 6 \cdot 4 \cdot 2\, I_1^{(r)} \\
&= \sum_{u=1}^{(k-1)/2} \frac{(k-1)!!}{(2u)!!} \big\{ a_r^{2u} \phi(a_r) - b_r^{2u} \phi(b_r) \big\} + (k-1)!! \big\{ \phi(a_r) - \phi(b_r) \big\} \\
&= \sum_{u=0}^{(k-1)/2} \frac{(k-1)!!}{(2u)!!} \big\{ a_r^{2u} \phi(a_r) - b_r^{2u} \phi(b_r) \big\},
\end{aligned}
\]
which give the required results. Q.E.D.

Note that in the even case of Lemma 2.2, the term $(k-1)!!\,\alpha_r$ is not given by $u = 0$ in $\sum_{u=1}^{k/2} \frac{(k-1)!!}{(2u-1)!!} \{ a_r^{2u-1} \phi(a_r) - b_r^{2u-1} \phi(b_r) \}$ but formally by $u = 1/2$. Ogasawara [8, Appendix A.5] gave a second proof of Lemma 2.1 using the above series expressions in Lemma 2.2. The extension of Lemmas 2.1 and 2.2 to the cases when $k > -1$ is real-valued for absolute moments will be given in the next section by introducing the new real-valued Poisson distribution, whose special

cases are the usual Poisson and the half Poisson distributions. The latter distribution was defined by Ogasawara [8].

Definition 2.1 (Ogasawara [8, Definition 3]) The distribution of a variable taking positive half-integers with the probability function
\[
\Pr(X = v + 0.5) = \frac{\lambda^{v+0.5}/\Gamma(v + 1.5)}{\sum_{u=0}^{\infty} \lambda^{u+0.5}/\Gamma(u + 1.5)} \quad (v = 0, 1, \ldots)
\]
is called the half Poisson distribution with the parameter $\lambda > 0$.

Ogasawara [8, Eq. 5.6] showed that when, e.g., $\lambda = a_r^2/2$ for the cases of even $k$ in Lemma 2.2,
\[
\Pr(X \le v + 0.5) = \frac{ e^{\lambda} \left\{ \dfrac{\Gamma(\lambda \mid v + 1.5)}{\Gamma(v + 1.5)} - \dfrac{\Gamma(\lambda \mid 0.5)}{\Gamma(0.5)} \right\} }{ \sum_{u=0}^{\infty} \lambda^{u+0.5}/\Gamma(u + 1.5) } = \frac{ \sqrt{2/\pi}\, \sum_{u=1}^{v+1} \dfrac{a_r^{2u-1}\, \mathrm{sign}(a_r)}{(2u - 1)!!} }{ \sum_{u=0}^{\infty} (a_r^2/2)^{u+0.5}/\Gamma(u + 1.5) } \quad (v = 0, 1, \ldots),
\]
where $\Gamma(\lambda \mid v + 1.5)$ is the upper incomplete gamma function at $\lambda$, whose complete counterpart is $\Gamma(0 \mid v + 1.5) = \Gamma(v + 1.5)$.
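The series expressions of Lemma 2.2 can be checked against quadrature. The sketch below uses hypothetical standardized stripe limits; the helper `dfact` (an illustrative name) implements the double factorial with $(-1)!! = 1$.

```python
# Numerical sketch of Lemma 2.2: the series for I_k^{(r)} = ∫ x^k φ(x) dx
# over one stripe, compared with direct quadrature.
from scipy.integrate import quad
from scipy.stats import norm

phi = norm.pdf
a_r, b_r = -0.7, 1.8                         # illustrative standardized limits
alpha_r = norm.cdf(b_r) - norm.cdf(a_r)

def dfact(n):                                # double factorial with (-1)!! = 1
    return 1 if n <= 0 else n * dfact(n - 2)

def I_series(k):
    if k % 2 == 0:                           # even k: Lemma 2.2 (i)
        s = sum(dfact(k - 1) / dfact(2 * u - 1)
                * (a_r ** (2 * u - 1) * phi(a_r) - b_r ** (2 * u - 1) * phi(b_r))
                for u in range(1, k // 2 + 1))
        return s + dfact(k - 1) * alpha_r
    return sum(dfact(k - 1) / dfact(2 * u)   # odd k: Lemma 2.2 (ii)
               * (a_r ** (2 * u) * phi(a_r) - b_r ** (2 * u) * phi(b_r))
               for u in range((k - 1) // 2 + 1))

for k in range(7):
    direct = quad(lambda x: x**k * phi(x), a_r, b_r)[0]
    print(k, abs(I_series(k) - direct) < 1e-9)
```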

2.4 The Real-Valued Poisson Distribution for Series Expressions of $I_k^{(r)}$ ($k = 0, 1, \ldots$; $r = 1, \ldots, R$) for Absolute Moments

2.4.1 Generalization of the Poisson Distribution

The Poisson distribution is one of the most basic discrete distributions taking non-negative integers. Generalizations of the distribution, with emphasis on regression, have been given by Consul and Jain [4, Eq. (3.1)], Consul [2], Letac and Mora [7, Example D], Consul and Famoye [3, Eq. (2.3)], Zamani and Ismail [12, Eq. (6)], and Chandra, Roy and Ghosh [1, Eq. (2)], mainly to cope with the overdispersion/underdispersion frequently encountered in practice (for the summarized mean–variance relationships of these generalized Poisson distributions (GPDs), see Wagh and Kamalja [10, Table 1]). It is known that for the Poisson-distributed variable $X$ with the parameter $\lambda$, its cdf at $X = k$ ($k = 0, 1, \ldots$) is given by

\[
\sum_{v=0}^{k} \lambda^v e^{-\lambda}/v! = 1 - \int_0^{\lambda} x^k e^{-x}\, dx \Big/ \Gamma(k+1) = \int_{\lambda}^{\infty} x^k e^{-x}\, dx \Big/ \Gamma(k+1) = \Gamma(\lambda \mid k+1)/\Gamma(k+1) \tag{2.2}
\]
(see, e.g., Johnson et al. [6, p. 372]). The GPD of Chandra et al. [1] uses a relationship similar to (2.2), yielding the following probability function:
\[
\Pr(X = v) =
\begin{cases}
\displaystyle\int_{\lambda}^{\infty} x^{\eta} e^{-x}\, dx \Big/ \Gamma(1 + \eta) & (v = 0;\; 0 \le \eta < 1) \\
\lambda^{v+\eta} e^{-\lambda} / \Gamma(v + 1 + \eta) & (v = 1, 2, \ldots;\; 0 \le \eta < 1).
\end{cases} \tag{2.3}
\]
The half Poisson distribution obtained by Ogasawara [8, Definition 3] is another generalization of the Poisson distribution, whose probability function was shown in Definition 2.1. It differs from the GPDs mentioned earlier in that the half Poisson distribution takes positive half-integers rather than the non-negative integers of the GPDs. The half Poisson distribution, associated with the upper incomplete gamma function $\Gamma(\lambda \mid k+1)$, was introduced for the series expansion of the raw partial moments of even orders of the standard normal distribution. In the next subsection, the real-valued Poisson (r-Poisson) distribution will be defined, whose special cases are the usual and half Poisson distributions. Applications will be shown for series expressions of the absolute partial moments of real orders in the standard normal distribution.

2.4.2 The Real-Valued Poisson Distribution

Before presenting a new distribution, we provide the following lemma, which is an extension of (2.2).

Lemma 2.3 For $k = 0, 1, \ldots$; $0 \le \lambda < \infty$; and $0 \le \eta < 1$ with $0^0 \equiv 1$, we have
\[
\frac{\Gamma(\lambda \mid k + 1 + \eta)}{\Gamma(k + 1 + \eta)} =
\begin{cases}
\displaystyle\sum_{v=0}^{k} \frac{\lambda^{v+\eta} e^{-\lambda}}{\Gamma(v + 1 + \eta)} + \frac{\Gamma(\lambda \mid \eta)}{\Gamma(\eta)} & (0 < \eta < 1) \\[2mm]
\displaystyle\sum_{v=0}^{k} \frac{\lambda^{v} e^{-\lambda}}{\Gamma(v + 1)} & (\eta = 0).
\end{cases} \tag{2.4}
\]

Proof. When $0 < \eta < 1$, we obtain
\[
\begin{aligned}
\frac{\Gamma(\lambda \mid k + 1 + \eta)}{\Gamma(k + 1 + \eta)} &= \int_{\lambda}^{\infty} x^{k+\eta} e^{-x}\, dx \Big/ \Gamma(k + 1 + \eta) \\
&= \frac{1}{\Gamma(k + 1 + \eta)} \left\{ \big[ -x^{k+\eta} e^{-x} \big]_{\lambda}^{\infty} + (k + \eta) \int_{\lambda}^{\infty} x^{k-1+\eta} e^{-x}\, dx \right\} \\
&= \frac{\lambda^{k+\eta} e^{-\lambda}}{\Gamma(k + 1 + \eta)} + \frac{1}{\Gamma(k + \eta)} \int_{\lambda}^{\infty} x^{k-1+\eta} e^{-x}\, dx \\
&= \frac{\lambda^{k+\eta} e^{-\lambda}}{\Gamma(k + 1 + \eta)} + \frac{\lambda^{k-1+\eta} e^{-\lambda}}{\Gamma(k + \eta)} + \cdots + \frac{\lambda^{\eta} e^{-\lambda}}{\Gamma(1 + \eta)} + \frac{1}{\Gamma(\eta)} \int_{\lambda}^{\infty} x^{-1+\eta} e^{-x}\, dx \\
&= \sum_{v=0}^{k} \frac{\lambda^{v+\eta} e^{-\lambda}}{\Gamma(v + 1 + \eta)} + \frac{\Gamma(\lambda \mid \eta)}{\Gamma(\eta)}.
\end{aligned}
\]
The case of $\eta = 0$ is derived from the above result when $\eta$ is temporarily relaxed to take 1:
\[
\frac{\Gamma(\lambda \mid k + 2)}{\Gamma(k + 2)} = \sum_{v=0}^{k} \frac{\lambda^{v+1} e^{-\lambda}}{\Gamma(v + 2)} + \frac{\Gamma(\lambda \mid 1)}{\Gamma(1)} = \sum_{v=0}^{k} \frac{\lambda^{v+1} e^{-\lambda}}{\Gamma(v + 2)} + e^{-\lambda} = \sum_{v=0}^{k+1} \frac{\lambda^{v} e^{-\lambda}}{v!}.
\]
Redefining $k + 1$ as $k$, we have the second result of (2.4). Alternatively, the second result is obtained by stopping the series at the second-last term, whose remainder is replaced by $\frac{1}{\Gamma(1 + \eta)} \int_{\lambda}^{\infty} x^{\eta} e^{-x}\, dx = e^{-\lambda}$ when $\eta = 0$. Q.E.D.

In Lemma 2.3, the known second result when $\eta = 0$ is included for convenience. The definition $0^0 \equiv 1$ is required for the second result, where the left-hand side of (2.4) is well defined as 1 when $\lambda = 0$ since $\Gamma(0 \mid k + 1 + \eta) = \Gamma(k + 1 + \eta)$, while the first term on the right-hand side of (2.4) when $\eta = 0$ and $v = 0$ should be $\lambda^0 e^{-\lambda}/0! = 1$ when $\lambda = 0$. When $\eta = 0.5$ and $k = \infty$, (2.4) gives the closed-form normalizer for the half Poisson distribution in Definition 2.1. In the above results and the remainder of this section, the expression, e.g., $k + 1 + \eta$ is used rather than $k + \eta + 1$ to indicate that $k + 1$ is an integer while $0 \le \eta < 1$ is real-valued.
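Lemma 2.3 is easy to verify with the regularized upper incomplete gamma function `scipy.special.gammaincc`; the values of $\lambda$, $k$, and $\eta$ below are illustrative.

```python
# Numerical sketch of Lemma 2.3 / Eq. (2.4): the regularized upper incomplete
# gamma function as a Poisson-type partial sum plus a remainder term.
from math import exp, factorial
from scipy.special import gammaincc, gamma

lam, k = 1.7, 5

# eta = 0: the classical Poisson cdf identity (2.2)
lhs0 = gammaincc(k + 1, lam)
rhs0 = sum(lam**v * exp(-lam) / factorial(v) for v in range(k + 1))
print(abs(lhs0 - rhs0) < 1e-10)

# 0 < eta < 1: Eq. (2.4) with the extra remainder Gamma(lam|eta)/Gamma(eta)
eta = 0.3
lhs = gammaincc(k + 1 + eta, lam)
rhs = sum(lam**(v + eta) * exp(-lam) / gamma(v + 1 + eta)
          for v in range(k + 1)) + gammaincc(eta, lam)
print(abs(lhs - rhs) < 1e-10)
```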

Definition 2.2 The discrete distribution taking non-negative equally spaced real values with unit steps is defined as the real-valued Poisson (r-Poisson) distribution when its probability function is given by the following equivalent sets of expressions:
\[
\begin{aligned}
\Pr(X = v + \eta \mid \lambda, \eta) &= \frac{\lambda^{v+\eta}/\Gamma(v + 1 + \eta)}{\sum_{u=0}^{\infty} \lambda^{u+\eta}/\Gamma(u + 1 + \eta)} \quad (v = 0, 1, \ldots;\; \lambda > 0;\; 0 \le \eta < 1) \\
&= \begin{cases}
\dfrac{\lambda^{v+\eta}}{\Gamma(v + 1 + \eta)\, e^{\lambda} [1 - \{\Gamma(\lambda \mid \eta)/\Gamma(\eta)\}]} & (0 < \eta < 1) \\
\lambda^{v} / (e^{\lambda} v!) & (\eta = 0)
\end{cases} \\
&= \begin{cases}
\dfrac{\lambda^{v+\eta}}{e^{\lambda} \{\Gamma(v + 1 + \eta) - (\eta)_{v+1}\, \Gamma(\lambda \mid \eta)\}} & (0 < \eta < 1) \\
\lambda^{v} / (e^{\lambda} v!) & (\eta = 0),
\end{cases}
\end{aligned} \tag{2.5}
\]
where $(\eta)_{v+1} = \eta (1 + \eta) \cdots (v + \eta) = \Gamma(v + 1 + \eta)/\Gamma(\eta)$ is the rising or ascending factorial using the Pochhammer notation.

Proof of the second expression of the normalizer $\Gamma(v + 1 + \eta)\, e^{\lambda} [1 - \{\Gamma(\lambda \mid \eta)/\Gamma(\eta)\}]$: Using Lemma 2.3 when $0 < \eta < 1$ and $k = \infty$, the result $\Gamma(\lambda \mid \infty)/\Gamma(\infty) = 1$ gives the required normalizer. We prove this limit by the Chebyshev inequality. Let $X$ be gamma distributed with the shape parameter $k^* > \lambda$ and the unit scale parameter. Then, using $\mathrm{E}(X) = \mathrm{var}(X) = k^*$ and the Chebyshev inequality, we have
\[
\Pr(X < \lambda) = 1 - \frac{\Gamma(\lambda \mid k^*)}{\Gamma(k^*)} \le \frac{\mathrm{var}(X)}{\{\lambda - \mathrm{E}(X)\}^2} = \frac{k^*}{(\lambda - k^*)^2} = \frac{1}{\lambda^2/k^* - 2\lambda + k^*}.
\]
When $k^*$ goes to $\infty$, the last result approaches 0, which indicates $\Gamma(\lambda \mid \infty)/\Gamma(\infty) = 1$. Q.E.D.

In Definition 2.2, the cases of $\eta = 0$ and $\eta = 0.5$ give the usual and half Poisson distributions, respectively. The first expression of (2.5) covers $\eta = 0$ as well as $0 < \eta < 1$. The cdf of the r-Poisson distribution is given as follows.

Theorem 2.3 When $X$ follows the r-Poisson distribution with $\lambda > 0$ and $0 \le \eta < 1$, we have, for $k = 0, 1, \ldots$,
\[
\Pr(X \le k + \eta \mid \lambda, \eta) =
\begin{cases}
\left\{ \dfrac{\Gamma(\lambda \mid k + 1 + \eta)}{\Gamma(k + 1 + \eta)} - \dfrac{\Gamma(\lambda \mid \eta)}{\Gamma(\eta)} \right\} \left\{ 1 - \dfrac{\Gamma(\lambda \mid \eta)}{\Gamma(\eta)} \right\}^{-1} & (0 < \eta < 1) \\[2mm]
\Gamma(\lambda \mid k + 1)/\Gamma(k + 1) & (\eta = 0).
\end{cases}
\]
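A minimal sketch of Definition 2.2 and Theorem 2.3, using `scipy.special.gammainc` for $F_\Gamma(\lambda \mid \eta) = 1 - \Gamma(\lambda \mid \eta)/\Gamma(\eta)$; the parameter values are illustrative.

```python
# Numerical sketch of the r-Poisson probability function (2.5) with its
# closed-form normalizer, and the cdf of Theorem 2.3.
from math import exp
from scipy.special import gamma, gammainc, gammaincc

lam, eta = 2.4, 0.6

def pmf(v):        # Pr(X = v + eta); gammainc(eta, lam) = F_Gamma(lam|eta)
    return lam**(v + eta) / (gamma(v + 1 + eta) * exp(lam) * gammainc(eta, lam))

# the closed-form normalizer matches the infinite series
print(abs(sum(pmf(v) for v in range(200)) - 1.0) < 1e-10)

# Theorem 2.3: cdf at k + eta via regularized upper incomplete gammas
k = 3
cdf_series = sum(pmf(v) for v in range(k + 1))
cdf_closed = (gammaincc(k + 1 + eta, lam) - gammaincc(eta, lam)) / (1 - gammaincc(eta, lam))
print(abs(cdf_series - cdf_closed) < 1e-10)
```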

Proof. Using Lemma 2.3, we obtain the required results. Q.E.D.

Remark 2.1 In Theorem 2.3, the second result, for the usual Poisson distribution, is known as addressed earlier. The second result is also given from the first result with $\lim_{\eta \to +0} \{\Gamma(\lambda \mid \eta)/\Gamma(\eta)\} = 0$, though $\Gamma(0)$ is not defined. The zero limit is proved again by the Chebyshev inequality. Let $X$ be gamma distributed with the shape parameter $\eta$ ($0 < \eta < \lambda$) and the unit scale parameter. Then, we obtain
\[
\Pr(X > \lambda) = \frac{\Gamma(\lambda \mid \eta)}{\Gamma(\eta)} \le \frac{\mathrm{var}(X)}{\{\lambda - \mathrm{E}(X)\}^2} = \frac{\eta}{(\lambda - \eta)^2}.
\]
It is found that $\lim_{\eta \to +0} \{\eta/(\lambda - \eta)^2\} = 0$ gives $\lim_{\eta \to +0} \{\Gamma(\lambda \mid \eta)/\Gamma(\eta)\} = 0$.

Theorem 2.4 Define $F_\Gamma(\lambda \mid \eta)$ as the cdf of the gamma distribution at $\lambda$ with the shape parameter $\eta$ and the unit scale parameter. Then, for the r-Poisson distribution with $\lambda > 0$ and $0 < \eta < 1$, we have
\[
\mathrm{E}(X \mid \lambda, \eta) = \mathrm{E}(X) = \frac{\lambda^{\eta}}{e^{\lambda} F_\Gamma(\lambda \mid \eta)\, \Gamma(\eta)} + \lambda,
\qquad
\mathrm{var}(X) = \frac{\lambda^{\eta} (\eta - \lambda)}{e^{\lambda} F_\Gamma(\lambda \mid \eta)\, \Gamma(\eta)} + \lambda - \left\{ \frac{\lambda^{\eta}}{e^{\lambda} F_\Gamma(\lambda \mid \eta)\, \Gamma(\eta)} \right\}^2
\]
and the moment generating function
\[
M(t) = \exp\{(e^t - 1)\lambda\}\, \frac{F_\Gamma(e^t \lambda \mid \eta)}{F_\Gamma(\lambda \mid \eta)}.
\]

Proof.
\[
\begin{aligned}
\mathrm{E}(X) &= \sum_{v=0}^{\infty} \frac{(v + \eta)\, \lambda^{v+\eta}}{\Gamma(v + 1 + \eta)\, e^{\lambda} [1 - \{\Gamma(\lambda \mid \eta)/\Gamma(\eta)\}]} = \lambda \sum_{v=0}^{\infty} \frac{\lambda^{v-1+\eta}}{\Gamma(v + \eta)\, e^{\lambda} F_\Gamma(\lambda \mid \eta)} \\
&= \frac{\lambda}{e^{\lambda} F_\Gamma(\lambda \mid \eta)} \left\{ \frac{\lambda^{-1+\eta}}{\Gamma(\eta)} + \sum_{v=1}^{\infty} \frac{\lambda^{v-1+\eta}}{\Gamma(v + \eta)} \right\} = \frac{\lambda}{e^{\lambda} F_\Gamma(\lambda \mid \eta)} \left\{ \frac{\lambda^{-1+\eta}}{\Gamma(\eta)} + \sum_{v=0}^{\infty} \frac{\lambda^{v+\eta}}{\Gamma(v + 1 + \eta)} \right\},
\end{aligned}
\]
which gives the required expression of $\mathrm{E}(X)$ using Lemma 2.3 and the proof in Definition 2.2. For $\mathrm{var}(X)$, we have
\[
\begin{aligned}
\mathrm{E}\{X(X - 1)\} &= \sum_{v=0}^{\infty} \frac{(v + \eta)(v - 1 + \eta)\, \lambda^{v+\eta}}{\Gamma(v + 1 + \eta)\, e^{\lambda} F_\Gamma(\lambda \mid \eta)} \\
&= \frac{1}{e^{\lambda} F_\Gamma(\lambda \mid \eta)} \left\{ \frac{\eta(-1 + \eta)\, \lambda^{\eta}}{\Gamma(1 + \eta)} + \frac{(1 + \eta)\eta\, \lambda^{1+\eta}}{\Gamma(2 + \eta)} + \sum_{v=2}^{\infty} \frac{\lambda^{v+\eta}}{\Gamma(v - 1 + \eta)} \right\} \\
&= \frac{1}{e^{\lambda} F_\Gamma(\lambda \mid \eta)} \left\{ \frac{\lambda^{\eta}}{\Gamma(\eta)} (-1 + \eta + \lambda) + \lambda^2 \sum_{v=0}^{\infty} \frac{\lambda^{v+\eta}}{\Gamma(v + 1 + \eta)} \right\} = \frac{\lambda^{\eta} (-1 + \eta + \lambda)}{e^{\lambda} F_\Gamma(\lambda \mid \eta)\, \Gamma(\eta)} + \lambda^2,
\end{aligned}
\]
which gives
\[
\mathrm{var}(X) = \mathrm{E}\{X(X - 1)\} + \mathrm{E}(X) - \{\mathrm{E}(X)\}^2 = \frac{\lambda^{\eta} (\eta + \lambda)}{e^{\lambda} F_\Gamma(\lambda \mid \eta)\, \Gamma(\eta)} + \lambda^2 + \lambda - \left\{ \frac{\lambda^{\eta}}{e^{\lambda} F_\Gamma(\lambda \mid \eta)\, \Gamma(\eta)} + \lambda \right\}^2,
\]
yielding the required result. For $M(t)$, we have
\[
\begin{aligned}
M(t) = \mathrm{E}(e^{tX}) &= \sum_{v=0}^{\infty} \frac{(e^t \lambda)^{v+\eta}}{\Gamma(v + 1 + \eta)\, e^{\lambda} F_\Gamma(\lambda \mid \eta)} \\
&= \frac{\exp(e^t \lambda)\, F_\Gamma(e^t \lambda \mid \eta)}{e^{\lambda} F_\Gamma(\lambda \mid \eta)} \sum_{v=0}^{\infty} \frac{(e^t \lambda)^{v+\eta}}{\Gamma(v + 1 + \eta)\, \exp(e^t \lambda)\, F_\Gamma(e^t \lambda \mid \eta)} = \frac{\exp(e^t \lambda)\, F_\Gamma(e^t \lambda \mid \eta)}{e^{\lambda} F_\Gamma(\lambda \mid \eta)},
\end{aligned}
\]
which is equal to the required expression. Q.E.D.

The moments can be given by $M(t)$, though the above mean and variance are obtained by simpler methods. The known $M(t) = \exp\{(e^t - 1)\lambda\}$ for the usual Poisson case is given by the above result using $\lim_{\eta \to +0} F_\Gamma(e^t \lambda \mid \eta) = \lim_{\eta \to +0} F_\Gamma(\lambda \mid \eta) = 1$ (see the associated result in Remark 2.1).
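The mean and variance of Theorem 2.4 can be confirmed by summing the r-Poisson series directly; $\lambda$ and $\eta$ below are illustrative, and `A` is a hypothetical name for the recurring factor $\lambda^{\eta}/\{e^{\lambda} F_\Gamma(\lambda \mid \eta)\, \Gamma(\eta)\}$.

```python
# Numerical sketch of Theorem 2.4: mean and variance of the r-Poisson
# distribution against their closed forms.
from math import exp
from scipy.special import gamma, gammainc

lam, eta = 1.3, 0.4
C = exp(lam) * gammainc(eta, lam)            # normalizer e^lam * F_Gamma(lam|eta)
pmf = lambda v: lam**(v + eta) / (gamma(v + 1 + eta) * C)

mean_series = sum((v + eta) * pmf(v) for v in range(300))
var_series = sum((v + eta) ** 2 * pmf(v) for v in range(300)) - mean_series**2

A = lam**eta / (C * gamma(eta))              # lam^eta / {e^lam F_Gamma(lam|eta) Gamma(eta)}
mean_closed = A + lam
var_closed = A * (eta - lam) + lam - A**2
print(abs(mean_series - mean_closed) < 1e-8, abs(var_series - var_closed) < 1e-8)
```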

2.4.3 Applications to the Series Expressions of the Moments of the Normal Distribution

Recall the definition of $I_k^{(r)}$:
\[
I_k^{(r)} = \int_{a_r}^{b_r} x^k \phi(x)\, dx \quad (r = 1, \ldots, R;\; -\infty \le a_1 < b_1 < \cdots < a_R < b_R \le \infty;\; k = 0, 1, \ldots),
\]
which is a partial moment for the $r$-th interval of the normal distribution under double/stripe truncation and is similar to Fisher's [5, Eq. (11)] $I_n$ function under single truncation:
\[
I_n = \frac{1}{n!} \int_x^{\infty} (t - x)^n \phi(t)\, dt.
\]
As in Fisher [5], we extend $I_k^{(r)}$ to the case when $k > -1$ is real-valued, with a simplified notation, as
\[
\begin{aligned}
I_k &= \int_a^b x^k \phi(x)\, dx \quad (0 \le a < b \le \infty;\; k > -1) \\
&= \frac{1}{2\sqrt{\pi}} \int_{a^2/2}^{b^2/2} 2^{k/2}\, y^{(k-1)/2} e^{-y}\, dy \quad \big( y = x^2/2,\; dx/dy = 1/\sqrt{2y} \big) \\
&= 2^{k/2}\, \Gamma\!\left(\frac{k+1}{2}\right) \frac{1}{2\sqrt{\pi}} \left\{ F_\Gamma\!\left(\frac{b^2}{2} \,\Big|\, \frac{k+1}{2}\right) - F_\Gamma\!\left(\frac{a^2}{2} \,\Big|\, \frac{k+1}{2}\right) \right\},
\end{aligned}
\]
where $2^{k/2}\, \Gamma\{(k+1)/2\}/\sqrt{\pi} = \mathrm{E}(|X|^k)$ ($k > -1$) when $X \sim \mathrm{N}(0, 1)$ (Winkelbauer [11, Eq. (18)]). Employing the notation $k + \eta$ ($k = -1, 0, 1, \ldots$; $0 \le \eta < 1$; $k + \eta > -1$) given earlier and using integration by parts, we have the recurrence expression:

\[
\begin{aligned}
I_{k+\eta} &= \int_a^b x^{k+\eta} \phi(x)\, dx = \big[ -x^{k-1+\eta} \phi(x) \big]_a^b + (k - 1 + \eta) \int_a^b x^{k-2+\eta} \phi(x)\, dx \\
&= a^{k-1+\eta} \phi(a) - b^{k-1+\eta} \phi(b) + (k - 1 + \eta)\, I_{k-2+\eta} \quad (0 \le a < b \le \infty;\; k = 2, 3, \ldots;\; 0 \le \eta < 1).
\end{aligned} \tag{2.6}
\]

(i) Even $k = 2, 4, \ldots$ with $0 < \eta < 1$. The case of even $k$ with $\eta = 0$ will be given in (ii). Under the condition of (i), (2.6) gives
\[
\begin{aligned}
I_{k+\eta} = \int_a^b x^{k+\eta} \phi(x)\, dx &= a^{k-1+\eta} \phi(a) - b^{k-1+\eta} \phi(b) + (k - 1 + \eta) \big\{ a^{k-3+\eta} \phi(a) - b^{k-3+\eta} \phi(b) \big\} \\
&\quad + (k - 1 + \eta)(k - 3 + \eta) \big\{ a^{k-5+\eta} \phi(a) - b^{k-5+\eta} \phi(b) \big\} + \cdots \\
&\quad + (k - 1 + \eta)(k - 3 + \eta) \cdots (5 + \eta)(3 + \eta) \big\{ a^{1+\eta} \phi(a) - b^{1+\eta} \phi(b) \big\} \\
&\quad + (k - 1 + \eta)(k - 3 + \eta) \cdots (5 + \eta)(3 + \eta)(1 + \eta)\, I_{\eta} \\
&= \sum_{u=1}^{k/2} \frac{(k - 1 + \eta)!!}{(2u - 1 + \eta)!!} \big\{ a^{2u-1+\eta} \phi(a) - b^{2u-1+\eta} \phi(b) \big\} + (k - 1 + \eta)!!\, I_{\eta},
\end{aligned}
\]
where $(k - 1 + \eta)!! = (k - 1 + \eta)(k - 3 + \eta) \cdots (3 + \eta)(1 + \eta)$ is an extended double factorial for odd $k - 1 \ge 1$ and $0 < \eta < 1$. Define $\lambda = a^2/2$ and $\eta^* = (1 + \eta)/2$. Then, we obtain
\[
\begin{aligned}
\sum_{u=1}^{k/2} \frac{(k - 1 + \eta)!!}{(2u - 1 + \eta)!!}\, a^{2u-1+\eta} \phi(a) &= \frac{(k - 1 + \eta)!!}{\sqrt{2\pi}}\, e^{-a^2/2} \left\{ \frac{a^{1+\eta}}{1 + \eta} + \frac{a^{3+\eta}}{(3 + \eta)(1 + \eta)} + \cdots + \frac{a^{k-1+\eta}}{(k - 1 + \eta)!!} \right\} \\
&= (k - 1 + \eta)!!\, \frac{2^{\eta/2}}{2\sqrt{\pi}}\, \Gamma(\eta^*) \left\{ \frac{\lambda^{\eta^*} e^{-\lambda}}{\Gamma(1 + \eta^*)} + \frac{\lambda^{1+\eta^*} e^{-\lambda}}{\Gamma(2 + \eta^*)} + \cdots + \frac{\lambda^{(k/2)-1+\eta^*} e^{-\lambda}}{\Gamma\{(k/2) + \eta^*\}} \right\} \\
&= (k - 1 + \eta)!!\, \frac{2^{\eta/2}}{2\sqrt{\pi}}\, \{\Gamma(\eta^*) - \Gamma(\lambda \mid \eta^*)\}\, \Pr\{X \le (k/2) - 1 + \eta^* \mid \lambda,\, \eta^*\}.
\end{aligned}
\]
In the last result, $X$ is r-Poisson distributed, where $(k/2) - 1$ is an integer (see Definition 2.2). Since the corresponding result for $b$ is similarly given using $\lambda = b^2/2$, we have
\[
\begin{aligned}
I_{k+\eta} = \int_a^b x^{k+\eta} \phi(x)\, dx &= (k - 1 + \eta)!!\, \frac{2^{\eta/2}}{2\sqrt{\pi}} \Big[ \big\{ \Gamma(\eta^*) - \Gamma(a^2/2 \mid \eta^*) \big\} \Pr\{X \le (k/2) - 1 + \eta^* \mid a^2/2,\, \eta^*\} \\
&\qquad - \big\{ \Gamma(\eta^*) - \Gamma(b^2/2 \mid \eta^*) \big\} \Pr\{X \le (k/2) - 1 + \eta^* \mid b^2/2,\, \eta^*\} \Big] + (k - 1 + \eta)!!\, I_{\eta}.
\end{aligned}
\]

(ii) Even $k = 2, 4, \ldots$ with $\eta = 0$. The essential results in (ii) were derived by Ogasawara [8, Sect. 5.1 (i) and A.5 (i)], where $\eta^* = (1 + \eta)/2 = 1/2$ gives the half Poisson distribution. The following results are given by those in (i) with $\eta = 0$:
\[
\begin{aligned}
I_k = \int_a^b x^k \phi(x)\, dx &= (k - 1)!!\, \frac{1}{2\sqrt{\pi}} \Big[ \big\{ \sqrt{\pi} - \Gamma(a^2/2 \mid 1/2) \big\} \Pr\{X \le (k - 1)/2 \mid a^2/2,\, 1/2\} \\
&\qquad - \big\{ \sqrt{\pi} - \Gamma(b^2/2 \mid 1/2) \big\} \Pr\{X \le (k - 1)/2 \mid b^2/2,\, 1/2\} \Big] + (k - 1)!!\, I_0,
\end{aligned} \tag{2.7}
\]
where
\[
I_0 = \int_a^b \phi(x)\, dx = \frac{1}{2\sqrt{\pi}} \big\{ \Gamma(a^2/2 \mid 1/2) - \Gamma(b^2/2 \mid 1/2) \big\}.
\]
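As a numerical sketch of (2.7): after the simplification $\Gamma(1/2) = \sqrt{\pi}$, the bracketed factors reduce to differences of regularized upper incomplete gamma functions $Q(s, x)$ = `scipy.special.gammaincc(s, x)`; the limits $a$, $b$ below are illustrative.

```python
# Numerical sketch of Eq. (2.7): even-order partial moments via the
# half-Poisson (eta* = 1/2) cdf, checked against quadrature.
from math import sqrt, pi, exp
from scipy.integrate import quad
from scipy.special import gammaincc as Q

phi = lambda x: exp(-x * x / 2) / sqrt(2 * pi)
dfact = lambda n: 1 if n <= 0 else n * dfact(n - 2)   # (-1)!! = 1

a, b = 0.3, 1.9                      # illustrative limits, 0 <= a < b
la, lb = a * a / 2, b * b / 2
for k in (2, 4, 6):
    v = k // 2 - 1                   # Pr{X <= (k-1)/2 | lam, 1/2} has v = k/2 - 1
    term = lambda lam: Q(v + 1.5, lam) - Q(0.5, lam)
    I0 = (Q(0.5, la) - Q(0.5, lb)) / 2
    closed = dfact(k - 1) / 2 * (term(la) - term(lb)) + dfact(k - 1) * I0
    direct = quad(lambda x: x**k * phi(x), a, b)[0]
    print(k, abs(closed - direct) < 1e-9)
```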

In (2.7), when $Y \sim \mathrm{N}(0, 1)$, it is well known that $\mathrm{E}(Y^k) = (k - 1)!!$ ($k = 2, 4, \ldots$), which is equal to $2^{k/2}\, \Gamma\{(k + 1)/2\}/\sqrt{\pi}$ as mentioned earlier.

(iii) Odd $k = 1, 3, \ldots$ with $0 < \eta < 1$. Under the condition of (iii), (2.6) gives
\[
\begin{aligned}
I_{k+\eta} = \int_a^b x^{k+\eta} \phi(x)\, dx &= a^{k-1+\eta} \phi(a) - b^{k-1+\eta} \phi(b) + (k - 1 + \eta)\big\{ a^{k-3+\eta} \phi(a) - b^{k-3+\eta} \phi(b) \big\} \\
&\quad + (k - 1 + \eta)(k - 3 + \eta)\big\{ a^{k-5+\eta} \phi(a) - b^{k-5+\eta} \phi(b) \big\} + \cdots \\
&\quad + (k - 1 + \eta)(k - 3 + \eta) \cdots (4 + \eta)(2 + \eta)\big\{ a^{\eta} \phi(a) - b^{\eta} \phi(b) \big\} \\
&\quad + (k - 1 + \eta)(k - 3 + \eta) \cdots (4 + \eta)(2 + \eta)\eta\, I_{-1+\eta} \\
&= \sum_{u=0}^{(k-1)/2} \frac{(k - 1 + \eta)!!}{(2u + \eta)!!} \big\{ a^{2u+\eta} \phi(a) - b^{2u+\eta} \phi(b) \big\} + (k - 1 + \eta)!!\, I_{-1+\eta},
\end{aligned}
\]
where $(k - 1 + \eta)!! = (k - 1 + \eta)(k - 3 + \eta) \cdots (4 + \eta)(2 + \eta)\eta$ is an extended double factorial for even $k - 1 \ge 0$ and $0 < \eta < 1$. Define $\lambda = a^2/2$. Then, we have
\[
\begin{aligned}
\sum_{u=0}^{(k-1)/2} \frac{(k - 1 + \eta)!!}{(2u + \eta)!!}\, a^{2u+\eta} \phi(a) &= \frac{(k - 1 + \eta)!!}{\sqrt{2\pi}}\, e^{-a^2/2} \left\{ \frac{a^{\eta}}{\eta} + \frac{a^{2+\eta}}{(2 + \eta)\eta} + \frac{a^{4+\eta}}{(4 + \eta)(2 + \eta)\eta} + \cdots + \frac{a^{k-1+\eta}}{(k - 1 + \eta)!!} \right\} \\
&= (k - 1 + \eta)!!\, \frac{2^{(\eta/2)-1}}{\sqrt{2\pi}}\, \Gamma(\eta/2) \left\{ \frac{\lambda^{\eta/2} e^{-\lambda}}{\Gamma\{1 + (\eta/2)\}} + \frac{\lambda^{1+(\eta/2)} e^{-\lambda}}{\Gamma\{2 + (\eta/2)\}} + \cdots + \frac{\lambda^{(k-1)/2+(\eta/2)} e^{-\lambda}}{\Gamma\{(k + 1)/2 + (\eta/2)\}} \right\} \\
&= (k - 1 + \eta)!!\, \frac{2^{(\eta/2)-1}}{\sqrt{2\pi}}\, \{\Gamma(\eta/2) - \Gamma(\lambda \mid \eta/2)\}\, \Pr\{X \le (k - 1)/2 + (\eta/2) \mid \lambda,\, \eta/2\}.
\end{aligned}
\]
Since the result for $b$ is similarly given, we have
\[
\begin{aligned}
I_{k+\eta} = \int_a^b x^{k+\eta} \phi(x)\, dx &= (k - 1 + \eta)!!\, \frac{2^{(\eta/2)-1}}{\sqrt{2\pi}} \Big[ \big\{ \Gamma(\eta/2) - \Gamma(a^2/2 \mid \eta/2) \big\} \Pr\{X \le (k - 1)/2 + (\eta/2) \mid a^2/2,\, \eta/2\} \\
&\qquad - \big\{ \Gamma(\eta/2) - \Gamma(b^2/2 \mid \eta/2) \big\} \Pr\{X \le (k - 1)/2 + (\eta/2) \mid b^2/2,\, \eta/2\} \Big] + (k - 1 + \eta)!!\, I_{-1+\eta}.
\end{aligned} \tag{2.8}
\]
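Equation (2.8) can be checked the same way. The sketch below uses illustrative $a$, $b$, $\eta$; the helper `edf` (a hypothetical name) implements the extended double factorial, $I_{-1+\eta}$ is obtained by quadrature, and the bracketed factors are written with $Q(s, x)$ = `scipy.special.gammaincc(s, x)`.

```python
# Numerical sketch of Eq. (2.8): real-order partial moments (odd k, 0 < eta < 1)
# via the r-Poisson cdf, checked against quadrature.
from math import sqrt, pi, exp
from scipy.integrate import quad
from scipy.special import gamma, gammaincc as Q

phi = lambda x: exp(-x * x / 2) / sqrt(2 * pi)

def edf(m, eta):                     # extended double factorial (m + eta)!! down to eta
    out = 1.0
    while m >= 0:
        out, m = out * (m + eta), m - 2
    return out

a, b, eta = 0.4, 1.7, 0.3            # illustrative values
I_m1 = quad(lambda x: x ** (eta - 1) * phi(x), a, b)[0]   # I_{-1+eta}
for k in (1, 3, 5):
    # {Gamma(eta/2) - Gamma(lam|eta/2)} Pr{X <= (k-1)/2 + eta/2 | lam, eta/2}
    bracket = lambda lam: gamma(eta / 2) * (Q((k + 1 + eta) / 2, lam) - Q(eta / 2, lam))
    closed = edf(k - 1, eta) * (2 ** (eta / 2 - 1) / sqrt(2 * pi)
              * (bracket(a * a / 2) - bracket(b * b / 2)) + I_m1)
    direct = quad(lambda x: x ** (k + eta) * phi(x), a, b)[0]
    print(k, abs(closed - direct) < 1e-8)
```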

(iv) Odd $k = 1, 3, \ldots$ with $\eta = 0$. The essential results in this subsection were given by Ogasawara [8, Sect. 5.1 (ii) and A.5 (ii)]. They are not obtained from the last result of (2.8) with $\eta = 0$, since $\Gamma(0)$ is not defined; instead they follow from an earlier result. For clarity, we start with (2.6). Under the condition of (iv),
\[
\begin{aligned}
I_k = \int_a^b x^k \phi(x)\, dx &= a^{k-1} \phi(a) - b^{k-1} \phi(b) + (k - 1)\big\{ a^{k-3} \phi(a) - b^{k-3} \phi(b) \big\} + (k - 1)(k - 3)\big\{ a^{k-5} \phi(a) - b^{k-5} \phi(b) \big\} + \cdots \\
&\quad + (k - 1)(k - 3) \cdots 6 \cdot 4\, \big\{ a^2 \phi(a) - b^2 \phi(b) \big\} + (k - 1)(k - 3) \cdots 6 \cdot 4 \cdot 2\, I_1 \\
&= \sum_{u=0}^{(k-1)/2} \frac{(k - 1)!!}{(2u)!!} \big\{ a^{2u} \phi(a) - b^{2u} \phi(b) \big\}.
\end{aligned}
\]
Define $\lambda = a^2/2$. Then, we have
\[
\begin{aligned}
\sum_{u=0}^{(k-1)/2} \frac{(k - 1)!!}{(2u)!!}\, a^{2u} \phi(a) &= \frac{(k - 1)!!}{\sqrt{2\pi}}\, e^{-a^2/2} \left( 1 + \frac{a^2}{2} + \frac{a^4}{4 \cdot 2} + \cdots + \frac{a^{k-1}}{(k - 1)!!} \right) \\
&= \frac{(k - 1)!!}{\sqrt{2\pi}} \left\{ e^{-\lambda} + \lambda e^{-\lambda} + \frac{\lambda^2 e^{-\lambda}}{2!} + \cdots + \frac{\lambda^{(k-1)/2} e^{-\lambda}}{\{(k - 1)/2\}!} \right\} = \frac{(k - 1)!!}{\sqrt{2\pi}}\, \Pr\{X \le (k - 1)/2 \mid \lambda,\, 0\},
\end{aligned}
\]
where the r-Poisson distributed $X$ reduces to the usual Poisson. Using similar results for $b$, we obtain
\[
I_k = \int_a^b x^k \phi(x)\, dx = \frac{(k - 1)!!}{\sqrt{2\pi}} \Big[ \Pr\{X \le (k - 1)/2 \mid a^2/2,\, 0\} - \Pr\{X \le (k - 1)/2 \mid b^2/2,\, 0\} \Big]. \tag{2.9}
\]
When $Y \sim \mathrm{N}(0, 1)$, (2.9) gives $\mathrm{E}(|Y|^k) = (k - 1)!!\sqrt{2/\pi}$ ($k = 1, 3, \ldots$), which is equal to $2^{k/2}\, \Gamma\{(k + 1)/2\}/\sqrt{\pi}$ as mentioned earlier.
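A numerical sketch of (2.9), where the r-Poisson cdf reduces to the usual Poisson cdf; the limits $a$, $b$ are illustrative.

```python
# Numerical sketch of Eq. (2.9): odd-order partial moments as a difference of
# Poisson cdfs with rates a^2/2 and b^2/2, checked against quadrature.
from math import sqrt, pi, exp, factorial
from scipy.integrate import quad

phi = lambda x: exp(-x * x / 2) / sqrt(2 * pi)
pois_cdf = lambda lam, m: sum(lam**v * exp(-lam) / factorial(v) for v in range(m + 1))
dfact = lambda n: 1 if n <= 0 else n * dfact(n - 2)

a, b = 0.5, 2.0                      # illustrative limits, 0 <= a < b
for k in (1, 3, 5):
    closed = dfact(k - 1) / sqrt(2 * pi) * (
        pois_cdf(a * a / 2, (k - 1) // 2) - pois_cdf(b * b / 2, (k - 1) // 2))
    direct = quad(lambda x: x**k * phi(x), a, b)[0]
    print(k, abs(closed - direct) < 1e-9)
```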

2.5

Remarks

The mean derived in Theorem 2.4 indicates that it is larger than k for the usual Poisson case. Similarly, the variance in Theorem 2.4 shows that when k is sufficiently large, e.g., k  g, the variance is smaller than k indicating underdispersion. The r-Poisson distributed variable takes non-integer values when 0\g\1. By replacing x ¼ v þ g with x ¼ v ðv ¼ 0; 1; . . .Þ in Definition 2.2, we have another integer-valued GPD similar to (1.2). ( PrðX ¼ vjk; gÞ ¼

kv þ g Cðv þ 1 þ gÞek FC ðkjgÞ v k

k =ðe v!Þ

ð0\g\1Þ

ðg ¼ 0Þ

ð2:10Þ

[compare (2.10) and (2.3)]. Note that the probabilities in (2.10) are proportional to kv þ g =Cðv þ 1 þ gÞ as in the usual Poisson case whereas (2.3) has not this property. This indicates that (2.3) can be seen as a relatively zero-distorted distribution in this sense. We find that (2.10) and the r-Poisson belong to the exponential family of distributions while (2.3) does not. The GPD of (2.10) is seen as a downward shifted r-Poisson by g with Eð X Þ ¼

kg  g þ k; ek FC ðkjgÞCðgÞ

the unchanged variance  2 kg ðg  kÞ kg  k varð X Þ ¼ k þk e FC ðkjgÞCðgÞ e FC ðkjgÞCðgÞ

68

2

Normal Moments Under Stripe Truncation and the Real-Valued …

as in Theorem 2.4 and the moment generating function 1   X MðtÞ ¼ E etX ¼

etv kv þ g Cðv þ 1 þ gÞek FC ðkjgÞ v¼0

¼

1 expðet k  tgÞFC ðet kjgÞ X ðet kÞv þ g ek FC ðkjgÞ Cðv þ 1 þ gÞ expðet kÞFC ðet kjgÞ v¼0

¼

expfðet  1Þk  tggFC ðet kjgÞ : FC ðkjgÞ

Chandra et al. [1, p. 2788] suggests a real-valued distribution similar to the r-Poisson. However, their distribution is a real-valued counterpart of (2.3) with a zero-distorted property.

References

1. Chandra NK, Roy D, Ghosh T (2013) A generalized Poisson distribution. Commun Stat Theor Methods 42:2786–2797
2. Consul PC (1989) Generalized Poisson distribution: properties and applications. Marcel Dekker, New York
3. Consul PC, Famoye E (1992) Generalized Poisson regression model. Commun Stat Theor Methods 21:89–109
4. Consul PC, Jain G (1973) A generalization of the Poisson distribution. Technometrics 15:791–799
5. Fisher RA (1931) Introduction to Mathematical Tables, 1, xxvi–xxxv. British Association for the Advancement of Science. Reprinted in R. A. Fisher, Contributions to Mathematical Statistics, pp 517–526, with the title "The sampling error of estimated deviates, together with other illustrations of the properties and applications of the integrals and derivatives of the normal error function" and the author's note (CMS 23.xxva). Wiley, New York (1950)
6. Johnson NL, Kotz S, Balakrishnan N (1994) Continuous univariate distributions, vol 1, 2nd edn. Wiley, New York
7. Letac G, Mora M (1990) Natural real exponential families with cubic variance functions. Ann Stat 18:1–37
8. Ogasawara H (2021) Unified and non-recursive formulas for moments of the normal distribution with stripe truncation. Commun Stat Theor Methods (online published). https://doi.org/10.1080/03610926.2020.1867742
9. Pollack M, Shauly-Aharonov M (2019) A double recursion for calculating moments of the truncated normal distribution and its connection to change detection. Methodol Comput Appl Probab 21:889–906
10. Wagh YS, Kamalja KK (2017) Comparison of methods of estimation for parameters of generalized Poisson distribution through simulation study. Commun Stat Simul Comput 46:4098–4112
11. Winkelbauer A (2014) Moments and absolute moments of the normal distribution. arXiv:1209.4340v2 [math.ST], 15 July 2014
12. Zamani H, Ismail N (2012) Functional form for the generalized Poisson regression model. Commun Stat Theor Methods 41:3666–3675

Chapter 3
The Basic Parabolic Cylinder Distribution and Its Multivariate Extension

3.1 Introduction

A variation of the parabolic cylinder distribution was given by Kostylev [11, Eq. (3.14)], where the density function when the variable $X$ takes the value $A$ is defined by
\[
f_D^{(\mathrm{Ko})}(X = A \mid a, \sigma, p) = 1(A)\, \frac{A^{a-1} \exp\{-(A^4 - 2pA^2)/(4\sigma^4)\}}{D_{-a/2}\{-p/(\sqrt{2}\,\sigma^2)\}\, 2^{(a/4)-1}\, \sigma^a\, \Gamma(a/2)\, \exp\{p^2/(8\sigma^4)\}}, \tag{3.1}
\]
where $1(x) = 1$ when $x \ge 0$ and $1(x) = 0$ when $x < 0$; $D_\nu(z)$ is the parabolic cylinder function in the traditional Whittaker notation [2, Chap. 12; 4, Chap. 8; 13, Chap. 8; 28, Sects. 9.24–9.25]; and $\Gamma(\cdot)$ is the gamma function. The distribution with (3.1) was derived in the context of energy detection and called the three-parametric parabolic cylinder distribution.

Two other variations of the parabolic cylinder distribution were introduced by Ogasawara [17, Definitions 1 and 2], where the density functions are
\[
f_{D,k}^{(1)}(X = x \mid g) = \frac{x^k \exp(-x^2 - gx)}{2^{-(k+1)/2}\, \Gamma(k+1)\, e^{g^2/8}\, D_{-k-1}(g/\sqrt{2})} \quad (x > 0;\; k > -1) \tag{3.2}
\]
and
\[
f_{D,k}^{(2)}(X = x \mid h) = \frac{x^k \exp\{-(x^2/2) - hx\}}{\Gamma(k+1)\, e^{h^2/4}\, D_{-k-1}(h)} \quad (x > 0;\; k > -1). \tag{3.3}
\]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
H. Ogasawara, Expository Moments for Pseudo Distributions, Behaviormetrics: Quantitative Approaches to Human Behavior 2, https://doi.org/10.1007/978-981-19-3525-1_3

The distributions with (3.2) and (3.3) were called the basic parabolic cylinder (bpc) distributions of the first and second kinds, respectively. The two bpc distributions were introduced to obtain the absolute moments of real-valued orders in the truncated normal distribution. It is found that (3.2) and (3.3) are not special cases of (3.1) under reparametrization, although (3.1) can give (3.2) and (3.3) when a change of variables is employed as well. The parabolic cylinder distributions using $D_\nu(\cdot)$ derived by Pierson and Holmes [22] and rediscovered by Mathai and Saxena [14, Theorem 7], and the generalized parabolic cylinder distribution obtained by Progri [25], are different from (3.1) to (3.3).

The purposes of this chapter are to give another simple bpc distribution with an additional parameter over (3.2) and (3.3), to derive its moments and cumulative distribution function (cdf), and to give a multivariate extension. For the cdf of the bpc distribution, the weighted parabolic cylinder function is developed.

3.2 The BPC Distribution of the Third Kind and Its CDF

A new bpc distribution is defined as follows.

Definition 3.1 The probability density function (pdf) of the bpc distribution of the third kind, or simply the bpc distribution, is defined by
\[
f_{D,k}^{(3)}(X = x \mid p, q) = f_{D,k}(x \mid p, q) = f_{D,k}(x) = \frac{x^k \exp(-px^2 - qx)}{\int_0^{\infty} t^k \exp(-pt^2 - qt)\, dt} = \frac{x^k \exp(-px^2 - qx)}{(2p)^{-(k+1)/2}\, \Gamma(k+1)\, e^{q^2/(8p)}\, D_{-k-1}(q/\sqrt{2p})} \tag{3.4}
\]
$(x > 0;\; p > 0;\; k > -1)$.

In (3.4), $1/\sqrt{2p}$, or a quantity proportional to it, is seen as an added scale parameter. The derivation of the normalizer in (3.4) is given by the integral representation of $D_{-k-1}(\cdot)$ [5, Sect. 6.3, Eq. (13); 28, Sect. 3.462, Eq. 1]. The two distributions of (3.1) and (3.4) are equivalent in that one of the distributions is given by the other using a change of variables and reparametrization. While (3.1) was developed for a special purpose and is complicated, (3.4) is simpler than (3.1). Note that the denominator of (3.4) is proportional to $\mathrm{E}(X^k)$ ($k > -1$) of a normally distributed variable $X$ under single truncation.

First, we derive the cdf of the bpc distribution with (3.4). For this purpose, the following function is defined.

Definition 3.2 The weighted or incomplete parabolic cylinder function, using an extended Whittaker notation, is given by
\[
D_{\nu,W}(z, x) = 2^{\nu/2} e^{-z^2/4} \left\{ \frac{\sqrt{\pi}}{\Gamma\{(1 - \nu)/2\}}\, {}_1F_{1W}\!\left( -\frac{\nu}{2};\, \frac{1}{2};\, \frac{z^2}{2};\, \frac{x^2}{2} \right) - \frac{\sqrt{2\pi}\, z}{\Gamma(-\nu/2)}\, {}_1F_{1W}\!\left( \frac{1 - \nu}{2};\, \frac{3}{2};\, \frac{z^2}{2};\, \frac{x^2}{2} \right) \right\},
\]
where
\[
{}_1F_{1W}(g;\, \xi;\, z;\, x) = \sum_{u=0}^{\infty} \frac{(g)_u\, z^u}{(\xi)_u\, u!}\, \frac{\gamma(x \mid g + u)}{\Gamma(g + u)} \tag{3.5}
\]
is the weighted or incomplete Kummer confluent hypergeometric function defined by Ogasawara [18, Corollary 1]; $(g)_u = g(g + 1) \cdots (g + u - 1) = \Gamma(g + u)/\Gamma(g)$ with $(g)_0 = 1$ is the rising or ascending factorial; and $\gamma(x \mid g + u) = \int_0^x z^{g+u-1} e^{-z}\, dz$ is the lower incomplete gamma function. The subscript "W" in $D_{\nu,W}(z, x)$ and ${}_1F_{1W}(g; \xi; z; x)$ indicates a "weighted" function for clarity. When $x = \infty$, (3.5) with $\gamma(\infty \mid g + u) = \Gamma(g + u)$ becomes the usual Kummer confluent hypergeometric function ${}_1F_1(g; \xi; z)$ [27, Eq. (6); 28, Sect. 9.210, Eq. 1; 2, Chap. 13]. Note that the weight $0 \le \gamma(x \mid g + u)/\Gamma(g + u) \le 1$ in (3.5) is the cdf of the gamma distribution at $X = x$ with the shape parameter $g + u$ and the unit scale parameter. The term "incomplete" is used synonymously for $D_{\nu,W}(z, x)$ to avoid confusion, since the term "weighted parabolic cylinder function" is also used for a weighted sum of parabolic cylinder functions [19, p. 616; see also 15, p. 99].

Then, one of the main results is given as follows.

Theorem 3.1 The cdf of the bpc distribution with (3.4), other than the integral representation, is given in three ways:
\[
\begin{aligned}
\Pr(X \le x \mid k, p, q) &= D_{-k-1,W}(q/\sqrt{2p},\, \sqrt{2p}\,x) \big/ D_{-k-1}(q/\sqrt{2p}) \quad \text{(1st expression)} \\
&= \frac{\sum_{i=0}^{\infty} \gamma\{px^2 \mid (k + 1 + i)/2\}\, (-q/\sqrt{p})^i / i!}{\sum_{i=0}^{\infty} \Gamma\{(k + 1 + i)/2\}\, (-q/\sqrt{p})^i / i!} \quad \text{(2nd expression)} \\
&= \left\{ \Gamma\!\left(\frac{k+1}{2}\right) {}_1F_{1W}\!\left( \frac{k+1}{2};\, \frac{1}{2};\, \frac{q^2}{4p};\, px^2 \right) - \frac{q}{\sqrt{p}}\, \Gamma\!\left(\frac{k+2}{2}\right) {}_1F_{1W}\!\left( \frac{k+2}{2};\, \frac{3}{2};\, \frac{q^2}{4p};\, px^2 \right) \right\} \\
&\quad \times \left\{ \Gamma\!\left(\frac{k+1}{2}\right) {}_1F_1\!\left( \frac{k+1}{2};\, \frac{1}{2};\, \frac{q^2}{4p} \right) - \frac{q}{\sqrt{p}}\, \Gamma\!\left(\frac{k+2}{2}\right) {}_1F_1\!\left( \frac{k+2}{2};\, \frac{3}{2};\, \frac{q^2}{4p} \right) \right\}^{-1} \quad \text{(3rd expression)}.
\end{aligned} \tag{3.6}
\]

Proof In the integral representation PrðX  xjk; p; qÞ ¼

ð2pÞ

Rx

tk expðpt2  qtÞdt

pffiffiffiffiffi ; Cðk þ 1Þeq2 =ð8pÞ Dk1 ðq= 2pÞ

0 ðk þ 1Þ=2

ð3:7Þ

pffiffiffiffiffi the variable transformation y ¼ pt2 with dt=dy ¼ 1=ð2 pyÞ gives alternative expressions of the numerator of (3.7): Zx

1 Z X yðk1 þ iÞ=2 ey ðqÞi t expðpt  qtÞdt ¼ dy i! 2pðk þ 1 þ iÞ=2 i¼0 px2

k

0

2

0

¼

1 X cfpx2 jðk þ 1 þ iÞ=2g ðqÞi

2pðk þ 1 þ iÞ=2

i¼0

¼

1

1 X

2pðk þ 1Þ=2

i¼0

i!

cfpx2 jðk þ 1 þ iÞ=2g

pffiffiffi ðq= pÞi i!

ð3:8Þ

(scaled 2nd expression)      1 kþ1 k þ 1 1 q2 2 ; ; ; px ¼ ðk þ 1Þ=2 C 1 F1W 2 2 2 4p 2p     q kþ2 k þ 2 3 q2 ; ; ; px2  pffiffiffi C 1 F1W 2 2 2 4p p (scaled 3rd expression); where the two terms of the “3rd expression” in braces correspond to the even and odd powers in the preceding infinite series as in Ogasawara [18, Lemma 1]. Let pffiffiffi w ¼ px2 and 2/ ¼ q= p. Then, in the “2nd expression” pffiffiffi 1 X ðq= pÞi cfpx2 jðk þ 1 þ iÞ=2g i! i¼0 1 X

ð2/Þi i! i¼0       kþ1 k þ 3 ð2/Þ2 k þ 5 ð2/Þ4 þ c wj þ c wj þ  ¼ c wj 2 2 2 2! 4!       kþ2 k þ 4 ð2/Þ3 k þ 6 ð2/Þ5 2/ þ c wj þ c wj þ  þ c wj 2 2 2 3! 5!      1    1 kþ1 kþ3 2 2 1 kþ5 2 2 4321 þ c wj /  ð/ Þ þ c wj þ  ¼ c wj 2 2 2 2 2 2222 ¼

cfwjðk þ 1 þ iÞ=2g

3.2 The BPC Distribution of the Third Kind and Its CDF

73

       kþ2 k þ 4 2 3 2 1 þ 2/ c wj þ c wj / 2 2 22 )    1 kþ6 2 2 5432 þ c wj þ  ð/ Þ 2 2222  " k þ 1 cfwjðk þ 1Þ=2g ¼C 2 Cfðk þ 1Þ=2g   1 cfwjðk þ 3Þ=2g 1 2 fðk þ 1Þ=2g1 / þ 1! Cfðk þ 3Þ=2g 2 1 #   1 cfwjðk þ 5Þ=2g 1 2 2 fðk þ 1Þ=2g2 ð/ Þ þ 2! þ  Cfðk þ 5Þ=2g 2 2  " k þ 2 cfwjðk þ 2Þ=2g þ 2/C 2 Cfðk þ 2Þ=2g   1 cfwjðk þ 4Þ=2g 3 2 fðk þ 2Þ=2g1 / þ 1! Cfðk þ 4Þ=2g 2 1 #   1 cfwjðk þ 6Þ=2g 3 fðk þ 2Þ=2g2 ð/2 Þ2 þ 2! þ  Cfðk þ 6Þ=2g 2 2         kþ1 kþ1 1 2 kþ2 kþ2 3 2 ¼C ; ; / ; ; / F ; w þ 2/C F ; w : 1 1W 1 1W 2 2 2 2 2 2     kþ1 k þ 1 1 q2 2 ; ; ; px ¼C 1 F1W 2 2 2 4p     q kþ2 k þ 2 3 q2 2 ; ; ; px ;  pffiffiffi C 1 F1W p 2 2 2 4p yielding the “3rd expression.” In the Legendre duplication formula or the multiplication theorem pffiffiffi CðzÞCðz þ 0:5Þ ¼ 212z pCð2zÞ [1, Eq. 6.1.18; 3, Sect. 1.2, Eq. (11)], the case with z ¼ ðk þ 1Þ=2 gives     pffiffiffi kþ1 kþ2 C C ¼ 2k pCðk þ 1Þ: 2 2

ð3:9Þ

3 The Basic Parabolic Cylinder Distribution and Its Multivariate …

74

Then, Definition 3.2 with (3.9) makes the last result of (3.8)
\[
\begin{aligned}
\int_{0}^{x} t^{k}\exp(-pt^{2}-qt)\,dt
&=\frac{1}{2\,p^{(k+1)/2}}\,2^{(k+1)/2}e^{q^{2}/(8p)}\frac{1}{\sqrt{\pi}}\,
\Gamma\!\left(\frac{k+1}{2}\right)\Gamma\!\left(\frac{k+2}{2}\right)
D_{-k-1,\mathrm{W}}\!\left(q/\sqrt{2p},\,\sqrt{2p}\,x\right)\\
&=p^{-(k+1)/2}\,2^{(k-1)/2}e^{q^{2}/(8p)}\frac{1}{\sqrt{\pi}}\,2^{-k}\sqrt{\pi}\,
\Gamma(k+1)\,D_{-k-1,\mathrm{W}}\!\left(q/\sqrt{2p},\,\sqrt{2p}\,x\right)\\
&=(2p)^{-(k+1)/2}e^{q^{2}/(8p)}\,\Gamma(k+1)\,D_{-k-1,\mathrm{W}}\!\left(q/\sqrt{2p},\,\sqrt{2p}\,x\right)
\quad\text{(scaled 1st expression)}.
\end{aligned}
\tag{3.10}
\]
Canceling the factor $(2p)^{-(k+1)/2}e^{q^{2}/(8p)}\Gamma(k+1)$ common to the numerator of (3.10) and the denominator of (3.7), we obtain the "1st expression" of (3.6). Noting that when $x=\infty$, (3.8) gives the denominator of (3.7), and canceling the factor $\{2p^{(k+1)/2}\}^{-1}$, the "2nd expression" of (3.6) follows. The "3rd expression" of (3.6) is given similarly from the last result of (3.8). Q.E.D.
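The closed form (3.10) at $x=\infty$ can be checked numerically. The following is a sketch in Python rather than the book's R code (function names are ours); it uses SciPy's `pbdv`, which returns the pair $(D_v(z), D_v'(z))$.

```python
# Check of the "scaled 1st expression" (3.10) at x = infinity:
#   int_0^inf t^k exp(-p t^2 - q t) dt
#     = (2p)^{-(k+1)/2} exp(q^2/(8p)) Gamma(k+1) D_{-k-1}(q/sqrt(2p)).
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma, pbdv

def bpc_normalizer_integral(k, p, q):
    """Left-hand side: direct numerical integration."""
    val, _ = quad(lambda t: t**k * np.exp(-p * t**2 - q * t), 0, np.inf)
    return val

def bpc_normalizer_pbdv(k, p, q):
    """Right-hand side: closed form via the parabolic cylinder function D_v."""
    z = q / np.sqrt(2 * p)
    d_val, _ = pbdv(-k - 1, z)   # pbdv returns (D_v(z), D_v'(z))
    return (2 * p) ** (-(k + 1) / 2) * np.exp(q**2 / (8 * p)) \
        * gamma(k + 1) * d_val

k, p, q = 1.3, 0.7, 0.9
lhs = bpc_normalizer_integral(k, p, q)
rhs = bpc_normalizer_pbdv(k, p, q)
```

The two values agree to near machine precision for real $k>-1$ and $p>0$.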

The "2nd expression" of (3.6) is of use for actual computation, replacing the infinite series with a finite one once the residual is sufficiently small. The "3rd expression" will become useful if methods for the weighted or incomplete Kummer confluent hypergeometric function are developed. Comparing (3.7) and the "1st expression" of (3.6), we have an integral formula when $p=1/2$ and $q=z$:
\[
D_{-k-1,\mathrm{W}}(z,x)=\frac{e^{-z^{2}/4}}{\Gamma(k+1)}\int_{0}^{x}t^{k}\exp\!\left(-\frac{t^{2}}{2}-zt\right)dt.
\tag{3.11}
\]
The proof of Theorem 3.1 gives (3.11) and consequently another derivation of the usual $D_{-k-1}(z)$ other than that using an associated differential equation [4, p. 120].

Remark 3.1 Alternative expressions of $D_{-k-1,\mathrm{W}}(z,x)$ are given as follows. Let $z=q/\sqrt{2p}$ and $p=1/2$. Then, comparing the "1st expression" of (3.10) with the "2nd and 3rd expressions" of (3.8), we obtain the following:


\[
\begin{aligned}
D_{-k-1,\mathrm{W}}(z,x)
&=\frac{e^{-z^{2}/4}}{\Gamma(k+1)}\int_{0}^{x}t^{k}\exp\!\left(-\frac{t^{2}}{2}-zt\right)dt
&&\text{(1st expression)}\\
&=\frac{2^{(k-1)/2}e^{-z^{2}/4}}{\Gamma(k+1)}\sum_{i=0}^{\infty}
\gamma\!\left(\frac{x^{2}}{2}\,\middle|\,\frac{k+1+i}{2}\right)\frac{(-\sqrt{2}\,z)^{i}}{i!}
&&\text{(2nd expression)}\\
&=\frac{2^{(k-1)/2}e^{-z^{2}/4}}{\Gamma(k+1)}\left[\Gamma\!\left(\frac{k+1}{2}\right)
{}_{1}F_{1\mathrm{W}}\!\left(\frac{k+1}{2};\frac12;\frac{z^{2}}{2};\frac{x^{2}}{2}\right)\right.\\
&\qquad\left.-\sqrt{2}\,z\,\Gamma\!\left(\frac{k+2}{2}\right)
{}_{1}F_{1\mathrm{W}}\!\left(\frac{k+2}{2};\frac32;\frac{z^{2}}{2};\frac{x^{2}}{2}\right)\right]
&&\text{(3rd expression)}.
\end{aligned}
\tag{3.12}
\]

Let $F_{\gamma}(x^{2}/2\mid(k+1)/2)$ be the cdf of the gamma distribution at $x^{2}/2$ with the shape parameter $(k+1)/2$ and the unit scale parameter. Then, when $z=0$, we have
\[
\begin{aligned}
D_{-k-1,\mathrm{W}}(0,x)
&=\frac{1}{\Gamma(k+1)}\int_{0}^{x}t^{k}\exp(-t^{2}/2)\,dt\\
&=\frac{2^{(k-1)/2}}{\Gamma(k+1)}\,\gamma\!\left(\frac{x^{2}}{2}\,\middle|\,\frac{k+1}{2}\right)\\
&=2^{(k-1)/2}\,2^{-k}\sqrt{\pi}\left\{\Gamma\!\left(\frac{k+1}{2}\right)\Gamma\!\left(\frac{k+2}{2}\right)\right\}^{-1}
\gamma\!\left(\frac{x^{2}}{2}\,\middle|\,\frac{k+1}{2}\right)\\
&=2^{-(k+1)/2}\sqrt{\pi}\left\{\Gamma\!\left(\frac{k+2}{2}\right)\right\}^{-1}
F_{\gamma}\!\left(\frac{x^{2}}{2}\,\middle|\,\frac{k+1}{2}\right)\\
&=\frac{2^{(k-1)/2}}{\Gamma(k+1)}\,\Gamma\!\left(\frac{k+1}{2}\right)
F_{\gamma}\!\left(\frac{x^{2}}{2}\,\middle|\,\frac{k+1}{2}\right)\\
&=\frac{2^{(k-1)/2}}{\Gamma(k+1)}\,\Gamma\!\left(\frac{k+1}{2}\right)
{}_{1}F_{1\mathrm{W}}\!\left(\frac{k+1}{2};\frac12;0;\frac{x^{2}}{2}\right)\\
&=2^{-(k+1)/2}\sqrt{\pi}\left\{\Gamma\!\left(\frac{k+2}{2}\right)\right\}^{-1}
{}_{1}F_{1\mathrm{W}}\!\left(\frac{k+1}{2};\frac12;0;\frac{x^{2}}{2}\right).
\end{aligned}
\tag{3.13}
\]
In (3.13), it is found that $F_{\gamma}(x^{2}/2\mid(k+1)/2)={}_{1}F_{1\mathrm{W}}\{(k+1)/2;1/2;0;x^{2}/2\}$. When $z=0$, the definition $0^{0}=1$ should be used in the "2nd and 3rd expressions" of (3.12); in this case, the bpc distribution reduces to the chi distribution with $k+1$ degrees of freedom (real-valued $k>-1$), whose density function is


\[
f_{\chi,k+1}(x)=\frac{x^{k}e^{-x^{2}/2}}{\int_{0}^{\infty}x^{k}e^{-x^{2}/2}\,dx}
=\frac{x^{k}e^{-x^{2}/2}}{\int_{0}^{\infty}(2y)^{k/2}e^{-y}\,\dfrac{dy}{\sqrt{2y}}}
=\frac{x^{k}e^{-x^{2}/2}}{2^{(k-1)/2}\int_{0}^{\infty}y^{(k-1)/2}e^{-y}\,dy}
=\frac{x^{k}e^{-x^{2}/2}}{2^{(k-1)/2}\,\Gamma\{(k+1)/2\}},
\]
which gives
\[
\Pr(X\le x\mid k)=\gamma\!\left(\frac{x^{2}}{2}\,\middle|\,\frac{k+1}{2}\right)\Big/\,
\Gamma\!\left(\frac{k+1}{2}\right).
\]
This cdf can also be given from that of the bpc distribution:
\[
\begin{aligned}
\Pr(X\le x\mid k)&=D_{-k-1,\mathrm{W}}(0,x)/D_{-k-1}(0)\\
&=\frac{2^{(k-1)/2}}{\Gamma(k+1)}\,\gamma\!\left(\frac{x^{2}}{2}\,\middle|\,\frac{k+1}{2}\right)
\left\{\frac{2^{(k-1)/2}}{\Gamma(k+1)}\,\Gamma\!\left(\frac{k+1}{2}\right)\right\}^{-1}\\
&=\gamma\!\left(\frac{x^{2}}{2}\,\middle|\,\frac{k+1}{2}\right)\Big/\,
\Gamma\!\left(\frac{k+1}{2}\right).
\end{aligned}
\tag{3.14}
\]
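The chi reduction (3.14) is easy to verify numerically; the following Python sketch (our function names, not the book's R code) compares the directly integrated bpc cdf at $q=0$, $p=1/2$ with the gamma cdf of (3.14).

```python
# When q = 0 (and p = 1/2) the bpc distribution reduces to the chi distribution
# with k+1 degrees of freedom, so its cdf equals F_gamma(x^2/2 | (k+1)/2).
import numpy as np
from scipy.integrate import quad
from scipy.stats import gamma as gamma_dist

def bpc_cdf_q0(x, k):
    """cdf of the bpc distribution with p = 1/2, q = 0 by numerical integration."""
    num, _ = quad(lambda t: t**k * np.exp(-t**2 / 2), 0, x)
    den, _ = quad(lambda t: t**k * np.exp(-t**2 / 2), 0, np.inf)
    return num / den

x, k = 1.7, 0.8
cdf_direct = bpc_cdf_q0(x, k)
cdf_gamma = gamma_dist.cdf(x**2 / 2, a=(k + 1) / 2)   # unit scale parameter
```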

The median, or more generally the $\alpha$-th $(0<\alpha<1)$ quantile, of the bpc distribution is given by the inverse function of the cdf, i.e., by the solution in $x$ of $\alpha=\Pr(X\le x\mid k,p,q)$ using the expressions of Theorem 3.1. However, it is difficult to obtain a closed form, and some numerical procedure, e.g., the bisection method, may be employed for the solution.
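The bisection approach just mentioned can be sketched as follows; this is a minimal Python illustration (our function names; the cdf is evaluated by numerical integration rather than by the series expressions of Theorem 3.1).

```python
# Quantile of the bpc distribution by bisection on alpha = Pr(X <= x | k, p, q).
import numpy as np
from scipy.integrate import quad

def bpc_cdf(x, k, p, q):
    num, _ = quad(lambda t: t**k * np.exp(-p * t**2 - q * t), 0, x)
    den, _ = quad(lambda t: t**k * np.exp(-p * t**2 - q * t), 0, np.inf)
    return num / den

def bpc_quantile(alpha, k, p, q, tol=1e-10):
    lo, hi = 0.0, 1.0
    while bpc_cdf(hi, k, p, q) < alpha:   # bracket the quantile
        hi *= 2.0
    while hi - lo > tol:                  # bisection
        mid = 0.5 * (lo + hi)
        if bpc_cdf(mid, k, p, q) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

med = bpc_quantile(0.5, k=0.5, p=0.5, q=1.0)   # the median
```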

3.3 Moments of the BPC Distribution

The moments of the bpc distribution are given by the following lemma, which is easily obtained from earlier results.

Lemma 3.1 The $m$-th raw moment $(m\ge 0)$ of the bpc distribution is given in three ways as well as by the integral representation:


\[
\begin{aligned}
E(X^{m}\mid k,p,q)=E(X^{m})
&=\frac{\int_{0}^{\infty}x^{m+k}\exp(-px^{2}-qx)\,dx}{\int_{0}^{\infty}x^{k}\exp(-px^{2}-qx)\,dx}\\
&=(2p)^{-m/2}(k+1)_{m}\,\frac{D_{-m-k-1}(q/\sqrt{2p}\,)}{D_{-k-1}(q/\sqrt{2p}\,)}
&&\text{(1st expression)}\\
&=p^{-m/2}\,\frac{\sum_{i=0}^{\infty}\Gamma\{(m+k+1+i)/2\}\,(-q/\sqrt{p}\,)^{i}/i!}
{\sum_{i=0}^{\infty}\Gamma\{(k+1+i)/2\}\,(-q/\sqrt{p}\,)^{i}/i!}
&&\text{(2nd expression)}\\
&=p^{-m/2}\left[\Gamma\!\left(\frac{m+k+1}{2}\right){}_{1}F_{1}\!\left(\frac{m+k+1}{2};\frac12;\frac{q^{2}}{4p}\right)
-\frac{q}{\sqrt{p}}\,\Gamma\!\left(\frac{m+k+2}{2}\right){}_{1}F_{1}\!\left(\frac{m+k+2}{2};\frac32;\frac{q^{2}}{4p}\right)\right]\\
&\qquad\times\left[\Gamma\!\left(\frac{k+1}{2}\right){}_{1}F_{1}\!\left(\frac{k+1}{2};\frac12;\frac{q^{2}}{4p}\right)
-\frac{q}{\sqrt{p}}\,\Gamma\!\left(\frac{k+2}{2}\right){}_{1}F_{1}\!\left(\frac{k+2}{2};\frac32;\frac{q^{2}}{4p}\right)\right]^{-1}
&&\text{(3rd expression)},
\end{aligned}
\tag{3.15}
\]
where $(k+1)_{m}$ is seen as $\Gamma(m+k+1)/\Gamma(k+1)$ when $m$ is a non-integer.

From the "1st expression" of Lemma 3.1, we have the following result.

Theorem 3.2 Let $D_{-k-1}\equiv D_{-k-1}(q_{p})$ with $q_{p}\equiv q/\sqrt{2p}$ when confusion does not occur. Define $\mathrm{sk}(X\mid k,p,q)=\mathrm{sk}(X)$ and $\mathrm{kt}(X\mid k,p,q)=\mathrm{kt}(X)$ as the skewness and excess kurtosis of the bpc distribution, respectively. Then, we obtain
\[
\begin{aligned}
E(X)&=\frac{(k+1)D_{-k-2}}{\sqrt{2p}\,D_{-k-1}},\\
\mathrm{var}(X)&=\frac{k+1}{2p\,D_{-k-1}^{2}}\left\{(k+2)D_{-k-1}D_{-k-3}-(k+1)D_{-k-2}^{2}\right\},\\
\mathrm{sk}(X)&=\frac{k+1}{(2p)^{3/2}D_{-k-1}^{3}}\left\{(k+2)(k+3)D_{-k-1}^{2}D_{-k-4}
-3(k+1)(k+2)D_{-k-1}D_{-k-2}D_{-k-3}\right.\\
&\qquad\left.+2(k+1)^{2}D_{-k-2}^{3}\right\}\{\mathrm{var}(X)\}^{-3/2},\\
\mathrm{kt}(X)&=\frac{k+1}{(2p)^{2}D_{-k-1}^{4}}\left\{(k+2)(k+3)(k+4)D_{-k-1}^{3}D_{-k-5}
-4(k+1)(k+2)(k+3)D_{-k-1}^{2}D_{-k-2}D_{-k-4}\right.\\
&\qquad\left.+6(k+1)^{2}(k+2)D_{-k-1}D_{-k-2}^{2}D_{-k-3}
-3(k+1)^{3}D_{-k-2}^{4}\right\}\{\mathrm{var}(X)\}^{-2}-3.
\end{aligned}
\tag{3.16}
\]
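The expressions of Theorem 3.2 can be evaluated with any routine for the parabolic cylinder function; here is a Python sketch (our function names, using SciPy's `pbdv`) for $E(X)$ and $\mathrm{var}(X)$, cross-checked against direct numerical integration of the moment ratio in Lemma 3.1.

```python
# E(X) and var(X) of the bpc distribution via Theorem 3.2.
import numpy as np
from scipy.integrate import quad
from scipy.special import pbdv

def D(v, z):
    """Parabolic cylinder function D_v(z)."""
    return pbdv(v, z)[0]

def bpc_mean_var(k, p, q):
    qp = q / np.sqrt(2 * p)
    D1, D2, D3 = D(-k - 1, qp), D(-k - 2, qp), D(-k - 3, qp)
    mean = (k + 1) * D2 / (np.sqrt(2 * p) * D1)
    var = (k + 1) / (2 * p * D1**2) * ((k + 2) * D1 * D3 - (k + 1) * D2**2)
    return mean, var

def raw_moment(m, k, p, q):
    """E(X^m) by direct numerical integration (Lemma 3.1)."""
    num, _ = quad(lambda t: t**(m + k) * np.exp(-p * t**2 - q * t), 0, np.inf)
    den, _ = quad(lambda t: t**k * np.exp(-p * t**2 - q * t), 0, np.inf)
    return num / den

k, p, q = 0.5, 0.5, 1.0
mean, var = bpc_mean_var(k, p, q)
mean_num = raw_moment(1, k, p, q)
var_num = raw_moment(2, k, p, q) - mean_num**2
```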


For $D_{m}(z)$, the recurrence relation
\[
D_{m+1}(z)-zD_{m}(z)+mD_{m-1}(z)=0
\tag{3.17}
\]
is known [4, Sect. 8.2, Eq. (14); 13, Sect. 8.1.3, Recurrence relations; 28, Sect. 9.247, Eq. 1]. An alternative recurrence formula using the Miller notation is also known:
\[
zU(a,z)-U(a-1,z)+(a+0.5)U(a+1,z)=0
\tag{3.18}
\]
[1, Eq. 19.6.4; 2, Eq. 12.8.1]. The equivalence of (3.17) and (3.18) with sign reversal is found using $U(a,z)=D_{-a-0.5}(z)$ when $m=-a-0.5$. In this book,
\[
kD_{-k-1}(z)=D_{-k+1}(z)-zD_{-k}(z),
\tag{3.19}
\]
obtained from (3.17) when $m=-k$, is used with $D_{-m-1}=D_{-m-1}(q_{p})$.

Employing the familiar notation $\sigma$ and $\mu$, we have $\exp(-px^{2}-qx)=\exp\{-(x-\mu)^{2}/(2\sigma^{2})\}\times\exp\{\mu^{2}/(2\sigma^{2})\}$ with $p=1/(2\sigma^{2})$ and $q=-\mu/\sigma^{2}$. Consequently, $q_{p}=q/\sqrt{2p}=q\sigma=-\mu/\sigma\equiv-\mu^{*}$. That is, $-q_{p}$ is the standardized mean $\mu^{*}$ under normality, which shows that $D_{-m-1}=D_{-m-1}(q_{p})=D_{-m-1}(-\mu^{*})$ is scale-free. Define $\bar{E}(X)=\sqrt{2p}\,E(X)=E(X)/\sigma$ and
\[
R_{-k-1}\equiv\frac{D_{-k-2}}{D_{-k-1}}=\frac{\sqrt{2p}\,E(X)}{k+1}
=\frac{E(X)}{(k+1)\sigma}=\frac{\bar{E}(X)}{k+1}.
\]
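Both recurrences can be confirmed numerically. The following Python sketch (our names, using SciPy's `pbdv`) checks (3.17) directly and the rearranged form (3.19); the right side of (3.19) is a difference of two positive terms when $z>0$, which is the source of the cancellation issue discussed below Corollary 3.1.

```python
# Numerical check of the recurrences (3.17) and (3.19) for D_m(z).
from scipy.special import pbdv

def D(v, z):
    return pbdv(v, z)[0]

z, m = 0.8, -1.7
residual = D(m + 1, z) - z * D(m, z) + m * D(m - 1, z)   # (3.17): should be ~0

k = 2.3
lhs = k * D(-k - 1, z)                                   # (3.19), left side
rhs = D(-k + 1, z) - z * D(-k, z)                        # (3.19), right side
```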

Then, we have alternative expressions of (3.16) in Theorem 3.2.

Corollary 3.1 Under the notations defined earlier, we have
\[
\begin{aligned}
E(X)&=\frac{k+1}{\sqrt{2p}}\,R_{-k-1},\\
\mathrm{var}(X)&=\frac{k+1}{2p}\left\{1-q_{p}R_{-k-1}-(k+1)R_{-k-1}^{2}\right\}
=\sigma^{2}\left\{k+1+\mu^{*}\bar{E}(X)-\bar{E}^{2}(X)\right\},\\
\mathrm{sk}(X)&=(k+1)^{-1/2}\left\{-q_{p}+(-2k-1+q_{p}^{2})R_{-k-1}
+3(k+1)q_{p}R_{-k-1}^{2}+2(k+1)^{2}R_{-k-1}^{3}\right\}\\
&\qquad\times\left\{1-q_{p}R_{-k-1}-(k+1)R_{-k-1}^{2}\right\}^{-3/2}\\
&=\left\{(k+1)\mu^{*}+(-2k-1+\mu^{*2})\bar{E}(X)-3\mu^{*}\bar{E}^{2}(X)+2\bar{E}^{3}(X)\right\}
\left\{k+1+\mu^{*}\bar{E}(X)-\bar{E}^{2}(X)\right\}^{-3/2},\\
\mathrm{kt}(X)&=(k+1)^{-1}\left[k+3+q_{p}^{2}+\{(2k-1)q_{p}-q_{p}^{3}\}R_{-k-1}
+(2k-2-4q_{p}^{2})(k+1)R_{-k-1}^{2}\right.\\
&\qquad\left.-6q_{p}(k+1)^{2}R_{-k-1}^{3}-3(k+1)^{3}R_{-k-1}^{4}\right]
\left\{1-q_{p}R_{-k-1}-(k+1)R_{-k-1}^{2}\right\}^{-2}-3\\
&=\left[(k+1)(k+3+\mu^{*2})+\{-(2k-1)\mu^{*}+\mu^{*3}\}\bar{E}(X)
+(2k-2-4\mu^{*2})\bar{E}^{2}(X)\right.\\
&\qquad\left.+6\mu^{*}\bar{E}^{3}(X)-3\bar{E}^{4}(X)\right]
\left\{k+1+\mu^{*}\bar{E}(X)-\bar{E}^{2}(X)\right\}^{-2}-3,
\end{aligned}
\tag{3.20}
\]
where $\bar{E}^{m}(X)=\{\bar{E}(X)\}^{m}$.

Proof First, we provide the following alternative expressions of $E(X^{m})\;(m=2,3,4)$ using the recurrence relation (see (3.19)):
\[
\begin{aligned}
E(X^{2})&=\frac{(k+1)(k+2)D_{-k-3}}{2p\,D_{-k-1}}
=\frac{k+1}{2p\,D_{-k-1}}\,(D_{-k-1}-q_{p}D_{-k-2})
=\frac{k+1}{2p}\,(1-q_{p}R_{-k-1}),\\
E(X^{3})&=\frac{(k+1)(k+2)(k+3)D_{-k-4}}{(2p)^{3/2}D_{-k-1}}
=\frac{(k+1)(k+2)}{(2p)^{3/2}D_{-k-1}}\,(D_{-k-2}-q_{p}D_{-k-3})\\
&=\frac{k+1}{(2p)^{3/2}D_{-k-1}}\left\{(k+2)D_{-k-2}-q_{p}(D_{-k-1}-q_{p}D_{-k-2})\right\}
=\frac{k+1}{(2p)^{3/2}}\left\{-q_{p}+(k+2+q_{p}^{2})R_{-k-1}\right\},\\
E(X^{4})&=\frac{(k+1)_{4}\,D_{-k-5}}{(2p)^{2}D_{-k-1}}
=\frac{(k+1)_{3}}{(2p)^{2}D_{-k-1}}\,(D_{-k-3}-q_{p}D_{-k-4})\\
&=\frac{k+1}{(2p)^{2}D_{-k-1}}\left\{(k+3)(D_{-k-1}-q_{p}D_{-k-2})
-(k+2)q_{p}(D_{-k-2}-q_{p}D_{-k-3})\right\}\\
&=\frac{k+1}{(2p)^{2}D_{-k-1}}\left\{(k+3)D_{-k-1}-(2k+5)q_{p}D_{-k-2}
+q_{p}^{2}(D_{-k-1}-q_{p}D_{-k-2})\right\}\\
&=\frac{k+1}{(2p)^{2}}\left[k+3+q_{p}^{2}-\{(2k+5)q_{p}+q_{p}^{3}\}R_{-k-1}\right].
\end{aligned}
\tag{3.21}
\]

$E(X)$ in (3.20) is repeated for clarity. The expression of $\mathrm{var}(X)$ in (3.20) is easily derived using $E(X^{2})$ shown above. For $\mathrm{sk}(X)$, (3.21) gives
\[
\begin{aligned}
E[\{X-E(X)\}^{3}]&=E(X^{3})-3E(X^{2})E(X)+2E^{3}(X)\\
&=\frac{k+1}{(2p)^{3/2}}\left\{-q_{p}+(k+2+q_{p}^{2})R_{-k-1}
-3(1-q_{p}R_{-k-1})(k+1)R_{-k-1}+2(k+1)^{2}R_{-k-1}^{3}\right\}\\
&=\frac{k+1}{(2p)^{3/2}}\left\{-q_{p}+(-2k-1+q_{p}^{2})R_{-k-1}
+3(k+1)q_{p}R_{-k-1}^{2}+2(k+1)^{2}R_{-k-1}^{3}\right\}\\
&=\sigma^{3}\left\{(k+1)\mu^{*}+(-2k-1+\mu^{*2})\bar{E}(X)
-3\mu^{*}\bar{E}^{2}(X)+2\bar{E}^{3}(X)\right\},
\end{aligned}
\]
which yields $\mathrm{sk}(X)$ in (3.20). For $\mathrm{kt}(X)$, (3.21) gives
\[
\begin{aligned}
E[\{X-E(X)\}^{4}]&=E(X^{4})-4E(X^{3})E(X)+6E(X^{2})E^{2}(X)-3E^{4}(X)\\
&=\frac{k+1}{(2p)^{2}}\Bigl[k+3+q_{p}^{2}-\{(2k+5)q_{p}+q_{p}^{3}\}R_{-k-1}
-4\{-q_{p}+(k+2+q_{p}^{2})R_{-k-1}\}(k+1)R_{-k-1}\\
&\qquad+6(1-q_{p}R_{-k-1})(k+1)^{2}R_{-k-1}^{2}-3(k+1)^{3}R_{-k-1}^{4}\Bigr]\\
&=\frac{k+1}{(2p)^{2}}\Bigl[k+3+q_{p}^{2}+\{(2k-1)q_{p}-q_{p}^{3}\}R_{-k-1}
+\{-4(k+1)(k+2+q_{p}^{2})+6(k+1)^{2}\}R_{-k-1}^{2}\\
&\qquad-6q_{p}(k+1)^{2}R_{-k-1}^{3}-3(k+1)^{3}R_{-k-1}^{4}\Bigr]\\
&=\frac{k+1}{(2p)^{2}}\Bigl[k+3+q_{p}^{2}+\{(2k-1)q_{p}-q_{p}^{3}\}R_{-k-1}
+(2k-2-4q_{p}^{2})(k+1)R_{-k-1}^{2}\\
&\qquad-6q_{p}(k+1)^{2}R_{-k-1}^{3}-3(k+1)^{3}R_{-k-1}^{4}\Bigr]\\
&=\sigma^{4}\Bigl[(k+1)(k+3+\mu^{*2})+\{-(2k-1)\mu^{*}+\mu^{*3}\}\bar{E}(X)
+(2k-2-4\mu^{*2})\bar{E}^{2}(X)\\
&\qquad+6\mu^{*}\bar{E}^{3}(X)-3\bar{E}^{4}(X)\Bigr],
\end{aligned}
\]
yielding $\mathrm{kt}(X)$ in (3.20). Q.E.D.

It is of interest to see that the moments in Corollary 3.1 are functions of $R_{-k-1}$ or $\bar{E}(X)$ ($E(X)$). That is, only the two values $D_{-k-1}(q_{p})$ and $D_{-k-2}(q_{p})$ among the $D_{-m}(q_{p})$'s are used, together with $k$, $p$, and $q$, owing to the recurrence relation. It is to be noted, however, that the recurrence formula of (3.19) when $z$ in (3.19) is $q_{p}$ becomes

\[
kD_{-k-1}(q_{p})=D_{-k+1}(q_{p})-q_{p}D_{-k}(q_{p}),
\tag{3.22}
\]

which tends to be subject to subtractive cancellation error when $q$, and consequently $q_{p}$, is positive (or equivalently $\mu$ is negative), since $D_{-m}(\cdot)>0$, especially when $k$ is large. Similar formulas for the moments of the truncated normally distributed variable have been developed by Pearson and Lee [20], Fisher [6], Kan and Robotti [8, Theorem 1], Galarza et al. [7] and Kirkby et al. [9, 10]. However, it is known that such recurrence formulas have similar difficulties as for (3.20) when $q_{p}>0$, which was pointed out by Pollack and Shauly-Aharonov [24] and Ogasawara [17, 18]. Nevertheless, when $k$ is not large, the formula gives reasonable results even when $q_{p}>0$.

Remark 3.2 Some relationships between $D_{-k-1}(z)$ and Fisher's $I_{n}$ function and their recurrence formulas. Fisher's [6, Eqs. (11) and (12)] $I_{n}$ function is defined by
\[
I_{n}=I_{n}(x)=\int_{x}^{\infty}\frac{(t-x)^{n}}{n!\sqrt{2\pi}}\exp(-t^{2}/2)\,dt
=\int_{0}^{\infty}\frac{t^{n}}{n!\sqrt{2\pi}}\exp\{-(t+x)^{2}/2\}\,dt.
\tag{3.23}
\]
When $n$ and $x$ are replaced by $k$ and $z$, respectively, the recurrence formula for $I_{k}=I_{k}(z)$ is
\[
(k+2)I_{k+2}(z)+zI_{k+1}(z)-I_{k}(z)=0\qquad(k=0,1,\ldots)
\tag{3.24}
\]
[6, Eq. (13)]. From the second definition of $I_{n}$ in (3.23) and the integral representation of $D_{-k-1}(z)$ (see (3.11) when $x=\infty$), it is easily found that
\[
D_{-k-1}(z)=\exp(z^{2}/4)\sqrt{2\pi}\,I_{k}(z),
\tag{3.25}
\]
which is known [17, Theorem 6]. Equation (3.19) is given by (3.24) with (3.25) when $k$ is replaced by $k-2$. Note that this is another derivation of the recurrence relation for $D_{m}(z)$ using Fisher's $I_{n}$. Conversely, (3.24) is given from (3.19). Note also that the definition of $I_{n}$ appears to be given for $n=0,1,\ldots$ as stated in (3.24). Actually, as Fisher [6, page xxviii] noted, $I_{n}$ holds for real-valued $n>-1$ with $n!$ replaced by $\Gamma(n+1)$.
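Relation (3.25) lends itself to a quick numerical check; the following Python sketch (our function names) evaluates Fisher's $I_{k}(z)$ from the second integral in (3.23), with $n!$ replaced by $\Gamma(n+1)$ for real order, and compares with SciPy's parabolic cylinder function.

```python
# Check of (3.25): D_{-k-1}(z) = exp(z^2/4) * sqrt(2*pi) * I_k(z).
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma, pbdv

def fisher_I(k, z):
    """Fisher's I_k(z) from the second definition in (3.23)."""
    integrand = lambda t: t**k / (gamma(k + 1) * np.sqrt(2 * np.pi)) \
        * np.exp(-(t + z)**2 / 2)
    val, _ = quad(integrand, 0, np.inf)
    return val

k, z = 1.4, 0.6
lhs = pbdv(-k - 1, z)[0]
rhs = np.exp(z**2 / 4) * np.sqrt(2 * np.pi) * fisher_I(k, z)
```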


Fisher [6, Eq. (8)] gave the following differential equation:
\[
\left(\frac{d^{2}}{dz^{2}}+z\frac{d}{dz}-k\right)I_{k}(z)=0.
\]
Then, using (3.25), we obtain
\[
\left(\frac{d^{2}}{dz^{2}}+z\frac{d}{dz}-k\right)e^{-z^{2}/4}D_{-k-1}(z)=0,
\tag{3.26}
\]
which gives
\[
\begin{aligned}
\left(\frac{d^{2}}{dz^{2}}+z\frac{d}{dz}-k\right)e^{-z^{2}/4}D_{-k-1}(z)
&=\left[\frac{d^{2}D_{-k-1}(z)}{dz^{2}}+(-z+z)\frac{dD_{-k-1}(z)}{dz}\right.\\
&\qquad\left.+\left(\frac{z^{2}}{4}-\frac12-\frac{z^{2}}{2}-k\right)D_{-k-1}(z)\right]e^{-z^{2}/4}=0,
\end{aligned}
\]
yielding
\[
\frac{d^{2}D_{-k-1}(z)}{dz^{2}}-\left(k+\frac12+\frac{z^{2}}{4}\right)D_{-k-1}(z)=0.
\tag{3.27}
\]
It is known that the equation
\[
\frac{d^{2}D_{m}(z)}{dz^{2}}+\left(m+\frac12-\frac{z^{2}}{4}\right)D_{m}(z)=0
\tag{3.28}
\]
is the differential equation from which the parabolic cylinder functions are derived [2, Eq. 12.2.4; 4, Sect. 8.2, Eq. (1); 13, Sect. 8.1.1, Eq. (2)]. Since (3.27) is equal to (3.28) when $k=-m-1$, (3.27) is seen as another derivation of the differential equation via Fisher's differential equation for $I_{k}(z)$, which seems to be new. Magnus and Oberhettinger [12, p. 123] (see also [13, Sect. 8.1.1, Eq. (3)]) defined
\[
u(z)=e^{-z^{2}/4}D_{m}(z)
\tag{3.29}
\]
and showed its differential equation
\[
\frac{d^{2}u(z)}{dz^{2}}+z\frac{du(z)}{dz}+(m+1)u(z)=0,
\tag{3.30}
\]


which is equal to (3.26) when $m=-k-1$. That is, (3.29) and (3.30) are seen as rediscoveries of scaled Fisher's $I_{n}$ and its associated differential equation.

The characteristic function is given as follows.

Theorem 3.3 Define $\phi(t\mid k,p,q)=\phi(t)$ as the characteristic function of the bpc distribution and $\mathrm{i}=\sqrt{-1}$. Then, we obtain
\[
\begin{aligned}
\phi(t)=E(e^{\mathrm{i}tX})
&=\frac{\int_{0}^{\infty}x^{k}\exp\{-px^{2}-(q-\mathrm{i}t)x\}\,dx}
{\int_{0}^{\infty}x^{k}\exp(-px^{2}-qx)\,dx}\\
&=\exp\!\left\{\frac{(q-\mathrm{i}t)^{2}-q^{2}}{8p}\right\}
\frac{D_{-k-1}\{(q-\mathrm{i}t)/\sqrt{2p}\}}{D_{-k-1}(q/\sqrt{2p})}
&&\text{(1st expression)}\\
&=\frac{\sum_{j=0}^{\infty}\Gamma\{(k+1+j)/2\}\{-(q-\mathrm{i}t)/\sqrt{p}\}^{j}/j!}
{\sum_{j=0}^{\infty}\Gamma\{(k+1+j)/2\}(-q/\sqrt{p}\,)^{j}/j!}
&&\text{(2nd expression)}\\
&=\left[\Gamma\!\left(\frac{k+1}{2}\right){}_{1}F_{1}\!\left\{\frac{k+1}{2};\frac12;\frac{(q-\mathrm{i}t)^{2}}{4p}\right\}
-\frac{q-\mathrm{i}t}{\sqrt{p}}\,\Gamma\!\left(\frac{k+2}{2}\right)
{}_{1}F_{1}\!\left\{\frac{k+2}{2};\frac32;\frac{(q-\mathrm{i}t)^{2}}{4p}\right\}\right]\\
&\qquad\times\left[\Gamma\!\left(\frac{k+1}{2}\right){}_{1}F_{1}\!\left(\frac{k+1}{2};\frac12;\frac{q^{2}}{4p}\right)
-\frac{q}{\sqrt{p}}\,\Gamma\!\left(\frac{k+2}{2}\right)
{}_{1}F_{1}\!\left(\frac{k+2}{2};\frac32;\frac{q^{2}}{4p}\right)\right]^{-1}
&&\text{(3rd expression)}.
\end{aligned}
\tag{3.31}
\]
For confirmation, using the "2nd expression" of (3.31), we have
\[
\begin{aligned}
\left.\frac{d^{m}\phi(t)}{dt^{m}}\right|_{t=0}
&=\mathrm{i}^{m}p^{-m/2}\,
\frac{\sum_{j=m}^{\infty}\Gamma\{(k+1+j)/2\}(-q/\sqrt{p}\,)^{j-m}/(j-m)!}
{\sum_{j=0}^{\infty}\Gamma\{(k+1+j)/2\}(-q/\sqrt{p}\,)^{j}/j!}\\
&=\mathrm{i}^{m}p^{-m/2}\,
\frac{\sum_{j=0}^{\infty}\Gamma\{(m+k+1+j)/2\}(-q/\sqrt{p}\,)^{j}/j!}
{\sum_{j=0}^{\infty}\Gamma\{(k+1+j)/2\}(-q/\sqrt{p}\,)^{j}/j!}.
\end{aligned}
\]
It is found that $\mathrm{i}^{-m}\,d^{m}\phi(t)/dt^{m}|_{t=0}$ is equal to the "2nd expression" of (3.15). It is obvious that the moment generating function $M(t)$ is given from (3.31) when $\mathrm{i}t$ is replaced by $t$.
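The Kummer-function bracket that builds the "3rd expressions" of (3.15) and (3.31) is directly computable; here is a Python sketch (our function names, using SciPy's `hyp1f1`) that evaluates $E(X^{m})$ from the "3rd expression" of (3.15) and checks it against numerical integration.

```python
# E(X^m) from the "3rd expression" of (3.15) via the Kummer function 1F1.
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma, hyp1f1

def kummer_bracket(k, p, q):
    """Gamma((k+1)/2) 1F1((k+1)/2;1/2;q^2/4p) - (q/sqrt(p)) Gamma((k+2)/2) 1F1((k+2)/2;3/2;q^2/4p)."""
    x = q**2 / (4 * p)
    return (gamma((k + 1) / 2) * hyp1f1((k + 1) / 2, 0.5, x)
            - q / np.sqrt(p) * gamma((k + 2) / 2) * hyp1f1((k + 2) / 2, 1.5, x))

def bpc_raw_moment(m, k, p, q):
    return p ** (-m / 2) * kummer_bracket(m + k, p, q) / kummer_bracket(k, p, q)

def raw_moment_quad(m, k, p, q):
    num, _ = quad(lambda t: t**(m + k) * np.exp(-p * t**2 - q * t), 0, np.inf)
    den, _ = quad(lambda t: t**k * np.exp(-p * t**2 - q * t), 0, np.inf)
    return num / den

m, k, p, q = 2.0, 0.5, 0.7, -0.9
val_hyp = bpc_raw_moment(m, k, p, q)
val_quad = raw_moment_quad(m, k, p, q)
```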


3.4 The Mode and the Shapes of the PDFs of the BPC Distribution

A positive (local) mode of the bpc distribution, if it exists, is given by differentiating the numerator of the pdf (see (3.4)) and setting the result to zero:
\[
-2px^{2}-qx+k=0,\quad\text{i.e.,}\quad x=\{-q\pm(q^{2}+8pk)^{1/2}\}/(4p).
\tag{3.32}
\]
Let $f'_{D,k}(x)=df_{D,k}(x)/dx$. Then, we have the following.

Result 3.1 The mode, including a local one, and the decreasing/increasing tendencies of the density function $f_{D,k}(x)$ of the bpc distribution are given by cases:

(i) $-1<k<0$:
(i.a) $-1<k<-q^{2}/(8p)$: No real solution of (3.32) exists. $f_{D,k}(x)$ is a strictly decreasing function (sdf) with $f'_{D,k}(x)<0$.
(i.b) $-1<k=-q^{2}/(8p)$ and $q<0$: $f_{D,k}(x)$ is an sdf with $f'_{D,k}(x)\le 0$. The value $x=-q/(4p)$ gives an inflection point with $f'_{D,k}(x)=0$.
(i.c) $-1<k=-q^{2}/(8p)$ and $q>0$: $f_{D,k}(x)$ is an sdf with $f'_{D,k}(x)<0$.
(i.d) $\max\{-1,-q^{2}/(8p)\}<k<0$ and $q<0$: $f'_{D,k}(x)=0$ has two positive solutions. That is, $f_{D,k}(x)$ decreases from $x=0$ to a local minimum, then increases up to a local maximum or mode, and decreases after this point. This case gives a bimodal distribution.
(i.e) $\max\{-1,-q^{2}/(8p)\}<k<0$ and $q>0$: $f'_{D,k}(x)=0$ has two negative solutions. $f_{D,k}(x)$ is an sdf with $f'_{D,k}(x)<0$.

(ii) $k=0$: The pdf becomes
\[
f_{D,k}(x)=f_{D,0}(x)=\exp(-px^{2}-qx)\Big/\int_{0}^{\infty}\exp(-pt^{2}-qt)\,dt,
\]
which is the pdf of $N(-q/(2p),\,1/(2p))$ under single truncation of the region $X<0$.
(ii.a) $k=0$, $q<0$: $x=-q/(2p)$ gives a global mode.
(ii.b) $k=0$, $q=0$: $f_{D,0}(x)$ is an sdf with $f'_{D,0}(x)\le 0$ and $f'_{D,0}(0)=0$.
(ii.c) $k=0$, $q>0$: $f_{D,0}(x)$ is an sdf with $f'_{D,0}(x)<0$.

(iii) $k>0$: $f'_{D,k}(x)=0$ has a negative solution and a positive one. The latter solution, $x=\{-q+(q^{2}+8pk)^{1/2}\}/(4p)$, gives the global mode.

The above results will be numerically illustrated later.
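The stationary-point formula (3.32) is elementary to implement; the following Python sketch (our function names) computes the global mode for case (iii) and checks that the log-density gradient $k/x-2px-q$ vanishes there.

```python
# Global mode of the bpc distribution for k > 0 (Result 3.1, case (iii)):
# the positive root of -2 p x^2 - q x + k = 0.
import numpy as np

def bpc_mode(k, p, q):
    """x = {-q + sqrt(q^2 + 8 p k)} / (4 p), valid as the global mode when k > 0."""
    disc = q**2 + 8 * p * k
    return (-q + np.sqrt(disc)) / (4 * p)

# e.g. k = 1/2, p = 0.5, q = -1 (the parameters of Ex. 3.4 below)
k, p, q = 0.5, 0.5, -1.0
x_star = bpc_mode(k, p, q)
grad_at_mode = k / x_star - 2 * p * x_star - q   # d/dx log{x^k exp(-p x^2 - q x)}
```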

3.5 The Multivariate BPC Distribution

For the multivariate bpc distribution, we use the following notations:
$\mathbf{X}=(X_{1},\ldots,X_{n})^{\mathrm{T}}$: the $n$-dimensional random vector, whose realized value is $\mathbf{x}=(x_{1},\ldots,x_{n})^{\mathrm{T}}$ $(x_{i}>0,\;i=1,\ldots,n)$;
$\mathbf{k}=(k_{1},\ldots,k_{n})^{\mathrm{T}}$ $(k_{i}>-1,\;i=1,\ldots,n)$: the vector of powers of $X_{i}$ and $x_{i}$ $(i=1,\ldots,n)$ used, e.g., in $\mathbf{X}^{\mathbf{k}}=X_{1}^{k_{1}}\cdots X_{n}^{k_{n}}$ and $\mathbf{x}^{\mathbf{k}}=x_{1}^{k_{1}}\cdots x_{n}^{k_{n}}$;
$\mathbf{C}=\{c_{ij}\}$ $(i,j=1,\ldots,n)$: a fixed positive definite symmetric matrix;
$\mathbf{d}=(d_{1},\ldots,d_{n})^{\mathrm{T}}$: a fixed real vector.

Definition 3.3 Under the notations and assumptions given above, the multivariate bpc distribution is defined by its density function:
\[
f_{D,\mathbf{k}}(\mathbf{X}=\mathbf{x}\mid\mathbf{d},\mathbf{C})=f_{D,\mathbf{k}}(\mathbf{x})
=\frac{\mathbf{x}^{\mathbf{k}}\exp\{-(\mathbf{x}-\mathbf{d})^{\mathrm{T}}\mathbf{C}^{-1}(\mathbf{x}-\mathbf{d})/2\}}
{\int_{\mathbf{0}}^{\boldsymbol{\infty}}\cdots\int_{\mathbf{0}}^{\boldsymbol{\infty}}
\mathbf{t}^{\mathbf{k}}\exp\{-(\mathbf{t}-\mathbf{d})^{\mathrm{T}}\mathbf{C}^{-1}(\mathbf{t}-\mathbf{d})/2\}\,dt_{1}\cdots dt_{n}},
\tag{3.33}
\]
where the denominator or the normalizer is
\[
\begin{aligned}
&\exp\!\left(-\frac{\mathbf{d}^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{d}}{2}\right)
\left\{\prod_{i=1}^{n}(c^{ii})^{-(k_{i}+1)/2}\right\}
\sum_{\mathbf{u}=\mathbf{0}}^{\boldsymbol{\infty}}
\left[\prod_{i=1}^{n}2^{-u_{i\cdot}/2}
\exp\!\left[\frac{\{(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\}^{2}}{4c^{ii}}\right]
\Gamma(k_{i}+1+u_{i\cdot})\,
D_{-k_{i}-1-u_{i\cdot}}\!\left\{-(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\big/\sqrt{c^{ii}}\right\}\right]\\
&\qquad\times\prod_{g<h}\frac{1}{u_{gh}!}\left(\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\right)^{u_{gh}}
\qquad\text{(1st expression)},
\end{aligned}
\tag{3.34}
\]
where $\mathbf{C}^{-1}=\{c^{ij}\}$ $(i,j=1,\ldots,n)$; $\sum_{\mathbf{u}=\mathbf{0}}^{\boldsymbol{\infty}}(\cdot)=\sum_{u_{12}=0}^{\infty}\cdots\sum_{u_{n-1,n}=0}^{\infty}(\cdot)$; $u_{i\cdot}=\sum_{g=1}^{n-1}\sum_{h=g+1}^{n}u_{gh}(\delta_{gi}+\delta_{hi})=\sum_{g<h}u_{gh}(\delta_{gi}+\delta_{hi})$; $\delta_{gi}$ is the Kronecker delta; $(\cdot)_{i\mathrm{th}}$ is the $i$-th element of a vector; and $\prod_{g<h}$ is defined similarly to $\sum_{g<h}$.

The Proof of the Normalizer As in the univariate case, we use the variable transformations $y_{i}=x_{i}^{2}c^{ii}/2$, $dx_{i}/dy_{i}=1/\sqrt{2c^{ii}y_{i}}$ $(i=1,\ldots,n)$. Then, we obtain

\[
\begin{aligned}
&\int_{\mathbf{0}}^{\boldsymbol{\infty}}\mathbf{x}^{\mathbf{k}}
\exp\{-(\mathbf{x}-\mathbf{d})^{\mathrm{T}}\mathbf{C}^{-1}(\mathbf{x}-\mathbf{d})/2\}\,d\mathbf{x}\\
&=\int_{\mathbf{0}}^{\boldsymbol{\infty}}\mathbf{x}^{\mathbf{k}}
\exp\!\left\{-\sum_{i=1}^{n}\frac{x_{i}^{2}c^{ii}}{2}-\sum_{i<j}x_{i}x_{j}c^{ij}
+\sum_{i=1}^{n}x_{i}(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}
-\frac{\mathbf{d}^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{d}}{2}\right\}d\mathbf{x}\\
&=\exp\!\left(-\frac{\mathbf{d}^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{d}}{2}\right)
\int_{\mathbf{0}}^{\boldsymbol{\infty}}\left\{\prod_{i=1}^{n}\frac{(2y_{i}/c^{ii})^{k_{i}/2}}{\sqrt{2c^{ii}y_{i}}}\right\}
\exp\!\left(-\sum_{i=1}^{n}y_{i}\right)
\exp\!\left\{-\sum_{i<j}\frac{2c^{ij}}{\sqrt{c^{ii}c^{jj}}}\sqrt{y_{i}y_{j}}
+\sum_{i=1}^{n}(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\sqrt{\frac{2y_{i}}{c^{ii}}}\right\}d\mathbf{y}\\
&=\exp\!\left(-\frac{\mathbf{d}^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{d}}{2}\right)
\left\{\prod_{i=1}^{n}\frac{2^{(k_{i}-1)/2}}{(c^{ii})^{(k_{i}+1)/2}}\right\}
\int_{\mathbf{0}}^{\boldsymbol{\infty}}\left(\prod_{i=1}^{n}y_{i}^{(k_{i}-1)/2}e^{-y_{i}}\right)
\prod_{i<j}\sum_{u_{ij}=0}^{\infty}\frac{1}{u_{ij}!}\left(\frac{-2c^{ij}\sqrt{y_{i}y_{j}}}{\sqrt{c^{ii}c^{jj}}}\right)^{u_{ij}}\\
&\qquad\times\prod_{i=1}^{n}\sum_{v_{i}=0}^{\infty}\frac{1}{v_{i}!}
\left\{\frac{\sqrt{2}\,(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\sqrt{y_{i}}}{\sqrt{c^{ii}}}\right\}^{v_{i}}d\mathbf{y}\\
&=\exp\!\left(-\frac{\mathbf{d}^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{d}}{2}\right)2^{-n/2}
\left\{\prod_{i=1}^{n}\frac{2^{k_{i}/2}}{(c^{ii})^{(k_{i}+1)/2}}\right\}
\sum_{u_{12}=0}^{\infty}\cdots\sum_{u_{n-1,n}=0}^{\infty}\sum_{v_{1}=0}^{\infty}\cdots\sum_{v_{n}=0}^{\infty}
\int_{\mathbf{0}}^{\boldsymbol{\infty}}\prod_{i=1}^{n}y_{i}^{(k_{i}-1+u_{i\cdot}+v_{i})/2}e^{-y_{i}}\,d\mathbf{y}\\
&\qquad\times\left\{\prod_{g<h}\frac{1}{u_{gh}!}\left(\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\right)^{u_{gh}}\right\}
\prod_{l=1}^{n}\frac{1}{v_{l}!}\left\{\frac{\sqrt{2}\,(\mathbf{C}^{-1}\mathbf{d})_{l\mathrm{th}}}{\sqrt{c^{ll}}}\right\}^{v_{l}}\\
&=\exp\!\left(-\frac{\mathbf{d}^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{d}}{2}\right)2^{-n/2}
\left\{\prod_{i=1}^{n}\frac{2^{k_{i}/2}}{(c^{ii})^{(k_{i}+1)/2}}\right\}
\sum_{\mathbf{u}=\mathbf{0}}^{\boldsymbol{\infty}}\sum_{\mathbf{v}=\mathbf{0}}^{\boldsymbol{\infty}}
\left[\prod_{i=1}^{n}\Gamma\!\left(\frac{k_{i}+1+u_{i\cdot}+v_{i}}{2}\right)
\frac{1}{v_{i}!}\left\{\frac{\sqrt{2}\,(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}}{\sqrt{c^{ii}}}\right\}^{v_{i}}\right]\\
&\qquad\times\prod_{g<h}\frac{1}{u_{gh}!}\left(\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\right)^{u_{gh}}
\qquad\text{(2nd expression)}\\
&=\exp\!\left(-\frac{\mathbf{d}^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{d}}{2}\right)2^{-n/2}
\left\{\prod_{i=1}^{n}\frac{2^{k_{i}/2}}{(c^{ii})^{(k_{i}+1)/2}}\right\}
\sum_{\mathbf{u}=\mathbf{0}}^{\boldsymbol{\infty}}
\left[\prod_{i=1}^{n}\left[\Gamma\!\left(\frac{k_{i}+1+u_{i\cdot}}{2}\right)
{}_{1}F_{1}\!\left[\frac{k_{i}+1+u_{i\cdot}}{2};\frac12;
\frac{\{(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\}^{2}}{2c^{ii}}\right]\right.\right.\\
&\qquad\left.\left.+\frac{\sqrt{2}\,(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}}{\sqrt{c^{ii}}}\,
\Gamma\!\left(\frac{k_{i}+2+u_{i\cdot}}{2}\right)
{}_{1}F_{1}\!\left[\frac{k_{i}+2+u_{i\cdot}}{2};\frac32;
\frac{\{(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\}^{2}}{2c^{ii}}\right]\right]\right]
\prod_{g<h}\frac{1}{u_{gh}!}\left(\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\right)^{u_{gh}}
\quad\text{(3rd expression)}\\
&=\exp\!\left(-\frac{\mathbf{d}^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{d}}{2}\right)2^{-n/2}
\left\{\prod_{i=1}^{n}\frac{2^{k_{i}/2}}{(c^{ii})^{(k_{i}+1)/2}}\right\}
\sum_{\mathbf{u}=\mathbf{0}}^{\boldsymbol{\infty}}
\left[\prod_{i=1}^{n}2^{(1-k_{i}-u_{i\cdot})/2}
\exp\!\left[\frac{\{(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\}^{2}}{4c^{ii}}\right]
\Gamma(k_{i}+1+u_{i\cdot})\,
D_{-k_{i}-1-u_{i\cdot}}\!\left\{\frac{-(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}}{\sqrt{c^{ii}}}\right\}\right]\\
&\qquad\times\prod_{g<h}\frac{1}{u_{gh}!}\left(\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\right)^{u_{gh}}
\qquad\text{(1st expression)},
\end{aligned}
\]
which gives the "1st expression" of (3.34). Q.E.D.

Note that the normalizer in (3.33) is seen as a scaled extension of Fisher's $I_{k}$ ($I_{n}$ with $n=k$) or the parabolic cylinder function $D_{-k-1}(\cdot)$ to the $n$-variate case with $\mathbf{k}=(k_{1},\ldots,k_{n})^{\mathrm{T}}$ instead of $k$.

Properties of the multivariate bpc distribution are shown as follows.

Theorem 3.4 The product moment of the multivariate bpc distribution is given by the ratio of the normalizers:
\[
E(X_{1}^{m_{1}}\cdots X_{n}^{m_{n}}\mid\mathbf{d},\mathbf{C})=E(\mathbf{X}^{\mathbf{m}})\;(m_{i}\ge 0,\;i=1,\ldots,n)
=\frac{\int_{\mathbf{0}}^{\boldsymbol{\infty}}\mathbf{x}^{\mathbf{m}+\mathbf{k}}
\exp\{-(\mathbf{x}-\mathbf{d})^{\mathrm{T}}\mathbf{C}^{-1}(\mathbf{x}-\mathbf{d})/2\}\,d\mathbf{x}}
{\int_{\mathbf{0}}^{\boldsymbol{\infty}}\mathbf{x}^{\mathbf{k}}
\exp\{-(\mathbf{x}-\mathbf{d})^{\mathrm{T}}\mathbf{C}^{-1}(\mathbf{x}-\mathbf{d})/2\}\,d\mathbf{x}},
\]
where $\mathbf{x}^{\mathbf{m}+\mathbf{k}}=\mathbf{x}^{\mathbf{m}}\mathbf{x}^{\mathbf{k}}$; the denominator is given by (3.34); and the numerator is also given by (3.34) with $\mathbf{k}$ replaced by $\mathbf{m}+\mathbf{k}$.
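For small $n$, Theorem 3.4 can also be exercised without the series (3.34) by evaluating both normalizers with direct numerical integration. The following bivariate Python sketch (our function names; SciPy's `dblquad` replaces the book's R routines) computes a product moment as the ratio of two normalizers.

```python
# Product moments of the bivariate bpc distribution as ratios of normalizers.
import numpy as np
from scipy.integrate import dblquad

def biv_normalizer(k, d, Cinv):
    """int_0^inf int_0^inf x1^k1 x2^k2 exp{-(x-d)' Cinv (x-d)/2} dx2 dx1."""
    def f(x2, x1):
        e = np.array([x1, x2]) - d
        return x1**k[0] * x2**k[1] * np.exp(-0.5 * e @ Cinv @ e)
    val, _ = dblquad(f, 0, np.inf, lambda x: 0, lambda x: np.inf)
    return val

C = np.array([[1.0, 0.5], [0.5, 1.0]])
Cinv = np.linalg.inv(C)
d = np.array([0.5, 0.5])
k = np.array([0.5, 0.5])

den = biv_normalizer(k, d, Cinv)
e1 = biv_normalizer(k + np.array([1.0, 0.0]), d, Cinv) / den   # E(X1)
e2 = biv_normalizer(k + np.array([0.0, 1.0]), d, Cinv) / den   # E(X2)
prod_moment = biv_normalizer(k + np.array([1.0, 1.0]), d, Cinv) / den  # E(X1 X2)
```

With these symmetric parameters, $E(X_{1})=E(X_{2})$, which serves as an internal consistency check on the integration.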


The characteristic function with $\mathbf{t}=(t_{1},\ldots,t_{n})^{\mathrm{T}}$ is
\[
E(e^{\mathrm{i}\mathbf{t}^{\mathrm{T}}\mathbf{X}})
=\frac{\displaystyle\int_{\mathbf{0}}^{\boldsymbol{\infty}}\mathbf{x}^{\mathbf{k}}
\exp\{-(\mathbf{x}-\mathbf{d}-\mathrm{i}\mathbf{C}\mathbf{t})^{\mathrm{T}}\mathbf{C}^{-1}
(\mathbf{x}-\mathbf{d}-\mathrm{i}\mathbf{C}\mathbf{t})/2\}\,d\mathbf{x}
\cdot\exp\!\left(\mathrm{i}\mathbf{t}^{\mathrm{T}}\mathbf{d}
-\frac{\mathbf{t}^{\mathrm{T}}\mathbf{C}\mathbf{t}}{2}\right)}
{\displaystyle\int_{\mathbf{0}}^{\boldsymbol{\infty}}\mathbf{x}^{\mathbf{k}}
\exp\{-(\mathbf{x}-\mathbf{d})^{\mathrm{T}}\mathbf{C}^{-1}(\mathbf{x}-\mathbf{d})/2\}\,d\mathbf{x}},
\tag{3.35}
\]
where $\int_{\mathbf{0}}^{\boldsymbol{\infty}}\mathbf{x}^{\mathbf{k}}\exp\{\cdot\}\,d\mathbf{x}$ in the numerator is given by (3.34) when $\mathbf{d}$ is replaced with $\mathbf{d}+\mathrm{i}\mathbf{C}\mathbf{t}$.

Let $a=\int_{\mathbf{0}}^{\boldsymbol{\infty}}\mathbf{x}^{\mathbf{k}}\exp\{-(\mathbf{x}-\mathbf{d})^{\mathrm{T}}\mathbf{C}^{-1}(\mathbf{x}-\mathbf{d})/2\}\,d\mathbf{x}$. Then, the cdf, i.e., $\Pr(X_{1}\le x_{1},\ldots,X_{n}\le x_{n})\equiv\Pr(\mathbf{X}\le\mathbf{x})$, is
\[
\begin{aligned}
\Pr(\mathbf{X}\le\mathbf{x})
&=a^{-1}\exp\!\left(-\frac{\mathbf{d}^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{d}}{2}\right)
\left\{\prod_{i=1}^{n}(c^{ii})^{-(k_{i}+1)/2}\right\}
\sum_{\mathbf{u}=\mathbf{0}}^{\boldsymbol{\infty}}
\left[\prod_{i=1}^{n}2^{-u_{i\cdot}/2}
\exp\!\left[\frac{\{(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\}^{2}}{4c^{ii}}\right]\right.\\
&\qquad\left.\times\Gamma(k_{i}+1+u_{i\cdot})\,
D_{-k_{i}-1-u_{i\cdot},\mathrm{W}}\!\left\{-(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\big/\sqrt{c^{ii}},\;
x_{i}\sqrt{c^{ii}}\right\}\right]
\prod_{g<h}\frac{1}{u_{gh}!}\left(\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\right)^{u_{gh}}
&&\text{(1st expression)}\\
&=a^{-1}\exp\!\left(-\frac{\mathbf{d}^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{d}}{2}\right)2^{-n/2}
\left\{\prod_{i=1}^{n}\frac{2^{k_{i}/2}}{(c^{ii})^{(k_{i}+1)/2}}\right\}
\sum_{\mathbf{u}=\mathbf{0}}^{\boldsymbol{\infty}}\sum_{\mathbf{v}=\mathbf{0}}^{\boldsymbol{\infty}}
\left[\prod_{i=1}^{n}\gamma\!\left(\frac{x_{i}^{2}c^{ii}}{2}\,\middle|\,
\frac{k_{i}+1+u_{i\cdot}+v_{i}}{2}\right)
\frac{1}{v_{i}!}\left\{\frac{\sqrt{2}\,(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}}{\sqrt{c^{ii}}}\right\}^{v_{i}}\right]\\
&\qquad\times\prod_{g<h}\frac{1}{u_{gh}!}\left(\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\right)^{u_{gh}}
&&\text{(2nd expression)}\\
&=a^{-1}\exp\!\left(-\frac{\mathbf{d}^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{d}}{2}\right)2^{-n/2}
\left\{\prod_{i=1}^{n}\frac{2^{k_{i}/2}}{(c^{ii})^{(k_{i}+1)/2}}\right\}
\sum_{\mathbf{u}=\mathbf{0}}^{\boldsymbol{\infty}}
\left[\prod_{i=1}^{n}\left[\Gamma\!\left(\frac{k_{i}+1+u_{i\cdot}}{2}\right)
{}_{1}F_{1\mathrm{W}}\!\left[\frac{k_{i}+1+u_{i\cdot}}{2};\frac12;
\frac{\{(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\}^{2}}{2c^{ii}};\frac{x_{i}^{2}c^{ii}}{2}\right]\right.\right.\\
&\qquad\left.\left.+\frac{\sqrt{2}\,(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}}{\sqrt{c^{ii}}}\,
\Gamma\!\left(\frac{k_{i}+2+u_{i\cdot}}{2}\right)
{}_{1}F_{1\mathrm{W}}\!\left[\frac{k_{i}+2+u_{i\cdot}}{2};\frac32;
\frac{\{(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\}^{2}}{2c^{ii}};\frac{x_{i}^{2}c^{ii}}{2}\right]\right]\right]
\prod_{g<h}\frac{1}{u_{gh}!}\left(\frac{-2c^{gh}}{\sqrt{c^{gg}c^{hh}}}\right)^{u_{gh}}
&&\text{(3rd expression)}.
\end{aligned}
\tag{3.36}
\]

In Theorem 3.4, the three expressions of the cdf and their proofs parallel those of Definition 3.3 and the proof of the normalizer. That is, $D_{-k_{i}-1-u_{i\cdot}}\{-(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}/\sqrt{c^{ii}}\}$, ${}_{1}F_{1}[\,\cdot\,;\,\cdot\,;\{(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\}^{2}/(2c^{ii})]$ and $\Gamma\{(k_{i}+1+u_{i\cdot}+v_{i})/2\}$ used earlier are replaced by $D_{-k_{i}-1-u_{i\cdot},\mathrm{W}}\{-(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}/\sqrt{c^{ii}},\;x_{i}\sqrt{c^{ii}}\}$, ${}_{1}F_{1\mathrm{W}}[\,\cdot\,;\,\cdot\,;\{(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\}^{2}/(2c^{ii});\,x_{i}^{2}c^{ii}/2]$ and $\gamma\{x_{i}^{2}c^{ii}/2\mid(k_{i}+1+u_{i\cdot}+v_{i})/2\}$, respectively. Note that the cdf given above is seen as a scaled extension of the weighted/incomplete parabolic cylinder function $D_{-k-1,\mathrm{W}}(\cdot)$ to the $n$-variate case with $\mathbf{k}=(k_{1},\ldots,k_{n})^{\mathrm{T}}$ instead of $k$.

Corollary 3.2 The mode with $x_{i}>0$ $(i=1,\ldots,n)$, when it exists, is given from the following equations:
\[
c^{ii}x_{i}^{2}+\left\{\sum_{j\ne i}c^{ij}x_{j}-(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\right\}x_{i}-k_{i}=0
\quad(i=1,\ldots,n).
\tag{3.37}
\]
Proof Differentiating the numerator of the density function of (3.33) and setting the result to 0, we obtain
\[
\begin{aligned}
&\frac{\partial}{\partial x_{i}}\,\mathbf{x}^{\mathbf{k}}
\exp\{-(\mathbf{x}-\mathbf{d})^{\mathrm{T}}\mathbf{C}^{-1}(\mathbf{x}-\mathbf{d})/2\}\\
&=\left[k_{i}x_{i}^{-1}-\left\{c^{ii}x_{i}+\sum_{j\ne i}c^{ij}x_{j}
-(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\right\}\right]\mathbf{x}^{\mathbf{k}}
\exp\{-(\mathbf{x}-\mathbf{d})^{\mathrm{T}}\mathbf{C}^{-1}(\mathbf{x}-\mathbf{d})/2\}\\
&=-\left[c^{ii}x_{i}^{2}+\left\{\sum_{j\ne i}c^{ij}x_{j}
-(\mathbf{C}^{-1}\mathbf{d})_{i\mathrm{th}}\right\}x_{i}-k_{i}\right]
x_{i}^{-1}\,\mathbf{x}^{\mathbf{k}}
\exp\{-(\mathbf{x}-\mathbf{d})^{\mathrm{T}}\mathbf{C}^{-1}(\mathbf{x}-\mathbf{d})/2\}\\
&=0\qquad(i=1,\ldots,n),
\end{aligned}
\]
which gives (3.37). Q.E.D.

For the solution of (3.37), some iterative computation, e.g., the Newton-Raphson method, can be used.

Let $\mathbf{X}=(\mathbf{X}_{(1)}^{\mathrm{T}},\mathbf{X}_{(2)}^{\mathrm{T}})^{\mathrm{T}}$ with $\mathbf{X}_{(1)}=(X_{1},\ldots,X_{n_{1}})^{\mathrm{T}}$ and $\mathbf{X}_{(2)}=(X_{n_{1}+1},\ldots,X_{n})^{\mathrm{T}}$ $(n_{1}=1,\ldots,n-1)$. Then, the marginal distribution of $\mathbf{X}_{(1)}$ is easily given by (3.33) when (3.33) is integrated with respect to $\mathbf{X}_{(2)}$ over its support. Let $\mathbf{X}_{(1)}^{*}$ be multivariate bpc-distributed with $\mathbf{k}_{(1)}$, $\mathbf{d}_{(1)}$ and $\mathbf{C}_{(1)}$, where $\mathbf{k}_{(1)}=(k_{1},\ldots,k_{n_{1}})^{\mathrm{T}}$, $\mathbf{d}_{(1)}=(d_{1},\ldots,d_{n_{1}})^{\mathrm{T}}$ and $\mathbf{C}_{(1)}=\mathbf{C}_{11}$ is the $n_{1}\times n_{1}$ submatrix of $\mathbf{C}$ corresponding to $\mathbf{X}_{(1)}$. Then, it can be shown that the moments of $\mathbf{X}_{(1)}$ are generally different from those of

$\mathbf{X}_{(1)}^{*}$ unless $\mathbf{C}_{12}=\mathbf{O}$, where $\mathbf{C}=\begin{pmatrix}\mathbf{C}_{11}&\mathbf{C}_{12}\\ \mathbf{C}_{21}&\mathbf{C}_{22}\end{pmatrix}$, which will be numerically shown later. Further, the marginal distribution of $\mathbf{X}_{(1)}$ is shown to be not bpc-distributed unless $\mathbf{C}_{12}=\mathbf{O}$. Let $\mathbf{C}^{-1}=\begin{pmatrix}\mathbf{C}^{11}&\mathbf{C}^{12}\\ \mathbf{C}^{21}&\mathbf{C}^{22}\end{pmatrix}$, where
\[
\mathbf{C}^{22}=(\mathbf{C}_{22}-\mathbf{C}_{21}\mathbf{C}_{11}^{-1}\mathbf{C}_{12})^{-1}
=\mathbf{C}_{22}^{-1}+\mathbf{C}_{22}^{-1}\mathbf{C}_{21}
(\mathbf{C}_{11}-\mathbf{C}_{12}\mathbf{C}_{22}^{-1}\mathbf{C}_{21})^{-1}
\mathbf{C}_{12}\mathbf{C}_{22}^{-1}.
\]
Define $\mathbf{x}_{(1)}$, $\mathbf{x}_{(2)}$, $\mathbf{k}_{(2)}$ and $\mathbf{d}_{(2)}$ similarly as above. Denote the normalizer of the pdf of (3.33) by $a$. Then, the marginal density of $\mathbf{X}_{(1)}$ at $\mathbf{x}_{(1)}$, using the notation $f_{D,\mathbf{k}}(\mathbf{X}_{(1)}=\mathbf{x}_{(1)}\mid\mathbf{d},\mathbf{C})$ rather than $f_{D,\mathbf{k}_{(1)}}(\mathbf{X}_{(1)}=\mathbf{x}_{(1)}\mid\mathbf{d}_{(1)},\mathbf{C}_{(1)})$, is given by
\[
\begin{aligned}
f_{D,\mathbf{k}}(\mathbf{X}_{(1)}=\mathbf{x}_{(1)}\mid\mathbf{d},\mathbf{C})
&=a^{-1}\int_{\mathbf{0}}^{\boldsymbol{\infty}}\mathbf{x}^{\mathbf{k}}
\exp\{-(\mathbf{x}-\mathbf{d})^{\mathrm{T}}\mathbf{C}^{-1}(\mathbf{x}-\mathbf{d})/2\}\,d\mathbf{x}_{(2)}\\
&=a^{-1}\mathbf{x}_{(1)}^{\mathbf{k}_{(1)}}
\exp\{-(\mathbf{x}_{(1)}-\mathbf{d}_{(1)})^{\mathrm{T}}\mathbf{C}_{11}^{-1}(\mathbf{x}_{(1)}-\mathbf{d}_{(1)})/2\}\\
&\qquad\times\int_{\mathbf{0}}^{\boldsymbol{\infty}}\mathbf{x}_{(2)}^{\mathbf{k}_{(2)}}
\exp\bigl[-\{\mathbf{x}_{(2)}-\mathbf{d}_{(2)}-\mathbf{C}_{21}\mathbf{C}_{11}^{-1}(\mathbf{x}_{(1)}-\mathbf{d}_{(1)})\}^{\mathrm{T}}\mathbf{C}^{22}\\
&\qquad\qquad\times\{\mathbf{x}_{(2)}-\mathbf{d}_{(2)}-\mathbf{C}_{21}\mathbf{C}_{11}^{-1}(\mathbf{x}_{(1)}-\mathbf{d}_{(1)})\}/2\bigr]\,d\mathbf{x}_{(2)}\\
&=\frac{\mathbf{x}_{(1)}^{\mathbf{k}_{(1)}}
\exp\{-(\mathbf{x}_{(1)}-\mathbf{d}_{(1)})^{\mathrm{T}}\mathbf{C}_{11}^{-1}(\mathbf{x}_{(1)}-\mathbf{d}_{(1)})/2\}}
{\int_{\mathbf{0}}^{\boldsymbol{\infty}}\mathbf{t}^{\mathbf{k}}
\exp\{-(\mathbf{t}-\mathbf{d})^{\mathrm{T}}\mathbf{C}^{-1}(\mathbf{t}-\mathbf{d})/2\}\,d\mathbf{t}}\\
&\qquad\times\int_{\mathbf{0}}^{\boldsymbol{\infty}}\mathbf{x}_{(2)}^{\mathbf{k}_{(2)}}
\exp\bigl[-\{\mathbf{x}_{(2)}-\mathbf{d}_{(2)}-\mathbf{C}_{21}\mathbf{C}_{11}^{-1}(\mathbf{x}_{(1)}-\mathbf{d}_{(1)})\}^{\mathrm{T}}\mathbf{C}^{22}\\
&\qquad\qquad\times\{\mathbf{x}_{(2)}-\mathbf{d}_{(2)}-\mathbf{C}_{21}\mathbf{C}_{11}^{-1}(\mathbf{x}_{(1)}-\mathbf{d}_{(1)})\}/2\bigr]\,d\mathbf{x}_{(2)}.
\end{aligned}
\]
Since the fixed $\mathbf{x}_{(1)}$ appears in the integral of the last result as well as outside the integral, the marginal distribution is not a bpc one.

Let $f_{D,\mathbf{k}}(\mathbf{X}_{(2)}=\mathbf{x}_{(2)}\mid\mathbf{x}_{(1)},\mathbf{d},\mathbf{C})$ be the pdf of the conditional distribution of $\mathbf{X}_{(2)}$ when $\mathbf{X}_{(1)}=\mathbf{x}_{(1)}$ is given. Then, we obtain the following result.

ð3:38Þ

3.6 Numerical Illustrations

91

Proof fD;k ðXð2Þ ¼ xð2Þ jxð1Þ ; d; CÞ ¼

fD;k ðX ¼ xjd; CÞ fD;k ðXð1Þ ¼ xð1Þ jd; CÞ

T 22 ð2Þ xð2Þ exp½fxð2Þ  dð2Þ  C21 C1 11 ðxð1Þ  dð1Þ Þg C k

fxð2Þ  dð2Þ  C21 C1 11 ðxð1Þ  dð1Þ Þg=2 ¼ R 1 kð2Þ T 22 1 0 xð2Þ exp½fxð2Þ  dð2Þ  C21 C11 ðxð1Þ  dð1Þ Þg C fxð2Þ  dð2Þ  C21 C1 11 ðxð1Þ  dð1Þ Þg=2dxð2Þ 1 ¼ fD;kð2Þ fXð2Þ ¼ xð2Þ jdð2Þ þ C21 C1 11 ðxð1Þ  dð1Þ Þ; C22  C21 C11 C12 g:

Q.E.D. Equation (3.38) shows that the conditional distribution is the bpc when k ¼ kð2Þ , 22 1 d ¼ dð2Þ  C21 C1 ¼ C22  C21 C1 11 ðxð1Þ  dð1Þ Þ and C ¼ ðC Þ 11 C12 .
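The conditional parameters of Theorem 3.5 rest on the standard Schur-complement decomposition of the quadratic form; the following Python sketch (our variable names, arbitrary illustrative data) computes them and verifies the decomposition numerically.

```python
# Conditional bpc parameters (Theorem 3.5) and the quadratic-form decomposition
# (x-d)' C^{-1} (x-d) = (x1-d1)' C11^{-1} (x1-d1) + (x2-m)' C_cond^{-1} (x2-m),
# with m = d2 + C21 C11^{-1} (x1 - d1) and C_cond the Schur complement.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
C = A @ A.T + 4 * np.eye(4)          # a positive definite C
d = rng.normal(size=4)
x = np.abs(rng.normal(size=4))       # a point with positive coordinates
n1 = 2
C11, C12 = C[:n1, :n1], C[:n1, n1:]
C21, C22 = C[n1:, :n1], C[n1:, n1:]

d_cond = d[n1:] + C21 @ np.linalg.inv(C11) @ (x[:n1] - d[:n1])
C_cond = C22 - C21 @ np.linalg.inv(C11) @ C12      # Schur complement

e = x - d
q_full = e @ np.linalg.inv(C) @ e
q_marg = (x[:n1] - d[:n1]) @ np.linalg.inv(C11) @ (x[:n1] - d[:n1])
q_cond = (x[n1:] - d_cond) @ np.linalg.inv(C_cond) @ (x[n1:] - d_cond)
```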

3.6 Numerical Illustrations

For computation of the pdf, cdf and moments of the bpc distribution, the computation of $D_{-k-1,\mathrm{W}}(z,x)$ is an important factor. Currently, this can be performed using the "2nd expression" of (3.12), based on the infinite series, or the "1st expression" of (3.12), the integral representation, with some numerical integration. When the former is used, the following stopping rule is employed: when the sum of the absolute values of the newest four added terms, divided by the current value of the function, is smaller than or equal to a predetermined criterion denoted by "eps," the computation is stopped. The newest four terms are employed considering the cases of two equal consecutive values in similar functions [21, p. 5].

The function "wpc" for the computation of $D_{-k-1,\mathrm{W}}(z,x)$, coded in R [26], was developed using the default double precision. The convergence criterion "eps" can be zero, which indicates the maximum machine precision and is usually attained without excessive added computation time. The R-function "wpc" and the associated functions are given at the end of this chapter. The numerical integration by the R-function "integrate," based on QUADPACK [23], is employed using the default arguments for comparison with the computation by "wpc".

The comparison was performed for the combinations of $k=-0.5,0,0.5,1(1)5$; $x=0.5,1,2,\mathrm{Inf}$ with $\mathrm{Inf}=\infty$ in R; and $z=0(0.5)5$. That is, $8\times 4\times 11=352$ cases are used. In "wpc," the maximum machine precision, i.e., eps = 0, is used. The user cpu time required for the comparison of "wpc" and "integrate" over the set of 352 cases was 0.22 s using an Intel Core i7-6700 CPU. The statistics of the absolute differences of the values of $D_{-k-1,\mathrm{W}}(z,x)$ by "wpc" and "integrate" are min = median = 0, mean = 6.0e$-$9 and max = 8.8e$-$8, where $a\mathrm{e}{-b}$ denotes $a\times 10^{-b}$.
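The book's "wpc" is an R function; the following is a hedged Python re-implementation sketch (our function names) of the "2nd expression" of (3.12) with the four-term stopping rule described above, cross-checked against the integral representation.

```python
# Series computation of D_{-k-1,W}(z, x) ("2nd expression" of (3.12)) with the
# stopping rule: stop when the sum of the absolute values of the newest four
# added terms, divided by the current value, is at or below "eps".
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma, gammainc   # gammainc = regularized lower gamma

def wpc(k, z, x, eps=1e-12, max_terms=1000):
    total, last4 = 0.0, []
    for i in range(max_terms):
        a = (k + 1 + i) / 2
        # gammainc(a, w) * gamma(a) is the unnormalized lower incomplete gamma
        term = gammainc(a, x**2 / 2) * gamma(a) \
            * (-np.sqrt(2) * z)**i / gamma(i + 1)
        total += term
        last4 = (last4 + [abs(term)])[-4:]
        if i >= 3 and total != 0 and sum(last4) / abs(total) <= eps:
            break
    return 2 ** ((k - 1) / 2) * np.exp(-z**2 / 4) / gamma(k + 1) * total

k, z, x = 1.0, 0.5, 2.0
val = wpc(k, z, x)
# Cross-check with the "1st expression" of (3.12) by numerical integration:
val_quad = np.exp(-z**2 / 4) / gamma(k + 1) * quad(
    lambda t: t**k * np.exp(-t**2 / 2 - z * t), 0, x)[0]
```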


The statistics of the maximum powers in the power series (see $i$ in the "2nd expression" of (3.12)) when convergence is attained are min = 0, median = 29.5, mean = 37.8, max = 145. Note that min = 0 is given when $z=0$, where the computation reduces to the scaled (in)complete gamma function (see (3.13)). When the somewhat relaxed convergence criterion eps = 1e$-$6 was used, the user cpu time for the same comparison was 0.19 s; that is, only 0.03 s was saved over the case using the maximum machine precision. The corresponding statistics for the absolute differences of the function are min = 0, median = 1.6e$-$12, mean = 6.0e$-$9, max = 8.8e$-$8. The statistics for the maximum powers in "wpc" are min = 0, median = 13.0, mean = 18.5, max = 117, which are in line with the short cpu time saved as mentioned earlier.

In the case of the multivariate bpc distribution, the cdf of the bivariate case was used for illustration, where a similar comparison was made between the cdf by "wpc" and multivariate integration by the R-function "cubintegrate" with the method "hcubature" in the R-package "cubature" version 2.04 [16], used with the default arguments. The cases of $\mathbf{x}^{\mathrm{T}}=(0.1,0.1),(0.8,0.8),(2.5,2.5),(0.1,2.5)$; $\mathbf{k}^{\mathrm{T}}=(-0.5,-0.5),(-1/7,-1/7),(0.5,0.5),(-0.5,0.5),(2.5,2.5)$; $c_{12}=-0.5,0.5,0.8$ with $c_{11}=c_{22}=1$; and $\mathbf{d}^{\mathrm{T}}=(0.5,0.5),(-0.5,0.5),(1,1)$, i.e., $4\times 5\times 3\times 3=180$ cases, were employed. When eps = 0, the user cpu time was 60.1 s for the whole computation. The statistics of the absolute differences of the values of the cdf by "wpc" and "cubature" are min = 0, median = 7.2e$-$8, mean = 1.1e$-$6, max = 9.8e$-$5.

Figure 3.1 shows eight univariate cases for the combinations of $k=-1/2,-1/7,1/2$; $q=1,-1$; and $p=0.5,2$.

[Fig. 3.1 Density functions of the basic parabolic cylinder distribution. Left panel (parameter $p=0.5$): Ex. 3.1: $k=-1/2$, $q=1$; Ex. 3.2: $k=-1/7$, $q=-1$; Ex. 3.3: $k=1/2$, $q=1$; Ex. 3.4: $k=1/2$, $q=-1$. Right panel (parameter $p=2$): Ex. 3.5: $k=-1/2$, $q=1$; Ex. 3.6: $k=-1/7$, $q=-1$; Ex. 3.7: $k=1/2$, $q=1$; Ex. 3.8: $k=1/2$, $q=-1$.]

When $k<0$, $f_{D,k}(0)=\infty$, which


is well seen in the case of $k=-1/2$. When $k=-1/7$ and $q=-1$ (Ex. 3.2), there are two (local) modes (see (i.d) of Result 3.1). The positive local mode is $x=0.83$ by computation, which may be found when we look at the curve carefully. Though Ex. 3.6 with $k=-1/7$ and $q=-1$ when $p=2$ gives a strictly decreasing function over the support (see (i.a) of Result 3.1), the density function is a tilted S-shaped curve with non-stationary inflection points somewhere in (0, 0.3) and (0.5, 0.9).

A main difference between Ex. 3.1 to 3.4 with $p=0.5$ and Ex. 3.5 to 3.8 with $p=2$ is that of the scales. Though the same set $q=1,-1$ is used when $p=0.5$ and 2, the two values of $q$ give some differences other than scales, since $q_{p}=q/\sqrt{2p}$ is scale-free as mentioned earlier. That is, $q=1,-1$ give different sets of $\mathrm{sk}(X)$ and $\mathrm{kt}(X)$: in Ex. 3.1 to 3.4, $\mathrm{sk}(X)$ ($\mathrm{kt}(X)$) are 1.99 (4.95), 0.65 (0.05), 1.02 (1.11) and 0.46 ($-$0.05), respectively, while in Ex. 3.5 to 3.8, they are 1.79 (3.76), 0.88 (0.51), 0.90 (0.76) and 0.61 (0.14), respectively. The values of $\mathrm{sk}(X)$ and $\mathrm{kt}(X)$ were given by the recursive (see Corollary 3.1) and non-recursive (see Theorem 3.2) methods, where the differences for $D_{-k-5}(q_{p})$ required for $\mathrm{kt}(X)$ by the two methods (see (3.22) and the 2nd expression of (3.12)) are around the maximum machine precision, i.e., less than 1e$-$15, and similar differences for comparison with the numerical integration (see the 1st expression of (3.12)) are less than 1e$-$10. The last results seem to indicate that the results by numerical integration are less accurate than those by the remaining methods in these cases.

Figure 3.2 illustrates the density contours of the 9 bivariate bpc distributions. The first row (Ex. 3.9, 3.10 and 3.11), the second row (Ex. 3.12, 3.13 and 3.14) and

[Figure 3.2: nine contour-plot panels, each plotted over 0.0 to 3.0 on both axes.
Ex.3.9: c* = −0.5, k' = (−0.14, −0.14), d' = (1, 1); Ex.3.10: c* = 0.5, k' = (−0.14, −0.14), d' = (1, 1); Ex.3.11: c* = 0.8, k' = (−0.14, −0.14), d' = (1, 1);
Ex.3.12: c* = −0.5, k' = (−0.5, 0.5), d' = (0.5, 0.5); Ex.3.13: c* = 0.5, k' = (−0.5, 0.5), d' = (0.5, 0.5); Ex.3.14: c* = 0.8, k' = (−0.5, 0.5), d' = (0.5, 0.5);
Ex.3.15: c* = −0.5, k' = (2.5, 2.5), d' = (−0.5, 0.5); Ex.3.16: c* = 0.5, k' = (2.5, 2.5), d' = (−0.5, 0.5); Ex.3.17: c* = 0.8, k' = (2.5, 2.5), d' = (−0.5, 0.5).]
Fig. 3.2 Density contours of the bivariate bpc distribution (X1 = the horizontal axis, X2 = the vertical axis, c* = c12)

3 The Basic Parabolic Cylinder Distribution and Its Multivariate …


the third row (Ex. 3.15, 3.16 and 3.17) are cases with k' = (−1/7, −1/7), d' = (1, 1); k' = (−0.5, 0.5), d' = (0.5, 0.5); and k' = (2.5, 2.5), d' = (−0.5, 0.5), respectively, where the value −0.14 in the figure should be read as −1/7. On the other hand, the first column (Ex. 3.9, 3.12 and 3.15), the second column (Ex. 3.10, 3.13 and 3.16) and the third column (Ex. 3.11, 3.14 and 3.17) are cases with c* = c_12 = −0.5, 0.5 and 0.8, respectively. c_11 = c_22 = 1 is used throughout the 9 examples. That is, c_12 is a correlation coefficient when X is bivariate normally distributed without truncation. In the three examples with k' = (−1/7, −1/7) and d' = (1, 1) in the first row, it can be shown that when x_1 → 0 and x_2 → 0, the density function becomes infinitely large (when x_1 → 0 and x_2 is a positive fixed value, the density goes to ∞ at a different speed). From the contour plot of Ex. 3.9, it is found that there is more than one (local) mode in this example. The correlation coefficients of X_1 and X_2 following the bivariate bpc distribution when c_12 = −0.5 are −0.31, −0.19 and −0.18 in Ex. 3.9, 3.12 and 3.15, respectively. When c_12 = 0.5, they are 0.38, 0.32 and 0.31 in Ex. 3.10, 3.13 and 3.16, respectively. When c_12 = 0.8, they are 0.71, 0.66 and 0.65 in Ex. 3.11, 3.14 and 3.17, respectively. The absolute values of these coefficients seem to be reduced from the corresponding c_12's, which may be due to lower truncation and the multiplicative factor x^{m+k}, as found in the independent cases. As mentioned earlier, the means and variances in Fig. 3.2 are different from those of the corresponding univariate cases, i.e., k = k_i, p = c_ii/2 and q = d_i (i = 1, 2). For instance, the common mean (standard deviation) of X_1 and X_2 in Ex. 3.9 is 1.11 (0.74), while the corresponding value for the univariate version is 1.19 (0.79). In Ex. 3.17, the corresponding value for X_1 is 1.87 (0.67) while the univariate version is 1.53 (0.63); the corresponding value for X_2 is 2.75 (0.734) while the univariate version is 1.99 (0.729).

3.7 Discussion

(a) Non-raw absolute moments: As addressed earlier, a major application of the bpc distribution is to obtain absolute moments of real-valued orders for the uni- and multivariate truncated normal distributions. For the corresponding absolute moments of integer-valued orders, we can use the simple result with q = 0, e.g., in the univariate case, where the variable transformation with X + {q/(2p)} redefined as X, the change of support from [0, ∞) to [q/(2p), ∞) and the binomial expansion of [X − {q/(2p)}]^m should be considered. The simple result with q = 0 mentioned above is the case of the scaled (in)complete gamma function without infinite series, as addressed earlier. The above discussion suggests the problem of E(|X + a|^m) (m > −1) with a being a real constant, which was not dealt with in earlier sections when X is bpc-distributed. This can again be given by the variable transformation X + a redefined as X using the following procedure.


$$
\begin{aligned}
E(|X+a|^m) &= \int_0^\infty |x+a|^m f_{D,k}(x\,|\,p,q)\,dx \quad (m>-1)\\
&= \int_a^\infty |x|^m f_{D,k}(x\,|\,p,\,q-2ap)\,dx\;\exp(qa-pa^2)\\
&= \begin{cases}
\left\{\displaystyle\int_0^\infty x^m f_{D,k}(x\,|\,p,\,q-2ap)\,dx-\int_0^a x^m f_{D,k}(x\,|\,p,\,q-2ap)\,dx\right\}\exp(qa-pa^2) & (a\ge 0),\\[6pt]
\left\{\displaystyle\int_0^\infty x^m f_{D,k}(x\,|\,p,\,q-2ap)\,dx+\int_0^{|a|} x^m f_{D,k}(x\,|\,p,\,-q+2ap)\,dx\right\}\exp(qa-pa^2) & (a<0),
\end{cases}
\end{aligned}
$$

where $-p(x-a)^2-q(x-a)=-px^2-(q-2ap)x+qa-pa^2$ and $-p(-x)^2-(q-2ap)(-x)=-px^2-(-q+2ap)x$ are used. When a = −E(X|k, p, q), the above result gives the central absolute moment of real-valued order m (> −1) as a special case. In the case of E{(X + a)^m}, where m is a non-negative integer, the binomial expansion of (X + a)^m gives the required moment, as mentioned earlier for [X − {q/(2p)}]^m.

(b) The multiple infinite series: In the multivariate bpc distribution, the formula for the normalizer (see (3.33) and its proof) includes a {(n² − n)/2}-fold multiple infinite series, where (n² − n)/2 is the number of non-duplicated off-diagonal elements in C. When some c_ij's are zero, (n² − n)/2 can be replaced by the number of nonzero c_ij's (1 ≤ i < j ≤ n). Since generally c_ij ≠ 0, improved formulas for the multiple infinite series are desired, which is a task to be investigated in the future.

(c) Other applications of the bpc distribution: Since the density functions of the bpc distribution can take various shapes, fitting the distribution to various phenomena in real data is a possible application, as Kostylev [11] used a form of this distribution with a variable transformation for a physical problem. For fitting the bpc distribution to data, methods of parameter estimation and evaluation of the estimators should be derived, which are also remaining problems. Note that the uni- and multivariate bpc distributions belong to the exponential family of distributions, where p and q in the univariate case are natural parameters. As addressed earlier, the chi distribution is a special case of the bpc distribution. The relationship between these two distributions is similar to that between the chi-square and gamma distributions, since the former is a special case of the latter.
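The binomial route for integer-order moments in (a) can be sanity-checked numerically. The following Python sketch assumes the unnormalized bpc kernel x^k exp(−px² − qx) on [0, ∞), an assumed form consistent with this chapter's notation; `bpc_kernel` and `bpc_expect` are illustrative helpers, not the book's code. It checks E{(X + a)^m} = Σ_j C(m, j) a^{m−j} E(X^j) by simple quadrature.

```python
import math

def bpc_kernel(x, k, p, q):
    # unnormalized bpc kernel x^k * exp(-p x^2 - q x) on [0, inf); assumed form
    return x ** k * math.exp(-p * x * x - q * x)

def bpc_expect(g, k, p, q, upper=40.0, n=100000):
    # E{g(X)} by the midpoint rule; the unknown normalizer cancels in the ratio
    h = upper / n
    num = den = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        w = bpc_kernel(x, k, p, q)
        num += g(x) * w
        den += w
    return num / den

k, p, q, a, m = 0.7, 0.5, 1.0, 0.3, 3
direct = bpc_expect(lambda x: (x + a) ** m, k, p, q)
binom = sum(math.comb(m, j) * a ** (m - j) *
            bpc_expect(lambda x, j=j: x ** j, k, p, q)
            for j in range(m + 1))
print(abs(direct - binom) < 1e-8)
```

Because the binomial theorem holds pointwise on the shared quadrature grid, the two values agree to floating-point rounding regardless of the quadrature accuracy.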


It is to be noted that the gamma distribution is obtained from the bpc distribution, but not vice versa, when the variable transformation U = bpX² (b > 0) is used, where X is bpc-distributed with k > −1 and q = 0. Then, it can be shown that U has the gamma distribution with shape parameter (k + 1)/2 and scale parameter b. When k = 0, p = 1/2 and q = 0, the bpc distribution becomes the chi distribution with 1 degree of freedom, which is equal to the normal distribution under single truncation removing X < 0 (the half-normal distribution). These results show general properties of the bpc distribution.
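The gamma relation can be checked numerically in the q = 0 case. The sketch below again assumes the density proportional to x^k exp(−px²) on [0, ∞); `expect_u` is an illustrative helper. Under this assumption E(X²) = (k + 1)/(2p), so E(U) = bpE(X²) should equal b(k + 1)/2, the mean of a Gamma((k + 1)/2, scale b) variable.

```python
import math

def expect_u(k, p, b, upper=40.0, n=100000):
    # E(b p X^2) under the (unnormalized) assumed bpc density x^k exp(-p x^2), q = 0
    h = upper / n
    num = den = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        w = x ** k * math.exp(-p * x * x)
        num += b * p * x * x * w
        den += w
    return num / den

k, p, b = 1.4, 0.8, 2.0
mean_u = expect_u(k, p, b)
gamma_mean = b * (k + 1) / 2  # shape (k+1)/2 times scale b
print(abs(mean_u - gamma_mean) < 1e-4)
```

The same exercise with integer moments of U would recover the full gamma moment sequence, confirming the distributional identity rather than only the mean.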

3.8 R-Functions

3.8.1 The R-Function wpc for the Weighted Parabolic Cylinder Function

################ The start of function wpc ###########################
#
# R-function [wpc] version 1.0
#
# The value of the weighted parabolic cylinder function
# 2020
# Haruhiko Ogasawara
# Otaru University of Commerce, Otaru, Japan
#
wpc=function(k,z,x,eps=1e-6,mterm=500){
#
#.......................... INPUT ....................................
#
# [k]     The parameter k in the subscript -k-1 for the extended
#         Whittaker notation D_(-k-1,W)(z,x) ( note that the usual
#         Whittaker notation is D_(-k-1)(z) )
# [z]     The main argument for D_(-k-1,W)(z,x)
# [x]     The second argument for D_(-k-1,W)(z,x), which gives
#         the incomplete gamma function when x < Inf while x = Inf
#         gives the usual gamma function gamma(z)
# [eps]   The convergence criterion using the sum of the current four
#         absolute differences divided by the absolute value
#         of the function in a power series; 1e-6 (default);
#         eps=0 indicates machine precision
# [mterm] The maximum order of terms for the variable in a power
#         series, 500 (default)
#


#......................... OUTPUT ....................................
#
# The following are returned, when e.g.,
#   abc=wpc(k,z,x,eps,mterm)
#
# [abc$value]  The derived value of the weighted parabolic cylinder
#              function
# [abc$nterms] The order of terms required to have convergence
#              for the variable in a power series
# [abc$rinc]   The final relative increment of the variable
#              in a power series, which should be smaller than
#              or equal to 'eps' for convergence
#.....................................................................
mseq=NULL
dseq=NULL
h0=1e100 # a large initial value
if(z != 0){
 wpc=0
 uv=-1
 repeat{
  uv=uv+1
  abc=exp( lgamma((k+1+uv)/2)+log(pgamma(x^2/2,(k+1+uv)/2))
           +uv*log(abs(-sqrt(2)*z))-lfactorial(uv) )
  if(uv %% 2 == 1 && sign(-z) == -1)abc=-abc
  wpc=wpc+abc
  h1=wpc
  mseq=c(mseq,h1)
  dseq=c(h1-h0,dseq)
  h0=h1
  msize=4
  if(uv

$$\cdots\;\prod_{g<h}\frac{1}{u_{gh}!}\left\{\frac{-\,c_{grd}\,c_{hrd}}{\sqrt{\psi^{gg}\,\psi^{hh}}}\right\}^{u_{gh}},$$

where $\Psi^{-1}=\{\psi^{ij}\}$ (i, j = 1, …, N); sign(x) = 1, 0 and −1 when x > 0, x = 0 and x < 0, respectively; 1{·} is the indicator function; $\sum_{L_1,\ldots,L_N=0}^{1}(\cdot)=\sum_{L_1=0}^{1}\cdots\sum_{L_N=0}^{1}(\cdot)$, and $\sum_{u_{12},\ldots,u_{N-1,N}=0}^{\infty}(\cdot)$ and $\sum_{v_1,\ldots,v_N=0}^{\infty}(\cdot)$ are similarly defined; γ(x|s*) is the lower incomplete gamma function at x, i.e., $\int_0^x t^{s^*-1}e^{-t}\,dt$, with its complete version Γ(s*), which is the usual gamma function with a positive real-valued s*; $\sum_{g<h}(\cdot)=\sum_{g=1}^{N-1}\sum_{h=g+1}^{N}(\cdot)$; δ_{gi} is the Kronecker delta; (·)_{ith} is the i-th element of the vector in parentheses; $\prod_{g<h}(\cdot)=(\cdot)_{g=1,h=2}\times\cdots\times(\cdot)_{g=N-1,h=N}$; and the definition 0⁰ = 1 is used when necessary.

Proof Employ the following notations:

$$
\int_{A_d}^{B_d}(\cdot)\,d\mathbf{x} \equiv \sum_{r_Y=1}^{R_Y}\sum_{r_Z=1}^{R_Z}\int_{a_{1r_Yd}}^{b_{1r_Yd}}\cdots\int_{a_{pr_Yd}}^{b_{pr_Yd}}\int_{a_{(p+1)r_Z}}^{b_{(p+1)r_Z}}\cdots\int_{a_{Nr_Z}}^{b_{Nr_Z}}(\cdot)\,d\mathbf{x}
$$
$$
= \sum_{r\in R}\sum_{L_1,\ldots,L_N=0}^{1}(-1)^{L_1+\cdots+L_N}\int_0^{c_{1rd}^{(L_1)}}\cdots\int_0^{c_{Nrd}^{(L_N)}}(\cdot)\,d\mathbf{x} = \sum_{r\in R}\sum_{L=0}^{1}(-1)^{L_+}\int_0^{\mathbf{c}_{rd}^{(L)}}(\cdot)\,d\mathbf{x},
$$

8 The Truncated Pseudo-Normal (TPN) and Truncated …


where $\int_0^{c_{ird}^{(L_i)}}(\cdot)\,dx = -\int_{c_{ird}^{(L_i)}}^{0}(\cdot)\,dx$ if $c_{ird}^{(L_i)}<0$ (L_i = 0, 1; r = 1, …, R_Y when i = 1, …, p; r = 1, …, R_Z when i = p + 1, …, N). Using the variable transformations $w_i=x_i^2\psi^{ii}/2$, $x_i=\mathrm{sign}(x_i)\sqrt{2w_i/\psi^{ii}}$ and $dx_i/dw_i=\mathrm{sign}(x_i)/\sqrt{2\psi^{ii}w_i}$ (i = 1, …, N), and noting μ_{id} = (μ_d)_{ith}, (8.1) gives

$$E(\tilde{\mathbf{Y}}_d^{\,\mathbf{k}}) = a^{-1}\int_{A_d}^{B_d}\tilde{\mathbf{y}}^{\mathbf{k}}\,\frac{\exp\{-(\mathbf{x}-\boldsymbol{\psi}_d)^{\mathrm T}\Psi^{-1}(\mathbf{x}-\boldsymbol{\psi}_d)/2\}}{(2\pi)^{N/2}|\Psi|^{1/2}}\,d\mathbf{x}.$$

Expanding the exponential factors of the off-diagonal cross products $x_ix_j\psi^{ij}$ (i < j) and of the linear terms $x_i(\Psi^{-1}\boldsymbol{\psi}_d)_{i\mathrm{th}}$ as power series with summation indices $u_{gh}$ and $v_l$, respectively, and applying the signed decomposition of the sectional integrals to each transformed variable $w_i$, the multiple integral factors into one-dimensional integrals of the form

$$\int_0^{c_{ird}^{(L_i)2}\psi^{ii}/2} w_i^{s-1}e^{-w_i}\,dw_i = \gamma\!\left(\frac{c_{ird}^{(L_i)2}\psi^{ii}}{2}\,\Big|\,s\right).$$

8.4 Moments and Cumulants of the TPN

Carrying out this termwise integration for every combination of the indices, the last expression gives the required result. Q.E.D.

As in Ogasawara [9], Theorem 8.5 gives raw, central, arbitrarily deviated, non-absolute, absolute and partially absolute cross moments with possibly non-integer orders greater than −1 for the |Y_{id}|'s when Ỹ_{id} = |Y_{id}| under sectional truncation. It is found that Ogasawara [9, Theorem 1] is a special case of Theorem 8.5 in that in the former result Z is missing. The following result is obtained similarly.

Corollary 8.1 An alternative expression of Theorem 8.5 is given by

$$
\begin{aligned}
E(\tilde{\mathbf{Y}}_d^{\,\mathbf{k}}) &= \frac{\exp(-\boldsymbol{\psi}_d^{\mathrm T}\Psi^{-1}\boldsymbol{\psi}_d/2)}{a\,(4\pi)^{N/2}|\Psi|^{1/2}}\left\{\prod_{i=1}^{p}\frac{2^{k_i/2}}{(\psi^{ii})^{(k_i+1)/2}}\right\}\left(\prod_{i=p+1}^{N}\frac{1}{\sqrt{\psi^{ii}}}\right)\\
&\quad\times\sum_{r\in R}\sum_{L_1,\ldots,L_N=0}^{1}(-1)^{L_+}\left\{\prod_{i=1}^{p}\mathrm{sign}(c_{ird}^{(L_i)})^{k_i+1}\right\}\left\{\prod_{i=p+1}^{N}\mathrm{sign}(c_{ird}^{(L_i)})\right\}\\
&\quad\times\sum_{u_{12},\ldots,u_{N-1,N}=0}^{\infty}\prod_{i=1}^{N}\Bigg[\Gamma\!\left(\frac{1\{i\le p\}k_i+1+\sum_{g<h}u_{gh}(\delta_{gi}+\delta_{hi})}{2}\right)\\
&\qquad\times{}_1F_{1w}\!\left(\frac{1\{i\le p\}k_i+1+\sum_{g<h}u_{gh}(\delta_{gi}+\delta_{hi})}{2};\,\frac{1}{2};\,\frac{\{(\Psi^{-1}\boldsymbol{\psi}_d)_{i\mathrm{th}}\}^2}{2};\,\frac{c_{ird}^{(L_i)2}\psi^{ii}}{2}\right)\\
&\qquad+\frac{\sqrt{2}\,(\Psi^{-1}\boldsymbol{\psi}_d)_{i\mathrm{th}}\,\mathrm{sign}(c_{ird}^{(L_i)})}{\sqrt{\psi^{ii}}}\,\Gamma\!\left(\frac{1\{i\le p\}k_i+2+\sum_{g<h}u_{gh}(\delta_{gi}+\delta_{hi})}{2}\right)\\
&\qquad\times{}_1F_{1w}\!\left(\frac{1\{i\le p\}k_i+2+\sum_{g<h}u_{gh}(\delta_{gi}+\delta_{hi})}{2};\,\frac{3}{2};\,\frac{\{(\Psi^{-1}\boldsymbol{\psi}_d)_{i\mathrm{th}}\}^2}{2};\,\frac{c_{ird}^{(L_i)2}\psi^{ii}}{2}\right)\Bigg]\\
&\quad\times\prod_{g<h}\frac{1}{u_{gh}!}\left\{\frac{-\,c_{grd}\,c_{hrd}}{\sqrt{\psi^{gg}\,\psi^{hh}}}\right\}^{u_{gh}},
\end{aligned}
$$

where

$${}_1F_{1w}(g;\,n;\,x;\,w^*) = \sum_{v=0}^{\infty}\frac{(g)_v x^v}{(n)_v v!}\,\frac{\gamma(w^*\,|\,g+v)}{\Gamma(g+v)} \quad (w^*\ge 0,\; x\ge 0)$$

(the weighted Kummer confluent hypergeometric function); and $(g)_v=g(g+1)\cdots(g+v-1)=\Gamma(g+v)/\Gamma(g)$ is the rising factorial.

Proof For the proof, the following lemma is used. Q.E.D.

Lemma 8.2 (Ogasawara [9, Lemma 1])

$$\sum_{v=0}^{\infty}\gamma\{w^*\,|\,(k+1+v)/2\}\,\frac{(2\phi)^v}{v!} = \Gamma\!\left(\frac{k+1}{2}\right){}_1F_{1w}\!\left(\frac{k+1}{2};\,\frac{1}{2};\,\phi^2;\,w^*\right) + 2\phi\,\Gamma\!\left(\frac{k+2}{2}\right){}_1F_{1w}\!\left(\frac{k+2}{2};\,\frac{3}{2};\,\phi^2;\,w^*\right).$$

Proof of Lemma 8.2 Splitting the infinite series on the left-hand side of the above equation into terms of even and odd v gives

$$\sum_{v=0}^{\infty}\gamma\{w^*\,|\,(k+1+v)/2\}\frac{(2\phi)^v}{v!} = \sum_{j=0}^{\infty}\gamma\!\left(w^*\,\Big|\,\frac{k+1}{2}+j\right)\frac{(2\phi)^{2j}}{(2j)!} + \sum_{j=0}^{\infty}\gamma\!\left(w^*\,\Big|\,\frac{k+2}{2}+j\right)\frac{(2\phi)^{2j+1}}{(2j+1)!}.$$

Using $(2j)!=4^j j!\,(1/2)_j$, $(2j+1)!=4^j j!\,(3/2)_j$ and $\gamma(w^*|g+j)=\Gamma(g+j)\{\gamma(w^*|g+j)/\Gamma(g+j)\}$ with $\Gamma(g+j)=\Gamma(g)(g)_j$, the even part becomes

$$\Gamma\!\left(\frac{k+1}{2}\right)\sum_{j=0}^{\infty}\frac{\{(k+1)/2\}_j\,(\phi^2)^j}{(1/2)_j\,j!}\,\frac{\gamma\{w^*\,|\,(k+1)/2+j\}}{\Gamma\{(k+1)/2+j\}}$$

and the odd part becomes

$$2\phi\,\Gamma\!\left(\frac{k+2}{2}\right)\sum_{j=0}^{\infty}\frac{\{(k+2)/2\}_j\,(\phi^2)^j}{(3/2)_j\,j!}\,\frac{\gamma\{w^*\,|\,(k+2)/2+j\}}{\Gamma\{(k+2)/2+j\}},$$

which gives the required result with the definition of 1F1w(·). Q.E.D.

In Corollary 8.1, when w* is ∞, 1F1w(g; n; x; w*) becomes the usual Kummer confluent hypergeometric function:

$${}_1F_{1w}(g;\,n;\,x;\,\infty) = \sum_{v=0}^{\infty}\frac{(g)_v x^v}{(n)_v v!}\,\frac{\gamma(\infty\,|\,g+v)}{\Gamma(g+v)} = \sum_{v=0}^{\infty}\frac{(g)_v x^v}{(n)_v v!} = {}_1F_1(g;\,n;\,x) \quad (x\ge 0)$$

(Winkerbauer [10, Eq. (6)]; Zwillinger [11, Eq. (1) of Sect. 9.210]; DLMF [2, Chap. 13]). Note that 1F1w(g; n; x; w*) is given by 1F1(g; n; x) when each term of the latter is weighted by 0 ≤ γ(w*|g + v)/Γ(g + v) ≤ 1. That is, the infinite series of 1F1w(g; n; x; w*) is expected to converge faster than that of 1F1(g; n; x).
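The reduction of 1F1w to the ordinary Kummer function for large w* can be verified with a direct implementation of the series. The following Python sketch is an illustrative port, not the book's R code; it uses the standard power series for the regularized lower incomplete gamma function. With a large cutoff the weights are essentially 1, so the series should approach 1F1(g; g; x) = e^x.

```python
import math

def reg_lower_gamma(s, x, terms=200):
    # regularized lower incomplete gamma P(s, x) = gamma(x|s)/Gamma(s),
    # via the standard series x^s e^{-x} sum_n x^n / (s(s+1)...(s+n))
    if x <= 0:
        return 0.0
    total, term = 0.0, 1.0 / s
    for n in range(1, terms):
        total += term
        term *= x / (s + n)
    total += term
    return math.exp(s * math.log(x) - x - math.lgamma(s)) * total

def f1w(g, n, x, wstar, terms=100):
    # weighted Kummer series: sum_v (g)_v x^v / ((n)_v v!) * P(g+v, wstar)
    total, ratio = 0.0, 1.0  # ratio = (g)_v x^v / ((n)_v v!)
    for v in range(terms):
        total += ratio * reg_lower_gamma(g + v, wstar)
        ratio *= (g + v) * x / ((n + v) * (v + 1))
    return total

# with a large cutoff the weights are ~1, so f1w reduces to 1F1(g; g; x) = e^x
val = f1w(0.8, 0.8, 1.5, 50.0)
print(abs(val - math.exp(1.5)) < 1e-8)
```

Since every weight lies in [0, 1] and all terms here are positive, a finite cutoff w* can only shrink the sum, which is the monotonicity behind the faster-convergence remark above.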

When p = q = 1, i.e., N = p + q = 2 and ψ_d = 0, define

$$\Psi = \begin{pmatrix}\sigma & 0\\ 0 & \sigma_x\end{pmatrix}\begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}\begin{pmatrix}\sigma & 0\\ 0 & \sigma_x\end{pmatrix},\qquad c_{1rd}^{*(L_1)} = c_{1rd}^{(L_1)}/\sigma \quad\text{and}\quad c_{2rd}^{*(L_2)} = c_{2rd}^{(L_2)}/\sigma_x.$$

Denote k_1 and Ỹ_{1d} by k and Ỹ_d, respectively. Then, using Lemma 8.2 in a similar way, we have the following relatively simple result, similar to Ogasawara [9, Corollary 3]:

Corollary 8.2 When k > −1 is real-valued,

$$
\begin{aligned}
E(\tilde{Y}_d^{\,k}\,|\,\boldsymbol{\psi}_d=\mathbf{0}) &= \frac{\sigma^k 2^{k/2}(1-\rho^2)^{(k+1)/2}}{4\pi a}\sum_{r\in R}\sum_{L_1,L_2=0}^{1}(-1)^{L_1+L_2}\,\mathrm{sign}(c_{1rd}^{*(L_1)})^{k+1}\,\mathrm{sign}(c_{2rd}^{*(L_2)})\\
&\quad\times\sum_{u=0}^{\infty}\gamma\!\left(\frac{c_{1rd}^{*(L_1)2}}{2(1-\rho^2)}\,\Big|\,\frac{k+1+u}{2}\right)\gamma\!\left(\frac{c_{2rd}^{*(L_2)2}}{2(1-\rho^2)}\,\Big|\,\frac{1+u}{2}\right)\frac{\{\mathrm{sign}(c_{1rd}^{*(L_1)}c_{2rd}^{*(L_2)})\,2\rho\}^u}{u!}\\
&= \frac{\sigma^k 2^{k/2}(1-\rho^2)^{(k+1)/2}}{4\pi a}\sum_{r\in R}\sum_{L_1,L_2=0}^{1}(-1)^{L_1+L_2}\,\mathrm{sign}(c_{1rd}^{*(L_1)})^{k+1}\,\mathrm{sign}(c_{2rd}^{*(L_2)})\\
&\quad\times\Bigg[\sqrt{\pi}\,\Gamma\!\left(\frac{k+1}{2}\right){}_2F_{1w2}\!\left(\frac{k+1}{2},\frac{1}{2};\,\frac{1}{2};\,\rho^2;\,\frac{c_{1rd}^{*(L_1)2}}{2(1-\rho^2)},\,\frac{c_{2rd}^{*(L_2)2}}{2(1-\rho^2)}\right)\\
&\qquad+2\rho\,\Gamma\!\left(\frac{k+2}{2}\right){}_2F_{1w2}\!\left(\frac{k+2}{2},1;\,\frac{3}{2};\,\rho^2;\,\frac{c_{1rd}^{*(L_1)2}}{2(1-\rho^2)},\,\frac{c_{2rd}^{*(L_2)2}}{2(1-\rho^2)}\right)\Bigg],
\end{aligned}
$$

where

$${}_2F_{1w2}(g_1,g_2;\,n;\,x;\,w_1^*,w_2^*) = \sum_{v=0}^{\infty}\frac{(g_1)_v(g_2)_v x^v}{(n)_v v!}\,\frac{\gamma(w_1^*\,|\,g_1+v)\,\gamma(w_2^*\,|\,g_2+v)}{\Gamma(g_1+v)\,\Gamma(g_2+v)} \quad (w_1^*\ge 0,\; w_2^*\ge 0,\; x\ge 0).$$

The last expression, 2F1w2(g_1, g_2; n; x; w_1*, w_2*), with the weight γ(w_1*|g_1 + v)γ(w_2*|g_2 + v)/{Γ(g_1 + v)Γ(g_2 + v)} ∈ [0, 1] added to each term of the usual Gauss hypergeometric function ${}_2F_1(g_1,g_2;\,n;\,x)=\sum_{v=0}^{\infty}\frac{(g_1)_v(g_2)_v x^v}{(n)_v v!}$, was obtained by

Ogasawara [9, Corollary 3] and is called the weighted Gauss hypergeometric function. The unweighted Gauss hypergeometric functions used by Nabeya [8] and Kamat [5] in the untruncated and singly truncated bivariate cases, respectively, are special cases of the weighted counterpart.

Remark 8.3 Corollary 8.2 shows that the moments of Y do not depend on the scale or standard deviation σ_x of Z = Z_1 before truncation, which was used to obtain the standardized limit c_{2rd}^{*(L_2)} = c_{2rd}^{(L_2)}/σ_x for an interval for selection. This is expected and generally holds for Z in the TPN and PN, since the distribution of Y is unchanged as long as

$$\Pr(\mathbf{Z}\in S_Z\,|\,\boldsymbol{\eta},\Omega) = \Pr\!\left(\bigcup_{r=1}^{R_Z}\{\mathbf{a}_{Zr}\le \mathbf{Z} < \mathbf{b}_{Zr}\}\right)$$

is the same under reparametrization as

$$
\begin{aligned}
\Pr\{\mathbf{Z}^* = \mathrm{Diag}^{-1/2}(\Omega)\mathbf{Z}\in S_Z^*\,|\,\mathrm{Diag}^{-1/2}(\Omega)\boldsymbol{\eta},\,\mathrm{P}\} &= \Pr\!\left(\bigcup_{r=1}^{R_Z}\{\mathrm{Diag}^{-1/2}(\Omega)\mathbf{a}_{Zr}\le \mathbf{Z}^* < \mathrm{Diag}^{-1/2}(\Omega)\mathbf{b}_{Zr}\}\right)\\
&= \Pr\!\left(\bigcup_{r=1}^{R_Z}\{\mathbf{a}_{Zr}^*\le \mathbf{Z}^* < \mathbf{b}_{Zr}^*\}\right),
\end{aligned}
$$

where P = Diag^{−1/2}(Ω) Ω Diag^{−1/2}(Ω) is the correlation matrix with unit diagonals corresponding to Ω, and Diag^{−1/2}(Ω) is the diagonal matrix whose diagonal elements are ω_{11}^{−1/2}, …, ω_{qq}^{−1/2}. Though this reparametrization can be used without loss of generality, other methods are also possible. The SN, a special case of PN_{p,q,R}(μ, Σ, Δ, η, D, A, B) = PN_{1,1,1}(0, 1, λ, 0, 1, −∞, 0), uses D = δ = 1 rather than Ω = σ_x² = 1, giving the pdf at y, 2φ(y)Φ(λy | η = 0, δ = 1) = 2φ(y)Φ(λy), as addressed earlier. It is found that the TPN and PN are latent variable models (LVMs), where hidden or latent truncation is typically used, though Z may be observable. Note that even in the latter observable case, Z does not always appear in the model, e.g., 2φ(y)Φ(λy) for the SN unless the expression 2φ(y)Pr(Z < 0 | −λy, 1) is employed. One of the properties of the LVM is the inflated dimensionality for variables


associated with the LVM. The SN model is a univariate model for Y. However, we require an added latent variable Z to describe the pdf. In psychometrics, or more generally behaviormetrics, it is known that the exploratory factor analysis (EFA) model is an LVM, while the similar model of principal component analysis (PCA) is not. In the EFA model, when the number of observable variables of interest is p, the model consists of p unique factors and additional q common factors, yielding the inflated dimensionality p + q over that of the observable variables or data. On the other hand, in PCA the sum of the numbers of principal and remaining minor components is equal to p, i.e., the number of observable variables, with no inflation of the number of associated variables (for the LVM and the EFA model see, e.g., Bollen [1]; Gorsuch [4]).

A special case of the PN is given by the SN, as addressed earlier. The SN under single truncation was introduced by Kim [6], while the SN under double truncation was investigated by Flecher, Allard and Naveau [3]. The truncated skew-normal (TSN) with emphasis on convolution was discussed by Krenek, Cha, Cho and Sharp [7]. A special case of Corollary 8.2 in the case of the TSN is given when η = d = 0, ψ_d = ψ = 0, Σ = σ² = 1, σ_x² = 1 + λ², ρ = −λ/√(1 + λ²),

$$\Psi = \begin{pmatrix}1 & -\lambda\\ -\lambda & 1+\lambda^2\end{pmatrix},\qquad \Psi^{-1} = \begin{pmatrix}1+\lambda^2 & \lambda\\ \lambda & 1\end{pmatrix},\qquad |\Psi| = 1,\qquad a = 1/2,$$

$c_{1rd}^{*(L_1)} = c_{1r}^{*(L_1)} = c_{1r}^{(L_1)}/\sigma = c_{1r}^{(L_1)}$ (r = 1, …, R_Y) and $c_{2rd}^{*(L_2)} = c_{2r}^{*(L_2)} = c_{2r}^{(L_2)}/\sigma_x = c_{2r}^{(L_2)}/(1+\lambda^2)^{1/2}$, due to b_{2r} = 0 and a_{2r} = −∞ (r = R_Z = 1).

Corollary 8.3 When k > −1 is real-valued for the TSN,

$$
\begin{aligned}
E(\tilde{Y}^k) &= \frac{2^{k/2}}{2\pi(1+\lambda^2)^{(k+1)/2}}\sum_{r=1}^{R_Y}\sum_{L_1=0}^{1}(-1)^{L_1}\,\mathrm{sign}(c_{1r}^{*(L_1)})^{k+1}\\
&\quad\times\sum_{u=0}^{\infty}\gamma\!\left(\frac{c_{1r}^{*(L_1)2}(1+\lambda^2)}{2}\,\Big|\,\frac{k+1+u}{2}\right)\Gamma\!\left(\frac{1+u}{2}\right)\left\{-\,\mathrm{sign}(c_{1r}^{*(L_1)})\,\frac{2\lambda}{\sqrt{1+\lambda^2}}\right\}^u\frac{1}{u!}\\
&= \frac{2^{k/2}}{2\pi(1+\lambda^2)^{(k+1)/2}}\sum_{r=1}^{R_Y}\sum_{L_1=0}^{1}(-1)^{L_1}\,\mathrm{sign}(c_{1r}^{*(L_1)})^{k+1}\\
&\quad\times\Bigg[\sqrt{\pi}\,\Gamma\!\left(\frac{k+1}{2}\right){}_2F_{1w2}\!\left(\frac{k+1}{2},\frac{1}{2};\,\frac{1}{2};\,\frac{\lambda^2}{1+\lambda^2};\,\frac{c_{1r}^{*(L_1)2}(1+\lambda^2)}{2},\,\infty\right)\\
&\qquad-\frac{2\lambda}{\sqrt{1+\lambda^2}}\,\Gamma\!\left(\frac{k+2}{2}\right){}_2F_{1w2}\!\left(\frac{k+2}{2},1;\,\frac{3}{2};\,\frac{\lambda^2}{1+\lambda^2};\,\frac{c_{1r}^{*(L_1)2}(1+\lambda^2)}{2},\,\infty\right)\Bigg],
\end{aligned}
$$

where

$${}_2F_{1w2}(g_1,g_2;\,n;\,x;\,w_1^*,\infty) = \sum_{v=0}^{\infty}\frac{(g_1)_v(g_2)_v x^v}{(n)_v v!}\,\frac{\gamma(w_1^*\,|\,g_1+v)}{\Gamma(g_1+v)} \quad (w_1^*\ge 0,\; x\ge 0).$$

Proof Use ψ^{11} = 1 + λ², ψ^{22} = 1 and |Ψ| = 1. Note that b_{2r} = 0 does not contribute to the result and that a_{2r} = −∞ (r = R_Z = 1), which corresponds to L_2 = 1, gives (−1)^{L_2} sign(c_{2rd}^{(L_2)}) = 1 and

$$\gamma\!\left(\frac{c_{2rd}^{*(L_2)2}}{2(1-\rho^2)}\,\Big|\,\frac{1+u}{2}\right) = \gamma\!\left(\infty\,\Big|\,\frac{1+u}{2}\right) = \Gamma\!\left(\frac{1+u}{2}\right) \quad (r = R_Z = 1),$$

yielding the required result. Q.E.D.

8.4.2 A Formula Using the MGF

In Chap. 4, formulas for the moments of the PN were given using its mgf and the partial derivatives of the cdfs of the associated variables. In this subsection, formulas based on the reduced expressions for the STN in Chap. 1 are used. Recall that the mgf of the STN is

$$M_X(\mathbf{t}) = \frac{\Pr(\mathbf{X}+\Sigma\mathbf{t}\in S\,|\,\boldsymbol{\mu},\Sigma)}{\Pr(\mathbf{X}\in S\,|\,\boldsymbol{\mu},\Sigma)}\exp\!\left(\boldsymbol{\mu}^{\mathrm T}\mathbf{t}+\frac{\mathbf{t}^{\mathrm T}\Sigma\mathbf{t}}{2}\right)$$

(Theorem 1.1), while that of the TPN is

$$M_Y(\mathbf{t}) = a^{-1}\Pr\!\left[\{(\mathbf{Y}+\Sigma\mathbf{t})^{\mathrm T},(\mathbf{Z}+\Delta\Sigma\mathbf{t})^{\mathrm T}\}^{\mathrm T}\in S\,|\,\boldsymbol{\psi},\Psi\right]\exp\!\left(\boldsymbol{\mu}^{\mathrm T}\mathbf{t}+\frac{\mathbf{t}^{\mathrm T}\Sigma\mathbf{t}}{2}\right)$$

with a = Pr(X ∈ S | ψ, Ψ) (Theorem 8.1). Let Y = Y_1* + Y_0, where Y_1* is a pseudo random vector whose pseudo mgf is $a^{-1}\Pr[\{(\mathbf{Y}+\Sigma\mathbf{t})^{\mathrm T},(\mathbf{Z}+\Delta\Sigma\mathbf{t})^{\mathrm T}\}^{\mathrm T}\in S\,|\,\boldsymbol{\psi},\Psi]$ and Y_0 ~ N_p(μ, Σ) is assumed to be independent of Y_1*. Define

$$A_Y^* = A_Y - \Sigma\mathbf{t}\mathbf{1}_{R_Y}^{\mathrm T},\quad B_Y^* = B_Y - \Sigma\mathbf{t}\mathbf{1}_{R_Y}^{\mathrm T},\quad A_Z^* = A_Z - \Delta\Sigma\mathbf{t}\mathbf{1}_{R_Z}^{\mathrm T},\quad B_Z^* = B_Z - \Delta\Sigma\mathbf{t}\mathbf{1}_{R_Z}^{\mathrm T},$$


$$\mathbf{a}_r = (A_Y^*)_r\big|_{\mathbf{t}=\mathbf{0}} = (A_Y)_r,\quad \mathbf{b}_r = (B_Y^*)_r\big|_{\mathbf{t}=\mathbf{0}} = (B_Y)_r \quad (r = 1,\ldots,R_Y);$$
$$\mathbf{a}_{R_Y+r} = (A_Z^*)_r\big|_{\mathbf{t}=\mathbf{0}} = (A_Z)_r,\quad \mathbf{b}_{R_Y+r} = (B_Z^*)_r\big|_{\mathbf{t}=\mathbf{0}} = (B_Z)_r \quad (r = 1,\ldots,R_Z),$$

where (·)_r is the r-th column of a matrix. Let a_{r(k)} be a_r with the k-th element omitted. Then, as in Chap. 1 for the STN, using similar notations, e.g., $c_{kr}^{(L_k)}=a_{kr}^{L_k}b_{kr}^{1-L_k}$, the following results are obtained.

Lemma 8.3 Define σ_{d,i₁k} = σ_{i₁k} = (Σ)_{i₁k} (i₁ = 1, …, p; k = 1, …, p) and σ_{d,i₁k} = (ΣΔᵀ)_{i₁,k−p} (i₁ = 1, …, p; k = p + 1, …, N), N = p + q, where (·)_{i₁k} is the (i₁, k)-th element of a matrix. Then

$$E(\mathbf{Y})_{i_1} = a^{-1}\sum_{k=1}^{N}\sum_{r\in R}\sum_{L_k=0}^{1}(-1)^{L_k+1}\,\sigma_{d,i_1k}\,f_{(1)}^{(r)}(c_{kr}^{(L_k)}) + \mu_{i_1} \quad (i_1 = 1,\ldots,p),$$

where $\sum_{r\in R}(\cdot)=\sum_{r_Y=1}^{R_Y}(\cdot)$ for k = 1, …, p, $\sum_{r\in R}(\cdot)=\sum_{r_Z=1}^{R_Z}(\cdot)$ for k = p + 1, …, N, and

$$f_{(1)}^{(r)}(c_{kr}^{(L_k)}) = \sum_{(L_k)=0}^{1}(-1)^{L_+-L_k}\,f_{(1)}(c_{kr}^{(L_k)}) = \int_{\mathbf{a}_{r(k)}}^{\mathbf{b}_{r(k)}}\phi_N\{(c_{kr}^{(L_k)},\,\mathbf{x}_{(k)}^{\mathrm T})^{\mathrm T}\,|\,\boldsymbol{\psi},\Psi\}\,d\mathbf{x}_{(k)} \quad (k = 1,\ldots,N),$$

in which the integral over x_{(k)} is the sectional (N − 1)-mode multiple integral omitting the variable x_k, and the signed sum over (L_k) runs over all L's except L_k,


where L_+ = L_1 + ··· + L_N.

Lemma 8.4 Redefine $c_{kr}^{(L_k)}=c_{kr}^{(L_k)}-\psi_k$. Then, we have

$$
\begin{aligned}
E\{(\mathbf{Y}_1^*)_{i_1}(\mathbf{Y}_1^*)_{i_2}\} &= a^{-1}\Bigg\{\sum_{k=1}^{N}\sum_{r\in R}\sum_{L_k=0}^{1}(-1)^{L_k+1}\,\sigma_{d,i_1k}\,\sigma_{d,i_2/k}\,c_{kr}^{(L_k)}f_{(1)}^{(r)}(c_{kr}^{(L_k)})\\
&\quad+\sum_{k,l=1;\,k\ne l}^{N}\sum_{r\in R}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\,\sigma_{d,i_1k}\,\sigma_{d,i_2l|k}\,f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\Bigg\} \quad (i_1,i_2=1,\ldots,p),
\end{aligned}
$$

where $\sigma_{d,i_2/k}=\sigma_{d,i_2k}\psi_{kk}^{-1}$, $\sigma_{d,i_2l|k}=\sigma_{d,i_2l}-\sigma_{d,i_2k}\psi_{kk}^{-1}\psi_{kl}$,

$$f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)}) = f_{(2)}^{(r)}\{(c_{kr}^{(L_k)},c_{lr}^{(L_l)})^{\mathrm T}\} = \int_{\mathbf{a}_{r(k,l)}}^{\mathbf{b}_{r(k,l)}}\phi_N\{(\mathbf{c}_{kl,r}^{(L)},\,\mathbf{x}_{(k,l)}^{\mathrm T})^{\mathrm T}\,|\,\boldsymbol{\psi},\Psi\}\,d\mathbf{x}_{(k,l)} = \sum_{(L_k,L_l)=0}^{1}(-1)^{L_+-L_k-L_l}\,f_{(2)}(\mathbf{c}_{kl,r}^{(L)}) \quad (k,l=1,\ldots,N;\;k\ne l);$$

$\int_{\mathbf{a}_{r(k,l)}}^{\mathbf{b}_{r(k,l)}}(\cdot)\,d\mathbf{x}_{(k,l)}$ is the (N − 2)-mode multiple integral for x omitting the variables x_k and x_l; and

$$\sum_{(L_k,L_l)=0}^{1}(\cdot) = \sum_{L_1,\ldots,L_{k-1},L_{k+1},\ldots,L_{l-1},L_{l+1},\ldots,L_N=0}^{1}(\cdot)$$

when k < l.


Lemma 8.5

$$
\begin{aligned}
&E\{(\mathbf{Y}_1^*)_{i_1}(\mathbf{Y}_1^*)_{i_2}(\mathbf{Y}_1^*)_{i_3}\}\\
&= a^{-1}\Bigg[\sum_{k=1}^{N}\sum_{r\in R}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{d,i_1k}\sigma_{d,i_2k}\sigma_{d,i_3k}\,(c_{kr}^{(L_k)2}\psi_{kk}^{-2}-\psi_{kk}^{-1})f_{(1)}^{(r)}(c_{kr}^{(L_k)})\\
&\quad+\sum_{k,l=1;\,k\ne l}^{N}\sum_{r\in R}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\big(\sigma_{d,i_1k}\sigma_{d,i_2/k}\sigma_{d,i_3l|k}\,c_{kr}^{(L_k)}+\sigma_{d,i_1k}\sigma_{d,i_2l|k}\,\boldsymbol{\sigma}_{d,i_3/kl}^{\mathrm T}\mathbf{c}_{kl,r}^{(L)}\big)f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\\
&\quad+\sum_{\substack{k,l,m=1\\ k,l,m:\,\ne}}^{N}\sum_{r\in R}\sum_{L_k,L_l,L_m=0}^{1}(-1)^{L_k+L_l+L_m+1}\sigma_{d,i_1k}\sigma_{d,i_2l|k}\sigma_{d,i_3m|kl}\,f_{(3)}^{(r)}(\mathbf{c}_{klm,r}^{(L)})\Bigg] \quad (i_1,i_2,i_3=1,\ldots,p),
\end{aligned}
$$

where undefined notations are defined similarly as before.

Lemma 8.6

$$
\begin{aligned}
&E\{(\mathbf{Y}_1^*)_{i_1}(\mathbf{Y}_1^*)_{i_2}(\mathbf{Y}_1^*)_{i_3}(\mathbf{Y}_1^*)_{i_4}\}\\
&= a^{-1}\Bigg[\sum_{k=1}^{N}\sum_{r\in R}\sum_{L_k=0}^{1}(-1)^{L_k+1}\sigma_{d,i_1k}\sigma_{d,i_2k}\sigma_{d,i_3k}\sigma_{d,i_4k}\,(c_{kr}^{(L_k)3}\psi_{kk}^{-3}-3c_{kr}^{(L_k)}\psi_{kk}^{-2})f_{(1)}^{(r)}(c_{kr}^{(L_k)})\\
&\quad+\sum_{k,l=1;\,k\ne l}^{N}\sum_{r\in R}\sum_{L_k,L_l=0}^{1}(-1)^{L_k+L_l}\Big\{\sigma_{d,i_1k}\sigma_{d,i_2k}\sigma_{d,i_3k}\sigma_{d,i_4l|k}\,(c_{kr}^{(L_k)2}\psi_{kk}^{-2}-\psi_{kk}^{-1})\\
&\qquad-\sigma_{d,i_1k}\sigma_{d,i_2/k}\sigma_{d,i_3l|k}\sigma_{d,i_4k}-\sigma_{d,i_1k}\sigma_{d,i_2l|k}\boldsymbol{\sigma}_{d,i_3/kl}^{\mathrm T}\boldsymbol{\sigma}_{d,i_4,kl}\\
&\qquad+\big(\sigma_{d,i_1k}\sigma_{d,i_2/k}\sigma_{d,i_3l|k}\,c_{kr}^{(L_k)}+\sigma_{d,i_1k}\sigma_{d,i_2l|k}\boldsymbol{\sigma}_{d,i_3/kl}^{\mathrm T}\mathbf{c}_{kl,r}^{(L)}\big)\boldsymbol{\sigma}_{d,i_4/kl}^{\mathrm T}\mathbf{c}_{kl,r}^{(L)}\Big\}f_{(2)}^{(r)}(\mathbf{c}_{kl,r}^{(L)})\\
&\quad+\sum_{\substack{k,l,m=1\\ k,l,m:\,\ne}}^{N}\sum_{r\in R}\sum_{L_k,L_l,L_m=0}^{1}(-1)^{L_k+L_l+L_m+1}\Big\{\big(\sigma_{d,i_1k}\sigma_{d,i_2/k}\sigma_{d,i_3l|k}\,c_{kr}^{(L_k)}+\sigma_{d,i_1k}\sigma_{d,i_2l|k}\boldsymbol{\sigma}_{d,i_3/kl}^{\mathrm T}\mathbf{c}_{kl,r}^{(L)}\big)\sigma_{d,i_4m|kl}\\
&\qquad+\sigma_{d,i_1k}\sigma_{d,i_2l|k}\sigma_{d,i_3m|kl}\boldsymbol{\sigma}_{d,i_4/klm}^{\mathrm T}\mathbf{c}_{klm,r}^{(L)}\Big\}f_{(3)}^{(r)}(\mathbf{c}_{klm,r}^{(L)})\\
&\quad+\sum_{\substack{k,l,m,n=1\\ k,l,m,n:\,\ne}}^{N}\sum_{r\in R}\sum_{L_k,L_l,L_m,L_n=0}^{1}(-1)^{L_k+L_l+L_m+L_n}\sigma_{d,i_1k}\sigma_{d,i_2l|k}\sigma_{d,i_3m|kl}\sigma_{d,i_4n|klm}\,f_{(4)}^{(r)}(\mathbf{c}_{klmn,r}^{(L)})\Bigg]\\
&\hspace{9cm}(i_1,i_2,i_3,i_4=1,\ldots,p),
\end{aligned}
$$

where undefined notations are defined similarly as before.


In Lemmas 8.5 and 8.6, asymmetric expressions for the associated variables are used. The above lemmas can be used to obtain the moments and cumulants up to the fourth order, as in Theorems 1.4 and 1.5.

8.4.3 The Case of Sectionally Truncated SN with p = q = 1

Among the cases of the TPN, when p = q = 1 with X = (Y, Z)ᵀ, ψ = 0, Σ = σ² = 1, Δ = −λ, D = δ = 1, Ω = ω = 1 + λ²,

$$\Psi = \begin{pmatrix}\Sigma & \Sigma\Delta^{\mathrm T}\\ \Delta\Sigma & \Omega\end{pmatrix} = \begin{pmatrix}1 & -\lambda\\ -\lambda & 1+\lambda^2\end{pmatrix},\qquad \Psi^{-1} = \begin{pmatrix}1+\lambda^2 & \lambda\\ \lambda & 1\end{pmatrix},\qquad |\Psi| = 1,$$

A_Y = a_Yᵀ, B_Y = b_Yᵀ, A_Z = a_Z = −∞ and B_Z = b_Z = 0, we have the sectionally truncated skew-normal (TSN).

Remark 8.7 The pdf of the TSN is proportional to that of the SN, which is given by definition as

$$\frac{\phi(y)\Phi(\lambda y)}{\Pr(\mathbf{X}\in S\,|\,\mathbf{0},\Psi)} = \frac{\phi(y)\Phi(\lambda y)}{\sum_{r=1}^{R_Y}\int_{a_{Yr}}^{b_{Yr}}\phi(y)\int_{-\infty}^{0}\phi(z\,|\,-\lambda y,1)\,dz\,dy} = \frac{\phi(y)\Phi(\lambda y)}{\int_{\mathbf{a}_Y^{\mathrm T}}^{\mathbf{b}_Y^{\mathrm T}}\phi(y)\int_{-\infty}^{0}\phi(z\,|\,-\lambda y,1)\,dz\,dy} = a^{-1}\phi(y)\Phi(\lambda y),$$

and the mgf is given as a special case of Theorem 8.1:

$$M_Y(t) = a^{-1}\Pr\{(Y+t,\,Z-\lambda t)^{\mathrm T}\in S\,|\,\mathbf{0},\Psi\}\exp(t^2/2).$$

Remark 8.8 While moments can be obtained by using the pdf, the expressions are generally given as integral expressions, since the normalizer involves A_Z = a_Z = −∞ and B_Z = b_Z = 0.

For actual computation, numerical integration or series expressions (recall Theorem 8.5) are required. For expository purposes, moments are obtained here using the mgf of a pseudo variable. Let Y = Y_1* + Y_0, where Y_1* is the pseudo random variable with the pseudo mgf $M_{Y_1^*}(t) = a^{-1}\Pr\{(Y+t,\,Z-\lambda t)^{\mathrm T}\in S\,|\,\mathbf{0},\Psi\}$ used earlier and Y_0 ~ N(0, 1) is independent of Y_1*. Then,


$$
\begin{aligned}
E(Y_1^*) &= \frac{d}{dt}\,a^{-1}\Pr\{(Y+t,\,Z-\lambda t)^{\mathrm T}\in S\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\\
&= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}\frac{d}{dt}\,\phi_2\{(y-t,\,z+\lambda t)^{\mathrm T}\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\,dz\,dy\\
&= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^{\mathrm T}\,\phi_2\{(y-t,\,z+\lambda t)^{\mathrm T}\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\,dz\,dy.
\end{aligned}
$$

In the above result, using

$$(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^{\mathrm T}\Big|_{t=0} = (1,-\lambda)\begin{pmatrix}1+\lambda^2 & \lambda\\ \lambda & 1\end{pmatrix}(y,z)^{\mathrm T} = y,$$

we have

$$
\begin{aligned}
E(Y_1^*) &= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0} y\,\phi_2\{(y,z)^{\mathrm T}\,|\,\mathbf{0},\Psi\}\,dz\,dy = a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}} y\,\phi(y)\int_{-\infty}^{0}\phi(z\,|\,-\lambda y,1)\,dz\,dy\\
&= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}} y\,\phi(y)\Phi(\lambda y)\,dy = E(Y),
\end{aligned}
$$

which is expected since E(Y) = E(Y_1*) + E(Y_0) = E(Y_1*).

$$
\begin{aligned}
E(Y_1^{*2}) &= \frac{d^2}{dt^2}\,a^{-1}\Pr\{(Y+t,\,Z-\lambda t)^{\mathrm T}\in S\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\\
&= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}\frac{d}{dt}\Big[(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^{\mathrm T}\,\phi_2\{(y-t,\,z+\lambda t)^{\mathrm T}\,|\,\mathbf{0},\Psi\}\Big]\Big|_{t=0}\,dz\,dy\\
&= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}\Big[-(1,-\lambda)\Psi^{-1}(1,-\lambda)^{\mathrm T}+\{(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^{\mathrm T}\}^2\Big]\\
&\qquad\times\phi_2\{(y-t,\,z+\lambda t)^{\mathrm T}\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\,dz\,dy\\
&= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}(-1+y^2)\,\phi_2\{(y,z)^{\mathrm T}\,|\,\mathbf{0},\Psi\}\,dz\,dy = -1 + E(Y^2),
\end{aligned}
$$

where $(1,-\lambda)\Psi^{-1}(1,-\lambda)^{\mathrm T} = (1,-\lambda)\begin{pmatrix}1+\lambda^2 & \lambda\\ \lambda & 1\end{pmatrix}(1,-\lambda)^{\mathrm T} = 1$ is used. Then var(Y_1*) = −1 + E(Y²) − E²(Y_1*) = −1 + E(Y²) − E²(Y) = −1 + var(Y), which is expected since var(Y_1*) = var(Y) − var(Y_0) = var(Y) − 1, where

$$\mathrm{var}(Y) = a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}} y^2\phi(y)\Phi(\lambda y)\,dy - \left\{a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}} y\,\phi(y)\Phi(\lambda y)\,dy\right\}^2.$$

Remark 8.9 We obtain the third and fourth moments/cumulants using the mgf.

$$
\begin{aligned}
E(Y_1^{*3}) &= \frac{d^3}{dt^3}\,a^{-1}\Pr\{(Y+t,\,Z-\lambda t)^{\mathrm T}\in S\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\\
&= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}\frac{d}{dt}\Big[-(1,-\lambda)\Psi^{-1}(1,-\lambda)^{\mathrm T}+\{(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^{\mathrm T}\}^2\Big]\\
&\qquad\times\phi_2\{(y-t,\,z+\lambda t)^{\mathrm T}\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\,dz\,dy\\
&= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}\Big[-3(1,-\lambda)\Psi^{-1}(1,-\lambda)^{\mathrm T}(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^{\mathrm T}\\
&\qquad+\{(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^{\mathrm T}\}^3\Big]\phi_2\{(y-t,\,z+\lambda t)^{\mathrm T}\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\,dz\,dy\\
&= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}(-3y+y^3)\,\phi_2\{(y,z)^{\mathrm T}\,|\,\mathbf{0},\Psi\}\,dz\,dy,
\end{aligned}
$$

which gives

$$
\begin{aligned}
\kappa_3(Y_1^*) &= E[\{Y_1^*-E(Y_1^*)\}^3] = E(Y_1^{*3}) - 3E(Y_1^{*2})E(Y_1^*) + 2E^3(Y_1^*)\\
&= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}(-3y+y^3)\,\phi_2\{(y,z)^{\mathrm T}\,|\,\mathbf{0},\Psi\}\,dz\,dy - 3\{-1+E(Y^2)\}E(Y) + 2E^3(Y)\\
&= E(Y^3) - 3E(Y^2)E(Y) + 2E^3(Y) = E[\{Y-E(Y)\}^3] = \kappa_3(Y),
\end{aligned}
$$

as is expected, since Y_0 ~ N(0, 1) in Y = Y_1* + Y_0 does not contribute to the cumulants of Y beyond the second order. The fourth moments/cumulants are also given via the mgf for confirmation.

$$
\begin{aligned}
E(Y_1^{*4}) &= \frac{d^4}{dt^4}\,a^{-1}\Pr\{(Y+t,\,Z-\lambda t)^{\mathrm T}\in S\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\\
&= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}\frac{d}{dt}\Big[-3(1,-\lambda)\Psi^{-1}(1,-\lambda)^{\mathrm T}(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^{\mathrm T}\\
&\qquad+\{(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^{\mathrm T}\}^3\Big]\phi_2\{(y-t,\,z+\lambda t)^{\mathrm T}\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\,dz\,dy\\
&= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}\Big[3\{(1,-\lambda)\Psi^{-1}(1,-\lambda)^{\mathrm T}\}^2\\
&\qquad-6(1,-\lambda)\Psi^{-1}(1,-\lambda)^{\mathrm T}\{(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^{\mathrm T}\}^2\\
&\qquad+\{(1,-\lambda)\Psi^{-1}(y-t,\,z+\lambda t)^{\mathrm T}\}^4\Big]\phi_2\{(y-t,\,z+\lambda t)^{\mathrm T}\,|\,\mathbf{0},\Psi\}\Big|_{t=0}\,dz\,dy\\
&= a^{-1}\sum_{r_Y=1}^{R_Y}\int_{a_{r_Y}}^{b_{r_Y}}\int_{-\infty}^{0}(3-6y^2+y^4)\,\phi_2\{(y,z)^{\mathrm T}\,|\,\mathbf{0},\Psi\}\,dz\,dy = 3 - 6E(Y^2) + E(Y^4).
\end{aligned}
$$

Then, we have


j4 ðY1 Þ ¼ E[f Y1  EðY1 Þg4   3var2 ðY1 Þ ¼ E(Y1 4 Þ  4E(Y1 3 ÞEðY1 Þ þ 6E(Y1 2 ÞE2 ðY1 Þ  3E4 ðY1 Þ ¼ 3  6EðY 2 Þ þ EðY 4 Þ  4f3EðYÞ þ EðY 3 ÞgEðYÞ þ 6f1 þ EðY 2 ÞgE2 ðYÞ  3E4 ðYÞ  3fEðY 2 Þ  E2 ðYÞ  1g2 ¼ EðY 4 Þ  4EðY 3 ÞEðYÞ þ 6EðY 2 ÞE2 ðYÞ  3E4 ðYÞ  3fEðY 2 Þ  E2 ðYÞg2 ¼ E[f Y  EðYÞg4   3var2 ðYÞ ¼ j4 ðYÞ; which is expected as in the third cumulant. The results of Remarks 8.8 and 8.9 partially support the mgfs of Theorem 8.1 and Remark 8.7. Remark 8.10 In this subsection, the TSN has been dealt with considering the familiarity of the SN as well as simplicity. However, when we use intervals for selection other than AZ ¼ aZ ¼ 1, BZ ¼ bZ ¼ 0 and RZ ¼ 1, the corresponding results are similarly obtained. That is, using 1  RZ vectors AZ ¼ aTZ and BZ ¼ bTZ R0 with arbitrary positive integer RZ , the results are given when 1 ðÞdz in Remarks R bT Rb P 8.7–8.9 is replaced by RrZZ¼1 arr Z ðÞdz  aTZ ðÞdz. Note also that in Remarks 8.7– Z

Z

8.9, a form of the mgf of Theorem 8.1 using Z as well as Y is used for illustration. However, the form can be simplified. In the case of the TSN, the pdf shown in Remark 8.6 becomes /ðyÞUðkyÞ /ðyÞUðkyÞ ¼ R bT R 0 PrðX 2 Sj 0; WÞ Y /ðyÞ/ðzj  ky; 1Þdz dy aT 1 Y

/ðyÞUðkyÞ ¼ R bT ¼ a1 /ðyÞUðkyÞ: Y /ðyÞUðkyÞdy aT Y

Consequently, the mgf is simplified as: ZbY

T

MY ðtÞ ¼ EfexpðtYÞg ¼ a1

expðtyÞ/ðyÞUðkyÞdy aTY

ZbY

T

¼ a1

/ðy  tÞUðkyÞdy expðt2 =2Þ; aTY

where MY1 ðtÞ ¼ a

1

T b RY

/ðy  tÞUðkyÞdy and MY0 ðtÞ ¼ expðt2 =2Þ. It is confirmed

aTY

that these simplified forms give the same results as shown in Remarks 8.8 and 8.9.
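The simplification rests on the pointwise identity $\exp(ty)\phi(y) = \phi(y-t)\exp(t^2/2)$, so the direct and factored forms of the mgf must agree. A minimal numerical sketch (the interval $[a_Y, b_Y] = [-1, 2]$, skewness $k = 1.5$ and $t = 0.7$ are illustrative choices, not from the text):

```python
# Check that the direct TSN mgf equals the factored form M_{Y1}(t) * exp(t^2/2).
import math

def phi(x):   # standard normal pdf
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):   # standard normal cdf via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def simpson(f, lo, hi, n=2000):   # composite Simpson rule (n even)
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += f(lo + i * h) * (4 if i % 2 else 2)
    return s * h / 3

aY, bY, k, t = -1.0, 2.0, 1.5, 0.7          # illustrative values
a = simpson(lambda y: phi(y) * Phi(k * y), aY, bY)   # normalizer
mgf_direct = simpson(lambda y: math.exp(t * y) * phi(y) * Phi(k * y), aY, bY) / a
mgf_factored = simpson(lambda y: phi(y - t) * Phi(k * y), aY, bY) / a * math.exp(t * t / 2)
print(mgf_direct, mgf_factored)
```

Since the integrands are pointwise equal, the two quadratures agree to rounding error.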


8.5 The Truncated Normal-Normal Distribution

In Chap. 6, the normal-normal (NN) distribution was introduced, which can be used as an approximation to the corresponding PN distribution. In this section, the truncated counterpart of the NN is obtained, which can be seen as an approximation to the TPN. Recall that the pdf of the NN vector is

\[
\begin{aligned}
f_{Y,{\rm NN}}({\bf y}) &= f_{p,q,R}({\bf y}\mid \boldsymbol{\mu}, \Sigma, \Delta, \boldsymbol{\eta}, \Lambda, \Gamma)
= \frac{\phi_p({\bf y}\mid \boldsymbol{\mu}, \Sigma)\sum_{r=1}^{R}\phi_q\{\boldsymbol{\gamma}_r \mid \boldsymbol{\eta} + \Delta({\bf y}-\boldsymbol{\mu}),\, \Lambda\}}{\sum_{r=1}^{R}\phi_q(\boldsymbol{\gamma}_r\mid \boldsymbol{\eta},\, \Omega)} \\
&= \frac{\sum_{r=1}^{R}\phi_q(\boldsymbol{\gamma}_r\mid \boldsymbol{\eta}, \Omega)\,\phi_p\{{\bf y}\mid \boldsymbol{\mu} + \Sigma_*\Delta^{\rm T}\Lambda^{-1}(\boldsymbol{\gamma}_r - \boldsymbol{\eta}),\, \Sigma_*\}}{\sum_{r=1}^{R}\phi_q(\boldsymbol{\gamma}_r\mid \boldsymbol{\eta}, \Omega)}
\end{aligned}
\]

(Definition 6.1, Remark 6.5), where the last expression is a mean mixture of normal densities with the untruncated support $S = \mathbb{R}^p$. Suppose that the NN-distributed $Y$ is sectionally truncated with the support $S = \bigcup_{r=1}^{R}\{{\bf a}_r \le Y < {\bf b}_r\}$. Then, we have the following set of distributions.

Definition 8.2 The truncated normal-normal (TNN) family of distributions is defined when the $p\times 1$ random vector $Y$ takes the following pdf at ${\bf y}$:

\[
\begin{aligned}
f_{Y,{\rm TNN}}({\bf y}) &= f_{p,q,R}({\bf y}\mid \boldsymbol{\mu}, \Sigma, \Delta, \boldsymbol{\eta}, \Lambda, \Gamma, A, B) \\
&= \frac{\phi_p({\bf y}\mid \boldsymbol{\mu},\Sigma)\sum_{r=1}^{R}\phi_q\{\boldsymbol{\gamma}_r\mid \boldsymbol{\eta}+\Delta({\bf y}-\boldsymbol{\mu}),\,\Lambda\}}{\sum_{r=1}^{R}\phi_q(\boldsymbol{\gamma}_r\mid\boldsymbol{\eta},\Omega)\Pr\{Y\in S\mid \boldsymbol{\mu}+\Sigma_*\Delta^{\rm T}\Lambda^{-1}(\boldsymbol{\gamma}_r-\boldsymbol{\eta}),\,\Sigma_*\}} \\
&= \frac{\sum_{r=1}^{R}\phi_q(\boldsymbol{\gamma}_r\mid\boldsymbol{\eta},\Omega)\,\phi_p\{{\bf y}\mid \boldsymbol{\mu}+\Sigma_*\Delta^{\rm T}\Lambda^{-1}(\boldsymbol{\gamma}_r-\boldsymbol{\eta}),\,\Sigma_*\}}{\sum_{r=1}^{R}\phi_q(\boldsymbol{\gamma}_r\mid\boldsymbol{\eta},\Omega)\Pr\{Y\in S\mid \boldsymbol{\mu}+\Sigma_*\Delta^{\rm T}\Lambda^{-1}(\boldsymbol{\gamma}_r-\boldsymbol{\eta}),\,\Sigma_*\}} \\
&= a^{-1}\sum_{r=1}^{R}\phi_q(\boldsymbol{\gamma}_r\mid\boldsymbol{\eta},\Omega)\,\phi_p\{{\bf y}\mid \boldsymbol{\mu}+\Sigma_*\Delta^{\rm T}\Lambda^{-1}(\boldsymbol{\gamma}_r-\boldsymbol{\eta}),\,\Sigma_*\},
\end{aligned}
\]

where $A = ({\bf a}_1,\ldots,{\bf a}_R)$ and $B = ({\bf b}_1,\ldots,{\bf b}_R)$. This distribution is denoted by $Y \sim {\rm TNN}_{p,q,R}(\boldsymbol{\mu},\Sigma,\Delta,\boldsymbol{\eta},\Lambda,\Gamma,A,B)$.

Proof of the normalizer. The equality of the two expressions of the pdf is given as in the NN (see Remark 6.5). The added factor in each term of the denominator of the pdfs is given by definition. Q.E.D.


Note that the normalizer is

\[
\begin{aligned}
a &= \sum_{r=1}^{R}\phi_q(\boldsymbol{\gamma}_r\mid\boldsymbol{\eta},\Omega)\Pr\{Y\in S\mid \boldsymbol{\mu}+\Sigma_*\Delta^{\rm T}\Lambda^{-1}(\boldsymbol{\gamma}_r-\boldsymbol{\eta}),\,\Sigma_*\} \\
&= \sum_{r=1}^{R}\phi_q(\boldsymbol{\gamma}_r\mid\boldsymbol{\eta},\Omega)\int_{{\bf a}_r}^{{\bf b}_r}\phi_p\{{\bf y}\mid \boldsymbol{\mu}+\Sigma_*\Delta^{\rm T}\Lambda^{-1}(\boldsymbol{\gamma}_r-\boldsymbol{\eta}),\,\Sigma_*\}\,d{\bf y}.
\end{aligned}
\]
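The normalizer identity can be checked numerically in the simplest case $p = q = 1$ with $R = 2$ sections. The sketch below uses illustrative parameter values (not from the text); `delta` and `lam` play the roles of $\Delta$ and $\Lambda$, and the product-of-normals factorization gives $\Omega$ and $\Sigma_*$ in closed form:

```python
# 1-D check: the integral of phi(y|mu,sig2)*phi(gamma_r|eta+delta*(y-mu),lam) over
# each section equals phi(gamma_r|eta,omega2) * Pr{Y in section | m_r, s_star}.
import math

def npdf(x, m, v):   # normal pdf with mean m and variance v
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

def ncdf(x, m, v):   # normal cdf with mean m and variance v
    return 0.5 * (1 + math.erf((x - m) / math.sqrt(2 * v)))

def simpson(f, lo, hi, n=4000):
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += f(lo + i * h) * (4 if i % 2 else 2)
    return s * h / 3

mu, sig2, delta, eta, lam = 0.3, 1.2, 0.8, -0.2, 0.9   # illustrative
gammas = [0.5, -1.0]                    # gamma_1, gamma_2
bounds = [(-2.0, 0.0), (0.5, 2.5)]      # the sections (a_r, b_r) of S
omega2 = lam + delta ** 2 * sig2        # Omega = Lambda + Delta Sigma Delta^T (1-D)
s_star = sig2 * lam / omega2            # Sigma_* = (Sigma^-1 + Delta^T Lambda^-1 Delta)^-1

lhs = rhs = 0.0
for g in gammas:
    m_r = mu + s_star * delta * (g - eta) / lam   # mu + Sigma_* Delta^T Lambda^-1 (gamma_r - eta)
    for (a_r, b_r) in bounds:
        lhs += simpson(lambda y, g=g: npdf(y, mu, sig2) * npdf(g, eta + delta * (y - mu), lam),
                       a_r, b_r)
        rhs += npdf(g, eta, omega2) * (ncdf(b_r, m_r, s_star) - ncdf(a_r, m_r, s_star))
print(lhs, rhs)
```

Both sums equal the normalizer $a$, confirming the two expressions term by term.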

Theorem 8.6 The mgf of $Y \sim {\rm TNN}_{p,q,R}(\boldsymbol{\mu},\Sigma,\Delta,\boldsymbol{\eta},\Lambda,\Gamma,A,B)$ is

\[
M_Y({\bf t}) = a^{-1}\sum_{r=1}^{R}\phi_q(\boldsymbol{\gamma}_r\mid\boldsymbol{\eta}+\Delta\Sigma{\bf t},\,\Omega)
\Pr\{Y\in S\mid \boldsymbol{\mu}+\Sigma{\bf t}+\Sigma_*\Delta^{\rm T}\Lambda^{-1}(\boldsymbol{\gamma}_r-\boldsymbol{\eta}-\Delta\Sigma{\bf t}),\,\Sigma_*\}
\times \exp\!\left(\boldsymbol{\mu}^{\rm T}{\bf t}+\frac{1}{2}{\bf t}^{\rm T}\Sigma{\bf t}\right).
\]

Proof

\[
\begin{aligned}
M_Y({\bf t}) &= a^{-1}\sum_{r=1}^{R}\int_{{\bf a}_r}^{{\bf b}_r}\exp({\bf t}^{\rm T}{\bf y})\,\phi_p({\bf y}\mid\boldsymbol{\mu},\Sigma)\,\phi_q\{\boldsymbol{\gamma}_r\mid\boldsymbol{\eta}+\Delta({\bf y}-\boldsymbol{\mu}),\,\Lambda\}\,d{\bf y} \\
&= a^{-1}\sum_{r=1}^{R}\phi_q(\boldsymbol{\gamma}_r\mid\boldsymbol{\eta}+\Delta\Sigma{\bf t},\Omega)\Pr\{Y\in S\mid \boldsymbol{\mu}+\Sigma{\bf t}+\Sigma_*\Delta^{\rm T}\Lambda^{-1}(\boldsymbol{\gamma}_r-\boldsymbol{\eta}-\Delta\Sigma{\bf t}),\,\Sigma_*\}
\exp\!\left(\boldsymbol{\mu}^{\rm T}{\bf t}+\frac{1}{2}{\bf t}^{\rm T}\Sigma{\bf t}\right) \\
&\quad\times\frac{\sum_{r=1}^{R}\int_{{\bf a}_r}^{{\bf b}_r}\phi_p({\bf y}\mid\boldsymbol{\mu}+\Sigma{\bf t},\,\Sigma)\,\phi_q\{\boldsymbol{\gamma}_r\mid\boldsymbol{\eta}+\Delta\Sigma{\bf t}+\Delta({\bf y}-\boldsymbol{\mu}-\Sigma{\bf t}),\,\Lambda\}\,d{\bf y}}
{\sum_{r=1}^{R}\phi_q(\boldsymbol{\gamma}_r\mid\boldsymbol{\eta}+\Delta\Sigma{\bf t},\Omega)\Pr\{Y\in S\mid \boldsymbol{\mu}+\Sigma{\bf t}+\Sigma_*\Delta^{\rm T}\Lambda^{-1}(\boldsymbol{\gamma}_r-\boldsymbol{\eta}-\Delta\Sigma{\bf t}),\,\Sigma_*\}} \\
&= a^{-1}\sum_{r=1}^{R}\phi_q(\boldsymbol{\gamma}_r\mid\boldsymbol{\eta}+\Delta\Sigma{\bf t},\Omega)\Pr\{Y\in S\mid \boldsymbol{\mu}+\Sigma{\bf t}+\Sigma_*\Delta^{\rm T}\Lambda^{-1}(\boldsymbol{\gamma}_r-\boldsymbol{\eta}-\Delta\Sigma{\bf t}),\,\Sigma_*\}
\exp\!\left(\boldsymbol{\mu}^{\rm T}{\bf t}+\frac{1}{2}{\bf t}^{\rm T}\Sigma{\bf t}\right).
\end{aligned}
\]

Q.E.D.

As for the pdf, the mgf includes cdfs, giving less tractable results than those of the NN.



Chapter 9
The Student t- and Pseudo t- (PT) Distributions: Various Expressions of Mixtures

9.1 Introduction

So far, the pseudo-normal distributions and associated results have been dealt with. This family of distributions gives a variety of pdfs based on various types of truncation, including single, double and inner truncation as special cases. Non-normal versions of the pseudo distributions can also be considered similarly, which further extends the variety of distributions. Recall that the pdf of the pseudo-normally distributed vector $Y \sim {\rm PN}_{p,q,R}(\boldsymbol{\mu},\Sigma,\Delta,\boldsymbol{\eta},\Lambda,A,B)$ is

\[
f_{p,q,R}({\bf y}\mid\boldsymbol{\mu},\Sigma,\Delta,\boldsymbol{\eta},\Lambda,A,B)
= \frac{\phi_p({\bf y}\mid\boldsymbol{\mu},\Sigma)}{\Pr(Z\in S\mid\boldsymbol{\eta},\Omega)}\Pr\{Z\in S\mid \boldsymbol{\eta}+\Delta({\bf y}-\boldsymbol{\mu}),\,\Lambda\},
\]

where the definitions of the notations are given in Definition 4.1. We can replace $\phi_p({\bf y}\mid\boldsymbol{\mu},\Sigma)$ by a non-normal version, generically denoted by $\psi_p({\bf y}\mid\boldsymbol{\mu},\boldsymbol{\theta}_Y)$, where $\boldsymbol{\theta}_Y$ is a vector of parameters for the random vector $Y$ other than the location parameter vector $\boldsymbol{\mu}$. Similarly, $\Pr\{Z\in S\mid\boldsymbol{\eta}+\Delta({\bf y}-\boldsymbol{\mu}),\Lambda\}$ and $\Pr(Z\in S\mid\boldsymbol{\eta},\Omega)$ can be replaced by non-normal counterparts denoted by $\Pr\{Z\in S\mid\boldsymbol{\eta}+\Delta({\bf y}-\boldsymbol{\mu}),\boldsymbol{\theta}_Z\}$ and $\Pr(Z\in S\mid\boldsymbol{\omega})$, respectively, where $\boldsymbol{\theta}_Z$ is defined similarly to $\boldsymbol{\theta}_Y$, and $\boldsymbol{\omega}$ is the vector of all the parameters, including $\boldsymbol{\mu}$, $\boldsymbol{\theta}_Y$, $\Delta$, $\boldsymbol{\eta}$ and $\boldsymbol{\theta}_Z$, though $\Pr(Z\in S\mid\boldsymbol{\omega})$ may not depend on some of the elements of $\boldsymbol{\omega}$, as in the normal case, which does not depend on $\boldsymbol{\mu}$. This reformulation gives the family of pseudo normal/non-normal distributions:

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 H. Ogasawara, Expository Moments for Pseudo Distributions, Behaviormetrics: Quantitative Approaches to Human Behavior 2, https://doi.org/10.1007/978-981-19-3525-1_9


\[
g_{p,q,R}({\bf y}\mid\boldsymbol{\mu},\boldsymbol{\theta}_Y,\Delta,\boldsymbol{\eta},\boldsymbol{\theta}_Z,A,B)
= \frac{\psi_p({\bf y}\mid\boldsymbol{\mu},\boldsymbol{\theta}_Y)}{\Pr(Z\in S\mid\boldsymbol{\omega})}\Pr\{Z\in S\mid\boldsymbol{\eta}+\Delta({\bf y}-\boldsymbol{\mu}),\,\boldsymbol{\theta}_Z\}.
\]

Note that the type of the non-normal distribution of $Y$ before truncation, i.e., $\psi_p({\bf y}\mid\boldsymbol{\mu},\boldsymbol{\theta}_Y)$, can be different from that for $Z$. When the truncation for $Z$ is restricted to single truncation, possibly non-normal skew distributions, known as the skew-symmetric family (see Gupta, Chang and Huang [11]; Genton [8]), have been used by Nadarajah and Kotz [21], Gupta [9], Gupta and Chang [10], Azzalini and Capitanio [1], Nadarajah and Ali [19], Nadarajah and Gupta [20], Fung and Seneta [6], Kollo and Pettere [16], Joe and Li [12], Lachos, Garay and Cabral [18], Kollo, Käärik and Selart [15] and Galarza, Matos, Castro and Lachos [7], among others. In these papers, one of the well-investigated skew non-normal distributions is the skew t-distribution with its multivariate version. In this chapter, various expressions of the mixtures for the pdfs of the t-distribution and its multivariate version are also presented as preliminaries.

9.2 The t-Distribution

The pdf of the Student t-distribution with $\nu$ degrees of freedom (df), using location and scale parameters $\mu$ and $\sigma$, respectively, is given by

\[
g(z\mid\mu,\sigma,\nu)
= \frac{\Gamma\{(\nu+1)/2\}}{\Gamma(\nu/2)\sqrt{\pi}\sqrt{\nu}\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2}
= \frac{1}{B(\nu/2,\,1/2)\sqrt{\nu}\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2}
\]
\[
(\nu>0,\ -\infty<\mu<\infty,\ \sigma>0),
\]

where $\nu$ is typically a positive integer but can be positive real-valued, and $B(\nu/2, 1/2)$ is the beta function. When $\mu=0$ and $\sigma=1$, we have the standard t-distribution. It is known that when $X\sim N(0,1)$ and $Y$ is chi-square distributed with $\nu$ df independent of $X$, $Z = X/\sqrt{Y/\nu}$ is t-distributed with the pdf $g(z\mid 0,1,\nu)$ given above. Since the derivation of $g(z\mid 0,1,\nu)$ is omitted in many textbooks, we give it for expository purposes in four similar ways.

Lemma 9.1 The pdf of the standard t-distribution is derived from any one of the composite variables $\sqrt{\nu}X/\sqrt{Y}$, $\sqrt{\nu}\sqrt{Y}X$, $\sqrt{\nu}X/Y$ and $\sqrt{\nu}YX$ when $X\sim N(0,1)$ and $Y$ is chi-square, inverse chi-square, chi and inverse-chi distributed with $\nu$ df independent of $X$, respectively.


Proof 1 First, we derive the case of the inverse-chi distribution, whose pdf is given when $Y = 1/\sqrt{Y^*}$ and $Y^*$ is chi-square or, equivalently, Gamma$(\nu/2, 1/2)$ distributed, with $1/2$ being the rate parameter:

\[
g_{\chi^2}(Y^* = y^*\mid\nu) = g_\Gamma(Y^* = y^*\mid \nu/2,\,1/2) = \frac{y^{*(\nu/2)-1}}{2^{\nu/2}\Gamma(\nu/2)}\exp\!\left(-\frac{y^*}{2}\right).
\]

Using $dy^*/dy = -2y^{-3}$ and substituting $y^* = y^{-2}$ in the above pdf, we obtain the pdf of the inverse-chi:

\[
g_{\chi^{-1}}(Y = y\mid\nu) = \frac{y^{-\nu-1}}{2^{(\nu/2)-1}\Gamma(\nu/2)}\exp\!\left(-\frac{1}{2y^2}\right).
\]

Since $X$ and $Y$ are independent, the pdf of the joint distribution of $X$ and $Y$ is

\[
\phi(X=x)\,g_{\chi^{-1}}(Y=y\mid\nu) = \frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x^2}{2}\right)\frac{y^{-\nu-1}}{2^{(\nu/2)-1}\Gamma(\nu/2)}\exp\!\left(-\frac{1}{2y^2}\right).
\]

We employ the variable transformation from $(X, Y)$ to $(Z = \sqrt{\nu}YX,\ Y)$, where

\[
\det\begin{pmatrix} dx/dz & dx/dy \\ dy/dz & dy/dy\end{pmatrix}
= \det\begin{pmatrix} 1/(\sqrt{\nu}y) & -z/(\sqrt{\nu}y^2) \\ 0 & 1\end{pmatrix} = \frac{1}{\sqrt{\nu}y}.
\]

Then, the pdf of the joint distribution of $Z$ and $Y$, using $x = z/(\sqrt{\nu}y)$ and unchanged $y$ with the Jacobian, becomes

\[
\begin{aligned}
&\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{z^2}{2\nu y^2}\right)\frac{y^{-\nu-1}/(\sqrt{\nu}y)}{2^{(\nu/2)-1}\Gamma(\nu/2)}\exp\!\left(-\frac{1}{2y^2}\right)
= \frac{y^{-\nu-2}}{\sqrt{2\pi}\sqrt{\nu}\,2^{(\nu/2)-1}\Gamma(\nu/2)}\exp\!\left\{-\left(1+\frac{z^2}{\nu}\right)\Big/(2y^2)\right\} \\
&\quad= \frac{\Gamma\{(\nu+1)/2\}}{\sqrt{\pi}\sqrt{\nu}\,\Gamma(\nu/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2}
\frac{(1+z^2/\nu)^{-1/2}}{2^{(\nu+1)/2-1}\Gamma\{(\nu+1)/2\}}
\left\{y\Big/\left(1+\frac{z^2}{\nu}\right)^{1/2}\right\}^{-\nu-2}\exp\!\left\{-\left(1+\frac{z^2}{\nu}\right)\Big/(2y^2)\right\} \\
&\quad= \frac{1}{B(\nu/2,1/2)\sqrt{\nu}}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2}
\frac{(1+z^2/\nu)^{-1/2}}{2^{(\nu+1)/2-1}\Gamma\{(\nu+1)/2\}}
\left\{y\Big/\left(1+\frac{z^2}{\nu}\right)^{1/2}\right\}^{-\nu-2}\exp\!\left\{-\left(1+\frac{z^2}{\nu}\right)\Big/(2y^2)\right\}.
\end{aligned}
\]

The distribution of $Z$ is given by the above joint distribution when $y$ is integrated out, which is obtained by noting that

\[
\frac{1}{2^{(\nu+1)/2-1}\Gamma\{(\nu+1)/2\}}\left(1+\frac{z^2}{\nu}\right)^{-1/2}\left\{y\Big/\left(1+\frac{z^2}{\nu}\right)^{1/2}\right\}^{-\nu-2}\exp\!\left\{-\left(1+\frac{z^2}{\nu}\right)\Big/(2y^2)\right\}
\]

is the pdf of the scaled inverse-chi distribution with $\nu+1$ df and scale parameter $(1+z^2/\nu)^{1/2}$, giving the pdf of $Z$:

\[
\frac{1}{B(\nu/2,1/2)\sqrt{\nu}}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2},
\]

which is the pdf of the standard t-distribution with $\nu$ df. The remaining three cases of $\sqrt{\nu}X/\sqrt{Y}$, $\sqrt{\nu}\sqrt{Y}X$ and $\sqrt{\nu}X/Y$ are given by using variable transformations of the inverse-chi distributed variable. Q.E.D.

Proof 2 We derive the case of the inverse chi-square distribution, whose pdf is given when $Y = 1/Y^*$ and $Y^*$ is chi-square distributed. Using $dy^*/dy = -y^{-2}$ and substituting $y^* = y^{-1}$, we obtain the pdf of the inverse chi-square:

\[
g_{\chi^{-2}}(Y=y\mid\nu) = \frac{y^{-(\nu/2)-1}}{2^{\nu/2}\Gamma(\nu/2)}\exp\!\left(-\frac{1}{2y}\right).
\]

The pdf of the joint distribution of $X$ and $Y$ is

\[
\phi(X=x)\,g_{\chi^{-2}}(Y=y\mid\nu) = \frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x^2}{2}\right)\frac{y^{-(\nu/2)-1}}{2^{\nu/2}\Gamma(\nu/2)}\exp\!\left(-\frac{1}{2y}\right).
\]

We employ the variable transformation from $(X,Y)$ to $(Z = \sqrt{\nu}\sqrt{Y}X,\ Y)$, where the Jacobian is

\[
\frac{dx}{dz} = \frac{1}{\sqrt{\nu}\sqrt{y}}.
\]

Then, the pdf of the joint distribution of $Z$ and $Y$, using $x = z/(\sqrt{\nu}\sqrt{y})$ and unchanged $y$ with the Jacobian, becomes

\[
\begin{aligned}
&\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{z^2}{2\nu y}\right)\frac{y^{-(\nu/2)-1}/(\sqrt{\nu}\sqrt{y})}{2^{\nu/2}\Gamma(\nu/2)}\exp\!\left(-\frac{1}{2y}\right)
= \frac{y^{-(\nu+3)/2}}{\sqrt{2\pi}\sqrt{\nu}\,2^{\nu/2}\Gamma(\nu/2)}\exp\!\left\{-\left(1+\frac{z^2}{\nu}\right)\Big/(2y)\right\} \\
&\quad= \frac{\Gamma\{(\nu+1)/2\}}{\sqrt{\pi}\sqrt{\nu}\,\Gamma(\nu/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2}
\frac{(1+z^2/\nu)^{-1}}{2^{(\nu+1)/2}\Gamma\{(\nu+1)/2\}}\left\{y\Big/\left(1+\frac{z^2}{\nu}\right)\right\}^{-(\nu+3)/2}\exp\!\left\{-\left(1+\frac{z^2}{\nu}\right)\Big/(2y)\right\} \\
&\quad= \frac{1}{B(\nu/2,1/2)\sqrt{\nu}}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2}
\frac{(1+z^2/\nu)^{-1}}{2^{(\nu+1)/2}\Gamma\{(\nu+1)/2\}}\left\{y\Big/\left(1+\frac{z^2}{\nu}\right)\right\}^{-(\nu+3)/2}\exp\!\left\{-\left(1+\frac{z^2}{\nu}\right)\Big/(2y)\right\}.
\end{aligned}
\]

The distribution of $Z$ is given by the above joint distribution when $y$ is integrated out, which is obtained by noting that the last factor is the pdf of the scaled inverse chi-square with $\nu+1$ df constructed as above, giving the pdf of $Z$ as in Proof 1:

\[
\frac{1}{B(\nu/2,1/2)\sqrt{\nu}}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2},
\]

which is the pdf of the standard t-distribution with $\nu$ df. The remaining three cases are given by using variable transformations of the inverse chi-square distributed variable. Q.E.D.

Proof 3 Since the derivations using the pdfs of the inverse-chi or inverse chi-square may be unusual, we give the result using the chi-square. Let $Y$ be chi-square distributed with $\nu$ df, with $X$ unchanged. Then, the joint distribution of $X$ and $Y$ becomes

\[
\phi(X=x)\,g_{\chi^2}(Y=y\mid\nu) = \frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x^2}{2}\right)\frac{y^{(\nu/2)-1}}{2^{\nu/2}\Gamma(\nu/2)}\exp\!\left(-\frac{y}{2}\right).
\]

Use the variable transformation from $(X,Y)$ to $(Z=\sqrt{\nu}X/\sqrt{Y},\ Y)$, where the Jacobian is

\[
\frac{dx}{dz} = \sqrt{\frac{y}{\nu}}.
\]

Then, the pdf of the joint distribution of $Z$ and $Y$, using $x = z\sqrt{y}/\sqrt{\nu}$ and unchanged $y$ with the Jacobian, becomes

\[
\begin{aligned}
&\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{z^2 y}{2\nu}\right)\frac{y^{(\nu/2)-1}\sqrt{y/\nu}}{2^{\nu/2}\Gamma(\nu/2)}\exp\!\left(-\frac{y}{2}\right) \\
&\quad= \frac{\Gamma\{(\nu+1)/2\}}{\sqrt{\pi}\sqrt{\nu}\,\Gamma(\nu/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2}
\frac{(1+z^2/\nu)^{(\nu+1)/2}\,y^{(\nu+1)/2-1}}{2^{(\nu+1)/2}\Gamma\{(\nu+1)/2\}}\exp\!\left\{-\left(1+\frac{z^2}{\nu}\right)\frac{y}{2}\right\} \\
&\quad= \frac{1}{\sqrt{\nu}B(\nu/2,1/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2}
\frac{(1+z^2/\nu)^{(\nu+1)/2}\,y^{(\nu+1)/2-1}}{2^{(\nu+1)/2}\Gamma\{(\nu+1)/2\}}\exp\!\left\{-\left(1+\frac{z^2}{\nu}\right)\frac{y}{2}\right\},
\end{aligned}
\]

where

\[
\frac{(1+z^2/\nu)^{(\nu+1)/2}\,y^{(\nu+1)/2-1}}{2^{(\nu+1)/2}\Gamma\{(\nu+1)/2\}}\exp\!\left\{-\left(1+\frac{z^2}{\nu}\right)\frac{y}{2}\right\}
\]

is the pdf of Gamma$\left(\dfrac{\nu+1}{2},\ \dfrac{1}{2}+\dfrac{z^2}{2\nu}\right)$ or the scaled chi-square with $\nu+1$ df and scale parameter $\left(1+\dfrac{z^2}{\nu}\right)^{-1}$, which is integrated out to give the pdf of $Z$:

\[
\frac{1}{\sqrt{\nu}B(\nu/2,1/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2},
\]

giving the required result. The remaining three cases are obtained using variable transformations. Q.E.D.

Proof 4 Finally, we give the result using the chi distribution. Let $Y$ be chi distributed with $\nu$ df, with $X$ unchanged. The chi variable is given by $Y = \sqrt{Y^*}$, where $Y^*$ is the corresponding chi-square distributed variable. Then, the joint distribution of $X$ and $Y$ becomes

\[
\phi(X=x)\,g_{\chi}(Y=y\mid\nu) = \frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x^2}{2}\right)\frac{y^{\nu-1}}{2^{(\nu/2)-1}\Gamma(\nu/2)}\exp\!\left(-\frac{y^2}{2}\right).
\]

Using the variable transformation from $(X,Y)$ to $(Z=\sqrt{\nu}X/Y,\ Y)$, where the Jacobian is

\[
\frac{dx}{dz} = \frac{y}{\sqrt{\nu}},
\]

the pdf of the joint distribution of $Z$ and $Y$, using $x = zy/\sqrt{\nu}$ and unchanged $y$ with the Jacobian, becomes

\[
\begin{aligned}
&\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{z^2y^2}{2\nu}\right)\frac{y^{\nu-1}\,(y/\sqrt{\nu})}{2^{(\nu/2)-1}\Gamma(\nu/2)}\exp\!\left(-\frac{y^2}{2}\right) \\
&\quad= \frac{\Gamma\{(\nu+1)/2\}}{\sqrt{\pi}\sqrt{\nu}\,\Gamma(\nu/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2}
\frac{(1+z^2/\nu)^{(\nu+1)/2}\,y^{\nu}}{2^{(\nu+1)/2-1}\Gamma\{(\nu+1)/2\}}\exp\!\left\{-\left(1+\frac{z^2}{\nu}\right)\frac{y^2}{2}\right\} \\
&\quad= \frac{1}{\sqrt{\nu}B(\nu/2,1/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2}
\frac{(1+z^2/\nu)^{(\nu+1)/2}\,y^{\nu}}{2^{(\nu+1)/2-1}\Gamma\{(\nu+1)/2\}}\exp\!\left\{-\left(1+\frac{z^2}{\nu}\right)\frac{y^2}{2}\right\},
\end{aligned}
\]

where

\[
\frac{(1+z^2/\nu)^{(\nu+1)/2}\,y^{\nu}}{2^{(\nu+1)/2-1}\Gamma\{(\nu+1)/2\}}\exp\!\left\{-\left(1+\frac{z^2}{\nu}\right)\frac{y^2}{2}\right\}
\]

is the pdf of the scaled chi distribution with $\nu+1$ df and scale parameter $\left(1+\dfrac{z^2}{\nu}\right)^{-1/2}$, which is integrated out to give the pdf of $Z$, giving the required result. The remaining three cases are obtained using variable transformations. Q.E.D.

It is found that the derivations in Proofs 1 to 4 are comparable, with a slight advantage for Proof 3 in that the gamma distribution may be familiar to most readers. However, the formulation using the inverse-chi in Proof 1 has an advantage when we obtain the raw moments of the t-distribution, which are the products of the moments of the normal and inverse-chi distributions of the same orders due to their independence.

Lemma 9.2 (Kollo, Käärik and Selart [15, Lemma 1]) The raw moment of real-valued order $k>0$ of the inverse chi-distributed variable, denoted by $Z_\nu$ with $\nu$ df, is given by

\[
E(Z_\nu^k) = \frac{\Gamma\{(\nu-k)/2\}}{2^{k/2}\Gamma(\nu/2)},\quad \nu>k.
\]

When $k$ is a natural number,

\[
E(Z_\nu^k) = \prod_{i=0}^{k-1}E(Z_{\nu-i}),\quad \nu>k.
\]

Proof Using the pdf of the inverse-chi, we have

\[
\begin{aligned}
E(Z_\nu^k) &= \int_0^\infty z^k\,\frac{z^{-\nu-1}}{2^{(\nu/2)-1}\Gamma(\nu/2)}\exp\!\left(-\frac{1}{2z^2}\right)dz \\
&= \frac{2^{(\nu-k)/2-1}\Gamma\{(\nu-k)/2\}}{2^{(\nu/2)-1}\Gamma(\nu/2)}\int_0^\infty\frac{z^{-(\nu-k)-1}}{2^{(\nu-k)/2-1}\Gamma\{(\nu-k)/2\}}\exp\!\left(-\frac{1}{2z^2}\right)dz
= \frac{\Gamma\{(\nu-k)/2\}}{2^{k/2}\Gamma(\nu/2)},
\end{aligned}
\]

which is the first required result. The second result, with $k$ being a positive integer, is given by

\[
E(Z_\nu^k) = \frac{\Gamma\{(\nu-k)/2\}}{2^{k/2}\Gamma(\nu/2)}
= \frac{\Gamma\{(\nu-1)/2\}}{2^{1/2}\Gamma(\nu/2)}\cdot\frac{\Gamma\{(\nu-2)/2\}}{2^{1/2}\Gamma\{(\nu-1)/2\}}\cdots\frac{\Gamma\{(\nu-k)/2\}}{2^{1/2}\Gamma\{(\nu-k+1)/2\}}
= \prod_{i=0}^{k-1}E(Z_{\nu-i}).
\]

Q.E.D.
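Lemma 9.2 can be verified numerically. The sketch below (illustrative $\nu = 7.5$ and real-valued order $k = 2.3$) integrates the inverse-chi density directly and also checks the telescoping product form for the integer order $k = 3$:

```python
# Check E(Z_nu^k) = Gamma{(nu-k)/2} / {2^{k/2} Gamma(nu/2)} for the inverse-chi,
# and the product identity E(Z_nu^3) = E(Z_nu) E(Z_{nu-1}) E(Z_{nu-2}).
import math

def inv_chi_pdf(z, nu):
    # y^{-nu-1} / {2^{nu/2-1} Gamma(nu/2)} exp{-1/(2 y^2)}
    return z ** (-nu - 1) / (2 ** (nu / 2 - 1) * math.gamma(nu / 2)) * math.exp(-1 / (2 * z * z))

def moment_formula(nu, k):
    return math.gamma((nu - k) / 2) / (2 ** (k / 2) * math.gamma(nu / 2))

def simpson(f, lo, hi, n=20000):
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += f(lo + i * h) * (4 if i % 2 else 2)
    return s * h / 3

nu, k = 7.5, 2.3                     # illustrative; both real-valued, nu > k
num = simpson(lambda z: z ** k * inv_chi_pdf(z, nu), 1e-6, 60.0)
closed = moment_formula(nu, k)
prod = 1.0                           # telescoping product for integer k = 3
for i in range(3):
    prod *= moment_formula(nu - i, 1)
print(num, closed, prod, moment_formula(nu, 3))
```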

Lemma 9.2 gives

\[
\begin{aligned}
E(Z_\nu) &= \frac{\Gamma\{(\nu-1)/2\}}{2^{1/2}\Gamma(\nu/2)},\quad \nu>1; \\
E(Z_{2u+1}) &= \frac{\Gamma(u)}{2^{1/2}\Gamma\{u+(1/2)\}}
= \frac{(u-1)!}{2^{1/2}\{u-(1/2)\}\{u-(3/2)\}\cdots(1/2)\Gamma(1/2)}
= \frac{2^{1/2}(2u-2)!!}{\sqrt{\pi}\,(2u-1)!!}, \\
E(Z_{2u}) &= \frac{\Gamma\{u-(1/2)\}}{2^{1/2}\Gamma(u)}
= \frac{\{u-(3/2)\}\{u-(5/2)\}\cdots(1/2)\Gamma(1/2)}{2^{1/2}(u-1)!}
= \frac{\sqrt{\pi}\,(2u-3)!!}{2^{1/2}(2u-2)!!}\quad(u=1,2,\ldots); \\
E(Z_\nu^2) &= \frac{\Gamma\{(\nu-2)/2\}}{2\Gamma(\nu/2)} = \frac{1}{\nu-2},\quad\nu>2; \\
E(Z_\nu^3) &= \frac{\Gamma\{(\nu-3)/2\}}{2^{3/2}\Gamma(\nu/2)}
= \frac{\Gamma\{(\nu-1)/2\}}{2^{1/2}\Gamma(\nu/2)}\cdot\frac{\Gamma\{(\nu-3)/2\}}{2\Gamma\{(\nu-1)/2\}}
= \frac{E(Z_\nu)}{\nu-3},\quad\nu>3; \\
E(Z_\nu^4) &= \frac{\Gamma\{(\nu-4)/2\}}{2^2\Gamma(\nu/2)}
= \frac{\Gamma\{(\nu-2)/2\}}{2\Gamma(\nu/2)}\cdot\frac{\Gamma\{(\nu-4)/2\}}{2\Gamma\{(\nu-2)/2\}}
= \frac{1}{(\nu-2)(\nu-4)},\quad\nu>4; \\
E(Z_\nu^5) &= \frac{\Gamma\{(\nu-5)/2\}}{2^{5/2}\Gamma(\nu/2)}
= \frac{\Gamma\{(\nu-1)/2\}}{2^{1/2}\Gamma(\nu/2)}\cdot\frac{\Gamma\{(\nu-3)/2\}}{2\Gamma\{(\nu-1)/2\}}\cdot\frac{\Gamma\{(\nu-5)/2\}}{2\Gamma\{(\nu-3)/2\}}
= \frac{E(Z_\nu)}{(\nu-3)(\nu-5)},\quad\nu>5; \\
E(Z_\nu^6) &= \frac{\Gamma\{(\nu-6)/2\}}{2^3\Gamma(\nu/2)}
= \frac{\Gamma\{(\nu-2)/2\}}{2\Gamma(\nu/2)}\cdot\frac{\Gamma\{(\nu-4)/2\}}{2\Gamma\{(\nu-2)/2\}}\cdot\frac{\Gamma\{(\nu-6)/2\}}{2\Gamma\{(\nu-4)/2\}}
= \frac{1}{(\nu-2)(\nu-4)(\nu-6)},\quad\nu>6.
\end{aligned}
\]

Theorem 9.1 When $Z_\nu$ is inverse chi-distributed with $\nu$ df, we have

\[
\begin{aligned}
E(Z_\nu^{2u-1}) &= \frac{E(Z_\nu)}{(\nu-3)(\nu-5)\cdots(\nu-2u+1)},\quad \nu>2u-1, \\
E(Z_\nu^{2u}) &= \frac{1}{(\nu-2)(\nu-4)\cdots(\nu-2u)},\quad \nu>2u\quad(u=1,2,\ldots).
\end{aligned}
\]

Proof The results follow by induction with $E(Z_\nu)$ and $E(Z_\nu^2)$ obtained earlier. Q.E.D.

Note that in Theorem 9.1, $\nu$ is real-valued. In the result of Lemma 9.1, introducing the location and scale parameters corresponding to those of $N(\mu,\sigma^2)$ for the standardized t-distribution, we have the unstandardized Student t-distribution with $\nu$ df, denoted by ${\rm St}(\mu,\sigma,\nu)$, whose pdf at $Z=z$ is

\[
\frac{1}{B(\nu/2,1/2)\sqrt{\nu}\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2}.
\]

It is known that infinite mixtures of normal distributions with different variances give the above pdf of ${\rm St}(\mu,\sigma,\nu)$ (Stuart and Ord [23, Example 5.6]; Bishop [2, Eq. (2.161)]; Kirkby, Nguyen and Nguyen [13, Eq. (9)]; [14, Eq. (3.1)]), where the gamma distribution seems to be used exclusively for the weights in the mixtures. In many cases, this mixture is called a scale mixture. However, it is actually a mixture over the reciprocals of squared scales or, equivalently, the reciprocals of variances in the normal case, which are also called precisions. The next lemma shows that the same result is given by a mixture over the normal variances using the inverse gamma distribution.
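Before turning to the mixture representations, the closed forms of Theorem 9.1 can be spot-checked against the gamma-function expression of Lemma 9.2 ($\nu = 11.4$ is an arbitrary real-valued df):

```python
# Verify E(Z_nu^{2u}) = 1/{(nu-2)(nu-4)...(nu-2u)} and
# E(Z_nu^{2u-1}) = E(Z_nu)/{(nu-3)(nu-5)...(nu-2u+1)} for u = 1, 2, 3.
import math

def gamma_moment(nu, k):
    return math.gamma((nu - k) / 2) / (2 ** (k / 2) * math.gamma(nu / 2))

nu = 11.4
for u in (1, 2, 3):
    even = 1.0
    for j in range(1, u + 1):           # (nu-2)(nu-4)...(nu-2u)
        even /= (nu - 2 * j)
    odd = gamma_moment(nu, 1)           # E(Z_nu) ...
    for j in range(2, u + 1):           # ... / (nu-3)(nu-5)...(nu-2u+1)
        odd /= (nu - 2 * j + 1)
    assert abs(even - gamma_moment(nu, 2 * u)) < 1e-12
    assert abs(odd - gamma_moment(nu, 2 * u - 1)) < 1e-12
print("Theorem 9.1 verified for nu =", nu)
```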


Lemma 9.3 The pdf of ${\rm St}(\mu,\sigma,\nu)$ with $\nu>0$ being real-valued is given by (i) the mixture over the precisions of $N(\mu,\,Y^{-1}\sigma^2)$ when $Y$ follows Gamma$(\nu/2,\,\nu/2)$, with $\nu/2$ being both the shape and the rate parameter or, equivalently, the scaled chi-square with $\nu$ df and scale parameter $\nu^{-1}$; (ii) the mixture over the precisions of $N(\mu,\,Y^{-1}\nu\sigma^2)$ when $Y$ follows Gamma$(\nu/2,\,1/2)$, with $\nu/2$ and $1/2$ being the shape and rate parameters, respectively or, equivalently, the chi-square with $\nu$ df. The same pdf is given by (iii) the mixture over the variances of $N(\mu,\,Y\sigma^2)$ when $Y$ follows the inverse gamma distribution denoted by Inv-Gamma$(\nu/2,\,\nu/2)$, with $\nu/2$ being both the shape and the scale parameter or, equivalently, the scaled inverse chi-square with $\nu$ df and scale parameter $\nu$; (iv) the mixture over the variances of $N(\mu,\,Y\nu\sigma^2)$ when $Y$ follows the inverse gamma distribution denoted by Inv-Gamma$(\nu/2,\,1/2)$, with $\nu/2$ and $1/2$ being the shape and scale parameters, respectively or, equivalently, the inverse chi-square with $\nu$ df.

Proof 1 We start with (i) of the precision mixture, which is given by

\[
\begin{aligned}
&\int_0^\infty\frac{\sqrt{y}}{\sqrt{2\pi}\,\sigma}\exp\!\left\{-\frac{y(z-\mu)^2}{2\sigma^2}\right\}\frac{(\nu/2)^{\nu/2}y^{(\nu/2)-1}}{\Gamma(\nu/2)}\exp\!\left(-\frac{\nu y}{2}\right)dy \\
&\quad= \frac{\Gamma\{(\nu+1)/2\}\,(\nu/2)^{\nu/2}}{\sqrt{2\pi}\,\Gamma(\nu/2)\,\sigma}\left\{\frac{\nu}{2}+\frac{(z-\mu)^2}{2\sigma^2}\right\}^{-(\nu+1)/2}
\int_0^\infty\left\{\frac{\nu}{2}+\frac{(z-\mu)^2}{2\sigma^2}\right\}^{(\nu+1)/2}\frac{y^{(\nu+1)/2-1}}{\Gamma\{(\nu+1)/2\}}\exp\!\left[-\left\{\frac{\nu}{2}+\frac{(z-\mu)^2}{2\sigma^2}\right\}y\right]dy \\
&\quad= \frac{\Gamma\{(\nu+1)/2\}}{\sqrt{\pi}\,\Gamma(\nu/2)\sqrt{\nu}\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2}
= \frac{1}{B(\nu/2,1/2)\sqrt{\nu}\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2},
\end{aligned}
\]

which is the first required result. The case (ii) of the precision mixture is given from (i) when the variable transformation $Y^* = \nu Y$, following the chi-square with $\nu$ df, is considered. That is, we have

\[
\begin{aligned}
&\int_0^\infty\frac{\sqrt{y}}{\sqrt{2\pi}\,\sigma}\exp\!\left\{-\frac{y(z-\mu)^2}{2\sigma^2}\right\}\frac{(\nu/2)^{\nu/2}y^{(\nu/2)-1}}{\Gamma(\nu/2)}\exp\!\left(-\frac{\nu y}{2}\right)dy \\
&\quad= \int_0^\infty\frac{\sqrt{y^*}}{\sqrt{2\pi}\sqrt{\nu}\,\sigma}\exp\!\left\{-\frac{y^*(z-\mu)^2}{2\nu\sigma^2}\right\}\frac{y^{*(\nu/2)-1}}{2^{\nu/2}\Gamma(\nu/2)}\exp\!\left(-\frac{y^*}{2}\right)dy^* \\
&\quad= \frac{\Gamma\{(\nu+1)/2\}}{\sqrt{2\pi}\,\Gamma(\nu/2)\sqrt{\nu}\,\sigma\,2^{\nu/2}}\left\{\frac{1}{2}+\frac{(z-\mu)^2}{2\nu\sigma^2}\right\}^{-(\nu+1)/2}
\int_0^\infty\left\{\frac{1}{2}+\frac{(z-\mu)^2}{2\nu\sigma^2}\right\}^{(\nu+1)/2}\frac{y^{*(\nu+1)/2-1}}{\Gamma\{(\nu+1)/2\}}\exp\!\left[-\left\{\frac{1}{2}+\frac{(z-\mu)^2}{2\nu\sigma^2}\right\}y^*\right]dy^* \\
&\quad= \frac{1}{B(\nu/2,1/2)\sqrt{\nu}\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2},
\end{aligned}
\]

which is the required result for (ii). The remaining results are given by the property that when $Y$ is gamma ((scaled) chi-square) distributed, $Y^{-1}$ follows the corresponding inverse gamma (inverse (scaled) chi-square) distribution.

Proof 2 Consider (iii) of the variance mixture using Inv-Gamma$(\nu/2,\,\nu/2)$, with $\nu/2$ being both the shape and the scale parameter:

\[
\begin{aligned}
&\int_0^\infty\frac{1}{\sqrt{2\pi}\,\sigma\sqrt{y}}\exp\!\left\{-\frac{(z-\mu)^2}{2\sigma^2 y}\right\}\frac{(\nu/2)^{\nu/2}y^{-(\nu/2)-1}}{\Gamma(\nu/2)}\exp\!\left(-\frac{\nu}{2y}\right)dy \\
&\quad= \frac{\Gamma\{(\nu+1)/2\}\,(\nu/2)^{\nu/2}}{\sqrt{2\pi}\,\Gamma(\nu/2)\,\sigma}\left\{\frac{\nu}{2}+\frac{(z-\mu)^2}{2\sigma^2}\right\}^{-(\nu+1)/2}
\int_0^\infty\left\{\frac{\nu}{2}+\frac{(z-\mu)^2}{2\sigma^2}\right\}^{(\nu+1)/2}\frac{y^{-(\nu+1)/2-1}}{\Gamma\{(\nu+1)/2\}}\exp\!\left[-\left\{\frac{\nu}{2}+\frac{(z-\mu)^2}{2\sigma^2}\right\}\Big/y\right]dy \\
&\quad= \frac{1}{B(\nu/2,1/2)\sqrt{\nu}\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2},
\end{aligned}
\]

which is the required result for (iii). The case (iv) of the variance mixture is given from (iii) when the variable transformation $Y^* = Y/\nu$, following the inverse chi-square with $\nu$ df, is considered. That is, we have the identity

\[
\begin{aligned}
&\int_0^\infty\frac{1}{\sqrt{2\pi}\,\sigma\sqrt{y}}\exp\!\left\{-\frac{(z-\mu)^2}{2\sigma^2 y}\right\}\frac{(\nu/2)^{\nu/2}y^{-(\nu/2)-1}}{\Gamma(\nu/2)}\exp\!\left(-\frac{\nu}{2y}\right)dy \\
&\quad= \int_0^\infty\frac{1}{\sqrt{2\pi}\sqrt{\nu}\,\sigma\sqrt{y^*}}\exp\!\left\{-\frac{(z-\mu)^2}{2\nu\sigma^2 y^*}\right\}\frac{y^{*-(\nu/2)-1}}{2^{\nu/2}\Gamma(\nu/2)}\exp\!\left(-\frac{1}{2y^*}\right)dy^* \\
&\quad= \frac{\Gamma\{(\nu+1)/2\}}{\sqrt{2\pi}\,\Gamma(\nu/2)\sqrt{\nu}\,\sigma\,2^{\nu/2}}\left\{\frac{1}{2}+\frac{(z-\mu)^2}{2\nu\sigma^2}\right\}^{-(\nu+1)/2}
\int_0^\infty\left\{\frac{1}{2}+\frac{(z-\mu)^2}{2\nu\sigma^2}\right\}^{(\nu+1)/2}\frac{y^{*-(\nu+1)/2-1}}{\Gamma\{(\nu+1)/2\}}\exp\!\left[-\left\{\frac{1}{2}+\frac{(z-\mu)^2}{2\nu\sigma^2}\right\}\Big/y^*\right]dy^* \\
&\quad= \frac{1}{B(\nu/2,1/2)\sqrt{\nu}\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2}.
\end{aligned}
\]

The remaining results are given by the property that when $Y$ is inverse gamma (inverse (scaled) chi-square) distributed, $Y^{-1}$ follows the corresponding gamma ((scaled) chi-square) distribution. Q.E.D.

The above two proofs are comparable. When, for some reason, the variance mixture is preferred, the inverse gamma (inverse chi-square) can be used; when the gamma (chi-square) distribution is employed, the precision mixture should be used. As in Lemma 9.1, other similar mixtures giving the same pdf of ${\rm St}(\mu,\sigma,\nu)$ can be used. When the standard-deviation mixture is used, the inverse-square-root gamma can be used, which is the distribution of $1/\sqrt{Y^*}$ when $Y^*$ follows the gamma distribution, whose special case is the inverse-chi. Similarly, the mixture over the reciprocal of the standard deviation can also be used. In this case, the square-root gamma can be used, which is the distribution of $\sqrt{Y^*}$ with $Y^*$ as above, whose special case is the chi distribution. Actually, infinitely many types of mixtures yielding the same ${\rm St}(\mu,\sigma,\nu)$ can be constructed when the distributions of the powers of $Y^*$ are used. Let $Y = Y^{*(1/c)}$ $(-\infty<c<\infty,\ c\ne 0)$. Then, $Y$ is said to have the power-gamma distribution, denoted by Power-$\Gamma(\alpha,\beta,c)$. The pdf of this distribution is given by noting that $dy^*/dy = cy^{c-1}$:

\[
g_{{\rm Power}\Gamma}(y\mid\alpha,\beta,c) = \frac{\beta^\alpha y^{c(\alpha-1)}|c|y^{c-1}\exp(-\beta y^c)}{\Gamma(\alpha)}
= \frac{\beta^\alpha |c|\, y^{c\alpha-1}\exp(-\beta y^c)}{\Gamma(\alpha)}
\]
\[
(0<y<\infty,\ 0<\alpha<\infty,\ 0<\beta<\infty,\ -\infty<c<\infty,\ c\ne 0),
\]


where $\alpha$ and $\beta$ are the shape and rate parameters for the gamma-distributed $Y^*$ (the same notation $a$ as that for the normalizer used earlier is employed as long as confusion does not occur). Stacy [22, Eq. (1)] gave the generalized gamma distribution, whose pdf is

\[
g_{{\rm Generalized}\Gamma}(x\mid a,d,p) = \frac{(p/a^d)\,x^{d-1}\exp\{-(x/a)^p\}}{\Gamma(d/p)}\quad(0<x<\infty,\ 0<a<\infty,\ 0<d<\infty,\ 0<p<\infty),
\]

which is seen as a reparametrization of the power-gamma with $\alpha = d/p$, $\beta = 1/a^p$ and $c = p$ under the restriction $0<p<\infty$. The power-gamma can be seen as a special case of the Amoroso distribution, whose pdf is

\[
g_{\rm Amoroso}(x\mid a,\theta,\alpha,\beta) = \frac{|\beta/\theta|}{\Gamma(\alpha)}\left(\frac{x-a}{\theta}\right)^{\alpha\beta-1}\exp\!\left\{-\left(\frac{x-a}{\theta}\right)^{\beta}\right\}
\]
\[
(x\ge a\ \text{if}\ \theta>0;\ x\le a\ \text{if}\ \theta<0;\ \beta\in\mathbb{R},\ \beta\ne 0;\ \alpha>0;\ \theta\in\mathbb{R},\ \theta\ne 0)
\]

(Crooks [5, Eq. (1)]).

Theorem 9.2 The pdf of ${\rm St}(\mu,\sigma,\nu)$ is given by the mixture over the powers of the precisions (power-precisions) of $N(\mu,\,Y^{-c}\sigma^2)$ $(-\infty<c<\infty,\ c\ne 0)$ when $Y$ follows Power-$\Gamma(\nu/2,\,\nu/2,\,c)$.

Proof In Proof 1 for case (i) of Lemma 9.3, replacing $y$ by $y^*$, we have the identity

\[
\int_0^\infty\frac{\sqrt{y^*}}{\sqrt{2\pi}\,\sigma}\exp\!\left\{-\frac{y^*(z-\mu)^2}{2\sigma^2}\right\}\frac{(\nu/2)^{\nu/2}y^{*(\nu/2)-1}}{\Gamma(\nu/2)}\exp\!\left(-\frac{\nu y^*}{2}\right)dy^*
= \frac{1}{B(\nu/2,1/2)\sqrt{\nu}\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2}.
\]

Let $y^* = y^{(1/c)}$ $(-\infty<c<\infty,\ c\ne 0)$, i.e., $y^{*} = y^c$ after relabeling. Then, using $dy^*/dy = cy^{c-1}$, the left-hand side of the above identity becomes

\[
\begin{aligned}
&\int_0^\infty\frac{\sqrt{y^c}}{\sqrt{2\pi}\,\sigma}\exp\!\left\{-\frac{y^c(z-\mu)^2}{2\sigma^2}\right\}\frac{(\nu/2)^{\nu/2}y^{c\{(\nu/2)-1\}}|c|y^{c-1}}{\Gamma(\nu/2)}\exp\!\left(-\frac{\nu y^c}{2}\right)dy \\
&\quad= \int_0^\infty\frac{\sqrt{y^c}}{\sqrt{2\pi}\,\sigma}\exp\!\left\{-\frac{y^c(z-\mu)^2}{2\sigma^2}\right\}\frac{(\nu/2)^{\nu/2}|c|\,y^{c(\nu/2)-1}}{\Gamma(\nu/2)}\exp\!\left(-\frac{\nu y^c}{2}\right)dy \\
&\quad= \int_0^\infty\frac{\sqrt{y^c}}{\sqrt{2\pi}\,\sigma}\exp\!\left\{-\frac{y^c(z-\mu)^2}{2\sigma^2}\right\}g_{{\rm Power}\Gamma}(y\mid\nu/2,\,\nu/2,\,c)\,dy,
\end{aligned}
\]

which shows the required result. Q.E.D.
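The precision mixture of Lemma 9.3(i) and the power-precision mixture of Theorem 9.2 can both be checked by numerical integration. The sketch below uses illustrative values ($\mu = 0.2$, $\sigma = 1.1$, $\nu = 4$, and $c = -2$, for which Power-$\Gamma(\nu/2, \nu/2, -2)$ mixes over squared inverse scales); any valid $c$ should give the same value:

```python
# Mixing N(mu, y^{-1} sigma^2) against Gamma(nu/2, nu/2) weights (c = 1), or
# N(mu, y^{-c} sigma^2) against Power-Gamma(nu/2, nu/2, c) weights, reproduces
# the St(mu, sigma, nu) density at any z.
import math

mu, sigma, nu, c = 0.2, 1.1, 4.0, -2.0   # illustrative values

def t_pdf(z):
    B = math.gamma(nu / 2) * math.sqrt(math.pi) / math.gamma((nu + 1) / 2)
    return (1 + (z - mu) ** 2 / (nu * sigma ** 2)) ** (-(nu + 1) / 2) / (B * math.sqrt(nu) * sigma)

def npdf(z, m, v):
    return math.exp(-(z - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

def power_gamma_pdf(y, a, b, c):
    # beta^alpha |c| y^{c alpha - 1} exp(-beta y^c) / Gamma(alpha)
    return b ** a * abs(c) * y ** (c * a - 1) * math.exp(-b * y ** c) / math.gamma(a)

def simpson(f, lo, hi, n=20000):
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += f(lo + i * h) * (4 if i % 2 else 2)
    return s * h / 3

z = 0.9
gam = simpson(lambda y: npdf(z, mu, sigma ** 2 / y) * power_gamma_pdf(y, nu / 2, nu / 2, 1),
              1e-8, 80.0)
pow_mix = simpson(lambda y: npdf(z, mu, sigma ** 2 / y ** c) * power_gamma_pdf(y, nu / 2, nu / 2, c),
                  1e-3, 40.0)
print(gam, pow_mix, t_pdf(z))
```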


In Theorem 9.2, the mixture over the power-precisions is dealt with. When $-c$ $(-c/2)$ is redefined as $c$, we obtain the corresponding results for the mixtures over the power-variances (power-standard deviations). Note that when $c = -2, -1, 1, 2$, Power-$\Gamma(\nu/2, 1/2, c)$ becomes the inverse-chi, inverse chi-square, chi-square and chi distributions, respectively. These findings reflect a general result on arbitrary formulations of mixtures as long as valid variable transformations are used. It is sometimes pointed out that the mixture, e.g., that for ${\rm St}(\mu,\sigma,\nu)$, is scale-free. In the case of ${\rm St}(\mu,\sigma,\nu)$, this indicates that the weights for the mixture, i.e., the pdfs of Power-$\Gamma(\nu/2, 1/2, c)$, do not depend on the scale $\sigma$ or the standard deviation (SD) of the associated normal density. In Theorem 9.2, Power-$\Gamma(\nu/2, \nu/2, c)$ is used for the weight to yield the normal mixture, where $\nu/2$ is used as a function of the scale parameter, denoted by $b$, of Power-$\Gamma(\nu/2, \nu/2, c)$, i.e., $\nu/2 = 1/b^c$. For the mixture, the unit scale parameter as in Power-$\Gamma(\nu/2, 1, c)$, or a power gamma dependent on the normal scale $\sigma$, can also be considered.

Corollary 9.1 The pdf of ${\rm St}(\mu,\sigma,\nu)$ is given by the mixture over the power-precisions of $N(\mu,\,Y^{-c}\nu\sigma^2/2)$ $(-\infty<c<\infty,\ c\ne 0)$ when $Y$ follows Power-$\Gamma(\nu/2, 1, c)$. The same pdf is also given by the mixture over the power-precisions of $N(\mu,\,Y^{-c})$ $(-\infty<c<\infty,\ c\ne 0)$ when $Y$ follows Power-$\Gamma(\nu/2,\,\nu\sigma^2/2,\,c)$.

Proof In the proof of Theorem 9.2, replacing $y$ by $y^*$, we have the identity

\[
\int_0^\infty\frac{\sqrt{y^{*c}}}{\sqrt{2\pi}\,\sigma}\exp\!\left\{-\frac{y^{*c}(z-\mu)^2}{2\sigma^2}\right\}\frac{(\nu/2)^{\nu/2}|c|\,y^{*c(\nu/2)-1}}{\Gamma(\nu/2)}\exp\!\left(-\frac{\nu y^{*c}}{2}\right)dy^*
= \frac{1}{B(\nu/2,1/2)\sqrt{\nu}\,\sigma}\left\{1+\frac{(z-\mu)^2}{\nu\sigma^2}\right\}^{-(\nu+1)/2}.
\]

Let $y^{*c} = (2/\nu)y^c$. Then, using $dy^*/dy = (2/\nu)^{1/c}$, the left-hand side of the above identity becomes

\[
\int_0^\infty\frac{\sqrt{y^c}\sqrt{2}}{\sqrt{2\pi}\sqrt{\nu}\,\sigma}\exp\!\left\{-\frac{y^c(z-\mu)^2}{\nu\sigma^2}\right\}\frac{|c|\,y^{c(\nu/2)-1}}{\Gamma(\nu/2)}\exp(-y^c)\,dy
= \int_0^\infty\phi(z\mid\mu,\ y^{-c}\nu\sigma^2/2)\,g_{{\rm Power}\Gamma}(y\mid\nu/2,\,1,\,c)\,dy,
\]

which shows the first required result. For the remaining result, let $y^c = y^{*c}/\sigma^2$ with $y^*$ as above. Using $dy^*/dy = (\sigma^2)^{1/c}$, we have as above

\[
\int_0^\infty\frac{\sqrt{y^c}}{\sqrt{2\pi}}\exp\!\left\{-\frac{y^c(z-\mu)^2}{2}\right\}\frac{(\nu\sigma^2/2)^{\nu/2}|c|\,y^{c(\nu/2)-1}}{\Gamma(\nu/2)}\exp\!\left(-\frac{\nu\sigma^2 y^c}{2}\right)dy
= \int_0^\infty\phi(z\mid\mu,\ y^{-c})\,g_{{\rm Power}\Gamma}(y\mid\nu/2,\,\nu\sigma^2/2,\,c)\,dy,
\]

which gives the remaining required result. Q.E.D.

So far, the pdf of ${\rm St}(\mu,\sigma,\nu)$ has been derived by the variable transformation from normal $X$ and power-gamma $Y$ to the normal mixture $Z$ and unchanged $Y$; then $Z$ becomes t-distributed. The roles of $X$ and $Y$ can be exchanged.

Theorem 9.3 The pdf of ${\rm St}(0,1,\nu)$ is given by the quasi mixture over the scale parameters $\sqrt{\nu}X$ $(X>0)$ of the scaled inverse-chi distributed variable $Z$ with $\nu$ df when $X\sim N(0,1)$ over the domain $X>0$.

Proof The pdf of the scaled inverse-chi distribution with $\nu$ df and scale parameter $b$ at $Z=z$ is given by

\[
g_{\chi^{-1}}(z\mid\nu,b) = \frac{b^{\nu}z^{-\nu-1}}{2^{(\nu/2)-1}\Gamma(\nu/2)}\exp\!\left(-\frac{b^2}{2z^2}\right).
\]

Then, the quasi mixture in this theorem becomes

\[
\begin{aligned}
&\int_0^\infty\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x^2}{2}\right)\frac{z^{-\nu-1}(\sqrt{\nu}x)^{\nu}}{2^{(\nu/2)-1}\Gamma(\nu/2)}\exp\!\left(-\frac{\nu x^2}{2z^2}\right)dx
= \int_0^\infty\frac{z^{-\nu-1}\nu^{\nu/2}x^{\nu}}{\sqrt{2\pi}\,2^{(\nu/2)-1}\Gamma(\nu/2)}\exp\!\left\{-\left(1+\frac{\nu}{z^2}\right)\frac{x^2}{2}\right\}dx \\
&\quad= \frac{z^{-\nu-1}\nu^{\nu/2}\,2^{(\nu+1)/2-1}\Gamma\{(\nu+1)/2\}}{\sqrt{2\pi}\,2^{(\nu/2)-1}\Gamma(\nu/2)\,(1+\nu/z^2)^{(\nu+1)/2}}
\int_0^\infty\frac{(1+\nu/z^2)^{(\nu+1)/2}\,x^{\nu}}{2^{(\nu+1)/2-1}\Gamma\{(\nu+1)/2\}}\exp\!\left\{-\left(1+\frac{\nu}{z^2}\right)\frac{x^2}{2}\right\}dx \\
&\quad= \frac{\Gamma\{(\nu+1)/2\}}{\sqrt{\pi}\sqrt{\nu}\,\Gamma(\nu/2)}\left\{\frac{\nu/z^2}{1+\nu/z^2}\right\}^{(\nu+1)/2}
= \frac{1}{\sqrt{\nu}B(\nu/2,1/2)}\left(1+\frac{z^2}{\nu}\right)^{-(\nu+1)/2},
\end{aligned}
\]

which is the pdf of ${\rm St}(0,1,\nu)$. Q.E.D.

Remark 9.1 Note that the phrase "quasi mixture" is used in Theorem 9.3 since the truncated support $X>0$ is used for $X\sim N(0,1)$ without doubling the pdf as in the half normal. The above expression of the quasi mixture is also obtained by variable transformation. Let $Z = \sqrt{\nu}XY>0$, where $X\sim N(0,1)$ with $X>0$, and $Y$ is standard inverse-chi distributed with $\nu$ df independent of $X$ (when $Z$ is negative, $Z = -\sqrt{\nu}XY<0$ can be used). Then, the pdf of $Z$ at $z = \sqrt{\nu}xy$ is given by the variable transformation from $(X,Y)$ to $(Z,X)$, where $X$ is integrated out, with $y = z/(\sqrt{\nu}x)$, unchanged $x$ and the Jacobian

\[
\det\begin{pmatrix} dx/dz & dx/dx \\ dy/dz & dy/dx \end{pmatrix}
= \det\begin{pmatrix} 0 & 1 \\ 1/(\sqrt{\nu}x) & -z/(\sqrt{\nu}x^2)\end{pmatrix} = -\frac{1}{\sqrt{\nu}x},
\]

whose absolute value $1/(\sqrt{\nu}x)$ is used. Using the above result, the pdf of $Z$ becomes

\[
\begin{aligned}
\int_0^\infty \phi(x)\,g_{\chi^{-1}}\!\left(\frac{z}{\sqrt{\nu}x}\,\Big|\,\nu\right)\left|\frac{dy}{dz}\right| dx
&= \int_0^\infty\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x^2}{2}\right)\frac{\{z/(\sqrt{\nu}x)\}^{-\nu-1}/(\sqrt{\nu}x)}{2^{(\nu/2)-1}\Gamma(\nu/2)}\exp\!\left(-\frac{\nu x^2}{2z^2}\right)dx \\
&= \int_0^\infty\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x^2}{2}\right)\frac{z^{-\nu-1}(\sqrt{\nu}x)^{\nu}}{2^{(\nu/2)-1}\Gamma(\nu/2)}\exp\!\left(-\frac{\nu x^2}{2z^2}\right)dx \\
&= \int_0^\infty \phi(x)\,g_{\chi^{-1}}(z\mid\nu,\sqrt{\nu}x)\,dx.
\end{aligned}
\]
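Theorem 9.3's quasi mixture can also be checked numerically: for a positive $z$, integrating the scaled inverse-chi density against the normal weights over $x>0$ reproduces ${\rm St}(0,1,\nu)$. The values $\nu = 5$ and $z = 1.3$ below are illustrative:

```python
# Quasi mixture of Theorem 9.3: integral over x > 0 of phi(x) * g_{chi^-1}(z | nu, sqrt(nu) x).
import math

nu, z = 5.0, 1.3   # illustrative

def phi(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def scaled_inv_chi_pdf(z, nu, b):
    # b^nu z^{-nu-1} / {2^{nu/2-1} Gamma(nu/2)} exp{-b^2/(2 z^2)}
    return (b ** nu * z ** (-nu - 1) / (2 ** (nu / 2 - 1) * math.gamma(nu / 2))
            * math.exp(-b * b / (2 * z * z)))

def t_pdf(z):
    B = math.gamma(nu / 2) * math.sqrt(math.pi) / math.gamma((nu + 1) / 2)
    return (1 + z * z / nu) ** (-(nu + 1) / 2) / (B * math.sqrt(nu))

def simpson(f, lo, hi, n=20000):
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += f(lo + i * h) * (4 if i % 2 else 2)
    return s * h / 3

mix = simpson(lambda x: phi(x) * scaled_inv_chi_pdf(z, nu, math.sqrt(nu) * x), 0.0, 12.0)
print(mix, t_pdf(z))
```

Even though the weights integrate to 1/2 only (hence "quasi"), the mixture value matches the t density at the chosen $z$.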

Remark 9.2 In Theorem 9.2, the normal mixture using the weights of the power gamma is given while in Theorem 9.3, the (quasi) mixture of the scaled inverse-chi with the normal density weights is obtained. This shows a dual aspect when pffiffiffi standard t-distributed Z ¼ mXY with X  N(0; 1Þ and Y  v1 ðmÞ is obtained by

9 The Student t- and Pseudo t- (PT) Distributions …

282

mixtures. In the literature, to the author’s knowledge, the former type of mixture has been exclusively used. This duality is associated with that of the variable transformation as expected. That is, as shown earlier when the transformation (X and Y (or Y c )) to (Z and X) is used, the normal mixture is derived with Y being integrated out (see Theorem 9.2). On the other hand, when the transformation (X and Y) to (Z and Y) is used, the quasi mixture of the scales of the scaled inverse-chi with the normal density weights is obtained with X being integrated out (see Theorem 9.3). Remark 9.3 The phrase “quasi mixture” was used in RTheorem 9.3 since the 1 integral of the weights for a positive z in Theorem 9.3 is 0 /ðxÞ dx ¼ 1=2 rather than 1, which is not consistent with the usual definition of a mixture. However, the pffiffiffi pffiffiffi weight mX for the mixture or the scale of Y  v1 ðm; mXÞ is used R 1 two times, one for a positive Z = z and the other for a negative z, which makes 2 0 /ðxÞ dx ¼ 1 as in the half normal. If this extended definition of a mixture is employed, “quasi” is unnecessary and will be parenthesized or omitted. Corollary 9.2 The pdf of Stð0; 1; mÞ is given by the (quasi) mixture of the scale parameters X ðX [ 0Þ of the scaled inverse-chi distribution with m df when X  N(0; mÞ over the domain X [ 0. Proof Theorem 9.3 gives the pdf of Stð0; 1; mÞ as Z1 0

$$
\int_0^\infty \frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x^{*2}}{2}\right)
\frac{(\sqrt{\nu}\,x^*)^{\nu}\,z^{-\nu-1}}{2^{(\nu/2)-1}\Gamma(\nu/2)}
\exp\!\left(-\frac{\nu x^{*2}}{2z^2}\right)\mathrm{d}x^*
= \int_0^\infty \phi(x^*)\,g_{\chi^{-1}}(z \mid \nu, \sqrt{\nu}\,x^*)\,\mathrm{d}x^*,
$$

where $x^*$ in place of $x$ is used. Consider the variable transformation $x = \sqrt{\nu}\,x^*$ with $\mathrm{d}x^*/\mathrm{d}x = 1/\sqrt{\nu}$. Then, the above result becomes

$$
\int_0^\infty \phi(x^*)\,g_{\chi^{-1}}(z \mid \nu, \sqrt{\nu}\,x^*)\,\mathrm{d}x^*
= \int_0^\infty \phi(x/\sqrt{\nu})\,\frac{1}{\sqrt{\nu}}\,g_{\chi^{-1}}(z \mid \nu, x)\,\mathrm{d}x
= \int_0^\infty \phi(x \mid 0, \nu)\,g_{\chi^{-1}}(z \mid \nu, x)\,\mathrm{d}x,
$$

which is the required result. Q.E.D.

In Corollary 9.2, the possibly non-integer value $\nu$ of the df is also used as the variance of the normal weights for the inverse-chi mixture, which shows another aspect of the arbitrariness of the scales of the kernel in a mixture.
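Corollary 9.2 can be checked numerically. The following is a minimal stdlib-only sketch (function names are mine, not the book's): the normal-weighted scaled inverse-chi mixture is integrated by the midpoint rule and compared to the closed-form Student t density for $z > 0$.

```python
import math

def t_pdf(z, v):
    # Student t density with v df
    return (math.gamma((v + 1) / 2)
            / (math.sqrt(v * math.pi) * math.gamma(v / 2))
            * (1 + z * z / v) ** (-(v + 1) / 2))

def inv_chi_pdf(z, v, s):
    # scaled inverse-chi density g_{chi^-1}(z | v, s) for z > 0 with scale s
    return (s ** v * z ** (-v - 1) * math.exp(-s * s / (2 * z * z))
            / (2 ** (v / 2 - 1) * math.gamma(v / 2)))

def half_normal_weighted_mixture(z, v, n=20000, upper=60.0):
    # midpoint-rule approximation of  int_0^inf phi(x | 0, v) g_{chi^-1}(z | v, x) dx
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        weight = math.exp(-x * x / (2 * v)) / math.sqrt(2 * math.pi * v)
        total += weight * inv_chi_pdf(z, v, x)
    return total * h

v = 5.0
for z in (0.3, 1.0, 2.5):
    assert abs(half_normal_weighted_mixture(z, v) - t_pdf(z, v)) < 1e-4
```

The agreement for several $z$ illustrates that the half-weights $\phi(x \mid 0, \nu)$, $x > 0$, reproduce the t density on the positive half-line, with the negative half-line covered by symmetry as noted in Remark 9.3.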


Corollary 9.3 The pdf of ${\rm St}(0, 1, \nu)$ is given by the following mixtures:

(i) the mixture of the scale parameters $\sqrt{\nu X}$ of the scaled inverse-chi distribution with $\nu$ df when $X \sim \chi^2(1)$;
(ii) the mixture of the scale parameters $\sqrt{X}$ of the scaled inverse-chi distribution with $\nu$ df when $X \sim {\rm Gamma}\{1/2, 1/(2\nu)\}$ with $1/(2\nu)$ being the rate parameter;
(iii) the mixture of the scale parameters $\sqrt{\nu X^c}$ of the scaled inverse-chi distribution with $\nu$ df when $X \sim$ Power-$\Gamma(1/2, 1/2, c)$.

Proof (i) Theorem 9.3 gives the pdf of ${\rm St}(0, 1, \nu)$ as

$$
\int_0^\infty \frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x^{*2}}{2}\right)
\frac{(\sqrt{\nu}\,x^*)^{\nu}\,z^{-\nu-1}}{2^{(\nu/2)-1}\Gamma(\nu/2)}
\exp\!\left(-\frac{\nu x^{*2}}{2z^2}\right)\mathrm{d}x^*
= \int_0^\infty \phi(x^*)\,g_{\chi^{-1}}(z \mid \nu, \sqrt{\nu}\,x^*)\,\mathrm{d}x^*.
$$

Consider the variable transformation $x = x^{*2}$ with $\mathrm{d}x^*/\mathrm{d}x = 1/(2\sqrt{x})$. Then, the above result becomes

$$
\int_0^\infty \phi(x^*)\,g_{\chi^{-1}}(z \mid \nu, \sqrt{\nu}\,x^*)\,\mathrm{d}x^*
= \int_0^\infty \frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x}{2}\right)\frac{1}{2\sqrt{x}}\,g_{\chi^{-1}}(z \mid \nu, \sqrt{\nu x})\,\mathrm{d}x
= \int_0^\infty \frac{1}{2}\,g_{\chi^2}(x \mid 1)\,g_{\chi^{-1}}(z \mid \nu, \sqrt{\nu x})\,\mathrm{d}x,
$$

which is the required result.

(ii) For the second result, consider the variable transformation $x = \nu x^{*2}$ with $x^* = \sqrt{x/\nu}$ and $\mathrm{d}x^*/\mathrm{d}x = 1/(2\sqrt{\nu x})$. Then, the pdf of ${\rm St}(0, 1, \nu)$ is

$$
\int_0^\infty \phi(x^*)\,g_{\chi^{-1}}(z \mid \nu, \sqrt{\nu}\,x^*)\,\mathrm{d}x^*
= \int_0^\infty \frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x}{2\nu}\right)\frac{1}{2\sqrt{\nu x}}\,g_{\chi^{-1}}(z \mid \nu, \sqrt{x})\,\mathrm{d}x
$$
$$
= \int_0^\infty \frac{1}{2}\,\frac{x^{(1/2)-1}}{\sqrt{2\nu}\,\Gamma(1/2)}\exp\!\left(-\frac{x}{2\nu}\right)g_{\chi^{-1}}(z \mid \nu, \sqrt{x})\,\mathrm{d}x
= \int_0^\infty \frac{1}{2}\,g_{\Gamma}\{x \mid 1/2, 1/(2\nu)\}\,g_{\chi^{-1}}(z \mid \nu, \sqrt{x})\,\mathrm{d}x,
$$

which gives the second required result.
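Mixture (i) also admits a direct stochastic reading: drawing the scale $\sqrt{\nu X}$ with $X \sim \chi^2(1)$ and feeding it to the inverse-chi kernel $s/\chi_\nu$ yields $\sqrt{\nu}\,|N|/\chi_\nu$, i.e., the absolute value of a t variate (the sign is restored by symmetry, cf. Remark 9.3). The stdlib-only Monte Carlo sketch below (helper names are mine) compares the sample mean of the draws with the known mean absolute value ${\rm E}|T_\nu| = \sqrt{\nu}\,\Gamma\{(\nu-1)/2\}/\{\sqrt{\pi}\,\Gamma(\nu/2)\}$.

```python
import math
import random

random.seed(1)
v = 5  # degrees of freedom

def chi(df):
    # chi-distributed draw: square root of a sum of df squared standard normals
    return math.sqrt(sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df)))

n = 200000
# scale sqrt(v * X) with X ~ chi-square(1), fed to the inverse-chi kernel s / chi_v
samples = [math.sqrt(v) * abs(random.gauss(0.0, 1.0)) / chi(v) for _ in range(n)]
mean_abs = sum(samples) / n
exact = math.sqrt(v) * math.gamma((v - 1) / 2) / (math.sqrt(math.pi) * math.gamma(v / 2))
assert abs(mean_abs - exact) < 0.02
```

The tolerance is loose (about ten Monte Carlo standard errors), so the assertion is a robustness check rather than a precise test.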


(iii) For the remaining result, consider the variable transformation $x = x^{*2/c}$ with $x^* = x^{c/2}$ and $\mathrm{d}x^*/\mathrm{d}x = c\,x^{(c/2)-1}/2$. Then, the pdf of ${\rm St}(0, 1, \nu)$ is

$$
\int_0^\infty \phi(x^*)\,g_{\chi^{-1}}(z \mid \nu, \sqrt{\nu}\,x^*)\,\mathrm{d}x^*
= \int_0^\infty \frac{|c|\,x^{(c/2)-1}}{2}\,\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x^c}{2}\right)g_{\chi^{-1}}(z \mid \nu, \sqrt{\nu x^c})\,\mathrm{d}x
$$
$$
= \int_0^\infty \frac{1}{2}\,\frac{|c|\,x^{(c/2)-1}}{\sqrt{2}\,\Gamma(1/2)}\exp\!\left(-\frac{x^c}{2}\right)g_{\chi^{-1}}(z \mid \nu, \sqrt{\nu x^c})\,\mathrm{d}x
= \int_0^\infty \frac{1}{2}\,g_{{\rm Power}\text{-}\Gamma}(x \mid 1/2, 1/2, c)\,g_{\chi^{-1}}(z \mid \nu, \sqrt{\nu x^c})\,\mathrm{d}x,
$$

which is the last required result. Q.E.D.

As addressed after Corollary 9.2, Corollary 9.3 shows other cases of the arbitrariness of scales in a mixture. It is found that the weights using the (power-)gamma distribution can be employed in both the normal and inverse-chi mixtures.

Corollary 9.4 The pdf of ${\rm St}(0, 1, \nu)$ is given by the mixture of the second parameters of the Power-$\Gamma(\nu/2, \nu X^c/2, -2)$ distribution when $X \sim$ Power-$\Gamma(1/2, 1/2, c)$.

Proof From the last result of (iii) of Corollary 9.3, noting that the inverse-chi distribution is a special case of the Power-$\Gamma$, we have

$$
\int_0^\infty \frac{1}{2}\,g_{{\rm Power}\text{-}\Gamma}(x \mid 1/2, 1/2, c)\,g_{\chi^{-1}}(z \mid \nu, \sqrt{\nu x^c})\,\mathrm{d}x
= \int_0^\infty \frac{1}{2}\,g_{{\rm Power}\text{-}\Gamma}(x \mid 1/2, 1/2, c)\,g_{{\rm Power}\text{-}\Gamma}(z \mid \nu/2, \nu x^c/2, -2)\,\mathrm{d}x,
$$

giving the required result. Q.E.D.

The above result shows that the pdf of ${\rm St}(0, 1, \nu)$ can be expressed as a Power-$\Gamma$ mixture with Power-$\Gamma$ weights, where the value of the power in the former Power-$\Gamma$, which is also used in the second parameter of the latter Power-$\Gamma$, is arbitrary, as in other cases, due to the model indeterminacy of a mixture.

Remark 9.4 In many textbooks, the standard t-distributed variable is defined as $X/\sqrt{Y/\nu}$, where $X \sim {\rm N}(0,1)$ and $Y \sim \chi^2(\nu)$ independent of $X$, as shown in Proof 3 of Lemma 9.1. On the other hand, the pdf is typically given by the precision mixture of normal distributions using the weights of the gamma distribution with the shape and rate parameters both equal to $\nu/2$, as shown in (i) of Proof 1 of Lemma 9.3. There seems to be some gap between these two explanations. The latter derivation of the pdf is chosen for its simplicity, while to obtain the pdf from $X/\sqrt{Y/\nu}$ we have to use the joint distribution of $X$ and $Y$, a variable transformation


and marginalization, which gives an indirect impression. It is of some interest to see the relationship between mixture and marginalization in these two cases that are seemingly distinct. In Proof 3 of Lemma 9.1, the following joint pdf of $Z = X/\sqrt{Y/\nu}$ and $Y$ is used before marginalization:

$$
\frac{1}{\sqrt{2\pi}}\sqrt{y/\nu}\,\exp\!\left(-\frac{z^2 y}{2\nu}\right)
\frac{y^{(\nu/2)-1}}{2^{\nu/2}\Gamma(\nu/2)}\exp\!\left(-\frac{y}{2}\right),
$$

where $\sqrt{y/\nu}$ comes from the Jacobian. Then, moving this factor forward, the above pdf becomes the product of two pdfs:

$$
\left\{\frac{1}{\sqrt{2\pi}}\sqrt{y/\nu}\,\exp\!\left(-\frac{z^2 y}{2\nu}\right)\right\}
\left\{\frac{y^{(\nu/2)-1}}{2^{\nu/2}\Gamma(\nu/2)}\exp\!\left(-\frac{y}{2}\right)\right\},
$$

which shows the form of the (scaled) precision mixture with the chi-square weights when $y$ is integrated out, as shown in (ii) of Lemma 9.3. That is, this marginalization can be seen as a case of an (unscaled) chi-square mixture.

9.3 The Multivariate t-Distribution

The p-variate t-distribution with $\nu$ df, denoted by $\mathbf{Z} \sim {\rm St}(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \nu)$, is defined by the pdf at $\mathbf{Z} = \mathbf{z}$

$$
g(\mathbf{z} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}, \nu)
= \frac{\Gamma\{(\nu+p)/2\}}{(\pi\nu)^{p/2}\,\Gamma(\nu/2)\,|\boldsymbol{\Sigma}|^{1/2}}
\left\{1 + \frac{(\mathbf{z}-\boldsymbol{\mu})^{\rm T}\boldsymbol{\Sigma}^{-1}(\mathbf{z}-\boldsymbol{\mu})}{\nu}\right\}^{-(\nu+p)/2}
$$

(Cornish [4, Eq. (1)]; Kotz and Nadarajah [17, Eq. (1.1)]).

Lemma 9.4 The pdf of the p-variate t-distribution when $\boldsymbol{\mu} = \mathbf{0}$ and $\boldsymbol{\Sigma} = \mathbf{I}_p$, where $\mathbf{I}_p$ is the $p \times p$ identity matrix, is obtained when $\mathbf{Z}$ is equal to $\mathbf{X}/\sqrt{Y/\nu}$, $\sqrt{\nu Y}\,\mathbf{X}$, $\sqrt{\nu}\,Y\mathbf{X}$ or $\sqrt{\nu}\,\mathbf{X}/Y$, where $\mathbf{X} \sim {\rm N}(\mathbf{0}, \mathbf{I}_p)$ and $Y$ is chi-square, inverse chi-square, inverse-chi and chi-distributed, respectively, with $\nu$ df independent of $\mathbf{X}$.

Proof We give the familiar case of $\mathbf{Z} = \mathbf{X}/\sqrt{Y/\nu}$. The pdf of the joint distribution of $\mathbf{X}$ and $Y$ is

$$
\phi_p(\mathbf{X} = \mathbf{x} \mid \boldsymbol{\mu} = \mathbf{0}, \boldsymbol{\Sigma} = \mathbf{I}_p)\,g_{\chi^2}(Y = y \mid \nu)
= \frac{1}{(2\pi)^{p/2}}\exp\!\left(-\frac{\mathbf{x}^{\rm T}\mathbf{x}}{2}\right)
\frac{y^{(\nu/2)-1}}{2^{\nu/2}\Gamma(\nu/2)}\exp\!\left(-\frac{y}{2}\right).
$$


Use the variable transformation from $(\mathbf{X}, Y)$ to $(\mathbf{Z} = \mathbf{X}/\sqrt{Y/\nu},\ Y)$ with the Jacobian $\prod_{i=1}^p \mathrm{d}x_i/\mathrm{d}z_i = (y/\nu)^{p/2}$. Then, noting $\mathbf{X} = \mathbf{Z}\sqrt{Y/\nu}$, the pdf of the joint distribution of $\mathbf{Z}$ and $Y$ becomes

$$
\frac{(y/\nu)^{p/2}}{(2\pi)^{p/2}}\exp\!\left(-\frac{\mathbf{z}^{\rm T}\mathbf{z}\,y}{2\nu}\right)
\frac{y^{(\nu/2)-1}}{2^{\nu/2}\Gamma(\nu/2)}\exp\!\left(-\frac{y}{2}\right)
$$
$$
= \frac{\Gamma\{(\nu+p)/2\}}{(2\pi\nu)^{p/2}\,2^{\nu/2}\Gamma(\nu/2)}
\left(\frac{1}{2} + \frac{\mathbf{z}^{\rm T}\mathbf{z}}{2\nu}\right)^{-(\nu+p)/2}
\left(\frac{1}{2} + \frac{\mathbf{z}^{\rm T}\mathbf{z}}{2\nu}\right)^{(\nu+p)/2}
\frac{y^{\{(\nu+p)/2\}-1}}{\Gamma\{(\nu+p)/2\}}
\exp\!\left\{-\left(\frac{1}{2} + \frac{\mathbf{z}^{\rm T}\mathbf{z}}{2\nu}\right)y\right\}
$$
$$
= \frac{\Gamma\{(\nu+p)/2\}}{(\pi\nu)^{p/2}\,\Gamma(\nu/2)}
\left(1 + \frac{\mathbf{z}^{\rm T}\mathbf{z}}{\nu}\right)^{-(\nu+p)/2}
\left(\frac{1}{2} + \frac{\mathbf{z}^{\rm T}\mathbf{z}}{2\nu}\right)^{(\nu+p)/2}
\frac{y^{\{(\nu+p)/2\}-1}}{\Gamma\{(\nu+p)/2\}}
\exp\!\left\{-\left(\frac{1}{2} + \frac{\mathbf{z}^{\rm T}\mathbf{z}}{2\nu}\right)y\right\},
$$

where

$$
\left(\frac{1}{2} + \frac{\mathbf{z}^{\rm T}\mathbf{z}}{2\nu}\right)^{(\nu+p)/2}
\frac{y^{\{(\nu+p)/2\}-1}}{\Gamma\{(\nu+p)/2\}}
\exp\!\left\{-\left(\frac{1}{2} + \frac{\mathbf{z}^{\rm T}\mathbf{z}}{2\nu}\right)y\right\}
$$

is the pdf of the gamma or scaled chi-square distribution with $\nu + p$ df and scale parameter $\left(1 + \mathbf{z}^{\rm T}\mathbf{z}/\nu\right)^{-1}$, which is integrated out to obtain the pdf of $\mathbf{Z}$. The remaining cases, when $\mathbf{Z}$ is $\sqrt{\nu Y}\,\mathbf{X}$, $\sqrt{\nu}\,Y\mathbf{X}$ or $\sqrt{\nu}\,\mathbf{X}/Y$, are given similarly. Q.E.D.

Note that Lemma 9.4 parallels Lemma 9.1 for the univariate t-distribution. Introducing the location parameter $\boldsymbol{\mu}$ and the scale parameter $\boldsymbol{\Sigma}^{1/2}$, a symmetric matrix square root of $\boldsymbol{\Sigma}$, we have

$$
g(\mathbf{z} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}, \nu)
= \frac{\Gamma\{(\nu+p)/2\}}{(\pi\nu)^{p/2}\,\Gamma(\nu/2)\,|\boldsymbol{\Sigma}|^{1/2}}
\left\{1 + \frac{(\mathbf{z}-\boldsymbol{\mu})^{\rm T}\boldsymbol{\Sigma}^{-1}(\mathbf{z}-\boldsymbol{\mu})}{\nu}\right\}^{-(\nu+p)/2}.
$$

The derivation of $g(\mathbf{z} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}, \nu)$ using mixtures is also given in various ways, as in the univariate case of Lemma 9.3.

Lemma 9.5 The pdf of ${\rm St}(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \nu)$ with real-valued $\nu > 0$ is given by

(i) the mixture of the precisions of ${\rm N}(\boldsymbol{\mu}, Y^{-1}\boldsymbol{\Sigma})$ when $Y$ follows ${\rm Gamma}(\nu/2, \nu/2)$ with $\nu/2$ being both the shape and rate parameter, or equivalently the scaled chi-square with $\nu$ df and scale parameter $\nu^{-1}$;


(ii) the mixture of the precisions of ${\rm N}(\boldsymbol{\mu}, Y^{-1}\nu\boldsymbol{\Sigma})$ when $Y$ follows ${\rm Gamma}(\nu/2, 1/2)$ with $\nu/2$ and $1/2$ being the shape and rate parameters, respectively, or equivalently the chi-square with $\nu$ df.

The same pdf is given by

(iii) the mixture of the covariances of ${\rm N}(\boldsymbol{\mu}, Y\boldsymbol{\Sigma})$ when $Y$ follows the inverse gamma distribution denoted by Inv-Gamma$(\nu/2, \nu/2)$ with $\nu/2$ being both the shape and scale parameter, or equivalently the scaled inverse chi-square with $\nu$ df and scale parameter $\nu$;
(iv) the mixture of the covariances of ${\rm N}(\boldsymbol{\mu}, Y\nu\boldsymbol{\Sigma})$ when $Y$ follows the inverse gamma distribution denoted by Inv-Gamma$(\nu/2, 1/2)$ with $\nu/2$ and $1/2$ being the shape and scale parameters, respectively, or equivalently the inverse chi-square with $\nu$ df.

Proof For (i), we have

$$
\int_0^\infty \frac{y^{p/2}}{(2\pi)^{p/2}|\boldsymbol{\Sigma}|^{1/2}}
\exp\!\left\{-\frac{y(\mathbf{z}-\boldsymbol{\mu})^{\rm T}\boldsymbol{\Sigma}^{-1}(\mathbf{z}-\boldsymbol{\mu})}{2}\right\}
\frac{(\nu/2)^{\nu/2}\,y^{(\nu/2)-1}}{\Gamma(\nu/2)}\exp\!\left(-\frac{\nu y}{2}\right)\mathrm{d}y
$$
$$
= \frac{\Gamma\{(\nu+p)/2\}}{(2\pi)^{p/2}\,\Gamma(\nu/2)\,|\boldsymbol{\Sigma}|^{1/2}}
\left(\frac{\nu}{2}\right)^{\nu/2}
\left\{\frac{\nu}{2} + \frac{(\mathbf{z}-\boldsymbol{\mu})^{\rm T}\boldsymbol{\Sigma}^{-1}(\mathbf{z}-\boldsymbol{\mu})}{2}\right\}^{-(\nu+p)/2}
$$
$$
\quad\times\int_0^\infty
\left\{\frac{\nu}{2} + \frac{(\mathbf{z}-\boldsymbol{\mu})^{\rm T}\boldsymbol{\Sigma}^{-1}(\mathbf{z}-\boldsymbol{\mu})}{2}\right\}^{(\nu+p)/2}
\frac{y^{\{(\nu+p)/2\}-1}}{\Gamma\{(\nu+p)/2\}}
\exp\!\left[-\left\{\frac{\nu}{2} + \frac{(\mathbf{z}-\boldsymbol{\mu})^{\rm T}\boldsymbol{\Sigma}^{-1}(\mathbf{z}-\boldsymbol{\mu})}{2}\right\}y\right]\mathrm{d}y
$$
$$
= \frac{\Gamma\{(\nu+p)/2\}}{(\pi\nu)^{p/2}\,\Gamma(\nu/2)\,|\boldsymbol{\Sigma}|^{1/2}}
\left\{1 + \frac{(\mathbf{z}-\boldsymbol{\mu})^{\rm T}\boldsymbol{\Sigma}^{-1}(\mathbf{z}-\boldsymbol{\mu})}{\nu}\right\}^{-(\nu+p)/2}.
$$

The result of (ii) is given by redefining $Y\nu$ in (i) as $Y$. The result of (iii) is obtained from (i) by noting that when $Y$ is gamma distributed, $Y^{-1}$ follows the inverse gamma. The result of (iv) is given by redefining $Y\nu^{-1}$ in (iii) as $Y$. Q.E.D.
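The precision mixture of (i) can be verified numerically in the multivariate case. The sketch below (stdlib only; function names are mine) takes $p = 2$, $\boldsymbol{\mu} = \mathbf{0}$, $\boldsymbol{\Sigma} = \mathbf{I}_2$, integrates ${\rm N}(\mathbf{z} \mid \mathbf{0}, y^{-1}\mathbf{I}_2)$ against ${\rm Gamma}(y \mid \nu/2, \nu/2)$ by the midpoint rule, and compares with the closed-form bivariate t density, which depends on $\mathbf{z}$ only through $\mathbf{z}^{\rm T}\mathbf{z}$.

```python
import math

def mvt_pdf(zz, v, p):
    # closed-form p-variate t density at squared radius zz = z'z (mu = 0, Sigma = I)
    return (math.gamma((v + p) / 2)
            / ((math.pi * v) ** (p / 2) * math.gamma(v / 2))
            * (1 + zz / v) ** (-(v + p) / 2))

def precision_mixture(zz, v, p, n=40000, upper=40.0):
    # midpoint rule for  int_0^inf N(z | 0, y^{-1} I_p) Gamma(y | v/2, rate v/2) dy
    h = upper / n
    c = (v / 2) ** (v / 2) / math.gamma(v / 2)
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        normal = (y / (2 * math.pi)) ** (p / 2) * math.exp(-y * zz / 2)
        gam = c * y ** (v / 2 - 1) * math.exp(-v * y / 2)
        total += normal * gam
    return total * h

v, p = 4.0, 2
for zz in (0.25, 1.0, 4.0):
    assert abs(precision_mixture(zz, v, p) - mvt_pdf(zz, v, p)) < 1e-5
```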


9.4 The Pseudo t (PT)-Distribution

As addressed in Sect. 9.1, the PN can be extended using non-normal distributions. One of the most extensively investigated non-normal distributions is the t-distribution. The skew multivariate t-distribution was derived by Branco and Dey [3, Sect. 3.2] using the precision mixture of the normal with the gamma weights, or equivalently by Gupta [9, Definition 1] as

$$
\mathbf{Y} = (Y_1, \ldots, Y_p)^{\rm T} = \mathbf{X}/\sqrt{W/\nu} = (X_1, \ldots, X_p)^{\rm T}/\sqrt{W/\nu},
$$

where $\mathbf{X}$ is multivariate skew-normal distributed, whose pdf is

$$
f(\mathbf{X} = \mathbf{x}) = 2\phi_p(\mathbf{x} \mid \boldsymbol{\mu} = \mathbf{0}, \boldsymbol{\Sigma})\,\Phi(\boldsymbol{\alpha}^{\rm T}\mathbf{x}), \quad \mathbf{x} \in \mathbb{R}^p,\ \boldsymbol{\alpha} \in \mathbb{R}^p,
$$

where $\Phi(\cdot)$ is the cumulative distribution function (cdf) of the standard normal, and $W$ is chi-square distributed with $\nu$ df. When $\boldsymbol{\alpha} = \mathbf{0}$, we have $\mathbf{X} \sim {\rm N}(\mathbf{0}, \boldsymbol{\Sigma})$, giving $\mathbf{Y} \sim {\rm St}(\mathbf{0}, \boldsymbol{\Sigma}, \nu)$, which is the only symmetric case of the multivariate skew t (ST), as in the multivariate SN.

9.4.1 The PDF of the PT

Recall that the pdf of $\mathbf{Y} \sim {\rm PN}_{p,q,R}(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\Delta}; \boldsymbol{\eta}, \boldsymbol{\Psi}, \mathbf{A}, \mathbf{B})$ is

$$
f_{p,q,R}(\mathbf{y} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\Delta}; \boldsymbol{\eta}, \boldsymbol{\Psi}, \mathbf{A}, \mathbf{B})
= \frac{\phi_p(\mathbf{y} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma})}{\Pr(\mathbf{Z} \in S \mid \boldsymbol{\eta}, \boldsymbol{\Omega})}
\Pr\{\mathbf{Z} \in S \mid \boldsymbol{\eta} + \boldsymbol{\Delta}(\mathbf{y} - \boldsymbol{\mu}), \boldsymbol{\Psi}\}.
$$

Definition 9.1 Let $\mathbf{Y} \sim {\rm PN}_{p,q,R}(\boldsymbol{\mu} = \mathbf{0}, \boldsymbol{\Sigma}, \boldsymbol{\Delta}; \boldsymbol{\eta} = \mathbf{0}, \boldsymbol{\Psi}, \mathbf{A}, \mathbf{B})$. Then, the random p-vector following the pseudo t (PT)-distribution is defined as $\sqrt{\nu}\,\mathbf{Y}/\sqrt{W^c}$, where $W \sim$ Power-$\Gamma(\nu/2, 1/2, c)$ independent of $\mathbf{Y}$.


Lemma 9.6 Suppose that $W \sim$ Power-$\Gamma\{(\nu+p)/2, b, c\}$. Then,

$$
{\rm E}_W\{\Pr(Z \in S \mid \delta^*\sqrt{W^c/\nu}, \psi)\}
= {\rm E}_W\!\left[\int_a^b \frac{1}{\sqrt{2\pi\psi}}\exp\!\left\{-\left(z - \delta^*\sqrt{W^c/\nu}\right)^{2}\!\Big/(2\psi)\right\}\mathrm{d}z\right]
$$
$$
= \left(1 + \frac{\delta^{*2}/\nu}{2b\psi}\right)^{-(\nu+p)/2}
\sum_{u=0}^\infty \frac{\Gamma\{(\nu+p+u)/2\}}{\Gamma\{(\nu+p)/2\}\,u!}
\left\{\frac{\delta^*}{\nu^{1/2}\psi}\left(b + \frac{\delta^{*2}/\nu}{2\psi}\right)^{-1/2}\right\}^{u}
{\rm E}\{Z^u \mid Z \sim {\rm N}(0, \psi);\ \mathbf{a}, \mathbf{b}\},
$$

where $\int_a^b(\cdot)\,\mathrm{d}z \equiv \sum_{r=1}^R \int_{a_r}^{b_r}(\cdot)\,\mathrm{d}z$ and

$$
{\rm E}\{Z^u \mid Z \sim {\rm N}(0, \psi);\ \mathbf{a}, \mathbf{b}\}
= \int_a^b \frac{z^u}{\sqrt{2\pi\psi}}\exp\!\left(-\frac{z^2}{2\psi}\right)\mathrm{d}z.
$$

Proof

$$
\int_0^\infty g_{{\rm Power}\text{-}\Gamma}\{w \mid (\nu+p)/2, b, c\}\,\Pr(Z \in S \mid \delta^*\sqrt{w^c/\nu}, \psi)\,\mathrm{d}w
$$
$$
= \int_0^\infty \frac{b^{(\nu+p)/2}\,|c|\,w^{c\{(\nu+p)/2\}-1}}{\Gamma\{(\nu+p)/2\}}\exp(-bw^c)
\sum_{r=1}^R \int_{a_r}^{b_r} \frac{1}{\sqrt{2\pi\psi}}\exp\!\left\{-\left(z - \delta^*\sqrt{w^c/\nu}\right)^{2}\!\Big/(2\psi)\right\}\mathrm{d}z\,\mathrm{d}w
$$
$$
= \int_a^b \int_0^\infty \frac{b^{(\nu+p)/2}\,|c|\,w^{c\{(\nu+p)/2\}-1}}{\Gamma\{(\nu+p)/2\}}\exp(-bw^c)
\sum_{u=0}^\infty \frac{\{z\delta^*\sqrt{w^c/\nu}/\psi\}^u}{u!}\,
\frac{1}{\sqrt{2\pi\psi}}\exp\!\left(-\frac{\delta^{*2}w^c/\nu}{2\psi}\right)\exp\!\left(-\frac{z^2}{2\psi}\right)\mathrm{d}w\,\mathrm{d}z
$$
$$
= \sum_{u=0}^\infty \frac{b^{(\nu+p)/2}}{\Gamma\{(\nu+p)/2\}\,u!}
\int_a^b \frac{\{z\delta^*/(\nu^{1/2}\psi)\}^u}{\sqrt{2\pi\psi}}
\int_0^\infty |c|\,w^{c\{(\nu+p+u)/2\}-1}
\exp\!\left\{-\left(b + \frac{\delta^{*2}/\nu}{2\psi}\right)w^c\right\}\mathrm{d}w\,
\exp\!\left(-\frac{z^2}{2\psi}\right)\mathrm{d}z
$$


$$
= \sum_{u=0}^\infty \frac{b^{(\nu+p)/2}\,\Gamma\{(\nu+p+u)/2\}}{\Gamma\{(\nu+p)/2\}\,u!}
\left(b + \frac{\delta^{*2}/\nu}{2\psi}\right)^{-(\nu+p+u)/2}
\left\{\frac{\delta^*}{\nu^{1/2}\psi}\right\}^{u}
\int_a^b \frac{z^u}{\sqrt{2\pi\psi}}\exp\!\left(-\frac{z^2}{2\psi}\right)\mathrm{d}z
$$
$$
= \left(1 + \frac{\delta^{*2}/\nu}{2b\psi}\right)^{-(\nu+p)/2}
\sum_{u=0}^\infty \frac{\Gamma\{(\nu+p+u)/2\}}{\Gamma\{(\nu+p)/2\}\,u!}
\left\{\frac{\delta^*}{\nu^{1/2}\psi}\left(b + \frac{\delta^{*2}/\nu}{2\psi}\right)^{-1/2}\right\}^{u}
{\rm E}\{Z^u \mid Z \sim {\rm N}(0, \psi);\ \mathbf{a}, \mathbf{b}\},
$$

which is the required result. Q.E.D.

Theorem 9.4 The pdf of the pseudo t-distributed vector $\mathbf{Y}^* = \sqrt{\nu}\,\mathbf{Y}/\sqrt{W^c}$ at $\mathbf{Y}^* = \mathbf{y}^*$, where $\mathbf{Y} \sim {\rm PN}_{p,1,R}(\mathbf{0}, \boldsymbol{\Sigma}, \boldsymbol{\delta}^{\rm T}; 0, \psi, \mathbf{a}, \mathbf{b})$ and $W \sim$ Power-$\Gamma(\nu/2, 1/2, c)$ independent of $\mathbf{Y}$, is

$$
t(\mathbf{y}^* \mid \mathbf{0}, \boldsymbol{\Sigma}, \nu)
\left(1 + \frac{\delta^{*2}/\nu}{2b\psi}\right)^{-(\nu+p)/2}
\sum_{u=0}^\infty \frac{\Gamma\{(\nu+p+u)/2\}}{\Gamma\{(\nu+p)/2\}\,u!}
\left\{\frac{\delta^*}{\nu^{1/2}\psi}\left(b + \frac{\delta^{*2}/\nu}{2\psi}\right)^{-1/2}\right\}^{u}
\frac{{\rm E}\{Z^u \mid Z \sim {\rm N}(0, \psi);\ \mathbf{a}, \mathbf{b}\}}{\int_a^b \phi(z \mid 0, \omega)\,\mathrm{d}z},
$$

where $\delta^* \equiv \boldsymbol{\delta}^{\rm T}\mathbf{y}^*$ and $b \equiv \{1 + \mathbf{y}^{*{\rm T}}(\nu\boldsymbol{\Sigma})^{-1}\mathbf{y}^*\}/2$.


Proof The pdf of $\mathbf{Y}^*$, with the Jacobian $(w^c/\nu)^{p/2}$ from $\mathbf{Y}$ to $\mathbf{Y}^*$, is

$$
\int_0^\infty g_{{\rm Power}\text{-}\Gamma}(w \mid \nu/2, 1/2, c)\left(\frac{w^c}{\nu}\right)^{p/2}
f_{p,1,R}(\mathbf{y}^*\sqrt{w^c/\nu} \mid \mathbf{0}, \boldsymbol{\Sigma}, \boldsymbol{\delta}^{\rm T}; 0, \psi, \mathbf{a}, \mathbf{b})\,\mathrm{d}w
$$
$$
= \int_0^\infty g_{{\rm Power}\text{-}\Gamma}(w \mid \nu/2, 1/2, c)\left(\frac{w^c}{\nu}\right)^{p/2}
\frac{\phi_p(\mathbf{y}^*\sqrt{w^c/\nu} \mid \mathbf{0}, \boldsymbol{\Sigma})}{\Pr(Z \in S \mid 0, \omega)}\,
\Pr(Z \in S \mid \boldsymbol{\delta}^{\rm T}\mathbf{y}^*\sqrt{w^c/\nu}, \psi)\,\mathrm{d}w
$$
$$
= \int_0^\infty \frac{|c|\,w^{c(\nu/2)-1}}{2^{\nu/2}\Gamma(\nu/2)}\exp(-w^c/2)
\left(\frac{w^c}{\nu}\right)^{p/2}\frac{1}{(2\pi)^{p/2}|\boldsymbol{\Sigma}|^{1/2}}
\exp\!\left(-\frac{\mathbf{y}^{*{\rm T}}\boldsymbol{\Sigma}^{-1}\mathbf{y}^*\,w^c}{2\nu}\right)
\frac{\Pr(Z \in S \mid \boldsymbol{\delta}^{\rm T}\mathbf{y}^*\sqrt{w^c/\nu}, \psi)}{\Pr(Z \in S \mid 0, \omega)}\,\mathrm{d}w
$$
$$
= \frac{\Gamma\{(\nu+p)/2\}}{(2\pi\nu)^{p/2}\,2^{\nu/2}\Gamma(\nu/2)\,|\boldsymbol{\Sigma}|^{1/2}}
\left(\frac{1}{2} + \frac{\mathbf{y}^{*{\rm T}}\boldsymbol{\Sigma}^{-1}\mathbf{y}^*}{2\nu}\right)^{-(\nu+p)/2}
$$
$$
\quad\times\int_0^\infty
\left(\frac{1}{2} + \frac{\mathbf{y}^{*{\rm T}}\boldsymbol{\Sigma}^{-1}\mathbf{y}^*}{2\nu}\right)^{(\nu+p)/2}
\frac{|c|\,w^{c\{(\nu+p)/2\}-1}}{\Gamma\{(\nu+p)/2\}}
\exp\!\left\{-\left(\frac{1}{2} + \frac{\mathbf{y}^{*{\rm T}}\boldsymbol{\Sigma}^{-1}\mathbf{y}^*}{2\nu}\right)w^c\right\}
\frac{\Pr(Z \in S \mid \boldsymbol{\delta}^{\rm T}\mathbf{y}^*\sqrt{w^c/\nu}, \psi)}{\Pr(Z \in S \mid 0, \omega)}\,\mathrm{d}w
$$
$$
= t(\mathbf{y}^* \mid \mathbf{0}, \boldsymbol{\Sigma}, \nu)
\int_0^\infty g_{{\rm Power}\text{-}\Gamma}\!\left[w \,\middle|\, \frac{\nu+p}{2},\ \frac{1 + \mathbf{y}^{*{\rm T}}(\nu\boldsymbol{\Sigma})^{-1}\mathbf{y}^*}{2},\ c\right]
\frac{\Pr(Z \in S \mid \boldsymbol{\delta}^{\rm T}\mathbf{y}^*\sqrt{w^c/\nu}, \psi)}{\Pr(Z \in S \mid 0, \omega)}\,\mathrm{d}w
$$
$$
= t(\mathbf{y}^* \mid \mathbf{0}, \boldsymbol{\Sigma}, \nu)
\left(1 + \frac{\delta^{*2}/\nu}{2b\psi}\right)^{-(\nu+p)/2}
\sum_{u=0}^\infty \frac{\Gamma\{(\nu+p+u)/2\}}{\Gamma\{(\nu+p)/2\}\,u!}
\left\{\frac{\delta^*}{\nu^{1/2}\psi}\left(b + \frac{\delta^{*2}/\nu}{2\psi}\right)^{-1/2}\right\}^{u}
\frac{{\rm E}\{Z^u \mid Z \sim {\rm N}(0, \psi);\ \mathbf{a}, \mathbf{b}\}}{\int_a^b \phi(z \mid 0, \omega)\,\mathrm{d}z},
$$

where Lemma 9.6 is used for the last equality, giving the required result. Q.E.D.

When $p = 1$ and $Z$ is normally distributed under single truncation as in the SN, we have a simplified result of Lemma 9.6.

Lemma 9.7 Suppose that $W \sim$ Power-$\Gamma\{(\nu+1)/2, b, c\}$, $a_1 = -\infty$ and $b_1 = 0$ with $R = 1$, i.e., $\Pr(Z \in S \mid \cdot) = \Pr(Z < 0 \mid \cdot)$. Then,

$$
{\rm E}_W\!\left[\Pr\{Z \in S \mid Z \sim {\rm N}(\delta^*\sqrt{W^c/\nu}, \psi)\}\right]
= {\rm E}_W\!\left[\int_{-\infty}^0 \frac{1}{\sqrt{2\pi\psi}}\exp\!\left\{-\left(z - \delta^*\sqrt{W^c/\nu}\right)^{2}\!\Big/(2\psi)\right\}\mathrm{d}z\right]
$$
$$
= \Pr\!\left\{Z^* < -\delta^*\sqrt{(\nu+1)/(2\nu b)} \,\middle|\, Z^* \sim {\rm St}(0, \psi, \nu+1)\right\}.
$$

Proof

$$
\int_0^\infty g_{{\rm Power}\text{-}\Gamma}\{w \mid (\nu+1)/2, b, c\}
\int_{-\infty}^0 \frac{1}{\sqrt{2\pi\psi}}\exp\!\left\{-\left(z - \delta^*\sqrt{w^c/\nu}\right)^{2}\!\Big/(2\psi)\right\}\mathrm{d}z\,\mathrm{d}w
$$
$$
= \int_0^\infty \frac{b^{(\nu+1)/2}\,|c|\,w^{c\{(\nu+1)/2\}-1}}{\Gamma\{(\nu+1)/2\}}\exp(-bw^c)
\int_{-\infty}^{-\delta^*\sqrt{w^c/\nu}} \frac{1}{\sqrt{2\pi\psi}}\exp\{-z^2/(2\psi)\}\,\mathrm{d}z\,\mathrm{d}w
$$
$$
= \int_0^\infty \frac{b^{(\nu+1)/2}\,|c|\,w^{c\{(\nu+1)/2\}-1}}{\Gamma\{(\nu+1)/2\}}\exp(-bw^c)
\int_{-\infty}^{-\delta^*} \frac{\sqrt{w^c/\nu}}{\sqrt{2\pi\psi}}
\exp\!\left\{-\frac{(z^{\#})^2 w^c/\nu}{2\psi}\right\}\mathrm{d}z^{\#}\,\mathrm{d}w
\qquad (z = z^{\#}\sqrt{w^c/\nu})
$$
$$
= \int_{-\infty}^{-\delta^*} \frac{1}{\sqrt{2\pi\nu\psi}}
\int_0^\infty \frac{b^{(\nu+1)/2}\,|c|\,w^{c\{(\nu+2)/2\}-1}}{\Gamma\{(\nu+1)/2\}}
\exp\!\left[-\left\{b + \frac{(z^{\#})^2/\nu}{2\psi}\right\}w^c\right]\mathrm{d}w\,\mathrm{d}z^{\#}
$$
$$
= \int_{-\infty}^{-\delta^*} \frac{\Gamma\{(\nu+2)/2\}\,b^{(\nu+1)/2}}{\sqrt{2\pi\nu\psi}\,\Gamma\{(\nu+1)/2\}}
\left\{b + \frac{(z^{\#})^2/\nu}{2\psi}\right\}^{-(\nu+2)/2}\mathrm{d}z^{\#}
= \int_{-\infty}^{-\delta^*} \frac{\Gamma\{(\nu+2)/2\}}{\sqrt{\pi}\,\sqrt{\nu}\,\Gamma\{(\nu+1)/2\}\sqrt{2\psi b}}
\left\{1 + \frac{(z^{\#})^2}{2\psi b\,\nu}\right\}^{-(\nu+2)/2}\mathrm{d}z^{\#}
$$
$$
= \int_{-\infty}^{-\delta^*\sqrt{(\nu+1)/\nu}} \frac{\Gamma\{(\nu+2)/2\}}{\sqrt{\pi(\nu+1)}\,\Gamma\{(\nu+1)/2\}\sqrt{2\psi b}}
\left\{1 + \frac{z^2}{2\psi b(\nu+1)}\right\}^{-(\nu+2)/2}\mathrm{d}z
$$
$$
= \Pr\!\left\{Z^* < -\delta^*\sqrt{(\nu+1)/\nu} \,\middle|\, Z^* \sim {\rm St}(0, 2\psi b, \nu+1)\right\}
= \Pr\!\left\{Z^* < -\delta^*\sqrt{(\nu+1)/(2\nu b)} \,\middle|\, Z^* \sim {\rm St}(0, \psi, \nu+1)\right\},
$$

which is the required result. Q.E.D.

Under the same condition for $Z$ when $p = 1$, we have the following special case of Theorem 9.4.


Corollary 9.5 The pdf of the pseudo t-distributed variable $Y^* = \sqrt{\nu}\,Y/\sqrt{W^c}$ at $Y^* = y^*$, where $Y \sim {\rm PN}_{1,1,1}(0, \sigma^2, \delta; 0, \psi, -\infty, 0)$ and $W \sim$ Power-$\Gamma(\nu/2, 1/2, c)$ independent of $Y$, is

$$
2\,t(y^* \mid 0, \sigma^2, \nu)\,
\Pr\!\left\{Z^* < -\delta y^*\sqrt{\frac{\nu+1}{\nu + (y^{*2}/\sigma^2)}} \,\middle|\, Z^* \sim {\rm St}(0, \psi, \nu+1)\right\}.
$$

Proof Use the second last result of the proof of Theorem 9.4 when $\mathbf{y}^* = y^*$, $\boldsymbol{\Sigma} = \sigma^2$ and $\boldsymbol{\delta} = \delta$ with $p = 1$ under the above condition. Then, Lemma 9.7 with $b = \{1 + \mathbf{y}^{*{\rm T}}(\nu\boldsymbol{\Sigma})^{-1}\mathbf{y}^*\}/2 = [1 + \{y^{*2}/(\nu\sigma^2)\}]/2$ gives

$$
\int_0^\infty g_{{\rm Power}\text{-}\Gamma}(w \mid \nu/2, 1/2, c)\left(\frac{w^c}{\nu}\right)^{p/2}
f_{p,1,R}(\mathbf{y}^*\sqrt{w^c/\nu} \mid \mathbf{0}, \boldsymbol{\Sigma}, \boldsymbol{\delta}^{\rm T}; 0, \psi, \mathbf{a}, \mathbf{b})\,\mathrm{d}w
$$
$$
= \int_0^\infty g_{{\rm Power}\text{-}\Gamma}(w \mid \nu/2, 1/2, c)\,(w^c/\nu)^{1/2}\,
f_{1,1,1}(y^*\sqrt{w^c/\nu} \mid 0, \sigma^2, \delta; 0, \psi, -\infty, 0)\,\mathrm{d}w
$$
$$
= t(y^* \mid 0, \sigma^2, \nu)\int_0^\infty
g_{{\rm Power}\text{-}\Gamma}\!\left[w \,\middle|\, \frac{\nu+1}{2},\ \left(1 + \frac{y^{*2}}{\nu\sigma^2}\right)\!\Big/2,\ c\right]
\frac{\Pr\{Z \in S \mid {\rm N}(\delta y^*\sqrt{w^c/\nu}, \psi)\}}{\Pr\{Z \in S \mid {\rm N}(0, \omega)\}}\,\mathrm{d}w
$$
$$
= 2\,t(y^* \mid 0, \sigma^2, \nu)\,
\Pr\!\left[Z^* < -\delta y^*\left\{\frac{\nu+1}{2\nu}\right\}^{1/2}
\left\{\frac{1}{2}\left(1 + \frac{y^{*2}}{\nu\sigma^2}\right)\right\}^{-1/2}
\,\middle|\, Z^* \sim {\rm St}(0, \psi, \nu+1)\right]
$$
$$
= 2\,t(y^* \mid 0, \sigma^2, \nu)\,
\Pr\!\left\{Z^* < -\delta y^*\sqrt{\frac{\nu+1}{\nu + (y^{*2}/\sigma^2)}} \,\middle|\, Z^* \sim {\rm St}(0, \psi, \nu+1)\right\},
$$

where $\int_{-\infty}^0 \phi(z \mid 0, \omega = \delta^2\sigma^2 + \psi)\,\mathrm{d}z = 1/2$ is used, giving the required result. Q.E.D.

In Corollary 9.5, when $\sigma^2 = \psi = 1$ and $\delta = -\lambda$, we have the result for the SN with pdf $2\phi(y)\Phi(\lambda y)$ at $Y = y$, which was obtained using the chi-square distributed $W$ by Gupta et al. [11, Eq. 4.2] and Gupta [9, Eq. (2.4)]. Their former expression of the pdf of the skew t (ST) is written as


$$
2\,t(y^* \mid 0, 1, \nu)\,
\Pr\!\left\{Z^* < \lambda y^*\sqrt{\frac{\nu+1}{\nu(\nu + y^{*2})}} \,\middle|\, Z^* \sim {\rm St}(0, 1, \nu+1)\right\}
$$

in our notation, where the denominator $\nu(\nu + y^{*2})$, due to a possible typo, should be read as $\nu + y^{*2}$; the corrected form is obtained as a special case of Corollary 9.5. After this correction, it is confirmed that when $\nu$ goes to infinity the pdf becomes

$$
2\phi(y^* \mid 0, 1)\Pr\{Z^* < \lambda y^* \mid Z^* \sim {\rm N}(0, 1)\} = 2\phi(y^*)\Phi(\lambda y^*),
$$

i.e., the pdf of the SN as expected, rather than $\phi(y^* \mid 0, 1)$ unless corrected. Note that Theorem 9.4 and Corollary 9.5 are extensions of the pdf of the ST given by Gupta et al. to those for the PT under hidden sectional truncation, or the ST using the power-gamma distribution, for generality. Gupta et al.'s pdf quoted above is based on Gupta et al. [11, Lemma 3] and Gupta [9, Lemma 1], where the former result is

$$
{\rm E}_W\{\Phi(c\sqrt{W}) \mid W \sim \chi^2(\nu)\}
= \Pr\{Z^* < c\sqrt{\nu} \mid Z^* \sim {\rm St}(0, 1, \nu)\}, \quad c \in \mathbb{R}.
$$

Their lemma is seen as a special case of Lemma 9.7 in which $c = -\delta^*/\sqrt{\nu}$ and $W \sim$ Power-$\Gamma\{(\nu+1)/2, b, c\}$ becomes $W \sim$ Power-$\Gamma(\nu/2, 1/2, 1)$ with $b = 1/2$, or equivalently $W \sim \chi^2(\nu)$. For ease of reference and comparison, the corresponding extended expression equivalent to Lemma 9.7 is given as follows.

Lemma 9.8 Suppose that $W \sim$ Power-$\Gamma(\nu/2, b, c)$. Then,

$$
{\rm E}_W\!\left[\Pr\{Z < c\sqrt{W^c} \mid Z \sim {\rm N}(0, \psi)\}\right]
= \Pr\{Z^* < c\sqrt{\nu/(2b)} \mid Z^* \sim {\rm St}(0, \psi, \nu)\}, \quad c \in \mathbb{R}.
$$

Proof In Lemma 9.7, employ the reparametrization $c = -\delta^*/\sqrt{\nu}$ followed by the transformation of $W \sim$ Power-$\Gamma\{(\nu+1)/2, b, c\}$ to $W \sim$ Power-$\Gamma(\nu/2, b, c)$. Q.E.D.

Lemma 9.8 is an extension of Gupta et al.'s lemma using their expression with the added scale parameters $b$ and $\psi$, and the power gamma with the shape parameter $c$.
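Gupta et al.'s lemma quoted above is easy to verify numerically: the left side is a Monte Carlo average of normal cdf values over chi-square draws, and the right side is a t-probability obtained by midpoint-rule integration of the t density. A stdlib-only sketch (helper names are mine):

```python
import math
import random

random.seed(3)
v, c = 6, 0.7

def Phi(x):
    # standard normal cdf via erf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def chi2(df):
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df))

def t_pdf(z, df):
    return (math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
            * (1 + z * z / df) ** (-(df + 1) / 2))

n = 100000
lhs = sum(Phi(c * math.sqrt(chi2(v))) for _ in range(n)) / n

# P(Z* < c * sqrt(v)) for Z* ~ St(0, 1, v), via the midpoint rule on [0, c*sqrt(v)]
upper, m = c * math.sqrt(v), 20000
h = upper / m
rhs = 0.5 + h * sum(t_pdf((i + 0.5) * h, v) for i in range(m))
assert abs(lhs - rhs) < 0.005
```

Since Gupta et al.'s lemma is the $b = 1/2$, power-one special case of Lemma 9.7 (after the df shift noted above), this also exercises the reconstructed Lemma 9.7 indirectly.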

9.4.2 Moments and Cumulants of the PT

For the derivation of the moments and cumulants of the PT, the following results are used.

Property 9.1 Let $W$ be a random variable and $\mathbf{Y}$ a random vector independent of $W$. Then,


$$
{\rm E}(W\mathbf{Y}) = {\rm E}(W)\,{\rm E}(\mathbf{Y}),
$$
$$
{\rm cov}(W\mathbf{Y}) = {\rm E}(W^2\mathbf{Y}\mathbf{Y}^{\rm T}) - {\rm E}(W\mathbf{Y})\,{\rm E}(W\mathbf{Y}^{\rm T})
= \{{\rm var}(W) + {\rm E}^2(W)\}\{{\rm cov}(\mathbf{Y}) + {\rm E}(\mathbf{Y})\,{\rm E}(\mathbf{Y}^{\rm T})\}
- {\rm E}^2(W)\,{\rm E}(\mathbf{Y})\,{\rm E}(\mathbf{Y}^{\rm T})
$$
$$
= {\rm var}(W)\,{\rm cov}(\mathbf{Y}) + {\rm var}(W)\,{\rm E}(\mathbf{Y})\,{\rm E}(\mathbf{Y}^{\rm T}) + {\rm E}^2(W)\,{\rm cov}(\mathbf{Y}),
$$
$$
\kappa_3(W\mathbf{Y}) = {\rm E}\{(W\mathbf{Y})^{\langle 3\rangle}\}
- \sum^{3}{\rm E}\{(W\mathbf{Y})^{\langle 2\rangle}\}\otimes{\rm E}(W\mathbf{Y})
+ 2\,{\rm E}^{\langle 3\rangle}(W\mathbf{Y})
$$
$$
= {\rm E}(W^3)\,{\rm E}(\mathbf{Y}^{\langle 3\rangle})
- {\rm E}(W^2)\,{\rm E}(W)\sum^{3}{\rm E}(\mathbf{Y}^{\langle 2\rangle})\otimes{\rm E}(\mathbf{Y})
+ 2\,{\rm E}^3(W)\,{\rm E}^{\langle 3\rangle}(\mathbf{Y}),
$$
$$
\kappa_4(W\mathbf{Y}) = {\rm E}\{(W\mathbf{Y})^{\langle 4\rangle}\}
- \sum^{4}{\rm E}\{(W\mathbf{Y})^{\langle 3\rangle}\}\otimes{\rm E}(W\mathbf{Y})
+ \sum^{6}{\rm E}\{(W\mathbf{Y})^{\langle 2\rangle}\}\otimes{\rm E}^{\langle 2\rangle}(W\mathbf{Y})
- 3\,{\rm E}^{\langle 4\rangle}(W\mathbf{Y})
$$
$$
= {\rm E}(W^4)\,{\rm E}(\mathbf{Y}^{\langle 4\rangle})
- {\rm E}(W^3)\,{\rm E}(W)\sum^{4}{\rm E}(\mathbf{Y}^{\langle 3\rangle})\otimes{\rm E}(\mathbf{Y})
+ {\rm E}(W^2)\,{\rm E}^2(W)\sum^{6}{\rm E}(\mathbf{Y}^{\langle 2\rangle})\otimes{\rm E}^{\langle 2\rangle}(\mathbf{Y})
- 3\,{\rm E}^4(W)\,{\rm E}^{\langle 4\rangle}(\mathbf{Y}).
$$

Property 9.2 Let $W$ be an inverse-chi distributed variable with $\nu$ df multiplied by $\sqrt{\nu}$. Then, Lemma 9.2 gives

$$
{\rm E}(W) = \frac{\sqrt{\nu}\,\Gamma\{(\nu-1)/2\}}{2^{1/2}\Gamma(\nu/2)}, \quad \nu > 1;
\qquad
{\rm E}(W^2) = \frac{\nu}{\nu-2}, \quad \nu > 2;
$$
$$
{\rm var}(W) = \frac{\nu}{\nu-2} - \frac{\nu\,\Gamma^2\{(\nu-1)/2\}}{2\,\Gamma^2(\nu/2)}, \quad \nu > 2;
$$
$$
{\rm E}(W^3) = \frac{\nu^{3/2}\,\Gamma\{(\nu-3)/2\}}{2^{3/2}\Gamma(\nu/2)}
= \frac{\nu^{3/2}\,\Gamma\{(\nu-1)/2\}}{2^{1/2}(\nu-3)\,\Gamma(\nu/2)}, \quad \nu > 3;
\qquad
{\rm E}(W^4) = \frac{\nu^2}{(\nu-2)(\nu-4)}, \quad \nu > 4.
$$

Property 9.3 Let $\mathbf{Y} \sim {\rm PN}_{p,1,R}\{\boldsymbol{\mu} = \mathbf{0}, \boldsymbol{\Sigma}, \boldsymbol{\Delta} = \boldsymbol{\delta}^{\rm T}; \eta = 0, \boldsymbol{\Psi} = \psi, \mathbf{A} = (a_1, \ldots, a_R), \mathbf{B} = (b_1, \ldots, b_R)\}$ with $\boldsymbol{\Omega} = \omega = \psi + \boldsymbol{\delta}^{\rm T}\boldsymbol{\Sigma}\boldsymbol{\delta}$. Define $\alpha = \Pr(Z \in S \mid \eta = 0, \omega)$ and $\tilde{\boldsymbol{\delta}} = \boldsymbol{\Sigma}\boldsymbol{\delta}$. Then, Corollary 4.4 gives


$$
{\rm E}(\mathbf{Y}) = -\alpha^{-1}\tilde{\boldsymbol{\delta}}\sum_{r=1}^R\{\phi(b_r \mid 0, \omega) - \phi(a_r \mid 0, \omega)\},
$$
$$
{\rm cov}(\mathbf{Y}) = \tilde{\boldsymbol{\delta}}\tilde{\boldsymbol{\delta}}^{\rm T}
\left(-\alpha^{-1}\omega^{-1}\sum_{r=1}^R\{b_r\phi(b_r \mid 0, \omega) - a_r\phi(a_r \mid 0, \omega)\}
- \alpha^{-2}\left[\sum_{r=1}^R\{\phi(b_r \mid 0, \omega) - \phi(a_r \mid 0, \omega)\}\right]^2\right)
+ \boldsymbol{\Sigma},
$$
$$
{\rm E}(\mathbf{Y}^{\langle 2\rangle})
= -\alpha^{-1}\omega^{-1}\tilde{\boldsymbol{\delta}}^{\langle 2\rangle}
\sum_{r=1}^R\{b_r\phi(b_r \mid 0, \omega) - a_r\phi(a_r \mid 0, \omega)\} + {\rm vec}(\boldsymbol{\Sigma}),
$$
$$
{\rm E}(\mathbf{Y}^{\langle 3\rangle})
= \alpha^{-1}\sum_{r=1}^R\Bigl[
\tilde{\boldsymbol{\delta}}^{\langle 3\rangle}
\bigl[-\omega^{-2}\{b_r^2\phi(b_r \mid 0, \omega) - a_r^2\phi(a_r \mid 0, \omega)\}
+ \omega^{-1}\{\phi(b_r \mid 0, \omega) - \phi(a_r \mid 0, \omega)\}\bigr]
$$
$$
\qquad\qquad
- \sum^{3}\tilde{\boldsymbol{\delta}}\otimes{\rm vec}(\boldsymbol{\Sigma})\,\{\phi(b_r \mid 0, \omega) - \phi(a_r \mid 0, \omega)\}
\Bigr],
$$
$$
{\rm E}(\mathbf{Y}^{\langle 4\rangle})
= \alpha^{-1}\sum_{r=1}^R\Bigl[
\tilde{\boldsymbol{\delta}}^{\langle 4\rangle}
\bigl[-\omega^{-3}\{b_r^3\phi(b_r \mid 0, \omega) - a_r^3\phi(a_r \mid 0, \omega)\}
+ 3\omega^{-2}\{b_r\phi(b_r \mid 0, \omega) - a_r\phi(a_r \mid 0, \omega)\}\bigr]
$$
$$
\qquad\qquad
- \sum^{6}\tilde{\boldsymbol{\delta}}^{\langle 2\rangle}\otimes{\rm vec}(\boldsymbol{\Sigma})\,
\omega^{-1}\{b_r\phi(b_r \mid 0, \omega) - a_r\phi(a_r \mid 0, \omega)\}
\Bigr]
+ \sum^{3}{\rm vec}^{\langle 2\rangle}(\boldsymbol{\Sigma}).
$$

Theorem 9.5 The moments and cumulants of the PT in Theorem 9.4 up to the fourth order are given by substituting the results of Properties 9.2 and 9.3 into those of Property 9.1.

Proof Note that the PT-distributed vector is given by $\mathbf{Y}/\sqrt{W^c/\nu}$, where $W \sim$ Power-$\Gamma(\nu/2, 1/2, c)$. When $c = -2$, $W$ is inverse chi-distributed with $\nu$ df independent of $\mathbf{Y}$; then $\mathbf{Y}/\sqrt{W^c/\nu} = \sqrt{\nu}\,W\mathbf{Y}$, with $\sqrt{\nu}\,W$ being the variable of Property 9.2, gives the required results. Q.E.D.
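The moments in Property 9.2 are straightforward to spot-check by simulation of $W = \sqrt{\nu}\,/\chi_\nu$, i.e., $\sqrt{\nu}$ times an inverse-chi variate. A stdlib-only Monte Carlo sketch (helper names are mine), comparing the first two sample moments with ${\rm E}(W)$ and ${\rm E}(W^2) = \nu/(\nu-2)$:

```python
import math
import random

random.seed(4)
v = 9
n = 100000

def chi(df):
    # chi-distributed draw with df degrees of freedom
    return math.sqrt(sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df)))

# W = sqrt(v) times an inverse-chi variate with v df (Property 9.2)
ws = [math.sqrt(v) / chi(v) for _ in range(n)]
m1 = sum(ws) / n
m2 = sum(w * w for w in ws) / n
e1 = math.sqrt(v) * math.gamma((v - 1) / 2) / (2 ** 0.5 * math.gamma(v / 2))
assert abs(m1 - e1) < 0.01
assert abs(m2 - v / (v - 2)) < 0.02
```

The tolerances are many Monte Carlo standard errors wide; tightening them would require larger samples.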


References

1. Azzalini A, Capitanio A (2003) Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J Royal Stat Soc B 65:367–389
2. Bishop CM (2006) Pattern recognition and machine learning. Springer, New York
3. Branco MD, Dey DK (2001) A general class of multivariate skew-elliptical distributions. J Multivar Anal 79:99–113
4. Cornish EA (1954) The multivariate t-distribution associated with a set of normal sample deviates. Aust J Phys 7:531–542
5. Crooks GE (2015) The Amoroso distribution. arXiv:1005.3274v2 [math.ST] 13 Jul 2015
6. Fung T, Seneta E (2010) Tail dependence for two skew t distributions. Statist Probab Lett 80:784–791
7. Galarza CE, Matos LA, Castro LM, Lachos VH (2022) Moments of the doubly truncated selection elliptical distributions with emphasis on the unified multivariate skew-t distribution. J Multivar Anal (online published). https://doi.org/10.1016/j.jmva.2021.104944
8. Genton MG (2004) Skew-symmetric and generalized skew-elliptical distributions. In: Genton MG (ed) Skew-elliptical distributions and their applications. Chapman & Hall/CRC, Boca Raton, FL, pp 81–100
9. Gupta AK (2003) Multivariate skew t-distribution. Statistics 37:359–363
10. Gupta AK, Chang F-C (2003) Multivariate skew-symmetric distributions. Appl Math Lett 16:643–646
11. Gupta AK, Chang FC, Huang WJ (2002) Some skew-symmetric models. Random Oper Stoch Equ 10:133–140
12. Joe H, Li H (2019) Tail densities of skew-elliptical distributions. J Multivar Anal 171:421–435
13. Kirkby JL, Nguyen D, Nguyen D (2019) Moments of Student's t-distribution: a unified approach. arXiv:1912.01607v1 [math.PR] 3 Dec 2019
14. Kirkby JL, Nguyen D, Nguyen D (2021) Moments of Student's t-distribution: a unified approach. arXiv:1912.01607v2 [math.PR] 26 Mar 2021
15. Kollo T, Käärik M, Selart A (2021) Multivariate skew t-distribution: asymptotics for parameter estimators and extension to skew t-copula. Symmetry 13:1059. https://doi.org/10.3390/sym13061059
16. Kollo T, Pettere G (2010) Parameter estimation and application of the multivariate skew t-copula. In: Jaworski P, Durante F, Härdle W, Rychlik T (eds) Copula theory and its applications. Springer, Berlin, pp 289–298
17. Kotz S, Nadarajah S (2004) Multivariate t distributions and their applications. Cambridge University Press, Cambridge
18. Lachos VH, Garay AM, Cabral CRB (2020) Moments of truncated scale mixtures of skew-normal distributions. Braz J Probab Stat 34:478–494
19. Nadarajah S, Ali MM (2004) A skewed truncated t distribution. Math Comput Model 40:935–939
20. Nadarajah S, Gupta AK (2005) A skewed truncated Pearson type VII distribution. J Jpn Stat Soc 35:61–71
21. Nadarajah S, Kotz S (2003) Skewed distributions generated by the normal kernel. Statist Probab Lett 65:269–277
22. Stacy EW (1962) A generalization of the gamma distribution. Ann Math Stat 33:1187–1192
23. Stuart A, Ord JK (1994) Kendall's advanced theory of statistics: distribution theory, 6th edn, vol 1. Arnold, London

Chapter 10
Multivariate Measures of Skewness and Kurtosis

10.1 Preliminaries

In previous chapters, the (left) Kronecker product denoted by, e.g., $\mathbf{A} \otimes \mathbf{B} = \{a_{ij}\mathbf{B}\}\ (i = 1, \ldots, m;\ j = 1, \ldots, n)$ (Magnus and Neudecker [14, Sect. 2, Chap. 2]) and $\mathbf{A}^{\langle 2\rangle} = \mathbf{A} \otimes \mathbf{A}$ was used extensively; this product is also called the direct or tensor product, where $a_{ij}\mathbf{B}\ (p \times q)$ is the $(i, j)$th block or submatrix of $\mathbf{A} \otimes \mathbf{B}$. In the literature, the notations $\mathbf{A}^{\langle k\rangle} = \prod_{i=1}^k{}^{\otimes}\mathbf{A} = \otimes_{i=1:k}\mathbf{A} = \otimes_{i=1}^k \mathbf{A} = \mathbf{A} \otimes \cdots \otimes \mathbf{A}$ ($k$ factors of $\mathbf{A}$) with $1{:}k = (1, 2, \ldots, k)$ are used synonymously (for the first two, see Holmquist [6, 7], Kano [9, p. 182] and Meijer [16, p. 122]; for the others, see Terdik [21, 22]). The vectorizing operator

$$
{\rm vec}(\mathbf{A}) = \left(a_{11}, a_{21}, \ldots, a_{m-1,n}, a_{mn}\right)^{\rm T}
= \left\{(\mathbf{A})_1^{\rm T}, \ldots, (\mathbf{A})_n^{\rm T}\right\}^{\rm T}
$$

was also used along with the Kronecker product, where $(\mathbf{A})_j\ (m \times 1)$ is the $j$th column of $\mathbf{A}$ [$(\mathbf{A})^i\ (1 \times n)$ denotes the $i$th row of $\mathbf{A}$]. The commutation matrix (Magnus and Neudecker [12, 14, Sect. 7, Chap. 3]) or commutator (Terdik [22]) is useful when we consider multivariate moments and cumulants, especially when their higher-order cases are dealt with. The commutation matrix is seen as a special case of the commutator, mathematically defined by the operator $C(\cdot)$ in $C(\mathbf{A} \circ \mathbf{B}) = \mathbf{B} \circ \mathbf{A}$, where $\circ$ indicates an operation for which $\mathbf{B} \circ \mathbf{A}$ does not commute, e.g., matrix multiplication, and $\mathbf{A}$ and $\mathbf{B}$ are mathematical quantities/symbols defined variously. In this book, the commutator indicates the commutation matrix for short. A commutation matrix, denoted by the $mn \times mn$ matrix $\mathbf{K}_{mn}$, is a permutation matrix giving the relation

$$
{\rm vec}(\mathbf{A}^{\rm T}) = \mathbf{K}_{mn}\,{\rm vec}(\mathbf{A})
$$

for an $m \times n$ matrix $\mathbf{A}$ and equivalently

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. H. Ogasawara, Expository Moments for Pseudo Distributions, Behaviormetrics: Quantitative Approaches to Human Behavior 2, https://doi.org/10.1007/978-981-19-3525-1_10


a1  a2 ¼ Kmn ða2  a1 Þ for m  1 and n  1 vectors a1 and a2 , respectively. Note that the mn elements of vecðAT Þ are a permutation of those of vecðAÞ obtained by the pre-multiplication of vecðAÞ by Kmn . A k  k permutation matrix P is defined by interchanging some or all of the rows or columns of the k  k identity matrix Ik . That is, every row or column of P has only an element of 1 with the others being zero. Assume that in a permutation matrix P ¼ fpij g, the i-row has a single one in the j-th column, i.e., pij ¼ 1 ði 6¼ jÞ. In this case, it is found that we also have pji ¼ 1 due to the definition of P. The i-th and j-th rows of P in PC for a k  l matrix or a vector C give the interchange of the i-th and j-th rows of C. The remaining rows of P have similar roles. When pii ¼ 1, the i-th row of C is unchanged. In vecðAT Þ ¼ Kmn vecðAÞ, we find that fi þ ðj  1Þmg-th element of vecðAÞ is moved to the fj þ ði  1Þng-th element of vecðAT Þ ði ¼ 1; :::; m; j ¼ 1; :::; nÞ after permutation due to the definition of vecðÞ. Since Kmn is a permutation matrix, in this case fj þ ði  1Þng-th element of vecðAÞ is also moved to the fi þ ðj  1Þmg-th element of vecðAT Þ. Some examples of Kmn are given for exposition. K11 ¼ 1; K12 ¼ K21 ¼ I2 ; " K22 ¼ KT22 ¼

ð11Þ

K22

ð21Þ

K22

# ð12Þ

K22

ð22Þ

K22

2

1

60 6 ¼6 40 0

K222

0

3

0

0

0 1

1 0

07 7 7; 05

0

0

1

¼ I4 ðijÞ

and K13 ¼ K31 ¼ I3 are easily obtained, where the 2  2 submatrices K22 ði; j ¼   1 3 5 1; 2Þ each have a single 1 for the (j, i)th element. Since vec ¼ 2 4 6 2 3 1 2 ½1; 2; 3; 4; 5; 6T and vec4 3 4 5 ¼ ½1; 3; 5; 2; 4; 6T , we have 5 6 2 3 1 637 " 6 7 ð11Þ 657 6 7 ¼ K23 ð21Þ 627 K23 6 7 445 6

ð12Þ

K23 ð22Þ K23

2 3 2 1 1 60 27 #6 6 7 6 ð13Þ 7 6 K23 6 637 60 ð23Þ 6 4 7 ¼ 6 0 K23 6 7 6 455 40 0 6 ðijÞ

0 0 0 1 0 0

0 1 0 0 0 0

0 0 0 0 1 0

0 0 1 0 0 0

32 3 2 3 1 0 1 6 7 627 07 76 2 7 6 7 6 7 6 7 07 76 3 7 ¼ K23 6 3 7; 7 6 7 647 0 76 4 7 6 7 455 0 54 5 5 6 1 6

where the 3  2 submatrices K23 ði ¼ 1; 2; j ¼ 1; 2; 3Þ each have a single 1 for the (j, i)th element. Similarly, we obtain

10.1

Preliminaries

301

2 3 1 6 2 7 2 ð11Þ 6 7 6 3 7 6 K32 6 7 ¼ 4 Kð21Þ 647 32 6 7 ð31Þ 455 K32 6

2 3 2 1 1 3 60 ð12Þ 6 2 7 7 6 K32 6 6 7 60 ð22Þ 76 3 7 K32 56 4 7 ¼ 6 60 6 ð32Þ 6 7 K32 4 5 5 4 0 0 6

0 0 0 1 0 0

0 1 0 0 0 0

0 0 0 0 1 0

0 0 1 0 0 0

32 3 2 3 1 0 1 627 627 07 76 7 6 7 6 7 6 7 07 76 3 7 ¼ K32 6 3 7; 647 647 07 76 7 6 7 5 4 5 455 5 0 6 1 6

ðijÞ

where the 2  3 submatrices K32 ði ¼ 1; 2; 3; j ¼ 1; 2Þ each have a single 1 for the (j, i)th element. In the above results, K23 ¼ KT32 and K32 K23 ¼ K23 K32 ¼ I6 with K23 6¼ KT23 and K32 6¼ KT32 are found. For K33 , we have 2 3 1 647 6 7 6 7 677 6 7 2 ð11Þ 6 7 K 627 6 7 6 33 6 5 7 ¼ 6 Kð21Þ 6 7 4 33 6 7 ð31Þ 687 K33 6 7 637 6 7 6 7 465 9

2 3 2 1 1 627 60 6 7 6 6 7 6 637 60 6 7 6 3 ð13Þ 6 7 K33 6 4 7 6 60 6 7 6 7 ð23Þ 6 7 6 ¼ 5 K33 7 56 7 6 0 6 7 6 ð33Þ 6 6 7 0 K33 6 7 6 6 677 60 6 7 6 6 7 6 485 40 0 9

ð12Þ

K33

ð22Þ

K33

ð32Þ

K33

0 0 0 0 0 1 0 1 0 0 0 0 0

0 0 0 0 1 0 0

0 0 0 0 0 0 0

32 3 2 3 1 0 0 0 0 0 1 7 6 7 7 6 0 0 0 0 0 76 2 7 627 76 7 6 7 7 6 7 7 6 0 0 1 0 0 76 3 7 637 76 7 6 7 0 0 0 0 0 76 4 7 647 76 7 6 7 6 5 7 ¼ K33 6 5 7; 1 0 0 0 07 76 7 6 7 76 7 6 7 0 0 0 1 0 76 6 7 667 76 7 6 7 7 6 7 677 0 0 0 0 0 76 7 7 6 7 76 7 6 7 0 1 0 0 0 54 8 5 485 9 0 0 0 0 1 9

where the $3\times 3$ submatrices $K_{33}^{(ij)}\ (i,j=1,2,3)$ each have a single 1 for the $(j,i)$th element as before. It is seen that $K_{33}=K_{33}^{\mathrm T}$ and $K_{33}^2=I_9$. The powers $K_{ij}^k\ (i\ne j;\ k=2,3,\ldots)$ are usually unused since they are not commutation matrices; nevertheless they are permutation matrices, which are of some use to see a property of $K_{ij}$. An example is shown:

$$
K_{23}^2=\begin{pmatrix}1&0&0&0&0&0\\0&0&1&0&0&0\\0&0&0&0&1&0\\0&1&0&0&0&0\\0&0&0&1&0&0\\0&0&0&0&0&1\end{pmatrix}^2
=\begin{pmatrix}1&0&0&0&0&0\\0&0&0&0&1&0\\0&0&0&1&0&0\\0&0&1&0&0&0\\0&1&0&0&0&0\\0&0&0&0&0&1\end{pmatrix},
$$

10 Multivariate Measures of Skewness and Kurtosis

$$
K_{23}^2\begin{pmatrix}1\\2\\3\\4\\5\\6\end{pmatrix}
=K_{23}\begin{pmatrix}1\\3\\5\\2\\4\\6\end{pmatrix}
=\begin{pmatrix}1&0&0&0&0&0\\0&0&1&0&0&0\\0&0&0&0&1&0\\0&1&0&0&0&0\\0&0&0&1&0&0\\0&0&0&0&0&1\end{pmatrix}\begin{pmatrix}1\\3\\5\\2\\4\\6\end{pmatrix}
=\begin{pmatrix}1\\5\\4\\3\\2\\6\end{pmatrix}.
$$
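The number of distinct powers of $K_{23}$ can also be found by iterating the underlying permutation directly; the sketch below (our hypothetical helper names, pure Python) finds that $K_{23}$ happens to have order 4 here, well under the trivial bound mentioned next.

```python
def commutation_perm(m, n):
    # represent K_mn by the index map sigma with (K x)[k] = x[sigma[k]]
    sigma = [0] * (m * n)
    for i in range(m):
        for j in range(n):
            sigma[j + i * n] = i + j * m
    return sigma

def compose(s, t):
    # permutation for applying t after s: result[k] = t[s[k]]
    return [t[s[k]] for k in range(len(s))]

s23 = commutation_perm(2, 3)
power, order = s23, 1
while power != list(range(6)):      # iterate until K23^k = I6
    power = compose(power, s23)
    order += 1
print(order)                        # 4
```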

Among $K_{23}^k\ (k=1,2,\ldots)$, the number of distinct ones is found to be at most $6!$. Next, we obtain the explicit or algebraic expression of $K_{mn}$. Define the $m\times n$ matrix $E_{ij}$ whose $(i,j)$th element is 1 with the remaining ones being zero. Let $e_{(k)}$ be the vector of an appropriate dimension whose $k$th element is 1 with the other elements being zero.

Lemma 10.1 The commutation matrix satisfying $\operatorname{vec}(A^{\mathrm T})=K_{mn}\operatorname{vec}(A)$ is algebraically expressed in several ways:

$$
\begin{aligned}
K_{mn}&=\sum_{i=1}^{m}\sum_{j=1}^{n}\operatorname{vec}(E_{ji})\,e_{\{i+(j-1)m\}}^{\mathrm T}
=\sum_{i,j}^{m,n}e_{\{j+(i-1)n\}}\operatorname{vec}^{\mathrm T}(E_{ij})\\
&=\sum_{i,j}^{m,n}e_{\{j+(i-1)n\}}e_{\{i+(j-1)m\}}^{\mathrm T}
=\sum_{i,j}^{m,n}\operatorname{vec}(E_{ji})\operatorname{vec}^{\mathrm T}(E_{ij})\\
&=\sum_{i,j}^{m,n}\bigl(e_{(i)}\otimes e_{(j)}\bigr)\bigl(e_{(j)}\otimes e_{(i)}\bigr)^{\mathrm T}
=\sum_{i,j}^{m,n}\bigl(e_{(i)}e_{(j)}^{\mathrm T}\bigr)\otimes\bigl(e_{(j)}e_{(i)}^{\mathrm T}\bigr)\\
&=\sum_{i,j}^{m,n}E_{ij}\otimes E_{ij}^{\mathrm T}
=\sum_{i,j}^{m,n}E_{ij}\otimes E_{ji}
=\sum_{i,j}^{m,n}E_{ij}\otimes K_{mn}^{(ij)},
\end{aligned}
$$

where $e_{(i)}$, $e_{(j)}$ and $e_{\{\cdot\}}$ are $m\times1$, $n\times1$ and $mn\times1$ vectors while $E_{ij}$ and $E_{ji}$ are $m\times n$ and $n\times m$ matrices, respectively; and $K_{mn}^{(ij)}=E_{ji}=e_{(j)}e_{(i)}^{\mathrm T}$ is the $n\times m$ submatrix for the $(i,j)$th block, which has a single 1 for the $(j,i)$th element of the submatrix $(j=1,\ldots,n;\ i=1,\ldots,m)$. When $m=n$, we have

$$
\begin{aligned}
K_{mm}&=\sum_{i=1}^{m}\sum_{j=1}^{m}\operatorname{vec}(E_{ji})\,e_{\{i+(j-1)m\}}^{\mathrm T}
=\sum_{i,j}^{m}e_{\{j+(i-1)m\}}\operatorname{vec}^{\mathrm T}(E_{ij})\\
&=\sum_{i,j}^{m}e_{\{j+(i-1)m\}}e_{\{i+(j-1)m\}}^{\mathrm T}
=\sum_{i,j}^{m}\operatorname{vec}(E_{ji})\operatorname{vec}^{\mathrm T}(E_{ij})
=\sum_{i,j}^{m}\operatorname{vec}(E_{ij})\operatorname{vec}^{\mathrm T}(E_{ji})\\
&=\sum_{i,j}^{m}\bigl(e_{(j)}\otimes e_{(i)}\bigr)\bigl(e_{(i)}\otimes e_{(j)}\bigr)^{\mathrm T}
=\sum_{i,j}^{m}\bigl(e_{(i)}\otimes e_{(j)}\bigr)\bigl(e_{(j)}\otimes e_{(i)}\bigr)^{\mathrm T}\\
&=\sum_{i,j}^{m}\bigl(e_{(j)}e_{(i)}^{\mathrm T}\bigr)\otimes\bigl(e_{(i)}e_{(j)}^{\mathrm T}\bigr)
=\sum_{i,j}^{m}\bigl(e_{(i)}e_{(j)}^{\mathrm T}\bigr)\otimes\bigl(e_{(j)}e_{(i)}^{\mathrm T}\bigr)\\
&=\sum_{i,j}^{m}E_{ji}\otimes E_{ij}
=\sum_{i,j}^{m}E_{ij}\otimes E_{ji}
=\sum_{i,j}^{m}E_{ij}\otimes K_{mm}^{(ij)}.
\end{aligned}
$$
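Lemma 10.1 can be verified elementwise. The sketch below (our helper names, pure Python) builds $\sum_{i,j}E_{ij}\otimes E_{ij}^{\mathrm T}$ and compares it with the permutation-based construction of $K_{mn}$.

```python
def kron(A, B):
    # standard Kronecker product for matrices stored as lists of rows
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def unit_matrix(m, n, i, j):
    # m x n matrix E_ij with a single 1 in position (i, j) (0-based)
    M = [[0] * n for _ in range(m)]
    M[i][j] = 1
    return M

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def commutation(m, n):
    K = [[0] * (m * n) for _ in range(m * n)]
    for i in range(m):
        for j in range(n):
            K[j + i * n][i + j * m] = 1
    return K

def K_from_lemma(m, n):
    # K_mn = sum_{i,j} E_ij (x) E_ij^T  (one expression in Lemma 10.1)
    S = [[0] * (m * n) for _ in range(m * n)]
    for i in range(m):
        for j in range(n):
            S = mat_add(S, kron(unit_matrix(m, n, i, j), unit_matrix(n, m, j, i)))
    return S

print(K_from_lemma(2, 3) == commutation(2, 3))  # True
print(K_from_lemma(3, 3) == commutation(3, 3))  # True
```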

Proof Noting that
$$
A=\sum_{i,j}^{m,n}E_{ij}a_{ij}\quad\text{and}\quad
A^{\mathrm T}=\sum_{i,j}^{m,n}E_{ij}^{\mathrm T}a_{ij}=\sum_{i,j}^{m,n}E_{ji}a_{ij},
$$
we have
$$
\operatorname{vec}\bigl(A^{\mathrm T}\bigr)=\sum_{i,j}^{m,n}\operatorname{vec}(E_{ji})\,a_{ij}
=\sum_{i,j}^{m,n}\operatorname{vec}(E_{ji})\{\operatorname{vec}(A)\}_{i+(j-1)m}
=\sum_{i,j}^{m,n}\operatorname{vec}(E_{ji})\,e_{\{i+(j-1)m\}}^{\mathrm T}\operatorname{vec}(A),
$$

where $\{\cdot\}_k$ is the $k$th element of the vector in braces. The last result gives the expression $K_{mn}=\sum_{i,j}^{m,n}\operatorname{vec}(E_{ji})\,e_{\{i+(j-1)m\}}^{\mathrm T}$. The alternative expressions for $K_{mn}$ are obtained from this expression. The results when $m=n$ are given using $K_{mm}=K_{mm}^{\mathrm T}$. Q.E.D.

The expressions of $K_{mn}$ in Lemma 10.1, e.g., $K_{mn}=\sum_{i,j}^{m,n}e_{\{j+(i-1)n\}}e_{\{i+(j-1)m\}}^{\mathrm T}=\sum_{i,j}^{m,n}\operatorname{vec}(E_{ji})\operatorname{vec}^{\mathrm T}(E_{ij})$, show well the properties of this special case of a permutation matrix. The lemma also gives an alternative algebraic expression $K_{mn}=\sum_{i,j}^{m,n}E_{ij}\otimes K_{mn}^{(ij)}$ using the submatrices $K_{mn}^{(ij)}=e_{(j)}e_{(i)}^{\mathrm T}\ (i=1,\ldots,m;\ j=1,\ldots,n)$, which is consistent with the findings in the examples shown earlier. Terdik [22, Eq. (1.11)] used another symbolic expression $K_{m\otimes n}=[E_{i,j}^{\mathrm T}]_{i,j}$ reflecting a procedure to obtain $K_{m\otimes n}$, and stated that "$K_{m\otimes n}$ has $n\times m$ blocks of dimension $m\times n$" (p. 8), which should be read as "$K_{m\otimes n}$ has $m\times n$ blocks of dimension $n\times m$," where $K_{m\otimes n}=K_{mn}$ in our notation as used by Magnus and Neudecker [14]. For this statement, see Neudecker and Wansbeek [18, p. 222],


though they use the notation $P_{nm}$ for $K_{mn}$. It is known that a permutation matrix is orthonormal, which is found by considering that the inverse permutation corresponding to $P$ is given by $P^{\mathrm T}$, yielding the well-known properties
$$
K_{mn}^{\mathrm T}=K_{mn}^{-1}=K_{nm}\quad\text{with}\quad K_{nm}\operatorname{vec}\bigl(A^{\mathrm T}\bigr)=\operatorname{vec}(A).
$$
Consequently, when $m=n$, we have $K_{mm}^{2k}=I_{m^2}$ and $K_{mm}^{2k+1}=K_{mm}$ $(k=0,1,\ldots)$ with $K_{mm}^0\equiv I_{m^2}$. When $m=1$ or $n=1$, it follows that

$$
K_{1n}=I_n\quad\text{or}\quad K_{m1}=I_m,
$$
respectively, due to $\operatorname{vec}(A^{\mathrm T})=\operatorname{vec}(A)$ for vectors. A formula frequently used in multivariate statistical analysis for the Kronecker product and the vec operator is
$$
\operatorname{vec}(ABC)=\bigl(C^{\mathrm T}\otimes A\bigr)\operatorname{vec}(B),
$$
where $A$, $B$ and $C$ are $a\times b$, $b\times c$ and $c\times d$ matrices, which is proved here for exposition:
$$
\begin{aligned}
\operatorname{vec}(ABC)&=\sum_{i=1}^{b}\sum_{j=1}^{c}\operatorname{vec}\bigl\{(A)_i\,b_{ij}\,(C)^j\bigr\}
=\sum_{i,j}\operatorname{vec}\bigl\{(A)_i(C)^j\bigr\}b_{ij}\\
&=\sum_{i,j}\bigl\{(C)^{j\,\mathrm T}\otimes(A)_i\bigr\}b_{ij}
=\sum_{i,j}\bigl(C^{\mathrm T}\otimes A\bigr)e_{\{i+(j-1)b\}}\{\operatorname{vec}(B)\}_{i+(j-1)b}\\
&=\sum_{i,j}\bigl(C^{\mathrm T}\otimes A\bigr)e_{\{i+(j-1)b\}}e_{\{i+(j-1)b\}}^{\mathrm T}\operatorname{vec}(B)
=\bigl(C^{\mathrm T}\otimes A\bigr)\operatorname{vec}(B),
\end{aligned}
$$
where $(A)_i$ is the $i$th column of $A$ and $(C)^j$ is the $j$th row of $C$.
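The formula $\operatorname{vec}(ABC)=(C^{\mathrm T}\otimes A)\operatorname{vec}(B)$ is easy to check numerically; this is a minimal sketch with our own helper names (column-major `vec`, standard Kronecker product), not code from the book.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def vec(A):
    m, n = len(A), len(A[0])
    return [A[i][j] for j in range(n) for i in range(m)]

def transpose(A):
    return [list(r) for r in zip(*A)]

def matvec(M, x):
    return [sum(c * v for c, v in zip(row, x)) for row in M]

# a x b, b x c, c x d with a=2, b=3, c=2, d=2
A = [[1, 2, 0], [0, 1, 3]]
B = [[1, 0], [2, 1], [0, 4]]
C = [[1, 2], [3, 0]]
lhs = vec(matmul(matmul(A, B), C))
rhs = matvec(kron(transpose(C), A), vec(B))
print(lhs == rhs)  # True
```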

This proof is similar to that in Lemma 10.1 (compare the proof given by, e.g., Magnus and Neudecker [14, Theorem 2, Chap. 2]). The properties given below are well known (Magnus and Neudecker [14, Theorem 9, Chap. 3]; Terdik [22, Eqs. (1.14)–(1.17)]), where the second to last equations are obtained from the first one, which was proved by Magnus and Neudecker [14] in an elegant but indirect way using $\operatorname{vec}(ABC)=(C^{\mathrm T}\otimes A)\operatorname{vec}(B)$ repeatedly and an additional arbitrary matrix. The first proof in the following lemma is a direct one.

Lemma 10.2 Let $A$ and $B$ be $m\times n$ and $p\times q$ matrices with $a$ and $b$ being $m\times1$ and $p\times1$ vectors, respectively. Then, we have


$$
\begin{aligned}
K_{pm}(A\otimes B)&=(B\otimes A)K_{qn},\\
K_{pm}(A\otimes B)K_{nq}&=B\otimes A,\\
K_{pm}(A\otimes b)&=b\otimes A,\qquad K_{mp}(B\otimes a)=a\otimes B,\\
K_{pm}(a\otimes B)&=B\otimes a,\qquad K_{mp}(b\otimes A)=A\otimes b.
\end{aligned}
$$

Proof 1 In the first equation, the $[\{u+(v-1)m\},\{i+(j-1)q\}]$th element on each side will be found $(u=1,\ldots,m;\ v=1,\ldots,p;\ i=1,\ldots,q;\ j=1,\ldots,n)$. The element on the left-hand side is
$$
\begin{aligned}
e_{\{u+(v-1)m\}}^{\mathrm T}K_{pm}(A\otimes B)e_{\{i+(j-1)q\}}
&=\bigl(e_{(v)}\otimes e_{(u)}\bigr)^{\mathrm T}K_{pm}(A\otimes B)\bigl(e_{(j)}\otimes e_{(i)}\bigr)\\
&=\bigl(e_{(v)}\otimes e_{(u)}\bigr)^{\mathrm T}\sum_{i^*,j^*}^{p,m}\bigl(e_{(i^*)}\otimes e_{(j^*)}\bigr)\bigl(e_{(j^*)}\otimes e_{(i^*)}\bigr)^{\mathrm T}\bigl\{(A)_j\otimes(B)_i\bigr\}\\
&=\bigl(e_{(u)}\otimes e_{(v)}\bigr)^{\mathrm T}\bigl\{(A)_j\otimes(B)_i\bigr\}=a_{uj}b_{vi},
\end{aligned}
$$
where $K_{pm}=\sum_{i^*,j^*}^{p,m}\bigl(e_{(i^*)}\otimes e_{(j^*)}\bigr)\bigl(e_{(j^*)}\otimes e_{(i^*)}\bigr)^{\mathrm T}$ in the results of Lemma 10.1 is used. On the other hand, the corresponding element on the right-hand side of the equation is similarly given by
$$
\begin{aligned}
e_{\{u+(v-1)m\}}^{\mathrm T}(B\otimes A)K_{qn}e_{\{i+(j-1)q\}}
&=\bigl(e_{(v)}\otimes e_{(u)}\bigr)^{\mathrm T}(B\otimes A)K_{qn}\bigl(e_{(j)}\otimes e_{(i)}\bigr)\\
&=\bigl\{(B)^v\otimes(A)^u\bigr\}\sum_{i^*,j^*}^{q,n}\bigl(e_{(i^*)}\otimes e_{(j^*)}\bigr)\bigl(e_{(j^*)}\otimes e_{(i^*)}\bigr)^{\mathrm T}\bigl(e_{(j)}\otimes e_{(i)}\bigr)\\
&=\bigl\{(B)^v\otimes(A)^u\bigr\}\bigl(e_{(i)}\otimes e_{(j)}\bigr)=a_{uj}b_{vi}.
\end{aligned}
$$
The above results show that the first equation holds. The remaining equations are given from the first one using $K_{qn}K_{nq}=I_{qn}$, $K_{n1}=I_n$ and $K_{1q}=I_q$.

The above results show that the first equation holds. The remaining equations are given from the first one using Kqn Knq ¼ Iqn , Kn1 ¼ In and K1q ¼ Iq . Proof 2 We use the converse flow of Proof 1 for the first equation, which is a less direct proof. Note that the elements on the left (or right)-hand side of the equation are different products of the elements of A and B. Recall the definition a1  a2 ¼ Kmn ða2  a1 Þ, which indicated that Kpm ðA  BÞ is the matrix whose rows are permutated ones of A  B such that the fv þ ðu  1Þpg-th row of A  B corresponding to the rows ðAÞu and ðBÞv in A  B becomes the fu þ ðv  1Þmg-th row of Kpm ðA  BÞ after permutation. This result shows that the row indexes (rows for short) of Kpm ðA  BÞ are the same as the corresponding ones of B  A except that the column indexes of (columns for short) B  A are permutated ones of Kpm ðA  BÞ, where the columns of A  B are unchanged by Kpm ðA  BÞ.


Consider the right-hand side $(B\otimes A)K_{qn}$ of the equation $K_{pm}(A\otimes B)=(B\otimes A)K_{qn}$ to be derived. In a similar manner, we find that the columns of $(B\otimes A)K_{qn}$ are permuted ones of the columns of $B\otimes A$, while the rows of $B\otimes A$ are unchanged by the post-multiplication by $K_{qn}$. Recalling that the rows of $K_{pm}(A\otimes B)$ were proved to be the same as those of $B\otimes A$, and consequently the same as those of $(B\otimes A)K_{qn}$, we obtain $K_{pm}(A\otimes B)=(B\otimes A)K_{qn}$, since the rows and columns of the left-hand side are equal to the corresponding ones of the right-hand side. As shown earlier, the $\{u+(v-1)m\}$th row of $(B\otimes A)K_{qn}$ is given by the direct product of $(B)^v$ and $(A)^u$. Similarly, the $\{i+(j-1)q\}$th column of $K_{pm}(A\otimes B)$ is given by the direct product of $(A)_j$ and $(B)_i$. Since $K_{pm}(A\otimes B)=(B\otimes A)K_{qn}$ with the elements being distinct products of the elements of $A$ and $B$, the $[\{u+(v-1)m\},\{i+(j-1)q\}]$th element on each side should be $\{(A)^u\}_j=\{(A)_j\}_u=a_{uj}$ times $\{(B)^v\}_i=\{(B)_i\}_v=b_{vi}$. The remaining equations are derived as in Proof 1. Q.E.D.

An advantage of the proofs of Lemma 10.2 is that the explicit expressions of the elements are derived as by-products. They are summarized in the following theorem, including one repeated from the proof of Lemma 10.2, for ease of reference.

Theorem 10.1 The elements of the products of the matrices in Lemma 10.2 are
$$
\begin{aligned}
\{K_{pm}(A\otimes B)\}_{\{u+(v-1)m\},\{i+(j-1)q\}}&=\{(B\otimes A)K_{qn}\}_{\{u+(v-1)m\},\{i+(j-1)q\}}=a_{uj}b_{vi},\\
\{K_{pm}(A\otimes B)K_{nq}\}_{\{u+(v-1)m\},\{j+(i-1)n\}}&=(B\otimes A)_{\{u+(v-1)m\},\{j+(i-1)n\}}=a_{uj}b_{vi},\\
\{K_{pm}(A\otimes b)\}_{\{u+(v-1)m\},j}&=(b\otimes A)_{\{u+(v-1)m\},j}=a_{uj}b_{v},\\
\{K_{mp}(B\otimes a)\}_{\{v+(u-1)p\},i}&=(a\otimes B)_{\{v+(u-1)p\},i}=a_{u}b_{vi},\\
\{K_{pm}(a\otimes B)\}_{\{u+(v-1)m\},i}&=(B\otimes a)_{\{u+(v-1)m\},i}=a_{u}b_{vi},\\
\{K_{mp}(b\otimes A)\}_{\{v+(u-1)p\},j}&=(A\otimes b)_{\{v+(u-1)p\},j}=a_{uj}b_{v}
\end{aligned}
$$
$(u=1,\ldots,m;\ v=1,\ldots,p;\ i=1,\ldots,q;\ j=1,\ldots,n)$. Note that in the above expressions, the value $a_{uj}b_{vi}$ is the same in the first and second lines though the columns of the entries for this value are different.
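The first equation of Lemma 10.2 and the element formula of Theorem 10.1 can be checked directly; a minimal sketch under our own conventions (column-major `vec`, standard Kronecker product, helper names ours) follows.

```python
def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def commutation(m, n):
    # K_mn vec(A) = vec(A^T) for an m x n matrix A
    K = [[0] * (m * n) for _ in range(m * n)]
    for i in range(m):
        for j in range(n):
            K[j + i * n][i + j * m] = 1
    return K

m, n, p, q = 3, 2, 2, 3
A = [[1, 2], [3, 4], [5, 6]]        # m x n
B = [[1, 0, 2], [0, 1, 3]]          # p x q
lhs = matmul(commutation(p, m), kron(A, B))
rhs = matmul(kron(B, A), commutation(q, n))
assert lhs == rhs                   # K_pm (A (x) B) = (B (x) A) K_qn

# element formula from Theorem 10.1 (1-based indices u, v, i, j)
u, v, i, j = 2, 1, 3, 2
assert lhs[(u - 1) + (v - 1) * m][(i - 1) + (j - 1) * q] == A[u - 1][j - 1] * B[v - 1][i - 1]
print("Lemma 10.2 / Theorem 10.1 verified")
```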
Similar cases are also found in the remaining lines.

The following equation, first derived by Neudecker and Wansbeek [18, Theorem 3.1 (i)], is useful (see also Magnus and Neudecker [14, Theorem 9, Chap. 3]). The result is proved with added algebraic expressions of the elements.

Theorem 10.2 Let $A$ and $B$ be $m\times n$ and $p\times q$ matrices, respectively. Then,
$$
\operatorname{vec}(A\otimes B)=\bigl(I_n\otimes K_{qm}\otimes I_p\bigr)\bigl\{\operatorname{vec}(A)\otimes\operatorname{vec}(B)\bigr\},
$$

10.1

Preliminaries

307

whose $[v+(u-1)p+\{i+(j-1)q-1\}mp]$th element is $a_{uj}b_{vi}$ $(u=1,\ldots,m;\ j=1,\ldots,n;\ v=1,\ldots,p;\ i=1,\ldots,q)$.

Proof It is seen that the elements of the vector on each side of the equation to be derived consist of distinct products of $a_{uj}$ and $b_{vi}$. Consider the product $a_{uj}b_{vi}$ on the left-hand side of the equation. We find that this product is located in the $[\{v+(u-1)p\},\{i+(j-1)q\}]$th element of $A\otimes B$. Noting that $A\otimes B$ is an $mp\times nq$ matrix, $a_{uj}b_{vi}$ becomes the $[v+(u-1)p+\{i+(j-1)q-1\}mp]$th element of $\operatorname{vec}(A\otimes B)$. Note also that this location is given by the lexicographical order $[j,i,u,v]$ with $v$ changing fastest. On the other hand, the $a_{uj}b_{vi}$'s in $\operatorname{vec}(A)\otimes\operatorname{vec}(B)$ on the right-hand side of the equation are located according to the lexicographical order $[j,u,i,v]$ $(j=1,\ldots,n;\ u=1,\ldots,m;\ i=1,\ldots,q;\ v=1,\ldots,p)$ with $v$ changing fastest. Pre-multiplying $\operatorname{vec}(A)\otimes\operatorname{vec}(B)$ by $I_n\otimes K_{qm}\otimes I_p$ converts the order into $[j,i,u,v]$. That is, $a_{uj}b_{vi}$ is located in the $[v+(u-1)p+\{i+(j-1)q-1\}mp]$th element of the right-hand side of the equation after conversion, which is equal to the corresponding one on the left-hand side of the equation. Q.E.D.

Terdik [22, Lemma 1.1] gave the following equations, except the added third one, using $\operatorname{vec}(ABC)=(C^{\mathrm T}\otimes A)\operatorname{vec}(B)$ repeatedly; they are proved here by a similar method to the one employed earlier, with added expressions of the elements.

Theorem 10.3 Let $A$ and $B$ be $m\times n$ and $p\times m$ matrices, respectively. Then,
$$
\begin{aligned}
\operatorname{vec}(BA)&=\bigl\{\operatorname{vec}^{\mathrm T}(A)\otimes I_{np}\bigr\}\bigl(I_n\otimes K_{mn}\otimes I_p\bigr)\bigl\{\operatorname{vec}(I_n)\otimes\operatorname{vec}(B)\bigr\}\\
&=\bigl\{\operatorname{vec}^{\mathrm T}(B)\otimes I_{np}\bigr\}\bigl(I_m\otimes K_{pn}\otimes I_p\bigr)\bigl\{\operatorname{vec}\bigl(A^{\mathrm T}\bigr)\otimes\operatorname{vec}\bigl(I_p\bigr)\bigr\}\\
&=\bigl\{\operatorname{vec}^{\mathrm T}(I_m)\otimes I_{np}\bigr\}\bigl(I_m\otimes K_{mn}\otimes I_p\bigr)\bigl\{\operatorname{vec}\bigl(A^{\mathrm T}\bigr)\otimes\operatorname{vec}(B)\bigr\}.
\end{aligned}
$$

Proof Let $a_i$ and $b_j^{\mathrm T}$ be the $i$th column of $A$ and the $j$th row of $B$, respectively $(i=1,\ldots,n;\ j=1,\ldots,p)$. Then, $(BA)_{ji}=b_j^{\mathrm T}a_i$ is the $\{j+(i-1)p\}$th element of $\operatorname{vec}(BA)$.

(i) Consider the right-hand side of the first equation to be derived:
$$
\bigl\{\operatorname{vec}^{\mathrm T}(A)\otimes I_{np}\bigr\}\bigl(I_n\otimes K_{mn}\otimes I_p\bigr)\bigl\{\operatorname{vec}(I_n)\otimes\operatorname{vec}(B)\bigr\},
$$
where $\operatorname{vec}(I_n)\otimes\operatorname{vec}(B)$ consists of the products $(I_n)_{ik}b_{jl}=\delta_{ik}b_{jl}$ $(i,k=1,\ldots,n;\ j=1,\ldots,p;\ l=1,\ldots,m)$ with $\delta_{ik}$ being the Kronecker delta. The products are located in the vector according to the lexicographical order $[k,i,l,j]$ with $j$ changing fastest. Pre-multiplying $\operatorname{vec}(I_n)\otimes\operatorname{vec}(B)$ by $I_n\otimes K_{mn}\otimes I_p$ converts the order into $[k,l,i,j]$, whose corresponding element is denoted by $c_{[k,l,i,j]}\ (=\delta_{ik}b_{jl})$. The remaining factor $\operatorname{vec}^{\mathrm T}(A)\otimes I_{np}=\operatorname{vec}^{\mathrm T}(A)\otimes I_n\otimes I_p$ is a matrix which converts the vector consisting of the $c_{[k,l,i,j]}$'s into the $np\times1$ vector whose $\{j+(i-1)p\}$th element is $\sum_{k=1}^{n}\sum_{l=1}^{m}a_{lk}c_{[k,l,i,j]}=\sum_{k,l}a_{lk}\delta_{ik}b_{jl}=\sum_{l=1}^{m}b_{jl}a_{li}=(BA)_{ji}$, which is the same element of $\operatorname{vec}(BA)$ obtained earlier. Note that the indexes $l$ and $k$ of $a_{lk}$ in $\operatorname{vec}^{\mathrm T}(A)\otimes I_{np}$ are chosen from the first two indexes in $c_{[k,l,i,j]}$, with $l$ changing faster than $k$ in the lexicographical order.

(ii) The vector $\operatorname{vec}(A^{\mathrm T})\otimes\operatorname{vec}(I_p)$ on the right-hand side of the second equation consists of the products $a_{li}(I_p)_{jk}=a_{li}\delta_{jk}$ $(l=1,\ldots,m;\ i=1,\ldots,n;\ j,k=1,\ldots,p)$, which are located according to the lexicographical order $[l,i,k,j]$, as denoted by $c_{[l,i,k,j]}$. These elements of the vector pre-multiplied by $I_m\otimes K_{pn}\otimes I_p$ are permuted as $c_{[l,k,i,j]}\ (=a_{li}\delta_{jk})$. Finally, the matrix $\operatorname{vec}^{\mathrm T}(B)\otimes I_{np}=\operatorname{vec}^{\mathrm T}(B)\otimes I_n\otimes I_p$ converts the transformed vector into the $np\times1$ vector whose $\{j+(i-1)p\}$th element is $\sum_{k=1}^{p}\sum_{l=1}^{m}b_{kl}c_{[l,k,i,j]}=\sum_{k,l}b_{kl}a_{li}\delta_{jk}=\sum_{l=1}^{m}b_{jl}a_{li}=(BA)_{ji}$, which is the required result.

(iii) The first proof: The vector $\operatorname{vec}(A^{\mathrm T})\otimes\operatorname{vec}(B)$ on the right-hand side of the third equation consists of the products $a_{ki}b_{jl}$ $(k,l=1,\ldots,m;\ i=1,\ldots,n;\ j=1,\ldots,p)$, which are located according to the lexicographical order denoted by $c_{[k,i,l,j]}$. The first transformation by $I_m\otimes K_{mn}\otimes I_p$ gives $c_{[k,l,i,j]}\ (=a_{ki}b_{jl})$, which is further converted by $\operatorname{vec}^{\mathrm T}(I_m)\otimes I_{np}$ into the $np\times1$ vector whose $\{j+(i-1)p\}$th element is $\sum_{k=1}^{m}\sum_{l=1}^{m}\delta_{lk}c_{[k,l,i,j]}=\sum_{k,l}\delta_{lk}a_{ki}b_{jl}=\sum_{l=1}^{m}b_{jl}a_{li}=(BA)_{ji}$, which is the required result.

The second proof: Employing a method as in Terdik [22, Lemma 1.1] for the first and second equations, the third expression of $\operatorname{vec}(BA)$ is derived using $\operatorname{vec}(ABC)=(C^{\mathrm T}\otimes A)\operatorname{vec}(B)$ and Theorem 10.2:
$$
\begin{aligned}
\operatorname{vec}(BA)&=\bigl(A^{\mathrm T}\otimes B\bigr)\operatorname{vec}(I_m)\\
&=\bigl\{\operatorname{vec}^{\mathrm T}(I_m)\otimes I_{np}\bigr\}\operatorname{vec}\bigl(A^{\mathrm T}\otimes B\bigr)\\
&=\bigl\{\operatorname{vec}^{\mathrm T}(I_m)\otimes I_{np}\bigr\}\bigl(I_m\otimes K_{mn}\otimes I_p\bigr)\bigl\{\operatorname{vec}\bigl(A^{\mathrm T}\bigr)\otimes\operatorname{vec}(B)\bigr\}.
\end{aligned}
$$
Q.E.D.

The above proof is given to show that unique elements, i.e., the inner products $(BA)_{ji}=b_j^{\mathrm T}a_i$ $(i=1,\ldots,n;\ j=1,\ldots,p)$ in this case, can be used to derive an equation by specifying their positions in the equation. In addition to yielding the expressions of the elements of the matrix on each side of an equation, this method seems to give some help in finding the transformations of the matrices of interest, e.g., $A$ and $B$ in the above case, when the commutator, vec operator and Kronecker product are used. However, when only the explicit expressions for the elements on each side of the equation are desired, the left-hand side can be used in the above case. The equation itself, without the explicit expressions of the elements, may be derived more concisely by using $\operatorname{vec}(ABC)=(C^{\mathrm T}\otimes A)\operatorname{vec}(B)$ repeatedly, as shown in the second proof of Theorem 10.3 (iii). Note that this formula is applied to $\operatorname{vec}(BA)$ in (i), (ii) and (iii) using
$$
BA=BAI_n=I_pBA=BI_mA,
$$
respectively. The result (iii) is of interest since it is a formula relating a vectorized ordinary matrix product to the Kronecker product of two vectorized matrices.

Remark 10.1 Theorem 10.3 suggests more basic results for an $m\times n$ matrix $A$:
$$
\operatorname{vec}(A)=\operatorname{vec}(AI_nI_n)=(I_n\otimes A)\operatorname{vec}(I_n)
=\operatorname{vec}(I_mI_mA)=\bigl(A^{\mathrm T}\otimes I_m\bigr)\operatorname{vec}(I_m),
$$
where $a_{ij}$ is the $\{i+(j-1)m\}$th element of $\operatorname{vec}(A)$ $(i=1,\ldots,m;\ j=1,\ldots,n)$. The corresponding elements of $(I_n\otimes A)\operatorname{vec}(I_n)$ and $(A^{\mathrm T}\otimes I_m)\operatorname{vec}(I_m)$ are obtained by $\sum_{k,l=1}^{n,n}\delta_{jk}a_{il}\delta_{lk}=a_{ij}$ and $\sum_{k,l=1}^{m,m}a_{lj}\delta_{ik}\delta_{lk}=a_{ij}$, respectively, as expected. Note that the remaining third result $\operatorname{vec}(A)=\operatorname{vec}(I_mAI_n)=(I_n\otimes I_m)\operatorname{vec}(A)=I_{mn}\operatorname{vec}(A)$ is trivial.
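Theorems 10.2 and 10.3 can both be exercised numerically. The sketch below is a hedged pure-Python check under our own conventions (column-major `vec`, standard Kronecker product; all helper names are ours).

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def kronvec(x, y):
    return [a * b for a in x for b in y]

def vec(A):
    m, n = len(A), len(A[0])
    return [A[i][j] for j in range(n) for i in range(m)]

def transpose(A):
    return [list(r) for r in zip(*A)]

def matvec(M, x):
    return [sum(c * v for c, v in zip(row, x)) for row in M]

def identity(k):
    return [[int(i == j) for j in range(k)] for i in range(k)]

def commutation(m, n):
    K = [[0] * (m * n) for _ in range(m * n)]
    for i in range(m):
        for j in range(n):
            K[j + i * n][i + j * m] = 1
    return K

A = [[1, 2, 3], [4, 5, 6]]       # m x n = 2 x 3
B = [[1, 0], [2, 1], [0, 3]]     # p x q = 3 x 2 (also p x m below, with m = 2)
m, n, p, q = 2, 3, 3, 2

# Theorem 10.2: vec(A (x) B) = (I_n (x) K_qm (x) I_p)(vec A (x) vec B)
M = kron(kron(identity(n), commutation(q, m)), identity(p))
assert vec(kron(A, B)) == matvec(M, kronvec(vec(A), vec(B)))

# Theorem 10.3 (third identity), with B as the p x m factor:
M1 = kron([vec(identity(m))], identity(n * p))
M2 = kron(kron(identity(m), commutation(m, n)), identity(p))
rhs = matvec(matmul(M1, M2), kronvec(vec(transpose(A)), vec(B)))
assert vec(matmul(B, A)) == rhs
print("Theorems 10.2 and 10.3 verified")
```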

10.2 Multivariate Cumulants and Multiple Commutators

Let $X$ be a $p\times1$ random vector with $\mathrm E(X)=\mu$ and $\operatorname{cov}(X)=\Sigma>0$ (positive definite). Define $Y=\Sigma^{-1/2}(X-\mu)$, where $\Sigma^{-1/2}$ is a symmetric matrix square root of $\Sigma^{-1}$. That is, we consider the standardized random vector with uncorrelated elements. The multivariate cumulants up to the fourth order are denoted by the $p^j\times1$ vectors $\kappa_j(Y)\ (j=1,\ldots,4)$:


$$
\begin{aligned}
\kappa_1(Y)&=\mathrm E(Y)=0,\\
\kappa_2(Y)&=\mathrm E\bigl(Y^{\otimes2}\bigr)=\operatorname{vec}\{\operatorname{cov}(Y)\}=\operatorname{vec}(I_p),\\
\kappa_3(Y)&=\mathrm E\bigl(Y^{\otimes3}\bigr),\\
\kappa_4(Y)&=\mathrm E\bigl(Y^{\otimes4}\bigr)-\sum^{3}\mathrm E^{\otimes2}\bigl(Y^{\otimes2}\bigr),
\end{aligned}
$$
where
$$
\sum^{3}\mathrm E^{\otimes2}\bigl(Y^{\otimes2}\bigr)
=\mathrm E(Y\otimes Y\otimes Y^*\otimes Y^*)+\mathrm E(Y\otimes Y^*\otimes Y\otimes Y^*)+\mathrm E(Y\otimes Y^*\otimes Y^*\otimes Y),
$$
$$
\mathrm E^{\otimes2}\bigl(Y^{\otimes2}\bigr)=\bigl\{\mathrm E\bigl(Y^{\otimes2}\bigr)\bigr\}^{\otimes2}=\mathrm E(Y\otimes Y\otimes Y^*\otimes Y^*),
$$
$Y^*$ is an independent copy of $Y$, and $\sum^k$ as used earlier indicates the sum of $k$ similar terms giving a result symmetric with respect to the positions of the random vectors in the Kronecker products, considering their combinations/permutations. While the above expression is used to explain the definition of the operator $\sum^3$, the same result is also given by
$$
\sum^{3}\mathrm E^{\otimes2}\bigl(Y^{\otimes2}\bigr)
=\mathrm E^{\otimes2}\bigl(Y^{\otimes2}\bigr)+\mathrm E(Y\otimes Y^*\otimes Y\otimes Y^*)+\mathrm E\{Y\otimes\mathrm E(Y^*\otimes Y^*)\otimes Y\}.
$$
Note that $\sum^3$ is seen as a (scaled) symmetrizer. When $p=1$, we have $\sum^3\mathrm E^{\otimes2}(Y^{\otimes2})=3\mathrm E^2(Y^2)=3$. That is, $\sum^3(\cdot)/3$ is the corresponding unscaled symmetrizer. The standard definition of the unscaled symmetrizer for $a_1\otimes\cdots\otimes a_k$, where $a_i$ is a $d_i\times1$ vector $(i=1,\ldots,k)$, is given by
$$
S^*_{(d_1,\ldots,d_k)}(a_1\otimes\cdots\otimes a_k)\equiv\sum_{\{\pi(1),\ldots,\pi(k)\}\in\mathcal S_k}\bigl(a_{\pi(1)}\otimes\cdots\otimes a_{\pi(k)}\bigr)/k!,
$$
where $\{\pi(1),\ldots,\pi(k)\}=\pi(1{:}k)$ is a permutation of $(1{:}k)=(1,\ldots,k)$, and $\mathcal S_k$ with $k!$ members is the set of $k$-way permutations (Holmquist [6, p. 175]; Kano [9, Sect. 2.1]; Terdik [22, Sect. 1.3.1]). Noting that $\sum^3\mathrm E^{\otimes2}(Y^{\otimes2})=\sum^3\mathrm E(Y\otimes Y\otimes Y^*\otimes Y^*)$, we obtain the scaled symmetrizer $S_{22}$ yielding $\sum^3\mathrm E(Y\otimes Y\otimes Y^*\otimes Y^*)=S_{22}\,\mathrm E(Y\otimes Y\otimes Y^*\otimes Y^*)$. For this purpose, the $q$-way or $q$-mode commutation matrix (commutator) is introduced (Jammalamadaka, Taufer and Terdik [8, Appendix 1.2]), which is an extension of the 2-mode commutator $K_{mp}$ defined earlier with $K_{mp}(b\otimes a)=a\otimes b$, where $a$ and $b$ are $m\times1$ and $p\times1$ vectors, respectively. Employing the order of the vectors in the Kronecker product and suppressing the dimension sizes, i.e., $m$ and $p$ in the above case, we obtain
$$
K_{(21)}=K_{mp},
$$


where $(21)=(2,1)$ is a permutation of the original sequence $(12)=(1,2)$. When $q=3$, we have, e.g.,
$$
K_{(231)}(a_1\otimes a_2\otimes a_3)=a_{\pi(1)}\otimes a_{\pi(2)}\otimes a_{\pi(3)}=a_3\otimes a_1\otimes a_2,
$$
where $a_1$, $a_2$ and $a_3$, with possibly distinct dimensions denoted by $d_i\ (i=1,2,3)$, respectively, before permutation go to the second, third and first factors (vectors) of the permuted Kronecker product due to the definition of the permutation
$$
\pi(123)=\pi(1{:}3)=\{\pi(1),\pi(2),\pi(3)\}=(3,1,2)=(312)\ne(231),
$$
where $\pi(i)$ and $i$ correspond to the $\{\pi(i)\}$th and $i$th factors in the Kronecker products before and after permutation, respectively. This definition of the multiple commutator seems to be somewhat intractable due to $(312)\ne(231)$ in the above case. It is natural to desire an expression of the commutator such that $K^*_{(312)}(a_1\otimes a_2\otimes a_3)=a_3\otimes a_1\otimes a_2$, which is much more tractable than the expression $K_{(231)}\ (=K^*_{(312)})$ since the subscript $(312)$ of $K^*_{(312)}$ is the same as the desired permutation $\pi(123)=(312)$. This is easily obtained using the definition of $K_{(\cdot)}$: $a_1\otimes a_2\otimes a_3=K_{(312)}(a_3\otimes a_1\otimes a_2)$, which gives $K^*_{(312)}(a_1\otimes a_2\otimes a_3)=K^{-1}_{(312)}(a_1\otimes a_2\otimes a_3)=a_3\otimes a_1\otimes a_2$ with $K^*_{(312)}=K^{-1}_{(312)}$. Note that $K^{-1}_{(312)}$ is a commutator yielding the inverse permutation from $(312)$ to $(1{:}3)$, where the existence of the inverse of $K_{(312)}$ is shown by using commutators interchanging the neighboring factors successively as
$$
K_{(312)}=K_{(132)}K_{(213)}=K_{(132)}(d_1,d_3,d_2)\,K_{(213)}(d_3,d_1,d_2),
$$
where $K_{(213)}(d_3,d_1,d_2)=K_{(21)}(d_3,d_1)\otimes I_{d_2}$ and $K_{(213)}(d_3,d_1,d_2)$ is the commutator applied to $a_3\otimes a_1\otimes a_2$, whose factors are of dimensions $d_3$, $d_1$ and $d_2$, respectively, with $K_{(21)}(d_3,d_1)$ defined similarly; this notation is employed by Terdik [22, Sect. 1.2.4]. Then, we have
$$
K_{(213)}(d_3,d_1,d_2)(a_3\otimes a_1\otimes a_2)=\{K_{(21)}(d_3,d_1)\otimes I_{d_2}\}(a_3\otimes a_1\otimes a_2)
=\{K_{(21)}(d_3,d_1)(a_3\otimes a_1)\}\otimes(I_{d_2}a_2)=(a_1\otimes a_3)\otimes a_2=a_1\otimes a_3\otimes a_2,
$$
i.e., exchanging the first two factors $a_3$ and $a_1$. Similarly, we obtain
$$
K_{(132)}=K_{(132)}(d_1,d_3,d_2)=I_{d_1}\otimes K_{(21)}(d_3,d_2).
$$


Note that notations suppressing the dimension sizes, e.g., $K_{(21)}=K_{(21)}(d_2,d_1)$, are used for simplicity when confusion does not occur. Then, using these results, we confirm that
$$
K_{(312)}(a_3\otimes a_1\otimes a_2)=K_{(132)}K_{(213)}(a_3\otimes a_1\otimes a_2)
=\{I_{d_1}\otimes K_{(21)}(d_3,d_2)\}(a_1\otimes a_3\otimes a_2)
=a_1\otimes\{K_{(21)}(d_3,d_2)(a_3\otimes a_2)\}=a_1\otimes a_2\otimes a_3.
$$
Using the expressions of $K_{(132)}$ and $K_{(213)}$ in this process of confirmation, we obtain
$$
K_{(312)}=K_{(132)}K_{(213)}=\{I_{d_1}\otimes K_{(21)}(d_3,d_2)\}\{K_{(21)}(d_3,d_1)\otimes I_{d_2}\},
$$
which gives
$$
\begin{aligned}
K_{(231)}=K^{-1}_{(312)}&=K^{-1}_{(213)}K^{-1}_{(132)}
=\{K_{(21)}(d_3,d_1)\otimes I_{d_2}\}^{-1}\{I_{d_1}\otimes K_{(21)}(d_3,d_2)\}^{-1}\\
&=\{K^{-1}_{(21)}(d_3,d_1)\otimes I_{d_2}\}\{I_{d_1}\otimes K^{-1}_{(21)}(d_3,d_2)\}\\
&=\{K_{(21)}(d_1,d_3)\otimes I_{d_2}\}\{I_{d_1}\otimes K_{(21)}(d_2,d_3)\},
\end{aligned}
$$
where the existence of $K^{-1}_{(21)}(d_3,d_1)=K^{-1}_{d_1d_3}=K_{d_3d_1}$ and $K^{-1}_{(21)}(d_3,d_2)=K^{-1}_{d_2d_3}=K_{d_3d_2}$ is used with the notations of usual or two-fold commutators. The last result indicates the form of $K^{-1}_{(312)}$ as well as its existence. Note that Jammalamadaka et al. [8, Appendix 1.2] and Terdik [22, Appendix A.4] extensively used the notation $K^{-1}_{(\cdot)}$. To find $K_{(\cdot)}$ corresponding to $K^{-1}_{(312)}$, write the positions of 1, 2 and 3 in the subscript $(312)$ sequentially, i.e., the second, third and first positions, respectively, giving the subscript $(231)$ of $K_{(\cdot)}$ in $K^{-1}_{(312)}=K_{(231)}$. Conversely, to find $K^{-1}_{(\cdot)}$ corresponding to $K_{(231)}$, write the positions similarly, i.e., the third, first and second, respectively, giving the subscript $(312)$ of $K^{-1}_{(\cdot)}$. For the expression given in the first paragraph of this subsection, we have the corresponding explicit one using $K^{-1}_{(\cdot)}$ and the scaled symmetrizer $S_{22}$.

this subsection, we have the corresponding explicit one using K1 ðÞ and the scaled symmetrizer S22 . Theorem 10.4 S22 fE\2 ðY\2 [Þg ¼

3 X

E\2 [ ðY\2 [Þ ¼ EðY  Y  Y  Y Þ

þ EðY  Y  Y  Y Þ þ E(Y  Y  Y  Yg 1   ¼ ðIp4 þ K1 ð1324Þ þ Kð1342Þ ÞEðY  Y  Y  Y Þ 1 \2 [ ¼ ðIp4 þ K1 ðY\2 [Þ: ð1324Þ þ Kð1342Þ ÞE


Note that $S^*_{22}=S_{22}/3$ is symmetric and idempotent by construction (see, e.g., Kano [9, Proposition 2.2]). So far, the well-known formulas
$$
(A_1\otimes\cdots\otimes A_k)(B_1\otimes\cdots\otimes B_k)=(A_1B_1)\otimes\cdots\otimes(A_kB_k),
$$
with $(A_1\otimes\cdots\otimes A_k)^{\mathrm T}=A_1^{\mathrm T}\otimes\cdots\otimes A_k^{\mathrm T}$ and $(C_1\otimes\cdots\otimes C_k)^{-1}=C_1^{-1}\otimes\cdots\otimes C_k^{-1}$, have been used, where $A_i$, $B_i$ and $C_i$ are $m_i\times n_i$, $n_i\times q_i$ and $r_i\times r_i$ matrices, respectively, with the assumption of the existence of $C_i^{-1}\ (i=1,\ldots,k)$. It is of some interest to prove the first equation using the lexicographical notation, e.g., $c_{[i,j,k,l]}$ with $l$ changing fastest, as employed earlier. Note that the elements of the products $(A_\mu B_\mu)_{i_\mu j_\mu}\ (\mu=1,\ldots,k)$ on the right-hand side give rows indexed by $c_{[i_1,\ldots,i_k]}$ $(i_l=1,\ldots,m_l;\ l=1,\ldots,k)$, formed from the rows $(A_1B_1)_{i_1\cdot},\ldots,(A_kB_k)_{i_k\cdot}$, where the dots indicate possibly different arbitrary column indexes. Similarly, the columns on the right-hand side are indexed by $c_{[j_1,\ldots,j_k]}$ $(j_l=1,\ldots,q_l;\ l=1,\ldots,k)$. On the left-hand side of the equation, the first factor $A_1\otimes\cdots\otimes A_k$ has row and column indexes denoted by $c_{[i_1,\ldots,i_k]}$ and $c_{[t_1,\ldots,t_k]}$ $(t_l=1,\ldots,n_l;\ l=1,\ldots,k)$, respectively, and the second factor $B_1\otimes\cdots\otimes B_k$ has row and column indexes $c_{[t_1,\ldots,t_k]}$ and $c_{[j_1,\ldots,j_k]}$. Taking the product of the first and second factors, with the column indexes of the first factor matched to the row indexes of the second, it is found that the row and column indexes of $(A_1\otimes\cdots\otimes A_k)(B_1\otimes\cdots\otimes B_k)$ are given by $c_{[i_1,\ldots,i_k]}$ and $c_{[j_1,\ldots,j_k]}$ with elements $(A_1B_1)_{i_1j_1}\cdots(A_kB_k)_{i_kj_k}$, which shows the required result. The formula $(A_1\otimes\cdots\otimes A_k)^{\mathrm T}=A_1^{\mathrm T}\otimes\cdots\otimes A_k^{\mathrm T}$ is similarly given. The remaining formula $(C_1\otimes\cdots\otimes C_k)^{-1}=C_1^{-1}\otimes\cdots\otimes C_k^{-1}$ is given by the first formula when $A_i=C_i$ and $B_i=C_i^{-1}\ (i=1,\ldots,k)$ with $I_{r_1}\otimes\cdots\otimes I_{r_k}=I_{r_1\cdots r_k}$. The above formulas give the following summarized results for ease of reference.
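The mixed-product formula just discussed is straightforward to confirm numerically for $k=2$; a minimal sketch (our helper names, pure Python) follows.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def kron(A, B):
    # standard Kronecker product
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

# conformable factors: A1 (2x2) B1 (2x2), A2 (2x3) B2 (3x1)
A1 = [[1, 2], [3, 4]]
B1 = [[0, 1], [1, 1]]
A2 = [[2, 0, 1], [1, 1, 0]]
B2 = [[1], [0], [2]]

lhs = matmul(kron(A1, A2), kron(B1, B2))
rhs = kron(matmul(A1, B1), matmul(A2, B2))
print(lhs == rhs)  # True: (A1 (x) A2)(B1 (x) B2) = (A1 B1) (x) (A2 B2)
```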


Lemma 10.3 Let $a_i$ be $d_i\times1$ vectors $(i=1,\ldots,4)$, respectively. Then, we have
$$
\begin{aligned}
K^{-1}_{(1324)}(a_1\otimes a_2\otimes a_3\otimes a_4)&=K^{-1}_{(1324)}(d_1,d_2,d_3,d_4)(a_1\otimes a_2\otimes a_3\otimes a_4)\\
&=\{I_{d_1}\otimes K_{(21)}(d_2,d_3)\otimes I_{d_4}\}(a_1\otimes a_2\otimes a_3\otimes a_4)\\
&=a_1\otimes K_{(21)}(a_2\otimes a_3)\otimes a_4=a_1\otimes a_3\otimes a_2\otimes a_4,
\end{aligned}
$$
$$
K^{-1}_{(132)}=K^{-1}_{(132)}(d_1,d_2,d_3)=\sum_{i,j}^{d_3,d_2}\bigl(I_{d_1}\otimes E^{(d_3d_2)}_{ij}\otimes E^{(d_2d_3)}_{ji}\bigr),
$$
$$
K^{-1}_{(321)}=K^{-1}_{(321)}(d_1,d_2,d_3)=\sum_{i,j}^{d_3,d_1}\bigl(E^{(d_3d_1)}_{ij}\otimes I_{d_2}\otimes E^{(d_1d_3)}_{ji}\bigr),
$$
$$
K^{-1}_{(213)}=K^{-1}_{(213)}(d_1,d_2,d_3)=\sum_{i,j}^{d_2,d_1}\bigl(E^{(d_2d_1)}_{ij}\otimes E^{(d_1d_2)}_{ji}\otimes I_{d_3}\bigr),
$$
$$
K^{-1}_{(312)}=K^{-1}_{(213)}K^{-1}_{(132)}=K^{-1}_{(213)}(d_1,d_3,d_2)\,K^{-1}_{(132)}(d_1,d_2,d_3)
=\sum_{i,j,g}^{d_3,d_1,d_2}\bigl(E^{(d_3d_1)}_{ij}\otimes E^{(d_1d_2)}_{jg}\otimes E^{(d_2d_3)}_{gi}\bigr)
$$
with $K^{-1}_{(312)}(a_1\otimes a_2\otimes a_3)=a_3\otimes a_1\otimes a_2$, where $E^{(d_3d_2)}_{ij}$ is the $d_3\times d_2$ matrix whose $(i,j)$th element is 1 with the remaining ones being zero.

Proof The first set of equations for $K^{-1}_{(1324)}(\cdot)$ are summarized results explained earlier. The equations for $K^{-1}_{(132)}$, $K^{-1}_{(321)}$ and $K^{-1}_{(213)}$ are given by Lemma 10.1. The remaining result for $K^{-1}_{(312)}$ is given as follows:
$$
\begin{aligned}
a_3\otimes a_1\otimes a_2&=K^{-1}_{(312)}(a_1\otimes a_2\otimes a_3)=K^{-1}_{(213)}K^{-1}_{(132)}(a_1\otimes a_2\otimes a_3)\\
&=K^{-1}_{(213)}(d_1,d_3,d_2)\,K^{-1}_{(132)}(d_1,d_2,d_3)(a_1\otimes a_2\otimes a_3)\\
&=\{K_{(21)}(d_1,d_3)\otimes I_{d_2}\}\{I_{d_1}\otimes K_{(21)}(d_2,d_3)\}(a_1\otimes a_2\otimes a_3)\\
&=(K_{d_3d_1}\otimes I_{d_2})(I_{d_1}\otimes K_{d_3d_2})(a_1\otimes a_2\otimes a_3).
\end{aligned}
$$
Recall Lemma 10.1 using the clarified notations
$$
K_{d_id_j}=\sum_{i'=1}^{d_i}\sum_{j'=1}^{d_j}\bigl(e^{(d_i)}_{(i')}\otimes e^{(d_j)}_{(j')}\bigr)\bigl(e^{(d_j)}_{(j')}\otimes e^{(d_i)}_{(i')}\bigr)^{\mathrm T}
=\sum_{i',j'}^{d_i,d_j}\bigl(e^{(d_i)}_{(i')}e^{(d_j)\mathrm T}_{(j')}\bigr)\otimes\bigl(e^{(d_j)}_{(j')}e^{(d_i)\mathrm T}_{(i')}\bigr),
$$


where $e^{(d_i)}_{(i)}$ is the $d_i\times1$ vector whose $i$th element is 1 with the remaining ones being zero, and $e^{(d_i)\mathrm T}_{(i)}=\bigl(e^{(d_i)}_{(i)}\bigr)^{\mathrm T}$. Then, we obtain
$$
\begin{aligned}
&K^{-1}_{(213)}K^{-1}_{(132)}(a_1\otimes a_2\otimes a_3)=(K_{d_3d_1}\otimes I_{d_2})(I_{d_1}\otimes K_{d_3d_2})(a_1\otimes a_2\otimes a_3)\\
&=\Bigl\{\sum_{i,j}^{d_3,d_1}\bigl(e^{(d_3)}_{(i)}e^{(d_1)\mathrm T}_{(j)}\bigr)\otimes\bigl(e^{(d_1)}_{(j)}e^{(d_3)\mathrm T}_{(i)}\bigr)\otimes I_{d_2}\Bigr\}
\Bigl\{I_{d_1}\otimes\sum_{k,l}^{d_3,d_2}\bigl(e^{(d_3)}_{(k)}e^{(d_2)\mathrm T}_{(l)}\bigr)\otimes\bigl(e^{(d_2)}_{(l)}e^{(d_3)\mathrm T}_{(k)}\bigr)\Bigr\}(a_1\otimes a_2\otimes a_3)\\
&=\Bigl\{\sum_{i,j,g}^{d_3,d_1,d_2}\bigl(e^{(d_3)}_{(i)}e^{(d_1)\mathrm T}_{(j)}\bigr)\otimes\bigl(e^{(d_1)}_{(j)}e^{(d_3)\mathrm T}_{(i)}\bigr)\otimes\bigl(e^{(d_2)}_{(g)}e^{(d_2)\mathrm T}_{(g)}\bigr)\Bigr\}
\Bigl\{\sum_{h,k,l}^{d_1,d_3,d_2}\bigl(e^{(d_1)}_{(h)}e^{(d_1)\mathrm T}_{(h)}\bigr)\otimes\bigl(e^{(d_3)}_{(k)}e^{(d_2)\mathrm T}_{(l)}\bigr)\otimes\bigl(e^{(d_2)}_{(l)}e^{(d_3)\mathrm T}_{(k)}\bigr)\Bigr\}(a_1\otimes a_2\otimes a_3)\\
&=\sum_{i,j,g}^{d_3,d_1,d_2}\sum_{h,k,l}^{d_1,d_3,d_2}\bigl(e^{(d_3)}_{(i)}\otimes e^{(d_1)}_{(j)}\otimes e^{(d_2)}_{(g)}\bigr)\bigl(e^{(d_1)}_{(j)}\otimes e^{(d_3)}_{(i)}\otimes e^{(d_2)}_{(g)}\bigr)^{\mathrm T}
\bigl(e^{(d_1)}_{(h)}\otimes e^{(d_3)}_{(k)}\otimes e^{(d_2)}_{(l)}\bigr)\bigl(e^{(d_1)}_{(h)}\otimes e^{(d_2)}_{(l)}\otimes e^{(d_3)}_{(k)}\bigr)^{\mathrm T}(a_1\otimes a_2\otimes a_3)\\
&=\sum_{i,j,g}^{d_3,d_1,d_2}\bigl(e^{(d_3)}_{(i)}\otimes e^{(d_1)}_{(j)}\otimes e^{(d_2)}_{(g)}\bigr)\bigl(e^{(d_1)}_{(j)}\otimes e^{(d_2)}_{(g)}\otimes e^{(d_3)}_{(i)}\bigr)^{\mathrm T}(a_1\otimes a_2\otimes a_3)\\
&=\sum_{i,j,g}^{d_3,d_1,d_2}\bigl\{\bigl(e^{(d_3)}_{(i)}e^{(d_1)\mathrm T}_{(j)}\bigr)\otimes\bigl(e^{(d_1)}_{(j)}e^{(d_2)\mathrm T}_{(g)}\bigr)\otimes\bigl(e^{(d_2)}_{(g)}e^{(d_3)\mathrm T}_{(i)}\bigr)\bigr\}(a_1\otimes a_2\otimes a_3)\\
&=\sum_{i,j,g}^{d_3,d_1,d_2}\bigl(E^{(d_3d_1)}_{ij}\otimes E^{(d_1d_2)}_{jg}\otimes E^{(d_2d_3)}_{gi}\bigr)(a_1\otimes a_2\otimes a_3).
\end{aligned}
$$

Since the elements of $a_i\ (i=1,2,3)$ are arbitrary, we obtain the required result. Q.E.D.

In Lemma 10.3, $K^{-1}_{(132)}$, $K^{-1}_{(321)}$ and $K^{-1}_{(213)}$ are commutators interchanging two factors (vectors) in the Kronecker product, i.e., giving single-step permutations, while $K^{-1}_{(312)}$ is a commutator interchanging neighboring factors two times, i.e., yielding a two-step permutation. The proof of Lemma 10.3 gives the following general result.

Theorem 10.5 Let $a_i$ be $d_i\times1$ vectors $(i=1,\ldots,k)$, respectively. Suppose that $a_1\otimes\cdots\otimes a_k$ is transformed to $a_k\otimes a_1\otimes\cdots\otimes a_{k-1}$ by interchanging the neighboring two factors $k-1$ times, starting from exchanging $a_{k-1}\otimes a_k$ into $a_k\otimes a_{k-1}$. Then, the commutator for the transformation is

$$
K^{-1}_{(k,1,2,\ldots,k-1)}=\sum_{i_k,i_1,\ldots,i_{k-1}}^{d_k,d_1,\ldots,d_{k-1}}\bigl(E^{(d_kd_1)}_{i_ki_1}\otimes E^{(d_1d_2)}_{i_1i_2}\otimes\cdots\otimes E^{(d_{k-1}d_k)}_{i_{k-1}i_k}\bigr)
$$
with $K^{-1}_{(k,1,2,\ldots,k-1)}(a_1\otimes\cdots\otimes a_k)=a_k\otimes a_1\otimes\cdots\otimes a_{k-1}$.
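The explicit sum in Lemma 10.3 and Theorem 10.5 can be checked for $k=3$ by building $\sum_{i,j,g}E^{(d_3d_1)}_{ij}\otimes E^{(d_1d_2)}_{jg}\otimes E^{(d_2d_3)}_{gi}$ and applying it to a Kronecker product of small vectors; this is a sketch with our own helper names, pure Python.

```python
def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def kronvec(x, y):
    return [a * b for a in x for b in y]

def unit_matrix(m, n, i, j):
    M = [[0] * n for _ in range(m)]
    M[i][j] = 1
    return M

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def matvec(M, x):
    return [sum(c * v for c, v in zip(row, x)) for row in M]

d1, d2, d3 = 2, 3, 2
a1, a2, a3 = [1, 2], [3, 4, 5], [6, 7]

size = d1 * d2 * d3
M = [[0] * size for _ in range(size)]
for i in range(d3):
    for j in range(d1):
        for g in range(d2):
            M = mat_add(M, kron(kron(unit_matrix(d3, d1, i, j),
                                     unit_matrix(d1, d2, j, g)),
                                unit_matrix(d2, d3, g, i)))

lhs = matvec(M, kronvec(kronvec(a1, a2), a3))
rhs = kronvec(kronvec(a3, a1), a2)
print(lhs == rhs)  # True: K^{-1}_(312)(a1 (x) a2 (x) a3) = a3 (x) a1 (x) a2
```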


In the first paragraph of this section, the multivariate cumulants of the standardized random vector $Y=\Sigma^{-1/2}(X-\mu)$ are addressed. In some or many cases, the unstandardized multivariate raw moments are obtained more easily than the standardized or unstandardized central moments. In the following, we obtain the multivariate unstandardized cumulants via multivariate raw moments.

Corollary 10.1
$$
\begin{aligned}
\kappa_1(X)&=\mathrm E(X)=\mu,\\
\kappa_2(X)&=\operatorname{vec}\{\operatorname{cov}(X)\}=\operatorname{vec}(\Sigma)=\mathrm E\bigl(X^{\otimes2}\bigr)-\mu^{\otimes2},\\
\kappa_3(X)&=\mathrm E\bigl\{(X-\mu)^{\otimes3}\bigr\}=\mathrm E\bigl(X^{\otimes3}\bigr)-\sum^{3}\mathrm E\bigl(X^{\otimes2}\bigr)\otimes\mu+2\mu^{\otimes3}\\
&=\mathrm E\bigl(X^{\otimes3}\bigr)-\bigl(I_{p^3}+K^{-1}_{(132)}+K^{-1}_{(312)}\bigr)\bigl\{\mathrm E\bigl(X^{\otimes2}\bigr)\otimes\mu\bigr\}+2\mu^{\otimes3}\\
&\equiv\mathrm E\bigl(X^{\otimes3}\bigr)-S_{21}\bigl\{\mathrm E\bigl(X^{\otimes2}\bigr)\otimes\mu\bigr\}+2\mu^{\otimes3},\\
\kappa_4(X)&=\mathrm E\bigl\{(X-\mu)^{\otimes4}\bigr\}-\sum^{3}\kappa_2^{\otimes2}(X)\\
&=\mathrm E\bigl(X^{\otimes4}\bigr)-\sum^{4}\mathrm E\bigl(X^{\otimes3}\bigr)\otimes\mu+\sum^{6}\mathrm E\bigl(X^{\otimes2}\bigr)\otimes\mu^{\otimes2}-3\mu^{\otimes4}-\sum^{3}\kappa_2^{\otimes2}(X)\\
&=\mathrm E\bigl(X^{\otimes4}\bigr)-\bigl(I_{p^4}+K^{-1}_{(1243)}+K^{-1}_{(1423)}+K^{-1}_{(4123)}\bigr)\bigl\{\mathrm E\bigl(X^{\otimes3}\bigr)\otimes\mu\bigr\}\\
&\quad+\bigl(I_{p^4}+K^{-1}_{(1324)}+K^{-1}_{(1342)}+K^{-1}_{(3124)}+K^{-1}_{(3142)}+K^{-1}_{(3412)}\bigr)\bigl\{\mathrm E\bigl(X^{\otimes2}\bigr)\otimes\mu^{\otimes2}\bigr\}\\
&\quad-3\mu^{\otimes4}-\bigl(I_{p^4}+K^{-1}_{(1324)}+K^{-1}_{(1342)}\bigr)\kappa_2^{\otimes2}(X)\\
&\equiv\mathrm E\bigl(X^{\otimes4}\bigr)-S_{31}\bigl\{\mathrm E\bigl(X^{\otimes3}\bigr)\otimes\mu\bigr\}+S_{211}\bigl\{\mathrm E\bigl(X^{\otimes2}\bigr)\otimes\mu^{\otimes2}\bigr\}-3\mu^{\otimes4}-S_{22}\,\kappa_2^{\otimes2}(X),
\end{aligned}
$$
where $K^{-1}_{(\cdot)}=K^{-1}_{(\cdot)}(p,\ldots,p)$.

10.3 Multivariate Measures of Skewness and Kurtosis

Skewness and kurtosis are indexes of asymmetry and peakedness (or tail-thickness), respectively, defined in various ways for scalar random variables. The most typical measures of skewness and kurtosis are based on the third and fourth standardized cumulants, respectively, where "standardized" indicates the unit variance after standardization. Other indexes using the mode and quantiles for skewness are also available (Arnold and Groeneveld [1]; Ekström and Jammalamadaka [5]). Since the fourth central moment of the standard normal distribution is 3, the phrase "excess kurtosis," indicating the standardized central fourth moment reduced by 3, giving the standardized fourth cumulant, is also used. Recall that in Chap. 5 the coined word "kurtic" indicates "mesokurtic, leptokurtic or platykurtic," corresponding to zero, positive and negative excess kurtosis, respectively.

The measures of skewness and (excess) kurtosis for random vectors are defined in various ways. Consider the cumulant-based ones. Note that the standardization with unit variance for univariate cases corresponds to $Y=\Sigma^{-1/2}(X-\mu)$ giving $\operatorname{cov}(Y)=I_p$. Other methods of standardization can also be considered, especially


when the original variables are of primary interest rather than their linear combinations or summarized values. For instance, let $Y^*=\operatorname{Diag}^{-1/2}(\Sigma)(X-\mu)$, where $\operatorname{Diag}^{-1/2}(\Sigma)=\operatorname{diag}(\sigma_{11}^{-1/2},\ldots,\sigma_{pp}^{-1/2})$ and $\sigma_{ii}$ is the $i$th diagonal element of $\Sigma$ $(i=1,\ldots,p)$; then we have $\operatorname{cov}(Y^*)=P$, which is the scale-free population correlation matrix (for this type of standardization, see Ogasawara [19]). We assume that overall measures of skewness and kurtosis for random vectors are desired employing $Y=\Sigma^{-1/2}(X-\mu)$ for standardization unless otherwise stated.

10.3.1 Multivariate Measures of Skewness

Recall that the third cumulant vector is given by $\kappa_3(Y)=\mathrm E(Y^{\otimes3})$. Among the elements of $\kappa_3(Y)=\mathrm E(Y^{\otimes3})$, there are $p$ univariate third cumulants, i.e., $\kappa_3(Y_i)=\mathrm E(Y_i^3)$ $(i=1,\ldots,p)$. The $3(p^2-p)$ bivariate third cumulants are defined by $\kappa_3(Y_i,Y_i,Y_j)=\mathrm E(Y_i^2Y_j)$ $(i,j=1,\ldots,p;\ i\ne j)$. The remaining $p(p-1)(p-2)$ tri-variate third cumulants are given by $\kappa_3(Y_i,Y_j,Y_k)=\mathrm E(Y_iY_jY_k)$ $(i,j,k=1,\ldots,p;\ i\ne j\ne k\ne i)$. It is confirmed that the sum of the numbers of elements in the three groups becomes
$$
p+3(p^2-p)+p(p-1)(p-2)=p^3.
$$
Among the $p^3$ elements, the $p$ univariate third cumulants are unique ones; $p^2-p$ bivariate third cumulants are non-duplicated; and $p(p-1)(p-2)/3!$ tri-variate third cumulants are unique, giving the number of unique third cumulants
$$
p+p^2-p+\{p(p-1)(p-2)/3!\}=p(p+1)(p+2)/3!,
$$
which is equal to the number of combinations choosing three variables from $p$ variables including repeated ones. One of the popular overall indexes of skewness is Mardia's [15, Eq. (2.19)] $b_{1,p}$ defined by


$$b_{1,p} = \kappa_3^{\mathrm{T}}(Y)\kappa_3(Y) = E(Y^{\langle 3\rangle \mathrm{T}}) E(Y^{\langle 3\rangle}),$$
which ignores the signs of the elements of $\kappa_3(Y) = E(Y^{\langle 3\rangle})$ and uses all the $p^3$ duplicated elements of $\kappa_3(Y)$. Mòri et al. [17] defined the vector of skewness
$$E\left(\sum_{i=1}^{p} Y_i^2\, Y\right) = E(Y^{\mathrm{T}} Y\, Y),$$
which includes $p^2$ non-duplicated third cumulants consisting of the $p$ univariate and $p^2 - p$ bivariate ones, ignoring the tri-variate third cumulants. Kollo [10] defined an alternative vector of skewness
$$E\left(\sum_{i=1}^{p}\sum_{j=1}^{p} Y_i Y_j\, Y\right),$$

which includes all the $p^3$ elements of $\kappa_3(Y)$. However, Jammalamadaka et al. [8, p. 612] showed that Kollo's vector of skewness can be zero even when $\kappa_3(Y)$ has nonzero elements. For instance, in the case of $p = 2$ with $\kappa_3(Y) = (1, -1, -1, 1, -1, 1, 1, -1)^{\mathrm{T}}$, the vector becomes
$$E\left(\sum_{i=1}^{p}\sum_{j=1}^{p} Y_i Y_j\, Y\right) = E\{(1_p^{\mathrm{T}} Y)^2 Y\} = E\{1_{p^2}^{\mathrm{T}}(Y \otimes Y)\, Y\} = E\{(1_{p^2}^{\mathrm{T}} \otimes I_p) Y^{\langle 3\rangle}\} = (1_{p^2}^{\mathrm{T}} \otimes I_p)\kappa_3(Y) = 0,$$
where $1_p$ is the $p$-vector consisting of 1s. Jammalamadaka et al. [8] stated that this is a disadvantage of Kollo's vector of skewness. This point will be addressed later in detail.

Balakrishnan et al. [2] proposed the skewness vector $T$, which was shown to be proportional to Mòri et al.'s vector by Jammalamadaka et al. [8, p. 613] as
$$T = \frac{3}{p(p+2)}\{\mathrm{vec}^{\mathrm{T}}(I_p) \otimes I_p\}\kappa_3(Y) = \frac{3}{p(p+2)} E(Y^{\mathrm{T}} Y\, Y),$$
since
$$E(Y^{\mathrm{T}} Y\, Y) = E\{(Y^{\mathrm{T}} \otimes Y^{\mathrm{T}})\mathrm{vec}(I_p)\, Y\} = E\{\mathrm{vec}^{\mathrm{T}}(I_p)(Y \otimes Y)\, Y\} = \{\mathrm{vec}^{\mathrm{T}}(I_p) \otimes I_p\} E(Y^{\langle 3\rangle}) = \{\mathrm{vec}^{\mathrm{T}}(I_p) \otimes I_p\}\kappa_3(Y).$$
That is, $T$ is seen to be essentially the same as Mòri et al.'s vector.
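The Mòri et al. and Kollo vectors are linear functions of $\kappa_3(Y)$ and can be checked numerically. The sketch below is a hypothetical numpy illustration (the sign pattern of $\kappa_3(Y)$ is the one implied by the symmetry of cumulants and the zero-sum conditions of the $p = 2$ counterexample); it shows that Kollo's vector vanishes while Mòri et al.'s does not:

```python
import numpy as np

p = 2
I_p = np.eye(p)
vecI = I_p.reshape(-1, order="F")                    # vec(I_p)
# kappa_3(Y) of the p = 2 counterexample, lexicographic order
# (kappa_111, kappa_112, kappa_121, ..., kappa_222), last index fastest
kappa3 = np.array([1, -1, -1, 1, -1, 1, 1, -1], dtype=float)

mori = np.kron(vecI[None, :], I_p) @ kappa3          # {vec^T(I_p) (x) I_p} kappa_3(Y)
kollo = np.kron(np.ones((1, p * p)), I_p) @ kappa3   # (1_{p^2}^T (x) I_p) kappa_3(Y)
print(mori)    # (kappa_111 + kappa_221, kappa_112 + kappa_222) = [ 2. -2.]
print(kollo)   # [0. 0.] although kappa_3(Y) is nonzero
```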


Srivastava [20] gave a skewness index. Let $\Sigma = C\Lambda C^{\mathrm{T}} = C\,\mathrm{diag}(\lambda_1, \ldots, \lambda_p)\,C^{\mathrm{T}}$ be the spectral decomposition of $\Sigma$ with $C^{\mathrm{T}} C = I_p$ and $\lambda_i > 0$ $(i = 1, \ldots, p)$, yielding $\Sigma^{-1/2} = C\Lambda^{-1/2} C^{\mathrm{T}} = C\,\mathrm{diag}(\lambda_1^{-1/2}, \ldots, \lambda_p^{-1/2})\,C^{\mathrm{T}}$. Define
$$Y^* = \Lambda^{-1} C^{\mathrm{T}}(X - \mu) = \Lambda^{-1} C^{\mathrm{T}}\Sigma^{1/2} Y = \Lambda^{-1/2} C^{\mathrm{T}} Y.$$
Then, Srivastava's multivariate measure of skewness is given by
$$\frac{1}{p}\sum_{i=1}^{p}\left[E\{(Y^*)_i^3\}\right]^2 = \frac{1}{p}\sum_{i=1}^{p}\{E(Y_i^{*3})\}^2 = \frac{1}{p}\sum_{i=1}^{p}\lambda_i^{-3}\{c_i^{\langle 3\rangle \mathrm{T}}\kappa_3(Y)\}^2,$$
where $E(Y_i^{*3}) = \lambda_i^{-3/2} E\{(C^{\mathrm{T}} Y)_i^3\} = \lambda_i^{-3/2} E\{(c_i^{\mathrm{T}} Y)^3\} = \lambda_i^{-3/2} c_i^{\langle 3\rangle \mathrm{T}}\kappa_3(Y)$ is used, and $c_i^{\mathrm{T}}$ is the $i$-th row of $C^{\mathrm{T}}$. Jammalamadaka et al. [8, p. 613] gave a derivation similar to the above result and pointed out that Srivastava's multivariate measure of skewness is not affine invariant. This non-invariance is found from $\mathrm{cov}(Y^*) = \Lambda^{-1}$. The corresponding affine invariant measure is easily obtained by
$$Y^{**} \equiv \Lambda^{1/2} Y^* = C^{\mathrm{T}} Y \quad\text{with}\quad \mathrm{cov}(Y^{**}) = \mathrm{cov}(Y) = I_p.$$
That is, $Y^{**}$ is an orthogonally rotated random vector of $Y$. This measure gives a modified Srivastava measure:
$$\frac{1}{p}\sum_{i=1}^{p}\left[E\{(Y_i^{**})^3\}\right]^2 = \frac{1}{p}\sum_{i=1}^{p}\{c_i^{\langle 3\rangle \mathrm{T}}\kappa_3(Y)\}^2.$$
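The key step $E(Y_i^{*3}) = \lambda_i^{-3/2} c_i^{\langle 3\rangle \mathrm{T}}\kappa_3(Y)$ is an algebraic identity that also holds for sample moments, which the following numpy sketch verifies on illustrative data (the data-generating choice and names are assumptions for the demonstration only):

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 2, 400
X = rng.standard_normal((n, p)) ** 3        # illustrative skewed sample
mu = X.mean(axis=0)
Sigma = np.cov(X, rowvar=False)
lam, C = np.linalg.eigh(Sigma)              # Sigma = C diag(lam) C^T

# standardized Y = Sigma^{-1/2}(X - mu) via the symmetric square root
Sinv_half = C @ np.diag(lam ** -0.5) @ C.T
Y = (X - mu) @ Sinv_half
k3 = np.mean([np.kron(np.kron(y, y), y) for y in Y], axis=0)   # sample E(Y^<3>)

Ystar = (X - mu) @ C / lam                  # rows of Y* = Lambda^{-1} C^T (X - mu)
lhs = (Ystar ** 3).mean(axis=0)             # sample E{(Y*_i)^3}
rhs = np.array([lam[i] ** -1.5 * np.kron(np.kron(C[:, i], C[:, i]), C[:, i]) @ k3
                for i in range(p)])
print(np.allclose(lhs, rhs))                # True: the identity holds exactly
srivastava = np.mean(lhs ** 2)              # (1/p) sum_i {E(Y*_i^3)}^2
```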

As addressed earlier, reconsider Jammalamadaka et al.'s criticism of Kollo's skewness vector $E\{(1_p^{\mathrm{T}} Y)^2 Y\} = (1_{p^2}^{\mathrm{T}} \otimes I_p)\kappa_3(Y)$, which becomes zero when $p = 2$ and $\kappa_3(Y) = (1, -1, -1, 1, -1, 1, 1, -1)^{\mathrm{T}}$ is nonzero. It is found that the following case also gives the same result: $\kappa_3(Y) = (-1, 1, 1, -1, 1, -1, -1, 1)^{\mathrm{T}}$. Let $\kappa_{ijk} = E(Y_i Y_j Y_k)$ $(i, j, k = 1, \ldots, p)$ be the elements of $\kappa_3(Y)$, which are lexicographically ordered with $k$ changing fastest. Then, it is found that the necessary and sufficient condition for the Kollo vector based on $\kappa_3(Y)$ to be zero is
$$\sum_{i=1}^{p}\sum_{j=1}^{p}\kappa_{ijk} = 0 \quad (k = 1, \ldots, p).$$
This condition stems from the structure of $1_{p^2}^{\mathrm{T}} \otimes I_p$ in $(1_{p^2}^{\mathrm{T}} \otimes I_p)\kappa_3(Y)$: the $q$-th element of $(1_{p^2}^{\mathrm{T}} \otimes I_p)\kappa_3(Y)$ is the sum of every $p$-th element of $\kappa_3(Y)$ starting from the $q$-th element $(q = 1, \ldots, p)$. That is, when $p = 2$, the first and second elements of $(1_{p^2}^{\mathrm{T}} \otimes I_p)\kappa_3(Y)$ are the sums of the odd- and even-numbered elements of $\kappa_3(Y)$, respectively, which should both be zero to yield the vanishing Kollo vector. It is also found that there are infinitely many such cases other than the above ones under the symmetry $\kappa_{112} = \kappa_{121} = \kappa_{211}$ and $\kappa_{122} = \kappa_{212} = \kappa_{221}$.


When a variable $Y_i$ in $Y$ is reflected as $-Y_i$, the signs of the associated third cumulants other than those of the form $E(Y_i^2 Y_j)$ $(i, j = 1, \ldots, p)$ become reversed. This reversal may be employed to study the properties of multivariate third cumulants in simplified situations without loss of generality. One of the simplifications is to make the univariate third cumulants positive by sign reversal of variables where necessary. Since this seems to give substantial simplification, we give the following definition.

Definition 10.1 For variables $X_i$ $(i = 1, \ldots, p)$ in a random vector $X = (X_1, \ldots, X_p)^{\mathrm{T}}$, the sign reversal of $X_i$ when the univariate third cumulant $\kappa_3(X_i) < 0$ is said to be the "positively skewed transformation" or "skew transformation" for short, which gives the sign reversal of the associated multivariate third cumulants.

In the two cases addressed earlier, we find that (i) $\kappa_3(Y) = (1, -1, -1, 1, -1, 1, 1, -1)^{\mathrm{T}}$ with $\kappa_3(X_1) = 1$, $\kappa_3(X_2) = -1$; and (ii) $\kappa_3(Y) = (-1, 1, 1, -1, 1, -1, -1, 1)^{\mathrm{T}}$ with $\kappa_3(X_1) = -1$, $\kappa_3(X_2) = 1$. Employ the skew transformation for one of the variables in (i) and (ii). Then, we have (i) $\kappa_3(Y) = (1, 1, 1, 1, 1, 1, 1, 1)^{\mathrm{T}}$ and (ii) $\kappa_3(Y) = (1, 1, 1, 1, 1, 1, 1, 1)^{\mathrm{T}}$, with $\kappa_3(Y_1) = \kappa_3(Y_2) = 1$ for both (i) and (ii). It is of interest to find that the negative bivariate third cumulants $\kappa_{112} = \kappa_{121} = \kappa_{211} = -1$ in (i) and $\kappa_{122} = \kappa_{212} = \kappa_{221} = -1$ in (ii) all become 1 after the skew transformation. In other words, the negative cumulants before transformation were artifacts of the sign reversal of one of the variables. Note also that case (ii) before the skew transformation is given by exchanging $Y_1$ and $Y_2$ in (i), and that the two variables after the skew transformation are exchangeable as far as the multivariate cumulants up to the third order are concerned.

Employ the skew transformation so that $\kappa_3(Y_1) = \kappa_3(Y_2) = 1$, and let $\kappa_{112} = \kappa_{121} = \kappa_{211} = a$ and $\kappa_{122} = \kappa_{212} = \kappa_{221} = b$. Then, Kollo's skewness vector becomes $(1_{p^2}^{\mathrm{T}} \otimes I_p)\kappa_3(Y) = (1 + 2a + b,\ 1 + a + 2b)^{\mathrm{T}}$. It is found that the condition for the zero Kollo vector is $a = b = -1/3$. Ogasawara [19, Eq. (2.10)] gave an upper bound of $\kappa_{iij}^2$ for $Y_i$ and $Y_j$:
$$\kappa_{iij}^2 \le 1 + \min\{\kappa_{iijj},\ \kappa_{iiii} + 1\} \quad (i \neq j)$$
as an extension of Pearson's inequality $\kappa_{iii}^2 \le \kappa_{iiii} + 2$, where $\kappa_{iijj}$ and $\kappa_{iiii}$ are the bivariate and univariate fourth cumulants of $(Y_i, Y_j)$ and $Y_i$, respectively, i.e., $\kappa_{iijj} = E(Y_i^2 Y_j^2) - 1$ and $\kappa_{iiii} = E(Y_i^4) - 3$ due to $E(Y_i Y_j) = 0$ $(i \neq j)$. When $\min\{\kappa_{iijj},\ \kappa_{iiii} + 1\} \ge -8/9$, which seems to be a rather weak condition, $\kappa_{iij}^2 = 1/9$ does not violate the upper bound, though this is not a sufficient condition for the existence of the associated distribution.
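A small numpy sketch of the parametrized case above (hypothetical helper names; $p = 2$ with $\kappa_{111} = \kappa_{222} = 1$) confirms that Kollo's vector is $(1 + 2a + b,\ 1 + a + 2b)^{\mathrm{T}}$ and vanishes at $a = b = -1/3$:

```python
import numpy as np

def kollo_vector(kappa3, p):
    """Kollo's skewness vector (1_{p^2}^T (x) I_p) kappa_3(Y)."""
    return np.kron(np.ones((1, p * p)), np.eye(p)) @ kappa3

def kappa3_from_ab(a, b):
    """p = 2 third cumulants with kappa_111 = kappa_222 = 1,
    kappa_112 = kappa_121 = kappa_211 = a, kappa_122 = kappa_212 = kappa_221 = b
    (lexicographic order, last index fastest)."""
    return np.array([1, a, a, b, a, b, b, 1], dtype=float)

print(np.allclose(kollo_vector(kappa3_from_ab(-1/3, -1/3), 2), 0.0))  # True
print(kollo_vector(kappa3_from_ab(0.0, 0.0), 2))    # (1+2a+b, 1+a+2b) = [1. 1.]
```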


10.3.2 Multivariate Measures of Excess Kurtosis

Multivariate measures of excess kurtosis are based on the multivariate fourth cumulants shown earlier, which are repeated here:
$$\kappa_4(Y) = E(Y^{\langle 4\rangle}) - \sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle}),$$
where
$$\sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle}) = S_{2,2}\{E^{\langle 2\rangle}(Y^{\langle 2\rangle})\} = (I_{p^4} + K_{(1324)}^{-1} + K_{(1342)}^{-1}) E^{\langle 2\rangle}(Y^{\langle 2\rangle}).$$
One of the most popular indexes is Mardia's [15] non-excess kurtosis
$$b_{2,p} = E\{(Y^{\mathrm{T}} Y)^2\},$$
which is obtained from $\kappa_4(Y)$ as
$$b_{2,p} = E[\{\mathrm{vec}(Y^{\mathrm{T}} Y)\}^2] = E[\{(Y^{\mathrm{T}} \otimes Y^{\mathrm{T}})\mathrm{vec}(I_p)\}^2] = E[Y^{\langle 4\rangle \mathrm{T}}\mathrm{vec}^{\langle 2\rangle}(I_p)] = E[\{\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}} Y^{\langle 4\rangle}] = \{\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}}\left\{\kappa_4(Y) + \sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle})\right\}.$$
Under normality, this index becomes
$$b_{2,p} = \{\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}}\sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle}) = \{\mathrm{vec}^{\mathrm{T}}(I_p) E(Y^{\langle 2\rangle})\}^2 + 2\{\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}} E(Y \otimes Y^* \otimes Y \otimes Y^*)$$
$$= p^2 + 2\sum_{i,j,k,l=1}^{p}\delta_{ij}\delta_{kl} E(Y_i Y_j^* Y_k Y_l^*) = p^2 + 2\sum_{i,k=1}^{p} E(Y_i Y_i^* Y_k Y_k^*) = p^2 + 2p,$$
where $Y^*$ is an independent copy of $Y$. The value $p^2 + 2p = p(p+2)$ is the normalizing constant to be subtracted from $b_{2,p}$ when the corresponding excess kurtosis is used.

Remark 10.2 Jammalamadaka et al. [8, p. 614] gave the following expression:
$$b_{2,p} = \mathrm{vec}^{\mathrm{T}}(I_{p^2})\left\{\kappa_4(Y) + \sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle})\right\},$$
which is different in form from $b_{2,p} = \{\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}}\{\cdot\}$ shown earlier. The vector $\mathrm{vec}^{\langle 2\rangle}(I_p)$ is not equal to $\mathrm{vec}(I_{p^2})$ unless $p = 1$. For instance, when $p = 2$,
$$\mathrm{vec}^{\langle 2\rangle}(I_p) = \{(1, 0, 0, 1)^{\mathrm{T}}\}^{\langle 2\rangle} = (1, 0, 0, 1,\ 0, 0, 0, 0,\ 0, 0, 0, 0,\ 1, 0, 0, 1)^{\mathrm{T}},$$
whereas
$$\mathrm{vec}(I_{p^2}) = \mathrm{vec}(I_4) = (1, 0, 0, 0,\ 0, 1, 0, 0,\ 0, 0, 1, 0,\ 0, 0, 0, 1)^{\mathrm{T}}.$$
The relationship between the two vectors is given by Theorem 10.2 when $A = B = I_p$ with $m = n = p = q$:
$$\mathrm{vec}(I_{p^2}) = \mathrm{vec}(I_p \otimes I_p) = (I_p \otimes K_{pp} \otimes I_p)\{\mathrm{vec}(I_p) \otimes \mathrm{vec}(I_p)\} = (I_p \otimes K_{pp} \otimes I_p)\,\mathrm{vec}^{\langle 2\rangle}(I_p),$$
whose $[i + (j-1)p + \{v + (u-1)p - 1\}p^2]$-th element (the $[u, v, j, i]$-th element using the lexicographical order with $i$ changing fastest) is $\delta_{uj}\delta_{vi}$ $(u, j, v, i = 1, \ldots, p)$. Using the above formula, we find that Jammalamadaka et al.'s expression gives the same result as ours:
$$E\{\mathrm{vec}^{\mathrm{T}}(I_{p^2}) Y^{\langle 4\rangle}\} = E[\{(I_p \otimes K_{pp} \otimes I_p)\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}} Y^{\langle 4\rangle}] = E[\{\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}}(I_p \otimes K_{pp} \otimes I_p) Y^{\langle 4\rangle}] = E[\{\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}} Y^{\langle 4\rangle}] = b_{2,p},$$
where $(I_p \otimes K_{pp} \otimes I_p)(a_1 \otimes a_2 \otimes a_3 \otimes a_4) = a_1 \otimes a_3 \otimes a_2 \otimes a_4$, with $a_i$ $(i = 1, \ldots, 4)$ being $p \times 1$ vectors, is used. This suggests a remaining third expression using $Y^{\langle 4\rangle}$ that yields $b_{2,p}$. Note that in
$$\mathrm{vec}(I_{p^2}) = (I_p \otimes K_{pp} \otimes I_p)\,\mathrm{vec}^{\langle 2\rangle}(I_p) = K_{(1324)}^{-1}\mathrm{vec}^{\langle 2\rangle}(I_p),$$
the element $\delta_{uj}\delta_{vi}$ in the $[u, j, v, i]$-th position of $\mathrm{vec}^{\langle 2\rangle}(I_p)$ becomes the $[u, v, j, i]$-th element of $\mathrm{vec}(I_{p^2})$ due to the interchange of $j$ and $v$. The third expression is given by exchanging the lexicographical order of $j$ and $i$ in the $[u, j, v, i]$-th element of $\mathrm{vec}^{\langle 2\rangle}(I_p)$, which is obtained by $K_{(1432)}^{-1}\mathrm{vec}^{\langle 2\rangle}(I_p)$. These results are summarized as
$$b_{2,p} = E[\{\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}} Y^{\langle 4\rangle}] = E\{\mathrm{vec}^{\mathrm{T}}(I_{p^2}) Y^{\langle 4\rangle}\} = E[\{K_{(1432)}^{-1}\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}} Y^{\langle 4\rangle}].$$


Let $Y_k^{(r)} = Y_k$ be the $k$-th element of the $r$-th vector $Y^{(r)} = Y$ $(r = 1, \ldots, 4;\ k = 1, \ldots, p)$ in $Y^{\langle 4\rangle} = Y^{(1)} \otimes Y^{(2)} \otimes Y^{(3)} \otimes Y^{(4)}$. Then, it is found that
$$E[\{\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}} Y^{\langle 4\rangle}] = \sum_{i,j=1}^{p} E(Y_i^{(1)} Y_i^{(2)} Y_j^{(3)} Y_j^{(4)}) = E\{\mathrm{vec}^{\mathrm{T}}(I_{p^2}) Y^{\langle 4\rangle}\} = \sum_{i,j=1}^{p} E(Y_i^{(1)} Y_j^{(2)} Y_i^{(3)} Y_j^{(4)})$$
$$= E[\{K_{(1432)}^{-1}\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}} Y^{\langle 4\rangle}] = \sum_{i,j=1}^{p} E(Y_i^{(1)} Y_j^{(2)} Y_j^{(3)} Y_i^{(4)}) = \sum_{i,j=1}^{p} E(Y_i^2 Y_j^2) = \sum_{i=1}^{p} E(Y_i^4) + 2\sum_{1 \le i < j \le p} E(Y_i^2 Y_j^2) = b_{2,p}.$$
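The three pairing vectors and the value of $b_{2,p}$ under normality can be checked numerically. In the sketch below (an illustration, not the book's code), $E(Y^{\langle 4\rangle})$ for $Y \sim N(0, I_p)$ is built from Isserlis' theorem as the sum of the three pairing vectors; each of the three expressions then gives $p(p+2)$, and the squared norm gives the constant $3p(p+2)$ appearing in Theorem 10.6:

```python
import numpy as np
from itertools import product

p = 2
idx = list(product(range(p), repeat=4))
v12_34 = np.array([(i == j) * (k == l) for i, j, k, l in idx], float)  # vec^<2>(I_p)
v13_24 = np.array([(i == k) * (j == l) for i, j, k, l in idx], float)  # vec(I_{p^2})
v14_23 = np.array([(i == l) * (j == k) for i, j, k, l in idx], float)  # third pairing

m4 = v12_34 + v13_24 + v14_23     # E(Y^<4>) for Y ~ N(0, I_p) by Isserlis' theorem
print(v12_34 @ m4, v13_24 @ m4, v14_23 @ m4)  # each = b_{2,p} = p(p+2) = 8
print(m4 @ m4)                                # 3p(p+2) = 24 under normality
```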

Remark 10.3 It is to be noted that among the $p^4$ central product moments $E(Y_i Y_j Y_k Y_l)$ $(i, j, k, l = 1, \ldots, p)$ including duplicated ones, $b_{2,p}$ considers only $E(Y_{i_1}^4)$ and $E(Y_{i_1}^2 Y_{i_2}^2)$, i.e., the univariate and symmetric-bivariate ones, respectively, with the latter being duplicated, neglecting $E(Y_{i_1} Y_{i_2}^3)$, $E(Y_{i_1} Y_{i_2} Y_{i_3}^2)$ and $E(Y_{i_1} Y_{i_2} Y_{i_3} Y_{i_4})$ ($i_1$, $i_2$, $i_3$ and $i_4$ distinct), i.e., the asymmetric-bivariate, tri-variate and four-variate fourth central moments, respectively.

Koziol [11] gave the following index for a multivariate measure of non-excess kurtosis:
$$E(Y^{\langle 4\rangle \mathrm{T}}) E(Y^{\langle 4\rangle}),$$
which is an extension of Mardia's skewness measure $b_{1,p} = E(Y^{\langle 3\rangle \mathrm{T}}) E(Y^{\langle 3\rangle})$ to kurtosis. The relation of this index to $\kappa_4(Y)$ was given by Terdik [22]; it is derived in the following theorem using a method different from Terdik's proof.

Theorem 10.6 (Terdik [22, Eq. (6.8)]) Koziol's multivariate measure of non-excess kurtosis $E(Y^{\langle 4\rangle \mathrm{T}}) E(Y^{\langle 4\rangle})$ is given by $\kappa_4(Y)$ and the corresponding Mardia measure $b_{2,p}$ as
$$E(Y^{\langle 4\rangle \mathrm{T}}) E(Y^{\langle 4\rangle}) = c_{2,p} + 6 b_{2,p} - 3p(p+2),$$
where $c_{2,p} \equiv \kappa_4^{\mathrm{T}}(Y)\kappa_4(Y)$.


Proof
$$E(Y^{\langle 4\rangle \mathrm{T}}) E(Y^{\langle 4\rangle}) = \left\{\kappa_4(Y) + \sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle})\right\}^{\mathrm{T}}\left\{\kappa_4(Y) + \sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle})\right\}$$
$$= \kappa_4^{\mathrm{T}}(Y)\kappa_4(Y) + 2\kappa_4^{\mathrm{T}}(Y)\sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle}) + \sum^{3} E^{\langle 2\rangle \mathrm{T}}(Y^{\langle 2\rangle})\sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle})$$
$$= c_{2,p} + 2\left\{E(Y^{\langle 4\rangle}) - \sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle})\right\}^{\mathrm{T}}\sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle}) + \sum^{3} E^{\langle 2\rangle \mathrm{T}}(Y^{\langle 2\rangle})\sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle})$$
$$= c_{2,p} + 6 b_{2,p} - \sum^{3} E^{\langle 2\rangle \mathrm{T}}(Y^{\langle 2\rangle})\sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle}).$$
In the above result, we have
$$\sum^{3} E^{\langle 2\rangle \mathrm{T}}(Y^{\langle 2\rangle})\sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle}) = E(Y \otimes Y \otimes Y^* \otimes Y^* + Y \otimes Y^* \otimes Y \otimes Y^* + Y \otimes Y^* \otimes Y^* \otimes Y)^{\mathrm{T}}\, E(Y \otimes Y \otimes Y^* \otimes Y^* + Y \otimes Y^* \otimes Y \otimes Y^* + Y \otimes Y^* \otimes Y^* \otimes Y)$$
$$= \{(I_{p^4} + K_{(1324)}^{-1} + K_{(1432)}^{-1})\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}}(I_{p^4} + K_{(1324)}^{-1} + K_{(1432)}^{-1})\mathrm{vec}^{\langle 2\rangle}(I_p)$$
$$= \{\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}}\{3 I_{p^4} + K_{(1324)}^{-\mathrm{T}} + K_{(1324)}^{-1} + K_{(1432)}^{-\mathrm{T}} + K_{(1432)}^{-1} + K_{(1324)}^{-\mathrm{T}} K_{(1432)}^{-1} + K_{(1432)}^{-\mathrm{T}} K_{(1324)}^{-1}\}\mathrm{vec}^{\langle 2\rangle}(I_p),$$
where $K_{(\cdot)}^{-\mathrm{T}} = (K_{(\cdot)}^{-1})^{\mathrm{T}}\ (= K_{(\cdot)})$,
$$\{\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}}\mathrm{vec}^{\langle 2\rangle}(I_p) = \sum_{i,j,k,l=1}^{p}\delta_{ij}^2\delta_{kl}^2 = \sum_{i,j,k,l=1}^{p}\delta_{ij}\delta_{kl} = p^2,$$
and, e.g.,
$$\{\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}} K_{(1324)}^{-\mathrm{T}}\mathrm{vec}^{\langle 2\rangle}(I_p) = \{K_{(1324)}^{-1}\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}}\mathrm{vec}^{\langle 2\rangle}(I_p) = \sum_{i,j,k,l=1}^{p}\delta_{ik}\delta_{jl}\delta_{ij}\delta_{kl} = \sum_{i=1}^{p}\delta_{ii} = p.$$


The same value $p$ holds for the other similar three expanded terms using $K_{(1324)}^{-1}$, $K_{(1432)}^{-\mathrm{T}}$ and $K_{(1432)}^{-1}$. For the remaining two terms, we obtain
$$\{\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}} K_{(1324)}^{-\mathrm{T}} K_{(1432)}^{-1}\mathrm{vec}^{\langle 2\rangle}(I_p) = \{K_{(1324)}^{-1}\mathrm{vec}^{\langle 2\rangle}(I_p)\}^{\mathrm{T}} K_{(1432)}^{-1}\mathrm{vec}^{\langle 2\rangle}(I_p) = \sum_{i,j,k,l=1}^{p}\delta_{ik}\delta_{jl}\delta_{il}\delta_{kj} = \sum_{i=1}^{p}\delta_{ii} = p,$$
with the other one derived similarly. These results give
$$\sum^{3} E^{\langle 2\rangle \mathrm{T}}(Y^{\langle 2\rangle})\sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle}) = 3p^2 + 6p = 3p(p+2),$$
yielding the required expression. Q.E.D.

In Theorem 10.6, $c_{2,p} = \kappa_4^{\mathrm{T}}(Y)\kappa_4(Y)$ is seen as a cumulant version of $b_{1,p} = E(Y^{\langle 3\rangle \mathrm{T}}) E(Y^{\langle 3\rangle})$ and is called the "total kurtosis" by Terdik [22, Definition 6.2]. Jammalamadaka et al. [8, p. 615] used the notation $\kappa_4$ for $c_{2,p}$ and gave the result corresponding to Theorem 10.6, though the last term of $E(Y^{\langle 4\rangle \mathrm{T}}) E(Y^{\langle 4\rangle}) = c_{2,p} + 6 b_{2,p} - 3p(p+2)$ is replaced by $p^2$, which may be a typo.

Remark 10.4 Since $b_{2,p} = p(p+2)$ under normality, as addressed earlier, with $c_{2,p} = 0$, Koziol's index becomes $3p(p+2)$. Consequently, the excess kurtosis version of Koziol's index is defined as $E(Y^{\langle 4\rangle \mathrm{T}}) E(Y^{\langle 4\rangle}) - 3p(p+2)$. Note that Koziol's index considers all the multivariate fourth central product moments and consequently all the multivariate fourth cumulants, including some duplicated elements.

Cardoso [4] and Mòri et al. [17] gave the excess kurtosis matrix defined by
$$B(Y) = E(Y Y^{\mathrm{T}} Y Y^{\mathrm{T}}) - (p+2) I_p = E(Y^{\mathrm{T}} Y\, Y Y^{\mathrm{T}}) - (p+2) I_p,$$
whose relation to $\kappa_4(Y)$ was obtained by Terdik [22] and is proved here in an expository way.

Theorem 10.7 (Terdik [22, p. 325]) The excess kurtosis matrix $B(Y)$ given by Cardoso [4] and Mòri et al. [17] is related to $\kappa_4(Y)$ as
$$\mathrm{vec}\{B(Y)\} = \{I_{p^2} \otimes \mathrm{vec}^{\mathrm{T}}(I_p)\}\kappa_4(Y).$$
Proof Since $Y^{\mathrm{T}} Y = \mathrm{vec}^{\mathrm{T}}(I_p)(Y \otimes Y)$,
$$Y Y^{\mathrm{T}} Y = Y\,\mathrm{vec}^{\mathrm{T}}(I_p)(Y \otimes Y) = (I_p Y) \otimes \{\mathrm{vec}^{\mathrm{T}}(I_p)(Y \otimes Y)\} = \{I_p \otimes \mathrm{vec}^{\mathrm{T}}(I_p)\}\{Y \otimes (Y \otimes Y)\} = \{I_p \otimes \mathrm{vec}^{\mathrm{T}}(I_p)\} Y^{\langle 3\rangle}$$
(Terdik [22, p. 406]) follows, which gives


$$\mathrm{vec}(Y Y^{\mathrm{T}} Y Y^{\mathrm{T}}) = Y \otimes [\{I_p \otimes \mathrm{vec}^{\mathrm{T}}(I_p)\} Y^{\langle 3\rangle}] = (I_p Y) \otimes [\{I_p \otimes \mathrm{vec}^{\mathrm{T}}(I_p)\} Y^{\langle 3\rangle}] = \{I_p \otimes I_p \otimes \mathrm{vec}^{\mathrm{T}}(I_p)\} Y^{\langle 4\rangle} = \{I_{p^2} \otimes \mathrm{vec}^{\mathrm{T}}(I_p)\} Y^{\langle 4\rangle}.$$
Using the above formula and the definition of $B(Y)$, we have
$$\mathrm{vec}\{B(Y)\} = \mathrm{vec}\{E(Y^{\mathrm{T}} Y\, Y Y^{\mathrm{T}}) - (p+2) I_p\} = \{I_{p^2} \otimes \mathrm{vec}^{\mathrm{T}}(I_p)\} E(Y^{\langle 4\rangle}) - (p+2)\mathrm{vec}(I_p)$$
$$= \{I_{p^2} \otimes \mathrm{vec}^{\mathrm{T}}(I_p)\}\left\{\kappa_4(Y) + \sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle})\right\} - (p+2)\mathrm{vec}(I_p).$$
In the above result,
$$\{I_{p^2} \otimes \mathrm{vec}^{\mathrm{T}}(I_p)\}\sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle}) = \sum_{i,j,k,l=1}^{p}(e_{(i)} \otimes e_{(j)})\delta_{kl}(\delta_{ij}\delta_{kl} + \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}) = p\sum_{i,j=1}^{p}(e_{(i)} \otimes e_{(j)})\delta_{ij} + 2\sum_{i=1}^{p}(e_{(i)} \otimes e_{(i)}) = (p+2)\mathrm{vec}(I_p).$$
Consequently, we obtain $\mathrm{vec}\{B(Y)\} = \{I_{p^2} \otimes \mathrm{vec}^{\mathrm{T}}(I_p)\}\kappa_4(Y)$, which is the required result. Q.E.D.

In Theorem 10.7, $Y^{\mathrm{T}} Y = \mathrm{vec}^{\mathrm{T}}(I_p)(Y \otimes Y)$ is used, whose derivation may seem unnecessary since most readers can easily derive the result. However, this equation involves basic properties of the vec operator, the Kronecker product and the identity matrix. Terdik [22, p. 406] gave the following explanation:
$$Y^{\mathrm{T}} Y = \sum_{i=1}^{p} Y_i^2 = \sum_{i=1}^{p}(e_{(i)}^{\mathrm{T}} Y) \otimes (e_{(i)}^{\mathrm{T}} Y) = \left(\sum_{i=1}^{p} e_{(i)}^{\langle 2\rangle \mathrm{T}}\right) Y^{\langle 2\rangle} = \mathrm{vec}^{\mathrm{T}}(I_p) Y^{\langle 2\rangle}.$$
One of the basic properties of the identity matrix used in this explanation is $I_p = \sum_{i=1}^{p} e_{(i)} e_{(i)}^{\mathrm{T}}$. The formula $\mathrm{vec}(a b^{\mathrm{T}}) = b \otimes a$, due to the definitions of the vec operator and the Kronecker product, is also used. Further, $(AB) \otimes (CD) = (A \otimes C)(B \otimes D)$, which holds when the matrix products exist, is employed. Another explanation may be


$$Y^{\mathrm{T}} Y = Y^{\mathrm{T}} I_p Y = (Y^{\mathrm{T}} I_p Y)^{\mathrm{T}} = \{(Y^{\mathrm{T}} \otimes Y^{\mathrm{T}})\mathrm{vec}(I_p)\}^{\mathrm{T}} = \mathrm{vec}^{\mathrm{T}}(I_p)(Y \otimes Y),$$
where $\mathrm{vec}(ABC) = (C^{\mathrm{T}} \otimes A)\mathrm{vec}(B)$ and $(A \otimes B)^{\mathrm{T}} = A^{\mathrm{T}} \otimes B^{\mathrm{T}}$ are used. An extension of the above formula is
$$\sum_{i=1}^{p} Y_i^q = \sum_{i=1}^{p}(e_{(i)}^{\mathrm{T}} Y)^q = \sum_{i=1}^{p}(e_{(i)}^{\mathrm{T}} Y)^{\langle q\rangle} = \sum_{i=1}^{p} e_{(i)}^{\langle q\rangle \mathrm{T}} Y^{\langle q\rangle} \quad (q = 1, 2, \ldots),$$
where $e_{(i)}^{\langle q\rangle} = e_{(i)} \otimes \cdots \otimes e_{(i)}$ ($q$ times of $e_{(i)}$) is the $p^q \times 1$ vector whose $\{i + (i-1)p + \cdots + (i-1)p^{q-1}\}$-th element is 1, with the remaining ones being zero. When using the lexicographical order, this unit element is located in the $[i, \ldots, i]$-th position ($q$ times of $i$).

Note that the excess kurtosis matrix given by Cardoso and Mòri et al. does not include the four-variate fourth moments/cumulants, i.e., those for $Y_i$, $Y_j$, $Y_k$ and $Y_l$ ($i$, $j$, $k$ and $l$ distinct). It is also found that
$$\mathrm{tr}\{B(Y)\} = \mathrm{tr}\{E(Y Y^{\mathrm{T}} Y Y^{\mathrm{T}})\} - p(p+2) = b_{2,p} - p(p+2),$$
which is the excess kurtosis version of $b_{2,p}$ addressed earlier. Kollo's [10] non-excess kurtosis matrix (written $\bar{B}(Y)$ here to distinguish it from the Cardoso–Mòri et al. matrix) is defined by

$$\bar{B}(Y) = \sum_{i=1}^{p}\sum_{j=1}^{p} E(Y_i Y_j\, Y Y^{\mathrm{T}}),$$
whose relation to $\kappa_4(Y)$ was obtained by Jammalamadaka et al. [8, p. 616]:
$$\mathrm{vec}\{\bar{B}(Y)\} = E(1_{p^2}^{\mathrm{T}} Y^{\langle 2\rangle}\, Y^{\langle 2\rangle}) = E\{Y^{\langle 2\rangle} \otimes (1_{p^2}^{\mathrm{T}} Y^{\langle 2\rangle})\} = E\{(I_{p^2} \otimes 1_{p^2}^{\mathrm{T}}) Y^{\langle 4\rangle}\},$$
which is also written as
$$\mathrm{vec}\{\bar{B}(Y)\} = E\{(1_{p^2}^{\mathrm{T}} Y^{\langle 2\rangle}) \otimes Y^{\langle 2\rangle}\} = E\{(1_{p^2}^{\mathrm{T}} \otimes I_{p^2}) Y^{\langle 4\rangle}\}.$$
Using the former one, they gave
$$\mathrm{vec}\{\bar{B}(Y)\} = (I_{p^2} \otimes 1_{p^2}^{\mathrm{T}}) E(Y^{\langle 4\rangle}) = (I_{p^2} \otimes 1_{p^2}^{\mathrm{T}})\left\{\kappa_4(Y) + \sum^{3} E^{\langle 2\rangle}(Y^{\langle 2\rangle})\right\} = (I_{p^2} \otimes 1_{p^2}^{\mathrm{T}})\left\{\kappa_4(Y) + \sum^{3}\mathrm{vec}^{\langle 2\rangle}(I_p)\right\}.$$
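As a sanity check on the constant $p + 2$ in these kurtosis matrices, the following numpy sketch (illustrative) computes $E(Y Y^{\mathrm{T}} Y Y^{\mathrm{T}})$ for $Y \sim N(0, I_p)$ from Isserlis' theorem, confirming that $B(Y) = 0$ and $\mathrm{tr}\{B(Y)\} = b_{2,p} - p(p+2) = 0$ under normality:

```python
import numpy as np
from itertools import product

p = 3
# E(Y_i Y_j Y_j Y_l) = 2 delta_ij delta_jl + delta_il for Y ~ N(0, I_p) (Isserlis)
M = np.zeros((p, p))                        # M = E(Y Y^T Y Y^T)
for i, l in product(range(p), repeat=2):
    M[i, l] = sum(2 * (i == j) * (j == l) + (i == l) for j in range(p))

B = M - (p + 2) * np.eye(p)                 # Cardoso/Mori et al. excess kurtosis matrix
print(np.array_equal(M, (p + 2) * np.eye(p)))   # True: B(Y) = 0 under normality
print(np.trace(M) - p * (p + 2))                # 0.0: tr B(Y) = b_{2,p} - p(p+2)
```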


Based on their above result, the excess kurtosis counterpart of $\mathrm{vec}\{\bar{B}(Y)\}$ is given by
$$\mathrm{vec}\{\bar{B}(Y)\} - (I_{p^2} \otimes 1_{p^2}^{\mathrm{T}})\sum^{3}\mathrm{vec}^{\langle 2\rangle}(I_p),$$
where the normalizing constant subtracted becomes
$$(I_{p^2} \otimes 1_{p^2}^{\mathrm{T}})\sum^{3}\mathrm{vec}^{\langle 2\rangle}(I_p) = (I_p \otimes I_p \otimes 1_p^{\mathrm{T}} \otimes 1_p^{\mathrm{T}})\sum^{3}\mathrm{vec}^{\langle 2\rangle}(I_p) = \sum_{i,j,k,l=1}^{p}(e_{(i)} \otimes e_{(j)})(\delta_{ij}\delta_{kl} + \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk})$$
$$= \sum_{i,j=1}^{p}(e_{(i)} \otimes e_{(j)})(\delta_{ij}\, p + 2) = p\sum_{i=1}^{p} e_{(i)} \otimes e_{(i)} + 2\sum_{i,j=1}^{p} e_{(i)} \otimes e_{(j)} = p\,\mathrm{vec}(I_p) + 2 \cdot 1_p^{\langle 2\rangle}.$$
Using the inverse vec operation, we have the excess counterpart of the kurtosis matrix $\bar{B}(Y)$ as
$$\bar{B}(Y) - p I_p - 2 \cdot 1_p 1_p^{\mathrm{T}}.$$

10.4 Elimination Matrices and Non-duplicated Multivariate Skewness and Kurtosis

Meijer [16, Sect. 2.1] and Jammalamadaka et al. [8, Sect. 5] defined the $\{p(p+1)(p+2)/3!\} \times 1$ vector for the third cumulants of distinct elements. Since the order of the non-duplicated elements in the vector is arbitrary, we employ the lexicographical order for the subscripts of the elements such that only the distinct elements appearing earliest in this order are kept for identification. That is,
$$\kappa_{3D}(Y) = \{\kappa_{ijk}\} = (\kappa_{111}, \kappa_{112}, \ldots, \kappa_{p-1,p,p}, \kappa_{ppp})^{\mathrm{T}},$$
where $\kappa_{ijk} = E(Y_i Y_j Y_k)$ $(1 \le i \le j \le k \le p)$ using Jammalamadaka et al.'s notation. When $p = 2$, we have $\kappa_{3D}(Y) = (\kappa_{111}, \kappa_{112}, \kappa_{122}, \kappa_{222})^{\mathrm{T}}$.


The ratio of the number of distinct elements to that of the duplicated ones is $p(p+1)(p+2)/(p^3\, 3!) = (1 + p^{-1})(1 + 2p^{-1})/6$, which is 1/2 when $p = 2$ and approaches 1/6 as $p$ increases. Duplicated elements already occur for the second cumulant, i.e., the covariance matrix of a $p \times 1$ random vector $X$, whose vectorized version is denoted by
$$\kappa_2(X) = \mathrm{vec}\{\mathrm{cov}(X)\} = \mathrm{vec}(\Sigma).$$
The non-duplicated counterpart of $\kappa_2(X)$ is denoted by $\kappa_{2D}(X)$, whose relation to $\kappa_2(X)$ is obtained using the $p^2 \times \{p(p+1)/2\}$ duplication matrix $D_{2p}$ introduced by Browne [3]:
$$\kappa_2(X) = D_{2p}\kappa_{2D}(X).$$
$D_{2p}$ is also written as $D$ and $D_p$ (see, e.g., Magnus and Neudecker [13, 14, Sect. 7, Chap. 3]; Meijer [16]) and has a single unit element in every row, with the remaining elements being zero. When $p = 2$,
$$\kappa_2(X) = \mathrm{vec}\begin{pmatrix}\kappa_{11} & \kappa_{12}\\ \kappa_{21} & \kappa_{22}\end{pmatrix} = \begin{pmatrix}\kappa_{11}\\ \kappa_{21}\\ \kappa_{12}\\ \kappa_{22}\end{pmatrix} = D_{2p}\kappa_{2D}(X) = \begin{pmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{pmatrix}\begin{pmatrix}\kappa_{11}\\ \kappa_{12}\\ \kappa_{22}\end{pmatrix},$$
where $\kappa_{ij} = \kappa_{ij}(X) = \sigma_{ij}$ $(i, j = 1, 2)$ and $\kappa_{12} = \kappa_{21}$. For arbitrary $p$, noting that $D_{2p}$ is of full column rank, $\kappa_2(X) = D_{2p}\kappa_{2D}(X)$ gives
$$(D_{2p}^{\mathrm{T}} D_{2p})^{-1} D_{2p}^{\mathrm{T}}\kappa_2(X) = D_{2p}^{+}\kappa_2(X) = \kappa_{2D}(X),$$
where $D_{2p}^{+}$ is the left inverse or the Moore–Penrose generalized (g-)inverse (MP inverse) of $D_{2p}$. The MP inverse is a special case of the g-inverse, denoted by $A^{-}$ for $A$ and having the property $A A^{-} A = A$, with the added restrictions $A^{+} A A^{+} = A^{+}$, $(A A^{+})^{\mathrm{T}} = A A^{+}$ and $(A^{+} A)^{\mathrm{T}} = A^{+} A$ as well as $A A^{+} A = A$ (see, e.g., Magnus and Neudecker [14, Chap. 2]; Yanai et al. [23, Sect. 3.3.4]). $D_{2p}^{+}$ is the special case of the MP inverse for a matrix of full column rank, which is obtained algebraically from $D_{2p}$ as shown above. When $p = 2$,
$$D_{2p}^{+} = (D_{2p}^{\mathrm{T}} D_{2p})^{-1} D_{2p}^{\mathrm{T}} = \begin{pmatrix}1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 1\end{pmatrix}^{-1}\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & 1 & 1 & 0\\ 0 & 0 & 0 & 1\end{pmatrix} = \begin{pmatrix}1 & 0 & 0 & 0\\ 0 & 1/2 & 1/2 & 0\\ 0 & 0 & 0 & 1\end{pmatrix}.$$
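Duplication matrices of any order can be built mechanically from sorted index multisets; the following numpy sketch (a hypothetical helper, not Meijer's or the book's implementation) constructs $D_{2p}$ and the triplication matrix for $p = 2$ and shows that the MP inverse averages duplicated positions:

```python
import numpy as np
from itertools import product, combinations_with_replacement

def duplication(p, q):
    """q-way duplication matrix (q = 2: D_2p; q = 3: the triplication matrix).
    One unit element per row, mapping distinct to duplicated positions."""
    distinct = list(combinations_with_replacement(range(p), q))
    D = np.zeros((p ** q, len(distinct)))
    for r, idx in enumerate(product(range(p), repeat=q)):
        D[r, distinct.index(tuple(sorted(idx)))] = 1.0
    return D

D2 = duplication(2, 2)
D2plus = np.linalg.inv(D2.T @ D2) @ D2.T        # MP inverse of a full-column-rank matrix
print(D2plus)                                   # [[1,0,0,0],[0,1/2,1/2,0],[0,0,0,1]]

vecS = np.array([3.0, 0.4, 0.4, 2.0])           # vec of a symmetric 2 x 2 matrix
print(D2plus @ vecS)                            # distinct elements (3.0, 0.4, 2.0)

D3 = duplication(2, 3)                          # D_3p for p = 2
print(D3.shape)                                 # (8, 4) = p^3 x p(p+1)(p+2)/3!
```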


It is found that $D_{2p}^{+}$ is the transpose of $D_{2p}$ in which the unit values corresponding to the duplicated elements in $\kappa_2(X) = D_{2p}\kappa_{2D}(X)$ are each replaced by the reciprocal of the number of duplicates, which is expected from the formula $D_{2p}^{+} = (D_{2p}^{\mathrm{T}} D_{2p})^{-1} D_{2p}^{\mathrm{T}}$. In other words, $D_{2p}^{+}$ transforms $\mathrm{vec}(\Sigma)$ such that $\sigma_{ij}$ and $\sigma_{ji}$ are replaced by their averaged single value $\sigma_{ij} = \sigma_{ji} = (\sigma_{ij} + \sigma_{ji})/2$ $(1 \le i \le j \le p)$. This property also holds when $\Sigma$ is replaced by an asymmetric matrix. Consider the matrix
$$D_{2p} D_{2p}^{+} = D_{2p}(D_{2p}^{\mathrm{T}} D_{2p})^{-1} D_{2p}^{\mathrm{T}} = \begin{pmatrix}1 & 0 & 0 & 0\\ 0 & 1/2 & 1/2 & 0\\ 0 & 1/2 & 1/2 & 0\\ 0 & 0 & 0 & 1\end{pmatrix} = \left\{I_4 + \begin{pmatrix}1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1\end{pmatrix}\right\}\Big/2 = (I_4 + K_{(1324)}^{-1})/2 = S_{(d_1, d_2)} = S_{(2,2)},$$
which is an (unscaled) symmetrizer defined in the first paragraph of Sect. 10.2, where $d_1$ and $d_2$ are the numbers of columns and rows of $A$ in $S_{(d_1, d_2)}\mathrm{vec}(A)$, or the dimensions of $a_1$ and $a_2$ in $S_{(d_1, d_2)}(a_1 \otimes a_2)$, respectively. Note that $D_{2p} D_{2p}^{+} = S_{(2,2)}$ is a special case of the projection matrix, which is symmetric and idempotent, projecting a vector onto the space spanned by the column vectors of $D_{2p}$.

In the equation $\kappa_2(X) = D_{2p}\kappa_{2D}(X)$, it is found that $D_{2p}$ is a unique solution. On the other hand, in $D_{2p}^{+}\kappa_2(X) = \kappa_{2D}(X)$, $D_{2p}^{+}$ can be replaced by infinitely many other matrices. For instance, when $p = 2$, we find that
$$\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & w & 1 - w & 0\\ 0 & 0 & 0 & 1\end{pmatrix}\kappa_2(X) = \kappa_{2D}(X) \quad (-\infty < w < \infty),$$
since $w\sigma_{21} + (1 - w)\sigma_{12} = \sigma_{12} = \sigma_{21}$. Among the many solutions other than $w = 1/2$, those with $w = 1$ and $w = 0$ are of interest in that they give the vectors $(\sigma_{11}, \sigma_{21}, \sigma_{22})^{\mathrm{T}}$ and $(\sigma_{11}, \sigma_{12}, \sigma_{22})^{\mathrm{T}}$, eliminating the supra- and infra-diagonal elements, respectively, though in this case with symmetric $\Sigma$ they are equal. When $w = 1$,

$$\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & w & 1 - w & 0\\ 0 & 0 & 0 & 1\end{pmatrix} = \begin{pmatrix}1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1\end{pmatrix}$$
is a special case of the $\{p(p+1)/2\} \times p^2$ elimination matrix $L_{2p}$, which is also denoted by $L_p$ in this book and $L$ by Magnus and Neudecker [13]. Though $L_{2p}$ seems to be restricted to the case eliminating supra-diagonal elements in a narrow sense, other cases eliminating some of the elements in various ways can be considered using the same notation $L_{2p}$ or $L_p$ when confusion does not occur. For instance, when $w = 0$ in the above case, $L_p$ becomes the matrix eliminating the infra-diagonal elements of a matrix as mentioned earlier. When eliminating the off-diagonal elements, $L_p$ becomes a $p \times p^2$ matrix, e.g.,
$$\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1\end{pmatrix}$$
in the case of $p = 2$. Finally, for standardized $Y$ when $p = 2$, we note that
$$\kappa_2(Y) = D_{2p}\kappa_{2D}(Y) = (1, 0, 0, 1)^{\mathrm{T}} \quad\text{and}\quad D_{2p}^{+}\kappa_2(Y) = \kappa_{2D}(Y) = (1, 0, 1)^{\mathrm{T}}.$$

Consider the $p^3 \times 1$ vector $\kappa_3(Y)$ of the multivariate third cumulants including duplicated elements. Recall the corresponding vector consisting of the distinct elements, $\kappa_{3D}(Y)$, shown in the first paragraph of this section. Then, we have
$$\kappa_3(Y) = D_{3p}\kappa_{3D}(Y)$$
(Meijer [16, Sect. 2.1]), where $D_{3p}$ is the $p^3 \times \{p(p+1)(p+2)/3!\}$ triplication (or 3-way duplication) matrix of full column rank (Meijer used the notation $T_p$ for $D_{3p}$), which has a single unit element in every row with the remaining elements being zero, as in the case of $D_{2p}\ (= D_p)$. Since $D_{3p}$ is of full column rank, we obtain
$$(D_{3p}^{\mathrm{T}} D_{3p})^{-1} D_{3p}^{\mathrm{T}}\kappa_3(Y) = D_{3p}^{+}\kappa_3(Y) = \kappa_{3D}(Y),$$
where $D_{3p}^{+}$ is the $\{p(p+1)(p+2)/3!\} \times p^3$ matrix replacing the duplicated elements by their single means (Meijer used the notation $T_p^{+}$ for $D_{3p}^{+}$). As was the case with $D_{2p}^{+}$, $D_{3p}$ is a unique solution for $\kappa_3(Y) = D_{3p}\kappa_{3D}(Y)$, while $D_{3p}^{+}$ in $D_{3p}^{+}\kappa_3(Y) = \kappa_{3D}(Y)$ can be replaced by many other matrices.

Let $\kappa_{3D}^{(l)}(Y)$ $(l = 1, 2, 3)$ be the $p \times 1$, $\{p(p-1)\} \times 1$ and $\{p(p-1)(p-2)/6\} \times 1$ vectors consisting of the uni-, bi- and tri-variate third cumulants, respectively, where the elements are located according to the corresponding lexicographic order in the selected elements of $\kappa_3(Y)$:


$$\kappa_{3D}^{(1)}(Y) = \sum_{i=1}^{p} e_{(i)} e_{(i)}^{\langle 3\rangle \mathrm{T}}\kappa_3(Y) \equiv L_{3D}^{(1)}\kappa_3(Y) = (\kappa_{111}, \kappa_{222}, \ldots, \kappa_{ppp})^{\mathrm{T}},$$
$$\kappa_{3D}^{(2)}(Y) = \sum_{1 \le i < j \le p}\left\{e_{\{(i-1)p^2 + (j-1)p + j\}}(e_{(i)} \otimes e_{(j)}^{\langle 2\rangle})^{\mathrm{T}} + e_{\{(i-1)p^2 + (i-1)p + j\}}(e_{(i)}^{\langle 2\rangle} \otimes e_{(j)})^{\mathrm{T}}\right\}\kappa_3(Y) \equiv L_{3D}^{(2)}\kappa_3(Y) = (\kappa_{112}, \ldots, \kappa_{p-1,p,p})^{\mathrm{T}},$$
$$\kappa_{3D}^{(3)}(Y) = \sum_{1 \le i < j < k \le p} e_{\{(i-1)p^2 + (j-1)p + k\}}(e_{(i)} \otimes e_{(j)} \otimes e_{(k)})^{\mathrm{T}}\kappa_3(Y) \equiv L_{3D}^{(3)}\kappa_3(Y) = (\kappa_{123}, \kappa_{124}, \ldots, \kappa_{1,p-1,p}, \kappa_{234}, \kappa_{235}, \ldots, \kappa_{p-2,p-1,p})^{\mathrm{T}}.$$
In the above expressions, when $p = 2$, $\kappa_{3D}^{(3)}(Y)$ does not exist. Define
$$\kappa_{3D}^{(123)}(Y) = (L_{3D}^{(1)\mathrm{T}}, L_{3D}^{(2)\mathrm{T}}, L_{3D}^{(3)\mathrm{T}})^{\mathrm{T}}\kappa_3(Y) \equiv L_{3D}^{(123)}\kappa_3(Y) = \{\kappa_{3D}^{(1)\mathrm{T}}(Y), \kappa_{3D}^{(2)\mathrm{T}}(Y), \kappa_{3D}^{(3)\mathrm{T}}(Y)\}^{\mathrm{T}}.$$
Then, the elements of $\kappa_{3D}^{(123)}(Y)$ are permuted ones of $\kappa_{3D}(Y)$. Under bi- and tri-variate independence, $\kappa_{3D}^{(123)}(Y)$ becomes shorter than that without any independence.

Meijer [16, Sect. 3.1] and Jammalamadaka et al. [8, Sect. 5] defined the $\{p(p+1)(p+2)(p+3)/4!\} \times 1$ vector $\kappa_{4D}(Y)$ for the fourth cumulants of distinct elements using Jammalamadaka et al.'s notation. As in $\kappa_{3D}(Y)$, the lexicographical order of the subscripts of the elements is employed for identification. That is,
$$\kappa_{4D}(Y) = \{\kappa_{ijkl}\} = (\kappa_{1111}, \kappa_{1112}, \ldots, \kappa_{p-1,p,p,p}, \kappa_{pppp})^{\mathrm{T}},$$
where $\kappa_{ijkl} = E(Y_i Y_j Y_k Y_l) - \delta_{ij}\delta_{kl} - \delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}$ $(1 \le i \le j \le k \le l \le p)$. When $p = 2$, we have
$$\kappa_{4D}(Y) = (\kappa_{1111}, \kappa_{1112}, \kappa_{1122}, \kappa_{1222}, \kappa_{2222})^{\mathrm{T}}.$$
The ratio of the number of distinct elements to that of the duplicated ones is $p(p+1)(p+2)(p+3)/(p^4\, 4!) = (1 + p^{-1})(1 + 2p^{-1})(1 + 3p^{-1})/24$, which is 5/16 when $p = 2$ and approaches 1/24 as $p$ increases. Meijer [16, Sect. 3.1] showed that
$$\kappa_4(Y) = D_{4p}\kappa_{4D}(Y),$$
where $D_{4p}$ is the $p^4 \times \{p(p+1)(p+2)(p+3)/4!\}$ quadruplication (or 4-way duplication) matrix of full column rank (Meijer used the notation $Q_p$ for $D_{4p}$), which has a single unit element in every row with the remaining elements being zero, as in the case of $D_{3p}$. As for $D_{2p}$ and $D_{3p}$, we have


$$(D_{4p}^{\mathrm{T}} D_{4p})^{-1} D_{4p}^{\mathrm{T}}\kappa_4(Y) = D_{4p}^{+}\kappa_4(Y) = \kappa_{4D}(Y),$$
where $D_{4p}^{+}$ is the $\{p(p+1)(p+2)(p+3)/4!\} \times p^4$ matrix replacing the duplicated elements by their single means (Meijer used the notation $Q_p^{+}$ for $D_{4p}^{+}$). As before, $D_{4p}$ is a unique solution for $\kappa_4(Y) = D_{4p}\kappa_{4D}(Y)$, while $D_{4p}^{+}$ in $D_{4p}^{+}\kappa_4(Y) = \kappa_{4D}(Y)$ can be replaced by many other matrices.

Let $\kappa_{4D}^{(m)}(Y)$ $(m = 1, 2a, 2b, 3, 4)$ be the $p \times 1$, $\{p(p-1)\} \times 1$, $\{p(p-1)/2\} \times 1$, $\{p(p-1)(p-2)/2\} \times 1$ and $\{p(p-1)(p-2)(p-3)/24\} \times 1$ vectors consisting of the uni-, asymmetric-bi- (e.g., $\kappa_{1112}$), symmetric-bi- (e.g., $\kappa_{1122}$), tri- (e.g., $\kappa_{1233}$) and four-variate fourth cumulants, respectively, where the elements are located according to the corresponding lexicographic order in the selected elements of $\kappa_4(Y)$:
$$\kappa_{4D}^{(1)}(Y) = \sum_{i=1}^{p} e_{(i)} e_{(i)}^{\langle 4\rangle \mathrm{T}}\kappa_4(Y) \equiv L_{4D}^{(1)}\kappa_4(Y) = (\kappa_{1111}, \kappa_{2222}, \ldots, \kappa_{pppp})^{\mathrm{T}},$$
$$\kappa_{4D}^{(2a)}(Y) = \sum_{1 \le i < j \le p}\left\{e_{\{(i-1)p^3 + (i-1)p^2 + (i-1)p + j\}}(e_{(i)}^{\langle 3\rangle} \otimes e_{(j)})^{\mathrm{T}} + e_{\{(i-1)p^3 + (j-1)p^2 + (j-1)p + j\}}(e_{(i)} \otimes e_{(j)}^{\langle 3\rangle})^{\mathrm{T}}\right\}\kappa_4(Y) \equiv L_{4D}^{(2a)}\kappa_4(Y) = (\kappa_{1112}, \kappa_{1113}, \ldots, \kappa_{p-1,p,p,p})^{\mathrm{T}},$$
$$\kappa_{4D}^{(2b)}(Y) = \sum_{1 \le i < j \le p} e_{\{(i-1)p^3 + (i-1)p^2 + (j-1)p + j\}}(e_{(i)}^{\langle 2\rangle} \otimes e_{(j)}^{\langle 2\rangle})^{\mathrm{T}}\kappa_4(Y) \equiv L_{4D}^{(2b)}\kappa_4(Y) = (\kappa_{1122}, \kappa_{1133}, \ldots, \kappa_{p-1,p-1,p,p})^{\mathrm{T}},$$
$$\kappa_{4D}^{(3)}(Y) = \sum_{1 \le i < j < k \le p} e_{\{(i-1)p^2 + (j-1)p + k\}}\{(e_{(i)} \otimes e_{(j)} \otimes e_{(k)}^{\langle 2\rangle})^{\mathrm{T}} + (e_{(i)} \otimes e_{(j)}^{\langle 2\rangle} \otimes e_{(k)})^{\mathrm{T}} + (e_{(i)}^{\langle 2\rangle} \otimes e_{(j)} \otimes e_{(k)})^{\mathrm{T}}\}\kappa_4(Y) \equiv L_{4D}^{(3)}\kappa_4(Y) = (\kappa_{1233}, \kappa_{1244}, \ldots, \kappa_{1,2,p-1,p-1}, \kappa_{2234}, \ldots, \kappa_{p-2,p-1,p,p})^{\mathrm{T}},$$
$$\kappa_{4D}^{(4)}(Y) = \sum_{1 \le i < j < k < l \le p} e_{\{(i-1)p^3 + (j-1)p^2 + (k-1)p + l\}}(e_{(i)} \otimes e_{(j)} \otimes e_{(k)} \otimes e_{(l)})^{\mathrm{T}}\kappa_4(Y) \equiv L_{4D}^{(4)}\kappa_4(Y) = (\kappa_{1234}, \kappa_{1235}, \ldots, \kappa_{1,p-2,p-1,p}, \kappa_{2345}, \ldots, \kappa_{p-3,p-2,p-1,p})^{\mathrm{T}}.$$
In the above expressions, when $p = 2$, $\kappa_{4D}^{(3)}(Y)$ and $\kappa_{4D}^{(4)}(Y)$ do not exist; and when $p = 3$, $\kappa_{4D}^{(4)}(Y)$ does not exist. Define


$$\kappa_{4D}^{(1234)}(Y) = (L_{4D}^{(1)\mathrm{T}}, L_{4D}^{(2a)\mathrm{T}}, L_{4D}^{(2b)\mathrm{T}}, L_{4D}^{(3)\mathrm{T}}, L_{4D}^{(4)\mathrm{T}})^{\mathrm{T}}\kappa_4(Y) \equiv L_{4D}^{(1234)}\kappa_4(Y) = \{\kappa_{4D}^{(1)\mathrm{T}}(Y), \kappa_{4D}^{(2a)\mathrm{T}}(Y), \kappa_{4D}^{(2b)\mathrm{T}}(Y), \kappa_{4D}^{(3)\mathrm{T}}(Y), \kappa_{4D}^{(4)\mathrm{T}}(Y)\}^{\mathrm{T}}.$$
Then, the elements of $\kappa_{4D}^{(1234)}(Y)$ are permuted ones of $\kappa_{4D}(Y)$. Under bi-, tri- and four-variate independence, $\kappa_{4D}^{(1234)}(Y)$ is shorter than that without any independence.

So far, various elimination matrices have been defined in this section. For the multivariate third cumulants, $L_{3D}^{(1)}$, $L_{3D}^{(2)}$, $L_{3D}^{(3)}$ and $L_{3D}^{(123)}$ are $p \times p^3$, $\{p(p-1)\} \times p^3$, $\{p(p-1)(p-2)/6\} \times p^3$ and $p^{(3)} \times p^3$ matrices, where $p^{(3)} = p(p+1)(p+2)/3!$. For the multivariate fourth cumulants, $L_{4D}^{(1)}$, $L_{4D}^{(2a)}$, $L_{4D}^{(2b)}$, $L_{4D}^{(3)}$, $L_{4D}^{(4)}$ and $L_{4D}^{(1234)}$ are $p \times p^4$, $\{p(p-1)\} \times p^4$, $\{p(p-1)/2\} \times p^4$, $\{p(p-1)(p-2)/2\} \times p^4$, $\{p(p-1)(p-2)(p-3)/24\} \times p^4$ and $p^{(4)} \times p^4$ matrices, where $p^{(4)} = p(p+1)(p+2)(p+3)/4!$.

Remark 10.5 Let $L$ be an $m \times n$ generic elimination matrix, including permutation matrices, e.g., $L_{3D}^{(123)}$ and $L_{4D}^{(1234)}$, for convenience. Then, by definition $m \le n$, where $m = n$ indicates that $L$ is a permutation matrix. Suppose that $y^* = L y$, where $y^*$ consists of some or all of the elements of $y$. Consider $L^{+} y^*$, where the $n \times m$ matrix $L^{+}$ is the MP inverse of $L$:
$$L^{+} = L^{\mathrm{T}}(L L^{\mathrm{T}})^{-1} = L^{\mathrm{T}} I_m^{-1} = L^{\mathrm{T}},$$
which is different from the case of an $n \times m$ duplication matrix generically denoted by $D$ with $D^{+} \neq D^{\mathrm{T}}$. It is found that $L^{+} y^*$ is $y$ with the eliminated elements replaced by 0s. That is, $L^{+} y^*$ partially restores $y$ unless $m = n$, which corresponds to the non-existence of the inverse transformation of $L$ (Magnus and Neudecker [13, p. 427]).
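Remark 10.5 can be illustrated directly: for a row-orthonormal elimination matrix $L$, the MP inverse reduces to $L^{\mathrm{T}}$, and $L^{+}(Ly)$ restores $y$ with zeros in the eliminated positions. A minimal numpy sketch (a hypothetical example keeping the positions $[i, i, i]$ of a $p^3$-vector):

```python
import numpy as np

p = 3
# elimination matrix L keeping only the positions [i, i, i] of a p^3-vector
rows = [i * p * p + i * p + i for i in range(p)]
L = np.zeros((p, p ** 3))
L[np.arange(p), rows] = 1.0

Lplus = L.T @ np.linalg.inv(L @ L.T)   # = L^T, since L L^T = I_m
y = np.arange(p ** 3, dtype=float)
restored = Lplus @ (L @ y)             # y with the eliminated elements replaced by 0s
print(np.allclose(Lplus, L.T))         # True
```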

References

1. Arnold BC, Groeneveld RA (1995) Measuring skewness with respect to the mode. Am Stat 49:34–38
2. Balakrishnan N, Brito MR, Quiroz AJ (2007) A vectorial notion of skewness and its use in testing for multivariate symmetry. Commun Stat Theor Methods 36:1757–1767
3. Browne MW (1974) Generalized least-squares estimators in the analysis of covariance structures. South Afr Stat J 8:1–24. Reprinted in Aigner DJ, Goldberger AS (eds) Latent variables in socioeconomic models. North Holland, Amsterdam, pp 205–226 (1977)
4. Cardoso JF (1989) Source separation using higher order moments. In: International conference on acoustics, speech, and signal processing. IEEE, pp 2109–2112
5. Ekström M, Jammalamadaka SR (2012) A general measure of skewness. Stat Probab Lett 82:1559–1568
6. Holmquist B (1988) Moments and cumulants of the multivariate normal distribution. Stoch Anal Appl 6:273–278
7. Holmquist B (1996) The d-variate vector and Hermite polynomial of order k. Linear Algebra Appl 237(238):155–190
8. Jammalamadaka SR, Taufer E, Terdik GH (2021) On multivariate skewness and kurtosis. Sankhyā A 83:607–644
9. Kano Y (1997) Beyond third-order efficiency. Sankhyā A 59:179–197
10. Kollo T (2008) Multivariate skewness and kurtosis measures with an application in ICA. J Multivar Anal 99:2328–2338
11. Koziol JA (1989) A note on measures of multivariate kurtosis. Biom J 31:619–624
12. Magnus JR, Neudecker H (1979) The commutation matrix: some properties and applications. Ann Stat 7:381–394
13. Magnus JR, Neudecker H (1980) The elimination matrix: some lemmas and applications. SIAM J Algebraic Discrete Methods 1:422–449
14. Magnus JR, Neudecker H (1999) Matrix differential calculus with applications in statistics and econometrics, rev edn. Wiley, New York
15. Mardia KV (1970) Measures of multivariate skewness and kurtosis with applications. Biometrika 57:519–530
16. Meijer E (2005) Matrix algebra for higher order moments. Linear Algebra Appl 410:112–134
17. Mòri TF, Rohatgi VK, Székely GJ (1994) On multivariate skewness and kurtosis. Theor Probab Appl 38:547–551
18. Neudecker H, Wansbeek T (1983) Some results on commutation matrices, with statistical applications. Can J Stat 11:221–231
19. Ogasawara H (2017) Extensions of Pearson's inequality between skewness and kurtosis to multivariate cases. Stat Probab Lett 130:12–16
20. Srivastava MS (1984) A measure of skewness and kurtosis and a graphical method for assessing multivariate normality. Stat Probab Lett 2:263–267
21. Terdik G (2002) Higher order statistics and multivariate vector Hermite polynomials for nonlinear analysis of multidimensional time series. Teorija verojatnosteĭ i matematičeskaja statistika (Theor Probab Math Stat) 66:147–168
22. Terdik G (2021) Multivariate statistical methods: going beyond the linear. Springer Nature, Cham
23. Yanai H, Takeuchi K, Takane Y (2011) Projection matrices, generalized inverse matrices, and singular value decomposition. Springer, New York

Index

A
Abramowitz, 73, 78
Absolute moments
  central, 95
  of real-valued orders, 47, 70, 94
Actuarial science, 3
Affine transformation, 105, 119, 125, 241
Aitken, 3, 105
Ali, 266
Allard, 252
Amoroso distribution, 278
Andrews, 3, 105
Arismendi, 1, 3
Arnold, 3, 105, 108, 316
Artzner, 3
Ascending factorial, 59, 71
Asymmetric-bivariate, 323
Asymmetric expressions, 22, 24, 257
Azzalini, 1, 105, 108, 114, 119, 155, 200, 218, 266

B
Balakrishnan, 57, 318
Barr, 1
Basic parabolic cylinder (BPC) distribution
  of the first kind, 69, 70
  of the second kind, 69, 70
  of the third kind, 69, 70
Bayes, 114
Beaver, 3, 105, 108
Behaviormetrics, 105, 252
Bimodal distribution, 84, 193
Binomial expansion, 94, 95
Birnbaum, 3, 105
Bishop, 274

Bollen, 252
Bouvier, 92
Branco, 288
Breeding
  animals, 3
  plants, 3
Brito, 318
Broda, 3
Browne, 329
Burkardt, 1

C
Cabral, 266
Capitanio, 105, 108, 218, 266
Castro, 266
Cha, 252
Chandra, 56, 57, 68
Chang, 266
Change of variables, 70
Chebyshev inequality, 59, 60
Chi-distribution
  scaled, 271, 279
Chi-square distribution
  scaled, 279
Cho, 252
Circular reasoning, 23
Closed-form, 58, 76
Closed skew-normal (CSN) family of distributions, 105
Cochran, 3
Commutation matrix, 299, 302
Commutator, 299, 301, 309–312, 315
Complementary interval, 48
Completing squares, 174
Compound symmetry, 207

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 H. Ogasawara, Expository Moments for Pseudo Distributions, Behaviormetrics: Quantitative Approaches to Human Behavior 2, https://doi.org/10.1007/978-981-19-3525-1


Consul, 56
Convergence, 91, 92, 203
Convolution, 171, 186–188, 223, 252
Cornish, 285
Covariance matrix, 9, 106, 189, 217, 224, 231, 329
CPU time, 91, 92
Crooks, 278
Cubature, 92
Cumulant generating function (CGF), 5, 6, 116, 185, 226
Cumulative distribution function (CDF), 6, 7, 49, 56, 59, 60, 70, 71, 75, 76, 88, 89, 91, 92, 142, 253, 263, 288

D
Dalla Valle, 105, 119
de Doncker-Kapenga, 91
Degree(s) of freedom, 51, 96, 117, 119, 200, 266, 268–272, 274–276, 280–283, 285–288, 296, 297
Delbaen, 3
Density contour, 93, 209
Determinant, 109
Dey, 1, 81, 288
Differential equation, 74, 82, 83
Direct product, 230, 306
Distinct elements, 328, 329, 331, 332
DLMF, 69, 71, 78, 82, 249
Domínguez-Molina, 105, 108, 111, 117, 120, 122–125, 145, 217, 218
Double factorial, 37, 63, 65, 155
Double precision, 91, 158
Double truncation, 1, 3, 47, 48, 62, 141–143, 149, 166, 167, 252, 265
Doubly truncated, 1
Dunkl, 38

E
Eber, 3
Ekström, 316
Elimination matrix, 328, 331, 334
Elliptical truncation, 3
Equally-spaced sequence, 198
Equated means, 203, 209
Erdélyi, 69, 70, 73, 74, 78, 82
Excess kurtosis matrix, 325, 327
Exchangeability, 207, 208
Exchangeable variables, 206
Exploratory factor analysis (EFA), 252

F
Factorial
  double, 37, 63, 65, 155
  rising or ascending, 59, 71, 248
Famoye, 56
Fisher, 1, 62, 81, 82
Fisher's In function, 62, 81
Flecher, 252
Floor function, 38
Full column rank, 122, 176, 329, 331, 332
Fung, 266

G
Galarza, 1, 266
Galko, 71
Gamma distribution
  generalized, 278
  inverse, 274, 275, 287
  inverse-square-root gamma, 277
  power, 265, 277, 284, 295
  square-root gamma, 277
Gamma function
  lower incomplete, 50, 71, 245
  upper incomplete, 56, 57
Garay, 266
Gaure, 92
Gauss hypergeometric function, 250
Gauss notation, 38
Ge, 1
Generalized gamma distribution, 278
Generalized inverse, 329
Generalized Poisson distributions (GPSs), 56
General truncation, 3, 105
Genton, 266
Genz, 1
Ghaderinezhad, 114
Ghosh, 56
Gianola, 3
G-inverse, 329
González-Farías, 105, 108, 113, 114, 120, 122–125, 217
Gorsuch, 252
Groeneveld, 3, 108, 316
Gupta, 105, 108, 113, 217, 266, 288, 294, 295

H
Hadamard product, 118
Hahn, 92
Half integer, 56, 57

Half normal, 281, 282
Half Poisson distribution, 56, 57, 64
Heath, 3
Henze, 114, 215–217
Henze's theorem, 215, 217
Hermite polynomials
  multivariate, 215, 226, 230
Herrendörfer, 3
Hidden or latent truncation, 251
Hill, 1
Holmes, 70
Holmquist, 230, 299, 310
Horrace, 1
Huang, 266

I
Idempotent, 313, 330
Identity matrix, 8, 109, 285, 300, 326
Incident function, 49
Incomplete gamma function
  lower, 50, 71, 245
  upper, 56, 57
Induction, 28, 37, 155, 274
Infinite series
  multiple, 95
Inflated dimensionality, 251, 252
Infra-diagonal, 330, 331
Inner truncation, 1, 3, 47, 141–143, 149, 153, 158, 166–168, 193, 265
Integer part, 38
Integral representation, 70–72, 76, 81, 91
Inverse chi distribution, 284
Inverse chi-square distribution, 268
Inverse gamma, 287
Inverse of a partitioned matrix, 8, 111, 112
Inverse permutation, 304, 311
Inverse transformation, 334
Inverse vec operation, 328
Ismail, 56
Iterated expectation, 108
Iterative replacement, 203, 207, 208

J
Jacobian, 267, 268, 270, 271, 281, 285, 286, 291
Jain, 56
Jammalamadaka, 310, 312, 316, 318, 319, 321, 322, 325, 327, 328, 332
Joe, 266
Johnson, 57, 92

K
Käärik, 266, 272

Kahaner, 91
Kamalja, 56
Kamat, 1, 251
Kan, 1, 81
Kano, 230, 299, 310, 313
Kemp, 38, 157
Kiêu, 92
Kim, 252
Kirkby, 274
Koller, 92
Kollo, 230, 266, 272, 318–320, 327
Kostylev, 69, 95
Kotz, 57, 266, 285
Koziol, 323, 325
Krenek, 252
Kronecker delta, 85, 245, 307
Kronecker product, 7, 125, 299, 304, 309–311, 315, 326
Kummer confluent hypergeometric function, 69, 71, 74, 248, 249
Kurtic
  kurtic-normal (KN) distribution, 149
  kurtic functions, 149
  kurtic normal-normal (KNN), 193
  leptokurtic, 149, 316
  mesokurtic, 149, 316
  platykurtic, 149, 316
Kurtosis
  excess, 77, 135, 142, 143, 149, 159, 162, 167, 181, 192, 194, 196, 204, 316, 321, 325, 327, 328

L
Lachos, 1, 81, 266
Landsman, 3
Latent variable models (LVMs), 251, 252
Law of total or iterated expectations, 108
Lawley, 3, 105
Lee, 1, 81
Left inverse, 329
Legendre duplication formula, 73
Leonard, 105, 114
Leptokurtic, 149, 316
Letac, 56
Lexicographically ordered, 127, 319
Lexicographical order, 307, 308, 322, 327, 328, 332
Ley, 114
L'Hôpital's rule
  extended L'Hôpital's rule, 164
Li, 266
Linear space, 121

Liseo, 105, 114
Lodge, 71
Loperfido, 105, 114, 119
Lower (left) tail, 1
Lower incomplete gamma function, 50, 71, 245
Löwner's sense, 186, 218
Lukacs, 126, 217, 218

M
Machine precision, 91–93
Magnus, 8, 37, 69, 78, 82, 299, 303, 304, 306, 329, 331, 334
Makov, 3
Manjunath, 1
Mardia's non-excess kurtosis, 321
Marsaglia, 108, 155
Mathai, 70
Matos, 1, 266
Matrix-square-root, 121, 286, 309
Mean-equated, 203, 204, 210
Mean equation, 205, 208
Mean mixture, 262
Mean vector, 106
Meeker, 3, 110
Meijer, 299, 328, 329, 331–333
Mesokurtic, 149, 316
Miller, 1, 71
Miller notation, 78
Mixture
  precision, 275, 277, 284, 285, 288
  variance, 276, 277
Mode
  global, 84
  local, 93
Moment equating, 193
Moment generating function (MGF), 1, 3–6, 14, 15, 17, 30, 60, 68, 83, 105, 114–116, 118, 120–122, 125, 131, 135, 136, 150, 160, 171, 174–176, 182, 185, 190, 191, 198, 200, 215–217, 220, 222–225, 235, 238–241, 253, 257, 259–261, 263
  pseudo MGF, 6, 18, 126, 224, 253, 257
Moments
  absolute, 47, 52, 53, 55, 56, 69, 70, 94, 95
  partial, 57, 62
Moore-Penrose generalized-inverse, 329
Mora, 56
Mòri, 318, 325, 327
MP inverse, 329, 334
Multi-modal distributions, 105
Multiple commutator, 299, 309, 311
Multiple integral, 255
Multiplication theorem, 73

Multivariate basic parabolic cylinder (BPC) distribution, 69, 85, 87, 92, 95
Multivariate cumulants, 16, 25, 26, 126, 127, 177, 226, 299, 309, 316, 320
Multivariate fourth cumulants, 321, 325, 334
Multivariate measure of excess kurtosis, 321
Multivariate measure of non-excess kurtosis, 323
Multivariate measure of skewness, 319
Multivariate PN and NN, 206
Multivariate t-distribution
  skew multivariate t-distribution, 288
Multivariate third cumulants, 320, 331, 334

N
Nabeya, 1, 251
Nadarajah, 266, 285
Narasimhan, 92
Navarro, 105
Naveau, 252
Necessary and sufficient condition, 126, 319
Neudecker, 8, 299, 303, 304, 306, 329, 331, 334
Nguyen, 274
NIST, 249
Non-duplicated multivariate skewness and kurtosis, 328
Non-normal, 105, 149, 189, 265, 266, 288
Normal
  discrete normal distribution, 171, 182, 222
  distribution, 57, 62, 70, 94, 96, 106, 154, 171, 187, 191, 192, 231, 235, 274, 284, 316
  exchangeable normal vector, 105
  normal-normal (NN) distributions, 171, 174, 231, 235, 262
  normal-separated normal-normal (NSNN), 225
  normal-separated pseudo-normal (NSPN), 126, 220
  normal mixture, 279–282
  prior density, 114
  standard, 43, 51, 57, 230, 288
  vector, 1, 3, 4, 69, 116, 215, 217, 224, 231
Normality, 1, 47, 78, 186, 187, 321, 325
Normalizer, 4, 10–12, 58, 59, 70, 85, 87, 89, 90, 95, 106, 107, 113, 114, 116, 124, 166, 167, 174, 186, 188, 237, 257, 263, 278
Normally distributed, 1, 3, 70, 81, 94, 107, 215–217, 222, 224, 235, 236, 265, 292

O
Oberhettinger, 38, 69, 82
Observable variables, 235, 252
Off-diagonal elements, 95, 331
Ogasawara, 1, 4, 47, 49, 51, 52, 54–57, 64, 66, 69, 71, 72, 81, 105, 106, 114, 159, 243, 244, 247, 248, 250, 251, 317, 320
O'Hagan, 105, 114
Olver, 91
Ord, 38, 159, 276
Order statistics, 105
Orthogonally rotated random vector, 319
Ortho-normal, 304
Overdispersion, 56

P
Parabolic cylinder distribution
  generalized, 70
  three-parametric, 69
Parabolic cylinder function
  incomplete, 69, 71, 89
  weighted, 70, 71, 89
Partial derivative, 1, 6, 7, 12, 23, 129, 253
Partial moment, 57, 62
Partitioned matrix
  inverse of, 8, 111, 112
Patenaude, 71
Paulson, 3, 105
Pearson, 1, 3, 81, 91, 105, 159, 320
Permutation matrix, 299–301, 303, 304, 334
Perspective view, 209
Pettere, 266
Pierson, 70
Piessens, 91
Plane truncation, 3
Platykurtic, 149, 316
Pochhammer notation, 59
Poisson distribution
  half, 56, 57, 64
  real-valued, 47, 55–57
Pollack, 51, 81
Porter, 91
Positively-skewed transformation, 320
Posterior density, 114
Power gamma distribution, 265
Power-precision, 278, 279
Power series, 92
Power-standard deviation, 279
Power-variance, 279
Precision mixture, 275, 277, 284, 285, 288
Principal component analysis (PCA), 252

Prior density, 114
Probabilist's Hermite polynomial, 29
Probability density function (PDF), 1, 3, 4, 48, 70, 84, 90, 91, 105, 106, 108, 110, 111, 113–115, 119, 129, 142, 150, 160, 171, 173, 174, 182, 185–193, 198, 203, 205, 209, 217, 223, 224, 235, 237, 238, 240, 242, 243, 251, 252, 257, 261–263, 265–272, 274, 275, 277–288, 290, 291, 294, 295
Probability function, 56, 57, 59, 187
Product sum of natural numbers (PSNN), 1, 29, 34
Progri, 70
Projection matrix, 121, 330
Pseudo non-normal distributions, 265
Pseudo-normal (PN) distribution
  family of distributions, 105
  singular pseudo-normal (SPN), 122
Pseudo random vector, 6, 15, 18, 126, 218, 224, 253
Psychometrics, 105, 252

Q
Quadruplication matrix, 332
Quasi mixture, 280–282
Quiroz, 318
Q-way or q-mode commutation matrix, 310

R
Radial truncation, 3
Raw moment, 76, 244, 272, 316
R Core Team, 91, 158
Real-valued Poisson distribution, 47, 55–57
Recurrence
  formula, 78, 80, 81
  relation, 78–81
Reference point, 47, 51, 174, 192, 193, 209
Reflection point, 303
Regression, 9, 56, 108, 120, 171, 186, 189, 217, 224
Reparametrization, 70, 251, 278, 295
Residual, 9, 74, 108, 189, 217, 224
R-function
  cubintegrate, 92
  integrate, 91
  wpc, 91
Rising factorial, 248
Risk, 3
Roberts, 105
Robotti, 1, 81
Rohatgi, 318
Rosa, 3

Roy, 56
R-package cubature, 92
R-Poisson, 57, 59, 60, 64, 66–68
Ruiz, 105

S
Sandoval, 105
Saxena, 70
Scale-free, 78, 93, 279, 317
Scale parameter, 49, 50, 59, 60, 70, 71, 75, 96, 119, 266, 270, 271, 274–276, 279, 280, 282, 283, 286, 287, 295
Sectionally truncated
  sectionally truncated normal (STN), 3–6, 14, 15, 17, 30, 116, 215, 224, 243, 253, 254
Sectional truncation, 1, 3, 47, 69, 105, 141, 171, 216, 235, 247, 265, 295
Selart, 266, 272
Selected regions, 1, 208
Selection intervals or stripes, 47
Seneta, 266
Series expansion, 57
Shape parameter, 49, 50, 59, 60, 71, 75, 96, 295
Sharp, 252
Shauly-Aharonov, 51, 81
Sherill, 1
Shushi, 3
Single truncation, 1, 47, 62, 70, 84, 96, 105, 141, 149, 252, 265, 266, 292
Singly truncated, 1, 251
Skewed unimodal distributions, 105
Skewness, 77, 135, 142, 143, 149, 159, 162, 167, 181, 194, 196, 204, 299, 316–319, 323
Skew-normal (SN) distributions
  extended, 105
  hierarchical, 105
  multivariate, 105, 119, 288
  singular skew-normal (SSN), 122
Skew transformation, 320
Soni, 38, 69
Spectral decomposition, 319
Srivastava, 319
Stacy, 278
Standardized random vector, 309
Stegun, 73, 78
STN-distributed vector, 6, 14, 243
Strictly decreasing function (SDF), 84, 93
Stripe truncation, 1, 47–49, 62, 69
Stuart, 38, 157, 274
Student's t-distribution, 265, 266, 274

(Sub)independence, 216
Submatrix, 89, 123, 299–303
Subtract cancellation error, 81
Sufficient condition, 126, 319, 320
Supra-diagonal, 331
Sylvester's theorem, 109, 173
Symmetric
  bivariate, 323
  expressions, 21, 22, 24, 26
  forms, 23
Symmetrizer, 310, 312, 330
Symmetry of derivatives, 23
Székely, 318

T
Tail areas, 3, 6, 47, 141, 149
Tail conditional expectation, 3
Tail-thickness, 316
Takane, 8, 329
Takeuchi, 8, 329
Tallis, 1, 3
Taufer, 310
T-distribution
  multivariate, 266, 285
  standard, 266, 268, 269
  univariate, 286
  unstandard, 274
Tensor product, 299
Terdik, 299, 303, 304, 307, 308, 310–312, 323, 325, 326
Total expectation, 108
Total kurtosis, 325
Triplication matrix, 331
Tri-variate, 12, 317, 318, 323, 331, 332
Truncated normal, 3, 171
Truncated normal-normal (TNN) distribution, 235, 262
Truncated pseudo-normal (TPN) distribution, 235, 238
Truncation
  double, 1, 3, 47, 48, 62, 141–143, 149, 166, 167, 252, 265
  hidden, 105, 114
  inner, 1, 3, 47, 141–143, 149, 153, 158, 166–168, 193, 265
  observed, 114
  sectional, 1, 3, 47, 69, 105, 141, 171, 216, 235, 247, 265, 295
  single, 1, 47, 62, 70, 84, 96, 105, 141, 149, 252, 265, 266, 292
  stripe, 1, 47, 48, 62, 69
  tigerish, 1, 47, 69
  zebraic, 1, 47, 69
Tuchscherer, 3

Turning point, 164
Two-mode sectionally truncated normal (2MSTN) vector, vi

U
Űberhuber, 91
Unconditional distribution, 106
Underdispersion, 56, 67
Unimodal distributions, 105
Upper incomplete gamma function, 56, 57
Upper (right) tail, 1

V
Variable transformation, 49, 72, 85, 94–96, 246, 267–271, 275, 276, 279–284, 286
Variance mixture, 276, 277
Vec operator, 304, 309, 326
Vectorizing operator, 126, 299
von Rosen, 230

W
Wagh, 56
Wansbeek, 303, 306
3-way duplication matrix, 331

4-way duplication matrix, 332
Weighted Gauss hypergeometric function, 251
Weighted or incomplete Kummer confluent hypergeometric function, 69, 71, 74
Weighted or incomplete parabolic cylinder function, 69, 71
Weinstein-Aronszajn identity, 109
Whittaker notation, 69, 71
Wichura, 1
Wilhelm, 1
Winkelbauer, 51, 62
Woodbury matrix identity, 109

X
Xu, 38

Y
Yanai, 8, 329

Z
Zacks, 108
Zamani, 56
Zero-distorted, 67, 68
Zwillinger, 69–71, 78, 249