Gaussian Measures in Hilbert Space
To the memory of my daughter Ann
Series Editor Nikolaos Limnios
Gaussian Measures in Hilbert Space Construction and Properties
Alexander Kukush
First published 2019 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address: ISTE Ltd 27-37 St George’s Road London SW19 4EU UK
John Wiley & Sons, Inc. 111 River Street Hoboken, NJ 07030 USA
www.iste.co.uk
www.wiley.com
© ISTE Ltd 2019 The rights of Alexander Kukush to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988. Library of Congress Control Number: 2019946454 British Library Cataloguing-in-Publication Data A CIP record for this book is available from the British Library ISBN 978-1-78630-267-0
Contents

Foreword . . . . . ix
Preface . . . . . xiii
Introduction . . . . . xv
Abbreviations and Notation . . . . . xix

Chapter 1. Gaussian Measures in Euclidean Space . . . . . 1
  1.1. The change of variables formula . . . . . 1
  1.2. Invariance of Lebesgue measure . . . . . 4
  1.3. Absence of invariant measure in infinite-dimensional Hilbert space . . . . . 9
  1.4. Random vectors and their distributions . . . . . 10
    1.4.1. Random variables . . . . . 11
    1.4.2. Random vectors . . . . . 12
    1.4.3. Distributions of random vectors . . . . . 14
  1.5. Gaussian vectors and Gaussian measures . . . . . 17
    1.5.1. Characteristic functions of Gaussian vectors . . . . . 17
    1.5.2. Expansion of Gaussian vector . . . . . 20
    1.5.3. Support of Gaussian vector . . . . . 22
    1.5.4. Gaussian measures in Euclidean space . . . . . 23

Chapter 2. Gaussian Measure in l2 as a Product Measure . . . . . 27
  2.1. Space R∞ . . . . . 27
    2.1.1. Metric on R∞ . . . . . 27
    2.1.2. Borel and cylindrical sigma-algebras coincide . . . . . 30
    2.1.3. Weighted l2 space . . . . . 31
  2.2. Product measure in R∞ . . . . . 34
    2.2.1. Kolmogorov extension theorem . . . . . 34
    2.2.2. Construction of product measure on B(R∞) . . . . . 36
    2.2.3. Properties of product measure . . . . . 38
  2.3. Standard Gaussian measure in R∞ . . . . . 42
    2.3.1. Alternative proof of the second part of theorem 2.4 . . . . . 45
  2.4. Construction of Gaussian measure in l2 . . . . . 46

Chapter 3. Borel Measures in Hilbert Space . . . . . 51
  3.1. Classes of operators in H . . . . . 51
    3.1.1. Hilbert–Schmidt operators . . . . . 52
    3.1.2. Polar decomposition . . . . . 55
    3.1.3. Nuclear operators . . . . . 57
    3.1.4. S-operators . . . . . 62
  3.2. Pettis and Bochner integrals . . . . . 68
    3.2.1. Weak integral . . . . . 68
    3.2.2. Strong integral . . . . . 69
  3.3. Borel measures in Hilbert space . . . . . 75
    3.3.1. Weak and strong moments . . . . . 75
    3.3.2. Examples of Borel measures . . . . . 78
    3.3.3. Boundedness of moment form . . . . . 83

Chapter 4. Construction of Measure by its Characteristic Functional . . . . . 89
  4.1. Cylindrical sigma-algebra in normed space . . . . . 89
  4.2. Convolution of measures . . . . . 93
  4.3. Properties of characteristic functionals in H . . . . . 96
  4.4. S-topology in H . . . . . 99
  4.5. Minlos–Sazonov theorem . . . . . 102

Chapter 5. Gaussian Measure of General Form . . . . . 111
  5.1. Characteristic functional of Gaussian measure . . . . . 111
  5.2. Decomposition of Gaussian measure and Gaussian random element . . . . . 114
  5.3. Support of Gaussian measure and its invariance . . . . . 117
  5.4. Weak convergence of Gaussian measures . . . . . 125
  5.5. Exponential moments of Gaussian measure in normed space . . . . . 129
    5.5.1. Gaussian measures in normed space . . . . . 129
    5.5.2. Fernique's theorem . . . . . 133

Chapter 6. Equivalence and Singularity of Gaussian Measures . . . . . 143
  6.1. Uniformly integrable sequences . . . . . 143
  6.2. Kakutani's dichotomy for product measures on R∞ . . . . . 145
    6.2.1. General properties of absolutely continuous measures . . . . . 145
    6.2.2. Kakutani's theorem for product measures . . . . . 148
    6.2.3. Dichotomy for Gaussian product measures . . . . . 152
  6.3. Feldman–Hájek dichotomy for Gaussian measures on H . . . . . 155
    6.3.1. The case where Gaussian measures have equal correlation operators . . . . . 155
    6.3.2. Necessary conditions for equivalence of Gaussian measures . . . . . 158
    6.3.3. Criterion for equivalence of Gaussian measures . . . . . 165
  6.4. Applications in statistics . . . . . 169
    6.4.1. Estimation and hypothesis testing for mean of Gaussian random element . . . . . 169
    6.4.2. Estimation and hypothesis testing for correlation operator of centered Gaussian random element . . . . . 173

Chapter 7. Solutions . . . . . 179
  7.1. Solutions for Chapter 1 . . . . . 179
  7.2. Solutions for Chapter 2 . . . . . 193
    7.2.1. Generalized Kolmogorov extension theorem . . . . . 196
  7.3. Solutions for Chapter 3 . . . . . 202
  7.4. Solutions for Chapter 4 . . . . . 211
  7.5. Solutions for Chapter 5 . . . . . 217
  7.6. Solutions for Chapter 6 . . . . . 227

Summarizing Remarks . . . . . 235
References . . . . . 239
Index . . . . . 241
Foreword
The study of the modern theory of stochastic processes, infinite-dimensional analysis and Malliavin calculus is impossible without a solid knowledge of Gaussian measures on infinite-dimensional spaces. In spite of the importance of this topic and the abundance of literature available for experienced researchers, there has been no textbook suitable for a student's first reading.

The present manual is an excellent introductory course in Gaussian measures on infinite-dimensional spaces, which has been given by the author for many years at the Faculty of Mechanics & Mathematics of Taras Shevchenko National University of Kyiv, Ukraine. The presentation of the material is well thought out, and the course is self-contained. After reading the book, it may seem that the topic is very simple. But that is not true! The apparent simplicity is achieved by the careful organization of the book. To experts and PhD students experienced in infinite-dimensional analysis, I would rather recommend the monograph V. I. Bogachev, Gaussian Measures (1998); but for a first acquaintance with the topic, I recommend this new manual.

The only prerequisites for the book are a basic knowledge of probability theory, linear algebra, measure theory and functional analysis. The exposition is supplemented with a wealth of examples and exercises with solutions, which are very useful for independent work and for checking one's command of the material. Many delicate and important topics of infinite-dimensional analysis are analyzed here in detail, e.g. Borel and cylindrical sigma-algebras in infinite-dimensional spaces, Bochner and Pettis integrals, nuclear operators and the topology of nuclear convergence.

We now present the contents of the book, emphasizing the places where finite-dimensional results need reconsideration (everywhere except Chapter 1).
– Chapter 1. Gaussian distributions on a finite-dimensional space. The chapter is preparatory but necessary. Later on, many analogies with finite-dimensional space will be drawn, and the places where a new technique is needed will become visible.

– Chapter 2. The space R∞, Kolmogorov's theorem on the existence of a probability measure, product measures, Gaussian product measures, Gaussian product measures in the space l2. After reading the chapter, the student will start to understand that on an infinite-dimensional space there are several ways to define a sigma-algebra (luckily, in our case the Borel and cylindrical sigma-algebras coincide). Moreover, it will become clear that an infinite-dimensional Lebesgue measure does not exist, hence the construction of a measure by means of a density needs reconsideration.

– Chapter 3. Bochner and Pettis integrals, Hilbert–Schmidt operators and nuclear operators, strong and weak moments. The chapter is a preparation for the definition of the expectation and correlation operator of a Gaussian (or even arbitrary) random element. We see that it is not so easy to introduce the expectation of a random element distributed in a Hilbert or Banach space. As opposed to finite-dimensional space, it is not enough just to integrate over basis vectors and then assemble the results into a single vector.

– Chapter 4. Characteristic functionals, the Minlos–Sazonov theorem. One of the most important methods of investigating probability measures on a finite-dimensional space is the method of characteristic functions. As is well known from any probability theory course, the characteristic functions are exactly the continuous positive definite functions equal to one at zero. On an infinite-dimensional space this is no longer true: for the "exactly these" part of the statement, continuity in the topology of nuclear convergence is required, and this topology is explained in detail.

– Chapter 5. General Gaussian measures. Based on the results of the previous chapters, we see the necessary and sufficient conditions that the characteristic functional of a Gaussian measure in Hilbert space has to satisfy. We realize that we have used all the knowledge from Chapters 2–4 (integration of random elements, Hilbert–Schmidt and nuclear operators, the Minlos–Sazonov theorem, etc.). We notice that, in the eigenbasis of the correlation operator, a Gaussian measure is just a product measure of the kind constructed in Chapter 2. This seems natural; but along the way it was impossible to discard any single step without loss of mathematical rigor. In this chapter, Fernique's theorem on the finiteness of an exponential moment of the norm of a Gaussian random element is proved, and a criterion for the weak convergence of Gaussian measures is stated.

– Chapter 6. Equivalence and mutual singularity of measures. Here, Kakutani's theorem on the equivalence of infinite products of measures is proven. As we saw in the previous chapter, Gaussian measures on Hilbert spaces are, in a way, product measures. Therefore, as a consequence of the general theory, we get a criterion for the equivalence of Gaussian measures (the Feldman–Hájek theorem). The obtained results are applied to problems of infinite-dimensional statistics. One should be
careful here: due to the absence of an infinite-dimensional Lebesgue measure, the Radon–Nikodym density should be written w.r.t. one of the Gaussian measures.

The author of this book, Professor A.G. Kukush, has been working at the Faculty of Mechanics & Mathematics of Taras Shevchenko National University for 40 years. He is an excellent teacher and a renowned expert in statistics and probability theory. In particular, he has given lectures to students of mathematics and statistics on measure theory, functional analysis, statistics and econometrics. As a student, I was lucky to attend his fascinating course on infinite-dimensional analysis.

Andrey PILIPENKO
Leading Researcher at the Institute of Mathematics of the Ukrainian National Academy of Sciences, Professor of Mathematics at the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute"
August 2019
Preface
This book is written for graduate students of mathematics and mathematical statistics who know algebra, measure theory and functional analysis (generalized functions are not used here); a knowledge of mathematical statistics is desirable only for understanding section 6.4. The topic of this book can be considered as supplementary chapters of measure theory; it lies between measure theory and the theory of stochastic processes, with possible applications in functional analysis and the statistics of stochastic processes. For 20 years, the author has been giving a special course "Gaussian Measures" at Taras Shevchenko National University of Kyiv, Ukraine, and in 2018–2019, preliminary versions of this book were used as a textbook for this course.

There are excellent textbooks and monographs on related topics, such as Gaussian Measures in Banach Spaces [KUO 75], Gaussian Measures [BOG 98] and Probability Distributions on Banach Spaces [VAK 87]. Why did I write my own textbook?

In the 1970s, I studied at the Faculty of Mechanics and Mathematics of Taras Shevchenko National University of Kyiv, at that time called Kiev State University. There I attended unforgettable lectures given by Professors Anatoliy Ya. Dorogovtsev (calculus and measure theory), Lev A. Kaluzhnin (algebra), Mykhailo I. Yadrenko (probability theory), Myroslav L. Gorbachuk (functional analysis) and Yuriy M. Berezansky (spectral theory of linear operators). My PhD thesis was supervised by the famous statistician A. Ya. Dorogovtsev and dealt with the weak convergence of measures on infinite-dimensional spaces. For a long time, I was a member of the research seminar "Stochastic processes and distributions in functional spaces" headed by the classics of probability theory Anatoliy V. Skorokhod and Yuriy L. Daletskii. My second doctoral thesis was about asymptotic properties of estimators for parameters of stochastic processes. Thus, I am somewhat tied up with measures on infinite-dimensional spaces.
In 1979, Kuo's fascinating textbook was translated into Russian. Inspired by this book, I started to give my lectures on Gaussian measures for graduate students. The subject seemed highly technical and extremely difficult. I decided to create something like a comic book on this topic, in particular to divide lengthy proofs into small understandable steps and to explain the ideas behind the computations.

It is impossible to study a mathematical course without solving problems. Each section ends with several problems, some of which are original and some taken from different sources. A separate chapter contains detailed solutions to all the problems.

Acknowledgments

I would like to thank my colleagues at Taras Shevchenko National University of Kyiv who supported my project, especially Yuliya Mishura, Oleksiy Nesterenko and Ivan Feshchenko. I also wish to thank my students of different generations who followed up on the ideas of the material and helped me to improve the presentation. I am grateful to Fedor Nazarov (Kent State University, USA), who communicated the proof of theorem 3.9, and to Oksana Chernova and Andrey Frolkin for preparing the manuscript for publication. I thank Sergiy Shklyar for his valuable comments. My wife Mariya deserves the most thanks for her encouragement and patience.

Alexander KUKUSH
Kyiv, Ukraine
September 2019
Introduction
The theory of Gaussian measures lies at the junction of the theory of stochastic processes, functional analysis and mathematical physics, with possible applications in quantum mechanics, statistical physics, financial mathematics and other branches of science. In this field, the ideas and methods of probability theory, nonlinear analysis, geometry and the theory of linear operators interact in an elegant and intriguing way. The aim of this book is to explain the construction of Gaussian measure in Hilbert space, to present its main properties and also to outline possible applications in statistics.

Chapter 1 deals with Euclidean space, where the invariance of Lebesgue measure is explained and Gaussian vectors and Gaussian measures are introduced. Their properties are stated in such a form that (later on) they can be extended to the infinite-dimensional case. Furthermore, it is shown that on an infinite-dimensional Hilbert space there is no non-trivial measure which is invariant under all translations (the same concerning invariance under all unitary operators); hence on such a space there is no measure analogous to the Lebesgue one.

In Chapter 2, a product measure is constructed on the sequence space R∞ based on the Kolmogorov extension theorem. For the standard Gaussian measure μ on R∞, the Kolmogorov–Khinchin criterion is established. In particular, it is shown that μ is concentrated on certain weighted sequence spaces l2,a, and based on the isometry between l2,a and l2, a Gaussian product measure is constructed on the latter sequence space.

Chapter 3 introduces important classes of operators in a separable infinite-dimensional Hilbert space H, in particular S-operators, i.e. self-adjoint, positive and nuclear ones. Theorem 3.9 shows that the convergence of S-operators is equivalent to a certain convergence of the corresponding quadratic forms. Also, the weak (Pettis) and strong (Bochner) integrals are defined for a function with values in a Banach space.
Borel probability measures on H and on a normed space X are studied, with examples. The boundedness of moment forms of such measures is shown, with a simple proof based on the classical Banach–Steinhaus theorem. Corollary 3.3 and remark 3.8 give mild conditions for the existence of the mean value of a probability measure μ as a Pettis integral; if the underlying space is a separable Banach space B and μ has a strong first moment, then its mean value exists as a Bochner integral.

In Chapter 4, properties of characteristic functionals of Borel probability measures on H are studied. A special linear topology, the S-topology, is introduced in H, with a neighborhood system consisting of ellipsoids. The classical Minlos–Sazonov theorem is proven; it properly extends Bochner's theorem from Rn to H. According to the Minlos–Sazonov theorem, the characteristic functional of a Borel probability measure on H has to be continuous in the S-topology. A part of the proof of this theorem (see lemma 4.9) suggests a way to construct a probability measure from its characteristic functional.

In Chapter 5, theorem 5.1 uses the Minlos–Sazonov theorem to describe a Gaussian measure on H of general form. It turns out that the correlation operator of such a measure is always an S-operator. It is shown that each Gaussian measure on H is just a product of one-dimensional Gaussian measures w.r.t. the eigenbasis of the correlation operator. Thus, every Gaussian measure on H can be constructed in the way demonstrated in Chapter 2.

The support of a Gaussian measure is studied. It is shown that a centered Gaussian measure is invariant under quite a rich group of linear transforms (see theorem 5.5). Hence, a Gaussian measure in Hilbert space can be considered as a natural infinite-dimensional analogue of the (invariant) Lebesgue measure. A criterion for the weak convergence of Gaussian measures is stated, in which (due to theorem 3.9) we recognize the convergence of correlation operators in nuclear norm.

In section 5.5, we study Gaussian measures on a separable normed space X. The important example 5.3 shows that a Gaussian stochastic process generates a measure on the path space Lp[0, T]; hence, in the case p = 2, we obtain a Gaussian measure on Hilbert space. Lemma 5.9 presents a characterization of a Gaussian random element in X. The famous theorem of Fernique is proven, which states that certain exponential moments of a Gaussian measure on X are finite. In particular, every Gaussian measure on a separable Banach space B has a mean value as a Bochner integral, and its correlation operator is well defined. Theorem 5.10 derives the convergence of moments of weakly convergent Gaussian measures.

In Chapter 6, Kakutani's remarkable dichotomy for product measures on R∞ is proven. In particular, two such product measures with absolutely continuous
components are either absolutely continuous or mutually singular. This implies the dichotomy for Gaussian measures on R∞: two such measures are either equivalent or mutually singular. Section 6.3 proves the famous Feldman–Hájek dichotomy for Gaussian measures on H, and in the case of equivalent measures, expressions for the Radon–Nikodym derivatives are provided.

In section 6.4, the results of Chapter 6 are applied in statistics. Based on a single observation of a Gaussian random element in H, we construct unbiased estimators for its mean and for parameters of its correlation operator; we also test a hypothesis about the mean and one about the correlation operator (the latter in the case where the Gaussian element is centered). In view of example 5.3 with p = 2, these statistical procedures can be used for a single observation of a Gaussian process on a finite time interval.

The book is aimed at advanced undergraduate and graduate students in mathematics and statistics, and also at theoretically interested students from other disciplines, say physics. Prerequisites for the book are calculus, algebra, measure theory, basic probability theory and functional analysis (we do not use generalized functions). In section 6.4, a knowledge of basic mathematical statistics is required.

Some words about the structure of the book: we present the results in lemmas, theorems, corollaries and remarks. All statements are proven. Important and illustrative examples are given. Furthermore, each section ends with a list of problems; detailed solutions to the problems are provided in Chapter 7. The abbreviations and notation used in the book are defined in the corresponding chapters; an overview of them is given in the following list.
Abbreviations and Notation
a.e.              almost everywhere w.r.t. Lebesgue measure
a.s.              almost surely
cdf               cumulative distribution function
pdf               probability density function
i.i.d.            independent and identically distributed (random variables or vectors)
r.v.              random variable
LHS               left-hand side
RHS               right-hand side
MLE               maximum likelihood estimator
|A|               number of points in set A
A^c               complement of set A
Ā                 closure of set A
TB                image of set B under transformation T
T⁻¹A              preimage of set A under transformation T
x′, A′            transposed vector and transposed matrix, respectively
R̄                 extended real line, i.e. R̄ = R ∪ {−∞, +∞}
R^(n×m)           space of real n × m matrices
B(x, r), B̄(x, r)  open and closed ball, respectively, centered at x with radius r > 0 in a metric space
f₊                positive part of function f, f₊ = max(f, 0)
f₋                negative part of function f, f₋ = −min(f, 0)
δ_ij              Kronecker delta, δ_ij = 1 if i = j, and δ_ij = 0 otherwise
a_n ∼ b_n         {a_n} is equivalent to {b_n} as n → ∞, i.e. a_n/b_n → 1 as n → ∞
C(X)              space of all real continuous functions on X
R∞                space of all real sequences
B(X)              Borel sigma-algebra on metric (or topological) space X
λ_m               Lebesgue measure on R^m
S_m               sigma-algebra of Lebesgue measurable sets on R^m
I_A               indicator function, i.e. I_A(x) = 1 if x ∈ A, else I_A(x) = 0
μT⁻¹              measure induced by measurable transformation T based on measure μ, i.e. (μT⁻¹)(A) = μ(T⁻¹A) for each measurable set A
L(X, μ)           space of Lebesgue integrable functions on X w.r.t. measure μ
f = g (mod μ)     functions f and g are equal almost everywhere w.r.t. measure μ
δ_x               Dirac measure at point x, δ_x(B) = I_B(x)
ν ≪ μ             signed measure ν is absolutely continuous w.r.t. measure μ
dν/dμ             the Radon–Nikodym derivative of ν w.r.t. μ
ν ∼ μ             measures ν and μ are equivalent
ν ⊥ μ             signed measure ν and measure μ are mutually singular
(x, y)            inner product of vectors x and y in Euclidean or Hilbert space
‖x‖               Euclidean norm of vector x
‖A‖               Euclidean norm of matrix A, ‖A‖ = sup_{x≠0} ‖Ax‖/‖x‖
I_m               the identity matrix of size m
rk(S)             rank of matrix S
P_n               projective operator, P_n x = (x₁, …, x_n)′, x ∈ R∞
√A                square root of positive semidefinite matrix A; it is positive semidefinite as well, with (√A)² = A
⟨x, x*⟩ or ⟨x*, x⟩  value of functional x* at vector x
I                 the identity operator
L(X)              space of linear bounded operators on normed space X
R(A)              range of operator A, R(A) = {y : ∃x, y = Ax}
L⊥                orthogonal complement to set L
L2[a, b]          Hilbert space of square integrable real functions with inner product (x, y) = ∫_a^b x(t)y(t) dt, the latter being a Lebesgue integral
l_p               space of real sequences x = (x_n)₁^∞ with norm ‖x‖_p = (Σ_{n=1}^∞ |x_n|^p)^{1/p} if 1 ≤ p < ∞, and ‖x‖_∞ = sup_{n≥1} |x_n| if p = ∞; for p = 2, l2 is Hilbert space with inner product (x, y) = Σ_{n=1}^∞ x_n y_n
l_{2,a}           weighted l2 space
span(M)           span of set M, i.e. set of all finite linear combinations of vectors from M
Â_n               cylinder in R∞ with base A_n ∈ B(R^n)
‖A‖ = ‖A‖_{L(X)}  operator norm of linear bounded operator A, ‖A‖ = sup_{x≠0} ‖Ax‖/‖x‖
A*                adjoint operator
B^{1/2}           square root of self-adjoint positive operator B
|A|               modulus of compact operator A, |A| = (A*A)^{1/2}
‖A‖₁              nuclear norm of operator A
‖A‖₂              Hilbert–Schmidt norm of operator A
A_n ⇒ A           operators A_n uniformly converge to operator A
S₀(H)             class of finite-dimensional operators on H
S₁(H)             class of nuclear operators on H
S₂(H)             class of Hilbert–Schmidt operators on H
S∞(H)             class of compact operators on H
L_S(H)            class of S-operators on H
A ≥ 0             operator A is positive, i.e. (Ax, x) ≥ 0 for all x
A ≥ B             comparison in Loewner order of self-adjoint operators: A − B is a positive operator
∏_{k=1}^n A_k     Cartesian product of sets A₁, …, A_n
⊗_{k=1}^n μ_k     product of measures μ₁, …, μ_n
⊗_{k=1}^∞ μ_k     product measure on R∞ or on Hilbert space
m_μ               mean value of measure μ
Cov(μ)            variance-covariance matrix of measure μ on R^n
φ_μ or μ̂          characteristic function (or functional) of measure μ
A_μ               operator of second moment of measure μ
S_μ               correlation operator of measure μ
σ_n(z₁, …, z_n)   weak moments of order n of Borel probability measure on H
μ_X               distribution of random vector X or random element X, μ_X(B) = P(X ∈ B) for all Borel sets B
φ_X               characteristic function (functional) of random vector (element) X
EX                expectation of random vector (element) X
DX                variance of random variable X
Cov(X)            variance-covariance matrix of random vector X
X =ᵈ Y            random vectors (elements) X and Y are identically distributed
N(m, σ²)          Gaussian distribution with mean m and variance σ², σ ≥ 0
N(m, S)           Gaussian distribution on R^n (or on H) with mean value m and variance-covariance matrix (or correlation operator) S
→ᵈ                convergence in distribution of random elements
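Several of the operator norms listed above can be compared on a concrete finite-dimensional example. The following sketch is not part of the book; it uses numpy (an assumption of this illustration) to compute the operator norm ‖A‖, the Hilbert–Schmidt norm ‖A‖₂ and the nuclear norm ‖A‖₁ of a small matrix from its singular values, illustrating the general chain ‖A‖ ≤ ‖A‖₂ ≤ ‖A‖₁.

```python
import numpy as np

# A small self-adjoint positive operator on R^2, standing in for an operator on H.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

# Singular values s_k of A, i.e. the eigenvalues of the modulus |A| = (A*A)^{1/2}.
s = np.linalg.svd(A, compute_uv=False)

op_norm = s.max()                # operator norm ||A|| = sup_{x != 0} ||Ax|| / ||x||
hs_norm = np.sqrt((s**2).sum())  # Hilbert-Schmidt norm ||A||_2 = (sum_k s_k^2)^{1/2}
nuc_norm = s.sum()               # nuclear norm ||A||_1 = sum_k s_k

# For any compact operator, ||A|| <= ||A||_2 <= ||A||_1.
assert op_norm <= hs_norm <= nuc_norm
print(op_norm, hs_norm, nuc_norm)
```

For this particular A, which is self-adjoint and positive, the singular values coincide with the eigenvalues (5 ± √5)/2, so the nuclear norm equals the trace, 5.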
1 Gaussian Measures in Euclidean Space
1.1. The change of variables formula

Let (X, S, μ) be a measure space, i.e. X is a non-empty set, S is a sigma-algebra on X and μ is a measure on S. Consider also a measurable space (Y, F), i.e. Y is another non-empty set and F is a sigma-algebra on Y. Let T : X → Y be a measurable transformation, which means that

∀A ∈ F,  T⁻¹A ∈ S.   [1.1]

Hereafter

T⁻¹A := {x ∈ X : Tx ∈ A}   [1.2]

is the preimage of A under T. (To simplify the notation, we write T⁻¹A and Tx rather than T⁻¹(A) and T(x), respectively, if it does not cause confusion.) Introduce a set function

ν(A) = μ(T⁻¹A),  A ∈ F.   [1.3]
THEOREM 1.1.– (About induced measure) The set function ν given in [1.3] is a measure on F.

PROOF.– The function ν is well defined due to [1.1]. We have to show that it is not identically infinite, and that it is non-negative and sigma-additive.

Indeed, ν(∅) = μ(T⁻¹∅) = μ(∅) = 0, and therefore ν is not identically infinite. For each A ∈ F, ν(A) ≥ 0, because μ is a non-negative set function.
Gaussian Measures in Hilbert Space: Construction and Properties, First Edition. Alexander Kukush. © ISTE Ltd 2019. Published by ISTE Ltd and John Wiley & Sons, Inc.
Finally, let {A_n, n ≥ 1} be disjoint sets from F. Then the preimages {T⁻¹A_n, n ≥ 1} are disjoint as well, and

ν(∪_{n=1}^∞ A_n) = μ(T⁻¹ ∪_{n=1}^∞ A_n) = μ(∪_{n=1}^∞ T⁻¹A_n) = Σ_{n=1}^∞ μ(T⁻¹A_n) = Σ_{n=1}^∞ ν(A_n).

Here, we used the sigma-additivity of μ. Thus, ν is a non-negative and sigma-additive set function on the sigma-algebra F, i.e. ν is a measure on F.

DEFINITION 1.1.– The set function ν given in [1.3] is called the measure induced by transformation T and is denoted μT⁻¹.

The notation prompts how to evaluate ν(A):

(μT⁻¹)(A) = μ(T⁻¹A),  A ∈ F.

For any measure space (Y, F, ν), denote by L(Y, ν) the space of Lebesgue integrable functions on Y w.r.t. measure ν.

Let f : Y → R̄ be an F-measurable function, i.e. for each Borel subset B of the extended real line R̄, it holds f⁻¹B ∈ F.

THEOREM 1.2.– (The change of variables formula) Assume that either f ≥ 0 or f ∈ L(Y, μT⁻¹). Then it holds

∫_X f(Tx) dμ(x) = ∫_Y f(y) d(μT⁻¹)(y).   [1.4]
P ROOF.– Equality [1.4] is shown in a standard way: first for indicators, then for simple non-negative functions, then for f ≥ 0, and finally, for f ∈ L(Y, μT −1 ). a) Let A ∈ F ,
f (y) = IA (y) = Then
IA (T x) = IA (T x) =
1, if y ∈ A 0, otherwise.
1, if T x ∈ A 0, otherwise, 1, if x ∈ T −1 A 0, otherwise,
Gaussian Measures in Euclidean Space
3
IA (T x) = IT −1 A (x). Hence
IT −1 A (x)dμ(x) = μ(T −1 A) = (μT −1 )(A),
IA (T x)dμ(x) = X
X
IA (y)d(μT −1 )(y) = (μT −1 )(A),
Y
and [1.4] follows for the indicator function.

b) Let f ≥ 0 be a simple F-measurable function. Then it admits a representation

\[ f(y) = \sum_{k=1}^{m} a_k I_{A_k}(y), \quad y \in Y, \]  [1.5]

with disjoint measurable sets {A_k, k = 1, ..., m} and non-negative a_k. For the function [1.5], relation [1.4] follows due to part (a) of the proof and the linearity of the Lebesgue integral.

c) Let f be an arbitrary non-negative and F-measurable function. Then there exists a sequence {p_n(y), n ≥ 1, y ∈ Y} of non-negative, simple and F-measurable functions such that p_n converges to f pointwise and p_n(y) ≤ p_{n+1}(y), n ≥ 1, y ∈ Y. By part (b) of the proof,

\[ \int_X p_n(Tx)\,d\mu(x) = \int_Y p_n(y)\,d(\mu T^{-1})(y). \]  [1.6]
Now let n tend to infinity. By the monotone convergence theorem, [1.6] implies [1.4].

d) Finally, let f ∈ L(Y, μT^{-1}), and set

\[ f_+(y) := \max\{f(y), 0\}, \qquad f_-(y) := -\min\{f(y), 0\}, \qquad y \in Y. \]

By part (c) of the proof,

\[ \int_X f_+(Tx)\,d\mu(x) = \int_Y f_+(y)\,d(\mu T^{-1})(y), \]  [1.7]
\[ \int_X f_-(Tx)\,d\mu(x) = \int_Y f_-(y)\,d(\mu T^{-1})(y). \]  [1.8]

Subtracting [1.8] from [1.7], we obtain [1.4] by the definition of the Lebesgue integral.
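The change of variables formula [1.4] lends itself to a quick numerical sanity check: the average of f(Tx) under μ must match the average of f under the induced measure μT^{-1}. A minimal sketch, not from the book — the choices μ = N(0, 1) on R, Tx = 2x + 1 and f(y) = y² are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# mu = N(0,1) on R, T x = 2x + 1, f(y) = y^2
x = rng.standard_normal(n)
lhs = np.mean((2 * x + 1) ** 2)      # Monte Carlo estimate of ∫ f(Tx) dμ(x)

# The induced measure μT^{-1} is the distribution of T(X), i.e. N(1, 4);
# sampling from it directly estimates the right-hand side of [1.4].
y = 1 + 2 * rng.standard_normal(n)
rhs = np.mean(y ** 2)                # Monte Carlo estimate of ∫ f(y) d(μT^{-1})(y)

# Exact common value: E(2X+1)^2 = 4·1 + 0 + 1 = 5
assert abs(lhs - 5) < 0.2 and abs(rhs - 5) < 0.2
```

Both estimates converge to 5, the integral of f against the pushforward N(1, 4).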
Problems 1.1

1) Let λ_2^T be Lebesgue measure on [0, T]², and let π_1: [0, T]² → R, π_1(x_1, x_2) = x_1, (x_1, x_2) ∈ [0, T]². Show that

(λ_2^T π_1^{-1})(A) = T · λ_1(A ∩ [0, T]),  A ∈ S_1,

where λ_1 is Lebesgue measure on R and S_1 is the sigma-algebra of Lebesgue measurable sets on R.

2) Let μ_1 and μ_2 be finite measures on the Borel sigma-algebra B(R), and π_1(x_1, x_2) = x_1, (x_1, x_2) ∈ R². Find the induced measure (μ_1 × μ_2)π_1^{-1}.

3) For the objects of theorem 1.1, prove the following: if μT^{-1} is sigma-finite, then μ is sigma-finite as well. Does the converse hold true?

4) Let f: Y → R̄ be any F-measurable function. Show that the Lebesgue integral on the left-hand side of [1.4] is well defined if, and only if, the integral on the right-hand side of [1.4] is well defined. Moreover, in the case where they are well defined, they coincide.

1.2. Invariance of Lebesgue measure

Consider a measure space (X, S, μ) and a measurable transformation T: X → X.

DEFINITION 1.2.– The measure μ is called invariant under T, or T-invariant, if μT^{-1} = μ.

REMARK 1.1.– Assume additionally that T is a bijection on X and, moreover, that T^{-1} is a measurable transformation as well. Then μ is T-invariant if and only if

μ(B) = μ(TB),  ∀B ∈ S.  [1.9]

(Hereafter, TB denotes the image of B under T.)

PROOF.– a) Let μ be T-invariant and B ∈ S. Because T^{-1} is measurable, A := TB ∈ S. It holds that B = T^{-1}A, and μ(T^{-1}A) = μ(A). Equality [1.9] follows.

b) Conversely, assume [1.9] and take any A ∈ S. Denote B_0 = T^{-1}A, B_0 ∈ S. Then (μT^{-1})(A) = μ(B_0) = μ(TB_0) = μ(A), and μ is T-invariant.

EXAMPLE 1.1.– (Counting measure) Let X = {1, 2, ..., n}, let S = 2^X be the sigma-algebra of all subsets of X, and let μ be the counting measure on X, i.e. μ(A) = |A|, A ∈ S. (Hereafter, |A| is the number of points in a set A; if A is infinite, |A| = +∞.) Then
μ is invariant under any bijection π on X. Indeed, μ(π^{-1}A) = |π^{-1}A| = |A| = μ(A), A ∈ S.

In this section, we show that Lebesgue measure λ_n on R^n is rotation and translation invariant. Hereafter, we suppose that Euclidean space R^n consists of column vectors x = (x_1, ..., x_n)^⊤.

DEFINITION 1.3.– An affine transformation of R^n is a mapping of the form Tx = Lx + c, with L ∈ R^{n×n} and c ∈ R^n. Such a transformation is called non-singular if L is non-singular. Otherwise, if det L = 0, then T of this form is called a singular affine transformation.

REMARK 1.2.– An affine transformation Tx = Lx + c is invertible if, and only if, it is non-singular. In this case, the inverse transformation is a non-singular affine transformation as well, and it acts as follows:

T^{-1}y = L^{-1}y − L^{-1}c,  y ∈ R^n.

Remember that non-singular affine transformations on a plane include rotations, translations and axial symmetries.

THEOREM 1.3.– (Transformation of Lebesgue measure at Borel sets) Consider Lebesgue measure λ_n on the Borel sigma-algebra B(R^n). Let Tx = Lx + c be a non-singular affine transformation on R^n. Then

\[ \lambda_n T^{-1} = \frac{1}{|\det L|}\,\lambda_n. \]  [1.10]

PROOF.– The transformation T is continuous and, therefore, Borel measurable. Hence the induced measure λ_n T^{-1} on B(R^n) is well defined.

a) For a = (a_k)_1^n and b = (b_k)_1^n with a_k < b_k, k = 1, ..., n, denote

\[ [a, b] = \prod_{k=1}^{n} [a_k, b_k], \qquad (a, b] = \prod_{k=1}^{n} (a_k, b_k]. \]  [1.11]

Hereafter, \prod_{k=1}^{n} A_k stands for the Cartesian product of A_1, ..., A_n. Evaluate

\[ (\lambda_n T^{-1})([a, b]) = \lambda_n\big(T^{-1}[a, b]\big) = \int_{T^{-1}[a,b]} d\lambda_n = \int_{T^{-1}[a,b]} dx. \]

The latter integral is a Riemann integral over the compact and Jordan measurable set T^{-1}[a, b]. The change of variables in the Riemann integral leads to the following:

\[ \lambda_n\big(T^{-1}[a, b]\big) = \int_{[a,b]} \Big|\det\frac{\partial y}{\partial x}\Big|^{-1} dy = \frac{m_n([a, b])}{|\det L|} = \frac{\lambda_n([a, b])}{|\det L|}. \]

Here, m_n is Jordan measure on R^n.
b) Consider a set (a, b] introduced in [1.11], and let {a_k(m), m ≥ 1} be a decreasing sequence that converges to a_k such that a_k(m) < b_k, m ≥ 1; k = 1, ..., n. Denote a(m) = (a_k(m))_{k=1}^n ∈ R^n. Then A_m := [a(m), b] is a monotone sequence of sets that converges to (a, b]. The continuity of Lebesgue measure from below implies

\[ \lambda_n T^{-1}\big((a, b]\big) = \lim_{m\to\infty} \lambda_n(T^{-1}A_m) = \lim_{m\to\infty} \frac{\lambda_n(A_m)}{|\det L|} = \frac{\lambda_n((a, b])}{|\det L|}. \]

Here, we used part (a) of the proof.

c) Thus, the two measures λ_n T^{-1} and λ_n/|det L| in [1.10] coincide on the semiring P_n that consists of all bricks (a, b] from [1.11]. Both measures are sigma-finite, and therefore, they coincide on σr(P_n) = B(R^n), where σr(P_n) is the sigma-ring generated by P_n.
Now, we extend theorem 1.3 to Lebesgue measure λ_n on the sigma-algebra S_n of Lebesgue measurable sets in R^n.

LEMMA 1.1.– A non-singular affine transformation Tx = Lx + c is (S_n, S_n)-measurable, i.e. for any A ∈ S_n, T^{-1}A ∈ S_n as well.

PROOF.– It is known (see [HAL 13]) that

\[ S_n = \{ B \cup N \mid B \in \mathcal{B}(R^n),\ N \subset N_0 \text{ with } N_0 \in \mathcal{B}(R^n),\ \lambda_n(N_0) = 0 \}. \]  [1.12]

Let A ∈ S_n; then A = B ∪ N, with B and N described in [1.12]. It holds that

\[ T^{-1}A = T^{-1}B \cup T^{-1}N, \qquad T^{-1}B \in \mathcal{B}(R^n), \]  [1.13]

\[ T^{-1}N \subset T^{-1}N_0, \qquad T^{-1}N_0 \in \mathcal{B}(R^n), \qquad \lambda_n(T^{-1}N_0) = \frac{\lambda_n(N_0)}{|\det L|} = 0. \]  [1.14]

Here, we used theorem 1.3 and the fact that T is a Borel function. Decompositions [1.13] and [1.14] show that T^{-1}A ∈ S_n.

THEOREM 1.4.– (Transformation of Lebesgue measure) Let Tx = Lx + c be a non-singular affine transformation on R^n. For Lebesgue measure λ_n on S_n, it holds that

\[ \lambda_n T^{-1} = \frac{\lambda_n}{|\det L|}. \]

PROOF.– Consider A ∈ S_n and decompose T^{-1}A as in [1.13] and [1.14]. Because λ_n(N) = λ_n(T^{-1}N) = 0, we have by theorem 1.3:

\[ \lambda_n(T^{-1}A) = \lambda_n(T^{-1}B) = \frac{\lambda_n(B)}{|\det L|} = \frac{\lambda_n(A)}{|\det L|}. \]
COROLLARY 1.1.– (Criterion for invariance of Lebesgue measure) Lebesgue measure λ_n on S_n is invariant under a non-singular affine transformation Tx = Lx + c if, and only if, det L = ±1.

In particular, λ_n is symmetric around the origin, and it is invariant under translations Tx = x + c and orthogonal transformations Tx = Ux, where U is an orthogonal matrix (i.e. U^{-1} = U^⊤), e.g. under symmetries w.r.t. hyperplanes that pass through the origin. In the planar case (n = 2), λ_2 is invariant under the transformation

\[ Tx = \begin{pmatrix} 2x_1 \\ \tfrac{1}{2} x_2 \end{pmatrix}, \quad x \in R^2. \]

Here, T is a dilation along the x_1-axis with coefficient 2 and a contraction along the x_2-axis with the same coefficient.

COROLLARY 1.2.– (Affine change of variables) Let Tx = Lx + c be a non-singular affine transformation on R^n and let f: R^n → R̄ be a Lebesgue measurable function which is either non-negative or belongs to L(R^n, λ_n). Then it holds that

\[ \int_{R^n} f(Tx)\,d\lambda_n(x) = \frac{1}{|\det L|} \int_{R^n} f(y)\,d\lambda_n(y). \]

PROOF.– Apply theorems 1.2 and 1.4:

\[ \int_{R^n} f(Tx)\,d\lambda_n(x) = \int_{R^n} f(y)\,d\big(\lambda_n T^{-1}\big)(y) = \frac{1}{|\det L|} \int_{R^n} f(y)\,d\lambda_n(y). \]
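Theorem 1.4 can be illustrated numerically: the Lebesgue measure of the image TB of a set B equals |det L| · λ_n(B). A hedged sketch — the matrix L, the shift c and the Monte Carlo bounding box below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
L = np.array([[2.0, 1.0],
              [0.0, 3.0]])                    # det L = 6
c = np.array([1.0, -1.0])
Linv = np.linalg.inv(L)

# Monte Carlo estimate of λ2(T B) for B = [0,1]^2:
# sample y uniformly in a box containing T(B) and test whether T^{-1}y ∈ B.
n = 400_000
y = np.column_stack([rng.uniform(1.0, 4.0, n),   # box [1,4]×[-1,2] ⊃ T(B)
                     rng.uniform(-1.0, 2.0, n)])
x = (y - c) @ Linv.T                             # x = L^{-1}(y - c)
inside = np.all((x >= 0) & (x <= 1), axis=1)
vol = 9.0 * inside.mean()                        # box area = 3 · 3 = 9

# λ2(T B) = |det L| · λ2(B) = 6
assert abs(vol - abs(np.linalg.det(L))) < 0.1
```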
Problems 1.2

5) Let Tx = |x|, x ∈ R. Find λ_1 T^{-1}.

6) Let f: R → R be a Lebesgue measurable function such that |f(x) − f(y)| ≥ |x − y| for all x, y ∈ R. Prove that f is (S_1, S_1)-measurable, where S_1 is the sigma-algebra of Lebesgue measurable sets on the real line.

7) Show that arctan x is an (S_1, S_1)-measurable function. Let f: (−π/2, π/2) → [0, +∞] be a Lebesgue measurable function. Prove that

\[ \int_{R} f(\arctan x)\,d\lambda_1(x) = \int_{(-\pi/2,\,\pi/2)} \frac{f(t)}{\cos^2 t}\,d\lambda_1(t). \]

8) Show that e^x is an (S_1, S_1)-measurable function. Let f: (0, +∞) → [0, +∞] be a Lebesgue measurable function. Prove that

\[ \int_{R} f(e^x)\,d\lambda_1(x) = \int_{(0,+\infty)} \frac{f(t)}{t}\,d\lambda_1(t). \]
9) Prove that f(x) = ||x||, x ∈ R^n, is an (S_n, S_1)-measurable function.

10) Let f: [0, +∞) → [0, +∞] be a Lebesgue measurable function. Prove that the measure

\[ \mu(A) = \int_A f(\|x\|)\,d\lambda_n(x), \quad A \in S_n, \]

is invariant under unitary operators in R^n.

11) Let μ be a measure on S_n which is finite at each bounded set from S_n, absolutely continuous w.r.t. λ_n and invariant under unitary operators in R^n. Prove that there exists a Borel function f: [0, +∞) → [0, +∞) such that the representation from problem 10 holds true.

Hint. Given a locally Lebesgue integrable function g on R^n, a point x ∈ R^n is a Lebesgue point if

\[ \lim_{r\to 0+} \frac{1}{\lambda_n(B(x, r))} \int_{B(x,r)} |g(y) - g(x)|\,d\lambda_n(y) = 0. \]

Hereafter, B(x, r) is an open ball centered at x with radius r. Use the Lebesgue differentiation theorem [BOG 07], which states that, given any locally Lebesgue integrable function g on R^n, almost every x is a Lebesgue point of g.

12) Let g: R^n → R be a Lebesgue measurable function such that g(Tx) = g(x) (mod λ_n), for all unitary operators T in R^n. Prove that there exists a Borel function f: [0, +∞) → R, with g(x) = f(||x||) (mod λ_n).

13) Let α > 0 and f ∈ L(R, λ_1). Prove that f(n^{1+α}x) → 0 as n → ∞ for almost all x ∈ R. Extend this statement to functions from L(R^m, λ_m).

14) Let f: R → R̄, f ∈ L([0, +∞), λ_1). Prove the following:

a) If f is an even function, then ∫_R f dλ_1 = 2 ∫_{[0,+∞)} f dλ_1.

b) If f is an odd function, then ∫_R f dλ_1 = 0.

15) Let f: [−1, 1] → (0, +∞) be a Lebesgue measurable function. Find the integral ∫_{[−1,1]} f(x)/(f(x) + f(−x)) dλ_1(x).
1.3. Absence of invariant measure in infinite-dimensional Hilbert space

Let H be a real Hilbert space, with Borel sigma-algebra B(H). In this section, we search for a measure λ on B(H) with the following properties:

i) λ is positive at each non-empty open set;

ii) λ is finite at each bounded Borel set;

iii) λ is invariant under each translation Tx = x + c, x ∈ H, with c ∈ H.

Remember that a linear operator U in H is called unitary if ||Ux|| = ||x||, x ∈ H, and R(U) = H. An operator U ∈ L(H) is unitary if, and only if, U* = U^{-1}.

iv) λ is invariant under each unitary operator in H.

Note that Lebesgue measure λ_n on B(R^n) possesses the properties (i)–(iv).

THEOREM 1.5.– (Absence of invariant measure in H) Let H be an infinite-dimensional real Hilbert space. Then:

a) There is no measure λ with properties (i)–(iii).

b) There is no measure λ with properties (i), (ii) and (iv).

PROOF.– Because dim(H) = ∞, there exists an infinite orthonormal system {e_n, n ≥ 1} in H. For k ≠ m, ||e_k − e_m|| = √2, hence the open balls B(e_n, √2/2) are disjoint. For x ∈ B(e_n, √2/2), it holds that ||x|| ≤ ||e_n|| + ||x − e_n|| < 1 + √2/2 < 2, and

\[ B\Big(e_n, \frac{\sqrt{2}}{2}\Big) \subset B(0, 2), \qquad \bigcup_{n=1}^{\infty} B\Big(e_n, \frac{\sqrt{2}}{2}\Big) \subset B(0, 2). \]  [1.15]
a) Let λ have the properties (i)–(iii). For k ≠ m, the translation Tx = x + e_m − e_k, x ∈ H, maps the ball B(e_k, √2/2) onto B(e_m, √2/2). Hence by (i) and (iii),

\[ \lambda\Big(B\Big(e_k, \frac{\sqrt{2}}{2}\Big)\Big) = \lambda\Big(B\Big(e_m, \frac{\sqrt{2}}{2}\Big)\Big) = a > 0. \]  [1.16]

Due to [1.15] and [1.16], we have

\[ \lambda(B(0, 2)) \ge \sum_{n=1}^{\infty} \lambda\Big(B\Big(e_n, \frac{\sqrt{2}}{2}\Big)\Big) = \sum_{n=1}^{\infty} a = +\infty. \]

But this contradicts property (ii). Therefore, such a measure λ does not exist.
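The geometric fact driving part (a) — orthonormal vectors are √2 apart, so the balls B(e_n, √2/2) are disjoint yet all contained in B(0, 2) — is easy to verify in a finite-dimensional truncation. An illustrative sketch only (the dimension d = 50 is arbitrary):

```python
import numpy as np

d = 50
E = np.eye(d)                    # e_1, ..., e_d: an orthonormal system

# Pairwise distances between distinct basis vectors all equal √2 ...
diff = E[:, None, :] - E[None, :, :]
dist = np.linalg.norm(diff, axis=2)
off = ~np.eye(d, dtype=bool)
assert np.allclose(dist[off], np.sqrt(2))

# ... so the open balls B(e_k, √2/2) are pairwise disjoint, while every
# such ball fits inside B(0, 2): each center has norm 1 < 2 − √2/2.
assert np.all(np.linalg.norm(E, axis=1) + np.sqrt(2) / 2 < 2)
```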
b) Now, assume that λ has the properties (i), (ii) and (iv). We construct a unitary operator U that maps B(e_k, √2/2) onto B(e_m, √2/2), with fixed k ≠ m. Let L be the subspace generated by {e_n, n = 1, 2, ...}. Each x ∈ H can be decomposed as

\[ x = \sum_{n=1}^{\infty} (x, e_n)\,e_n + z, \qquad z \in L^{\perp}. \]

The isometric operator

\[ Ux = (x, e_k)\,e_m + (x, e_m)\,e_k + \sum_{n \ne k,\, n \ne m} (x, e_n)\,e_n + z \]

is a surjection, and hence is unitary. It maps B(e_k, √2/2) onto B(e_m, √2/2), and by properties (iv) and (i), [1.16] holds. The rest of the proof follows the line of part (a).

As we see, in the space l_2 of sequences and in the space L_2[a, b] of functions there is no measure analogous to Lebesgue measure. Nevertheless, we will construct a measure in an infinite-dimensional Hilbert space which is invariant under quite a large group of transformations. It will be a Gaussian measure.

Problems 1.3

16) Prove that there is no measure λ on B(l_∞) with properties (i) and (ii) from section 1.3, where l_∞ is the space of real bounded sequences with the supremum norm.

17) Let X be a real normed space, with dim(X) = ∞. Prove that there is no measure λ on B(X) with properties (i)–(iii) from section 1.3.

18) A linear bijection V on a normed space X is called an isometry if ||Vx|| = ||x||, x ∈ X. Prove that, for 1 ≤ p < ∞, there is no measure λ on B(l_p) with properties (i) and (ii) from section 1.3 that is invariant under all isometries on l_p.

19) Let ϕ(t), t ∈ [0, 1], be a continuous increasing function, with ϕ(0) = 0, ϕ(1) = 1, and ϕ(t) < t, t ∈ (0, 1). In the Banach space X = C[0, 1], introduce the transformation (Tx)(t) = x(ϕ(t)), t ∈ [0, 1], x ∈ X. Prove that there is no measure λ on B(X) with properties (i) and (ii) from section 1.3 that is T-invariant.

1.4. Random vectors and their distributions

Remember that a probability measure is a measure on a sigma-algebra which equals 1 at the total space. A measure space (Ω, F, P) is called a probability space if P is a probability measure, i.e. P(Ω) = 1.
1.4.1. Random variables

A random variable (r.v.) on a probability space (Ω, F, P) is just an F-measurable function on Ω.

DEFINITION 1.4.– Let X = X(ω) be a r.v. on a probability space (Ω, F, P). The distribution of X is a probability measure μ_X defined as follows:

μ_X(B) = P{ω : X(ω) ∈ B},  B ∈ B(R).

Note that μ_X is a measure induced by the mapping X: Ω → R, i.e. μ_X = P X^{-1} (see definition 1.1).

A Borel function f: R → R is called the probability density function (pdf) of a r.v. X if

\[ P\{X(\omega) \in B\} = \int_B f(t)\,d\lambda_1(t), \quad B \in \mathcal{B}(R). \]

Actually, this means that the distribution μ_X ≪ λ_1, where the Lebesgue measure λ_1 is considered on B(R), and moreover the Radon–Nikodym derivative dμ_X/dλ_1 = f(t) (mod λ_1).

DEFINITION 1.5.– A r.v. γ is called normal (or normally distributed) if it has a pdf of the form

\[ \rho(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-m)^2}{2\sigma^2}}, \quad x \in R, \]

with parameters m ∈ R and σ > 0. This is denoted as follows: γ ∼ N(m, σ²).

If γ ∼ N(m, σ²), then it holds that

E γ = m,  D γ = σ².

Hereafter, E stands for the expectation and D for the variance of a r.v. Remember that

\[ E\gamma = \int_\Omega \gamma(\omega)\,dP(\omega), \qquad D\gamma = E(\gamma - m)^2 = \int_\Omega (\gamma(\omega) - m)^2\,dP(\omega). \]

By the change of variables formula (theorem 1.2), it holds that

\[ E\gamma = \int_R x\,d\mu_\gamma(x), \qquad D\gamma = \int_R (x - m)^2\,d\mu_\gamma(x), \]

and since γ has pdf equal to ρ, we have

\[ m = E\gamma = \int_R x\rho(x)\,dx, \qquad \sigma^2 = D\gamma = \int_R (x - m)^2\rho(x)\,dx. \]
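The moment identities above are easy to confirm by direct numerical integration of the density ρ. A sketch with illustrative parameters m = 2, σ = 1.5 (a plain Riemann sum on a truncated grid stands in for the Lebesgue integral):

```python
import numpy as np

m, sigma = 2.0, 1.5                        # illustrative parameters
x = np.linspace(m - 10 * sigma, m + 10 * sigma, 400_001)
dx = x[1] - x[0]
rho = np.exp(-(x - m) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

total = rho.sum() * dx                     # ∫ ρ dλ1        ≈ 1
mean = (x * rho).sum() * dx                # ∫ x ρ dλ1      ≈ m
var = ((x - mean) ** 2 * rho).sum() * dx   # ∫ (x−m)² ρ dλ1 ≈ σ²

assert abs(total - 1) < 1e-6
assert abs(mean - m) < 1e-6
assert abs(var - sigma ** 2) < 1e-4
```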
The latter integrals are Lebesgue integrals.

DEFINITION 1.6.– A r.v. γ has the degenerate normal distribution N(m, 0) if γ(ω) = m almost surely (a.s.). We denote it as γ ∼ N(m, 0).

In this case, E γ = m, D γ = 0, and the distribution μ_γ is a point measure concentrated at m:

μ_γ(B) = I_B(m),  B ∈ B(R).

Such a measure is called the Dirac measure at point m and is denoted as δ_m.

DEFINITION 1.7.– A r.v. γ is called Gaussian if it is either normally distributed (with positive variance) or has a degenerate normal distribution (with zero variance).

Thus, a Gaussian r.v. γ satisfies γ ∼ N(m, σ²) with some m ∈ R and σ ≥ 0. If σ > 0, then μ_γ ≪ λ_1, and if σ = 0, then μ_γ = δ_m ⊥ λ_1 (i.e. μ_γ is singular to λ_1).

DEFINITION 1.8.– A r.v. γ ∼ N(0, 1) is called standard normal.

Remember that the characteristic function ϕ_ξ of a r.v. ξ is as follows:

ϕ_ξ(t) = E e^{itξ},  t ∈ R.

A normal r.v. γ ∼ N(m, σ²) has characteristic function

\[ \varphi_\gamma(t) = \exp\Big\{ imt - \frac{\sigma^2 t^2}{2} \Big\}. \]  [1.17]

If γ has the degenerate normal distribution N(m, 0), then ϕ_γ(t) = exp{imt}. Thus, relation [1.17] holds true for any Gaussian r.v. γ ∼ N(m, σ²), with σ ≥ 0.

1.4.2. Random vectors

Remember that a random vector X distributed in R^n is an F-measurable mapping X: Ω → R^n, where (Ω, F, P) is the underlying probability space. A mapping X(ω) = (X_1(ω), ..., X_n(ω))^⊤, ω ∈ Ω, is a random vector if, and only if, all X_k(ω), k = 1, ..., n, are random variables.

For a random vector X, its expectation is defined coordinate-wise:

E X = (E X_1, ..., E X_n)^⊤ =: m,  m = (m_1, ..., m_n)^⊤ ∈ R^n.
In the case E X_k² < ∞ for all k = 1, ..., n, its variance–covariance matrix Cov(X) = (s_ij)_{i,j=1}^n is defined as follows:

\[ s_{ij} = \mathrm{Cov}(X_i, X_j) = E(X_i - m_i)(X_j - m_j), \quad i, j = 1, \dots, n. \]  [1.18]

Hereafter, the expectation E is considered as an operator that acts on the total product under its sign, and we omit brackets for brevity. The variance–covariance matrix can be expressed as

\[ \mathrm{Cov}(X) = E(X - m)(X - m)^\top. \]  [1.19]

Here, the expectation of a random matrix is a matrix composed of the expectations of its entries, according to [1.18]. The variance–covariance matrix is a positive semidefinite matrix, i.e. it is symmetric and the corresponding quadratic form is non-negative. The variance–covariance matrix of a random vector X exists if and only if E||X||² < ∞.

LEMMA 1.2.– (Variance–covariance matrix under linear transform) Let X be a random vector in R^n, with variance–covariance matrix S, and A ∈ R^{p×n}. Then AX is a random vector in R^p, with

\[ \mathrm{Cov}(AX) = ASA^\top. \]  [1.20]

PROOF.– The mapping AX: Ω → R^p is F-measurable, as a Borel function of a random vector. Hence AX is a random vector in R^p, and because E||AX||² ≤ ||A||² · E||X||² < ∞, it has a variance–covariance matrix. Hereafter, ||A|| is the operator norm of the matrix A induced by the Euclidean vector norm,

\[ \|A\| = \sup_{x \ne 0} \frac{\|Ax\|}{\|x\|}. \]

Now, E(AX) = A(E X) = Am, where m = E X, and

\[ \mathrm{Cov}(AX) = E(AX - Am)(AX - Am)^\top = E\big[A(X - m)(X - m)^\top A^\top\big] = A \cdot E(X - m)(X - m)^\top \cdot A^\top = ASA^\top. \]

Here, we used the linearity of the operator of taking expectation.

COROLLARY 1.3.– (Moments of linear functional) Let a ∈ R^n and let X be a random vector in R^n, with mean value m and variance–covariance matrix S. Then

E(a^⊤X) = a^⊤m,  D(a^⊤X) = a^⊤Sa.

PROOF.– The statement follows from lemma 1.2 and its proof by putting A = a^⊤ ∈ R^{1×n}. The random vector a^⊤X is just a r.v., and its variance–covariance matrix is just the variance.
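Lemma 1.2 can be observed empirically by comparing the sample covariance of AX with ASA^⊤. A sketch; S and A below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)

# X in R^3 with known variance–covariance matrix S; A ∈ R^{2×3}.
S = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])
A = np.array([[1.0, -1.0, 0.0],
              [0.0,  2.0, 1.0]])

n = 300_000
X = rng.multivariate_normal(mean=np.zeros(3), cov=S, size=n)
Y = X @ A.T                                # rows are samples of AX

emp = np.cov(Y, rowvar=False)              # empirical Cov(AX)
assert np.allclose(emp, A @ S @ A.T, atol=0.1)
```

The Gaussian sampler is just a convenient source of vectors with covariance S; lemma 1.2 itself holds for any distribution with finite second moments.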
1.4.3. Distributions of random vectors

Let X be a random vector distributed in R^n. Its distribution is introduced similarly to definition 1.4.

DEFINITION 1.9.– The distribution of X is a probability measure μ_X defined as follows:

μ_X(B) = P{ω : X(ω) ∈ B},  B ∈ B(R^n).

It is always possible to construct a random vector with a given distribution.

LEMMA 1.3.– Given a probability measure μ on B(R^n), there exists a random vector X, with distribution μ_X = μ.

PROOF.– Take the measure space (R^n, B(R^n), μ) as a probability space (Ω, F, P) and define X: Ω → R^n as X(ω) = ω. Then

μ_X(B) = P{ω ∈ B} = P(B) = μ(B),  B ∈ B(R^n).

Remember that random variables X_1, ..., X_n, which are defined on the same probability space, are independent if

\[ P\{X_1 \in B_1, X_2 \in B_2, \dots, X_n \in B_n\} = \prod_{k=1}^{n} P\{X_k \in B_k\}, \]

for all B_1, ..., B_n ∈ B(R). The latter relation can be written in terms of the distribution μ_X of the random vector X = (X_k)_1^n and the marginal distributions μ_{X_k} of its components:

\[ \mu_X\Big(\prod_{k=1}^{n} B_k\Big) = \prod_{k=1}^{n} \mu_{X_k}(B_k). \]

Here, \prod_{k=1}^{n} B_k denotes the Cartesian product of the sets B_k.

It is clear that the components of the random vector X = (X_k)_1^n are independent if, and only if, μ_X is a product of n probability measures, and in this case

\[ \mu_X = \mu_{X_1} \times \dots \times \mu_{X_n}. \]

Remember that the characteristic function ϕ_X of a random vector X is defined as follows:

ϕ_X(t) = E e^{i(X, t)},  t ∈ R^n.
One can rewrite ϕ_X(t) using the change of variables formula (see theorem 1.2):

\[ \varphi_X(t) = \int_\Omega e^{i(X(\omega),\,t)}\,dP(\omega) = \int_{R^n} e^{i(z,\,t)}\,d(PX^{-1})(z) = \int_{R^n} e^{i(z,\,t)}\,d\mu_X(z). \]

This prompts the following definition.

DEFINITION 1.10.– Given a probability measure μ on B(R^n), its characteristic function ϕ_μ is as follows:

\[ \varphi_\mu(t) = \int_{R^n} e^{i(z,\,t)}\,d\mu(z), \quad t \in R^n. \]

Thus, ϕ_X and ϕ_{μ_X} coincide. From a standard course of probability theory, it is known that the cumulative distribution function (and therefore, the distribution) of a random vector is uniquely defined by its characteristic function.

LEMMA 1.4.– (Criterion for independence) Consider a random vector X = (X_k)_1^n. Its components are independent if, and only if, ϕ_X can be decomposed as follows:

\[ \varphi_X(t) = \varphi_1(t_1)\varphi_2(t_2)\cdots\varphi_n(t_n), \quad t \in R^n, \]

where ϕ_k: R → C are some functions with ϕ_k(0) = 1, k = 1, ..., n, and in this case

\[ \varphi_X(t) = \prod_{k=1}^{n} \varphi_{X_k}(t_k), \quad t \in R^n. \]

PROOF.– a) Let X_k be independent. Then the random variables e^{it_k X_k}, k = 1, ..., n, are independent as well, and

\[ \varphi_X(t) = E\prod_{k=1}^{n} e^{it_k X_k} = \prod_{k=1}^{n} E\,e^{it_k X_k} = \prod_{k=1}^{n} \varphi_{X_k}(t_k); \qquad \varphi_{X_k}(0) = 1. \]
b) Assume that ϕ_X(t) = ϕ_1(t_1)ϕ_2(t_2)⋯ϕ_n(t_n), with ϕ_k(0) = 1, k = 1, ..., n. Let {e_k, k = 1, ..., n} be the standard orthobasis in R^n. Then, using ϕ_j(0) = 1 for j ≠ k,

\[ \varphi_{X_k}(t_k) = E\,e^{it_k X_k} = E\,e^{i(X,\,t_k e_k)} = \varphi_X(t_k e_k) = \varphi_k(t_k). \]

Let Y = (Y_k)_1^n be a random vector with independent components and the same marginal distributions:

μ_{Y_k} = μ_{X_k},  k = 1, ..., n.

(Such a Y can be constructed by applying lemma 1.3 to the measure μ = μ_{X_1} × ⋯ × μ_{X_n}.) Then, by part (a) of the proof,

\[ \varphi_Y(t) = \prod_{k=1}^{n} \varphi_{Y_k}(t_k) = \prod_{k=1}^{n} \varphi_{X_k}(t_k) = \varphi_X(t), \quad t \in R^n. \]

Therefore,

\[ \mu_X = \mu_Y = \mu_{Y_1} \times \dots \times \mu_{Y_n} = \mu_{X_1} \times \dots \times \mu_{X_n}, \]

and the components of X are independent.
mμ = sij =
Rn
Rn
xdμ(x) :=
n Rn
n
xk dμ(x)
(xi − mi ) (xj − mj ) dμ(x),
= (mk )1 , k=1
i, j = 1, . . . , n.
Definition 1.11 is consistent with the corresponding definition of the mean and variance–covariance matrix of a random vector. Indeed, for a random vector X, it holds E X = mμX ,
Cov(X) = Cov(μX ),
i.e. expectation and variance–covariance matrix of a random vector are just the mean and variance–covariance matrix of its distribution.
Gaussian Measures in Euclidean Space
17
Now, we interpret the bilinear form generated by S = Cov(X). Let m = E X and u, v ∈ Rn ; then
(Su, v) = v E(X − m)(X − m) u = E v (X − m)(X − m) u , (Su, v) = E(X − m, u)(X − m, v). For a probability measure μ on B(Rn ), with m = mμ and S = Cov(μ), we have, respectively: (Su, v) = (z − m, u) (z − m, v) dμ(z), u, v ∈ Rn . Rn
For the mean value, we have (m, u) = (z, u) dμ(z), Rn
u ∈ Rn .
Those expressions are the first and the central second moments of measure μ. Problems 1.4 20) A measure μ on B(Rn ) is called symmetric around the origin if μ(B) = μ(−B), for all B(Rn ). Let μ be a probability measure on B(Rn ). Prove that μ is symmetric around the origin if, and only if, its characteristic function ϕμ takes real values only. 21) Let μ be a probability measure on B(Rn ). Prove that μ is invariant under all orthogonal transformations if, and only if, there exists a function f : [0, +∞) → C such that ϕμ (t) = f (||t||), t ∈ Rn .
¯ r) = 22) Let μ and ν be probability measures on B(Rn ) such that μ B(x, ¯ r)), for all closed balls B(x, ¯ r) in Rn . Prove that μ = ν. ν(B(x, 1.5. Gaussian vectors and Gaussian measures 1.5.1. Characteristic functions of Gaussian vectors D EFINITION 1.12.– A random vector ξ in Rn is called Gaussian if for each a ∈ Rn , inner product (ξ, a) is a Gaussian r.v. n
Consider a Gaussian random vector ξ = (ξ_k)_1^n in R^n. Its components ξ_k = (ξ, e_k) are Gaussian random variables, ξ_k ∼ N(m_k, σ_k²) with σ_k ≥ 0; k = 1, ..., n. Such random variables, the components of a Gaussian random vector, are called jointly Gaussian. It holds that

\[ E\|\xi\|^2 = \sum_{k=1}^{n} E\xi_k^2 = \sum_{k=1}^{n} (m_k^2 + \sigma_k^2) < \infty, \]
and therefore, Cov(ξ) is well defined. We have

\[ E\xi = (E\xi_k)_1^n = (m_k)_1^n =: m, \quad m \in R^n; \]
\[ S := \mathrm{Cov}(\xi) = (s_{ij})_{i,j=1}^{n}, \quad s_{ii} = D\xi_i = \sigma_i^2, \quad s_{ij} = \mathrm{Cov}(\xi_i, \xi_j), \quad 1 \le i, j \le n. \]

Then we write ξ ∼ N(m, S) and say that ξ is a Gaussian random vector with mean m and variance–covariance matrix S. Here, S is a positive semidefinite n × n real matrix, being the variance–covariance matrix of a random vector in R^n.

LEMMA 1.5.– If ξ ∼ N(m, S), then for each a ∈ R^n,

\[ (\xi, a) \sim N(m_a, \sigma_a^2), \qquad m_a = (m, a), \qquad \sigma_a^2 = (Sa, a). \]

PROOF.– The r.v. (ξ, a) is Gaussian according to definition 1.12. Its mean and variance are evaluated in corollary 1.3.

LEMMA 1.6.– If ξ ∼ N(m, S) in R^n, then

\[ \varphi_\xi(t) = \exp\Big\{ i(t, m) - \frac{(St, t)}{2} \Big\}, \quad t \in R^n. \]

PROOF.– It holds that ϕ_ξ(t) = E e^{i(ξ,t)} = ϕ_{(ξ,t)}(1). Now, use lemma 1.5 and [1.17]:

\[ \varphi_\xi(t) = \exp\Big\{ iz(m, t) - \frac{z^2 (St, t)}{2} \Big\}\Big|_{z=1}, \]

and the statement follows.
REMARK 1.3.– If a random vector ξ has the characteristic function given in lemma 1.6, with a certain n × n real and symmetric matrix S, then S is positive semidefinite and ξ ∼ N(m, S).

PROOF.– For a ∈ R^n, it holds that ϕ_{(ξ,a)}(u) = E e^{i(ξ, ua)} = ϕ_ξ(ua),

\[ \varphi_{(\xi,a)}(u) = \exp\Big\{ iu(a, m) - \frac{(Sa, a)u^2}{2} \Big\}, \quad u \in R. \]
Since the absolute value of a characteristic function does not exceed 1, the matrix S is positive semidefinite, and moreover (ξ, a) ∼ N(m_a, σ_a²), with m_a = (a, m), σ_a² = (Sa, a). Hence, ξ is a Gaussian vector, with some parameters m_1 ∈ R^n and S_1 ∈ R^{n×n}, where S_1 is positive semidefinite. Then lemma 1.5 implies (ξ, a) ∼ N(m̃_a, σ̃_a²), with m̃_a = (a, m_1), σ̃_a² = (S_1a, a). We get (a, m) = (a, m_1) and (Sa, a) = (S_1a, a) for all a ∈ R^n. Thus, m_1 = m and S_1 = S.

DEFINITION 1.13.– A random vector γ ∼ N(0, I_n) is called a standard, or canonical, Gaussian vector distributed in R^n.

The components γ_1, ..., γ_n of a standard Gaussian vector are uncorrelated jointly Gaussian standard random variables. In particular, E γ_k = 0, D γ_k = 1, and for all i ≠ j, Cov(γ_i, γ_j) = E γ_iγ_j = 0.

THEOREM 1.6.– (About components of standard Gaussian vector)

1) Let γ = (γ_k)_1^n be a standard Gaussian vector. Then

\[ \varphi_\gamma(t) = e^{-\frac{\|t\|^2}{2}}, \quad t \in R^n, \]

and γ_1, ..., γ_n are independent and identically distributed (i.i.d.) N(0, 1) random variables.

2) If γ_1, ..., γ_n are i.i.d. N(0, 1) random variables, then γ = (γ_k)_1^n is a standard Gaussian vector.

PROOF.– 1) The formula for ϕ_γ follows from lemma 1.6 with m = 0 and S = I_n. Therefore,

\[ \varphi_\gamma(t) = \prod_{k=1}^{n} e^{-\frac{t_k^2}{2}}. \]
E γ = (E γk )n1 = 0, Cov γ = (sij )ni,j=1 ,
sij = E γi γj = δij ,
1 ≤ i, j ≤ n.
Hereafter, δij is Kronecker delta, δij = 1 if i = j and δij = 0, otherwise. Hence γ ∼ N (0, In ). R EMARK 1.4.– (About uncorrelated Gaussian variables) Theorem 1.6 can be extended as follows: jointly Gaussian random variables ξ1 , . . . , ξn are independent if, and only if, they are uncorrelated. Based on theorem 1.6, this can be proven by consideration of i −E ξi normalized random variables ηi = ξ√ (before we cancel that ξi are constant a.s.). Dξ i
20
Gaussian Measures in Hilbert Space
1.5.2. Expansion of Gaussian vector Lemma 1.5 can be generalized for any linear transformation of a Gaussian vector. L EMMA 1.7.– (Linear transform of Gaussian vector) Let ξ ∼ N (m, S) be a Gaussian vector in Rn and A ∈ Rp×n . Then Aξ is a Gaussian vector in Rp and Aξ ∼ N (Am, ASA ). P ROOF.– For a ∈ Rp , it holds (Aξ, a) = (ξ, A a). It is a Gaussian r.v. due to definition 1.12. Now, the statement follows from lemma 1.2 and its proof. Now, we want to show that for any m ∈ Rn and positive semidefinite matrix S ∈ Rn×n , a random vector ξ ∼ N (m, S) can be obtained as an affine transformation of standard Gaussian vector. Let A ∈ Rn×n be a positive semidefinite matrix. Then it has orthonormal eigenbasis e1 , . . . , en and corresponding non-negative eigenvalues λ1 , . . . , λn . The matrix can be expanded as follows: A=
n
λk ek e k.
1
D EFINITION 1.14.– Let A be a positive semidefinite matrix as described above. √ 1 A 2 = A is a matrix satisfying √
A=
n
λk e k e k.
1
√ It is a unique positive semidefinite matrix with its square equal to A. The √ matrix A has the same eigenbasis e , . . . , e and corresponding eigenvalues λ1 , . . . , 1 n √ λn . L EMMA 1.8.– (Representation of Gaussian vector via standard Gaussian vector) Let m ∈ Rn , S ∈ Rn×n be a positive semidefinite matrix and γ ∼ N (0, In ). Then √ ξ := m + Sγ ∼ N (m, S). P ROOF.– According to lemma 1.7 √ √ √ Sγ ∼ N (0, SIn ( S) ), This implies the statement.
√ Sγ ∼ N (0, S).
Gaussian Measures in Euclidean Space
21
Since γ ∼ N (0, In ) can be constructed based on i.i.d. N (0, 1) random variables, lemma 1.8 shows that Gaussian N (m, S) random vectors do exist, for any m ∈ Rn and any positive semidefinite matrix S ∈ Rn×n . Now, we expand a Gaussian vector ξ ∼ N (m, S) using i.i.d. N (0, 1) random variables γ1 , . . . , γr , with r = rk(S). T HEOREM 1.7.– (Decomposition of Gaussian vector) Let m ∈ Rn and S ∈ Rn×n be a positive semidefinite matrix, with rk(S) = r ≥ 1. Let λ1 , . . . , λr be positive eigenvalues of S and e1 , . . . , er be the corresponding orthonormal system of eigenvectors. 1) For ξ ∼ N (m, S), there exist i.i.d. N (0, 1) random variables γ1 , . . . , γr on the underlying probability space such that ξ =m+
r λk γk ek ,
a.s.
1
2) If γ1 , . . . , γr are i.i.d. N (0, 1) random variables, then η := m +
r λk γk ek ∼ N (m, S). 1
P ROOF.– We complete the orthonormal system to an orthobasis e1 , . . . , en . Denote λr+1 = λr+2 = · · · = λn = 0; they are the rest eigenvalues of S. n 1) Introduce X = ξ − m, X ∼ N (0, S). Now, X = 1 (X, ek )ek , E(X, ek ) = 0, and for all k ≥ r + 1, D(X, ek ) = λk = 0. Hence (X, ek ) = 0 a.s., for all k ≥ r + 1. Thus, X=
r (X, ek ) λk √ ek , λk 1
a.s.
) √ k , k = 1, . . . , r are jointly Gaussian (because vector Random variables γk := (X,e λk γ = (γk )r1 is a linear transformation of Gaussian vector X) and E γk = 0,
1 λk (Sek , ej ) = δkj = δkj . Cov(γk , γj ) = λk λj λ k λj According to theorem 1.6(1), γ1 , . . . , γr are i.i.d. N (0, 1) random variables. Now, r λk γ k e k , X= 1
and the statement follows.
a.s.
22
Gaussian Measures in Hilbert Space
r √ 2) For a ∈ Rn , (η, a) = (m, a) + 1 λk γk (ek , a) is a Gaussian r.v. as a sum of independent Gaussian random variables. Therefore, η is Gaussian. Next, ⎞ r ⎛ r ⎠= E η = m, Cov(η) = E λk γ k e k ⎝ λj γj e j k=1
=
j=1
r r λk λj (E γk γj )ek e = λk e k e j k = S. k,j=1
k=1
Thus, η ∼ N (m, S).
1.5.3. Support of Gaussian vector D EFINITION 1.15.– For a random vector X in Rn , denote by G a union of all balls B(x, r), with P{X ∈ B(x, r)} = 0. The set Rn \ G is called support of X and denoted as supp X. It is clear that P{X ∈ G} = 0. Moreover, G is the largest open set with this property. supp X is the smallest closed set such that X belongs to this set with probability 1. Since G = Rn , supp X = ∅. E XAMPLE 1.2.– Let that a r.v. ξ has Poisson distribution with parameter λ > 0. Then supp ξ = {0, 1, 2, . . . , n, . . . }. T HEOREM 1.8.– (About support of Gaussian vector) For random vector ξ ∼ N (0, S), supp ξ = R(S), where R(S) is the range of S. P ROOF.– a) If S = 0, then ξ = 0 a.s., and supp ξ = {0} = R(S). Now, let rk(S) = r ≥ 1, and let λ1 , . . . , λr be positive eigenvalues of S and e1 , . . . , er be the corresponding orthonormal system of eigenvectors. According to theorem 1.7(1), there exist i.i.d. N (0, 1) random variables γ1 , . . . , γr on the underlying probability space such that ξ=
r λk γ k e k ,
a.s.,
1
and with probability one ξ ∈ span(e1 , . . . , er ) = R(S). b) Take arbitrary x ∈ R(S) and ε > 0. Show that P{ξ ∈ B(x, ε)} > 0.
[1.21]
Gaussian Measures in Euclidean Space
Indeed, x =
r
M := {y =
1
23
ak ek and there exists δ > 0 such that
r
bk ek : |bk − ak | < δ, k = 1, . . . , r} ⊂ B(x, ε).
1
Then
a k − δ ak + δ √ , k = 1, . . . , r = P{ξ ∈ M } = P γk ∈ , √ λk λk r
a k − δ ak + δ √ = > 0, P γk ∈ , √ λk λk 1
P{ξ ∈ B(x, ε)} ≥ P{ξ ∈ M } > 0. Thus, [1.21] holds true. This fact and the relation shown in part (a) imply that supp ξ = R(S). 1.5.4. Gaussian measures in Euclidean space Now, we reformulate the results of sections 1.5.1–1.5.3 for distributions of Gaussian vectors. D EFINITION 1.16.– A probability measure μ on B(Rn ) is called Gaussian measure in Rn , if there exists a Gaussian random vector ξ in Rn such that its distribution μξ = μ. Let ξ ∼ N (m, S) and μ = μξ . In view of section 1.4.3, we have the following: m = E ξ = mμ ,
S = Cov(ξ) = Cov(μ),
i.e. m is mean value of μ and S is variance–covariance matrix of μ. Next, according to lemma 1.6 the characteristic function of μ equals (St, t) ϕμ (t) = ϕξ (t) = exp i(t, m) − , t ∈ Rn . 2 Since parameters μ and S define uniquely the characteristic function of a Gaussian measure, they define uniquely the Gaussian measure μ itself. There is one-to-one correspondence between the set of all Gaussian measures in Rn and the set of couples (m; S), where m ∈ Rn and S is a positive semidefinite n × n matrix. Let λ1 , . . . , λn be eigenvalues of S (they are non-negative) and e1 , . . . , en be the corresponding eigenbasis of S. Then n n 1 2 ϕμ (t) = exp i λk (t, ek ) , (t, ek )(m, ek ) − 2 1 1
ϕ_μ(t) = Π_{k=1}^n exp( i (t, e_k)(m, e_k) − (1/2) λ_k (t, e_k)² ).

We treat (t, e_k) as the coordinates t_k of the vector t ∈ Rⁿ, and likewise (m, e_k) = m_k. Thus,

ϕ_μ(t) = Π_{k=1}^n exp( i t_k m_k − (1/2) λ_k t_k² ).

In view of lemma 1.3, we get a decomposition of μ:

μ = μ_1 × μ_2 × · · · × μ_n,
where μ_k is a Gaussian measure on the real line, with mean m_k and variance λ_k, k = 1, . . . , n. Thus, each Gaussian measure in Euclidean space is just a product of Gaussian measures on the real line.

In case rk(S) = r, 1 ≤ r ≤ n − 1, we may and do assume that λ_{r+1} = λ_{r+2} = · · · = λ_n = 0. Then μ_k = δ_{m_k} (the Dirac measure at point m_k), k ≥ r + 1, and we get the expansion

μ = Π_{k=1}^r μ_k × Π_{k=r+1}^n δ_{m_k} = Π_{k=1}^r μ_k × δ_z,  z = (m_{r+1}, . . . , m_n).
Here, δ_z is the Dirac measure on B(R^{n−r}) at point z.

DEFINITION 1.17.– The standard Gaussian measure g in Rⁿ is the distribution of a standard Gaussian vector in Rⁿ.

The measure g has zero mean and its variance–covariance matrix equals I_n. Its characteristic function is

ϕ_g(t) = e^{−||t||²/2},  t ∈ Rⁿ.

It holds

g = g_1 × · · · × g_n,

where g_1 = · · · = g_n is the standard Gaussian measure on the real line. This means that

g_i(B) = (1/√(2π)) ∫_B e^{−x²/2} dx,  B ∈ B(R),  i = 1, . . . , n.
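This product decomposition is easy to check numerically. The sketch below (all matrices and vectors are illustrative choices, not taken from the text) builds S from a chosen orthonormal eigenbasis in R² and verifies that the characteristic function exp(i(t, m) − (St, t)/2) agrees with the product Π_k exp(i t_k m_k − λ_k t_k²/2) computed in eigen-coordinates:

```python
import cmath

# Illustrative choices (not from the text): an orthonormal eigenbasis of R^2,
# non-negative eigenvalues lambda_k, and a mean vector m.
c, s = 0.6, 0.8                      # cos/sin of a rotation angle
e = [(c, s), (-s, c)]                # orthonormal eigenbasis e_1, e_2
lam = [2.0, 0.5]                     # eigenvalues of S
m = (1.0, -1.0)

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# S = sum_k lambda_k e_k e_k'
S = [[sum(lam[k] * e[k][i] * e[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

def phi_direct(t):
    """phi_mu(t) = exp(i(t, m) - (St, t)/2)."""
    St = [dot(S[i], t) for i in range(2)]
    return cmath.exp(1j * dot(t, m) - 0.5 * dot(St, t))

def phi_product(t):
    """Product over the eigenbasis: prod_k exp(i t_k m_k - lambda_k t_k^2 / 2)."""
    out = 1.0 + 0j
    for k in range(2):
        tk, mk = dot(t, e[k]), dot(m, e[k])
        out *= cmath.exp(1j * tk * mk - 0.5 * lam[k] * tk ** 2)
    return out

t = (0.3, -1.2)
assert abs(phi_direct(t) - phi_product(t)) < 1e-12
```

The agreement is exact (up to rounding) because the eigenbasis is orthonormal, so (t, m) = Σ t_k m_k and (St, t) = Σ λ_k t_k² in eigen-coordinates.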
DEFINITION 1.18.– (See definition 1.15) For a probability measure μ on B(Rⁿ), denote by G the union of all open balls B(x, r) with μ(B(x, r)) = 0. The set Rⁿ \ G is called the support of μ and is denoted supp μ.
Since Rⁿ is separable, G is a countable union of balls B(x_i, r_i) with μ(B(x_i, r_i)) = 0. Hence, μ(G) = 0. Moreover, G is the largest open set with this property. Note that supp μ is the smallest closed set whose μ-measure equals 1. Since G ≠ Rⁿ, supp μ ≠ ∅.

Theorem 1.8 implies the following: if μ is a Gaussian measure with mean 0 and variance–covariance matrix S, then supp μ = R(S). In particular, for the standard Gaussian measure g in Rⁿ, supp g = Rⁿ.

Now, we study the invariance of Gaussian measures under linear transformations. Let X be a random vector in Rⁿ with distribution μ_X, and let T : Rⁿ → Rⁿ be a Borel function. The random vector TX has distribution μ_{TX} = μ_X T⁻¹. Therefore, μ_X is invariant under T (see definition 1.2) if, and only if, X =_d TX. Hereafter, X =_d Y means that the random vectors X and Y are identically distributed, i.e. μ_X = μ_Y.

Remember definition 1.14 of A^{1/2} = √A for a positive semidefinite matrix A.

THEOREM 1.9.– (Invariance of Gaussian measure) Let U ∈ R^{n×n} and let μ be a Gaussian measure in Rⁿ with zero mean and non-singular variance–covariance matrix S. The measure μ is U-invariant if, and only if, the matrix S^{−1/2} U S^{1/2} is orthogonal. In particular, the standard Gaussian measure g in Rⁿ is U-invariant if, and only if, U is an orthogonal matrix.

PROOF.– Let X be a Gaussian random vector with μ_X = μ. Then X ∼ N(0, S) and, by lemma 1.7, UX ∼ N(0, USU′). Now, UX =_d X if, and only if,

USU′ = S  ⇔  (S^{−1/2} U S^{1/2})(S^{−1/2} U S^{1/2})′ = I_n.

But this is equivalent to the orthogonality of the matrix S^{−1/2} U S^{1/2}. In case μ = g, it holds S = I_n. Thus, g is U-invariant if, and only if, U is an orthogonal matrix.

The main statement of theorem 1.9 can be interpreted as follows. For a positive definite S ∈ R^{n×n}, we introduce a new inner product in Rⁿ,

(x, y)_S = (S⁻¹x, y),  x, y ∈ Rⁿ.

The corresponding norm is

||x||_S = √((x, x)_S) = √((S⁻¹x, x)) = ||S^{−1/2} x||,  x ∈ Rⁿ.

Now, a Gaussian measure μ with zero mean and non-singular variance–covariance matrix S is U-invariant if, and only if, U is a unitary operator w.r.t. the inner product (x, y)_S.
Indeed, S^{−1/2} U S^{1/2} is an orthogonal matrix if, and only if,

||S^{−1/2} U S^{1/2} x|| = ||x||,  x ∈ Rⁿ.

Now, make the change of variable y = S^{1/2} x, y ∈ Rⁿ. Then we get an equivalent condition

||S^{−1/2} U y|| = ||S^{−1/2} y||  ⇔  ||U y||_S = ||y||_S,  y ∈ Rⁿ.
The latter equality means that the linear transformation U is orthogonal w.r.t. the inner product (x, y)_S.

One can say that a Gaussian measure μ changes the geometry of Euclidean space. The standard Gaussian measure g corresponds to the standard geometry of Euclidean space. Compared with Lebesgue measure λ_n (see theorem 1.3 and corollary 1.1), the measure g has fewer invariant transformations. Theorem 1.6 shows that λ_n cannot be extended to an infinite-dimensional Hilbert space H. But we will see that a Gaussian measure can be constructed in H, with quite a large group of invariant transformations.

Problems 1.5

23) Let A = (a_ij)_{i,j=1}^n and B = (b_ij)_{i,j=1}^n be positive definite matrices. Prove that the matrix C = (a_ij b_ij)_{i,j=1}^n is positive definite as well.

24) Let f and g be pdfs, with cumulative distribution functions F and G, respectively. Prove the following:

a) For each α ∈ (−1, 1),

h(x, y) := f(x)g(y) + α f(x)(1 − 2F(x)) g(y)(1 − 2G(y)),  (x, y) ∈ R²,

is a pdf with marginal densities f(x) and g(y).

b) Assume additionally that f and g are even functions and ∫_R |x| f(x) dx < ∞, ∫_R |y| g(y) dy < ∞. Let (X; Y) be a random vector with pdf equal to h(x, y). If α ∈ (0, 1), then X and Y are positively correlated, and if α ∈ (−1, 0), then X and Y are negatively correlated.
25) Based on problem (24), construct Gaussian random variables X and Y which are not jointly Gaussian.

26) Let A ∈ R^{n×n} be a symmetric matrix and X ∼ N(0, S) in Rⁿ. Denote the eigenvalues of S^{1/2} A S^{1/2} by λ_1, . . . , λ_n. Prove that I_α := E exp{α(AX, X)} < ∞ if, and only if, αλ_k < 1/2, k = 1, . . . , n. Show that in this case

I_α = 1 / √( Π_{k=1}^n (1 − 2αλ_k) ).
27) Let X ∼ N(0, I_n). Find for which real α the quantity I_α := E ||X||^{−α} is finite.
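Before moving on, theorem 1.9 can be checked numerically. In this sketch (all matrices are arbitrary illustrative choices), conjugating an orthogonal Q by S^{1/2} produces a U with S^{−1/2} U S^{1/2} = Q orthogonal, so USU′ = S and N(0, S) is U-invariant; the bare rotation Q fails unless S is a multiple of the identity:

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

# Illustrative diagonal, non-singular covariance, so S^{1/2} is immediate.
S = [[4.0, 0.0], [0.0, 1.0]]
S_half = [[2.0, 0.0], [0.0, 1.0]]
S_half_inv = [[0.5, 0.0], [0.0, 1.0]]

th = 0.7
Q = [[math.cos(th), -math.sin(th)], [math.sin(th), math.cos(th)]]  # orthogonal

# U = S^{1/2} Q S^{-1/2}: then S^{-1/2} U S^{1/2} = Q is orthogonal,
# and theorem 1.9 predicts U S U' = S, i.e. N(0, S) is U-invariant.
U = matmul(S_half, matmul(Q, S_half_inv))
USUt = matmul(U, matmul(S, transpose(U)))
assert all(abs(USUt[i][j] - S[i][j]) < 1e-12 for i in range(2) for j in range(2))

# The rotation Q itself does not preserve N(0, S) when S is not c * I:
QSQt = matmul(Q, matmul(S, transpose(Q)))
assert any(abs(QSQt[i][j] - S[i][j]) > 1e-6 for i in range(2) for j in range(2))
```

This mirrors the geometric reading of the theorem: U is a rotation, but measured in the S-geometry (x, y)_S rather than the standard one.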
Chapter 2. Gaussian Measure in l2 as a Product Measure
2.1. Space R∞

2.1.1. Metric on R∞

We denote by R∞ the set of all sequences x = (x_n)_{n=1}^∞ = (x_1, . . . , x_n, . . . ) such that x_n ∈ R, n ≥ 1. It is a real linear space w.r.t. the natural operations

x + y = (x_n + y_n)_{n=1}^∞,  λx = (λx_n)_{n=1}^∞,

where x_n and y_n are the coordinates of x and y, respectively, and λ ∈ R. Let

ρ(x, y) = Σ_{n=1}^∞ (1/2ⁿ) · |x_n − y_n| / (1 + |x_n − y_n|),  x, y ∈ R∞.  [2.1]
We will show that [2.1] is a metric on R∞ that metrizes coordinate-wise convergence.

LEMMA 2.1.– (Bounded metric on real line) The function

d(t, s) = |t − s| / (1 + |t − s|),  t, s ∈ R,  [2.2]

is a metric on R. The convergence in this metric is equivalent to the usual convergence of real sequences.
Gaussian Measures in Hilbert Space: Construction and Properties, First Edition. Alexander Kukush. © ISTE Ltd 2019. Published by ISTE Ltd and John Wiley & Sons, Inc.
PROOF.– 1) The first and second axioms of a metric are easily verified. Here, we check only the triangle inequality.

The function ϕ(t) := t/(1 + t), t ≥ 0, is increasing. For real numbers t, s, u, we have

d(t, s) = ϕ(|t − s|) ≤ ϕ(|t − u| + |u − s|) = |t − u|/(1 + |t − u| + |u − s|) + |u − s|/(1 + |t − u| + |u − s|),

d(t, s) ≤ |t − u|/(1 + |t − u|) + |u − s|/(1 + |u − s|) = d(t, u) + d(u, s).

Thus, d is a metric on R.

2) If lim_{n→∞} t_n = t, then

d(t_n, t) = |t_n − t| / (1 + |t_n − t|) → 0  as n → ∞.

Conversely, let d(t_n, t) → 0 as n → ∞. Then

|t_n − t| = d(t_n, t) / (1 − d(t_n, t)) → 0  as n → ∞.
Now, the function [2.1] can be rewritten in terms of the metric [2.2]:

ρ(x, y) = Σ_{n=1}^∞ d(x_n, y_n)/2ⁿ,  x, y ∈ R∞.  [2.3]

LEMMA 2.2.– The function [2.1] is a metric on R∞. The convergence in this metric is equivalent to coordinate-wise convergence. The metric space (R∞, ρ) is separable.

PROOF.– 1) The series in [2.1] converges because it is majorized by the convergent series Σ_{n=1}^∞ 1/2ⁿ. Thus, the function [2.1] takes real values. The first and second axioms of a metric are easily checked. Here, we verify the triangle inequality only. Let x, y, z ∈ R∞. We use representation [2.3] and lemma 2.1:

ρ(x, y) ≤ Σ_{n=1}^∞ (d(x_n, z_n) + d(z_n, y_n))/2ⁿ = ρ(x, z) + ρ(z, y).

Thus, ρ is a metric.
2) Let x ∈ R∞ and x(m) = (x_n(m))_{n=1}^∞ ∈ R∞, m ≥ 1. Suppose that ρ(x(m), x) → 0 as m → ∞. We have

ρ(x(m), x) ≥ (1/2ⁿ) d(x_n(m), x_n) ≥ 0,

thus, d(x_n(m), x_n) → 0 as m → ∞. By lemma 2.1, x_n(m) → x_n as m → ∞. Here, n is arbitrary, and x(m) converges to x coordinate-wise.

Conversely, assume that x(m) converges to x coordinate-wise. For each N ≥ 1,

ρ(x(m), x) ≤ Σ_{n=1}^N d(x_n(m), x_n)/2ⁿ + 1/2^N.

By lemma 2.1, d(x_n(m), x_n) → 0 as m → ∞. Then

0 ≤ lim sup_{m→∞} ρ(x(m), x) ≤ 1/2^N.

Letting N tend to infinity, we obtain lim sup_{m→∞} ρ(x(m), x) = 0, and ρ(x(m), x) tends to 0 as m → ∞.

3) Consider the set A = {(r_1, . . . , r_n, 0, 0, . . . ) : n ≥ 1, r_i ∈ Q, i ≥ 1}. It consists of finitary vectors with rational coordinates. The set is countable. Now, we check that it is dense in R∞.

Take any x = (x_n)_{n=1}^∞ ∈ R∞. For each n ≥ 1, construct a sequence {r_m(n), m ≥ 1} of rational numbers that converges to x_n as m → ∞. Then the sequence of points from A

a(n) = (r_n(1), r_n(2), . . . , r_n(n), 0, 0, . . . ),  n ≥ 1,

converges to x coordinate-wise as n → ∞, and hence in (R∞, ρ). Thus, A is dense in R∞ and countable. Therefore, (R∞, ρ) is separable.

Notice that there is no norm on R∞ that generates coordinate-wise convergence. Indeed, consider an arbitrary norm on R∞. For n ≥ 1, let x(n) = (0, . . . , 0, α_n, 0, . . . ), where α_n stands at the nth place and the positive number α_n is chosen such that ||x(n)|| = 1. Then the sequence x(n) converges to zero coordinate-wise, but it does not converge to zero in this norm.

COROLLARY 2.1.– For any n ≥ 1, the projective operator P_n : R∞ → Rⁿ, P_n x = (x_1, . . . , x_n), is continuous.

PROOF.– Let x(m) = (x_k(m))_{k=1}^∞ → x = (x_k)_{k=1}^∞ in R∞. By lemma 2.2, lim_{m→∞} x_k(m) = x_k, k = 1, 2, . . . Therefore,

P_n(x(m)) = (x_1(m), . . . , x_n(m)) → P_n x = (x_1, . . . , x_n) in Rⁿ as m → ∞.
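A finite truncation of [2.1] can be coded directly. In the sketch below, the truncation level and the test points are my own choices; it illustrates that ρ shrinks along a coordinate-wise convergent sequence and satisfies the triangle inequality:

```python
import itertools

# Finite-truncation sketch of the metric [2.1] on R^infty; the truncation
# level N = 60 is an illustrative choice (the series tail is at most 2**-N).
def d(t, s):
    return abs(t - s) / (1.0 + abs(t - s))

def rho(x, y, N=60):
    # x, y: callables n -> n-th coordinate (n = 1, 2, ...)
    return sum(d(x(n), y(n)) / 2.0 ** n for n in range(1, N + 1))

zero = lambda n: 0.0

# A coordinate-wise convergent sequence x(m) -> 0: x_n(m) = 1/m for each n.
# In line with lemma 2.2, rho(x(m), 0) -> 0 as m grows.
vals = [rho(lambda n, m=m: 1.0 / m, zero) for m in (1, 10, 100, 1000)]
assert all(vals[i] > vals[i + 1] for i in range(3))
assert vals[-1] < 1e-2

# Triangle inequality spot-check on a few sample points.
pts = [lambda n: 0.0, lambda n: 1.0 / n, lambda n: (-1.0) ** n]
for x, y, z in itertools.permutations(pts, 3):
    assert rho(x, y) <= rho(x, z) + rho(z, y) + 1e-12
```

Note the role of the weights 1/2ⁿ: without them the series could diverge, and with them every coordinate still influences the distance, which is exactly why ρ metrizes coordinate-wise convergence.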
2.1.2. Borel and cylindrical sigma-algebras coincide

Remember that the Borel sigma-algebra B(X) on a metric space (X, ρ) is generated by the class G of all open sets in X: B(X) = σa(G). This is applicable to the space (R∞, ρ). Since it is separable (see lemma 2.2), B(R∞) is generated by the class of all closed balls:

B(R∞) = σa({B̄(x, r) : x ∈ R∞, r > 0}).  [2.4]

Another way to generate this sigma-algebra is to use the so-called cylindrical sets.

DEFINITION 2.1.– Let n ≥ 1, A_n ∈ B(Rⁿ). The set Â_n = {x ∈ R∞ : (x_1, . . . , x_n) ∈ A_n} is called a cylinder with base A_n.

Consider the class of all cylinders

Cyl = {Â_n : n ≥ 1, A_n ∈ B(Rⁿ)}.  [2.5]
LEMMA 2.3.– (About cylindrical algebra) The class [2.5] is an algebra of sets in R∞, but it is not a sigma-algebra.

PROOF.– a) A base of a cylinder is not uniquely defined. For A_n ∈ B(Rⁿ), it holds Â_n = Â_{n+k}, k ≥ 1, with A_{n+k} = A_n × R^k, A_{n+k} ∈ B(R^{n+k}).

b) Let C_1, C_2 ∈ Cyl. Without loss of generality, we may and do assume that they have bases of the same dimension, say, n: C_1 = Â_n, C_2 = B̂_n, with A_n, B_n ∈ B(Rⁿ). Then C_1 ∪ C_2 is the cylinder with base A_n ∪ B_n ∈ B(Rⁿ), hence C_1 ∪ C_2 ∈ Cyl; likewise C_1 \ C_2 is the cylinder with base A_n \ B_n ∈ B(Rⁿ), hence C_1 \ C_2 ∈ Cyl. Thus, Cyl is an algebra.

c) Let C_n = {x ∈ R∞ : x_1 = . . . = x_n = 0}, n ≥ 1. These sets are cylinders, but ∩_{n=1}^∞ C_n = {0} ⊂ R∞ is not a cylinder. Therefore, Cyl is not a sigma-algebra.
The class [2.5] is called the cylindrical algebra. It turns out that the sigma-algebra generated by Cyl (the so-called cylindrical sigma-algebra) coincides with the Borel sigma-algebra.

THEOREM 2.1.– (Cylindrical sigma-algebra coincides with Borel one) For the class of sets [2.5], σa(Cyl) = B(R∞).
PROOF.– We show inclusions in both directions.

a) For Â_n ∈ Cyl, Â_n = P_n⁻¹ A_n (see corollary 2.1). Since P_n : R∞ → Rⁿ is continuous, it holds P_n⁻¹ A_n ∈ B(R∞). Thus, Cyl ⊂ B(R∞), whence σa(Cyl) ⊂ B(R∞).

b) We use relation [2.4]. Take any closed ball B̄(x, r):

B̄(x, r) = { y : Σ_{n=1}^∞ |y_n − x_n| / (2ⁿ(1 + |y_n − x_n|)) ≤ r } = ∩_{k=1}^∞ { y ∈ R∞ : Σ_{n=1}^k |y_n − x_n| / (2ⁿ(1 + |y_n − x_n|)) ≤ r }.

The latter sets are cylinders Â_k, with closed bases

A_k = { y ∈ R^k : Σ_{n=1}^k |y_n − x_n| / (2ⁿ(1 + |y_n − x_n|)) ≤ r }.

Thus, B̄(x, r) ∈ σa(Cyl), and

B(R∞) = σa({B̄(x, r) : x ∈ R∞, r > 0}) ⊂ σa(Cyl).

We checked the inclusions in both directions, and the statement is proven.
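The base-padding used in lemma 2.3 and theorem 2.1 ("a base of a cylinder is not uniquely defined") can be made concrete. In the sketch below the representation of cylinders is my own, not the book's notation:

```python
# A cylinder hat(A_n) = {x in R^infty : (x_1, ..., x_n) in A_n}, with the base
# encoded as a predicate on tuples of length n. Padding the base,
# A_{n+k} = A_n x R^k, describes the same subset of R^infty.
class Cylinder:
    def __init__(self, n, base):
        self.n = n          # dimension of the base
        self.base = base    # predicate on R^n, given on tuples of length n

    def contains(self, x):
        # x: an infinite sequence, given as a callable k -> x_k
        return self.base(tuple(x(k) for k in range(1, self.n + 1)))

    def padded(self, k):
        # Same cylinder, re-based on A_n x R^k in B(R^{n+k}).
        return Cylinder(self.n + k, lambda t: self.base(t[: self.n]))

A2 = Cylinder(2, lambda t: t[0] ** 2 + t[1] ** 2 <= 1.0)  # closed disc base
A5 = A2.padded(3)                                         # same set, base in R^5

x_in = lambda k: 0.5 if k <= 2 else float(k)  # first two coordinates in the disc
x_out = lambda k: 2.0
for x in (x_in, x_out):
    assert A2.contains(x) == A5.contains(x)
```

Membership in a cylinder depends only on finitely many coordinates, which is exactly what makes the class Cyl an algebra but not a sigma-algebra.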
2.1.3. Weighted l2 space

Classical sequence spaces are subsets of R∞. The most popular of these spaces is the Hilbert space

l2 = {x ∈ R∞ : Σ_{n=1}^∞ x_n² < ∞},  [2.6]

with inner product

(x, y)_2 = Σ_{n=1}^∞ x_n y_n,  x, y ∈ l2.

Consider a more general sequence space. Let a = (a_1, . . . , a_n, . . . ) be a sequence of positive numbers and

l2,a := {x ∈ R∞ : Σ_{n=1}^∞ a_n x_n² < ∞}.  [2.7]
Denote

(x, y)_a = Σ_{n=1}^∞ a_n x_n y_n,  x, y ∈ l2,a.  [2.8]

The series converges because Σ_{n=1}^∞ a_n |x_n y_n| ≤ Σ_{n=1}^∞ a_n (x_n² + y_n²) < ∞. It is easy to verify that [2.8] is an inner product in l2,a. We call l2,a with inner product [2.8] the weighted l2 space, with weights a_n, n = 1, 2, . . . The induced norm is

||x||_a = √((x, x)_a) = ( Σ_{n=1}^∞ a_n x_n² )^{1/2},  x ∈ l2,a.  [2.9]
LEMMA 2.4.– (About weighted l2 space) The sequence space l2,a, with inner product [2.8], is a Hilbert space.

PROOF.– Let τ be a measure on the sigma-algebra 2^N of all subsets of N,

τ(A) = Σ_{n∈A} a_n,  A ⊂ N

(in particular τ(∅) = 0, because the sum over an empty set of indices is zero by convention). The space L2(N, τ) consists of functions f : N → R such that

∫_N f²(n) dτ(n) = Σ_{n=1}^∞ a_n f²(n) < ∞,

with inner product

(f, g)_{L2} = ∫_N f(n)g(n) dτ(n) = Σ_{n=1}^∞ a_n f(n)g(n),  f, g ∈ L2(N, τ).

For x ∈ l2,a define f_x(n) = x_n, n ∈ N. Then f_x ∈ L2(N, τ) and the operator

J : l2,a → L2(N, τ),  Jx = f_x,  x ∈ l2,a,

is an isometry between l2,a and L2(N, τ). Indeed, it is a linear surjection, with

(Jx, Jy)_{L2} = (f_x, f_y)_{L2} = Σ_{n=1}^∞ a_n f_x(n) f_y(n) = Σ_{n=1}^∞ a_n x_n y_n = (x, y)_a.

Thus, l2,a and L2(N, τ) are isometric. But the latter space is Hilbert as an L2 space. Therefore, l2,a is a Hilbert space as well.
See problem (4) below for another proof of lemma 2.4, based on an isometry between the sequence spaces l2,a and l2.

LEMMA 2.5.– The set l2,a is a Borel subset of R∞.

PROOF.– In view of theorem 2.1, it is enough to express l2,a through cylinders using a countable number of operations that are admissible in a sigma-algebra. We have

l2,a = ∪_{N=1}^∞ {x ∈ R∞ : Σ_{n=1}^∞ a_n x_n² ≤ N} = ∪_{N=1}^∞ ∩_{k=1}^∞ {x ∈ R∞ : Σ_{n=1}^k a_n x_n² ≤ N}.

The latter set {x ∈ R∞ : Σ_{n=1}^k a_n x_n² ≤ N} =: A_{Nk} is a cylinder with closed base in R^k, A_{Nk} ∈ σa(Cyl). Then

B_N := ∩_{k=1}^∞ A_{Nk} ∈ σa(Cyl),

l2,a = ∪_{N=1}^∞ B_N ∈ σa(Cyl) = B(R∞)

(see theorem 2.1).
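Problem (4) below states that Jx = (√a_n x_n)_{n=1}^∞ is an isometry from l2,a onto l2; for truncated sequences this is a short computation (the weights and sample vectors are illustrative choices):

```python
import math

# Illustrative summable weights and two truncated sample points of l_{2,a}.
a = [2.0 ** -n for n in range(1, 21)]
x = [1.0 / n for n in range(1, 21)]
y = [(-1.0) ** n / n for n in range(1, 21)]

def inner_a(u, v):
    # (u, v)_a = sum a_n u_n v_n, the weighted inner product [2.8]
    return sum(an * un * vn for an, un, vn in zip(a, u, v))

def inner_2(u, v):
    # Plain l_2 inner product
    return sum(un * vn for un, vn in zip(u, v))

def J(u):
    # The candidate isometry of problem (4): (Ju)_n = sqrt(a_n) * u_n
    return [math.sqrt(an) * un for an, un in zip(a, u)]

# J preserves inner products, hence norms: ||Ju||_2 = ||u||_a.
assert abs(inner_2(J(x), J(y)) - inner_a(x, y)) < 1e-12
assert abs(math.sqrt(inner_2(J(x), J(x))) - math.sqrt(inner_a(x, x))) < 1e-12
```

Since l2 is complete and J is a linear bijection preserving the norm, completeness of l2,a follows, which is the alternative route to lemma 2.4.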
Problems 2.1

1) Let (X, ρ) be a metric space and ϕ : [0, ∞) → [0, ∞) be a concave function, with ϕ(0) = 0, which is not identically zero. Prove that (X, ϕ(ρ)) is a metric space as well.

2) Based on problem (1), give an alternative proof of the fact that the function [2.2] is a metric on R.

3) Prove that the space (R∞, ρ) is complete, with the metric given in [2.1].

4) Show that the operator J : l2,a → l2, Jx = (√a_n x_n)_{n=1}^∞, is an isometry between l2,a and l2. Then give an alternative proof of lemma 2.4. Moreover, prove that the space l2,a is separable.

5) Prove that the following sequence spaces are Borel subsets of R∞: a) lp, 1 ≤ p < ∞; b) the space l∞ of bounded real sequences; c) the space c_0 of real sequences convergent to zero; d) the space c of real convergent sequences.
2.2. Product measure in R∞

If we want to construct a measure on B(R∞), we can define it first on the cylindrical algebra [2.5] and then extend it to σa(Cyl) = B(R∞) using Carathéodory's theorem (see [HAL 13]).

2.2.1. Kolmogorov extension theorem

DEFINITION 2.2.– For each n ≥ 1, let μ_n be a probability measure on B(Rⁿ). The sequence {μ_n, n ≥ 1} is called consistent if for each n ≥ 1 and each B_n ∈ B(Rⁿ),

μ_{n+1}(B_n × R) = μ_n(B_n).  [2.10]

EXAMPLE 2.1.– (Consistent sequence of projections) Let ν be a probability measure on B(R∞). Consider the so-called projections of ν:

ν_n(B_n) = ν(B̂_n),  n ≥ 1,  B_n ∈ B(Rⁿ).

(Here, B̂_n is the cylinder with base B_n; recall definition 2.1.) Then the sequence {ν_n, n ≥ 1} is consistent.

PROOF.– Remember that P_n : R∞ → Rⁿ is the projective operator from corollary 2.1. It holds ν_n = νP_n⁻¹; therefore, ν_n is a probability measure on B(Rⁿ). For B_n ∈ B(Rⁿ), it holds

ν_{n+1}(B_n × R) = ν((B_n × R)^) = ν(B̂_n) = ν_n(B_n).

Thus, the sequence {ν_n, n ≥ 1} is consistent.
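For a concrete instance of definition 2.2, take μ_n = g × · · · × g (n factors of the standard Gaussian measure on R) evaluated on rectangles; condition [2.10] then holds because the extra factor contributes g(R) = 1. A sketch (the rectangle is an arbitrary choice, and the real line is approximated by a huge interval):

```python
import math

def g_interval(lo, hi):
    # g([lo, hi]) for the standard Gaussian measure on R, via the error function
    Phi = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return Phi(hi) - Phi(lo)

def mu(rect):
    # mu_n of a rectangle prod [lo_i, hi_i], where mu_n = g x ... x g (n times)
    p = 1.0
    for lo, hi in rect:
        p *= g_interval(lo, hi)
    return p

B = [(-1.0, 2.0), (0.0, 1.5), (-0.5, 0.5)]   # arbitrary base rectangle in R^3
B_padded = B + [(-1e9, 1e9)]                 # numerically, B x R

# Consistency [2.10]: mu_4(B x R) = mu_3(B).
assert abs(mu(B_padded) - mu(B)) < 1e-12
```

Rectangles alone do not determine a measure on B(Rⁿ), but they generate it; the point of theorem 2.2 below is that consistency on cylinders is enough to glue the whole sequence into one measure on B(R∞).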
LEMMA 2.6.– (Condition for sigma-additivity) Let μ be a non-negative, additive and finite set function on an algebra A such that for each sequence B_n of sets from the algebra that decrease to ∅, μ(B_n) → 0 as n → ∞. Then μ is a measure on A.

PROOF.– Let {A_n, n = 1, 2, . . . } ⊂ A be disjoint sets with ∪_{n=1}^∞ A_n ∈ A. Then

μ( ∪_{n=1}^∞ A_n ) = Σ_{n=1}^k μ(A_n) + μ( ∪_{n=k+1}^∞ A_n ).  [2.11]

The sets B_k := ∪_{n=k+1}^∞ A_n = (∪_{n=1}^∞ A_n) \ ∪_{n=1}^k A_n belong to A and decrease to ∅. Therefore, μ(B_k) → 0 as k → ∞. Now, let k → ∞ in [2.11] and get

μ( ∪_{n=1}^∞ A_n ) = Σ_{n=1}^∞ μ(A_n).
Thus, μ is non-negative and sigma-additive on A. Hence, it is a measure on A.

It turns out that any consistent sequence of measures has the form presented in example 2.1.

THEOREM 2.2.– (Kolmogorov extension theorem) Let μ_n be a probability measure on B(Rⁿ), n ≥ 1, and let the sequence {μ_n} be consistent. Then there exists a unique probability measure μ on B(R∞) such that for all n ≥ 1,

μ(B̂_n) = μ_n(B_n),  B_n ∈ B(Rⁿ).

PROOF.– a) Define a set function on the cylindrical algebra. For each Â_n ∈ Cyl, we put μ(Â_n) = μ_n(A_n), n ≥ 1, A_n ∈ B(Rⁿ). The consistency of {μ_n} implies that μ is well-defined. Indeed,

μ_{n+k}(A_n × R^k) = μ_{n+k−1}(A_n × R^{k−1}) = · · · = μ_{n+1}(A_n × R) = μ_n(A_n).

Next, μ is additive. Indeed, let C_1, . . . , C_k be disjoint cylinders. Without loss of generality, we may and do assume that they have bases of equal dimensions: C_i = B̂_{in}, B_{in} ∈ B(Rⁿ), i = 1, . . . , k. Then the bases are disjoint as well, and

μ( ∪_{i=1}^k C_i ) = μ_n( ∪_{i=1}^k B_{in} ) = Σ_{i=1}^k μ_n(B_{in}) = Σ_{i=1}^k μ(C_i).

Hence, μ is an additive, non-negative and finite set function on Cyl.

b) Show that μ satisfies the conditions of lemma 2.6 on the algebra Cyl. We prove by contradiction. Assume that there exists a sequence C_n of cylinders which decreases to ∅ and such that μ(C_n) → δ > 0 as n → ∞. Without loss of generality, we assume that C_n = B̂_n, B_n ∈ B(Rⁿ), n ≥ 1.

Now, by the regularity of the measure μ_n on Rⁿ (see [BOG 07]), for the fixed δ > 0 there exists a compact set A_n ⊂ B_n with

μ_n(B_n \ A_n) ≤ δ/2^{n+1}.

Hence

μ(B̂_n \ Â_n) = μ((B_n \ A_n)^) = μ_n(B_n \ A_n) ≤ δ/2^{n+1}.
Now, we form a decreasing sequence of cylinders

D̂_n = ∩_{k=1}^n Â_k,  n ≥ 1.

Here, the base D_n is a compact set in Rⁿ, and ∩_{n=1}^∞ D̂_n = ∅ because

∩_{n=1}^∞ D̂_n ⊂ ∩_{n=1}^∞ B̂_n = ∅.

The sets B̂_n are decreasing; therefore,

μ(B̂_n \ D̂_n) ≤ Σ_{k=1}^n μ(B̂_n \ Â_k) ≤ Σ_{k=1}^n μ(B̂_k \ Â_k) ≤ Σ_{k=1}^n δ/2^{k+1} ≤ δ/2.

Then

lim_{n→∞} μ(D̂_n) ≥ lim_{n→∞} μ(B̂_n) − δ/2 = δ/2 > 0,

and D̂_n ≠ ∅, for all n ≥ 1.

Finally, let x̂(n) = (x_k(n))_{k=1}^∞ ∈ D̂_n, n ≥ 1. Consider {x_1(n)} ⊂ D_1; D_1 is compact; hence there exists a subsequence {x_1(n_1)} that converges to x_1⁰ ∈ D_1. Next, consider {(x_1(n_1), x_2(n_1))} ⊂ D_2; D_2 is compact; hence there exists a further subsequence {(x_1(n_2), x_2(n_2))} that converges to (x_1⁰, x_2⁰) ∈ D_2. We continue the process, and for each k ≥ 1 we obtain an embedded subsequence {(x_1(n_k), . . . , x_k(n_k))} that converges to (x_1⁰, . . . , x_k⁰) ∈ D_k. The point (x_1⁰, . . . , x_k⁰, . . . ) belongs to D̂_k, for all k ≥ 1, and therefore it belongs to ∩_{k=1}^∞ D̂_k. But the latter intersection is empty. The obtained contradiction shows that μ satisfies the conditions of lemma 2.6, and μ is a measure on Cyl.

c) By Carathéodory's theorem, the probability measure μ can be uniquely extended to a probability measure on σa(Cyl) = B(R∞).

2.2.2. Construction of product measure on B(R∞)

Remember the definition of a product of a finite number of measures. Start with two measure spaces (X_i, F_i, μ_i), i = 1, 2, with probability measures μ_i. The product measure μ_1 × μ_2 is a probability measure defined on the product space (X_1 × X_2, F), F = σa(F_1 × F_2), F_1 × F_2 := {A_1 × A_2 : A_1 ∈ F_1, A_2 ∈ F_2}. It holds

(μ_1 × μ_2)(A_1 × A_2) = μ_1(A_1)μ_2(A_2),  A_i ∈ F_i,  i = 1, 2,
and for A ∈ F,

(μ_1 × μ_2)(A) = ∫_{X_2} μ_1(A_{x_2}) dμ_2(x_2) = ∫_{X_1} μ_2(A_{x_1}) dμ_1(x_1).

Here A_{x_i} are the sections of the set A:

A_{x_2} = {x_1 ∈ X_1 : (x_1, x_2) ∈ A},  A_{x_1} = {x_2 ∈ X_2 : (x_1, x_2) ∈ A}.
Given three probability spaces (X_i, F_i, μ_i), i = 1, 2, 3, one can form the product space (Π_{i=1}^3 X_i, F⁽³⁾), with F⁽³⁾ = σa(F_1 × F_2 × F_3),

Π_{i=1}^3 F_i := { Π_{i=1}^3 A_i : A_i ∈ F_i, i = 1, 2, 3 }.

Let us identify the Cartesian products A_1 × A_2 × A_3 and (A_1 × A_2) × A_3, and also A_1 × A_2 × A_3 and A_1 × (A_2 × A_3). The product measure μ_1 × μ_2 × μ_3 := (μ_1 × μ_2) × μ_3 = μ_1 × (μ_2 × μ_3) is a probability measure on (Π_{i=1}^3 X_i, F⁽³⁾) such that

(Π_{i=1}^3 μ_i)( Π_{i=1}^3 A_i ) = Π_{i=1}^3 μ_i(A_i),  A_i ∈ F_i,  i = 1, 2, 3.

Using sections of a set A ∈ F⁽³⁾, one can write, for instance,

(Π_{i=1}^3 μ_i)(A) = ∫_{X_2 × X_3} μ_1(A_{x_2 x_3}) d(μ_2 × μ_3)(x_2, x_3) = ∫_{X_3} (μ_1 × μ_2)(A_{x_3}) dμ_3(x_3).

By induction, given n probability spaces (X_i, F_i, μ_i), 1 ≤ i ≤ n, the product measure μ⁽ⁿ⁾ = Π_{i=1}^n μ_i is a probability measure on the product space (Π_{i=1}^n X_i, F⁽ⁿ⁾), F⁽ⁿ⁾ := σa(F_1 × F_2 × . . . × F_n), such that

μ⁽ⁿ⁾ = ( Π_{i=1}^{n−1} μ_i ) × μ_n.

It holds

μ⁽ⁿ⁾( Π_{i=1}^n A_i ) = Π_{i=1}^n μ_i(A_i),  A_i ∈ F_i,  i = 1, . . . , n.  [2.12]

Now, we want to define a product of a sequence of measures. We need this construction for measures on the real line.
Let {μ_n, n ≥ 1} be a sequence of probability measures on B(R). For any n ≥ 1, consider the product measure

ν_n := Π_{i=1}^n μ_i  [2.13]

on σa(Π_{i=1}^n F_i) = B(Rⁿ), where F_i ≡ B(R).

LEMMA 2.7.– (About product measure) For the probability measures [2.13], there exists a unique probability measure ν on B(R∞) such that for all n ≥ 1,

ν(B̂_n) = ν_n(B_n) = ( Π_{i=1}^n μ_i )(B_n),  B_n ∈ B(Rⁿ).

PROOF.– Check that the sequence {ν_n} is consistent (see definition 2.2). Because ν_{n+1} = ν_n × μ_{n+1}, it holds

ν_{n+1}(B_n × R) = ν_n(B_n)μ_{n+1}(R) = ν_n(B_n),  B_n ∈ B(Rⁿ),  n ≥ 1.

Thus, {ν_n} is consistent. Now, the statement follows from theorem 2.2.

DEFINITION 2.3.– The probability measure ν from lemma 2.7 is called a product measure in R∞ and is denoted as

ν = μ_1 × μ_2 × . . . × μ_n × . . . = Π_{n=1}^∞ μ_n.
2.2.3. Properties of product measure

The next statement extends property [2.12] to the product of a sequence of measures.

THEOREM 2.3.– Let {μ_n} be a sequence of probability measures on B(R), let {A_n} be a sequence of Borel subsets of R, and let ν = Π_{n=1}^∞ μ_n in R∞. Then A := Π_{n=1}^∞ A_n ∈ B(R∞) and

ν(A) = Π_{n=1}^∞ μ_n(A_n).

PROOF.– Let B_n = A_1 × . . . × A_n. It holds

A = ∩_{n=1}^∞ B̂_n ∈ σa(Cyl) = B(R∞).
Here we used the fact that the product of Borel sets Π_{i=1}^n A_i is a Borel set in Rⁿ.

The cylinders C_n = B̂_n decrease to the set A, and the continuity of the measure ν from above implies

ν(A) = lim_{n→∞} ν(C_n) = lim_{n→∞} ν_n(A_1 × . . . × A_n) = lim_{n→∞} Π_{i=1}^n μ_i(A_i) = Π_{i=1}^∞ μ_i(A_i).
COROLLARY 2.2.– Let {μ_n} be a sequence of probability measures on B(R), let {A_n} be a sequence of Borel sets with μ_n(A_n) > 0, n ≥ 1, and let ν = Π_{n=1}^∞ μ_n in R∞. Then ν(Π_{n=1}^∞ A_n) > 0 if, and only if, Σ_{n=1}^∞ μ_n(A_nᶜ) < ∞ (hereafter Aᶜ is the complement of A).

PROOF.– By theorem 2.3,

ν( Π_{n=1}^∞ A_n ) = Π_{n=1}^∞ (1 − μ_n(A_nᶜ)).
Here 0 ≤ μ_n(A_nᶜ) < 1, n ≥ 1. According to the test for convergence of infinite products, the latter infinite product converges to a positive limit if, and only if, Σ_{n=1}^∞ μ_n(A_nᶜ) < ∞.

Thus, ν(Π_{n=1}^∞ A_n) > 0 if, and only if, the values μ_n(A_nᶜ) are quite small, or equivalently, the values μ_n(A_n) are quite large.

Now, we consider mappings from a probability space to R∞.

DEFINITION 2.4.– Let (Ω, F, P) be a probability space and (X, ρ) be a metric space. A mapping ξ : Ω → X is called a random element distributed in X if ξ is (F, B(X)) measurable. The induced probability measure μ_ξ := Pξ⁻¹ is called the distribution of ξ.

Thus, for a random element ξ distributed in X, it holds ξ⁻¹B ∈ F, for all B ∈ B(X), and

μ_ξ(B) = (Pξ⁻¹)(B) = P{ω : ξ(ω) ∈ B},  B ∈ B(X).

In particular, if X = R, then the random element is just a r.v., and if X = Rⁿ, then it is a random vector.

LEMMA 2.8.– (About components of random element) Consider a mapping ξ = (ξ_n)_{n=1}^∞ : Ω → R∞. It is a random element in R∞ if, and only if, ξ_n is a r.v., for each n ≥ 1.
PROOF.– a) The coordinate projector

π_n x = x_n,  x ∈ R∞,  [2.14]

is continuous by lemma 2.2; hence π_n is a Borel mapping. Assume that ξ is a random element in R∞. Then ξ_n = π_n(ξ) : Ω → R is (F, B(R)) measurable as a composition of (F, B(R∞)) and (B(R∞), B(R)) measurable mappings. Thus, ξ_n is a r.v.

b) Assume that ξ_n is a r.v., for all n ≥ 1. Then ξ⁽ᵏ⁾ := (ξ_n)_{n=1}^k is a random vector in R^k. Now, take any cylinder Â_k ∈ Cyl. It holds

ξ⁻¹(Â_k) = {ω ∈ Ω : ξ⁽ᵏ⁾(ω) ∈ A_k} ∈ F,

because A_k ∈ B(R^k) and ξ⁽ᵏ⁾ is a random vector. Since σa(Cyl) = B(R∞), the mapping ξ is (F, B(R∞)) measurable, and ξ is a random element in R∞.

Remember that random variables Y_1, . . . , Y_n, . . . , which are defined on the same probability space (Ω, F, P), are independent if for each n ≥ 1, the random variables Y_1, . . . , Y_n are independent. This means that

P{Y_1 ∈ B_1, Y_2 ∈ B_2, . . . , Y_n ∈ B_n} = Π_{k=1}^n P{Y_k ∈ B_k},

for each n ≥ 1 and each B_1, . . . , B_n ∈ B(R).

Consider a sequence ξ_1, ξ_2, . . . , ξ_n, . . . of random variables on the same probability space and let ξ = (ξ_n)_{n=1}^∞ : Ω → R∞. By lemma 2.8, it is a random element in R∞.

LEMMA 2.9.– (About distribution of element with independent components) The random variables ξ_n are independent if, and only if, the distribution of ξ is a product measure:

μ_ξ = Π_{n=1}^∞ μ_n,  [2.15]

and then μ_n is the distribution of ξ_n: μ_n = μ_{ξ_n}, n ≥ 1.
PROOF.– a) Assume that ξ_n, n ≥ 1, are independent. For a cylinder B̂_n, we have (here ξ⁽ⁿ⁾ = (ξ_k)_{k=1}^n):

μ_ξ(B̂_n) = P{ξ ∈ B̂_n} = P{ξ⁽ⁿ⁾ ∈ B_n} = μ_{ξ⁽ⁿ⁾}(B_n).

Since the components of ξ⁽ⁿ⁾ are independent,

μ_{ξ⁽ⁿ⁾} = Π_{k=1}^n μ_{ξ_k}

(see section 1.4.3), and

μ_ξ(B̂_n) = ( Π_{k=1}^n μ_{ξ_k} )(B_n).

Thus (see definition 2.3), μ_ξ = Π_{k=1}^∞ μ_{ξ_k}.

b) Now, assume [2.15]. Then for A_1, . . . , A_n ∈ B(R),

P{ξ_1 ∈ A_1, . . . , ξ_n ∈ A_n} = P{ξ ∈ (Π_{k=1}^n A_k)^} = μ_ξ((Π_{k=1}^n A_k)^) = Π_{k=1}^n μ_k(A_k).

But μ_k(A_k) = μ_ξ(π_k⁻¹A_k) = P{ξ ∈ π_k⁻¹A_k} = P{ξ_k ∈ A_k}. Therefore,

P{ξ_1 ∈ A_1, . . . , ξ_n ∈ A_n} = Π_{k=1}^n P{ξ_k ∈ A_k}.

This holds for any n ≥ 1 and A_1, . . . , A_n ∈ B(R). Hence, ξ_n, n ≥ 1, are independent.

COROLLARY 2.3.– (Existence of independent sequence) Let μ_n, n = 1, 2, . . . , be a sequence of probability measures on B(R). Then there exists a sequence ξ_n, n = 1, 2, . . . , of independent random variables with μ_{ξ_n} = μ_n, n = 1, 2, . . .

PROOF.– Introduce the probability space

(Ω, F, P) = ( R∞, B(R∞), Π_{n=1}^∞ μ_n )
and the mappings ξ_n : Ω → R, ξ_n(ω) = ω_n, n = 1, 2, . . . Then the identity mapping ξ = (ξ_n)_{n=1}^∞ is a random element in R∞. The distribution of ξ at a Borel set B equals

μ_ξ(B) = P{ξ ∈ B} = P(B) = ( Π_{n=1}^∞ μ_n )(B),

and μ_ξ = Π_{n=1}^∞ μ_n. Now, by lemma 2.9, the ξ_n are independent random variables and μ_{ξ_n} = μ_n, n = 1, 2, . . .

Problems 2.2

6) Generalize theorem 2.2 for consistent sequences of probability measures on (X_1, B(X_1)), (X_1 × X_2, B(X_1 × X_2)), . . . , where X_n is a complete separable metric space, n = 1, 2, . . .

Hint. Let X be a complete separable metric space. Every probability measure μ on B(X) is regular; in particular, for each A ∈ B(X) and each ε > 0 there exists a compact set K ⊂ A such that μ(A \ K) < ε (see [BOG 07]).

7) Let μ_n be a probability measure in a complete separable metric space X_n, n = 1, 2, . . . Based on problem (6), construct a product measure Π_{n=1}^∞ μ_n in the metric space X∞ = Π_{n=1}^∞ X_n.

8) Based on the Borel–Cantelli lemma, give an alternative proof of the following part of corollary 2.2: if under the conditions of corollary 2.2 ν(Π_{n=1}^∞ A_n) > 0, then Σ_{n=1}^∞ μ_n(A_nᶜ) < ∞.

9) Based on problem (7), prove the following generalization of corollary 2.3: let μ_n be a probability measure on a complete separable metric space X_n, n = 1, 2, . . . ; then there exists a sequence ξ_n, n = 1, 2, . . . , of independent random elements such that ξ_n is distributed in X_n and μ_{ξ_n} = μ_n, n = 1, 2, . . .

2.3. Standard Gaussian measure in R∞

In section 1.5.4, we considered the standard Gaussian measure on the real line and in Euclidean space. Let g be the standard Gaussian measure on R, i.e.

g(B) = (1/√(2π)) ∫_B e^{−x²/2} dx,  B ∈ B(R).

DEFINITION 2.5.– The product measure μ = Π_{n=1}^∞ μ_n is called the standard Gaussian measure in R∞ if μ_n = g, for all n ≥ 1.
DEFINITION 2.6.– We call a sequence γ_n, n = 1, 2, . . . , of i.i.d. N(0, 1) random variables Gaussian white noise.

Gaussian white noise {γ_n} yields a random element γ = (γ_n)_{n=1}^∞ distributed in R∞. Its distribution μ_γ is just the standard Gaussian measure in R∞, because by lemma 2.9

μ_γ = Π_{n=1}^∞ μ_{γ_n},  μ_{γ_n} = g,  n = 1, 2, . . .
Remember that the sequence space l2,a was introduced in section 2.1.3. Here a = (a_n)_{n=1}^∞, with a_n > 0, n = 1, 2, . . . Remember also that l2,a ∈ B(R∞). We will show that the standard Gaussian measure is concentrated on l2,a if a ∈ l1.

THEOREM 2.4.– (Kolmogorov–Khinchin criterion) Let μ be the standard Gaussian measure in R∞. Then

μ(l2,a) = 0 if Σ_{n=1}^∞ a_n = ∞,  and  μ(l2,a) = 1 if Σ_{n=1}^∞ a_n < ∞.
PROOF.– a) For fixed λ > 0 and N ≥ 1, introduce the function

f_{λN}(x) = exp( −(λ/2) Σ_{k=1}^N a_k x_k² ),  x ∈ R∞.

Functions of this kind, which depend only on a finite number of coordinates, are called cylindrical. The function f_{λN} is continuous and bounded (0 < f_{λN} ≤ 1); hence f_{λN} ∈ L(R∞, μ). This function is generated by the projective operator P_N : R∞ → R^N and the function

f_λ⁽ᴺ⁾(x_1, . . . , x_N) = exp( −(λ/2) Σ_{k=1}^N a_k x_k² ).

In fact, f_{λN} = f_λ⁽ᴺ⁾ ∘ P_N.

Denote ν_N = Π_{k=1}^N μ_k, μ_k ≡ g. Then ν_N = μP_N⁻¹. Using the change of variables formula and problem (26) from Chapter 1, we obtain:

∫_{R∞} f_{λN} dμ = ∫_{R∞} f_λ⁽ᴺ⁾(P_N x) dμ(x) = ∫_{R^N} f_λ⁽ᴺ⁾(t) dν_N(t) = 1 / √( Π_{k=1}^N (1 + λa_k) ).

For fixed x ∈ R∞,

f_λ(x) := lim_{N→∞} f_{λN}(x) = exp( −(λ/2) Σ_{k=1}^∞ a_k x_k² ) if x ∈ l2,a,  and = 0 if x ∈ R∞ \ l2,a.

Moreover, 0 < f_{λN}(x) ≤ 1, 1 ∈ L(R∞, μ), and by the Lebesgue dominated convergence theorem:

∫_{R∞} f_λ dμ = lim_{N→∞} ∫_{R∞} f_{λN} dμ = 1 / √( Π_{k=1}^∞ (1 + λa_k) ).  [2.16]

If the infinite product converges, then ∫_{R∞} f_λ dμ is finite and positive; otherwise, if the infinite product diverges to infinity, then the right-hand side of [2.16] equals zero by convention, and ∫_{R∞} f_λ dμ = 0.

b) Assume that Σ_{k=1}^∞ a_k = ∞. Then by the test for convergence of infinite products, Π_{k=1}^∞ (1 + λa_k) = ∞. It holds

0 = ∫_{R∞} f_λ dμ = ∫_{l2,a} f_λ dμ.
Here, we used the fact that f_λ vanishes on R∞ \ l2,a. Next, f_λ > 0 on l2,a, and therefore μ(l2,a) = 0.

c) Now, assume that Σ_{k=1}^∞ a_k < ∞. The infinite product in [2.16] converges; thus,

∫_{R∞} f_λ dμ = exp( −(1/2) Σ_{k=1}^∞ log(1 + λa_k) ).  [2.17]

It holds

lim_{λ→0+} f_λ(x) = 1 if x ∈ l2,a,  and = 0 if x ∈ R∞ \ l2,a.

Moreover, 0 ≤ f_λ ≤ 1, 1 ∈ L(R∞, μ), and by the Lebesgue dominated convergence theorem

lim_{λ→0+} ∫_{R∞} f_λ dμ = ∫_{R∞} I_{l2,a} dμ = μ(l2,a).

However, the right-hand side of [2.17] converges to 1 as λ → 0+, because

0 ≤ Σ_{k=1}^∞ log(1 + λa_k) ≤ λ Σ_{k=1}^∞ a_k → 0 as λ → 0+.
Thus, in this case μ(l2,a ) = 1. It is instructive to give an alternative proof of a part of theorem 2.4.
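The two regimes of theorem 2.4 can be seen numerically through the partial products Π_{k=1}^N (1 + λa_k)^{−1/2} appearing in the proof. In this sketch the weight sequences and truncation levels are illustrative choices:

```python
import math

def integral_fN(lam, a):
    # Partial product prod_k (1 + lam * a_k)^(-1/2): the value of the integral
    # of f_{lam N} over R^infty in the proof of theorem 2.4
    p = 1.0
    for ak in a:
        p /= math.sqrt(1.0 + lam * ak)
    return p

summable = [2.0 ** -k for k in range(1, 200)]   # sum a_k < infinity
divergent = [1.0 / k for k in range(1, 200)]    # sum a_k = infinity

# Summable case: the product stays bounded away from 0 and tends to 1
# as lam -> 0+ (part c of the proof).
assert integral_fN(1.0, summable) > 0.5
assert integral_fN(1e-6, summable) > 1.0 - 1e-5

# Divergent case: the partial products decay towards 0 as N grows (part b);
# here prod (1 + 1/k) telescopes, so the decay is explicit.
assert integral_fN(1.0, divergent) < integral_fN(1.0, divergent[:50])
assert integral_fN(1.0, divergent) < 0.1
```

The limit value of the product is exactly ∫ f_λ dμ from [2.16], so its collapse to 0 (respectively, its convergence to 1 as λ → 0+) is the numerical shadow of μ(l2,a) = 0 (respectively, μ(l2,a) = 1).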
2.3.1. Alternative proof of the second part of theorem 2.4 P ROOF.– Let {γn } be Gaussian white noise (see definition 2.6) and γ = (γn )∞ 1 be the corresponding random element in R∞ . Then the distribution μγ of γ coincides with standard Gaussian measure μ and ∞ 2 μ(l2,a ) = μγ (l2,a ) = P {γ ∈ l2,a } = P ak γk < ∞ . 1
In terms of probability theory, theorem 2.4 can be restated as follows. Let $\{\gamma_k\}$ be Gaussian white noise; then:

i) $\sum_{k=1}^{\infty} a_k \gamma_k^2 = \infty$ a.s. if $\sum_{k=1}^{\infty} a_k = \infty$;

ii) $\sum_{k=1}^{\infty} a_k \gamma_k^2 < \infty$ a.s. if $\sum_{k=1}^{\infty} a_k < \infty$.

Now, assume that $\sum_{k=1}^{\infty} a_k < \infty$. Put
$$S(\omega) = \sum_{k=1}^{\infty} a_k \gamma_k^2(\omega) = \lim_{N \to \infty} \sum_{k=1}^{N} a_k \gamma_k^2(\omega), \quad \omega \in \Omega.$$
Here $S = S(\omega)$ is a non-negative r.v. distributed on the extended real line $\bar{\mathbb{R}} = \mathbb{R} \cup \{+\infty, -\infty\}$. Next, $a_k \gamma_k^2 \ge 0$, and by the monotone convergence theorem
$$E[S] = \sum_{k=1}^{\infty} a_k\, E[\gamma_k^2] = \sum_{k=1}^{\infty} a_k < \infty.$$
This implies the desired relation $S(\omega) < \infty$ a.s. $\square$
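The dichotomy (i)–(ii) is easy to observe numerically. The following sketch (the parameter choices $a_k = 1/k^2$ and $a_k = 1/k$ are illustrative, not from the book) simulates partial sums of $\sum_k a_k \gamma_k^2$ for a summable and a non-summable sequence:

```python
import numpy as np

rng = np.random.default_rng(0)

def partial_sums(a, n_terms=100_000, n_paths=5):
    """Cumulative sums of sum_k a(k) * gamma_k^2 for Gaussian white noise."""
    k = np.arange(1, n_terms + 1)
    gamma2 = rng.standard_normal((n_paths, n_terms)) ** 2
    return np.cumsum(a(k) * gamma2, axis=1)

# Summable case: sum 1/k^2 = pi^2/6 < infinity -> S finite a.s.
s_conv = partial_sums(lambda k: 1.0 / k**2)
# Non-summable case: sum 1/k = infinity -> partial sums diverge a.s.
s_div = partial_sums(lambda k: 1.0 / k)

print(s_conv[:, -1])   # stabilizes; mean value E[S] = pi^2/6
print(s_div[:, -1])    # grows without bound, roughly like log(n_terms)
```

The convergent paths settle near a finite random limit with expectation $\pi^2/6$, while the divergent paths keep growing, in line with the Kolmogorov–Khinchin criterion.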
Problems 2.3

10) Let $\{\nu_n\}$ be a sequence of probability measures on $\mathcal{B}(\mathbb{R})$, with $\sup_{n \ge 1} \int_{\mathbb{R}} x^2\, d\nu_n < \infty$, and let $\nu = \bigotimes_{n=1}^{\infty} \nu_n$. Prove that $\nu(l_{2,a}) = 1$ if $\sum_{n=1}^{\infty} a_n < \infty$.

11) Let $\lambda_n > 0$, $n \ge 1$, and $\sum_{n=1}^{\infty} \lambda_n < \infty$. Denote
$$J_{\alpha} = \int_{\mathbb{R}^{\infty}} \exp\Big\{\alpha \sum_{n=1}^{\infty} \lambda_n x_n^2\Big\}\, d\mu(x), \quad \alpha \in \mathbb{R},$$
where $\mu$ is the standard Gaussian measure in $\mathbb{R}^{\infty}$. Prove that $J_{\alpha} < \infty$ if, and only if, $\alpha < \frac{1}{2\max_{n \ge 1} \lambda_n}$. Show that in this case
$$J_{\alpha} = \prod_{n=1}^{\infty} \frac{1}{\sqrt{1 - 2\alpha\lambda_n}}.$$
12) Let $\lambda_n > 0$, $n \ge 1$, $\sum_{n=1}^{\infty} \lambda_n = \infty$, and let the integral $J_{\alpha}$ be defined as in problem (11). Prove that $J_{\alpha} = 0$ if $\alpha < 0$, and $J_{\alpha} = +\infty$ if $\alpha > 0$.

13) Levy's theorem implies the following: if a series of independent random variables converges in probability, then it converges a.s. (see lemma 3.2.1 in [BUL 80]). Using this fact, prove the next statement. Let $\mu$ be the standard Gaussian measure on $\mathbb{R}^{\infty}$ and $t = (t_n)_{1}^{\infty} \in l_2$. Prove that the series $\sum_{n=1}^{\infty} t_n x_n$, $x = (x_n)_{1}^{\infty} \in \mathbb{R}^{\infty}$, converges almost everywhere w.r.t. $\mu$. Introduce a function $f_t : \mathbb{R}^{\infty} \to \mathbb{R}$,
$$f_t(x) = \begin{cases} \sum_{n=1}^{\infty} t_n x_n, & \text{if the series converges}, \\ 0, & \text{otherwise}. \end{cases}$$
Prove that
$$\int_{\mathbb{R}^{\infty}} \exp\{i f_t(x)\}\, d\mu(x) = \exp\Big\{-\frac{\|t\|^2}{2}\Big\},$$
where $\|t\|$ is the norm in $l_2$.

2.4. Construction of Gaussian measure in l2

Based on the Kolmogorov–Khinchin criterion (theorem 2.4), we will construct a Gaussian measure in the sequence space $l_2$; the latter is isometric to any separable infinite-dimensional Hilbert space. Let $H$ be an arbitrary real Hilbert space, with inner product $(x, y)$. The definitions of a Gaussian random element distributed in $H$ and of a Gaussian measure in $H$ are similar to definitions 1.11 and 1.14.

DEFINITION 2.7.– A random element $\xi$ distributed in $H$ is called Gaussian if for each $h \in H$, the inner product $(\xi, h)$ is a Gaussian r.v. (possibly with zero variance; see definition 1.5).

DEFINITION 2.8.– A probability measure $\mu$ on $\mathcal{B}(H)$ is called Gaussian if there exists a Gaussian random element $\xi$ in $H$ such that its distribution $\mu_{\xi} = \mu$.

Now, let $\mu$ be the standard Gaussian measure in $\mathbb{R}^{\infty}$. Fix a sequence $a = (a_n)_{1}^{\infty}$ of positive numbers such that $\sum_{n=1}^{\infty} a_n < \infty$. By the Kolmogorov–Khinchin criterion, $\mu(l_{2,a}) = 1$.
From the proof of lemma 2.5, one can see that the ball
$$\{x \in l_{2,a} : \|x\|_a \le r\} = \Big\{x \in \mathbb{R}^{\infty} : \sum_{n=1}^{\infty} a_n x_n^2 \le r^2\Big\} \in \mathcal{B}(\mathbb{R}^{\infty}),$$
and in a similar way, for each $y \in l_{2,a}$, the ball $B_a(y, r) := \{x \in l_{2,a} : \|x - y\|_a \le r\} \in \mathcal{B}(\mathbb{R}^{\infty})$. The space $l_{2,a}$ is separable (see problem (4)); therefore,
$$\mathcal{B}(l_{2,a}) = \sigma r(\{B_a(y, r) : y \in l_{2,a},\, r > 0\}) \subset \mathcal{B}(\mathbb{R}^{\infty}).$$
Here, $\sigma r$ denotes the generated sigma-ring. Now, let $\mu_a$ be the restriction of $\mu$ to $\mathcal{B}(l_{2,a})$: $\mu_a(A) = \mu(A)$, $A \in \mathcal{B}(l_{2,a})$. Since $\mu$ is concentrated on $l_{2,a}$, $\mu_a$ is a probability measure on $\mathcal{B}(l_{2,a})$. In order to construct a related measure in $l_2$, use the isometry $J$ between $l_{2,a}$ and $l_2$ (see problem (4)):
$$Jx = (\sqrt{a_n}\, x_n)_{1}^{\infty}, \quad x \in l_{2,a}.$$
Introduce the induced probability measure on $\mathcal{B}(l_2)$:
$$g = \mu_a J^{-1}, \quad g(B) = \mu_a(J^{-1}B), \quad B \in \mathcal{B}(l_2). \quad [2.18]$$
Unlike in section 2.3, here $g$ does not denote the standard Gaussian measure. Now, we need a simple statement about the convergence of Gaussian random variables.

LEMMA 2.10.– (About convergence of Gaussian variables) Let $\gamma_n \sim N(0, \sigma_n^2)$, with $\sigma_n \ge 0$, $n \ge 1$, and $\gamma_n \to \gamma$ a.s., $\sigma_n^2 \to \sigma^2 < \infty$ as $n \to \infty$. Then $\gamma \sim N(0, \sigma^2)$.

PROOF.– The characteristic function
$$\varphi_{\gamma_n}(x) = \exp\Big\{-\frac{\sigma_n^2 x^2}{2}\Big\} \to \exp\Big\{-\frac{\sigma^2 x^2}{2}\Big\} \quad \text{as } n \to \infty.$$
On the other hand, by the Lebesgue dominated convergence theorem:
$$\varphi_{\gamma_n}(x) = E\, e^{ix\gamma_n} \to E\, e^{ix\gamma} = \varphi_{\gamma}(x) \quad \text{as } n \to \infty.$$
Thus, $\varphi_{\gamma}(x) = \exp\{-\frac{\sigma^2 x^2}{2}\}$, and $\gamma \sim N(0, \sigma^2)$. $\square$
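Lemma 2.10 is easy to illustrate by simulation (a sketch; the particular sequence $\sigma_n = 2 + 1/n$ is an arbitrary choice). Taking $\gamma_n = \sigma_n Z$ with a single standard normal $Z$ gives $\gamma_n \to 2Z$ a.s., and the empirical variance of the limit approaches $\sigma^2 = 4$:

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.standard_normal(100_000)       # one source of randomness => a.s. convergence
sigma = 2.0

# gamma_n = (sigma + 1/n) * z ~ N(0, (sigma + 1/n)^2), converging pathwise to sigma*z
gammas = [(sigma + 1.0 / n) * z for n in (1, 10, 100)]

variances = [float(g.var()) for g in gammas]
print(variances)   # decreasing toward sigma^2 = 4, as lemma 2.10 predicts
```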
For a probability measure on $\mathcal{B}(H)$, where $H$ is a real Hilbert space, the mean value and the correlation operator are defined similarly to the case $H = \mathbb{R}^n$ (see section 1.4).

DEFINITION 2.9.– Let $\xi$ be a random element in $H$ and $\mu$ be a probability measure on $\mathcal{B}(H)$. A vector $m \in H$ is called the expectation of $\xi$ if
$$(m, h) = E(\xi, h), \quad \text{for all } h \in H,$$
and $m$ is called the mean value of $\mu$ if
$$(m, h) = \int_H (x, h)\, d\mu(x) \quad \text{for all } h \in H.$$
We denote $m = E\xi$ and $m = m_{\mu}$, respectively.

DEFINITION 2.10.– Consider a linear bounded operator $S$ in $H$. It is called the correlation operator of $\xi$ if
$$(Sh_1, h_2) = E(\xi - m, h_1)(\xi - m, h_2), \quad \text{for all } h_1, h_2 \in H,$$
where $m = E\xi$; and $S$ is called the correlation operator of $\mu$ if
$$(Sh_1, h_2) = \int_H (x - m, h_1)(x - m, h_2)\, d\mu(x), \quad \text{for all } h_1, h_2 \in H,$$
where $m = m_{\mu}$.

REMARK 2.1.– The change of variables formula implies the following: for a random element $\xi$ in $H$ and its distribution $\mu_{\xi}$, the mean value of $\mu_{\xi}$ coincides with $E\xi$, and the correlation operators of $\mu_{\xi}$ and $\xi$ coincide as well.

THEOREM 2.5.– (Construction of Gaussian measure in $l_2$) Let $a = (a_n)_{1}^{\infty}$ be a sequence of positive numbers such that $\sum_{n=1}^{\infty} a_n < \infty$. Then $g$ defined in [2.18] is a Gaussian measure in $l_2$, with zero mean and diagonal correlation operator $S$:
$$Sx = (a_n x_n)_{1}^{\infty}, \quad x \in l_2. \quad [2.19]$$

PROOF.– a) Construct a random element in $l_2$ with distribution $g$. Let $\{\gamma_n\}$ be Gaussian white noise (see definition 2.6) and $\gamma = (\gamma_n)_{1}^{\infty}$ be the corresponding random element in $\mathbb{R}^{\infty}$. Then its distribution $\mu_{\gamma}$ coincides with $\mu$, the standard Gaussian measure in $\mathbb{R}^{\infty}$. Since $\mu(l_{2,a}) = 1$, it holds $P\{\gamma \in l_{2,a}\} = 1$. Without loss of generality, we may and do assume that $\gamma$ acts from $\Omega$ to $l_{2,a}$ (otherwise one can remove from $\Omega$ a subset $\Omega_0 = \{\omega : \gamma(\omega) \in \mathbb{R}^{\infty} \setminus l_{2,a}\}$, with $P(\Omega_0) = 0$). Since $\mathcal{B}(l_{2,a}) \subset \mathcal{B}(\mathbb{R}^{\infty})$, the mapping $\gamma : \Omega \to l_{2,a}$ is $(\mathcal{F}, \mathcal{B}(l_{2,a}))$-measurable, i.e. $\gamma$ is a random element in $l_{2,a}$. Consider its distribution $\mu_{\gamma,a}$ in $l_{2,a}$. For $A \in \mathcal{B}(l_{2,a})$, we have $\mu_{\gamma,a}(A) = P\{\gamma \in A\} = \mu(A) = \mu_a(A)$ (remember that $\mu_a$ is the restriction of $\mu$ to $\mathcal{B}(l_{2,a})$). Therefore, $\mu_{\gamma,a} = \mu_a$. Next, $J\gamma = (\sqrt{a_n}\, \gamma_n)_{1}^{\infty}$ is a random element in $l_2$, with distribution
$$\mu_{J\gamma}(B) = P\{J\gamma \in B\} = P\{\gamma \in J^{-1}B\} = \mu_a(J^{-1}B) = (\mu_a J^{-1})(B), \quad B \in \mathcal{B}(l_2).$$
Thus (see [2.18]), $J\gamma$ has distribution $g$ in $l_2$.
b) Prove that $J\gamma$ is a Gaussian random element in the Hilbert space $l_2$, and find the expectation of $J\gamma$ and its correlation operator. Take any $h \in l_2$. We have for $\omega \in \Omega$ that
$$(J\gamma, h) = \lim_{N \to \infty} \eta_N, \quad \eta_N := \sum_{n=1}^{N} \sqrt{a_n}\, \gamma_n h_n \sim N(0, \sigma_N^2), \quad \sigma_N^2 = \sum_{n=1}^{N} a_n h_n^2.$$
Since $\{a_n\}$ is bounded and $h \in l_2$,
$$\sigma_N^2 \to \sum_{n=1}^{\infty} a_n h_n^2 = \|h\|_a^2 < \infty \quad \text{as } N \to \infty.$$
By lemma 2.10, the limiting r.v. is $(J\gamma, h) \sim N(0, \|h\|_a^2)$. Therefore, $J\gamma$ is a Gaussian random element with zero mean. Take another vector $u \in l_2$. The symmetric bilinear form $B(h, u) = (J\gamma, h)(J\gamma, u)$ can be expressed through the quadratic form as follows:
$$(J\gamma, h)(J\gamma, u) = \frac{1}{2}\big[(J\gamma, h + u)^2 - (J\gamma, h)^2 - (J\gamma, u)^2\big];$$
$$E(J\gamma, h)(J\gamma, u) = \frac{1}{2}\big[\|h + u\|_a^2 - \|h\|_a^2 - \|u\|_a^2\big] = (h, u)_a = \sum_{n=1}^{\infty} a_n h_n u_n.$$
Consider the diagonal operator $S$ defined in [2.19]. It is a linear bounded operator in $l_2$ because the sequence $\{a_n\}$ is bounded. For any $h, u \in l_2$, it holds
$$(Sh, u) = \sum_{n=1}^{\infty} a_n h_n u_n = E(J\gamma, h)(J\gamma, u).$$
Therefore, $S$ is the correlation operator of $J\gamma$. By part (a) of the proof, $g$ is the distribution of $J\gamma$. Then $g$ is a Gaussian measure, with zero mean and correlation operator $S$. $\square$

Look at the constructed Gaussian measure $g$ from another point of view. Let $\{\tilde{\gamma}_n, n \ge 1\}$ be a sequence of independent standard normal random variables and $\eta = (\eta_n)_{1}^{\infty} = (\sqrt{a_n}\, \tilde{\gamma}_n)_{1}^{\infty}$ be the corresponding random element in $\mathbb{R}^{\infty}$, where $a_n > 0$, $n \ge 1$, and $\sum_{n=1}^{\infty} a_n < \infty$.
The distribution of $\eta$ in $\mathbb{R}^{\infty}$ is the product of the marginal distributions $g_n = \mu_{\eta_n}$, where $g_n$ is a Gaussian measure on $\mathbb{R}$ with zero mean and variance $a_n$:
$$\mu_{\eta} = \bigotimes_{n=1}^{\infty} g_n \quad \text{in } \mathbb{R}^{\infty}.$$
The measure $\mu_{\eta}$ is concentrated on $l_2$: $\mu_{\eta}(l_2) = 1$, and the restriction of $\mu_{\eta}$ to $\mathcal{B}(l_2) \subset \mathcal{B}(\mathbb{R}^{\infty})$ is just the measure $g$ from [2.18]. In this situation, we say that $g$ is a product measure in the Hilbert space $l_2$, and we write:
$$g = \bigotimes_{n=1}^{\infty} g_n \quad \text{in } l_2. \quad [2.20]$$
In subsequent chapters, we will show that any Gaussian measure $G$ in a separable infinite-dimensional Hilbert space $H$ has the form of a product measure in $H$. This means the following. There exists an orthobasis $\{e_n\}$ in $H$ such that, after identification of $H$ with the sequence space $l_2$ of Fourier coefficients $(c_n = (c, e_n))_{1}^{\infty}$, $c \in H$, the measure $G$ becomes a product measure in $l_2$ like [2.20], with Gaussian marginal distributions $G_n$: $G = \bigotimes_{n=1}^{\infty} G_n$ in $l_2$. Moreover, we will prove that $G$ has a certain mean value $m_G$; if $m_G = 0$, then all $G_n$ have zero mean as well; otherwise, some of the mean values $m_{G_n}$ are not equal to zero.

Problems 2.4

14) Let $g$ be the Gaussian measure defined in [2.18] and $b = (b_n)_{1}^{\infty}$ be a sequence of positive numbers. Prove that
$$g(l_{2,b} \cap l_2) = \begin{cases} 1, & \text{if } \sum_{n=1}^{\infty} a_n b_n < \infty, \\ 0, & \text{otherwise}. \end{cases}$$

15) Let $g$ be the Gaussian measure defined in [2.18] and $t = (t_n)_{1}^{\infty} \in l_{2,a}$. Introduce a function $f_t : l_2 \to \mathbb{R}$,
$$f_t(x) = \begin{cases} \sum_{n=1}^{\infty} t_n x_n, & \text{if the series converges}, \\ 0, & \text{otherwise}. \end{cases}$$
Prove that $f_t$ is a Borel function and, if it is considered as a r.v. on the probability space $(l_2, \mathcal{B}(l_2), g)$, it has Gaussian distribution $N(0, \|t\|_a^2)$.
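The product-measure representation [2.20] is convenient for simulation. The sketch below (dimension truncated to $N = 200$ and weights $a_n = 1/n^2$ — both arbitrary illustrative choices) draws samples of $\eta = (\sqrt{a_n}\,\tilde{\gamma}_n)$ and checks that the empirical coordinate variances match the diagonal of the correlation operator $S$ and that $E\|\eta\|^2 \approx \mathrm{tr}\, S = \sum_n a_n$:

```python
import numpy as np

rng = np.random.default_rng(2)
N, n_samples = 200, 20_000
a = 1.0 / np.arange(1, N + 1) ** 2                 # a_n > 0, sum a_n < infinity

# Truncated sample from g: independent coordinates, eta_n ~ N(0, a_n)
eta = np.sqrt(a) * rng.standard_normal((n_samples, N))

coord_var = eta.var(axis=0)                        # should be close to a_n
mean_sq_norm = float((eta ** 2).sum(axis=1).mean())  # should be close to sum a_n
print(coord_var[:3], mean_sq_norm, a.sum())
```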
3 Borel Measures in Hilbert Space
In this chapter, we systematically study measures on a Hilbert space $H$. We will start with nuclear operators in $H$ and then proceed with the weak and strong integrals of a function valued in $H$ or, more generally, in a Banach space. The correlation operator of a measure $\mu$ on $\mathcal{B}(H)$ with $\int_H \|x\|^2\, d\mu(x) < \infty$ is a nuclear operator; the mean value $m_{\mu}$ of the measure $\mu$ is usually defined as the weak integral
$$m_{\mu} = \int_H x\, d\mu(x), \quad [3.1]$$
but if $\int_H \|x\|\, d\mu(x) < \infty$, then the right-hand side of [3.1] can be understood as the strong integral.

3.1. Classes of operators in H

In this section, $H$ is a separable infinite-dimensional real Hilbert space. Remember that a linear operator $A$ in $H$ is called a compact operator if for each bounded set $M \subset H$, its image $A(M)$ is relatively compact in $H$. The class of all compact operators in $H$ is denoted by $S_{\infty}(H)$. It holds $S_{\infty}(H) \subset L(H)$, where $L(H)$ is the Banach space of all linear bounded operators in $H$, with operator norm
$$\|B\| = \sup_{x \in H,\, x \ne 0} \frac{\|Bx\|}{\|x\|}.$$
The set $S_{\infty}(H)$ is a subspace of $L(H)$, i.e. $S_{\infty}(H)$ is a linear and closed subset of $L(H)$.
Gaussian Measures in Hilbert Space: Construction and Properties, First Edition. Alexander Kukush. © ISTE Ltd 2019. Published by ISTE Ltd and John Wiley & Sons, Inc.
3.1.1. Hilbert–Schmidt operators

Such operators are studied in a standard university course on functional analysis. Here, we briefly overview the properties of Hilbert–Schmidt operators and omit most of the proofs.

DEFINITION 3.1.– Let $A \in L(H)$. It is called a Hilbert–Schmidt operator if there exists an orthobasis $\{e_n, n \ge 1\}$ in $H$ with $\sum_{n=1}^{\infty} \|Ae_n\|^2 < \infty$. The linear set of all such operators is denoted by $S_2(H)$.

For $A \in L(H)$, the latter sum does not depend on the choice of orthobasis.

DEFINITION 3.2.– For $A \in L(H)$ and an orthobasis $\{e_n, n \ge 1\}$, the expression
$$\|A\|_2 := \Big(\sum_{n=1}^{\infty} \|Ae_n\|^2\Big)^{1/2}$$
is called the Hilbert–Schmidt norm of $A$. This norm is finite if, and only if, $A \in S_2(H)$.

Using matrix entries $a_{ij} = (Ae_j, e_i)$, $i, j = 1, 2, \dots$, one can express the Hilbert–Schmidt norm as follows:
$$\|A\|_2 = \Big(\sum_{i,j=1}^{\infty} a_{ij}^2\Big)^{1/2}.$$

EXAMPLE 3.1.– (Diagonal operator) Consider a bounded real sequence $\{a_n, n \ge 1\}$ and the operator $A$ in $l_2$,
$$Ax = (a_1 x_1, \dots, a_n x_n, \dots), \quad x \in l_2.$$
It is the so-called diagonal operator, denoted by $A = \mathrm{diag}(a_n, n \ge 1)$. For the standard basis $\{e_n\}$ in $l_2$, the matrix entries are
$$a_{ij} = (Ae_j, e_i) = (a_j e_j, e_i) = a_j \delta_{ij}, \quad i, j = 1, 2, \dots$$
The Hilbert–Schmidt norm of $A$ is $\|A\|_2 = (\sum_{j=1}^{\infty} a_j^2)^{1/2}$. Thus, $A \in S_2(H)$ if, and only if, $\sum_{j=1}^{\infty} a_j^2 < \infty$.

EXAMPLE 3.2.– (Integral Hilbert–Schmidt operator) Let $K = K(t, s) \in L_2([a, b]^2)$. The integral operator
$$(Ax)(t) = \int_a^b K(t, s)\, x(s)\, ds, \quad x \in L_2[a, b], \quad a \le t \le b,$$
acts in the Hilbert space $H = L_2[a, b]$. It holds $A \in S_2(H)$ and
$$\|A\|_2 = \|K\|_{L_2} = \sqrt{\int_a^b \int_a^b |K(t, s)|^2\, dt\, ds}.$$

EXAMPLE 3.3.– (Identity operator) In a separable infinite-dimensional Hilbert space $H$, consider the identity operator $Ix = x$, $x \in H$. For an orthobasis $\{e_n\}$, we have
$$\|I\|_2 = \Big(\sum_{n=1}^{\infty} \|Ie_n\|^2\Big)^{1/2} = \Big(\sum_{n=1}^{\infty} 1\Big)^{1/2} = \infty.$$
Thus, $I \in L(H) \setminus S_2(H)$, and the inclusion $S_2(H) \subset L(H)$ is strict.

Now, we introduce an inner product in $S_2(H)$. For an orthobasis $\{e_n\}$ in $H$, we set
$$(A, B)_2 = \sum_{j=1}^{\infty} (Ae_j, Be_j), \quad A, B \in S_2(H). \quad [3.2]$$
LEMMA 3.1.– (About inner product in $S_2(H)$) The series in [3.2] converges and its sum does not depend on the choice of orthobasis. Moreover, relation [3.2] defines an inner product in $S_2(H)$, and $S_2(H)$ with this product is a separable Hilbert space. The corresponding norm in $S_2(H)$ is just the Hilbert–Schmidt norm.

PROOF.– a) The functional $f_j(A, B) := (Ae_j, Be_j)$ is a symmetric bilinear form on $S_2(H)$. Therefore,
$$f_j(A, B) = \frac{f_j(A + B, A + B) - f_j(A - B, A - B)}{4},$$
$$\sum_{j=1}^{\infty} (Ae_j, Be_j) = \frac{1}{4}\Big[\sum_{j=1}^{\infty} f_j(A + B, A + B) - \sum_{j=1}^{\infty} f_j(A - B, A - B)\Big] = \frac{1}{4}\big[\|A + B\|_2^2 - \|A - B\|_2^2\big].$$
Here $A + B, A - B \in S_2(H)$ because $S_2(H)$ is a linear set in $L(H)$; thus, $\|A + B\|_2 < \infty$ and $\|A - B\|_2 < \infty$. We conclude that the right-hand side of [3.2] is finite and does not depend on the choice of orthobasis.
b) On the sigma-algebra $2^{\mathbb{N} \times \mathbb{N}}$ of all subsets of $\mathbb{N} \times \mathbb{N}$, consider the counting measure
$$\mu(C) = |C|, \quad C \subset \mathbb{N} \times \mathbb{N}.$$
Fix an orthobasis $\{e_n\}$ in $H$. For $A \in S_2(H)$, we use its matrix entries $a_{ij} = (Ae_j, e_i)$, $i, j \ge 1$. It holds
$$\int_{\mathbb{N} \times \mathbb{N}} a_{ij}^2\, d\mu(i, j) = \sum_{i,j=1}^{\infty} a_{ij}^2 = \|A\|_2^2 < \infty.$$
Therefore, $a_{ij}$ as a function of the couple $(i, j) \in \mathbb{N} \times \mathbb{N}$ belongs to the space $L_2(\mathbb{N} \times \mathbb{N}, \mu)$. The mapping $J : S_2(H) \to L_2(\mathbb{N} \times \mathbb{N}, \mu)$, $J(A) = (a_{ij})_{i,j=1}^{\infty}$, is a linear bijection, with
$$(J(A), J(B))_{L_2} = \sum_{i,j=1}^{\infty} a_{ij} b_{ij} = \sum_{j=1}^{\infty} (Ae_j, Be_j) = (A, B)_2, \quad A, B \in S_2(H). \quad [3.3]$$
Here, $b_{ij}$ are the matrix entries of $B$. Relation [3.3] implies that [3.2] is an inner product in $S_2(H)$, together with the inner product in $L_2(\mathbb{N} \times \mathbb{N}, \mu)$. Moreover, $J$ is an isometry between $S_2(H)$ and the separable Hilbert space $L_2(\mathbb{N} \times \mathbb{N}, \mu)$. Therefore, $S_2(H)$ is a separable Hilbert space as well.

c) The norm induced by the inner product [3.2] is
$$\|A\|_{S_2(H)} = \sqrt{(A, A)_2} = \Big(\sum_{j=1}^{\infty} (Ae_j, Ae_j)\Big)^{1/2} = \|A\|_2.$$
Thus, the induced norm is just the Hilbert–Schmidt norm. $\square$

We list the properties of Hilbert–Schmidt operators.

THEOREM 3.1.– The following statements hold true:

a) $\|A\| \le \|A\|_2$, for all $A \in S_2(H)$.

b) $\|A^*\|_2 = \|A\|_2$, for all $A \in L(H)$.

c) $\|AB\|_2 \le \|A\| \cdot \|B\|_2$, for all $A \in L(H)$, $B \in S_2(H)$; $\|AB\|_2 \le \|A\|_2 \cdot \|B\|$, for all $A \in S_2(H)$, $B \in L(H)$.

d) If $A \in S_2(H)$, then $A \in S_{\infty}(H)$.
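In a finite-dimensional truncation, the inequalities of theorem 3.1 can be checked numerically. The sketch below (random $40 \times 40$ matrices; the operator norm is the largest singular value, the Hilbert–Schmidt norm is the Frobenius norm) verifies (a)–(c) on a sample:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

op = lambda M: np.linalg.norm(M, 2)       # operator norm = largest singular value
hs = lambda M: np.linalg.norm(M, "fro")   # Hilbert-Schmidt (Frobenius) norm

assert op(A) <= hs(A) + 1e-9                      # (a) ||A|| <= ||A||_2
assert abs(hs(A.T) - hs(A)) < 1e-9                # (b) ||A*||_2 = ||A||_2
assert hs(A @ B) <= op(A) * hs(B) + 1e-9          # (c) ||AB||_2 <= ||A|| * ||B||_2
print("theorem 3.1 (a)-(c) hold on this sample")
```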
Now, we comment on theorem 3.1. Statement (a) implies that the natural embedding $J : S_2(H) \to L(H)$, $J(A) = A$, is a continuous operator. Statement (b) implies the following: a linear bounded operator is a Hilbert–Schmidt one if, and only if, its adjoint operator is a Hilbert–Schmidt operator. The inequalities in (c) yield the following: if $A \in S_2(H)$ and $B \in L(H)$, then $AB, BA \in S_2(H)$. This means that $S_2(H)$ is a two-sided ideal in the operator algebra $L(H)$. (Remember that $L(H)$ is a normed algebra w.r.t. linear operations and multiplication of operators.) In (d), we have the inclusion $S_2(H) \subset S_{\infty}(H)$, i.e. the Hilbert space $S_2(H)$ is a linear subset of the Banach space $S_{\infty}(H)$.

3.1.2. Polar decomposition

Remember that a linear bounded operator $A$ in a Hilbert space is called positive if its quadratic form $(Ax, x)$ is non-negative. This is denoted by $A \ge 0$. If the underlying space is complex and $A \ge 0$, then $A$ is self-adjoint. In a real space, this is not true. Now, we derive the so-called polar decomposition of an operator in $H$. It is an analogue of the polar decomposition $z = e^{i\varphi}|z|$, $z \in \mathbb{C}$.

THEOREM 3.2.– (About polar decomposition) Each compact operator $A$ in $H$ can be decomposed as $A = UT$, where $T$ is a self-adjoint positive compact operator and $U \in L(H)$; $U$ performs an isometry of $R(T)$ into $H$ (i.e. $\|Ux\| = \|x\|$, $x \in R(T)$).

Before the proof, we consider an example of such a decomposition.

EXAMPLE 3.4.– (Polar decomposition of a compact diagonal operator) Let $\{a_n\}$ be a real sequence that converges to zero and $A = \mathrm{diag}(a_n, n \ge 1)$ in the Hilbert space $l_2$ (for the notation of the diagonal operator, see example 3.1). Then $A \in S_{\infty}(l_2)$. We put $T = \mathrm{diag}(|a_n|, n \ge 1)$, $U = \mathrm{diag}(\mathrm{sign}\, a_n, n \ge 1)$. Since $a_n = (\mathrm{sign}\, a_n)|a_n|$, it holds $A = UT$; $T$ is self-adjoint; it is positive because $(Tx, x) = \sum_{n=1}^{\infty} |a_n| x_n^2 \ge 0$, $x \in l_2$; and $T$ is a compact operator because $|a_n| \to 0$ as $n \to \infty$. Further, $U \in L(l_2)$ and $\|UTx\|^2 = \|Ax\|^2 = \sum_{n=1}^{\infty} a_n^2 x_n^2 = \|Tx\|^2$, $x \in l_2$; therefore, $U$ performs an isometry of $R(T)$ into $l_2$.
Thus, we constructed the desired decomposition of the compact diagonal operator $A$.
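For a finite-dimensional (hence compact) operator, the polar decomposition of theorem 3.2 can be computed from the singular value decomposition: if $A = V\Sigma W^*$, then $|A| = W^*{}^{\top}\Sigma W^* \cdot$ — more precisely, $T = (A^*A)^{1/2}$ and $U = VW^*$. A sketch with NumPy (random matrix, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))

V, s, Wt = np.linalg.svd(A)        # A = V @ diag(s) @ Wt
T = Wt.T @ np.diag(s) @ Wt         # T = |A| = (A^T A)^{1/2}: self-adjoint, positive
U = V @ Wt                         # here U is orthogonal (A is invertible a.s.)

assert np.allclose(U @ T, A)                          # A = U T
assert np.allclose(T, T.T)                            # T self-adjoint
assert np.all(np.linalg.eigvalsh(T) >= -1e-9)         # T positive
assert np.allclose(U.T @ U, np.eye(5))                # U isometric on R(T)
print("polar decomposition recovered via SVD")
```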
PROOF OF THEOREM 3.2.– a) Construction of $T$. Let $B = A^*A$. It is a compact operator as a product of $A^* \in L(H)$ and $A \in S_{\infty}(H)$. Moreover, $(Bx, x) = (Ax, Ax) \ge 0$, and $B \ge 0$. Then by the Hilbert–Schmidt theorem,
$$Bx = \sum_{n=1}^{\infty} \lambda_n (x, e_n) e_n, \quad x \in H,$$
where $\{e_n\}$ is the eigenbasis of $B$, $\{\lambda_n \ge 0, n \ge 1\}$ are the eigenvalues with $\lambda_n \to 0$ as $n \to \infty$, and the series converges in $H$. We set
$$Tx = \sum_{n=1}^{\infty} \sqrt{\lambda_n}\, (x, e_n) e_n, \quad x \in H. \quad [3.4]$$
This is a self-adjoint positive operator such that $T^2 = B$. It is called the square root of $B$ and denoted by $T = \sqrt{B} = B^{1/2}$. The operator $T$ is compact because $\sqrt{\lambda_n} \to 0$ as $n \to \infty$ (actually $T$ is a diagonal operator in the basis $\{e_n\}$; the matrix that represents $T$ in this basis is diagonal).

b) Construction of $U$. First we define $U$ on the linear set $R(T)$. We put $U(Tx) = Ax$, $x \in H$. The operator $U : R(T) \to H$ is well-defined. Indeed, let $Tx = Ty$; then $Tz = 0$ with $z = x - y$, and
$$0 = \|Tz\|^2 = \sum_{n=1}^{\infty} \lambda_n (z, e_n)^2 = (Bz, z) = \|Az\|^2 \;\Rightarrow\; Az = 0, \; Ax = Ay.$$
Next, $\|U(Tx)\| = \|Ax\| = \|Tx\|$ by the computations made above, and $U$ performs an isometry of $R(T)$ into $H$. Finally, $U$ can be extended by continuity to an isometry from $\overline{R(T)}$ into $H$, and then it can be further extended to a linear bounded operator in $H$ by letting $Ux = 0$, $x \in (\overline{R(T)})^{\perp} = \mathrm{Ker}\, T$. $\square$

DEFINITION 3.3.– The operator $T = (A^*A)^{1/2}$ defined in [3.4] is called the modulus of a compact operator $A$ and denoted by $|A|$. The equality $A = U|A|$, where $U \in L(H)$, $U$ performs an isometry of $\overline{R(T)}$ into $H$ and $U$ vanishes on $\mathrm{Ker}\, T$, is called the polar decomposition of $A$. In example 3.4, $T = |A|$ and the equality $A = UT$ is the polar decomposition of the diagonal operator $A$.
R EMARK 3.1.– Theorem 3.2 holds true for arbitrary separable Hilbert space (it can be finite-dimensional or complex; the proof can be modified with minor changes). R EMARK 3.2.– (Bound for norm of U ) Consider polar decomposition of A ∈ S∞ (H). If A is zero operator in H, then |A| = U = 0 and ||U || = 0. If A = 0, then ||U x|| ||U || = sup : x ∈ R(T ), x = 0 = 1. ||x|| In all the cases, ||U || ≤ 1. 3.1.3. Nuclear operators Remember that H is a separable real infinite-dimensional Hilbert space. D EFINITION 3.4.– For A ∈ S∞ (H), positive eigenvalues αn , n 1 |A| = (A∗ A) 2 are called singular values of A.
≥
1 of
D EFINITION 3.5.– A compact operator A in H is called nuclear if n≥1 αn < ∞, where αn are singular values of A (counted with multiplicity). Class of all nuclear operators in H is denoted as S1 (H). D EFINITION 3.6.– For A ∈ S1 (H), the trace of A is defined as tr A =
∞
(Aen , en ),
[3.5]
1
where {en } is arbitrary orthobasis in H. L EMMA 3.2.– (About trace) For A ∈ S1 (H), the series in [3.5] converges absolutely and its sum does not depend on the choice of orthobasis. P ROOF.– Consider polar decomposition A = U T of the compact operator A. Let {en } be eigenbasis of T and T en = αn en , n ≥ 1. Then Aen = U (αn en ) = αn U en , n ≥ 1. a) We have |(Aen , en )| ≤ αn · U ≤ αn , ∞ and the majorizing series 1 αn converges. Hence the series [3.5] converges absolutely, and therefore, converges.
58
Gaussian Measures in Hilbert Space
b) Take another orthobasis {fn }. It holds ∞
(Aen , en ) =
n=1
∞ ∞
∞ ∞
(Aen , fk )(en , fk ) =
n=1 k=1
αn (U en , fk )(en , fk ).
n=1 k=1
We want to change the order of summation. In order to ground this, bound the double sum: ∞ ∞
αn |(U en , fk )| · |(en , fk )| ≤
n=1 k=1
×(
∞
∞ n=1
1
|(en , fk )|2 ) 2 =
∞
αn (
∞ k=1
αn U en · en ≤
n=1
k=1
1
|(U en , fk )|2 ) 2 ×
∞
αn < ∞.
n=1
Therefore, ∞
(Aen , en ) =
n=1
∞ ∞
(Aen , fk )(en , fk ) =
k=1 n=1
= (fk , A∗ fk ) = (Afk , fk ). k
∞ ∞
(en , A∗ fk )(en , fk ) =
k=1 n=1
k
The latter series converges absolutely because its sum remains unchanged after permutation of vectors in the basis {fk }. L(Rn ) and {ek , k = 1, . . . , n} be an orthobasis in Rn . Then R EMARK n3.3.– Let A ∈ n trA = 1 (Aek , ek ) = 1 akk , where akj are matrix entries of A in the basis {ek }. Thus, in the finite-dimensional case the trace of operator coincides with the trace of matrix that represents the operator. D EFINITION 3.7.– Let A ∈ S∞ (H) and {αn , n ≥ 1} be singular values of A counted with multiplicity. The sum A 1 = αn n≥1
is called nuclear norm of A. We see that a compact operator A is nuclear if, and only if, A 1 < ∞. L EMMA 3.3.– (About triangle inequality) Let A, B ∈ S1 (H), then A + B ∈ S1 (H) and A + B 1 ≤ A 1 + B 1 .
Borel Measures in Hilbert Space
59
P ROOF.– The operator A + B is compact as a sum of two compact operators. For arbitrary compact operator T , we arrange eigenvalues of |T | (counted with multiplicity) in descending order: α1 (T ) ≥ α2 (T ) ≥ α3 (T ) ≥ . . . For each N ≥ 1, it holds (see lemma 4.2 in [GOK 88]): N
αn (A + B) ≤
1
N
αn (A) +
N
1
αn (B).
1
Tending N → ∞, we obtain A + B 1 =
∞
αn (A + B) ≤
1
∞
αn (A) +
1
∞
αn (B) = A 1 + B 1 < ∞.
1
Thus, A + B ∈ S1 (H) and the desired inequality follows.
In view of lemma 3.3, it is evident that S1 (H) is a linear normed space with nuclear norm. Moreover, it is Banach space (see problem 6). Now, we consider some properties of nuclear operators. T HEOREM 3.3.– (Operator norm is dominated by nuclear one) If A ∈ S1 (H), then A ≤ A 1 . P ROOF.– We use polar decomposition A = U T , eigenbasis {en } of T , and the corresponding eigenvalues αn . For x ∈ H, Ax = U T x = T x =
∞
αn (x, en )en ≤
1
≤ x ·
∞
∞
αn |(x, en )| ≤
1
αn = A 1 · x ,
1
and the desired follows.
We interpret theorem 3.3 as follows. Consider canonical embedding π1 : S1 (H) → L(H), π1 A = A. Then π1 A L(H) ≤ A 1 , and the embedding is continuous, moreover, π1 ≤ 1. T HEOREM 3.4.– Let A, B ∈ S2 (H). Then AB ∈ S1 (H) and AB 1 ≤ A 2 · B 2 .
60
Gaussian Measures in Hilbert Space
P ROOF.– a) AB is a compact operator as a product of compact and continuous operators. We use polar decomposition AB = U T , singular values sn = sn (AB) and the corresponding orthonormal system {en , n ≥ 1} of eigenvalues of T . It holds sn = (T en , en ) = (U T en , U en ) = (ABen , U en ) = (Ben , A∗ U en ). Vectors {U en , n ≥ 1} form an orthonormal system (because U performs an isometry on R(T ) and en ∈ R(T ), n ≥ 1), and we complement it to a basis {fk }. Then
A∗ U en 2 ≤
n≥1
∞
A∗ fk 2 = A∗ 22 = A 22 .
k=1
b) Now, we bound the nuclear norm of AB: sn = (Ben , A∗ U en ) ≤ AB 1 = ≤
*
n≥1
Ben 2 ·
n≥1
*
n≥1
A∗ U en 2 ≤ B 2 · A 2 < ∞.
n≥1
Thus, AB ∈ S1 (H) and the desired inequality follows.
R EMARK 3.4.– Theorem 3.4 implies the following: if A ∈ S2 (H), then A2 ∈ S1 (H). That is why Hilbert–Schmidt operators are called sometimes quasinuclear ones. R EMARK 3.5.– Theorem 3.4 can be interpreted as follows. Consider bilinear operator Φ : S2 (H) × S2 (H) → S1 (H),
Φ(A, B) = AB.
Then its norm can be written as Φ =
sup
A2 =B2 =1
Φ(A, B) 1 .
And theorem 3.4 states that Φ ≤ 1. T HEOREM 3.5.– (Hilbert–Schmidt norm is dominated by nuclear one) If A ∈ S1 (H), then A ∈ S2 (H) and A 2 ≤ A 1 . P ROOF.– We use polar decomposition A = U T , eigenvalues {en } of T and the corresponding eigenvalues αn . We have 2 ∞ ∞ ∞ ∞ 2 2 2 2 = A 21 < ∞. αn ≤ αn Aen = T en = A 2 = 1
1
1
1
Borel Measures in Hilbert Space
61
Theorem 3.5 means that for canonical embedding π12 : S1 (H) → S2 (H), π12 A = A, it holds π12 ≤ 1. T HEOREM 3.6.– If A ∈ S1 (H), then A∗ ∈ S1 (H) and A∗ 1 = A 1 . P ROOF.– We use polar decomposition 1
1
A = U T = (U T 2 )T 2 . Consider eigenbasis {en } of T , T en = αn en , n ≥ 1. We have 1
T 2 22 =
∞
1
T 2 en 2 =
1
∞
αn = A 1 < ∞.
1
1
1
Hence T 2 ∈ S2 (H), and by theorem 3.1 (c) it holds U T 2 ∈ S2 (H). 1
1
Now, A∗ = T 2 (U T 2 )∗ ∈ S1 (H) as a product of two Hilbert–Schmidt operators (see theorems 3.1(b) and 3.4). Next, 1
1
1
1
A∗ 1 ≤ T 2 2 · (U T 2 )∗ 2 = T 2 2 · U T 2 2 ≤ 1
1
≤ T 2 22 · U ≤ T 2 22 = A 1 . The inequality A∗ 1 ≤ A 1 holds for any A ∈ S1 (H). We substitute here A∗ instead of A, and since A∗∗ = A we get the inequality in opposite direction A 1 ≤ A∗ 1 . Thus, A∗ 1 = A 1 . The next statement is a complement to theorem 3.4. C OROLLARY 3.1.– Each A ∈ S1 (H) can be decomposed as a product of two Hilbert– Schmidt operators. 1
1
P ROOF.– The desired decomposition is A = (U T 2 )T 2 from the previous proof.
T HEOREM 3.7.– (Nuclear operators form an ideal in L(H) ) If A ∈ L(H) and B ∈ S1 (H), then both operators AB and BA are nuclear and AB 1 ≤ A · B 1 , BA 1 ≤ A · B 1 . P ROOF.– a) We start with polar decomposition B = U T , where T ∈ S1 (H). Then 1 T 2 22 = T 1 = B 1 . Now, 1
1
AB = (AU T 2 )T 2 ∈ S1 (H)
62
Gaussian Measures in Hilbert Space
as a product of two Hilbert–Schmidt operators. Hence 1
1
1
AB 1 ≤ AU T 2 2 · T 2 2 ≤ A · U · T 2 22 ≤ A · B 1 . b) Operator BA ∈ S∞ (H), and therefore, its nuclear norm is well-defined. We have by theorem 3.6 BA 1 = A∗ B ∗ 1 ≤ A∗ · B ∗ 1 = A · B 1 .
Theorem 3.7 means that S1 (H) is a two-sided ideal in Banach algebra L(H). Denote by S0 (H) the vector space of all finite-dimensional operators, i.e. such operators A ∈ L(H) that dim R(A) < ∞. Summarizing, we have a chain of linear spaces S0 (H) ⊂ S1 (H) ⊂ S2 (H) ⊂ S∞ (H) ⊂ L(H). Here all inclusions are strict; S1 (H) is Banach space with nuclear norm, S2 (H) is Hilbert space with Hilbert–Schmidt norm, and S∞ (H) and L(H) are Banach spaces with operator norm; S0 (H) is dense in three spaces S1 (H), S2 (H), S∞ (H) but not in L(H); canonical embeddings of S1 (H) in S2 (H) and S2 (H) in S∞ (H) are continuous. 3.1.4. S-operators D EFINITION 3.8.– A self-adjoint, positive, and nuclear operator in H is called S-operator. E XAMPLE 3.5.– (Diagonal S-operator) For a bounded real sequence {an , n ≥ 1}, consider the diagonal operator A in l2 , A = diag(an , n ≥ 1) (see example 3.1). It is always self-adjoint; A is positive ∞ if, and only if, the weights an are non-negative, and it is nuclear if, and only if, 1 |an | < ∞. Thus, ∞ ∞A is S-operator if, and only if, an ≥ 0, n ≥ 1, and 1 an < ∞. It holds trA = 1 an . Now, let B be arbitrary S-operator in a real separable infinite-dimensional Hilbert space H. Then B is a compact self-adjoint operator, and by Hilbert–Schmidt theorem there exists eigenbasis {en } of B, with Ben = αn en , n ≥ 1. Here αn → 0 as n → ∞, and αn ≥ 0, n ≥ 1 because B is a positive operator. Since B is nuclear, we have B 1 =
∞
αn < ∞.
1
The S-operator B has a form B=
∞ 1
αn P[en ] ,
[3.6]
Borel Measures in Hilbert Space
63
where P[en ] is projector on [en ] := span(en ) and the series converges in the sense of uniform operator convergence. In particular Bx =
∞
αn (x, en )en ,
x ∈ H,
1
where the series converges strongly (i.e. in the norm of H). ∞ And opposite statement holds true: if {en } is an orthobasis and αn ≥ 0, 1 αn < ∞, then the operator [3.6] is S-operator. Thus, we have a general description of Soperators. Of course, the diagonal S-operator from example 3.4 fits this description. T HEOREM 3.8.– Let T be a self-adjoint positive operator in H. Suppose that there exists an orthobasis {fk }, with ∞
(T fk , fk ) < ∞.
1
Then T is nuclear, and therefore, it is S-operator. P ROOF.– The bilinear form (T x, y), x, y ∈ H satisfies all the axioms of inner product except of a part of the first axiom (it can happen that (T x, x) = 0 for certain x = 0). It is the so-called pseudoscalar product. Then by the Cauchy–Schwartz inequality we have |(T fi , fk )|2 ≤ (T fi , fi )(T fk , fk ),
i, k ≥ 1.
Hence T 22 =
∞
|(T fi , fk )|2 ≤
∞
(T fi , fi ) ·
i=1
i,k=1
∞
(T fk , fk ) < ∞.
k=1
Therefore, T ∈ S2 (H) and T is a compact operator. Moreover, T is self-adjoint 1 and positive; hence there exists S = T 2 , which is self-adjoint, positive and compact as well (see proof of theorem 3.2). Then ∞ 1
(T fk , fk ) =
∞ 1
(S 2 fk , fk ) =
∞
Sfk 2 < ∞,
1
and S ∈ S2 (H). Hence T = S 2 ∈ S1 (H) (see remark 3.4).
The set LS (H) of all S-operators in H is a convex cone in Banach space S1 (H), i.e. LS (H) is closed under linear combinations with positive coefficients. Now, we check that it is a closed subset of S1 (H).
64
Gaussian Measures in Hilbert Space
L EMMA 3.4.– (S-operators form closed set) Let {Sn , n ≥ 1} be S-operators in H and Sn converges to S in nuclear norm. Then S is S-operator as well. P ROOF.– The operator S is nuclear, and by theorem 3.3 Sn − S ≤ Sn − S 1 → 0
as
n → ∞.
Hence Sn ⇒ S (i.e. Sn converges uniformly). Then Sn = Sn∗ ⇒ S ∗ , and S ∗ = S. Finally, the uniform convergence implies the weak operator convergence, and 0 ≤ (Sn x, x) → (Sx, x)
as
n → ∞, x ∈ H.
Thus, (Sx, x) ≥ 0, x ∈ H. Therefore, S is S-operator.
Finally, we give a criterion for nuclear convergence of S-operators. T HEOREM 3.9.– Let S and {Sn , n ≥ 1} be S-operators. For the convergence of Sn to S in nuclear norm, it is necessary that for each orthobasis {ei } and it is sufficient that there exists an orthobasis {ei }, such that (Sn ei , ej ) → (Sei , ej ), for all i = j, ∞
|(Sn ei , ei ) − (Sei , ei )| → 0 as n → ∞.
[3.7] [3.8]
i=1
P ROOF.– a) Necessity: Part 1 Assume that Sn converges to S in nuclear norm. Then Sn converges to S weakly (see proof of lemma 3.4). This implies [3.7], for any orthobasis {ei }. b) Auxiliary statement We prove that for any A ∈ S1 (H) and any orthobasis {fk }, ∞
|(Afk , fk )| ≤
1
αk = A 1 ,
[3.9]
k≥1
where αk = αk (A) are singular values of A counted with multiplicity. Indeed, consider polar decomposition A = U T and orthonormal system {ei }, with T ei = αi ei , i ≥ 1. Then Aei = αi gi , where {gi = U ei , i ≥ 1} is another orthonormal system. It holds ⎞ ⎛ αi (x, ei )gi , Ax = A ⎝ (x, ei )ei ⎠ = i≥1
i≥1
Borel Measures in Hilbert Space
(Afk , fk ) =
65
αi (fk , ei )(fk , gi ),
i≥1 ∞
|(Afk , fk )| ≤
1
i≥1
αi (
∞
1
|(fk , ei )|2 ) 2 (
k=1
∞
1
|(fk , gi )|2 ) 2 ≤
αi ,
i≥1
k=1
and [3.9] is proven. c) Necessity: Part 2 Come back to the operators Sn that converge to S in nuclear norm. Using [3.9], we get ∞
|(Sn ei , ei ) − (Sei , ei )| =
i=1
∞
|((Sn −S)ei , ei )| ≤ Sn −S 1 → 0 as n → ∞.
i=1
d) Sufficiency: Part 1 – properties of the projectors Now, we assume that [3.7] and [3.8] hold true for fixed orthobasis {ei }, and we want to show that Sn − S 1 → 0 as n → ∞. For arbitrary n, split H = Hn ⊕ H n , where Hn = span(e1 , . . . , en ). Let Pn and P be projectors on Hn and H n , respectively. The conditions imply that n
Pn Sk Pn ⇒ Pn SPn as k → ∞,
[3.10]
and for δ > 0, tr(P n Sk P n ) ≤
∞
|(Sk ei , ei ) − (Sei , ei )| +
i=1
∞
(Sei , ei ) < δ
i=n+1
if n ≥ nδ and k ≥ kδ , therefore, lim sup sup tr(P n Sk P n ) ≤ δ + lim sup n→∞ k≥1
n→∞
kδ
tr(P n Sk P n ) = δ;
k=1
hence sup tr(P n Sk P n ) → 0 as n → ∞. k≥1
∞ The latter means that the series i=1 (Sk ei , ei ) converges uniformly in k ≥ 1; we can formalize it as follows: for each k ≥ 1 and n ≥ 1, tr (P n Sk P n ) ≤ n and tr(P n SP n ) ≤ n with n → 0 as n → ∞.
66
Gaussian Measures in Hilbert Space
e) Sufficiency: Part 2 – construction of an S-operator which is close to S_k − S
Introduce an operator for k, n ≥ 1 and δ > 0:
\[ W = S_k - S + \frac{\delta}{n} P_n + \delta \big(P_n S_k P_n + P_n S P_n\big) + (1 + \delta^{-1})\big(P^n S_k P^n + P^n S P^n\big). \]
It is a self-adjoint and nuclear operator as a sum of such operators. We show that for k ≥ k_{nδ}, it is positive. We have
\[ (Wx, x) \ge \Big[ (P_n (S_k - S) P_n x, x) + \frac{\delta}{n}(P_n x, x) \Big] + \Big[ -2 (P^n S P_n x, x) + \delta (P_n S P_n x, x) + \delta^{-1} (P^n S P^n x, x) \Big] =: R_1 + R_2. \]
Hereafter, we compare self-adjoint operators A and B in the so-called Loewner order: A ≥ B means that A − B is a positive operator. Now, due to [3.10], for k ≥ k_{nδ} it holds
\[ \frac{\delta}{n} P_n \ge - P_n (S_k - S) P_n \;\Rightarrow\; R_1 \ge 0; \]
\[ 2\,|(P^n S P_n x, x)| = 2\,|(S^{1/2} P_n x,\, S^{1/2} P^n x)| \le 2\, \|S^{1/2} P_n x\| \cdot \|S^{1/2} P^n x\| \le \delta \|S^{1/2} P_n x\|^2 + \delta^{-1} \|S^{1/2} P^n x\|^2 = \delta (P_n S P_n x, x) + \delta^{-1} (P^n S P^n x, x), \]
and R_2 ≥ 0. Hence for all k ≥ k_{nδ} the operator W is an S-operator.

f) Sufficiency: Part 3 – final proof of the desired convergence
We bound the trace of W using the relation tr((δ/n) P_n) = δ:
\[ \operatorname{tr} W \le |\operatorname{tr}\big( P_n (S_k - S) P_n \big)| + \delta (1 + 2\varepsilon_1) + (3 + 2\delta^{-1})\, \varepsilon_n, \]
which can be made arbitrarily small by choosing δ, n, k in this order. Also, W = S_k − S + W_{knδ}, where W_{knδ} is an S-operator, with
\[ \operatorname{tr} W_{kn\delta} \le \delta (1 + 2\varepsilon_1) + 2 (1 + \delta^{-1})\, \varepsilon_n, \]
and this can be made arbitrarily small by choosing δ and n in this order. Finally, for δ = δ_0, n = n_0(δ_0) and k ≥ k_0(δ_0, n_0), both operators W and W_{knδ} are S-operators, and
\[ \|S_k - S\|_1 \le \|W\|_1 + \|W_{kn\delta}\|_1 = \operatorname{tr} W + \operatorname{tr} W_{kn\delta}, \]
which can be made arbitrarily small for all k ≥ k_0(δ_0, n_0) by choosing appropriate δ = δ_0 and n = n_0(δ_0). This proves the desired convergence S_k → S in S_1(H).

REMARK 3.6.– In view of the proof of theorem 3.8, relations [3.7] and [3.8] can be formulated as follows: (S_n e_i, e_j) → (S e_i, e_j), for all i, j ≥ 1, and Σ_{i=1}^∞ (S_n e_i, e_i) converges uniformly in n ≥ 1.

Problems 3.1

1) Find the norm of canonical embedding J : S_2(H) → L(H).

2) Consider ℂ as a complex Hilbert space, with inner product (u, v) = u·v̄. For z ∈ ℂ\{0}, introduce a linear operator A_z w = zw, w ∈ ℂ. Find polar decomposition of A_z.

3) Let H be a separable Hilbert space and y, z ∈ H\{0}. Find polar decomposition of the one-dimensional operator A, with Ax = (x, y)z, x ∈ H.

4) Let A ∈ S_∞(H), where H is an infinite-dimensional separable Hilbert space, and let {α_n, n ≥ 1} be singular values of A. Prove the following:
a) A ∈ S_0(H) if, and only if, the number of singular values α_n is finite;
b) A ∈ S_2(H) if, and only if, Σ_{n≥1} α_n² < ∞ (here singular values are counted with multiplicity).

5) Problem 567 in [KIR 82] implies that each positive self-adjoint operator A in a Hilbert space can be decomposed as A = B², where B is a positive self-adjoint operator as well, and B is unique (B is called square root of A and denoted as B = √A = A^{1/2}). Based on this fact, prove the following generalization of theorem 3.2: each linear bounded operator A in a Hilbert space H_0 can be decomposed as A = UT, where T is a self-adjoint positive operator and U ∈ L(H_0), U performs an isometry of R(T) into H_0; moreover, such operator T is unique (it is denoted as |A|).

6) Prove that the normed space (S_1(H), ‖·‖_1) is a Banach space.

7) For canonical embedding π_1 : S_1(H) → L(H), prove that ‖π_1‖ = 1.

8) Prove that S_0(H) is dense in S_1(H) in nuclear norm.

9) Prove that for each B ∈ L(H), f(A) := tr(AB) is a linear continuous functional on S_1(H) and find ‖f‖.
10) In real Hilbert space L_2[0, 1], consider the integral operator (Ax)(t) = ∫_0^1 min(t, s) x(s) ds, 0 ≤ t ≤ 1, x ∈ L_2[0, 1]. Its eigenvalues and the corresponding normalized eigenfunctions are as follows:
\[ \lambda_n = \frac{1}{\pi^2 \big(n + \frac{1}{2}\big)^2}, \qquad \varphi_n(t) = \sqrt{2}\, \sin\Big( \big(n + \tfrac12\big) \pi t \Big), \quad n \ge 0. \]
Prove that A is S-operator.

3.2. Pettis and Bochner integrals

In definition 2.9, we introduced the mean value of a random element and of a probability measure in H. Actually, it is the so-called weak integral, which is defined using linear functionals f_h(x) = (x, h), x ∈ H. We will see that under natural conditions, this integral coincides with the so-called strong integral, which is defined as a strong limit of integrals of simple functions.

3.2.1. Weak integral

Consider a probability space (Ω, F, P), a normed vector space X and a random element ξ : Ω → X that has the weak first order, i.e. the mapping ξ is (F, B(X)) measurable and for each x* ∈ X*, the weak first moment
\[ \mathrm{E}\,\langle \xi(\omega), x^* \rangle = \int_\Omega \langle \xi(\omega), x^* \rangle \, d\mathrm{P}(\omega) \]
is finite. Hereafter ⟨x, x*⟩ denotes the value of a functional x* at a vector x.

DEFINITION 3.9.– Given a random element ξ in X of the weak first order, suppose that there exists m ∈ X such that for all x* ∈ X*, it holds E⟨ξ(ω), x*⟩ = ⟨m, x*⟩. The vector m is called Pettis integral, or weak integral, of ξ over the measure P and denoted as Eξ or ∫_Ω ξ(ω) dP(ω).

Let μ be a probability measure on B(X), possessing weak first moments, i.e. for each x* ∈ X* the integral ∫_X ⟨x, x*⟩ dμ(x) is finite. Then the identity operator I : X → X is a random element on the probability space (X, B(X), μ), and the mean value m_μ of μ is defined as Pettis integral of I. In other words, it holds
\[ \langle m_\mu, x^* \rangle = \int_X \langle x, x^* \rangle \, d\mu(x), \quad x^* \in X^*. \]
One can write that in the sense of Pettis integral,
\[ m_\mu = \int_X x \, d\mu(x). \]
It is clear that m_μ = Eη, where η is any random element in X with distribution μ.

LEMMA 3.5.– Let ξ and η be random elements in X defined on the same probability space, with finite Pettis integrals Eξ and Eη. Then the following statements hold true:
a) Eξ is uniquely defined;
b) if ξ + η is a random element in X, then there exists Pettis integral E(ξ + η), and E(ξ + η) = Eξ + Eη;
c) it holds ‖Eξ‖ ≤ E‖ξ‖;
d) let A be a linear bounded operator from X into a normed vector space Y; then there exists Pettis integral E(Aξ), and E(Aξ) = A(Eξ).

PROOF.–
a) Suppose that for some distinct vectors m_1, m_2 ∈ X, we have ⟨m_1, x*⟩ = E⟨ξ, x*⟩ = ⟨m_2, x*⟩, x* ∈ X*. Then for all x* ∈ X*, ⟨m_1, x*⟩ = ⟨m_2, x*⟩, while m_1 ≠ m_2. But a corollary of Hahn–Banach theorem states that there exists x*_0 ∈ X*, with ⟨m_1, x*_0⟩ ≠ ⟨m_2, x*_0⟩ (i.e. x*_0 separates points m_1 and m_2). The obtained contradiction proves the uniqueness of Eξ.

b) Proof is straightforward and based on the linearity of Lebesgue integral.

c) ‖ξ‖ is a r.v. as a Borel function of a random element. If E‖ξ‖ = ∞, then the inequality is true. Now, let E‖ξ‖ < ∞. Denote Eξ = m. We have
\[ |\langle m, x^* \rangle| = |\mathrm{E}\,\langle \xi, x^* \rangle| \le \mathrm{E}\,|\langle \xi, x^* \rangle| \le \mathrm{E}\,\|\xi\| \cdot \|x^*\|, \quad x^* \in X^*. \]
Introduce a linear continuous functional F_m(x*) = ⟨m, x*⟩, x* ∈ X*. Since |F_m(x*)| ≤ E‖ξ‖·‖x*‖, it holds ‖m‖ = ‖F_m‖ ≤ E‖ξ‖. Here, we used the fact that the embedding X ∋ m ↦ F_m ∈ X** is isometric.

d) Aξ is a random element in Y as a composition of measurable mappings. It holds for any y* ∈ Y*:
\[ \mathrm{E}\,\langle A\xi, y^* \rangle = \mathrm{E}\,\langle \xi, A^* y^* \rangle = \langle \mathrm{E}\,\xi, A^* y^* \rangle = \langle A(\mathrm{E}\,\xi), y^* \rangle. \]
Hence A(Eξ) is Pettis integral of Aξ. Here, A* : Y* → X* is adjoint operator to A.

3.2.2. Strong integral

Now, let B be a separable Banach space.

DEFINITION 3.10.– A random element η in B is called simple if its range η(Ω) is finite.
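A simple random element takes finitely many values on a finite partition of Ω. Anticipating definition 3.11 below, its strong integral is just the probability-weighted sum of its values. A minimal numerical sketch in B = ℝ³ (the values c_k and probabilities are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simple random element eta = sum_k c_k I_{E_k} in B = R^3
c = np.array([[1.0, 0.0, 2.0],
              [0.0, -1.0, 1.0],
              [3.0, 1.0, 0.0]])     # values c_k
p = np.array([0.5, 0.3, 0.2])       # probabilities P(E_k) of the disjoint events

# Integral of a simple element: E eta = sum_k c_k P(E_k)
exact = p @ c

# Monte Carlo check: sample which event E_k occurred
k = rng.choice(3, size=400_000, p=p)
empirical = c[k].mean(axis=0)
```

The empirical mean of the sampled values agrees with the weighted sum, coordinate by coordinate.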
It is clear that a simple random element η in B can be represented as
\[ \eta(\omega) = \sum_{k=1}^{n} c_k I_{E_k}(\omega), \quad \omega \in \Omega, \]
where n ≥ 1, {c_k} ⊂ B, and E_1, ..., E_n are disjoint random events. Then
\[ \|\eta(\omega)\| = \sum_{k=1}^{n} \|c_k\| \cdot I_{E_k}(\omega), \quad \omega \in \Omega, \]
is a simple r.v.

LEMMA 3.6.– Let L_0(Ω) be the set of all random elements in B with underlying probability space (Ω, F, P), where random elements ξ and η are identified if P(ξ = η) = 1. Then
\[ \rho(\xi, \eta) = \mathrm{E}\, \frac{\|\xi - \eta\|}{1 + \|\xi - \eta\|}, \quad \xi, \eta \in L_0(\Omega), \]  [3.11]
is a metric that metrizes the convergence in probability.

PROOF.–
a) First we show that for any random elements ξ and η in the separable Banach space B, ξ + η ∈ L_0(Ω) as well. Consider z : Ω → B × B, z(ω) = (ξ(ω); η(ω)). This mapping is (F, B(B × B)) measurable, because
\[ z^{-1}(A_1 \times A_2) = (\xi^{-1} A_1) \cap (\eta^{-1} A_2) \in \mathcal{F}, \quad A_1 \times A_2 \in B(B) \times B(B), \]
and σa(B(B) × B(B)) = B(B × B) due to the separability of B. Next, the mapping K : B × B → B, K(x, y) = x + y is continuous, and therefore, it is (B(B × B), B(B)) measurable. Hence the mapping ξ(ω) + η(ω) = K(z(ω)), ω ∈ Ω is (F, B(B)) measurable as a composition of measurable functions, and ξ + η indeed belongs to L_0(Ω). Also it is clear that for any scalar λ and any ξ ∈ L_0(Ω), λξ ∈ L_0(Ω) as well. Thus, L_0(Ω) is a vector space (provided the underlying Banach space is separable).
b) In [3.11], ξ − η ∈ L_0(Ω) and ‖ξ − η‖ is a r.v. Hence ρ(ξ, η) is well-defined. It is verified directly, using lemma 2.1, that ρ(ξ, η) is indeed a metric on L_0(Ω).

c) If {ξ, ξ_n, n ≥ 1} ⊂ L_0(Ω) and ξ_n converges in probability to ξ, i.e. ‖ξ_n − ξ‖ → 0 in probability, then
\[ \rho(\xi_n, \xi) = \mathrm{E}\, \frac{\|\xi_n - \xi\|}{1 + \|\xi_n - \xi\|} \to 0 \quad \text{as } n \to \infty \]
by Lebesgue dominated convergence theorem.

d) Vice versa, suppose that ρ(ξ_n, ξ) → 0 as n → ∞. For any ε > 0,
\[ \mathrm{P}\,\{\|\xi_n - \xi\| > \varepsilon\} = \mathrm{P}\left\{ \frac{\|\xi_n - \xi\|}{1 + \|\xi_n - \xi\|} > \frac{\varepsilon}{1 + \varepsilon} \right\} \le \mathrm{E}\, \frac{\|\xi_n - \xi\|}{1 + \|\xi_n - \xi\|} \cdot \frac{1 + \varepsilon}{\varepsilon} \to 0 \quad \text{as } n \to \infty. \]
Hence, ‖ξ_n − ξ‖ → 0 in probability, and ξ_n converges to ξ in probability.
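A virtue of the bounded transform in [3.11] is that ρ stays finite and still detects convergence in probability even when E‖ξ_n − ξ‖ = ∞ for every n. A Monte Carlo sketch (Cauchy increments chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

def rho_estimate(d):
    # Monte Carlo estimate of [3.11], given samples d of ||xi_n - xi||
    return np.mean(d / (1.0 + d))

# ||xi_n - xi|| = |Z|/n with Z Cauchy: E||xi_n - xi|| is infinite for every n,
# yet xi_n -> xi in probability, and rho(xi_n, xi) -> 0 detects this.
z = np.abs(rng.standard_cauchy(200_000))
r = [rho_estimate(z / n) for n in (1, 10, 100)]
```

The estimates decrease strictly as n grows, although the plain expectation E‖ξ_n − ξ‖ would be infinite at every step.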
LEMMA 3.7.– Let B be a separable Banach space and ξ be a random element in B. Then there exists a sequence {ξ_n, n ≥ 1} of simple random elements in B such that ‖ξ_n − ξ‖ → 0 as n → ∞, a.s.

PROOF.– Fix ε > 0. Because B is separable, it can be partitioned as B = ∪_{i=1}^∞ B_i, where B_i are disjoint Borel sets, with
\[ \operatorname{diam} B_i = \sup_{x, y \in B_i} \|x - y\| \le \varepsilon, \quad i \ge 1. \]
Introduce a random element
\[ \eta_\varepsilon(\omega) = \sum_{i=1}^{\infty} b_i \cdot I_{\xi^{-1} B_i}(\omega), \quad \omega \in \Omega, \]  [3.12]
where b_i ∈ B_i and the series converges strongly for each ω ∈ Ω. Then ‖ξ − η_ε‖ ≤ ε, and η_{1/n} converges pointwise to ξ. Hence ρ(η_{1/n}, ξ) → 0 as n → ∞, and for some number N it holds ρ(η_{1/N}, ξ) < ε/2. For η_{1/N}, we have expression like [3.12]:
\[ \eta_{1/N} = \sum_{i=1}^{\infty} c_i \cdot I_{\xi^{-1} C_i}, \quad c_i \in C_i, \quad i \ge 1. \]
Simple random elements τ_M = Σ_{i=1}^M c_i · I_{ξ^{-1} C_i} converge strongly to η_{1/N} for each ω ∈ Ω; hence for some M_0, it holds ρ(τ_{M_0}, η_{1/N}) < ε/2. Thus, ρ(τ_{M_0}, ξ) < ε, and ξ can be approximated by simple random elements in the metric space (L_0(Ω), ρ).
Therefore, one can construct a sequence {α_n} of simple random elements such that ρ(α_n, ξ) → 0 as n → ∞, i.e. ‖α_n − ξ‖ → 0 in probability. Now, by Riesz theorem there exists a subsequence {α_{n_k}} such that ‖α_{n_k} − ξ‖ → 0 as k → ∞, a.s. Then {α_{n_k}} is the desired sequence of simple random elements.

At the first stage, we define the strong integral of a simple random element. Let
\[ \xi = \sum_{k=1}^{m} c_k I_{E_k}, \]  [3.13]
where {E_k} are disjoint random events and c_k ∈ B, k = 1, ..., m.

DEFINITION 3.11.– Bochner integral of a simple random element with representation [3.13] is defined as
\[ \mathrm{E}\,\xi = \int_\Omega \xi(\omega)\, d\mathrm{P}(\omega) = \sum_{k=1}^{m} c_k \mathrm{P}(E_k). \]
It is easy to verify that Bochner integral given in definition 3.11 is well-defined.

LEMMA 3.8.– Let ξ be a random element in the Banach space B and {ξ_n} be simple random elements in B such that E‖ξ_n − ξ‖ → 0 as n → ∞. Then the sequence Eξ_n of Bochner integrals converges in B (in strong sense).

PROOF.– We use evident properties of Bochner integral of simple random elements:
\[ \|\mathrm{E}\,\xi_n - \mathrm{E}\,\xi_m\| = \|\mathrm{E}\,(\xi_n - \xi_m)\| \le \mathrm{E}\,\|\xi_n - \xi_m\| \le \mathrm{E}\,\|\xi_n - \xi\| + \mathrm{E}\,\|\xi_m - \xi\| \to 0 \]
as n, m → ∞. Then {Eξ_n} is a Cauchy sequence in the complete normed space B. Thus, it strongly converges in B.

DEFINITION 3.12.– Let ξ be a random element in the Banach space B and there exists a sequence {ξ_n} of simple random elements in B such that E‖ξ_n − ξ‖ → 0 as n → ∞; by lemma 3.8 there exists the strong limit of Eξ_n, and this limit is called Bochner integral of ξ and denoted as Eξ = ∫_Ω ξ(ω) dP(ω).

REMARK 3.7.– In definition 3.12, the limit does not depend on the approximating sequence.

PROOF.– Let E‖ξ_n − ξ‖ → 0 and E‖η_n − ξ‖ → 0 as n → ∞, where ξ_n and η_n are simple random elements in B. Then
\[ \|\mathrm{E}\,\xi_n - \mathrm{E}\,\eta_n\| \le \mathrm{E}\,\|\xi_n - \eta_n\| \le \mathrm{E}\,\|\xi_n - \xi\| + \mathrm{E}\,\|\eta_n - \xi\| \to 0 \]
as n → ∞. Hence lim_{n→∞} Eξ_n = lim_{n→∞} Eη_n.

Let μ be a probability measure on B(B). Then the identity operator I : B → B is a random element on the probability space (B, B(B), μ), and mean value m_μ of μ (in strong sense) is defined as Bochner integral of I (if the latter exists according to definition 3.12). In this case, we write
\[ m_\mu = \int_B x \, d\mu(x) \quad \text{(in strong sense).} \]  [3.14]

If η is any random element in B with distribution μ and ∫_B x dμ(x) exists in strong sense, then Eη exists as Bochner integral and Eη = ∫_B x dμ(x). In other words, the existence of strong integral [3.14] means the following. First, we consider a simple measurable function p : B → B of the form
\[ p(x) = \sum_{k=1}^{m} c_k I_{E_k}(x), \quad x \in B, \]
where E_k are disjoint Borel sets and c_k ∈ B, k = 1, ..., m. Bochner integral of p is ∫_B p(x) dμ(x) = Σ_{k=1}^m c_k μ(E_k). Next, assume that there exists a sequence {p_n} of simple measurable functions such that ∫_B ‖p_n(x) − x‖ dμ(x) → 0 as n → ∞. Then the strong integral ∫_B x dμ(x) = lim_{n→∞} ∫_B p_n(x) dμ(x).

Now, we describe the relation between Bochner and Pettis integrals.

THEOREM 3.10.– Let ξ be a random element in the Banach space B. If Eξ exists in the strong sense, then it exists in the weak sense, and the strong and weak values of expectation coincide.

PROOF.– For a simple random element η = Σ_{k=1}^m c_k I_{E_k} with representation like in [3.13],
\[ \langle \mathrm{E}\,\eta, x^* \rangle = \Big\langle \sum_{k=1}^{m} c_k \mathrm{P}(E_k),\, x^* \Big\rangle = \sum_{k=1}^{m} \mathrm{P}(E_k)\, \langle c_k, x^* \rangle = \mathrm{E}\,\Big\langle \sum_{k=1}^{m} c_k I_{E_k}(\omega),\, x^* \Big\rangle = \mathrm{E}\,\langle \eta, x^* \rangle, \quad x^* \in B^*. \]
Let {ξ_n} be a sequence of simple random elements in B, with E‖ξ_n − ξ‖ → 0 as n → ∞. Then Eξ_n → Eξ as n → ∞. The latter convergence is strong in B, and this implies that Eξ_n → Eξ weakly as well. Hence
\[ \langle \mathrm{E}\,\xi_n, x^* \rangle \to \langle \mathrm{E}\,\xi, x^* \rangle \quad \text{as } n \to \infty. \]
We want to let n tend to infinity in the relation ⟨Eξ_n, x*⟩ = E⟨ξ_n, x*⟩. To be able to do it on the right-hand side, consider
\[ \mathrm{E}\,|\langle \xi, x^* \rangle| \le \mathrm{E}\,|\langle \xi - \xi_n, x^* \rangle| + \mathrm{E}\,|\langle \xi_n, x^* \rangle| \le \mathrm{E}\,\|\xi - \xi_n\| \cdot \|x^*\| + \mathrm{E}\,\|\xi_n\| \cdot \|x^*\| < \infty \]
for n large enough; hence E⟨ξ, x*⟩ is finite; moreover,
\[ |\mathrm{E}\,\langle \xi_n, x^* \rangle - \mathrm{E}\,\langle \xi, x^* \rangle| \le \mathrm{E}\,|\langle \xi_n - \xi, x^* \rangle| \le \mathrm{E}\,\|\xi_n - \xi\| \cdot \|x^*\| \to 0 \]
as n → ∞, and therefore,
\[ \mathrm{E}\,\langle \xi_n, x^* \rangle \to \mathrm{E}\,\langle \xi, x^* \rangle \quad \text{as } n \to \infty. \]
Thus, ⟨Eξ, x*⟩ = E⟨ξ, x*⟩. This proves that the strong integral Eξ coincides with the corresponding weak integral.

THEOREM 3.11.– (Criterion for existence of Bochner integral) Let ξ be a random element in the separable Banach space B. There exists Bochner integral Eξ if, and only if, E‖ξ‖ < ∞ (i.e. when the strong first moment E‖ξ‖ is finite).

PROOF.–
a) Necessity. There exists a sequence {ξ_n, n ≥ 1} of simple random elements in B, with E‖ξ_n − ξ‖ → 0 as n → ∞. Hence E‖ξ_n − ξ‖ < ∞, for all n ≥ n_0. We have
\[ \mathrm{E}\,\|\xi\| \le \mathrm{E}\,\|\xi_{n_0} - \xi\| + \mathrm{E}\,\|\xi_{n_0}\| < \infty \;\Rightarrow\; \mathrm{E}\,\|\xi\| < \infty. \]

b) Sufficiency. Now, we assume that E‖ξ‖ < ∞. By lemma 3.7, due to the separability of B there exists a sequence {ξ_n} of simple random elements in B, with ‖ξ_n − ξ‖ → 0 a.s. Now, introduce simple random elements
\[ \eta_n = \begin{cases} \xi_n, & \text{if } \|\xi_n\| \le 2\|\xi\|, \\ 0, & \text{otherwise.} \end{cases} \]
First, we check that ‖η_n − ξ‖ → 0 a.s. Indeed, if ξ(ω) = 0, then η_n(ω) = 0, and ‖η_n(ω) − ξ(ω)‖ → 0; next, let ξ(ω) ≠ 0 and ‖ξ_n(ω) − ξ(ω)‖ → 0; then ‖ξ_n(ω)‖ → ‖ξ(ω)‖ and ‖ξ_n(ω)‖ ≤ 2‖ξ(ω)‖, for n ≥ n_0(ω); hence η_n(ω) = ξ_n(ω) for n ≥ n_0(ω), and for such n, it holds ‖η_n(ω) − ξ(ω)‖ = ‖ξ_n(ω) − ξ(ω)‖ → 0 as n → ∞.
We proved that η_n strongly converges to ξ a.s. Second, ‖η_n − ξ‖ ≤ ‖η_n‖ + ‖ξ‖ ≤ 3·‖ξ‖, and 3·‖ξ‖ has finite expectation. Therefore, by Lebesgue dominated convergence theorem it holds E‖η_n − ξ‖ → 0 as n → ∞. According to definition 3.12, there exists Bochner integral Eξ.

In theorem 3.11, we used the separability of B to ensure the existence of approximating sequence of simple random elements (in the sufficiency part of proof). That is why in this book we consider Bochner integral of random elements that are distributed namely in a separable Banach space.

Problems 3.2

11) Let ξ be a random element in a normed vector space X such that E‖ξ‖ < ∞.
a) Prove that there exists M_ξ ∈ X** so that for each x* ∈ X*, E⟨ξ, x*⟩ = ⟨M_ξ, x*⟩.
b) Under additional assumption that X is a reflexive Banach space (possibly non-separable), prove that there exists Pettis integral Eξ.

12) Let H be an infinite-dimensional separable Hilbert space and A(ω) be a random Hilbert–Schmidt operator in H, i.e. A(ω) be a random element in S_2(H). Assume additionally that E‖A(ω)‖_2 < ∞. Prove that for each x ∈ H, there exists Bochner integral E[A(ω)x] and E[A(ω)x] = (E A(ω))x, where the latter integral on the right-hand side is Bochner integral.

13) Let H be a space from problem (12). Construct a random element ξ in H such that there exists Eξ in weak sense, but it does not exist in strong sense.

3.3. Borel measures in Hilbert space

In this section, we study general properties of probability measures on B(H), where H is a real separable infinite-dimensional Hilbert space.

3.3.1. Weak and strong moments

DEFINITION 3.13.– Let (X, ρ) be a metric space. Any measure on the Borel sigma-algebra B(X) is called Borel measure in X.

We will study mostly Borel measures in H.

DEFINITION 3.14.– Let μ be a Borel probability measure in H and ξ be a random element in H. Expressions
\[ \sigma_n(z_1, \ldots, z_n) = \int_H (x, z_1)(x, z_2) \ldots (x, z_n)\, d\mu(x) \]
and
\[ \sigma_{n\xi}(z_1, \ldots, z_n) = \mathrm{E}\,(\xi, z_1)(\xi, z_2) \ldots (\xi, z_n), \quad z_1, \ldots, z_n \in H, \]
are called weak (non-central) moments of order n of μ and ξ, respectively. Expressions
\[ m_n = \int_H \|x\|^n d\mu(x) \quad \text{and} \quad m_{n\xi} = \mathrm{E}\,\|\xi\|^n \]
are called strong (non-central) moments of order n of μ and ξ, respectively.

Remember definition 2.9. For a Borel probability measure μ, its mean value m_μ = ∫_H x dμ(x); this is Pettis integral; and for a random element ξ in H, its mean value Eξ can be understood as Pettis integral. In terms of the first weak moments, it holds (m_μ, z) = σ_1(z), (Eξ, z) = σ_{1ξ}(z), z ∈ H. If m_1 = ∫_H ‖z‖ dμ(z) < ∞, then m_μ exists and m_μ = ∫_H x dμ(x); this is Bochner integral (see theorems 3.11 and 3.10). If m_{1ξ} = E‖ξ‖ < ∞, then Eξ can be understood both as Bochner and Pettis integral.

Now, switch to definition 2.10. For a Borel probability measure μ in H and a random element ξ in H, correlation operator is defined using the so-called central weak second moments ∫_H (x − m, h_1)(x − m, h_2) dμ(x) and E(ξ − m, h_1)(ξ − m, h_2), where m is mean value of μ or ξ, respectively. It is convenient to introduce another operator based on non-central moments.

DEFINITION 3.15.– An operator A ∈ L(H) is called covariance operator of a random element ξ in H if
\[ (A h_1, h_2) = \mathrm{E}\,(\xi, h_1)(\xi, h_2), \quad \text{for all } h_1, h_2 \in H, \]
and A is called covariance operator of a Borel probability measure μ in H if
\[ (A h_1, h_2) = \int_H (x, h_1)(x, h_2)\, d\mu(x), \quad \text{for all } h_1, h_2 \in H. \]
There is a simple relation between the correlation operator and covariance one.

LEMMA 3.9.– Let ξ be a random element in H with mean value m. The correlation operator S of ξ exists if, and only if, the covariance operator A of ξ exists; they are related as follows:
\[ (S h_1, h_2) = (A h_1, h_2) - (m, h_1)(m, h_2), \quad h_1, h_2 \in H, \]
\[ S = A - \|m\|^2 \cdot P_{[m]}, \]
where [m] = span(m) and P_{[m]} is projector on [m].

PROOF.– Here we prove the relation only:
\[ (S h_1, h_2) = \mathrm{E}\,(\xi, h_1)(\xi, h_2) - \mathrm{E}\,(\xi, h_1)(m, h_2) - \mathrm{E}\,(m, h_1)(\xi, h_2) + (m, h_1)(m, h_2) = (A h_1, h_2) - 2(m, h_1)(m, h_2) + (m, h_1)(m, h_2) = (A h_1, h_2) - (m, h_1)(m, h_2), \]
\[ (S h_1, h_2) = (A h_1, h_2) - \|m\|^2 \cdot (P_{[m]} h_1, h_2) = \big( (A - \|m\|^2 \cdot P_{[m]}) h_1, h_2 \big). \]
This implies the desired relation between the two operators.
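In finite dimension the relation of lemma 3.9 is the familiar identity "covariance = second-moment matrix minus outer product of the mean", and ‖m‖²P_[m] is exactly the rank-one matrix m mᵀ. A sketch with empirical moments in H = ℝ⁴ (the identity holds exactly for sample moments, not only in the limit):

```python
import numpy as np

rng = np.random.default_rng(3)

# Samples of a random element xi in H = R^4 with nonzero mean
xi = rng.normal(size=(50_000, 4)) @ rng.normal(size=(4, 4)).T \
     + np.array([1.0, -2.0, 0.5, 0.0])

m = xi.mean(axis=0)
A = xi.T @ xi / len(xi)           # covariance operator: (A h1, h2) = E (xi,h1)(xi,h2)
S = A - np.outer(m, m)            # lemma 3.9: S = A - ||m||^2 P_[m]

S_direct = np.cov(xi, rowvar=False, bias=True)   # correlation operator, computed centrally
P_m = np.outer(m, m) / (m @ m)                   # projector on [m] = span(m)
```

Both routes to S agree, and the subtracted term m mᵀ equals ‖m‖² times the projector on span(m).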
We notice that similar statement holds true for a Borel probability measure in H. In terms of weak moments σ_n (see definition 3.14), it holds
\[ (A h_1, h_2) = \sigma_2(h_1, h_2), \qquad (S h_1, h_2) = \sigma_2(h_1, h_2) - \sigma_1(h_1)\,\sigma_1(h_2). \]
Directly from the definition one can see that both correlation and covariance operators are positive (i.e. their quadratic forms are non-negative) and self-adjoint.

LEMMA 3.10.– Let μ be a Borel probability measure in H. Its covariance operator A_μ is S-operator if, and only if, ∫_H ‖x‖² dμ(x) < ∞, and then
\[ \operatorname{tr} A_\mu = \int_H \|x\|^2 d\mu(x). \]
PROOF.–
a) Necessity. Assume that covariance operator A_μ exists and is nuclear. Take an arbitrary orthobasis {e_n, n ≥ 1} in H. We have
\[ \operatorname{tr} A_\mu = \sum_{n=1}^{\infty} (A_\mu e_n, e_n) = \sum_{n=1}^{\infty} \int_H (x, e_n)^2 d\mu(x) = \int_H \sum_{n=1}^{\infty} (x, e_n)^2 d\mu(x) = \int_H \|x\|^2 d\mu(x) < \infty. \]
b) Sufficiency. Assume that ∫_H ‖x‖² dμ(x) < ∞. This implies that weak second moments of μ are finite. The bilinear form
\[ \sigma_2(h_1, h_2) = \int_H (x, h_1)(x, h_2)\, d\mu(x), \quad h_1, h_2 \in H, \]
is bounded because
\[ |\sigma_2(h_1, h_2)| \le \int_H \|x\|^2 d\mu \cdot \|h_1\| \cdot \|h_2\|. \]
Therefore, there exists the covariance operator A_μ ∈ L(H) that represents the bounded bilinear form σ_2(h_1, h_2). It is known that A_μ is positive self-adjoint, and moreover for any orthobasis {e_n} in H,
\[ \sum_{n=1}^{\infty} (A_\mu e_n, e_n) = \sum_{n=1}^{\infty} \int_H (x, e_n)^2 d\mu(x) = \int_H \|x\|^2 d\mu(x) < \infty. \]
Now, by theorem 3.7, A_μ is S-operator.
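The trace identity of lemma 3.10 can be checked on empirical moments in finite dimension. In the sketch below (an arbitrary Gaussian measure on ℝ⁵ chosen for illustration) the trace of the sample covariance operator equals the sample mean of ‖x‖² exactly, because both are the same sum of squares:

```python
import numpy as np

rng = np.random.default_rng(4)

# mu = N(0, L L^T) on H = R^5, so A_mu = L L^T and tr A_mu = E ||x||^2
L = rng.normal(size=(5, 5))
x = rng.normal(size=(100_000, 5)) @ L.T

A = x.T @ x / len(x)                           # empirical covariance operator
trace_A = np.trace(A)
second_moment = np.mean((x ** 2).sum(axis=1))  # empirical E ||x||^2
```

The two empirical quantities coincide to rounding error, and both approximate the exact value tr(L Lᵀ).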
COROLLARY 3.2.– For a Borel probability measure μ in H, its correlation operator S_μ is S-operator if, and only if, ∫_H ‖x‖² dμ(x) < ∞.

PROOF.–
a) Necessity. If S_μ is S-operator, then by lemma 3.9 there exists the covariance operator A_μ = S_μ + ‖m‖² P_{[m]}; here P_{[m]} is nuclear as a finite-dimensional operator; hence A_μ is nuclear as a linear combination of two nuclear operators. Therefore, A_μ is S-operator and by lemma 3.10 the strong second moment of μ is finite.

b) Sufficiency. Now, we assume that ∫_H ‖x‖² dμ(x) < ∞. Then ∫_H ‖x‖ dμ(x) is finite as well, and mean value m = ∫_H x dμ(x) exists as both Bochner and Pettis integrals. By theorem 3.7, the covariance operator A_μ is S-operator, and by lemma 3.9 there exists the correlation operator S_μ = A_μ − ‖m‖² P_{[m]}, and it is nuclear as a linear combination of two nuclear operators. Finally, S_μ is S-operator, because it is always positive and self-adjoint.

3.3.2. Examples of Borel measures

DEFINITION 3.16.– Let μ be a Borel probability measure in H and ξ be a random element in H. Functions
\[ \varphi_\mu(z) = \int_H e^{i(z,x)} d\mu(x) \quad \text{and} \quad \varphi_\xi(z) = \mathrm{E}\, e^{i(z,\xi)}, \quad z \in H, \]
are called characteristic functionals of μ and ξ, respectively. If μ coincides with the distribution μ_ξ of ξ, then ϕ_μ = ϕ_ξ.

Now, we calculate mean values, correlation and covariance operators, and characteristic functionals of some Borel measures in H.

EXAMPLE 3.6.– (Dirac measure) Fix a ∈ H. The probability measure
\[ \delta_a(B) = I_B(a), \quad B \in B(H), \]
is called Dirac (or point) measure concentrated at a. Random element ξ(ω) = a, ω ∈ Ω has distribution δ_a. The mean value m = Eξ = a (in the sense of Bochner integral).
We have ξ(ω) − m = 0, ω ∈ Ω; hence the correlation operator S is zero operator. For the covariance operator A, it holds
\[ (Ax, y) = \mathrm{E}\,(\xi, x)(\xi, y) = (a, x)(a, y) = \big( \|a\|^2 \cdot P_{[a]}\, x,\, y \big). \]
Therefore, A = ‖a‖² · P_{[a]}. Finally, the characteristic functional is ϕ(x) = E e^{i(x,ξ)} = e^{i(x,a)}, x ∈ H.
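For a concrete vector a (chosen arbitrarily below), the rank-one structure of the Dirac covariance operator is easy to verify: a aᵀ is ‖a‖² times the orthogonal projector on span(a).

```python
import numpy as np

# Example 3.6 in H = R^3: xi is identically equal to a, so
# (A x, y) = (a, x)(a, y), i.e. A = a a^T = ||a||^2 P_[a], and S = 0.
a = np.array([1.0, 2.0, -2.0])

A = np.outer(a, a)                    # covariance operator
P_a = np.outer(a, a) / (a @ a)        # projector on [a]
```

The projector is idempotent and the trace of A equals ‖a‖², consistent with lemma 3.10 applied to δ_a.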
EXAMPLE 3.7.– (Measure induced in the direction) Let μ be a Borel probability measure on real line and e ∈ H be a unit vector. The mapping
\[ \rho(\alpha e) = \alpha, \quad \alpha \in \mathbb{R}, \]
identifies one-dimensional subspace [e] and ℝ. The measure μ̃ = μ̃_e,
\[ \tilde\mu_e(E) = \mu\big(\rho(E \cap [e])\big), \quad E \in B(H), \]
is called the measure induced by μ in direction e. Let η be a r.v. with distribution μ. Random element X(ω) = η(ω)e, ω ∈ Ω has distribution μ̃_e. Indeed, for E ∈ B(H),
\[ \mu_X(E) = \mathrm{P}\,\{\eta(\omega) e \in E\} = \mathrm{P}\,\{\eta(\omega) e \in (E \cap [e])\} = \mathrm{P}\,\{\eta(\omega) \in \rho(E \cap [e])\} = \tilde\mu_e(E). \]
Suppose that μ has finite mean value m_μ. Then
\[ \mathrm{E}\,\|X\| = \mathrm{E}\,\|\eta e\| = \mathrm{E}\,|\eta| = \int_{\mathbb{R}} |x|\, d\mu < \infty, \]
and mean value m_μ̃ exists in strong sense. It holds
\[ (m_{\tilde\mu}, z) = \mathrm{E}\,(X, z) = (\mathrm{E}\,\eta)(e, z), \qquad m_{\tilde\mu} = (\mathrm{E}\,\eta)\, e = m_\mu e. \]
Now, assume that ∫_ℝ x² dμ(x) < ∞ (this implies that m_μ is finite). For the covariance operator A_μ̃, it holds
\[ (A_{\tilde\mu} z_1, z_2) = \mathrm{E}\,(\eta e, z_1)(\eta e, z_2) = \mathrm{E}\,\eta^2 \cdot (e, z_1)(e, z_2). \]
Hence A_μ̃ = (∫_ℝ x² dμ) · P_{[e]}. For the correlation operator S_μ̃, we have
\[ (S_{\tilde\mu} z_1, z_2) = \mathrm{E}\,\big((\eta - m_\mu) e, z_1\big)\big((\eta - m_\mu) e, z_2\big) = \mathrm{D}\,\eta \cdot (e, z_1)(e, z_2), \]
\[ S_{\tilde\mu} = \mathrm{D}\,\eta \cdot P_{[e]} = \Big( \int_{\mathbb{R}} (x - m_\mu)^2 d\mu \Big) \cdot P_{[e]}. \]
The characteristic functional is as follows:
\[ \varphi_{\tilde\mu}(z) = \mathrm{E}\, e^{i(z, \eta e)} = \mathrm{E}\, e^{i(z,e)\eta} = \varphi_\eta(t) \big|_{t=(z,e)}, \qquad \varphi_{\tilde\mu}(z) = \varphi_\mu(t) \big|_{t=(z,e)}. \]
Here, ϕ_η and ϕ_μ are characteristic functions of η and μ, respectively.

EXAMPLE 3.8.– (Measure in H induced by a measure in ℝⁿ) We generalize previous example. Let μ be a Borel probability measure on ℝⁿ and L_n be n-dimensional subspace of H, with orthobasis {e_i, 1 ≤ i ≤ n}. The mapping
\[ \rho\Big( \sum_{i=1}^{n} \alpha_i e_i \Big) = (\alpha_1, \ldots, \alpha_n)^\top, \quad \alpha_1, \ldots, \alpha_n \in \mathbb{R}, \]
identifies L_n with ℝⁿ. The measure
\[ \tilde\mu(E) = \mu\big(\rho(E \cap L_n)\big), \quad E \in B(H), \]
is called the measure (in H) induced by μ. This measure μ̃ is concentrated on L_n, i.e. μ̃(H \ L_n) = 0.

Let η = (η_i)_{i=1}^n be a random vector with distribution μ. Random element X(ω) = Σ_{i=1}^n η_i(ω) e_i has distribution μ̃.

Suppose that μ has finite second moments. For the mean value m_μ̃, it holds
\[ (m_{\tilde\mu}, z) = \mathrm{E}\,\Big( \sum_{i=1}^{n} \eta_i e_i,\, z \Big) = \sum_{i=1}^{n} \mathrm{E}\,\eta_i \cdot (e_i, z) = (\rho^{-1} \mathrm{E}\,\eta,\, z), \quad z \in H, \]
\[ m_{\tilde\mu} = \rho^{-1} \mathrm{E}\,\eta = \rho^{-1} m_\mu. \]
Here, m_μ̃ = ∫_H x dμ̃(x) can be understood as Bochner integral because
\[ \mathrm{E}\,\|X\| = \mathrm{E}\,\|\eta\| = \int_{\mathbb{R}^n} \|x\|\, d\mu(x) < \infty. \]
For the covariance operator A_μ̃, we have for z_1, z_2 ∈ H:
\[ (A_{\tilde\mu} z_1, z_2) = \mathrm{E}\,(X, z_1)(X, z_2) = \sum_{i,j=1}^{n} (\mathrm{E}\,\eta_i \eta_j)(z_1, e_j)(z_2, e_i), \]
\[ A_{\tilde\mu} z_1 = \sum_{i=1}^{n} \Big( \sum_{j=1}^{n} a_{ij} (z_1, e_j) \Big) e_i, \qquad a_{ij} = \mathrm{E}\,\eta_i \eta_j = \int_{\mathbb{R}^n} x_i x_j \, d\mu(x). \]
In a similar way, the correlation operator S_μ̃ is as follows:
\[ S_{\tilde\mu} z = \sum_{i=1}^{n} \Big( \sum_{j=1}^{n} s_{ij} (z, e_j) \Big) e_i, \qquad s_{ij} = \int_{\mathbb{R}^n} (x_i - m_{\mu i})(x_j - m_{\mu j})\, d\mu(x), \]
where m_μ = (m_{μi})_{i=1}^n. For the characteristic functional ϕ_μ̃, we have
\[ \varphi_{\tilde\mu}(z) = \mathrm{E}\exp\Big\{ i \Big( z, \sum_{k=1}^{n} \eta_k e_k \Big) \Big\} = \mathrm{E}\exp\Big\{ i \sum_{k=1}^{n} \eta_k (z, e_k) \Big\}, \]
\[ \varphi_{\tilde\mu}(z) = \varphi_\mu(t), \qquad t = \big((z, e_1), \ldots, (z, e_n)\big)^\top. \]
EXAMPLE 3.9.– (Measure with compact but not nuclear covariance operator) We use a construction from solution to problem (13) of this chapter. Let {p_n, n ≥ 1} be a complete collection of positive probabilities, {e_n} be an orthobasis in H and ξ(ω) = α_n e_n with probability p_n, n ≥ 1. We will choose α_n > 0 and p_n later, such that second weak moments of ξ are finite but E‖ξ‖² = ∞. The distribution μ = μ_ξ of ξ is as follows:
\[ \mu(E) = \sum_{n=1}^{\infty} p_n \delta_{\alpha_n e_n}(E), \quad E \in B(H), \]
where δ_{α_n e_n} is Dirac measure at point α_n e_n. We demand that
\[ \lim_{n\to\infty} p_n \alpha_n^2 = 0, \qquad \sum_{n=1}^{\infty} p_n \alpha_n^2 = \infty. \]
For instance, p_n = 6/(π²n²), α_n = √n, n ≥ 1. Then for z_1, z_2 ∈ H,
\[ \sigma_2(z_1, z_2) = \int_H (z_1, x)(z_2, x)\, d\mu(x) = \sum_{n=1}^{\infty} p_n \alpha_n^2 (z_1, e_n)(z_2, e_n), \]
and the series converges because the sequence {p_n α_n²} is bounded. The covariance operator A_μ can be found from relation (A_μ z_1, z_2) = σ_2(z_1, z_2),
\[ A_\mu = \sum_{n=1}^{\infty} p_n \alpha_n^2 P_{[e_n]}, \]
where the series converges in sense of uniform operator convergence, because p_n α_n² → 0 as n → ∞. This operator is compact. Next, E‖ξ‖² = Σ_{n=1}^∞ p_n α_n² = ∞; therefore, A_μ is not nuclear (see lemma 3.10).
According to solution of problem (13), the (weak) mean value
\[ m = m_\mu = \sum_{n=1}^{\infty} p_n \alpha_n e_n, \]
where the series converges in the norm of H. For concrete p_n and α_n chosen above,
\[ \mathrm{E}\,\|\xi\| = \sum_{n=1}^{\infty} p_n \alpha_n < \infty, \]
and m_μ = ∫_H x dμ can be understood as Bochner integral.
The correlation operator S_μ is as follows:
\[ S_\mu = A_\mu - \|m\|^2 P_{[m]} = \sum_{n=1}^{\infty} p_n \alpha_n^2 P_{[e_n]} - \Big( \sum_{n=1}^{\infty} p_n^2 \alpha_n^2 \Big) P_{[m]}. \]
The latter operator is compact but not nuclear, like the covariance operator A_μ. Finally, the characteristic functional equals
\[ \varphi_\mu(z) = \mathrm{E}\, e^{i(z,\xi)} = \sum_{n=1}^{\infty} p_n e^{i \alpha_n (z, e_n)}, \quad z \in H. \]
From this example, we see that the correlation and covariance operators of a probability measure in H can be compact, but not nuclear.
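The concrete choice p_n = 6/(π²n²), α_n = √n can be inspected numerically (truncating the series to finitely many coordinates): the eigenvalues p_nα_n² = 6/(π²n) of A_μ tend to zero (compactness), their partial sums grow like a logarithm (non-nuclearity), while Σ p_nα_n stays bounded, so the mean exists in the strong sense.

```python
import numpy as np

# Example 3.9 with p_n = 6/(pi^2 n^2), alpha_n = sqrt(n), truncated to N terms
N = 1_000_000
n = np.arange(1, N + 1)
p = 6.0 / (np.pi ** 2 * n ** 2)
alpha = np.sqrt(n)

eigs = p * alpha ** 2            # eigenvalues p_n alpha_n^2 = 6/(pi^2 n) of A_mu

compactness = eigs[-1]           # p_n alpha_n^2 -> 0: A_mu is compact
partial_traces = eigs.cumsum()   # grows like (6/pi^2) log n: A_mu is not nuclear
mean_norm = (p * alpha).sum()    # E ||xi|| = sum p_n alpha_n, finite (about 1.59)
```

Doubling the truncation level keeps adding roughly (6/π²)·ln 2 ≈ 0.42 to the partial trace, which is exactly the divergence that rules out nuclearity.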
3.3.3. Boundedness of moment form

We will prove that the moment form σ_n(z_1, ..., z_n) introduced in definition 3.14 is bounded, provided the corresponding weak moments of order n exist. We will deal with this problem even in a more general setting, for Borel probability measures on a normed vector space.

LEMMA 3.11.– Every polylinear form τ_n(z_1, ..., z_n), which is defined on a Banach space B and continuous in each variable for fixed values of all other variables, is bounded, i.e. there exists C ≥ 0 such that for all z_1, ..., z_n ∈ B,
\[ |\tau_n(z_1, \ldots, z_n)| \le C\, \|z_1\| \cdot \|z_2\| \ldots \|z_n\|. \]

PROOF.– We prove by induction over n, the order of a polylinear form. For n = 1, it is well-known theorem about the equivalence of continuity and boundedness of a linear functional. Assume that the statement is true for all polylinear forms of order n − 1, where n ≥ 2 is fixed. Consider a polylinear form τ_n(z_1, ..., z_n) on B, which is continuous in each variable for fixed values of all other variables. For fixed z_1, the polylinear form τ_n(z_1, ·, ..., ·) of order n − 1 is bounded by inductive hypothesis. Hence, there exists C_{z_1} ≥ 0 such that for all z_i with ‖z_i‖ ≤ 1, i = 2, ..., n, it holds |τ_n(z_1, z_2, ..., z_n)| ≤ C_{z_1}. Introduce a family of continuous linear functionals on Banach space B:
\[ f_{z_2 \ldots z_n}(z_1) = \tau_n(z_1, \ldots, z_n), \quad z_1 \in B, \]
where ‖z_i‖ ≤ 1, i = 2, ..., n. By Banach–Steinhaus theorem, there exists C ≥ 0 such that for all z_i with ‖z_i‖ ≤ 1, i = 1, ..., n, it holds
\[ |f_{z_2 \ldots z_n}(z_1)| = |\tau_n(z_1, \ldots, z_n)| \le C. \]
Thus, τ_n is bounded. We proved the statement for arbitrary order of a polylinear form.

THEOREM 3.12.– (About boundedness of moment form) Let μ be a Borel probability measure on a normed vector space X. Fix n ≥ 1. Assume that for each z*_1, ..., z*_n ∈ X*, there exists finite integral
\[ \int_X \langle x, z_1^* \rangle \cdot \langle x, z_2^* \rangle \ldots \langle x, z_n^* \rangle \, d\mu(x) =: \sigma_n(z_1^*, \ldots, z_n^*). \]  [3.15]
Then σ_n is a bounded polylinear form.
PROOF.– It is clear that σ_n is a symmetric polylinear form on Banach space X*. In view of lemma 3.11, it is enough to prove that σ_n is continuous in z*_1 for fixed other variables z*_{2,0}, ..., z*_{n,0}.

Denote p(x) = ⟨x, z*_{2,0}⟩ ... ⟨x, z*_{n,0}⟩ (if n = 1, then p(x) = 1 by convention),
\[ f(z_1^*) := \sigma_n(z_1^*, z_{2,0}^*, \ldots, z_{n,0}^*) = \int_X \langle x, z_1^* \rangle \, p(x)\, d\mu(x), \quad z_1^* \in X^*. \]
Here f is a linear functional, and we have to show that it is bounded. Introduce linear functionals
\[ f_N(z_1^*) = \int_{\bar B(0,N)} \langle x, z_1^* \rangle \, p(x)\, d\mu(x), \quad z_1^* \in X^*, \quad N \ge 1. \]
Each of them is bounded because
\[ |f_N(z_1^*)| \le \big( N^n \cdot \|z_{2,0}^*\| \ldots \|z_{n,0}^*\| \big) \cdot \|z_1^*\|. \]
Moreover, for each fixed z*_1,
\[ |f_N(z_1^*)| \le \int_X \big|\langle x, z_1^* \rangle \cdot \langle x, z_{2,0}^* \rangle \ldots \langle x, z_{n,0}^* \rangle\big| \, d\mu(x) =: C_{z_1^*}, \]
and C_{z*_1} < ∞ because Lebesgue integral σ_n(z*_1, z*_{2,0}, ..., z*_{n,0}) is finite. Now, by Banach–Steinhaus theorem there exists C ≥ 0 such that for all N ≥ 1 and all z*_1 with ‖z*_1‖ ≤ 1, it holds |f_N(z*_1)| ≤ C. By theorem about integration over increasing sets, we get f_N(z*_1) → f(z*_1) as N → ∞; hence
\[ |f(z_1^*)| \le C, \quad \text{for all } z_1^* \in X^*, \; \|z_1^*\| \le 1. \]
The functional f is bounded, and reference to lemma 3.11 accomplishes the proof.

Remember that mean value m_μ ∈ X is Pettis integral m_μ = ∫_X x dμ(x) (see section 3.2).

COROLLARY 3.3.– (About existence of mean value) Let μ be a Borel probability measure on a reflexive Banach space B. Assume that all first weak moments
\[ \sigma_1(z^*) = \int_B \langle x, z^* \rangle \, d\mu(x), \quad z^* \in B^*, \]
are finite. Then there exists mean value m_μ.

PROOF.– By theorem 3.12, σ_1 is a continuous linear functional on B*. Since B is reflexive, there exists m ∈ B that generates this functional:
\[ \sigma_1(z^*) = \langle m, z^* \rangle, \quad \text{for all } z^* \in B^*. \]
Hence, there exists m_μ = m.
REMARK 3.8.– (About existence of mean value in separable Banach space) Let μ be a Borel probability measure on a separable Banach space B. Assume that for some fixed real p > 1,
\[ \int_B |\langle x, z^* \rangle|^p \, d\mu(x) < \infty, \quad \text{for all } z^* \in B^*. \]
Then there exists mean value m_μ.

PROOF.– Let ξ be random element in B, with distribution μ. Then E|⟨ξ, z*⟩|^p < ∞, z* ∈ B*.

a) Introduce a linear operator
\[ T_\xi : B^* \to L^p(\Omega) = L^p(\Omega, \mathrm{P}), \qquad T_\xi x^* = \langle \xi, x^* \rangle. \]
We prove that T_ξ is bounded. Consider the graph Γ_ξ ⊂ B* × L^p(Ω) of T_ξ, Γ_ξ = {(x*; ⟨ξ, x*⟩) | x* ∈ B*}. We check that Γ_ξ is a closed set. Let x*_n → x* strongly in B* and ⟨ξ, x*_n⟩ → η = η(ω) strongly in L^p(Ω). Then ⟨ξ, x*_n⟩ → ⟨ξ, x*⟩ as n → ∞ a.s. Hence η = ⟨ξ, x*⟩ a.s. and (x*; η) ∈ Γ_ξ. Thus, Γ_ξ is closed, and by the closed graph theorem (see [BER 12]), the operator T_ξ is bounded.

b) Construction of mean value. For random events A_n := {‖ξ‖ ≤ n}, we set η_n = ξ I_{A_n}, n ≥ 1. Then E‖η_n‖ < ∞, and Pettis integral Eη_n ∈ B exists. For x* ∈ B*, it holds (here q ∈ (1, ∞) is the conjugate index):
\[ |\langle \mathrm{E}\,\eta_m - \mathrm{E}\,\eta_n, x^* \rangle| = |\mathrm{E}\,\langle \xi, x^* \rangle (I_{A_m} - I_{A_n})| \le \big(\mathrm{E}\,|\langle \xi, x^* \rangle|^p\big)^{1/p} \big(\mathrm{E}\,|I_{A_m} - I_{A_n}|^q\big)^{1/q} \le \|T_\xi\| \cdot \|x^*\| \cdot \|I_{A_m} - I_{A_n}\|_{L^q(\Omega)}. \]
Therefore,
\[ \|\mathrm{E}\,\eta_m - \mathrm{E}\,\eta_n\| \le \|T_\xi\| \cdot \|I_{A_m} - I_{A_n}\|_{L^q(\Omega)} \to 0 \quad \text{as } m, n \to \infty, \]
since I_{A_n} → 1 in L^q(Ω) as n → ∞ (the latter convergence holds because P(A^c_n) converges to zero as n → ∞). Hence {Eη_n, n ≥ 1} is a Cauchy sequence in Banach space B, and Eη_n → m ∈ B strongly.
c) Finally, we prove that m = Eξ in weak sense. We have for x* ∈ B*:
\[ \langle \mathrm{E}\,\eta_n, x^* \rangle = \mathrm{E}\,\langle \xi, x^* \rangle I_{A_n} \to \langle m, x^* \rangle. \]
On the other hand, E⟨ξ, x*⟩I_{A_n} → E⟨ξ, x*⟩, since
\[ \int_{A_n^c} |\langle \xi, x^* \rangle| \, d\mathrm{P} \to 0 \quad \text{as } n \to \infty. \]
Thus, E⟨ξ, x*⟩ = ⟨m, x*⟩, x* ∈ B*.
Now, we introduce covariance and correlation operators for a measure in a normed vector space.

DEFINITION 3.17.– A linear bounded operator A_μ : X* → X** is called covariance operator of a Borel probability measure μ in a normed vector space X if
\[ \langle A_\mu z_1^*, z_2^* \rangle = \sigma_2(z_1^*, z_2^*), \quad z_1^*, z_2^* \in X^*, \]  [3.16]
and a linear bounded operator S_μ : X* → X** is called correlation operator of μ if
\[ \langle S_\mu z_1^*, z_2^* \rangle = \sigma_2(z_1^*, z_2^*) - \sigma_1(z_1^*)\, \sigma_1(z_2^*), \quad z_1^*, z_2^* \in X^*, \]  [3.17]
where moment forms σ_n are given in [3.15].

In case when m_μ exists, definition 3.17 is consistent with the definition of covariance and correlation operators in H (remember that H* and H** are isometric to H).

COROLLARY 3.4.– (About existence of covariance and correlation operators) Let μ be a Borel probability measure on a normed vector space X. Assume that all second weak moments σ_2(z*_1, z*_2) are finite. Then:
a) there exist the covariance and correlation operators A_μ, S_μ : X* → X**;
b) if additionally X is a reflexive Banach space, there exist the covariance and correlation operators A_μ, S_μ : X* → X.

PROOF.– Existence of second moments σ_2 implies existence of first moments σ_1. By theorem 3.12, both moment forms are bounded. Therefore, bilinear forms on the right-hand side of [3.16] and [3.17] are bounded as well.

a) Hence there exist linear bounded operators A_μ, S_μ : X* → X**, which represent the latter forms.
Borel Measures in Hilbert Space
87
b) Now, assume that X is a reflexive Banach space. Then the canonical embedding i : X → X** is a surjective isometry. Let A_μ and S_μ be the operators constructed in part (a) of the proof. Introduce the operators

Ã_μ = i⁻¹ A_μ, S̃_μ = i⁻¹ S_μ.

It holds ⟨Ã_μ z*_1, z*_2⟩ = ⟨A_μ z*_1, z*_2⟩ = σ_2(z*_1, z*_2), and similarly for S̃_μ. Thus, the new operators are the desired ones.
REMARK 3.9.– In Chapter 2 of the monograph [VAK 87], the next statement is proven: for a Borel probability measure on a separable Banach space B with finite second moments σ_2(z*_1, z*_2), there exist the covariance and correlation operators A_μ, S_μ : B* → B.

Notice that under the conditions of remark 3.9, the mean value m_μ exists as well (see remark 3.8). Of course, corollaries 3.3 and 3.4 are applicable to a Hilbert space H_0, for which H_0* and H_0 are isometric.

COROLLARY 3.5.– (About existence of m_μ, A_μ, S_μ in H_0) If a Borel probability measure on a real (possibly non-separable) Hilbert space H_0 has finite second moments, then there exist the mean value m_μ and the covariance and correlation operators A_μ, S_μ ∈ L(H_0).

Problems 3.3

14) For a Borel probability measure μ in H, with finite strong second moment, prove that

tr S = ∫_H ‖x‖² dμ(x) − ‖m‖²,

where m and S are the mean value and correlation operator of μ, respectively.

15) Construct two distinct Borel probability measures in H, with equal means and equal correlation operators.

16) Let ξ be a random element in a Hilbert space H_0, with covariance and correlation operators A and S, respectively, and T ∈ L(H_0). Prove that the covariance and correlation operators of Tξ are TAT* and TST*, respectively.

17) Let ξ be a random element in a normed vector space X, with E‖ξ‖² < ∞. Prove that ‖A‖ ≤ E‖ξ‖², where A is the covariance operator of ξ.
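The identity of problem 14 is easy to check numerically in a finite-dimensional truncation of H. The sketch below is an illustration only: the dimension, the mean m and the covariance matrix S are arbitrary choices, and the right-hand side is a Monte Carlo estimate of ∫‖x‖² dμ − ‖m‖².

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite-dimensional stand-in for H: a Gaussian measure on R^5
# with mean m and correlation operator (covariance matrix) S.
m = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
L = rng.standard_normal((5, 5))
S = L @ L.T / 5.0                           # symmetric positive semidefinite

x = rng.multivariate_normal(m, S, size=200_000)

lhs = np.trace(S)
rhs = (x ** 2).sum(axis=1).mean() - m @ m   # estimate of E||xi||^2 - ||m||^2
```

Both numbers agree up to Monte Carlo error, in line with tr S = E‖ξ‖² − ‖m‖².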
4 Construction of Measure by its Characteristic Functional
In this chapter, we deal mainly with Borel probability measures on a real separable infinite-dimensional Hilbert space H. We study properties of characteristic functionals of such measures and show how to construct a measure by its characteristic functional. Finally, we give a criterion for a function θ : H → C to be a characteristic functional.

4.1. Cylindrical sigma-algebra in normed space

In section 2.1.2, we introduced the so-called cylindrical sigma-algebra in the metric space (R^∞, ρ). Now, we do a similar thing in a real normed vector space X.

DEFINITION 4.1.– Let n ≥ 1, {x*_1, ..., x*_n} ⊂ X* and A_n ∈ B(R^n). The set

Â_n(x*_1, ..., x*_n) = {x ∈ X : (⟨x, x*_1⟩, ..., ⟨x, x*_n⟩) ∈ A_n}

is called a cylinder with base A_n constructed by the functionals x*_1, ..., x*_n.

Consider the class of all cylinders

Cyl = Cyl(X) = {Â_n(x*_1, ..., x*_n) : n ≥ 1, {x*_1, ..., x*_n} ⊂ X*, A_n ∈ B(R^n)}.

Like in lemma 2.3, it is an algebra of sets in X, but it is not a sigma-algebra whenever dim X = ∞. The generated sigma-algebra σa(Cyl) is called the cylindrical sigma-algebra. We will compare it to the Borel sigma-algebra.

LEMMA 4.1.– Let {x_n, n ≥ 1} be a dense set in X and {x*_n, n ≥ 1} ⊂ X* be such that

⟨x_n, x*_n⟩ = ‖x_n‖, ‖x*_n‖ = 1, n ≥ 1
Gaussian Measures in Hilbert Space: Construction and Properties, First Edition. Alexander Kukush. © ISTE Ltd 2019. Published by ISTE Ltd and John Wiley & Sons, Inc.
(such functionals exist by a corollary of the Hahn–Banach theorem). Then

‖x‖ = sup_{n≥1} |⟨x, x*_n⟩|, x ∈ X.

PROOF.– a) We have |⟨x, x*_n⟩| ≤ ‖x‖ · ‖x*_n‖ = ‖x‖. Hence sup_{n≥1} |⟨x, x*_n⟩| ≤ ‖x‖.

b) Fix x ∈ X. There exists {x_{n_k}} that strongly converges to x. It holds

⟨x, x*_{n_k}⟩ = ⟨x_{n_k}, x*_{n_k}⟩ + ⟨x − x_{n_k}, x*_{n_k}⟩,  |⟨x, x*_{n_k}⟩| ≥ ‖x_{n_k}‖ − ‖x − x_{n_k}‖,

sup_{n≥1} |⟨x, x*_n⟩| ≥ lim_{k→∞} (‖x_{n_k}‖ − ‖x − x_{n_k}‖) = ‖x‖.

The desired equality follows.
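In l², lemma 4.1 can be seen concretely: for a dense family of directions {x_n}, the norming functional of x_n is x_n/‖x_n‖, and ‖x‖ is approached by sup_n |⟨x, x*_n⟩|. A small numerical sketch (illustration only: a large random sample of unit directions stands in for a dense set, so the supremum is approximate from below):

```python
import numpy as np

rng = np.random.default_rng(1)

dim = 3                                    # low-dimensional truncation of l2
x = rng.standard_normal(dim)

# "Dense" family of directions; norming functionals are the unit vectors.
dense = rng.standard_normal((100_000, dim))
functionals = dense / np.linalg.norm(dense, axis=1, keepdims=True)

approx_norm = np.abs(functionals @ x).max()   # sup_n |<x, x_n*>|
true_norm = np.linalg.norm(x)
```

Part (a) of the proof guarantees approx_norm ≤ ‖x‖, and with enough directions the supremum nearly attains ‖x‖.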
THEOREM 4.1.– (Mourier's theorem about cylindrical sigma-algebra) In a separable normed space X, σa(Cyl) = B(X), i.e. the cylindrical and Borel sigma-algebras coincide.

PROOF.– The inclusion σa(Cyl) ⊂ B(X) is shown in a similar way to the proof of theorem 2.1 (here the separability is not used).

Next, X is separable; hence there exists a countable set {x_n, n ≥ 1}, which is dense in X. Let {x*_n, n ≥ 1} be the corresponding functionals from lemma 4.1. Consider a ball B̄(x_0, r) ⊂ X. By lemma 4.1,

‖x − x_0‖ ≤ r ⟺ |⟨x − x_0, x*_n⟩| ≤ r for all n ≥ 1.

Therefore,

B̄(x_0, r) = ∩_{n=1}^∞ {x : |⟨x, x*_n⟩ − ⟨x_0, x*_n⟩| ≤ r},

and B̄(x_0, r) ∈ σa(Cyl), and

σa(Cyl) ⊃ σa{B̄(x_0, r) : x_0 ∈ X, r > 0}.
But due to the separability, the latter sigma-algebra coincides with B(X). We have checked the inclusions in both directions, and the statement is proven.

Remember that the characteristic functional of a measure in H was introduced in definition 3.15. One can extend this concept to measures in X.

DEFINITION 4.2.– Let μ be a Borel probability measure in a real normed space X and ξ be a random element in X. The functions

ϕ_μ(x*) = ∫_X e^{i⟨x*, x⟩} dμ(x) and ϕ_ξ(x*) = E e^{i⟨x*, ξ⟩}, x* ∈ X*,

are called the characteristic functionals of μ and ξ, respectively.

COROLLARY 4.1.– (About renewal of measure by its characteristic functional) Let μ and ν be Borel probability measures in a separable normed space X such that ϕ_μ(x*) = ϕ_ν(x*), for all x* ∈ X*. Then μ = ν.

PROOF.– Let {x*_1, ..., x*_n} ⊂ X*, {a_1, ..., a_n} ⊂ R. It holds

ϕ_μ(Σ_{k=1}^n a_k x*_k) = ∫_X exp{i Σ_{k=1}^n a_k ⟨x*_k, x⟩} dμ(x) = ∫_{R^n} exp{i Σ_{k=1}^n a_k t_k} dμ_n(t_1, ..., t_n) = ϕ_{μ_n}(a_1, ..., a_n),

where μ_n is a Borel probability measure in R^n, which is induced by the mapping X ∋ x ↦ (⟨x*_k, x⟩)_{k=1}^n ∈ R^n. In a similar way,

ϕ_ν(Σ_{k=1}^n a_k x*_k) = ϕ_{ν_n}(a_1, ..., a_n).

Therefore, ϕ_{μ_n} = ϕ_{ν_n}, and for Borel probability measures in R^n, this implies that μ_n = ν_n. Now, take a set Â_n = Â_n(x*_1, ..., x*_n) from definition 4.1. It holds

μ(Â_n) = μ_n(A_n) = ν_n(A_n) = ν(Â_n).

Thus, μ and ν coincide on the cylindrical algebra Cyl = Cyl(X). Then by Carathéodory's extension theorem and theorem 4.1, μ and ν coincide on σa(Cyl) = B(X).
In a separable Hilbert space, the cylindrical sigma-algebra can be introduced in relation to a fixed orthobasis. Let H be an infinite-dimensional separable Hilbert space. Fix an orthobasis {e_n, n ≥ 1}. For A_n ∈ B(R^n), consider a cylinder

Â_n := {x ∈ H : ((x, e_k))_{k=1}^n ∈ A_n}.

The class of all such cylinders Cyl_e := {Â_n : n ≥ 1, A_n ∈ B(R^n)} is an algebra of sets in H.

LEMMA 4.2.– (About cylindrical sigma-algebra in H) It holds σa(Cyl_e) = B(H).

PROOF.– It follows the line of the proof of theorem 2.1. In particular, for x ∈ H and r > 0,

B̄(x, r) = ∩_{k=1}^∞ {y ∈ H : Σ_{n=1}^k (y_n − x_n)² ≤ r²} ∈ σa(Cyl_e),

where x_n = (x, e_n) and y_n = (y, e_n) are the corresponding Fourier coefficients.
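The reason the countable intersection above recovers the ball is that the partial sums Σ_{n≤k}(y_n − x_n)² are nondecreasing in k and converge to ‖y − x‖², so y lies in B̄(x, r) exactly when it lies in every cylinder. A numerical sketch in a truncation of l² (the vectors below are arbitrary l²-like examples):

```python
import numpy as np

rng = np.random.default_rng(2)

dim = 1000                                             # truncation of l2
weights = 1.0 / np.arange(1, dim + 1)                  # enforce square-summability
x = rng.standard_normal(dim) * weights
y = rng.standard_normal(dim) * weights

partial = np.cumsum((y - x) ** 2)   # sum_{n<=k} (y_n - x_n)^2, k = 1..dim
full = np.sum((y - x) ** 2)         # ||y - x||^2 in the truncation

# partial is nondecreasing and its last entry equals the full squared norm.
```

Each cylinder condition "partial[k] ≤ r²" is a constraint on finitely many Fourier coefficients, and all of them together are equivalent to ‖y − x‖ ≤ r.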
COROLLARY 4.2.– Let μ and ν be Borel probability measures in an infinite-dimensional separable Hilbert space H, {e_n, n ≥ 1} be an orthobasis in H and P_n be the orthoprojector on L_n = span(e_1, ..., e_n). Assume that for all n ≥ 1, μP_n⁻¹ = νP_n⁻¹. Then μ = ν.

PROOF.– For any n ≥ 1 and A_n ∈ B(R^n), it holds

(μP_n⁻¹)(A_n) = μ(Â_n), (νP_n⁻¹)(A_n) = ν(Â_n), hence μ(Â_n) = ν(Â_n).

Thus, μ and ν coincide on the algebra Cyl_e. Since σa(Cyl_e) = B(H) (see lemma 4.2), μ and ν coincide on B(H) according to Carathéodory's extension theorem.

Problems 4.1

1) Consider a measurable space (R, 2^R), with the counting measure μ(A) = |A|, A ⊂ R. Prove that the Hilbert space H_0 = L_2(R, μ) is non-separable, and that in H_0 the Borel sigma-algebra differs from the sigma-algebra generated by all cylindrical sets.

2) Let μ and ν be Borel probability measures in a real separable Hilbert space such that μ(B̄(x, r)) = ν(B̄(x, r)), for all x and all r > 0. Prove that μ = ν.
3) Let X* be a separable space. For arbitrary n ≥ 1, {x_1, ..., x_n} ⊂ X and A_n ∈ B(R^n), denote

Â_n(x_1, ..., x_n) = {x* ∈ X* : (⟨x*, x_1⟩, ..., ⟨x*, x_n⟩) ∈ A_n}.

Let Cyl = Cyl(X*, X) be the class of all such cylinders. Prove that σa(Cyl) = B(X*).

4.2. Convolution of measures

DEFINITION 4.3.– Let ξ and η be random elements on the same probability space (Ω, F, P), which are distributed in real normed spaces X and Y, respectively. The two random elements are called independent if for each A ∈ B(X) and B ∈ B(Y),

P{ξ ∈ A and η ∈ B} = P{ξ ∈ A} · P{η ∈ B}.

The product Z := X × Y is a real normed space as well, with the norm ‖(x; y)‖_Z = ‖x‖_X + ‖y‖_Y. Assume additionally that X and Y are separable. Then Z is separable as well, and for a random element ξ in X and a random element η in Y, the couple (ξ; η) is a random element in Z (see part (a) of the proof of lemma 3.6). The next statement is straightforward: ξ and η are independent if, and only if, μ_{(ξ;η)} = μ_ξ × μ_η. Here μ_ξ, μ_η and μ_{(ξ;η)} are the distributions of ξ, η and (ξ; η), respectively. The characteristic functional ϕ_{(ξ;η)} can be written as

ϕ_{(ξ;η)}(x*, y*) = E exp{i(⟨x*, ξ⟩ + ⟨y*, η⟩)}, x* ∈ X*, y* ∈ Y*.

LEMMA 4.3.– (About independence in terms of characteristic functionals) Let X and Y be separable real normed spaces and ξ, η be random elements defined on the same probability space and distributed in X and Y, respectively. The two random elements are independent if, and only if,

ϕ_{(ξ;η)}(x*, y*) = ϕ_ξ(x*) ϕ_η(y*), for all x* ∈ X*, y* ∈ Y*.

PROOF.– It follows the line of the proof of lemma 1.4 and is not given here. It is crucial that in a separable space a measure is uniquely defined by its characteristic functional (see corollary 4.1).

Now, let ξ and η be random elements defined on the same probability space and distributed in a separable real normed space X. Then ξ + η is a random element in X, with the decomposition ξ + η = K(z), where z = (ξ; η), K(x, y) = x + y, x ∈ X, y ∈ X (see part (a) of the proof of lemma 3.6).
We find the distribution of ξ + η. For A ∈ B(X),

μ_{ξ+η}(A) = P{K(z) ∈ A} = P{z ∈ K⁻¹A} = (μ_ξ × μ_η)(K⁻¹A) = ∫_X μ_ξ((K⁻¹A)_y) dμ_η(y).

Here, (K⁻¹A)_y is the so-called y-cross-section of the set K⁻¹A ⊂ X × Y. We have K⁻¹A = {(x, y) : x + y ∈ A}, (K⁻¹A)_y = {x ∈ X : x + y ∈ A} = A − y. Thus,

μ_{ξ+η}(A) = ∫_X μ_ξ(A − y) dμ_η(y).

Using the x-cross-section instead of the y-cross-section, we obtain

μ_{ξ+η}(A) = ∫_X μ_η(A − x) dμ_ξ(x). [4.1]

DEFINITION 4.4.– Let μ and ν be Borel probability measures in a separable normed space X. The probability measure

(μ ∗ ν)(A) = ∫_X μ(A − y) dν(y), A ∈ B(X),
is called the convolution of μ and ν.

It is clear that μ_{ξ+η} = μ_ξ ∗ μ_η, where ξ, η are independent random elements in a separable normed space X.

REMARK 4.1.– For any Borel probability measures μ and ν in a separable normed space X, the convolution μ ∗ ν is well defined. Indeed, consider independent random elements ξ and η in X, with distributions μ_ξ = μ and μ_η = ν. Such elements can be constructed as follows:

(Ω, F, P) = (X × X, B(X × X), μ × ν), ξ(ω_1, ω_2) = ω_1, η(ω_1, ω_2) = ω_2.

Then μ ∗ ν = μ_ξ ∗ μ_η = μ_{ξ+η} is a probability measure on B(X). Moreover, relation [4.1] implies that μ ∗ ν = ν ∗ μ.

LEMMA 4.4.– (About characteristic functional of convolution) Let μ and ν be Borel probability measures in a separable normed space X. Then

ϕ_{μ∗ν}(x*) = ϕ_μ(x*) ϕ_ν(x*), x* ∈ X*.
PROOF.– Introduce independent random elements ξ and η in X, with μ_ξ = μ and μ_η = ν (see remark 4.1). Then

ϕ_{μ∗ν}(x*) = ϕ_{ξ+η}(x*) = E e^{i⟨ξ+η, x*⟩} = E (e^{i⟨ξ, x*⟩} · e^{i⟨η, x*⟩}).

Since ξ and η are independent, the random variables ⟨ξ, x*⟩ and ⟨η, x*⟩ are independent as well. Hence

ϕ_{μ∗ν}(x*) = E e^{i⟨ξ, x*⟩} · E e^{i⟨η, x*⟩} = ϕ_ξ(x*) ϕ_η(x*) = ϕ_μ(x*) ϕ_ν(x*).
Problems 4.2

4) Consider a normed vector space X (possibly non-separable) with cylindrical sigma-algebra C = C(X). Let (Ω, F, P) be a probability space. A mapping ξ : Ω → X is called a weak random element if it is (F, C)-measurable. The induced probability measure μ_ξ = P ξ⁻¹ is called the distribution of ξ.

a) Let ξ and η be weak random elements defined on the same probability space. Prove that ξ + η is a weak random element as well.

b) ξ and η from item (a) are called independent if for all A_1, A_2 ∈ C, P{ξ ∈ A_1 and η ∈ A_2} = P{ξ ∈ A_1} · P{η ∈ A_2}. For such weak random elements, prove that

μ_{ξ+η}(A) = ∫_X μ_ξ(A − y) dμ_η(y), A ∈ C.

(Thus, the convolution μ ∗ ν for probability measures on C can be defined similarly to definition 4.4.)

5) Let μ, ν and τ be Borel probability measures in a separable normed space X. Prove that:

a) δ_0 ∗ μ = μ, where δ_0 is the Dirac measure at 0, which is defined on B(X);

b) (μ ∗ ν) ∗ τ = μ ∗ (ν ∗ τ);

c) (aμ + bν) ∗ τ = a(μ ∗ τ) + b(ν ∗ τ), where a, b ≥ 0, a + b = 1.

6) Let μ_1 ∗ ν = μ_2 ∗ ν = τ, where all four are Borel probability measures in a separable normed space X, and let ϕ_ν not vanish on some set D that is dense in X*. Prove that μ_1 = μ_2. (This means that the deconvolution equation μ ∗ ν = τ, with unknown μ, cannot have two or more solutions.)
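Lemma 4.4 is easy to observe empirically in a finite-dimensional stand-in for X: the empirical characteristic functional of a sum of independent samples factorizes, up to Monte Carlo error. (The two distributions and the evaluation point x* below are arbitrary choices for illustration.)

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Independent random elements in the stand-in X = R^3.
xi = rng.multivariate_normal([0.0, 1.0, 2.0], np.eye(3), size=n)   # samples of mu
eta = rng.exponential(1.0, size=(n, 3))                            # samples of nu

def char_functional(samples, x_star):
    """Empirical characteristic functional: mean of exp(i <x*, sample>)."""
    return np.mean(np.exp(1j * samples @ x_star))

x_star = np.array([0.3, -0.2, 0.1])
lhs = char_functional(xi + eta, x_star)                  # phi_{mu * nu}(x*)
rhs = char_functional(xi, x_star) * char_functional(eta, x_star)
```

Since xi + eta is distributed as μ ∗ ν, lhs and rhs agree up to sampling noise.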
4.3. Properties of characteristic functionals in H

The characteristic functional is a very useful attribute of a measure in a separable space, because a measure can be renewed by its characteristic functional (see corollary 4.1). Lemmas 4.3 and 4.4 state some important properties of characteristic functionals. We study further properties of them. For convenience, here we will deal with measures in a real Hilbert space, though most of the properties are still valid in a normed space.

Let H be a real (possibly non-separable) Hilbert space. For a Borel probability measure μ in H, its characteristic functional ϕ_μ : H → C was introduced in definition 3.15. Sometimes we will denote ϕ_μ as μ̂, emphasizing that the characteristic functional is just a Fourier transform of μ. Thus,

μ̂(z) = ϕ_μ(z) = ∫_H e^{i(z,x)} dμ(x), z ∈ H.

LEMMA 4.5.– It holds μ̂(0) = 1 and μ̂ is positive definite, i.e. for all n ≥ 1, {x_k, 1 ≤ k ≤ n} ⊂ H and {c_k, 1 ≤ k ≤ n} ⊂ C,

Σ_{j,k=1}^n μ̂(x_k − x_j) c_k c̄_j ≥ 0. [4.2]
n
H
j,k=1
=
n
ei(xk −xj ,y) dμ(y) =
ck c¯j
ck c¯j
ei(xk ,y) ei(xj ,y) dμ(y)
H j,k=1
2 n i(xk ,y) = ck e dμ(y) ≥ 0. H k=1
It is useful to write [4.2] in a vector form. Introduce a square matrix A = (ajk )nj,k=1 ,
ajk = μ ˆ(xk − xj ),
[4.3]
ck )n1 . Then LHS of [4.2] is just a quadratic form and column vectors c = (ck )n1 , c¯ = (¯ of A, and [4.2] takes a form c¯ Ac ≥ 0,
for all c ∈ Cn .
¯jk ), and moreover it is positive This means that A is Hermitean (i.e. akj = a semidefinite (i.e. its quadratic form is non-negative).
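The Hermitian, positive semidefinite structure of the matrix A in [4.3] can be checked numerically for a concrete characteristic functional. Below, μ̂ is taken as the characteristic function exp(−t²/2) of the standard Gaussian measure on R (an arbitrary choice for illustration), and positive semidefiniteness is read off from the eigenvalues.

```python
import numpy as np

def phi(t):
    # Characteristic function of N(0, 1) on R.
    return np.exp(-t ** 2 / 2)

rng = np.random.default_rng(3)
x = rng.standard_normal(8)                 # points x_1, ..., x_n in H = R

# A_{jk} = phi(x_k - x_j), as in [4.3]; real symmetric here since phi is even.
A = phi(x[None, :] - x[:, None])

eigenvalues = np.linalg.eigvalsh(A)        # all eigenvalues are >= 0
```

Any choice of points x_k yields a positive semidefinite A, which is exactly inequality [4.2].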
We will derive some interesting relations just from the equality μ̂(0) = 1 and the inequality [4.2].

THEOREM 4.2.– Let H be a real Hilbert space and θ : H → C be a positive definite functional, with θ(0) = 1. Then:

a) θ(−x) = \overline{θ(x)}, x ∈ H;

b) |θ(x)| ≤ 1, x ∈ H;

c) |θ(x) − θ(y)|² ≤ 2(1 − Re θ(x − y)), x, y ∈ H, where Re is the real part of a complex number.

PROOF.– a) We have an inequality like [4.2]:

Σ_{j,k=1}^n θ(x_k − x_j) c_k c̄_j ≥ 0. [4.4]

Put here n = 2, x_1 = 0, x_2 = x. The matrix

A_2 = (θ(x_k − x_j))_{k,j=1}^2 = [[θ(0), θ(x)], [θ(−x), θ(0)]] = [[1, θ(x)], [θ(−x), 1]]

has a non-negative quadratic form. Hence A_2 is Hermitian and θ(−x) = \overline{θ(x)}.

b) Moreover, A_2 is positive semidefinite. By Sylvester's criterion,

det A_2 = 1 − θ(−x)θ(x) = 1 − |θ(x)|² ≥ 0 ⇒ |θ(x)| ≤ 1.

c) In the case θ(x) = θ(y), the inequality holds true because the RHS is always non-negative due to statement (b). Hence, we deal with the case θ(x) ≠ θ(y). Put in [4.4] n = 3, x_1 = 0, x_2 = x and x_3 = y. The corresponding Hermitian matrix is

A_3 = [[1, θ(x), θ(y)], [θ(−x), 1, θ(y − x)], [θ(−y), θ(x − y), 1]].

We put in [4.4] c_1 = 1, c_3 = −c_2 and group conjugate summands using the identity z + z̄ = 2 Re z, z ∈ C:

1 + |c_2|² + |c_3|² + 2 Re(θ(x) · c_2) + 2 Re(θ(y) · c_3) + 2 Re(θ(y − x) · c̄_2 c_3) = 1 + 2|c_2|² − 2 Re θ(y − x) · |c_2|² + 2 Re{(θ(x) − θ(y)) · c_2} ≥ 0, for all c_2 ∈ C.
Finally, we put here

c_2 = λ · |θ(x) − θ(y)| / (θ(x) − θ(y)), λ ∈ R.

We get

λ²(1 − Re θ(y − x)) + λ|θ(x) − θ(y)| + 1/2 ≥ 0, for all λ ∈ R.

If 1 − Re θ(y − x) = 0, then the inequality is just linear in λ, and it cannot hold for each real λ (remember that in our case θ(x) ≠ θ(y)). Thus, 1 − Re θ(y − x) > 0, and the discriminant D of the quadratic function in λ should be non-positive:

D = |θ(x) − θ(y)|² − 2(1 − Re θ(y − x)) ≤ 0.

This implies the desired inequality.

Remember that a linear topology on a linear space L is a topology which is invariant under translations. This means the following: if U is an open set in the topology and x ∈ L, then U + x is an open set in the topology.

COROLLARY 4.3.– Let θ : H → C satisfy the conditions of theorem 4.2. Consider a linear topology τ on H. If Re θ is continuous at zero in the topology τ, then θ is a continuous functional in τ.

PROOF.– We prove that θ is continuous in τ at an arbitrary point x ∈ H. Fix δ > 0. We want to ensure that |θ(x) − θ(y)| < δ, for all y from some set U_x open in τ, with x ∈ U_x. According to theorem 4.2(c),

|θ(x) − θ(y)|² ≤ 2(1 − Re θ(y − x)).

The inequality 1 − Re θ(y − x) = |1 − Re θ(y − x)| < δ²/2 holds if y − x ∈ U_0, where U_0 is some open set in τ with 0 ∈ U_0, i.e. if y ∈ U_0 + x =: U_x. The set U_x is also open in τ (because τ is a linear topology) and x ∈ U_x. Thus, for y ∈ U_x, it holds |θ(x) − θ(y)| < δ, and θ is continuous at the point x. Since x was an arbitrary point of H, θ is continuous in τ.

Later on, in section 4.4, we will introduce the so-called S-topology. It is a linear topology in H, which is crucial in the description of characteristic functionals.

LEMMA 4.6.– For a Borel probability measure μ in a real Hilbert space H, the characteristic functional μ̂ is a uniformly continuous functional on H.
PROOF.– Let {x_n} and {y_n} be two sequences in H, with ‖x_n − y_n‖ → 0 as n → ∞. Consider

|μ̂(x_n) − μ̂(y_n)| ≤ ∫_H |e^{i(x_n, z)} − e^{i(y_n, z)}| dμ(z) = ∫_H |1 − e^{i(x_n − y_n, z)}| dμ(z) =: I_n.

The integrand f_n(z) = |1 − e^{i(x_n − y_n, z)}|, z ∈ H, converges pointwise to zero, and moreover

0 ≤ f_n(z) ≤ 2, ∫_H 2 dμ = 2 < ∞.

Hence by the Lebesgue dominated convergence theorem, I_n → 0 as n → ∞, and μ̂(x_n) − μ̂(y_n) → 0 as n → ∞. This proves the statement.

Problems 4.3

7) Let τ, θ : H → C be positive definite functionals on a real Hilbert space H. Prove that the functionals τ · θ and e^τ are positive definite as well. Hint. Use problem (23) of Chapter 1.

8) Let B(x, y) be a symmetric positive semidefinite bilinear form on a real Hilbert space H. Prove that the functional e^{−B(x,x)} is a positive definite functional on H. Hint. Use lemma 1.6.

4.4. S-topology in H

DEFINITION 4.5.– Let (X, τ) be a topological space and {N_x, x ∈ X} be an indexed family where N_x is a non-empty class of subsets of X. Assume also that for each x ∈ X and U ∈ N_x, it holds x ∈ U, and moreover U is a neighborhood of x (i.e. U contains an open set that contains x). The family {N_x, x ∈ X} is called a neighborhood system of the topological space if, for each U ∈ τ and each x ∈ U, there exists W ∈ N_x such that W ⊂ U.

The next statement is presented in [ENG 89].

THEOREM 4.3.– (About neighborhood system) Let {N_x, x ∈ X} be an indexed family where N_x is a non-empty class of subsets of X. Assume the following:

a) for each x ∈ X and U ∈ N_x, it holds x ∈ U;
b) for each x ∈ X, U ∈ N_x and y ∈ U, there exists V ∈ N_y, with V ⊂ U;

c) for each x ∈ X and U_1, U_2 ∈ N_x, there exists U ∈ N_x, with U ⊂ U_1 ∩ U_2.

Then T = (X, τ) is a topological space where τ consists of all unions of sets from the class ∪_{x∈X} N_x. Moreover, {N_x, x ∈ X} is a neighborhood system of T.

Based on theorem 4.3, we construct a topology on a real separable infinite-dimensional Hilbert space H. Denote E_S = {x ∈ H : (Sx, x) < 1}, where S is an S-operator in H, i.e. S is a self-adjoint, positive and nuclear operator. (Remember that the class of all such operators is denoted L_S(H).) Since 0 ∈ E_S, this set is non-empty; it is open in the usual topology, because E_S = K_S⁻¹((−∞, 1)), where K_S(x) := (Sx, x) is continuous on H and (−∞, 1) is open in R; E_S is unbounded and convex.

Let {e_k, k ≥ 1} be the eigenbasis of S and {λ_k, k ≥ 1} be the corresponding (non-negative) eigenvalues. In the case where Ker S = {0} (i.e. S is non-degenerate),

E_S = {x ∈ H : Σ_{k=1}^∞ (x, e_k)²/a_k² < 1}, a_k = 1/√λ_k, k ≥ 1,

and E_S is an infinite-dimensional ellipsoid, with semiaxes λ_k^{−1/2}.

LEMMA 4.7.– (About topology generated by ellipsoids) Let

N_0 = {E_S, S ∈ L_S(H)}, N_x = N_0 + x = {E_S + x : S ∈ L_S(H)}, x ∈ H.

Then T_S = (H, τ_S) is a linear topological space where τ_S consists of all unions of sets from the class ∪_{x∈H} N_x. Moreover, {N_x, x ∈ H} is a neighborhood system of T_S.

PROOF.– We have to verify conditions (a)–(c) of theorem 4.3. It is enough to deal with x = 0 only.

a) 0 ∈ E_S, for all E_S ∈ N_0.

b) We have to check that for each y ∈ E_S, there exists Q ∈ L_S(H), with y + E_Q ⊂ E_S. Denote (u, v)_S = (Su, v) = (S^{1/2}u, S^{1/2}v), u, v ∈ H; ‖u‖_S := √((u, u)_S) = ‖S^{1/2}u‖, u ∈ H. The latter functional is a seminorm in H.

It holds ‖y‖_S < 1. Let Q = ε⁻²S, and we will choose ε > 0 later. For z ∈ E_Q, it holds ‖z‖_Q = ε⁻¹‖z‖_S < 1, i.e. ‖z‖_S < ε. Then ‖y + z‖_S ≤ ‖y‖_S + ‖z‖_S < ε + ‖y‖_S = 1 if ε = 1 − ‖y‖_S. Thus, with such ε and Q = ε⁻²S ∈ L_S(H), we have y + E_Q ⊂ E_S.
c) For S, Q ∈ L_S(H), it holds E_{S+Q} ⊂ E_S ∩ E_Q and S + Q ∈ L_S(H).

Thus, the conditions of theorem 4.3 are satisfied, and the statement of lemma 4.7 follows directly from this theorem.

DEFINITION 4.6.– The topology τ_S constructed in lemma 4.7 is called the S-topology, or Sazonov's topology, in honor of the mathematician V.V. Sazonov.

Actually, τ_S is the weakest topology under which the quadratic forms K_S(x) = (Sx, x), x ∈ H, S ∈ L_S(H), are continuous.

REMARK 4.2.– Let f : H → C be a continuous functional in the S-topology. Then f is continuous in the usual topology.

PROOF.– For each open set G ⊂ C, f⁻¹(G) ∈ τ_S. But τ_S is a weaker topology than the usual topology; hence f⁻¹(G) is an open set in the usual topology. This implies the statement.

Now, we state a criterion for the continuity of a nonlinear functional in the S-topology.

LEMMA 4.8.– Consider a functional f : H → C such that |f(x)| ≤ 1, for all x ∈ H. This functional is continuous at zero in the S-topology if, and only if, for each ε > 0 there exists S_ε ∈ L_S(H) such that for all x ∈ H,

|f(x) − f(0)| ≤ ε + (S_ε x, x).

PROOF.– a) Necessity. We may and do assume that ε < 1. Since f is assumed continuous at zero in τ_S, there exists S ∈ L_S(H) such that the inequality (Sx, x) < 1 implies |Δf(x)| = |f(x) − f(0)| < ε. Hence

|Δf(x)| ≤ ε + (2Sx, x), for all x ∈ H.

Indeed, for x ∈ E_S, |Δf(x)| < ε ≤ ε + (2Sx, x), and if (Sx, x) ≥ 1 then |Δf(x)| ≤ |f(x)| + |f(0)| ≤ 2 ≤ (2Sx, x) < ε + (2Sx, x). One can set S_ε = 2S ∈ L_S(H).

b) Sufficiency. Now, for each ε > 0 it holds

|Δf(x)| ≤ ε/2 + (S_{ε/2} x, x).
Gaussian Measures in Hilbert Space
If (Sε/2 x, x) < 2ε , then |Δf (x)| < ε. We set S = 2ε Sε/2 ∈ LS (H). We have x ∈ ES ⇐⇒ (Sε/2 x, x)
0, there exists S-operator Sε such that for each x ∈ H, 1 − Re θ(x) ≤ (Sε x, x) + ε. P ROOF.– a) Necessity. We assume that θ is characteristic functional of a Borel probability measure μ. By lemmas 4.5 and 4.6, θ satisfies condition 1. Fix ε > 0 and choose R = R(ε) > 0, with μ({ y > R}) < 2ε . This is possible since lim μ({ y > R}) = μ(∅) = 0.
R→+∞
We have

1 − Re θ(x) = μ(B̄(0, R)) + μ({‖y‖ > R}) − Re ∫_{B̄(0,R)} e^{i(x,y)} dμ(y) − Re ∫_{{‖y‖>R}} e^{i(x,y)} dμ(y) ≤ 2μ({‖y‖ > R}) + ∫_{B̄(0,R)} (1 − cos(x, y)) dμ(y).

Since 1 − cos t = 2 sin²(t/2) ≤ t²/2, it holds

1 − Re θ(x) ≤ ε + (1/2) ∫_{B̄(0,R)} (x, y)² dμ(y) = ε + (1/2) ∫_H (x, y)² dμ_R(y).

Here, μ_R is a finite Borel measure concentrated on B̄(0, R),

μ_R(A) = μ(A ∩ B̄(0, R)), A ∈ B(H).

The covariance operator S_R of μ_R is an S-operator, because

∫_H ‖x‖² dμ_R(x) = ∫_{B̄(0,R)} ‖x‖² dμ(x) ≤ R² < ∞

(here we apply lemma 3.10, which is valid not only for a probability measure but also for any finite Borel measure in H). Hence

1 − Re θ(x) ≤ ε + (1/2)(S_R x, x), x ∈ H,

and condition 2 holds true with S_ε = (1/2) S_R ∈ L_S(H).

b) Sufficiency: Part 1. Now, we assume that θ satisfies both conditions and we want to construct a probability measure on B(H), with μ̂ = θ. According to theorem 4.2(b), |θ(x)| ≤ 1, |Re θ(x)| ≤ 1, and taking into account lemma 4.8, condition 2 implies that Re θ(x) is continuous at zero in the S-topology. The latter topology is linear, and from corollary 4.3 we get that θ is a continuous functional in the S-topology. Due to remark 4.2, θ is continuous in the usual topology as well.

In the rest of the proof, we may and do assume that H coincides with the real sequence space l_2. Our functional θ : l_2 → C satisfies conditions [4.5], and we are able to construct the objects μ_n and μ_e from lemma 4.9(a). It remains to prove that μ_e(l_2) = 1.
Gaussian Measures in Hilbert Space
c) Sufficiency: Part 2. With the measure μ on B(R∞ ), we relate a random element X in R∞ to its distribution μX = μe . We set (Ω, F, P) = (R∞ , B(R∞ ), μe ) and ∞ introduce X = (Xn )∞ 1 : Ω → R , X(ω) = ω. Then μe (l2 ) = 1 ⇐⇒ P{X(ω) ∈ l2 } = 1 ⇐⇒
∞
Xn2 (ω) < ∞ a.s.
1
We have to show that the latter series converges a.s. The next identity follows from theorem 1.6 and gives an expression for characteristic function of standard Gaussian vector in Rn : n n 1 2 exp{i aj yj }ρ(y)dy = exp{− a }, aj ∈ R, 1 ≤ j ≤ n. 2 1 j Rn 1 Here 1 2 1 exp{− y }, ρ(y) := √ 2 1 j ( 2π)n n
y ∈ Rn
is pdf of standard Gaussian vector in Rn . We substitute aj = Xk+j (ω) to the identity: 1 2 2 exp{− (Xk+1 + · · · + Xk+n )}d P = Ikn := 2 Ω ⎡ ⎤ n ⎣ exp{i Xk+j yj }ρ(y)dy ⎦ d P(ω). = Rn
Ω
j=1
One can apply Fubini’s theorem because the double integral can be bounded as follows: n | exp{i Xk+j yj }|ρ(y)dyd P(ω) ≤ P(Ω) ρ(y)dy = 1 < ∞. Ω
Rn
Then Ikn =
Rn
j=1
Rn
⎡ ρ(y) ⎣
exp{i Ω
n
⎤ Xk+j yj }d P(ω)⎦ dy.
j=1
The inner integral equals exp{i Lk+n
n j=1
⎛ xk+j yj }dμk+n (x) = θ ⎝
n j=1
⎞ yj ek+j ⎠
Construction of Measure by its Characteristic Functional
107
(here {en } is standard basis in l2 ). Therefore, a real number Ikn can be expressed as follows: ⎞ ⎞ ⎛ ⎛ n n Ikn = θ⎝ yj ek+j ⎠ ρ(y)dy = Re θ ⎝ yj ek+j ⎠ ρ(y)dy. Rn
Rn
j=1
j=1
Now, we use condition 2: ⎛ ⎞⎞ ⎛ n ⎝1 − Re θ ⎝ 1 − Ikn = yj ek+j ⎠⎠ ρ(y)dy ≤ Rn
≤ε+
n
j=1
(Sε ek+j , ek+p )
j,p=1
Rn
yj yp ρ(y)dy.
Remember that ρ is pdf of standard Gaussian vector γ = (γj )n1 ; hence yj yp ρ(y)dy = E γj γp = δjp , 1 ≤ j, p ≤ n. Rn
Then
⎧ ⎫ ∞ ∞ n ⎨ 1 ⎬ 2 (Sε ek+j , ek+j ) = ε + (Sε ep , ep ). Xk+j 1 − E exp − ≤ ε+ ⎩ 2 ⎭ j=1 j=1 p=k+1
∞
Since Sε is nuclear, p=1 (Sε ep , ep ) < ∞ and we can fix k = k0 (ε), with ∞ p=k+1 (Sε ep , ep ) ≤ ε. We get ⎧ ⎫ n ⎨ 1 ⎬ 2 E exp − Xk+j ≥ 1 − 2ε. ⎩ 2 ⎭ j=1 We tend n → ∞ and obtain by Lebesgue dominated convergence theorem (here exp (−∞) = 0 by definition): ⎧ ⎫ ∞ ⎨ 1 ⎬ 2 E exp − Xk+j ≥ 1 − 2ε, ⎩ 2 ⎭ j=1
∞ Xn2 < ∞} ≥ P{ 1
⎧ ⎫ ∞ ⎨ 1 ⎬ 2 exp − X dP = k+j ⎩ 2 ⎭ 2 {ω: ∞ 1 Xn 0 and Sε = 12 Sμ ∈ LS (H). But in the general case the covariance operator of μ need not exist, and moreover, if it does exist, this operator need not be nuclear (see example 3.8). Comparing theorems 4.5 and 4.4, we see that in infinite-dimensional case, a functional should be continuous in a weaker topology than the usual one, while in Rn the continuity in the usual topology is enough. E XAMPLE 4.1.– (Exponent of quadratic form) Let H be a real separable infinitedimensional Hilbert space and C be self-adjoint bounded operator in H. Consider a functional θ(z) = e−(Cz,z) ,
z ∈ H.
We state that θ is characteristic functional of some Borel probability measure in H if, and only if, C is S-operator. P ROOF.– a) Sufficiency. Assume that C is S-operator. Then θ(0) = 1; (Cx, y) is a symmetric positive semidefinite bilinear form on H; hence θ is a positive definite functional (see problem (8) of Chapter 4). Finally KC (z) = (Cz, z) is continuous in S-topology (see problem (11) of Chapter 4), therefore, Re θ(z) = θ(z) = e−KC (z) is continuous in S-topology as a continuous function of KC . According to theorem 4.5, θ = μ ˆ, for some Borel probability measure μ in H.
Construction of Measure by its Characteristic Functional
109
b) Necessity. Now, assume that θ is characteristic functional of some Borel probability measure in H. Then |θ(z)| = θ(z) ≤ 1 (see theorem 4.2(b)); hence (Cz, z) ≥ 0, and C is a positive operator. Fix ε ∈ (0, 12 ). According to theorem 4.5, there exists Sε ∈ LS (H) such that for each z ∈ H, 1 − θ(z) = 1 − Re θ(z) ≤ (Sε z, z) + ε. At each point where (Sε z, z) ≤ ε, it holds 1 − e−(Cz,z) ≤ 2ε
⇒
(Cz, z) ≤ aε := log
1 . 1 − 2ε
Then (Cz, z) ≤
aε (Sε z, z), ε
for all
z∈H
(see part (a) of the solution to problem (11) of Chapter 4). Finally, for any orthobasis {en }, ∞
(Cen , en ) ≤
1
∞ aε (Sε en , en ) < ∞, ε 1
and according to theorem 3.8 the positive self-adjoint operator C is nuclear. Thus, C is S-operator. Problems 4.5 12) Let θ : l2 → C be characteristic functional of some Borel probability measure in l2 and μe be a measure in R∞ constructed in lemma 4.9(a). Prove that μe (l2 ) = 1. 13) Let H be a real separable infinite-dimensional Hilbert space, ϕ : H → C be a continuous positive definite functional, with ϕ(0) = 1, and A ∈ S2 (H). Prove that the functional θ(x) = ϕ(Ax), x ∈ H is characteristic functional of some Borel probability measure in H.
5 Gaussian Measure of General Form
In this chapter, we give a description of all Gaussian measures in H. The main tool is the Minlos–Sazonov theorem (theorem 4.5). We derive some properties of Gaussian measures in Hilbert space. Moreover, we introduce a Gaussian measure in a normed space and prove that its exponential moments are finite.

5.1. Characteristic functional of Gaussian measure

Remember that a Gaussian random element and a Gaussian measure in a real Hilbert space H were introduced in definitions 2.7 and 2.8. The mean value and correlation operator of a random element and of a Borel probability measure in H were introduced in definitions 2.9 and 2.10.

LEMMA 5.1.– Let μ be a Gaussian measure in a real Hilbert space H. Then it has finite second weak moments and its characteristic functional has the form

μ̂(x) = exp{i(m, x) − (1/2)(Sx, x)}, x ∈ H,

where m and S are the mean value and correlation operator of μ, respectively.

PROOF.– Let ξ be a Gaussian random element in H, with distribution μ. For x ∈ H, denote ξ_x = (ξ, x) and let μ_x be the distribution of the r.v. ξ_x, μ_x ∼ N(m_x, σ_x²) (it can happen that σ_x² = 0; then μ_x is the Dirac measure at the point m_x). We have by the change of variables formula:

μ̂(x) = ∫_H e^{i(x,y)} dμ(y) = ∫_R e^{it} dμ_x(t) = μ̂_x(1) = exp{i m_x u − σ_x² u²/2}|_{u=1} = exp{i m_x − σ_x²/2}. [5.1]
Again, by the change of variables formula, the measure μ has finite first weak moments:

m_x = E ξ_x = ∫_R t dμ_x(t) = ∫_H (x, y) dμ(y),

and by corollary 3.3 there exists the mean value m of μ, with m_x = (m, x);

σ_x² = D ξ_x = ∫_R (t − m_x)² dμ_x(t) = ∫_H ((x, y) − m_x)² dμ(y) = ∫_H (x, y − m)² dμ(y),

the measure μ has finite weak second moments, and by corollary 3.5 there exists the correlation operator S of μ. Therefore, σ_x² = (Sx, x). We plug the expressions for m_x and σ_x² into formula [5.1] and obtain the desired relation for μ̂(x).

Remember that the correlation operator of any Borel probability measure in H (if such an operator exists) is always bounded, self-adjoint and positive.

THEOREM 5.1.– (About characteristic functional of Gaussian measure) Let H be a real separable infinite-dimensional Hilbert space.

a) If μ is a Gaussian measure in H, then its correlation operator is an S-operator.

b) If x_0 ∈ H and S ∈ L_S(H), then the functional

ϕ(x) = exp{i(x_0, x) − (1/2)(Sx, x)}, x ∈ H,

is the characteristic functional of some Gaussian measure in H, with mean value x_0 and correlation operator S.

PROOF.– a) Let ξ be a Gaussian random element with distribution μ. By lemma 5.1, the characteristic functional is

ϕ_ξ(x) = μ̂(x) = exp{i(m, x) − (1/2)(Sx, x)}, x ∈ H,

where m and S are the mean value and correlation operator of μ, respectively. Introduce the random element ξ_c = ξ − m, with distribution μ_c. Then

μ̂_c(x) = ϕ_{ξ_c}(x) = E e^{i(x, ξ−m)} = e^{−i(x,m)} ϕ_ξ(x) = exp{−(1/2)(Sx, x)}.

The operator (1/2)S is self-adjoint and bounded as a half of the correlation operator of a Borel probability measure in H. According to example 4.1, (1/2)S is an S-operator; hence S = 2 · ((1/2)S) is an S-operator as well (remember that L_S(H) is a cone in S_1(H)).
b) We start with the case x0 = 0. Then φ(x) = exp{ −½(Sx, x) }; since S ∈ LS(H), the operator ½S is an S-operator as well, and according to example 4.1 there exists a Borel probability measure μ with μ̂ = φ. Now, we prove that μ is Gaussian. Let (Ω, F, P) = (H, B(H), μ) and ξ: Ω → H, ξ(ω) = ω. Then ξ is a random element in H with distribution μ. Fix x ∈ H and introduce the r.v. ξx = (ξ, x), with distribution μx. We have

μ̂x(t) = ∫_R e^{itz} dμx(z) = ∫_H e^{it(x,y)} dμ(y) = μ̂(tx) = exp{ −t²(Sx, x)/2 },  t ∈ R.

Therefore, μx is the normal distribution N(0, σx²), σx² = (Sx, x) (it can happen that σx² = 0). Hence ξ is a Gaussian random element and μ is a Gaussian measure.

Now, consider the general case x0 ∈ H. Let η = ξ + x0 and let ν be the distribution of η. We have:

φη(x) = ν̂(x) = E e^{i(x, ξ+x0)} = e^{i(x,x0)} φξ(x) = φ(x);

(η, x) = (ξ, x) + (x0, x) ∼ N(mx, σx²),  mx = (x0, x),  σx² = (Sx, x).

Thus, η is a Gaussian random element and its distribution ν is a Gaussian measure with the given characteristic functional φ. By lemma 5.1, ν has mean value x0 and correlation operator S.

Theorem 5.1 shows that a Gaussian measure μ in a real separable infinite-dimensional Hilbert space H is uniquely defined by its mean value m ∈ H and correlation operator S ∈ LS(H). For a Gaussian random element ξ and its distribution μ, we write ξ ∼ N(m, S) and say that ξ has normal distribution with parameters m and S. There is a one-to-one correspondence between the class of all Gaussian measures in H and the parameter space H × LS(H). Problem (15) from Chapter 3 shows that a general Borel probability measure in H is not uniquely defined by those two characteristics.

COROLLARY 5.1.– Let μ be a Gaussian measure in a real separable infinite-dimensional Hilbert space, with mean value m and covariance operator A. Then

∫_H ||x||² dμ(x) = tr A < ∞,  [5.2]

and, in the sense of Bochner integral,

m = ∫_H x dμ(x).  [5.3]
PROOF.– By theorem 5.1(a), the correlation operator S of μ is nuclear; then corollary 3.2 implies ∫_H ||x||² dμ(x) < ∞, and [5.2] follows by lemma 3.10. Hence ∫_H ||x|| dμ(x) < ∞, and [5.3] holds true involving the Bochner integral (see section 3.3.1).

Problems 5.1

1) Let H be a real separable infinite-dimensional Hilbert space and h ∈ H. Prove that there is no Borel probability measure in H with characteristic functional exp{ i(h, x) − ||x||²/2 }, x ∈ H.

2) Let ξ1 and ξ2 be independent random elements in a real separable infinite-dimensional Hilbert space and ξi ∼ N(mi, Si), i = 1, 2. Prove that ξ1 + ξ2 ∼ N(m1 + m2, S1 + S2).

5.2. Decomposition of Gaussian measure and Gaussian random element

Let μ be a Gaussian measure with mean value m and correlation operator S in a real separable infinite-dimensional Hilbert space H. Let {ek, k ≥ 1} be an eigenbasis of S and {λk, k ≥ 1} be the corresponding eigenvalues, λk ≥ 0, Σ_{k=1}^∞ λk < ∞. We use the construction described in lemma 4.9. Consider the increasing system of finite-dimensional subspaces Ln := span(e1, ..., en). Let μn be the probability measure on B(Ln) with μ̂n = μ̂|Ln. For x ∈ Ln, it holds

μ̂n(x) = exp{ i(m, x) − ½(Sx, x) } = exp{ i Σ_{k=1}^n mk xk − ½ Σ_{k=1}^n λk xk² }.

Here mk := (m, ek) and xk := (x, ek) are the Fourier coefficients of m and x, respectively. Hence

μ̂n(x) = Π_{k=1}^n exp{ i mk xk − ½ λk xk² } = Π_{k=1}^n μ̂[ek](xk),  x ∈ Ln.  [5.4]

Here μ[ek] is the Gaussian measure on B(R) with mean value mk and variance ∫_R (t − mk)² dμ[ek](t) = λk (if λk = 0, then μ[ek] is the Dirac measure δ_{mk} at the point mk). We identify Ln and Rn and obtain from [5.4] that

μn = Π_{k=1}^n μ[ek].

The measures {μn, n ≥ 1} are consistent and yield a measure μe on B(R∞), with μe(Ân) = μn(An), for all n ≥ 1 and An ∈ B(Rn) (see lemma 4.9). The
measure μe is just the infinite product of the measures μ[ek]. Now, we identify x ∈ H with the vector of its Fourier coefficients ((x, ek))_{k=1}^∞ ∈ l2; thus, we identify H and l2. Then

μ = μe|B(l2)

(see problem (12) from Chapter 4). In such a situation, we say that μ is a product measure in H and write

μ = Π_{k=1}^∞ μ[ek]  in H  [5.5]

(see [2.20]). In section 2.4, we constructed a Gaussian measure in l2 as a product measure of one-dimensional Gaussian measures. Expansion [5.5] shows that an arbitrary Gaussian measure in H can be constructed in that way.

Now, we derive an expansion for a Gaussian random element in H. Actually, we extend theorem 1.7 to infinite-dimensional Hilbert space.

THEOREM 5.2.– (About expansion of Gaussian random element) Let m ∈ H, S be an S-operator in H, {λk, k ≥ 1} be the positive eigenvalues of S (with multiplicity) and {ek, k ≥ 1} be the corresponding orthonormal system of eigenvectors.

a) For ξ ∼ N(m, S), there exist i.i.d. N(0, 1) r.v.'s {γk, k ≥ 1} on the underlying probability space such that almost surely

ξ = m + Σ_{k≥1} √λk γk ek,
where the series (if the sum is infinite) converges strongly in H with probability 1.

b) If {γk, k ≥ 1} are i.i.d. N(0, 1) random variables, then the series (if the sum is infinite) T := Σ_{k≥1} √λk γk ek converges strongly in H a.s., and the random element η which equals m + T a.s. satisfies η ∼ N(m, S).

PROOF.– We complete the given orthonormal system to an eigenbasis {ek, k ∈ N}. It holds Sek = λk ek, k ∈ N (some of the λk's can be zero).

a) Introduce X = ξ − m, X ∼ N(0, S). Now,

X = Σ_{k∈N} (X, ek) ek,  E(X, ek) = 0,  D(X, ek) = λk,  k ∈ N.

If λj = 0, then (X, ej) = 0 a.s. Thus, a.s.

X = Σ_{k: λk>0} ((X, ek)/√λk) · √λk ek
(the series converges strongly in H a.s.). The random variables γk := (X, ek)/√λk from the latter sum are jointly Gaussian (because Σ_{k=1}^N ak γk = (X, Σ_{k=1}^N (ak/√λk) ek) is a Gaussian r.v.), and E γk = 0,

Cov(γk, γj) = (Sek, ej)/√(λk λj) = λk δkj/√(λk λj) = δkj.

By theorem 1.6(a), {γk} are i.i.d. N(0, 1) random variables. Now,

X = Σ_{k≥1} √λk γk ek  a.s.,

and the statement follows.

b) We have

E Σ_{k≥1} (√λk γk)² = Σ_{k≥1} λk E γk² = Σ_{k≥1} λk = tr S < ∞.

Hence Σ_{k≥1} (√λk γk)² < ∞ a.s., and the series T converges strongly in H a.s.
Now, let η = m + T a.s. It holds φη(x) = e^{i(x,m)} φT(x), and by the Lebesgue dominated convergence theorem

φη(x) = e^{i(x,m)} lim_{n→∞} E exp{ i Σ_{k=1}^n √λk γk (x, ek) } = e^{i(x,m)} Π_{k≥1} E exp{ i √λk γk (x, ek) } =

= e^{i(x,m)} Π_{k≥1} e^{−λk (x, ek)²/2} = exp{ i(x, m) − ½ Σ_{k=1}^∞ λk (x, ek)² }.

Thus,

φη(x) = exp{ i(x, m) − ½(Sx, x) },  x ∈ H,

and by theorem 5.1(b), η ∼ N(m, S).
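The expansion of theorem 5.2 is directly usable for simulation: truncating the series Σ √λk γk ek gives approximate samples of N(m, S). A minimal sketch for a centered element of l2; the spectrum λk = 2^(−k) and the truncation level are illustrative assumptions, not taken from the text:

```python
import random

random.seed(0)

# Illustrative spectrum: lambda_k = 2^{-k}, k = 1..20, so tr S is close to 1.
lam = [2.0 ** (-k) for k in range(1, 21)]
trace_S = sum(lam)

def sample_centered():
    """One truncated sample of X = sum_k sqrt(lambda_k) gamma_k e_k, in l2 coordinates."""
    return [(l ** 0.5) * random.gauss(0.0, 1.0) for l in lam]

# By the expansion, E ||X||^2 = sum_k lambda_k = tr S (cf. corollary 5.1).
N = 20000
mean_sq_norm = sum(sum(x * x for x in sample_centered()) for _ in range(N)) / N
print(mean_sq_norm, trace_S)
```

With 20 000 samples, the Monte Carlo estimate of E ||X||² matches tr S to within a few hundredths.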
Problems 5.2

3) Let ξ ∼ N(m, S) in a real separable infinite-dimensional Hilbert space and let the operator S be non-singular (i.e. Ker S = {0}). Prove that for each α > 0, E ||ξ||^{−α} < ∞.

4) Let ξ ∼ N(0, S) in a real separable Hilbert space, ξ1, ..., ξn be independent copies of ξ, U = (ukj)_{k,j=1}^n be a real orthogonal matrix, and ηk = Σ_{j=1}^n ukj ξj, k = 1, ..., n. Prove that η1, ..., ηn are independent copies of ξ.

5.3. Support of Gaussian measure and its invariance

We state a classical result.

THEOREM 5.3.– (Anderson's inequality) Let g be a centered (i.e. with zero mean) Gaussian measure in Rn. Then for each symmetric convex Borel set A and for each vector a, it holds g(A + a) ≤ g(A).

For the proof, see [BOG 98], Chapter 1. We can strengthen theorem 5.3 in the case where A is a ball.

LEMMA 5.2.– Let d > 0 and ξ ∼ N(0, σ²), σ > 0. Then the function

φ(a) := P{ξ ∈ [−d + a, d + a]},  a ≥ 0,

is decreasing.

PROOF.– Let 0 ≤ a1 < a2. Then

φ(a1) − φ(a2) = ∫_{−d+a1}^{−d+a2} ρ(x) dx − ∫_{d+a1}^{d+a2} ρ(t) dt = ∫_{d+a1}^{d+a2} (ρ(t − 2d) − ρ(t)) dt,

where ρ(x) = (1/(√(2π) σ)) e^{−x²/(2σ²)} is the pdf of ξ (in the first integral we substituted x = t − 2d). For t ∈ (d + a1, d + a2], |t − 2d| equals either t − 2d or 2d − t, and in both cases it is less than t = |t|. Therefore, ρ(t − 2d) > ρ(t) for t ∈ (d + a1, d + a2]; hence φ(a1) − φ(a2) > 0.
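Lemma 5.2 is easy to probe numerically: the Gaussian cdf is expressible through the error function, so φ(a) = Φ(d + a) − Φ(−d + a). A small sketch; the values d = σ = 1 and the grid of points are illustrative choices:

```python
import math

def phi(a, d=1.0, sigma=1.0):
    """phi(a) = P{xi in [-d+a, d+a]} for xi ~ N(0, sigma^2), via the error function."""
    cdf = lambda u: 0.5 * (1.0 + math.erf(u / (sigma * math.sqrt(2.0))))
    return cdf(d + a) - cdf(-d + a)

values = [phi(a) for a in (0.0, 0.5, 1.0, 2.0, 4.0)]
print(values)  # strictly decreasing in a, as the lemma asserts
```

At a = 0 the value is P{|ξ| ≤ 1} = erf(1/√2) ≈ 0.6827, and the sequence decreases as the interval is shifted away from the mean.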
COROLLARY 5.2.– Let r > 0, let g be a centered Gaussian measure in Rn with non-singular correlation matrix S, and let {ei} be an eigenbasis of S. For a ∈ Rn, denote ci = ci(a) = (a, ei), 1 ≤ i ≤ n. Then the function φ(|c1|, ..., |cn|) = g(B̄(a, r)) is decreasing in each argument |ci| for fixed other arguments |cj|, j ≠ i.

PROOF.– For n = 1, the statement follows from lemma 5.2. Let n ≥ 2, ξ ∼ N(0, S), ηi = (ξ, ei), 1 ≤ i ≤ n. Then the ηi are independent normal random variables with positive variances. It is enough to consider ci ≥ 0, 1 ≤ i ≤ n. We have

φ(c1, ..., cn) = P{ Σ_{i=1}^n (ηi − ci)² ≤ r² } = E [ P{ (η1 − c1)² ≤ r² − S2,n, S2,n < r² | η2, ..., ηn } ],

where S2,n = Σ_{i=2}^n (ηi − ci)². Denote

g1(c1, R) = P{ (η1 − c1)² ≤ R² },  c1 ≥ 0,  R > 0.

Then

φ(c1, ..., cn) = E [ I{S2,n < r²} · g1(c1, R)|_{R = √(r² − S2,n)} ].

The function g1 is decreasing in c1 and P{S2,n < r²} > 0; hence φ is decreasing in c1 for fixed c2, ..., cn. In a similar way, φ is decreasing in each argument ci for fixed other arguments cj, j ≠ i.

LEMMA 5.3.– Let μ be a centered Gaussian measure in a real separable Hilbert space H. Then for each ε > 0,

μ(B̄(0, ε)) > 0.
119
P ROOF.– a) Consider the case dim H = n < ∞. By theorem 1.8, supp μ = R(S) where S is variance–covariance matrix of μ. Hence, 0 ∈ supp μ, and the desired inequality follows (see definition 1.18). b) Now, let dim H = ∞. We may and do assume that H coincides with real sequence space l2 . First we prove that for any a ∈ l2 , ¯ ε)) ≤ μ(B(0, ¯ ε)). μ(B(a,
[5.6]
Indeed, consider a Gaussian random element γ = (γn )∞ 1 in l2 , with distribution μ. We have ∞ ¯ ε)) = P{ (γn − an )2 ≤ ε2 } = μ(B(a, 1
= lim P{ N →∞
N
(γn − an )2 ≤ ε2 }.
1
By theorem 5.3 or corollary 5.2 P{
N
N (γn − an )2 ≤ ε2 } ≤ P{ γn2 ≤ ε2 }.
1
1
Hence ¯ ε)) ≤ lim P{ μ(B(a, N →∞
N
∞ γn2 ≤ ε2 } ≤ P{ γn2 ≤ ε2 } =
1
1
¯ ε)), = μ(B(0, and inequality [5.6] is established. c) Still for H = l2 , we prove the statement of the present lemma. ˆ ε0 )) = 0. Then [5.6] implies that for Suppose that for some ε0 > 0, μ(B(0, ¯ each a ∈ l2 , μ(B(a, ε0 )) = 0. But l2 is separable; hence l2 =
∞
¯ (k) , ε0 ), B(a
1
for some sequence {a(k) } ⊂ l2 . Therefore, μ(l2 ) ≤
∞
¯ (k) , ε0 )) = 0. μ(B(a
1
But μ(l2 ) = 1. The obtained contradiction proves the statement.
120
Gaussian Measures in Hilbert Space
In Chapter 1, we found the support of a Gaussian measure in Rn . Now, we extend definitions 1.15 and 1.18 of support to a separable metric space (X, ρ). D EFINITION 5.1.– For a random element ξ in (X, ρ), we denote by G a union of all balls B(x, r) with P{ξ ∈ B(x, r)} = 0. The set X \ G is called the support of ξ and denoted as supp ξ. D EFINITION 5.2.– For a Borel probability measure μ in (X, ρ), we denote by G a union of all balls B(x, r) with μ(B(x, r)) = 0. The set X \ G is called the support of μ and denoted as supp μ. It is clear that since (X, ρ) is separable, μ(G) = 0 and μ(supp μ) = 1. T HEOREM 5.4.– (About support of Gaussian measure) Let μ be a Gaussian measure in a real separable Hilbert space H, with mean value m and correlation operator S. Then supp μ = R(S) + m,
[5.7]
where bar stands for the closure of a set. P ROOF.– a) Let γ ∼ N (m, S), η = γ − m. Then η ∼ N (0, S) and supp μ = supp γ = supp η + m. In order to prove [5.7], we have to show that supp η = R(S).
[5.8]
b) If dim H < ∞, then this follows from theorem 1.8. Now, we consider the case dim H = ∞. The space H can be decomposed as H = M 1 ⊕ M2 ,
M1 = R(S),
M2 = Ker S,
because S is self-adjoint. Respectively, S = S1 ⊕ S2 ,
S1 ∈ L(M1 ),
S2 = 0 ∈ L(M2 ),
KerS1 = {0}.
According to section 5.2, μη = μ1 × μ2 ,
μ1 is N (0, S1 ) distribution,
μ 2 = δ0 ,
the latter is Dirac measure at origin in M2 . Consequently (see problem (5) of Chapter 5), supp η = supp μη = (supp μ1 ) × (supp μ2 ),
Gaussian Measure of General Form
121
where we identify M1 ⊕ M2 and M1 × M2 . But supp μ2 = {0}, and in order to show [5.8], we have to prove that supp μ1 = R(S1 ). But now R(S1 ) = M1 due to the properties of S1 . If dim M1 < ∞, then supp μ1 = R(S1 ) = M1 and we are done. Therefore, we focus on the case dim M1 = ∞. c) Thus, it remains to prove the following: if η ∼ N (0, S) with non-singular S (i.e. Ker S = {0}), then supp η = H. We may and do assume H which coincides with real sequence space l2 and S is a diagonal operator Sx = (λ1 x1 , . . . , λn xn , . . . ), ∞ with λn > 0, n ≥ 1, 1 λn < ∞.
x ∈ l2 ,
¯ r) in l2 . We have to show Now, consider the arbitrary closed ball B(a, ¯ that μη (B(a, r)) > 0. There exist r1 > 0 and a finitary vector b = (a1 , . . . , ak , 0, 0, . . . ) ¯ r). We have since {ηi } are independent: ¯ r1 ) ⊂ B(a, such that B(b, ¯ r1 )) = P{ μη (B(b,
k
(ηi − ai )2 +
1
≥ P{
k
(ηi − ai )2 ≤
1
= P{
k
∞
ηi2 ≤ r12 } ≥
k+1 ∞ r12 2 r2 ηi ≤ 1 } = , 2 2 k+1
∞
(ηi − ai )2 ≤
1
r12 r2 ηi2 ≤ 1 }. } · P{ 2 2 k+1
Here the first multiplier is positive because for η (k) := (ηi )k1 , supp η (k) = Rk (see theorem 1.8); the second multiplier is positive as well due to lemma 5.2 applied to a centered Gaussian random element η(k) := (ηk+1 , ηk+2 , . . . ) in l2 . Thus, ¯ r1 )) > 0 μη (B(b,
⇒
¯ r) > 0. μη (B(a,
Therefore, for the measure μη it holds G = ∅ and supp η = supp μη = H\G = H (see definition 5.2). This accomplishes the proof. C OROLLARY 5.3.– Let μ be a centered Gaussian measure in a real separable Hilbert space H, with non-singular correlation operator S. Then supp μ = H.
PROOF.– We have H = R(S)‾ ⊕ Ker S. But now Ker S = {0}. Hence H = R(S)‾, and by theorem 5.4, supp μ = R(S)‾ + m = H + m = H.
Now, we switch to invariance properties of a Gaussian measure. In section 1.5.4, we dealt with a centered Gaussian measure μ in Rn with a non-singular variance–covariance matrix S. We introduced the inner product (x, y)S = (S^{−1/2}x, S^{−1/2}y) in Rn and showed that, given U ∈ L(Rn), μ is U-invariant if, and only if, U is an orthogonal transformation w.r.t. the inner product (x, y)S. Our goal is to obtain an analogous result for a Gaussian measure in a real separable infinite-dimensional Hilbert space H.

Now, consider a centered Gaussian measure μ in H with a non-singular covariance operator S (i.e. Ker S = {0}). On the linear set

H0 := √S(H) = { √S x : x ∈ H }

we introduce the inner product

(x, y)S := (S^{−1/2}x, S^{−1/2}y),
x, y ∈ H0 .
With this inner product, H0 is a real separable infinite-dimensional Hilbert space, because √S : H → H0 is an isometry of the two spaces with inner products.

REMARK 5.1.– Let {en, n ≥ 1} be the eigenbasis of S and {λn, n ≥ 1} be the corresponding eigenvalues of S. Then

H0 = { y = Σ_{n=1}^∞ yn en : Σ_{n=1}^∞ yn²/λn < ∞ }.

Here the series Σ_{n=1}^∞ yn en converges in the norm of H.
For U ∈ L(H), the adjoint operator U* ∈ L(H). Assume additionally that U(H0) ⊂ H0, i.e. H0 is invariant w.r.t. U. Denote U0 = U|H0.
Clearly, U0 is a linear operator in H0 . L EMMA 5.4.– (About boundedness of restricted operator) Let U ∈ L(H) and H0 be invariant w.r.t. U . Then U0 ∈ L(H0 ).
P ROOF.– We use the closed graph theorem (see [BER 12]). Suppose that in H0 we have the convergence of two sequences: √
H
0 Sxn −−→
√
√ H0 √ U Sxn −−→ Sy.
Sx,
Then in H we have the convergence: H
xn −→ x
⇒
√ √ H U Sxn −→ U Sx. 1
H
1
1
√ √ H U Sxn −→ U Sy
⇒
H
1
H
1
1
0 z, then S − 2 zn −→ S − 2 z, and S 2 (S − 2 zn ) = zn −→ S 2 (S − 2 z) = z If zn −−→ 1 due to the continuity of S 2 in H. Hence, the convergence in H0 implies the convergence in H.
We have √ H √ U Sxn −→ Sy,
√
√ Sy = U Sx.
Thus, the graph of U0 is closed in H0 , and by the closed graph theorem U0 ∈ L(H0 ). For V ∈ L(H0 ), the adjoint operator from L(H0 ) will be denoted as Vˆ . L EMMA 5.5.– Let U ∈ L(H) and H0 be invariant w.r.t. U . Then on the set S(H), which is dense in H0 , ˆ0 = SU ∗ S −1 . U P ROOF.– Let x ∈ H0 and y ∈ S(H). Then ˆ0 y)S = (U0 x, y)S = (S − 12 U0 x, S − 12 y) = (x, U = (U0 x, S −1 y) = (U x, S −1 y) = (x, U ∗ S −1 y) = 1
1
= (S 2 x, S 2 U ∗ S −1 y)S = (x, SU ∗ S −1 y). ˆ0 y = SU ∗ S −1 y, y ∈ S(H). Hence, U L EMMA 5.6.– (About linear transform of Gaussian random Let ξ ∼ N (m, S) in H and A ∈ L(H). Then Aξ ∼ N (Am, ASA∗ ).
element)
P ROOF.– Random element Aξ is Gaussian, because for each h ∈ H, (Aξ, h) = (ξ, A∗ h) is a Gaussian random variable (possibly degenerate). Next, E(Aξ) = A(E ξ) = Am by properties of Pettis integral. The correlation operator of Aξ equals ASA∗ (see problem (16) of Chapter 3).
124
Gaussian Measures in Hilbert Space
C OROLLARY 5.4.– (Criterion for invariance of Gaussian measure under linear transform) Let μ be a centered Gaussian measure in H, with correlation operator S, and A ∈ L(H). Then μ is A-invariant if, and only if, ASA∗ = S. P ROOF.– Consider a random element ξ ∼ N (0, S) in H, with distribution μξ = μ. By lemma 5.6, Aξ ∼ (0, ASA∗ ). The measure μ is A-invariant if, and only if, ξ and Aξ are identically distributed, and this is equivalent to equality S = ASA∗ . T HEOREM 5.5.– (Condition for invariance in terms of unitary operators) Let μ be a centered Gaussian √ measure in H, with a non-singular covariance operator S, and U ∈ L(H). Let H0 = S(H) be invariant w.r.t. U , and U0 = U |H0 be unitary operator in H0 . Then μ is U -invariant. P ROOF.– In view of corollary 5.4, we have to check that U SU ∗ = S. √ Since S(H) ⊂ S(H), it holds U SU ∗ = U0 SU ∗ , and we have to verify that U0 SU ∗ x = Sx,
x ∈ H,
or U0 SU ∗ S −1 y = y,
y ∈ S(H),
or in view of lemma 5.5, ˆ0 y = y, U0 U
y ∈ S(H).
[5.9]
ˆ0 z = z, z ∈ But since U0 is unitary operator√in H0 , it holds U0 U and [5.9] follows because S(H) ⊂ S(H). Thus, indeed U SU ∗ = S.
√
S(H) = H0 ,
Let μ be a Gaussian measure from theorem 5.5. Problem (7), in this chapter, describes all unitary operators in H under which μ is invariant. Now, we give examples of such operators. E XAMPLE 5.1.– (Symmetries). Let {en } be eigenbasis of S. Consider unitary operator Ux =
∞
εn (x, en )en ,
x ∈ H,
1
where each εn is either 1 or (−1), with arbitrary combination of signs. Then μ is U -invariant.
Gaussian Measure of General Form
125
E XAMPLE 5.2.– (Permutation of coordinates). Let {en } be eigenbasis of S, {λn } be the corresponding (positive) eigenvalues (with multiplicity), and π be a permutation of N , with λn = λπ(n) , n ≥ 1. Consider unitary operator Ux =
∞
(x, en )eπ(n) ,
x ∈ H.
1
Then μ is U -invariant. Theorem 5.5, examples 5.1 and 5.2, and problem (7) of this chapter show that there is a vast group of linear transformations, under which a centered Gaussian measure is invariant. Problems 5.3 5) Let μi be a Borel probability measure in a separable metric space Xi , i = 1, 2. Prove that for supp (μ1 × μ2 ) = (supp μ1 ) × (supp μ2 ). 6) Let μ be a centered Gaussian measure in a real separable infinite-dimensional Hilbert space H, with non-singular covariance operator S. Let {en } be eigenbasis of S and {λn } be the corresponding (positive) eigenvalues of S (with multiplicity). Consider V ∈ L(H), with * λπ(i) V ei = eπ(i) , i ≥ 1, λi where π is some permutation of N . Prove that μ is V -invariant. 7) Let μ be a measure from problem (6), α1 > α2 > . . . > αn > . . . be eigenvalues of S (without multiplicity), and H = H1 ⊕ H2 ⊕ . . . ⊕ Hn ⊕ . . . be a decomposition of H into an orthogonal sum of eigenspaces of S (here Hn corresponds to the eigenvalue αn ). Consider unitary operator U in H. Prove that μ is U -invariant if, and only if, U (Hn ) = Hn for each n ≥ 1. 5.4. Weak convergence of Gaussian measures D EFINITION 5.3.– Let (X, ρ) be a metric space and {μ, μn , n ≥ 1} be Borel probability measures in X. The sequence {μn } weakly converges to μ if for each bounded and continuous function f : X → R, f dμn → f dμ as n → ∞ . X
X
126
Gaussian Measures in Hilbert Space
See [BIL 99] for properties and applications of the weak convergence of measures. We denote by MX the class of all Borel probability measures in X. D EFINITION 5.4.– The Lévy–Prokhorov metric d in MX is introduced as follows: for μ and ν from MX , d(μ, ν) = inf{ε > 0 : μ(F ) ≤ ν(F ε ) + ε
and
ν(F ) ≤ μ(F ε ) + ε,
for each closed F ⊂ X}, where F ε is the neighborhood of F , F ε = { x ∈ X : ρ(x, F ) < ε }. T HEOREM 5.6.– (Prokhorov’s theorem about metrization of weak convergence) Let (X, ρ) be a complete separable metric space. Then MX with the Lévy–Prokhorov metric d is a complete separable metric space as well. Moreover, μn converges to μ in (MX , d) if, and only if, μn weakly converges to μ. For the proof, see [PRO 56] or [BIL 99]. Before studying the convergence of Gaussian measures in H, we first deal with Gaussian measures in Euclidean space. We use the following fact: if ξn ∼ N (mn , σn2 ) converges in distribution to ξ, then ξ ∼ N (m, σ 2 ), with m = lim mn , n→∞
σ 2 = lim σn2 n→∞
2
(here it is possible that σ = 0; in this case ξ = m a.s.). T HEOREM 5.7.– (Criterion for weak convergence of Gaussian measures in Rk ) a) The class of all Gaussian measures in Rk is closed in MRk w.r.t. weak convergence. b) Consider Gaussian measures μn in Rk , with mean values mn and variance– covariance matrices Sn , n ≥ 1, and a Gaussian measure μ in Rk , with mean value m and variance–covariance matrix S. The sequence {μn } weakly converges to μ if, and only if, mn → m
and
Sn → S
as
n → ∞.
[5.10]
P ROOF.– a) Let a sequence μn of Gaussian measures in Rk weakly converge to μ ∈ MRk . Consider random vectors Xn , X with distributions μn and μ, respectively. Fix a ∈ Rk . Gaussian random variables (Xn , a) converge in distribution to the random variable (X, a); hence (X, a) is Gaussian as well. Therefore, X is a Gaussian random vector and μx = μ is a Gaussian measure.
Gaussian Measure of General Form
127
b) Again consider random vectors Xn , X with distributions μn and μ, respectively. Assume that μn weakly converges to μ. Then Xn converges to X in distribution. Fix a ∈ Rk . It holds: (Xn , a) ∼ N ((mn , a), (Sn a, a)) converges to (X, a) ∼ N ((m, a), (Sa, a)) in distribution. Hence (mn , a) → (m, a)
and
(Sn a, a) → (Sa, a)
as n → ∞.
Since vector a is arbitrary and Sn , S are symmetric matrices, we get the desired [5.10]. Now, we prove the sufficiency of condition [5.10]. For x ∈ Rn , consider the characteristic function as n → ∞: (Sx, x) (Sn x, x) → exp i(x, m) − =μ ˆ(x). μ ˆn (x) = exp i(x, mn ) − 2 2 By Lévy’s continuity theorem, μn weakly converges to μ.
It is interesting that the weak convergence of Gaussian measures in H is closely related to the convergence of S-operators in S1 (H). T HEOREM 5.8.– (Criterion for weak convergence of Gaussian measures in H) Let H be a real separable infinite-dimensional Hilbert space and MG denote the class of all Gaussian measures in H. Fix an orthobasis {ei } in H. a) MG is closed in MH w.r.t. weak convergence. b) Consider Gaussian measures μn , with mean values mn and correlation operators Sn , n ≥ 1, and a Gaussian measure μ, with mean value m and correlation operator S. The sequence μn weakly converges to μ if, and only if, two conditions hold: 1) mn strongly converges to m in H; ∞ 2) i=1 |(Sn ei , ei ) − (Sei , ei )| → 0 as n → ∞, and (Sn ei , ej ) → (Sei , ej ) as n → ∞, for all i = j. Proof can be found in [PAR 05]. Note that condition 2 means that Sn converges to S in nuclear norm (see definition 3.7). C OROLLARY 5.5.– Let H be the space from theorem 5.8. The sequence μn of Gaussian measures weakly converges to the Gaussian measure μ (here μn and μ
128
Gaussian Measures in Hilbert Space
satisfy conditions of theorem 5.8) if, and only if, mn strongly converges to m in H and Sn converges to S in nuclear norm. P ROOF.– Theorem 3.9 shows that the convergence of S-operator in condition 2 of theorem 5.8 is equivalent to the convergence in nuclear norm. C OROLLARY 5.6.– Let H be a real separable infinite-dimensional Hilbert space. Then the class LS (H) of all S-operators is a separable set in S1 (H). 0 P ROOF.– By theorem 5.7, (MH , d) is a separable metric space. Then the class MG of all centered Gaussian measures in H is a separable set in MH , and there exists a 0 0 countable set Q ⊂ MG , which is dense in MG (w.r.t. the weak convergence). 0 Let S ∈ LS (H) and μ ∈ MG , with correlation operator S. Then there exists a sequence {μn } ⊂ Q, which converges weakly to μ. By corollary 5.5, ||Sn − S||1 → 0 where Sn is correlation operator of μn . Thus, a countable set {Sμ : μ ∈ Q} is dense in LS (H).
Problems 5.4 8) Let μ be a centered Gaussian measure in a real separable infinite-dimensional Hilbert space H, with non-zero covariance operator S. Let {λn } be eigenvalues of S (with multiplicity) and 2 Jα = exp{α||x|| } dμ(x), α ∈ R . H
Using problem (11) of Chapter 2, prove that Jα < ∞ if, and only if, α < Show that in this case 1 Jα = ∞ . n=1 (1 − 2αλn )
1 2||S|| .
9) Let μn be Gaussian measures in a real separable infinite-dimensional Hilbert space H such that μn weakly converge to μ, which has non-zero correlation operator S. Consider a continuous function f : H → R such that for some ε > 0, 1−ε 2 |f (x)| ≤ exp · ||x|| , x ∈ H. 2||S|| Prove that lim f (x) dμn (x) = f (x) dμ(x). n→∞
H
H
Hint. Use problem (8) and also theorem 3.5 and condition [3.18], both from [BIL 99].
Gaussian Measure of General Form
129
5.5. Exponential moments of Gaussian measure in normed space We start with separable Hilbert spaces. Let μ be a centered Gaussian measure in Rn , with non-zero variance–covariance matrix S and let {λk , 1 ≤ k ≤ n} be eigenvalues of S (with multiplicity). The integral Iα = exp {α x 2 } dμ(x), α ∈ R [5.11] Rn
is finite if, and only if, α < Iα = n
1 2S .
1
1 (1 − 2αλk )
In this case,
.
This statement can be obtained from problem (8) of this chapter if we consider ˜ ≤ n. the covariance operator S˜ in H, with dim R(S) In problem (8), a similar result is stated for a real separable infinite-dimensional Hilbert space. In this section, we consider the integral like [5.11] in a separable normed space. We are not able to evaluate it, but we prove that it is finite for a small enough α. 5.5.1. Gaussian measures in normed space Let X be a real separable normed space. We extend definitions 2.7 and 2.8 for the objects in X. D EFINITION 5.5.– A random element ξ distributed in X is called Gaussian if for each x∗ ∈ X ∗ , ξ, x∗ is a Gaussian r.v. (possibly with zero variance). A probability measure μ on B(X) is called Gaussian if there exists a Gaussian random element ξ in X, such that its distribution μξ = μ. L EMMA 5.7.– Let μ be a Gaussian measure in a real separable Banach space B. Then there exists the mean value mμ ∈ B (where the integral mμ = B x dμ(x) is understood in weak sense), and the covariance and correlation operators Aμ , Sμ : B ∗ → B. P ROOF.– Let x∗ ∈ B ∗ and ξ be a Gaussian random element in B with distribution μξ = μ. Then ξ, x∗ is a Gaussian r.v., and |x, x∗ |2 dμ(x) = E |ξ, x∗ |2 < ∞. B
Now, since B is separable, the statement follows from remarks 3.8 and 3.9.
130
Gaussian Measures in Hilbert Space
L EMMA 5.8.– Let μ be a Gaussian measure in a real separable normed space X. Then all first weak moments σ1 (x∗ ) of μ are finite and the characteristic functional equals 1 ∗ ∗ ∗ ∗ ϕμ (x ) = exp iσ1 (x ) − Sμ x , x , x∗ ∈ X ∗ , 2 where Sμ : X ∗ → X ∗∗ is correlation operator of μ. Thus, a Gaussian measure in X is uniquely defined by {σ1 (x∗ ) : x∗ ∈ X ∗ } and Sμ . P ROOF.– Let ξ be a Gaussian random element in X with distribution μ. Like in the proof of lemma 5.7 it follows that all second moments σ2 (x∗1 , x∗2 ) of μ are finite; hence σ1 (x∗ ) are finite as well. Correlation operator Sμ : X ∗ → X ∗∗ of μ exists by corollary 3.4. We have Eξ, x∗ = σ1 (x∗ ),
Dξ, x∗ = σ2 (x∗ , x∗ ) − σ12 (x∗ ) = Sμ x∗ , x∗ .
Hence ξ, x∗ ∼ N (σ1 (x∗ ), Sμ x∗ , x∗ ). Therefore, ϕμ (x∗ ) = E ei ξ,x
∗
1 = ϕ ξ,x∗ (1) = exp iσ1 (x∗ ) − Sμ x∗ , x∗ . 2
By corollary 4.1, a Borel probability measure in a separable X is uniquely defined by characteristic functional ϕμ , and the last statement of lemma 5.8 follows. In particular, when there exists the mean value mμ of a Gaussian measure μ (e.g. if X is a real separable Banach space), then 1 ϕμ (x∗ ) = exp imμ , x∗ − Sμ x∗ , x∗ , 2 and like in Hilbert space, μ is uniquely defined by a couple (mμ , Sμ ). E XAMPLE 5.3.– (Measure generated by Gaussian process) Let (Ω, F, P) be a complete probability space and ξ : Ω × [0, T ] → R be a measurable stochastic process, i.e. ξ be measurable w.r.t. sigma-algebra F ⊗ ST , which is generated by measurable rectangles F × A, F ∈ F, A ∈ ST (here ST denotes the class of Lebesgue measurable subsets of [0, T ]). Assume additionally that for some real p ∈ [1, ∞),
T 0
E |ξ(t)|p dt < ∞.
Gaussian Measure of General Form
Then T
131
|ξ(t)|p dt < ∞ a.s.,
0
and the path ξ(·, ω) ∈ Lp [0, T ] a.s. The space B = Lp [0, T ] is a real separable Banach space. Due to the measurability of ξ, X : ω → ξ(·, ω) is a random element in B with distribution μ(A) = P { ω : ξ(·, ω) ∈ A },
A ∈ B(B)
(it is called the distribution of ξ in the space of paths). Assume additionally that ξ is a Gaussian stochastic process, i.e. for each n ≥ 1 and t1 , ..., tn ∈ [0, T ], random vector (ξt1 , ..., ξtn ) is Gaussian. Let q ∈ (1, ∞] be the conjugate index, i.e. p−1 + q −1 = 1 (if p = 1, then q = ∞). For each x∗ ∈ Lq [0, T ] = B ∗ , r.v. T η= ξ(t)x∗ (t) dt [5.12] 0
is Gaussian. To prove this, first note that in the case where 1 < p < ∞, p p/q T
E
|ξ(t)x∗ (t)| dt
0
0
and in the case where p = 1, T E |ξ(t)x∗ (t)| dt ≤ 0
in both cases E Next, denote
T
≤
T 0
T 0
T
E |ξ(t)|p dt ·
|x∗ (t)|q dt
< ∞,
0
E |ξ(t)| dt · ess sup |x∗ (t)| < ∞; 0≤t≤T
|ξ(t)x∗ (t)| dt < ∞, and by Fubini’s theorem η is indeed a r.v.
x∗n = x∗n (t) = x∗ · IAn ,
An = {t : (x∗ (t))2 · E ξ 2 (t) ≤ n}.
It is enough to prove that η is Gaussian, with x∗n instead of x∗ . In this case T E ξ 2 (t) · (x∗n (t))2 dt ≤ nT < ∞, 0
and η ∈ L2 (Ω, P). Denote by G the subspace in Hilbert space L2 (Ω, P) generated by random variables ξ(t, ·), t ∈ [0, T ]. Since G consists of Gaussian random variables, we have to show that η ∈ G. Take the decomposition η = η1 + η2 with η1 ∈ G, η2 ⊥G. By Fubini’s theorem , T + 2 E η2 = E ηη2 = ξ(t, ω)η2 (ω) dP (ω) x∗n (t) dt = 0; 0
Ω
hence η2 = 0 a.s. and [5.12] is indeed a Gaussian r.v. Therefore, μ is a Gaussian measure in B. Its mean value m(t) = E ξ(t),
0 ≤ t ≤ T,
m(·) ∈ Lp [0, T ].
Its correlation operator S : Lq [0, T ] → Lp [0, T ] for each x∗ ∈ Lq [0, T ] and y ∈ Lq [0, T ] satisfies the relation Sx∗ , y ∗ = x − m, x∗ x − m, y ∗ dμ(x) = ∗
B
[0,T ]2
[0,T ]2
E(ξ(t) − m(t))(ξ(s) − m(s))x∗ (t)y ∗ (s)dtds = r(t, s)x∗ (t)y ∗ (s)dtds.
Here r(t, s) is a correlation function of ξ which is finite for all (t, s) ∈ [0, T ]2 and Lebesgue measurable by Fubini’s theorem. Hence (Sx∗ )(t) =
T
r(t, s)x∗ (s) ds,
0 ≤ t ≤ T.
0
where r(t, s) is correlation function of ξ. Characteristic function of μ is as follows: r(t, s) = E(ξ(t) − m(t))(ξ(s) − m(s)),
t, s ∈ [0, T ].
By Fubini’s theorem, the latter expectation is finite almost for all (t, s) ∈ [0, T ]2 and defines the Lebesgue measurable function r(t, s). Characteristic functional of μ is as follows: T 1 ∗ ∗ ∗ ∗ m(t)x (t)dt − r(t, s)x (t)x (s) dtds , ϕμ (x ) = exp i 2 [0,T ]2 0 x∗ ∈ Lq [0, T ]. L EMMA 5.9.– (Characterization of Gaussian random element in normed space) Let ξ and η be independent random elements in a real separable normed space X. a) If ξ and η are identically distributed Gaussian with zero mean, then √12 (ξ + η) and √12 (ξ − η) are independent copies of ξ (i.e. they are independent and have the same distribution as ξ). b) If ξ + η and ξ − η are independent, then ξ and η are Gaussian.
Gaussian Measure of General Form
133
P ROOF.– a) Denote α = √12 (ξ+η), β = √12 (ξ−η). Since X is separable, a couple (α; β) is a random element in X ×X (see part (a) of proof of lemma 3.6). Find its characteristic functional: i i ϕ(α;β) (x∗ , y ∗ ) = E exp { √ ξ + η, x∗ + √ ξ − η, y ∗ } = 2 2 x∗ + y ∗ x∗ − y ∗ x∗ + y ∗ x∗ − y ∗ √ } · E exp {i η, √ } = ϕξ ( √ )ϕξ ( √ )= 2 2 2 2 1 = exp − [σ2 (x∗ + y ∗ , x∗ + y ∗ ) + σ2 (x∗ − y ∗ , x∗ − y ∗ )] . 4
= E exp {i ξ,
Here σ2 (u∗ , v ∗ ) =
x, u∗ · x, v ∗ dμ(x), X
σ2 is a symmetric bilinear form on X ∗ , and we used the condition that ξ and η are i.i.d. random elements. Next, by the parallelogram identity: 1 ϕ(α;β) (x∗ , y ∗ ) = exp − [σ2 (x∗ , x∗ ) + σ2 (y ∗ , y ∗ )] = 2 = ϕξ (x∗ ) ϕη (y ∗ ) = ϕ(ξ;η) (x∗ , y ∗ ). By corollary 4.1, the couple (α; β) and the couple (ξ; η) have equal distributions in X × X. Hence α and β are independent and have the same distribution as ξ. b) For x∗ ∈ X ∗ , random variables ξ+η, x∗ = ξ, x∗ +η, x∗ and ξ−η, x∗ = ξ, x∗ − η, x∗ are independent, moreover ξ, x∗ and η, x∗ are independent as well. By the characterization of normal law, ξ, x∗ and η, x∗ are Gaussian (see [MAT 77]); hence ξ and η are Gaussian random elements. 5.5.2. Fernique’s theorem Now, we state a famous result about the exponential integrability of norm. T HEOREM 5.9.– (Fernique’s theorem) Let μ be a centered Gaussian measure in a real separable normed space X. Select τ > 0 such that c = μ(B(0, τ )) > 12 . Fix α0 =
1 √ 4(1+ 2)2 τ 2
+∞
log
c 1−c
if c < 1, if
c = 1.
Then for each 0 < α < α₀,

∫_X exp(α‖x‖²) dμ(x) ≤ I(τ, c, α) < ∞,

where I(τ, c, α) depends only on τ, c, α and is continuous in τ > 0, c ∈ (1/2, 1], α ∈ (0, α₀(τ, c)). If c = 1, then μ = δ₀, the Dirac measure at zero.

PROOF.– a) Key inequality

Let ξ be a Gaussian random element in X with distribution μ, and let η be an independent copy of ξ defined on the same probability space. By lemma 5.9, (ξ + η)/√2 and (ξ − η)/√2 are independent copies of ξ.

Fix two points 0 < s < t. We have

P{‖ξ‖ ≤ s} · P{‖η‖ > t} = P{‖(ξ + η)/√2‖ ≤ s} · P{‖(ξ − η)/√2‖ > t} =
= P{‖ξ + η‖ ≤ s√2, ‖ξ − η‖ > t√2} ≤ P{ |‖ξ‖ − ‖η‖| ≤ s√2, ‖ξ‖ + ‖η‖ > t√2 } ≤
≤ P{‖ξ‖ > (t − s)/√2, ‖η‖ > (t − s)/√2} = P²{‖ξ‖ > (t − s)/√2}.

Here, we used the following inclusion, which is geometrically evident for 0 < s < t:

{(a; b) ∈ [0, ∞)² : |a − b| ≤ s√2, a + b > t√2} ⊂ ((t − s)/√2, ∞) × ((t − s)/√2, ∞).

Therefore,

P{‖ξ‖ ≤ s} · P{‖ξ‖ > t} ≤ P²{‖ξ‖ > (t − s)/√2}.   [5.13]
b) Bound for the tail of the distribution of ‖ξ‖

Since μ(B(0, r)) → 1 as r → ∞, there exists a positive τ such that c := μ(B(0, τ)) > 1/2. Then

α₀ := μ({x : ‖x‖ > τ}) / μ(B(0, τ)) = (1 − c)/c < 1.

If c = 1, then ‖ξ‖ ≤ τ a.s., and ⟨ξ, x*⟩ is a bounded centered Gaussian r.v. for all x* ∈ X*; hence ⟨ξ, x*⟩ = 0 a.s. for all x* ∈ X*. Then σ₂(x*, x*) = E|⟨ξ, x*⟩|² = 0, φ_ξ(x*) = exp{−(1/2)σ₂(x*, x*)} = 1, and ξ = 0 a.s., since in the separable space X the characteristic functional uniquely defines the distribution. Thus, in this case μ = δ₀ and ∫_X exp(α‖x‖²) dμ(x) = 1, for all α.
Now, we may and do assume that c < 1. Consider the sequence

t₀ = τ,  t_{n+1} = τ + tₙ√2,  n ≥ 0.

We have consequently t₁ = τ + √2·τ, t₂ = τ + √2·τ + 2τ, ..., and

tₙ = (1 + √2 + ... + (√2)ⁿ)τ = ((√2)^{n+1} − 1)/(√2 − 1) · τ = ((√2)^{n+1} − 1)(√2 + 1)τ,  n ≥ 0.

From [5.13], we obtain for n ≥ 1:

P{‖ξ‖ ≤ τ} · P{‖ξ‖ > tₙ} ≤ P²{‖ξ‖ > t_{n−1}}.

Denote

αₙ = P{‖ξ‖ > tₙ} / P{‖ξ‖ ≤ τ},  n ≥ 1.

Then αₙ ≤ α²_{n−1}, n ≥ 1. Hence

α₁ ≤ α₀²,  α₂ ≤ (α₀²)² = α₀⁴,  α₃ ≤ (α₀⁴)² = α₀⁸, ...,
αₙ ≤ α₀^{2ⁿ} = ((1 − c)/c)^{2ⁿ},
P{‖ξ‖ > tₙ} ≤ c·((1 − c)/c)^{2ⁿ},  n ≥ 0.

c) Bound for the exponential moment

Fix

α₀ = (1/(4(1 + √2)²τ²)) · log(c/(1 − c)).
For 0 < α < α₀, we have:

∫_X exp(α‖x‖²) dμ ≤ ∫_{B(0,τ)} exp(α‖x‖²) dμ + Σ_{n=0}^∞ exp(α t²_{n+1}) μ(tₙ < ‖x‖ ≤ t_{n+1}) ≤
≤ c·exp(ατ²) + Σ_{n=0}^∞ exp(α t²_{n+1}) μ(‖x‖ > tₙ) ≤
≤ c·exp(ατ²) + c Σ_{n=0}^∞ ((1 − c)/c)^{2ⁿ} exp(ατ²(1 + √2)²·2^{n+2}) =
= c·exp(ατ²) + c Σ_{n=0}^∞ exp(2ⁿ[ log((1 − c)/c) + 4ατ²(1 + √2)² ]) =: I(τ, c, α).
We see that for α < α0 the series converges and I(τ, c, α) < ∞. For c = 1, we set I(τ, 1, α) = exp(ατ 2 ) > 1. Then I(τ, c, α) bounds the exponential moment from above and is continuous in τ > 0, c ∈ ( 21 , 1], α ∈ (0, α0 ). C OROLLARY 5.7.– Let μ be a Gaussian measure in a real separable Banach space B. Then there exists the mean value mμ = B xdμ(x), where the integral is understood in a strong sense. P ROOF.– The mean value mμ = B xdμ(x) exists in a weak sense due to lemma 5.7. Let ξ be a Gaussian random element in B with distribution μ. Then ξ − mμ is centered Gaussian random element in the separable space B, and theorem 5.9 implies that E ξ − mμ < ∞. Then E ξ ≤ mμ + E ξ − mμ < ∞, and since B is a separable Banach space, mμ = E ξ in a strong sense. C OROLLARY 5.8.– For the Gaussian process ξ from example 5.2, it holds that for all r ∈ [1, +∞),
T
|ξ(t, ·)|p dt ∈ Lr (Ω, P).
0
P ROOF.– Fix r ∈ [1, +∞). Consider the Gaussian element X in Lp [0, T ] constructed in example 5.2. Its mean value m ∈ Lp [0, T ] equals m(t) = E ξ(t), t ∈ [0, T ] (for almost all t w.r.t. Lebesgue measure). Then X − m is a centered Gaussian element
Gaussian Measure of General Form
137
in Lp [0, T ], and theorem 5.9 implies that E X − m pr p < ∞, where z p stands for the norm in Lp [0, T ]. Finally, T pr E X − m p = E( |ξ(t)|p dt)r < ∞. 0
Remember that the weak convergence of Gaussian measures in Euclidean and Hilbert spaces was studied in section 5.4. See definition 5.3 of weak convergence of probability measures in a metric space. We need some facts about the weak convergence of measures stated, e.g. in [BIL 99]. Let Γ be a family of Borel probability measures in a metric space (X, ρ). Γ is called relatively compact if for each sequence {μn } ⊂ Γ there exist a subsequence {μnk } and a Borel probability measure μ such that {μnk } converges weakly to μ. Γ is called a tight family if for each μ ∈ Γ, μ(K c ) < . Prokhorov’s theorem states that in case (X, ρ) is a complete and separable space, Γ is relatively compact if, and only if, Γ is tight. Theorem 3.5 and condition [3.18], both from [BIL 99], imply the following: let (X, ρ) be a metric space, {μn } be Borel probability measures that weakly converge to μ and f ∈ C(X) such that for some > 0, supn≥1 X |f |1+ dμn < ∞. Then f dμn → f dμ as n → ∞. X
X
T HEOREM 5.10.– Let {μn } be a sequence of Gaussian measures in a real separable Banach space B such that {μn } converges weakly to μ. Then: a) μ is a Gaussian measure as well and mean values mμn strongly converge to mμ in B. b) There exists α∗ > 0 such that exp(α∗ x 2 ) dμn (x) < ∞; sup n≥1
B
hence for each α < α∗ , exp(α x 2 ) dμn (x) → exp(α x 2 ) dμ(x) B
B
for each r > 0, x r dμn (x) → x r dμ(x) B
and for each λ ∈ R, λx e dμn (x) → eλx dμ(x) B
as
n → ∞,
B
B
as
n → ∞.
as
n → ∞,
138
Gaussian Measures in Hilbert Space
P ROOF.– a1) We prove that μ is Gaussian. Let Xn and X have distributions μn and μ, respectively. Since Xn converge in d distribution to X and x∗ ∈ B ∗ is a continuous functional, it holds Xn , x∗ − → X, x∗ . Because Xn , x∗ is a Gaussian r.v., the limit r.v. X, x∗ is Gaussian as well. Hence X is a Gaussian random element, and μ = μX is a Gaussian measure. Moreover, the convergence in distribution of Gaussian random variables Xn , x∗ implies that mμn , x∗ = EXn , x∗ → EX, x∗ = mμ , x∗ as n → ∞. Therefore, mμn converge weakly to mμ in B. a2) We prove that {mμn , n ≥ 1} is relatively compact in B, i.e. closure of mμn is compact in B. By Prokhorov’s theorem, the family of distributions {μn = μXn } is tight. We √n and − X √n , respectively. Then denote by Γ1 and Γ2 the set of distributions of X 2 2 both Γ1 and Γ2 are tight. Introduce Γ0 = {ν1 ∗ ν2 : ν1 ∈ Γ1 , ν2 ∈ Γ2 }. It is tight. Indeed, for given > 0, there exist compact sets K1 and K2 , with ν1 (K1 ) > 1 − and ν2 (K2 ) > 1 − , ν1 ∈ Γ1 , ν2 ∈ Γ2 . Then K0 := K1 + K2 = {x1 + x2 : x1 ∈ K1 , x2 ∈ K2 } is compact as well and (ν1 ∗ ν2 )(K0 ) ≥ μ(K0 − x) dν(x) ≥ (1 − )2 , K2
because K1 ⊂ K0 − x, x ∈ K2 . Thus, Γ0 is tight. n Let Yn be an independent copy of Xn and Zn = Xn√−Y . Then Zn is a copy of 2 Xn − mμn (see lemma 5.9), and its distribution belongs to Γ0 , the tight family.
Hence there exists a compact set K such that μn (K) >
1 1 , μn (K − mμn ) > , 2 2
for all
n ≥ 1.
Now, μn (K ∩ (K − mμn )) = μn (K) + μn (K − mμn ) − μn (K ∩ (K − mμn )) > > 1 − μn (K ∩ (K − mμn )) > 0.
Gaussian Measure of General Form
139
Therefore, K ∩ (K − mμn ) = ∅ and mμn ∈ K − K. The latter set is compact; hence {mμn } is indeed relatively compact. This fact and the weak convergence mμn → mμ imply that mμn → mμ strongly in B. b1) Assume additionally that μn are centered and prove the existence of α∗ . Select τ > 0 such that c := μ(B(0, τ )) > 12 and moreover the boundary ∂B(0, τ ) = {x : x = τ } has μ-measure zero (this is possible because the set of τ ’s with ∂B(0, τ ) of positive measure is at most countable). Then cn := μn (B(0, τ )) → c as n → ∞. We may and do assume that cn ≥ 12 + δ, for all n ≥ 1. Then for all α < α0 (τ, cn ), it holds In (α) := exp(α x 2 ) dμn (x) ≤ I(τ, cn , α), B
where α0 = α0 (τ, c) and I(τ, c, α) are given in theorem 5.9. Note that α0 (τ, c) is increasing in c. Hence for all positive α < α0 (τ, 12 + δ) and all n ≥ 1, In (α) ≤
max
1 2 +δ≤c≤1
I(τ, c, α) < ∞.
The latter maximum exists because I(τ, c, α) is continuous in c. b2) Show the existence of α∗ in general case. A couple (Xn ; mμn ) converges in distribution to (X; mμ ) as random elements in B × B, because the first and the second component are stochastically independent d and Xn − → X, mμn → mμ strongly in B. Hence d
→ ϕ(X, mμ ) = x − mμ . ϕ(Xn , mμn ) = Xn − mμn − Here, ϕ(u, v) = u − v, u, v ∈ B is a continuous function. Gaussian random elements Xn −mμn are centered. By part (b1) of the proof, there exists α1 > 0 such that sup E exp(α1 Xn − mμn 2 ) < ∞.
n≥1
Since the sequence {mμn , n ≥ 1} is bounded, it follows the desired relation: for all α∗ < α1 , sup E exp(α∗ Xn 2 ) < ∞
n≥1
(for details, see the solution to problem (12) of this chapter).
140
Gaussian Measures in Hilbert Space
b3) Fix α < α∗ , r > 0, λ ∈ R and let f1 (x) = exp(α x 2 ), f2 (x) = x r , f3 (x) = eλx , x ∈ B. Then for = αα∗ − 1, fi1+ (x) ≤ ci exp(α∗ x 2 ), x ∈ B, i = 1, 2, 3, with certain positive constants ci (in particular c1 = 1). Hence sup E fi1+ (xn ) ≤ ci sup E exp(α∗ xn 2 ) < ∞, i = 1, 2, 3.
n≥1
n≥1
This relation and weak convergence of μn to μ imply the desired convergence of integrals (see the statement above theorem 5.10). Problems 5.5 10) Let ξ and η be i.i.d. centered Gaussian random elements in a real separable normed space X, defined on a single probability space. For ϕ ∈ R, prove that ξ sin ϕ + η cos ϕ and ξ cos ϕ − η sin ϕ are independent copies of ξ. 11) Let μ be a centered Gaussian measure in a real separable infinite-dimensional Hilbert space H, with non-zero covariance operator S. For t > 0, prove that trS x 2t dμ(x) < tt e−t · min 1 (α−t (1 − 2α S )− 2S ) = 0 0 such that exp{α x 2 } dμ(x) < ∞. X
13) Let {ξn } be a sequence of Gaussian random elements in a real separable Banach P
→ 0, where z ∈ B is a fixed vector. Let α > 0 and space B such that ξn − z − 2 f : B → R be a Borel function which is continuous at z with |f (x)| ≤ eαx , x ∈ B. Prove that E f (ξn ) → f (z) as n → ∞. 14) Let (Ω, F, P) be a complete probability space, (T, S, σ) be a measure space with complete σ-finite measure σ and ξ : Ω × T → R be a measurable stochastic process, i.e. ξ be measurable w.r.t. sigma-algebra F ⊗ S, which is generated by measurable rectangles F × A, F ∈ F , A ∈ S. Assume additionally that for some real p ∈ [1, +∞), T E |ξ(t)|p dσ(t) < ∞. Let B = Lp (T, σ).
Gaussian Measure of General Form
141
a) Prove that μ(A) = P{ω : ξ(·, ω) ∈ A}, A ∈ B(B) is a probability measure. b) If ξ is a Gaussian stochastic process, i.e. for each n ≥ 1 and t1 , ..., tn ∈ T , T the random vector (ξp t1 , ..., ξtn ) is Gaussian, then μ is a Gaussian measure in B, moreover T |ξ(t, ·)| dσ(t) ∈ Lr (Ω, P), for all r ∈ [1, +∞).
6 Equivalence and Singularity of Gaussian Measures
In this chapter, we prove the fundamental Kakutani’s theorem on the absolute continuity or mutual singularity of product measures on R∞ . We give a criterion for the equivalence of Gaussian product measures on R∞ and apply it to obtain a simple version of the Feldman–Hájek dichotomy on the equivalence or mutual singularity of Gaussian measures on a separable Hilbert space H. The latter result is applied for the estimation of an unknown mean of a Gaussian random element in H and for testing a hypothesis about this mean, as well as testing a hypothesis about correlation operator of a centered Gaussian random element. 6.1. Uniformly integrable sequences We remember properties of a sequence of uniformly integrable random variables. This concept is convenient to justify passage to the limit under the expectation sign. D EFINITION 6.1.– A sequence {Xn } of random variables is called uniformly integrable if sup E |Xn | · I (|Xn | ≥ α) → 0
n≥1
as
α → +∞.
[6.1]
L EMMA 6.1.– Consider a sequence {Xn } of random variables. a) If {Xn } is uniformly integrable, then supn≥1 E |Xn | < ∞. b) If |Xn | ≤ Y a.s. for all n ≥ 1 and E Y < ∞, then {Xn } is uniformly integrable. c) Assume that for some ε > 0, it holds supn≥1 E |Xn |1+ε < ∞. Then {Xn } is uniformly integrable.
Gaussian Measures in Hilbert Space: Construction and Properties, First Edition. Alexander Kukush. © ISTE Ltd 2019. Published by ISTE Ltd and John Wiley & Sons, Inc.
144
Gaussian Measures in Hilbert Space
P ROOF.– a) According to [6.1], there exists α0 > 0 with E |Xn |·I (|Xn | ≥ α0 ) ≤ 1, n ≥ 1. Then for each n ≥ 1, E |Xn | = E |Xn | · I (|Xn | < α0 ) + E |Xn | · I (|Xn | ≥ α0 ) ≤ α0 + 1. b) Since {ω : |Xn | ≥ α} ⊂ {ω : Y ≥ α}, we have |Xn |d P ≤ Y d P → 0 as sup n≥1
because
Ω
{|Xn |≥α}
{Y ≥α}
α → +∞,
Y d P = E Y < ∞.
c) For α > 0, it holds 1 C |Xn |d P ≤ ε |Xn |1+ε d P ≤ ε , α α {|Xn |≥α} {|Xn |≥α} C = supn≥1 E |Xn |1+ε < ∞. Hence C sup |Xn |d P ≤ ε → 0 α n≥1 {|Xn |≥α}
as
α → +∞.
The next two statements show that uniformly integrable sequences are suitable for passing to the limit under the expectation symbol. Their proofs can be found in [SHI 16]. T HEOREM 6.1.– (Extension of Lebesgue dominated convergence theorem) Let {Xn } be uniformly integrable and Xn → X a.s. Then E X and E Xn , n ≥ 1, are finite, and moreover E Xn → E X as n → ∞. T HEOREM 6.2.– Let Xn be non-negative random variables, Xn → X a.s., and X and Xn , n ≥ 1, have finite expectations, and moreover E Xn → E X as n → ∞. Then {Xn } are uniformly integrable. R EMARK 6.1.– In theorems 6.1and 6.2, convergence Xn → X a.s. can be replaced d with a weak convergence Xn − → X (see [BIL 99]). Problems 6.1 1) Let {Xn } be a sequence of random variables with finite expectations. Prove that {Xn } is uniformly integrable if, and only if, lim sup E |Xn | · I (|Xn | ≥ α) → 0 n≥1
as
α → +∞.
Equivalence and Singularity of Gaussian Measures
145
2) Construct such a sequence {Xn } of random variables with finite expectations, that Xn → X a.s., E Xn → E X as n → ∞ and X has finite expectation as well, but {Xn } is not uniformly integrable. 3) Prove the following Vallée–Poussin criterion which extends lemma 6.1(c). A sequence {Xn } of random variables is uniformly integrable if, and only if, there exists a non-negative increasing function G(t), t ≥ 0, with lim G(t)/t = +∞ such that t→+∞
supn≥1 E G(|Xn |) < ∞. 6.2. Kakutani’s dichotomy for product measures on R∞ We state some properties of absolutely continuous measures and apply them, as well as results of section 6.1, to measures on R∞ . 6.2.1. General properties of absolutely continuous measures Let (X, S) be a measurable space and μ, ν be finite measures on S. Remember that ν is absolutely continuous w.r.t. μ if, for all A ∈ S with μ(A) = 0, it holds ν(A) = 0. Notation: ν μ. The Radon–Nikodym theorem (see [BOG 07]) states that ν μ if, and only if, there exists S-measurable function ϕ such that ν(A) = ϕ(x)dμ(x), A ∈ S. A
The function ϕ is uniquely defined up to a μ-null set and ϕ(x) ≥ 0. It is called the dν dν Radon–Nikodym derivative (or density) of ν w.r.t. μ and is denoted as dμ = dμ (x). If ϕ(x) > 0 almost everywhere w.r.t. μ, then μ ν, and moreover In this case, the measures μ and ν are equivalent. Notation: μ ∼ ν.
dμ dν
=
dν dμ
−1
.
Suppose the measure μ and ν are concentrated at disjoint sets, i.e. X = A ∪ B where A and B are disjoint and μ(B) = 0, ν(A) = 0. Then μ and ν are called mutually singular. Notation: μ⊥ν. For arbitrary finite measures ν and μ on S, the measure ν can be uniquely decomposed as ν = ν1 + ν2 with ν1 μ and ν2 ⊥μ (see [BOG 07]). In general case, dν1 dν dν dμ denotes the Radon–Nikodym derivative dμ . Thus, equality dμ (x) = 0 (mod μ) is a necessary and sufficient condition for the singularity of finite measures ν and μ. Often it is convenient to deal with the following objects. Let X be a universal space, Sn be an increasing sequence of sigma-algebras on X (i.e. Sn ⊂ Sn+1 , n ≥ 1) and S be the least sigma-algebra containing all Sn . Consider probability measures μ
146
Gaussian Measures in Hilbert Space
and ν on S, and let μn and ν n be restrictions of μ and ν on Sn , respectively. It is clear that μn and ν n are probability measures as well. L EMMA 6.2.– (Densities of restrictions form a martingale) Assume that ν n μn , n for all n ≥ 1. Then ϕn := μν n , n ≥ 1, form a martingale on the stochastic basis (X, S, Sn , n ≥ 1, μ), i.e. ϕn is Sn -measurable and E (ϕn+1 |Sn ) = ϕn
(modμ).
[6.2]
P ROOF.– The function ϕn is Sn -measurable density of the measure defined on Sn . To prove [6.2], take A ∈ Sn and get n+1 n+1 n n ϕn+1 dμ =ν (A) = ν(A) = ν (A) = ϕn dμ = ϕn dμ. A
A
A
The latter equality holds because μn is a restriction of μ on Sn . Thus, [6.2] follows from the definition of conditional expectation of r.v. ϕn+1 on the probability space (X, S, μ) (see [SHI 16]). C OROLLARY 6.1.– (Convergence of densities) Under the conditions of lemma 6.2, there exists a function ϕ(x) := lim ϕn (x) (modμ).
[6.3]
n→∞
P ROOF.– By lemma 6.2, ϕn is a non-negative Sn -martingale. Then the desired statement follows from the theorem about the convergence of a martingale (see [SHI 16]). dν 1 Remember that in general dμ = dν dμ , where ν1 is such a component of a finite measure ν that is absolutely continuous with respect to a finite measure μ.
T HEOREM 6.3.– (About the limiting density) Assume that ν n μn and ϕn = n ≥ 1. Then dν = lim ϕn dμ n→∞
(modμ).
dν n dμn ,
[6.4]
P ROOF.– a) The probability measure ν can be decomposed as ν = ν˜1 + ν˜2 , ν˜1 μ, ν˜2 ⊥μ. In the case where ν˜1 (X) > 0, ν˜2 (X) > 0, we have ν = ν˜1 (X) ·
ν˜1 ν˜2 + ν˜2 · , ν˜1 (X) ν˜2 (X)
ν = pν1 + (1 − p)ν2 ,
0 ≤ p ≤ 1,
ν1 μ,
ν2 ⊥μ.
[6.5]
Equivalence and Singularity of Gaussian Measures
147
Here, p = ν˜1 (X) and ν1 , ν2 are the corresponding probability measures. If ν˜1 (X) = 0 or ν˜2 (X) = 0, then [6.5] still holds with p = 0 or p = 1, respectively. In all cases, we have for the restrictions ν n , μn , ν1n , ν2n of ν, μ, ν1 , ν2 on Sn : dν n dν n dν n = p 1n + (1 − p) 2n . n dμ dμ dμ Hence in order to show [6.5], it is enough to prove that dν1n dν1 → , n dμ dμ
dν2n →0 dμn
(because relation [6.5] implies
dν dμ
(modμ) 1 = p dν dμ ).
Therefore, it is enough to prove [6.4] in pure cases only: when ν μ and ν⊥μ. b) The case ν μ: In this case, dν n dν = E |S n . dμn dμ
[6.6]
Indeed, for A ∈ Sn , dν dν n n dν n dμ (x) = dμ(x), dμ(x) = ν(A) = n n A dμ A dμ A dμ and [6.6] follows. Passing to the limit in [6.6], we get a.s.: dν n dν dν =E lim |S = . n→∞ dμn dμ dμ Here, we used corollary 6.1 and Lévy’s theorem (see [SHI 16]) about the limit of conditional expectations; applying the latter theorem, we used the following: sigmadν dν = X dμ dμ(x) = 1 < ∞. algebras Sn are increasing to the sigma-algebra S and E dμ c) The case ν⊥μ: We have to show that in this case, dν n → 0 (mod μ). dμn
[6.7] n
dν According to corollary 6.1, dμ (mod μ). We take A ∈ Sm , n ≥ m and n → ϕ apply Fatou’s lemma: dν n ν(A) = ν n (A) = dμ(x), n A dμ
ν(A) = lim inf n→∞
A
dν n dμ(x) ≥ dμn
dν n lim inf n dμ(x) = A n→∞ dμ
ϕdμ(x). A
148
Gaussian Measures in Hilbert Space
Hence, the monotone class
M := {B ∈ S : ν(B) ≥
ϕdμ(x)} B
contains the algebra A := {B ∈ S : ∃n ≥ 1, B ∈ Sn }, and M ⊃ σa(A) = S. Therefore, ν(A) ≥ ϕdμ(x), A ∈ S. [6.8] A c Since ν⊥μ, there exists A1 with ν(A1 ) = 0 and μ(A1 ) = 0. We have ϕdμ(x) = 0, and inequality [6.8] implies A1 ϕdμ(x) = 0. For non-negative Ac1 ϕ, we get X ϕdμ(x) = 0; hence ϕ = 0(mod μ). Relation [6.7] is proven.
6.2.2. Kakutani’s theorem for product measures Let {μk } and {νk } be sequences of probability measures on Borel sigma-algebra ∞ ∞ ∞ on R. Consider product measures μ = (see 1 μk and ν = 1 νk on R definition 2.3). T HEOREM 6.4.– (Kakutani’s criterion for absolute continuity) Product measure ν is absolutely continuous w.r.t. μ if, and only if, νk μk for all k ≥ 1 and there exists α ∈ (0, 1) such that the infinite product α ∞
dνk dμk (x) [6.9] dμk R 1 converges (to some positive number). R EMARK 6.2.– In the necessity part of theorem 6.4, the product [6.9] converges for all α ∈ (0, 1). This is shown in the proof below. R EMARK 6.3.– The integral in [6.9] is called the Hellinger integral and is denoted as √ dνk α 1−α (dνk ) (dμk ) . In particular R dνk dμk = R dμk dμk . R P ROOF.– a) Necessity. It is given that ν μ. First, we prove that ν1 μ1 . Indeed, let μ1 (A1 ) = 0. Then, for a cylinder Aˆ1 = {x ∈ R∞ : x1 ∈ A1 } we have μ(Aˆ1 ) = μ1 (A1 ) = 0; hence ν(Aˆ1 ) = ν1 (A1 ) = 0, and ν1 μ1 . In a similar way, νk μk , for all k ≥ 1. Now, by theorem 6.3, dν n dν (x1 , . . . , xn ) (mod μ). (x) = lim n→∞ dμn dμ
[6.10]
Equivalence and Singularity of Gaussian Measures
149
ˆ n : Bn ∈ Here we consider the increasing sequence of sigma-algebras Bn = {B n ∞ B(R )}, and B(R ) is the least sigma-algebra that contains all Bn (see theorem 2.1); ν n and μn are restrictions of ν and μ on Bn . Since ν and μ are product measures with νk μk for all k ≥ 1, it holds
dνk dν n (x1 , . . . , xn ) = (xk ). n dμ dμk 1 n
[6.11]
We fix α ∈ (0, 1). Then [6.10] and [6.11] imply α α n
dν dνk (x) = lim (xk ) (mod μ). n→∞ dμ dμk 1 But
α n
dνk
R∞
1
dμk
1/α (xk )
dμ(x) =
R∞
dν n dμ(x) = 1 dμn
n dνk α (xk ) are and the exponent α1 > 1; hence by lemma 6.1(c) the functions 1 dμ k uniformly integrable, and by theorem 6.1, α α n dν dνk (x)dμ(x) = lim dμ(x) = n→∞ R∞ dμ dμk R∞ 1 α α n ∞
dνk dνk dμk (x) = (xk )dμk (x) < ∞. n→∞ dμk dμk R R 1 1
= lim
[6.12] Moreover, the infinite product in [6.12] is non-zero. Indeed, if it equals 0, then = 0(mod μ), and since ν μ, we get ν = 0, and ν is not a probability measure, which is a contradiction. dν dμ
Thus, the product [6.9] converges to a positive number. R EMARK 6.4.– We have established the following: if νk μk for all k ≥ 1, then ∞ dνk α dμk (x) either converges (to a positive number) or diverges to 0. 1 R dμk b) Sufficiency. Now, we assume that νk μk for all k ≥ 1, and [6.9] converges. dν is the generalized Radon– Like in part (a) of the proof, we get [6.10], where dμ Nikodym derivative (i.e. the density of ν1 w.r.t. μ), ν1 is absolutely continuous component of ν. Hence the measurable function ϕ(x) :=
∞
dνk dν (x) (xk ) = dμk dμ 1
(mod μ).
[6.13]
150
Gaussian Measures in Hilbert Space
Here almost surely w.r.t. μ, the infinite product at some points converges to a positive number, and at some points it converges to 0. To show that ν μ, it is enough to prove that ϕdμ ≥ 1.
[6.14]
R∞
Indeed, decompose ν = ν1 + ν2 , ν1 μ, ν2 ⊥μ. Then ϕ = proved [6.14], we get ∞ 1 = ν(R ) = ϕdμ + ν2 (R∞ ) ≥ 1 + ν2 (R∞ );
dν1 dμ .
Once we have
R∞
hence ν2 (R∞ ) = 0 and ν2 = 0, ν = ν1 μ. (In fact, then in [6.14] the equality holds true.) In order to prove [6.14], introduce measurable functions ϕn (x) =
∞
dνk (xk ) (mod μ). dμk
[6.15]
k=n
Almost everywhere w.r.t. μ, infinite product in [6.15] converges either to a positive number or to 0, because in forming product measures ν and μ one could multiply measures νk or μk starting from k = n rather than k = 1. α m dνk The sequence of functions { k=n dμ (x ) , m ≥ n} is uniformly integrable k k
at the measure space (R∞ , B(R∞ ), μ), because 1/α > 1 and
R∞
=
α 1/α m m
dνk dνk (xk ) dμ(x) = (xk )dμ(x) = dμk dμk R∞
k=n
m
k=n
R
k=n
dνk (xk )dμk (xk ) = 1 dμk
(see lemma 6.1(c)). Tending m → ∞, we get by theorem 6.1: R∞
ϕα n dμ =
α m
dνk (xk ) dμk (xk ). dμk R
k=n
Equivalence and Singularity of Gaussian Measures
151
Using Fubini’s theorem and the moment inequality, we obtain:
n−1
dνk dνk ϕ(x)dμ(x) = (xk ) · ϕn (x)dμ(x) = (xk )dμk (xk )× dμk R∞ R∞ k=1 dμk k=1 R × ϕn (xn , xn+1 , . . . )dμ(x) = ϕn (x)dμ(x),
n−1
R∞
R∞
R∞
ϕ(x)dμ(x) ≥
1/α R∞
ϕα n dμ(x)
=
1/α α ∞
dνk (xk ) dμk (xk ) . dμk R
k=n
Tending n → ∞, we obtain the desired inequality [6.14], because the product [6.9] converges (to a positive number). C OROLLARY μk , for all k ≥ 1. ∞6.2.– (Kakutani’s dichotomy) Assume that νk ∞ Then ν = 1 νk is either absolutely continuous w.r.t. μ = 1 μk or it is singular to μ. P ROOF.– Fix 0 < α < 1. By remark 6.4, α ∞
dνk dμk dμk R 1
[6.16]
either converges to a positive number, or diverges to 0. In the first case, ν μ by theorem 6.4. In the second case, the product [6.16] ∞ dνk α equals 0. The functions fn (x) = 1 dμk , n ≥ 1, are uniformly integrable at the measure space (R∞ , B(R∞ ), μ) (see proof of theorem 6.4); hence by theorem 6.3, α dν dμ = lim fn (x)dμ(x) = lim fn (x)dμ(x) = n→∞ R∞ dμ R∞ R∞ n→∞ α ∞
dνk = dμk = 0. dμk R 1 dν dμ
dν Here dμ is the generalized Radon–Nikodym derivative (like in theorem 6.3). Thus, = 0(mod μ), and ν⊥μ.
Therefore, either ν μ or ν⊥μ.
152
Gaussian Measures in Hilbert Space
6.2.3. Dichotomy for Gaussian product measures L EMMA 6.3.– (Computation of Hellinger integral for normal distribution on R) Let ν and μ be normal distributions N (a, b) and N (ˆ a, ˆb) on the real line, with positive b and ˆb. Then * 1/4 2 2 (a − a ˆ ) . [6.17] dνdμ = bˆb exp − b + ˆb 4(b + ˆb) R P ROOF.– Let ρν and ρμ be the densities of ν and μ w.r.t. Lebesgue measure. Then the Hellinger integral is equal to √ H := dνdμ = ρν ρμ dx, R
R
√ ρν ρμ = √
1 1 (x − a)2 (x − a ˆ )2 . + 1/4 exp − 4 ˆb b 2π bˆb
Transform
2 (x − a)2 1 1 a a a (x − a ˆ )2 ˆ a ˆ2 = x2 − 2x + = + + + + ˆb ˆb ˆb b b ˆb b b 2 B B2 2 ; = Ax − 2Bx + C = A x − + C− A A −1/4 At2 1 B2 1 √ e− 4 dt, C− H = bˆb exp − 4 A 2π R √ √ t2 At2 1 1 − √ e− 4 dt = 2A−1 e 2(2A−1 ) dt = 2A−1 , √ √ −1 2π 2π 2A R R
1 H = exp − 4
B2 C− A
We have C−
2
2
2
B a a ˆ − = + ˆb A b
b + ˆb A= , bˆb
' ( 2A−1 ( ) 1/2 . bˆb
[6.18]
2
a a ˆ b + ˆ b 1 1 + ˆ b b
=
1/2 2 bˆb 2A−1 . 1/2 = b + ˆb bˆb
(a − a ˆ )2 , b + ˆb
[6.19]
[6.20]
Equivalence and Singularity of Gaussian Measures
Finally [6.18]–[6.20] imply [6.17].
153
Consider νk = N (ak , bk ) and μk = N (ˆ ak , ˆbk ) with positive bk and ˆbk ; k ≥ 1. Introduce two Gaussian product measures on R∞ : ν=
∞
νk ,
μ=
1
∞
μk .
1
T HEOREM 6.5.– (Dichotomy about Gaussian product measures) ν ∼ μ if, and only if, 2 ∞ ∞ 2 ˆbk − bk (ˆ ak − ak ) < ∞ and < ∞. [6.21] ˆbk bk 1 1 If at least one of the two conditions [6.21] is violated, then ν⊥μ. P ROOF.– a) The results of section 6.2.2 imply the following. The product ∞
dνk dμk 1
[6.22]
R
is either convergent to a positive number or divergent to 0; in the first case, ν μ and in the second case ν⊥μ. Denote ˆbk − bk ; bk
δk =
δk > −1.
By lemma 6.3
(ˆ a k − ak ) 2 , dνk dμk = Ak exp − 4(ˆbk + bk ) R
A2k =
(1 + δk )1/2 ≤ 1. 1 + 12 δk
[6.23]
[6.24]
In view of [6.23] and [6.24], the product [6.22] converges if, and only if, ∞
(1 + δk )1/2 1
1 + 12 δk
converges
[6.25]
and ∞ (ˆ ak − a k )2 1
ˆbk + bk
< ∞.
[6.26]
154
Gaussian Measures in Hilbert Space
b) Suppose that [6.25] and [6.26] hold true. The next product converges: ∞ ∞
(1 + δk /2)2 δk2 /4 1+ ; = 1 + δk 1 + δk 1 1
[6.27]
∞ δk2 ∞ hence 1 1+δ < ∞; then δk → 0 as k → ∞, and therefore, 1 δk2 < ∞. At that k bk ∼ ˆbk and bk + ˆbk ∼ 2ˆbk ; hence [6.26] implies the second relation in [6.21]. ∞ c) Now, suppose that [6.21] holds true. Then 1 δk2 < ∞, δk → 0 as k → ∞, and the product [6.27] converges together with the product [6.25]. At that bk ∼ ˆbk and the second relation in [6.21] implies [6.26]. d) Thus, ν μ if, and only if, [6.25] and [6.26] hold, and this occurs if, and only if, [6.21] holds. If at least one of the two relations [6.21] is violated, then ν is not absolutely continuous w.r.t. μ, and therefore, the product [6.22] diverges to 0; hence ν⊥μ. e) Assume that ν μ. Then the product [6.22] converges. Measures νk and μk are equivalent, and the Hellinger integral can be transformed as 1/2 dνk H(νk , μk ) = dνk μk = dμk (x) = dμk R R 1/2 1/2 dμk dνk dμk dνk (x) = dνk (x) = H(μk , νk ). = dμk dνk dνk R R ∞ √ ∞ μk dνk converges; hence Therefore, the product 1 H(μk , νk ) = 1 R μ ν. Thus, ν ∼ μ if, and only if, ν μ.
∞ N (ak , bk ) and μ = 1 N (ˆ ak , bk ) with bk > 0, ∞ (ak −ˆak )2 < ∞. If the series diverges, then k ≥ 1. Then ν ∼ μ if, and only if, 1 bk ν⊥μ. C OROLLARY 6.3.– Let ν =
∞ 1
Thus, under the conditions of corollary 6.3, ν ∼ μ if, and only if, a − a ˆ ∈ l2,c . 1 ∞ Here a = (ak )∞ , k ≥ 1}. , a ˆ = (ˆ a ) , c = { k 1 1 bk ∞ ∞ C OROLLARY 6.4.– Let ν = 1 N (ak , bk ) and μ = 1 N (ak , ˆbk ) with positive bk ∞ ˆbk −bk 2 < ∞. If the series diverges, then and ˆbk . Then ν ∼ μ if, and only if, 1
bk
ν⊥μ. As an application consider the independent sequence ξk ∼ N (0, bk ) with positive variances bk . The random element ξ = (ξk )∞ on R∞ has distribution 1
Equivalence and Singularity of Gaussian Measures
155
∞ μξ = 1 N (0, bk ). Fora real non-zero number t, the random element tξ has ∞ 2 distribution μtξ = 1 N (0, t bk ). By corollary 6.4, μtξ ⊥μξ for all t ∈ R \{1, −1, 0}. For t = −1, −ξ and ξ are equally distributed, and for t = 1, tξ = ξ. Problems 6.2 4) Construct two product measures on R∞ such that they neither absolutely continuous (one w.r.t. another) nor singular. ∞ ∞ ˆ 5) Let ν = 1 P ois(λk ) and μ = 1 P ois(λk ), where P ois(λ) stands for Poisson distribution with parameter λ. Prove that ν ∼ μ if, and only if, 2 ∞ √ ˆ k < ∞; if the series diverges then ν⊥μ. λk − λ 1
6.3. Feldman–Hájek dichotomy for Gaussian measures on H Let H be a real separable infinite-dimensional Hilbert space. We compare two Gaussian measures on H. 6.3.1. The case where Gaussian measures have equal correlation operators T HEOREM 6.6.– (Dichotomy for Gaussian measures with equal correlation operators) Let μ = N (a1 , B) and ν = N (a2 , B) be Gaussian measures on H. It holds: √ a) if a1 − a2 ∈ R( B), then μ ∼ ν; √ b) if a1 − a2 ∈ R( B), then μ⊥ν. P ROOF.– 1) We start with the case where B is a non-singular operator, i.e. KerB = {0}. Let {ek } and {βk } be eigenbasis and corresponding eigenvalues of B; βk > 0, k ≥ 1. Using Fourier coefficients of a vector w.r.t. {ek }, we can identify B with the sequence in l2 . Introduce Fourier coefficients a1k = (a1 , ek ),
a2k = (a2 , ek ),
k ≥ 1.
Then μ and ν are product measures in l2 (see Chapter 5): μ=
∞
1
N (a1k , βk ),
μ=
∞
1
N (a2k , βk ).
[6.28]
156
Gaussian Measures in Hilbert Space
The right-hand sides of relations [6.28] define extended measures μe and νe on R∞ , such that μ = μe |B(l2 ) ,
ν = νe |B(l2 ) .
By corollary 6.3, Gaussian product measures μe and νe are either equivalent or mutually singular. In the first case μ ∼ ν, and in the second case μ⊥ν (see problem (6) at the end of section 6.3). By corollary 6.3, μe ∼ νe if, and only if, ∞ (a1k − a2k )2 1
βk
< ∞.
[6.29]
√ √ Bek = βk ek , k ≥ 1, and [6.29] is equivalent to the following: √ a1 − a2 ∈ R( B). [6.30]
We have
Therefore, if [6.30] holds, then μe ∼ νe and μ ∼ ν. And if [6.30] is violated then μe ⊥νe and μ⊥ν. We have proved both statements (a) and (b) for non-singular B. 2) The case of singular B. We use the same construction as in part 1 of the proof. But now there exists an eigenvalue βk0 = 0. If a1k0 = a2k0 , then N (a1k0 , βk0 )⊥N (a2k0 , βk0 );√ hence μe ⊥νe and μ⊥ν. Notice that in this case (a1 − a2 , ek0 ) = 0 and a1 − a2 ∈ R( B). Now, suppose that for all k ≥ 1 with βk = 0, it holds a1k = a2k . Then we can reduce our consideration to the case of a non-singular correlation operator. If (a1k − a2k )2 < ∞, βk
[6.31]
k:βk >0
then μe ∼ νe and μ ∼ ν; otherwise μe ⊥νe and μ⊥ν. But√in the considered case, relation [6.31] is equivalent to the following: a1 − a2 ∈ R( B). This accomplishes the proof. R EMARK 6.5.– Consider two arbitrary sequences {αk , βk , k ≥ 1}. Hereafter, we real ∞ agree that the convergence of series 1 αβkk means the following: αk = 0 for all k αk such that βk = 0, and moreover the series k:βk =0 βk converges. Under this agreement, N (a1 , B) ∼ N (a2 , B) if, and only if, [6.29] holds whatever is the correlation operator B (i.e. it can be either singular or not). C OROLLARY 6.5.– (About admissible shift of Gaussian measure) Consider Gaussian √ measures μa = N (a, B) and μ0 = N (0, B) on H. If a ∈ R( B), then μa ∼ μ0 , and otherwise μa ⊥μ0 .
Equivalence and Singularity of Gaussian Measures
157
A vector a ∈ R(√B) is the so-called admissible shift of μ0, in the sense that Ta⁻¹μ0 ∼ μ0, where Ta x = x + a, x ∈ H. Now, we find the Radon–Nikodym derivative dμa/dμ0 for admissible shifts.

THEOREM 6.7.– Let μ0 = N(0, B) and μa = N(a, B) on H, with a = B^{1/2}b, and let {ek} and {βk} be eigenbasis and corresponding eigenvalues of B. Assume additionally that (b, ek) = 0 whenever βk = 0. Then
\[ \frac{d\mu_a}{d\mu_0} = \exp\Bigl( (b, B^{-1/2}x) - \frac{\|b\|^2}{2} \Bigr). \tag{6.32} \]
Here
\[ (b, B^{-1/2}x) = \lim_{n\to\infty} \sum_{k=1}^{n} \frac{b_k x_k}{\sqrt{\beta_k}}, \tag{6.33} \]
where bk = (b, ek), xk = (x, ek), k ≥ 1, and the limit in [6.33] exists a.e. with respect to μ0; if some βk0 = 0, then xk0 = 0 (mod μ0), and the term b_{k0}x_{k0}/√β_{k0} is set equal to 0 in [6.33] by definition.
PROOF.– We identify H and l2 as in the proof of theorem 6.6. Then in l2, we have
\[ \mu_0 = \prod_{k=1}^{\infty} N(0, \beta_k), \qquad \mu_a = \prod_{k=1}^{\infty} N(a_k, \beta_k), \qquad a_k = \sqrt{\beta_k}\, b_k, \quad b_k = (b, e_k). \]
Consider the sigma-algebras Sn = {Ân | An ∈ B(Rⁿ)}, n ≥ 1. Here, Ân = {x ∈ l2 : (x1, . . . , xn) ∈ An}. Let μ0ⁿ = μ0|Sn, μaⁿ = μa|Sn. It holds
\[ \frac{d\mu_a^n}{d\mu_0^n} = \prod_{k=1}^{n} \exp\Bigl( -\frac{(x_k-a_k)^2}{2\beta_k} + \frac{x_k^2}{2\beta_k} \Bigr) = \exp\Bigl\{ -\frac{1}{2}\sum_{k=1}^{n} b_k^2 + \sum_{k=1}^{n} \frac{b_k x_k}{\sqrt{\beta_k}} \Bigr\}. \tag{6.34} \]
This calculation remains valid if some βk = 0. By corollary 6.5, we have μa ∼ μ0, and by theorem 6.3 it holds a.e. with respect to μ0:
\[ \frac{d\mu_a^n}{d\mu_0^n} \to \frac{d\mu_a}{d\mu_0} \quad \text{as } n \to \infty. \tag{6.35} \]
Since \( \lim_{n\to\infty} \sum_{k=1}^{n} b_k^2 = \|b\|^2 \), relations [6.34] and [6.35] imply that the RHS of [6.33] converges a.e. with respect to μ0, and then
\[ \frac{d\mu_a}{d\mu_0} = \exp\Bigl( -\frac{\|b\|^2}{2} + (b, B^{-1/2}x) \Bigr). \qquad\square \]
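In a finite-dimensional truncation, [6.32] can be checked directly against the ratio of Gaussian densities. A sketch (assuming numpy and scipy are available; the dimension and the random operator are illustrative):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Random positive definite "correlation operator" B on R^4 (illustrative).
n = 4
M = rng.standard_normal((n, n))
B = M @ M.T + np.eye(n)
w, V = np.linalg.eigh(B)
B_half = V @ np.diag(np.sqrt(w)) @ V.T
B_mhalf = V @ np.diag(1.0 / np.sqrt(w)) @ V.T

b = rng.standard_normal(n)
a = B_half @ b                     # admissible shift a = B^{1/2} b
x = rng.standard_normal(n)

# Density ratio of N(a, B) to N(0, B) versus formula [6.32].
ratio = multivariate_normal(mean=a, cov=B).pdf(x) / multivariate_normal(mean=np.zeros(n), cov=B).pdf(x)
rn = np.exp(b @ (B_mhalf @ x) - b @ b / 2.0)
print(abs(ratio - rn) / rn)        # tiny: the two expressions agree
```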
Gaussian Measures in Hilbert Space
Note that the functional f(x) = (b, B^{-1/2}x) given in [6.33] is a linear measurable functional on the probability space (H, B(H), μ0). This means that f is well defined on some H0 ⊂ H with μ0(H0) = 1, and moreover, for all α ∈ R, f(αx) = αf(x) for a.e. x ∈ H with respect to μ0.

6.3.2. Necessary conditions for equivalence of Gaussian measures

Now, we deal with two Gaussian measures on H with possibly different correlation operators,
\[ \mu = N(a_1, B_1), \qquad \nu = N(a_2, B_2). \tag{6.36} \]
LEMMA 6.4.– (First necessary condition for equivalence of Gaussian measures) If measures [6.36] are equivalent, there exists c > 0 such that for all z ∈ H, (B1 z, z) ≤ c · (B2 z, z).

PROOF.– Assume the contrary. Then there exists a sequence {zn} ⊂ H with
\[ \frac{(B_2 z_n, z_n)}{(B_1 z_n, z_n)} \to 0 \quad \text{as } n \to \infty. \]
Let
\[ \xi_n(x) = \frac{(x, z_n) - (a_2, z_n)}{\sqrt{(B_1 z_n, z_n)}}, \qquad x \in H, \quad n \ge 1. \]
With respect to ν, ξn is a sequence of normal random variables with zero mean and variance
\[ D_\nu \xi_n = \frac{(B_2 z_n, z_n)}{(B_1 z_n, z_n)} \to 0 \quad \text{as } n \to \infty. \]
Hence ξn → 0 in probability ν. With respect to μ, ξn is a sequence of normal random variables as well, with variance
\[ D_\mu \xi_n = \frac{(B_1 z_n, z_n)}{(B_1 z_n, z_n)} = 1; \]
hence ξn ∼ N(mn, 1) and
\[ \mu(\{x : |\xi_n(x)| \le 1\}) = \frac{1}{\sqrt{2\pi}} \int_{-1}^{1} e^{-(t-m_n)^2/2}\,dt \le \sqrt{\frac{2}{\pi}}, \qquad \mu(\{x : |\xi_n(x)| > 1\}) \ge 1 - \sqrt{\frac{2}{\pi}} > 0. \tag{6.37} \]
On the other hand,
\[ \nu(\{x : |\xi_n(x)| > 1\}) \to 0 \quad \text{as } n \to \infty. \tag{6.38} \]
Due to [6.37], μ({x : |ξn(x)| > 1}) does not converge to zero. This fact together with relation [6.38] contradicts the condition that μ ∼ ν. This proves the statement. □

REMARK 6.6.– For measures [6.36], if there is no positive number c with (B1 z, z) ≤ c · (B2 z, z), z ∈ H, then μ ⊥ ν.

PROOF.– We use the functions ξn(x) from the proof of lemma 6.4. For each ε > 0, it holds
\[ \nu(\{x : |\xi_n(x)| > \varepsilon\}) \to 0 \quad \text{as } n \to \infty, \]
and similarly to [6.37],
\[ \mu(\{x : |\xi_n(x)| \le \varepsilon\}) \le \frac{2\varepsilon}{\sqrt{2\pi}}, \qquad \mu(\{x : |\xi_n(x)| > \varepsilon\}) \ge 1 - \frac{2\varepsilon}{\sqrt{2\pi}}. \]
Taking a sequence εk → 0, it is possible to construct a sequence of sets Ak = {x : |ξ_{n_k}(x)| > εk} with ν(Ak) → 0 and μ(Ak) → 1. Then μ ⊥ ν (see problem (8) at the end of section 6.3). □

Based on lemma 6.4, we can introduce a bounded operator J := B2^{-1/2} B1 B2^{-1/2}. We assume that the two measures in [6.36] are equivalent. Since the relation μ ∼ ν is symmetric, lemma 6.4 implies that there exist positive c1 and c2 such that
\[ c_1 (B_2 z, z) \le (B_1 z, z) \le c_2 (B_2 z, z), \qquad z \in H. \]
Then Ker B1 = Ker B2 and we set Jz = z, z ∈ Ker B1. We decompose H = Ker B1 ⊕ L, and it is enough to define J on L. The operators B1|L and B2|L are non-singular (i.e. with zero kernel), and therefore, we may and do assume that the initial correlation operators B1 and B2 are non-singular.

Now, we define J on the dense linear set R(√B2). Let z = B2^{1/2}u, u = B2^{-1/2}z ∈ H. We define J to be a symmetric linear operator on R(√B2) (i.e. (Jx, y) = (x, Jy), x, y ∈ R(√B2)) such that
\[ (Jz, z) = (B_1 B_2^{-1/2} z, B_2^{-1/2} z) = (B_1 u, u), \qquad z \in R(\sqrt{B_2}). \]
Then J is a bounded operator on R(√B2), because for z ∈ R(√B2), z ≠ 0, it holds due to lemma 6.4:
\[ \frac{(Jz, z)}{(z, z)} = \frac{(B_1 u, u)}{(B_2 u, u)} \le c. \]
J is extended to a self-adjoint operator on H. This extended operator will be denoted as B2^{-1/2} B1 B2^{-1/2}. It is a positive operator.

Since the relation μ ∼ ν is symmetric, a positive self-adjoint bounded operator B1^{-1/2} B2 B1^{-1/2} is defined similarly.
LEMMA 6.5.– (Second necessary condition for equivalence of Gaussian measures) If measures [6.36] are equivalent, then a2 − a1 ∈ R(√B1).

PROOF.– Let {ek} and {βk} be eigenbasis and corresponding eigenvalues of B1. For fixed z ∈ H, consider the series
\[ \sum_{k=1}^{\infty} \frac{1}{\sqrt{\beta_k}} (x - a_1, e_k)(z, e_k). \tag{6.39} \]
It is a series of independent normal variables on the probability space (H, B(H), μ), with zero mean and variances (z, ek)². We have
\[ \sum_{k=1}^{\infty} (z, e_k)^2 = \|z\|^2, \]
and according to the Kolmogorov theorem about two series (see [SHI 16]), the series [6.39] converges for x ∈ H (mod μ). Its sum Sz(x) ∼ N(0, ‖z‖²) under the measure μ. Since sup_{‖z‖≤1} Eμ Sz²(x) = 1 < ∞, the family of random variables {Sz(x) : ‖z‖ ≤ 1} is bounded in probability μ, i.e.
\[ \sup_{\|z\|\le 1} \mu(\{x : |S_z(x)| > c\}) \to 0 \quad \text{as } c \to \infty. \tag{6.40} \]
Because μ ∼ ν, the series [6.39] converges to Sz(x) for x ∈ H (mod ν). Then it should hold
\[ \sup_{\|z\|\le 1} \nu(\{x : |S_z(x)| > c\}) \to 0 \quad \text{as } c \to \infty. \tag{6.41} \]
Otherwise, if [6.41] fails, it would be possible to use [6.40] to construct a sequence An of Borel sets in H such that μ(An) tends to 0 and ν(An) does not tend to 0 as n → ∞. But this contradicts the equivalence of ν and μ. Thus, [6.41] holds true, and the family of random variables {Sz(x) : ‖z‖ ≤ 1} is bounded in probability ν as well. But under ν, Sz(x) is a normal r.v. with mean
\[ m(z) = \sum_{k=1}^{\infty} \frac{1}{\sqrt{\beta_k}} (a_2 - a_1, e_k)(z, e_k). \]
Now, [6.41] implies that sup_{‖z‖≤1} |m(z)| < ∞, and m(z) is a linear continuous functional on H: m(z) = (h, z) for some h ∈ H. We substitute here z = B1^{1/2}u and obtain
\[ \sum_{k=1}^{\infty} (a_2 - a_1, e_k)(u, e_k) = (h, B_1^{1/2} u) = (B_1^{1/2} h, u), \qquad a_2 - a_1 = B_1^{1/2} h. \qquad\square \]

REMARK 6.7.– In what follows, the sum of [6.39] under the measure μ will be denoted as (B1^{-1/2}(x − a1), z). For fixed z ∈ H, it is a measurable functional on (H, B(H), μ); cf. the functional (b, B1^{-1/2}x) in [6.33]. If some βk0 = 0, then (x − a1, ek0) = 0 (mod μ), and in [6.39] the term (1/√β_{k0})(x − a1, ek0)(z, ek0) is set equal to 0 by definition.
REMARK 6.8.– For measures [6.36], if a1 − a2 ∉ R(√B1), then μ ⊥ ν.

PROOF.– We may and do assume that there exist c1 > 0, c2 > 0 with
\[ c_1 (B_2 z, z) \le (B_1 z, z) \le c_2 (B_2 z, z), \qquad z \in H, \tag{6.42} \]
otherwise by remark 6.6 μ ⊥ ν and the statement is proven.

Relation [6.42] implies that the operator B1^{-1/2} B2 B1^{-1/2} ∈ L(H) is well defined (see the discussion above lemma 6.5).
We use the series [6.39] and its partial sums
\[ S_{nz}(x) = \sum_{k=1}^{n} \frac{1}{\sqrt{\beta_k}} (x - a_1, e_k)(z, e_k). \]
Under μ, Snz(x) → Sz(x) ∼ N(0, ‖z‖²) as n → ∞ for almost all x ∈ H. Now, we study the behavior of Snz(x) under ν. We have
\[ S_{nz}(x) = \sum_{k=1}^{n} \frac{1}{\sqrt{\beta_k}} (a_2 - a_1, e_k)(z, e_k) + \sum_{k=1}^{n} \frac{1}{\sqrt{\beta_k}} (x - a_2, e_k)(z, e_k) =: m_n(z) + T_{nz}(x). \]
Since a2 − a1 ∉ R(√B1), it holds \( \sum_{k=1}^{\infty} (a_2 - a_1, e_k)^2/\beta_k = +\infty \), and we can select z ∈ H with mn(z) → +∞ as n → ∞. Next, Tnz(x) is a normal r.v. with respect to ν, with zero mean and variance
\[ D_\nu T_{nz}(x) = \sum_{k,j=1}^{n} \frac{(B_2 e_k, e_j)(z, e_k)(z, e_j)}{\sqrt{\beta_k \beta_j}}. \]
Using the definition of the operator F = B1^{-1/2} B2 B1^{-1/2}, we get
\[ (B_1^{-1/2} B_2 B_1^{-1/2} e_k, e_j) = (B_2 B_1^{-1/2} e_k, B_1^{-1/2} e_j) = \Bigl( B_2 \frac{e_k}{\sqrt{\beta_k}}, \frac{e_j}{\sqrt{\beta_j}} \Bigr) = \frac{(B_2 e_k, e_j)}{\sqrt{\beta_k \beta_j}}. \]
Thus,
\[ D_\nu T_{nz}(x) = \sum_{k,j=1}^{n} (F e_k, e_j)(z, e_k)(z, e_j) = (F z_n, z_n), \qquad z_n = \sum_{k=1}^{n} (z, e_k) e_k. \]
Hence Dν Tnz(x) → (F z, z) as n → ∞, and Tnz(x) under ν converges in probability to a normal variable N(0, (F z, z)). To summarize, Snz(x) → +∞ under ν in probability, and therefore, for each c > 0,
\[ \nu(\{x : |S_{nz}(x)| > c\}) \to 1 \quad \text{as } n \to \infty. \tag{6.43} \]
Meanwhile,
\[ \mu(\{x : |S_{nz}(x)| > c\}) \to \mu(\{x : |S_z(x)| > c\}) \quad \text{as } n \to \infty, \tag{6.44} \]
and the limit is small for large c, since
\[ \lim_{c\to+\infty} \mu(\{x : |S_z(x)| > c\}) = 0. \tag{6.45} \]
In view of problem (8) posed at the end of section 6.3.3, relations [6.43]–[6.45] imply that μ ⊥ ν. □

We need some information about the spectral resolution of self-adjoint operators (see [BER 12] for details). Let G be a Hilbert space. A monotone mapping P(·) from the real line into the set of orthogonal projectors on G is called a resolution of the identity if it is left-continuous w.r.t. the strong operator convergence and satisfies the conditions
\[ \lim_{t\to-\infty} P(t) = 0, \qquad \lim_{t\to+\infty} P(t) = I, \]
where the limits are taken in the sense of strong operator convergence. According to the spectral decomposition theorem, every self-adjoint bounded operator A on G has an integral representation
\[ A = \int_{\mathbb{R}} \lambda \, dP(\lambda), \tag{6.46} \]
where P(·) is some resolution of the identity. The integral in [6.46] is taken over some interval containing the spectrum of A and can be defined as a Riemann–Stieltjes integral
based on the uniform operator convergence. If A is a compact self-adjoint operator, representation [6.46] takes the form
\[ A = \sum_{k\ge 1} \lambda_k P_{G_k}, \tag{6.47} \]
where {λk, k ≥ 1} is an at most countable collection of non-zero eigenvalues (without multiplicity), Gk are the corresponding finite-dimensional eigenspaces and P_{Gk} is the orthoprojector on Gk. If the number of eigenvalues is infinite, then λk → 0 as k → ∞ and the series in [6.47] converges in the operator norm (i.e. uniformly). For the corresponding resolution of the identity P(·) it holds P(λk+) − P(λk) = P_{Gk} and P(0+) − P(0) = P_{Ker A} (the orthoprojector on Ker A); if λ ∉ σ(A), then P(·) is continuous at the point λ w.r.t. strong operator convergence; moreover, if [a, b] ∩ σ(A) = ∅, then P(λ) = P(a), λ ∈ [a, b]. We note that in this case,
\[ \dim P(-\delta)G < \infty \quad \text{and} \quad \dim (I - P(\delta))G < \infty, \qquad \text{for each } \delta > 0. \tag{6.48} \]
The following criterion holds true: a bounded self-adjoint operator A on G is compact if, and only if, relation [6.48] holds for the corresponding resolution of the identity P(·).

LEMMA 6.6.– (Third necessary condition for equivalence of Gaussian measures) If measures [6.36] are equivalent, then the operator D := B2^{-1/2} B1 B2^{-1/2} − I is compact on the separable infinite-dimensional Hilbert space H.

PROOF.– Let P(·) be the resolution of the identity for the self-adjoint bounded operator D. We argue by contradiction and suppose that D is not a compact operator. Then there exists δ > 0 such that at least one of the two subspaces P(−δ)H and (I − P(δ))H has infinite dimension. Then, it is possible to construct an infinite orthonormal system {fk, k ≥ 1} belonging to one of those subspaces, with (Dfk, fj) = 0 for k ≠ j. These vectors can be taken from the eigenspaces (P(λk+) − P(λk))H and from the subspaces (P(uk) − P(dk))H, where (dk, uk) are disjoint intervals from (−∞, −δ) ∪ (δ, +∞) which do not contain eigenvalues.

Consider a measurable functional (B2^{-1/2}(x − a2), fk) (see remark 6.7). It is a normal r.v. on the probability space (H, B(H), ν) and therefore on the probability space (H, B(H), μ) as well, since μ ∼ ν and the corresponding series
\[ \sum_{n=1}^{\infty} \frac{1}{\sqrt{\beta_n}} (x - a_2, e_n)(f_k, e_n) \]
converges a.s. for both probability spaces; here {en} and {βn} are eigenbasis and eigenvalues of B2. We introduce a mapping T : H → R∞,
\[ Tx = \bigl( (B_2^{-1/2}(x - a_2), f_k), \; k \ge 1 \bigr). \tag{6.49} \]
The induced measures μT⁻¹ and νT⁻¹ are equivalent as well. Actually, T is a Gaussian random element on R∞ for both probability spaces. Find the distribution of T in both cases. It holds:
\[ E_\nu (B_2^{-1/2}(x - a_2), f_k) = 0, \qquad E_\nu (B_2^{-1/2}(x - a_2), f_k)(B_2^{-1/2}(x - a_2), f_j) = (f_k, f_j) = \delta_{kj}. \]
Thus, νT⁻¹ is the product measure \( \prod_{k=1}^{\infty} N(0, 1) \) on R∞.

Next, by lemma 6.5, a1 − a2 ∈ R(√B2), and
\[ E_\mu (B_2^{-1/2}(x - a_2), f_k) = (B_2^{-1/2}(a_1 - a_2), f_k) =: \hat a_k. \]
By lemma 6.4, B2^{-1/2} B1 B2^{-1/2} ∈ L(H), and as in the proof of remark 6.8, it holds for the components (Tx)k = (B2^{-1/2}(x − a2), fk):
\[ \mathrm{Cov}_\mu ((Tx)_k, (Tx)_j) = \int_H (B_2^{-1/2}(x - a_1), f_k)(B_2^{-1/2}(x - a_1), f_j)\,d\mu(x) = (B_2^{-1/2} B_1 B_2^{-1/2} f_k, f_j) = (Df_k, f_j) + (f_k, f_j). \]
This equals 0 for k ≠ j, and for k = j,
\[ D_\mu (Tx)_k = (Df_k, f_k) + (f_k, f_k) = (Df_k, f_k) + 1 =: \hat b_k. \]
Thus, μT⁻¹ is the product measure \( \prod_{k=1}^{\infty} N(\hat a_k, \hat b_k) \) on R∞. Since μT⁻¹ ∼ νT⁻¹, we apply theorem 6.5 and obtain
\[ \sum_{k=1}^{\infty} \delta_k^2 := \sum_{k=1}^{\infty} (\hat b_k - 1)^2 = \sum_{k=1}^{\infty} (Df_k, f_k)^2 < \infty. \]
On the other hand, |(Dfk, fk)| ≥ δ, k ≥ 1, and so \( \sum_{k=1}^{\infty} \delta_k^2 = \infty \). We come to a contradiction. Thus, D is a compact operator. □

REMARK 6.9.– For measures [6.36], if a1 − a2 ∈ √B1 H, if there exists C > 0 with (B1 z, z) ≤ C(B2 z, z), z ∈ H, and if additionally the operator D = B2^{-1/2} B1 B2^{-1/2} − I is not compact, then μ ⊥ ν.

PROOF.– We use the notations from the proof of lemma 6.6. For the probability space (H, B(H), ν), we consider the mapping T : H → R∞ given in [6.49]. Under μ, the series
\[ \sum_{n=1}^{\infty} \frac{1}{\sqrt{\beta_n}} (x - a_2, e_n)(f_k, e_n) \]
converges in probability to a normal variable distributed as N(âk, b̂k); according to the Riesz theorem, there exists a sequence nm → ∞ such that for all k ≥ 1,
\[ S_{n_m}(x; f_k) := \sum_{n=1}^{n_m} \frac{1}{\sqrt{\beta_n}} (x - a_2, e_n)(f_k, e_n) \]
converges μ-almost surely to ξk(x) ∼ N(âk, b̂k). Now, we define the mapping T̃ : H → R∞, (T̃x)k = lim_{m→∞} S_{n_m}(x; fk), k ≥ 1. It is well defined for all x ∈ H \ H0 with μ(H0) = ν(H0) = 0. We have
\[ \nu \tilde T^{-1} = \prod_{k=1}^{\infty} N(0, 1), \qquad \mu \tilde T^{-1} = \prod_{k=1}^{\infty} N(\hat a_k, \hat b_k). \]
Since D is not compact, \( \sum_{k=1}^{\infty} \delta_k^2 = \infty \) and νT̃⁻¹ ⊥ μT̃⁻¹; hence ν ⊥ μ. □
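The quantity ∑k (Dfk, fk)² that decides between equivalence and singularity in the last two proofs can be probed in a diagonal model. In the sketch below (assuming numpy; the perturbation sequences d_k are illustrative), B1 ek = k⁻² ek and B2 ek = (1 + d_k) k⁻² ek, so D = B2^{-1/2} B1 B2^{-1/2} − I is diagonal with eigenvalues −d_k/(1 + d_k), and D is Hilbert–Schmidt exactly when ∑ d_k² < ∞:

```python
import numpy as np

k = np.arange(1, 1000001)

def hs_partial_sum(d):
    """Partial sum of the squared eigenvalues of D for the perturbation d."""
    return np.sum((d / (1.0 + d)) ** 2)

s_equiv = hs_partial_sum(1.0 / k)            # d_k = 1/k: bounded sum -> equivalence
s_sing = hs_partial_sum(1.0 / np.sqrt(k))    # d_k = 1/sqrt(k): sum grows like log n -> singularity
print(s_equiv, s_sing)
```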
6.3.3. Criterion for equivalence of Gaussian measures

THEOREM 6.8.– (Feldman–Hájek theorem) Let μ = N(a1, B1) and ν = N(a2, B2) be Gaussian measures on a real separable infinite-dimensional Hilbert space H.

a) μ ∼ ν if, and only if, it holds:

1) a2 − a1 ∈ √B1 H;

2) there exist C1, C2 > 0 such that C1(B2 z, z) ≤ (B1 z, z) ≤ C2(B2 z, z), z ∈ H;

3) D := B2^{-1/2} B1 B2^{-1/2} − I is a Hilbert–Schmidt operator.

b) If at least one of conditions 1–3 is violated, then μ ⊥ ν.

PROOF.– The necessity of conditions 1 and 2, and the mutual singularity of μ and ν if at least one of the first two conditions is violated, follow from lemmas 6.5 and 6.4 and from remarks 6.8 and 6.6. Now, assume that conditions 1 and 2 hold true. If μ ∼ ν, then D is a compact operator, and if D is not a compact operator, then ν ⊥ μ (see lemma 6.6 and remark 6.9). Thus, we assume conditions 1 and 2 and that D is a compact operator. If D = 0, then D ∈ S2(H) and B1 = B2; hence by theorem 6.6, ν ∼ μ. Now, let D ≠ 0 and {fk, k ≥ 1} be an eigenbasis of D in the subspace H ⊖ Ker D, Ker D = Ker B1 = Ker B2. Consider the random variables (B2^{-1/2}(x − a2), fk) (see the proof of lemma 6.6). On the probability space (H, B(H), ν), they are i.i.d. standard normal random variables, and on (H, B(H), μ) they are independent normal with
means (B2^{-1/2}(a1 − a2), fk) and variances 1 + (Dfk, fk) (see the proof of remark 6.9). Theorem 6.5 implies that the conditions
\[ \sum_{k\ge 1} (B_2^{-1/2}(a_1 - a_2), f_k)^2 < \infty \quad \text{and} \quad \sum_{k\ge 1} (Df_k, f_k)^2 < \infty \tag{6.50} \]
are necessary and sufficient for μ ∼ ν. The first condition in [6.50] is equivalent to condition 1, and the second condition in [6.50] is equivalent to D ∈ S2(H). If D ∉ S2(H), then \( \sum_{k\ge 1} (Df_k, f_k)^2 = \infty \) and theorem 6.5 implies that μ ⊥ ν (see the proof of remark 6.9). □

COROLLARY 6.6.– On a real separable infinite-dimensional Hilbert space, Gaussian measures N(a1, B1) and N(a2, B2) are either equivalent or mutually singular. They are equivalent if, and only if, it holds:

1) N(a1, B1) ∼ N(a2, B1);

2) N(a2, B1) ∼ N(a2, B2).

Now, we find the Radon–Nikodym derivative for two Gaussian measures with different correlation operators. Consider two centered Gaussian measures on H,
\[ \mu = N(0, B_1) \quad \text{and} \quad \nu = N(0, B_2), \qquad \mu \sim \nu. \tag{6.51} \]
By theorem 6.8, the following operator is a Hilbert–Schmidt one:
\[ F := B_1^{-1/2} B_2 B_1^{-1/2} - I, \tag{6.52} \]
with eigenbasis {ek} and corresponding eigenvalues {βk}, \( \sum_{k=1}^{\infty} \beta_k^2 < \infty \). Since Ker(B1^{-1/2} B2 B1^{-1/2}) = {0}, it holds βk > −1, k ≥ 1.
Consider a probability space (R∞, B(R∞), μ̄) with
\[ \bar\mu = \prod_{k=1}^{\infty} N(0, 1), \tag{6.53} \]
and the Hilbert space G := L2(R∞, H, μ̄) of random elements in H with finite strong second moments; the inner product in this space is as follows:
\[ (\xi, \eta)_G = \int_{R^\infty} (\xi(x), \eta(x))_H \, d\bar\mu(x). \]
The series
\[ \sum_{k=1}^{\infty} x_k \sqrt{B_1}\, e_k \tag{6.54} \]
converges in G to some random element ψ(x), since the summands of [6.54] are pairwise orthogonal in G and moreover,
\[ \sum_{k=1}^{\infty} \bigl\| x_k \sqrt{B_1}\, e_k \bigr\|_G^2 = \sum_{k=1}^{\infty} (\sqrt{B_1}\, e_k, \sqrt{B_1}\, e_k) = \operatorname{tr} B_1 < \infty. \tag{6.55} \]
Relation [6.55] implies that for the partial sums Sn(x) of [6.54], ‖Sn(x) − ψ(x)‖_H converges in probability to 0. A series of independent random elements in H converges a.s. if, and only if, it converges in probability (see [BUL 80] or [BUL 81]). Hence, the series [6.54] converges to ψ(x) μ̄-a.s. in the norm of H.

THEOREM 6.9.– (About the Radon–Nikodym derivative of two centered Gaussian measures) For measures [6.51], fix an arbitrary modification of the Radon–Nikodym derivative dν/dμ. Then,
\[ \frac{d\nu}{d\mu}(\psi(x)) = \prod_{k=1}^{\infty} \frac{1}{\sqrt{1+\beta_k}} \exp\Bigl( \frac{\beta_k x_k^2}{2(1+\beta_k)} \Bigr), \tag{6.56} \]
where the measurable mapping ψ : R∞ → H is the sum of the series [6.54], which converges μ̄-a.s., and [6.56] holds for almost all x ∈ R∞ with respect to the measure μ̄; the latter is given in [6.53].

PROOF.– 1) Introduce a Gaussian measure on R∞,
\[ \bar\nu := \prod_{k=1}^{\infty} N(0, 1+\beta_k). \tag{6.57} \]
Since \( \sum_{k=1}^{\infty} \beta_k^2 < \infty \), ν̄ ∼ μ̄ (see theorem 6.8), and therefore, the series [6.54] converges to ψ(x) ν̄-a.s. as well.

2) Prove that
\[ \mu = \bar\mu \psi^{-1}, \qquad \nu = \bar\nu \psi^{-1}. \tag{6.58} \]
Indeed, both measures μ̄ψ⁻¹ and ν̄ψ⁻¹ in [6.58] are centered Gaussian measures on H. Next, for y ∈ H, as m → ∞,
\[ \int_{R^\infty} (y, S_{n_m}(z))^2 \, d\bar\mu(z) = \sum_{j,k=1}^{n_m} \int_{R^\infty} z_j z_k (\sqrt{B_1}\, e_j, y)(\sqrt{B_1}\, e_k, y)\, d\bar\mu(z) = \sum_{j=1}^{n_m} (\sqrt{B_1}\, e_j, y)^2 = \sum_{j=1}^{n_m} (\sqrt{B_1}\, y, e_j)^2 \to \bigl\|\sqrt{B_1}\, y\bigr\|^2 = (B_1 y, y), \]
and since ‖S_{n_m} − ψ‖_G → 0, we have
\[ (B_1 y, y) = \int_{R^\infty} (y, \psi(z))^2 \, d\bar\mu(z) = \int_H (y, t)^2\, d(\bar\mu\psi^{-1})(t), \qquad y \in H. \]
Hence, μ̄ψ⁻¹ = N(0, B1) = μ. In a similar way, as m → ∞,
\[ \int_{R^\infty} (y, S_{n_m}(z))^2 \, d\bar\nu(z) = \sum_{j=1}^{n_m} (1+\beta_j)\,(\sqrt{B_1}\, y, e_j)^2 \to (B_2 y, y). \tag{6.59} \]
In the space G̃ := L2(R∞, H, ν̄) of random elements with respect to the probability ν̄, we have
\[ \sum_{k=1}^{\infty} \bigl\| x_k \sqrt{B_1}\, e_k \bigr\|_{\tilde G}^2 = \sum_{k=1}^{\infty} (\beta_k + 1)(B_1 e_k, e_k) = \operatorname{tr} B_2 < \infty, \]
and the series [6.54] converges to ψ(x) in G̃ as well. Hence [6.59] implies
\[ (B_2 y, y) = \int_{R^\infty} (y, \psi(z))^2 \, d\bar\nu(z) = \int_H (y, t)^2 \, d(\bar\nu\psi^{-1})(t), \qquad y \in H, \]
and ν̄ψ⁻¹ = N(0, B2) = ν.

3) Based on theorem 6.3, we find dν̄/dμ̄. Let μ̄k = N(0, 1), ν̄k = N(0, βk + 1), k ≥ 1. It holds
\[ \frac{d\bar\nu_k}{d\bar\mu_k}(x) = \frac{1}{\sqrt{1+\beta_k}} \exp\Bigl( -\frac{x^2}{2(1+\beta_k)} + \frac{x^2}{2} \Bigr) = \frac{1}{\sqrt{1+\beta_k}} \exp\Bigl( \frac{\beta_k x^2}{2(1+\beta_k)} \Bigr), \qquad x \in R; \]
\[ \frac{d\bar\nu}{d\bar\mu}(x) = \lim_{n\to\infty} \prod_{k=1}^{n} \frac{1}{\sqrt{1+\beta_k}} \exp\Bigl( \frac{\beta_k x_k^2}{2(1+\beta_k)} \Bigr) = \prod_{k=1}^{\infty} \frac{1}{\sqrt{1+\beta_k}} \exp\Bigl( \frac{\beta_k x_k^2}{2(1+\beta_k)} \Bigr). \tag{6.60} \]
The product in [6.60] converges a.e. with respect to μ̄.

4) Let A ∈ B(H). From [6.58], we get
\[ \nu(A) = \bar\nu(\psi^{-1}A) = \int_{\psi^{-1}A} \frac{d\bar\nu}{d\bar\mu}(x)\, d\bar\mu(x), \qquad \nu(A) = \int_A \frac{d\nu}{d\mu}(t)\, d\mu(t) = \int_{\psi^{-1}A} \frac{d\nu}{d\mu}(\psi(x))\, d\bar\mu(x); \]
hence
\[ \frac{d\bar\nu}{d\bar\mu}(x) = \frac{d\nu}{d\mu}(\psi(x)) \quad (\text{mod } \bar\mu), \]
and [6.56] follows from [6.60]. □
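Each one-dimensional factor in [6.60] is just a ratio of normal densities, which can be confirmed numerically (a sketch assuming numpy and scipy; the values of βk and x are illustrative):

```python
import numpy as np
from scipy.stats import norm

b, x = 0.7, 1.3    # beta_k and the observed coordinate (illustrative values)

# dN(0, 1+b)/dN(0, 1)(x) versus the closed form exp(b x^2 / (2(1+b))) / sqrt(1+b).
lhs = norm.pdf(x, scale=np.sqrt(1.0 + b)) / norm.pdf(x)
rhs = np.exp(b * x**2 / (2.0 * (1.0 + b))) / np.sqrt(1.0 + b)
print(abs(lhs - rhs))    # zero up to rounding
```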
REMARK 6.10.– (Simplified version of the Radon–Nikodym derivative) Under the conditions of theorem 6.9, if additionally the operator F in [6.52] is nuclear, then \( \sum_{k=1}^{\infty} |\beta_k| < \infty \) and
\[ \frac{d\nu}{d\mu}(\psi(x)) = \frac{1}{\sqrt{\det(I+F)}} \exp\Bigl( \sum_{k=1}^{\infty} \frac{\beta_k x_k^2}{2(1+\beta_k)} \Bigr) \quad (\text{mod } \bar\mu), \]
where det(I + F) is equal to the convergent product \( \prod_{k=1}^{\infty} (1+\beta_k) \).
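In finite dimensions, the determinant identity behind remark 6.10 — det(I + F) equals the product of 1 + βk over the eigenvalues of a symmetric F — can be verified directly (assuming numpy; the random perturbation is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

n = 6
M = rng.standard_normal((n, n))
F = 0.05 * (M + M.T)                  # small symmetric perturbation, eigenvalues > -1
beta = np.linalg.eigvalsh(F)

det_direct = np.linalg.det(np.eye(n) + F)
det_product = np.prod(1.0 + beta)
print(abs(det_direct - det_product))  # zero up to rounding
```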
Problems 6.3

6) Let (X, S) be a measurable space, Xr ∈ S, and let μ and ν be measures on the measurable space (Xr, Xr ∩ S); set μe(A) = μ(A ∩ Xr), νe(A) = ν(A ∩ Xr), A ∈ S. Prove that μ ≪ ν if, and only if, μe ≪ νe, and that μ ⊥ ν if, and only if, μe ⊥ νe.

7) Under the conditions of theorem 6.6(a), find dμ/dν.

8) Prove that probability measures μ and ν on a measurable space (X, S) are mutually singular if, and only if, there exists a sequence {An} of sets such that μ(An) → 0 and ν(An) → 1 as n → ∞.

9) Let t ∈ R \ {−1, 1} and let ξ be a Gaussian random element on a real separable infinite-dimensional Hilbert space H, with a correlation operator which is not a finite-dimensional operator. Prove that for each c ∈ H, the distributions of ξ and tξ + c are mutually singular.

6.4. Applications in statistics

We apply the results of section 6.3 to estimation of the parameters of Gaussian random elements in a real separable infinite-dimensional Hilbert space H and to hypothesis testing about those parameters.

6.4.1. Estimation and hypothesis testing for mean of Gaussian random element

Consider a random element X in H with Gaussian distribution N(a, B), where the correlation operator B is known, B is a non-zero S-operator, and the mean a ∈ H is unknown and to be estimated by a single realization X(ω) of the underlying random element. Assume additionally that a ∈ Lm, where Lm is a given m-dimensional subspace of R(√B). According to theorem 6.6, the distribution of X, μa = N(a, B),
is equivalent to μ0 = N(0, B). Based on theorem 6.7, one can construct the maximum likelihood estimator (MLE) of a. The mean can be decomposed as
\[ a = B^{1/2} b, \qquad b = \sum_{i=1}^{m} \alpha_i f_i, \]
where αi ∈ R are unknown and {fi} is a known orthonormal system with fi ⊥ Ker B, i = 1, . . . , m. According to [6.34], the log-likelihood function is as follows:
\[ L(\alpha_1, \dots, \alpha_m; X) = \log \frac{d\mu_a}{d\mu_0}(X) = \sum_{i=1}^{m} \alpha_i (f_i, B^{-1/2}X) - \frac{1}{2} \sum_{i=1}^{m} \alpha_i^2. \tag{6.61} \]
Maximizing [6.61], we get the MLEs
\[ \hat\alpha_i = (f_i, B^{-1/2}X), \quad i = 1, \dots, m, \qquad \hat b = \sum_{i=1}^{m} (f_i, B^{-1/2}X)\, f_i, \qquad \hat a = \sum_{i=1}^{m} (f_i, B^{-1/2}X)\, B^{1/2} f_i. \tag{6.62} \]
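In a finite-dimensional truncation the estimator [6.62] is easy to simulate. The sketch below (assuming numpy; the diagonal operator, the choice m = 2 with fi = ei, and the coefficient values are all illustrative assumptions, not from the text) also checks unbiasedness by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative diagonal model: H = R^5, B = diag(beta), f_i = e_i for i = 1, 2.
beta = np.array([1.0, 0.5, 0.25, 0.125, 0.0625])
alpha = np.array([1.0, -0.5])                    # true coefficients of b
b = np.concatenate([alpha, np.zeros(3)])
a = np.sqrt(beta) * b                            # a = B^{1/2} b

reps = 20000
X = a + np.sqrt(beta) * rng.standard_normal((reps, 5))   # draws of X ~ N(a, B)
alpha_hat = X[:, :2] / np.sqrt(beta[:2])                 # alpha_hat_i = (f_i, B^{-1/2} X)
a_hat = np.zeros_like(X)
a_hat[:, :2] = np.sqrt(beta[:2]) * alpha_hat             # a_hat = sum_i alpha_hat_i B^{1/2} f_i

print(np.max(np.abs(a_hat.mean(axis=0) - a)))    # small: the estimator is unbiased
```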
Remember that by [6.33],
\[ (f_i, B^{-1/2}X) = \lim_{n\to\infty} \sum_{k=1}^{n} \frac{(f_i, e_k) X_k}{\sqrt{\beta_k}}, \tag{6.63} \]
and the limit in [6.63] exists a.e. with respect to both μ0 and μa. The estimator [6.62] is unbiased. Indeed, X = a + X0, X0 ∼ N(0, B), and
\[ E(f_i, B^{-1/2}X) = (f_i, B^{-1/2}a) + E(f_i, B^{-1/2}X_0) = (f_i, b), \qquad E\hat a = \sum_{i=1}^{m} (f_i, b)\, B^{1/2} f_i = B^{1/2} b = a. \]
Here, we understand expectation in the Bochner sense. We have proved that a ˆ is unbiased. Another application of relation [6.32] is hypothesis testing. Suppose that we observe a single realization X(ω) of a random element X in H with a known non-zero correlation operator B. We test the null hypothesis H0 : X ∼ N (0, B) against the alternative H1 : X ∼ N (B 1/2 b, B), where b is a fixed non-zero vector, b⊥KerB.
We set a = B^{1/2}b, a ≠ 0. Under H0, X has distribution μ0 = N(0, B), and under H1, it has distribution μa = N(a, B), μa ∼ μ0; the Radon–Nikodym derivative dμa/dμ0 is given in [6.32]. We construct the Neyman–Pearson test. The inequality dμa/dμ0(X) ≤ C0 leads to an inequality of the form
\[ (b, B^{-1/2}X) \le C. \tag{6.64} \]
Under H0, (b, B^{-1/2}X) ∼ N(0, ‖b‖²), and for any real C,
\[ P_{H_0}\{ (b, B^{-1/2}X) = C \} = 0. \]
Therefore, the Neyman–Pearson test will be non-randomized. Given a confidence level 1 − α ∈ [0.95; 1], we select the threshold C = Cα such that
\[ P_{H_0}\{ (b, B^{-1/2}X) > C_\alpha \} = \alpha, \quad \text{or} \quad P\Bigl\{ N(0,1) > \frac{C_\alpha}{\|b\|} \Bigr\} = \alpha; \qquad C_\alpha = z_\alpha \cdot \|b\|. \]
Here zα is the upper α-quantile of the standard normal law. The Neyman–Pearson decision rule is as follows: reject H0 if (b, B^{-1/2}X) > zα · ‖b‖, and do not reject H0 if (b, B^{-1/2}X) ≤ zα · ‖b‖. The significance level of the test equals
\[ P_{H_0}\{ (b, B^{-1/2}X) > z_\alpha \|b\| \} = \alpha. \]
Under H1, (b, B^{-1/2}X) ∼ N(‖b‖², ‖b‖²). The power of the test equals
\[ \text{power} = P_{H_1}\{ (b, B^{-1/2}X) > z_\alpha \|b\| \} = P\{ N(0,1) > z_\alpha - \|b\| \} = \Phi(\|b\| - z_\alpha). \]
Hereafter Φ(x) denotes the cdf of the standard normal law. We see that the larger ‖b‖, the more powerful the test; the power tends to 1 as ‖b‖ → ∞.

Now, suppose that we test a simple null hypothesis H0 against a simple alternative H1 concerning the distribution of the observed random element X in H, and moreover the corresponding distributions μ0 and μ1 are mutually singular. Then we can test the hypotheses with zero probability of Type I and Type II errors. Indeed, H can be partitioned as H = W0 ∪ W1, with μ0(W0) = 1 and μ1(W1) = 1; if the
observation X(ω) ∈ W0, then we accept H0, and if X(ω) ∈ W1, then we accept H1. Since
\[ P_{H_0}\{X(\omega) \in W_1\} = \mu_0(W_1) = 0, \qquad P_{H_1}\{X(\omega) \in W_0\} = \mu_1(W_0) = 0, \]
the proposed decision rule is exact. The problem is that a desired partition H = W0 ∪ W1 is sometimes difficult to construct. For example, theorem 6.8(b) states that under certain conditions two Gaussian measures are mutually singular, but it does not provide the corresponding partition of H.

Now, we consider a simple case where such a partition can be easily constructed. In real problems, this never happens, and the following test is just an educational theoretical one. Let B be an S-operator in H with zero kernel, eigenbasis {ek} and corresponding (positive) eigenvalues βk, k ≥ 1. Suppose that we observe a single realization X(ω) of a random element X in H with known correlation operator equal to B. We test the null hypothesis H0 : X ∼ N(0, B) against the alternative H1 : X ∼ N(a, B), where a is such that
\[ \sum_{k=1}^{\infty} \frac{a_k^2}{\sqrt{\beta_k}} = \infty, \qquad a_k = (a, e_k), \quad k \ge 1. \tag{6.65} \]
Notice that under [6.65], since βk → 0 as k → ∞, it holds
\[ \sum_{k=1}^{\infty} \frac{a_k^2}{\beta_k} = \sum_{k=1}^{\infty} \frac{a_k^2}{\sqrt{\beta_k}} \cdot \frac{1}{\sqrt{\beta_k}} = \infty \]
as well; hence a ∉ R(√B) and μ0 = N(0, B) ⊥ μa = N(a, B). Remember that Xk(ω) = (X(ω), ek). Define a decision rule as follows: reject H0 if the series
\[ \sum_{k=1}^{\infty} \frac{a_k X_k(\omega)}{\sqrt{\beta_k}} \]
is divergent, and do not reject H0 if the series converges.
Under H0, X ∼ N(0, B) and \( \sum_{k=1}^{\infty} a_k X_k(\omega)/\sqrt{\beta_k} = (a, B^{-1/2}X) \) converges a.s.; hence the probability of a Type I error equals zero. Under H1, it holds X = a + X0, X0 ∼ N(0, B), and due to [6.65]
\[ \sum_{k=1}^{\infty} \frac{a_k X_k(\omega)}{\sqrt{\beta_k}} = (a, B^{-1/2}X_0) + \sum_{k=1}^{\infty} \frac{a_k^2}{\sqrt{\beta_k}} = \infty \]
almost surely. Thus, the probability of a Type II error equals zero as well.

6.4.2. Estimation and hypothesis testing for correlation operator of centered Gaussian random element
∞
1/2
1/2
βk B1 P[ek ] B1 ,
βk > −1,
k ≥ 1,
1
∞
βk2 < ∞, [6.66]
1
where P[ek ] is orthoprojector on span(ek ) and the series in [6.66] converges uniformly, i.e. in the sense of uniform operator convergence. Then using the corresponding quadratic forms and the definition of bounded linear operator −1/2 −1/2 B1 B2 B 1 from section 6.3.2, we obtain −1/2
F := B1
−1/2
B2 B 1
−I =
∞
βk P[ek ] .
[6.67]
1
Suppose that we observe a single realization Y(ω) of a random element Y in H, Y ∼ N(0, B2). Here, B2 has the form [6.66] with unknown B1 and known ek, k ≥ 1, but unknown βk, k ≥ 1. Now, we make an attempt to construct the maximum likelihood estimators of the coefficients βk.

According to theorem 6.8(a), ν = N(0, B2) ∼ μ = N(0, B1), and [6.56] gives the Radon–Nikodym derivative dν/dμ at the point ψ(x), x ∈ R∞, where ψ(x) is given in [6.54]. We can assume that Y ∼ N(0, B2) is represented as
\[ Y(\omega) = \sum_{k=1}^{\infty} X_k(\omega) \sqrt{B_1}\, e_k = \psi(X(\omega)) \quad \text{a.s.}, \tag{6.68} \]
where X(ω) = (Xk(ω))_{k=1}^∞ is a random element on (R∞, B(R∞), ν̄) with the probability measure ν̄ given in [6.57]. In view of [6.68], we may and do assume that we observe the random variables X1(ω), . . . , Xk(ω), . . . By [6.56], the log-likelihood function is as follows:
\[ L(\{\beta_k\}; \{X_k(\omega)\}) = \log \frac{d\nu}{d\mu}(\psi(X(\omega))) = \frac{1}{2} \sum_{k=1}^{\infty} \Bigl[ \frac{\beta_k X_k^2(\omega)}{1+\beta_k} - \log(1+\beta_k) \Bigr]. \tag{6.69} \]
Here, the series converges a.s. Maximization in βk > −1 of the summand
\[ l_k(\beta_k; X_k(\omega)) = \frac{\beta_k X_k^2(\omega)}{1+\beta_k} - \log(1+\beta_k) \]
leads to the estimator β̂k = Xk²(ω) − 1. It is an unbiased estimator of βk, since Eβ̂k = (1 + βk) − 1 = βk. Moreover, β̂1, . . . , β̂n are the maximum likelihood estimators of β1, . . . , βn, provided βn+1, βn+2, . . . are known.

The sequence (β̂k)_{k=1}^∞ is not the MLE of the sequence (βk)_{k=1}^∞ ∈ l2. For instance, in the case where βk = 0, k ≥ 1 (i.e. when B2 = B1), the Xk(ω) are i.i.d. standard normal, and then \( \sum_{k=1}^{\infty} \hat\beta_k^2 = \infty \) a.s., because by the Strong Law of Large Numbers
\[ n^{-1} \sum_{k=1}^{n} \hat\beta_k^2 \to E\bigl( X_1^2(\omega) - 1 \bigr)^2 > 0 \quad \text{as } n \to \infty \ \text{a.s.}; \]
hence with probability one (β̂k)_{k=1}^∞ ∉ l2.
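Both claims about β̂k = Xk²(ω) − 1 are easy to reproduce by simulation (assuming numpy; the value βk = 0.8 and the sample sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Unbiasedness: with X_k ~ N(0, 1 + beta_k), E(X_k^2 - 1) = beta_k.
beta_k = 0.8
X = rng.normal(0.0, np.sqrt(1.0 + beta_k), size=200000)
beta_hat = X**2 - 1.0
print(beta_hat.mean())        # close to 0.8

# With B2 = B1 (all beta_k = 0), n^{-1} sum beta_hat_k^2 -> E(X^2 - 1)^2 = 2 by the SLLN,
# so the partial sums of beta_hat_k^2 grow linearly and the sequence leaves l2.
Z = rng.standard_normal(100000)
growth = np.mean((Z**2 - 1.0) ** 2)
print(growth)                 # near 2
```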
Now, we switch to hypothesis testing. Suppose that B1 and B2 are the operators described above, and not all βk are zeros. We observe a random element Y(ω) in H and test the null hypothesis H0 : Y ∼ N(0, B1) =: μ1 against the alternative H* : Y ∼ N(0, B2) =: μ2. In view of [6.58], we may and do assume that Y(ω) has the representation [6.68] with observed X(ω), where under H0, {Xk(ω), k ≥ 1} are i.i.d. standard normal, and under H* they are independent with Xk ∼ N(0, 1 + βk), k ≥ 1.

We construct the Neyman–Pearson test. In view of [6.56], the inequality dμ2/dμ1(X) ≤ C0 leads to an inequality of the form
\[ S(X) := \sum_{k=1}^{\infty} \Bigl[ \frac{\beta_k X_k^2(\omega)}{1+\beta_k} - \log(1+\beta_k) \Bigr] \le C. \tag{6.70} \]
We will show that the test is non-randomized. There exists j with βj ≠ 0; we decompose
\[ S(X) = \Bigl[ \frac{\beta_j X_j^2}{1+\beta_j} - \log(1+\beta_j) \Bigr] + \sum_{k\ne j} \Bigl[ \frac{\beta_k X_k^2}{1+\beta_k} - \log(1+\beta_k) \Bigr] =: \xi + \eta. \]
Under H0, ξ and η are independent, and moreover the cdf of ξ is continuous. Hence (see problem (11)), S(X) has a continuous cdf, and for any C ∈ R, it holds
\( P_{H_0}\{S(X) = C\} = 0 \). Therefore, the Neyman–Pearson test will indeed be non-randomized. Given a confidence level 1 − α ∈ [0.95; 1], we select the threshold C = Cα such that
\[ P_{H_0}\{ S(X) > C_\alpha \} = \alpha. \tag{6.71} \]
For α small enough, the solution Cα to equation [6.71] exists. For such α, the Neyman–Pearson decision rule is as follows: reject H0 if S(X) > Cα, and do not reject H0 if S(X) ≤ Cα. The significance level of the test equals α.

Consider a convenient particular case where the RHS of [6.67] is an S-operator, i.e. βk ≥ 0, k ≥ 1 (not all βk are zeros) and \( \sum_{k=1}^{\infty} \beta_k < \infty \). Then it holds \( \sum_{k=1}^{\infty} \log(1+\beta_k) < \infty \), and [6.70] takes the form
\[ Q(X) := \sum_{k=1}^{\infty} \frac{\beta_k X_k^2}{1+\beta_k} \le \tilde C. \tag{6.72} \]
The threshold C̃ = C̃α can be found from the equation
\[ P_{H_0}\{ Q(X) > \tilde C_\alpha \} = \alpha. \tag{6.73} \]
The Neyman–Pearson decision rule can be rewritten as follows: reject H0 if Q(X) > C̃α, and do not reject H0 if Q(X) ≤ C̃α.

Now, we obtain an approximate solution to equation [6.73]. Under H0, {Xk} are i.i.d. standard normal, and the distribution of Q(X) can be approximated by the distribution of Aχ²_ν with some real positive numbers A and ν (for such ν, χ²_ν is the Gamma distribution Γ(ν/2, 2)). We have
\[ E_{H_0} Q(X) = \sum_{k=1}^{\infty} \frac{\beta_k}{1+\beta_k}, \qquad \mathrm{Var}_{H_0}[Q(X)] = \sum_{k=1}^{\infty} \frac{\beta_k^2}{(1+\beta_k)^2} \mathrm{Var}_{H_0}[X_k^2] = 2 \sum_{k=1}^{\infty} \frac{\beta_k^2}{(1+\beta_k)^2}; \]
\[ E[A\chi^2_\nu] = A\nu, \qquad \mathrm{Var}[A\chi^2_\nu] = 2A^2\nu. \]
Equalizing the corresponding first and second moments, we get a system of equations for A and ν:
\[ A\nu = \sum_{k=1}^{\infty} \frac{\beta_k}{1+\beta_k}, \qquad A^2\nu = \sum_{k=1}^{\infty} \frac{\beta_k^2}{(1+\beta_k)^2}; \tag{6.74} \]
\[ A = \sum_{k=1}^{\infty} \frac{\beta_k^2}{(1+\beta_k)^2} \Bigm/ \sum_{k=1}^{\infty} \frac{\beta_k}{1+\beta_k}, \tag{6.75} \]
\[ \nu = \Bigl( \sum_{k=1}^{\infty} \frac{\beta_k}{1+\beta_k} \Bigr)^2 \Bigm/ \sum_{k=1}^{\infty} \frac{\beta_k^2}{(1+\beta_k)^2}. \tag{6.76} \]
Note that ν ≥ 1 in [6.76]. With such a choice of A and ν, we replace [6.73] with the approximate equation
\[ P\{ A\chi^2_\nu > \tilde C_\alpha \} = \alpha; \qquad \tilde C_\alpha = A \chi^2_{\nu\alpha}, \tag{6.77} \]
where χ²_{να} is the upper α-quantile of the χ²_ν distribution, i.e. P{χ²_ν > χ²_{να}} = α. This quantile can be found from statistical tables.

Now, we find the approximate power of the test under C̃α selected by [6.77]. Under H*, {Xk} are independent with Xk ∼ N(0, 1 + βk), k ≥ 1. Thus, under H*,
\[ Q(X) = \sum_{k=1}^{\infty} \beta_k \gamma_k^2, \qquad \gamma_k = \frac{X_k}{\sqrt{1+\beta_k}}, \quad k \ge 1, \tag{6.78} \]
where {γk} are i.i.d. standard normal. We approximate the distribution of [6.78] by the distribution of Bχ²_τ. Similarly to [6.75]–[6.76], we obtain
\[ B = \sum_{k=1}^{\infty} \beta_k^2 \Bigm/ \sum_{k=1}^{\infty} \beta_k, \qquad \tau = \Bigl( \sum_{k=1}^{\infty} \beta_k \Bigr)^2 \Bigm/ \sum_{k=1}^{\infty} \beta_k^2. \]
Thus, the power of the test is as follows:
\[ \text{power} \approx P\{ B\chi^2_\tau > A\chi^2_{\nu\alpha} \} = P\Bigl\{ \chi^2_\tau > \frac{A}{B} \chi^2_{\nu\alpha} \Bigr\} = 1 - F_\tau\Bigl( \frac{A}{B} \chi^2_{\nu\alpha} \Bigr), \tag{6.79} \]
where Fτ is the cdf of the χ²_τ distribution; the value Fτ(z) can be found from statistical tables.

Note that the statistical procedures presented in section 6.4 can be reformulated in terms of a Gaussian stochastic process ξ(t, ω) with square integrable paths on [0, T], which is observed on this interval. Such a process generates a Gaussian measure on the Hilbert space L2[0, T] (see example 5.3 with p = 2). More complicated statistical procedures for observed Gaussian random elements in H can be found in [IBR 80].
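The moment-matching recipe [6.74]–[6.79] can be coded directly. The sketch below (assuming numpy and scipy; the truncated sequence βk = 1/k² and α = 0.05 are illustrative choices) computes A, ν, the approximate threshold and the approximate power:

```python
import numpy as np
from scipy.stats import chi2

beta = 1.0 / np.arange(1, 1001) ** 2     # illustrative eigenvalue sequence, sum beta_k < inf
alpha = 0.05

r = beta / (1.0 + beta)
A = np.sum(r**2) / np.sum(r)             # [6.75]
nu = np.sum(r) ** 2 / np.sum(r**2)       # [6.76], >= 1 by Cauchy-Schwarz
C_alpha = A * chi2.ppf(1.0 - alpha, nu)  # [6.77]: A times the upper alpha-quantile of chi2_nu

B_c = np.sum(beta**2) / np.sum(beta)
tau = np.sum(beta) ** 2 / np.sum(beta**2)
power = 1.0 - chi2.cdf(A / B_c * chi2.ppf(1.0 - alpha, nu), tau)   # [6.79]
print(nu, C_alpha, power)
```

scipy's `chi2` accepts non-integer degrees of freedom, which is exactly what the moment matching produces.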
Problems 6.4

10) Find the distribution of the estimator [6.62].

11) Let ξ and η be independent random variables, and let ξ have a continuous cdf. Prove that ξ + η has a continuous cdf as well.

12) Let B1 and B2 be the operators in H as described at the beginning of section 6.4.2, but concerning {βk} in [6.67] assume that −1 < βk ≤ 0, k ≥ 1, not all of them are zeros, and \( \sum_{k=1}^{\infty} |\beta_k| < \infty \). We observe a single realization Y(ω) of a random element Y in H and test the hypothesis H0 : Y ∼ N(0, B1) against H* : Y ∼ N(0, B2). Construct the corresponding Neyman–Pearson test and, similarly to [6.79], find its approximate power.
7 Solutions
7.1. Solutions for Chapter 1

1) We have
\[
(\lambda_T^2\pi_1^{-1})(A) = \lambda_T^2((A \cap [0,T]) \times [0,T]) = \lambda_2((A \cap [0,T]) \times [0,T]) = \lambda_1(A \cap [0,T]) \cdot \lambda_1([0,T]) = T \cdot \lambda_1(A \cap [0,T]).
\]

2) For $B \in \mathcal B(\mathbb R)$, it holds $(\mu_1 \times \mu_2)(\pi_1^{-1}B) = (\mu_1 \times \mu_2)(B \times \mathbb R) = \mu_1(B) \cdot \mu_2(\mathbb R)$. Answer: $\mu_2(\mathbb R) \cdot \mu_1$.

3) a) Let $\mu T^{-1}$ be sigma-finite. Then there exists $\{A_n, n \ge 1\} \subset \mathcal F$ such that $Y = \bigcup_{n=1}^\infty A_n$ and $(\mu T^{-1})(A_n) < \infty$, $n \ge 1$. Then $X = \bigcup_{n=1}^\infty T^{-1}A_n$ and $\mu(T^{-1}A_n) < \infty$, $n \ge 1$.

b) Construct a counterexample. The Lebesgue measure $\lambda_1$ is sigma-finite on the Borel sigma-algebra $\mathcal B(\mathbb R)$. Let $T\colon \mathbb R \to \mathbb R$, $Tx = 0$, $x \in \mathbb R$. For the induced measure $\lambda_1T^{-1}$ on $\mathcal B(\mathbb R)$, we have, for $A \in \mathcal B(\mathbb R)$,
\[
(\lambda_1T^{-1})(A) = \begin{cases} +\infty, & \text{if } 0 \in A,\\ 0, & \text{otherwise.} \end{cases}
\]
The induced measure $\lambda_1T^{-1}$ is not sigma-finite.

4) The statement follows from relations [1.7] and [1.8].
Gaussian Measures in Hilbert Space: Construction and Properties, First Edition. Alexander Kukush. © ISTE Ltd 2019. Published by ISTE Ltd and John Wiley & Sons, Inc.
5) Let $B$ be a Lebesgue measurable set on the real line. Denote $B_+ = B \cap [0, +\infty)$ and $-B_+ = \{-x : x \in B_+\}$. Then $T^{-1}B = B_+ \cup (-B_+)$. Using corollary 1.1 for the transformation $Tx = -x$, $x \in \mathbb R$, we get
\[
\lambda_1T^{-1}B = \lambda_1(B_+) + \lambda_1(-B_+) = 2\lambda_1(B_+).
\]
Answer: $\lambda_1T^{-1}(B) = 2 \cdot \lambda_1(B \cap [0, +\infty))$, $B \in S_1$.

6) a) For any finite interval $[a, b]$, the preimage $f^{-1}(a, b]$ lies in a closed interval of length $b - a$. Indeed, let $x, y \in f^{-1}(a, b]$. Then $|x - y| \le |f(x) - f(y)| \le b - a$. Denote $x_* = \inf f^{-1}(a, b]$, $x^* = \sup f^{-1}(a, b]$. It holds $x^* - x_* \le b - a$; hence $f^{-1}(a, b] \subset [x_*, x_* + b - a]$.

b) Let $A \in S_1$; then $A = B \cup N$, with $B \in \mathcal B(\mathbb R)$ and $\lambda_1(N) = 0$. We have
\[
f^{-1}A = (f^{-1}B) \cup (f^{-1}N). \tag{7.1}
\]
Here $f^{-1}B \in S_1$, because $f$ is Lebesgue measurable.

c) Show that $f^{-1}N \in S_1$. Fix $\varepsilon > 0$. It holds $0 = \lambda_1(N) = \lambda_1^*(N)$, where $\lambda_1^*$ denotes the Lebesgue outer measure. Then there exist intervals $(a_k, b_k]$, $k = 1, 2, \ldots$, such that $\sum_{k=1}^\infty (b_k - a_k) < \varepsilon$ and $N \subset \bigcup_{k=1}^\infty (a_k, b_k]$. Therefore,
\[
f^{-1}N \subset \bigcup_{k=1}^\infty f^{-1}(a_k, b_k] \subset \bigcup_{k=1}^\infty I_k,
\]
where $I_k$, $k = 1, 2, \ldots$, are some closed intervals with total length
\[
L \le \sum_{k=1}^\infty (b_k - a_k) < \varepsilon.
\]
Next, $\lambda_1(f^{-1}N) \le L < \varepsilon$; hence $\lambda_1(f^{-1}N) = 0$. Thus, $f^{-1}N$ is Lebesgue measurable; moreover, it has Lebesgue measure zero.
d) Finally, [7.1] implies that $f^{-1}A \in S_1$ as a union of two Lebesgue measurable sets.

REMARK 7.1.– A more general statement holds true: let $A \in S_1$ and $f\colon A \to \mathbb R$ be a Lebesgue measurable function such that
\[
\forall R > 0\ \exists \varepsilon_R > 0\ \forall x, y \in A \cap [-R, R]\colon \quad |f(x) - f(y)| \ge \varepsilon_R \cdot |x - y|.
\]
Then $f$ is $(S_1 \cap A, S_1)$-measurable.

PROOF.– Denote by $f_R$ the function $f$ restricted to the set $A_R := A \cap [-R, R]$. Then $f_R$ is Lebesgue measurable and $\forall x, y \in A_R$, $|f(x) - f(y)| \ge \varepsilon_R \cdot |x - y|$. Then, similarly to problem (6), the function $f_R$ is $(S_1 \cap A_R, S_1)$-measurable. Let $C \in S_1$. Then $f^{-1}C = \bigcup_{N=1}^\infty f_N^{-1}C \in S_1$ as a countable union of Lebesgue measurable sets.

7) a) For $Tx := \arctan x$, $x \in \mathbb R$, it holds
\[
|Tx - Ty| = |T'(\theta)| \cdot |x - y| = \frac{|x - y|}{1 + \theta^2},
\]
where $\theta$ is an intermediate point between $x$ and $y$. If $x, y \in [-R, R]$, then
\[
|Tx - Ty| \ge \frac{|x - y|}{1 + R^2}.
\]
The function $T$ is $(S_1, S_1)$-measurable by remark 7.1.

b) By theorem 1.2,
\[
\int_{\mathbb R} f(Tx)\,d\lambda_1(x) = \int_{\mathbb R} f(t)\,d(\lambda_1T^{-1})(t). \tag{7.2}
\]
Now, the induced measure is concentrated on $(-\frac\pi2, \frac\pi2)$, and for $(\alpha, \beta]$ with $-\frac\pi2 < \alpha < \beta < \frac\pi2$, it holds $T^{-1}(\alpha, \beta] = (\tan\alpha, \tan\beta]$,
\[
(\lambda_1T^{-1})((\alpha, \beta]) = \tan\beta - \tan\alpha = \int_{(\alpha,\beta]} \frac{I_{(-\pi/2,\,\pi/2)}(t)}{\cos^2 t}\,d\lambda_1(t).
\]
This equality can be extended from the sets of the semiring $\mathcal P_1$ of finite intervals $(\alpha, \beta]$ to the sets of $\mathcal B(\mathbb R) = \sigma a(\mathcal P_1)$ and then finally to the sets of $S_1$. Thus, $\lambda_1T^{-1} \ll \lambda_1$,
\[
\frac{d(\lambda_1T^{-1})}{d\lambda_1}(t) = \frac{I_{(-\pi/2,\,\pi/2)}(t)}{\cos^2 t}.
\]
This relation and [7.2] imply the desired equality.

8) A solution relies on remark 7.1 and is similar to the solution of problem (7).

9) a) Let $R > 0$ and $(\alpha, \beta] \subset (0, R]$. Using generalized spherical coordinates, one can show that
\[
\lambda_n f^{-1}(\alpha, \beta] \le C_R(\beta - \alpha),
\]
with the positive $C_R$ depending on $R$ and $n$ only. Fix $\varepsilon > 0$ and let $\lambda_1(N) = 0$, $N \subset (0, R]$. Then there exists a sequence of intervals $(\alpha_n, \beta_n] \subset (0, R]$ such that
\[
N \subset \bigcup_{n=1}^\infty (\alpha_n, \beta_n], \qquad \sum_{n=1}^\infty (\beta_n - \alpha_n) < \varepsilon.
\]
Therefore, $f^{-1}N \subset \bigcup_{n=1}^\infty f^{-1}(\alpha_n, \beta_n]$ and
\[
\lambda_n^* f^{-1}N \le \sum_{n=1}^\infty \lambda_n^* f^{-1}(\alpha_n, \beta_n] = \sum_{n=1}^\infty \lambda_n f^{-1}(\alpha_n, \beta_n] \le C_R \cdot \varepsilon.
\]
Thus, $\lambda_n^* f^{-1}N = 0$ and $f^{-1}N \in S_n$. Given any $N$ with $\lambda_1(N) = 0$, it holds
\[
f^{-1}N = \bigcup_{k=1}^\infty f^{-1}(N \cap [0, k]) \in S_n
\]
as a countable union of Lebesgue measurable sets.

b) Given $A \in S_1$, it holds $A = B \cup N$, with $B \in \mathcal B(\mathbb R)$ and $\lambda_1(N) = 0$. Then
\[
f^{-1}A = f^{-1}B \cup f^{-1}N \in S_n.
\]
10) The function $x \mapsto f(\|x\|)$, $x \in \mathbb R^n$, is Lebesgue measurable by problem (9), and the integral in the statement of this problem is well defined. Let $T\colon \mathbb R^n \to \mathbb R^n$ be a unitary operator, i.e. a linear transformation that preserves the norm. Then
\[
\mu(T^{-1}A) = \int_{T^{-1}A} f(\|x\|)\,d\lambda_n(x) = \int_{T^{-1}A} f(\|Tx\|)\,d\lambda_n(x) = \int_A f(\|y\|)\,d(\lambda_nT^{-1})(y).
\]
But $\lambda_nT^{-1} = \lambda_n$, and
\[
\mu(T^{-1}A) = \int_A f(\|y\|)\,d\lambda_n(y) = \mu(A), \quad A \in S_n.
\]

11) Let $g = \frac{d\mu}{d\lambda_n}$; $g \ge 0$. Since $\mu$ is finite at each bounded set, $g$ is locally Lebesgue integrable. Then, due to the Lebesgue differentiation theorem (see the hint to this problem), for almost every $x \in \mathbb R^n$ we have
\[
g(x) = \lim_{r\to0+} \frac{1}{\lambda_n(B(x,r))} \int_{B(x,r)} g(y)\,d\lambda_n(y). \tag{7.3}
\]
Let $N$ be the set where [7.3] does not hold, $\lambda_n(N) = 0$, and let $u, z \in \mathbb R^n \setminus N$, with $\|u\| = \|z\|$. Consider a unitary operator $T$ with $Tu = z$. Then, due to the relation $\lambda_nT^{-1} = \lambda_n$, we have
\[
g(z) = \lim_{r\to0+} \frac{1}{\lambda_n(B(z,r))} \int_{B(z,r)} g(y)\,d\lambda_n(y) = \lim_{r\to0+} \frac{1}{\lambda_n(TB(u,r))} \int_{TB(u,r)} g(y)\,d\lambda_n(y),
\]
\[
g(z) = \lim_{r\to0+} \frac{1}{\lambda_n(B(u,r))} \int_{B(u,r)} g(Tx)\,d(\lambda_nT^{-1})(x) = \lim_{r\to0+} \frac{1}{\lambda_n(B(u,r))} \int_{B(u,r)} g(Tx)\,d\lambda_n(x). \tag{7.4}
\]
But it is given that $\mu T^{-1} = \mu$. Then for any $A \in S_n$, $\mu(TA) = \mu(A) = \int_A g(x)\,d\lambda_n(x)$. On the other hand,
\[
\mu(TA) = \int_{TA} g(y)\,d\lambda_n(y) = \int_A g(Tx)\,d(\lambda_nT^{-1})(x) = \int_A g(Tx)\,d\lambda_n(x);
\]
hence
\[
\mu(A) = \int_A g(Tx)\,d\lambda_n(x), \quad A \in S_n.
\]
Therefore,
\[
\frac{d\mu}{d\lambda_n}(x) = g(x) = g(Tx) \pmod{\lambda_n}. \tag{7.5}
\]
Now, [7.5] implies that
\[
g(z) = \lim_{r\to0+} \frac{1}{\lambda_n(B(u,r))} \int_{B(u,r)} g(x)\,d\lambda_n(x) = g(u).
\]
Thus, on the set $\mathbb R^n \setminus N$, the function $g$ depends only on the norm of $x$. There exists a function $\tilde g\colon \mathbb R^n \to [0, +\infty)$ which depends only on the norm of $x$ and is equal to $g$ almost everywhere. Now, we construct a Lebesgue measurable function $f\colon [0, +\infty) \to [0, +\infty)$ such that
\[
\tilde g(x) = f(\|x\|) \quad \text{a.e.} \tag{7.6}
\]
Let $S^{n-1} = \{x \in \mathbb R^n : \|x\| = 1\}$ be the unit sphere in $\mathbb R^n$ and $V_n$ be the sigma-algebra of Lebesgue measurable sets in $[0, +\infty) \times S^{n-1}$. Following the line of the solution to problem (6), one can show that the function $\varphi(r, v) = rv$, $r \ge 0$, $v \in S^{n-1}$, is $(V_n, S_n)$-measurable. Then the function
\[
h(r, v) = \tilde g(\varphi(r, v)) = \tilde g(rv), \quad r \ge 0,\ v \in S^{n-1},
\]
is Lebesgue measurable. Therefore, its cross-section $h_v(r) = \tilde g(rv)$, $r \ge 0$, is Lebesgue measurable for almost every $v \in S^{n-1}$ (here the Lebesgue measure on the sphere $S^{n-1}$ is considered). But the cross-sections $h_v(r)$ for different $v \in S^{n-1}$ coincide and hence are equal to some Lebesgue measurable function $f(r)$, $r \ge 0$. Then [7.6] holds true and the representation is valid. Finally, we replace $f$ with an equivalent Borel function.

12) a) Consider the case where $g$ is non-negative and bounded. The measure
\[
\mu(A) = \int_A g(Tx)\,d\lambda_n(x), \quad A \in S_n,
\]
is finite at each bounded set, absolutely continuous w.r.t. $\lambda_n$ and invariant under orthogonal transformations in $\mathbb R^n$. The statement follows from problem (11) and the uniqueness of the Radon–Nikodym derivative.
b) Switch to the case where $g$ is non-negative but could be unbounded. For $R > 0$, the function $g_R(x) := g(x)I(|g(x)| \le R)$ is Lebesgue measurable, bounded, and $g_R(Tx) = g_R(x) \pmod{\lambda_n}$ for each orthogonal transformation $T$. By part (a) of the proof, there exists $N_R \in S_n$ such that $\lambda_n(N_R) = 0$ and for all $x, y \in \mathbb R^n \setminus N_R$ with $\|x\| = \|y\|$, it holds $g_R(x) = g_R(y)$. Now, set $N = \bigcup_{k=1}^\infty N_k$. Then $\lambda_n(N) = 0$ and for all $x, y \in \mathbb R^n \setminus N$ with $\|x\| = \|y\|$, it holds $g(x) = g(y)$.

c) Now, $g$ is an arbitrary function from the problem statement. Using part (b) of the proof, we construct $N_+, N_- \in S_n$ such that $\lambda_n(N_+) = \lambda_n(N_-) = 0$ and for all $x', y' \in \mathbb R^n \setminus N_+$ and $x'', y'' \in \mathbb R^n \setminus N_-$ with $\|x'\| = \|y'\|$ and $\|x''\| = \|y''\|$, it holds $g_+(x') = g_+(y')$ and $g_-(x'') = g_-(y'')$, where $g_+$ and $g_-$ are the positive and negative parts of $g$, respectively. Then we set $N = N_+ \cup N_-$, $\lambda_n(N) = 0$, and for all $x, y \in \mathbb R^n \setminus N$ with $\|x\| = \|y\|$, it holds $g(x) = g(y)$. The desired function $f$ is constructed as in the solution to problem (11).

13) a) Introduce a function $S\colon \mathbb R \to [0, +\infty]$,
\[
S(x) = \sum_{n=1}^\infty |f(n^{1+\alpha}x)|, \quad x \in \mathbb R.
\]
It is Lebesgue measurable and
\[
\int_{\mathbb R} S\,d\lambda_1 = \sum_{n=1}^\infty \int_{\mathbb R} |f(n^{1+\alpha}x)|\,d\lambda_1(x) = \sum_{n=1}^\infty \frac{1}{n^{1+\alpha}} \int_{\mathbb R} |f(t)|\,d\lambda_1(t). \tag{7.7}
\]
Here, we used the change of variable $t = Tx = n^{1+\alpha}x$, $x \in \mathbb R$, and corollary 1.2. Since $\alpha > 0$ and $f \in L(\mathbb R, \lambda_1)$, the expression [7.7] is finite. Hence $S \in L(\mathbb R, \lambda_1)$, and $S(x) < \infty \pmod{\lambda_1}$. Therefore, $\lim_{n\to\infty} |f(n^{1+\alpha}x)| = 0 \pmod{\lambda_1}$.

b) Let $\alpha > 0$ and $g \in L(\mathbb R^m, \lambda_m)$. We show that $g(n^{\frac1m+\alpha}x) \to 0$ as $n \to \infty \pmod{\lambda_m}$. Now, we set
\[
S(x) = \sum_{n=1}^\infty |g(n^{\frac1m+\alpha}x)|, \quad x \in \mathbb R^m.
\]
We have
\[
\int_{\mathbb R^m} S\,d\lambda_m = \sum_{n=1}^\infty \frac{1}{n^{1+\alpha m}} \int_{\mathbb R^m} |g(t)|\,d\lambda_m(t).
\]
We used corollary 1.2 with $Tx = n^{\frac1m+\alpha}x$, $x \in \mathbb R^m$, and the corresponding matrix $L = n^{\frac1m+\alpha}I_m$, with $\det L = n^{1+\alpha m}$. (Here $I_m$ denotes the identity matrix of size $m$.) The rest of the proof follows the line of part (a) of the solution.

Another possible extension is as follows: let $\alpha_1 > 0, \ldots, \alpha_m > 0$ and $g \in L(\mathbb R^m, \lambda_m)$. Then $g(n_1^{1+\alpha_1}x_1, \ldots, n_m^{1+\alpha_m}x_m) \to 0$ as $n_1 \to \infty, \ldots, n_m \to \infty \pmod{\lambda_m}$.

14) a)
\[
\int_{\mathbb R} f\,d\lambda_1 = \int_{(-\infty,0]} f\,d\lambda_1 + \int_{[0,+\infty)} f\,d\lambda_1. \tag{7.8}
\]
Use the transformation $Tx = -x$, $x \in \mathbb R$. Corollary 1.1 implies $\lambda_1T^{-1} = \lambda_1$. Because $f(-x) \equiv f(x)$, we have
\[
\int_{(-\infty,0]} f(x)\,d\lambda_1(x) = \int_{[0,+\infty)} f(-t)\,d(\lambda_1T^{-1})(t) = \int_{[0,+\infty)} f(t)\,d\lambda_1(t), \tag{7.9}
\]
and relations [7.8] and [7.9] imply the desired equality.

b)
\[
\int_{\mathbb R} f(x)\,d\lambda_1(x) = -\int_{\mathbb R} f(-x)\,d\lambda_1(x) = -\int_{\mathbb R} f(t)\,d(\lambda_1T^{-1})(t) = -\int_{\mathbb R} f(t)\,d\lambda_1(t),
\]
so $2\int_{\mathbb R} f(x)\,d\lambda_1(x) = 0$.
15) Use the transformation $Tx = -x$, $x \in \mathbb R$, under which $\lambda_1$ is invariant. We have
\[
I := \int_{[-1,1]} \frac{f(x)}{f(x)+f(-x)}\,d\lambda_1(x) = \int_{[-1,1]} \frac{f(-t)}{f(t)+f(-t)}\,d(\lambda_1T^{-1})(t) = \int_{[-1,1]} \frac{f(-t)}{f(t)+f(-t)}\,d\lambda_1(t),
\]
\[
2I = \int_{[-1,1]} \frac{f(x)+f(-x)}{f(x)+f(-x)}\,d\lambda_1(x) = \int_{[-1,1]} d\lambda_1(x) = 2, \qquad I = 1.
\]
Answer: 1.

16) Let $M = \{\varepsilon = (\varepsilon_n)_1^\infty : \varepsilon_n = 0 \text{ or } 1,\ n = 1, 2, \ldots\}$. For each $\varepsilon, \delta \in M$, $\varepsilon \ne \delta$, it holds $\|\varepsilon - \delta\|_\infty = 1$. Now, consider the disjoint balls $B(\varepsilon, \frac12)$, $\varepsilon \in M$. Suppose that a measure $\lambda$ satisfies (i) and (ii). Then for each $\varepsilon \in M$, $\lambda(B(\varepsilon, \frac12)) = \lambda_\varepsilon$, $0 < \lambda_\varepsilon < \infty$. The set $M$ is uncountable; hence one can select a sequence of different points $\varepsilon(m) \in M$, $m = 1, 2, \ldots$, with $\sum_{m=1}^\infty \lambda_{\varepsilon(m)} = \infty$. We have $\bigcup_{m=1}^\infty B(\varepsilon(m), \frac12) \subset B(0, 2)$; then
\[
\lambda(B(0,2)) \ge \sum_{m=1}^\infty \lambda_{\varepsilon(m)} = \infty, \qquad \lambda(B(0,2)) = \infty.
\]
This contradicts condition (ii).

17) a) First, we construct a sequence of unit vectors $x_n$, $n = 1, 2, \ldots$, such that $\|x_m - x_n\| > \frac12$ for all $n \ne m$. This can be done by induction; we describe the inductive step. Given unit vectors $x_1, \ldots, x_n$ with $\|x_i - x_j\| > \frac12$ for all $i \ne j$, apply the theorem about an almost orthogonal vector (see [BER 12]) to the subspace $L_n = \mathrm{span}(x_1, \ldots, x_n)$. Then there exists a unit vector $x_{n+1}$ such that $\|x_{n+1} - l\| > \frac12$ for all $l \in L_n$. In particular, $\|x_{n+1} - x_i\| > \frac12$, $i = 1, \ldots, n$.

b) The balls $B(x_n, \frac14)$, $n = 1, 2, \ldots$, are disjoint; moreover, $\bigcup_{n=1}^\infty B(x_n, \frac14) \subset B(0, 2)$. The rest of the proof follows the line of theorem 1.5(a).
18) Let $e_n$ be the standard unit vectors in $l_p$, $e_n = (0, \ldots, 0, 1, 0, \ldots)$ (with 1 in the $n$-th position). Then $\|e_n - e_m\|_p = \sqrt[p]{2}$, $n \ne m$, and the balls $B(e_n, \frac{\sqrt[p]{2}}{2})$ are disjoint. For $n \ne m$, the operator $V$ in $l_p$ is an isometry:
\[
Vx = x_ne_m - x_me_n + \sum_{i\ne n,\,i\ne m} x_ie_i, \quad x = (x_i)_1^\infty.
\]
Assume that $\lambda$ is a desired measure. Then $V(B(e_n, \frac{\sqrt[p]{2}}{2})) = B(e_m, \frac{\sqrt[p]{2}}{2})$, and the values of $\lambda$ at these two balls coincide. The rest of the proof follows the line of theorem 1.5(a).

19) For $x \in X$, it holds
\[
\|Tx\| = \max_{0\le t\le1} |x(\varphi(t))| = \max_{0\le s\le1} |x(s)| = \|x\|.
\]
Moreover, the inverse mapping $\varphi^{(-1)}\colon [0,1] \to [0,1]$ is continuous; therefore, $T$ is a bijection, and it is an isometry on $X$ (see the definition in problem (18)).

Let $\varphi^{(n)}$ be the $n$-fold composition of $\varphi$ with itself. For $n, k \ge 1$, we have
\[
\|\varphi^{(n)} - \varphi^{(n+k)}\| = \max_{0\le t\le1} |\varphi^{(n)}(t) - \varphi^{(k)}(\varphi^{(n)}(t))| = \max_{0\le s\le1}(s - \varphi^{(k)}(s)) \ge \delta := \max_{0\le s\le1}(s - \varphi(s)) > 0.
\]
The balls $B(\varphi^{(n)}, \frac\delta2)$, $n = 1, 2, \ldots$, are disjoint, and for each $n$, $T(B(\varphi^{(n)}, \frac\delta2)) = B(\varphi^{(n+1)}, \frac\delta2)$. The rest of the proof follows the line of theorem 1.5(a).

20) Let $X$ be a random vector in $\mathbb R^n$ with $\mu_X = \mu$ (see lemma 1.2). The random vector $Y := -X$ has the characteristic function
\[
\varphi_Y(t) = \mathrm E\,e^{i(Y,t)} = \mathrm E\,e^{-i(X,t)} = \overline{\varphi_X(t)},
\]
where $\overline{\varphi_X(t)}$ is the complex conjugate of $\varphi_X(t)$. Now, $\mu_X$ is symmetric around the origin $\Leftrightarrow$ $X$ and $Y$ have equal distributions $\Leftrightarrow$ $\varphi_X \equiv \varphi_Y$ $\Leftrightarrow$ $\varphi_X(t) \equiv \overline{\varphi_X(t)}$ $\Leftrightarrow$ $\varphi_X(t) \in \mathbb R$ for all $t \in \mathbb R^n$.
21) Let $X$ be a random vector in $\mathbb R^n$ with $\mu_X = \mu$, and let $U$ be an orthogonal $n \times n$ matrix. The random vector $UX$ has characteristic function
\[
\varphi_{UX}(t) = \mathrm E\,e^{i(UX,t)} = \mathrm E\,e^{i(X,U^\top t)}, \quad t \in \mathbb R^n.
\]
Note that $U^\top = U^{-1}$ is an orthogonal matrix as well. Now, $\mu_X$ is invariant under all unitary operators $\Leftrightarrow$ $X$ and $UX$ have equal distributions for all orthogonal matrices $U$ $\Leftrightarrow$ $\varphi_X(Ut) = \varphi_X(t)$ for all $t \in \mathbb R^n$ and all orthogonal matrices $U$. The latter condition is equivalent to the following: $\varphi_X(t)$ depends only on $\|t\|$.

22) Let $X$ and $Y$ be random vectors with $\mu_X = \mu$ and $\mu_Y = \nu$. Introduce a half-space
\[
V_{\tau d} = \{x \in \mathbb R^n : (x, \tau) \le d\}, \quad \|\tau\| = 1, \quad d \in \mathbb R.
\]
a) Prove that $\mu$ and $\nu$ coincide at each half-space. Select a vector $x_0$ such that $(x_0, \tau) = d$ and put $x_n = x_0 - n\tau$, $n \ge 1$. The balls $\bar B(x_n, n)$ are increasing in $n$ and their union is $V_{\tau d}$. By the continuity of measure from below, the equalities $\mu(\bar B(x_n, n)) = \nu(\bar B(x_n, n))$, $n \ge 1$, imply $\mu(V_{\tau d}) = \nu(V_{\tau d})$.

b) Prove that $\varphi_X = \varphi_Y$. Take $a \in \mathbb R^n$, $a \ne 0$, and $\tau = \frac{a}{\|a\|}$. We have
\[
\varphi_X(a) = \mathrm E\,e^{i\|a\|(X,\tau)} = \varphi_{(X,\tau)}(\|a\|), \qquad \varphi_Y(a) = \varphi_{(Y,\tau)}(\|a\|).
\]
Let $F$ and $G$ be the cumulative distribution functions (cdfs) of $(X, \tau)$ and $(Y, \tau)$, respectively. Then
\[
F(d) = P\{(X,\tau) \le d\} = \mu(V_{\tau d}) = \nu(V_{\tau d}) = G(d), \quad d \in \mathbb R.
\]
Thus, the random variables $(X, \tau)$ and $(Y, \tau)$ have equal distributions and equal characteristic functions. Therefore, $\varphi_X(a) = \varphi_Y(a)$ for all $a \ne 0$. But $\varphi_X(0) = \varphi_Y(0) = 1$, and $\varphi_X = \varphi_Y$. Then $\mu_X = \mu_Y$.
23) a) Prove that $C$ is positive semidefinite. Let $\xi = (\xi_i)_1^n \sim N(0, A)$ and $\eta = (\eta_i)_1^n \sim N(0, B)$ be independent random vectors. Introduce the random vector $z = (\xi_i\eta_i)_1^n$. Then $\mathrm E\,z = 0$, because $\mathrm E\,\xi_i\eta_i = (\mathrm E\,\xi_i)(\mathrm E\,\eta_i) = 0$, $i = 1, \ldots, n$;
\[
\mathrm{Cov}(\xi_i\eta_i, \xi_j\eta_j) = \mathrm E\,(\xi_i\xi_j)(\eta_i\eta_j) = \mathrm E\,(\xi_i\xi_j)\,\mathrm E\,(\eta_i\eta_j) = a_{ij}b_{ij}, \quad i, j = 1, \ldots, n.
\]
Therefore, $\mathrm{Cov}(z) = C$, and $C$ is positive semidefinite.

b) Now, show that $C$ is positive definite. Assume the contrary. Then, in view of part (a) of the proof,
\[
\exists d = (d_i)_1^n \ne 0\colon \quad \mathrm E\,(z, d)^2 = d^\top Cd = 0.
\]
Next, we use the independence of $\xi$ and $\eta$. By the properties of conditional expectation (see [SHI 16]),
\[
0 = \mathrm E\Big(\sum_{i=1}^n d_i\xi_i\eta_i\Big)^2 = \mathrm E\Big[\mathrm E\Big[\Big(\sum_{i=1}^n d_i\xi_i\eta_i\Big)^2\,\Big|\,\eta\Big]\Big] = \mathrm E\sum_{i,j=1}^n a_{ij}(d_i\eta_i)(d_j\eta_j).
\]
The latter double sum is non-negative because $A$ is positive semidefinite. Therefore,
\[
\sum_{i,j=1}^n a_{ij}(d_i\eta_i)(d_j\eta_j) = 0 \quad \text{a.s.}
\]
Hence $(d_i\eta_i)_1^n = 0$ a.s. But $d \ne 0$, and there exists $d_j \ne 0$. Then $\eta_j = 0$ a.s., $\mathrm E\,\eta_j^2 = b_{jj} = 0$. This contradicts the condition that $B$ is positive definite. Thus, our assumption is wrong, and $C$ is indeed positive definite.

24) a) First, we check that $h$ is a pdf. It is a non-negative Borel function, because
\[
|1 - 2F(x)| \cdot |1 - 2G(x)| \le 1.
\]
Next,
+∞ −∞
f (x)F (x)dx =
+∞ −∞
191
+∞ F (x)dF (x) = 12 F 2 (x)−∞ = 12 , and
R2
h(x, y)dxdy =
=1+α
+∞ −∞
(f (x) − 2f (x)F (x))dx ·
+∞ −∞
(g(y) − 2g(y)G(y))dy = 1.
Thus, h is a two-dimensional pdf, and there exists a random vector (X; Y ) with the pdf equal h(x, y). The marginal density equals fX (x) = f (x)
+∞ −∞
×
g(y)dy + αf (x)(1 − 2F (x))×
+∞ −∞
g(y)(1 − 2G(y))dy = f (x),
and in a similar way another marginal density fY (y) = g(y). b) Since f and g are even functions, E X = E Y = 0. Then Cov(X, Y ) =
R2
xyf (x)g(y)dxdy+
+α
R
xf (x)(1 − 2F (x))dx ·
R
yg(y)(1 − G(y))dy,
Cov(X, Y ) = 4α E[XF (X)] · E[Y F (Y )]. It remains to show that E XF (X) > 0 (the inequality E Y F (Y ) > 0 can be proven similarly). We have E XF (X) = E XF (X)I(X > 0) + E XF (X)I(X < 0). Random variables X and (−X) are identically distributed, therefore, E XF (X)I(X < 0) = E(−X)F (−X)I(−X < 0) = − E X(1 − F (X))I(X > 0), E XF (X) = E X(2F (X) − 1)I(X > 0).
There exists $t_0$ with $\frac12 < F(t_0) < 1$. Then
\[
\mathrm E\,XF(X) \ge \mathrm E\,X(2F(X) - 1)I(X > t_0) \ge t_0(2F(t_0) - 1)\cdot P\{X \ge t_0\}.
\]

25) In problem (24), one can put
\[
f(x) = g(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}, \quad x \in \mathbb R.
\]
Then $X \sim N(0,1)$ and $Y \sim N(0,1)$. But for $\alpha \in (-1, 1)$, $\alpha \ne 0$ (e.g. $\alpha = \frac12$), the function $h(x,y)$, which is the pdf of the random vector $(X; Y)$, is not the pdf of a Gaussian random vector. Thus, $X$ and $Y$ are not jointly Gaussian.
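The construction of problems (24)–(25) can be checked numerically. The sketch below takes the standard normal choice $f = g$ from problem (25) and verifies that $h$ integrates to 1, has the prescribed marginal, and has $\mathrm{Cov}(X,Y) = 4\alpha\,\mathrm E[XF(X)]\,\mathrm E[YG(Y)]$; the closed value $\mathrm E[X\Phi(X)] = 1/(2\sqrt\pi)$ is a known fact for $N(0,1)$, used here only as a cross-check.

```python
# Numerical check of the joint density from problem (24):
#   h(x, y) = f(x) g(y) [1 + alpha (1 - 2F(x)) (1 - 2G(y))]
# with f = g the standard normal pdf, as in problem (25).
import numpy as np
from scipy.stats import norm
from scipy.integrate import dblquad, quad

alpha = 0.5

def h(x, y):
    return norm.pdf(x) * norm.pdf(y) * (
        1 + alpha * (1 - 2 * norm.cdf(x)) * (1 - 2 * norm.cdf(y)))

total, _ = dblquad(h, -8, 8, -8, 8)                    # should be 1
marginal_at_1, _ = quad(lambda y: h(1.0, y), -8, 8)    # should equal f(1)
cov, _ = dblquad(lambda x, y: x * y * h(x, y), -8, 8, -8, 8)
# Cov(X,Y) = 4*alpha*(E[X Phi(X)])^2, and E[X Phi(X)] = 1/(2*sqrt(pi)) for N(0,1)
cov_formula = 4 * alpha * (1 / (2 * np.sqrt(np.pi))) ** 2
```

With $\alpha = \frac12$ this gives $\mathrm{Cov}(X,Y) = \alpha/\pi$, a nonzero correlation even though each marginal is exactly $N(0,1)$.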
26) Let $\gamma$ be a standard Gaussian vector in $\mathbb R^n$. By lemma 1.8, $X \stackrel{d}{=} S^{1/2}\gamma$, and
\[
I_\alpha = \mathrm E\exp\{\alpha(AS^{1/2}\gamma, S^{1/2}\gamma)\} = \mathrm E\exp\{\alpha(B\gamma, \gamma)\}.
\]
Here, $B$ is the symmetric matrix $S^{1/2}AS^{1/2}$, with real eigenvalues $\lambda_1, \ldots, \lambda_n$ and corresponding orthonormal eigenvectors $e_1, \ldots, e_n$. Then $(B\gamma, \gamma) = \sum_{k=1}^n \lambda_k(\gamma, e_k)^2$, and because $(\gamma, e_k)$, $k = 1, \ldots, n$, are i.i.d. $N(0,1)$ random variables,
\[
I_\alpha = \prod_{k=1}^n \mathrm E\exp\{\alpha\lambda_k(\gamma, e_k)^2\}.
\]
For $\beta \in \mathbb R$ and $\xi \sim N(0,1)$, evaluate
\[
J_\beta := \mathrm E\,e^{\beta\xi^2} = \frac{1}{\sqrt{2\pi}}\int_{\mathbb R} e^{-\frac{(1-2\beta)x^2}{2}}\,dx.
\]
For $\beta \ge \frac12$, $J_\beta = +\infty$. And for $\beta < \frac12$, we put $\sigma = \frac{1}{\sqrt{1-2\beta}}$ and obtain
\[
J_\beta = \sigma\int_{\mathbb R} \frac{1}{\sqrt{2\pi}\,\sigma}e^{-\frac{x^2}{2\sigma^2}}\,dx = \sigma,
\]
because the integrand is the pdf of the normal law $N(0, \sigma^2)$.
Finally, $I_\alpha = \prod_{k=1}^n J_{\alpha\lambda_k}$, and $I_\alpha$ is finite if, and only if, $\alpha\lambda_k < \frac12$ for all $k = 1, \ldots, n$. In this case
\[
I_\alpha = \prod_{k=1}^n \frac{1}{\sqrt{1 - 2\alpha\lambda_k}}.
\]
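The closed form obtained in problem (26) can be verified by simulation. The matrices $A$, $S$ and the value of $\alpha$ in this sketch are illustrative choices, not taken from the book; the check compares a Monte Carlo estimate of $\mathrm E\exp\{\alpha(AX,X)\}$, $X \sim N(0,S)$, with $\prod_k (1 - 2\alpha\lambda_k)^{-1/2}$ for the eigenvalues $\lambda_k$ of $B = S^{1/2}AS^{1/2}$.

```python
# Monte Carlo check of  I_alpha = prod_k (1 - 2 alpha lambda_k)^(-1/2),
# lambda_k the eigenvalues of B = S^(1/2) A S^(1/2).
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = np.diag([1.0, -0.5, 0.25])       # illustrative symmetric matrix
S = np.diag([0.5, 0.4, 0.3])         # illustrative positive definite covariance
alpha = 0.3

S_half = np.sqrt(S)                  # valid since S is diagonal here
lam = np.linalg.eigvalsh(S_half @ A @ S_half)
closed_form = np.prod(1.0 / np.sqrt(1.0 - 2.0 * alpha * lam))

X = rng.multivariate_normal(np.zeros(n), S, size=400_000)
mc = np.mean(np.exp(alpha * np.einsum('ij,jk,ik->i', X, A, X)))
```

The finiteness condition $\alpha\lambda_k < \frac12$ holds for this choice, so the two numbers should agree up to Monte Carlo error.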
27) The pdf of $X$ is $(2\pi)^{-n/2}e^{-\|x\|^2/2}$, $x \in \mathbb R^n$. We have
\[
J_\alpha = (2\pi)^{-n/2}\int_{\mathbb R^n} \|x\|^{-\alpha}e^{-\|x\|^2/2}\,dx.
\]
It is an improper Riemann integral. Using $n$-dimensional spherical coordinates, we get
\[
J_\alpha = \mathrm{const}\cdot\int_0^\infty \frac{1}{r^{\alpha+1-n}}\,e^{-r^2/2}\,dr.
\]
The latter integral converges for $\alpha < n$. Answer: $\alpha < n$.

7.2. Solutions for Chapter 2

1) a) Prove that $\varphi$ is non-decreasing. Assume the contrary: suppose that there exist points $0 < t_1 < t_2$ with $\varphi(t_1) > \varphi(t_2) \ge 0$. Consider $t_3 > t_2$ and points $M_i(t_i; \varphi(t_i))$, $i = 1, 2, 3$, on the graph of $\varphi$. We have $\varphi(t_3) \ge 0$, and for large enough $t_3$, the point $M_2$ lies below the segment $M_1M_3$. This contradicts the concavity of $\varphi$. Thus, $\varphi$ is non-decreasing.

b) Prove that $\varphi(t) > 0$, $t > 0$. Assume the contrary: suppose that there exists $t_1 > 0$ with $\varphi(t_1) = 0$. Remember that $\varphi(0) = 0$. Since $\varphi$ is non-decreasing, $\varphi(t) = 0$ on $[0, t_1]$. Because $\varphi$ is not identically zero, there exists $t_2 > t_1$ with $\varphi(t_2) > 0$. The points $M_0(0; 0)$, $M_1(t_1; \varphi(t_1))$ and $M_2(t_2; \varphi(t_2))$ lie on the graph of $\varphi$, but $M_1$ lies below the segment $M_0M_2$. This contradicts the concavity of $\varphi$. Hence $\varphi(t) > 0$, $t > 0$.

c) For $a, b \ge 0$, it holds $\varphi(a + b) \le \varphi(a) + \varphi(b)$.
Indeed, consider the points on the graph of $\varphi$: $M_0(0; 0)$, $M_a(a; \varphi(a))$, $M_b(b; \varphi(b))$, $M_{a+b}(a+b; \varphi(a+b))$. The segment $M_aM_b$ lies in the upper half-plane w.r.t. the line $M_0M_{a+b}$. Therefore, the middle of this segment, $N\big(\frac{a+b}{2}; \frac{\varphi(a)+\varphi(b)}{2}\big)$, lies in the upper half-plane as well. Hence
\[
y_N = \frac{\varphi(a)+\varphi(b)}{2} \ge \frac{\varphi(a+b)}{2},
\]
and the desired inequality follows.

d) Based on parts (a)–(c) of the solution, it is easy to verify the axioms of a metric for the function $\varphi(\rho)$. In particular, for $x, y, z \in X$,
\[
\varphi(\rho(x,y)) \le \varphi(\rho(x,z) + \rho(z,y)) \le \varphi(\rho(x,z)) + \varphi(\rho(z,y)),
\]
and the triangle inequality for $\varphi(\rho)$ follows.

2) Let $\rho(t,s) = |t-s|$, $t, s \in \mathbb R$, and $\varphi(x) = \frac{x}{1+x} = 1 - \frac{1}{1+x}$, $x \ge 0$. We have
\[
\varphi'(x) = \frac{1}{(1+x)^2} > 0, \qquad \varphi''(x) = -\frac{2}{(1+x)^3} < 0, \quad x \ge 0,
\]
and $\varphi$ is strictly concave. This function satisfies all the requirements of problem (1); hence the function $d = \varphi(\rho)$ is a metric on $\mathbb R$.

3) a) Prove that $(\mathbb R, d)$ is complete, with $d$ given in [2.2]. This follows from the completeness of $\mathbb R$ with the usual metric and from the relation
\[
|t - s| = \frac{d(t,s)}{1 - d(t,s)}, \quad t, s \in \mathbb R.
\]

b) Let $x(m) = (x_n(m))_{n=1}^\infty$, $m \ge 1$, be a Cauchy sequence in $(\mathbb R^\infty, \rho)$. Relation [2.3] implies that for each $n \ge 1$, $\{x_n(m), m \ge 1\}$ is a Cauchy sequence in $(\mathbb R, d)$. By part (a) of the solution, there exists $x_n \in \mathbb R$ with
\[
\lim_{m\to\infty} x_n(m) = x_n, \quad n \ge 1.
\]
Lemma 2.2 implies that $x(m) \to x = (x_n)_{n=1}^\infty$ in $\mathbb R^\infty(\rho)$ as $m \to \infty$. Thus, $(\mathbb R^\infty, \rho)$ is complete.

4) a) It is straightforward that $J$ is a linear surjection. Moreover,
\[
\|Jx\|_2 = \Big(\sum_{n=1}^\infty a_nx_n^2\Big)^{1/2} = \|x\|_a,
\]
and $J$ is an isometry between $l_{2,a}$ and $l_2$. Thus, $l_{2,a}$ and $l_2$ are isometric.

b) Since $l_2$ is a separable Hilbert space, $l_{2,a}$ is a separable Hilbert space as well.

5) a) The proof is similar to the proof of lemma 2.5.

b) We have
\[
l_\infty = \bigcup_{N=1}^\infty \Big\{x : \sup_{n\ge1}|x_n| \le N\Big\} = \bigcup_{N=1}^\infty \bigcap_{k=1}^\infty \{x : |x_1| \le N, \ldots, |x_k| \le N\}.
\]
Then $l_\infty \in \sigma a(\mathrm{Cyl}) = \mathcal B(\mathbb R^\infty)$.

c) $\lim_{n\to\infty} x_n = 0 \iff \forall k \ge 1\ \exists N \ge 1\ \forall n \ge N,\ |x_n| < \frac1k$. Therefore,
\[
c_0 = \bigcap_{k=1}^\infty \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty \Big\{x \in \mathbb R^\infty : |x_n| < \frac1k\Big\}.
\]

…$\|u - v\|_S \ge \|x - y\|_S - 2\varepsilon$, and for $\varepsilon = \frac13\|x - y\|_S > 0$, it holds $\|u - v\|_S > 0$; hence $u \ne v$, and $U \cap V = \emptyset$. This proves the statement.

10) If $f$ is continuous in $\tau_S$, then it is continuous in the usual sense (see remark 4.2). Now, let $f \in H^*$. Then $f(x) = (x, h)$, $x \in H$, for some $h \in H$. We may and do assume that $\|h\| = 1$. We prove that $f$ is continuous at a point $x_0 \in H$ in the $S$-topology. For $\varepsilon > 0$, we set $S = \varepsilon^{-2}P_{[h]} \in L_S(H)$. Consider
\[
E_S = \{x \in H : (\varepsilon^{-2}P_{[h]}x, x) < 1\} = \{x \in H : |(x, h)| < \varepsilon\}.
\]
If $x \in E_S + x_0$, then $|(x - x_0, h)| < \varepsilon \Rightarrow |f(x) - f(x_0)| < \varepsilon$. Thus, $f$ is continuous in $\tau_S$.

11) a) Necessity. Assume that $K_A(x) := (Ax, x)$ is continuous at zero in $\tau_S$. Then for $\varepsilon = 1$ there exists $S \in L_S(H)$ such that
\[
(Sx, x) < 1 \quad\Rightarrow\quad |(Ax, x)| < \varepsilon = 1.
\]
By continuity of both functions $K_S(x)$ and $K_A(x)$, we get the next implication:
\[
(Sx, x) \le 1 \quad\Rightarrow\quad |(Ax, x)| \le 1.
\]
At each point where $(Sx, x) = 0$, it holds $(Ax, x) = 0$. Now, let $(Sx, x) = c^2$, $c > 0$. Then $(S\frac xc, \frac xc) = 1 \Rightarrow |(A\frac xc, \frac xc)| \le 1$, and $|(Ax, x)| \le c^2 = (Sx, x)$. Thus, in all cases $|(Ax, x)| \le (Sx, x)$. Let $\{e_n\}$ be an orthobasis in $H$. We have
\[
\sum_{n=1}^\infty |(Ae_n, e_n)| \le \sum_{n=1}^\infty (Se_n, e_n) < \infty.
\]
The cited theorem from [GOK 88] implies that $A \in S_1(H)$.

b) Sufficiency. We assume that $A \in S_1(H)$. Consider its polar decomposition $A = UT$, where $T$ is an $S$-operator. First, we prove that $K_A$ is continuous at zero in $\tau_S$. It holds
\[
|(Ax, x)| = |(Tx, U^*x)| = |(T^{1/2}x, T^{1/2}U^*x)| \le \|T^{1/2}x\| \cdot \|T^{1/2}U^*x\| \le \frac12\big(\|T^{1/2}x\|^2 + \|T^{1/2}U^*x\|^2\big) = (Sx, x),
\]
where
\[
S = \frac12(T + UTU^*).
\]
It is an $S$-operator, because $T, UTU^* \in L_S(H)$. Let $\varepsilon > 0$ and $S_\varepsilon = \varepsilon^{-1}S$. If $(S_\varepsilon x, x) < 1$, then $(Sx, x) < \varepsilon$ and $|K_A(x)| < \varepsilon$. Thus, $K_A$ is continuous at zero in the $S$-topology.

Next, we want to prove that $K_A$ is continuous at a point $x_0 \in H$ in $\tau_S$. Change the variable $t = x - x_0$:
\[
f(t) := K_A(t + x_0) = K_A(t) + K_A(x_0) + (t, A^*x_0) + (t, Ax_0).
\]
Here, $K_A$ is continuous at zero in $\tau_S$, and the linear functionals $(t, A^*x_0)$ and $(t, Ax_0)$ belong to $H^*$; hence they are continuous in $\tau_S$ (see problem (10)). Thus, the functional $f$ is continuous at zero in the linear topology $\tau_S$, and $K_A$ is continuous at $x_0$ in $\tau_S$. Hence $K_A$ is continuous everywhere in $\tau_S$.
12) Assume that $\theta = \hat\mu$ for some Borel probability measure $\mu$ in $l_2$. Let $\nu_e(B) = \mu(B \cap l_2)$, $B \in \mathcal B(\mathbb R^\infty)$. Since $\nu_e(l_2) = 1$, it is enough to prove that $\nu_e = \mu_e$. Consider the projector $P_n\colon l_2 \to L_n$, $P_nx = \sum_{k=1}^n x_ke_k$, and the induced measure $\nu_n = \mu P_n^{-1}$, $n \ge 1$. For $z \in L_n$, it holds (see [4.6]):
\[
\theta(z) = \int_{l_2} e^{i(z,x)}\,d\mu(x) = \int_{L_n} e^{i(z,x)}\,d\nu_n(x) = \int_{L_n} e^{i(z,x)}\,d\mu_n(x).
\]
Hence $\nu_n = \mu_n$, $n \ge 1$. Consider a cylinder $\hat A_n$ in $\mathbb R^\infty$, $A_n \in \mathcal B(\mathbb R^n)$. We have (here we identify $L_n$ and $\mathbb R^n$):
\[
\nu_e(\hat A_n) = \mu(\hat A_n \cap l_2) = \mu(P_n^{-1}A_n) = \nu_n(A_n) = \mu_n(A_n) = \mu_e(\hat A_n).
\]
Now, by theorem 2.2 the measures $\nu_e$ and $\mu_e$ coincide.

13) It is straightforward that $\theta(0) = 1$ and $\theta$ is positive definite. Check that $\mathrm{Re}\,\theta$ is continuous at zero in the $S$-topology. Fix $\varepsilon > 0$. Since $\mathrm{Re}\,\varphi$ is continuous at zero, there exists $\delta > 0$ such that for each $x \in B(0, \delta)$,
\[
|1 - \mathrm{Re}\,\varphi(x)| < \varepsilon.
\]
The inequality $(Ax, Ax) < \delta^2$ is equivalent to $(\frac{1}{\delta^2}A^*Ax, x) < 1$. The operator $A^*A$ is self-adjoint and positive; moreover, it is nuclear as a product of two Hilbert–Schmidt operators; hence $A^*A \in L_S(H)$ and $S := \frac{1}{\delta^2}A^*A$ is an $S$-operator. Thus, if $(Sx, x) < 1$, then $|1 - \mathrm{Re}\,\varphi(Ax)| < \varepsilon$, and $\mathrm{Re}\,\theta$ is continuous at zero in the $S$-topology. Now, by theorem 4.5, $\theta = \hat\mu$, where $\mu$ is a Borel probability measure in $H$.

7.5. Solutions for Chapter 5

1) We argue by contradiction. Suppose that there exists a random element $\xi$ in $H$ with
\[
\varphi_\xi(x) = \exp\Big\{i(h, x) - \frac{\|x\|^2}{2}\Big\}, \quad x \in H.
\]
Then $\eta = \xi - h$ has the characteristic functional
\[
\varphi_\eta(x) = \exp\Big\{-\frac12(x, x)\Big\} = \exp\Big\{-\Big(\frac12Ix, x\Big)\Big\}, \quad x \in H.
\]
According to example 4.1, $\frac12I$ must then be an $S$-operator; hence the identity operator $I$ is an $S$-operator as well in the infinite-dimensional Hilbert space $H$, which is not true.

2) Since $\xi_1$ and $\xi_2$ are independent,
\[
\varphi_{\xi_1+\xi_2}(x) = \varphi_{\xi_1}(x)\varphi_{\xi_2}(x) = \exp\Big\{i(m_1 + m_2, x) - \frac12((S_1 + S_2)x, x)\Big\}, \quad x \in H.
\]
The operator $S_1 + S_2 \in L_S(H)$ as a sum of $S$-operators. By theorem 5.1, $\xi_1 + \xi_2 \sim N(m_1 + m_2, S_1 + S_2)$.

3) Let $\{e_n\}$ be the eigenbasis of $S$ and $\{\lambda_n > 0, n \ge 1\}$ be the corresponding eigenvalues. By theorem 5.2, with probability 1,
\[
\xi = m + \sum_{k=1}^\infty \sqrt{\lambda_k}\,\gamma_ke_k,
\]
where $\{\gamma_k\}$ are i.i.d. $N(0,1)$ random variables and the series converges strongly in $H$ with probability 1. Then for each $N \ge 1$,
\[
\|\xi\|^2 \ge \sum_{k=1}^N (m_k + \sqrt{\lambda_k}\gamma_k)^2, \quad m_k = (m, e_k), \quad k \ge 1;
\]
\[
\mathrm E\,\|\xi\|^{-\alpha} \le \mathrm E\Big(\sum_{k=1}^N (m_k + \sqrt{\lambda_k}\gamma_k)^2\Big)^{-\alpha/2} = \int_{\mathbb R^N} \Big(\sum_{k=1}^N (m_k + \sqrt{\lambda_k}x_k)^2\Big)^{-\alpha/2}g_N(x)\,dx =: J_N,
\]
where $g_N$ is the pdf of the standard Gaussian measure in $\mathbb R^N$. Next,
\[
J_N = \prod_{k=1}^N \lambda_k^{-1/2}\int_{\mathbb R^N} \|t\|^{-\alpha}\,g_N\Big(\frac{t_1 - m_1}{\sqrt{\lambda_1}}, \ldots, \frac{t_N - m_N}{\sqrt{\lambda_N}}\Big)\,dt.
\]
The latter integral converges for $\alpha < N$ (this can be shown using generalized spherical coordinates). Thus, for any $N > \alpha$, $\mathrm E\,\|\xi\|^{-\alpha} \le J_N < \infty$.
4) Problem (2) of Chapter 5 implies that $\eta_k \sim N(0, T_k)$, $T_k = \big(\sum_{j=1}^n u_{kj}^2\big)S = S$, for all $k = 1, \ldots, n$. Thus, the $\eta_k$'s are copies of $\xi$.

Take $h_k \in H$, $k = 1, \ldots, n$. Since $U$ is an orthogonal matrix, for $k \ne p$ it holds
\[
\mathrm E\,(\eta_k, h_k)(\eta_p, h_p) = \sum_{j=1}^n u_{kj}u_{pj}\,\mathrm E\,(\xi_j, h_k)(\xi_j, h_p) = (Sh_k, h_p)\sum_{j=1}^n u_{kj}u_{pj} = 0.
\]
Hence the jointly Gaussian random variables $(\eta_k, h_k)$, $k = 1, \ldots, n$, are uncorrelated and independent. For the compound random element $\eta := (\eta_k)_1^n$ in $H^n$ we have
\[
\varphi_\eta(h_1, \ldots, h_n) = \mathrm E\exp\Big\{i\sum_{j=1}^n (h_j, \eta_j)\Big\} = \prod_{j=1}^n \mathrm E\,e^{i(h_j,\eta_j)} = \prod_{j=1}^n \varphi_\xi(h_j).
\]
Due to an evident generalization of lemma 4.3, the random elements $\eta_1, \ldots, \eta_n$ are independent.

5) $X = X_1 \times X_2$ is a separable metric space, with metric
\[
\rho(x, y) = \rho_1(x_1, y_1) + \rho_2(x_2, y_2), \quad x = (x_1, x_2) \in X, \quad y = (y_1, y_2) \in X.
\]
Denote $\mu = \mu_1 \times \mu_2$; it is a Borel probability measure in $X$.

a) Let $a_i \in \mathrm{supp}\,\mu_i$, $i = 1, 2$, and $a = (a_1, a_2)$. For each $\varepsilon > 0$,
\[
B(a, \varepsilon) \supset B\Big(a_1, \frac\varepsilon2\Big) \times B\Big(a_2, \frac\varepsilon2\Big);
\]
hence
\[
\mu(B(a, \varepsilon)) \ge \prod_{i=1}^2 \mu_i\Big(B\Big(a_i, \frac\varepsilon2\Big)\Big) > 0.
\]
Thus, $a \in \mathrm{supp}\,\mu$.
Gaussian Measures in Hilbert Space
b) Let b1 ∈ / suppμ1 and b2 ∈ X2 , b = (b1 , b2 ). There exists ε > 0, with μ1 (B(b1 , ε)) = 0. Since B(b, ε) ⊂ B(b1 , ε) × B(b2 , ε), it holds μ(B(b, ε)) ≤
2
μi (B(bi , ε)) = 0.
i=1
Then b ∈ / suppμ. In a similar way if b1 ∈ X1 and b2 ∈ / suppμ2 , then b = (b1 , b2 ) ∈ / suppμ. The desired equality follows from parts (a) and (b) of the solution. 6) Let ξ ∼ N (0, S) in H. Its distribution μξ = μ. There exist independent N (0, 1) r.v.’s {γn } such that ξ=
∞ λn γn en , 1
where the convergence is strong in H a.s. Then a.s. Vξ =
∞
λn γ n V e n =
1
∞ λπ(n) γn eπ(n) . 1
Remember that π : N → N is a bijection. After the change of summation index i = π(n), i ∈ N we get Vξ =
∞
λi γi ei ,
γi = γπ−1 (i) ,
i ≥ 1,
1
where {γi } is again a sequence of independent 1) random variables and the ∞ √ N(0, ∞ 2 2 convergence is strong in H a.s. (because λ γ ) = ( i i 1 1 λi (γi ) a.s. due to the ∞ convergence of series 1 λi ). Now, by theorem 5.2 (b), V ξ ∼ N (0, S). Thus, ξ and V ξ are identically distributed; hence μ = μξ = μV ξ = μV −1 . Remark. Not every permutation π yields a bounded operator V as defined in problem (6). A necessary and sufficient condition for that is as follows: λπ(i) = O(λi ).
7) Let $\xi \sim N(0, S)$ in $H$. By theorem 5.2(a),
\[
\xi = \sum_{n=1}^\infty \sqrt{\alpha_n}\,\xi_n,
\]
where the $\xi_n \sim N(0, P_n)$ are independent, $P_n$ is the orthoprojector on $H_n$ (remember that $\dim H_n < \infty$), and the convergence is strong in $H$ a.s.

a) Sufficiency. Let $U$ be a unitary operator in $H$ with $U(H_n) = H_n$. Then
\[
U\xi = \sum_{n=1}^\infty \sqrt{\alpha_n}\,\xi_n', \quad \xi_n' = U\xi_n, \quad n \ge 1,
\]
and the convergence is strong in $H$ a.s. Since $U|_{H_n}$ is a unitary operator in $H_n$, $\xi_n' \sim N(0, P_n)$, and the $\xi_n'$ are independent. By theorem 5.2(b), $U\xi \sim N(0, S)$. Hence $\xi$ and $U\xi$ are identically distributed, and $\mu = \mu_\xi$ is $U$-invariant.

b) Necessity. Let $U$ be a unitary operator in $H$ such that $\mu$ is $U$-invariant. By corollary 5.3,
\[
S = USU^* = USU^{-1} \quad\Rightarrow\quad SU = US.
\]
Let $e_i \in H_n$. Then
\[
USe_i = \alpha_nUe_i = S(Ue_i) \quad\Rightarrow\quad Ue_i \in H_n.
\]
Thus, $U(H_n) \subset H_n$, $n \ge 1$.

Remark. If $V \in L(H)$, $V(H_n) = H_n$ and $V|_{H_n}$ is a unitary operator in $H_n$ for each $n \ge 1$, then $V$ is a unitary operator in $H$.

8) Let $\{\gamma_n, n \ge 1\}$ be i.i.d. $N(0,1)$ random variables and $\xi$ be a random element in $H$ with
\[
\xi = \sum_{n=1}^\infty \sqrt{\lambda_n}\,\gamma_ne_n \quad \text{a.s.},
\]
where $\{e_n\}$ is the eigenbasis of $S$ corresponding to $\{\lambda_n\}$ and the series converges strongly in $H$ a.s. By theorem 5.2(b), the distribution $\mu_\xi = \mu$. It holds
\[
J_\alpha = \mathrm E\exp\{\alpha\|\xi\|^2\} = \mathrm E\exp\Big\{\alpha\sum_{n=1}^\infty \lambda_n\gamma_n^2\Big\}.
\]
Consider the random element $\gamma = (\gamma_n)_1^\infty$ distributed in $\mathbb R^\infty$. Its distribution $\mu_\gamma$ is the standard Gaussian measure in $\mathbb R^\infty$. We have
\[
J_\alpha = \int_{\mathbb R^\infty} \exp\Big\{\alpha\sum_{n=1}^\infty \lambda_nx_n^2\Big\}\,d\mu_\gamma(x).
\]
Now, the statement follows directly from problem (11) of Chapter 2 (we note that $\|S\| = \max_{n\ge1}\lambda_n$).

9) We may and do assume that $0 < \varepsilon < 1$. Denote
\[
g_\delta(x) = \exp\Big\{\frac{\delta}{2\|S\|}\cdot\|x\|^2\Big\}, \quad x \in H, \quad 0 < \delta < 1.
\]
If we show that for some $n_\delta \in \mathbb N$,
\[
\sup_{n\ge n_\delta} \int_H g_\delta(x)\,d\mu_n(x) < \infty, \tag{7.13}
\]
then
\[
\sup_{n\ge n_\delta} \int_H |f(x)|^{1+\varepsilon}\,d\mu_n(x) < \infty, \quad \delta = 1 - \varepsilon^2,
\]
and the desired convergence would follow from theorem 3.5 and condition (3.18), both from [BIL 99].

Now, we prove [7.13]. First, consider the case where all mean values $m_n$ of $\mu_n$ are zeroes. Let $S_n$ be the correlation operator of $\mu_n$, $n \ge 1$. By corollary 5.4, $S_n$ converges to $S$ in nuclear norm, which implies that
\[
\|S_n\| \to \|S\| > 0, \quad \mathrm{tr}\,S_n \to \mathrm{tr}\,S \quad\text{as}\quad n \to \infty.
\]
We fix $\delta \in (0, 1)$. There exists $n_\delta \in \mathbb N$ such that for all $n \ge n_\delta$, $\frac{\delta}{2\|S\|} \le \frac{\sqrt\delta}{2\|S_n\|}$. For $n \ge n_\delta$, it holds
\[
\int_H g_\delta(x)\,d\mu_n(x) \le \int_H \exp\Big\{\frac{\sqrt\delta}{2\|S_n\|}\cdot\|x\|^2\Big\}\,d\mu_n(x) = \prod_{k=1}^\infty \frac{1}{\sqrt{1 - 2\alpha_n\lambda_k^{(n)}}}, \qquad \alpha_n = \frac{\sqrt\delta}{2\|S_n\|} < \frac{1}{2\|S_n\|},
\]
where $\{\lambda_k^{(n)}, k \ge 1\}$ are the eigenvalues of $S_n$ (with multiplicity). Then
\[
\int_H g_\delta(x)\,d\mu_n(x) \le \exp\Big\{-\frac12\sum_{k=1}^\infty \log(1 - 2\alpha_n\lambda_k^{(n)})\Big\}.
\]
We have $2\alpha_n\lambda_k^{(n)} \le 2\alpha_n\|S_n\| \le \sqrt\delta < 1$; hence
\[
-\frac12\log(1 - 2\alpha_n\lambda_k^{(n)}) \le C_\delta\cdot2\alpha_n\lambda_k^{(n)},
\]
with $C_\delta > 0$ depending only on $\delta$. Therefore,
\[
\int_H g_\delta(x)\,d\mu_n(x) \le \exp\Big\{2C_\delta\alpha_n\sum_{k=1}^\infty \lambda_k^{(n)}\Big\} = \exp\{2C_\delta\alpha_n\cdot\mathrm{tr}\,S_n\}.
\]
Since $\alpha_n$ and $\mathrm{tr}\,S_n$ are bounded, relation [7.13] follows.

Now, consider arbitrary mean values $m_n$. By corollary 5.4, $m_n$ converges strongly to $m$, where $m$ is the mean value of $\mu$. Let $X_n \sim N(0, S_n)$, $n \ge 1$. Then $m_n + X_n \sim N(m_n, S_n)$, and for each $\tau > 0$,
\[
\|m_n + X_n\|^2 \le (1 + \tau)\|X_n\|^2 + \Big(1 + \frac1\tau\Big)\|m_n\|^2.
\]
Therefore,
\[
\int_H g_\delta(x)\,d\mu_n(x) = \mathrm E\,g_\delta(m_n + X_n) \le \int_H \exp\Big\{\frac{\delta}{2\|S\|}\Big[(1 + \tau)\|x\|^2 + \Big(1 + \frac1\tau\Big)\|m_n\|^2\Big]\Big\}\,d\mu_n^c(x).
\]
Here, $\mu_n^c$ is a centered Gaussian measure with correlation operator $S_n$. We fix $\tau$ such that $\delta(1 + \tau) = \delta_0 < 1$, and since $\|m_n\| \le \mathrm{const}$, we get
\[
\int_H g_\delta(x)\,d\mu_n(x) \le \mathrm{const}\int_H g_{\delta_0}(x)\,d\mu_n^c(x).
\]
The latter integrals are bounded uniformly in n ≥ nδ0 according to the case of zero means, and [7.13] follows, with nδ replaced by nδ0. This completes the proof.

10) We modify the proof of lemma 5.9(a). Denote α = ξ sin φ + η cos φ and β = ξ cos φ − η sin φ. It holds

φ_(α;β)(x*, y*) = E exp{i⟨ξ, x* sin φ + y* cos φ⟩} · E exp{i⟨η, x* cos φ − y* sin φ⟩} = φξ(x* sin φ + y* cos φ) · φη(x* cos φ − y* sin φ) =
= exp{−(1/2)[σ2(x* sin φ + y* cos φ, x* sin φ + y* cos φ) + σ2(x* cos φ − y* sin φ, x* cos φ − y* sin φ)]} =
= exp{−(1/2)(σ2(x*, x*) + σ2(y*, y*))} = φ_(ξ;η)(x*, y*).

In this transformation, we used that σ2 is a symmetric bilinear form on X*. The rest of the proof follows the line of the proof of lemma 5.9(a).

11) Let 0 < α < 1/(2||S||) and t > 0. For all u ≥ 0, u^t ≤ (t/α)^t e^{−t} e^{αu}, since the function u^t e^{−αu} attains its maximum at u = t/α. Hence

∫_H ||x||^{2t} dμ(x) ≤ t^t e^{−t} α^{−t} Jα, Jα = ∫_H e^{α||x||²} dμ(x).
Now, we bound Jα. We use the eigenvalues λn of S (with multiplicity):

Jα = exp{−(1/2) Σ_{n=1}^∞ log(1 − 2αλn)}.

Since 0 ≤ 2αλn ≤ 2α||S|| < 1, we obtain from the concavity of the function g(t) = −log(1 − t), t < 1:

−(1/2) log(1 − 2αλn) ≤ −(1/2) · (2αλn/(2α||S||)) · log(1 − 2α||S||) = −(λn/(2||S||)) log(1 − 2α||S||),

Jα ≤ (1 − 2α||S||)^{−trS/(2||S||)}.

Thus,

∫_H ||x||^{2t} dμ(x) ≤ t^t e^{−t} α^{−t} (1 − 2α||S||)^{−trS/(2||S||)}.
This implies the statement. The function of α attains its minimum at the point α0 = t/(2t||S|| + trS).
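The moment bound just derived can be sanity-checked numerically in a finite-dimensional truncation. The following Python sketch is not from the book: the eigenvalues `lam` are an arbitrary illustrative choice for S, and E||X||^{2t} is estimated by Monte Carlo and compared with t^t e^{−t} α0^{−t} (1 − 2α0||S||)^{−trS/(2||S||)} at the minimizing point α0.

```python
import numpy as np

rng = np.random.default_rng(0)
lam = np.array([1.0, 0.5, 0.25, 0.125])  # hypothetical eigenvalues of S (finite truncation)
s = lam.max()                            # ||S||, the largest eigenvalue
tr = lam.sum()                           # tr S
t = 2.0

# the minimizing alpha from the solution, and the resulting bound
alpha0 = t / (2 * t * s + tr)
bound = t**t * np.exp(-t) * alpha0**(-t) * (1 - 2 * alpha0 * s) ** (-tr / (2 * s))

# Monte Carlo estimate of E ||X||^{2t} for X ~ N(0, diag(lam))
X = rng.standard_normal((200_000, lam.size)) * np.sqrt(lam)
moment = np.mean(np.sum(X**2, axis=1) ** t)

assert moment <= bound
```

With these eigenvalues the exact value of E||X||⁴ is about 6.17, comfortably below the bound (about 13.6), as the assertion confirms.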
12) Let ξ be a Gaussian random element with distribution μ. Then ξ − m is a centered Gaussian random element in X, and by theorem 5.9 there exists α0 > 0 with E exp{α0||ξ − m||²} < ∞. Now, for any ε > 0,

||ξ||² ≤ (||ξ − m|| + ||m||)² ≤ (1 + ε)||ξ − m||² + (1 + 1/ε)||m||²,

and for α > 0,

E exp{α||ξ||²} ≤ exp{α(1 + 1/ε)||m||²} · E exp{α(1 + ε)||ξ − m||²}.

We set α = α0/(1 + ε) and obtain the desired relation.
Since ε is arbitrary, actually we have proved more: if E exp{α0||ξ − m||²} < ∞ (here α0 is a certain positive number), then E exp{α||ξ||²} < ∞ for all α < α0.

13) The sequence {ξn} converges in probability to z; therefore, the Gaussian measures μn := μξn weakly converge to δz, the Dirac measure at the point z (see [BIL 99]). Fix τ > ||z||; then δz(B̄(0, τ)) =: c = 1. In this case, from the proof of theorem 5.10 we obtain that for any β > 0 there exists nβ ≥ 1 such that

sup_{n≥nβ} E exp{β||ξn||²} < ∞.

We set β = 2α, where α is the number from the statement of the problem. Then

sup_{n≥nβ} E f²(ξn) ≤ sup_{n≥nβ} E exp{2α||ξn||²} < ∞.

Since f is continuous almost everywhere (a.e.) with respect to the limit measure δz, we obtain (see [BIL 99]):

E f(ξn) → E f(ξ) = f(z) as n → ∞.
Here, ξ(ω) ≡ z, and ξn → ξ in distribution.

14) We modify the proof demonstrated in example 5.2. Since the measure σ is sigma-finite, B* = Lq(T, σ). For x* ∈ Lq(T, σ), we explain why

η := ∫_T ξ(t) x*(t) dσ(t)

is a Gaussian r.v. (all the other reasonings from example 5.2 are easily extended to the measure σ on T instead of the Lebesgue measure on the interval).

First, one can show by Fubini's theorem that η is a r.v. There exists an increasing sequence {Tn} such that T = ∪_{n=1}^∞ Tn, Tn ∈ S, n ≥ 1 (here S is the sigma-algebra on which σ is defined), and σ(Tn) ≤ n for all n. Denote

xn* = xn*(t) = x*(t) · I_{An}(t), An = {t ∈ Tn : (x*(t))² · E ξ²(t) ≤ n}.

It is enough to show that η is Gaussian, with xn* instead of x*. In this case σ(Tn) < ∞,

η = ∫_{Tn} ξ(t) xn*(t) dσ(t), ∫_{Tn} E ξ²(t) · (xn*(t))² dσ(t) ≤ n² < ∞,
and η ∈ L2(Ω, P). The rest of the proof of the fact that such an η is Gaussian follows the line of proof in example 5.2, where the Lebesgue integral over [0, T] is replaced with the integral over T w.r.t. the measure σ.

7.6. Solutions for Chapter 6

1) a) Necessity follows from the inequality lim sup_{n→∞} zn ≤ sup_{n≥1} zn.

b) Sufficiency. Fix ε > 0. From the given relation, it follows that there exists a positive α0 with

lim sup_{n→∞} E |Xn| · I(|Xn| ≥ α0) < ε.

Hence for some n0 ≥ 2,

sup_{n≥n0} E |Xn| · I(|Xn| ≥ α0) < 2ε.

Since E |Xn| < ∞, it holds lim_{α→+∞} E |Xn| · I(|Xn| ≥ α) = 0 for each fixed n. Therefore, there exists α1 > 0 with

Σ_{n=1}^{n0−1} E |Xn| · I(|Xn| ≥ α1) < ε.
For α2 := max{α0, α1}, we have

sup_{n≥1} E |Xn| · I(|Xn| ≥ α2) ≤ Σ_{n=1}^{n0−1} E |Xn| · I(|Xn| ≥ α2) + sup_{n≥n0} E |Xn| · I(|Xn| ≥ α2) < ε + 2ε = 3ε.

Thus, for all α ≥ α2 we obtain

sup_{n≥1} E |Xn| · I(|Xn| ≥ α) < 3ε.
This proves [6.1].

2) Let ξ have a Cauchy distribution and {εn} be a sequence of positive numbers with εn → 0 and nεn → ∞ as n → ∞. We set Xn = εn ξ I(|ξ| ≤ n) and will specify εn more precisely later. Since Xn is symmetrically distributed, it holds E Xn = 0, and |Xn| ≤ εn|ξ| → 0 a.s.; hence Xn → X = 0 a.s. For α > 0 and nεn ≥ α, we have

An := E |Xn| · I(|Xn| ≥ α) = εn E |ξ| · I(α/εn ≤ |ξ| ≤ n) = 2εn · (1/π) ∫_{α/εn}^{n} x dx/(1 + x²) = (εn/π) [log(1 + n²) − log(1 + α²/εn²)].

We select εn = 1/log(1 + n²), n ≥ 1, such that An tends to a positive number. Indeed, with such a choice, An → 1/π as n → ∞. Hence

B(α) := sup_{n≥1} An ≥ lim_{n→∞} An = 1/π,
and B(α) does not tend to 0 as α → +∞. Thus, {Xn} is not uniformly integrable.

3) a) Sufficiency. Denote ε(α) = sup_{x≥α} x/G(x), and note that ε(α) → 0 as α → +∞ and

x I(x ≥ α) ≤ ε(α) G(x), x ≥ 0.
Therefore,

sup_{n≥1} E |Xn| · I(|Xn| ≥ α) ≤ ε(α) sup_{n≥1} E G(|Xn|) → 0 as α → +∞.
b) Necessity. If sup_{n≥1} |Xn| is a bounded r.v., then G(t) = t², t ≥ 0, satisfies the statement. Now, suppose that sup_{n≥1} |Xn| is an unbounded r.v. We set φα(x) = x I(x ≥ α), x ≥ 0, α > 0, and

G(t) = t (sup_{n≥1} E φt(|Xn|))^{−1/2}, t ≥ 0.

Then G(t) is an increasing function as a product of the increasing function g(t) = t, t ≥ 0, and the non-decreasing function

h(t) = (sup_{n≥1} E φt(|Xn|))^{−1/2}, t ≥ 0.

Moreover, since {Xn} is uniformly integrable, it holds G(t)/t = h(t) → +∞ as t → +∞.
Let Fn be the cdf of |Xn| and τn ∈ [0, +∞] be the right endpoint of the distribution of |Xn|. (This means in case τn = +∞ that |Xn| is unbounded, and in case τn < +∞ that P(|Xn| ≤ τn) = 1 and P(|Xn| > τn − ε) > 0 for each ε > 0.) We have

E G(|Xn|) ≤ ∫_0^{τn} t (∫_t^{τn} x dFn(x))^{−1/2} dFn(t) = −2 ∫_0^{τn} d(∫_t^{τn} x dFn(x))^{1/2} = 2 (∫_0^{τn} x dFn(x))^{1/2} = 2 (E |Xn|)^{1/2}.

Here, the integrals are Lebesgue–Stieltjes integrals. Hence by lemma 6.1(a),

sup_{n≥1} E G(|Xn|) ≤ 2 (sup_{n≥1} E |Xn|)^{1/2} < ∞.
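The sufficiency direction of problem 3 can be illustrated numerically. The Python sketch below is a toy example, not from the book: the family {Xn} and the choice G(t) = t² are mine. For this G, ε(α) = sup_{x≥α} x/G(x) = 1/α, and the tail suprema are dominated by ε(α) · sup_n E G(|Xn|), exactly as in the sufficiency argument.

```python
import numpy as np

rng = np.random.default_rng(1)

# a family with sup_n E G(|X_n|) < infinity for G(t) = t^2:
# X_n = (1 - 1/n) * Z with Z standard normal, so second moments stay below 1
Z = rng.standard_normal(100_000)
family = [(1 - 1 / n) * Z for n in range(1, 21)]

G = lambda t: t**2
sup_EG = max(np.mean(G(np.abs(X))) for X in family)

for alpha in [1.0, 2.0, 4.0, 8.0]:
    eps = 1.0 / alpha  # eps(alpha) = sup_{x >= alpha} x / G(x) when G(t) = t^2
    tail = max(np.mean(np.abs(X) * (np.abs(X) >= alpha)) for X in family)
    # the bound from the sufficiency argument: tail <= eps(alpha) * sup_n E G(|X_n|)
    assert tail <= eps * sup_EG + 1e-12
```

The inequality |x|·I(|x| ≥ α) ≤ x²/α holds pointwise, so the empirical tail means are dominated deterministically, and the bound goes to 0 as α grows.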
4) Let ν1 and μ1 be uniform distributions on [0, 2] and [1, 3], respectively. Consider ν = ∏_{k=1}^∞ νk and μ = ∏_{k=1}^∞ μk, where {νk, k ≥ 2} are arbitrary probability measures on B and μk = νk, k ≥ 2. Since ν1 is not absolutely continuous w.r.t. μ1, ν is not absolutely continuous w.r.t. μ either (see theorem 6.4). It holds ν1 = (1/2)(ν1′ + ν1″), where ν1′ and ν1″ are uniform distributions on [1, 2] and [0, 1], respectively. Then

ν = (1/2)(ν′ + ν″), ν′ = ν1′ × ∏_{k=2}^∞ νk, ν″ = ν1″ × ∏_{k=2}^∞ νk.

By theorem 6.4, ν′ ≪ μ. Since ν1″ ⊥ μ1, it holds ν″ ⊥ μ. Hence ν is neither absolutely continuous w.r.t. μ nor singular to μ. In a similar way, it is shown that μ is not absolutely continuous w.r.t. ν either.

5) Denote νk = Pois(λk), μk = Pois(λ̂k), and fk = dνk/dμk.
It holds νk ~ μk and, for n ≥ 0,

νk({n}) = e^{−λk} λk^n / n!, ∫_{{n}} fk(x) dμk(x) = fk(n) e^{−λ̂k} λ̂k^n / n!, fk(n) = e^{−(λk − λ̂k)} (λk/λ̂k)^n.

Compute the Hellinger integral:

H(νk, μk) = ∫_R √(fk) dμk = Σ_{n=0}^∞ e^{−(λk−λ̂k)/2} (λk/λ̂k)^{n/2} e^{−λ̂k} λ̂k^n / n! = e^{−(λk−λ̂k)/2 − λ̂k + √(λk λ̂k)};

H(νk, μk) = exp{−(1/2)(√λk − √λ̂k)²}.
If Σ_{k=1}^∞ (√λk − √λ̂k)² < ∞, then the latter infinite product converges, and if Σ_{k=1}^∞ (√λk − √λ̂k)² = ∞, then the product diverges to zero. Now, the desired statement follows from theorem 6.4 and corollary 6.2.

6) a) Let μ ≪ ν. Take A ∈ S with νe(A) = 0. Then ν(A ∩ Xr) = νe(A) = 0; hence μe(A) = μ(A ∩ Xr) = 0, and μe ≪ νe.
Now, assume that μe ≪ νe. Take B ∈ Xr ∩ S with ν(B) = 0. Hence νe(B) = 0 and μ(B) = μe(B) = 0, and μ ≪ ν.

b) Let μ ⊥ ν. Then Xr = Xr′ ∪ Xr″ with Xr′ ∩ Xr″ = ∅, μ(Xr′) = 0, ν(Xr″) = 0. Hence μe(Xr′) = 0, νe(X \ Xr′) = ν(Xr″) = 0, and μe ⊥ νe.

Now, assume that μe ⊥ νe. Then X = X′ ∪ X″ with X′ ∩ X″ = ∅, μe(X′) = 0, νe(X″) = 0. Hence Xr = Xr′ ∪ Xr″ with Xr′ = X′ ∩ Xr, Xr″ = X″ ∩ Xr, μ(Xr′) = 0 and ν(Xr″) = 0. Thus, μ ⊥ ν.

7) Similarly to theorem 6.7, we obtain for a1 − a2 = B^{1/2} b, b ∈ H (here (b, ek) = 0 whenever βk = 0):

dμ/dν = exp{−||b||²/2 + (b, B^{−1/2}(x − a2))}.

Here, (b, B^{−1/2}(x − a2)) is a measurable functional on the probability space (H, B(H), ν) (see remark 6.6),

(b, B^{−1/2}(x − a2)) = lim_{n→∞} Σ_{k=1}^n bk(xk − a_{2,k})/√βk,

and the limit exists a.e. with respect to ν; if some βk0 = 0, then xk0 = a_{2,k0} (mod ν), and bk0(xk0 − a_{2,k0})/√βk0 := 0 by definition.
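In a finite-dimensional (diagonal) analogue, the density formula of solution 7 can be checked directly. The Python sketch below uses illustrative values of my own choosing: it compares the log of the Gaussian density ratio N(a1, B)/N(a2, B) with −||b||²/2 + (b, B^{−1/2}(x − a2)), where b = B^{−1/2}(a1 − a2).

```python
import numpy as np

rng = np.random.default_rng(2)
lam = np.array([2.0, 0.5])                   # eigenvalues of B (diagonal, nondegenerate)
a1, a2 = np.array([1.0, -1.0]), np.array([0.0, 0.5])
b = (a1 - a2) / np.sqrt(lam)                 # b = B^{-1/2}(a1 - a2)

def log_density(x, a):
    # log density of N(a, diag(lam)), up to the common normalizing constant
    return -0.5 * np.sum((x - a) ** 2 / lam)

x = rng.standard_normal(2)                   # an arbitrary evaluation point
lhs = log_density(x, a1) - log_density(x, a2)         # log dmu/dnu at x
rhs = -0.5 * b @ b + b @ ((x - a2) / np.sqrt(lam))    # -||b||^2/2 + (b, B^{-1/2}(x - a2))
assert np.isclose(lhs, rhs)
```

Expanding the two quadratic forms shows the identity holds for every x, which is what the assertion verifies at a random point.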
8) The necessity is obvious. Now, we prove the sufficiency. Let {An} satisfy the conditions of the problem. Decompose ν = ν1 + ν2, where ν1 and ν2 are finite measures on S with ν1 ≪ μ, ν2 ⊥ μ. Since μ(An) → 0, it holds ν1(An) → 0, and ν2(An) = ν(An) − ν1(An) → 1 as n → ∞. Hence ν2(X) ≥ 1, ν1(X) = ν(X) − ν2(X) ≤ 1 − 1 = 0, ν1 = 0 and ν = ν2 ⊥ μ.

9) Let ξ ~ N(a2, B2); then tξ + c ~ N(a1, B1) for some a1 ∈ H and B1 = t²B2. If t = 0, then Ker B1 ≠ Ker B2 and μξ ⊥ μ_{tξ+c} (see remark 6.6). If t ∈ R \ {−1, 0, 1}, then J := B2^{−1/2} B1 B2^{−1/2} = t² B2^{−1/2} B2 B2^{−1/2}, and Jx = t²x, x ∈ H ⊖ Ker B2 =: L, dim L = ∞. For D := B2^{−1/2} B1 B2^{−1/2} − I, it holds D|L = (t² − 1)I_L, where I_L is the identity operator on L. Since t² − 1 ≠ 0, D|L is not a compact operator on L; hence D is not compact as well, and the Feldman–Hájek theorem implies that N(a1, B1) ⊥ N(a2, B2).
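The operator D in solution 9 can be made concrete in a finite-dimensional truncation. In the Python sketch below (with a hypothetical t and hypothetical eigenvalues of B2), D comes out as a nonzero multiple of the identity; in infinite dimensions such an operator cannot be compact, which is exactly what the Feldman–Hájek argument uses.

```python
import numpy as np

t = 2.0
lam2 = 1.0 / np.arange(1, 11) ** 2        # eigenvalues of B2 (trace class, trivial kernel)
B2 = np.diag(lam2)
B1 = t**2 * B2                            # covariance of t*xi + c

B2_inv_sqrt = np.diag(1 / np.sqrt(lam2))
D = B2_inv_sqrt @ B1 @ B2_inv_sqrt - np.eye(lam2.size)

# every eigenvalue of D equals t^2 - 1, so D restricted to L is (t^2 - 1) I_L;
# a nonzero multiple of the identity on an infinite-dimensional L is never compact
assert np.allclose(np.linalg.eigvalsh(D), t**2 - 1)
```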
ˆb0 := m (fi , B −1/2 X0 )fi . Random vector (fi , B −1/2 X0 ) m has distribution i=1 i=1
232
Gaussian Measures in Hilbert Space
N (0, Im ). Hence ˆb0 ∼ N (0, PUm ), Um = span(f1 , . . . , fm ), PUm is orthoprojector ˆ = B 1/2ˆb, its correlation operator is on Um . Thus, Sˆb = PUm . Since the MLE a ˆ is unbiased, a ˆ ∼ N (a, B 1/2 PUm B 1/2 ). B 1/2 PUm B 1/2 . Because the estimator a Answer: a ˆ ∼ span(f1 , . . . , fm ).
N (a, B 1/2 PUm B 1/2 ), where PUm is orthoprojector on
11) For any real C,
P {ξ + η = C} = x+y=C
=
dμξ (x)dμη (y) =
R
R
dμη (y)
{C−y}
dμξ (x) =
P {ξ = C − y} dμη (y) = 0,
since cdf of ξ is continuous at any point C − y. Here, μξ and μη are the distribution of ξ and η, respectively. ∞
12) Let random element X = (Xk )1 on R∞ be related to Y as in section 6.4.2. Under H0 , {Xk } are i.i.d. standard normal and under H ∗ , {Xk } are independent with Xk ∼ N (0, 1 + βk ), k ≥ 1. Introduce S(X) and Q(X) defined in [6.70] and [6.72], respectively. Like in section 6.4.2, PH0 {S(X) = C} = 0 and the Neyman–Pearson test is constructed based on relation [6.73]: reject H0 if Q(X) > C˜α , and do not reject H0 if Q(X) ≤ C˜α . Since βk ≤ 0, k ≥ 1, equation [6.73] takes a form 5 6 ˜ < −C˜α = α, PH0 Q(X)
˜ Q(X) :=
∞ |βk | X 2 k
1
1 + βk
.
˜ Using [6.75]–[6.76], we approximate the distribution of Q(X) by the distribution Aχ2ν , A=
∞ 1
ν=
∞
|βk | βk2 : , 2 (1 + βk ) 1 + βk 1
∞ |βk | 1 + βk 1
2
:
∞ 1
βk2 . (1 + βk )2
Hence C˜α = −Aχ2ν,1−α , where χ2ν,1−α is the upper (1 − α)-quantile of χ2ν distribution.
Solutions
233
∞ Xk ˜ Under H ∗ , Q(X) = 1 |βk | γk2 , γk = √1+β , {γk } are i.i.d. standard normal, k ˜ and we approximate the distribution of Q(X) by the distribution Bχ2τ with B=
∞
βk2
:
1
∞
|βk | ,
1
∞ 2 |βk |) τ = 1 ∞ 2 . 1 βk (
The power of the test is as follows: power ≈ P
%
Bχ2τ