Gaussian Measures in Hilbert Space
To the memory of my daughter Ann
Series Editor Nikolaos Limnios
Gaussian Measures in Hilbert Space Construction and Properties
Alexander Kukush
First published 2019 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address: ISTE Ltd 27-37 St George’s Road London SW19 4EU UK
John Wiley & Sons, Inc. 111 River Street Hoboken, NJ 07030 USA
www.iste.co.uk
www.wiley.com
© ISTE Ltd 2019 The rights of Alexander Kukush to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988. Library of Congress Control Number: 2019946454 British Library Cataloguing-in-Publication Data A CIP record for this book is available from the British Library ISBN 978-1-78630-267-0
Contents

Foreword
Preface
Introduction
Abbreviations and Notation

Chapter 1. Gaussian Measures in Euclidean Space
1.1. The change of variables formula
1.2. Invariance of Lebesgue measure
1.3. Absence of invariant measure in infinite-dimensional Hilbert space
1.4. Random vectors and their distributions
1.4.1. Random variables
1.4.2. Random vectors
1.4.3. Distributions of random vectors
1.5. Gaussian vectors and Gaussian measures
1.5.1. Characteristic functions of Gaussian vectors
1.5.2. Expansion of Gaussian vector
1.5.3. Support of Gaussian vector
1.5.4. Gaussian measures in Euclidean space

Chapter 2. Gaussian Measure in $l_2$ as a Product Measure
2.1. Space $\mathbb{R}^\infty$
2.1.1. Metric on $\mathbb{R}^\infty$
2.1.2. Borel and cylindrical sigma-algebras coincide
2.1.3. Weighted $l_2$ space
2.2. Product measure in $\mathbb{R}^\infty$
2.2.1. Kolmogorov extension theorem
2.2.2. Construction of product measure on $B(\mathbb{R}^\infty)$
2.2.3. Properties of product measure
2.3. Standard Gaussian measure in $\mathbb{R}^\infty$
2.3.1. Alternative proof of the second part of theorem 2.4
2.4. Construction of Gaussian measure in $l_2$

Chapter 3. Borel Measures in Hilbert Space
3.1. Classes of operators in H
3.1.1. Hilbert–Schmidt operators
3.1.2. Polar decomposition
3.1.3. Nuclear operators
3.1.4. S-operators
3.2. Pettis and Bochner integrals
3.2.1. Weak integral
3.2.2. Strong integral
3.3. Borel measures in Hilbert space
3.3.1. Weak and strong moments
3.3.2. Examples of Borel measures
3.3.3. Boundedness of moment form

Chapter 4. Construction of Measure by its Characteristic Functional
4.1. Cylindrical sigma-algebra in normed space
4.2. Convolution of measures
4.3. Properties of characteristic functionals in H
4.4. S-topology in H
4.5. Minlos–Sazonov theorem

Chapter 5. Gaussian Measure of General Form
5.1. Characteristic functional of Gaussian measure
5.2. Decomposition of Gaussian measure and Gaussian random element
5.3. Support of Gaussian measure and its invariance
5.4. Weak convergence of Gaussian measures
5.5. Exponential moments of Gaussian measure in normed space
5.5.1. Gaussian measures in normed space
5.5.2. Fernique’s theorem

Chapter 6. Equivalence and Singularity of Gaussian Measures
6.1. Uniformly integrable sequences
6.2. Kakutani’s dichotomy for product measures on $\mathbb{R}^\infty$
6.2.1. General properties of absolutely continuous measures
6.2.2. Kakutani’s theorem for product measures
6.2.3. Dichotomy for Gaussian product measures
6.3. Feldman–Hájek dichotomy for Gaussian measures on H
6.3.1. The case where Gaussian measures have equal correlation operators
6.3.2. Necessary conditions for equivalence of Gaussian measures
6.3.3. Criterion for equivalence of Gaussian measures
6.4. Applications in statistics
6.4.1. Estimation and hypothesis testing for mean of Gaussian random element
6.4.2. Estimation and hypothesis testing for correlation operator of centered Gaussian random element

Chapter 7. Solutions
7.1. Solutions for Chapter 1
7.2. Solutions for Chapter 2
7.2.1. Generalized Kolmogorov extension theorem
7.3. Solutions for Chapter 3
7.4. Solutions for Chapter 4
7.5. Solutions for Chapter 5
7.6. Solutions for Chapter 6

Summarizing Remarks
References
Index
Foreword
The study of the modern theory of stochastic processes, infinite-dimensional analysis and Malliavin calculus is impossible without a solid knowledge of Gaussian measures on infinite-dimensional spaces. In spite of the importance of this topic and the abundance of literature available for experienced researchers, there is no textbook suitable for students for a first reading.

The present manual is an excellent introductory course in Gaussian measures on infinite-dimensional spaces, which has been given by the author for many years at the Faculty of Mechanics & Mathematics of Taras Shevchenko National University of Kyiv, Ukraine. The presentation of the material is well thought out, and the course is self-contained. After reading the book it may seem that the topic is very simple. But that is not true! Apparent simplicity is achieved by careful organization of the book. For experts and PhD students having experience in infinite-dimensional analysis, I would rather recommend the monograph V. I. Bogachev, Gaussian Measures (1998). But for a first acquaintance with the topic, I recommend this new manual.

Prerequisites for the book are only a basic knowledge of probability theory, linear algebra, measure theory and functional analysis. The exposition is supplemented with a wealth of examples and exercises with solutions, which are very useful for independent work and for checking one's grasp of the material.

In this book, many delicate and important topics of infinite-dimensional analysis are analyzed in detail, e.g. Borel and cylindrical sigma-algebras in infinite-dimensional spaces, Bochner and Pettis integrals, nuclear operators and the topology of nuclear convergence, etc.

We present the contents of the book, emphasizing places where finite-dimensional results need reconsideration (everywhere except Chapter 1).
– Chapter 1. Gaussian distributions on a finite-dimensional space. The chapter is preparatory but necessary. Later on, many analogies with finite-dimensional space will be given, and the places where a new technique is needed will become visible.

– Chapter 2. Space $\mathbb{R}^\infty$, Kolmogorov theorem about the existence of probability measure, product measures, Gaussian product measures, Gaussian product measures in $l_2$ space. After reading the chapter, the student will start to understand that on infinite-dimensional space there are several ways to define a sigma-algebra (luckily, in our case Borel and cylindrical sigma-algebras coincide). Moreover, it will become clear that infinite-dimensional Lebesgue measure does not exist, hence construction of measure by means of density needs reconsideration.

– Chapter 3. Bochner and Pettis integrals, Hilbert–Schmidt operators and nuclear operators, strong and weak moments. The chapter is a preparation for the definition of the expectation and correlation operator of a Gaussian (or even arbitrary) random element. We see that it is not so easy to introduce the expectation of a random element distributed in Hilbert or Banach space. As opposed to finite-dimensional space, it is not enough just to integrate over basis vectors and then assemble the results into a single vector.

– Chapter 4. Characteristic functionals, Minlos–Sazonov theorem. One of the most important methods to investigate probability measures on finite-dimensional space is the method of characteristic functions. As is well known from a course in probability theory, the characteristic functions are exactly the continuous positive definite functions equal to one at zero. On infinite-dimensional space this is not true. For such an “if and only if” statement, continuity in the topology of nuclear convergence is required, and this topology is explained in detail.

– Chapter 5. General Gaussian measures. Based on results of previous chapters, we see the necessary and sufficient conditions that have to be satisfied by the characteristic functional of a Gaussian measure in Hilbert space. We realize that we have used all the knowledge from Chapters 2–4 (concerning integration of random elements, Hilbert–Schmidt and nuclear operators, the Minlos–Sazonov theorem, etc.). We notice that in the eigenbasis of the correlation operator, a Gaussian measure is just a product measure which we constructed in Chapter 2. This seems natural; but on our way it was impossible to discard any single step without loss of mathematical rigor. In this chapter, Fernique’s theorem about finiteness of an exponential moment of the norm of a Gaussian random element is proved and the criterion for the weak convergence of Gaussian measures is stated.

– Chapter 6. Equivalence and mutual singularity of measures. Here, Kakutani’s theorem about the equivalence of infinite products of measures is proven. As we saw in the previous chapter, Gaussian measures on Hilbert spaces are product measures, in a way. Therefore, as a consequence of the general theory, we get a criterion for the equivalence of Gaussian measures (Feldman–Hájek theorem). The obtained results are applied to problems of infinite-dimensional statistics. One should be careful here, as due to the absence of the infinite-dimensional Lebesgue measure, the Radon–Nikodym density should be written w.r.t. one of the Gaussian measures.

The author of this book, Professor A.G. Kukush, has been working at the Faculty of Mechanics & Mathematics of Taras Shevchenko National University for 40 years. He is an excellent teacher and a famous expert in statistics and probability theory. In particular, he used to give lectures to students of mathematics and statistics on Measure Theory, Functional Analysis, Statistics and Econometrics. As a student, I was lucky to attend his fascinating course on infinite-dimensional analysis.

Andrey PILIPENKO
Leading Researcher at the Institute of Mathematics of Ukrainian National Academy of Sciences, Professor of Mathematics at the National Technical University of Ukraine, “Igor Sikorsky Kyiv Polytechnic Institute”
August 2019
Preface
This book is written for graduate students of mathematics and mathematical statistics who know algebra, measure theory and functional analysis (generalized functions are not used here); knowledge of mathematical statistics is desirable only to understand section 6.4. The topic of this book can be considered as supplementary chapters of measure theory and lies between measure theory and the theory of stochastic processes; possible applications are in functional analysis and statistics of stochastic processes. For 20 years, the author has been giving a special course “Gaussian Measures” at Taras Shevchenko National University of Kyiv, Ukraine, and in 2018–2019, preliminary versions of this book were used as a textbook for this course.

There are excellent textbooks and monographs on related topics, such as Gaussian Measures in Banach Spaces [KUO 75], Gaussian Measures [BOG 98] and Probability Distributions on Banach Spaces [VAK 87]. Why did I write my own textbook?

In the 1970s, I studied at the Faculty of Mechanics and Mathematics of Taras Shevchenko National University of Kyiv, at that time called Kiev State University. There I attended unforgettable lectures given by Professors Anatoliy Ya. Dorogovtsev (calculus and measure theory), Lev A. Kaluzhnin (algebra), Mykhailo I. Yadrenko (probability theory), Myroslav L. Gorbachuk (functional analysis) and Yuriy M. Berezansky (spectral theory of linear operators). My PhD thesis was supervised by the famous statistician A. Ya. Dorogovtsev and dealt with the weak convergence of measures on infinite-dimensional spaces. For a long time, I was a member of the research seminar “Stochastic processes and distributions in functional spaces” headed by the classics of probability theory Anatoliy V. Skorokhod and Yuriy L. Daletskii. My second doctoral thesis was about asymptotic properties of estimators for parameters of stochastic processes. Thus, I am somewhat tied up with measures on infinite-dimensional spaces.
In 1979, Kuo’s fascinating textbook was translated into Russian. Inspired by this book, I started to give my lectures on Gaussian measures for graduate students. The subject seemed highly technical and extremely difficult. I decided to create something like a comic book on this topic, in particular to divide lengthy proofs into small understandable steps and explain the ideas behind computations.

It is impossible to study mathematical courses without solving problems. Each section ends with several problems, some of which are original and some of which are taken from different sources. A separate chapter contains detailed solutions to all the problems.

Acknowledgments

I would like to thank my colleagues at Taras Shevchenko National University of Kyiv who supported my project, especially Yuliya Mishura, Oleksiy Nesterenko and Ivan Feshchenko. Also I wish to thank my students of different generations who followed up on the ideas of the material and helped me to improve the presentation. I am grateful to Fedor Nazarov (Kent State University, USA) who communicated the proof of theorem 3.9. In particular, I am grateful to Oksana Chernova and Andrey Frolkin for preparing the manuscript for publication. I thank Sergiy Shklyar for his valuable comments. My wife Mariya deserves the most thanks for her encouragement and patience.

Alexander KUKUSH
Kyiv, Ukraine
September 2019
Introduction
The theory of Gaussian measures lies at the junction of the theory of stochastic processes, functional analysis and mathematical physics. Possible applications are in quantum mechanics, statistical physics, financial mathematics and other branches of science. In this field, the ideas and methods of probability theory, nonlinear analysis, geometry and the theory of linear operators interact in an elegant and intriguing way. The aim of this book is to explain the construction of Gaussian measure in Hilbert space, present its main properties and also outline possible applications in statistics.

Chapter 1 deals with Euclidean space, where the invariance of Lebesgue measure is explained and Gaussian vectors and Gaussian measures are introduced. Their properties are stated in such a form that (later on) they can be extended to the infinite-dimensional case. Furthermore, it is shown that on an infinite-dimensional Hilbert space there is no non-trivial measure that is invariant under all translations (the same holds for invariance under all unitary operators); hence on such a space there is no measure analogous to the Lebesgue one.

In Chapter 2, a product measure is constructed on the sequence space $\mathbb{R}^\infty$ based on the Kolmogorov extension theorem. For the standard Gaussian measure μ on $\mathbb{R}^\infty$, the Kolmogorov–Khinchin criterion is established. In particular, it is shown that μ is concentrated on certain weighted sequence spaces $l_{2,a}$, and based on an isometry between $l_{2,a}$ and $l_2$, a Gaussian product measure is constructed on the latter sequence space.

Chapter 3 introduces important classes of operators in a separable infinite-dimensional Hilbert space H, in particular S-operators, i.e. self-adjoint, positive and nuclear ones. Theorem 3.9 shows that the convergence of S-operators is equivalent to a certain convergence of the corresponding quadratic forms. Also the weak (Pettis) and strong (Bochner) integrals are defined for a function valued in a Banach space.
Borel probability measures on H and on a normed space X are studied, with examples. The boundedness of moment forms of such measures is shown, with a simple proof based on the classical Banach–Steinhaus theorem. Corollary 3.3 and remark 3.8 give mild conditions for the existence of the mean value of a probability measure μ as a Pettis integral, and if the underlying space is a separable Banach space B and μ has a strong first moment, then its mean value exists as a Bochner integral.

In Chapter 4, properties of characteristic functionals of Borel probability measures on H are studied. A special linear topology, the S-topology, is introduced in H with a neighborhood system consisting of ellipsoids. The classical Minlos–Sazonov theorem is proven; it properly extends Bochner’s theorem from $\mathbb{R}^n$ to H. According to the Minlos–Sazonov theorem, the characteristic functional of a Borel probability measure on H must be continuous in the S-topology. A part of the proof of this theorem (see lemma 4.9) suggests a way to construct a probability measure from its characteristic functional.

In Chapter 5, theorem 5.1 uses the Minlos–Sazonov theorem to describe a Gaussian measure on H of general form. It turns out that the correlation operator of such a measure is always an S-operator. It is shown that each Gaussian measure on H is just a product of one-dimensional Gaussian measures w.r.t. the eigenbasis of the correlation operator. Thus, every Gaussian measure on H can be constructed in the way demonstrated in Chapter 2. The support of a Gaussian measure is studied. It is shown that a centered Gaussian measure is invariant under quite a rich group of linear transforms (see theorem 5.5). Hence, a Gaussian measure in Hilbert space can be considered as a natural infinite-dimensional analogue of the (invariant) Lebesgue measure. A criterion for the weak convergence of Gaussian measures is stated, where (due to theorem 3.9) we recognize the convergence of correlation operators in nuclear norm.
In section 5.5, we study Gaussian measures on a separable normed space X. The important example 5.3 shows that a Gaussian stochastic process generates a measure on the path space $L_p[0, T]$, hence in the case p = 2, we obtain a Gaussian measure on Hilbert space. Lemma 5.9 presents a characterization of a Gaussian random element in X. The famous theorem of Fernique is proven, which states that certain exponential moments of a Gaussian measure on X are finite. In particular, every Gaussian measure on a separable Banach space B has a mean value as a Bochner integral and its correlation operator is well-defined. Theorem 5.10 derives the convergence of moments of weakly convergent Gaussian measures.

In Chapter 6, Kakutani’s remarkable dichotomy for product measures on $\mathbb{R}^\infty$ is proven. In particular, two such product measures with absolutely continuous components are either absolutely continuous or mutually singular. This implies the dichotomy for Gaussian measures on $\mathbb{R}^\infty$: two such measures are either equivalent or mutually singular. Section 6.3 proves the famous Feldman–Hájek dichotomy for Gaussian measures on H, and in the case of equivalent measures, expressions for Radon–Nikodym derivatives are provided.

In section 6.4, the results of Chapter 6 are applied in statistics. Based on a single observation of a Gaussian random element in H, we construct unbiased estimators for its mean and for parameters of its correlation operator; also we check a hypothesis about the mean and the correlation operator (the latter hypothesis is in the case where the Gaussian element is centered). In view of example 5.3 with p = 2, these statistical procedures can be used for a single observation of a Gaussian process on a finite time interval.

The book is aimed at advanced undergraduate students and graduate students in mathematics and statistics, and also at theoretically interested students from other disciplines, say physics. Prerequisites for the book are calculus, algebra, measure theory, basic probability theory and functional analysis (we do not use generalized functions). In section 6.4, knowledge of basic mathematical statistics is required.

Some words about the structure of the book: we present the results in lemmas, theorems, corollaries and remarks. All statements are proven. Important and illustrative examples are given. Furthermore, each section ends with a list of problems. Detailed solutions to the problems are provided in Chapter 7.

The abbreviations and notation used in the book are defined in the corresponding chapters; an overview of them is given in the following list.
Abbreviations and Notation

a.e.: almost everywhere w.r.t. Lebesgue measure
a.s.: almost surely
cdf: cumulative distribution function
pdf: probability density function
i.i.d.: independent and identically distributed (random variables or vectors)
r.v.: random variable
LHS: left-hand side
RHS: right-hand side
MLE: maximum likelihood estimator
$|A|$: number of points in set A
$A^c$: complement of set A
$\bar{A}$: closure of set A
$TB$: image of set B under transformation T
$T^{-1}A$: preimage of set A under transformation T
$x^\top$, $A^\top$: transposed vector and transposed matrix, respectively
$\bar{\mathbb{R}}$: extended real line, i.e. $\bar{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}$
$\mathbb{R}^{n \times m}$: space of real n × m matrices
$B(x, r)$, $\bar{B}(x, r)$: open and closed ball, respectively, centered at x with radius r > 0 in a metric space
$f_+$: positive part of function f, $f_+ = \max(f, 0)$
$f_-$: negative part of function f, $f_- = -\min(f, 0)$
$\delta_{ij}$: Kronecker delta, $\delta_{ij} = 1$ if i = j, and $\delta_{ij} = 0$ otherwise
$a_n \sim b_n$: $\{a_n\}$ is equivalent to $\{b_n\}$ as n → ∞, i.e. $a_n / b_n \to 1$ as n → ∞
$C(X)$: space of all real continuous functions on X
$\mathbb{R}^\infty$: space of all real sequences
$B(X)$: Borel sigma-algebra on metric (or topological) space X
$\lambda_m$: Lebesgue measure on $\mathbb{R}^m$
$S_m$: sigma-algebra of Lebesgue measurable sets on $\mathbb{R}^m$
$I_A$: indicator function, i.e. $I_A(x) = 1$ if x ∈ A, else $I_A(x) = 0$
$\mu T^{-1}$: measure induced by measurable transformation T based on measure μ, i.e. $(\mu T^{-1})(A) = \mu(T^{-1}A)$ for each measurable set A
$L(X, \mu)$: space of Lebesgue integrable functions on X w.r.t. measure μ
$f = g \pmod{\mu}$: functions f and g are equal almost everywhere w.r.t. measure μ
$\delta_x$: Dirac measure at point x, $\delta_x(B) = I_B(x)$
$\nu \ll \mu$: signed measure ν is absolutely continuous w.r.t. measure μ
$\frac{d\nu}{d\mu}$: the Radon–Nikodym derivative of ν w.r.t. μ
$\nu \sim \mu$: measures ν and μ are equivalent
$\nu \perp \mu$: signed measure ν and measure μ are mutually singular
$(x, y)$: inner product of vectors x and y in Euclidean or Hilbert space
$\|x\|$: Euclidean norm of vector x
$\|A\|$: Euclidean norm of matrix A, $\|A\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|}$
$I_m$: the identity matrix of size m
$\mathrm{rk}(S)$: rank of matrix S
$P_n$: projective operator, $P_n x = (x_1, \ldots, x_n)^\top$, $x \in \mathbb{R}^\infty$
$\sqrt{A}$: square root of positive semidefinite matrix A; it is positive semidefinite as well, with $(\sqrt{A})^2 = A$
$\langle x, x^* \rangle$ or $\langle x^*, x \rangle$: value of functional $x^*$ at vector x
$I$: the identity operator
$L(X)$: space of linear bounded operators on normed space X
$R(A)$: range of operator A, $R(A) = \{y : \exists x,\ y = Ax\}$
$L^\perp$: orthogonal complement to set L
$L_2[a, b]$: Hilbert space of square integrable real functions with inner product $(x, y) = \int_a^b x(t) y(t)\,dt$, the latter being a Lebesgue integral
$l_p$: space of real sequences $x = (x_n)_1^\infty$ with norm $\|x\|_p = \left(\sum_{n=1}^\infty |x_n|^p\right)^{1/p}$ if 1 ≤ p < ∞, and $\|x\|_\infty = \sup_{n \geq 1} |x_n|$ if p = ∞. For p = 2, $l_2$ is a Hilbert space with inner product $(x, y) = \sum_{n=1}^\infty x_n y_n$
$l_{2,a}$: weighted $l_2$ space
$\mathrm{span}(M)$: span of set M, i.e. the set of all finite linear combinations of vectors from M
$\hat{A}_n$: cylinder in $\mathbb{R}^\infty$ with base $A_n \in B(\mathbb{R}^n)$
$\|A\| = \|A\|_{L(X)}$: operator norm of linear bounded operator A, $\|A\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|}$
$A^*$: adjoint operator
$\sqrt{B} = B^{1/2}$: square root of self-adjoint positive operator B
$|A|$: modulus of compact operator A, $|A| = (A^* A)^{1/2}$
$\|A\|_1$: nuclear norm of operator A
$\|A\|_2$: Hilbert–Schmidt norm of operator A
$A_n \Rightarrow A$: operators $A_n$ uniformly converge to operator A
$S_0(H)$: class of finite-dimensional operators on H
$S_1(H)$: class of nuclear operators on H
$S_2(H)$: class of Hilbert–Schmidt operators on H
$S_\infty(H)$: class of compact operators on H
$L_S(H)$: class of S-operators on H
$A \geq 0$: operator A is positive, i.e. $(Ax, x) \geq 0$ for all x
$A \geq B$: comparison in Loewner order of self-adjoint operators: A − B is a positive operator
$\prod_{k=1}^n A_k$: Cartesian product of sets $A_1, \ldots, A_n$
$\prod_{k=1}^n \mu_k$: product of measures $\mu_1, \ldots, \mu_n$
$\prod_{k=1}^\infty \mu_k$: product measure on $\mathbb{R}^\infty$ or on Hilbert space
$m_\mu$: mean value of measure μ
$\mathrm{Cov}(\mu)$: variance-covariance matrix of measure μ on $\mathbb{R}^n$
$\varphi_\mu$ or $\hat{\mu}$: characteristic function (or functional) of measure μ
$A_\mu$: operator of second moment of measure μ
$S_\mu$: correlation operator of measure μ
$\sigma_n(z_1, \ldots, z_n)$: weak moments of order n of Borel probability measure on H
$\mu_X$: distribution of random vector X or random element X, $\mu_X(B) = \mathsf{P}(X \in B)$ for all Borel sets B
$\varphi_X$: characteristic function (functional) of random vector (element) X
$\mathsf{E}X$: expectation of random vector (element) X
$\mathsf{D}X$: variance of random variable X
$\mathrm{Cov}(X)$: variance-covariance matrix of random vector X
$X \overset{d}{=} Y$: random vectors (elements) X and Y are identically distributed
$N(m, \sigma^2)$: Gaussian distribution with mean m and variance $\sigma^2$, σ ≥ 0
$N(m, S)$: Gaussian distribution on $\mathbb{R}^n$ (or on H) with mean value m and variance-covariance matrix (or correlation operator) S
$\overset{d}{\to}$: convergence in distribution of random elements
1 Gaussian Measures in Euclidean Space
1.1. The change of variables formula

Let (X, S, μ) be a measure space, i.e. X is a non-empty set, S is a sigma-algebra on X and μ is a measure on S. Consider also a measurable space (Y, F), i.e. Y is another non-empty set and F is a sigma-algebra on Y. Let T : X → Y be a measurable transformation, which means that

$$\forall A \in F, \quad T^{-1}A \in S. \qquad [1.1]$$

Hereafter

$$T^{-1}A := \{x \in X : Tx \in A\} \qquad [1.2]$$

is the preimage of A under T. (To simplify the notation, we write $T^{-1}A$ and $Tx$ rather than $T^{-1}(A)$ and $T(x)$, respectively, if it does not cause confusion.) Introduce a set function

$$\nu(A) = \mu(T^{-1}A), \quad A \in F. \qquad [1.3]$$
THEOREM 1.1.– (About induced measure) The set function ν given in [1.3] is a measure on F.

PROOF.– The function ν is well defined due to [1.1]. We have to show that it is not identically infinite, and that it is non-negative and sigma-additive. Indeed, $\nu(\emptyset) = \mu(T^{-1}\emptyset) = \mu(\emptyset) = 0$, and therefore, ν is not identically infinite. For each A ∈ F, ν(A) ≥ 0, because μ is a non-negative set function.
Gaussian Measures in Hilbert Space: Construction and Properties, First Edition. Alexander Kukush. © ISTE Ltd 2019. Published by ISTE Ltd and John Wiley & Sons, Inc.
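Finally, let $\{A_n, n \geq 1\}$ be disjoint sets from F. Then the preimages $\{T^{-1}A_n, n \geq 1\}$ are disjoint as well, and

$$\nu\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \mu\Big(T^{-1}\bigcup_{n=1}^{\infty} A_n\Big) = \mu\Big(\bigcup_{n=1}^{\infty} T^{-1}A_n\Big) = \sum_{n=1}^{\infty} \mu(T^{-1}A_n) = \sum_{n=1}^{\infty} \nu(A_n).$$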
Here, we used the sigma-additivity of μ. Thus, ν is a non-negative and sigma-additive set function on the sigma-algebra F, i.e. ν is a measure on F.

DEFINITION 1.1.– The set function ν given in [1.3] is called the measure induced by the transformation T and is denoted as $\mu T^{-1}$. The notation suggests how to evaluate ν(A):

$$(\mu T^{-1})(A) = \mu(T^{-1}A), \quad A \in F.$$
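As an illustration of definition 1.1, the induced measure can be computed explicitly when μ is a finite discrete measure. The following is a minimal sketch under that assumption; the names `pushforward` and `measure`, the choice of μ (a fair die) and the map T (remainder modulo 3) are hypothetical and serve only as an example, they are not taken from the book.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(mu, T):
    """Return the induced measure nu = mu T^{-1} as a dict {point: mass}.

    mu: dict mapping points of X to non-negative masses (a discrete measure).
    T:  callable X -> Y (every map is measurable w.r.t. the power set).
    """
    nu = defaultdict(Fraction)
    for x, mass in mu.items():
        nu[T(x)] += mass  # the mass sitting at x is carried to the point T(x)
    return dict(nu)


def measure(m, A):
    """Measure of the set A under the discrete measure m."""
    return sum((mass for point, mass in m.items() if point in A), Fraction(0))


# Illustrative data: uniform probability measure on X = {1, ..., 6}.
mu = {x: Fraction(1, 6) for x in range(1, 7)}


def T(x):
    """Illustrative map into Y = {0, 1, 2}."""
    return x % 3


nu = pushforward(mu, T)

# Check nu(A) = mu(T^{-1} A) for one set A.
A = {0, 1}
preimage = {x for x in mu if T(x) in A}  # T^{-1} A
assert measure(nu, A) == measure(mu, preimage)
```

In this sketch the identity ν(A) = μ(T⁻¹A) is checked for a single set A; looping over all subsets of Y would verify it for the whole sigma-algebra.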
For any measure space (Y, F, ν), denote by L(Y, ν) the space of Lebesgue integrable functions on Y w.r.t. measure ν. Let $f : Y \to \bar{\mathbb{R}}$ be an F-measurable function, i.e. for each Borel subset B of the extended real line $\bar{\mathbb{R}}$, it holds $f^{-1}B \in F$.

THEOREM 1.2.– (The change of variables formula) Assume that either f ≥ 0 or $f \in L(Y, \mu T^{-1})$. Then it holds

$$\int_X f(Tx)\,d\mu(x) = \int_Y f(y)\,d(\mu T^{-1})(y). \qquad [1.4]$$
PROOF.– Equality [1.4] is shown in a standard way: first for indicators, then for simple non-negative functions, then for f ≥ 0, and finally, for $f \in L(Y, \mu T^{-1})$.

a) Let A ∈ F,

$$f(y) = I_A(y) = \begin{cases} 1, & \text{if } y \in A, \\ 0, & \text{otherwise.} \end{cases}$$

Then

$$I_A(Tx) = \begin{cases} 1, & \text{if } Tx \in A, \\ 0, & \text{otherwise} \end{cases} = \begin{cases} 1, & \text{if } x \in T^{-1}A, \\ 0, & \text{otherwise,} \end{cases}$$

i.e. $I_A(Tx) = I_{T^{-1}A}(x)$. Hence

$$\int_X I_A(Tx)\,d\mu(x) = \int_X I_{T^{-1}A}(x)\,d\mu(x) = \mu(T^{-1}A) = (\mu T^{-1})(A),$$

$$\int_Y I_A(y)\,d(\mu T^{-1})(y) = (\mu T^{-1})(A),$$
and [1.4] follows for the indicator function.

b) Let f ≥ 0 be a simple F-measurable function. Then it admits a representation

$$f(y) = \sum_{k=1}^{m} a_k I_{A_k}(y), \quad y \in Y, \qquad [1.5]$$

with disjoint measurable sets $\{A_k, k = 1, \ldots, m\}$ and non-negative $a_k$. For the function [1.5], relation [1.4] follows due to part (a) of the proof and the linearity of the Lebesgue integral.

c) Let f be an arbitrary non-negative and F-measurable function. Then there exists a sequence $\{p_n(y), n \geq 1, y \in Y\}$ of non-negative, simple and F-measurable functions such that $p_n$ converges to f pointwise and $p_n(y) \leq p_{n+1}(y)$, n ≥ 1, y ∈ Y. By part (b) of the proof,

$$\int_X p_n(Tx)\,d\mu(x) = \int_Y p_n(y)\,d(\mu T^{-1})(y). \qquad [1.6]$$
Here, let $n$ tend to infinity. By the monotone convergence theorem, [1.6] implies [1.4].

d) Finally, let $f \in L(Y, \mu T^{-1})$ and
$$f_+(y) := \max\{f(y), 0\}, \quad f_-(y) := -\min\{f(y), 0\}, \quad y \in Y.$$
By part (c) of the proof,
$$\int_X f_+(Tx)\,d\mu(x) = \int_Y f_+(y)\,d(\mu T^{-1})(y), \tag*{[1.7]}$$
$$\int_X f_-(Tx)\,d\mu(x) = \int_Y f_-(y)\,d(\mu T^{-1})(y). \tag*{[1.8]}$$
Subtract [1.8] from [1.7] and obtain [1.4] using the definition of Lebesgue integral.
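On a finite set, both sides of [1.4] are finite sums, so the formula can be checked exactly. The sketch below is only an illustration: the space $X = \{0, \dots, 5\}$, the weights, the map $Tx = x \bmod 3$ and the function $f$ are assumptions of ours, not objects from the text.

```python
from fractions import Fraction

# Illustrative assumptions: X = {0,...,5}, mu({x}) = (x + 1)/21, T x = x mod 3.
X = range(6)
mu = {x: Fraction(x + 1, 21) for x in X}
T = lambda x: x % 3
f = lambda y: y * y - 1          # an arbitrary real function on Y = {0, 1, 2}

# Left-hand side of [1.4]: integrate f(Tx) against mu over X.
lhs = sum(f(T(x)) * mu[x] for x in X)

# Induced measure: (mu T^{-1})({y}) = mu(T^{-1}{y}).
induced = {}
for x in X:
    induced[T(x)] = induced.get(T(x), Fraction(0)) + mu[x]

# Right-hand side of [1.4]: integrate f against the induced measure over Y.
rhs = sum(f(y) * mass for y, mass in induced.items())
print(lhs, rhs)
```

Exact rational arithmetic is used so that the two sides can be compared without rounding error.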
Problems 1.1

1) Let $\lambda_2^T$ be Lebesgue measure on $[0, T]^2$, and let $\pi_1 : [0, T]^2 \to \mathbb{R}$, $\pi_1(x_1, x_2) = x_1$, $(x_1, x_2) \in [0, T]^2$. Show that
$$(\lambda_2^T \pi_1^{-1})(A) = T \cdot \lambda_1(A \cap [0, T]), \quad A \in \mathcal{S}_1,$$
where $\lambda_1$ is Lebesgue measure on $\mathbb{R}$ and $\mathcal{S}_1$ is the sigma-algebra of Lebesgue measurable sets on $\mathbb{R}$.

2) Let $\mu_1$ and $\mu_2$ be finite measures on the Borel sigma-algebra $\mathcal{B}(\mathbb{R})$, and $\pi_1(x_1, x_2) = x_1$, $(x_1, x_2) \in \mathbb{R}^2$. Find the induced measure $(\mu_1 \times \mu_2)\pi_1^{-1}$.

3) For the objects of theorem 1.1, prove the following: if $\mu T^{-1}$ is sigma-finite, then $\mu$ is sigma-finite as well. Does the converse hold true?

4) Let $f : Y \to \bar{\mathbb{R}}$ be any $\mathcal{F}$-measurable function. Show that the Lebesgue integral on the left-hand side of [1.4] is well defined if, and only if, the integral on the right-hand side of [1.4] is well defined. Moreover, whenever they are well defined, they coincide.

1.2. Invariance of Lebesgue measure

Consider a measure space $(X, \mathcal{S}, \mu)$ and a measurable transformation $T : X \to X$.

DEFINITION 1.2.– The measure $\mu$ is called invariant under $T$, or $T$-invariant, if $\mu T^{-1} = \mu$.

REMARK 1.1.– Assume additionally that $T$ is a bijection on $X$, and moreover $T^{-1}$ is a measurable transformation as well. Then $\mu$ is $T$-invariant if, and only if,
$$\mu(B) = \mu(TB), \quad \forall B \in \mathcal{S}. \tag*{[1.9]}$$
(Hereafter $TB$ denotes the image of $B$ under $T$.)

PROOF.– a) Let $\mu$ be $T$-invariant and $B \in \mathcal{S}$. Because $T^{-1}$ is measurable, $A := TB \in \mathcal{S}$. It holds $B = T^{-1}A$, and $\mu(T^{-1}A) = \mu(A)$. Equality [1.9] follows.

b) Conversely, assume [1.9] and take any $A \in \mathcal{S}$. Denote $B_0 = T^{-1}A$, $B_0 \in \mathcal{S}$. Then
$$(\mu T^{-1})(A) = \mu(B_0) = \mu(TB_0) = \mu(A),$$
and $\mu$ is $T$-invariant.

EXAMPLE 1.1.– (Counting measure) Let $X = \{1, 2, \dots, n\}$, let $\mathcal{S} = 2^X$ be the sigma-algebra of all subsets of $X$, and let $\mu$ be the counting measure on $X$, i.e. $\mu(A) = |A|$, $A \in \mathcal{S}$. (Hereafter $|A|$ is the number of points in a set $A$; if $A$ is infinite, $|A| = +\infty$.) Then $\mu$ is invariant under any bijection $\pi$ on $X$. Indeed, $\mu(\pi^{-1}A) = |\pi^{-1}A| = |A| = \mu(A)$, $A \in \mathcal{S}$.

In this section, we show that Lebesgue measure $\lambda_n$ on $\mathbb{R}^n$ is rotation and translation invariant. Hereafter, we suppose that the Euclidean space $\mathbb{R}^n$ consists of column vectors $x = (x_1, \dots, x_n)^\top$.

DEFINITION 1.3.– An affine transformation of $\mathbb{R}^n$ is a mapping of the form $Tx = Lx + c$, with $L \in \mathbb{R}^{n \times n}$ and $c \in \mathbb{R}^n$. Such a transformation is called non-singular if $L$ is non-singular. Otherwise, if $\det L = 0$, then $T$ is called a singular affine transformation.

REMARK 1.2.– An affine transformation $Tx = Lx + c$ is invertible if, and only if, it is non-singular. In this case, the inverse transformation is a non-singular affine transformation as well, and it acts as follows:
$$T^{-1}y = L^{-1}y - L^{-1}c, \quad y \in \mathbb{R}^n.$$
Remember that non-singular affine transformations of the plane include rotations, translations and axial symmetries.

THEOREM 1.3.– (Transformation of Lebesgue measure at Borel sets) Consider Lebesgue measure $\lambda_n$ on the Borel sigma-algebra $\mathcal{B}(\mathbb{R}^n)$. Let $Tx = Lx + c$ be a non-singular affine transformation on $\mathbb{R}^n$. Then
$$\lambda_n T^{-1} = \frac{1}{|\det L|}\,\lambda_n. \tag*{[1.10]}$$
PROOF.– The transformation $T$ is continuous and, therefore, Borel measurable. Hence the induced measure $\lambda_n T^{-1}$ on $\mathcal{B}(\mathbb{R}^n)$ is well defined.

a) For $a = (a_k)_1^n$ and $b = (b_k)_1^n$ with $a_k < b_k$, $k = 1, \dots, n$, denote
$$[a, b] = \prod_{k=1}^{n} [a_k, b_k], \quad (a, b] = \prod_{k=1}^{n} (a_k, b_k]. \tag*{[1.11]}$$
Hereafter, $\prod_{k=1}^{n} A_k$ stands for the Cartesian product of $A_1, \dots, A_n$. Evaluate
$$(\lambda_n T^{-1})([a, b]) = \lambda_n(T^{-1}[a, b]) = \int_{T^{-1}[a, b]} d\lambda_n = \int_{T^{-1}[a, b]} dx.$$
The latter integral is a Riemann integral over the compact and Jordan measurable set $T^{-1}[a, b]$. The change of variables $y = Tx$ in the Riemann integral leads to the following:
$$\lambda_n(T^{-1}[a, b]) = \int_{[a, b]} \left|\det \frac{\partial y}{\partial x}\right|^{-1} dy = \frac{m_n([a, b])}{|\det L|} = \frac{\lambda_n([a, b])}{|\det L|}.$$
Here $m_n$ is Jordan measure on $\mathbb{R}^n$.
b) Consider a set $(a, b]$ introduced in [1.11], and let $\{a_k(m), m \ge 1\}$ be a decreasing sequence that converges to $a_k$ such that $a_k(m) < b_k$, $m \ge 1$; $k = 1, \dots, n$. Denote $a(m) = (a_k(m))_{k=1}^{n} \in \mathbb{R}^n$. Then $A_m := [a(m), b]$ is a monotone sequence of sets that converges to $(a, b]$. The continuity of Lebesgue measure from below implies
$$(\lambda_n T^{-1})((a, b]) = \lim_{m \to \infty} \lambda_n(T^{-1}A_m) = \lim_{m \to \infty} \frac{\lambda_n(A_m)}{|\det L|} = \frac{\lambda_n((a, b])}{|\det L|}.$$
Here, we used part (a) of the proof.

c) Thus, the two measures $\lambda_n T^{-1}$ and $\frac{\lambda_n}{|\det L|}$ in [1.10] coincide on the semiring $P_n$ that consists of all bricks $(a, b]$ from [1.11]. Both measures are sigma-finite, and therefore, they coincide on $\sigma r(P_n) = \mathcal{B}(\mathbb{R}^n)$, where $\sigma r(P_n)$ is the sigma-ring generated by $P_n$.
Now, we extend theorem 1.3 to Lebesgue measure $\lambda_n$ on the sigma-algebra $\mathcal{S}_n$ of Lebesgue measurable sets in $\mathbb{R}^n$.

LEMMA 1.1.– A non-singular affine transformation $Tx = Lx + c$ is $(\mathcal{S}_n, \mathcal{S}_n)$-measurable, i.e. for any $A \in \mathcal{S}_n$, $T^{-1}A \in \mathcal{S}_n$ as well.
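PROOF.– It is known (see [HAL 13]) that
$$\mathcal{S}_n = \{B \cup N \mid B \in \mathcal{B}(\mathbb{R}^n),\ N \subset N_0 \text{ with } N_0 \in \mathcal{B}(\mathbb{R}^n),\ \lambda_n(N_0) = 0\}. \tag*{[1.12]}$$
Let $A \in \mathcal{S}_n$; then $A = B \cup N$, with $B$ and $N$ described in [1.12]. It holds
$$T^{-1}A = T^{-1}B \cup T^{-1}N, \quad T^{-1}B \in \mathcal{B}(\mathbb{R}^n), \tag*{[1.13]}$$
$$T^{-1}N \subset T^{-1}N_0, \quad T^{-1}N_0 \in \mathcal{B}(\mathbb{R}^n), \quad \lambda_n(T^{-1}N_0) = \frac{\lambda_n(N_0)}{|\det L|} = 0. \tag*{[1.14]}$$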
Here, we used theorem 1.3 and the fact that $T$ is a Borel function. Decompositions [1.13] and [1.14] show that $T^{-1}A \in \mathcal{S}_n$.

THEOREM 1.4.– (Transformation of Lebesgue measure) Let $Tx = Lx + c$ be a non-singular affine transformation on $\mathbb{R}^n$. For Lebesgue measure $\lambda_n$ on $\mathcal{S}_n$, it holds
$$\lambda_n T^{-1} = \frac{\lambda_n}{|\det L|}.$$

PROOF.– Consider $A \in \mathcal{S}_n$ and decompose $T^{-1}A$ as in [1.13] and [1.14]. Because $\lambda_n(N) = \lambda_n(T^{-1}N) = 0$, we have by theorem 1.3:
$$\lambda_n(T^{-1}A) = \lambda_n(T^{-1}B) = \frac{\lambda_n(B)}{|\det L|} = \frac{\lambda_n(A)}{|\det L|}.$$
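In the plane, the relation $\lambda_2(T^{-1}A) = \lambda_2(A)/|\det L|$ can be illustrated for a polygon $A$, because $T^{-1}A$ is again a polygon whose area is given by the shoelace formula. The following is a sketch under our own assumptions: the matrix `L`, the shift `c`, the unit square as the set $A$ and the helper names are illustrative choices.

```python
def inverse_affine_2d(L, c):
    """Return T^{-1} and det L for T x = L x + c on the plane (L is a 2x2 nested list)."""
    (a, b), (p, q) = L
    det = a * q - b * p
    if det == 0:
        raise ValueError("singular affine transformation")
    def T_inv(y):
        u, v = y[0] - c[0], y[1] - c[1]                          # y - c
        return ((q * u - b * v) / det, (-p * u + a * v) / det)   # L^{-1}(y - c)
    return T_inv, det

def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

# Illustrative assumptions: L, c and the unit square [0,1]^2 as the set A.
L = [[2.0, 1.0], [0.5, 3.0]]
c = (1.0, -2.0)
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

T_inv, det_L = inverse_affine_2d(L, c)
preimage = [T_inv(v) for v in square]        # T^{-1}A is a parallelogram
area_preimage = polygon_area(preimage)
predicted = polygon_area(square) / abs(det_L)
print(area_preimage, predicted)
```

The shift `c` moves the parallelogram but does not change its area, in line with the translation invariance stated in the corollary that follows.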
COROLLARY 1.1.– (Criterion for invariance of Lebesgue measure) Lebesgue measure $\lambda_n$ on $\mathcal{S}_n$ is invariant under a non-singular affine transformation $Tx = Lx + c$ if, and only if, $\det L = \pm 1$. In particular, $\lambda_n$ is symmetric around the origin, and it is invariant under translations $Tx = x + c$ and orthogonal transformations $Tx = Ux$, where $U$ is an orthogonal matrix (i.e. $U^{-1} = U^\top$), e.g. under symmetries w.r.t. hyperplanes that pass through the origin. In the planar case ($n = 2$), $\lambda_2$ is invariant under the transformation
$$Tx = \begin{pmatrix} 2x_1 \\ \tfrac{1}{2}x_2 \end{pmatrix}, \quad x \in \mathbb{R}^2.$$
Here, $T$ is a dilation along the $x_1$-axis with coefficient 2 and a contraction along the $x_2$-axis with the same coefficient.

COROLLARY 1.2.– (Affine change of variables) Let $Tx = Lx + c$ be a non-singular affine transformation on $\mathbb{R}^n$ and $f : \mathbb{R}^n \to \bar{\mathbb{R}}$ be a Lebesgue measurable function, which is either non-negative or belongs to $L(\mathbb{R}^n, \lambda_n)$. Then it holds
$$\int_{\mathbb{R}^n} f(Tx)\,d\lambda_n(x) = \frac{1}{|\det L|} \int_{\mathbb{R}^n} f(y)\,d\lambda_n(y).$$

PROOF.– Apply theorems 1.2 and 1.4:
$$\int_{\mathbb{R}^n} f(Tx)\,d\lambda_n(x) = \int_{\mathbb{R}^n} f(y)\,d(\lambda_n T^{-1})(y) = \frac{1}{|\det L|} \int_{\mathbb{R}^n} f(y)\,d\lambda_n(y).$$
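In dimension one, corollary 1.2 reads $\int_{\mathbb{R}} f(ax + c)\,dx = \frac{1}{|a|}\int_{\mathbb{R}} f(y)\,dy$ for $a \ne 0$. A rough numerical illustration follows; the integrand $f(y) = e^{-y^2}$, the values of `a` and `c`, the truncation of the real line to a bounded interval and the midpoint rule are all assumptions made for this sketch.

```python
import math

def midpoint_integral(g, lo, hi, steps=200_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

f = lambda y: math.exp(-y * y)   # integrable; its integral over R equals sqrt(pi)
a, c = -3.0, 1.0                 # illustrative T x = a x + c, so |det L| = |a|

# The tails outside the chosen intervals are negligible for this integrand.
lhs = midpoint_integral(lambda x: f(a * x + c), -10.0, 10.0)
rhs = midpoint_integral(f, -30.0, 30.0) / abs(a)
print(lhs, rhs, math.sqrt(math.pi) / abs(a))
```

The negative value of `a` is chosen on purpose: the factor in the corollary is $1/|\det L|$, not $1/\det L$.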
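Problems 1.2

5) Let $Tx = |x|$, $x \in \mathbb{R}$. Find $\lambda_1 T^{-1}$.

6) Let $f : \mathbb{R} \to \mathbb{R}$ be a Lebesgue measurable function such that $|f(x) - f(y)| \ge |x - y|$ for all $x, y \in \mathbb{R}$. Prove that $f$ is $(\mathcal{S}_1, \mathcal{S}_1)$-measurable, where $\mathcal{S}_1$ is the sigma-algebra of Lebesgue measurable sets on the real line.

7) Show that $\arctan x$ is an $(\mathcal{S}_1, \mathcal{S}_1)$-measurable function. Let $f : \left(-\frac{\pi}{2}, \frac{\pi}{2}\right) \to [0, +\infty]$ be a Lebesgue measurable function. Prove that
$$\int_{\mathbb{R}} f(\arctan x)\,d\lambda_1(x) = \int_{\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)} \frac{f(t)}{\cos^2 t}\,d\lambda_1(t).$$

8) Show that $e^x$ is an $(\mathcal{S}_1, \mathcal{S}_1)$-measurable function. Let $f : (0, \infty) \to [0, +\infty]$ be a Lebesgue measurable function. Prove that
$$\int_{\mathbb{R}} f(e^x)\,d\lambda_1(x) = \int_{(0, +\infty)} \frac{f(t)}{t}\,d\lambda_1(t).$$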
9) Prove that $f(x) = \|x\|$, $x \in \mathbb{R}^n$, is an $(\mathcal{S}_n, \mathcal{S}_1)$-measurable function.

10) Let $f : [0, +\infty) \to [0, +\infty]$ be a Lebesgue measurable function. Prove that the measure
$$\mu(A) = \int_A f(\|x\|)\,d\lambda_n(x), \quad A \in \mathcal{S}_n,$$
is invariant under unitary operators in $\mathbb{R}^n$.

11) Let $\mu$ be a measure on $\mathcal{S}_n$, which is finite at each bounded set from $\mathcal{S}_n$, absolutely continuous w.r.t. $\lambda_n$ and invariant under unitary operators in $\mathbb{R}^n$. Prove that there exists a Borel function $f : [0, +\infty) \to [0, +\infty)$ such that the representation from Problem 10 holds true.

Hint. Given a locally Lebesgue integrable function $g$ on $\mathbb{R}^n$, a point $x \in \mathbb{R}^n$ is a Lebesgue point of $g$ if
$$\lim_{r \to 0+} \frac{1}{\lambda_n(B(x, r))} \int_{B(x, r)} |g(y) - g(x)|\,d\lambda_n(y) = 0.$$
Hereafter, $B(x, r)$ is the open ball centered at $x$ with radius $r$. Use the Lebesgue differentiation theorem [BOG 07], which states that, given any locally Lebesgue integrable function $g$ on $\mathbb{R}^n$, almost every $x$ is a Lebesgue point of $g$.

12) Let $g : \mathbb{R}^n \to \mathbb{R}$ be a Lebesgue measurable function such that $g(Tx) = g(x)$ (mod $\lambda_n$), for all unitary operators $T$ in $\mathbb{R}^n$. Prove that there exists a Borel function $f : [0, +\infty) \to \mathbb{R}$, with $g(x) = f(\|x\|)$ (mod $\lambda_n$).

13) Let $\alpha > 0$ and $f \in L(\mathbb{R}, \lambda_1)$. Prove that $f(n^{1+\alpha}x) \to 0$ as $n \to \infty$ for almost all $x \in \mathbb{R}$. Extend this statement to functions from $L(\mathbb{R}^m, \lambda_m)$.

14) Let $f : \mathbb{R} \to \bar{\mathbb{R}}$, $f \in L([0, +\infty), \lambda_1)$. Prove the following:

a) If $f$ is an even function, then $\int_{\mathbb{R}} f\,d\lambda_1 = 2\int_{[0, +\infty)} f\,d\lambda_1$.

b) If $f$ is an odd function, then $\int_{\mathbb{R}} f\,d\lambda_1 = 0$.

15) Let $f : [-1, 1] \to (0, +\infty)$ be a Lebesgue measurable function. Find the integral $\int_{[-1, 1]} \frac{f(x)}{f(x) + f(-x)}\,d\lambda_1(x)$.
1.3. Absence of invariant measure in infinite-dimensional Hilbert space

Let $H$ be a real Hilbert space, with Borel sigma-algebra $\mathcal{B}(H)$. In this section, we search for a measure $\lambda$ on $\mathcal{B}(H)$ with the following properties:

i) $\lambda$ is positive at each non-empty open set;
ii) $\lambda$ is finite at each bounded Borel set;
iii) $\lambda$ is invariant under each translation $Tx = x + c$, $x \in H$, with $c \in H$.

Remember that a linear operator $U$ in $H$ is called unitary if $\|Ux\| = \|x\|$, $x \in H$, and $R(U) = H$. An operator $U \in L(H)$ is unitary if, and only if, $U^* = U^{-1}$.

iv) $\lambda$ is invariant under each unitary operator in $H$.

Note that Lebesgue measure $\lambda_n$ on $\mathcal{B}(\mathbb{R}^n)$ possesses the properties (i)–(iv).

THEOREM 1.5.– (Absence of invariant measure in $H$) Let $H$ be an infinite-dimensional real Hilbert space. Then:

a) There is no measure $\lambda$ with properties (i)–(iii).
b) There is no measure $\lambda$ with properties (i), (ii) and (iv).

PROOF.– Because $\dim(H) = \infty$, there exists an infinite orthonormal system $\{e_n, n \ge 1\}$ in $H$. For $k \ne m$, $\|e_k - e_m\| = \sqrt{2}$, hence the open balls $B\left(e_n, \frac{\sqrt{2}}{2}\right)$, $n \ge 1$, are disjoint. For $x \in B\left(e_n, \frac{\sqrt{2}}{2}\right)$, it holds $\|x\| \le \|e_n\| + \|x - e_n\| < 1 + \frac{\sqrt{2}}{2} < 2$, and
$$B\left(e_n, \frac{\sqrt{2}}{2}\right) \subset B(0, 2), \quad \bigcup_{n=1}^{\infty} B\left(e_n, \frac{\sqrt{2}}{2}\right) \subset B(0, 2). \tag*{[1.15]}$$

a) Let $\lambda$ have the properties (i)–(iii). For $k \ne m$, the translation $Tx = x + e_m - e_k$, $x \in H$, maps the ball $B\left(e_k, \frac{\sqrt{2}}{2}\right)$ onto $B\left(e_m, \frac{\sqrt{2}}{2}\right)$. Hence by (i) and (iii),
$$\lambda\left(B\left(e_k, \frac{\sqrt{2}}{2}\right)\right) = \lambda\left(B\left(e_m, \frac{\sqrt{2}}{2}\right)\right) = a > 0. \tag*{[1.16]}$$
Due to [1.15] and [1.16], we have
$$\lambda(B(0, 2)) \ge \sum_{n=1}^{\infty} \lambda\left(B\left(e_n, \frac{\sqrt{2}}{2}\right)\right) = \sum_{n=1}^{\infty} a = +\infty, \quad \lambda(B(0, 2)) = +\infty.$$
But this contradicts property (ii). Therefore, such a measure $\lambda$ does not exist.
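It may help to see why the same construction produces no contradiction in $\mathbb{R}^d$. There, only $d$ disjoint balls $B(e_k, \frac{\sqrt{2}}{2})$ fit into $B(0, 2)$, and by the standard volume formula for Euclidean balls they occupy the share $d\,(\sqrt{2}/4)^d$ of $\lambda_d(B(0, 2))$, which tends to zero as $d$ grows. The sketch below tabulates this share; the helper names are our own and the computation only illustrates the finite-dimensional case.

```python
import math

def ball_volume(radius, d):
    """Lebesgue measure of a ball of the given radius in R^d."""
    return math.pi ** (d / 2) * radius ** d / math.gamma(d / 2 + 1)

def basis_distance(d, k, m):
    """Euclidean distance between the basis vectors e_k and e_m of R^d."""
    e_k = [1.0 if i == k else 0.0 for i in range(d)]
    e_m = [1.0 if i == m else 0.0 for i in range(d)]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e_k, e_m)))

def occupied_fraction(d):
    """Share of lambda_d(B(0, 2)) covered by the d disjoint balls B(e_k, sqrt(2)/2)."""
    r = math.sqrt(2) / 2
    return d * ball_volume(r, d) / ball_volume(2.0, d)

for d in (2, 5, 10, 20, 50):
    print(d, basis_distance(d, 0, 1), occupied_fraction(d))
```

In finite dimension the small balls have ever smaller measure relative to $B(0, 2)$, whereas in $H$ infinitely many disjoint balls of one and the same positive measure $a$ must fit into $B(0, 2)$.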
b) Now, assume that $\lambda$ has the properties (i), (ii) and (iv). Construct a unitary operator $U$ that maps $B\left(e_k, \frac{\sqrt{2}}{2}\right)$ onto $B\left(e_m, \frac{\sqrt{2}}{2}\right)$, with fixed $k \ne m$. Let $L$ be the subspace generated by $\{e_n, n = 1, 2, \dots\}$. Each $x \in H$ can be decomposed as
$$x = \sum_{n=1}^{\infty} (x, e_n)e_n + z, \quad z \in L^{\perp}.$$
The isometric operator
$$Ux = (x, e_k)e_m + (x, e_m)e_k + \sum_{n \ne k,\, n \ne m} (x, e_n)e_n + z$$
is a surjection, and hence is unitary. It maps $B\left(e_k, \frac{\sqrt{2}}{2}\right)$ onto $B\left(e_m, \frac{\sqrt{2}}{2}\right)$, and by properties (iv) and (i), relation [1.16] holds. The rest of the proof follows the lines of part (a) of the proof.

As we see, in the space $l_2$ of sequences and in the space $L_2[a, b]$ of functions there is no measure analogous to Lebesgue measure. Nevertheless, we will construct a measure in an infinite-dimensional Hilbert space, which is invariant under quite a large group of transformations. It will be a Gaussian measure.

Problems 1.3

16) Prove that there is no measure $\lambda$ on $\mathcal{B}(l_\infty)$ with properties (i) and (ii) from section 1.3, where $l_\infty$ is the space of real bounded sequences with the supremum norm.

17) Let $X$ be a real normed space, with $\dim(X) = \infty$. Prove that there is no measure $\lambda$ on $\mathcal{B}(X)$ with properties (i)–(iii) from section 1.3.

18) A linear bijection $V$ on a normed space $X$ is called an isometry if $\|Vx\| = \|x\|$, $x \in X$. Prove that there is no measure $\lambda$ on $\mathcal{B}(l_p)$ with properties (i) and (ii) from section 1.3 and such that $\lambda$ is invariant under all isometries on $l_p$, $1 \le p < \infty$.

19) Let $\varphi(t)$, $t \in [0, 1]$, be a continuous increasing function, with $\varphi(0) = 0$, $\varphi(1) = 1$, and $\varphi(t) < t$, $t \in (0, 1)$. In the Banach space $X = C[0, 1]$, introduce a transformation $(Tx)(t) = x(\varphi(t))$, $t \in [0, 1]$, $x \in X$. Prove that there is no measure $\lambda$ on $\mathcal{B}(X)$ with properties (i) and (ii) from section 1.3 and such that it is $T$-invariant.

1.4. Random vectors and their distributions

Remember that a probability measure is a measure on a sigma-algebra, which is equal to 1 at the total space. A measure space $(\Omega, \mathcal{F}, \mathsf{P})$ is called a probability space if $\mathsf{P}$ is a probability measure, i.e. $\mathsf{P}(\Omega) = 1$.
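1.4.1. Random variables

A random variable (r.v.) on a probability space $(\Omega, \mathcal{F}, \mathsf{P})$ is just an $\mathcal{F}$-measurable function on $\Omega$.

DEFINITION 1.4.– Let $X = X(\omega)$ be a r.v. on a probability space $(\Omega, \mathcal{F}, \mathsf{P})$. The distribution of $X$ is a probability measure $\mu_X$ defined as follows:
$$\mu_X(B) = \mathsf{P}\{\omega : X(\omega) \in B\}, \quad B \in \mathcal{B}(\mathbb{R}).$$

Note that $\mu_X$ is the measure induced by the mapping $X : \Omega \to \mathbb{R}$, i.e. $\mu_X = \mathsf{P}X^{-1}$ (see definition 1.1).

A Borel function $f : \mathbb{R} \to \mathbb{R}$ is called the probability density function (pdf) of a r.v. $X$ if
$$\mathsf{P}\{X(\omega) \in B\} = \int_B f(t)\,d\lambda_1(t), \quad B \in \mathcal{B}(\mathbb{R}).$$
Actually, this means that the distribution $\mu_X \ll \lambda_1$, where the Lebesgue measure $\lambda_1$ is considered on $\mathcal{B}(\mathbb{R})$, and moreover the Radon–Nikodym derivative $\frac{d\mu_X}{d\lambda_1} = f(t)$ (mod $\lambda_1$).

DEFINITION 1.5.– A r.v. $\gamma$ is called normal (or normally distributed) if it has a pdf of the form
$$\rho(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x - m)^2}{2\sigma^2}}, \quad x \in \mathbb{R},$$
with parameters $m \in \mathbb{R}$ and $\sigma > 0$. This is denoted as follows: $\gamma \sim N(m, \sigma^2)$. If $\gamma \sim N(m, \sigma^2)$, then it holds
$$\mathsf{E}\gamma = m, \quad \mathsf{D}\gamma = \sigma^2.$$
Hereafter, $\mathsf{E}$ stands for the expectation and $\mathsf{D}$ stands for the variance of a r.v. Remember that
$$\mathsf{E}\gamma = \int_\Omega \gamma(\omega)\,d\mathsf{P}(\omega), \quad \mathsf{D}\gamma = \mathsf{E}(\gamma - m)^2 = \int_\Omega (\gamma(\omega) - m)^2\,d\mathsf{P}(\omega).$$
By the change of variables formula (theorem 1.2), it holds
$$\mathsf{E}\gamma = \int_{\mathbb{R}} x\,d\mu_\gamma(x), \quad \mathsf{D}\gamma = \int_{\mathbb{R}} (x - m)^2\,d\mu_\gamma(x),$$
and since $\gamma$ has pdf equal to $\rho$, we have
$$m = \mathsf{E}\gamma = \int_{\mathbb{R}} x\rho(x)\,dx, \quad \sigma^2 = \mathsf{D}\gamma = \int_{\mathbb{R}} (x - m)^2\rho(x)\,dx.$$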
The latter integrals are Lebesgue integrals.

DEFINITION 1.6.– A r.v. $\gamma$ has the degenerate normal distribution $N(m, 0)$ if $\gamma(\omega) = m$ almost surely (a.s.). We denote it as $\gamma \sim N(m, 0)$. In this case, $\mathsf{E}\gamma = m$, $\mathsf{D}\gamma = 0$, and the distribution $\mu_\gamma$ is a point measure concentrated at $m$:
$$\mu_\gamma(B) = I_B(m), \quad B \in \mathcal{B}(\mathbb{R}).$$
Such a measure is called the Dirac measure at point $m$ and is denoted as $\delta_m$.

DEFINITION 1.7.– A r.v. $\gamma$ is called Gaussian if it is either normally distributed (with positive variance) or has a degenerate normal distribution (with zero variance).

Thus, a Gaussian r.v. $\gamma$ satisfies $\gamma \sim N(m, \sigma^2)$ with some $m \in \mathbb{R}$ and $\sigma \ge 0$. If $\sigma > 0$, then $\mu_\gamma \ll \lambda_1$, and if $\sigma = 0$, then $\mu_\gamma = \delta_m \perp \lambda_1$ (i.e. $\mu_\gamma$ is singular to $\lambda_1$).

DEFINITION 1.8.– A r.v. $\gamma \sim N(0, 1)$ is called standard normal.

Remember that the characteristic function $\varphi_\xi$ of a r.v. $\xi$ is as follows:
$$\varphi_\xi(t) = \mathsf{E}\,e^{it\xi}, \quad t \in \mathbb{R}.$$
A normal r.v. $\gamma \sim N(m, \sigma^2)$ has the characteristic function
$$\varphi_\gamma(t) = \exp\left\{imt - \frac{\sigma^2 t^2}{2}\right\}. \tag*{[1.17]}$$
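Formula [1.17] can be compared with a direct numerical evaluation of $\int_{\mathbb{R}} e^{itx}\rho(x)\,dx$. The following is a sketch only: the parameter values, the truncation of the integral to $m \pm 12\sigma$ and the midpoint rule are assumptions made for the illustration.

```python
import cmath
import math

def normal_pdf(x, m, sigma):
    """The density rho of N(m, sigma^2) from definition 1.5."""
    return math.exp(-(x - m) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def char_fn_numeric(t, m, sigma, half_width=12.0, steps=100_000):
    """Midpoint-rule approximation of the integral of e^{itx} rho(x) dx."""
    lo = m - half_width * sigma
    h = 2 * half_width * sigma / steps
    total = 0j
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += cmath.exp(1j * t * x) * normal_pdf(x, m, sigma)
    return total * h

def char_fn_formula(t, m, sigma):
    """Right-hand side of [1.17]."""
    return cmath.exp(1j * m * t - sigma ** 2 * t ** 2 / 2)

m, sigma, t = 1.5, 2.0, 0.7      # illustrative parameter values
numeric = char_fn_numeric(t, m, sigma)
exact = char_fn_formula(t, m, sigma)
print(numeric, exact)
```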
If $\gamma$ has the degenerate normal distribution $N(m, 0)$, then $\varphi_\gamma(t) = \exp\{imt\}$. Thus, relation [1.17] holds true for any Gaussian r.v. $\gamma \sim N(m, \sigma^2)$, with $\sigma \ge 0$.

1.4.2. Random vectors

Remember that a random vector $X$ distributed in $\mathbb{R}^n$ is an $\mathcal{F}$-measurable mapping $X : \Omega \to \mathbb{R}^n$, where $(\Omega, \mathcal{F}, \mathsf{P})$ is the underlying probability space. A mapping $X(\omega) = (X_1(\omega), \dots, X_n(\omega))^\top$, $\omega \in \Omega$, is a random vector if, and only if, all $X_k(\omega)$, $k = 1, \dots, n$, are random variables. For a random vector $X$, its expectation is defined coordinate-wise:
$$\mathsf{E}X = (\mathsf{E}X_1, \dots, \mathsf{E}X_n)^\top =: m, \quad m = (m_1, \dots, m_n)^\top \in \mathbb{R}^n.$$
In case $\mathsf{E}X_k^2 < \infty$ for all $k = 1, \dots, n$, its variance–covariance matrix $\mathrm{Cov}(X) = (s_{ij})_{i,j=1}^{n}$ is defined as follows:
$$s_{ij} = \mathrm{Cov}(X_i, X_j) = \mathsf{E}(X_i - m_i)(X_j - m_j), \quad i, j = 1, \dots, n. \tag*{[1.18]}$$
Hereafter, the expectation $\mathsf{E}$ is considered as an operator that acts on the total product under its sign, and we omit brackets for brevity. The variance–covariance matrix can be expressed as
$$\mathrm{Cov}(X) = \mathsf{E}(X - m)(X - m)^\top. \tag*{[1.19]}$$
Here, the expectation of a random matrix is a matrix composed of expectations of entries, according to [1.18]. The variance–covariance matrix is a positive semidefinite matrix, i.e. it is symmetric and the corresponding quadratic form is non-negative. The variance–covariance matrix of a random vector $X$ exists if, and only if, $\mathsf{E}\|X\|^2 < \infty$.

LEMMA 1.2.– (Variance–covariance matrix under linear transform) Let $X$ be a random vector in $\mathbb{R}^n$, with variance–covariance matrix $S$, and $A \in \mathbb{R}^{p \times n}$. Then $AX$ is a random vector in $\mathbb{R}^p$, with
$$\mathrm{Cov}(AX) = ASA^\top. \tag*{[1.20]}$$

PROOF.– The mapping $AX : \Omega \to \mathbb{R}^p$ is $\mathcal{F}$-measurable, as a Borel function of a random vector. Hence $AX$ is a random vector in $\mathbb{R}^p$, and because $\mathsf{E}\|AX\|^2 \le \|A\|^2 \cdot \mathsf{E}\|X\|^2 < \infty$, it has a variance–covariance matrix. Hereafter, $\|A\|$ is the Euclidean norm of the matrix $A$,
$$\|A\| = \sup_{x \ne 0} \frac{\|Ax\|}{\|x\|}.$$
Now, $\mathsf{E}(AX) = A(\mathsf{E}X) = Am$, where $m = \mathsf{E}X$, and
$$\mathrm{Cov}(AX) = \mathsf{E}(AX - Am)(AX - Am)^\top = \mathsf{E}[A(X - m)(X - m)^\top A^\top] = A \cdot \mathsf{E}(X - m)(X - m)^\top \cdot A^\top = ASA^\top.$$
Here, we used the linearity of the operator of taking expectation.
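Relation [1.20] can be verified directly when $X$ takes finitely many values with equal probabilities, since all expectations are then finite averages. A minimal sketch follows; the four points, the matrix `A` and the helper names are illustrative assumptions.

```python
def mat_mul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def mean_and_cov(points):
    """Mean vector and variance-covariance matrix of the uniform measure on `points`."""
    n, N = len(points[0]), len(points)
    m = [sum(p[i] for p in points) / N for i in range(n)]
    S = [[sum((p[i] - m[i]) * (p[j] - m[j]) for p in points) / N
          for j in range(n)] for i in range(n)]
    return m, S

# Illustrative assumptions: X is uniform on four points of R^2, A is a 3x2 matrix.
points = [(0.0, 0.0), (2.0, 1.0), (1.0, 3.0), (-1.0, 2.0)]
A = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]

m, S = mean_and_cov(points)
images = [tuple(sum(a * x for a, x in zip(row, p)) for row in A) for p in points]
m_AX, cov_AX = mean_and_cov(images)                  # moments of AX computed directly
predicted = mat_mul(mat_mul(A, S), transpose(A))     # A S A^T as in [1.20]
print(cov_AX)
print(predicted)
```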
COROLLARY 1.3.– (Moments of linear functional) Let $a \in \mathbb{R}^n$ and $X$ be a random vector in $\mathbb{R}^n$, with mean value $m$ and variance–covariance matrix $S$. Then
$$\mathsf{E}(a^\top X) = a^\top m, \quad \mathsf{D}(a^\top X) = a^\top Sa.$$

PROOF.– The statement follows from lemma 1.2 and its proof by putting $A = a^\top \in \mathbb{R}^{1 \times n}$. The random vector $a^\top X$ is just a r.v., and its variance–covariance matrix is just the variance.
1.4.3. Distributions of random vectors

Let $X$ be a random vector distributed in $\mathbb{R}^n$. Its distribution is introduced similarly to definition 1.4.

DEFINITION 1.9.– The distribution of $X$ is a probability measure $\mu_X$ defined as follows:
$$\mu_X(B) = \mathsf{P}\{\omega : X(\omega) \in B\}, \quad B \in \mathcal{B}(\mathbb{R}^n).$$

It is always possible to construct a random vector with a given distribution.

LEMMA 1.3.– Given a probability measure $\mu$ on $\mathcal{B}(\mathbb{R}^n)$, there exists a random vector $X$, with distribution $\mu_X = \mu$.

PROOF.– Take the measure space $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n), \mu)$ as a probability space $(\Omega, \mathcal{F}, \mathsf{P})$ and define $X : \Omega \to \mathbb{R}^n$ as $X(\omega) = \omega$. Then
$$\mu_X(B) = \mathsf{P}\{\omega \in B\} = \mathsf{P}(B) = \mu(B), \quad B \in \mathcal{B}(\mathbb{R}^n).$$

Remember that random variables $X_1, \dots, X_n$, which are defined on the same probability space, are independent if
$$\mathsf{P}\{X_1 \in B_1, X_2 \in B_2, \dots, X_n \in B_n\} = \prod_{1}^{n} \mathsf{P}\{X_k \in B_k\},$$
for all $B_1, \dots, B_n \in \mathcal{B}(\mathbb{R})$. The latter relation can be written in terms of the distribution $\mu_X$ of the random vector $X = (X_k)_1^n$ and the marginal distributions $\mu_{X_k}$ of its components:
$$\mu_X\Big(\prod_{1}^{n} B_k\Big) = \prod_{1}^{n} \mu_{X_k}(B_k).$$
Here $\prod_{1}^{n} B_k$ denotes the Cartesian product of the sets $B_k$.

It is clear that the components of a random vector $X = (X_k)_1^n$ are independent if, and only if, $\mu_X$ is a product of $n$ probability measures, and in this case
$$\mu_X = \prod_{1}^{n} \mu_{X_k}.$$

Remember that the characteristic function $\varphi_X$ of a random vector $X$ is defined as follows:
$$\varphi_X(t) = \mathsf{E}\,e^{i(X, t)}, \quad t \in \mathbb{R}^n.$$
One can rewrite $\varphi_X(t)$ using the change of variables formula (see theorem 1.2):
$$\varphi_X(t) = \int_\Omega e^{i(X(\omega), t)}\,d\mathsf{P}(\omega) = \int_{\mathbb{R}^n} e^{i(z, t)}\,d(\mathsf{P}X^{-1})(z), \quad \varphi_X(t) = \int_{\mathbb{R}^n} e^{i(z, t)}\,d\mu_X(z).$$
This prompts the following definition.

DEFINITION 1.10.– Given a probability measure $\mu$ on $\mathcal{B}(\mathbb{R}^n)$, its characteristic function $\varphi_\mu$ is as follows:
$$\varphi_\mu(t) = \int_{\mathbb{R}^n} e^{i(z, t)}\,d\mu(z), \quad t \in \mathbb{R}^n.$$
Thus, $\varphi_X$ and $\varphi_{\mu_X}$ coincide. From a standard course of probability theory, it is known that the cumulative distribution function (and therefore, the distribution) of a random vector is uniquely defined by its characteristic function.

LEMMA 1.4.– (Criterion for independence) Consider a random vector $X = (X_k)_1^n$. Its components are independent if, and only if, $\varphi_X$ can be decomposed as follows: $\varphi_X(t) = \varphi_1(t_1)\varphi_2(t_2) \dots \varphi_n(t_n)$, $t \in \mathbb{R}^n$, where $\varphi_k : \mathbb{R} \to \mathbb{C}$ are some functions with $\varphi_k(0) = 1$, $k = 1, \dots, n$, and in this case
$$\varphi_X(t) = \prod_{1}^{n} \varphi_{X_k}(t_k), \quad t \in \mathbb{R}^n.$$

PROOF.– a) Let $X_k$ be independent. Then the random variables $e^{it_k X_k}$, $k = 1, \dots, n$, are independent as well, and
$$\varphi_X(t) = \mathsf{E}\prod_{1}^{n} e^{it_k X_k} = \prod_{1}^{n} \mathsf{E}\,e^{it_k X_k} = \prod_{1}^{n} \varphi_{X_k}(t_k); \quad \varphi_{X_k}(0) = 1.$$
b) Assume that $\varphi_X(t) = \varphi_1(t_1)\varphi_2(t_2) \dots \varphi_n(t_n)$, with $\varphi_k(0) = 1$, $k = 1, \dots, n$. Let $\{e_k, k = 1, \dots, n\}$ be the standard orthobasis in $\mathbb{R}^n$. Then
$$\varphi_{X_k}(t_k) = \mathsf{E}\,e^{it_k X_k} = \mathsf{E}\,e^{i(X, t_k e_k)} = \varphi_X(t_k e_k) = \varphi_k(t_k).$$
Let $Y = (Y_k)_1^n$ be a random vector with independent components and the same marginal distributions:
$$\mu_{Y_k} = \mu_{X_k}, \quad k = 1, \dots, n.$$
(Such $Y$ can be constructed by applying lemma 1.3 to the measure $\mu = \prod_{1}^{n} \mu_{X_k}$.) Then, by part (a) of the proof,
$$\varphi_Y(t) = \prod_{1}^{n} \varphi_{Y_k}(t_k) = \prod_{1}^{n} \varphi_{X_k}(t_k) = \varphi_X(t), \quad t \in \mathbb{R}^n.$$
Therefore,
$$\mu_X = \mu_Y = \prod_{1}^{n} \mu_{Y_k} = \prod_{1}^{n} \mu_{X_k},$$
and the components of $X$ are independent.
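For distributions with finitely many atoms, the characteristic function is a finite sum, and the factorization in lemma 1.4 can be tested numerically. The sketch below uses two hypothetical distributions on $\{0, 1\}^2$ chosen by us. Checking finitely many points $t$ can only refute the factorization, it cannot prove independence, so the grid test is an illustration rather than a criterion.

```python
import cmath
from itertools import product

def char_fn(mu, t):
    """phi_mu(t) = sum over atoms z of e^{i(z,t)} mu({z}) for a discrete measure on R^2."""
    return sum(mass * cmath.exp(1j * (z[0] * t[0] + z[1] * t[1])) for z, mass in mu.items())

def marginal(mu, k):
    """Distribution of the k-th coordinate."""
    out = {}
    for z, mass in mu.items():
        out[z[k]] = out.get(z[k], 0.0) + mass
    return out

def char_fn_1d(mu1, s):
    return sum(mass * cmath.exp(1j * x * s) for x, mass in mu1.items())

def factorizes(mu, grid):
    """Check phi_X(t) = phi_{X_1}(t_1) phi_{X_2}(t_2) at every point t of a finite grid."""
    m1, m2 = marginal(mu, 0), marginal(mu, 1)
    return all(abs(char_fn(mu, t) - char_fn_1d(m1, t[0]) * char_fn_1d(m2, t[1])) < 1e-12
               for t in grid)

grid = list(product([-1.0, 0.3, 2.0], repeat=2))
# Hypothetical distributions on {0,1}^2: a product measure and a dependent one.
independent = {(x, y): p * q
               for (x, p), (y, q) in product([(0, 0.3), (1, 0.7)], [(0, 0.6), (1, 0.4)])}
dependent = {(0, 0): 0.5, (1, 1): 0.5}

print(factorizes(independent, grid), factorizes(dependent, grid))
```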
In an obvious way, lemma 1.4 can be reformulated as a criterion for a probability measure on $\mathcal{B}(\mathbb{R}^n)$ to be a product of $n$ probability measures on $\mathcal{B}(\mathbb{R})$.

DEFINITION 1.11.– Given a probability measure $\mu$ on $\mathcal{B}(\mathbb{R}^n)$, its mean value $m_\mu$ and variance–covariance matrix $\mathrm{Cov}(\mu) = (s_{ij})_{i,j=1}^{n}$ are defined as follows:
$$m_\mu = \int_{\mathbb{R}^n} x\,d\mu(x) := \left(\int_{\mathbb{R}^n} x_k\,d\mu(x)\right)_{k=1}^{n} = (m_k)_1^n,$$
$$s_{ij} = \int_{\mathbb{R}^n} (x_i - m_i)(x_j - m_j)\,d\mu(x), \quad i, j = 1, \dots, n.$$
Definition 1.11 is consistent with the corresponding definition of the mean and variance–covariance matrix of a random vector. Indeed, for a random vector $X$, it holds
$$\mathsf{E}X = m_{\mu_X}, \quad \mathrm{Cov}(X) = \mathrm{Cov}(\mu_X),$$
i.e. the expectation and variance–covariance matrix of a random vector are just the mean and variance–covariance matrix of its distribution.
Now, we interpret the bilinear form generated by $S = \mathrm{Cov}(X)$. Let $m = \mathsf{E}X$ and $u, v \in \mathbb{R}^n$; then
$$(Su, v) = v^\top \mathsf{E}(X - m)(X - m)^\top u = \mathsf{E}\,v^\top (X - m)(X - m)^\top u, \quad (Su, v) = \mathsf{E}(X - m, u)(X - m, v).$$
For a probability measure $\mu$ on $\mathcal{B}(\mathbb{R}^n)$, with $m = m_\mu$ and $S = \mathrm{Cov}(\mu)$, we have, respectively:
$$(Su, v) = \int_{\mathbb{R}^n} (z - m, u)(z - m, v)\,d\mu(z), \quad u, v \in \mathbb{R}^n.$$
For the mean value, we have
$$(m, u) = \int_{\mathbb{R}^n} (z, u)\,d\mu(z), \quad u \in \mathbb{R}^n.$$
Those expressions are the first and the central second moments of the measure $\mu$.

Problems 1.4

20) A measure $\mu$ on $\mathcal{B}(\mathbb{R}^n)$ is called symmetric around the origin if $\mu(B) = \mu(-B)$, for all $B \in \mathcal{B}(\mathbb{R}^n)$. Let $\mu$ be a probability measure on $\mathcal{B}(\mathbb{R}^n)$. Prove that $\mu$ is symmetric around the origin if, and only if, its characteristic function $\varphi_\mu$ takes real values only.

21) Let $\mu$ be a probability measure on $\mathcal{B}(\mathbb{R}^n)$. Prove that $\mu$ is invariant under all orthogonal transformations if, and only if, there exists a function $f : [0, +\infty) \to \mathbb{C}$ such that $\varphi_\mu(t) = f(\|t\|)$, $t \in \mathbb{R}^n$.

22) Let $\mu$ and $\nu$ be probability measures on $\mathcal{B}(\mathbb{R}^n)$ such that $\mu(\bar{B}(x, r)) = \nu(\bar{B}(x, r))$, for all closed balls $\bar{B}(x, r)$ in $\mathbb{R}^n$. Prove that $\mu = \nu$.

1.5. Gaussian vectors and Gaussian measures

1.5.1. Characteristic functions of Gaussian vectors

DEFINITION 1.12.– A random vector $\xi$ in $\mathbb{R}^n$ is called Gaussian if for each $a \in \mathbb{R}^n$, the inner product $(\xi, a)$ is a Gaussian r.v.

Consider a Gaussian random vector $\xi = (\xi_k)_1^n$ in $\mathbb{R}^n$. Its components $\xi_k = (\xi, e_k)$ are Gaussian random variables, $\xi_k \sim N(m_k, \sigma_k^2)$ with $\sigma_k \ge 0$; $k = 1, \dots, n$. Such random variables, the components of a Gaussian random vector, are called jointly Gaussian. It holds
$$\mathsf{E}\|\xi\|^2 = \sum_{1}^{n} \mathsf{E}\xi_k^2 = \sum_{1}^{n} (m_k^2 + \sigma_k^2) < \infty,$$
and therefore, $\mathrm{Cov}(\xi)$ is well defined. We have
$$\mathsf{E}\xi = (\mathsf{E}\xi_k)_1^n = (m_k)_1^n =: m, \quad m \in \mathbb{R}^n;$$
$$S := \mathrm{Cov}(\xi) = (s_{ij})_{i,j=1}^{n}, \quad s_{ii} = \mathsf{D}\xi_i = \sigma_i^2, \quad s_{ij} = \mathrm{Cov}(\xi_i, \xi_j), \quad 1 \le i, j \le n.$$
Then we write $\xi \sim N(m, S)$ and say that $\xi$ is a Gaussian random vector with mean $m$ and variance–covariance matrix $S$. Here, $S$ is a positive semidefinite $n \times n$ real matrix as a variance–covariance matrix of a random vector in $\mathbb{R}^n$.

LEMMA 1.5.– If $\xi \sim N(m, S)$, then for each $a \in \mathbb{R}^n$,
$$(\xi, a) \sim N(m_a, \sigma_a^2), \quad m_a = (m, a), \quad \sigma_a^2 = (Sa, a).$$

PROOF.– The r.v. $(\xi, a)$ is Gaussian according to definition 1.12. Its mean and variance are evaluated in corollary 1.3.

LEMMA 1.6.– If $\xi \sim N(m, S)$ in $\mathbb{R}^n$, then
$$\varphi_\xi(t) = \exp\left\{i(t, m) - \frac{(St, t)}{2}\right\}, \quad t \in \mathbb{R}^n.$$

PROOF.– It holds $\varphi_\xi(t) = \mathsf{E}\,e^{i(\xi, t)} = \varphi_{(\xi, t)}(1)$. Now, use lemma 1.5 and [1.17]:
$$\varphi_\xi(t) = \exp\left\{iz(m, t) - \frac{z^2 (St, t)}{2}\right\}\bigg|_{z=1},$$
and the statement follows.

REMARK 1.3.– If a random vector $\xi$ has the characteristic function given in lemma 1.6, with a certain $n \times n$ real and symmetric matrix $S$, then $S$ is positive semidefinite and $\xi \sim N(m, S)$.

PROOF.– For $a \in \mathbb{R}^n$, it holds $\varphi_{(\xi, a)}(u) = \mathsf{E}\,e^{i(\xi, ua)} = \varphi_\xi(ua)$,
$$\varphi_{(\xi, a)}(u) = \exp\left\{iu(a, m) - \frac{(Sa, a)u^2}{2}\right\}, \quad u \in \mathbb{R}.$$
Since the absolute value of a characteristic function does not exceed 1, the matrix $S$ is positive semidefinite, and moreover $(\xi, a) \sim N(m_a, \sigma_a^2)$, with $m_a = (a, m)$, $\sigma_a^2 = (Sa, a)$. Hence, $\xi$ is a Gaussian vector, with some parameters $m_1 \in \mathbb{R}^n$ and $S_1 \in \mathbb{R}^{n \times n}$, where $S_1$ is positive semidefinite. Then lemma 1.5 implies $(\xi, a) \sim N(\tilde{m}_a, \tilde{\sigma}_a^2)$, with $\tilde{m}_a = (a, m_1)$, $\tilde{\sigma}_a^2 = (S_1 a, a)$. We get $(a, m) = (a, m_1)$ and $(Sa, a) = (S_1 a, a)$ for all $a \in \mathbb{R}^n$. Thus, $m_1 = m$ and $S_1 = S$.

DEFINITION 1.13.– A random vector $\gamma \sim N(0, I_n)$ is called a standard, or canonical, Gaussian vector distributed in $\mathbb{R}^n$.

The components $\gamma_1, \dots, \gamma_n$ of a standard Gaussian vector are uncorrelated jointly Gaussian standard random variables. In particular, $\mathsf{E}\gamma_k = 0$, $\mathsf{D}\gamma_k = 1$, and for all $i \ne j$, $\mathrm{Cov}(\gamma_i, \gamma_j) = \mathsf{E}\gamma_i\gamma_j = 0$.

THEOREM 1.6.– (About components of standard Gaussian vector)
1) Let $\gamma = (\gamma_k)_1^n$ be a standard Gaussian vector. Then
$$\varphi_\gamma(t) = e^{-\frac{\|t\|^2}{2}}, \quad t \in \mathbb{R}^n,$$
and $\gamma_1, \dots, \gamma_n$ are independent and identically distributed (i.i.d.) $N(0, 1)$ random variables.
2) If $\gamma_1, \dots, \gamma_n$ are i.i.d. $N(0, 1)$ random variables, then $\gamma = (\gamma_k)_1^n$ is a standard Gaussian vector.

PROOF.– 1) The formula for $\varphi_\gamma$ follows from lemma 1.6 with $m = 0$ and $S = I_n$. Therefore,
$$\varphi_\gamma(t) = \prod_{k=1}^{n} e^{-\frac{t_k^2}{2}}.$$
Now, by lemma 1.4 the components of $\gamma$ are independent.

2) Let $\gamma_1, \dots, \gamma_n$ be i.i.d. $N(0, 1)$ random variables. Then $\gamma = (\gamma_k)_1^n$ is a random vector, and for each $a \in \mathbb{R}^n$, $(\gamma, a) = \sum_{k=1}^{n} a_k\gamma_k$ is a Gaussian r.v. as a sum of independent Gaussian random variables. Therefore, $\gamma$ is Gaussian. We have
$$\mathsf{E}\gamma = (\mathsf{E}\gamma_k)_1^n = 0, \quad \mathrm{Cov}\,\gamma = (s_{ij})_{i,j=1}^{n}, \quad s_{ij} = \mathsf{E}\gamma_i\gamma_j = \delta_{ij}, \quad 1 \le i, j \le n.$$
Hereafter, $\delta_{ij}$ is the Kronecker delta, $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$, otherwise. Hence $\gamma \sim N(0, I_n)$.

REMARK 1.4.– (About uncorrelated Gaussian variables) Theorem 1.6 can be extended as follows: jointly Gaussian random variables $\xi_1, \dots, \xi_n$ are independent if, and only if, they are uncorrelated. Based on theorem 1.6, this can be proven by consideration of the normalized random variables $\eta_i = \frac{\xi_i - \mathsf{E}\xi_i}{\sqrt{\mathsf{D}\xi_i}}$ (having first set aside those $\xi_i$ that are constant a.s.).
1.5.2. Expansion of Gaussian vector

Lemma 1.5 can be generalized to any linear transformation of a Gaussian vector.

LEMMA 1.7.– (Linear transform of Gaussian vector) Let $\xi \sim N(m, S)$ be a Gaussian vector in $\mathbb{R}^n$ and $A \in \mathbb{R}^{p \times n}$. Then $A\xi$ is a Gaussian vector in $\mathbb{R}^p$ and $A\xi \sim N(Am, ASA^\top)$.

PROOF.– For $a \in \mathbb{R}^p$, it holds $(A\xi, a) = (\xi, A^\top a)$. It is a Gaussian r.v. due to definition 1.12. Now, the statement follows from lemma 1.2 and its proof.

Now, we want to show that for any $m \in \mathbb{R}^n$ and positive semidefinite matrix $S \in \mathbb{R}^{n \times n}$, a random vector $\xi \sim N(m, S)$ can be obtained as an affine transformation of a standard Gaussian vector.

Let $A \in \mathbb{R}^{n \times n}$ be a positive semidefinite matrix. Then it has an orthonormal eigenbasis $e_1, \dots, e_n$ and corresponding non-negative eigenvalues $\lambda_1, \dots, \lambda_n$. The matrix can be expanded as follows:
$$A = \sum_{1}^{n} \lambda_k e_k e_k^\top.$$

DEFINITION 1.14.– Let $A$ be a positive semidefinite matrix as described above. Then $A^{1/2} = \sqrt{A}$ is the matrix given by
$$\sqrt{A} = \sum_{1}^{n} \sqrt{\lambda_k}\, e_k e_k^\top.$$
It is the unique positive semidefinite matrix with its square equal to $A$. The matrix $\sqrt{A}$ has the same eigenbasis $e_1, \dots, e_n$ and corresponding eigenvalues $\sqrt{\lambda_1}, \dots, \sqrt{\lambda_n}$.

LEMMA 1.8.– (Representation of Gaussian vector via standard Gaussian vector) Let $m \in \mathbb{R}^n$, $S \in \mathbb{R}^{n \times n}$ be a positive semidefinite matrix and $\gamma \sim N(0, I_n)$. Then
$$\xi := m + \sqrt{S}\gamma \sim N(m, S).$$

PROOF.– According to lemma 1.7,
$$\sqrt{S}\gamma \sim N(0, \sqrt{S}\, I_n (\sqrt{S})^\top), \quad \sqrt{S}\gamma \sim N(0, S).$$
This implies the statement.
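Lemma 1.8 suggests a way to simulate $\xi \sim N(m, S)$ from i.i.d. standard normal variables. The sketch below treats only the case $n = 2$. As an assumption of ours, it computes $\sqrt{S}$ not from the eigen-expansion of definition 1.14 but from the closed form $\sqrt{S} = (S + sI)/\tau$ with $s = \sqrt{\det S}$ and $\tau = \sqrt{\operatorname{tr} S + 2s}$, which is valid for non-zero positive semidefinite $2 \times 2$ matrices and follows from the Cayley–Hamilton theorem. The parameter values and function names are illustrative.

```python
import math
import random

def sqrt_psd_2x2(S):
    """Square root of a non-zero positive semidefinite 2x2 matrix.

    Uses sqrt(S) = (S + s I) / tau with s = sqrt(det S), tau = sqrt(tr S + 2 s).
    """
    (a, b), (_, d) = S
    s = math.sqrt(max(a * d - b * b, 0.0))
    tau = math.sqrt(a + d + 2 * s)
    return [[(a + s) / tau, b / tau], [b / tau, (d + s) / tau]]

def sample_gaussian(m, S, rng):
    """One draw of xi = m + sqrt(S) gamma with gamma ~ N(0, I_2)."""
    R = sqrt_psd_2x2(S)
    g = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    return [m[i] + R[i][0] * g[0] + R[i][1] * g[1] for i in range(2)]

# Illustrative parameters.
m = [1.0, -1.0]
S = [[2.0, 1.0], [1.0, 2.0]]
R = sqrt_psd_2x2(S)
R_squared = [[sum(R[i][k] * R[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

rng = random.Random(0)           # fixed seed, so the run is reproducible
draws = [sample_gaussian(m, S, rng) for _ in range(20_000)]
sample_mean = [sum(x[i] for x in draws) / len(draws) for i in range(2)]
print(R_squared, sample_mean)
```

The matrix `R_squared` reproduces `S` up to rounding, and the sample mean of the draws is close to `m`.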
Since $\gamma \sim N(0, I_n)$ can be constructed based on i.i.d. $N(0, 1)$ random variables, lemma 1.8 shows that Gaussian $N(m, S)$ random vectors do exist, for any $m \in \mathbb{R}^n$ and any positive semidefinite matrix $S \in \mathbb{R}^{n \times n}$.

Now, we expand a Gaussian vector $\xi \sim N(m, S)$ using i.i.d. $N(0, 1)$ random variables $\gamma_1, \dots, \gamma_r$, with $r = \mathrm{rk}(S)$.

THEOREM 1.7.– (Decomposition of Gaussian vector) Let $m \in \mathbb{R}^n$ and $S \in \mathbb{R}^{n \times n}$ be a positive semidefinite matrix, with $\mathrm{rk}(S) = r \ge 1$. Let $\lambda_1, \dots, \lambda_r$ be the positive eigenvalues of $S$ and $e_1, \dots, e_r$ be the corresponding orthonormal system of eigenvectors.
1) For $\xi \sim N(m, S)$, there exist i.i.d. $N(0, 1)$ random variables $\gamma_1, \dots, \gamma_r$ on the underlying probability space such that
$$\xi = m + \sum_{1}^{r} \sqrt{\lambda_k}\,\gamma_k e_k, \quad \text{a.s.}$$
2) If $\gamma_1, \dots, \gamma_r$ are i.i.d. $N(0, 1)$ random variables, then
$$\eta := m + \sum_{1}^{r} \sqrt{\lambda_k}\,\gamma_k e_k \sim N(m, S).$$

PROOF.– We complete the orthonormal system to an orthobasis $e_1, \dots, e_n$. Denote $\lambda_{r+1} = \lambda_{r+2} = \dots = \lambda_n = 0$; they are the remaining eigenvalues of $S$.

1) Introduce $X = \xi - m$, $X \sim N(0, S)$. Now, $X = \sum_{1}^{n} (X, e_k)e_k$, $\mathsf{E}(X, e_k) = 0$, and for all $k \ge r + 1$, $\mathsf{D}(X, e_k) = \lambda_k = 0$. Hence $(X, e_k) = 0$ a.s., for all $k \ge r + 1$. Thus,
$$X = \sum_{1}^{r} \sqrt{\lambda_k}\,\frac{(X, e_k)}{\sqrt{\lambda_k}}\,e_k, \quad \text{a.s.}$$
The random variables $\gamma_k := \frac{(X, e_k)}{\sqrt{\lambda_k}}$, $k = 1, \dots, r$, are jointly Gaussian (because the vector $\gamma = (\gamma_k)_1^r$ is a linear transformation of the Gaussian vector $X$) and
$$\mathsf{E}\gamma_k = 0, \quad \mathrm{Cov}(\gamma_k, \gamma_j) = \frac{1}{\sqrt{\lambda_k\lambda_j}}(Se_k, e_j) = \frac{\lambda_k}{\sqrt{\lambda_k\lambda_j}}\,\delta_{kj} = \delta_{kj}.$$
According to theorem 1.6(1), $\gamma_1, \dots, \gamma_r$ are i.i.d. $N(0, 1)$ random variables. Now,
$$X = \sum_{1}^{r} \sqrt{\lambda_k}\,\gamma_k e_k, \quad \text{a.s.},$$
and the statement follows.
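2) For $a \in \mathbb{R}^n$, $(\eta, a) = (m, a) + \sum_{1}^{r} \sqrt{\lambda_k}\,\gamma_k (e_k, a)$ is a Gaussian r.v. as a sum of independent Gaussian random variables. Therefore, $\eta$ is Gaussian. Next,
$$\mathsf{E}\eta = m, \quad \mathrm{Cov}(\eta) = \mathsf{E}\left(\sum_{k=1}^{r} \sqrt{\lambda_k}\,\gamma_k e_k\right)\left(\sum_{j=1}^{r} \sqrt{\lambda_j}\,\gamma_j e_j\right)^\top = \sum_{k,j=1}^{r} \sqrt{\lambda_k\lambda_j}\,(\mathsf{E}\gamma_k\gamma_j)\,e_k e_j^\top = \sum_{k=1}^{r} \lambda_k e_k e_k^\top = S.$$
Thus, $\eta \sim N(m, S)$.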
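The expansion in theorem 1.7(2) also works for a singular matrix $S$, where only $r < n$ standard normal variables are needed. The sketch below uses a hypothetical rank-one example in $\mathbb{R}^2$; the numbers and helper names are our own choices.

```python
import math
import random

# Hypothetical rank-one example in R^2: lambda_1 = 2 and e_1 = (1, 1)/sqrt(2),
# so that S = lambda_1 e_1 e_1^T = [[1, 1], [1, 1]] and r = rk(S) = 1.
m = [0.5, -0.5]
eigenvalues = [2.0]
eigenvectors = [[1 / math.sqrt(2), 1 / math.sqrt(2)]]

def covariance_from_expansion(eigenvalues, eigenvectors, n=2):
    """S = sum_k lambda_k e_k e_k^T."""
    return [[sum(lam * e[i] * e[j] for lam, e in zip(eigenvalues, eigenvectors))
             for j in range(n)] for i in range(n)]

def draw_eta(m, eigenvalues, eigenvectors, rng):
    """eta = m + sum_k sqrt(lambda_k) gamma_k e_k with i.i.d. N(0, 1) gamma_k."""
    eta = list(m)
    for lam, e in zip(eigenvalues, eigenvectors):
        g = rng.gauss(0.0, 1.0)
        for i in range(len(eta)):
            eta[i] += math.sqrt(lam) * g * e[i]
    return eta

S = covariance_from_expansion(eigenvalues, eigenvectors)
rng = random.Random(1)           # fixed seed, so the run is reproducible
draws = [draw_eta(m, eigenvalues, eigenvectors, rng) for _ in range(1000)]
# Every draw lies on the line m + span(e_1): the two centred coordinates agree.
max_gap = max(abs((x[0] - m[0]) - (x[1] - m[1])) for x in draws)
print(S, max_gap)
```

All simulated values lie on the line $m + \mathrm{span}(e_1)$, so the distribution is concentrated on a one-dimensional affine subspace.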
1.5.3. Support of Gaussian vector

DEFINITION 1.15.– For a random vector $X$ in $\mathbb{R}^n$, denote by $G$ the union of all balls $B(x, r)$ with $\mathsf{P}\{X \in B(x, r)\} = 0$. The set $\mathbb{R}^n \setminus G$ is called the support of $X$ and is denoted as $\mathrm{supp}\,X$.

It is clear that $\mathsf{P}\{X \in G\} = 0$. Moreover, $G$ is the largest open set with this property. Accordingly, $\mathrm{supp}\,X$ is the smallest closed set such that $X$ belongs to this set with probability 1. Since $G \ne \mathbb{R}^n$, $\mathrm{supp}\,X \ne \emptyset$.

EXAMPLE 1.2.– Let a r.v. $\xi$ have the Poisson distribution with parameter $\lambda > 0$. Then $\mathrm{supp}\,\xi = \{0, 1, 2, \dots, n, \dots\}$.

THEOREM 1.8.– (About support of Gaussian vector) For a random vector $\xi \sim N(0, S)$, $\mathrm{supp}\,\xi = R(S)$, where $R(S)$ is the range of $S$.

PROOF.– a) If $S = 0$, then $\xi = 0$ a.s., and $\mathrm{supp}\,\xi = \{0\} = R(S)$. Now, let $\mathrm{rk}(S) = r \ge 1$, and let $\lambda_1, \dots, \lambda_r$ be the positive eigenvalues of $S$ and $e_1, \dots, e_r$ be the corresponding orthonormal system of eigenvectors. According to theorem 1.7(1), there exist i.i.d. $N(0, 1)$ random variables $\gamma_1, \dots, \gamma_r$ on the underlying probability space such that
$$\xi = \sum_{1}^{r} \sqrt{\lambda_k}\,\gamma_k e_k, \quad \text{a.s.},$$
and with probability one $\xi \in \mathrm{span}(e_1, \dots, e_r) = R(S)$.

b) Take arbitrary $x \in R(S)$ and $\varepsilon > 0$. Show that
$$\mathsf{P}\{\xi \in B(x, \varepsilon)\} > 0. \tag*{[1.21]}$$
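Indeed, $x = \sum_{1}^{r} a_k e_k$ and there exists $\delta > 0$ such that
$$M := \Big\{y = \sum_{1}^{r} b_k e_k : |b_k - a_k| < \delta,\ k = 1, \dots, r\Big\} \subset B(x, \varepsilon).$$
Then
$$\mathsf{P}\{\xi \in M\} = \mathsf{P}\left\{\gamma_k \in \left(\frac{a_k - \delta}{\sqrt{\lambda_k}}, \frac{a_k + \delta}{\sqrt{\lambda_k}}\right),\ k = 1, \dots, r\right\} = \prod_{1}^{r} \mathsf{P}\left\{\gamma_k \in \left(\frac{a_k - \delta}{\sqrt{\lambda_k}}, \frac{a_k + \delta}{\sqrt{\lambda_k}}\right)\right\} > 0,$$
$$\mathsf{P}\{\xi \in B(x, \varepsilon)\} \ge \mathsf{P}\{\xi \in M\} > 0.$$
Thus, [1.21] holds true. This fact and the relation shown in part (a) imply that $\mathrm{supp}\,\xi = R(S)$.

1.5.4. Gaussian measures in Euclidean space

Now, we reformulate the results of sections 1.5.1–1.5.3 for distributions of Gaussian vectors.

DEFINITION 1.16.– A probability measure $\mu$ on $\mathcal{B}(\mathbb{R}^n)$ is called a Gaussian measure in $\mathbb{R}^n$ if there exists a Gaussian random vector $\xi$ in $\mathbb{R}^n$ such that its distribution $\mu_\xi = \mu$.

Let $\xi \sim N(m, S)$ and $\mu = \mu_\xi$. In view of section 1.4.3, we have the following:
$$m = \mathsf{E}\xi = m_\mu, \quad S = \mathrm{Cov}(\xi) = \mathrm{Cov}(\mu),$$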
i.e. $m$ is the mean value of $\mu$ and $S$ is the variance–covariance matrix of $\mu$. Next, according to lemma 1.6, the characteristic function of $\mu$ equals
$$\varphi_\mu(t) = \varphi_\xi(t) = \exp\Big\{ i(t, m) - \frac{(St, t)}{2} \Big\}, \quad t \in \mathbb{R}^n.$$
Since the parameters $m$ and $S$ uniquely define the characteristic function of a Gaussian measure, they uniquely define the Gaussian measure $\mu$ itself. There is a one-to-one correspondence between the set of all Gaussian measures in $\mathbb{R}^n$ and the set of couples $(m; S)$, where $m \in \mathbb{R}^n$ and $S$ is a positive semidefinite $n \times n$ matrix.

Let $\lambda_1, \dots, \lambda_n$ be the eigenvalues of $S$ (they are non-negative) and $e_1, \dots, e_n$ be the corresponding eigenbasis of $S$. Then
$$\varphi_\mu(t) = \exp\Big\{ i \sum_{k=1}^{n} (t, e_k)(m, e_k) - \frac{1}{2} \sum_{k=1}^{n} \lambda_k (t, e_k)^2 \Big\},$$
$$\varphi_\mu(t) = \prod_{k=1}^{n} \exp\Big\{ i (t, e_k)(m, e_k) - \frac{1}{2} \lambda_k (t, e_k)^2 \Big\}.$$
We treat $(t, e_k)$ as the coordinates $t_k$ of the vector $t \in \mathbb{R}^n$, and similarly $(m, e_k) = m_k$. Thus,
$$\varphi_\mu(t) = \prod_{k=1}^{n} \exp\Big\{ i t_k m_k - \frac{1}{2} \lambda_k t_k^2 \Big\}.$$
In view of lemma 1.3, we get a decomposition of $\mu$:
$$\mu = \prod_{k=1}^{n} \mu_k,$$
where $\mu_k$ is a Gaussian measure on the real line, with mean $m_k$ and variance $\lambda_k$, $k = 1, \dots, n$. Thus, each Gaussian measure in Euclidean space is just a product of Gaussian measures on the real line.

In case $\operatorname{rk}(S) = r$, $1 \leq r \leq n - 1$, we may and do assume that $\lambda_{r+1} = \lambda_{r+2} = \dots = \lambda_n = 0$. Then $\mu_k = \delta_{m_k}$ (Dirac measure at the point $m_k$), $k \geq r + 1$, and we get the expansion
$$\mu = \prod_{k=1}^{r} \mu_k \times \prod_{k=r+1}^{n} \delta_{m_k} = \prod_{k=1}^{r} \mu_k \times \delta_z, \quad z = (m_{r+1}, \dots, m_n)^\top.$$
Here, $\delta_z$ is the Dirac measure on $\mathcal{B}(\mathbb{R}^{n-r})$ at the point $z$.

DEFINITION 1.17.– Standard Gaussian measure $g$ in $\mathbb{R}^n$ is the distribution of the standard Gaussian vector in $\mathbb{R}^n$.

The measure $g$ has zero mean, and its variance–covariance matrix is equal to $I_n$. Its characteristic function is
$$\varphi_g(t) = e^{-\|t\|^2 / 2}, \quad t \in \mathbb{R}^n.$$
It holds
$$g = \prod_{k=1}^{n} g_k,$$
where $g_1 = \dots = g_n$ is the standard Gaussian measure on the real line. This means that
$$g_i(B) = \frac{1}{\sqrt{2\pi}} \int_B e^{-x^2/2}\, dx, \quad B \in \mathcal{B}(\mathbb{R}), \quad i = 1, \dots, n.$$
D EFINITION 1.18.– (See definition 1.15) For a probability measure μ on B(Rn ), denote by G a union of all balls B(x, r), with μ(B(x, r)) = 0. The set Rn \ G is called support of μ and is denoted as supp μ.
Since $\mathbb{R}^n$ is separable, $G$ is a countable union of balls $B(x_i, r_i)$ with $\mu(B(x_i, r_i)) = 0$. Hence, $\mu(G) = 0$. Moreover, $G$ is the largest open set with this property. Note that $\operatorname{supp} \mu$ is the smallest closed set on which $\mu$ takes the value 1. Since $G \neq \mathbb{R}^n$, $\operatorname{supp} \mu \neq \emptyset$.

Theorem 1.8 implies the following: if $\mu$ is a Gaussian measure with mean 0 and variance–covariance matrix $S$, then $\operatorname{supp} \mu = R(S)$. In particular, for the standard Gaussian measure $g$ in $\mathbb{R}^n$, $\operatorname{supp} g = \mathbb{R}^n$.

Now, we study the invariance of Gaussian measures under linear transformations. Let $X$ be a random vector in $\mathbb{R}^n$ with distribution $\mu_X$ and $T : \mathbb{R}^n \to \mathbb{R}^n$ be a Borel function. The random vector $TX$ has distribution $\mu_{TX} = \mu_X T^{-1}$. Therefore, $\mu_X$ is invariant under $T$ (see definition 1.2) if, and only if, $X \stackrel{d}{=} TX$. Hereafter, $X \stackrel{d}{=} Y$ means that the random vectors $X$ and $Y$ are identically distributed, i.e. $\mu_X = \mu_Y$.

Remember definition 1.14 of $A^{1/2} = \sqrt{A}$ for a positive semidefinite matrix $A$.

THEOREM 1.9.– (Invariance of Gaussian measure) Let $U \in \mathbb{R}^{n \times n}$ and $\mu$ be a Gaussian measure in $\mathbb{R}^n$, with zero mean and non-singular variance–covariance matrix $S$. The measure $\mu$ is $U$-invariant if, and only if, the matrix $S^{-1/2} U S^{1/2}$ is orthogonal. In particular, the standard Gaussian measure $g$ in $\mathbb{R}^n$ is $U$-invariant if, and only if, $U$ is an orthogonal matrix.

PROOF.– Let $X$ be a Gaussian random vector with $\mu_X = \mu$. Then $X \sim N(0, S)$ and, by lemma 1.7, $UX \sim N(0, U S U^\top)$. Now, $UX \stackrel{d}{=} X$ if, and only if,
$$U S U^\top = S \quad \Leftrightarrow \quad \big( S^{-1/2} U S^{1/2} \big) \big( S^{-1/2} U S^{1/2} \big)^\top = I_n.$$
But this is equivalent to the orthogonality of the matrix $S^{-1/2} U S^{1/2}$. In case $\mu = g$, it holds $S = I_n$. Thus, $g$ is $U$-invariant if, and only if, $U$ is an orthogonal matrix.

The main statement of theorem 1.9 can be interpreted as follows. For a positive definite $S \in \mathbb{R}^{n \times n}$, we introduce a new inner product in $\mathbb{R}^n$,
$$(x, y)_S = \big( S^{-1} x, y \big), \quad x, y \in \mathbb{R}^n.$$
The corresponding norm is
$$\|x\|_S = \sqrt{(x, x)_S} = \sqrt{\|S^{-1/2} x\|^2} = \|S^{-1/2} x\|, \quad x \in \mathbb{R}^n.$$
Now, a Gaussian measure μ, with zero mean and non-singular variance–covariance matrix S, is U -invariant if, and only if, U is unitary operator w.r.t. the inner product (x, y)S .
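As a numerical sanity check of this criterion, the following minimal Python sketch takes an assumed 2 × 2 example, S = diag(4, 1), and builds U = S^{1/2} R S^{-1/2} from a rotation R, so that S^{-1/2} U S^{1/2} = R is orthogonal. The matrices, the angle and the helper names (`matmul`, `preserves_covariance`, `s_norm`) are illustrative choices and do not come from the text.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def close(a, b, tol=1e-12):
    """Entry-wise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Assumed example: S = diag(4, 1), hence S^{1/2} = diag(2, 1), S^{-1/2} = diag(1/2, 1)
S = [[4.0, 0.0], [0.0, 1.0]]
S_half = [[2.0, 0.0], [0.0, 1.0]]
S_neg_half = [[0.5, 0.0], [0.0, 1.0]]

theta = 0.7  # arbitrary rotation angle
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]

# U := S^{1/2} R S^{-1/2}, so that S^{-1/2} U S^{1/2} = R is orthogonal
U = matmul(matmul(S_half, R), S_neg_half)

def preserves_covariance(V):
    """True if V S V^T = S, i.e. N(0, S) is V-invariant."""
    return close(matmul(matmul(V, S), transpose(V)), S)

def s_norm(x):
    """Norm induced by (x, y)_S = (S^{-1} x, y) for the diagonal S above."""
    return math.sqrt(x[0] ** 2 / 4.0 + x[1] ** 2)

x = [1.0, -2.0]
Ux = [U[0][0] * x[0] + U[0][1] * x[1], U[1][0] * x[0] + U[1][1] * x[1]]

print(preserves_covariance(U))   # U built from a rotation in the S-geometry
print(preserves_covariance(R))   # plain Euclidean rotation
print(s_norm(Ux), s_norm(x))     # the S-norm is preserved by U
```

The first check should hold and the second should fail: the plain rotation R is orthogonal in the standard geometry, but it does not preserve N(0, S) when S is not a multiple of the identity.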
Indeed, $S^{-1/2} U S^{1/2}$ is an orthogonal matrix if, and only if,
$$\|S^{-1/2} U S^{1/2} x\| = \|x\|, \quad x \in \mathbb{R}^n.$$
Now, make the change of variable $y = S^{1/2} x$, $y \in \mathbb{R}^n$. Then we get an equivalent condition
$$\|S^{-1/2} U y\| = \|S^{-1/2} y\| \quad \Leftrightarrow \quad \|U y\|_S = \|y\|_S, \quad y \in \mathbb{R}^n.$$
The latter equality means that the linear transformation U is orthogonal w.r.t. the inner product (x, y)S . One can say that a Gaussian measure μ changes the geometry of Euclidean space. Standard Gaussian measure g stays in correspondence with standard geometry of Euclidean space. Compared to Lebesgue measure λn (see theorem 1.3 and corollary 1.1), the measure g has fewer invariant transformations. Theorem 1.6 shows that λn cannot be extended to an infinite-dimensional Hilbert space H. But we will see that Gaussian measure can be constructed in H, with quite a large group of invariant transformations. Problems 1.5 23) Let A = (aij )ni,j=1 and B = (bij )ni,j=1 be positive definite matrices. Prove that the matrix C = (aij bij )ni,j=1 is positive definite as well. 24) Let f and g be pdfs, with cumulative distribution functions F and G, respectively. Prove the following: a) For each α ∈ (−1, 1), h(x, y) := f (x)g(y) + αf (x)(1 − 2F (x))g(y)(1 − 2G(y)),
(x, y) ∈ R2
is a pdf, with marginal densities $f(x)$ and $g(y)$. b) Assume additionally that $f$ and $g$ are even functions and $\int_{\mathbb{R}} |x| f(x)\, dx < \infty$, $\int_{\mathbb{R}} |y| g(y)\, dy < \infty$. Let $(X; Y)$ be a random vector with pdf equal to $h(x, y)$. If $\alpha \in (0, 1)$, then $X$ and $Y$ are positively correlated, and if $\alpha \in (-1, 0)$, then $X$ and $Y$ are negatively correlated.

25) Based on problem (24), construct Gaussian random variables $X$ and $Y$ which are not jointly Gaussian.

26) Let $A \in \mathbb{R}^{n \times n}$ be a symmetric matrix and $X \sim N(0, S)$ in $\mathbb{R}^n$. Denote the eigenvalues of $S^{1/2} A S^{1/2}$ as $\lambda_1, \dots, \lambda_n$. Prove that $I_\alpha := E \exp\{\alpha (AX, X)\} < \infty$ if, and only if, $\alpha \lambda_k < \frac{1}{2}$, $k = 1, \dots, n$. Show that in this case
$$I_\alpha = \frac{1}{\sqrt{\prod_{k=1}^{n} (1 - 2 \alpha \lambda_k)}}.$$
27) Let X ∼ N (0, In ). Find for which real α Iα := E ||X||−α < ∞.
2 Gaussian Measure in l2 as a Product Measure
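2.1. Space $\mathbb{R}^\infty$

2.1.1. Metric on $\mathbb{R}^\infty$

We denote by $\mathbb{R}^\infty$ the set of all sequences $x = (x_n)_1^\infty = (x_1, \dots, x_n, \dots)$ such that $x_n \in \mathbb{R}$, $n \geq 1$. It is a real linear space w.r.t. the natural operations
$$x + y = (x_n + y_n)_1^\infty, \quad \lambda x = (\lambda x_n)_1^\infty,$$
where $x_n$ and $y_n$ are the coordinates of $x$ and $y$, respectively, and $\lambda \in \mathbb{R}$. Let
$$\rho(x, y) = \sum_{n=1}^{\infty} \frac{1}{2^n} \cdot \frac{|x_n - y_n|}{1 + |x_n - y_n|}, \quad x, y \in \mathbb{R}^\infty. \qquad [2.1]$$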
We will show that [2.1] is a metric on $\mathbb{R}^\infty$ that metrizes the coordinate-wise convergence.

LEMMA 2.1.– (Bounded metric on real line) The function
$$d(t, s) = \frac{|t - s|}{1 + |t - s|}, \quad t, s \in \mathbb{R} \qquad [2.2]$$
is a metric on $\mathbb{R}$. The convergence in this metric is equivalent to the usual convergence of real sequences.
Gaussian Measures in Hilbert Space: Construction and Properties, First Edition. Alexander Kukush. © ISTE Ltd 2019. Published by ISTE Ltd and John Wiley & Sons, Inc.
PROOF.– 1) The first and second axioms of a metric are easily verified. Here, we check only the triangle inequality. The function $\varphi(t) := \frac{t}{1 + t}$, $t \geq 0$, is increasing. For real numbers $t, s, u$, we have
$$d(t, s) = \varphi(|t - s|) \leq \varphi(|t - u| + |u - s|) = \frac{|t - u|}{1 + |t - u| + |u - s|} + \frac{|u - s|}{1 + |t - u| + |u - s|},$$
$$d(t, s) \leq \frac{|t - u|}{1 + |t - u|} + \frac{|u - s|}{1 + |u - s|} = d(t, u) + d(u, s).$$
Thus, $d$ is a metric on $\mathbb{R}$.

2) If $\lim_{n \to \infty} t_n = t$, then
$$d(t_n, t) = \frac{|t_n - t|}{1 + |t_n - t|} \to 0 \quad \text{as } n \to \infty.$$
Conversely, let $d(t_n, t) \to 0$ as $n \to \infty$. Then
$$|t_n - t| = \frac{d(t_n, t)}{1 - d(t_n, t)} \to 0 \quad \text{as } n \to \infty.$$

Now, the function [2.1] can be rewritten in terms of the metric [2.2]:
$$\rho(x, y) = \sum_{n=1}^{\infty} \frac{d(x_n, y_n)}{2^n}, \quad x, y \in \mathbb{R}^\infty. \qquad [2.3]$$
LEMMA 2.2.– The function [2.1] is a metric on $\mathbb{R}^\infty$. The convergence in this metric is equivalent to the coordinate-wise convergence. The metric space $(\mathbb{R}^\infty, \rho)$ is separable.

PROOF.– 1) The series in [2.1] converges because it is majorized by the convergent series $\sum_{n=1}^{\infty} \frac{1}{2^n}$. Thus, the function [2.1] takes real values. The first and second axioms of a metric are easily checked. Here, we verify the triangle inequality only. Let $x, y, z \in \mathbb{R}^\infty$. We use representation [2.3] and lemma 2.1:
$$\rho(x, y) \leq \sum_{n=1}^{\infty} \frac{d(x_n, z_n) + d(z_n, y_n)}{2^n} = \rho(x, z) + \rho(z, y).$$
Thus, $\rho$ is a metric.
2) Let $x \in \mathbb{R}^\infty$, $x(m) = (x_n(m))_{n=1}^{\infty} \in \mathbb{R}^\infty$, $m \geq 1$. Suppose that $\rho(x(m), x) \to 0$ as $m \to \infty$. For each $n \geq 1$, we have
$$\rho(x(m), x) \geq \frac{1}{2^n}\, d(x_n(m), x_n) \geq 0,$$
thus, $d(x_n(m), x_n) \to 0$ as $m \to \infty$. By lemma 2.1, $x_n(m) \to x_n$ as $m \to \infty$. Here, $n$ is arbitrary, and $x(m)$ converges to $x$ coordinate-wise.

Conversely, assume that $x(m)$ converges to $x$ coordinate-wise. For each $N \geq 1$,
$$\rho(x(m), x) \leq \sum_{n=1}^{N} \frac{d(x_n(m), x_n)}{2^n} + \frac{1}{2^N}.$$
By lemma 2.1, $d(x_n(m), x_n) \to 0$ as $m \to \infty$. Then
$$0 \leq \limsup_{m \to \infty} \rho(x(m), x) \leq \frac{1}{2^N}.$$
Tending $N$ to infinity, we obtain $\limsup_{m \to \infty} \rho(x(m), x) = 0$, and $\rho(x(m), x)$ tends to 0 as $m \to \infty$.

3) Consider the set
$$A = \{(r_1, \dots, r_n, 0, 0, \dots) : n \geq 1,\ r_i \in \mathbb{Q},\ i = 1, \dots, n\}.$$
It consists of finitary vectors with rational coordinates. The set is countable. Now, we check that it is dense in $\mathbb{R}^\infty$.

Take any $x = (x_n)_1^\infty \in \mathbb{R}^\infty$. For each $n \geq 1$, construct a sequence $\{r_m(n), m \geq 1\}$ of rational numbers that converges to $x_n$ as $m \to \infty$. As $n \to \infty$, the sequence of points from $A$
$$a(n) = (r_n(1), r_n(2), \dots, r_n(n), 0, 0, \dots), \quad n \geq 1,$$
converges to $x$ coordinate-wise, and hence in $(\mathbb{R}^\infty, \rho)$. Thus, $A$ is dense in $\mathbb{R}^\infty$ and countable. Therefore, $(\mathbb{R}^\infty, \rho)$ is separable.

Notice that there is no norm on $\mathbb{R}^\infty$ that generates the coordinate-wise convergence. Indeed, consider an arbitrary norm on $\mathbb{R}^\infty$. For $n \geq 1$, let $x(n) = (0, \dots, 0, \alpha_n, 0, \dots)$, where $\alpha_n$ stands at the $n$th place and the positive $\alpha_n$ is chosen such that $\|x(n)\| = 1$. Then the sequence $x(n)$ converges to zero coordinate-wise, but it does not converge to zero in this norm.

COROLLARY 2.1.– For any $n \geq 1$, the projective operator $P_n : \mathbb{R}^\infty \to \mathbb{R}^n$, $P_n x = (x_1, \dots, x_n)^\top$, is continuous.

PROOF.– Let $x(m) = (x_k(m))_{k=1}^{\infty} \to x = (x_k)_{k=1}^{\infty}$ in $\mathbb{R}^\infty$. By lemma 2.2, $\lim_{m \to \infty} x_k(m) = x_k$, $k = 1, 2, \dots$ Therefore,
$$P_n(x(m)) = (x_1(m), \dots, x_n(m))^\top \to P_n x = (x_1, \dots, x_n)^\top \quad \text{in } \mathbb{R}^n \text{ as } m \to \infty.$$
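The behaviour of the metric [2.1] can be explored numerically. The following is a minimal Python sketch, assuming sequences are represented as functions n ↦ x_n and the series [2.3] is truncated after finitely many terms (the neglected tail is at most 2^{-N}, because d < 1). The names `rho_truncated` and `spike` are hypothetical helpers introduced only for this illustration.

```python
def d(t, s):
    """Bounded metric [2.2] on the real line."""
    return abs(t - s) / (1.0 + abs(t - s))

def rho_truncated(x, y, n_terms):
    """Partial sum of [2.3] over the coordinates n = 1, ..., n_terms.

    x and y are functions mapping n to the n-th coordinate.
    The neglected tail does not exceed 2 ** (-n_terms).
    """
    return sum(d(x(n), y(n)) / 2.0 ** n for n in range(1, n_terms + 1))

def zero(n):
    """The zero sequence."""
    return 0.0

def spike(m, height):
    """Sequence with `height` at position m and zeros elsewhere."""
    return lambda n: height if n == m else 0.0

# Spikes of growing height m at position m tend to zero coordinate-wise,
# and their rho-distance to zero shrinks as well.
for m in (1, 5, 10, 20):
    print(m, rho_truncated(spike(m, float(m)), zero, 60))
```

Here the spike heights grow without bound, yet the distances to the zero sequence decrease. This matches the remark above: coordinate-wise convergence ignores the size of entries that move off to infinity, which is why no norm can generate it.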
2.1.2. Borel and cylindrical sigma-algebras coincide

Remember that the Borel sigma-algebra $\mathcal{B}(X)$ on a metric space $(X, \rho)$ is generated by the class $\mathcal{G}$ of all open sets in $X$: $\mathcal{B}(X) = \sigma a(\mathcal{G})$. This is applicable to the space $(\mathbb{R}^\infty, \rho)$. Since it is separable (see lemma 2.2), $\mathcal{B}(\mathbb{R}^\infty)$ is generated by the class of all closed balls:
$$\mathcal{B}(\mathbb{R}^\infty) = \sigma a(\{\bar{B}(x, r) : x \in \mathbb{R}^\infty,\ r > 0\}). \qquad [2.4]$$
Another way to generate this sigma-algebra is to use the so-called cylindrical sets.

DEFINITION 2.1.– Let $n \geq 1$, $A_n \in \mathcal{B}(\mathbb{R}^n)$. The set
$$\hat{A}_n = \{x \in \mathbb{R}^\infty : (x_1, \dots, x_n)^\top \in A_n\}$$
is called a cylinder with base $A_n$.

Consider the class of all cylinders
$$\mathrm{Cyl} = \{\hat{A}_n : n \geq 1,\ A_n \in \mathcal{B}(\mathbb{R}^n)\}. \qquad [2.5]$$
LEMMA 2.3.– (About cylindrical algebra) The class [2.5] is an algebra of sets in $\mathbb{R}^\infty$, but it is not a sigma-algebra.

PROOF.– a) A base of a cylinder is not uniquely defined. For $A_n \in \mathcal{B}(\mathbb{R}^n)$, it holds $\hat{A}_n = \hat{A}_{n+k}$, $k \geq 1$, with $A_{n+k} = A_n \times \mathbb{R}^k$, $A_{n+k} \in \mathcal{B}(\mathbb{R}^{n+k})$.

b) Let $C_1, C_2 \in \mathrm{Cyl}$. Without loss of generality, we may and do assume that they have bases of the same dimension, say, $n$: $C_1 = \hat{A}_n$, $C_2 = \hat{B}_n$, with $A_n, B_n \in \mathcal{B}(\mathbb{R}^n)$. Then
$$C_1 \cup C_2 = \widehat{A_n \cup B_n}, \quad A_n \cup B_n \in \mathcal{B}(\mathbb{R}^n) \quad \Rightarrow \quad C_1 \cup C_2 \in \mathrm{Cyl};$$
$$C_1 \setminus C_2 = \widehat{A_n \setminus B_n}, \quad A_n \setminus B_n \in \mathcal{B}(\mathbb{R}^n) \quad \Rightarrow \quad C_1 \setminus C_2 \in \mathrm{Cyl}.$$
Thus, Cyl is an algebra.

c) Let $C_n = \{x \in \mathbb{R}^\infty : x_1 = \dots = x_n = 0\}$, $n \geq 1$. These sets are cylinders, but $\bigcap_{n=1}^{\infty} C_n = \{0\} \subset \mathbb{R}^\infty$ is not a cylinder. Therefore, Cyl is not a sigma-algebra.

The class [2.5] is called the cylindrical algebra. It turns out that the sigma-algebra generated by Cyl (the so-called cylindrical sigma-algebra) coincides with the Borel sigma-algebra.

THEOREM 2.1.– (Cylindrical sigma-algebra coincides with Borel one) For the class of sets [2.5], $\sigma a(\mathrm{Cyl}) = \mathcal{B}(\mathbb{R}^\infty)$.
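PROOF.– We show inclusions in both directions.

a) For $\hat{A}_n \in \mathrm{Cyl}$, $\hat{A}_n = P_n^{-1} A_n$. Since $P_n : \mathbb{R}^\infty \to \mathbb{R}^n$ is continuous (see corollary 2.1), it holds $P_n^{-1} A_n \in \mathcal{B}(\mathbb{R}^\infty)$. Thus,
$$\mathrm{Cyl} \subset \mathcal{B}(\mathbb{R}^\infty) \quad \Rightarrow \quad \sigma a(\mathrm{Cyl}) \subset \mathcal{B}(\mathbb{R}^\infty).$$

b) We use relation [2.4]. Take any closed ball $\bar{B}(x, r)$:
$$\bar{B}(x, r) = \Big\{ y : \sum_{n=1}^{\infty} \frac{|y_n - x_n|}{2^n (1 + |y_n - x_n|)} \leq r \Big\} = \bigcap_{k=1}^{\infty} \Big\{ y \in \mathbb{R}^\infty : \sum_{n=1}^{k} \frac{|y_n - x_n|}{2^n (1 + |y_n - x_n|)} \leq r \Big\}.$$
The latter sets are cylinders $\hat{A}_k$, with closed bases
$$A_k = \Big\{ y \in \mathbb{R}^k : \sum_{n=1}^{k} \frac{|y_n - x_n|}{2^n (1 + |y_n - x_n|)} \leq r \Big\}.$$
Thus, $\bar{B}(x, r) \in \sigma a(\mathrm{Cyl})$, and
$$\sigma a(\{\bar{B}(x, r) : x \in \mathbb{R}^\infty,\ r > 0\}) = \mathcal{B}(\mathbb{R}^\infty) \subset \sigma a(\mathrm{Cyl}).$$
We checked the inclusions in both directions, and the statement is proven.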
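2.1.3. Weighted $l_2$ space

Classical sequence spaces are subsets of $\mathbb{R}^\infty$. The most popular of those spaces is the Hilbert space
$$l_2 = \Big\{ x \in \mathbb{R}^\infty : \sum_{n=1}^{\infty} x_n^2 < \infty \Big\}, \qquad [2.6]$$
with inner product
$$(x, y)_2 = \sum_{n=1}^{\infty} x_n y_n, \quad x, y \in l_2.$$
Consider a more general sequence space. Let $a = (a_1, \dots, a_n, \dots)$ be a sequence of positive numbers and
$$l_{2,a} := \Big\{ x \in \mathbb{R}^\infty : \sum_{n=1}^{\infty} a_n x_n^2 < \infty \Big\}. \qquad [2.7]$$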
Denote
$$(x, y)_a = \sum_{n=1}^{\infty} a_n x_n y_n, \quad x, y \in l_{2,a}. \qquad [2.8]$$
The series converges because
$$\sum_{n=1}^{\infty} a_n |x_n y_n| \leq \sum_{n=1}^{\infty} a_n (x_n^2 + y_n^2) < \infty.$$
It is easy to verify that [2.8] is an inner product in $l_{2,a}$. We call $l_{2,a}$ with inner product [2.8] the weighted $l_2$ space, with weights $a_n$, $n = 1, 2, \dots$ The induced norm is
$$\|x\|_a = \sqrt{(x, x)_a} = \Big( \sum_{n=1}^{\infty} a_n x_n^2 \Big)^{1/2}, \quad x \in l_{2,a}. \qquad [2.9]$$
LEMMA 2.4.– (About weighted $l_2$ space) The sequence space $l_{2,a}$, with inner product [2.8], is a Hilbert space.

PROOF.– Let $\tau$ be a measure on the sigma-algebra $2^{\mathbb{N}}$ of all subsets of $\mathbb{N}$,
$$\tau(A) = \sum_{n \in A} a_n, \quad A \subset \mathbb{N}$$
(in particular, $\tau(\emptyset) = 0$, because the sum over the empty set of indices is zero by convention). The space $L_2(\mathbb{N}, \tau)$ consists of functions $f : \mathbb{N} \to \mathbb{R}$ such that
$$\int_{\mathbb{N}} f^2(n)\, d\tau(n) = \sum_{n=1}^{\infty} a_n f^2(n) < \infty,$$
with inner product
$$(f, g)_{L_2} = \int_{\mathbb{N}} f(n) g(n)\, d\tau(n) = \sum_{n=1}^{\infty} a_n f(n) g(n), \quad f, g \in L_2(\mathbb{N}, \tau).$$
For $x \in l_{2,a}$, define $f_x(n) = x_n$, $n \in \mathbb{N}$. Then $f_x \in L_2(\mathbb{N}, \tau)$ and the operator
$$J : l_{2,a} \to L_2(\mathbb{N}, \tau), \quad Jx = f_x, \quad x \in l_{2,a}$$
is an isometry between $l_{2,a}$ and $L_2(\mathbb{N}, \tau)$. Indeed, it is a linear surjection, with
$$(Jx, Jy)_{L_2} = (f_x, f_y)_{L_2} = \sum_{n=1}^{\infty} a_n f_x(n) f_y(n) = \sum_{n=1}^{\infty} a_n x_n y_n = (x, y)_a.$$
Thus, $l_{2,a}$ and $L_2(\mathbb{N}, \tau)$ are isometric. But the latter space is a Hilbert space, as any $L_2$ space is. Therefore, $l_{2,a}$ is a Hilbert space as well.
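A small numerical illustration of the weighted inner product [2.8] may help. The sketch below works with finitely supported sequences only and assumes the weights a_n = n^{-2}; the helper names `weighted_inner` and `weighted_norm` are hypothetical. It also checks that rescaling the coordinates by √a_n turns [2.8] into the ordinary l2 inner product, which is the rescaling suggested in problem (4) of this section.

```python
import math

def weighted_inner(x, y, a):
    """Inner product [2.8] for finitely supported sequences with weights a."""
    return sum(an * xn * yn for an, xn, yn in zip(a, x, y))

def weighted_norm(x, a):
    """Norm [2.9] induced by the weighted inner product."""
    return math.sqrt(weighted_inner(x, x, a))

N = 50
a = [1.0 / n ** 2 for n in range(1, N + 1)]      # assumed weights a_n = n^(-2)
x = [1.0] * N                                     # first N coordinates of (1, 1, 1, ...)
y = [(-1.0) ** n / n for n in range(1, N + 1)]

# Rescale coordinates by sqrt(a_n) and take the ordinary inner product
Jx = [math.sqrt(an) * xn for an, xn in zip(a, x)]
Jy = [math.sqrt(an) * yn for an, yn in zip(a, y)]
plain_inner = sum(u * v for u, v in zip(Jx, Jy))

print(weighted_inner(x, y, a), plain_inner)       # the two values agree
print(weighted_norm(x, a) ** 2, math.pi ** 2 / 6) # partial sum of sum 1/n^2 vs its limit
```

With these weights, the constant sequence (1, 1, 1, ...) has finite weighted norm although it does not belong to l2, so l_{2,a} can be strictly larger than l2 when the weights decay.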
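See problem (4) below for another proof of lemma 2.4 based on an isometry between the sequence spaces $l_{2,a}$ and $l_2$.

LEMMA 2.5.– The set $l_{2,a}$ is a Borel subset of $\mathbb{R}^\infty$.

PROOF.– In view of theorem 2.1, it is enough to express $l_{2,a}$ through cylinders using a countable number of operations that are admissible in a sigma-algebra. We have
$$l_{2,a} = \bigcup_{N=1}^{\infty} \Big\{ x \in \mathbb{R}^\infty : \sum_{n=1}^{\infty} a_n x_n^2 \leq N \Big\} = \bigcup_{N=1}^{\infty} \bigcap_{k=1}^{\infty} \Big\{ x \in \mathbb{R}^\infty : \sum_{n=1}^{k} a_n x_n^2 \leq N \Big\}.$$
The latter set $\{x \in \mathbb{R}^\infty : \sum_{n=1}^{k} a_n x_n^2 \leq N\} =: A_{Nk}$ is a cylinder with closed base in $\mathbb{R}^k$, $A_{Nk} \in \sigma a(\mathrm{Cyl})$. Then
$$B_N := \bigcap_{k=1}^{\infty} A_{Nk} \in \sigma a(\mathrm{Cyl}), \quad l_{2,a} = \bigcup_{N=1}^{\infty} B_N \in \sigma a(\mathrm{Cyl}) = \mathcal{B}(\mathbb{R}^\infty)$$
(see theorem 2.1).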
Problems 2.1

1) Let $(X, \rho)$ be a metric space and $\varphi : [0, \infty) \to [0, \infty)$ be a concave function, with $\varphi(0) = 0$, which is not identically zero. Prove that $(X, \varphi(\rho))$ is a metric space as well.

2) Based on problem (1), give an alternative proof of the fact that the function [2.2] is a metric on $\mathbb{R}$.

3) Prove that the space $(\mathbb{R}^\infty, \rho)$ is complete, with the metric given in [2.1].

4) Show that the operator $J : l_{2,a} \to l_2$, $Jx = (\sqrt{a_n}\, x_n)_1^\infty$, is an isometry between $l_{2,a}$ and $l_2$. Then give an alternative proof of lemma 2.4. Moreover, prove that the space $l_{2,a}$ is separable.

5) Prove that the following sequence spaces are Borel subsets of $\mathbb{R}^\infty$: a) $l_p$, $1 \leq p < \infty$; b) the space $l_\infty$ of bounded real sequences; c) the space $c_0$ of real sequences convergent to zero; d) the space $c$ of real convergent sequences.
2.2. Product measure in R∞ If we want to construct a measure on B(R∞ ), we can define it first on the cylindrical algebra [2.5] and then extend it to σa(Cyl) = B(R∞ ) using Carathéodory’s theorem (see [HAL 13]). 2.2.1. Kolmogorov extension theorem D EFINITION 2.2.– For each n ≥ 1, let μn be a probability measure on B(Rn ). The sequence {μn , n ≥ 1} is called consistent if for each n ≥ 1 and each Bn ∈ B(Rn ), μn+1 (Bn × R) = μn (Bn ).
[2.10]
EXAMPLE 2.1.– (Consistent sequence of projections) Let $\nu$ be a probability measure on $\mathcal{B}(\mathbb{R}^\infty)$. Consider the so-called projections of $\nu$:
$$\nu_n(B_n) = \nu(\hat{B}_n), \quad n \geq 1, \quad B_n \in \mathcal{B}(\mathbb{R}^n).$$
(Here, $\hat{B}_n$ is the cylinder with base $B_n$; recall definition 2.1.) Then the sequence $\{\nu_n, n \geq 1\}$ is consistent.

PROOF.– Remember that $P_n : \mathbb{R}^\infty \to \mathbb{R}^n$ is the projective operator from corollary 2.1. It holds $\nu_n = \nu P_n^{-1}$; therefore, $\nu_n$ is a probability measure on $\mathcal{B}(\mathbb{R}^n)$. For $B_n \in \mathcal{B}(\mathbb{R}^n)$, it holds
$$\nu_{n+1}(B_n \times \mathbb{R}) = \nu(\widehat{B_n \times \mathbb{R}}) = \nu(\hat{B}_n) = \nu_n(B_n).$$
Thus, the sequence $\{\nu_n, n \geq 1\}$ is consistent.
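LEMMA 2.6.– (Condition for sigma-additivity) Let $\mu$ be a non-negative, additive and finite set function on an algebra $\mathcal{A}$, and moreover, for each sequence $B_n$ of sets from the algebra that decrease to $\emptyset$, $\mu(B_n) \to 0$ as $n \to \infty$. Then $\mu$ is a measure on $\mathcal{A}$.

PROOF.– Let $\{A_n, n = 1, 2, \dots\} \subset \mathcal{A}$ be disjoint sets and $\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}$. Then
$$\mu\Big( \bigcup_{n=1}^{\infty} A_n \Big) = \sum_{n=1}^{k} \mu(A_n) + \mu\Big( \bigcup_{n=k+1}^{\infty} A_n \Big). \qquad [2.11]$$
The sets $B_k := \bigcup_{n=k+1}^{\infty} A_n = \big( \bigcup_{n=1}^{\infty} A_n \big) \setminus \bigcup_{n=1}^{k} A_n$ belong to $\mathcal{A}$ and decrease to $\emptyset$. Therefore, $\mu(B_k) \to 0$ as $k \to \infty$. Now, let $k \to \infty$ in [2.11] and get
$$\mu\Big( \bigcup_{n=1}^{\infty} A_n \Big) = \sum_{n=1}^{\infty} \mu(A_n).$$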
Thus, $\mu$ is non-negative and sigma-additive on $\mathcal{A}$. Hence, it is a measure on $\mathcal{A}$.

It turns out that any consistent sequence of measures has the form presented in example 2.1.

THEOREM 2.2.– (Kolmogorov extension theorem) Let $\mu_n$ be a probability measure on $\mathcal{B}(\mathbb{R}^n)$, $n \geq 1$, and let the sequence $\{\mu_n\}$ be consistent. Then there exists a unique probability measure $\mu$ on $\mathcal{B}(\mathbb{R}^\infty)$ such that for all $n \geq 1$,
$$\mu(\hat{B}_n) = \mu_n(B_n), \quad B_n \in \mathcal{B}(\mathbb{R}^n).$$

PROOF.– a) Define a set function on the cylindrical algebra. For each $\hat{A}_n \in \mathrm{Cyl}$, we put
$$\mu(\hat{A}_n) = \mu_n(A_n), \quad n \geq 1, \quad A_n \in \mathcal{B}(\mathbb{R}^n).$$
The consistency of $\{\mu_n\}$ implies that $\mu$ is well-defined. Indeed,
$$\mu_{n+k}(A_n \times \mathbb{R}^k) = \mu_{n+k-1}(A_n \times \mathbb{R}^{k-1}) = \dots = \mu_{n+1}(A_n \times \mathbb{R}) = \mu_n(A_n).$$
Next, $\mu$ is additive. Indeed, let $C_1, \dots, C_k$ be disjoint cylinders. Without loss of generality, we may and do assume that they have bases of equal dimensions:
$$C_i = \hat{B}_{in}, \quad B_{in} \in \mathcal{B}(\mathbb{R}^n), \quad i = 1, \dots, k.$$
Then the bases are disjoint as well, and
$$\mu\Big( \bigcup_{i=1}^{k} C_i \Big) = \mu\Big( \widehat{\bigcup_{i=1}^{k} B_{in}} \Big) = \mu_n\Big( \bigcup_{i=1}^{k} B_{in} \Big) = \sum_{i=1}^{k} \mu_n(B_{in}) = \sum_{i=1}^{k} \mu(C_i).$$
Hence, $\mu$ is an additive, non-negative and finite set function on Cyl.

b) Show that $\mu$ satisfies the conditions of lemma 2.6 on the algebra Cyl. We prove this by contradiction. Assume that there exists a sequence $C_n$ of cylinders which decreases to $\emptyset$ and such that $\mu(C_n) \to \delta > 0$ as $n \to \infty$. Without loss of generality, we assume that $C_n = \hat{B}_n$, $B_n \in \mathcal{B}(\mathbb{R}^n)$, $n \geq 1$.

Now, by the regularity of the measure $\mu_n$ on $\mathbb{R}^n$ (see [BOG 07]), for the fixed $\delta > 0$ there exists a compact set $A_n \subset B_n$ with
$$\mu_n(B_n \setminus A_n) \leq \frac{\delta}{2^{n+1}}.$$
Hence
$$\mu(\hat{B}_n \setminus \hat{A}_n) = \mu(\widehat{B_n \setminus A_n}) = \mu_n(B_n \setminus A_n) \leq \frac{\delta}{2^{n+1}}.$$
Now, we form a decreasing sequence of cylinders
$$\hat{D}_n = \bigcap_{k=1}^{n} \hat{A}_k, \quad n \geq 1.$$
Here, the base $D_n$ is a compact set in $\mathbb{R}^n$, and $\bigcap_{n=1}^{\infty} \hat{D}_n = \emptyset$ because
$$\bigcap_{n=1}^{\infty} \hat{D}_n \subset \bigcap_{n=1}^{\infty} \hat{B}_n = \emptyset.$$
The sets $\hat{B}_n$ are decreasing; therefore,
$$\mu(\hat{B}_n \setminus \hat{D}_n) \leq \sum_{k=1}^{n} \mu(\hat{B}_n \setminus \hat{A}_k) \leq \sum_{k=1}^{n} \mu(\hat{B}_k \setminus \hat{A}_k) \leq \sum_{k=1}^{n} \frac{\delta}{2^{k+1}} \leq \frac{\delta}{2}.$$
Then
$$\lim_{n \to \infty} \mu(\hat{D}_n) \geq \lim_{n \to \infty} \mu(\hat{B}_n) - \frac{\delta}{2} = \frac{\delta}{2} > 0,$$
and $\hat{D}_n \neq \emptyset$ for all $n \geq 1$.

Finally, let $x(n) = (x_k(n))_{k=1}^{\infty} \in \hat{D}_n$, $n \geq 1$. Consider $\{x_1(n)\} \subset D_1$; $D_1$ is compact; hence there exists a subsequence $\{x_1(n_1)\}$ that converges to $x_1^0 \in D_1$. Next, consider $\{(x_1(n_1), x_2(n_1))^\top, n_1 \geq 2\} \subset D_2$; $D_2$ is compact; hence there exists a subsequence $\{(x_1(n_2), x_2(n_2))^\top, n_2 \geq 2\}$ that converges to $(x_1^0, x_2^0)^\top \in D_2$. We continue the process, and for each $k \geq 1$, we obtain an embedded subsequence $\{(x_1(n_k), \dots, x_k(n_k))^\top\}$ that converges to $(x_1^0, \dots, x_k^0)^\top \in D_k$. The point $(x_1^0, \dots, x_k^0, \dots)$ belongs to $\hat{D}_k$ for all $k \geq 1$, and therefore it belongs to $\bigcap_{k=1}^{\infty} \hat{D}_k$. But the latter intersection is empty. The obtained contradiction shows that $\mu$ satisfies the conditions of lemma 2.6, and $\mu$ is a measure on Cyl.

c) By Carathéodory's theorem, the probability measure $\mu$ can be uniquely extended to a probability measure on $\sigma a(\mathrm{Cyl}) = \mathcal{B}(\mathbb{R}^\infty)$.

2.2.2. Construction of product measure on $\mathcal{B}(\mathbb{R}^\infty)$

Remember the definition of a product of a finite number of measures. Start with two measure spaces $(X_i, \mathcal{F}_i, \mu_i)$, $i = 1, 2$, with probability measures $\mu_i$. The product measure $\mu_1 \times \mu_2$ is a probability measure defined on the product space $(X_1 \times X_2, \mathcal{F})$, $\mathcal{F} = \sigma a(\mathcal{F}_1 \times \mathcal{F}_2)$, $\mathcal{F}_1 \times \mathcal{F}_2 := \{A_1 \times A_2 : A_1 \in \mathcal{F}_1, A_2 \in \mathcal{F}_2\}$. It holds
$$(\mu_1 \times \mu_2)(A_1 \times A_2) = \mu_1(A_1) \mu_2(A_2), \quad A_i \in \mathcal{F}_i, \quad i = 1, 2,$$
and for $A \in \mathcal{F}$,
$$(\mu_1 \times \mu_2)(A) = \int_{X_2} \mu_1(A_{x_2})\, d\mu_2(x_2) = \int_{X_1} \mu_2(A_{x_1})\, d\mu_1(x_1).$$
Here $A_{x_i}$ are sections of the set $A$,
$$A_{x_2} = \{x_1 \in X_1 : (x_1, x_2) \in A\}, \quad A_{x_1} = \{x_2 \in X_2 : (x_1, x_2) \in A\}.$$
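Given three probability spaces $(X_i, \mathcal{F}_i, \mu_i)$, $i = 1, 2, 3$, one can form the product space $\big( \prod_{i=1}^{3} X_i, \mathcal{F}^{(3)} \big)$, with $\mathcal{F}^{(3)} = \sigma a(\mathcal{F}_1 \times \mathcal{F}_2 \times \mathcal{F}_3)$,
$$\prod_{i=1}^{3} \mathcal{F}_i := \Big\{ \prod_{i=1}^{3} A_i : A_i \in \mathcal{F}_i,\ i = 1, 2, 3 \Big\}.$$
Let us identify the Cartesian products $A_1 \times A_2 \times A_3$ and $(A_1 \times A_2) \times A_3$; we also identify $A_1 \times A_2 \times A_3$ and $A_1 \times (A_2 \times A_3)$. The product measure $\mu_1 \times \mu_2 \times \mu_3 := (\mu_1 \times \mu_2) \times \mu_3 = \mu_1 \times (\mu_2 \times \mu_3)$ is a probability measure on $\big( \prod_{i=1}^{3} X_i, \mathcal{F}^{(3)} \big)$ such that
$$\Big( \prod_{i=1}^{3} \mu_i \Big) \Big( \prod_{i=1}^{3} A_i \Big) = \prod_{i=1}^{3} \mu_i(A_i), \quad A_i \in \mathcal{F}_i, \quad i = 1, 2, 3.$$
Using sections of a set $A \in \mathcal{F}^{(3)}$, one can write, for instance,
$$\Big( \prod_{i=1}^{3} \mu_i \Big)(A) = \int_{X_2 \times X_3} \mu_1(A_{x_2 x_3})\, d(\mu_2 \times \mu_3)(x_2, x_3) = \int_{X_3} (\mu_1 \times \mu_2)(A_{x_3})\, d\mu_3(x_3).$$
By induction, given $n$ probability spaces $(X_i, \mathcal{F}_i, \mu_i)$, $1 \leq i \leq n$, the product measure $\mu^{(n)} = \prod_{i=1}^{n} \mu_i$ is a probability measure on the product space $\big( \prod_{i=1}^{n} X_i, \mathcal{F}^{(n)} \big)$, $\mathcal{F}^{(n)} := \sigma a(\mathcal{F}_1 \times \mathcal{F}_2 \times \dots \times \mathcal{F}_n)$, such that
$$\mu^{(n)} = \Big( \prod_{i=1}^{n-1} \mu_i \Big) \times \mu_n.$$
It holds
$$\mu^{(n)} \Big( \prod_{i=1}^{n} A_i \Big) = \prod_{i=1}^{n} \mu_i(A_i), \quad A_i \in \mathcal{F}_i, \quad i = 1, \dots, n. \qquad [2.12]$$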
Now, we want to define a product of a sequence of measures. We need this construction for measures on real line.
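Let $\{\mu_n, n \geq 1\}$ be a sequence of probability measures on $\mathcal{B}(\mathbb{R})$. For any $n \geq 1$, consider the product measure
$$\nu_n := \prod_{i=1}^{n} \mu_i \qquad [2.13]$$
on $\sigma a\big( \prod_{i=1}^{n} \mathcal{F}_i \big) = \mathcal{B}(\mathbb{R}^n)$, where $\mathcal{F}_i \equiv \mathcal{B}(\mathbb{R})$.

LEMMA 2.7.– (About product measure) For the probability measures [2.13], there exists a unique probability measure $\nu$ on $\mathcal{B}(\mathbb{R}^\infty)$ such that for all $n \geq 1$,
$$\nu(\hat{B}_n) = \nu_n(B_n) = \Big( \prod_{i=1}^{n} \mu_i \Big)(B_n), \quad B_n \in \mathcal{B}(\mathbb{R}^n).$$

PROOF.– Check that the sequence $\{\nu_n\}$ is consistent (see definition 2.2). Because $\nu_{n+1} = \nu_n \times \mu_{n+1}$, it holds
$$\nu_{n+1}(B_n \times \mathbb{R}) = \nu_n(B_n) \mu_{n+1}(\mathbb{R}) = \nu_n(B_n), \quad B_n \in \mathcal{B}(\mathbb{R}^n), \quad n \geq 1.$$
Thus, $\{\nu_n\}$ is consistent. Now, the statement follows from theorem 2.2.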
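The consistency condition [2.10] for finite products can be checked directly in a toy setting. The following Python sketch assumes discrete marginal measures, each stored as a dictionary {atom: probability}; the representation and the names `product_measure` and `measure_of` are illustrative assumptions, and the example does not cover general Borel measures.

```python
from itertools import product

def product_measure(marginals):
    """Finite product of discrete measures, each given as {atom: probability}."""
    nu = {}
    for atoms in product(*(m.keys() for m in marginals)):
        p = 1.0
        for m, atom in zip(marginals, atoms):
            p *= m[atom]
        nu[atoms] = p
    return nu

def measure_of(nu, event):
    """nu(B) for a set B described by a predicate on points (tuples)."""
    return sum(p for point, p in nu.items() if event(point))

# Assumed toy marginals mu_1, mu_2, mu_3 on the real line
mu = [{0: 0.5, 1: 0.5}, {0: 0.9, 1: 0.1}, {-1: 0.2, 0: 0.3, 2: 0.5}]

nu2 = product_measure(mu[:2])   # nu_2 = mu_1 x mu_2
nu3 = product_measure(mu[:3])   # nu_3 = mu_1 x mu_2 x mu_3

def in_B2(pt):
    """Base set B_2 = {(x1, x2) : x1 + x2 >= 1} in R^2."""
    return pt[0] + pt[1] >= 1

lhs = measure_of(nu3, lambda pt: in_B2(pt[:2]))   # nu_3(B_2 x R)
rhs = measure_of(nu2, in_B2)                      # nu_2(B_2)
print(lhs, rhs)
```

The two printed values agree up to rounding, because the third factor integrates to μ_3(R) = 1, exactly as in the proof of lemma 2.7.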
DEFINITION 2.3.– The probability measure $\nu$ from lemma 2.7 is called a product measure in $\mathbb{R}^\infty$ and is denoted as
$$\nu = \mu_1 \times \mu_2 \times \dots \times \mu_n \times \dots = \prod_{n=1}^{\infty} \mu_n.$$
2.2.3. Properties of product measure

The next statement extends property [2.12] to the product of a sequence of measures.

THEOREM 2.3.– Let $\{\mu_n\}$ be a sequence of probability measures on $\mathcal{B}(\mathbb{R})$, $\{A_n\}$ be a sequence of Borel subsets of $\mathbb{R}$ and $\nu = \prod_{n=1}^{\infty} \mu_n$ in $\mathbb{R}^\infty$. Then $A := \prod_{n=1}^{\infty} A_n \in \mathcal{B}(\mathbb{R}^\infty)$ and
$$\nu(A) = \prod_{n=1}^{\infty} \mu_n(A_n).$$

PROOF.– Let $B_n = A_1 \times \dots \times A_n$. It holds
$$A = \bigcap_{n=1}^{\infty} \hat{B}_n \in \sigma a(\mathrm{Cyl}) = \mathcal{B}(\mathbb{R}^\infty).$$
Here we used the fact that the product of Borel sets $\prod_{i=1}^{n} A_i$ is a Borel set in $\mathbb{R}^n$.

The cylinders $C_n = \hat{B}_n$ are decreasing to the set $A$, and the upper continuity of the measure $\nu$ implies
$$\nu(A) = \lim_{n \to \infty} \nu(C_n) = \lim_{n \to \infty} \nu_n(A_1 \times \dots \times A_n) = \lim_{n \to \infty} \prod_{i=1}^{n} \mu_i(A_i) = \prod_{i=1}^{\infty} \mu_i(A_i).$$
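Theorem 2.3 can be illustrated numerically when every factor is the standard Gaussian measure g on the real line and A_n = [−c_n, c_n], since g([−c, c]) = erf(c/√2). The Python sketch below computes the partial products ν_n(A_1 × … × A_n), whose limit is ν(A). The two radius sequences (constant c_n = 1 and slowly growing c_n = 2√log(n + 1)) and the function names are assumptions made for the illustration.

```python
import math

def g_interval(c):
    """Standard Gaussian measure of [-c, c], equal to erf(c / sqrt(2))."""
    return math.erf(c / math.sqrt(2.0))

def partial_product(radius, n_max):
    """Product of g([-radius(k), radius(k)]) over k = 1, ..., n_max."""
    p = 1.0
    for k in range(1, n_max + 1):
        p *= g_interval(radius(k))
    return p

def constant(k):
    """Radii c_k = 1 for all k."""
    return 1.0

def growing(k):
    """Radii c_k = 2 * sqrt(log(k + 1)), increasing slowly to infinity."""
    return 2.0 * math.sqrt(math.log(k + 1.0))

for n in (1, 10, 100, 1000):
    print(n, partial_product(constant, n), partial_product(growing, n))
```

With constant radii the partial products decay geometrically, so the infinite box has measure zero. With the growing radii, the Gaussian tail bound g(A_k^c) ≤ exp(−c_k²/2) = (k + 1)^{-2} keeps every partial product above ∏_{k≥1}(1 − (k + 1)^{-2}) = 1/2, so the limit is positive.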
COROLLARY 2.2.– Let $\{\mu_n\}$ be a sequence of probability measures on $\mathcal{B}(\mathbb{R})$, $\{A_n\}$ be a sequence of Borel sets with $\mu_n(A_n) > 0$, $n \geq 1$, and $\nu = \prod_{n=1}^{\infty} \mu_n$ in $\mathbb{R}^\infty$. Then $\nu\big( \prod_{n=1}^{\infty} A_n \big) > 0$ if, and only if, $\sum_{n=1}^{\infty} \mu_n(A_n^c) < \infty$ (hereafter $A^c$ is the complement of $A$).

PROOF.– By theorem 2.3,
$$\nu\Big( \prod_{n=1}^{\infty} A_n \Big) = \prod_{n=1}^{\infty} \big( 1 - \mu_n(A_n^c) \big).$$
Here $0 \leq \mu_n(A_n^c) < 1$, $n \geq 1$. According to the test for convergence of infinite products, the latter infinite product converges to a positive limit if, and only if, $\sum_{n=1}^{\infty} \mu_n(A_n^c) < \infty$.

Thus, $\nu\big( \prod_{n=1}^{\infty} A_n \big) > 0$ if, and only if, the values $\mu_n(A_n^c)$ are quite small, or equivalently, the values $\mu_n(A_n)$ are quite large.

Now, we consider mappings from a probability space to $\mathbb{R}^\infty$.

DEFINITION 2.4.– Let $(\Omega, \mathcal{F}, P)$ be a probability space and $(X, \rho)$ be a metric space. A mapping $\xi : \Omega \to X$ is called a random element distributed in $X$ if $\xi$ is $(\mathcal{F}, \mathcal{B}(X))$ measurable. The induced probability measure $\mu_\xi := P \xi^{-1}$ is called the distribution of $\xi$.

Thus, for a random element $\xi$ distributed in $X$, it holds $\xi^{-1} B \in \mathcal{F}$ for all $B \in \mathcal{B}(X)$, and
$$\mu_\xi(B) = (P \xi^{-1})(B) = P\{\omega : \xi(\omega) \in B\}, \quad B \in \mathcal{B}(X).$$
In particular, if $X = \mathbb{R}$, then the random element is just a r.v., and if $X = \mathbb{R}^n$, then it is a random vector.

LEMMA 2.8.– (About components of random element) Consider a mapping $\xi = (\xi_n)_1^\infty : \Omega \to \mathbb{R}^\infty$. It is a random element in $\mathbb{R}^\infty$ if, and only if, $\xi_n$ is a r.v. for each $n \geq 1$.
PROOF.– a) The coordinate projector
$$\pi_n x = x_n, \quad x \in \mathbb{R}^\infty \qquad [2.14]$$
is continuous by lemma 2.2; hence $\pi_n$ is a Borel mapping. Assume that $\xi$ is a random element in $\mathbb{R}^\infty$. Then $\xi_n = \pi_n(\xi) : \Omega \to \mathbb{R}$ is $(\mathcal{F}, \mathcal{B}(\mathbb{R}))$ measurable as a composition of $(\mathcal{F}, \mathcal{B}(\mathbb{R}^\infty))$ and $(\mathcal{B}(\mathbb{R}^\infty), \mathcal{B}(\mathbb{R}))$ measurable mappings. Thus, $\xi_n$ is a r.v.

b) Assume that $\xi_n$ is a r.v. for all $n \geq 1$. Then $\xi^{(k)} := (\xi_n)_1^k$ is a random vector in $\mathbb{R}^k$. Now, take any cylinder $\hat{A}_k \in \mathrm{Cyl}$. It holds
$$\xi^{-1}(\hat{A}_k) = \{\omega \in \Omega : \xi^{(k)}(\omega) \in A_k\} \in \mathcal{F},$$
because $A_k \in \mathcal{B}(\mathbb{R}^k)$ and $\xi^{(k)}$ is a random vector. Since $\sigma a(\mathrm{Cyl}) = \mathcal{B}(\mathbb{R}^\infty)$, the mapping $\xi$ is $(\mathcal{F}, \mathcal{B}(\mathbb{R}^\infty))$ measurable, and $\xi$ is a random element in $\mathbb{R}^\infty$.

Remember that random variables $Y_1, \dots, Y_n, \dots$, which are defined on the same probability space $(\Omega, \mathcal{F}, P)$, are independent if for each $n \geq 1$, the random variables $Y_1, \dots, Y_n$ are independent. This means that
$$P\{Y_1 \in B_1, Y_2 \in B_2, \dots, Y_n \in B_n\} = \prod_{k=1}^{n} P\{Y_k \in B_k\},$$
for each $n \geq 1$ and each $B_1, \dots, B_n \in \mathcal{B}(\mathbb{R})$.

Consider a sequence $\xi_1, \xi_2, \dots, \xi_n, \dots$ of random variables on the same probability space and let $\xi = (\xi_n)_1^\infty : \Omega \to \mathbb{R}^\infty$. By lemma 2.8, it is a random element in $\mathbb{R}^\infty$.

LEMMA 2.9.– (About distribution of element with independent components) The random variables $\xi_n$ are independent if, and only if, the distribution of $\xi$ is a product measure:
$$\mu_\xi = \prod_{n=1}^{\infty} \mu_n, \qquad [2.15]$$
and then $\mu_n$ is the distribution of $\xi_n$: $\mu_n = \mu_{\xi_n}$, $n \geq 1$.
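PROOF.– a) Assume that $\xi_n$, $n \geq 1$, are independent. For a cylinder $\hat{B}_n$, we have (here $\xi^{(n)} = (\xi_k)_1^n$):
$$\mu_\xi(\hat{B}_n) = P\{\xi \in \hat{B}_n\} = P\{\xi^{(n)} \in B_n\} = \mu_{\xi^{(n)}}(B_n).$$
Since the components of $\xi^{(n)}$ are independent,
$$\mu_{\xi^{(n)}} = \prod_{k=1}^{n} \mu_{\xi_k}$$
(see section 1.4.3), and
$$\mu_\xi(\hat{B}_n) = \Big( \prod_{k=1}^{n} \mu_{\xi_k} \Big)(B_n).$$
Thus (see definition 2.3), $\mu_\xi = \prod_{k=1}^{\infty} \mu_{\xi_k}$.

b) Now, assume [2.15]. Then for $A_1, \dots, A_n \in \mathcal{B}(\mathbb{R})$,
$$P\{\xi_1 \in A_1, \dots, \xi_n \in A_n\} = P\Big\{ \xi \in \widehat{\prod_{k=1}^{n} A_k} \Big\} = \mu_\xi\Big( \widehat{\prod_{k=1}^{n} A_k} \Big) = \prod_{k=1}^{n} \mu_k(A_k).$$
But $\mu_k(A_k) = \mu_\xi(\pi_k^{-1} A_k) = P\{\xi \in \pi_k^{-1} A_k\} = P\{\xi_k \in A_k\}$. Therefore,
$$P\{\xi_1 \in A_1, \dots, \xi_n \in A_n\} = \prod_{k=1}^{n} P\{\xi_k \in A_k\}.$$
This holds for any $n \geq 1$ and $A_1, \dots, A_n \in \mathcal{B}(\mathbb{R})$. Hence, $\xi_n$, $n \geq 1$, are independent.

COROLLARY 2.3.– (Existence of independent sequence) Let $\mu_n$, $n = 1, 2, \dots$, be a sequence of probability measures on $\mathcal{B}(\mathbb{R})$. Then there exists a sequence $\xi_n$, $n = 1, 2, \dots$, of independent random variables with $\mu_{\xi_n} = \mu_n$, $n = 1, 2, \dots$

PROOF.– Introduce a probability space
$$(\Omega, \mathcal{F}, P) = \Big( \mathbb{R}^\infty, \mathcal{B}(\mathbb{R}^\infty), \prod_{n=1}^{\infty} \mu_n \Big)$$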
and mappings $\xi_n : \Omega \to \mathbb{R}$, $\xi_n(\omega) = \omega_n$, $n = 1, 2, \dots$ Then the identity mapping $\xi = (\xi_n)_1^\infty$ is a random element in $\mathbb{R}^\infty$. The distribution of $\xi$ at a Borel set $B$ equals
$$\mu_\xi(B) = P\{\xi \in B\} = P(B) = \Big( \prod_{n=1}^{\infty} \mu_n \Big)(B),$$
and $\mu_\xi = \prod_{n=1}^{\infty} \mu_n$. Now, by lemma 2.9, $\xi_n$ are independent random variables and $\mu_{\xi_n} = \mu_n$, $n = 1, 2, \dots$

Problems 2.2

6) Generalize theorem 2.2 for consistent sequences of probability measures on $(X_1, \mathcal{B}(X_1))$, $(X_1 \times X_2, \mathcal{B}(X_1 \times X_2))$, ..., where $X_n$ is a complete separable metric space, $n = 1, 2, \dots$ Hint. Let $X$ be a complete separable metric space. Every probability measure $\mu$ on $\mathcal{B}(X)$ is regular; in particular, for each $A \in \mathcal{B}(X)$ and each $\varepsilon > 0$ there exists a compact set $K \subset A$ such that $\mu(A \setminus K) < \varepsilon$ (see [BOG 07]).

7) Let $\mu_n$ be a probability measure in a complete separable metric space $X_n$, $n = 1, 2, \dots$ Based on problem (6), construct a product measure $\prod_{n=1}^{\infty} \mu_n$ in the metric space $X_\infty = \prod_{n=1}^{\infty} X_n$.

8) Based on the Borel–Cantelli lemma, give an alternative proof of the following part of corollary 2.2: if under the conditions of corollary 2.2 $\nu\big( \prod_{n=1}^{\infty} A_n \big) > 0$, then $\sum_{n=1}^{\infty} \mu_n(A_n^c) < \infty$.

9) Based on problem (7), prove the following generalization of corollary 2.3: let $\mu_n$ be a probability measure on a complete separable metric space $X_n$, $n = 1, 2, \dots$; then there exists a sequence $\xi_n$, $n = 1, 2, \dots$, of independent random elements such that $\xi_n$ is distributed on $X_n$ and $\mu_{\xi_n} = \mu_n$, $n = 1, 2, \dots$

2.3. Standard Gaussian measure in $\mathbb{R}^\infty$

In section 1.5.4, we considered the standard Gaussian measure on the real line and in Euclidean space. Let $g$ be the standard Gaussian measure on $\mathbb{R}$, i.e.
$$g(B) = \frac{1}{\sqrt{2\pi}} \int_B e^{-x^2/2}\, dx, \quad B \in \mathcal{B}(\mathbb{R}).$$

DEFINITION 2.5.– The product measure $\mu = \prod_{n=1}^{\infty} \mu_n$ is called the standard Gaussian measure in $\mathbb{R}^\infty$ if $\mu_n = g$ for all $n \geq 1$.
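DEFINITION 2.6.– We call a sequence $\gamma_n$, $n = 1, 2, \dots$, of i.i.d. $N(0, 1)$ random variables Gaussian white noise.

Gaussian white noise $\{\gamma_n\}$ yields a random element $\gamma = (\gamma_n)_1^\infty$ distributed in $\mathbb{R}^\infty$. Its distribution $\mu_\gamma$ is just the standard Gaussian measure in $\mathbb{R}^\infty$, because by lemma 2.9
$$\mu_\gamma = \prod_{n=1}^{\infty} \mu_{\gamma_n}, \quad \mu_{\gamma_n} = g, \quad n = 1, 2, \dots$$
Remember that the sequence space $l_{2,a}$ was introduced in section 2.1.3. Here $a = (a_n)_1^\infty$, with $a_n > 0$, $n = 1, 2, \dots$ Remember also that $l_{2,a} \in \mathcal{B}(\mathbb{R}^\infty)$. We will show that the standard Gaussian measure is concentrated on $l_{2,a}$ if $a \in l_1$.

THEOREM 2.4.– (Kolmogorov–Khinchin criterion) Let $\mu$ be the standard Gaussian measure in $\mathbb{R}^\infty$. Then
$$\mu(l_{2,a}) = \begin{cases} 0 & \text{if } \sum_{n=1}^{\infty} a_n = \infty, \\ 1 & \text{if } \sum_{n=1}^{\infty} a_n < \infty. \end{cases}$$

PROOF.– a) For a fixed $\lambda > 0$ and $N \geq 1$, introduce a function
$$f_\lambda^N(x) = \exp\Big\{ -\frac{\lambda}{2} \sum_{k=1}^{N} a_k x_k^2 \Big\}, \quad x \in \mathbb{R}^\infty.$$
Functions of this kind, which depend only on a finite number of coordinates, are called cylindrical. The function $f_\lambda^N$ is continuous and bounded ($0 < f_\lambda^N \leq 1$), hence $f_\lambda^N \in L(\mathbb{R}^\infty, \mu)$. This function is generated by the projective operator $P_N : \mathbb{R}^\infty \to \mathbb{R}^N$ and the function
$$f_\lambda^{(N)}(x_1, \dots, x_N) = \exp\Big\{ -\frac{\lambda}{2} \sum_{k=1}^{N} a_k x_k^2 \Big\}.$$
In fact, $f_\lambda^N = f_\lambda^{(N)} \circ P_N$.

Denote $\nu_N = \prod_{k=1}^{N} \mu_k$, $\mu_k \equiv g$. Then $\nu_N = \mu P_N^{-1}$. Using the change of variables formula and problem (26) from Chapter 1, we obtain:
$$\int_{\mathbb{R}^\infty} f_\lambda^N\, d\mu = \int_{\mathbb{R}^\infty} f_\lambda^{(N)}(P_N x)\, d\mu(x) = \int_{\mathbb{R}^N} f_\lambda^{(N)}(t)\, d\nu_N(t) = \frac{1}{\sqrt{\prod_{k=1}^{N} (1 + \lambda a_k)}}.$$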
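For fixed $x \in \mathbb{R}^\infty$,
$$f_\lambda(x) := \lim_{N \to \infty} f_\lambda^N(x) = \begin{cases} \exp\big\{ -\frac{\lambda}{2} \sum_{k=1}^{\infty} a_k x_k^2 \big\}, & \text{if } x \in l_{2,a}, \\ 0, & \text{if } x \in \mathbb{R}^\infty \setminus l_{2,a}. \end{cases}$$
Moreover, $0 < f_\lambda^N(x) \leq 1$, $1 \in L(\mathbb{R}^\infty, \mu)$, and by the Lebesgue dominated convergence theorem:
$$\int_{\mathbb{R}^\infty} f_\lambda\, d\mu = \lim_{N \to \infty} \int_{\mathbb{R}^\infty} f_\lambda^N\, d\mu = \frac{1}{\sqrt{\prod_{k=1}^{\infty} (1 + \lambda a_k)}}. \qquad [2.16]$$
If the infinite product converges, then $\int_{\mathbb{R}^\infty} f_\lambda\, d\mu$ is finite and positive; otherwise, if the infinite product diverges to infinity, then the right-hand side of [2.16] equals zero by convention, and $\int_{\mathbb{R}^\infty} f_\lambda\, d\mu = 0$.

b) Assume that $\sum_{k=1}^{\infty} a_k = \infty$. Then by the test for convergence of infinite products, $\prod_{k=1}^{\infty} (1 + \lambda a_k) = \infty$. It holds
$$0 = \int_{\mathbb{R}^\infty} f_\lambda\, d\mu = \int_{l_{2,a}} f_\lambda\, d\mu.$$
Here, we used the fact that $f_\lambda$ vanishes on $\mathbb{R}^\infty \setminus l_{2,a}$. Next, $f_\lambda > 0$ on $l_{2,a}$, and therefore, $\mu(l_{2,a}) = 0$.

c) Now, assume that $\sum_{k=1}^{\infty} a_k < \infty$. The infinite product in [2.16] converges; thus,
$$\int_{\mathbb{R}^\infty} f_\lambda\, d\mu = \exp\Big\{ -\frac{1}{2} \sum_{k=1}^{\infty} \log(1 + \lambda a_k) \Big\}. \qquad [2.17]$$
It holds
$$\lim_{\lambda \to 0+} f_\lambda(x) = \begin{cases} 1, & \text{if } x \in l_{2,a}, \\ 0, & \text{if } x \in \mathbb{R}^\infty \setminus l_{2,a}. \end{cases}$$
Moreover, $0 \leq f_\lambda \leq 1$, $1 \in L(\mathbb{R}^\infty, \mu)$, and by the Lebesgue dominated convergence theorem
$$\lim_{\lambda \to 0+} \int_{\mathbb{R}^\infty} f_\lambda\, d\mu = \int_{\mathbb{R}^\infty} I_{l_{2,a}}\, d\mu = \mu(l_{2,a}).$$
On the other hand, the right-hand side of [2.17] converges to 1 as $\lambda \to 0+$, because
$$0 \leq \sum_{k=1}^{\infty} \log(1 + \lambda a_k) \leq \lambda \sum_{k=1}^{\infty} a_k \to 0 \quad \text{as } \lambda \to 0+.$$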
Thus, in this case μ(l2,a ) = 1. It is instructive to give an alternative proof of a part of theorem 2.4.
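2.3.1. Alternative proof of the second part of theorem 2.4

PROOF.– Let $\{\gamma_n\}$ be Gaussian white noise (see definition 2.6) and $\gamma = (\gamma_n)_1^\infty$ be the corresponding random element in $\mathbb{R}^\infty$. Then the distribution $\mu_\gamma$ of $\gamma$ coincides with the standard Gaussian measure $\mu$ and
$$\mu(l_{2,a}) = \mu_\gamma(l_{2,a}) = P\{\gamma \in l_{2,a}\} = P\Big\{ \sum_{k=1}^{\infty} a_k \gamma_k^2 < \infty \Big\}.$$
In terms of probability theory, theorem 2.4 can be restated as follows. Let $\{\gamma_k\}$ be Gaussian white noise; then:

i) $\sum_{k=1}^{\infty} a_k \gamma_k^2 = \infty$ a.s. if $\sum_{k=1}^{\infty} a_k = \infty$;

ii) $\sum_{k=1}^{\infty} a_k \gamma_k^2 < \infty$ a.s. if $\sum_{k=1}^{\infty} a_k < \infty$.

Now, assume that $\sum_{k=1}^{\infty} a_k < \infty$. Put
$$S(\omega) = \sum_{k=1}^{\infty} a_k \gamma_k^2(\omega) = \lim_{N \to \infty} \sum_{k=1}^{N} a_k \gamma_k^2(\omega), \quad \omega \in \Omega.$$
Here $S = S(\omega)$ is a non-negative r.v. distributed on the extended real line $\bar{\mathbb{R}} = \mathbb{R} \cup \{+\infty, -\infty\}$. Next, $a_k \gamma_k^2 \geq 0$, and
$$E[S] = \sum_{k=1}^{\infty} a_k\, E \gamma_k^2 = \sum_{k=1}^{\infty} a_k < \infty.$$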
This implies the desired relation S(ω) < ∞ a.s.
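The dichotomy in theorem 2.4 rests on the closed form [2.16]. The following minimal Python sketch checks the one-dimensional building block of that formula by quadrature and then evaluates the finite product for two weight sequences. The function names, the truncation at 2000 terms and the quadrature parameters are assumptions made for illustration only; they are not part of the text.

```python
import math

def gaussian_expectation(lam, a, half_width=12.0, steps=20000):
    """Midpoint-rule approximation of E exp(-lam * a * X^2 / 2) for X ~ N(0, 1)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        total += math.exp(-0.5 * lam * a * x * x) * density * h
    return total

def integral_of_f(lam, weights):
    """Finite version of [2.16]: prod_k (1 + lam * a_k)^(-1/2), computed via logs as in [2.17]."""
    return math.exp(-0.5 * sum(math.log1p(lam * a) for a in weights))

# Truncated weight sequences (the truncation level is an assumption of this sketch).
summable = [1.0 / k**2 for k in range(1, 2001)]   # sum a_k < infinity
divergent = [1.0 / k for k in range(1, 2001)]     # partial sums of a divergent series

one_dim_numeric = gaussian_expectation(2.0, 0.5)
one_dim_exact = 1.0 / math.sqrt(1.0 + 2.0 * 0.5)
near_one = integral_of_f(1e-4, summable)          # close to 1 for small lambda
shrinking = integral_of_f(1.0, divergent)         # tends to 0 as more terms are added
```

For the summable weights the value stays close to 1 when $\lambda$ is small, in line with part (c) of the proof; for $a_k = 1/k$ and $\lambda = 1$ the finite product telescopes to $(N+1)^{-1/2}$, which tends to zero as in part (b).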
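Problems 2.3
10) Let $\{\nu_n\}$ be a sequence of probability measures on $\mathcal{B}(\mathbb{R})$, with $\sup_{n\ge 1} \int_{\mathbb{R}} x^2\, d\nu_n < \infty$, and let $\nu = \prod_{n=1}^{\infty} \nu_n$. Prove that $\nu(l_{2,a}) = 1$ if $\sum_{n=1}^{\infty} a_n < \infty$.
11) Let $\lambda_n > 0$, $n \ge 1$ and $\sum_{n=1}^{\infty} \lambda_n < \infty$. Denote
$$J_\alpha = \int_{\mathbb{R}^\infty} \exp\Big\{\alpha \sum_{n=1}^{\infty} \lambda_n x_n^2\Big\}\, d\mu(x), \quad \alpha \in \mathbb{R},$$
where $\mu$ is standard Gaussian measure in $\mathbb{R}^\infty$. Prove that $J_\alpha < \infty$ if, and only if, $\alpha < \frac{1}{2\max_{n\ge 1} \lambda_n}$. Show that in this case
$$J_\alpha = \frac{1}{\prod_{n=1}^{\infty} \sqrt{1 - 2\alpha\lambda_n}}.$$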
12) Let $\lambda_n > 0$, $n \ge 1$, $\sum_{n=1}^{\infty} \lambda_n = \infty$ and the integral $J_\alpha$ be defined as in problem (11). Prove that $J_\alpha = 0$ if $\alpha < 0$, and $J_\alpha = +\infty$ if $\alpha > 0$.
13) Levy's theorem implies the following: if a series of independent random variables converges in probability, then it converges a.s. (see lemma 3.2.1 in [BUL 80]). Using this fact, prove the next statement. Let $\mu$ be standard Gaussian measure on $\mathbb{R}^\infty$ and $t = (t_n)_1^\infty \in l_2$. Prove that the series $\sum_{1}^{\infty} t_n x_n$, $x = (x_n)_1^\infty \in \mathbb{R}^\infty$, converges almost everywhere w.r.t. $\mu$. Introduce a function $f_t : \mathbb{R}^\infty \to \mathbb{R}$,
$$f_t(x) = \begin{cases} \sum_{1}^{\infty} t_n x_n, & \text{if the series converges}, \\ 0, & \text{otherwise}. \end{cases}$$
Prove that
$$\int_{\mathbb{R}^\infty} \exp\{i f_t(x)\}\, d\mu(x) = \exp\Big\{-\frac{\|t\|_2^2}{2}\Big\},$$
where $\|t\|_2$ is the norm in $l_2$.

2.4. Construction of Gaussian measure in l2
Based on the Kolmogorov–Khinchin criterion (theorem 2.4), we will construct a Gaussian measure in the sequence space $l_2$; the latter is isometric to any separable infinite-dimensional Hilbert space.
Let $H$ be an arbitrary real Hilbert space, with inner product $(x, y)$. Definitions of a Gaussian random element distributed in $H$ and of a Gaussian measure in $H$ are similar to definitions 1.11 and 1.14.
DEFINITION 2.7.– A random element $\xi$ distributed in $H$ is called Gaussian if for each $h \in H$, the inner product $(\xi, h)$ is a Gaussian r.v. (possibly with zero variance; see definition 1.5).
DEFINITION 2.8.– A probability measure $\mu$ on $\mathcal{B}(H)$ is called Gaussian if there exists a Gaussian random element $\xi$ in $H$ such that its distribution $\mu_\xi = \mu$.
Now, let $\mu$ be standard Gaussian measure in $\mathbb{R}^\infty$. Fix a sequence $a = (a_n)_1^\infty$ of positive numbers such that $\sum_{1}^{\infty} a_n < \infty$. By the Kolmogorov–Khinchin criterion, $\mu(l_{2,a}) = 1$.
From the proof of lemma 2.5, one can see that the ball
$$\{x \in l_{2,a} : \|x\|_a \le r\} = \Big\{x \in \mathbb{R}^\infty : \sum_{1}^{\infty} a_n x_n^2 \le r^2\Big\} \in \mathcal{B}(\mathbb{R}^\infty),$$
and in a similar way, for each $y \in l_{2,a}$, the ball $B_a(y, r) := \{x \in l_{2,a} : \|x - y\|_a \le r\} \in \mathcal{B}(\mathbb{R}^\infty)$. The space $l_{2,a}$ is separable (see problem (4)), therefore,
$$\mathcal{B}(l_{2,a}) = \sigma r(\{B_a(y, r) : y \in l_{2,a},\ r > 0\}) \subset \mathcal{B}(\mathbb{R}^\infty).$$
Here, $\sigma r$ denotes the generated sigma-ring. Now, let $\mu_a$ be the restriction of $\mu$ to $\mathcal{B}(l_{2,a})$, $\mu_a(A) = \mu(A)$, $A \in \mathcal{B}(l_{2,a})$. Since $\mu$ is concentrated on $l_{2,a}$, $\mu_a$ is a probability measure on $\mathcal{B}(l_{2,a})$.
In order to construct a related measure in $l_2$, use the isometry $J$ between $l_{2,a}$ and $l_2$ (see problem (4)):
$$Jx = (\sqrt{a_n}\, x_n)_1^\infty, \quad x \in l_{2,a}.$$
Introduce the induced probability measure on $\mathcal{B}(l_2)$:
$$g = \mu_a J^{-1}, \quad g(B) = \mu_a(J^{-1}B), \quad B \in \mathcal{B}(l_2). \qquad \text{[2.18]}$$
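Unlike in section 2.3, here $g$ does not denote standard Gaussian measure. Now, we need a simple statement about the convergence of Gaussian random variables.
LEMMA 2.10.– (About convergence of Gaussian variables) Let $\gamma_n \sim N(0, \sigma_n^2)$, with $\sigma_n \ge 0$, $n \ge 1$, and $\gamma_n \to \gamma$ a.s., $\sigma_n^2 \to \sigma^2 < \infty$ as $n \to \infty$. Then $\gamma \sim N(0, \sigma^2)$.
PROOF.– The characteristic function satisfies
$$\varphi_{\gamma_n}(x) = \exp\Big\{-\frac{\sigma_n^2 x^2}{2}\Big\} \to \exp\Big\{-\frac{\sigma^2 x^2}{2}\Big\} \quad \text{as } n \to \infty.$$
However, by the Lebesgue dominated convergence theorem:
$$\varphi_{\gamma_n}(x) = E\, e^{ix\gamma_n} \to E\, e^{ix\gamma} = \varphi_\gamma(x) \quad \text{as } n \to \infty.$$
Thus, $\varphi_\gamma(x) = \exp\{-\frac{\sigma^2 x^2}{2}\}$, and $\gamma \sim N(0, \sigma^2)$.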
For a probability measure on $\mathcal{B}(H)$, where $H$ is a real Hilbert space, the mean value and the correlation operator are defined similarly to the case $H = \mathbb{R}^n$ (see section 1.4).
DEFINITION 2.9.– Let $\xi$ be a random element in $H$ and $\mu$ be a probability measure on $\mathcal{B}(H)$. A vector $m \in H$ is called expectation of $\xi$ if
$$(m, h) = E(\xi, h), \quad \text{for all } h \in H,$$
and $m$ is called mean value of $\mu$ if
$$(m, h) = \int_H (x, h)\, d\mu(x), \quad \text{for all } h \in H.$$
We denote $m = E\xi$ and $m = m_\mu$, respectively.
DEFINITION 2.10.– Consider a linear bounded operator $S$ in $H$. It is called correlation operator of $\xi$ if
$$(Sh_1, h_2) = E(\xi - m, h_1)(\xi - m, h_2), \quad \text{for all } h_1, h_2 \in H,$$
where $m = E\xi$, and $S$ is called correlation operator of $\mu$ if
$$(Sh_1, h_2) = \int_H (x - m, h_1)(x - m, h_2)\, d\mu(x), \quad \text{for all } h_1, h_2 \in H,$$
where $m = m_\mu$.
REMARK 2.1.– The change of variables formula implies the following: for a random element $\xi$ in $H$ and its distribution $\mu_\xi$, the mean value of $\mu_\xi$ coincides with $E\xi$, and the correlation operators of $\mu_\xi$ and $\xi$ coincide as well.
THEOREM 2.5.– (Construction of Gaussian measure in $l_2$) Let $a = (a_n)_1^\infty$ be a sequence of positive numbers such that $\sum_{1}^{\infty} a_n < \infty$. Then $g$ defined in [2.18] is a Gaussian measure in $l_2$, with zero mean and diagonal correlation operator $S$,
$$Sx = (a_n x_n)_1^\infty, \quad x \in l_2. \qquad \text{[2.19]}$$
PROOF.– a) Construct a random element in $l_2$ with distribution $g$. Let $\{\gamma_n\}$ be Gaussian white noise (see definition 2.6) and $\gamma = (\gamma_n)_1^\infty$ be the random element in $\mathbb{R}^\infty$. Then its distribution $\mu_\gamma$ coincides with $\mu$, standard Gaussian measure in $\mathbb{R}^\infty$. Since $\mu(l_{2,a}) = 1$, it holds $P\{\gamma \in l_{2,a}\} = 1$.
Without loss of generality, we may and do assume that $\gamma$ acts from $\Omega$ to $l_{2,a}$ (otherwise one can remove from $\Omega$ a subset $\Omega_0 = \{\omega : \gamma(\omega) \in \mathbb{R}^\infty \setminus l_{2,a}\}$, with $P(\Omega_0) = 0$). Since $\mathcal{B}(l_{2,a}) \subset \mathcal{B}(\mathbb{R}^\infty)$, the mapping $\gamma : \Omega \to l_{2,a}$ is $(\mathcal{F}, \mathcal{B}(l_{2,a}))$ measurable, i.e. $\gamma$ is a random element in $l_{2,a}$. Consider its distribution $\mu_{\gamma,a}$ in $l_{2,a}$. For $A \in \mathcal{B}(l_{2,a})$, we have
$$\mu_{\gamma,a}(A) = P\{\gamma \in A\} = \mu(A) = \mu_a(A)$$
(remember that $\mu_a$ is the restriction of $\mu$ to $\mathcal{B}(l_{2,a})$). Therefore, $\mu_{\gamma,a} = \mu_a$.
Next, $J\gamma = (\sqrt{a_n}\, \gamma_n)_1^\infty$ is a random element in $l_2$, with distribution
$$\mu_{J\gamma}(B) = P\{J\gamma \in B\} = P\{\gamma \in J^{-1}B\} = \mu_a(J^{-1}B) = (\mu_a J^{-1})(B), \quad B \in \mathcal{B}(l_2).$$
Thus (see [2.18]), $J\gamma$ has distribution $g$ in $l_2$.
b) Prove that $J\gamma$ is a Gaussian random element in Hilbert space $l_2$ and find the expectation of $J\gamma$ and its correlation operator. Take any $h \in l_2$. We have for $\omega \in \Omega$ that
$$(J\gamma, h) = \lim_{N\to\infty} \sum_{n=1}^{N} \sqrt{a_n}\, \gamma_n h_n,$$
$$\eta_N := \sum_{n=1}^{N} \sqrt{a_n}\, \gamma_n h_n \sim N(0, \sigma_N^2), \quad \sigma_N^2 = \sum_{n=1}^{N} a_n h_n^2.$$
Since $a_n$ is bounded and $h \in l_2$,
$$\sigma_N^2 \to \sum_{n=1}^{\infty} a_n h_n^2 = \|h\|_a^2 < \infty \quad \text{as } N \to \infty.$$
By lemma 2.10, the limiting r.v. is $(J\gamma, h) \sim N(0, \|h\|_a^2)$. Therefore, $J\gamma$ is a Gaussian random element with zero mean.
Take another vector $u \in l_2$. The symmetric bilinear form $B(h, u) = (J\gamma, h)(J\gamma, u)$ can be expressed through the quadratic form as follows:
$$(J\gamma, h)(J\gamma, u) = \frac{1}{2}\big[(J\gamma, h + u)^2 - (J\gamma, h)^2 - (J\gamma, u)^2\big];$$
$$E(J\gamma, h)(J\gamma, u) = \frac{1}{2}\big[\|h + u\|_a^2 - \|h\|_a^2 - \|u\|_a^2\big] = (h, u)_a,$$
$$E(J\gamma, h)(J\gamma, u) = \sum_{1}^{\infty} a_n h_n u_n.$$
Consider the diagonal operator $S$ defined in [2.19]. It is a linear bounded operator in $l_2$ because the sequence $\{a_n\}$ is bounded. For any $h, u \in l_2$, it holds
$$(Sh, u) = \sum_{1}^{\infty} a_n h_n u_n = E(J\gamma, h)(J\gamma, u).$$
Therefore, $S$ is the correlation operator of $J\gamma$. By part (a) of the proof, $g$ is the distribution of $J\gamma$. Then $g$ is a Gaussian measure, with zero mean and correlation operator $S$.
Look at the constructed Gaussian measure $g$ from another point of view. Let $\{\tilde\gamma_n, n \ge 1\}$ be a sequence of independent standard normal random variables and $\eta = (\eta_n)_1^\infty = (\sqrt{a_n}\, \tilde\gamma_n)_1^\infty$ be the random element in $\mathbb{R}^\infty$, where $a_n > 0$, $n \ge 1$ and $\sum_{1}^{\infty} a_n < \infty$.
The distribution of $\eta$ in $\mathbb{R}^\infty$ is the product of the marginal distributions $g_n = \mu_{\eta_n}$, where $g_n$ is a Gaussian measure on $\mathbb{R}$ with zero mean and variance $a_n$:
$$\mu_\eta = \prod_{1}^{\infty} g_n \quad \text{in } \mathbb{R}^\infty.$$
The measure $\mu_\eta$ is concentrated on $l_2$: $\mu_\eta(l_2) = 1$, and the restriction of $\mu_\eta$ to $\mathcal{B}(l_2) \subset \mathcal{B}(\mathbb{R}^\infty)$ is just the measure $g$ from [2.18]. In this situation, we say that $g$ is a product measure in Hilbert space $l_2$ and we write:
$$g = \prod_{1}^{\infty} g_n \quad \text{in } l_2. \qquad \text{[2.20]}$$
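The product representation [2.20] suggests a direct way to simulate the measure $g$ on finitely many coordinates. The sketch below draws the first coordinates of $J\gamma = (\sqrt{a_n}\gamma_n)$ and compares a Monte Carlo estimate of $E(J\gamma, h)(J\gamma, u)$ with $(Sh, u) = \sum a_n h_n u_n$ from theorem 2.5. The truncation level `DIM`, the choice $a_n = 1/n^2$, the test vectors and the sample size are all assumptions of this illustration.

```python
import math
import random

def sample_J_gamma(a, rng):
    """One draw of the first len(a) coordinates of J(gamma) = (sqrt(a_n) * gamma_n)."""
    return [math.sqrt(a_n) * rng.gauss(0.0, 1.0) for a_n in a]

def inner(x, y):
    """Inner product of two finite coordinate vectors."""
    return sum(x_n * y_n for x_n, y_n in zip(x, y))

def empirical_covariance(a, h, u, n_samples, seed=0):
    """Monte Carlo estimate of E (J gamma, h)(J gamma, u)."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_samples):
        xi = sample_J_gamma(a, rng)
        acc += inner(xi, h) * inner(xi, u)
    return acc / n_samples

DIM = 30  # truncation level (an assumption of this sketch)
a = [1.0 / n**2 for n in range(1, DIM + 1)]          # positive and summable
h = [1.0 / n for n in range(1, DIM + 1)]
u = [1.0 if n <= 3 else 0.0 for n in range(1, DIM + 1)]

exact = sum(a_n * h_n * u_n for a_n, h_n, u_n in zip(a, h, u))   # (Sh, u)
estimate = empirical_covariance(a, h, u, n_samples=20000)
```

The estimate only approximates `exact` up to Monte Carlo error; the agreement illustrates, but of course does not prove, that $S$ is the correlation operator.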
In subsequent chapters, we will show that any Gaussian measure $G$ in a separable infinite-dimensional Hilbert space $H$ has the form of a product measure in $H$. This means the following. There exists an orthobasis $\{e_n\}$ in $H$ such that, after identification of $H$ with the sequence space $l_2$ of Fourier coefficients $(c_n = (c, e_n))_1^\infty$, $c \in H$, the measure $G$ becomes a product measure in $l_2$ like [2.20], with Gaussian marginal distributions $G_n$: $G = \prod_{1}^{\infty} G_n$ in $l_2$. Moreover, we will prove that $G$ has a certain mean value $m_G$; if $m_G = 0$, then all $G_n$ have zero mean as well, otherwise some of the mean values $m_{G_n}$ are not equal to zero.
Problems 2.4
14) Let $g$ be the Gaussian measure defined in [2.18] and $b = (b_n)_1^\infty$ be a sequence of positive numbers. Prove that
$$g(l_{2,b} \cap l_2) = \begin{cases} 1, & \text{if } \sum_{1}^{\infty} a_n b_n < \infty, \\ 0, & \text{otherwise}. \end{cases}$$
15) Let $g$ be the Gaussian measure defined in [2.18] and $t = (t_n)_1^\infty \in l_{2,a}$. Introduce a function $f_t : l_2 \to \mathbb{R}$,
$$f_t(x) = \begin{cases} \sum_{1}^{\infty} t_n x_n, & \text{if the series converges}, \\ 0, & \text{otherwise}. \end{cases}$$
Prove that $f_t$ is a Borel function and, if it is considered as a r.v. on the probability space $(l_2, \mathcal{B}(l_2), g)$, it has Gaussian distribution $N(0, \|t\|_a^2)$.
3 Borel Measures in Hilbert Space
In this chapter, we systematically study measures on a Hilbert space $H$. We will start with nuclear operators in $H$ and then proceed with the weak and strong integrals of a function valued in $H$ or, more generally, in a Banach space. The correlation operator of a measure $\mu$ on $\mathcal{B}(H)$ with $\int_H \|x\|^2\, d\mu(x) < \infty$ is a nuclear operator; the mean value $m_\mu$ of the measure $\mu$ is usually defined as the weak integral
$$m_\mu = \int_H x\, d\mu(x), \qquad \text{[3.1]}$$
but if $\int_H \|x\|\, d\mu(x) < \infty$, then the right-hand side of [3.1] can be understood as the strong integral.
3.1. Classes of operators in H
In this section, $H$ is a separable infinite-dimensional real Hilbert space. Remember that a linear operator $A$ in $H$ is called a compact operator if for each bounded set $M \subset H$, its image $A(M)$ is relatively compact in $H$. The class of all compact operators in $H$ is denoted as $S_\infty(H)$. It holds $S_\infty(H) \subset L(H)$, where $L(H)$ is the Banach space of all linear bounded operators in $H$, with operator norm
$$\|B\| = \sup_{x \in H,\, x \ne 0} \frac{\|Bx\|}{\|x\|}.$$
The set S∞ (H) is a subspace of L(H), i.e. S∞ (H) is a linear and closed subset of L(H).
Gaussian Measures in Hilbert Space: Construction and Properties, First Edition. Alexander Kukush. © ISTE Ltd 2019. Published by ISTE Ltd and John Wiley & Sons, Inc.
3.1.1. Hilbert–Schmidt operators
Such operators are studied in a standard university course on functional analysis. Here, we briefly overview properties of Hilbert–Schmidt operators and omit most of the proofs.
DEFINITION 3.1.– Let $A \in L(H)$. It is called a Hilbert–Schmidt operator if there exists an orthobasis $\{e_n, n \ge 1\}$ in $H$, with $\sum_{1}^{\infty} \|Ae_n\|^2 < \infty$. The linear set of all such operators is denoted as $S_2(H)$.
For $A \in L(H)$, the latter sum does not depend on the choice of orthobasis.
DEFINITION 3.2.– For $A \in L(H)$ and an orthobasis $\{e_n, n \ge 1\}$, the expression
$$\|A\|_2 := \sqrt{\sum_{1}^{\infty} \|Ae_n\|^2}$$
is called Hilbert–Schmidt norm of $A$. This norm is finite if, and only if, $A \in S_2(H)$. Using matrix entries $a_{ij} = (Ae_j, e_i)$, $i, j = 1, 2, \dots$, one can express the Hilbert–Schmidt norm as follows:
$$\|A\|_2 = \sqrt{\sum_{i,j=1}^{\infty} a_{ij}^2}.$$
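A finite-dimensional analogue makes the basis independence in definition 3.2 concrete. The minimal sketch below computes $\sqrt{\sum \|Ae_n\|^2}$ for a $3 \times 3$ matrix in the standard basis and in a rotated orthonormal basis, and compares both with the entrywise formula. The matrix, the rotation angle and the helper names are illustrative assumptions.

```python
import math

def mat_vec(A, x):
    """Apply the matrix A (list of rows) to the vector x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def hs_norm_via_basis(A, basis):
    """sqrt(sum_n ||A e_n||^2) over the given orthonormal basis."""
    return math.sqrt(sum(sum(c * c for c in mat_vec(A, e)) for e in basis))

def hs_norm_via_entries(A):
    """sqrt(sum_{i,j} a_ij^2), the entrywise form of the same norm."""
    return math.sqrt(sum(a * a for row in A for a in row))

A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0],
     [4.0, 0.0, 0.5]]

standard = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = 0.7  # rotation angle in the (e1, e2)-plane, chosen arbitrarily
rotated = [[math.cos(t), math.sin(t), 0.0],
           [-math.sin(t), math.cos(t), 0.0],
           [0.0, 0.0, 1.0]]

norm_standard = hs_norm_via_basis(A, standard)
norm_rotated = hs_norm_via_basis(A, rotated)
norm_entries = hs_norm_via_entries(A)
```

All three numbers agree up to rounding, which is the finite-dimensional shadow of the statement that the sum does not depend on the orthobasis.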
EXAMPLE 3.1.– (Diagonal operator) Consider a bounded real sequence $\{a_n, n \ge 1\}$ and the operator $A$ in $l_2$,
$$Ax = (a_1 x_1, \dots, a_n x_n, \dots), \quad x \in l_2.$$
It is the so-called diagonal operator, denoted as $A = \mathrm{diag}(a_n, n \ge 1)$. For the standard basis $\{e_n\}$ in $l_2$, the matrix entries are
$$a_{ij} = (Ae_j, e_i) = (a_j e_j, e_i) = a_j \delta_{ij}, \quad i, j = 1, 2, \dots$$
The Hilbert–Schmidt norm of $A$ is $\|A\|_2 = \sqrt{\sum_{1}^{\infty} a_j^2}$. Thus, $A \in S_2(H)$ if, and only if, $\sum_{1}^{\infty} a_j^2 < \infty$.
EXAMPLE 3.2.– (Integral Hilbert–Schmidt operator) Let $K = K(t, s) \in L_2([a, b]^2)$. The integral operator
$$(Ax)(t) = \int_a^b K(t, s)\, x(s)\, ds, \quad x \in L_2[a, b], \quad a \le t \le b,$$
acts in Hilbert space $H = L_2[a, b]$. It holds $A \in S_2(H)$ and
$$\|A\|_2 = \|K\|_{L_2} = \sqrt{\int_a^b \int_a^b |K(t, s)|^2\, dt\, ds}.$$
EXAMPLE 3.3.– (Identity operator) In a separable infinite-dimensional Hilbert space $H$, consider the identity operator $Ix = x$, $x \in H$. For an orthobasis $\{e_n\}$, we have
$$\|I\|_2 = \sqrt{\sum_{1}^{\infty} \|Ie_n\|^2} = \sqrt{\sum_{n=1}^{\infty} 1} = \infty.$$
Thus, $I \in L(H) \setminus S_2(H)$, and the inclusion $S_2(H) \subset L(H)$ is strict.
Now, we introduce an inner product in $S_2(H)$. For an orthobasis $\{e_n\}$ in $H$, we set
$$(A, B)_2 = \sum_{1}^{\infty} (Ae_j, Be_j), \quad A, B \in S_2(H). \qquad \text{[3.2]}$$
LEMMA 3.1.– (About inner product in $S_2(H)$) The series in [3.2] converges and its sum does not depend on the choice of orthobasis. Moreover, relation [3.2] defines an inner product in $S_2(H)$, and $S_2(H)$ with this product is a separable Hilbert space. The corresponding norm in $S_2(H)$ is just the Hilbert–Schmidt norm.
PROOF.– a) The functional $f_j(A, B) := (Ae_j, Be_j)$ is a symmetric bilinear form on $S_2(H)$. Therefore,
$$f_j(A, B) = \frac{f_j(A + B, A + B) - f_j(A - B, A - B)}{4},$$
$$\sum_{1}^{\infty} (Ae_j, Be_j) = \frac{1}{4}\Big[\sum_{1}^{\infty} f_j(A + B, A + B) - \sum_{1}^{\infty} f_j(A - B, A - B)\Big] = \frac{1}{4}\big[\|A + B\|_2^2 - \|A - B\|_2^2\big].$$
Here A + B, A − B ∈ S2 (H) because S2 (H) is a linear set in L(H), thus, ||A + B||2 < ∞ and ||A − B||2 < ∞. We conclude that the right-hand side of [3.2] is finite and does not depend on the choice of orthobasis.
b) On the sigma-algebra $2^{\mathbb{N} \times \mathbb{N}}$ of all subsets of $\mathbb{N} \times \mathbb{N}$, consider the counting measure
$$\mu(C) = |C|, \quad C \subset \mathbb{N} \times \mathbb{N}.$$
Fix an orthobasis $\{e_n\}$ in $H$. For $A \in S_2(H)$, we use its matrix entries $a_{ij} = (Ae_j, e_i)$, $i, j \ge 1$. It holds
$$\int_{\mathbb{N} \times \mathbb{N}} a_{ij}^2\, d\mu(i, j) = \sum_{i,j=1}^{\infty} a_{ij}^2 = \|A\|_2^2 < \infty.$$
Therefore, $a_{ij}$ as a function of a couple $(i; j) \in \mathbb{N} \times \mathbb{N}$ belongs to the space $L_2(\mathbb{N} \times \mathbb{N}, \mu)$. The mapping $J : S_2(H) \to L_2(\mathbb{N} \times \mathbb{N}, \mu)$, $J(A) = (a_{ij})_{i,j=1}^{\infty}$ is a linear bijection, with
$$(J(A), J(B))_{L_2} = \sum_{i,j=1}^{\infty} a_{ij} b_{ij} = \sum_{1}^{\infty} (Ae_j, Be_j) = (A, B)_2, \quad A, B \in S_2(H). \qquad \text{[3.3]}$$
Here, $b_{ij}$ are the matrix entries of $B$. Relation [3.3] implies that [3.2] is an inner product in $S_2(H)$, together with the inner product in $L_2(\mathbb{N} \times \mathbb{N}, \mu)$. Moreover, $J$ is an isometry between $S_2(H)$ and the separable Hilbert space $L_2(\mathbb{N} \times \mathbb{N}, \mu)$. Therefore, $S_2(H)$ is a separable Hilbert space as well.
c) The norm induced by the inner product [3.2] is as follows:
$$\|A\|_{S_2(H)} = \sqrt{(A, A)_2} = \sqrt{\sum_{1}^{\infty} (Ae_j, Ae_j)} = \|A\|_2.$$
Thus, the induced norm is just the Hilbert–Schmidt norm.
We list the properties of the Hilbert–Schmidt operators.
THEOREM 3.1.– The following statements hold true:
a) $\|A\| \le \|A\|_2$, for all $A \in S_2(H)$.
b) $\|A^*\|_2 = \|A\|_2$, for all $A \in L(H)$.
c) $\|AB\|_2 \le \|A\| \cdot \|B\|_2$, for all $A \in L(H)$, $B \in S_2(H)$; $\|AB\|_2 \le \|A\|_2 \cdot \|B\|$, for all $A \in S_2(H)$, $B \in L(H)$.
d) If $A \in S_2(H)$, then $A \in S_\infty(H)$.
Now, we comment on theorem 3.1. Statement (a) implies that the natural embedding $J : S_2(H) \to L(H)$, $J(A) = A$ is a continuous operator. Statement (b) implies the following: a linear bounded operator is a Hilbert–Schmidt one if, and only if, its adjoint operator is a Hilbert–Schmidt operator. Inequalities (c) yield the following: if $A \in S_2(H)$ and $B \in L(H)$, then $AB, BA \in S_2(H)$. This means that $S_2(H)$ is a two-sided ideal in the operator algebra $L(H)$. (Remember that $L(H)$ is a normed algebra w.r.t. linear operations and multiplication of operators.) In (d), we have the inclusion $S_2(H) \subset S_\infty(H)$, i.e. the Hilbert space $S_2(H)$ is a linear subset of the Banach space $S_\infty(H)$.
3.1.2. Polar decomposition
Remember that a linear bounded operator $A$ in a Hilbert space is called positive if its quadratic form $(Ax, x)$ is non-negative. This is denoted as $A \ge 0$. If the underlying space is complex and $A \ge 0$, then $A$ is self-adjoint. In a real space, this is not true.
Now, we derive the so-called polar decomposition of an operator in $H$. It is an analogue of the polar decomposition $z = e^{i\varphi}|z|$, $z \in \mathbb{C}$.
THEOREM 3.2.– (About polar decomposition) Each compact operator $A$ in $H$ can be decomposed as $A = UT$, where $T$ is a self-adjoint positive compact operator and $U \in L(H)$, $U$ performs an isometry of $R(T)$ into $H$ (i.e. $\|Ux\| = \|x\|$, $x \in R(T)$).
Before the proof, we consider an example of such a decomposition.
EXAMPLE 3.4.– (Polar decomposition of compact diagonal operator) Let $\{a_n\}$ be a real sequence that converges to zero and $A = \mathrm{diag}(a_n, n \ge 1)$ in Hilbert space $l_2$ (for the notation of diagonal operator, see example 3.1). Then $A \in S_\infty(l_2)$. We put $T = \mathrm{diag}(|a_n|, n \ge 1)$, $U = \mathrm{diag}(\mathrm{sign}\, a_n, n \ge 1)$. Since $a_n = (\mathrm{sign}\, a_n)|a_n|$, it holds $A = UT$; $T$ is self-adjoint, it is positive because $(Tx, x) = \sum_{1}^{\infty} |a_n| x_n^2 \ge 0$, $x \in l_2$, and $T$ is a compact operator because $|a_n| \to 0$ as $n \to \infty$; $U \in L(l_2)$ and $\|UTx\|^2 = \|Ax\|^2 = \sum_{1}^{\infty} a_n^2 x_n^2 = \|Tx\|^2$, $x \in l_2$, and therefore, $U$ performs an isometry of $R(T)$ into $l_2$.
Thus, we constructed a desired decomposition of the compact diagonal operator A.
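Example 3.4 can be checked directly on finitely many coordinates. The following minimal sketch builds $T = \mathrm{diag}(|a_n|)$ and $U = \mathrm{diag}(\mathrm{sign}\, a_n)$ for a truncated sequence and verifies $A = UT$ and $\|UTx\| = \|Tx\|$ on a test vector. The sequence $a_n = (-1)^n/n$, the truncation to ten coordinates and the helper names are assumptions made for illustration.

```python
def sign(v):
    """Sign of a real number: -1, 0 or 1."""
    return (v > 0) - (v < 0)

def polar_of_diagonal(a):
    """Return (u, t) with a_n = u_n * t_n, where t_n = |a_n| and u_n = sign(a_n)."""
    t = [abs(a_n) for a_n in a]
    u = [float(sign(a_n)) for a_n in a]
    return u, t

def apply_diag(d, x):
    """Apply the diagonal operator diag(d) to the vector x."""
    return [d_n * x_n for d_n, x_n in zip(d, x)]

a = [(-1.0) ** n / n for n in range(1, 11)]   # tends to zero, with alternating signs
u, t = polar_of_diagonal(a)
x = [float(n) for n in range(1, 11)]          # arbitrary test vector

Ax = apply_diag(a, x)
Tx = apply_diag(t, x)
UTx = apply_diag(u, Tx)

norm_sq_UTx = sum(v * v for v in UTx)
norm_sq_Tx = sum(v * v for v in Tx)
```

Here `Ax` and `UTx` coincide coordinate by coordinate and the two squared norms are equal, matching the isometry property of $U$ on $R(T)$.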
PROOF OF THEOREM 3.2.– a) Construction of $T$. Let $B = A^*A$. It is a compact operator as a product of $A^* \in L(H)$ and $A \in S_\infty(H)$. Moreover, $(Bx, x) = (Ax, Ax) \ge 0$, and $B \ge 0$. Then, by the Hilbert–Schmidt theorem,
$$Bx = \sum_{1}^{\infty} \lambda_n (x, e_n) e_n, \quad x \in H,$$
where $\{e_n\}$ is an eigenbasis of $B$, $\{\lambda_n \ge 0, n \ge 1\}$ are the eigenvalues with $\lambda_n \to 0$ as $n \to \infty$, and the series converges in $H$. We set
$$Tx = \sum_{1}^{\infty} \sqrt{\lambda_n}\, (x, e_n) e_n, \quad x \in H. \qquad \text{[3.4]}$$
This is a self-adjoint positive operator such that $T^2 = B$. It is called the square root of $B$ and denoted as $T = \sqrt{B} = B^{1/2}$. The operator $T$ is compact because $\sqrt{\lambda_n} \to 0$ as $n \to \infty$ (actually $T$ is the diagonal operator in the basis $\{e_n\}$; the matrix that represents $T$ in this basis is diagonal).
b) Construction of $U$. First we define $U$ on the linear set $R(T)$. We put $U(Tx) = Ax$, $x \in H$. The operator $U : R(T) \to H$ is well-defined. Indeed, let $Tx = Ty$; then $Tz = 0$ with $z = x - y$,
$$0 = \|Tz\|^2 = \sum_{1}^{\infty} \lambda_n (z, e_n)^2 = (Bz, z) = \|Az\|^2 \ \Rightarrow\ Az = 0,\ Ax = Ay.$$
Next, $\|U(Tx)\| = \|Ax\| = \|Tx\|$ by the computations made above, and $U$ performs an isometry of $R(T)$ into $H$.
Finally, $U$ can be extended by continuity to an isometry from the closure $\overline{R(T)}$ into $H$, and then it can be further extended to a linear bounded operator in $H$ by letting $Ux = 0$, $x \in (R(T))^\perp = \mathrm{Ker}\, T$.
DEFINITION 3.3.– The operator $T = (A^*A)^{1/2}$ defined in [3.4] is called modulus of a compact operator $A$ and denoted as $|A|$. The equality $A = U|A|$, where $U \in L(H)$, $U$ performs an isometry of $R(T)$ into $H$ and $U$ vanishes on $\mathrm{Ker}\, T$, is called polar decomposition of $A$.
In example 3.4, $T = |A|$ and the equality $A = UT$ is the polar decomposition of the diagonal operator $A$.
REMARK 3.1.– Theorem 3.2 holds true for an arbitrary separable Hilbert space (it can be finite-dimensional or complex; the proof can be modified with minor changes).
REMARK 3.2.– (Bound for norm of $U$) Consider the polar decomposition of $A \in S_\infty(H)$. If $A$ is the zero operator in $H$, then $|A| = U = 0$ and $\|U\| = 0$. If $A \ne 0$, then
$$\|U\| = \sup\Big\{\frac{\|Ux\|}{\|x\|} : x \in R(T),\ x \ne 0\Big\} = 1.$$
In all the cases, $\|U\| \le 1$.
3.1.3. Nuclear operators
Remember that $H$ is a separable real infinite-dimensional Hilbert space.
DEFINITION 3.4.– For $A \in S_\infty(H)$, positive eigenvalues $\alpha_n$, $n \ge 1$ of $|A| = (A^*A)^{1/2}$ are called singular values of $A$.
DEFINITION 3.5.– A compact operator $A$ in $H$ is called nuclear if $\sum_{n \ge 1} \alpha_n < \infty$, where $\alpha_n$ are the singular values of $A$ (counted with multiplicity). The class of all nuclear operators in $H$ is denoted as $S_1(H)$.
DEFINITION 3.6.– For $A \in S_1(H)$, the trace of $A$ is defined as
$$\mathrm{tr}\, A = \sum_{1}^{\infty} (Ae_n, e_n), \qquad \text{[3.5]}$$
where $\{e_n\}$ is an arbitrary orthobasis in $H$.
LEMMA 3.2.– (About trace) For $A \in S_1(H)$, the series in [3.5] converges absolutely and its sum does not depend on the choice of orthobasis.
PROOF.– Consider the polar decomposition $A = UT$ of the compact operator $A$. Let $\{e_n\}$ be an eigenbasis of $T$ and $Te_n = \alpha_n e_n$, $n \ge 1$. Then $Ae_n = U(\alpha_n e_n) = \alpha_n U e_n$, $n \ge 1$.
a) We have
$$|(Ae_n, e_n)| \le \alpha_n \cdot \|U\| \le \alpha_n,$$
and the majorizing series $\sum_{1}^{\infty} \alpha_n$ converges. Hence the series [3.5] converges absolutely, and therefore, converges.
b) Take another orthobasis $\{f_n\}$. It holds
$$\sum_{n=1}^{\infty} (Ae_n, e_n) = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} (Ae_n, f_k)(e_n, f_k) = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \alpha_n (Ue_n, f_k)(e_n, f_k).$$
We want to change the order of summation. In order to justify this, bound the double sum:
$$\sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \alpha_n |(Ue_n, f_k)| \cdot |(e_n, f_k)| \le \sum_{n=1}^{\infty} \alpha_n \Big(\sum_{k=1}^{\infty} |(Ue_n, f_k)|^2\Big)^{1/2} \Big(\sum_{k=1}^{\infty} |(e_n, f_k)|^2\Big)^{1/2} = \sum_{n=1}^{\infty} \alpha_n \|Ue_n\| \cdot \|e_n\| \le \sum_{n=1}^{\infty} \alpha_n < \infty.$$
Therefore,
$$\sum_{n=1}^{\infty} (Ae_n, e_n) = \sum_{k=1}^{\infty} \sum_{n=1}^{\infty} (Ae_n, f_k)(e_n, f_k) = \sum_{k=1}^{\infty} \sum_{n=1}^{\infty} (e_n, A^* f_k)(e_n, f_k) = \sum_{k} (f_k, A^* f_k) = \sum_{k} (Af_k, f_k).$$
The latter series converges absolutely because its sum remains unchanged after permutation of vectors in the basis $\{f_k\}$.
REMARK 3.3.– Let $A \in L(\mathbb{R}^n)$ and $\{e_k, k = 1, \dots, n\}$ be an orthobasis in $\mathbb{R}^n$. Then $\mathrm{tr}\, A = \sum_{1}^{n} (Ae_k, e_k) = \sum_{1}^{n} a_{kk}$, where $a_{kj}$ are the matrix entries of $A$ in the basis $\{e_k\}$. Thus, in the finite-dimensional case the trace of an operator coincides with the trace of the matrix that represents the operator.
DEFINITION 3.7.– Let $A \in S_\infty(H)$ and $\{\alpha_n, n \ge 1\}$ be the singular values of $A$ counted with multiplicity. The sum
$$\|A\|_1 = \sum_{n \ge 1} \alpha_n$$
is called nuclear norm of $A$. We see that a compact operator $A$ is nuclear if, and only if, $\|A\|_1 < \infty$.
LEMMA 3.3.– (About triangle inequality) Let $A, B \in S_1(H)$; then $A + B \in S_1(H)$ and $\|A + B\|_1 \le \|A\|_1 + \|B\|_1$.
PROOF.– The operator $A + B$ is compact as a sum of two compact operators. For an arbitrary compact operator $T$, we arrange the eigenvalues of $|T|$ (counted with multiplicity) in descending order:
$$\alpha_1(T) \ge \alpha_2(T) \ge \alpha_3(T) \ge \dots$$
For each $N \ge 1$, it holds (see lemma 4.2 in [GOK 88]):
$$\sum_{1}^{N} \alpha_n(A + B) \le \sum_{1}^{N} \alpha_n(A) + \sum_{1}^{N} \alpha_n(B).$$
Letting $N \to \infty$, we obtain
$$\|A + B\|_1 = \sum_{1}^{\infty} \alpha_n(A + B) \le \sum_{1}^{\infty} \alpha_n(A) + \sum_{1}^{\infty} \alpha_n(B) = \|A\|_1 + \|B\|_1 < \infty.$$
Thus, A + B ∈ S1 (H) and the desired inequality follows.
In view of lemma 3.3, it is evident that $S_1(H)$ is a linear normed space with the nuclear norm. Moreover, it is a Banach space (see problem 6).
Now, we consider some properties of nuclear operators.
THEOREM 3.3.– (Operator norm is dominated by nuclear one) If $A \in S_1(H)$, then $\|A\| \le \|A\|_1$.
PROOF.– We use the polar decomposition $A = UT$, an eigenbasis $\{e_n\}$ of $T$, and the corresponding eigenvalues $\alpha_n$. For $x \in H$,
$$\|Ax\| = \|UTx\| = \|Tx\| = \Big\|\sum_{1}^{\infty} \alpha_n (x, e_n) e_n\Big\| \le \sum_{1}^{\infty} \alpha_n |(x, e_n)| \le \|x\| \cdot \sum_{1}^{\infty} \alpha_n = \|A\|_1 \cdot \|x\|,$$
and the desired inequality follows.
We interpret theorem 3.3 as follows. Consider the canonical embedding $\pi_1 : S_1(H) \to L(H)$, $\pi_1 A = A$. Then $\|\pi_1 A\|_{L(H)} \le \|A\|_1$, and the embedding is continuous; moreover, $\|\pi_1\| \le 1$.
THEOREM 3.4.– Let $A, B \in S_2(H)$. Then $AB \in S_1(H)$ and $\|AB\|_1 \le \|A\|_2 \cdot \|B\|_2$.
PROOF.– a) $AB$ is a compact operator as a product of compact and continuous operators. We use the polar decomposition $AB = UT$, the singular values $s_n = s_n(AB)$ and the corresponding orthonormal system $\{e_n, n \ge 1\}$ of eigenvectors of $T$. It holds
$$s_n = (Te_n, e_n) = (UTe_n, Ue_n) = (ABe_n, Ue_n) = (Be_n, A^* U e_n).$$
The vectors $\{Ue_n, n \ge 1\}$ form an orthonormal system (because $U$ performs an isometry on $R(T)$ and $e_n \in R(T)$, $n \ge 1$), and we complement it to a basis $\{f_k\}$. Then
$$\sum_{n \ge 1} \|A^* U e_n\|^2 \le \sum_{k=1}^{\infty} \|A^* f_k\|^2 = \|A^*\|_2^2 = \|A\|_2^2.$$
b) Now, we bound the nuclear norm of $AB$:
$$\|AB\|_1 = \sum_{n \ge 1} s_n = \sum_{n \ge 1} (Be_n, A^* U e_n) \le \sqrt{\sum_{n \ge 1} \|Be_n\|^2} \cdot \sqrt{\sum_{n \ge 1} \|A^* U e_n\|^2} \le \|B\|_2 \cdot \|A\|_2 < \infty.$$
Thus, AB ∈ S1 (H) and the desired inequality follows.
REMARK 3.4.– Theorem 3.4 implies the following: if $A \in S_2(H)$, then $A^2 \in S_1(H)$. That is why Hilbert–Schmidt operators are sometimes called quasinuclear ones.
REMARK 3.5.– Theorem 3.4 can be interpreted as follows. Consider the bilinear operator
$$\Phi : S_2(H) \times S_2(H) \to S_1(H), \quad \Phi(A, B) = AB.$$
Then its norm can be written as
$$\|\Phi\| = \sup_{\|A\|_2 = \|B\|_2 = 1} \|\Phi(A, B)\|_1,$$
and theorem 3.4 states that $\|\Phi\| \le 1$.
THEOREM 3.5.– (Hilbert–Schmidt norm is dominated by nuclear one) If $A \in S_1(H)$, then $A \in S_2(H)$ and $\|A\|_2 \le \|A\|_1$.
PROOF.– We use the polar decomposition $A = UT$, an eigenbasis $\{e_n\}$ of $T$ and the corresponding eigenvalues $\alpha_n$. We have
$$\|A\|_2^2 = \sum_{1}^{\infty} \|Ae_n\|^2 = \sum_{1}^{\infty} \|Te_n\|^2 = \sum_{1}^{\infty} \alpha_n^2 \le \Big(\sum_{1}^{\infty} \alpha_n\Big)^2 = \|A\|_1^2 < \infty.$$
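In finite dimensions all three norms can be read off the singular values, so theorems 3.3, 3.4 and 3.5 can be sanity-checked numerically. The sketch below does this for real $2 \times 2$ matrices, where the singular values have a closed form through the trace and determinant of $A^T A$. The matrices and the function names are assumptions chosen for illustration, and a numerical check on two matrices is not a substitute for the proofs.

```python
import math

def singular_values_2x2(A):
    """Singular values of a real 2x2 matrix: square roots of the eigenvalues of A^T A."""
    (a, b), (c, d) = A
    frob_sq = a * a + b * b + c * c + d * d          # trace of A^T A
    det = a * d - b * c                              # det(A^T A) equals det^2
    disc = math.sqrt(max(frob_sq * frob_sq - 4.0 * det * det, 0.0))
    s1 = math.sqrt((frob_sq + disc) / 2.0)
    s2 = math.sqrt(max((frob_sq - disc) / 2.0, 0.0))
    return s1, s2

def operator_norm(A):
    """Largest singular value."""
    return singular_values_2x2(A)[0]

def hs_norm(A):
    """Square root of the sum of squared singular values."""
    s1, s2 = singular_values_2x2(A)
    return math.sqrt(s1 * s1 + s2 * s2)

def nuclear_norm(A):
    """Sum of the singular values."""
    return sum(singular_values_2x2(A))

def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

A = [[1.0, 2.0], [0.0, -1.0]]
B = [[0.5, 0.0], [3.0, 1.0]]

chain_holds = operator_norm(A) <= hs_norm(A) <= nuclear_norm(A)            # theorems 3.3 and 3.5
product_holds = nuclear_norm(matmul(A, B)) <= hs_norm(A) * hs_norm(B)      # theorem 3.4
```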
Theorem 3.5 means that for the canonical embedding $\pi_{12} : S_1(H) \to S_2(H)$, $\pi_{12} A = A$, it holds $\|\pi_{12}\| \le 1$.
THEOREM 3.6.– If $A \in S_1(H)$, then $A^* \in S_1(H)$ and $\|A^*\|_1 = \|A\|_1$.
PROOF.– We use the polar decomposition
$$A = UT = (UT^{1/2})\, T^{1/2}.$$
Consider an eigenbasis $\{e_n\}$ of $T$, $Te_n = \alpha_n e_n$, $n \ge 1$. We have
$$\|T^{1/2}\|_2^2 = \sum_{1}^{\infty} \|T^{1/2} e_n\|^2 = \sum_{1}^{\infty} \alpha_n = \|A\|_1 < \infty.$$
Hence $T^{1/2} \in S_2(H)$, and by theorem 3.1 (c) it holds $UT^{1/2} \in S_2(H)$.
Now, $A^* = T^{1/2} (UT^{1/2})^* \in S_1(H)$ as a product of two Hilbert–Schmidt operators (see theorems 3.1(b) and 3.4). Next,
$$\|A^*\|_1 \le \|T^{1/2}\|_2 \cdot \|(UT^{1/2})^*\|_2 = \|T^{1/2}\|_2 \cdot \|UT^{1/2}\|_2 \le \|T^{1/2}\|_2^2 \cdot \|U\| \le \|T^{1/2}\|_2^2 = \|A\|_1.$$
The inequality $\|A^*\|_1 \le \|A\|_1$ holds for any $A \in S_1(H)$. We substitute here $A^*$ instead of $A$, and since $A^{**} = A$ we get the inequality in the opposite direction, $\|A\|_1 \le \|A^*\|_1$. Thus, $\|A^*\|_1 = \|A\|_1$.
The next statement is a complement to theorem 3.4.
COROLLARY 3.1.– Each $A \in S_1(H)$ can be decomposed as a product of two Hilbert–Schmidt operators.
PROOF.– The desired decomposition is $A = (UT^{1/2})\, T^{1/2}$ from the previous proof.
THEOREM 3.7.– (Nuclear operators form an ideal in $L(H)$) If $A \in L(H)$ and $B \in S_1(H)$, then both operators $AB$ and $BA$ are nuclear and
$$\|AB\|_1 \le \|A\| \cdot \|B\|_1, \quad \|BA\|_1 \le \|A\| \cdot \|B\|_1.$$
PROOF.– a) We start with the polar decomposition $B = UT$, where $T \in S_1(H)$. Then $\|T^{1/2}\|_2^2 = \|T\|_1 = \|B\|_1$. Now,
$$AB = (AUT^{1/2})\, T^{1/2} \in S_1(H)$$
as a product of two Hilbert–Schmidt operators. Hence
$$\|AB\|_1 \le \|AUT^{1/2}\|_2 \cdot \|T^{1/2}\|_2 \le \|A\| \cdot \|U\| \cdot \|T^{1/2}\|_2^2 \le \|A\| \cdot \|B\|_1.$$
b) The operator $BA \in S_\infty(H)$, and therefore, its nuclear norm is well-defined. We have by theorem 3.6
$$\|BA\|_1 = \|A^* B^*\|_1 \le \|A^*\| \cdot \|B^*\|_1 = \|A\| \cdot \|B\|_1.$$
Theorem 3.7 means that $S_1(H)$ is a two-sided ideal in the Banach algebra $L(H)$.
Denote by $S_0(H)$ the vector space of all finite-dimensional operators, i.e. operators $A \in L(H)$ such that $\dim R(A) < \infty$. Summarizing, we have a chain of linear spaces
$$S_0(H) \subset S_1(H) \subset S_2(H) \subset S_\infty(H) \subset L(H).$$
Here all inclusions are strict; $S_1(H)$ is a Banach space with the nuclear norm, $S_2(H)$ is a Hilbert space with the Hilbert–Schmidt norm, and $S_\infty(H)$ and $L(H)$ are Banach spaces with the operator norm; $S_0(H)$ is dense in the three spaces $S_1(H)$, $S_2(H)$, $S_\infty(H)$ but not in $L(H)$; the canonical embeddings of $S_1(H)$ in $S_2(H)$ and of $S_2(H)$ in $S_\infty(H)$ are continuous.
3.1.4. S-operators
DEFINITION 3.8.– A self-adjoint, positive, and nuclear operator in $H$ is called S-operator.
EXAMPLE 3.5.– (Diagonal S-operator) For a bounded real sequence $\{a_n, n \ge 1\}$, consider the diagonal operator $A$ in $l_2$, $A = \mathrm{diag}(a_n, n \ge 1)$ (see example 3.1). It is always self-adjoint; $A$ is positive if, and only if, the weights $a_n$ are non-negative, and it is nuclear if, and only if, $\sum_{1}^{\infty} |a_n| < \infty$. Thus, $A$ is S-operator if, and only if, $a_n \ge 0$, $n \ge 1$, and $\sum_{1}^{\infty} a_n < \infty$. It holds $\mathrm{tr}\, A = \sum_{1}^{\infty} a_n$.
Now, let $B$ be an arbitrary S-operator in a real separable infinite-dimensional Hilbert space $H$. Then $B$ is a compact self-adjoint operator, and by the Hilbert–Schmidt theorem there exists an eigenbasis $\{e_n\}$ of $B$, with $Be_n = \alpha_n e_n$, $n \ge 1$. Here $\alpha_n \to 0$ as $n \to \infty$, and $\alpha_n \ge 0$, $n \ge 1$ because $B$ is a positive operator. Since $B$ is nuclear, we have
$$\|B\|_1 = \sum_{1}^{\infty} \alpha_n < \infty.$$
The S-operator $B$ has the form
$$B = \sum_{1}^{\infty} \alpha_n P_{[e_n]}, \qquad \text{[3.6]}$$
where $P_{[e_n]}$ is the projector on $[e_n] := \mathrm{span}(e_n)$ and the series converges in the sense of uniform operator convergence. In particular,
$$Bx = \sum_{1}^{\infty} \alpha_n (x, e_n) e_n, \quad x \in H,$$
where the series converges strongly (i.e. in the norm of $H$).
The converse statement holds true as well: if $\{e_n\}$ is an orthobasis and $\alpha_n \ge 0$, $\sum_{1}^{\infty} \alpha_n < \infty$, then the operator [3.6] is S-operator. Thus, we have a general description of S-operators. Of course, the diagonal S-operator from example 3.5 fits this description.
THEOREM 3.8.– Let $T$ be a self-adjoint positive operator in $H$. Suppose that there exists an orthobasis $\{f_k\}$, with
$$\sum_{1}^{\infty} (Tf_k, f_k) < \infty.$$
Then $T$ is nuclear, and therefore, it is S-operator.
PROOF.– The bilinear form $(Tx, y)$, $x, y \in H$ satisfies all the axioms of an inner product except for a part of the first axiom (it can happen that $(Tx, x) = 0$ for a certain $x \ne 0$). It is the so-called pseudoscalar product. Then by the Cauchy–Schwarz inequality we have
$$|(Tf_i, f_k)|^2 \le (Tf_i, f_i)(Tf_k, f_k), \quad i, k \ge 1.$$
Hence
$$\|T\|_2^2 = \sum_{i,k=1}^{\infty} |(Tf_i, f_k)|^2 \le \sum_{i=1}^{\infty} (Tf_i, f_i) \cdot \sum_{k=1}^{\infty} (Tf_k, f_k) < \infty.$$
Therefore, $T \in S_2(H)$ and $T$ is a compact operator. Moreover, $T$ is self-adjoint and positive; hence there exists $S = T^{1/2}$, which is self-adjoint, positive and compact as well (see proof of theorem 3.2). Then
$$\sum_{1}^{\infty} (Tf_k, f_k) = \sum_{1}^{\infty} (S^2 f_k, f_k) = \sum_{1}^{\infty} \|Sf_k\|^2 < \infty,$$
and $S \in S_2(H)$. Hence $T = S^2 \in S_1(H)$ (see remark 3.4).
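The two inequalities that drive the proof of theorem 3.8 can be observed on a small positive semidefinite matrix. The sketch below forms $T = M^T M$ from a rank-deficient $M$, so that $(Tx, x) = 0$ for some $x \ne 0$ as in the pseudoscalar case, and checks $|T_{ik}|^2 \le T_{ii} T_{kk}$ together with $\|T\|_2^2 \le (\mathrm{tr}\, T)^2$. The matrix `M` and the helper `gram` are assumptions introduced for illustration.

```python
def gram(M):
    """T = M^T M, a symmetric positive semidefinite matrix (list of rows)."""
    n = len(M[0])
    return [[sum(row[i] * row[j] for row in M) for j in range(n)] for i in range(n)]

M = [[1.0, 2.0, 0.0],
     [0.0, 1.0, -1.0]]          # rank 2 in dimension 3, so T has a nontrivial kernel
T = gram(M)
n = len(T)

# Entries T[i][k] play the role of (T f_i, f_k) in the standard basis.
trace = sum(T[i][i] for i in range(n))
hs_sq = sum(T[i][k] ** 2 for i in range(n) for k in range(n))
cauchy_schwarz_ok = all(T[i][k] ** 2 <= T[i][i] * T[k][k] + 1e-12
                        for i in range(n) for k in range(n))
```

Summing the entrywise Cauchy–Schwarz bound over all pairs $(i, k)$ gives exactly the estimate `hs_sq <= trace ** 2` used in the proof.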
The set LS (H) of all S-operators in H is a convex cone in Banach space S1 (H), i.e. LS (H) is closed under linear combinations with positive coefficients. Now, we check that it is a closed subset of S1 (H).
LEMMA 3.4.– (S-operators form closed set) Let $\{S_n, n \ge 1\}$ be S-operators in $H$ and let $S_n$ converge to $S$ in nuclear norm. Then $S$ is S-operator as well.
PROOF.– The operator $S$ is nuclear, and by theorem 3.3
$$\|S_n - S\| \le \|S_n - S\|_1 \to 0 \quad \text{as } n \to \infty.$$
Hence $S_n \rightrightarrows S$ (i.e. $S_n$ converges uniformly). Then $S_n = S_n^* \rightrightarrows S^*$, and $S^* = S$. Finally, the uniform convergence implies the weak operator convergence, and
$$0 \le (S_n x, x) \to (Sx, x) \quad \text{as } n \to \infty,\ x \in H.$$
Thus, $(Sx, x) \ge 0$, $x \in H$. Therefore, $S$ is S-operator.
Finally, we give a criterion for nuclear convergence of S-operators.
THEOREM 3.9.– Let $S$ and $\{S_n, n \ge 1\}$ be S-operators. For the convergence of $S_n$ to $S$ in nuclear norm, it is necessary that for each orthobasis $\{e_i\}$, and it is sufficient that there exists an orthobasis $\{e_i\}$, such that
$$(S_n e_i, e_j) \to (Se_i, e_j), \quad \text{for all } i \ne j, \qquad \text{[3.7]}$$
$$\sum_{i=1}^{\infty} |(S_n e_i, e_i) - (Se_i, e_i)| \to 0 \quad \text{as } n \to \infty. \qquad \text{[3.8]}$$
PROOF.– a) Necessity: Part 1
Assume that $S_n$ converges to $S$ in nuclear norm. Then $S_n$ converges to $S$ weakly (see proof of lemma 3.4). This implies [3.7], for any orthobasis $\{e_i\}$.
b) Auxiliary statement
We prove that for any $A \in S_1(H)$ and any orthobasis $\{f_k\}$,
$$\sum_{1}^{\infty} |(Af_k, f_k)| \le \sum_{k \ge 1} \alpha_k = \|A\|_1, \qquad \text{[3.9]}$$
where $\alpha_k = \alpha_k(A)$ are the singular values of $A$ counted with multiplicity. Indeed, consider the polar decomposition $A = UT$ and the orthonormal system $\{e_i\}$, with $Te_i = \alpha_i e_i$, $i \ge 1$. Then $Ae_i = \alpha_i g_i$, where $\{g_i = Ue_i, i \ge 1\}$ is another orthonormal system. It holds
$$Ax = A\Big(\sum_{i \ge 1} (x, e_i) e_i\Big) = \sum_{i \ge 1} \alpha_i (x, e_i) g_i,$$
$$(Af_k, f_k) = \sum_{i \ge 1} \alpha_i (f_k, e_i)(f_k, g_i),$$
$$\sum_{1}^{\infty} |(Af_k, f_k)| \le \sum_{i \ge 1} \alpha_i \Big(\sum_{k=1}^{\infty} |(f_k, e_i)|^2\Big)^{1/2} \Big(\sum_{k=1}^{\infty} |(f_k, g_i)|^2\Big)^{1/2} \le \sum_{i \ge 1} \alpha_i,$$
and [3.9] is proven.
c) Necessity: Part 2
Come back to the operators $S_n$ that converge to $S$ in nuclear norm. Using [3.9], we get
$$\sum_{i=1}^{\infty} |(S_n e_i, e_i) - (Se_i, e_i)| = \sum_{i=1}^{\infty} |((S_n - S)e_i, e_i)| \le \|S_n - S\|_1 \to 0 \quad \text{as } n \to \infty.$$
d) Sufficiency: Part 1 – properties of the projectors
Now, we assume that [3.7] and [3.8] hold true for a fixed orthobasis $\{e_i\}$, and we want to show that $\|S_n - S\|_1 \to 0$ as $n \to \infty$.
For arbitrary $n$, split $H = H_n \oplus H^n$, where $H_n = \mathrm{span}(e_1, \dots, e_n)$. Let $P_n$ and $P^n$ be the projectors on $H_n$ and $H^n$, respectively. The conditions imply that
$$P_n S_k P_n \rightrightarrows P_n S P_n \quad \text{as } k \to \infty, \qquad \text{[3.10]}$$
and for $\delta > 0$,
$$\mathrm{tr}(P^n S_k P^n) \le \sum_{i=1}^{\infty} |(S_k e_i, e_i) - (Se_i, e_i)| + \sum_{i=n+1}^{\infty} (Se_i, e_i) < \delta$$
if $n \ge n_\delta$ and $k \ge k_\delta$, therefore,
$$\limsup_{n\to\infty} \sup_{k \ge 1} \mathrm{tr}(P^n S_k P^n) \le \delta + \limsup_{n\to\infty} \sum_{k=1}^{k_\delta} \mathrm{tr}(P^n S_k P^n) = \delta;$$
hence
$$\sup_{k \ge 1} \mathrm{tr}(P^n S_k P^n) \to 0 \quad \text{as } n \to \infty.$$
The latter means that the series $\sum_{i=1}^{\infty} (S_k e_i, e_i)$ converges uniformly in $k \ge 1$; we can formalize it as follows: for each $k \ge 1$ and $n \ge 1$, $\mathrm{tr}(P^n S_k P^n) \le \varepsilon_n$ and $\mathrm{tr}(P^n S P^n) \le \varepsilon_n$, with $\varepsilon_n \to 0$ as $n \to \infty$.
e) Sufficiency: Part 2 – construction of S-operator which is close to $S_k - S$
Introduce an operator for $k, n \ge 1$ and $\delta > 0$:
$$W = S_k - S + \frac{\delta}{n} P_n + \delta (P_n S_k P_n + P_n S P_n) + (1 + \delta^{-1})(P^n S_k P^n + P^n S P^n).$$
It is a self-adjoint and nuclear operator as a sum of such operators. We show that for $k \ge k_{n\delta}$, it is positive. We have
$$(Wx, x) \ge \Big[(P_n (S_k - S) P_n x, x) + \frac{\delta}{n}(P_n x, x)\Big] + \Big[-2(P^n S P_n x, x) + \delta (P_n S P_n x, x) + \delta^{-1}(P^n S P^n x, x)\Big] =: R_1 + R_2.$$
Hereafter, we compare self-adjoint operators $A$ and $B$ in the so-called Loewner order: $A \ge B$ means that $A - B$ is a positive operator. Now, due to [3.10], for $k \ge k_{n\delta}$ it holds
$$\frac{\delta}{n} P_n \ge -P_n (S_k - S) P_n \ \Rightarrow\ R_1 \ge 0;$$
$$2|(P^n S P_n x, x)| = 2|(S^{1/2} P_n x, S^{1/2} P^n x)| \le 2\|S^{1/2} P_n x\| \cdot \|S^{1/2} P^n x\| \le \delta \cdot \|S^{1/2} P_n x\|^2 + \delta^{-1} \cdot \|S^{1/2} P^n x\|^2 = \delta (P_n S P_n x, x) + \delta^{-1}(P^n S P^n x, x),$$
and $R_2 \ge 0$. Hence for all $k \ge k_{n\delta}$ the operator $W$ is S-operator.
f) Sufficiency: Part 3 – final proof of the desired convergence
We bound the trace of $W$ using the relation $\mathrm{tr}\big(\frac{\delta}{n} P_n\big) = \delta$:
$$\mathrm{tr}\, W \le |\mathrm{tr}\, P_n (S_k - S) P_n| + \delta(1 + 2\varepsilon_1) + (3 + 2\delta^{-1})\varepsilon_n,$$
which can be made arbitrarily small by choosing $\delta$, $n$, $k$ in this order. Also, $W = S_k - S + W_{kn\delta}$, where $W_{kn\delta}$ is S-operator, with
$$\mathrm{tr}\, W_{kn\delta} \le \delta(1 + 2\varepsilon_1) + 2(1 + \delta^{-1})\varepsilon_n,$$
and this can be made arbitrarily small by choosing $\delta$ and $n$ in this order.
Finally, for $\delta \ge \delta_0$, $n \ge n_0(\delta_0)$ and $k \ge k_0(\delta_0, n_0)$, both operators $W$ and $W_{kn\delta}$ are S-operators, and
$$\|S_k - S\|_1 \le \|W\|_1 + \|W_{kn\delta}\|_1 = \mathrm{tr}\, W + \mathrm{tr}\, W_{kn\delta},$$
which can be made arbitrarily small for all $k \ge k_0(\delta_0, n_0)$ by choosing appropriate $\delta = \delta_0$ and $n = n_0(\delta_0)$. This proves the desired convergence $S_k \to S$ in $S_1(H)$.

REMARK 3.6.– In view of the proof of theorem 3.8, relations [3.7] and [3.8] can be formulated as follows: $(S_n e_i, e_j) \to (Se_i, e_j)$, for all $i, j \ge 1$, and $\sum_{i=1}^{\infty}(S_n e_i, e_i)$ converges uniformly in $n \ge 1$.

Problems 3.1

1) Find the norm of the canonical embedding $J : S_2(H) \to L(H)$.

2) Consider $\mathbb{C}$ as a complex Hilbert space, with inner product $(u, v) = u\bar{v}$. For $z \in \mathbb{C}\setminus\{0\}$, introduce a linear operator $A_z w = zw$, $w \in \mathbb{C}$. Find the polar decomposition of $A_z$.

3) Let $H$ be a separable Hilbert space and $y, z \in H\setminus\{0\}$. Find the polar decomposition of the one-dimensional operator $A$, with $Ax = (x, y)z$, $x \in H$.

4) Let $A \in S_\infty(H)$, where $H$ is an infinite-dimensional separable Hilbert space, and let $\{\alpha_n, n \ge 1\}$ be singular values of $A$. Prove the following:
a) $A \in S_0(H)$ if, and only if, the number of singular values $\alpha_n$ is finite;
b) $A \in S_2(H)$ if, and only if, $\sum_{n\ge 1}\alpha_n^2 < \infty$ (here singular values are counted with multiplicity).

5) Problem 567 in [KIR 82] implies that each positive self-adjoint operator $A$ in a Hilbert space can be decomposed as $A = B^2$, where $B$ is a positive self-adjoint operator as well, and $B$ is unique ($B$ is called the square root of $A$ and denoted as $B = \sqrt{A} = A^{1/2}$). Based on this fact, prove the following generalization of theorem 3.2: each linear bounded operator $A$ in a Hilbert space $H_0$ can be decomposed as $A = UT$, where $T$ is a self-adjoint positive operator and $U \in L(H_0)$, $U$ performs an isometry of $R(T)$ into $H_0$; moreover, such operator $T$ is unique (it is denoted as $|A|$).

6) Prove that the normed space $(S_1(H), \|\cdot\|_1)$ is a Banach space.

7) For the canonical embedding $\pi_1 : S_1(H) \to L(H)$, prove that $\|\pi_1\| = 1$.

8) Prove that $S_0(H)$ is dense in $S_1(H)$ in nuclear norm.

9) Prove that for each $B \in L(H)$, $f(A) := \mathrm{tr}(AB)$ is a linear continuous functional on $S_1(H)$ and find $\|f\|$.

10) In the real Hilbert space $L_2[0, 1]$, consider the integral operator
\[ (Ax)(t) = \int_0^1 \min(t, s)\, x(s)\, ds, \quad 0 \le t \le 1,\ x \in L_2[0, 1]. \]
Its eigenvalues and the corresponding normalized eigenfunctions are as follows:
\[ \lambda_n = \frac{1}{\big(n + \frac12\big)^2 \pi^2}, \qquad \varphi_n(t) = \sqrt{2}\, \sin\Big(\big(n + \tfrac12\big)\pi t\Big), \quad n \ge 0. \]
Prove that $A$ is an S-operator.

3.2. Pettis and Bochner integrals

In definition 2.9, we introduced the mean value of a random element and of a probability measure in $H$. Actually, it is the so-called weak integral, which is defined using linear functionals $f_h(x) = (x, h)$, $x \in H$. We will see that under natural conditions, this integral coincides with the so-called strong integral, which is defined as a strong limit of integrals of simple functions.

3.2.1. Weak integral

Consider a probability space $(\Omega, \mathcal{F}, \mathsf{P})$, a normed vector space $X$ and a random element $\xi : \Omega \to X$ that has the weak first order, i.e. the mapping $\xi$ is $(\mathcal{F}, \mathcal{B}(X))$ measurable and for each $x^* \in X^*$, the weak first moment
\[ \mathsf{E}\langle \xi(\omega), x^* \rangle = \int_\Omega \langle \xi(\omega), x^* \rangle \, d\mathsf{P}(\omega) \]
is finite. Hereafter $\langle x, x^* \rangle$ denotes the value of a functional $x^*$ at a vector $x$.

DEFINITION 3.9.– Given a random element $\xi$ in $X$ of the weak first order, suppose that there exists $m \in X$ such that for all $x^* \in X^*$, it holds $\mathsf{E}\langle \xi(\omega), x^* \rangle = \langle m, x^* \rangle$. The vector $m$ is called Pettis integral, or weak integral, of $\xi$ over the measure $\mathsf{P}$ and denoted as $\mathsf{E}\,\xi$ or $\int_\Omega \xi(\omega)\, d\mathsf{P}(\omega)$.

Let $\mu$ be a probability measure on $\mathcal{B}(X)$, possessing weak first moments, i.e. for each $x^* \in X^*$ the integral $\int_X \langle x, x^* \rangle\, d\mu(x)$ is finite. Then the identity operator $I : X \to X$ is a random element on the probability space $(X, \mathcal{B}(X), \mu)$, and the mean value $m_\mu$ of $\mu$ is defined as Pettis integral of $I$. In other words, it holds
\[ \langle m_\mu, x^* \rangle = \int_X \langle x, x^* \rangle\, d\mu(x), \quad x^* \in X^*. \]
One can write that in the sense of Pettis integral,
\[ m_\mu = \int_X x\, d\mu(x). \]
It is clear that $m_\mu = \mathsf{E}\,\eta$, where $\eta$ is any random element in $X$ with distribution $\mu$.

LEMMA 3.5.– Let $\xi$ and $\eta$ be random elements in $X$ defined on the same probability space, with finite Pettis integrals $\mathsf{E}\,\xi$ and $\mathsf{E}\,\eta$. Then the following statements hold true:
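a) $\mathsf{E}\,\xi$ is uniquely defined;
b) if $\xi + \eta$ is a random element in $X$, then there exists Pettis integral $\mathsf{E}(\xi + \eta)$, and $\mathsf{E}(\xi + \eta) = \mathsf{E}\,\xi + \mathsf{E}\,\eta$;
c) it holds $\|\mathsf{E}\,\xi\| \le \mathsf{E}\,\|\xi\|$;
d) let $A$ be a linear bounded operator from $X$ into a normed vector space $Y$, then there exists Pettis integral $\mathsf{E}(A\xi)$, and $\mathsf{E}(A\xi) = A(\mathsf{E}\,\xi)$.

PROOF.–
a) Suppose that for some distinct vectors $m_1, m_2 \in X$, we have
\[ \langle m_1, x^* \rangle = \mathsf{E}\langle \xi, x^* \rangle = \langle m_2, x^* \rangle, \quad x^* \in X^*. \]
Then for all $x^* \in X^*$, $\langle m_1, x^* \rangle = \langle m_2, x^* \rangle$ and $m_1 \ne m_2$. But a corollary of Hahn–Banach theorem states that there exists $x_0^* \in X^*$, with $\langle m_1, x_0^* \rangle \ne \langle m_2, x_0^* \rangle$ (i.e. $x_0^*$ separates points $m_1$ and $m_2$). The obtained contradiction proves the uniqueness of $\mathsf{E}\,\xi$.

b) Proof is straightforward and based on the linearity of Lebesgue integral.

c) $\|\xi\|$ is a r.v. as a Borel function of a random element. If $\mathsf{E}\,\|\xi\| = \infty$, then the inequality is true. Now, let $\mathsf{E}\,\|\xi\| < \infty$. Denote $\mathsf{E}\,\xi = m$. We have
\[ |\langle m, x^* \rangle| = |\mathsf{E}\langle \xi, x^* \rangle| \le \mathsf{E}\,|\langle \xi, x^* \rangle| \le \mathsf{E}\,\|\xi\| \cdot \|x^*\|, \quad x^* \in X^*. \]
Introduce a linear continuous functional $F_m(x^*) = \langle m, x^* \rangle$, $x^* \in X^*$. Since $|F_m(x^*)| \le \mathsf{E}\,\|\xi\| \cdot \|x^*\|$, it holds $\|m\| = \|F_m\| \le \mathsf{E}\,\|\xi\|$. Here, we used the fact that the embedding $X \ni m \mapsto F_m \in X^{**}$ is isometric.

d) $A\xi$ is a random element in $Y$ as a composition of measurable mappings. It holds for any $y^* \in Y^*$:
\[ \mathsf{E}\langle A\xi, y^* \rangle = \mathsf{E}\langle \xi, A^* y^* \rangle = \langle \mathsf{E}\,\xi, A^* y^* \rangle = \langle A(\mathsf{E}\,\xi), y^* \rangle. \]
Hence $A(\mathsf{E}\,\xi)$ is Pettis integral of $A\xi$. Here, $A^* : Y^* \to X^*$ is the adjoint operator to $A$.

3.2.2. Strong integral

Now, let $B$ be a separable Banach space.

DEFINITION 3.10.– A random element $\eta$ in $B$ is called simple if its range $\eta(\Omega)$ is finite.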
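It is clear that a simple random element $\eta$ in $B$ can be represented as
\[ \eta(\omega) = \sum_{k=1}^{n} c_k I_{E_k}(\omega), \quad \omega \in \Omega, \]
where $n \ge 1$, $\{c_k\} \subset B$, and $E_1, \dots, E_n$ are disjoint random events. Then
\[ \|\eta(\omega)\| = \sum_{k=1}^{n} \|c_k\| \cdot I_{E_k}(\omega), \quad \omega \in \Omega, \]
is a simple r.v.

LEMMA 3.6.– Let $L_0(\Omega)$ be the set of all random elements in $B$ with underlying probability space $(\Omega, \mathcal{F}, \mathsf{P})$, where random elements $\xi$ and $\eta$ are identified if $\mathsf{P}(\xi = \eta) = 1$. Then
\[ \rho(\xi, \eta) = \mathsf{E}\, \frac{\|\xi - \eta\|}{1 + \|\xi - \eta\|}, \quad \xi, \eta \in L_0(\Omega), \qquad [3.11] \]
is a metric that metrizes the convergence in probability.

PROOF.–
a) First we show that for any random elements $\xi$ and $\eta$ in the separable Banach space $B$, $\xi + \eta \in L_0(\Omega)$ as well. Consider $z : \Omega \to B \times B$, $z(\omega) = (\xi(\omega); \eta(\omega))$. This mapping is $(\mathcal{F}, \mathcal{B}(B \times B))$ measurable, because
\[ z^{-1}(A_1 \times A_2) = (\xi^{-1}A_1) \cap (\eta^{-1}A_2) \in \mathcal{F}, \quad A_1 \times A_2 \in \mathcal{B}(B) \times \mathcal{B}(B), \]
and $\sigma a(\mathcal{B}(B) \times \mathcal{B}(B)) = \mathcal{B}(B \times B)$ due to the separability of $B$. Next, the mapping $K : B \times B \to B$, $K(x, y) = x + y$ is continuous, and therefore, it is $(\mathcal{B}(B \times B), \mathcal{B}(B))$ measurable. Hence the mapping $\xi(\omega) + \eta(\omega) = K(z(\omega))$, $\omega \in \Omega$, is $(\mathcal{F}, \mathcal{B}(B))$ measurable as a composition of measurable functions, and $\xi + \eta$ indeed belongs to $L_0(\Omega)$. Also it is clear that for any scalar $\lambda$ and any $\xi \in L_0(\Omega)$, $\lambda\xi \in L_0(\Omega)$ as well. Thus, $L_0(\Omega)$ is a vector space (provided the underlying Banach space is separable).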
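b) In [3.11], $\xi - \eta \in L_0(\Omega)$ and $\|\xi - \eta\|$ is a r.v. Hence $\rho(\xi, \eta)$ is well-defined. It is verified directly, using lemma 2.1, that $\rho(\xi, \eta)$ is indeed a metric on $L_0(\Omega)$.

c) If $\{\xi, \xi_n, n \ge 1\} \subset L_0(\Omega)$ and $\xi_n$ converges in probability to $\xi$, i.e. $\|\xi_n - \xi\| \xrightarrow{\mathsf{P}} 0$, then
\[ \rho(\xi_n, \xi) = \mathsf{E}\, \frac{\|\xi_n - \xi\|}{1 + \|\xi_n - \xi\|} \to 0 \quad \text{as } n \to \infty \]
by Lebesgue dominated convergence theorem.

d) Vice versa, suppose that $\rho(\xi_n, \xi) \to 0$ as $n \to \infty$. For any $\varepsilon > 0$,
\[ \mathsf{P}\{\|\xi_n - \xi\| > \varepsilon\} = \mathsf{P}\Big\{ \frac{\|\xi_n - \xi\|}{1 + \|\xi_n - \xi\|} > \frac{\varepsilon}{1 + \varepsilon} \Big\} \le \mathsf{E}\, \frac{\|\xi_n - \xi\|}{1 + \|\xi_n - \xi\|} \cdot \frac{1 + \varepsilon}{\varepsilon} \to 0 \quad \text{as } n \to \infty. \]
Hence, $\|\xi_n - \xi\| \xrightarrow{\mathsf{P}} 0$ and $\xi_n$ converges to $\xi$ in probability.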
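The inequality used in part d) can be checked numerically. The following is only a sketch under stated assumptions: the finite probability space, the real-valued random variables and the names `rho`, `tail_prob` and `xi_n` are hypothetical choices made for illustration and do not come from the text. The sequence is chosen so that $\xi_n \to 0$ in probability (and in the metric $\rho$), although $\mathsf{E}|\xi_n| = 1$ for the values of $n$ used here.

```python
from fractions import Fraction

def rho(xs, ys, probs):
    """rho(X, Y) = E[ |X - Y| / (1 + |X - Y|) ] on a finite probability space."""
    total = Fraction(0)
    for x, y, p in zip(xs, ys, probs):
        d = abs(Fraction(x) - Fraction(y))
        total += p * d / (1 + d)
    return total

def tail_prob(xs, ys, probs, eps):
    """P{ |X - Y| > eps }."""
    return sum((p for x, y, p in zip(xs, ys, probs) if abs(x - y) > eps),
               Fraction(0))

# Hypothetical setting: Omega = {0, ..., N-1} with uniform weights, xi = 0.
N = 1000
probs = [Fraction(1, N)] * N
xi = [0] * N

def xi_n(n):
    """xi_n equals n on an event of probability about 1/n and 0 elsewhere."""
    return [n if w < N // n else 0 for w in range(N)]

eps = Fraction(1, 2)
for n in (1, 10, 100):
    r = rho(xi_n(n), xi, probs)
    t = tail_prob(xi_n(n), xi, probs, eps)
    # Inequality from part d): P{|xi_n - xi| > eps} <= (1 + eps) / eps * rho
    assert t <= (1 + eps) / eps * r
    print(n, float(r), float(t))
```

Exact rational arithmetic is used so that the inequality is tested without rounding effects.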
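LEMMA 3.7.– Let $B$ be a separable Banach space and $\xi$ be a random element in $B$. Then there exists a sequence $\{\xi_n, n \ge 1\}$ of simple random elements in $B$ such that $\|\xi_n - \xi\| \to 0$ as $n \to \infty$, a.s.

PROOF.– Fix $\varepsilon > 0$. Because $B$ is separable, it can be partitioned as $B = \bigcup_{i=1}^{\infty} B_i$, where $B_i$ are disjoint Borel sets, with
\[ \mathrm{diam}\, B_i = \sup_{x, y \in B_i} \|x - y\| \le \varepsilon, \quad i \ge 1. \]
Introduce a random element
\[ \eta_\varepsilon(\omega) = \sum_{i=1}^{\infty} b_i \cdot I_{\xi^{-1}B_i}(\omega), \quad \omega \in \Omega, \qquad [3.12] \]
where $b_i \in B_i$ and the series converges strongly for each $\omega \in \Omega$. Then $\|\xi - \eta_\varepsilon\| \le \varepsilon$, and $\eta_{1/n}$ converges pointwise to $\xi$. Hence $\rho(\eta_{1/n}, \xi) \to 0$ as $n \to \infty$, and for some number $N$ it holds $\rho(\eta_{1/N}, \xi) < \frac{\varepsilon}{2}$. For $\eta_{1/N}$, we have an expression like [3.12]:
\[ \eta_{1/N} = \sum_{i=1}^{\infty} c_i \cdot I_{\xi^{-1}C_i}, \quad c_i \in C_i, \quad i \ge 1. \]
Simple random elements $\tau_M = \sum_{i=1}^{M} c_i \cdot I_{\xi^{-1}C_i}$ converge strongly to $\eta_{1/N}$ for each $\omega \in \Omega$; hence for some $M_0$, it holds $\rho(\tau_{M_0}, \eta_{1/N}) < \frac{\varepsilon}{2}$. Thus, $\rho(\tau_{M_0}, \xi) < \varepsilon$, and $\xi$ can be approximated by simple random elements in the metric space $(L_0(\Omega), \rho)$.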
Therefore, one can construct a sequence $\{\alpha_n\}$ of simple random elements such that $\rho(\alpha_n, \xi) \to 0$ as $n \to \infty$, and $\|\alpha_n - \xi\| \xrightarrow{\mathsf{P}} 0$. Now, by Riesz theorem there exists a subsequence $\{\alpha_{n_k}\}$ such that $\|\alpha_{n_k} - \xi\| \to 0$ as $k \to \infty$, a.s. Then $\{\alpha_{n_k}\}$ is the desired sequence of simple random elements.

At the first stage, we define the strong integral of a simple random element. Let
\[ \xi = \sum_{k=1}^{m} c_k I_{E_k}, \qquad [3.13] \]
where $\{E_k\}$ are disjoint random events and $c_k \in B$, $k = 1, \dots, m$.

DEFINITION 3.11.– Bochner integral of a simple random element with representation [3.13] is defined as
\[ \mathsf{E}\,\xi = \int_\Omega \xi(\omega)\, d\mathsf{P}(\omega) = \sum_{k=1}^{m} c_k \mathsf{P}(E_k). \]
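Definition 3.11 can be illustrated by a small computation. This is a sketch under stated assumptions: $B$ is taken to be $\mathbb{R}^2$ with the Euclidean norm, and the values $c_k$, the probabilities $\mathsf{P}(E_k)$ and the helper names `bochner_simple` and `norm` are hypothetical. It also checks that refining the partition $\{E_k\}$ does not change the integral, and that $\|\mathsf{E}\,\xi\| \le \mathsf{E}\,\|\xi\|$ in this instance.

```python
import math

def bochner_simple(values, probs):
    """E xi = sum_k c_k P(E_k) for a simple random element with values c_k in R^d."""
    dim = len(values[0])
    return [sum(p * c[j] for c, p in zip(values, probs)) for j in range(dim)]

def norm(v):
    """Euclidean norm in R^d."""
    return math.sqrt(sum(t * t for t in v))

# Hypothetical simple random element: three values taken on disjoint events.
values = [(3.0, 4.0), (-3.0, 4.0), (0.0, 0.0)]
probs = [0.25, 0.25, 0.5]

mean = bochner_simple(values, probs)
mean_of_norm = sum(p * norm(c) for c, p in zip(values, probs))

# The same random element written with a finer partition (first event split in two).
values_refined = [(3.0, 4.0), (3.0, 4.0), (-3.0, 4.0), (0.0, 0.0)]
probs_refined = [0.125, 0.125, 0.25, 0.5]
mean_refined = bochner_simple(values_refined, probs_refined)

assert mean == mean_refined            # the integral does not depend on the representation
assert norm(mean) <= mean_of_norm      # ||E xi|| <= E ||xi||
```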
It is easy to verify that Bochner integral given in definition 3.11 is well-defined.

LEMMA 3.8.– Let $\xi$ be a random element in the Banach space $B$ and $\{\xi_n\}$ be simple random elements in $B$ such that $\mathsf{E}\,\|\xi_n - \xi\| \to 0$ as $n \to \infty$. Then the sequence $\mathsf{E}\,\xi_n$ of Bochner integrals converges in $B$ (in strong sense).

PROOF.– We use evident properties of Bochner integral of simple random elements:
\[ \|\mathsf{E}\,\xi_n - \mathsf{E}\,\xi_m\| = \|\mathsf{E}(\xi_n - \xi_m)\| \le \mathsf{E}\,\|\xi_n - \xi_m\| \le \mathsf{E}\,\|\xi_n - \xi\| + \mathsf{E}\,\|\xi_m - \xi\| \to 0 \quad \text{as } n, m \to \infty. \]
Then $\{\mathsf{E}\,\xi_n\}$ is a Cauchy sequence in the complete normed space $B$. Thus, it strongly converges in $B$.

DEFINITION 3.12.– Let $\xi$ be a random element in the Banach space $B$ and suppose that there exists a sequence $\{\xi_n\}$ of simple random elements in $B$ such that $\mathsf{E}\,\|\xi_n - \xi\| \to 0$ as $n \to \infty$; by lemma 3.8 there exists the strong limit of $\mathsf{E}\,\xi_n$, and this limit is called Bochner integral of $\xi$ and denoted as $\mathsf{E}\,\xi = \int_\Omega \xi(\omega)\, d\mathsf{P}(\omega)$.

REMARK 3.7.– In definition 3.12, the limit does not depend on the approximating sequence.

PROOF.– Let $\mathsf{E}\,\|\xi_n - \xi\| \to 0$ and $\mathsf{E}\,\|\eta_n - \xi\| \to 0$ as $n \to \infty$, where $\xi_n$ and $\eta_n$ are simple random elements in $B$. Then
\[ \|\mathsf{E}\,\xi_n - \mathsf{E}\,\eta_n\| \le \mathsf{E}\,\|\xi_n - \eta_n\| \le \mathsf{E}\,\|\xi_n - \xi\| + \mathsf{E}\,\|\eta_n - \xi\| \to 0 \]
as $n \to \infty$. Hence
\[ \lim_{n\to\infty} \mathsf{E}\,\xi_n = \lim_{n\to\infty} \mathsf{E}\,\eta_n. \]
Let $\mu$ be a probability measure on $\mathcal{B}(B)$. Then the identity operator $I : B \to B$ is a random element on the probability space $(B, \mathcal{B}(B), \mu)$, and the mean value $m_\mu$ of $\mu$ (in strong sense) is defined as Bochner integral of $I$ (if the latter exists according to definition 3.12). In this case, we write
\[ m_\mu = \int_B x\, d\mu(x) \quad \text{(in strong sense)}. \qquad [3.14] \]
If $\eta$ is any random element in $B$ with distribution $\mu$ and $\int_B x\, d\mu(x)$ exists in strong sense, then $\mathsf{E}\,\eta$ exists as Bochner integral and $\mathsf{E}\,\eta = \int_B x\, d\mu(x)$.

In other words, the existence of the strong integral [3.14] means the following. First, we consider a simple measurable function $p : B \to B$ of the form
\[ p(x) = \sum_{k=1}^{m} c_k I_{E_k}(x), \quad x \in B, \]
where $E_k$ are disjoint Borel sets and $c_k \in B$, $k = 1, \dots, m$. Bochner integral of $p$ is $\int_B p(x)\, d\mu(x) = \sum_{k=1}^{m} c_k \mu(E_k)$. Next, assume that there exists a sequence $\{p_n\}$ of simple measurable functions such that $\int_B \|p_n(x) - x\|\, d\mu(x) \to 0$ as $n \to \infty$. Then the strong integral is $\int_B x\, d\mu(x) = \lim_{n\to\infty} \int_B p_n(x)\, d\mu(x)$.

Now, we describe the relation between Bochner and Pettis integrals.

THEOREM 3.10.– Let $\xi$ be a random element in the Banach space $B$. If $\mathsf{E}\,\xi$ exists in the strong sense, then it exists in the weak sense, and the strong and weak values of expectation coincide.

PROOF.– For a simple random element $\eta = \sum_{k=1}^{m} c_k I_{E_k}$ with representation like in [3.13],
\[ \langle \mathsf{E}\,\eta, x^* \rangle = \Big\langle \sum_{k=1}^{m} c_k \mathsf{P}(E_k), x^* \Big\rangle = \sum_{k=1}^{m} \mathsf{P}(E_k) \langle c_k, x^* \rangle = \mathsf{E}\,\Big\langle \sum_{k=1}^{m} c_k I_{E_k}(\omega), x^* \Big\rangle = \mathsf{E}\langle \eta, x^* \rangle, \quad x^* \in B^*. \]
Let $\{\xi_n\}$ be a sequence of simple random elements in $B$, with $\mathsf{E}\,\|\xi_n - \xi\| \to 0$ as $n \to \infty$. Then $\mathsf{E}\,\xi_n \to \mathsf{E}\,\xi$ as $n \to \infty$. The latter convergence is strong in $B$, and this implies that $\mathsf{E}\,\xi_n \to \mathsf{E}\,\xi$ weakly as well. Hence
\[ \langle \mathsf{E}\,\xi_n, x^* \rangle \to \langle \mathsf{E}\,\xi, x^* \rangle \quad \text{as } n \to \infty. \]
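We want to let $n$ tend to infinity in the relation $\langle \mathsf{E}\,\xi_n, x^* \rangle = \mathsf{E}\langle \xi_n, x^* \rangle$. To be able to do it on the right-hand side, consider
\[ \mathsf{E}\,|\langle \xi, x^* \rangle| \le \mathsf{E}\,|\langle \xi - \xi_n, x^* \rangle| + \mathsf{E}\,|\langle \xi_n, x^* \rangle| \le \mathsf{E}\,\|\xi - \xi_n\| \cdot \|x^*\| + \mathsf{E}\,\|\xi_n\| \cdot \|x^*\| < \infty \]
for $n$ large enough; hence $\mathsf{E}\langle \xi, x^* \rangle$ is finite; moreover,
\[ |\mathsf{E}\langle \xi_n, x^* \rangle - \mathsf{E}\langle \xi, x^* \rangle| \le \mathsf{E}\,|\langle \xi_n - \xi, x^* \rangle| \le \mathsf{E}\,\|\xi_n - \xi\| \cdot \|x^*\| \to 0 \]
as $n \to \infty$, and therefore,
\[ \mathsf{E}\langle \xi_n, x^* \rangle \to \mathsf{E}\langle \xi, x^* \rangle \quad \text{as } n \to \infty. \]
Thus, $\langle \mathsf{E}\,\xi, x^* \rangle = \mathsf{E}\langle \xi, x^* \rangle$. This proves that the strong integral $\mathsf{E}\,\xi$ coincides with the corresponding weak integral.

THEOREM 3.11.– (Criterion for existence of Bochner integral) Let $\xi$ be a random element in the separable Banach space $B$. There exists Bochner integral $\mathsf{E}\,\xi$ if, and only if, $\mathsf{E}\,\|\xi\| < \infty$ (i.e. when the strong first moment $\mathsf{E}\,\|\xi\|$ is finite).

PROOF.–
a) Necessity. There exists a sequence $\{\xi_n, n \ge 1\}$ of simple random elements in $B$, with $\mathsf{E}\,\|\xi_n - \xi\| \to 0$ as $n \to \infty$. Hence $\mathsf{E}\,\|\xi_n - \xi\| < \infty$, for all $n \ge n_0$. We have
\[ \mathsf{E}\,\|\xi\| \le \mathsf{E}\,\|\xi_{n_0} - \xi\| + \mathsf{E}\,\|\xi_{n_0}\| < \infty \ \Rightarrow\ \mathsf{E}\,\|\xi\| < \infty. \]

b) Sufficiency. Now, we assume that $\mathsf{E}\,\|\xi\| < \infty$. By lemma 3.7, due to the separability of $B$ there exists a sequence $\{\xi_n\}$ of simple random elements in $B$, with $\|\xi_n - \xi\| \to 0$ a.s. Now, introduce simple random elements
\[ \eta_n = \begin{cases} \xi_n, & \text{if } \|\xi_n\| \le 2\|\xi\|, \\ 0, & \text{otherwise.} \end{cases} \]
First, we check that $\|\eta_n - \xi\| \to 0$ a.s. Indeed, if $\xi(\omega) = 0$, then $\eta_n(\omega) = 0$, and $\|\eta_n(\omega) - \xi(\omega)\| \to 0$; next, let $\xi(\omega) \ne 0$ and $\|\xi_n(\omega) - \xi(\omega)\| \to 0$, then $\|\xi_n(\omega)\| \to \|\xi(\omega)\|$ and $\|\xi_n(\omega)\| \le 2\|\xi(\omega)\|$, for $n \ge n_0(\omega)$; hence $\eta_n(\omega) = \xi_n(\omega)$ for $n \ge n_0(\omega)$, and for such $n$, it holds
\[ \|\eta_n(\omega) - \xi(\omega)\| = \|\xi_n(\omega) - \xi(\omega)\| \to 0 \quad \text{as } n \to \infty. \]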
We proved that $\eta_n$ strongly converges to $\xi$ a.s. Second, $\|\eta_n - \xi\| \le \|\eta_n\| + \|\xi\| \le 3 \cdot \|\xi\|$, and $3 \cdot \|\xi\|$ has finite expectation. Therefore, by Lebesgue dominated convergence theorem it holds $\mathsf{E}\,\|\eta_n - \xi\| \to 0$ as $n \to \infty$. According to definition 3.12, there exists Bochner integral $\mathsf{E}\,\xi$.

In theorem 3.11, we used the separability of $B$ to ensure the existence of an approximating sequence of simple random elements (in the sufficiency part of the proof). That is why in this book we consider Bochner integrals only of random elements that are distributed in a separable Banach space.

Problems 3.2

11) Let $\xi$ be a random element in a normed vector space $X$ such that $\mathsf{E}\,\|\xi\| < \infty$.
a) Prove that there exists $M_\xi \in X^{**}$ so that for each $x^* \in X^*$, $\mathsf{E}\langle \xi, x^* \rangle = \langle M_\xi, x^* \rangle$.
b) Under the additional assumption that $X$ is a reflexive Banach space (possibly non-separable), prove that there exists Pettis integral $\mathsf{E}\,\xi$.

12) Let $H$ be an infinite-dimensional separable Hilbert space and $A(\omega)$ be a random Hilbert–Schmidt operator in $H$, i.e. $A(\omega)$ is a random element in $S_2(H)$. Assume additionally that $\mathsf{E}\,\|A(\omega)\|_2 < \infty$. Prove that for each $x \in H$, there exists Bochner integral $\mathsf{E}[A(\omega)x]$ and $\mathsf{E}[A(\omega)x] = (\mathsf{E}\,A(\omega))x$, where the latter integral on the right-hand side is Bochner integral.

13) Let $H$ be a space from problem (12). Construct a random element $\xi$ in $H$ such that $\mathsf{E}\,\xi$ exists in weak sense, but does not exist in strong sense.

3.3. Borel measures in Hilbert space

In this section, we study general properties of probability measures on $\mathcal{B}(H)$, where $H$ is a real separable infinite-dimensional Hilbert space.

3.3.1. Weak and strong moments

DEFINITION 3.13.– Let $(X, \rho)$ be a metric space. Any measure on the Borel sigma-algebra $\mathcal{B}(X)$ is called Borel measure in $X$.

We will study mostly Borel measures in $H$.

DEFINITION 3.14.– Let $\mu$ be a Borel probability measure in $H$ and $\xi$ be a random element in $H$. Expressions
\[ \sigma_n(z_1, \dots, z_n) = \int_H (x, z_1)(x, z_2) \dots (x, z_n)\, d\mu(x) \]
and
\[ \sigma_n^{\xi}(z_1, \dots, z_n) = \mathsf{E}\,(\xi, z_1)(\xi, z_2) \dots (\xi, z_n), \quad z_1, \dots, z_n \in H, \]
are called weak (non-central) moments of order $n$ of $\mu$ and $\xi$, respectively. Expressions
\[ m_n = \int_H \|x\|^n\, d\mu(x) \quad \text{and} \quad m_n^{\xi} = \mathsf{E}\,\|\xi\|^n \]
are called strong (non-central) moments of order $n$ of $\mu$ and $\xi$, respectively.

Recall definition 2.9. For a Borel probability measure $\mu$, its mean value is $m_\mu = \int_H x\, d\mu(x)$, which is Pettis integral; and for a random element $\xi$ in $H$, its mean value $\mathsf{E}\,\xi$ can be understood as Pettis integral. In terms of the first weak moments, it holds
\[ (m_\mu, z) = \sigma_1(z), \quad (\mathsf{E}\,\xi, z) = \sigma_1^{\xi}(z), \quad z \in H. \]
If $m_1 = \int_H \|z\|\, d\mu(z) < \infty$, then $m_\mu$ exists and $m_\mu = \int_H x\, d\mu(x)$ is Bochner integral (see theorems 3.11 and 3.10). If $m_1^{\xi} = \mathsf{E}\,\|\xi\| < \infty$, then $\mathsf{E}\,\xi$ can be understood both as Bochner and Pettis integral.

Now, switch to definition 2.10. For a Borel probability measure $\mu$ in $H$ and a random element $\xi$ in $H$, the correlation operator is defined using the so-called central weak second moments $\int_H (x - m, h_1)(x - m, h_2)\, d\mu(x)$ and $\mathsf{E}\,(\xi - m, h_1)(\xi - m, h_2)$, where $m$ is the mean value of $\mu$ or $\xi$, respectively. It is convenient to introduce another operator based on non-central moments.

DEFINITION 3.15.– An operator $A \in L(H)$ is called covariance operator of a random element $\xi$ in $H$ if
\[ (Ah_1, h_2) = \mathsf{E}\,(\xi, h_1)(\xi, h_2), \quad \text{for all } h_1, h_2 \in H, \]
and $A$ is called covariance operator of a Borel probability measure $\mu$ in $H$ if
\[ (Ah_1, h_2) = \int_H (x, h_1)(x, h_2)\, d\mu(x), \quad \text{for all } h_1, h_2 \in H. \]
There is a simple relation between the correlation operator and the covariance one.

LEMMA 3.9.– Let $\xi$ be a random element in $H$ with mean value $m$. The correlation operator $S$ of $\xi$ exists if, and only if, the covariance operator $A$ of $\xi$ exists; they are related as follows:
\[ (Sh_1, h_2) = (Ah_1, h_2) - (m, h_1)(m, h_2), \quad h_1, h_2 \in H, \]
\[ S = A - \|m\|^2 \cdot P_{[m]}, \]
where $[m] = \mathrm{span}(m)$ and $P_{[m]}$ is the projector on $[m]$.

PROOF.– Here we prove the relation only:
\[ (Sh_1, h_2) = \mathsf{E}\,(\xi, h_1)(\xi, h_2) - \mathsf{E}\,(\xi, h_1)(m, h_2) - \mathsf{E}\,(m, h_1)(\xi, h_2) + (m, h_1)(m, h_2) \]
\[ = (Ah_1, h_2) - 2(m, h_1)(m, h_2) + (m, h_1)(m, h_2) = (Ah_1, h_2) - (m, h_1)(m, h_2), \]
\[ (Sh_1, h_2) = (Ah_1, h_2) - \|m\|^2 \cdot (P_{[m]}h_1, h_2) = ((A - \|m\|^2 \cdot P_{[m]})h_1, h_2). \]
This implies the desired relation between the two operators.
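The relation of lemma 3.9 can be checked in a finite-dimensional model. The following is only a sketch under stated assumptions: $\mathbb{R}^3$ stands in for $H$, the three-point distribution of $\xi$ is hypothetical, and operators are represented by their matrices in the standard basis (so that $\|m\|^2 P_{[m]}$ has matrix $m m^\top$).

```python
from fractions import Fraction as F

# Hypothetical three-point distribution of xi in R^3 (a stand-in for H).
points = [(F(1), F(0), F(0)), (F(0), F(2), F(0)), (F(1), F(2), F(2))]
probs = [F(1, 2), F(1, 4), F(1, 4)]
d = 3

def second_moments(center):
    """Matrix of E (xi - c, e_i)(xi - c, e_j) in the standard basis e_1, ..., e_d."""
    return [[sum(p * (x[i] - center[i]) * (x[j] - center[j])
                 for x, p in zip(points, probs))
             for j in range(d)] for i in range(d)]

m = [sum(p * x[i] for x, p in zip(points, probs)) for i in range(d)]  # mean value
A = second_moments([F(0)] * d)   # covariance operator (non-central moments)
S = second_moments(m)            # correlation operator (central moments)

# ||m||^2 P_[m] has matrix m m^T, so lemma 3.9 reads S = A - m m^T entrywise.
for i in range(d):
    for j in range(d):
        assert S[i][j] == A[i][j] - m[i] * m[j]

# Taking traces gives tr S = E ||xi||^2 - ||m||^2.
trace_S = sum(S[i][i] for i in range(d))
second_strong_moment = sum(p * sum(t * t for t in x) for x, p in zip(points, probs))
assert trace_S == second_strong_moment - sum(t * t for t in m)
```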
We notice that a similar statement holds true for a Borel probability measure in $H$. In terms of weak moments $\sigma_n$ (see definition 3.14), it holds
\[ (Ah_1, h_2) = \sigma_2(h_1, h_2), \quad (Sh_1, h_2) = \sigma_2(h_1, h_2) - \sigma_1(h_1)\sigma_1(h_2). \]
Directly from the definition one can see that both correlation and covariance operators are positive (i.e. their quadratic forms are non-negative) and self-adjoint.

LEMMA 3.10.– Let $\mu$ be a Borel probability measure in $H$. Its covariance operator $A_\mu$ is an S-operator if, and only if, $\int_H \|x\|^2\, d\mu(x) < \infty$, and then
\[ \mathrm{tr}\, A_\mu = \int_H \|x\|^2\, d\mu(x). \]

PROOF.–
a) Necessity
Assume that the covariance operator $A_\mu$ exists and is nuclear. Take an arbitrary orthobasis $\{e_n, n \ge 1\}$ in $H$. We have
\[ \mathrm{tr}\, A_\mu = \sum_{n=1}^{\infty} (A_\mu e_n, e_n) = \sum_{n=1}^{\infty} \int_H (x, e_n)^2\, d\mu(x) = \int_H \sum_{n=1}^{\infty} (x, e_n)^2\, d\mu(x) = \int_H \|x\|^2\, d\mu(x) < \infty. \]

b) Sufficiency
Assume that $\int_H \|x\|^2\, d\mu(x) < \infty$. This implies that the weak second moments of $\mu$ are finite. The bilinear form
\[ \sigma_2(h_1, h_2) = \int_H (x, h_1)(x, h_2)\, d\mu(x), \quad h_1, h_2 \in H, \]
is bounded because
\[ |\sigma_2(h_1, h_2)| \le \int_H \|x\|^2\, d\mu \cdot \|h_1\| \cdot \|h_2\|. \]
Therefore, there exists the covariance operator $A_\mu \in L(H)$ that represents the bounded bilinear form $\sigma_2(h_1, h_2)$. It is known that $A_\mu$ is positive self-adjoint, and moreover for any orthobasis $\{e_n\}$ in $H$,
\[ \sum_{n=1}^{\infty} (A_\mu e_n, e_n) = \int_H \sum_{n=1}^{\infty} (x, e_n)^2\, d\mu(x) = \int_H \|x\|^2\, d\mu(x) < \infty. \]
Now, by theorem 3.7 $A_\mu$ is an S-operator.
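The trace identity of lemma 3.10 can be illustrated in a two-dimensional model, where the sum over an orthobasis is finite. This is a sketch only: the discrete measure, the two orthonormal bases and the helper names `dot` and `quadratic_form` are hypothetical choices. The point is that $\sum_n (A_\mu e_n, e_n)$ gives the same value $\int \|x\|^2\, d\mu$ for either basis.

```python
from fractions import Fraction as F

# Hypothetical discrete probability measure on R^2: atoms and their weights.
atoms = [(F(1), F(2)), (F(3), F(-1)), (F(0), F(0))]
probs = [F(1, 2), F(1, 4), F(1, 4)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def quadratic_form(h):
    """(A h, h) = integral of (x, h)^2 d mu(x) for the covariance operator A."""
    return sum(p * dot(x, h) ** 2 for x, p in zip(atoms, probs))

standard = [(F(1), F(0)), (F(0), F(1))]
rotated = [(F(3, 5), F(4, 5)), (F(-4, 5), F(3, 5))]   # another orthonormal basis

trace_standard = sum(quadratic_form(e) for e in standard)
trace_rotated = sum(quadratic_form(e) for e in rotated)
strong_second_moment = sum(p * dot(x, x) for x, p in zip(atoms, probs))

# The trace does not depend on the orthobasis and equals the strong second moment.
assert trace_standard == trace_rotated == strong_second_moment
```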
COROLLARY 3.2.– For a Borel probability measure $\mu$ in $H$, its correlation operator $S_\mu$ is an S-operator if, and only if, $\int_H \|x\|^2\, d\mu(x) < \infty$.

PROOF.–
a) Necessity
If $S_\mu$ is an S-operator, then by lemma 3.9 there exists the covariance operator $A_\mu = S_\mu + \|m\|^2 P_{[m]}$; here $P_{[m]}$ is nuclear as a finite-dimensional operator; hence $A_\mu$ is nuclear as a linear combination of two nuclear operators. Therefore, $A_\mu$ is an S-operator and by lemma 3.10 the strong second moment of $\mu$ is finite.

b) Sufficiency
Now, we assume that $\int_H \|x\|^2\, d\mu(x) < \infty$. Then $\int_H \|x\|\, d\mu(x)$ is finite as well and the mean value $m = \int_H x\, d\mu(x)$ exists as both Bochner and Pettis integrals. By theorem 3.7, the covariance operator $A_\mu$ is an S-operator, and by lemma 3.9 there exists the correlation operator $S_\mu = A_\mu - \|m\|^2 P_{[m]}$, and it is nuclear as a linear combination of two nuclear operators. Finally, $S_\mu$ is an S-operator, because it is always positive and self-adjoint.

3.3.2. Examples of Borel measures

DEFINITION 3.16.– Let $\mu$ be a Borel probability measure in $H$ and $\xi$ be a random element in $H$. Functions
\[ \varphi_\mu(z) = \int_H e^{i(z, x)}\, d\mu(x) \quad \text{and} \quad \varphi_\xi(z) = \mathsf{E}\, e^{i(z, \xi)}, \quad z \in H, \]
are called characteristic functionals of $\mu$ and $\xi$, respectively.

If $\mu$ coincides with the distribution $\mu_\xi$ of $\xi$, then $\varphi_\mu = \varphi_\xi$. Now, we calculate mean values, correlation and covariance operators, and characteristic functionals of some Borel measures in $H$.

EXAMPLE 3.6.– (Dirac measure) Fix $a \in H$. The probability measure
\[ \delta_a(B) = I_B(a), \quad B \in \mathcal{B}(H), \]
is called Dirac (or point) measure concentrated at $a$. The random element $\xi(\omega) = a$, $\omega \in \Omega$, has distribution $\delta_a$. The mean value is
\[ m = \mathsf{E}\,\xi = a \quad \text{(in the sense of Bochner integral)}. \]
We have $\xi(\omega) - m = 0$, $\omega \in \Omega$; hence the correlation operator $S$ is the zero operator. For the covariance operator $A$, it holds
\[ (Ax, y) = \mathsf{E}\,(\xi, x)(\xi, y) = (a, x)(a, y) = \big(\|a\|^2 \cdot P_{[a]}x, y\big). \]
Therefore, $A = \|a\|^2 \cdot P_{[a]}$. Finally, the characteristic functional is
\[ \varphi(x) = \mathsf{E}\, e^{i(x, \xi)} = e^{i(x, a)}, \quad x \in H. \]
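EXAMPLE 3.7.– (Measure induced in the direction) Let $\mu$ be a Borel probability measure on the real line and $e \in H$ be a unit vector. The mapping
\[ \rho(\alpha e) = \alpha, \quad \alpha \in \mathbb{R}, \]
identifies the one-dimensional subspace $[e]$ and $\mathbb{R}$. The measure $\tilde\mu = \tilde\mu_e$,
\[ \tilde\mu_e(E) = \mu(\rho(E \cap [e])), \quad E \in \mathcal{B}(H), \]
is called the measure induced by $\mu$ in direction $e$.

Let $\eta$ be a r.v. with distribution $\mu$. The random element $X(\omega) = \eta(\omega)e$, $\omega \in \Omega$, has distribution $\tilde\mu_e$. Indeed, for $E \in \mathcal{B}(H)$,
\[ \mu_X(E) = \mathsf{P}\{\eta(\omega)e \in E\} = \mathsf{P}\{\eta(\omega)e \in (E \cap [e])\} = \mathsf{P}\{\eta(\omega) \in \rho(E \cap [e])\} = \tilde\mu_e(E). \]
Suppose that $\mu$ has finite mean value $m_\mu$. Then
\[ \mathsf{E}\,\|X\| = \mathsf{E}\,\|\eta e\| = \mathsf{E}\,|\eta| = \int_{\mathbb{R}} |x|\, d\mu < \infty, \]
and the mean value $m_{\tilde\mu}$ exists in strong sense. It holds
\[ (m_{\tilde\mu}, z) = \mathsf{E}\,(X, z) = (\mathsf{E}\,\eta)(e, z), \quad m_{\tilde\mu} = (\mathsf{E}\,\eta)e = m_\mu e. \]
Now, assume that $\int_{\mathbb{R}} x^2\, d\mu(x) < \infty$ (this implies that $m_\mu$ is finite). For the covariance operator $A_{\tilde\mu}$, it holds
\[ (A_{\tilde\mu} z_1, z_2) = \mathsf{E}\,(\eta e, z_1)(\eta e, z_2) = \mathsf{E}\,\eta^2 \cdot (e, z_1)(e, z_2). \]
Hence $A_{\tilde\mu} = \int_{\mathbb{R}} x^2\, d\mu \cdot P_{[e]}$. For the correlation operator $S_{\tilde\mu}$, we have
\[ (S_{\tilde\mu} z_1, z_2) = \mathsf{E}\,((\eta - m_\mu)e, z_1)((\eta - m_\mu)e, z_2) = \mathsf{D}\,\eta \cdot (e, z_1)(e, z_2), \]
\[ S_{\tilde\mu} = \mathsf{D}\,\eta \cdot P_{[e]} = \int_{\mathbb{R}} (x - m_\mu)^2\, d\mu \cdot P_{[e]}. \]
The characteristic functional is as follows:
\[ \varphi_{\tilde\mu}(z) = \mathsf{E}\, e^{i(z, \eta e)} = \mathsf{E}\, e^{i(z, e)\eta} = \varphi_\eta(t)\big|_{t = (z, e)}, \qquad \varphi_{\tilde\mu}(z) = \varphi_\mu(t)\big|_{t = (z, e)}. \]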
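Here, $\varphi_\eta$ and $\varphi_\mu$ are characteristic functions of $\eta$ and $\mu$, respectively.

EXAMPLE 3.8.– (Measure in $H$ induced by a measure in $\mathbb{R}^n$) We generalize the previous example. Let $\mu$ be a Borel probability measure on $\mathbb{R}^n$ and $L_n$ be an $n$-dimensional subspace of $H$, with orthobasis $\{e_i, 1 \le i \le n\}$. The mapping
\[ \rho\Big(\sum_{i=1}^{n} \alpha_i e_i\Big) = (\alpha_1, \dots, \alpha_n)^\top, \quad \alpha_1, \dots, \alpha_n \in \mathbb{R}, \]
identifies $L_n$ with $\mathbb{R}^n$. The measure
\[ \tilde\mu(E) = \mu(\rho(E \cap L_n)), \quad E \in \mathcal{B}(H), \]
is called the measure (in $H$) induced by $\mu$. This measure $\tilde\mu$ is concentrated on $L_n$, i.e. $\tilde\mu(H \setminus L_n) = 0$.

Let $\eta = (\eta_i)_1^n$ be a random vector with distribution $\mu$. The random element $X(\omega) = \sum_{i=1}^{n} \eta_i(\omega)e_i$ has distribution $\tilde\mu$.

Suppose that $\mu$ has finite second moments. For the mean value $m_{\tilde\mu}$, it holds
\[ (m_{\tilde\mu}, z) = \mathsf{E}\,\Big(\sum_{i=1}^{n} \eta_i e_i, z\Big) = \sum_{i=1}^{n} \mathsf{E}\,\eta_i \cdot (e_i, z) = (\rho^{-1}\mathsf{E}\,\eta, z), \quad z \in H, \]
\[ m_{\tilde\mu} = \rho^{-1}\mathsf{E}\,\eta = \rho^{-1}m_\mu. \]
Here, $m_{\tilde\mu} = \int_H x\, d\tilde\mu(x)$ can be understood as Bochner integral because
\[ \mathsf{E}\,\|X\| = \mathsf{E}\,\|\eta\| = \int_{\mathbb{R}^n} \|x\|\, d\mu(x) < \infty. \]
For the covariance operator $A_{\tilde\mu}$, we have for $z_1, z_2 \in H$:
\[ (A_{\tilde\mu} z_1, z_2) = \mathsf{E}\,(X, z_1)(X, z_2) = \sum_{i,j=1}^{n} (\mathsf{E}\,\eta_i\eta_j)(z_1, e_j)(z_2, e_i), \]
\[ A_{\tilde\mu} z_1 = \sum_{i=1}^{n} \Big(\sum_{j=1}^{n} a_{ij}(z_1, e_j)\Big) e_i, \quad a_{ij} = \mathsf{E}\,\eta_i\eta_j = \int_{\mathbb{R}^n} x_i x_j\, d\mu(x). \]
In a similar way, the correlation operator $S_{\tilde\mu}$ is as follows:
\[ S_{\tilde\mu} z = \sum_{i=1}^{n} \Big(\sum_{j=1}^{n} s_{ij}(z, e_j)\Big) e_i, \quad s_{ij} = \int_{\mathbb{R}^n} (x_i - m_{\mu i})(x_j - m_{\mu j})\, d\mu(x), \]
where $m_\mu = (m_{\mu i})_{i=1}^{n}$. For the characteristic functional $\varphi_{\tilde\mu}$, we have
\[ \varphi_{\tilde\mu}(z) = \mathsf{E}\,\exp\Big\{i\Big(z, \sum_{k=1}^{n} \eta_k e_k\Big)\Big\} = \mathsf{E}\,\exp\Big\{i\sum_{k=1}^{n} \eta_k (z, e_k)\Big\}, \]
\[ \varphi_{\tilde\mu}(z) = \varphi_\mu(t), \quad t = ((z, e_1), \dots, (z, e_n))^\top. \]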
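The formulas of example 3.8 can be checked on a small model. The following is a sketch under stated assumptions: $\mathbb{R}^4$ stands in for $H$, $n = 2$, and the orthonormal vectors, the discrete measure $\mu$ on $\mathbb{R}^2$ and the helper names (`embed`, `cov_apply_formula`, `cov_apply_direct`) are all hypothetical. It compares $A_{\tilde\mu} z$ computed from the matrix $(a_{ij})$ with the direct expectation $\mathsf{E}\,(X, z)X$, and computes $m_{\tilde\mu} = \rho^{-1} m_\mu$.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def embed(coeffs, basis):
    """rho^{-1}: (alpha_1, ..., alpha_n) -> sum_i alpha_i e_i."""
    dim = len(basis[0])
    return [sum(c * e[k] for c, e in zip(coeffs, basis)) for k in range(dim)]

# Hypothetical orthonormal pair e_1, e_2 in R^4 spanning L_2.
basis = [[F(1, 2)] * 4, [F(1, 2), F(-1, 2), F(1, 2), F(-1, 2)]]
# Hypothetical discrete measure mu on R^2: atoms and their weights.
atoms = [(F(1), F(0)), (F(0), F(2)), (F(2), F(2))]
probs = [F(1, 2), F(1, 4), F(1, 4)]
n = len(basis)

# a_ij = E eta_i eta_j and the mean m_mu of mu.
a = [[sum(p * x[i] * x[j] for x, p in zip(atoms, probs)) for j in range(n)]
     for i in range(n)]
m_mu = [sum(p * x[i] for x, p in zip(atoms, probs)) for i in range(n)]
mean_H = embed(m_mu, basis)          # mean of the induced measure, rho^{-1} m_mu

def cov_apply_formula(z):
    """A z = sum_i ( sum_j a_ij (z, e_j) ) e_i."""
    coeffs = [sum(a[i][j] * dot(z, basis[j]) for j in range(n)) for i in range(n)]
    return embed(coeffs, basis)

def cov_apply_direct(z):
    """A z = E (X, z) X with X = sum_i eta_i e_i."""
    out = [F(0)] * len(z)
    for x, p in zip(atoms, probs):
        X = embed(x, basis)
        w = p * dot(X, z)
        out = [o + w * t for o, t in zip(out, X)]
    return out

z = [F(1), F(2), F(-1), F(3)]
assert cov_apply_formula(z) == cov_apply_direct(z)
```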
EXAMPLE 3.9.– (Measure with compact but not nuclear covariance operator) We use a construction from the solution to problem (13) of this chapter. Let $\{p_n, n \ge 1\}$ be a complete collection of positive probabilities, $\{e_n\}$ be an orthobasis in $H$ and $\xi(\omega) = \alpha_n e_n$ with probability $p_n$, $n \ge 1$. We will choose $\alpha_n > 0$ and $p_n$ later, such that the second weak moments of $\xi$ are finite but $\mathsf{E}\,\|\xi\|^2 = \infty$. The distribution $\mu = \mu_\xi$ of $\xi$ is as follows:
\[ \mu(E) = \sum_{n=1}^{\infty} p_n \delta_{\alpha_n e_n}(E), \quad E \in \mathcal{B}(H), \]
where $\delta_{\alpha_n e_n}$ is Dirac measure at the point $\alpha_n e_n$. We demand that
\[ \lim_{n\to\infty} p_n \alpha_n^2 = 0, \qquad \sum_{n=1}^{\infty} p_n \alpha_n^2 = \infty. \]
For instance, $p_n = \dfrac{6}{\pi^2 n^2}$, $\alpha_n = \sqrt{n}$, $n \ge 1$. Then for $z_1, z_2 \in H$,
\[ \sigma_2(z_1, z_2) = \int_H (z_1, x)(z_2, x)\, d\mu(x) = \sum_{n=1}^{\infty} p_n \alpha_n^2 (z_1, e_n)(z_2, e_n), \]
and the series converges because the sequence $\{p_n \alpha_n^2\}$ is bounded. The covariance operator $A_\mu$ can be found from the relation $(A_\mu z_1, z_2) = \sigma_2(z_1, z_2)$,
\[ A_\mu = \sum_{n=1}^{\infty} p_n \alpha_n^2 P_{[e_n]}, \]
where the series converges in the sense of uniform operator convergence, because $p_n \alpha_n^2 \to 0$ as $n \to \infty$. This operator is compact. Next, $\mathsf{E}\,\|\xi\|^2 = \sum_{n=1}^{\infty} p_n \alpha_n^2 = \infty$; therefore, $A_\mu$ is not nuclear (see lemma 3.10).

According to the solution of problem (13), the (weak) mean value is
\[ m = m_\mu = \sum_{n=1}^{\infty} p_n \alpha_n e_n, \]
where the series converges in the norm of $H$. For the concrete $p_n$ and $\alpha_n$ chosen above,
\[ \mathsf{E}\,\|\xi\| = \sum_{n=1}^{\infty} p_n \alpha_n < \infty, \]
and $m_\mu = \int_H x\, d\mu$ can be understood as Bochner integral.

The correlation operator $S_\mu$ is as follows:
\[ S_\mu = A_\mu - \|m\|^2 P_{[m]} = \sum_{n=1}^{\infty} p_n \alpha_n^2 P_{[e_n]} - \Big(\sum_{n=1}^{\infty} p_n^2 \alpha_n^2\Big) P_{[m]}. \]
The latter operator is compact but not nuclear, like the covariance operator $A_\mu$. Finally, the characteristic functional equals
\[ \varphi_\mu(z) = \mathsf{E}\, e^{i(z, \xi)} = \sum_{n=1}^{\infty} p_n e^{i\alpha_n (z, e_n)}, \quad z \in H. \]
From this example, we see that the correlation and covariance operators of a probability measure in $H$ can be compact, but not nuclear.
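The behaviour of the three series in example 3.9 can be observed numerically for the concrete choice $p_n = 6/(\pi^2 n^2)$, $\alpha_n = \sqrt{n}$. This is only an illustrative sketch (the function names and the truncation levels are hypothetical, and partial sums prove nothing about convergence): the total probability approaches 1, the partial sums of $p_n\alpha_n$ stabilize, and the partial sums of $p_n\alpha_n^2 = 6/(\pi^2 n)$ keep growing like a harmonic series.

```python
import math

def p(n):
    """Probabilities p_n = 6 / (pi^2 n^2); they sum to 1 over n >= 1."""
    return 6.0 / (math.pi ** 2 * n ** 2)

def alpha(n):
    """Scaling alpha_n = sqrt(n)."""
    return math.sqrt(n)

def partial_sums(N):
    """Partial sums up to N of p_n, p_n * alpha_n and p_n * alpha_n^2."""
    total_prob = sum(p(n) for n in range(1, N + 1))
    first_moment = sum(p(n) * alpha(n) for n in range(1, N + 1))
    second_moment = sum(p(n) * alpha(n) ** 2 for n in range(1, N + 1))
    return total_prob, first_moment, second_moment

for N in (10 ** 2, 10 ** 3, 10 ** 4):
    total_prob, first_moment, second_moment = partial_sums(N)
    print(N, round(total_prob, 5), round(first_moment, 4), round(second_moment, 4))
```

The eigenvalues $p_n\alpha_n^2$ of $A_\mu$ tend to zero, which is consistent with compactness, while their sum diverges, which is consistent with $A_\mu$ not being nuclear.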
3.3.3. Boundedness of moment form

We will prove that the moment form $\sigma_n(z_1, \dots, z_n)$ introduced in definition 3.14 is bounded, provided the corresponding weak moments of order $n$ exist. We will deal with this problem even in a more general setting, for Borel probability measures on a normed vector space.

LEMMA 3.11.– Every polylinear form $\tau_n(z_1, \dots, z_n)$, which is defined on a Banach space $B$ and continuous in each variable for fixed values of all other variables, is bounded, i.e. there exists $C \ge 0$ such that for all $z_1, \dots, z_n \in B$,
\[ |\tau_n(z_1, \dots, z_n)| \le C\, \|z_1\| \cdot \|z_2\| \dots \|z_n\|. \]

PROOF.– We prove by induction over $n$, the order of a polylinear form. For $n = 1$, it is the well-known theorem about the equivalence of continuity and boundedness of a linear functional.

Assume that the statement is true for all polylinear forms of order $n - 1$, where $n \ge 2$ is fixed. Consider a polylinear form $\tau_n(z_1, \dots, z_n)$ on $B$, which is continuous in each variable for fixed values of all other variables. For fixed $z_1$, the polylinear form $\tau_n(z_1, \cdot, \dots, \cdot)$ of order $n - 1$ is bounded by the inductive hypothesis. Hence, there exists $C_{z_1} \ge 0$ such that for all $z_i$ with $\|z_i\| \le 1$, $i = 2, \dots, n$, it holds
\[ |\tau_n(z_1, z_2, \dots, z_n)| \le C_{z_1}. \]
Introduce a family of continuous linear functionals on the Banach space $B$:
\[ f_{z_2 \dots z_n}(z_1) = \tau_n(z_1, \dots, z_n), \quad z_1 \in B, \]
where $\|z_i\| \le 1$, $i = 2, \dots, n$. By Banach–Steinhaus theorem, there exists $C \ge 0$ such that for all $z_i$ with $\|z_i\| \le 1$, $i = 1, \dots, n$, it holds
\[ |f_{z_2 \dots z_n}(z_1)| = |\tau_n(z_1, \dots, z_n)| \le C. \]
Thus, $\tau_n$ is bounded. We proved the statement for arbitrary order of a polylinear form.

THEOREM 3.12.– (About boundedness of moment form) Let $\mu$ be a Borel probability measure on a normed vector space $X$. Fix $n \ge 1$. Assume that for each $z_1^*, \dots, z_n^* \in X^*$, there exists the finite integral
\[ \int_X \langle x, z_1^* \rangle \cdot \langle x, z_2^* \rangle \dots \langle x, z_n^* \rangle\, d\mu(x) =: \sigma_n(z_1^*, \dots, z_n^*). \qquad [3.15] \]
Then $\sigma_n$ is a bounded polylinear form.
PROOF.– It is clear that $\sigma_n$ is a symmetric polylinear form on the Banach space $X^*$. In view of lemma 3.11, it is enough to prove that $\sigma_n$ is continuous in $z_1^*$ for fixed other variables $z_{2,0}^*, \dots, z_{n,0}^*$.

Denote $p(x) = \langle x, z_{2,0}^* \rangle \dots \langle x, z_{n,0}^* \rangle$ (if $n = 1$, then $p(x) = 1$ by convention),
\[ f(z_1^*) := \sigma_n(z_1^*, z_{2,0}^*, \dots, z_{n,0}^*) = \int_X \langle x, z_1^* \rangle\, p(x)\, d\mu(x), \quad z_1^* \in X^*. \]
Here $f$ is a linear functional, and we have to show that it is bounded. Introduce linear functionals
\[ f_N(z_1^*) = \int_{\bar{B}(0,N)} \langle x, z_1^* \rangle\, p(x)\, d\mu(x), \quad z_1^* \in X^*, \quad N \ge 1. \]
Each of them is bounded because
\[ |f_N(z_1^*)| \le \big(N^n \cdot \|z_{2,0}^*\| \dots \|z_{n,0}^*\|\big) \cdot \|z_1^*\|. \]
Moreover,
\[ |f_N(z_1^*)| \le \int_X |\langle x, z_1^* \rangle \cdot \langle x, z_{2,0}^* \rangle \dots \langle x, z_{n,0}^* \rangle|\, d\mu(x) =: C_{z_1^*}, \]
and $C_{z_1^*} < \infty$ because the Lebesgue integral $\sigma_n(z_1^*, z_{2,0}^*, \dots, z_{n,0}^*)$ is finite. Now, by Banach–Steinhaus theorem there exists $C \ge 0$ such that for all $N \ge 1$ and all $z_1^*$ with $\|z_1^*\| \le 1$, it holds $|f_N(z_1^*)| \le C$. By the theorem about integration over increasing sets, we get $f_N(z_1^*) \to f(z_1^*)$ as $N \to \infty$; hence
\[ |f(z_1^*)| \le C, \quad \text{for all } z_1^* \in X^*, \ \|z_1^*\| \le 1. \]
The functional $f$ is bounded, and a reference to lemma 3.11 completes the proof.

Recall that the mean value $m_\mu \in X$ is Pettis integral $m_\mu = \int_X x\, d\mu(x)$ (see section 3.2).

COROLLARY 3.3.– (About existence of mean value) Let $\mu$ be a Borel probability measure on a reflexive Banach space $B$. Assume that all first weak moments
\[ \sigma_1(z^*) = \int_B \langle x, z^* \rangle\, d\mu(x), \quad z^* \in B^*, \]
are finite. Then there exists the mean value $m_\mu$.

PROOF.– By theorem 3.12, $\sigma_1$ is a continuous linear functional on $B^*$. Since $B$ is reflexive, there exists $m \in B$ that generates this functional:
\[ \sigma_1(z^*) = \langle m, z^* \rangle, \quad \text{for all } z^* \in B^*. \]
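Hence, there exists $m_\mu = m$.

REMARK 3.8.– (About existence of mean value in separable Banach space) Let $\mu$ be a Borel probability measure on a separable Banach space $B$. Assume that for some fixed real $p > 1$,
\[ \int_B |\langle x, z^* \rangle|^p\, d\mu(x) < \infty, \quad \text{for all } z^* \in B^*. \]
Then there exists the mean value $m_\mu$.

PROOF.– Let $\xi$ be a random element in $B$, with distribution $\mu$. Then
\[ \mathsf{E}\,|\langle \xi, z^* \rangle|^p < \infty, \quad z^* \in B^*. \]

a) Introduce a linear operator
\[ T_\xi : B^* \to L_p(\Omega) = L_p(\Omega, \mathsf{P}), \quad T_\xi x^* = \langle \xi, x^* \rangle. \]
We prove that $T_\xi$ is bounded. Consider the graph $\Gamma_\xi \subset B^* \times L_p(\Omega)$ of $T_\xi$,
\[ \Gamma_\xi = \{(x^*; \langle \xi, x^* \rangle) \mid x^* \in B^*\}. \]
We check that $\Gamma_\xi$ is a closed set. Let $x_n^* \to x^*$ strongly in $B^*$ and $\langle \xi, x_n^* \rangle \to \eta = \eta(\omega)$ strongly in $L_p(\Omega)$. Then $\langle \xi, x_n^* \rangle \to \langle \xi, x^* \rangle$ as $n \to \infty$ a.s. Hence $\eta = \langle \xi, x^* \rangle$ a.s. and $(x^*; \eta) \in \Gamma_\xi$. Thus, $\Gamma_\xi$ is closed, and by the closed graph theorem (see [BER 12]), the operator $T_\xi$ is bounded.

b) Construction of mean value. For random events $A_n := \{\|\xi\| \le n\}$, we set $\eta_n = \xi I_{A_n}$, $n \ge 1$. Then $\mathsf{E}\,\|\eta_n\| < \infty$, and Pettis integral $\mathsf{E}\,\eta_n \in B$ exists. For $x^* \in B^*$, it holds (here $q \in (1, \infty)$ is the conjugate index):
\[ |\langle \mathsf{E}\,\eta_m - \mathsf{E}\,\eta_n, x^* \rangle| = |\mathsf{E}\,\langle \xi, x^* \rangle (I_{A_m} - I_{A_n})| \le \big(\mathsf{E}\,|\langle \xi, x^* \rangle|^p\big)^{1/p} \big(\mathsf{E}\,|I_{A_m} - I_{A_n}|^q\big)^{1/q} \le \|T_\xi\| \cdot \|x^*\| \cdot \|I_{A_m} - I_{A_n}\|_{L_q(\Omega)}. \]
Therefore,
\[ \|\mathsf{E}\,\eta_m - \mathsf{E}\,\eta_n\| \le \|T_\xi\| \cdot \|I_{A_m} - I_{A_n}\|_{L_q(\Omega)} \to 0 \quad \text{as } m, n \to \infty, \]
since $I_{A_n} \to 1$ in $L_q(\Omega)$ as $n \to \infty$ (the latter convergence holds because $\mathsf{P}(A_n^c)$ converges to zero as $n \to \infty$). Hence $\{\mathsf{E}\,\eta_n, n \ge 1\}$ is a Cauchy sequence in the Banach space $B$, and $\mathsf{E}\,\eta_n \to m \in B$ strongly.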
c) Finally, we prove that $m = \mathsf{E}\,\xi$ in weak sense. We have for $x^* \in B^*$:
\[ \mathsf{E}\langle \eta_n, x^* \rangle = \mathsf{E}\,\langle \xi, x^* \rangle I_{A_n} \to \langle m, x^* \rangle. \]
On the other hand, $\mathsf{E}\,\langle \xi, x^* \rangle I_{A_n} \to \mathsf{E}\langle \xi, x^* \rangle$, since
\[ \int_{A_n^c} |\langle \xi, x^* \rangle|\, d\mathsf{P} \to 0 \quad \text{as } n \to \infty. \]
Thus, $\mathsf{E}\langle \xi, x^* \rangle = \langle m, x^* \rangle$, $x^* \in B^*$.

Now, we introduce covariance and correlation operators for a measure in a normed vector space.

DEFINITION 3.17.– A linear bounded operator $A_\mu : X^* \to X^{**}$ is called covariance operator of a Borel probability measure $\mu$ in a normed vector space $X$ if
\[ \langle A_\mu z_1^*, z_2^* \rangle = \sigma_2(z_1^*, z_2^*), \quad z_1^*, z_2^* \in X^*, \qquad [3.16] \]
and a linear bounded operator $S_\mu : X^* \to X^{**}$ is called correlation operator of $\mu$ if
\[ \langle S_\mu z_1^*, z_2^* \rangle = \sigma_2(z_1^*, z_2^*) - \sigma_1(z_1^*)\sigma_1(z_2^*), \quad z_1^*, z_2^* \in X^*, \qquad [3.17] \]
where the moment forms $\sigma_n$ are given in [3.15].

In the case when $m_\mu$ exists, definition 3.17 is consistent with the definition of covariance and correlation operators in $H$ (remember that $H^*$ and $H^{**}$ are isometric to $H$).

COROLLARY 3.4.– (About existence of covariance and correlation operators) Let $\mu$ be a Borel probability measure on a normed vector space $X$. Assume that all second weak moments $\sigma_2(z_1^*, z_2^*)$ are finite. Then:
a) there exist the covariance and correlation operators $A_\mu, S_\mu : X^* \to X^{**}$,
b) if additionally $X$ is a reflexive Banach space, there exist the covariance and correlation operators $A_\mu, S_\mu : X^* \to X$.

PROOF.– Existence of second moments $\sigma_2$ implies existence of first moments $\sigma_1$. By theorem 3.12, both moment forms are bounded. Therefore, the bilinear forms on the right-hand side of [3.16] and [3.17] are bounded as well.

a) Hence there exist linear bounded operators $A_\mu, S_\mu : X^* \to X^{**}$, which represent the latter forms.
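b) Now, assume that $X$ is a reflexive Banach space. Then the canonical embedding $i : X \to X^{**}$ is a surjective isometry, so $i^{-1}$ is well defined on $X^{**}$. Let $A_\mu$ and $S_\mu$ be the operators constructed in part (a) of the proof. Introduce operators

$$\tilde A_\mu = i^{-1} A_\mu, \quad \tilde S_\mu = i^{-1} S_\mu.$$

It holds

$$\langle \tilde A_\mu z_1^*, z_2^*\rangle = \langle A_\mu z_1^*, z_2^*\rangle = \sigma_2(z_1^*, z_2^*),$$

and similarly for $\tilde S_\mu$. Thus, the new operators are the desired ones.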
REMARK 3.9.– In Chapter 2 of the monograph [VAK 87], the following statement is proven: For a Borel probability measure on a separable Banach space $B$ with finite second moments $\sigma_2(z_1^*, z_2^*)$, there exist the covariance and correlation operators $A_\mu, S_\mu : B^* \to B$.

Notice that under the conditions of remark 3.9, the mean value $m_\mu$ exists as well (see remark 3.8). Of course, corollaries 3.3 and 3.4 are applicable to a Hilbert space $H_0$, for which $H_0^*$ and $H_0$ are isometric.

COROLLARY 3.5.– (About existence of $m_\mu$, $A_\mu$, $S_\mu$ in $H_0$) If a Borel probability measure on a real (possibly non-separable) Hilbert space $H_0$ has finite second moments, then there exist the mean value $m_\mu$ and the covariance and correlation operators $A_\mu, S_\mu \in L(H_0)$.

Problems 3.3

14) For a Borel probability measure $\mu$ in $H$, with finite strong second moment, prove that

$$\operatorname{tr} S = \int_H \|x\|^2 \, d\mu(x) - \|m\|^2,$$

where $m$ and $S$ are the mean value and correlation operator of $\mu$, respectively.

15) Construct two distinct Borel probability measures in $H$, with equal means and equal correlation operators.

16) Let $\xi$ be a random element in a Hilbert space $H_0$, with covariance and correlation operators $A$ and $S$, respectively, and $T \in L(H_0)$. Prove that the covariance and correlation operators of $T\xi$ are $TAT^*$ and $TST^*$, respectively.

17) Let $\xi$ be a random element in a normed vector space $X$, with $\mathsf{E}\,\|\xi\|^2 < \infty$. Prove that $\|A\| \le \mathsf{E}\,\|\xi\|^2$, where $A$ is the covariance operator of $\xi$.
4 Construction of Measure by its Characteristic Functional

In this chapter, we deal mainly with Borel probability measures on a real separable infinite-dimensional Hilbert space $H$. We study properties of characteristic functionals of such measures and show how to construct a measure by its characteristic functional. Finally, we give a criterion for a function $\theta : H \to \mathbb{C}$ to be a characteristic functional.

4.1. Cylindrical sigma-algebra in normed space

In section 2.1.2, we introduced the so-called cylindrical sigma-algebra in the metric space $(\mathbb{R}^\infty, \rho)$. Now, we do a similar thing in a real normed vector space $X$.

DEFINITION 4.1.– Let $n \ge 1$, $\{x_1^*, \dots, x_n^*\} \subset X^*$ and $A_n \in \mathcal{B}(\mathbb{R}^n)$. The set

$$\hat A_n(x_1^*, \dots, x_n^*) = \{x \in X : (\langle x, x_1^*\rangle, \dots, \langle x, x_n^*\rangle) \in A_n\}$$

is called cylinder with base $A_n$ constructed by functionals $x_1^*, \dots, x_n^*$.

Consider the class of all cylinders

$$\mathrm{Cyl} = \mathrm{Cyl}(X) = \{\hat A_n(x_1^*, \dots, x_n^*) : n \ge 1, \ \{x_1^*, \dots, x_n^*\} \subset X^*, \ A_n \in \mathcal{B}(\mathbb{R}^n)\}.$$

As in lemma 2.3, it is an algebra of sets in $X$, but it is not a sigma-algebra whenever $\dim X = \infty$. The generated sigma-algebra $\sigma a(\mathrm{Cyl})$ is called cylindrical sigma-algebra. We will compare it to the Borel sigma-algebra.

LEMMA 4.1.– Let $\{x_n, n \ge 1\}$ be a dense set in $X$ and $\{x_n^*, n \ge 1\} \subset X^*$ be such that

$$\langle x_n, x_n^*\rangle = \|x_n\|, \quad \|x_n^*\| = 1, \quad n \ge 1$$

(such functionals exist by a corollary of the Hahn–Banach theorem). Then

$$\|x\| = \sup_{n \ge 1} |\langle x, x_n^*\rangle|, \quad x \in X.$$

PROOF.– a) We have $|\langle x, x_n^*\rangle| \le \|x\| \cdot \|x_n^*\| = \|x\|$. Hence

$$\sup_{n \ge 1} |\langle x, x_n^*\rangle| \le \|x\|.$$

b) Fix $x \in X$. There exists a subsequence $\{x_{n_k}\}$ that strongly converges to $x$. It holds

$$\langle x, x_{n_k}^*\rangle = \langle x_{n_k}, x_{n_k}^*\rangle + \langle x - x_{n_k}, x_{n_k}^*\rangle, \quad |\langle x, x_{n_k}^*\rangle| \ge \|x_{n_k}\| - \|x - x_{n_k}\|,$$

$$\sup_{n \ge 1} |\langle x, x_n^*\rangle| \ge \lim_{k \to \infty} \left(\|x_{n_k}\| - \|x - x_{n_k}\|\right) = \|x\|.$$

The desired equality follows.
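The identity of lemma 4.1 can be illustrated numerically. The following is only a sketch under simplifying assumptions that are not part of the text: $X = \mathbb{R}^3$ with the Euclidean norm, a large random sample playing the role of the countable dense set, and the norming functionals taken as $x_n^* = x_n / \|x_n\|$ (which is the natural choice in a Hilbert space). The names `sup_of_functionals` and `dense_sample` are hypothetical.

```python
import numpy as np

def sup_of_functionals(x, directions):
    """Return max_n |<x, x_n*>| with x_n* = x_n / ||x_n|| (Euclidean case)."""
    unit = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    return np.max(np.abs(unit @ x))

rng = np.random.default_rng(0)
# A finite random sample standing in for a countable dense set {x_n} in R^3.
dense_sample = rng.normal(size=(200_000, 3))
x = np.array([1.0, -2.0, 0.5])

approx = sup_of_functionals(x, dense_sample)  # never exceeds the norm (part a)
exact = np.linalg.norm(x)                     # approached from below (part b)
print(approx, exact)
```

With a finite sample the supremum is only approached from below, which mirrors the two-sided argument in the proof.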
THEOREM 4.1.– (Mourier’s theorem about cylindrical sigma-algebra) In a separable normed space $X$, $\sigma a(\mathrm{Cyl}) = \mathcal{B}(X)$, i.e. the cylindrical and Borel sigma-algebras coincide.

PROOF.– The inclusion $\sigma a(\mathrm{Cyl}) \subset \mathcal{B}(X)$ is shown as in the proof of theorem 2.1 (here the separability is not used).

Next, $X$ is separable; hence there exists a countable set $\{x_n, n \ge 1\}$, which is dense in $X$. Let $\{x_n^*, n \ge 1\}$ be the corresponding functionals from lemma 4.1. Consider a ball $\bar B(x_0, r) \subset X$. By lemma 4.1,

$$\|x - x_0\| \le r \iff |\langle x - x_0, x_n^*\rangle| \le r \quad \text{for all } n \ge 1.$$

Therefore,

$$\bar B(x_0, r) = \bigcap_{n=1}^{\infty} \{x : |\langle x, x_n^*\rangle - \langle x_0, x_n^*\rangle| \le r\},$$

$\bar B(x_0, r) \in \sigma a(\mathrm{Cyl})$, and

$$\sigma a(\mathrm{Cyl}) \supset \sigma a\{\bar B(x_0, r) : x_0 \in X, \ r > 0\}.$$
But due to the separability, the latter sigma-algebra coincides with $\mathcal{B}(X)$. We checked inclusions in both directions, and the statement is proven.

Remember that the characteristic functional of a measure in $H$ was introduced in definition 3.15. One can extend this concept to measures in $X$.

DEFINITION 4.2.– Let $\mu$ be a Borel probability measure in a real normed space $X$ and $\xi$ be a random element in $X$. The functions

$$\varphi_\mu(x^*) = \int_X e^{i\langle x^*, x\rangle} \, d\mu(x) \quad \text{and} \quad \varphi_\xi(x^*) = \mathsf{E}\, e^{i\langle x^*, \xi\rangle}; \quad x^* \in X^*$$

are called characteristic functionals of $\mu$ and $\xi$, respectively.

COROLLARY 4.1.– (About renewal of measure by its characteristic functional) Let $\mu$ and $\nu$ be Borel probability measures in a separable normed space $X$ such that $\varphi_\mu(x^*) = \varphi_\nu(x^*)$, for all $x^* \in X^*$. Then $\mu = \nu$.

PROOF.– Let $\{x_1^*, \dots, x_n^*\} \subset X^*$, $\{a_1, \dots, a_n\} \subset \mathbb{R}$. It holds

$$\varphi_\mu\left(\sum_{k=1}^n a_k x_k^*\right) = \int_X \exp\left\{i \sum_{k=1}^n a_k \langle x_k^*, x\rangle\right\} d\mu(x) = \int_{\mathbb{R}^n} \exp\left\{i \sum_{k=1}^n a_k t_k\right\} d\mu_n(t_1, \dots, t_n) = \varphi_{\mu_n}(a_1, \dots, a_n),$$

where $\mu_n$ is a Borel probability measure in $\mathbb{R}^n$, which is induced by the mapping $X \ni x \mapsto (\langle x_k^*, x\rangle)_{k=1}^n \in \mathbb{R}^n$. In a similar way,

$$\varphi_\nu\left(\sum_{k=1}^n a_k x_k^*\right) = \varphi_{\nu_n}(a_1, \dots, a_n).$$

Therefore, $\varphi_{\mu_n} = \varphi_{\nu_n}$, and for Borel probability measures in $\mathbb{R}^n$, this implies that $\mu_n = \nu_n$. Now, take a set $\hat A_n = \hat A_n(x_1^*, \dots, x_n^*)$ from definition 4.1. It holds

$$\mu(\hat A_n) = \mu_n(A_n) = \nu_n(A_n) = \nu(\hat A_n).$$

Thus, $\mu$ and $\nu$ coincide at the cylindrical algebra $\mathrm{Cyl} = \mathrm{Cyl}(X)$. Then by Carathéodory’s extension theorem and theorem 4.1, $\mu$ and $\nu$ coincide at $\sigma a(\mathrm{Cyl}) = \mathcal{B}(X)$.
In a separable Hilbert space, the cylindrical sigma-algebra can be introduced in relation to a fixed orthobasis. Let $H$ be an infinite-dimensional separable Hilbert space. Fix an orthobasis $\{e_n, n \ge 1\}$. For $A_n \in \mathcal{B}(\mathbb{R}^n)$, consider a cylinder

$$\hat A_n := \{x \in H : ((x, e_k))_{k=1}^n \in A_n\}.$$

The class of all such cylinders $\mathrm{Cyl}_e := \{\hat A_n : n \ge 1, \ A_n \in \mathcal{B}(\mathbb{R}^n)\}$ is an algebra of sets in $H$.

LEMMA 4.2.– (About cylindrical sigma-algebra in $H$) It holds $\sigma a(\mathrm{Cyl}_e) = \mathcal{B}(H)$.

PROOF.– It follows the line of the proof of theorem 2.1. In particular, for $x \in H$ and $r > 0$,

$$\bar B(x, r) = \bigcap_{k=1}^{\infty} \left\{y \in H : \sum_{n=1}^{k} (y_n - x_n)^2 \le r^2\right\} \in \sigma a(\mathrm{Cyl}_e),$$

where $x_n = (x, e_n)$ and $y_n = (y, e_n)$ are the corresponding Fourier coefficients.

COROLLARY 4.2.– Let $\mu$ and $\nu$ be Borel probability measures in an infinite-dimensional separable Hilbert space $H$, $\{e_n, n \ge 1\}$ be an orthobasis in $H$ and $P_n$ be the orthoprojector on $L_n = \operatorname{span}(e_1, \dots, e_n)$. Assume that for all $n \ge 1$, $\mu P_n^{-1} = \nu P_n^{-1}$. Then $\mu = \nu$.

PROOF.– For any $n \ge 1$ and $A_n \in \mathcal{B}(\mathbb{R}^n)$, it holds

$$(\mu P_n^{-1})(A_n) = \mu(\hat A_n) = (\nu P_n^{-1})(A_n) = \nu(\hat A_n).$$

Thus, $\mu$ and $\nu$ coincide at the algebra $\mathrm{Cyl}_e$. Since $\sigma a(\mathrm{Cyl}_e) = \mathcal{B}(H)$ (see lemma 4.2), $\mu$ and $\nu$ coincide at $\mathcal{B}(H)$ according to Carathéodory’s extension theorem.

Problems 4.1

1) Consider a measurable space $(\mathbb{R}, 2^{\mathbb{R}})$, with the counting measure $\mu(A) = |A|$, $A \subset \mathbb{R}$. Prove that the Hilbert space $H_0 = L_2(\mathbb{R}, \mu)$ is non-separable, and that in $H_0$ the Borel sigma-algebra differs from the sigma-algebra generated by all cylindrical sets.

2) Let $\mu$ and $\nu$ be Borel probability measures in a real separable Hilbert space such that $\mu(\bar B(x, r)) = \nu(\bar B(x, r))$, for all $x$ and all $r > 0$. Prove that $\mu = \nu$.
3) Let $X^*$ be a separable space. For arbitrary $n \ge 1$, $\{x_1, \dots, x_n\} \subset X$ and $A_n \in \mathcal{B}(\mathbb{R}^n)$, denote

$$\hat A_n(x_1, \dots, x_n) = \{x^* \in X^* : (\langle x^*, x_1\rangle, \dots, \langle x^*, x_n\rangle) \in A_n\}.$$

Let $\mathrm{Cyl} = \mathrm{Cyl}(X^*, X)$ be the class of all such cylinders. Prove that $\sigma a(\mathrm{Cyl}) = \mathcal{B}(X^*)$.

4.2. Convolution of measures

DEFINITION 4.3.– Let $\xi$ and $\eta$ be random elements on the same probability space $(\Omega, \mathcal{F}, \mathsf{P})$, which are distributed in real normed spaces $X$ and $Y$, respectively. The two random elements are called independent if for each $A \in \mathcal{B}(X)$ and $B \in \mathcal{B}(Y)$,

$$\mathsf{P}\{\xi \in A \text{ and } \eta \in B\} = \mathsf{P}\{\xi \in A\} \cdot \mathsf{P}\{\eta \in B\}.$$

The product $Z := X \times Y$ is a real normed space as well, with the norm $\|(x; y)\|_Z = \|x\|_X + \|y\|_Y$. Assume additionally that $X$ and $Y$ are separable. Then $Z$ is separable as well, and for a random element $\xi$ in $X$ and a random element $\eta$ in $Y$, the couple $(\xi; \eta)$ is a random element in $Z$ (see part (a) of the proof of lemma 3.6).

The next statement is straightforward: $\xi$ and $\eta$ are independent if, and only if, $\mu_{(\xi;\eta)} = \mu_\xi \times \mu_\eta$. Here $\mu_\xi$, $\mu_\eta$, and $\mu_{(\xi;\eta)}$ are the distributions of $\xi$, $\eta$ and $(\xi; \eta)$, respectively. The characteristic functional $\varphi_{(\xi;\eta)}$ can be written as

$$\varphi_{(\xi;\eta)}(x^*, y^*) = \mathsf{E} \exp\{i(\langle x^*, \xi\rangle + \langle y^*, \eta\rangle)\}, \quad x^* \in X^*, \ y^* \in Y^*.$$

LEMMA 4.3.– (About independence in terms of characteristic functionals) Let $X$ and $Y$ be separable real normed spaces and $\xi$, $\eta$ be random elements defined on the same probability space and distributed in $X$ and $Y$, respectively. The two random elements are independent if, and only if,

$$\varphi_{(\xi;\eta)}(x^*, y^*) = \varphi_\xi(x^*)\varphi_\eta(y^*), \quad \text{for all} \quad x^* \in X^*, \ y^* \in Y^*.$$

PROOF.– It follows the line of the proof of lemma 1.4 and is not given here. It is crucial that in a separable space a measure is uniquely defined by its characteristic functional (see corollary 4.1).

Now, let $\xi$ and $\eta$ be random elements defined on the same probability space and distributed in a separable real normed space $X$. Then, $\xi + \eta$ is a random element in $X$, with decomposition $\xi + \eta = K(z)$, where $z = (\xi; \eta)$, $K(x, y) = x + y$, $x \in X$, $y \in X$ (see part (a) of the proof of lemma 3.6).
We find the distribution of $\xi + \eta$ for independent $\xi$ and $\eta$, so that $\mu_{(\xi;\eta)} = \mu_\xi \times \mu_\eta$. For $A \in \mathcal{B}(X)$,

$$\mu_{\xi+\eta}(A) = \mathsf{P}\{K(z) \in A\} = \mathsf{P}\{z \in K^{-1}A\} = (\mu_\xi \times \mu_\eta)(K^{-1}A) = \int_X \mu_\xi\left((K^{-1}A)_y\right) d\mu_\eta(y).$$

Here, $(K^{-1}A)_y$ is the so-called $y$-cross-section of the set $K^{-1}A \subset X \times X$. We have

$$K^{-1}A = \{(x, y) : x + y \in A\}, \quad (K^{-1}A)_y = \{x \in X : x + y \in A\} = A - y.$$

Thus,

$$\mu_{\xi+\eta}(A) = \int_X \mu_\xi(A - y) \, d\mu_\eta(y).$$

Using the $x$-cross-section instead of the $y$-cross-section, we obtain

$$\mu_{\xi+\eta}(A) = \int_X \mu_\eta(A - x) \, d\mu_\xi(x). \qquad [4.1]$$

DEFINITION 4.4.– Let $\mu$ and $\nu$ be Borel probability measures in a separable normed space $X$. The probability measure

$$(\mu * \nu)(A) = \int_X \mu(A - y) \, d\nu(y), \quad A \in \mathcal{B}(X)$$

is called convolution of $\mu$ and $\nu$.

It is clear that $\mu_{\xi+\eta} = \mu_\xi * \mu_\eta$, where $\xi$, $\eta$ are independent random elements in a separable normed space $X$.

REMARK 4.1.– For any Borel probability measures $\mu$ and $\nu$ in a separable normed space $X$, the convolution $\mu * \nu$ is well defined. Indeed, consider independent random elements $\xi$ and $\eta$ in $X$, with distributions $\mu_\xi = \mu$ and $\mu_\eta = \nu$. Such elements can be constructed as follows:

$$(\Omega, \mathcal{F}, \mathsf{P}) = (X \times X, \mathcal{B}(X \times X), \mu \times \nu), \quad \xi(\omega_1, \omega_2) = \omega_1, \quad \eta(\omega_1, \omega_2) = \omega_2.$$

Then $\mu * \nu = \mu_\xi * \mu_\eta = \mu_{\xi+\eta}$ is a probability measure on $\mathcal{B}(X)$. Moreover, relation [4.1] implies that $\mu * \nu = \nu * \mu$.

LEMMA 4.4.– (About characteristic functional of convolution) Let $\mu$ and $\nu$ be Borel probability measures in a separable normed space $X$. Then

$$\varphi_{\mu*\nu}(x^*) = \varphi_\mu(x^*)\varphi_\nu(x^*), \quad x^* \in X^*.$$
PROOF.– Introduce independent random elements $\xi$ and $\eta$ in $X$, with $\mu_\xi = \mu$ and $\mu_\eta = \nu$ (see remark 4.1). Then

$$\varphi_{\mu*\nu}(x^*) = \varphi_{\xi+\eta}(x^*) = \mathsf{E}\, e^{i\langle \xi+\eta, x^*\rangle} = \mathsf{E}\, e^{i\langle \xi, x^*\rangle} \cdot e^{i\langle \eta, x^*\rangle}.$$

Since $\xi$ and $\eta$ are independent, the random variables $\langle \xi, x^*\rangle$ and $\langle \eta, x^*\rangle$ are independent as well. Hence

$$\varphi_{\mu*\nu}(x^*) = \mathsf{E}\, e^{i\langle \xi, x^*\rangle} \cdot \mathsf{E}\, e^{i\langle \eta, x^*\rangle} = \varphi_\xi(x^*)\varphi_\eta(x^*) = \varphi_\mu(x^*)\varphi_\nu(x^*).$$

Problems 4.2

4) Consider a normed vector space $X$ (possibly nonseparable) with cylindrical sigma-algebra $\mathcal{C} = \mathcal{C}(X)$. Let $(\Omega, \mathcal{F}, \mathsf{P})$ be a probability space. A mapping $\xi : \Omega \to X$ is called weak random element if it is $(\mathcal{F}, \mathcal{C})$-measurable. The induced probability measure $\mu_\xi = \mathsf{P}\xi^{-1}$ is called distribution of $\xi$.

a) Let $\xi$ and $\eta$ be weak random elements in $X$ defined on the same probability space. Prove that $\xi + \eta$ is a weak random element as well.

b) $\xi$ and $\eta$ from item (a) are called independent if for all $A_1, A_2 \in \mathcal{C}$,

$$\mathsf{P}\{\xi \in A_1 \text{ and } \eta \in A_2\} = \mathsf{P}\{\xi \in A_1\} \cdot \mathsf{P}\{\eta \in A_2\}.$$

For such weak random elements, prove that

$$\mu_{\xi+\eta}(A) = \int_X \mu_\xi(A - y) \, d\mu_\eta(y), \quad A \in \mathcal{C}.$$

(Thus, convolution $\mu * \nu$ for probability measures on $\mathcal{C}$ can be defined similarly to definition 4.4.)

5) Let $\mu$, $\nu$ and $\tau$ be Borel probability measures in a separable normed space $X$. Prove that:

a) $\delta_0 * \mu = \mu$, where $\delta_0$ is the Dirac measure at 0, which is defined on $\mathcal{B}(X)$;

b) $(\mu * \nu) * \tau = \mu * (\nu * \tau)$;

c) $(a\mu + b\nu) * \tau = a(\mu * \tau) + b(\nu * \tau)$, where $a, b \ge 0$, $a + b = 1$.

6) Let $\mu_1 * \nu = \mu_2 * \nu = \tau$, where all four are Borel probability measures in a separable normed space $X$, and let $\varphi_\nu$ be not vanishing at some set $D$ that is dense in $X^*$. Prove that $\mu_1 = \mu_2$. (This means that the deconvolution equation $\mu * \nu = \tau$, with unknown $\mu$, cannot have two or more solutions.)
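The product formula of lemma 4.4 can be checked directly for measures with finitely many atoms. The sketch below assumes, purely for illustration, $X = \mathbb{R}^2$ and two hand-picked discrete measures; the helper names `char_fn` and `convolve` are hypothetical. For discrete measures the convolution of definition 4.4 reduces to all pairwise sums of atoms with product weights.

```python
import numpy as np

def char_fn(atoms, weights, z):
    """Characteristic functional of the discrete measure sum_i w_i * delta_{atoms[i]}."""
    return np.sum(weights * np.exp(1j * (atoms @ z)))

def convolve(atoms_mu, w_mu, atoms_nu, w_nu):
    """Atoms and weights of mu * nu: pairwise sums of atoms, products of weights."""
    dim = atoms_mu.shape[1]
    atoms = (atoms_mu[:, None, :] + atoms_nu[None, :, :]).reshape(-1, dim)
    weights = (w_mu[:, None] * w_nu[None, :]).ravel()
    return atoms, weights

# Two illustrative probability measures in R^2 (assumed data).
atoms_mu = np.array([[0.0, 0.0], [1.0, 2.0], [-1.0, 0.5]])
w_mu = np.array([0.2, 0.5, 0.3])
atoms_nu = np.array([[0.5, -1.0], [2.0, 1.0]])
w_nu = np.array([0.6, 0.4])

atoms_conv, w_conv = convolve(atoms_mu, w_mu, atoms_nu, w_nu)
z = np.array([0.7, -1.3])
lhs = char_fn(atoms_conv, w_conv, z)                                  # phi_{mu*nu}(z)
rhs = char_fn(atoms_mu, w_mu, z) * char_fn(atoms_nu, w_nu, z)        # phi_mu(z) * phi_nu(z)
print(abs(lhs - rhs))
```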
4.3. Properties of characteristic functionals in H

The characteristic functional is a very useful attribute of a measure in a separable space, because a measure can be renewed by its characteristic functional (see corollary 4.1). Lemmas 4.3 and 4.4 state some important properties of characteristic functionals. We study further properties of them. For convenience, here we will deal with measures in a real Hilbert space, though most of the properties are still valid in a normed space.

Let $H$ be a real (possibly non-separable) Hilbert space. For a Borel probability measure $\mu$ in $H$, its characteristic functional $\varphi_\mu : H \to \mathbb{C}$ was introduced in definition 3.15. Sometimes we will denote $\varphi_\mu$ as $\hat\mu$, emphasizing that the characteristic functional is just a Fourier transform of $\mu$. Thus,

$$\hat\mu(z) = \varphi_\mu(z) = \int_H e^{i(z,x)} \, d\mu(x), \quad z \in H.$$

LEMMA 4.5.– It holds $\hat\mu(0) = 1$ and $\hat\mu$ is positive definite, i.e. for all $n \ge 1$, $\{x_k, 1 \le k \le n\} \subset H$ and $\{c_k, 1 \le k \le n\} \subset \mathbb{C}$,

$$\sum_{j,k=1}^{n} \hat\mu(x_k - x_j) c_k \bar c_j \ge 0. \qquad [4.2]$$

PROOF.– $\hat\mu(0) = \mu(H) = 1$, because $\mu$ is a probability measure. Next, for the desired inequality,

$$\text{LHS} = \sum_{j,k=1}^{n} c_k \bar c_j \int_H e^{i(x_k - x_j, y)} \, d\mu(y) = \int_H \sum_{j,k=1}^{n} c_k \bar c_j \, e^{i(x_k, y)} \, \overline{e^{i(x_j, y)}} \, d\mu(y) = \int_H \left|\sum_{k=1}^{n} c_k e^{i(x_k, y)}\right|^2 d\mu(y) \ge 0.$$

It is useful to write [4.2] in a vector form. Introduce a square matrix

$$A = (a_{jk})_{j,k=1}^{n}, \quad a_{jk} = \hat\mu(x_k - x_j), \qquad [4.3]$$

and column vectors $c = (c_k)_1^n$, $\bar c = (\bar c_k)_1^n$. Then the LHS of [4.2] is just a quadratic form of $A$, and [4.2] takes the form

$$\bar c^{\top} A c \ge 0, \quad \text{for all } c \in \mathbb{C}^n.$$

This means that $A$ is Hermitean (i.e. $a_{kj} = \bar a_{jk}$), and moreover it is positive semidefinite (i.e. its quadratic form is non-negative).
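The matrix form [4.3] is easy to examine numerically. The following sketch assumes, for illustration only, $H = \mathbb{R}^3$ and the characteristic function $\hat\mu(z) = \exp\{i(m, z) - \|z\|^2/2\}$ of a Gaussian vector with mean $m$ and identity correlation matrix; the points $x_k$ and the vector $c$ are random, and all names are hypothetical.

```python
import numpy as np

m = np.array([1.0, -0.5, 2.0])  # assumed mean vector

def mu_hat(z):
    """Characteristic function of N(m, I) in R^3 (illustrative choice)."""
    return np.exp(1j * np.dot(m, z) - 0.5 * np.dot(z, z))

rng = np.random.default_rng(1)
points = rng.normal(size=(6, 3))          # the vectors x_1, ..., x_n
n = len(points)

# a_{jk} = mu_hat(x_k - x_j), as in [4.3]
A = np.array([[mu_hat(points[k] - points[j]) for k in range(n)] for j in range(n)])

is_hermitian = np.allclose(A, A.conj().T)
min_eig = np.linalg.eigvalsh(A).min()     # eigenvalues of a Hermitian matrix are real

c = rng.normal(size=n) + 1j * rng.normal(size=n)
quad = np.vdot(c, A @ c)                  # conj(c)^T A c, the quadratic form in [4.2]
print(is_hermitian, min_eig, quad)
```

The quadratic form comes out real and non-negative up to rounding, in line with [4.2].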
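We will derive some interesting relations just from the equality $\hat\mu(0) = 1$ and inequality [4.2].

THEOREM 4.2.– Let $H$ be a real Hilbert space and $\theta : H \to \mathbb{C}$ be a positive definite functional, with $\theta(0) = 1$. Then:

a) $\theta(-x) = \overline{\theta(x)}$, $x \in H$;

b) $|\theta(x)| \le 1$, $x \in H$;

c) $|\theta(x) - \theta(y)|^2 \le 2(1 - \operatorname{Re}\theta(x - y))$, $x, y \in H$, where $\operatorname{Re}$ is the real part of a complex number.

PROOF.– a) We have an inequality like [4.2]:

$$\sum_{j,k=1}^{n} \theta(x_k - x_j) c_k \bar c_j \ge 0. \qquad [4.4]$$

Put here $n = 2$, $x_1 = 0$, $x_2 = x$. The matrix

$$A_2 = (\theta(x_k - x_j))_{k,j=1}^{2} = \begin{pmatrix} \theta(0) & \theta(x) \\ \theta(-x) & \theta(0) \end{pmatrix} = \begin{pmatrix} 1 & \theta(x) \\ \theta(-x) & 1 \end{pmatrix}$$

has a non-negative quadratic form. Hence $A_2$ is Hermitean and $\theta(-x) = \overline{\theta(x)}$.

b) Moreover, $A_2$ is positive semidefinite. By Sylvester’s criterion,

$$\det A_2 = 1 - \theta(-x)\theta(x) = 1 - |\theta(x)|^2 \ge 0 \quad \Rightarrow \quad |\theta(x)| \le 1.$$

c) In case $\theta(x) = \theta(y)$, the inequality holds true because the RHS is always non-negative due to statement (b). Hence, we deal with the case $\theta(x) \ne \theta(y)$. Put in [4.4] $n = 3$, $x_1 = 0$, $x_2 = x$ and $x_3 = y$. The corresponding Hermitean matrix is

$$A_3 = \begin{pmatrix} 1 & \theta(x) & \theta(y) \\ \theta(-x) & 1 & \theta(y - x) \\ \theta(-y) & \theta(x - y) & 1 \end{pmatrix}.$$

We put in [4.4] $c_1 = 1$, $c_3 = -c_2$ and group conjugate summands using the identity $z + \bar z = 2\operatorname{Re} z$, $z \in \mathbb{C}$:

$$1 + |c_2|^2 + |c_3|^2 + 2\operatorname{Re}(\theta(x) \cdot c_2) + 2\operatorname{Re}(\theta(y) \cdot c_3) + 2\operatorname{Re}(\theta(y - x) \cdot \bar c_2 c_3)$$
$$= 1 + 2|c_2|^2 - 2\operatorname{Re}\theta(y - x) \cdot |c_2|^2 + 2\operatorname{Re}\{(\theta(x) - \theta(y)) \cdot c_2\} \ge 0, \quad \text{for all} \quad c_2 \in \mathbb{C}.$$

Finally, we put here

$$c_2 = \lambda \cdot \frac{|\theta(x) - \theta(y)|}{\theta(x) - \theta(y)}, \quad \lambda \in \mathbb{R}.$$

We get

$$\lambda^2 (1 - \operatorname{Re}\theta(y - x)) + \lambda |\theta(x) - \theta(y)| + \frac{1}{2} \ge 0, \quad \text{for all} \quad \lambda \in \mathbb{R}.$$

If $1 - \operatorname{Re}\theta(y - x) = 0$, then the inequality is just linear in $\lambda$, and it cannot hold for each real $\lambda$ (remember that in our case $\theta(x) \ne \theta(y)$). Thus, $1 - \operatorname{Re}\theta(y - x) > 0$, and the discriminant $D$ of the quadratic function in $\lambda$ should be non-positive:

$$D = |\theta(x) - \theta(y)|^2 - 2(1 - \operatorname{Re}\theta(y - x)) \le 0.$$

This implies the desired inequality, because $\operatorname{Re}\theta(y - x) = \operatorname{Re}\theta(x - y)$ by statement (a).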
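The three statements of theorem 4.2 can be spot-checked for a concrete positive definite functional. The sketch below assumes, for illustration, $H = \mathbb{R}^2$ and $\theta(x) = \exp\{i(m, x) - \tfrac12 (Sx, x)\}$ with a hand-picked vector $m$ and a symmetric positive definite matrix $S$ (this $\theta$ is the characteristic function of a Gaussian vector, hence positive definite by lemma 4.5). The variable names are hypothetical.

```python
import numpy as np

m = np.array([0.3, -1.0])                 # assumed mean
S = np.array([[2.0, 0.5], [0.5, 1.0]])    # assumed symmetric positive definite matrix

def theta(x):
    """theta(x) = exp{i(m, x) - (Sx, x)/2}, a positive definite functional with theta(0) = 1."""
    return np.exp(1j * (m @ x) - 0.5 * (x @ S @ x))

rng = np.random.default_rng(2)
ok_a = ok_b = ok_c = True
for _ in range(1000):
    x, y = rng.normal(size=2), rng.normal(size=2)
    # a) theta(-x) equals the complex conjugate of theta(x)
    ok_a = ok_a and bool(np.isclose(theta(-x), np.conj(theta(x))))
    # b) |theta(x)| <= 1
    ok_b = ok_b and abs(theta(x)) <= 1 + 1e-12
    # c) |theta(x) - theta(y)|^2 <= 2 (1 - Re theta(x - y))
    ok_c = ok_c and abs(theta(x) - theta(y)) ** 2 <= 2 * (1 - theta(x - y).real) + 1e-12
print(ok_a, ok_b, ok_c)
```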
Remember that a linear topology on a linear space $L$ is a topology, which is invariant under translations. This means the following: if $U$ is an open set in the topology and $x \in L$, then $U + x$ is an open set in the topology.

COROLLARY 4.3.– Let $\theta : H \to \mathbb{C}$ satisfy the conditions of theorem 4.2. Consider a linear topology $\tau$ on $H$. If $\operatorname{Re}\theta$ is continuous at zero in the topology $\tau$, then $\theta$ is a continuous functional in $\tau$.

PROOF.– We prove that $\theta$ is continuous in $\tau$ at an arbitrary point $x \in H$. Fix $\delta > 0$. We want to ensure that $|\theta(x) - \theta(y)| < \delta$, for all $y$ from some set $U_x$ that is open in $\tau$ and contains $x$. According to theorem 4.2(c),

$$|\theta(x) - \theta(y)|^2 \le 2(1 - \operatorname{Re}\theta(y - x)).$$

Since $\operatorname{Re}\theta(0) = 1$ and $\operatorname{Re}\theta$ is continuous at zero, the inequality $1 - \operatorname{Re}\theta(y - x) = |1 - \operatorname{Re}\theta(y - x)| < \frac{\delta^2}{2}$ holds if $y - x \in U_0$, where $U_0$ is some open set in $\tau$, $0 \in U_0$, or $y \in U_0 + x =: U_x$. The set $U_x$ is also open in $\tau$ (because $\tau$ is a linear topology) and $x \in U_x$. Thus, for $y \in U_x$, it holds $|\theta(x) - \theta(y)| < \delta$, and $\theta$ is continuous at the point $x$. Since $x$ was an arbitrary vector from $H$, $\theta$ is continuous in $\tau$.

Later on, in section 4.4, we will introduce the so-called S-topology. It is a linear topology in $H$, which is crucial in the description of characteristic functionals.

LEMMA 4.6.– For a Borel probability measure $\mu$ in a real Hilbert space $H$, the characteristic functional $\hat\mu$ is a uniformly continuous functional on $H$.
PROOF.– Let $\{x_n\}$ and $\{y_n\}$ be two sequences in $H$, with $\|x_n - y_n\| \to 0$ as $n \to \infty$. Consider

$$|\hat\mu(x_n) - \hat\mu(y_n)| \le \int_H |e^{i(x_n, z)} - e^{i(y_n, z)}| \, d\mu(z) = \int_H |1 - e^{i(x_n - y_n, z)}| \, d\mu(z) =: I_n.$$

The integrand $f_n(z) = |1 - e^{i(x_n - y_n, z)}|$, $z \in H$, converges pointwise to zero, and moreover

$$0 \le f_n(z) \le 2, \quad \int_H 2 \, d\mu = 2 < \infty.$$

Hence by the Lebesgue dominated convergence theorem $I_n \to 0$ as $n \to \infty$, and $\hat\mu(x_n) - \hat\mu(y_n) \to 0$ as $n \to \infty$. This proves the statement.

Problems 4.3

7) Let $\tau, \theta : H \to \mathbb{C}$ be positive definite functionals on a real Hilbert space $H$. Prove that the functionals $\tau \cdot \theta$ and $e^{\tau}$ are positive definite as well. Hint. Use problem (23) of Chapter 1.

8) Let $B(x, y)$ be a symmetric positive semidefinite bilinear form on a real Hilbert space $H$. Prove that the functional $e^{-B(x,x)}$ is a positive definite functional on $H$. Hint. Use lemma 1.6.

4.4. S-topology in H

DEFINITION 4.5.– Let $(X, \tau)$ be a topological space and $\{N_x, x \in X\}$ be an indexed family where $N_x$ is a non-empty class of subsets of $X$. Assume also that for each $x \in X$ and $U \in N_x$, it holds $x \in U$, and moreover $U$ is a neighborhood of $x$ (i.e. $U$ contains an open set that contains $x$). The family $\{N_x, x \in X\}$ is called neighborhood system of the topological space if, for each $U \in \tau$ and each $x \in U$, there exists $W \in N_x$ such that $W \subset U$.

The next statement is presented in [ENG 89].

THEOREM 4.3.– (About neighborhood system) Let $\{N_x, x \in X\}$ be an indexed family where $N_x$ is a non-empty class of subsets of $X$. Assume the following:

a) for each $x \in X$ and $U \in N_x$, it holds $x \in U$;
b) for each $x \in X$, $U \in N_x$ and $y \in U$ there exists $V \in N_y$, with $V \subset U$;

c) for each $x \in X$ and $U_1, U_2 \in N_x$, there exists $U \in N_x$, with $U \subset U_1 \cap U_2$.

Then $T = (X, \tau)$ is a topological space where $\tau$ consists of all unions of sets from the class $\bigcup_{x \in X} N_x$. Moreover, $\{N_x, x \in X\}$ is a neighborhood system of $T$.

Based on theorem 4.3, we construct a topology on a real separable infinite-dimensional Hilbert space $H$. Denote $E_S = \{x \in H : (Sx, x) < 1\}$, where $S$ is an S-operator in $H$, i.e. $S$ is a self-adjoint, positive and nuclear operator. (Remember that the class of all such operators is denoted $L_S(H)$.) Since $0 \in E_S$, this set is non-empty; it is open in the usual topology, because $E_S = K_S^{-1}((-\infty, 1))$, $K_S(x) := (Sx, x)$ is continuous on $H$ and $(-\infty, 1)$ is open in $\mathbb{R}$; $E_S$ is unbounded and convex.

Let $\{e_k, k \ge 1\}$ be the eigenbasis of $S$ and $\{\lambda_k, k \ge 1\}$ be the corresponding (non-negative) eigenvalues. In the case where $\operatorname{Ker} S = \{0\}$ (i.e. $S$ is non-degenerate),

$$E_S = \left\{x \in H : \sum_{k=1}^{\infty} \frac{(x, e_k)^2}{a_k^2} < 1\right\}, \quad a_k = \frac{1}{\sqrt{\lambda_k}}, \quad k \ge 1,$$

and $E_S$ is an infinite-dimensional ellipsoid, with semiaxes $\lambda_k^{-1/2}$.
LEMMA 4.7.– (About topology generated by ellipsoids) Let

$$N_0 = \{E_S, \ S \in L_S(H)\}, \quad N_x = N_0 + x = \{E_S + x : S \in L_S(H)\}, \quad x \in H.$$

Then $T_S = (H, \tau_S)$ is a linear topological space where $\tau_S$ consists of all unions of sets from the class $\bigcup_{x \in H} N_x$. Moreover, $\{N_x, x \in H\}$ is a neighborhood system of $T_S$.

PROOF.– We have to verify conditions (a)–(c) of theorem 4.3. It is enough to deal with $x = 0$ only.

a) $0 \in E_S$, for all $E_S \in N_0$.

b) We have to check that for each $y \in E_S$, there exists $Q \in L_S(H)$, with $y + E_Q \subset E_S$. Denote $(u, v)_S = (Su, v) = (S^{1/2}u, S^{1/2}v)$, $u, v \in H$; $\|u\|_S := \sqrt{(u, u)_S} = \|S^{1/2}u\|$, $u \in H$. The latter functional is a seminorm in $H$.

It holds $\|y\|_S < 1$. Let $Q = \varepsilon^{-2}S$, and we will choose $\varepsilon > 0$ later. For $z \in E_Q$, it holds $\|z\|_Q = \varepsilon^{-1}\|z\|_S < 1$, $\|z\|_S < \varepsilon$. Then

$$\|y + z\|_S \le \|y\|_S + \|z\|_S < \varepsilon + \|y\|_S = 1$$

if $\varepsilon = 1 - \|y\|_S$. Thus, with such $\varepsilon$ and $Q = \varepsilon^{-2}S \in L_S(H)$, we have $y + E_Q \subset E_S$.
c) For $S, Q \in L_S(H)$, it holds $E_{S+Q} \subset E_S \cap E_Q$ and $S + Q \in L_S(H)$.

Thus, the conditions of theorem 4.3 are satisfied, and the statement of lemma 4.7 follows directly from this theorem.

DEFINITION 4.6.– The topology $\tau_S$ constructed in lemma 4.7 is called S-topology, or Sazonov’s topology, in honor of the mathematician V.V. Sazonov.

Actually, $\tau_S$ is the weakest topology under which the quadratic forms $K_S(x) = (Sx, x)$, $x \in H$, $S \in L_S(H)$ are continuous.

REMARK 4.2.– Let $f : H \to \mathbb{C}$ be a continuous functional in S-topology. Then $f$ is continuous in the usual topology.

PROOF.– For each open set $G \subset \mathbb{C}$, $f^{-1}(G) \in \tau_S$. But $\tau_S$ is a weaker topology than the usual topology; hence $f^{-1}(G)$ is an open set in the usual topology. This implies the statement.

Now, we state a criterion for the continuity of a nonlinear functional in S-topology.

LEMMA 4.8.– Consider a functional $f : H \to \mathbb{C}$ such that $|f(x)| \le 1$, for all $x \in H$. This functional is continuous at zero in S-topology if, and only if, for each $\varepsilon > 0$ there exists $S_\varepsilon \in L_S(H)$ such that for all $x \in H$,

$$|f(x) - f(0)| \le \varepsilon + (S_\varepsilon x, x).$$

PROOF.– a) Necessity. We may and do assume that $\varepsilon < 1$. Since $f$ is assumed continuous at zero in $\tau_S$, there exists $S \in L_S(H)$ such that the inequality $(Sx, x) < 1$ implies that $|\Delta f(x)| = |f(x) - f(0)| < \varepsilon$. Hence

$$|\Delta f(x)| \le \varepsilon + (2Sx, x), \quad \text{for all} \quad x \in H.$$

Indeed, for $x \in E_S$, $|\Delta f(x)| < \varepsilon \le \varepsilon + (2Sx, x)$, and if $(Sx, x) \ge 1$ then

$$|\Delta f(x)| \le |f(x)| + |f(0)| \le 2 \le (2Sx, x) < \varepsilon + (2Sx, x).$$

One can set $S_\varepsilon = 2S \in L_S(H)$.

b) Sufficiency. Now, for each $\varepsilon > 0$ it holds

$$|\Delta f(x)| \le \frac{\varepsilon}{2} + (S_{\varepsilon/2}x, x).$$
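If $(S_{\varepsilon/2}x, x) < \frac{\varepsilon}{2}$, then $|\Delta f(x)| < \varepsilon$. We set $S = \frac{2}{\varepsilon}S_{\varepsilon/2} \in L_S(H)$. We have $x \in E_S \iff (S_{\varepsilon/2}x, x) < \frac{\varepsilon}{2}$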
[...] $\varepsilon > 0$, there exists an S-operator $S_\varepsilon$ such that for each $x \in H$,

$$1 - \operatorname{Re}\theta(x) \le (S_\varepsilon x, x) + \varepsilon.$$

PROOF.– a) Necessity. We assume that $\theta$ is the characteristic functional of a Borel probability measure $\mu$. By lemmas 4.5 and 4.6, $\theta$ satisfies condition 1. Fix $\varepsilon > 0$ and choose $R = R(\varepsilon) > 0$, with $\mu(\{\|y\| > R\}) < \frac{\varepsilon}{2}$. This is possible since

$$\lim_{R \to +\infty} \mu(\{\|y\| > R\}) = \mu(\emptyset) = 0.$$

We have

$$1 - \operatorname{Re}\theta(x) = \mu(\bar B(0, R)) + \mu(\{\|y\| > R\}) - \operatorname{Re}\int_{\bar B(0,R)} e^{i(x,y)} \, d\mu(y) - \operatorname{Re}\int_{\{\|y\| > R\}} e^{i(x,y)} \, d\mu(y) \le 2\mu(\{\|y\| > R\}) + \int_{\bar B(0,R)} (1 - \cos(x, y)) \, d\mu(y).$$

Since $1 - \cos t = 2\sin^2\frac{t}{2} \le \frac{t^2}{2}$, it holds

$$1 - \operatorname{Re}\theta(x) \le \varepsilon + \frac{1}{2}\int_{\bar B(0,R)} (x, y)^2 \, d\mu(y) = \varepsilon + \frac{1}{2}\int_H (x, y)^2 \, d\mu_R(y).$$

Here, $\mu_R$ is a finite Borel measure concentrated at $\bar B(0, R)$,

$$\mu_R(A) = \mu(A \cap \bar B(0, R)), \quad A \in \mathcal{B}(H).$$

The covariance operator $S_R$ of $\mu_R$ is an S-operator, because

$$\int_H \|x\|^2 \, d\mu_R(x) = \int_{\bar B(0,R)} \|x\|^2 \, d\mu(x) \le R^2 < \infty$$

(here we apply lemma 3.10, which is valid not only for a probability measure but also for any finite Borel measure in $H$). Hence

$$1 - \operatorname{Re}\theta(x) \le \varepsilon + \left(\tfrac{1}{2}S_R x, x\right), \quad x \in H,$$

and condition 2 holds true with $S_\varepsilon = \frac{1}{2}S_R \in L_S(H)$.

b) Sufficiency: Part 1. Now, we assume that $\theta$ satisfies both conditions and we want to construct a probability measure $\mu$ on $\mathcal{B}(H)$, with $\hat\mu = \theta$. According to theorem 4.2(b), $|\theta(x)| \le 1$, $|\operatorname{Re}\theta(x)| \le 1$, and taking into account lemma 4.8, condition 2 implies that $\operatorname{Re}\theta(x)$ is continuous at zero in S-topology. The latter topology is linear, and from corollary 4.3 we get that $\theta$ is a continuous functional in S-topology. Due to remark 4.2, $\theta$ is continuous in the usual topology as well.

In the rest of the proof, we may and do assume that $H$ coincides with the real sequence space $l_2$. Our functional $\theta : l_2 \to \mathbb{C}$ satisfies conditions [4.5], and we are able to construct the objects $\mu_n$ and $\mu_e$ from lemma 4.9(a). It remains to prove that $\mu_e(l_2) = 1$.
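The estimate used in the necessity part, $1 - \operatorname{Re}\theta(x) \le \tfrac12 (Ax, x)$ for a measure with bounded support and second-moment (covariance) operator $A$, can be verified exactly for an empirical measure. The sketch below assumes a measure with equal weights on 500 random points of $\mathbb{R}^4$; since the support is bounded, no tail term $\varepsilon$ is needed. The names `atoms`, `theta` and `A` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
atoms = rng.normal(size=(500, 4))      # support points y_i, each with weight 1/500

def theta(x):
    """Characteristic function of the empirical measure on `atoms`."""
    return np.mean(np.exp(1j * (atoms @ x)))

# Second-moment (covariance, in the terminology of definition 3.17) matrix:
# (A x, x) = mean of (x, y_i)^2.
A = atoms.T @ atoms / len(atoms)

gaps = []
for _ in range(200):
    x = rng.normal(size=4)
    # 1 - cos t <= t^2 / 2, averaged over the atoms, gives a non-negative gap.
    gaps.append(0.5 * (x @ A @ x) - (1 - theta(x).real))
min_gap = min(gaps)
print(min_gap)
```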
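c) Sufficiency: Part 2. With the measure $\mu_e$ on $\mathcal{B}(\mathbb{R}^\infty)$, we relate a random element $X$ in $\mathbb{R}^\infty$ with distribution $\mu_X = \mu_e$. We set $(\Omega, \mathcal{F}, \mathsf{P}) = (\mathbb{R}^\infty, \mathcal{B}(\mathbb{R}^\infty), \mu_e)$ and introduce $X = (X_n)_1^\infty : \Omega \to \mathbb{R}^\infty$, $X(\omega) = \omega$. Then

$$\mu_e(l_2) = 1 \iff \mathsf{P}\{X(\omega) \in l_2\} = 1 \iff \sum_{n=1}^{\infty} X_n^2(\omega) < \infty \ \text{a.s.}$$

We have to show that the latter series converges a.s. The next identity follows from theorem 1.6 and gives an expression for the characteristic function of a standard Gaussian vector in $\mathbb{R}^n$:

$$\int_{\mathbb{R}^n} \exp\left\{i\sum_{j=1}^{n} a_j y_j\right\}\rho(y) \, dy = \exp\left\{-\frac{1}{2}\sum_{j=1}^{n} a_j^2\right\}, \quad a_j \in \mathbb{R}, \ 1 \le j \le n.$$

Here

$$\rho(y) := \frac{1}{(\sqrt{2\pi})^n}\exp\left\{-\frac{1}{2}\sum_{j=1}^{n} y_j^2\right\}, \quad y \in \mathbb{R}^n$$

is the pdf of a standard Gaussian vector in $\mathbb{R}^n$. We substitute $a_j = X_{k+j}(\omega)$ into the identity:

$$I_{kn} := \int_\Omega \exp\left\{-\frac{1}{2}\left(X_{k+1}^2 + \dots + X_{k+n}^2\right)\right\} d\mathsf{P} = \int_\Omega \left[\int_{\mathbb{R}^n} \exp\left\{i\sum_{j=1}^{n} X_{k+j}y_j\right\}\rho(y) \, dy\right] d\mathsf{P}(\omega).$$

One can apply Fubini’s theorem because the double integral can be bounded as follows:

$$\int_\Omega \int_{\mathbb{R}^n} \left|\exp\left\{i\sum_{j=1}^{n} X_{k+j}y_j\right\}\right|\rho(y) \, dy \, d\mathsf{P}(\omega) \le \mathsf{P}(\Omega)\int_{\mathbb{R}^n}\rho(y) \, dy = 1 < \infty.$$

Then

$$I_{kn} = \int_{\mathbb{R}^n} \rho(y)\left[\int_\Omega \exp\left\{i\sum_{j=1}^{n} X_{k+j}y_j\right\} d\mathsf{P}(\omega)\right] dy.$$

The inner integral equals

$$\int_{L_{k+n}} \exp\left\{i\sum_{j=1}^{n} x_{k+j}y_j\right\} d\mu_{k+n}(x) = \theta\left(\sum_{j=1}^{n} y_j e_{k+j}\right)$$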
(here $\{e_n\}$ is the standard basis in $l_2$). Therefore, the real number $I_{kn}$ can be expressed as follows:

$$I_{kn} = \int_{\mathbb{R}^n} \theta\left(\sum_{j=1}^{n} y_j e_{k+j}\right)\rho(y) \, dy = \int_{\mathbb{R}^n} \operatorname{Re}\theta\left(\sum_{j=1}^{n} y_j e_{k+j}\right)\rho(y) \, dy.$$

Now, we use condition 2:

$$1 - I_{kn} = \int_{\mathbb{R}^n}\left(1 - \operatorname{Re}\theta\left(\sum_{j=1}^{n} y_j e_{k+j}\right)\right)\rho(y) \, dy \le \varepsilon + \sum_{j,p=1}^{n}(S_\varepsilon e_{k+j}, e_{k+p})\int_{\mathbb{R}^n} y_j y_p \rho(y) \, dy.$$

Remember that $\rho$ is the pdf of a standard Gaussian vector $\gamma = (\gamma_j)_1^n$; hence

$$\int_{\mathbb{R}^n} y_j y_p \rho(y) \, dy = \mathsf{E}\,\gamma_j\gamma_p = \delta_{jp}, \quad 1 \le j, p \le n.$$

Then

$$1 - \mathsf{E}\exp\left\{-\frac{1}{2}\sum_{j=1}^{n} X_{k+j}^2\right\} \le \varepsilon + \sum_{j=1}^{\infty}(S_\varepsilon e_{k+j}, e_{k+j}) = \varepsilon + \sum_{p=k+1}^{\infty}(S_\varepsilon e_p, e_p).$$

Since $S_\varepsilon$ is nuclear, $\sum_{p=1}^{\infty}(S_\varepsilon e_p, e_p) < \infty$ and we can fix $k = k_0(\varepsilon)$, with $\sum_{p=k+1}^{\infty}(S_\varepsilon e_p, e_p) \le \varepsilon$. We get

$$\mathsf{E}\exp\left\{-\frac{1}{2}\sum_{j=1}^{n} X_{k+j}^2\right\} \ge 1 - 2\varepsilon.$$

We let $n \to \infty$ and obtain by the Lebesgue dominated convergence theorem (here $\exp(-\infty) = 0$ by definition):

$$\mathsf{E}\exp\left\{-\frac{1}{2}\sum_{j=1}^{\infty} X_{k+j}^2\right\} \ge 1 - 2\varepsilon,$$

$$\mathsf{P}\left\{\sum_{n=1}^{\infty} X_n^2 < \infty\right\} \ge \int_{\{\omega : \sum_{1}^{\infty} X_n^2 < \infty\}} \exp\left\{-\frac{1}{2}\sum_{j=1}^{\infty} X_{k+j}^2\right\} d\mathsf{P}$$

[...] $> 0$ and $S_\varepsilon = \frac{1}{2}S_\mu \in L_S(H)$. But in the general case the covariance operator of $\mu$ need not exist, and moreover, if it does exist, this operator need not be nuclear (see example 3.8).

Comparing theorems 4.5 and 4.4, we see that in the infinite-dimensional case, a functional should be continuous in a weaker topology than the usual one, while in $\mathbb{R}^n$ the continuity in the usual topology is enough.

EXAMPLE 4.1.– (Exponent of quadratic form) Let $H$ be a real separable infinite-dimensional Hilbert space and $C$ be a self-adjoint bounded operator in $H$. Consider a functional

$$\theta(z) = e^{-(Cz, z)}, \quad z \in H.$$

We state that $\theta$ is the characteristic functional of some Borel probability measure in $H$ if, and only if, $C$ is an S-operator.

PROOF.– a) Sufficiency. Assume that $C$ is an S-operator. Then $\theta(0) = 1$; $(Cx, y)$ is a symmetric positive semidefinite bilinear form on $H$; hence $\theta$ is a positive definite functional (see problem (8) of Chapter 4). Finally, $K_C(z) = (Cz, z)$ is continuous in S-topology (see problem (11) of Chapter 4), therefore, $\operatorname{Re}\theta(z) = \theta(z) = e^{-K_C(z)}$ is continuous in S-topology as a continuous function of $K_C$. According to theorem 4.5, $\theta = \hat\mu$, for some Borel probability measure $\mu$ in $H$.
b) Necessity. Now, assume that $\theta$ is the characteristic functional of some Borel probability measure in $H$. Then $|\theta(z)| = \theta(z) \le 1$ (see theorem 4.2(b)); hence $(Cz, z) \ge 0$, and $C$ is a positive operator. Fix $\varepsilon \in (0, \frac{1}{2})$. According to theorem 4.5, there exists $S_\varepsilon \in L_S(H)$ such that for each $z \in H$,

$$1 - \theta(z) = 1 - \operatorname{Re}\theta(z) \le (S_\varepsilon z, z) + \varepsilon.$$

At each point where $(S_\varepsilon z, z) \le \varepsilon$, it holds

$$1 - e^{-(Cz, z)} \le 2\varepsilon \quad \Rightarrow \quad (Cz, z) \le a_\varepsilon := \log\frac{1}{1 - 2\varepsilon}.$$

Then

$$(Cz, z) \le \frac{a_\varepsilon}{\varepsilon}(S_\varepsilon z, z), \quad \text{for all} \quad z \in H$$

(see part (a) of the solution to problem (11) of Chapter 4). Finally, for any orthobasis $\{e_n\}$,

$$\sum_{n=1}^{\infty}(Ce_n, e_n) \le \frac{a_\varepsilon}{\varepsilon}\sum_{n=1}^{\infty}(S_\varepsilon e_n, e_n) < \infty,$$

and according to theorem 3.8 the positive self-adjoint operator $C$ is nuclear. Thus, $C$ is an S-operator.

Problems 4.5

12) Let $\theta : l_2 \to \mathbb{C}$ be the characteristic functional of some Borel probability measure in $l_2$ and $\mu_e$ be the measure in $\mathbb{R}^\infty$ constructed in lemma 4.9(a). Prove that $\mu_e(l_2) = 1$.

13) Let $H$ be a real separable infinite-dimensional Hilbert space, $\varphi : H \to \mathbb{C}$ be a continuous positive definite functional, with $\varphi(0) = 1$, and $A \in S_2(H)$. Prove that the functional $\theta(x) = \varphi(Ax)$, $x \in H$, is the characteristic functional of some Borel probability measure in $H$.
5 Gaussian Measure of General Form

In this chapter, we give a description of all Gaussian measures in $H$. The main tool is the Minlos–Sazonov theorem (theorem 4.5). We derive some properties of Gaussian measures in Hilbert space. Moreover, we introduce a Gaussian measure in a normed space and prove that its exponential moments are finite.

5.1. Characteristic functional of Gaussian measure

Remember that a Gaussian random element and a Gaussian measure in a real Hilbert space $H$ were introduced in definitions 2.7 and 2.8. The mean value and correlation operator of a random element and of a Borel probability measure in $H$ were introduced in definitions 2.9 and 2.10.

LEMMA 5.1.– Let $\mu$ be a Gaussian measure in a real Hilbert space $H$. Then it has finite second weak moments and its characteristic functional has the form

$$\hat\mu(x) = \exp\left\{i(m, x) - \frac{1}{2}(Sx, x)\right\}, \quad x \in H,$$

where $m$ and $S$ are the mean value and correlation operator of $\mu$, respectively.

PROOF.– Let $\xi$ be a Gaussian random element in $H$, with distribution $\mu$. For $x \in H$, denote $\xi_x = (\xi, x)$ and let $\mu_x$ be the distribution of the r.v. $\xi_x$, $\mu_x \sim N(m_x, \sigma_x^2)$ (it can happen that $\sigma_x^2 = 0$, then $\mu_x$ is the Dirac measure at the point $m_x$). We have by the change of variables formula:

$$\hat\mu(x) = \int_H e^{i(x,y)} \, d\mu(y) = \int_{\mathbb{R}} e^{it} \, d\mu_x(t) = \hat\mu_x(1) = \exp\left\{im_x u - \frac{\sigma_x^2 u^2}{2}\right\}\bigg|_{u=1} = \exp\left\{im_x - \frac{\sigma_x^2}{2}\right\}. \qquad [5.1]$$
Again, by the change of variables formula, the measure $\mu$ has finite first weak moments:

$$m_x = \mathsf{E}\,\xi_x = \int_{\mathbb{R}} t \, d\mu_x(t) = \int_H (x, y) \, d\mu(y),$$

and by corollary 3.3 there exists the mean value $m$ of $\mu$, with $m_x = (m, x)$;

$$\sigma_x^2 = \mathsf{D}\,\xi_x = \int_{\mathbb{R}} (t - m_x)^2 \, d\mu_x(t) = \int_H ((x, y) - m_x)^2 \, d\mu(y) = \int_H (x, y - m)^2 \, d\mu(y),$$

the measure $\mu$ has finite weak second moments, and by corollary 3.5 there exists the correlation operator $S$ of $\mu$. Therefore, $\sigma_x^2 = (Sx, x)$. We plug the expressions for $m_x$ and $\sigma_x^2$ into formula [5.1] and obtain the desired relation for $\hat\mu(x)$.

Remember that the correlation operator of any Borel probability measure in $H$ (if such an operator exists) is always bounded, self-adjoint and positive.

THEOREM 5.1.– (About characteristic functional of Gaussian measure) Let $H$ be a real separable infinite-dimensional Hilbert space.

a) If $\mu$ is a Gaussian measure in $H$, then its correlation operator is an S-operator.

b) If $x_0 \in H$ and $S \in L_S(H)$, then the functional

$$\varphi(x) = \exp\left\{i(x_0, x) - \frac{1}{2}(Sx, x)\right\}, \quad x \in H,$$

is the characteristic functional of some Gaussian measure in $H$, with mean value $x_0$ and correlation operator $S$.

PROOF.– a) Let $\xi$ be a Gaussian random element with distribution $\mu$. By lemma 5.1, its characteristic functional is

$$\varphi_\xi(x) = \hat\mu(x) = \exp\left\{i(m, x) - \frac{1}{2}(Sx, x)\right\}, \quad x \in H,$$

where $m$ and $S$ are the mean value and correlation operator of $\mu$, respectively. Introduce the random element $\xi_c = \xi - m$, with distribution $\mu_c$. Then

$$\hat\mu_c(x) = \varphi_{\xi_c}(x) = \mathsf{E}\, e^{i(x, \xi - m)} = e^{-i(x, m)}\varphi_\xi(x) = \exp\left\{-\frac{1}{2}(Sx, x)\right\}.$$

The operator $\frac{1}{2}S$ is self-adjoint and bounded as a half of the correlation operator of a Borel probability measure in $H$. According to example 4.1, $\frac{1}{2}S$ is an S-operator; hence $S = 2 \cdot (\frac{1}{2}S)$ is an S-operator as well (remember that $L_S(H)$ is a cone in $S_1(H)$).
b) We start with the case $x_0 = 0$. Then $\varphi(x) = \exp\{ -\frac12 (Sx,x) \}$; since $S \in L_S(H)$, the operator $\frac12 S$ is an S-operator as well, and according to example 4.1 there exists a Borel probability measure μ with $\hat\mu = \varphi$.

Now, we prove that μ is Gaussian. Let (Ω, F, P) = (H, B(H), μ) and ξ : Ω → H, ξ(ω) = ω. Then ξ is a random element in H with distribution μ. Fix x ∈ H and introduce the r.v. $\xi_x = (\xi, x)$, with distribution $\mu_x$. We have
\[ \hat\mu_x(t) = \int_{\mathbb{R}} e^{itz}\, d\mu_x(z) = \int_H e^{it(x,y)}\, d\mu(y) = \hat\mu(tx) = \exp\Big\{ -\frac{t^2 (Sx,x)}{2} \Big\}, \quad t \in \mathbb{R}. \]
Therefore, $\mu_x$ is the normal distribution $N(0, \sigma_x^2)$ with $\sigma_x^2 = (Sx,x)$ (it can happen that $\sigma_x^2 = 0$). Hence ξ is a Gaussian random element and μ is a Gaussian measure.

Now, consider the general case $x_0 \in H$. Let $\eta = \xi + x_0$ and ν be the distribution of η. We have:
\[ \varphi_\eta(x) = \hat\nu(x) = E\, e^{i(x, \xi + x_0)} = e^{i(x,x_0)} \varphi_\xi(x) = \varphi(x); \]
\[ (\eta, x) = (\xi, x) + (x_0, x) \sim N(m_x, \sigma_x^2), \quad m_x = (x_0, x), \quad \sigma_x^2 = (Sx,x). \]
Thus, η is a Gaussian random element and its distribution ν is a Gaussian measure with the given characteristic functional φ. By lemma 5.1, ν has mean value $x_0$ and correlation operator S.

Theorem 5.1 shows that a Gaussian measure μ in a real separable infinite-dimensional Hilbert space H is uniquely defined by its mean value m ∈ H and correlation operator $S \in L_S(H)$. For a Gaussian random element ξ and its distribution μ, we write ξ ∼ N(m, S) and say that ξ has normal distribution with parameters m and S. There is a one-to-one correspondence between the class of all Gaussian measures in H and the parameter space $H \times L_S(H)$. Problem (15) from Chapter 3 shows that a general Borel probability measure in H is not uniquely defined by these characteristics.

COROLLARY 5.1.– Let μ be a Gaussian measure in a real separable infinite-dimensional Hilbert space, with mean value m and covariance operator A. Then
\[ \int_H \|x\|^2\, d\mu(x) = \operatorname{tr} A < \infty, \tag*{[5.2]} \]
and in the sense of Bochner integral,
\[ m = \int_H x\, d\mu(x). \tag*{[5.3]} \]
PROOF.– By theorem 5.1(a), the correlation operator S of μ is nuclear; then corollary 3.2 implies $\int_H \|x\|^2\, d\mu < \infty$, and [5.2] follows by lemma 3.10. Hence, $m_1 = \int_H \|x\|\, d\mu(x) < \infty$ and [5.3] holds true with the Bochner integral (see section 3.3.1).

Problems 5.1

1) Let H be a real separable infinite-dimensional Hilbert space and h ∈ H. Prove that there are no Borel probability measures in H with characteristic functional $\exp\{ i(h,x) - \frac{\|x\|^2}{2} \}$, x ∈ H.

2) Let $\xi_1$ and $\xi_2$ be independent random elements in a real separable infinite-dimensional Hilbert space and $\xi_i \sim N(m_i, S_i)$, i = 1, 2. Prove that $\xi_1 + \xi_2 \sim N(m_1 + m_2, S_1 + S_2)$.

5.2. Decomposition of Gaussian measure and Gaussian random element

Let μ be a Gaussian measure with mean value m and covariance operator S in a real separable infinite-dimensional Hilbert space H. Let $\{e_k, k \ge 1\}$ be an eigenbasis of S and $\{\lambda_k, k \ge 1\}$ be the corresponding eigenvalues, $\lambda_k \ge 0$, $\sum_1^\infty \lambda_k < \infty$.

We use a construction described in lemma 4.9. Consider an increasing system of finite-dimensional subspaces $L_n := \operatorname{span}(e_1, \dots, e_n)$. Let $\mu_n$ be a probability measure on $B(L_n)$, with $\hat\mu_n = \hat\mu|_{L_n}$. For $x \in L_n$, it holds
\[ \hat\mu_n(x) = \exp\Big\{ i(m,x) - \frac12 (Sx,x) \Big\} = \exp\Big\{ i \sum_1^n m_k x_k - \frac12 \sum_1^n \lambda_k x_k^2 \Big\}. \]
Here $m_k := (m, e_k)$ and $x_k := (x, e_k)$ are Fourier coefficients of m and x, respectively. Hence
\[ \hat\mu_n(x) = \prod_1^n \exp\Big\{ i m_k x_k - \frac12 \lambda_k x_k^2 \Big\} = \prod_1^n \hat\mu_{[e_k]}(x_k), \quad x \in L_n. \tag*{[5.4]} \]
Here $\mu_{[e_k]}$ is a Gaussian measure on $B(\mathbb{R})$, with mean value $m_k$ and variance $\int_{\mathbb{R}} (t - m_k)^2\, d\mu_{[e_k]}(t) = \lambda_k$ (if $\lambda_k = 0$, then $\mu_{[e_k]}$ is the Dirac measure $\delta_{m_k}$ at the point $m_k$). We identify $L_n$ and $\mathbb{R}^n$ and obtain from [5.4] that
\[ \mu_n = \prod_1^n \mu_{[e_k]}. \]
The measures $\{\mu_n, n \ge 1\}$ are consistent and yield a measure $\mu_e$ on $B(\mathbb{R}^\infty)$, with $\mu_e(\hat A_n) = \mu_n(A_n)$, for all n ≥ 1 and $A_n \in B(\mathbb{R}^n)$ (see lemma 4.9). The
measure $\mu_e$ is just the infinite product of the measures $\mu_{[e_k]}$. Now, we identify x ∈ H and the vector of Fourier coefficients $((x, e_k))_1^\infty \in l_2$. Thus, we identify H and $l_2$. Then
\[ \mu = \mu_e \big|_{B(l_2)} \]
(see problem (12) from Chapter 4). In such a situation, we say that μ is a product measure in H and write
\[ \mu = \prod_1^\infty \mu_{[e_k]} \quad \text{in } H \tag*{[5.5]} \]
(see [2.20]). In section 2.4, we constructed a Gaussian measure in $l_2$ as a product measure of one-dimensional Gaussian measures. Expansion [5.4] shows that an arbitrary Gaussian measure in H can be constructed that way.

Now, we derive an expansion for a Gaussian random element in H. Actually, we extend theorem 1.7 to infinite-dimensional Hilbert space.

THEOREM 5.2.– (About expansion of Gaussian random element) Let m ∈ H, S be an S-operator in H, $\{\lambda_k, k \ge 1\}$ be the positive eigenvalues of S (with multiplicity) and $\{e_k, k \ge 1\}$ be the corresponding orthonormal system of eigenvectors.
a) For ξ ∼ N(m, S), there exist i.i.d. N(0, 1) r.v.'s $\{\gamma_k, k \ge 1\}$ on the underlying probability space such that almost surely
\[ \xi = m + \sum_{k \ge 1} \sqrt{\lambda_k}\, \gamma_k e_k, \]
where the series (if the sum is infinite) converges strongly in H with probability 1.
b) If $\{\gamma_k, k \ge 1\}$ are i.i.d. N(0, 1) random variables, then the series (if the sum is infinite) $T := \sum_{k \ge 1} \sqrt{\lambda_k}\, \gamma_k e_k$ converges strongly in H a.s., and a random element η which equals m + T a.s. satisfies η ∼ N(m, S).

PROOF.– We complete the given orthonormal system to an eigenbasis $\{e_k, k \in N\}$. It holds $S e_k = \lambda_k e_k$, k ∈ N (some of the $\lambda_k$'s can be zero).

a) Introduce X = ξ − m, X ∼ N(0, S). Now,
\[ X = \sum_{k \in N} (X, e_k) e_k, \quad E(X, e_k) = 0, \quad D(X, e_k) = \lambda_k, \quad k \in N. \]
If $\lambda_j = 0$, then $(X, e_j) = 0$ a.s. Thus, a.s.
\[ X = \sum_{k:\, \lambda_k > 0} \sqrt{\lambda_k}\, \frac{(X, e_k)}{\sqrt{\lambda_k}}\, e_k \]
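(the series converges strongly in H a.s.). The random variables $\gamma_k := \frac{(X, e_k)}{\sqrt{\lambda_k}}$ from the latter sum are jointly Gaussian (because $\sum_{k=1}^N a_k \gamma_k = \big( X, \sum_{k=1}^N \frac{a_k e_k}{\sqrt{\lambda_k}} \big)$ is a Gaussian r.v.), and $E\, \gamma_k = 0$,
\[ \operatorname{Cov}(\gamma_k, \gamma_j) = \frac{(S e_k, e_j)}{\sqrt{\lambda_k \lambda_j}} = \frac{\lambda_k}{\sqrt{\lambda_k \lambda_j}}\, \delta_{kj} = \delta_{kj}. \]
By theorem 1.6(a), $\{\gamma_k\}$ are i.i.d. N(0, 1) random variables. Now,
\[ X = \sum_{k \ge 1} \sqrt{\lambda_k}\, \gamma_k e_k \quad \text{a.s.}, \]
and the statement follows.

b) We have
\[ E \sum_{k \ge 1} \big( \sqrt{\lambda_k}\, \gamma_k \big)^2 = \sum_{k \ge 1} \lambda_k\, E\, \gamma_k^2 = \sum_{k \ge 1} \lambda_k = \operatorname{tr} S < \infty. \]
Hence $\sum_{k \ge 1} ( \sqrt{\lambda_k}\, \gamma_k )^2 < \infty$ a.s., and the series T converges strongly in H a.s.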
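Now, let η = m + T a.s. It holds $\varphi_\eta(x) = e^{i(x,m)} \varphi_T(x)$, and by the Lebesgue dominated convergence theorem
\[ \varphi_\eta(x) = e^{i(x,m)} \lim_{n \to \infty} E \exp\Big\{ i \sum_{k=1}^n \sqrt{\lambda_k}\, \gamma_k (x, e_k) \Big\} = e^{i(x,m)} \prod_{k \ge 1} E \exp\big\{ i \sqrt{\lambda_k}\, \gamma_k (x, e_k) \big\} = \]
\[ = e^{i(x,m)} \prod_{k \ge 1} e^{-\frac{\lambda_k (x, e_k)^2}{2}} = \exp\Big\{ i(x,m) - \frac12 \sum_1^\infty \lambda_k (x, e_k)^2 \Big\}. \]
Thus,
\[ \varphi_\eta(x) = \exp\Big\{ i(x,m) - \frac12 (Sx,x) \Big\}, \quad x \in H, \]
and by theorem 5.1(b), η ∼ N(m, S).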
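The expansion in theorem 5.2 together with the trace identity [5.2] can be illustrated numerically. The following is a minimal sketch, not part of the original text: it truncates the series to 30 terms and uses the hypothetical eigenvalues λ_k = 1/k² (any summable positive sequence would do). In the eigenbasis, the k-th coordinate of ξ − m is √λ_k γ_k, so the squared norm is Σ λ_k γ_k², and its Monte Carlo mean should be close to the (truncated) trace Σ λ_k.

```python
import random


def sample_sq_norm(lams, rng):
    """Squared norm of the truncated series sum_k sqrt(lam_k) * gamma_k * e_k.

    In the eigenbasis the k-th coordinate equals sqrt(lam_k) * gamma_k,
    so the squared norm is sum_k lam_k * gamma_k**2.
    """
    return sum(lam * rng.gauss(0.0, 1.0) ** 2 for lam in lams)


def mean_sq_norm(lams, n_samples, seed=0):
    """Monte Carlo estimate of E ||xi - m||^2 for the truncated expansion."""
    rng = random.Random(seed)
    total = sum(sample_sq_norm(lams, rng) for _ in range(n_samples))
    return total / n_samples


# Hypothetical summable eigenvalues (an assumption made for illustration only).
lams = [1.0 / k ** 2 for k in range(1, 31)]
trace = sum(lams)
estimate = mean_sq_norm(lams, n_samples=20000)
print(f"truncated tr S = {trace:.4f}, Monte Carlo E||xi - m||^2 = {estimate:.4f}")
```

The two printed numbers agree only up to Monte Carlo error; the sketch checks the second moment and does not verify Gaussianity of the limit.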
Problems 5.2

3) Let ξ ∼ N(m, S) in a real separable infinite-dimensional Hilbert space and the operator S be non-singular (i.e. Ker S = {0}). Prove that for each α > 0, $E\, \|\xi\|^{-\alpha} < \infty$.

4) Let ξ ∼ N(0, S) in a real separable Hilbert space, $\xi_1, \dots, \xi_n$ be independent copies of ξ, $U = (u_{kj})_{k,j=1}^n$ be a real orthogonal matrix, and $\eta_k = \sum_{j=1}^n u_{kj} \xi_j$, k = 1, . . . , n. Prove that $\eta_1, \dots, \eta_n$ are independent copies of ξ.

5.3. Support of Gaussian measure and its invariance

We state a classical result.

THEOREM 5.3.– (Anderson's inequality) Let g be a centered (i.e. with zero mean) Gaussian measure in $\mathbb{R}^n$. Then for each symmetric convex Borel set A and for each vector a, it holds g(A + a) ≤ g(A).

For the proof, see [BOG 98], Chapter 1. We can strengthen theorem 5.3 for the case where A is a ball.

LEMMA 5.2.– Let d > 0 and $\xi \sim N(0, \sigma^2)$, σ > 0. Then the function
\[ \varphi(a) := P\{ \xi \in [-d + a, d + a] \}, \quad a \ge 0, \]
is decreasing.

PROOF.– Let $0 \le a_1 < a_2$. Then
\[ \varphi(a_1) - \varphi(a_2) = \int_{-d+a_1}^{-d+a_2} \rho(x)\, dx - \int_{d+a_1}^{d+a_2} \rho(t)\, dt = \int_{d+a_1}^{d+a_2} ( \rho(t - 2d) - \rho(t) )\, dt, \]
where
\[ \rho(x) = \frac{1}{\sqrt{2\pi}\, \sigma}\, e^{-\frac{x^2}{2\sigma^2}} \]
is the pdf of ξ. For $t \in (d + a_1, d + a_2]$, |t − 2d| equals either t − 2d or 2d − t, and in both cases it is less than t = |t|. Therefore, ρ(t − 2d) > ρ(t), $t \in (d + a_1, d + a_2]$; hence $\varphi(a_1) - \varphi(a_2) > 0$.
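Lemma 5.2 is easy to check numerically, since the interval probability has a closed form in terms of the error function. The sketch below is an illustration only; the values d = 1, σ = 2 and the grid of shifts are arbitrary choices, not taken from the text.

```python
import math


def normal_cdf(x, sigma):
    """Distribution function of N(0, sigma^2) at the point x."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def interval_prob(a, d, sigma):
    """phi(a) = P{ xi in [a - d, a + d] } for xi ~ N(0, sigma^2)."""
    return normal_cdf(a + d, sigma) - normal_cdf(a - d, sigma)


# Arbitrary illustrative parameters: half-width d, standard deviation sigma.
d, sigma = 1.0, 2.0
shifts = [0.25 * j for j in range(21)]          # a = 0, 0.25, ..., 5
probs = [interval_prob(a, d, sigma) for a in shifts]

# The lemma states that phi is strictly decreasing on [0, infinity).
is_decreasing = all(p > q for p, q in zip(probs, probs[1:]))
print(is_decreasing)
```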
COROLLARY 5.2.– Let r > 0, g be a centered Gaussian measure in $\mathbb{R}^n$ with non-singular correlation matrix S, and $\{e_i\}$ be an eigenbasis of S. For $a \in \mathbb{R}^n$, denote $c_i = c_i(a) = (a, e_i)$, 1 ≤ i ≤ n. Then the function $\varphi(|c_1|, \dots, |c_n|) = g(\bar B(a, r))$ is decreasing in each argument $|c_i|$ for fixed other arguments $|c_j|$, j ≠ i.

PROOF.– For n = 1, the statement follows from lemma 5.2. Let n ≥ 2, ξ ∼ N(0, S), $\eta_i = (\xi, e_i)$, 1 ≤ i ≤ n. Then $\eta_i$ are independent normal random variables with positive variances. It is enough to consider $c_i \ge 0$, 1 ≤ i ≤ n. We have
\[ \varphi(c_1, \dots, c_n) = P\Big\{ \sum_1^n (\eta_i - c_i)^2 \le r^2 \Big\} = E\big[ P\{ (\eta_1 - c_1)^2 \le r^2 - S_{2,n},\ S_{2,n} < r^2 \mid \eta_2, \dots, \eta_n \} \big], \]
where
\[ S_{2,n} = \sum_{i=2}^n (\eta_i - c_i)^2. \]
Denote
\[ g_1(c_1, R) = P\{ (\eta_1 - c_1)^2 \le R^2 \}, \quad c_1 \ge 0, \quad R > 0. \]
Then
\[ \varphi(c_1, \dots, c_n) = E\Big[ I\{ S_{2,n} < r^2 \} \cdot g_1(c_1, R) \big|_{R = \sqrt{r^2 - S_{2,n}}} \Big]. \]
The function $g_1$ is decreasing in $c_1$, and $P\{ S_{2,n} < r^2 \} > 0$; hence φ is decreasing in $c_1$ for fixed $c_2, \dots, c_n$. In a similar way, φ is decreasing in each argument $c_i$ for fixed other arguments $c_j$, j ≠ i.

LEMMA 5.3.– Let μ be a centered Gaussian measure in a real separable Hilbert space H. Then for each ε > 0,
\[ \mu(\bar B(0, \varepsilon)) > 0. \]
PROOF.– a) Consider the case dim H = n < ∞. By theorem 1.8, supp μ = R(S), where S is the variance–covariance matrix of μ. Hence, 0 ∈ supp μ, and the desired inequality follows (see definition 1.18).

b) Now, let dim H = ∞. We may and do assume that H coincides with the real sequence space $l_2$. First we prove that for any $a \in l_2$,
\[ \mu(\bar B(a, \varepsilon)) \le \mu(\bar B(0, \varepsilon)). \tag*{[5.6]} \]
Indeed, consider a Gaussian random element $\gamma = (\gamma_n)_1^\infty$ in $l_2$ with distribution μ. We have
\[ \mu(\bar B(a, \varepsilon)) = P\Big\{ \sum_1^\infty (\gamma_n - a_n)^2 \le \varepsilon^2 \Big\} = \lim_{N \to \infty} P\Big\{ \sum_1^N (\gamma_n - a_n)^2 \le \varepsilon^2 \Big\}. \]
By theorem 5.3 or corollary 5.2,
\[ P\Big\{ \sum_1^N (\gamma_n - a_n)^2 \le \varepsilon^2 \Big\} \le P\Big\{ \sum_1^N \gamma_n^2 \le \varepsilon^2 \Big\}. \]
Hence
\[ \mu(\bar B(a, \varepsilon)) \le \lim_{N \to \infty} P\Big\{ \sum_1^N \gamma_n^2 \le \varepsilon^2 \Big\} \le P\Big\{ \sum_1^\infty \gamma_n^2 \le \varepsilon^2 \Big\} = \mu(\bar B(0, \varepsilon)), \]
and inequality [5.6] is established.

c) Still for $H = l_2$, we prove the statement of the present lemma. Suppose that for some $\varepsilon_0 > 0$, $\mu(\bar B(0, \varepsilon_0)) = 0$. Then [5.6] implies that for each $a \in l_2$, $\mu(\bar B(a, \varepsilon_0)) = 0$. But $l_2$ is separable; hence
\[ l_2 = \bigcup_{k=1}^\infty \bar B(a^{(k)}, \varepsilon_0), \]
for some sequence $\{a^{(k)}\} \subset l_2$. Therefore,
\[ \mu(l_2) \le \sum_{k=1}^\infty \mu(\bar B(a^{(k)}, \varepsilon_0)) = 0. \]
But $\mu(l_2) = 1$. The obtained contradiction proves the statement.
In Chapter 1, we found the support of a Gaussian measure in $\mathbb{R}^n$. Now, we extend definitions 1.15 and 1.18 of support to a separable metric space (X, ρ).

DEFINITION 5.1.– For a random element ξ in (X, ρ), we denote by G the union of all balls B(x, r) with P{ξ ∈ B(x, r)} = 0. The set X \ G is called the support of ξ and denoted as supp ξ.

DEFINITION 5.2.– For a Borel probability measure μ in (X, ρ), we denote by G the union of all balls B(x, r) with μ(B(x, r)) = 0. The set X \ G is called the support of μ and denoted as supp μ.

It is clear that since (X, ρ) is separable, μ(G) = 0 and μ(supp μ) = 1.

THEOREM 5.4.– (About support of Gaussian measure) Let μ be a Gaussian measure in a real separable Hilbert space H, with mean value m and correlation operator S. Then
\[ \operatorname{supp} \mu = \overline{R(S)} + m, \tag*{[5.7]} \]
where the bar stands for the closure of a set.

PROOF.– a) Let γ ∼ N(m, S), η = γ − m. Then η ∼ N(0, S) and supp μ = supp γ = supp η + m. In order to prove [5.7], we have to show that
\[ \operatorname{supp} \eta = \overline{R(S)}. \tag*{[5.8]} \]

b) If dim H < ∞, then this follows from theorem 1.8. Now, we consider the case dim H = ∞. The space H can be decomposed as
\[ H = M_1 \oplus M_2, \quad M_1 = \overline{R(S)}, \quad M_2 = \operatorname{Ker} S, \]
because S is self-adjoint. Respectively,
\[ S = S_1 \oplus S_2, \quad S_1 \in L(M_1), \quad S_2 = 0 \in L(M_2), \quad \operatorname{Ker} S_1 = \{0\}. \]
According to section 5.2,
\[ \mu_\eta = \mu_1 \times \mu_2, \quad \mu_1 \text{ is the } N(0, S_1) \text{ distribution}, \quad \mu_2 = \delta_0, \]
the latter being the Dirac measure at the origin in $M_2$. Consequently (see problem (5) of Chapter 5),
\[ \operatorname{supp} \eta = \operatorname{supp} \mu_\eta = (\operatorname{supp} \mu_1) \times (\operatorname{supp} \mu_2), \]
where we identify $M_1 \oplus M_2$ and $M_1 \times M_2$. But supp $\mu_2$ = {0}, and in order to show [5.8], we have to prove that supp $\mu_1 = \overline{R(S_1)}$. But now $\overline{R(S_1)} = M_1$ due to the properties of $S_1$. If dim $M_1$ < ∞, then supp $\mu_1 = R(S_1) = M_1$ and we are done. Therefore, we focus on the case dim $M_1$ = ∞.

c) Thus, it remains to prove the following: if η ∼ N(0, S) with non-singular S (i.e. Ker S = {0}), then supp η = H. We may and do assume that H coincides with the real sequence space $l_2$ and S is a diagonal operator
\[ Sx = (\lambda_1 x_1, \dots, \lambda_n x_n, \dots), \quad x \in l_2, \]
with $\lambda_n > 0$, n ≥ 1, $\sum_1^\infty \lambda_n < \infty$.

Now, consider an arbitrary closed ball $\bar B(a, r)$ in $l_2$. We have to show that $\mu_\eta(\bar B(a, r)) > 0$. There exist $r_1 > 0$ and a finitary vector $b = (a_1, \dots, a_k, 0, 0, \dots)$ such that $\bar B(b, r_1) \subset \bar B(a, r)$. Since $\{\eta_i\}$ are independent, we have:
\[ \mu_\eta(\bar B(b, r_1)) = P\Big\{ \sum_1^k (\eta_i - a_i)^2 + \sum_{k+1}^\infty \eta_i^2 \le r_1^2 \Big\} \ge P\Big\{ \sum_1^k (\eta_i - a_i)^2 \le \frac{r_1^2}{2},\ \sum_{k+1}^\infty \eta_i^2 \le \frac{r_1^2}{2} \Big\} = \]
\[ = P\Big\{ \sum_1^k (\eta_i - a_i)^2 \le \frac{r_1^2}{2} \Big\} \cdot P\Big\{ \sum_{k+1}^\infty \eta_i^2 \le \frac{r_1^2}{2} \Big\}. \]
Here the first multiplier is positive because for $\eta^{(k)} := (\eta_i)_1^k$, supp $\eta^{(k)} = \mathbb{R}^k$ (see theorem 1.8); the second multiplier is positive as well due to lemma 5.3 applied to the centered Gaussian random element $\eta_{(k)} := (\eta_{k+1}, \eta_{k+2}, \dots)$ in $l_2$. Thus,
\[ \mu_\eta(\bar B(b, r_1)) > 0 \quad \Rightarrow \quad \mu_\eta(\bar B(a, r)) > 0. \]
Therefore, for the measure $\mu_\eta$ it holds G = ∅ and supp η = supp $\mu_\eta$ = H \ G = H (see definition 5.2). This completes the proof.

COROLLARY 5.3.– Let μ be a centered Gaussian measure in a real separable Hilbert space H, with non-singular correlation operator S. Then supp μ = H.
PROOF.– We have $H = \overline{R(S)} \oplus \operatorname{Ker} S$. But now Ker S = {0}. Hence, $H = \overline{R(S)}$, and by theorem 5.4 (here m = 0)
\[ \operatorname{supp} \mu = \overline{R(S)} + m = H + m = H. \]

Now, we switch to invariance properties of a Gaussian measure. In section 1.5.4, we dealt with a centered Gaussian measure μ in $\mathbb{R}^n$ with a non-singular variance–covariance matrix S. We introduced an inner product $(x, y)_S = (S^{-1/2} x, S^{-1/2} y)$ in $\mathbb{R}^n$ and showed that given $U \in L(\mathbb{R}^n)$, μ is U-invariant if, and only if, U is an orthogonal transformation w.r.t. the inner product $(x, y)_S$. Our goal is to obtain an analogous result for a Gaussian measure in a real separable infinite-dimensional Hilbert space H.

Now, consider a centered Gaussian measure μ in H with a non-singular covariance operator S (i.e. Ker S = {0}). In the linear set
\[ H_0 := \sqrt{S}(H) = \{ \sqrt{S} x : x \in H \} \]
we introduce an inner product
\[ (x, y)_S := \big( S^{-1/2} x, S^{-1/2} y \big), \quad x, y \in H_0. \]
With this inner product, $H_0$ is a real separable infinite-dimensional Hilbert space, because $\sqrt{S} : H \to H_0$ is an isometry of the two spaces with inner products.

REMARK 5.1.– Let $\{e_n, n \ge 1\}$ be the eigenbasis of S and $\{\lambda_n, n \ge 1\}$ be the corresponding eigenvalues of S. Then
\[ H_0 = \Big\{ y = \sum_{n=1}^\infty y_n e_n : \sum_1^\infty \frac{y_n^2}{\lambda_n} < \infty \Big\}. \]
Here the series $\sum_1^\infty y_n e_n$ converges in the norm of H.

For U ∈ L(H), the adjoint operator $U^* \in L(H)$. Assume additionally that $U(H_0) \subset H_0$, i.e. $H_0$ is invariant w.r.t. U. Denote $U_0 = U|_{H_0}$. Clearly, $U_0$ is a linear operator in $H_0$.

LEMMA 5.4.– (About boundedness of restricted operator) Let U ∈ L(H) and $H_0$ be invariant w.r.t. U. Then $U_0 \in L(H_0)$.
PROOF.– We use the closed graph theorem (see [BER 12]). Suppose that in $H_0$ we have the convergence of two sequences:
\[ \sqrt{S} x_n \xrightarrow{H_0} \sqrt{S} x, \qquad U \sqrt{S} x_n \xrightarrow{H_0} \sqrt{S} y. \]
Then in H we have the convergence:
\[ x_n \xrightarrow{H} x \quad \Rightarrow \quad U \sqrt{S} x_n \xrightarrow{H} U \sqrt{S} x. \]
If $z_n \xrightarrow{H_0} z$, then $S^{-1/2} z_n \xrightarrow{H} S^{-1/2} z$, and $S^{1/2}(S^{-1/2} z_n) = z_n \xrightarrow{H} S^{1/2}(S^{-1/2} z) = z$ due to the continuity of $S^{1/2}$ in H. Hence, the convergence in $H_0$ implies the convergence in H. We have
\[ U \sqrt{S} x_n \xrightarrow{H} \sqrt{S} y, \qquad \sqrt{S} y = U \sqrt{S} x. \]
Thus, the graph of $U_0$ is closed in $H_0$, and by the closed graph theorem $U_0 \in L(H_0)$.

For $V \in L(H_0)$, the adjoint operator from $L(H_0)$ will be denoted as $\hat V$.

LEMMA 5.5.– Let U ∈ L(H) and $H_0$ be invariant w.r.t. U. Then on the set S(H), which is dense in $H_0$,
\[ \hat U_0 = S U^* S^{-1}. \]

PROOF.– Let $x \in H_0$ and y ∈ S(H). Then
\[ (x, \hat U_0 y)_S = (U_0 x, y)_S = \big( S^{-1/2} U_0 x, S^{-1/2} y \big) = (U_0 x, S^{-1} y) = (U x, S^{-1} y) = (x, U^* S^{-1} y) = \]
\[ = \big( S^{1/2} x, S^{1/2} U^* S^{-1} y \big)_S = (x, S U^* S^{-1} y)_S. \]
Hence, $\hat U_0 y = S U^* S^{-1} y$, y ∈ S(H).

LEMMA 5.6.– (About linear transform of Gaussian random element) Let ξ ∼ N(m, S) in H and A ∈ L(H). Then $A\xi \sim N(Am, ASA^*)$.

PROOF.– The random element Aξ is Gaussian, because for each h ∈ H, $(A\xi, h) = (\xi, A^* h)$ is a Gaussian random variable (possibly degenerate). Next, E(Aξ) = A(E ξ) = Am by properties of the Pettis integral. The correlation operator of Aξ equals $ASA^*$ (see problem (16) of Chapter 3).
COROLLARY 5.4.– (Criterion for invariance of Gaussian measure under linear transform) Let μ be a centered Gaussian measure in H, with correlation operator S, and A ∈ L(H). Then μ is A-invariant if, and only if, $ASA^* = S$.

PROOF.– Consider a random element ξ ∼ N(0, S) in H, with distribution $\mu_\xi = \mu$. By lemma 5.6, $A\xi \sim N(0, ASA^*)$. The measure μ is A-invariant if, and only if, ξ and Aξ are identically distributed, and this is equivalent to the equality $S = ASA^*$.

THEOREM 5.5.– (Condition for invariance in terms of unitary operators) Let μ be a centered Gaussian measure in H, with a non-singular covariance operator S, and U ∈ L(H). Let $H_0 = \sqrt{S}(H)$ be invariant w.r.t. U, and $U_0 = U|_{H_0}$ be a unitary operator in $H_0$. Then μ is U-invariant.

PROOF.– In view of corollary 5.4, we have to check that $USU^* = S$. Since $S(H) \subset \sqrt{S}(H)$, it holds $USU^* = U_0 S U^*$, and we have to verify that
\[ U_0 S U^* x = Sx, \quad x \in H, \]
or
\[ U_0 S U^* S^{-1} y = y, \quad y \in S(H), \]
or in view of lemma 5.5,
\[ U_0 \hat U_0 y = y, \quad y \in S(H). \tag*{[5.9]} \]
But since $U_0$ is a unitary operator in $H_0$, it holds $U_0 \hat U_0 z = z$, $z \in \sqrt{S}(H) = H_0$, and [5.9] follows because $S(H) \subset \sqrt{S}(H)$. Thus, indeed $USU^* = S$.
Let μ be a Gaussian measure from theorem 5.5. Problem (7) in this chapter describes all unitary operators in H under which μ is invariant. Now, we give examples of such operators.

EXAMPLE 5.1.– (Symmetries). Let $\{e_n\}$ be an eigenbasis of S. Consider the unitary operator
\[ Ux = \sum_1^\infty \varepsilon_n (x, e_n) e_n, \quad x \in H, \]
where each $\varepsilon_n$ is either 1 or −1, with an arbitrary combination of signs. Then μ is U-invariant.
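The criterion ASA* = S of corollary 5.4 can be tried out in a finite-dimensional analogue, where operators are matrices and A* is the transpose. The sketch below is only an illustration under that assumption: it takes a hypothetical diagonal covariance diag(3, 1, 1) and tests a sign-flip matrix (as in example 5.1), a swap of two coordinates with equal eigenvalues, and a swap of two coordinates with different eigenvalues.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def diag(values):
    """Diagonal matrix with the given diagonal entries."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def is_invariant(a, s, tol=1e-12):
    """Check the criterion A S A^T = S entrywise, up to the tolerance tol."""
    asat = matmul(matmul(a, s), transpose(a))
    n = len(s)
    return all(abs(asat[i][j] - s[i][j]) <= tol
               for i in range(n) for j in range(n))


# Hypothetical covariance with a repeated eigenvalue (illustration only).
s = diag([3.0, 1.0, 1.0])
sign_flip = diag([1.0, -1.0, -1.0])
swap_equal = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
swap_unequal = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

print(is_invariant(sign_flip, s),      # sign changes preserve S
      is_invariant(swap_equal, s),     # swapping equal eigenvalues preserves S
      is_invariant(swap_unequal, s))   # swapping 3 and 1 does not
```

The third case fails because the swapped covariance is diag(1, 3, 1), which differs from S.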
EXAMPLE 5.2.– (Permutation of coordinates). Let $\{e_n\}$ be an eigenbasis of S, $\{\lambda_n\}$ be the corresponding (positive) eigenvalues (with multiplicity), and π be a permutation of N with $\lambda_n = \lambda_{\pi(n)}$, n ≥ 1. Consider the unitary operator
\[ Ux = \sum_1^\infty (x, e_n) e_{\pi(n)}, \quad x \in H. \]
Then μ is U-invariant.

Theorem 5.5, examples 5.1 and 5.2, and problem (7) of this chapter show that there is a vast group of linear transformations under which a centered Gaussian measure is invariant.

Problems 5.3

5) Let $\mu_i$ be a Borel probability measure in a separable metric space $X_i$, i = 1, 2. Prove that supp $(\mu_1 \times \mu_2)$ = (supp $\mu_1$) × (supp $\mu_2$).

6) Let μ be a centered Gaussian measure in a real separable infinite-dimensional Hilbert space H, with non-singular covariance operator S. Let $\{e_n\}$ be an eigenbasis of S and $\{\lambda_n\}$ be the corresponding (positive) eigenvalues of S (with multiplicity). Consider V ∈ L(H), with
\[ V e_i = \sqrt{\frac{\lambda_{\pi(i)}}{\lambda_i}}\; e_{\pi(i)}, \quad i \ge 1, \]
where π is some permutation of N. Prove that μ is V-invariant.

7) Let μ be a measure from problem (6), $\alpha_1 > \alpha_2 > \dots > \alpha_n > \dots$ be the eigenvalues of S (without multiplicity), and $H = H_1 \oplus H_2 \oplus \dots \oplus H_n \oplus \dots$ be a decomposition of H into an orthogonal sum of eigenspaces of S (here $H_n$ corresponds to the eigenvalue $\alpha_n$). Consider a unitary operator U in H. Prove that μ is U-invariant if, and only if, $U(H_n) = H_n$ for each n ≥ 1.

5.4. Weak convergence of Gaussian measures

DEFINITION 5.3.– Let (X, ρ) be a metric space and $\{\mu, \mu_n, n \ge 1\}$ be Borel probability measures in X. The sequence $\{\mu_n\}$ weakly converges to μ if for each bounded and continuous function f : X → R,
\[ \int_X f\, d\mu_n \to \int_X f\, d\mu \quad \text{as } n \to \infty. \]
See [BIL 99] for properties and applications of the weak convergence of measures. We denote by $M_X$ the class of all Borel probability measures in X.

DEFINITION 5.4.– The Lévy–Prokhorov metric d in $M_X$ is introduced as follows: for μ and ν from $M_X$,
\[ d(\mu, \nu) = \inf\{ \varepsilon > 0 : \mu(F) \le \nu(F^\varepsilon) + \varepsilon \ \text{and}\ \nu(F) \le \mu(F^\varepsilon) + \varepsilon \ \text{for each closed } F \subset X \}, \]
where $F^\varepsilon$ is the ε-neighborhood of F, $F^\varepsilon = \{ x \in X : \rho(x, F) < \varepsilon \}$.

THEOREM 5.6.– (Prokhorov's theorem about metrization of weak convergence) Let (X, ρ) be a complete separable metric space. Then $M_X$ with the Lévy–Prokhorov metric d is a complete separable metric space as well. Moreover, $\mu_n$ converges to μ in $(M_X, d)$ if, and only if, $\mu_n$ weakly converges to μ.

For the proof, see [PRO 56] or [BIL 99].

Before studying the convergence of Gaussian measures in H, we first deal with Gaussian measures in Euclidean space. We use the following fact: if $\xi_n \sim N(m_n, \sigma_n^2)$ converges in distribution to ξ, then $\xi \sim N(m, \sigma^2)$, with
\[ m = \lim_{n \to \infty} m_n, \quad \sigma^2 = \lim_{n \to \infty} \sigma_n^2 \]
(here it is possible that $\sigma^2 = 0$; in this case ξ = m a.s.).

THEOREM 5.7.– (Criterion for weak convergence of Gaussian measures in $\mathbb{R}^k$)
a) The class of all Gaussian measures in $\mathbb{R}^k$ is closed in $M_{\mathbb{R}^k}$ w.r.t. weak convergence.
b) Consider Gaussian measures $\mu_n$ in $\mathbb{R}^k$, with mean values $m_n$ and variance–covariance matrices $S_n$, n ≥ 1, and a Gaussian measure μ in $\mathbb{R}^k$, with mean value m and variance–covariance matrix S. The sequence $\{\mu_n\}$ weakly converges to μ if, and only if,
\[ m_n \to m \quad \text{and} \quad S_n \to S \quad \text{as } n \to \infty. \tag*{[5.10]} \]

PROOF.– a) Let a sequence $\mu_n$ of Gaussian measures in $\mathbb{R}^k$ weakly converge to $\mu \in M_{\mathbb{R}^k}$. Consider random vectors $X_n$, X with distributions $\mu_n$ and μ, respectively. Fix $a \in \mathbb{R}^k$. The Gaussian random variables $(X_n, a)$ converge in distribution to the random variable (X, a); hence (X, a) is Gaussian as well. Therefore, X is a Gaussian random vector and $\mu_X = \mu$ is a Gaussian measure.
b) Again consider random vectors $X_n$, X with distributions $\mu_n$ and μ, respectively. Assume that $\mu_n$ weakly converges to μ. Then $X_n$ converges to X in distribution. Fix $a \in \mathbb{R}^k$. It holds: $(X_n, a) \sim N((m_n, a), (S_n a, a))$ converges to $(X, a) \sim N((m, a), (Sa, a))$ in distribution. Hence
\[ (m_n, a) \to (m, a) \quad \text{and} \quad (S_n a, a) \to (Sa, a) \quad \text{as } n \to \infty. \]
Since the vector a is arbitrary and $S_n$, S are symmetric matrices, we get the desired [5.10].

Now, we prove the sufficiency of condition [5.10]. For $x \in \mathbb{R}^k$, the characteristic functions satisfy, as n → ∞:
\[ \hat\mu_n(x) = \exp\Big\{ i(x, m_n) - \frac{(S_n x, x)}{2} \Big\} \to \exp\Big\{ i(x, m) - \frac{(Sx, x)}{2} \Big\} = \hat\mu(x). \]
By Lévy's continuity theorem, $\mu_n$ weakly converges to μ.

It is interesting that the weak convergence of Gaussian measures in H is closely related to the convergence of S-operators in $S_1(H)$.

THEOREM 5.8.– (Criterion for weak convergence of Gaussian measures in H) Let H be a real separable infinite-dimensional Hilbert space and $M_G$ denote the class of all Gaussian measures in H. Fix an orthobasis $\{e_i\}$ in H.
a) $M_G$ is closed in $M_H$ w.r.t. weak convergence.
b) Consider Gaussian measures $\mu_n$, with mean values $m_n$ and correlation operators $S_n$, n ≥ 1, and a Gaussian measure μ, with mean value m and correlation operator S. The sequence $\mu_n$ weakly converges to μ if, and only if, two conditions hold:
1) $m_n$ strongly converges to m in H;
2) $\sum_{i=1}^\infty |(S_n e_i, e_i) - (S e_i, e_i)| \to 0$ as n → ∞, and $(S_n e_i, e_j) \to (S e_i, e_j)$ as n → ∞, for all i ≠ j.

The proof can be found in [PAR 05]. Note that condition 2 means that $S_n$ converges to S in nuclear norm (see definition 3.7).

COROLLARY 5.5.– Let H be the space from theorem 5.8. The sequence $\mu_n$ of Gaussian measures weakly converges to the Gaussian measure μ (here $\mu_n$ and μ
satisfy conditions of theorem 5.8) if, and only if, $m_n$ strongly converges to m in H and $S_n$ converges to S in nuclear norm.

PROOF.– Theorem 3.9 shows that the convergence of S-operators in condition 2 of theorem 5.8 is equivalent to the convergence in nuclear norm.

COROLLARY 5.6.– Let H be a real separable infinite-dimensional Hilbert space. Then the class $L_S(H)$ of all S-operators is a separable set in $S_1(H)$.

PROOF.– By theorem 5.6, $(M_H, d)$ is a separable metric space. Then the class $M_G^0$ of all centered Gaussian measures in H is a separable set in $M_H$, and there exists a countable set $Q \subset M_G^0$ which is dense in $M_G^0$ (w.r.t. the weak convergence).

Let $S \in L_S(H)$ and $\mu \in M_G^0$, with correlation operator S. Then there exists a sequence $\{\mu_n\} \subset Q$ which converges weakly to μ. By corollary 5.5, $\|S_n - S\|_1 \to 0$, where $S_n$ is the correlation operator of $\mu_n$. Thus, the countable set $\{S_\mu : \mu \in Q\}$ is dense in $L_S(H)$.

Problems 5.4

8) Let μ be a centered Gaussian measure in a real separable infinite-dimensional Hilbert space H, with non-zero covariance operator S. Let $\{\lambda_n\}$ be the eigenvalues of S (with multiplicity) and
\[ J_\alpha = \int_H \exp\{ \alpha \|x\|^2 \}\, d\mu(x), \quad \alpha \in \mathbb{R}. \]
Using problem (11) of Chapter 2, prove that $J_\alpha < \infty$ if, and only if, $\alpha < \frac{1}{2\|S\|}$. Show that in this case
\[ J_\alpha = \frac{1}{\sqrt{\prod_{n=1}^\infty (1 - 2\alpha\lambda_n)}}. \]

9) Let $\mu_n$ be Gaussian measures in a real separable infinite-dimensional Hilbert space H such that $\mu_n$ weakly converge to μ, which has non-zero correlation operator S. Consider a continuous function f : H → R such that for some ε > 0,
\[ |f(x)| \le \exp\Big\{ \frac{1 - \varepsilon}{2\|S\|} \cdot \|x\|^2 \Big\}, \quad x \in H. \]
Prove that
\[ \lim_{n \to \infty} \int_H f(x)\, d\mu_n(x) = \int_H f(x)\, d\mu(x). \]
Hint. Use problem (8) and also theorem 3.5 and condition [3.18], both from [BIL 99].
5.5. Exponential moments of Gaussian measure in normed space

We start with separable Hilbert spaces. Let μ be a centered Gaussian measure in $\mathbb{R}^n$, with non-zero variance–covariance matrix S, and let $\{\lambda_k, 1 \le k \le n\}$ be the eigenvalues of S (with multiplicity). The integral
\[ I_\alpha = \int_{\mathbb{R}^n} \exp\{ \alpha \|x\|^2 \}\, d\mu(x), \quad \alpha \in \mathbb{R}, \tag*{[5.11]} \]
is finite if, and only if, $\alpha < \frac{1}{2\|S\|}$. In this case,
\[ I_\alpha = \frac{1}{\sqrt{\prod_1^n (1 - 2\alpha\lambda_k)}}. \]
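A small numerical sketch can make the finiteness condition concrete. It relies on the one-dimensional identity E exp{αλγ²} = (1 − 2αλ)^{−1/2} for γ ∼ N(0, 1) and 2αλ < 1, together with the independence of the coordinates of a centered Gaussian vector in the eigenbasis of S. The function names and the sample values below are illustrative assumptions, not taken from the text; the quadrature routine is only a cross-check of the one-dimensional identity.

```python
import math


def exp_moment(alpha, lams):
    """Product over k of (1 - 2*alpha*lam_k)^(-1/2).

    Returns math.inf when some factor is non-positive, i.e. when
    alpha >= 1 / (2 * max(lams)) and the integral diverges.
    """
    if any(1.0 - 2.0 * alpha * lam <= 0.0 for lam in lams):
        return math.inf
    return math.prod(1.0 / math.sqrt(1.0 - 2.0 * alpha * lam) for lam in lams)


def exp_moment_1d_quadrature(alpha, lam, half_width=12.0, steps=20000):
    """Midpoint-rule value of E exp(alpha * lam * g^2) for g ~ N(0, 1).

    Only meaningful when 2 * alpha * lam < 1, so that the integrand decays.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps):
        g = -half_width + (j + 0.5) * h
        total += math.exp((alpha * lam - 0.5) * g * g)
    return total * h / math.sqrt(2.0 * math.pi)


closed_form = exp_moment(0.375, [1.0])            # 1 / sqrt(1 - 0.75)
numeric = exp_moment_1d_quadrature(0.375, 1.0)
print(closed_form, numeric, exp_moment(0.5, [1.0]))
```

At the threshold α = 1/(2‖S‖) the function returns infinity, in line with the statement that the integral is finite only for α strictly below that value.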
This statement can be obtained from problem (8) of this chapter if we consider the covariance operator $\tilde S$ in H, with $\dim R(\tilde S) \le n$.

In problem (8), a similar result is stated for a real separable infinite-dimensional Hilbert space. In this section, we consider an integral like [5.11] in a separable normed space. We are not able to evaluate it, but we prove that it is finite for a small enough α.

5.5.1. Gaussian measures in normed space

Let X be a real separable normed space. We extend definitions 2.7 and 2.8 to objects in X.

DEFINITION 5.5.– A random element ξ distributed in X is called Gaussian if for each $x^* \in X^*$, $\langle \xi, x^* \rangle$ is a Gaussian r.v. (possibly with zero variance). A probability measure μ on B(X) is called Gaussian if there exists a Gaussian random element ξ in X such that its distribution $\mu_\xi = \mu$.

LEMMA 5.7.– Let μ be a Gaussian measure in a real separable Banach space B. Then there exist the mean value $m_\mu \in B$ (where the integral $m_\mu = \int_B x\, d\mu(x)$ is understood in the weak sense), and the covariance and correlation operators $A_\mu, S_\mu : B^* \to B$.

PROOF.– Let $x^* \in B^*$ and ξ be a Gaussian random element in B with distribution $\mu_\xi = \mu$. Then $\langle \xi, x^* \rangle$ is a Gaussian r.v., and
\[ \int_B |\langle x, x^* \rangle|^2\, d\mu(x) = E\, |\langle \xi, x^* \rangle|^2 < \infty. \]
Now, since B is separable, the statement follows from remarks 3.8 and 3.9.
LEMMA 5.8.– Let μ be a Gaussian measure in a real separable normed space X. Then all first weak moments $\sigma_1(x^*)$ of μ are finite and the characteristic functional equals
\[ \varphi_\mu(x^*) = \exp\Big\{ i\sigma_1(x^*) - \frac12 \langle S_\mu x^*, x^* \rangle \Big\}, \quad x^* \in X^*, \]
where $S_\mu : X^* \to X^{**}$ is the correlation operator of μ. Thus, a Gaussian measure in X is uniquely defined by $\{\sigma_1(x^*) : x^* \in X^*\}$ and $S_\mu$.

PROOF.– Let ξ be a Gaussian random element in X with distribution μ. As in the proof of lemma 5.7, it follows that all second moments $\sigma_2(x_1^*, x_2^*)$ of μ are finite; hence $\sigma_1(x^*)$ are finite as well. The correlation operator $S_\mu : X^* \to X^{**}$ of μ exists by corollary 3.4. We have
\[ E \langle \xi, x^* \rangle = \sigma_1(x^*), \quad D \langle \xi, x^* \rangle = \sigma_2(x^*, x^*) - \sigma_1^2(x^*) = \langle S_\mu x^*, x^* \rangle. \]
Hence $\langle \xi, x^* \rangle \sim N(\sigma_1(x^*), \langle S_\mu x^*, x^* \rangle)$. Therefore,
\[ \varphi_\mu(x^*) = E\, e^{i \langle \xi, x^* \rangle} = \varphi_{\langle \xi, x^* \rangle}(1) = \exp\Big\{ i\sigma_1(x^*) - \frac12 \langle S_\mu x^*, x^* \rangle \Big\}. \]
By corollary 4.1, a Borel probability measure in a separable X is uniquely defined by its characteristic functional $\varphi_\mu$, and the last statement of lemma 5.8 follows.

In particular, when there exists the mean value $m_\mu$ of a Gaussian measure μ (e.g. if X is a real separable Banach space), then
\[ \varphi_\mu(x^*) = \exp\Big\{ i \langle m_\mu, x^* \rangle - \frac12 \langle S_\mu x^*, x^* \rangle \Big\}, \]
and as in Hilbert space, μ is uniquely defined by the pair $(m_\mu, S_\mu)$.

EXAMPLE 5.3.– (Measure generated by Gaussian process) Let (Ω, F, P) be a complete probability space and ξ : Ω × [0, T] → R be a measurable stochastic process, i.e. ξ is measurable w.r.t. the sigma-algebra $F \otimes S_T$, which is generated by measurable rectangles F × A, F ∈ F, $A \in S_T$ (here $S_T$ denotes the class of Lebesgue measurable subsets of [0, T]). Assume additionally that for some real p ∈ [1, ∞),
\[ \int_0^T E\, |\xi(t)|^p\, dt < \infty. \]
Then
\[ \int_0^T |\xi(t)|^p\, dt < \infty \quad \text{a.s.}, \]
and the path $\xi(\cdot, \omega) \in L_p[0, T]$ a.s. The space $B = L_p[0, T]$ is a real separable Banach space. Due to the measurability of ξ, X : ω → ξ(·, ω) is a random element in B with distribution
\[ \mu(A) = P\{ \omega : \xi(\cdot, \omega) \in A \}, \quad A \in B(B) \]
(it is called the distribution of ξ in the space of paths).

Assume additionally that ξ is a Gaussian stochastic process, i.e. for each n ≥ 1 and $t_1, \dots, t_n \in [0, T]$, the random vector $(\xi_{t_1}, \dots, \xi_{t_n})$ is Gaussian. Let q ∈ (1, ∞] be the conjugate index, i.e. $p^{-1} + q^{-1} = 1$ (if p = 1, then q = ∞). For each $x^* \in L_q[0, T] = B^*$, the r.v.
\[ \eta = \int_0^T \xi(t) x^*(t)\, dt \tag*{[5.12]} \]
is Gaussian. To prove this, first note that in the case where 1 < p < ∞,
\[ E \Big( \int_0^T |\xi(t) x^*(t)|\, dt \Big)^p \le \int_0^T E\, |\xi(t)|^p\, dt \cdot \Big( \int_0^T |x^*(t)|^q\, dt \Big)^{p/q} < \infty, \]
and in the case where p = 1,
\[ E \int_0^T |\xi(t) x^*(t)|\, dt \le \int_0^T E\, |\xi(t)|\, dt \cdot \operatorname*{ess\,sup}_{0 \le t \le T} |x^*(t)| < \infty; \]
in both cases $E \int_0^T |\xi(t) x^*(t)|\, dt < \infty$, and by Fubini's theorem η is indeed a r.v. Next, denote
\[ x_n^* = x_n^*(t) = x^* \cdot I_{A_n}, \quad A_n = \{ t : (x^*(t))^2 \cdot E\, \xi^2(t) \le n \}. \]
It is enough to prove that η is Gaussian, with $x_n^*$ instead of $x^*$. In this case
\[ \int_0^T E\, \xi^2(t) \cdot (x_n^*(t))^2\, dt \le nT < \infty, \]
and $\eta \in L_2(\Omega, P)$. Denote by G the subspace in the Hilbert space $L_2(\Omega, P)$ generated by the random variables ξ(t, ·), t ∈ [0, T]. Since G consists of Gaussian random variables, we have to show that η ∈ G. Take the decomposition $\eta = \eta_1 + \eta_2$ with $\eta_1 \in G$, $\eta_2 \perp G$. By Fubini's theorem,
\[ E\, \eta_2^2 = E\, \eta \eta_2 = \int_0^T \Big[ \int_\Omega \xi(t, \omega) \eta_2(\omega)\, dP(\omega) \Big] x_n^*(t)\, dt = 0; \]
hence $\eta_2 = 0$ a.s. and [5.12] is indeed a Gaussian r.v. Therefore, μ is a Gaussian measure in B. Its mean value is
\[ m(t) = E\, \xi(t), \quad 0 \le t \le T, \quad m(\cdot) \in L_p[0, T]. \]
Its correlation operator $S : L_q[0, T] \to L_p[0, T]$ for each $x^* \in L_q[0, T]$ and $y^* \in L_q[0, T]$ satisfies the relation
\[ \langle S x^*, y^* \rangle = \int_B \langle x - m, x^* \rangle \langle x - m, y^* \rangle\, d\mu(x) = \int_{[0,T]^2} E (\xi(t) - m(t))(\xi(s) - m(s))\, x^*(t) y^*(s)\, dt\, ds = \int_{[0,T]^2} r(t, s)\, x^*(t) y^*(s)\, dt\, ds. \]
Hence
\[ (S x^*)(t) = \int_0^T r(t, s) x^*(s)\, ds, \quad 0 \le t \le T, \]
where r(t, s) is the correlation function of ξ:
\[ r(t, s) = E (\xi(t) - m(t))(\xi(s) - m(s)), \quad t, s \in [0, T]. \]
By Fubini's theorem, the latter expectation is finite for almost all $(t, s) \in [0, T]^2$ and defines a Lebesgue measurable function r(t, s). The characteristic functional of μ is as follows:
\[ \varphi_\mu(x^*) = \exp\Big\{ i \int_0^T m(t) x^*(t)\, dt - \frac12 \int_{[0,T]^2} r(t, s)\, x^*(t) x^*(s)\, dt\, ds \Big\}, \quad x^* \in L_q[0, T]. \]

LEMMA 5.9.– (Characterization of Gaussian random element in normed space) Let ξ and η be independent random elements in a real separable normed space X.
a) If ξ and η are identically distributed Gaussian with zero mean, then $\frac{1}{\sqrt 2}(\xi + \eta)$ and $\frac{1}{\sqrt 2}(\xi - \eta)$ are independent copies of ξ (i.e. they are independent and have the same distribution as ξ).
b) If ξ + η and ξ − η are independent, then ξ and η are Gaussian.
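PROOF.– a) Denote $\alpha = \frac{1}{\sqrt 2}(\xi + \eta)$, $\beta = \frac{1}{\sqrt 2}(\xi - \eta)$. Since X is separable, the couple (α; β) is a random element in X × X (see part (a) of the proof of lemma 3.6). Find its characteristic functional:
\[ \varphi_{(\alpha;\beta)}(x^*, y^*) = E \exp\Big\{ \frac{i}{\sqrt 2} \langle \xi + \eta, x^* \rangle + \frac{i}{\sqrt 2} \langle \xi - \eta, y^* \rangle \Big\} = \]
\[ = E \exp\Big\{ i \Big\langle \xi, \frac{x^* + y^*}{\sqrt 2} \Big\rangle \Big\} \cdot E \exp\Big\{ i \Big\langle \eta, \frac{x^* - y^*}{\sqrt 2} \Big\rangle \Big\} = \varphi_\xi\Big( \frac{x^* + y^*}{\sqrt 2} \Big) \varphi_\xi\Big( \frac{x^* - y^*}{\sqrt 2} \Big) = \]
\[ = \exp\Big\{ -\frac14 \big[ \sigma_2(x^* + y^*, x^* + y^*) + \sigma_2(x^* - y^*, x^* - y^*) \big] \Big\}. \]
Here
\[ \sigma_2(u^*, v^*) = \int_X \langle x, u^* \rangle \cdot \langle x, v^* \rangle\, d\mu(x), \]
$\sigma_2$ is a symmetric bilinear form on $X^*$, and we used the condition that ξ and η are i.i.d. random elements. Next, by the parallelogram identity:
\[ \varphi_{(\alpha;\beta)}(x^*, y^*) = \exp\Big\{ -\frac12 \big[ \sigma_2(x^*, x^*) + \sigma_2(y^*, y^*) \big] \Big\} = \varphi_\xi(x^*)\, \varphi_\eta(y^*) = \varphi_{(\xi;\eta)}(x^*, y^*). \]
By corollary 4.1, the couple (α; β) and the couple (ξ; η) have equal distributions in X × X. Hence α and β are independent and have the same distribution as ξ.

b) For $x^* \in X^*$, the random variables $\langle \xi + \eta, x^* \rangle = \langle \xi, x^* \rangle + \langle \eta, x^* \rangle$ and $\langle \xi - \eta, x^* \rangle = \langle \xi, x^* \rangle - \langle \eta, x^* \rangle$ are independent; moreover, $\langle \xi, x^* \rangle$ and $\langle \eta, x^* \rangle$ are independent as well. By the characterization of the normal law, $\langle \xi, x^* \rangle$ and $\langle \eta, x^* \rangle$ are Gaussian (see [MAT 77]); hence ξ and η are Gaussian random elements.

5.5.2. Fernique's theorem

Now, we state a famous result about the exponential integrability of the norm.

THEOREM 5.9.– (Fernique's theorem) Let μ be a centered Gaussian measure in a real separable normed space X. Select τ > 0 such that $c = \mu(B(0, \tau)) > \frac12$. Fix
\[ \alpha_0 = \begin{cases} \dfrac{1}{4(1 + \sqrt 2)^2 \tau^2} \log \dfrac{c}{1 - c} & \text{if } c < 1, \\ +\infty & \text{if } c = 1. \end{cases} \]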
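Then for each $0 < \alpha < \alpha_0$,
$$\int_X \exp(\alpha \|x\|^2)\, d\mu(x) \le I(\tau, c, \alpha) < \infty,$$
where $I(\tau, c, \alpha)$ depends only on $\tau, c, \alpha$ and is continuous in $\tau > 0$, $c \in (\frac{1}{2}, 1]$, $\alpha \in (0, \alpha_0(\tau, c))$. If $c = 1$, then $\mu = \delta_0$, the Dirac measure at zero.

PROOF.– a) Key inequality

Let $\xi$ be a Gaussian random element in $X$ with distribution $\mu$, and $\eta$ be an independent copy of $\xi$ defined on the same probability space. By lemma 5.9, $\frac{\xi + \eta}{\sqrt{2}}$ and $\frac{\xi - \eta}{\sqrt{2}}$ are independent copies of $\xi$.

Fix two points $0 < s < t$. We have
$$P\{\|\xi\| \le s\} \cdot P\{\|\eta\| > t\} = P\left\{ \left\| \frac{\xi - \eta}{\sqrt{2}} \right\| \le s \right\} \cdot P\left\{ \left\| \frac{\xi + \eta}{\sqrt{2}} \right\| > t \right\} = P\{ \|\xi - \eta\| \le s\sqrt{2},\ \|\xi + \eta\| > t\sqrt{2} \}$$
$$\le P\{ |\|\xi\| - \|\eta\|| \le s\sqrt{2},\ \|\xi\| + \|\eta\| > t\sqrt{2} \} \le P\left\{ \|\xi\| > \frac{t - s}{\sqrt{2}},\ \|\eta\| > \frac{t - s}{\sqrt{2}} \right\} = P^2\left\{ \|\xi\| > \frac{t - s}{\sqrt{2}} \right\}.$$
Here, we used the following inclusion, which is geometrically evident for $0 < s < t$:
$$\{(a; b) \in [0, \infty)^2 : |a - b| \le s\sqrt{2},\ a + b > t\sqrt{2}\} \subset \left( \frac{t - s}{\sqrt{2}}, \infty \right) \times \left( \frac{t - s}{\sqrt{2}}, \infty \right).$$
Therefore,
$$P\{\|\xi\| \le s\} \cdot P\{\|\xi\| > t\} \le P^2\left\{ \|\xi\| > \frac{t - s}{\sqrt{2}} \right\}.$$ [5.13]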
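b) Bound for the tail of the distribution of $\xi$

Since $\mu(B(0, r)) \to 1$ as $r \to \infty$, there exists a positive $\tau$ such that $c := \mu(B(0, \tau)) > \frac{1}{2}$. Then
$$\alpha_0 := \frac{1 - c}{c} = \frac{\mu(\{x : \|x\| > \tau\})}{\mu(B(0, \tau))} < 1.$$
If $c = 1$, then $\|\xi\| \le \tau$ a.s., so $\langle \xi, x^* \rangle$ is a bounded centered Gaussian r.v. for all $x^* \in X^*$; hence $\langle \xi, x^* \rangle = 0$ a.s. for all $x^* \in X^*$. Then $\sigma_2(x^*, x^*) = E |\langle \xi, x^* \rangle|^2 = 0$, $\varphi_\xi(x^*) = \exp\{-\frac{1}{2} \sigma_2(x^*, x^*)\} = 1$, and $\xi = 0$ a.s., since in the separable space $X$ the characteristic functional uniquely defines the distribution. Thus, in this case $\mu = \delta_0$ and
$$\int_X \exp(\alpha \|x\|^2)\, d\mu(x) = 1 \quad \text{for all } \alpha.$$
Now, we may and do assume that $c < 1$. Consider the sequence
$$t_0 = \tau, \quad t_{n+1} = \tau + t_n \sqrt{2}, \quad n \ge 0.$$
We have consecutively
$$t_1 = \tau + \sqrt{2}\tau, \quad t_2 = \tau + \sqrt{2}\tau + 2\tau, \quad \dots,$$
$$t_n = (1 + \sqrt{2} + \dots + (\sqrt{2})^n)\tau = \frac{(\sqrt{2})^{n+1} - 1}{\sqrt{2} - 1}\, \tau = ((\sqrt{2})^{n+1} - 1)(\sqrt{2} + 1)\tau, \quad n \ge 0.$$
From [5.13] with $s = \tau$ and $t = t_n$, we obtain for $n \ge 1$:
$$P\{\|\xi\| \le \tau\} \cdot P\{\|\xi\| > t_n\} \le P^2\{\|\xi\| > t_{n-1}\}.$$
Denote
$$\alpha_n = \frac{P\{\|\xi\| > t_n\}}{P\{\|\xi\| \le \tau\}}, \quad n \ge 1.$$
Then $\alpha_n \le \alpha_{n-1}^2$, $n \ge 1$. Hence
$$\alpha_1 \le \alpha_0^2, \quad \alpha_2 \le (\alpha_0^2)^2 = \alpha_0^4, \quad \alpha_3 \le (\alpha_0^4)^2 = \alpha_0^8, \quad \dots,$$
$$\alpha_n \le \alpha_0^{2^n} = \left( \frac{1 - c}{c} \right)^{2^n}, \qquad P\{\|\xi\| > t_n\} \le c \left( \frac{1 - c}{c} \right)^{2^n}, \quad n \ge 0.$$

c) Bound for the exponential moment

From now on, $\alpha_0$ again denotes the constant from the statement of the theorem (not the ratio $(1 - c)/c$ used in part (b)):
$$\alpha_0 = \frac{1}{4(1 + \sqrt{2})^2 \tau^2} \log \frac{c}{1 - c}.$$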
For $0 < \alpha < \alpha_0$, we have:
$$\int_X \exp(\alpha \|x\|^2)\, d\mu \le \int_{B(0,\tau)} \exp(\alpha \|x\|^2)\, d\mu + \sum_{n=0}^{\infty} \exp(\alpha t_{n+1}^2)\, \mu(t_n < \|x\| \le t_{n+1})$$
$$\le c \exp(\alpha \tau^2) + \sum_{n=0}^{\infty} \exp(\alpha t_{n+1}^2)\, \mu(\|x\| > t_n) \le c \exp(\alpha \tau^2) + c \sum_{n=0}^{\infty} \left( \frac{1 - c}{c} \right)^{2^n} \exp(\alpha \tau^2 (1 + \sqrt{2})^2 \cdot 2^{n+2})$$
$$= c \exp(\alpha \tau^2) + c \sum_{n=0}^{\infty} \exp\left( 2^n \left[ \log \frac{1 - c}{c} + 4 \alpha \tau^2 (1 + \sqrt{2})^2 \right] \right) =: I(\tau, c, \alpha).$$
We see that for $\alpha < \alpha_0$ the series converges and $I(\tau, c, \alpha) < \infty$. For $c = 1$, we set $I(\tau, 1, \alpha) = \exp(\alpha \tau^2) > 1$. Then $I(\tau, c, \alpha)$ bounds the exponential moment from above and is continuous in $\tau > 0$, $c \in (\frac{1}{2}, 1]$, $\alpha \in (0, \alpha_0)$.

COROLLARY 5.7.– Let $\mu$ be a Gaussian measure in a real separable Banach space $B$. Then there exists the mean value $m_\mu = \int_B x\, d\mu(x)$, where the integral is understood in a strong sense.

PROOF.– The mean value $m_\mu = \int_B x\, d\mu(x)$ exists in a weak sense due to lemma 5.7. Let $\xi$ be a Gaussian random element in $B$ with distribution $\mu$. Then $\xi - m_\mu$ is a centered Gaussian random element in the separable space $B$, and theorem 5.9 implies that $E \|\xi - m_\mu\| < \infty$. Then $E \|\xi\| \le \|m_\mu\| + E \|\xi - m_\mu\| < \infty$, and since $B$ is a separable Banach space, $m_\mu = E \xi$ in a strong sense.

COROLLARY 5.8.– For the Gaussian process $\xi$ from example 5.2, it holds that for all $r \in [1, +\infty)$,
$$\int_0^T |\xi(t, \cdot)|^p\, dt \in L_r(\Omega, P).$$
PROOF.– Fix $r \in [1, +\infty)$. Consider the Gaussian element $X$ in $L_p[0, T]$ constructed in example 5.2. Its mean value $m \in L_p[0, T]$ equals $m(t) = E \xi(t)$, $t \in [0, T]$ (for almost all $t$ w.r.t. Lebesgue measure). Then $X - m$ is a centered Gaussian element in $L_p[0, T]$, and theorem 5.9 implies that $E \|X - m\|_p^{pr} < \infty$, where $\|z\|_p$ stands for the norm in $L_p[0, T]$. Finally, since $\|X\|_p \le \|X - m\|_p + \|m\|_p$,
$$E \|X\|_p^{pr} = E \left( \int_0^T |\xi(t)|^p\, dt \right)^r < \infty.$$
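Theorem 5.9 can be sanity-checked in the simplest case $X = \mathbb{R}$, $\mu = N(0, 1)$, where the exponential moment is known in closed form: $E \exp(\alpha \xi^2) = (1 - 2\alpha)^{-1/2}$ for $\alpha < 1/2$. The sketch below is only an illustration under these assumptions; the function names `fernique_alpha0` and `fernique_bound` and the choice $\tau = 1.5$ are ours, and the series defining $I(\tau, c, \alpha)$ is truncated after finitely many terms (the remaining terms are negligible because they decay doubly exponentially).

```python
import math


def fernique_alpha0(tau: float, c: float) -> float:
    """Threshold alpha_0(tau, c) from the statement of Theorem 5.9 (1/2 < c <= 1)."""
    if c == 1.0:
        return math.inf
    return math.log(c / (1.0 - c)) / (4.0 * (1.0 + math.sqrt(2.0)) ** 2 * tau ** 2)


def fernique_bound(tau: float, c: float, alpha: float, n_terms: int = 60) -> float:
    """Upper bound I(tau, c, alpha) from part (c) of the proof, series truncated."""
    if c == 1.0:
        return math.exp(alpha * tau ** 2)
    if not 0.0 < alpha < fernique_alpha0(tau, c):
        raise ValueError("alpha must lie in (0, alpha_0(tau, c))")
    # Exponent rate inside the series; it is negative exactly when alpha < alpha_0.
    rate = math.log((1.0 - c) / c) + 4.0 * alpha * tau ** 2 * (1.0 + math.sqrt(2.0)) ** 2
    series = sum(math.exp(2.0 ** n * rate) for n in range(n_terms))
    return c * math.exp(alpha * tau ** 2) + c * series


if __name__ == "__main__":
    tau = 1.5                                # hypothetical choice of radius
    c = math.erf(tau / math.sqrt(2.0))       # c = P(|xi| <= tau) for xi ~ N(0, 1)
    alpha0 = fernique_alpha0(tau, c)
    alpha = 0.5 * alpha0
    exact = (1.0 - 2.0 * alpha) ** -0.5      # closed-form E exp(alpha * xi^2)
    print(c, alpha0, exact, fernique_bound(tau, c, alpha))
```

In this one-dimensional case the bound is far from sharp, as expected: $\alpha_0$ is much smaller than the true integrability threshold $1/2$, because the theorem uses only the single number $c = \mu(B(0, \tau))$.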
Recall that the weak convergence of Gaussian measures in Euclidean and Hilbert spaces was studied in section 5.4. See definition 5.3 of weak convergence of probability measures in a metric space. We need some facts about the weak convergence of measures stated, e.g. in [BIL 99].

Let $\Gamma$ be a family of Borel probability measures in a metric space $(X, \rho)$. $\Gamma$ is called relatively compact if for each sequence $\{\mu_n\} \subset \Gamma$ there exist a subsequence $\{\mu_{n_k}\}$ and a Borel probability measure $\mu$ such that $\{\mu_{n_k}\}$ converges weakly to $\mu$. $\Gamma$ is called a tight family if for each $\varepsilon > 0$ there exists a compact set $K_\varepsilon \subset X$ such that for each $\mu \in \Gamma$, $\mu(K_\varepsilon^c) < \varepsilon$. Prokhorov’s theorem states that in case $(X, \rho)$ is a complete and separable space, $\Gamma$ is relatively compact if, and only if, $\Gamma$ is tight.

Theorem 3.5 and condition [3.18], both from [BIL 99], imply the following: let $(X, \rho)$ be a metric space, $\{\mu_n\}$ be Borel probability measures that weakly converge to $\mu$ and $f \in C(X)$ be such that for some $\varepsilon > 0$, $\sup_{n \ge 1} \int_X |f|^{1+\varepsilon}\, d\mu_n < \infty$. Then
$$\int_X f\, d\mu_n \to \int_X f\, d\mu \quad \text{as } n \to \infty.$$
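THEOREM 5.10.– Let $\{\mu_n\}$ be a sequence of Gaussian measures in a real separable Banach space $B$ such that $\{\mu_n\}$ converges weakly to $\mu$. Then:
a) $\mu$ is a Gaussian measure as well and the mean values $m_{\mu_n}$ strongly converge to $m_\mu$ in $B$.
b) There exists $\alpha^* > 0$ such that
$$\sup_{n \ge 1} \int_B \exp(\alpha^* \|x\|^2)\, d\mu_n(x) < \infty;$$
hence for each $\alpha < \alpha^*$,
$$\int_B \exp(\alpha \|x\|^2)\, d\mu_n(x) \to \int_B \exp(\alpha \|x\|^2)\, d\mu(x) \quad \text{as } n \to \infty,$$
for each $r > 0$,
$$\int_B \|x\|^r\, d\mu_n(x) \to \int_B \|x\|^r\, d\mu(x) \quad \text{as } n \to \infty,$$
and for each $\lambda \in \mathbb{R}$,
$$\int_B e^{\lambda \|x\|}\, d\mu_n(x) \to \int_B e^{\lambda \|x\|}\, d\mu(x) \quad \text{as } n \to \infty.$$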
PROOF.– a1) We prove that $\mu$ is Gaussian.

Let $X_n$ and $X$ have distributions $\mu_n$ and $\mu$, respectively. Since $X_n$ converge in distribution to $X$ and $x^* \in B^*$ is a continuous functional, it holds $\langle X_n, x^* \rangle \xrightarrow{d} \langle X, x^* \rangle$. Because $\langle X_n, x^* \rangle$ is a Gaussian r.v., the limit r.v. $\langle X, x^* \rangle$ is Gaussian as well. Hence $X$ is a Gaussian random element, and $\mu = \mu_X$ is a Gaussian measure. Moreover, the convergence in distribution of the Gaussian random variables $\langle X_n, x^* \rangle$ implies that
$$\langle m_{\mu_n}, x^* \rangle = E \langle X_n, x^* \rangle \to E \langle X, x^* \rangle = \langle m_\mu, x^* \rangle \quad \text{as } n \to \infty.$$
Therefore, $m_{\mu_n}$ converge weakly to $m_\mu$ in $B$.

a2) We prove that $\{m_{\mu_n}, n \ge 1\}$ is relatively compact in $B$, i.e. the closure of $\{m_{\mu_n}\}$ is compact in $B$.

By Prokhorov’s theorem, the family of distributions $\{\mu_n = \mu_{X_n}\}$ is tight. We denote by $\Gamma_1$ and $\Gamma_2$ the sets of distributions of $\frac{X_n}{\sqrt{2}}$ and $-\frac{X_n}{\sqrt{2}}$, respectively. Then both $\Gamma_1$ and $\Gamma_2$ are tight. Introduce
$$\Gamma_0 = \{\nu_1 * \nu_2 : \nu_1 \in \Gamma_1,\ \nu_2 \in \Gamma_2\}.$$
It is tight. Indeed, for given $\varepsilon > 0$, there exist compact sets $K_1$ and $K_2$ with $\nu_1(K_1) > 1 - \varepsilon$ and $\nu_2(K_2) > 1 - \varepsilon$ for all $\nu_1 \in \Gamma_1$, $\nu_2 \in \Gamma_2$. Then $K_0 := K_1 + K_2 = \{x_1 + x_2 : x_1 \in K_1, x_2 \in K_2\}$ is compact as well and
$$(\nu_1 * \nu_2)(K_0) \ge \int_{K_2} \nu_1(K_0 - x)\, d\nu_2(x) \ge (1 - \varepsilon)^2,$$
because $K_1 \subset K_0 - x$, $x \in K_2$. Thus, $\Gamma_0$ is tight.

Let $Y_n$ be an independent copy of $X_n$ and $Z_n = \frac{X_n - Y_n}{\sqrt{2}}$. Then $Z_n$ is a copy of $X_n - m_{\mu_n}$ (see lemma 5.9), and its distribution belongs to $\Gamma_0$, the tight family.

Hence there exists a compact set $K$ such that
$$\mu_n(K) > \frac{1}{2}, \quad \mu_n(K - m_{\mu_n}) > \frac{1}{2}, \quad \text{for all } n \ge 1.$$
Now,
$$\mu_n(K \cap (K - m_{\mu_n})) = \mu_n(K) + \mu_n(K - m_{\mu_n}) - \mu_n(K \cup (K - m_{\mu_n})) > 1 - \mu_n(K \cup (K - m_{\mu_n})) \ge 0.$$
Therefore, $K \cap (K - m_{\mu_n}) \ne \emptyset$ and $m_{\mu_n} \in K - K$. The latter set is compact; hence $\{m_{\mu_n}\}$ is indeed relatively compact. This fact and the weak convergence $m_{\mu_n} \to m_\mu$ imply that $m_{\mu_n} \to m_\mu$ strongly in $B$.

b1) Assume additionally that $\mu_n$ are centered and prove the existence of $\alpha^*$.

Select $\tau > 0$ such that $c := \mu(B(0, \tau)) > \frac{1}{2}$ and moreover the boundary $\partial B(0, \tau) = \{x : \|x\| = \tau\}$ has $\mu$-measure zero (this is possible because the set of $\tau$’s with $\partial B(0, \tau)$ of positive measure is at most countable). Then $c_n := \mu_n(B(0, \tau)) \to c$ as $n \to \infty$. We may and do assume that $c_n \ge \frac{1}{2} + \delta$ for all $n \ge 1$, with some $\delta > 0$. Then for all $\alpha < \alpha_0(\tau, c_n)$, it holds
$$I_n(\alpha) := \int_B \exp(\alpha \|x\|^2)\, d\mu_n(x) \le I(\tau, c_n, \alpha),$$
where $\alpha_0 = \alpha_0(\tau, c)$ and $I(\tau, c, \alpha)$ are given in theorem 5.9. Note that $\alpha_0(\tau, c)$ is increasing in $c$. Hence for all positive $\alpha < \alpha_0(\tau, \frac{1}{2} + \delta)$ and all $n \ge 1$,
$$I_n(\alpha) \le \max_{\frac{1}{2} + \delta \le c \le 1} I(\tau, c, \alpha) < \infty.$$
The latter maximum exists because $I(\tau, c, \alpha)$ is continuous in $c$.

b2) Show the existence of $\alpha^*$ in the general case.

The couple $(X_n; m_{\mu_n})$ converges in distribution to $(X; m_\mu)$ as random elements in $B \times B$, because the first and the second components are stochastically independent and $X_n \xrightarrow{d} X$, $m_{\mu_n} \to m_\mu$ strongly in $B$. Hence
$$\varphi(X_n, m_{\mu_n}) = X_n - m_{\mu_n} \xrightarrow{d} \varphi(X, m_\mu) = X - m_\mu.$$
Here, $\varphi(u, v) = u - v$, $u, v \in B$, is a continuous function. The Gaussian random elements $X_n - m_{\mu_n}$ are centered. By part (b1) of the proof, there exists $\alpha_1 > 0$ such that
$$\sup_{n \ge 1} E \exp(\alpha_1 \|X_n - m_{\mu_n}\|^2) < \infty.$$
Since the sequence $\{m_{\mu_n}, n \ge 1\}$ is bounded, the desired relation follows: for all $\alpha^* < \alpha_1$,
$$\sup_{n \ge 1} E \exp(\alpha^* \|X_n\|^2) < \infty$$
(for details, see the solution to problem (12) of this chapter).
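b3) Fix $\alpha < \alpha^*$, $r > 0$, $\lambda \in \mathbb{R}$ and let
$$f_1(x) = \exp(\alpha \|x\|^2), \quad f_2(x) = \|x\|^r, \quad f_3(x) = e^{\lambda \|x\|}, \quad x \in B.$$
Then for $\varepsilon = \frac{\alpha^*}{\alpha} - 1$,
$$f_i^{1+\varepsilon}(x) \le c_i \exp(\alpha^* \|x\|^2), \quad x \in B, \quad i = 1, 2, 3,$$
with certain positive constants $c_i$ (in particular $c_1 = 1$). Hence
$$\sup_{n \ge 1} E f_i^{1+\varepsilon}(X_n) \le c_i \sup_{n \ge 1} E \exp(\alpha^* \|X_n\|^2) < \infty, \quad i = 1, 2, 3.$$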
This relation and the weak convergence of $\mu_n$ to $\mu$ imply the desired convergence of integrals (see the statement above theorem 5.10).

Problems 5.5

10) Let $\xi$ and $\eta$ be i.i.d. centered Gaussian random elements in a real separable normed space $X$, defined on a single probability space. For $\varphi \in \mathbb{R}$, prove that $\xi \sin \varphi + \eta \cos \varphi$ and $\xi \cos \varphi - \eta \sin \varphi$ are independent copies of $\xi$.

11) Let $\mu$ be a centered Gaussian measure in a real separable infinite-dimensional Hilbert space $H$, with non-zero covariance operator $S$. For $t > 0$, prove that
$$\int_H \|x\|^{2t}\, d\mu(x) < t^t e^{-t} \cdot \min_{0 < \alpha < \frac{1}{2\|S\|}} \left( \alpha^{-t} (1 - 2\alpha \|S\|)^{-\frac{\mathrm{tr}\, S}{2\|S\|}} \right) = […]$$

12) […] $\alpha > 0$ such that $\int_X \exp\{\alpha \|x\|^2\}\, d\mu(x) < \infty$.

13) Let $\{\xi_n\}$ be a sequence of Gaussian random elements in a real separable Banach space $B$ such that $\|\xi_n - z\| \xrightarrow{P} 0$, where $z \in B$ is a fixed vector. Let $\alpha > 0$ and $f : B \to \mathbb{R}$ be a Borel function which is continuous at $z$ with $|f(x)| \le e^{\alpha \|x\|^2}$, $x \in B$. Prove that $E f(\xi_n) \to f(z)$ as $n \to \infty$.

14) Let $(\Omega, \mathcal{F}, P)$ be a complete probability space, $(T, \mathcal{S}, \sigma)$ be a measure space with complete $\sigma$-finite measure $\sigma$ and $\xi : \Omega \times T \to \mathbb{R}$ be a measurable stochastic process, i.e. $\xi$ is measurable w.r.t. the sigma-algebra $\mathcal{F} \otimes \mathcal{S}$, which is generated by measurable rectangles $F \times A$, $F \in \mathcal{F}$, $A \in \mathcal{S}$. Assume additionally that for some real $p \in [1, +\infty)$, $E \int_T |\xi(t)|^p\, d\sigma(t) < \infty$. Let $B = L_p(T, \sigma)$.
a) Prove that $\mu(A) = P\{\omega : \xi(\cdot, \omega) \in A\}$, $A \in \mathcal{B}(B)$, is a probability measure.
b) If $\xi$ is a Gaussian stochastic process, i.e. for each $n \ge 1$ and $t_1, \dots, t_n \in T$, the random vector $(\xi_{t_1}, \dots, \xi_{t_n})^T$ is Gaussian, then $\mu$ is a Gaussian measure in $B$; moreover, $\int_T |\xi(t, \cdot)|^p\, d\sigma(t) \in L_r(\Omega, P)$ for all $r \in [1, +\infty)$.
6 Equivalence and Singularity of Gaussian Measures
In this chapter, we prove Kakutani’s fundamental theorem on the absolute continuity or mutual singularity of product measures on $\mathbb{R}^\infty$. We give a criterion for the equivalence of Gaussian product measures on $\mathbb{R}^\infty$ and apply it to obtain a simple version of the Feldman–Hájek dichotomy on the equivalence or mutual singularity of Gaussian measures on a separable Hilbert space $H$. The latter result is applied to the estimation of an unknown mean of a Gaussian random element in $H$ and to testing a hypothesis about this mean, as well as to testing a hypothesis about the correlation operator of a centered Gaussian random element.

6.1. Uniformly integrable sequences

We recall the properties of a uniformly integrable sequence of random variables. This concept is convenient for justifying passage to the limit under the expectation sign.

DEFINITION 6.1.– A sequence $\{X_n\}$ of random variables is called uniformly integrable if
$$\sup_{n \ge 1} E |X_n| \cdot I(|X_n| \ge \alpha) \to 0 \quad \text{as } \alpha \to +\infty.$$ [6.1]
LEMMA 6.1.– Consider a sequence $\{X_n\}$ of random variables.
a) If $\{X_n\}$ is uniformly integrable, then $\sup_{n \ge 1} E |X_n| < \infty$.
b) If $|X_n| \le Y$ a.s. for all $n \ge 1$ and $E Y < \infty$, then $\{X_n\}$ is uniformly integrable.
c) Assume that for some $\varepsilon > 0$, it holds $\sup_{n \ge 1} E |X_n|^{1+\varepsilon} < \infty$. Then $\{X_n\}$ is uniformly integrable.
Gaussian Measures in Hilbert Space: Construction and Properties, First Edition. Alexander Kukush. © ISTE Ltd 2019. Published by ISTE Ltd and John Wiley & Sons, Inc.
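PROOF.– a) According to [6.1], there exists $\alpha_0 > 0$ with $E |X_n| \cdot I(|X_n| \ge \alpha_0) \le 1$, $n \ge 1$. Then for each $n \ge 1$,
$$E |X_n| = E |X_n| \cdot I(|X_n| < \alpha_0) + E |X_n| \cdot I(|X_n| \ge \alpha_0) \le \alpha_0 + 1.$$
b) Since $\{\omega : |X_n| \ge \alpha\} \subset \{\omega : Y \ge \alpha\}$, we have
$$\sup_{n \ge 1} \int_{\{|X_n| \ge \alpha\}} |X_n|\, dP \le \int_{\{Y \ge \alpha\}} Y\, dP \to 0 \quad \text{as } \alpha \to +\infty,$$
because $\int_\Omega Y\, dP = E Y < \infty$.
c) For $\alpha > 0$, it holds
$$\int_{\{|X_n| \ge \alpha\}} |X_n|\, dP \le \frac{1}{\alpha^\varepsilon} \int_{\{|X_n| \ge \alpha\}} |X_n|^{1+\varepsilon}\, dP \le \frac{C}{\alpha^\varepsilon}, \quad C = \sup_{n \ge 1} E |X_n|^{1+\varepsilon} < \infty.$$
Hence
$$\sup_{n \ge 1} \int_{\{|X_n| \ge \alpha\}} |X_n|\, dP \le \frac{C}{\alpha^\varepsilon} \to 0 \quad \text{as } \alpha \to +\infty.$$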
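Definition 6.1 and lemma 6.1(c) can be illustrated on a toy family for which the tail expectation is available in closed form. The sketch below assumes the hypothetical sequence $X_n = n^p \cdot I(U < 1/n)$ with $U$ uniform on $(0, 1)$, so that $E |X_n| \cdot I(|X_n| \ge \alpha) = n^{p-1}$ when $n^p \ge \alpha$ and $0$ otherwise. The names `tail_expectation` and `sup_tail`, as well as the cut-off `n_max`, are ours; the supremum over all $n$ is only approximated by a maximum over $n \le$ `n_max`.

```python
def tail_expectation(n: int, p: float, alpha: float) -> float:
    """E|X_n| * 1{|X_n| >= alpha} for X_n = n**p * 1{U < 1/n}, U ~ Uniform(0, 1)."""
    value = n ** p              # the only non-zero value taken by X_n
    return value / n if value >= alpha else 0.0


def sup_tail(p: float, alpha: float, n_max: int = 10 ** 5) -> float:
    """Maximum of the tail expectations over 1 <= n <= n_max (proxy for the supremum)."""
    return max(tail_expectation(n, p, alpha) for n in range(1, n_max + 1))


if __name__ == "__main__":
    for alpha in (10.0, 100.0):
        # p = 1: the tail expectation stays equal to 1, so [6.1] fails.
        # p = 1/2: E X_n^2 = 1 for all n, and the tail is at most 1/alpha (lemma 6.1(c), eps = 1).
        print(alpha, sup_tail(1.0, alpha), sup_tail(0.5, alpha))
```

For $p = 1$ the mass escapes to infinity at a constant rate, while for $p = 1/2$ the bounded second moments force the tails to vanish uniformly in $n$.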
The next two statements show that uniformly integrable sequences are suitable for passing to the limit under the expectation sign. Their proofs can be found in [SHI 16].

THEOREM 6.1.– (Extension of Lebesgue dominated convergence theorem) Let $\{X_n\}$ be uniformly integrable and $X_n \to X$ a.s. Then $E X$ and $E X_n$, $n \ge 1$, are finite, and moreover $E X_n \to E X$ as $n \to \infty$.

THEOREM 6.2.– Let $X_n$ be non-negative random variables, $X_n \to X$ a.s., and let $X$ and $X_n$, $n \ge 1$, have finite expectations; moreover, let $E X_n \to E X$ as $n \to \infty$. Then $\{X_n\}$ is uniformly integrable.

REMARK 6.1.– In theorems 6.1 and 6.2, the convergence $X_n \to X$ a.s. can be replaced with the weak convergence $X_n \xrightarrow{d} X$ (see [BIL 99]).

Problems 6.1

1) Let $\{X_n\}$ be a sequence of random variables with finite expectations. Prove that $\{X_n\}$ is uniformly integrable if, and only if,
$$\limsup_{n \to \infty} E |X_n| \cdot I(|X_n| \ge \alpha) \to 0 \quad \text{as } \alpha \to +\infty.$$
2) Construct a sequence $\{X_n\}$ of random variables with finite expectations such that $X_n \to X$ a.s., $E X_n \to E X$ as $n \to \infty$ and $X$ has finite expectation as well, but $\{X_n\}$ is not uniformly integrable.

3) Prove the following Vallée–Poussin criterion, which extends lemma 6.1(c). A sequence $\{X_n\}$ of random variables is uniformly integrable if, and only if, there exists a non-negative increasing function $G(t)$, $t \ge 0$, with $\lim_{t \to +\infty} G(t)/t = +\infty$ such that $\sup_{n \ge 1} E\, G(|X_n|) < \infty$.

6.2. Kakutani’s dichotomy for product measures on $\mathbb{R}^\infty$

We state some properties of absolutely continuous measures and apply them, as well as the results of section 6.1, to measures on $\mathbb{R}^\infty$.

6.2.1. General properties of absolutely continuous measures

Let $(X, \mathcal{S})$ be a measurable space and $\mu, \nu$ be finite measures on $\mathcal{S}$. Recall that $\nu$ is absolutely continuous w.r.t. $\mu$ if, for all $A \in \mathcal{S}$ with $\mu(A) = 0$, it holds $\nu(A) = 0$. Notation: $\nu \ll \mu$. The Radon–Nikodym theorem (see [BOG 07]) states that $\nu \ll \mu$ if, and only if, there exists an $\mathcal{S}$-measurable function $\varphi$ such that
$$\nu(A) = \int_A \varphi(x)\, d\mu(x), \quad A \in \mathcal{S}.$$
The function $\varphi$ is uniquely defined up to a $\mu$-null set and $\varphi(x) \ge 0$. It is called the Radon–Nikodym derivative (or density) of $\nu$ w.r.t. $\mu$ and is denoted as $\frac{d\nu}{d\mu} = \frac{d\nu}{d\mu}(x)$. If $\varphi(x) > 0$ almost everywhere w.r.t. $\mu$, then $\mu \ll \nu$, and moreover
$$\frac{d\mu}{d\nu} = \left( \frac{d\nu}{d\mu} \right)^{-1}.$$
In this case, the measures $\mu$ and $\nu$ are equivalent. Notation: $\mu \sim \nu$.
Suppose the measures $\mu$ and $\nu$ are concentrated on disjoint sets, i.e. $X = A \cup B$ where $A$ and $B$ are disjoint and $\mu(B) = 0$, $\nu(A) = 0$. Then $\mu$ and $\nu$ are called mutually singular. Notation: $\mu \perp \nu$.

For arbitrary finite measures $\nu$ and $\mu$ on $\mathcal{S}$, the measure $\nu$ can be uniquely decomposed as $\nu = \nu_1 + \nu_2$ with $\nu_1 \ll \mu$ and $\nu_2 \perp \mu$ (see [BOG 07]). In the general case, $\frac{d\nu}{d\mu}$ denotes the Radon–Nikodym derivative $\frac{d\nu_1}{d\mu}$. Thus, the equality $\frac{d\nu}{d\mu}(x) = 0$ (mod $\mu$) is a necessary and sufficient condition for the singularity of finite measures $\nu$ and $\mu$.

Often it is convenient to deal with the following objects. Let $X$ be a universal space, $\mathcal{S}_n$ be an increasing sequence of sigma-algebras on $X$ (i.e. $\mathcal{S}_n \subset \mathcal{S}_{n+1}$, $n \ge 1$) and $\mathcal{S}$ be the least sigma-algebra containing all $\mathcal{S}_n$. Consider probability measures $\mu$ and $\nu$ on $\mathcal{S}$, and let $\mu^n$ and $\nu^n$ be the restrictions of $\mu$ and $\nu$ to $\mathcal{S}_n$, respectively. It is clear that $\mu^n$ and $\nu^n$ are probability measures as well.

LEMMA 6.2.– (Densities of restrictions form a martingale) Assume that $\nu^n \ll \mu^n$ for all $n \ge 1$. Then $\varphi_n := \frac{d\nu^n}{d\mu^n}$, $n \ge 1$, form a martingale on the stochastic basis $(X, \mathcal{S}, \mathcal{S}_n, n \ge 1, \mu)$, i.e. $\varphi_n$ is $\mathcal{S}_n$-measurable and
$$E(\varphi_{n+1} | \mathcal{S}_n) = \varphi_n \pmod{\mu}.$$ [6.2]
PROOF.– The function $\varphi_n$ is the $\mathcal{S}_n$-measurable density of the measure $\nu^n$ defined on $\mathcal{S}_n$. To prove [6.2], take $A \in \mathcal{S}_n$ and get
$$\int_A \varphi_{n+1}\, d\mu = \int_A \varphi_{n+1}\, d\mu^{n+1} = \nu^{n+1}(A) = \nu(A) = \nu^n(A) = \int_A \varphi_n\, d\mu^n = \int_A \varphi_n\, d\mu.$$
The first and the last equalities hold because $\mu^{n+1}$ and $\mu^n$ are restrictions of $\mu$ to $\mathcal{S}_{n+1}$ and $\mathcal{S}_n$. Thus, [6.2] follows from the definition of the conditional expectation of the r.v. $\varphi_{n+1}$ on the probability space $(X, \mathcal{S}, \mu)$ (see [SHI 16]).

COROLLARY 6.1.– (Convergence of densities) Under the conditions of lemma 6.2, there exists a function
$$\varphi(x) := \lim_{n \to \infty} \varphi_n(x) \pmod{\mu}.$$ [6.3]

PROOF.– By lemma 6.2, $\varphi_n$ is a non-negative $\mathcal{S}_n$-martingale. Then the desired statement follows from the theorem about the convergence of a martingale (see [SHI 16]).

Recall that in general $\frac{d\nu}{d\mu} = \frac{d\nu_1}{d\mu}$, where $\nu_1$ is the component of a finite measure $\nu$ that is absolutely continuous with respect to a finite measure $\mu$.
THEOREM 6.3.– (About the limiting density) Assume that $\nu^n \ll \mu^n$ and $\varphi_n = \frac{d\nu^n}{d\mu^n}$, $n \ge 1$. Then
$$\frac{d\nu}{d\mu} = \lim_{n \to \infty} \varphi_n \pmod{\mu}.$$ [6.4]
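PROOF.– a) The probability measure $\nu$ can be decomposed as $\nu = \tilde\nu_1 + \tilde\nu_2$, $\tilde\nu_1 \ll \mu$, $\tilde\nu_2 \perp \mu$. In the case where $\tilde\nu_1(X) > 0$, $\tilde\nu_2(X) > 0$, we have
$$\nu = \tilde\nu_1(X) \cdot \frac{\tilde\nu_1}{\tilde\nu_1(X)} + \tilde\nu_2(X) \cdot \frac{\tilde\nu_2}{\tilde\nu_2(X)},$$
$$\nu = p \nu_1 + (1 - p) \nu_2, \quad 0 \le p \le 1, \quad \nu_1 \ll \mu, \quad \nu_2 \perp \mu.$$ [6.5]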
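Here, $p = \tilde\nu_1(X)$ and $\nu_1, \nu_2$ are the corresponding probability measures. If $\tilde\nu_1(X) = 0$ or $\tilde\nu_2(X) = 0$, then [6.5] still holds with $p = 0$ or $p = 1$, respectively. In all cases, we have for the restrictions $\nu^n, \mu^n, \nu_1^n, \nu_2^n$ of $\nu, \mu, \nu_1, \nu_2$ to $\mathcal{S}_n$:
$$\frac{d\nu^n}{d\mu^n} = p\, \frac{d\nu_1^n}{d\mu^n} + (1 - p)\, \frac{d\nu_2^n}{d\mu^n}.$$
Hence in order to show [6.4], it is enough to prove that
$$\frac{d\nu_1^n}{d\mu^n} \to \frac{d\nu_1}{d\mu}, \quad \frac{d\nu_2^n}{d\mu^n} \to 0 \pmod{\mu}$$
(because relation [6.5] implies $\frac{d\nu}{d\mu} = p\, \frac{d\nu_1}{d\mu}$).

Therefore, it is enough to prove [6.4] in the pure cases only: when $\nu \ll \mu$ and when $\nu \perp \mu$.

b) The case $\nu \ll \mu$: In this case,
$$\frac{d\nu^n}{d\mu^n} = E\left( \frac{d\nu}{d\mu} \,\Big|\, \mathcal{S}_n \right).$$ [6.6]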
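Indeed, for $A \in \mathcal{S}_n$,
$$\int_A \frac{d\nu^n}{d\mu^n}\, d\mu(x) = \int_A \frac{d\nu^n}{d\mu^n}\, d\mu^n(x) = \nu(A) = \int_A \frac{d\nu}{d\mu}\, d\mu(x),$$
and [6.6] follows. Passing to the limit in [6.6], we get a.s.:
$$\lim_{n \to \infty} \frac{d\nu^n}{d\mu^n} = E\left( \frac{d\nu}{d\mu} \,\Big|\, \mathcal{S} \right) = \frac{d\nu}{d\mu}.$$
Here, we used corollary 6.1 and Lévy’s theorem (see [SHI 16]) about the limit of conditional expectations; applying the latter theorem, we used the following: the sigma-algebras $\mathcal{S}_n$ increase to the sigma-algebra $\mathcal{S}$ and $E \frac{d\nu}{d\mu} = \int_X \frac{d\nu}{d\mu}\, d\mu(x) = 1 < \infty$.

c) The case $\nu \perp \mu$: We have to show that in this case,
$$\frac{d\nu^n}{d\mu^n} \to 0 \pmod{\mu}.$$ [6.7]
According to corollary 6.1, $\frac{d\nu^n}{d\mu^n} \to \varphi$ (mod $\mu$). We take $A \in \mathcal{S}_m$, $n \ge m$ and apply Fatou’s lemma:
$$\nu(A) = \nu^n(A) = \int_A \frac{d\nu^n}{d\mu^n}\, d\mu(x),$$
$$\nu(A) = \liminf_{n \to \infty} \int_A \frac{d\nu^n}{d\mu^n}\, d\mu(x) \ge \int_A \liminf_{n \to \infty} \frac{d\nu^n}{d\mu^n}\, d\mu(x) = \int_A \varphi\, d\mu(x).$$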
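Hence, the monotone class
$$\mathcal{M} := \left\{ B \in \mathcal{S} : \nu(B) \ge \int_B \varphi\, d\mu(x) \right\}$$
contains the algebra $\mathcal{A} := \{B \in \mathcal{S} : \exists n \ge 1,\ B \in \mathcal{S}_n\}$, and $\mathcal{M} \supset \sigma a(\mathcal{A}) = \mathcal{S}$. Therefore,
$$\nu(A) \ge \int_A \varphi\, d\mu(x), \quad A \in \mathcal{S}.$$ [6.8]
Since $\nu \perp \mu$, there exists $A_1$ with $\nu(A_1) = 0$ and $\mu(A_1^c) = 0$. We have $\int_{A_1^c} \varphi\, d\mu(x) = 0$, and inequality [6.8] implies $\int_{A_1} \varphi\, d\mu(x) = 0$. For the non-negative $\varphi$, we get $\int_X \varphi\, d\mu(x) = 0$; hence $\varphi = 0$ (mod $\mu$). Relation [6.7] is proven.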
6.2.2. Kakutani’s theorem for product measures

Let $\{\mu_k\}$ and $\{\nu_k\}$ be sequences of probability measures on the Borel sigma-algebra on $\mathbb{R}$. Consider the product measures $\mu = \prod_1^\infty \mu_k$ and $\nu = \prod_1^\infty \nu_k$ on $\mathbb{R}^\infty$ (see definition 2.3).

THEOREM 6.4.– (Kakutani’s criterion for absolute continuity) The product measure $\nu$ is absolutely continuous w.r.t. $\mu$ if, and only if, $\nu_k \ll \mu_k$ for all $k \ge 1$ and there exists $\alpha \in (0, 1)$ such that the infinite product
$$\prod_1^\infty \int_{\mathbb{R}} \left( \frac{d\nu_k}{d\mu_k} \right)^\alpha d\mu_k(x)$$ [6.9]
converges (to some positive number).

REMARK 6.2.– In the necessity part of theorem 6.4, the product [6.9] converges for all $\alpha \in (0, 1)$. This is shown in the proof below.

REMARK 6.3.– The integral in [6.9] is called the Hellinger integral and is denoted as $\int_{\mathbb{R}} (d\nu_k)^\alpha (d\mu_k)^{1-\alpha}$. In particular, $\int_{\mathbb{R}} \sqrt{d\nu_k\, d\mu_k} = \int_{\mathbb{R}} \sqrt{\frac{d\nu_k}{d\mu_k}}\, d\mu_k$.

PROOF.– a) Necessity. It is given that $\nu \ll \mu$. First, we prove that $\nu_1 \ll \mu_1$. Indeed, let $\mu_1(A_1) = 0$. Then, for the cylinder $\hat A_1 = \{x \in \mathbb{R}^\infty : x_1 \in A_1\}$ we have $\mu(\hat A_1) = \mu_1(A_1) = 0$; hence $\nu(\hat A_1) = \nu_1(A_1) = 0$, and $\nu_1 \ll \mu_1$. In a similar way, $\nu_k \ll \mu_k$ for all $k \ge 1$.

Now, by theorem 6.3,
$$\frac{d\nu}{d\mu}(x) = \lim_{n \to \infty} \frac{d\nu^n}{d\mu^n}(x_1, \dots, x_n) \pmod{\mu}.$$ [6.10]
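Here we consider the increasing sequence of sigma-algebras $\mathcal{B}_n = \{\hat B_n : B_n \in \mathcal{B}(\mathbb{R}^n)\}$, and $\mathcal{B}(\mathbb{R}^\infty)$ is the least sigma-algebra that contains all $\mathcal{B}_n$ (see theorem 2.1); $\nu^n$ and $\mu^n$ are the restrictions of $\nu$ and $\mu$ to $\mathcal{B}_n$. Since $\nu$ and $\mu$ are product measures with $\nu_k \ll \mu_k$ for all $k \ge 1$, it holds
$$\frac{d\nu^n}{d\mu^n}(x_1, \dots, x_n) = \prod_1^n \frac{d\nu_k}{d\mu_k}(x_k).$$ [6.11]
We fix $\alpha \in (0, 1)$. Then [6.10] and [6.11] imply
$$\left( \frac{d\nu}{d\mu}(x) \right)^\alpha = \lim_{n \to \infty} \prod_1^n \left( \frac{d\nu_k}{d\mu_k}(x_k) \right)^\alpha \pmod{\mu}.$$
But
$$\int_{\mathbb{R}^\infty} \left[ \prod_1^n \left( \frac{d\nu_k}{d\mu_k}(x_k) \right)^\alpha \right]^{1/\alpha} d\mu(x) = \int_{\mathbb{R}^\infty} \frac{d\nu^n}{d\mu^n}\, d\mu(x) = 1$$
and the exponent $\frac{1}{\alpha} > 1$; hence by lemma 6.1(c) the functions $\prod_1^n \left( \frac{d\nu_k}{d\mu_k}(x_k) \right)^\alpha$ are uniformly integrable, and by theorem 6.1,
$$\int_{\mathbb{R}^\infty} \left( \frac{d\nu}{d\mu}(x) \right)^\alpha d\mu(x) = \lim_{n \to \infty} \int_{\mathbb{R}^\infty} \prod_1^n \left( \frac{d\nu_k}{d\mu_k}(x_k) \right)^\alpha d\mu(x) = \lim_{n \to \infty} \prod_1^n \int_{\mathbb{R}} \left( \frac{d\nu_k}{d\mu_k} \right)^\alpha d\mu_k(x) = \prod_1^\infty \int_{\mathbb{R}} \left( \frac{d\nu_k}{d\mu_k}(x_k) \right)^\alpha d\mu_k(x) < \infty.$$ [6.12]
Moreover, the infinite product in [6.12] is non-zero. Indeed, if it equals 0, then $\frac{d\nu}{d\mu} = 0$ (mod $\mu$), and since $\nu \ll \mu$, we get $\nu = 0$, so $\nu$ is not a probability measure, which is a contradiction.

Thus, the product [6.9] converges to a positive number.

REMARK 6.4.– We have established the following: if $\nu_k \ll \mu_k$ for all $k \ge 1$, then $\prod_1^\infty \int_{\mathbb{R}} \left( \frac{d\nu_k}{d\mu_k} \right)^\alpha d\mu_k(x)$ either converges (to a positive number) or diverges to 0.

b) Sufficiency. Now, we assume that $\nu_k \ll \mu_k$ for all $k \ge 1$, and [6.9] converges. As in part (a) of the proof, we get [6.10], where $\frac{d\nu}{d\mu}$ is the generalized Radon–Nikodym derivative (i.e. the density of $\nu_1$ w.r.t. $\mu$), and $\nu_1$ is the absolutely continuous component of $\nu$. Hence the measurable function
$$\varphi(x) := \prod_1^\infty \frac{d\nu_k}{d\mu_k}(x_k) = \frac{d\nu}{d\mu}(x) \pmod{\mu}.$$ [6.13]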
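Here almost surely w.r.t. $\mu$, the infinite product at some points converges to a positive number, and at some points it converges to 0. To show that $\nu \ll \mu$, it is enough to prove that
$$\int_{\mathbb{R}^\infty} \varphi\, d\mu \ge 1.$$ [6.14]
Indeed, decompose $\nu = \nu_1 + \nu_2$, $\nu_1 \ll \mu$, $\nu_2 \perp \mu$. Then $\varphi = \frac{d\nu_1}{d\mu}$. Once we have proved [6.14], we get
$$1 = \nu(\mathbb{R}^\infty) = \int_{\mathbb{R}^\infty} \varphi\, d\mu + \nu_2(\mathbb{R}^\infty) \ge 1 + \nu_2(\mathbb{R}^\infty);$$
hence $\nu_2(\mathbb{R}^\infty) = 0$ and $\nu_2 = 0$, $\nu = \nu_1 \ll \mu$. (In fact, equality then holds in [6.14].) In order to prove [6.14], introduce the measurable functions
$$\varphi_n(x) = \prod_{k=n}^\infty \frac{d\nu_k}{d\mu_k}(x_k) \pmod{\mu}.$$ [6.15]
Almost everywhere w.r.t. $\mu$, the infinite product in [6.15] converges either to a positive number or to 0, because in forming the product measures $\nu$ and $\mu$ one could multiply the measures $\nu_k$ or $\mu_k$ starting from $k = n$ rather than $k = 1$.

The sequence of functions $\left\{ \prod_{k=n}^m \left( \frac{d\nu_k}{d\mu_k}(x_k) \right)^\alpha,\ m \ge n \right\}$ is uniformly integrable on the measure space $(\mathbb{R}^\infty, \mathcal{B}(\mathbb{R}^\infty), \mu)$, because $1/\alpha > 1$ and
$$\int_{\mathbb{R}^\infty} \left[ \prod_{k=n}^m \left( \frac{d\nu_k}{d\mu_k}(x_k) \right)^\alpha \right]^{1/\alpha} d\mu(x) = \int_{\mathbb{R}^\infty} \prod_{k=n}^m \frac{d\nu_k}{d\mu_k}(x_k)\, d\mu(x) = \prod_{k=n}^m \int_{\mathbb{R}} \frac{d\nu_k}{d\mu_k}(x_k)\, d\mu_k(x_k) = 1$$
(see lemma 6.1(c)). Letting $m \to \infty$, we get by theorem 6.1:
$$\int_{\mathbb{R}^\infty} \varphi_n^\alpha\, d\mu = \prod_{k=n}^\infty \int_{\mathbb{R}} \left( \frac{d\nu_k}{d\mu_k}(x_k) \right)^\alpha d\mu_k(x_k).$$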
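Using Fubini’s theorem and the moment inequality, we obtain:
$$\int_{\mathbb{R}^\infty} \varphi(x)\, d\mu(x) = \int_{\mathbb{R}^\infty} \prod_{k=1}^{n-1} \frac{d\nu_k}{d\mu_k}(x_k) \cdot \varphi_n(x)\, d\mu(x) = \prod_{k=1}^{n-1} \int_{\mathbb{R}} \frac{d\nu_k}{d\mu_k}(x_k)\, d\mu_k(x_k) \times \int_{\mathbb{R}^\infty} \varphi_n(x_n, x_{n+1}, \dots)\, d\mu(x) = \int_{\mathbb{R}^\infty} \varphi_n(x)\, d\mu(x),$$
$$\int_{\mathbb{R}^\infty} \varphi(x)\, d\mu(x) \ge \left( \int_{\mathbb{R}^\infty} \varphi_n^\alpha\, d\mu(x) \right)^{1/\alpha} = \left( \prod_{k=n}^\infty \int_{\mathbb{R}} \left( \frac{d\nu_k}{d\mu_k}(x_k) \right)^\alpha d\mu_k(x_k) \right)^{1/\alpha}.$$
Letting $n \to \infty$, we obtain the desired inequality [6.14], because the product [6.9] converges (to a positive number), so its tail products tend to 1.

COROLLARY 6.2.– (Kakutani’s dichotomy) Assume that $\nu_k \ll \mu_k$ for all $k \ge 1$. Then $\nu = \prod_1^\infty \nu_k$ is either absolutely continuous w.r.t. $\mu = \prod_1^\infty \mu_k$ or singular to $\mu$.

PROOF.– Fix $0 < \alpha < 1$. By remark 6.4,
$$\prod_1^\infty \int_{\mathbb{R}} \left( \frac{d\nu_k}{d\mu_k} \right)^\alpha d\mu_k$$ [6.16]
either converges to a positive number, or diverges to 0.

In the first case, $\nu \ll \mu$ by theorem 6.4. In the second case, the product [6.16] equals 0. The functions $f_n(x) = \prod_1^n \left( \frac{d\nu_k}{d\mu_k}(x_k) \right)^\alpha$, $n \ge 1$, are uniformly integrable on the measure space $(\mathbb{R}^\infty, \mathcal{B}(\mathbb{R}^\infty), \mu)$ (see the proof of theorem 6.4); hence by theorems 6.3 and 6.1,
$$\int_{\mathbb{R}^\infty} \left( \frac{d\nu}{d\mu} \right)^\alpha d\mu = \int_{\mathbb{R}^\infty} \lim_{n \to \infty} f_n(x)\, d\mu(x) = \lim_{n \to \infty} \int_{\mathbb{R}^\infty} f_n(x)\, d\mu(x) = \prod_1^\infty \int_{\mathbb{R}} \left( \frac{d\nu_k}{d\mu_k} \right)^\alpha d\mu_k = 0.$$
Here $\frac{d\nu}{d\mu}$ is the generalized Radon–Nikodym derivative (as in theorem 6.3). Thus, $\frac{d\nu}{d\mu} = 0$ (mod $\mu$), and $\nu \perp \mu$.

Therefore, either $\nu \ll \mu$ or $\nu \perp \mu$.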
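6.2.3. Dichotomy for Gaussian product measures

LEMMA 6.3.– (Computation of Hellinger integral for normal distribution on $\mathbb{R}$) Let $\nu$ and $\mu$ be normal distributions $N(a, b)$ and $N(\hat a, \hat b)$ on the real line, with positive $b$ and $\hat b$. Then
$$\int_{\mathbb{R}} \sqrt{d\nu\, d\mu} = \sqrt{\frac{2}{b + \hat b}}\, (b \hat b)^{1/4} \exp\left\{ -\frac{(a - \hat a)^2}{4(b + \hat b)} \right\}.$$ [6.17]

PROOF.– Let $\rho_\nu$ and $\rho_\mu$ be the densities of $\nu$ and $\mu$ w.r.t. Lebesgue measure. Then the Hellinger integral is equal to
$$H := \int_{\mathbb{R}} \sqrt{d\nu\, d\mu} = \int_{\mathbb{R}} \sqrt{\rho_\nu \rho_\mu}\, dx,$$
$$\sqrt{\rho_\nu \rho_\mu} = \frac{1}{\sqrt{2\pi}\, (b \hat b)^{1/4}} \exp\left\{ -\frac{1}{4} \left[ \frac{(x - a)^2}{b} + \frac{(x - \hat a)^2}{\hat b} \right] \right\}.$$
Transform
$$\frac{(x - a)^2}{b} + \frac{(x - \hat a)^2}{\hat b} = x^2 \left( \frac{1}{b} + \frac{1}{\hat b} \right) - 2x \left( \frac{a}{b} + \frac{\hat a}{\hat b} \right) + \frac{a^2}{b} + \frac{\hat a^2}{\hat b} = A x^2 - 2Bx + C = A \left( x - \frac{B}{A} \right)^2 + \left( C - \frac{B^2}{A} \right);$$
$$H = (b \hat b)^{-1/4} \exp\left\{ -\frac{1}{4} \left( C - \frac{B^2}{A} \right) \right\} \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-\frac{A t^2}{4}}\, dt,$$
$$\frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-\frac{A t^2}{4}}\, dt = \sqrt{2A^{-1}} \cdot \frac{1}{\sqrt{2\pi} \sqrt{2A^{-1}}} \int_{\mathbb{R}} e^{-\frac{t^2}{2(2A^{-1})}}\, dt = \sqrt{2A^{-1}},$$
$$H = \exp\left\{ -\frac{1}{4} \left( C - \frac{B^2}{A} \right) \right\} \left( \frac{2A^{-1}}{(b \hat b)^{1/2}} \right)^{1/2}.$$ [6.18]
We have
$$C - \frac{B^2}{A} = \frac{a^2}{b} + \frac{\hat a^2}{\hat b} - \frac{\left( \frac{a}{b} + \frac{\hat a}{\hat b} \right)^2}{\frac{1}{b} + \frac{1}{\hat b}} = \frac{(a - \hat a)^2}{b + \hat b},$$ [6.19]
$$A = \frac{b + \hat b}{b \hat b}, \qquad \frac{2A^{-1}}{(b \hat b)^{1/2}} = \frac{2 (b \hat b)^{1/2}}{b + \hat b}.$$ [6.20]
Finally, [6.18]–[6.20] imply [6.17].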
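The closed form [6.17] can be cross-checked against direct numerical integration of $\sqrt{\rho_\nu \rho_\mu}$. The sketch below is a plain trapezoidal rule on a wide interval; the helper names (`hellinger_closed_form`, `normal_pdf`, `hellinger_quadrature`), the integration window and the step count are our own illustrative choices, not part of the text. As in the lemma, the second parameter of $N(a, b)$ is the variance.

```python
import math


def hellinger_closed_form(a: float, b: float, a_hat: float, b_hat: float) -> float:
    """Right-hand side of [6.17] for N(a, b) and N(a_hat, b_hat), b and b_hat variances."""
    scale = math.sqrt(2.0 / (b + b_hat)) * (b * b_hat) ** 0.25
    return scale * math.exp(-(a - a_hat) ** 2 / (4.0 * (b + b_hat)))


def normal_pdf(x: float, mean: float, var: float) -> float:
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def hellinger_quadrature(a: float, b: float, a_hat: float, b_hat: float,
                         half_width: float = 40.0, steps: int = 80_000) -> float:
    """Trapezoidal approximation of the integral of sqrt(rho_nu * rho_mu) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.sqrt(normal_pdf(x, a, b) * normal_pdf(x, a_hat, b_hat))
    return total * h


if __name__ == "__main__":
    params = (0.0, 1.0, 1.0, 4.0)   # hypothetical example: N(0, 1) versus N(1, 4)
    print(hellinger_closed_form(*params), hellinger_quadrature(*params))
```

The fixed window is adequate only when both means lie well inside it relative to the standard deviations; for other parameters it would have to be widened or recentred.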
Consider $\nu_k = N(a_k, b_k)$ and $\mu_k = N(\hat a_k, \hat b_k)$ with positive $b_k$ and $\hat b_k$; $k \ge 1$. Introduce two Gaussian product measures on $\mathbb{R}^\infty$:
$$\nu = \prod_1^\infty \nu_k, \quad \mu = \prod_1^\infty \mu_k.$$
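THEOREM 6.5.– (Dichotomy about Gaussian product measures) $\nu \sim \mu$ if, and only if,
$$\sum_1^\infty \left( \frac{\hat b_k - b_k}{b_k} \right)^2 < \infty \quad \text{and} \quad \sum_1^\infty \frac{(\hat a_k - a_k)^2}{\hat b_k} < \infty.$$ [6.21]
If at least one of the two conditions [6.21] is violated, then $\nu \perp \mu$.

PROOF.– a) The results of section 6.2.2 imply the following. The product
$$\prod_1^\infty \int_{\mathbb{R}} \sqrt{d\nu_k\, d\mu_k}$$ [6.22]
is either convergent to a positive number or divergent to 0; in the first case, $\nu \ll \mu$ and in the second case $\nu \perp \mu$. Denote
$$\delta_k = \frac{\hat b_k - b_k}{b_k}; \quad \delta_k > -1.$$
By lemma 6.3,
$$\int_{\mathbb{R}} \sqrt{d\nu_k\, d\mu_k} = A_k \exp\left\{ -\frac{(\hat a_k - a_k)^2}{4(\hat b_k + b_k)} \right\},$$ [6.23]
$$A_k^2 = \frac{(1 + \delta_k)^{1/2}}{1 + \frac{1}{2} \delta_k} \le 1.$$ [6.24]
In view of [6.23] and [6.24], the product [6.22] converges if, and only if,
$$\prod_1^\infty \frac{(1 + \delta_k)^{1/2}}{1 + \frac{1}{2} \delta_k} \quad \text{converges}$$ [6.25]
and
$$\sum_1^\infty \frac{(\hat a_k - a_k)^2}{\hat b_k + b_k} < \infty.$$ [6.26]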
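b) Suppose that [6.25] and [6.26] hold true. The next product converges:
$$\prod_1^\infty \frac{(1 + \delta_k / 2)^2}{1 + \delta_k} = \prod_1^\infty \left( 1 + \frac{\delta_k^2 / 4}{1 + \delta_k} \right);$$ [6.27]
hence $\sum_1^\infty \frac{\delta_k^2}{1 + \delta_k} < \infty$; then $\delta_k \to 0$ as $k \to \infty$, and therefore, $\sum_1^\infty \delta_k^2 < \infty$. In that case $b_k \sim \hat b_k$ and $b_k + \hat b_k \sim 2 \hat b_k$; hence [6.26] implies the second relation in [6.21].

c) Now, suppose that [6.21] holds true. Then $\sum_1^\infty \delta_k^2 < \infty$, $\delta_k \to 0$ as $k \to \infty$, and the product [6.27] converges together with the product [6.25]. In that case $b_k \sim \hat b_k$ and the second relation in [6.21] implies [6.26].

d) Thus, $\nu \ll \mu$ if, and only if, [6.25] and [6.26] hold, and this occurs if, and only if, [6.21] holds. If at least one of the two relations [6.21] is violated, then $\nu$ is not absolutely continuous w.r.t. $\mu$, and therefore, the product [6.22] diverges to 0; hence $\nu \perp \mu$.

e) Assume that $\nu \ll \mu$. Then the product [6.22] converges. The measures $\nu_k$ and $\mu_k$ are equivalent, and the Hellinger integral can be transformed as
$$H(\nu_k, \mu_k) = \int_{\mathbb{R}} \sqrt{d\nu_k\, d\mu_k} = \int_{\mathbb{R}} \left( \frac{d\nu_k}{d\mu_k} \right)^{1/2} d\mu_k(x) = \int_{\mathbb{R}} \left( \frac{d\nu_k}{d\mu_k} \right)^{1/2} \frac{d\mu_k}{d\nu_k}\, d\nu_k(x) = \int_{\mathbb{R}} \left( \frac{d\mu_k}{d\nu_k} \right)^{1/2} d\nu_k(x) = H(\mu_k, \nu_k).$$
Therefore, the product $\prod_1^\infty H(\mu_k, \nu_k) = \prod_1^\infty \int_{\mathbb{R}} \sqrt{d\mu_k\, d\nu_k}$ converges; hence $\mu \ll \nu$. Thus, $\nu \sim \mu$ if, and only if, $\nu \ll \mu$.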
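The dichotomy of theorem 6.5 can be observed numerically through the partial products of the Hellinger integrals [6.22], computed with the closed form of lemma 6.3. The sketch below compares two hypothetical centered examples of our own choosing: variances $\hat b_k = b_k (1 + 1/k)$, for which $\sum \delta_k^2 < \infty$, and a rescaled copy $\hat b_k = t^2 b_k$ with $t = \sqrt{2}$, for which $\delta_k$ is constant. The function names are illustrative only.

```python
import math


def hellinger(a: float, b: float, a_hat: float, b_hat: float) -> float:
    """Hellinger integral [6.17] of N(a, b) and N(a_hat, b_hat) (b, b_hat are variances)."""
    scale = math.sqrt(2.0 * math.sqrt(b * b_hat) / (b + b_hat))
    return scale * math.exp(-(a - a_hat) ** 2 / (4.0 * (b + b_hat)))


def partial_product(params) -> float:
    """Product of Hellinger integrals over an iterable of tuples (a_k, b_k, a_hat_k, b_hat_k)."""
    product = 1.0
    for a, b, a_hat, b_hat in params:
        product *= hellinger(a, b, a_hat, b_hat)
    return product


def summable_perturbation(n: int):
    """First n factors with b_k = 1 and b_hat_k = 1 + 1/k, i.e. delta_k = 1/k."""
    return [(0.0, 1.0, 0.0, 1.0 + 1.0 / k) for k in range(1, n + 1)]


def rescaled_copy(n: int, t: float):
    """First n factors with b_k = 1 and b_hat_k = t**2, i.e. the distribution of t * xi."""
    return [(0.0, 1.0, 0.0, t * t) for _ in range(n)]


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n,
              partial_product(summable_perturbation(n)),        # stabilizes at a positive value
              partial_product(rescaled_copy(n, math.sqrt(2))))  # decays geometrically to 0
```

A partial product that stabilizes at a positive value is consistent with equivalence, while geometric decay to zero is consistent with mutual singularity; a finite computation of this kind can of course only suggest, not prove, either conclusion.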
COROLLARY 6.3.– Let $\nu = \prod_1^\infty N(a_k, b_k)$ and $\mu = \prod_1^\infty N(\hat a_k, b_k)$ with $b_k > 0$, $k \ge 1$. Then $\nu \sim \mu$ if, and only if, $\sum_1^\infty \frac{(a_k - \hat a_k)^2}{b_k} < \infty$. If the series diverges, then $\nu \perp \mu$.

Thus, under the conditions of corollary 6.3, $\nu \sim \mu$ if, and only if, $a - \hat a \in l_{2,c}$. Here $a = (a_k)_1^\infty$, $\hat a = (\hat a_k)_1^\infty$, $c = \{\frac{1}{b_k}, k \ge 1\}$.

COROLLARY 6.4.– Let $\nu = \prod_1^\infty N(a_k, b_k)$ and $\mu = \prod_1^\infty N(a_k, \hat b_k)$ with positive $b_k$ and $\hat b_k$. Then $\nu \sim \mu$ if, and only if, $\sum_1^\infty \left( \frac{\hat b_k - b_k}{b_k} \right)^2 < \infty$. If the series diverges, then $\nu \perp \mu$.

As an application, consider the independent sequence $\xi_k \sim N(0, b_k)$ with positive variances $b_k$. The random element $\xi = (\xi_k)_1^\infty$ on $\mathbb{R}^\infty$ has distribution $\mu_\xi = \prod_1^\infty N(0, b_k)$. For a real non-zero number $t$, the random element $t\xi$ has distribution $\mu_{t\xi} = \prod_1^\infty N(0, t^2 b_k)$. By corollary 6.4, $\mu_{t\xi} \perp \mu_\xi$ for all $t \in \mathbb{R} \setminus \{1, -1, 0\}$. For $t = -1$, $-\xi$ and $\xi$ are equally distributed, and for $t = 1$, $t\xi = \xi$.

Problems 6.2

4) Construct two product measures on $\mathbb{R}^\infty$ such that they are neither absolutely continuous (one w.r.t. another) nor singular.

5) Let $\nu = \prod_1^\infty Pois(\lambda_k)$ and $\mu = \prod_1^\infty Pois(\hat\lambda_k)$, where $Pois(\lambda)$ stands for the Poisson distribution with parameter $\lambda$. Prove that $\nu \sim \mu$ if, and only if, $\sum_1^\infty \left( \sqrt{\lambda_k} - \sqrt{\hat\lambda_k} \right)^2 < \infty$; if the series diverges, then $\nu \perp \mu$.
6.3. Feldman–Hájek dichotomy for Gaussian measures on $H$

Let $H$ be a real separable infinite-dimensional Hilbert space. We compare two Gaussian measures on $H$.

6.3.1. The case where Gaussian measures have equal correlation operators

THEOREM 6.6.– (Dichotomy for Gaussian measures with equal correlation operators) Let $\mu = N(a_1, B)$ and $\nu = N(a_2, B)$ be Gaussian measures on $H$. It holds:
a) if $a_1 - a_2 \in R(\sqrt{B})$, then $\mu \sim \nu$;
b) if $a_1 - a_2 \notin R(\sqrt{B})$, then $\mu \perp \nu$.

PROOF.– 1) We start with the case where $B$ is a non-singular operator, i.e. $\mathrm{Ker}\, B = \{0\}$. Let $\{e_k\}$ and $\{\beta_k\}$ be the eigenbasis and the corresponding eigenvalues of $B$; $\beta_k > 0$, $k \ge 1$. Using the Fourier coefficients of a vector w.r.t. $\{e_k\}$, we can identify $H$ with the sequence space $l_2$. Introduce the Fourier coefficients
$$a_{1k} = (a_1, e_k), \quad a_{2k} = (a_2, e_k), \quad k \ge 1.$$
Then $\mu$ and $\nu$ are product measures in $l_2$ (see Chapter 5):
$$\mu = \prod_1^\infty N(a_{1k}, \beta_k), \quad \nu = \prod_1^\infty N(a_{2k}, \beta_k).$$ [6.28]
The right-hand sides of relations [6.28] define extended measures $\mu_e$ and $\nu_e$ on $\mathbb{R}^\infty$, such that
$$\mu = \mu_e |_{\mathcal{B}(l_2)}, \quad \nu = \nu_e |_{\mathcal{B}(l_2)}.$$
By corollary 6.3, the Gaussian product measures $\mu_e$ and $\nu_e$ are either equivalent or mutually singular. In the first case $\mu \sim \nu$, and in the second case $\mu \perp \nu$ (see problem (6) at the end of section 6.3). By corollary 6.3, $\mu_e \sim \nu_e$ if, and only if,
$$\sum_1^\infty \frac{(a_{1k} - a_{2k})^2}{\beta_k} < \infty.$$ [6.29]
We have $\sqrt{B} e_k = \sqrt{\beta_k}\, e_k$, $k \ge 1$, and [6.29] is equivalent to the following:
$$a_1 - a_2 \in R(\sqrt{B}).$$ [6.30]
Therefore, if [6.30] holds, then $\mu_e \sim \nu_e$ and $\mu \sim \nu$. And if [6.30] is violated, then $\mu_e \perp \nu_e$ and $\mu \perp \nu$. We have proved both statements (a) and (b) for non-singular $B$.

2) The case of singular $B$. We use the same construction as in part 1 of the proof. But now there exists an eigenvalue $\beta_{k_0} = 0$. If $a_{1k_0} \ne a_{2k_0}$, then $N(a_{1k_0}, \beta_{k_0}) \perp N(a_{2k_0}, \beta_{k_0})$; hence $\mu_e \perp \nu_e$ and $\mu \perp \nu$. Notice that in this case $(a_1 - a_2, e_{k_0}) \ne 0$ and $a_1 - a_2 \notin R(\sqrt{B})$.

Now, suppose that for all $k \ge 1$ with $\beta_k = 0$, it holds $a_{1k} = a_{2k}$. Then we can reduce our consideration to the case of a non-singular correlation operator. If
$$\sum_{k : \beta_k > 0} \frac{(a_{1k} - a_{2k})^2}{\beta_k} < \infty,$$ [6.31]
then $\mu_e \sim \nu_e$ and $\mu \sim \nu$; otherwise $\mu_e \perp \nu_e$ and $\mu \perp \nu$. But in the considered case, relation [6.31] is equivalent to the following: $a_1 - a_2 \in R(\sqrt{B})$. This completes the proof.

REMARK 6.5.– Consider two arbitrary real sequences $\{\alpha_k, \beta_k, k \ge 1\}$. Hereafter, we agree that the convergence of the series $\sum_1^\infty \frac{\alpha_k}{\beta_k}$ means the following: $\alpha_k = 0$ for all $k$ such that $\beta_k = 0$, and moreover the series $\sum_{k : \beta_k \ne 0} \frac{\alpha_k}{\beta_k}$ converges. Under this agreement, $N(a_1, B) \sim N(a_2, B)$ if, and only if, [6.29] holds, whatever the correlation operator $B$ is (i.e. it can be either singular or not).

COROLLARY 6.5.– (About admissible shift of Gaussian measure) Consider the Gaussian measures $\mu_a = N(a, B)$ and $\mu_0 = N(0, B)$ on $H$. If $a \in R(\sqrt{B})$, then $\mu_a \sim \mu_0$, and otherwise $\mu_a \perp \mu_0$.
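In coordinates, the criterion of corollary 6.5 is the convergence of the series $\sum_k a_k^2 / \beta_k$ from [6.29] with $a_2 = 0$. The sketch below tabulates partial sums for a hypothetical correlation operator with eigenvalues $\beta_k = k^{-2}$ and two hypothetical shifts of our own choosing: $a_k = k^{-2}$ (the series converges to $\pi^2 / 6$) and $a_k = k^{-1}$ (a vector of $l_2$ for which every term equals 1, so the series diverges). The function names are illustrative.

```python
import math


def beta(k: int) -> float:
    """Hypothetical eigenvalues beta_k = k**-2 of a trace-class correlation operator."""
    return 1.0 / k ** 2


def a_admissible(k: int) -> float:
    """Hypothetical shift coordinates a_k = k**-2; sum a_k^2 / beta_k = sum k**-2 converges."""
    return 1.0 / k ** 2


def a_not_admissible(k: int) -> float:
    """Hypothetical shift coordinates a_k = k**-1; a is in l_2, yet a_k^2 / beta_k = 1 for all k."""
    return 1.0 / k


def shift_series_partial_sum(a, n: int) -> float:
    """Partial sum of a(k)**2 / beta(k) over k = 1, ..., n."""
    return sum(a(k) ** 2 / beta(k) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (10, 1000, 100_000):
        print(n,
              shift_series_partial_sum(a_admissible, n),      # approaches pi**2 / 6
              shift_series_partial_sum(a_not_admissible, n))  # grows like n
    print(math.pi ** 2 / 6)
```

The second shift shows that belonging to $H$ is not enough for admissibility: the coordinates must decay fast enough relative to $\sqrt{\beta_k}$.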
A vector $a \in R(\sqrt{B})$ is called an admissible shift of $\mu_0$ in the sense that $\mu_0 T_a^{-1} \ll \mu_0$, where $T_a x = x + a$, $x \in H$. Now, we find the Radon–Nikodym derivative $d\mu_a / d\mu_0$ for admissible shifts.

THEOREM 6.7.– Let $\mu_0 = N(0, B)$ and $\mu_a = N(a, B)$ on $H$, with $a = B^{1/2} b$, and let $\{e_k\}$ and $\{\beta_k\}$ be an eigenbasis and the corresponding eigenvalues of $B$. Assume additionally that $(b, e_k) = 0$ whenever $\beta_k = 0$. Then
\[ \frac{d\mu_a}{d\mu_0}(x) = \exp\left\{ (b, B^{-1/2} x) - \frac{\|b\|^2}{2} \right\}. \tag{6.32} \]
Here
\[ (b, B^{-1/2} x) = \lim_{n \to \infty} \sum_{k=1}^{n} \frac{b_k x_k}{\sqrt{\beta_k}}, \tag{6.33} \]
where $b_k = (b, e_k)$, $x_k = (x, e_k)$, $k \ge 1$, and the limit in [6.33] exists a.e. with respect to $\mu_0$; if some $\beta_{k_0} = 0$, then $x_{k_0} = 0$ (mod $\mu_0$) and $\frac{b_{k_0} x_{k_0}}{\sqrt{\beta_{k_0}}} = 0$ in [6.33] by definition.
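PROOF.– We identify $H$ and $l_2$ as in the proof of theorem 6.6. Then in $l_2$, we have
\[ \mu_0 = \prod_{k=1}^{\infty} N(0, \beta_k), \qquad \mu_a = \prod_{k=1}^{\infty} N(a_k, \beta_k), \qquad a_k = \sqrt{\beta_k}\, b_k, \qquad b_k = (b, e_k). \]
Consider sigma-algebras $S_n = \{\hat{A}_n \mid A_n \in \mathcal{B}(\mathbb{R}^n)\}$, $n \ge 1$. Here, $\hat{A}_n = \{x \in l_2 : (x_1, \dots, x_n) \in A_n\}$. Let $\mu_0^n = \mu_0|_{S_n}$, $\mu_a^n = \mu_a|_{S_n}$. It holds
\[ \frac{d\mu_a^n}{d\mu_0^n}(x) = \prod_{k=1}^{n} \exp\left\{ -\frac{(x_k - a_k)^2}{2\beta_k} + \frac{x_k^2}{2\beta_k} \right\} = \exp\left\{ -\frac{1}{2} \sum_{k=1}^{n} b_k^2 + \sum_{k=1}^{n} \frac{b_k x_k}{\sqrt{\beta_k}} \right\}. \tag{6.34} \]
This calculation remains valid if some $\beta_k = 0$. By corollary 6.5, we have $\mu_a \ll \mu_0$, and by theorem 6.3 it holds a.e. with respect to $\mu_0$:
\[ \frac{d\mu_a^n}{d\mu_0^n} \to \frac{d\mu_a}{d\mu_0} \quad \text{as } n \to \infty. \tag{6.35} \]
Since $\lim_{n \to \infty} \sum_{1}^{n} b_k^2 = \|b\|^2$, relations [6.34] and [6.35] imply that the RHS of [6.33] converges a.e. with respect to $\mu_0$, and then
\[ \frac{d\mu_a}{d\mu_0}(x) = \exp\left\{ -\frac{\|b\|^2}{2} + (b, B^{-1/2} x) \right\}. \]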
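The finite-dimensional identity [6.34] is easy to check numerically. The sketch below is only an illustration: the eigenvalues, the vector $b$ and the point $x$ are arbitrary assumed values, and the function names are hypothetical. It compares the log-ratio of the two product densities with the closed form on the right-hand side of [6.34].

```python
import math


def log_normal_pdf(t, mean, var):
    """Log-density of N(mean, var) at the point t."""
    return -0.5 * math.log(2.0 * math.pi * var) - (t - mean) ** 2 / (2.0 * var)


def log_ratio_direct(x, b, beta):
    """log(d mu_a^n / d mu_0^n)(x) as a difference of log-densities, a_k = sqrt(beta_k) b_k."""
    total = 0.0
    for xk, bk, lam in zip(x, b, beta):
        ak = math.sqrt(lam) * bk
        total += log_normal_pdf(xk, ak, lam) - log_normal_pdf(xk, 0.0, lam)
    return total


def log_ratio_formula(x, b, beta):
    """Logarithm of the right-hand side of [6.34]."""
    quad = 0.5 * sum(bk * bk for bk in b)
    lin = sum(bk * xk / math.sqrt(lam) for xk, bk, lam in zip(x, b, beta))
    return lin - quad


# Illustrative data (assumptions, not taken from the text).
beta = [1.0, 0.5, 0.25]
b = [0.3, -1.2, 0.7]
x = [0.1, 2.0, -0.4]
print(log_ratio_direct(x, b, beta), log_ratio_formula(x, b, beta))
```

The two printed numbers agree up to rounding error, as [6.34] predicts for strictly positive $\beta_k$.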
Note that the functional $f(x) = (b, B^{-1/2} x)$ given in [6.33] is a linear measurable functional on the probability space $(H, \mathcal{B}(H), \mu_0)$. This means that $f$ is well defined on some $H_0 \subset H$ with $\mu_0(H_0) = 1$, and moreover, for all $\alpha \in \mathbb{R}$, $f(\alpha x) = \alpha f(x)$ for a.e. $x \in H$ with respect to $\mu_0$.

6.3.2. Necessary conditions for equivalence of Gaussian measures

Now, we deal with two Gaussian measures on $H$ with possibly different correlation operators:
\[ \mu = N(a_1, B_1), \qquad \nu = N(a_2, B_2). \tag{6.36} \]
LEMMA 6.4.– (First necessary condition for equivalence of Gaussian measures) If measures [6.36] are equivalent, there exists $c > 0$ such that for all $z \in H$, $(B_1 z, z) \le c \cdot (B_2 z, z)$.

PROOF.– Assume the contrary. Then there exists a sequence $\{z_n\} \subset H$ with
\[ \frac{(B_2 z_n, z_n)}{(B_1 z_n, z_n)} \to 0 \quad \text{as } n \to \infty. \]
Let
\[ \xi_n(x) = \frac{(x, z_n) - (a_2, z_n)}{\sqrt{(B_1 z_n, z_n)}}, \qquad x \in H, \quad n \ge 1. \]
With respect to $\nu$, $\{\xi_n\}$ is a sequence of normal random variables with zero mean and variance
\[ D_\nu \xi_n = \frac{(B_2 z_n, z_n)}{(B_1 z_n, z_n)} \to 0 \quad \text{as } n \to \infty. \]
Hence $\xi_n \to 0$ in probability $\nu$. With respect to $\mu$, $\{\xi_n\}$ is a sequence of normal random variables as well, with variance
\[ D_\mu \xi_n = \frac{(B_1 z_n, z_n)}{(B_1 z_n, z_n)} = 1; \]
hence $\xi_n \sim N(m_n, 1)$ and
\[ \mu(\{x : |\xi_n(x)| \le 1\}) = \frac{1}{\sqrt{2\pi}} \int_{-1}^{1} e^{-\frac{(t - m_n)^2}{2}}\, dt \le \sqrt{\frac{2}{\pi}}, \qquad \mu(\{x : |\xi_n(x)| > 1\}) \ge 1 - \sqrt{\frac{2}{\pi}} > 0. \tag{6.37} \]
On the other hand,
\[ \nu(\{x : |\xi_n(x)| > 1\}) \to 0 \quad \text{as } n \to \infty. \tag{6.38} \]
Due to [6.37], $\mu(\{x : |\xi_n(x)| > 1\})$ does not converge to zero. This fact together with relation [6.38] contradicts the condition that $\mu \sim \nu$. This proves the statement.

REMARK 6.6.– For measures [6.36], if there is no positive number $c$ with $(B_1 z, z) \le c \cdot (B_2 z, z)$, $z \in H$, then $\mu \perp \nu$.

PROOF.– We use functions $\xi_n(x)$ from the proof of lemma 6.4. For each $\varepsilon > 0$, it holds
\[ \nu(\{x : |\xi_n(x)| > \varepsilon\}) \to 0 \quad \text{as } n \to \infty, \]
and similarly to [6.37],
\[ \mu(\{x : |\xi_n(x)| \le \varepsilon\}) \le \frac{2\varepsilon}{\sqrt{2\pi}}, \qquad \mu(\{x : |\xi_n(x)| > \varepsilon\}) \ge 1 - \frac{2\varepsilon}{\sqrt{2\pi}}. \]
Taking a sequence $\varepsilon_k \to 0$, it is possible to construct a sequence of sets $A_k = \{x : |\xi_{n_k}(x)| > \varepsilon_k\}$ with $\nu(A_k) \to 0$ and $\mu(A_k) \to 1$. Then $\mu \perp \nu$ (see problem (8) at the end of section 6.3).

Based on lemma 6.4, we can introduce a bounded operator $J := B_2^{-1/2} B_1 B_2^{-1/2}$.
We assume that the two measures in [6.36] are equivalent. Since the relation $\mu \sim \nu$ is symmetric, lemma 6.4 implies that there exist positive $c_1$ and $c_2$ such that
\[ c_1 (B_2 z, z) \le (B_1 z, z) \le c_2 (B_2 z, z), \qquad z \in H. \]
Then $\operatorname{Ker} B_1 = \operatorname{Ker} B_2$ and we set $Jz = z$, $z \in \operatorname{Ker} B_1$. We decompose $H = \operatorname{Ker} B_1 \oplus L$, and it is enough to define $J$ on $L$. Operators $B_1|_L$ and $B_2|_L$ are non-singular (i.e. with zero kernel), and therefore, we may and do assume that the initial correlation operators $B_1$ and $B_2$ are non-singular.

Now, we define $J$ on the dense linear set $R(\sqrt{B_2})$. Let $z = B_2^{1/2} u$, $u = B_2^{-1/2} z \in H$. We define $J$ to be a symmetric linear operator on $R(\sqrt{B_2})$ (i.e. $(Jx, y) = (x, Jy)$, $x, y \in R(\sqrt{B_2})$) such that
\[ (Jz, z) = (B_1 B_2^{-1/2} z, B_2^{-1/2} z) = (B_1 u, u), \qquad z \in R(\sqrt{B_2}). \]
Then $J$ is a bounded operator on $R(\sqrt{B_2})$, because for $z \in R(\sqrt{B_2})$, $z \neq 0$, it holds due to lemma 6.4:
\[ \frac{(Jz, z)}{(z, z)} = \frac{(B_1 u, u)}{(B_2 u, u)} \le c. \]
$J$ is extended to a self-adjoint operator on $H$. This extended operator will be denoted as $B_2^{-1/2} B_1 B_2^{-1/2}$. It is a positive operator. Since the relation $\mu \sim \nu$ is symmetric, a positive self-adjoint bounded operator $B_1^{-1/2} B_2 B_1^{-1/2}$ is defined similarly.
LEMMA 6.5.– (Second necessary condition for equivalence of Gaussian measures) If measures [6.36] are equivalent, then $a_2 - a_1 \in R(\sqrt{B_1})$.

PROOF.– Let $\{e_k\}$ and $\{\beta_k\}$ be an eigenbasis and the corresponding eigenvalues of $B_1$. For fixed $z \in H$, consider the series
\[ \sum_{k=1}^{\infty} \frac{1}{\sqrt{\beta_k}} (x - a_1, e_k)(z, e_k). \tag{6.39} \]
It is a series of independent normal variables on the probability space $(H, \mathcal{B}(H), \mu)$, with zero mean and variances $(z, e_k)^2$. We have
\[ \sum_{k=1}^{\infty} (z, e_k)^2 = \|z\|^2, \]
and according to the Kolmogorov theorem about two series (see [SHI 16]), the series [6.39] converges for $x \in H$ (mod $\mu$). Its sum $S_z(x) \sim N(0, \|z\|^2)$ under measure $\mu$. Since $\sup_{\|z\| \le 1} E_\mu S_z^2(x) = 1 < \infty$, the family of random variables $\{S_z(x) : \|z\| \le 1\}$ is bounded in probability $\mu$, i.e.
\[ \sup_{\|z\| \le 1} \mu(\{x : |S_z(x)| > c\}) \to 0 \quad \text{as } c \to \infty. \tag{6.40} \]
Because $\mu \sim \nu$, the series [6.39] converges to $S_z(x)$ for $x \in H$ (mod $\nu$). Then it should hold
\[ \sup_{\|z\| \le 1} \nu(\{x : |S_z(x)| > c\}) \to 0 \quad \text{as } c \to \infty. \tag{6.41} \]
Otherwise, if [6.41] fails, then it would be possible to use [6.40] to construct a sequence $A_n$ of Borel sets in $H$ such that $\mu(A_n)$ tends to 0 and $\nu(A_n)$ does not tend to 0 as $n \to \infty$. But this contradicts the equivalence of $\nu$ and $\mu$. Thus, [6.41] holds true, and the family of random variables $\{S_z(x) : \|z\| \le 1\}$ is bounded in probability $\nu$ as well. But under $\nu$, $S_z(x)$ is a normal r.v. with mean
\[ m(z) = \sum_{k=1}^{\infty} \frac{1}{\sqrt{\beta_k}} (a_2 - a_1, e_k)(z, e_k). \]
Now, [6.41] implies that $\sup_{\|z\| \le 1} |m(z)| < \infty$, and $m(z)$ is a linear continuous functional on $H$, $m(z) = (h, z)$ for some $h \in H$. We substitute here $z = B_1^{1/2} u$, $u \in H$, and obtain
\[ (a_2 - a_1, u) = \sum_{k=1}^{\infty} (a_2 - a_1, e_k)(u, e_k) = (h, B_1^{1/2} u) = (B_1^{1/2} h, u); \qquad \text{hence } a_2 - a_1 = B_1^{1/2} h. \]
REMARK 6.7.– In what follows, the sum [6.39] under measure $\mu$ will be denoted as $(B_1^{-1/2}(x - a_1), z)$. For fixed $z \in H$, it is a measurable functional on $(H, \mathcal{B}(H), \mu)$; see the functional $(b, B^{-1/2} x)$ in [6.33]. If some $\beta_{k_0} = 0$, then $(x - a_1, e_{k_0}) = 0$ (mod $\mu$), and in [6.39] $\frac{1}{\sqrt{\beta_{k_0}}} (x - a_1, e_{k_0})(z, e_{k_0}) = 0$ by definition.

REMARK 6.8.– For measures [6.36], if $a_1 - a_2 \notin R(\sqrt{B_1})$, then $\mu \perp \nu$.

PROOF.– We may and do assume that there exist $c_1 > 0$, $c_2 > 0$ with
\[ c_1 (B_2 z, z) \le (B_1 z, z) \le c_2 (B_2 z, z), \qquad z \in H, \tag{6.42} \]
otherwise by remark 6.6 $\mu \perp \nu$ and the statement is proven.

Relation [6.42] implies that the operator $B_1^{-1/2} B_2 B_1^{-1/2} \in L(H)$ is well defined (see discussion above lemma 6.5).
We use the series [6.39] and its partial sum
\[ S_{nz}(x) = \sum_{k=1}^{n} \frac{1}{\sqrt{\beta_k}} (x - a_1, e_k)(z, e_k). \]
Under $\mu$, $S_{nz}(x) \to S_z(x) \sim N(0, \|z\|^2)$ as $n \to \infty$ for almost all $x \in H$. Now, we study the behavior of $S_{nz}(x)$ under $\nu$. We have
\[ S_{nz}(x) = \sum_{k=1}^{n} \frac{1}{\sqrt{\beta_k}} (a_2 - a_1, e_k)(z, e_k) + \sum_{k=1}^{n} \frac{1}{\sqrt{\beta_k}} (x - a_2, e_k)(z, e_k) =: m_n(z) + T_{nz}(x). \]
Since $a_2 - a_1 \notin R(\sqrt{B_1})$, it holds $\sum_{1}^{\infty} \frac{(a_2 - a_1, e_k)^2}{\beta_k} = +\infty$, and we can select $z \in H$ with $m_n(z) \to +\infty$ as $n \to \infty$. Next, $T_{nz}(x)$ is a normal r.v. with respect to $\nu$, with zero mean and variance
\[ D_\nu T_{nz}(x) = \sum_{k,j=1}^{n} \frac{(B_2 e_k, e_j)(z, e_k)(z, e_j)}{\sqrt{\beta_k \beta_j}}. \]
Using the definition of the operator $F = B_1^{-1/2} B_2 B_1^{-1/2}$, we get
\[ (F e_k, e_j) = (B_1^{-1/2} B_2 B_1^{-1/2} e_k, e_j) = (B_2 B_1^{-1/2} e_k, B_1^{-1/2} e_j) = \left( B_2 \frac{e_k}{\sqrt{\beta_k}}, \frac{e_j}{\sqrt{\beta_j}} \right) = \frac{(B_2 e_k, e_j)}{\sqrt{\beta_k \beta_j}}. \]
Thus,
\[ D_\nu T_{nz}(x) = \sum_{k,j=1}^{n} (F e_k, e_j)(z, e_k)(z, e_j) = (F z_n, z_n), \qquad z_n = \sum_{k=1}^{n} (z, e_k) e_k. \]
Hence $D_\nu T_{nz}(x) \to (Fz, z)$ as $n \to \infty$, and $T_{nz}(x)$ under $\nu$ converges in probability to a normal variable $N(0, (Fz, z))$. To summarize, $S_{nz}(x) \to +\infty$ under $\nu$ in probability, and therefore, for each $c > 0$
\[ \nu(\{x : |S_{nz}(x)| > c\}) \to 1 \quad \text{as } n \to \infty. \tag{6.43} \]
Meanwhile
\[ \mu(\{x : |S_{nz}(x)| > c\}) \to \mu(\{x : |S_z(x)| > c\}) \quad \text{as } n \to \infty, \tag{6.44} \]
and the limit is small for large $c$, since
\[ \lim_{c \to +\infty} \mu(\{x : |S_z(x)| > c\}) = 0. \tag{6.45} \]
In view of problem (8) posed at the end of section 6.3.3, relations [6.43]–[6.45] imply that $\mu \perp \nu$.

We need some information about the spectral resolution of self-adjoint operators (see [BER 12] for details). Let $G$ be a Hilbert space. A monotone mapping $P(\cdot)$ from the real line into the set of orthogonal projectors on $G$ is called a resolution of the identity if it is left-continuous w.r.t. the strong operator convergence and satisfies the conditions
\[ \lim_{t \to -\infty} P(t) = 0, \qquad \lim_{t \to +\infty} P(t) = I, \]
where the limits are taken in the sense of strong operator convergence. According to the spectral decomposition theorem, every self-adjoint bounded operator $A$ on $G$ has an integral representation
\[ A = \int_{\mathbb{R}} \lambda \, dP(\lambda), \tag{6.46} \]
where $P(\cdot)$ is some resolution of the identity. The integral in [6.46] is taken over some interval containing the spectrum of $A$ and can be defined as a Riemann–Stieltjes integral based on the uniform operator convergence. If $A$ is a compact self-adjoint operator, representation [6.46] takes the form
\[ A = \sum_{k \ge 1} \lambda_k P_{G_k}, \tag{6.47} \]
where $\{\lambda_k, k \ge 1\}$ is an at most countable collection of non-zero eigenvalues (without multiplicity), $G_k$ are the corresponding finite-dimensional eigenspaces and $P_{G_k}$ is the orthoprojector on $G_k$. If the number of eigenvalues is infinite, then $\lambda_k \to 0$ as $k \to \infty$ and the series in [6.47] converges in the operator norm (i.e. uniformly). For the corresponding resolution of the identity $P(\cdot)$ it holds $P(\lambda_k+) - P(\lambda_k) = P_{G_k}$, $P(0+) - P(0) = P_{\operatorname{Ker} A}$ (this is the orthoprojector on $\operatorname{Ker} A$), and if $\lambda \notin \sigma(A)$, then $P(\cdot)$ is continuous at point $\lambda$ w.r.t. strong operator convergence; moreover, if $[a, b] \cap \sigma(A) = \emptyset$, then $P(\lambda) = P(a)$, $\lambda \in [a, b]$. We note that in this case,
\[ \dim P(-\delta) G < \infty \quad \text{and} \quad \dim (I - P(\delta)) G < \infty, \qquad \text{for each } \delta > 0. \tag{6.48} \]
The following criterion holds true: a bounded self-adjoint operator $A$ on $G$ is compact if, and only if, relation [6.48] holds for the corresponding resolution of the identity $P(\cdot)$.

LEMMA 6.6.– (Third necessary condition for equivalence of Gaussian measures) If measures [6.36] are equivalent, then the operator $D := B_2^{-1/2} B_1 B_2^{-1/2} - I$ is compact on the separable infinite-dimensional Hilbert space $H$.

PROOF.– Let $P(\cdot)$ be the resolution of the identity for the self-adjoint bounded operator $D$. We argue by contradiction and suppose that $D$ is not a compact operator. Then there exists $\delta > 0$ such that at least one of the two subspaces $P(-\delta) H$ and $(I - P(\delta)) H$ has infinite dimension. Then, it is possible to construct an infinite orthonormal system $\{f_k, k \ge 1\}$ belonging to one of those subspaces, with $(D f_k, f_j) = 0$, $k \neq j$. These vectors can be taken from eigenspaces $(P(\lambda_k+) - P(\lambda_k)) H$ and from subspaces $(P(u_k) - P(d_k)) H$, where $(d_k, u_k)$ are disjoint intervals from $(-\infty, -\delta) \cup (\delta, +\infty)$ which do not contain eigenvalues.

Consider a measurable functional $(B_2^{-1/2}(x - a_2), f_k)$ (see remark 6.7). It is a normal r.v. on the probability space $(H, \mathcal{B}(H), \nu)$ and therefore on the probability space $(H, \mathcal{B}(H), \mu)$ as well, since $\mu \sim \nu$ and the corresponding series
\[ \sum_{n=1}^{\infty} \frac{1}{\sqrt{\beta_n}} (x - a_2, e_n)(f_k, e_n) \]
converges a.s. for both probability spaces. We introduce a mapping $T : H \to \mathbb{R}^\infty$,
\[ Tx = \left( (B_2^{-1/2}(x - a_2), f_k),\; k \ge 1 \right). \tag{6.49} \]
The induced measures $\mu T^{-1}$ and $\nu T^{-1}$ are equivalent as well. Actually, $T$ is a Gaussian random element in $\mathbb{R}^\infty$ for both probability spaces. Let us find the distribution of $T$ in both cases. It holds:
\[ E_\nu (B_2^{-1/2}(x - a_2), f_k) = 0, \qquad E_\nu (B_2^{-1/2}(x - a_2), f_k)(B_2^{-1/2}(x - a_2), f_j) = (f_k, f_j) = \delta_{kj}. \]
Thus, $\nu T^{-1}$ is the product measure $\prod_{k=1}^{\infty} N(0, 1)$ on $\mathbb{R}^\infty$.

Next, by lemma 6.5, $a_1 - a_2 \in R(\sqrt{B_2})$, and
\[ E_\mu (B_2^{-1/2}(x - a_2), f_k) = (B_2^{-1/2}(a_1 - a_2), f_k) =: \hat{a}_k. \]
By lemma 6.4, $B_2^{-1/2} B_1 B_2^{-1/2} \in L(H)$, and as in the proof of remark 6.8, it holds for the components $(Tx)_k = (B_2^{-1/2}(x - a_2), f_k)$:
\[ \operatorname{Cov}_\mu((Tx)_k, (Tx)_j) = \int_H (B_2^{-1/2}(x - a_1), f_k)(B_2^{-1/2}(x - a_1), f_j)\, d\mu(x) = (B_2^{-1/2} B_1 B_2^{-1/2} f_k, f_j) = (D f_k, f_j) + (f_k, f_j). \]
This equals 0 for $k \neq j$, and for $k = j$
\[ D_\mu (Tx)_k = (D f_k, f_k) + (f_k, f_k) = (D f_k, f_k) + 1 =: \hat{b}_k. \]
Thus, $\mu T^{-1}$ is the product measure $\prod_{1}^{\infty} N(\hat{a}_k, \hat{b}_k)$ on $\mathbb{R}^\infty$. Since $\mu T^{-1} \sim \nu T^{-1}$, we apply theorem 6.5 and obtain
\[ \sum_{k=1}^{\infty} \delta_k^2 = \sum_{k=1}^{\infty} \left( \frac{\hat{b}_k - 1}{1} \right)^2 = \sum_{k=1}^{\infty} (D f_k, f_k)^2 < \infty. \]
On the other hand, $|(D f_k, f_k)| \ge \delta$, $k \ge 1$, and $\sum_{1}^{\infty} \delta_k^2 = \infty$. We come to a contradiction. Thus, $D$ is a compact operator.

REMARK 6.9.– For measures [6.36], if $a_1 - a_2 \in \sqrt{B_1} H$ and there exists $C > 0$ with $(B_1 z, z) \le C (B_2 z, z)$, $z \in H$, and if additionally the operator $D = B_2^{-1/2} B_1 B_2^{-1/2} - I$ is not compact, then $\mu \perp \nu$.

PROOF.– We use the notation from the proof of lemma 6.6. For the probability space $(H, \mathcal{B}(H), \nu)$, we consider the mapping $T : H \to \mathbb{R}^\infty$ given in [6.49]. Under $\mu$, the series
\[ \sum_{n=1}^{\infty} \frac{1}{\sqrt{\beta_n}} (x - a_2, e_n)(f_k, e_n) \]
converges in probability to a normal variable distributed as $N(\hat{a}_k, \hat{b}_k)$; according to the Riesz theorem there exists a sequence $n_m \to \infty$ such that for all $k \ge 1$,
\[ \sum_{n=1}^{n_m} \frac{1}{\sqrt{\beta_n}} (x - a_2, e_n)(f_k, e_n) =: S_{n_m}(x; f_k) \]
converges $\mu$-almost surely to $\xi_k(x) \sim N(\hat{a}_k, \hat{b}_k)$. Now, we define the mapping $\tilde{T} : H \to \mathbb{R}^\infty$, $(\tilde{T} x)_k = \lim_{m \to \infty} S_{n_m}(x; f_k)$, $k \ge 1$. It is well defined for all $x \in H \setminus H_0$ with $\mu(H_0) = \nu(H_0) = 0$. We have
\[ \nu \tilde{T}^{-1} = \prod_{k=1}^{\infty} N(0, 1), \qquad \mu \tilde{T}^{-1} = \prod_{k=1}^{\infty} N(\hat{a}_k, \hat{b}_k). \]
Since $D$ is not compact, $\sum_{1}^{\infty} \delta_k^2 = \infty$ and $\nu \tilde{T}^{-1} \perp \mu \tilde{T}^{-1}$; hence $\nu \perp \mu$.
6.3.3. Criterion for equivalence of Gaussian measures

THEOREM 6.8.– (Feldman–Hájek theorem) Let $\mu = N(a_1, B_1)$ and $\nu = N(a_2, B_2)$ be Gaussian measures on a real separable infinite-dimensional Hilbert space $H$.

a) $\mu \sim \nu$ if, and only if, it holds:
1) $a_2 - a_1 \in \sqrt{B_1} H$,
2) there exist $C_1, C_2 > 0$ such that $C_1 (B_2 z, z) \le (B_1 z, z) \le C_2 (B_2 z, z)$, $z \in H$,
3) $D := B_2^{-1/2} B_1 B_2^{-1/2} - I$ is a Hilbert–Schmidt operator.

b) If at least one of conditions 1–3 is violated, then $\mu \perp \nu$.

PROOF.– The necessity of conditions 1 and 2, and the mutual singularity of $\mu$ and $\nu$ if at least one of the first two conditions is violated, follow from lemmas 6.5 and 6.4 and from remarks 6.8 and 6.6. Now, assume that conditions 1 and 2 hold true. If $\mu \sim \nu$ then $D$ is a compact operator, and if $D$ is not a compact operator, then $\nu \perp \mu$ (see lemma 6.6 and remark 6.9). Thus, we assume conditions 1 and 2 and that $D$ is a compact operator. If $D = 0$, then $D \in S_2(H)$ and $B_1 = B_2$; hence by theorem 6.6 $\nu \sim \mu$.

Now, let $D \neq 0$ and $\{f_k, k \ge 1\}$ be an eigenbasis of $D$ in the subspace $H \ominus \operatorname{Ker} D$, $\operatorname{Ker} D = \operatorname{Ker} B_1 = \operatorname{Ker} B_2$. Consider random variables $(B_2^{-1/2}(x - a_2), f_k)$ (see the proof of lemma 6.6). On the probability space $(H, \mathcal{B}(H), \nu)$, they are i.i.d. standard normal random variables, and on $(H, \mathcal{B}(H), \mu)$ they are independent normal with means $(B_2^{-1/2}(a_1 - a_2), f_k)$ and variances $1 + (D f_k, f_k)$ (see the proof in remark 6.9). Theorem 6.5 implies that conditions
\[ \sum_{k \ge 1} (B_2^{-1/2}(a_1 - a_2), f_k)^2 < \infty \quad \text{and} \quad \sum_{k \ge 1} (D f_k, f_k)^2 < \infty \tag{6.50} \]
are necessary and sufficient for $\mu \sim \nu$. The first condition in [6.50] is equivalent to condition 1, and the second condition in [6.50] is equivalent to $D \in S_2(H)$. If $D \notin S_2(H)$, then $\sum_{k \ge 1} (D f_k, f_k)^2 = \infty$ and theorem 6.5 implies that $\mu \perp \nu$ (see the proof in remark 6.9).

COROLLARY 6.6.– On a real separable infinite-dimensional Hilbert space, Gaussian measures $N(a_1, B_1)$ and $N(a_2, B_2)$ are either equivalent or mutually singular. They are equivalent if, and only if, it holds:
1) $N(a_1, B_1) \sim N(a_2, B_1)$;
2) $N(a_2, B_1) \sim N(a_2, B_2)$.

Now, we find the Radon–Nikodym derivative for two Gaussian measures with different correlation operators. Consider two centered Gaussian measures on $H$,
\[ \mu = N(0, B_1) \quad \text{and} \quad \nu = N(0, B_2), \qquad \mu \sim \nu. \tag{6.51} \]
By theorem 6.8, the following operator is a Hilbert–Schmidt one,
\[ F := B_1^{-1/2} B_2 B_1^{-1/2} - I, \tag{6.52} \]
with eigenbasis $\{e_k\}$ and corresponding eigenvalues $\{\beta_k\}$, $\sum_{1}^{\infty} \beta_k^2 < \infty$. Since $\operatorname{Ker}(B_1^{-1/2} B_2 B_1^{-1/2}) = \{0\}$, it holds $\beta_k > -1$, $k \ge 1$.

Consider a probability space $(\mathbb{R}^\infty, \mathcal{B}(\mathbb{R}^\infty), \bar{\mu})$ with
\[ \bar{\mu} = \prod_{k=1}^{\infty} N(0, 1), \tag{6.53} \]
and the Hilbert space $G := L_2(\mathbb{R}^\infty, H)$ of random elements in $H$ with finite strong second moments; the inner product in this space is as follows:
\[ (\xi, \eta)_G = \int_{\mathbb{R}^\infty} (\xi(x), \eta(x))_H \, d\bar{\mu}(x). \]
The series
\[ \sum_{k=1}^{\infty} x_k \sqrt{B_1}\, e_k \tag{6.54} \]
converges in $G$ to some random element $\psi(x)$, since the summands of [6.54] are pairwise orthogonal in $G$ and moreover,
\[ \sum_{k=1}^{\infty} \left\| x_k \sqrt{B_1}\, e_k \right\|_G^2 = \sum_{k=1}^{\infty} (\sqrt{B_1}\, e_k, \sqrt{B_1}\, e_k) = \operatorname{tr} B_1 < \infty. \tag{6.55} \]
Relation [6.55] implies that for partial sums $S_n(x)$ of [6.54], $\|S_n(x) - \psi(x)\|_H$ converges in probability to 0. A series of independent random elements in $H$ converges a.s. if, and only if, it converges in probability (see [BUL 80] or [BUL 81]). Hence, the series [6.54] converges to $\psi(x)$ $\bar{\mu}$-a.s. in the norm of $H$.

THEOREM 6.9.– (About the Radon–Nikodym derivative of two centered Gaussian measures) For measures [6.51], fix an arbitrary modification of the Radon–Nikodym derivative $\frac{d\nu}{d\mu}$. Then,
\[ \frac{d\nu}{d\mu}(\psi(x)) = \prod_{k=1}^{\infty} \frac{1}{\sqrt{1 + \beta_k}} \exp\left\{ \frac{\beta_k x_k^2}{2(1 + \beta_k)} \right\}, \tag{6.56} \]
where the measurable mapping $\psi : \mathbb{R}^\infty \to H$ is the sum of series [6.54], which converges $\bar{\mu}$-a.s., and [6.56] holds for almost all $x \in \mathbb{R}^\infty$ with respect to the measure $\bar{\mu}$; the latter is given in [6.53].

PROOF.– 1) Introduce a Gaussian measure on $\mathbb{R}^\infty$,
\[ \bar{\nu} := \prod_{k=1}^{\infty} N(0, 1 + \beta_k). \tag{6.57} \]
Since $\sum_{1}^{\infty} \beta_k^2 < \infty$, $\bar{\nu} \sim \bar{\mu}$ (see theorem 6.8), and therefore, the series [6.54] converges to $\psi(x)$ $\bar{\nu}$-a.s. as well.

2) Prove that
\[ \mu = \bar{\mu} \psi^{-1}, \qquad \nu = \bar{\nu} \psi^{-1}. \tag{6.58} \]
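Indeed, both measures $\bar{\mu} \psi^{-1}$ and $\bar{\nu} \psi^{-1}$ in [6.58] are centered Gaussian measures on $H$. Next, for $y \in H$ as $m \to \infty$,
\[ \int_{\mathbb{R}^\infty} (y, S_{n_m}(z))^2 \, d\bar{\mu}(z) = \sum_{j,k=1}^{n_m} \int_{\mathbb{R}^\infty} z_j z_k (\sqrt{B_1}\, e_j, y)(\sqrt{B_1}\, e_k, y) \, d\bar{\mu}(z) = \sum_{j=1}^{n_m} (\sqrt{B_1}\, e_j, y)^2 = \sum_{j=1}^{n_m} (\sqrt{B_1}\, y, e_j)^2 \to \left\| \sqrt{B_1}\, y \right\|^2 = (B_1 y, y), \]
and since $\|S_{n_m} - \psi\|_G \to 0$, we have
\[ (B_1 y, y) = \int_{\mathbb{R}^\infty} (y, \psi(z))^2 \, d\bar{\mu}(z) = \int_H (y, t)^2 \, d(\bar{\mu} \psi^{-1})(t), \qquad y \in H. \]
Hence, $\bar{\mu} \psi^{-1} = N(0, B_1) = \mu$. In a similar way, as $m \to \infty$,
\[ \int_{\mathbb{R}^\infty} (y, S_{n_m}(z))^2 \, d\bar{\nu}(z) = \sum_{j=1}^{n_m} (1 + \beta_j)(\sqrt{B_1}\, y, e_j)^2 \to \left( (I + F) \sqrt{B_1}\, y, \sqrt{B_1}\, y \right) = (B_2 y, y). \tag{6.59} \]
In the space $\tilde{G} := L_2(\mathbb{R}^\infty, H, \bar{\nu})$ of random elements with respect to the probability $\bar{\nu}$, we have
\[ \sum_{k=1}^{\infty} \left\| x_k \sqrt{B_1}\, e_k \right\|_{\tilde{G}}^2 = \sum_{k=1}^{\infty} (\beta_k + 1) \left\| \sqrt{B_1}\, e_k \right\|^2 = \sum_{k=1}^{\infty} (B_2 e_k, e_k) = \operatorname{tr} B_2 < \infty, \]
and the series [6.54] converges to $\psi(x)$ in $\tilde{G}$ as well. Hence [6.59] implies
\[ (B_2 y, y) = \int_{\mathbb{R}^\infty} (y, \psi(z))^2 \, d\bar{\nu}(z) = \int_H (y, t)^2 \, d(\bar{\nu} \psi^{-1})(t), \qquad y \in H, \]
and $\bar{\nu} \psi^{-1} = N(0, B_2) = \nu$.

3) Based on theorem 6.3, find $\frac{d\bar{\nu}}{d\bar{\mu}}$. Let $\mu_k = N(0, 1)$, $\nu_k = N(0, \beta_k + 1)$, $k \ge 1$. It holds
\[ \frac{d\nu_k}{d\mu_k}(x) = \frac{1}{\sqrt{1 + \beta_k}} \exp\left\{ -\frac{x^2}{2(1 + \beta_k)} + \frac{x^2}{2} \right\} = \frac{1}{\sqrt{1 + \beta_k}} \exp\left\{ \frac{\beta_k x^2}{2(1 + \beta_k)} \right\}, \qquad x \in \mathbb{R}; \]
\[ \frac{d\bar{\nu}}{d\bar{\mu}}(x) = \lim_{n \to \infty} \prod_{k=1}^{n} \frac{1}{\sqrt{1 + \beta_k}} \exp\left\{ \frac{\beta_k x_k^2}{2(1 + \beta_k)} \right\} = \prod_{k=1}^{\infty} \frac{1}{\sqrt{1 + \beta_k}} \exp\left\{ \frac{\beta_k x_k^2}{2(1 + \beta_k)} \right\}. \tag{6.60} \]
The product in [6.60] converges a.e. with respect to $\bar{\mu}$.

4) Let $A \in \mathcal{B}(H)$. From [6.58], we get
\[ \nu(A) = \int_A \frac{d\nu}{d\mu}(t) \, d\mu(t) = \int_{\psi^{-1} A} \frac{d\nu}{d\mu}(\psi(x)) \, d\bar{\mu}(x), \qquad \nu(A) = \bar{\nu}(\psi^{-1} A) = \int_{\psi^{-1} A} \frac{d\bar{\nu}}{d\bar{\mu}}(x) \, d\bar{\mu}(x); \]
hence
\[ \frac{d\nu}{d\mu}(\psi(x)) = \frac{d\bar{\nu}}{d\bar{\mu}}(x) \pmod{\bar{\mu}}, \]
and [6.56] follows from [6.60].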
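The factors in [6.60] can be checked numerically for a finite number of coordinates. The sketch below uses assumed illustrative values of $\beta_k$ and $x_k$ and hypothetical function names. It compares the logarithm of the partial product in [6.60] with the log-density ratio of $N(0, 1 + \beta_k)$ against $N(0, 1)$.

```python
import math


def log_rn_partial(x, beta):
    """Logarithm of the n-term partial product in [6.60]."""
    return sum(bk * xk * xk / (2.0 * (1.0 + bk)) - 0.5 * math.log1p(bk)
               for xk, bk in zip(x, beta))


def log_rn_direct(x, beta):
    """The same quantity as a sum of log-density differences, N(0, 1 + beta_k) against N(0, 1)."""
    def logpdf(t, var):
        return -0.5 * math.log(2.0 * math.pi * var) - t * t / (2.0 * var)
    return sum(logpdf(xk, 1.0 + bk) - logpdf(xk, 1.0) for xk, bk in zip(x, beta))


# Illustrative values (assumptions): every beta_k must exceed -1.
beta = [0.5, -0.3, 0.1]
x = [1.2, -0.7, 2.0]
print(log_rn_partial(x, beta), log_rn_direct(x, beta))
```

Both numbers coincide up to rounding, which is the one-dimensional computation of step 3 applied coordinate by coordinate.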
REMARK 6.10.– (Simplified version of Radon–Nikodym derivative) Under the conditions of theorem 6.9, if additionally the operator $F$ in [6.52] is nuclear, then $\sum_{1}^{\infty} |\beta_k| < \infty$ and
\[ \frac{d\nu}{d\mu}(\psi(x)) = \frac{1}{\sqrt{\det(I + F)}} \exp\left\{ \sum_{k=1}^{\infty} \frac{\beta_k x_k^2}{2(1 + \beta_k)} \right\} \pmod{\bar{\mu}}, \]
where $\det(I + F)$ is equal to the convergent product $\prod_{1}^{\infty} (1 + \beta_k)$.
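To see the convergence of the product defining $\det(I + F)$ on a concrete case, one can take the assumed eigenvalues $\beta_k = k^{-2}$ (an illustrative choice, not from the text), for which $\sum |\beta_k| < \infty$ and the classical identity $\prod_{k \ge 1} (1 + k^{-2}) = \sinh(\pi)/\pi$ gives the limit in closed form. The helper name `det_partial` is hypothetical.

```python
import math


def det_partial(beta, n):
    """Partial product prod_{k=1..n} (1 + beta_k), computed through logarithms."""
    return math.exp(sum(math.log1p(beta(k)) for k in range(1, n + 1)))


def beta(k):
    # Illustrative summable eigenvalues (assumption): beta_k = k^-2.
    return 1.0 / (k * k)


limit = math.sinh(math.pi) / math.pi
for n in (10, 1000, 100000):
    print(n, det_partial(beta, n), limit)
```

The partial products increase to the limit, with an error of order $1/n$.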
Problems 6.3

6) Let $(X, S)$ be a measurable space, $X_r \in S$, $\mu$ and $\nu$ be measures on the measurable space $(X_r, X_r \cap S)$, $\mu_e(A) = \mu(A \cap X_r)$, $\nu_e(A) = \nu(A \cap X_r)$, $A \in S$. Prove that $\mu \ll \nu$ if, and only if, $\mu_e \ll \nu_e$, and that $\mu \perp \nu$ if, and only if, $\mu_e \perp \nu_e$.

7) Under the conditions of theorem 6.6(a) find $\frac{d\mu}{d\nu}$.

8) Prove that probability measures $\mu$ and $\nu$ on a measurable space $(X, S)$ are mutually singular if, and only if, there exists a sequence $\{A_n\}$ of sets such that $\mu(A_n) \to 0$ and $\nu(A_n) \to 1$ as $n \to \infty$.

9) Let $t \in \mathbb{R} \setminus \{-1, 1\}$ and $\xi$ be a Gaussian random element in a real separable infinite-dimensional Hilbert space $H$, with a correlation operator which is not a finite-dimensional operator. Prove that for each $c \in H$, the distributions of $\xi$ and $t\xi + c$ are mutually singular.

6.4. Applications in statistics

We apply the results of section 6.3 to the estimation of the parameters of Gaussian random elements in a real separable infinite-dimensional Hilbert space $H$ and to hypothesis testing about those parameters.

6.4.1. Estimation and hypothesis testing for mean of Gaussian random element

Consider a random element $X$ in $H$ with Gaussian distribution $N(a, B)$, where the correlation operator $B$ is known, $B$ is a non-zero S-operator, and the mean $a \in H$ is unknown and is to be estimated from a single realization $X(\omega)$ of the underlying random element. Assume additionally that $a \in L_m$, where $L_m$ is a given $m$-dimensional subspace of $R(\sqrt{B})$. According to theorem 6.6, the distribution of $X$, $\mu_a = N(a, B)$, is equivalent to $\mu_0 = N(0, B)$. Based on theorem 6.7, one can construct the maximum likelihood estimator (MLE) of $a$. The mean can be decomposed as
\[ a = B^{1/2} b, \qquad b = \sum_{i=1}^{m} \alpha_i f_i, \]
where $\alpha_i \in \mathbb{R}$ are unknown, $\{f_i\}$ is a known orthonormal system with $f_i \perp \operatorname{Ker} B$, $i = 1, \dots, m$. According to [6.32], the log-likelihood function is as follows:
\[ L(\alpha_1, \dots, \alpha_m; X) = \log \frac{d\mu_a}{d\mu_0}(X) = \sum_{i=1}^{m} \alpha_i (f_i, B^{-1/2} X) - \frac{1}{2} \sum_{i=1}^{m} \alpha_i^2. \tag{6.61} \]
Maximizing [6.61], we get the MLEs
\[ \hat{\alpha}_i = (f_i, B^{-1/2} X), \quad i = 1, \dots, m, \qquad \hat{b} = \sum_{i=1}^{m} (f_i, B^{-1/2} X) f_i, \qquad \hat{a} = \sum_{i=1}^{m} (f_i, B^{-1/2} X) B^{1/2} f_i. \tag{6.62} \]
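In eigen-coordinates of $B$, the estimators [6.62] reduce to weighted sums. The following is a finite-dimensional sketch only: it assumes finitely many strictly positive eigenvalues, so the limit defining $(f_i, B^{-1/2} X)$ becomes a finite sum, and the function name `mle_mean` and the sample data are hypothetical.

```python
import math


def mle_mean(X, beta, F):
    """Estimators [6.62] in eigen-coordinates (finite-dimensional sketch).

    X    : coordinates X_k = (X, e_k) of the observation
    beta : strictly positive eigenvalues beta_k of B
    F    : list of orthonormal vectors f_i, each given by its coordinates (f_i, e_k)
    """
    # alpha_hat_i = (f_i, B^{-1/2} X) = sum_k (f_i, e_k) X_k / sqrt(beta_k)
    alpha_hat = [sum(fk * xk / math.sqrt(bk) for fk, xk, bk in zip(f, X, beta))
                 for f in F]
    # b_hat = sum_i alpha_hat_i f_i, and a_hat = B^{1/2} b_hat
    b_hat = [sum(al * f[k] for al, f in zip(alpha_hat, F)) for k in range(len(beta))]
    a_hat = [math.sqrt(bk) * bh for bk, bh in zip(beta, b_hat)]
    return alpha_hat, b_hat, a_hat


# Noise-free illustration (assumption): X equals a = B^{1/2} b with b = 2 f_1 - f_2.
beta = [1.0, 0.25, 0.04]
F = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
X = [2.0, -0.5, 0.0]
alpha_hat, b_hat, a_hat = mle_mean(X, beta, F)
print(alpha_hat, a_hat)
```

With a noise-free observation the sketch recovers $\alpha = (2, -1)$ and $\hat{a} = X$; with a noisy $X$ the estimators fluctuate around these values.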
Remember that by [6.33]
\[ (f_i, B^{-1/2} X) = \lim_{n \to \infty} \sum_{k=1}^{n} \frac{(f_i, e_k) X_k}{\sqrt{\beta_k}}, \tag{6.63} \]
and the limit in [6.63] exists a.e. with respect to both $\mu_0$ and $\mu_a$. The estimator [6.62] is unbiased. Indeed, $X = a + X_0$, $X_0 \sim N(0, B)$ and
\[ E(f_i, B^{-1/2} X) = (f_i, B^{-1/2} a) + E(f_i, B^{-1/2} X_0) = (f_i, b), \qquad E\hat{a} = \sum_{i=1}^{m} (f_i, b) B^{1/2} f_i = B^{1/2} b = a. \]
Here, the expectation is understood in the Bochner sense.

Another application of relation [6.32] is hypothesis testing. Suppose that we observe a single realization $X(\omega)$ of a random element $X$ in $H$ with a known non-zero correlation operator $B$. We test the null hypothesis $H_0 : X \sim N(0, B)$ against the alternative $H_1 : X \sim N(B^{1/2} b, B)$, where $b$ is a fixed non-zero vector, $b \perp \operatorname{Ker} B$.
We set $a = B^{1/2} b$, $a \neq 0$. Under $H_0$, $X$ has distribution $\mu_0 = N(0, B)$, and under $H_1$, it has distribution $\mu_a = N(a, B)$, $\mu_a \sim \mu_0$; the Radon–Nikodym derivative $\frac{d\mu_a}{d\mu_0}$ is given in [6.32]. We construct the Neyman–Pearson test. The inequality $\frac{d\mu_a}{d\mu_0}(X) \le C_0$ leads to an inequality of the form
\[ (b, B^{-1/2} X) \le C. \tag{6.64} \]
Under $H_0$, $(b, B^{-1/2} X) \sim N(0, \|b\|^2)$ and for any real $C$,
\[ P_{H_0}\left\{ (b, B^{-1/2} X) = C \right\} = 0. \]
Therefore, the Neyman–Pearson test will be non-randomized. Given a confidence level $1 - \alpha \in [0.95; 1]$, we select the threshold $C = C_\alpha$ such that
\[ P_{H_0}\left\{ (b, B^{-1/2} X) > C_\alpha \right\} = \alpha, \quad \text{or} \quad P\left\{ N(0, 1) > \frac{C_\alpha}{\|b\|} \right\} = \alpha; \qquad C_\alpha = z_\alpha \cdot \|b\|. \]
Here $z_\alpha$ is the upper $\alpha$-quantile of the standard normal law. The Neyman–Pearson decision rule is as follows: reject $H_0$ if $(b, B^{-1/2} X) > z_\alpha \cdot \|b\|$, and do not reject $H_0$ if $(b, B^{-1/2} X) \le z_\alpha \cdot \|b\|$. The significance level of the test equals
\[ P_{H_0}\left\{ (b, B^{-1/2} X) > z_\alpha \cdot \|b\| \right\} = \alpha. \]
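The threshold $C_\alpha = z_\alpha \|b\|$ and the resulting power are simple to evaluate with the standard normal quantile and cdf. The sketch below uses Python's `statistics.NormalDist`; the function name `np_test_mean_shift` is hypothetical. The power is computed from the fact that under $H_1$ the statistic $(b, B^{-1/2} X)$ is distributed as $N(\|b\|^2, \|b\|^2)$, which gives $\Phi(\|b\| - z_\alpha)$.

```python
from statistics import NormalDist


def np_test_mean_shift(norm_b, alpha=0.05):
    """Threshold and power of the test 'reject H0 if (b, B^{-1/2} X) > z_alpha * ||b||'."""
    std = NormalDist()                    # standard normal law N(0, 1)
    z_alpha = std.inv_cdf(1.0 - alpha)    # upper alpha-quantile
    threshold = z_alpha * norm_b
    power = std.cdf(norm_b - z_alpha)     # P{N(0, 1) > z_alpha - ||b||}
    return threshold, power


for norm_b in (0.5, 1.0, 2.0, 4.0):
    print(norm_b, np_test_mean_shift(norm_b))
```

The printed powers increase with $\|b\|$ and approach 1, in line with the discussion of the test.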
Under $H_1$, $(b, B^{-1/2} X) \sim N(\|b\|^2, \|b\|^2)$. The power of the test equals
\[ \text{power} = P_{H_1}\left\{ (b, B^{-1/2} X) > z_\alpha \cdot \|b\| \right\} = P\{N(0, 1) > z_\alpha - \|b\|\} = \Phi(\|b\| - z_\alpha). \]
Hereafter $\Phi(x)$ denotes the cdf of the standard normal law. We see that the larger $\|b\|$, the more powerful the test; the power tends to 1 as $\|b\| \to \infty$.

Now, suppose that we test a simple null hypothesis $H_0$ against a simple alternative $H_1$ concerning the distribution of the observed random element $X$ in $H$, and moreover the corresponding distributions $\mu_0$ and $\mu_1$ are mutually singular. Then we can test the hypotheses with zero probability of Type I and II errors. Indeed, $H$ can be partitioned as $H = W_0 \cup W_1$, with $\mu_0(W_0) = 1$ and $\mu_1(W_1) = 1$; if the observation $X(\omega) \in W_0$, then we accept $H_0$, and if $X(\omega) \in W_1$, then we accept $H_1$. Since $P_{H_0}\{X(\omega) \in W_1\} = \mu_0(W_1) = 0$ and $P_{H_1}\{X(\omega) \in W_0\} = \mu_1(W_0) = 0$, the proposed decision rule is exact. The problem is that a desired partition $H = W_0 \cup W_1$ is sometimes difficult to construct. For example, theorem 6.8(b) states that under certain conditions two Gaussian measures are mutually singular, but it does not provide the corresponding partition of $H$.

Now, we consider a simple case where such a partition can be easily constructed. In real problems, this never happens, and the following test is of purely educational, theoretical interest. Let $B$ be an S-operator in $H$ with zero kernel, eigenbasis $\{e_k\}$ and corresponding (positive) eigenvalues $\beta_k$, $k \ge 1$. Suppose that we observe a single realization $X(\omega)$ of a random element $X$ in $H$ with known correlation operator equal to $B$. We test the null hypothesis $H_0 : X \sim N(0, B)$ against the alternative $H_1 : X \sim N(a, B)$, where $a$ is such that
\[ \sum_{k=1}^{\infty} \frac{a_k^2}{\sqrt{\beta_k}} = \infty, \qquad a_k = (a, e_k), \quad k \ge 1. \tag{6.65} \]
Notice that under [6.65], since $\beta_k \to 0$ as $k \to \infty$, it holds
\[ \sum_{k=1}^{\infty} \frac{a_k^2}{\beta_k} = \sum_{k=1}^{\infty} \frac{a_k^2}{\sqrt{\beta_k}} \cdot \frac{1}{\sqrt{\beta_k}} = \infty \]
as well; hence $a \notin R(\sqrt{B})$ and $\mu_0 = N(0, B) \perp \mu_a = N(a, B)$. Remember that $X_k(\omega) = (X(\omega), e_k)$. Define a decision rule as follows: reject $H_0$ if the series
\[ \sum_{k=1}^{\infty} \frac{a_k X_k(\omega)}{\sqrt{\beta_k}} \]
is divergent, and do not reject $H_0$ if the series converges.

Under $H_0$, $X \sim N(0, B)$ and $\sum_{k=1}^{\infty} \frac{a_k X_k(\omega)}{\sqrt{\beta_k}} = (a, B^{-1/2} X)$ converges a.s.; hence the probability of Type I error equals zero. Under $H_1$, it holds $X = a + X_0$, $X_0 \sim N(0, B)$ and due to [6.65]
\[ \sum_{k=1}^{\infty} \frac{a_k X_k(\omega)}{\sqrt{\beta_k}} = (a, B^{-1/2} X_0) + \sum_{k=1}^{\infty} \frac{a_k^2}{\sqrt{\beta_k}} = \infty \]
almost surely. Thus, the probability of Type II error equals zero as well.

6.4.2. Estimation and hypothesis testing for correlation operator of centered Gaussian random element

Let $B_1$ be an S-operator with zero kernel, and for some basis $\{e_k\}$,
\[ B_2 = B_1 + \sum_{k=1}^{\infty} \beta_k B_1^{1/2} P_{[e_k]} B_1^{1/2}, \qquad \beta_k > -1, \quad k \ge 1, \qquad \sum_{k=1}^{\infty} \beta_k^2 < \infty, \tag{6.66} \]
where $P_{[e_k]}$ is the orthoprojector on $\operatorname{span}(e_k)$ and the series in [6.66] converges uniformly, i.e. in the sense of uniform operator convergence. Then using the corresponding quadratic forms and the definition of the bounded linear operator $B_1^{-1/2} B_2 B_1^{-1/2}$ from section 6.3.2, we obtain
\[ F := B_1^{-1/2} B_2 B_1^{-1/2} - I = \sum_{k=1}^{\infty} \beta_k P_{[e_k]}. \tag{6.67} \]
Suppose that we observe a single realization $Y(\omega)$ of a random element $Y$ in $H$, $Y \sim N(0, B_2)$. Here, $B_2$ has the form [6.66] with known $B_1$ and known $e_k$, $k \ge 1$, but unknown $\beta_k$, $k \ge 1$. Now, we make an attempt to construct the maximum likelihood estimators of the coefficients $\beta_k$.

According to theorem 6.8(a), $\nu = N(0, B_2) \sim \mu = N(0, B_1)$, and [6.56] gives the Radon–Nikodym derivative $\frac{d\nu}{d\mu}$ at the point $\psi(x)$, $x \in \mathbb{R}^\infty$, where $\psi(x)$ is given in [6.54]. We can assume that $Y \sim N(0, B_2)$ is represented as
\[ Y(\omega) = \sum_{k=1}^{\infty} X_k(\omega) \sqrt{B_1}\, e_k = \psi(X(\omega)) \quad \text{a.s.}, \tag{6.68} \]
where $X(\omega) = (X_k(\omega))_{1}^{\infty}$ is a random element on $(\mathbb{R}^\infty, \mathcal{B}(\mathbb{R}^\infty), \bar{\nu})$ with the probability measure $\bar{\nu}$ given in [6.57]. In view of [6.68], we may and do assume that we observe random variables $X_1(\omega), \dots, X_k(\omega), \dots$ By [6.56], the log-likelihood function is as follows:
\[ L(\{\beta_k\}; \{X_k(\omega)\}) = \log \frac{d\nu}{d\mu}(\psi(X(\omega))) = \frac{1}{2} \sum_{k=1}^{\infty} \left( \frac{\beta_k X_k^2(\omega)}{1 + \beta_k} - \log(1 + \beta_k) \right). \tag{6.69} \]
Here, the series converges a.s. Maximization in $\beta_k > -1$ of the summand
\[ l_k(\beta_k; X_k(\omega)) = \frac{\beta_k X_k^2(\omega)}{1 + \beta_k} - \log(1 + \beta_k) \]
leads to the estimator $\hat{\beta}_k = X_k^2(\omega) - 1$. It is an unbiased estimator of $\beta_k$, since $E\hat{\beta}_k = (1 + \beta_k) - 1 = \beta_k$. Moreover, $\hat{\beta}_1, \dots, \hat{\beta}_n$ are the maximum likelihood estimators of $\beta_1, \dots, \beta_n$, provided $\beta_{n+1}, \beta_{n+2}, \dots$ are known.

The sequence $(\hat{\beta}_k)_{1}^{\infty}$ is not the MLE of the sequence $(\beta_k)_{1}^{\infty} \in l_2$. For instance, in the case where $\beta_k = 0$, $k \ge 1$ (i.e. when $B_2 = B_1$), $X_k(\omega)$ are i.i.d. standard normal, and then $\sum_{1}^{\infty} \hat{\beta}_k^2 = \infty$ a.s., because by the Strong Law of Large Numbers
\[ n^{-1} \sum_{k=1}^{n} \hat{\beta}_k^2 \to E\left( X_1^2(\omega) - 1 \right)^2 > 0 \quad \text{as } n \to \infty \text{ a.s.}; \]
hence with probability one $(\hat{\beta}_k)_{1}^{\infty} \notin l_2$.
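The failure of $(\hat{\beta}_k)$ to lie in $l_2$ can be illustrated by simulation in the case $\beta_k = 0$. The sketch below is only an illustration with an assumed sample size and seed; the function name is hypothetical. The averages of $\hat{\beta}_k^2 = (X_k^2 - 1)^2$ settle near $E(X_1^2 - 1)^2 = \operatorname{Var} X_1^2 = 2$, so the sum of squares grows linearly in $n$.

```python
import random


def mean_sq_beta_hat(n, seed=0):
    """Average of beta_hat_k^2 = (X_k^2 - 1)^2 over n i.i.d. standard normal X_k."""
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    return sum((rng.gauss(0.0, 1.0) ** 2 - 1.0) ** 2 for _ in range(n)) / n


for n in (1000, 20000, 200000):
    # By the strong law of large numbers the average should approach 2.
    print(n, mean_sq_beta_hat(n))
```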
Now, we switch to hypothesis testing. Suppose that $B_1$ and $B_2$ are the operators described above, and not all $\beta_k$ are zeros. We observe a random element $Y(\omega)$ in $H$ and test the null hypothesis $H_0 : Y \sim N(0, B_1) =: \mu_1$ against the alternative $H^* : Y \sim N(0, B_2) =: \mu_2$. In view of [6.58], we may and do assume that $Y(\omega)$ has representation [6.68] with observed $X(\omega)$, where under $H_0$ $\{X_k(\omega), k \ge 1\}$ are i.i.d. standard normal, and under $H^*$ they are independent with $X_k \sim N(0, 1 + \beta_k)$, $k \ge 1$. We construct the Neyman–Pearson test. In view of [6.56], the inequality $\frac{d\mu_2}{d\mu_1}(X) \le C_0$ leads to an inequality of the form
\[ S(X) := \sum_{k=1}^{\infty} \left( \frac{\beta_k X_k^2(\omega)}{1 + \beta_k} - \log(1 + \beta_k) \right) \le C. \tag{6.70} \]
We will show that the test will be non-randomized. There exists $\beta_j \neq 0$; we decompose
\[ S(X) = \left( \frac{\beta_j X_j^2}{1 + \beta_j} - \log(1 + \beta_j) \right) + \sum_{k \neq j} \left( \frac{\beta_k X_k^2}{1 + \beta_k} - \log(1 + \beta_k) \right) =: \xi + \eta. \]
Under $H_0$, $\xi$ and $\eta$ are independent, and moreover the cdf of $\xi$ is continuous. Hence (see problem (11)), $S(X)$ has continuous cdf, and for any $C \in \mathbb{R}$, it holds $P_{H_0}\{S(X) = C\} = 0$. Therefore, the Neyman–Pearson test will indeed be non-randomized. Given a confidence level $1 - \alpha \in [0.95; 1]$, we select the threshold $C = C_\alpha$ such that
\[ P_{H_0}\{S(X) > C_\alpha\} = \alpha. \tag{6.71} \]
For $\alpha$ small enough, the solution $C_\alpha$ to equation [6.71] exists. For such $\alpha$, the Neyman–Pearson decision rule is as follows: reject $H_0$ if $S(X) > C_\alpha$, and do not reject $H_0$ if $S(X) \le C_\alpha$. The significance level of the test equals $\alpha$.

Consider a convenient particular case where the RHS of [6.67] is an S-operator, i.e. $\beta_k \ge 0$, $k \ge 1$ (not all $\beta_k$ are zeros) and $\sum_{1}^{\infty} \beta_k < \infty$. Then it holds $\sum_{1}^{\infty} \log(1 + \beta_k) < \infty$, and [6.70] takes the form
\[ Q(X) := \sum_{k=1}^{\infty} \frac{\beta_k X_k^2}{1 + \beta_k} \le \tilde{C}. \tag{6.72} \]
The threshold $\tilde{C} = \tilde{C}_\alpha$ can be found from the equation
\[ P_{H_0}\left\{ Q(X) > \tilde{C}_\alpha \right\} = \alpha. \tag{6.73} \]
The Neyman–Pearson decision rule can be rewritten as follows: reject $H_0$ if $Q(X) > \tilde{C}_\alpha$, and do not reject $H_0$ if $Q(X) \le \tilde{C}_\alpha$.

Now, we obtain the approximate solution to equation [6.73]. Under $H_0$, $\{X_k\}$ are i.i.d. standard normal, and the distribution of $Q(X)$ can be approximated by the distribution $A\chi_\nu^2$ with some real positive numbers $A$ and $\nu$ (for such $\nu$, $\chi_\nu^2$ is the Gamma distribution $\Gamma\left(\frac{\nu}{2}, 2\right)$ with shape $\frac{\nu}{2}$ and scale 2). We have
\[ E_{H_0} Q(X) = \sum_{k=1}^{\infty} \frac{\beta_k}{1 + \beta_k}, \qquad \operatorname{Var}_{H_0}[Q(X)] = \sum_{k=1}^{\infty} \frac{\beta_k^2}{(1 + \beta_k)^2} \operatorname{Var}_{H_0}[X_k^2] = 2 \sum_{k=1}^{\infty} \frac{\beta_k^2}{(1 + \beta_k)^2}; \]
\[ E\left[ A\chi_\nu^2 \right] = A\nu, \qquad \operatorname{Var}\left[ A\chi_\nu^2 \right] = 2A^2 \nu. \]
Equalizing the corresponding means and variances, we get a system of equations for $A$ and $\nu$:
\[ A\nu = \sum_{k=1}^{\infty} \frac{\beta_k}{1 + \beta_k}, \qquad A^2 \nu = \sum_{k=1}^{\infty} \frac{\beta_k^2}{(1 + \beta_k)^2}; \tag{6.74} \]
\[ A = \sum_{k=1}^{\infty} \frac{\beta_k^2}{(1 + \beta_k)^2} : \sum_{k=1}^{\infty} \frac{\beta_k}{1 + \beta_k}, \tag{6.75} \]
\[ \nu = \left( \sum_{k=1}^{\infty} \frac{\beta_k}{1 + \beta_k} \right)^2 : \sum_{k=1}^{\infty} \frac{\beta_k^2}{(1 + \beta_k)^2}. \tag{6.76} \]
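The moment matching in [6.74]–[6.76] takes only a few lines of code. The sketch below assumes a finite (truncated) list of non-negative $\beta_k$ with at least one positive entry; the function name `match_scaled_chi2` and the sample values are hypothetical.

```python
def match_scaled_chi2(beta):
    """Return (A, nu) from [6.75]-[6.76] for weights c_k = beta_k / (1 + beta_k).

    Assumes beta_k >= 0 with at least one positive entry, so both sums are positive.
    """
    c = [b / (1.0 + b) for b in beta]
    s1 = sum(c)                    # sum of beta_k / (1 + beta_k), equals A * nu
    s2 = sum(ck * ck for ck in c)  # sum of beta_k^2 / (1 + beta_k)^2, equals A^2 * nu
    return s2 / s1, s1 * s1 / s2


# Illustrative truncated sequence (assumption): beta_k = k^-2, k = 1..50.
beta = [1.0 / (k * k) for k in range(1, 51)]
A, nu = match_scaled_chi2(beta)
print(A, nu)
```

For a single non-zero coefficient, say $\beta_1 = 1$, the sketch returns $A = 0.5$ and $\nu = 1$, which is exact in that case because $Q(X) = 0.5 X_1^2$.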
Note that $\nu \ge 1$ in [6.76]. With such a choice of $A$ and $\nu$, we replace [6.73] with the approximate equation
\[ P\left\{ A\chi_\nu^2 > \tilde{C}_\alpha \right\} = \alpha; \qquad \tilde{C}_\alpha = A\chi_{\nu\alpha}^2, \tag{6.77} \]
where $\chi_{\nu\alpha}^2$ is the upper $\alpha$-quantile of the $\chi_\nu^2$ distribution, i.e. $P\left\{ \chi_\nu^2 > \chi_{\nu\alpha}^2 \right\} = \alpha$. This quantile can be found from statistical tables.

Now, we find the approximate power of the test under $\tilde{C}_\alpha$ selected by [6.77]. Under $H^*$, $\{X_k\}$ are independent with $X_k \sim N(0, 1 + \beta_k)$, $k \ge 1$. Thus, under $H^*$,
\[ Q(X) = \sum_{k=1}^{\infty} \beta_k \gamma_k^2, \qquad \gamma_k = \frac{X_k}{\sqrt{1 + \beta_k}}, \quad k \ge 1, \tag{6.78} \]
where $\{\gamma_k\}$ are i.i.d. standard normal. We approximate the distribution of [6.78] by the distribution $B\chi_\tau^2$. Similarly to [6.75]–[6.76], we obtain
\[ B = \sum_{k=1}^{\infty} \beta_k^2 : \sum_{k=1}^{\infty} \beta_k, \qquad \tau = \frac{\left( \sum_{1}^{\infty} \beta_k \right)^2}{\sum_{1}^{\infty} \beta_k^2}. \]
Thus, the power of the test is as follows:
\[ \text{power} \approx P\left\{ B\chi_\tau^2 > A\chi_{\nu\alpha}^2 \right\} = P\left\{ \chi_\tau^2 > \frac{A}{B} \chi_{\nu\alpha}^2 \right\} = 1 - F_\tau\left( \frac{A}{B} \chi_{\nu\alpha}^2 \right), \tag{6.79} \]
where $F_\tau$ is the cdf of the $\chi_\tau^2$ distribution; the value $F_\tau(z)$ can be found from statistical tables.

Note that the statistical procedures presented in section 6.4 can be reformulated in terms of a Gaussian stochastic process $\xi(t, \omega)$ with square integrable paths on $[0, T]$, which is observed on this interval. Such a process generates a Gaussian measure on the Hilbert space $L_2[0, T]$ (see example 5.3 with $p = 2$). More complicated statistical procedures for observed Gaussian random elements in $H$ can be found in [IBR 80].
Problems 6.4

10) Find the distribution of the estimator [6.62].

11) Let $\xi$ and $\eta$ be independent random variables, and $\xi$ have continuous cdf. Prove that $\xi + \eta$ has continuous cdf as well.

12) Let $B_1$ and $B_2$ be the operators in $H$ as described at the beginning of section 6.4.2, but concerning $\{\beta_k\}$ in [6.67] assume that $-1 < \beta_k \le 0$, $k \ge 1$, not all of them are zeros and $\sum_{1}^{\infty} |\beta_k| < \infty$. We observe a single realization $Y(\omega)$ of a random element $Y$ in $H$ and test a hypothesis $H_0 : Y \sim N(0, B_1)$ against $H^* : Y \sim N(0, B_2)$. Construct the corresponding Neyman–Pearson test and similarly to [6.79] find its approximate power.
7 Solutions
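7.1. Solutions for Chapter 1

1) We have

$$(\lambda_2^T\pi_1^{-1})(A) = \lambda_2^T((A\cap[0,T])\times[0,T]) = \lambda_2((A\cap[0,T])\times[0,T]) = \lambda_1(A\cap[0,T])\cdot\lambda_1([0,T]) = T\cdot\lambda_1(A\cap[0,T]).$$

2) For $B \in \mathcal{B}(\mathbb{R})$, it holds

$$(\mu_1\times\mu_2)(\pi_1^{-1}B) = (\mu_1\times\mu_2)(B\times\mathbb{R}) = \mu_1(B)\cdot\mu_2(\mathbb{R}).$$

Answer: $\mu_2(\mathbb{R})\cdot\mu_1$.

3) a) Let $\mu T^{-1}$ be sigma-finite. Then there exists $\{A_n, n\ge1\}\subset\mathcal{F}$ such that $Y = \cup_{1}^{\infty}A_n$ and $(\mu T^{-1})(A_n) < \infty$, $n\ge1$. Then $X = \cup_{1}^{\infty}T^{-1}A_n$ and $\mu(T^{-1}A_n) < \infty$, $n\ge1$.

b) Construct a counterexample. The Lebesgue measure $\lambda_1$ is sigma-finite on the Borel sigma-algebra $\mathcal{B}(\mathbb{R})$. Let $T:\mathbb{R}\to\mathbb{R}$, $Tx = 0$, $x\in\mathbb{R}$. Since $T^{-1}A = \mathbb{R}$ if $0\in A$ and $T^{-1}A = \emptyset$ otherwise, for the measure $\lambda_1T^{-1}$ on $\mathcal{B}(\mathbb{R})$ we have for $A\in\mathcal{B}(\mathbb{R})$:

$$(\lambda_1T^{-1})(A) = \begin{cases} +\infty, & \text{if } 0\in A,\\ 0, & \text{otherwise.}\end{cases}$$

The induced measure $\lambda_1T^{-1}$ is not sigma-finite, because every set that contains the point 0 has infinite measure.

4) The statement follows from relations [1.7] and [1.8].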
Gaussian Measures in Hilbert Space: Construction and Properties, First Edition. Alexander Kukush. © ISTE Ltd 2019. Published by ISTE Ltd and John Wiley & Sons, Inc.
5) Let $B$ be a Lebesgue measurable set on the real line. Denote $B_+ = B\cap[0,+\infty)$ and $-B_+ = \{-x : x\in B_+\}$. Then $T^{-1}B = B_+\cup(-B_+)$. Using corollary 1.1 for the transformation $x\mapsto -x$, $x\in\mathbb{R}$, we get

$$\lambda_1(T^{-1}B) = \lambda_1(B_+) + \lambda_1(-B_+) = 2\cdot\lambda_1(B_+).$$

Answer: $(\lambda_1T^{-1})(B) = 2\cdot\lambda_1(B\cap[0,+\infty))$, $B\in S_1$.

6) a) For any finite interval $(a,b]$, the preimage $f^{-1}(a,b]$ lies in a closed interval of length $b-a$. Indeed, let $x, y\in f^{-1}(a,b]$. Then $|x-y| \le |f(x)-f(y)| \le b-a$. Denote $x_* = \inf f^{-1}(a,b]$, $x^* = \sup f^{-1}(a,b]$. It holds $x^* - x_* \le b-a$; hence $f^{-1}(a,b]\subset[x_*, x_*+b-a]$.

b) Let $A\in S_1$, then $A = B\cup N$, with $B\in\mathcal{B}(\mathbb{R})$ and $\lambda_1(N) = 0$. We have

$$f^{-1}A = (f^{-1}B)\cup(f^{-1}N). \qquad [7.1]$$

Here $f^{-1}B\in S_1$, because $f$ is Lebesgue measurable.

c) Show that $f^{-1}N\in S_1$. Fix $\varepsilon > 0$. It holds $0 = \lambda_1(N) = \lambda_1^*(N)$, where $\lambda_1^*$ denotes the Lebesgue outer measure. Then there exist intervals $(a_k, b_k]$, $k = 1, 2, \ldots$ such that $\sum_{k=1}^{\infty}(b_k - a_k) < \varepsilon$ and $N\subset\cup_{k=1}^{\infty}(a_k,b_k]$. Therefore,

$$f^{-1}N \subset \cup_{k=1}^{\infty}f^{-1}(a_k,b_k] \subset \cup_{k=1}^{\infty}I_k,$$

where $I_k$, $k = 1, 2, \ldots$ are some closed intervals with total length

$$L \le \sum_{k=1}^{\infty}(b_k - a_k) < \varepsilon.$$

Next, $\lambda_1^*(f^{-1}N) \le L < \varepsilon$; hence $\lambda_1^*(f^{-1}N) = 0$. Thus, $f^{-1}N$ is Lebesgue measurable; moreover, it has Lebesgue measure zero.
d) Finally, [7.1] implies that $f^{-1}A\in S_1$ as a union of two Lebesgue measurable sets.

REMARK 7.1.– A more general statement holds true: Let $A\in S_1$ and $f: A\to\mathbb{R}$ be a Lebesgue measurable function such that

$$\forall R > 0\ \exists\varepsilon_R > 0\ \forall x, y\in A\cap[-R,R]: \quad |f(x)-f(y)| \ge \varepsilon_R\cdot|x-y|.$$

Then $f$ is $(S_1\cap A, S_1)$-measurable.

PROOF.– Denote by $f_R$ the function $f$ restricted to the set $A_R := A\cap[-R,R]$. Then $f_R$ is Lebesgue measurable and $|f(x)-f(y)| \ge \varepsilon_R\cdot|x-y|$ for all $x, y\in A_R$. Then, similarly to problem (6), the function $f_R$ is $(S_1\cap A_R, S_1)$-measurable. Let $C\in S_1$. Then $f^{-1}C = \cup_{N=1}^{\infty}f_N^{-1}C\in S_1$ as a countable union of Lebesgue measurable sets.
7) a) For $Tx := \arctan x$, $x\in\mathbb{R}$, it holds

$$|Tx - Ty| = |T'(\theta)|\cdot|x-y| = \frac{|x-y|}{1+\theta^2},$$

where $\theta$ is an intermediate point between $x$ and $y$. If $x, y\in[-R,R]$, then

$$|Tx - Ty| \ge \frac{|x-y|}{1+R^2}.$$

The function $T$ is $(S_1, S_1)$-measurable by remark 7.1.

b) By theorem 1.2,

$$\int_{\mathbb{R}}f(Tx)\,d\lambda_1(x) = \int_{\mathbb{R}}f(t)\,d(\lambda_1T^{-1})(t). \qquad [7.2]$$

Now, the induced measure is concentrated on $(-\frac{\pi}{2},\frac{\pi}{2})$, and for $(\alpha,\beta]$ with $-\frac{\pi}{2} < \alpha < \beta < \frac{\pi}{2}$, it holds $T^{-1}(\alpha,\beta] = (\tan\alpha,\tan\beta]$,

$$(\lambda_1T^{-1})((\alpha,\beta]) = \tan\beta - \tan\alpha, \qquad (\lambda_1T^{-1})((\alpha,\beta]) = \int_{(\alpha,\beta]}\frac{I_{(-\frac{\pi}{2},\frac{\pi}{2})}(t)}{\cos^2 t}\,d\lambda_1(t).$$

This equality can be extended from the sets of the semiring $P_1$ of finite intervals $(\alpha,\beta]$ to the sets of $\mathcal{B}(\mathbb{R}) = \sigma a(P_1)$ and then finally to the sets of $S_1$. Thus, $\lambda_1T^{-1}\ll\lambda_1$,

$$\frac{d(\lambda_1T^{-1})}{d\lambda_1}(t) = \frac{I_{(-\frac{\pi}{2},\frac{\pi}{2})}(t)}{\cos^2 t}.$$
This relation and [7.2] imply the desired equality.

8) A solution relies on remark 7.1 and is similar to the solution of problem (7).

9) a) Let $R > 0$ and $(\alpha,\beta]\subset(0,R]$. Using generalized spherical coordinates, one can show that

$$\lambda_n(f^{-1}(\alpha,\beta]) \le C_R(\beta-\alpha),$$

with the positive $C_R$ depending on $R$ and $n$ only. Fix $\varepsilon > 0$ and let $\lambda_1(N) = 0$, $N\subset(0,R]$. Then there exists a sequence of intervals $(\alpha_k,\beta_k]\subset(0,R]$ such that

$$N\subset\bigcup_{k=1}^{\infty}(\alpha_k,\beta_k], \qquad \sum_{k=1}^{\infty}(\beta_k-\alpha_k) < \varepsilon.$$

Therefore, $f^{-1}N\subset\cup_{k=1}^{\infty}f^{-1}(\alpha_k,\beta_k]$ and

$$\lambda_n^*(f^{-1}N) \le \sum_{k=1}^{\infty}\lambda_n^*(f^{-1}(\alpha_k,\beta_k]) = \sum_{k=1}^{\infty}\lambda_n(f^{-1}(\alpha_k,\beta_k]) \le C_R\cdot\varepsilon.$$

Thus, $\lambda_n^*(f^{-1}N) = 0$ and $f^{-1}N\in S_n$. Given any $N$ with $\lambda_1(N) = 0$, it holds

$$f^{-1}N = \bigcup_{k=1}^{\infty}f^{-1}(N\cap[0,k])\in S_n$$

as a countable union of Lebesgue measurable sets.

b) Given $A\in S_1$, it holds $A = B\cup N$, with $B\in\mathcal{B}(\mathbb{R})$ and $\lambda_1(N) = 0$. Then

$$f^{-1}A = f^{-1}B\cup f^{-1}N\in S_n.$$
10) The function $x\mapsto\|x\|$, $x\in\mathbb{R}^n$, is Lebesgue measurable by problem (9), and the integral in the statement of this problem is well defined. Let $T:\mathbb{R}^n\to\mathbb{R}^n$ be a unitary operator, i.e. a linear transformation that preserves the norm. Then

$$\mu(T^{-1}A) = \int_{T^{-1}A}f(\|x\|)\,d\lambda_n(x) = \int_{T^{-1}A}f(\|Tx\|)\,d\lambda_n(x) = \int_{A}f(\|y\|)\,d(\lambda_nT^{-1})(y).$$

But $\lambda_nT^{-1} = \lambda_n$, and

$$\mu(T^{-1}A) = \int_{A}f(\|y\|)\,d\lambda_n(y) = \mu(A), \quad A\in S_n.$$
11) Let $g = \frac{d\mu}{d\lambda_n}$; $g\ge0$. Since $\mu$ is finite at each bounded set, $g$ is locally Lebesgue integrable. Then, due to the Lebesgue differentiation theorem (see hint to this problem), for almost every $x\in\mathbb{R}^n$ we have

$$g(x) = \lim_{r\to0+}\frac{1}{\lambda_n(B(x,r))}\int_{B(x,r)}g(y)\,d\lambda_n(y). \qquad [7.3]$$

Let $N$ be the set where [7.3] does not hold, $\lambda_n(N) = 0$, and let $u, z\in\mathbb{R}^n\setminus N$, with $\|u\| = \|z\|$. Consider a unitary operator $T$, with $Tu = z$. Then, due to the relation $\lambda_nT^{-1} = \lambda_n$, we have

$$g(z) = \lim_{r\to0+}\frac{1}{\lambda_n(B(z,r))}\int_{B(z,r)}g(y)\,d\lambda_n(y) = \lim_{r\to0+}\frac{1}{\lambda_n(TB(u,r))}\int_{TB(u,r)}g(y)\,d\lambda_n(y),$$

$$g(z) = \lim_{r\to0+}\frac{1}{\lambda_n(B(u,r))}\int_{B(u,r)}g(Tx)\,d(\lambda_nT^{-1})(x) = \lim_{r\to0+}\frac{1}{\lambda_n(B(u,r))}\int_{B(u,r)}g(Tx)\,d\lambda_n(x). \qquad [7.4]$$

But it is given that $\mu T^{-1} = \mu$. Then for any $A\in S_n$, $\mu(TA) = \mu(A) = \int_{A}g(x)\,d\lambda_n(x)$. On the other hand,

$$\mu(TA) = \int_{TA}g(y)\,d\lambda_n(y) = \int_{A}g(Tx)\,d(\lambda_nT^{-1})(x) = \int_{A}g(Tx)\,d\lambda_n(x);$$

hence $\mu(A) = \int_{A}g(Tx)\,d\lambda_n(x)$, $A\in S_n$. Therefore,

$$\frac{d\mu}{d\lambda_n}(x) = g(x) = g(Tx) \pmod{\lambda_n}. \qquad [7.5]$$

Now, [7.4] and [7.5] imply that

$$g(z) = \lim_{r\to0+}\frac{1}{\lambda_n(B(u,r))}\int_{B(u,r)}g(x)\,d\lambda_n(x) = g(u).$$
Thus, on the set $\mathbb{R}^n\setminus N$, the function $g$ depends only on the norm of $x$. There exists a function $\tilde g:\mathbb{R}^n\to[0,+\infty)$, which depends on the norm of $x$ only and is equal to $g$ almost everywhere. Now, we construct a Lebesgue measurable function $f:[0,+\infty)\to[0,+\infty)$ such that

$$\tilde g(x) = f(\|x\|) \quad \text{a.e.} \qquad [7.6]$$

Let $S^{n-1} = \{x\in\mathbb{R}^n : \|x\| = 1\}$ be the unit sphere in $\mathbb{R}^n$ and $V_n$ be the sigma-algebra of Lebesgue measurable sets in $[0,+\infty)\times S^{n-1}$. Following the line of the solution to problem (6), one can show that the function $\varphi(r,v) = rv$, $r\ge0$, $v\in S^{n-1}$, is $(V_n, S_n)$-measurable. Then the function

$$h(r,v) = \tilde g(\varphi(r,v)) = \tilde g(rv), \quad r\ge0,\ v\in S^{n-1},$$

is Lebesgue measurable. Therefore, its cross-section $h_v(r) = \tilde g(rv)$, $r\ge0$, is Lebesgue measurable for almost every $v\in S^{n-1}$ (here the Lebesgue measure on the sphere $S^{n-1}$ is considered). But the cross-sections $h_v(r)$ for different $v\in S^{n-1}$ coincide and hence are equal to some Lebesgue measurable function $f(r)$, $r\ge0$. Then [7.6] holds true and the representation is valid. Finally, we replace $f$ with an equivalent Borel function.

12) a) Consider the case where $g$ is non-negative and bounded. The measure

$$\mu(A) = \int_{A}g(x)\,d\lambda_n(x), \quad A\in S_n,$$

is finite at each bounded set, absolutely continuous w.r.t. $\lambda_n$ and invariant under orthogonal transformations in $\mathbb{R}^n$. The statement follows from problem (11) and the uniqueness of the Radon–Nikodym derivative.

b) Switch to the case where $g$ is non-negative but could be unbounded. For $R > 0$, the function $g_R(x) := g(x)I(|g(x)|\le R)$ is Lebesgue measurable, bounded, and $g_R(Tx) = g_R(x) \pmod{\lambda_n}$ for each orthogonal transformation $T$. By part (a) of the proof, there exists $N_R\in S_n$ such that $\lambda_n(N_R) = 0$ and for all $x, y\in\mathbb{R}^n\setminus N_R$ with $\|x\| = \|y\|$, it holds $g_R(x) = g_R(y)$. Now, set $N = \cup_{k=1}^{\infty}N_k$. Then $\lambda_n(N) = 0$ and for all $x, y\in\mathbb{R}^n\setminus N$ with $\|x\| = \|y\|$, it holds $g(x) = g(y)$.
c) Now, $g$ is an arbitrary function from the problem statement. Using part (b) of the proof, we construct $N_+, N_-\in S_n$ such that $\lambda_n(N_+) = \lambda_n(N_-) = 0$ and for all $x', y'\in\mathbb{R}^n\setminus N_+$, $x'', y''\in\mathbb{R}^n\setminus N_-$ with $\|x'\| = \|y'\|$ and $\|x''\| = \|y''\|$, it holds $g_+(x') = g_+(y')$ and $g_-(x'') = g_-(y'')$, where $g_+$ and $g_-$ are the positive and negative parts of $g$, respectively. Then we set $N = N_+\cup N_-$, $\lambda_n(N) = 0$, and for all $x, y\in\mathbb{R}^n\setminus N$ with $\|x\| = \|y\|$, it holds $g(x) = g(y)$. The desired function $f$ is constructed as in the solution to problem (11).

13) a) Introduce a function $S:\mathbb{R}\to[0,+\infty]$,

$$S(x) = \sum_{n=1}^{\infty}|f(n^{1+\alpha}x)|, \quad x\in\mathbb{R}.$$

It is Lebesgue measurable and

$$\int_{\mathbb{R}}S\,d\lambda_1 = \sum_{n=1}^{\infty}\int_{\mathbb{R}}|f(n^{1+\alpha}x)|\,d\lambda_1(x) = \sum_{n=1}^{\infty}\frac{1}{n^{1+\alpha}}\int_{\mathbb{R}}|f(t)|\,d\lambda_1(t). \qquad [7.7]$$

Here, we used the change of variable $t = Tx = n^{1+\alpha}x$, $x\in\mathbb{R}$, and corollary 1.2. Since $\alpha > 0$ and $f\in L(\mathbb{R},\lambda_1)$, the expression [7.7] is finite. Hence $S\in L(\mathbb{R},\lambda_1)$, and $S(x) < \infty \pmod{\lambda_1}$. Therefore, $\lim_{n\to\infty}|f(n^{1+\alpha}x)| = 0 \pmod{\lambda_1}$.

b) Let $\alpha > 0$ and $g\in L(\mathbb{R}^m,\lambda_m)$. We show that $g(n^{\frac{1}{m}+\alpha}x)\to0$ as $n\to\infty \pmod{\lambda_m}$. Now, we set

$$S(x) = \sum_{n=1}^{\infty}|g(n^{\frac{1}{m}+\alpha}x)|, \quad x\in\mathbb{R}^m.$$

We have

$$\int_{\mathbb{R}^m}S\,d\lambda_m = \sum_{n=1}^{\infty}\frac{1}{n^{1+\alpha m}}\int_{\mathbb{R}^m}|g(t)|\,d\lambda_m(t).$$

We used corollary 1.2, with $Tx = n^{\frac{1}{m}+\alpha}x$, $x\in\mathbb{R}^m$, and the corresponding matrix $L = n^{\frac{1}{m}+\alpha}I_m$, with $\det L = n^{1+\alpha m}$. (Here $I_m$ denotes the identity matrix of size $m$.) The rest of the proof follows the line of part (a) of the solution.

Another possible extension is as follows: Let $\alpha_1 > 0, \ldots, \alpha_m > 0$ and $g\in L(\mathbb{R}^m,\lambda_m)$. Then $g(n_1^{1+\alpha_1}x_1, \ldots, n_m^{1+\alpha_m}x_m)\to0$ as $n_1\to\infty, \ldots, n_m\to\infty \pmod{\lambda_m}$.
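14) a)

$$\int_{\mathbb{R}}f\,d\lambda_1 = \int_{(-\infty,0]}f\,d\lambda_1 + \int_{[0,+\infty)}f\,d\lambda_1. \qquad [7.8]$$

Use the transformation $Tx = -x$, $x\in\mathbb{R}$. Corollary 1.1 implies $\lambda_1T^{-1} = \lambda_1$. Because $f(-x)\equiv f(x)$, we have

$$\int_{(-\infty,0]}f(x)\,d\lambda_1(x) = \int_{[0,+\infty)}f(-t)\,d(\lambda_1T^{-1})(t) = \int_{[0,+\infty)}f(t)\,d\lambda_1(t), \qquad [7.9]$$

and relations [7.8] and [7.9] imply the desired equality.

b)

$$\int_{\mathbb{R}}f(x)\,d\lambda_1(x) = -\int_{\mathbb{R}}f(-x)\,d\lambda_1(x) = -\int_{\mathbb{R}}f(t)\,d(\lambda_1T^{-1})(t) = -\int_{\mathbb{R}}f(t)\,d\lambda_1(t), \qquad 2\cdot\int_{\mathbb{R}}f(x)\,d\lambda_1(x) = 0.$$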
15) Use the transformation $Tx = -x$, $x\in\mathbb{R}$, under which $\lambda_1$ is invariant. We have

$$I := \int_{[-1,1]}\frac{f(x)}{f(x)+f(-x)}\,d\lambda_1(x) = \int_{[-1,1]}\frac{f(-t)}{f(t)+f(-t)}\,d(\lambda_1T^{-1})(t) = \int_{[-1,1]}\frac{f(-t)}{f(t)+f(-t)}\,d\lambda_1(t),$$

$$2I = \int_{[-1,1]}\frac{f(x)+f(-x)}{f(x)+f(-x)}\,d\lambda_1(x) = \int_{[-1,1]}d\lambda_1(x) = 2, \qquad I = 1.$$
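The value $I = 1$ in problem (15) can be checked numerically. The sketch below is an illustration only: the helper `symmetric_ratio_integral` and the two sample integrands are assumptions chosen for the example (any $f$ with $f(x) + f(-x) \neq 0$ on $[-1,1]$ would do), and the midpoint rule stands in for the Lebesgue integral.

```python
import math


def symmetric_ratio_integral(f, n=10_000):
    """Midpoint rule for the integral of f(x) / (f(x) + f(-x)) over [-1, 1].

    Assumes f(x) + f(-x) != 0 on [-1, 1] (illustrative helper).
    """
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += f(x) / (f(x) + f(-x))
    return total * h


# Two hypothetical positive integrands; both values should be close to 1.
value_exp = symmetric_ratio_integral(math.exp)
value_poly = symmetric_ratio_integral(lambda x: (x + 2.0) ** 3)
```

The midpoints are symmetric about zero, so the summands at $x$ and $-x$ add up to 1, which mirrors the argument $2I = 2$ used in the solution.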
Answer: 1.

16) Let $M = \{\varepsilon = (\varepsilon_n)_1^{\infty} : \varepsilon_n = 0 \text{ or } 1,\ n = 1, 2, \ldots\}$. For each $\varepsilon, \delta\in M$, $\varepsilon\ne\delta$, it holds $\|\varepsilon-\delta\|_{\infty} = 1$. Now, consider the disjoint balls $B(\varepsilon,\frac12)$, $\varepsilon\in M$. Suppose that a measure $\lambda$ satisfies (i) and (ii). Then for each $\varepsilon\in M$, $\lambda(B(\varepsilon,\frac12)) = \lambda_\varepsilon$, $0 < \lambda_\varepsilon < \infty$. The set $M$ is uncountable; hence one can select a sequence of different points $\varepsilon^{(m)}\in M$, $m = 1, 2, \ldots$, with $\sum_{1}^{\infty}\lambda_{\varepsilon^{(m)}} = \infty$. We have $\cup_{1}^{\infty}B(\varepsilon^{(m)},\frac12)\subset B(0,2)$; then

$$\lambda(B(0,2)) \ge \sum_{1}^{\infty}\lambda_{\varepsilon^{(m)}} = \infty, \qquad \lambda(B(0,2)) = \infty.$$

This contradicts condition (ii).

17) a) First we construct a sequence of unit vectors $x_n$, $n = 1, 2, \ldots$ such that $\|x_m - x_n\| > \frac12$ for all $n\ne m$. This can be done by induction. Describe the inductive step. Given unit vectors $x_1, \ldots, x_n$, with $\|x_i - x_j\| > \frac12$ for all $i\ne j$, apply the theorem about an almost orthogonal vector (see [BER 12]) to the subspace $L_n = \mathrm{span}(x_1, \ldots, x_n)$. Then there exists a unit vector $x_{n+1}$ such that $\|x_{n+1} - l\| > \frac12$ for all $l\in L_n$. In particular, $\|x_{n+1} - x_i\| > \frac12$, $i = 1, \ldots, n$.

b) The balls $B(x_n,\frac14)$, $n = 1, 2, \ldots$ are disjoint; moreover, $\cup_{1}^{\infty}B(x_n,\frac14)\subset B(0,2)$. The rest of the proof follows the line of theorem 1.5(a).
18) Let $e_n$ be the standard unit vectors in $l_p$, $e_n = (0, \ldots, 0, 1, 0, \ldots)$ with 1 at the $n$th position. Then $\|e_n - e_m\|_p = \sqrt[p]{2}$, $n\ne m$, and the balls $B(e_n,\frac{\sqrt[p]{2}}{2})$ are disjoint. For $n\ne m$, the operator $V$ in $l_p$ is an isometry,

$$Vx = x_ne_m - x_me_n + \sum_{i\ne n,\, i\ne m}x_ie_i, \quad x = (x_i)_1^{\infty}.$$

Assume that $\lambda$ is a desired measure. Then $V(B(e_n,\frac{\sqrt[p]{2}}{2})) = B(e_m,\frac{\sqrt[p]{2}}{2})$, and the values of $\lambda$ at these two balls coincide. The rest of the proof follows the line of theorem 1.5(a).

19) For $x\in X$, it holds

$$\|Tx\| = \max_{0\le t\le1}|x(\varphi(t))| = \max_{0\le s\le1}|x(s)| = \|x\|.$$

Moreover, the inverse mapping $\varphi^{(-1)}:[0,1]\to[0,1]$ is continuous; therefore, $T$ is a bijection, and it is an isometry on $X$ (see the definition in problem (18)). Let $\varphi^{(n)}$ be the $n$-fold composition of $\varphi$ with itself. For $n, k\ge1$, we have

$$\|\varphi^{(n)} - \varphi^{(n+k)}\| = \max_{0\le t\le1}|\varphi^{(n)}(t) - \varphi^{(k)}(\varphi^{(n)}(t))| = \max_{0\le s\le1}(s - \varphi^{(k)}(s)) \ge \delta := \max_{0\le s\le1}(s - \varphi(s)) > 0.$$

The balls $B(\varphi^{(n)},\frac{\delta}{2})$, $n = 1, 2, \ldots$ are disjoint and for each $n$, $T(B(\varphi^{(n)},\frac{\delta}{2})) = B(\varphi^{(n+1)},\frac{\delta}{2})$. The rest of the proof follows the line of theorem 1.5(a).

20) Let $X$ be a random vector in $\mathbb{R}^n$, with $\mu_X = \mu$ (see lemma 1.2). The random vector $Y := -X$ has the characteristic function

$$\varphi_Y(t) = E\,e^{i(Y,t)} = E\,e^{-i(X,t)} = \overline{\varphi_X(t)},$$

where $\overline{\varphi_X(t)}$ is the complex conjugate of $\varphi_X(t)$. Now, $\mu_X$ is symmetric around the origin $\Leftrightarrow$ $X$ and $Y$ have equal distributions $\Leftrightarrow$ $\varphi_X\equiv\varphi_Y$ $\Leftrightarrow$ $\varphi_X(t)\equiv\overline{\varphi_X(t)}$ $\Leftrightarrow$ $\varphi_X(t)\in\mathbb{R}$ for all $t\in\mathbb{R}^n$.
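The criterion in problem (20) can be illustrated for $n = 1$ with two discrete distributions. This is a hedged sketch: the helper `char_fn` and both sample distributions are hypothetical choices for the example, not part of the text. The symmetric law has a real characteristic function, while the asymmetric one does not.

```python
import cmath


def char_fn(points, probs, t):
    """Characteristic function E exp(i*t*X) of a discrete random variable."""
    return sum(p * cmath.exp(1j * t * x) for x, p in zip(points, probs))


# Hypothetical law symmetric around the origin: P(X = x) = P(X = -x).
sym_points, sym_probs = [-2.0, -1.0, 1.0, 2.0], [0.1, 0.4, 0.4, 0.1]
# Hypothetical centered but asymmetric law.
asym_points, asym_probs = [-1.0, 2.0], [2.0 / 3.0, 1.0 / 3.0]

grid = [0.1 * k for k in range(1, 51)]
max_imag_sym = max(abs(char_fn(sym_points, sym_probs, t).imag) for t in grid)
imag_asym_at_1 = char_fn(asym_points, asym_probs, 1.0).imag
```

Here `max_imag_sym` vanishes up to rounding, whereas `imag_asym_at_1` equals $(\sin 2 - 2\sin 1)/3$, which is nonzero.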
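21) Let $X$ be a random vector in $\mathbb{R}^n$, with $\mu_X = \mu$, and $U$ be an orthogonal $n\times n$ matrix. The random vector $UX$ has the characteristic function

$$\varphi_{UX}(t) = E\,e^{i(UX,t)} = E\,e^{i(X,U^{\top}t)}, \quad t\in\mathbb{R}^n.$$

Note that $U^{\top} = U^{-1}$ is an orthogonal matrix as well. Now, $\mu_X$ is invariant under all unitary operators $\Leftrightarrow$ $X$ and $UX$ have equal distributions for all orthogonal matrices $U$ $\Leftrightarrow$ $\varphi_X(U^{\top}t) = \varphi_X(t)$ for all $t\in\mathbb{R}^n$ and all orthogonal matrices $U$. The latter condition is equivalent to the following: $\varphi_X(t)$ depends only on $\|t\|$.

22) Let $X$ and $Y$ be random vectors, with $\mu_X = \mu$ and $\mu_Y = \nu$. Introduce a half-space

$$V_{\tau d} = \{x\in\mathbb{R}^n : (x,\tau)\le d\}, \quad \|\tau\| = 1, \quad d\in\mathbb{R}.$$

a) Prove that $\mu$ and $\nu$ coincide at each half-space. Select a vector $x_0$ such that $(x_0,\tau) = d$ and put $x_n = x_0 - n\tau$, $n\ge1$. The balls $\bar B(x_n,n)$ are increasing in $n$, and their union is the open half-space $\{x : (x,\tau) < d\}$ together with the point $x_0$. By the continuity of measure from below, the equalities $\mu(\bar B(x_n,n)) = \nu(\bar B(x_n,n))$, $n\ge1$, imply that $\mu$ and $\nu$ coincide at this union. Applying this with $d + \frac1k$ in place of $d$ and letting $k\to\infty$ (continuity of a finite measure from above), we obtain $\mu(V_{\tau d}) = \nu(V_{\tau d})$.

b) Prove that $\varphi_X = \varphi_Y$. Take $a\in\mathbb{R}^n$, $a\ne0$, and $\tau = \frac{a}{\|a\|}$. We have

$$\varphi_X(a) = E\,e^{i\|a\|(X,\tau)} = \varphi_{(X,\tau)}(\|a\|), \qquad \varphi_Y(a) = \varphi_{(Y,\tau)}(\|a\|).$$

Let $F$ and $G$ be the cumulative distribution functions (cdfs) of $(X,\tau)$ and $(Y,\tau)$, respectively. Then

$$F(d) = P\{(X,\tau)\le d\} = \mu(V_{\tau d}) = \nu(V_{\tau d}) = G(d), \quad d\in\mathbb{R}.$$

Thus, the random variables $(X,\tau)$ and $(Y,\tau)$ have equal distributions and equal characteristic functions. Therefore, $\varphi_X(a) = \varphi_Y(a)$ for all $a\ne0$. But $\varphi_X(0) = \varphi_Y(0) = 1$, and $\varphi_X = \varphi_Y$. Then $\mu_X = \mu_Y$.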
23) a) Prove that $C$ is positive semidefinite. Let $\xi = (\xi_i)_1^n\sim N(0,A)$ and $\eta = (\eta_i)_1^n\sim N(0,B)$ be independent random vectors. Introduce the random vector $z = (\xi_i\eta_i)_1^n$. Then $E\,z = 0$, because $E\,\xi_i\eta_i = (E\,\xi_i)(E\,\eta_i) = 0$, $i = 1, \ldots, n$;

$$\mathrm{Cov}(\xi_i\eta_i, \xi_j\eta_j) = E(\xi_i\xi_j)(\eta_i\eta_j) = E(\xi_i\xi_j)\,E(\eta_i\eta_j) = a_{ij}b_{ij}, \quad i, j = 1, \ldots, n.$$

Therefore, $\mathrm{Cov}(z) = C$, and $C$ is positive semidefinite.

b) Now, show that $C$ is positive definite. Assume the contrary. Then, in view of part (a) of the proof,

$$\exists d = (d_i)_1^n\ne0, \quad E(z,d)^2 = d^{\top}Cd = 0.$$

Next, we use the independence of $\xi$ and $\eta$. By the properties of conditional expectation (see [SHI 16]),

$$0 = E\left(\sum_{1}^{n}d_i\xi_i\eta_i\right)^2 = E\left[E\left[\left(\sum_{1}^{n}d_i\xi_i\eta_i\right)^2\,\Big|\,\eta\right]\right] = E\left[\sum_{i,j=1}^{n}a_{ij}(d_i\eta_i)(d_j\eta_j)\right].$$

The latter double sum is non-negative because $A$ is positive semidefinite. Therefore,

$$\sum_{i,j=1}^{n}a_{ij}(d_i\eta_i)(d_j\eta_j) = 0 \quad \text{a.s.}$$

Hence, since $A$ is positive definite, $(d_i\eta_i)_1^n = 0$ a.s. But $d\ne0$, and there exists $d_j\ne0$. Then $\eta_j = 0$ a.s., $E\,\eta_j^2 = b_{jj} = 0$. This contradicts the condition that $B$ is positive definite. Thus, our assumption is wrong, and $C$ is indeed positive definite.

24) a) First we check that $h$ is a pdf. It is a non-negative Borel function, because $|1-2F(x)|\cdot|1-2G(y)|\le1$.
Next, $\int_{-\infty}^{+\infty}f(x)F(x)\,dx = \int_{-\infty}^{+\infty}F(x)\,dF(x) = \frac12F^2(x)\big|_{-\infty}^{+\infty} = \frac12$, and

$$\int_{\mathbb{R}^2}h(x,y)\,dx\,dy = 1 + \alpha\int_{-\infty}^{+\infty}(f(x)-2f(x)F(x))\,dx\cdot\int_{-\infty}^{+\infty}(g(y)-2g(y)G(y))\,dy = 1.$$

Thus, $h$ is a two-dimensional pdf, and there exists a random vector $(X;Y)^{\top}$ with the pdf equal to $h(x,y)$. The marginal density equals

$$f_X(x) = f(x)\int_{-\infty}^{+\infty}g(y)\,dy + \alpha f(x)(1-2F(x))\int_{-\infty}^{+\infty}g(y)(1-2G(y))\,dy = f(x),$$

and in a similar way the other marginal density is $f_Y(y) = g(y)$.

b) Since $f$ and $g$ are even functions, $E\,X = E\,Y = 0$. Then

$$\mathrm{Cov}(X,Y) = \int_{\mathbb{R}^2}xyf(x)g(y)\,dx\,dy + \alpha\int_{\mathbb{R}}xf(x)(1-2F(x))\,dx\cdot\int_{\mathbb{R}}yg(y)(1-2G(y))\,dy,$$

$$\mathrm{Cov}(X,Y) = 4\alpha\,E[XF(X)]\cdot E[YG(Y)].$$

It remains to show that $E\,XF(X) > 0$ (the inequality $E\,YG(Y) > 0$ can be proven similarly). We have

$$E\,XF(X) = E\,XF(X)I(X > 0) + E\,XF(X)I(X < 0).$$

The random variables $X$ and $(-X)$ are identically distributed; therefore,

$$E\,XF(X)I(X < 0) = E(-X)F(-X)I(-X < 0) = -E\,X(1-F(X))I(X > 0),$$

$$E\,XF(X) = E\,X(2F(X)-1)I(X > 0).$$
There exists $t_0 > 0$ with $\frac12 < F(t_0) < 1$. Then

$$E\,XF(X) \ge E\,X(2F(X)-1)I(X\ge t_0) \ge t_0(2F(t_0)-1)\cdot P\{X\ge t_0\} > 0.$$

25) In problem (24), one can put

$$f(x) = g(x) = \frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}, \quad x\in\mathbb{R}.$$

Then $X\sim N(0,1)$ and $Y\sim N(0,1)$. But for $\alpha\in(-1,1)$, $\alpha\ne0$ (e.g. $\alpha = \frac12$), the function $h(x,y)$, which is the pdf of the random vector $(X;Y)^{\top}$, is not the pdf of a Gaussian random vector. Thus, $X$ and $Y$ are not jointly Gaussian.
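The construction of problems (24) and (25) can be checked numerically. The sketch below assumes that the density has the form $h(x,y) = f(x)g(y)[1 + \alpha(1-2F(x))(1-2G(y))]$, which is the form consistent with the computations in solution (24); the problem statement itself is not reproduced in this section. With standard normal $f = g$, the formula $\mathrm{Cov}(X,Y) = 4\alpha\,E[XF(X)]\,E[YG(Y)]$ gives $\alpha/\pi$, since $E[X\Phi(X)] = \frac{1}{2\sqrt{\pi}}$. The function names and the grid parameters are illustrative choices.

```python
import math


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mass_and_covariance(alpha, half_width=8.0, n=160):
    """Midpoint-rule values of the total mass of h and of E[XY] on a square grid.

    Assumed density: h(x, y) = f(x) g(y) [1 + alpha (1 - 2F(x)) (1 - 2G(y))]
    with standard normal f = g (an assumption, see the lead-in).
    """
    step = 2.0 * half_width / n
    nodes = [-half_width + (i + 0.5) * step for i in range(n)]
    pdf = [normal_pdf(x) for x in nodes]
    tilt = [1.0 - 2.0 * normal_cdf(x) for x in nodes]
    total = 0.0
    cov = 0.0
    for i, x in enumerate(nodes):
        for j, y in enumerate(nodes):
            w = pdf[i] * pdf[j] * (1.0 + alpha * tilt[i] * tilt[j]) * step * step
            total += w
            cov += x * y * w  # E X = E Y = 0, so E[XY] is the covariance
    return total, cov


alpha = 0.5
total_mass, covariance = mass_and_covariance(alpha)
```

Under these assumptions `total_mass` should be close to 1 and `covariance` close to $\alpha/\pi$, so the marginals are standard normal while the components are correlated through a non-Gaussian joint density.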
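26) Let $\gamma$ be a standard Gaussian vector in $\mathbb{R}^n$. By lemma 1.8, $X\stackrel{d}{=}S^{\frac12}\gamma$, and

$$I_\alpha = E\exp\{\alpha(AS^{\frac12}\gamma, S^{\frac12}\gamma)\} = E\exp\{\alpha(B\gamma,\gamma)\}.$$

Here, $B$ is the symmetric matrix $S^{\frac12}AS^{\frac12}$, with the real eigenvalues $\lambda_1, \ldots, \lambda_n$ and the corresponding orthonormal eigenvectors $e_1, \ldots, e_n$. Then $(B\gamma,\gamma) = \sum_{1}^{n}\lambda_k(\gamma,e_k)^2$, and because $(\gamma,e_k)$, $k = 1, \ldots, n$, are i.i.d. $N(0,1)$ random variables,

$$I_\alpha = \prod_{1}^{n}E\exp\{\alpha\lambda_k(\gamma,e_k)^2\}.$$

For $\beta\in\mathbb{R}$ and $\xi\sim N(0,1)$, evaluate

$$J_\beta := E\,e^{\beta\xi^2} = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}e^{-\frac{(1-2\beta)x^2}{2}}\,dx.$$

For $\beta\ge\frac12$, $J_\beta = +\infty$. And for $\beta < \frac12$, we put $\sigma = \frac{1}{\sqrt{1-2\beta}}$ and obtain

$$J_\beta = \sigma\int_{\mathbb{R}}\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{x^2}{2\sigma^2}}\,dx = \sigma,$$

because the integrand is the pdf of the normal law $N(0,\sigma^2)$.

Finally, $I_\alpha = \prod_{1}^{n}J_{\alpha\lambda_k}$, and $I_\alpha$ is finite if, and only if, $\alpha\lambda_k < \frac12$ for all $k = 1, \ldots, n$. In this case

$$I_\alpha = \prod_{1}^{n}\frac{1}{\sqrt{1-2\alpha\lambda_k}}.$$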
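The closed form $J_\beta = (1-2\beta)^{-1/2}$ from solution (26) can be compared with a direct numerical integration. The sketch below is illustrative: the function names, the integration window and the sample eigenvalues are assumptions made for the example.

```python
import math


def j_beta_closed(beta):
    """Closed form 1 / sqrt(1 - 2*beta); infinite when beta >= 1/2."""
    if beta >= 0.5:
        return math.inf
    return 1.0 / math.sqrt(1.0 - 2.0 * beta)


def j_beta_numeric(beta, half_width=40.0, n=20_000):
    """Midpoint rule for (2*pi)^(-1/2) * integral of exp(-(1-2*beta)*x^2/2) dx.

    Intended only for beta < 1/2, where the integral converges.
    """
    step = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * step
        total += math.exp(-(1.0 - 2.0 * beta) * x * x / 2.0)
    return total * step / math.sqrt(2.0 * math.pi)


def i_alpha(alpha, eigenvalues):
    """Product of J_{alpha * lambda_k} over the eigenvalues of S^(1/2) A S^(1/2)."""
    result = 1.0
    for lam in eigenvalues:
        result *= j_beta_closed(alpha * lam)
    return result


# Hypothetical eigenvalues; I_alpha is finite here because alpha * lambda_k < 1/2.
value = i_alpha(1.0, [0.1, 0.2])
```

For instance, `i_alpha(1.0, [0.5])` returns infinity, in agreement with the finiteness condition $\alpha\lambda_k < \frac12$.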
27) The pdf of $X$ is $(2\pi)^{-n/2}e^{-\frac{\|x\|^2}{2}}$, $x\in\mathbb{R}^n$. We have

$$J_\alpha = (2\pi)^{-n/2}\int_{\mathbb{R}^n}\|x\|^{-\alpha}e^{-\frac{\|x\|^2}{2}}\,dx.$$

It is an improper Riemann integral. Using $n$-dimensional spherical coordinates, we get

$$J_\alpha = \mathrm{const}\cdot\int_{0}^{\infty}\frac{1}{r^{\alpha+1-n}}e^{-\frac{r^2}{2}}\,dr.$$

The integrand behaves like $r^{n-1-\alpha}$ near zero, so the latter integral converges if, and only if, $\alpha < n$.

Answer: $\alpha < n$.

7.2. Solutions for Chapter 2

1) a) Prove that $\varphi$ is non-decreasing. Assume the contrary. Suppose that there exist points $0 < t_1 < t_2$, with $\varphi(t_1) > \varphi(t_2)\ge0$. Consider $t_3 > t_2$ and the points $M_i(t_i;\varphi(t_i))$, $i = 1, 2, 3$, on the graph of $\varphi$. We have $\varphi(t_3)\ge0$, and for large enough $t_3$, the point $M_2$ lies below the segment $M_1M_3$. This contradicts the concavity of $\varphi$. Thus, $\varphi$ is non-decreasing.

b) Prove that $\varphi(t) > 0$, $t > 0$. Assume the contrary. Suppose that there exists $t_1 > 0$, with $\varphi(t_1) = 0$. Remember that $\varphi(0) = 0$. Since $\varphi$ is non-decreasing, $\varphi(t) = 0$ on $[0,t_1]$. Because $\varphi$ is not identically zero, there exists $t_2 > t_1$ with $\varphi(t_2) > 0$. The points $M_0(0;0)$, $M_1(t_1;\varphi(t_1))$ and $M_2(t_2;\varphi(t_2))$ lie on the graph of $\varphi$, but $M_1$ is below the segment $M_0M_2$. This contradicts the concavity of $\varphi$. Hence $\varphi(t) > 0$, $t > 0$.

c) For $a, b\ge0$, it holds $\varphi(a+b)\le\varphi(a)+\varphi(b)$.
Indeed, consider the points on the graph of $\varphi$: $M_0(0;0)$, $M_a(a;\varphi(a))$, $M_b(b;\varphi(b))$, $M_{a+b}(a+b;\varphi(a+b))$. The segment $M_aM_b$ lies in the upper half-plane w.r.t. the line $M_0M_{a+b}$. Therefore, the middle of this segment, $N\left(\frac{a+b}{2};\frac{\varphi(a)+\varphi(b)}{2}\right)$, lies in the upper half-plane as well. Hence

$$\frac12y_{M_{a+b}} = \frac{\varphi(a+b)}{2} \le y_N = \frac{\varphi(a)+\varphi(b)}{2},$$

and the desired inequality follows.

d) Based on parts (a)-(c) of the solution, it is easy to verify the axioms of a metric for the function $\varphi(\rho)$. In particular, for $x, y, z\in X$,

$$\varphi(\rho(x,y)) \le \varphi(\rho(x,z)+\rho(z,y)) \le \varphi(\rho(x,z)) + \varphi(\rho(z,y)),$$

and the triangle inequality for $\varphi(\rho)$ follows.

2) Let $\rho(t,s) = |t-s|$, $t, s\in\mathbb{R}$, and $\varphi(x) = \frac{x}{1+x}$, $x\ge0$. We have

$$\varphi(x) = 1 - \frac{1}{1+x}, \qquad \varphi''(x) = -\frac{2}{(1+x)^3} < 0, \quad x\ge0,$$

and $\varphi$ is strictly concave. This function satisfies all the requirements of problem (1); hence the function $d = \varphi(\rho)$ is a metric on $\mathbb{R}$.

3) a) Prove that $(\mathbb{R},d)$ is complete, with $d$ given in [2.2]. This follows from the completeness of $\mathbb{R}$ with the usual metric and from the relation

$$|t-s| = \frac{d(t,s)}{1-d(t,s)}, \quad t, s\in\mathbb{R}.$$

b) Let $x(m) = (x_n(m))_{n=1}^{\infty}$, $m\ge1$, be a Cauchy sequence in $(\mathbb{R}^{\infty},\rho)$. Relation [2.3] implies that for each $n\ge1$, $\{x_n(m), m\ge1\}$ is a Cauchy sequence in $(\mathbb{R},d)$. By part (a) of the solution, there exists $x_n\in\mathbb{R}$, with

$$\lim_{m\to\infty}x_n(m) = x_n, \quad n\ge1.$$
Lemma 2.2 implies that $x(m)\to x = (x_n)_{n=1}^{\infty}$ in $(\mathbb{R}^{\infty},\rho)$ as $m\to\infty$. Thus, $(\mathbb{R}^{\infty},\rho)$ is complete.

4) a) It is straightforward that $J$ is a linear surjection. Moreover,

$$\|Jx\|_2 = \left(\sum_{n=1}^{\infty}a_nx_n^2\right)^{\frac12} = \|x\|_a,$$

and $J$ is an isometry between $l_{2,a}$ and $l_2$. Thus, $l_{2,a}$ and $l_2$ are isometric.

b) Since $l_2$ is a separable Hilbert space, $l_{2,a}$ is a separable Hilbert space as well.

5) a) The proof is similar to the proof of lemma 2.5.

b) We have

$$l_\infty = \bigcup_{N=1}^{\infty}\{x : \sup_{n\ge1}|x_n|\le N\} = \bigcup_{N=1}^{\infty}\bigcap_{k=1}^{\infty}\{x : |x_1|\le N, \ldots, |x_k|\le N\}.$$

Then $l_\infty\in\sigma a(\mathrm{Cyl}) = \mathcal{B}(\mathbb{R}^{\infty})$.

c) $\lim_{n\to\infty}x_n = 0 \iff \forall k\ge1\ \exists N\ge1\ \forall n\ge N: |x_n| < \frac1k$. Therefore,

$$c_0 = \bigcap_{k=1}^{\infty}\bigcup_{N=1}^{\infty}\bigcap_{n=N}^{\infty}\left\{x\in\mathbb{R}^{\infty} : |x_n| < \frac1k\right\}.$$

[…]

$\|x-y\|_S - 2\varepsilon$, and for $\varepsilon = \frac13\|x-y\|_S > 0$, it holds $\|u-v\|_S > 0$; hence $u\ne v$, and $U\cap V = \emptyset$. This proves the statement.

10) If $f$ is continuous in $\tau_S$, then it is continuous in the usual sense (see remark 4.2). Now, let $f\in H^*$. Then $f(x) = (x,h)$, $x\in H$, for some $h\in H$. We may and do assume that $\|h\| = 1$. We prove that $f$ is continuous at a point $x_0\in H$ in the S-topology. For $\varepsilon > 0$, we set $S = \varepsilon^{-2}P_{[h]}\in L_S(H)$. Consider

$$E_S = \{x\in H : (\varepsilon^{-2}P_{[h]}x, x) < 1\} = \{x\in H : |(x,h)| < \varepsilon\}.$$

If $x\in E_S + x_0$, then $|(x-x_0,h)| < \varepsilon \Rightarrow |f(x)-f(x_0)| < \varepsilon$. Thus, $f$ is continuous in $\tau_S$.

11) a) Necessity. Assume that $K_A(x) := (Ax,x)$ is continuous at zero in $\tau_S$. Then for $\varepsilon = 1$ there exists $S\in L_S(H)$ such that

$$(Sx,x) < 1 \ \Rightarrow\ |(Ax,x)| < \varepsilon = 1.$$

By the continuity of both functions $K_S(x)$ and $K_A(x)$, we get the next implication:

$$(Sx,x)\le1 \ \Rightarrow\ |(Ax,x)|\le1.$$
At a point where $(Sx,x) = 0$, it holds $(Ax,x) = 0$. Now, let $(Sx,x) = c^2$, $c > 0$. Then $(S\frac{x}{c},\frac{x}{c}) = 1 \Rightarrow |(A\frac{x}{c},\frac{x}{c})|\le1$, $|(Ax,x)|\le c^2 = (Sx,x)$. Thus, in all the cases $|(Ax,x)|\le(Sx,x)$. Let $\{e_n\}$ be an orthobasis in $H$. We have

$$\sum_{1}^{\infty}|(Ae_n,e_n)| \le \sum_{1}^{\infty}(Se_n,e_n) < \infty.$$

The cited theorem from [GOK 88] implies that $A\in S_1(H)$.

b) Sufficiency. We assume that $A\in S_1(H)$. Consider its polar decomposition $A = UT$, where $T$ is an S-operator. First, we prove that $K_A$ is continuous at zero in $\tau_S$. It holds

$$|(Ax,x)| = |(Tx,U^*x)| = |(T^{1/2}x, T^{1/2}U^*x)| \le \|T^{1/2}x\|\cdot\|T^{1/2}U^*x\| \le \frac12\left(\|T^{1/2}x\|^2 + \|T^{1/2}U^*x\|^2\right) = (Sx,x),$$

where

$$S = \frac12(T + UTU^*).$$

It is an S-operator, because $T, UTU^*\in L_S(H)$. Let $\varepsilon > 0$, $S_\varepsilon = \varepsilon^{-1}S$. If $(S_\varepsilon x,x) < 1$, then $(Sx,x) < \varepsilon$ and $|K_A(x)| < \varepsilon$. Thus, $K_A$ is continuous at zero in the S-topology.

Next, we want to prove that $K_A$ is continuous at a point $x_0\in H$ in $\tau_S$. Change the variable $t = x - x_0$:

$$f(t) := K_A(t+x_0) = K_A(t) + K_A(x_0) + (t,A^*x_0) + (t,Ax_0).$$

Here, $K_A$ is continuous at zero in $\tau_S$, and the linear functionals $(t,A^*x_0)$ and $(t,Ax_0)$ belong to $H^*$; hence they are continuous in $\tau_S$ (see problem (10)). Thus, the functional $f$ is continuous at zero in the linear topology $\tau_S$, and $K_A$ is continuous at $x_0$ in $\tau_S$. Hence $K_A$ is continuous everywhere in $\tau_S$.
12) Assume that $\theta = \hat\mu$ for some Borel probability measure $\mu$ in $l_2$. Let $\nu_e(B) = \mu(B\cap l_2)$, $B\in\mathcal{B}(\mathbb{R}^{\infty})$. Since $\nu_e(l_2) = 1$, it is enough to prove that $\nu_e = \mu_e$. Consider the projector $P_n: l_2\to L_n$, $P_nx = \sum_{1}^{n}x_ke_k$, and the induced measure $\nu_n = \mu P_n^{-1}$, $n\ge1$. For $z\in L_n$, it holds (see [4.6]):

$$\int_{L_n}e^{i(z,x)}\,d\mu_n(x) = \theta(z) = \int_{l_2}e^{i(z,x)}\,d\mu(x) = \int_{L_n}e^{i(z,x)}\,d\nu_n(x).$$

Hence $\nu_n = \mu_n$, $n\ge1$. Consider a cylinder $\hat A_n$ in $\mathbb{R}^{\infty}$, $A_n\in\mathcal{B}(\mathbb{R}^n)$. We have (here we identify $L_n$ and $\mathbb{R}^n$):

$$\nu_e(\hat A_n) = \mu(\hat A_n\cap l_2) = \mu(P_n^{-1}A_n) = \nu_n(A_n) = \mu_n(A_n) = \mu_e(\hat A_n).$$

Now, by theorem 2.2 the measures $\nu_e$ and $\mu_e$ coincide.

13) It is straightforward that $\theta(0) = 1$ and $\theta$ is positive definite. Check that $\mathrm{Re}\,\theta$ is continuous at zero in the S-topology. Fix $\varepsilon > 0$. Since $\mathrm{Re}\,\varphi$ is continuous at zero, there exists $\delta > 0$ such that for each $x\in B(0,\delta)$, $|1-\mathrm{Re}\,\varphi(x)| < \varepsilon$.

The inequality $(Ax,Ax) < \delta^2$ is equivalent to $\left(\frac{1}{\delta^2}Ax,Ax\right) < 1$. The operator $A^*A$ is self-adjoint and positive; moreover, it is nuclear as a product of two Hilbert–Schmidt operators; hence $A^*A\in L_S(H)$ and $S := \frac{1}{\delta^2}A^*A$ is an S-operator. Thus, if $(Sx,x) < 1$, then $|1-\mathrm{Re}\,\varphi(Ax)| < \varepsilon$, and $\mathrm{Re}\,\theta$ is continuous at zero in the S-topology. Now, by theorem 4.5, $\theta = \hat\mu$, where $\mu$ is a Borel probability measure in $H$.

7.5. Solutions for Chapter 5

1) We prove by contradiction. Suppose that there exists a random element $\xi$ in $H$, with

$$\varphi_\xi(x) = \exp\left\{i(h,x) - \frac{\|x\|^2}{2}\right\}, \quad x\in H.$$

Then $\eta = \xi - h$ has the characteristic functional

$$\varphi_\eta(x) = \exp\left\{-\frac12(x,x)\right\} = \exp\left\{-\left(\frac12Ix,x\right)\right\}, \quad x\in H.$$
According to example 4.1, $\frac12I$ is an S-operator; hence the identity operator $I$ is an S-operator as well in the infinite-dimensional Hilbert space $H$, which is not true.

2) Since $\xi_1$ and $\xi_2$ are independent,

$$\varphi_{\xi_1+\xi_2}(x) = \varphi_{\xi_1}(x)\varphi_{\xi_2}(x) = \exp\left\{i(m_1+m_2,x) - \frac12((S_1+S_2)x,x)\right\}, \quad x\in H.$$

The operator $S_1+S_2\in L_S(H)$ as a sum of S-operators. By theorem 5.1, $\xi_1+\xi_2\sim N(m_1+m_2, S_1+S_2)$.

3) Let $\{e_n\}$ be the eigenbasis of $S$ and $\{\lambda_n > 0, n\ge1\}$ be the corresponding eigenvalues. By theorem 5.2, with probability 1

$$\xi = m + \sum_{1}^{\infty}\sqrt{\lambda_k}\gamma_ke_k,$$

where $\{\gamma_k\}$ are i.i.d. $N(0,1)$ random variables and the series converges strongly in $H$ with probability 1. Then for each $N\ge1$,

$$\|\xi\|^2 \ge \sum_{1}^{N}(m_k + \sqrt{\lambda_k}\gamma_k)^2, \quad m_k = (m,e_k), \quad k\ge1;$$

$$E\,\|\xi\|^{-\alpha} \le E\left(\sum_{1}^{N}(m_k+\sqrt{\lambda_k}\gamma_k)^2\right)^{-\alpha/2} = \int_{\mathbb{R}^N}\left(\sum_{1}^{N}(m_k+\sqrt{\lambda_k}x_k)^2\right)^{-\alpha/2}g_N(x)\,dx =: J_N,$$

where $g_N$ is the pdf of the standard Gaussian measure in $\mathbb{R}^N$. Next,

$$J_N = \prod_{1}^{N}\lambda_k^{-1/2}\int_{\mathbb{R}^N}\|t\|^{-\alpha}g_N\left(\frac{t_1-m_1}{\sqrt{\lambda_1}}, \ldots, \frac{t_N-m_N}{\sqrt{\lambda_N}}\right)dt.$$

The latter integral converges for $\alpha < N$ (this can be shown using generalized spherical coordinates). Thus, for any $N > \alpha$, $E\,\|\xi\|^{-\alpha}\le J_N < \infty$.
4) Problem (2) of Chapter 5 implies that $\eta_k\sim N(0,T_k)$, $T_k = \left(\sum_{j=1}^{n}u_{kj}^2\right)S = S$, for all $k = 1, \ldots, n$. Thus, the $\eta_k$'s are copies of $\xi$.

Take $h_k\in H$, $k = 1, \ldots, n$. Since $U$ is an orthogonal matrix, for $k\ne p$ it holds

$$E(\eta_k,h_k)(\eta_p,h_p) = \sum_{j=1}^{n}u_{kj}u_{pj}\,E(\xi_j,h_k)(\xi_j,h_p) = (Sh_k,h_p)\sum_{j=1}^{n}u_{kj}u_{pj} = 0.$$

Hence the jointly Gaussian random variables $(\eta_k,h_k)$, $k = 1, \ldots, n$, are uncorrelated and independent. For the compound random element $\eta := (\eta_k)_1^n$ in $H^n$ we have

$$\varphi_\eta(h_1, \ldots, h_n) = E\exp\left\{i\sum_{1}^{n}(h_j,\eta_j)\right\} = \prod_{1}^{n}E\,e^{i(h_j,\eta_j)} = \prod_{1}^{n}\varphi_\xi(h_j).$$

Due to an evident generalization of lemma 4.3, the random elements $\eta_1, \ldots, \eta_n$ are independent.

5) $X = X_1\times X_2$ is a separable metric space, with the metric

$$\rho(x,y) = \rho_1(x_1,y_1) + \rho_2(x_2,y_2), \quad x = (x_1,x_2)\in X, \quad y = (y_1,y_2)\in X.$$

Denote $\mu = \mu_1\times\mu_2$; it is a Borel probability measure in $X$.

a) Let $a_i\in\mathrm{supp}\,\mu_i$, $i = 1, 2$, $a = (a_1,a_2)$. For each $\varepsilon > 0$,

$$B(a,\varepsilon)\supset B\left(a_1,\frac{\varepsilon}{2}\right)\times B\left(a_2,\frac{\varepsilon}{2}\right);$$

hence

$$\mu(B(a,\varepsilon)) \ge \prod_{i=1}^{2}\mu_i\left(B\left(a_i,\frac{\varepsilon}{2}\right)\right) > 0.$$

Thus, $a\in\mathrm{supp}\,\mu$.
b) Let $b_1\notin\mathrm{supp}\,\mu_1$ and $b_2\in X_2$, $b = (b_1,b_2)$. There exists $\varepsilon > 0$, with $\mu_1(B(b_1,\varepsilon)) = 0$. Since $B(b,\varepsilon)\subset B(b_1,\varepsilon)\times B(b_2,\varepsilon)$, it holds

$$\mu(B(b,\varepsilon)) \le \prod_{i=1}^{2}\mu_i(B(b_i,\varepsilon)) = 0.$$

Then $b\notin\mathrm{supp}\,\mu$. In a similar way, if $b_1\in X_1$ and $b_2\notin\mathrm{supp}\,\mu_2$, then $b = (b_1,b_2)\notin\mathrm{supp}\,\mu$. The desired equality follows from parts (a) and (b) of the solution.

6) Let $\xi\sim N(0,S)$ in $H$. Its distribution is $\mu_\xi = \mu$. There exist independent $N(0,1)$ r.v.'s $\{\gamma_n\}$ such that

$$\xi = \sum_{1}^{\infty}\sqrt{\lambda_n}\gamma_ne_n,$$

where the convergence is strong in $H$ a.s. Then a.s.

$$V\xi = \sum_{1}^{\infty}\sqrt{\lambda_n}\gamma_nVe_n = \sum_{1}^{\infty}\sqrt{\lambda_{\pi(n)}}\gamma_ne_{\pi(n)}.$$

Remember that $\pi:\mathbb{N}\to\mathbb{N}$ is a bijection. After the change of summation index $i = \pi(n)$, $i\in\mathbb{N}$, we get

$$V\xi = \sum_{1}^{\infty}\sqrt{\lambda_i}\gamma_i'e_i, \quad \gamma_i' = \gamma_{\pi^{-1}(i)}, \quad i\ge1,$$

where $\{\gamma_i'\}$ is again a sequence of independent $N(0,1)$ random variables and the convergence is strong in $H$ a.s. (because $\sum_{1}^{\infty}(\sqrt{\lambda_i}\gamma_i')^2 = \sum_{1}^{\infty}\lambda_i(\gamma_i')^2 < \infty$ a.s. due to the convergence of the series $\sum_{1}^{\infty}\lambda_i$). Now, by theorem 5.2(b), $V\xi\sim N(0,S)$. Thus, $\xi$ and $V\xi$ are identically distributed; hence $\mu = \mu_\xi = \mu_{V\xi} = \mu V^{-1}$.

Remark. Not every permutation $\pi$ yields a bounded operator $V$ as defined in problem (6). A necessary and sufficient condition for that is as follows: $\lambda_{\pi(i)} = O(\lambda_i)$.
7) Let $\xi\sim N(0,S)$ in $H$. By theorem 5.2(a),

$$\xi = \sum_{1}^{\infty}\sqrt{\alpha_n}\,\xi_n,$$

where the independent $\xi_n\sim N(0,P_n)$, $P_n$ is the orthoprojector on $H_n$ (remember that $\dim H_n < \infty$) and the convergence is strong in $H$ a.s.

a) Sufficiency. Let $U$ be a unitary operator in $H$, with $U(H_n) = H_n$. Then

$$U\xi = \sum_{1}^{\infty}\sqrt{\alpha_n}\,\xi_n', \quad \xi_n' = U\xi_n, \quad n\ge1,$$

and the convergence is strong in $H$ a.s. Since $U|_{H_n}$ is a unitary operator in $H_n$, $\xi_n'\sim N(0,P_n)$, and the $\xi_n'$ are independent. By theorem 5.2(b), $U\xi\sim N(0,S)$. Hence $\xi$ and $U\xi$ are identically distributed, and $\mu = \mu_\xi$ is $U$-invariant.

b) Necessity. Let $U$ be a unitary operator in $H$ such that $\mu$ is $U$-invariant. By corollary 5.3,

$$S = USU^* = USU^{-1} \ \Rightarrow\ SU = US.$$

Let $e_i\in H_n$. Then

$$USe_i = \alpha_nUe_i = S(Ue_i) \ \Rightarrow\ Ue_i\in H_n.$$

Thus, $U(H_n)\subset H_n$, $n\ge1$.

Remark. If $V\in L(H)$, $V(H_n) = H_n$ and $V|_{H_n}$ is a unitary operator in $H_n$ for each $n\ge1$, then $V$ is a unitary operator in $H$.

8) Let $\{\gamma_n, n\ge1\}$ be i.i.d. $N(0,1)$ random variables and $\xi$ be a random element in $H$, with

$$\xi = \sum_{1}^{\infty}\sqrt{\lambda_n}\gamma_ne_n \quad \text{a.s.},$$
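where $\{e_n\}$ is the eigenbasis of $S$ corresponding to $\{\lambda_n\}$ and the series converges strongly in $H$ a.s. By theorem 5.2(b), the distribution $\mu_\xi = \mu$. It holds

$$J_\alpha = E\exp\{\alpha\|\xi\|^2\} = E\exp\left\{\alpha\sum_{1}^{\infty}\lambda_n\gamma_n^2\right\}.$$

Consider the random element $\gamma = (\gamma_n)_1^{\infty}$ distributed in $\mathbb{R}^{\infty}$. Its distribution $\mu_\gamma$ is the standard Gaussian measure in $\mathbb{R}^{\infty}$. We have

$$J_\alpha = \int_{\mathbb{R}^{\infty}}\exp\left\{\alpha\sum_{1}^{\infty}\lambda_nx_n^2\right\}d\mu_\gamma(x).$$

Now, the statement follows directly from problem (11) of Chapter 2 (we note that $\|S\| = \max_{n\ge1}\lambda_n$).

9) We may and do assume that $0 < \varepsilon < 1$. Denote

$$g_\delta(x) = \exp\left\{\frac{\delta}{2\|S\|}\cdot\|x\|^2\right\}, \quad x\in H, \quad 0 < \delta < 1.$$

If we show that for some $n_\delta\in\mathbb{N}$,

$$\sup_{n\ge n_\delta}\int_{H}g_\delta(x)\,d\mu_n(x) < \infty, \qquad [7.13]$$

then

$$\sup_{n\ge n_\delta}\int_{H}|f(x)|^{1+\varepsilon}\,d\mu_n(x) < \infty, \quad \delta = 1-\varepsilon^2,$$

and the desired convergence would follow from theorem 3.5 and condition (3.18), both from [BIL 99].

Now, we prove [7.13]. First consider the case where all mean values $m_n$ of $\mu_n$ are zero. Let $S_n$ be the correlation operator of $\mu_n$, $n\ge1$. By corollary 5.4, $S_n$ converges to $S$ in the nuclear norm, which implies that

$$\|S_n\|\to\|S\| > 0, \quad \mathrm{tr}\,S_n\to\mathrm{tr}\,S \quad \text{as } n\to\infty.$$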
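We fix $\delta\in(0,1)$. There exists $n_\delta\in\mathbb{N}$ such that for all $n\ge n_\delta$, $\frac{\delta}{2\|S\|}\le\frac{\sqrt{\delta}}{2\|S_n\|}$. For $n\ge n_\delta$, it holds

$$\int_{H}g_\delta(x)\,d\mu_n(x) \le \int_{H}\exp\left\{\frac{\sqrt{\delta}}{2\|S_n\|}\cdot\|x\|^2\right\}d\mu_n(x) = \frac{1}{\sqrt{\prod_{k=1}^{\infty}(1-2\alpha_n\lambda_k^{(n)})}}, \quad \alpha_n = \frac{\sqrt{\delta}}{2\|S_n\|} < \frac{1}{2\|S_n\|},$$

where $\{\lambda_k^{(n)}, k\ge1\}$ are the eigenvalues of $S_n$ (with multiplicity). Then

$$\int_{H}g_\delta(x)\,d\mu_n(x) \le \exp\left\{-\frac12\sum_{k=1}^{\infty}\log(1-2\alpha_n\lambda_k^{(n)})\right\}.$$

We have $2\alpha_n\lambda_k^{(n)}\le2\alpha_n\|S_n\| = \sqrt{\delta} < 1$; hence

$$-\frac12\log(1-2\alpha_n\lambda_k^{(n)}) \le C_\delta\cdot2\alpha_n\lambda_k^{(n)},$$

with $C_\delta > 0$ depending only on $\delta$. Therefore,

$$\int_{H}g_\delta(x)\,d\mu_n(x) \le \exp\left\{2C_\delta\alpha_n\sum_{k=1}^{\infty}\lambda_k^{(n)}\right\} = \exp\{2C_\delta\alpha_n\cdot\mathrm{tr}\,S_n\}.$$

Since $\alpha_n$ and $\mathrm{tr}\,S_n$ are bounded, relation [7.13] follows.

Now, consider arbitrary mean values $m_n$. By corollary 5.4, $m_n$ converge strongly to $m$, where $m$ is the mean value of $\mu$. Let $X_n\sim N(0,S_n)$, $n\ge1$. Then $m_n + X_n\sim N(m_n,S_n)$, and for each $\tau > 0$,

$$\|m_n+X_n\|^2 \le (1+\tau)\|X_n\|^2 + \left(1+\frac1\tau\right)\|m_n\|^2.$$

Therefore,

$$\int_{H}g_\delta(x)\,d\mu_n(x) = E\,g_\delta(m_n+X_n) \le \int_{H}\exp\left\{\frac{\delta}{2\|S\|}\left[(1+\tau)\|x\|^2 + \left(1+\frac1\tau\right)\|m_n\|^2\right]\right\}d\mu_n^c(x).$$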
Here, μcn is a centered Gaussian measure, with correlation operator Sn . We fix τ such that δ(1 + τ ) = δ0 < 1, and since ||mn || ≤ const, we get
gδ (x)dμn (x) ≤ const H
H
gδ0 (x)dμcn (x).
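The latter integrals are bounded uniformly in $n \ge n_{\delta_0}$ according to the case of zero means, and [7.13] follows, with $n_\delta$ replaced by $n_{\delta_0}$. This completes the proof.

10) We modify the proof of lemma 5.9(a). Denote $\alpha = \xi\sin\phi + \eta\cos\phi$ and $\beta = \xi\cos\phi - \eta\sin\phi$. It holds

$$\begin{aligned}
\phi_{(\alpha;\beta)}(x^*, y^*) &= E\exp\{i\langle\xi, x^*\sin\phi + y^*\cos\phi\rangle\}\; E\exp\{i\langle\eta, x^*\cos\phi - y^*\sin\phi\rangle\} \\
&= \phi_\xi(x^*\sin\phi + y^*\cos\phi)\cdot\phi_\xi(x^*\cos\phi - y^*\sin\phi) \\
&= \exp\Bigl\{-\frac{1}{2}\bigl[\sigma_2(x^*\sin\phi + y^*\cos\phi,\; x^*\sin\phi + y^*\cos\phi) + \sigma_2(x^*\cos\phi - y^*\sin\phi,\; x^*\cos\phi - y^*\sin\phi)\bigr]\Bigr\} \\
&= \exp\Bigl\{-\frac{1}{2}\bigl(\sigma_2(x^*, x^*) + \sigma_2(y^*, y^*)\bigr)\Bigr\} = \phi_{(\xi;\eta)}(x^*, y^*).
\end{aligned}$$

In the transformation, we used that $\sigma_2$ is a symmetric bilinear form on $X^*$. The rest of the proof follows the line of the proof of lemma 5.9(a).

11) Let $0 < \alpha < \frac{1}{2\|S\|}$. Since $u^t e^{-\alpha u} \le (t/\alpha)^t e^{-t}$ for all $u \ge 0$ (the left-hand side attains its maximum at $u = t/\alpha$), it holds $\|x\|^{2t} \le (t/\alpha)^t e^{-t} e^{\alpha\|x\|^2}$, $x \in H$. Hence

$$\int_H \|x\|^{2t}\,d\mu(x) \le \Bigl(\frac{t}{\alpha}\Bigr)^{t} e^{-t} J_\alpha, \qquad J_\alpha = \int_H e^{\alpha\|x\|^2}\,d\mu(x).$$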
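Now, we bound $J_\alpha$. We use eigenvalues $\lambda_n$ of $S$ (with multiplicity):

$$J_\alpha = \exp\left\{-\frac{1}{2}\sum_{n=1}^{\infty}\log(1 - 2\alpha\lambda_n)\right\}.$$

Since $0 \le 2\alpha\lambda_n \le 2\alpha\|S\| < 1$, we obtain from the convexity of the function $g(t) = -\log(1 - t)$, $t < 1$ (together with $g(0) = 0$):

$$-\frac{1}{2}\log(1 - 2\alpha\lambda_n) \le -\frac{1}{2}\cdot\frac{2\alpha\lambda_n}{2\alpha\|S\|}\log(1 - 2\alpha\|S\|) = -\frac{\lambda_n\log(1 - 2\alpha\|S\|)}{2\|S\|},$$

$$J_\alpha \le (1 - 2\alpha\|S\|)^{-\frac{\operatorname{tr}S}{2\|S\|}}.$$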
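Thus,

$$\int_H \|x\|^{2t}\,d\mu(x) \le t^{t} e^{-t}\alpha^{-t}(1 - 2\alpha\|S\|)^{-\frac{\operatorname{tr}S}{2\|S\|}}.$$

This implies the statement. The right-hand side, as a function of $\alpha$, attains its minimum at the point

$$\alpha_0 = \frac{t}{2t\|S\| + \operatorname{tr}S}.$$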
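The bound on $J_\alpha$ and the location of the minimizer $\alpha_0$ can be checked numerically in a finite-dimensional toy setting. The sketch below is only an illustration, not part of the solution: the spectrum `EIGS` ($\lambda_k = 1/k^2$, $k \le 50$), the value `t = 1.5` and all function names are assumptions chosen for the example.

```python
import math

# Hypothetical finite spectrum standing in for the eigenvalues of S.
EIGS = [1.0 / k**2 for k in range(1, 51)]


def j_exact(alpha, eigs):
    """J_alpha = prod_k (1 - 2*alpha*lam_k)^(-1/2) for a finite spectrum."""
    return math.exp(-0.5 * sum(math.log1p(-2.0 * alpha * lam) for lam in eigs))


def j_bound(alpha, eigs):
    """Upper bound (1 - 2*alpha*||S||)^(-tr S / (2*||S||))."""
    norm_s, tr_s = max(eigs), sum(eigs)
    return (1.0 - 2.0 * alpha * norm_s) ** (-tr_s / (2.0 * norm_s))


def moment_bound(alpha, t, eigs):
    """t^t e^(-t) alpha^(-t) (1 - 2*alpha*||S||)^(-tr S / (2*||S||))."""
    return t**t * math.exp(-t) * alpha ** (-t) * j_bound(alpha, eigs)


def alpha_opt(t, eigs):
    """Candidate minimizer alpha_0 = t / (2*t*||S|| + tr S)."""
    return t / (2.0 * t * max(eigs) + sum(eigs))


t = 1.5  # illustrative exponent
a0 = alpha_opt(t, EIGS)
print("alpha_0      =", a0)
print("J exact      =", j_exact(a0, EIGS))
print("J bound      =", j_bound(a0, EIGS))
print("moment bound =", moment_bound(a0, t, EIGS))
```

Evaluating `moment_bound` slightly to the left and right of `a0` gives larger values, in agreement with the stated minimizer.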
12) Let $\xi$ be a Gaussian random element with distribution $\mu$. Then $\xi - m$ is a centered Gaussian random element in $X$, and by theorem 5.9 there exists $\alpha_0 > 0$ with $E\exp\{\alpha_0\|\xi - m\|^2\} < \infty$. Now, for any $\varepsilon > 0$,

$$\|\xi\|^2 \le (\|\xi - m\| + \|m\|)^2 \le (1 + \varepsilon)\|\xi - m\|^2 + \Bigl(1 + \frac{1}{\varepsilon}\Bigr)\|m\|^2,$$

and for $\alpha > 0$,

$$E\exp\{\alpha\|\xi\|^2\} \le \exp\Bigl\{\alpha\Bigl(1 + \frac{1}{\varepsilon}\Bigr)\|m\|^2\Bigr\}\, E\exp\{\alpha(1 + \varepsilon)\|\xi - m\|^2\}.$$

We set $\alpha = \frac{\alpha_0}{1 + \varepsilon}$ and obtain the desired relation.

Since $\varepsilon$ is arbitrary, actually we have proved more: if $E\exp\{\alpha_0\|\xi - m\|^2\} < \infty$ (here $\alpha_0$ is a certain positive number), then $E\exp\{\alpha\|\xi\|^2\} < \infty$ for all $\alpha < \alpha_0$.

13) The sequence $\{\xi_n\}$ converges in probability to $z$; therefore, the Gaussian measures $\mu_n := \mu_{\xi_n}$ converge weakly to $\delta_z$, the Dirac measure at the point $z$ (see [BIL 99]). Fix $\tau > \|z\|$, then $\delta_z(\bar{B}(0, \tau)) =: c = 1$. In this case, from the proof of theorem 5.10 we obtain that for any $\beta > 0$ there exists $n_\beta \ge 1$ such that

$$\sup_{n \ge n_\beta} E\exp\{\beta\|\xi_n\|^2\} < \infty.$$
We set $\beta = 2\alpha$ where $\alpha$ is the number from problem (13). Then

$$\sup_{n \ge n_\beta} E f^2(\xi_n) \le \sup_{n \ge n_\beta} E\exp\{2\alpha\|\xi_n\|^2\} < \infty.$$

Since $f$ is continuous almost everywhere (a.e.) with respect to the limit measure $\delta_z$, we obtain (see [BIL 99]):

$$E f(\xi_n) \to E f(\xi) = f(z) \quad \text{as } n \to \infty.$$

Here, $\xi(\omega) \equiv z$, $\xi_n \to \xi$ in distribution.

14) We modify the proof demonstrated in example 5.2. Since the measure $\sigma$ is sigma-finite, $B^* = L_q(T, \sigma)$. For $x^* \in L_q(T, \sigma)$, we explain why

$$\eta := \int_T \xi(t)\,x^*(t)\,d\sigma(t)$$

is a Gaussian r.v. (all the other reasonings from example 5.2 are easily extended to the measure $\sigma$ on $T$ instead of the Lebesgue measure on the interval).
First one can show by Fubini’s theorem that $\eta$ is a r.v. There exists an increasing sequence $\{T_n\}$ such that $T = \cup_{n=1}^{\infty} T_n$, $T_n \in \mathcal{S}$, $n \ge 1$ (here $\mathcal{S}$ is a sigma-algebra where $\sigma$ is defined), and $\sigma(T_n) \le n$ for all $n$. Denote

$$x_n^* = x_n^*(t) = x^*(t)\cdot I_{A_n}(t), \qquad A_n = \{t \in T_n : (x^*(t))^2\cdot E\,\xi^2(t) \le n\}.$$

It is enough to show that $\eta$ is Gaussian, with $x_n^*$ instead of $x^*$. In this case $\sigma(T_n) < \infty$,

$$\eta = \int_{T_n}\xi(t)\,x_n^*(t)\,d\sigma(t), \qquad \int_{T_n} E\,\xi^2(t)\cdot(x_n^*(t))^2\,d\sigma(t) \le n^2 < \infty,$$

and $\eta \in L_2(\Omega, \mathsf{P})$. The rest of the proof of the fact that such an $\eta$ is Gaussian follows the line of proof in example 5.2, where the Lebesgue integral over $[0, T]$ is replaced with the integral over $T$ w.r.t. the measure $\sigma$.

7.6. Solutions for Chapter 6

1) a) Necessity follows from the inequality $\limsup_{n\to\infty} z_n \le \sup_{n \ge 1} z_n$.
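b) Sufficiency. Fix $\varepsilon > 0$. From the given relation, it follows that there exists a positive $\alpha_0$ with

$$\limsup_{n\to\infty} E\,|X_n|\cdot I(|X_n| \ge \alpha_0) < \varepsilon.$$

Hence for some $n_0 \ge 2$,

$$\sup_{n \ge n_0} E\,|X_n|\cdot I(|X_n| \ge \alpha_0) < 2\varepsilon.$$

Since $E\,|X_n| < \infty$, it holds $\lim_{\alpha\to+\infty} E\,|X_n|\cdot I(|X_n| \ge \alpha) = 0$ for each $n$. Therefore, there exists $\alpha_1 > 0$ with

$$\sum_{n=1}^{n_0 - 1} E\,|X_n|\cdot I(|X_n| \ge \alpha_1) < \varepsilon.$$

For $\alpha_2 := \max\{\alpha_0, \alpha_1\}$, we have

$$\sup_{n \ge 1} E\,|X_n|\cdot I(|X_n| \ge \alpha_2) \le \sum_{n=1}^{n_0 - 1} E\,|X_n|\cdot I(|X_n| \ge \alpha_2) + \sup_{n \ge n_0} E\,|X_n|\cdot I(|X_n| \ge \alpha_2) < \varepsilon + 2\varepsilon = 3\varepsilon.$$

Thus, for all $\alpha \ge \alpha_2$ we obtain

$$\sup_{n \ge 1} E\,|X_n|\cdot I(|X_n| \ge \alpha) < 3\varepsilon.$$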
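This proves [6.1].

2) Let $\xi$ have a Cauchy distribution and $\{\varepsilon_n\}$ be a sequence of positive numbers with $\varepsilon_n \to 0$ and $n\varepsilon_n \to \infty$ as $n \to \infty$. We set $X_n = \varepsilon_n\,\xi\,I(|\xi| \le n)$ and will specify $\varepsilon_n$ more precisely later. Since $X_n$ has a symmetric distribution on a finite interval, it holds $E X_n = 0 \to 0$ as $n \to \infty$, and $|X_n| \le \varepsilon_n|\xi| \to 0$ a.s., hence $X_n \to X = 0$ a.s. For $\alpha > 0$, we have, if $n\varepsilon_n \ge \alpha$:

$$A_n := E\,|X_n|\cdot I(|X_n| \ge \alpha) = \varepsilon_n\,E\,|\xi|\cdot I\Bigl(\frac{\alpha}{\varepsilon_n} \le |\xi| \le n\Bigr) = 2\varepsilon_n\cdot\frac{1}{\pi}\int_{\alpha/\varepsilon_n}^{n}\frac{x\,dx}{1 + x^2} = \frac{\varepsilon_n}{\pi}\Bigl(\log(1 + n^2) - \log\Bigl(1 + \frac{\alpha^2}{\varepsilon_n^2}\Bigr)\Bigr).$$

We select $\varepsilon_n = \frac{1}{\log(1 + n^2)}$, $n \ge 1$, such that $A_n$ tends to a positive number. Indeed, with such a choice, $A_n \to \frac{1}{\pi}$ as $n \to \infty$.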
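The convergence of $A_n$ to $1/\pi$ is logarithmically slow, which a short numerical sketch makes visible. The code evaluates the closed-form expression for $A_n$ with $\varepsilon_n = 1/\log(1 + n^2)$; the function names `eps` and `a_n` and the chosen values of $n$ and $\alpha$ are illustrative assumptions.

```python
import math


def eps(n):
    """eps_n = 1 / log(1 + n^2), the choice made in the solution."""
    return 1.0 / math.log1p(n * n)


def a_n(n, alpha):
    """Closed form of A_n = E|X_n| I(|X_n| >= alpha), valid when n*eps_n >= alpha."""
    e = eps(n)
    if n * e < alpha:
        raise ValueError("formula requires n * eps_n >= alpha")
    return (e / math.pi) * (math.log1p(n * n) - math.log1p((alpha / e) ** 2))


alpha = 1.0  # illustrative truncation level
values = [a_n(10.0**p, alpha) for p in (2, 10, 50, 100)]
for p, v in zip((2, 10, 50, 100), values):
    print(f"n = 1e{p}: A_n = {v:.4f}")
print(f"1/pi = {1.0 / math.pi:.4f}")
```

The printed values increase towards $1/\pi$ but stay below it for every finite $n$.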
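$$B(\alpha) := \sup_{n \ge 1} A_n \ge \lim_{n\to\infty} A_n = \frac{1}{\pi},$$

and $B(\alpha)$ does not tend to 0 as $\alpha \to +\infty$. Thus, $\{X_n\}$ is not uniformly integrable.

3) a) Sufficiency. Denote $\varepsilon(\alpha) = \sup_{x \ge \alpha}\frac{x}{G(x)}$ and note that $\varepsilon(\alpha) \to 0$ as $\alpha \to +\infty$ and

$$x\,I(x \ge \alpha) \le \varepsilon(\alpha)\,G(x), \qquad x \ge 0.$$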
Therefore,

$$\sup_{n \ge 1} E\,|X_n|\cdot I(|X_n| \ge \alpha) \le \varepsilon(\alpha)\sup_{n \ge 1} E\,G(|X_n|) \to 0 \quad \text{as } \alpha \to +\infty.$$
b) Necessity. If $\sup_{n \ge 1}|X_n|$ is a bounded r.v., then $G(t) = t^2$, $t \ge 0$, satisfies the statement. Now, suppose that $\sup_{n \ge 1}|X_n|$ is an unbounded r.v. We set $\phi_\alpha(x) = x\,I(x \ge \alpha)$, $x \ge 0$, $\alpha > 0$, and

$$G(t) = t\Bigl(\sup_{n \ge 1} E\,\phi_t(|X_n|)\Bigr)^{-1/2}, \qquad t \ge 0.$$

Then $G(t)$ is an increasing function as a product of an increasing function $g(t) = t$, $t \ge 0$, and a non-decreasing function

$$h(t) = \Bigl(\sup_{n \ge 1} E\,\phi_t(|X_n|)\Bigr)^{-1/2}, \qquad t \ge 0.$$

Moreover, since $\{X_n\}$ is uniformly integrable, it holds $\frac{G(t)}{t} = h(t) \to +\infty$ as $t \to +\infty$.
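Let $F_n$ be the cdf of $|X_n|$ and $\tau_n \in [0, +\infty]$ be the right point of the distribution of $X_n$. (This means in case $\tau_n = +\infty$ that $X_n$ is unbounded, and in case $\tau_n < +\infty$ that $\mathsf{P}(|X_n| \le \tau_n) = 1$ and $\mathsf{P}(|X_n| > \tau_n - \varepsilon) > 0$, for each $\varepsilon > 0$.) We have

$$E\,G(|X_n|) \le \int_0^{\tau_n}\frac{t\,dF_n(t)}{\sqrt{\int_t^{\tau_n} x\,dF_n(x)}} = -2\int_0^{\tau_n} d\sqrt{\int_t^{\tau_n} x\,dF_n(x)} = 2\sqrt{\int_0^{\tau_n} x\,dF_n(x)} = 2\sqrt{E\,|X_n|}.$$

Here, the integrals are Lebesgue–Stieltjes integrals. Hence by lemma 6.1(a),

$$\sup_{n \ge 1} E\,G(|X_n|) \le 2\Bigl(\sup_{n \ge 1} E\,|X_n|\Bigr)^{1/2} < \infty.$$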
4) Let $\nu_1$ and $\mu_1$ be uniform distributions on $[0, 2]$ and $[1, 3]$, respectively. Consider $\nu = \prod_{k=1}^{\infty}\nu_k$ and $\mu = \prod_{k=1}^{\infty}\mu_k$, where $\{\nu_k, k \ge 2\}$ are arbitrary probability measures on $\mathcal{B}$ and $\mu_k = \nu_k$, $k \ge 2$. Since $\nu_1$ is not absolutely continuous w.r.t. $\mu_1$, $\nu$ is not absolutely continuous w.r.t. $\mu$ either (see theorem 6.4). It holds $\nu_1 = \frac{1}{2}(\nu_1' + \nu_1'')$ where $\nu_1'$ and $\nu_1''$ are uniform distributions on $[1, 2]$ and $[0, 1]$, respectively. Then

$$\nu = \frac{1}{2}(\nu' + \nu''), \qquad \nu' = \nu_1' \times \prod_{k=2}^{\infty}\nu_k, \qquad \nu'' = \nu_1'' \times \prod_{k=2}^{\infty}\nu_k.$$
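By theorem 6.4, $\nu' \ll \mu$. Since $\nu_1'' \perp \mu_1$, it holds $\nu'' \perp \mu$. Hence $\nu$ is neither absolutely continuous w.r.t. $\mu$ nor singular to $\mu$. In a similar way, it is shown that $\mu$ is not absolutely continuous w.r.t. $\nu$ either.

5) Denote $\nu_k = Pois(\lambda_k)$, $\mu_k = Pois(\hat\lambda_k)$ and $f_k = \frac{d\nu_k}{d\mu_k}$. It holds $\nu_k \sim \mu_k$ and for $n \ge 0$,

$$\nu_k(\{n\}) = \int_{\{n\}} f_k(x)\,d\mu_k(x) = f_k(n)\,\frac{e^{-\hat\lambda_k}}{n!}\hat\lambda_k^{\,n}, \qquad f_k(n) = e^{-(\lambda_k - \hat\lambda_k)}\Bigl(\frac{\lambda_k}{\hat\lambda_k}\Bigr)^{n}.$$

Compute the Hellinger integral:

$$H(\nu_k, \mu_k) = \int_{\mathbb{R}}\sqrt{f_k}\,d\mu_k(x) = \sum_{n=0}^{\infty} e^{-\frac{1}{2}(\lambda_k - \hat\lambda_k)}\Bigl(\frac{\lambda_k}{\hat\lambda_k}\Bigr)^{n/2}\frac{e^{-\hat\lambda_k}}{n!}\hat\lambda_k^{\,n} = e^{-\frac{1}{2}\bigl(\sqrt{\lambda_k} - \sqrt{\hat\lambda_k}\bigr)^2};$$

$$\prod_{k=1}^{\infty} H(\nu_k, \mu_k) = \exp\Bigl\{-\frac{1}{2}\sum_{k=1}^{\infty}\bigl(\sqrt{\lambda_k} - \sqrt{\hat\lambda_k}\bigr)^2\Bigr\}.$$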
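The closed form of the Hellinger integral for two Poisson laws can be confirmed by summing the series directly. The following sketch is an illustration under assumed parameter values; the helper names `hellinger_series` and `hellinger_closed` and the truncation at 200 terms are choices made for this example only.

```python
import math


def hellinger_series(lam, lam_hat, terms=200):
    """Truncated series sum_n sqrt(f(n)) * mu({n}) with mu = Pois(lam_hat)."""
    total = 0.0
    for n in range(terms):
        # log of exp(-(lam - lam_hat)/2) * (lam/lam_hat)^(n/2) * exp(-lam_hat) * lam_hat^n / n!
        log_term = (
            -0.5 * (lam - lam_hat)
            + 0.5 * n * math.log(lam / lam_hat)
            - lam_hat
            + n * math.log(lam_hat)
            - math.lgamma(n + 1)
        )
        total += math.exp(log_term)
    return total


def hellinger_closed(lam, lam_hat):
    """Closed form exp(-(sqrt(lam) - sqrt(lam_hat))^2 / 2)."""
    return math.exp(-0.5 * (math.sqrt(lam) - math.sqrt(lam_hat)) ** 2)


# Illustrative parameter pairs (lambda_k, hat-lambda_k).
pairs = [(2.0, 3.5), (0.3, 0.1), (10.0, 10.0)]
for lam, lam_hat in pairs:
    print(lam, lam_hat, hellinger_series(lam, lam_hat), hellinger_closed(lam, lam_hat))
```

Working with logarithms of the terms avoids overflow in $\hat\lambda_k^{\,n}$ and $n!$ for larger $n$.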
If $\sum_{k=1}^{\infty}\bigl(\sqrt{\lambda_k} - \sqrt{\hat\lambda_k}\bigr)^2 < \infty$, then the latter infinite product converges, and if $\sum_{k=1}^{\infty}\bigl(\sqrt{\lambda_k} - \sqrt{\hat\lambda_k}\bigr)^2 = \infty$, then the product diverges to zero. Now, the desired statement follows from theorem 6.4 and corollary 6.2.

6) a) Let $\mu \ll \nu$. Take $A \in \mathcal{S}$ with $\nu_e(A) = 0$. Then $\nu(A \cap X_r) = \nu_e(A) = 0$; hence $\mu_e(A) = \mu(A \cap X_r) = 0$, and $\mu_e \ll \nu_e$.

Now, assume that $\mu_e \ll \nu_e$. Take $B \in X_r \cap \mathcal{S}$ with $\nu(B) = 0$. Hence $\nu_e(B) = 0$ and $0 = \mu_e(B) = \mu(B)$, and $\mu \ll \nu$.

b) Let $\mu \perp \nu$. Then $X_r = X_r' \cup X_r''$ with $X_r' \cap X_r'' = \emptyset$, $\mu(X_r') = 0$, $\nu(X_r'') = 0$. Hence $\mu_e(X_r') = 0$, $\nu_e(X \setminus X_r') = \nu(X_r'') = 0$, and $\mu_e \perp \nu_e$.

Now, assume that $\mu_e \perp \nu_e$. Then $X = X' \cup X''$ with $X' \cap X'' = \emptyset$, $\mu_e(X') = 0$, $\nu_e(X'') = 0$. Hence $X_r = X_r' \cup X_r''$ with $X_r' = X' \cap X_r$, $X_r'' = X'' \cap X_r$, $\mu(X_r') = 0$ and $\nu(X_r'') = 0$. Thus, $\mu \perp \nu$.

7) Similarly to theorem 6.7, we obtain for $a_1 - a_2 = B^{1/2}b$, $b \in H$ (here $(b, e_k) = 0$ whenever $\beta_k = 0$):

$$\frac{d\mu}{d\nu}(x) = \exp\Bigl\{-\frac{\|b\|^2}{2} + \bigl(b, B^{-1/2}(x - a_2)\bigr)\Bigr\}.$$

Here, $(b, B^{-1/2}(x - a_2))$ is a measurable functional on the probability space $(H, \mathcal{B}(H), \nu)$ (see remark 6.6),

$$\bigl(b, B^{-1/2}(x - a_2)\bigr) = \lim_{n\to\infty}\sum_{k=1}^{n}\frac{b_k(x_k - a_{2,k})}{\sqrt{\beta_k}},$$

and the limit exists a.e. with respect to $\nu$; if some $\beta_{k_0} = 0$, then $x_{k_0} = a_{2,k_0}$ (mod $\nu$) and $\frac{b_{k_0}(x_{k_0} - a_{2,k_0})}{\sqrt{\beta_{k_0}}} = 0$ by definition.
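In finite dimension with a diagonal covariance and all $\beta_k > 0$, the density formula of solution 7 reduces to a ratio of products of one-dimensional normal densities, which can be compared numerically. The sketch below assumes $\mu = N(a_1, B)$ and $\nu = N(a_2, B)$ on $\mathbb{R}^3$ with $B = \mathrm{diag}(\beta_1, \beta_2, \beta_3)$; the vectors and function names are hypothetical and serve only as a sanity check.

```python
import math


def normal_pdf(x, mean, var):
    """Density of the one-dimensional normal law N(mean, var)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def ratio_direct(x, a1, a2, beta):
    """dmu/dnu as a product of coordinatewise density ratios (all beta_k > 0)."""
    r = 1.0
    for xk, a1k, a2k, bk in zip(x, a1, a2, beta):
        r *= normal_pdf(xk, a1k, bk) / normal_pdf(xk, a2k, bk)
    return r


def ratio_formula(x, a1, a2, beta):
    """exp{-|b|^2/2 + sum_k b_k (x_k - a2_k)/sqrt(beta_k)} with b = B^(-1/2)(a1 - a2)."""
    b = [(a1k - a2k) / math.sqrt(bk) for a1k, a2k, bk in zip(a1, a2, beta)]
    norm_b_sq = sum(v * v for v in b)
    pairing = sum(v * (xk - a2k) / math.sqrt(bk)
                  for v, xk, a2k, bk in zip(b, x, a2, beta))
    return math.exp(-0.5 * norm_b_sq + pairing)


# Hypothetical example data in R^3.
x = [0.3, -1.2, 2.0]
a1 = [1.0, 0.0, -0.5]
a2 = [0.0, 0.5, 0.5]
beta = [1.0, 0.25, 0.04]
print(ratio_direct(x, a1, a2, beta), ratio_formula(x, a1, a2, beta))
```

The two printed numbers agree up to floating-point rounding, which is the finite-dimensional content of the formula.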
8) The necessity is obvious. Now, we prove the sufficiency. Let $\{A_n\}$ satisfy the conditions of the problem. Decompose $\nu = \nu_1 + \nu_2$ where $\nu_1$ and $\nu_2$ are finite measures on $\mathcal{S}$ with $\nu_1 \ll \mu$, $\nu_2 \perp \mu$. Since $\mu(A_n) \to 0$, it holds $\nu_1(A_n) \to 0$, and $\nu_2(A_n) = \nu(A_n) - \nu_1(A_n) \to 1$ as $n \to \infty$. Hence $\nu_2(X) \ge 1$, $\nu_1(X) = \nu(X) - \nu_2(X) \le 1 - 1 = 0$, $\nu_1 = 0$ and $\nu = \nu_2 \perp \mu$.

9) Let $\xi \sim N(a_2, B_2)$, then $t\xi + c \sim N(a_1, B_1)$ for some $a_1 \in H$ and $B_1 = t^2 B_2$. If $t = 0$, then $\mathrm{Ker}\,B_1 \ne \mathrm{Ker}\,B_2$ and $\mu_\xi \perp \mu_{t\xi + c}$ (see remark 6.6). If $t \in \mathbb{R} \setminus \{-1, 0, 1\}$, then $J := B_2^{-1/2} B_1 B_2^{-1/2} = t^2 B_2^{-1/2} B_2 B_2^{-1/2}$, and $Jx = t^2 x$, $x \in H \ominus \mathrm{Ker}\,B_2 =: L$, $\dim L = \infty$. For $D := B_2^{-1/2} B_1 B_2^{-1/2} - I$, it holds $D|_L = (t^2 - 1) I_L$, where $I_L$ is the identity operator on $L$. Since $t^2 - 1 \ne 0$, $D|_L$ is not a compact operator on $L$; hence $D$ is not compact as well, and the Feldman–Hájek theorem implies that $N(a_1, B_1) \perp N(a_2, B_2)$.

10) Since $(\hat\alpha_1, \ldots, \hat\alpha_m)$ is a Gaussian random vector, the estimator $\hat b$ is a Gaussian random element on $H$. Its correlation operator $S_{\hat b}$ coincides with the one of $\hat b_0 := \sum_{i=1}^{m}(f_i, B^{-1/2}X_0) f_i$. The random vector $\bigl((f_i, B^{-1/2}X_0)\bigr)_{i=1}^{m}$ has distribution $N(0, I_m)$. Hence $\hat b_0 \sim N(0, P_{U_m})$, where $U_m = \mathrm{span}(f_1, \ldots, f_m)$ and $P_{U_m}$ is the orthoprojector on $U_m$. Thus, $S_{\hat b} = P_{U_m}$. Since the MLE $\hat a = B^{1/2}\hat b$, its correlation operator is $B^{1/2} P_{U_m} B^{1/2}$. Because the estimator $\hat a$ is unbiased, $\hat a \sim N(a, B^{1/2} P_{U_m} B^{1/2})$.

Answer: $\hat a \sim N(a, B^{1/2} P_{U_m} B^{1/2})$, where $P_{U_m}$ is the orthoprojector on $\mathrm{span}(f_1, \ldots, f_m)$.
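11) For any real $C$,

$$\mathsf{P}\{\xi + \eta = C\} = \iint_{x + y = C} d\mu_\xi(x)\,d\mu_\eta(y) = \int_{\mathbb{R}} d\mu_\eta(y)\int_{\{C - y\}} d\mu_\xi(x) = \int_{\mathbb{R}} \mathsf{P}\{\xi = C - y\}\,d\mu_\eta(y) = 0,$$

since the cdf of $\xi$ is continuous at any point $C - y$. Here, $\mu_\xi$ and $\mu_\eta$ are the distributions of $\xi$ and $\eta$, respectively.

12) Let the random element $X = (X_k)_1^{\infty}$ on $\mathbb{R}^{\infty}$ be related to $Y$ as in section 6.4.2. Under $H_0$, $\{X_k\}$ are i.i.d. standard normal and under $H^*$, $\{X_k\}$ are independent with $X_k \sim N(0, 1 + \beta_k)$, $k \ge 1$. Introduce $S(X)$ and $Q(X)$ defined in [6.70] and [6.72], respectively. Like in section 6.4.2, $\mathsf{P}_{H_0}\{S(X) = C\} = 0$ and the Neyman–Pearson test is constructed based on relation [6.73]: reject $H_0$ if $Q(X) > \tilde C_\alpha$, and do not reject $H_0$ if $Q(X) \le \tilde C_\alpha$. Since $\beta_k \le 0$, $k \ge 1$, equation [6.73] takes the form

$$\mathsf{P}_{H_0}\bigl\{\tilde Q(X) < -\tilde C_\alpha\bigr\} = \alpha, \qquad \tilde Q(X) := \sum_{k=1}^{\infty}\frac{|\beta_k|\,X_k^2}{1 + \beta_k}.$$

Using [6.75]–[6.76], we approximate the distribution of $\tilde Q(X)$ by the distribution $A\chi^2_\nu$,

$$A = \sum_{k=1}^{\infty}\frac{\beta_k^2}{(1 + \beta_k)^2} : \sum_{k=1}^{\infty}\frac{|\beta_k|}{1 + \beta_k}, \qquad \nu = \Bigl(\sum_{k=1}^{\infty}\frac{|\beta_k|}{1 + \beta_k}\Bigr)^{2} : \sum_{k=1}^{\infty}\frac{\beta_k^2}{(1 + \beta_k)^2}.$$

Hence $\tilde C_\alpha = -A\chi^2_{\nu, 1-\alpha}$, where $\chi^2_{\nu, 1-\alpha}$ is the upper $(1 - \alpha)$-quantile of the $\chi^2_\nu$ distribution.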
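The constants $A$ and $\nu$ are the values for which $A\chi^2_\nu$ has the same mean and variance as the weighted sum $\sum_k c_k X_k^2$ with $c_k = |\beta_k|/(1 + \beta_k)$ under $H_0$, namely mean $\sum_k c_k$ and variance $2\sum_k c_k^2$. The sketch below checks this two-moment matching for a hypothetical finite sequence $\beta_k = -0.5/k^2$, $k \le 30$; the sequence and the helper name `match_two_moments` are assumptions made for illustration.

```python
def match_two_moments(weights):
    """Return (A, nu) such that A*chi2_nu matches the mean and variance of sum_k w_k X_k^2."""
    s1 = sum(weights)                 # mean of the weighted sum of chi2_1 variables
    s2 = sum(w * w for w in weights)  # half of its variance
    return s2 / s1, s1 * s1 / s2


# Hypothetical truncated sequence with -1 < beta_k <= 0.
betas = [-0.5 / k**2 for k in range(1, 31)]
weights = [abs(b) / (1.0 + b) for b in betas]

A, nu = match_two_moments(weights)
print("A  =", A)
print("nu =", nu)
print("mean check:    ", A * nu, sum(weights))
print("variance check:", 2.0 * A * A * nu, 2.0 * sum(w * w for w in weights))
```

With the same helper applied to the weights $|\beta_k|$, one obtains constants of the same form as those used for the approximation under the alternative.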
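Under $H^*$, $\tilde Q(X) = \sum_{k=1}^{\infty}|\beta_k|\,\gamma_k^2$, $\gamma_k = \frac{X_k}{\sqrt{1 + \beta_k}}$, $\{\gamma_k\}$ are i.i.d. standard normal, and we approximate the distribution of $\tilde Q(X)$ by the distribution $B\chi^2_\tau$ with

$$B = \sum_{k=1}^{\infty}\beta_k^2 : \sum_{k=1}^{\infty}|\beta_k|, \qquad \tau = \frac{\bigl(\sum_{k=1}^{\infty}|\beta_k|\bigr)^2}{\sum_{k=1}^{\infty}\beta_k^2}.$$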
The power of the test is as follows: power ≈ P
%
Bχ2τ