English Pages 192 Year 1993
Large Deviations for Discrete-Time Processes with Averaging
O.V. Gulinsky and A.Yu. Veretennikov
VSP
Utrecht, The Netherlands, 1993
VSP BV, P.O. Box 346, 3700 AH Zeist, The Netherlands
© VSP BV 1993. First published in 1993. ISBN 90-6764-148-0
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner.
CIP-DATA KONINKLIJKE BIBLIOTHEEK, DEN HAAG
Gulinsky, O.V.
Large deviations for discrete-time processes with averaging / O.V. Gulinsky and A.Yu. Veretennikov. Utrecht: VSP
With ref.
ISBN 90-6764-148-0 bound
NUGI 815
Subject headings: Cramer's theorem / Markov processes.
Printed in The Netherlands by Koninklijke Wöhrmann, Zutphen.
Contents

Preface

Chapter 1. Introduction to large deviations
  1.1. Cramér-type results
    1.1.1. The classical Cramér theorem
    1.1.2. The extensions of Cramér's theorem
  1.2. Large deviations on the space of probability measures
  1.3. Applications to statistical mechanics
  1.4. Basic large deviations concepts
  1.5. Large deviations for sums of independent and identically distributed variables in function space
  1.6. Applications to recursive estimation and control theory

Chapter 2. Large deviations for the non-Markovian recursive scheme with additive 'white noise'

Chapter 3. Large deviations for the recursive scheme with stationary disturbances
  3.1. Large deviations for the sums of stationary sequence with the Wold-type representation
  3.2. Large deviations for the recursive scheme with the Wold-type disturbances

Chapter 4. Generalization of Cramér's theorem
  4.1. Large deviations for sums of stationary sequences
  4.2. Large deviations for sums of semimartingales

Chapter 5. Mixing for Markov processes
  5.1. Definitions
  5.2. Main results
  5.3. Preliminary results
  5.4. Proofs of Theorems 5.1-5.6
  5.5. Mixing coefficients for recursive procedures

Chapter 6. The averaging principle for some recursive stochastic schemes with state dependent noise

Chapter 7. Normal deviations

Chapter 8. Large deviations for Markov processes
  8.1. Gärtner's theorem
  8.2. Examples
  8.3. Markovian non-compact case
  8.4. Auxiliary results
  8.5. Proofs of Theorems 8.6-8.8
  8.6. Proof of Theorem 8.9

Chapter 9. Large deviations for stationary processes
  9.1. Compact non-singular case
  9.2. Non-compact non-singular case

Chapter 10. Large deviations for empirical measures
  10.1. Introduction
  10.2. Markov chain with Doeblin-type condition
  10.3. Non-compact Markov case
  10.4. Stationary compact case
  10.5. Stationary non-compact case

Chapter 11. Large deviations in averaging principle
  11.1. Compact case
  11.2. Non-compact case

Bibliography
Preface

Let $\xi = (\xi_k)_{k \ge 1}$ be a sequence of independent and identically distributed (i.i.d.) random variables with a common distribution $\mu$, defined on a probability space $(\Omega, \mathcal{F}, \mathsf{P})$. Assuming that $\int_{\mathbb{R}} |x|\,\mu(dx) < \infty$, the weak law of large numbers says that $n^{-1}\sum_{i=1}^{n} \xi_i$ converges as $n \to \infty$ to $m = \int_{\mathbb{R}} x\,\mu(dx)$. Suppose that $\mu$ has mean 0 and variance 1. Then by the classical central limit theorem, for each $x \in \mathbb{R}$,
\[ F_n(x) = \mathsf{P}\Bigl( n^{-1/2} \sum_{i=1}^{n} \xi_i \le x \Bigr) \]
converges as $n \to \infty$ to $(2\pi)^{-1/2} \int_{-\infty}^{x} e^{-u^2/2}\,du$.

The theory of large deviations emerged as an attempt to answer the question of how fast the tails of the distribution $F_n$ approach the Gaussian tails. The first general answer, given by Cramér (1938), has the following form: under Cramér's condition $\mathsf{E}\exp(\lambda \xi_1) < \infty$, $\lambda > 0$, for $x > 0$, $x = o(\sqrt{n})$ ($n \to \infty$) [...] where $a > 0$ does not depend on $n$. It turns out (Cramér (1938), see also Chernoff (1952)) that under Cramér's condition the following limit exists:
\[ \lim_{n \to \infty} n^{-1} \ln \mathsf{P}\Bigl( \sum_{i=1}^{n} \xi_i > na \Bigr) = -I(a), \]
where the rate function $I(a)$ is the Legendre transform of the logarithmic moment generating function $G(\lambda) = \ln \mathsf{E}\exp(\lambda \xi_1)$:
\[ I(a) = \sup_{\lambda}\,[\lambda a - G(\lambda)]. \]
In the general case, a sequence $(F_n)_{n \ge 1}$ of probability measures on the Borel $\sigma$-field of a metric space $X$ is said to have the large deviation property if there exists a function $I(\cdot)$ such that for each open set $G$ in $X$
\[ \liminf_{n \to \infty} n^{-1} \ln F_n(G) \ge -\inf_{x \in G} I(x), \]
for each closed set $K$ in $X$
\[ \limsup_{n \to \infty} n^{-1} \ln F_n(K) \le -\inf_{x \in K} I(x), \]
and the level sets of the rate function $I(x)$ are compact in $X$.

Cramér's result was subsequently extended by many authors. Sanov (1957) was the first to formulate and prove the large deviation property on a space of probability measures. He considered large deviations for the empirical distribution
\[ \mu_n(A) = n^{-1} \sum_{i=1}^{n} I(\xi_i \in A), \]
with $A \in \mathcal{B}(\mathbb{R})$ (here $I$ is the indicator). This formula determines a measure-valued process with values in the space $M_1(\mathbb{R})$ of probability measures on $\mathbb{R}$. If $M_1$ is supplied with an appropriate metric $\rho(\alpha, \nu)$, $\alpha, \nu \in M_1$, it is reasonable to inquire about the logarithmic asymptotics of the probability $\mathsf{P}(\rho(\mu_n, \nu) < \delta)$, $\nu \in M_1$, $\delta > 0$. The rate function for this problem was found to be the relative entropy function
\[ \int_{\mathbb{R}} \ln \frac{d\nu}{d\mu}(x)\,\nu(dx), \quad \nu \in M_1, \]
introduced into statistics by Kullback and Leibler (1951). At the same time, this rate function can be represented as the Legendre transform of the corresponding logarithmic moment generating function.

The large deviation theory for the occupation time functional was next developed by Donsker and Varadhan (1975a, 1975b, 1976) for Markov chains and processes. Similar results were independently obtained by Gärtner (1977). Bahadur and Zabell (1979) used ideas introduced by Ruelle (1965, 1967) for studying thermodynamic limits in the context of the Gibbs variational principle in equilibrium classical statistical mechanics to derive Sanov's theorem as well as the Banach space case of Cramér's theorem. Donsker and Varadhan (1983) were the first to formulate and prove the large deviation principle for the distribution of the empirical process (i.e. the empirical measure of the whole process) in the case where the underlying process is Markovian. The solution was given in terms of a mean relative entropy as the rate function. The mean relative entropy has another interesting characterization. It was found to be closely related to the Kolmogorov-Sinai invariant of the corresponding dynamical
system (see Takahashi (1982), who gives an abstract formulation of the Gibbs variational principle which unifies the Donsker and Varadhan theory for Markov chains, the equilibrium classical statistical mechanics of lattice systems and the theories for symbolic dynamics and Anosov diffeomorphisms). There is an extensive literature on the large deviation principle for the occupation time functional and the empirical measure. The reader is referred to Varadhan (1984) and Deuschel and Stroock (1989) for a detailed discussion and for complete references. Lanford (1973) and Ellis (1985) look at large deviations from the point of view of statistical mechanics.

The first large deviation result for distributions on a function space was obtained by Schilder (1966) for a family of Wiener processes $(\varepsilon w_t)_{t \le T}$, $\varepsilon > 0$. If we denote by $\rho(\varphi, \psi)$ the uniform metric on the space of continuous functions $C[0, T]$, it is reasonable to consider the logarithmic asymptotics (as $\varepsilon \to 0$) of the probabilities $\mathsf{P}(\rho(\varepsilon w, \varphi) < \delta)$, $\varphi \in C[0, T]$, $\delta > 0$, for the processes $(\varepsilon w_t)_{t \le T}$ to stay in tube neighbourhoods of different curves. The rate function for this problem was found to be
\[ \frac{1}{2} \int_0^T \dot{\varphi}_t^2\,dt, \]
in good accordance with a heuristic representation of Wiener's measure.
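As a rough numerical illustration of the rate function in Schilder's result, the following sketch evaluates the action functional $\frac{1}{2}\int_0^T \dot{\varphi}_t^2\,dt$ for a curve sampled on a uniform grid. It assumes the standard form of the functional for absolutely continuous curves started at zero; the function name `schilder_action`, the grid size and the two test curves are illustrative choices, not taken from the text.

```python
def schilder_action(phi, T):
    """Approximate (1/2) * integral_0^T phi'(t)^2 dt by finite differences.

    phi : sequence of values of the curve on a uniform grid over [0, T]
          (hypothetical input format chosen for this sketch).
    T   : length of the time interval.
    """
    n = len(phi) - 1
    if n < 1:
        raise ValueError("need at least two grid points")
    dt = T / n
    # Sum of squared increments divided by dt approximates the integral of phi'^2.
    squared_increments = sum((phi[i + 1] - phi[i]) ** 2 for i in range(n))
    return 0.5 * squared_increments / dt


T = 1.0
grid = [i / 1000 for i in range(1001)]

straight_line = grid                      # phi(t) = t
parabola = [s * s for s in grid]          # phi(t) = t^2

# The straight line is "cheaper" than the parabola: its action is smaller,
# so the Wiener process scaled by a small parameter is less unlikely to
# stay in a narrow tube around it.
print(schilder_action(straight_line, T))
print(schilder_action(parabola, T))
```

For these two curves the exact values of the functional are 1/2 and 2/3, so the finite-difference values should be close to them on a fine grid.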
Varadhan (1966) deals with diffusion processes without drift and with a small diffusion term. Borovkov (1967) studied random processes with independent stationary increments. Wentzell and Freidlin (1970, 1972, 1984) were the first to undertake a systematic study of dynamical systems with small random perturbations via large deviations. In particular, they proved the large deviation principle and the mean exit time theorem for infinitely divisible random processes. Freidlin (1978) formulated and proved the large deviation principle for the family of processes $X^\varepsilon = (X^\varepsilon_t)$ [...]

[...] $(-\infty, \infty]$ with the properties:

(a) $G_\mu(\lambda)$ is convex. Indeed, by Hölder's inequality
$G_\mu(c\lambda_1 + (1 - c)\lambda_2) = \ln \int \exp(c\lambda_1 x) \exp((1 - c)\lambda_2 x)\,d\mu(x)$