Probability Theory and Mathematical Statistics
Probability Theory and Mathematical Statistics Proceedings of the Seventh Vilnius Conference (1998)
Vilnius, Lithuania, 12-18 August, 1998
Edited by B. Grigelionis, J. Kubilius, V. Paulauskas, H. Pragarauskas, R. Rudzkis and V. Statulevicius
Vilnius, Lithuania
VSP
Utrecht, The Netherlands
VSP BV, P.O. Box 346, 3700 AH Zeist, The Netherlands
TEV Ltd., Akademijos 4, 2600 Vilnius, Lithuania
© VSP BV & TEV Ltd. 1999
First published in 1999
ISBN 90-6764-313-0 (VSP) ISBN 9986-546-72-9 (TEV)
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner.
Typeset in Lithuania by TEV Ltd., Vilnius, SL 1185 Printed in Lithuania by Spindulys, Kaunas
CONTENTS
Foreword ix
Rate of convergence in the transference max-limit theorem (A. Aksomaitis) 1
Theorems of large deviations for non-Gaussian approximation (A. Aleskeviciene and V. Statulevicius) 5
Algorithm for calculation of joint distribution of bootstrap sample elements (A. Andronov and M. Fioshin) 15
Bismut type differentiation of semigroups (M. Arnaudon and A. Thalmaier) 23
Random permutations and the Ewens sampling formula in genetics (G. J. Babu and E. Manstavicius) 33
On large deviations of self-normalized sum (A. Basalykas) 43
Minimax estimation in the blurred signal model (E. Belitser) 57
Invariant measures of diffusion processes in Hilbert spaces and Hilbert manifolds (Y. Belopolskaya) 67
One term Edgeworth expansion for Student's t statistic (M. Bloznelis and H. Putter) 81
Asynchronous stochastic approximation and adaptation in a competitive system (R. T. Buche and H. J. Kushner) 99
Homogenization of random parabolic operator with large potential (F. Campillo, M. Kleptsyna and A. Piatnitski) 115
Remarks on infinitely divisible approximations to the binomial law (V. Cekanavicius) 135
Rates of convergence to stable and discrete stable laws (G. Christoph and K. Schreiber) 147
Normal-ordered white noise differential equations. II: Regularity properties of solutions (Dong Myung Chung, Un Cig Ji and Nobuaki Obata) 157
Line models for snowflake crystals and their Hausdorff dimension (J. Croft) 175
Multiplicative strong unimodality revisited (I. Cuculescu and R. Theodorescu) 187
Random walks with jumps in random environments (examples of cycle and weight representations) (Y. Derriennic) 199
Analysis of generating functions and probabilities on trees (M. Drmota) 213
On the distribution of roots of polynomials in sectors. III (A. Dubickas) 229
Asymptotic expansions for the moments of the regret risk of classification based on mixed variables (K. Ducinskas) 235
On the structure of strong Markov continuous local Dirichlet processes (H. J. Engelbert and J. Wolf) 241
On minimax robustness of Bayesian statistical prediction (V. Galinskij and A. Kharin) 259
On the zeros of the Lerch zeta-function. II (R. Garunkstis) 267
The Galton-Watson tree conditioned on its height (J. Geiger and G. Kersting) 277
Prognosis of experiment results, based on a two-stage examination and linking the inverse Polya distribution to Bayes' rule (T. Gerstenkorn) 287
Applications of semiparametric biased sampling models to AIDS vaccine and treatment trials (P. B. Gilbert and R. J. Bosch) 295
Bayesian estimation of common long-range dependent models (Nan-Jung Hsu, B. K. Ray and F. J. Breidt) 311
Starting an image segmentation: a level set approach (J. Istas and M. Hoebeke) 325
Wide-ranging interpretations of the cycle representations of Markov processes (S. Kalpazidou) 339
Analysis of price, income and money demand indicators (B. Kaminskiene) 355
Robustness in cluster analysis of multivariate observations (Yu. Kharin and E. Zhuk) 363
An elementary approach to filtering in systems with fractional Brownian observation noise (M. L. Kleptsyna, A. Le Breton and M.-C. Roubaud) 373
Bootstrap tests for parametric volatility structure in nonparametric autoregression (J.-P. Kreiss and M. H. Neumann) 393
Value-distribution of general Dirichlet series (A. Laurincikas) 405
The Robbins-Monro type SDE and recursive estimation (N. Lazrieva and T. Toronjadze) 415
A squared binomial tree approach to discrete-time bond market modelling (R. Leipus) 429
Strong consistency of k-centres in reflexive spaces (J. Lember and K. Pärna) 441
Asymptotic behavior of small ball probabilities (M. A. Lifshits) 453
Good statistical practice in analytical chemistry (P. Lischer) 469
On the principle of maximum entropy on the mean as a methodology for the regularization of inverse problems (P. Maréchal) 481
The rate of convergence in the central limit theorem for endomorphisms of two-dimensional torus (G. Misevicius) 493
Simple methods for the analysis of multivariate and longitudinal categorical data (G. Molenberghs and L. Danielson) 499
Trimmed sums and associated random variables in the q-domain of attraction of stable laws (D. Neuenschwander and R. Schott) 515
p-Variation and integration of sample functions of stochastic processes (R. Norvaisa) 521
Functional central limit theorems on Lie groups. A survey (G. Pap) 541
On some new results for cointegrated processes with infinite variance innovations (V. Paulauskas) 553
Sampling methods in Lithuanian official statistics. Design and parameter estimation problems (A. Plikusas) 571
On Lp-estimates of stochastic integrals (H. Pragarauskas) 579
Volatility estimation and bootstrap (Z. Prášková) 589
Random fields and central limit theorem in some generalized Hölder spaces (A. Rackauskas and Ch. Suquet) 599
Consistent estimation of discriminant space in mixture model by using projection pursuit (M. Radavicius and R. Rudzkis) 617
Nonparametric wavelet methods for nonstationary time series (R. von Sachs) 627
Some results on arithmetical functions (W. Schwarz) 653
Linear approximation of random processes and sampling design problems (O. V. Seleznjev) 665
Convergence of solution of stochastic equations in the space of formal series (I. Ya. Spectorsky) 685
Unified inference in survey sampling (I. Traat and K. Meister) 697
Semiclassical analysis and a new result in excursion theory (A. Truman and I. M. Davies) 701
On KPP equations with various wave front speeds, the Larson result via large deviations (A. Yu. Veretennikov) 707
Co-ordinating and monitoring the transformation process in statistics in emerging democracies and economies in transition - the Polish experience (T. Walczak) 715
Branching diffusions with random offspring distributions (A. Wulfsohn) 727
FOREWORD
The 7th Vilnius Conference on Probability Theory and Mathematical Statistics was held 12-18 August 1998 in conjunction with the 22nd European Meeting of Statisticians. The joint conference was organized by the European Regional Committee of the Bernoulli Society, the Institute of Mathematics and Informatics, Vilnius Gediminas Technical University, Vilnius University, the Lithuanian Academy of Sciences, the Lithuanian Mathematical Society, and the Lithuanian Statistical Society. The Organizing Committee was headed by Professor Vytautas Statulevicius, and the Chairman of the Programme Committee was Professor Peter Jagers. It was the first time that the Vilnius Conference and the European Meeting of Statisticians were held together. This provided a good opportunity for meetings and discussions between representatives of the theoretical branches of probability and statistics and those working in applied probability and practical statistics. Furthermore, the joint conference reflected new trends in applications, such as financial mathematics and official statistics. Participants from 37 countries attended the conference: many from countries traditionally present, such as Russia, the USA, Germany, Great Britain and France, but also participants from countries not traditionally represented, such as Cyprus, Mongolia and Kuwait. The conference was divided into 34 sessions, during which 350 reports were delivered, including approximately 100 invited lectures and 5 plenary lectures. The conference was opened by the President of the Republic of Lithuania, Valdas Adamkus. The plenary lectures were given by Professors R. Gill (opening lecture), N. Reid (forum lecture), J. F. Le Gall (closing lecture), C. D. Cutler (IMS lecture) and V. Bentkus (special Vilnius lecture). As a rule, 3 invited speakers were chosen by the session organizers for each session.
The Proceedings contain the papers by invited speakers submitted to the Organizing Committee, as well as some selected papers by contributing speakers. We would like to offer our sincere thanks to all the participants who have contributed to the success of the conference. On behalf of the organizers we would also like to express our special gratitude to the scientific secretary of the conference, Dr. Aleksandras Plikusas. We hope that these Proceedings will promote a further advance in probability theory and mathematical statistics. The Editors
Prob. Theory and Math. Stat., pp. 1-4 B. Grigelionis et al. (Eds) © 1999 VSP/TEV
RATE OF CONVERGENCE IN THE TRANSFERENCE MAX-LIMIT THEOREM A. AKSOMAITIS Kaunas University of Technology, Studentu 50-325, 3028 Kaunas, Lithuania
ABSTRACT. Here we present a nonuniform estimate of the convergence rate in the transference theorem in the max-scheme for independent and identically distributed random variables. The results presented give a more exact convergence rate than that obtained by the author earlier.
1. INTRODUCTION

Let {X_j, j >= 1} be a sequence of independent and identically distributed random variables (r.v.'s) with a distribution function F(x) = P(X_j <= x), j >= 1, and let {G_n, n >= 1} be a sequence of continuous strictly increasing functions defined on the interval (l, s) of values taken by the r.v.'s X_j, j >= 1. After normalizing the r.v.'s X_j we get X_{nj} = G_n^{-1}(X_j), j >= 1. Let {N_n, n >= 1} be a sequence of positive integer-valued r.v.'s independent of the X_j, j >= 1. We form the quantities

    M_n = max_{1<=j<=n} X_{nj},    M_{N_n} = max_{1<=j<=N_n} X_{nj}.

In the case when G_n(x) = x b_n + a_n (a_n in R, b_n > 0), the necessary and sufficient conditions for the weak convergence P(M_n <= x) => H(x), n -> infinity, to a nonsingular distribution function H are presented in the fundamental work of Gnedenko (1943). A nonuniform estimate of the convergence rate

    |P(M_n <= x) - H(x)| <= Delta_n(x)    (1.1)

was obtained in the monograph of Galambos (1978). More general estimates of the convergence rate are given in Aksomaitis (1988). We are interested in nonuniform estimates of the convergence rate in the transference theorem (Gnedenko and Gnedenko, 1984), i.e. estimates of the type

    Delta_{N_n}(x) = |P(M_{N_n} <= x) - Psi(x)|.    (1.2)
Here the distribution function Psi(x) = Int_0^infinity H^z(x) dA(z), and A(x) = lim_{n->infinity} A_n(nx) = lim_{n->infinity} P(N_n/n <= x). In the particular case of linear normalizing, an estimate of Delta_{N_n}(x) based on (1.1) is known (Aksomaitis, 1987); this estimate was improved in Aksomaitis (1988). Our aim can be formulated as follows: to obtain nonuniform estimates of the convergence rate in the transference theorem provided the estimates

    |P(M_n <= x) - H(x)| <= Delta_n(x),    (1.3)

    |A_n(nx) - A(x)| <= Â(x)    (1.4)
are known.

2. MAIN RESULTS
A nonuniform estimate of the convergence rate in (1.2) is given by the following theorem.

THEOREM. Let H be a limit distribution function of the r.v. M_n and let A(+0) = 0. Then, for each x satisfying the conditions (1.3), (1.4) and H(x) > 0, the following estimate holds:

    Delta_{N_n}(x) <= Delta_n(x) a_n(x) + Int_0^infinity ...
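As a concrete numerical illustration of this transference setup (a sketch, not taken from the paper): take X_j standard exponential with G_n(x) = x + log n, so that H(x) = exp(-e^{-x}) (the Gumbel law), and let N_n be uniform on {1, ..., n}, so that A is the uniform distribution on (0, 1] and Psi(x) = Int_0^1 H^z(x) dz = (1 - e^{-a})/a with a = e^{-x}. The empirical distribution of the randomly indexed maximum then matches Psi:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps, x = 1000, 100_000, 0.5

# N_n uniform on {1,...,n}: A(z) = lim P(N_n/n <= z) is uniform on (0,1]
N = rng.integers(1, n + 1, size=reps)

# The max of N i.i.d. Exp(1) variables has CDF (1 - e^{-m})^N,
# so it can be sampled by inverting that CDF with one uniform draw per replication.
U = rng.random(reps)
M = -np.log(1.0 - U ** (1.0 / N)) - np.log(n)   # normalized maximum G_n^{-1}(max)

a = np.exp(-x)
psi = (1.0 - np.exp(-a)) / a                     # Psi(x) = Int_0^1 H^z(x) dz
print(np.mean(M <= x), psi)                      # both close to 0.75
```

The agreement at a fixed point x gives a feel for the quantity Delta_{N_n}(x) that the theorem bounds.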
We shall use the inequality

    |u^a - v^a| <= a |u - v| max(u^{a-1}, v^{a-1}),  a > 0,    (3.2)

which holds true for all 0 <= u <= 1 and 0 <= v <= 1. This inequality follows from the relation |u^a - v^a| = a |Int_v^u t^{a-1} dt|. Writing Delta_{N_n}(x) <= Delta'_{N_n}(x) + Delta''_{N_n}(x), the complete probability formula implies

    Delta'_{N_n}(x) = |Sum_j F^j(G_n(x)) P(N_n = j) - E H^{N_n/n}(x)| <= Sum_j |F^j(G_n(x)) - H^{j/n}(x)| P(N_n = j).

Applying (3.2), we get

    Delta'_{N_n}(x) <= Delta_n(x) Int_0^infinity z max(F^{z-1}(G_n(x)), H^{z-1}(x)) dA_n(nz),

and

    Delta''_{N_n}(x) = |Sum_j H^{j/n}(x) P(N_n = j) - Psi(x)| = |Int_0^infinity H^z(x) dA_n(nz) - Int_0^infinity H^z(x) dA(z)| = |Int_0^infinity H^z(x) d(A_n(nz) - A(z))|.
Integrating by parts, we get ... .

If gamma >= 0, then, in the interval lambda < x < Delta_gamma, the relation of large deviations

    P(X > x) = (1 - Phi(x - lambda)) e^{L_gamma(x)} (1 + theta ...)    (3)

holds. The following inequality

    P(X > x) <= exp(-x log x + x log ...)    (4)

is true for 0 < x - lambda <= lambda Delta, and

    P(X > x) <= exp(-x log(x/lambda) - ... + x log(1 + ...))    (5)

holds for x - lambda >= lambda Delta, if gamma = 0. Here theta_1 = theta(2 theta_0 + max(...)), |Theta| <= 1, lambda* = lambda + Gamma_2(X), and L_gamma(x), lambda_{1 gamma}, lambda_{2 gamma}(x) are defined by ... . The r.v.'s Y_i defined by (6a) satisfy

    EY_i = 0,  EY_iY_j = ...,  i, j = 1, ..., r - 1,    (6a)

where pi_k = p_{k+1} + ... + p_r, and F_{Y_i}(x) -> Phi(x) (n -> infinity), i = 1, ..., r - 1.
THEOREM 4. In the interval 1 < x < c_0 n^{1/3} the relation

    P(chi^2 > x) = (1 - K_{r-1}(x)) (1 + theta C_0 ...)

holds, where c_0, C_0 depend only on the distribution ..., and |theta| <= 1.
The estimate from above of the cumulants Gamma_k(chi^2) of the Pearson statistic will be of the form ..., and we have the interval of L.D. 1 < x < c_0 n^{1/3} (because gamma = 1 and Delta^{1/(1+2 gamma)} = n^{1/3}).
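The relation of Theorem 4 can be illustrated numerically (a sketch under assumed equiprobable categories, not from the paper): for moderate x the tail of the Pearson statistic is already well approximated by the chi-square tail 1 - K_{r-1}(x). For r = 4 the chi-square CDF with 3 degrees of freedom has a closed form, so no special library is needed:

```python
import math
import numpy as np

rng = np.random.default_rng(2)
n, reps = 500, 200_000
p = np.array([0.25, 0.25, 0.25, 0.25])                # r = 4 equiprobable categories
counts = rng.multinomial(n, p, size=reps)
chi2 = ((counts - n * p) ** 2 / (n * p)).sum(axis=1)  # Pearson statistic

def chi2_cdf_df3(x):
    """Closed-form CDF of the chi-square law with 3 degrees of freedom."""
    return math.erf(math.sqrt(x / 2)) - math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

x = 6.0
print(np.mean(chi2 > x), 1.0 - chi2_cdf_df3(x))       # both about 0.112
```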
How can one estimate Gamma_k(X) if X = chi^2 is the Pearson statistic? For a better understanding we will take

    X = Sum_{i=1}^{m} (S^{(i)}/sqrt(n))^2,  where S^{(i)} = xi_{(i-1)n+1} + ... + xi_{in},  i = 1, 2, ..., m,

and xi_1, xi_2, ... is a sequence of i.i.d. r.v.'s with E xi_i = 0, E xi_i^2 = 1 satisfying the Cramér condition

    |Gamma_k(xi_1)| <= k! H^{k-2},  k = 3, 4, ... .    (7a)

Obviously, X -> chi^2_m in distribution (n -> infinity). Since
it suffices to estimate only Gamma_k(Y^2), where Y = S^{(i)}/sqrt(n). We have (Aleskeviciene and Statulevicius, 1996, 1997) ... . For example,

    2^2 Gamma_2(Y^2) = EY^4 - 3 = Gamma_4(Y) (the excess),
    2^3 Gamma_3(Y^2) = Gamma_6(Y) + 10 (Gamma_3(Y))^2,
    2^4 Gamma_4(Y^2) = Gamma_8(Y) + 32 (Gamma_4(Y))^2 + 56 Gamma_3(Y) Gamma_5(Y),
    2^5 Gamma_5(Y^2) = Gamma_10(Y) + 120 Gamma_3(Y) Gamma_7(Y) + 200 Gamma_4(Y) Gamma_6(Y) + 126 (Gamma_5(Y))^2 + 2000 (Gamma_3(Y))^2 Gamma_4(Y).

Then, in the general case, we finally find

    2^k Gamma_k(Y^2) = Sum_{v=1}^{k} Sum_{k_1+...+k_v=2k} c_{k_1...k_v} Gamma_{k_1}(Y) ... Gamma_{k_v}(Y),    (8)

with coefficients involving (-1)^{k-1} 2^{k-1} (k-1)!. From this we see that the terms having Gamma_{k_i} with k_i = 2 among their factors cancel, because Gamma_2(Y) = 1 and for each such term there is a term with the opposite sign owing to the multiplier (-1)^{k-1+p-1}, while the coefficient 2 is obtained from 2^{k-1}. Thus

    2^k Gamma_k(Y^2) = Sum_{v=1}^{k} Sum_{k_1+...+k_v=2k} c_{k_1...k_v} Gamma_{k_1}(Y) ... Gamma_{k_v}(Y),

where the summation is now over k_i different from 2. Formula (8) implies that |c_{k_1...k_v}| <= H_0, and c_{2k} = 1 (v = 1, k_1 = 2k). From the Cramér condition (7a) we have |Gamma_k(xi_1)| <= k! H^{k-2} (H_1 = 2H, see Lemma 3.1 in Saulis and Statulevicius (1991)), so |Gamma_k(Y)| <= ...
and finally, making use of the Stirling formula, we obtain

    |Gamma_k(Y^2)| <= ... .

Consequently, Delta = n, gamma = 1, and Delta^{1/(1+2 gamma)} = n^{1/3}.
We can estimate the constants H_i, i = 0, 2, 3, 4, just as we have done in Saulis and Statulevicius (1991). In (Aleskeviciene and Statulevicius, 1997) we proved a general lemma for any r.v. X:

LEMMA. If a r.v. X = X_n with EX_n = m, where m is an integer, satisfies the condition

    |Gamma_k(X_n)| <= k! / Delta_n^{k-2},  k = 2, 3, ...,    (9)

for some Delta_n > 1, then, in the interval 1 <= x <= Delta_n^{1/3}/sqrt(2), the relation of large deviations

    P(X_n > x) = (1 - K_m(x)) (1 + theta ... / Delta_n)

holds, where |theta| <= 6.82 ..., if Delta_n >= 64.

However, condition (9) will be fulfilled only if the distribution P(X_n < x) is very close to K_m(x), e.g., if X_n has a density p_n(x) of the type p_n(x) = k_m(x) + Q_n(x) e^{-x/2} with sup_x |Q_n(x)| e^{-x/2} -> 0 (n -> infinity). In (7) this will be the case if, instead of the distribution of eta = S^{(i)}, we take the first terms of its asymptotic expansion, rejecting the remainder term, as is frequently done in practical statistics. Otherwise, Gamma_k will grow with respect to k as (k!)^2.
With respect to the Pearson statistic (6), chi^2 = Sum_{i=1}^{r-1} Y_i^2, and though the r.v.'s Y_i are not correlated, they are dependent, so we use equality (1): ... . We express the correlation function Gamma(Y_{k_1}^2, ..., Y_{k_v}^2) through the centered moments EY_1^2 Y_2^2 ... Y_v^2, which can be estimated because we know the expression (6a) of Y_i^2, i = 1, ..., r - 1, in terms of frequencies. The main term in (10) will be Gamma_k(Y_1^2) + ... + Gamma_k(Y_{r-1}^2), and the estimation will be the same as in the case (7), with m = r - 1. We get ... . Hence follows the proposition of Theorem 4.
REFERENCES

Aleskeviciene, A. and Statulevicius, V. (1994). Large deviations in approximation by the Poisson law. In: Probability Theory and Mathematical Statistics. Proceedings of the Sixth Vilnius Conference, pp. 1-18, VSP/TEV, Utrecht/Vilnius.
Aleskeviciene, A. and Statulevicius, V. (1995). Large deviations in power zones in the approximation by the Poisson law. Uspekhi Mat. Nauk 50, 63-82.
Aleskeviciene, A. and Statulevicius, V. A. (1996). Inversion formulas in the case of a discontinuous limit law. Theory Probab. Appl. 42(1), 1-16.
Aleskeviciene, A. and Statulevicius, V. (1997). Probabilities of large deviations for the chi^2 approximation. Lith. Math. J. 35(4), 309-327.
Cekanavicius, V. (1997). Asymptotic expansion for compound Poisson measure. Lith. Math. J. 37, 426-447.
Kruopis, J. (1986). Precision of approximations of the generalized binomial distribution by convolutions of Poisson measures. Lith. Math. J. 26, 37-49.
Padvelskis, K. and Statulevicius, V. A. (1998). Theorems of large deviations for sums of random variables related to a Markov chain. I. Lith. Math. J. 38(4), 456-471.
Padvelskis, K. and Statulevicius, V. A. (1999). Theorems of large deviations for sums of random variables related to a Markov chain. II. Lith. Math. J. 39(1).
Saulis, L. and Statulevicius, V. (1991). Limit Theorems for Large Deviations. Kluwer Acad. Publ., Dordrecht.
Tsaregradskii, I. P. (1958). On uniform approximation of the binomial distribution by infinitely divisible laws. Theory Probab. Appl. 3, 434-438.
Tumanyan, S. Kh. (1956). Asymptotic distribution of the chi^2 criterion when the number of observations and number of groups increase simultaneously. Theory Probab. Appl. 1, 99-116.
Prob. Theory and Math. Stat., pp. 15-22 B. Grigelionis et al. (Eds) © 1999 VSP/TEV
ALGORITHM FOR CALCULATION OF JOINT DISTRIBUTION OF BOOTSTRAP SAMPLE ELEMENTS A. ANDRONOV and M. FIOSHIN¹ Department of Computer Technology, Riga Aviation University, 1 Lomonosov Str., Riga, LV-1019, Latvia
ABSTRACT. This paper is devoted to the calculation of the joint distribution of empirical sums of bootstrap-sample elements. To that end, a so-called marked Markov chain is introduced and used. The proposed approach is based on computer calculation and supposes a small size of samples and a small number of sums; in this connection, computational aspects are considered in detail. An example with normally distributed initial data illustrates the proposed approach.
1. INTRODUCTION

The bootstrap method is used in cases of insufficient primary data. The bootstrap technique involves multiple subsampling, with replacement, from the original primary sample so as to obtain many resamples. This makes it possible to decrease the bias of estimators, to estimate their variance, and to determine confidence intervals (Efron and Tibshirani, 1993; Davison and Hinkley, 1997). Accurate calculation of the variance of the expectation estimate for the bootstrap method has been proposed in previous papers of the authors (Andronov et al., 1995, 1996, 1997, 1998). In this paper our goal is to calculate the distribution of empirical sums of bootstrap samples. Following DiCiccio and Efron (1996), we consider the one-sample nonparametric situation where the observed data X = (X_1, X_2, ..., X_n) are a random sample from an arbitrary distribution density f(x). A nonparametric bootstrap sample Y = (Y_1, Y_2, ..., Y_r) is a random sample of size r drawn without replacement from X. In other words, Y equals (X_{j_1}, X_{j_2}, ..., X_{j_r}), where (j_1, j_2, ..., j_r) is a random sample without replacement from {1, 2, ..., n}. The sum of bootstrap sample elements is

    S = X_{j_1} + X_{j_2} + ... + X_{j_r}.    (1)
¹ We are very thankful to the Latvian Council of Science for Grant Nr. 97.0798, within which the present investigation was carried out.
Now let us have k such independent random samples (j_{1,i}, j_{2,i}, ..., j_{r,i}) from {1, 2, ..., n}, i = 1, 2, ..., k, and k corresponding sums S_i = X_{j_{1,i}} + X_{j_{2,i}} + ... + X_{j_{r,i}}. Our purpose is to calculate the joint distribution function F(x_1, x_2, ..., x_k) of the sums S_1, S_2, ..., S_k. The case of one sum (k = 1), but for sampling with replacement, was considered by Andronov and Fioshin (1998).

More formally, the problem can be defined in the following way. Let w_{ij} be a Boolean variable which has unit value if the observed element X_j in X belongs to the sum S_i:

    w_{ij} = 1 if X_j belongs to S_i, and w_{ij} = 0 otherwise.    (2)

Then w = {w_{ij}}_{k x n} is a Boolean matrix which describes the results of forming the k random samples. This matrix satisfies the condition w e_n = r e_k, where e_v is a v-dimensional column vector of ones. The set of such matrices forms the space Omega of elementary events, and all elementary events w in Omega are equiprobable. Now we can introduce the random Boolean matrix W with the distribution

    P{W = w} = C(n, r)^{-k},  w in Omega,    (3)

where C(n, r) is the binomial coefficient. The column vector of sums S = (S_1, S_2, ..., S_k)^T can now be defined as

    S = W X,    (4)

where the matrix W and the vector X are mutually independent. It is simple to see that

    E(S) = r mu e_k,    (5)

    C(S) = r sigma^2 I + (r^2 sigma^2 / n)(e_k e_k^T - I),    (6)

where C(S) is the covariance matrix of the vector S, mu = E(X_i) and sigma^2 = Var(X_i) are the expectation and variance of X_i in X, and I is the k-dimensional unit matrix. The matrix C(S) belongs to the class of "T-matrices" (Scheffé, 1958, p. 452), so it always has an inverse.

We can consider the above-described procedure of sum forming as a kind of smoothing of the initial data; we will call it "S-smoothing of data". Thus we aim to calculate the joint distribution of S-smoothed data. With this purpose we introduce the corresponding Markov chain in the next section. Section 3 is devoted to computing the distribution function of the sums S_1, S_2, ..., S_k. In Section 4 we consider the case of normally distributed X_i. The last section contains statistical applications to confidence interval construction.

2. THE MARKED MARKOV CHAIN

In order to calculate the joint distribution of the sums (S_1, S_2, ..., S_k) we introduce a so-called marked Markov chain. A step of this chain consists in forming one sum.
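Before developing the chain, the moment formulas (5) and (6) can be checked by direct simulation. The following sketch (not part of the paper; n, r, k and the normal distribution are illustrative choices) draws many primary samples and r-subsets without replacement:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, k, mu, sigma = 8, 3, 4, 1.0, 2.0
reps = 200_000

X = rng.normal(mu, sigma, size=(reps, n))        # primary samples
S = np.empty((reps, k))
for i in range(k):
    # an independent random r-subset of each row: sampling without replacement
    cols = rng.random((reps, n)).argsort(axis=1)[:, :r]
    S[:, i] = np.take_along_axis(X, cols, axis=1).sum(axis=1)

C_emp = np.cov(S, rowvar=False)
# theory: E(S) = r*mu*e_k,  C(S) = r*sigma^2*I + (r^2*sigma^2/n)*(e_k e_k^T - I)
C_th = r * sigma**2 * np.eye(k) + (r**2 * sigma**2 / n) * (np.ones((k, k)) - np.eye(k))
print(S.mean(axis=0))                            # each component close to r*mu = 3
print(C_emp[0, 0], C_th[0, 0], C_emp[0, 1], C_th[0, 1])
```

Note that the off-diagonal term r^2 sigma^2 / n comes only from the shared primary sample, since the k subsets are drawn independently.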
On the first step we form the sum S_k, on the second step the sum S_{k-1}, and on the l-th step the sum S_{k-l+1}. Therefore, at the beginning of the l-th step, the sums S_k, S_{k-1}, ..., S_{k-l+2} have already been formed. Each sum contains r elements. Some elements from the already formed sums also occur in the sums S_1, S_2, ..., S_{k-l+1}. Let Z_v(l) denote the number of elements already placed in the sum S_v after the l-th step. On the l-th step we need to fill the sum S_{k-l+1} with r - Z_{k-l+1}(l - 1) elements from X which do not occur in the previous sums. Let Z_0(l - 1) elements from X occur in the sums S_k, S_{k-1}, ..., S_{k-l+2} at the beginning of the l-th step. Then we are able to extract elements for the sum S_{k-l+1} from n - Z_0(l - 1) ones. We make this selection at random without replacement.

Obviously, the sequence Z_l = (Z_0(l), Z_1(l), ..., Z_{k-l}(l)), l = 0, 1, 2, ..., k, is a Markov chain with initial state Z_0 = (0, 0, ..., 0). The corresponding state space Omega_l contains (k - l + 1)-dimensional vectors alpha = (alpha_0, alpha_1, ..., alpha_{k-l}).

Let us find the transition probabilities for this Markov chain. On the l-th step, the sum S_{k-l+1} is augmented by r - Z_{k-l+1}(l - 1) elements. Some of these elements also fall into other sums. Let P_j(m) denote the probability that m elements fall into the sum S_j. Since the sum S_j already contains Z_j(l - 1) elements, this probability is equal to

    P_j(m) = ... .

For each new state we calculate a distribution function F^{(l-1)}(x_1, x_2, ..., x_k) and call it the transitive distribution function. First we consider the case when alpha = (alpha_0, alpha_1, ..., alpha_{k-l+1}) and alpha' = (alpha'_0, alpha'_1, ..., alpha'_{k-l}) with

    alpha'_0 = alpha_0 + 1;  alpha'_i = alpha_i + 1 if i in I, a subset of {1, 2, ..., k}, and alpha'_i = alpha_i if i not in I.    (10)
UU;
For new state a ' transitive function is calculated as a result of using an operator £ / to the function Fa for the state a. This operator values in the point ( x j , X2, . . . , Xk) is defined by formula Faa'(x\,X2,...
,xic)~
CiFa{xi,x2,...
,xk)
00
= j
f(y)Fa(u\,u2,...
,uk)dy,
(11)
—00
where w, =
x,, Xi - y,
if i 4. / , c• , if i e I.
Now we will consider a general case when a = (ao, a i , ... , a^ _/+1), a! = (a'0, a ' j , ... , ot'k_i), a'Q = ao + r - otk-i+1, = oti + mi, i = 1 , 2 , ... ,k - I. Here m i, m 2 , . . . , mic-l are non-negative integer numbers. We can calculate a transitive function Faa' by r — 1 steps using auxiliary Markov chain. The step of the auxiliary chain is an examination of successive element f r o m r — 1 elements being added to the sum Sk-i+\ • On this step we select sums from S \ , S z , . . . , Sk-i to which an examined item is added. The state v to the beginning of v-th step, v = 1 , 2 , . . . ,r — o ^ - z + i , is the vector v = (vo, vi,... , Vk~i), where i»o — ao + v — 1,a-, ^ Vj ^ min(a; + v — 1, a - ) . Here, v/ is a number of items from examined (v— 1) items, that are just found to the sum 5,. On the v-th step the transition from the state v to the state v' = (v'Q, d'j ,... , where v'n = i>o + 1, vi + 1,
if i e / ,
vi,
if i i
I,
occurs with the probability a'• — Vi
hir
~
ak
-i+1 ~ v
i—r ( + 1
Ui ^
N
a'r
~
ak
~i+x -
y
+
1
/
Joint Distribution
of Bootstrap Sample
Elements
19
Before the first step (v = 1) the state v is equal to a. By r — ak-/+i steps the state a' is reached. Probabilities P'v{v) of transitive states on different steps is calculated in usual way for Markov chains. For each new state v' we calculate a transitive distribution function Fvv>, which is obtained by using an operator £ / (see formula (11)) to the function Fv. For the beginning state v = a the distribution function Fa is known. Function Fv for transitive states is calculated analogous to Fa, as it will be shown below. So let us return to the distribution function Fa> that corresponds to the state a' e Q/ of our marked Markov chain. One is calculated based on transitive function Faai, probabilities P/-\(a) of states a e i and probabilities Pi (a, a') of transitions a —»• a ' on the /-th step: Fa>(x i,x2,...,xk) =
(13)
^ Pi-\(a)Pi(a, aei2/_i
a')Faa>(x[,X2,
••• ,xk),
a ' e £2/.
Now the joint distribution function F{x.\, X2, • • • , xk) of sums S\, S2, • • • , Sk can be obtained after executing the last k-th step of the marked Markov chain: F(xi,x2,
... ,Xk) =
where Qk = {(r), (r + I),...
p
k(oc)Fa(xuX2,
• • • ,Xk),
(14)
,(«)).
The distribution function F(x_1, x_2, ..., x_k) is a symmetric function of its arguments, so we need to calculate it only for x_1 <= x_2 <= ... <= x_k.

4. A CASE OF THE NORMAL DISTRIBUTION

In this section we consider the case when the initial data X = (X_1, X_2, ..., X_n) is a sample from the normal distribution with mean mu = E(X_i) and variance sigma^2 = Var(X_i), i = 1, 2, ..., n. Then each distribution function F_alpha(x_1, x_2, ..., x_k) for the state alpha = (alpha_0, alpha_1, ...) is a mixture of normal distribution functions:

    F_alpha(.) = Sum_i h_i(alpha) Phi_{mu(alpha,i), C(alpha,i)}(.),    (15)

where {h_i(alpha)} are positive weights which satisfy the condition

    Sum_i h_i(alpha) = P_l(alpha),    (16)

and Phi_{mu,C} is the k-dimensional normal distribution function with mean vector mu and covariance matrix C. The vector of means for F_alpha is calculated simply: it is equal to mu(mu_1, mu_2, ..., mu_k)^T, where mu_i = alpha_i for i = 1, 2, ..., k - l and mu_i = r for i = k - l + 1, ..., k. Therefore, without restricting generality, we suppose mu = 0. The covariance matrix for F_alpha can be written as

    C_alpha = Sum_i h_i(alpha) C(alpha, i).    (17)
Thus the mark F_alpha(.) for the state alpha is fully determined by the list of weights and covariance matrices (h_i(alpha), C(alpha, i) : i = 1, 2, ...). Let us show how to calculate the transitive function F_{alpha alpha'} from formula (11) (or from formula (13)). If the next state alpha' = (alpha'_0, alpha'_1, ...) is defined by formula (10) and the function F_alpha(.) by formula (15), then

    F_{alpha alpha'}(.) = Sum_i h_i(alpha) Phi_{mu'(i), C'(i)}(.),    (18)

where

    mu'(i) = mu(alpha, i) + mu e_I,    (19)

    C'(i) = C(alpha, i) + sigma^2 e_I e_I^T,    (20)

and e_I = (e_1, e_2, ..., e_k)^T is a column vector with e_i = 1 for i in I and e_i = 0 for i not in I.
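Formulas (19) and (20) say that adding one common normal element X ~ N(mu, sigma^2) to the sums indexed by I shifts the component mean by mu e_I and adds sigma^2 e_I e_I^T to its covariance. A quick Monte Carlo check of this update (a sketch with illustrative numbers, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
k, reps, mu, sigma = 3, 200_000, 1.5, 0.8
e_I = np.array([1.0, 0.0, 1.0])                  # I = {1, 3}: indicator vector e_I

C = np.array([[2.0, 0.5, 0.3],
              [0.5, 1.0, 0.2],
              [0.3, 0.2, 1.5]])
W = rng.multivariate_normal(np.zeros(k), C, size=reps)
y = rng.normal(mu, sigma, size=reps)             # the common added element
V = W + np.outer(y, e_I)                         # add y to the components in I

print(V.mean(axis=0), mu * e_I)                              # checks (19)
print(np.cov(V, rowvar=False) - C)                           # checks (20): sigma^2 e_I e_I^T
```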
Therefore we are able to calculate recurrently the marks F_alpha(.) for alpha in Omega_l, l = 1, 2, ..., k, by using formulas (13), (15), (18)-(20). The required joint distribution function F(.) is computed by formula (14) and has the form (15).

5. COMPUTATIONAL ASPECTS

The proposed approach requires a large amount of computation and computer memory. We are able to decrease these requirements essentially by using the symmetry property of the distribution of the sums S_1, S_2, ..., S_k.

Let us consider the state space Omega_l after the l-th step. We can consider only the so-called ordered states alpha = (alpha_0, alpha_1, ..., alpha_{k-l}) in Omega_l, where alpha_1 <= alpha_2 <= ... <= alpha_{k-l}. Let the vector alpha contain lambda_1 elements which occur exactly once, lambda_2 pairs of equal elements, lambda_3 triples of equal elements, etc. Obviously,

    lambda_1 + 2 lambda_2 + 3 lambda_3 + ... + (k - l) lambda_{k-l} = k - l.

Then

    (k - l)! / ((1!)^{lambda_1} (2!)^{lambda_2} ... ((k - l)!)^{lambda_{k-l}})    (21)

states from Omega_l correspond to the ordered state alpha. These states are obtained by different permutations of the components of alpha, and all of them have the same probability.

Let after the l-th step we have the space of ordered states alpha. We call the neighborhood O_l(alpha) of an ordered state alpha the set of different unordered states which can be obtained from alpha by permutations of its elements. Obviously, O_l(alpha) and O_l(alpha') are disjoint if alpha' is not in O_l(alpha), and the union of all O_l(alpha) over the ordered states is Omega_l.

Because all sums for O_l(alpha) are statistically equivalent, we consider only the ordered states. This allows us to decrease essentially the amount of computer memory and calculations (according to formula (21)).
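The count in formula (21) is the number of distinct permutations of the multiset alpha; for small states it can be verified by brute force (a sketch with a hypothetical state vector, not from the paper):

```python
import itertools
import math
from collections import Counter

alpha = (2, 2, 3, 5, 5, 5)                   # a hypothetical ordered state, k - l = 6
distinct = set(itertools.permutations(alpha))

# lambda_j = number of value-groups of multiplicity j in alpha
lam = Counter(Counter(alpha).values())
denom = 1
for j, count in lam.items():
    denom *= math.factorial(j) ** count      # product of (j!)^{lambda_j}
print(len(distinct), math.factorial(len(alpha)) // denom)   # 60 60
```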
To calculate the distribution function and different probabilities, we use numerical integration; the net (grid) method is applied. In the case of the normal distribution we use properties of conditional multi-dimensional distributions, so we must consider the degenerate case separately. In the process of calculation we continually check its accuracy. The following checks are realized:
• satisfiability of the normalization condition (9) for the Markov chain state probabilities;
• satisfiability of the condition (16) on every step;
• the final covariance matrix of the distribution (14) must have the form (6);
• concurrence of two- and three-dimensional integration results with known values for the normal case (in particular with formulas (25));
• concurrence of our results for the case r = 1 with the data from Andronov and Fioshin (1998).

6. STATISTICAL APPLICATIONS

The bootstrap method is often used for confidence interval construction and statistical hypothesis testing, where the confidence level of the interval and the size of the test are calculated approximately using the normal (or studentized) approximation (DiCiccio and Efron, 1996; Davison and Hinkley, 1997). Knowledge of the distributions considered above allows us to compute these values precisely.

Let us confine ourselves to the simple problem of confidence interval construction for the mean mu = E(X). Let S_(1), S_(2), ..., S_(k) be the ordered values of the sums S_1, S_2, ..., S_k. The simplest confidence interval for mu is (S_(1)/r, S_(k)/r). We should find the corresponding coverage probability

    gamma = P{S_(1)/r <= mu <= S_(k)/r} = ... ,    (24)

where P_k^0(C) is the probability that all components of a k-dimensional normal vector with covariance matrix C are positive (see Kendall and Stuart, 1962, pp. 485, 486).
The probabilities P_k^0(C(alpha, i)) should be calculated numerically; to this end numerical integration is used (see the previous section). Kendall and Stuart give the following explicit expressions for some P_k^0:

    P_2^0 = 1/4 + (1/(2 pi)) arcsin rho_12,

    P_3^0 = 1/8 + (1/(4 pi)) (arcsin rho_12 + arcsin rho_13 + arcsin rho_23),    (25)
where ρ_ij is the correlation coefficient of the sums S_i and S_j. Numerical results for the case k = 4 are shown below.

Table 1. The coverage probability γ

            n = 6    n = 7    n = 8    n = 9    n = 10
    r = 6    0.30     0.40     0.48      ·       0.54
    r = 7    0        0.28     0.40      ·       0.46
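The explicit expressions (25) are easy to evaluate, and the bivariate case can be cross-checked against a Monte Carlo orthant probability. This is our own sanity check, with all variable names ours.

```python
import math
import random

def p2(rho12):
    # P_2 from (25): both components of a centred bivariate normal positive
    return 0.25 + math.asin(rho12) / (2 * math.pi)

def p3(r12, r13, r23):
    # P_3 from (25): all three components of a trivariate normal positive
    return 0.125 + (math.asin(r12) + math.asin(r13) + math.asin(r23)) / (4 * math.pi)

# Monte Carlo cross-check of P_2 for rho = 0.5: build correlated normals
# from two independent standard normals.
rng = random.Random(0)
rho = 0.5
n = 200_000
hits = 0
for _ in range(n):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    x2 = rho * z1 + math.sqrt(1 - rho * rho) * z2
    hits += (z1 > 0 and x2 > 0)
mc = hits / n
```

For ρ₁₂ = 1/2 formula (25) gives P_2° = 1/4 + 1/12 = 1/3, which the simulation reproduces to Monte Carlo accuracy.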
7. CONCLUSIONS

The paper presents a method for calculating the joint distribution of sums of bootstrap-sample elements. A numerical example demonstrates the suitability of the proposed approach for small sizes of the initial samples and small numbers of sums. The simplest problem of confidence interval construction for a mean is considered. Future investigations will address a more general class of statistical problems.

REFERENCES
Andronov, A., Merkuryev, Yu., and Loginova, T. (1995). Use of the bootstrap method in simulation of hierarchical systems. In: Proceedings of the European Simulation Symposium, pp. 9–13, Erlangen-Nuremberg, Germany.
Andronov, A. and Merkuryev, Yu. (1996). Optimization of statistical sizes in simulation. In: Proceedings of the 2nd St. Petersburg Workshop on Simulation, pp. 220–225, St. Petersburg.
Andronov, A. and Merkuryev, Yu. (1997). Controlled bootstrap method and its application in simulation. In: Proceedings of the 11th European Simulation Multiconference, pp. 160–164, Istanbul, Turkey.
Andronov, A. and Merkuryev, Yu. (1998). Controlled bootstrap method and its application in simulation of hierarchical structures. In: Proceedings of the 3rd St. Petersburg Workshop on Simulation, pp. 271–277, St. Petersburg.
Andronov, A. and Fioshin, M. (1998). Distribution calculation for the sum of bootstrap sample elements. In: Proceedings of the Fifth International Conference "Computer Data Analysis and Modeling", pp. 5–12, Minsk, Belarus.
Davison, A. C. and Hinkley, D. V. (1997). Bootstrap Methods and their Application. Cambridge Univ. Press, Cambridge, UK.
DiCicio, T. J. and Efron, B. (1996). Bootstrap confidence intervals. Statist. Sci. 1(3), 189–228.
Efron, B. and Tibshirani, R. Y. (1993). Introduction to the Bootstrap. Chapman & Hall, London.
Kendall, M. and Stuart, A. (1962). The Advanced Theory of Statistics. Volume I. Distribution Theory. Griffin, London.
Scheffe, H. (1958). The Analysis of Variance. Wiley, New York.
Prob. Theory and Math. Stat., pp. 23-32 B. Grigelionis et al. (Eds) © 1999 VSP/TEV
BISMUT TYPE DIFFERENTIATION OF SEMIGROUPS

MARC ARNAUDON, Institut de Recherche Mathématique Avancée, Université Louis Pasteur, 7, rue René Descartes, F-67084 Strasbourg Cedex, France

ANTON THALMAIER, Institut für Angewandte Mathematik, Universität Bonn, Wegelerstraße 6, D-53115 Bonn, Germany
ABSTRACT We present a unified approach to Bismut type differentiation formulas for heat semigroups on functions and forms. Both elliptic and hypoelliptic situations are considered. Nonlinear extensions apply to the case of harmonic maps between Riemannian manifolds and solutions to the nonlinear heat equation.
1. INTRODUCTION

Let M be a smooth n-dimensional manifold. On M consider a Stratonovich SDE with smooth coefficients of the type

    δX = A(X) δZ + A₀(X) dt,    (1.1)

where A₀ ∈ Γ(TM) is a vector field and A: M × ℝ^r → TM, (x, z) ↦ A(x)z, is a bundle map over M for some r. The driving process Z is assumed to be an ℝ^r-valued Brownian motion on some filtered probability space satisfying the usual completeness conditions. We write X.(x) for the solution to (1.1) starting from the point x ∈ M, and denote its maximal lifetime by ζ(x). Solutions to (1.1) are diffusions with generator given in Hörmander form as

    L = A₀ + (1/2) Σ_{i=1}^r A_i²,    (1.2)

with A_i = A(·)e_i ∈ Γ(TM) for i = 1, ..., r. We consider the minimal semigroup

    (P_t f)(x) = E[(f ∘ X_t(x)) 1_{t<ζ(x)}].    (1.3)

..., s ≥ 0, is a local martingale, and the claim follows upon noting that

    F(s, X_s(x)) = F(0, x) + ∫₀^s dF(r, ·)_{X_r(x)} A(X_r(x)) dZ_r,

which implies that ∫₀^s dF(r, ·)_{X_r(x)} dh_r − F(s, X_s(x)) ∫₀^s ⟨k, dZ⟩, s ≥ 0, is a local martingale.
The procedure is now straightforward. Suppose that we are able to choose the process k in (2.2) in such a way that the following two conditions hold:

(a) ∫₀^σ W_{0,r}^{−1} A(X_r(x)) k_r dr = v a.s., and
(b) N^σ = (N_{t∧σ})_{t≥0} is a uniformly integrable martingale.
Then, evaluating E[N₀] = E[N_σ] gives a formula for dF(0, ·)_x v in terms of the process F(·, X.(x)). In the elliptic case, both (a) and (b) can easily be achieved (Thalmaier, 1997), by exploiting the fact that there is a right-inverse to the process
3. THE ELLIPTIC CASE

Suppose that our system (1.1) is elliptic, i.e. A(x): ℝ^r → T_xM is onto for each x ∈ M. Then there is a Riemannian metric g on M which makes A(x)*: T_xM → ℝ^r an isometric inclusion for each x ∈ M. In particular, the generator (1.2) then writes as

    L = (1/2) Δ_M + V,    (3.1)
where Δ_M is the Laplace–Beltrami operator on M with respect to the metric g and V is a first order term, i.e. V ∈ Γ(TM). In this case, there is an intrinsic choice for the linear transports W_{0,s} in (2.1). To this end, let W_{0,s}: T_xM → T_{X_s(x)}M be defined by the following covariant equation along X(x):

    (D/ds) W_{0,s} = −(1/2) Ric^M(W_{0,s}, ·)^# + ∇_{W_{0,s}} V,    W_{0,0} = id_{T_xM}.    (3.2)

Then the local martingale property of (2.1) is elementary to check: indeed, this can be done either directly by applying Weitzenböck's formula to the Laplacian on 1-forms (see, for instance, Thalmaier, 1998), or by using the method of filtering out redundant noise from the derivative process T_xX (cf. Elworthy and Yor, 1993). Let B denote the martingale part of the anti-development

    𝒜(X) := ∫₀^· //_{0,s}^{−1} δX_s    (3.3)

of X = X(x). Obviously, B is a Brownian motion on T_xM and, by definition, A(X_s(x)) dZ_s = //_{0,s} dB_s. In this situation, Lemma 1 is easily adapted to give the local martingale property of

    N_s = dF(s, ·)_{X_s(x)} W_{0,s} ( v − ∫₀^s W_{0,r}^{−1} //_{0,r} k_r dr ) + F(s, X_s(x)) ∫₀^s ⟨k, dB⟩.
Taking h_s = v − ∫₀^s W_{0,r}^{−1} //_{0,r} k_r dr, we get the following typical applications in the elliptic case; see (Thalmaier, 1997) for details. In the sequel, let H(ℝ₊, T_xM) denote the Cameron–Martin space of paths γ: ℝ₊ → T_xM of finite energy.

THEOREM 2 (Differentiation formula for heat semigroups). Let f: M → ℝ be bounded measurable, x ∈ M and v ∈ T_xM. Then, for any bounded adapted process h with paths in H(ℝ₊, T_xM) such that (∫₀^{τ(x)∧t} |ḣ_s|² ds)^{1/2} ∈ L¹, and the property that h₀ = v, h_s = 0 for all s ≥ τ(x) ∧ t, the following formula holds:

    d(P_t f)_x v = −E[ f(X_t(x)) 1_{t<ζ(x)} ∫₀^{τ(x)∧t} ⟨W_{0,s} ḣ_s, //_{0,s} dB_s⟩ ].    (3.4)

An analogous formula (3.5) holds for bounded L-harmonic functions, now with h₀ = v, h_s = 0 for all s ≥ τ(x), and the property that (∫₀^{τ(x)} |ḣ_s|² ds)^{1/2} ∈ L^{1+ε} for some ε > 0. For possible choices of the process h see (Thalmaier and Wang, 1998), where formula (3.5) is used to prove local gradient estimates of harmonic functions.

4. NONLINEAR GENERALIZATIONS

The arguments of the previous section are easily extended to nonlinear situations, e.g. to harmonic maps u: M → N between Riemannian manifolds, and more generally, to solutions of the nonlinear heat equation

    ∂u/∂t = (1/2) Δu    (4.1)
for maps between M and N, where Δu = trace ∇du is the tension of u. In these cases

    (1) F_s = F(s, X_s(x)) = u(X_s(x)),   respectively
    (2) F_s = F(s, X_s(x)) = u_{t−s}(X_s(x)),
define ∇-martingales on N for Brownian motions X on M.

DEFINITION 4 (Arnaudon and Thalmaier, 1998b). For a continuous semimartingale Y taking values in a manifold N, endowed with a torsionfree connection ∇, the geodesic transport (also called deformed or damped transport; Dohrn–Guerra transport) Θ_{0,t}: T_{Y₀}N → T_{Y_t}N on N along Y is defined by the following covariant equation along Y:

    d(//_{0,·}^{−1} Θ_{0,·}) = −(1/2) //_{0,·}^{−1} R(Θ_{0,·}, dY) dY,    Θ_{0,0} = id,    (4.2)
where //_{0,t}: T_{Y₀}N → T_{Y_t}N denotes parallel transport along Y. Theorem 6 gives the corresponding differentiation formula for solutions of (4.1), valid for bounded adapted processes h with h₀ = v and h_s = 0 for all s ≥ τ(x) ∧ t; here τ(x) is again the first exit time of X(x) from some relatively compact neighbourhood D of x. In the same way, formulas for the differential of harmonic maps u: M → N between Riemannian manifolds can be derived,
    (du)_x v = −E[ u(X_{τ(x)}(x)) ∫₀^{τ(x)} ⟨W_{0,s} ḣ_s, //_{0,s} dB_s⟩ ]    (4.4)
with assumptions analogous to Theorem 6. Note that the stochastic integral in (4.4) depends only on the local geometry of M about the point x. Formula (4.4) has been used in (Arnaudon and Thalmaier, 1998b) to prove local gradient estimates for harmonic maps of bounded dilatation, recovering theorems of Goldberg–Ishihara, Petridis and Shen as special cases.

5. EXTENSIONS TO DIFFERENTIAL FORMS

In this section we sketch extensions from functions to differential forms. For more general results in this direction, e.g. for sections in vector bundles, see (Driver and Thalmaier, 1999). On a Riemannian manifold M, endowed with the Levi-Civita connection ∇, consider the exterior algebra bundle
    E = ΛT*M = ⊕_{p≥0} Λ^p T*M.
Let Δ denote the de Rham–Hodge Laplacian on Γ(E) with sign convention

    Δ = −(dd* + d*d),    (5.1)

where d and d* are the exterior differential, respectively codifferential. Note that for α ∈ Γ(E),

    dα = c⁺ ∇α,    d*α = −c⁻ ∇α,    (5.2)

where ∇: Γ(E) → Γ(T*M ⊗ E) is the induced covariant derivative on E and

    c⁺ = ∧ : T*M ⊗ ΛT*M → ΛT*M,    s ⊗ v ↦ s ∧ v,
    c⁻ = ∟ : T*M ⊗ ΛT*M → ΛT*M,    s ⊗ v ↦ s ∟ v
denote wedge product and contraction in E = ΛT*M, respectively. On ΛTM the operations c± are defined analogously. Recall that, by definition, s ∟ v = Σ_{k=1}^p ⟨s, v^k⟩ v¹ ∧ ⋯ ∧ v̂^k ∧ ⋯ ∧ v^p for v = v¹ ∧ ⋯ ∧ v^p. Further, by Weitzenböck's formula,

    Δα = □α − ℛα,    (5.3)

where □α = trace ∇²α is the "rough Laplacian" on Γ(E) and ℛ ∈ Γ(Hom E) the Weitzenböck curvature term. For x ∈ M fixed, the Weitzenböck term ℛ can be used to define a process Q with values in Aut(E_x) via the pathwise equation

    (d/dt) Q_t = −(1/2) Q_t ℛ_{//,t},    Q₀ = id_{E_x},    (5.4)

where ℛ_{//,t} = //_{0,t}^{−1} ∘ ℛ_{X_t(x)} ∘ //_{0,t}.
For the remainder of this section, let ⟨·, ·⟩ denote the natural pairing between ΛT*_xM and ΛT_xM, and let Q_t* ∈ Aut(ΛT_xM) be determined by ⟨Q_t v, w⟩ = ⟨v, Q_t* w⟩ for v ∈ ΛT*_xM, w ∈ ΛT_xM. The following lemma is crucial for derivative formulas in our situation; see (Driver and Thalmaier, 1999) for generalizations in various directions.

LEMMA 7. Let α be a solution to the heat equation on differential forms, i.e. on Γ(E), ∂α/∂t = (1/2) Δα, where Δ is the de Rham–Hodge Laplacian given by (5.1). Then

    N_s¹ = ⟨Q_s //_{0,s}^{−1} dα_{t−s}(X_s(x)), ℓ(s)⟩ − ⟨Q_s //_{0,s}^{−1} α_{t−s}(X_s(x)), ∫₀^s (Q_r*)^{−1}(dB_r ∟ Q_r* ℓ̇(r))⟩

and

    N_s² = ⟨Q_s //_{0,s}^{−1} d*α_{t−s}(X_s(x)), ℓ(s)⟩ + ⟨Q_s //_{0,s}^{−1} α_{t−s}(X_s(x)), ∫₀^s (Q_r*)^{−1}(dB_r ∧ Q_r* ℓ̇(r))⟩

are local martingales for any adapted process ℓ with paths in the Cameron–Martin space H([0, t], ΛT_xM). Here X(x) is a Brownian motion on M starting from x and B its anti-development taking values in T_xM. The following theorem is a typical application of Lemma 7, see also (Elworthy and Li, 1998). For simplicity, we assume M to be compact. For L²-sections α of E, let the semigroup P_t α = e^{(t/2)Δ̄} α be defined by means of the spectral theorem, where Δ̄ denotes the closure of Δ. Then (P_t α)_x = E[Q_t //_{0,t}^{−1} α(X_t(x))].
THEOREM 8. Let M be a compact manifold, α ∈ L²-Γ(ΛT*M) be a bounded L²-section and v ∈ ΛT_xM for some x ∈ M. Then

    (dP_t α)_x v = −E[ ⟨Q_t //_{0,t}^{−1} α(X_t(x)), ∫₀^t (Q_s*)^{−1}(dB_s ∟ Q_s* ℓ̇(s))⟩ ],

    (d*P_t α)_x v = E[ ⟨Q_t //_{0,t}^{−1} α(X_t(x)), ∫₀^t (Q_s*)^{−1}(dB_s ∧ Q_s* ℓ̇(s))⟩ ]
for any adapted process ℓ with sample paths in H([0, t], ΛT_xM) such that ℓ(0) = v, ℓ(t) = 0, and the property that (∫₀^t |ℓ̇(s)|² ds)^{1/2} ∈ L¹.

6. FORMULAS IN THE HYPOELLIPTIC CASE
In the rest of this survey, we give a generalization of the results in Section 3. We consider again our system (1.1), but now with the assumption that the generator L = A₀ + (1/2) Σ_{i=1}^r A_i² is only hypoelliptic, i.e.

    Lie(A_i, [A₀, A_i] : i = 1, ..., r)(x) = T_xM   for all x ∈ M.    (H1)
In other words: the ideal in Lie(A₀, A₁, ..., A_r) generated by (A₁, ..., A_r) is assumed to be full at each x ∈ M. We want to extend Theorem 1 and Theorem 2 to the hypoelliptic situation. Under hypothesis (H1), the "Malliavin covariance"

    C_t(x) = Σ_{i=1}^r ∫₀^t (X_s^{−1} A_i)_x ⊗ (X_s^{−1} A_i)_x ds    (6.1)

defines a positive symmetric bilinear form on T*_xM for x ∈ M such that t < ζ(x). We read C_t(x): T*_xM → T_xM, C_t(x)^{−1}: T_xM → T*_xM, and write

    C_t(x) = ∫₀^t (X_s^{−1} A)_x ((X_s^{−1} A)_x)* ds,    (6.2)

where (X^{−1} A)_x: ℝ^r → T_xM is defined by z ↦ Σ_i z^i (X^{−1} A_i)_x. Recall that, by Lemma 1, for any ℝ^r-valued predictable process k in L²_loc(Z),

    N_s := dF(s, ·)_{X_s(x)} ( v − ∫₀^s X_r^{−1} A(X_r(x)) k_r dr ) + F(s, X_s(x)) ∫₀^s ⟨k, dZ⟩

is a local martingale on [0, t ∧ ζ(x)).

Remark 9. For x ∈ M and v ∈ T_xM one may consider the system ḣ_s = −(X_s^{−1} A)_x k_s, h₀ = v. Note that if we are able to find a predictable k with values in ℝ^r such that
(1) h_σ = 0, where σ = τ(x) or σ = τ(x) ∧ t, and in addition,
(2) (∫₀^σ |ḣ_s|² ds)^{1/2} ∈ L^{1+ε} for some ε > 0,

then the results of Section 3 in the elliptic case carry over almost verbatim. However, it can be shown (Arnaudon and Thalmaier, 1998c) that, in general, this is not possible in the hypoelliptic situation. A counterexample in three dimensions (a two-dimensional Brownian motion and its Lévy area as a third coordinate) has been communicated to us by Jean Picard. The following result in the hypoelliptic case is taken from (Arnaudon and Thalmaier, 1998c). A similar result can be given for L-harmonic functions.

THEOREM 10. Let M be a smooth manifold and f: M → ℝ be bounded measurable. Assume that (H1) holds. Let v ∈ T_xM and t > 0. Then, for P_t f defined by (1.3), there is a formula of the type

    d(P_t f)_x v = E[ f(X_t(x)) 1_{t<ζ(x)} Φ_t v ],    (6.3)

where Φ_t is a T*M-valued random variable, L^p-integrable for 1 ≤ p < ∞ and local in the following sense: for any relatively compact neighbourhood D of x in M there is a choice for Φ_t which is ℱ_σ-measurable, where σ = t ∧ τ(x) and τ(x) is the first exit time of X from D when starting at x. We briefly sketch some arguments underlying Theorem 10. For simplicity, we assume that M is compact and f is C¹. Our sketch of proof does not show that Φ_t is indeed local: to see this, the given arguments have to be reformulated in terms of local martingales (cf. Arnaudon and Thalmaier, 1998c). Let a be a predictable process with values in T*M ⊗ ℝ^r and λ ∈ T_xM = ℝ^n (locally about 0) such that for t > 0,

    E[ exp( 2 ∫₀^t |a_s λ|² ds ) ] < ∞.
Let dZ^λ = dZ + aλ dt and consider the Girsanov exponential G^λ defined by

    G_t^λ = exp( −∫₀^t ⟨aλ, dZ⟩ − (1/2) ∫₀^t |aλ|² ds ).
We write Xx for the flow to our SDE driven by the perturbed Brownian motion Zx, analogously C x ( x ) , etc. By Girsanov's theorem, we find that//(A) = ^ E[/(X ; x (x))Gx • {Cx(x)~x)ki i / ] is independent of A. for each k. Thus a f - L _ 0 # M = 0, from where we conclude that t ^ E (Dif)(x,(x))(xt* i,k,e L ^
f{X-lA)xasds) n
'
ik
(Ct(x)~l)ke
ve
ôkk
-I
Recall that (X^{−1} A)_x ∈ ℝ^r ⊗ T_xM. Now the idea is to set a_s = a_s^n = ((X_s^{−1} A)_x)* 1_{[0,τ_n]}(s) ∈ T*_xM ⊗ ℝ^r, where τ_n ↑ t is a sequence of stopping times such that each a^n satisfies the integrability condition. This gives E[(df)_{X_t(x)} X_{t*} C_{τ_n}(x) C_t(x)^{−1} v] = E[(f ∘ X_t(x)) Φ_t^n v]. Finally, taking the limit as n → ∞, we get

    d(P_t f)_x v = E[(df)_{X_t(x)} X_{t*} v] = E[(f ∘ X_t(x)) Φ_t v].
REFERENCES
Arnaudon, M. and Thalmaier, A. (1998a). Stability of stochastic differential equations in manifolds. In: Séminaire de Probabilités XXXII, pp. 188–214. Lecture Notes in Math. 1686, Springer, Berlin.
Arnaudon, M. and Thalmaier, A. (1998b). Complete lifts of connections and stochastic Jacobi fields. J. Math. Pures Appl. 77, 283–315.
Arnaudon, M. and Thalmaier, A. (1998c). The differentiation of hypoelliptic diffusion semigroups. Preprint.
Driver, B. K. and Thalmaier, A. (1999). Heat kernel derivative formulas for vector bundles. Preprint.
Elworthy, K. D. and Li, X.-M. (1994). Differentiation of heat semigroups and applications. In: Probability Theory and Mathematical Statistics. Proc. 6th Vilnius Conf., pp. 239–251, B. Grigelionis et al. (Eds), VSP, Utrecht/TEV, Vilnius.
Elworthy, K. D. and Li, X.-M. (1998). Bismut type formulae for differential forms. C.R. Acad. Sci. Paris Sér. I Math. 327, 87–92.
Elworthy, K. D. and Yor, M. (1993). Conditional expectations for derivatives of certain stochastic flows. In: Séminaire de Probabilités XXVII, pp. 159–172, Lecture Notes in Math. 1557, Springer, Berlin.
Thalmaier, A. (1997). On the differentiation of heat semigroups and Poisson integrals. Stochastics Stochastics Rep. 61, 297–321.
Thalmaier, A. (1998). Some remarks on the heat flow for functions and forms. Electron. Comm. Probab. 3, 43–49.
Thalmaier, A. and Wang, F.-Y. (1998). Gradient estimates for harmonic functions on regular domains in Riemannian manifolds. J. Funct. Anal. 155, 109–124.
Prob. Theory and Math. Stat., pp. 33-42 B. Grigelionis et al. (Eds) © 1999 VSP/TEV
RANDOM PERMUTATIONS AND THE EWENS SAMPLING FORMULA IN GENETICS G. J. BABU 1 Department of Statistics, 319 Thomas Building, The Pennsylvania State University, University Park, PA 16802, U S A E. MANSTAVICIUS 2 Department of Mathematics, Vilnius University, Naugarduko str. 24, LT 2006 Vilnius, Lithuania
ABSTRACT In the last few decades, mathematical population geneticists have been exploring the mechanisms that maintain diversity in a population. In 1972, Ewens established a formula to describe the probability distribution of a sample of genes from a population that has evolved over many generations, by a family of measures on the set of partitions of an integer. The Ewens formula can be used to test if the popular assumptions are consistent with data, and to estimate the parameters. The statistics that are useful in this connection will generally be expressed as functions of the sums of transforms of the allelic partition. Such statistics can be viewed as functions of a process on the permutation group of integers. The Ewens sampling formula also arises in Bayesian statistics via Dirichlet processes. Necessary and sufficient conditions for a process defined through the Ewens sampling formula to converge in a functional space to a stable process are presented. A counter example to show that these conditions are not necessary for one-dimensional convergence is constructed.
1. INTRODUCTION

Mathematical population geneticists have been exploring, for the past few decades, the mechanisms that maintain genetic diversity in a population. Natural selection due to interaction with the environment and mutation are some of the factors that influence genetic evolution. Some geneticists believe that the mutation and random fluctuations that are inherent in the reproductive process account for much of the genetic diversity. They view that the effect of selection has been exaggerated and hence concentrate on the so-called neutral alleles models. It is a property of this model that there is no meaningful way of labelling the alleles.

¹Research supported in part by NSA grant MDA904-97-1-0023, NSF grant DMS-9626189, and by National Research Council's 1997–99 Twinning Fellowship.
²Research supported in part by Lithuanian Science and Studies Fund and by National Research Council's 1997–99 Twinning Fellowship.
The study of sampling distribution of genes from a population, that has evolved over many generations, helps in understanding the genetic structure of the population and in estimating the mutation rates.
2. THE EWENS SAMPLING FORMULA

Ewens (1972) established a formula to describe the sampling distribution of a sample of n genes from a large population by a partly heuristic argument. In several genetic models it is an exact formula and in others it is a close approximation. The formula is derived under the null hypothesis that there is no selection. In this case the allelic partition, k̄ = (k₁, ..., k_n), contains all the information available in a sample of n genes, where k_j denotes the number of alleles represented j times in the sample, j = 1, ..., n. In other words, the Ewens formula provides the distribution of the multiplicities of alleles of a sample of genes from the so-called neutral alleles model of population genetics. The Ewens sampling formula (Ewens, 1972) is given by

    ν_{n,θ}(k̄) = (n! / θ_(n)) ∏_{j=1}^n (θ/j)^{k_j} (1/k_j!),    (1)

where θ > 0, θ_(n) = θ(θ + 1) ⋯ (θ + n − 1), k_j ≥ 0, and

    1k₁ + 2k₂ + ⋯ + nk_n = n.    (2)
The vector k represents a partition of the integer n. The derivation of equation (1) was made rigorous by Karlin and McGregor (1972). Kingman (1980) argues that several different approaches lead to (1), under very broad assumptions. He claims that "The formula is reliable when (a) the size of the population is large compared to n, and the expected total number of mutations per generation is moderate (differing from 9 only by a numerical factor), (b) the population is in statistical equilibrium under mutation and genetic drift, with selection at the locus playing a negligible role, and (c) mutation is nonrecurrent, so that every mutant allele is a completely novel one." The Ewens sampling formula exposed the inadequacy of the 'standard' methods for estimation of mutation rates. It shows that k is a sufficient statistic for 9, so that estimation of 9 by using the sample heterozygosity (which is the least informative part of the data) rather than by k is inefficient. The formula can be used to test if the assumptions (a), (b), and (c) are consistent with the data, and to estimate the parameter 9. The statistics that are of interest can generally be expressed as functions of the sums hj(kj), where r ^ 1 and hj is a function on the set of nonnegative integers. While the sampling theory of neutral alleles is still developing, the focus has shifted more toward DNA sequence data in recent years.
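As a sanity check on formula (1), one can enumerate all allelic partitions satisfying (2) for a small n and verify that the probabilities sum to one. The sketch below is our own illustration, with our own helper names.

```python
import math

def ewens_prob(k, theta):
    """Ewens sampling formula (1) for the allelic partition k = (k_1,...,k_n)."""
    n = sum(j * kj for j, kj in enumerate(k, start=1))
    p = math.factorial(n) / math.prod(theta + i for i in range(n))  # n! / theta_(n)
    for j, kj in enumerate(k, start=1):
        p *= (theta / j) ** kj / math.factorial(kj)
    return p

def count_vectors(n):
    """All vectors (k_1,...,k_n) satisfying (2): 1*k_1 + ... + n*k_n = n."""
    def rec(j, rem, acc):
        if j > n:
            if rem == 0:
                yield tuple(acc)
            return
        for kj in range(rem // j + 1):
            yield from rec(j + 1, rem - j * kj, acc + [kj])
    yield from rec(1, n, [])

total = sum(ewens_prob(k, theta=2.5) for k in count_vectors(6))
```

For θ = 1 and n = 3, for example, the three partitions 1+1+1, 1+2 and 3 get probabilities 1/6, 1/2 and 1/3, which indeed sum to one.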
2.1. Ewens formula in Bayesian nonparametric problems
The Ewens formula also made its impact in an entirely different field, Bayesian statistics. It is well known that Dirichlet processes play an important role in the Bayesian approach to nonparametric problems. A random probability measure D on a measure space (Ω, 𝒜) with parameter α is called a Dirichlet process if, for every k = 1, 2, ... and measurable partition A₁, ..., A_k of Ω, the joint distribution of (D(A₁), ..., D(A_k)) is Dirichlet with parameters (α(A₁), ..., α(A_k)). Antoniak (1974) has shown that the Ewens formula can be used to test if a given data set is from a Dirichlet process with unknown parameter α, using the actual pattern of multiplicities observed. Suppose a sample of size n from a Dirichlet process with parameter α is drawn. If α is nonatomic, then Antoniak (1974) establishes that the probability that the sample contains k_j elements that occur exactly j times, j = 1, ..., n, is given by the Ewens formula (1), with θ = α(Ω).

3. GROUP OF PERMUTATIONS

Another interesting aspect of the Ewens formula is its combinatorial content. It is central to the study of a broad class of combinatorial structures such as permutation groups. The Ewens formula can be viewed as a measure on the space of partitions of an integer n. A brief description of permutation groups and associated conjugate elements is presented here. Let S_n denote the symmetric group of permutations on {1, ..., n}. The elements of S_n can be represented uniquely by a product of independent cycles. More precisely, let σ ∈ S_n be an arbitrary permutation and

    σ = x₁ ⋯ x_w    (3)
be its unique representation (up to the order) by the product of independent cycles x_i, where w = w(σ) denotes the number of cycles. For example, when n = 8, the permutation τ that maps {1, 2, 3, 4, 5, 6, 7, 8} onto {5, 3, 6, 1, 8, 2, 7, 4} (i.e., τ(1) = 5, τ(2) = 3, τ(3) = 6, τ(4) = 1, τ(5) = 8, τ(6) = 2, τ(7) = 7, τ(8) = 4) has three cycles (1 5 8 4), (2 3 6) and (7). The length of the first cycle is four, the length of the second cycle is three and the last one has length one. So one can write τ = (1 5 8 4)(2 3 6)(7). Similarly, τ² = (1 8)(4 5)(2 3 6)(7), τ³ = (1 4 8 5)(2)(3)(6)(7). The order Ord(σ) of a permutation σ is defined as the least positive integer k for which σ^k = I, where I is the identity permutation. For the example above, Ord(τ) = 12. It is well known that Ord(σ) is the least common multiple of {j : k_j(σ) > 0}. The asymptotic distribution of the order function was studied by Erdős and Turán (1965, 1967). They established that

    ν_{n,1}( σ ∈ S_n : log Ord(σ) − 0.5 log² n ≤ (x/√3) log^{3/2} n ) → Φ(x)

as n → ∞, where Φ denotes the standard normal distribution function. The function log Ord is closely related to the function Σ* log j, where the sum Σ* is extended over all positive k_j(σ).
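The cycle decomposition and the order computation in the example above are easy to reproduce. The following sketch is our own, encoding a permutation as a list with perm[i−1] = σ(i); it recovers the cycles of τ and Ord(τ) = 12.

```python
from math import gcd

def cycles(perm):
    """Cycle decomposition of a permutation of {1,...,n},
    given as a list with perm[i-1] = sigma(i)."""
    n = len(perm)
    seen, result = set(), []
    for start in range(1, n + 1):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm[i - 1]
        result.append(tuple(cycle))
    return result

def order(perm):
    """Ord(sigma) = least common multiple of the cycle lengths."""
    o = 1
    for c in cycles(perm):
        o = o * len(c) // gcd(o, len(c))
    return o

tau = [5, 3, 6, 1, 8, 2, 7, 4]  # the permutation tau from the text
```

Here `cycles(tau)` yields (1 5 8 4), (2 3 6), (7), and `order(tau)` is lcm(4, 3, 1) = 12.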
For σ ∈ S_n the representation (3) leads to 1k₁(σ) + ⋯ + nk_n(σ) = n, where k_j(σ) denotes the number of cycles of σ of length j. The group S_n can be partitioned into equivalence classes by identifying the elements σ by the vector k̄ = (k₁, ..., k_n), where 0 ≤ k_j = k_j(σ) ≤ n and 1k₁ + ⋯ + nk_n = n. In this case w(σ) = k₁ + ⋯ + k_n, the total number of cycles of σ, describes an additive function on S_n. For each θ > 0, the Ewens formula can be considered as a measure on the symmetric group S_n of permutations on {1, ..., n}. This motivates a study of the distribution of values of a function on S_n. A functional limit theorem for a partial sum process of the h_j(k_j) under the Ewens sampling formula is described in this paper, where y is a function on the unit interval and the h_j are functions on the set of non-negative integers satisfying h_j(0) = 0. The derivations involve concepts and ideas from probabilistic number theory.
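For θ = 1 the Ewens measure is exactly the distribution of cycle types of a uniformly random permutation, which can be confirmed by brute force for small n. The check below is our own sketch, not part of the paper.

```python
import math
from itertools import permutations
from collections import Counter

def cycle_type(perm):
    """Vector (k_1,...,k_n) with k_j = number of length-j cycles of perm,
    where perm[i-1] = sigma(i)."""
    n = len(perm)
    seen, k = set(), [0] * n
    for start in range(1, n + 1):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            length += 1
            i = perm[i - 1]
        k[length - 1] += 1
    return tuple(k)

def ewens_prob(k, theta):
    """Ewens sampling formula (1)."""
    n = sum(j * kj for j, kj in enumerate(k, start=1))
    p = math.factorial(n) / math.prod(theta + i for i in range(n))
    for j, kj in enumerate(k, start=1):
        p *= (theta / j) ** kj / math.factorial(kj)
    return p

# Empirical cycle-type frequencies over all of S_4 (Haar measure).
n = 4
counts = Counter(cycle_type(p) for p in permutations(range(1, n + 1)))
empirical = {k: c / math.factorial(n) for k, c in counts.items()}
```

Every empirical frequency matches ν_{4,1} exactly, e.g. the identity's type (4, 0, 0, 0) has probability 1/24.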
on permutations
group
Let G be an additive abelian group. A map h: S„ if it satisfies the relation
G is called an additive
function
n h(a)
=
J^hj(kj(a))
(4)
j=l
for each o £ Sn, where hjiQi) = 0 and hj(k), j ^ 1, k 0, is some double sequence in G. If hj(k) = khj( 1) for each 1 ^ j ^ n and k ^ 0, then h is called completely additive function. The number of cycles w(a) in (3) is a typical example of a completely additive function. Similarly, a complex valued function / on S„, given by f(a) = ]~["=1 fj(kj(a)) with fj (0) = 1 is called multiplicative. It is called completely multiplicative if, in addition, fj{k) = fj{\)k holds for each j ^ 1 and k ^ 0. All these functions are measurable with respect to the finite field T of subsets of S„ generated by the system of conjugate classes. For each 0 > 0, v n o defined in (1) is a probability measure on T . The uniform distribution (the Haar measure) on S„ induces the measure v„,i on the space of conjugate classes. This is the Ewens formula when 0 — 1. 4. FUNCTIONAL LIMIT THEOREM The main result is based on the well known relation lye(k) =
P($i
=*!,...,
£„ = kn
= n),
(5)
where ξ_j, 1 ≤ j ≤ n, are independent Poisson random variables satisfying E ξ_j = θ/j (see, for instance, Arratia et al., 1992). To state the general invariance principle for additive functions h(σ) given by (4), set for brevity a(j) = h_j(1) and u* = (1 ∧ |u|) sgn u, where a ∧ b := min{a, b}. Here and in what follows we take the limits as n → ∞. Let the normalizing factors β(n) > 0 satisfy β(n) → ∞. The sequence {β(n)} need not be monotone. Define
0/j, t; =
B(u, n;h)
=
y
(
A
(
u
,
„; h) = 6 V
( f ^ V l
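Relation (5) can be verified exactly for small n: the conditional law of independent Poisson(θ/j) variables ξ_j given 1ξ₁ + ⋯ + nξ_n = n coincides with the Ewens probabilities (1). The following self-contained check is ours, not part of the paper.

```python
import math

def ewens_prob(k, theta):
    """Ewens sampling formula (1)."""
    n = sum(j * kj for j, kj in enumerate(k, start=1))
    p = math.factorial(n) / math.prod(theta + i for i in range(n))
    for j, kj in enumerate(k, start=1):
        p *= (theta / j) ** kj / math.factorial(kj)
    return p

def count_vectors(n):
    """All (k_1,...,k_n) with 1*k_1 + ... + n*k_n = n."""
    def rec(j, rem, acc):
        if j > n:
            if rem == 0:
                yield tuple(acc)
            return
        for kj in range(rem // j + 1):
            yield from rec(j + 1, rem - j * kj, acc + [kj])
    yield from rec(1, n, [])

def poisson_pmf(lam, k):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def conditional_prob(k, theta):
    """P(xi_1 = k_1, ..., xi_n = k_n | 1*xi_1 + ... + n*xi_n = n)
    for independent xi_j ~ Poisson(theta/j), as in relation (5)."""
    n = sum(j * kj for j, kj in enumerate(k, start=1))
    def joint(kv):
        return math.prod(poisson_pmf(theta / j, kj) for j, kj in enumerate(kv, start=1))
    return joint(k) / sum(joint(kv) for kv in count_vectors(n))

k, theta = (1, 0, 1, 0), 1.7  # the partition 4 = 1 + 3, arbitrary theta
```

The two probabilities agree to floating-point accuracy for every partition of n, as (5) asserts.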
and

    y(t) := y_n(t) = max{ ℓ ≤ n : B(ℓ, n; h) ≤ t B(n, n; h) },   t ∈ [0, 1]. We shall
consider the weak convergence of the process

    H_{n,h} := H_{n,h}(σ, t) = (1/β(n)) Σ_{j ≤ y(t)} h_j(k_j(σ)) − Λ(y(t), n; h),   t ∈ [0, 1],
* £ [0, 1]
under the measure v n fi, in the space D[0, 1] endowed with the Skorohod topology (Billingsley, 1968). The corresponding process X„ with independent increments is defined by
    X_{n,h} := X_{n,h}(t) = (1/β(n)) Σ_{j ≤ y(t)} a(j) ξ_j − Λ(y(t), n; h),   t ∈ [0, 1].    (6)
We consider the weak convergence (denoted by ⇒) of the process H_{n,h}. Babu and Manstavicius (1998b) obtained the following theorem on weak convergence to stable limit processes.
THEOREM 1. Let X be a process with independent increments satisfying P(X(0) = 0) = 1 and

    E(exp{iλ(X(t) − X(s))}) = exp{ (t − s) a₁ ∫_{−∞}^0 (e^{iλu} − 1 − iλu*) d(|u|^{−α}) − (t − s) a₂ ∫_0^∞ (e^{iλu} − 1 − iλu*) d(u^{−α}) },

where a₁, a₂ ≥ 0, a₁ + a₂ > 0, 0 < α < 2, 0 ≤ s ≤ t ≤ 1, λ ∈ ℝ. In order that H_{n,h} ⇒ X, it is necessary and sufficient that for any u > 0,
    θ Σ_{j ≤ n : a(j) > u β(n)} j^{−1} → a₂ u^{−α}   and   θ Σ_{j ≤ n : a(j) < −u β(n)} j^{−1} → a₁ u^{−α},    (7)
and lim limsup/} - 2 (n)
a(j)2j~l ^ \a{j)\ 0, Babu and Manstavicius (1998a) obtained necessary and sufficient conditions for weak convergence to the Brownian motion. Their result generalizes the classical central limit theorem for the number of cycles (Goncharov, 1942) as well as the central limit theorem for the logarithm o f the product of lengths of cycles (Erdos and Turan, 1965). As in Probabilistic Number Theory (see Babu, 1973, Manstavicius, 1984, 1985; Kubilius, 1964), the proof of sufficiency depends on "truncated" sums, and it is
enough to establish the result for completely additive functions. The idea of the proof of necessity comes from Manstavicius (1985) and Timofeev and Usmanov (1982). Remark. The choice of the 'time' index function {y(/) : 0 ^ t ^ 1} makes it possible to derive the functional limit result from one-dimensional weak convergence. However, H n j , ( 1) => X ( l ) does not imply X„,/, (1) => X ( l ) . The counter example given in (Babu and Manstavicius, 1998a) illustrates this in case X is the Brownian Motion. A counter example is constructed in the next section to illustrate this fact, when //„,/,(!) converges to a stable law. 5. COUNTER EXAMPLE In this section, for each 0 < a < 2, an additive function h is constructed such that the distribution of / / « > ( 1) under v n j converges weakly to a symmetric stable distribution with characteristic exponent a , while X„ t h( 1) has a different limiting distribution. The idea of construction has its origin in (Timofeev, 1985). The main construction depends on the following analytic result. LEMMA (Manstavicius, 1996). Let 6 = 1, and let f be a complex valued completely multiplicative function defined on Sn by f j { k ) = f j ( l ) k , where |/_/(l)| ^ 1, j ^ 1. If >
H j
7=1
then the mean-value
of f , under
^ M < oo,
vnj,
+ 0(K~(]/2)+s),
(9)
for 1 < K < n and for each 8 e (0, 1 /2). The constants in the symbol O depend at most on M and 8. THEOREM 2. For each 0 < a < 2. There exists a sequence of numbers that the completely additive function h(a) = J2"j=l j)kj{a) satisfies vnA{P{n)^h{o)~
A(n,n\h)
^ x) —•
F(x)
for all x € R, where fi(n) — n^a, and F denotes the stable law with function (pa given by 4>a(s) = . However the distribution of the sum of independent random variables, Xn,h{ 1) = n"1/a does not converge to F.
J2 j^n
~
A ( n
'
h )
[a(j)}such
(10) characteristic corresponding
Random Permutations Proof. Define the bivariate distribution G by F(x)y if 0 < y < 1, G(x,y)= F(x) ify ^ 1, 0
39
otherwise,
d(j)
0 otherwise, where {*} denotes the fractional part of x, and
h{o) = J^dUy^kjia). n
j=i
Let / x „ ( . . . ) = « " ' # { 1 ^ j ^ n : ...}. constant c and all x > 1, we have
Since 1 - F(x) + F(—x)
^ cx~a,
for some
fin(d(j) < *) = /i„({yV2} < F(x)) + 0 ( / T ) 1 1/a)) + 0 ( / i „ ( V n < j < n, IF" «;^})! > y 1/2
= F(x) + 0 ( « - ' / 2 ) + O ^ n d i " - ' « ; ^ } ) ! ^ n1/2a))
= F(x) + O(n~1/2) + 0(F(-n1/2a) + 1 - F(n1/2 3
1.
(13)
Summation by parts and (12) are used to obtain the last inequality above. For 9iz ^ 0 and j real, the inequalities (13) imply thatsup n / < oo, for some ft > 1, and hence the identity map is uniformly integrable under the sequence of measures Fng~l. Thus for 9iz > 0, real s and for some /S > 1, J \g(sx, y; Z)| dG(x, y) < oo,
J \g(sx, y, z)f
d G(x, y) < oo,
(14)
and Sn(s,z)
= J g(sx,y,z)dFn(x,y)—+
J g(sx, y\z)dG(x,
y).
(15)
Similarly, A(n, n; h) = J ^-(xyl'a)*
dFn(x, y) ^
j
^ ( V 7 " ) * dG(x, y) = 0.
(16)
The last equality in (16) follows as F(x) = 1 — F(—x) for all real x. Hence A(n,n;h) —» 0. By (13), Sn(s,z) is bounded uniformly in 9iz ^ 0. Hence we have by the dominated convergence theorem, that for any K > 0, 1 ^r 2n i
1+iK f »z ez I j(exp(5n(j,z)))dz 1 -iK( 1 +iK j l-i K
g(sx,y,z)dG{x,y))>jdz.
(17)
Note that by (14) and Fubini's theorem, for any z with non-negative real part the integral ∫ g(sx, y; z) dG(x, y) can be computed explicitly. Consequently, for K > 1, the quantity in (17) converges to F(x), uniformly in x ∈ R as n → ∞. This implies (10), as A(n, n; h) → 0 by (16).
On the other hand, the characteristic function of n^{−1/α} Σ_{j≤n} ⋯ is given by (20). Since ⋯ ≥ 0, X_{n,h}(1) also converges weakly to the distribution with the characteristic function given by (20). This completes the proof.

REFERENCES

Antoniak, C. E. (1974). Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. Ann. Statist. 2, 1152-1174.
Arratia, R., Barbour, A. D. and Tavare, S. (1992). Poisson process approximations for the Ewens sampling formula. Ann. Appl. Probab. 2, 519-535.
Babu, G. J. (1973). A note on the invariance principle for additive functions. Sankhya A 35, 307-310.
Babu, G. J. and Manstavicius, E. (1998a). Brownian motion for random permutations. Submitted for publication.
Babu, G. J. and Manstavicius, E. (1998b). Infinitely divisible limit processes for the Ewens sampling formula. Submitted for publication.
Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.
DeLaurentis, J. M. and Pittel, B. G. (1985). Random permutations and the Brownian motion. Pacific J. Math. 119, 287-301.
Donnelly, P., Kurtz, T. G. and Tavare, S. (1991). On the functional central limit theorem for the Ewens sampling formula. Ann. Appl. Probab. 1, 539-545.
Erdos, P. and Turan, P. (1965). On some problems of a statistical group theory I. Z. Wahrsch. Verw. Gebiete 4, 175-186.
Erdos, P. and Turan, P. (1967). On some problems of a statistical group theory III. Acta Math. Acad. Sci. Hungar. 18, 309-320.
Ewens, W. J. (1972). The sampling theory of selectively neutral alleles. Theor. Pop. Biol. 3, 87-112.
Goncharov, V. L. (1942). On the distribution of cycles in permutations. Dokl. Akad. Nauk SSSR 35, 299-301 (in Russian).
Hansen, J. C. (1990). A functional central limit theorem for the Ewens sampling formula. J. Appl. Probab. 27, 28-43.
Karlin, S. and McGregor, J. L. (1972). Addendum to a paper of W. Ewens. Theor. Pop. Biol. 3, 112-116.
Kingman, J. F. C. (1980). Mathematics of Genetic Diversity. SIAM, Philadelphia, PA.
Kubilius, J. (1964). Probabilistic Methods in the Theory of Numbers. Amer. Math. Soc. Transl. 11, Providence, RI.
Manstavicius, E. (1984). Arithmetic simulation of stochastic processes. Lith. Math. J. 24, 276-285.
Manstavicius, E. (1985). Additive functions and stochastic processes. Lith. Math. J. 25, 52-61.
Manstavicius, E. (1996). Additive and multiplicative functions on random permutations. Lith. Math. J. 36, 400-408.
Timofeev, N. M. and Usmanov, H. H. (1982). Arithmetic simulation of the Brownian motion. Dokl. Akad. Nauk Tadzh. SSR 25, 207-211 (in Russian).
Timofeev, N. M. (1985). Stable limit laws for additive arithmetic functions. Mat. Zametki 37, 465-473 (in Russian).
Prob. Theory and Math. Stat., pp. 43-56 B. Grigelionis et al. (Eds) © 1999 VSP/TEV
ON LARGE DEVIATIONS OF SELF-NORMALIZED SUM
A. BASALYKAS
Institute of Mathematics and Informatics, Akademijos 4, 2600 Vilnius, Lithuania
Let X, X₁, X₂, … be i.i.d. random variables with EX = 0. By t_n denote the so-called self-normalized sum t_n = S_n/V_n, where S_n = X₁ + ⋯ + X_n and V_n² = X₁² + ⋯ + X_n². Q. M. Shao (1997) proved the following result about large deviations of t_n.

THEOREM (Shao (1997)). Let {x_n, n ≥ 1} be a sequence of positive numbers with x_n → ∞ such that x_n = o(√n) as n → ∞. If EX = 0 and EX² 1{|X| ≤ x} is slowly varying as x → ∞, then

lim_{n→∞} x_n^{−2} ln P(t_n ≥ x_n) = −1/2.
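Shao's limit says that ln P(t_n ≥ x_n) behaves like −x_n²/2 to first order. A rough Monte Carlo sketch (the distribution, sample size and threshold here are our illustrative choices, and at such a small x the agreement is only order-of-magnitude):

```python
import math
import random

def self_normalized(xs):
    """t_n = S_n / V_n with S_n = sum X_i, V_n^2 = sum X_i^2."""
    s = sum(xs)
    v = math.sqrt(sum(x * x for x in xs))
    return s / v if v > 0 else 0.0

random.seed(1)
n, reps, x = 200, 20000, 2.0
hits = 0
for _ in range(reps):
    xs = [random.choice([-1.0, 1.0]) for _ in range(n)]  # EX = 0, |X| <= 1
    if self_normalized(xs) >= x:
        hits += 1
p_hat = hits / reps
# Shao's theorem suggests ln P(t_n >= x) is of order -x^2/2 for moderate x;
# the comparison below is heuristic only.
print(p_hat, math.exp(-x * x / 2))
```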
Egorov (1998) obtained (in some cases) inequalities for large deviations of t_n. In this article we shall investigate large deviations of t_n using the cumulant method. For this we introduce the following (probably technical) condition:

|X| ≤ H,  H > 0.

Denote by σ² = EX², β₃ = E|X|³, Γ_k(·) the kth order cumulant, [a] the integer part of a number a, t̃_n = (t_n − E t_n)/√(D t_n), and Φ(x) the distribution function of a standard normal variable.

LEMMA 1. For k = 3, 4, …, s,

|Γ_k(t̃_n)| ≤ (k − 2)!/Δ^{k−2},

where Δ = c₀ √n (σ/H), s = [ nσ²/( c₁ H² ( ln n + ln(1 + σ²H/β₃) ) ) ], c₀ ≤ 320 and c₁ > 1.
As s ≤ 2Δ² (for sufficiently large n), by Lemma 2.2 from (Saulis and Statulevicius, 1991) we have the following.

THEOREM 1. For 0 ≤ x < √s/6,

(1 − P(t̃_n < x))/(1 − Φ(x)) = exp{L(x)} ( 1 + θ₁ f₁(x) (x + 1)/√s ),

P(t̃_n < −x)/Φ(−x) = exp{L(−x)} ( 1 + θ₂ f₂(x) (x + 1)/√s ),  |θ₁|, |θ₂| ≤ 1.
Here

f_j(x) = ( 117 + 96 j exp{ −(1/2)(1 − 3√(6x)/√s) s^{1/4} } ) / ( 1 − 3√(6x)/√s ),  j = 1, 2;

L(x) = Σ_{k≥0} l_k x^k, where the coefficients l_k, k = 0, 1, …, are expressed through the first min{k + 3, s} cumulants of the random variable t_n, and the first k ≤ s − 3 coefficients coincide with the respective coefficients of the classical Cramér series. By c (with index or not) we denote positive absolute constants, not always the same.

LEMMA 2. There exist absolute constants c₂, c₃ > 0 such that

(1) |E t_n| ≤ c₂ β₃/(σ³ √n),

(2) D t_n ≥ 1 − c₃ β₃²/(σ⁶ (n − 2)).
Proof. We use the following identity:

S_n/V_n = √(2/π) S_n ∫₀^∞ e^{−u²V_n²/2} du,  V_n ≠ 0.

If V_n = 0 (in this case also S_n = 0) we define S_n/V_n = 0; therefore the identity holds also for V_n = 0. Denote Y_i = X_i/V_n, so that t_n = Y₁ + ⋯ + Y_n and Y₁² + ⋯ + Y_n² = 1. Since E Σ_i Y_i² = 1, we obtain

D t_n = 1 − (E t_n)² + n(n − 1) E Y₁Y₂,   (3)
and we have to estimate E Y₁Y₂. For this we use the identity

E Y₁Y₂ = c E X₁X₂ ∫₀^∞ u e^{−u²V_n²/2} du = J₁ + J₂,

where J₁ = c E X₁X₂ ∫₀^δ u e^{−u²V_n²/2} du and J₂ = c E X₁X₂ ∫_δ^∞ u e^{−u²V_n²/2} du. We estimate J₁ and J₂ analogously:

|J₁| ≤ E|Y₁||Y₂| ⋯ ≤ c ( E e^{−δ²X²/4} )^n ≤ c ( e^{−δ²σ²/8} )^n,

|J₂| ≤ c ∫₀^∞ u |E X( e^{−u²X²/2} − 1 )|² ( E e^{−u²X²/4} )^{n−2} du ≤ c β₃²/( σ⁶ (n − 2)³ ).   (4)

From (3) and (4) we finally get

D t_n ≥ 1 − c₂² β₃²/(σ⁶ n) − c β₃² n(n − 1)/( σ⁶ (n − 2)³ ) ≥ 1 − c₃ β₃²/( σ⁶ (n − 2) ).
Here the sum Σ* is taken over compositions of the number k into q parts (a composition of a natural number k is any finite sequence of natural numbers k₁, k₂, …, k_q such that k₁ + ⋯ + k_q = k). The first formula in (6) gives the number of partitions of a k-element set into q nonempty subsets. Note that we can also write it in the following form:

Σ′ Σ″ k!/(k₁! ⋯ k_q!),   (7)

where the first sum is taken over all nonordered sequences k₁, …, k_q with k₁ + ⋯ + k_q = k, and then, when k₁, …, k_q are fixed, the second sum is taken over all ordered k₁, …, k_q with k₁ + ⋯ + k_q = k.
Denote ‖u‖_p² = u₁² + ⋯ + u_p² and

Y_k = Y_k(u₁, …, u_p) = Γ( S_n e^{−u₁²V_n²/2}, …, S_n e^{−u_p²V_n²/2} ).

Since t_n = t_n′ + t_n″, where the summands correspond to the regions |u| ≤ δ and |u| > δ, the Cauchy–Schwarz bounds

E( |S_n|^{k_i} e^{−τ²V_n²/2} ) ≤ ( E|S_n|^{2k_i} )^{1/2} ( E e^{−τ²V_n²} )^{1/2},  τ > 0,

yield the required integrability.
Two sets D_p and D_q are said to communicate if there exist D_{p₁}, …, D_{p_r} = D_q such that D_{p_j} and D_{p_{j+1}} engage. Obviously, the partition P = D₁ ∪ ⋯ ∪ D_q is indecomposable if and only if every two sets in it communicate. Note that (see, e.g., (Saulis and Statulevicius, 1991, Appendix 3), or (Leonov and Shiriaev, 1959))

Γ(P) = Σ_{P=D₁∪⋯∪D_v} Γ(D₁) ⋯ Γ(D_v),

where Σ_{P=D₁∪⋯∪D_v} means summation over all possible partitions of P = D₁ ∪ ⋯ ∪ D_v. Now by formula (10)

Y_k = Γ(P),   (11)

where P is the two-column table with rows ( S_n, e^{−u_i²V_n²/2} ), i = 1, …, k.
Since the table P is a two-column table, we need to estimate cumulants of three types:

(1) Γ(S_n, …, S_n), m = 1, 2, …, k;
(2) Γ(e^{−u₁²V_n²/2}, …, e^{−u_p²V_n²/2}), p = 1, 2, …, k;
(3) Γ(S_n, …, S_n (m times), e^{−u₁²V_n²/2}, …, e^{−u_p²V_n²/2}).

It is easy to see that Γ₁(S_n) = 0, and for m = 2, …, k

|Γ(S_n, …, S_n)| = |Γ_m(S_n)| ≤ n|Γ_m(X)| ≤ n m! (2H)^{m−2} σ²

by Lemma 3.1 from (Saulis and Statulevicius, 1991). Also it is not difficult to see that Γ(e^{−u₁²V_n²/2}) = ( E e^{−u₁²X²/2} )^n ≤ exp{ −u₁²σ²n/8 }. Let us calculate the cumulants of the second and third types for p = 2, 3, …, k. For this we recall the Leonov–Shiryaev formula.
Let P^{(p)} denote the table whose rows are ( e^{−u_i²X₁²/2}, …, e^{−u_i²X_n²/2} ), i = 1, …, p,   (12)

and P^{(m,p)} the table obtained by adjoining m rows S_n to P^{(p)}. By formula (10), the columns in the table P^{(p)} are independent. Therefore in formula (13) we have nonzero terms only in the cases when every D_r (r = 1, 2, …, q) from an indecomposable partition P^{(p)} = D₁ ∪ ⋯ ∪ D_q contains elements from one column only. Let us provide two lemmas which will be helpful for calculating Γ(S_n, …, S_n, e^{−u₁²V_n²/2}, …, e^{−u_p²V_n²/2}) and Γ(e^{−u₁²V_n²/2}, …, e^{−u_p²V_n²/2}). In order to begin work with (13) and (14), we first need to estimate the terms Γ(D₁), …, Γ(D_v) in these formulas. For this we have Lemma 4. Denote Y^{(m)} = (Y, …, Y) (m times).
LEMMA 4.

(a) |Γ( e^{−u₁²X²/2}, …, e^{−u_p²X²/2} )| ≤ (p − 1)! u_p² H^{2(p−1)} σ²,  p ≥ 2;

(a′) the analogous bound holds when the arguments are built from independent copies X₁, …, X_p,  p ≥ 2;

(b) |Γ( X^{(m)}, e^{−u₁²X²/2}, …, e^{−u_p²X²/2} )| ≤ m! (p − 1)! 4^m 2^{p−1} u_p² H^m σ².

Proof. It is easy to see that

Γ( S_n^{(m)}, e^{−u₁²X₁²/2}, …, e^{−u_p²X₁²/2} ) = Γ( X^{(m)}, e^{−u₁²X²/2}, …, e^{−u_p²X²/2} ).   (15)

Remark 2.
=
£
Y , V Z=D\
i r ( A ) i - - - i r ( D
u
) i ,
U-"UD U
where Z = {e~"i*2/2,... , e~urx I2} and Ez=z>|U-uov means summation over all possible partitions of Z = D\ U • • • U Dv. By G(S(m), mi, ... ,up) denote G(S
( m
>,m,...
=
£
|r(Di)|---|r(D V
v
)|,
Z(m)=D|U-UD„
where Z(m) = {S„,... , Sn, e""2*?/2,... , e""2^?72} and E'z(m)=D,u-uD„ means m times
summation over all possible partitions of set Z(m) = D\ U • • • U Dv and such, that
A.
52
Basalykas u2 x 2
every D, from Z( m ) = D\ U • • • U Dv contains at least one e
iV.
2
For example M x2 2
(using also (15)), G ( S ® , u , ) = \r(S„, Sn, e " " W / ) | = | r ( X , X, e~ ? / )|, G(S(2\UUU2) = I r ( 5 „ , S„, e - " i +
/ 2
) I r i (e~u2x2/2)
+ I r ( S „ , Sn, e~u2xV2)
|Ti ( e " " 1 x 2 / 2 )
|r(S„,S„,e-"W/2,e-"2x?/2)|+2|^
+ I r (X, X, e - " ? x 2 / 2 , e - " i x 2 / 2 ) I + 21 r ( X , e - " ? x 2 / 2 ) | | r ( X , e ~ u 2 x 2 / 2 ) \. Remark 3. Note, that (15) yields G ( S ( m ) , « i , . . . , u p ) = (i.e., Z ( m ) = { X , . . . , X, e ^ /
LEMMA 5. (a) G(S(m\
uu ... ,up)
, . . . , e""^2/2}).
< Am\6m\\u\\2
Hma2,
^exp{-||M||2a2/3}.
(b) G ( u i , . . . ,up) Proof, that
2
G ( X ( m ) , mj, . . . , up),
(a) Firstly let us take m = 1. (Denote by U = {«],...,
up}.) Easy to see
rix,e-"?.x2/2, E = £>1 U • • • U Dv, containing v sets and such that r(D,-) ^ 0, i = 1 , . . . , v (the partitions with V(Di) = 0 for some i are not interesting); remember that P r o o f
/ V
S
e
n
-«?V„2/2 \
=
\ s
c~«W2
n
=
)
E
E
*
r ( D i ) - - - r { D
v
) .
( 1 8 )
" P=DiU-UD„ Since r i ( 5 „ ) = 0 then 1 ^ v ^ k + 1. Denote by i the number of one-element sets among indecomposable sets D \ , . . . , D (T> = D \ U • • • U D ) . Note that if these one-element sets are fixed then 1 < v ^ k — s + 1. It is not difficult to see that these one-element sets have the following form: { e ~ " ^ } , i = 1 , 2 , . . . , k. There are (*) ways to choose them. Also is not difficult to see that v
v
i V n
2
r ( e - " ? v " / 2 ) = E&~ u i v " / 2 ^ e - " ? " f f 2 / 3 . From (6) and (7) using estimates for r t e - " ^ 2 , . . . , e~ u p V "/ 2 ) and T ( 5 „ , . . . , S n , c ~ W 2 , . . . , e " " ^ - / 2 ) fromLemma 6 we get l w ( « l , • • • , «t)l = | r ^
c
^
( S „
-
\
\
- W
e
2{k + 1) nk+l
K n " ^
^
/
k + l
(k + l)\
V1
„k+1
k+\
+
(k + l)k +
(k+l)\ +
n*+i
,
Integrating ( 2 0 ) w e g e t J
"
J \Yk(ui,...
,uk)\du\
•
•-duk
2
||h||^1/(48// ) (cH)k~2
*
( 2k - s ) l
(cH)k~2
N o w L e m m a 1 f o l l o w s f r o m L e m m a s 3, 7 a n d R e m a r k 1.
REFERENCES

Egorov, V. (1998). On the growth rate of moments of random sums. In: 7th Vilnius Conference on Probab. Theory, Abstracts, pp. 197-198.
Leonov, V. P. and Shiriaev, A. N. (1959). On a technics of calculating semi-invariants. Teor. Veroyatnost. i Primenen. 4(3), 342-355 (in Russian).
Petrov, V. (1987). Limit Theorems for Sums of Independent Random Variables (in Russian). Nauka, Moscow.
Saulis, L. and Statulevicius, V. (1991). Limit Theorems of Large Deviations. Kluwer, Dordrecht.
Shao, Q. M. (1997). Self-normalized large deviation. Ann. Probab. 25(1), 285-328.
Prob. Theory and Math. Stat., pp. 57-66 B. Grigelionis et al. (Eds) © 1999 VSP/TEV
MINIMAX ESTIMATION IN THE BLURRED SIGNAL MODEL E. BELITSER Mathematical Institute, Leiden University, Niels Bohrweg 1, 2333 CA Leiden, the Netherlands
ABSTRACT We study the problem of minimax estimation of a nonparametric signal blurred by a known function and observed with additive noise. The unknown function is assumed to belong to an ellipsoid in L₂[0, 1]. We propose a linear estimator and show that its maximal risk, under some conditions, has the same asymptotic behavior as the minimax risk. We also derive the exact asymptotics of the minimax risk. The results are illustrated by one example.
1. INTRODUCTION

Consider the problem of estimating a nonparametric signal function f(x), x ∈ [0, 1], on the basis of a finite set of indirect observations

Y_i = (Kf)(t_i) + ε_i,  i = 1, …, n,   (1)

where the unknown signal f is assumed to be periodic with period 1 and to belong to the nonparametric class ℰ = ℰ(Q), which is an ellipsoid in the space L₂[0, 1]; the ε_i's are independent random variables with zero mean, variance σ² and finite Fisher information I_ε; t_i = t_{i,n} = i/n; and the blurring operator K is defined as follows:

(Kf)(t) = ∫₀¹ k(t − s) f(s) ds.

We assume that the function k is known and periodic with period 1. Let θ = {θ_k}_{k=1}^∞ and {k_k}_{k=1}^∞ denote the Fourier coefficients of the signal f and of the blurring function k with respect to the orthonormal trigonometric basis {φ_k}_{k=1}^∞ in L₂[0, 1]. Throughout this paper we assume that the k_k's are all nonzero. Denote

w = (w₁, w₂, …) = ( |k₁|^{−1}, |k₂|^{−1}, … ).
Assume also that the unknown infinite-dimensional parameter θ = (θ₁, θ₂, …) lies in an l₂-ellipsoid (see (Pinsker, 1980; Donoho, Liu and MacGibbon, 1990)):

ℰ = ℰ(Q) = ℰ(Q, a) = { θ: Σ_{k=1}^∞ a_k² θ_k² ≤ Q },

where a = {a_k}_{k=1}^∞ is some positive numerical sequence. We assume that it satisfies the following condition:

a_k → ∞,  as k → ∞.   (2)
In fact, we are interested in studying the exact asymptotics of the minimax risk for "smooth" blurring functions. In this case the sequence w should be bounded away from zero or even converge to infinity. For simplicity, some variables and the dependence subscript n will frequently be dropped from the notation. All asymptotic relations below refer, unless otherwise specified, to n → ∞. By f ∈ ℰ we will mean from now on that θ = (θ₁, θ₂, …) ∈ ℰ, where the θ_k's are the Fourier coefficients of the function f. Introduce now the minimax risk over the nonparametric class ℰ:

r_n = r_n(ℰ, w) = inf sup_{f∈ℰ} E_f ‖f̂_n − f‖²,   (3)

where inf is taken over all estimators and sup is taken over all signals from the class ℰ. The minimax risk expresses the least possible mean loss when the worst case happens and, in a way, reflects the complexity of the estimation problem over the class ℰ. By an estimator f̂ = f̂_n = f̂_n(t, Y₁, …, Y_n) we mean any measurable function of the observations and t ∈ [0, 1]. We call an estimator f̂_n asymptotically minimax if

R_n(f̂_n) = sup_{f∈ℰ} E_f ‖f̂_n − f‖² = r_n(ℰ, w)(1 + o(1))  as n → ∞.
Typically, the notion of asymptotic optimality is associated with the so-called optimal rate of convergence of the minimax risk. The minimax approach becomes more accurate if the constants involved in the lower and upper bounds are found, especially when these constants happen to coincide. The problem of finding the exact constants tends to be of increasing interest. Results of this kind have only been obtained in a limited number of studies, and only for models with direct observations: the white noise model, regression, and the density estimation problem. We should mention the related results of (Pinsker, 1980; Ermakov, 1989; Donoho and Johnstone, 1994; Belitser and Levit, 1996; Golubev and Levit, 1996).

2. MAIN RESULTS

First introduce some notation. Since lim_{k→∞} a_k = ∞, the equation

p Σ_{k=1}^∞ a_k w_k² (1 − c_n a_k)₊ = c_n Q n   (4)

has the unique solution c_n = c_n(p) = c_n(p, ℰ, w) > 0 for any p > 0. Here x₊ denotes the nonnegative part of x. Denote

I = I(p) = I_n(p, ℰ, w) = { k: c_n a_k < 1 },  N = N(p) = N_n(p, ℰ, w) = #I.   (5)

THEOREM 1. The following lower bound holds:

r_n ≥ ( n^{−1} p Σ_{k∈I} w_k² − n^{−1} c_n p Σ_{k∈I} a_k w_k² )(1 + o(1)) + O(n^{−a}),
n
/«(*)==
with
Ok = —J2Mi/n)Yi,
k=\
n
Xk = (1 - c„ak)+, where c„ = cn(o2)
satisfying
Define
1,2,...,
k=
(9)
i=i
(10)
is defined by (4) with p — a2.
Introduce 00 = X] kel
W*
(^ak+2lnWk+2lny2
where J = I ( c r 2 ) is defined by (5) with p = THEOREM 2 . The following 0(dy2(p)^\
+ a2ln-kW2ln-kr2),
(11)
1=1
a2.
upper bound holds rn
Rn(f„)
^ dn (p) + 0{\j/n) +
with p = a1, and estimator fn is defined by (9) and ( 1 0 ) .
60
E. Belitser
We introduce the condition under which we derive the exact upper bound for the minimax risk: E2. y« - o(d„). We immediately conclude from the theorems above the following result. COROLLARY 3. Suppose that the errors ei's are normally distributed. Let condition E l and conditions E2 be fulfilled. Then rn = dn(cr2)(l + o(l)) and estimator fn is asymptotically minimax. Remarks. (1) With some minor modifications of the proofs one can derive the results under weaker assumption ak ^ 0 instead of strict inequalities at > 0, k= 1 , 2 , . . . . (2) The estimator fn(x) is a generalized kernel estimator fn(x) where its kernel is defined as follows:
=
Kn(x,
i/n)Y,,
n
Kn(x, i/n) = n"1
w
k(l ~ c„ak)+ 4>k(i/n) 0}. For all k e /C we define v*(;c) = Vk(x, 8) = mklv{m^x x). These are probability densities with supports (—Rgmk, mkRs) respectively and if Xk = mkX then Xk is a random variable with density Vk(x). We have E X I ( v
2
k
k
= m
) = m
2
k
k 2
{ \ - 8 / l ) ,
(14)
I ( v ) ^ m f { l + 8 ) .
(15)
Let ¡? be distributed according to a prior measure /x such that d u t f ) = f ] v k m d & k f j S(dtfi), ksK.
kgK.
where 8(dx) is the atomic measure with total probability mass concentrated at 0. Let E denote the expectation with respect to the joint distribution of Ki, Y2, • • • and 01.02
62
E.
Belitser
By using (12) and the fact that © is closed and convex, w e bound the minimax risk from below as follows: oo wlEvih-Vk? d^û) /_ Tfei. 0 oo . . . e/ *=i OO
00
^ inf w\E(h ô *=1
- ûk)2 -
sup I J2 ¿»ssuppmJC k=l
wl2Et0
k-0k)2dfi(&)
oo > inf Y ^ w 2 k E ( û k - û k ) 2 - 4 f l a V ( © C ) £ > ¿ > 4 4=1 ^ *=1
(16)
The assumptions on probability density vk(x) allow us to apply the van Trees inequality (for details see (Gill and Levit, 1992; van Trees, 1968)) to the B a y e s risk E(h - ûk)2 for all k e /C: E(h
- ûk)2 >
* ,, ,, + I{vk)
EI(ûk)
(IV)
where I{ûk) is the Fisher information about ûk contained in observations Y\, Y2, • • • , Yn. We calculate = J
EI(ûk)
log P s ( Y j - ( K f ) ( i / n ) ) ]
Ef
2A
1=1 =
7
=
i=1
£n'
which follows from Proposition 1. Recalling (15), we obtain E($k ~ $k)
t
>
1 nIE + (l + 8)/m2
r =
nlem2
^
ml +
(l+S)
for all k e K. But this lower bound is true for all k. Indeed, if k K then mk = 0 and in this case the bound becomes trivial. Thus, by the last relation, w e obtain 00
inf y^w2kE(h *
fa
- Vk)2 ^
B y (14), we have that E$k J J t l i akw\m\
1
w2m2 ^ T. n(l+S)fam2Ie(\+8)-l+n-l 00
= m\( 1 — 6/2). Recall also that me®
and therefore
^ Q- F r ° m this it follows
\a2w2k(#2
- EV2)| < a2kw2m2k\R2
00
Q - Y^aWkE&l k=\
- 1 + ¿/2|
0 and a + ft > 1/2. Let C denote generic positive constants, different in general in different expressions. First we determine cn defined by (4). Note that the sequence is monotone. So, 1 = { 1 , . . . , N] and cn = N~a( 1 + o(l)). Using the asymptotics JL. A/u+l V kv = (l+o(l)), Ai —• oo, v > — 1, v + l m=1: and (4) with p = a2, we calculate N = ( « 2 ( 2 / 6 + 2 « + l)(2£ + a + l ) / ( a c r 2 ) ) 3 ™ ( l + o ( l ) ) , cn = (aa2/(nQ(2/3
+ 2a + 1)(20 + a +
(1 +
0(1)),
66
E.
Belitser
dn = C(ct,P,a2,Q)n
+
0(1)),
where 2
1
/
aa
¿
\
2«+i / 0 = Mo-
then u(t, y) = —0 to (1.7) is absolutely continuous with respect to v(dy) . Let v(t,y) denote the Radon-Nikodymderivative ¡xt (dy) = v(t, ;y)v(d;y). By direct computation one could check that v(t,y) solves the following Cauchy problem ^ = l - B i j i V i V j v + 2 V v k j + VjXiV + XiXjv) at 2 + (¿ViBu
-ajyVjV
+ XjV)
+ ( i V i V j B i j - diva)u,
w(0, y) = 1.
(2.2)
Here and below we assume summation over repeated indices and put B = A*A. Rewrite (2.2) in the form

∂v/∂t = G(t)v + (b − a, ∇v) + m v,  v(0, y) = 1,   (2.3)

where b_i = B_ij λ_j + ∇_j B_ij. Now we can easily see that the solution to (2.3) has the probabilistic representation

v(t, y) = E exp{ ∫₀ᵗ m(τ, γ(τ)) dτ },

where γ(t) satisfies

dγ = [ b(t, γ(t)) − a(t, γ(t)) ] dt + A(t, γ(t)) dw,  γ(0) = y.   (2.4)
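The representation (2.3)–(2.4) is easy to test numerically in one dimension. The sketch below (an Euler–Maruyama discretization with our own parameter choices) estimates v(t, y) = E exp{∫₀ᵗ m(τ, γ(τ)) dτ}; in the Ornstein–Uhlenbeck case discussed next in the text, m ≡ 0 and the functional is identically 1:

```python
import math
import random

def feynman_kac(y0, drift, diffusion, m, T=1.0, steps=100, paths=2000, seed=2):
    """Monte Carlo estimate of v(T, y0) = E exp(int_0^T m(gamma(s)) ds) for
    the Euler-Maruyama discretization of d gamma = drift dt + diffusion dw."""
    rng = random.Random(seed)
    dt = T / steps
    total = 0.0
    for _ in range(paths):
        y, integral = y0, 0.0
        for _ in range(steps):
            integral += m(y) * dt
            y += drift(y) * dt + diffusion(y) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        total += math.exp(integral)
    return total / paths

# Ornstein-Uhlenbeck choice: drift -y/2, unit diffusion, and m identically 0,
# so the Feynman-Kac functional equals exp(0) = 1 on every path.
v = feynman_kac(0.5, drift=lambda y: -0.5 * y, diffusion=lambda y: 1.0,
                m=lambda y: 0.0)
print(v)  # exactly 1.0
```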
Obviously, if m(t, y) = 0 then v(t, y) = 1 satisfies (2.3). It yields that, given ν(dx) = μ₀(dx), we receive μ(t, dx) = ν(dx); that is, ν(dx) is an invariant measure of the diffusion process ξ(t).

COROLLARY. The condition m = 0, which ensures that ν is an invariant measure for the solution to (1.1), can be rewritten as follows:

Mν = div[ ½ ∇_A(Aν) − aν ] = 0.   (2.5)

To check the last assertion, consider the analogue of (2.1) in terms of the measure ν(dy) itself rather than its logarithmic derivative λ. As a result we immediately derive (2.5), which should be called a conservation law for the diffusion process. Consider some applications of the above results. Consider H = Rⁿ, A = I, and let ν(dy) be a Gaussian measure with correlation operator I. Then choosing a(x) = ½λ(x) = −½x we receive m = 0. The process ξ(t) solving (1.1) with the coefficients A and a chosen above is called the Ornstein–Uhlenbeck process, and the result reads that the stationary Gaussian measure is an invariant measure of the Ornstein–Uhlenbeck process. In the infinite dimensional case, proceeding analogously, we derive the SDE for the Ornstein–Uhlenbeck process

dξ = −½ ξ(t) dt + A dw,  ξ(0) = ξ₀,

and prove that the Gaussian measure with the correlation operator B = A*A is an invariant measure for this process. Another application of the above results is a construction of an infinite dimensional version of the transformation to the ground state representation (see for example (Dalecky, 1983) for the finite dimensional case). Let V(x), x ∈ H, be a measurable bounded function. Assume that the coefficients a(s, x) = a(x), A(s, x) = A(x) of (1.1) are uniform in time, and let P_V(t)f(x) = E e^{∫₀ᵗ V(ξ(τ)) dτ} f(ξ(t)), where ξ(t) solves (1.1). Actually, P_V(t) could not be represented in terms of the distribution of paths at time t, which reflects the fact that the function f(x) = 1 is not preserved by it. To correct the situation, or, as physicists say, to derive the transformation to the ground state representation, one can do the following. Let P*_V(t) be the operator dual to P_V(t). It is easy to check that μ_V(t, dy) = P*_V(t)ν(dy) solves the Cauchy problem

∂μ_V/∂t = A*μ_V + V μ_V,  μ_V(0, dy) = ν(dy),   (2.6)

with ν ∈ M₂(H, H₊).
(b-a,Vvv)
with m and b determined above. Now
+ [m + V]w,
vv(0,y)=l
(2.7)
we see that if V = —m then vit, y) = 1 is a solution to (2.7) and hence that the measure v is an invariant measure of the evolution family Py(t). At the other hand if v was an invariant measure of the process £(f) then m = 0 and hence v is no more invariant measure for Py(t). To construct a stochastic process P(t) for which this measure would be invariant as well or in other words to get a probabilistic representation to Py(t) we proceed as follows. 6. 860
72
Y. Belopolskaya
Let U(y) be a bounded smooth real valued function such that 2VU(y) Vu(dy) where X(y) = ^v(dy) ^ and B = A* A. Denote by Mi =gU
-l-Ti(VAU,S?AU)
+ {b-a,
= BX(y)
VU),
and choose Vu = — (M\ + m). It is easy to check that n(t, y) = tv^vvu
(t, y) is a solution to Vn) + [m + Mx +
— = Qn + (AVU, AVn) + (b-a, dt
Vu]n,
n(0, y) = eu M be a tangent Hilbert bundle with a typical fibre Hx = y(*), x e M isomorphic to a Hilbert space F. Let L(y, y*) be a vector bundle such that linear maps from Hx = y~Hx) into H* = ( y * ) - ' (x) form its fibre at the p o i n t * e M. A nondegenerate and nonegative section Gx of this bundle gives rise to a bilinear functional on Fx: r]x)Hx = (Hx, Gxr]x)x. This functional as well as the section G itself is called a Riemannian metric on y. We omit sometimes subscripts putting (•, -)h* = (•, •)• A Hilbert bundle with a Riemannian metric is called a Rimennian bundle. A manifold is called a Riemannian manifold if its tangent bundle r : TM —> M is a Riemannian bundle. Using Riemannian structure we can construct the orthonormal frame bundle O (M) — {(*, E\,E2, ••.)'•
e,•• form an orthonormal frame of TXM}.
Let exp: TM M be an exponential mapping generated by the Riemannian connection on M. The exponential map is a diffeomorphism from a small ball around zero in TXOM to a neighborhood of XQ in M. The radius of the largest ball in TX0M for which it is true is called the injectivity radius in XQ. The exponential map defines a system of coordinates around XQ called the normal coordinate system. Consider a smooth trivialization of T M over a normal coordinate chart centered at point XO E M. Choose an orthonormal frame of the tangent space TXQM with respect to which the coordinate functions are x, and the corresponding partial derivatives are 3; , thus vectors 9, are orthonormal at Jto although not in general elsewhere. We denote by {e,} the orthonormal frame of T M obtained from the smooth trivialization of T A/ in this coordinate patch. The manifold M is said to be equipped with a Hilbert-Schmidt structure ( t x , yx, /') if given the Hilbert bundle y and a Hilbert tangent bundle r : TM M one can define a bundle imbedding i: y —• r such that rx o i = y possesses the following property: for each x e M the map ix : Hx = (x) —> TXM belongs to L\2(J~LX, TXM) and ixHx is a dense subset of TXM. We say this structure is nuclear if the map ix belongs to a set of trace class operators L\\(HX, TXM). If y is a Riemannian bundle with the inner product iHx, rix)Hx = iHx, G(x)rjx)x we say that the HS-structure is Riemannian. The affine connection on M is called Hilbert-Schmidt affine connection if the local connection coefficient possesses the property rvx=Tl\BxH: 6"
BxH-^H,
xeU.
(3.1)
74 Introduce Tx*:
Y. Belopolskaya B x H*
//* by
(z,
u))// = - ( r r c y , z),
Denote by tXjt(y) the class of vector fields belonging to crk(r) and valued in TXM for each x e M. Given rj e ct/Kk) and £ e Ojt(r) put
Hence V^i; e a
(3.16)
+ V A *div G A* - div G V A *A* + (A, A*)(A, Ak)
+ div G A*div G A*] + V A t ( A , Ak) + d i v c A t ( A , Ak) - d i v G a - (a, A)]u. Notice that we have derived a parabolic equation for v of the type studied above. Once again we could construct a probabilistic representation for the solution to
(3.15) in terms of a new stochastic process γ(t), which solves the following stochastic equation

dγ = exp_{γ(t)}( q(γ(t)) dt + A(γ(t)) dw )   (3.17)

with the same diffusion coefficient A(y) and a new drift coefficient q(y) given by

q(y) = [ (A(y), A_k(y)) + ∇_{A_k}U(y) + div_G A_k(y) ] A_k(y).   (3.18)

In terms of this process the solution to (3.15) has the form

v(t, y) = E exp{ ∫₀ᵗ m(γ(τ)) dτ },   (3.19)

where

m(y) = ½[ (A(y), ∇_{A_k(y)}A_k(y)) + (A(y), A_k(y))(A(y), A_k(y)) + κ(A_k, A_k) + ∇_{A_k(y)}(A(y), A_k(y)) + div_G A_k(y)(A(y), A_k(y)) + div_G A_k(y) div_G A_k(y) ] − div_G a(y) − (a(y), A(y)),

and

κ(A_k, A_k) = ∇_{A_k(y)} div_G A_k(y) − div_G ∇_{A_k(y)} A_k(y).   (3.20)

Finally, the condition

m(y) = 0   (3.21)
ensures that v(t, y) = 1 satisfies (3.15), and hence the measure ν is invariant with respect to the evolution family V*(t). As a result we have proved the following assertion.

THEOREM 3.1. Let the coefficients of (3.3) satisfy the conditions of the existence and uniqueness theorem and be smooth enough, and let the initial value ξ(0) have a smooth distribution ν. Assume in addition that the Malliavin conditions and (3.21) are satisfied. Then ν is invariant with respect to the solution of (3.3).

Consider now an application of the above result.

4. TRANSFORMATION TO THE GROUND STATE REPRESENTATION ON MANIFOLDS

Consider a stochastic process ξ(t) which solves the Cauchy problem

dξ = exp_{ξ(t)}( a(ξ(t)) dt + A(ξ(t)) dw ),  ξ(s) = ξ₀ ∈ M,  0 ≤ s < t ≤ T,   (4.1)
and has a smooth initial distribution ν(dx) with a logarithmic derivative λ(y). Let ∫_M u(t − s, y) ν(dy) = ∫_M P(t − s)f(y) ν(dy) = E f(ξ(t)). It follows from the Feynman–Kac formula that

∫_M P_V(t − s) f(y) ν(dy) = E e^{∫_s^t V(ξ(τ)) dτ} f(ξ(t))   (4.2)

determines an evolution family P_V in B(M). It is known that, generally speaking, v(t − s, y) = ∫_M f(z) P_V(t − s, y, dz) could not be represented in terms of the transition probability of the process ξ(τ), which reflects the fact that P_V(t − s, y, ·) does not preserve the function f(y) = 1. To force 1 to be an invariant function for P_V, one has to proceed in a way similar to that used in the case of linear spaces. It follows from (4.2) that

u(t − s, y) = P_V(t − s) f(y) = ∫_M f(z) P_V(t − s, y, dz)   (4.3)

solves the Cauchy problem

∂u/∂t = Au + Vu,  u(0, y) = f(y).   (4.4)
Let U(y) be a bounded smooth real valued function such that 2∇U(y) = B(y)λ(y), where B = A*A. Denote by

M₁ = AU − ½ Tr(∇_A U, ∇_A U)

and choose V_U = −M₁. Let u(t, y) satisfy

∂u/∂t = Au + V_U u,  u(0, y) = f(y).   (4.5)

Let us derive the equation which governs γ(t, y) = e^{U(y)} u(t, y).
THEOREM 4.1. Let U(y) ∈ C²(H, R¹), V_U(y) = M₁(y), and let u(t, y) be the smooth solution to (4.5). Then γ(t, y) = e^{U(y)} u(t, y) solves

∂γ/∂t = Aγ + (∇_{A_k}U, ∇_{A_k}γ),  γ(0, y) = e^{U(y)} f(y)   (4.6)

if

V_U = AU − ½ (∇_{A_k}U, ∇_{A_k}U).   (4.7)
+ VAku]
(4.8)
and
Ay = [AUy + \(VA,U, VAkU)y + (VAuU, VAky) + Au^.
(4.9)
80
Y.
Belopolskaya
Substituting the resulting expressions into (4.5) and putting M
u
y
= [.AU
+ i(A
k
VU,
A*V£/)]y
w e get = ¿[V^V^ -
- (V„*t/, V A * y ) + Vuy
Muy,
+
Y i 0 , y ) = \ .
Thus if V(j + Mu (4.6).
— 0 w e deduce that the function y(t,y)
= e
{y
'u(t,
y) solves
Notice that w e can write the probabilistic representation for the solution to (4.6) in the form y(t, y ) = E e U W ) ) f ( P ( t ) where 0(t) is the solution to dp
= e x p $ f ) (B(P(t))[VU(P(t))
+ a m ) ) ] df + A { f i ( t ) ) dw)
under the condition y3(0) = y. Next choosing f ( y ) = e _ i / ( } , ) w e can prove that y(t, y) = 1 is a solution to (4.6). Acknowledgments
It is a pleasure to acknowledge the hospitality of the Blaise Pascal University at Clermont-Ferrand, France where this work was started. The author expresses especially her deep gratitude to S. Paycha for the invitation and friutful discussions of topics related to this paper.
REFERENCES Belopolskaya, Ya. (1997). Probabilistic representation of solutions boundary-value problems for hydrodynamics equations. Zapiski Seminarov POMI249, Belopolskaya, Ya. and Dalecky, Yu. (1990). Stochastic
77-101. Equations and Differential
Geometry.
Kluwer
Acad. Publ., Dordrecht. Dalecky, Yu. (1967). Infinite dimensional elliptic operators and parabolic equations associated with them. Russian Math. Surveys 22, 1-53. Dalecky, Yu. (1983). Stochastic differential geometry. Russian Math. Surveys 38, 97-125. Dalecky, Yu. and Fomin, S. (1991). Measures and Differential Equations in Infinite Dimensional
Space.
Kluwer Acad. Publ., Dordrecht. Ebin, D. and Marsden, J. (1970). Groups of diffeomorphisms and the motion of an incompressible fluid. Ann. Math. 92, 102-163. Gross, L. (1967). Potential theory on Hilbert space. J. Funct. Anal. 1, 123-181. Stroock, D. (1994). Probability NYC, USA.
Theory, an Analytic View. Cambridge Univ. Press, Cambridge, U.K. and
Prob. Theory and Math. Stat., pp. 81-98 B. Grigelionis et al. (Eds) © 1999 VSP/TEV
ONE TERM EDGEWORTH EXPANSION FOR STUDENT'S t STATISTIC M. BLOZNELIS 1 Vilnius University and Institute of Mathematics and Informatics H. PUTTER University of Amsterdam, NL-1105 AZ Amsterdam, The Netherlands
ABSTRACT We evaluate the rate of approximation of the distribution function of Student's t statistic based on N i.i.d. observations by its one term Edgeworth expansion. The rate is o ( i V " ' ' J ) if the distribution of observations is non-lattice and has finite third absolute moment. If the fourth absolute moment is finite and Cramer's condition holds the rate is 0 ( / V ~ ' ) .
1. INTRODUCTION AND RESULTS

Let X_1, ..., X_N, ... be independent identically distributed random variables. Write EX_1 = μ. Let t = (X̄ − μ)/σ̂ denote the Student statistic, where

    X̄ = N^{-1}(X_1 + ... + X_N),   σ̂^2 = N^{-1} Σ_{i=1}^{N} (X_i − X̄)^2.

Assume that σ^2 := E(X_1 − μ)^2 is finite and positive. Then the statistic T_N = √N t is asymptotically standard normal as N → ∞. The rate of the normal approximation and asymptotic expansions of the distribution function F(x) = P{√N t ≤ x} were studied by a number of authors: Chung (1946), Bhattacharya and Ghosh (1978), Chibisov (1980), Helmers and van Zwet (1982), Babu and Singh (1985), Helmers (1985), Slavova (1985), Hall (1987, 1988), Praskova (1989), Friedrich (1989), Bentkus and Götze (1996), Bentkus, Bloznelis and Götze (1996), Bentkus, Götze and van Zwet (1997), Gine, Götze and Mason (1997), Putter and van Zwet (1998), etc.

^1 Research supported by the A. von Humboldt Foundation and by the Lithuanian Science and Studies Foundation.
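As a side illustration (not part of the paper), the asymptotic normality of T_N = √N t can be checked by simulation. The centred exponential distribution, the sample size N = 200 and the number of replications are our illustrative choices; the scaling of σ̂ matches the definition above.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

def student_T(x):
    """T_N = sqrt(N) * (sample mean) / sigma_hat for mean-zero data,
    with sigma_hat^2 = N^{-1} * sum_i (X_i - mean)^2 (the paper's scaling)."""
    n = x.shape[-1]
    m = x.mean(axis=-1)
    s = np.sqrt(((x - m[..., None]) ** 2).mean(axis=-1))
    return sqrt(n) * m / s

def Phi(x):  # standard normal distribution function
    return 0.5 * (1 + erf(x / sqrt(2)))

# Centred exponential observations: non-lattice, all moments finite, mean 0.
N, reps = 200, 20_000
T = student_T(rng.exponential(size=(reps, N)) - 1.0)

for x in (-1.0, 0.0, 1.0):
    print(x, (T <= x).mean(), Phi(x))  # empirical F(x) vs normal limit
```

The small but systematic discrepancy between the empirical distribution and Φ, visible for skewed data at moderate N, is exactly what the one-term Edgeworth expansion below corrects.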
It is interesting to note that some fundamental problems in this area were solved only recently. The necessary and sufficient conditions for the asymptotic normality of T_N were found by Gine, Götze and Mason (1997). The Berry-Esseen bound sup_x |F(x) − Φ(x)| ≤ c N^{-1/2} β_3/σ^3 was established by Bentkus and Götze (1996); here Φ(x) denotes the standard normal distribution function and β_3 := E|X_1 − μ|^3. The problem of establishing the asymptotic expansions under optimal moment and smoothness conditions so far remains open. Probably the most general and precise result concerned with higher order asymptotics of Student's t statistic is due to Hall (1987), who proved the validity of a k-term Edgeworth expansion of F(x) with remainder o(N^{-k/2}), for every integer k, provided that E|X_1|^{k+2} < ∞ and the distribution F_0 of X_1 is non-singular. The moment conditions in Hall (1987) are the minimal ones, but the smoothness condition on the distribution F_0 of the observations is too restrictive.
The aim of the present paper is to prove the validity of the one-term Edgeworth expansion under optimal conditions. We approximate F(x) by the one-term Edgeworth expansion (also called the second order approximation)

    G(x) = Φ(x) + λ_3 (2x^2 + 1) φ(x) / (6√N),   λ_3 := E(X_1 − μ)^3 / σ^3,

where φ denotes the standard normal density, and write Δ_N := sup_x |F(x) − G(x)|. The bound Δ_N = O(N^{-1}) holds provided that EX_1^4 < ∞ and Cramér's condition is satisfied; see Bentkus, Götze and van Zwet (1997). The minimal smoothness condition which allows one to prove the validity of the one-term Edgeworth expansion, i.e., to prove the bound Δ_N = o(N^{-1/2}), is the non-latticeness of the distribution F_0; for a non-lattice distribution, P(x) > 0 for every x > 0.

THEOREM 2.
Assume that for some decreasing functions f_1 and f_2 with f_2(x) → 0 as x → +∞,

    P(x) ≥ f_1(x),   E|X_1 − μ|^3 1{(X_1 − μ)^2 > x} ≤ f_2(x),   x > x_0,   (1.2)

for some x_0 > 0. Then there exists a sequence ε_N (depending only on f_1 and f_2) with ε_N → 0 as N → ∞ such that

    Δ_N ≤ N^{-1/2} ε_N,   for N = 2, 3, ....
Theorem 2 provides a bound for Δ_N which is uniform over the class of distributions satisfying (1.2) with given functions f_1 and f_2. An immediate consequence of Theorem 2 is the following result. If the distribution of X_1 is non-lattice and E|X_1|^3 < ∞, then

    Δ_N = o(N^{-1/2})   as N → ∞.   (1.3)
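A numerical illustration of (1.3) (ours, not the paper's): using the standard form of the one-term Edgeworth expansion for Student's t, G(x) = Φ(x) + λ(2x^2 + 1)φ(x)/(6√N) with λ the skewness, the sketch below compares the sup-distance of the empirical distribution of T_N from Φ and from G. The centred exponential distribution (λ = 2) and all sizes are illustrative assumptions.

```python
import numpy as np
from math import erf, sqrt, pi

def Phi(x):
    return 0.5 * (1 + np.vectorize(erf)(np.asarray(x, dtype=float) / sqrt(2)))

def phi(x):
    return np.exp(-np.asarray(x, dtype=float) ** 2 / 2) / sqrt(2 * pi)

def edgeworth_G(x, lam, N):
    """One-term Edgeworth expansion for Student's t (standard form):
    G(x) = Phi(x) + lam * (2x^2 + 1) * phi(x) / (6 sqrt(N))."""
    x = np.asarray(x, dtype=float)
    return Phi(x) + lam * (2 * x * x + 1) * phi(x) / (6 * sqrt(N))

# Centred exponential observations: skewness lam = 2.
rng = np.random.default_rng(1)
N, reps = 100, 40_000
X = rng.exponential(size=(reps, N)) - 1.0
m = X.mean(axis=1)
s = np.sqrt(((X - m[:, None]) ** 2).mean(axis=1))
T = sqrt(N) * m / s

grid = np.linspace(-3.0, 3.0, 61)
F_emp = (T[:, None] <= grid[None, :]).mean(axis=0)
err_normal = np.abs(F_emp - Phi(grid)).max()                     # rate N^{-1/2}
err_edgeworth = np.abs(F_emp - edgeworth_G(grid, 2.0, N)).max()  # smaller order
print(err_normal, err_edgeworth)
```

On this grid the Edgeworth correction removes the N^{-1/2} skewness term, so the remaining error is dominated by higher-order terms and Monte Carlo noise.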
Theorem 2 improves earlier results of Babu and Singh (1985), Helmers (1991) and Putter and van Zwet (1998), where the bound (1.3) was established assuming that F_0 is non-lattice together with some moment conditions, the weakest of them, E|X_1|^{3+ε} < ∞ for some ε > 0, being obtained in the latter paper. Our approach differs from that used by Hall (1987) and that of Putter and van Zwet (1998). We use and elaborate some ideas and techniques, e.g. "data depending smoothing", from Bentkus and Götze (1996) and Bentkus, Götze and van Zwet (1997).
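The non-lattice assumption cannot be dropped: for lattice observations, F(x) is purely atomic with jumps of order N^{-1/2}, which no continuous function G can approximate at rate o(N^{-1/2}). The following quick simulation (symmetric ±1 coin flips are our choice of lattice distribution) makes the size of the largest atom visible.

```python
import numpy as np

rng = np.random.default_rng(2)
N, reps = 100, 30_000

# Lattice observations: symmetric +-1 coin flips.
X = rng.choice([-1.0, 1.0], size=(reps, N))
m = X.mean(axis=1)
s = np.sqrt(((X - m[:, None]) ** 2).mean(axis=1))
T = np.sqrt(N) * m / s

# T is a function of the integer-valued sum, so its distribution is purely
# atomic; the largest atom has probability of order N^{-1/2}, i.e. F(x) has
# a jump of that size.
_, counts = np.unique(np.round(T, 6), return_counts=True)
max_atom = counts.max() / reps
print(max_atom, 1 / np.sqrt(N))  # comparable magnitudes
```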
The rest of the paper is organized as follows. In Section 2 we present proofs of our results. Some more technical steps of the proofs are given in Section 3. Auxiliary results are collected in Section 4.
2. PROOFS

The proofs are rather technical and involved. The only excuse for such complex proofs is that the results obtained are optimal. We may and shall assume that EX_1 = 0 and σ^2 = 1.
In what follows c, c_1, c_2, ... denote generic absolute constants. We write c(a, b, ...) to denote a constant that depends only on the quantities a, b, .... We shall write A ≪ B if A ≤ cB, and write ‖H‖ = sup_x |H(x)|. Let g: R → R denote a function which is infinitely many times differentiable with bounded derivatives and such that

    g(x) = 1/√x   for 7/8 ≤ x ≤ 9/8.

Write c_g = ‖g‖ + ‖g′‖ + ‖g″‖ + ‖g‴‖. Let a denote the largest nonnegative solution of the equation

    a^2 = EX_1^2 1{X_1^2 ≤ a^2 N}.

For 1 ≤ i, j ≤ N and 1 ≤ k ≤ 4, write Y_i = a^{-1} N^{-1/2} X_i 1{X_i^2 ≤ a^2 N} and denote

    Y = Y_1 + ... + Y_N,   η = η_1 + ... + η_N,   η_i = Y_i^2 − EY_i^2,
    b_k = E|Y_1|^k,   γ = N^{1/2} |EY_1|,   M = N b_4,
    d_i = Y_i^2,   D = d_1 + ... + d_N.

Note that |Y_i| ≤ 1 and EY_1^2 = N^{-1}. By Hölder's inequality,

    (N b_3)^2 ≤ N^2 b_2 b_4 = N b_4 = M,   b_2 = N^{-1}.   (2.1)
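The truncation level a is defined only implicitly. As a computational aside (not from the paper), it can be approximated by fixed-point iteration; the function name and the replacement of the expectation by a sample average are our assumptions.

```python
import numpy as np

def truncation_level(x, N, iters=100, tol=1e-12):
    """Approximate the largest nonnegative solution a of
        a^2 = E X^2 1{X^2 <= a^2 N},
    with the expectation replaced by an average over the sample x.
    Starting from the full second moment and iterating the monotone map
    a^2 -> mean(x^2 * 1{x^2 <= a^2 N}) gives a decreasing sequence that
    converges to the largest fixed point."""
    a2 = float(np.mean(x ** 2))  # start from the untruncated second moment
    for _ in range(iters):
        a2_new = float(np.mean(x ** 2 * (x ** 2 <= a2 * N)))
        if abs(a2_new - a2) < tol:
            break
        a2 = a2_new
    return float(np.sqrt(a2))

rng = np.random.default_rng(3)
x = rng.standard_normal(100_000)
a = truncation_level(x, N=50)
print(a)  # close to 1: for light-tailed data the truncation removes almost no mass
```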
If Q denotes the sum q_1 + ... + q_k, where q_i are some quantities, then we write Q ≪ B when every q_i ≪ B. We may assume that, for a sufficiently small absolute constant c_0 > 0,

    β_3/√N ≤ c_0,   β_4/N ≤ c_0,   ln N ≤ c_0 N.   (2.2)
Indeed, if the first inequality fails, the bound (1.1) follows from the simple inequalities Δ_N ≤ 1 ≪ β_3/√N.
Then U_A g′(V_B) = U_{1,1} + U_{2,2} + U_{3,3} + U_* and e{Z} = e{W + L + U_*} g_1 g_2 g_3, where we denote g_p = e{U_{p,p}}. By the mean value theorem, g_p = 1 + κ_p, where κ_p = it U_{p,p} E_θ e{θ U_{p,p}}. Write α_1 = 1, ..., α_5 = κ_1 κ_2, .... Here

    W = Y^2 η / N + 2 Y_{A_0} Y_{B_0} / N.

Expanding g in powers of W and then in powers of η_{A_0} we get S = S_3 + r_1 + r_2 + r_3 + r_4, with

    |r_1| ≪ |t Y W|,   |r_2| ≪ |t Y_{A_0} η_{A_0}|,   |r_3| ≪ |Y_{A_0}| η_{A_0}^2,   |r_4| ≪ |Y_{B_0}| η_{A_0}^2.
Now an application of (2.17) gives

    |e{S} − e{S_3}| ≪ |t Y W| + |t Y_{B_0} η_{A_0}^2|^{1/2} + |t Y_{A_0} η_{A_0}|^{2/3} + |t Y_{A_0} η_{A_0}|^{2/5} =: L_2.   (2.22)
Finally, combining the inequalities

    E|Y W|(1 + Θ) ≪ N^{-1},
    E|Y_{B_0} η_{A_0}^2|^{1/2} (1 + Θ) ≪ m_0/N ≪ t^{-2} ln|t|,
    E|Y_{A_0} η_{A_0}|^{2/3} (1 + Θ) ≪ m_0/N ≪ t^{-2} ln|t|,
    E|Y_{A_0} η_{A_0}|^{2/5} (1 + Θ) ≪ m_0/N ≪ t^{-2} ln|t|,

we obtain

    I_2 := (1/(βN)) ∫_{c_1}^{N^{1/4}} E L_2 (1 + Θ) dt ≪ β^{-1} β_3 / N.
By (2.21) and (2.22), I ≪ I_1 + I_2 + I_3, where I_3 is defined in the same way as I_2, but with L_2 replaced by |E_{A_0} e{S_3}|. Clearly, the inequality

    |E_{A_0} e{S_3}| ≪ |t|^{-2} + c c_g N^{-3/4} β_4^{3/4} |Y_{B_0}|^{3/2}   (2.23)

in combination with (4.1) implies I_3 ≪ β^{-1} β_3 / √N. Now collecting the bounds for I_1, I_2 and I_3 we obtain (2.20).

It remains to prove (2.23). Write I_B = 1{|η_{B_0}| ≤ 6^{-1}, |η_{A_0}| β_4 < 1}. We have

    |E_{A_0} e{S_3}| ≤ ζ + (1 − I_B),   1 − I_B ≪ N^{-3/4} β_4^{3/4} |η_{B_0}|^{3/2},   ζ := |E_{A_0} e{S_3} I_B|.
Therefore, it suffices to estimate ζ. By the symmetry,

    ζ ≤ |v I_B|^{m_0},   v = u + R.

Expanding in powers of it η_1 Y_{B_0} g′(V_{B_0}) and it Y_1 g(V_{B_0}) we obtain

    u = E_1 e{Y_1 g(V_{B_0})},   R ≪ t^2 E_1 η_1^2 Y_{B_0}^2 c_g^2 + t^2 E_1 |η_1 Y_{B_0} Y_1| c_g^2.
Note that |R| I_B ≤ t^2 N^{-1}/100. Then

    |v I_B| ≤ |u| + |R| I_B ≤ 1 − t^2/(8N) + t^2/(100N) ≤ 1 − t^2/(9N).   (2.24)

Here we used the inequality |u| ≤ 1 − t^2/(8N), which follows from the inequality (2.19) applied to τ = Y_1 g(V_{B_0}), for |t| ≤ H_1 ≤ 4^{-1} N^{1/2}; use also (2.6). It follows from (2.24) that
    ζ ≤ (1 − t^2/(9N))^{m_0} ≤ exp{−t^2 m_0/(9N)} ≤ |t|^{-20/9}.

We arrive at (2.23), thus completing the proof of (2.20).