Stability Problems for Stochastic Models: Proceedings of the International Seminar held in Suzdal, Russia, Jan.27-Feb. 2,1991 (Lecture Notes in Mathematics, 1546) 3540567445, 9783540567448

The subject of this book is a new direction in the field of probability theory and mathematical statistics which can be


English Pages 244 [238] Year 1993


Author / Uploaded
Vladimir V. Kalashnikov (editor)
Vladimir M. Zolotarev (editor)


Lecture Notes in Mathematics Editors: A. Dold, Heidelberg B. Eckmann, Zürich F. Takens, Groningen

1546

V. V. Kalashnikov

V. M. Zolotarev (Eds.)

Stability Problems for Stochastic Models Proceedings of the International Seminar, held in Suzdal, Russia, Jan. 27-Feb. 2, 1991

Springer-Verlag Berlin Heidelberg New York London Paris Tokyo Hong Kong Barcelona Budapest

Editors Vladimir V. Kalashnikov Institute of Systems Analysis Russian Academy of Sciences Prospekt 60 let Oktyabrya, 9 117312 Moscow, Russia Vladimir M. Zolotarev Steklov Mathematical Institute Russian Academy of Sciences Vavilov St. 42 117333 Moscow, Russia

Mathematics Subject Classification (1991): 60B10, 60B99, 60E10, 60E99, 60K25, 62E10, 62F10

ISBN 3-540-56744-5 Springer-Verlag Berlin Heidelberg New York ISBN 0-387-56744-5 Springer-Verlag New York Berlin Heidelberg

Library of Congress Cataloging-in-Publication Data. Stability problems for stochastic models: Proceedings of the international seminar, held in Suzdal, Russia, Jan. 27-Feb. 2, 1991 / V. V. Kalashnikov, V. M. Zolotarev (eds.). p. cm. - (Lecture notes in mathematics; 1546) Includes bibliographical references. ISBN 3-540-56744-5 (Berlin: acid-free). - ISBN 0-387-56744-5 (New York: acid-free) 1. Stochastic systems-Congresses. 2. Stability-Congresses. I. Kalashnikov, Vladimir Viacheslavovich. II. Zolotarev, V. M. III. Series: Lecture notes in mathematics (Springer-Verlag); 1546. QA3.L28 no. 1546. (QA402) 510 s-dc20 (519.2) 93-15959 CIP

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

© Springer-Verlag Berlin Heidelberg 1993 Printed in Germany Typesetting: Camera-ready by author/editor 46/3140-543210 - Printed on acid-free paper

PREFACE

In 1991 the traditional Seminar on stability problems for stochastic models was held in the town of Suzdal, Russia, from 27 January to 2 February. It was altogether the 14th Seminar on this topic and the fourth Seminar with international participation. The previous international Seminars on stability problems for stochastic models were held in 1982 in Moscow, in 1985 in Varna (Bulgaria) and in 1987 in Sukhumi.

The Suzdal Seminar was organized by the Steklov Mathematical Institute, the Institute for Systems Studies and Vladimir Polytechnical Institute. The latter was the host organization. The Seminar was sponsored by Posad Center in Vladimir (headed by A. A. Mel'nikov), which granted the main part of the required means, the Center of the Soviet-American joint venture Dialogue at Moscow State University (headed by V. F. Matveyev), which provided financing and camera-ready preparation of the present collection of papers, and COTT cooperative organization (headed by A. A. Eremeyev). We express our gratitude to all of them.

The Seminar took place in the comfortable Suzdal Tourist Center, which turned out to be very convenient for scientific conferences. This Center is situated in a picturesque place within the town limits. The unusual weather, with temperatures reaching 25-35 °C below zero on sunny days, was a peculiar addition to the exotic architecture of the ancient Russian towns Suzdal and Vladimir.

108 specialists in the field of probability theory, mathematical statistics and their applications took part in the seminar, including 19 foreign guests from 13 countries of Europe, Asia and both Americas. 89 Soviet participants represented scientific centers from 25 cities of 9 republics. During 5 working days (one day was set aside for excursions and the individual programs of the participants), more than 50 reports were delivered, devoted mostly to the traditional themes of the Seminar. Some of the presented papers are included in this collection. The rest of them will be published in Russian in the annual series "Stability Problems for Stochastic Models - Proceedings of the Seminar", issued by the Institute for Systems Studies. These will later be translated into English in the "Journal of Soviet Mathematics" issued by Plenum Publishers. The abstracts of communications have already been published in "Probability Theory and its Applications", Vol. 36, No. 4, 1991.

Preparing and conducting the Seminar required considerable effort from the organizing committee. So on behalf of all the participants I express heartfelt gratitude to our colleagues who ensured the success of the Seminar: L. G. Afanas'eva (Moscow State University), S. N. Antonov (Vladimir Polytechnical Institute), T. V. Bychkova (Vladimir Polytechnical Institute), I. V. Grinevich (Steklov Mathematical Institute), V. V. Kalashnikov (Institute for Systems Studies), V. Yu. Korolev (Moscow State University), V. M. Kruglov (Moscow State University), V. V. Senatov (Moscow Institute of Steel and Alloys), A. I. Tolmachev (Moscow Aviation Institute).

V. M. Zolotarev

Table of Contents

Caterina Dimaki and Evdokia Xekalaki, Characterizations of the Pareto distribution based on order statistics
B. Dimitrov and Z. Khalil, Some characterizations of the exponential distribution based on the service time properties of an unreliable server
Mark Finkelstein and Howard G. Tucker, On the distribution of the Wilcoxon rank-sum statistic
Wilfried Hazod, On different stability-concepts for probabilities on groups
Herbert Heyer, Functional limit theorems for random walks on one-dimensional hypergroups
Peter Jagers, Stabilities and instabilities in population dynamics
Slobodanka Jankovic, Some properties of random variables which are stable with respect to the random sample size
V. V. Kalashnikov, Two-side estimates of geometric convolutions
L. B. Klebanov and A. Yu. Yakovlev, A stochastic model of radiation carcinogenesis
V. Yu. Korolev and V. M. Kruglov, Limit theorems for random sums of independent random variables
I. S. Molchanov, On regularly varying multivalued functions
E. V. Morozov, A comparison theorem for queueing system with non-identical channels
Josep M. Oller, On an intrinsic bias measure
Jerzy Pusz, Characterization of exponential distributions by conditional moments
Yu. S. Khokhlov, The function limit theorem on nilpotent Lie group
M. Yu. Svertchkov, On wide-sense regeneration
S. M. Shkol'nik, Some properties of the median of the stable distributions close to the symmetric ones
Hermann Thorisson, Regeneration, stationarity and simulation
Jacek Wesolowski, Multivariate infinitely divisible distributions with the Gaussian second order conditional structure
O. L. Yanushkevichiene, On the convergence of random symmetric polynomials
R. V. Yanushkevichius, Stability of characterization by record properties
Ricardas Zitikis, A Berry-Esseen bound for multivariate L-estimates with explicit dependence on dimension
L. G. Afanas'eva, On the ergodicity condition of random walks with a periodic control sequence
A. Plucinska and E. Plucinski, Some limit problem for dependent random variables

Caterina Dimaki and Evdokia Xekalaki

CHARACTERIZATIONS OF THE PARETO DISTRIBUTION BASED ON ORDER STATISTICS

In this paper characterization theorems for the Pareto distribution based on properties of order statistics are shown. The first theorem demonstrates that, under general conditions, the random variable $X$ is Pareto$(\theta, \alpha)$ distributed if and only if the statistics $X_{k:n}/X_{r:n}$ and $X_{m:n}/X_{r:n}$ are identically distributed as $X_{k-r:n-r}$ and $X_{m-r:n-r}$ respectively. The second is based on constructing an appropriate function of order statistics with the same distribution as the one sampled. The function considered in the present paper is $\min(X_1^n, X_2^n, \dots, X_n^n)$. The third theorem characterizes the marginal distributions of the random variables $X$ and $Y$ as Pareto when the conditional distribution of $X$ given that $T = XY$ is known. The four theorems that follow characterize the Pareto distribution using relations among the moments of order statistics, while the last theorem characterizes the Pareto distribution within the linear log-exponential family.

1. Introduction.

Many socioeconomic as well as naturally occurring quantities are known to be distributed according to certain skew patterns. Several distributions have been developed in order to describe these skew patterns. And among them the Pareto distribution is one of the most important. The most prominent application of the Pareto distribution has been in the study of the distribution of personal income. It has also been applied in reliability theory and for the description of empirical phenomena such as the occurrence of natural resources, stock price fluctuations, error clustering in communication circuits, the size of firms, city population sizes etc. In the sequel, a random variable X will be said to have the Pareto distribution if its probability density function (p.d.f.) is

$$f(x) = \begin{cases} \alpha\,\theta^{\alpha}\,x^{-(\alpha+1)}, & x \ge \theta,\ \theta > 0,\ \alpha > 0, \\ 0, & \text{elsewhere,} \end{cases} \tag{1.1}$$

or equivalently if its cumulative distribution function (c.d.f.) is given by

$$F(x) = 1 - \theta^{\alpha} x^{-\alpha}, \quad x \ge \theta > 0,\ \alpha > 0. \tag{1.2}$$
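As a quick numerical companion to definitions (1.1) and (1.2), the following minimal Python sketch evaluates the Pareto density, distribution function and quantile function, and compares a numerically integrated Gini coefficient with the value $1/(2\alpha - 1)$ used in this section. The function names, the midpoint rule, and the Lorenz-curve closed form $L(p) = 1 - (1-p)^{1-1/\alpha}$ are assumptions made for illustration (the Lorenz formula is a standard fact for $\alpha > 1$, not something stated in the paper).

```python
def pareto_pdf(x, theta, alpha):
    """Density (1.1); zero below the minimum level theta."""
    if x < theta:
        return 0.0
    return alpha * theta**alpha * x ** (-(alpha + 1.0))


def pareto_cdf(x, theta, alpha):
    """Distribution function (1.2)."""
    if x < theta:
        return 0.0
    return 1.0 - (theta / x) ** alpha


def pareto_quantile(p, theta, alpha):
    """Inverse of (1.2) for 0 <= p < 1 (usable for inverse-CDF sampling)."""
    return theta * (1.0 - p) ** (-1.0 / alpha)


def gini_numeric(alpha, steps=100_000):
    """Gini coefficient as 1 - 2 * (area under the Lorenz curve), alpha > 1.

    Assumption: L(p) = 1 - (1 - p)**(1 - 1/alpha) is the standard Lorenz
    curve of the Pareto law; the area is computed with a midpoint rule.
    """
    h = 1.0 / steps
    area = h * sum(
        1.0 - (1.0 - (i + 0.5) * h) ** (1.0 - 1.0 / alpha) for i in range(steps)
    )
    return 1.0 - 2.0 * area
```

For example, `gini_numeric(3.0)` can be compared with `1 / (2 * 3.0 - 1)`, and `pareto_cdf(pareto_quantile(p, theta, alpha), theta, alpha)` should return `p` up to rounding.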

It is worth mentioning at this point that $\alpha$ has a natural interpretation in the context of income distribution, since it can be used as a measure of inequality or income concentration. This is a consequence of the fact that $\alpha$ is monotonically related to the Gini coefficient $g$, namely $g = 1/(2\alpha - 1)$. Also, $\theta$ represents the minimum level of income. The wide applicability of the Pareto distribution has stimulated much interest in problems connected with characteristic properties of this distribution, and many characterizations have appeared in the literature. These can be classified into the following major categories:

1. Characterizations based on order statistics:
   i) based on distributional properties of order statistics;
   ii) based on independence properties of certain functions of order statistics.
2. Characterizations based on relations among moments of order statistics.
3. Characterizations within the linear log-exponential family.
4. Characterizations by conditional distributions and/or conditional expectations.
5. Characterizations in the context of reliability theory.
6. Characterizations based on weighted distributions.
7. Characterizations in the context of damage and generating models.

In this paper several characterization theorems for the Pareto$(\theta, \alpha)$ distribution are shown. These belong to the first three categories. Before proceeding to the presentation of the main results we shall introduce some relevant notation. Let $X_1, X_2, \dots$ be independent non-trivial random variables each having the distribution $F(x)$ with $F(1) = 0$. Let also $X_{1:n}, \dots, X_{n:n}$ denote the order statistics of a sample of size $n$ from that distribution. It is known that the distribution function $F_r(x)$, $r = 1, 2, \dots, n$, of the $r$th order statistic is given by

(1.3)

Further, if the density $f(x)$ of $X_j$ exists, then the density $f_r(x) = F_r'(x)$ of the $r$th order statistic also exists for each $r$ and it is given by

(1.4)

2. Characterization theorems based on order statistics.

Malik [15] stated that under general conditions, $X$ follows a Pareto$(\theta, \alpha)$ distribution if and only if for fixed $m$, $1 \le m \le n - 1$, $X_{m+1:n}/X_{m:n}$ and $X_{m+1:n}$ are independent. Samanta [19] tried a twofold improvement over the above theorem. He firstly weakened the absolute continuity to right continuity and secondly he weakened the independence of $X_{m+1:n}/X_{m:n}$ and $X_{m+1:n}$ to that of $X_{m+1:n}/X_{m:n}$ and $X_{k:n}$ for some $m \ne k$. However, Huang [13] stated two simple counterexamples which show that right continuity is not sufficient. A further generalization of the theorem was provided by Ahsanullah and Kabir [2]. They proved that under general conditions a necessary and sufficient condition that $X$ follows the Pareto distribution as given by (1.1) is that for some $r$ and $s$ ($1 \le r < s \le n$) the statistics $X_{r:n}$ and $X_{s:n}/X_{r:n}$ are independent. Similar results leading to characterizations of the Pareto distribution have been proved by Rogers [18], Govindarajulu [11], Ferguson [9]. Also Dallas [4] proved that the condition

($C$ independent of $x$, and $r > 0$)

characterizes the distribution of $X$ as Pareto provided that the expectation is finite. At this point it is worth mentioning some results which are closely related to the previously discussed theorems. The first was stated by Srivastava [21]. He demonstrated that under certain restrictions $X$ follows the Pareto distribution if and only if the statistics $X_{1:n}$ and $(X_{1:n} + \dots + X_{n:n})/X_{1:n}$ are independent. The second result is due to Samanta [19]. He proved that the independence between a vector of quotients of order statistics and an order statistic characterizes the Pareto distribution. Specifically he showed that under general conditions the random vector $(X_{2:n}/X_{1:n}, X_{3:n}/X_{1:n}, \dots, X_{n:n}/X_{1:n})$ and $X_{1:n}$ are independent if and only if $X$ is Pareto distributed. Finally Dallas [4] proved that the condition

($C$ independent of $x$, $r > 0$)

characterizes the distribution of $X$ as Pareto provided that the expectation is finite.

Most of the theorems mentioned so far had as characterizing condition the independence between an order statistic and a function of order statistics. Galambos [10] proved some very interesting results. In one of those the basic assumption is the independence between $X_{1:2}$ and a function $g(X_{1:2}, X_{2:2})$. The theorem is a general one and can be applied to more than one distribution. We can also notice that although it gives only a necessary condition, it may indeed be classified as a characterization theorem since the sufficiency part in any concrete case is straightforward.

The characterization theorem that follows is based on the fact that certain functions of order statistics have the same distribution. In other words, Theorem 2.1 shows that the Pareto distribution is characterized by the distribution of the statistics $X_{s_i:n}/X_{r:n}$ and $X_{s_i-r:n-r}$ for two distinct values $s_1$ and $s_2$ ($X_{s_i-r:n-r}$ is the $(s_i - r)$th order statistic out of a sample of size $n - r$).

Theorem 2.1. Let $X$ be a random variable having an absolutely continuous (with respect to Lebesgue measure) strictly increasing distribution function $F(x)$ for all $x \ge 1$, and $F(x) = 0$ for all $x < 1$. A necessary and sufficient condition that $X$ follows the Pareto distribution with density as given in (1.1) with $\theta = 1$ is that for any fixed $r$ and two distinct numbers $s_1$ and $s_2$ ($1 \le r < s_1 < s_2 \le n$) the distributions of the statistics $X_{s_i:n}/X_{r:n}$ and $X_{s_i-r:n-r}$ are identical for $i = 1, 2$.

Proof. Necessity: Let $U = X_{r:n}$, $V_i = X_{s_i:n}/X_{r:n}$, $i = 1, 2$, i.e.

$$X_{r:n} = U, \qquad X_{s_i:n} = UV_i, \qquad i = 1, 2. \tag{2.1}$$

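The joint p.d.f. of $X_{r:n}$ and $X_{s_i:n}$ is given by

$$f(x, y) = K\,[F(x)]^{r-1}\,[F(y) - F(x)]^{s_i-r-1}\,[1 - F(y)]^{n-s_i}\,f(x)\,f(y), \quad 1 \le x < y,$$

where $K = n!/[(r-1)!\,(s_i-r-1)!\,(n-s_i)!]$. Making the transformation (2.1) it follows that

$$f(u, uv_i) = K\,[1 - u^{-\alpha}]^{r-1}\,[u^{-\alpha} - (uv_i)^{-\alpha}]^{s_i-r-1}\,[(uv_i)^{-\alpha}]^{n-s_i}\,\alpha^2\,u^{-2\alpha-2}\,v_i^{-\alpha-1}$$

or

$$f(u, uv_i) = L\,[1 - u^{-\alpha}]^{r-1}\,u^{-\alpha(n-r+1)-2},$$

where

$$L = K\alpha^2\,(1 - v_i^{-\alpha})^{s_i-r-1}\,(v_i^{-\alpha})^{n-s_i+1}\,v_i^{-1}.$$

Since these factorise with independent ranges it follows that

$$f_{V_i}(v_i) = \mathrm{const}\cdot K\alpha^2\,(1 - v_i^{-\alpha})^{s_i-r-1}\,(v_i^{-\alpha})^{n-s_i+1}\,v_i^{-1}.$$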

Evaluation of the constant yields

$$f_{V_i}(v_i) = \frac{(n-r)!}{(s_i-r-1)!\,(n-s_i)!}\,(1 - v_i^{-\alpha})^{s_i-r-1}\,(v_i^{-\alpha})^{n-s_i+1}\,\alpha\,v_i^{-1}. \tag{2.2}$$

Let $W_i = X_{s_i-r:n-r}$. The density of $W_i$ is given by the relation

$$f_{W_i}(w_i) = \frac{(n-r)!}{(s_i-r-1)!\,(n-s_i)!}\,(1 - w_i^{-\alpha})^{s_i-r-1}\,(w_i^{-\alpha})^{n-s_i+1}\,\alpha\,w_i^{-1}. \tag{2.3}$$
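The equality of densities (2.2) and (2.3) can be illustrated numerically. The sketch below, under the assumption $\theta = 1$, integrates the density (2.2) with a midpoint rule and compares the result with empirical frequencies of $V = X_{s:n}/X_{r:n}$ and $W = X_{s-r:n-r}$ obtained from simulated Pareto samples. All names, the seed, and the sample sizes are illustrative choices, not part of the paper.

```python
import math
import random


def ratio_density(v, n, r, s, alpha):
    """Density (2.2) of V = X_{s:n} / X_{r:n} for Pareto(1, alpha), v >= 1."""
    const = math.factorial(n - r) / (
        math.factorial(s - r - 1) * math.factorial(n - s)
    )
    y = v ** (-alpha)
    return const * (1.0 - y) ** (s - r - 1) * y ** (n - s + 1) * alpha / v


def ratio_cdf(v, n, r, s, alpha, steps=20_000):
    """Midpoint-rule integral of (2.2) from 1 to v."""
    h = (v - 1.0) / steps
    return h * sum(
        ratio_density(1.0 + (i + 0.5) * h, n, r, s, alpha) for i in range(steps)
    )


def pareto_sample(size, alpha, rng):
    """Inverse-CDF sampling from F(x) = 1 - x**(-alpha), x >= 1."""
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(size)]


def empirical_cdfs(v, n, r, s, alpha, trials=20_000, seed=0):
    """Empirical P(V <= v) and P(W <= v) from independent simulated samples."""
    rng = random.Random(seed)
    hits_v = hits_w = 0
    for _ in range(trials):
        xs = sorted(pareto_sample(n, alpha, rng))
        hits_v += xs[s - 1] / xs[r - 1] <= v      # V = X_{s:n} / X_{r:n}
        ws = sorted(pareto_sample(n - r, alpha, rng))
        hits_w += ws[s - r - 1] <= v              # W = X_{s-r:n-r}
    return hits_v / trials, hits_w / trials
```

With, say, `n, r, s, alpha = 5, 1, 3, 2.0`, the two empirical frequencies returned by `empirical_cdfs(1.5, n, r, s, alpha)` should both be close to `ratio_cdf(1.5, n, r, s, alpha)`, up to Monte Carlo error.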

From (2.2) and (2.3) it follows that $V_i$ and $W_i$ are identically distributed.

Sufficiency: the joint p.d.f. of $U$ and $V_i$ is

$$f_{UV_i}(u, uv_i) = K\,[F(u)]^{r-1}\,[F(uv_i) - F(u)]^{s_i-r-1}\,[1 - F(uv_i)]^{n-s_i}\,f(u)\,f(uv_i), \quad 1 \le u < \infty,\ 1 \le v_i < \infty,\ i = 1, 2,$$

where $K = n!/[(r-1)!\,(s_i-r-1)!\,(n-s_i)!]$. Hence the marginal probability density of $V_i$ is

$$f_{V_i}(v_i) = \int_1^{\infty} f_{UV_i}(u, uv_i)\,du.$$

Also the probability density of $W_i = X_{s_i-r:n-r}$ is given by

$$f_{W_i}(w_i) = \frac{(n-r)!}{(s_i-r-1)!\,(n-s_i)!}\,[F(w_i)]^{s_i-r-1}\,[1 - F(w_i)]^{n-s_i}\,f(w_i).$$

Since $V_i$, $W_i$ are identically distributed it follows that

$$\frac{(n-r)!}{(s_i-r-1)!\,(n-s_i)!}\,[F(x)]^{s_i-r-1}\,[1 - F(x)]^{n-s_i}\,f(x) = K\int_1^{\infty} [F(u)]^{r-1}\,[F(ux) - F(u)]^{s_i-r-1}\,[1 - F(ux)]^{n-s_i}\,f(u)\,f(ux)\,du$$

or

$$\frac{(n-r)!\,(r-1)!}{n!} = \int_1^{\infty} [F(u)]^{r-1}\left[\frac{F(ux) - F(u)}{1 - F(ux)}\cdot\frac{1 - F(x)}{F(x)}\right]^{s_i}\left[\frac{F(x)}{F(ux) - F(u)}\right]^{r+1}\left[\frac{1 - F(ux)}{1 - F(x)}\right]^{n}\frac{f(u)\,f(ux)}{f(x)}\,du.$$

The left hand side is independent of $s_i$, therefore the right hand side is independent of $s_i$ for $i = 1, 2$. Let

$$H(u, x) = \frac{F(ux) - F(u)}{1 - F(ux)}\cdot\frac{1 - F(x)}{F(x)}.$$

Then

$$\frac{(n-r)!\,(r-1)!}{n!} = \int_1^{\infty} [F(u)]^{r-1}\,[H(u, x)]^{s_i}\left[\frac{F(x)}{F(ux) - F(u)}\right]^{r+1}\left[\frac{1 - F(ux)}{1 - F(x)}\right]^{n}\frac{f(u)\,f(ux)}{f(x)}\,du, \quad i = 1, 2.$$

Subtracting we get

$$0 = \int_1^{\infty} [F(u)]^{r-1}\,H^{s_1}(u, x)\left[1 - H^{s_2-s_1}(u, x)\right]\left[\frac{F(x)}{F(ux) - F(u)}\right]^{r+1}\left[\frac{1 - F(ux)}{1 - F(x)}\right]^{n}\frac{f(u)\,f(ux)}{f(x)}\,du.$$

$F$ is monotone, strictly increasing, therefore

$$H^{s_1}(u, x)\left[1 - H^{s_2-s_1}(u, x)\right] = 0.$$

Hence either $H(u, x) = 0$ or $H(u, x) = 1$. Since neither $F(ux) - F(u)$ nor $1 - F(x)$ can vanish identically it follows that $H(u, x) = 1$, i.e.

$$[F(ux) - F(u)]\,[1 - F(x)] = F(x)\,[1 - F(ux)] \implies [1 - F(x)]\,[1 - F(u)] = 1 - F(ux)$$

or, writing $\bar F = 1 - F$,

$$\bar F(x)\,\bar F(u) = \bar F(ux).$$

The solution of this functional equation is $\bar F(x) = x^{c}$. But $\bar F(\infty) = 0$, so $c$ is negative, i.e. $c = -\alpha$, $\alpha > 0$, i.e. Pareto with $\theta = 1$. □

Most of the characterizations mentioned above have something in common; they are based either on the independence of suitable functions of order statistics, or on the fact that certain functions of order statistics have the same distribution. A slightly different result is now given. It establishes a characterization of the Pareto distribution based on the idea of deriving a function of order statistics with the same distribution as the one sampled. More precisely we will prove that under certain restrictions the statements

(i) $X_{1:n}^{\,n} \stackrel{d}{=} X$

and

(ii) $X$ is distributed according to a Pareto distribution,

are equivalent. The above result can be given in the form of the following theorem.

Theorem 2.2. Let $X$ be a random variable which is greater than one, with the distribution function $H(x)$. Let $(X_1, X_2, \dots, X_n)$, $n \ge 2$, be a random sample from $H(\cdot)$. Then $X$ and $Z = \min(X_1^n, \dots, X_n^n)$ are identically distributed (for each integer $n \ge 2$) if and only if $H(x) = 1 - x^{-\alpha}$ for some $\alpha > 0$.

Proof. Sufficiency. Let $(X_1, X_2, \dots, X_n)$, $n \ge 2$, be a random sample from $H(x) = 1 - x^{-\alpha}$ for some $\alpha > 0$. Let also $Z = \min(X_1^n, \dots, X_n^n)$. Then

$$H_Z(z) = 1 - P(Z > z) = 1 - [P(X^n > z)]^n = 1 - [1 - P(X < z^{1/n})]^n = 1 - [1 - H(z^{1/n})]^n = 1 - z^{-\alpha} = H_X(z).$$

Necessity. Let $X$ and $Z$ have the same distribution. We shall prove that $X$ follows the Pareto distribution with $\theta = 1$. Let $\bar H(w) = 1 - H(w)$, then,

$$H(z) = 1 - P(Z > z) = 1 - [P(X^n > z)]^n = 1 - [1 - H(z^{1/n})]^n.$$

Therefore,

$$\bar H(z^n) = [\bar H(z)]^n. \tag{2.4}$$

We must now determine explicitly the form of $\bar H(z)$. The relation (2.4) is equivalent to

$$\ln \bar H(z^n) = n\,[\ln \bar H(z)]. \tag{2.5}$$
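The sufficiency computation of Theorem 2.2 and relation (2.5) lend themselves to a direct numerical check. The short Python sketch below assumes the Pareto survival function $\bar H(x) = x^{-\alpha}$ on $x \ge 1$; the function names are hypothetical and the parameter values in the usage note are arbitrary.

```python
import math


def survival(x, alpha):
    """Pareto(1, alpha) survival function: 1 - H(x) = x**(-alpha) for x >= 1."""
    return x ** (-alpha) if x >= 1.0 else 1.0


def cdf_of_min_of_powers(z, n, alpha):
    """c.d.f. of Z = min(X_1**n, ..., X_n**n), i.e. 1 - [1 - H(z**(1/n))]**n."""
    return 1.0 - survival(z ** (1.0 / n), alpha) ** n


def log_survival(z, alpha):
    """M(z) = ln(1 - H(z)); relation (2.5) reads M(z**n) = n * M(z)."""
    return math.log(survival(z, alpha))
```

For any `z >= 1` and integer `n >= 2`, `cdf_of_min_of_powers(z, n, alpha)` should agree with `1 - survival(z, alpha)`, and `log_survival(z**n, alpha)` with `n * log_survival(z, alpha)`, up to floating-point rounding.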

Let $\ln \bar H(z) = M(z)$. Then the relation (2.5) can be written as $M(z^n) = nM(z)$. Consequently, $M(z) = c \ln z$. Hence $\bar H(z) = z^{c}$, or $H(z) = 1 - z^{c}$. Since $H(\infty) = 1$, $c$ is negative, i.e. $c = -\alpha$ where $\alpha > 0$, i.e. $H(z) = 1 - z^{-\alpha}$, $\alpha > 0$, which implies that $X$ is Pareto$(1, \alpha)$ distributed. □

A similar characterization theorem for the exponential distribution has been given by Desu [5]. Sometimes one is dealing with problems which involve the bivariate distribution of the random variables $X$ and $Y$ where only the conditional distribution of $X$ given that $T = X + Y$ is known. In such a situation an obvious question arises: what can one say about the marginal distributions of $X$ and $Y$? The problem was first examined by Patil and Seshadri [17] in the case where $X$ and $Y$ are independent. They proved, among other results, that under certain assumptions the random variables $X$, $Y$ both have the negative exponential distribution. The same result under slightly modified assumptions can be found in Galambos [10]. The case where $X$ and $Y$ are dependent was examined by Panaretos [16] when their distributions are discrete and the distribution of $X \mid (X + Y)$ is of a general form. In the sequel, a question of a slightly different nature is examined. In particular, attention is given to the problem of determining the marginal distributions of the random variables $X$ and $Y$ in situations where only the conditional distribution of $X$ given $T = XY$ is known.

Theorem 2.3. Let $X$ and $Y$ be independent, greater than one and identically distributed random variables with continuous density function $f(x)$. Let $T = XY$ and $W = \min(X, Y)$. Then $f(x)$ is the Pareto density given by (1.1) with $\theta = 1$ if and only if the conditional density $f(u \mid t)$ of $U = 2W$ given $t$ satisfies

$$f(u \mid t) = 2u^{-1}/\ln t \tag{2.6}$$

1

T) ,

then it is true that

$$b_1(T) = \sum_{n=1}^{N}\,[\min(X_n, T_n) + Y_{n-1}]; \qquad b_2(T) = T + \sum_{n=1}^{M-1} Y_n.$$

Let $\beta_i(s) = E\,e^{-s\,b_i(T)}$, $i = 1, 2$, be the corresponding L.S.T.

As a corollary of Lemma 2, if $T$ is exponential and $P(X = 0) = 0$, then for any $F_X(x)$ we get

$$E\,b_1(T) = E\,b_2(T). \tag{9}$$

Furthermore, we have the following:

Corollary 2. Under the conditions of Lemma 2, if the server has instantaneous renewal times, i.e. $P(Y = 0) = 1$, then

$$\beta_1(s) = \varphi_T(s) \quad\text{and}\quad E\,b_1(T) = E\,T \tag{10}$$

for any lifetime distribution $F_X(t)$.

Proof. Substitute $\varphi_Y(s) = 1$ in (7); since $\varphi_X(s) < 1$, (7) gives

$$\beta_1(s) = \mu/(\mu + s) = \varphi_T(s),$$

and this confirms the statement of the corollary. □

The properties expressed by equations (8), (9) and (10) appear to be new (at least to the authors) properties related to the exponential distribution. Naturally the question arises whether or not the original processing time $T$ will be exponential if one or the other of equations (8)-(10) is satisfied. For this we need some additional particular results of Lemma 1.

4. Some auxiliary results.

Consider the results of Lemma 1 when the server's lifetimes $X$ are either a given constant $x$, i.e. $P(X_n = x) = 1$, or $X_n \in \mathrm{Exp}(\lambda)$, i.e. the server has a constant failure rate.

4.1. The case $P(X_n = x) = 1$. From (1) we have at once

$$\beta_1(s) = \left[\int_0^{x} e^{-st}\,dF_T(t)\right]\left[1 - \varphi_Y(s)\,e^{-sx}\,(1 - F_T(x))\right]^{-1}. \tag{11}$$

To get the corresponding expression for $\beta_2(s)$, instead of using (2) and (3) we derive $\beta_2(s)$ anew:

Lemma 3. If the lifetime $X$ is a fixed constant $x > 0$, then the L.S.T. $\beta_2(s)$ is given by

$$\beta_2(s) = \sum_{k=0}^{\infty}\left[\int_{kx}^{(k+1)x} e^{-st}\,dF_T(t)\right][\varphi_Y(s)]^{k}. \tag{12}$$

Proof. Let $T = t \ge 0$. (This has probability $dF_T(t)$.) On the interval $(0, t)$ there will be exactly $k = [t/x]$ interruptions of the service, which lead to $k$ independent down times $Y_1, \dots, Y_k$. Therefore the total execution time $b_2(T \mid T = t)$ is

$$b_2(T \mid T = t) = t + Y_1 + \dots + Y_k$$

and its L.S.T. is

$$E\left[e^{-s\,b_2(T)} \mid T = t\right] = e^{-st}\,[\varphi_Y(s)]^{[t/x]}. \tag{13}$$

When we substitute (13) in (2) we get (12). One can get the same result by using the special form of and inverting the Laplace transform (4). Since the details need long explanations we prefer the above direct proof.

From Corollary 1, for $P(X_n = x) = 1$, we obtain

$$E\,b_1(T) = \left[x + E\,Y\,(1 - F_T(x)) - \int_0^{x} F_T(t)\,dt\right] \Big/ F_T(x); \tag{14}$$

$$E\,b_2(T) = E\,T + E\,Y \sum_{k=0}^{\infty} k \int_{kx}^{(k+1)x} dF_T(t). \tag{15}$$
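Formulas (11) and (12) can be compared numerically for a given service-time density. The sketch below is an illustration under stated assumptions: the density `pdf` of $T$ is supplied by the caller, `phi_y` is simply the numerical value of $\varphi_Y(s)$ at the chosen $s$, integrals use a midpoint rule, and the series in (12) is truncated. None of these helper names come from the paper.

```python
import math


def integrate(f, a, b, steps=2000):
    """Midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def beta1(s, x, pdf, phi_y):
    """Formula (11): constant server lifetime x, phi_y = value of phi_Y(s)."""
    num = integrate(lambda t: math.exp(-s * t) * pdf(t), 0.0, x)
    tail = 1.0 - integrate(pdf, 0.0, x)           # 1 - F_T(x)
    return num / (1.0 - phi_y * math.exp(-s * x) * tail)


def beta2(s, x, pdf, phi_y, terms=200):
    """Formula (12), truncated after `terms` intervals [kx, (k+1)x]."""
    return sum(
        integrate(lambda t: math.exp(-s * t) * pdf(t), k * x, (k + 1) * x)
        * phi_y**k
        for k in range(terms)
    )
```

For an exponential density such as `lambda t: 1.5 * math.exp(-1.5 * t)` the two transforms should coincide (up to quadrature error) for every `x`, in line with Lemma 2, whereas for a uniform density on $[0, 1]$ they differ, which is the behaviour that Theorem 1 turns into a characterization.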

4.2. The case $X \in \mathrm{Exp}(\lambda)$. If $X$ is exponentially distributed then $H_\lambda(t) = \lambda t$ and we have:

Lemma 4. The L.S.T. of the total service times are

(16)

(17)

and the expected values are

(18)

$$E\,b_2(T) = E\,T\,(1 + \lambda\,E\,Y). \tag{19}$$

The proof of (16)-(19) can be found in Khalil, Dimitrov and Petrov [5] but one can derive it directly from Lemma 1 and Corollary 1. Now we return to the question posed at the end of paragraph 3.

5. Characterization theorems for the exponentially distributed processing time T.

In Lemma 2 we have seen that when $T$ is exponentially distributed then (8) and (9) hold true, i.e. the total service times under the two different service disciplines coincide in distribution and their expected values are equal. We show that the converse is true, but with some additional assumptions.

Theorem 1. The coincidence in distribution $b_1(T) \stackrel{d}{=} b_2(T)$ for constant server lifetimes in a neighbourhood of zero (i.e. for $P(X = x) = 1$ and any $x > 0$) takes place iff the original service time $T$ is exponentially distributed.

Proof. The necessity part is already shown by Lemma 2. To show the sufficiency we note that under the given conditions (8) is true for any $s \ge 0$. Now in view of (11) and for fixed $X = x$ we have:

$$\int_0^{x} e^{-st}\,dF_T(t) \Big/ \left[1 - \varphi_Y(s)\,e^{-sx}\,(1 - F_T(x))\right] = \sum_{k=0}^{\infty} \int_{kx}^{(k+1)x} e^{-st}\,dF_T(t)\,[\varphi_Y(s)]^{k}. \tag{20}$$

Note that if $P(Y = 0) = 1$ (instantaneous renewal of the server) then $\varphi_Y(s) = 1$ and (20) takes the simpler form

$$\int_0^{x} e^{-st}\,dF_T(t)\,\left[1 - e^{-sx}\,(1 - F_T(x))\right]^{-1} = \int_0^{\infty} e^{-st}\,dF_T(t) = \varphi_T(s). \tag{21}$$

If $P(Y = 0) < 1$, then $\varphi_Y(s) < 1$ for $s > 0$ and it changes continuously with $s$. The left hand side in (20) can be rewritten as

$$\int_0^{x} e^{-st}\,dF_T(t)\,\left[1 - \varphi_Y(s)\,e^{-sx}\,(1 - F_T(x))\right]^{-1} = \sum_{k=0}^{\infty} [\varphi_Y(s)]^{k}\,e^{-ksx}\,[1 - F_T(x)]^{k} \int_0^{x} e^{-st}\,dF_T(t).$$

For equation (20) to be true it is necessary that the coefficients of $[\varphi_Y(s)]^{k}$ on both sides coincide, and we have:

$$\left[(1 - F_T(x))\,e^{-sx}\right]^{k} \int_0^{x} e^{-st}\,dF_T(t) = \int_{kx}^{(k+1)x} e^{-st}\,dF_T(t), \quad k = 0, 1, 2, \dots \tag{22}$$

Summing (22) over $k$ we have again that (21) holds. Consider (21): the ratio on the L.H.S. for any $x > 0$ is a constant, depending on $s$ but not on $x$, and $-\varphi'(0) = E\,T$.

Rewrite (21) in the form (omit the subscript $T$ for a while)

$$\int_0^{x} e^{-st}\,dF(t) = \varphi(s)\,\left[1 - e^{-sx}\,(1 - F(x))\right], \tag{23}$$

which upon differentiation with respect to $x$ (no solutions of (21) are lost) yields

$$e^{-sx}\,dF(x) = \varphi(s)\,e^{-sx}\,\left[s\,(1 - F(x))\,dx + dF(x)\right] \tag{24}$$

or

$$\frac{F'(x)}{1 - F(x)} = \frac{s\,\varphi(s)}{1 - \varphi(s)} = c > 0. \tag{25}$$

This last equation shows that

$$\frac{s\,\varphi(s)}{1 - \varphi(s)} = c, \quad s > 0, \qquad\text{i.e.}\qquad \varphi(s) = \frac{c}{c + s}, \quad s > 0,$$

as well as $F'(x)/(1 - F(x)) = c$, i.e. $F(x)$ has a constant failure rate and $F(x) = 1 - e^{-cx}$. The Theorem is proved. □

The only disadvantage of Theorem 1 is that it needs a continuous set of values $x$ for which equation (8) holds. But an example shows that it is not sufficient to have the truth of (21) or (23) for a given fixed $x$ only. It is easy to verify that the distribution function $F_T(t)$ given by its p.d.f.

$$f_T(t) = \frac{a}{a + x}\left(\frac{x}{a + x}\right)^{n}\frac{1}{x} \quad\text{for } t \in [nx, (n+1)x]$$

and $n = 0, 1, 2, \dots$ satisfies the equation (21). Moreover any distribution from the class given by a p.d.f. of the form

$$f_T(t) = (1 - a)\,a^{n}\,q(t - nx) \quad\text{when } t \in [nx, (n+1)x],$$

$n = 0, 1, 2, \dots$, where $a \in (0, 1)$ and $q(t)$ is a certain probability distribution on $[0, x]$, also satisfies (21).

The next Theorem shows that if $X \in \mathrm{Exp}(\lambda)$ and (8) holds for a given value of $\lambda$, it implies that $T$ is also exponentially distributed.

Theorem 2. The coincidence $b_1(T) \stackrel{d}{=} b_2(T)$ is true for some $F_X(x) = 1 - e^{-\lambda x}$ iff the original service time $T$ is exponentially distributed.

Proof. The necessity follows from Lemma 2. To show the sufficiency part, we use equation (8) and the forms (16) and (17) for $\beta_1(s)$ and $\beta_2(s)$. It implies that

(26)

is an identity in $s$. If $P(Y = 0) = 1$, then (26) transforms into:

and it has been proved in Dimitrov and Khalil (1990) that this relation implies $T \in \mathrm{Exp}(\mu)$ with a proper value of $\mu$. If $P(Y = 0) < 1$, then $\varphi_Y(s)$ is a continuous decreasing function of $s \in [0, \infty)$. The L.H.S. of (26) can be rewritten in the form

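$$\varphi_T(s + \lambda - \lambda\varphi_Y(s)) = \int_0^{\infty} e^{-(s+\lambda)t}\,e^{\lambda\varphi_Y(s)t}\,dF_T(t) = \int_0^{\infty} e^{-(s+\lambda)t} \sum_{k=0}^{\infty} \frac{[\lambda\varphi_Y(s)]^{k}\,t^{k}}{k!}\,dF_T(t) = \sum_{k=0}^{\infty} [\varphi_Y(s)]^{k}\,\frac{(-\lambda)^{k}}{k!}\,\frac{d^{k}}{ds^{k}}\varphi_T(s + \lambda).$$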

Equating this last expression to the R.H.S. of (26) and comparing coefficients of $[\varphi_Y(s)]^{k}$ we get for $k = 1$

(27)

Letting here $z = \lambda + s$ we obtain the differential equation

$$\frac{d\varphi_T(z)}{\varphi_T(z)\,[1 - \varphi_T(z)]} = -\frac{dz}{z}$$

and its solution is

$$\varphi_T(z) = (1 + cz)^{-1}. \tag{28}$$
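That (28) solves the differential equation above can be confirmed with a few lines of Python. The sketch below uses a central finite difference for the derivative, and additionally checks numerically that $(1 + cz)^{-1}$ is the Laplace-Stieltjes transform of an exponential law with mean $c$ (a standard fact, stated here only to make the role of the constant $c$ concrete). The function names, step sizes and truncation point are illustrative assumptions.

```python
import math


def phi(z, c):
    """Candidate solution (28): phi_T(z) = 1 / (1 + c z)."""
    return 1.0 / (1.0 + c * z)


def ode_residual(z, c, h=1e-5):
    """phi'(z) / (phi (1 - phi)) + 1/z, which vanishes if (28) solves the ODE."""
    dphi = (phi(z + h, c) - phi(z - h, c)) / (2.0 * h)   # central difference
    return dphi / (phi(z, c) * (1.0 - phi(z, c))) + 1.0 / z


def exponential_lst_numeric(z, mean, steps=100_000):
    """Midpoint approximation of E exp(-z T) for T exponential with given mean.

    The integral is truncated at 60 * mean, where the remaining tail is
    negligible for this illustration.
    """
    upper = 60.0 * mean
    h = upper / steps
    return h * sum(
        math.exp(-z * t) * math.exp(-t / mean) / mean
        for t in ((i + 0.5) * h for i in range(steps))
    )
```

For instance, `ode_residual(1.0, 2.0)` should be numerically zero and `exponential_lst_numeric(1.0, 2.0)` should be close to `phi(1.0, 2.0)`.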

The obvious solution $\varphi_T(z) \equiv 1$ of (27) gives an instantaneous original service time, $P(T = 0) = 1$, while in view of the expected properties of a generating function $\varphi_T(z)$ (positive for $z > 0$, decreasing in $z \in [0, \infty)$) the solution in (28) has $c > 0$ and hence $T \in \mathrm{Exp}(c)$. The theorem is proved. □

Theorem 3. Let X E Exp (>'k), for a series of different parameters Ak with finite codensation point. The equality of expected values E b1 (T) and E bz(T) for any k, (equation (9) ) takes place iff T is exponentially distributed. Proof. We use here the particular results (18) and (19) in (9) and we get

Simple algebraic manipulations lead to:

(29)

Since the last identity holds for a series of different $\lambda$'s (exactly $\{\lambda_k\}$ in the conditions of the theorem), then according to Moran's theorem for analytic functions, (29) is true everywhere as long as both sides remain analytic functions. Thus $T \in \mathrm{Exp}((ET)^{-1})$. □

Theorem 4. The truth of (10) with $P(Y = 0) = 1$ and for a set of constant values of $X$, $P(X = x) = 1$ for $x \in (0, a)$, $a > 0$, holds iff $T$ is exponentially distributed.

Proof. Now (10) and (11) give the identities

$$\varphi_T(s) = \int_0^x e^{-st}\, dF_T(t)\, \bigl[1 - e^{-sx}(1 - F_T(x))\bigr]^{-1} = \beta_1(s).$$

This is equivalent to

$$\varphi_T(s)\, \bigl[1 - e^{-sx}(1 - F_T(x))\bigr] = \int_0^x e^{-st}\, dF_T(t).$$

Differentiating with respect to $x$ we get:

$$\varphi_T(s)\, \bigl[s\, e^{-sx}(1 - F_T(x)) + e^{-sx} F_T'(x)\bigr] = e^{-sx} F_T'(x).$$

From this last we get

$$\frac{F_T'(x)}{1 - F_T(x)} = c > 0. \tag{30}$$

As seen from Theorem 1, the only solution of the above equation (30) (excluding the obvious solution $F_T(x) \equiv 1$) is $F_T(x) = 1 - e^{-cx}$. Let us consider the case $E\,b_1(T) = ET$ for $x \in (0, a)$. Using (14) we get

$$E(T) = \Bigl[x - \int_0^x F_T(t)\, dt\Bigr] \Big/ F_T(x),$$

which is true for arbitrary $x \in (0, a)$. Rewriting this last equation we get

$$F_T(x)\, E(T) = x - \int_0^x F_T(t)\, dt.$$

Differentiating with respect to $x$ yields

$$\frac{dF_T(x)}{1 - F_T(x)} = \frac{dx}{ET},$$

i.e. $F_T(x) = 1 - \exp(-x/ET)$. This proves the theorem. □
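The two identities used in the proof of Theorem 4 can be checked numerically in the exponential case. The sketch below is only an illustration: it assumes $T \sim \mathrm{Exp}(\mu)$ with rate $\mu$, uses a plain midpoint rule for the integrals, and the helper names `midpoint`, `transform_identity_sides` and `restart_mean` are hypothetical.

```python
import math


def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def transform_identity_sides(mu, s, x):
    """Both sides of phi_T(s) [1 - e^{-sx} (1 - F_T(x))] = int_0^x e^{-st} dF_T(t)
    for T exponential with rate mu (so phi_T(s) = mu / (mu + s))."""
    survival = math.exp(-mu * x)  # 1 - F_T(x)
    lhs = mu / (mu + s) * (1.0 - math.exp(-s * x) * survival)
    rhs = midpoint(lambda t: math.exp(-s * t) * mu * math.exp(-mu * t), 0.0, x)
    return lhs, rhs


def restart_mean(mu, x):
    """[x - int_0^x F_T(t) dt] / F_T(x) for T exponential with rate mu.

    In the exponential case this equals E T = 1 / mu for every x > 0.
    """
    cdf = lambda t: 1.0 - math.exp(-mu * t)
    return (x - midpoint(cdf, 0.0, x)) / cdf(x)


if __name__ == "__main__":
    print(transform_identity_sides(2.0, 1.5, 0.8))
    print(restart_mean(2.0, 0.3), restart_mean(2.0, 1.7))
```

Both checks only confirm the easy direction (an exponential $T$ satisfies the identities for every $x$); the converse is what the theorem establishes.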


Finally, we recall that equations (10) are true for exponentially distributed lifetimes of the server $X_n$ iff $T$ is also exponential. This is the case studied by B. Dimitrov and Z. Khalil in [3] and will not be discussed here.

REFERENCES

[1] T. Azlarov and N. Volodin, Characterization Problems Associated with Exponential Distribution, Springer, Berlin - New York, 1986.
[2] J. Galambos and S. Kotz, Characterizations of Probability Distributions: A Unified Approach with an Emphasis on Exponential and Related Models, Lect. Notes Math., Springer, 675 (1978).
[3] B. Dimitrov and Z. Khalil, On a new characterizing property of the exponential distribution, J. Appl. Probab., (to appear).
[4] N. K. Jaiswal, Priority Queues, Academic Press, New York, 1968.
[5] Z. Khalil, B. Dimitrov and P. Petrov, On the total execution time on an unreliable server with explicit breakdowns, in: Trans. IEEE on Reliability, 1990 (to appear).

DEPT. OF MATH. & STAT., CONCORDIA UNIV., MONTREAL, QUEBEC, H4B 1R6 CANADA

Mark Finkelstein and Howard G. Tucker

ON THE DISTRIBUTION OF THE WILCOXON RANK-SUM STATISTIC

Under the null hypothesis of equal distributions in the two-sample Wilcoxon test, the distribution of the sum of the ranks of one sample within the pooled ordered sample is referred to here as the Wilcoxon distribution. If the sum of the ranks of one sample within the pooled ordered sample has the Wilcoxon distribution for infinitely many sizes of the other sample, then the two population distributions are equal. In a special case it is shown that this is a best result, and a theorem is proved that indicates what it is sufficient to prove in general in order to prove that this is a best result.

1. Introduction and Summary. A random variable $W$ is said to have the Wilcoxon $(m, m+n)$-distribution, or $W(m, m+n)$-distribution, if its distribution function is that of the sum of a simple random sample of size $m$ taken without replacement from the integers $1, 2, \ldots, m+n$. This distribution arises in the following situation. If $X_1, \ldots, X_m, Y_1, \ldots, Y_n$ are independent random variables where $X_1, \ldots, X_m$ are independent and identically distributed (i.i.d.) with common continuous distribution function $F$ and $Y_1, \ldots, Y_n$ are i.i.d. with common continuous distribution function $G$, if $Z$ is the sum of the ranks of the $X_i$'s in the pooled ordered sample and if the null hypothesis $F = G$ is true, then $Z$ has the $W(m, m+n)$-distribution. This can be found in any nonparametric text (see, e.g., Lehmann [1]). The question considered here is: if $Z$ has the $W(m, m+n)$-distribution, does this imply $F = G$? The following partial answers are obtained. If $X_1, \ldots, X_m, Y_1, Y_2, \ldots$ are independent random variables, if $X_1, \ldots, X_m$ have common continuous distribution function $F$, and $Y_1, Y_2, \ldots$ have common continuous distribution function $G$, and if the sum of the ranks of $X_1, \ldots, X_m$ in the pooled ordered sample $X_1, \ldots, X_m, Y_1, \ldots, Y_n$ has the $W(m, m+n)$-distribution for infinitely many positive integers $n$, then $F = G$. Possibly this is a best result.
In order to show that it is, one would have to prove that for arbitrary positive integers $m$ and $n$ there exist distinct continuous distribution functions $F$ and $G$ such that $Z$ as defined above has the $W(m, m+n)$-distribution. We are able to show this here for $m = 1$ and $n$ arbitrary. For arbitrary $m \ge 2$ and arbitrary $n$ we show that it is sufficient to prove the following: there exists a continuous distribution function $H$ satisfying $H(0) = 0$, $H(1) = 1$ and $H$ not the uniform distribution function over $[0, 1]$ such that if $U_1, \ldots, U_m$ are i.i.d. ($H$) with order statistics $U_{(1)} < U_{(2)} < \cdots < U_{(m)}$, then

$$E\Bigl(\prod_{j=1}^m U_{(j)}^{r_j}\Bigr) = m! \Big/ \prod_{j=1}^m (r_1 + \cdots + r_j + j)$$

for all $m$-tuples $(r_1, \ldots, r_m)$ of nonnegative integers which satisfy $r_1 + \cdots + r_m \le n$. To take away any possible mystery of this result, we note that the right hand side of the above equation is equal to $E\bigl(\prod_{j=1}^m V_{(j)}^{r_j}\bigr)$, where $V_{(1)} < \cdots < V_{(m)}$ are the order statistics of a sample of size $m$ on the uniform $[0, 1]$ distribution.
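The mixed moments of uniform order statistics mentioned at the end of this paragraph have a simple closed form: integrating the joint density $m!$ over $0 < v_1 < \cdots < v_m < 1$ one variable at a time gives $m! / \prod_{j=1}^m (r_1 + \cdots + r_j + j)$. The Python sketch below evaluates this exactly with rational arithmetic; the function name `uniform_order_stat_moment` is an illustrative assumption.

```python
from fractions import Fraction
from math import factorial


def uniform_order_stat_moment(r):
    """E[prod_j V_(j)^{r_j}] for the order statistics of m = len(r) i.i.d. U[0,1] variables.

    Uses the closed form m! / prod_{j=1}^m (r_1 + ... + r_j + j), obtained by
    integrating the joint density m! over 0 < v_1 < ... < v_m < 1.
    """
    m = len(r)
    denom = 1
    partial = 0
    for j, rj in enumerate(r, start=1):
        partial += rj          # r_1 + ... + r_j
        denom *= partial + j
    return Fraction(factorial(m), denom)


if __name__ == "__main__":
    # E V_(2) for a sample of size 3 is 2 / (3 + 1).
    print(uniform_order_stat_moment([0, 1, 0]))
```

For a single nonzero exponent the result agrees with the familiar Beta moments of $V_{(j)}$, which gives an independent check of the formula.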

A few words are in order concerning notation. If $X$ is a random variable, then $F_X(x)$ will denote its distribution function, and $\hat F_X(u)$ will denote its characteristic function. If $Z, Z_1, Z_2, \ldots$ are random variables, then $Z_n \xrightarrow{\mathcal{L}} Z$ means $F_{Z_n}(x) \to F_Z(x)$ as $n \to \infty$ at all values of $x$ at which $F_Z$ is continuous. We shall write $X$ is $U[0, 1]$ to mean: $X$ has the uniform distribution over the interval $[0, 1]$.
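For small $m$ and $n$ the $W(m, m+n)$-distribution defined in the Introduction can be tabulated directly from its definition, by enumerating all samples of size $m$ drawn without replacement from $1, \ldots, m+n$. The following Python sketch does this by brute force; the function name `wilcoxon_pmf` is a hypothetical helper, and the enumeration is only practical for small sample sizes.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def wilcoxon_pmf(m, n):
    """Exact probability mass function of the W(m, m+n)-distribution.

    Every m-subset of {1, ..., m+n} is equally likely; the statistic is the
    sum of its elements.
    """
    counts = Counter(sum(c) for c in combinations(range(1, m + n + 1), m))
    total = sum(counts.values())
    return {w: Fraction(k, total) for w, k in sorted(counts.items())}


if __name__ == "__main__":
    for w, p in wilcoxon_pmf(2, 2).items():
        print(w, p)
```

For $m = 1$ the result is the uniform distribution on $\{1, \ldots, n+1\}$, which is the case treated in Section 3.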

2. The Main Result. In order to prove our main result we shall need two lemmas.

Lemma 1. If $W_n$ has the $W(m, m+n)$-distribution, then $W_n/(m+n) \xrightarrow{\mathcal{L}} W$ as $n \to \infty$, where $W = \sum_{j=1}^m U_j$ and where $U_1, \ldots, U_m$ are independent and each is $U[0, 1]$.

Proof. Let $Z_1, \ldots, Z_m$ be a simple random sample without replacement of size $m$ from $1, 2, \ldots, m+n$. Let $0 < x_i < 1$ for $1 \le i \le m$ be arbitrary, and let $i_1, i_2, \ldots, i_m$ be a permutation of $1, 2, \ldots, m$ such that $x_{i_1} \le x_{i_2} \le \cdots \le x_{i_m}$. Since the joint distributions of $Z_1, \ldots, Z_m$ and of $Z_{i_1}, \ldots, Z_{i_m}$ are the same, it follows that

$$P\Bigl(\bigcap_{j=1}^m [Z_j/(m+n) \le x_j]\Bigr) = P\Bigl(\bigcap_{j=1}^m [Z_{i_j}/(m+n) \le x_{i_j}]\Bigr) = \prod_{j=0}^{m-1} \frac{[x_{i_{j+1}}(m+n)] - j}{m+n-j} \to \prod_{j=1}^m x_j \quad \text{as } n \to \infty,$$

where $[\cdot]$ denotes the integer part. Hence $(m+n)^{-1} \sum_{j=1}^m Z_j \xrightarrow{\mathcal{L}} \sum_{j=1}^m U_j$, where $U_1, \ldots, U_m$ are independent and each $U[0, 1]$. □
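The counting step behind the proof of Lemma 1 can be verified by brute force for small $m + n$: for integer thresholds $k_1, \ldots, k_m$, the probability $P(Z_1 \le k_1, \ldots, Z_m \le k_m)$ equals the product of $(k_{(j+1)} - j)/(m + n - j)$ over the sorted thresholds. The Python sketch below compares this product with a full enumeration of ordered samples; both function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations


def joint_prob_enumerated(total, thresholds):
    """P(Z_1 <= k_1, ..., Z_m <= k_m) by enumerating all ordered samples
    without replacement of size m from {1, ..., total}."""
    m = len(thresholds)
    samples = hits = 0
    for z in permutations(range(1, total + 1), m):
        samples += 1
        if all(zj <= kj for zj, kj in zip(z, thresholds)):
            hits += 1
    return Fraction(hits, samples)


def joint_prob_product(total, thresholds):
    """Same probability via the product over the sorted thresholds."""
    p = Fraction(1)
    for j, k in enumerate(sorted(thresholds)):
        # After j smaller values have been placed, k - j admissible values remain.
        p *= Fraction(max(k - j, 0), total - j)
    return p


if __name__ == "__main__":
    print(joint_prob_enumerated(6, (3, 3, 5)), joint_prob_product(6, (3, 3, 5)))
```

Replacing $k_j$ by $[x_j(m+n)]$ and letting $n \to \infty$ with $m$ fixed turns each factor into $x_j$, which is the limit used in the lemma.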

Lemma 2. If $X, Y_1, Y_2, \ldots$ are independent random variables, and if $Y_1, Y_2, \ldots$ are i.i.d. with common distribution function $G$, then $n^{-1} \sum_{j=1}^n 1_{[Y_j \le X]} \to G(X)$ as $n \to \infty$ over a set of probability one.

Proof. Let $P_X, P_{Y_1}, P_{Y_2}, \ldots$ be the induced probability measures over $(\mathbb{R}^1, \mathcal{B}^{(1)})$ of $X, Y_1, Y_2, \ldots$ respectively; e.g., for $B \in \mathcal{B}^{(1)}$, $P_X(B) = P[X \in B]$. Now consider the infinite product measure space $(\Omega, \mathcal{A}, P)$, where $\Omega = \mathbb{R}^\infty$, $\mathcal{A} = \mathcal{B}^{(\infty)}$ and $P = P_X \times \prod_{j=1}^\infty P_{Y_j}$. For each $(t_0, t_1, t_2, \ldots) \in \Omega$ define $X(t_0, t_1, t_2, \ldots) = t_0$, and, for $j \ge 1$, define $Y_j(t_0, t_1, t_2, \ldots) = t_j$. Then the joint distribution of $X, Y_1, Y_2, \ldots$ as defined over this product probability space is the same as in the hypothesis. If we denote $G_n(x) = n^{-1} \sum_{j=1}^n 1_{[Y_j \le x]}$, then by the Glivenko-Cantelli theorem $G_n(x) \to G(x)$ as $n \to \infty$ uniformly in $x$ with probability one, and thus uniformly over the set $\{X(\omega) : \omega \in \Omega\}$. Thus $G_n(X) \to G(X)$ as $n \to \infty$ a.s. □

Theorem 1. Let $X_1, \ldots, X_m, Y_1, Y_2, \ldots$ be independent random variables, where each $X_i$ has the same continuous distribution function $F$, $1 \le i \le m$, and each $Y_j$ has the same continuous distribution function $G$, $j \ge 1$. Suppose that, for infinitely many positive integers $n$, the sum of the ranks of $X_1, \ldots, X_m$ within the pooled ordered sample $X_1, \ldots, X_m, Y_1, \ldots, Y_n$ has the $W(m, m+n)$-distribution. Then $F = G$.


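Proof. Let $R_n(X_i)$ denote the rank of $X_i$ in the pooled ordered sample $X_1, \ldots, X_m, Y_1, \ldots, Y_n$. Then, for each fixed $i$, $1 \le i \le m$,

$$R_n(X_i) = \sum_{j=1}^n 1_{[Y_j \le X_i]} + \sum_{k=1}^m 1_{[X_k \le X_i]},$$

and, by Lemma 2, $R_n(X_i)/(m+n) \to G(X_i)$ a.s. as $n \to \infty$. By hypothesis, $\sum_{j=1}^m R_n(X_j)$ has the $W(m, m+n)$-distribution for infinitely many values of $n$. Hence by Lemma 1,

$$(m+n)^{-1} \sum_{j=1}^m R_n(X_j) \xrightarrow{\mathcal{L}} \sum_{j=1}^m U_j \quad \text{along these values of } n,$$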

where $U_1, \ldots, U_m$ are independent and each is $U[0, 1]$. Thus, for this fixed value of $m$, $\sum_{j=1}^m U_j$ and $\sum_{j=1}^m G(X_j)$ have the same distribution functions, i.e., $(\hat F_U(u))^m = (\hat F_{G(X)}(u))^m$ for all $u \in \mathbb{R}^1$, where $U$ is $U[0, 1]$, and $G(X)$ has the same distribution function as $G(X_1)$. We wish to show that this implies that $U$ and $G(X)$ have the same distribution functions. Note that $\hat F_U(u)$ has zeros, so we have difficulties in taking $m$th roots of both sides. Both $U$ and $G(X)$ are bounded by 0 and 1, and, by Theorem 7.2.1 in Lukacs [2], we obtain that $\hat F_U(u)$ and $\hat F_{G(X)}(u)$ are restrictions to the real axis in $\mathbb{C}$ of entire functions which will be denoted by $\hat F_U(z)$ and $\hat F_{G(X)}(z)$, all $z \in \mathbb{C}$. Hence $(\hat F_U(z))^m$ and $(\hat F_{G(X)}(z))^m$ are entire functions, and since they are equal over the real axis, they are equal over the complex plane $\mathbb{C}$. At $z = 0$, both of these entire functions are equal to 1; hence there exists a neighborhood $N(0)$ of $0 \in \mathbb{C}$ over which both functions do not vanish. Over $N(0)$ we take the principal branch of the logarithm of both functions, obtaining $m \log \hat F_U(z) = m \log \hat F_{G(X)}(z)$ for all $z \in N(0)$. Cancelling and exponentiating we get

$$\hat F_U(z) = \hat F_{G(X)}(z) \quad \text{for all } z \in N(0).$$

But both functions are entire and are equal over an uncountable set, hence they are equal over $\mathbb{C}$, and hence over the real axis. Thus $U$ and $G(X)$ have the same distribution function, namely, $U[0, 1]$. Finally, we prove $F = G$. Since $G$ is a nondecreasing function, we have $\{X \le t\} = \{G(X) \le G(t)\}$. Taking probabilities of both sides we obtain, for all $t \in \mathbb{R}^1$,

$$F(t) = P\{X \le t\} = P\{G(X) \le G(t)\} = P\{U \le G(t)\} = G(t),$$

i.e., $F = G$. □

There remains the problem of showing that Theorem 1 cannot be improved. In order to do this we must show that for every pair of positive integers $m$, $n$ there exist distinct continuous distribution functions $F$, $G$ such that if $X_1, \ldots, X_m, Y_1, \ldots, Y_n$ are independent random variables with $X_1, \ldots, X_m$ being i.i.d. ($F$) and $Y_1, \ldots, Y_n$ being i.i.d. ($G$), then the sum of the ranks of the $X_i$'s in the pooled ordered sample has the $W(m, m+n)$-distribution. We shall prove this in Section 3 for $m = 1$ and $n$ arbitrary. For the case $m \ge 2$ and $n$ arbitrary, we are unable to show this, but we can show in Section 4 that it is sufficient to prove the following: the set of all mixed moments of total order not greater than $n$ of the $m$ order statistics of $X_1, \ldots, X_m$, which are i.i.d. with common continuous distribution function $F$, does not determine $F$.

3. The Case $m = 1$. We present a sequence of lemmas needed to prove the main result in the case $m = 1$. Let $C[0, 1]$ denote the real linear space of all real-valued continuous functions defined over $[0, 1]$. For $g \in C[0, 1]$, we define $\|g\|$ by $\|g\| = \max\{|g(x)| : 0 \le x \le 1\}$, which provides a uniform norm topology over $C[0, 1]$. Consider the cone $C_+$ defined by $C_+ = \{g \in C[0, 1] : g(x) > 0, \text{ all } x \in [0, 1]\}$.

Lemma 3. $C_+$ is an open set in $C[0, 1]$ with respect to the norm topology.

Proof. Let $h \in C_+$. Then let $\delta$ be defined by $\delta = \min\{h(x) : 0 \le x \le 1\}$. Since $h \in C[0, 1]$, then $\delta > 0$. This implies $\{g \in C[0, 1] : \|h - g\| < \delta\} \subset C_+$. □

Lemma 4. If $L_j \colon C[0, 1] \to \mathbb{R}^1$, $0 \le j \le k$, are linear functionals defined by $L_j g = \int_0^1 x^j g(x)\, dx$, then there exists $g \in C[0, 1]$ such that

(i) $g \ne 1_{[0,1]}$ and

(ii) $L_j g = 1/(j+1)$, $0 \le j \le k$.

Proof. For constants $a_1, \ldots, a_{k+1}$ to be determined, define $g(x) = \sum_{i=1}^{k+1} a_i x^i$. Note that $g$ is a polynomial with no constant term. Let us set

$$L_j g = \int_0^1 \Bigl(\sum_{i=1}^{k+1} a_i x^i\Bigr) x^j\, dx = \sum_{i=1}^{k+1} \frac{a_i}{i+j+1} = \frac{1}{j+1}, \quad 0 \le j \le k.$$

This system of linear equations has a unique non-null solution since the $(k+1) \times (k+1)$ matrix $A = \bigl((i+j+1)^{-1}\bigr)$ is positive definite; this last assertion follows by the fact that $\int_0^1 \bigl(\sum_{j=1}^{k+1} b_j x^j\bigr)^2 dx > 0$ if not all $b_j$'s are zero. Thus (ii) is satisfied. Also, (i) is satisfied, since $g$ has no constant term. □

Note that the function $g$ determined in Lemma 4 is not necessarily a density; it could be negative for some value of $x$. Thus we need the next lemma.

Lemma 5. For fixed positive integer $k$, there exists a density $f$ which is not the density of a $U[0, 1]$ random variable, such that $f(x) = 0$ for $x \notin [0, 1]$ and such that

$$\int_0^1 x^j f(x)\, dx = (j+1)^{-1}, \quad 0 \le j \le k.$$
Proof. Let $k$ be a fixed positive integer, and let $g$ be as in Lemma 4. Note that $1_{[0,1]} \in C_+$, and $1_{[0,1]}$ satisfies $L_j 1_{[0,1]} = (j+1)^{-1}$, $0 \le j \le k$. Hence for every $\theta \in [0, 1]$, if we define

$$g_\theta(x) = \theta g(x) + (1 - \theta) 1_{[0,1]}(x),$$

then $L_j g_\theta = (j+1)^{-1}$, $0 \le j \le k$. Hence by Lemma 3, $g_\theta \in C_+$ for $\theta > 0$ sufficiently small. For every such $\theta$, $g_\theta$ is a density function of a random variable which is not $U[0, 1]$. □

We are now able to show that Theorem 1 is a best possible result when $m = 1$ and $n$ is arbitrary.
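The construction in Lemmas 4 and 5 is fully explicit and can be carried out with exact rational arithmetic. The Python sketch below solves the linear system for the coefficients $a_1, \ldots, a_{k+1}$ of the polynomial without constant term and evaluates the moments of the mixture with the uniform density. It is a minimal illustration under stated assumptions: the function names are hypothetical, and plain Gauss-Jordan elimination is used in place of any particular solver.

```python
from fractions import Fraction


def solve_lemma4(k):
    """Coefficients a_1..a_{k+1} of g(x) = sum_i a_i x^i with
    int_0^1 x^j g(x) dx = 1/(j+1) for j = 0..k (exact Gauss-Jordan elimination)."""
    n = k + 1
    # Row j holds the coefficients 1/(i+j+1), i = 1..k+1, followed by the right-hand side.
    rows = [[Fraction(1, i + j + 1) for i in range(1, n + 1)] + [Fraction(1, j + 1)]
            for j in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pv = rows[col][col]
        rows[col] = [v / pv for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]


def moment(coeffs, j):
    """int_0^1 x^j g(x) dx for g(x) = sum_i coeffs[i-1] x^i."""
    return sum(a / (i + j + 1) for i, a in enumerate(coeffs, start=1))


def mixture_moment(coeffs, theta, j):
    """j-th moment of g_theta = theta * g + (1 - theta) * 1_[0,1]."""
    return theta * moment(coeffs, j) + (1 - theta) * Fraction(1, j + 1)


if __name__ == "__main__":
    a = solve_lemma4(1)
    print(a)  # coefficients of x and x^2
    print([mixture_moment(a, Fraction(1, 10), j) for j in range(2)])
```

For $k = 1$ the solution is $g(x) = 6x - 6x^2$, which happens to be nonnegative already; in general the mixing weight $\theta$ has to be taken small enough for $g_\theta$ to stay positive, as in the proof of Lemma 5.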

Theorem 2. For any positive integer $k$ there exist two distinct continuous distribution functions $F$ and $G$ such that if $X, Y_1, \ldots, Y_k$ are independent random variables with corresponding distribution functions $F, G, \ldots, G$, then the rank of $X$ in the pooled ordered sample is uniform over $\{1, 2, \ldots, k+1\}$.

Proof. We shall take $G$ to be $U[0, 1]$. Let $R$ denote the rank of $X$ in the pooled ordered sample. We shall show that there exists a continuous distribution function $F$ such that, under the hypothesis of the theorem, $P(R = j+1) = (k+1)^{-1}$ for $0 \le j \le k$. Now, for any continuous distribution function $F$ of $X$ satisfying $F(0) = 0$ and $F(1) = 1$, we have

$$P(R = j+1) = \binom{k}{j} P\Bigl(\bigcap_{i=1}^{j} [Y_i < X] \cap \bigcap_{i=j+1}^{k} [Y_i > X]\Bigr) = \binom{k}{j} \int_0^1 x^j (1-x)^{k-j}\, dF(x).$$

We must show that there exists a continuous $F$ such that

$$\binom{k}{j} \int_0^1 x^j (1-x)^{k-j}\, dF(x) = (k+1)^{-1} \quad \text{for } 0 \le j \le k.$$

This is a system of $k+1$ linear equations in $\mu_0, \mu_1, \ldots, \mu_k$, where $\mu_j = \int_0^1 x^j\, dF(x)$. This system has one solution, namely

$$\mu_j = (j+1)^{-1}, \quad 0 \le j \le k,$$

obtained when $F(x) = x$, $0 \le x \le 1$. Thus, all we need is a density concentrated over $[0, 1]$ whose $j$th moment is $(j+1)^{-1}$ for $0 \le j \le k$ and which is not $U[0, 1]$. By Lemma 5, such a density does exist. □
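To see Theorem 2 at work, one can compute the rank distribution exactly for a non-uniform polynomial density on $[0, 1]$ whose first $k+1$ moments agree with those of the uniform law. The sketch below does this in Python; the two example densities (a uniform density perturbed by a shifted Legendre polynomial) are illustrative choices made here, not densities taken from the paper, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def beta_integral(a, b):
    """Exact value of int_0^1 x^a (1 - x)^b dx for nonnegative integers a, b."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def rank_pmf(density_coeffs, k):
    """[P(R = j + 1) for j = 0..k] when Y_1..Y_k are i.i.d. U[0,1] and X has the
    polynomial density sum_p density_coeffs[p] * x^p on [0, 1]."""
    return [
        comb(k, j) * sum(Fraction(a) * beta_integral(j + p, k - j)
                         for p, a in enumerate(density_coeffs))
        for j in range(k + 1)
    ]


if __name__ == "__main__":
    # k = 1: density 3/2 - 3x + 3x^2 (positive on [0, 1], not uniform).
    print(rank_pmf([Fraction(3, 2), -3, 3], 1))
    # k = 2: density 1/2 + 6x - 15x^2 + 10x^3 (at least 1/2 on [0, 1]).
    print(rank_pmf([Fraction(1, 2), 6, -15, 10], 2))
```

In both cases the rank of $X$ comes out uniform although $X$ is not uniformly distributed, which is the phenomenon the theorem asserts for every $k$.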

4. A Possible Approach. The problem of showing that Theorem 1 is a best possible result will be shown now to be solvable if one can show that a finite set of mixed moments of the order statistics of a sample on a continuous distribution function does not determine that distribution function. This is essentially what we did in Section 3 in the case $m = 1$.


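Lemma 6. If $U_1, \ldots, U_m$ are independent and $U[0, 1]$, if $U_{(1)} < \cdots < U_{(m)}$ are the order statistics of $U_1, \ldots, U_m$, and if $r_1, \ldots, r_m$ are integers $\ge 0$, then

$$E\Bigl(\prod_{j=1}^m U_{(j)}^{r_j}\Bigr) = m! \Big/ \prod_{j=1}^m (r_1 + \cdots + r_j + j).$$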

Now apply Theorem 2.9 to $P = P_1$ and $b_n := a^{-1} c_n$, $n \ge 2$.

d) (ii) $\Rightarrow$ (i) is obvious.


(iii) $\Rightarrow$ (ii): In c) we proved that $(\mu_t)$ is stable w.r.t. $(a_t) \subseteq \Gamma$; this especially implies $\mu_t \in \Gamma(\mu)$, $t > 0$. Now b) yields for $a, b \in \Gamma$, $c = \theta(a, b)$:

Hence the assertion. Now assume (ii) to hold, $\psi$ independent of $t$. Let $a, b \in \Gamma$, $c := \psi(a, b)$. Then for any $t > 0$

Therefore

$$a(A) + b(A) = c(A), \quad \text{i.e.} \quad \theta(a, b) = c = \psi(a, b) \mod \mathfrak{I}(\mu_t). \qquad \square$$

3.4. Remark. If $G = \mathbb{R}^d$ the commutativity assumption is automatically fulfilled. Moreover the convolution semigroup $(\mu_t)$ and hence $A$ is uniquely determined by $\mu = \mu_1$. Therefore in this case the conditions (i), (ii) and (iii) are equivalent.

Proof. Assume (i). Let $a, b, c \in \Gamma$ with

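Then $(\nu_t := a(\mu_t) * b(\mu_t))_{t \ge 0}$ and $(\lambda_t := c(\mu_t))_{t \ge 0}$ are convolution semigroups with $\nu_1 = \lambda_1$. Hence by the uniqueness property $\nu_t = \lambda_t$, $t \ge 0$, i.e.

$$a(\mu_t) * b(\mu_t) = c(\mu_t).$$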

Hence we have proved (ii) and the independence of $\psi$ of $t$. □

The investigation of stable semigroups on $G$ resp. of stable generating distributions $A$ can be reduced to the investigation of certain operator stable laws on the vector space $\mathfrak{G}$, i.e. the Lie algebra of $G$. Similarly, the investigation of probabilities with idempotent infinitesimal $\Gamma$-types may be translated by 3.4 to an analogous problem on the vector space $\mathfrak{G}$. Let, for $f \in \mathcal{D}(G)$,

$$\overset{\circ}{f} := f \circ \log \in \mathcal{D}(\mathfrak{G}).$$

For a generating distribution $A$ on $G$ let $\overset{\circ}{A}$ be the generating distribution on $\mathfrak{G}$ defined by

$$\langle \overset{\circ}{A}, \overset{\circ}{f} \rangle := \langle A, f \rangle.$$

If $(\mu_t)$ is the convolution semigroup on $G$ generated by $A$, let $(\nu_t)$ be the c.c.s. on $\mathfrak{G}$ generated by $\overset{\circ}{A}$. For $a \in B$ let $\overset{\circ}{a} := da$ be the differential of $a$ (see [4, 5, 6, 7]).

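3.5. Theorem. (i) $A$ has idempotent infinitesimal $\Gamma$-types iff

(ii) $\overset{\circ}{A}$ has idempotent infinitesimal $\overset{\circ}{\Gamma}$-type (where $\overset{\circ}{\Gamma} := \{\overset{\circ}{a} : a \in \Gamma\} \subseteq \mathrm{Aut}(\mathfrak{G}) \subseteq \mathrm{GL}(\mathfrak{G})$, $\mathrm{Aut}(\mathfrak{G})$ being the group of Lie algebra automorphisms). Hence according to 3.4, (ii) is equivalent to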


(iii)

o

vI

0

hal idempotentf-type (i.e. i, I''t) are uniquely determined by >'1 ([22], [23,6, Prop. 6]), hence Vt.= jlt, t > 0, and therefore

$$a(A) + b(A) = c(A). \qquad \square$$

REFERENCES

[1] P. Baldi, Lois stables sur les déplacements de $\mathbb{R}^d$, in: Probability Measures on Groups, Lect. Notes Math., 706 (1979), pp. 1-9.
[2] P. Baldi, Unicité du plongement d'une mesure de probabilité dans un semi-groupe de convolution gaussien. Cas non-abélien, Math. Z., 188 (1985), pp. 411-417.
[3] W. Hazod, Stetige Halbgruppen von Wahrscheinlichkeit. und erzeugende Distributionen, Lect. Notes Math., Springer, 595 (1977).
[4] W. Hazod, Stable probabilities on locally compact groups, in: Probability Measures on Groups, Lect. Notes Math., Springer, 928 (1982), pp. 183-211.
[5] W. Hazod, Stable and semistable probabilities on groups and vector spaces, in: Probability Theory on Vector Spaces III, Lect. Notes Math., Springer, 1080 (1984), pp. 68-89.
[6] W. Hazod, Semigroupes de convolution [demi-] stables et autodécomposables sur les groupes localement compacts, in: Probabilités sur les structures géométriques, Publ. du Lab. Stat. et Prob., Univ. de Toulouse, 1985, pp. 57-85.
[7] W. Hazod, Stable probability measures on groups and on vector spaces: a survey, in: Probability Measures on Groups VIII, Lect. Notes Math., Springer, 1210 (1986), pp. 304-352.
[8] W. Hazod, On the decomposability group of a convolution semigroup, in: Probability Theory on Vector Spaces, Lect. Notes Math., 1391 (1989), pp. 99-111.
[9] W. Hazod, Über den Typenkonvergenzsatz auf zusammenhängenden Lie Gruppen, Mh. Math., 110 (1990), pp. 261-278.
[10] W. Hazod, Semistability and domains of attraction on compact extensions of nilpotent groups, in: Probability Measures on Groups IX, Proc. Oberwolfach (1990), (to appear).
[11] W. Hazod and S. Nobel, Convergence-of-types theorem for simply connected nilpotent Lie groups, in: Probability Measures on Groups IX, Lect. Notes Math., Springer, 1375 (1989), pp. 90-106.
[12] W. Hazod and E. Siebert, Continuous automorphism groups on a locally compact group contracting modulo a compact subgroup and applications to stable convolution semigroups, Semigroup Forum, 33 (1986), pp. 111-143.
[13] W. Hazod and E. Siebert, Automorphisms on a Lie group contracting modulo a compact subgroup and applications to semistable convolution semigroups, Theor. Probab. J., 1 (1988), pp. 211-226.
[14] H. Heyer, Probability Measures on Locally Compact Groups, Springer, Berlin-New York, 1977.
[15] W. N. Hudson, Z. J. Jurek and J. A. Veeh, The symmetry group and exponents of operator stable probability measures, Ann. Probab., 14 (1986), pp. 1014-1023.
[16] Z. J. Jurek, On stability of probability measures in Euclidean spaces, in: Probability Theory on Vector Spaces II, Lect. Notes Math., Springer, 828 (1988), pp. 129-145.
[17] Z. J. Jurek, Convergence of types, self-decomposability and stability of measures on linear spaces, in: Probability in Banach Spaces III, Lect. Notes Math., Springer, 860 (1981), pp. 257-284.
[18] E. Kehrer, Stabilität von Wahrscheinlichkeit. unter Operatorgruppen auf Banachräumen, Dissert., Univ. Tübingen, 1983.
[19] G. Kucharczak, On stable probability measures, Bull. Acad. Pol. Sci., 23 (1975), pp. 571-576.
[20] A. Luczak, Elliptical symmetry and characterization of operator-stable and operator semi-stable measures, Ann. Probab., 12 (1984), pp. 1217-1223.
[21] K. R. Parthasarathy and K. Schmidt, Stable positive definite functions, Trans. Amer. Math. Soc., Providence, 203 (1975), pp. 163-174.
[22] S. Nobel, Limit theorems for probability measures on simply connected nilpotent Lie groups, Theor. Probab. J., 4 (1991), pp. 261-284.
[23] S. Nobel, Grenzwertsätze für Wahrscheinlichkeit. auf einfach zusammenhängenden nilpotenten Liegruppen, Dissert., Univ. Dortmund, 1988.
[24] K. Sato, Strictly operator-stable distributions, J. Mult. Anal., 22 (1987), pp. 287-295.
[25] M. Sharpe, Operator stable probability measures on vector groups, Trans. Amer. Math. Soc., Providence, 136 (1969), pp. 51-65.
[26] K. Schmidt, Stable probability measures on $\mathbb{R}^\nu$, Z. Wahrsch. Verw. Geb., 33 (1975), pp. 19-31.
[27] E. Siebert, Einbettung unendlich teilbarer Wahrscheinlichkeit. auf topologischen Gruppen, Z. Wahrsch. Verw. Geb., 28 (1974), pp. 227-247.
[28] E. Siebert, Supplements to operator stable and operator-semistable laws on Euclidean spaces, J. Mult. Anal., 19 (1986), pp. 329-341.

MATH. INST. UNIV. DORTMUND, POSTFACH 500 500, D-4600 DORTMUND 50, GERMANY

Herbert Heyer

FUNCTIONAL LIMIT THEOREMS FOR RANDOM WALKS ON ONE-DIMENSIONAL HYPERGROUPS

Functional limit theorems or invariance principles concern the linear interpolation of sums of independent identically distributed random variables. The corresponding standardized continuous-time processes are considered as generalized random variables taking values in the Skorokhod space of cadlag functions equipped with the Skorokhod topology. It has been a challenging problem in classical probability theory to study the convergence in distribution of such interpolated processes. In the present paper we describe a method of extending classical functional limit theorems to one-dimensional generalized convolution structures or hypergroups in the sense of R. J. Jewett [11]. The results to be quoted are due to Hm. Zeuner [20]; they are based on central limit theorems proved previously for random walks on Sturm-Liouville and polynomial hypergroups by Hm. Zeuner [19] and M. Voit [15], [16] respectively.

1. The motivation: Cosine transformations.

Let $X$ and $Y$ denote two independent random vectors in $\mathbb{R}^p$ which are rotationally invariant in the sense that their distributions are rotationally invariant probability measures on $\mathbb{R}^p$. Then also the vector $X + Y$ is rotationally invariant. By the euclidean cosine theorem we have

$$|X + Y|^2 = |X|^2 + |Y|^2 + 2|X||Y| \cos\Lambda,$$

where the random angle $\Lambda := \angle(X, Y)$ is independent of $|X|$ and $|Y|$. The distribution of $|X + Y|$ can be computed for bounded continuous functions $f$ on $\mathbb{R}_+$ as

$$\int f\, dP_{|X+Y|} = c_p \int_0^\infty \!\! \int_0^\infty \!\! \int_0^\pi f\bigl((x^2 + y^2 + 2xy\cos\vartheta)^{1/2}\bigr) \sin^{p-2}\vartheta\, d\vartheta\, P_{|X|}(dx)\, P_{|Y|}(dy) =: \int f\, d(P_{|X|} * P_{|Y|}) = P_{|X|} * P_{|Y|}(f)$$

with a constant $c_p := \Gamma(p/2)\, \pi^{-1/2}\, [\Gamma((p-1)/2)]^{-1}$, where the convolution $*$ is an associative operation on the set $M^1(\mathbb{R}_+)$ of all probability measures on $\mathbb{R}_+$.
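For two point masses the convolution above reduces to a one-dimensional integral over the angle, which is easy to evaluate numerically. The following Python sketch approximates $\int f\, d(\varepsilon_x * \varepsilon_y)$ by a midpoint rule; the function names, the quadrature and the choice of integer $p \ge 2$ in the examples are assumptions made for this illustration.

```python
import math


def normalizing_constant(p):
    """c_p = Gamma(p/2) / (sqrt(pi) * Gamma((p-1)/2)), so that
    c_p * int_0^pi sin^(p-2)(theta) d(theta) = 1."""
    return math.gamma(p / 2.0) / (math.sqrt(math.pi) * math.gamma((p - 1) / 2.0))


def point_mass_convolution(f, x, y, p, n=20000):
    """Midpoint-rule approximation of the integral of f against the convolution
    of the point masses at x and y (radial parts of rotation invariant vectors in R^p)."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        z = math.sqrt(max(x * x + y * y + 2.0 * x * y * math.cos(theta), 0.0))
        total += f(z) * math.sin(theta) ** (p - 2)
    return normalizing_constant(p) * h * total


if __name__ == "__main__":
    # Total mass and second moment of the convolution of point masses at 1 and 2 in R^3.
    print(point_mass_convolution(lambda z: 1.0, 1.0, 2.0, 3))
    print(point_mass_convolution(lambda z: z * z, 1.0, 2.0, 3))
```

Since the angular density is symmetric about $\pi/2$, the second moment of the convolution of the two point masses is $x^2 + y^2$ for every $p$, which gives a simple consistency check of the normalisation.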


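In order to get a hand on this convolution one classically applies some harmonic analysis via Bessel functions. These are defined by the differential equation

$$j_\beta'' + \frac{\beta}{x}\, j_\beta' + j_\beta = 0, \qquad j_\beta(0) = 1, \quad j_\beta'(0) = 0,$$

with $\beta := p - 1$. Introducing the modified Bessel functions $\varphi_\lambda$ defined by $\varphi_\lambda(x) := j_\beta(\lambda x)$ for all $\lambda$ and $x \in \mathbb{R}_+$ we obtain the identity

$$\varphi_\lambda(x)\, \varphi_\lambda(y) = \int \varphi_\lambda\, d(\varepsilon_x * \varepsilon_y)$$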

and with the natural definition of the Hankel transform $\hat\mu(\lambda) := \int \varphi_\lambda\, d\mu$ for all $\lambda \in \mathbb{R}_+$ whenever $\mu$ belongs to the space $M^b(\mathbb{R}_+)$ of bounded measures on $\mathbb{R}_+$, this reads as

$$(\mu * \nu)^\wedge = \hat\mu\, \hat\nu.$$

Replacing the euclidean cosine transformation

$$z^2 = x^2 + y^2 + 2xy \cos\vartheta$$

by the spherical one (G. Gasper [8], N. H. Bingham [2])

$$\cos z = \cos x \cos y + \sin x \sin y \cos\vartheta$$

or by the hyperbolic one (M. Flensted-Jensen, T. H. Koornwinder [5], Hm. Zeuner [17])

$$\cosh z = \cosh x \cosh y + \sinh x \sinh y \cos\vartheta$$

a similar approach leads to convolution structures arising from spherically or hyperbolically invariant random vectors respectively.

2. The framework: Sturm-Liouville and polynomial hypergroups.

In the subsequent discussion we shall extend the special situation described above to more general exponents $p$. This program leads us to the notion of a hypergroup. Roughly speaking, a hypergroup is a locally compact space $K$ together with a convolution $*$ in $M^b(K)$ such that $(M^b(K), *)$ becomes a (commutative) Banach algebra and such that (among others) the following axioms hold:

(HG1) For any $x, y \in K$ the convolution product $\varepsilon_x * \varepsilon_y$ of two Dirac measures $\varepsilon_x$ and $\varepsilon_y$ is a probability measure on $K$ with compact (not necessarily singleton) support.

(HG2) The mapping $(x, y) \mapsto \mathrm{supp}(\varepsilon_x * \varepsilon_y)$ from $K \times K$ into the space of compact subsets of $K$, furnished with the Hausdorff-Michael topology, is continuous.


For the full axiomatics as well as for properties and examples of commutative hypergroups the reader is referred to [9] and [10].

I. Sturm-Liouville (SL) hypergroups $(\mathbb{R}_+, *)$. For Sturm-Liouville functions $A \in C(\mathbb{R}_+)$ with $A(x) > 0$ for all $x > 0$ such that $\mathrm{Res}_{\mathbb{R}_+^\times} A \in C^1(\mathbb{R}_+^\times)$ we consider the following properties.

(SL1) $A'A^{-1}(x) = \alpha_0/x + \alpha_1(x)$ for all $x$ near 0, with $\alpha_0 \ge 0$ such that either

(SL1a) (Singularity at 0): $\alpha_0 > 0$, $\alpha_1 \in C^\infty(\mathbb{R})$ and odd (which implies that $A(0) = 0$) or

(SL1b) (Regularity at 0): $\alpha_0 = 0$, $\alpha_1 \in C^1(\mathbb{R}_+)$ (which implies that $A(0) > 0$).

(SL2) There exists a function $\alpha_2 \in C^1(\mathbb{R}_+)$ such that $\alpha_2(0) \ge 0$, $A'A^{-1} - \alpha_2 \ge 0$, decreasing on $\mathbb{R}_+^\times$, and that

$$q := \tfrac{1}{2}\alpha_2' - \tfrac{1}{4}\alpha_2^2 + \frac{A'}{2A}\,\alpha_2$$

is decreasing on $\mathbb{R}_+^\times$.

Subclasses of SL functions are

(SL1a') (H. Chebli [3], K. Trimeche [14]) $A'A^{-1} \ge 0$ and decreasing on $\mathbb{R}_+^\times$ (such that $\alpha_2$ and $q$ can be chosen 0).

(SL1b') (B. M. Levitan [12]) $A \in C^2(\mathbb{R}_+)$, $A'(0) \ge 0$, and $q$ is decreasing (with $\alpha_2 := A'A^{-1}$).

The number

$$\rho := \tfrac{1}{2} \lim_{x \to \infty} A'A^{-1}(x)$$

is called the index of the SL function $A$. For any SL function $A$ (satisfying (SL1) and (SL2)) we introduce the SL operator $L_A$ on $C^2$ by $L_A f := -f'' - A'A^{-1} f'$. Moreover we define the operator $\ell$ on $C^2$ by

$$\ell[u](x, y) := (L_A)_x u(x, y) - (L_A)_y u(x, y) = -u_{xx}(x, y) - A'A^{-1}(x)\, u_x(x, y) + u_{yy}(x, y) + A'A^{-1}(y)\, u_y(x, y)$$

with the usual notation for the partial derivatives. A hypergroup $(\mathbb{R}_+, *)$ is said to be a SL hypergroup if there exists a SL function $A$ such that, given any real-valued function $f$ on $\mathbb{R}_+$ which is the restriction of an even $C^\infty$-function $\ge 0$ on $\mathbb{R}$, the function $u_f$ defined by

$$u_f(x, y) := \int f\, d(\varepsilon_x * \varepsilon_y)$$

for all $(x, y) \in \mathbb{R}_+^2$ is twice differentiable and satisfies the partial differential equation

$$\ell[u_f] = 0, \qquad (u_f)_y(x, 0) = 0 \quad \text{for all } x \in \mathbb{R}_+^\times.$$

It has been shown (H. Chebli [3], B. M. Levitan [12], Hm. Zeuner [20]) that for SL functions $A$ satisfying (SL1) and (SL2) there exists a SL hypergroup $(\mathbb{R}_+, *(A))$ associated with $A$ such that for all $x, y \in \mathbb{R}_+$

$$\mathrm{supp}(\varepsilon_x * \varepsilon_y) \subset [|x - y|, x + y].$$

In the case of SL hypergroups of type (SL1a') this inclusion turns out to be an equality. We note that $A \cdot \lambda_{\mathbb{R}_+}$ is a Haar measure of the SL hypergroup $(\mathbb{R}_+, *(A))$, and that the multiplicative functions of $(\mathbb{R}_+, *(A))$ defined by the integral equation

$\tau_x(dt)$

whenever $\lambda \in \mathbb{C}$. It turns out that every SL hypergroup has property (SL3). Finally we list the property

(SL4) $\lim_{x \to \infty} x\, A'A^{-1}(x)$ exists (in the case that $\rho = 0$).

Examples.

(A) The Bessel-Kingman hypergroup is a SL hypergroup $(\mathbb{R}_+, *(A))$ with $A$ given by $A(x) := x^\beta$ for all $x \in \mathbb{R}_+$ ($\beta > 0$). This hypergroup is of subexponential growth ($\rho = 0$). For $\beta := p - 1$ ($p \ge 2$) we are back in the case of rotation invariance (Euclidean cosine transformation) of the preceding section.

(B) The Jacobi hypergroup is a SL hypergroup $(\mathbb{R}_+, *(A))$ with $A$ given by

$$A(x) := \sinh^\beta x \quad \text{for all } x \in \mathbb{R}_+ \quad (\beta > 0).$$


This hypergroup is of exponential growth ($\rho > 0$). For $\beta := p - 1$ ($p \ge 2$) the special case of Lorentz invariance (hyperbolic cosine transformation) occurs.

II. Polynomial (P) hypergroups $(\mathbb{Z}_+, *)$. Consider sequences $(Q_n)_{n \ge 0}$ of polynomials $Q_n$ of degree $n$ on $\mathbb{R}$ which are normed in the sense that $Q_n(x_0) = 1$ for all $n \in \mathbb{Z}_+$ and some $x_0 \in \mathbb{R}$. If the sequence $(Q_n)_{n \ge 0}$ is nonnegatively linearized in the sense that $Q_m Q_n = \sum_{k \ge 0} c(m, n, k)\, Q_k$ with $c(m, n, k) \ge 0$ for all $m, n, k \in \mathbb{Z}_+$, then the convolution

$$\varepsilon_m * \varepsilon_n := \sum_{k \ge 0} c(m, n, k)\, \varepsilon_k$$

defines a P hypergroup $(\mathbb{Z}_+, *(Q_n))$ with $\mathrm{supp}(\varepsilon_m * \varepsilon_n) \subset [|m - n|, m + n] \cap \mathbb{Z}_+$. For a P hypergroup $(\mathbb{Z}_+, *(Q_n))$ we consider the following properties in terms of sequences $(a_n)_{n \ge 0}$, $(b_n)_{n \ge 0}$ and $(c_n)_{n \ge 0}$ in $\mathbb{R}$ defined by $a_n := \varepsilon_n * \varepsilon_1(\{n+1\})$, $b_n := \varepsilon_n * \varepsilon_1(\{n\})$ and $c_n := \varepsilon_n * \varepsilon_1(\{n-1\})$ ($n \ge 0$), for which it is assumed that the limits $\alpha := \lim_{n \to \infty} a_n > 0$, $\gamma := \lim_{n \to \infty} c_n > 0$ and $\beta := \lim_{n \to \infty} b_n \ge 0$ exist.

(P1) There exists an $n_0 \ge 1$ such that $a_n \ge c_n$ for all $n \ge n_0$ and $(b_n)_{n \ge n_0}$ is a monotone sequence.

(P2) $\zeta := \lim_{n \to \infty} n(a_n - c_n)$ exists (which implies that $\alpha = \gamma$, which characterizes the subexponential growth of $(\mathbb{Z}_+, *(Q_n))$).

(P3) $1 > \alpha > \gamma$ (which means that $(\mathbb{Z}_+, *(Q_n))$ is of exponential growth).

(P4) $\sum_{n \ge 1} n \max(b_n - b_{n-1}, 0) < \infty$.

A Haar measure of the P hypergroup $(\mathbb{Z}_+, *(Q_n))$ is given by

$$\omega_{\mathbb{Z}_+}(\{n\}) := \begin{cases} 1 & \text{if } n = 0, \\ c(n, n, 0)^{-1} = \prod_{k=0}^{n-1} a_k \Big/ \prod_{k=1}^{n} c_k & \text{if } n \in \mathbb{N}, \end{cases}$$

the multiplicative functions on $\mathbb{Z}_+$ are the evaluation maps $\chi_x$ defined by $\chi_x(n) := Q_n(x)$ for all $n \in \mathbb{Z}_+$, and the dual space can be identified with $\{x \in \mathbb{R} : |Q_n(x)| \le 1 \text{ for all } n \in \mathbb{Z}_+\}$. It turns out that $\chi_{x_0} = 1$ corresponds to the unit character of the hypergroup $(\mathbb{Z}_+, *(Q_n))$. In analogy to Example I we shall often recur to P hypergroups $(\mathbb{Z}_+, *(Q_n))$ admitting a Laplace representation in the sense that

(P5) for every $n \in \mathbb{Z}_+$

$$Q_n = \sum_{k=0}^{n} h(n, k)\, T_k,$$

where $T_k$ is the Chebyshev polynomial of the first kind and of degree $k$, and the $h(n, k)$ are coefficients $\ge 0$ for $n, k \in \mathbb{Z}_+$, $n \ge k$.


It has been conjectured that any P hypergroup has property (P5).
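The simplest instance of nonnegative linearization is given by the Chebyshev polynomials of the first kind, normed by $T_n(1) = 1$, for which $T_m T_n = \tfrac{1}{2} T_{m+n} + \tfrac{1}{2} T_{|m-n|}$. The Python sketch below recovers the linearization coefficients $c(m, n, k)$ by exact polynomial arithmetic; the helper names are illustrative assumptions, and the Chebyshev case is chosen here only as a familiar example.

```python
from fractions import Fraction


def chebyshev_T(n):
    """Coefficients (lowest degree first) of the Chebyshev polynomial T_n."""
    t_prev, t_cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        # Recurrence T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x).
        nxt = [Fraction(0)] + [2 * c for c in t_cur]
        for i, c in enumerate(t_prev):
            nxt[i] -= c
        t_prev, t_cur = t_cur, nxt
    return t_cur


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def linearization(m, n):
    """Nonzero coefficients c(m, n, k) in T_m T_n = sum_k c(m, n, k) T_k."""
    rest = poly_mul(chebyshev_T(m), chebyshev_T(n))
    coeffs = {}
    for k in range(m + n, -1, -1):
        tk = chebyshev_T(k)
        lead = rest[k] / tk[k]          # match the coefficient of x^k
        if lead != 0:
            coeffs[k] = lead
            for i, c in enumerate(tk):
                rest[i] -= lead * c
    return coeffs


if __name__ == "__main__":
    print(linearization(2, 3))
```

The coefficients are nonnegative, sum to one, and are supported in $[|m-n|, m+n]$, so $\varepsilon_m * \varepsilon_n := \sum_k c(m, n, k)\, \varepsilon_k$ is a probability measure on $\mathbb{Z}_+$ with the support property stated above.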

Examples.

(C) The polynomial Jacobi hypergroup is a P hypergroup $(\mathbb{Z}_+, *(Q_n^{(\alpha,\beta)}))$, where $Q_n^{(\alpha,\beta)}$ defines the $n$th Jacobi polynomial on $[-1, 1]$ with parameters $\alpha, \beta \in \mathbb{R}$, $\alpha \ge \beta > -1$ and $\alpha + \beta + 1 \ge 0$. This hypergroup is of subexponential growth ($\alpha = \gamma$).

(D) The Cartier hypergroup is a P hypergroup $(\mathbb{Z}_+, *(Q_n^{a,b}))$, where $Q_n^{a,b}$ denotes the $n$th Cartier polynomial with parameters $a, b \ge 2$. It arises as a double coset hypergroup of an infinite distance-transitive graph, and it is of exponential growth ($\alpha > \gamma$).

From now on we shall employ the notion of a one-dimensional hypergroup for both those of Examples I and II.

3. The probabilistic objects: additive processes.

For every hypergroup $K$ there exists a concretization, which is defined as a triplet $(M, \mu, \varphi)$ consisting of a compact space $M$, a measure $\mu \in M^1(M)$, and a Borel-measurable mapping $\varphi \colon K \times K \times M \to K$ such that

$$\mu(\{\lambda \in M : \varphi(x, y, \lambda) \in A\}) = \varepsilon_x * \varepsilon_y(A)$$

for all $x, y \in K$ and sets $A$ in the Borel-$\sigma$-algebra $\mathcal{B}(K)$ of $K$. Special cases.

(A) A concretization of the Bessel-Kingman hypergroup $(\mathbb{R}_+, *(A))$ is given by $M := [-1, 1]$, $\mu := g \cdot \lambda_M$ with

for all $t \in [-1, 1]$, and $\varphi$ with

$$\varphi(x, y, \lambda) := (x^2 + y^2 - 2\lambda xy)^{1/2}$$

for all $x, y \in \mathbb{R}_+$, $\lambda \in [-1, 1]$.

(D) A concretization of the Cartier hypergroup $(\mathbb{Z}_+, *(Q_n^{a,b}))$ for $a, b \ge 2$, $a + b \ge 5$ is given by $M := [0, 1]$, $\mu := \lambda_M$, and $\varphi$ such that

$$\varphi(m, n, \lambda) := \max\bigl(|m - n|,\ m + n - \theta(\lambda)\bigr)$$

for all $m, n \in \mathbb{Z}_+$, where

$$\theta(x) := \min\bigl(0,\ [-\log_{(a-1)(b-1)} ax],\ [-\log_{(a-1)(b-1)} a(b-1)x] + 2\bigr)$$

whenever $x \in [0, 1]$.


Now let $X$, $Y$ be two $K$-valued random variables on a probability space $(\Omega, \mathcal{A}, P)$, and for a given concretization $(M, \mu,$ O. Now let $K$ be a hypergroup with arbitrary moment functions $m_1$ and $m_2$ satisfying the hypothesis $m_2 \ge |m_1|^2$. For any $K$-valued random variable $X$ on $(\Omega, \mathcal{A}, P)$ such that $m_1 \circ X$ is integrable, the number

$$E_*(X) := E(m_1 \circ X)$$

is called the modified expectation of $X$. Moreover, if $E(m_2 \circ X) < \infty$ (which implies that $E_*(X)$ exists), then the modified variance

$$V_*(X) := E(m_2 \circ X) - |E_*(X)|^2$$

of $X$ can be introduced. One shows that with the mapping $v$ on $K \times \mathbb{C}$ given by $v(x, \xi) := m_2(x) - 2\,\mathrm{Re}(m_1(x)\bar\xi) + |\xi|^2$ for all $(x, \xi) \in K \times \mathbb{C}$, by $V_*(X)$

In passing, let us point out that the population process can certainly also be defined in time or generation inhomogeneous cases, where the life law is assumed not to be determined by the type alone but also by the birth-time or generation of the individual. Such processes are well-known in the simple Galton-Watson case, but have also been discussed in the present general framework, cf. Cohn and Jagers [2]. The size of the population at a certain time should then be the size of all its individuals at that time, i.e. the sum of their sizes. Individual size, in its turn, can be thought of in many ways, cf. Jagers [3]-[4] or Jagers and Nerman [5], the simplest being just counting individuals as they are born into the population, your "size" being zero until you are born, one afterwards. The resulting population size is of course the total population at time $t$, usually denoted $Y_t$. The most natural size concept might be the number $Z_t$ of living individuals. This presumes a life span $\lambda \colon \Omega \to \mathbb{R}_+$ or better $\lambda \colon S \times \Omega \to \mathbb{R}_+$ being defined. Another basic size concept is the number $\zeta_n$ of $n$th generation individuals ever born, or $\zeta_n(A)$, the number of $n$th generation individuals with types in $A \in \mathcal{S}$. Generally, each so-called random characteristic, cf. op. cit., defines a population counted with or measured by that characteristic. In Section 3 we shall give a somewhat quick description of very general characteristics.

2. The instability of size. The extinction problem is suitably studied in terms of the embedded multi-type Galton-Watson process $\zeta = \{\zeta_n(A),\ n \in Z_+,\ A \in \mathcal{S}\}$, the number of nth generation type-A individuals. The most general result about the possible stability of this process is due to Liemant, Matthes, and Wakolbinger [6], Section 2.9, and can be expressed in the following manner (using our terminology):

Theorem 1. Assume $\zeta_n(S)$ finite and $\zeta$ stationary, i.e. the number and type distribution invariant over generations. Then there is a subset $C$ of the type space such that any individual with type in $C$ gets exactly one child of type in $C$. Consider now a process $\zeta^C = \{\zeta^C_n;\ n = 0, 1, \ldots\}$ with the same life law as that underlying $\zeta$ but with the initial population $\zeta^C_0 = \{\zeta_0(A \cap C),\ A \in \mathcal{S}\}$, i.e. in the starting generation possible individuals outside $C$ are disregarded. Then, in distribution, $\zeta^C_n \to \zeta_0$ as $n \to \infty$.

(I am grateful to A. Wakolbinger for discussions on this.) In the one-type case this result has long been well known, cf. Jagers [3], and can be summarized by saying that a population with a stable size always freezes in the sense that from some generation onwards there is no real reproduction, each individual just replacing herself. This result even holds in the case of varying environment, where the life law need not remain the same as generations pass (cf. op. cit. p. 70). And, indeed, at bottom the explosion or extinction character of reproducing populations is a consequence of the asymmetry between being extinct, where the population cannot resurrect itself, and being large, where it still runs the risk, albeit small, of dying out. Thus the property should hold much more generally, in cases of interaction between reproducing individuals, nonhomogeneity in time or generations, non-Markovianness in generations etc. An attempt to catch this general but vague property is made by the following:

Theorem 2. Consider a sequence of random variables $X_1, X_2, \ldots$ defined on some probability space and taking values in $R_+$. Assume 0 absorbing in the sense that

$$X_n = 0 \implies X_{n+1} = 0,$$

and suppose that there always is a risk of extinction in the following way: for any $x$ there is a $\delta > 0$ such that

$$P(\exists n:\ X_n = 0 \mid X_1, \ldots, X_k) \ge \delta$$

if only $X_k \le x$. Then, with probability one, either (i) there is an n such that all $X_k = 0$ for $k \ge n$, or (ii) $X_k \to \infty$ as $k \to \infty$.

Proof. The proof is rather direct from Lévy's convergence theorem for closed martingales: if $D := \{\exists n:\ X_n = 0\}$ and $\mathcal{E}_n := \sigma(X_1, X_2, \ldots, X_n)$, then

$$P(D \mid \mathcal{E}_n) \to 1_D$$

on a subset $\mathcal{H}$ of the sample space that has probability one. Let $x > 0$. For any outcome $\omega \in \mathcal{H}$ such that $X_n(\omega) \le x$ infinitely often, and a $\delta > 0$ chosen to satisfy the assumption of the theorem, it holds that $P(D \mid \mathcal{E}_n)(\omega) \ge \delta$ infinitely often. By the convergence of the conditional probabilities, $1_D(\omega) \ge \delta > 0$, i.e. $\omega \in D$. Since this holds for all $x$,

$$\omega \in D \cup \{X_n \to \infty\}.$$

O. Then

if only $\zeta_k(S) \le x$. 2. Varying environment. The same type of argument applies if the life law should be time and/or generation dependent. 3. Size dependence. And even in this case, under some conditions: assume that the probability of having no children, given all preceding generation sizes, is no less than, say, $\delta_x$ if the size of your own generation is $\le x$. Then,

provided 0 < e < e and (k(S) ::::: x.
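The extinction-or-explosion dichotomy of Theorem 2 can be illustrated numerically in the simplest special case. The sketch below is not taken from the paper: it assumes a one-type Galton-Watson process with a hypothetical offspring law (0, 1 or 2 children with probabilities 1/4, 1/4, 1/2), for which the risk-of-extinction assumption holds because a generation of size at most x dies out at once with probability at least (1/4)^x. It computes P(X_n = k) exactly for small k from the generating-function recursion f_n = f(f_{n-1}) and shows that the mass on the intermediate sizes 1, ..., 10 vanishes while P(X_n = 0) approaches the extinction probability 1/2.

```python
def truncated_square(coeffs, max_degree):
    """Coefficients of (sum_k coeffs[k] * s**k)**2 up to degree max_degree."""
    out = [0.0] * (max_degree + 1)
    for i, a in enumerate(coeffs):
        for j, b in enumerate(coeffs):
            if i + j <= max_degree:
                out[i + j] += a * b
    return out


def size_distribution(n_generations, max_size, p0=0.25, p1=0.25, p2=0.5):
    """Return [P(X_n = k) for k = 0..max_size] for a Galton-Watson process, X_0 = 1.

    The offspring law (p0, p1, p2) is a hypothetical example. The recursion
    f_n(s) = p0 + p1 * f_{n-1}(s) + p2 * f_{n-1}(s)**2 is exact for the
    coefficients up to max_size, since low-order coefficients of a square
    only depend on low-order coefficients of the factor. Requires max_size >= 1.
    """
    dist = [0.0] * (max_size + 1)
    dist[1] = 1.0  # generation 0 consists of a single ancestor
    for _ in range(n_generations):
        squared = truncated_square(dist, max_size)
        dist = [p1 * d + p2 * q for d, q in zip(dist, squared)]
        dist[0] += p0
    return dist


if __name__ == "__main__":
    for n in (5, 20, 60):
        dist = size_distribution(n, 10)
        print(n, dist[0], sum(dist[1:]))
```

The printed second column is the probability of being extinct by generation n, the third the probability of a positive size not exceeding 10; the remaining mass sits on sizes above 10.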

3. The stability of composition. The expected development of a branching population is determined by its reproduction kernel $\mu$, defined as the expected number of children of various types and at various (bearing) ages:

$$\mu(r, ds \times dt) := E_r[\xi(ds \times dt)].$$

The population is supposed to be Malthusian and supercritical, this meaning that there is a number $\alpha > 0$, the Malthusian parameter, such that the kernel $\hat\mu(\alpha)$,

$$\hat\mu(r, ds; \alpha) := \int_0^\infty e^{-\alpha t}\, \mu(r, ds \times dt),$$

has Perron root one and is what Shurenkov [12] calls conservative. (This corresponds to irreducibility and α-recurrence in the terminology of Niemi and Nummelin [9].) By the abstract Perron-Frobenius theorem (see [12], p. 43; [10], p. 70), there is then a σ-finite measure $\pi$ on the type space $(S, \mathcal{S})$, and a strictly positive a.e. $[\pi]$ finite measurable function $h$ on the same space, such that

$$\int_S \hat\mu(r, ds; \alpha)\, \pi(dr) = \pi(ds), \qquad \int_S h(s)\, \hat\mu(r, ds; \alpha) = h(r).$$

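Further we require so-called strong α-recurrence, viz. that

$$0 < \beta := \int_{S \times S \times R_+} t e^{-\alpha t} h(s)\, \mu(r, ds \times dt)\, \pi(dr) < \infty,$$

and that $\inf h > 0$. Then $\pi$ is finite and can (and will) also be normed to a probability measure. Finally, we assume that the reproduction kernel is non-lattice and satisfies the natural condition

$$\sup_s \mu(s, S \times [0, a]) < 1$$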

for some $a > 0$. Note that we assume only non-latticeness, rather than spread-outness of the kernel. These are the conditions (on $\mu$ alone) for the general renewal theorem of Shurenkov ([12], p. 107). We shall summarize them by referring to the population as non-lattice strictly Malthusian. Clearly there is a lattice analog of our results, relying upon the lattice Markov renewal theorem (cf. Shurenkov [12], p. 122). As pointed out in the Introduction, populations can be most generally measured by random characteristics. For an exposition at a more leisurely pace the reader is referred to Jagers [3], Jagers and Nerman [5], or for the multi-type case Jagers [4]. In order to go into that area, the presentation must, however, be somewhat tightened.
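For a finite type space the Malthusian conditions reduce to matrix computations, which the following sketch makes concrete. Everything specific in it is an assumption made for illustration and does not come from the paper: two types, a hypothetical mean offspring matrix M, and bearing ages that are exponential with rate 1 independently of the types, so that the transformed kernel is the matrix M[r][s] * rate / (rate + alpha). The Malthusian parameter is found by bisection on the Perron root, and the eigenmeasure π (left eigenvector) and the function h (right eigenvector) are normed so that π is a probability and the sum of π(s)h(s) equals one.

```python
def perron_root(matrix, iterations=500):
    """Power iteration for a non-negative primitive matrix.

    Returns (root, v) where v is the right eigenvector normalised to sum 1.
    """
    n = len(matrix)
    v = [1.0 / n] * n
    root = 0.0
    for _ in range(iterations):
        w = [sum(matrix[r][s] * v[s] for s in range(n)) for r in range(n)]
        root = sum(w)  # equals the eigenvalue once v has converged, as sum(v) == 1
        v = [x / root for x in w]
    return root, v


def transformed_kernel(mean_offspring, rate, alpha):
    """Laplace-transformed kernel for exponential(rate) bearing ages (an assumption)."""
    return [[m * rate / (rate + alpha) for m in row] for row in mean_offspring]


def malthusian_parameter(mean_offspring, rate, lo=0.0, hi=50.0, tol=1e-12):
    """Bisection for the alpha at which the transformed kernel has Perron root one."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        root, _ = perron_root(transformed_kernel(mean_offspring, rate, mid))
        if root > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


M = [[1.0, 2.0], [1.0, 0.0]]  # hypothetical mean numbers of children, row = mother's type
RATE = 1.0                    # hypothetical rate of the exponential bearing age

alpha = malthusian_parameter(M, RATE)
A = transformed_kernel(M, RATE, alpha)
_, h = perron_root(A)               # right eigenvector: sum_s A[r][s] h[s] = h[r]
_, pi = perron_root(transpose(A))   # left eigenvector:  sum_r pi[r] A[r][s] = pi[s]
scale = sum(p * x for p, x in zip(pi, h))
h = [x / scale for x in h]          # norming: sum_s pi[s] h[s] = 1

if __name__ == "__main__":
    print("alpha =", alpha, "pi =", pi, "h =", h)
```

Power iteration is chosen only because it needs no external library; any eigenvalue routine would serve equally well.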

Thus, let N denote the positive natural numbers, $N^0 := \{0\}$, and

$$I := \bigcup_{n \ge 0} N^n$$

the Ulam-Harris space of possible individuals. The population space is $(S \times \Omega^I, \mathcal{S} \times \mathcal{A}^I)$ with the probability measure $P_s$, defined by the starting condition that the ancestor 0 be of type $s \in S$ and the life kernel, cf. Jagers [4], Section 3. (We allow ourselves to use the same notation for the kernel as for the resulting probability measure on the whole population space.) We denote by $\tau_x$, $x \in I$, x's birth-time, $\tau_0$ being zero, and subsequent birth-times inductively defined through mother's age at bearing. Similarly $\sigma_x$ is x's type. The daughter process $S_x$ of $x \in I$ describes the type of x and the lives of x and all its possible progeny. Formally it is the coordinate projection $(\sigma_0, \{\omega_y;\ y \in I\}) \mapsto (\sigma_x, \{\omega_{xy};\ y \in I\})$, xy being the concatenation of x and y, first x's coordinates, then y's. A characteristic $\chi$ is a measurable function

$$\chi: S \times \Omega^I \times R \to R_+,$$

supposed to have realizations which are D-valued in its last argument and vanishing if the latter is negative. The χ-counted population at time t is then defined as

$$z_t^\chi := \sum_{x \in I} \chi(S_x, t - \tau_x).$$

In this, note that $S_x$ has "two" coordinates, $\sigma_x$ and the lives of x and all her progeny. By a slight adaptation from Jagers [4], abstaining from spread-outness, we obtain:

Theorem 3. Consider a non-lattice, strictly Malthusian branching population, counted with a bounded characteristic $\chi$ such that the function $e^{-\alpha t} E_s[\chi(t)]$ is directly Riemann integrable. Then, for π-almost all $s \in S$,

$$\lim_{t \to \infty} e^{-\alpha t} E_s[z_t^\chi] = h(s) \int_{S \times R_+} e^{-\alpha u} E_r[\chi(u)]\, \pi(dr)\, du / \beta =: h(s)\, E_\pi[\hat\chi(\alpha)] / \beta,$$

in an obvious notation for Laplace transform, and $E_\pi = \int E_r\, \pi(dr)$, expectation under the initial condition $\sigma_0 \sim \pi$.

Proof. The notion of direct Riemann integrability used is that of Shurenkov ([12], pp. 80 ff). If we can prove that $\sup_t e^{-\alpha t} E_s[z_t^\chi]$ is finite (at least for an s-set of positive π-measure), the other conditions of his Theorem 1, p. 107, are clearly satisfied. This is, however, a consequence of the two assumptions that $\chi$ be bounded and $\sup_s \mu(s, S \times [0, a]) < 1$. Indeed, with $*$

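denoting Markov convolution (convolution in time and transition in type),

$$\mu_\alpha(r, ds \times dt) := e^{-\alpha t} \mu(r, ds \times dt), \qquad \nu_\alpha := \sum_{n \ge 0} \mu_\alpha^{*n},$$

and $0 \le \chi \le c$, we may write

$$e^{-\alpha t} E_s[z_t^\chi] \le c\, e^{-\alpha t} E_s[y_t] = c \int_{S \times [0, t]} e^{-\alpha(t-u)}\, \nu_\alpha(s, dr \times du).$$

But

$$1 = \nu_\alpha(s, S \times [0, t]) - \mu_\alpha * \nu_\alpha(s, S \times [0, t]) = \int_{S \times [0, t]} \big(1 - \mu_\alpha(r, S \times [0, t-u])\big)\, \nu_\alpha(s, dr \times du)$$

$$\ge \int_{S \times [t-a, t]} \big(1 - \mu_\alpha(r, S \times [0, t-u])\big)\, \nu_\alpha(s, dr \times du) \ge (1 - \gamma)\, \nu_\alpha(s, S \times [t-a, t]),$$

if $\sup_s \mu(s, S \times [0, a]) \le \gamma < 1$. This holding for any t, partitioning into disjoint intervals yields the linear bound

$$\nu_\alpha(s, S \times [0, t]) \le At + B.$$

Integration by parts shows that

$$\int_{S \times [0, t]} e^{-\alpha(t-u)}\, \nu_\alpha(s, dr \times du)$$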

is bounded. □ The theorem readily yields $E_s[z_t^\chi] / E_s[y_t] \to E_\pi[\hat\chi(\alpha)]$, which can be extended to the $P_s$-almost sure convergence of the total population average of $\chi$, $z_t^\chi / y_t \to E_\pi[\hat\chi(\alpha)]$, under suitable ("x log x") conditions (Jagers [4]). This tells much about average properties of the population, but not enough: in order to catch the history of an average individual we need similar results about characteristics that may be influenced by the past of the individual measured. To this end, fix $x \in I$ and let it be measured by a characteristic $\chi^x: S \times \Omega^I \times R \to [0, 1]$. Generally, at age a, xy will have the score $\chi^x(S_y, a)$, obviously influenced by the lives and types of xy's n last progenitors, if $x \in N^n$. Since

$$z_t^{\chi^x} = \sum_{y \in I} \chi^x(S_y, t - \tau_{xy}) = \sum_{y \in I} \chi^x(S_y, t - \tau_x \circ S_y - \tau_y),$$

under the conditions of Theorem 3. Now, consider a population at time t, thought of as quite large, where an individual has been sampled at random from among all those born, i.e. with (conditional) probability $1/y_t$. Measure the individual somehow in a manner that may depend upon all her progeny, upon her n last progenitors, and on her age. Denoting the measure by $\psi$, we can write it in the form $\psi = \sum_{x \in N^n} \chi^x$, the superscript $x = (x_1, \ldots, x_{n-1}, x_n)$ indicating that $\psi$ is considered on the set where the individual's descent is that she was the $x_n$th child of her mother, who was the $x_{n-1}$th child of her mother etc. The relation

$$E(\psi) = \sum_{x \in N^n} E_\pi[e^{-\alpha \tau_x} \hat\chi^x(\alpha)]$$

for all n defines a linear functional and underlying probability measure P on a doubly infinite population space, centered at the randomly sampled individual, Ego, containing the population that branches off from her, information about Ego's age, now at sampling, her descent, the lives of all her progenitors and their progeny. For a formal description of such a space and the stable population probability measure P in the one-type case, see [5], [7], [8]. Here we shall give a less complete (but hopefully more intuitive) description of the stable population. Some notation is necessary: $T_0$ will denote Ego's age at sampling, $S_0$ her type, and $R_0$ her rank, i.e. ordinal number in her sibship. $T_1$ is Ego's mother's age when she gave birth to Ego, $S_1$ her type, and $R_1$ her rank. And so on backwards. Similarly we let $U_0, U_1, \ldots$ denote the whole lives of Ego, Ego's mother, ..., and $Z_0$ the population initiated by Ego. $Z_1, Z_2, \ldots$ can be used to denote Ego's mother's daughter process except Ego, grandmother's daughter process except mother and her progeny, etc. Also recall that $\sigma(i)$, $\tau(i)$ are the type and bearing age of a mother's ith child (the latter equalling infinity, if no ith child is ever born).

Theorem 4. The stable population measure satisfies

.. , ;U n E d ui..,'!'; E dt".S" E ds",R" "" /',,)

"" E tt (e-QT(rn)l a(rn)Eds n )E -'In (e-otn-1l Ann{a(rn_lEd,s>l_l,r(I',,_tEdl n} )

This relation also determines the types and birth times of sisters of Ego's progenitors. Given these, the daughter processes of these sisters are independent branching populations with the given starting type and starting time.

Proof. Just carefully insert the appropriate $\psi$ into the relation defining E as a linear functional. □

In order to grasp the meaning of this we shall formulate a number of special cases as corollaries. In them we write $\hat\mu(r, ds)$ as an abbreviation for

$$\hat\mu(r, ds; \alpha) = \int_0^\infty e^{-\alpha t} \mu(r, ds \times dt).$$

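Corollary 1. The sequence of types backwards from Ego, $\{S_n\}$, is a Markov chain with transition probabilities

$$P(S_{n+1} \in ds \mid S_n = r) = \frac{\pi(ds)\, \hat\mu(s, dr)}{\pi(dr)}.$$

The distribution of $S_0$ is $\pi$, whereas

$$P(S_n \in ds) = \hat\mu^n(s, S)\, \pi(ds) \to h(s)\, \pi(ds)$$

as $n \to \infty$.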
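A small finite-type computation may help in reading Corollary 1. The sketch below assumes two types and a hypothetical transformed kernel A[r][s] standing for $\hat\mu(r, \{s\})$, with Perron root one, left eigenvector π and right eigenvector h normed so that the sum of π(s)h(s) is one; none of these numbers come from the paper. It reads the corollary as P(S_{n+1} = s | S_n = r) = π(s) A[s][r] / π(r) and P(S_n = s) = π(s) (A^n 1)(s), which is what the joint distribution of types derived in the proof gives, and checks that the backward transition matrix is stochastic and that the marginals approach h(s)π(s).

```python
A = [[0.5, 1.0], [0.5, 0.0]]   # hypothetical mu_hat(r, {s}); Perron root one
PI = [0.5, 0.5]                # left eigenvector of A, a probability vector
H = [4.0 / 3.0, 2.0 / 3.0]     # right eigenvector of A, with sum(PI[s] * H[s]) == 1


def backward_transition(a, pi):
    """Q[r][s] = P(S_{n+1} = s | S_n = r) = pi[s] * a[s][r] / pi[r]."""
    n = len(pi)
    return [[pi[s] * a[s][r] / pi[r] for s in range(n)] for r in range(n)]


def type_marginal(a, pi, n_steps):
    """P(S_n = s) = pi[s] * (a^n applied to the all-ones vector)[s]."""
    size = len(pi)
    mass = [1.0] * size  # mu_hat^0(s, S) = 1
    for _ in range(n_steps):
        mass = [sum(a[s][r] * mass[r] for r in range(size)) for s in range(size)]
    return [pi[s] * mass[s] for s in range(size)]


if __name__ == "__main__":
    print("Q =", backward_transition(A, PI))
    for n in (0, 1, 5, 50):
        print(n, type_marginal(A, PI, n))
    print("limit h*pi =", [h * p for h, p in zip(H, PI)])
```

In this example the type distribution of Ego's remote ancestors, (2/3, 1/3), differs from that of Ego herself, (1/2, 1/2), because types with larger reproductive value h are over-represented among progenitors.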

Proof. Integrating and summing in the theorem yields

$$P(S_n \in ds_n, \ldots, S_0 \in ds_0) = E_\pi\Big(\sum_{r_n} e^{-\alpha \tau(r_n)} 1_{\{\sigma(r_n) \in ds_n\}}\Big) \times \cdots \times E_{s_1}\Big(\sum_{r_0} e^{-\alpha \tau(r_0)} 1_{\{\sigma(r_0) \in ds_0\}}\Big) = \pi(ds_n)\, \hat\mu(s_n, ds_{n-1}) \cdots \hat\mu(s_1, ds_0),$$

where

$$\hat\mu(r, ds) = \hat\mu(r, ds; \alpha) := \int_0^\infty e^{-\alpha t} \mu(r, ds \times dt)$$

and we have used the eigenmeasure property $\int \hat\mu(s, ds_n)\, \pi(ds) = \pi(ds_n)$. The form of transition and marginal probabilities follows from this joint distribution of types. The convergence $\hat\mu^n(s, S) \to h(s)$ follows directly from the lattice Markov renewal theorem (Shurenkov [12], p. 122). Of course, it can also be brought back to a limit theorem for Markov chains. □

Without spelling this out as another corollary, let us state that the sequence $\{(R_n, S_n)\}$ of ranks and types also constitutes a Markov chain. $P(R_0 = i) = E_\pi(e^{-\alpha \tau(i)})$, whereas

$$P(R_n = i) = \int_S \hat\mu^n(s, S)\, E_\pi(e^{-\alpha \tau(i)};\ \sigma(i) \in ds) \to \int_S h(s)\, E_\pi(e^{-\alpha \tau(i)};\ \sigma(i) \in ds),$$

as $n \to \infty$. Though the distribution of, at least, $R_0$ is important for birth rank studies, the joint behaviour of types and times between births seems of greater import both mathematically and in tracing populations backwards, e.g. in evolutionary genetics.

Corollary 2. The sequence of types and interbirth times backwards from Ego, $\{(S_n, T_n)\}$, is a Markov renewal process with transition kernel

$$P(S_{n+1} \in ds,\ T_{n+1} \in dt \mid S_n = r) = \frac{\pi(ds)\, \mu_\alpha(s, dr \times dt)}{\pi(dr)}.$$

The distribution of $S_0$ is $\pi$, $T_0$ is exponentially distributed with the Malthusian parameter, and independent of the rest.

$$(S_n, T_n) \sim \int_S \hat\mu^{n-1}(r, S)\, \mu_\alpha(s, dr \times dt)\, \pi(ds) \to \int_S h(r)\, \mu_\alpha(s, dr \times dt)\, \pi(ds),$$

as $n \to \infty$.

Proof. The proof follows the pattern of the preceding, and is left out. □

Among other things, this shows that the expected age at a random childbearing is $E[T_1] =$

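$$\int_{S \times R_+} t e^{-\alpha t}\, \mu(\pi, ds \times dt),$$

whereas the expectation of the asymptotic distribution of $T_n$, $n \to \infty$, is

$$\beta = \int_{S \times R_+} t e^{-\alpha t} h(s)\, \mu(\pi, ds \times dt).$$

Here, of course,

$$\mu(\pi, ds \times dt) = \int_S \mu(r, ds \times dt)\, \pi(dr).$$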

In analogy with Corollary 2, the sequence $\{(R_n, S_n, T_n)\}$ constitutes a Markov renewal process, with a transition kernel that is easily determined from Theorem 4. Actually, more generally:

Corollary 3. The &equence of types, ranks, and lives backwards from Ego has the Markov property P (R n + 1

= i. Sn+l

E ds, Un + 1 E A I

u; = i, s; = r , U-: R n - 1 , Sn-l ... )

=E,,(e-orlJ);J(;)Eds)

E (e-or(i). An {J(i) E dr}) 'E ( - ' ( ) () i) . 1r e G T t ; a l Err

The distribution of (R o, So, Uo) is E" ( e-ar(l); J(1) E ds ) P, (clw), whereas in the notation

ftA(s,B):= E, [ (B;a);A], P(R n

= t,s; E ss.u; E A) =

E,,(e-cxr(i);J(i) E ds)

--... E1r[e- cx r(i); 00.

S

Proof. The proof is again rather straightforward by insertion. □

The stable population measure P describes a typical individual, her background and future when sampling from among all those born, dead or alive. Though this is artificial from, say, a biological viewpoint, it is not only mathematically convenient but also conceptually the fundamental situation. Being alive or not is a property that can be recorded by a characteristic. Therefore the stable measure when sampling in the live population is obtained by conditioning in P on Ego being alive, and correspondingly for sampling from other subsets of individuals. To express this more formally, let $L_0$ denote Ego's life span, so that $L_0 > T_0$ means that Ego is alive. Let $\psi$ and $\chi^x$ be as in the equation defining E, recall that the life span, as defined on the individual life space $\Omega$, was denoted by $\lambda$, and write $\lambda_x$ for the life span of the individual $x \in I$, as a random variable defined on the population space $S \times \Omega^I$.

Theorem 5. The probability law describing a typical individual, sampled from among those alive, is

$$P(\,\cdot \mid L_0 > T_0)$$

and

Proof. Not much of a proof is needed. The characteristic recording whether you are alive or not is $1_{[0, \lambda)}$ and

Hence The characteristic XX operating only on living individuals is XX l[o,>'x)' The ratio given in the theorem is thus the limit of

E.

(L z( )/

E.[Zt],

as

t

-+ 00.

X

For the convergence of ratios between the random processes themselves cf. Jagers [4]. □

This work has been supported by a grant from the Swedish Natural Science Research Council.

REFERENCES

[1] L. Breiman, Probability, Addison-Wesley, Reading, Massachusetts, 1969.
[2] H. Cohn and P. Jagers, General Branching Processes in Varying Environment, (to appear).
[3] P. Jagers, Branching Processes with Biological Applications, Wiley, New York, 1975.
[4] P. Jagers, General branching processes as Markov fields, Stoch. Proc. Appl., 32 (1989), pp. 183-242.
[5] P. Jagers and O. Nerman, The growth and composition of branching populations, Adv. Appl. Probab., 16 (1984), pp. 221-259.
[6] A. Liemant, K. Matthes, and A. Wakolbinger, Equilibrium Distributions of Branching Processes, Akad. Verlag, Berlin, 1988.
[7] O. Nerman, The Growth and Composition of Supercritical Branching Populations on General Type Spaces, Dep. Math. Chalmers Techn. Univ., Gothenburg, 4 (1984).
[8] O. Nerman and P. Jagers, The stable doubly infinite pedigree process of supercritical branching, Z. Wahrsch. Verw. Geb., 64 (1984), pp. 445-446.
[9] S. Niemi and E. Nummelin, On non-singular renewal kernels with an application to a semigroup of transition kernels, Stoch. Proc. Appl., 22 (1986), pp. 177-202.
[10] E. Nummelin, General Irreducible Markov Chains and Non-negative Operators, Cambr. Univ. Press, Cambridge, 1984.
[11] V. M. Shurenkov, On the theory of Markov renewal, Probab. Theory Appl., 29 (1984), pp. 247-265.
[12] V. M. Shurenkov, Ergodic Markov Processes, Nauka, Moscow, 1989.

CHALMERS UNIV. OF TECH. AND THE UNIV. OF GOTEBORG, DEPT. OF MATHEMATICS, S-412 96 GOTEBORG, Sweden

Slobodanka Jankovic

SOME PROPERTIES OF RANDOM VARIABLES WHICH ARE STABLE WITH RESPECT TO THE RANDOM SAMPLE SIZE

The notion of stability of random variables (r.v.'s) with respect to the random sample size is introduced. It generalizes the notion of strict stability with geometrically distributed sample size (geometrically strictly stable r.v.) and parallels the notion of max-stability with respect to the random sample size. Let $X_1, X_2, \ldots$ be independent and identically distributed (i.i.d.) random variables with probability distribution F, and let $\nu$ be an integer-valued r.v., independent of $\{X_i\}$,

$$P(\nu = n) = p_n, \qquad \sum_{i=1}^{\infty} p_i = 1.$$

Put

$$S_n = \sum_{i=1}^{n} X_i, \qquad M_n = \max(X_1, \ldots, X_n).$$

Distribution functions

$$\sum_{n=1}^{\infty} p_n P(S_n < x), \qquad \sum_{n=1}^{\infty} p_n P(M_n < x)$$

(obtained by taking the sum (maximum) of a random number $\nu$ of variables $X_1, X_2, \ldots$) that are of the same type as F were investigated in [1]-[8]. Gnedenko [1] investigated the class of nonnegative i.i.d. random variables whose sum up to a geometrically distributed random number has the same distribution type as its summands. More precisely, he characterized the class of probability distributions F such that for every $0 < p < 1$ there exists an $a_p > 0$ such that the following relation is valid:

$$\sum_{j=1}^{\infty} q p^{j-1} F^{*j}(x) = F(a_p x), \qquad q = 1 - p$$

(* denotes the convolution). Gnedenko and Janjic [2] generalized the preceding problem to the case when $X_1, X_2, \ldots$ can take negative values too. Klebanov, Maniya and Melamed [3] introduced the notion of geometrically infinitely divisible and geometrically strictly stable random variables and gave analytic expressions for the corresponding characteristic functions. According to [3], the random variable Y is geometrically infinitely divisible if for each $p \in (0, 1)$ it could be presented in the form

$$Y \stackrel{d}{=} \sum_{j=1}^{\nu_p} X_p^{(j)},$$

where $\nu_p$ has a geometric distribution with the parameter p, $X_p^{(j)}$ (j = 1, 2, ...) are identically distributed and Y, $\nu_p$, $X_p^{(j)}$ (j = 1, 2, ...) are independent. Also according to [3], Y is geometrically strictly stable if for each $p \in (0, 1)$ there exists $C = C(p) > 0$ such that

$$Y \stackrel{d}{=} C(p) \sum_{j=1}^{\nu_p} Y_j,$$

where $Y, Y_1, Y_2, \ldots$ are i.i.d., $\nu_p$ has a geometric distribution and is independent of $Y, Y_1, Y_2, \ldots$ Geometrically strictly stable random variables are closely related to the random variables whose characterization is given in [2]. Janjic [4] studied the case of positive i.i.d. random variables whose sum up to a random index $\nu_p$ (not necessarily geometrically distributed) is again of the same type as one of the summands, i.e.

$$Y \stackrel{d}{=} C(p) \sum_{j=1}^{\nu_p} Y_j,$$

in fact the case of strictly stable random variables with respect to the random sample size. Analogous problems were investigated in the case when the maximum is taken instead of a sum. Janjic [5] characterized the class of max-stable random variables with respect to the geometrically distributed sample size. Baringhaus [6] studied the class of max-stable random variables with respect to the random (not necessarily geometrically distributed) sample size. Voorn [7], [8] also investigated the class of max-stable random variables with respect to the random sample size and proved some characteristic properties for that class. The object of this article is to establish properties of random variables stable with respect to the random sample size, parallel to the properties of r.v.'s which are max-stable with respect to a random sample size, see [7], [8].

Let $X_1, X_2, \ldots$ be a sequence of nondegenerate, nonnegative i.i.d. random variables with the Laplace transform f and let $\nu_n$ be a sequence of positive integer-valued r.v.'s, independent of $X_1, X_2, \ldots$, such that

$$p_{nk} = P(\nu_n = k), \qquad \sum_{k=1}^{\infty} p_{nk} = 1 \quad \text{for all } n,$$

and also $p_{n1} < 1$, $p_{n1} \to 1$ as $n \to \infty$. If there exists a sequence of positive constants $a_n > 0$ such that

$$p_{n1} f(t) + p_{n2} f^2(t) + \cdots = f(a_n t) \tag{1}$$

holds for all n and $t \in [0, \infty)$, then we say that the Laplace transform f is stable with respect to the random sample size $\nu_n$. (As usual, we shall say that a random variable, or probability distribution, is stable with respect to the random sample size if its Laplace transform is stable with respect to the random sample size.) From the equation (1) it is obvious that stability with respect to the random sample size $\nu_n$ means that the sum of a random number of random variables is, for each n, of the same type as each summand of that sum.
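Equation (1) can be checked numerically in the classical geometric case. The sketch below is an illustration under stated assumptions and not part of the paper: it takes $f(t) = 1/(1 + t^c)$ with $0 < c \le 1$ (the Laplace transform of a Mittag-Leffler type law), a geometric sample size $P(\nu = k) = p(1-p)^{k-1}$, and the scaling constant $a = p^{-1/c}$. Summing the geometric series gives $p f / (1 - (1-p) f) = 1/(1 + t^c/p) = f(at)$, and the code confirms this identity by direct summation. The function names are hypothetical.

```python
def laplace_transform(t, c):
    """f(t) = 1 / (1 + t**c), assumed example of a stable Laplace transform."""
    return 1.0 / (1.0 + t ** c)


def random_sum_transform(t, c, p, terms=2000):
    """Left side of (1): sum over k of p (1 - p)**(k - 1) * f(t)**k, truncated."""
    f = laplace_transform(t, c)
    return sum(p * (1.0 - p) ** (k - 1) * f ** k for k in range(1, terms + 1))


def scaling_constant(c, p):
    """a = p**(-1/c); it exceeds 1 whenever 0 < p < 1."""
    return p ** (-1.0 / c)


if __name__ == "__main__":
    c, p = 0.7, 0.9
    a = scaling_constant(c, p)
    for t in (0.0, 0.5, 3.0):
        print(t, random_sum_transform(t, c, p), laplace_transform(a * t, c))
```

Letting p tend to 1 along a sequence $p_n$ gives $p_{n1} \to 1$ and $a_n \to 1$, matching the setting of the definition.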

Lemma. If the Laplace transform f is stable with respect to the random sample size, then:

(i) $a_n > 1$ for all n, and $a_n \to 1$ when $p_{n1} \to 1$;

(ii) $f(t) > 0$ for all t.

The following theorem gives the necessary condition for a r.v. X to be stable with respect to the random sample size.

Theorem 1. Let X be stable with respect to the random sample size $\nu_n$. Then there exist nonnegative constants $c_k$, uniquely determined by X, such that

$$c_k = \lim_{n \to \infty} \frac{p_{nk}}{\ln a_n}, \quad k = 2, 3, \ldots, \qquad c_2 + c_3 + \cdots < \infty,$$

and the Laplace transform f of X satisfies the following differential equation:

$$-t f'(t) = c_2 [f(t) - f^2(t)] + c_3 [f(t) - f^3(t)] + \cdots \qquad \forall t \in [0, \infty). \tag{2}$$
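Theorem 1 can be sanity-checked in the geometric example, again as a hedged illustration rather than anything stated in the paper. Assume $f(t) = 1/(1 + t^c)$, geometric weights $p_{nk} = p_n (1 - p_n)^{k-1}$ and $a_n = p_n^{-1/c}$, so that $\ln a_n = -\ln(p_n)/c$. Then $p_{n2} / \ln a_n \to c$ and $p_{nk} / \ln a_n \to 0$ for $k \ge 3$, so the constants should be $c_2 = c$ and $c_k = 0$ otherwise, and equation (2) should read $-t f'(t) = c\,[f(t) - f^2(t)]$. The code evaluates the ratios for $p_n$ close to 1 and the residual of the differential equation with a central difference; all function names are hypothetical.

```python
import math


def geometric_weight(k, p):
    """p_{nk} for a geometric sample size with success probability p."""
    return p * (1.0 - p) ** (k - 1)


def limit_ratio(k, p, c):
    """p_{nk} / ln(a_n) with a_n = p**(-1/c), i.e. ln(a_n) = -ln(p) / c."""
    return geometric_weight(k, p) / (-math.log(p) / c)


def ode_residual(t, c, step=1e-6):
    """-t f'(t) - c (f(t) - f(t)**2) for f(t) = 1 / (1 + t**c), via central differences."""
    def f(u):
        return 1.0 / (1.0 + u ** c)

    derivative = (f(t + step) - f(t - step)) / (2.0 * step)
    return -t * derivative - c * (f(t) - f(t) ** 2)


if __name__ == "__main__":
    c = 0.7
    for p in (0.9, 0.99, 0.999999):
        print(p, limit_ratio(2, p, c), limit_ratio(3, p, c))
    print(ode_residual(1.5, c))
```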

We shall be concerned with the equation (2), which is satisfied by every Laplace transform stable with respect to the random sample size. The question is whether the equation (2) characterizes the underlying class of random variables, and if not, to determine some additional conditions which together with (2) yield the stability with respect to the random sample size.

Theorem 2. Let $f: [0, \infty) \to R$ be a differentiable function, such that $0 < f(t_0) < 1$ for at least one $t_0 \in (0, \infty)$, which satisfies the following differential equation

$$-t f'(t) = c_2 [f(t) - f^2(t)] + c_3 [f(t) - f^3(t)] + \cdots,$$

where $c_2, c_3, \ldots \ge 0$, $c = c_2 + c_3 + \cdots < +\infty$. Then we have

(i) $f \in C^\infty$, f is strictly decreasing on $(0, \infty)$, $f(0) = 1$;

(ii) there exist functions $p_1(a), p_2(a), \ldots$,

$$\sum_{k=1}^{\infty} p_k(a) = 1, \qquad a \ge 1,$$

such that

$$f(at) = p_1(a) f(t) + p_2(a) f^2(t) + \cdots \quad \text{for all } t \in [0, \infty);$$

(iii) $p_1(a) = a^{-c}$, $p_k(1) = 0$, and

$$p_k(a) = a^{-kc} \sum_{i=1}^{k-1} i\, c_{k-i+1} \int_1^a v^{kc-1} p_i(v)\, dv, \qquad k = 2, 3, \ldots$$
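The recursion in part (iii) lends itself to direct numerical evaluation. The following sketch assumes the reading $p_1(a) = a^{-c}$ and $p_k(a) = a^{-kc} \sum_{i=1}^{k-1} i\, c_{k-i+1} \int_1^a v^{kc-1} p_i(v)\, dv$, which is the form derived in the proof of the theorem, and approximates the integrals with the trapezoidal rule on a uniform grid (a choice made here for simplicity, not by the author). In the special case $c_2 = c$ and all other $c_k = 0$ the integrals can be done by hand and give the geometric weights $p_k(a) = a^{-c}(1 - a^{-c})^{k-1}$, which serves as a check. The function name and the dictionary format for the constants are hypothetical.

```python
def mixing_weights(c_coeffs, a, k_max, steps=2000):
    """Approximate p_1(a), ..., p_{k_max}(a) from the recursion in Theorem 2 (iii).

    c_coeffs maps k >= 2 to the constant c_k (missing keys mean c_k = 0).
    Integrals over [1, a] are computed with the trapezoidal rule.
    """
    c = sum(c_coeffs.values())
    grid = [1.0 + (a - 1.0) * j / steps for j in range(steps + 1)]
    weights = {1: [v ** (-c) for v in grid]}  # p_1(v) = v**(-c)
    for k in range(2, k_max + 1):
        integrand = [
            v ** (k * c - 1.0)
            * sum(i * c_coeffs.get(k - i + 1, 0.0) * weights[i][j] for i in range(1, k))
            for j, v in enumerate(grid)
        ]
        cumulative = [0.0]  # running integral from 1 to grid[j]
        for j in range(1, steps + 1):
            width = grid[j] - grid[j - 1]
            cumulative.append(cumulative[-1] + 0.5 * width * (integrand[j - 1] + integrand[j]))
        weights[k] = [v ** (-k * c) * cumulative[j] for j, v in enumerate(grid)]
    return [weights[k][-1] for k in range(1, k_max + 1)]


if __name__ == "__main__":
    c, a = 0.7, 2.0
    numeric = mixing_weights({2: c}, a, 5)
    r = a ** (-c)
    exact = [r * (1.0 - r) ** (k - 1) for k in range(1, 6)]
    for k, (x, y) in enumerate(zip(numeric, exact), start=1):
        print(k, x, y)
```

Storing each $p_i$ on the whole grid is what allows the next weight to be integrated without re-evaluating earlier ones.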

We are interested to know whether and when the function f satisfying the equation

from the preceding theorem, is completely monotone in which case we would have that J is a Laplace transform and, accordingly, J would be stable with respect to the random sample size. Let us define functions 1, we have from (1): co

0= f(to) =

L

Pnkf(to/an)k = f(to/a n)

k=l

=L 00

k=l

Pnk

= ... =

Vi = 1,2,'"

72

Since $t_0 / a_n^i \to 0$ as $i \to \infty$, from the continuity of f it would follow that $f(0) = 0$, which is impossible. Therefore there must be $f(t) > 0$, $t \in (0, \infty)$.

Proof of Theorem 1. Our Laplace transform f is a function $f: [0, \infty) \to (0, 1]$, $f(0) = 1$, $f(\infty) = 0$, strictly monotonically decreasing. Putting $t = e^{-x}$ for $t > 0$ and $a_n = e^{A_n}$ in the equation (1) we get

$$p_{n1} f(e^{-x}) + p_{n2} f^2(e^{-x}) + \cdots = f(e^{-(x - A_n)}).$$

Let us define the function F in the following way: $F: R \to (0, 1)$, $F(x) = f(e^{-x})$. Obviously F strictly increases and satisfies the following equation:

$$p_{n1} F(x) + p_{n2} F^2(x) + \cdots = F(x - A_n), \tag{4}$$

$p_{nk} \ge 0$, $\sum_k p_{nk} = 1$, $p_{n1} \to 1$ as $n \to \infty$. But that means exactly that F is max-stable with respect to the random sample size (see [7]-[8]) and accordingly the following necessary condition is satisfied (see [7]): there exist non-negative constants $c_k$, $k = 2, 3, \ldots$, with $c_2 + c_3 + \cdots < \infty$, such that

$$\frac{dF(x)}{dx} = c_2 [F(x) - F^2(x)] + c_3 [F(x) - F^3(x)] + \cdots$$

for all x. If we return to our case, we get

$$-e^{-x} f'(e^{-x}) = c_2 [f(e^{-x}) - f^2(e^{-x})] + c_3 [f(e^{-x}) - f^3(e^{-x})] + \cdots,$$

i.e. $-t f'(t) = c_2 [f(t) - f^2(t)] + c_3 [f(t) - f^3(t)] + \cdots$

Proof of Theorem 2. The following theorem was proved in [8]: Let $F: R \to R$ be a differentiable function, such that $0 < F(x_0) < 1$ for some $x_0 \in (-\infty, \infty)$, which satisfies the differential equation

$$F'(x) = c_2 (F(x) - F^2(x)) + c_3 (F(x) - F^3(x)) + \cdots$$

Then the following holds:

(a) $F \in C^\infty$, F strictly increases, $F(-\infty) = 0$, $F(\infty) = 1$.

(b) There exists a sequence $P_1(A), P_2(A), \ldots$, $\sum_{k=1}^{\infty} P_k(A) = 1$, $A \ge 0$, such that $F(x - A) = P_1(A) F(x) + P_2(A) F^2(x) + \cdots$

(c) $P_1(A) = e^{-cA}$, $P_k(0) = 0$, $k = 2, 3, \ldots$,

$$P_k(A) = \exp(-kcA) \sum_{i=1}^{k-1} i\, c_{k-i+1} \int_0^A \exp(kcv) P_i(v)\, dv.$$

Let us prove our theorem. Put $F(x) = f(e^{-x})$, i.e. $f(t) = F(-\ln t)$, $t > 0$. Then for F all conditions of Voorn's theorem [8] hold, so we have

(i) $f \in C^\infty$ (as a composition of two $C^\infty$ functions), f strictly decreases (because F strictly increases and $-\ln$ strictly decreases), $f(0) = 1$, $f(+\infty) = 0$;

(ii) We have

$$f(at) = F(-\ln at) = F(-\ln a - \ln t) = P_1(\ln a) F(-\ln t) + P_2(\ln a) F^2(-\ln t) + \cdots = p_1(a) f(t) + p_2(a) f^2(t) + \cdots,$$

where we put $p_k(a) = P_k(\ln a)$.

(iii) It follows that

$$p_1(a) = P_1(\ln a) = e^{-c \ln a} = a^{-c}, \qquad p_k(1) = P_k(\ln 1) = P_k(0) = 0,$$

$$p_k(a) = P_k(\ln a) = \exp(-kc \ln a) \sum_{i=1}^{k-1} i\, c_{k-i+1} \int_0^{\ln a} \exp(kcV) P_i(V)\, dV$$

$$= a^{-kc} \sum_{i=1}^{k-1} i\, c_{k-i+1} \int_1^a \exp(kc \ln v) P_i(\ln v) v^{-1}\, dv = a^{-kc} \sum_{i=1}^{k-1} i\, c_{k-i+1} \int_1^a v^{kc-1} p_i(v)\, dv$$

(we perfomed the change V = In v). Proof of Theorem 3. Denote by fen) the nth derivative of f. Let us show that j(n) could be written in the following way: jCn)(t) = (_1)n a/q) ::; (12 q(a _ q)-2. q(l- q)k- 1kp(X1,X1)

80 If we have additional information about {Xd (not only m2 < improve the estimate (4.7). E.g. iffor some s > 2 there exists

(4.8)

m.

= EXt
2

=1

(i.e. a

1.8). Then for m2 ::; 2 £1 (a, m2) ::; £1 (a, 2)

f2(a, m2) ::; (ae- 1 + a- 2 ) m2 - e- 1 - a- 2 ::; m2 - 1, as m2 1 and ae- 1 Hence,

+ a- 2 < l. sup ( Eq(x) - W(x» :0;

(4.40)

::; q max(l, m2 -1) + o(q)

== £(m2)q + o(q).

=

84

From (3.7) and (2.11) we have sup ( W(x) - Eq(x)) S q max (1, m2 -1)

(4.41 )

r

== c(m2)q.

Thus, the lower bound W is closer to Eq(x) than Wand, hence, the final result is that

sup(W(x) -Eq(x)) Sq max(1,m2 -1).

(4.42)

r

Now compare estimates (4.38), (4.39), (4.41), (4.42) with similar ones from Brown [1] (see Corollary 2.3 in this paper). In our notations Brown proved the following inequality (for q --+ 0): sup ( W( x) - E q ( x)) S q max (m2 /2 r

+ 1,

m2 - 1)

+ o(q)

(4.43) One can see that the following relations are true

so our estimates are better than Brown's ones at least for small q. Of course, there exist further possibilities for improvements of the above estimates. In Theorems 4.2 and 4.3 the estimates $\underline{W}(x)$ and $\overline{W}(x)$ were compared with $E_q(x)$. The only reason for this was that the d.f. of the r.v. $qS_\nu$ converges to $E_1(x)$. In order to apply these estimates in engineering, it is more reasonable to consider a function

W(x)

(4.44)

=

t (W(x) + W(x))

as an approximation of an unknown d.f. W(x). It is clear that

supIW(x)-W(x)1 S tsup(W(x)-W(x)).

(4.45)

r

r

The similar arguments as were used for proving estimates (3.7), (4.33) and (4.37) follow: Theorem 4.4. For any a > 0 the following inequalities are true: (i) If (1 + Aq)(l - q)H,sH S 1 then sup ( W(x) - W(x)) S max { -q-(b H

1-q

r

+ 1 + Aq), 1-q

m2 qi a-q

[_1_ exp( aAq ) -1] 1- q 1 + Aq + qe- L: q + max(b 1)]}.

+ e-a

a

(ii)

If

(1

+ Aq)(l -

r

(4.47)

> 1 then

q)l+,sH

sup ( W(x) - W(x)) Sq[((A

H ,

+ 1)

1-

q)

X exp { -

A - 1 - bH - Aq(2 + bH (1+AqHA+1)

) }

m2 - 1 ] + .

(a-q)2

85

Choose again a 1.8 so that a = 1. Then it is clear that A ::; m2 + 1 and so we can use only the inequality (4.46). Besides, one can prove that the maximal term in the right-hand side of (4.46) is the first one. It means that

+-Aq) sup(W(x)-W(x))::; -q- ( m 2 - 11+ l-q

x

l-q

= qm2 + o(q).

(4.48) Hence,

sup IW(x) - W(x)

(4.49)

x

I ::;

O.5m2q

+ o(q).

5. Lower and upper bounds for W(x) when Xl has higher moments. It is possible to improve the above estimates if we do know something about higher moments than m2. The upper bound W(x) has the previous form (3.2) and as it was pointed out in Section 2, it can be improved if we have some additional information about Xl (see (2.12)). As for a lower bound, it again has the form of (4.26) and hence, Theorems 4.2 and 4.3 stay valid. But the parameters A in (4.22) can be chosen in another way and namely this permits to improve them. Let us suppose first that for some A > 0 there exists an exponential moment E exp ( AXI

(5.1)

)

= J1.).

1 ,the following inequality is true

(5.13) a constant c depending only on m. (see formula (25) in Kalashnikov & Vsekhsviatskii [7]).

6. Stability of W(x). Let

(6.1)

F(x)=P(X 1

In this Section we denote a corresponding d.f. W of r.v. 51/ as W F have another sequence {X:} of i.i.d. r.v.'s such that E Xf = 1 and

.

Suppose that we

(6.2) Let the d.f. of be W F' • The following question arises: What is the difference between W F and W F' in terms of the diference F and F'? Denote

(6.3)

Ow(q)

(6.4)

OF

= sup IWF(x) - W Ff (x) I, x

= sup I F(x) - F'(x) I· x

Lemma 6.1.

(6.5) Proof.

Ow (q) = sup x

sL 00

k=1

L

IL 00

k=1

q(1- q)k-1 (F.k (x) - F;k (x))

q(1 - q)k-1 sup I F: (x) - F;k(x) I x

00

1 . This expression implies the possibility of reducing the problem to testing the hypothesis, which may be classified as the Lehmann alternative, with the use of nonparametric statistical methods, the proportional hazards model by Cox [32] included.

3. Properties of the model. In this Section we dwell on reproducing dose-rate effects using the following expression for survival function

$$\bar G(t) = \exp\Big\{ -\frac{\theta D}{t \wedge T}\, p \int_0^{t \wedge T} F(t - x)\, dx \Big\}, \tag{9}$$

where $\theta$ is assumed to be nonrandom. This expression describes the observation process which is fairly close to experimental designs in dose-rate studies. Note that for estimation of the carcinogenic risk at time t not formula (9) but (7) must be used. It is worth reminding (see formula (3)) that the probability p depends on dose rate via the parameter $\gamma = \theta D / (t \wedge T)$ (compare with formula (2)) and it follows from (3) that $p \to 1$ as $\mu \to 0$. We shall consider the cases of $p = 1$ (no lesion repair) and $0 < p < 1$ separately. Let D be fixed and $p = 1$, $t = T + \tau$, $\tau \ge 0$. In this special case expression (9) turns into the following one

$$\bar G(t) = \exp\Big\{ -\frac{\theta D}{T} \int_0^T F(T + \tau - x)\, dx \Big\}.$$

Then for all $\tau > 0$

$$\frac{\partial}{\partial T} \Big( \frac{\theta D}{T} \int_0^T F(T + \tau - x)\, dx \Big) \ge 0.$$

Consequently, if $T_1 < T_2$ then $\bar G(T_1 + \tau) > \bar G(T_2 + \tau)$, i.e. the "paradoxical" effect of high LET radiation is demonstrated. Note that $\bar G(T_1 + \tau)$ and $\bar G(T_2 + \tau)$ correspond to the survival functions for two different r.v.'s defined on the same measurable space. The same conclusion can be drawn from (9), when $p = 1$ and $t \le T$. In this case

j F(t-x)dx}

G(t)=exp{

o

and

t

j F( t - x) dx ) :::::

0

o

for all $0 < t \le T$. Therefore, $\bar G(t_1) > \bar G(t_2)$ for $t_1 < t_2$. This effect also takes place when the conditional survival function, defined by (8), is under consideration instead of the improper distribution (9). For $t > T$ it is interesting to consider another indicator of carcinogenic effect, namely, the conditional survival function for the r.v. U, given that the disease has not been detected up to the end of irradiation, i.e.

p{

v(T)

v(T)

I\(Ei+Xi»T+T i=O

I 1\ (Ei + Xi) > T} i=O

T

= exp{ -p 8; jIF(T + T - x) - F(T - x)] dx }. o Setting p = 1 , we have T

(8; !IF(T + T o

x) - F(T - x)] dX)

94

8D { F(T+T)-F(T) =1' T

-

![F(T + T - x) - F(T - x)J dx } . o

The situation is slightly more complicated here, because the sign of this derivative depends on the properties of the function $F$. For example, if $F$ is convex on the interval $[0, T+\tau]$, then the dose-rate effect described above takes place. The opposite effect occurs if $F$ is concave on $R_+$. As far as unimodal distributions $F$ are concerned, the following most typical situation can be outlined: when $T + \tau$ is below some threshold value, the conditional survival probability decreases with increasing duration $T$ of radiation exposure, the total dose $D$ being fixed, but above this threshold it becomes an increasing function of $T$. Now let us consider the same problem when $p$ is a function of the dose rate, given by (3), and its values are less than unity. In this case we should investigate the derivative

F(T+T-X)dx)

TO::=o ck(JIT)m-k jk!)2

o X

{f

[ F(T + T)

ck

k=O

-!

T

!

T

F(T

+T

-

x) dX] -

o

F(T

+T

-

x) dx

0

where $\theta D$ is denoted by $c$. The sign of this derivative depends on the interrelationships between the parameters $T$, $\tau$, $\mu$ and $m$. In particular, for $F(T) > 1/(m+1)$ and sufficiently small values of $\mu$, a decreased survival (increased tumour incidence) with increasing exposure duration (at fixed total dose) should be expected, but if the values of $\mu$ are sufficiently large, the opposite effect manifests itself. If the period of irradiation is longer than the period of observation, i.e., $t \wedge T = t$, then taking into account that

$$\int_0^{t} F(t-u)\,du \sim t, \qquad t \to \infty,$$

we have

t

00.

In the case of suppressed repair, i.e. $p = 1$, the asymptotic behaviour of $\bar G(t)$ as $t \to \infty$ is completely determined by the total dose value, and no dose-rate effects can be revealed. Proceeding from the results of this section we come to an important conclusion: the manifestation of either dose-rate effect depends crucially on the expression of repair processes and on the way of observation and data analysis chosen in a given experimental or epidemiologic study. Within the proposed model the suppressed repair assumption gives a natural explanation of the dose-rate effects documented for high LET radiation.
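The monotonicity statements of this section for $p = 1$ can be checked numerically. The following is a minimal sketch, not part of the original paper: the latency c.d.f.'s (`weibull_cdf`, `convex_cdf`, `concave_cdf`), the value $\theta D = 1$, and the midpoint quadrature are all illustrative assumptions. It evaluates the survival function $\exp\{-(\theta D/T)\int_0^T F(T+\tau-x)\,dx\}$ and the conditional survival function $\exp\{-(\theta D/T)\int_0^T [F(T+\tau-x)-F(T-x)]\,dx\}$ for several exposure durations $T$ at fixed total dose.

```python
import math


def integrate(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def survival(F, T, tau, theta_D=1.0):
    """Survival at t = T + tau for p = 1:
    exp{-(theta*D/T) * int_0^T F(T + tau - x) dx}."""
    mean_F = integrate(lambda x: F(T + tau - x), 0.0, T) / T
    return math.exp(-theta_D * mean_F)


def conditional_survival(F, T, tau, theta_D=1.0):
    """Conditional survival at T + tau given no detection by T, for p = 1."""
    mean_inc = integrate(lambda x: F(T + tau - x) - F(T - x), 0.0, T) / T
    return math.exp(-theta_D * mean_inc)


# Illustrative latency c.d.f.'s (assumptions, not taken from the paper).
def weibull_cdf(x, shape=2.0, scale=5.0):
    return 1.0 - math.exp(-((x / scale) ** shape)) if x > 0 else 0.0


def convex_cdf(x):
    """F(x) = x^2 on [0, 1], which is convex there."""
    return min(max(x, 0.0), 1.0) ** 2


def concave_cdf(x):
    """Exponential c.d.f., which is concave on [0, infinity)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0


if __name__ == "__main__":
    for T in (0.5, 1.0, 2.0, 4.0):
        print(T,
              survival(weibull_cdf, T, tau=1.0),
              conditional_survival(concave_cdf, T, tau=1.0))
```

With these choices the unconditional survival decreases as $T$ grows, while the conditional survival decreases for the convex $F$ and increases for the concave $F$, in line with the sign analysis of the derivatives above.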

4. Acute irradiation. It is interesting to state conditions that ensure the validity of formula (1) from the viewpoint of the proposed model. In the case of acute irradiation at high dose rates ($\gamma \to \infty$) it follows from formula (3) that one may set $p = 1$. Recall formula (9) and assume that $\theta$ is gamma distributed with shape parameter $r$ and a scale parameter. Setting $p = 1$ and compounding (9) with respect to this special distribution of the r.v. $\theta$, we obtain

f

lilT

(10)

r

F(t-x)dxr ,

o

It is easy to verify that this randomized version of model (9) also predicts an increased tumour incidence with increasing dose rate. Setting $t = T + \tau$, $\tau > 0$, and letting $T \to 0$ in formula (10), we get

_

Gu(r)

=

(+

VF(r)

)r

To find the form of Gu (r) for large dose values introduce the normalizing factor N = V Assuming the following asymptotic behaviour of the c.d.f, F(x) in the neighbourhood of the origin

(11)

lim x-II>. F(x)

:

0,

it is easy to show that the limit c.d.f, for the r.v. N>'U has the form of a Pareto type distribution:

which coincides with distribution (1).

Remark 2. Keeping in mind that $r = 1/v^2$, where $v$ is the coefficient of variation of the cell (or individual) radiosensitivity, one can infer from (1) and (10) that variability in response to irradiation is a factor that diminishes the carcinogenic risk.

Remark 3. The failure rate function corresponding to G(t), given by (1), has one maximum. On the other hand, if we set $ar = \delta$, where $\delta$ is a positive constant, then G(t) tends to the Weibull distribution as $r \to \infty$. It can be shown that sup IG(t) - «: -

6/1/'

I

1/er.

Thus, distribution (1) can be quite a good approximation for the Weibull distribution with monotone failure rate. In the work of Kadyrova et al [17] the following reasoning was used to obtain distribution (1). The latent period duration U is described as U = NY, where Y = (the


value of P (1) = 0) is neglected), and N is an appropriate normalizing factor. The times Xi until lesion realization are assumed to be i.i.d. r.v.'s with a common c.d.f. F(x). The r.v. 1) is negative binomial distributed with parameters q and, and is independent of the sequence X = (X I,X2 , .•• ). Considering only integer values of" the r.v, Y may be represented in the form: Y = min Yj where Y j = min Xi, ' and

are i.i.d. r.v.'s with common geometric distribution

(12)

k=I,2, .. ·,

O'X (13)

axIl>' -1+ax l / >"

H(x) - ---;-:-

x

o < x < 1,

0,

a> O.

Thus we state the weak convergence q ->'X

(14)

d

->.Z ,

where $Z = \min_{1\le i\le \xi} Z_i$, the sequence $Z = (Z_1, Z_2, \dots)$ consists of nonnegative i.i.d. r.v.'s $Z_i$ with c.d.f. (13), the r.v. $\xi$ is distributed according to (12) and does not depend on $Z$. Besides, we may write

(15)

mm q

->.yj

d

•

rrun q

->.wj,

where

It is clear that the r.v. in the right-hand side of (15) has distribution (1). This way of reasoning has some advantages when studying intrinsic structure of the mathematical model. For instance, the stability of convergence can be estimated. The following theorem gives an estimate of the convergence rate in (14).

Theorem. Let x be a set of all nonnegative r.v.ts defined on a non-atomic probability .1', P), X = (Xl, X 2 , ... ) be a of i.i.d. Xl E x and X = min Xi, where

has distribution (12) and does not depend on X. Then for every s > 1/,\

(16) In the above estimate, p

the weighted Kolmogorottmetric

C,

XE

x,


C

= 1 + sup

U

2

H'(u),

and p((,x) = Po ((,X), il the Kolmoqorou's uniform metric. The proof was given in [17J. To obtain a similar result concerning the rate of convergence in (15) we use the following argument p( min min Xi, min min Zi) 1$j$r

l$j$r 1$i${j

r

=p( min Yj, min 1$J$r

1$J$r

j=1

Hence, for relation (15) result (16) remains valid but the constant C is to be replaced by

-c.

Acknowledgements. This research was supported by a Yamagiwa-Yoshida Memorial Cancer Study Grant of the International Union Against Cancer and by WHO Research Grant HQ/89/191138.

REFERENCES

[1] E. J. Ainsworth, Radiation carcinogenesis-perspectives, in: Probability Models and Cancer, North Holland, Amsterdam, 1982, pp. 99-169.
[2] D. A. Pierce and M. Vaeth, The shape of the cancer mortality dose-response curve for atomic bomb survivors, RERF Tech. Rep., 7 (1989), pp. 1-23.
[3] S. H. Moolgavkar and D. J. Venzon, Two-event models for carcinogenesis: incidence curves for childhood and adult tumors, Math. Biosci., 47 (1979), pp. 55-77.
[4] S. H. Moolgavkar and A. G. Knudson, Mutation and cancer: a model for human carcinogenesis, J. Nat. Cancer Inst., 66 (1981), pp. 1037-1052.
[5] S. H. Moolgavkar, A. Dewanji and D. J. Venzon, A stochastic two-stage model for cancer risk assessment. 1. The hazard function and the probability of tumor, Risk Anal., 8 (1988), pp. 383-392.
[6] E. G. Luebeck and S. H. Moolgavkar, Stochastic analysis of intermediate lesions in carcinogenesis experiments, Risk Anal., 11 (1991), pp. 149-157.
[7] A. G. Knudson, Jr., Two-event carcinogenesis: roles of oncogenes and antioncogenes, in: Scientific Issues in Quantitative Cancer Risk Assessment, S. H. Moolgavkar, Ed., Birkhäuser, Boston, 1990, pp. 32-48.
[8] J. Neyman and P. S. Puri, A structural model of radiation effects in living cells, Proc. Nat. Acad. Sci. USA, 73 (1976), pp. 3360-3363.
[9] J. Neyman and P. S. Puri, A hypothetical stochastic mechanism of radiation effects in single cells, Proc. Roy. Stat. Soc. London, Ser. B, 213 (1981), pp. 134-160.
[10] P. S. Puri, A hypothetical stochastic mechanism of radiation effects in single cells: some further thoughts and results, in: Probability Models and Cancer, North Holland, Amsterdam, pp. 171-187.
[11] G. L. Yang and C. W. Chen, A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays, Math. Biosci., 104 (1991), pp. 247-258.
[12] I. L. Kruglikov and A. Yu. Yakovlev, Stochastic models in cell radiobiology: a survey, submitted to Adv. Appl. Probab.
[13] J. J. Chen, R. L. Kodell and D. Gaylor, Using the biological two-stage model to assess risk from short-term exposures, Risk Anal., 6 (1988), pp. 223-230.
[14] D. Krewski and D. J. Murdoch, Cancer modeling with intermittent exposures, in: Scientific Issues in Quantitative Cancer Risk Assessment, S. H. Moolgavkar, Ed., Birkhäuser, Boston, 1990, pp. 196-214.
[15] C. W. Chen and A. Moini, Cancer dose-response models incorporating clonal expansion, in: Scientific Issues in Quantitative Cancer Risk Assessment, S. H. Moolgavkar, Ed., Birkhäuser, Boston, 1990, pp. 153-175.
[16] S. H. Moolgavkar, G. Luebeck and M. de Gunst, Two mutation model for carcinogenesis: relative roles of somatic mutations and cell proliferation in determining risk, in: Scientific Issues in Quantitative Cancer Risk Assessment, S. H. Moolgavkar, Ed., Birkhäuser, Boston, 1990, pp. 136-152.
[17] N. O. Kadyrova, L. B. Klebanov, S. T. Rachev and A. Yu. Yakovlev, A latent time distribution for the carcinogenic risk estimation, Tech. Rep., Univ. Santa Barbara, 105 (1990), pp. 1-19.
[18] E. J. Hall, The dose-rate factor in radiation biology, Int. J. Radiat. Biol., 59 (1991), pp. 595-610.
[19] J. J. Broerse, L. A. Hennen, and M. J. Van Zweiten, Radiation carcinogenesis in experimental animals and its implications for radiation protection, Int. J. Radiat. Biol., 48 (1985), pp. 167-187.
[20] Dose-response Relationships for Radiation-induced Cancer, Rep. (1984) of Un. Nat. Sci. Commit. Effect. Atom. Radiat., A/AC.82/R.424.
[21] H. H. Rossi and A. M. Kellerer, The dose-rate dependence of oncogenic transformation by neutrons may be due to variation of response during the cell cycle, Int. J. Radiat. Biol., 50 (1986), pp. 353-361.
[22] M. M. Turner, Some classes of hit-target models, Math. Biosci., 23 (1975), pp. 219-235.
[23] A. Yu. Yakovlev and A. V. Zorin, Computer Simulation in Cell Radiobiology, Lect. Notes Biomath., Springer, 74 (1988).
[24] C. S. Tobias, The repair-misrepair model in radiobiology: comparison to other models, Radiat. Res. Suppl., 8 (1985), pp. 77-95.
[25] H. M. Taylor and S. Karlin, An Introduction to Stochastic Modeling, Academ. Press, New York, 1984.
[26] D. R. Cox and D. Oakes, Analysis of Survival Data, Chapman and Hall, London, 1983.
[27] B. V. Gnedenko, On some stability theorems, in: Stability Problems for Stochastic Models, Lect. Notes Math., Springer, 982 (1983), pp. 24-31.

ST. PETERSBURG TECHNICAL UNIVERSITY, POLYTECHNICHESKAYA 29, ST. PETERSBURG 195251, RUSSIA

INSTITUTE OF MATHEMATICAL GEOLOGY, SHPALERNAJA 12, ST. PETERSBURG 191187, RUSSIA

V. Yu. Korolev and V. M. Kruglov

LIMIT THEOREMS FOR RANDOM SUMS OF INDEPENDENT RANDOM VARIABLES

1. Introduction. The problem under discussion emerged more than 40 years ago. The asymptotic behaviour of random sums is traditionally studied in two limit schemes: the double array scheme and the scheme of "growing" sums. When the asymptotic behaviour of sums of a nonrandom number of random variables is investigated, it does not matter whether the sums themselves or the separate summands are centered. In this case the scheme of "growing" sums turns out to be a particular case of the more general double array scheme. This stereotype of invariance of the result with respect to centering, which exists in the classical theory of summation of random variables, leads to a sort of contradiction, which is as follows. If random sums are considered in the scheme of "growing" sums, then location mixtures of laws of class L may appear as limit distributions, as was shown by H. Robbins [1]. But these mixtures cannot be limiting for the distributions of random sums in the classical scheme of a double array of centered summands considered by B. V. Gnedenko and his followers [2, 3, 4]. So it seems as if the scheme of "growing" random sums were not a particular case of the double array scheme. Evidently, unlike the case of sums of a nonrandom number of random variables, for sums of a random number of random variables centering of sums and centering of summands lead to different results. In the present paper we generalize [3] and [5, Chapter 2] and show that the consideration of nonrandomly centered sums in the double array scheme removes the above contradiction. We also mention that nonrandom centering seems to be much more reasonable from the practical point of view than the random centering that appears when random sums of centered summands are considered.

Let $\{X_{nj}\}_{j\ge1}$, $n = 1, 2, \dots$, be a double array of rowwise independent random variables (r.v.'s). Let $\{N_n\}_{n\ge1}$ be a sequence of positive integer-valued r.v.'s such that for each $n \ge 1$ the r.v. $N_n$ and the sequence $\{X_{nj}\}_{j\ge1}$ are independent. For a natural $k$ put $S_{n,k} = X_{n1} + \dots + X_{nk}$.

Throughout this paper the symbol => will stand for weak convergence; p - lim means the limit in probability.
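The difference between centering the sum and centering the summands can be seen in a small simulation. The sketch below is only an illustration under assumed distributions (exponential summands with mean `mean`, and an index $N$ uniform on $\{1,\dots,2n-1\}$, so that $EN = n$); none of these choices come from the paper. It compares the variance of the nonrandomly centered sum $S_N - n\,EX$ with that of the sum of centered summands $\sum_{j\le N}(X_j - EX)$. By the law of total variance the former equals $EN\cdot\operatorname{Var}X + (EX)^2\operatorname{Var}N$ and the latter $EN\cdot\operatorname{Var}X$.

```python
import random
import statistics


def simulate(n=50, mean=2.0, trials=4000, seed=1):
    """Return sample variances of (S_N - n*mean) and of sum_{j<=N}(X_j - mean).

    X_j are exponential with the given mean and N is uniform on {1, ..., 2n-1};
    both distributions are illustrative assumptions.
    """
    rng = random.Random(seed)
    centered_sums = []
    sums_of_centered = []
    for _ in range(trials):
        N = rng.randint(1, 2 * n - 1)            # E N = n
        s = sum(rng.expovariate(1.0 / mean) for _ in range(N))
        centered_sums.append(s - n * mean)       # nonrandom centering of the sum
        sums_of_centered.append(s - N * mean)    # centering of each summand
    return (statistics.variance(centered_sums),
            statistics.variance(sums_of_centered))


if __name__ == "__main__":
    v_sum, v_summands = simulate()
    print("variance, centered sum:        ", v_sum)
    print("variance, centered summands:   ", v_summands)
```

The extra term $(EX)^2 \operatorname{Var} N$ is what produces the location mixtures mentioned above when the summands have nonzero mean.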

2. Sufficient conditions for the convergence of distributions of random sums of identically distributed summands. At first we consider the case of identically distributed summands, which turns out to be very illustrative for the structure of limit laws.

Lemma 2.1. Assume that for some sequences of numbers $\{a_n\}_{n\ge1}$ and $\{k_n\}_{n\ge1}$ ($a_n$ are real, $k_n$ are natural, $k_n \to \infty$ as $n \to \infty$) and for some r.v. $Y$

$$S_{n,k_n} - a_n \Rightarrow Y, \qquad n \to \infty. \tag{2.1}$$

Assume also that the family $\{k_n^{-1} N_n\}_{n\ge1}$ is weakly relatively compact. Let $\{c_n\}_{n\ge1}$ be a sequence of real numbers. Then for each $t \in R^1$ we have

$$\lim_{n\to\infty} \big| E\exp\{it(S_{n,N_n} - c_n)\} - g_n(t) \big| = 0, \tag{2.2}$$

where

$$g_n(t) = \int_0^\infty h^u(t)\exp\{it(u a_n - c_n)\}\,dA_n(u), \qquad A_n(u) = P(k_n^{-1} N_n < u), \qquad h(t) = E\exp\{itY\}.$$
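To see what the approximation (2.2) says in a concrete case, here is a small deterministic sketch. The setting is an assumption chosen for illustration only: $X_{nj}$ are Bernoulli with success probability $\lambda/n$, $k_n = n$, $a_n = c_n = 0$, so that $Y$ is Poisson with parameter $\lambda$ and $h(t) = \exp\{\lambda(e^{it}-1)\}$; the index $N_n$ is taken uniform on $\{1,\dots,2n\}$. The power $h^u$ is computed through the continuous logarithm $u\lambda(e^{it}-1)$.

```python
import cmath


def exact_chf(t, n, lam=1.0):
    """E exp{it S_{n,N_n}} for X_nj ~ Bernoulli(lam/n) and N_n uniform on {1..2n}."""
    f_n = 1.0 + (lam / n) * (cmath.exp(1j * t) - 1.0)   # ch.f. of one summand
    return sum(f_n ** k for k in range(1, 2 * n + 1)) / (2 * n)


def g_n(t, n, lam=1.0):
    """g_n(t) = sum_k P(N_n = k) h^{k/n}(t), i.e. the integral of h^u dA_n(u)
    with k_n = n, a_n = c_n = 0 and h the Poisson(lam) ch.f."""
    log_h = lam * (cmath.exp(1j * t) - 1.0)              # continuous branch of ln h
    return sum(cmath.exp((k / n) * log_h) for k in range(1, 2 * n + 1)) / (2 * n)


def error(t, n, lam=1.0):
    """|E exp{it S_{n,N_n}} - g_n(t)|, the quantity that tends to zero in (2.2)."""
    return abs(exact_chf(t, n, lam) - g_n(t, n, lam))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, error(1.0, n))
```

In this example the discrepancy shrinks roughly like $1/n$, which is consistent with the statement of the lemma but is specific to the assumed distributions.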

Proof. Denote $f_n(t) = E\exp\{itX_{n1}\}$. Then

$$E\exp\{it(S_{n,k_n} - a_n)\} = f_n^{k_n}(t)\exp\{-ita_n\} \equiv \Psi_n(t).$$

It follows from condition (2.1) that for each $T \in (0,\infty)$

$$\lim_{n\to\infty}\ \sup_{|t|\le T} \big|\Psi_n(t) - h(t)\big| = 0, \tag{2.3}$$

and since the power function is continuous and the infinitely divisible characteristic function (ch.f.) $h(t)$ does not vanish, we have for each $u \ge 0$:

$$\lim_{n\to\infty}\ \sup_{|t|\le T} \big|\Psi_n^u(t) - h^u(t)\big| = 0. \tag{2.4}$$

Further, by the law of total probability we have

$$E\exp\{it(S_{n,N_n} - c_n)\} = \sum_{k=1}^{\infty} P(N_n = k)\,E\exp\{it(S_{n,k} - c_n)\} = \sum_{k=1}^{\infty} P(N_n = k)\,f_n^k(t)\exp\{-itc_n\}$$

$$= \int_0^\infty f_n^{k_n u}(t)\exp\{-itc_n\}\,dA_n(u) = \int_0^\infty \Psi_n^u(t)\exp\{it(u a_n - c_n)\}\,dA_n(u).$$

That is why

$$\big|E\exp\{it(S_{n,N_n} - c_n)\} - g_n(t)\big| \le \int_0^\infty \big|\Psi_n^u(t) - h^u(t)\big|\,dA_n(u). \tag{2.5}$$

At first we consider a subsequence $\mathcal N$ of natural numbers such that $A_n \Rightarrow A$ as $n \to \infty$, $n \in \mathcal N$, for some distribution function (d.f.) $A$. Let $l_n(s)$ and $r_n(s)$ be respectively the greatest lower bound and the least upper bound of $s$-quantiles of the d.f. $A_n$, and let $l(s)$ and $r(s)$ be respectively the greatest lower bound and the least upper bound of $s$-quantiles of the d.f. $A$. According to Theorem 1.1.1 from [5], weak convergence of $A_n$ to $A$ as $n \to \infty$, $n \in \mathcal N$, implies

$$l(s) \le \liminf_{n\in\mathcal N} l_n(s) \le \limsup_{n\in\mathcal N} r_n(s) \le r(s)$$

for each $s \in (0,1)$. Therefore if $l(s) = r(s)$ for some $s \in (0,1)$ then the condition $A_n \Rightarrow A$ ($n \to \infty$, $n \in \mathcal N$) implies

$$\lim_{n\in\mathcal N} l_n(s) = l(s). \tag{2.6}$$

It is easy to verify that the set $\{s : 0 < s < 1,\ l(s) \ne r(s)\}$ is at most countable and therefore its Lebesgue measure equals zero whatever the d.f. $A$ is. Rewrite the right-hand side of (2.5):

$$\int_0^\infty \big|\Psi_n^u(t) - h^u(t)\big|\,dA_n(u) = \int_0^1 \big|\Psi_n^{l_n(s)}(t) - h^{l_n(s)}(t)\big|\,ds. \tag{2.7}$$

We have $l_n(s) = l(s) + o(1)$ as $n \to \infty$, $n \in \mathcal N$, for each $s \in (0,1)$ satisfying (2.6). Therefore it follows from (2.4) that

$$\big|\Psi_n^{l_n(s)}(t) - h^{l_n(s)}(t)\big| \le \sup_{|\tau|\le T}\big|\Psi_n^{l(s)+o(1)}(\tau) - h^{l(s)+o(1)}(\tau)\big| \to 0$$

as $n \to \infty$, $n \in \mathcal N$. Consequently the Lebesgue dominated convergence theorem implies that the integral on the right-hand side of (2.7) tends to zero as $n \to \infty$, $n \in \mathcal N$, and hence, according to (2.5), relation (2.2) holds for $n \in \mathcal N$. Assume now that in the general case (2.2) does not hold. In other words, assume that there exist $\varepsilon > 0$ and a subsequence $\mathcal N_1$ of natural numbers such that

$$\big|E\exp\{it(S_{n,N_n} - c_n)\} - g_n(t)\big| > \varepsilon \tag{2.8}$$

for all $n \in \mathcal N_1$. By the condition of the lemma the family $\{A_n\}_{n\ge1}$ is weakly relatively compact. Therefore there exists a subsequence $\mathcal N_2 \subseteq \mathcal N_1$ such that $A_n \Rightarrow A^*$ as $n \to \infty$, $n \in \mathcal N_2$, for some d.f. $A^*$. But it has already been proved that if $A_n \Rightarrow A^*$, $n \to \infty$, $n \in \mathcal N_2$, then for all $n \ge n_0$ (for some $n_0$), $n \in \mathcal N_2$, the inequality converse to (2.8) must hold. This contradicts the definition of the sequence $\mathcal N_1$, which proves the lemma. □

The lemma just proved opens the way for the search for approximations to the d.f.'s of random sums of identically distributed r.v.'s when the d.f.'s of the random indices are known. The following statement describes the limit situation.

Theorem 2.1. Let sequences $\{k_n\}_{n\ge1}$, $\{a_n\}_{n\ge1}$ and $\{c_n\}_{n\ge1}$ ($k_n$ are natural, $k_n \to \infty$, $a_n$ and $c_n$ are real) provide (2.1) and

$$\Big(\frac{N_n}{k_n},\ a_n\frac{N_n}{k_n} - c_n\Big) \Rightarrow (U, V), \qquad n \to \infty, \tag{2.9}$$

for some pair of r.v.'s $(U, V)$. Then we have

$$S_{n,N_n} - c_n \Rightarrow Z, \tag{2.10}$$

where $Z$ is the r.v. with the ch.f.

$$f(t) = E\big[h^U(t)\exp\{itV\}\big]. \tag{2.11}$$
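A Monte Carlo sketch may help to visualise the limit ch.f. (2.11). All distributional choices here are assumptions made for illustration: $X_{nj} = \pm n^{-1/2}$ with equal probabilities, $k_n = n$, $a_n = c_n = 0$ (so $Y$ is standard normal and $V = 0$), and $N_n$ geometric on $\{1, 2, \dots\}$ with mean $n$, so that $N_n/n \Rightarrow U$ with $U$ exponential with unit mean. In this case (2.11) gives $f(t) = E\exp\{-Ut^2/2\} = (1 + t^2/2)^{-1}$, the ch.f. of a Laplace law.

```python
import math
import random


def random_sum(n, rng):
    """One draw of S_{n,N_n}: X_nj = +-1/sqrt(n), N_n geometric with mean n."""
    p = 1.0 / n
    # Inverse-transform sampling of a geometric r.v. on {1, 2, ...};
    # 1 - rng.random() lies in (0, 1], so the logarithm is finite.
    N = 1 + int(math.log(1.0 - rng.random()) / math.log(1.0 - p))
    steps = sum(1 if rng.random() < 0.5 else -1 for _ in range(N))
    return steps / math.sqrt(n)


def empirical_chf_real(t, samples):
    """Real part of the empirical ch.f. (the limit law here is symmetric)."""
    return sum(math.cos(t * z) for z in samples) / len(samples)


def limit_chf(t):
    """f(t) = E[h^U(t)] with h(t) = exp(-t^2/2), U ~ Exp(1), V = 0."""
    return 1.0 / (1.0 + t * t / 2.0)


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [random_sum(100, rng) for _ in range(5000)]
    for t in (0.5, 1.0, 2.0):
        print(t, empirical_chf_real(t, samples), limit_chf(t))
```

The limit is a scale mixture of normal laws rather than a normal law, which is the typical effect of a non-degenerate $U$.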

Proof. Denote $U_n = N_n/k_n$ and $V_n = a_n N_n/k_n - c_n$. Then, obviously, we have

$$g_n(t) = E\big[h^{U_n}(t)\exp\{itV_n\}\big].$$

Therefore, taking Lemma 2.1 into account, it suffices to make sure that $|g_n(t) - f(t)| \to 0$ as $n \to \infty$ for all $t \in R^1$. Consider the function

0.

/I

--+ 00.

o

Therefore as a consequence of (3.5) we have

$$E\exp\{it(S_{n,N_n} - c_n)\} = E\exp\{it[X(U) + \alpha_n(U) - c_n]\} + o(1) \tag{3.6}$$

as $n \to \infty$. The function $\varphi_t(x,y) = \exp\{it(x+y)\}$ is bounded and continuous in $x$ and $y$ for each $t \in R^1$. Hence (3.2) and (3.6) imply

$$\lim_{n\to\infty} E\exp\{it(S_{n,N_n} - c_n)\} = E\exp\{it(X(U) + V)\} \equiv f(t) \tag{3.7}$$

for each $t \in R^1$. Since the stochastic process $X(s)$ and the r.v. $U$ are independent, we can also consider $X(s)$ and the limit r.v. $V$ to be independent. Thus the representation (3.4) follows from (3.7). The proof is over. □

Now consider the distribution of the limit r.v. $Z$ from (3.3). Denote the d.f. of the r.v. $X(s)$ by $H_s$. If $H_s = H$, $s \in (0,1)$, then the independence of the r.v. $U$ and the stochastic process $X(s)$ implies the equivalence of (3.2) to the existence of some r.v. $V$ for which $\alpha_n(U) - c_n \Rightarrow V$ as $n \to \infty$. In this case the d.f. $F$ of the r.v. $Z$ can be represented as a convolution $F = H * G$, where $G(x) = P(V < x)$, i.e. $F$ is a location mixture of the d.f. $H$. And if $c_n = 0$, $\alpha_n(u) \equiv 0$ for all $n \ge 1$, then $F$ is of the form $F(x) = \int_0^1 H_s(x)\,ds$. This situation was considered in full detail in [5].

4. Weak compactness of random sums. In the sequel we shall use Doob's centers of distributions. Following [6] we shall say that the number $\mathbb D(X)$ uniquely defined by the condition

$$E\,\mathrm{arctg}(X - \mathbb D(X)) = 0$$

is the Doob's center of the r.v. $X$. We shall use the following properties of Doob's centers.

(1). If the sequence of r.v.'s $\{X_n\}_{n\ge1}$ is weakly relatively compact then the sequence $\{\mathbb D(X_n)\}_{n\ge1}$ is bounded.

(2). If $X_n \Rightarrow X$ ($n \to \infty$) then $\mathbb D(X_n) \to \mathbb D(X)$ ($n \to \infty$).

(3). If $a = \mathrm{const} \in R^1$ then $\mathbb D(X + a) = \mathbb D(X) + a$.
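Since $d \mapsto E\,\mathrm{arctg}(X - d)$ is continuous and strictly decreasing, the Doob's center can be found numerically by bisection. The sketch below does this for the empirical distribution of a finite sample; the function name `doob_center` and the fixed number of bisection steps are choices made here for illustration, not notation from the paper.

```python
import math


def doob_center(sample, iterations=200):
    """Solve mean(arctan(x - d)) = 0 for d by bisection.

    The left-hand side is strictly decreasing in d, nonnegative at
    d = min(sample) and nonpositive at d = max(sample), so the root
    lies between these two values.
    """
    def phi(d):
        return sum(math.atan(x - d) for x in sample) / len(sample)

    lo, hi = min(sample), max(sample)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0:
            lo = mid      # root lies to the right of mid
        else:
            hi = mid      # root lies at or to the left of mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    data = [0.3, 1.7, 2.2, 10.0]
    print(doob_center(data))
    # Property (3): shifting the sample shifts the center by the same amount.
    print(doob_center([x + 2.5 for x in data]) - doob_center(data))
```

Because the arctangent is bounded, the center exists for any distribution and is much less sensitive to the outlying value 10.0 than the sample mean.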

Denote

$$F_n(x) = P(S_{n,N_n} - c_n < x).$$

Lemma 4.1. The sequence of d.f.'s $\{F_n\}_{n\ge1}$ is weakly relatively compact iff weakly relatively compact are the sequences of d.f.'s (4.1) and (4.2) for each $s \in (0,1)$, where $U$ is a r.v. uniformly distributed on $[0,1)$.

+ ... +

Proof. Nece.!.!ity. Denote Zn,k = (see, e.g., [7. p. 261])

With the help of P.Levy's inequality

n

k

(4.3)

I'" X(s)1 :::: x) 0 guaranteeing P( IYn,kl;:: T):::; 1/4 uniformly over HI.,. Let P.n,k be the median of Yn,k. By definition of the median we have IPn,k! :::; r , We shall use one more P.Levy's inequality ([7]), p. 261): P ( max l::;k::;n

IXI + ... + Xk < 2P(

(4.5)

med(X 1

!xi

max l::;k::;n

O )

+ ... + Xdl ;:: x)

+ ... + Xk')1 ;:: x),

which is valid for any x ;:: 0 and independent r.v.'s Xl,'" and (4.5) we obtain

P(

,X n . With the help of (4.3)

max IYn,kl;::x):::;P( max IYn,k-/'nkl;::.r-1") l::;k::;InCo) l::;k::;I"C.)

:::;2P( max

l::;k::;i,,(')

IZ""I;::·I-I')

:::; 4P (IZ",I"c.)1 ;::

x -

I)

Therefore by the law of total probability we have oo

p( ISn,lnCU)

- dn(U)1 ;:: X) =

L

P(Nn = k)P (IYn,kl ;:: x)

k=l

InCo)-l

00, 1"1

EN.


The proof of this lemma is a combination of proof of Lemma 2.1.1. from [5, p. 41-42] and Theorem 2 from [4] and therefore is omitted.

Theorem 4.1. Assume that (4.7) takes place. The family {Fn}n>l of d.f. 's of the natural numbers centered random sums is weakly relatively compact iff for any sequence N there exist an infinitely divisible measurable stochastic process with independent increments X(s), safisfying the condition lD>(X(s)) = 0, s E [0,1), a subsequence N 1 00,

=> (X(U), V)

n

n E

N1 ,

---> 00,

n E N,

where the pair of r.v.'s $(U, V)$ is independent of the process $X(s)$, the r.v. $U$ being uniformly distributed on $(0,1)$.

Proof. Necessity follows from Lemma 4.2 and the fact that, by Lemma 4.1, weak relative compactness of the sequence $\{F_n\}_{n\ge1}$ implies weak relative compactness of the sequence $\{P(d_n(U) - c_n < x)\}_{n\ge1}$, which in its turn guarantees weak relative compactness of the pairs $\{(X(U), d_n(U))\}_{n\ge1}$. Sufficiency follows from Theorem 2.1. □

The conditions for weak compactness of random sums of independent identically distributed r.v.'s will be stated in another way that seems to be less complicated. To establish them we shall use the concepts of the center of a r.v. and its scatter introduced by V. M. Zolotarev [9]. Let $G$ and $g$ be the d.f. and the ch.f. of a r.v. $X$, and let the number $v > 0$ be chosen in such a way that $g(t) \ne 0$ for $|t| \le v$. We shall call the values

C(v,X)

=

C(v,G)

1

= -1m In v

and

g(v)

v

lR(v,X)

= lR(v,G) =

v3

JRe In g(t)dt o

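respectively the center (or $v$-center) and the scatter (or $v$-scatter) of the r.v. $X$ (or of the d.f. $G$). The properties of these values are described in [9]. Here we outline only some of them. Let $X, X_1, X_2, \dots$ be independent r.v.'s whose ch.f.'s do not vanish for $t \in [0, v]$. Then

$$\mathbb C(v, -X) = -\mathbb C(v, X), \qquad \mathbb B(v, -X) = \mathbb B(v, X),$$

$$\mathbb C(v, X_1 + \dots + X_n) = \mathbb C(v, X_1) + \dots + \mathbb C(v, X_n),$$

$$\mathbb B(v, X_1 + \dots + X_n) = \mathbb B(v, X_1) + \dots + \mathbb B(v, X_n),$$

and if $X_n \Rightarrow X$ as $n \to \infty$ then

$$\mathbb C(v, X_n) \to \mathbb C(v, X), \qquad \mathbb B(v, X_n) \to \mathbb B(v, X).$$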
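The additivity of centers can be illustrated with explicit characteristic functions. The sketch below assumes that the $v$-center is $\mathbb C(v, X) = v^{-1}\,\mathrm{Im}\ln g(v)$, which is how the garbled definition above appears to read; this reading, the helper names, and the use of the principal branch of the logarithm (valid only while the continuous argument of $g$ stays inside $(-\pi, \pi)$ on $[0, v]$) are all assumptions of the example.

```python
import cmath
import math


def center(chf, v):
    """Assumed v-center: Im ln g(v) / v, with the principal branch of ln.

    This is correct only if the continuous argument of g on [0, v]
    does not leave (-pi, pi); the examples below satisfy this.
    """
    return cmath.log(chf(v)).imag / v


def normal_chf(a, sigma):
    """ch.f. of a normal law with mean a and standard deviation sigma."""
    return lambda t: cmath.exp(1j * a * t - 0.5 * (sigma * t) ** 2)


def poisson_chf(lam):
    """ch.f. of a Poisson law with parameter lam."""
    return lambda t: cmath.exp(lam * (cmath.exp(1j * t) - 1.0))


def product_chf(*chfs):
    """ch.f. of a sum of independent r.v.'s with the given ch.f.'s."""
    def g(t):
        out = 1.0 + 0.0j
        for f in chfs:
            out *= f(t)
        return out
    return g


if __name__ == "__main__":
    v = 0.5
    c_normal = center(normal_chf(1.5, 2.0), v)        # equals the mean 1.5
    c_poisson = center(poisson_chf(2.0), v)           # equals 2*sin(v)/v
    c_sum = center(product_chf(normal_chf(1.5, 2.0), poisson_chf(2.0)), v)
    print(c_normal, c_poisson, c_sum, c_normal + c_poisson)
```

For a normal law the center under this reading coincides with the mean, and for the sum of the two independent variables it equals the sum of the individual centers.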

In the following lemma the r.v.'s $\{X_{nj}\}_{j\ge1}$ are not supposed to be identically distributed.

Lemma 4.3. Assume that for some $v > 0$ and all $n$ and $j$ there exist centers $\mathbb C(v, X_{nj})$. The sequence of d.f.'s $\{F_n\}_{n\ge1}$ of centered random sums is weakly relatively compact iff weakly relatively compact are the sequences of d.f.'s

$$\Big\{P\Big(\sum_{j=1}^{l_n(s)} \big(X_{nj} - \mathbb C(v, X_{nj})\big) < x\Big)\Big\}_{n\ge1} \tag{4.9}$$

and

$$\Big\{P\Big(\sum_{j=1}^{N_n} \mathbb C(v, X_{nj}) - c_n < x\Big)\Big\}_{n\ge1} \tag{4.10}$$

for any $s \in (0,1)$. The proof of this lemma differs from that of Lemma 4.1 only in notation.

Corollary. Assume that for some $v > 0$ and all $n$ and $j$ there exist centers $\mathbb C(v, X_{nj})$. The family of d.f.'s $\{F_n\}_{n\ge1}$ is weakly relatively compact iff weakly relatively compact are the sequences (4.10) and $\{P(Y_{n,N_n} < x)\}_{n\ge1}$, where for natural $k$

$$Y_{n,k} = S_{n,k} - \sum_{j=1}^{k} \mathbb C(v, X_{nj}).$$

From now on to the end of this section we assume the r.v.'s $\{X_{nj}\}_{j\ge1}$ to be identically distributed in each series.

Lemma 4.4. Assume that

$$P\text{-}\lim_{n\to\infty} N_n = \infty. \tag{4.11}$$

If the sequence of d.f.'s $\{F_n\}_{n\ge1}$ is weakly relatively compact then there exist $\mathbb C_n(v) = \mathbb C(v, X_{n1})$ and $\mathbb B_n(v) = \mathbb B(v, X_{n1})$ for some $v > 0$ and all $n \ge 1$, and besides that the sequences of distribution functions $\{P(N_n \mathbb C_n(v) - c_n < x)\}_{n\ge1}$ and $\{P(N_n \mathbb B_n(v) < x)\}_{n\ge1}$ are weakly relatively compact.

Proof. Denote the ch.f. of the r.v. $X_{n1}$ by $f_n(t)$. We shall use the notation introduced in the proof of Lemma 4.1 and some of its results. In particular, it was shown that the sequence of d.f.'s $\{P(Z_{n,l_n(s)} < x)\}_{n\ge1}$ is weakly relatively compact for each $s \in (0,1)$. By virtue

of Corollary 2 in [7, P- 206]

any

E

> 0 there exists such a 5> 0 that 1-lfn(t)1 2 / n ( ' )
1 converges weakly to some limit d.f. G. By the second and third properties of scatters we have 11 --t

oo ,

11

E

N.

But this contradicts our assumption and therefore (4.12) is proved. Let $x > \beta(v, s)$. By the law of total probability we have

$$P(N_n \mathbb B_n(v) > x) = \sum_{k=1}^{\infty} P(N_n = k)\,P(k\,\mathbb B_n(v) > x)$$

$$\le \sum_{k=1}^{l_n(s)-1} P(N_n = k)\,P(k\,\mathbb B_n(v) > x) + \sum_{k=l_n(s)}^{\infty} P(N_n = k)$$

$$\le \sum_{k=1}^{l_n(s)-1} P(N_n = k)\,P(l_n(s)\,\mathbb B_n(v) > x) + 1 - s = 1 - s,$$

and therefore

$$\lim_{x\to\infty}\ \sup_n P(N_n \mathbb B_n(v) > x) \le 1 - s.$$

Passing to the limit as $s \uparrow 1$ in this inequality we make sure that the sequence of d.f.'s $\{P(N_n \mathbb B_n(v) < x)\}_{n\ge1}$ is weakly relatively compact. The lemma is proved. □

Theorem 4.2. Assume that (4.11) takes place. The family of d.f.'s $\{F_n\}_{n\ge1}$ of centered random sums of independent random variables is weakly relatively compact iff for any sequence $\mathcal N$ of natural numbers there exist a subsequence $\mathcal N_1 \subseteq \mathcal N$ and sequences $\{k_n\}_{n\in\mathcal N_1}$ and $\{a_n\}_{n\in\mathcal N_1}$ ($k_n$ are natural, $a_n$ are real) such that the sequences of d.f.'s

$$\{P(S_{n,k_n} - a_n < x)\}_{n\in\mathcal N_1}$$

and

x)

lim P

1!-CX)

Xl,

==

0

or the limit distribution of the r .v , Y".N" as II -+ 00. /I E No, is not degenerate at zero. At first consider the second case. Without loss of generality we can consider the d.f.'s of 1'. v.'s X,,} to be non-degenerate for all n 2': 1. Therefore the scatters lEn (v) which exist for some v > 0 according to Lemma 4.4, are positive. The integer part of a number x will be denoted by [x]. Put an

==

CIl(v) lEn (v)'

In the proof of Lemma 4.4 we established that as

n->oo

uniformly over each finite interval of the domain of t. Therefore k n -> hence (4.14)

00

(n -> (0) and

lim knIEn(v) == 1.

Due to the above definition of constants k., and u" weak relative compactness of the sequence of dJ.'s J'-ln JV" )} { P ( ----;:- < Xl, an ----;:- - (" < :12 /i'n

ti'

n

II.ENo


follows from Lemma 4.4 since (4.14) takes place and

NnCn(v) - en = an Nnlffin(v) - Cn = an

knlffin(v) - Cn, n

Now prove weak relative compactness of the sequence of d.L's

For this purpose at first we prove that the sequence of d.L's {P (Zn,"" < x)} nE}/o is weakly relatively compact (we use notations introduced in the proof of Lemma 4.1). Suppose the contrary. Then for some [; > 0, some subsequnce N' c:;; No and some infinitely increasing sequence {Xn}nEN' the inequality (4.15)

ii

EN',

is valid. With the help of inequality (4.4) we make sure that the weak relative compactness of the sequence {Fn}n>l implies the weak relative compactness of the sequence {P (Zn,lnCs) < x)} n>l for each S E (0,1). Again we consider two incompatible cases: for some s E (0,1) -

.

(4.16)

In(s)

Iirn sup -kn---+oo

> 0;

11.

for each s E (0,1)

.

(4.17)

In(s)

lim sup -k- = 0. n----+CX)

n

°

> and all n from some subsequence ,kn is valid. Denote m = + 1. Applying P. Levy's

Let (4.16) take place. In this case for some ,

Nil c:;; N' the inequality In(s) inequality (4.3) we obtain

P (IZn,knl ::: x) S 2P (IZn,mlnC8)1 ::: x) S 2mP (IZn,lnCs)1 ::: x/m). These inequalities imply weak relative compactness of the sequence {P (Zn,k" < x) } nEN" which contradicts the assumption (4.15). Thus we proved that the sequence of d.L's {P (Zn,k n < x} nENo is weakly relatively compact. Just in the same way as proved the weak relative compactness of the family W s in Lemma 4.1, we can prove that the sequence { P (Yn,k n < x) } nENo is weakly relatively compact. But with the account for the definition of an we have


Therefore taking relation (4.14) into account we can conclude that weak relative compactness of the family {P (Yn •kn < x) } nOlo implies weak relative compactness of the sequence

tEN

' of d.f.'s {P (sk < x) o Let now (4.17) take place. In [9, p. 160] the proof of the inequality

100 Jrn(v,X) c

P(IX C(v,X)1 2: e)

(4.18)

2

can be found. This inequality is valid for 0 < e < v under the assumption that the center and the scatter of the r.v. X exist. Further, we note that the definition of k n and the relation (4.17) by virtue of (4.14) imply Iim sup Lf.s) Jrn,,(v) = 0

(4.19)

for any s E (0,1). Using (4.18) and addivi tivity and positivity of scatters for 0 < obtain

E

< v we

00

P (IYn.Nnl2: c)

= LP(Nn = k)P (ISn,k kC,,(v)l2: c) k=1

I n(.)-1

L P (Nn = k)P (ISn,k kiC,,(v)1 2: c) + P (N" 2: I,,(s)) k=1

100 2

c

I n(.)-1

L P i»; k=1

= k)kJrn,,(v) + 1 -

s

s::

100

2 In(s)Jrnn(v)

c

+1-

s.

Here it follows from (4.19) that limsupP (IYn,Nnl2: c)

1 s.

ncoc

Taking the limit as s i 1 in this inequality we come to the conclusion that (4.17) implies (4.13) which is impossible. Thus we proved weak relative compactness of the family of d.f.'s {P (Sn,k n an < x} nENo under the assumption that the limit distribution of the r.v. Yn,Nn as n 00 n E No, differs from degenerate at zero. Now consider the case (4.13). For each m = 1,2, ... choose 11", E No in such a way that P (IYn,Nnl > l/m for n 2: n m , n E No, and the sequence {l1m}m;:: 1 increases. For n E

No n [n m , nm+d put m-

1 2 /

Tn

= In(s) (1

inf P (IYn.kl >

Jm-).

f

For such n we have

P(Nn = k)P (IYn'kl >

k=r n

fP(N n = k)P (IYn,kl > k=1

=P

(IYn •Nnl > .

m

m

.


Consequently

s:

Therefore there exists a k n ,

is valid. Denoting an

P

(IY

n ,k

n E [nm, nm+d n

No,

Jm.

for which the inequality

= kniCn(v) and reminding the definition of

the r.v,

we shall obtain from (4.20) that

n E No, which implies the weak convergence of centered sums Sn,k n - an to zero as n n E No, and, henceworth, their weak relative compactness. Further, for n m :::; n < nm+l, n E No, and for each x> 1 we have

which implies limsupP (N n 'n e-e

> knx)

-> 00

x>l.

= 0,

cc

nEN

Finally, it follows from the definition of an that

and the weak relative compactness of the family r.v.'s NniCn(v) - Cn follows from Lemma 4.4. Thus we proved the weak relative compactness of the families of d.L's {P (Sn,k n - an < x)} nENo and {P < Xl, an - Cn < X2)} nENo' Now taking the subsequence N l No along which the d.f.'s and

Nn P ( -k n

00,

n E

No,


where {kn}nENo and {an}nENo are the sequences of natural and real numbers respectively, Y, U, V are some r.v.'s. Applying Theorem 2.1 we obtain that 11 -+

oo ,

11

E

JII,

where Z is a r.v. with the ch.f.

EhU(t) exp{itV}, i.e. the family of d.f.'s

h(t)

=E

exp{itY},

turns out to be weakly relatively compact.

0

5. Necessary and sufficient conditions for the convergence of random sums. At first we consider the general case when the summands are not supposed to be identically distributed. We denote by $\mathcal D$ the set of measurable infinitely divisible stochastic processes $X(s)$, $s \in [0,1)$, with independent increments such that $X(0) = 0$ and $\mathbb D(X(s)) = 0$ for all $s \in [0,1)$. Here $\mathbb D(\cdot)$ is the Doob's center, see Section 4. To every d.f. $F$ we put into correspondence a set $H(F)$ consisting of triples $(X(\cdot), U, V)$, where $X(\cdot) \in \mathcal D$, the r.v. $U$ is distributed uniformly on $[0,1)$ and the ch.f. $f$ corresponding to $F$ can be represented in the form (3.4), where $h(t;s)$ is the ch.f. of the r.v. $X(s)$. For each d.f. $F$ the set $H(F)$ is not empty since it contains at least one triple of the form $(I_0(\cdot), U, l_F(U))$, where $P(I_0(s) = 0) = 1$ for all $s \in [0,1)$ and $l_F(s)$ is the greatest lower bound of $s$-quantiles of the d.f. $F$.

Let $L(\cdot,\cdot)$ be a metric which metrizes weak convergence in the space of one-dimensional d.f.'s and let $\Pi(\cdot,\cdot)$ be a metric which metrizes weak convergence in the space of two-dimensional d.f.'s. Examples of these metrics are Lévy's metric and the Lévy-Prokhorov metric respectively. If $X$ and $Y$ are r.v.'s with d.f.'s $G$ and $H$ respectively then we shall not distinguish between $L(X,Y)$ and $L(G,H)$. By analogy, if $(X_1, X_2)$ and $(Y_1, Y_2)$ are two-dimensional r.v.'s with d.f.'s $G$ and $H$ then we shall not distinguish between $\Pi(G,H)$ and $\Pi((X_1,X_2),(Y_1,Y_2))$.

Let $Y_1(s)$ and $Y_2(s)$ be two stochastic processes, $s \in [0,1)$. Introduce the distance between the processes $Y_1(\cdot)$ and $Y_2(\cdot)$ in the following way:

$$\Lambda(Y_1(\cdot), Y_2(\cdot)) = \int_0^1 L(Y_1(s), Y_2(s))\,ds. \tag{5.1}$$

The distance $\Lambda(\cdot,\cdot)$ is obviously nonnegative, symmetric and satisfies the triangle inequality. The equality $\Lambda(Y_1(\cdot), Y_2(\cdot)) = 0$ is equivalent to the coincidence of the d.f.'s of $Y_1(s)$ and $Y_2(s)$ for almost all $s \in [0,1)$. As above, we denote

Theorem 5.1. Assume that (4.7) takes place. We have (5.2)

[...] is weakly relatively compact for any $u \in [0,1)$. Prove (5.2). Suppose the contrary. In this case, for some $\delta > 0$ and all $n$ from some subsequence $\mathcal{N}$ of natural numbers, the inequality $L(F_n, F) \ge \delta$ holds. Using reasoning analogous to that used to prove Lemma 4.2 (based on Cantor's diagonal method), we make sure that there exist a process $Y(\cdot) \in \mathcal{D}$ and a subsequence $\mathcal{N}_1 \subseteq \mathcal{N}$ such that $Y_n(s) \Rightarrow Y(s)$ as $n \to \infty$, $n \in \mathcal{N}_1$, for almost all $s \in [0,1)$. Lebesgue's dominated convergence theorem here implies that $Y_n(U_n) \Rightarrow Y(U)$, i.e. the sequence of d.f.'s of the r.v.'s $\{Y_n(U_n)\}$ is weakly relatively compact. This, together with the coincidence of the distributions of the r.v.'s $Z$ and $Y_n(U_n) + V_n$, implies that the sequence of d.f.'s of the r.v.'s $\{V_n\}_{n \ge 1}$ is also weakly relatively compact. Hence the distributions of the pairs $\{(Y_n(U_n), V_n)\}_{n \in \mathcal{N}_1}$ are also weakly relatively compact. Choose a subsequence $\mathcal{N}_2$ [...] $\}_{n \ge 1}$ from $J(F)$ satisfying the following conditions:

(5.11)

(5.12)

II

an

- cn )

,

(Un,vn))

-+

-'> 00

0,

17.

00.

II. For (5.10) it is sufficient that there exist numerical sequences $\{k_n\}_{n \ge 1}$ and $\{a_n\}_{n \ge 1}$ ($k_n$ are natural, $a_n$ are real) and also a weakly relatively compact sequence of triples $\{(Y_n, U_n, V_n)\}_{n \ge 1}$ from $J(F)$ satisfying (5.11) and (5.12).

The proof of this theorem reproduces in its main details the proof of Theorem 2.2.2 in [5, pp. 54-56] and is therefore omitted.

REFERENCES
[1] H. Robbins, The asymptotic distribution of the sum of a random number of random variables, Bull. Amer. Math. Soc., 54 (1948), pp. 1151-1161.
[2] B. V. Gnedenko and H. Fahim, On a transfer theorem, Dokl. AN SSSR, 187 (1969), pp. 15-17. (In Russian.)
[3] D. Szasz, On the classes of limit distributions for sums of a random number of identically distributed random variables, Theory Probab. Appl., 17 (1972), pp. 424-439.
[4] D. Szasz, Limit theorems for the distributions of the sums of a random number of random variables, Ann. Math. Stat., 43 (1972), pp. 1902-1913.
[5] V. M. Kruglov and V. Yu. Korolev, Limit Theorems for Random Sums, Mosc. St. Univ. Publ., Moscow, 1990. (In Russian.)
[6] J. L. Doob, Stochastic Processes, Wiley, New York, 1953.
[7] M. Loeve, Probability Theory, Van Nostrand, Princeton, 1963.
[8] B. V. Gnedenko and A. N. Kolmogorov, Limit Distributions for Sums of Independent Random Variables, Addison-Wesley, Reading, Massachusetts, 1968.
[9] V. M. Zolotarev, Modern Theory of Summation of Independent Random Variables, Nauka, Moscow, 1986. (In Russian.)

DEPT. OF COMPUT. MATH. AND CYBERN., MOSCOW STATE UNIV., LENINSKIE GORY, MOSCOW 119899, Russia

I. S. Molchanov

ON REGULARLY VARYING MULTIVALUED FUNCTIONS

The basic notions of the theory of regularly varying functions can be found in Seneta [6], and their generalizations to the multivariate case in de Haan and Omey [1] and Yakimiv [7]. In the present paper regularly varying multivalued functions are introduced and investigated. These functions take values in the family of closed subsets of the Euclidean space $R^d$. Note that multivalued functions are studied within the framework of probability theory as random closed sets, see Matheron [3].

Let $\mathcal{F}$ (respectively $\mathcal{K}$) be the class of all closed (compact) subsets of $R^d$. The sequence $\{F_n, n \ge 1\}$ is said to converge in $\mathcal{F}$ to the limit $F$ if the following conditions are valid (see Matheron [3]):
1) If $K \cap F = \emptyset$ for some compact $K$, then $K \cap F_n = \emptyset$ for all sufficiently large $n$;
2) If $G \cap F \ne \emptyset$ for some open $G$, then $G \cap F_n \ne \emptyset$ for all sufficiently large $n$.
The sequence of compacts $\{K_n, n \ge 1\}$ is said to converge to $K$ in $\mathcal{K}$ if, additionally, the following condition is valid:
3) $K_n \subset M$ for all sufficiently large $n$ and some bounded set $M$.
The respective limits are briefly denoted by $F = \mathcal{F}\text{-}\lim F_n$ and $K = \mathcal{K}\text{-}\lim K_n$. The convergence in $\mathcal{K}$ is metrized by the Hausdorff metric

$$\rho_H(K, K_1) = \inf\{\varepsilon > 0 : K \subset K_1^{\varepsilon},\ K_1 \subset K^{\varepsilon}\},$$

where $K^{\varepsilon} = \{x \in R^d \mid B_\varepsilon(x) \cap K \ne \emptyset\}$ is the $\varepsilon$-parallel set of $K$ and $B_\varepsilon(x)$ is the closed ball of radius $\varepsilon$ centered at $x$. Denote also $K^{-\varepsilon} = \{x \in K \mid B_\varepsilon(x) \subset K\}$. For any set $M$ its closure and interior are denoted by $\overline{M}$ and $\operatorname{Int} M$ respectively. The set $M$ is said to be canonically closed iff $M$ coincides with the closure of its interior.

Let $\Gamma$ be a canonically closed cone in $R^m$, $S = \Gamma \setminus \{0\}$, and let $M\colon \Gamma \to \mathcal{F}$ be a multivalued function on $\Gamma$. Suppose additionally that $M(0) = \{0\}$ and $M$ is measurable, i.e. for any compact $K$ the set $\{u \in \Gamma \mid M(u) \cap K \ne \emptyset\}$ is measurable. The function $M$ is said to be regularly varying with limit function $\Phi$ and index $\alpha$ if for any $u$ from $S$

$$\mathcal{F}\text{-}\lim M(tu)/g(t) = \Phi(u), \tag{1}$$

where $\Phi(u)$ is a nontrivial closed subset of $R^d$, $\Phi(u) \ne \{0\}$ for $u \ne 0$, and $g\colon (0,\infty) \to (0,\infty)$ is a numerical regularly varying function of index $\alpha$ (see Seneta [6]). We then write $M \in \Pi_1(g, S, \mathcal{F}, \alpha, \Phi)$. [...]

If $K \cap \Phi(u_n) \ne \emptyset$ for sufficiently large $n$, then without loss of generality we may suppose that $\operatorname{Int} K^{\varepsilon} \cap \Phi(u_n) \ne \emptyset$ for all sufficiently large $n$. Then $\operatorname{Int} K^{\varepsilon} \cap M(t u_n)/g(t) \ne \emptyset$ for all $t \ge t_n$ and some $t_n$. Suppose $t_n \uparrow \infty$ and take $u_t = u_n$ for $t \in [t_n, t_{n+1})$. Then (2) implies $\operatorname{Int} K^{\varepsilon} \cap \Phi(u) \ne \emptyset$, i.e. $K^{\varepsilon} \cap \Phi(u) \ne \emptyset$ for $\varepsilon > 0$, and we come to a contradiction. Let $G \cap \Phi(u) \ne \emptyset$ for some open $G$. Then $G_1 \cap \Phi(u) \ne \emptyset$ for some open set $G_1$ with compact closure such that $G_1 \subset G$. If $G \cap \Phi(u_n) = \emptyset$ for all sufficiently large $n$, then $G_1 \cap M(t u_n)/g(t) = \emptyset$ for $t \ge t_n$. From (2) we get $G_1 \cap \Phi(u) = \emptyset$, i.e. we come to a contradiction. Thus $\mathcal{F}\text{-}\lim \Phi(u_n) = \Phi(u)$. For $M \in \Pi_2(g, S, \mathcal{K}, \alpha,$ [...]

[...] $F$ be a multivalued function on the unit sphere $S^{m-1}$. Then the function $M$ defined by $M(se) = s^{\alpha} F(e)$, $s > 0$, $e \in S^{m-1}$, is said to be homogeneous. It is evident that $M \in \Pi_1(g, R^m \setminus \{0\}, \mathcal{F}, \alpha, F)$. If $F$ is continuous in $\mathcal{F}$ ($\mathcal{K}$), then $M \in \Pi_2$.

Example 2. Let $m = 6$, $d = 2$ and let $M(u_1, \dots, u_6)$ be the triangle with the vertices $(u_1, u_2)$, $(u_3, u_4)$, $(u_5, u_6)$. Then $M$ is homogeneous and regularly varying of index 1. If $p(M)$ is the area of $M$, then the function $M_1(u) = p(M(u))^{\beta} M(u)$ is regularly varying of index $2\beta + 1$.

Example 3. Let $h_i\colon R^m \to R^1$, $1 \le i \le d$, be regularly varying numerical functions from the class $\Pi_1$ on $S$, i.e.

$$\lim_{t \to \infty} h_i(tu)/g(t) = \varphi_i(u), \qquad 1 \le i \le d. \tag{3}$$

Thus $M(u) = \{(h_1(u), \dots, h_d(u))\}$ [...] $j = 1, 2$.

The next lemma shows that the set-theoretic operations preserve the regular variation property. Let $\operatorname{conv}(F)$ be the convex hull of $F$, and let $F_1 \oplus F_2 = \{x + y \mid x \in F_1,\ y \in F_2\}$ be the Minkowski sum of $F_1$ and $F_2$.
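For finite point sets, the Hausdorff metric and the Minkowski sum introduced above can be computed directly. The following Python sketch is only an illustration under that finiteness assumption; the function names `hausdorff`, `minkowski_sum` and `scale` are hypothetical and are not taken from the paper.

```python
from itertools import product
from math import dist  # Euclidean distance between two points (Python 3.8+)


def hausdorff(K, K1):
    """Hausdorff distance between two finite, non-empty point sets."""
    # Smallest eps such that K lies in the eps-parallel set of K1 ...
    d_K_to_K1 = max(min(dist(x, y) for y in K1) for x in K)
    # ... and such that K1 lies in the eps-parallel set of K.
    d_K1_to_K = max(min(dist(x, y) for x in K) for y in K1)
    return max(d_K_to_K1, d_K1_to_K)


def minkowski_sum(F1, F2):
    """Minkowski sum {x + y : x in F1, y in F2} of finite point sets."""
    return {tuple(a + b for a, b in zip(x, y)) for x, y in product(F1, F2)}


def scale(F, c):
    """The set cF = {c * x : x in F}."""
    return {tuple(c * a for a in x) for x in F}


F1 = {(0, 0), (1, 0)}
F2 = {(0, 0), (0, 2)}
S = minkowski_sum(F1, F2)  # the four corners of the rectangle [0,1] x [0,2]

# Scaling commutes with the Minkowski sum: t(F1 + F2) = tF1 + tF2.
homogeneous = scale(S, 3) == minkowski_sum(scale(F1, 3), scale(F2, 3))

# Distance between F1 and its translate by the vector (0, 2).
d = hausdorff(F1, {(0, 2), (1, 2)})
```

The check stored in `homogeneous` mirrors, in the simplest possible setting, why operations such as the Minkowski sum are compatible with the rescaling $M(tu)/g(t)$ used in the definition of regular variation.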


Lemma 1. Let $j = 1$ or $2$ and let $M_i \in \Pi_j$, $c_i > 0$, $1 \le i \le p$. Then the functions

$$M^{(1)} = c_1 M_1 \cup \dots \cup c_p M_p, \qquad M^{(2)} = \operatorname{conv}(M^{(1)}), \qquad M^{(3)} = c_1 M_1 \oplus \dots \oplus c_p M_p$$

belong to the same class $\Pi_j$.

Proof follows from the continuity of these operations w.r.t. the convergence in $\mathcal{F}$ ($\mathcal{K}$).

The next theorem is the analogue of the inversion theorem for a numerical regularly varying function (see Seneta [6]). It should be noted that this theorem cannot be formulated within the framework of numerical multivariate regularly varying functions only, since the inverse of a multivariate function is necessarily multivalued.

Theorem 2. Let MEII 2 ( g, 5, F , 0',

O. Let K' n 00. Hence (2) implies F -lim Ms(u s ) = cI>(uo). From (9) we get (uo) n K8 i= 0 for all 8 > O. Hence (uo) n K i= 0, which contradicts the condition Uo E K'. Thus the first condition of F-convergence is valid even without assumption (5). If E = 0 then we have to consider the case 0 E K', US ----> Uo = 0 as s ----> 00. From (8), (9) we get 0 E (J(s)IC)8 for all 8> 0 and sufficiently large s. This is in contradiction with 0 rt K. Let us verify the second condition of F -convergence in (6) for E = 0 at once. Since cI>l(K), the common point of G and cI>l(K) is not zero, and 0 rt G. Let Uo E Gnl (K), and let Be (uo) C G for some E > O. Suppose that

(10) for unbounded s, Hence

U

lVIs(u) n f(s)K s

= 0.

uEB,(uo)

From (5) we get cI> (uo)6 n f(s)K s = 0. Henceforth (uo) n K = 0, which is in contradiction with our supposition. Thus (6) has been proved. Now suppose that ME II 2(g,5,K,a,cI» and 0 cI>(u) whenever u E 5. In order to prove K-convergence in (6) we have to verify that the sets M l ( S, IC ) I gl (s) are contained in some compact for all sufficiently large s. Suppose that Us E M] (sK) I g](s) for unbounded sequence of points Us' Without loss of generality suppose that e s = usl Ilusll converges to the unit vector e as s ----> 00. From (2) we get

rt

and

g(gl(s)llusll)1 g(gl(S)) Since Be(O) n cI>(e)

=

0 for some E

Ilusll"

as

s

----> 00.

> 0, we get

for sufficiently large s, Hence M s (us) n f(s)K s = 0, which is in contradiction with the choice of Us' The condition (5) is the most awkward. However it can be weakened a little. Denote for any closed F [F]6 = U {Fy I 1 - 8 y 1 + 8} .

Corollary 1. Suppose that all conditions of Theorem 2 are valid except (5), and for all E > 0 and some 8 > 0

Uo E 5, (11)

U

M(qus)1 g(s)

for all sufficiently large s. Then the statement of Theorem 2 is valid if

K, = K,

s > o.


The proof of Theorem 2 remains valid except for the following implication. From (10) we get

U

M 8 (uOq) n f(s)K = 0,

1- :$q$l+

so (11) implies [ A, 0 cp(u)L(ut)du

M(s)/tnBR(O)

S ta

(14)

J

cp(u)L(ut)du.

z·

It follows from Yakimiv [7], Haan and Resnick [2] that

L

cp(u)L(ut) du '"" L(te)G(F)

for any closed F,O tI- F, where G(F)

= fF cp(u) du,

From (14) we get

To obtain (13) we have to let $R \to \infty$, $S \to 0$ and use the continuity of $\varphi$.

Remark. The condition (12) is more restrictive than the condition of $\mathcal{F}$-convergence of $M(s)/g(s)$ to $Z$. However, for a convex multifunction $M$ these conditions are equivalent.

Corollary 3. Let $M \in \Pi_2(g, S, \mathcal{K}, \alpha, \Phi)$ be a convex-valued multifunction, and let the functions $\varphi$, $L$ satisfy the conditions of Theorem 4. Then $H(v) = \int_{M(v)} \varphi(u) L(u)\,du$ is a regularly varying numerical function from the class $\Pi_2$.


Now we apply the results above to prove the limit theorem for unions of random closed sets. Let $\xi$ be a random vector in $R^m$ with density $f$. Suppose that $f \in \Pi_2$ on $S = R^m \setminus \{0\}$ with index $\alpha - m$, $\alpha < 0$ (see Yakimiv [7]). Then $f = \varphi \cdot L$, where $\varphi$ is a homogeneous continuous function of index $\alpha - m$ and $L$ is a slowly varying function. Let $M$ be a multivalued function from the class $\Pi_2(g, S, \mathcal{K}, \gamma, \Phi)$, $\gamma > 0$. Then $A = M(\xi)$ is a random closed set (RACS) (see Matheron [3]). Let $A_1, A_2, \dots$ be independent copies of $A$. Choose an arbitrary vector $e$ from $S$ and define

$$a_n = \sup\{g(s) \mid s^{\alpha} L(se) \ge 1/n\}, \quad [\ldots]$$

The distribution of any RACS, say $X_n$, is determined by its capacity functional $T_n(K) = P\{X_n \cap K \ne \emptyset\}$, where $K$ runs over the class $\mathcal{K}$. It follows from Norberg [5] that $X_n$ converges weakly to the RACS $X$ if $T_n(K)$ [...] We may obtain similar estimates from below, and hence the formula for $T$ in Theorem 5 is valid. $\square$

Remark. We may choose instead of $\mathfrak{M}$ another class which determines weak convergence [5]. The statement of Theorem 5 is also true if its condition is valid for a class $\mathfrak{M}'$ such that $K^{-\varepsilon} \subset K_1 \subset K \subset K_2 \subset K^{\varepsilon}$ for any $K$ from $\mathfrak{M}$, $\varepsilon > 0$ and some $K_1$, $K_2$ from $\mathfrak{M}'$. Note that the condition of Theorem 5 is valid for all functions from Examples 1-4.


Example 5. Let $M(u) = g(\|u\|) B_r(e_u)$, where $r > 0$, $g$ is a regularly varying function of index $\gamma > 0$, $e_u = u/\|u\|$, and $g(t_s s)/g(s) \to 0$ as $t_s \to 0$, $s \to \infty$. Furthermore, let $\xi$ be a random vector which satisfies the conditions of Theorem 5. Then the RACS $X_n$ converges weakly to the RACS $X$ with the capacity functional

$$T(K) = 1 - \exp\Bigl\{-\int_{S^{m-1}} \varphi(e)\,de \int_{F_K(e)} s^{\alpha - 1}\,ds\Bigr\},$$

where $F_K(e) = \{s > 0 \mid s^{\gamma} B_r(e) \cap K \ne \emptyset\}$. If the distribution of $\xi$ is spherically symmetric, then $\varphi(e) = C = \text{const}$ and

$$T(K) = 1 - \exp\Bigl\{-C \int_0^{\infty} s^{\alpha - 1} \mu_{m-1}\bigl(S^{m-1} \cap (K/s^{\gamma})^{r}\bigr)\,ds\Bigr\},$$

where $\mu_{m-1}$ is the Lebesgue measure on $S^{m-1}$.

REFERENCES
[1] L. de Haan and E. Omey, Integrals and derivatives of regularly varying functions in $R^d$ and domains of attraction of stable distributions II, Stoch. Proc. Appl., 16 (1983), pp. 157-170.
[2] L. de Haan and S. I. Resnick, On regular variation of probability densities, Stoch. Proc. Appl., 25 (1987), pp. 83-93.
[3] G. Matheron, Random Sets and Integral Geometry, Wiley, New York, 1975.
[4] I. S. Molchanov, On limit theorems for unions of random closed sets, in: Abstracts of the 5th School of Young Mathematicians of Siberia and Far East, Novosibirsk (1990), pp. 73-74. (In Russian.)
[5] T. Norberg, Convergence and existence of random set distributions, Ann. Probab., 12 (1984), pp. 726-732.
[6] E. Seneta, Regularly Varying Functions, Springer, Berlin etc., 1976.
[7] A. L. Yakimiv, Multivariate Tauberian theorems and their application to the Bellman-Harris branching processes, Math. USSR Sb., 115 (1981), pp. 463-477.

KIEV TECHNOLOGICAL INSTITUTE OF FOOD INDUSTRY, VLADIMIRSKAYA 68, KIEV 252017, Ukraine

E. V. Morozov

A COMPARISON THEOREM FOR QUEUEING SYSTEM WITH NON-IDENTICAL CHANNELS

We consider two $m$-channel queueing systems $Q$ and $Q'$ with the same regenerative input and non-identical channels (in each system). In $Q$ the arrivals form a common queue, and in $Q'$ the so-called random assignment discipline is used. It is shown that the system $Q'$ (which is simpler to analyse than $Q$) possesses some "majoring" property with respect to $Q$.

We use the following notation: $\stackrel{def}{=}$ is equality by definition, and $1\{A\}$ is the indicator of the event $\{A\}$. As a rule, relations between random variables (r.v.'s) are assumed to hold almost surely (a.s.).

Let $(\tau_n)_{n \ge 1}$ be the sequence of input intervals and let $0 = \beta_0 < \beta_1 < \dots$ be the regeneration points for $(\tau_n)$, such that the so-called regeneration cycles (r.c.'s) [...] are i.i.d. We assume that

$$E(\beta_1) < \infty, \tag{1}$$

$$E(\tau_1 + \dots + \tau_{\beta_1}) \stackrel{def}{=} d < \infty, \tag{2}$$

$$P(\tau_1 > 0,\ \beta_1 = 1) > 0. \tag{3}$$

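Let [...] be i.i.d. service times for the $i$th channel of $Q$, with mean $b_i$, and

$$\mu_i \stackrel{def}{=} 1/b_i, \qquad p_i \stackrel{def}{=} \frac{\mu_i}{\sum_{j=1}^{m} \mu_j}, \qquad i = 1, \dots, m.$$

We note that for $m > 1$

$$0 < p_i < 1, \qquad i = 1, \dots, m. \tag{4}$$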
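The assignment probabilities are simple to evaluate. The sketch below, written in Python with exact rational arithmetic, computes them for a set of hypothetical service rates and checks relation (4). The rates, the input rate `lam` and the function name `routing_probabilities` are illustrative assumptions, not values from the paper.

```python
from fractions import Fraction


def routing_probabilities(service_rates):
    """p_i = mu_i / (mu_1 + ... + mu_m) for positive service rates mu_i."""
    total = sum(service_rates)
    return [Fraction(mu) / total for mu in service_rates]


mu = [Fraction(1), Fraction(2), Fraction(5)]  # hypothetical rates, m = 3
p = routing_probabilities(mu)

# (4): for m > 1 every p_i lies strictly between 0 and 1; the p_i sum to 1.
in_unit_interval = all(0 < p_i < 1 for p_i in p)
total_probability = sum(p)

# With this choice of p_i, the per-channel load condition lam * p_i < mu_i
# holds for every channel exactly when lam < mu_1 + ... + mu_m.
lam = Fraction(7)  # hypothetical input rate
stable = [lam * p_i < mu_i for p_i, mu_i in zip(p, mu)]
```

Because $p_i$ is proportional to $\mu_i$, the inequality $\lambda p_i < \mu_i$ reduces to the same condition $\lambda < \mu_1 + \dots + \mu_m$ for every channel, which is why faster channels can safely receive a larger share of the input.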

We also consider a queueing system $Q'$ on the same probability space as $Q$. The only difference between $Q$ and $Q'$ is that $Q$ has a common queue with FIFO service discipline, while in $Q'$ the $n$th customer is assigned to the $i$th channel with probability

$$p_i = P(\,[\ldots] = 1),$$

where [...] are i.i.d. Bernoulli r.v.'s, $i = 1, \dots, m$. By assumption we can require that the service times of the $n$th customer at the $i$th channel in both systems ($Q'$ and $Q$) are equal a.s. (for $n \ge 1$), and that the same is true for the input intervals. (We shall use the same notation for these values in $Q'$ and in $Q$.) Let $N(0) = 0$ and for $t > 0$ define

$$N(t) = 1 + \sup\,(n : \tau_1 + \dots + \tau_n < t) \qquad (\sup(\emptyset) = 0).$$
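The counting process just defined can be evaluated directly from a finite list of input intervals. The following Python sketch assumes nonnegative intervals and a finite horizon; the function name `arrivals_by` and the sample intervals are hypothetical.

```python
from itertools import accumulate


def arrivals_by(t, intervals):
    """N(t) = 1 + sup{n : tau_1 + ... + tau_n < t}, with N(0) = 0.

    The supremum over the empty set is taken to be 0, so N(t) = 1 for every
    small t > 0: the first customer arrives at time 0.
    """
    if t <= 0:
        return 0
    # Partial sums tau_1, tau_1 + tau_2, ... are nondecreasing, so counting
    # those below t gives the largest admissible n.
    partial_sums = accumulate(intervals)
    return 1 + sum(1 for s in partial_sums if s < t)


tau = [1, 2, 3]  # hypothetical input intervals
counts = [arrivals_by(t, tau) for t in (0, 0.5, 1, 1.5, 3.5, 10)]
```

Note the strict inequality in the definition: a customer arriving exactly at epoch $t$ is not yet counted in $N(t)$.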

Evidently $N(t)$ is the number of customers to arrive by epoch $t$. The process $N(\cdot)$ is a nondecreasing cumulative process with finite mean r.c. length $d$ and with finite mean increment $E(\beta_1)$ per r.c. Therefore we can define the input rate as

$$\lambda \stackrel{def}{=} \lim_{t \to \infty} E(N(t))/t = E(\beta_1)/d \in (0, \infty) \tag{5}$$

(see [1]). We note that $N(t)/t \to \lambda$ a.s. as $t \to \infty$. In fact, the regenerative input with rate $\lambda p_i$ goes to the $i$th channel of $Q'$. Actually, let $N_i(t)$ ($N_i'(t)$) be the number of customers among $N(t)$ which need service at the $i$th channel of $Q$ ($Q'$). Then

$$N_i'(t) = \sum_{p=1}^{N(t)} [\ldots], \qquad t \ge 0, \quad i = 1, \dots, m.$$

But the epochs $(\beta_n)$ are the regeneration points for the nondecreasing cumulative process $N_i'(\cdot)$ too, and the mean increment of $N_i'(\cdot)$ per r.c. is

$$E\Bigl(\sum_{p=1}^{\beta_1} [\ldots]\Bigr) = p_i E(\beta_1) < \infty.$$

Thus we obtain

$$\lim_{t \to \infty} E(N_i'(t))/t = \lambda p_i, \qquad i = 1, \dots, m. \tag{6}$$
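Relation (6) says that random assignment thins the input in proportion to $p_i$. A small simulation can make this plausible; the Python sketch below only checks the long-run fractions of customers sent to each channel, under the assumption of independent assignments, and does not model the regenerative input itself. The probabilities, the seed and the function name `assign_customers` are hypothetical.

```python
import random


def assign_customers(n_customers, probabilities, seed=0):
    """Assign each customer independently to channel i with probability p_i."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    channels = range(len(probabilities))
    counts = [0] * len(probabilities)
    for channel in rng.choices(channels, weights=probabilities, k=n_customers):
        counts[channel] += 1
    return counts


p = [0.125, 0.25, 0.625]  # hypothetical assignment probabilities
n = 100_000
counts = assign_customers(n, p)
fractions = [c / n for c in counts]  # should be close to p for large n
```

Dividing both the per-channel count and the total count by $t$ shows that the per-channel arrival rate is approximately $\lambda$ times the corresponding fraction, in agreement with (6).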

Let $t_n = \tau_1 + \dots + \tau_{n-1}$, $n \ge 1$ (we recall that the first customer arrives at $t_1 = 0$ and that it begins the first r.c. of input). Let $\nu(t)$ ($\nu'(t)$) be the number of customers at $Q$ ($Q'$) at time $t^-$, $\nu^{(i)}(t)$ be the part of $\nu'(t)$ assigned to the $i$th channel of $Q'$, [...] $\stackrel{def}{=} \nu^{(i)}(t_n)$, and $w_i(t)$ ($w_i'(t)$) be the residual workload (including service times for waiting customers) at time $t^-$ at the $i$th channel of $Q$ ($Q'$), $i = 1, \dots, m$. We define $\mu_i(t)$ ($\mu_i'(t)$) as the time when the $i$th channel of $Q$ ($Q'$) is empty in the interval $[0, t]$, and $c_i(t)$ ($c_i'(t)$) as the total workload for the $i$th channel by time $t$, $i = 1, \dots, m$, at $Q$ ($Q'$). Let [...] $= 0$, [...] $= \inf\{k : k > [\ldots],\ [\ldots] = 0,\ k = \beta_p \text{ for some } p \ge 1\}$ ($\inf \emptyset = \infty$). By construction of $Q'$ the moments

are the regeneration points for the ith channel

of Q'. More exactly, are the regeneration points for the processes wH') and 1), i = 1, ... ,m. Let us define :::;

(we note that

xii)

t

(i)

fJn+1

-

t

(0)

fJn

is the nth input r.c, length in continious time), and let

when the ith channel of Q' is empty on

n

xii) (to d;J 0), n

0, i

= 1, ...

,m.

be the time


Let

=

/E

E

(= 0,

(};i

if

= 00).

E

(i»)n>l are LL .. d • r.v. ,s,.In d den t 0 f XCi) d ( J.!n(i») n>l are LL .. d • r.v, ' s W e no t e th a t (X n epen 0 ' an independent of i -= 1, ... , m. We define -

J t

J.!(t)

=

l{v(x) 1 and

(8)

APi < J.!i,

then for all x E [0,00)

< 00 I w:(O)

(9)

= x) = 1,

and

(10)

i

We define

(};i

Theorem. If A
O.

Proof. We have for all t (12)

c;(t)

= t + Wi(t) -

J.!i(t),

i

= 1, ...

For each t there exists a random integer i(t) such that (13)

t

If we use Wi(O) = WI(O) (a.s.), i

(14)

C;(t)(t)

From (7) we have J.!(t)

(15)

=

= 1, ...

,mj then from (13) we obtain

aCt)

a'(t)

p=l

p=l

J.!i(t)(t), t

O.

'E

= «t)(t),

t

O. From (12), (14) we have now

o.

,m.


and thus m

t 2: 0.

J.L(t) 2: J.Li(t)(t) 2: min J.LHt) - '" WI(t), 1 0,

t

-+ 00,

i=l, ...

,m,

The proof is completed. $\square$

We remark that convergence in mean takes place in (17), (18) too [1]. For $m = 1$ the relation (11) turns into an equality. However, in this case $p_1 = 1$ and the proof of (9), (10) becomes more complicated than the one in the paper [2], where we used the condition $P(\tau_1 > [\ldots],\ \beta_1 = 1) > 0$, which is more restrictive than the condition (3) used here. Stronger majoring properties of the system $Q'$ in the case of identical channels were obtained, for example, in [4], [5]. We would like to mention that the method of comparing the "worst" channel of $Q'$ with the best channel of $Q$, used in this paper, was proposed in [6].

REFERENCES
[1] W. L. Smith, Regenerative stochastic processes, Proc. Roy. Soc. London, 232 A (1955), pp. 6-31.
[2] E. V. Morozov, A service of the regenerative input, in: Flow Control, Analysis and Modeling of Information and Computing Networks, Kuibyshev State Univ. Publ., Kuibyshev, pp. 87-94. (In Russian.)
[3] E. V. Morozov, Some results for time-continuous processes in the queueing system GI/GI/1 with losses from queue. I, in: News Byelorus. Acad. Sci., 2 (1983), pp. 51-55. (In Russian.)
[4] R. Wolff, An upper bound for multi-channel queues, J. Appl. Probab., 17 (1977), pp. 884-888.
[5] R. Wolff, Upper bounds on work in system for multichannel queues, J. Appl. Probab., 24 (1987), pp. 547-551.
[6] E. V. Morozov, Renovation of multi-channel queues, Rep. Byelorus. Acad. Sci., 31 (1987), pp. 120-121.

PETROZAVODSK STATE UNIV., PROSPEKT LENINA 33, PETROZAVODSK 185000, Russia

Josep M. Oller

ON AN INTRINSIC BIAS MEASURE

1. Introduction. In parametric statistical estimation theory, the concepts of bias and mean square error play an important role in characterizing the properties of estimators. They have been widely used since Fisher [9, 10], and through them many important results have been established: the Cramér-Rao lower bound and the Rao-Blackwell theorem, among many others. These concepts are clearly dependent on the coordinate system or model parametrization. This fact would not be problematic provided that closely related properties, like unbiasedness and minimum variance estimation, were preserved under coordinate system transformations. Unfortunately, this is not the case, essentially due to the non-tensorial character of the classical measures of bias and mean square error. Therefore, in spite of their importance, these concepts present a serious conceptual problem: their lack of invariance. Properties like unbiasedness or minimum variance are not intrinsic to the estimation method, but depend on the parametrization of the statistical model. From these considerations a natural question arises: are the bias, the mean square error or some other analogous measures necessarily dependent on a coordinate system? Or alternatively, could these notions be formulated depending only on the estimation procedure employed? In this paper an affirmative answer is given to the second question, and some statistical consequences are explored. The first part of the paper is an introduction to the moments of a random field on an $n$-dimensional $C^\infty$ real manifold, and also to the mean concept of a random variable which takes values on a Hausdorff and connected manifold equipped with an affine connection, through the exponential map, emphasizing the analogies and differences between moments and mean values, and considering, in particular, the Riemannian case. Additionally, we extend the Fourier transform and the exponential families to the present context.
The second part is the application of these results to the bias and mean square error corresponding to a statistical estimator, while the third part is the development of an intrinsic version of the Cramér-Rao lower bound. In the last part some examples are introduced and discussed.

2. Moments and mean values. Let $(\chi, \mathfrak{B}, P)$ be a probability space, where $\chi$ is the sample space, $\mathfrak{B}$ is a $\sigma$-algebra of subsets of $\chi$ and $P$ is a probability measure. Let $(M, \mathfrak{A})$ be an $n$-dimensional $C^\infty$ real manifold, $\mathfrak{A}$ being the atlas for $M$, also called the differentiable structure for $M$. For the sake of simplicity we shall proceed with finite-dimensional $C^\infty$ real manifolds, but we could extend the following to Banach manifolds with the same basic ideas. Let $f$ be a measurable map, $f\colon \chi \to M$, also called a random variable on $M$, that is, a map such that for all open sets $W \subset M$, $f^{-1}(W) \in \mathfrak{B}$. We will now introduce the notion of mean value and moments of $f$, making the fewest necessary assumptions and maintaining the intuitive notion of centrality measure, in an idea closely related to that of center of mass, as we shall see later (see Karcher [13], Kobayashi and Nomizu [16] and Kendall [15]), but allowing the introduction of additional tools which we expect to be fruitful in statistics.


The first attempt to solve this problem is by considering the atlas structure. If there exists a global chart $(M, \phi)$ we may try to define the mean value of $f$ as

$$E(f) = \phi^{-1}\Bigl(\int_{\chi} (\phi \circ f)(x)\,dP(x)\Bigr),$$

but this naive approach is not satisfactory, since $E(f)$ would depend, in general, on the coordinate system. Only if we restrict ourselves to linear transformations would this approach be suitable, but this is too restrictive and quite arbitrary.

In order to solve this problem, let us first introduce some concepts. Let $A$ be a subset of $M$, and $\mathcal{P}^{(p,q)}$ the set of all $C^\infty$ tensor fields in any open subset of $A$, of order $p + q$, $p$ times contravariant and $q$ times covariant. For fixed $m \in A$, any map $X$ from $\chi$ to $\mathcal{P}^{(p,q)}$ induces a map $X_m$, such that $X_m\colon \chi \to T^p_q(M_m)$ with $X_m(x) = (X(x))_m$, where $T^p_q(M_m)$ denotes the space of $(p,q)$-tensors on the tangent space at $m$, $M_m$, having a natural topological vector space structure. Now, a simple and operative definition follows.

Definition 2.1. A $C^\infty$ random $(p,q)$-tensor field on $A$, $X$, is a map from $\chi$ to $\mathcal{P}^{(p,q)}$ such that, $\forall m \in A$, the induced map $X_m$ is a measurable map on $(\chi, \mathfrak{B})$.

Notice that, with this definition, a random tensor field may be considered a tensor-valued stochastic process, parametrized by $m \in A \subset M$. On the other hand, observe that $\mathcal{P}^{(p,q)}$ may be equipped with a topology induced, through the atlas, by any standard topology between maps from open sets of $R^n$ to open sets of $R^{n^{(p+q)}}$, allowing a different, and more elegant, random tensor field definition, as a measurable tensor-field-valued map, whose relationship with the previous one could be interesting to study. Moreover, any random tensor field may be characterized by its $n^{(p+q)}$ components with respect to any coordinate system $\theta^1, \dots, \theta^n$,

$$X^{\alpha_1, \dots, \alpha_p}_{\beta_1, \dots, \beta_q}(x; \theta^1, \dots, \theta^n),$$

which are clearly, for fixed $x$, $C^\infty$ functions of $\theta^1, \dots, \theta^n$ and, for fixed $\theta$, real-valued measurable functions on $(\chi, \mathfrak{B})$.

Let $\otimes$ stand for the tensor field product. In the present context it is natural to define the following.

Definition 2.2. The $k$-order moment of the random tensor field $X$ is an ordinary $C^\infty$ $(kp, kq)$-tensor field on $A$ defined by

$$M^k(X) = \int_{\chi} \overbrace{X(x) \otimes \dots \otimes X(x)}^{k}\,dP(x), \qquad k \in N,$$

provided the existence of the above integral, or equivalently,

$$M^k(X)_m = \int_{\chi} \overbrace{X(x)_m \otimes \dots \otimes X(x)_m}^{k}\,dP(x), \qquad \forall m \in A,\ k \in N.$$

Notice that $M^k(X)$ may be computed explicitly through its components in any coordinate system. The components of $M^k(X)$ with respect to a coordinate system $\theta^1, \dots, \theta^n$ will be given by

$$M^{\alpha^1_1 \dots \alpha^1_p\, \dots\, \alpha^k_1 \dots \alpha^k_p}_{\beta^1_1 \dots \beta^1_q\, \dots\, \beta^k_1 \dots \beta^k_q}(\theta) = \int_{\chi} X^{\alpha^1_1 \dots \alpha^1_p}_{\beta^1_1 \dots \beta^1_q}(x; \theta) \cdots X^{\alpha^k_1 \dots \alpha^k_p}_{\beta^k_1 \dots \beta^k_q}(x; \theta)\,dP(x).$$


This is in fact the simplest and also the most natural extension of the $k$-order moment to a random tensor field. In particular, the 1-order moment should be called the expectation tensor field corresponding to $X$, and may be denoted as $E(X) \equiv M^1(X)$. Observe also the linearity of the 1-order moment, as a consequence of the integral properties, and note that

$$M^k(X) = E(\overbrace{X \otimes \dots \otimes X}^{k}),$$

where the tensor product of random tensor fields is naturally defined from the tensor product of ordinary tensor fields. Further, let $X$ be a random field on $A$ and $Y \in \mathcal{P}^{(p,q)}$; then it is straightforward to introduce the following.

Definition 2.3. The $k$-order moment of $X$ with respect to $Y$ is given by

$$M^k_Y(X) \equiv M^k(X - Y) = \int_{\chi} \overbrace{(X(x) - Y) \otimes \dots \otimes (X(x) - Y)}^{k}\,dP(x),$$

provided its existence.

Also, the moments with respect to the first-order moment shall be called central moments, which exhibit classical properties:

$$M^2_{E(X)}(X) = E(X \otimes X) - E(X) \otimes E(X).$$

The components of this tensor, with respect to a coordinate system, may be written in matrix notation, obtaining the covariance matrix

$$E(XX') - E(X)E(X)',$$

identifying, in the previous equation, the tensors with their components. We may now observe that, since there exists a natural identification of the tangent vectors with first-order contravariant tensors, we can extend the previous definitions to random vector fields. Furthermore, we may extend the Laplace or Fourier transform to the present context, if we first introduce some additional concepts. Given a map $\psi\colon \mathcal{P}^{(p,q)} \to \mathcal{F}_A$, where $\mathcal{F}_A$ is the set of all $C^\infty$ functions on $A \subset M$, with the corresponding induced maps $\psi_m\colon T^p_q(M_m) \to R$, $m \in A$, with $\psi_m(X_m) = \psi(X)(m)$, we may introduce the following.

Definition 2.4. The map $\psi$ is differentiable if and only if there exists a $(q,p)$-tensor field $D\psi(X)$ on $A$ such that for every $m \in A$ the corresponding $(q,p)$-tensor $D\psi(X)_m \in T^q_p(M_m)$ satisfies

$$\lim_{Y_m \to 0_m} \frac{|\psi_m(X_m + Y_m) - \psi_m(X_m) - C(D\psi(X)_m, Y_m)|}{\|Y_m\|} = 0,$$

where $X_m, Y_m \in T^p_q(M_m)$, $C$ is the contraction operator over all tensor indexes, and $\|\cdot\|$ is any norm on $T^p_q(M_m)$ compatible with the topology induced by the coordinate system.

Notice that the existence of $D\psi(X)_m$ satisfying the previously mentioned property is equivalent to the classical differentiability concept, introduced through the existence of a


linear map $\Lambda\colon T^p_q(M_m) \to R$ which is called the differential, since $(T^p_q(M_m))' \equiv T^q_p(M_m)$. For this reason we may call $D\psi(X)$ the differential tensor field of $\psi$ at $X$. Additionally, we may define higher-order differentiability in a rather obvious way through the successive differentials, and obtain tensor fields on $A$ of orders $(q,p)$, $(2q,2p)$, $(3q,3p)$, etc., thus allowing analogous tensorial versions of Taylor's development. Let $X$ be a random $(p,q)$-tensor field on $A$, and let $T$ be an ordinary $(q,p)$-tensor field on $A$, $T \in \mathcal{P}^{(q,p)}$; then we may introduce the following.

Definition 2.5. The Fourier transform of the random tensor field $X$ is defined as

$$\varphi_X(T) = \int_{\chi} \exp\{i\,C(T, X(x))\}\,dP(x).$$

Notice that $\varphi_X(T) \in \mathcal{F}_A$ and $D\varphi_X(0) = i M^1(X)$, provided their existence, and successive moments may be obtained by successive differentiation of $\varphi_X(T)$. Moreover, let $X$ be a random $(p,q)$-tensor field on $A \subset M$; in the present context it is natural to introduce the following.

Definition 2.6. We shall say that the random tensor field $X$ is exponential type distributed if and only if there exist an ordinary $(q,p)$-tensor field on $A$, $\Xi \in \mathcal{P}^{(q,p)}$, and a $\sigma$-finite positive measure $\mu$ on $(\chi, \mathfrak{B})$ such that the map $g(\Xi) \in \mathcal{F}_A$ defined as

$$g(\Xi)(m) = \int_{\chi} \exp\{C(\Xi_m, X(x)_m)\}\,d\mu(x)$$

is bounded, $g(\Xi)(m) < \infty$ $\forall m \in A$, and the random tensor field $X$ admits a density function, at each tangent tensor space $T^p_q(M_m)$, of the form [...], where [...] is the $\mu$-induced measure by $X_m$ on $T^p_q(M_m)$.

Notice that this is an extension of the exponential family distributions to random tensor fields on a manifold.

In order to consider the mean value of a random variable (a measurable map) which takes values on a Hausdorff and connected manifold, we have to introduce an additional structure on the manifold: we shall assume that there is an affine connection defined on it. Naturally associated with an affine connection there is a map, called the exponential map, which is defined through the corresponding geodesics as follows. Let $\gamma\colon [0,1] \to M$ be a geodesic such that $\gamma(0) = m$, $m \in M$, and

$$\left.\frac{d\gamma}{dt}\right|_{t=0} = v;$$

then the exponential map is defined by $\exp_m(v) = \gamma(1)$, defined for all $v$ in an open star-shaped neighbourhood of $0_m \in M_m$. It is well known that this map, in general, does not have an inverse, although there are important particular cases where one exists. Moreover, we can always restrict the map to an open neighbourhood of $0_m \in M_m$ such that the inverse is well defined, the exponential map thus being a local diffeomorphism. Typical examples of manifolds with an affine connection are Riemannian manifolds. Furthermore, let $m$ be a


point of a Riemannian manifold $(M, \mathfrak{A})$, $m \in M$, and let $M_m$ be the tangent space at $m$. We now define $\mathfrak{S}_m \subset M_m$ as $\mathfrak{S}_m = \{\xi \in M_m : \|\xi\|_m = 1\}$, and for each $\xi \in \mathfrak{S}_m$ we define

$$c_m(\xi) = \sup\{t > 0 : \rho(m, \gamma_\xi(t)) = t\},$$

where $\rho$ is the Riemannian distance and $\gamma_\xi$ is a geodesic defined in an open interval containing zero, such that $\gamma_\xi(0) = m$ and with tangent vector equal to $\xi$ at the origin. Then if we set

$$\mathfrak{D}_m = \{t\xi \in M_m : 0 \le t < c_m(\xi),\ \xi \in \mathfrak{S}_m\} \quad \text{and} \quad D_m = \exp_m(\mathfrak{D}_m),$$

it is well known that $\exp_m$ maps $\mathfrak{D}_m$ diffeomorphically onto $D_m$. Moreover, if the manifold is also complete, the boundary of $\mathfrak{D}_m$, $\partial\mathfrak{D}_m$, is mapped by the exponential map onto $\partial D_m$, called the cut locus of $m$ in $M$. The cut locus of $m$ has zero $n$-dimensional Riemannian measure in $M$ (essentially due to Sard's theorem), and $M$ is the disjoint union of $D_m$ and $\partial D_m$. For additional details see Hicks [11] or Spivak [24].

Even if the inverse of the exponential map does not exist, we may define a map that we shall call an admissible pseudoinverse, given as follows.

Definition 2.7. $\exp^-_m$ is an admissible pseudoinverse of the affine connection exponential map at the point $m \in W \subset M$ if and only if $\exp^-_m\colon W \to M_m$ is a $C^\infty$ function in any open set contained in $W$ with

$$\exp_m(\exp^-_m(m')) = m'$$

for any $m' \in W$, and additionally, if $\widetilde{\exp}^-_m$ is another map, defined on $\widetilde{W}$ with $W \cap \widetilde{W} \ne \emptyset$ and satisfying the previous condition, then
(i) Manifolds with an affine connection: if $\exp^-_m(m') = \lambda\,\widetilde{\exp}^-_m(m')$ for a real number $\lambda$, then $|\lambda| \le 1$.
(ii) Riemannian manifolds: $\|\exp^-_m(m')\|_m \le \|\widetilde{\exp}^-_m(m')\|_m$ for $m' \in W \cap \widetilde{W}$,
where the norm $\|\cdot\|_m$ is the Riemannian norm on $M_m$.

Notice that the condition demanded in the Riemannian case is stronger than in the affine connection case. Hereafter, we shall assume that the admissible pseudoinverse exponential maps satisfy the condition corresponding to the Riemannian case or not, depending on whether or not the considered Hausdorff and connected affine manifold also has a Riemannian structure. Let us remark that in the complete Riemannian case, $\exp^-_m(\cdot)$ is uniquely defined in $D_m$, and thus it becomes the true inverse of the exponential map restricted to $\mathfrak{D}_m$.

Through the concept of admissible pseudoinverse of the exponential map, given a random variable $f$ taking values on a Hausdorff and connected manifold equipped with an affine connection (which may be the Levi-Civita connection corresponding to a Riemannian manifold), there is a natural way to define a random vector (contravariant tensor) field over a manifold subset, given by $\exp^-_m(f(x))$, where $\exp^-_m(\cdot)$ is an admissible pseudoinverse of the exponential map. This vector field is not necessarily defined for all $x \in \chi$. Moreover,


even when it is defined, it may not be uniquely defined. Therefore we may have different admissible versions of these fields. Then, we are ready to introduce the following mean value concept.

Definition 2.8. A point on the manifold $m \in M$ is a mean value of the random variable $f$, and we shall write $m = \mathfrak{M}(f)$, if and only if there exists an admissible pseudoinverse of the exponential map such that $\exp_m^-(f(x))$ is defined almost everywhere $[P]$, and for any admissible pseudoinverse of the exponential map satisfying this condition, we have
$$\int_X \exp_m^-(f(x))\,dP(x) = 0_m.$$

Let us remark that this is an intrinsic mean value definition, independent of the coordinate system. If we denote by $P_f$ the probability measure induced by the measurable map $f$ in $M$, we have the following result.

Proposition 2.9. Let $(X, \mathfrak{B}, P)$ be a probability space, $(M, \mathfrak{A})$ be a complete Riemannian manifold and $f : X \to M$ a measurable map, such that $P_f$ is dominated by the Riemannian measure $V_R$, $P_f \ll V_R$. Let $\exp_m^-(\cdot)$ and $\widetilde{\exp}_m^-(\cdot)$ be two admissible pseudoinverses of the exponential map. Then
$$\int_X \exp_m^-(f(x))\,dP(x) = \int_X \widetilde{\exp}_m^-(f(x))\,dP(x) \qquad \forall m \in M,$$
provided their existence.

Proof. This is an immediate consequence of the image measure theorem and of the fact that the cut locus of $m$ in $M$ is a $P_f$ probability zero set, since $P_f \ll V_R$, and thus
$$\exp_m^-(\cdot) = \widetilde{\exp}_m^-(\cdot) \quad \text{a.e. } [P],$$
following the proposition. $\blacksquare$

Therefore, in the complete and Riemannian case, with $P_f \ll V_R$, all admissible pseudoinverses are equivalent in order to compute mean values. We shall consider now several examples.

Example 2.10. Let $M$ be $R^n$. Identifying the points with their coordinates corresponding to the trivial chart, and considering the usual Euclidean affine connection, we have, for $z, m \in R^n$, that $\exp_m^-(z) = (z - m)_m$. In order to find the mean value of a random variable $f$ we have to solve the following equation
$$\int_X (f(x) - m)_m\,dP(x) = 0_m$$
but this equation has the unique trivial solution
$$m_m = \int_X f(x)_m\,dP(x).$$


Moreover, the second order central moment of $\exp_m^-(f(x))$ can be written, in matrix notation and omitting the subindex $m$, as
$$M_2(\exp_m^-(f(x))) = E\bigl((f(x) - m)(f(x) - m)'\bigr) = E(ff') - E(f)E(f)'$$
which is the usual covariance matrix.

Example 2.11. Another interesting example is given by considering the mean values of the von Mises distribution. In this case the manifold is the unit $n$-dimensional Euclidean sphere. The probability measure induced in the manifold is absolutely continuous with respect to the surface measure on the sphere and the corresponding density function (Radon-Nikodym derivative) is given by
$$p(z; \xi, \lambda) = a_n(\lambda)\exp\{\lambda\,\xi' z\}, \qquad z,\ \xi \in S_n = \{z \in R^n : z'z = 1\},$$
and where
$$a_n(\lambda) = \frac{\lambda^{k/2-1}}{(2\pi)^{k/2}\,I_{k/2-1}(\lambda)}$$
is a normalization constant, $I_{k/2-1}$ being the modified Bessel function of the first kind and order $k/2 - 1$. In this case the existence of two mean values is clear, given by $\xi$ and $-\xi$. Compare this result with the mean direction defined in Mardia et al. [17, pp. 424-451]. See also Jupp and Mardia [12] for a comprehensive exposition.

Example 2.12. Consider a random variable uniformly distributed in a circle, with the connection induced by the natural embedding into the Euclidean space $R^2$. Then, all points on the circle are mean values.

Notice that the paradoxical existence of many mean values is possible. In order to emphasize the existence of a unique mean value $m$, in such a case we shall call it the proper mean value, and we would supply, in the Riemannian case, a scalar dispersion measure with respect to the mean value: the ordinary expected value of the square of the Riemannian distance between $f(x)$ and $m$, which may be regarded as an invariant version, independent of the coordinate system, of the variance of a real random variable. It is also possible to define a dispersion measure with respect to an arbitrary reference point of a Riemannian manifold, as the mean value of the square of the Riemannian distance between $f(x)$ and the selected reference point.

We may observe also that with this extension of the concept of the mean value or expectation, we maintain the intuitive and appealing meaning of centrality measure, even though we lose the linear properties of the expectation, since the linearity is a consequence of the integral properties. The classical expectation definition of a random variable which takes values on $R^n$ (or, in general, in a Banach space) allows the identification of the mean value and the integral concepts, since the tangent space of $R^n$ can be identified trivially with $R^n$ itself, and thus the $R^n$ vectors may be viewed as constant first order contravariant tensor fields.
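The circle of Example 2.12 lends itself to a small numerical check of Definition 2.8. The sketch below is only an illustration under our own assumptions: points of the circle are represented by angles in radians, the pseudoinverse of the exponential map is taken to be the signed arc length wrapped into $(-\pi, \pi]$, and the uniform distribution is replaced by $N$ equally spaced points, for which the averaged tangent vector is not exactly zero but is bounded by $\pi/N$ at every base point. The function names are hypothetical.

```python
import math

def log_map_circle(m, x):
    """Signed arc length from m to x on the unit circle, wrapped into (-pi, pi].

    This plays the role of an admissible pseudoinverse of the exponential map
    when points are represented by their angles (an assumption of this sketch).
    """
    d = (x - m) % (2.0 * math.pi)          # value in [0, 2*pi)
    return d - 2.0 * math.pi if d > math.pi else d

def mean_tangent(m, sample):
    """Sample analogue of the integral of exp_m^-(f(x)) dP(x)."""
    return sum(log_map_circle(m, x) for x in sample) / len(sample)

# N equally spaced angles stand in for the uniform distribution on the circle.
N = 1000
uniform_grid = [2.0 * math.pi * j / N for j in range(N)]
residuals = [abs(mean_tangent(m, uniform_grid)) for m in (0.0, 0.7, 2.5, 4.0)]

# A concentrated sample, by contrast, singles out one base point.
cluster = [0.9, 1.0, 1.1, 1.2]
center = sum(cluster) / len(cluster)

if __name__ == "__main__":
    print("residuals on the uniform grid:", residuals)
    print("mean tangent at the cluster centre:", mean_tangent(center, cluster))
    print("mean tangent away from the centre:", mean_tangent(center + 0.5, cluster))
```

On the grid every tested base point is, up to the discretisation error, a mean value, while for the concentrated sample only the centre of the cluster makes the averaged tangent vector vanish, which is the situation the text calls a proper mean value.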
This suggests the dissociation of the mean value and the (first order) moment concept. The moments of a random map $f$, which takes values on $M$, should be defined as follows.

Definition 2.13. The $k$-order moment of the random map $f$ is an ordinary $C^\infty$ $(k, 0)$-tensor field on $A$ defined by
$$M_k(f)_m = \int_X \underbrace{\exp_m^-(f(x)) \otimes \cdots \otimes \exp_m^-(f(x))}_{k}\,dP(x), \qquad \forall m \in A,$$
provided the existence of the above integral, and its independence with respect to the concrete $\exp_m^-(\cdot)$ version.

A slightly different way to regard the mean value points of $f$ is by considering a map defined as
$$\rho_f(m) = \exp_m\Bigl(\int_X \exp_m^-(f(x))\,dP(x)\Bigr).$$
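The map just defined suggests a simple numerical procedure: replace the integral by a sample average and iterate the map until it stops moving. The following sketch does this on the unit sphere in $R^3$; it is an illustration under our own assumptions (the closed-form exponential and inverse exponential maps of the round sphere, a finite sample instead of $P$, and a sample concentrated well inside a hemisphere so that the cut locus plays no role). All function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def exp_map(m, v):
    """Exponential map of the unit sphere: move from m along the tangent vector v."""
    t = norm(v)
    if t < 1e-15:
        return list(m)
    return [math.cos(t) * a + math.sin(t) * b / t for a, b in zip(m, v)]

def log_map(m, x):
    """Inverse exponential map at m, meaningful away from the cut locus {-m}."""
    c = max(-1.0, min(1.0, dot(m, x)))
    theta = math.acos(c)                      # Riemannian distance from m to x
    w = [b - c * a for a, b in zip(m, x)]     # part of x orthogonal to m
    nw = norm(w)
    if nw < 1e-15:
        return [0.0] * len(m)                 # x equals m (antipodal case not handled)
    return [theta * wi / nw for wi in w]

def mean_tangent(m, sample):
    """Sample average of the tangent vectors log_m(x_i)."""
    k = len(sample)
    return [sum(col) / k for col in zip(*(log_map(m, x) for x in sample))]

def intrinsic_mean(sample, m0, tol=1e-12, max_iter=200):
    """Iterate m -> exp_m(mean_tangent(m)) until the step is smaller than tol."""
    m = list(m0)
    for _ in range(max_iter):
        new = exp_map(m, mean_tangent(m, sample))
        if norm([a - b for a, b in zip(new, m)]) < tol:
            return new
        m = new
    return m

# Four points at colatitude 0.3 around the north pole; by symmetry the
# fixed point should be the pole (0, 0, 1).
sample = [(math.sin(0.3) * math.cos(phi), math.sin(0.3) * math.sin(phi), math.cos(0.3))
          for phi in (0.0, 0.5 * math.pi, math.pi, 1.5 * math.pi)]
mean = intrinsic_mean(sample, (math.sin(0.5), 0.0, math.cos(0.5)))
residual = norm(mean_tangent(mean, sample))

if __name__ == "__main__":
    print("fixed point:", mean)
    print("norm of the averaged tangent vector:", residual)
```

Convergence of this plain iteration is not guaranteed in general; it is used here only because the sample is concentrated.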

Then the mean values of $f$ are the fixed points of $\rho_f$, provided their existence. There exists a relationship between the defined mean value and the classical center of mass, $\mathfrak{C}$, see the references at the beginning of Section two,
$$\mathfrak{C} = \arg\min_{m \in M} \int_X \rho^2(m, f(x))\,dP(x),$$
given in the next proposition. First we introduce a classical differential geometry tool: the geodesic spherical coordinates on $D_m$, through the restriction of $\exp_m$ to $\mathfrak{D}_m$. Let us assume that there exists a coordinate system on $\mathfrak{S}_m$, $\xi = \xi(u)$, where $u$ varies over a domain in $R^{n-1}$. A coordinate system on $D_m$ is defined by
$$v(\rho, u) = \exp_m(\rho\,\xi(u))$$
and the Riemannian metric may be expressed as
$$ds^2 = (d\rho)^2 + |A(\rho; \xi)\,d\xi|^2$$
for a certain $(n-1) \times (n-1)$ matrix $A(\rho; \xi)$, see Chavel [8, pp. 66-67], and the Riemannian volume is given by
$$dV_R(\exp_m(\rho\,\xi)) = \det A(\rho; \xi)\,d\rho\,d\mu_m(\xi)$$

where $d\mu_m$ denotes the $(n-1)$-dimensional volume element on $\mathfrak{S}_m$. Now we are ready for the following

Proposition 2.14. Let $(X, \mathfrak{B}, P)$ be a probability space, $(M, \mathfrak{A})$ be a complete Riemannian manifold and $f : X \to M$ a measurable map, such that $P_f$ is dominated by the Riemannian measure $V_R$, $P_f \ll V_R$. Let $J$ be a $C^\infty$ function on the manifold defined as
$$J(m) = \int_X \rho^2(m, f(x))\,dP(x).$$
Then $J$ has a critical point at $m \in M$ if and only if $m = \mathfrak{M}(f)$.

Proof. Let $q \in M$ be a fixed point on the manifold, and consider the map $\exp_m^-(q)$ uniquely defined in the open set $D_q$, such that $\rho^2(m, q) = \|\exp_m^-(q)\|^2$. Let $X$ be a vector field defined in a neighbourhood of $m$. Then, considering a geodesic spherical coordinate system, with origin $q$, it is clear that
$$X_m\|\exp_m^-(q)\|^2 = 2\langle D_X \exp_m^-(q), \exp_m^-(q)\rangle_m = -2\langle X, \exp_m^-(q)\rangle_m$$
and since $J$ has a critical point at $m$ if and only if $X_m J = 0$ $\forall X_m \in M_m$, we have the equivalent condition

0= XmJ

= -2

1 x,
p-1X

where

where $O(n)$ stands for the $n \times n$ orthogonal matrix group. Therefore, with respect to a coordinate system, and using matrix notation, it follows that
$$G^{1/2} A_{p_0} \sim T\,G^{1/2} A_{p_0} \qquad \forall T \in O(n)$$
and where $\sim$ denotes identical distribution. Therefore, if the second order moment of $\exp_{p_0}^-(U_k)$ exists, we have:
$$E_{p_0}\bigl(G^{1/2} A_{p_0} A_{p_0}' G^{1/2}\bigr) = T\,E_{p_0}\bigl(G^{1/2} A_{p_0} A_{p_0}' G^{1/2}\bigr)\,T' \qquad \forall T \in O(n)$$
whatever true density $p_0$ is assumed, and then the second conclusion follows. $\square$

Now let us illustrate the previously introduced concept with the following.

Example 3.8. Consider the multivariate elliptic probability distributions, with fixed dispersion matrix $\Sigma = \Sigma_0$, that is the parametric family with density functions, in $R^n$ with respect to the Lebesgue measure, given by
$$p(x; \mu) = \frac{\Gamma(n/2)}{\pi^{n/2}}\,|\Sigma_0|^{-1/2}\,F\bigl((x - \mu)'\Sigma_0^{-1}(x - \mu)\bigr)$$
where $\Sigma_0$ is a fixed $n \times n$ strictly positive definite matrix, $\mu = (\mu_1, \dots, \mu_n)'$ is a parameter vector, $\Gamma(n/2)$ is the usual gamma function, and $F$ is a nonnegative function on $R_+ = (0, \infty)$ satisfying
$$\int_0^\infty r^{n/2-1} F(r)\,dr = 1.$$
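As a sanity check on this normalization, one may take the $n$-variate normal law as a member of the family. Under that assumption the radial function is $F(r) = e^{-r/2} / (2^{n/2}\Gamma(n/2))$, and both the normalization integral and the constant $c_F = \frac{1}{n}\int_0^\infty r^{n/2} F(r)\,dr$, which this example uses to relate $\mathrm{Cov}(X)$ to $\Sigma_0$, should equal one. The sketch below verifies this numerically for $n = 4$ with a composite Simpson rule; the truncation point, the step count and the function names are our own choices.

```python
import math

def simpson(g, a, b, steps=20000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def gaussian_F(n):
    """Radial function F of the n-variate normal member of the elliptic family."""
    const = 1.0 / (2.0 ** (n / 2.0) * math.gamma(n / 2.0))
    return lambda r: const * math.exp(-r / 2.0)

# n >= 2 is assumed: for n = 1 the first integrand is singular at r = 0
# and this simple quadrature rule would not apply.
n = 4
F = gaussian_F(n)
upper = 200.0   # truncation point; both integrands are negligible beyond it

normalisation = simpson(lambda r: r ** (n / 2.0 - 1.0) * F(r), 0.0, upper)
c_F = simpson(lambda r: r ** (n / 2.0) * F(r), 0.0, upper) / n

if __name__ == "__main__":
    print("normalisation integral:", normalisation)
    print("c_F:", c_F)
```

With $c_F = 1$ the relation $\mathrm{Cov}(X) = c_F\Sigma_0$ reduces to the familiar statement that $\Sigma_0$ is the covariance matrix of the normal law.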


The vector $\mu$ and the matrix $\Sigma_0$ may be expressed in terms of $E(X)$ and $\mathrm{Cov}(X)$, provided the latter exist. In fact, let $t = (t_1, \dots, t_n)'$; the characteristic function $\phi_F(t) = E(\exp\{it'X\})$ of the above introduced parametric family of probability distributions may be expressed as $\phi_F(t) = \exp\{it'\mu\}\,\Lambda_F(t'\Sigma_0 t)$, where
$$\Lambda_F(s) = \Gamma(n/2)\int_0^\infty r^{n/2-1} F(r)\,K_{n/2-1}(rs)\,dr, \qquad s \in R_+,$$
with
$$K_\nu(s) = \frac{2^\nu J_\nu(\sqrt{s})}{(\sqrt{s})^\nu} = \sum_{m=0}^\infty \frac{(-s)^m}{4^m\,m!\,\Gamma(m + \nu + 1)}$$
and where $J_\nu$ is the ordinary Bessel function of order $\nu$. Formally, therefore
$$E(X) = -i\,\frac{\partial \phi_F(t)}{\partial t}\Big|_{t=0} \qquad \text{and} \qquad E(XX') = -\frac{\partial^2 \phi_F(t)}{\partial t\,\partial t'}\Big|_{t=0}.$$
This gives $E(X) = \mu$ and $E(XX') = \mu\mu' + c_F\Sigma_0$, where
$$c_F = -2\Lambda_F'(0) = \frac{1}{n}\int_0^\infty r^{n/2} F(r)\,dr$$
and hence $\mathrm{Cov}(X) = c_F\Sigma_0$. In particular, $E(X)$ exists if and only if
$$\int_0^\infty r^{(n-1)/2} F(r)\,dr < \infty;$$
additionally $\mathrm{Cov}(X)$ exists if and only if we have
$$\int_0^\infty r^{n/2} F(r)\,dr < \infty.$$

Zk tj>(q)) } sk

Z

with

Z '" .Nn(O,

kEN

where $\xrightarrow{\mathcal{L}}$ stands for weak convergence or convergence in law, and $\mathcal{N}_n$ denotes an $n$-variate normal distribution. Notice that if the previous conditions are satisfied for one local chart $(W, \phi)$ with $q \in W$, then they are satisfied for all local charts $(V, \theta)$ such that $q \in V$, since it is well known, see for instance Serfling [23], that

{

(8

0

z, 8(q)) } Sk

z kEN

where

Z '" .Nn(O,

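J being the

$$|\langle A, C\rangle| \le \|A\|\,\|C\|$$
where $\langle\cdot,\cdot\rangle$ and $\|\cdot\|$ stand for the inner product and the norm defined on every tangent space. Additionally,
$$E(\|A\|\,\|C\|) \le \sqrt{E(\|A\|^2)}\,\sqrt{E(\|C\|^2)}$$
again by the Cauchy-Schwarz inequality, and where the expectations, at each point $p$, are computed with respect to the corresponding probability measure $p_{(k)}\,d\mu_k$. Let $C(x; \theta) = \mathrm{grad}(\log p_{(k)}(x; \theta))$, where $\mathrm{grad}(\cdot)$ stands for the gradient operator. In components notation, and freely using the repeated index summation convention, we may write
$$C^\alpha(x; \theta) = g^{\alpha\beta}(\theta)\,\frac{\partial \log p_{(k)}(x; \theta)}{\partial \theta^\beta},$$
$g^{\alpha\beta}(\theta)$ being the components of the contravariant fundamental tensor field and where $p_{(k)}$ is the joint sample density function. Therefore, simplifying the notation, we have
$$\|C\|^2 = g_{ij}\,C^i C^j = g_{ij}\,g^{i\beta}\frac{\partial \log p_{(k)}}{\partial \theta^\beta}\,g^{j\alpha}\frac{\partial \log p_{(k)}}{\partial \theta^\alpha} = g^{\alpha\beta}\,\frac{\partial \log p_{(k)}}{\partial \theta^\beta}\,\frac{\partial \log p_{(k)}}{\partial \theta^\alpha}$$
and taking expectations, and using matrix notation,
$$E(\|C\|^2) = E(C'G^{-1}C) = \mathrm{tr}\bigl(E(C'G^{-1}C)\bigr) = E\bigl(\mathrm{tr}(C'G^{-1}C)\bigr) = E\bigl(\mathrm{tr}(G^{-1}CC')\bigr) = \mathrm{tr}\bigl(G^{-1}E(CC')\bigr) = k\,\mathrm{tr}(G^{-1}G) = k\,\mathrm{tr}\,I = kn$$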

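we also have
$$|E(\langle A, C\rangle)| \le E(|\langle A, C\rangle|),$$
therefore
$$|E(\langle A, C\rangle)| \le \sqrt{E(\|A\|^2)}\,\sqrt{kn}$$
but $\|A\|^2 = \rho^2(p, U_k)$, where $\rho$ is the Riemannian distance, also called in this case the Rao distance. Then
$$E\bigl(\rho^2(p, U_k)\bigr) \ge \frac{E(\langle A, C\rangle)^2}{kn}.$$
On the other hand
$$\langle A, C\rangle = g_{\alpha\beta}\,A^\alpha C^\beta = g_{\alpha\beta}\,A^\alpha g^{\beta\gamma}\,\frac{\partial \log p_{(k)}}{\partial \theta^\gamma} = A^\alpha\,\frac{\partial \log p_{(k)}}{\partial \theta^\alpha}$$
thus,
$$E(\langle A, C\rangle) = \int_{X^k} A^\alpha\,\frac{\partial \log p_{(k)}}{\partial \theta^\alpha}\,p_{(k)}\,d\mu_k = \int_{X^k} A^\alpha\,\frac{\partial p_{(k)}}{\partial \theta^\alpha}\,d\mu_k.$$
Notice that $A^\alpha$ is a function of $x$ which is independent of the coordinate system: when $x$ is fixed it is a scalar function on the manifold. Additionally, since $B$ is the bias tensor field corresponding to $A$, we have
$$\int_{X^k} A^\alpha\,p_{(k)}\,d\mu_k = B^\alpha, \qquad \alpha = 1, \dots, n$$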


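taking partial derivatives,
$$\int_{X^k} \frac{\partial A^\alpha}{\partial \theta^i}\,p_{(k)}\,d\mu_k + \int_{X^k} A^\alpha\,\frac{\partial p_{(k)}}{\partial \theta^i}\,d\mu_k = \frac{\partial B^\alpha}{\partial \theta^i}.$$
We may observe that $A^\alpha\,\partial p_{(k)}/\partial \theta^i$ are the components of a mixed second order tensor, while $\partial A^\alpha/\partial \theta^i$ and $\partial B^\alpha/\partial \theta^i$ are not the components of a tensor. Also, we have
$$\int_{X^k} \Gamma^\alpha_{ij}(\theta)\,A^j(x; \theta)\,p_{(k)}(x; \theta)\,d\mu_k(x) = \Gamma^\alpha_{ij}(\theta)\int_{X^k} A^j(x; \theta)\,p_{(k)}(x; \theta)\,d\mu_k(x)$$
where $\Gamma^\alpha_{ij}$ are the Christoffel symbols of the second kind. Therefore
$$\int_{X^k} \Bigl\{\frac{\partial A^\alpha}{\partial \theta^i} + \Gamma^\alpha_{ij}A^j\Bigr\}\,p_{(k)}\,d\mu_k + \int_{X^k} A^\alpha\,\frac{\partial p_{(k)}}{\partial \theta^i}\,d\mu_k = \frac{\partial B^\alpha}{\partial \theta^i} + \Gamma^\alpha_{ij}B^j$$
but
$$A^\alpha_{,i} = \frac{\partial A^\alpha}{\partial \theta^i} + \Gamma^\alpha_{ij}A^j \qquad \text{and} \qquad B^\alpha_{,i} = \frac{\partial B^\alpha}{\partial \theta^i} + \Gamma^\alpha_{ij}B^j$$
are the components of mixed second order tensor fields, $A^\alpha_{,i}$ and $B^\alpha_{,i}$ respectively, which are, classically, called the covariant derivatives of the tensors $A^\alpha$ and $B^\alpha$. Notice the tensorial character of the last equation. If we carry out an index contraction we shall obtain a scalar equation:
$$\int_{X^k} A^\alpha_{,\alpha}\,p_{(k)}\,d\mu_k + \int_{X^k} A^\alpha\,\frac{\partial p_{(k)}}{\partial \theta^\alpha}\,d\mu_k = B^\alpha_{,\alpha}$$
or equivalently, since $A^\alpha_{,\alpha} = \mathrm{div}(A)$ and $B^\alpha_{,\alpha} = \mathrm{div}(B)$,
$$E(\mathrm{div}(A)) + \int_{X^k} A^\alpha\,\frac{\partial p_{(k)}}{\partial \theta^\alpha}\,d\mu_k = \mathrm{div}(B)$$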

which is invariant with respect to coordinate changes. That is, both integrands depend on $x$, but are independent of the coordinate system. Therefore 1 follows.

Fixing $x$, we are going to choose a convenient coordinate system. Given $p$ and $U_k(x)$, we choose a geodesic spherical coordinate system with origin $U_k(x)$, i.e. a system $(\rho, u)$ as discussed in Section 2, since $D_{U_k(x)} = M$ almost surely. It is clear that the components of the tensor $A$ are $(-\rho, 0, 0, \dots, 0)$ when $\rho$, the Riemannian distance between $p$ and $U_k(x)$, is the first coordinate. Additionally,
$$\frac{\partial A^\alpha}{\partial \theta^\alpha} = -1 \qquad \text{and} \qquad \Gamma^\alpha_{\alpha j}A^j = -\rho\,\Gamma^\alpha_{\alpha 1} = -\rho\,\frac{\partial \log\sqrt{g}}{\partial \rho}$$
where $g$ is the determinant of the metric tensor. Then
$$\int_{X^k} A^\alpha\,\frac{\partial p_{(k)}}{\partial \theta^\alpha}\,d\mu_k = \mathrm{div}(B) + \int_{X^k} \Bigl\{1 + \rho\,\frac{\partial \log\sqrt{g}}{\partial \rho}\Bigr\}\,p_{(k)}\,d\mu_k.$$
Now we consider several cases.

Case 4.1.1. Sectional curvature equal to zero. In this case, $\sqrt{g} = \rho^{n-1}s(\xi)$, see Chavel [8, pp. 38-39, 66-67]. Therefore:
$$\log\sqrt{g} = (n-1)\log\rho + \log s(\xi), \qquad \frac{\partial \log\sqrt{g}}{\partial \rho} = \frac{n-1}{\rho}$$
resulting in
$$\int_{X^k} A^\alpha\,\frac{\partial p_{(k)}}{\partial \theta^\alpha}\,d\mu_k = \mathrm{div}(B) + n.$$
Then we have, for a sample of size $k$,
$$E\bigl(\rho^2(p, U_k)\bigr) \ge \frac{(\mathrm{div}(B) + n)^2}{kn}$$
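The quantity $(\mathrm{div}(B) + n)^2 / (kn)$ can be compared with a simulated mean square Rao distance in the simplest flat model. The sketch below is a hedged illustration, not part of the original argument: it assumes the univariate normal family with known standard deviation $\sigma$ and unknown mean $\mu$, for which $n = 1$, the information metric is $1/\sigma^2$, the Rao distance between $\mu_1$ and $\mu_2$ is $|\mu_1 - \mu_2|/\sigma$, and the sample mean has zero bias tensor, so that $\mathrm{div}(B) = 0$. The function name, seed and sample sizes are our own choices.

```python
import random

def mean_square_rao_distance(mu, sigma, k, replicates, seed=0):
    """Monte Carlo estimate of E[rho^2(mu, sample mean)] for N(mu, sigma^2), sigma known.

    In this flat one-parameter model the Rao distance is |a - b| / sigma.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replicates):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(k)) / k
        total += ((xbar - mu) / sigma) ** 2
    return total / replicates

k, n = 10, 1          # sample size and manifold dimension
div_B = 0.0           # the sample mean is unbiased in this model
bound = (div_B + n) ** 2 / (k * n)
estimate = mean_square_rao_distance(mu=2.0, sigma=3.0, k=k, replicates=20000)

if __name__ == "__main__":
    print("bound (div(B) + n)^2 / (k n):", bound)
    print("simulated mean square Rao distance:", estimate)
```

In this model the simulated value agrees with the bound up to Monte Carlo error, which is what one expects when the estimator attains it.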

.. obtained from a sample of size $n$ given by $\bar X_n$, the ordinary sample mean. The corresponding maximum-likelihood estimator for $\theta$ is given by $2\sqrt{\bar X_n}$. Since the metric tensor is constant under the coordinate system given by $\theta$, the bias tensor, if we let $S = n\bar X_n$, is given by
$$B^1(\theta) = E\bigl(2\sqrt{S/n} - \theta\bigr) = 2\sum_{k=0}^\infty \sqrt{k/n}\;e^{-n\lambda}\,\frac{(n\lambda)^k}{k!} - \theta$$
which is clearly biased. Moreover, since the equation
$$E(f(S)) = \sum_{k=0}^\infty f(k)\,e^{-n\lambda}\,\frac{(n\lambda)^k}{k!} = 2\sqrt{\lambda},$$
equivalent to
$$\sum_{k=0}^\infty f(k)\,\frac{(n\lambda)^k}{k!} = 2\sqrt{\lambda}\,e^{n\lambda},$$
where $f$ is an arbitrary function, has no solution because $\sqrt{z}\,e^z$ is not an analytic function, we conclude that for the univariate Poisson distribution there does not exist an intrinsically unbiased estimator based on the sufficient statistic $S$.

Example 5.3. The Bernoulli distribution. Let us consider the Bernoulli density function parametrized as
$$p(x; p) = p^x(1 - p)^{1-x}, \qquad p \in [0, 1], \quad x \in \{0, 1\}.$$
The metric tensor component is given by $g_{11}(p) = 1/(p(1 - p))$. If we let $\theta = 2\arcsin\sqrt{p}$, the new tensor component will become $\tilde g_{11}(\theta) = 1$. Let us now consider the maximum-likelihood estimator for the parameter $p$ computed from a sample of size $n$ given by $\bar X_n$, the ordinary sample mean. The corresponding maximum-likelihood estimator for $\theta$ is given by $2\arcsin\sqrt{\bar X_n}$. Since the metric tensor is constant under the coordinate system given by $\theta$, the bias tensor, if we let $S = n\bar X_n$, is given by
$$B^1(\theta) = E\bigl(2\arcsin\sqrt{S/n} - \theta\bigr) = 2\sum_{k=0}^n \arcsin\sqrt{k/n}\,\binom{n}{k}\,p^k(1 - p)^{n-k} - \theta$$

which is biased, and it is clearly impossible to correct its bias.

6. Discussion. It is pointed out that the classical bias and mean square error measures are not intrinsic quantities and therefore, in this sense, meaningless. Therefore, the defined bias measure and the mean square of the Rao distance allow us to investigate the estimator properties in a more objective way. Unfortunately, in many common and simple cases intrinsically unbiased estimators do not exist, although it is possible to correct the bias locally, obtaining a new estimator with its corresponding bias tensor null at one fixed point, $p_0$, provided that the bias tensor field is uniquely defined at $p_0$. Observe that, for a fixed sample size $k$, in order to correct the bias of an estimator $U_k(x)$ at a fixed point $p_0$, it is enough to define a modified estimator built from $U_k(x)$ and $B_{p_0}$, where $B_{p_0}$ is the bias tensor corresponding to $U_k(x)$ at $p_0$. This could be used in hypothesis testing theory, when the null hypothesis is simple, correcting the estimator bias under the null hypothesis, and allowing the construction of tests which would be invariant under reparametrizations.

It is also possible to give an average measure of the bias, like the integral of the square of the norm of the bias tensor over the manifold, as a scalar bias measure:
$$B = \int_M \|B_p\|_p^2\,dV_R(p)$$
where $dV_R$ is the Riemannian measure over the manifold. Notice that this is a definition independent of the coordinate system, and with possible Bayesian interpretations.

Additionally, in Theorem 4.1, we could obtain some approximate results, possibly useful in large sample theory, since it is well known, see Chavel [8, p. 317], that the Riemannian volume may be expressed as
$$dV_R(\exp_p(\rho\,\xi)) = \rho^{n-1}\Bigl(1 - \frac{\rho^2}{6}\,\mathrm{Ric}(\xi, \xi) + O(\rho^3)\Bigr)\,d\rho\,d\mu_p(\xi)$$
where $\xi$ is in the unit sphere of $M_p$, $\mathfrak{S}_p$, and $d\mu_p$ denotes the $(n-1)$-dimensional volume element on $\mathfrak{S}_p$. Therefore, it is easily bounded through the diagonalization of the quadratic form corresponding to the Ricci tensor. These results point out the role of the curvature in point estimation theory.

Another possibility is to investigate the properties of the defined exponential families of probability distributions over a manifold (Section 2), when we consider as a manifold the statistical model itself, and as a measurable map, a true probability measure estimator. Moreover, it will be interesting to characterize the parametric families which allow an estimator attaining the intrinsic lower bound for the mean square of the Rao distance. Also, it is possible to define, in the context of Section 2, the conditional expectation concept with respect to a subalgebra, and to extend all these notions to infinite dimensional manifolds.

Acknowledgements. I should like to thank the participants of the last SIDGCA project workshop, held at Universite Paul Sabatier, Toulouse, October 1990, for their comments and suggestions. The work is partially supported by CGYCIT grant (Spain), PB-200.

REFERENCES

[1] S. Amari, Differential-Geometrical Methods in Statistics, Lect. Notes Stat., Springer, New York, 1985.
[2] C. Atkinson and A. F. S. Mitchell, Rao's distance measure, Sankhya, 43 A (1981), pp. 345-365.
[3] O. E. Barndorff-Nielsen, Differential geometry in statistical inference, Lect. Notes, Inst. Math. Stat., Hayward, California, 10 (1987), pp. 95-161.
[4] O. E. Barndorff-Nielsen and P. Blaesild, Strings: mathematical theory and statistical examples, in: Proc. London Roy. Soc., 411 A (1987), pp. 155-176.
[5] J. Burbea, Informative geometry of probability spaces, Expos. Math., 4 (1986), pp. 347-378.
[6] J. Burbea and J. M. Oller, The information metric for univariate linear elliptic models, Stat. & Decis., 6 (1988), pp. 209-221.
[7] J. Burbea and C. R. Rao, Entropy differential metric, distance and divergence measures in probability spaces: a unified approach, J. Mult. Anal., 12 (1982), pp. 575-596.
[8] I. Chavel, Eigenvalues in Riemannian Geometry. Pure and applied mathematics, Acad. Press, Hayward, California, 1984.
[9] R. A. Fisher, On the mathematical foundations of theoretical statistics, Phil. Trans. Roy. Soc., 222 A (1922), pp. 309-368.
[10] R. A. Fisher, Theory of statistical estimation, in: Proc. Camb. Phil. Soc., 22 (1925), pp. 700-725.
[11] N. J. Hicks, Notes on Differential Geometry, Van Nostrand, London, 1965.
[12] P. E. Jupp and K. V. Mardia, A unified view of the theory of directional statistics, 1975-1988, Int. Stat. Rev., 57 (1989), pp. 261-294.
[13] M. Karcher, Riemannian center of mass and mollifier smoothing, Comm. Pure Appl. Math., 30 (1977), pp. 509-541.
[14] D. Kelker, Distribution theory of spherical distributions and a location-scale parameter generalization, Sankhya, 32 A (1970), pp. 419-430.
[15] W. S. Kendall, Probability, convexity and harmonic maps with small image, in: Proc. London Math. Soc., 61 (1990), pp. 371-406.
[16] S. Kobayashi and K. Nomizu, Foundations of Differential Geometry, Wiley, New York, 1969.
[17] J. T. Kent, K. V. Mardia and J. M. Bibby, Multivariate Analysis, Acad. Press, London, 1979.
[18] A. F. S. Mitchell and W. J. Krzanowski, The Mahalanobis distance and elliptic distributions, Biometrika, 72 (1985), pp. 467-467.
[19] R. J. Muirhead, Aspects of Multivariate Statistical Theory, Wiley, New York, 1982.
[20] J. M. Oller, Information metric for extreme value and logistic probability distributions, Sankhya, 49 A (1987), pp. 17-23.
[21] J. M. Oller, Statistical data analysis and inference, Elsevier Sci. Publ. B. V., North Holland, Amsterdam, 1989, pp. 41-58.
[22] J. M. Oller and C. M. Cuadras, Rao's distance for multinomial negative distributions, Sankhya, 47 A (1985), pp. 75-83.
[23] R. J. Serfling, Approximation Theorems in Mathematical Statistics, Wiley, New York, 1980.
[24] M. Spivak, A Comprehensive Introduction to Differential Geometry, Publish or Perish, Berkeley, 1979.

DEPT. STAT., BARCELONA UNIV., 08028 SPAIN

Jerzy Pusz

CHARACTERIZATION OF EXPONENTIAL DISTRIBUTIONS BY CONDITIONAL MOMENTS

1. Introduction. Let $\xi$, $\eta$ be independent random variables with finite second moments and suppose that the following conditions hold
$$E(\xi \mid \xi + \eta) = h_1(\xi + \eta), \tag{1}$$
$$\mathrm{Var}(\xi \mid \xi + \eta) = h_2(\xi + \eta) \tag{2}$$
for some functions $h_1$, $h_2$. Let $h_1$ be a linear function. It is a well-known fact that if $h_2$ is nonrandom, then $\xi$, $\eta$ have normal distributions (see Theorem 5.7.1 in [5]), if $h_2$ is a linear function, then $\xi$, $\eta$ are of Poisson type (see Lemma 2.1 in [1]) and if $h_2$ is a quadratic function, then $\xi$ and $\eta$ are of gamma type (see Lemma 2.3 in [3]). In this paper we characterize exponential distributions in terms of conditional expectations and variances. We consider the case when $h_1$ is nonlinear and $h_2$ is nonrandom.

Throughout the paper the random variables are real and are defined on a probability space $(\Omega, \mathcal{F}, P)$. All the equations between random variables are $P$-a.s. equations.

2. Main Result. Our main result is contained in the following

Theorem. Let $X_1$, $X_2$, $X_3$ be independent, square integrable, identically distributed, nonnegative and continuous random variables such that

(3)

for $k = 1, 2, 3$. Let
$$E(U \mid W) = \begin{cases} W + \dfrac{a}{(a+b)b} & \text{if } W \le 0, \\[2mm] \dfrac{a}{(a+b)b} & \text{if } W > 0, \end{cases} \tag{4}$$
and
$$\mathrm{Var}(U \mid W) = \frac{a^2 + 2ab + 2b^2}{(a+b)^2 b^2} \tag{5}$$
where

(6)

for some positive numbers $a$, $b$. The random variables $X_1$, $X_2$, $X_3$ have exponential distributions if and only if the conditions (4)-(5) hold.


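Proof (Sufficiency). Assume that the conditions (4), (5) are satisfied. Then, for $W \le 0$,
$$E(U^2 \mid W) = W^2 + \frac{2a}{(a+b)b}\,W + \frac{2(a^2 + ab + b^2)}{(a+b)^2 b^2} \tag{7}$$
and for $W > 0$
$$E(U^2 \mid W) = \frac{2(a^2 + ab + b^2)}{(a+b)^2 b^2}. \tag{8}$$
Let $f_W(w)$ and $f(u \mid w)$ be densities of $W$ and the conditional distribution of $U$ with respect to $W$, respectively, and
$$\varphi(t_1, t_2) = E\bigl(e^{it_1 U + it_2 W}\bigr), \qquad t_1, t_2 \in R \tag{9}$$
be the characteristic function of the random pair $(U, W)$. From (9) we get
$$\frac{\partial \varphi}{\partial t_1} = i\int_{-\infty}^{0} e^{it_2 w} f_W(w)\Bigl(\int_{-\infty}^{+\infty} u\,e^{it_1 u} f(u \mid w)\,du\Bigr)dw + i\int_{0}^{\infty} e^{it_2 w} f_W(w)\Bigl(\int_{-\infty}^{+\infty} u\,e^{it_1 u} f(u \mid w)\,du\Bigr)dw \tag{10}$$
and with the help of (4)
$$\frac{\partial \varphi}{\partial t_1}\Big|_{t_1 = 0} = i\Bigl(\frac{a}{(a+b)b}\,\varphi(0, t_2) + H(t_2)\Bigr), \tag{11}$$
where
$$H(t_2) = \int_{-\infty}^{0} w\,e^{it_2 w} f_W(w)\,dw. \tag{12}$$
Differentiating both sides of (10) with respect to $t_1$ we have
$$\frac{\partial^2 \varphi}{\partial t_1^2} = -\int_{-\infty}^{0} e^{it_2 w} f_W(w)\Bigl(\int_{-\infty}^{+\infty} u^2 e^{it_1 u} f(u \mid w)\,du\Bigr)dw - \int_{0}^{\infty} e^{it_2 w} f_W(w)\Bigl(\int_{-\infty}^{+\infty} u^2 e^{it_1 u} f(u \mid w)\,du\Bigr)dw.$$
Hence and from (7) and (8) it follows that
$$\frac{\partial^2 \varphi}{\partial t_1^2}\Big|_{t_1 = 0} = -\int_{-\infty}^{0} w^2 e^{it_2 w} f_W(w)\,dw - \frac{2a}{(a+b)b}\int_{-\infty}^{0} w\,e^{it_2 w} f_W(w)\,dw - \frac{2(a^2 + ab + b^2)}{(a+b)^2 b^2}\int_{-\infty}^{+\infty} e^{it_2 w} f_W(w)\,dw.$$
Taking into account (12) we get
$$H'(t_2) = i\int_{-\infty}^{0} w^2 e^{it_2 w} f_W(w)\,dw.$$
Hence
$$\frac{\partial^2 \varphi}{\partial t_1^2}\Big|_{t_1 = 0} = iH'(t_2) - \frac{2a}{(a+b)b}\,H(t_2) - \frac{2(a^2 + ab + b^2)}{(a+b)^2 b^2}\,\varphi(0, t_2). \tag{13}$$
The substitution of $H$ from (11) into (13) leads to the differential equation for the characteristic function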

(14) Now let

(15)

k=I,2,3

be the characteristic function of Xl, X 2 and X!. From (6)

0

p(u, w)

=

w

Ae-bue(a+b)w {

Ae(a+b)ue-bw

0

If u < 0

and u and u < w,

where A = ab'(a + 2b)-1. The marginal distribution of W is the sum of two independent, exponential distributions concentrated on -00, 0, 0, +00, respectively, and the density of W is ab(a + b)- l e- bW fw () w ={ ab(a + b)-le aw if w < O.

If f( u Iw) is the density of the conditional distribution U with respect to W, then for w

(23) and for w

(24)

={

f(ulw)

Be-bu Be(aH)u

0

if u 0, if u < 0,

O: inf supmax{I>.(t)-tl, If(t)-lg(>.(t))l} :Sc}. ).EA

t

$(D_G, d)$ is a separable complete metric space. For further information concerning the space $D_G$ as well as characterizations of the uniform tightness of families of probability measures on $D_G$ the reader is referred to [1] (the arguments used there apply to our situation almost without change).

2. Results. Let $\mathcal{M}^1(G)$ be the set of all Borel probability measures on $G$. $\mathcal{M}^1(G)$, furnished with the convolution product and the weak topology, is a topological semigroup with identity $\varepsilon_e$, where $\varepsilon_x$ is the probability measure degenerated at the point $x \in G$. $\mu^{*n}$ denotes the $n$-fold convolution product of a measure $\mu \in \mathcal{M}^1(G)$, $\mu^{*0} := \varepsilon_e$. For any random element $X$ with values in $G$ (or in $D_G$), $\mathcal{L}(X)$ denotes its probability distribution.

We suppose that for some $\nu, \mu \in \mathcal{M}^1(G)$ and any $n \ge 1$ there exist $\sigma_n \in \mathrm{Aut}(G)$ such that
$$\sigma_n(\nu^{*n}) \Rightarrow \mu, \qquad n \to \infty. \tag{1}$$
It is known that the measure $\mu$ must be stable ([5]). The probability measure $\mu$ is embeddable in some continuous convolution semigroup (c.c.s.) $S = (\mu_t, t \ge 0)$, i.e.
1) $\mu_t * \mu_s = \mu_{t+s}$, $t, s \ge 0$;
2) $\mu_t \Rightarrow \varepsilon_e$, $t \to 0$;
3) $\mu_1 = \mu$.
Moreover there exists a continuous contracting one-parameter group $T = (\tau_t, t > 0)$ of automorphisms of the group $G$:
1) $\tau_t\tau_s = \tau_{ts}$, $t, s > 0$;
2) $\tau_t(x) \to e$, $t \to 0$, $x \in G$;
3) $t \to \tau_t(x)$, $x \in G$, is a continuous mapping;
4) $\tau_t(\mu_s) = \mu_{ts}$, $t > 0$, $s \ge 0$;
i.e. the c.c.s. $S$ is $T$-stable ([2]). Then there exists a stochastic process $Z = (Z(t), t \ge 0)$ with values in $G$ such that
1) $\mu_t$ is the probability law of $Z(t)$, $t \ge 0$;
2) the left increments of $Z$ are homogeneous, i.e. for any $s < t$ the probability law of $Z(s)^{-1}Z(t)$ is $\mu_{t-s}$;
3) $Z$ has independent left increments, i.e. for any $t_1 < \dots < t_n < \infty$ the random elements $Z(t_1)$, $Z(t_1)^{-1}Z(t_2)$, ..., $Z(t_{n-1})^{-1}Z(t_n)$ are independent;
4) all sample paths of the process $Z$ are right-continuous with finite left-hand limits.

Consider the sequence of independent random elements $X_1, X_2, \dots$ with values in $G$ and having common probability law $\nu$. Define random processes $Z_n$:
$$Z_n(t) = \sigma_n(X_1 \cdots X_{[nt]}) \tag{2}$$
where $t \in [0, 1]$ and $\sigma_n \in \mathrm{Aut}(G)$ are the same as in (1). It is evident that $Z_n$ are random elements with values in $D_G$. We want to prove that $\mathcal{L}(Z_n) \Rightarrow \mathcal{L}(Z)$ in $D_G$.
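To make the construction of $Z_n$ concrete, the following sketch builds such a rescaled random walk on one specific group. Everything particular to it is our assumption and not taken from the paper: the group is the three-dimensional Heisenberg group in exponential coordinates, the automorphisms $\sigma_n$ are the dilations $(x, y, z) \mapsto (n^{-1/2}x, n^{-1/2}y, n^{-1}z)$, and the steps are Gaussian. The sketch only illustrates the algebra, namely that the left increment $Z_n(s)^{-1}Z_n(t)$ depends on the steps $X_{[ns]+1}, \dots, X_{[nt]}$ alone.

```python
import math
import random

IDENTITY = (0.0, 0.0, 0.0)

def mul(g, h):
    """Group law of the Heisenberg group in exponential coordinates."""
    x, y, z = g
    a, b, c = h
    return (x + a, y + b, z + c + 0.5 * (x * b - y * a))

def inverse(g):
    x, y, z = g
    return (-x, -y, -z)

def dilate(t, g):
    """Dilation (x, y, z) -> (t x, t y, t^2 z); a group automorphism for t > 0."""
    x, y, z = g
    return (t * x, t * y, t * t * z)

def product(elements):
    """Ordered product X_1 X_2 ... of a list of group elements."""
    result = IDENTITY
    for g in elements:
        result = mul(result, g)
    return result

def z_n(steps, n, t):
    """Z_n(t) = sigma_n(X_1 ... X_[nt]) with sigma_n the dilation by n**-0.5."""
    return dilate(n ** -0.5, product(steps[: int(math.floor(n * t))]))

rng = random.Random(1)
n = 100
steps = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

# Left increment between s = 0.25 and t = 0.75, computed in two ways.
increment = mul(inverse(z_n(steps, n, 0.25)), z_n(steps, n, 0.75))
expected = dilate(n ** -0.5, product(steps[25:75]))

if __name__ == "__main__":
    print("Z_n(1):", z_n(steps, n, 1.0))
    print("left increment:", increment)
    print("rescaled product of the middle steps:", expected)
```

Because each dilation is an automorphism, the two computations of the increment agree up to rounding, which is the discrete counterpart of the independent left increments required of the limit process.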

Theorem 1. Let $Z_n$ be a sequence of random processes with independent left increments and sample paths belonging to $D_G$, and let the multidimensional probability laws of $Z_n$ converge to the corresponding multidimensional laws of some stochastically continuous random process $Z$. If for any $\varepsilon > 0$

(3)

then $\mathcal{L}(Z_n) \Rightarrow \mathcal{L}(Z)$ in $D_G$.

The arguments used in [1], Theorem 15.4, or in [6], §40, Theorem 1, apply to our case almost without change. Using Theorem 1 we will prove the following

Theorem 2. Let $Z_n$ be the random processes defined by (2), let convergence (1) hold and let $Z$ be the random process corresponding to the c.c.s. $S$. Then $\mathcal{L}(Z_n) \Rightarrow \mathcal{L}(Z)$ in $D_G$.

Remarks. 1. Analogously one can prove that the probability distributions of the random processes $Z_n$ defined by (2) for $t \in [0, T]$ converge weakly to the probability distribution of $Z$ in $D_G[0, T]$ for any $T > 0$.
2. In paper [4] it was shown that in our case the convergence of $\mathcal{L}(Z_n)$ in $D_G[0, T]$ for all $T > 0$ yields the convergence of $\mathcal{L}(Z_n)$ in $D_G[0, \infty)$. So we have the following.

Theorem 3. Let $Z_n$ be the random processes defined by (2) for all $t > 0$, let convergence (1) hold and let $Z$ be the random process corresponding to the c.c.s. $S$. Then $\mathcal{L}(Z_n) \Rightarrow \mathcal{L}(Z)$ in $D_G[0, \infty)$.

3. Proof of Theorem 2. First we will prove the convergence of the multidimensional probability laws. In paper [5] it was proved that $\mathcal{L}(Z_n(t)) \Rightarrow \mathcal{L}(Z(t))$ for any $t > 0$ as $n \to \infty$. The convergence
$$\mathcal{L}(Z_n(t_1), Z_n(t_2), \dots, Z_n(t_m)) \Rightarrow \mathcal{L}(Z(t_1), Z(t_2), \dots, Z(t_m))$$
is equivalent to the convergence
$$\mathcal{L}(Z_n(t_1), Z_n(t_1)^{-1}Z_n(t_2), \dots, Z_n(t_{m-1})^{-1}Z_n(t_m)) \Rightarrow \mathcal{L}(Z(t_1), Z(t_1)^{-1}Z(t_2), \dots, Z(t_{m-1})^{-1}Z(t_m)).$$
But this convergence follows from the above because the random elements $X_k$ are independent and identically distributed and the process $Z$ has homogeneous and independent left increments.

To finish the proof we have to verify condition (3) from Theorem 1. Denote
$$O(\varepsilon) = \{x \in G : |x| \le \varepsilon\}.$$
In view of the identical distribution of the random elements $X_k$, condition (3) is equivalent to

(3 ')

sup P {Zn(t)

lim lim

6-+0 n-+oo 0 0 and

C2

> 0 &uch

cdn ::; P {O"n (Xl) ¢. O(e)} ::; cdn. Proof. Let £(G) be Lie algebra of group G, exp: £( G) f--+ G the exponential mapping. In our case it is one-to-one and infinitely differential mapping £( G) on G. It is known that O"n (1/ on) converges if and only if ;;n on) converges, where ([3]).

;;n : = exp-l OO"n 0 exp

E Aut

(£( G)),

/.I 0

exp E pI (£(G)).

It follows that

where [O(eW := exp-I(O(e)). In linear space 'c(G) the convergence of ';'n on) yields the convergence n;;n L, where L is the Levy measure of limit law. So we have the conclusion. Now take a sequence of positive numbers bn such that bn -+ 0, nbn/n"f -+ 0 for some 1E(0,1) and P{O"n(Xd¢.O(ej[nbn])}::;c/n"f, as n-+oo.

'*

It is possible because Lemma 1 holds. Let t ::; Then

s;

P {\Zn(t)\ >

e} = P {\O"n(X I

...

X[nt])\ > e}

[nlJ

::; P

{l: IO"n(X

k

)/ 2

e}

k=l

s P {[non]

sup k$[n6 n J

100n(Xk)\ > e}

= 1 - (p {IO"n(Xdl ::; e/[nbn]})[n6 n] = 1- (1 - P {O"n(Xd ¢. 0(e/[nOn])})[n6n1 ::; 1 - (1 - c/n"f)[n6 n J

-+

0,

as

n

-> 00,


due to the choice of bn • Not let bn < t ::; b. Then p {IZn(t)

Since [ntJ ;::: [nbnJ

> e} = P {IO"n(X1",X[ntJ)1 > e}

=P

(4) 00

.. ,X[ntj)! >

e}.

it follows that

uniformly in t ::; s as n 00. Hence the family {£(U[ntj(X l . "X[ntj)), n;::: 1, t ::; b} is relatively compact. For t ::; b 0 uniformly in t::; 6, x E O(e), as n 00 and 6 O. Consequently the right hand of equality (4) tends to zero as n 00 and 6 O.

REFERENCES

[1] P. Billingsley, Convergence of Probability Measures, Wiley, New York, 1968.
[2] W. Hazod, Stable probability measures on groups and on vector spaces. A survey, in: Probability Measures on Groups. VIII, Lect. Notes Math., Springer, 1210 (1986), pp. 304-352.
[3] W. Hazod and H. P. Scheffler, The domains of partial attraction of probabilities on groups and on vector spaces, (Submitted to J. Theor. Probab.)
[4] T. Lindvall, Weak convergence of probability measures and random functions in the function space D[0,∞), J. Appl. Prob., 10 (1973), pp. 109-121.
[5] S. Nobel, Limit theorems for probability measures on simply connected nilpotent Lie groups, J. Theor. Probab., 4 (1991), pp. 261-284.
[6] A. V. Skorokhod, Stochastic processes with independent increments, Nauka, Moscow, 1964. (In Russian.)

DEPART. OF APPL. MATH., TVER' STATE UNIV., ZHELIABOV STREET 33, TVER' 170013, Russia.

M. Yu. Svertchkov

ON WIDE-SENSE REGENERATION

This paper presents criteria for wide-sense regeneration (regeneration in the sense of Asmussen [1]), which is more general than regeneration in the traditional sense of Smith [2]. For example, any stationary process is regenerative in the wide sense (to show this it is sufficient to take, as intervals of regeneration, any sequence of i.i.d. positive random variables independent of the process), while processes regenerative in the traditional sense necessarily satisfy a "strong mixing" condition. The criteria given in this paper need not use all the regeneration times and so seem to be more convenient than the initial definition from [1]; moreover our conditions generalize Smith's classical construction using independent tours (see [2]).

Definitions and criteria. Let $(X, \mathfrak{S})$ be a measurable space of functions on $R_+ = [0, \infty)$, with values in a measurable space $(E, \mathcal{E})$; let $\Theta = \{\theta_t, t \in R_+\}$ be the semigroup of shifts, i.e. for any $x = (x_t)_{t \in R_+} \in X$
$$(\theta_s x)_t = x_{t+s}, \qquad t, s \in R_+.$$
We define an "infinite" shift in the following way: for any $x \in X$ and $t \in R_+$ let $(\theta_\infty x)_t = e$, where $e$ is some fixed element from $E$. Let us assume that $(X, \mathfrak{S})$ satisfies a "shift-measurable" condition, i.e. the mapping $(t, x) \to \theta_t x$ from $(R_+ \times X, \mathcal{B}_+ \otimes \mathfrak{S})$ to $(X, \mathfrak{S})$ is measurable (where $\mathcal{B}_+$ is the Borel $\sigma$-algebra on $R_+$).

Remark. This assumption is satisfied for example for the Prokhorov-Stone space $C[0, \infty)$ and the Skorokhod-Stone space $D[0, \infty)$ with Borel $\sigma$-algebras (see [3]). However, it should be noted that for the Kolmogorov space $R^T$ of all functions on $T = R_+$ with product $\sigma$-algebra $\mathcal{B}^{\otimes T}$ this assumption is not valid (see [4]).

Definition. A stochastic process $\xi$ with trajectories in $X$ is regenerative in the wide sense if there exist (perhaps on an extended probability space) a sequence of non-negative increasing almost surely finite random variables $T = (T_n, n \in N_+ = \{0, 1, 2, \dots\})$ and a probability measure $\rho$ on $(X, \mathfrak{S})$ such that $T_n \to \infty$ a.s. and for any $n \in N_+$,

AE6

In this case T is called a sequence of regeneration times. Remark. Suppose ( is regenerative in the wide sense and f is measurable function from (X, 6) to some measurable space. Then .. + is regenerative in the wide sense with the same regeneration times as (. Certainly processes regenerative in the traditional sense do not have this property.

Examples of processes regenerative in the wide sense but not in the traditional sense are

described in [5, 6].


Note that the traditional regenerative process was introduced by Smith because of its simple construction (see [2], p. 12). However, the main results in [2] were proved for wide-sense regenerative processes (which were called equilibrium processes in [2]). The criteria given in this paper generalize Smith's classical construction using independent tours: they use information about the behaviour of the process only at the first two regeneration times and need not use all the regeneration times.

Theorem. A stochastic process $\xi$ with trajectories in X is regenerative in the wide sense if and only if there exists (possibly on an extended probability space) a pair of non-negative almost surely finite random variables $T_1$ and $T_2$ such that:

1°. $T_1 \le T_2$ a.s., $P\{T_2 - T_1 > 0\} > 0$;
2°. $(\theta_{T_1}\xi,\ T_2 - T_1)$ is independent of $T_1$;
3°. $\theta_{T_2}\xi$ is independent of $T_2 - T_1$;
4°. $\theta_{T_1}\xi \stackrel{d}{=} \theta_{T_2}\xi$, where $\stackrel{d}{=}$ means equality in distribution.

Moreover, there exist regeneration times T such that $(T_0, T_1) \stackrel{d}{=} (T_1, T_2)$.

Proof. It is clear that a process regenerative in the wide sense satisfies conditions 1°-4° (take the pair $(T_1, T_2)$ of the theorem to be the regeneration times $(T_0, T_1)$). Let us prove the converse. Let Q be the distribution of the random element $(\theta_{T_1}\xi,\ T_2 - T_1)$ on $(X \times R_+, \mathfrak{S} \otimes \mathcal{B}_+)$. Then if $(\eta, \chi)$ has distribution Q, it follows that:

1) $\theta_\chi \eta$ is independent of $\chi$ (see 3°),
2) $\theta_\chi \eta \stackrel{d}{=} \eta$ (see 4°).

Denote now by Q' the distribution of $\eta$ and put

G(y,u) = sup{t: P(X
' ((3)) -dt , t

where ,\((3) = -F«o)(3 . Hence co

F'(xj 0, (3) = - J{;O) j t"'-1 exp (_t Ci cos '\((3)) (sin /\((3)sin (tx + t'" sin ,\((3)) o

+ cos '\((3) cos (tx

+ t a sin >'((3))) dt.

If (3 = 0, then

j ta-1exp(-t"')costmdt. oc

(11 )

K(o) F'(mjo,O) = - 2-

o

We write the integral in (11) as

(12) It is easy see that T

11 = jt"'-l ex p(-t"') dt (1 + O(m)) o

=


In turn, 12 = a

(m k )

for any k >

°.Thus, from (11) and (12) we obtain

F'(mjO',O)

(13)

=-

K(O') 20'

(1 + O(m)) .

Since F"(mj 0', B) is bounded, it follows from (5), (9) and (13) that

rn

-f Jr

(1) (1+0(1))=13-(1+0(1)), K(O') 0'

2

This equality gives us (2). Let 0' = 1. Taking into account that the family of the symmetric stable distributions is continuous with respect to 0', we get from (9)

(14)

1 m (2) F(mjl,O)=-+-+O m

2

.

Jr

Further, from (1) and (6) we have

F(.T; 1,13)

=

2

Hence

F'(Xj 1,13) =

+

rr

J= e-

I

sin

(t:1 +

o

J

In

rr

:2 10-1In t cos (tx +

t) dtt '

I3t In t) dt

o

or

J cc

F'(mj 1, 0) =

Jr2

e- I In t cos tm dt.

o

Using formula (2.6.40.1) in [2], it is easy to show that

(15)

F'{m; 1,0)

=

--;E + Jr

O(m),

where E is Euler's constant. Then from (5), (14) and (15) we obtain m

(1

+ o(1)) =

E 13 (1

+ 0(1)) ,

It is clear that (3) follows from this. The case $\beta < 0$ is considered similarly. The theorem is proved. □

Proof of Theorem 2. Denote by $F'(x; \alpha, \beta)$ and $F''(x; \alpha, \beta)$ the first and second derivatives with respect to $\alpha$. We can write

where $\alpha \le \theta \le 2$. If $m = m(\alpha, \beta)$ is the median of the distribution $F(x; \alpha, \beta)$, then

(16)

F(mj2,13)+(0'-2)F'(mj2,13)

+


Since F( X; 2, (3) law, we have

= .( a))

.

It is easy to verify that >'(2) = 0, >.'(2) = -if3 . Hence

(18) where

J 00

R1

=

texp(-e) lntsintmdt,

o

J 00

R2 =

texp(-e)costmdt.

o

It is evident that R 1 = O( m). In turn, from the estimation of the integral (12) it follows that R 2 = + O(m). Thus, from (18) we get

t

F'(m; 2,(3) =

(19)

+ O(m).

Since F"(m;B,f3) is bounded, it follows from (16), (17) and (19) that

(1 + 0(1)) = t(a - 2)f3(1 + 0(1)) . The theorem is proved.

0

REFERENCES
[1] B. Rosen, On the asymptotic distribution for sums of independent identically distributed random variables, Ark. Mat., 4 (1961), pp. 323-332.
[2] A. P. Prudnikov, Yu. A. Brychkov and O. I. Marichev, Integrals and Series, Nauka, Moscow, 1981. (In Russian.)

DEPT. OF APPL. MATH., TASHKENT INST. OF MOTOR TRANSPORT ENGIN., K. MARX STREET 42, TASHKENT 7000047, Uzbekistan

Hermann Thorisson

REGENERATION, STATIONARITY AND SIMULATION

Introduction This paper shows how a constructive approach to the existence of a stationary version of a regenerative process yields a partial solution of the so-called initial transient problem for regenerative processes: the simulation problem of generating the stationary version when it is known how to generate the zero-delayed version. After establishing notation in Section 1 we present the construction in Section 2; the full details can be found in [2]. For an application of the construction to general cycle-stationary processes without any independence assumptions (Palm theory), see [3]. In Section 3 we present a simulation algorithm based on this construction and show that it works in the bounded inter-regeneration time case and renders an approximate solution in the general case. In fact, in the general case the total variation distance from stationarity is determined in a simple way by the distribution of the stationary initial delay time. For more on the initial transient problem, see [1].

1. Preliminaries Let Z = (Zs)n..be a path-measurable stochastic process (this is the case for example v=:sS s)ds,

y

O.


Remark 1. Note that C* is the well-known stationary initial delay distribution and C*(c/G*(c» --+ 1 as c --+

00.

Proof. Replacing So 1 by SOl AC in the calculation in the proof of Theorem 2 and leaving

out the final step yields

for all non-negative measurable functions! This is equivalent to (5)

E dx) = «xAc)/mC*(c»

E dx)

and

Due to (4) and (6) we have IIP«ZN, SN) E') 1\ P«Z', S') E·)II =

E')

1\

P(S'l E·)II

while (3) and (5) yield thefirst identity in E .)

A

P(S'l E·)II =

J(xl\c/mC*(c»

=(11m) J(c/C*(c» 1\ x

E

A

(x/m)

dx)

=C*(clC*(c». Finally, since U is independent of both

(zN,

and (Z', S') and since

But;(ZN, SN) is determined measurably by U and (zN, in the same way as

(2*, SO) is determined by U and (Z', S') we have IIP(But;(Z N, SN) E .)

1\

P«Z*, SO) E .)11

IIP«ZN, SN) E .) 1\ P«Z', S1 E ·)11

and the proof is complete.

E dx)


Remark 2. If we let (ZO, SO), (Z*, S*) and (Z', S') be double-ended and (J/(Z, S) denote (Z,S) centred at t , then we have

because then (ZN,

and (Z', Sj can be recapitulated from (JUSN\(ZN, SN) and (Z*, S*),

respectively.

REFERENCES
[1] S. Asmussen, P. Glynn and H. Thorisson, Stationarity detection in the initial transient problem. To appear in ACM Trans. Modeling Comput. Simulation.
[2] H. Thorisson, Construction of a stationary regenerative process. Stoch. Proc. Appl. 42, 237-253, 1992.
[3] H. Thorisson, On time- and cycle-stationarity. Preprint.

SCIENCE INSTITUTE, UNIVERSITY OF ICELAND, DUNHAGA 3, 107 REYKJAVIK, ICELAND

E-mail address: [email protected]

Jacek Wesolowski

MULTIVARIATE INFINITELY DIVISIBLE DISTRIBUTIONS WITH THE GAUSSIAN SECOND ORDER CONDITIONAL STRUCTURE

1. Introduction. Univariate infinitely divisible laws have been widely investigated. However, the number of papers devoted to multivariate infinitely divisible distributions is considerably lower. Those by Dwass and Teicher [4], Horn and Steutel [5] and Veeh [8] are among the most interesting. In this note we observe that a multivariate infinitely divisible distribution with all univariate marginals Gaussian is a Gaussian distribution. This quite simple fact seems to have wide applications. We use it to simplify a characterization of the multivariate Gaussian law by properties of fourth cumulants obtained by Talwalker [7]. The main result is a characterization of the Gaussian distribution among multivariate infinitely divisible laws with the Gaussian second order conditional structure.

2. Univariate Gaussian marginals. The characteristic function of an n-variate square integrable infinitely divisible distribution has the form (1)

O} = 0 does not restrict the generality of proofs. On the


other hand, if $F(x_0) = 1$ for some $x_0$, i.e. $\bar F(x_0) = 0$, then, obviously, (7) turns into a trivial identity for $x \ge x_0$. Therefore we shall assume that equation (7) is valid for all $x \ge 0$. We emphasize that the conditions on the continuity and strict monotonicity of the function F, and also the condition on the moments, were required only in order to obtain the convolution equation (7) for all $x \ge 0$. Denote $w_m = \max\{\cos(2\pi k/m):\ k = 1, \dots, m-1\}$. Obviously,

Wm

0, for x:::: 0 we have:

J 00

(9)

H(x)

=

H(u) q(x - u) du -,-I,(x)H(x).

o

Note, further, that q E LI(-oo,oo) and

J J J (( , 0

00

Q(t) =

exp(itx) q(x) dx =

-00

exp ((it

+ A)X) (_x)m-l dx

-00

00

m

= ,-

exp -A+zt)x)x

o

m-I

mf(m).

m!

dx= , (A +zt,m ') = , (',,+zt ') m .

If $Q(t_0) = 1$ for some $t_0$, then it would mean that $(\lambda + it_0)^m = m!/\gamma$. Consequently,

(10) $$it_0 = -\lambda + \sqrt[m]{m!/\gamma}\,\cos(2\pi k/m) + i\,\sqrt[m]{m!/\gamma}\,\sin(2\pi k/m) \quad \text{for some } k = 0, 1, \dots, m-1.$$


By virtue of (8), for real $t_0$

$$-\lambda + \sqrt[m]{m!/\gamma}\,\cos(2\pi k/m) \neq 0, \qquad k = 0, 1, \dots, m-1,$$

which contradicts relation (10). Thus

$$Q(t) \neq 1 \qquad \forall t \in (-\infty, \infty).$$

To conclude checking the conditions of Theorem 1 with respect to convolution equation (9), it is sufficient to make sure that

(11) For this purpose let us formulate the following lemma.

Lemma (R. Shimizu [7], see also [1 J). Let G( x) be a distribution function with support in [0, CXl) and for some 5> 0

J 00

(12)

1

-CXl and for all Y 2:

o.

If in addition

J 00

S(x) 2:

(14)

S(x + y)dG(y),

Vx 2:

.TO ,

o

then S(x) is bounded. According to (7),

(15)

I>..m_ (1'(x)) - ,F(x) 1+m. I

= (

>..m

_ )1 mI.

Joo u m- 1p(u+x)dl!. o

Define " m -1. By virtue of (8) O. For small e > 0, obviously, for Vx 2: O. Therefore

m.

F(x)

(1 + 1'(x)) S

m.

I

(1 + 5.) F(x)

=

and it follows from (15) that Vx 2: 0

J 00

(16)

F(x)::::

um-1F(u+x)du.

o

F(x)

11'( x) 1S 1o =2-3= -1,

Consequently, for m = 3 condition (20) is already satisfied. Having considered these particular cases of the proof of (20), let us pass to the consideration of an arbitrary m. It is easy to verify that the set

$$\Big\{\lambda i - i\cos(2\pi k/m)\,\sqrt[m]{m!/\gamma} + \sin(2\pi k/m)\,\sqrt[m]{m!/\gamma}:\ k = 0, 1, \dots, m-1\Big\}$$

coincides with the set $\{z_k:\ k = 0, 1, \dots, m-1\}$ of solutions to the equation $\gamma(\lambda + iz)^m = m!$, i.e. with the zero set of the function $1 - Q(z)$. Note also that $\operatorname{Im} z_0 = \lambda - \sqrt[m]{m!/\gamma} < 0$. Since, as we have already noted,

$$w_m = \max\{\cos(2\pi k/m):\ k = 1, \dots, m-1\} < 1,$$

and according to (8) $\lambda > \max(0, w_m)\,\sqrt[m]{m!/\gamma}$, we have $\operatorname{Im} z_k > 0$ also for those k, $1 \le k \le m-1$, for which $\cos(2\pi k/m) > 0$. Consequently, the function

$$1 - Q(z) = \frac{\gamma(\lambda + iz)^m - m!}{\gamma(\lambda + iz)^m}$$

has m - 1 zeros in the upper half-plane, while in the lower one it has a single zero. Since the multiplicity of the unique pole $z = \lambda i$ (in the upper half-plane) is equal to m,

$$(N - P)\big|_{\operatorname{Im} z > 0} = (m - 1) - m = -1, \qquad -(N - P)\big|_{\operatorname{Im} z < 0} = -(1 - 0) = -1.$$

Consequently, (20) holds for arbitrary m. It means that the set of solutions of the homogeneous equation corresponding to (9) has a basis consisting of a single exponential function, and $F_0(x)$ in (17) is determined by formula (18). Thus a successful choice of the kernel q(x) in equation (9) reduces to a minimum the selection of solutions having a probabilistic meaning.
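The zero count used above can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names and the sample values of m, λ and γ are illustrative assumptions, and the sample parameters are chosen so that max(0, w_m)·(m!/γ)^{1/m} < λ < (m!/γ)^{1/m}, which is how condition (8) is read here.

```python
import cmath
import math


def zeros_of_one_minus_q(m, lam, gamma):
    """Roots z_k (k = 0..m-1) of gamma * (lam + i z)^m = m!."""
    r = (math.factorial(m) / gamma) ** (1.0 / m)  # m-th root of m!/gamma
    roots = []
    for k in range(m):
        phi = 2.0 * math.pi * k / m
        # lam + i z = r e^{i phi}  =>  z = -i (r e^{i phi} - lam)
        roots.append(-1j * (r * cmath.exp(1j * phi) - lam))
    return roots


def count_half_planes(roots):
    """Return (number of roots with Im z > 0, number with Im z < 0)."""
    upper = sum(1 for z in roots if z.imag > 0)
    lower = sum(1 for z in roots if z.imag < 0)
    return upper, lower


if __name__ == "__main__":
    for m in (3, 5):
        print(m, count_half_planes(zeros_of_one_minus_q(m, 1.0, 1.0)))
```

For the sample parameters the count is m - 1 roots in the upper half-plane and one root in the lower half-plane, in agreement with the statement above.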

Remark. It is not difficult to show that Theorem 2 holds also for $m = m_1/m_2$, where $m_1$ and $m_2$ are natural numbers, $m_1 \ge m_2$.

REFERENCES
[1] R. Yanushkevichius, Stability of Characterizations of Probability Distributions, Mokslas, Vilnius, 1991. (In Russian.)
[2] C. R. Rao and D. N. Shanbhag, Recent results on characterization of probability distributions: a unified approach through extensions of Deny's theorem, Adv. Appl. Probab., 18 (1986), pp. 660-678.
[3] R. C. Gupta, Relationships between order statistics and record values and some characterization results, J. Appl. Probab., 21 (1984), pp. 425-430.
[4] T. A. Azlarov, Characterization properties of exponential distributions and their stability, in: Limit Theorems, Random Processes, Tashkent, 1979, pp. 103-111. (In Russian.)
[5] H. M. Gu and K. S. Lau, Integrated Cauchy functional equation with an error term and the exponential law, Sankhya, 46 (1984), pp. 339-354.
[6] L. B. Klebanov and O. L. Yanushkevichiene, On the stability of characterization of exponential law, Lith. Math. J., 22 (1982), pp. 103-111.
[7] R. Shimizu, Functional equation with an error term and the stability of some characterizations of the exponential distribution, Ann. Math. Statist., 32 (1980), pp. 1-16.

INSTITUTE OF MATHEMATICS AND INFORMATICS, VILNIUS 232600

AI and II . II respectively.


Let $X, X_1, X_2, \dots$ be a sequence of independent and identically distributed d-variate random vectors. For all k = 1, ..., d the order statistics of $X_1^{(k)}, \dots, X_n^{(k)}$ will be denoted by $X_{1:n}^{(k)} \le \dots \le X_{n:n}^{(k)}$. Furthermore, let $c_{1n}, \dots, c_{nn}$ be non-random d-variate vectors, called weights. Denote

xf

The vector $L_n$ is called a multivariate (more exactly, d-variate) L-estimate. Let $J = (J^{(1)}, \dots, J^{(d)})$, $J^{(k)}: (0,1) \to R$, be such that

(1.1) for some K

0 and all u, v E (0,1), and denote

$$c_{in}(1) := n \int_{(i-1)/n}^{i/n} J(u)\,du.$$

The corresponding (d-variate) L-estimate will be denoted by $L_n(1)$. Denote by $F^{(k)}$ the distribution function of $X^{(k)}$ and by $\mu$ the vector

$$\mu := \Big(\mu^{(k)} := \int_R u\,J^{(k)}\big(F^{(k)}(u)\big)\,dF^{(k)}(u),\ k = 1, \dots, d\Big).$$

Furthermore, let

Y := (y(I:):=

J

J(k) (F(k)( u

- F(k)( u)} du, k = 1, ... ,d)

R

(1(.) denote the indicator function) and

It will always be assumed that

(1.2) $$\langle h, h\Sigma \rangle > 0 \qquad \forall h \neq 0.$$

Then the inverse matrix of $\Sigma$, denoted by $\Sigma^{-1}$, exists and there is a symmetric and positive definite matrix, denoted by $\Sigma^{-1/2}$, such that

Remark. All eigenvalues of $\Sigma$ are positive; denote the least of them by $\lambda$.


Let G be a Gaussian random vector with mean zero and variance I ("I" denotes the identity d X d-matrix). Now we are able to formulate our main results. The following two theorems provide an illustration of the main theorem (Theorem 1.3 below).

Theorem 1.1. Let $\mathcal{B}$ be the class of all Euclidean balls. Then there exists a universal constant $c \ge 0$ such that

(1.3) $$\sup_{B \in \mathcal{B}} \Big|P\big(\sqrt{n}\,(L_n(1) - \mu)\Sigma^{-1/2} \in B\big) - P(G \in B)\Big| \le c\,(K + \|J(0)\|_\infty)^3\,\lambda^{-3/2}\,E\|X\|^3/\sqrt{n},$$

where $\|\cdot\|_\infty := \max_k |\cdot^{(k)}|$.

Theorem 1.2. Let $\mathcal{C}$ be the class of all convex Borel sets. Then there exists a universal constant $c \ge 0$ such that

(1.4)

sup

CEe

Ip(v'n"(L n (l ) -

p)

E

C) -

PG E

C)I

Before formulating our next theorem we need some preliminaries. First of all, let us agree to use the notation $\Delta(x, y; \mathcal{A})$ instead of writing

$$\sup_{A \in \mathcal{A}} |P(x \in A) - P(y \in A)|,$$

where A denotes a subset of the class of all convex Borel sets of Rd. Furthermore, it will be always assumed that the following two conditions hold: a) if A E A then A + u E A and cA E A for all u E R d and c > O. Moreover, for all c > 0, Ae:={u: inf lIu-vll 0 such that

sup P(G E A e \A e )

AEA

adc.

Remark. Clearly, a) holds for $\mathcal{A} = \mathcal{B}$ and $\mathcal{A} = \mathcal{C}$. Furthermore, since $\mathcal{A} \subset \mathcal{C}$, by Corollary 3.2 from Bhattacharya and Ranga Rao [3] condition b) is satisfied for all $\mathcal{A}$ described above. Moreover, that corollary states that there exists a universal constant c > 0 such that

(1.6)

In the case of all balls, $\mathcal{A} = \mathcal{B}$, the bound

(1.7)

holds for some universal constant c > 0 (see Sazonov [14], Sazonov [15]).


Theorem 1.3. Let C1n,... ,Cnn be any weight& satisfying 3B

0:

n

(1.8)

sup k

;=1

Then there ezists a universal con&tant e

0 such that

Remark. In the case d = 1 condition (1.8) was used in Vandemaele and Veraverbeke [17]. In Helmers [9] the condition

(LlO)

max le(k) - e(k)(l)l< B 1 In In In k,..

was imposed. Clearly, if (1.10) holds, then (1.8) holds too, moreover, with B The following theorem presents another illustration of Theorem 1.3.

= Bi-

Theorem 1.4. Let $c_{in}(2) := J(i/(n+1))$, i = 1, ..., n, and denote the corresponding L-estimate by $L_n(2)$. Then there exists a universal constant $c \ge 0$ such that

(1.11) $$\Delta\big(\sqrt{n}\,(L_n(2) - \mu)\Sigma^{-1/2},\ G;\ \mathcal{A}\big) \le c\,(1 + a_d)\,(K + \|J(0)\|_\infty)^3\,\lambda^{-3/2}\,E\|X\|^3/\sqrt{n}.$$
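To make the two weight choices concrete, here is a minimal univariate (d = 1) sketch. Several things in it are assumptions made for illustration only: the weight function J(u) = 6u(1 - u), the helper names, and the normalisation L_n = n^{-1} Σ c_in X_{i:n}, which is the usual form of an L-estimate but is not legible in the text above.

```python
def J(u):
    """Illustrative Lipschitz weight function on (0, 1); integrates to 1."""
    return 6.0 * u * (1.0 - u)


def J_antiderivative(u):
    """Antiderivative of J, used to integrate J exactly over [(i-1)/n, i/n]."""
    return 3.0 * u ** 2 - 2.0 * u ** 3


def weights_1(n):
    """c_in(1) = n * integral of J over [(i-1)/n, i/n], i = 1..n."""
    return [n * (J_antiderivative(i / n) - J_antiderivative((i - 1) / n))
            for i in range(1, n + 1)]


def weights_2(n):
    """c_in(2) = J(i / (n + 1)), i = 1..n."""
    return [J(i / (n + 1)) for i in range(1, n + 1)]


def l_estimate(sample, weights):
    """Assumed form: L_n = (1/n) * sum_i c_in * X_{i:n} (order statistics)."""
    xs = sorted(sample)
    return sum(c * x for c, x in zip(weights, xs)) / len(xs)


if __name__ == "__main__":
    data = [0.3, -1.2, 2.5, 0.0, 0.9, -0.4]
    print(l_estimate(data, weights_1(len(data))))
    print(l_estimate(data, weights_2(len(data))))
```

Since i/(n+1) lies in [(i-1)/n, i/n], the two weight vectors differ by at most K/n when J is Lipschitz with constant K (here K = 6), which is the kind of closeness that a condition on the weights such as (1.10) asks for.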

Remark. In the case of the multivariate sample mean, Theorems 1.1-1.3 coincide with Theorems 2, 1, 3 from Bentkus [1], respectively. Certainly, some generalizations of the results given above could be made. First, other characteristics of the random vector X could be used instead of $\lambda^{-3/2} E\|X\|^3$ in the error bound. Second, the case of a non-Gaussian limiting law could be considered as well (see Paulauskas [12], Bloznelis [4] and the references there concerning the case of the multivariate sample mean with a stable, as well as Gaussian, limiting law). Third, multivariate L-estimates with unbounded functions J could be considered; see Friedrich [6] (univariate case) and Bolthausen and Gotze [5], Gotze [8] (multivariate case) on this research topic. However, we do not treat this case here because even in the univariate case such L-estimates require some additional consideration (for example, what could be said about the error bound in the case $J(u) = \{u(1-u)\}^{-\alpha}$ when $\alpha < 1$ is close to 1?). We are going to study that problem elsewhere.

2. Proof of the main results. Theorems 1.1 and 1.2 are immediate consequences of Theorem 1.3 (hint: use bounds (1.6) and (1.7)). Theorem 1.4 follows from Theorem 1.3 too (hint: recall (1.1) and use (1.10)). So, let us prove Theorem 1.3. However, we will first introduce some notational conventions. We shall use $\tau$ to denote random variables distributed uniformly in (0,1). Thus, for example, the mean value theorem takes the following form: $h(x) - h(y) = E\,h'(y + \tau(x - y))(x - y)$. Furthermore, if V is a random vector, then $V_1, \dots, V_n, \dots$ denote independent copies of it. All random variables to be introduced below will be independent of the other ones. Let us also agree to denote by $\pi_r$ the density function of the random vector rG, r > 0. Finally, below we shall use c to denote universal constants, not always the same from line to line; if it is necessary to distinguish between them, we shall use the notations $c_1, c_2, \dots$ as well.
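The expectation form of the mean value theorem used in these conventions is simply the identity h(x) - h(y) = (x - y) ∫_0^1 h'(y + t(x - y)) dt. A minimal numerical sketch follows; the function name, the midpoint quadrature and the test function are illustrative assumptions.

```python
import math


def mean_value_form(h_prime, x, y, nodes=10_000):
    """Approximate E h'(y + tau (x - y)) (x - y), tau uniform on (0, 1),
    by the midpoint rule with the given number of nodes."""
    total = 0.0
    for k in range(nodes):
        tau = (k + 0.5) / nodes  # midpoint of the k-th subinterval of (0, 1)
        total += h_prime(y + tau * (x - y))
    return total / nodes * (x - y)


if __name__ == "__main__":
    x, y = 1.3, 0.2
    # Compare with h(x) - h(y) for h = sin, h' = cos.
    print(mean_value_form(math.cos, x, y), math.sin(x) - math.sin(y))
```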


Now the proof of the theorem is in order. The decomposition

and Lemma 4.5 b) tell us that (2.1) where

6

n(1)

:d;j 6(y'n(L n(1) _ p)I:- 1 / 2,GjA),

T1(a) :d;j P(y'n II(L n

-

L n(1))I:- 1 / 2 11

a).

Proposition 2.1. The following estimate holds:

Remark. Proofs of all propositions formulated in this section are given in the next one. Let us now consider $\Delta_n(1)$. We shall prove that for some universal constant $c_2 \ge 10$ the bound

(2.2) holds for every n E N (we use the notation

where the constant Cl is defined in Corollary 4.2 below). Clearly, for all n = 1, ... , (C2,8s)2 bound (2.2) holds. So, let n be any natural number such that n > (C2,8S)2 and let us prove (2.2). For this, we shall prove that for all real e > 0 and all natural m :s: [,8sn 1/2J ([xJ denote the integer part of x E R) the estimate

holds, where N := n - [,8sn 1/2J (e and m will be exactly choosen below). We already know that (2.2) holds for n = 1, ... ,100 (since, by Corollary 4.2 below, c2,8S 10). So, let us assume that

(2.4) for all k

= 1, . ..

, n - 1. Choose

e = ../m/n. Since m :s: [,8sn 1/2J and to > 0, then we are able to use (2.3). Applying the inductive hypothesis (2.4), we get

for some universal constant c. Letting C2 = max (10, c2 ) , we arrive at (2.2) immediately. Now, combining the obtained bound (2.2) with Proposition 2.1 and (2.1), we have the theorem completed. Thus our further task is to prove the estimate (2.3). Denote

where

JJ 1

:='.fk

{](k)(p(k)(t)) - ](k)(p(k)(t)

R 0

+

I L U?)(t))} dX{ L U?)(t)} dt I

I

i=l

j=l

U[k)(t) :=

l(xi')$t) -

p(k)(t).

Then Taylor's formula tells us that

(2.5) (Throughout by

Z we

denote

Zis a d-variate vector). Denote

Then by Lemma 4.5 b),

(2.6) where

Proposition 2.2. For all natural N

n

Let us bound (1). For the benefit of the reader, we will first discuss the main points of the further work. So, we write

Note that the random vectors $(Y_{N+1} + \dots + Y_n)/\sqrt{n}$ and $p\{S_N + R_N(N)\}$ are independent. Thus, step by step (see the proof of Proposition 2.4 below), we replace $Y_i$ by $G_i$ for all i = N + 1, ..., n. The just-obtained random variable $qG + p\{S_N + R_N(N)\}$ has the Gaussian part, qG, which enables us to "throw out" $pR_N(N)$ and consider the random vector $pS_N + qG$ only (look at the quantity $T_5$ and the estimation of it below). Since $pS_N + qG$ approaches G in the required fashion (we use Bentkus [1] for help here), we have the desirable result for (1). Let us now prove these remarks exactly. To get started, we formulate the following lemma.

Lemma 2.3 (Senatov [16], see also Bentkus [1], Bloznelis [4]). Let $\varphi \in C^\infty(R, [0,1])$ be any function such that $\varphi(x) = 1$ if $x \le 0$, and $\varphi(x) = 0$ if $x \ge 1$. Define, for every $A \in \mathcal{A}$ and $\varepsilon > 0$,

r(x) := 'f' ( inf lin - xll/ E),

=0

:='f'( inf lin-xii/E).

uEA

uEA.

Then the following statements hold:

r(A)

(a)

=1 = 1

(b) For 9 E

U',

= 0,

r(R \

and and

\ A) = O.

} sup 1Ig'(x)(h)11 :S c l(xEAC\A c >! E,

(2.7)

IIhll9

(2.8)

sup 11g'(x)(h) - g'(y)(h)11 :S c {1(xEAC\A c ) +

IIhll9

l(YEAC\A c )

So, having this lemma and using Lemma 4.5 c), we get that for all

(2.9)

max

AE ..... gE{/C ,/c}

}

E

Ilx - yll/ E.

>0

T3 ,

where Decompose T3 as follows

(2.10) where n

T t := IEg(p{SN + RN(N)} +

L

Yi) - Eg (P{SN + RN(N)} + qG)

i=N+l

Ts : = lEg (P{SN + RN(N)} + qG) - Eg(pSN + qG) - Eg(p{ G 1 + RN(N)} + qG) + Eg(G)I,

I,

if


T 6 := IEg(pSN

+qG)-Eg(G)I,

Now we shall give estimates of these quantities. Proposition 2.4. For all natural m: m:S; n - N

T4 :s; c(K + IIJ(O)lloo)S-\ -S/2E IIXllsn-1/2

x { 1 + (A N(1)c- 1+ ad"'; n1N)(1 Proposition 2.5.

For all natural Nand

+ 10-1"';min + 10"'; nlm) }

M

such that N

:s;

.

n - 1 and

M:S;N-1, Ts

vn }

:s; c(1 + ad ) X { (K + IIJ(O)lloo)S x-s/2E IIXll s+ c X

{K 2E IIXI1 2 I (-\(n - N)(N - M))} 1/2

+ cK 2 E IIXI1 2I (-\(n - N)) + cK 2E IIXI1 2(N - M)I (-\N(n -

N)) .

Proposition 2.6. The following estimate hold«

Proposition 2.7. The following estimates holds

Now combining Propositions 2.2, 2.4-2.7 and estimates (2.6), (2.9), (2.10), and setting $M = N - [\sqrt{n}]$, we arrive at (2.3). This completes the proof of Theorem 1.3. □

Proof of the Propositions, Proof of Proposition 2.1. Since sUPllh1l9 Schwarz inequality and condition (1.8), we get

=

v; and so

T

1 (O' )

for all a:

0'2

(2

sP

+ E V 2 )/ n.

tV? -

E

:=

+ T12 ·

then using the Cauchy-

IIXill"'; B I -\,

2)

Now decompose T 1 (O' ) as follows:

T1 (O' ) :s; Tn Where

-\-1/2,


Using the Chebyshev's inequality, we get

Tn

1 4 s ;;:E V l(v$v'fi)'

Therefore, Choose now clearly

a = ao := (2 + E V 3 )/..rn ;

(2 + E V2)/ n. Then inf {ada

a>O

+ T1(a) }

adaO

+ T1(ao) ,

which, coupled with (3.1), completes the proof. Proof 01 Propo&ition 2.2. Using the Chebyshev's inequality, we have

and so, by Corollary 4.4 a),

If n = N, then $T_2(\beta) \equiv 0$. Letting $\beta \downarrow 0$, we get the proposition proved. If n > N, then there are two possibilities: $\rho := E\|X\|^2 K^2 = 0$ and $\rho > 0$. If $\rho = 0$, then $T_2(\beta) \equiv 0$ and so, letting $\beta \downarrow 0$, we have the proposition proved. In the case $\rho > 0$ choose

Then the inequality completes the proof of the proposition.

Proof of Proposition 2.4. In fact, the proof of the proposition is given in Bentkus [1] (see the proof of bound (9) there). Certainly, some slight modifications should be made. Let us discuss them briefly. Denote K = n - N.

Then

Replace now, step by step, each Zi by Gi, in the quantity


We get the estimate K

L

(3.2)

sup T4k(y ),

k=l yERd

where

v'"k"=l Gd y'ri + y / y'ri) Eg(pM + y + v'"k"=l Gd y'ri + G/ y'ri)I·

T4k(Y) : = lEg (pM + y +

-

In a way quite similar to the proof of estimates (5)-(7) in Bentkus [1], one can obtain the following estimates: (3.3)

(3.5)

T4k(Y)

cE IIYI13{ n- 3 / 2

+

M, G; A)

+ ade/ r)/ (k -

1)3/2} for all k : m

.-3/2 E IIXII! + e y'ri} x

(3.6)

{K 2 E IIXII2/ (>.(n -

+ {K 2E IIXII2(N -

N) (N _ M) } 1/2

M)/ (>.(n - N) N) } 1/2 ,

dxl,


(3.7) Firstly, we shall prove (3.6). Since

= - (x, Y}7r q (x )1 q2 and

pi q = {NI (n - N)}I/2 ,

the estimate

(3.8) holds, where HI := IE{g(pSN

H 2 :=

IE

{q(pSN

+ qG)

+ qG)

- g(pG 1

- g(pG 1

+ qG)} (G,

+ qG)}

vi MIN RM(M)} I·

(G,RN(N) - vlMIN RM(M)}I·

Let us estimate the quantity HI' By Lemma 4.5 b), we have

This estimate, together with the fact VarG

=I

and Corollary 4.4 b), proves

The quantity could be estimated using the following theorem due to V. Bentkus (see [1], Theorem 3). Theorem 3.1. Let $V, V_1, \dots, V_n$ be d-variate random vectors, EV = 0 and Var V = I. Then there exists a universal constant $c \ge 0$ such that

We get

(3.9)

x { K 2E I/Xl12 I (>..N (N - M» } 1/2

.

(Note that we have used Corollary 4.2.) This completes the estimation of $H_1$. The quantity $H_2$ does not exceed


Now applying the fact Var G = I and Corollary 4.4 a) with k = N and I = M, we have (3.10) Bringing estimates (3.1), (3.9), (3.10) together, we arrive at (3.6). To establish (3.7), note that by we get

TS2

= q-2\E(1 -

r )

{g(pSN + rpRN(N) + qG)

- g(pG 1 + rpR N(N)

+ qG) }

(G,pR N(N))2 -llpRN(N)1I 2

X {

}

I,

and so $T_{52} \le (p/q)^2\,E\big|\langle G, R_N(N)\rangle^2 - \|R_N(N)\|^2\big|$. Consequently, combining the fact Var G = I, Corollary 4.4 b) and $(p/q)^2 = N/(n - N)$, we complete the proof of (3.7), and of Proposition 2.5 as well. □

Proof of Proposition 2.6. The proposition follows from Lemma 4.5 d), Lemma 3.1 and Corollary 4.2. □

Proof of Proposition 2.7. Simple computations show that

$$T_6 = \Big|E\int_{R^d} g(x)\,\{\pi_1(x - pR_N(N)) - \pi_1(x)\}\,dx\Big|.$$

Applying Taylor's formula, we get

$$T_6 = \Big|E\int_{R^d} g(x)\,\langle x - \tau pR_N(N),\ pR_N(N)\rangle\,\pi_1(x - \tau pR_N(N))\,dx\Big|$$

(here $\tau$ is uniformly distributed in (0,1)) and so, since Var G = I,

$$T_6 \le p\,\{E\|R_N(N)\|^2\}^{1/2}.$$

Corollary 4.4 b) completes the proof of the proposition. □

4. Some auxiliary lemmas. Lemma 4.1. For all real r > 0 n(K):=

(4.1)

J

W(K)(tW dt ::; IX(K)I

+ 2E IX(K)I·

R

Proof. Clearly,

f

n=

+ l(x>o)}

{

-00

f

+00

X

r(t) dt

+

x

(1 - F(tW dt }


(we have omited the index x for simplicity). Thus

J

J

+00

r s IXI +

+00

r(t)dt

+

(1 - F(tW dt.

-00

0

Let 0: > 0 be a real number; we shall choose it below. The n, for all f3 > 1/ r ,

J

+00

n:::; IX\ (4.2)

+ 20+2

P(\X\

'"

< IXI + 20: + 2( E IXI t P/

{ (rf3

- 1) orp-1

} .

There are two possibilities: $E|X| = 0$ and $E|X| > 0$. First, if $E|X| = 0$, choose $\beta = 2/r$ and let $\alpha \downarrow 0$. Clearly, (4.1) follows from (4.2). The second possibility, $E|X| > 0$, is clear too: choose $\alpha = E|X|$ and let $\beta \uparrow +\infty$. This completes the proof of the lemma.

Corollary 4.2. There exists a positive universal constant

and 3

1, E IIYII", E IIGII" :::;

C1 (K

+

C1

.9uch that for all k = 1,2

IIJ(O)IIoo)S A-s/2E IIXli s.

Proof. Since VarY = VarG = I, the moments E IIYII", E IIGII" does not exceed cE IIYII 3 for some c 0. This fact, coupled with E IIGII 2 = d 1 and (4.3) should prove the corollary completely. So let us establish (4.3). Using sup IIhll9

=

A- 1 / 2

and we arrive at E IIYII s :::; (K

+ IIJ(O)IIoo)S A-s/2E {

i: (J > 0, iit = 0,

if

iit > 0.

=1 k > the process {Xt, t = 0, k}

°

P

{1]0

> k}

for some fixed integer is defined a.s. One can easily verify that $\{X_t,\ t = 0, \dots, k\}$ is a Markov chain. We shall call it an induced chain. The one-step transition probabilities $\{P_{ij}\}$ are given by the relations

(1.2) $$P_{ij} = P\{X_{t+1} = j \mid X_t = i\} = \sum_{s=0}^{\infty} g_i(sm + j - i + 1), \qquad i, j = 1, \dots, m.$$

If a Markov chain with one-step transition probabilities (1.2) is irreducible and aperiodic, there exists a stationary distribution, which will be denoted by $\{\pi_j,\ j = 1, \dots, m\}$. We are now able to formulate
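The objects just introduced (the induced chain, its stationary distribution and the drift-type quantity ζ = Σ_j π_j G_j'(1) - 1 that appears in the ergodicity criterion) are easy to compute for small m. The sketch below is illustrative only. It assumes that the increment distribution in residue class i is a finite probability vector g_i on {0, 1, 2, ...}, that the residues are labelled 1, ..., m, and it reads (1.2) as summing g_i(n) over all n ≥ 0 with n ≡ j - i + 1 (mod m); the function names and the sample data are hypothetical.

```python
def induced_transition_matrix(pmfs):
    """pmfs[i-1][n] = g_i(n).  Returns the m x m matrix P with
    P[i-1][j-1] = sum of g_i(n) over n >= 0, n = j - i + 1 (mod m)."""
    m = len(pmfs)
    P = [[0.0] * m for _ in range(m)]
    for i in range(1, m + 1):
        for n, prob in enumerate(pmfs[i - 1]):
            # From residue i, an increment n - 1 leads to residue i + n - 1 (mod m).
            j = (i + n - 2) % m + 1
            P[i - 1][j - 1] += prob
    return P


def stationary_distribution(P, iterations=2000):
    """Power iteration pi <- pi P; adequate for a small irreducible aperiodic chain."""
    m = len(P)
    pi = [1.0 / m] * m
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(m)) for j in range(m)]
    return pi


def drift(pi, pmfs):
    """zeta = sum_j pi_j * G_j'(1) - 1, where G_j'(1) is the mean of g_j."""
    means = [sum(n * p for n, p in enumerate(g)) for g in pmfs]
    return sum(w * mu for w, mu in zip(pi, means)) - 1.0


if __name__ == "__main__":
    pmfs = [[0.5, 0.5], [0.2, 0.3, 0.5]]  # hypothetical g_1, g_2 (period m = 2)
    P = induced_transition_matrix(pmfs)
    pi = stationary_distribution(P)
    print(P, pi, drift(pi, pmfs))
```

A negative value of the computed drift corresponds to the ergodic case of the criterion, a positive one to the non-ergodic case.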

Theorem 1. Let

{1]t} be a Markov chain Mti&fying (1.2) with a control &equence

{ en}. Suppo&e that the following conditions hold.


1. For "orne periodic "equence of generative function" {G n }

p(G n , G n) =

L !gn(j) 00

gn(j)I-> 0

(n

--+

00).

j=O

gn(1)gn(O)gn(1) > O. There eziJt" a probability generative function GN(x) > G(x) for all x 0 and n = 0,1, .... Then {"It} is ergodic if 2. 3.

G(x) such. that G'(l) < 00 and

m

(=

(1.3)

L 1ri G' (1) -

1 < O.

j

j=l

If (1.4)

(>0

and p (Gn , G n ) = 0 for all Jufficiently large n then

(1.5)

"It

p ->

00 (t

--+

00)

(We recall that $\{\pi_j\}$ is a stationary distribution of an induced Markov chain.)

Proof. We shall assume that (1.3) holds but $\{\eta_t\}$ is non-ergodic and then come to a contradiction. Under condition 2 the chain $\{\eta_t\}$ is irreducible and aperiodic. If $\eta_t$ is non-ergodic, then for any y > 0 and $\varepsilon > 0$ one can find M so that

(1.6) $$P\{\eta_t > y\} > 1 - \varepsilon \qquad \text{for all } t \ge M.$$

Let us consider a finite-state Markov chain with the matrix of one-step transition probabilities $P = \{P_{ij}\}$ defined by (1.2). Condition 1 provides the existence of a stationary distribution $\{\pi_j,\ j = 1, \dots, m\}$. For fixed $\varepsilon > 0$ we can choose k such that

(1.7) $$|P_{ij}(t) - \pi_j| < \varepsilon/c \qquad \text{for all } i, j = 1, \dots, m \text{ and } t > k.$$

Here $P_{ij}(t)$ (i, j = 1, ..., m) are the t-step transition probabilities according to the matrix P and $c = \max\big[G'(1),\ \sum_{j=1}^{m} G_j'(1)\big]$. Let us fix k and choose N > k satisfying the inequality

e

L 00

(1.8)

Ign(j) - gn(j)1 < elk

for all

n > N.

j=O

The assumption (1.6) ensures the existence T such that (1.9)

P (At)

= P ht-A: > N + k} > 1 -

ele

for all

t > T.


The following estimates are true if t > T

E 7/t+1 (1.10)

= E [7/t + It - 1]+ XA' + E [7/t + It - 1]+ X (At} s E [71t + It - 1J+ XA, + E [71t + It - 1]x (At) + P (At) S E7/t + E l t XA, + EltX A , -1 + e/c,

where XA is the indicator function of the event A. In view (1.9) we have (1.11) Now we want to estimate E l t X(A,). To do it let us introduce a Markov chain with a control sequence {G n } and the initial condition

_

(1.12)

7/t-k

=

on the set At,

{ tu-» N

{'1., s 2:: t-k}

+k

on the set At.

Then P

{i]. 2:: N

for all s

=t -

= 1.

k, t}

Therefore we can consider an induced Markov chain {X., s t - k, t} according to {'1., s 2:: t - k}. Taking into account the choice of the constant c we derive from (1.7) m

(1.13)

E it

- 1=

L

m

Gj (1) P {X t

= j} - 1 S

Now our idea is to show that for fixed t s = t - k, t} can be constucted so that

{'1.,

(1.14)

p{

>

L

Gj (1) 7ri + e - 1 =

T chains {7/., s

(

+e

t - k, t} and

k

Cf:

j=O

L

19ii,(j) - 9'1,(j)11 fis = TIs }

00

P {iis = TIs = n}

Ign(j) - gn(j)1

j=O

n?N

{n k

{rt-j = it-j}

j=O

-1

P {iis = TIs = n} )

< e jk .

n?N

In virtue of (1.9) we have further for i

P

(L

= 0, j

- 1

n

At } = P (At) P {rt-k = it-k

I At}

k

X

IT P {,t-Hj = it-Hj

j=l

I At, It-Hi =

(1 - c:/c)(l - clk)k

it-Hi},

1 - e - e]«:

It is equivalent (1.14). Using (1.13) and (1.14) we can write

+ E , t X A, X(-y,#"!tl -1 :::;EitXA,) +E , tX("!,htl- 1

E , t X(A,) -1 =Eit X(A,) X(-y,="!tl

< ( + e + c(e + eI c) - 1.

(1.15)

Hence the inequality $E\eta_{t+1} \le E\eta_t + \zeta + 3\varepsilon + c\varepsilon + \varepsilon/c$ holds immediately.

Since (

0 such that

(1.16) and this inequality contradicts (1.6). The proof of the first part of the theorem is completed. Now we go to the case ( > 0 on. Let us assume, for the sake brevity, that Gn(z) for all n = 0,1, .... We suppose that {TIt} is ergodic so that

(1.17)

lim liminf P {TIt:::; Y}

y-+oo

t-+(X)

== Gn(z)

= 1.

Then we shall also come to a contradiction. Let us introduce a stationary Markov chain $\{Y_t,\ t = 0, 1, \dots\}$ with one-step transition probabilities $\{P_{ij},\ i, j = 1, \dots, m\}$ defined by (1.2) and a sequence of non-negative random variables $\{\theta_t\}$ so that

(t > 0)

This condition provides that $\{Y_t, \theta_t\}$ is a Markov chain whose state space is the lattice in the positive quarter plane $Z_+^2 = \{(i, j):\ i, j \ge 0 \text{ are integers}\}$. Since

One can find c

>0

such that t

P

8. - 1) > et

.=1

and

for all

t

-1)

P

2 O} > 0

t

= +oo} = 1 .

• =0

It means that t

(1.18)

t

P(A) = P

-1) > 0,

-1) = +00, .=0

t

2 o} > o.

.=0

Let us consider a chain {17t} with the initial condition P{170

'fm) D;"(.z, s) = 1 + >'fm z D;;'(z,s),

where

(i

= I,m -1)

00

= j e-· r Di(z,r) dr,

Di(z,s)

o (Re s > 0). the determination off unctions Di(z,r) (i possible to find a control sequence as

= I,m)

from this system makes

00

(3.6)

Gi(z ) = E ( z9, IOt=i)= jDi(z,r)dB(r). o

Let us construct an induced Markov chain Xl to find the ergodicity condition of {Ok}. It is possible now to obtain transition probabilities from (1.2). Instead of doing that we see conditional probabilities (cpi;(r) = P {y(r) = j (mod m) I y(O) = i (mod m)} i,j = 1, m. One can earsily find in Laplace transform terms for i = 1, m - 1, i + k = m

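$$
\begin{aligned}
(s + \lambda f_i)\,\varphi^*_{ii}(s) &= 1 + \lambda f_i\,\varphi^*_{i+1,i}(s),\\
(s + \lambda f_i)\,\varphi^*_{i,i+k}(s) &= \lambda f_i\,\varphi^*_{i+1,i+k}(s),\\
(s + \lambda f_m)\,\varphi^*_{mm}(s) &= 1 + \lambda f_m\,\varphi^*_{1m}(s),\\
(s + \lambda f_m)\,\varphi^*_{mi}(s) &= \lambda f_m\,\varphi^*_{1i}(s),
\end{aligned} \tag{3.7}
$$

where

$$\varphi^*_{ij}(s) = \int_0^\infty e^{-s\tau}\,\varphi_{ij}(\tau)\,d\tau.$$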

Obviously, the system (3.7) has a unique solution, and the transition probabilities are given by the formulas

$$P_{ij} = P\{X_{t+1} = j \mid X_t = i\} = \int_0^\infty \varphi_{i,j+1}(\tau)\,dB(\tau), \qquad j = \overline{1, m-1},\ i = \overline{1,m},$$

$$P_{im} = P\{X_{t+1} = m \mid X_t = i\} = \int_0^\infty \varphi_{i1}(\tau)\,dB(\tau), \qquad i = \overline{1,m}. \tag{3.8}$$
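As a rough numerical companion to (3.7) and (3.8), the following sketch builds the transition matrix of the induced chain in one special case. It rests on assumptions that are not in the text: service times are taken to be exponential with rate `mu`, so that $\int_0^\infty \varphi_{ij}(\tau)\,dB(\tau) = \mu\,\varphi^*_{ij}(\mu)$; the system (3.7) is read as $(s + \lambda f_i)\varphi^*_{ij}(s) - \lambda f_i\,\varphi^*_{i+1,j}(s) = \delta_{ij}$ with indices taken cyclically modulo $m$; and all function names are hypothetical.

```python
def solve(a, b):
    """Solve a * x = b (b is a matrix) by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def laplace_phi(lam, f, s):
    """Matrix of phi*_{ij}(s) under the assumed reading of (3.7):
    (s + lam*f_i) phi*_{ij} - lam*f_i phi*_{i+1,j} = delta_{ij}, indices mod m."""
    m = len(f)
    a = [[0.0] * m for _ in range(m)]
    for i in range(m):
        a[i][i] += s + lam * f[i]
        a[i][(i + 1) % m] -= lam * f[i]
    identity = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    return solve(a, identity)


def induced_chain(lam, f, mu):
    """Transition matrix (3.8), assuming B(x) = 1 - exp(-mu * x)."""
    m = len(f)
    phi_star = laplace_phi(lam, f, mu)
    # For exponential service, the integral of phi_ij(tau) dB(tau) equals mu * phi*_ij(mu).
    integ = [[mu * v for v in row] for row in phi_star]
    # P_ij uses phi_{i,j+1} for j < m and phi_{i,1} for j = m: a cyclic column shift.
    return [[integ[i][(j + 1) % m] for j in range(m)] for i in range(m)]


P = induced_chain(lam=1.0, f=[0.5, 1.0], mu=1.0)
```

Each row of `P` sums to one, which is a quick sanity check on the reading of (3.7) adopted here. A non-exponential $B(x)$ would need the integral in (3.8) evaluated by other means.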


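If

$$f_k > 0 \quad \text{for any } k, \tag{3.9}$$

the Markov chain $X_t$ has a stationary distribution $\{\pi_j,\ j = \overline{1,m}\}$ given by the solution of the system

$$\pi_j = \sum_{i=1}^{m} \pi_i P_{ij}, \quad j = \overline{1,m}, \qquad \sum_{j=1}^{m} \pi_j = 1. \tag{3.10}$$

$\pi_i$ … 0, then $\{q_t\}$ is non-ergodic.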

Example. For the case m

'lr2

=2

one can easily get from (3.6-3.8) and (3.10)

h + hb*(O) = (h + 12)(1 + b*(O)) ,

where

+ Cd2 (1 -

=

C1

=

C1 -

b*(O)),

c2h (1 - b*(O)),

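$$b^*(s) = \int_0^\infty e^{-sx}\,dB(x), \qquad \theta = \lambda(f_1 + f_2),$$

$$c_1 = 2\lambda b f_1 f_2 (f_1 + f_2)^{-1}, \qquad c_2 = (f_2 - f_1)(f_1 + f_2)^{-2}.$$

The inequality (3.11) can be rewritten as

$$\zeta = c_1 + (f_2 - f_1)\,c_2\,\frac{1 - b^*(\theta)}{1 + b^*(\theta)} - 1 < 0. \tag{3.12}$$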

It is clear that (3.12) holds when $\rho = \lambda b < 1$. In the case $\rho = 1$ we have $\zeta < 0$ iff $f_1 f_2 < 1$. Thus the ergodicity condition is the same for all distribution functions $B(x)$ when $\rho \le 1$. We now assume that $\rho > 1$. Without loss of generality we suppose $f_2 > f_1$ and put $f_2 = 1$, $f_1 = x$. Then (3.12) can be rewritten as

$$\big((\rho - 1)x^2 + \rho x - 1\big)\,v(x) < 2x - \rho x - \rho x^2, \tag{3.13}$$

where $v(x) = b^*(\lambda(1 + x))$.

In the exponential case $v(x) = (1 + \rho + \rho x)^{-1}$ and (3.11) takes the form

$$\rho^2 x < 1,$$

which coincides with (3.1). An elementary analysis of the behaviour of the functions $\varphi_1(x) = (\rho - 1)x^2 + \rho x - 1$ and $\varphi_2(x) = 2x - \rho x - \rho x^2$ shows that the equation corresponding to (3.13) has a unique solution $\gamma(\rho)$ ($0 < \gamma(\rho) < 1$). Let $\{f_k\}$ have period $m = 2$ and $f_2 > f_1$. Then the sequence $\{q_t\}$ is ergodic if

$$f_1 < f_2\,\gamma(\rho f_2),$$

where $\gamma(\rho)$ ($0 < \gamma(\rho) < 1$) is the only solution of the equation (3.13). The following table gives $\gamma(\rho)$ for some distributions.

Note. This behaviour of the ergodicity coefficient $\gamma(\rho)$ is in fact natural. We deal here with a situation in which customers have incomplete information. Indeed, we can interpret a system with a periodic sequence $\{f_k\}$ in the following manner. There is a buffer with $m - 1$ places in which customers encountering a busy server are stored. When all $m - 1$ places are occupied, the buffer is emptied and all the customers go to a hidden waiting room, which is assumed to be unlimited. So an arriving customer knows only the number of customers in the buffer. Using only this information, the customer must decide whether to stay in the system or to leave it. In our model $f_k$ is the probability of the first choice.

Table 1. $\gamma(\rho)$ for some distributions.

  ρ \ b*(s)    exp(-bs)    (1 + bs/2)^(-2)    (2s + 1)^(-0.1)    (1 + bs)^(-1)
  1            1           1                  1                  1
  1.2          0.681       0.690              0.709              0.692
  1.4          0.475       0.498              0.544              0.510
  1.6          0.337       0.372              0.439              0.390
  1.8          0.240       0.286              0.366              0.309
  2.0          0.173       0.225              0.314              0.250
  2.4          0.091       0.146              0.242              0.174
  2.6          0.067       0.121              0.217              0.148
  2.8          0.048       0.101              0.197              0.128
  3.0          0.036       0.085              0.120              0.111
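The coefficient $\gamma(\rho)$ can be explored numerically with a short sketch. The code below is an illustration under stated assumptions, not the author's procedure: it normalises the mean service time to $b = 1$ (so that $\lambda = \rho$ and $v(x) = b^*(\rho(1 + x))$), treats (3.13) as the equation $\varphi_1(x)v(x) = \varphi_2(x)$, and locates its root on $(0, 1)$ by bisection. The function names are hypothetical. In the exponential case the difference $\varphi_1(x)v(x) - \varphi_2(x)$ equals $(1 + x)^2(\rho^2 x - 1)/(1 + \rho + \rho x)$, so the root is exactly $\rho^{-2}$, which gives a convenient check.

```python
import math


def gamma(rho, lst, tol=1e-12):
    """Root in (0, 1) of phi1(x) * v(x) = phi2(x), where v(x) = lst(rho * (1 + x)).

    `lst` is the Laplace-Stieltjes transform b*(s) of the service time,
    with the mean service time normalised to 1 (an assumption of this sketch).
    """
    if rho <= 1.0:
        raise ValueError("gamma(rho) is sought here only for rho > 1")

    def h(x):
        v = lst(rho * (1.0 + x))
        phi1 = (rho - 1.0) * x * x + rho * x - 1.0
        phi2 = 2.0 * x - rho * x - rho * x * x
        return phi1 * v - phi2

    # h(0) = -v(0) < 0 and h(1) = 2 * (rho - 1) * (1 + v(1)) > 0 for rho > 1.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exponential(s):
    return 1.0 / (1.0 + s)          # b*(s) = (1 + bs)^(-1) with b = 1


def deterministic(s):
    return math.exp(-s)             # b*(s) = exp(-bs) with b = 1


def erlang2(s):
    return (1.0 + s / 2.0) ** -2    # b*(s) = (1 + bs/2)^(-2) with b = 1


g_exp = gamma(2.0, exponential)
g_det = gamma(2.0, deterministic)
g_erl = gamma(2.0, erlang2)
```

For $\rho = 2$ the three roots agree, to about three decimals, with the corresponding entries 0.250, 0.173 and 0.225 of Table 1.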

REFERENCES

[1] V. A. Malyshev, Classification of two-dimensional positive random walks and almost linear semimartingales, Soviet Math. Dokl., 13 (1972), pp. 526-528.
[2] V. A. Malyshev and M. V. Menshikov, Ergodicity, continuity and analyticity of countable Markov chains, Trans. Moscow Math. Soc., 39 (1979), pp. 2-48.
[3] G. Fayolle, On random walks arising in queueing systems: ergodicity and transience via quadratic forms as Lyapounov functions I, Queueing Systems: Theory and Appl., 5 (1989), pp. 167-184.
[4] L. Takacs, Theory of Queues, Oxford Univ. Press, New York, 1962.
[5] B. Natvig, On the transient state probabilities for a queueing model where potential customers are discouraged by queue length, J. Appl. Probab., 11 (1974), pp. 345-354.
[6] V. Doorn, The transient state probabilities for a queueing model where potential customers are discouraged by queue length, J. Appl. Probab., 18 (1981), pp. 499-506.
[7] J. Kiefer and J. Wolfowitz, On the theory of queues with many servers, Trans. Amer. Math. Soc., 78 (1955), pp. 1-18.
[8] L. G. Afanas'eva and E. V. Bulinskaja, Stochastic Processes in the Theory of Queues and Storage, Moscow State Univ. Publ., Moscow, 1980. (In Russian.)
[9] L. G. Afanas'eva, On the output in the queueing model with impatient customers, Izv. Akad. Nauk SSSR, Ser. Techn. Kibern., 3 (1966), pp. 57-65. (In Russian.)

DEPT. OF MATH. AND MECH., MOSCOW STATE UNIV., LENINSKIE GORY, MOSCOW 119899, RUSSIA.

A. Plucinska and E. Plucinski

SOME LIMIT PROBLEM FOR DEPENDENT RANDOM VARIABLES

1. Introduction and formulation of the results.

Let be a probability space, let $\{X_{nk}\}_{k \le n,\ n \ge 1}$ be a sequence of random variables, $E(X_{nk}) = 0$. We put $S_{nr} = \sum_{k=1}^{r} X_{nk}$, $S_n = S_{nn}$, = and we assume throughout the paper that

$$\lim_{n \to \infty} \sigma_n = \infty. \tag{0}$$

Let h::;n, n2:1 be a sequence of sub aalgebras of which satisfies one of the following conditions: either C k = 1,2'00' ,n 1; n > 1 and X nk is measurable or = a(U nk), Snk =