English Pages IX, 167 [171] Year 2020
Progress in Probability 75
Sergio I. López Víctor M. Rivero Alfonso Rocha-Arteaga Arno Siri-Jégousse Editors
XIII Symposium on Probability and Stochastic Processes UNAM, Mexico, December 4-8, 2017
Progress in Probability Volume 75 Series Editors Steffen Dereich, Universität Münster, Münster, Germany Davar Khoshnevisan, The University of Utah, Salt Lake City, UT, USA Andreas E. Kyprianou, University of Bath, Bath, UK Sidney I. Resnick, Cornell University, Ithaca, NY, USA Progress in Probability is designed for the publication of workshops, seminars and conference proceedings on all aspects of probability theory and stochastic processes, as well as their connections with and applications to other areas such as mathematical statistics and statistical physics.
More information about this series at http://www.springer.com/series/4839
Editors Sergio I. López Departamento de Matemáticas Universidad Nacional Autónoma México México, Ciudad de México, Mexico
Víctor M. Rivero CIMAT Guanajuato, Mexico
Alfonso Rocha-Arteaga Facultad de Ciencias Universidad Autónoma de Sinaloa Culiacán, Sinaloa, Mexico
Arno Siri-Jégousse Departamento de Probabilidad Universidad Nacional Autónoma México México, Ciudad de México, Mexico
ISSN 1050-6977 ISSN 2297-0428 (electronic) Progress in Probability ISBN 978-3-030-57512-0 ISBN 978-3-030-57513-7 (eBook) https://doi.org/10.1007/978-3-030-57513-7 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This book is published under the imprint Birkhäuser, www.birkhauser-science.com, by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
This volume contains research contributions and lecture notes from the XIII Symposium on Probability and Stochastic Processes, held at the Universidad Nacional Autónoma de México (UNAM), Mexico, on December 4–8, 2017. Since the first edition of this symposium, held in December 1988 at CIMAT, the main goal of the event has been to create a forum where ideas are exchanged and recent progress in the field is discussed, by gathering national and international researchers as well as graduate students, making it one of the main events in the field in Mexico. Held in Mexico City, preserving the tradition of taking place almost biennially at different Mexican institutions all over the country for the last three decades, it gathered academics from Argentina, Chile, France, and the USA, many of whom are academically close to the probability community in Mexico. A wide range of topics was covered, from consolidated to emergent areas, in both theoretical and applied probability. The event also hosted a number of conferences for the celebration of the 75th anniversary of the Instituto de Matemáticas, UNAM. This undoubtedly gave a great impulse to the symposium's program, in particular by ensuring the participation of further prestigious mathematicians and enlarging the scope of the event. The scientific program included two courses: Reflected (degenerate) Diffusions and Stationary Measures, organized by Mauricio Duarte, and Soliton Decomposition of Box–Ball System and Hydrodynamics of N-Branching Brownian Motions, by Pablo Ferrari. We thank both Duarte and Ferrari for putting together these interesting courses and the lecture notes.
The event also benefited from five plenary talks that were given by Octavio Arizmendi, Florent Benaych-Georges, Rolando Cavazos, Joaquín Fontbona, and Brian Rider, as well as four plenary talks shared with the Instituto de Matemáticas, given by Luis Caffarelli, Pierre-Louis Lions, Sylvie Méléard, and Nizar Touzi. Another four thematic sessions on Random Matrices, Random Trees, Risk Theory, and Stochastic Control and four contributed talks completed the outline of the symposium. This volume is split into two main parts: the first one presents lecture notes of the course provided by Mauricio Duarte. It is followed by its second part which contains
research contributions of some of the participants. Pablo Ferrari, along with Davide Gabrielli, wrote a research contribution paper instead of lecture notes. The lecture notes of Mauricio Duarte give an insight into diffusions with instantaneous reflection when hitting a boundary. After properly defining these processes, two main tools are developed to analyze them: stochastic differential equations and the submartingale problem. After studying in detail the existence and uniqueness of the stationary measure of such processes, the author illustrates his lectures with two examples inspired by his own research: the Brownian motion reflected when hitting a falling particle (gravity versus Brownian motion) and the spinning Brownian motion. The research contributions start with an article written by Osvaldo Angtuncio-Hernández, where an alternative construction of multidimensional random walks conditioned to stay ordered is provided. This is the multidimensional version of the standard random walk conditioned to stay positive. This new construction is inspired by the one-dimensional case, where random walks are conditioned to stay positive until a geometric time, and has the advantage of requiring only a minimal restriction on the h-function, relaxing the hypotheses of previous works. The h-function is also studied in detail, and a characterization of when the limit is Markovian or sub-Markovian is provided. A Berry–Esseen-type theorem is presented by Octavio Arizmendi and Daniel Perales for finite free convolution of polynomials. They investigate the rate of convergence in the central limit theorem for finite free convolution of polynomials, obtaining, as in the classical and free cases, a rate of order $n^{-1/2}$. Cumulants for finite free convolution, introduced by the authors in another paper, are used to approach free cumulants in order to obtain their result.
This approach provides a nice example of the many properties in free probability which already appear in finite free probability. Erik Baurdoux and José Pedraza pose the problem of finding an optimal time which minimizes the $L^1$ distance to the last time when a spectrally negative Lévy process $X$ (with positive drift) is below the origin. Since last-passage times are not stopping times, a direct approach to the problem is not suitable. Based on existing results, the authors rewrite the problem as a classical optimal stopping problem. Using that reformulation, they succeed in linking the optimal stopping level with the median of the convolution with itself of the distribution function of the negative of the running infimum of the process. Three qualitatively different examples are explicitly computed: Brownian motion with drift, the Cramér–Lundberg risk process, and a process with infinite variation without a Gaussian component. These computations give a strong flavor of how these results can be used in applications. With the initial intention of writing a survey article, Pablo Ferrari and Davide Gabrielli present a novel decomposition of configurations of solitons. After reviewing recent results about equivalent decompositions of solitons as different interacting particle systems, they construct a decomposition equivalent to the branch decomposition of the tree associated with the excursion. Using this new decomposition, they are able to obtain combinatorial results and to find explicitly the joint
distribution of branches in the Bernoulli independent case. Besides the new results, the paper is a nice introduction to the topic of Box–Ball systems and their soliton conservation property. Finally, Orimar Sauri studies the invertibility of continuous-time moving average processes driven by a Lévy process. The author provides sufficient conditions for the recovery of the driving Lévy process, motivated by the discrete-time moving average framework. The paper reviews some invertibility results on discrete-time moving average processes and gives a detailed overview of stochastic integrals of a deterministic kernel with respect to a Lévy process. It is then shown that continuous-time moving average processes are invertible whenever the Fourier transform of the kernel never vanishes and a regularity condition on the characteristic triplet of the background driving Lévy process is imposed. Several examples are discussed, including Ornstein–Uhlenbeck processes. All of the papers, including the lecture notes from the guest course, were subjected to a strict peer-review process with high international standards. We are very grateful to the referees, experts in their fields, for their professional and useful reports. All of their comments were considered by the authors and substantially improved the material presented herein. We would also like to express our gratitude to all of the authors whose contributions are published in this book, as well as to all of the speakers and session organizers of the symposium for their stimulating talks and support. Their valuable contributions show the interest and activity in the area of Probability and Stochastic Processes in Mexico. We hold in high regard the editors of the book series Progress in Probability, Steffen Dereich, Davar Khoshnevisan, Andreas E. Kyprianou, and Sidney I. Resnick, and thank them for giving us the opportunity to publish this special volume in this prestigious series.
Special thanks to the symposium venue, the Facultad de Ciencias at UNAM, and its staff for providing great hospitality and excellent conference facilities. We are also indebted to the Local Committee of the conference, formed by Clara Fittipaldi, Yuri Salazar-Flores, and Geronimo Uribe-Bravo, whose organizational work allowed us to focus on the academic aspects of the conference. The symposium as well as this volume would not have been possible without the great support of our sponsors: Facultad de Ciencias, Instituto de Matemáticas, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas at UNAM, Centro de Investigación en Matemáticas, Universidad Autónoma de Sinaloa, as well as the Consejo Nacional de Ciencia y Tecnología. We hope that the reader will enjoy learning about the various topics addressed in this volume.
Mexico City, Mexico: Sergio I. López
Guanajuato, Mexico: Víctor M. Rivero
Culiacán, Mexico: Alfonso Rocha-Arteaga
Mexico City, Mexico: Arno Siri-Jégousse
Contents
Part I Lecture Notes
Reflected (Degenerate) Diffusions and Stationary Measures, Mauricio Duarte
Part II Articles
Multidimensional Random Walks Conditioned to Stay Ordered via Generalized Ladder Height Functions, Osvaldo Angtuncio-Hernández
A Berry–Esseen Type Theorem for Finite Free Convolution, Octavio Arizmendi and Daniel Perales
Predicting the Last Zero of a Spectrally Negative Lévy Process, Erik J. Baurdoux and José M. Pedraza
Box-Ball System: Soliton and Tree Decomposition of Excursions, Pablo A. Ferrari and Davide Gabrielli
Invertibility of Infinitely Divisible Continuous-Time Moving Average Processes, Orimar Sauri
Part I
Lecture Notes
Reflected (Degenerate) Diffusions and Stationary Measures Mauricio Duarte
Abstract These notes were written on the occasion of the XIII Symposium on Probability and Stochastic Processes at UNAM. We introduce general reflected diffusions with instantaneous reflection when hitting the boundary. Two main tools for studying these processes are presented: the submartingale problem and stochastic differential equations. We will see how these two complement each other. In the last sections, we study in detail two processes to which this theory applies nicely and for which uniqueness of a stationary distribution holds, despite the fact that they are degenerate.
1 Reflected(-ing) Brownian Motion
In these notes,¹ we will introduce some aspects of the theory of Reflected diffusions. We start by considering a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a one-dimensional Brownian motion $B_t$ on this space. For details on the construction of Brownian motion and some other introductory material on stochastic processes related to these notes, the reader can consult the books [20, 22, 35].
1 These notes were written for a four-lecture mini-course at the XIII Symposium on Probability and Stochastic Processes, held at the Faculty of Sciences, Universidad Nacional Autónoma de México, from December 4–8, 2017.
M. Duarte, Departamento de Matematica, Universidad Andres Bello, Santiago, Chile. © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. S. I. López et al. (eds.), XIII Symposium on Probability and Stochastic Processes, Progress in Probability 75, https://doi.org/10.1007/978-3-030-57513-7_1
Fig. 1 A Brownian motion and its mirrored reflection (horizontal axis: time; vertical axis: space; curves: Brownian motion and mirror BM)
A celebrated result of Lévy (1948) established that the following two processes have the same law:
\[ X^1 = \{\, |B_t| ,\ 0 \le t < \infty \,\}, \qquad X^2 = \Bigl\{\, \sup_{0 \le u \le t} (B_u - B_t),\ 0 \le t < \infty \,\Bigr\}. \]
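As a quick numerical sanity check of Lévy's identity, the following Python sketch compares the two processes at the single time $t = 1$. It is only an illustration under stated assumptions: Brownian paths are approximated by Gaussian random walks, the names `simulate_endpoints` and `mean` are ours, and agreement of the marginals at one fixed time is much weaker than equality in law of the two processes.

```python
import math
import random


def simulate_endpoints(n_paths=2000, n_steps=400, horizon=1.0, seed=0):
    """Sample |B_T| and max_{u<=T} B_u - B_T from Gaussian random-walk paths.

    Assumption: each Brownian path is replaced by a random walk with
    n_steps Gaussian increments of variance horizon / n_steps.
    """
    rng = random.Random(seed)
    step_sd = math.sqrt(horizon / n_steps)
    mirrored, sup_minus_path = [], []
    for _ in range(n_paths):
        b, running_max = 0.0, 0.0
        for _ in range(n_steps):
            b += rng.gauss(0.0, step_sd)
            running_max = max(running_max, b)
        mirrored.append(abs(b))                 # sample of X^1 at time T
        sup_minus_path.append(running_max - b)  # sample of X^2 at time T
    return mirrored, sup_minus_path


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


if __name__ == "__main__":
    x1, x2 = simulate_endpoints()
    print("mean of |B_1|           :", round(mean(x1), 3))
    print("mean of sup B_u - B_1   :", round(mean(x2), 3))
    print("theoretical sqrt(2/pi)  :", round(math.sqrt(2.0 / math.pi), 3))
```

Both empirical means should be close to $\mathbb{E}|B_1| = \sqrt{2/\pi}$. The maximum over a grid slightly underestimates the continuous supremum, so the second mean is expected to be a little smaller.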
It is intuitive to picture the graph of $X^1_t$ as a Brownian motion that is mirrored on the horizontal axis. We will call the process $X^1_t$ a mirror Brownian motion (Fig. 1). Observe that the mirror Brownian motion carries less information than its original counterpart.
Proposition 1.1 The filtration $\mathcal{F}^B_t$ generated by $B_t$ is strictly larger than the filtration $\mathcal{F}^{|B|}_t$ generated by $|B_t|$.
Proof Consider the process $S_t = \operatorname{sgn}(B_t)$, that is, $S_t = 0$ if $B_t = 0$, and $S_t = B_t/|B_t|$ for $B_t \neq 0$. It is clear from the definition that $S_t$ is adapted to $\mathcal{F}^B_t$. To prove our claim, it suffices to show that $S_t$ is not adapted to $\mathcal{F}^{|B|}_t$. Assume that $S_t = \varphi(|B_u|, u \le t)$ for some measurable function $\varphi$. Define $\tilde{B}_t = -B_t$, which is again a Brownian motion, and $\tilde{S}_t = \operatorname{sgn}(\tilde{B}_t) = -S_t$. It follows that
\[ \tilde{S}_t = \varphi(|\tilde{B}_u|, u \le t) = \varphi(|B_u|, u \le t) = S_t, \]
which contradicts $\tilde{S}_t = -S_t$ whenever $B_t \neq 0$. Therefore, $S_t$ is not adapted to $\mathcal{F}^{|B|}_t$, and the claim is proved.
Definition 1.2 Any stochastic process with the same law as $X^1$ will be called a (one-dimensional) Reflected Brownian motion. Typically, we will abbreviate it by RBM.
Note that one way to obtain an RBM is as a mirrored BM. We will explore other ways to realize this process in what follows. Our aim is to describe RBM in a way that overcomes the measurability restriction discussed in Proposition 1.1. In order to do so, we need an extension of the well-known Itô formula. The proof of the following proposition can be found in [35, Chapter VI, Theorem 1.1].
Proposition 1.3 Let $f : \mathbb{R} \to \mathbb{R}$ be a convex function. If $X$ is a continuous semimartingale, there exists a continuous increasing process $A^f$ such that
\[ f(X_t) = f(X_0) + \int_0^t D^- f(X_u)\,dX_u + A^f_t, \tag{1.1} \]
where $D^- f$ denotes the left derivative of $f$.
If we take $f(x) = |x|$ and $X = B$, then the process $A^f$ given by the previous proposition is known as the local time of Brownian motion at zero, and is typically denoted by $L^0_t$. The resulting equation is known as Tanaka's formula:
\[ |B_t| = |B_0| + \int_0^t \operatorname{sgn}(B_u)\,dB_u + L^0_t. \tag{1.2} \]
This equation shows that $|B_t|$ is also a semimartingale, and its martingale part is given by $\beta_t = \int_0^t \operatorname{sgn}(B_u)\,dB_u$. We can easily check from Itô's isometry that the quadratic variation of $\beta$ is $\langle \beta \rangle_t = t$, which implies that $\beta_t$ is a Brownian motion by Lévy's characterization of Brownian motion (see [22, Chapter 3, Theorem 3.16]). Lévy also proved that the law of the process given by $M^B_t = \sup_{0 \le u \le t} B_u$ is the same as the law of the local time at zero. The following proposition justifies the terminology "local time" for the process $L^0_t$.
Proposition 1.4 Given a Brownian motion $B_t$, its local time at zero satisfies the following formula a.s.:
\[ L^0_t = \lim_{\varepsilon \to 0} \frac{1}{2\varepsilon} \int_0^t 1_{[-\varepsilon,\varepsilon]}(B_u)\,du. \tag{1.3} \]
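Proof The proof uses Itô's formula to approximate equation (1.2). We will consider a sequence of functions $f_n \in C^2(\mathbb{R})$ such that $f_n(x) \to |x|$ for all $x \in \mathbb{R}$, and for $x \neq 0$ it holds that $f_n'(x) \to \operatorname{sgn}(x)$, and $|f_n'(x)| \le 2$. Take $\varepsilon > 0$. Let $g_n$ be the even extension of the function
\[ g_n(x) = 1_{[0,\varepsilon]}(x) + \bigl(1 - n(x - \varepsilon)\bigr)\,1_{(0,\frac{1}{n}]}(x - \varepsilon), \qquad x \ge 0, \]
and let $f_n$ be the $C^2(\mathbb{R})$ function such that $f_n(0) = 0$, $f_n'(0) = 0$, and $f_n''(x) = g_n(x)$. For $x \ge 0$ we have
\[ f_n'(x) = x\,1_{[0,\varepsilon]}(x) + \Bigl(\varepsilon + \tfrac{1}{2}(x - \varepsilon)\bigl(2 - n(x - \varepsilon)\bigr)\Bigr) 1_{(0,\frac{1}{n}]}(x - \varepsilon) + \Bigl(\varepsilon + \frac{1}{2n}\Bigr) 1_{(\frac{1}{n},\infty)}(x - \varepsilon), \]
and for $x < 0$ we have $f_n'(x) = -f_n'(-x)$. We readily check that $|f_n'(x)| \le \varepsilon + 1$, and that, for $x \ge 0$, $f_n'(x) \to (x - \varepsilon)1_{[0,\varepsilon]}(x) + \varepsilon$ as $n \to \infty$. Using these facts, we prove that there is a function $f_\infty(x) = \lim_{n\to\infty} f_n(x)$ satisfying
\[ f_\infty(x) = \int_0^{|x| \wedge \varepsilon} (t - \varepsilon)\,dt + \varepsilon|x| = \tfrac{1}{2}(|x| - \varepsilon)^2\, 1_{[0,\varepsilon]}(|x|) - \tfrac{1}{2}\varepsilon^2 + \varepsilon|x| = o(\varepsilon) + \varepsilon|x|. \]
Applying Itô's formula to $f_n$, and taking the limit as $n \to \infty$, we obtain
\[ o(\varepsilon) + \varepsilon|B_t| = \varepsilon|B_0| + \int_0^t \operatorname{sgn}(B_u)\Bigl( (|B_u| - \varepsilon)1_{[-\varepsilon,\varepsilon]}(B_u) + \varepsilon \Bigr)\,dB_u + \frac{1}{2}\int_0^t 1_{[-\varepsilon,\varepsilon]}(B_u)\,du. \]
For fixed $t \ge 0$, the part of the martingale term coming from $(|B_u| - \varepsilon)1_{[-\varepsilon,\varepsilon]}(B_u)$ has quadratic variation of order $o(\varepsilon^2)$, so after dividing the equation by $\varepsilon$ it vanishes as $\varepsilon$ goes to zero. We obtain
\[ |B_t| = |B_0| + \int_0^t \operatorname{sgn}(B_u)\,dB_u + \lim_{\varepsilon \to 0} \frac{1}{2\varepsilon}\int_0^t 1_{[-\varepsilon,\varepsilon]}(B_u)\,du. \]
Comparing this equation to (1.2), we obtain the desired result.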
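The two descriptions of the local time, the occupation-time limit (1.3) and Tanaka's formula (1.2), can be compared numerically along a simulated path. The Python sketch below is only a rough illustration: the path is a Gaussian random walk, the integral in (1.3) is a Riemann sum with a fixed small `eps` instead of a limit, the stochastic integral is a left-point sum, and the names `sign` and `local_time_estimates` are ours.

```python
import math
import random


def sign(x):
    """sgn(x) with the convention sgn(0) = 0 used in these notes."""
    return (x > 0) - (x < 0)


def local_time_estimates(n_steps=20000, horizon=1.0, eps=0.02, seed=0):
    """Return (occupation, tanaka) estimates of L^0_T along one path with B_0 = 0.

    occupation: (1 / (2 eps)) * time spent in [-eps, eps], as in (1.3)
    tanaka:     |B_T| - |B_0| - sum of sgn(B) dB, as in (1.2)
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    step_sd = math.sqrt(dt)
    b = 0.0
    time_in_band = 0.0         # Riemann sum for the occupation integral
    stochastic_integral = 0.0  # left-point sum for int sgn(B_u) dB_u
    for _ in range(n_steps):
        if abs(b) <= eps:
            time_in_band += dt
        increment = rng.gauss(0.0, step_sd)
        stochastic_integral += sign(b) * increment
        b += increment
    occupation = time_in_band / (2.0 * eps)
    tanaka = abs(b) - stochastic_integral
    return occupation, tanaka


if __name__ == "__main__":
    for seed in range(3):
        occ, tan = local_time_estimates(seed=seed)
        print(f"seed {seed}: occupation {occ:.3f}, Tanaka {tan:.3f}")
```

On a single path the two estimates agree only approximately, since both `eps` and the time step are fixed. Averaging the difference over several seeds gives a more stable comparison.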
Moving forward, if we denote $X_t = |B_t|$, then the last display in the proof of Proposition 1.4 motivates the definition of the local time at zero of $X_t$, namely,
\[ L_t = \lim_{\varepsilon \to 0} \frac{1}{2\varepsilon}\int_0^t 1_{[0,\varepsilon]}(X_u)\,du. \tag{1.4} \]
This yields the decomposition
\[ X_t = X_0 + \beta_t + L_t, \tag{1.5} \]
where $\beta_t = \int_0^t \operatorname{sgn}(B_u)\,dB_u$ is a Brownian motion. The local time $L_t$ is then defined in terms of $X_t$ only. Let's summarize the most relevant properties we already know about $L_t$:
(1) it is increasing,
(2) it is a continuous additive functional,
(3) it only increases on the set $\{t : X_t = 0\}$.
Properties (1) and (2) follow directly from Proposition 1.3, while (3) can be deduced directly from Proposition 1.4. Our next step will be to turn around the decomposition $X_t = X_0 + \beta_t + L_t$; that is, we will prove an important lemma which says that, given any Brownian motion $\beta_t$, we can find a Reflected Brownian motion $X_t$ and a process $L_t$ satisfying properties (1)–(3) such that (1.5) holds.
Lemma 1.5 (Skorohod) Let $y(\cdot)$ be a real-valued, continuous function on $[0, \infty)$ such that $y(0) \ge 0$. There exists a unique pair of functions $(z, \varphi)$ on $[0, \infty)$ such that:
(a) $z(t) = y(t) + \varphi(t)$ for all $t \ge 0$,
(b) $z(t) \ge 0$,
(c) $\varphi(\cdot)$ is increasing, continuous, $\varphi(0) = 0$, and the corresponding measure $d\varphi(t)$ is carried by the set $\{t : z(t) = 0\}$.
Moreover,
\[ \varphi(t) = \sup_{u \le t} \bigl( -y(u) \vee 0 \bigr). \tag{1.6} \]
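Formula (1.6) translates directly into a running-maximum computation on a sampled path. The following Python sketch applies it to a list of values of $y$ on a time grid; the function name `skorohod_map` and the discretization are our own illustrative choices, and a sampled path only approximates the continuous statement of the lemma.

```python
def skorohod_map(y):
    """Discrete Skorohod map: return (z, phi) with z = y + phi >= 0.

    y is a list of samples of the input path, with y[0] >= 0.
    phi[k] is the running maximum of max(-y[j], 0) over j <= k, as in (1.6).
    """
    if y[0] < 0:
        raise ValueError("the input path must start at a nonnegative value")
    z, phi = [], []
    running = 0.0
    for value in y:
        running = max(running, -value, 0.0)
        phi.append(running)
        z.append(value + running)
    return z, phi


if __name__ == "__main__":
    z, phi = skorohod_map([0.0, -1.0, -0.5, -2.0, 1.0])
    print(z)    # [0.0, 0.0, 0.5, 0.0, 3.0]
    print(phi)  # [0.0, 1.0, 1.0, 2.0, 2.0]
```

In this small example `phi` increases only at the indices where `z` is zero, which is the discrete counterpart of condition (c).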
Proof Define $\varphi(t)$ by (1.6) above, and $z(t)$ by condition (a) in the statement. Then, it is clear that (a) and (b) are satisfied. To check condition (c), note that Eq. (1.6) shows that $\varphi(0) = 0$ and that $\varphi$ is continuous and increasing. Since $z(t)$ is continuous, the set $Z = \{t : z(t) = 0\}$ is closed. Hence, its complement $Z^c$ is a union of open intervals. Let $(t_1, t_2) \subset Z^c$. We claim that $\varphi(t_2) = \varphi(t_1)$. If not, we have $\varphi(t_2) > \varphi(t_1)$, which means by (1.6) that there is $t_3 \in (t_1, t_2)$ such that $\varphi(t_3) = -y(t_3) > \varphi(t_1)$. But then $z(t_3) = 0$, that is, $t_3 \in Z$, a contradiction. It follows that $d\varphi(t) = 0$ for $t \in Z^c$. To check uniqueness, let $(z_1, \varphi_1)$ be another pair of functions satisfying conditions (a), (b), and (c). Note that $z_1(t) - z(t) = \varphi_1(t) - \varphi(t)$ is a difference of two increasing functions, and so it is a continuous function of bounded variation. We have
\[ \frac{1}{2}|z(t) - z_1(t)|^2 = \int_0^t (z(u) - z_1(u))\,d(\varphi(u) - \varphi_1(u)) = -\int_0^t z(u)\,d\varphi_1(u) - \int_0^t z_1(u)\,d\varphi(u) \le 0, \]
where we have used (c) in the second equality, and (b) and (c) in the inequality. It follows that $z_1(t) = z(t)$ and $\varphi_1(t) = \varphi(t)$.
The Skorohod Lemma enables us to provide a strong construction of RBM, that is, given a Brownian motion $B_t$, we can define an RBM $X_t$ such that the filtrations generated by these two processes coincide.
Theorem 1.6 Let $B_t$ be a Brownian motion starting from 0. There is a unique pair of stochastic processes $(X_t, L_t)$ such that $X_t = X_0 + B_t + L_t$ and $L_t = \sup_{u \le t}\bigl(-(X_0 + B_u) \vee 0\bigr)$. The law of $(X_t, L_t)$ is the same as that of $(|B_t|, L^0_t)$, that is, $X_t$ is a Reflected Brownian motion. Equation (1.4) holds, and also $\mathcal{F}^X = \mathcal{F}^B$.
Proof Since trajectories of $B_t$ are a.s. continuous, we can define $(X_t, L_t)$ as the unique solution to the Skorohod problem associated with $y(t) = X_0 + B_t$, from which the first assertion of the theorem follows. From uniqueness in Skorohod's lemma, we have that $(|B_t|, L^0_t)$ is the unique solution for $y(t) = X_0 + \beta_t$ (with $\beta_t$ as before). The law of the solution $(X_t, L_t)$ is completely determined by the explicit solution to the Skorohod lemma, and so it coincides with the law of $(|B_t|, L^0_t)$, because $B_t$ and $\beta_t$ have the same law. This proves that $X_t$ is an RBM. To show (1.4), recall the sequences of functions $f_n$ and $g_n$ defined in the proof of Proposition 1.4. Since $X_t$ is a semimartingale with martingale part $B_t$, it follows from Itô's formula that
\[ \begin{aligned} f_n(X_t) &= f_n(X_0) + \int_0^t f_n'(X_u)\,dB_u + \int_0^t f_n'(X_u)\,dL_u + \frac{1}{2}\int_0^t g_n(X_u)\,du \\ &= f_n(X_0) + \int_0^t f_n'(X_u)\,dB_u + \frac{1}{2}\int_0^t g_n(X_u)\,du, \end{aligned} \]
because $dL_t$ is supported on $\{X_t = 0\}$ and $f_n'(0) = 0$. Recall that $X_t \ge 0$. Taking the limit as $n$ goes to infinity, and justifying the limit procedures as in Proposition 1.4, we arrive at
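\[ o(\varepsilon) + \varepsilon X_t = \varepsilon X_0 + \int_0^t \Bigl( (X_u - \varepsilon)1_{[0,\varepsilon]}(X_u) + \varepsilon \Bigr)\,dB_u + \frac{1}{2}\int_0^t 1_{[0,\varepsilon]}(X_u)\,du. \tag{1.7} \]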
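We claim that the martingale $\varepsilon^{-1}\int_0^t (X_u - \varepsilon)1_{[0,\varepsilon]}(X_u)\,dB_u$ converges to zero as $\varepsilon \to 0$. Indeed, it has quadratic variation
\[ \varepsilon^{-2}\int_0^t (X_u - \varepsilon)^2\, 1_{[0,\varepsilon]}(X_u)\,du \le \int_0^t 1_{[0,\varepsilon]}(X_u)\,du. \]
It follows that
\[ \mathbb{E}\Bigl[ \varepsilon^{-2}\int_0^t (X_u - \varepsilon)^2\, 1_{[0,\varepsilon]}(X_u)\,du \Bigr] \le \mathbb{E}\int_0^t 1_{[0,\varepsilon]}(|B_u|)\,du = \int_0^t \mathbb{P}(|B_u| \le \varepsilon)\,du. \]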
The last quantity converges to zero as $\varepsilon$ goes to zero. This can be seen, for instance, by dominated convergence. Dividing (1.7) by $\varepsilon$ and taking $\varepsilon$ to zero yields
\[ X_t = X_0 + B_t + \lim_{\varepsilon \to 0}\frac{1}{2\varepsilon}\int_0^t 1_{[0,\varepsilon]}(X_u)\,du. \]
Fig. 2 Typical paths of functions in Skorohod reflection of Brownian motion (horizontal axis: time; vertical axis: space; curves: Brownian motion and local time)
Comparing this equation with $X_t = X_0 + B_t + L_t$, we obtain (1.4). It is clear from the explicit solution to the Skorohod Lemma that $X_t$ is $\mathcal{F}^B_t$-measurable, and so $\mathcal{F}^X_t \subset \mathcal{F}^B_t$. To show the converse, note that Eq. (1.4) shows that $L_t$ is $\mathcal{F}^X_t$-measurable, and so is $X_t - X_0 - L_t = B_t$ (Fig. 2).
The Skorohod Lemma redefines our notion of Reflected Brownian motion. We started with specular reflection, and ended with a constrained process which, away from the boundary of its domain, behaves as a Brownian motion. Can we extend this idea to other driving processes? Namely, instead of reflecting a BM, can we constrain other diffusions in a similar way?
Exercise Let $X_t$ be an RBM. Prove that
\[ \int_0^\infty 1_{\{0\}}(X_t)\,dt = 0. \]
2 Stochastic Differential Equations with Reflection
Let $B_t$ be a Brownian motion. We are interested in finding a process $X_t \ge 0$ satisfying the following equation:
\[ dX_t = \sigma(X_t)\,dB_t + b(X_t)\,dt + dL_t, \tag{2.1} \]
where the coefficients $\sigma(\cdot)$ and $b(\cdot)$ are Lipschitz continuous in $\mathbb{R}_+$, and $\sigma(x) \ge 0$. The process $L_t$ must be increasing, continuous, and it must satisfy $dL_t = 1_{\{0\}}(X_t)\,dL_t$.
Definition 2.1 A process $X_t$ satisfying (2.1) will be called a Reflected diffusion.
Reflected Brownian Motion with Drift Let's consider the case $\sigma \equiv \sigma_0 > 0$ and $b \equiv b_0 \in \mathbb{R}$. In this case, we can use the Skorohod lemma again to find the processes $(X_t, L_t)$. This time the solution is written slightly differently from, but equivalently to, Eq. (1.6):
\[ L_t = \sup_{u \le t}\bigl( (-X_0 - \sigma_0 B_u - b_0 u) \vee 0 \bigr). \]
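The explicit formula for $L_t$ makes Reflected Brownian motion with drift easy to simulate. The Python sketch below is a hedged illustration: the driving Brownian motion is replaced by a Gaussian random walk, the function name `reflected_bm_with_drift` and the default parameters are our own choices, and the comparison value in the final comment relies on the classical fact (not proved in this section) that for $b_0 < 0$ the stationary law is exponential with mean $\sigma_0^2 / (2|b_0|)$.

```python
import math
import random


def reflected_bm_with_drift(x0=0.0, sigma0=1.0, b0=-1.0, horizon=500.0,
                            n_steps=50000, seed=0):
    """Return lists (X, L) on a uniform grid, using the explicit Skorohod formula.

    free is the unconstrained path y(t) = X_0 + sigma0 * B_t + b0 * t, and
    L_t is the running maximum of max(-y(u), 0) over u <= t.
    """
    if x0 < 0:
        raise ValueError("the initial condition must be nonnegative")
    rng = random.Random(seed)
    dt = horizon / n_steps
    free, local_time = x0, 0.0
    xs, ls = [x0], [0.0]
    for _ in range(n_steps):
        free += sigma0 * rng.gauss(0.0, math.sqrt(dt)) + b0 * dt
        local_time = max(local_time, -free)
        xs.append(free + local_time)
        ls.append(local_time)
    return xs, ls


if __name__ == "__main__":
    xs, ls = reflected_bm_with_drift()
    print("time average of X :", round(sum(xs) / len(xs), 3))
    print("final local time  :", round(ls[-1], 3))
    # Continuous-time stationary mean for these defaults: 1 / (2 * 1) = 0.5;
    # the time-discretized path is expected to give a somewhat smaller value.
```

By construction, `xs` stays nonnegative and `ls` increases only at grid times where `xs` is zero, mirroring the requirement $dL_t = 1_{\{0\}}(X_t)\,dL_t$.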
Theorem 2.2 A unique strong solution exists for Eq. (2.1).
Our approach to finding a solution to (2.1) will be analytical. But let's make some remarks first:
• There is no reason to expect that $X_t = |X_0 + B_t|$ solves the equation.
• Because of the dependence of the coefficients on $X_t$, the Skorohod lemma only provides an implicit solution. More precisely, provided we have a solution $X_t$, the Skorohod lemma ensures that
\[ L_t = \sup_{u \le t}\Bigl( \Bigl( -X_0 - \int_0^u \sigma(X_s)\,dB_s - \int_0^u b(X_s)\,ds \Bigr) \vee 0 \Bigr). \]
• The process is an Itô diffusion up to the first time it hits level zero. The essential problem is how to define the local time after this time.
Proof Fix $T > 0$, and consider the Banach space $H_T$ of continuous adapted processes $X$ such that $\mathbb{E}\sup_{u \le T}|X_u|^2 < \infty$, equipped with the norm
\[ \|X\|_T = \Bigl( \mathbb{E}\sup_{u \le T}|X_u|^2 \Bigr)^{1/2}. \]
For $X \in H_T$, let $S(X)$ be the first component of the solution to the Skorohod problem with input $y(t) = X_0 + \int_0^t \sigma(X_u)\,dB_u + \int_0^t b(X_u)\,du$. We will show that $S(\cdot)$ is a contracting map if $T$ is small enough. Applying Itô's formula, if $X, Y \in H_T$, we deduce that²
\[ \begin{aligned} |S(X)_t - S(Y)_t|^2 ={}& 2\int_0^t (S(X)_u - S(Y)_u)(\sigma(X_u) - \sigma(Y_u))\,dB_u \\ &+ 2\int_0^t (S(X)_u - S(Y)_u)(b(X_u) - b(Y_u))\,du \\ &+ 2\int_0^t (S(X)_u - S(Y)_u)\,d(L^X_u - L^Y_u) + \int_0^t |\sigma(X_u) - \sigma(Y_u)|^2\,du. \end{aligned} \]
2 This is a good exercise in stochastic calculus. I recommend following it closely and filling in the minor gaps.
Note that $(S(X)_u - S(Y)_u)\,d(L^X_u - L^Y_u) = -S(X)_u\,dL^Y_u - S(Y)_u\,dL^X_u \le 0$. Hence,
\[ \begin{aligned} \sup_{s \le t}|S(X)_s - S(Y)_s|^2 \le{}& 2\sup_{s \le t}\int_0^s (S(X)_u - S(Y)_u)(\sigma(X_u) - \sigma(Y_u))\,dB_u \\ &+ 2\int_0^t |S(X)_u - S(Y)_u| \cdot |b(X_u) - b(Y_u)|\,du \\ &+ \int_0^t |\sigma(X_u) - \sigma(Y_u)|^2\,du. \end{aligned} \]
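By the Burkholder–Davis–Gundy inequality, and the Lipschitz character of the coefficients of (2.1), we obtain
\[ \begin{aligned} \mathbb{E}\sup_{s \le t}|S(X)_s - S(Y)_s|^2 \le{}& C\,\mathbb{E}\Bigl( \int_0^t (S(X)_u - S(Y)_u)^2 (X_u - Y_u)^2\,du \Bigr)^{1/2} \\ &+ C\,\mathbb{E}\int_0^t \Bigl( |S(X)_s - S(Y)_s|\,|X_s - Y_s| + |X_s - Y_s|^2 \Bigr)\,ds. \end{aligned} \]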
We will next use the fact that for every $a, b, \varepsilon > 0$, it holds that
\[ ab = \sqrt{2\varepsilon}\,a \cdot \frac{b}{\sqrt{2\varepsilon}} \le \varepsilon a^2 + \frac{b^2}{4\varepsilon}. \]
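For completeness, this weighted form of Young's inequality can be checked in one line by expanding a square. The display below is our own short verification, not part of the original argument:

```latex
0 \le \Bigl( \sqrt{2\varepsilon}\,a - \frac{b}{\sqrt{2\varepsilon}} \Bigr)^{2}
  = 2\varepsilon a^{2} - 2ab + \frac{b^{2}}{2\varepsilon}
\quad\Longrightarrow\quad
ab \le \varepsilon a^{2} + \frac{b^{2}}{4\varepsilon}.
```

The free parameter $\varepsilon$ is what allows the term involving $S(X) - S(Y)$ to be absorbed into the left-hand side of the main bound.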
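It follows that
\[ \begin{aligned} \mathbb{E}\Bigl( \int_0^t (S(X)_u - S(Y)_u)^2 (X_u - Y_u)^2\,du \Bigr)^{1/2} &\le \mathbb{E}\Bigl[ \sup_{u \le t}|S(X)_u - S(Y)_u| \Bigl( \int_0^t |X_u - Y_u|^2\,du \Bigr)^{1/2} \Bigr] \\ &\le \varepsilon\,\mathbb{E}\sup_{u \le t}|S(X)_u - S(Y)_u|^2 + \frac{1}{4\varepsilon}\,\mathbb{E}\int_0^t |X_u - Y_u|^2\,du. \end{aligned} \]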
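Similarly,
\[ \begin{aligned} \mathbb{E}\int_0^t |S(X)_s - S(Y)_s|\,|X_s - Y_s|\,ds &\le \mathbb{E}\int_0^t \Bigl( \frac{\varepsilon}{t}|S(X)_s - S(Y)_s|^2 + \frac{t}{4\varepsilon}|X_s - Y_s|^2 \Bigr)\,ds \\ &\le \varepsilon\,\mathbb{E}\sup_{s \le t}|S(X)_s - S(Y)_s|^2 + \frac{t}{4\varepsilon}\,\mathbb{E}\int_0^t |X_s - Y_s|^2\,ds. \end{aligned} \]
Using these observations in the main bound above, and rearranging terms, we obtain that for some $C_1 > 0$, the inequality
\[ \mathbb{E}\sup_{s \le t}|S(X)_s - S(Y)_s|^2 \le C_1(1 + t)\int_0^t \mathbb{E}\sup_{s \le u}|X_s - Y_s|^2\,du \tag{2.2} \]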
holds. If $T$ is small enough so that $C_1(1 + T)T < 1$, then the inequality above implies that $\|S(X) - S(Y)\|_T \le (C_1(1 + T)T)^{1/2}\,\|X - Y\|_T$ with $(C_1(1 + T)T)^{1/2} < 1$, which proves our claim. Since any solution to (2.1) is a fixed point of $S(\cdot)$, existence and uniqueness in $H_T$ follow from Banach's fixed point theorem. To extend existence and uniqueness to all of $[0, \infty)$, take $S(X)_T$ as the new initial condition $X_0$ and repeat the argument inductively.
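The proof above is a fixed-point argument and does not by itself give a simulation recipe. A common numerical substitute, which is not the method of these notes, is a projected Euler scheme: take an ordinary Euler step for (2.1) and, whenever the proposal is negative, push it back to zero and record the push as the increment of $L$. The Python sketch below illustrates this under stated assumptions; the function name `reflected_euler` and the example coefficients are hypothetical choices of ours.

```python
import math
import random


def reflected_euler(sigma, b, x0=0.5, horizon=1.0, n_steps=1000, seed=0):
    """Projected Euler scheme for dX = sigma(X) dB + b(X) dt + dL on [0, infinity).

    Returns lists (X, L) on a uniform grid. L accumulates the pushes needed
    to keep X nonnegative, so it grows only at steps where X is set to zero.
    """
    if x0 < 0:
        raise ValueError("the initial condition must be nonnegative")
    rng = random.Random(seed)
    dt = horizon / n_steps
    x, local_time = x0, 0.0
    xs, ls = [x], [local_time]
    for _ in range(n_steps):
        proposal = x + sigma(x) * rng.gauss(0.0, math.sqrt(dt)) + b(x) * dt
        push = max(-proposal, 0.0)  # one-step Skorohod correction
        x = proposal + push
        local_time += push
        xs.append(x)
        ls.append(local_time)
    return xs, ls


if __name__ == "__main__":
    # Example Lipschitz coefficients with sigma >= 0 (illustrative assumptions).
    xs, ls = reflected_euler(sigma=lambda x: 1.0 + 0.5 * math.sin(x),
                             b=lambda x: -x)
    print("min of X :", min(xs))
    print("L at T   :", round(ls[-1], 4))
```

The one-step correction is the discrete Skorohod map applied to a single increment, so the scheme respects the three requirements on $L$: it starts at zero, it is nondecreasing, and it moves only when the state is at the boundary.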
2.1 SDEs in Higher Dimensions
For simplicity, let's consider a smooth, bounded domain $D \subset \mathbb{R}^d$, where $d > 1$. The aim is to set up a stochastic differential equation for a process $X_t$ such that:
• the process $X_t$ has trajectories in the closure of $D$,
• it behaves as an Itô diffusion when it is away from the boundary $\partial D$,
• once it hits the boundary, it is pushed immediately back into the domain $D$,
• it spends no Lebesgue time on the boundary, that is,
\[ \int_0^\infty 1_{\partial D}(X_t)\,dt = 0. \tag{2.3} \]
Once again, the main difficulty is to appropriately define a reflecting mechanism. In one dimension, such a mechanism was the local time. Since the shape of the boundary $\partial D$ is, in general, unknown, there is little hope of coming up with a closed formula such as (1.6), and we will need to focus on the analytical properties of the local time instead; that is, the (one-dimensional) local time is the minimal additive functional such that $X_t = B_t + L_t$, and it acts only when $X_t = 0$.
Definition 2.3 Let $B_t$ be a $d$-dimensional Brownian motion. Consider Lipschitz functions $\sigma : \overline{D} \to \mathbb{R}^{d \times d}$ and $b : \overline{D} \to \mathbb{R}^d$, such that $\sigma(x)$ is positive semidefinite for all $x \in \overline{D}$. Denote the inner normal vector at $x \in \partial D$ by $n(x)$, and let $\gamma : \partial D \to \mathbb{R}^d$ be a continuous, bounded function such that $\gamma(x)^T n(x) \ge 1$ for every $x \in \partial D$. A continuous strong Markov process $X_t$ with trajectories in $\overline{D}$ is a Reflected diffusion with coefficients $\sigma, b, \gamma$ if it can be written as
\[ dX_t = \sigma(X_t)\,dB_t + b(X_t)\,dt + \gamma(X_t)\,dL_t, \tag{2.4} \]
where $L_t$ is a continuous, increasing additive functional such that $L_0 = 0$ and $dL_t = 1_{\partial D}(X_t)\,dL_t$.
Fig. 3 Typical path of Reflected Brownian motion in a planar domain
Note that in the definition we have asked for $\gamma(x)^T n(x) \ge 1$ on the boundary of $D$. If we set $c(x) = \gamma(x)^T n(x)$, $\tilde{\gamma}(x) = c(x)^{-1}\gamma(x)$, and $d\tilde{L}_t = c(X_t)\,dL_t$, then $\tilde{\gamma}$ and $\tilde{L}$ can replace their counterparts in Eq. (2.4), and $\tilde{\gamma}(x)^T n(x) = 1$ on $\partial D$. The process $L$ can no longer be represented as the local time in (1.4), because the diffusion matrix $\sigma$ alters the clock at which the process runs with respect to Brownian paths, and the formula needs to take care of this time change. When $\sigma$ satisfies a normalization condition on the boundary and $b \equiv 0$, it is possible to show the following result, with not much more complication than our proof of Eq. (1.4) within Theorem 1.6 (Fig. 3).
Proposition 2.4 In the context of Definition 2.3, assume that $b = 0$ and that, for all $x \in \partial D$, $|\sigma(x)n(x)| = 1$ and $\gamma(x)^T n(x) = 1$. Then,
\[ L_t = \lim_{\varepsilon \to 0}\frac{1}{2\varepsilon}\int_0^t 1\{\operatorname{dist}(X_u, \partial D) < \varepsilon\}\,du. \tag{2.5} \]
For a proof of this proposition, the reader can consult [20, Chapter IV, Section 7], where the problem is reduced to the one-dimensional case, so that Theorem 1.6 can be applied.
Remark 2.5 It follows directly from the previous proposition that (2.3) holds.
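To make the higher-dimensional setting concrete, the following Python sketch simulates the simplest case covered by Proposition 2.4: Brownian motion in the unit disk with normal reflection, that is, $\sigma$ equal to the identity, $b = 0$ and $\gamma = n$. It uses a projection scheme, a standard approximation that is our own choice and not taken from these notes; the function name `reflected_bm_in_disk` is likewise hypothetical.

```python
import math
import random


def reflected_bm_in_disk(n_steps=5000, horizon=5.0, seed=0):
    """Projection scheme for Brownian motion in the closed unit disk.

    After each Gaussian step, a point that left the disk is moved back to
    the boundary along the inward normal n(x) = -x / |x|, and the length of
    that move is added to the approximate local time L.
    """
    rng = random.Random(seed)
    step_sd = math.sqrt(horizon / n_steps)
    x, y, local_time = 0.0, 0.0, 0.0
    path, ls = [(x, y)], [local_time]
    for _ in range(n_steps):
        x += rng.gauss(0.0, step_sd)
        y += rng.gauss(0.0, step_sd)
        radius = math.hypot(x, y)
        if radius > 1.0:
            local_time += radius - 1.0       # overshoot beyond the boundary
            x, y = x / radius, y / radius    # projection onto the unit circle
        path.append((x, y))
        ls.append(local_time)
    return path, ls


if __name__ == "__main__":
    path, ls = reflected_bm_in_disk()
    print("largest radius visited :", max(math.hypot(px, py) for px, py in path))
    print("approximate L at T     :", round(ls[-1], 4))
```

The path stays in the closed disk and the recorded local time changes only at steps that end on the boundary, in line with $dL_t = 1_{\partial D}(X_t)\,dL_t$.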
2.2 An Example in Two Dimensions: Heavy Traffic Queues
Multidimensional Reflected Brownian motion has been successfully applied in queueing and storage theory, and was studied as early as the 1960s. Kingman [23] suggested that queues in heavy traffic could be modeled by a multidimensional Reflected Brownian motion in an orthant, which shed new light on the field. Kingman's hypothesis was later proved by Iglehart and Whitt [19] for multiple-channel queues in heavy traffic. Even though the initial interest in studying these processes was their connection to heavy traffic queues, the subject quickly developed into a research field of its own, extending from Brownian motion to general diffusions and semimartingales. To keep these notes self-contained, we will briefly describe a system of two servers in series, and describe their behavior heuristically. For a deeper treatment of this theory, the reader can consult the book [37] and the articles [16, 17, 19, 23]. The interarrival times to server one ($S$) are exponential with rate $\lambda$, that is, the number of customers that have arrived at server one by time $t$ is a Poisson process of rate $\lambda$. Assume that only one customer is served at a time, at an exponential time of rate $\mu$. Once served by $S$, customers move on to the second server ($Q$), where they are served one at a time, at an exponential time of rate $\nu$. Let $S_t$ be the number of customers in line at the first server at time $t \ge 0$, and let $Q_t$ be the analogous process for the second server. Note that $S_t$ can be computed as the difference of two independent Poisson processes, and so $S_t - (\lambda - \mu)t$ is a martingale. Kingman's hypothesis was partly based on the fact that, if we let $S^n_t$ be the process just described with arrival rate $n\lambda$ and service rate $n\mu$, then $S^n_t - n(\lambda - \mu)t$ is a martingale, and the process
\[ X^n_t = S^n_t/n - (\lambda - \mu)t \tag{2.6} \]
converges to a Brownian motion with variance $|\lambda - \mu|$, reflected at zero. The fact that $S^n$ has independent increments, together with the Central Limit Theorem, gives some evidence of this. Roughly speaking, we have
$$X^n_t \approx (\lambda - \mu)B^1_t + L^1_t, \tag{2.7}$$
where $B^1_t$ is some BM, and $L^1_t$ only increases when $X^n_t = 0$, that is, when the first server is empty.

Next, let us look at the process $Q_t$. The analysis of the previous paragraph can almost be mimicked, but we need to point out that the arrival rate for $Q_t$ is $\mu$ if and only if $S_t > 0$; if $S_t = 0$, the arrival rate at that time is zero. Again, denote by $Q^n_t$ the process for the second server when the first one is described by $S^n_t$ and the second server has service rate $n\nu$. The process $Y^n_t = Q^n_t/n - (\mu - \nu)t$ will not be a martingale, but as in the previous case we will have $Y^n_t \approx (\mu - \nu)B^2_t + L^2_t - \kappa L^1_t$. The first two terms correspond to an RBM ($dL^2_t$ is supported on $\{Y^n_t = 0\}$), and the last term comes from the fact that when the first server is empty, the second server can only decrease the number of clients waiting to be served. Here, $\kappa$ is some positive constant. We obtain the vector equation
$$d\begin{pmatrix} X^n_t \\ Y^n_t \end{pmatrix} \approx \begin{pmatrix} \lambda - \mu & 0 \\ 0 & \mu - \nu \end{pmatrix} d\begin{pmatrix} B^1_t \\ B^2_t \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ -\kappa & 1 \end{pmatrix} d\begin{pmatrix} L^1_t \\ L^2_t \end{pmatrix},$$
Fig. 4 The two server network system under heavy traffic
for a Reflected diffusion in the first quadrant of the plane. Its boundary is composed of the positive parts of the $x$ and $y$ axes. If we set $dL_t = \mathbf{1}_{\{X^n_t = 0\}}\,dL^1_t + \mathbf{1}_{\{Y^n_t = 0\}}\,dL^2_t$, then this equation can be written in the form of Eq. (2.4). What would the vector $\gamma$ be in this case on each part of the boundary? (Fig. 4)

The list of applications of equations with reflection in the multidimensional setting is vast, and it has touched several applied areas of mathematics. These include heavy traffic analysis of queueing networks [2, 11, 12, 30, 31, 34, 36], control theory, game theory and mathematical economics [3, 25, 32, 33, 42], molecular dynamics [38–40], and image processing [6]. This is just a small set of examples of its applicability.3
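The one-dimensional reflection behind (2.7), where the pushing term grows only while the process sits at zero, can be illustrated in discrete time. The following is a minimal sketch, assuming a hypothetical free path given by a Gaussian random walk with negative drift; the function name `skorokhod_reflect` and all numerical values are illustrative and do not come from the text.

```python
import random

def skorokhod_reflect(path):
    """Reflect a discrete path at zero.

    Returns (x, l) with x[k] = path[k] + l[k] >= 0, where
    l[k] = max(0, -min(path[0..k])) is the smallest nondecreasing
    push that keeps x nonnegative.
    """
    x, l = [], []
    running_min = 0.0
    for w in path:
        running_min = min(running_min, w)
        push = -running_min  # running_min <= 0, so push >= 0
        l.append(push)
        x.append(w + push)
    return x, l

# Hypothetical free path: a random walk with negative drift,
# standing in for the unreflected part of (2.7).
rng = random.Random(0)
free_path = [0.0]
for _ in range(1000):
    free_path.append(free_path[-1] - 0.05 + rng.gauss(0.0, 1.0))

x, l = skorokhod_reflect(free_path)
```

In this sketch the push `l` plays the role of $L^1_t$: it is nondecreasing and changes only at steps where the reflected path `x` is at zero.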
2.3 The Submartingale Problem

The process described by Eq. (2.4) is part of a larger class of diffusions described by a second-order differential operator plus a boundary condition. Its infinitesimal generator can be found via Itô's formula, and it corresponds to
$$Lf(x) = \frac{1}{2}\sum_{i,j=1}^{d} (\sigma(x)^T \sigma(x))_{ij}\,\frac{\partial^2 f}{\partial x_i \partial x_j}(x) + \sum_{j=1}^{d} b(x)_j\,\frac{\partial f}{\partial x_j}(x), \tag{2.8}$$
acting on functions $f \in C_0^2(\mathbb{R}^d)$. A general class of boundary conditions was found by Wentzell [44]. If the reader is interested, the construction of a Reflected diffusion in terms of stochastic differential equations and the Skorohod problem can be found in [8, 10, 20, 26, 29], among many others.
3 We must give credit for this compilation of applications to the authors of [28].
A crucial consequence of (2.4) comes from applying Itô's formula. For any $f \in C_0^2(\mathbb{R}^d)$ and $0 \le s < t$, we have
$$f(X_t) = f(X_s) + \int_s^t \nabla f(X_u)^T \sigma(X_u)\,dB_u + \int_s^t Lf(X_u)\,du + \int_s^t \nabla f(X_u)^T \gamma(X_u)\,dL_u.$$
If $\nabla f(x)^T \gamma(x) \ge 0$ on $\partial D$, then we will have that
$$f(X_t) - \int_0^t Lf(X_u)\,du \tag{2.9}$$
is a submartingale.

One very successful way of constructing Reflected diffusions was developed by Stroock and Varadhan [43]. Their submartingale problem proved to be a successful extension of the ideas they developed to treat the well-known martingale problem. The following survey of the submartingale problem is based on their original presentation.

Let $D$ be a non-empty, open subset of $\mathbb{R}^d$ such that:
(i) there exists $\phi \in C_b^2(\mathbb{R}^d; \mathbb{R})$ such that $D = \phi^{-1}((0, \infty))$ and $\partial D = \phi^{-1}(\{0\})$,
(ii) $|\nabla\phi(x)| \ge 1$ for all $x \in \partial D$.

The following functions will also be given:
(i) $a : [0, \infty) \times D \to \mathbb{R}^d \times \mathbb{R}^d$, which is bounded, continuous, and positive semidefinite,
(ii) $b : [0, \infty) \times D \to \mathbb{R}^d$, which is bounded and continuous,
(iii) $\gamma : [0, \infty) \times \partial D \to \mathbb{R}^d$, which is bounded, continuous, and satisfies $\nabla\phi(x)^T \gamma(t, x) \ge \beta > 0$ for $t \ge 0$ and $x \in \partial D$.

Define, for $u \ge 0$ and $x \in D$,
$$L_u = \frac{1}{2}\sum_{i,j=1}^{d} a_{i,j}(u, x)\,\frac{\partial^2}{\partial x_i \partial x_j} + \sum_{i=1}^{d} b_i(u, x)\,\frac{\partial}{\partial x_i}; \tag{2.10}$$
and, for $u \ge 0$ and $x \in \partial D$,
$$B_u = \sum_{i=1}^{d} \gamma_i(u, x)\,\frac{\partial}{\partial x_i}.$$

Definition 2.6 We say that a probability measure $P$ on $(\Omega, \mathcal{F})$ solves the submartingale problem on $D$ for $(L_u, B)$ if $P(X_t \in \overline{D}) = 1$ for $t \ge 0$, and
$$S_t[f] := f(t, X_t) - \int_0^t \mathbf{1}_D(X_u)\Big(\frac{\partial f}{\partial u} + L_u f\Big)(u, X_u)\,du \tag{2.11}$$
is a $P$-submartingale for any $f \in C_0^{1,2}([0, \infty) \times \mathbb{R}^d)$ satisfying
$$B_t f \ge 0 \quad \text{on } [0, \infty) \times \partial D. \tag{2.12}$$
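As a one-dimensional illustration of this definition, consider the following sketch. Its setting is an assumption made only for this example and is not taken from the text: $D = (0, \infty)$, $L = \frac{1}{2}\frac{d^2}{dx^2}$, $\gamma \equiv 1$, and $X_t = |W_t|$ for a standard Brownian motion $W$ started at zero, with $\ell_t$ denoting its local time at zero. For a time-independent $f \in C_0^2(\mathbb{R})$, the Itô-Tanaka formula gives

```latex
f(X_t) - \frac12\int_0^t f''(X_u)\,du
  = f(0) + \int_0^t f'(X_u)\,\operatorname{sgn}(W_u)\,dW_u + f'(0)\,\ell_t .
```

The stochastic integral is a martingale and $\ell_t$ is non-decreasing, so the left-hand side is a submartingale whenever $f'(0) \ge 0$, which is condition (2.12) for this choice of $\gamma$.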
We say the submartingale problem is well-posed if it has a unique solution.

Remark 2.7 The indicator in the integrand in (2.11) can be omitted because, by Remark 2.5, Eq. (2.3) holds. We included it to respect the original definition, as introduced in [43].

As we showed in Eq. (2.9), it is simple to establish that any solution to (2.4) provides a solution to the associated submartingale problem. The key result of [43] is that the converse is also true.

Theorem 2.8 Assume that the submartingale problem on $D$ for $(L, B)$ is well-posed. Then its solution provides a weak solution to Eq. (2.4).

Proof This will be a sketch; the original proof can be found in Theorems 2.4 and 2.5 of [43]. Since the coefficients in (2.4) do not explicitly involve time, our test functions can be taken "independent" of the $t$ coordinate. Also, we will assume $b \equiv 0$ to focus on the main ingredients of the Reflected diffusion. The general case is analogous, or can be recovered from this one by a Girsanov transformation. Let $X^*_t$ be the solution to the submartingale problem (the process is given by the law $P$ in the definition). Consider any $f \in C_0^2(\mathbb{R}^d)$ such that (2.12) holds on $\partial D$. Then $S_t[f]$ is a continuous submartingale, and by the Doob-Meyer decomposition there is a unique decomposition
$$S_t[f] = M_t[f] + A_t[f], \tag{2.13}$$
where $M_t[f]$ is a martingale, $A_0[f] = 0$, and $A_t[f]$ is a continuous additive functional of bounded variation. We will assume for now that $D$ is bounded. First, we identify the martingale part by computing the quadratic variation of $S_t[f]$. This is achieved by comparing the processes $S_t[f^2]$ and $(S_t[f])^2$, together with some Itô calculus for semimartingales. We obtain that
$$\langle S[f] \rangle_t = \langle M[f] \rangle_t = \int_0^t |\nabla f(X^*_u)^T \sigma(X^*_u)|^2\,du. \tag{2.14}$$
The reader familiar with Itô's formula will observe that this is exactly the quadratic variation of $f(X_t)$ for a solution $X_t$ of Eq. (2.4), so we are on the right track. Since $D$ is bounded, we can select functions $f_j \in C_0^2(\mathbb{R}^d)$ such that $f_j(x) = x_j$ in $D$. Hence $Lf_j \equiv 0$, since $b \equiv 0$ and the second-order derivatives of $f_j$ vanish in $D$. We obtain
$$\langle M[f_j] \rangle_t = \int_0^t \sum_{k=1}^{d} |\sigma(X^*_u)_{jk}|^2\,du.$$
Theorem 3.4.2 in [22] and the argument about diagonalizing the positive semidefinite matrix $\sigma^T \sigma(X^*_t)$ can then be used to show that there are $d$ independent Brownian motions $(B^j_t)_{j=1}^{d}$ such that $dM_t[f_j] = \sum_{k=1}^{d} \sigma(X^*_t)_{jk}\,dB^k_t$. Using (2.14) again, we obtain, for $B_t = (B^1_t, \dots, B^d_t)$ and every $f \in C_0^2(\mathbb{R}^d)$,
$$M_t[f] = f(X^*_0) + \int_0^t \nabla f(X^*_u)^T \sigma(X^*_u)\,dB_u, \tag{2.15}$$
as desired.

Next, we deal with the additive functional of bounded variation. First, assume $D$ is bounded and consider a test function $\varphi \in C_0^2(D)$ such that $\nabla\varphi(x)^T \gamma(x) \equiv 1$ on $\partial D$. Such a function can be a regular version of the distance to the boundary; see Section 14.6 in [15]. Set $L^*_t = A_t[\varphi]$, our candidate for the local time in (2.4). Fix any $f \in C_0^2(\mathbb{R}^d)$, even if it does not satisfy (2.12), and let $\alpha = \max_{x \in \partial D} \nabla f(x)^T \gamma(x)$ and $\beta = \min_{x \in \partial D} \nabla f(x)^T \gamma(x)$. The functions $f^{\varphi} = \alpha\varphi - f$ and $f_{\varphi} = f - \beta\varphi$ satisfy (2.12), which means that $A_t[f^{\varphi}]$ and $A_t[f_{\varphi}]$ are increasing. Note that since $S_t[\cdot]$ and $M_t[\cdot]$ are linear, so is $A_t[\cdot]$. Thus it is natural to define $A_t[f] := \alpha L^*_t - A_t[f^{\varphi}] = \beta L^*_t + A_t[f_{\varphi}]$. It follows that $\alpha\,dL^*_t \ge dA_t[f] \ge \beta\,dL^*_t$; that is, $dA_t[f]$ is absolutely continuous with respect to $dL^*_t$, and the Radon-Nikodym derivative is bounded between $\beta$ and $\alpha$. With some more technical details, but using the same ideas as before, it can be proven that for any measurable $K \subset \partial D$,
$$\mathbf{1}_K(X^*_t)\,\max_K\big(\nabla f^T \gamma\big)\,dL^*_t \ \ge\ \mathbf{1}_K(X^*_t)\,dA_t[f] \ \ge\ \mathbf{1}_K(X^*_t)\,\min_K\big(\nabla f^T \gamma\big)\,dL^*_t,$$
from which we deduce, by continuity, that $dA_t[f] = \mathbf{1}_{\partial D}(X^*_t)\,\nabla f(X^*_t)^T \gamma(X^*_t)\,dL^*_t$. It follows that
$$f(X^*_t) = f(X^*_0) + \int_0^t \nabla f(X^*_u)^T \sigma(X^*_u)\,dB_u + \int_0^t Lf(X^*_u)\,du + \int_0^t \nabla f(X^*_u)^T \gamma(X^*_u)\,dL^*_u. \tag{2.16}$$
By using $f_j(x) = x_j$ as before, for which $Lf_j \equiv 0$, we deduce (2.4). The case of unbounded $D$ follows from a standard stopping and localization argument, with some technical requirements on $D$ that ensure that pathwise uniqueness holds for (2.4) (see [8, 26]).
3 Stationary Measures for Reflected Diffusions

This section is mostly a collection of known results on stationary distributions of diffusions with boundary conditions. We start with some basic definitions and a very useful functional characterization due to Weiss [45], following his presentation closely.

Definition 3.1 Let $P(t)$ be the strongly continuous contraction semigroup associated to some Markov process with state space $E$. A probability measure $\mu$ on $E$ is called a stationary distribution for such a process if
$$\mu(A) = \int_E P(t)\mathbf{1}_A(x)\,\mu(dx) \tag{3.1}$$
for all Borel sets $A$. When the property above holds for a general (non-finite) measure, we call $\mu$ an invariant measure.

It is clear that a probability measure $\mu$ is stationary if and only if
$$\int_E f(x)\,\mu(dx) = \int_E P(t)f(x)\,\mu(dx), \tag{3.2}$$
for any $f \in C_b(E)$ and $t > 0$.

A stationary distribution can often be found as a limit of long-time averages. Indeed, let $P(t)$ be the semigroup of a Markov process $Z_t$. For any fixed probability measure $\mu_0$ on the state space, define the occupation time measures by
$$\mu_t(A) = \frac{1}{t}\int_0^t P_{\mu_0}(Z_s \in A)\,ds = \frac{1}{t}\int_0^t \int_E P(s)\mathbf{1}_A(x)\,\mu_0(dx)\,ds.$$
Whenever this family of measures has a convergent subsequence, for example when the state space is compact, any of its limits will be a stationary distribution. Indeed, we have the following general lemma.

Lemma 3.2 Let $P(t)$ be a Feller semigroup associated to a process $Z_t$ with state space $E$, and let $\mu$ be a limit point of the family of occupation time measures. Then $\mu$ is a stationary distribution for $Z$.

Proof Let $\mu_n = \mu_{t_n}$ be a sequence in the aforementioned family, converging weakly to a probability measure $\mu$, and let $f \in C_b(E)$. For any $t > 0$,
$$\begin{aligned}
\int_E P(t)f(x)\,\mu(dx) &= \lim_n \int_E P(t)f(x)\,\mu_n(dx) \\
&= \lim_n \frac{1}{t_n}\int_0^{t_n} \int_E P(u)P(t)f(x)\,\mu_0(dx)\,du \\
&= \lim_n \frac{1}{t_n}\int_0^{t_n} \int_E P(u + t)f(x)\,\mu_0(dx)\,du \\
&= \lim_n \frac{1}{t_n}\int_t^{t_n + t} \int_E P(v)f(x)\,\mu_0(dx)\,dv.
\end{aligned}$$
Since $|P(t)f(x)| \le \|f\|_\infty$, we have the following estimate
$$\left| \frac{1}{t_n}\int_t^{t_n + t} \int_E P(v)f(x)\,\mu_0(dx)\,dv - \frac{1}{t_n}\int_0^{t_n} \int_E P(v)f(x)\,\mu_0(dx)\,dv \right| \le \frac{2\|f\|_\infty\, t}{t_n},$$
which converges to zero as $t_n \to \infty$. Therefore,
$$\int_E P(t)f(x)\,\mu(dx) = \lim_n \frac{1}{t_n}\int_0^{t_n} \int_E P(v)f(x)\,\mu_0(dx)\,dv = \lim_n \int_E f(x)\,\mu_n(dx) = \int_E f(x)\,\mu(dx),$$
which shows that $\mu$ is stationary by (3.2).
Roughly speaking, this result states that a stationary distribution represents the average time that the diffusion spends in Borel sets. An interesting consequence of this result is that, if there is only one stationary distribution, then it can be characterized as the limit of $\mu_t$ as $t$ goes to infinity, for any starting measure $\mu_0$. The result avoids the question of convergence of the occupation time measures, and one has to draw upon techniques such as Lyapunov functions to obtain existence of limit points. For a detailed discussion of the matter we refer the reader to Ethier and Kurtz [13, Chapter 4, Section 9].
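The averaging mechanism of Lemma 3.2 can be seen numerically in a discrete-time, finite-state analogue. This is a minimal sketch under assumptions made purely for illustration: the two-state transition matrix `P`, the starting law `mu0`, and the helper names are hypothetical, and the time integral is replaced by an average over steps.

```python
def step(dist, P):
    """One step of a finite Markov chain: returns the row vector dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def occupation_average(mu0, P, n_steps):
    """Average of the laws mu0 * P^s over s = 0, ..., n_steps - 1."""
    avg = [0.0] * len(mu0)
    dist = list(mu0)
    for _ in range(n_steps):
        avg = [a + d for a, d in zip(avg, dist)]
        dist = step(dist, P)
    return [a / n_steps for a in avg]

# Hypothetical two-state chain; its stationary law solves pi = pi * P.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = [5.0 / 6.0, 1.0 / 6.0]
mu0 = [0.0, 1.0]  # start in the second state

mu_n = occupation_average(mu0, P, 1000)
```

In this toy chain the stationary law is unique, so the averages `mu_n` approach `pi` regardless of the choice of `mu0`.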
3.1 An Example

We consider the case of Reflected Brownian motion in a smooth domain $D \subset \mathbb{R}^d$. In this case, $\sigma$ is the identity, $b \equiv 0$, and $\gamma = n$ is the interior normal on $\partial D$:
$$dX_t = dB_t + n(X_t)\,dL_t \quad \Rightarrow \quad Lf(x) = \frac{1}{2}\Delta f(x),$$
for $f \in C_b^2(\mathbb{R}^d)$. By using Itô's formula, we can check that if $\mu$ is stationary, then
$$-\frac{1}{2}\int_D \Delta f(x)\,\mu(dx) = E_\mu \int_0^1 \nabla f(X_u)^T n(X_u)\,dL_u. \tag{3.3}$$
Let $f$ be a solution to the Neumann problem
$$\begin{cases} -\dfrac{1}{2}\Delta f = g & \text{in } D, \\[4pt] \dfrac{\partial f}{\partial n} = 0 & \text{on } \partial D. \end{cases} \tag{3.4}$$
For this choice of $f$, (3.3) becomes
$$\int_D g(x)\,\mu(dx) = 0.$$
It is well known that a solution to the Neumann problem exists (under domain regularity conditions) if and only if $\int_D g\,dx = 0$ (see, e.g., Chapter 6, Problem 4 in [14]). This condition shows that a candidate for $\mu$ is the Lebesgue measure. We will need to delve further into the theory to actually prove this intuition.
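The necessity of the compatibility condition can be seen directly. The following is a short worked step, assuming that $f$ is smooth up to the boundary and writing $\nu$ for the surface measure on $\partial D$: integrating the equation for $f$ in $D$ and applying the Gauss-Green formula with the interior normal $n$ gives

```latex
\int_D g(x)\,dx = -\frac12\int_D \Delta f(x)\,dx
               = \frac12\int_{\partial D}\frac{\partial f}{\partial n}(x)\,\nu(dx) = 0 .
```

Thus every $g$ for which the Neumann problem is solvable integrates to zero against the Lebesgue measure, which is exactly the identity that a stationary $\mu$ has to satisfy for such $g$.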
3.2 Characterization of Stationary Measures

Consider the Reflected diffusion $X_t$ solving (2.4). Assume $\mu$ is a stationary distribution. From (2.9), we have that
$$E_x\Big[ f(X_t) - \int_0^t Lf(X_u)\,du \Big] \ge E_x[f(X_0)],$$
for any $f \in C_0^2(\mathbb{R}^d)$ with $\nabla f^T \gamma(x) \ge 0$ for $x \in \partial D$. If $X_0$ is distributed according to $\mu$, then $X_u$ is distributed as $\mu$ for all $u \ge 0$. Note that since $X_t$ spends zero Lebesgue time on $\partial D$, then $\mu(\partial D) = 0$. Thus, integrating the above with respect to $\mu(dx)$,
$$-\int_0^t E_\mu(Lf(X_u))\,du \ge 0, \quad \text{for all } t \ge 0.$$
Hence, for $f \in C_0^2(\mathbb{R}^d)$ with $\nabla f^T \gamma(x) \ge 0$ for $x \in \partial D$, if $\mu$ is invariant for the diffusion, then
$$\int_D Lf(x)\,\mu(dx) \le 0. \tag{3.5}$$
The main result of [45] is the converse of the previous statement. It covers the case of a sticky boundary, in which the process spends positive Lebesgue time at the boundary. We give a specialized version that will be sufficient in our case. Let $L$ be as in (2.8), where $a_{i,j}$ and $b_i$ are bounded, Lipschitz functions. Assume that a bounded, Lipschitz vector field $\gamma(\cdot)$ is given on the boundary of a $C^2(\mathbb{R}^d)$ domain
$D$, such that $\gamma^T n(x) \ge \beta > 0$ for $x \in \partial D$. Let $\phi$ be a $C^2(\mathbb{R}^d)$ function defining the boundary of $D$.

Theorem 3.3 (Weiss '81, Kang and Ramanan '14) Let $\overline{D}$ be compact in $\mathbb{R}^d$, and assume that the submartingale problem on $D$ for $(L, \gamma)$ is well-posed, with the law of the Reflected diffusion solving (2.4) being the unique solution. Then, a measure $\mu$ is stationary for $X_t$ if and only if $\mu(\partial D) = 0$ and
$$\int_D Lf(x)\,\mu(dx) \le 0 \tag{3.6}$$
for all $f \in C_0^2(\mathbb{R}^d)$ with $\nabla f^T \gamma(x) \ge 0$ for $x \in \partial D$.

Note that local time does not appear in this formulation, which makes calculations simpler to handle. Also, the theorem provides a test: given a candidate measure for stationarity, we only need to check whether it satisfies the conditions in the theorem.

Remark 3.4 In the original presentation of Weiss, the matrix $(a_{i,j}(x))$ was assumed to be strictly elliptic, a condition imposed to guarantee well-posedness of the submartingale problem. In the formulation of Kang and Ramanan [21], this condition has been made explicit, and it is only assumed that $(a_{i,j}(x))$ is positive semidefinite. Also, in [21], the conditions on the smoothness of $D$ are relaxed to the point of including domains with finitely many corners.
3.3 Continuation of Example 3.1

Now we have a way to test that the Lebesgue measure is stationary for RBM in a domain $D \subset \mathbb{R}^d$. In this case, we have $L = \frac{1}{2}\Delta$. The domain is smooth; hence, the Lebesgue measure of the boundary is zero. For any $f \in C_0^2(\mathbb{R}^d)$ such that $\nabla f(x)^T n(x) \ge 0$ on $\partial D$, we have by the Gauss-Green formula
$$\frac{1}{2}\int_D \Delta f(x)\,dx = -\frac{1}{2}\int_{\partial D} \nabla f(x)^T n(x)\,\nu(dx) \le 0.$$
The measure $\nu(dx)$ is the surface measure on $\partial D$. This calculation and Theorem 3.3 confirm our guess that the Lebesgue measure is stationary for Reflected Brownian motion.

Theorem 3.3 has also been successfully used by Harrison, Landau and Shepp [18] to give an explicit formula for the stationary distribution of obliquely reflected Brownian motion in planar domains, in two cases: (a) the domain is of class $C^2(\mathbb{C})$ and bounded, and the reflection coefficient $\gamma$ has a global extension to a $C_b^2(\mathbb{R}^2)$ vector field; and (b) the domain is a convex polygon, and the reflection coefficient is constant on each face. Their technique for obtaining an explicit representation is to assume that $\mu(dx) = \rho(x)\,dx$, integrate (3.6) by parts to obtain a PDE with boundary conditions for $\rho$, and solve that equation.
3.4 Some Results on Uniqueness

To prove uniqueness of the stationary distribution, we can follow the scheme of Harrison and Williams in [17]. Their setting is different from ours in that they consider obliquely reflected Brownian motion in an orthant, which is a non-smooth domain, and the reflection vector $\gamma$ is assumed to be constant along each face. Nonetheless, their proofs are based on only two facts: (a) the process behaves as Brownian motion inside the domain, and (b) the process spends zero Lebesgue time on the boundary. Since these two facts are true in our case, it is possible to reproduce their proofs and make them work in our setting. The following lemma summarizes the properties we need.

Lemma 3.5 Assume that $(a_{i,j}(x))$ in the definition of $L$ is strictly elliptic. For each $x \in D$ and $t > 0$:
(a) $P_x(X_t \in \partial D) = 0$.
(b) Let $m$ be the Lebesgue measure in $\mathbb{R}^d$. For any Borel set $A \subset D$ we have
$$P_x(X_t \in A) = 0 \iff m(A) = 0. \tag{3.7}$$
(c) Suppose $\mu$ is a stationary distribution for $X$. Then $\mu$ and $m$ are mutually absolutely continuous on $D$.

Proof (a) We have established this already. For an alternative proof that follows the submartingale formulation, see Lemma 2.3 in [43].

To show (b), in view of part (a) and the fact that $m(\partial D) = 0$, it suffices to assume that $A \subset K$, where $K$ is a compact subset of $D \setminus \partial D$. Let $\tau = \inf\{u \ge 0 : X_u \in \partial D\}$, $\sigma = \inf\{u \ge 0 : X_u \in K\}$, $\sigma_0 = 0$, and for each $k \ge 1$, let $\tau_k = \sigma_{k-1} + \tau \circ \theta_{\sigma_{k-1}}$ and $\sigma_k = \tau_k + \sigma \circ \theta_{\tau_k}$, where $\theta$ is the usual shift operator for $X$. Then, for each $x \in D$, $\tau_k \uparrow \infty$ $P_x$-a.s. as $k \to \infty$ and
$$E_x \int_0^\infty \mathbf{1}_A(X_t)\,dt = E_x \sum_{k=1}^{\infty} \int_{\sigma_{k-1}}^{\tau_k} \mathbf{1}_A(X_t)\,dt = E_x \sum_{k=1}^{\infty} E_{X_{\sigma_{k-1}}}\Big[ \int_0^\tau \mathbf{1}_A(X_t)\,dt \Big],$$
where the second equality holds by the strong Markov property. Since $X_{\sigma_{k-1}} \in K$, and $X_t$ behaves as Brownian motion within $D$, we have
$$E_x \int_0^\infty \mathbf{1}_A(X_t)\,dt = E_x \sum_{k=1}^{\infty} E_{X_{\sigma_{k-1}}}\Big[ \int_0^\tau \mathbf{1}_A(B_t)\,dt \Big].$$
The right-hand side is zero if and only if $m(A) = 0$, since the distribution of $B_t$ is mutually absolutely continuous with respect to the Lebesgue measure for all $t > 0$. By Fubini's theorem we deduce
$$m(A) = 0 \iff \int_0^\infty P_x(X_t \in A)\,dt = 0 \iff P_x(X_t \in A) = 0,$$
for all $t > 0$, since the trajectories of Brownian motion are continuous. This shows part (b). Finally, part (c) follows from part (b) and the fact that a stationary distribution $\mu$ satisfies
$$\mu(A) = \int_D P_x(X_t \in A)\,\mu(dx), \quad \text{for all } t > 0.$$
If both $\mu_1$ and $\mu_2$ are stationary for $X$, then
$$\mu_j(A) = \int_D P_x(X_t \in A)\,\mu_j(dx) = \int_D P_x(X_t \in A)\,\frac{d\mu_j}{dm}(x)\,dx,$$
where $A$ is a Borel set and $\frac{d\mu_j}{dm}$ are Radon-Nikodym derivatives. The fact that $\mu_j$ and $m$ are mutually absolutely continuous implies that $\mu_1$ and $\mu_2$ are mutually absolutely continuous. On the other hand, it follows from the ergodic decomposition theorem (2.2.8 in [1]) that any two stationary distributions must have disjoint supports. This contradiction shows that the stationary distribution for $X$ solving (2.4) is unique.
4 Gravitation versus Brownian Motion

This section concerns results from my own work [4], in which we investigate the motion of an inert (massive) particle being impinged from below by a particle performing (reflected) Brownian motion. The velocity of the inert particle increases in proportion to the local time of collisions and decreases according to a constant downward gravitational acceleration. We will denote by $S_t$ the trajectory of the inert particle, and by $X_t$ the trajectory of the particle performing BM when away from $S_t$. The process $V_t$ will be the velocity of $S_t$. These processes should satisfy the following stochastic differential equations:
$$\begin{cases} dX_t = dB_t - dL_t, \\ dV_t = dL_t - g\,dt, \\ dS_t = V_t\,dt. \end{cases} \tag{4.1}$$
The model without the gravitational component was originally introduced in [24], where a full description of the inverse of the velocity process for the inert particle was developed.
Fig. 5 Typical path of the inert particle (blue) and the Brownian particle (red). The distance between the two is shown in black, and the velocity of the inert particle in green
A number of related models were studied in [46] and [5]. In the first one, a characterization of discrete-space processes with inert drift was given for such processes with product-form stationary distribution (Fig. 5).

Some observations we can make from (4.1) and simulations:
(1) Velocity increases only when particle $S_t$ is impinged by BM.
(2) The two particles seem to travel downwards at an (average) steady rate, that is,
$$\lim_{t \to \infty} X_t/t = \lim_{t \to \infty} S_t/t = V_\infty.$$
(3) The separation between particles looks like a RBM with quadratic drift towards zero, and stationary.
(4) The velocity process seems to have the structure of a renewable and symmetric process.
4.1 Existence and Uniqueness of the Process

The process given by $H_t = S_t - X_t$ will be called the gap process. If we know both $V$ and $H$, we can recover the movement of the individual particles by first integrating $V$ to obtain $S$, and then computing $X_t = S_t - H_t$.
Thus, existence and uniqueness of a strong solution to the system (4.1) are equivalent to those of the following system of equations:
$$\begin{cases} dH_t = -dB_t + V_t\,dt + dL_t, \\ dV_t = dL_t - g\,dt, \end{cases} \tag{4.2}$$
where $B_t$ is a standard one-dimensional Brownian motion, $H_t \ge 0$ for all $t \ge 0$, and $L_t$ is a continuous, non-decreasing process satisfying $dL_t = \mathbf{1}_{\{0\}}(H_t)\,dL_t$. We will write $Z_t = (H_t, V_t)$. If $B_t$ and $Z_t = (H_t, V_t)$ are given, then $L_t$ can be computed from the equation $L_t = L_0 + V_t - V_0 + gt$. Thus, the complete description of the strong solution to (4.1) can be given in terms of only $Z$ and $B$. Equation (4.2) can be written in vector form as
$$dZ_t = \begin{pmatrix} V_t \\ -g \end{pmatrix} dt + \begin{pmatrix} -1 & 0 \\ 0 & 0 \end{pmatrix} dB_t + \begin{pmatrix} 1 \\ 1 \end{pmatrix} dL_t,$$
where, in this display only, $B_t$ is understood as a two-dimensional Brownian motion whose first coordinate is the Brownian motion driving (4.2). From our discussion in Sect. 2.1, we deduce that there is a unique strong solution to (4.2). Note that the infinitesimal generator for $Z_t$ is degenerate, since the product $\sigma^T \sigma$ is clearly not elliptic. The diffusion $Z_t$ is, hence, degenerate.
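The system for the gap and the velocity can be explored with a simple time discretization. The following is a minimal sketch, assuming an Euler step combined with a one-step reflection at zero; the function name `simulate_gap_velocity`, the step size, and the parameter values are illustrative choices, not the scheme used in [4].

```python
import random

def simulate_gap_velocity(h0, v0, g, dt, n_steps, seed=0):
    """Euler scheme for the gap/velocity system with reflection of the gap at 0.

    Returns lists (H, V, L) of length n_steps + 1.
    """
    rng = random.Random(seed)
    H, V, L = [h0], [v0], [0.0]
    for _ in range(n_steps):
        dB = rng.gauss(0.0, dt ** 0.5)
        free = H[-1] - dB + V[-1] * dt   # unreflected move of the gap
        dL = max(0.0, -free)             # push only if the gap would go negative
        H.append(free + dL)
        V.append(V[-1] + dL - g * dt)
        L.append(L[-1] + dL)
    return H, V, L

# Illustrative parameter values, not taken from the text.
g, dt, n_steps = 1.0, 1e-3, 10_000
H, V, L = simulate_gap_velocity(h0=0.0, v0=0.0, g=g, dt=dt, n_steps=n_steps)
```

By construction the discrete paths satisfy the same bookkeeping identity as the continuous ones: the accumulated push equals the change in velocity plus $g$ times the elapsed time.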
4.2 The Process is Renewable

The process $(H_t, V_t)$ is renewable in a very specific sense. The renewal state will be $R = (0, -g)$ because we will see later that $-g$ is the limiting velocity. Before renewing, we need the velocity to slide away from $-g$:
$$\zeta_1 = \inf\Big\{ t > 0 : \sup_{u < t} |V_u + g| > 2,\ V_t = -g,\ H_t = 0 \Big\}. \tag{4.3}$$
[...] such that for all $t \ge 0$,
$$P(\zeta_1 > t) \le C e^{-C t^{1/2}}.$$
It follows that for any integer $n \ge 1$, $E(\zeta_1^n) < \infty$.

Lemma 4.2 The inter-regeneration time $\zeta_1$ has a density with respect to Lebesgue measure.
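Theorem 4.3 For any $z \in \mathcal{H}$, assuming that $Z_0 = z$, when $t \to \infty$ the law of $Z_t$ converges in total variation distance to a unique stationary distribution $\pi$, given by
$$\pi(A) = \frac{E_R\Big[ \int_0^{\zeta_1} \mathbf{1}_A(Z_u)\,du \Big]}{E_R(\zeta_1)}, \tag{4.4}$$
for any measurable set $A \subseteq \mathcal{H}$. Furthermore, for every $z \in \mathcal{H}$, $P_z$-a.s.,
$$\lim_{t \to \infty} \frac{\int_0^t \mathbf{1}_A(Z_u)\,du}{t} = \pi(A). \tag{4.5}$$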
4.3 The Stationary Distribution

In order to use Theorem 3.3, we need a candidate for the stationary measure. Since a stationary measure weighs the long-time behavior of a process, we will take Eqs. (4.2) and discard the Brownian noise, since it averages to zero in the long run. We are left with the equations
$$\begin{cases} dh_t = v_t\,dt + d\ell_t, \\ dv_t = -g\,dt + d\ell_t. \end{cases} \tag{4.6}$$
Without the noise, there is no "force" to separate the particles after they collide, thus $h_t = 0$ in the long term. This shows that $d\ell_t = -v_t\,dt$. Plugging this into the second equation yields $dv_t = -(v_t + g)\,dt$. The equilibrium solution is $v_t = -g$, and other solutions are given by $v_t = -g + (v_0 + g)e^{-t}$. A first-order approximation of (4.6) around the equilibrium velocity will be
$$\begin{cases} dh_t \approx -db_t - g\,dt + d\ell_t, \\ dv_t = -g\,dt + d\ell_t. \end{cases} \tag{4.7}$$
Approximately, $h_t$ behaves as a Reflected Brownian motion with negative drift. The stationary distribution of such a process is well known to have density proportional to $e^{-2gh}$. Using the Skorohod decomposition, we have that $\ell_t \approx \sup_{s \le t}(b_s + gs)$. In the stationary regime, excursions of $H_t$ above zero are parabolic trajectories with noise. If the velocity at the beginning of the excursion is $v + g$, then the parabola is $h_t = (v + g)t - gt^2/2$, and $\sup h_t = (v + g)^2/2g$. This gives us a candidate for the stationary distribution. Our first theorem states that $Z := (S - X, V)$ has a unique stationary distribution, which is the product of a Gaussian distribution and an exponential distribution. We actually proved that the law of $Z_t$ converges in total variation distance to the stationary distribution.
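The closed-form solution of the noiseless velocity equation $dv_t = -(v_t + g)\,dt$ quoted in the heuristic can be checked numerically. This is a minimal sketch, assuming an explicit Euler discretization; the helper names and the parameter values are illustrative and not taken from the text.

```python
import math

def euler_velocity(v0, g, t_end, n_steps):
    """Explicit Euler approximation of dv/dt = -(v + g) on [0, t_end]."""
    dt = t_end / n_steps
    v = v0
    for _ in range(n_steps):
        v += -(v + g) * dt
    return v

def exact_velocity(v0, g, t):
    """Closed-form solution quoted in the text: -g + (v0 + g) e^{-t}."""
    return -g + (v0 + g) * math.exp(-t)

# Illustrative values, not taken from the text.
v0, g, t_end = 3.0, 1.0, 5.0
approx = euler_velocity(v0, g, t_end, 100_000)
exact = exact_velocity(v0, g, t_end)
```

Both values decay towards the equilibrium velocity $-g$ as `t_end` grows, in line with the heuristic that $-g$ is the limiting velocity.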
Theorem 4.4 The process $Z := (S - X, V)$ has a unique stationary distribution, with density with respect to Lebesgue measure given by
$$\xi(h, v) = \frac{2g}{\sqrt{\pi}}\, e^{-2gh} e^{-(v+g)^2}, \quad v \in \mathbb{R},\ h \ge 0. \tag{4.8}$$
Proof Let $f \in C_0^2(\mathbb{R}_+ \times \mathbb{R})$. We have from (4.2) that $\gamma = (1, 1)$ and $Lf(h, v) = \frac{1}{2}\partial_{hh} f + v\,\partial_h f - g\,\partial_v f$. Assume that $\nabla f(0, v)^T \gamma \ge 0$. Integrating by parts, we compute
$$\begin{aligned}
\int_{\mathbb{R}_+ \times \mathbb{R}} Lf(h, v)\,\xi(h, v)\,dh\,dv &= \frac{2g}{\sqrt{\pi}} \int_{\mathbb{R}} \int_{\mathbb{R}_+} \Big( \frac{1}{2}\partial_{hh} f + v\,\partial_h f - g\,\partial_v f \Big) e^{-2gh} e^{-(v+g)^2}\,dh\,dv \\
&= -\frac{g}{\sqrt{\pi}} \int_{\mathbb{R}} \nabla f(0, v)^T \gamma\; e^{-(v+g)^2}\,dv \le 0.
\end{aligned}$$
We conclude stationarity by Theorem 3.3. Uniqueness was part of Theorem 4.3.

We believe that the uniqueness of the stationary distribution follows from the Harris irreducibility of our process. But the main power of Theorem 4.3 comes from the fact that it uses Lemmas 4.1 and 4.2 to show convergence to stationarity in total variation distance (the existence and uniqueness of the stationary distribution follow as a by-product of this convergence).
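The integration by parts in the proof of Theorem 4.4 can be spelled out as follows. This is a sketch that writes $w(v) = e^{-(v+g)^2}$, a shorthand not used in the text, and drops the positive normalizing constant of $\xi$, which does not affect the sign. Boundary terms at infinity vanish because $f$ has compact support, and $w'(v) = -2(v+g)\,w(v)$:

```latex
\begin{aligned}
\int_0^\infty \Big(\tfrac12\partial_{hh}f + v\,\partial_h f\Big)e^{-2gh}\,dh
  &= -\tfrac12\,\partial_h f(0,v) + (v+g)\int_0^\infty \partial_h f\,e^{-2gh}\,dh \\
  &= -\tfrac12\,\partial_h f(0,v) - (v+g)\,f(0,v) + 2g(v+g)\int_0^\infty f\,e^{-2gh}\,dh, \\
-g\int_{\mathbb{R}} \partial_v f\,w(v)\,dv
  &= -2g\int_{\mathbb{R}} (v+g)\,f\,w(v)\,dv, \\
-\int_{\mathbb{R}} (v+g)\,f(0,v)\,w(v)\,dv
  &= -\tfrac12\int_{\mathbb{R}} \partial_v f(0,v)\,w(v)\,dv .
\end{aligned}
```

Integrating the first identity against $w(v)\,dv$ and the second against $e^{-2gh}\,dh$, the interior terms cancel, and the third identity turns what remains into $-\frac{1}{2}\int_{\mathbb{R}} \big(\partial_h f(0,v) + \partial_v f(0,v)\big)\,w(v)\,dv = -\frac{1}{2}\int_{\mathbb{R}} \nabla f(0,v)^T \gamma\; w(v)\,dv \le 0$.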
5 Spinning Brownian Motion

This section concerns results included in [9]. Let $D \subset \mathbb{R}^n$ be a bounded $C^2$ domain, and let $B_t$ be an $n$-dimensional Brownian motion. A pair $(X_t, S_t)$ with values in $D \times \mathbb{R}^p$ is called spinning Brownian motion if it solves the following stochastic differential equation
$$\begin{cases} dX_t = \sigma(X_t)\,dB_t + \gamma(X_t, S_t)\,dL_t, \\ dS_t = \big[ g(X_t) - \alpha(X_t) S_t \big]\,dL_t, \end{cases} \tag{5.1}$$
where $L_t$ is the local time of $X_t$ on $\partial D$, and $\gamma$ points uniformly into $D$. Our assumptions on the coefficients are as follows:
• $\sigma(\cdot)$ is an $(n \times n)$-matrix valued, Lipschitz continuous function, and is uniformly elliptic; that is, there is a constant $c_1 > 0$ such that $\xi^T \sigma(x) \xi \ge c_1 |\xi|^2$ for all $\xi \in \mathbb{R}^n$ and all $x \in D$.
• $\gamma(x, s) = n(x) + \tau(x, s)$ is defined for $x \in \partial D$ and $s \in \mathbb{R}^p$, where $n$ is the interior normal to $\partial D$, and $\tau$ is a Lipschitz vector field on $\partial D \times \mathbb{R}^p$ such that $n(x) \cdot \tau(x, s) = 0$ for all $x \in \partial D$ and $s \in \mathbb{R}^p$,
• g(·) is a Lipschitz vector field on ∂D with values in Rp . The function α(·) is defined on ∂D, it is assumed to be Lipschitz, with values in a compact interval [α0 , α1 ] for some 0 < α0 < α1 < ∞. The process Xt behaves just like a Brownian diffusion inside D, and is reflected instantaneously in the direction γ = n + τ when it hits the boundary. The challenge is that the direction of reflection depends on the multidimensional parameter St , which is updated every time the main process Xt hits the boundary of D. This type of process has the following interpretation: consider a small ball that spins and moves around a planar box following a Brownian path. On the boundary of the box we put tiny wheels which rotate at different speeds, modifying the spin of the ball as well as pushing it in a certain (non-tangential) direction. In this context, it is natural to think of the boundary wheels as an external forcing system that is not affected by the hitting of the ball: every wheel on the boundary rotates at a speed dependent only on its position. The position Xt of the particle at time t is described by the first equation in (5.1), in which the direction of the boundary push γ (Xt , St ) depends on the current position of the particle, that is, on which boundary wheel it hits, and also on the current value of the spin St at the time the boundary is hit. The spin of the particle is recorded by the process S. As we described it, it only updates when the particle is on the boundary, and we have chosen its amount of change to be linear with respect to the current spin, since this is the physically relevant situation. Indeed, angular momentum is conserved when two particles collide in absence of external interference. The spinning Brownian particle of our interest will collide against the revolving wheel and this system will locally maintain its total angular momentum. 
It is natural that the change of spin is given by a linear combination of the current spin and the spin of the revolving boundary wheel, $[g(x) - \alpha(x)s]$, also taking into account that part of the angular momentum is used in reflecting the particle in a non-normal direction (thus the factor $\alpha(x)$). This model has inspired us to call the solution to (5.1) spinning Brownian motion.

Even though our inspiration for the model comes from the spinning ball bouncing off of a moving boundary, from the mathematical point of view it is natural to regard the process $(X, S)$ as a multidimensional reflected diffusion in $D \times \mathbb{R}^p$ with degeneracy, due to the absence of a diffusive motion in the $p$ components of $S$. Setting $Z = (X, S)$ we can write (5.1) as
$$dZ_t = \sigma_0(Z_t)\,dB_t + \kappa(Z_t)\,dL_t, \tag{5.2}$$
where $\sigma_0(x, s)$ is the $(n + p) \times n$ matrix obtained from $\sigma(x)$ by augmenting it with zeroes, and $\kappa(x, s) = (\gamma(x, s),\, g(x) - \alpha(x)s)$. Since $\partial(D \times \mathbb{R}^p) = \partial D \times \mathbb{R}^p$, the local times of (5.1) and (5.2) are the same, because $Z$ is on the boundary of its domain if and only if $X \in \partial D$, and the interior normal to $D \times \mathbb{R}^p$ is just $(n, 0_p)$, where $0_p$ is the zero vector in $\mathbb{R}^p$.

Understanding the structure of the stationary distribution of spinning Brownian motion has been one of our main interests. The most challenging part is to prove
that spinning Brownian motion admits a unique stationary distribution under the following crucial assumption on the vector field $g$:

A1 There are $p + 1$ points $x_1, \dots, x_{p+1}$ on the boundary of $D$ such that for every $y \in \mathbb{R}^p$, there exist non-negative coefficients $\lambda_j$ such that $y = \sum_{j=1}^{p+1} \lambda_j g(x_j)$.
We start with an intermediate result that apparently has little to do with the stationary distribution, and is interesting in its own right. We identify the components of an exit system (see [27] for a definition) for excursions away from the boundary, in terms of the local time $L_t$ of the process and a family of excursion measures $H_x$ that has been constructed in the build-up to Theorem 7.2 in [7]. We denote by $\mathcal{C}$ the set of continuous functions defined on some interval $[0, t_f)$, where $t_f$ might depend on $f$.

Theorem 5.1 (Theorem 1 in [27]) There exists a positive, continuous additive functional $L^*$ of $Z$ such that, for every $z \in E$, any positive, bounded, predictable process $V$, and any universally measurable function $f : \mathcal{C} \to [0, \infty)$ that vanishes on excursions $e_t$ identically equal to the cemetery state $\Delta$,
$$E_z\Big[ \sum_{t \in J} V_t f(e_t) \Big] = E_z \int_0^\infty V_{\xi_s} H_{Z(\xi_s)}(f)\,ds = E_z \int_0^\infty V_t H_{Z_t}(f)\,dL^*_t. \tag{5.3}$$
In the equation above, the set $J$ corresponds to the times at which excursions $e_t$ start. The process $\xi$ is the right inverse of $L^*$. Standard notation is used for $H_z(f) = \int_{\mathcal{C}} f\,dH_z$.

Theorem 5.2 Let $P^D$ be the law of Brownian motion killed upon exiting $D$. For $x \in \partial D$, $s \in \mathbb{R}^p$, define
$$H_x := \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon}\, P^D_{x + \varepsilon n(x)}, \qquad H_{x,s} := H_x \otimes \delta_{\{s\}}, \tag{5.4}$$
and let $L_t$ be the local time of $(X, S)$ satisfying equation (5.1) with $\sigma = I_n$. Then $H_{x,s}$ is a sigma-finite measure, strongly Markovian with respect to the filtration of the driving Brownian motion $B_t$, and $(L_t, H_{x,s})$ is an exit system from $\partial D \times \mathbb{R}^p$ for the process $(X, S)$.

Proof We will sketch the proof to show the main computations. The full proof can be found in [9]. The fact that $H_x$ is sigma-finite for all $x \in \partial D$, and strongly Markovian, is proved in Theorem 7.2 in [7]. Let $(L^*_t, H_x)$ be an exit system for $(X, S)$, as in Theorem 5.1. Fix $T > 0$, $x_0 \in \partial D$, and small enough $\varepsilon > 0$ such that both $B(x_0, \varepsilon) \cap D$ and $\Gamma = B(x_0, \varepsilon) \cap \partial D$ are connected sets. The set $\partial\Gamma = \{x \in \partial D : |x - x_0| = \varepsilon\}$ has surface measure zero, and since the surface measure and the harmonic measure are mutually absolutely continuous, almost surely no excursions of $X$ have ending points in $\partial\Gamma$. In particular, $P_{(x,s)}(T_{\partial\Gamma} < \infty) = 0$ for all $x \in D$.
Let G be a subdomain of D such that G is a compact subset of D, and G ∩ B(x0 , ε) = ∅. As usual TE denotes the hitting time of the set E by the process X. Define AG = {TG < T∂D }. Define the following sequences of stopping times: η0 = T , and inductively set τk+1 = ηk + TD\B(x0,ε) ◦ θηk , and ηk+1 = τk+1 + T ◦ θτk+1 . It is standard to check that ηk grows to infinity almost surely. We will focus on excursions that start from before time T > 0, and belong to AG . For z = (x, s), x ∈ D, we define
Iz (T , ; G) = Ez
1 (Xt )1AG (et ) .
t 1) = h(x)
x ∈ W,
where $\tau := \min\{n : X_n \notin W\}$. A function $h$ is subharmonic (superharmonic) if $E_x(h(X_1); \tau > 1) \le h(x)$ (respectively $E_x(h(X_1); \tau > 1) \ge h(x)$) for every $x \in W$. The resulting (sub)harmonic function associated with a Doob h-transform will also be called an h-function. To avoid trivial cases, we assume that $Y$ has components taking positive and negative values with positive probability. Besides that, the construction works with no further hypothesis if either some component of $Y$ drifts to $-\infty$, or every component of $Y$ drifts to $+\infty$. When such conditions are not satisfied, we need the existence of a positive $\epsilon = (\epsilon_1, \dots, \epsilon_{d-1}) \in \mathbb{R}^{d-1}_+$ such that $P(Y_1 > \epsilon) > 0$, needed only for the finiteness of the (sub)harmonic function $h$. Our main result is the following, which justifies that our construction can be interpreted as a random walk $X$ conditioned to stay ordered forever.

Theorem 1 Let $N$ be a geometric time with parameter $1 - e^{-c}$, independent of $X$. Assume that
\[
h^\uparrow(x) := 1 + E\Big(\sum_{n=1}^{J_1 - 1} 1\{\overline{Y}_{n-1} - Y_n < y\}\Big) < \infty, \qquad x = (x_1, \dots, x_d) \in W,
\]
with $y = (x_2 - x_1, \dots, x_d - x_{d-1})$ and $J_1 = \inf\{n > 0 : \overline{Y}^k_{n-1} < Y^k_n,\ k \in [d-1]\}$. Then, for every $x \in W$, every finite $(\mathcal{F}_n)$-stopping time $T$ and $\Lambda \in \mathcal{F}_T$,
\[
\lim_{c \to 0+} P_x(\Lambda, T \le N \mid X(i) \in W,\ i \in [N]) = P^\uparrow_x(\Lambda, T < \zeta) := \frac{1}{h^\uparrow(x)} E^Q_x\big(h^\uparrow(X_T)\, 1\{\Lambda, T < \zeta\}\big),
\]
where $E^Q_x$ is the expectation under the law of $X$ killed at the first exit time of the Weyl chamber. The limit law is a Markov chain with transition probabilities
\[
p^\uparrow(w, dz) = 1\{z \in W\}\, \frac{h^\uparrow(z)}{h^\uparrow(w)}\, p(w, dz), \qquad w \in W. \tag{2}
\]
Multidimensional Random Walks Conditioned to Stay Ordered via Generalized. . .
Moreover, it is a probability measure if $E(\tau) = \infty$, or a subprobability measure if $E(\tau) < \infty$. We also give simple conditions to ensure $h^\uparrow$ is finite.

Lemma 1 Assume that either (1) some component of $Y$ drifts to $-\infty$, (2) every component of $Y$ drifts to $+\infty$ and $P(\tau = \infty) > 0$, or (3) there exists $\epsilon = (\epsilon_1, \dots, \epsilon_{d-1}) \in \mathbb{R}^{d-1}_+$ such that $P(Y_1 > \epsilon) > 0$. Then
\[
h^\uparrow(x) < \infty \qquad \forall\, x \in W.
\]
The following is an application of Theorem 1.

Example 1 For $d \ge 1$, consider a multidimensional random walk with partial sums $X_n = (an, X^1_n, \dots, X^d_n, bn)$, where $a < b$ and $a, b \in \mathbb{R}$. Assume that $P(X^1_1 - a > \epsilon_1, X^2_1 - X^1_1 > \epsilon_2, \dots, b - X^d_1 > \epsilon_{d+1}) > 0$ for some $\epsilon_i > 0$, $i \in \{1, \dots, d+1\}$, and $P(X^1_1 - a < 0, X^2_1 - X^1_1 < 0, \dots, b - X^d_1 < 0) > 0$. Then, the construction of Theorem 1 can easily be applied in this case, and provides us with a d-dimensional random walk conditioned to have ordered components staying inside the set $\{x \in \mathbb{R} : at < x < bt,\ \forall\, t \in \mathbb{R}_+\}$.

Depending on the drift of its components, we reexpress our function $h^\uparrow$.

Lemma 2 Let $x \in W$. If some component of $Y$ drifts to $-\infty$, the $h^\uparrow$-transform is given by
\[
h^\uparrow(x) = \frac{E_x(\tau)}{E(\tau)}.
\]
If every component drifts to $+\infty$ and $P(\tau = \infty) > 0$, then
\[
h^\uparrow(x) = \frac{P_x(\tau = \infty)}{P(\tau = \infty)}.
\]
We also express $h^\uparrow$ as a renewal function. For $k \in [d-1]$, denote by $(\beta^k_i, i \in \mathbb{N})$ the strict descending ladder times of $Y^k$, that is, $\beta^k_0 = 0$ and, for $i \in \mathbb{N}$, the time $\beta^k_i$ is the smallest index $n$ such that $Y^k(\beta^k_{i-1} + n) < Y^k(\beta^k_{i-1})$. Let $\{\beta_0, \beta_1, \beta_2, \dots\}$ be the ordered union of all such ladder times, with $\beta_0 = 1$. Denoting by $g_n = \beta_n$ and $d_n = \beta_{n+1}$ for $n \ge 0$, the set $\{g_n, g_n + 1, \dots, d_n - 1\}$ is the $n$th interval where the running minimum $\underline{Y}$ remains constant.
O. Angtuncio-Hernández
Proposition 1 Let $x \in W$. The h-function can be expressed as
\[
h^\uparrow(x) = 1 + \sum_{n=1}^{\infty} P\big(-\underline{Y}_{\beta_n} < y\big).
\]
Tk : −Xj > Hk }.
The convention is $H_k = \infty$ if $T_k = \infty$. The renewal function associated with $H_1$ is
\[
V(x) = \sum_{k=0}^{\infty} P(H_k \le x), \qquad x \ge 0.
\]
This is a non-decreasing right-continuous function. But the duality lemma gives us
\[
V(x) = E\Big(\sum_{j=0}^{\sigma_0 - 1} 1\{-x \le X_j\}\Big) = h(x).
\]
Thus, Proposition 1 is a generalization of this result. Our other reexpressions of $h^\uparrow$ given in Lemma 2 also have their counterparts in the unidimensional case (see [2]). We conjecture that our h-function is subharmonic when $X$ has i.i.d. components taking values in $\mathbb{R}$, satisfies the hypotheses of [10] or [5], and $d > 2$. The reason is that in those papers the tail of the distribution of $\tau$ is computed, which, as we prove, helps to characterize the harmonicity of our $h$. In [10] it is proved that $P_x(\tau > n)$ is of the order $n^{-d(d-1)/4}$ (see Sect. 5 for other approximations), implying $E_x(\tau) < \infty$ when $d \ge 3$ and $x \in W$. In fact, we prove in Lemma 6 that $h$ is harmonic (subharmonic) iff the expectation of $\tau$ is infinite (finite). This represents a difference with respect to [10], since, regardless of the dimension, their h-function is harmonic. Nevertheless, such a difference has also been observed in [12]. That paper characterizes the h-function of centered irreducible random walks taking values in a countable set, with slowly varying hitting probabilities and other minor assumptions. The author proved that when $E(\tau) = \infty$, any harmonic function for the process is proportional to $V(\cdot) = E_\cdot(T)$, where $T$ is the exit time of some ladder height process, and also that $\lim P_x(A_n)/P(A_n) = V(x)$, with $x \in W$ and $A_n = \{\tau > n\}$. But when $E(\tau) < \infty$, the author proved that $V(x) \le \lim P_x(A_n)/P(A_n)$ and that $V$ is superharmonic (in our case, the h-function is subharmonic). It is important to note that even in the unidimensional case, the harmonicity of the h-function depends heavily on $E(\tau)$ and even on the way we choose the approximating events. An example appears in [2] for random walks not drifting to $+\infty$. They compare limits of some random walks under two different conditionings to stay positive. It is proved that for oscillating random walks both limits are the same. But when the drift is negative, depending on the upper tail of the step distribution, two things can happen: both limits are the same and the h-function is subharmonic; or the limits are different, with a harmonic h-function. Results in the same spirit for Lévy processes are given in [2].
Also, the h-function of the Brownian motion with negative drift conditioned to stay positive is harmonic or subharmonic, depending on the approximation: it is proved in [19] that it is harmonic when conditioning on $\{\tau > t\}$ and letting $t \to \infty$, while in [3] it is subharmonic when conditioning on $\{\tau > E/c\}$, with $E$ an exponential random variable with mean 1, and letting $c \to 0$. This paper is organized as follows. Our construction is given in Sect. 2, conditioning the walk to stay ordered up to an independent geometric time. We prove this is a Markov chain and an h-transform of the process, where the harmonic function is denoted by $h^\uparrow_c$. In Sects. 2.1 and 2.2 we reexpress $h^\uparrow_c$ using a partition of $\mathbb{N}$ into random intervals, so that the random walk can be interpreted as excursions on each interval. This allows us to obtain in Lemma 5 the limit $h^\uparrow$ of $h^\uparrow_c$ as $c \downarrow 0$, and implies that the limit of the random walk is a Markov chain obtained by a change of measure with $h^\uparrow$; this is the second part of Theorem 1. We characterize in Sect. 3 when $h^\uparrow$ is harmonic or subharmonic; give a condition to ensure its finiteness; and prove in Lemma 8 that the law of the random walk using the h-function $h^\uparrow$ is the same as the limit of the random walk law conditioned to stay ordered up to a geometric time, which proves Theorem 1. In Sect. 4 we obtain several reexpressions of $h^\uparrow$. Finally, in Sect. 5 we review known results about the order of $P_x(\tau > n)$ as $n \to \infty$.
2 The Random Walk Conditioned to Be Ordered Up to a Geometric Time as an h-Transform

Without loss of generality, we will assume $d = 3$. Recall the notation at the beginning of Sect. 1.2. Consider $x = (x_1, x_2, x_3) \in W$ and let $y = (y_1, y_2) = (x_2 - x_1, x_3 - x_2)$. Recall the definition of $\tau = \inf\{n : X_n \notin W\}$, the first exit time from the Weyl chamber. For any $n \in \mathbb{N}$ and $A = A_0 \times A_1 \times \cdots \times A_n \in \mathcal{B}(\mathbb{R}^{n+1})$, we find the limit as $c \to 0+$ of
\[
P^c\Big(\bigcap_{i=0}^{n} \{X_i + x \in A_i\} \,\Big|\, \tau > N\Big) = P^c\Big(\bigcap_{i=0}^{n} \{X_i + x \in A_i\} \,\Big|\, X^1_j + x_1 < X^2_j + x_2 < X^3_j + x_3,\ j \in [N]\Big).
\]
First we prove this is a Markov chain.

Proposition 2 Under $P^c$ and for any $x = (x_1, x_2, x_3) \in W$, the chain $X + x$ conditioned to be ordered up to time $N$ is a Markov chain with transition probabilities
\[
P^\uparrow_c(w, dz) = 1\{z \in W\}\, \frac{h^\uparrow_c(z)}{h^\uparrow_c(w)}\, e^{-c}\, p(w, dz),
\]
with $w = (w_1, w_2, w_3) \in W$, $z = (z_1, z_2, z_3)$, $p(w, dz) = P(X_1 + w \in dz)$ and
\[
h^\uparrow_c(w) = \frac{P^c(X_i + w \in W,\ i \in [\zeta])}{P^c(X_i \in W,\ i \in [\zeta])}.
\]
Proof We compute the n-step transition probabilities Pcx (Xi ∈ dwi , i ∈ [n]0 | τ > N) , where wi ∈ R3 for i ∈ [n]0 := {0, . . . , n} and w0 = x ∈ W. Then, the numerator of the n-step transition probability is given by Pc (Xi + x ∈ dwi , i ∈ [n]0 , Xi + x ∈ W, i ∈ [N]) . On such set we have N ≥ n, and for i ∈ [n]0 we have Xi + x = wi ∈ W, while for n + i ∈ {n, n + 1, . . . , N} {Xn + x ∈ dwn , Xn+i + x ∈ W} = {Xn + x ∈ dwn , Xn+i − Xn + wn ∈ W} .
Summing over the values of $N$, and using the independent and stationary increments of $X$, the numerator is equal to
\[
\begin{aligned}
& 1\Big\{\bigcap_{i=1}^{n} \{w_i \in W\}\Big\}\, P(X_i + x \in dw_i,\ i \in [n]_0) \sum_{k \ge n} P(X_{n+i} - X_n + w_n \in W,\ n + i \in [k])\, P(N = k) \\
&\quad = 1\Big\{\bigcap_{i=1}^{n} \{w_i \in W\}\Big\}\, P(X_i + x \in dw_i,\ i \in [n]_0)\, e^{-cn}\, P^c(X_i + w_n \in W,\ i \in [N]),
\end{aligned}
\]
by the lack of memory property of $N$. Therefore, the n-step transition probability is given by
\[
1\Big\{\bigcap_{i=1}^{n} \{w_i \in W\}\Big\}\, P(X_i + x \in dw_i,\ i \in [n]_0)\, e^{-cn}\, \frac{P^c(w_n + X_i \in W,\ i \in [N])}{P^c(x + X_i \in W,\ i \in [N])}.
\]
Denote by $X^\uparrow$ the random walk $X$ conditioned to stay ordered up to time $N$. Considering $w_i \in W$ for $i \in [n-1]_0$, we obtain, using that $X$ is a random walk,
\[
\begin{aligned}
& P^c\big(X^\uparrow(n) \in dw_n \mid X^\uparrow(0) = w_0, X^\uparrow(1) \in dw_1, \dots, X^\uparrow(n-1) \in dw_{n-1}\big) \\
&\quad = 1\{w_n \in W\}\, P(X_n + x \in dw_n \mid X_{n-1} + x \in dw_{n-1})\, e^{-c}\, \frac{h^\uparrow_c(w_n)}{h^\uparrow_c(w_{n-1})} \\
&\quad = 1\{w_n \in W\}\, P(X_1 \in dw_n \mid X_0 \in dw_{n-1})\, e^{-c}\, \frac{h^\uparrow_c(w_n)}{h^\uparrow_c(w_{n-1})},
\end{aligned}
\]
which is the one-step transition probability, and depends only on $w_{n-1}$ and $w_n$. Now we analyze the function $h^\uparrow_c$.
2.1 Reexpression of the h-Function of the Ordered RW Up to a Geometric Time

A priori, $h^\uparrow_c$ is the quotient of two probabilities converging to zero. We reexpress $h^\uparrow_c$ to prove it converges. Working with the numerator of $h^\uparrow_c(x)$, first sum over all possible values of $N$:
\[
P^c(X_i + x \in W,\ i \in [\zeta]) = (1 - e^{-c}) \Big(1 + \sum_{n=1}^{\infty} e^{-cn} P(X_i + x \in W,\ i \in [n])\Big).
\]
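The two facts about the geometric time $N$ used so far, namely $P(N \ge n) = e^{-cn}$ and the lack of memory property, can be checked numerically. The following is a minimal sketch, assuming the parametrization $P(N = k) = (1 - e^{-c}) e^{-ck}$ for $k \ge 0$ implied by the sum above; the function names and the values of `c` and `n` are illustrative only.

```python
import math


def geometric_pmf(c: float, k: int) -> float:
    """P(N = k) for a geometric time with parameter 1 - exp(-c), k = 0, 1, ..."""
    return (1.0 - math.exp(-c)) * math.exp(-c * k)


def geometric_tail(c: float, n: int, terms: int = 5000) -> float:
    """Truncated numerical value of P(N >= n) = sum over k >= n of P(N = k)."""
    return sum(geometric_pmf(c, k) for k in range(n, n + terms))


c, n = 0.3, 5  # illustrative values, not taken from the paper
tail_numeric = geometric_tail(c, n)
tail_closed = math.exp(-c * n)  # the factor e^{-cn} appearing in the proof

# Lack of memory: P(N = n + 2 | N >= n) should equal P(N = 2).
shifted = geometric_pmf(c, n + 2) / tail_closed
```

Comparing `tail_numeric` with `tail_closed`, and `shifted` with `geometric_pmf(c, 2)`, reproduces the two identities up to floating-point error.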
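Recall that $Y = (X^2 - X^1, X^3 - X^2)$ and $y = (x_2 - x_1, x_3 - x_2)$. Since $X_i + x \in W$ for all $i \in [n]$ exactly when $-y$ is strictly below the componentwise running minimum $\underline{Y}_n$ of $Y$ over the times in $[n]$, it follows that
\[
\frac{P^c(X_i + x \in W,\ i \in [\zeta])}{1 - e^{-c}} - 1 = E\Big(\sum_{n=1}^{\infty} e^{-cn}\, 1\{-y < \underline{Y}_n\}\Big). \tag{4}
\]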
For any $n \in \mathbb{N}$, it is known that $X$ and the time-reversed process $X^*$ have the same distribution, with
\[
X^*_i = X_n - X_{n-i} \qquad \text{for } 0 \le i \le n.
\]
This chain has components $X^* = (X^{1,*}, X^{2,*}, X^{3,*})$, and similarly for $Y^*$. Then
\[
\{-y < \underline{Y}_n\} \overset{d}{=} \big\{\max\{-Y^{1,*}_j,\ j \in [n]\} < x_2 - x_1,\ \max\{-Y^{2,*}_j,\ j \in [n]\} < x_3 - x_2\big\}.
\]
Define $\overline{Y}^k_n = \max\{Y^k_j,\ 0 \le j \le n\}$ for $k = 1, 2$ and $\overline{Y}_n = (\overline{Y}^1_n, \overline{Y}^2_n)$. Add and subtract the term $Y^{k,*}_n$ and use $Y^{k,*}_i - Y^{k,*}_n = Y^k_n - Y^k_{n-i} - Y^{k,*}_n = -Y^k_{n-i}$ for $k = 1, 2$, so
\[
\{-y < \underline{Y}_n\} \overset{d}{=} \{\overline{Y}_{n-1} - Y_n < y\}. \tag{5}
\]
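The identity in distribution (5) can be verified by exact enumeration for a small example. The sketch below is only an illustration in one dimension (a single component of $Y$), with a hypothetical three-point step law `STEPS` that is not taken from the paper; it compares the probabilities of $\{-y < \min_{j \in [n]} Y_j\}$ and $\{\max_{0 \le l \le n-1} Y_l - Y_n < y\}$ using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

# Hypothetical step distribution of a one-dimensional walk (assumption for illustration).
STEPS = {-1: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}


def both_probabilities(n: int, y: int):
    """Return (P(-y < min_{1<=j<=n} Y_j), P(max_{0<=l<=n-1} Y_l - Y_n < y))."""
    p_min = Fraction(0)
    p_max = Fraction(0)
    for steps in product(STEPS, repeat=n):
        weight = Fraction(1)
        path = [0]
        for s in steps:
            weight *= STEPS[s]
            path.append(path[-1] + s)
        if min(path[1:]) > -y:             # walk stays strictly above -y on [n]
            p_min += weight
        if max(path[:-1]) - path[-1] < y:  # previous running maximum minus current value
            p_max += weight
    return p_min, p_max


# Check the identity for several horizons n and levels y.
identity_holds = all(
    both_probabilities(n, y)[0] == both_probabilities(n, y)[1]
    for n in range(1, 7)
    for y in range(1, 4)
)
```

The equality is exact here because reversing the order of i.i.d. steps does not change the weight of a path, which is the time-reversal argument used above.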
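This implies that $h^\uparrow_c$ can be reexpressed as
\[
h^\uparrow_c(x) = \frac{1 + \sum_{n=1}^{\infty} e^{-cn} P(\overline{Y}_{n-1} - Y_n < y)}{1 + \sum_{n=1}^{\infty} e^{-cn} P(\overline{Y}_{n-1} - Y_n < 0)}. \tag{6}
\]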
2.2 Partitioning N via the Times of a Multidimensional Ladder Height Function to Obtain the Limit of the h-Function

In this subsection, we partition $\mathbb{N}$ at some particular times $\{J_i, i \in \mathbb{N}\}$. Those are the times in common among the ascending ladder times of $Y^1$ and $Y^2$, that is, if $(\alpha^k_j, j \ge 0)$ are the strict ascending ladder times of $Y^k$, then $J_i$ is the $i$th time such that $\alpha^1_j = \alpha^2_l$ for some $j, l \in \mathbb{N}$. We prove that the subpaths $\{Y_{J_i + n},\ 0 \le n < J_{i+1} - J_i\}_i$ are i.i.d., and at the times $J_i$, every component of the walk $Y$ is at least as big as the current cumulative maximum. In this sense, the reader should think of those subpaths as excursions of $Y$. Let $J_0 = 0$ and for $i \in \mathbb{N}$, define
\[
J_{i+1} = \min\{n > J_i : \overline{Y}^k_{n-1} < Y^k_n,\ k = 1, 2\},
\]
the first time after $J_i$ such that both walks reach the current maximum at the same time.

Remark 1 Note that $\overline{Y}_{J_i} = Y_{J_i}$, since both processes are at the same maximum. Also, since we assumed $P(Y_1 > 0) > 0$, the event $\{J_1 = 1\} = \{\overline{Y}_0 < Y_1\} = \{0 < Y_1\}$ has positive probability.

We prove that $(J_i, i \ge 0)$ are stopping times. Let $(\mathcal{F}_n, n \in \mathbb{N})$ be the natural filtration of $X$. For any $m \in \mathbb{N}$, the event $\{J_1 = m\}$ is equal to
\[
\{\overline{Y}^k_{j-1} \ge Y^k_j,\ 1 \le j \le m - 1, \text{ for } k = 1 \text{ or } k = 2\} \cap \{\overline{Y}^k_{m-1} < Y^k_m,\ k = 1, 2\},
\]
which is in $\mathcal{F}_m$. Assuming $J_i$ is a stopping time, the event $\{J_{i+1} = m\}$ is equal to
\[
\bigcup_{l=i}^{m-1} \{J_i = l\} \cap \{\overline{Y}^k_{j-1} \ge Y^k_j,\ l + 1 \le j \le m - 1, \text{ for } k = 1 \text{ or } k = 2\} \cap \{\overline{Y}^k_{m-1} < Y^k_m,\ k = 1, 2\},
\]
which also belongs to $\mathcal{F}_m$. We prove the independence and obtain the distribution between such times.

Lemma 3 For every $i \in \mathbb{N}$, the walk $\{\overline{Y}_{J_i + n - 1} - Y_{J_i + n},\ n \ge 1\}$ is independent of $\mathcal{F}_{J_i}$ and has the same distribution as $\{\overline{Y}_{n-1} - Y_n,\ n \ge 1\}$.

Proof Let $T < \infty$ be a stopping time. For $n \ge 2$, decompose $\overline{Y}_{T+n-1}$ as the maximum up to time $T$ and the maximum between times $\{T + 1, \dots, T + n - 1\}$. Hence
\[
\overline{Y}_{T+n-1} - Y_{T+n} = (\overline{Y}_T - Y_T) \vee \max\{Y_{T+l} - Y_T,\ l \in [n-1]\} - (Y_{T+n} - Y_T),
\]
and for $n = 1$
\[
\overline{Y}_T - Y_{T+1} = (\overline{Y}_T - Y_T) \vee (0, 0) - (Y_{T+1} - Y_T).
\]
We substitute $T = J_i$ for $i \in \mathbb{N}$ and recall $\overline{Y}_{J_i} = Y_{J_i}$. For $n \in \mathbb{N}$ and $A_m \in \mathcal{B}(\mathbb{R}^2)$ with $m \in [n]$, the events
\[
\bigcap_{m=1}^{n} \{\overline{Y}_{J_i + m - 1} - Y_{J_i + m} \in A_m\} = \bigcap_{m=1}^{n} \big\{(0, 0) \vee \max\{Y_{J_i + l} - Y_{J_i},\ l \in [m-1]\} - (Y_{J_i + m} - Y_{J_i}) \in A_m\big\}
\]
are independent of $\mathcal{F}_{J_i}$ on $\{J_i < \infty\}$, by the strong Markov property. They also have the same distribution as
\[
\bigcap_{m=1}^{n} \big\{(0, 0) \vee \max\{Y_l,\ l \in [m-1]\} - Y_m \in A_m\big\} = \bigcap_{m=1}^{n} \{\overline{Y}_{m-1} - Y_m \in A_m\},
\]
recalling that $\overline{Y}_0 = Y_0 = (0, 0)$ under $P$.
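The following result is crucial to partition the sums in (6).

Lemma 4 The times $\{J_{i+1} - J_i,\ i \in \mathbb{N}\}$ are i.i.d. and $J_{i+1} - J_i = J_1 \circ \theta_{J_i}$, where $\theta$ is the translation operator.

Proof For $i \in \mathbb{N}$ we have
\[
\begin{aligned}
J_{i+1} - J_i &= \min\{n > 0 : \overline{Y}_{J_i + n - 1} < Y_{J_i + n}\} \\
&= \min\{n > 0 : \max\{Y_{J_i + m} - Y_{J_i};\ 0 \le m \le n - 1\} - (Y_{J_i + n} - Y_{J_i}) < 0\} = J_1 \circ \theta_{J_i}.
\end{aligned}
\]
Then $J_{i+1} - J_i$ is independent of $\mathcal{F}_{J_i}$ and has the same law as $J_1$, by Lemma 3.

Lemma 5 The h-function $h^\uparrow_c$ converges as $c \downarrow 0$ to
\[
h^\uparrow(x) = 1 + E\Big(\sum_{n=1}^{J_1 - 1} 1\{\overline{Y}_{n-1} - Y_n < y\}\Big),
\]
recalling that $y = (x_2 - x_1, x_3 - x_2)$.

Proof Recall Eq. (6). Partition $\mathbb{N}$ at times $(J_i, i \in \mathbb{N})$:
\[
E\Big(\sum_{n=1}^{\infty} e^{-cn}\, 1\{\overline{Y}_{n-1} - Y_n < y\}\Big) = E\Big(\sum_{i=0}^{\infty} e^{-cJ_i}\, 1\{J_i < \infty\} \sum_{n=1}^{J_{i+1} - J_i} e^{-cn}\, 1\{\overline{Y}_{n-1+J_i} - Y_{n+J_i} < y\}\Big).
\]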
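Conditioning on $\mathcal{F}_{J_i}$ and summing over the values taken by $J_{i+1} - J_i$, the previous expression is equal to
\[
E\Big\{\sum_{i=0}^{\infty} e^{-cJ_i}\, 1\{J_i < \infty\} \sum_{m \ge 1} \sum_{n=1}^{m} e^{-cn}\, P\big(J_{i+1} - J_i = m,\ \overline{Y}_{n-1+J_i} - Y_{n+J_i} < y \mid \mathcal{F}_{J_i}\big)\Big\}.
\]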
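Using Lemmas 3 and 4, we obtain
\[
\frac{P^c(X_i + x \in W,\ i \in [\zeta])}{1 - e^{-c}} = 1 + E\Big(\sum_{i=0}^{\infty} e^{-cJ_i}\, 1\{J_i < \infty\}\Big)\, E\Big(\sum_{n=1}^{J_1} e^{-cn}\, 1\{\overline{Y}_{n-1} - Y_n < y\}\Big).
\]
Since $e^{-cJ_i}\, 1\{J_i = \infty\} = 0$, we can ignore the indicator $1\{J_i < \infty\}$. When $x = 0$, the only term that remains in the second expectation above is $e^{-cJ_1}$, since $\overline{Y}^1_{n-1} \ge Y^1_n$ or $\overline{Y}^2_{n-1} \ge Y^2_n$ for $n < J_1$. It follows that
\[
\frac{P^c(X_i \in W,\ i \in [\zeta])}{1 - e^{-c}} = 1 + E\Big(\sum_{i=0}^{\infty} e^{-cJ_i}\Big)\, E\big(e^{-cJ_1}\big).
\]
Dividing both terms, and using
\[
E\Big(\sum_{n=1}^{J_1} e^{-cn}\, 1\{\overline{Y}_{n-1} - Y_n < y\}\Big) - E\big(e^{-cJ_1}\big) = E\Big(\sum_{n=1}^{J_1 - 1} e^{-cn}\, 1\{\overline{Y}_{n-1} - Y_n < y\}\Big),
\]
we have
\[
h^\uparrow_c(x) = 1 + E\Big(\sum_{n=1}^{J_1 - 1} e^{-cn}\, 1\{\overline{Y}_{n-1} - Y_n < y\}\Big)\, \frac{E\big(\sum_{i=0}^{\infty} e^{-cJ_i}\big)}{1 + E\big(\sum_{i=0}^{\infty} e^{-cJ_i}\big)\, E\big(e^{-cJ_1}\big)}.
\]
Since $J_i = \sum_{k=0}^{i-1} (J_{k+1} - J_k)$ is a sum of i.i.d. random variables, we have $E(e^{-cJ_i}) = (E(e^{-cJ_1}))^i$, and summing the geometric series gives
\[
E\Big(\sum_{i=0}^{\infty} e^{-cJ_i}\Big) = \big(1 - E(e^{-cJ_1})\big)^{-1},
\]
so the quotient above equals one, implying
\[
h^\uparrow_c(x) = 1 + E\Big(\sum_{n=1}^{J_1 - 1} e^{-cn}\, 1\{\overline{Y}_{n-1} - Y_n < y\}\Big).
\]
The result follows from the monotone convergence theorem.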
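The limit formula of Lemma 5 lends itself to a crude Monte Carlo illustration for $d = 3$. The sketch below is only an assumption-laden example: the step law (independent standard Gaussian components of $X$), the truncation of each path at `horizon` steps (which can only bias the estimate downward when $J_1$ has not yet occurred), and all names are hypothetical and not part of the paper.

```python
import random


def h_up_estimate(y, n_paths=2000, horizon=200, seed=0):
    """Monte Carlo estimate of 1 + E(sum_{n=1}^{J_1 - 1} 1{max_{n-1} Y - Y_n < y}), d = 3."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_paths):
        y_now = [0.0, 0.0]   # Y_n = (X^2_n - X^1_n, X^3_n - X^2_n)
        y_max = [0.0, 0.0]   # componentwise running maximum up to time n - 1
        count = 0
        for _n in range(1, horizon + 1):
            x = [rng.gauss(0.0, 1.0) for _ in range(3)]  # assumed step of X
            y_now = [y_now[0] + x[1] - x[0], y_now[1] + x[2] - x[1]]
            if y_max[0] < y_now[0] and y_max[1] < y_now[1]:
                break        # both components set a strict new maximum: this is J_1
            if y_max[0] - y_now[0] < y[0] and y_max[1] - y_now[1] < y[1]:
                count += 1   # the indicator inside the sum, for n < J_1
            y_max = [max(y_max[0], y_now[0]), max(y_max[1], y_now[1])]
        total += count
    return 1 + total / n_paths


h_at_zero = h_up_estimate((0.0, 0.0), n_paths=50, horizon=50)
h_small = h_up_estimate((1.0, 1.0), n_paths=200, horizon=100)
h_large = h_up_estimate((2.0, 2.0), n_paths=200, horizon=100)
```

With a fixed seed the same paths are reused for every `y`, so the estimates are monotone in `y`, and for `y = (0, 0)` the sum is empty, which matches the value $h^\uparrow = 1$ at the origin used later in the finiteness argument.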
Now, we are ready to obtain the transition probabilities of the limit, as stated in Theorem 1.
Proof of Equation (2) in Theorem 1 By Proposition 2 and Lemma 5, the transition probabilities of the random walk conditioned to be ordered up to time $N$ converge to
\[
p^\uparrow(w, dz) = 1\{z \in W\}\, \frac{h^\uparrow(z)}{h^\uparrow(w)}\, p(w, dz), \qquad w \in W.
\]
In the next section, we prove that $h^\uparrow$ is (sub)harmonic, and give a simple condition that ensures it is finite. Then we prove $h^\uparrow$ appears naturally when considering the law of a random walk conditioned to stay ordered up to a geometric time.
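To make the shape of the transformed kernel concrete, here is a minimal sketch of a Doob h-transform in the simplest classical setting, which is an assumption chosen for illustration and not the multidimensional walk of this paper: the simple symmetric walk on the integers killed on leaving $\{1, 2, \dots\}$, with the harmonic function $h(x) = x$. The names `p`, `h` and `p_up` are hypothetical.

```python
from fractions import Fraction


def p(w: int, z: int) -> Fraction:
    """One-step kernel of the simple symmetric random walk on the integers."""
    return Fraction(1, 2) if abs(z - w) == 1 else Fraction(0)


def h(x: int) -> Fraction:
    """h(x) = x is harmonic for the walk killed on leaving {1, 2, ...}."""
    return Fraction(x)


def p_up(w: int, z: int) -> Fraction:
    """h-transformed kernel: 1{z in W} * h(z) / h(w) * p(w, z), for w in W."""
    if w < 1:
        raise ValueError("the starting state must lie in W = {1, 2, ...}")
    if z < 1:
        return Fraction(0)
    return h(z) / h(w) * p(w, z)


# Total mass of each row of the transformed kernel, for a few starting states.
row_mass = {w: sum(p_up(w, z) for z in (w - 1, w + 1)) for w in range(1, 6)}
```

Because this particular $h$ is harmonic, every row of `p_up` has total mass one; with a strictly subharmonic $h$ in the sense used in this paper, the rows would have mass smaller than one, which corresponds to the subprobability case.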
3 Properties of the Limiting h-Function and the Interpretation of the Walk as Conditioned to Stay Ordered Forever

3.1 The Harmonicity of the h-Function Depends on the First Exit Time to W

We know that the first exit time from the Weyl chamber is given by $\tau = \min\{n > 0 : Y^1_n \wedge Y^2_n \le 0\}$. By Lemma 5 and Eq. (4), we rewrite $h^\uparrow$ as
\[
h^\uparrow(x) = \lim_{c \to 0+} \frac{P^c_x(\underline{Y}_N > 0)}{P^c(\underline{Y}_N > 0)} = \lim_{c \to 0+} \frac{P_x(\tau > N)}{P(\tau > N)}.
\]
Let $Q_x$ be the law of $X$ killed at the first exit of the Weyl chamber, that is, for $n \in \mathbb{N}$ and $\Lambda \in \mathcal{F}_n$,
\[
Q_x(\Lambda, n < \zeta) = P_x(\Lambda, n < \tau).
\]
Expectations under $Q_x$ will be denoted by $E^Q_x$. The next lemma gives us conditions to know if $h^\uparrow$ is harmonic or subharmonic. It is based on Lemma 1 of [3].

Lemma 6 Let $x \in W$. If $E(\tau) < \infty$, then $h^\uparrow$ is subharmonic and
\[
E^Q_x\big(h^\uparrow(X_n)\, 1\{n < \zeta\}\big) < h^\uparrow(x).
\]
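If $E(\tau) = \infty$, then $h^\uparrow$ is harmonic and
\[
E^Q_x\big(h^\uparrow(X_n)\, 1\{n < \zeta\}\big) = h^\uparrow(x).
\]
Proof Since we proved in Lemma 5 that the convergence of $h^\uparrow_c$ to $h^\uparrow$ is monotone, then
\[
E^Q_x\big(h^\uparrow(X_n)\, 1\{n < \zeta\}\big) = \lim_{c \to 0+} E_x\Big(\frac{P_{X_n}(\tau > N)}{P(\tau > N)}\, 1\{n < \tau\}\Big). \tag{7}
\]
Using the Markov property,
\[
\begin{aligned}
P_x(\tau > n + N) &= E_x\big(1\{Y^2_k \wedge Y^1_k > 0,\ k \in [n + N]\}\big) \\
&= E_x\big(1\{Y^2_k \wedge Y^1_k > 0,\ k \in [n]\}\, 1\{Y^2_k \wedge Y^1_k > 0,\ k \in [N]\} \circ \theta_n\big) \\
&= E_x\big(1\{\tau > n\}\, P_{X_n}(\tau > N)\big),
\end{aligned}
\]
which is the numerator in the right-hand side of Eq. (7). Summing over all the values of $N$,
\[
P_x(\tau > n + N) = \sum_{k} P_x(\tau > n + k, N = k) = e^{cn} \sum_{k \ge n} P_x(\tau > k)\, (1 - e^{-c})\, e^{-ck}.
\]
Starting the sum from $k = 0$, we obtain
\[
P_x(\tau > n + N) = e^{cn} \Big(P_x(\tau > N) - \sum_{k=0}^{n-1} P_x(\tau > k)\, P(N = k)\Big).
\]
Thus, the right-hand side of Eq. (7) is equal to
\[
\lim_{c \to 0+} e^{cn} \Big(\frac{P_x(\tau > N)}{P(\tau > N)} - \sum_{k=0}^{n-1} \frac{P_x(\tau > k)\, e^{-ck}}{\sum_{m \ge 0} P(\tau > m)\, e^{-cm}}\Big) = h^\uparrow(x) - \frac{1}{E(\tau)} \sum_{k=0}^{n-1} P_x(\tau > k),
\]
which proves the lemma, since $P_x(\tau > 0) = P(x + X_0 \in W) = 1$.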
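The re-indexing step in the proof, which rewrites $P_x(\tau > n + N)$ in terms of $P_x(\tau > N)$ and the first $n$ terms of the sum, is a purely algebraic identity and can be checked numerically for any survival function. In the sketch below the sequence `tail(k)`, standing in for $P_x(\tau > k)$, is a hypothetical choice made only for illustration, as are the values of `c` and `n`.

```python
import math


def pmf(c: float, k: int) -> float:
    """P(N = k) for the geometric time with parameter 1 - exp(-c)."""
    return (1.0 - math.exp(-c)) * math.exp(-c * k)


def tail(k: int) -> float:
    """Hypothetical survival function standing in for P_x(tau > k)."""
    return 0.7 ** k


def lhs(c: float, n: int, terms: int = 500) -> float:
    """P_x(tau > n + N) = sum_k P_x(tau > n + k) P(N = k), truncated."""
    return sum(tail(n + k) * pmf(c, k) for k in range(terms))


def rhs(c: float, n: int, terms: int = 500) -> float:
    """e^{cn} (P_x(tau > N) - sum_{k < n} P_x(tau > k) P(N = k)), truncated."""
    full = sum(tail(k) * pmf(c, k) for k in range(terms))
    head = sum(tail(k) * pmf(c, k) for k in range(n))
    return math.exp(c * n) * (full - head)


gap = abs(lhs(0.2, 4) - rhs(0.2, 4))
```

The quantity `gap` is zero up to floating-point and truncation error, whatever non-increasing sequence is used for `tail`.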
3.2 Finiteness of the h-Function

To prove $h^\uparrow(x) < \infty$ for every $x \in W$, we use the remark of Lemma 1 in [20]. In this subsection, the inequality $x > z$ for $x, z \in \mathbb{R}^3$ means there is strict inequality component-wise.

Lemma 7 Assume there exists $\epsilon = (\epsilon_1, \epsilon_2) \in \mathbb{R}^2_+$ such that $P\big((X^2_1 - X^1_1, X^3_1 - X^2_1) > \epsilon\big) > 0$. Then
\[
h^\uparrow(x) < \infty \qquad \forall\, x \in W.
\]
Remark 2 In Lemma 9 we give other conditions, depending on the drift of the components of $Y$, under which $h^\uparrow$ is finite.

Proof Note that Lemma 6 was independent of the finiteness of $h^\uparrow$. Hence, from that lemma and taking $x \in W$ we have
\[
h^\uparrow(x) \ge \int P(x + X_1 \in dz, 1 < \tau)\, h^\uparrow(z) = \int_{z \in W} P(x + X_1 \in dz)\, h^\uparrow(z).
\]
For $x \in W$, define the function $g(x) := x^- := (x_2 - x_1, x_3 - x_2)$. For instance, we have $X^- = (X^2 - X^1, X^3 - X^2)$. Then, we have
\[
P(x + X_1 \in dz) = P\big(x_1 + X^1_1 \in dz_1,\ x^- + X^-_1 \in dz^-\big).
\]
Note from Lemma 5 that $h^\uparrow(x)$ depends on $x$ only through $x^-$. Define $h^- : \mathbb{R}^2_+ \cup \{(0, 0)\} \to \mathbb{R}_+$ as $h^-(x^-) := h^\uparrow(x)$, so
\[
h^-(x^-) = 1 + E\Big(\sum_{n=1}^{J_1 - 1} 1\{\overline{Y}_{n-1} - Y_n < x^-\}\Big).
\]
It follows that
\[
h^-(x^-) \ge \int_{z \in W} P\big(x_1 + X^1_1 \in dz_1,\ x^- + X^-_1 \in dz^-\big)\, h^-(z^-). \tag{8}
\]
Also, note that $h^-(0) = 1$, since $1\{\overline{X}^-_0 - X^-_1 \le 0\} = 1$ implies $J_1 = 1$.
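Assume that $h^-(z^-) = \infty$ for every $z^- > \epsilon$. Fix any $x_1 \in \mathbb{R}$, and use $x = (x_1, x_1, x_1)$ in (8) to obtain
\[
1 = h^-(0) = h^\uparrow(x) \ge \int_{z \in W \cap \{z \in \mathbb{R}^3 : z^- > \epsilon\}} P\big(x_1 + X^1_1 \in dz_1,\ x^- + X^-_1 \in dz^-\big)\, h^-(z^-).
\]
Hence, it should be the case that
\[
0 = P\big(x_1 + X^1_1 \in \mathbb{R},\ x^- + X^-_1 > \epsilon\big) = P\big(X^-_1 > \epsilon\big),
\]
contradicting the hypothesis. Therefore, there exists $z_{(1)} = (x_1, x_1 + z_{(1),1}, x_1 + z_{(1),1} + z_{(1),2}) \in W$ such that
\[
z^-_{(1)} > \epsilon \qquad \text{and} \qquad h^\uparrow(z_{(1)}) = h^-(z^-_{(1)}) < \infty.
\]
Now, assume $h^-(z^-) = \infty$ for every $z^- > \epsilon + z^-_{(1)}$. Use $x = z_{(1)}$ in (8) to obtain
\[
\infty > h^-(z^-_{(1)}) \ge \int_{z \in W \cap \{z \in \mathbb{R}^3 : z^- > \epsilon + z^-_{(1)}\}} P\big(x_1 + X^1_1 \in dz_1,\ z^-_{(1)} + X^-_1 \in dz^-\big)\, h^-(z^-).
\]
Then, it should be true that
\[
0 = P\big(x_1 + X^1_1 \in \mathbb{R},\ z^-_{(1)} + X^-_1 > \epsilon + z^-_{(1)}\big) = P\big(X^-_1 > \epsilon\big),
\]
again contradicting the hypothesis. Hence, there exists $z_{(2)} = (x_1, x_1 + z_{(2),1}, x_1 + z_{(2),1} + z_{(2),2}) \in W$ such that
\[
z^-_{(2)} > \epsilon + z^-_{(1)} \qquad \text{and} \qquad h^\uparrow(z_{(2)}) = h^-(z^-_{(2)}) < \infty.
\]
Continuing in this way, there is some sequence $(z_{(n)}, n \in \mathbb{N})$, with $z_{(n)} = (x_1, x_1 + z_{(n),1}, x_1 + z_{(n),1} + z_{(n),2}) \in W$, satisfying
\[
z^-_{(n)} > \epsilon + z^-_{(n-1)} \qquad \text{and} \qquad h^\uparrow(z_{(n)}) = h^-(z^-_{(n)}) < \infty,
\]
for every $n$. Fix any $x = (x_1, x_2, x_3) \in W$. We prove that $h^\uparrow(x) < \infty$. Note that in the previous analysis, $x_1$ was arbitrary. Let $n \in \mathbb{N}$ be such that
\[
z^-_{(n),1} \wedge z^-_{(n),2} > (n \epsilon_1) \wedge (n \epsilon_2) > (x_3 - x_2) \vee (x_2 - x_1).
\]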
It follows that
\[
h^\uparrow(x) = 1 + E\Big(\sum_{n=1}^{J_1 - 1} 1\{\overline{X}^-_{n-1} - X^-_n < x^-\}\Big) \le 1 + E\Big(\sum_{n=1}^{J_1 - 1} 1\{\overline{X}^-_{n-1} - X^-_n < z^-_{(n)}\}\Big),
\]
which is finite by construction.
3.3 Ordered Random Walks as the Limit Law of Random Walks Conditioned to Stay Ordered up to a Geometric Time

Let $(q_n, n \ge 1)$ be the transition probabilities of $(X, Q)$. From Theorem 1, denote by $(p^\uparrow_n, n \ge 1)$ the transition probabilities of $X$ conditioned to stay ordered
\[
p^\uparrow_n(w, dz) = \frac{h^\uparrow(z)}{h^\uparrow(w)}\, q_n(w, dz), \qquad w \in W,\ n \in \mathbb{N}.
\]
The law of the Markov process with transition probabilities $(p^\uparrow_n, n \ge 1)$ and starting from $x \in W$ is denoted by $P^\uparrow_x$. Hence, for $n \in \mathbb{N}$ and $\Lambda \in \mathcal{F}_n$,
\[
P^\uparrow_x(\Lambda, n < \zeta) = \frac{1}{h^\uparrow(x)}\, E^Q_x\big(h^\uparrow(X_n)\, 1\{\Lambda, n < \zeta\}\big). \tag{9}
\]
Its lifetime is $P^\uparrow_x$-finite if $h^\uparrow$ is subharmonic and $P^\uparrow_x$-infinite if it is harmonic. Let us prove $(X, P^\uparrow_x)$ is the limit as $c \to 0+$ of $(X, P_x)$ conditioned to have ordered components up to a geometric time. This is based on Proposition 1 of [3], and it is part of the proof of Theorem 1.

Lemma 8 Let $N$ be a geometric time with parameter $1 - e^{-c}$, independent of $(X, P)$. Assume $h^\uparrow(x) < \infty$ for every $x \in W$. Then, for every $x \in W$, every finite $(\mathcal{F}_n)$-stopping time $T$ and $\Lambda \in \mathcal{F}_T$,
\[
\lim_{c \to 0+} P_x(\Lambda, T \le N \mid X(i) \in W,\ i \in [N]) = P^\uparrow_x(\Lambda, T < \zeta).
\]
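Proof First we use a deterministic time $T \in \mathbb{N}$. Note that $\{T < \tau\} = \{X(i) \in W,\ i \in [T]\}$. We work with $P_x(\Lambda, T \le N, X(i) \in W,\ i \in [N])$. Separating the first $T$ values of $X$ and summing over all the values of $N$,
\[
P_x(\Lambda, T \le N, X(i) \in W,\ i \in [N]) = \sum_{n \ge T} P_x(\Lambda, T \le n, X(i) \in W,\ i \in [T], X(T + i) \in W,\ i \in [n - T])\, P(N = n);
\]
starting the sum at zero and using the Markov property at $\mathcal{F}_T$,
\[
\begin{aligned}
P_x(\Lambda, T \le N, X(i) \in W,\ i \in [N]) &= e^{-cT} \sum_{n \ge 0} P_x\big(\Lambda, T < \tau, P_x(X(T + i) \in W,\ i \in [n] \mid \mathcal{F}_T)\big)\, P(N = n) \\
&= e^{-cT}\, P_x\big(\Lambda, T < \tau, P_{X_T}(N < \tau)\big) \\
&= P_x\big(\Lambda, T < \tau, T \le N, P_{X_T}(N < \tau)\big).
\end{aligned}
\]
Now, consider $c_0 > 0$ and any $c \in (0, c_0)$. Recall from Lemma 5 that $h^\uparrow_c$ increases to $h^\uparrow$, hence
\[
1\{\Lambda, T < \tau, T \le N\}\, \frac{P_{X_T}(\tau > N)}{P_x(\tau > N)} = 1\{\Lambda, T < \tau, T \le N\}\, \frac{h^\uparrow_c(X_T)}{h^\uparrow_c(x)} \le 1\{\Lambda, T < \tau\}\, \frac{h^\uparrow(X_T)}{h^\uparrow_{c_0}(x)}.
\]
Taking expectations on both sides and using Lemma 6,
\[
E_x\Big(1\{\Lambda, T < \tau, T \le N\}\, \frac{P_{X_T}(\tau > N)}{P_x(\tau > N)}\Big) \le \frac{h^\uparrow(x)}{h^\uparrow_{c_0}(x)},
\]
and the right-hand side is finite by hypothesis. Hence, by Lebesgue's dominated convergence theorem,
\[
\lim_{c \to 0+} P_x(\Lambda, T \le N \mid \tau > N) = \lim_{c \to 0+} E_x\Big(1\{\Lambda, T < \tau, T \le N\}\, \frac{P_{X_T}(\tau > N)}{P_x(\tau > N)}\Big) = P^\uparrow_x(\Lambda, T < \zeta).
\]
Let us prove the same convergence for any finite stopping time $T$. Summing over all the values of $T$, the equality
\[
P_x(\Lambda, T \le N < \tau) = P_x\big(\Lambda, T < \tau, T \le N, P_{X_T}(N < \tau)\big)
\]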
and Eq. (9) hold for $T$. We need to prove that Lemma 6 holds true for any stopping time $T < \infty$ a.s. Summing over all the values of $T$, in the subharmonic case
\[
E^Q_x\big(h^\uparrow(X_T)\, 1\{T < \zeta\}\big) = \sum_{n} E_x\big(h^\uparrow(X_n)\, 1\{n < \tau, T = n\}\big) \le h^\uparrow(x) \sum_{n} P^\uparrow_x(T = n, n < \zeta) = h^\uparrow(x)\, P^\uparrow_x(T < \infty, T < \zeta),
\]
which is smaller than $h^\uparrow(x)$. In the harmonic case, the inequality above is an equality, so it remains to prove that $P^\uparrow_x(T < \infty, T < \zeta) = 1$. But this is clear since
\[
P^\uparrow_x(T = \infty, T < \zeta) = \lim_{n} \frac{1}{h^\uparrow(x)}\, E_x\big(h^\uparrow(X_T)\, 1\{T > n\}\big) = 0,
\]
by monotone convergence. In the next section, we obtain several reexpressions of the h-function.
4 Reexpressions of the h-Function 4.1 Reexpresions Using the Minimum of the Descending Ladder Times of the Components Changing the measure to start at zero, we have
E
∞
e
−cn
∞ ) ) * * −cn 1 −y < Y e 1 0 0, then Px β11 ∧ β12 = ∞ Px (τ = ∞) = . h (x) = 1 2 P (τ = ∞) P β1 ∧ β1 = ∞ ↑
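Proof If $Y^k$ drifts to $-\infty$ for some $k = 1, 2$, then $E(\beta^1_1 \wedge \beta^2_1) \le E(\beta^k_1) < \infty$ by Proposition 9.3, page 167 of [14]. Therefore, by the monotone convergence theorem,
\[
h^\uparrow_c(x) \to \frac{1 + E_x(\beta^1_1 \wedge \beta^2_1 - 1)}{1 + E(\beta^1_1 \wedge \beta^2_1 - 1)}.
\]
If every component drifts to $+\infty$, then $Y^1 \wedge Y^2$ has a finite minimum with positive probability. By hypothesis $P(\beta^1_1 \wedge \beta^2_1 = \infty) > 0$, so
\[
\begin{aligned}
h^\uparrow_c(x) &= \frac{E_x\big(\sum_{n=0}^{\beta^1_1 \wedge \beta^2_1 - 1} e^{-cn}\big)}{E\big(\sum_{n=0}^{\beta^1_1 \wedge \beta^2_1 - 1} e^{-cn}\big)} \\
&= \frac{E_x\big((1 - e^{-c(\beta^1_1 \wedge \beta^2_1)})\, 1\{\beta^1_1 \wedge \beta^2_1 < \infty\} + 1\{\beta^1_1 \wedge \beta^2_1 = \infty\}\big)}{E\big((1 - e^{-c(\beta^1_1 \wedge \beta^2_1)})\, 1\{\beta^1_1 \wedge \beta^2_1 < \infty\} + 1\{\beta^1_1 \wedge \beta^2_1 = \infty\}\big)} \\
&\to \frac{P_x(\beta^1_1 \wedge \beta^2_1 = \infty)}{P(\beta^1_1 \wedge \beta^2_1 = \infty)}.
\end{aligned}
\]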
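4.2 Reexpressions Using the Ordered Union of the Descending Ladder Times

Let $\{\beta_1, \beta_2, \dots\}$ be the ordered union of the positive strict descending ladder times of $Y^1$ and $Y^2$, that is, the ordered union of $\{\beta^1_i, \beta^2_j,\ i, j \ge 1\}$. Define $\beta_0 = 1$. Denoting by $g_n = \beta_n$ and $d_n = \beta_{n+1}$ for $n \ge 0$, the set $\{g_n, g_n + 1, \dots, d_n - 1\}$ is the $n$th interval where $\underline{Y}$ remains constant. Partitioning $\mathbb{N}$ on such intervals, from Eq. (4)
\[
\begin{aligned}
1 + E\Big(\sum_{n=1}^{\infty} e^{-cn}\, 1\{-y < \underline{Y}_n\}\Big) &= E\Big(\sum_{n \ge 0} 1\{g_n < \infty\} \sum_{k=g_n}^{d_n - 1} e^{-c(k - g_n)}\, e^{-cg_n}\, 1\{-y < \underline{Y}_k\}\Big) \\
&= E\Big(\sum_{n \ge 0} e^{-cg_n}\, 1\{g_n < \infty, -y < \underline{Y}_{g_n}\} \sum_{k=0}^{d_n - g_n - 1} e^{-ck}\Big).
\end{aligned}
\]
The above equation for $x = 0$ is
\[
E\Big(\sum_{n=0}^{\infty} e^{-cn}\, 1\{0 < \underline{Y}_n\}\Big) = E\Big(\sum_{k=0}^{d_0 - 1} e^{-ck}\Big).
\]
Note that $d_0 = \beta^1_1 \wedge \beta^2_1$. Also, note that $-y \le 0 \le \underline{Y}_{d_0 - 1}$, therefore
\[
\begin{aligned}
h^\uparrow_c(x) &= \frac{E\big(\sum_{k=0}^{d_0 - 1} e^{-ck}\big) + \sum_{n=1}^{\infty} E\big(e^{-cg_n}\, 1\{g_n < \infty, -y < \underline{Y}_{g_n}\} \sum_{k=0}^{d_n - g_n - 1} e^{-ck}\big)}{E\big(\sum_{k=0}^{d_0 - 1} e^{-ck}\big)} \\
&= 1 + \Big(E\Big(\sum_{k=0}^{d_0 - 1} e^{-ck}\Big)\Big)^{-1} \sum_{n=1}^{\infty} E\Big(e^{-cg_n}\, 1\{g_n < \infty, -y < \underline{Y}_{g_n}\} \sum_{k=0}^{d_n - g_n - 1} e^{-ck}\Big).
\end{aligned}
\]
As before, depending on the asymptotic behavior of the components, we can obtain a limit.

Proposition 3 If some component of $Y$ drifts to $-\infty$, then
\[
E(d_n - g_n) < \infty \qquad \forall\, n,
\]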
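and the h-function is
\[
h^\uparrow(x) = 1 + \sum_{n=1}^{\infty} \frac{E\big(d_n - g_n;\ g_n < \infty, -y < \underline{Y}_{g_n}\big)}{E(d_0)}.
\]
If every component drifts to $+\infty$ and $P(d_0 = \infty) > 0$, then
\[
h^\uparrow(x) = 1 + \sum_{n=1}^{\infty} \frac{P\big(g_n < \infty, -y < \underline{Y}_{g_n}, d_n - g_n = \infty\big)}{P(d_0 = \infty)}.
\]
Proof As before, if the component $k$ drifts to $-\infty$, the first case follows by the monotone convergence theorem and $d_n - g_n \le \beta^k_j - \beta^k_{j-1}$ for some $j$. The second case follows by
\[
\sum_{k=0}^{d_n - g_n - 1} e^{-ck} = \frac{1 - e^{-c(d_n - g_n)}}{1 - e^{-c}}\, 1\{d_n < \infty\} + \frac{1}{1 - e^{-c}}\, 1\{d_n = \infty\},
\]
and using the monotone convergence theorem.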
4.3 Reexpression as a Renewal Function

Recall from the previous section that $(\beta_n, n \ge 0)$ is the ordered union of the strict descending ladder times of $Y^1$ and $Y^2$. We have the following result.

Proposition 4 The h-function $h^\uparrow$ can be expressed as
\[
h^\uparrow(x) = 1 + \sum_{n=1}^{\infty} P\big(-\underline{Y}_{\beta_n} < y\big).
\]
Proof First we express $h^\uparrow$ as an infinite sum. Using Tonelli's theorem and Theorem 1, we have
\[
h^\uparrow(x) - 1 = E\Big(\sum_{n=1}^{J_1 - 1} 1\{\overline{Y}_{n-1} - Y_n < y\}\Big) = \sum_{n=1}^{\infty} E\big(1\{\overline{Y}_{n-1} - Y_n < y\}\, 1\{J_1 > n\}\big). \tag{10}
\]
The event $\{J_1 > n\}$ means that for every $j \in [n]$, there is some $k$ such that the running maximum at time $j - 1$ of $Y^k$ is at least $Y^k_j$. This is written as
\[
\{J_1 > n\} = \bigcap_{j=1}^{n} \bigcup_{k} \big\{\max\{Y^k_l;\ 0 \le l \le j - 1\} - Y^k_j \ge 0\big\}.
\]
Recall the equality in distribution between $Y$ and $Y^*$, which is $Y$ reversed in time. Also, recall the equality in distribution of $-\underline{Y}$ and $\overline{Y}_{\cdot - 1} - Y_\cdot$ of Eq. (5). Hence, we have
\[
\begin{aligned}
\{J_1 > n\} &\overset{d}{=} \bigcap_{j=1}^{n} \bigcup_{k} \big\{\max\{Y^{k,*}_l;\ 0 \le l \le j - 1\} - Y^{k,*}_j > 0\big\} \\
&= \bigcap_{j=1}^{n} \bigcup_{k} \big\{\max\{-Y^k_{n-l};\ 0 \le l \le j - 1\} + Y^k_{n-j} > 0\big\} \\
&= \bigcap_{j=1}^{n} \bigcup_{k} \big\{\min\{Y^k_{n-l};\ 0 \le l \le j - 1\} < Y^k_{n-j}\big\}.
\end{aligned}
\]
In a similar way, we can prove that
\[
\{\overline{Y}_{n-1} - Y_n \le y, J_1 > n\} \overset{d}{=} \{\underline{Y}^k_n \ge -y_k,\ \forall k\} \cap \bigcap_{j=1}^{n} \bigcup_{k} \big\{\min\{Y^k_{n-l};\ 0 \le l \le j - 1\} < Y^k_{n-j}\big\}.
\]
Now we prove the last term means $n$ is a strict descending ladder time of some $Y^k$. In fact, reordering the index set, the last term is equal to
\[
\bigcap_{j=1}^{n} \bigcup_{k} \big\{\min\{Y^k_l;\ n - j + 1 \le l \le n\} < Y^k_{n-j}\big\} = \bigcap_{j=1}^{n} \bigcup_{k} \big\{\min\{Y^k_l;\ j \le l \le n\} < Y^k_{j-1}\big\}.
\]
The right-hand side means the future minimum of $Y^k$ up to time $n$ is always smaller than the current value of $Y^k$, for some $k$. Thus, the time $n$ is a strict descending ladder time of some $Y^k$, and
\[
h^\uparrow(x) = 1 + \sum_{n=1}^{\infty} P\big(\underline{Y}_{\beta_n} > -y\big).
\]
The next section is devoted to obtain conditions for the finiteness of E(τ ).
5 Known Results About the Expectation of the First Exit Time of W, to Ensure the h-Function Is Harmonic

In Theorem 1 of [4], the tail of the distribution of $\tau$ is computed. Explicitly, let $X = (X^1, \dots, X^d)$ be a random walk with i.i.d. components on $\mathbb{R}$. Under the assumptions that the step distribution has mean zero and the $\alpha$ moment is finite for $\alpha = d - 1$ if $d > 3$, and $\alpha > 2$ if $d = 3$, they prove
\[
\lim_{n} n^{d(d-1)/4}\, P_x(\tau > n) = K V(x),
\]
where $K$ is an explicit constant and $V$ is given by
\[
V(x) = \Delta(x) - E_x\big(\Delta(X(\tau))\big), \qquad x \in W \cap S^d,
\]
with $S \subset \mathbb{R}$ the state space of the random walks, and $\Delta$ defined in 1. This implies that for $x \in W \cap S^d$
\[
E_x(\tau) < \infty \qquad \text{whenever } d \ge 3.
\]
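The step from the tail asymptotics to the finiteness of the mean uses $E_x(\tau) = \sum_{n \ge 0} P_x(\tau > n)$ together with the fact that $\sum_n n^{-a}$ converges exactly when $a > 1$. The following small sketch, with hypothetical function names, simply tabulates the exponent $d(d-1)/4$ and checks for which dimensions it exceeds one; it is a sanity check of the arithmetic, not of the theorem in [4], whose moment assumptions are not modeled here.

```python
from fractions import Fraction


def tail_exponent(d: int) -> Fraction:
    """Exponent a in P_x(tau > n) ~ K V(x) n^{-a}, namely a = d(d-1)/4."""
    return Fraction(d * (d - 1), 4)


def mean_exit_time_finite(d: int) -> bool:
    """The series sum_n n^{-a} converges if and only if a > 1."""
    return tail_exponent(d) > 1


finite_by_dimension = {d: mean_exit_time_finite(d) for d in range(2, 7)}
```

For $d = 3$ the exponent is $3/2$, so the series converges, while for $d = 2$ the exponent $1/2$ would give a divergent series.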
This suggests that $E(\tau) < \infty$ in this case. In the paper [7] the author obtains the asymptotic behavior of $\tau_x$, for random walks with non-zero drift killed when leaving general cones in $\mathbb{R}^d$. Under some assumptions, in particular the step distribution having all moments and a drift pointing out of the cone, the existence of a function $U$ is proved such that
\[
P(\tau_x > n) \sim \rho\, c^n\, n^{-p - d/2}\, U(x).
\]
The value $p \ge 1$ is the order of some homogeneous function, and $c \in [0, 1]$. This suggests $E(\tau_x) \le \rho\, U(x) \sum_{n} c^n < \infty$ whenever $c \in (0, 1)$.
In the paper [5], the authors obtain
\[
P(\tau_x > n) \sim c\, V(x)\, n^{-p/2}, \qquad n \to \infty,
\]
for random walks in a cone, with components having zero mean, variance one, covariance zero, and some finite moment. In that case, the value $p$ is
\[
p = \sqrt{\lambda_1 + (d/2 - 1)^2} - (d/2 - 1) > 0.
\]
Thus, the expectation of $\tau_x$ is infinite iff
\[
1 \ge p/2 \iff (d/2 + 1)^2 \ge \lambda_1 + (d/2 - 1)^2 \iff 2d \ge \lambda_1.
\]
The paper [11] computes the asymptotic exit time probability for random walks in cones, under some general conditions. The first is that the support of the probability measure of $X(1)$ is not included in any linear hyperplane. The second is that, if $L$ is the Laplace transform of the random walk having $x^*$ as a minimum, then $L$ is finite on an open neighborhood of $x^*$, and that this value belongs to the dual cone. Under such hypotheses, they prove that
\[
\lim_{n \to \infty} P_x(\tau > n)^{1/n} = L(x^*),
\]
for all $x \in K_\delta := K + \delta v$, for some $\delta \ge 0$ and some fixed $v$ in $K^o$. The authors note that in general, there is no explicit link between the drift $m$ of the walk (if it exists), $x^*$ and $L(x^*)$. The only exception is when $m \in K$. In such case, $L(x^*) = 1$ iff $x^* = 0$. Furthermore, when the drift $m$ exists, then $m \in K$ iff $x^* = 0$. Hence, if we want that $E(\tau_x) = \infty$, we should restrict to the case $L(x^*) = 1$.

Acknowledgement Research supported by CoNaCyT grant FC-2016-1946 and UNAM-DGAPAPAPIIT grant IN115217.
References
1. Bertoin, J.: Splitting at the infimum and excursions in half-lines for random walks and Lévy processes. Stoch. Process. Appl. 47(1), 17–35 (1993). MR 1232850
2. Bertoin, J., Doney, R.A.: On conditioning a random walk to stay nonnegative. Ann. Probab. 22(4), 2152–2167 (1994). MR 1331218
3. Chaumont, L., Doney, R.A.: On Lévy processes conditioned to stay positive. Electron. J. Probab. 10(28), 948–961 (2005). MR 2164035
4. Denisov, D., Wachtel, V.: Conditional limit theorems for ordered random walks. Electron. J. Probab. 15(11), 292–322 (2010). MR 2609589
5. Denisov, D., Wachtel, V.: Random walks in cones. Ann. Probab. 43(3), 992–1044 (2015). MR 3342657
6. Doney, R.A.: Fluctuation Theory for Lévy Processes. Lecture Notes in Mathematics, vol. 1897. Springer, Berlin (2007). Lectures from the 35th Summer School on Probability Theory held in Saint-Flour, July 6–23, 2005, Edited and with a foreword by Jean Picard. MR 2320889
7. Duraj, J.: On harmonic functions of killed random walks in convex cones. Electron. Commun. Probab. 19(80), 10 (2014). MR 3283611
8. Duraj, J.: Random walks in cones: the case of nonzero drift. Stoch. Process. Appl. 124(4), 1503–1518 (2014). MR 3163211
9. Dyson, F.J.: A Brownian-motion model for the eigenvalues of a random matrix. J. Math. Phys. 3, 1191–1198 (1962). MR 0148397
10. Eichelsbacher, P., König, W.: Ordered random walks. Electron. J. Probab. 13(46), 1307–1336 (2008). MR 2430709
11. Garbit, R., Raschel, K.: On the exit time from a cone for random walks with drift. Rev. Mat. Iberoam. 32(2), 511–532 (2016). MR 3512425
12. Ignatiouk-Robert, I.: Harmonic functions of random walks in a semigroup via ladder heights. ArXiv e-prints (2018)
13. Izumi, M., Katori, M.: Extreme value distributions of noncolliding diffusion processes, Spectra of random operators and related topics. In: RIMS Kôkyûroku Bessatsu, vol. B27, pp. 45–65. Research Institute for Mathematical Sciences (RIMS), Kyoto (2011). MR 2885254
14. Kallenberg, O.: Foundations of Modern Probability. Probability and its Applications, 2nd edn. Springer, New York (2002). MR 1876169
15. Karlin, S., McGregor, J.: Coincidence probabilities. Pac. J. Math. 9, 1141–1164 (1959). MR 0114248
16. Katori, M., Tanemura, H.: Symmetry of matrix-valued stochastic processes and noncolliding diffusion particle systems. J. Math. Phys. 45(8), 3058–3085 (2004). MR 2077500
17. König, W.: Orthogonal polynomial ensembles in probability theory. Probab. Surv. 2, 385–447 (2005). MR 2203677
18. König, W., O'Connell, N., Roch, S.: Non-colliding random walks, tandem queues, and discrete orthogonal polynomial ensembles. Electron. J. Probab. 7(5), 24pp (2002). MR 1887625
19. Martínez, S., San Martín, J.: Quasi-stationary distributions for a Brownian motion with drift and associated limit laws. J. Appl. Probab. 31(4), 911–920 (1994). MR 1303922
20. Tanaka, H.: Time reversal of random walks in one-dimension. Tokyo J. Math. 12(1), 159–174 (1989). MR 1001739
21. Tanaka, H.: Lévy processes conditioned to stay positive and diffusions in random environments. In: Stochastic Analysis on Large Scale Interacting Systems. Advanced Studies in Pure Mathematics, vol. 39, pp. 355–376. Mathematical Society of Japan, Tokyo (2004). MR 2073341
A Berry–Esseen Type Theorem for Finite Free Convolution Octavio Arizmendi and Daniel Perales
Abstract We prove that the rate of convergence for the central limit theorem in finite free convolution is of order $n^{-1/2}$.
1 Introduction

In recent years, the convolution of polynomials, first studied by Walsh [10], was revisited by Marcus et al. [8], in order to exhibit bounds for the eigenvalues of expected characteristic polynomials of certain d-regular graphs, in their aim to construct bipartite Ramanujan graphs of all sizes [9]. The authors refer to this convolution as finite free additive convolution because of its relation to free convolution, see [1, 7, 9]. In [7], Marcus showed that the Central Limit Theorem for this convolution is given by the polynomial $D_{1/\sqrt{d}}(H_d)$, where $H_d$ is an Hermite polynomial and may be written as

$$H_d(x) = d! \sum_{i=0}^{\lfloor d/2 \rfloor} \frac{(-1)^i}{i!\,(d-2i)!}\, \frac{x^{d-2i}}{2^i},$$
and in general for any $\lambda > 0$ and polynomial $p$ of degree $d$, $D_\lambda p(x) := \lambda^d p(x/\lambda)$ is the dilation by $\lambda$ of the polynomial $p$. In this note we are interested in the rate of convergence in the Central Limit Theorem for finite free convolution. Recall that for the central limit theorem in probability, the Berry–Esseen Theorem [2, 4] states that if $\mu$ is a probability measure with $m_1(\mu) = 0$, $m_2(\mu) = 1$, and $\int_{\mathbb{R}} |x|^3\, d\mu < \infty$, then the distance to the standard
O. Arizmendi () · D. Perales Centro de Investigación en Matemáticas, Guanajuato, México e-mail: [email protected] © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 S. I. López et al. (eds.), XIII Symposium on Probability and Stochastic Processes, Progress in Probability 75, https://doi.org/10.1007/978-3-030-57513-7_3
Gaussian distribution $\mathcal{N}$ is bounded as follows

$$d_{kol}\left(D_{1/\sqrt{n}}\,\mu^{*n}, \mathcal{N}\right) \le C\, \frac{\int_{\mathbb{R}} |x|^3\, d\mu}{\sqrt{n}},$$
where $d_{kol}$ denotes the Kolmogorov distance between measures, $D_b\mu$ denotes the dilation of a measure $\mu$ by a factor $b > 0$, $*$ denotes the classical convolution, and $C$ is an absolute constant.

Our main result shows that, as in the classical and free cases [3, 5], in finite free probability we also achieve a rate of order $n^{-1/2}$. However, we use the Lévy distance (see Sect. 2.3 for more details) instead of the Kolmogorov distance. The reason is that we are dealing with measures supported on $d$ atoms of size $1/d$: two such measures that differ are at Kolmogorov distance at least $1/d$, so we cannot expect better. Thus, for two polynomials $p$ and $q$ of degree $d$, let us define the distance between them to be $L(p, q) := d_L(\mu_p, \mu_q)$, where $d_L$ is the Lévy distance and the measures $\mu_p$ and $\mu_q$ are the distributions of $p$ and $q$, respectively. In this language we can state our contribution as follows.

Theorem 1.1 Let $d \in \mathbb{N}$ and let $p$ be a real polynomial of degree $d$ such that the first two moments of $\mu_p$ are $m_1 = 0$ and $m_2 = 1$. Then, there exists a constant $C_d$, depending only on $d$, such that for all $n > 0$,

$$L\left(D_{1/\sqrt{n}}(p^{\boxplus_d n}), D_{1/\sqrt{d}}(H_d)\right) < \frac{C_d}{\sqrt{n}}.$$

The main tool to prove the above rate of convergence is the cumulants for finite free convolution, as we defined them in [1]. These cumulants give a combinatorial approach to investigate this convolution and its relation to free probability. In particular we showed that finite free cumulants approach free cumulants, providing a combinatorial perspective on the fact that finite free convolution approaches free convolution in the limit. Using these cumulants we were able to show that some properties of free convolution are valid already in the finite free case. The above theorem is another instance of the fact that many properties in free probability already appear at the finite level.

Apart from this introduction this note consists of two sections. Section 2 gives the preliminaries for the theory of finite free probability and in Sect. 3 we give the proof of the main theorem, Theorem 1.1.
2 Preliminaries

We give very basic preliminaries on finite free convolution; we refer to [1, 7] for details.
2.1 Finite Free Convolution

For two polynomials, $p(x) = \sum_{i=0}^d x^{d-i}(-1)^i a_i^p$ and $q(x) = \sum_{i=0}^d x^{d-i}(-1)^i a_i^q$, the finite free additive convolution of $p$ and $q$ is given by

$$p(x) \boxplus_d q(x) = \sum_{r=0}^{d} x^{d-r}(-1)^r \sum_{i+j=r} \frac{(d-i)!\,(d-j)!}{d!\,(d-i-j)!}\, a_i^p a_j^q.$$
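The definition above can be evaluated directly from the coefficients. The following is a minimal sketch, not taken from the paper: the helper names `coeffs_from_roots` and `finite_free_conv` are hypothetical, and polynomials are assumed to be monic and given through the coefficients $a_0, \dots, a_d$ in the sign convention $p(x) = \sum_i x^{d-i}(-1)^i a_i$.

```python
from math import factorial
import numpy as np

def coeffs_from_roots(roots):
    """Return [a_0, ..., a_d] with p(x) = sum_i x^(d-i) (-1)^i a_i (hypothetical helper)."""
    c = np.poly(roots)  # monic coefficients, highest degree first
    return [(-1) ** i * c[i] for i in range(len(c))]

def finite_free_conv(a_p, a_q):
    """Coefficients a_r of the finite free additive convolution of two degree-d polynomials."""
    d = len(a_p) - 1
    out = []
    for r in range(d + 1):
        total = 0.0
        for i in range(r + 1):
            j = r - i
            weight = factorial(d - i) * factorial(d - j) / (factorial(d) * factorial(d - r))
            total += weight * a_p[i] * a_q[j]
        out.append(total)
    return out

# (x - 1)^3 convolved with (x - 2)^3; compare with the coefficients of (x - 3)^3
p = coeffs_from_roots([1.0, 1.0, 1.0])
q = coeffs_from_roots([2.0, 2.0, 2.0])
print(finite_free_conv(p, q))
print(coeffs_from_roots([3.0, 3.0, 3.0]))
```

In this small case the two printed lists agree, and convolving with $x^d$ (coefficients $1, 0, \dots, 0$) leaves a polynomial unchanged, which is a quick way to test an implementation.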
The finite R-transform of a polynomial is defined by

$$\mathcal{R}^d_p(s) \equiv -\frac{1}{d}\,\frac{\partial}{\partial s} \ln\left(\sum_{i=0}^{d} \frac{(-d)^i a_i^p}{(d)_i}\, s^i\right) \mod [s^d], \tag{2.1}$$

when $p$ is the monic polynomial $p(x) = \sum_{i=0}^d x^{d-i}(-1)^i a_i^p$. We consider the truncated R-transform given by the sum of the first $d$ terms in the series expansion of $\mathcal{R}^d_p$, which will have the cumulants as coefficients.

Definition 2.1 ([1]) Let $p$ be a monic polynomial of degree $d$, and suppose that $\mathcal{R}^d_p(s)$ satisfies

$$\mathcal{R}^d_p(s) \equiv \sum_{j=0}^{d-1} \kappa_{j+1}(p)\, s^j \mod [s^d].$$

(1) We call the sum of the first $d$ terms in the series expansion of $\mathcal{R}^d_p$ the truncated R-transform and denote it by $\tilde{\mathcal{R}}^d_p(s)$, i.e.

$$\tilde{\mathcal{R}}^d_p(s) := \sum_{j=0}^{d-1} \kappa_{j+1}(p)\, s^j.$$

(2) The numbers $\kappa_1(p), \kappa_2(p), \dots, \kappa_d(p)$ will be called the finite free cumulants. To simplify notation we will omit the dependence on $p$ when we deal with only one polynomial.

We want to use the combinatorial framework in terms of moments for these cumulants. Hence, for a polynomial $p$ with roots $\lambda_1, \dots, \lambda_d$ we define the moments of $p$ by the formula $m_n(p) = \frac{1}{d}\sum_{i=1}^d \lambda_i^n$. These finite free cumulants satisfy the following properties, which are the analog of the properties in the axiomatization of cumulants by Lehner [6] in noncommutative probability.
1. Polynomial in the first n moments: $\kappa_n(p)$ is a polynomial in the first $n$ moments of $p$ with leading term $\frac{d^n}{(d)_n}\, m_n(p)$.
2. Homogeneity: for all monic polynomials $p$ and $\lambda \neq 0$ we have $\kappa_n(D_\lambda(p)) = \lambda^n \kappa_n(p)$.
3. Additivity: for all monic polynomials $p$ and $q$, we have $\kappa_n(p \boxplus_d q) = \kappa_n(p) + \kappa_n(q)$.
2.2 Moment-Cumulant Formula

Moment-cumulant formulas involve summing over partitions of the set $[n]$. Let us introduce this definition and some notation.

Definition 2.2 We call $\pi = \{V_1, \dots, V_r\}$ a partition of the set $[n] := \{1, 2, \dots, n\}$ if the $V_i$ $(1 \le i \le r)$ are pairwise disjoint, non-empty subsets of $[n]$ such that $V_1 \cup V_2 \cup \dots \cup V_r = \{1, 2, \dots, n\}$. We call $V_1, V_2, \dots, V_r$ the blocks of $\pi$. The number of blocks of $\pi$ is denoted by $|\pi|$. We will denote the set of partitions of $[n]$ by $\mathcal{P}(n)$.

The set $\mathcal{P}(n)$ can be equipped with the partial order $\le$ of reverse refinement ($\pi \le \sigma$ if and only if every block of $\pi$ is completely contained in a block of $\sigma$). With this order the minimum is given by the partition with $n$ blocks, $0_n = \{\{1\}, \{2\}, \dots, \{n\}\}$, and the maximum is given by the partition with one block, $1_n = \{\{1, 2, \dots, n\}\}$. Thus one can consider the incidence algebra of $\mathcal{P}(n)$. For two partitions $\sigma, \rho$ in $\mathcal{P}(n)$ the Möbius function is given by

$$\mu(\sigma, \rho) = (-1)^{|\sigma| - |\rho|}\, (2!)^{r_3} (3!)^{r_4} \cdots ((n-1)!)^{r_n},$$

where $r_i$ is the number of blocks of $\rho$ that contain exactly $i$ blocks of $\sigma$. In particular, for $\sigma = 0_n$ we have

$$\mu(0_n, \rho) = (-1)^{n - |\rho|}\, (2!)^{t_3} (3!)^{t_4} \cdots ((n-1)!)^{t_n},$$

where $t_i$ is the number of blocks of $\rho$ of size $i$.

Given a sequence of complex numbers $f = \{f_n\}_{n \in \mathbb{N}}$ we may extend $f$ to partitions in a multiplicative way by the formula

$$f_\pi = f_{|V_1|} f_{|V_2|} \cdots f_{|V_r|},$$

where $V_1, \dots, V_r$ are the blocks of $\pi$. In this note we will frequently use the multiplicative extensions of the Pochhammer sequence $(d)_n = d(d-1)\cdots(d-n+1)$ and the factorial sequence $n!$, whose extensions will be denoted by $(d)_\pi$ and $N!_\pi$, respectively.

In [1], we gave formulas that relate the moments and coefficients of a polynomial with its finite free cumulants. First, we have a formula that writes coefficients in terms of cumulants.

Proposition 2.3 (Coefficient-Cumulant Formula) Let $p(x) = \sum_{i=0}^d x^{d-i}(-1)^i a_i$ be a polynomial of degree $d$ and let $(\kappa_n)_{n=1}^d$ be its finite free cumulants. The following formula holds:

$$a_n = \frac{(d)_n}{d^n\, n!} \sum_{\pi \in \mathcal{P}(n)} d^{|\pi|}\, \mu(0_n, \pi)\, \kappa_\pi, \qquad n \in \mathbb{N}. \tag{2.2}$$
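Formula (2.2) can be checked numerically by enumerating set partitions. The sketch below is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, `kappa[s-1]` is assumed to hold $\kappa_s$, and the Möbius value $\mu(0_n, \pi)$ is computed as $(-1)^{n-|\pi|}\prod_{V \in \pi}(|V|-1)!$, which is the expression given above.

```python
from math import factorial, prod

def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for part in set_partitions(rest):
        # put `first` into one of the existing blocks ...
        for k in range(len(part)):
            yield part[:k] + [[first] + part[k]] + part[k + 1:]
        # ... or into a new singleton block
        yield [[first]] + part

def mobius_from_zero(part):
    """mu(0_n, pi) = (-1)^(n - |pi|) * product over blocks V of (|V| - 1)!."""
    n = sum(len(block) for block in part)
    return (-1) ** (n - len(part)) * prod(factorial(len(block) - 1) for block in part)

def falling(d, n):
    """Pochhammer symbol (d)_n = d (d-1) ... (d-n+1)."""
    return prod(d - k for k in range(n))

def coefficient_from_cumulants(kappa, d, n):
    """Coefficient a_n from formula (2.2); kappa[s-1] holds kappa_s."""
    total = 0.0
    for part in set_partitions(list(range(n))):
        kappa_pi = prod(kappa[len(block) - 1] for block in part)
        total += d ** len(part) * mobius_from_zero(part) * kappa_pi
    return falling(d, n) * total / (d ** n * factorial(n))

# Cumulants (0, 1, 0, 0) in degree d = 4; the result can be compared with
# D_{1/2}(H_4)(x) = x^4 - 1.5 x^2 + 0.1875, i.e. a_2 = -1.5 and a_4 = 0.1875.
kappa = [0.0, 1.0, 0.0, 0.0]
print([coefficient_from_cumulants(kappa, 4, n) for n in range(5)])
```

The enumeration grows like the Bell numbers, so this is only practical for small $n$; it is meant as a check of the formula rather than an efficient algorithm.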
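We also have a moment-cumulant formula for finite free cumulants:

Proposition 2.4 Let $p$ be a monic polynomial of degree $d$ and let $(m_n)_{n=1}^\infty$ and $(\kappa_n)_{n=1}^d$ be the moments and cumulants of $p$, respectively. Then

$$\kappa_n = \frac{(-d)^{n-1}}{(n-1)!} \sum_{\sigma \in \mathcal{P}(n)} d^{|\sigma|}\, \mu(0, \sigma)\, m_\sigma \sum_{\pi \ge \sigma} \frac{\mu(\pi, 1_n)}{(d)_\pi},$$

for $n = 1, \dots, d$, and

$$m_n = \frac{(-1)^n}{d^{n+1}\,(n-1)!} \sum_{\sigma \in \mathcal{P}(n)} d^{|\sigma|}\, \mu(0, \sigma)\, \kappa_\sigma \sum_{\pi \ge \sigma} -\mu(\pi, 1_n)\,(d)_\pi,$$

for $n \in \mathbb{N}$.

Remark 2.5 The explicit moment-cumulant formulas of the first three finite cumulants are

$$\kappa_1 = m_1, \qquad \kappa_2 = \frac{d}{d-1}\,(m_2 - m_1^2), \qquad \kappa_3 = \frac{d^2}{(d-1)(d-2)}\,(2m_1^3 - 3m_1 m_2 + m_3),$$

and the explicit moment-cumulant formulas of the first three finite moments are

$$m_1 = \kappa_1, \qquad m_2 = \frac{d-1}{d}\,\kappa_2 + \kappa_1^2, \qquad m_3 = \frac{(d-1)(d-2)}{d^2}\,\kappa_3 + \frac{3(d-1)}{d}\,\kappa_2\kappa_1 + \kappa_1^3.$$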
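The explicit formulas of Remark 2.5 are easy to test on a concrete root set. The snippet below is a small sketch (function names are hypothetical, and $d \ge 3$ is assumed so that $\kappa_3$ is defined): it computes the first three moments of a root set, converts them to cumulants, and converts back.

```python
import numpy as np

def moments(roots, order=3):
    """First `order` moments m_k = (1/d) * sum_i lambda_i^k of the root distribution."""
    roots = np.asarray(roots, dtype=float)
    return [float(np.mean(roots ** k)) for k in range(1, order + 1)]

def cumulants_from_moments(m1, m2, m3, d):
    """First three finite free cumulants, following Remark 2.5 (requires d >= 3)."""
    k1 = m1
    k2 = d / (d - 1) * (m2 - m1 ** 2)
    k3 = d ** 2 / ((d - 1) * (d - 2)) * (2 * m1 ** 3 - 3 * m1 * m2 + m3)
    return k1, k2, k3

def moments_from_cumulants(k1, k2, k3, d):
    """Inverse formulas of Remark 2.5."""
    m1 = k1
    m2 = (d - 1) / d * k2 + k1 ** 2
    m3 = (d - 1) * (d - 2) / d ** 2 * k3 + 3 * (d - 1) / d * k2 * k1 + k1 ** 3
    return m1, m2, m3

roots = [0.5, -1.0, 2.0, 3.0]          # arbitrary example roots, d = 4
m = moments(roots)
k = cumulants_from_moments(*m, d=len(roots))
print(m, k, moments_from_cumulants(*k, d=len(roots)))
```

For the roots $-1, 0, 1$ of $x^3 - x = D_{1/\sqrt{3}}(H_3)$ the same functions return the cumulants $(0, 1, 0)$, in line with the normalisation $\kappa_1 = 0$, $\kappa_2 = 1$ used in Sect. 3.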
2.3 Convergence of Polynomials and Lévy Distance

In the setting of [1, 7], convergence of polynomials is pointwise convergence of the coefficients. We prefer to consider the weak convergence of the induced measures since it is common with the free probability setting. Thus, for a polynomial $p$ with roots $\lambda_1, \lambda_2, \dots, \lambda_d$, we define its distribution $\mu_p$ as the uniform measure on the roots of $p$, $\mu_p = \frac{1}{d}\sum_i \delta_{\lambda_i}$. To quantify this convergence we use the Lévy distance

$$d_L(\mu, \nu) := \inf\{\varepsilon > 0 \mid F(x - \varepsilon) - \varepsilon \le G(x) \le F(x + \varepsilon) + \varepsilon \text{ for all } x \in \mathbb{R}\},$$

where $F$ and $G$ are the cumulative distribution functions of $\mu$ and $\nu$ respectively.
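For root distributions the Lévy distance can be approximated numerically. The following sketch is one possible approach and is not part of the paper: it assumes both measures are uniform on finitely many atoms, tests the defining inequalities only at the jump points of the step functions involved (which suffices because all of them are right-continuous and piecewise constant), and locates the infimum by bisection. The function names and the tolerance are hypothetical choices.

```python
import numpy as np

def levy_condition(f_atoms, g_atoms, eps):
    """Check F(x - eps) - eps <= G(x) <= F(x + eps) + eps for all real x."""
    f = np.sort(np.asarray(f_atoms, dtype=float))
    g = np.sort(np.asarray(g_atoms, dtype=float))

    def cdf(atoms, x):
        # right-continuous CDF of the uniform measure on `atoms`
        return np.searchsorted(atoms, x, side="right") / len(atoms)

    f_right = f + eps  # jump points of x -> F(x - eps)
    f_left = f - eps   # jump points of x -> F(x + eps)
    x_lower = np.concatenate([g, f_right])
    x_upper = np.concatenate([g, f_left])
    lower_ok = np.all(cdf(f_right, x_lower) - eps <= cdf(g, x_lower))
    upper_ok = np.all(cdf(g, x_upper) <= cdf(f_left, x_upper) + eps)
    return bool(lower_ok and upper_ok)

def levy_distance(f_atoms, g_atoms, tol=1e-10):
    """Approximate the Levy distance by bisection on eps in (0, 1]."""
    lo, hi = 0.0, 1.0  # eps = 1 always satisfies the condition
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if levy_condition(f_atoms, g_atoms, mid):
            hi = mid
        else:
            lo = mid
    return hi

# Two root sets whose roots differ by 0.01 each
print(levy_distance([0.0, 1.0], [0.01, 1.01]))
```

Bisection is valid here because the set of admissible $\varepsilon$ is upward closed: if the inequalities hold for some $\varepsilon$, they hold for every larger value.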
3 Proof of Theorem 1.1

Before going into the proof of the main theorem we prove a couple of lemmas about the support and cumulants of polynomials with mean 0 and variance 1.

Lemma 3.1 Let $p$ be a real polynomial of degree $d$ with $\kappa_1 = 0$ and $\kappa_2 = 1$. Then the support of $\mu_p$ is contained in $(-\sqrt{d-1}, \sqrt{d-1})$.

Proof If $\kappa_1 = 0$ and $\kappa_2 = 1$ then

$$1 = \kappa_2 = \frac{d}{d-1}\, m_2 = \frac{1}{d-1} \sum_{i=1}^{d} \lambda_i^2.$$

This means that $\lambda_i^2 < d - 1$ (strict because there is at least another non-zero $\lambda$) and thus $|\lambda_i| < \sqrt{d-1}$ for all $i = 1, \dots, d$.

Lemma 3.2 Let $p$ be a real polynomial of degree $d$ with $\kappa_1 = 0$ and $\kappa_2 = 1$. Then there exists a constant $c_d$, depending only on $d$, such that $\max_{2 \le s \le d} |\kappa_s(p)| < c_d$.

Proof By the previous lemma $|m_n| \le (d-1)^n$ and then $\max_{2 \le s \le d} |m_s(p)| < (d-1)^d$, so we can bound $\kappa_n$ uniformly by the moment-cumulant formulas.

Now we are able to prove the main theorem, which we state again for the convenience of the reader.

Proposition 3.3 Let $p$ be a real polynomial with $\kappa_1 = 0$ and $\kappa_2 = 1$. Then, there exists $C_d$ such that for all $n > 0$

$$L\left(D_{1/\sqrt{n}}(p^{\boxplus_d n}), D_{1/\sqrt{d}}(H_d)\right) < \frac{C_d}{\sqrt{n}}.$$
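The $n^{-1/2}$ rate can be observed numerically in a small case. The sketch below is an illustration only, under assumptions that are not in the paper: it takes $d = 3$ and the root set $(-1, -1, 2)/\sqrt{3}$ (which has $\kappa_1 = 0$, $\kappa_2 = 1$ and $m_3 \neq 0$), and it uses the observation that the rescaled coefficients $b_i = a_i\,(d-i)!/d!$ turn the convolution formula of Sect. 2.1 into a polynomial product truncated at degree $d$, so that the $n$-fold convolution becomes an $n$-th power. The function names are hypothetical.

```python
from math import factorial
import numpy as np

def dilated_conv_power_roots(roots, n):
    """Sorted roots of q_n = D_{1/sqrt(n)}(p^{boxplus_d n}) for the monic p with the given roots."""
    d = len(roots)
    signs = (-1.0) ** np.arange(d + 1)
    a = signs * np.poly(roots)                    # a_i with p(x) = sum x^(d-i) (-1)^i a_i
    w = np.array([factorial(d - i) / factorial(d) for i in range(d + 1)])
    b = a * w                                     # b_i = a_i (d-i)!/d!
    power = np.zeros(d + 1)
    power[0] = 1.0
    for _ in range(n):
        power = np.convolve(power, b)[: d + 1]    # product modulo s^(d+1)
    a_n = power / w                               # coefficients of the n-fold convolution
    a_q = a_n * float(n) ** (-0.5 * np.arange(d + 1))  # dilation by 1/sqrt(n)
    return np.sort(np.roots(signs * a_q).real)

p_roots = np.array([-1.0, -1.0, 2.0]) / np.sqrt(3.0)   # kappa_1 = 0, kappa_2 = 1
h_roots = np.array([-1.0, 0.0, 1.0])                   # roots of D_{1/sqrt(3)}(H_3) = x^3 - x

def max_root_gap(n):
    """Largest distance between matched (sorted) roots of q_n and h."""
    return float(np.max(np.abs(dilated_conv_power_roots(p_roots, n) - h_roots)))

for n in (25, 100, 400):
    gap = max_root_gap(n)
    print(n, gap, gap * np.sqrt(n))   # the last column should stay roughly constant
```

The maximal gap between matched roots is used here as a convenient upper bound for the Lévy distance, in the same way as at the end of the proof below.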
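Proof Let us denote $h = D_{1/\sqrt{d}}(H_d)$, $p_n = p^{\boxplus_d n}$ and $q_n = D_{1/\sqrt{n}}(p_n)$. By the coefficient-cumulant formula, we know that

$$a_j^{q_n} = \frac{(d)_j}{d^j\, j!} \sum_{\pi \in \mathcal{P}(j)} d^{|\pi|}\, \mu(0_j, \pi)\, \kappa_\pi(q_n) = a_j^h + \frac{(d)_j}{d^j\, j!} \sum_{\pi \in \mathcal{P}(j) \setminus \mathcal{P}_{12}(j)} d^{|\pi|}\, \mu(0_j, \pi)\, \kappa_\pi(q_n),$$

where $\mathcal{P}_{12}(j)$ is the set of partitions $\pi = (V_1, \dots, V_r) \in \mathcal{P}(j)$ such that $|V_i| \le 2$ for all $i \in \{1, \dots, r\}$ (i.e., $\pi = (V_1, \dots, V_r) \in \mathcal{P}(j) \setminus \mathcal{P}_{12}(j)$ if $|V_i| > 2$ for some $i \in \{1, \dots, r\}$). Here the sum over $\mathcal{P}_{12}(j)$ equals $a_j^h$ because $q_n$ and $h$ have the same first two cumulants, $\kappa_1 = 0$ and $\kappa_2 = 1$, while the cumulants of $h$ of order three and higher vanish. Recall that

$$|\kappa_s(q_n)| = |\kappa_s(D_{1/\sqrt{n}}(p_n))| = \frac{n}{n^{s/2}}\, |\kappa_s(p)| \le n^{1 - s/2}\, c,$$

for $s = 3, \dots, d$, where $c := c_d$ from Lemma 3.2. Thus, for any $3 \le j \le d$ and $\pi = (V_1, \dots, V_r) \in \mathcal{P}(j) \setminus \mathcal{P}_{12}(j)$ we get

$$|\kappa_\pi(q_n)| \le c^r \cdot n^r \cdot n^{-\frac{|V_1| + \dots + |V_r|}{2}} = c^r n^{r - \frac{j}{2}} \le c^r n^{\frac{j}{3} - \frac{j}{2}} = c^r n^{-\frac{j}{6}} \le c^d n^{-\frac{1}{2}}. \tag{3.1}$$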
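Then,

$$|a_j^{q_n} - a_j^h| \le \frac{c^d K_1(d)}{\sqrt{n}}, \qquad \forall j \in \{1, \dots, d\},$$

where

$$K_1(d) = \max_{1 \le j \le d} \frac{(d)_j}{d^j\, j!} \sum_{\pi \in \mathcal{P}(j) \setminus \mathcal{P}_{12}(j)} d^{|\pi|}\, |\mu(0_j, \pi)|.$$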
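Let us denote by $z_1, z_2, \dots, z_d$ the $d$ distinct roots of $h$ and let $\delta = \frac{1}{2} \min_{1 \le i < j \le d} |z_i - z_j|$ [...]

$$\frac{c^{2d} K(d)}{\varepsilon^2},$$

where $K(d) = \frac{K_1^2(d)\, K_2^2(d)}{\delta^{2d-2}}$. Since $c^{2d} K(d)$ does not depend on $i$, we get that for any $i = 1, \dots, d$, if $z \in \partial B_i$, then

$$|q_n(z) - h(z)| \le \frac{c^d K_1(d) K_2(d)}{\sqrt{n}} < \varepsilon \delta^{d-1} \le |h(z)| \le |h(z)| + |q_n(z)|.$$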
Thus, Rouché's theorem implies that $q_n$ and $h$ have the same number of roots (counting multiplicity) in $B_i$ for $i = 1, \dots, d$. By the definition of the $B_i$ we know that they are pairwise disjoint and each one contains exactly one of the $d$ roots of $h$. Thus, each $B_i$ contains exactly one of the $d$ roots of $q_n$, implying that the distance between the roots of $q_n$ and $h$ (and therefore the Lévy distance) is less than $\varepsilon$.

Observe that Theorem 1.1 directly gives a bound for $T$ in the next proposition.

Proposition 3.4 ([1]) Let $p \neq x^d$ be a real polynomial. Then there exists $T > 0$ such that for all $t > T$ the polynomial $p^{\boxplus_d t}$ has $d$ different real roots.

Finally, we show that one cannot do better than $O(1/\sqrt{n})$ as long as $m_3(p) \neq 0$.

Proposition 3.5 Let $p$ be a real polynomial with $\kappa_1 = 0$ and $\kappa_2 = 1$ and $|m_3| = \alpha \neq 0$. Then, for all $n > 0$

$$L\left(D_{1/\sqrt{n}}(p^{\boxplus_d n}), D_{1/\sqrt{d}}(H_d)\right) \ge \frac{\alpha}{3d\sqrt{n}}.$$
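Proof We use again the notation $h = D_{1/\sqrt{d}}(H_d)$, $p_n = p^{\boxplus_d n}$ and $q_n = D_{1/\sqrt{n}}(p_n)$ and suppose that $L(q_n, h) < \frac{\alpha}{3d\sqrt{n}}$. Since $\kappa_1(q_n) = 0$, from the moment-cumulant formulas we have $m_3(q_n) = \frac{(d-1)(d-2)}{d^2}\, \kappa_3(q_n)$ and then

$$|m_3(q_n)| = \frac{(d-1)(d-2)}{d^2}\, |\kappa_3(q_n)| = \frac{(d-1)(d-2)}{d^2}\, \frac{n}{n^{3/2}}\, |\kappa_3(p)| = \frac{|m_3(p)|}{\sqrt{n}} = \frac{\alpha}{\sqrt{n}}.$$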
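Since $m_3(h) = 0$, we can compute

$$m_3(q_n) = \frac{1}{d} \sum_{i=1}^{d} \lambda_i^3(q_n) = \frac{1}{d} \sum_{i=1}^{d} \lambda_i^3(q_n) - \frac{1}{d} \sum_{i=1}^{d} \lambda_i^3(h),$$

and thus

$$|m_3(q_n)| \le \frac{1}{d} \sum_{i} |\lambda_i^3(q_n) - \lambda_i^3(h)| = \frac{1}{d} \sum_{i=1}^{d} |\lambda_i(q_n) - \lambda_i(h)|\, |\lambda_i^2(q_n) + \lambda_i(q_n)\lambda_i(h) + \lambda_i^2(h)| < \frac{1}{d} \sum_{i=1}^{d} \frac{\alpha}{3d\sqrt{n}}\,(d + d + d) = \frac{\alpha}{\sqrt{n}},$$

where we used in the last inequality the assumption that $L(q_n, h) < \frac{\alpha}{3d\sqrt{n}}$. This is a contradiction since the inequality is strict.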
Remark 3.6 A specific example with $\kappa_3 \neq 0$ is the finite free Poisson distribution, which has cumulants $\kappa_n = \alpha$ for all $n$. If $\alpha d$ is a positive integer we obtain a valid polynomial. This is a modification of a Laguerre polynomial, thus we obtain a precise estimate for the distance between the roots of certain Laguerre polynomials and the Hermite polynomials.

Remark 3.7 A closer look at (3.1) shows that if $m_3(p) = 0$ then the convergence rate is of order $1/n$. Indeed, $m_3(p) = 0$ implies $\kappa_3(q_n) = \kappa_3(p) = 0$. So in (3.1) we only need to consider partitions with $|V_i| \ge 4$. In this case, for any $4 \le j \le d$ we have

$$|\kappa_\pi(q_n)| \le c^r n^{r - \frac{j}{2}} \le c^r n^{\frac{j}{4} - \frac{j}{2}} = c^r n^{-\frac{j}{4}} \le c^d n^{-1}.$$

Finally, let us mention that while it is very tempting to let $d$ go to infinity, possibly together with $n$, to obtain a Berry–Esseen bound in free probability, there are two problems. First, the quantity $C_d$, as we obtained it, grows rapidly as $d \to \infty$, and second, there is, for the moment, no good bound on the distance between finite free and free convolutions.

Acknowledgments This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 734922.
References 1. Arizmendi, O., Perales, D.: Cumulants for finite free convolution. J. Combin. Theory Ser. A 155, 244–266 (2018) 2. Berry, A.C.: The accuracy of the Gaussian approximation to the sum of independent variates. Trans. Am. Math. Soc. 49(1), 122–136 (1941) 3. Chistyakov, G.P., Götze, F.: Limit theorems in free probability theory. I. Ann. Prob. 36, 54–90 (2008) 4. Esseen, C.-G.: On the Liapounoff Limit of Error in the Theory of Probability. Almqvist & Wiksell (1942) 5. Kargin, V.: Berry–Esseen for free random variables. J. Theor. Prob. 20(2), 381–395 (2007) 6. Lehner, F.: Free cumulants and enumeration of connected partitions. Eur. J. Comb. 23(8), 1025– 1031 (2002) 7. Marcus, A.: Polynomial convolutions and (finite) free probability (2015). Preprint 8. Marcus, A., Spielman, D.A., Srivastava, N.: Finite free convolutions of polynomials (2015). arXiv preprint arXiv:1504.00350 9. Marcus, A., Spielman, D.A., Srivastava, N.: Interlacing families IV: bipartite Ramanujan graphs of all sizes. In: 2015 IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS), pp. 1358–1377. IEEE, Piscataway (2015) 10. Walsh, J.L.: On the location of the roots of certain types of polynomials. Trans. Am. Math. Soc. 24(3), 163–180 (1922)
Predicting the Last Zero of a Spectrally Negative Lévy Process Erik J. Baurdoux and José M. Pedraza
Abstract Last passage times arise in a number of areas of applied probability, including risk theory and degradation models. Such times are obviously not stopping times since they depend on the whole path of the underlying process. We consider the problem of finding a stopping time that minimises the $L^1$-distance to the last time a spectrally negative Lévy process $X$ is below zero. Examples of related problems in a finite horizon setting for processes with continuous paths are by Du Toit et al. (Stochastics Int J Probab Stochastics Process 80(2–3):229–245, 2008) and Glover and Hulley (SIAM J Control Optim 52(6):3833–3853, 2014), where the last zero is predicted for a Brownian motion with drift, and for a transient diffusion, respectively. As we consider the infinite horizon setting, the problem is interesting only when the Lévy process drifts to $\infty$, which we will assume throughout. Existing results allow us to rewrite the problem as a classic optimal stopping problem, i.e. with an adapted payoff process. We use a direct method to show that an optimal stopping time is given by the first passage time above a level defined in terms of the median of the convolution with itself of the distribution function of $-\inf_{t \ge 0} X_t$. We also characterise when continuous and/or smooth fit holds.

Keywords Lévy processes · Optimal prediction · Optimal stopping

Mathematics Subject Classification (2000) 60G40, 62M20
E. J. Baurdoux · J. M. Pedraza () Department of Statistics, London School of Economics and Political Science, London, UK e-mail: [email protected]; [email protected] © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 S. I. López et al. (eds.), XIII Symposium on Probability and Stochastic Processes, Progress in Probability 75, https://doi.org/10.1007/978-3-030-57513-7_4
1 Introduction In recent years last exit times have been studied in several areas of applied probability, e.g. in risk theory (see [8]). Consider the Cramér–Lundberg process, which is a process consisting of a deterministic drift plus a compound Poisson process which has only negative jumps (see Fig. 1) which typically models the capital of an insurance company. A key quantity of interest is the time of ruin τ0 , i.e. the first time the process becomes negative. Suppose the insurance company has funds to endure negative capital for some time. Then another quantity of interest is the last time g that the process is below zero. In a more general setting we may consider a spectrally negative Lévy process instead of the classical risk process. We refer to [8] and [2] for the Laplace transform of the last time before an exponential time a spectrally negative Lévy process is below some level. Last passage times also appear in financial modeling. In particular, [16, 17] showed that the price of a European put and call option for certain non-negative, continuous martingales, can be expressed in terms of the probability distributions of last passage times. Another application is in degradation models. Paroissin and Rabehasaina [18] proposed a spectrally positive Lévy process as a degradation model. They consider a subordinator perturbed by an independent Brownian motion. The presence of a Brownian motion can model small repairs of the component or system and the jumps represent major deterioration. Classically, the failure time of a component or system is defined as the first hitting time of a critical level b which represents a failure or a bad performance of the component or system. Another approach is to consider instead the last time that the process is under b. Indeed, for this process the paths are not necessarily monotone and hence when the process is above the level b it can return back below it later. 
Fig. 1 Cramér–Lundberg process with $\tau_0$ the moment of ruin and $g$ the last zero (axes: $t$ and $X_t$; marked: $x$, $\tau_0$, $g$)

The main aim of this paper is to predict the last time a spectrally negative Lévy process is below zero. More specifically, we aim to find a stopping time that is closest (in the $L^1$ sense) to the above random time. This is an example of an optimal prediction problem. Recently, these problems have received considerable attention; for example, [4] predicted the ultimate supremum of a stable spectrally negative Lévy process in a finite time horizon. A few years later, the infinite time horizon version was solved in [3] for a general Lévy process. Glover et al. [12] predicted the time of the ultimate minimum of a transient diffusion process. Du Toit et al. [10] predicted the last zero of a Brownian motion with drift and [11] predicted the last zero of a transient diffusion. It turns out that these problems are equivalent to an optimal stopping problem; in other words, optimal prediction problems and optimal stopping problems are intimately related.

The rest of this paper is organised as follows. In Sect. 2 we discuss some preliminaries and technicalities to be used later. Section 3 concerns the main result, Theorem 3.3. Section 4 is then dedicated to the proof of Theorem 3.3. In the final section we consider specific examples.
2 Prerequisites and Formulation of the Problem

Formally, let $X$ be a spectrally negative Lévy process drifting to infinity, i.e. $\lim_{t \to \infty} X_t = \infty$, starting from 0, defined on a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, where $\mathbb{F} = \{\mathcal{F}_t, t \ge 0\}$ is the filtration generated by $X$ which is naturally enlarged (see Definition 1.3.38 in [6]). Suppose that $X$ has Lévy triplet $(c, \sigma, \Pi)$ where $c \in \mathbb{R}$, $\sigma \ge 0$ and $\Pi$ is the so-called Lévy measure concentrated on $(-\infty, 0)$ satisfying $\int_{(-\infty,0)} (1 \wedge x^2)\,\Pi(dx) < \infty$. Then the characteristic exponent defined by $\Psi(\theta) := -\log(\mathbb{E}(e^{i\theta X_1}))$ takes the form

$$\Psi(\theta) = ic\theta + \frac{1}{2}\sigma^2\theta^2 + \int_{(-\infty,0)} \left(1 - e^{i\theta x} + i\theta x\, I_{\{x > -1\}}\right) \Pi(dx).$$

Moreover, it can be shown that all Lévy processes satisfy the strong Markov property. Let $W^{(q)}$ and $Z^{(q)}$ be the scale functions corresponding to the process $X$ (see [15] or [5] for more details). That is, $W^{(q)}$ is such that $W^{(q)}(x) = 0$ for $x < 0$, and is characterised on $[0, \infty)$ as a strictly increasing and continuous function whose Laplace transform satisfies

$$\int_0^\infty e^{-\beta x}\, W^{(q)}(x)\, dx = \frac{1}{\psi(\beta) - q} \qquad \text{for } \beta > \Phi(q),$$

and

$$Z^{(q)}(x) = 1 + q \int_0^x W^{(q)}(y)\, dy,$$
where $\psi$ and $\Phi$ are, respectively, the Laplace exponent and its right inverse given by

$$\psi(\lambda) = \log \mathbb{E}(e^{\lambda X_1}), \qquad \Phi(q) = \sup\{\lambda \ge 0 : \psi(\lambda) = q\}$$

for $\lambda, q \ge 0$. When $q = 0$ we omit the superscript and simply write $W$ instead of $W^{(0)}$. Note that $\psi$ is zero at zero and tends to infinity at infinity. Moreover, it is infinitely differentiable and strictly convex with $\psi'(0+) = \mathbb{E}(X_1) > 0$ (since $X$ drifts to infinity). The latter directly implies that $\Phi(0) = 0$.

The right and left derivatives of $W$ exist (see [15] Lemma 8.2). For ease of notation we shall assume that $\Pi$ has no atoms when $X$ is of finite variation, which guarantees that $W \in C^1(0, \infty)$. Moreover, for every $x \ge 0$ the function $q \mapsto W^{(q)}(x)$ is analytic on $\mathbb{C}$. If $X$ is of finite variation we may write

$$\psi(\lambda) = d\lambda - \int_{(-\infty,0)} (1 - e^{\lambda y})\, \Pi(dy),$$

where necessarily

$$d = -c - \int_{(-1,0)} x\, \Pi(dx) > 0.$$

With this notation, from the fact that $0 \le 1 - e^{\lambda y} \le 1$ for $y \le 0$, and using the dominated convergence theorem, we have that

$$\psi'(0+) = d + \int_{(-\infty,0)} x\, \Pi(dx). \tag{1}$$

For all $q \ge 0$, the function $W^{(q)}$ may have a discontinuity at zero and this depends on the path variation of $X$: if $X$ is of infinite variation we have that $W^{(q)}(0) = 0$, otherwise

$$W^{(q)}(0) = \frac{1}{d}. \tag{2}$$
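As a sanity check of these identities, one can use a textbook example that is not taken from this section: a Cramér–Lundberg process with drift $d$, jump rate `lam` and exponentially distributed claims with rate `mu` (so $\Pi(dy) = \texttt{lam}\,\texttt{mu}\,e^{\texttt{mu}\,y}dy$ on $y < 0$). Inverting $1/\psi$ by partial fractions gives the closed form of $W$ used below; the parameter values are arbitrary assumptions chosen so that $d\,\texttt{mu} > \texttt{lam}$, i.e. the process drifts to infinity.

```python
import numpy as np

# Hypothetical parameters: drift d, jump rate lam, exponential claim rate mu (d * mu > lam)
d, lam, mu = 2.0, 1.0, 1.0
rho = mu - lam / d

def psi(beta):
    """Laplace exponent of the process d*t minus compound Poisson with Exp(mu) jumps."""
    return d * beta - lam * beta / (mu + beta)

def W(x):
    """Scale function W = W^(0), obtained from 1/psi by partial fractions; zero for x < 0."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, (mu - (lam / d) * np.exp(-rho * x)) / (d * rho), 0.0)

# Check (2): W(0) = 1/d in the finite variation case
print(float(W(0.0)), 1.0 / d)

# Check the Laplace transform identity at beta = 1.5 with a plain trapezoidal rule
beta = 1.5
xs = np.linspace(0.0, 40.0, 400001)
ys = np.exp(-beta * xs) * W(xs)
h = xs[1] - xs[0]
laplace_numeric = h * (ys.sum() - 0.5 * (ys[0] + ys[-1]))
print(laplace_numeric, 1.0 / psi(beta))
```

In this example $\psi'(0+) = d - \texttt{lam}/\texttt{mu}$, in agreement with (1), and $\psi'(0+)W(x) \to 1$ as $x \to \infty$.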
There are many important fluctuation identities in terms of the scale functions $W^{(q)}$ and $Z^{(q)}$ (see [5, Chapter VII] or [15, Chapter 8]). We mention some of them that will be useful later on. Denote by $\tau_0^-$ the first time the process $X$ is below zero, i.e.

$$\tau_0^- = \inf\{t > 0 : X_t < 0\}.$$
We then have, for $x \in \mathbb{R}$,

$$\mathbb{P}_x(\tau_0^- < \infty) = \begin{cases} 1 - \psi'(0+)\, W(x) & \text{if } \psi'(0+) \ge 0, \\ 1 & \text{if } \psi'(0+) < 0, \end{cases} \tag{3}$$

where $\mathbb{P}_x$ denotes the law of $X$ started from $x$. Let us define the $q$-potential measure of $X$ killed on exiting $(-\infty, a)$ for $q \ge 0$ as follows

$$R^{(q)}(a, x, dy) := \int_0^\infty e^{-qt}\, \mathbb{P}_x(X_t \in dy,\ \tau_a^+ > t)\, dt.$$

The potential measure $R^{(q)}$ has a density $r^{(q)}(a, x, y)$ (see [14] Theorem 2.7 for details) which is given by

$$r^{(q)}(a, x, y) = e^{-\Phi(q)(a - x)}\, W^{(q)}(a - y) - W^{(q)}(x - y). \tag{4}$$

In particular, $R^{(0)}$ will be useful later. Another pair of processes that will appear later are the running supremum and running infimum defined, for all $t \ge 0$, by

$$\overline{X}_t = \sup_{0 \le s \le t} X_s, \qquad \underline{X}_t = \inf_{0 \le s \le t} X_s.$$

The well-known duality lemma states that the pairs $(\overline{X}_t, \overline{X}_t - X_t)$ and $(X_t - \underline{X}_t, -\underline{X}_t)$ have the same distribution under the measure $\mathbb{P}$. Moreover, with $e_q$ an independent exponentially distributed random variable with parameter $q \ge 0$ (where we understand $e_q = \infty$ when $q = 0$), we deduce from the Wiener–Hopf factorisation that the random variables $\overline{X}_{e_q}$ and $\overline{X}_{e_q} - X_{e_q}$ are independent. Furthermore, in the spectrally negative case, $\overline{X}_{e_q}$ is exponentially distributed with parameter $\Phi(q)$. From the theory of scale functions it is also well known that $-\underline{X}_{e_q}$ is a continuous random variable with

$$\mathbb{P}(-\underline{X}_{e_q} \in dz) = \frac{q}{\Phi(q)}\, W^{(q)}(dz) - q\, W^{(q)}(z)\, dz \tag{5}$$

for $z \ge 0$. Denote by $g_r$ the last passage time below $r \in \mathbb{R}$, i.e.

$$g_r = \sup\{t \ge 0 : X_t \le r\}. \tag{6}$$
When r = 0 we simply write g0 = g. Remark 2.1 Note that from the fact that X drifts to infinity we have that, for any r ∈ R, gr < ∞, P-a.s. Moreover, as X is a spectrally negative Lévy process,
and hence the case of a compound Poisson process is excluded, the only way of exiting the set $(-\infty, r]$ is by creeping upwards. This tells us that $X_{g_r-} = r$ and that $g_r = \sup\{t \ge 0 : X_t < r\}$ $\mathbb{P}$-a.s.

Let $\tau_x^+$ be the first passage time above $x$, i.e.

$$\tau_x^+ = \inf\{t > 0 : X_t > x\}.$$

The next lemma gives an equivalence between the existence of moments of $g_r$, $\tau_x^+$ and $X$ and an integrability condition on the Lévy measure.

Lemma 2.2 Let $X$ be a spectrally negative Lévy process drifting to infinity with Lévy measure $\Pi$. The following conditions are equivalent:

1. $\int_{(-\infty,-1)} x^2\, \Pi(dx) < \infty$.
2. $\mathbb{E}(g_r) < \infty$ for some (hence every) $r \in \mathbb{R}$.
3. $\mathbb{E}((\tau_x^+)^2) < \infty$ for some (hence every) $x > 0$.
4. $\mathbb{E}((X_t)^2) < \infty$ for some (hence every) $t > 0$.

Proof The proof follows directly from [9] (see Theorem 1 and Remark (ii) pp. 281–282) and the well-known condition for integrability of $X$ in terms of the Lévy measure (see e.g. [21, Theorem 25.3 pp. 159]).

Clearly, up to any time $t \ge 0$ the value of $g$ is unknown (unless $X$ is trivial), and it is only with the realisation of the whole process that we know that the last passage time below 0 has occurred. However, this is often too late: typically one would like to know how close $X$ is to $g$ at any time $t \ge 0$ and then take some action based on this information. We search for a stopping time $\tau_*$ of $X$ that is as "close" as possible to $g$. We therefore consider the optimal prediction problem

$$V_* = \inf_{\tau \in \mathcal{T}} \mathbb{E}|g - \tau|, \tag{7}$$
where T is the set of all stopping times of F.
3 Main Result

In order to solve the optimal prediction problem (7) we state an equivalence with an optimal stopping problem, which will be solved with a direct approach using fluctuation identities for Lévy processes and the general theory of optimal stopping (see e.g. [19, Chapter I]). This equivalence is mainly based on the work of [22].
Lemma 3.1 Consider the standard optimal stopping problem

$$V = \inf_{\tau \in \mathcal{T}} \mathbb{E}\left(\int_0^\tau G(X_s)\, ds\right), \tag{8}$$

where the function $G$ is given by $G(x) = 2\psi'(0+)W(x) - 1$ for $x \in \mathbb{R}$. The stopping time which minimises (7) is the same which minimises (8). In particular,

$$V_* = V + \mathbb{E}(g). \tag{9}$$
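To get a feeling for the running cost $G$, it can be evaluated in closed form in a simple example. The sketch below is an illustration under assumptions that are not part of the lemma: it uses a Cramér–Lundberg process with exponential claims (drift `d`, jump rate `lam`, claim rate `mu`, all hypothetical values with $d\,\texttt{mu} > \texttt{lam}$), for which $W$ is explicit. By (3), $G(x) = 1 - 2\,\mathbb{P}_x(\tau_0^- < \infty)$, so $G$ is negative where ruin is more likely than not and positive otherwise.

```python
import numpy as np

# Hypothetical parameters (not from the paper); d * mu > lam so that X drifts to infinity
d, lam, mu = 2.0, 1.5, 1.0
rho = mu - lam / d
psi_prime_0 = d - lam / mu  # psi'(0+) = E(X_1)

def W(x):
    """Explicit scale function for exponential claims; zero on the negative half-line."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, (mu - (lam / d) * np.exp(-rho * x)) / (d * rho), 0.0)

def G(x):
    """Running cost of Lemma 3.1: G(x) = 2 psi'(0+) W(x) - 1."""
    return 2.0 * psi_prime_0 * W(x) - 1.0

xs = np.linspace(-1.0, 20.0, 2101)
vals = G(xs)

# Locate the sign change of G on [0, 20] by bisection
lo, hi = 0.0, 20.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if G(mid) < 0:
        lo = mid
    else:
        hi = mid
x_zero = hi
print(float(G(0.0)), x_zero)
```

With these parameters $G$ equals $-1$ on the negative half-line, jumps to $2\psi'(0+)/d - 1$ at zero, and increases to 1; the sign change found by bisection can be compared with the closed form $\rho^{-1}\log(2\,\texttt{lam}/(d\,\texttt{mu}))$ that holds in this particular example.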
Proof Fix any stopping time $\tau$ of $\mathbb{F}$. We then have

$$\begin{aligned} |g - \tau| &= (\tau - g)^+ + (\tau - g)^- \\ &= (\tau - g)^+ + g - (\tau \wedge g) \\ &= \int_0^\tau I_{\{g \le s\}}\, ds + g - \int_0^\tau I_{\{g > s\}}\, ds \\ &= \int_0^\tau I_{\{g \le s\}}\, ds + g - \int_0^\tau [1 - I_{\{g \le s\}}]\, ds \\ &= g + \int_0^\tau [2 I_{\{g \le s\}} - 1]\, ds. \end{aligned}$$

From Fubini's Theorem we have an expression for $\mathbb{E}\left(\int_0^\tau I_{\{g \le s\}}\, ds\right)$ as an integral over $s \in (0, \infty)$ [...]

[...] $\int_{(-\infty,0)} x\,\Pi(dx)/d \ge 1/\sqrt{2} - 1$ (since $F(0) = \psi'(0+)/d = 1 + \int_{(-\infty,0)} x\,\Pi(dx)/d$ and $\int_{(-\infty,0)} x\,\Pi(dx) \le 0$), so the condition given in (ii) tells us that the drift $d$ is much larger than the average size of the jumps. This implies that the process drifts quickly to infinity and then we have to stop the first time that the process $X$ is above zero. In this case, concerning the optimal prediction problem, the stopping time which is nearest (in the $L^1$ sense) to the last time that the process is below zero is the first time that the process is above the level zero.

(iii) If $X$ is of finite variation with $F(0)^2 < 1/2$, then $\int_{(-\infty,0)} x\,\Pi(dx)/d < 1/\sqrt{2} - 1 < 0$, so the average size of the jumps of $X$ is sufficiently large that, when the process crosses above the level zero, it is more likely (than in (ii)) to jump below zero again and spend more time in the region where $G$ is negative. This condition also tells us that the process $X$ drifts a little more slowly to infinity than in (ii). The stopping time which is nearest (in the $L^1$ sense) to the last time that the process is below zero is the first time that the process is above the level $a^* > 0$.
4 Proof of Main Result

In this section we prove Theorem 3.3 using a direct method. Since the proof is rather long we break it into a number of lemmas. In particular, we will use the general theory of optimal stopping (see for example [19, Chapter I]) to get a direct proof of Theorem 3.3. First, using the Snell envelope we will show that an optimal stopping time for (10) is the first time that the process enters the stopping set $D$, defined in terms of the value function $V$. Recall the set

$$\mathcal{T}_t = \{\tau \ge t : \tau \text{ is a stopping time}\}.$$

We denote by $\mathcal{T} = \mathcal{T}_0$ the set of all stopping times. The next lemma is standard in optimal stopping (see e.g. [19, Section I.2.2]) and we include the proof for completeness.

Lemma 4.1 Denoting by $D = \{x \in \mathbb{R} : V(x) = 0\}$ the stopping set, we have that for any $x \in \mathbb{R}$ the stopping time

$$\tau_D = \inf\{t \ge 0 : X_t \in D\}$$

attains the infimum in (10).

Proof From the general theory of optimal stopping consider the Snell envelope defined as

$$S_t^x = \operatorname*{ess\,inf}_{\tau \in \mathcal{T}_t} \mathbb{E}\left(\int_0^\tau G(X_s + x)\, ds \,\Big|\, \mathcal{F}_t\right)$$

and define the stopping time

$$\tau_x^* = \inf\left\{t \ge 0 : S_t^x = \int_0^t G(X_s + x)\, ds\right\}.$$

Then we have that the stopping time $\tau_x^*$ is optimal (see Theorem 2.2 of [19, pp. 29]) for

$$\inf_{\tau \in \mathcal{T}} \mathbb{E}\left(\int_0^\tau G(X_s + x)\, ds\right). \tag{18}$$

On account of the Markov property we have

$$\begin{aligned} S_t^x &= \operatorname*{ess\,inf}_{\tau \in \mathcal{T}_t} \mathbb{E}\left(\int_0^\tau G(X_s + x)\, ds \,\Big|\, \mathcal{F}_t\right) \\ &= \int_0^t G(X_s + x)\, ds + \operatorname*{ess\,inf}_{\tau \in \mathcal{T}_t} \mathbb{E}\left(\int_t^\tau G(X_s + x)\, ds \,\Big|\, \mathcal{F}_t\right) \\ &= \int_0^t G(X_s + x)\, ds + \operatorname*{ess\,inf}_{\tau \in \mathcal{T}} \mathbb{E}_{X_t}\left(\int_0^\tau G(X_s + x)\, ds\right) \\ &= \int_0^t G(X_s + x)\, ds + V(X_t + x), \end{aligned}$$
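where the last equality follows from the spatial homogeneity of Lévy processes and from the definition of $V$. Therefore $\tau_x^* = \inf\{t \ge 0 : V(X_t + x) = 0\}$ and thus $\tau_x^* = \inf\{t \ge 0 : X_t + x \in D\}$. It follows that

$$\begin{aligned} V(x) &= \inf_{\tau \in \mathcal{T}} \mathbb{E}_x\left(\int_0^\tau G(X_t)\, dt\right) \\ &= \inf_{\tau \in \mathcal{T}} \mathbb{E}\left(\int_0^\tau G(X_t + x)\, dt\right) \\ &= \mathbb{E}\left(\int_0^{\tau_x^*} G(X_t + x)\, dt\right) \\ &= \mathbb{E}_x\left(\int_0^{\tau_D} G(X_t)\, dt\right), \end{aligned}$$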
where the third equality holds because $\tau_x^*$ is optimal for (18) and the fourth follows from the spatial homogeneity of Lévy processes. Therefore $\tau_D$ is an optimal stopping time for $V(x)$ for all $x \in \mathbb{R}$.

Next, we prove that $V(x)$ is finite for all $x \in \mathbb{R}$, which implies that there exists a stopping time $\tau_*$ such that the infimum in (10) is attained. Recall the definition of $x_0$ in (11).

Lemma 4.2 The function $V$ is non-decreasing with $V(x) \in (-\infty, 0]$ for all $x \in \mathbb{R}$. In particular, $V(x) < 0$ for any $x \in (-\infty, x_0)$.

Proof From the spatial homogeneity of Lévy processes we deduce

$$V(x) = \inf_{\tau \in \mathcal{T}} \mathbb{E}\left(\int_0^\tau G(X_s + x)\, ds\right).$$

Then, if $x_1 \le x_2$ we have $G(X_s + x_1) \le G(X_s + x_2)$ since $G$ is a non-decreasing function (see the discussion before Theorem 3.3). This implies that $V(x_1) \le V(x_2)$ and $V$ is non-decreasing as claimed. By taking the stopping time $\tau \equiv 0$ we immediately deduce that for any $x \in \mathbb{R}$ we have $V(x) \le 0$.
Next, we show that V(x) < 0 for any $x \in (-\infty, x_0)$. Let $x < x_0$ and let $y_0 \in (x, x_0)$. Then $G(x) \le G(y_0) < 0$ and from the fact that $X_s \le y_0$ for all $s < \tau_{y_0}^+$ we have
\[
V(x) \le E_x\left[ \int_0^{\tau_{y_0}^+} G(X_s)\,ds \right]
\le E_x\left[ \int_0^{\tau_{y_0}^+} G(y_0)\,ds \right]
= G(y_0)\,E_x(\tau_{y_0}^+) < 0,
\]
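where the strict inequality holds due to $P_x(\tau_{y_0}^+ > 0) > 0$ and thus $E_x(\tau_{y_0}^+) > 0$. Finally, we show that $V(x) > -\infty$ for all $x \in \mathbb{R}$. Note that $G(x) \ge -I_{\{x \le x_0\}}$ holds for all $x \in \mathbb{R}$ and thus
\[
\begin{aligned}
V(x) &= \inf_{\tau \in \mathcal{T}} E_x\left[ \int_0^{\tau} G(X_s)\,ds \right] \\
&\ge \inf_{\tau \in \mathcal{T}} E_x\left[ \int_0^{\tau} -I_{\{X_s \le x_0\}}\,ds \right] \\
&= -\sup_{\tau \in \mathcal{T}} E_x\left[ \int_0^{\tau} I_{\{X_s \le x_0\}}\,ds \right] \\
&\ge -E_x\left[ \int_0^{\infty} I_{\{X_s \le x_0\}}\,ds \right] \\
&\ge -E_x(g_{x_0}),
\end{aligned}
\]
where the last inequality holds since if $s > g_{x_0}$ then $I_{\{X_s \le x_0\}} = 0$. From Lemma 2.2 we have that $E_x(g_{x_0}) < \infty$. Hence for all $x < x_0$ we have $V(x) \ge -E_x(g_{x_0}) > -\infty$ and, due to the monotonicity of V, $V(x) > -\infty$ for all $x \in \mathbb{R}$.

Next, we derive some properties of V which will be useful to find the form of the set D.

Lemma 4.3 The set D is non-empty. Moreover, there exists an $\tilde{x}$ such that V(x) = 0 for all $x \ge \tilde{x}$.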
Proof Suppose that $D = \emptyset$. Then by Lemma 4.1 an optimal stopping time for (10) is $\tau_D = \infty$. This implies that
\[
V(x) = E_x\left[ \int_0^{\infty} G(X_t)\,dt \right].
\]
Let m be the median of G, i.e.
\[
m = \inf\{x \in \mathbb{R} : G(x) \ge 1/2\},
\]
and let $g_m$ be the last time that the process is below the level m, defined in (6). Then
\[
E_x\left[ \int_0^{\infty} G(X_t)\,dt \right]
= E_x\left[ \int_0^{g_m} G(X_t)\,dt \right] + E_x\left[ \int_{g_m}^{\infty} G(X_t)\,dt \right]. \tag{19}
\]
Note that, since G is finite and $g_m$ has finite expectation (see Lemma 2.2), the first term on the right-hand side of (19) is finite. Now we analyse the second term on the right-hand side of (19). With $n \in \mathbb{N}$, since $G(X_t)$ is nonnegative for all $t \ge g_m$ we have

$E_x\left[ \int_{g_m}^{\infty} G(X_t)\,dt \right]$ = Ex I{gm […] 0, where the last equality follows since F is a distribution function. Moreover we have that $f(a) = -1/\psi'(0+) < 0$ for a < 0 and
\[
f(0+) = \lim_{a \downarrow 0} f(a) = \frac{2}{\psi'(0+)} F(0)^2 - \frac{1}{\psi'(0+)}.
\]
Hence in the case that X is of infinite variation we have that F(0) = 0 and thus f(0+) < 0. Meanwhile, in the case that X is of finite variation, f(0+) < 0 if and only if $F(0)^2 < 1/2$. Hence, if X is of infinite variation or X is of finite variation with $F(0)^2 < 1/2$, there exists a value $a^* \ge 0$ such that $f(a^*) = 0$, and this occurs if and only if
\[
\int_{[0,a^*]} F(a^* - z)\,F(dz) = \frac{1}{2}.
\]
In the case that X is of finite variation with $F(0)^2 \ge 1/2$ we have that $f(0+) \ge 0$ and then we define $a^* = 0$. Therefore we have the following: there exists a value $a^* \ge 0$ such that f(x) < 0 for $x < a^*$ and f(x) > 0 for $x > a^*$. This implies that the behaviour of $a \mapsto V_a(x)$ is as follows: for $a < a^*$, $a \mapsto V_a(x)$ is a decreasing function, and for $a > a^*$, $a \mapsto V_a(x)$ is increasing. Consequently $a \mapsto V_a(x)$ reaches its minimal value uniquely at $a = a^*$. That is, for all $a \in \mathbb{R}$,
\[
V_a(x) \ge V_{a^*}(x) \quad \text{for all } x \in \mathbb{R}.
\]
It only remains to prove that $a^* \ge x_0$. Recall that the definition of $x_0$ is $x_0 = \inf\{x \in \mathbb{R} : G(x) \ge 0\} = \inf\{x \in \mathbb{R} : F(x) \ge 1/2\}$. We know from the definition of $a^*$ that $f(a^*) \ge 0$, which implies that
\[
\frac{1}{2} \le \int_{[0,a^*]} F(a^* - z)\,F(dz) \le \int_{[0,a^*]} F(dz) = F(a^*),
\]
where in the last inequality we use that $F(x) \le 1$. Therefore we have that $a^* \ge x_0$. We conclude that for all $x \in \mathbb{R}$,
\[
V(x) = E_x\left[ \int_0^{\tau_{a^*}^+} G(X_t)\,dt \right],
\]
where $a^*$ is characterised in Theorem 3.3. All that is left to show now are the necessary and sufficient conditions for smooth fit to hold at $a^*$.

Lemma 4.7 We have the following:
(i) If X is of infinite variation or finite variation with (13) then there is smooth fit at $a^*$, i.e. $V'(a^*-) = 0$.
(ii) If X is of finite variation and (13) does not hold then there is continuous fit at $a^* = 0$, i.e. $V(0-) = 0$. There is no smooth fit at $a^*$, i.e. $V'(a^*-) > 0$.

Proof From Lemma 4.5 we know that V(x) = 0 for $x \ge a^*$ and for $x \le a^*$
\[
V(x) = \frac{2}{\psi'(0+)} \int_x^{a^*} \int_{[0,y]} F(y - z)\,F(dz)\,dy - \frac{a^* - x}{\psi'(0+)}.
\]
Note that when X is of finite variation with $F(0)^2 \ge 1/2$ we have $a^* = 0$ and hence
\[
V(x) = \frac{x}{\psi'(0+)}\, I_{\{x \le 0\}},
\]
so V(0−) = 0 = V(0+). The left and right derivatives of V at 0 are given by
\[
V'(0-) = \frac{1}{\psi'(0+)} \quad \text{and} \quad V'(0+) = 0.
\]
Therefore in this case only the continuous fit at 0 is satisfied. If X is of infinite variation or finite variation with $F(0)^2 < 1/2$ we have from Lemma 4.5 that $a^* > 0$. The derivative of V for $x \le a^*$ is
\[
V'(x) = -\frac{2}{\psi'(0+)} \int_{[0,x]} F(x - z)\,F(dz) + \frac{1}{\psi'(0+)}.
\]
Since $a^*$ satisfies $\int_{[0,a^*]} F(a^* - z)\,F(dz) = 1/2$ we have that
\[
V'(a^*-) = 0 = V'(a^*+).
\]
Thus we have smooth fit at $a^*$.
Remark 4.8 The main result can also be deduced using a classical verification-type argument. Indeed, it is straightforward to show that if $\tau^* \ge 0$ is a candidate optimal strategy for the optimal stopping problem (10) and $V^*(x) = E_x\left[ \int_0^{\tau^*} G(X_t)\,dt \right]$, $x \in \mathbb{R}$, then the pair $(V^*, \tau^*)$ is a solution if
1. $V^*(x) \le 0$ for all $x \in \mathbb{R}$,
2. the process $\{ V^*(X_t) + \int_0^t G(X_s)\,ds,\ t \ge 0 \}$ is a $P_x$-submartingale for all $x \in \mathbb{R}$.
With $\tau^* = \tau_{a^*}^+$ it can be shown that the first condition is satisfied. The submartingale property can also be shown to hold using Itô's formula. However, the proof of this turns out to be rather more involved than the direct approach, as it requires some technical lemmas to derive the necessary smoothness of the value function, as well as the required inequality linked to the submartingale property.
Remark 4.9 In the present work we predict the last zero in an infinite time horizon setting, for which the condition that X drifts to infinity is added to have a non-trivial solution. The latter condition may be removed if we work in a finite horizon setting, i.e. if we predict the last zero before a fixed time T > 0,
\[
g_T = \sup\{0 < t \le T : X_t \le 0\}.
\]
In this case, similarly to [10], we expect the optimal boundary to be a continuous and non-increasing function of time. However, the finite horizon case is significantly more challenging than the infinite horizon setting, and beyond the scope of this paper. A more tractable version of the finite horizon problem is obtained by replacing the fixed horizon T > 0 by an (independent) exponentially distributed time. Perhaps surprisingly, unlike the case of Canadised options (see e.g. [1]), it turns out that an optimal stopping time is given by the first time the process crosses above a non-increasing and nonnegative function of time. The characterisation of this boundary requires more involved calculations and is left for a subsequent paper.
5 Examples

We calculate numerically (using the statistical software R [20]) the value function $x \mapsto V_a(x)$ for some values of $a \in \mathbb{R}$. The models used were Brownian motion with drift, the Cramér–Lundberg risk process with exponential claims and a spectrally negative Lévy process with no Gaussian component and Lévy measure given by $\Pi(dy) = e^{\beta y}(e^{y} - 1)^{-(\beta+1)}\,dy$, y < 0.
5.1 Brownian Motion with Drift

Let $X = \{X_t, t \ge 0\}$ be a Brownian motion with drift, i.e.
\[
X_t = \sigma B_t + \mu t, \quad t \ge 0,
\]
where μ > 0. Since there are no positive jumps and the drift is positive, X is indeed a spectrally negative Lévy process drifting to infinity. In this case the expressions for the Laplace exponent and scale functions are well known (see for example [14, Example 1.3]). The Laplace exponent is given by
\[
\psi(\beta) = \frac{\sigma^2}{2}\beta^2 + \mu\beta, \quad \beta \in \mathbb{R}.
\]
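For q ≥ 0 the scale function $W^{(q)}$ is
\[
W^{(q)}(x) = \frac{1}{\sqrt{\mu^2 + 2q\sigma^2}} \left[ \exp\left( \left( \sqrt{\mu^2 + 2q\sigma^2} - \mu \right) \frac{x}{\sigma^2} \right) - \exp\left( -\left( \sqrt{\mu^2 + 2q\sigma^2} + \mu \right) \frac{x}{\sigma^2} \right) \right], \quad x \ge 0.
\]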
Letting q = 0 and using that $F(x) = \psi'(0+)W(x) = \mu W(x)$ we get that
\[
F(x) = 1 - \exp\left( -\frac{2\mu}{\sigma^2} x \right), \quad x \ge 0.
\]
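The reduction from the scale function to F can be checked numerically. The following is a minimal sketch, assuming the formula for $W^{(q)}$ quoted above; the function names `scale_function_bm` and `F_bm` are illustrative choices and not part of the original text.

```python
import math


def scale_function_bm(x, q, mu, sigma):
    """q-scale function W^(q)(x) of X_t = sigma * B_t + mu * t (zero for x < 0)."""
    if x < 0:
        return 0.0
    root = math.sqrt(mu**2 + 2.0 * q * sigma**2)
    growing = math.exp((root - mu) * x / sigma**2)
    decaying = math.exp(-(root + mu) * x / sigma**2)
    return (growing - decaying) / root


def F_bm(x, mu, sigma):
    """F(x) = psi'(0+) * W(x), where psi'(0+) = mu for Brownian motion with drift."""
    return mu * scale_function_bm(x, 0.0, mu, sigma)


if __name__ == "__main__":
    mu, sigma = 1.0, 1.0  # same parameters as in the Brownian example
    for x in (0.0, 0.5, 1.0, 2.0):
        closed_form = 1.0 - math.exp(-2.0 * mu * x / sigma**2)
        print(x, F_bm(x, mu, sigma), closed_form)
```

At q = 0 the square root equals μ, so the first exponential is identically 1 and the two printed columns should agree up to rounding.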
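That is, $-\underline{X}_\infty \sim \mathrm{Exp}(2\mu/\sigma^2)$, which implies that H corresponds to the distribution function of a Gamma(2, 2μ/σ²) random variable (see Remark 3.4 (i)). Therefore $a^*$ corresponds to the median of the aforementioned Gamma distribution. In other words, H is given by
\[
H(x) = 1 - \frac{2\mu}{\sigma^2}\, x \exp\left( -\frac{2\mu}{\sigma^2} x \right) - \exp\left( -\frac{2\mu}{\sigma^2} x \right), \quad x \ge 0,
\]
and then $a^*$ is the solution to
\[
1 - \frac{2\mu}{\sigma^2}\, x \exp\left( -\frac{2\mu}{\sigma^2} x \right) - \exp\left( -\frac{2\mu}{\sigma^2} x \right) = \frac{1}{2}.
\]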
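The equation for $a^*$ has no closed-form solution, but H is increasing, so a bisection search suffices. The sketch below is one possible way to solve it and is not taken from the paper (which reports using R); the helper names `H_gamma2`, `bisect_root` and `a_star_bm` are hypothetical.

```python
import math


def H_gamma2(x, rate):
    """Distribution function of a Gamma(2, rate) random variable."""
    if x <= 0:
        return 0.0
    return 1.0 - rate * x * math.exp(-rate * x) - math.exp(-rate * x)


def bisect_root(func, lo, hi, tol=1e-12):
    """Root of an increasing function with func(lo) < 0 <= func(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def a_star_bm(mu, sigma):
    """Median of Gamma(2, 2*mu/sigma^2), i.e. the solution of H(x) = 1/2."""
    rate = 2.0 * mu / sigma**2
    hi = 1.0
    while H_gamma2(hi, rate) < 0.5:  # enlarge the bracket until it contains the median
        hi *= 2.0
    return bisect_root(lambda x: H_gamma2(x, rate) - 0.5, 0.0, hi)


if __name__ == "__main__":
    print(a_star_bm(1.0, 1.0))
```

For μ = σ = 1 the rate is 2, so the result is half the median of a Gamma(2, 1) distribution.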
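Moreover, V is given by
\[
V(x) =
\begin{cases}
0, & x \ge a^*, \\[4pt]
\dfrac{2}{\mu}\left( a^* e^{-2\mu a^*/\sigma^2} - x e^{-2\mu x/\sigma^2} \right) + \dfrac{2\sigma^2}{\mu^2}\left( e^{-2\mu a^*/\sigma^2} - e^{-2\mu x/\sigma^2} \right) + \dfrac{a^* - x}{\mu}, & 0 < x < a^*, \\[4pt]
\dfrac{2}{\mu}\, a^* e^{-2\mu a^*/\sigma^2} - \dfrac{2\sigma^2}{\mu^2}\left( 1 - e^{-2\mu a^*/\sigma^2} \right) + \dfrac{a^* + x}{\mu}, & x \le 0.
\end{cases}
\]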
In Fig. 2 we sketch a picture of Va (x) defined in (23) for different values of a. The parameters chosen for the model are μ = 1 and σ = 1.
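The piecewise expression for V can be evaluated directly once $a^*$ is known. The following sketch assumes the closed form above and computes $a^*$ by bisection; it is an illustration rather than the authors' computation, and all function names are hypothetical.

```python
import math


def H_bm(x, rate):
    """Gamma(2, rate) distribution function."""
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + rate * x) * math.exp(-rate * x)


def a_star_bm(mu, sigma, tol=1e-12):
    """Solve H(a) = 1/2 by bisection, with rate = 2*mu/sigma^2."""
    rate = 2.0 * mu / sigma**2
    lo, hi = 0.0, 1.0
    while H_bm(hi, rate) < 0.5:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if H_bm(mid, rate) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def value_function_bm(x, a, mu, sigma):
    """Piecewise closed form for V(x) with threshold a (use a = a_star_bm(mu, sigma))."""
    c = 2.0 * mu / sigma**2
    s2 = sigma**2
    ea = math.exp(-c * a)
    if x >= a:
        return 0.0
    if x > 0:
        ex = math.exp(-c * x)
        return (2.0 / mu) * (a * ea - x * ex) + (2.0 * s2 / mu**2) * (ea - ex) + (a - x) / mu
    return (2.0 / mu) * a * ea - (2.0 * s2 / mu**2) * (1.0 - ea) + (a + x) / mu


if __name__ == "__main__":
    mu, sigma = 1.0, 1.0
    a = a_star_bm(mu, sigma)
    for x in (-2.0, -1.0, 0.0, 0.4, a):
        print(x, value_function_bm(x, a, mu, sigma))
```

Evaluating a finite difference of V just to the left of $a^*$ gives a slope close to zero, which is consistent with the smooth fit in Lemma 4.7 (i).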
Fig. 2 Brownian motion with drift. Function $x \mapsto V_a(x)$ for different values of a. Blue: $a < a^*$; green: $a > a^*$; black: $a = a^*$

5.2 Cramér–Lundberg Risk Process

We consider $X = \{X_t, t \ge 0\}$ as the Cramér–Lundberg risk process with exponential claims. That is, $X_t$ is given by
\[
X_t = \mu t - \sum_{i=1}^{N_t} \xi_i, \quad t \ge 0,
\]
where μ > 0, $N = \{N_t, t \ge 0\}$ is a Poisson process with rate λ ≥ 0 and $(\xi_i)$ is a sequence of independent and identically distributed exponential random variables with parameter ρ > 0. Due to the presence of only negative jumps we have that X is a spectrally negative Lévy process. It can be easily shown that X is a finite variation process. Moreover, since we need the process to drift to infinity we assume that
\[
\frac{\lambda}{\rho\mu} < 1.
\]
The Laplace exponent is given by
\[
\psi(\beta) = \mu\beta - \frac{\lambda\beta}{\rho + \beta}, \quad \beta \ge 0.
\]
It is well known (see [13, Example 1.3], or [15, Exercise 8.3 (iii)]) that the scale function for this process is given by
\[
W(x) = \frac{1}{\mu - \lambda/\rho} \left( 1 - \frac{\lambda}{\mu\rho} \exp\left( -(\rho - \lambda/\mu)x \right) \right), \quad x \ge 0.
\]
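This directly implies that
\[
F(x) = 1 - \frac{\lambda}{\mu\rho} \exp\left( -(\rho - \lambda/\mu)x \right), \quad x \ge 0,
\]
and then H is given by
\[
\begin{aligned}
H(x) ={} & \left( 1 - \frac{\lambda}{\mu\rho} \right)^2 + 2 \left( 1 - \frac{\lambda}{\mu\rho} \right) \frac{\lambda}{\mu\rho} \left( 1 - \exp\left( -(\rho - \lambda/\mu)x \right) \right) \\
& + \left( \frac{\lambda}{\mu\rho} \right)^2 \left( 1 - (\rho - \lambda/\mu)\,x \exp\left( -(\rho - \lambda/\mu)x \right) - \exp\left( -(\rho - \lambda/\mu)x \right) \right), \quad x \ge 0.
\end{aligned}
\]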
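Hence, when $(1 - \lambda/(\mu\rho))^2 \ge 1/2$ we have that $a^* = 0$ and
\[
V(x) = \frac{x}{\mu - \lambda/\rho}\, I_{\{x \le 0\}}.
\]
For the case $(1 - \lambda/(\mu\rho))^2 < 1/2$, $a^*$ is the solution to the equation H(x) = 1/2 and the value function is given by
\[
V(x) = \left( \frac{2}{\mu - \lambda/\rho} \int_x^{a^*} H(y)\,dy - \frac{a^* - x}{\mu - \lambda/\rho} \right) I_{\{x \le a^*\}}.
\]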
In Fig. 3 we calculate numerically the value of $x \mapsto V_a(x)$ for the parameters μ = 2, λ = 1 and ρ = 1 and different values of a.
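The two regimes for the Cramér–Lundberg process can be illustrated with a short computation. This is a sketch under stated assumptions: H is taken from the displayed formula (with H = 0 on the negative half-line), $a^*$ is found by bisection, and the integral of H is approximated with a midpoint rule; the function names and the step count `n` are illustrative and not from the paper.

```python
import math


def H_cl(x, mu, lam, rho):
    """H for the Cramér–Lundberg process with Exp(rho) claims; zero for x < 0."""
    if x < 0:
        return 0.0
    q = lam / (mu * rho)   # weight of the exponential part of F
    p = 1.0 - q            # atom of F at zero, F(0)
    c = rho - lam / mu
    e = math.exp(-c * x)
    return p * p + 2.0 * p * q * (1.0 - e) + q * q * (1.0 - c * x * e - e)


def a_star_cl(mu, lam, rho, tol=1e-12):
    """a* = 0 if F(0)^2 >= 1/2, otherwise the solution of H(a) = 1/2."""
    p = 1.0 - lam / (mu * rho)
    if p * p >= 0.5:
        return 0.0
    lo, hi = 0.0, 1.0
    while H_cl(hi, mu, lam, rho) < 0.5:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if H_cl(mid, mu, lam, rho) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def value_function_cl(x, mu, lam, rho, n=2000):
    """V(x) = (2 * int_x^{a*} H(y) dy - (a* - x)) / psi'(0+) for x <= a*, else 0."""
    a = a_star_cl(mu, lam, rho)
    if x >= a:
        return 0.0
    drift = mu - lam / rho          # psi'(0+)
    lo = max(x, 0.0)                # H vanishes on the negative half-line
    h = (a - lo) / n
    integral = sum(H_cl(lo + (k + 0.5) * h, mu, lam, rho) for k in range(n)) * h
    return (2.0 * integral - (a - x)) / drift


if __name__ == "__main__":
    print(a_star_cl(2.0, 1.0, 1.0))          # parameters used for Fig. 3
    print(value_function_cl(0.0, 2.0, 1.0, 1.0))
    print(a_star_cl(2.0, 0.1, 1.0))          # a case with F(0)^2 >= 1/2
```

With μ = 2, λ = 1, ρ = 1 one has $F(0)^2 = 1/4 < 1/2$, so $a^*$ is strictly positive; with λ = 0.1 instead, $F(0)^2 = 0.9025$ and the sketch returns $a^* = 0$.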
5.3 An Infinite Variation Process with No Gaussian Component

In this subsection we consider a spectrally negative Lévy process related to the theory of self-similar Markov processes and conditioned stable Lévy processes (see [7]). We consider X as a spectrally negative Lévy process with Laplace exponent
\[
\psi(\theta) = \frac{\Gamma(\theta + \beta)}{\Gamma(\theta)\Gamma(\beta)}, \quad \theta \ge 0,
\]
where β ∈ (1, 2). This process has no Gaussian component and the corresponding Lévy measure is given by
\[
\Pi(dy) = \frac{e^{\beta y}}{(e^{y} - 1)^{\beta+1}}\,dy, \quad y < 0.
\]
Fig. 3 Cramér–Lundberg risk process. Function $x \mapsto V_a(x)$ for different values of a. Blue: $a < a^*$; green: $a > a^*$; black: $a = a^*$
This process is of infinite variation and drifts to infinity. Its scale function is given by
\[
W(x) = (1 - e^{-x})^{\beta - 1}, \quad x \ge 0.
\]
Taking the limit when x goes to infinity we easily deduce that $\psi'(0+) = 1$ and thus F(x) = W(x). For the case β = 2 we calculate numerically the function H and then the value of the number $a^*$ as well as the values of the function V; see Fig. 4. We also include the values of the function $V_a(x)$ (defined in (23)) for different values of a (including $a = a^*$).

We close this section with a final remark. Using the empirical evidence from the previous examples we make some observations about whether the smooth fit condition holds.

Remark 5.1 Note in Figs. 2, 3, and 4 that the value $a^*$ is the unique value for which the function $x \mapsto V_a(x)$ exhibits smooth fit (or continuous fit) at $a^*$. When we choose $a_2 > a^*$, the function $x \mapsto V_{a_2}(x)$ is not differentiable at $a_2$. Moreover, there exists some x such that $V_{a_2}(x) > 0$. Similarly, if $a_1 < a^*$ the function $x \mapsto V_{a_1}(x)$ is also not differentiable at $a_1$.
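For the example of Sect. 5.3 the function H is a convolution that generally has to be computed numerically. The sketch below approximates $H(x) = \int_{[0,x]} F(x - z)\,F(dz)$ by a Riemann–Stieltjes midpoint sum and then finds $a^*$ by bisection. This is an assumed numerical scheme chosen for illustration, not the procedure used in the paper; the names `F_example`, `H_numeric` and `a_star_numeric` are hypothetical.

```python
import math


def F_example(x, beta):
    """F(x) = W(x) = (1 - exp(-x))^(beta - 1) for x > 0, and 0 otherwise."""
    if x <= 0:
        return 0.0
    return (1.0 - math.exp(-x)) ** (beta - 1.0)


def H_numeric(x, beta, n=2000):
    """Riemann–Stieltjes midpoint sum for the integral of F(x - z) against F(dz) on [0, x]."""
    if x <= 0:
        return 0.0
    h = x / n
    total = 0.0
    previous = 0.0  # F(0) = 0, so F has no atom at zero
    for k in range(n):
        current = F_example((k + 1) * h, beta)
        total += F_example(x - (k + 0.5) * h, beta) * (current - previous)
        previous = current
    return total


def a_star_numeric(beta, tol=1e-6):
    """Solve H(a) = 1/2 by bisection on the numerically evaluated H."""
    lo, hi = 0.0, 1.0
    while H_numeric(hi, beta) < 0.5:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if H_numeric(mid, beta) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    beta = 2.0
    x = 1.0
    print(H_numeric(x, beta), 1.0 - math.exp(-x) - x * math.exp(-x))
    print(a_star_numeric(beta))
```

For β = 2 the function F reduces to $1 - e^{-x}$, so H has the closed form $1 - e^{-x} - x e^{-x}$, which gives a convenient check of the quadrature. Using increments of F rather than its density avoids evaluating the density near zero, where it is unbounded when β < 2.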
Fig. 4 Spectrally negative Lévy process with Lévy measure $\Pi(dy) = e^{2y}(e^{y} - 1)^{-3}\,dy$. Function $x \mapsto V_a(x)$ for different values of a. Blue: $a < a^*$; green: $a > a^*$; black: $a = a^*$
References

1. Avram, F., Kyprianou, A.E., Pistorius, M.R., et al.: Exit problems for spectrally negative Lévy processes and applications to (Canadized) Russian options. Ann. Appl. Probab. 14(1), 215–238 (2004)
2. Baurdoux, E.J.: Last exit before an exponential time for spectrally negative Lévy processes. J. Appl. Probab. 46(2), 542–558 (2009)
3. Baurdoux, E.J., Van Schaik, K.: Predicting the time at which a Lévy process attains its ultimate supremum. Acta Appl. Math. 134(1), 21–44 (2014)
4. Bernyk, V., Dalang, R.C., Peskir, G.: Predicting the ultimate supremum of a stable Lévy process with no negative jumps. Ann. Probab. 39(6), 2385–2423 (2011)
5. Bertoin, J.: Lévy Processes, vol. 121. Cambridge University, Cambridge (1998)
6. Bichteler, K.: Stochastic Integration with Jumps, vol. 89. Cambridge University, Cambridge (2002)
7. Chaumont, L., Kyprianou, A.E., Pardo, J.C.: Some explicit identities associated with positive self-similar Markov processes. Stochastic Process. Appl. 119(3), 980–1000 (2009)
8. Chiu, S.N., Yin, C., et al.: Passage times for a spectrally negative Lévy process with applications to risk theory. Bernoulli 11(3), 511–522 (2005)
9. Doney, R., Maller, R.: Moments of passage times for Lévy processes. In: Annales de l'IHP Probabilités et Statistiques, vol. 40, pp. 279–297 (2004)
10. Du Toit, J., Peskir, G., Shiryaev, A.: Predicting the last zero of Brownian motion with drift. Stochastics Int. J. Probab. Stochastics Process. 80(2–3), 229–245 (2008)
11. Glover, K., Hulley, H.: Optimal prediction of the last-passage time of a transient diffusion. SIAM J. Control Optim. 52(6), 3833–3853 (2014)
12. Glover, K., Hulley, H., Peskir, G., et al.: Three-dimensional Brownian motion and the golden ratio rule. Ann. Appl. Probab. 23(3), 895–922 (2013)
13. Hubalek, F., Kyprianou, E.: Old and new examples of scale functions for spectrally negative Lévy processes. In: Seminar on Stochastic Analysis, Random Fields and Applications, vol. VI, pp. 119–145 (2011)
14. Kuznetsov, A., Kyprianou, A., Rivero, V.: The theory of scale functions for spectrally negative Lévy processes. In: Lévy Matters II. Springer Lecture Notes in Mathematics (2011)
15. Kyprianou, A.E.: Fluctuations of Lévy Processes with Applications: Introductory Lectures, 2nd edn. Springer, Berlin (2014)
16. Madan, D., Roynette, B., Yor, M.: From Black-Scholes formula, to local times and last passage times for certain submartingales. Working Paper (2008a). https://hal.archives-ouvertes.fr/hal00261868
17. Madan, D., Roynette, B., Yor, M.: Option prices as probabilities. Finance Res. Lett. 5(2), 79–87 (2008b)
18. Paroissin, C., Rabehasaina, L.: First and last passage times of spectrally positive Lévy processes with application to reliability. Methodol. Comput. Appl. Probab. 17(2), 1–22 (2013)
19. Peskir, G., Shiryaev, A.: Optimal Stopping and Free-Boundary Problems. Springer, Berlin (2006)
20. R Core Team: R: A Language and Environment for Statistical Computing. Vienna, Austria (2017). https://www.R-project.org/
21. Sato, K.: Lévy Processes and Infinitely Divisible Distributions. Cambridge University, Cambridge (1999)
22. Urusov, M.A.: On a property of the moment at which Brownian motion attains its maximum and some optimal stopping problems. Theory Probab. Appl. 49(1), 169–176 (2005)
Box-Ball System: Soliton and Tree Decomposition of Excursions

Pablo A. Ferrari and Davide Gabrielli
Abstract We review combinatorial properties of solitons of the Box-Ball system introduced by Takahashi and Satsuma (J Phys Soc Jpn 59(10):3514–3519, 1990). Starting with several definitions of the system, we describe ways to identify solitons and review a proof of the conservation of the solitons under the dynamics. Ferrari et al. (Soliton decomposition of the box-ball system (2018). arXiv:1806.02798) proposed a soliton decomposition of a configuration into a family of vectors, one for each soliton size. Based on this decomposition, the authors (Ferrari and Gabrielli, Electron. J. Probab. 25, Paper No. 78–1, 2020) propose a family of measures on the set of excursions which induces invariant distributions for the Box-Ball System. In the present paper, we propose a new soliton decomposition which is equivalent to a branch decomposition of the tree associated to the excursion, see Le Gall (Une approche élémentaire des théorèmes de décomposition de Williams. In: Séminaire de Probabilités, XX, 1984/85, vol. 1204, pp. 447–464. Lecture Notes in Mathematics. Springer, Berlin (1986)). A ball configuration distributed as independent Bernoulli variables of parameter λ < 1/2 is in correspondence with a simple random walk with negative drift 2λ − 1 and having infinitely many excursions over the local minima. In this case the soliton decomposition of the walk consists of independent doubly infinite vectors of iid geometric random variables (Ferrari and Gabrielli, Electron. J. Probab. 25, Paper No. 78–1, 2020). We show that this property is shared by the branch decomposition of the excursion trees of the random walk and discuss a corresponding construction of a Geometric branching process with independent but not identically distributed Geometric random variables.
P. A. Ferrari
Universidad de Buenos Aires, Buenos Aires, Argentina

D. Gabrielli
Università di L'Aquila, L'Aquila, Italy

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. S. I. López et al. (eds.), XIII Symposium on Probability and Stochastic Processes, Progress in Probability 75, https://doi.org/10.1007/978-3-030-57513-7_5
Keywords Box-ball system · Solitons · Excursions · Planar trees AMS 2010 Subject Classification 37B15, 37K40, 60C05, 82C23
1 Introduction

The Box-Ball System (BBS) is a cellular automaton introduced by Takahashi and Satsuma [17] describing the deterministic evolution of a finite number of balls on the infinite lattice $\mathbb{Z}$. A ball configuration η is an element of $\{0, 1\}^{\mathbb{Z}}$, where η(i) = 1 indicates that there is a ball at box $i \in \mathbb{Z}$. For ball configurations with a finite number of balls, the dynamics is as follows. A carrier starts with zero load to the left of the occupied boxes, visits the boxes successively from left to right and at each box proceeds as follows: (a) if the box is occupied, the carrier increases its load by one and the box becomes empty, or (b) if the box is empty and the carrier load is positive, then the carrier decreases its load by one and the box becomes occupied. This mechanism is illustrated with an example in Fig. 1. In Sect. 2.1 we give alternative equivalent descriptions of the dynamics. We denote by $T\eta$ the configuration obtained after the carrier has visited all boxes in η, and by $T^t\eta$ the configuration obtained after iterating this procedure t times, for a positive integer t. The dynamics can be defined for suitable configurations with infinitely many balls satisfying that "there are more empty boxes than occupied boxes", and it conserves the set of configurations with density of balls less than 1/2 [7]; see details in Sect. 2.1. The main motivation of [17] was to identify objects conserved by the dynamics that they called basic sequences, later called solitons by [14]; we follow this nomenclature. The Box-Ball system has been proposed as a discrete model with the same behavior as the Korteweg-de Vries equation [18], an integrable partial differential equation having solitonic behavior. Given a ball configuration η, a k-soliton consists of k occupied boxes denoted $h_1, \dots, h_k \in \mathbb{Z}$ and k empty boxes denoted $t_1, \dots, t_k \in \mathbb{Z}$.
Fig. 1 An example of the evolution of the Box-Ball System. The initial configuration of balls η is the one in frame 1, where the boxes not drawn are all empty. The carrier starts empty on the left of the particles and moves to the right following the increasing order of the frames. In the last frame 10 the carrier is empty and will remain empty. The configuration $T\eta$ is the one drawn in frame 10. A ball is denoted by • and an empty box by ◦

Takahashi and Satsuma showed that if η has a finite number of occupied boxes, then every occupied box belongs to some soliton, and proposed an algorithm to identify solitons. We explain their algorithm in detail in Sect. 3.2; for the moment we give the simplest example of a k-soliton. Let η have only k balls occupying k successive boxes; then the (only) k-soliton of η consists of the k successive occupied boxes $h_1, \dots, h_k$ and the k successive empty boxes $t_1, \dots, t_k \in \mathbb{Z}$ given by $t_j = h_j + k$. For the η just described, the configuration $T\eta$ has balls in boxes $t_1, \dots, t_k$ and no balls in the other boxes, implying that $\eta' = T\eta$ has a k-soliton consisting of occupied boxes $h'_1, \dots, h'_k$ with $h'_j = t_j$ and empty boxes $t'_j = t_j + k$. Hence, in one step, an isolated k-soliton preserves its shape and moves k steps forward. Iterating the evolution t times, we conclude that, if there are no other balls in the system, a k-soliton moves kt boxes forward, that is, it travels at speed k. Since for different k's the solitons have different speeds, they "collide", that is, the order and the positions of $h_i$ and $t_i$ of each soliton change. In fact, solitons can be identified for configurations with a finite number of balls [17] and for suitable configurations η with infinitely many balls [7]; moreover, solitons are conserved by the dynamics [7, 17]. We explain this in detail in Sect. 3. Given a suitable configuration η, a k-soliton always consists of k occupied boxes and k empty boxes, but they are not necessarily consecutive and the empty boxes of the soliton may precede the occupied ones. In any case, different solitons occupy disjoint sets of boxes. The trajectory of each soliton can be identified along time [7]. When the distribution of the initial ball configuration is translation invariant and invariant for the dynamics, the asymptotic soliton speeds satisfy a system of linear equations [7], which is a feature of several other integrable systems [1].
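The claim that an isolated k-soliton travels at speed k can be checked with a few lines of code. The sketch below implements the carrier rule for configurations with finitely many balls, represented as a set of occupied box indices; the function name `bbs_step` and this representation are illustrative choices, not notation from the paper.

```python
def bbs_step(occupied):
    """One application of T: the carrier sweeps from left to right over a finite configuration."""
    occupied = set(occupied)
    if not occupied:
        return set()
    new_configuration = set()
    load = 0
    i = min(occupied)
    last = max(occupied)
    # Keep sweeping until all balls are passed and the carrier has been emptied.
    while i <= last or load > 0:
        if i in occupied:
            load += 1                  # pick up the ball; the box becomes empty
        elif load > 0:
            load -= 1                  # drop a ball into the empty box
            new_configuration.add(i)
        i += 1
    return new_configuration


if __name__ == "__main__":
    k, steps = 3, 4
    configuration = set(range(k))      # k balls in boxes 0, ..., k - 1
    for _ in range(steps):
        configuration = bbs_step(configuration)
    print(sorted(configuration))       # the block has moved k * steps boxes to the right
```

Starting from two blocks of different sizes, for example `{0, 1, 2, 5}`, the same function shows a collision: the number of balls is conserved while the positions are rearranged.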
A ball configuration can be mapped to a walk indexed by boxes that jumps one unit up at occupied boxes and one unit down at empty boxes. If the configuration has density less than 1/2, then the walk has down records and finite excursions consisting of the pieces of configuration between two consecutive records. A ball configuration can be codified as a set of infinite vectors, based on the concept of slots [7]. Given a ball configuration with identified solitons, there are boxes called k-slots satisfying that any k-soliton is strictly in between two successive k-slots; in this case we say that the k-soliton is attached to the left k-slot. To be more precise, the set of k-slots of η consists of the records and the boxes belonging to any bigger soliton of the form $h_j$ or $t_j$ for any j > k. Taking a ball configuration with a record at the origin, the k-slots are enumerated and then for each integer i, the i-th coordinate of the k-component of the configuration is the number of k-solitons attached to the i-th k-slot. The obtained components can be composed again to recover the initial configuration η. A large part of the paper is dedicated to a complete explanation of these constructions. It is useful to perform the decomposition in each excursion to obtain what [6] call a slot diagram, a combinatorial object that encodes the structure of the excursion. We also discuss the relationship between the soliton decomposition of an excursion and other combinatorial objects such as the excursion tree [4, 5, 10, 12, 13], Catalan numbers [5] and Dyck and Motzkin paths [5, 14]. A notable property proven by [7] is that the k-component of the configuration $T\eta$ is a shift of the k-component of η, the amount shifted depending on the m-components for m > k. As a consequence, [7] prove that measures with independent and translation invariant soliton components are invariant for the dynamics. In [6] a special class of these measures is studied in detail.
The papers [2, 3] show families of invariant measures for the BBS based on reversible Markov chains on {0, 1}. The Box-Ball system is closely related to several remarkable combinatorial constructions (see for example [8, 9, 14, 15, 19, 20]); we illustrate some of them. Sometimes, instead of giving formal proofs and detailed descriptions, we adopt a more informal point of view and illustrate the different constructions through explanatory examples. The paper is organized as follows. In Sect. 2 we fix the notation and review several different equivalent definitions of the dynamics. We start by considering the simple case of a finite number of balls. Then, following [7], we introduce the walk representation and give a definition of the dynamics in the general case for configurations of balls whose walk representation can be cut into infinitely many finite excursions. In Sect. 3 we discuss some conserved quantities of the dynamics, the identification of the solitons, and a codification of the conserved quantities in terms of Young diagrams, and define the slot diagrams from [6]. In Sect. 4 we recall the construction of the excursion tree and propose a new soliton decomposition of the excursion based on the tree. We introduce a branch decomposition of the tree and conclude that its slot diagram coincides with the one discussed in Sect. 3. The contents of this section are new.
In Sect. 5 we review results from [6] related to the distribution of the soliton decomposition of excursions. In particular, the soliton decomposition of a simple random walk consists of independent doubly infinite vectors of iid geometric random variables. We also discuss an application to branching processes.
2 Preliminaries and Notation

2.1 Box-Ball System

The Box-Ball System (BBS) [17] is a discrete-time cellular automaton. We start by considering a finite number of balls evolving on the infinite lattice $\mathbb{Z}$. The elements of $\mathbb{Z}$ are called boxes. A configuration of balls is codified by $\eta \in \{0, 1\}^{\mathbb{Z}}$, that is, by a doubly infinite sequence of 1s and 0s, corresponding respectively to the boxes occupied by balls and the empty boxes. Pictorially a ball will be denoted by • and an empty box by ◦. There are several equivalent ways of defining the evolution. We denote by $T : \{0, 1\}^{\mathbb{Z}} \to \{0, 1\}^{\mathbb{Z}}$ the operator defining the evolution in one single step. This means that the configuration η evolves in a single step into the configuration $T\eta$. In the following definitions we consider configurations having only a finite number of 1s. The equivalence among all the definitions is simple. The different definitions are however related to different classic combinatorial constructions and illustrate the evolution from different perspectives.

First Definition We define the dynamics through a pairing between the balls and some empty boxes. Consider a ball configuration η containing only a finite number of balls. The evolution is defined iteratively. At the first step we consider the balls that have an empty box in the nearest neighbor lattice site to the right, that is, local configurations of the type •◦, and we pair the two boxes by drawing a line. Remove all the pairs created and continue following the same rule with the configuration obtained after the deletion of the paired boxes. This procedure will stop after a finite number of iterations because there are only a finite number of balls. See Fig. 2, where we assume that there are no balls outside the window and the lines connect balls with the corresponding paired empty boxes.

Fig. 2 A finite configuration of balls with the corresponding pairing lines of the first definition

The evolved configuration of balls, denoted $T\eta$, is obtained by transporting every ball along the lines to the
corresponding paired empty box. Note that the lines pairing balls and empty boxes can be drawn without intersections in the upper half plane.

Second Definition [17] This is the original definition of the model. Consider an empty carrier that starts to the left of the leftmost ball and visits the boxes one after another moving from left to right. The carrier can transport an arbitrarily large number of balls. When visiting box i, the carrier picks up the ball if η(i) = 1; the number of balls transported by the carrier then increases by one and site i is updated to be empty: $T\eta(i) = 0$. If instead η(i) = 0 and the carrier contains at least one ball, then it deposits one ball in the box, giving $T\eta(i) = 1$. After visiting a finite number of boxes the carrier will be empty forever and will not change the configuration any more, see Fig. 1. The final configuration $T\eta$ is the same as the one obtained by the previous construction.

Third Definition Dyck words (after Walther von Dyck). Substitute any ball with an open parenthesis and any empty box with a closed one. The sequence of Fig. 1 becomes for example ((()()(()())()()())(())) and outside this window there are only closed parentheses. According to the usual algebraic rules we can pair any open parenthesis to the corresponding closed one. Recalling that open parentheses correspond to balls, we move each ball from the position of the open parenthesis to the position of the corresponding closed one.

Fourth Definition As a first step we duplicate each ball. After this operation each occupied box contains exactly 2 balls: the original one and its clone. We select an arbitrary occupied box and move the cloned ball to the first empty box to the right. Then we select again arbitrarily another box containing two balls and do the same. We continue according to an arbitrary order until there are no more boxes containing more than one ball.
At this point we remove the original balls and keep just the cloned ones. The configuration of balls that we obtain does not depend on the arbitrary order that we followed and coincides with $T\eta$.

Fifth Definition Start from the leftmost ball and move it to the nearest empty box to its right. Then do the same with the second leftmost ball (according to the original order). Proceed in this way until every ball has been moved once. This is a particular case of the fourth definition. It corresponds to moving the balls according to the order given by their initial positions.

Our viewpoint will be to consider all the balls indistinguishable, and from this perspective all the above definitions are equivalent. If we are instead interested in the motion of a tagged ball then we can have different evolutions according to the different definitions given above. The construction can be naturally generalized to a class of configurations with infinitely many balls or to configurations of balls on a ring. This can be done under suitable assumptions on the configuration η [3, 7]. We will briefly discuss this issue following the approach of [7], but to do this we need some notation and definitions.
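The equivalence of the definitions for finitely many indistinguishable balls can be checked on the Dyck word quoted in the third definition. The sketch below implements the carrier rule (second definition), the parenthesis pairing (third definition) and the left-to-right moves (fifth definition) on sets of occupied box indices; the function names and the set representation are assumptions made for this illustration.

```python
def step_carrier(occupied):
    """Second definition: a carrier sweeps from left to right."""
    occupied = set(occupied)
    result, load = set(), 0
    i, last = min(occupied), max(occupied)
    while i <= last or load > 0:
        if i in occupied:
            load += 1
        elif load > 0:
            load -= 1
            result.add(i)
        i += 1
    return result


def step_parentheses(occupied):
    """Third definition: match each '(' (ball) with its ')' (empty box)."""
    occupied = set(occupied)
    pairing, open_positions = {}, []
    i, last = min(occupied), max(occupied)
    while i <= last or open_positions:
        if i in occupied:
            open_positions.append(i)             # open parenthesis
        elif open_positions:
            pairing[open_positions.pop()] = i    # matching closed parenthesis
        i += 1
    return set(pairing.values())


def step_leftmost_first(occupied):
    """Fifth definition: move balls one by one, in their original left-to-right order."""
    original = set(occupied)
    moved = set()
    for ball in sorted(original):
        j = ball + 1
        # A box is free if it holds neither an original ball nor an already moved one.
        while j in original or j in moved:
            j += 1
        moved.add(j)
    return moved


if __name__ == "__main__":
    word = "((()()(()())()()())(()))"
    balls = {i for i, symbol in enumerate(word) if symbol == "("}
    results = [f(balls) for f in (step_carrier, step_parentheses, step_leftmost_first)]
    print(all(r == results[0] for r in results), sorted(results[0]))
```

Since the word is balanced, every closed parenthesis inside the window is matched, so the new configuration is exactly the set of positions of the closed parentheses.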
2.2 Walk Representation and Excursions

A function $\xi : \mathbb{Z} \to \mathbb{Z}$ satisfying $|\xi(i) - \xi(i - 1)| = 1$ is called a walk. We map a ball configuration η to a walk $\xi = W\eta$ defined up to a global additive constant by
\[
\xi(i) - \xi(i - 1) = 2\eta(i) - 1. \tag{1}
\]
The constant is fixed for example by choosing ξ(0) = 0. Essentially the map between ball configurations and walks is fixed by the correspondence • ←→ / and ◦ ←→ \ , where • represents a ball, ◦ an empty box, and / (up step), \ (down step) are pieces of walk to be glued together continuously. The map W is invertible (when the additive constant is fixed) and the configuration of balls η = W^{-1}ξ can be recovered using (1). We remark that there are several walks that are projected to the same configuration of balls, and all of them differ by a global additive constant. This means that W is a bijection only if the arbitrary additive constant is fixed, and this will always be done in such a way that ξ(0) = 0.

We call i ∈ Z a (minimum) record for the walk ξ if ξ(i) < ξ(i′) for any i′ < i. The hitting time of −j for the walk ξ is a record denoted r(j, ξ). We call excursion of a walk the piece of trajectory between two successive records. A pictorial perspective on the decomposition of the walk into records and disjoint excursions is the following. Think of the walk as a physical profile and imagine that the sun is rising on the left, so that the light comes horizontally from the left. The illuminated parts of the profile correspond to the records, while the disjoint parts in the shadow are the different excursions.

We call a finite walk a finite trajectory of a random walk. More precisely a finite walk ξ = (ξ(i))_{i∈[0,k]}, k ∈ N, is an element of Z^{[0,k]} such that |ξ(i) − ξ(i − 1)| = 1. Again we always fix ξ(0) = 0 and, as before, there is a bijection W between finite walks and finite configurations of balls, i.e. elements η ∈ {0, 1}^k for some k ∈ N. We use the same notation ξ for finite and infinite walks and η for finite and infinite configurations of balls. It will be clear from the context when the walk/configuration is finite or infinite.

We introduce the set E of finite soft excursions between records 0 and 1. An element ε ∈ E is a finite walk that starts and ends at zero, is always nonnegative and has length 2n(ε). More precisely ε = (ε(0), . . . , ε(2n(ε))) with the constraints |ε(i) − ε(i − 1)| = 1, ε(i) ≥ 0 and ε(0) = ε(2n(ε)) = 0. The empty excursion ∅ is also an element of E with n(∅) = 0. We call E_n the set of soft finite excursions of length 2n, so that E = ∪_{n=0}^{+∞} E_n. Using the same correspondence as before between walks and configurations of balls we can associate a finite configuration of balls (η(1), . . . , η(2n(ε))) = W^{-1}ε
to the finite excursion ε. If η = W^{-1}ε, then we have ∑_{i=1}^{2n(ε)} (2η(i) − 1) = 0, but obviously not every configuration of balls satisfying this constraint generates a soft excursion under the transformation W. It is well known [16] that the number of excursions of length 2n is given by

\[
  |E_n| = \frac{1}{n+1}\binom{2n}{n}; \tag{2}
\]
the right hand side is the Catalan number C_n. We denote by E^o ⊂ E the set of strict excursions. An element ε ∈ E^o is an excursion that satisfies the strict inequality ε(i) > 0 when i ≠ 0, 2n(ε). Likewise we call E_n^o the set of strict excursions of length 2n. There is a simple bijection between E_n and E^o_{n+1}. It is obtained by taking an element ε ∈ E_n and adding a / at the beginning and a \ at the end. The result is an element of E^o_{n+1}. The converse map is obtained by removing the / at the beginning and the \ at the end of an element of E^o_{n+1}, obtaining an element of E_n. This can be easily shown to be a bijection. In particular we deduce from (2) that |E_n^o| = \frac{1}{n}\binom{2(n-1)}{n-1}.

Concatenating Excursions Given a finite soft excursion ε we call ε̃ the finite walk (ε̃(i))_{i=0}^{2n(ε)+1} such that ε̃(i) = ε(i) when 0 ≤ i ≤ 2n(ε) and ε̃(2n(ε) + 1) = −1. This corresponds essentially to adding a \ at the end of the soft excursion. Given two such finite walks ε̃_0 and ε̃_1 we introduce their concatenation ε̃_0 ε̃_1. This is a finite walk such that ε̃_0 ε̃_1(i) = ε̃_0(i) when 0 ≤ i ≤ 2n(ε_0) + 1 and ε̃_0 ε̃_1(i) = ε̃_1(i − 2n(ε_0) − 1) − 1 if 2n(ε_0) + 1 < i ≤ 2(n(ε_0) + n(ε_1)) + 2. Essentially this operation corresponds to gluing the graphs of the walks one after the other continuously. Iterating this operation we can similarly define the concatenation of a finite number of finite walks ε̃_0 ε̃_1 · · · ε̃_k. Likewise we consider an infinite walk (ε̃_i)_{i∈Z} obtained by a doubly infinite concatenation of finite walks. Informally this is obtained by concatenating the graphs continuously as before, with the condition that (ε̃_i)_{i∈Z}(j) = ε̃_0(j) for 0 ≤ j ≤ 2n(ε_0) + 1. Formally the walk ξ is defined in terms of a family of excursions (ε_j)_{j∈Z} as follows. First fix the position of the records of the walk ξ iteratively by r(0, ξ) := 0, r(k + 1, ξ) − r(k, ξ) := 2n(ε_k) + 1 for k ∈ Z, so that the number of boxes between records k and k + 1 is the size of excursion k.
Now complete the definition by inserting excursion k between those records:

\[
  \xi(r(k,\xi) + i) = -k + \varepsilon_k(i), \qquad i \in \{0, \dots, 2n(\varepsilon_k)\}, \quad k \in \mathbb{Z}. \tag{3}
\]
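The walk map (1), the records and the concatenation rule (3) can be tried out on a finite piece of configuration. The sketch below is only illustrative: the function names are hypothetical, it assumes that all boxes to the left of box 1 are empty (so that position 0 is record 0), and it represents the empty excursion by the single height [0].

```python
def walk(eta):
    """Heights xi(0), ..., xi(len(eta)) with xi(0) = 0 and increments 2*eta(i) - 1."""
    xi = [0]
    for content in eta:
        xi.append(xi[-1] + 2 * content - 1)
    return xi


def record_positions(xi):
    """Positions where the walk reaches a new strict minimum; position 0 is record 0."""
    recs, lowest = [0], xi[0]
    for i in range(1, len(xi)):
        if xi[i] < lowest:
            lowest = xi[i]
            recs.append(i)
    return recs


def excursions(xi):
    """Excursion k is the walk between records k and k+1, lifted by k."""
    recs = record_positions(xi)
    return [[xi[r + i] + k for i in range(recs[k + 1] - r)]
            for k, r in enumerate(recs[:-1])]


def concatenate(excs):
    """Rebuild the walk from excursions 0, 1, ... following (3)."""
    xi = []
    for k, eps in enumerate(excs):
        xi.extend(-k + height for height in eps)
    xi.append(-len(excs))         # the record that closes the last excursion
    return xi


eta = [0, 1, 1, 0, 0, 0, 0, 1, 0, 0]
xi = walk(eta)
print(record_positions(xi))       # positions of records 0, 1, 2, ...
print(excursions(xi))
print(concatenate(excursions(xi)) == xi)
```

The round trip only recovers the walk up to its last record, so the example configuration was chosen to end at a record.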
The resulting walk ξ attains the level −k for the first time at position r(k, ξ). In particular, ξ has infinitely many records, one for each element of Z. When this happens we say for short that the walk has all the records. Clearly a similar concatenation
Fig. 3 Up: walk representation of the ball configuration of Fig. 4 below, with records represented by a black dot and labeled from −1 to 6. Down: excursion decomposition of the walk into ε_{−1} (empty), ε_0, ε_1 (empty), ε_2, ε_3 (empty), ε_4 (empty), ε_5
procedure can be performed for any collection of finite walks and not just for excursions. We do not give the straightforward details. Conversely, if we have a walk ξ with all the records and such that record 0 is at 0 and record k is at r(k, ξ), then for each k ∈ Z we can define the excursion ε_k = ε_k[ξ] by

\[
  \varepsilon_k(i) = \xi(r(k,\xi) + i) - (-k), \qquad i \in \{0, 1, \dots, r(k+1,\xi) - r(k,\xi) - 1\} \tag{4}
\]

(see Fig. 3); then (ε̃_i)_{i∈Z} coincides with the original walk ξ. We proved therefore that an infinite walk is obtained by an infinite concatenation of finite soft excursions separated by a \ if and only if it has all the records. The set of configurations with density a is defined by
\[
  X_a := \Big\{ \eta \in \{0,1\}^{\mathbb{Z}} : \lim_{n\to\pm\infty} \frac{\sum_{j=0}^{n} \eta(j)}{n+1} = a \Big\}, \qquad a \in [0,1], \tag{5}
\]
and call X := ∪a […] t_j(γ) for any i, j and that the walk has the same height at h_i(γ) and t_i(γ): ξ(h_i(γ)) = ξ(t_i(γ)); for this reason we say that the head and tail of a soliton are paired. We have the following key definition.

Definition 1 We say that a box i is a k-slot if either i is a record or i belongs to {t_ℓ(γ), h_ℓ(γ)} for some ℓ > k for some m-soliton γ with m > k.

The set of k-slots contains the set of m-slots for all m > k. We illustrate in Fig. 8 the slots induced by the soliton decomposition of the excursion in Fig. 7, obtaining one 4-slot (at the record), 5 3-slots, 11 2-slots and 21 1-slots. According to Definition 1, the number of k-slots is s_k := 1 + ∑_{ℓ>k} 2(ℓ − k) n_ℓ, where n_ℓ is the number of ℓ-solitons in the excursion. The 1 in the above formula corresponds to the record on the left, which is a slot of any order. For each k we enumerate the k-slots in the excursion starting with 0 for the k-slot in the record preceding the excursion. We say that a k-soliton γ is attached to the k-slot number i if the boxes occupied by γ are contained in the segment with extremes the ith and (i + 1)th k-slots in the excursion. We define

\[
  x_k(i) := \#\{k\text{-solitons attached to } k\text{-slot number } i\}. \tag{8}
\]
3.3 Slot Diagrams

We define a combinatorial family of objects called slot diagrams that according to [7] is in bijection with E, see also Sect. 4 below.
A generic slot diagram is denoted by x = (x_k : 1 ≤ k ≤ M), where M = M(x) is a non-negative integer, x_k = (x_k(0), . . . , x_k(s_k − 1)) ∈ N^{s_k} and s_k is a non-negative integer. We say that x_k(j) is the number of k-solitons attached to the k-slot number j. We denote by n_k := ∑_{i=0}^{s_k−1} x_k(i) the number of k-solitons in x. A precise definition is the following. We say that x is a slot diagram if
• There exists a non-negative integer M = M(x) such that s_M = 1 and x_m(0) = 0 for m > M. Hence, we can ignore soliton sizes above M and denote x = (x_k : 1 ≤ k ≤ M).
• For any k, the number of k-slots s_k is determined by (x_ℓ : ℓ > k) via the formula

\[
  s_k = s_k(x) = 1 + 2 \sum_{i=k+1}^{M} (i-k)\, n_i. \tag{9}
\]
Consider now the soliton decomposition of an excursion and the corresponding vectors defined by (8), and define M = min{k ≥ 0 : x_{k′}(0) = 0 for all k′ > k}. Then, the family of vectors (x_k : k ≤ M) forms a slot diagram; if M = 0 the slot diagram is empty and corresponds to an empty excursion. The slot diagram of the excursion in Fig. 8 is as follows. We have M = 4 and

x_4 = (2), s_4 = 1;
x_3 = (0, 1, 0, 0, 0), s_3 = 5;
x_2 = (0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0), s_2 = 11;
x_1 = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0), s_1 = 21.
For example the vector x_3 has just x_3(1) = 1 ≠ 0, since there is just one 3-soliton and its support (the boxes corresponding to the green part of the walk in Fig. 8) is contained between the 3-slots number 1 and number 2. Recall that the k-slot located at the record to the left of the excursion is numbered 0 for all k, and hence the 3-slot number 1 is the second green box from the left in Fig. 8.
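The consistency condition (9) is easy to check mechanically. The following sketch (with hypothetical names `slot_counts` and `is_slot_diagram`, and a slot diagram stored as a dictionary from k to the vector x_k, which is an encoding chosen only for this example) recomputes the numbers s_k for the diagram of Fig. 8 listed above.

```python
def slot_counts(x):
    """Number of k-slots s_k given by formula (9), for k = 1, ..., M.

    x maps each soliton size k in 1..M to the vector x_k (hypothetical encoding).
    """
    M = max(x, default=0)
    n = {k: sum(x[k]) for k in x}                 # n_k = number of k-solitons
    return {k: 1 + 2 * sum((i - k) * n[i] for i in range(k + 1, M + 1))
            for k in range(1, M + 1)}


def is_slot_diagram(x):
    """Check that every vector x_k has exactly s_k entries."""
    s = slot_counts(x)
    return all(len(x[k]) == s[k] for k in s)


# Slot diagram of the excursion in Fig. 8, as listed in the text.
fig8 = {
    4: (2,),
    3: (0, 1, 0, 0, 0),
    2: (0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0),
    1: (0,) * 17 + (1, 0, 1, 0),
}
print(slot_counts(fig8))
print(is_slot_diagram(fig8))
```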
3.4 Head-Tail Soliton Decomposition

We propose another decomposition, called HT soliton decomposition.
(0) Start with a ball configuration η with a single excursion.
(1) If there is just one single (infinite) run then stop, otherwise go to the next step.
(2) Search for the leftmost among the smallest runs. If the run contains 1's, then pair the boxes belonging to the run with the first boxes (with zeroes) belonging to the nearest neighbor run to its right. If the run contains 0's, then pair the boxes with the nearest boxes (with ones) to the left of the run. The set of paired boxes and their contents identifies a soliton γ.
(3) Ignore the boxes of the identified solitons, update the runs gluing together the remaining boxes and go to step 1.

The HT soliton decomposition of the excursion in Fig. 6 is given in Fig. 9. The name of the decomposition comes from the fact that the head of each soliton is to the left of its tail in all cases. We will call HT solitons the solitons identified by the HT decomposition, and mark the corresponding objects with HT. We will see that this decomposition arises naturally in terms of a tree associated to the excursion. We say that a box i is an HT k-slot if either i is a record or i ∈ {h_ℓ(γ^HT), t_{m−ℓ+1}(γ^HT)} for some ℓ ∈ {1, . . . , m − k} for some HT m-soliton γ^HT, for some m > k; for example, if γ^HT is an HT 4-soliton, h_1(γ^HT) and t_4(γ^HT) are HT 3-slots. See the upper part of Fig. 10.
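Steps (0) to (3) can be sketched in a few lines of Python for a single finite excursion. This is only an illustration under stated assumptions: the function name `ht_decomposition` is hypothetical, the input is the content of the boxes of one excursion, and the last run of empty boxes is treated as part of the infinite run of empty boxes to the right (so it is never selected as the smallest run).

```python
def ht_decomposition(eta):
    """HT solitons of one finite excursion, as (head, tail) lists of box indices.

    eta is a list of 0/1 whose walk is an excursion. Heads are occupied boxes,
    tails are empty boxes, and each head lies to the left of its tail.
    """
    remaining = list(range(len(eta)))     # boxes not yet assigned to a soliton
    solitons = []
    while remaining:
        # Runs of the current (glued) configuration: (start, size, content).
        runs, start = [], 0
        for j in range(1, len(remaining) + 1):
            if j == len(remaining) or eta[remaining[j]] != eta[remaining[start]]:
                runs.append((start, j - start, eta[remaining[start]]))
                start = j
        # The last run (zeros) merges with the infinite empty run: skip it.
        start, size, content = min(runs[:-1], key=lambda run: run[1])
        # A run of 1's pairs with the zeros on its right,
        # a run of 0's pairs with the ones on its left.
        lo = start if content == 1 else start - size
        head = remaining[lo:lo + size]
        tail = remaining[lo + size:lo + 2 * size]
        solitons.append((head, tail))
        del remaining[lo:lo + 2 * size]   # ignore these boxes and glue the rest
    return solitons


fig1 = [1 if c == "(" else 0 for c in "((()()(()())()()())(()))"]
for head, tail in ht_decomposition(fig1):
    print(len(head), head, tail)
```

Since `min` returns the first minimal element, ties between smallest runs are resolved in favour of the leftmost one, as required in step (2).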
Fig. 9 The HT soliton decomposition of the excursion in Fig. 6
Fig. 10 Comparing the HT decomposition (above) with the Takahashi-Satsuma decomposition (below). Colored squares indicate that the corresponding box is an HT k-slot (or a k-slot) with 1 = violet, 2 = red, 3 = green and 4 = blue. The number of k-slots belonging to an ℓ-soliton is 2(ℓ − k) in both cases, but the localizations inside the solitons are different. The leftmost box corresponds to the record the excursion is associated with. The slot diagrams of both decompositions coincide; for instance there is a 3-soliton attached to 3-slot number 1 in both pictures
Observe that, as before, the set of HT k-slots is contained in the set of HT ℓ-slots for any ℓ < k. As before we say that an HT k-soliton is attached to HT k-slot number i if the boxes of the HT soliton are strictly between HT k-slots i and i + 1. If we enumerate the HT k-slots of the excursion starting with 0 for the HT k-slot at record 0, we can again define x^HT_k(i) := number of HT k-solitons attached to HT k-slot number i. This produces a slot diagram x^HT associated to the excursion. We denote by x^HT[ε] the slot diagram produced by the HT soliton decomposition of the excursion ε. See Fig. 10 for a comparison of the slots induced by the HT soliton decomposition and the TS soliton decomposition. The next result says that the slot diagrams produced by both decompositions are identical. Observe that a slot diagram gives information about the number of solitons and about their combinatorial arrangement, so that it codifies completely the corresponding excursion.

Theorem 2 The slot diagram of the Head-Tail decomposition of an excursion ε ∈ E coincides with the slot diagram of the Takahashi-Satsuma decomposition of ε. That is, x[ε] = x^HT[ε].

Proof Let ε be an excursion and denote by x[ε] and x^HT[ε] the TS and HT slot diagrams of ε, respectively; let m and m^HT be the maximal soliton size in each representation. Let s_k(i) and s^HT_k(i) be the position of the i-th k-slot in the TS and HT decompositions of ε, respectively. Assume ε has neither ℓ-solitons nor HT ℓ-solitons for all ℓ ≤ k. Then s_k(0) = s^HT_k(0) = 0 and, for 0 < i < s_k, we will show that

\[
  s_k(i) =
  \begin{cases}
    s^{HT}_k(i) + k, & \text{if } s_k(i) \text{ belongs to the head of an HT soliton;} \\
    s^{HT}_k(i), & \text{if } s_k(i) \text{ belongs to the tail of an HT soliton,}
  \end{cases} \tag{10}
\]

which implies the theorem. We prove (10) by induction. If ε has only m-solitons, then (10) holds for any k < m by definition. Assume (10) holds if ε is an excursion with no ℓ-solitons for ℓ ≤ k. Now attach an HT k-soliton γ^HT to s^HT_k(i) and a k-soliton γ to s_k(i). We have 2 cases:
(1) s_k(i) is the record or belongs to the tail of an m-soliton α with m bigger than k. In this case also s^HT_k(i) belongs to the record or to the tail of an HT m-soliton α^HT, and γ is attached to the same place as γ^HT, hence it does not affect the distances between ℓ-slots and HT ℓ-slots in the excursion (indeed, they coincide in the record and in the tail of α and α^HT) for ℓ ≤ k. On the other hand, the ℓ-slots carried by γ and the HT ℓ-slots carried by γ^HT satisfy (10).
(2) s_k(i) is in the head of α. In this case necessarily s^HT_k(i) is in the head of α^HT by inductive hypothesis and s_k(i) = s^HT_k(i) + k. We consider 2 cases now:
(2a) k-slots. The attachments of γ to s_k(i) and γ^HT to s^HT_k(i) do not change the distance between k-slots and HT k-slots, because either s_k(j) < s_k(i) and s^HT_k(j) < s^HT_k(i), and in this case the insertions do not change their positions, or otherwise both slots are translated by 2k, the number of boxes occupied by the k-solitons. We conclude that (10) is satisfied by k-slots and HT k-slots after the attachments.
Fig. 11 Above: attach an HT 3-soliton γ^HT in HT 3-slot number 1. Below: attach a 3-soliton γ in 3-slot number 1. The excursions after the attachments coincide, and the HT slot diagram after the attachments coincides with the slot diagram and is given by m = 4, x_4 = (1), x_3 = (0, 1, 0), x_2 is a vector with 7 zeroes and x_1 a vector with 11 zeroes. The soliton decompositions of the excursion obtained after the attachments satisfy (10)
(2b) ℓ-slots for ℓ < k. Take an ℓ < k and an ℓ-slot s_ℓ(j) in the head of α. If s^HT_ℓ(j) < s^HT_k(i) and s_ℓ(j) < s_k(i), neither will be displaced, so (10) is satisfied for ℓ-slots to the left of s_k(i). On the other hand, if s^HT_ℓ(j) > s^HT_k(i), then s^HT_ℓ(j) keeps its place after the attachment of γ^HT and s_ℓ(j) is to the left of the attachment, hence they satisfy (10) after the attachments (this is the case of the 4th violet 1-slot and HT 1-slot). We have proved that if the slot and HT slot diagrams of an excursion with no ℓ-solitons for ℓ ≤ k coincide, then they coincide after attaching k-solitons and HT k-solitons. See Fig. 11 for an illustrative example.
3.5 Attaching Solitons

In the previous subsection we discussed the decomposition of a configuration into elementary solitons/HT solitons and how to codify each single excursion using a slot diagram that takes care of the combinatorial arrangement of the solitons/HT solitons into the available slots/HT slots. In this section we discuss the reverse construction. Given a slot diagram we illustrate how to construct the corresponding excursion. The procedure is particularly simple and natural in the case of the HT decomposition.
Fig. 12 The epigraph of an excursion divided into different colored regions obtained by drawing horizontal lines from the leftmost point on the graph of the excursion associated to a given HT soliton to the rightmost one in the HT decomposition

Fig. 13 The walk of a ball configuration where the down-steps associated to records have been substituted by horizontal lines at height zero. The region below the graph of each excursion has been colored like in Fig. 12
The basic idea is illustrated in Fig. 12, which was obtained from Fig. 9 by drawing horizontal lines from the leftmost point in the graph of the excursion associated to the head to the rightmost point associated to the tail of each HT soliton. These lines cut the epigraph of the excursion into disjoint regions that we color with the corresponding color of the boundary. We imagine each colored region as a physical two-dimensional object glued recursively to generate the interface. Indeed we will show that the excursion can be obtained as the final boundary of a region built by a Tetris-like construction, adding one after the other upward-oriented triangles having elastic diagonal sides.

It is convenient to represent the walk associated to a ball configuration in X as follows. We transform each down-oriented step associated to a record into a horizontal line at height 0. The parts of the walk associated to the excursions are vertically shifted to level 0, remaining concatenated one after the other by a horizontal line of length equal to the number of records separating the excursions in the walk. The walk is therefore represented by infinitely many pieces of horizontal lines at the zero level (the sea level) separated by infinitely many finite excursions (mountain profiles). This is the construction associated to the Harris walk (see for example [14]). See Fig. 13 for an example with three excursions where we also implemented the same coloring of Fig. 12.

We discuss how to generate one single excursion from a slot diagram using the HT decomposition. We represent an isolated HT k-soliton as a right-angled isosceles triangle having hypotenuse of size 2k. The triangle is oriented in such a way that the hypotenuse is horizontal and the triangle points upward, see Fig. 14. The basic mechanism of attaching HT solitons is illustrated in Fig. 15. In the first (upper left) drawing we represent an HT 4-soliton as an upward-oriented triangle and draw below it the corresponding HT slots. The leftmost HT slot corresponds to a record located just on the left of the excursion. Colors are like before: violet = 1, red = 2,
Fig. 14 An isolated HT k-soliton represented as a right-angled triangle with horizontal hypotenuse of size 2k, pointing upward and having the other two sides of equal length. The hypotenuse is rigid while the other sides are soft and deformable
Fig. 15 An isolated HT 4-soliton (blue, upper left diagram) with the corresponding HT k-slots for k = 1, 2, 3, 4 (colors as in the previous section). The record to the left of the excursion is the unique HT 4-slot. We attach one HT 1-soliton (violet) in all the possible ways. In drawing number i (i = 0, . . . , 6) the HT 1-soliton is attached to the HT 1-slot number i
green = 3, blue = 4. In the drawing number i with i = 0, . . . , 6 we attach one HT 1-soliton to the HT 1-slot number i. This corresponds to attaching a triangle with horizontal hypotenuse of size 2 at the position of the corresponding HT slot. The figure is exhaustive and represents all the possible ways of attaching the HT 1-soliton. The precise rules and the change of the positions of the HT slots during the
attaching procedure to generate an excursion are illustrated using as an example the following slot diagram

k → x_k
4 → (1)
3 → (0, 0, 0)
2 → (0, 1, 0, 1, 0)
1 → (0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0)          (11)

We construct now the excursion that corresponds to this slot diagram. We do this using the HT decomposition since it is simpler, but the TS decomposition gives as a result the same excursion. First we observe that the maximal HT soliton size in (11) is 4 and there is just one maximal HT soliton. We start therefore with drawing 1 of Fig. 16, where we have a blue HT 4-soliton represented by an upward-oriented triangle. Below it we represent also the HT ℓ-slots for ℓ < 4; the leftmost HT ℓ-slot is always located in the
Fig. 16 The growing of the excursion codified by the slot diagram (11) in drawings 1 to 4, adding HT solitons one after the other from the biggest to the smallest. Adding a new HT soliton corresponds to adding an upward-oriented triangle with horizontal hypotenuse at the HT slot specified by the diagram. The diagonal sides of the triangles already present are soft and deform in order to glue perfectly the geometric figures
record just on the left of the excursion. Since there are no HT 3-solitons, we do not have to add green triangles having hypotenuse of size 6. We proceed therefore attaching HT 2-solitons, represented as upward-oriented triangles with hypotenuse of size 4. We have two of them and we have to attach them to the HT 2-slots number 1 and 3. We label as HT ℓ-slot number zero the one associated to the record and number the other ones increasingly from left to right. There are five HT 2-slots in drawing 1 of Fig. 16 (they are the piles of colored squares containing a red one). We start attaching the HT 2-soliton to the HT 2-slot number 1. This means that the left corner of the red triangle has to be attached to the boundary of the colored region at the intersection of the boundary with the dashed line just on the right of HT 2-slot number one. Since the bottom edge of the triangles is rigid, the blue diagonal side deforms in order to have a perfect gluing. This is illustrated in drawing number 2 of Fig. 16. Note that the HT slots lying on the shifted diagonal sides of the blue triangle are shifted accordingly. There are moreover new HT 1-slots created on some red diagonal sides. The same gluing procedure is done with a second red triangle at the HT 2-slot number 3, and this is shown in drawing number 3 of Fig. 16. Note that we perform these two gluing operations one after the other to better illustrate the rules, but they can be done simultaneously or in the reversed order: the final result is the same. This is because attaching an HT k-soliton generates only new HT j-slots with j < k. Finally we have to attach an HT 1-soliton, that is a violet triangle, in the HT 1-slot number 3, and this is shown in the final drawing 4 of Fig. 16.
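The attaching procedure can be mimicked on ball configurations instead of triangles. The sketch below rests on two assumptions that are a reading of the text rather than a statement from it: in an HT m-soliton the boxes, from left to right, are HT k-slots for k up to 3, 2, ..., 0, 0, ..., 2, 3 (written here for m = 4, following the definition of HT k-slot in Sect. 3.4), and attaching an HT k-soliton to a slot means inserting k balls followed by k empty boxes immediately to the right of that slot. The name `excursion_from_slot_diagram` and the dictionary encoding of the diagram are hypothetical.

```python
def excursion_from_slot_diagram(x):
    """Ball configuration of the excursion codified by the slot diagram x.

    x maps k to the vector x_k. Each box is stored as (content, level), where
    the box is an HT k-slot for every k <= level (assumed bookkeeping).
    """
    boxes = []
    for k in range(max(x, default=0), 0, -1):      # biggest solitons first
        # Insertion points right after each HT k-slot; point 0 is the record.
        points = [0] + [j + 1 for j, (_, level) in enumerate(boxes) if level >= k]
        if len(points) != len(x[k]):
            raise ValueError(f"x_{k} must have {len(points)} entries")
        # Heads h_1..h_k have levels k-1..0, tails t_1..t_k have levels 0..k-1.
        soliton = ([(1, k - l) for l in range(1, k + 1)]
                   + [(0, j - 1) for j in range(1, k + 1)])
        # Insert from right to left so that earlier insertion points stay valid.
        for point, count in reversed(list(zip(points, x[k]))):
            boxes[point:point] = soliton * count
    return [content for content, _ in boxes]


diagram_11 = {
    4: (1,),
    3: (0, 0, 0),
    2: (0, 1, 0, 1, 0),
    1: (0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0),
}
print(excursion_from_slot_diagram(diagram_11))
```

The check on the lengths of the vectors x_k is the same consistency condition as formula (9), here verified on the fly while the excursion grows.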
3.6 Conserved Quantities

We discuss a way to identify conserved quantities using the first definition of the dynamics in Sect. 2.1 applied to a finite excursion. Recall that the basic step consists in pairing all neighboring boxes of type 10 by drawing a line from the 1 to the 0 and then removing the paired boxes to iterate, see Fig. 2. Call r_i the number of lines drawn in the i-th iteration of the construction. We have r_1 ≥ r_2 ≥ · · · ≥ r_M, where M is the number of iterations necessary to pair all the balls. In the example of Fig. 2 we have M = 4 and r_1 = 8, r_2 = 2, r_3 = r_4 = 1.

Proposition 3 (Yoshihara et al. [19]) The numbers r_i are invariant for the dynamics. That is,

\[
  r_i(\eta) = r_i(T\eta) \qquad \text{for any } i. \tag{12}
\]
Proof We present a simplified version of the argument given by Yoshihara et al. [19]. The basic property that we use is the reversibility of the dynamics. Introduce the evolution T* that is defined exactly as the original dynamics except that balls move to the left instead of to the right. The reversibility of the dynamics is encoded by the relation T*Tη = η. This fact follows from the definition: looking
Fig. 17 The pairing construction for a dynamics evolving to the right (lines above) and the pairing construction for the same configuration of balls but evolving to the left (lines below)
at Fig. 2, the configuration T η is obtained just by coloring black the white boxes and white the black ones. The evolution T* is obtained by pairing balls with empty boxes to the left. The lines associated to T* for the configuration T η are exactly the same as those already drawn. The only difference is that the balls are now transported from right to left along these lines. Denote by r_i* the number of lines drawn at iteration number i for the evolution T*. Since the lines used are the same we have

\[
  r_i(\eta) = r_i^*(T\eta), \qquad \forall i. \tag{13}
\]
Now evolve the original configuration η according to T*. In Fig. 17 we draw above the lines corresponding to the evolution T and below those corresponding to T*. We want now to show that

\[
  r_i(\eta) = r_i^*(\eta), \qquad \forall i. \tag{14}
\]
Recall that a run is a sequence of consecutive empty or full boxes. In the configuration η of our example there are two infinite empty runs and, between them, 8 full and 7 empty finite runs that alternate. The first step is to show that r_1(η) = r_1*(η). This is simple because these numbers coincide with the number of full runs in the configuration η. The second step of the algorithm consists in erasing the rightmost ball of every occupied run and the leftmost empty box of every empty run for T, while the leftmost ball of every occupied run and the rightmost empty box of every empty run are erased for T*. Observe that r_2(η) coincides with the number of full runs in the configuration obtained by removing the balls and the empty boxes paired in the first step. This configuration is obtained from η by decreasing by one the size of every finite run. If in η there are some runs of size 1 then they disappear. The same happens when computing r_2*(η). Since we are just interested in the sizes of the alternating sequences of empty and full runs, erasing on the left or on the right is irrelevant. We deduce r_2(η) = r_2*(η) since both coincide with the number of finite occupied runs of two configurations having the
same sequence of sizes of the runs. Iterating this argument we deduce (14). Now, using (13) and (14) we deduce (12).
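Proposition 3 can be checked numerically on the example of Fig. 2, whose ball configuration is the Dyck word of Sect. 2.1. In this sketch the names `pairing_counts` and `evolve` are hypothetical, and finite configurations are padded on the right with as many empty boxes as there are balls, an assumption that guarantees every ball finds a partner.

```python
def pairing_counts(eta):
    """Numbers r_1, r_2, ... of lines drawn at each iteration of the pairing."""
    boxes = list(eta) + [0] * sum(eta)    # enough empty boxes on the right
    counts = []
    while 1 in boxes:
        kept, lines, i = [], 0, 0
        while i < len(boxes):
            if boxes[i] == 1 and i + 1 < len(boxes) and boxes[i + 1] == 0:
                lines += 1                # pair the neighboring boxes "10"
                i += 2                    # and remove both of them
            else:
                kept.append(boxes[i])
                i += 1
        counts.append(lines)
        boxes = kept
    return counts


def evolve(eta):
    """One step of the dynamics T, computed with the carrier."""
    out, carrier = [], 0
    for content in list(eta) + [0] * sum(eta):
        if content == 1:
            carrier += 1
            out.append(0)
        elif carrier > 0:
            carrier -= 1
            out.append(1)
        else:
            out.append(0)
    return out


eta = [1 if c == "(" else 0 for c in "((()()(()())()()())(()))"]
print(pairing_counts(eta))
print(pairing_counts(evolve(eta)))
```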
3.7 Young Diagrams

We discuss now a generalization of the conservation property (12) to the case of infinite configurations and the relation with the conservation of the solitons. Since the numbers r_i are monotone, it is natural to represent them using a Young diagram [11]. A Young diagram is a diagram of left-justified rows of boxes where any row is not longer than the row on top of it. We fix, for example, r_i to be the length of row number i from the top. The number of iterations M corresponds to the number of rows. The Young diagram associated to the example in Fig. 2 is therefore

□□□□□□□□
□□
□
□                  (15)
This diagram can be naturally codified by the numbers r_i, representing the sizes of the rows, as (8, 2, 1, 1). Another way of codifying a Young diagram is by the sizes of the columns. This gives another Young diagram that is called the conjugate diagram, and it is obtained by reflecting the diagram across the diagonal. The same diagram (15) can therefore be codified as [4, 2, 1, 1, 1, 1, 1, 1]. Finally another equivalent codification can be given by specifying the numbers n_1, n_2, . . . , n_M of columns of length respectively 1, 2, . . . , M. For the Young diagram above we have for example n_1 = 6, n_2 = 1, n_3 = 0, n_4 = 1. The numbers r_i and n_i give alternative and equivalent codings of the diagram and are related by

\[
  r_i = \sum_{m=i}^{M} n_m, \qquad n_i = r_i - r_{i+1}, \tag{16}
\]
where we set r_{M+1} := 0. The number n_i can be interpreted as the number of solitons of length i. Take for example the diagram (15) and cut it into vertical slices obtaining

□ □ □ □ □ □ □ □
□ □
□
□                  (17)
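The three codings of a Young diagram and relation (16) translate directly into code. The helper names below are hypothetical and the snippet is only a small check on the diagram (15).

```python
def columns_from_rows(r):
    """Column lengths (the conjugate coding) of the diagram with row lengths r."""
    return [sum(1 for row in r if row > j) for j in range(r[0])] if r else []


def soliton_counts(r):
    """n_i = r_i - r_{i+1} for i = 1, ..., M, with r_{M+1} := 0, as in (16)."""
    extended = list(r) + [0]
    return [extended[i] - extended[i + 1] for i in range(len(r))]


def rows_from_counts(n):
    """r_i = n_i + n_{i+1} + ... + n_M, as in (16)."""
    return [sum(n[i:]) for i in range(len(n))]


rows = [8, 2, 1, 1]                   # diagram (15)
print(columns_from_rows(rows))        # heights of the slices in (17)
print(soliton_counts(rows))           # n_1, ..., n_4
print(rows_from_counts(soliton_counts(rows)) == rows)
```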
The original Young diagram can be reconstructed by gluing together the columns in decreasing order from left to right and justifying all of them to the top. Each column of height k in (17) will represent a k-soliton of the dynamics. We do not give a formal proof of this statement; it can however easily be obtained from the construction in Sect. 4.2. We will show indeed that the soliton decomposition can be naturally done using trees codifying excursions. In Sect. 4.2 we show how the trees can be constructed using the lines of the first definition in Sect. 2.1, getting directly the relationship between the Young diagrams and the solitons. According to this, the configuration η associated with the Young diagram (15), obtained by gluing together again the columns in (17), contains one 4-soliton, one 2-soliton and six 1-solitons. The Young diagram contains only some information about the configuration of balls, i.e. the map that associates to η its Young diagram is not invertible, and for example there are several configurations of balls giving (15) as a result. The one in Fig. 2 is just one of them. Essentially the Young diagram contains just the information concerning the numbers of solitons contained in the configuration but not the way in which they are combinatorially organized.

In the example discussed above we worked with a configuration of balls having one single non-trivial finite excursion. Consider now a finite configuration η whose walk representation contains more than one excursion. Our argument on the conservation of the numbers r_i proves that the global Young diagram associated to the whole configuration is invariant under the dynamics. Let us consider however the single excursions separately. Recall that two different excursions are separated by empty boxes from which there are no lines exiting. For example in Fig. 4 there are 3 excursions, which we surrounded by rectangles to distinguish them.
We construct for each excursion separately the corresponding Young diagram. For the example of Fig. 4 the three Young diagrams are
(18)

By definition the global Young diagram that is preserved by the dynamics is the one having as length of the first row (the number r_1) the sum of the lengths of the first rows of the three diagrams, as length of the second row (the parameter r_2) the sum of the lengths of the second rows of all the Young diagrams, and so on. This means that the global Young diagram is obtained by suitably joining together the single Young diagrams. In particular the gluing procedure is the following. We have to split the columns of each single diagram, then put all the columns together and glue them together as explained before, i.e. arranging them in decreasing order from left to
right and justifying all of them to the top. For example the first Young diagram on the left in (18) is split into
For the second diagram in (18) we have two columns of size 1, while for the third one we have one single column of size 2. The global Young diagram for the example of Fig. 4 is therefore
The number n_i of columns of length i in the global diagram is obtained as the sum of the numbers of columns of size i in the single diagrams. Also the numbers r_i are obtained by summing the corresponding row lengths of each single diagram (with the usual convention that a Young diagram with M rows has r_j = 0 for j > M). The shapes of the single diagrams in (18) are not invariant under the dynamics. Even the number of such diagrams is not conserved, since during the evolution the number of excursions may change. What is conserved instead is the total number of columns of each given size. More precisely, given a configuration η we can construct the Young diagram of each excursion and then cut them into single columns. The configuration of balls T η will have different excursions with different Young diagrams, but they will be obtained again by combining differently into separate Young diagrams the same columns obtained for the configuration η. The Box-Ball dynamics preserves the number of columns of size k for each k. Indeed this is nothing else than a different identification of the traveling solitons, again by the construction in Sect. 4.2. If η is an infinite configuration with a walk having all the records, we can construct a Young diagram for each excursion. Cutting the diagrams along the columns we obtain the solitons contained in the excursion.

Slot Diagrams and Young Diagrams Since a slot diagram describes the number of solitons per slot, we can associate a Young diagram to a slot diagram x as follows: M(x) is the number of rows and n_k is the number of columns of length k. The diagram is constructed by gluing n_M columns of length M, then n_{M−1} columns
P. A. Ferrari and D. Gabrielli
of length $M-1$, up to $n_1$ columns of length 1. For example the Young diagram associated to the slot diagram (11) is given by
.
(19)
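The passage from column counts to a Young diagram, and the merging of several diagrams into a global one, can be sketched in a few lines of Python. This is only an illustration: the function names and the dictionary encoding (column length mapped to number of columns) are assumptions made here, not notation from the text.

```python
def rows_from_columns(n):
    """n maps a column length k to the number n_k of columns of that length.
    Returns the row lengths r_1 >= r_2 >= ... of the Young diagram."""
    M = max((k for k, c in n.items() if c > 0), default=0)
    # Row j contains one box for every column of length at least j.
    return [sum(c for k, c in n.items() if k >= j) for j in range(1, M + 1)]


def columns_from_rows(r):
    """Inverse map: the number of columns of length exactly j is r_j - r_{j+1}."""
    r = list(r) + [0]
    return {j: r[j - 1] - r[j] for j in range(1, len(r)) if r[j - 1] > r[j]}


def merge(*diagrams):
    """Global diagram: column counts of the single diagrams are added."""
    total = {}
    for n in diagrams:
        for k, c in n.items():
            total[k] = total.get(k, 0) + c
    return total


# Hypothetical column counts: one column of length 4, two of length 2, one of length 1.
example = {4: 1, 2: 2, 1: 1}
print(rows_from_columns(example))               # row lengths of the diagram
print(rows_from_columns(merge({1: 2}, {2: 1})))  # two single diagrams merged
```

Because only the column counts enter, any regrouping of the same columns into different single diagrams gives the same global diagram, which is the conservation statement above.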
4 Trees, Excursions and Slot Diagrams
In this section we provide an alternative decomposition of an excursion using a bijection between soft excursions and planar trees. The construction is a slight variant of the classical bijection between strict excursions and planar rooted trees, see [5, 12, 13]. There are several ways of encoding planar trees (see for example [13]). We will use a direct pictorial approach, introducing as little algebraic notation as possible.
4.1 Tree Representation of Excursions In this subsection we summarize classical results mapping finite trees to excursions, see for instance §1.1 of the lecture notes of Le Gall [13]. Start with the graph of a soft excursion as in Fig. 18. Draw horizontal lines corresponding to the integer values of the height. The region below the graph of the excursion is cut into disjoint components by the horizontal lines. Associate one node to each connected
Fig. 18 Construction of a planar tree associated to the excursion of Fig. 6. The root is a black circle and the nodes are white circles. The root is associated to a record
Fig. 19 The pairing of opposite diagonal sides of an excursion. Each double arrow corresponds to a node of the tree different from the root
component. The root is the node corresponding to the bottom region. The tree is obtained by drawing an edge between nodes whose associated components share a piece of a horizontal line. The construction is illustrated in Fig. 18, where the root is drawn as a • and the other nodes as a ◦. The tree that we obtain is rooted, since there is a distinguished vertex, and it is planar. This means that it is embedded in the plane where the graph of the excursion is drawn. A consequence of this specific embedding is that every vertex different from the root has an edge incoming from below and all the other edges are ordered from left to right going clockwise. In [13] the map from the planar tree to the excursion is given in terms of a Dyck path. The excursion gives the distance to the root of a vehicle that travels around the tree at a speed of one edge per unit of time. The reverse bijection amounts to gluing the edges face to face below the excursion (in order to recover the edges), as in Fig. 19.
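The correspondence between excursions and planar trees can be sketched as follows. The encoding is an assumption made for illustration: an excursion is a list of steps +1/−1 that stays non-negative and returns to 0, and a tree is a list of ordered children lists with node 0 as the root.

```python
def excursion_to_tree(steps):
    """Each up-step opens a new child of the current node,
    each down-step returns to the parent (contour exploration)."""
    children = [[]]      # children[v] = ordered list of children of node v
    parent = [None]
    current = 0
    for s in steps:
        if s == 1:
            children.append([])
            parent.append(current)
            new = len(children) - 1
            children[current].append(new)
            current = new
        elif s == -1:
            if current == 0:
                raise ValueError("the path goes below zero")
            current = parent[current]
        else:
            raise ValueError("steps must be +1 or -1")
    if current != 0:
        raise ValueError("the path does not return to zero")
    return children


def tree_to_excursion(children, node=0):
    """Distance to the root of a walker going around the tree, left to right."""
    steps = []
    for c in children[node]:
        steps.append(1)
        steps.extend(tree_to_excursion(children, c))
        steps.append(-1)
    return steps


steps = [1, 1, -1, 1, -1, -1, 1, -1]
tree = excursion_to_tree(steps)
print(tree)                       # ordered children of each node
print(tree_to_excursion(tree) == steps)
```

A tree with $n$ non-root nodes corresponds to an excursion of length $2n$, one up-step and one down-step per edge.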
4.2 Trees and Pairing Algorithm
The tree associated to an excursion can be constructed using the pairing definition of the dynamics of Fig. 2. As before, draw dashed horizontal lines at the integer heights; they cut the epigraph of the excursion into disjoint regions. Pair the opposite diagonal faces of each region, connected by dashed double arrows in Fig. 19. Since the left face is an up-step and the right one is a down-step, corresponding respectively to balls and empty boxes, we obtain exactly the pairing of the first definition of the dynamics. Indeed, the pairings of the first iteration of the first definition of the dynamics coincide exactly with the pairing of the two opposite diagonal sides near each local maximum. Removing the paired objects and iterating gives a proof. We construct the planar tree by associating the root to the unbounded upper region of the upper half plane and one node to each pairing line. Nodes associated to maximal lines are linked to the root. Consider a node A associated to a maximal line. A node B associated to another line is connected to A if: (1) the line associated to B is surrounded by the line associated to A and (2) after removing the maximal line associated to A, the line associated to B becomes maximal. The tree is constructed after a finite number of iterations of this algorithm, see Fig. 20, where the planar tree is red and oriented downward.
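A minimal sketch of the pairing construction, assuming the excursion is given as a list of +1 (ball) and −1 (empty box) steps. A stack is used here instead of the iterative removal described above; both produce the same pairs, exactly as for matching parentheses. The function names are illustrative only.

```python
def pair_steps(steps):
    """Pair every up-step (ball) with the down-step (empty box) that closes it."""
    stack, pairs = [], []
    for i, s in enumerate(steps):
        if s == 1:
            stack.append(i)
        else:
            if not stack:
                raise ValueError("unmatched empty box: not an excursion")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unmatched ball: not an excursion")
    return sorted(pairs)


def pairing_tree(steps):
    """Parent of each pairing line: the innermost line surrounding it,
    or None when the line is maximal (its node is linked to the root)."""
    parent, enclosing = {}, []
    for left, right in pair_steps(steps):      # sorted by left endpoint
        while enclosing and enclosing[-1][1] < left:
            enclosing.pop()                    # these lines are already closed
        parent[(left, right)] = enclosing[-1] if enclosing else None
        enclosing.append((left, right))
    return parent


steps = [1, 1, -1, 1, -1, -1, 1, -1]
for line, par in pairing_tree(steps).items():
    print(line, "->", par)
```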
Fig. 20 The construction of the planar tree associated to an excursion using the pairing between balls and empty boxes
4.3 Branch Identification of Planar Trees
We discuss a natural branch decomposition of a rooted planar tree that is in correspondence with the soliton decompositions previously discussed. We give 3 equivalent algorithms to identify the branches of a planar rooted tree.
Branch Identification I
Step 1. Let $A_1$ be the set of the leaves (nodes with only one neighbor). Associate a distinct color and the generation number 1 to each leaf. The root is black, a color not allowed for the other nodes.
Step $\ell$. Let $A_{\ell-1}$ be the set of numbered and colored nodes after $\ell-1$ steps. Let $N_\ell$ be the set of nodes with all offspring in $A_{\ell-1}$. To each $n \in N_\ell$ give the color of the rightmost neighbor among those with the biggest generation number, say $g$, and give generation number $g+1$ to $n$. Stop when all nodes are colored.
In Fig. 21 we give a distinct color to each leaf (for simplicity we have repeated colors in the picture). In each step, each not-yet-colored node whose offspring are all already colored receives the color of its rightmost maximal offspring. After coloring all nodes, identify the colors of branches of the same size (knowing the result, we have started with those colors already identified). A k-branch is a one-dimensional path with k nodes all of the same color and k edges, one of which is incident to a node of a different color. In Fig. 21 we have colored the tree produced by the excursion in Fig. 18 and have identified 2 violet 1-branches, 2 red 2-branches, 1 green 3-branch and 2 blue 4-branches (for simplicity we used a simplified convention for the colors, see the caption for the explanation).
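The following Python sketch follows Branch Identification I without drawing colors: it only records how many branches of each size appear. It assumes the tree is given as ordered children lists (children listed from left to right, node 0 the root); this encoding and the function names are choices made here for illustration.

```python
from collections import Counter


def generations(children):
    """gen[v] = number of nodes on the longest path from v down to a leaf."""
    gen = [0] * len(children)

    def visit(v):
        gen[v] = 1 + max((visit(c) for c in children[v]), default=0)
        return gen[v]

    visit(0)
    return gen


def branch_sizes(children):
    """Returns {k: number of k-branches}."""
    gen = generations(children)
    sizes = Counter()
    for v, kids in enumerate(children):
        if not kids:
            continue
        if v == 0:
            heir = None                  # the root is black: no child inherits
        else:
            top = max(gen[c] for c in kids)
            # v takes the color of its rightmost child of maximal generation
            heir = [c for c in kids if gen[c] == top][-1]
        for c in kids:
            if c != heir:
                sizes[gen[c]] += 1       # c is the top node of a gen[c]-branch
    return dict(sizes)


# Root with two children; the left child has two leaves, the right child is a leaf.
tree = [[1, 4], [2, 3], [], [], []]
print(branch_sizes(tree))
```

In this small tree the left child of the root joins the branch of its rightmost leaf, giving one 2-branch, while the remaining two leaves are 1-branches.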
Branch Identification II
Step 0. Enumerate the colors. In our example we use violet for 1-branches, red for 2-branches, green for 3-branches and blue for 4-branches.
Step 1. Paint all leaves with color 1, violet.
Step $\ell$. Update those nodes all of whose offspring were already colored during steps 1 up to $\ell-1$. Give color $\ell$ to the updating nodes and change to color $\ell$ those nodes belonging to the rightmost offspring path of size $\ell$ starting from each updating node. See Fig. 22.
Fig. 21 Branch identification I
Fig. 22 Branch identification II
In Fig. 22 we give color 1 (violet in this case) to each leaf. In step 2 (a) give color 2 (red) to all nodes having all offsprings already colored and (b) change to color 2 each already colored node belonging to the rightmost offspring path with 2 nodes starting at each updating node. In step 3 use color green and in step 4 use color blue. The final branch decomposition is the same as in Fig. 21.
Fig. 23 Branch identification III
Branch Identification III
Step 0. Orient the tree toward the root. Consider the oriented paths starting from the leaves of the tree. Remove the root but not the edges incident to the root.
Step 1. Search for the maximal directed paths starting from the leaves. If two or more of them share at least one edge, select just the rightmost path among those. Observe that the last edge is incident to only one node. A selected path with k nodes is called a k-branch.
Step 2. Remove the selected branches. If all paths have been removed, then stop. Otherwise go to step 1.
The tree is oriented just to define the procedure. The branches selected and removed constitute the branch decomposition of the tree. In Fig. 23 we apply this procedure to the same example as in the previous procedures. The result is the same. In Fig. 23, the first square represents the first iteration. There are 3 paths of length 4 sharing the left edge incident to the root and two paths of length 4 sharing 3 edges. The rightmost path of each group is identified as a 4-branch and colored blue. The second iteration identifies one 3-branch in green; the third iteration identifies two 2-branches and the fourth iteration identifies two 1-branches. Putting the colored branches back in their original positions we obtain the last picture of Fig. 22.
4.4 Tree-Induced Soliton Decomposition of Excursions We now take the tree produced by an excursion, as illustrated in Fig. 18, use any algorithm of Sect. 4.3 to identify its branches and use the colored tree to identify
Fig. 24 Soliton* decomposition of the excursion of Fig. 6 using the branch decomposition of Sect. 4.3 of the tree associated to the excursion as obtained in Fig. 18
solitons, as follows. Put the colored tree back into the excursion and color the diagonal boundaries of the region associated to each node with the color of the node. Each k-branch is then associated to k empty and k occupied boxes with the same color; we call those boxes and their content a k-soliton*. We use the * to indicate solitons and slots in the tree-induced decomposition. In this case all solitons* are oriented up, that is, the head of each soliton* is to the left of its tail. See Fig. 24.
Proposition 4 (HT and Tree Decomposition) Given any excursion ε, the HT soliton decomposition of ε coincides with the tree decomposition of ε.
Proof This proposition is a consequence of Proposition 5 below, given in terms of the slot diagrams of both objects.
4.5 Slot Diagrams of Planar Trees
Think of each node of a tree as a geometric object. More precisely, identify each node with a circumference that is exactly the boundary of the associated colored region, as in Fig. 25. Each edge incident to the node is now a segment intersecting the circumference; different edges intersect it at different points, called incident points. By convention, we assume that there is a segment incident to the root from below. The arcs of the circumference with extremes in the incident points and with no incident point in the interior are called slots*. We will describe a procedure to attach new branches to slots*. We use the same symbol ∗ for the solitons of the previous section and for the slots here, since there is a direct correspondence between the solitons* and the slot* diagram for the branches of the tree. We say that a node of a tree has k generations if it is colored in iteration number k of the algorithm Branch identification II. This is equivalent to saying that the maximal path from the node to a leaf, moving always in the opposite direction with respect to the root, has k nodes, including the node and the leaf.
Slots Identification of Trees I Consider a colored tree with maximal branch of size m. Declare the whole circumference of the root of the tree as the m-slot* number 0; recall there is an incident edge to this node from below. Attach the m-branches to the unique m-slot*. Proceed then iteratively for k < m. Assume that the tree has no $\ell$-branches for $\ell \le k$ and call a slot* s a k-slot* if one of the following conditions holds: (a) s belongs to the root, (b) s belongs to a node with more than k generations,
Fig. 25 Slot identification. Upper-left: for m = 4 there is a unique 4-slot* in the root. Upper-right: attaching two 4-branches to this slot we identify five 3-slots*. Attaching one 3-branch to 3-slot 1, identify eleven 2-slots* and finally attaching two 2-branches to 2-slots* 1 and 4, we identify 21 1-slots*. To complete the tree in Fig. 24 we have to attach two 1-branches (not in this picture)
(c) s belongs to a node with k generations and all paths with k nodes containing a leaf and incident to the node are incident to the right of s. The k-slots* are numbered from left to right, starting with k-slot* 0 at the left side of the node associated to the record. See Fig. 25.
Slot Diagram of a Tree The slot diagram of the tree is a collection of vectors
\[ x^* = \big( (x_k^*(0), \dots, x_k^*(s_k^*-1)) : k = 1, \dots, m \big) \tag{20} \]
where $m$ is the length of the longest path in the tree, $s_m^* = 1$, and for $k = m, \dots, 1$ we iterate:
\[ x_k^*(i) = \text{number of $k$-branches attached to $k$-slot* number } i, \quad i = 0, \dots, s_k^*-1, \qquad n_k^* = \sum_{i=0}^{s_k^*-1} x_k^*(i), \tag{21} \]
\[ s_{k-1}^* = 1 + \sum_{\ell=k}^{m} 2(\ell-k+1)\, n_\ell^*. \tag{22} \]
Fig. 26 Slot* diagram (23). There are two 1-branches attached to 1-slots* number 17 and 19, two 2-branches attached to 2-slots* 1 and 4, one 3-branch attached to 3-slot* number 1 and two 4-branches attached to 4-slot* number 0
In particular the slot* diagram of Fig. 25 is given by $m = 4$, $(s_1^*, s_2^*, s_3^*, s_4^*) = (21, 11, 5, 1)$ and
\[ \begin{aligned} x_4^* &= (2)\\ x_3^* &= (0, 1, 0, 0, 0)\\ x_2^* &= (0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0)\\ x_1^* &= (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0) \end{aligned} \tag{23} \]
using the slot enumeration in Fig. 25. We illustrate this slot diagram in Fig. 26.
Slots Identification of Trees II A reverse way to find the slot* diagram of a colored tree with identified slots* is the following. Remove the 1-branches, keeping track of the index of the 1-slot* each branch was attached to. Assume we have removed the $\ell$-branches for $\ell < k$. Then remove the k-branches, keeping track of the k-slot* number associated to each removed k-branch. The slot* diagram associated to the tree consists of the removed branches and their associated slot* numbers. See Fig. 27. Figure 27. Upper-left: a colored tree. Upper-right: erasing 1-branches in the tree, we identify and enumerate 1-slots*. Lower-left: erasing 1-branches and 2-branches, we identify and enumerate 2-slots*. Lower-right: in a tree with 4-branches we identify and enumerate 3-slots*. The node associated to the record, in black, has one 3-slot* for each arc.
4.6 From Paths to Trees We illustrate now the reverse operation. Start with the slot diagram obtained in Fig. 26. Put the root. Let m be the biggest size of the branches in the slot diagram. Attach the m-branches to the root. Then successively for k = m − 1, . . . , 1 attach the k branches to the associated k-slot in the tree. The result is illustrated in Fig. 27 looking at it backwards: In rectangle 4 we attach 2 4-branches to 4-slot 0 and indicate the place and number of each 3-slot; in rectangle 3 we attach one 3-branch to 3-slot number 1 and so on.
Fig. 27 Slot identification of trees II
Proposition 5 Given a finite excursion ε we have x , [ε] = x*[ε].
Sketch Proof We give a sketch of the proof showing the basic idea. Consider an arbitrary slot diagram x. We are going to show that the excursion ε characterized by x , [ε] = x and the excursion ε′ characterized by x*[ε′] = x are the same, i.e. ε = ε′. This implies the statement of the Proposition. Recall that we have constructed the excursion associated to x , [ε] iteratively in Sect. 3.5, gluing some special triangles one after the other. We have just shown instead that to construct x*[ε′] we have to glue recursively branches like the ones in Fig. 26, glued as in Fig. 27 (recall that the gluing procedure has to be followed in the reverse order). Since x is the same, both procedures deal with the same number of k-triangles and k-branches to be attached to the same slots. The proof is therefore based on the correspondence between the two different procedures, once we fix the basic correspondence of Fig. 28 between the two basic building blocks. Considering the example of Sect. 3.5, we show in Fig. 29 the construction of the tree associated to the excursion ε′ such that x*[ε′] = x, where x is the slot diagram (11). This figure has to be compared with Fig. 16, where we constructed the excursion ε such that x , [ε] = x, where x is again (11). In Fig. 29, for simplicity, we draw just the slots* useful for the attachments. Following the two constructions carefully in parallel, the reader can see that at each step the excursion is the same and the allocation of the slots is again the same. A long formal proof could be given following this strategy. See also Fig. 30 for illustration.
Fig. 28 The correspondence between a triangle and a branch in the two different constructions. Here k = 4. The excursion associated to each basic building block is the same and coincides with the diagonal boundary of the triangle
Fig. 29 The construction of the excursion of Fig. 16 using branches instead of triangles. On the left the unique 4-branch attached to the root with the location of the 2-slots* (there are no 3-branches in this case). In the middle the tree after attaching 2-branches to slot* number 1 and slot* number 3, with the location of the 1-slots*. On the right the final tree after attaching the 1-branch to 1-slot* number 3
As a byproduct of the correspondence between planar trees and slot* diagrams we can count the number of planar trees that have a fixed number of branches of each size. This corresponds to counting the number of slot* diagrams when the numbers $n_k^*$ are fixed. For each level $k$ we need to arrange $n_k^*$ branches in $s_k^*$ available slots*, and this can be done in $\binom{n_k^* + s_k^* - 1}{n_k^*}$ different ways. Since this can be done independently on each level, the number of planar trees having $n_k^*$ branches of length $k$ is given by
\[ \prod_{k=1}^{M} \binom{n_k^* + s_k^* - 1}{n_k^*} = \prod_{k=1}^{M} \binom{n_k^* + 2\sum_{j=k+1}^{M} (j-k)\, n_j^*}{n_k^*}, \tag{24} \]
where we used (22).
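Formula (24) is easy to evaluate numerically. The sketch below (function name and dictionary encoding are illustrative assumptions) computes the product of binomial coefficients and, as a sanity check, sums it over all possible branch counts of trees with 4 edges, which should recover the Catalan number 14 counting all planar rooted trees with 4 edges.

```python
from math import comb


def count_trees(n):
    """Number of planar trees with n[k] branches of size k, following (24)."""
    m = max((k for k, c in n.items() if c > 0), default=0)
    total = 1
    for k in range(1, m + 1):
        # slots available at level k, determined by the larger branches
        s_k = 1 + sum(2 * (j - k) * n.get(j, 0) for j in range(k + 1, m + 1))
        total *= comb(n.get(k, 0) + s_k - 1, n.get(k, 0))
    return total


# Branch counts of the tree of Fig. 25 and 26.
print(count_trees({4: 2, 3: 1, 2: 2, 1: 2}))

# All ways of splitting 4 edges into branches: 4, 3+1, 2+2, 2+1+1, 1+1+1+1.
partitions_of_4 = [{4: 1}, {3: 1, 1: 1}, {2: 2}, {2: 1, 1: 2}, {1: 4}]
print(sum(count_trees(p) for p in partitions_of_4))
```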
Fig. 30 First line: on the left, the HT soliton decomposition of Fig. 10 and the localization of the slots, ; on the right, branch decomposition of the tree produced by this excursion from Fig. 24 and 27. Following lines: checking that the slot, localization on the excursion and slot* localization on the tree are the same. In each line we have erased the solitons, /branches smaller than k and show the position of the k-slots, /slots*; k = 1, 2, 3, 4. To see the slot, /slot* number a soliton, /branch is attached to, look for a square/arrow of the same color in the line below
5 Soliton Distribution
We report in Sect. 5.1 a family of distributions on the set of excursions proposed by the authors [6], based on the slot decomposition of the excursions. In particular, the slot diagram of the excursion of a random walk satisfies that, given the m-components for m > k, the distribution of the k-component is a vector of independent Geometric random variables; the size of the vector is a function of the bigger components. As a consequence, we obtain that the distribution of the k-branches of the tree associated to the excursion of the random walk, given the m-branches for m > k, is a vector of independent geometric random variables. Theorem 7 considers a random ball configuration consisting of i.i.d. Bernoulli variables of parameter $\lambda < \frac12$, conditioned to have a record at the origin, and shows that its components are independent and that the k-component consists of i.i.d. geometric random variables. Since the measure is given in terms of the number of solitons and slots of the excursion, and those numbers are the same in all the slot diagrams we have introduced, we just work with a generic slot diagram.
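5.1 A Distribution on the Set of Excursions
Let $n_k(\varepsilon)$ be the number of k-solitons in the excursion ε and for $\alpha = (\alpha_k)_{k\ge1} \in [0,1)^{\mathbb{N}}$ define
\[ Z_\alpha := \sum_{\varepsilon \in \mathcal{E}} \prod_{k\ge1} \alpha_k^{n_k(\varepsilon)}, \tag{25} \]
with the convention $0^0 = 1$. Define
\[ \mathcal{A} := \{\alpha \in [0,1)^{\mathbb{N}} : Z_\alpha < \infty\}. \tag{26} \]
This set has a complex structure since the expression (25) is difficult to handle. For $\alpha \in \mathcal{A}$ define the probability measure $\nu_\alpha$ on $\mathcal{E}$ by
\[ \nu_\alpha(\varepsilon) := \frac{1}{Z_\alpha} \prod_{k\ge1} \alpha_k^{n_k(\varepsilon)}. \tag{27} \]
For $q \in (0,1]^{\mathbb{N}}$ define the operator $A : q \mapsto \alpha$ by
\[ \alpha_1 := (1-q_1); \qquad \alpha_k := (1-q_k) \prod_{j=1}^{k-1} q_j^{2(k-j)}, \quad \text{for } k \ge 2. \tag{28} \]
Reciprocally, define the operator $Q : \alpha \mapsto q$ by
\[ q_1 := 1 - \alpha_1 \tag{29} \]
and iteratively,
\[ q_k := 1 - \frac{\alpha_k}{\prod_{j=1}^{k-1} q_j^{2(k-j)}}, \quad k \ge 2. \tag{30} \]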
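The two operators in (28) to (30) can be sketched on finite sequences. Truncating to the first $K$ coordinates is an assumption made for this illustration, as are the function names; the recursions only look at indices up to $k$, so the truncation does not change the computed values.

```python
def apply_A(q):
    """(28): q = [q_1, ..., q_K] -> alpha = [alpha_1, ..., alpha_K]."""
    alpha = []
    for k in range(1, len(q) + 1):
        weight = 1.0
        for j in range(1, k):
            weight *= q[j - 1] ** (2 * (k - j))
        alpha.append((1 - q[k - 1]) * weight)
    return alpha


def apply_Q(alpha):
    """(29)-(30): alpha = [alpha_1, ..., alpha_K] -> q = [q_1, ..., q_K]."""
    q = []
    for k in range(1, len(alpha) + 1):
        weight = 1.0
        for j in range(1, k):
            weight *= q[j - 1] ** (2 * (k - j))   # uses q_1..q_{k-1} already found
        q.append(1 - alpha[k - 1] / weight)
    return q


q = [0.9, 0.8, 0.95]
alpha = apply_A(q)
print(alpha)
print(apply_Q(alpha))   # recovers q up to rounding
```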
Let
\[ \mathcal{Q} := \Big\{ q \in (0,1]^{\mathbb{N}} : \sum_{k\ge1} (1-q_k) < \infty \Big\}. \tag{31} \]
The next result gives an expression of $\nu_\alpha(\varepsilon)$ in terms of the slot diagram of ε.
Theorem 6 (Ferrari and Gabrielli [6])
(a) Let $q \in \mathcal{Q}$, $\alpha = Aq$ and $\nu_\alpha$ given by (27). Then $\alpha \in \mathcal{A}$ and
\[ \nu_\alpha(\varepsilon) = \prod_{k\ge1} (1-q_k)^{n_k} q_k^{s_k} \tag{32} \]
where $n_k$ and $s_k$ are the number of k-solitons, respectively k-slots, of ε.
(b) The map $A : \mathcal{Q} \to \mathcal{A}$ is a bijection with $Q = A^{-1}$.
The proof of (a) given below shows that if $q \in \mathcal{Q}$ then $Aq \in \mathcal{A}$ with $Z_{Aq} = \big(\prod_{k\ge1} q_k\big)^{-1}$. On the other hand, to complete the proof of (b) it suffices to show that $Q\alpha \in \mathcal{Q}$. The proof of this fact is more involved and can be found in [6]. If we denote $x_k^\infty = (x_k, x_{k+1}, \dots)$, the expression (32) is equivalent to the following (with the convention $q_0 := 0$ to take care of the empty excursion).
\[ \nu_\alpha(M = m) = (1-q_m) \prod_{\ell>m} q_\ell, \quad m \ge 0, \tag{33} \]
\[ \nu_\alpha\big(x_m(0) \,\big|\, M = m\big) = (1-q_m)^{x_m(0)-1} q_m, \tag{34} \]
\[ \nu_\alpha\big(x_k \,\big|\, x_{k+1}^\infty\big) = (1-q_k)^{n_k} q_k^{s_k}, \tag{35} \]
where we abuse notation writing $x_m$ as "the set of excursions ε whose m-component in x[ε] is $x_m$", and so on. Recall that $n_k$ is the number of k-solitons of x and $s_k$ is the number of k-slots of x, a function of $x_{k+1}^\infty$. Formulas (33) to (35) give a recipe to construct the slot diagram of a random excursion with law $\nu_\alpha$: first choose a maximal soliton-size m with probability (33) and use (34) to determine the number of maximal solitons $x_m(0)$ (a Geometric($q_m$) random variable conditioned to be strictly positive). Then we use (35) to construct iteratively the lower components. In particular, (35) says that under the measure $\nu_\alpha$ and conditioned on $x_{k+1}^\infty$, the variables $(x_k(0), \dots, x_k(s_k-1))$ are i.i.d. Geometric($q_k$).
Proof of Theorem 6 (a) Using formula (9), we have
\[ \prod_{k\ge1} (1-q_k)^{n_k} q_k^{s_k} = \prod_{k\ge1} (1-q_k)^{n_k} q_k^{1+\sum_{\ell>k} 2(\ell-k) n_\ell} \tag{36} \]
\[ = \Big(\prod_{n\ge1} q_n\Big) \prod_{k\ge1} \Big( (1-q_k) \prod_{j=1}^{k-1} q_j^{2(k-j)} \Big)^{n_k} \tag{37} \]
\[ = \Big(\prod_{n\ge1} q_n\Big) \prod_{k\ge1} \alpha_k^{n_k}, \quad \text{denoting } \alpha := Aq \tag{38} \]
\[ = \nu_\alpha(\varepsilon), \tag{39} \]
because $Z_\alpha = \big(\prod_{n\ge1} q_n\big)^{-1} < \infty$ since $q \in \mathcal{Q}$.
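The chain of identities (36) to (39) can be checked numerically on an example. The sketch below assumes a truncated parameter sequence (that is, $q_k = 1$ for $k > K$, so that the infinite products are finite) and soliton counts supported on sizes at most $K$; the helper names are illustrative.

```python
from math import isclose, prod


def slot_counts(n):
    """n = [n_1, ..., n_K]; returns [s_1, ..., s_K] with
    s_k = 1 + sum over l > k of 2 (l - k) n_l."""
    K = len(n)
    return [1 + sum(2 * (l - k) * n[l - 1] for l in range(k + 1, K + 1))
            for k in range(1, K + 1)]


def nu_slot_form(q, n):
    """Right-hand side of (32)."""
    s = slot_counts(n)
    return prod((1 - qk) ** nk * qk ** sk for qk, nk, sk in zip(q, n, s))


def nu_alpha_form(q, n):
    """(27) with alpha = Aq from (28) and Z_alpha = 1 / prod(q)."""
    alpha = [(1 - q[k]) * prod(q[j] ** (2 * (k - j)) for j in range(k))
             for k in range(len(q))]
    return prod(q) * prod(a ** nk for a, nk in zip(alpha, n))


q = [0.9, 0.8, 0.95, 0.99]   # hypothetical parameters q_1..q_4
n = [2, 2, 1, 2]             # soliton counts n_1..n_4
print(nu_slot_form(q, n), nu_alpha_form(q, n))
```

The two printed values agree up to floating-point rounding, reflecting the exchange of the order of the products between (36) and (37).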
5.2 Branch Distribution of the Random Walk Excursion Tree
For $\lambda \le \frac12$ define $\alpha = \alpha(\lambda)$ by
\[ \alpha_k := (\lambda(1-\lambda))^k. \tag{40} \]
Then $\alpha(\lambda) \in \mathcal{A}$ and $\nu_{\alpha(\lambda)}$ is the law of the excursion of a simple random walk that has probability λ to jump up and 1 − λ to jump down. A computation using the Catalan numbers shows that
\[ Z_{\alpha(\lambda)} = \frac{1}{1-\lambda}. \tag{41} \]
On the other hand, the probability that the random walk performs a fixed excursion of length 2n is $\lambda^n (1-\lambda)^{n+1}$, where the extra $(1-\lambda)$ is the probability that the walk jumps down after the 2n steps of the excursion. Since these probabilities sum to one over all excursions, this gives (41) with no computations. In terms of the branches of the tree associated to the excursion, one chooses the size m of the largest branch of the tree with (33) and uses (34) to decide how many maximal branches are attached to the root of the tree. Then one identifies the (m − 1)-slots and proceeds iteratively using (35) to attach the branches of lower size. Given the branches of size bigger than k already present in the tree, the numbers of k-branches per k-slot $(x_k(0), \dots, x_k(s_k-1))$ are i.i.d. Geometric($q_k$), with $q_k$ given iteratively by
\[ q_1 = 1 - \lambda(1-\lambda), \qquad q_k = 1 - \frac{(\lambda(1-\lambda))^k}{\prod_{j=1}^{k-1} q_j^{2(k-j)}}, \quad k \ge 2. \tag{42} \]
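The recursion (42) is straightforward to iterate. As a numerical illustration, combining $Z_{Aq} = (\prod_k q_k)^{-1}$ from Theorem 6 with (41) suggests that the product of the $q_k$ should approach $1-\lambda$; the sketch below checks this for one value of λ. The function name, the truncation level $K$ and the choice λ = 0.3 are assumptions made for the example.

```python
from math import prod


def random_walk_q(lam, K):
    """First K parameters q_1..q_K of (42) for jump-up probability lam < 1/2."""
    a = lam * (1 - lam)
    q = []
    for k in range(1, K + 1):
        weight = 1.0
        for j in range(1, k):
            weight *= q[j - 1] ** (2 * (k - j))
        q.append(1 - a ** k / weight)
    return q


lam = 0.3
q = random_walk_q(lam, 40)
print(q[:4])
print(prod(q), 1 - lam)   # the partial product is already very close to 1 - lam
```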
5.2.1 Geometric Branching Processes
Let ρ be a probability measure on $\mathbb{N} \cup \{0\}$. A branching process with offspring distribution ρ is a random growing tree defined as follows. Let $(X_j^i)_{i,j\in\mathbb{N}}$ be a doubly indexed sequence of i.i.d. random variables with law ρ. At initial time zero the tree consists of one single vertex, the root. At time 1 there are $X_1^1$ individuals in the first generation; all of them are generated by the root and are drawn as vertices connected to the root. Give them an arbitrary order from left to right, embedding the tree in the plane. At time 2 each individual of the first generation independently produces a number of new vertices with distribution ρ. More precisely, the number of vertices produced by individual number i of the first generation is $X_2^i$. Every such new vertex is connected by an edge to its parent vertex in the previous generation, with an arbitrary order from left to right given by the embedding. Continue iteratively in this way, with $X_j^i$ being the number of vertices of generation j produced by individual number i of generation j − 1. If $\sum_{k=0}^{+\infty} k\rho(k) < 1$ the branching process is called subcritical and the above procedure produces almost surely a finite random planar tree. See [12] for more details, references, and the relation with the law of the corresponding excursions.
Let us consider the case when ρ is the law of a Geometric(1 − λ) random variable, i.e. $\rho(k) = P(X_j^i = k) = (1-\lambda)\lambda^k$, $k = 0, 1, \dots$. In this case the probability of any given finite tree is $(1-\lambda)^{|V|} \lambda^{|V|-1}$, where |V| is the number of vertices, including the root. Since |V| − 1 = n, where 2n is the length of the corresponding excursion (the correspondence is described in Sect. 4.1), the law of the excursion of the tree coincides with the law of the excursion of a simple random walk having probability λ to jump up and probability 1 − λ to jump down (see [12] for more details). Using the results discussed in this paper we therefore obtain an alternative construction of a geometric branching process using independent but not identically distributed geometric random variables. Consider the parameters $(q_k)_{k\in\mathbb{N}}$ defined as in (42). The law of the maximal generation M of the branching process is given by the right hand side of (33), i.e.
\[ P(M = m) = (1-q_m) \prod_{\ell>m} q_\ell, \quad m \ge 0. \]
Once the maximal generation has been fixed we attach, directly to the root, a number of maximal branches of size M = m according to the distribution given on the right hand side of (34), i.e. a Geometric($q_m$) conditioned to be positive. See for example Fig. 25, where m = 4 and we attach two 4-branches directly to the root in frame number 2. We now proceed iteratively. Suppose that all the branches of size bigger than k have been attached. Consider all the k-slots* of the tree and attach to each of them a Geometric($q_k$) number of k-branches (see for example frame 3 of Fig. 25, where we attach 3-branches, and frame 4, where we attach 2-branches). The final random tree obtained this way is a branching process with offspring law ρ given by Geometric(1 − λ).
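The construction just described can be sketched as a sampler for the slot diagram of the tree. This is a minimal illustration under stated assumptions: the parameter sequence is truncated ($q_k$ is treated as 1 for $k > K$, so no branch larger than $K$ is produced), Geometric($q$) counts failures before the first success, and the names `geometric` and `sample_slot_diagram` are hypothetical.

```python
import random


def geometric(q, rng):
    """Geometric(q) on {0, 1, ...}: P(X = x) = q (1 - q)^x."""
    x = 0
    while rng.random() >= q:
        x += 1
    return x


def sample_slot_diagram(q, rng):
    """q = [q_1, ..., q_K]. Returns {k: [x_k(0), ..., x_k(s_k - 1)]},
    or {} for the empty tree (M = 0)."""
    K = len(q)
    u = rng.random()
    m, tail = K, 1.0                 # tail = prod_{l > m} q_l = P(M <= m)
    while m > 0 and tail * q[m - 1] > u:
        tail *= q[m - 1]
        m -= 1                       # now M is the smallest m with P(M <= m) > u
    if m == 0:
        return {}
    # (34): number of maximal branches, a geometric conditioned to be positive
    x = {m: [1 + geometric(q[m - 1], rng)]}
    for k in range(m - 1, 0, -1):
        # number of k-slots determined by the larger branches already attached
        s_k = 1 + sum(2 * (l - k) * sum(x[l]) for l in range(k + 1, m + 1))
        x[k] = [geometric(q[k - 1], rng) for _ in range(s_k)]   # (35)
    return x


rng = random.Random(2017)
q = [0.79, 0.93, 0.97, 0.99]         # hypothetical truncated parameters
print(sample_slot_diagram(q, rng))
```

Attaching the sampled branches to the corresponding slots*, as in Sect. 4.6, would then produce the random tree itself.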
5.3 Soliton Decomposition of Product Measures in $\{0,1\}^{\mathbb{Z}}$
Forest of Trees Associated to Configurations with Infinitely Many Balls Consider a configuration η (with possibly infinitely many balls) and assume the walk ξ = Wη has a record at the origin and all records, that is, $r(i, \xi) \in \mathbb{Z}$ for all $i \in \mathbb{Z}$. Let $(\varepsilon^i)_{i\in\mathbb{Z}}$ be the excursion decomposition of ξ. Associating to each excursion the corresponding tree, we end up with a forest of trees, each associated with an excursion and sharing the slot diagram of that excursion. See Fig. 31 for the trees associated to the ball configuration in Fig. 4.
Fig. 31 Up: Walk representation of the ball configuration of Fig. 4, with records represented by a black dot and labeled from −1 to 6. Middle: black dots representing records with nonempty excursions have been displaced to facilitate the picture. Each excursion tree has been decomposed into branches, with the corresponding colors. Down: the forest representing this piece of walk
Soliton Decomposition of Configurations with Infinitely Many Balls For the same walk ξ with excursion components $(\varepsilon^i)_{i\in\mathbb{Z}}$, consider $(x^i)_{i\in\mathbb{Z}}$, the family of slot diagrams associated to those excursions. Recall that $\varepsilon^i$ is the excursion between Record i and Record i + 1. We define the vector $\zeta \in ((\mathbb{N}\cup\{0\})^{\mathbb{Z}})^{\mathbb{N}}$ obtained by concatenation of the k-components of the $x^i$ as follows:
\[ \begin{aligned} S_k^0 &:= 0, && k \ge 1,\\ S_k^{i'} - S_k^{i} &:= s_k^{i} + \dots + s_k^{i'-1}, && i < i',\ k \ge 1,\\ \zeta_k(S_k^i + j) &:= x_k^i(j), && j \in \{0, \dots, s_k^i - 1\},\ k \ge 1. \end{aligned} \tag{43} \]
The components of ζ are $\zeta_k(j) \in \mathbb{N}\cup\{0\}$ with $k \in \mathbb{N}$ and $j \in \mathbb{Z}$. For example, Fig. 31 contains a piece of ξ between Record −1 and Record 6. The excursions $\varepsilon^{-1}, \varepsilon^{1}, \varepsilon^{3}, \varepsilon^{4}$ are empty, so $s_k^i = 1$ and $x_k^i(0) = 0$ for $k \ge 1$ and $i = -1, 1, 3, 4$. The corresponding slot diagrams are
\[ \begin{aligned} &x^{-1} = x^{1} = x^{3} = x^{4} = \emptyset,\\ &x_3^0 = (1), \quad x_2^0 = (0,0,0), \quad x_1^0 = (0,0,2,0,1),\\ &x_1^2 = (2),\\ &x_2^5 = (1), \quad x_1^5 = (0,0,0). \end{aligned} \tag{44} \]
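The concatenation of the k-components can be sketched as follows, using the slot diagrams of the example above. The assumptions are that each level of an excursion, including levels above its maximal soliton size and all levels of an empty excursion, contributes at least the single record slot with value 0, and that the offsets are the cumulative row lengths with slot 0 of excursion 0 placed at position 0. The function name and data layout are illustrative.

```python
def concatenate(diagrams, k):
    """diagrams maps the excursion index i (consecutive integers containing 0)
    to its slot diagram {k: [x_k(0), ...]}. Returns {j: zeta_k(j)}."""
    def row(i):
        # Levels without solitons consist of the record slot alone.
        return diagrams[i].get(k, [0])

    zeta = {}
    position = 0
    for i in sorted(i for i in diagrams if i >= 0):
        for j, value in enumerate(row(i)):
            zeta[position + j] = value
        position += len(row(i))
    position = 0
    for i in sorted((i for i in diagrams if i < 0), reverse=True):
        position -= len(row(i))
        for j, value in enumerate(row(i)):
            zeta[position + j] = value
    return zeta


diagrams = {
    -1: {}, 1: {}, 3: {}, 4: {},
    0: {3: [1], 2: [0, 0, 0], 1: [0, 0, 2, 0, 1]},
    2: {1: [2]},
    5: {2: [1], 1: [0, 0, 0]},
}
zeta_1 = concatenate(diagrams, 1)
print([zeta_1[j] for j in sorted(zeta_1)])
print(concatenate(diagrams, 2)[7])
```

The value found at position 7 of the 2-component agrees with the one quoted in the caption of Fig. 33.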
So the maximal soliton size in the slot diagram $x^i$ is $m^i = 0$ for $i \in \{-1, 1, 3, 4\}$, $m^0 = 3$, $m^2 = 1$ and $m^5 = 2$.
Young Diagram To better explain graphically the definitions (43) and construct the piece of configuration ζ corresponding to the above excursions, we associate a Young diagram to each slot diagram, as follows: for each soliton size k in the slot diagram x pile one row of size $s_k$ for $k \le m$ and one row of length 1 for all $k > m$. We end up with an infinite column at slot 0 and all k-slots with the same number piled in the same column. Taking the vertical coordinate as k and the horizontal coordinate as j, in box (j, k) put $x_k(j)$. Figure 32 shows the Young diagrams corresponding to (44). Once we have the slot diagrams of the excursions of ξ as decorated Young tableaux, to obtain ζ it suffices to glue the rows of the same height into a unique row justified by column 0, as in Fig. 33.
We call $\mathcal{A}^+$ the set of α such that the mean excursion size under $\nu_\alpha$ is finite:
\[ \mathcal{A}^+ := \Big\{ \alpha : \sum_{k\ge1} 2k\rho_k(\alpha) < +\infty \Big\}. \tag{45} \]
By definition we have $\mathcal{A}^+ \subseteq \mathcal{A}$. We define also
\[ \mathcal{Q}^+ := \Big\{ q : \sum_{k\ge1} k(1-q_k) < +\infty \Big\}. \]
Fig. 32 Young shape of slot diagrams of excursions −1 to 5 of Fig. 31. Dots at top of columns mean that this is an infinite column of zeroes from that point up
Content of Fig. 33 (rows: k-component; columns: slot number j; rows above k = 4 contain only zeros):
k \ j | −1 0 1 2 3 4 5 6 7 8 9 10 11
4 | 0 0 0 0 0 0 0
3 | 0 1 0 0 0 0 0
2 | 0 0 0 0 0 0 0 0 1
1 | 0 0 0 2 0 1 0 2 0 0 0 0 0
Fig. 33 Justify the slots diagrams with the column at the 0-slot. The result is the piece of configuration ζ produced by the excursions −1 to 5. In the vertical coordinate the k-component, in the horizontal coordinate, the slot number. For example ζ2 (7) = 1
Theorem 7 (From Independent Solitons to Independent IID Geometrics [6]) If $\alpha \in \mathcal{A}^+$ and $(\varepsilon^i)_{i\in\mathbb{Z}}$ are i.i.d. excursions with distribution $\nu_\alpha$, then $(\zeta_k)_{k\in\mathbb{N}} \in ((\mathbb{N}\cup\{0\})^{\mathbb{Z}})^{\mathbb{N}}$, as defined in (43), is a family of independent configurations and for each k, $(\zeta_k(j))_{j\in\mathbb{Z}}$ are i.i.d. random variables with distribution Geometric($q_k$), where $A^{-1}\alpha = q \in \mathcal{Q}^+$.
Proof A part of the proof is a direct consequence of (33)–(35). For the proof of the remaining statements see [6].
Acknowledgments We thank Leo Rolla for many fruitful discussions and Jean François Le Gall for pointing out relevant references on excursion trees. This project started when PAF was visiting GSSI at L'Aquila in 2016. He is grateful for the hospitality and support. Part of this project was developed during the stay of the authors at the Institut Henri Poincaré - Centre Émile Borel during the trimester Stochastic Dynamics Out of Equilibrium. We thank this institution for hospitality and support.
References 1. Cao, X., Bulchandani, V.B., Spohn, H.: The GGE averaged currents of the classical Toda chain (2019). arXiv:1905.04548 2. Croydon, D.A., Sasada, M.: Invariant measures for the box-ball system based on stationary Markov chains and periodic Gibbs measures (2019). arXiv:1905.00186 3. Croydon, D.A., Kato, T., Sasada, M., Tsujimoto, S.: Dynamics of the box-ball system with random initial conditions via Pitman’s transformation (2018). arXiv:1806.02147 4. Dwass, M.: Branching processes in simple random walk. Proc. Amer. Math. Soc. 51, 270–274 (1975) 5. Evans, S.N.: Probability and Real Trees, vol. 1920. Lecture Notes in Mathematics. Lectures from the 35th Summer School on Probability Theory held in Saint-Flour, July 6–23, 2005. Springer, Berlin (2008) 6. Ferrari, P.A., Gabrielli, D.: BBS invariant measures with independent soliton components. Electron. J. Probab. 25, Paper No. 78–1 (2020)
7. Ferrari, P.A., Nguyen, C., Rolla, L., Wang, M.: Soliton decomposition of the box-ball system (2018). arXiv:1806.02798
8. Inoue, R., Kuniba, A., Takagi, T.: Integrable structure of box–ball systems: crystal, Bethe ansatz, ultradiscretization and tropical geometry. J. Phys. A Math. Theor. 45(7), 073001 (2012)
9. Kato, T., Tsujimoto, S., Zuk, A.: Spectral analysis of transition operators, automata groups and translation in BBS. Commun. Math. Phys. 350(1), 205–229 (2017)
10. Kawazu, K., Watanabe, S.: Branching processes with immigration and related limit theorems. Teor. Verojatnost. i Primenen. 16, 34–51 (1971)
11. Kuniba, A., Lyu, H.: One-sided scaling limit of multicolor box-ball system (2018). arXiv:1808.08074
12. Le Gall, J.-F.: Une approche élémentaire des théorèmes de décomposition de Williams. In: Séminaire de Probabilités, XX, 1984/85, vol. 1204, pp. 447–464. Lecture Notes in Mathematics. Springer, Berlin (1986)
13. Le Gall, J.-F.: Random trees and applications. Probab. Surv. 2, 245–311 (2005)
14. Levine, L., Lyu, H., Pike, J.: Double jump phase transition in a random soliton cellular automaton (2017). arXiv:1706.05621
15. Mada, J., Idzumi, M., Tokihiro, T.: The exact correspondence between conserved quantities of a periodic box-ball system and string solutions of the Bethe ansatz equations. J. Math. Phys. 47(5), 053507 (2006)
16. Stanley, R.P.: Catalan Numbers. Cambridge University Press, New York (2015)
17. Takahashi, D., Satsuma, J.: A soliton cellular automaton. J. Phys. Soc. Jpn. 59(10), 3514–3519 (1990)
18. Tokihiro, T., Takahashi, D., Matsukidaira, J., Satsuma, J.: From soliton equations to integrable cellular automata through a limiting procedure. Phys. Rev. Lett. 76(18), 3247 (1996)
19. Yoshihara, D., Yura, F., Tokihiro, T.: Fundamental cycle of a periodic box-ball system. J. Phys. A 36(1), 99–121 (2003)
20. Yura, F., Tokihiro, T.: On a periodic soliton cellular automaton. J. Phys. A Math. Gen. 35(16), 3787 (2002)
Invertibility of Infinitely Divisible Continuous-Time Moving Average Processes Orimar Sauri
Abstract This paper studies the invertibility property of continuous-time moving average processes driven by a Lévy process. We provide sufficient conditions for the recovery of the driving noise. Our assumptions are specified via the kernel involved and the characteristic triplet of the background driving Lévy process.
Keywords Moving average processes · Infinitely divisible processes · Invertibility of stationary processes · Causality · Lévy semistationary processes
1 Introduction
In the context of time series, the concept of invertibility of stochastic processes refers to the task of recovering the driving noise from the observed series. Such a property plays an important role in the characterization of the notion of causality, which is the principle that the current state of a given system is not influenced by its future states. Invertibility and causality are well understood in the discrete-time framework; in particular, for moving average processes, necessary and sufficient conditions for invertibility and causality have been established in terms of the moving average coefficients. See for instance Brockwell and Davis [8]. Motivated by this framework, the main goal of the present paper is to study the invertibility property of the class of continuous-time moving average processes driven by a Lévy process, that is, the observed process $(X_t)_{t\in\mathbb{R}}$ admits the spectral representation
\[ X_t := \int_{\mathbb{R}} f(t-s)\,dL_s, \quad t\in\mathbb{R}, \tag{1} \]
O. Sauri
Department of Mathematical Sciences, Aalborg University, Aalborg, Denmark
CREATES, Aarhus University, Aarhus, Denmark
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 S. I. López et al. (eds.), XIII Symposium on Probability and Stochastic Processes, Progress in Probability 75, https://doi.org/10.1007/978-3-030-57513-7_6
where f is a measurable function, often called the kernel, and L is a Lévy process. Our main result states that the process X is invertible, for a certain class of Lévy processes, whenever the Fourier transform of f does not vanish, which is in essence the analogue of the condition in the discrete-time setting. We would like to emphasize that the Lévy processes we consider in our results need not be square integrable. See Sect. 3 for more details.
Observe that the process X is infinitely divisible in the sense of Barndorff-Nielsen et al. [3, 4]. In statistical terms, the kernel f models the autocorrelation structure of X while L describes its distributional properties. Thus, X can be used as a flexible model that is able to reproduce many of the stylized properties found in empirical data, such as fat tails and local Gaussianity (mixed Gaussian distributions). Hence, from the modeling perspective, invertibility provides a simple way to identify (in a one-to-one relation) and estimate the law of X by L, and vice versa.
Several authors have investigated the invertibility problem for continuous-time processes. For instance, Comte and Renault [11] studied the invertibility and causality of Gaussian Volterra processes, which are those processes that can be written as in (1) but with f(t − s) replaced by f(t, s), and L by a Brownian motion. Under smoothness assumptions on the kernel, the authors provided necessary and sufficient conditions for the invertibility and causality of this type of process. In the non-Gaussian case, Cohen and Maejima [10] established the invertibility property for the family of fractional Lévy processes in the case when L is centered and has finite second moment. In the stationary framework, Brockwell and Lindner [9] considered the continuous-time version of the classical ARMA processes. In their setup, the authors gave necessary and sufficient conditions (which turned out to be analogous to those for the classical ARMA) for the causality and invertibility of this family. Recently, Basse-O'Connor et al. [6] studied the solutions of ARMA-type stochastic differential equations. The authors showed that when the solution exists, it can be written as in (1) and, under extra regularity conditions, such a solution is invertible and causal. The previous situations are contained in our framework.
The present paper is organized as follows. Section 2 introduces the basic notation and some background on infinite divisibility, stochastic integration with respect to Lévy processes, and Orlicz spaces. In Sect. 3, we present our main result and we discuss several important examples. Section 4 concludes.
2 Preliminaries and Basic Results
Throughout this paper, $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\in\mathbb{R}}, \mathbb{P})$ denotes a filtered probability space satisfying the usual conditions of right-continuity and completeness. For $p \ge 0$, we denote by $L^p(\Omega, \mathcal{F}, \mathbb{P})$ the space of p-integrable random variables endowed with convergence in p-mean for $p > 0$ and convergence in probability for the case when $p = 0$.
A two-sided $\mathbb{R}^d$-valued Lévy process $(L_t)_{t\in\mathbb{R}}$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is a stochastic process taking values in $\mathbb{R}^d$ with independent and stationary increments whose sample paths are almost surely càdlàg. We say that $(L_t)_{t\in\mathbb{R}}$ is an $(\mathcal{F}_t)$-Lévy process if for all $t > s$, $L_t - L_s$ is $\mathcal{F}_t$-measurable and independent of $\mathcal{F}_s$. By $ID(\mathbb{R}^d)$ we mean the space of infinitely divisible distributions on $\mathbb{R}^d$. Any Lévy process is infinitely divisible and $L_1$ has a Lévy-Khintchine representation, relative to a truncation function $\tau$, given by
\[ \log \hat{\mu}(z) = i\langle z, \gamma_\tau\rangle - \frac{1}{2}\langle z, Bz\rangle + \int_{\mathbb{R}^d}\left[e^{i\langle z, x\rangle} - 1 - i\langle \tau(x), z\rangle\right]\nu(dx), \quad z\in\mathbb{R}^d, \]
where $\hat{\mu}$ is the characteristic function of the law of $L_1$, $\gamma_\tau\in\mathbb{R}^d$, $B$ is a symmetric nonnegative definite matrix in $\mathbb{R}^{d\times d}$, and $\nu$ is a Lévy measure, i.e. $\nu(\{0_d\}) = 0$, with $0_d$ denoting the origin in $\mathbb{R}^d$, and $\int_{\mathbb{R}^d}(1\wedge|x|^2)\,\nu(dx) < \infty$. Here, we assume that the truncation function $\tau$ is given by
\[ \tau(x_1,\ldots,x_d) = \left(\frac{x_i}{1\vee|x_i|}\right)_{i=1}^{d}, \quad (x_1,\ldots,x_d)\in\mathbb{R}^d. \]
An infinitely divisible continuous-time moving average (IDCMA) process is a stochastic process $(X_t)_{t\in\mathbb{R}}$ on $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\in\mathbb{R}}, \mathbb{P})$ given by the following formula
\[ X_t := \int_{\mathbb{R}} f(t-s)\,dL_s, \quad t\in\mathbb{R}, \tag{2} \]
where f is a deterministic function and L is a Lévy process with triplet $(\gamma_\tau, B, \nu)$. IDCMA processes belong to the class of Lévy semistationary (LSS) processes, which are those processes $(Y_t)_{t\in\mathbb{R}}$ described by the following dynamics
\[ Y_t = \theta + \int_{-\infty}^{t} g(t-s)\,\sigma_s\,dL_s + \int_{-\infty}^{t} q(t-s)\,a_s\,ds, \quad t\in\mathbb{R}, \tag{3} \]
where $\theta\in\mathbb{R}^d$, L is a Lévy process, g and q are deterministic functions such that $g(x) = q(x) = 0$ for $x \le 0$, and $\sigma$ and $a$ are adapted càdlàg processes. For further references on the theory and applications of Lévy semistationary processes, see Barndorff-Nielsen et al. [2] and references therein.
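To make formula (2) concrete, the following is a minimal simulation sketch, not taken from the paper. It assumes a real-valued driver of the form Brownian part plus compound Poisson part with standard normal jumps (an illustrative choice), truncates the integral over $\mathbb{R}$ to a finite window, and replaces it with a left-endpoint Riemann sum. The names `levy_increments` and `idcma_riemann` are hypothetical.

```python
import math
import random


def levy_increments(n, h, sigma=1.0, jump_rate=2.0, rng=None):
    """Increments of L over n consecutive steps of length h.

    Assumption (illustrative only): L = sigma * W + compound Poisson process
    with intensity jump_rate and standard normal jump sizes.
    """
    rng = rng or random.Random(0)
    incs = []
    for _ in range(n):
        gaussian = sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)
        # Jumps inside one step: accumulate exponential inter-arrival times.
        jumps, clock = 0.0, rng.expovariate(jump_rate)
        while clock <= h:
            jumps += rng.gauss(0.0, 1.0)
            clock += rng.expovariate(jump_rate)
        incs.append(gaussian + jumps)
    return incs


def idcma_riemann(kernel, incs, h, t0):
    """Riemann-sum approximation of X_t = int f(t - s) dL_s.

    incs[k] is the increment of L over (t0 + k*h, t0 + (k+1)*h]; the kernel is
    evaluated at the left endpoint s_k = t0 + k*h. Returns X on the grid
    t_j = t0 + j*h, j = 0, ..., len(incs). Noise outside the window is ignored.
    """
    n = len(incs)
    xs = []
    for j in range(n + 1):
        t = t0 + j * h
        xs.append(sum(kernel(t - (t0 + k * h)) * incs[k] for k in range(n)))
    return xs
```

For instance, `idcma_riemann(lambda x: math.exp(-x) if x > 0 else 0.0, levy_increments(1000, 0.01), 0.01, 0.0)` gives a rough path of an Ornstein–Uhlenbeck-type moving average started with no past noise.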
2.1 Stochastic Integrals and Orlicz Spaces
In the following, we present a short review of Rajput and Rosiński [17] and Sato [19] concerning the existence of stochastic integrals of the form $\int_{\mathbb{R}} f(s)\,dL_s$, where $f : \mathbb{R}\to\mathbb{R}$ is a measurable function and L is a Lévy process, as well as the connection of such integrals with the so-called Orlicz spaces.
Let L be an $\mathbb{R}^d$-valued Lévy process with characteristic triplet $(\gamma_\tau, B, \nu)$. The space of simple functions on $\mathbb{R}$ will be denoted by $\vartheta$. Thus, $f\in\vartheta$ if and only if f can be written as
\[ f = \sum_{i=1}^{k} a_i \mathbf{1}_{(s_i, t_i]}, \]
where $s_i \le t_i$ and $a_i\in\mathbb{R}$ for $i = 1,\ldots,k$. For any $f\in\vartheta$, the integral of f with respect to (w.r.t. for short) L is defined as
\[ \int_{\mathbb{R}} f(s)\,dL_s := \sum_{i=1}^{k} a_i\,(L_{t_i} - L_{s_i}). \]
We will say that f is L-integrable if there exists a sequence $(f_n)_{n\ge 1}\subseteq\vartheta$ such that, as $n\to\infty$, $f_n\to f$ almost everywhere and the sequence $\int_{\mathbb{R}} f_n(s)\,dL_s$ has a limit in probability. In such a situation we write
\[ \int_{\mathbb{R}} f(s)\,dL_s := \mathbb{P}\text{-}\lim_{n\to\infty}\int_{\mathbb{R}} f_n(s)\,dL_s. \]
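The definition of the integral for simple functions can be sketched in a few lines. The example below is only an illustration under stated assumptions: the driving path is a real-valued pure-jump càdlàg step function with finitely many jumps, and the helper names `step_path` and `integrate_simple` are hypothetical.

```python
from bisect import bisect_right


def step_path(jump_times, jump_sizes):
    """Càdlàg pure-jump path L_t = sum of jump_sizes[i] over jump_times[i] <= t.

    jump_times must be sorted in increasing order.
    """
    cumulative = [0.0]
    for size in jump_sizes:
        cumulative.append(cumulative[-1] + size)

    def path(t):
        # bisect_right counts the jump times that are <= t.
        return cumulative[bisect_right(jump_times, t)]

    return path


def integrate_simple(pieces, path):
    """Integral of f = sum_i a_i * 1_{(s_i, t_i]} against the path.

    pieces is a list of triples (a_i, s_i, t_i); the result is
    sum_i a_i * (L_{t_i} - L_{s_i}).
    """
    return sum(a * (path(t) - path(s)) for a, s, t in pieces)
```

Because the intervals are half-open on the left, a jump occurring exactly at $t_i$ is counted in $(s_i, t_i]$ and a jump at $s_i$ is not, which matches the càdlàg convention.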
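In Rajput and Rosiński [17], cf. Sato [19], it has been shown that f is L-integrable and $\int_{\mathbb{R}} f(s)\,dL_s\in L^p(\Omega, \mathcal{F}, \mathbb{P})$ if and only if $\int_{\mathbb{R}}\Phi_p^{(\gamma_\tau,B,\nu)}(f(s))\,ds < \infty$, where
\[ \Phi_p^{(\gamma_\tau,B,\nu)}(u) := V(u) + \mathrm{tr}(B)u^2 + \int_{\mathbb{R}^d}\left[\|ux\|^2\mathbf{1}_{\{\|ux\|\le 1\}} + \|ux\|^p\mathbf{1}_{\{\|ux\|>1\}}\right]\nu(dx), \quad u\in\mathbb{R}, \tag{4} \]
with
\[ V(u) := \left\|\gamma_\tau u + \int_{\mathbb{R}^d}\left[\tau(ux) - u\tau(x)\right]\nu(dx)\right\|, \quad u\in\mathbb{R}. \]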
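Observe that for $p > 0$, $\Phi_p^{(\gamma_\tau,B,\nu)}$ is well defined if and only if $\int_{\|x\|>1}\|x\|^p\,\nu(dx) < \infty$. For the rest of this paper the space of L-integrable functions will be denoted by
\[ L_{\Phi_p}^{(\gamma_\tau,B,\nu)} := \left\{f : (\mathbb{R}, \mathcal{B}(\mathbb{R}))\to(\mathbb{R}, \mathcal{B}(\mathbb{R})) : \int_{\mathbb{R}}\Phi_p^{(\gamma_\tau,B,\nu)}(f(s))\,ds < \infty\right\}. \tag{5} \]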
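As a numerical illustration of (4) and (5), the sketch below evaluates the modular function in a deliberately simple special case. The assumptions are ours, not the paper's: $d = 1$, $\gamma_\tau = 0$, and a symmetric, finitely supported Lévy measure $\nu = \sum_j w_j\delta_{x_j}$, so that the drift term $V$ vanishes because the truncation function is odd. The names `phi_p` and `modular` are hypothetical.

```python
import math


def phi_p(u, p, sigma2, atoms):
    """Phi_p(u) from (4) for d = 1 under the assumptions gamma_tau = 0 and
    nu = sum_j w_j * delta_{x_j} symmetric, so that V(u) = 0.

    sigma2 plays the role of tr(B); atoms is a list of pairs (x_j, w_j) and
    should contain -x_j with the same weight whenever it contains x_j.
    """
    total = sigma2 * u * u
    for x, w in atoms:
        y = abs(u * x)
        total += w * (y * y if y <= 1.0 else y ** p)
    return total


def modular(f, p, sigma2, atoms, a, b, n=20000):
    """Midpoint-rule approximation of int_a^b Phi_p(f(s)) ds."""
    h = (b - a) / n
    return h * sum(phi_p(f(a + (i + 0.5) * h), p, sigma2, atoms) for i in range(n))
```

With `atoms = [(1.0, 1.0), (-1.0, 1.0)]`, `sigma2 = 1.0` and `p = 2`, the function reduces to a multiple of $u^2$, in line with the square-integrable case, and `modular(lambda s: math.exp(-s), 2, 1.0, atoms, 0.0, 30.0)` approximates the integrability criterion for the exponential kernel on a truncated range.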
In the case when functions are identified up to a null set, $L_{\Phi_p}^{(\gamma_\tau,B,\nu)}$ happens to be a complete linear metric space in which $\vartheta$ is dense. It is worth mentioning that the metric in $L_{\Phi_p}^{(\gamma_\tau,B,\nu)}$ is induced by the F-norm
\[ \|f\|_{\Phi_p}^{(\gamma_\tau,B,\nu)} = \inf\left\{c > 0 : \int_{\mathbb{R}}\tilde{\Phi}_p(c^{-1}|f(s)|)\,ds \le c\right\}, \]
where
\[ \tilde{\Phi}_p(u) := \sup_{|y|\le 1} V(yu) + \mathrm{tr}(B)u^2 + \int_{\mathbb{R}^d}\left[\|ux\|^2\mathbf{1}_{\{\|ux\|\le 1\}} + \|ux\|^p\mathbf{1}_{\{\|ux\|>1\}}\right]\nu(dx), \quad u\in\mathbb{R}. \]
Furthermore, $f_n\to 0$ as $n\to\infty$ in $L_{\Phi_p}^{(\gamma_\tau,B,\nu)}$ if and only if $\int_{\mathbb{R}}\Phi_p^{(\gamma_\tau,B,\nu)}(f_n(s))\,ds\to 0$. For more details on the properties discussed above, we refer the reader to Rajput and Rosiński [17], cf. Musielak [15].
In general, $L_{\Phi_p}^{(\gamma_\tau,B,\nu)}$ does not admit a norm. However, under certain conditions on $\Phi_p^{(\gamma_\tau,B,\nu)}$, $L_{\Phi_p}^{(\gamma_\tau,B,\nu)}$ becomes an Orlicz space, which is a certain type of Banach space. Hence, we now present some properties of such spaces. For a detailed presentation see Rao and Ren [18].
A mapping $\Phi : \mathbb{R}\to[0,\infty]$ is said to be a Young function if it is even, convex, with $\Phi(s) = 0$ if and only if $s = 0$, and such that $\lim_{s\to\infty}\Phi(s) = +\infty$. Given a Young function $\Phi$, the mapping
\[ \Psi(x) := \sup_{y\ge 0}\left\{|x|\,y - \Phi(y)\right\}, \quad x\in\mathbb{R}, \tag{6} \]
defines a new Young function, which is termed the complementary function of $\Phi$. We say that a measurable function $\Phi$ fulfills the $\Delta_2$-condition if there exists $K > 0$ such that $\Phi(2x)\le K\Phi(x)$ for all $x\in\mathbb{R}$. For a given Young function $\Phi$ satisfying the $\Delta_2$-condition, let
\[ L^{\Phi} := \left\{f : (\mathbb{R}, \mathcal{B}(\mathbb{R}))\to(\mathbb{R}, \mathcal{B}(\mathbb{R})) : \int_{\mathbb{R}}\Phi(|f(s)|)\,ds < \infty\right\}. \]
Within this framework, $L^{\Phi}$ is a separable Banach space equipped with the Luxemburg norm
\[ \|f\|_{\Phi} := \inf\left\{a > 0 : \int_{\mathbb{R}}\Phi(a^{-1}|f(s)|)\,ds \le 1\right\}, \tag{7} \]
when equivalent functions are identified almost everywhere. $L^{\Phi}$ is known as the Orlicz space associated to $\Phi$.
By $\mathcal{S}(\mathbb{R})$ we mean the space of rapidly decaying test functions, i.e. $\phi\in\mathcal{S}(\mathbb{R})$ if it is infinitely continuously differentiable and for any $n\ge 1$ and $m\ge 0$, the mapping $x\mapsto\phi^{(m)}(x)x^n$ is bounded on $\mathbb{R}$, where $\phi^{(m)}$ denotes the derivative of order m of $\phi$. The space of tempered distributions, which we denote by $\mathcal{S}'(\mathbb{R})$, is the topological dual of $\mathcal{S}(\mathbb{R})$. For more details on the theory of tempered distributions we refer to Duistermaat and Kolk [12]. Fix a non-trivial Young function $\Phi$, i.e. $\Phi(x)\ne+\infty$, $x > 0$, satisfying the $\Delta_2$-condition.
We have the following connections between Orlicz spaces and the space of tempered distributions:
1. Let $f\in L^{\Phi}$. Then f is locally integrable and, by Jensen's inequality, for any $n > 1$
\[ \int_{\mathbb{R}}\frac{f(s)}{(1+|s|)^n}\,ds \le c_n\int_{\mathbb{R}}\Phi(|f(s)|)\,ds < \infty. \]
The latter, according to Duistermaat and Kolk [12, p. 189], gives us that $L^{\Phi}\subseteq\mathcal{S}'(\mathbb{R})$.
2. If $f\in L^{\Phi}$ and $g\in L^{\Psi}$, then for any $t\in\mathbb{R}$
\[ \int_{\mathbb{R}}|f(t-s)g(s)|\,ds \le 2\,\|f\|_{\Phi}\,\|g\|_{\Psi}. \]
For a proof see [18, p. 58].
3. By the previous point, if $f\in L^{\Phi}$ and $g\in L^{\Psi}$, we get that for any $n > 1$
\[ \int_{\mathbb{R}}\frac{f*g(s)}{(1+|s|)^n}\,ds < \infty, \]
which means that the distribution induced by $f*g$ belongs to $\mathcal{S}'(\mathbb{R})$.
The next result identifies $(L^{\Phi})'$, the dual of $L^{\Phi}$.
Theorem 1 (Rao and Ren [18, p. 105]) The dual of $L^{\Phi}$ is isometrically isomorphic to $L^{\Psi}$, where $\Psi$ is as in (6). More precisely, for any $T\in(L^{\Phi})'$ there exists a unique $g\in L^{\Psi}$ such that
\[ T(f) = \int_{\mathbb{R}} f(s)g(s)\,ds, \quad f\in L^{\Phi}. \]
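As a worked instance of the complementary function in (6), consider the power function $\Phi(x) = |x|^p/p$ with $1 < p < \infty$; this particular choice is our own illustration and is not singled out in the text. It is a Young function satisfying the $\Delta_2$-condition, since $\Phi(2x) = 2^p\Phi(x)$.

```latex
% Assumption: \Phi(x) = |x|^p / p with 1 < p < \infty (illustrative choice).
\[
\Psi(x) = \sup_{y \ge 0}\Big\{ |x|\,y - \frac{y^{p}}{p} \Big\}.
\]
% The map y -> |x| y - y^p / p is concave on [0, \infty) and its derivative
% |x| - y^{p-1} vanishes at y^* = |x|^{1/(p-1)}, so the supremum is attained there:
\[
\Psi(x) = |x|\,|x|^{\frac{1}{p-1}} - \frac{1}{p}\,|x|^{\frac{p}{p-1}}
        = \Big(1 - \frac{1}{p}\Big)|x|^{q}
        = \frac{|x|^{q}}{q},
\qquad q = \frac{p}{p-1}.
\]
```

In this case $L^{\Phi}$ and $L^{\Psi}$ coincide as sets with the Lebesgue spaces $L^p(dx)$ and $L^q(dx)$, so Theorem 1 recovers, up to equivalent norms, the classical duality between these two spaces.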
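Recall that in a Banach space $(\mathcal{X}, \|\cdot\|_{\mathcal{X}})$, a collection $F = (f_\alpha)_{\alpha\in\Lambda}$ is said to be dense if $\overline{F} = \mathcal{X}$ under the norm $\|\cdot\|_{\mathcal{X}}$. From the previous theorem and the Hahn-Banach Theorem we get:
Corollary 1 A collection $F = (f_\alpha)_{\alpha\in\Lambda}\subset L^{\Phi}$ is dense in $L^{\Phi}$ if and only if
\[ \int_{\mathbb{R}} f_\alpha(s)g(s)\,ds = 0, \quad \forall\,\alpha\in\Lambda, \]
with $g\in L^{\Psi}$, implies that $g\equiv 0$ almost everywhere.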
We turn back to the stochastic integral discussed above. Fix $p\ge 0$ and suppose that $\Phi_p^{(\gamma_\tau,B,\nu)}$ is comparable to a Young function, that is, there are $c, C > 0$ and a Young function $\Phi$ such that
\[ c\,\Phi(x) \le \Phi_p^{(\gamma_\tau,B,\nu)}(x) \le C\,\Phi(x), \quad x\ge 0. \tag{8} \]
Since $\Phi_p^{(\gamma_\tau,B,\nu)}$ satisfies the $\Delta_2$-condition (see [17]), we conclude that in this case $L^{\Phi} = L_{\Phi_p}^{(\gamma_\tau,B,\nu)}$ is a Banach space.
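When (8) holds, the relevant norm is the Luxemburg norm (7). The following sketch shows one way to evaluate it numerically: since the modular $a\mapsto\int\Phi(a^{-1}|f(s)|)\,ds$ is nonincreasing in $a$, the infimum can be located by bisection. The routine is only an approximation under our own assumptions: f is supported in a bounded interval $[a, b]$, the integral is replaced by a midpoint rule, and the name `luxemburg_norm` is hypothetical.

```python
def luxemburg_norm(f, young, a, b, n=4000, tol=1e-10):
    """Approximate inf{c > 0 : int_a^b young(|f(s)| / c) ds <= 1}.

    The integral is replaced by a midpoint rule with n cells, and the
    infimum is located by bisection on c.
    """
    h = (b - a) / n
    values = [abs(f(a + (i + 0.5) * h)) for i in range(n)]
    if all(v == 0.0 for v in values):
        return 0.0

    def modular(c):
        return h * sum(young(v / c) for v in values)

    lo, hi = 0.0, 1.0
    while modular(hi) > 1.0:  # enlarge hi until the constraint is satisfied
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if modular(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi
```

For the Young function `lambda x: x * x` the result should agree with the usual $L^2$ norm of f on $[a, b]$, which gives a simple sanity check of the routine.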
Remark 1 We observe the following:
1. Although the Lévy processes under consideration are $\mathbb{R}^d$-valued, the space $(L^{\Phi}, \|\cdot\|_{\Phi})$ contains only real-valued functions.
2. From Kaminska [14], an Orlicz space $(L^{\Phi}, \|\cdot\|_{\Phi})$ is isometric to some Hilbert space if and only if $\Phi(x) = kx^2$ for some $k > 0$. Therefore, $L_{\Phi_p}^{(\gamma_\tau,B,\nu)}$ is comparable to a Hilbert space if and only if L is centered and square integrable.
The following properties of the stochastic integral defined above will be useful for the rest of the paper; see [17] for a proof:
Theorem 2 Let $(L_t)_{t\in\mathbb{R}}$ be a Lévy process with characteristic triplet $(\gamma_\tau, B, \nu)$. Then, as $n\to\infty$:
1. The mapping $f\in L_{\Phi_p}^{(\gamma_\tau,B,\nu)}\mapsto\int_{\mathbb{R}} f(s)\,dL_s\in L^p(\Omega, \mathcal{F}, \mathbb{P})$ is continuous, i.e. if $f_n\to f$ in $L_{\Phi_p}^{(\gamma_\tau,B,\nu)}$, then $\int_{\mathbb{R}} f_n(s)\,dL_s\to\int_{\mathbb{R}} f(s)\,dL_s$ in $L^p(\Omega, \mathcal{F}, \mathbb{P})$;
2. If L is symmetric, then $f\in L_{\Phi_p}^{(\gamma_\tau,B,\nu)}\mapsto\int_{\mathbb{R}} f(s)\,dL_s\in L^p(\Omega, \mathcal{F}, \mathbb{P})$ is an isomorphism between $L_{\Phi_p}^{(\gamma_\tau,B,\nu)}$ and $L^p(\Omega, \mathcal{F}, \mathbb{P})$, that is, if $\int_{\mathbb{R}} f_n(s)\,dL_s\to 0$ in probability, then $f_n\to 0$ in $L_{\Phi_p}^{(\gamma_\tau,B,\nu)}$. Moreover,
\[ \overline{\mathrm{span}}\{L_t - L_s : s\le t\} = \left\{\int_{\mathbb{R}} f(s)\,dL_s : f\in L^{\Phi}\right\}, \]
where the closure is taken in $L^p(\Omega, \mathcal{F}, \mathbb{P})$.
3 Invertibility of IDCMA Processes
In this section we present the main result of this paper. Let us start by recalling the notions of invertibility and causality in the time series framework. Let $(X_t)_{t\in\mathbb{Z}}$ be a discrete-time moving average process, i.e.
\[ X_t = \sum_{j\in\mathbb{Z}}\theta_j\,\varepsilon_{t-j} = \Theta(B)\,\varepsilon_t, \quad t\in\mathbb{Z}, \]
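where the process $(\varepsilon_t)_{t\in\mathbb{Z}}$ is a mean-zero weakly stationary white noise, [...] $\infty$, B is the lag operator and
\[ \Theta(z) = \sum_{j\in\mathbb{Z}}\theta_j z^j. \]
[...]
\[ X_t = -\sum_{j\ge 0}\theta^{-j}\,\varepsilon_{t+j}, \quad t\in\mathbb{Z}. \tag{12} \]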
Note that in (11), X only depends on the past innovations of ε, in contrast to (12), in which X is expressed in terms of the future innovations of ε. When X admits a representation as in (11), it is called causal, and in the case of (12) it is called non-causal. Despite this behavior, it is obvious that ε only depends on the
past innovations of X, i.e. ε admits a causal representation. This property is usually called invertibility in the causal sense. In analogy with the discrete-time framework, we introduce the notion of invertibility for an IDCMA.
Definition 1 Let X be as in (2). X is said to be invertible on $L^p(\Omega, \mathcal{F}, \mathbb{P})$ for some $p\ge 0$ if $L_t - L_s\in\overline{\mathrm{span}}\{X_u\}_{u\in\mathbb{R}}$ for any $t > s$, where the closure is taken in $L^p(\Omega, \mathcal{F}, \mathbb{P})$. In the same context, we say that X is invertible in the causal sense if $L_t - L_s\in\overline{\mathrm{span}}\{X_u\}_{u\le t}$ for any $t > s$.
A natural question arises: as in the discrete-time case, is $\hat{f}\ne 0$ a sufficient (necessary) condition for the invertibility of an IDCMA? In the case when (8) holds, i.e. whenever $L_{\Phi_p}^{(\gamma_\tau,B,\nu)}$ is an Orlicz space, the answer is affirmative, as the following theorem shows.
Theorem 3 Let $(L_t)_{t\in\mathbb{R}}$ be a Lévy process with characteristic triplet $(\gamma, B, \nu)$ and suppose that for some $p\ge 0$ there is a Young function $\Phi$ satisfying (8). If $f\in L^{\Phi}\cap L^1(dx)$ has non-vanishing Fourier transform, then
\[ \overline{\mathrm{span}}\{X_u\}_{u\in\mathbb{R}} = \overline{\mathrm{span}}\{L_t - L_s : s\le t\}, \quad \text{in } L^p(\Omega, \mathcal{F}, \mathbb{P}). \tag{13} \]
Before presenting the proof of this theorem, we discuss several important examples.
Example 1 (Symmetric and Integrable Lévy Processes) Suppose that L is a symmetric Lévy process with characteristic triplet $(\gamma, B, \nu)$ and such that $\mathbb{E}(\|L_1\|) < \infty$. In this situation it holds that
\[ \Phi_1(u) := \mathrm{tr}(B)u^2 + \int_{\mathbb{R}^d}\left(\|ux\|^2\wedge\|ux\|\right)\nu(dx), \quad u\in\mathbb{R}. \]
From the proof of Theorem 3.3 in [7], we have that the mapping
\[ \Phi(u) := \mathrm{tr}(B)u^2 + \int_{\mathbb{R}^d}\left[\|ux\|^2\mathbf{1}_{\{\|ux\|\le 1\}} + (2\|ux\| - 1)\mathbf{1}_{\{\|ux\|>1\}}\right]\nu(dx) \]
is convex and such that $\Phi(u)/2\le\Phi_1(u)\le\Phi(u)$, $u\in\mathbb{R}$. Therefore L satisfies the assumptions of Theorem 3 (i.e. $\Phi$ is a Young function) if $B\ne 0$, or if $B = 0$ and, as $u\to\infty$,
\[ \int_{\mathbb{R}^d}\left(\|ux\|^2\wedge\|ux\|\right)\nu(dx)\to+\infty. \]
Example 2 (Ornstein–Uhlenbeck Processes) Let L be a Lévy process with characteristic triplet $(\gamma_\tau, B, \nu)$ and put $f(s) := e^{-s}\mathbf{1}_{\{s\ge 0\}}$, $s\in\mathbb{R}$. Then X, the resulting IDCMA process, is the classical OU process driven by L. It is well known that $f\in L_{\Phi_0}^{(\gamma_\tau,B,\nu)}$ if and only if $\int_{|x|>1}\log(|x|)\,\nu(dx) < \infty$. Moreover, since $\hat{f}$, the Fourier transform of f, never vanishes, we conclude that f satisfies the assumptions of Theorem 3. Furthermore, due to the Langevin equation, it follows that X is in fact invertible in the causal sense. Now, if we consider instead the process
\[ X_t := \int_t^{\infty} e^{-(s-t)}\,dL_s, \quad t\in\mathbb{R}, \]
we get that X is not adapted but well defined provided that $\int_{|x|>1}\log(|x|)\,\nu(dx) < \infty$. Nevertheless, it is easy to check that X fulfills a sort of Langevin equation, that is, almost surely,
\[ \int_s^t X_r\,dr = L_t - L_s + X_t - X_s, \quad t\ge s. \]
Hence, we deduce that X is invertible in the causal sense. Observe that the Langevin equation holds in a pathwise sense, so for the invertibility of OU-type processes, the condition (8) is superfluous.
Example 3 (LSS with a Gamma Kernel) Denote by L a Lévy process with characteristic triplet $(\gamma_\tau, B, \nu)$. Let $\alpha > -1$ and consider
\[ f(s) := e^{-\lambda s}s^{\alpha}\mathbf{1}_{\{s>0\}}, \quad s\in\mathbb{R}. \tag{14} \]
It has been shown in [5], cf. [16], that $f\in L_{\Phi_0}^{(\gamma_\tau,B,\nu)}$ if and only if the following two conditions are satisfied:
1. $\int_{|x|>1}\log(|x|)\,\nu(dx) < \infty$,
2. One of the following conditions holds:
(a) $\alpha > -1/2$;
(b) $\alpha = -1/2$, $B = 0$ and $\int_{|x|\le 1}|x|^2\,|\log(|x|)|\,\nu(dx) < \infty$;
(c) $\alpha\in(-1, -1/2)$, $B = 0$ and $\int_{|x|\le 1}|x|^{-1/\alpha}\,\nu(dx) < \infty$.
On the other hand, if $p > 0$, we claim that $f\in L_{\Phi_p}^{(\gamma_\tau,B,\nu)}\cap L_{\Phi_0}^{(\gamma_\tau,B,\nu)}$ if and only if $\alpha p > -1$ and $\int_{\|x\|>1}\|x\|^p\,\nu(dx) < \infty$. Indeed, we first observe that there are $c, C > 0$ such that
\[ c\,\varphi_{\alpha,\lambda/2}(s)\le f(s)\le C\,\varphi_{\alpha,\lambda}(s), \quad s > 0, \]
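where
\[ \varphi_{\alpha,\lambda}(s) := \begin{cases} s^{\alpha}\mathbf{1}_{\{0<s\le 1\}} + e^{-\lambda s}\mathbf{1}_{\{s>1\}} & \text{for } -1/2 < \alpha < 0; \\ e^{-\lambda s}\mathbf{1}_{\{s\ge 0\}} & \text{for } \alpha\ge 0. \end{cases} \]
Hence $f\in L_{\Phi_p}^{(\gamma_\tau,B,\nu)}\cap L_{\Phi_0}^{(\gamma_\tau,B,\nu)}$ if and only if $\varphi_{\alpha,\lambda}\in L_{\Phi_p}^{(\gamma_\tau,B,\nu)}\cap L_{\Phi_0}^{(\gamma_\tau,B,\nu)}$. Our claim then follows by noting that for $\alpha\ge 0$
\[ \int_0^{\infty}\int_{\mathbb{R}^d}\|\varphi_{\alpha,\lambda}(s)x\|^p\,\mathbf{1}_{\{\|\varphi_{\alpha,\lambda}(s)x\|>1\}}\,\nu(dx)\,ds = \frac{1}{\lambda p}\int_{\|x\|>1}\|x\|^p\,(1 - \|x\|^{-1})\,\nu(dx), \]
while for $0 > \alpha p > -1$
\[ \int_0^{\infty}\int_{\mathbb{R}^d}\|\varphi_{\alpha,\lambda}(s)x\|^p\,\mathbf{1}_{\{\|\varphi_{\alpha,\lambda}(s)x\|>1\}}\,\nu(dx)\,ds = \frac{1}{p\alpha+1}\int_{|x|>1}\|x\|^p\,\nu(dx) + \frac{1}{p\alpha+1}\int_{|x|\le 1}\|x\|^{-1/\alpha}\,\nu(dx) + \frac{1}{\lambda p}\int_{\|x\|>e}\|x\|^p\,(\|x\|^{-1} - e)\,\nu(dx). \]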
In this case X, the associated IDCMA process, is called a Lévy semistationary process with a gamma kernel. See Pedersen and Sauri [16] for more properties of this process. Note that the Fourier transform of f is given by
\[ \hat{f}(\xi) = \frac{\Gamma(\alpha+1)}{\sqrt{2\pi}}\,\frac{1}{(\lambda+i\xi)^{\alpha+1}}, \quad \xi\in\mathbb{R}. \]
Hence, under the framework of Theorem 3, X is invertible. Furthermore, it is possible to show that if $\int_{|x|>1}\|x\|\,\nu(dx) < \infty$, then for any $-1 < \alpha < 0$, almost surely
\[ \int_0^{\infty} X_{t-u}\,\mu(du) = k_\alpha\int_{-\infty}^{t} e^{-\lambda(t-s)}\,dL_s, \quad \text{for any } t\in\mathbb{R}, \tag{15} \]
where $\mu(du) := e^{-\lambda u}u^{-\alpha-1}\mathbf{1}_{\{u\ge 0\}}\,du$ and $k_\alpha > 0$. This relation actually shows that X is invertible in the causal sense provided that $\int_{|x|>1}\|x\|\,\nu(dx) < \infty$. As a final remark, we would like to mention that Eq. (15) was originally proved in [2] for the case when L is a subordinator.
Example 4 (CARMA(p, q)) The Lévy-driven CARMA(p, q) process (continuous-time autoregressive moving average process) with parameters p > q constitutes the generalization to the continuous-time framework of the classical ARMA models in
time series. They were introduced in [9] as the stationary process given by $X_t = b^{\top}Y_t$, where Y follows the SDE
\[ dY_t = AY_t\,dt + e_p\,dL_t, \]
where L is a real-valued Lévy process with characteristic triplet $(\gamma, B, \nu)$, $b = (b_0,\ldots,b_{p-1})^{\top}$, $e_p = (0, 0,\ldots,1)^{\top}$ and
\[ A = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_p & -a_{p-1} & -a_{p-2} & \cdots & -a_1 \end{bmatrix}, \]
in which $a_1,\ldots,a_p, b_0,\ldots,b_{p-1}$ are such that $b_q\ne 0$ and $b_j = 0$ for $j > q$. The authors showed that X can be written as an IDCMA
\[ X_t = \int_{\mathbb{R}} g(t-s)\,dL_s, \quad t\in\mathbb{R}, \]
with $g(s) = b^{\top}e^{As}e_p\mathbf{1}_{\{s>0\}}$, provided that $\int_{|x|>1}\log(|x|)\,\nu(dx) < \infty$ and the roots of the polynomial $a(\lambda) = a_p + a_{p-1}\lambda + \cdots + a_1\lambda^{p-1} + \lambda^p$, $\lambda\in\mathbb{C}$, have strictly negative real part. Since in this case
\[ \hat{g}(\xi) = \frac{b(-i\xi)}{a(-i\xi)}, \quad \xi\in\mathbb{R}, \]
with $b(\lambda) = b_0 + b_1\lambda + \cdots + b_{p-1}\lambda^{p-1}$, $\lambda\in\mathbb{C}$, we conclude that the kernel of a CARMA(p, q) satisfies the assumptions of Theorem 3 if the roots of the polynomial b have non-vanishing real part, i.e. if $b(\lambda^*) = 0$ then $\mathrm{Re}\,\lambda^*\ne 0$, and a and b have no common roots. Observe that this condition coincides with Assumption 1 in [13]. For generalizations of the CARMA equation introduced above we refer to [6].
The proof of Theorem 3 is mainly based on the following lemma.
Lemma 1 Let $(L_t)_{t\in\mathbb{R}}$ be a Lévy process with characteristic triplet $(\gamma_\tau, B, \nu)$ and $\Phi$ as in Theorem 3. Let $(f_\alpha)_{\alpha\in\Lambda}\subset L^{\Phi}$. If $\mathbf{1}_{(s,t]}\in\overline{\mathrm{span}}(f_\alpha)_{\alpha\in\Lambda}$ under $\|\cdot\|_{\Phi}$ for $s\le t$, then $L_t - L_s\in\overline{\mathrm{span}}\left\{\int_{\mathbb{R}} f_\alpha(s)\,dL_s\right\}_{\alpha\in\Lambda}$ in $L^p(\Omega, \mathcal{F}, \mathbb{P})$.
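Proof If $\mathbf{1}_{(s,t]}\in\overline{\mathrm{span}}(f_\alpha)_{\alpha\in\Lambda}$ under $\|\cdot\|_{\Phi}$ for $s\le t$, then there exist $\theta^n := (\theta_i^n)_{i=1}^n\in\mathbb{R}^n$ and $\alpha^n := (\alpha_i^n)_{i=1}^n\subset\Lambda$ with $n\in\mathbb{N}$, such that
\[ \Big\|\sum_{i=1}^{n}\theta_i^n f_{\alpha_i^n} - \mathbf{1}_{(s,t]}\Big\|_{\Phi}\to 0, \quad \text{as } n\to\infty. \]
Therefore, from (8), Theorem 12 in [18], and Theorem 2, we conclude that for some $p\ge 0$,
\[ \int_{\mathbb{R}}\sum_{i=1}^{n}\theta_i^n f_{\alpha_i^n}(r)\,dL_r\to L_t - L_s, \quad \text{in } L^p(\Omega, \mathcal{F}, \mathbb{P}), \]
which is enough.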
Proof of Theorem 3 Obviously $\overline{\mathrm{span}}\{X_u\}_{u\in\mathbb{R}}\subseteq\overline{\mathrm{span}}\{L_t - L_s : s\le t\}$, so we only need to show the reverse inclusion. Recall that under our assumptions, for some $p\ge 0$, $L_{\Phi_p}^{(\gamma_\tau,B,\nu)} = L^{\Phi}$, which is an Orlicz space. Thus, from Lemma 1, we only need to check that for every $u > s$, $\mathbf{1}_{(s,u]}\in\overline{\mathrm{span}}(f(t-\cdot))_{t\in\mathbb{R}}$ under $\|\cdot\|_{\Phi}$. To do this, we will apply Corollary 1 in order to verify that
\[ \overline{\mathrm{span}}\{f(t-\cdot)\}_{t\in\mathbb{R}} = L^{\Phi}. \tag{16} \]
Thus, let $g\in L^{\Psi}$ be such that
\[ \int_{\mathbb{R}} f(t-s)g(s)\,ds = 0, \quad \text{for all } t\in\mathbb{R}. \]
From Sect. 2 we know that the functions f, g and $f*g$ induce tempered distributions, i.e. elements of $\mathcal{S}'(\mathbb{R})$. Therefore, their Fourier transforms are well defined as elements of $\mathcal{S}'(\mathbb{R})$. Denote by $\mathrm{supp}\,\hat{g}$ and $\mathrm{supp}\,\hat{f}$ the (distributional) supports of these Fourier transforms. Since $f\in L^{\Phi}\cap L^1(dx)$, we can apply Lemma 5 in [20], cf. [1], to get that
\[ \mathrm{supp}\,\hat{g}\subseteq\left\{z\in\mathbb{R} : \hat{f}(z) = 0\right\} = \emptyset. \]
This implies immediately that $g\equiv 0$ almost everywhere, which, according to Corollary 1, gives (16).
Remark 2 Observe that if we replace in (7) the one-dimensional Lebesgue measure with the d-dimensional Lebesgue measure, the last part of the previous proof remains valid. In particular, if $f : \mathbb{R}^d\to\mathbb{R}$ is integrable, belongs to $L^{\Phi}$ and has non-vanishing Fourier transform, then $\overline{\mathrm{span}}\{f(t-\cdot)\}_{t\in\mathbb{R}^d} = L^{\Phi}$. In this setting and under (8), the random field
\[ X_t := \int_{\mathbb{R}^d} f(t-s)\,L(ds), \quad t\in\mathbb{R}^d, \tag{17} \]
is well defined. Here f is as above and L denotes a homogeneous Lévy basis (Lévy sheet) on $\mathbb{R}^d$ with characteristic triplet $(\gamma_\tau, B, \nu)$. Since in this framework Theorem 2 still holds (see Rajput and Rosiński [17]), we conclude that Theorem 3 also applies to random fields of the form (17).
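The pathwise inversion described in Example 2 can be illustrated numerically. The sketch below is ours and rests on simplifying assumptions: the driver is a real-valued pure-jump path whose jumps occur only at grid times $t_k = kh$, the OU process is started at a given value instead of carrying an infinite past, and the jump sizes in the usage note are Cauchy distributed to stress that no moment condition is involved. The helper names are hypothetical. Between jumps the causal OU process decays exponentially, so the time integral in the Langevin equation $L_t - L_s = X_t - X_s + \int_s^t X_r\,dr$ is available in closed form.

```python
import math
import random


def ou_from_jumps(jumps, h, x0=0.0):
    """Values X_{t_k}, t_k = k*h, of the causal OU process dX = -X dt + dL,
    where L is a pure-jump path with a jump of size jumps[k] at time t_{k+1}."""
    decay = math.exp(-h)
    xs = [x0]
    for jump in jumps:
        xs.append(decay * xs[-1] + jump)
    return xs


def recover_increments(xs, h):
    """Recover L_{t_{k+1}} - L_{t_k} from the observed OU values only.

    Uses L_t - L_s = X_t - X_s + int_s^t X_r dr together with the fact that
    X_r = X_{t_k} * exp(-(r - t_k)) on [t_k, t_{k+1}), so the integral over
    one step equals X_{t_k} * (1 - exp(-h)).
    """
    factor = 1.0 - math.exp(-h)
    return [xs[k + 1] - xs[k] + factor * xs[k] for k in range(len(xs) - 1)]
```

For example, with `rng = random.Random(1)` and `jumps = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(500)]`, calling `recover_increments(ou_from_jumps(jumps, 0.1), 0.1)` returns the jumps up to floating-point error, using only values of X up to the current time, in line with invertibility in the causal sense.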
4 Conclusions
This paper studied the invertibility of continuous-time moving average processes driven by a Lévy process. We showed that the driving noise can be recovered from direct observations of the process. To do this we assumed that the Fourier transform of the kernel never vanishes and we imposed a regularity condition on the characteristic triplet of the background driving Lévy process.
Acknowledgments The author is grateful to Ole E. Barndorff-Nielsen and Benedykt Szozda for helpful comments on a previous version of this work. Financial support from the Center for Research in the Econometric Analysis of Time Series (grant DNRF78) funded by the Danish National Research Foundation is gratefully acknowledged. This study was also partially funded by the Villum Fonden as part of the project number 11745 titled "Ambit Fields: Probabilistic Properties and Statistical Inference".
References
1. Bang, H.H.: Spectrum of functions in Orlicz spaces. J. Math. Sci. Univ. Tokyo 4(2), 341–349 (1997)
2. Barndorff-Nielsen, O.E., Benth, F.E., Veraart, A.: Modelling energy spot prices by volatility modulated Lévy-driven Volterra processes. Bernoulli 19(3), 803–845 (2013)
3. Barndorff-Nielsen, O.E., Maejima, M., Sato, K.: Infinite divisibility for stochastic processes and time change. J. Theor. Probab. 19(2), 411–446 (2006)
4. Barndorff-Nielsen, O.E., Sauri, O., Szozda, B.: Selfdecomposable fields. J. Theor. Probab. 30(1), 233–267 (2017)
5. Basse-O'Connor, A.: Some properties of a class of continuous time moving average processes. In: Proceedings of the 18th EYSM, pp. 59–64 (2013)
6. Basse-O'Connor, A., Nielsen, M., Pedersen, J., Rohde, V.: A continuous-time framework for ARMA processes. ArXiv e-prints (2017)
7. Basse-O'Connor, A., Rosiński, J.: Characterization of the finite variation property for a class of stationary increment infinitely divisible processes. Stoch. Process. Appl. 123(6), 1871–1890 (2013)
8. Brockwell, P.J., Davis, R.A.: Time Series: Theory and Methods. Springer, New York (1986)
9. Brockwell, P.J., Lindner, A.: Existence and uniqueness of stationary Lévy-driven CARMA processes. Stoch. Process. Appl. 119(8), 2660–2681 (2009)
10. Cohen, S., Maejima, M.: Selfdecomposability of moving average fractional Lévy processes. Stat. Probab. Lett. 81(11), 1664–1669 (2011)
11. Comte, F., Renault, E.: Noncausality in continuous time models. Econ. Theory 12, 215–256 (1996)
12. Duistermaat, J., Kolk, J.: Distributions: Theory and Applications. Cornerstones. Birkhäuser, Boston (2010)
13. Ferrazzano, V., Fuchs, F.: Noise recovery for Lévy-driven CARMA processes and high-frequency behaviour of approximating Riemann sums. Electron. J. Stat. 7, 533–561 (2013)
14. Kaminska, A.: On Musielak-Orlicz spaces isometric to L2 or L∞. Collect. Math. 48(4–6), 563–569 (1997)
15. Musielak, J.: Orlicz Spaces and Modular Spaces. Lecture Notes in Mathematics. Springer, Berlin (1983)
16. Pedersen, J., Sauri, O.: On Lévy semistationary processes with a gamma kernel. In: XI Symposium on Probability and Stochastic Processes. Progress in Probability, vol. 69, pp. 217–239. Springer, Cham (2015)
17. Rajput, B.S., Rosiński, J.: Spectral representations of infinitely divisible processes. Probab. Theory Rel. Fields 82(3), 451–487 (1989)
18. Rao, M.M., Ren, Z.D.: Theory of Orlicz Spaces. M. Dekker, New York (1994)
19. Sato, K.: Additive processes and stochastic integrals. Illinois J. Math. 50(1–4), 825–851 (2006)
20. Thuong, T.V.: Some collections of functions dense in an Orlicz space. Acta Math. Vietnam. 25(2), 195–208 (2000)