Stochastic Processes, Statistical Methods, and Engineering Mathematics: SPAS 2019, Västerås, Sweden, September 30–October 2 (Springer Proceedings in Mathematics & Statistics, 408) [1st ed. 2022] 303117819X, 9783031178191


English Pages 933 [907] Year 2023


Table of contents :
Preface
Contents
Contributors
Part I Stochastic Processes and Analysis
1 An Improved Asymptotics of Implied Volatility in the Gatheral Model
1.1 Introduction
1.2 Previous Results
1.3 The Expansion of Order 3
1.4 A Sketch of a Proof
1.4.1 Preliminaries
1.4.2 Order 3
1.5 Conclusions and Future Work
References
2 Ruin Probability for Merged Risk Processes with Correlated Arrivals
2.1 Introduction
2.2 The Compound Poisson Model with Phase Type Claims
2.2.1 Phase Type Distribution
2.2.2 Ruin Probability for Phase Type Claims
2.3 Correlated Poisson Arrivals with Exponential Claims
2.3.1 Transition to a Single Poisson Compound Process
2.3.2 Numerical Example
2.3.3 An Auxiliary Result
2.4 Correlated Poisson Arrivals with Phase Type Claims
2.4.1 The Phase Type Distribution of Claims of the Merged Process
2.4.2 Example
2.4.3 Ruin Probability in Case of Phase Type Claims
2.5 Conclusion
References
3 Method Development for Emergent Properties in Stage-Structured Population Models with Stochastic Resource Growth
3.1 Introduction
3.2 The Deterministic Stage Model
3.3 Stochastic Stage Model
3.4 The Recovery Potential
3.5 Probability of Extinction for Stochastic Case
3.6 Resilience for Stage Model
3.7 Simulation Results
3.7.1 Stage-Structured Biomass Dynamics and Yield
3.7.2 Impact on Size Structure and Biomass
3.7.3 The Stock Recovery Potential
3.7.4 The Probability of Extinction
3.7.5 Resilience
3.8 Conclusions and Discussion
3.9 Appendix 3.1. Stage-Structured Biomass Model with Logistic Resource Dynamics
3.10 Appendix 3.2. Proof of Uniqueness of the Solution w*_J(R)
References
4 Representations of Polynomial Covariance Type Commutation Relations by Linear Integral Operators on Lp Over Measure Spaces
4.1 Introduction
4.2 Preliminaries and Notations
4.3 Representations by Linear Integral Operators
References
5 Computable Bounds of Exponential Moments of Simultaneous Hitting Time for Two Time-Inhomogeneous Atomic Markov Chains
5.1 Introduction
5.2 Notation
5.3 Geometric Drift Condition
5.3.1 Drift Condition for Inhomogeneous Markov Chains
5.3.2 Constructing a Sequence that Dominates Return Time
5.4 Main Result
5.5 Auxiliary Lemmas
References
6 Valuation and Optimal Strategies for American Options Under a Markovian Regime-Switching Model
6.1 Introduction
6.2 A Markovian Regime-Switching Model
6.3 Pricing of American Options
6.4 Some Properties for Optimal Strategy
6.4.1 Preliminaries
6.4.2 Lemmata
6.4.3 Properties
6.4.4 Optimal Strategy
6.5 Numerical Examples
6.5.1 Model Implementation
6.5.2 Numerical Results
6.6 Conclusion and Future Research
References
7 Inequalities for Moments of Branching Processes in a Varying Environment
7.1 Introduction
7.2 Main Results
7.3 Auxiliary Results
7.4 Proof of the Main Results
References
8 A Law of the Iterated Logarithm for the Empirical Process Based Upon Twice Censored Data
8.1 Introduction
8.2 Product-Limit Estimator
8.3 Results
8.4 Simulation
8.5 Proofs of the Lemmas
References
9 Investigating Some Attributes of Periodicity in DNA Sequences via Semi-Markov Modelling
9.1 Introduction
9.2 The Basic Framework
9.2.1 The Homogeneous Case
9.2.2 The Case of Partial Non Homogeneity
9.3 Quasiperiodicity
9.4 Illustrations of Real and Synthetic Data
9.4.1 DNA Sequences of Synthetic Data
9.4.2 DNA Sequences of Real Data
9.5 Conclusion
References
10 Limit Theorems of Baxter Type for Generalized Random Gaussian Processes with Independent Values
10.1 Introduction
10.2 The Covariance Functional of a Generalized Random Process with Independent Values
10.3 The Families of Test Functions
10.4 Convergence of Baxter Sums
10.5 Conditions of Singularity of Measures
References
11 On Explicit Formulas of Steady-State Probabilities for the M/M/c/c+m-Type Retrial Queue
11.1 Introduction
11.2 Mathematical Model and Ergodicity Condition
11.3 Steady-State Distribution
11.4 Numerical Results
11.5 Conclusion and Future Research
References
12 Testing Cubature Formulae on Wiener Space Versus Explicit Pricing Formulae
12.1 Introduction and Background
12.2 Implementation of Cubature Formulae on Wiener Space
12.2.1 SDE and Stochastic Integral (Itô)
12.2.2 SDE and Stochastic Integral (Stratonovich)
12.2.3 Itô Integral Versus Stratonovich Integral
12.3 Construction of Cubature Formulae on Wiener Space
12.3.1 The Trajectories of SDE (12.6)
12.4 Cubature Formula of Degree 5
12.4.1 Black–Scholes Versus Cubature Pricing Formula (Degree 5)
12.4.2 Construction of a Trinomial Model Based on the Cubature Formula
12.5 Cubature Formula of Degree 7
12.5.1 Black–Scholes Versus Cubature Formula (Degree 7)
12.6 Conclusion and Future Works
References
13 Gaussian Processes with Volterra Kernels
13.1 Introduction
13.2 Gaussian Volterra Processes and Their Smoothness Properties
13.3 Gaussian Volterra Processes with Sonine Kernels
13.3.1 Fractional Brownian Motion and Sonine Kernels
13.3.2 A General Approach to Volterra Processes with Sonine Kernels
13.4 Examples of Sonine Kernels
13.5 Appendix
13.5.1 Inequalities for Norms of Convolutions and Products
13.5.2 Continuity of Trajectories and Hölder Condition
13.5.3 Application of Fractional Calculus
13.5.4 Existence of the Solution to Volterra Integral Equation Where the Integral Operator Is an Operator of Convolution with Integrable Singularity at 0
References
14 Stochastic Differential Equations Driven by Additive Volterra–Lévy and Volterra–Gaussian Noises
14.1 Introduction
14.2 Brief Description of Volterra–Lévy Processes
14.3 Moment Upper Bounds and Hölder Properties of Volterra–Lévy Processes
14.3.1 General Upper Bounds for the Incremental Moments
14.3.2 Incremental Moments and Hölder Continuity Under Power Restrictions on the Kernel g
14.3.3 Application of the Upper Bounds for the Incremental Moments to Volterra–Lévy Processes of Three Types
14.3.4 Examples of Volterra–Lévy Processes with Power Restrictions on the Kernel
14.3.5 Sonine Pairs and Two Kinds of Volterra–Gaussian Processes
14.4 Equations with Locally Lipschitz Drift of Linear Growth
14.5 Equations with Volterra–Gaussian Processes
14.5.1 Girsanov Theorem. Definition of Weak and Strong Solutions
14.5.2 Weak Existence and Weak Uniqueness
14.5.3 Pathwise Uniqueness of Weak Solution. Existence and Uniqueness of Strong Solution
References
15 Fixed Point Results of Generalized Cyclic Contractive Mappings in Multiplicative Metric Spaces
15.1 Introduction
15.1.1 Multiplicative Metric Spaces
15.1.2 Fixed Points of Maps in Multiplicative Metric Space
15.2 Cyclic Contraction Mappings
15.2.1 Fixed Point Results of Cyclic Contraction Mappings
15.2.2 Well-Posedness Results for Cyclic Contractive Maps
15.2.3 Limit Shadowing Property for Cyclic Contractive Maps
15.2.4 Periodic Points of Cyclic Contractive Maps
References
16 Fixed Points of T-Hardy Rogers Type Mappings and Coupled Fixed Point Results in Multiplicative Metric Spaces
16.1 Introduction
16.2 Fixed Points of T-Hardy Rogers Type Contractive Maps
16.2.1 Well-Posedness Results for T-Hardy Rogers Type Contractions
16.2.2 Limit Shadowing Property for T-Hardy Rogers Type Contractions
16.2.3 Periodic Point Property for T-Hardy Rogers Type Contractive Maps
16.3 Coupled Fixed Points in Multiplicative Metric Spaces
16.3.1 Coupled Fixed Points
16.3.2 Coupled Fixed Point Results
16.3.3 Well-Posedness Result for Coupled Maps
16.3.4 Application
References
17 Some Periodic Point and Fixed Point Results in Multiplicative Metric Spaces
17.1 Introduction
17.2 Periodic Point Results
17.3 Cyclic Contractions
17.4 Applications
References
18 Bochner Integrability of the Random Fixed Point of a Generalized Random Operator and Almost Sure Stability of Some Faster Random Iterative Processes
18.1 Introduction and Preliminaries
18.2 Bochner Integrability of the Fixed Point of a Generalized Random Operator
18.3 Almost Sure T-Stability Results
18.4 Application to Random Nonlinear Integral Equation of the Hammerstein Type
References
19 An Approach to the Absence of Price Bubbles Through State-Price Deflators
19.1 Introduction
19.2 Securities Market Model and State-Price Deflators
19.3 About the Existence of Price Bubbles
19.4 Vector Spaces Associated to Marketed Strategies
19.5 Conclusion
References
20 Form Factors for Stars Generalized Grey Brownian Motion
20.1 Introduction
20.2 Generalized Grey Brownian Motion in Arbitrary Dimensions
20.2.1 Construction of the Mittag-Leffler Measure
20.2.2 Generalized Grey Brownian Motion
20.3 Form Factors for Different Classes of Star Generalized Grey Brownian Motion
20.4 Form Factors for Star Fractional Brownian Motion
20.5 Conclusion
References
21 Flows of Rare Events for Regularly Perturbed Semi-Markov Processes
21.1 Introduction
21.2 First-Rare-Event Times for Perturbed Semi-Markov Processes
21.2.1 First-Rare-Event Times
21.2.2 Asymptotically Uniformly Ergodic Markov Chains
21.2.3 Necessary and Sufficient Conditions of Weak Convergence for First-Rare-Event Times
21.3 Counting Processes Generated by Flows of Rare Events
21.3.1 Counting Processes for Rare-Events
21.3.2 Necessary and Sufficient Conditions of Convergence for Counting Processes Generated by Flows of Rare Events
21.4 Markov Renewal Processes Generated by Flows of Rare Events
21.4.1 Return Times and Rare Events
21.4.2 Necessary and Sufficient Conditions of Convergence For Markov Renewal Processes Generated by Flows of Rare Events
21.5 Vector Counting Processes Generated by Flows of Rare Events
21.5.1 Vector Counting Process for Rare Events
21.5.2 Necessary and Sufficient Conditions of Convergence for Vector Counting Process Generated by Flows of Rare Events
References
Part II Statistical Methods
22 An Econometric Analysis of Drawdown Based Measures
22.1 Introduction
22.2 Risk Measures
22.3 Mathematical Models
22.3.1 ARMA Model
22.3.2 GARCH Model
22.3.3 EGARCH Model
22.4 Application
22.5 Conclusions
References
23 Forecasting and Optimizing Patient Enrolment in Clinical Trials Under Various Restrictions
23.1 Introduction
23.2 Enrolment Modelling
23.2.1 Modelling Unrestricted Enrolment
23.2.2 Modelling Enrolment on Country Level
23.2.3 Modelling Global Enrolment
23.3 Modelling Enrolment with Restrictions
23.3.1 Modelling Enrolment with Restrictions in One Centre
23.3.2 Modelling Enrolment with Restrictions on Country Level
23.3.3 Forecasting Global Enrolment Under Country Restrictions
23.3.4 Using Historic Data for Better Prediction of the Enrolment Rates for the New Trials
23.4 Optimal Enrolment Design
23.4.1 Unrestricted Enrolment
23.4.2 Restricted Enrolment
23.5 Conclusions
23.6 Appendix
23.6.1 Approximation of the Convolution of PG Variables
23.6.2 Calculation of the Mean of the Restricted Process
23.6.3 Calculation of the 2nd Moment of the Restricted Process
References
24 Algorithms for Recalculating Alpha and Eigenvector Centrality Measures Using Graph Partitioning Techniques
24.1 Introduction
24.1.1 Notation and Abbreviations
24.1.2 Graph Concepts
24.2 The Alpha Centrality Algorithm
24.2.1 Stages for Algorithm Formulation
24.2.2 Computing Other Centrality Measures by Using the α-Centrality Measure Algorithm
24.2.3 Example
24.3 The Eigenvector Centrality for Large Directed Graphs
24.3.1 The Eigenvector Centrality Algorithm
24.3.2 Reformulation of the Power Method
24.3.3 Computing Eigenvector Centrality of a Graph Componentwise
24.4 Discussion and Conclusion
References
25 On Statistical Properties of the Estimator of Impulse Response Function
25.1 Introduction
25.2 The Estimator of an Impulse Response Function and Its Properties
25.3 Trigonometric Basis
25.4 Square Gaussian Random Variables and Processes
25.5 On the Rate of Convergence of the Estimator of Impulse Response Function
25.6 Testing Hypotheses on the Impulse Response Function
25.7 Simulation Study
References
26 Connections Between the Extreme Points for Vandermonde Determinants and Minimizing Risk Measure in Financial Mathematics
26.1 Introduction
26.2 Money Market Account
26.3 Derivatives and Arbitrage Pricing
26.4 Pricing Derivatives
26.4.1 Discount Bonds and Coupon-Bearing Bonds
26.4.2 Yield to Maturity
26.4.3 Spot Rate
26.4.4 Forward Yields and Forward Rates
26.4.5 Forward and Future Contracts
26.5 Options
26.6 Optimization Model in Finance
26.6.1 Extreme Points of the Vandermonde Determinant on Various Surfaces Defined as Efficient Frontiers
26.7 Vandermonde Matrix, Determinant and Portfolio Construction
26.7.1 Optimum Value of Generalized Variance V[] with Extreme Points of Vandermonde Determinant
26.8 Conclusion
References
27 Extreme Points of the Vandermonde Determinant and Wishart Ensemble on Symmetric Cones
27.1 Introduction
27.1.1 Gaussian and Chi-Square Distributions
27.1.2 Laplace Transform and The Wishart Density
27.1.3 The Gindikin Set and Wishart Joint Eigenvalue Distribution
27.2 A Quick Jump into Wishart Distribution on Symmetric Cones
27.3 Extreme Points of the Degenerate Wishart Distribution and Vandermonde Determinant
27.4 Conclusion
References
28 Option Pricing and Stochastic Optimization
28.1 Introduction
28.2 Problem of Investor
28.2.1 Problem Statement
28.2.2 Stochastic Optimization and Probability Functionals
28.3 Applying Investor Problem for Option Pricing
28.3.1 Investor Optimal Price
28.3.2 Examples
28.3.3 Numerical Results
28.4 Summary
References
Part III Engineering Mathematics
29 Stochastic Solutions of Stefan Problems with General Time-Dependent Boundary Conditions
29.1 Introduction
29.1.1 Random Walk and the Heat Equation
29.1.2 A Random Walk Model with Boundary Conditions
29.2 The Stefan Problem
29.2.1 The Stefan Condition
29.2.2 Modelling the Moving Boundary
29.2.3 Stefan Problem with an Incoming Heat Flux
29.3 Numerical Results for Stefan Problems
29.3.1 Stefan Problem with Constant Boundary Condition f(t) = T_0
29.3.2 Stefan Problem with a Special Boundary Condition f(t) = e^t − 1
29.3.3 Stefan Problem with a Special Heat Flux Boundary Condition h(t) = −q_0/√t
29.3.4 Stefan Problem with Oscillating Boundary Condition
29.3.5 Stefan Problem with Boundary Condition According to Daytime Temperature Variations
29.4 Discussion
29.5 Conclusions
References
30 Numerical Upscaling via the Wave Equation with Perfectly Matched Layers
30.1 Introduction
30.2 The Wave Approach to Approximate a0
30.3 Perfectly Matched Layer for the Second Order Wave Equation
30.3.1 The New Local Problem Based on the Wave Equation Combined with PML
30.4 Numerical Discretization
30.5 Computational Results
30.6 Concluding Remarks
References
31 Homotopy Analysis Method (HAM) for Differential Equations Pertaining to the Mixed Convection Boundary-Layer Flow over a Vertical Surface Embedded in a Porous Medium
31.1 Introduction
31.2 Governing Equations
31.3 Homotopy Analysis Solution
31.4 Results and Discussion
31.5 Concluding Remarks
References
32 Magnetic Force Calculation Between Truncated Cone Shaped Permanent Magnet and Soft Magnetic Cylinder Using Hybrid Boundary Element Method
32.1 Introduction
32.2 Problem Definition
32.2.1 Force Calculation Between Circular Loops Loaded with Magnetization Charges
32.2.2 Force Calculation Between Permanent Magnet and Soft Magnetic Cylinder
32.3 Numerical Results
32.4 Conclusion
References
33 A Mathematical Model for Harvesting in a Stage-Structured Cannibalistic System
33.1 Introduction
33.2 Model Description
33.2.1 Biological Assumptions
33.2.2 The Schematic Diagram of the Model
33.2.3 The Model
33.2.4 Stability Analysis
33.3 Numerical Simulation
33.3.1 Comparing Harvesting Scenarios
33.4 Discussion and Conclusion
References
34 On the Approximation of Physiologically Structured Population Model with a Three Stage-Structured Population Model in a Grazing System
34.1 Introduction
34.1.1 Consumer-Resource Systems
34.1.2 Unstructured and Structured Population Models
34.1.3 Description of the Grazing System
34.2 Physiologically Structured Population Models
34.2.1 Size-Structured Population Models
34.2.2 Formulation of the Size-Structured Population Model
34.2.3 Vital Rates
34.2.4 Modification of the Physiologically Structured Population Model
34.3 Stage-Structured Population Model
34.4 Discussion and Conclusion
References
35 Magnetohydrodynamic Casson Nanofluid Flow Over a Nonlinear Stretching Sheet with Velocity Slip and Convective Boundary Conditions
35.1 Introduction
35.2 Mathematical Formulation
35.3 Results and Discussion
35.4 Velocity Profiles
35.5 Temperature Profiles
35.6 Concentration Profiles
35.7 Conclusion
References
36 Mathematical and Computational Analysis of MHD Viscoelastic Fluid Flow and Heat Transfer Over Stretching Surface Embedded in a Saturated Porous Medium
36.1 Introduction
36.2 Mathematical Formulation
36.3 Boundary Conditions
36.3.1 Prescribed Surface Temperature (PST)
36.3.2 Prescribed Wall Heat Flux (PHF)
36.4 Dimensionless Quantities
36.5 Reduced Non-linear Ordinary Differential Equations
36.6 Reduced Boundary Conditions
36.6.1 Prescribed Surface Temperature (PST)
36.6.2 Prescribed Heat Flux (PHF)
36.7 Results and Discussion
36.8 Conclusion
References
37 Numerical Solution of Boundary Layer Flow Problem of a Maxwell Fluid Past a Porous Stretching Surface
37.1 Introduction
37.2 Mathematical Formulation
37.3 Physical Quantities
37.4 Numerical Solution of the Problem
37.5 Results and Discussion
37.6 Conclusions
References
38 Effect of Electromagnetic Field on Mixed Convection of Two Immiscible Conducting Fluids in a Vertical Channel
38.1 Introduction
38.2 Mathematical Formulation
38.3 Analytical Solutions
38.3.1 Special Cases
38.4 Results and Discussion
38.5 Conclusion
38.6 Nomenclature
References
39 Stochastic Smart Grid Meter for Industry 4.0—From an Idea to the Practical Prototype
39.1 Introduction
39.2 Multibit SFADC
39.3 Base Conditions and Limitations
39.4 Mathematical Model of the SFADC Measurement Uncertainty
39.5 Multibit SMI
39.6 Stochastic Digital Electrical Energy Meter
39.7 Hardware Prototype of 4-Bit SDEEM
39.8 Measurement Results
39.9 Conclusion
References
40 Mathematical Basis of the Stochastic Digital Measurement Method
40.1 Introduction
40.2 Measurement of the Mean Value of the Signal
40.3 Measurement of the Mean Value of the Product of Two Signals Using Two-Bit SDMM
40.3.1 Measurement in Time Domain
40.3.2 Measurement in Fourier Domain
40.4 Stochastic Digital DFT Processor (SDDFT Processor)
40.5 Application of SDMM
40.6 Discussion
40.7 Conclusion
References
Subject Index
Index
Author Index


Springer Proceedings in Mathematics & Statistics

Anatoliy Malyarenko · Ying Ni · Milica Rančić · Sergei Silvestrov, Editors

Stochastic Processes, Statistical Methods, and Engineering Mathematics SPAS 2019, Västerås, Sweden, September 30–October 2

Springer Proceedings in Mathematics & Statistics Volume 408

This book series features volumes composed of selected contributions from workshops and conferences in all areas of current research in mathematics and statistics, including data science, operations research and optimization. In addition to an overall evaluation of the interest, scientific quality, and timeliness of each proposal at the hands of the publisher, individual contributions are all refereed to the high quality standards of leading journals in the field. Thus, this series provides the research community with well-edited, authoritative reports on developments in the most exciting areas of mathematical and statistical research today.

Anatoliy Malyarenko · Ying Ni · Milica Rančić · Sergei Silvestrov, Editors

Stochastic Processes, Statistical Methods, and Engineering Mathematics SPAS 2019, Västerås, Sweden, September 30–October 2

Editors Anatoliy Malyarenko Division of Mathematics and Physics Mälardalen University Västerås, Sweden

Ying Ni Division of Mathematics and Physics Mälardalen University Västerås, Sweden

Milica Rančić Division of Mathematics and Physics Mälardalen University Västerås, Sweden

Sergei Silvestrov Division of Mathematics and Physics Mälardalen University Västerås, Sweden

ISSN 2194-1009  ISSN 2194-1017 (electronic)
Springer Proceedings in Mathematics & Statistics
ISBN 978-3-031-17819-1  ISBN 978-3-031-17820-7 (eBook)
https://doi.org/10.1007/978-3-031-17820-7
Mathematics Subject Classification: 62P05, 60H10, 60F17, 60K15, 62M10, 65C30, 65K10

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Preface

This volume originated from selected contributions presented at the international conference “Stochastic Processes and Algebraic Structures—From Theory Towards Applications” (SPAS2019). A group of mathematicians, researchers from related areas, and practitioners from industry, who contribute to the areas of Stochastic Processes, Statistical Methods, Engineering Mathematics, and Algebraic Structures, participated in the conference, which was organized by the Division of Mathematics and Physics at Mälardalen University in Västerås, Sweden, and held on September 30–October 2, 2019.

The scope of the volume is Stochastic Processes, Statistical Methods, and Engineering Mathematics. The accompanying volume contains contributions to Algebraic Structures. The purpose of the book is to highlight the latest advances in the above areas and to focus on the mathematical structures, models, concepts, problems, computational methods, and algorithms that are important for applications in various fields of science and society. The volume is divided into three parts, according to the areas of the book’s scope described above.

Part I, called “Stochastic Processes and Analysis”, contains 21 chapters.

Chapter 1, written by Mohammed Albuhayri, Christopher Engström, Anatoliy Malyarenko, Ying Ni, and Sergei Silvestrov, is concerned with the double-mean-reverting model proposed by Jim Gatheral. The asymptotic expansion of the implied volatility of a European call option written on a stock in the above model is improved up to order 3.

Chapter 2, written by Mohammad Jamsher Ali and Kalev Pärna, studies the ruin probability of the sum of two classical risk processes under the assumption that the claim size distributions are of phase type and that the two Poisson processes of claim arrivals are correlated. An exact formula for the ruin probability of the merged risk process is constructed.
Chapter 3 by Tin Nwe Aye and Linus Carlsson studies an aquatic ecological system containing one fish species and an underlying resource. This model is investigated in both the deterministic and the stochastic settings. The authors provide estimates for the expected outcome of population properties, measures of dispersion, probability of extinction, and the recovery potential of a species, among others.

Chapter 4 by Domingos Djinja, Sergei Silvestrov, and Alex Behakanira Tumwesigye is devoted to pairs of integral operators on the spaces L_p over arbitrary measure spaces. The authors obtain conditions on the kernels of integral operators to satisfy general covariance type commutation relations associated to polynomial actions important in applications of noncommutative analysis, noncommutative geometry, and operator algebra methods in stochastic processes, harmonic analysis, quantum physics, and engineering.

Chapter 5 by Vitaliy Golomoziy studies the first simultaneous hitting of the atom by two discrete-time, inhomogeneous Markov chains with values in a general phase space. Both conditions for the existence and computable bounds for the hitting time’s exponential moment are established.

Chapter 6, written by Lu Jin, Marko Dimitrov, and Ying Ni, considers the pricing of American options when the underlying asset is governed by a Markov regime-switching process. The existence of a monotonic optimal exercise policy with respect to the holding time, asset price, and economic conditions is proved.

Chapter 7 by Ya. M. Khusanbaev and Kh. E. Kudratov gives the upper bounds for moments and central moments of branching processes in a varying environment starting with a random number of particles.

In Chap. 8, written by Abderrahim Kitouni and Fatiha Messaci, a functional law of the iterated logarithm for the increment functions of empirical processes with twice censored data is established. Strong laws for kernel estimators of the density and the failure rate of the lifetime are also derived.

Chapter 9 by Pavlos Kolias and Alexandra Papadopoulou describes a DNA sequence by a semi-Markov chain with discrete state space consisting of the four nucleotides.
Both strong and weak d-periodic and quasiperiodic behaviour of this model is characterized by equations in closed analytic form. The related probabilities and the corresponding indexes are provided.

In Chap. 10, Sergey Krasnitskiy, Oleksandr Kurchenko, and Olga Syniavska prove a Baxter-type theorem for Gaussian generalized random processes with independent values. Sufficient conditions for the singularity of probability measures corresponding to such processes are also given.

In Chap. 11, Eugene Lebedev, Vadym Ponomarov, and Hanna Livinska investigate a bivariate Markov process whose state space is a lattice semi-strip. Such a model describes the service policy of a multi-server retrial queue in which the rate of the repeated flow does not depend on the number of sources of repeated calls. The ergodicity conditions and a vector-matrix representation of the steady-state distribution are obtained.

In Chap. 12, Anatoliy Malyarenko and Hossein Nohrouzian use cubature formulae of degrees 5 and 7 on Wiener space to price European options in the classical Black–Scholes model. The obtained numerical results are compared with the well-known Nobel prize-awarded closed-form solution, and several important properties of the cubature methods are established.


In Chap. 13, Yuliya Mishura, Georgiy Shevchenko, and Sergiy Shklyar study Volterra processes. The celebrated fractional Brownian motion appears here as a particular case. The smoothness properties of the above processes, including continuity and the Hölder property, are established. The problem of inverse representation of the standard Brownian motion via a Volterra process is investigated.

In Chap. 14, Giulia Di Nunno, Yuliya Mishura, and Kostiantyn Ralchenko study the existence and uniqueness of solutions to stochastic differential equations with Volterra processes driven by Lévy noise. Special attention is given to two kinds of Volterra–Gaussian processes that generalize the compact interval representation of fractional Brownian motion to stochastic equations with such processes.

In Chaps. 15–17, Talat Nazir and Sergei Silvestrov investigate the existence and other properties of fixed points, joint fixed points, and periodic points for mappings and pairs of mappings on multiplicative metric spaces satisfying various generalized contraction and cyclic conditions.

In Chap. 18, Godwin Amechi Okeke, Mujahid Abbas, and Sergei Silvestrov introduce a random version of some known fast fixed point iterative processes, approximate the random fixed point of a generalized random operator using these random iterative processes, and prove the Bochner integrability of the random fixed points for this kind of generalized random operators and the almost sure T-stability of the above random iterative processes.

In Chap. 19, Salvador Cruz Rambaud presents mathematical results for the absence of asset price bubbles by using, as an algebraic tool, a state-price deflator across an infinite time horizon. A financial market with uncertainty, a finite number of corporate securities, and a countable number of trading dates is investigated there.

José L. da Silva, Custódia Drumond, and Ludwig Streit give an account of the form factors of paths for a certain class of non-Gaussian processes in Chap. 20. A closed analytic form for the form factors and the Debye function is obtained. The relation between the mean square end-to-end length and the radius of gyration is explicitly derived.

In Chap. 21, Dmitrii Silvestrov obtains necessary and sufficient conditions for convergence in distribution and in the Skorokhod J-topology for counting processes generated by flows of rare events for perturbed semi-Markov processes.

Part II, called “Statistical Methods”, contains 7 chapters. It begins with the chapter by Guglielmo D’Amico, Bice Di Basilio, Filippo Petroni, and Fulvio Gismondi (Chap. 22), in which two drawdown-based risk measures for managing market crises are considered. These two measures are then analysed using high-frequency market data and synthetic data generated by ARMA, GARCH, and EGARCH models.

Chapter 23 by Vladimir Anisimov and Matthew Austin deals with patient enrolment modelling and forecasting. Here, modelling of enrolment on different levels and with restrictions is discussed, and new analytical techniques are proposed to find the solution to the corresponding optimization problem.

Chapter 24 by Collins Anguzu, Christopher Engström, Henry Kasumba, John Magero Mango, and Sergei Silvestrov deals with centrality measures in graph theory.


Algorithms are developed for recalculating two centrality measures, namely the alpha and eigenvector centrality measures, using graph partitioning techniques.

In Chap. 25 by Yuriy Kozachenko and Iryna Rozora, estimation of the impulse response function of a time-invariant continuous linear system with a real-valued impulse response function is considered. Statistical properties of this impulse response function and a criterion on its shape are given.

Chapters 26 and 27, by Asaph Keikara Muhumuza, Karl Lundengård, Anatoliy Malyarenko, Sergei Silvestrov, John Magero Mango, and Godwin Kakuba, are devoted to useful applications of extreme points for Vandermonde determinants. In Chap. 26, the extreme points, optimized on various surfaces, are used to conduct the risk-minimization task in asset pricing and optimal portfolio selection. In Chap. 27, the extreme points maximize the Wishart probability distribution based on the boundary of the symmetric cones in Jordan algebra.

Finally, Part II ends with Chap. 28 by Nataliya Shchestyuk and Serhii Tyshchenko, who propose a new approach to option pricing. A concept of the investor optimal price is defined as the optimal decision of an investor maximizing expected profit. This investor optimal pricing, integrated with risk management, is then conducted by stochastic optimization.

Part III, called “Engineering Mathematics”, contains 12 chapters.

Chapter 29 by Magnus Ögren deals with the one-dimensional Stefan problem with a general time-dependent boundary condition at the fixed boundary. Applying discrete random walks, stochastic solutions are obtained and confirmed against analytical or numerical (FDM) solutions.

In Chap. 30, Doghonay Arjmand explores the possibility of integrating perfectly matched layers into the local wave equation. In particular, questions in relation to accuracy and reduced computational costs are addressed. Numerical simulations are provided in a simplified one-dimensional setting to illustrate the ideas.
The objective of the work done by Imran M. Chandarki and Brijbhan Singh in Chap. 31 is to revisit the problem pertaining to a vertically flowing fluid past a model of a thin vertical fin in a saturated porous medium. The governing equations have been simplified using the similarity transformation to yield ordinary differential equations. These equations have been solved by the homotopy analysis method (HAM). In Chap. 32, Vučković et al. present modelling of a permanent magnet shaped as a truncated cone and positioned in the vicinity of a body of finite dimensions made of soft magnetic material. The force calculation between the permanent magnet and the soft magnetic cylinder is performed using the hybrid boundary element method along with a semi-analytical approach based on fictitious magnetization charges and a discretization technique. Chapters 33 and 34 tackle problems related to population dynamics. Specifically, in Chap. 33, Loy Nankinga and Linus Carlsson deal with interactions of a consumer-resource system with harvesting in which African catfish consumes the food resource. The dynamics of the food resource and the African catfish result in a system of ordinary differential equations called a stage-structured fish population model. Analysis of eight harvesting scenarios revealed that harvesting large juveniles


and small adults under equal harvesting rates gives the highest maximum sustainable yield compared to other harvesting scenarios. In Chap. 34, Sam Canpwonyi and Linus Carlsson study the dynamics of the forage resource and the livestock population in a grassland ecosystem, describing it by coupled ordinary differential equations. By solving this system, one is able to predict the density-dependent properties of the population, since the system provides a somewhat close-to-reality description of the natural and traditional grazing system. In Chaps. 35-38 by Prashant G. Metri and coauthors, numerical and analytical methods are applied to the investigation of solutions of boundary and initial value problems for systems of partial differential equations in fluid mechanics and electromagnetism applications, including magnetohydrodynamic Casson nanofluid flow over a nonlinear stretching sheet with velocity slip and convective boundary conditions, mathematical and computational analysis of MHD viscoelastic fluid flow and heat transfer over a stretching surface embedded in a saturated porous medium, numerical solution of a boundary layer flow problem of a Maxwell fluid past a porous stretching surface, and the effect of an electromagnetic field on mixed convection of two immiscible conducting fluids in a vertical channel. Finally, Chaps. 39 and 40 cover the new stochastic digital measurement method (SDMM) and its role in designing low-cost, high-precision digital power grid electrical energy meters. In Chap. 39, authored by Vujičić et al., mathematical properties of the SDMM are given. Practical and useful formulas are derived that connect the measurement parameters with the precision. The hardware of a two-bit SDMM is simple, so sources of systematic errors can easily be identified and the errors corrected. In addition, the simple hardware enables large-scale parallelization of measurements and processing.
Using the multibit SMI (Stochastic Measurement Instruments) mathematical model developed in Chaps. 39 and 40, a working hardware prototype of a 4-bit SDEEM (Stochastic Digital Electrical Energy Meter) was built and rigorously tested by Marjan Urekar and Jelena Djordjević Kozarov. This confirmed the validity of the theoretical model and showed that the SDEEM is an ideal solution for a Smart Meter in Smart Grid and Industry 4.0 applications, due to its high precision and accuracy, high reliability, digital controls and ease of interfacing with IoT and IIoT, simple hardware, and low cost. The volume is intended for researchers, graduate and Ph.D. students, and practitioners in the areas of Mathematics, Statistics, Finance, and Engineering who are interested in a source of inspiration, cutting-edge research, and applications. This book comprises selected refereed contributions from several large research communities in modern stochastic processes, probability theory, statistics, analysis, computational mathematics, engineering mathematics, and their interplay and applications. The book will be a useful source of inspiration for a broad spectrum of researchers and research students in the field of Mathematics and Applied Mathematics, as well as in the specific areas of applications considered in the book. This collective book project has been realized thanks to the strategic support offered by Mälardalen University for the research and research education in Mathematics, which is conducted by the research environment Mathematics and Applied Mathematics (MAM) in the established research specialization of Educational Sciences and Mathematics at the School of Education, Culture and Communication at


Mälardalen University. We also wish to extend our thanks to the Swedish International Development Cooperation Agency (Sida) and the International Science Programme in Mathematical Sciences (ISP), the Nordplus programme of the Nordic Council of Ministers, the Swedish Research Council, and the Royal Swedish Academy of Sciences, as well as many other national and international funding organizations and the research and education environments and institutions of the individual researchers and research teams who contributed to the success of SPAS2019 and to this collective book. Finally, we especially thank all the authors for their excellent research contributions to this book. We also thank the staff of the publisher Springer for their excellent efforts and cooperation in the publication of this collective book. All contributed chapters have been reviewed, and we are grateful to the reviewers for their work.

Västerås, Sweden
June 2022

Anatoliy Malyarenko
Sergei Silvestrov
Ying Ni
Milica Rančić

Contents

Part I  Stochastic Processes and Analysis

1  An Improved Asymptotics of Implied Volatility in the Gatheral Model (Mohammed Albuhayri, Christopher Engström, Anatoliy Malyarenko, Ying Ni, and Sergei Silvestrov)  3
2  Ruin Probability for Merged Risk Processes with Correlated Arrivals (Mohammad Jamsher Ali and Kalev Pärna)  15
3  Method Development for Emergent Properties in Stage-Structured Population Models with Stochastic Resource Growth (Tin Nwe Aye and Linus Carlsson)  33
4  Representations of Polynomial Covariance Type Commutation Relations by Linear Integral Operators on L^p Over Measure Spaces (Domingos Djinja, Sergei Silvestrov, and Alex Behakanira Tumwesigye)  59
5  Computable Bounds of Exponential Moments of Simultaneous Hitting Time for Two Time-Inhomogeneous Atomic Markov Chains (Vitaliy Golomoziy)  97
6  Valuation and Optimal Strategies for American Options Under a Markovian Regime-Switching Model (Lu Jin, Marko Dimitrov, and Ying Ni)  121
7  Inequalities for Moments of Branching Processes in a Varying Environment (Ya. M. Khusanbaev and Kh. E. Kudratov)  145
8  A Law of the Iterated Logarithm for the Empirical Process Based Upon Twice Censored Data (Abderrahim Kitouni and Fatiha Messaci)  163
9  Investigating Some Attributes of Periodicity in DNA Sequences via Semi-Markov Modelling (Pavlos Kolias and Alexandra Papadopoulou)  179
10  Limit Theorems of Baxter Type for Generalized Random Gaussian Processes with Independent Values (Sergey Krasnitskiy, Oleksandr Kurchenko, and Olga Syniavska)  197
11  On Explicit Formulas of Steady-State Probabilities for the [M/M/c/c + m]-Type Retrial Queue (Eugene Lebedev, Vadym Ponomarov, and Hanna Livinska)  211
12  Testing Cubature Formulae on Wiener Space Versus Explicit Pricing Formulae (Anatoliy Malyarenko and Hossein Nohrouzian)  223
13  Gaussian Processes with Volterra Kernels (Yuliya Mishura, Georgiy Shevchenko, and Sergiy Shklyar)  249
14  Stochastic Differential Equations Driven by Additive Volterra–Lévy and Volterra–Gaussian Noises (Giulia Di Nunno, Yuliya Mishura, and Kostiantyn Ralchenko)  277
15  Fixed Point Results of Generalized Cyclic Contractive Mappings in Multiplicative Metric Spaces (Talat Nazir and Sergei Silvestrov)  325
16  Fixed Points of T-Hardy Rogers Type Mappings and Coupled Fixed Point Results in Multiplicative Metric Spaces (Talat Nazir and Sergei Silvestrov)  343
17  Some Periodic Point and Fixed Point Results in Multiplicative Metric Spaces (Talat Nazir and Sergei Silvestrov)  367
18  Bochner Integrability of the Random Fixed Point of a Generalized Random Operator and Almost Sure Stability of Some Faster Random Iterative Processes (Godwin Amechi Okeke, Mujahid Abbas, and Sergei Silvestrov)  383
19  An Approach to the Absence of Price Bubbles Through State-Price Deflators (Salvador Cruz Rambaud)  407
20  Form Factors for Stars Generalized Grey Brownian Motion (José L. da Silva, Custódia Drumond, and Ludwig Streit)  431
21  Flows of Rare Events for Regularly Perturbed Semi-Markov Processes (Dmitrii Silvestrov)  447

Part II  Statistical Methods

22  An Econometric Analysis of Drawdown Based Measures (Guglielmo D'Amico, Bice Di Basilio, Filippo Petroni, and Fulvio Gismondi)  489
23  Forecasting and Optimizing Patient Enrolment in Clinical Trials Under Various Restrictions (Vladimir Anisimov and Matthew Austin)  511
24  Algorithms for Recalculating Alpha and Eigenvector Centrality Measures Using Graph Partitioning Techniques (Collins Anguzu, Christopher Engström, Henry Kasumba, John Magero Mango, and Sergei Silvestrov)  541
25  On Statistical Properties of the Estimator of Impulse Response Function (Yuriy Kozachenko and Iryna Rozora)  563
26  Connections Between the Extreme Points for Vandermonde Determinants and Minimizing Risk Measure in Financial Mathematics (Asaph Keikara Muhumuza, Karl Lundengård, Anatoliy Malyarenko, Sergei Silvestrov, John Magero Mango, and Godwin Kakuba)  587
27  Extreme Points of the Vandermonde Determinant and Wishart Ensemble on Symmetric Cones (Asaph Keikara Muhumuza, Anatoliy Malyarenko, Karl Lundengård, Sergei Silvestrov, John Magero Mango, and Godwin Kakuba)  625
28  Option Pricing and Stochastic Optimization (Nataliya Shchestyuk and Serhii Tyshchenko)  651

Part III  Engineering Mathematics

29  Stochastic Solutions of Stefan Problems with General Time-Dependent Boundary Conditions (Magnus Ögren)  669
30  Numerical Upscaling via the Wave Equation with Perfectly Matched Layers (Doghonay Arjmand)  689
31  Homotopy Analysis Method (HAM) for Differential Equations Pertaining to the Mixed Convection Boundary-Layer Flow over a Vertical Surface Embedded in a Porous Medium (Imran M. Chandarki and Brijbhan Singh)  703
32  Magnetic Force Calculation Between Truncated Cone Shaped Permanent Magnet and Soft Magnetic Cylinder Using Hybrid Boundary Element Method (Ana Vučković, Dušan Vučković, Mirjana Perić, and Nebojša Raičević)  719
33  A Mathematical Model for Harvesting in a Stage-Structured Cannibalistic System (Loy Nankinga and Linus Carlsson)  735
34  On the Approximation of Physiologically Structured Population Model with a Three Stage-Structured Population Model in a Grazing System (Sam Canpwonyi and Linus Carlsson)  753
35  Magnetohydrodynamic Casson Nanofluid Flow Over a Nonlinear Stretching Sheet with Velocity Slip and Convective Boundary Conditions (Prashant G. Metri, M. Subhas Abel, and Deena Sunil Sharanappa)  773
36  Mathematical and Computational Analysis of MHD Viscoelastic Fluid Flow and Heat Transfer Over Stretching Surface Embedded in a Saturated Porous Medium (Jagadish Tawade and Prashant G. Metri)  791
37  Numerical Solution of Boundary Layer Flow Problem of a Maxwell Fluid Past a Porous Stretching Surface (Jagadish Tawade and Prashant G. Metri)  809
38  Effect of Electromagnetic Field on Mixed Convection of Two Immiscible Conducting Fluids in a Vertical Channel (J. C. Umavathi, Prashant G. Metri, and Sergei Silvestrov)  829
39  Stochastic Smart Grid Meter for Industry 4.0—From an Idea to the Practical Prototype (Marjan Urekar and Jelena Djordjević Kozarov)  857
40  Mathematical Basis of the Stochastic Digital Measurement Method (Vladimir Vujičić, Jelena Djordjević Kozarov, Platon Sovilj, and Bojan Vujičić)  889

Subject Index  909
Author Index  913

Contributors

Abbas Mujahid  Department of Mathematics, Government College University, Lahore, Pakistan; Department of Mathematics and Applied Mathematics, University of Pretoria, Pretoria, South Africa
Abel M. Subhas  Department of Mathematics, Gulbarga University, Kalaburagi, Karnataka, India
Albuhayri Mohammed  Division of Mathematics and Physics, Mälardalen University, Västerås, Sweden
Ali Mohammad Jamsher  University of Tartu, Tartu, Estonia
Anguzu Collins  Department of Mathematics, School of Physical Sciences, Makerere University, Kampala, Uganda
Anisimov Vladimir  Center for Design and Analysis, Amgen Inc., Cambridge, UK
Arjmand Doghonay  Division of Mathematics and Physics, School of Education, Culture and Communication, Mälardalen University, Västerås, Sweden
Austin Matthew  Center for Design and Analysis, Amgen Inc., Thousand Oaks, CA, USA
Aye Tin Nwe  Division of Mathematics and Physics, Mälardalen University, Västerås, Sweden; Taunggoke University, Toungup, Myanmar
Canpwonyi Sam  Department of Mathematics, Gulu University, Gulu, Uganda
Carlsson Linus  Division of Mathematics and Physics, Mälardalen University, Västerås, Sweden; School of Education, Culture and Communication, Mälardalen University, Västerås, Sweden; Department of Mathematics and Applied Mathematics, Mälardalen University, Västerås, Sweden


Chandarki Imran M.  Department of General Science and Engineering, N. B. Navale Sinhgad College of Engineering, Solapur, Maharashtra, India
Cruz Rambaud Salvador  Departamento de Economía y Empresa, Universidad de Almería, Almería, Spain
da Silva José L.  CIMA, University of Madeira, Campus da Penteada, Funchal, Portugal
Di Basilio Bice  Department of Economics, University of Chieti-Pescara, Pescara, Italy
Di Nunno Giulia  Department of Mathematics, University of Oslo, Oslo, Norway
Dimitrov Marko  Division of Mathematics and Physics, Mälardalen University, Västerås, Sweden
Djinja Domingos  Department of Mathematics and Informatics, Faculty of Sciences, Eduardo Mondlane University, Maputo, Mozambique; Division of Mathematics and Physics, School of Education, Culture and Communication, Mälardalen University, Västerås, Sweden
Djordjević Kozarov Jelena  Faculty of Electronic Engineering, University of Niš, Niš, Serbia
Drumond Custódia  University of Madeira, Campus da Penteada, Funchal, Portugal
D'Amico Guglielmo  Department of Economics, University of Chieti-Pescara, Pescara, Italy
Engström Christopher  Division of Mathematics and Physics, The School of Education, Culture and Communication, Mälardalen University, Västerås, Sweden
Gismondi Fulvio  Department of Economic and Business Science, 'Guglielmo Marconi' University, Rome, Italy
Golomoziy Vitaliy  Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
Jin Lu  Department of Informatics, University of Electro-Communications, Tokyo, Japan
Kakuba Godwin  Department of Mathematics, Makerere University, Kampala, Uganda
Kasumba Henry  Department of Mathematics, School of Physical Sciences, Makerere University, Kampala, Uganda
Khusanbaev Ya. M.  Uzbekistan Academy of Sciences, V. I. Romanovskiy Institute of Mathematics, Tashkent, Uzbekistan
Kitouni Abderrahim  Département de Mathématiques, Université frères Mentouri Constantine 1, Constantine, Algeria


Kolias Pavlos  Department of Mathematics, Aristotle University of Thessaloniki, Thessaloniki, Greece
Kozachenko Yuriy  Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
Krasnitskiy Sergey  Kyiv National University of Technology and Design, Kyiv, Ukraine
Kudratov Kh. E.  Samarkand State University named after Sharof Rashidov, Samarkand, Uzbekistan
Kurchenko Oleksandr  Kyiv National Taras Shevchenko University, Kyiv, Ukraine
Lebedev Eugene  Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
Livinska Hanna  Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
Lundengård Karl  Division of Mathematics and Physics, Mälardalen University, Västerås, Sweden
Malyarenko Anatoliy  Division of Mathematics and Physics, Mälardalen University, Västerås, Sweden
Mango John Magero  Department of Mathematics, School of Physical Sciences, Makerere University, Kampala, Uganda
Messaci Fatiha  Département de Mathématiques, Université frères Mentouri Constantine 1, Constantine, Algeria
Metri Prashant G.  Department of Mechanical Engineering and Mathematics, Walchand Institute of Technology, Solapur, Maharashtra, India
Mishura Yuliya  Department of Probability Theory, Statistics and Actuarial Mathematics, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
Muhumuza Asaph Keikara  Department of Mathematics, Busitema University, Tororo, Uganda; Division of Mathematics and Physics, Mälardalen University, Västerås, Sweden
Nankinga Loy  Department of Mathematics and Statistics, Kyambogo University, Kampala, Uganda
Nazir Talat  Department of Mathematical Sciences, University of South Africa, Florida, South Africa
Ni Ying  Division of Mathematics and Physics, Mälardalen University, Västerås, Sweden
Nohrouzian Hossein  Division of Mathematics and Physics, Mälardalen University, Västerås, Sweden
Ögren Magnus  School of Science and Technology, Örebro University, Örebro, Sweden; Hellenic Mediterranean University, Heraklion, Greece


Okeke Godwin Amechi  Department of Mathematics, School of Physical Sciences, Federal University of Technology, Owerri, Imo State, Nigeria
Papadopoulou Alexandra  Department of Mathematics, Aristotle University of Thessaloniki, Thessaloniki, Greece
Perić Mirjana  Faculty of Electronic Engineering, University of Niš, Niš, Serbia
Petroni Filippo  Department of Management, Marche Polytechnic University, Ancona, Italy
Ponomarov Vadym  Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
Pärna Kalev  University of Tartu, Tartu, Estonia
Raičević Nebojša  Faculty of Electronic Engineering, University of Niš, Niš, Serbia
Ralchenko Kostiantyn  Department of Probability Theory, Statistics and Actuarial Mathematics, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
Rozora Iryna  Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
Sharanappa Deena Sunil  Department of Mathematics, Indira Gandhi Tribal University, Madhya Pradesh, India
Shchestyuk Nataliya  NaUKMA, Kyiv, Ukraine
Shevchenko Georgiy  Department of Probability Theory, Statistics and Actuarial Mathematics, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
Shklyar Sergiy  Department of Probability Theory, Statistics and Actuarial Mathematics, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
Silvestrov Dmitrii  Department of Mathematics, Stockholm University, Stockholm, Sweden
Silvestrov Sergei  Division of Mathematics and Physics, School of Education, Culture and Communication, Mälardalen University, Västerås, Sweden
Singh Brijbhan  Department of Mathematics, Dr. Babasaheb Ambedkar Technological University, Lonere, Raigad, Maharashtra, India
Sovilj Platon  Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia
Streit Ludwig  BiBoS, Universität Bielefeld, Bielefeld, Germany; CIMA, Universidade da Madeira, Funchal, Portugal
Syniavska Olga  Uzhhorod National University, Uzhhorod, Ukraine
Tawade Jagadish  Faculty of Science and Technology, Vishwakarma University, Pune, Maharashtra, India
Tumwesigye Alex Behakanira  Department of Mathematics, College of Natural Sciences, Makerere University, Kampala, Uganda


Tyshchenko Serhii  NaUKMA, Kyiv, Ukraine
Umavathi J. C.  Department of Mathematics, Gulbarga University, Gulbarga, Karnataka, India
Urekar Marjan  Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia
Vujičić Bojan  Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia
Vujičić Vladimir  Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia
Vučković Ana  Faculty of Electronic Engineering, University of Niš, Niš, Serbia
Vučković Dušan  Faculty of Electronic Engineering, University of Niš, Niš, Serbia

Part I

Stochastic Processes and Analysis

Part I presents new developments in theoretical and applied aspects of Stochastic Processes. The above theory is applied to Financial Engineering in Chaps. 1, 6, and 12; to Actuarial Mathematics in Chap. 2; and to Mathematical Biology in Chaps. 3 and 9. Various theoretical problems are considered in Chaps. 4, 5, 7, 8, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20, and 21.

Chapter 1

An Improved Asymptotics of Implied Volatility in the Gatheral Model Mohammed Albuhayri, Christopher Engström, Anatoliy Malyarenko, Ying Ni, and Sergei Silvestrov

Abstract We study the double-mean-reverting model by Gatheral. Our previous results concerning the asymptotic expansion of the implied volatility of a European call option are improved up to order 3, that is, the error of the approximation is ultimately smaller than the 1.5th power of time to maturity plus the cube of the absolute value of the difference between the logarithmic security price and the logarithmic strike price.

Keywords Double-mean-reverting model · Implied volatility

MSC 2020: 91G20, 91G70, 91G60

1.1 Introduction

Gatheral [2] elaborated a model given by the following system of stochastic differential equations:

M. Albuhayri · C. Engström · A. Malyarenko (B) · Y. Ni · S. Silvestrov
Division of Mathematics and Physics, Mälardalen University, SE 721 23 Västerås, Sweden
e-mail: [email protected]
M. Albuhayri, e-mail: [email protected]
C. Engström, e-mail: [email protected]
Y. Ni, e-mail: [email protected]
S. Silvestrov, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_1


dS_t = √ν_t S_t dW_t^1,
dν_t = κ₁(ν′_t − ν_t) dt + ξ₁ν_t^{α₁} dW_t^2,
dν′_t = κ₂(θ − ν′_t) dt + ξ₂(ν′_t)^{α₂} dW_t^3

on a probability space (Ω, F, P*) with risk-neutral measure P* and a filtration {F_t : 0 ≤ t ≤ T}, where the Wiener processes W_t^i are correlated: E[W_s^i W_t^j] = ρ_{ij} min{s, t} with 1 ≤ i < j ≤ 3 and ρ_{ij} ∈ (−1, 1). In the above model, the first stochastic variance ν_t reverts to the second stochastic variance ν′_t at the rate κ₁, while the second stochastic variance reverts to the deterministic level θ at a usually slower rate κ₂. That is why this model is usually called a double mean-reverting model. The parameters α₁ and α₂ lie in the interval [1/2, 1]. Several known models of financial engineering are particular cases of the double mean-reverting model. In particular, when α₁ = α₂ = 1/2 we obtain the so-called double Heston model, and in the case of α₁ = α₂ = 1 the double lognormal model.

To use the above model for explicit calculations, one must calibrate it to real data. The calibration process is usually slow and uses either Monte Carlo or finite-difference methods. This is because neither the pricing formula for European type options nor the implied volatility is known in closed form. To deal with this difficulty, we develop an asymptotic expansion of the implied volatility in the above model. To explain this concept, first perform the change of variable x_t = ln S_t in the Gatheral system. The multidimensional Itô formula gives

dx_t = −(1/2)ν_t dt + √ν_t dW_t^1,
dν_t = κ₁(ν′_t − ν_t) dt + ξ₁ν_t^{α₁} dW_t^2,      (1.1)
dν′_t = κ₂(θ − ν′_t) dt + ξ₂(ν′_t)^{α₂} dW_t^3.
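Since no closed-form price is available, calibration-style experiments typically simulate the system directly. The following Euler–Maruyama sketch is our own illustration, not taken from the chapter; the flooring of variances at zero and all parameter values used below are assumptions:

```python
import numpy as np

def simulate_dmr(x0, v0, vp0, kappa1, kappa2, theta, xi1, xi2,
                 alpha1, alpha2, rho12, rho13, rho23,
                 T, n_steps, n_paths, seed=0):
    """Euler-Maruyama simulation of the log-price system (1.1).

    v  is the fast variance nu_t, vp the slow variance nu'_t.
    Variances are floored at zero after each step (full truncation),
    a standard numerical device that is NOT part of the model itself.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Correlation matrix of (W^1, W^2, W^3) and its Cholesky factor.
    corr = np.array([[1.0, rho12, rho13],
                     [rho12, 1.0, rho23],
                     [rho13, rho23, 1.0]])
    L = np.linalg.cholesky(corr)
    x = np.full(n_paths, float(x0))
    v = np.full(n_paths, float(v0))
    vp = np.full(n_paths, float(vp0))
    for _ in range(n_steps):
        dW = rng.standard_normal((n_paths, 3)) @ L.T * np.sqrt(dt)
        x = x - 0.5 * v * dt + np.sqrt(v) * dW[:, 0]
        v_new = v + kappa1 * (vp - v) * dt + xi1 * v**alpha1 * dW[:, 1]
        vp_new = vp + kappa2 * (theta - vp) * dt + xi2 * vp**alpha2 * dW[:, 2]
        v, vp = np.maximum(v_new, 0.0), np.maximum(vp_new, 0.0)
    return x, v, vp
```

Because the log-price step uses the drift −ν dt/2, each discretized step keeps E[e^{x_t}] constant, so the simulated e^{x_T} should average to S₀ up to Monte Carlo error.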

Denote by K the strike price of a European option with maturity T, and let k = ln K. Let

u(t, x, ν, ν′, T, k) = E*[max{0, e^{x_T} − e^k} | F_t, x_t = x, ν_t = ν, ν′_t = ν′]

be the time-t no-arbitrage price of the above option. On the other hand, the Black–Scholes price of a European option with volatility σ is u^{BS}(σ, t, x, k) = e^x N(d₊) − e^k N(d₋), where

d_± = (1/(σ√(T − t))) (x − k ± σ²(T − t)/2).

The implied volatility is the unique positive solution σ of the equation u^{BS}(σ, t, x, k) = u(t, x, ν, ν′, T, k).
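The definition of implied volatility can be made concrete numerically. The sketch below (ours, not from the chapter) implements the Black–Scholes price in the chapter's log coordinates and inverts it by bisection, which is justified because u^{BS} is strictly increasing in σ:

```python
from math import erf, exp, sqrt

def norm_cdf(d):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(d / sqrt(2.0)))

def u_bs(sigma, t, x, k, T):
    """Black-Scholes call price e^x N(d+) - e^k N(d-) with zero rates."""
    tau = T - t
    s = sigma * sqrt(tau)
    d_plus = (x - k + 0.5 * sigma**2 * tau) / s
    d_minus = (x - k - 0.5 * sigma**2 * tau) / s
    return exp(x) * norm_cdf(d_plus) - exp(k) * norm_cdf(d_minus)

def implied_vol(price, t, x, k, T, lo=1e-8, hi=5.0, tol=1e-12):
    """Bisection for the unique sigma > 0 with u_bs(sigma) = price."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if u_bs(mid, t, x, k, T) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

The bracketing interval [1e-8, 5.0] is an assumption that covers typical equity volatilities; it should be widened if prices near the no-arbitrage bounds are inverted.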

1.2 Previous Results

In [1], we proved the following result.

Theorem 1 The asymptotic expansion of order 2 of the implied volatility has the form

σ(t, x₀, ν₀, ν′₀; T, k) = √ν₀ + (1/4)ρ₁₂ξ₁ν₀^{α₁−1}(k − x₀) − (3/16)ρ₁₂²ξ₁²ν₀^{2α₁−5/2}(k − x₀)²
  + (1/32)[8κ₁(ν′₀ − ν₀)ν₀^{−1/2} + 4ρ₁₂ξ₁ν₀^{α₁} + 3ρ₁₂²ξ₁²ν₀^{2α₁−3/2}](T − t) + o(T − t + (k − x₀)²).

For an arbitrary positive real number λ, the above expansion converges in a parabolic domain

P_λ = { (T, k) ∈ (0, T₀] × ℝ : |x − k| ≤ λ√(T − t) }.      (1.2)
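For a quick numerical reading of Theorem 1, the order-2 expansion can be transcribed directly. This is a sketch transcribing the printed coefficients, not an independent derivation; `nu0p` stands for ν′₀ and the parameter values in the example are hypothetical:

```python
from math import sqrt

def sigma_order2(t, x0, nu0, nu0p, T, k, kappa1, xi1, rho12, alpha1):
    """Order-2 implied-volatility expansion of Theorem 1.

    Only kappa1, xi1, rho12, alpha1 enter at this order; the slow-variance
    parameters appear solely through nu0p = nu'_0.
    """
    tau = T - t
    m = k - x0  # log-moneyness k - x0
    return (sqrt(nu0)
            + 0.25 * rho12 * xi1 * nu0**(alpha1 - 1.0) * m
            - (3.0 / 16.0) * rho12**2 * xi1**2 * nu0**(2*alpha1 - 2.5) * m**2
            + (1.0 / 32.0) * (8.0 * kappa1 * (nu0p - nu0) * nu0**-0.5
                              + 4.0 * rho12 * xi1 * nu0**alpha1
                              + 3.0 * rho12**2 * xi1**2 * nu0**(2*alpha1 - 1.5)) * tau)
```

At the money and at maturity (k = x₀, T = t) the expansion collapses to √ν₀, and a negative ρ₁₂ produces the usual downward-sloping equity skew.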

1.3 The Expansion of Order 3

Theorem 2 The asymptotic expansion of order 3 of the implied volatility has the form

σ(t, x₀, ν₀, ν′₀; T, k) = √ν₀ + (1/4)ρ₁₂ξ₁ν₀^{α₁−1}(k − x₀) − (3/16)ρ₁₂²ξ₁²ν₀^{2α₁−5/2}(k − x₀)²
  + (1/32)[8κ₁(ν′₀ − ν₀)ν₀^{−1/2} + 4ρ₁₂ξ₁ν₀^{α₁} + 3ρ₁₂²ξ₁²ν₀^{2α₁−3/2}](T − t)
  + (1/128)[16κ₁(ν′₀ − ν₀)ρ₁₂ξ₁ν₀^{α₁−3} + 8ρ₁₂²ξ₁²ν₀^{2α₁−5/2} − 32ρ₁₂³ξ₁³ν₀^{3α₁−4}](k − x₀)³
  + (1/128)[24κ₁(ν′₀ − ν₀)ρ₁₂ξ₁ν₀^{α₁−2} + 12ρ₁₂²ξ₁²ν₀^{2α₁−3/2} + 35ρ₁₂³ξ₁³ν₀^{3α₁−3} − 8ρ₁₂ξ₁ν₀^{α₁}](T − t)(k − x₀)
  + o((T − t)^{3/2} + |k − x₀|³).
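The order-3 expansion adds only a cubic-moneyness term and a mixed (T − t)(k − x₀) term on top of the order-2 formula. The following standalone sketch transcribes the printed coefficients of Theorem 2 (again `nu0p` is ν′₀, and the test parameters are hypothetical):

```python
from math import sqrt

def sigma_order3(t, x0, nu0, nu0p, T, k, kappa1, xi1, rho12, alpha1):
    """Order-3 implied-volatility expansion of Theorem 2 (coefficients
    transcribed from the statement)."""
    tau = T - t
    m = k - x0
    # order-2 part (Theorem 1)
    out = (sqrt(nu0)
           + 0.25 * rho12 * xi1 * nu0**(alpha1 - 1.0) * m
           - (3.0 / 16.0) * rho12**2 * xi1**2 * nu0**(2*alpha1 - 2.5) * m**2
           + (1.0 / 32.0) * (8.0 * kappa1 * (nu0p - nu0) * nu0**-0.5
                             + 4.0 * rho12 * xi1 * nu0**alpha1
                             + 3.0 * rho12**2 * xi1**2 * nu0**(2*alpha1 - 1.5)) * tau)
    # order-3 corrections (Theorem 2)
    out += (1.0 / 128.0) * (16.0 * kappa1 * (nu0p - nu0) * rho12 * xi1 * nu0**(alpha1 - 3.0)
                            + 8.0 * rho12**2 * xi1**2 * nu0**(2*alpha1 - 2.5)
                            - 32.0 * rho12**3 * xi1**3 * nu0**(3*alpha1 - 4.0)) * m**3
    out += (1.0 / 128.0) * (24.0 * kappa1 * (nu0p - nu0) * rho12 * xi1 * nu0**(alpha1 - 2.0)
                            + 12.0 * rho12**2 * xi1**2 * nu0**(2*alpha1 - 1.5)
                            + 35.0 * rho12**3 * xi1**3 * nu0**(3*alpha1 - 3.0)
                            - 8.0 * rho12 * xi1 * nu0**alpha1) * tau * m
    return out
```

At the money (m = 0) both new terms vanish, so the order-3 approximation agrees with the order-2 one there.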

1.4 A Sketch of a Proof

1.4.1 Preliminaries

Introduce the following notation: z = (x, ν, ν′)ᵀ. The infinitesimal generator of the system (1.1) has the form

A = (1/2) Σ_{i,j=1}^{3} a_{ij}(z) ∂²/(∂z_i∂z_j) + Σ_{i=1}^{3} a_i(z) ∂/∂z_i,

where

a₁₁(z) = ν,   a₁₂(z) = ρ₁₂ξ₁ν^{α₁+1/2},   a₁₃(z) = ρ₁₃ξ₂ν^{1/2}(ν′)^{α₂},
a₂₂(z) = ξ₁²ν^{2α₁},   a₂₃(z) = ρ₂₃ξ₁ξ₂ν^{α₁}(ν′)^{α₂},   a₃₃(z) = ξ₂²(ν′)^{2α₂},
a₁(z) = −ν/2,   a₂(z) = κ₁(ν′ − ν),   a₃(z) = κ₂(θ − ν′).
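For numerical work the coefficient functions of the generator can be tabulated directly. A minimal sketch (ours, not from the chapter; the parameter values in the test are hypothetical) that also makes the factorization a = D C D visible, where D collects the diffusion loadings and C is the correlation matrix:

```python
import numpy as np

def dmr_coefficients(z, kappa1, kappa2, theta, xi1, xi2,
                     alpha1, alpha2, rho12, rho13, rho23):
    """Diffusion matrix (a_ij) and drift (a_i) of the generator A
    at the point z = (x, nu, nu')."""
    x, nu, nup = z
    a = np.array([
        [nu,
         rho12 * xi1 * nu**(alpha1 + 0.5),
         rho13 * xi2 * nu**0.5 * nup**alpha2],
        [rho12 * xi1 * nu**(alpha1 + 0.5),
         xi1**2 * nu**(2 * alpha1),
         rho23 * xi1 * xi2 * nu**alpha1 * nup**alpha2],
        [rho13 * xi2 * nu**0.5 * nup**alpha2,
         rho23 * xi1 * xi2 * nu**alpha1 * nup**alpha2,
         xi2**2 * nup**(2 * alpha2)],
    ])
    drift = np.array([-nu / 2.0, kappa1 * (nup - nu), kappa2 * (theta - nup)])
    return a, drift
```

As long as the correlation matrix of (W¹, W², W³) is positive semidefinite, the diffusion matrix a(z) is symmetric positive semidefinite at every state with nonnegative variances.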

Following [4], fix z̄ = (x₀, ν₀, ν′₀)ᵀ. Pagliarani and Pascucci [4, Eq. 3.10] give the 0th order approximation of the implied volatility as

σ₀^{(z̄)}(t, T) = ( (1/(T − t)) ∫_t^T a₁₁(τ, z̄) dτ )^{1/2} = √ν₀.

For N ≥ 1, the Nth order approximation of the implied volatility is given by [4, Definition 3.4] as

σ_N(t, x, ν, ν′, T, k) = Σ_{n=0}^{N} σ_n^{(x,ν,ν′)}(t, x, ν, ν′, T, k),      (1.3)

where the terms σ_n^{(x,ν,ν′)}(t, x, ν, ν′, T, k) are defined by the recursive formula [4, Eq. 3.15]. We are interested only in the first three terms:

σ₁ = u₁^{(z̄)} / (∂u^{BS}(σ₀^{(z̄)})/∂σ),

σ₂ = u₂^{(z̄)} / (∂u^{BS}(σ₀^{(z̄)})/∂σ) − (1/2)σ₁² (∂²u^{BS}(σ₀^{(z̄)})/∂σ²) / (∂u^{BS}(σ₀^{(z̄)})/∂σ),

σ₃ = u₃^{(z̄)} / (∂u^{BS}(σ₀^{(z̄)})/∂σ) − σ₁σ₂ (∂²u^{BS}(σ₀^{(z̄)})/∂σ²) / (∂u^{BS}(σ₀^{(z̄)})/∂σ)
      − (1/6)σ₁³ (∂³u^{BS}(σ₀^{(z̄)})/∂σ³) / (∂u^{BS}(σ₀^{(z̄)})/∂σ),      (1.4)

where we omitted the arguments of the functions for simplicity. The terms u_i^{(z̄)}, 1 ≤ i ≤ 3, are the approximations of the option's price. Pagliarani and Pascucci [4, Theorem D.1] proved that u_n^{(z̄)}(t, z) = L̃_n^{(z̄)}(t, T, z) u^{BS}(σ₀^{(z̄)}), where

L̃_n^{(z̄)}(t, T, z) = Σ_{k=1}^{n} ∫_t^T ds₁ ∫_{s₁}^T ds₂ ⋯ ∫_{s_{k−1}}^T ds_k Σ_{i∈I_{n,k}} G_{i₁}^{(z̄)}(t, s₁, z) ⋯ G_{i_k}^{(z̄)}(t, s_k, z).

In this expression, I_{n,k} = { i = (i₁, …, i_k) ∈ ℕᵏ : i₁ + ⋯ + i_k = n }. The differential operators G_n^{(z̄)}(t, s, z) are as follows:

G_n^{(z̄)}(t, s, z) = A_n^{(z̄)}(s, z̄ + m^{(z̄)}(t, s) + C^{(z̄)}(t, s)∇_z),

where

A_n^{(z̄)}(t, z) = Σ_{|β|=n} [ Σ_{i,j=1}^{3} (D^β a_{ij}(t, z̄)/β!) (z − z̄)^β ∂²/(∂z_i∂z_j) + Σ_{i=1}^{3} (D^β a_i(t, z̄)/β!) (z − z̄)^β ∂/∂z_i ],

and the components of the vector m^{(z̄)}(t, s) and the entries of the matrix C^{(z̄)}(t, s) are given by m_i^{(z̄)}(t, s) = ∫_t^s a_i(u, z̄) du and C_{ij}^{(z̄)}(t, s) = ∫_t^s a_{ij}(u, z̄) du. Lorig et al. proved in [3, Eq. (3.13)] that

u_n^{(z̄)}(t, z) / (∂u^{BS}(σ₀^{(z̄)})/∂σ) = L̃_n(t, T) u^{BS}(σ₀^{(z̄)}) / [ (T − t) σ₀^{(z̄)} (∂²/∂x² − ∂/∂x) u^{BS}(σ₀^{(z̄)}) ].      (1.5)
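The denominator above exploits the vega-gamma identity of the Black–Scholes price, ∂u^{BS}/∂σ = (T − t)σ(∂²/∂x² − ∂/∂x)u^{BS}, which holds exactly in the zero-rate log coordinates used here. A short sanity check of our own (not from the chapter), using finite differences:

```python
from math import erf, exp, sqrt

def norm_cdf(d):
    return 0.5 * (1.0 + erf(d / sqrt(2.0)))

def u_bs(sigma, tau, x, k):
    """Black-Scholes call price with zero rates, tau = T - t."""
    sp = sigma * sqrt(tau)
    d_plus = (x - k) / sp + sp / 2.0
    d_minus = d_plus - sp
    return exp(x) * norm_cdf(d_plus) - exp(k) * norm_cdf(d_minus)

def vega_fd(sigma, tau, x, k, h=1e-5):
    """Central finite difference for du/dsigma."""
    return (u_bs(sigma + h, tau, x, k) - u_bs(sigma - h, tau, x, k)) / (2 * h)

def gamma_minus_delta_fd(sigma, tau, x, k, h=1e-4):
    """Central finite differences for (d^2/dx^2 - d/dx) u."""
    d2 = (u_bs(sigma, tau, x + h, k) - 2 * u_bs(sigma, tau, x, k)
          + u_bs(sigma, tau, x - h, k)) / h**2
    d1 = (u_bs(sigma, tau, x + h, k) - u_bs(sigma, tau, x - h, k)) / (2 * h)
    return d2 - d1
```

Both sides reduce analytically to √τ e^x φ(d₊), so the finite-difference values should agree to well within the discretization error.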

In this expression, the differential operators L̃_n(t, T) are as follows:

L̃_n(t, T) = Σ_{k=1}^{n} ∫_t^T dt₁ ∫_{t₁}^T dt₂ ⋯ ∫_{t_{k−1}}^T dt_k Σ_{i∈I_{n,k}} G_{i₁}(t, t₁) ⋯ G_{i_{k−1}}(t, t_{k−1})
      × a_{(2,0,0),i_k}(t, M_x(t, t_k), M_y(t, t_k), M_z(t, t_k)).      (1.6)

The differential operators G_i(t, t_m) are as follows:

G_i(t, t_m) = Σ_{1≤|α|≤2} a_{α,i}(t_m, M_x(t, t_m), M_y(t, t_m), M_z(t, t_m)) ∂^{|α|}/∂z^α,      (1.7)

where α = (α₁, α₂, α₃)ᵀ, |α| = α₁ + α₂ + α₃, z^α = z₁^{α₁} z₂^{α₂} z₃^{α₃}, and the components α₁, α₂, and α₃ are nonnegative integers. The coefficients a_{α,i}(t, z) have the form

a_{α,i}(t, z) = Σ_{|β|=i} (1/β!) (∂^{|β|} a_α(t, z̄)/∂z^β) (z − z̄)^β.

The functions M_x(t, t_m), M_y(t, t_m), and M_z(t, t_m) are the three components of the vector-valued function z̄ + m^{(z̄)}(t, s) + C^{(z̄)}(t, s)∇_z. In order to calculate σ₂ and σ₃, we use [4, Proposition C.4]: for any n ≥ 2,

(∂ⁿu^{BS}(σ₀^{(z̄)})/∂σⁿ) / (∂u^{BS}(σ₀^{(z̄)})/∂σ)
  = Σ_{q=0}^{⌊n/2⌋} Σ_{p=0}^{n−q−1} c_{n,n−2q} (σ₀^{(z̄)})^{n−2q−1} τ^{n−q−1} (1/(σ₀^{(z̄)}√(2τ)))^{p+n−q−1} (n−q−1 choose p) H_{p+n−q−1}(ζ),      (1.8)

where τ = T − t, ζ = (x − k − (σ₀^{(z̄)})²τ/2)/(σ₀^{(z̄)}√(2τ)), the coefficients c_{n,n−2q} are defined recursively by c_{n,n} = 1 and c_{n,n−2q} = (n − 2q + 1)c_{n−1,n−2q+1} + c_{n−1,n−2q−1}, and H_n(ζ) = (−1)ⁿ exp(ζ²) (dⁿ/dζⁿ) exp(−ζ²) is the nth "physical" Hermite polynomial. Note that the calculations of the asymptotic expansion up to order 2 were explained in [1]. We now explain the next order.
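The two combinatorial ingredients of (1.8) are easy to generate programmatically. A sketch of our own (not from the chapter): the physical Hermite polynomials via the standard recurrence H_{n+1}(ζ) = 2ζH_n(ζ) − 2nH_{n−1}(ζ), which is equivalent to the Rodrigues-type definition above, together with the recursion for c_{n,n−2q}:

```python
def hermite_phys(n, z):
    """Physical Hermite polynomial H_n(z):
    H_0 = 1, H_1 = 2z, H_{m+1} = 2z H_m - 2m H_{m-1}."""
    h_prev, h_cur = 1.0, 2.0 * z
    if n == 0:
        return h_prev
    for m in range(1, n):
        h_prev, h_cur = h_cur, 2.0 * z * h_cur - 2.0 * m * h_prev
    return h_cur

def c_coeff(n, m):
    """Coefficients c_{n,m} with c_{n,n} = 1 and, for m = n - 2q,
    c_{n,m} = (m + 1) c_{n-1,m+1} + c_{n-1,m-1};
    c_{n,m} = 0 when m and n have different parity or m is out of range."""
    if m < 0 or m > n or (n - m) % 2 != 0:
        return 0.0
    if n == m:
        return 1.0
    return (m + 1) * c_coeff(n - 1, m + 1) + c_coeff(n - 1, m - 1)
```

For the small orders needed here (n ≤ 3) the naive recursion is perfectly adequate; memoization would be needed only for much larger n.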


1.4.2 Order 3 To calculate σ3 , we use the third display equation in (1.4). The coefficients σ1 and σ2 have been calculated in [1] as α +1/2

ν t − ν0 ρ12 ξ1 ν0 1 κ1 (ν0 − ν0 )τ 2 − (x − k − σ τ /2) + , 0 2σ0 4σ0 4σ03 1 2 2 2α1 +1 1 α +1/2 3/2 σ2 = ρ ξ ν τ (4ζ 4 − 12ζ 2 + 3) − √ 4 ρ12 ξ1 ν0 1 τ (2ζ 3 − 3ζ) 5 12 1 0 32σ0 8 2σ0 1 1 − √ 4 ρ212 ξ12 ν02α1 +1 τ 3/2 (2ζ 3 − 3ζ) − √ 3 κ21 (ν0 − ν0 )2 τ 2 16 2σ0 32 2σ0 1 1 − √ 3 ρ212 ξ12 ν02α1 +1 τ 2 (2ζ 2 − 1) − √ 2 ρ212 ξ12 ν02α1 +1 τ 7/2 ζ 3 16 2σ0 8 2σ0 1 1 α +1/2 3 2 − 3 ρ212 ξ12 ν02α1 +1 τ 3 ζ 4 + 2 κ1 (ν0 − ν0 )ρ12 ξ1 ν0 1 τ ζ 8σ0 8σ0 1 α +1/2 5/2 3 + √ 3 κ1 (ν0 − ν0 )ρ12 ξ1 ν0 1 τ ζ . 4 2σ0 σ1 =

The sets $I_{3,k}$ have the form $I_{3,1}=\{(3)\}$, $I_{3,2}=\{(1,2),(2,1)\}$, $I_{3,3}=\{(1,1,1)\}$. Note that $a_{(2,0,0),2}=a_{(2,0,0),3}=0$, which makes the terms with $k=1$ and $(k=2,\ i=(1,2))$ in (1.6) equal to $0$. Equation (1.7) gives $G_2(t,t_m)=0$, which excludes the term with $(k=2,\ i=(2,1))$. We obtain

$$
\tilde{L}_3(t,T)=\int_t^T dt_1\int_{t_1}^T dt_2\int_{t_2}^T dt_3\, G_1(t,t_1)G_1(t,t_2)\,a_{(2,0,0),1}(t,M_x(t,t_3),M_y(t,t_3),M_z(t,t_3)).
$$

Observe that the differential operator $\tilde{L}_3(t,T)$ will be applied to the Black–Scholes price, which does not depend on $\nu$ and $\nu'$. We denote by $\cdots$ all terms that contain partial derivatives with respect to the above two variables. In particular, we have

$$
G_1(t,t_m)=a_{(2,0,0),1}(t,M_x(t,t_m),M_y(t,t_m),M_z(t,t_m))\left(\frac{\partial^2}{\partial x^2}-\frac{\partial}{\partial x}\right)+\cdots,
$$

with

$$
a_{(2,0,0),1}(t,M_x(t,t_m),M_y(t,t_m),M_z(t,t_m))=\frac12\left[\nu-\nu_0+(t_m-t)\kappa_1(\nu'-\nu)+(t_m-t)\rho_{12}\xi_1\nu_0^{\alpha_1+1/2}\frac{\partial}{\partial x}\right]+\cdots.
$$

Introduce the following notation:

$$
A(s,t)=\nu-\nu_0+(s-t)\kappa_1(\nu'-\nu),\qquad B(s,t)=(s-t)\rho_{12}\xi_1\nu_0^{\alpha_1+1/2}.
$$

1 An Improved Asymptotics of Implied Volatility …


After simple algebraic manipulations, the operator $\tilde{L}_3(t,T)$ takes the form

$$
\begin{aligned}
\tilde{L}_3(t,T)=\frac18\int_t^T\!\!\int_{t_1}^T\!\!\int_{t_2}^T\Bigl\{\,&B_1B_2B_3\,\frac{\partial^7}{\partial x^7}
+\bigl[A_1B_2B_3+B_1A_2B_3+B_1B_2A_3-2B_1B_2B_3\bigr]\frac{\partial^6}{\partial x^6}\\
+\,&\bigl[B_1B_2B_3+A_1A_2B_3+A_1B_2A_3+B_1A_2A_3-2A_1B_2B_3-2B_1A_2B_3-2B_1B_2A_3\bigr]\frac{\partial^5}{\partial x^5}\\
+\,&\bigl[A_1B_2B_3+B_1A_2B_3+B_1B_2A_3+A_1A_2A_3-2A_1A_2B_3-2A_1B_2A_3-2B_1A_2A_3\bigr]\frac{\partial^4}{\partial x^4}\\
+\,&\bigl[A_1A_2B_3+A_1B_2A_3+B_1A_2A_3-2A_1A_2A_3\bigr]\frac{\partial^3}{\partial x^3}
+A_1A_2A_3\,\frac{\partial^2}{\partial x^2}\Bigr\}\,dt_3\,dt_2\,dt_1,
\end{aligned}
$$

where $A_i=A(t_i,t)$ and $B_i=B(t_i,t)$.

To simplify this expression, the following integrals are important:

$$
\begin{aligned}
\int_t^T\!\!\int_{t_1}^T\!\!\int_{t_2}^T dt_3\,dt_2\,dt_1&=\frac{(T-t)^3}{6}, &
\int_t^T\!\!\int_{t_1}^T\!\!\int_{t_2}^T (t_1-t)\,dt_3\,dt_2\,dt_1&=\frac{(T-t)^4}{24},\\
\int_t^T\!\!\int_{t_1}^T\!\!\int_{t_2}^T (t_2-t)\,dt_3\,dt_2\,dt_1&=\frac{(T-t)^4}{12}, &
\int_t^T\!\!\int_{t_1}^T\!\!\int_{t_2}^T (t_3-t)\,dt_3\,dt_2\,dt_1&=\frac{(T-t)^4}{8},\\
\int_t^T\!\!\int_{t_1}^T\!\!\int_{t_2}^T (t_1-t)(t_2-t)\,dt_3\,dt_2\,dt_1&=\frac{(T-t)^5}{40}, &
\int_t^T\!\!\int_{t_1}^T\!\!\int_{t_2}^T (t_1-t)(t_3-t)\,dt_3\,dt_2\,dt_1&=\frac{(T-t)^5}{30},\\
\int_t^T\!\!\int_{t_1}^T\!\!\int_{t_2}^T (t_2-t)(t_3-t)\,dt_3\,dt_2\,dt_1&=\frac{(T-t)^5}{15}, &
\int_t^T\!\!\int_{t_1}^T\!\!\int_{t_2}^T (t_1-t)(t_2-t)(t_3-t)\,dt_3\,dt_2\,dt_1&=\frac{(T-t)^6}{48}.
\end{aligned}
$$
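These iterated integrals can be verified symbolically; a minimal check (SymPy assumed available, helper name ours):

```python
# Symbolic verification of the iterated integrals over t <= t1 <= t2 <= t3 <= T.
import sympy as sp

t, T, t1, t2, t3 = sp.symbols('t T t1 t2 t3')

def triple(integrand):
    inner = sp.integrate(integrand, (t3, t2, T))
    middle = sp.integrate(inner, (t2, t1, T))
    return sp.integrate(middle, (t1, t, T))

assert sp.simplify(triple(1) - (T - t)**3 / 6) == 0
assert sp.simplify(triple(t3 - t) - (T - t)**4 / 8) == 0
assert sp.simplify(triple((t1 - t) * (t2 - t) * (t3 - t)) - (T - t)**6 / 48) == 0
print('all integrals check out')
```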


Equation (1.5) with $n=3$ gives

$$
\frac{u_3^{(z)}(t,z)}{\partial u^{BS(z)}(\sigma_0)/\partial\sigma}
=\sum_{m=2}^{7}\chi_m^{(3)}(\tau,\nu,\nu')\,
\frac{\dfrac{\partial^m}{\partial x^m}\left(\dfrac{\partial^2}{\partial x^2}-\dfrac{\partial}{\partial x}\right)u^{BS}(\sigma_0^{(z)})}
{\left(\dfrac{\partial^2}{\partial x^2}-\dfrac{\partial}{\partial x}\right)u^{BS}(\sigma_0^{(z)})},
\tag{1.9}
$$

where

$$
\begin{aligned}
\chi_2^{(3)}(\tau,\nu,\nu')&=\frac{1}{48}\tau^2(\nu-\nu_0)^3\nu_0^{-1/2}+\frac{1}{32}\tau^3(\nu-\nu_0)^2\kappa_1(\nu'-\nu)\nu_0^{-1/2}\\
&\quad+\frac{1}{64}\tau^4(\nu-\nu_0)\kappa_1^2(\nu'-\nu)^2\nu_0^{-1/2}+\frac{1}{384}\tau^5\kappa_1^3(\nu'-\nu)^3\nu_0^{-1/2},\\
\chi_3^{(3)}(\tau,\nu,\nu')&=-\frac{1}{24}\tau^2(\nu-\nu_0)^3\nu_0^{-1/2}-\frac{1}{16}\tau^3(\nu-\nu_0)^2\kappa_1(\nu'-\nu)\nu_0^{-1/2}\\
&\quad-\frac{1}{32}\tau^4(\nu-\nu_0)\kappa_1^2(\nu'-\nu)^2\nu_0^{-1/2}-\frac{1}{192}\tau^5\kappa_1^3(\nu'-\nu)^3\nu_0^{-1/2}\\
&\quad+\frac{1}{32}\tau^3\rho_{12}\xi_1\nu_0^{\alpha_1}(\nu-\nu_0)+\frac{1}{128}\tau^5\rho_{12}\xi_1\nu_0^{\alpha_1}\kappa_1^2(\nu'-\nu)^2\\
&\quad+\frac{1}{32}\tau^4\rho_{12}\xi_1\nu_0^{\alpha_1}(\nu-\nu_0)(\nu'-\nu),\\
\chi_4^{(3)}(\tau,\nu,\nu')&=-\frac{1}{16}\tau^3\rho_{12}\xi_1\nu_0^{\alpha_1}(\nu-\nu_0)^2-\frac{1}{16}\tau^4\rho_{12}\xi_1\nu_0^{\alpha_1}(\nu-\nu_0)\kappa_1(\nu'-\nu)\\
&\quad-\frac{1}{64}\tau^5\rho_{12}\xi_1\nu_0^{\alpha_1}\kappa_1^2(\nu'-\nu)^2+\frac{1}{64}\tau^4\rho_{12}^2\xi_1^2\nu_0^{2\alpha_1+1/2}(\nu-\nu_0)\\
&\quad+\frac{1}{128}\tau^5\rho_{12}^2\xi_1^2\nu_0^{2\alpha_1+1/2}\kappa_1(\nu'-\nu)+\frac{1}{48}\tau^2\nu_0^{-1/2}(\nu-\nu_0)^3\\
&\quad+\frac{1}{32}\tau^3\nu_0^{-1/2}(\nu-\nu_0)^2\kappa_1(\nu'-\nu)+\frac{1}{64}\tau^4\nu_0^{-1/2}(\nu-\nu_0)\kappa_1^2(\nu'-\nu)^2\\
&\quad+\frac{1}{384}\tau^5\nu_0^{-1/2}\kappa_1^3(\nu'-\nu)^3,\\
\chi_5^{(3)}(\tau,\nu,\nu')&=-\frac{1}{32}\tau^4\rho_{12}^2\xi_1^2\nu_0^{2\alpha_1+1/2}(\nu-\nu_0)+\frac{1}{128}\tau^5\rho_{12}\xi_1\nu_0^{\alpha_1}\kappa_1^2(\nu'-\nu)^2\\
&\quad+\frac{1}{32}\tau^4\rho_{12}\xi_1\nu_0^{\alpha_1}(\nu-\nu_0)(\nu'-\nu)-\frac{1}{64}\tau^5\rho_{12}^2\xi_1^2\nu_0^{2\alpha_1+1/2}\kappa_1(\nu'-\nu)\\
&\quad+\frac{1}{32}\tau^3\rho_{12}\xi_1\nu_0^{\alpha_1}(\nu-\nu_0)^2+\frac{1}{384}\tau^5\rho_{12}^3\xi_1^3\nu_0^{3\alpha_1+1},\\
\chi_6^{(3)}(\tau,\nu,\nu')&=\frac{1}{64}\tau^4\rho_{12}^2\xi_1^2\nu_0^{2\alpha_1+1/2}(\nu-\nu_0)-\frac{1}{192}\tau^5\rho_{12}^3\xi_1^3\nu_0^{3\alpha_1+1}\\
&\quad+\frac{1}{128}\tau^5\rho_{12}^2\xi_1^2\nu_0^{2\alpha_1+1/2}\kappa_1(\nu'-\nu),\\
\chi_7^{(3)}(\tau,\nu,\nu')&=\frac{1}{384}\tau^5\rho_{12}^3\xi_1^3\nu_0^{3\alpha_1+1}.
\end{aligned}
$$

Using Eq. (1.8), we calculate the ratio (1.9):

$$
\frac{u_3^{(z)}(t,z)}{\partial u^{BS(z)}(\sigma_0)/\partial\sigma}
=\sum_{m=2}^{7}\chi_m^{(3)}(\tau,\nu,\nu')\left(\frac{1}{\sqrt{2\nu_0\tau}}\right)^m H_m(\zeta),
$$

and then $\sigma_3$ by the third equation in (1.4). Observe that the right hand side of the equation for $\chi_{2,3}^{(z)}(t,z,T,k)$ has 4 terms. Each of them is multiplied by $4\zeta^2-2$, which also contains 4 terms. Overall, we have 285 terms here. Which of them give a contribution to the asymptotic expansion? We explain this issue using $\chi_{2,3}^{(z)}(t,z,T,k)$ as an example. First, the terms with nonzero powers of $\nu-\nu_0$ give no contribution, for the same reason as previously. The only term which may give a contribution is $\frac{1}{384}(T-t)^5\kappa_1^3(\nu'-\nu)^3\nu_0^{-1/2}$. After multiplying by $\left(\frac{1}{\sigma_0\sqrt{2\tau}}\right)^2$, it becomes $\frac{1}{768}(T-t)^4\kappa_1^3(\nu'-\nu)^3\nu_0^{-3/2}$. Write $\zeta$ as follows:

$$
\zeta=\frac{x-k}{\sigma_0\sqrt{2(T-t)}}-\frac{\sigma_0\sqrt{T-t}}{2\sqrt2}.
$$

Checking all 16 terms of the product $\frac{1}{768}(T-t)^4\kappa_1^3(\nu'-\nu)^3\nu_0^{-3/2}(4\zeta^2-2)$, we see that no term has either the form $C(k-x_0)^3$ or $C(k-x_0)(T-t)$. There are no contributions to the asymptotic expansion here. We continue in the same way and find two contributions from the term $\left(\frac{1}{\sigma_0\sqrt{2\tau}}\right)^7\chi_{7,3}^{(z)}(t,z,T,k)H_7(\zeta)$. The first one is $\left(\frac{1}{\sigma_0\sqrt{2\tau}}\right)^7\chi_{7,3}^{(z)}(t,z,T,k)\times 3360\left(\frac{x-k}{\sigma_0\sqrt{2(T-t)}}\right)^3$, which gives the third coefficient in $(k-x_0)^3$. The second one is $\left(\frac{1}{\sigma_0\sqrt{2\tau}}\right)^7\chi_{7,3}^{(z)}(t,z,T,k)\times(-1680)\frac{x-k}{\sigma_0\sqrt{2(T-t)}}$, which gives the third coefficient in $(T-t)(k-x_0)$.

The next step is to calculate $\bar\sigma_3$ using Eq. (1.3). So far we have calculated the contribution of the first term in the right hand side of the third equation in (1.3). We proceed to the second term. The terms $\sigma_1^{(z)}(t,x,y_2,T,k)$ and $\sigma_2^{(z)}(t,x,y_2,T,k)$ were calculated in [1]:

$$
\begin{aligned}
\sigma_1^{(z)}(t,x,y_2,T,k)&=\frac{y_2-v(t)}{2\sqrt{v(t)}}+\frac{\kappa_1(\bar v(t)-v(t))\tau}{4\sqrt{v(t)}}-\frac18\rho_{12}\xi_1 v^{\alpha_1-1}(t)\,(2x-2k-v(t)\tau),\\
\sigma_2^{(z)}(t,x,y_2,T,k)&=\sqrt{v_0}+\frac{1}{32}\bigl[8\kappa_1(\bar v_0-v(t))v_0^{-1/2}+4\rho_{12}\xi_1 v_0^{\alpha_1}+3\rho_{12}^2\xi_1^2 v_0^{2\alpha_1-3/2}\bigr](T-t)\\
&\quad-\frac14\rho_{12}\xi_1 v_0^{\alpha_1-1}(x_0-k)-\frac{3}{16}\rho_{12}^2\xi_1^2 v_0^{2\alpha_1-5/2}(x_0-k)^2\\
&\quad+\frac{3}{32}\bigl[2\kappa_1(\bar v_0-v_0)\rho_{12}\xi_1 v_0^{\alpha_1-2}+\rho_{12}^2\xi_1^2 v_0^{2\alpha_1-3/2}\bigr](T-t)(x_0-k)\\
&\quad-\frac{1}{32}\bigl[2\kappa_1(\bar v_0-v_0)\rho_{12}\xi_1 v_0^{\alpha_1-1}+\kappa_1^2(\bar v_0-v_0)^2 v_0^{-3/2}\bigr](T-t)^2\\
&\quad+\frac{1}{128}\bigl[2\kappa_1(\bar v_0-v_0)\rho_{12}\xi_1 v_0^{\alpha_1-1}+4\kappa_1^2(\bar v_0-v_0)^2 v_0^{-3/2}+3\rho_{12}^2\xi_1^2 v_0^{2\alpha_1-1/2}\bigr](T-t)^2(x_0-k).
\end{aligned}
\tag{1.10}
$$


For the last term, Eq. (1.8) gives

$$
\begin{aligned}
\frac{\partial^2 u^{BS}(\sigma)/\partial\sigma^2}{\partial u^{BS}(\sigma)/\partial\sigma}
&=\sum_{q=0}^{1}c_{2,2-2q}\,\sigma^{1-2q}(T-t)^{1-q}\sum_{p=0}^{1-q}\binom{1-q}{p}\left(\frac{1}{\sigma\sqrt{2(T-t)}}\right)^{p+1-q}H_{p+1-q}(\zeta)\\
&=\sigma(T-t)\left[\frac{H_1(\zeta)}{\sigma\sqrt{2(T-t)}}+\frac{H_2(\zeta)}{2\sigma^2(T-t)}\right]+\sigma^{-1}.
\end{aligned}
$$

Using the values of the Hermite polynomials, we find

$$
\frac{\partial^2 u^{BS}(\sigma)/\partial\sigma^2}{\partial u^{BS}(\sigma)/\partial\sigma}
=\frac{1}{\sigma^3}(x-k)^2(T-t)^{-1}-\frac{\sigma}{4}(T-t).
$$

The product of the 3 terms in $\sigma_1^{(z)}(t,x,y_2,T,k)$ and of the 7 terms in $\sigma_2^{(z)}(t,x,y_2,T,k)$ in (1.10) and the 2 terms in the last display contains 42 terms. Of these, three terms contribute to the coefficient of $(k-x_0)^3$ and one term contributes to the last coefficient in $(T-t)(k-x_0)$. We proceed to calculate the term $\dfrac{\partial^3 u^{BS}(\sigma)/\partial\sigma^3}{\partial u^{BS}(\sigma)/\partial\sigma}$. Equation (1.8) gives

$$
\begin{aligned}
\frac{\partial^3 u^{BS}(\sigma)/\partial\sigma^3}{\partial u^{BS}(\sigma)/\partial\sigma}
&=\sum_{q=0}^{1}c_{3,3-2q}\,\sigma^{2-2q}\tau^{2-q}\sum_{p=0}^{2-q}\binom{2-q}{p}\left(\frac{1}{\sigma\sqrt{2\tau}}\right)^{p+2-q}H_{p+2-q}(\zeta)\\
&=\sigma^2(T-t)^2\left[\frac{H_2(\zeta)}{2\sigma^2(T-t)}+\frac{H_3(\zeta)}{\sqrt2\,\sigma^3(T-t)^{3/2}}+\frac{H_4(\zeta)}{4\sigma^4(T-t)^2}\right]\\
&\quad+3(T-t)\left[\frac{H_1(\zeta)}{\sigma\sqrt{2(T-t)}}+\frac{H_2(\zeta)}{2\sigma^2(T-t)}\right]\\
&=\frac{3}{\sqrt2}\,\sigma^{-1}(T-t)^{1/2}H_1(\zeta)+\frac12\bigl(T-t+3\sigma^{-2}\bigr)H_2(\zeta)\\
&\quad+\frac{1}{\sqrt2}\,\sigma^{-1}(T-t)^{1/2}H_3(\zeta)+\frac14\,\sigma^{-2}H_4(\zeta).
\end{aligned}
$$

Using the values of the Hermite polynomials, we obtain

$$
\frac{\partial^3 u^{BS}(\sigma)/\partial\sigma^3}{\partial u^{BS}(\sigma)/\partial\sigma}
=-(T-t)-3\sqrt2\,\sigma^{-1}(T-t)^{1/2}\zeta+\bigl[2(T-t)-6\sigma^{-2}\bigr]\zeta^2
+4\sqrt2\,\sigma^{-1}(T-t)^{1/2}\zeta^3+4\sigma^{-2}\zeta^4,
$$

and finally,

$$
\frac{\partial^3 u^{BS}(\sigma)/\partial\sigma^3}{\partial u^{BS}(\sigma)/\partial\sigma}
=\frac{1}{\sigma^6}(x-k)^4(T-t)^{-2}-\frac{1}{2\sigma^2}(x-k)^2-\frac{3}{\sigma^4}(x-k)^2(T-t)^{-1}
+\frac{\sigma^2}{16}(T-t)^2-\frac14(T-t).
$$

The term $(\sigma_1^{(z)}(t,x,y_2,T,k))^3$ contains 4 parts. Its product with the right hand side of the last display contains 20 terms, and none of them gives any contribution to the asymptotic expansion of order 3.

1.5 Conclusions and Future Work

In this paper, we calculated the asymptotic expansion of the implied volatility in the Gatheral model up to the third order in |k − x0| and 1.5th order in |T − t|. In a subsequent publication, we plan to perform a numerical investigation of the implied volatility.

Acknowledgements This work was funded by The Saudi Arabia Cultural Bureau in Germany (SACUOF) in cooperation with the scholarship program of the Saudi Arabian Ministry of Education (MOE). Moreover, this work would not have been accomplished without the support of the International Science Programme (ISP) at Uppsala University which handles the coordination of collaboration between Al-Baha University, Saudi Arabia and Mälardalen University, Sweden.

References

1. Albuhayri, M., Malyarenko, A., Silvestrov, S., Ni, Y., Engström, C., Tewolde, F., Zhang, J.: Asymptotics of implied volatility in the Gatheral double stochastic volatility model. In: Dimotikalis, Y., Karagrigoriou, A., Parpoula, C., Skiadas, C. (eds.) Applied Modeling Techniques and Data Analysis. iSTE Wiley (2020)
2. Gatheral, J.: Consistent Modelling of SPX and VIX Options. Bachelier Congress (2008)
3. Lorig, M., Pagliarani, S., Pascucci, A.: Explicit implied volatilities for multifactor local-stochastic volatility models. Math. Finance 27(3), 926–960 (2017)
4. Pagliarani, S., Pascucci, A.: The exact Taylor formula of the implied volatility. Finance Stoch. 21(3), 661–718 (2017)

Chapter 2

Ruin Probability for Merged Risk Processes with Correlated Arrivals

Mohammad Jamsher Ali and Kalev Pärna

Abstract In this paper the ruin probability of the sum of two classical risk processes is studied under the assumption that the claim size distributions are of phase type and that the two Poisson processes of claim arrivals are correlated. The correlation between the two claim number processes is modeled by the Common Shock Model. We represent the merged risk process as a classical compound Poisson risk process in which the initial claim size distributions are replaced by a new, properly chosen phase type distribution. This allows us to construct an exact formula for the ruin probability of the merged risk process.

Keywords Poisson process · Poisson risk process · Ruin probability

MSC 2020 60G55 · 91B05

2.1 Introduction

In actuarial science, calculation of ruin probabilities for risk processes is a central issue. The problem has been studied by a number of researchers who have used different models and methods. A basic model for the surplus of an insurance company is the risk process defined by

$$
R(t)=pt-\sum_{i=1}^{N(t)}X_i, \tag{2.1}
$$

M. J. Ali (B) · K. Pärna University of Tartu, Tartu, Estonia e-mail: [email protected] K. Pärna e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_2



where the constant $p>0$ is interpreted as the premium rate, the $X_i$'s are claims, and $N(t)$ is the number of claims within the time interval $(0,t]$. If $\{X_i\}$ are independent and identically distributed (i.i.d.) random variables and the process $N(t)$ is a homogeneous Poisson process with intensity $\lambda$, independent of $\{X_i\}$, the process $R(t)$ is called a classical (or compound Poisson, or Cramér–Lundberg) risk process. The ruin probability for a risk process $R(t)$ with initial capital $u\ge 0$ is defined as

$$
\Psi(u)=P\{u+R(t)<0 \text{ for some } t\in(0,\infty)\}.
$$

In their pioneering works, about a century ago, Lundberg [8] and Cramér [6] derived an exact formula for the ruin probability in the case of exponentially distributed claims, and obtained an asymptotic formula for a wide class of other claim distributions. In the exponential case, $X_i\sim \mathrm{Exp}(\mu)$, the ruin probability is

$$
\Psi(u)=\frac{1}{1+\rho}\,e^{-\frac{\rho\mu u}{1+\rho}},
$$

where $\rho$ denotes the relative safety loading, $\rho=p\mu/\lambda-1$. Later on, among some other exact results, an explicit formula for the ruin probability has been obtained for claims with phase type (PH) distribution, a far-reaching generalization of the exponential distribution [9]. That formula (derived in a queueing context) will be given in a subsequent section, and it plays a key role in this paper. A comprehensive treatment of calculation and estimation of ruin probabilities under various modifications of the classical risk model is given in the monograph by Asmussen and Albrecher [2].

More recently, researchers have shown an increased interest in the estimation of ruin probabilities connected with several risk processes (2.1) running in parallel. The motivation comes from the necessity and possibility to handle dependencies between different portfolios or businesses. As a typical example, a part of traffic accidents generates claims in both motor and health insurance. An overview of possible approaches to handling the multivariate case is presented in [1]. More formally, suppose that we have two simultaneously running risk processes of type (2.1):

$$
R_1(t)=p_1t-\sum_{i=1}^{N_1(t)}X_i,\qquad R_2(t)=p_2t-\sum_{i=1}^{N_2(t)}Y_i, \tag{2.2}
$$

where the $X_i$'s are i.i.d. with common distribution function $F_X$ and mean $m_X=EX_i$, and the $Y_i$'s are i.i.d. with common distribution function $F_Y$ and mean $m_Y=EY_i$. Note that one single (univariate) counting process $N(t)=N_i(t)$ can serve all the component processes (the case often called the 'multivariate risk process'; see e.g. [2], p. 435). However, we do not restrict ourselves to this particular case and allow different (and possibly dependent) $N_i(t)$. In the case of a multivariate risk process one can define various ruin-related events, e.g., (i) ruin of the aggregate (merged) process $R(t)=\sum_i R_i(t)$, (ii) ruin of at least


one component process, (iii) ruin of all component processes (possibly at different times), and (iv) ruin of all component processes simultaneously.

In the multivariate setting, the case of two risk processes has received special attention. For instance, in [10] a bivariate risk model is considered and the probability that at least one class of business becomes ruined is examined; an approximate formula is obtained by using a binomial process in discrete time. In [11], the same authors deduced an expression for the probability of ruin of the merged process by considering correlated claim arrival processes with a common component that follows an Erlang process, while the unique components are Poisson processes; moreover, in their model the claim sizes are exponentially distributed. Note that the special case where $N_1(t)=N_2(t)=:N(t)$ is the focus of [5], where the total surplus of the bivariate compound Poisson model was under study. The authors obtained an explicit formula for the ruin probability assuming that the claims of the different processes are stochastically dependent and have a phase type joint distribution. In [4], the authors extend this idea by allowing dependencies between claim sizes. They showed that if the dependency of the claim sizes increases, then the ruin probability varies. They focused on some computable bounds for the ruin probabilities and examined the performance of these bounds for multivariate compound Poisson risk models where the claim sizes follow the Marshall–Olkin exponential distribution. However, they argued that even in the simplest case, when the claims follow a multivariate exponential distribution, the explicit expressions of the ruin probability are intractable. A rather general multivariate risk process is considered in [3], where the counting process is a multivariate continuous-time Markov chain of pure birth type with birth rates that depend on the current state of the process. Using the 'fluid limits' obtained, the Cramér asymptotics for ruin probabilities of the aggregated risk process are derived for several subclasses of the general model. In [7] several models for correlated Poisson processes are discussed, including the Common Shock Model.

In this paper, we consider the merged risk process obtained by adding the two processes in (2.2):

$$
R(t)=pt-\sum_{i=1}^{N_1(t)}X_i-\sum_{i=1}^{N_2(t)}Y_i, \tag{2.3}
$$

where $N_1(t)$ and $N_2(t)$ are assumed to be Poisson processes. We also assume that the claims $\{X_i\}$ and $\{Y_i\}$ are independent and, taken together, they are independent of $\{N_1(t),N_2(t)\}$. The claim distributions $F_X$ and $F_Y$ are supposed to be phase type (phase type distributions are described in the next section). However, we allow the processes $N_1(t)$ and $N_2(t)$ to be correlated by assuming the following dependence structure:

$$
N_1(t)=J_1(t)+J_3(t),\qquad N_2(t)=J_2(t)+J_3(t),
$$

where $J_1(t)$, $J_2(t)$, and $J_3(t)$ are independent Poisson processes with intensities $\lambda_1$, $\lambda_2$, and $\lambda_3$. The incorporation of the common part $J_3(t)$ in both claim number processes leads to an obvious correlation between $N_1(t)$ and $N_2(t)$ and is sometimes called the Common Shock Model. For example, if the risk processes $R_1(t)$ and $R_2(t)$ model the surplus of motor and health insurance (respectively), then the common process $J_3(t)$ counts traffic accidents that generate both motor and health


claims. Since the $J_i(t)$, $i=1,2,3$, are independent Poisson processes, the composition $N(t)=J_1(t)+J_2(t)+J_3(t)$ is a Poisson process with intensity $\lambda_1+\lambda_2+\lambda_3$. Our main idea is to use $N(t)$ to represent the merged risk process (2.3) in a probabilistically equivalent but simpler form

$$
R(t)=pt-\sum_{i=1}^{N(t)}Z_i, \tag{2.4}
$$

where the claim sizes $Z_i$ follow a (new) phase type distribution $F_Z$ that depends on $F_X$ and $F_Y$. We will develop a method for the calculation of $F_Z$ from $F_X$ and $F_Y$ in such a way as to ensure that the resulting process (2.4) is equivalent to the initial process (2.3). To the best of our knowledge there is no treatment of this problem in the literature for the case where the initial claim distributions $F_X$ and $F_Y$ are of general phase type. Since any distribution of a non-negative random variable (like claims) can be approximated by a phase type distribution (see, e.g., [2], p. 542), our result has the potential to provide an approximate value for the ruin probability of a merged risk process with arbitrary initial claim distributions.

The overall structure of the rest of the article takes the following form. In Sect. 2.2 we discuss phase type distributions by stressing their links with continuous time Markov chains, and give the formula for the ruin probability in the case of phase type distributed claims. In Sect. 2.3, we deduce an exact formula for the ruin probability of the merged risk process (2.3) where the two claim arrival processes are correlated and the claim sizes have exponential distributions. This section also includes a numerical example. In Sect. 2.4, the results obtained in the previous section are generalized to the case of phase type claim distributions.
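The Common Shock construction described above is easy to simulate; a minimal sketch (NumPy assumed available; the intensities below are illustrative only):

```python
# Sketch: simulate N1 = J1 + J3, N2 = J2 + J3 over [0, t] and check that
# Cov(N1(t), N2(t)) = Var(J3(t)) = lam3 * t, the common-shock covariance.
import numpy as np

rng = np.random.default_rng(0)
lam1, lam2, lam3, t, n = 1.0, 1.5, 0.5, 1.0, 200_000

J1 = rng.poisson(lam1 * t, size=n)
J2 = rng.poisson(lam2 * t, size=n)
J3 = rng.poisson(lam3 * t, size=n)
N1, N2 = J1 + J3, J2 + J3

emp_cov = np.cov(N1, N2)[0, 1]
print(abs(emp_cov - lam3 * t) < 0.05)   # True (up to Monte Carlo error)
```

The analytic covariance follows because $J_3$ is the only shared component and the $J_i$ are independent.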

2.2 The Compound Poisson Model with Phase Type Claims

Here we define the phase type distribution and give the formula for the ruin probability for the classical compound Poisson risk process with phase type distributed claims. The terminology and notation follow the monograph [2] by Asmussen and Albrecher.

2.2.1 Phase Type Distribution

Let $\{X(t),\ t\ge 0\}$ be a continuous time Markov chain with finitely many states denoted by $1,2,\ldots,n,\Delta$. The state $\Delta$ is assumed to be absorbing and all other states are transient. The transition probability matrix of $X(t)$ is denoted by $P$, the $i$th row being the conditional distribution of the next state given the current state $i$. Let $T$ denote the transition intensity matrix for the states $1,\ldots,n$. Then the intensity matrix (transition rate matrix, infinitesimal generator) for the whole Markov chain can be written in block-partitioned form as

$$
\begin{pmatrix} T & t \\ 0 & 0 \end{pmatrix},
$$

where $t=-Te$ and $e=(1,1,\cdots,1)^{\top}$. The vector $t$ represents the exit rate vector, with its $i$-th component $t_i$ being the intensity of leaving the state $i$ for the absorbing state $\Delta$.

Definition 1 The distribution of the absorption time in the Markov chain described above is called a phase type distribution. Let $\alpha$ be a row vector representing the initial distribution of the states $1,2,\ldots,n$. The couple $(\alpha,T)$ is called the representation of the phase type distribution. The density of a phase type distribution can be written as

$$
f(x)=\alpha e^{Tx}t,\qquad x\ge 0. \tag{2.5}
$$

It is seen that the phase type distribution is a generalization of the exponential distribution, which corresponds to the case $n=1$. An important property of the class of phase type distributions is that it is closed w.r.t. mixing and convolution operators (see also Lemma 1 in Sect. 2.3).

Example 1 Figure 2.1 describes a simple continuous time Markov chain with three states (1, 2, and the absorbing state $\Delta$), which generates a phase type distribution. In Fig. 2.1, $\alpha_1$ and $\alpha_2$ are initial probabilities satisfying $\alpha_1+\alpha_2=1$. The parameters $\mu_1$ and $\mu_2$ are transition rates. The time of absorption of this Markov chain has a phase type distribution given by the mixture $F=\alpha_1\,\mathrm{Exp}(\mu_1)*\mathrm{Exp}(\mu_2)+\alpha_2\,\mathrm{Exp}(\mu_2)$, where $*$ stands for convolution. In this example the transition rate matrix and the vector of initial probabilities can be written as

$$
T=\begin{pmatrix}-\mu_1 & \mu_1\\ 0 & -\mu_2\end{pmatrix},\qquad \alpha=(\alpha_1,\alpha_2).
$$

Note that, in general, the representation of a phase type distribution is not unique: the same distribution $F$ can be generated by different Markov chains. For example, the process depicted in Fig. 2.2 generates the same absorbing time distribution $F$ as the Markov chain in Fig. 2.1. However, its transition rate matrix and initial distribution are different now:

$$
T=\begin{pmatrix}-\mu_1 & \mu_1 & 0\\ 0 & -\mu_2 & 0\\ 0 & 0 & -\mu_2\end{pmatrix},\qquad \alpha=(\alpha_1,0,\alpha_2).
$$

Fig. 2.1 Markov chain generating phase type distribution F


Fig. 2.2 Another Markov chain generating the same phase type distribution F (the state depicted as a small circle has zero initial probability)
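Example 1 can be checked numerically via the density formula (2.5); a sketch assuming NumPy/SciPy are available, with illustrative parameter values:

```python
# Sketch: evaluate the phase type density f(x) = alpha e^{Tx} t of Example 1
# and compare with the explicit mixture alpha1 Exp(mu1)*Exp(mu2) + alpha2 Exp(mu2).
# Parameter values below are illustrative only.
import numpy as np
from scipy.linalg import expm

alpha1, alpha2, mu1, mu2 = 0.3, 0.7, 2.0, 5.0
alpha = np.array([alpha1, alpha2])
T = np.array([[-mu1, mu1],
              [0.0, -mu2]])
t_exit = -T @ np.ones(2)                     # exit rate vector t = -T e

def ph_density(x):
    return alpha @ expm(T * x) @ t_exit      # f(x) = alpha e^{Tx} t, Eq. (2.5)

def mixture_density(x):
    conv = mu1 * mu2 / (mu2 - mu1) * (np.exp(-mu1 * x) - np.exp(-mu2 * x))
    return alpha1 * conv + alpha2 * mu2 * np.exp(-mu2 * x)

print(abs(ph_density(0.8) - mixture_density(0.8)) < 1e-10)   # True
```

The agreement illustrates the closure of the PH class under mixing and convolution: both operations are absorbed into the representation $(\alpha, T)$.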

2.2.2 Ruin Probability for Phase Type Claims

The phase type claim distribution is one of the few cases where an explicit formula for the probability of ruin of the classical risk process is available. Namely, in [9] it was shown that, considering the risk process (2.1) with $p=1$ and assuming that the claim size distribution $F_X(x)$ is phase type with representation $(\alpha,T)$, the ruin probability with initial capital $u$ can be expressed as

$$
\Psi(u)=\alpha_+e^{(T+t\alpha_+)u}e, \tag{2.6}
$$

where

$$
\alpha_+=-\beta\alpha T^{-1} \tag{2.7}
$$

with $\beta$ equal to the intensity of the Poisson process $N(t)$, and $t=-Te$. The vector $\alpha_+$ represents the (defective) initial distribution of the Markov chain governing the phase type distribution of the ladder height or, in other words, $\alpha_+$ is the distribution of the Markov state at a time when the claim surplus reaches a new record value.

Technical remark. Both formulas, (2.5) and (2.6), involve matrix exponentials of type $e^{Qu}$ that can be calculated e.g. in terms of eigenvalues and eigenvectors of $Q$. More specifically, for an $n\times n$ matrix $Q$ the matrix exponential is defined as

$$
e^{Q}=\sum_{p=0}^{\infty}\frac{Q^p}{p!}.
$$

Assume that $Q$ has $n$ different eigenvalues $\eta_1,\ldots,\eta_n$ with respective right eigenvectors $r_1,\ldots,r_n$ (columns) and left eigenvectors $l_1,\ldots,l_n$ (rows), and let the eigenvectors be normalized so that $l_ir_i=1$ and $l_ir_j=0$ for $i\ne j$. Then the following spectral representation formula holds ([2], p. 528):

$$
e^{Qu}=\sum_{i=1}^{n}e^{\eta_i u}r_il_i. \tag{2.8}
$$

Denoting by $A_i=r_il_i$ the corresponding $n\times n$ matrices, one can rewrite (2.8) as follows:

$$
e^{Qu}=e^{\eta_1u}A_1+e^{\eta_2u}A_2+\cdots+e^{\eta_nu}A_n. \tag{2.9}
$$


The latter formula can be used for calculating the ruin probability (2.6). To do this, one first has to find the eigenvalues and eigenvectors of the matrix $Q=T+t\alpha_+$. Then, combining (2.6), (2.7), and (2.9) one obtains

$$
\Psi(u)=-\beta\alpha T^{-1}\sum_{i=1}^{n}e^{\eta_iu}A_ie.
$$
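As a quick sanity check, the spectral formula (2.9) can be compared with a direct matrix exponential; a sketch assuming NumPy/SciPy and a diagonalizable matrix (the matrix below is illustrative):

```python
# Sketch: verify e^{Qu} = sum_i e^{eta_i u} r_i l_i for a diagonalizable Q.
import numpy as np
from scipy.linalg import expm

Q = np.array([[-4.0, 2.5, 0.5],
              [0.8, -2.0, 0.4],
              [0.0, 5.0, -5.0]])
u = 0.7

eta, R = np.linalg.eig(Q)        # right eigenvectors are the columns of R
L = np.linalg.inv(R)             # rows of inv(R) are left eigenvectors with l_i r_i = 1

spectral = sum(np.exp(eta[i] * u) * np.outer(R[:, i], L[i, :]) for i in range(len(eta)))
print(np.allclose(spectral, expm(Q * u)))   # True
```

Taking $L = R^{-1}$ enforces exactly the normalization $l_ir_i=1$, $l_ir_j=0$ required by (2.8).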

2.3 Correlated Poisson Arrivals with Exponential Claims

Here we discuss a merged risk process (2.3) where the two claim arrival processes are correlated via the Common Shock Model (CSM), and (for simplicity) the claims are exponentially distributed. A numerical example is also included. The scheme used here will later be applied to the case of phase type claims.

2.3.1 Transition to a Single Poisson Compound Process

From now on we assume, without loss of generality, that the premium rate $p=1$, i.e. $p$ is equal to 1 money unit. Let us focus on the merged risk process (2.3),

$$
R(t)=t-\sum_{i=1}^{N_1(t)}X_i-\sum_{i=1}^{N_2(t)}Y_i, \tag{2.10}
$$

satisfying: (1) the i.i.d. claims $\{X_i\}$ are independent of the i.i.d. claims $\{Y_i\}$, and taken together they are independent of $\{N_1(t),N_2(t)\}$; (2) the claims are exponentially distributed, $X_i\sim \mathrm{Exp}(\mu_1)$ and $Y_i\sim \mathrm{Exp}(\mu_2)$; (3) the Poisson processes $N_1(t)$ and $N_2(t)$ are correlated via the following dependence structure (CSM):

$$
N_1(t)=J_1(t)+J_3(t),\qquad N_2(t)=J_2(t)+J_3(t), \tag{2.11}
$$

where $J_1(t)$, $J_2(t)$, and $J_3(t)$ are independent Poisson processes with intensities $\lambda_1$, $\lambda_2$, and $\lambda_3$. We represent the merged risk process above in the form of a single Poisson compound process. First note that the composition $N(t)=J_1(t)+J_2(t)+J_3(t)$ is a Poisson process with intensity $\beta=\lambda_1+\lambda_2+\lambda_3$. Then the merged risk process (2.10) is probabilistically equivalent to the following Poisson compound process:

$$
R(t)=t-\sum_{i=1}^{N(t)}Z_i, \tag{2.12}
$$


Fig. 2.3 Claim arrival process N (t) = J1 (t) + J2 (t) + J3 (t) with three types of events marked by x, v, and o. Below the line, the claims generated by respective processes are indicated

where the claim size $Z_i$ can be one of three types: type $X$ (in case of an event from $J_1$), type $Y$ (in case of an event from $J_2$), or type $X+Y$ (in case of an event from $J_3$). This process is depicted in Fig. 2.3. Since the probability for a claim event to originate from $J_j$ is equal to

$$
\alpha_j=\frac{\lambda_j}{\lambda_1+\lambda_2+\lambda_3}=\frac{\lambda_j}{\beta},\qquad j=1,2,3, \tag{2.13}
$$

the c.d.f. of the claims $Z_i$ in the risk process (2.12) is the following mixture of the exponential distributions and their convolution:

$$
F_Z(x)=\alpha_1F_X(x)+\alpha_2F_Y(x)+\alpha_3F_X*F_Y(x). \tag{2.14}
$$

The same in terms of densities is

$$
f_Z(x)=\alpha_1f_X(x)+\alpha_2f_Y(x)+\alpha_3f_X*f_Y(x), \tag{2.15}
$$

where $f_X(x)=\mu_1e^{-\mu_1x}$, $f_Y(x)=\mu_2e^{-\mu_2x}$, and the convolution can easily be expressed as

$$
f_X*f_Y(x)=\begin{cases}\mu^2xe^{-\mu x}, & \text{if } \mu_1=\mu_2=\mu,\\[4pt] \dfrac{\mu_1\mu_2}{\mu_2-\mu_1}\bigl(e^{-\mu_1x}-e^{-\mu_2x}\bigr), & \text{if } \mu_1\ne\mu_2.\end{cases}
$$

As $F_X$ and $F_Y$ are exponential distributions (and thus phase type), and since the class of phase type distributions is closed with respect to the mixture and convolution operations, we conclude that the claim size distribution (2.14) is of phase type. A graphical representation of the governing Markov chain which generates the distribution (2.14) is shown, for example, in Fig. 2.4.¹

We now specify the representation $(\alpha,T)$ of the phase type distribution (2.14), corresponding to Fig. 2.4. The initial distribution is given by the vector $\alpha=(\alpha_1,\alpha_2,\alpha_3)$ with components defined by (2.13), and the transition rate matrix $T$ is

$$
T=\begin{pmatrix}-\mu_1 & 0 & 0\\ 0 & -\mu_2 & 0\\ 0 & \mu_1 & -\mu_1\end{pmatrix}. \tag{2.16}
$$

¹ Note that, besides Fig. 2.4, several other phase diagrams (and respective Markov chains) are possible, each generating the same distribution (2.14) but having a different representation $(\alpha,T)$.
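The representation of the merged claim distribution can be cross-checked against the mixture density (2.15); a sketch assuming NumPy/SciPy, with the rates taken from the numerical example of Sect. 2.3.2:

```python
# Sketch: density of the merged claim size Z via the representation (alpha, T)
# of (2.16), compared with the mixture (2.15).
import numpy as np
from scipy.linalg import expm

mu1, mu2 = 5.0, 4.0
lam = np.array([1.0, 1.5, 0.5])
alpha = lam / lam.sum()                      # (2.13)

T = np.array([[-mu1, 0.0, 0.0],
              [0.0, -mu2, 0.0],
              [0.0, mu1, -mu1]])
t_exit = -T @ np.ones(3)                     # t = -T e = (mu1, mu2, 0)

def f_Z(x):
    return alpha @ expm(T * x) @ t_exit      # density (2.5) of the merged claim

def mixture(x):
    conv = mu1 * mu2 / (mu2 - mu1) * (np.exp(-mu1 * x) - np.exp(-mu2 * x))
    return (alpha[0] * mu1 * np.exp(-mu1 * x)
            + alpha[1] * mu2 * np.exp(-mu2 * x)
            + alpha[2] * conv)

x = 0.3
print(abs(f_Z(x) - mixture(x)) < 1e-10)                                     # True
print(abs(mixture(x) - (16/3 * np.exp(-4*x) - 5/3 * np.exp(-5*x))) < 1e-12) # True
```

The last line checks the simplified form of $f_Z$ that appears in the numerical example.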


Fig. 2.4 A Markov chain which generates the claim size distribution (2.14)

Henceforth, we assume that our risk process has positive safety loading, i.e. $\beta\cdot EZ_i<1=p$, where $\beta$ is the intensity of the Poisson process $N(t)$ and $EZ_i$ denotes the mean value of a claim. Due to (2.14), this condition can also be written as $\frac{\lambda_1+\lambda_3}{\mu_1}+\frac{\lambda_2+\lambda_3}{\mu_2}<1$. Thus, by now we have shown that the initial process (2.10) is equivalent to (2.12), and that the claims $Z_i$ in (2.12) are of phase type. Therefore, the general formula (2.6) of the ruin probability for phase type claims applies. Let us calculate the basic components of this formula, namely $\alpha_+=-\beta\alpha T^{-1}$ and $T+t\alpha_+$. Since in our case $\beta=\lambda_1+\lambda_2+\lambda_3$, and since $\alpha_i=\lambda_i/\beta$, the product $\beta\alpha$ is equal to the vector of Poisson arrival intensities, $\beta\alpha=\lambda=(\lambda_1,\lambda_2,\lambda_3)$. Further on, it is easy to see that

$$
T^{-1}=\begin{pmatrix}-\frac{1}{\mu_1} & 0 & 0\\[2pt] 0 & -\frac{1}{\mu_2} & 0\\[2pt] 0 & -\frac{1}{\mu_2} & -\frac{1}{\mu_1}\end{pmatrix}.
$$

Therefore, we have $\alpha_+=-\beta\alpha T^{-1}=-\lambda T^{-1}$, or

$$
\alpha_+=\left(\frac{\lambda_1}{\mu_1},\ \frac{\lambda_2+\lambda_3}{\mu_2},\ \frac{\lambda_3}{\mu_1}\right). \tag{2.17}
$$

Furthermore, as

$$
t=-Te=(\mu_1,\mu_2,0)^{\top}, \tag{2.18}
$$

one gets, by putting together (2.16), (2.17), and (2.18), the matrix

$$
Q:=T+t\alpha_+=\begin{pmatrix}\lambda_1-\mu_1 & \frac{(\lambda_2+\lambda_3)\mu_1}{\mu_2} & \lambda_3\\[4pt] \frac{\lambda_1\mu_2}{\mu_1} & (\lambda_2+\lambda_3)-\mu_2 & \frac{\lambda_3\mu_2}{\mu_1}\\[4pt] 0 & \mu_1 & -\mu_1\end{pmatrix}. \tag{2.19}
$$

Now the general ruin probability formula (2.6) applies and we obtain the following result.

Proposition 1 Let $R(t)$ be the merged risk process (2.10) with exponentially distributed claims $X_i\sim \mathrm{Exp}(\mu_1)$, $Y_i\sim \mathrm{Exp}(\mu_2)$, and correlated arrival processes $N_1(t)$ and $N_2(t)$, as in (2.11). Then the ruin probability can be expressed as

$$
\Psi(u)=\alpha_+e^{Qu}e, \tag{2.20}
$$

where $\alpha_+$ is given by (2.17), $Q$ is given by (2.19), and $e=(1,\ldots,1)^{\top}$.


Comment: To calculate the matrix exponential $e^{Qu}$, one can use the spectral representation formula (2.8), where the eigenvalues $\eta_i$ and the eigenvectors $r_i$, $l_i$ are to be found for the matrix $Q$ and then substituted into the equation

$$
e^{Qu}=\sum_{i=1}^{3}e^{\eta_iu}r_il_i. \tag{2.21}
$$

2.3.2 Numerical Example

Consider a merged risk process (2.10) with the following parameters: the premium rate $p=1$, the rates of the claim size exponential distributions $\mu_1=5$ and $\mu_2=4$, and the intensities of the three claim arrival processes $\lambda_1=1$, $\lambda_2=1.5$, and $\lambda_3=0.5$. Then $\beta=\lambda_1+\lambda_2+\lambda_3=3$ and $\alpha=(\frac26,\frac36,\frac16)$. Therefore, the claim size density (2.15) is equal to

$$
f_Z(x)=\alpha_1f_X(x)+\alpha_2f_Y(x)+\alpha_3f_X*f_Y(x)
=\frac{10}{6}e^{-5x}+\frac{12}{6}e^{-4x}+\frac{20}{6}\bigl(e^{-4x}-e^{-5x}\bigr)=\frac{16}{3}e^{-4x}-\frac{5}{3}e^{-5x},
$$

with mean value $EZ=4/15$. Note that the relative safety loading of the process is positive (since $\beta\cdot EZ=3\cdot 4/15=0.8<1=p$), and one can rely on Proposition 1. Further on, one calculates $\alpha_+=-\beta\alpha T^{-1}=(0.2,0.5,0.1)$ and thus

$$
Q=T+t\alpha_+=\begin{pmatrix}-4.0 & 2.5 & 0.5\\ 0.8 & -2.0 & 0.4\\ 0.0 & 5.0 & -5.0\end{pmatrix}.
$$

The eigenvalues and respective eigenvectors of the matrix $Q$ are as follows: $\eta_1=-5.2360$, $\eta_2=-5.00$, and $\eta_3=-0.7639$; the normalized right eigenvectors are

$$
r_1=\begin{pmatrix}-2.23295\\-0.34116\\7.22598\end{pmatrix},\qquad
r_2=\begin{pmatrix}-3.24037\\0.00000\\6.48074\end{pmatrix},\qquad
r_3=\begin{pmatrix}-0.71689\\-0.75074\\-0.88613\end{pmatrix};
$$

and the normalized left eigenvectors are $l_1=(0.52434,-0.81015,0.26217)$, $l_2=(-0.61721,0.77152,-0.15430)$, and $l_3=(-0.23828,-0.96386,-0.11914)$. Now, using Proposition 1 we can find the ruin probability for initial capital $u$ as $\Psi(u)=0.8025e^{-0.7639u}-0.0025e^{-5.2360u}$. A graphical presentation of $\Psi(u)$ is given in Fig. 2.5.
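The numbers above can be reproduced directly from Proposition 1; a sketch assuming NumPy/SciPy:

```python
# Sketch: reproduce the numerical example via Psi(u) = alpha_+ e^{Qu} e.
import numpy as np
from scipy.linalg import expm

mu1, mu2 = 5.0, 4.0
lam = np.array([1.0, 1.5, 0.5])

T = np.array([[-mu1, 0.0, 0.0],
              [0.0, -mu2, 0.0],
              [0.0, mu1, -mu1]])
alpha_plus = -lam @ np.linalg.inv(T)         # alpha_+ = -beta alpha T^{-1} = -lambda T^{-1}
t_exit = -T @ np.ones(3)
Q = T + np.outer(t_exit, alpha_plus)

def psi(u):
    return alpha_plus @ expm(Q * u) @ np.ones(3)   # Proposition 1, Eq. (2.20)

print(np.round(alpha_plus, 2))    # [0.2 0.5 0.1]
print(round(psi(0.0), 3))         # 0.8
```

Note that $\Psi(0)=\alpha_+e=\beta\cdot EZ=0.8$, which matches the safety-loading computation above.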


Fig. 2.5 Ruin probability of a merged risk process with exponential claims

2.3.3 An Auxiliary Result

In the numerical example above the final expression of $\Psi(u)$ contains only two terms, instead of three as one might expect from the representation (2.21). We next show that this is not an exception (due to our numerical choices), but a general property which is related to the structure of the risk process (2.10).

Proposition 2 Under the assumptions of Proposition 1 the following holds:
(a) one of the eigenvalues of $Q$ (given in (2.19)) is $\eta=-\mu_1$, and its respective right eigenvector has the form $r=(s_1,0,-\frac{\alpha_1}{\alpha_3}s_1)^{\top}$;
(b) the ruin probability $\Psi(u)$ takes the form of a linear combination of two exponential functions.

Proof (a) Check that the value $\eta=-\mu_1$ takes the determinant $|Q-\eta I|$ to zero. Indeed, denoting two of the cells by $A$, $B$, we have that

$$
|Q-\eta I|=|Q+\mu_1I|=\begin{vmatrix}\lambda_1 & A & \lambda_3\\[2pt] \frac{\lambda_1\mu_2}{\mu_1} & B & \frac{\lambda_3\mu_2}{\mu_1}\\[2pt] 0 & \mu_1 & 0\end{vmatrix}=0.
$$

Let us now find the right eigenvector $r=(s_1,s_2,s_3)^{\top}$ corresponding to $\eta=-\mu_1$, by solving the equation $Qr=-\mu_1r$, or

$$
\begin{pmatrix}\lambda_1 & \frac{(\lambda_2+\lambda_3)\mu_1}{\mu_2} & \lambda_3\\[4pt] \frac{\lambda_1\mu_2}{\mu_1} & (\lambda_2+\lambda_3)-\mu_2+\mu_1 & \frac{\lambda_3\mu_2}{\mu_1}\\[4pt] 0 & \mu_1 & 0\end{pmatrix}\begin{pmatrix}s_1\\s_2\\s_3\end{pmatrix}=0. \tag{2.22}
$$

As $\mu_1\ne 0$, the third equation of (2.22) gives $s_2=0$, and after a small simplification we see that the first and the second equations are the same, namely $\lambda_1s_1+\lambda_3s_3=0$.


M. J. Ali and K. Pärna

Fig. 2.6 Phase diagram with two transient states that generates the distribution FZ given by (2.14)

To conclude, any vector of the form r = (s₁, 0, −(λ₁/λ₃)s₁)ᵀ (equivalently, −(α₁/α₃)s₁ in the third component, since λᵢ = βαᵢ) serves as a right eigenvector corresponding to the eigenvalue η = −μ₁.

(b) Now the proof of (b) is simple. By substituting (2.21) into (2.20), we have

$$\Psi(u) = \alpha_+ \sum_{i=1}^{3} e^{\eta_i u}\, r_i l_i\, e = \sum_{i=1}^{3} e^{\eta_i u}\, \alpha_+ r_i\, l_i e.$$

The eigenvalue η = −μ₁ is one of the three eigenvalues in this decomposition, say η₃ = −μ₁. We see that then the third term of the decomposition vanishes. Indeed,

$$\alpha_+ r_3 = \left(\frac{\lambda_1}{\mu_1},\ \frac{\lambda_2+\lambda_3}{\mu_2},\ \frac{\lambda_3}{\mu_1}\right)\left(s_1,\ 0,\ -\frac{\lambda_1}{\lambda_3}s_1\right)^{\!\top} = \frac{\lambda_1}{\mu_1}s_1 + 0 - \frac{\lambda_3}{\mu_1}\cdot\frac{\lambda_1}{\lambda_3}s_1 = 0.$$

Thus we have $\Psi(u) = \sum_{i=1}^{2} e^{\eta_i u}\, \alpha_+ r_i\, l_i e$, a linear combination of two exponential functions. □

In fact, the last result is not surprising, since there exists a representation of the distribution FZ which uses only two transient states (plus the absorbing state). The respective phase diagram is shown in Fig. 2.6, where the two transient states with initial probabilities α₁ and α₂ + α₃ generate transition times of intensity μ₁ and μ₂, respectively, but the intensity μ₂ is divided between the two subsequent states, state 1 and Δ. An application of the ruin probability formula (2.6) to this PH distribution results in a function with two exponential components.

2.4 Correlated Poisson Arrivals with Phase Type Claims

Here we generalize the basic results of the previous section to the case of arbitrary phase type claim size distributions FX and FY.

2.4.1 The Phase Type Distribution of Claims of the Merged Process

Consider again the merged risk process (2.10), i.e. $R(t) = t - \sum_{i=1}^{N_1(t)} X_i - \sum_{i=1}^{N_2(t)} Y_i$, satisfying conditions (1) and (3) as before, but instead of condition (2) we now postulate:

(2') the claims Xᵢ and Yᵢ are phase type distributed.


More specifically, let Xᵢ and Yᵢ be independent random variables with phase type distributions having representations (α_X, T_X) and (α_Y, T_Y), respectively, where

$$T_X = \begin{pmatrix} \mu^X_{11} & \mu^X_{12} & \cdots & \mu^X_{1m} \\ \vdots & & & \vdots \\ \mu^X_{m1} & \mu^X_{m2} & \cdots & \mu^X_{mm} \end{pmatrix}, \qquad T_Y = \begin{pmatrix} \mu^Y_{11} & \mu^Y_{12} & \cdots & \mu^Y_{1k} \\ \vdots & & & \vdots \\ \mu^Y_{k1} & \mu^Y_{k2} & \cdots & \mu^Y_{kk} \end{pmatrix}$$

and α_X = (α_{x1}, α_{x2}, …, α_{xm}), α_Y = (α_{y1}, α_{y2}, …, α_{yk}). The vectors α_X and α_Y represent the initial distributions over the transient states of the governing Markov chains, M_X and M_Y, whose absorption times are X and Y, respectively. We assume that the distributions of X and Y satisfy α_X e = Σᵢ α_{xi} ≤ 1 and α_Y e = Σⱼ α_{yj} ≤ 1, where a strict inequality means that with some positive probability α_{xΔ} = 1 − α_X e > 0 (or α_{yΔ} > 0) the Markov chain M_X (or M_Y) starts from the absorbing state Δ. In such a case the absorption time X is considered to be zero and the distribution of X (or Y) is called zero-modified. The absorption intensities in the various (transient) states are represented by the column vectors t_X = −T_X e and t_Y = −T_Y e.

We need the following technical lemma showing how to calculate the parameters of the convolution of two phase type distributions.

Lemma 1 Let X and Y be independent random variables of phase type with representations (α_X, T_X) and (α_Y, T_Y), respectively. Then U = X + Y is also of phase type with representation (α_U, T_U), where α_U = (α_X, α_{xΔ}α_Y) and

$$T_U = \begin{pmatrix} T_X & t_X\alpha_Y \\ 0 & T_Y \end{pmatrix}. \qquad (2.23)$$

Proof The proof is rather straightforward. We have to combine the Markov chains M_X and M_Y in such a way that the absorption time U of the combined chain equals the sum of the two initial absorption times, X and Y. The structure of α_U and T_U shown in (2.23) ensures that the process starts (1) from a state of M_X according to the initial distribution α_X, or (2) from a state of M_Y with total initial probability α_{xΔ} divided between the states according to α_Y. In case 1, the process remains within the state space of M_X for the time X (the absorption time), but instead of absorption it enters the state space of M_Y, where it stays for another absorption time Y. The transition from one space to the other is described by the off-diagonal block t_X α_Y with general form

$$t_X\alpha_Y = \begin{pmatrix} \mu^X_1\alpha_{y1} & \mu^X_1\alpha_{y2} & \cdots & \mu^X_1\alpha_{yk} \\ \vdots & & & \vdots \\ \mu^X_m\alpha_{y1} & \mu^X_m\alpha_{y2} & \cdots & \mu^X_m\alpha_{yk} \end{pmatrix},$$

where t_X = (μ^X_1, …, μ^X_m)ᵀ represents the exit intensities of M_X, or—what is the same—the entrance intensities of M_Y. The rows of the matrix are proportional, showing that,


whichever i is, the entrance intensity μ^X_i is divided between the states of M_Y in accordance with the initial distribution α_Y—thus ensuring the independence of X and Y. In case 2 (when X has the value zero), the process only generates a value for Y. Note that the total absorption time U = X + Y is zero-modified (has the value zero) if and only if both X and Y are zero-modified. □

Proposition 3 Let X and Y be independent random variables of phase type with representations (α_X, T_X) and (α_Y, T_Y), respectively. Then the claim size distribution F_Z(x) defined in (2.14) is of phase type with possible (but not unique) representation (α_Z, T_Z), where

$$T_Z = \begin{pmatrix} T_X & 0 \\ 0 & T_U \end{pmatrix} \equiv \begin{pmatrix} T_X & 0 & 0 \\ 0 & T_X & t_X\alpha_Y \\ 0 & 0 & T_Y \end{pmatrix}, \qquad (2.24)$$

$$\alpha_Z = \left(\alpha_1\alpha_X,\ \alpha_3\alpha_X,\ (\alpha_3\alpha_{x\Delta} + \alpha_2)\alpha_Y\right) \qquad (2.25)$$

with α₁, α₂ and α₃ given by (2.13).

Proof The proof is based on the closure property of phase type distributions with respect to the mixture and convolution operations. Since X and Y have phase type distributions F_X and F_Y, and since F_Z is a mixture of F_X, F_Y, and the convolution F_X ∗ F_Y, the distribution F_Z is also of phase type. It remains to see that the representation (2.24)–(2.25) is what we need. The form of T_Z is, of course, somewhat optional (e.g. a similar matrix with Y in place of X also works). The initial probabilities α₁α_X ensure a non-zero value for X, the component α₃α_X ensures a non-zero X together with a non-zero Y, and (α₃α_{xΔ} + α₂)α_Y ensures a non-zero Y. The additional term α₃α_{xΔ} in the last component is needed to capture sums of the type 0 + Y not included in the second component. □

In addition to the lemma, one can see, using formula (2.25), that the total probability that Z takes the value zero equals 1 − α_Z e = α₁α_{xΔ} + α₂α_{yΔ} + α₃α_{xΔ}α_{yΔ}. Hence, in the case of regular distributions F_X and F_Y with α_{xΔ} = α_{yΔ} = 0, the distribution F_Z is also regular, ℙ(Z > 0) = 1, and (2.25) simplifies to α_Z = (α₁α_X, α₃α_X, α₂α_Y).
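As a small numerical sketch of Lemma 1 (the helper names `ph_convolve` and `ph_mean` and all parameter values are ours, not the paper's), one can build (α_U, T_U) from (2.23) and check that the mean is additive, using the standard identity E[V] = α(−T)⁻¹e for a phase type variable V:

```python
import numpy as np

def ph_convolve(alpha_X, T_X, alpha_Y, T_Y):
    """Representation (alpha_U, T_U) of U = X + Y per Lemma 1 / Eq. (2.23)."""
    m, k = T_X.shape[0], T_Y.shape[0]
    tX = -T_X @ np.ones(m)                # exit intensities t_X = -T_X e
    ax_delta = 1.0 - alpha_X.sum()        # P(X = 0) for a zero-modified X
    alpha_U = np.concatenate([alpha_X, ax_delta * alpha_Y])
    T_U = np.zeros((m + k, m + k))
    T_U[:m, :m] = T_X
    T_U[:m, m:] = np.outer(tX, alpha_Y)   # off-diagonal block t_X alpha_Y
    T_U[m:, m:] = T_Y
    return alpha_U, T_U

def ph_mean(alpha, T):
    # E[V] = alpha (-T)^{-1} e for V ~ PH(alpha, T)
    return alpha @ np.linalg.inv(-T) @ np.ones(T.shape[0])

# X ~ Exp(2); Y ~ mixture 0.4 Exp(3) + 0.6 Exp(5)  (hypothetical values)
aX, TX = np.array([1.0]), np.array([[-2.0]])
aY, TY = np.array([0.4, 0.6]), np.array([[-3.0, 0.0], [0.0, -5.0]])
aU, TU = ph_convolve(aX, TX, aY, TY)
print(ph_mean(aU, TU))                    # approx 0.7533
print(ph_mean(aX, TX) + ph_mean(aY, TY))  # the same value, E[X] + E[Y]
```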

2.4.2 Example

Here we will give an example of the application of Lemma 1 and Proposition 3. Let X and Y have the following phase type distributions:

$$F_X = \alpha_{x1}\,Exp(\mu_1) * Exp(\mu_2) + \alpha_{x2}\,Exp(\mu_2), \qquad F_Y = \alpha_{y1}\,Exp(\mu_3) + \alpha_{y2}\,Exp(\mu_4),$$


Fig. 2.7 Markov chains generating the phase type distributions of X (left), Y (middle), and U = X + Y (right). The values in the circles are initial probabilities; small circles have zero initial probabilities.

where the mixture weights satisfy α_{x1} + α_{x2} = α_{y1} + α_{y2} = 1 (regular distributions with α_{xΔ} = α_{yΔ} = 0). The purpose is to derive the representation of the distribution of the sum U = X + Y. The three Markov chains whose absorption times are X, Y, and U are depicted in Fig. 2.7. The diagram on the right is combined from the two other diagrams in such a way that the exit to Δ in the first process (i.e. absorption) becomes an arrival for the second process. The intensity matrices corresponding to the three Markov chains in Fig. 2.7 are the following (in the case of T_U the formula (2.23) is used):

$$T_X = \begin{pmatrix} -\mu_1 & \mu_1 \\ 0 & -\mu_2 \end{pmatrix}, \qquad T_Y = \begin{pmatrix} -\mu_3 & 0 \\ 0 & -\mu_4 \end{pmatrix}, \qquad T_U = \begin{pmatrix} -\mu_1 & \mu_1 & 0 & 0 \\ 0 & -\mu_2 & \alpha_{y1}\mu_2 & \alpha_{y2}\mu_2 \\ 0 & 0 & -\mu_3 & 0 \\ 0 & 0 & 0 & -\mu_4 \end{pmatrix}.$$

The three graphs in Fig. 2.7, together with their respective intensity matrices T_X, T_Y, T_U and initial distributions, are the necessary elements for building the distribution of the Z-claims in (2.14). First, a Markov chain generating the distribution of Z can be seen in Fig. 2.8. By (2.24), the intensity matrix corresponding to the phase diagram in Fig. 2.8 is

$$T_Z = \begin{pmatrix} -\mu_1 & \mu_1 & 0 & 0 & 0 & 0 \\ 0 & -\mu_2 & 0 & 0 & 0 & 0 \\ 0 & 0 & -\mu_1 & \mu_1 & 0 & 0 \\ 0 & 0 & 0 & -\mu_2 & \alpha_{y1}\mu_2 & \alpha_{y2}\mu_2 \\ 0 & 0 & 0 & 0 & -\mu_3 & 0 \\ 0 & 0 & 0 & 0 & 0 & -\mu_4 \end{pmatrix},$$

and, by (2.25), the vector of initial probabilities is equal to


Fig. 2.8 A Markov chain with absorption time Z having distribution FZ = α₁FX + α₂FY + α₃FX+Y. The quantities shown inside the circles are the initial probabilities of the respective states

$$\alpha_Z = (\alpha_1\alpha_{x1},\ \alpha_1\alpha_{x2},\ \alpha_3\alpha_{x1},\ \alpha_3\alpha_{x2},\ \alpha_2\alpha_{y1},\ \alpha_2\alpha_{y2}).$$

The sum of all six initial probabilities equals 1, hence Z is regular.

A note on uniqueness. As has been noted in similar cases before, the representation (α_Z, T_Z) just obtained is not the only possible way to describe the PH distribution F_Z. For example, it can be checked that the following, more concise representation of F_Z is possible:

$$\alpha_Z = \left(\alpha_1\alpha_{x1},\ \alpha_1\alpha_{x2},\ (\alpha_2+\alpha_3)\alpha_{y1},\ (\alpha_2+\alpha_3)\alpha_{y2}\right),$$

$$T_Z = \begin{pmatrix} -\mu_1 & \mu_1 & 0 & 0 \\ 0 & -\mu_2 & 0 & 0 \\ \frac{\alpha_3}{\alpha_2+\alpha_3}\alpha_{x1}\mu_3 & \frac{\alpha_3}{\alpha_2+\alpha_3}\alpha_{x2}\mu_3 & -\mu_3 & 0 \\ \frac{\alpha_3}{\alpha_2+\alpha_3}\alpha_{x1}\mu_4 & \frac{\alpha_3}{\alpha_2+\alpha_3}\alpha_{x2}\mu_4 & 0 & -\mu_4 \end{pmatrix}.$$
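That the six-state representation and the more concise four-state one describe the same distribution F_Z can be checked numerically via the survival function 1 − F_Z(x) = α_Z e^{T_Z x} e. The sketch below uses hypothetical parameter values (they are not taken from the paper):

```python
import numpy as np
from scipy.linalg import expm

mu1, mu2, mu3, mu4 = 1.0, 2.0, 3.0, 4.0
ax1, ax2 = 0.3, 0.7          # alpha_X
ay1, ay2 = 0.6, 0.4          # alpha_Y
a1, a2, a3 = 0.5, 0.3, 0.2   # mixture weights alpha_1, alpha_2, alpha_3

# Six-state representation (2.24)-(2.25)
T6 = np.array([[-mu1, mu1, 0, 0, 0, 0],
               [0, -mu2, 0, 0, 0, 0],
               [0, 0, -mu1, mu1, 0, 0],
               [0, 0, 0, -mu2, ay1 * mu2, ay2 * mu2],
               [0, 0, 0, 0, -mu3, 0],
               [0, 0, 0, 0, 0, -mu4]])
a6 = np.array([a1 * ax1, a1 * ax2, a3 * ax1, a3 * ax2, a2 * ay1, a2 * ay2])

# Concise four-state representation: Y first, then X with prob. a3/(a2+a3)
w = a3 / (a2 + a3)
T4 = np.array([[-mu1, mu1, 0, 0],
               [0, -mu2, 0, 0],
               [w * ax1 * mu3, w * ax2 * mu3, -mu3, 0],
               [w * ax1 * mu4, w * ax2 * mu4, 0, -mu4]])
a4 = np.array([a1 * ax1, a1 * ax2, (a2 + a3) * ay1, (a2 + a3) * ay2])

# Compare the two survival functions at a few points.
diffs = [abs(a6 @ expm(T6 * x) @ np.ones(6) - a4 @ expm(T4 * x) @ np.ones(4))
         for x in (0.5, 1.0, 2.0)]
print(max(diffs))            # approx 0
```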

2.4.3 Ruin Probability in the Case of Phase Type Claims

Here we deduce an explicit form of the ruin probability for the merged risk process (2.10) with phase type claims. The final result is given in the proposition below. Let the claims Xᵢ and Yᵢ in the merged risk process (2.10) have phase type distributions with representations (α_X, T_X) and (α_Y, T_Y). Then, by Proposition 3, the claims


Z_i of the equivalent process (2.12) are phase type distributed with the representation (2.24)–(2.25):

$$T_Z = \begin{pmatrix} T_X & 0 & 0 \\ 0 & T_X & t_X\alpha_Y \\ 0 & 0 & T_Y \end{pmatrix}$$

and α_Z = (α₁α_X, α₃α_X, (α₃α_{xΔ} + α₂)α_Y) with α₁, α₂ and α₃ given by (2.13). Therefore, the ruin probability formula (2.6) applies:

$$\Psi(u) = \alpha_+\, e^{(T_Z + t_Z\alpha_+)u}\, e,$$

where t_Z = −T_Z e, α₊ = −βα_Z T_Z⁻¹, and β = λ₁ + λ₂ + λ₃.

Next we make some calculations. First, it is easy to check that the inverse of T_Z is given by

$$T_Z^{-1} = \begin{pmatrix} T_X^{-1} & 0 & 0 \\ 0 & T_X^{-1} & -T_X^{-1}t_X\alpha_Y T_Y^{-1} \\ 0 & 0 & T_Y^{-1} \end{pmatrix}.$$

Then, by substituting α_Z and T_Z⁻¹ from above, and using βαᵢ = λᵢ, the vector α₊ is calculated as

$$\alpha_+ = -\beta\alpha_Z T_Z^{-1} = -\left(\lambda_1\alpha_X,\ \lambda_3\alpha_X,\ (\lambda_3\alpha_{x\Delta}+\lambda_2)\alpha_Y\right)T_Z^{-1} = -\left(\lambda_1\alpha_X T_X^{-1},\ \lambda_3\alpha_X T_X^{-1},\ (-\lambda_3\alpha_X T_X^{-1}t_X + \lambda_3\alpha_{x\Delta} + \lambda_2)\,\alpha_Y T_Y^{-1}\right).$$

Since t_X = −T_X e, the term −λ₃α_X T_X⁻¹t_X = λ₃α_X T_X⁻¹T_X e = λ₃α_X e = λ₃(1 − α_{xΔ}), which takes α₊ to the form

$$\alpha_+ = -\left(\lambda_1\alpha_X T_X^{-1},\ \lambda_3\alpha_X T_X^{-1},\ (\lambda_2+\lambda_3)\,\alpha_Y T_Y^{-1}\right). \qquad (2.26)$$

Now consider the exponent T_Z + t_Z α₊ =: Q in the ruin probability formula (2.6). Using the definition of t_Z, we have Q = T_Z + t_Z α₊ = T_Z − T_Z e α₊, or

$$Q = T_Z(I - e\,\alpha_+), \qquad (2.27)$$

where I denotes the identity matrix of order 2m + k. This way, we have proved the following result.

Proposition 4 Let the claims Xᵢ and Yᵢ in the merged risk process (2.10) have phase type distributions with representations (α_X, T_X) and (α_Y, T_Y), respectively. Then the ruin probability for initial capital u is equal to Ψ(u) = α₊ e^{Qu} e, where α₊ is given by (2.26), Q is given by (2.27), and e is the column vector of ones.
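Proposition 4 is straightforward to implement. The sketch below (NumPy/SciPy; the function name `ruin_probability` and all numbers are ours) builds (α_Z, T_Z), computes α₊ and Q, and then checks the degenerate case λ₂ = λ₃ = 0, where the model collapses to the classical Cramér–Lundberg process with Exp(μ) claims, for which Ψ(u) = (λ₁/μ)e^{−(μ−λ₁)u} is known:

```python
import numpy as np
from scipy.linalg import expm

def ruin_probability(lam, alpha_X, T_X, alpha_Y, T_Y):
    """Psi(u) for the merged risk process (2.10) per Proposition 4."""
    lam1, lam2, lam3 = lam
    beta = lam1 + lam2 + lam3
    a1, a2, a3 = lam1 / beta, lam2 / beta, lam3 / beta
    m, k = T_X.shape[0], T_Y.shape[0]
    tX = -T_X @ np.ones(m)
    ax_delta = 1.0 - alpha_X.sum()

    n = 2 * m + k
    T_Z = np.zeros((n, n))                      # Eq. (2.24)
    T_Z[:m, :m] = T_X
    T_Z[m:2 * m, m:2 * m] = T_X
    T_Z[m:2 * m, 2 * m:] = np.outer(tX, alpha_Y)
    T_Z[2 * m:, 2 * m:] = T_Y
    alpha_Z = np.concatenate([a1 * alpha_X, a3 * alpha_X,
                              (a3 * ax_delta + a2) * alpha_Y])  # Eq. (2.25)

    e = np.ones(n)
    alpha_plus = -beta * alpha_Z @ np.linalg.inv(T_Z)
    Q = T_Z @ (np.eye(n) - np.outer(e, alpha_plus))             # Eq. (2.27)
    return lambda u: alpha_plus @ expm(Q * u) @ e

# lam2 = lam3 = 0: classical model with arrival rate 1 and X ~ Exp(2).
psi = ruin_probability((1.0, 0.0, 0.0),
                       np.array([1.0]), np.array([[-2.0]]),
                       np.array([1.0]), np.array([[-3.0]]))
print(psi(1.0))        # approx 0.5 * exp(-1) = 0.1839
```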


2.5 Conclusion

We have demonstrated how the sum of two classical risk processes with correlated Poisson arrivals and with exponential or, more generally, phase type claims can be represented in the form of a single compound Poisson process. In doing so, the initial claim size distributions were replaced by a properly chosen phase type claim distribution for the merged process. After that, exact formulas for the ruin probability of the merged risk process were derived.

In this paper we did not pay attention to the minimality of the representation of the phase type distributions used in modeling the claims of the merged risk process. Although the choice of representation does not have any effect on the ruin probabilities, knowing the exact order of the phase type claim distribution is useful, because it determines the number of exponential components in the ruin probability formula. Therefore, it would be interesting to study the approximation of ruin probabilities by phase type distributions of low order.

References

1. Anastasiadis, S., Chukova, S.: Multivariate insurance models. An overview. Insur. Math. Econ. 51, 222–227 (2012)
2. Asmussen, S., Albrecher, H.: Ruin Probabilities, 2nd edn. World Scientific, New Jersey (2010)
3. Bäuerle, N., Grübel, R.: Multivariate risk processes with interacting intensities. Adv. Appl. Probab. 40, 578–601 (2008)
4. Cai, J., Li, H.: Dependence properties and bounds for ruin probabilities in multivariate compound risk models. J. Multivariate Anal. 98, 757–773 (2007)
5. Cai, J., Li, H.: Multivariate risk model of phase-type. Insur. Math. Econ. 36, 137–152 (2005)
6. Cramér, H.: On the Mathematical Theory of Risk. Skandia Jubilee Volume, Stockholm (1930)
7. Kreinin, A.: Financial Signal Processing and Machine Learning. J. Wiley & Sons (2016)
8. Lundberg, F.: I Approximerad Framställning av Sannolikhetsfunktionen. II Återförsäkring av Kollektivrisker. Almquist & Wiksell, Uppsala (1903)
9. Neuts, M.F.: Matrix-Geometric Solutions in Stochastic Models. Johns Hopkins University Press, Baltimore (1981)
10. Yuen, K.C., Guo, J., Wu, X.: On the first time of ruin in the bivariate compound Poisson model. Insur. Math. Econ. 38, 298–308 (2006)
11. Yuen, K.C., Guo, J., Wu, X.: On a correlated aggregate claims model with Poisson and Erlang risk processes. Insur. Math. Econ. 32, 205–214 (2002)

Chapter 3

Method Development for Emergent Properties in Stage-Structured Population Models with Stochastic Resource Growth Tin Nwe Aye and Linus Carlsson

Abstract Modelling population dynamics in ecological systems reveals properties that are difficult to find by empirical means, such as the probability that a population will go extinct when it is exposed to harvesting. In this article, we use an aquatic ecological system containing one fish species and an underlying resource as our model. In particular, we study a class of stage-structured population systems, in both the deterministic and the stochastic settings, including stochasticity in such a way that we allow the underlying resource to have a random growth rate. In these models, we study how properties connected to the fish species depend on different harvesting rates. To investigate models in the stochastic setting, we use Monte Carlo simulations to capture several of the emergent properties of the population. These properties have previously been studied in the deterministic case. In the stochastic setting, we get estimates for the expected outcome of population properties in our model, but we also get measures of dispersion. There are properties that emerge when introducing randomness in the model that cannot be studied in the deterministic case, such as the probability of extinction. In this paper, we develop a method to derive this property. We also construct a method to determine the recovery potential of a species by introducing it into a virgin environment.

Keywords Stage-structured · Stochastic · Population dynamics · Method development · Logistic · Semi-chemostat · Semi-logistic · Probability of extinction

MSC 2020: 92D40 · 92D40

T. N. Aye (B) · L. Carlsson Division of Mathematics and Physics, Mälardalen University, Box 883, 721 23 Västerås, Sweden e-mail: [email protected]; [email protected] L. Carlsson e-mail: [email protected] T. N. Aye Taunggoke University, Toungup, Myanmar © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al.
(eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_3



Fig. 3.1 Graphs for the semi-chemostat, logistic, and semi-logistic curves, used in the growth rate models for the resource

3.1 Introduction

The dynamics of general ecological systems are very complicated, and it is often difficult to draw appropriately precise conclusions. In spite of this, some properties are well defined and commonly studied, such as biodiversity, stability, and food webs; see e.g., [4, 6, 15, 20, 22]. Aquaculture is an expanding source of protein for human consumption, and is of great importance for developing countries [32, 33] as well as for industrialized nations [17, 18], also with respect to global warming potential. In this article, we have used a small aquatic ecological system as the base model, but the results herein can be applied to other types of ecological systems. More precisely, we assume that our ecosystem consists of one fish species and an underlying food resource. The model we use is a stage-structured model, in which we assume the fish to be divided into two stages, the juveniles and the adults, and the food resource is treated as an unstructured entity. The field of stage-structured population models has attracted much attention; see e.g., [1, 2, 19, 23, 25]. The stage-structured population model can be derived from the general physiologically structured population model; see e.g., [13]. In most aquatic stage-structured models, the resource growth rate is assumed to be of either the semi-chemostat or the logistic type [3, 12, 28], but logistic growth might not be suitable for ecological purposes [26], and in this paper we have found that the simulated solutions become highly unstable under this growth rate; see Appendix 3.9. The reason for this instability is that the logistic growth rate curve has a horizontal tangent line at the origin (see Fig. 3.1). There are several ways to overcome this problem: one could, for example, assume that the fish will not be able to find the resource if its density is less than a certain


threshold, but we have instead assumed that the resource is available at all densities and has some small influx from surrounding ecosystems. To accomplish this, we define a new growth rate model, which we call semi-logistic growth. This model consists of a large fraction of the logistic growth and a small fraction of the semi-chemostat growth. As we see in Fig. 3.1, the semi-logistic growth curve has a positive slope at the origin, which stabilizes the solutions in such a way that they approach a steady-state solution in finite time. In the remainder of this paper, we will consider the two basic growth rate models, semi-chemostat and semi-logistic, except in Appendix 3.9, where we consider the logistic growth rate model. The main purpose of this project is to develop methods for understanding the emergent properties that are studied in deterministic models when these models are extended to stochastic models. The way we introduce randomness in the stage-structured models is to add randomness in the growth rate model for the resource. There is of course a wide variety of possibilities for including randomness in the models, but for the sake of transparency, we restrict ourselves to this extension. Stochastic stage-structured models have been studied for some time; see e.g., [8, 9, 29]. A major advantage of the stochastic setting is that, in addition to the expected values, we also get the dispersion of the different emergent properties of the population and resource. We have used Monte Carlo methods to evaluate the juvenile/adult/resource biomass, yield, impact on biomass, impact on size structure, resilience, and the minimum viable population (MVP) formulation of the probability of extinction. To be able to find the recovery potential, we needed to develop a non-equilibrium formula, since a stochastic model never reaches equilibrium.
To use this formula, we start with a pure resource environment and then introduce a small quantity of the fish population. From here, we can estimate the expected rates of change for the population in a pristine environment, which is what the formula requires; see Sect. 3.4. This new method coincides (Sect. 3.7.3) with the prior deterministic method developed in [25]. In deterministic population models, the population either goes extinct, or it stays positive for all future time. But in stochastic population models, depending on the parameters, there is a possibility that the population goes extinct within a finite period of time. We call this the probability of extinction, which is closely connected to the minimum viable population (MVP) size; see [14, 30, 34]. As mentioned above, we find the MVP formulation of the probability of extinction by Monte Carlo simulations. We have also constructed a more natural approach to this subject by utilizing the fact that extinction occurs when the recovery potential is smaller than one. Thus we have used statistical methods on this property to find the probability of extinction (Sect. 3.5); this formulation agrees well with the MVP formulation, see Sect. 3.7.4.


3.2 The Deterministic Stage Model

Our ecological model consists of a single species and a food resource. The individuals of the species are modeled by a stage-structured biomass model which is derived by formulating a size-structured population model on the individuals' life history. Individuals are divided into two stages, juveniles and adults, based only on their size. For many relevant properties of the species, a two-stage model is often enough; see e.g. [1, 13, 23, 25]. The individuals are assumed to be the same size at birth, s_birth, and the maximum size of individuals is denoted by s_max. Both stages forage on a shared resource R = R(t), and metabolic requirements increase with body size. The growth rate in body size depends on available resources [7] and varies with population density [10] and [11]. Juvenile and adult biomasses are denoted by J = J(t) and A = A(t), respectively. For the resource dynamics, the rates which follow semi-chemostat dynamics and logistic growth are given by the following differential equations:

$$\frac{dR_{sc}}{dt} = r(R_{max} - R_{sc}) - I_{max}\frac{R_{sc}}{H + R_{sc}}(J + qA),$$

$$\frac{dR_{lg}}{dt} = rR_{lg}(R_{max} - R_{lg}) - I_{max}\frac{R_{lg}}{H + R_{lg}}(J + qA),$$

where R_sc and R_lg are the semi-chemostat and logistic densities, respectively, for the resource. We assume that r is the resource turnover rate and R_max is the maximum resource density. The maximum ingestion rate per unit of biomass per time, which is assumed to scale linearly with body size, is equal to I_max for juvenile individuals, while we assume that it is equal to qI_max for adult individuals. The factor q describes the difference in ingestion rates between juveniles and adults. Here, H is the half-saturation constant of consumers. As explained above, the logistic growth rate is not a natural assumption, and the solutions to the model also become unstable under this growth rate, i.e., we will not reach a steady state or a periodic solution. We therefore postpone all simulations based on the logistic growth rate to Appendix 3.9. A more natural assumption is a combination of the above resource models, which we in this paper call the semi-logistic growth rate, R_comb, defined by

$$\frac{dR_{comb}}{dt} = p\,\frac{dR_{sc}}{dt} + (1 - p)\,\frac{dR_{lg}}{dt},$$

where p is the proportion of the semi-chemostat growth. Sections 3.1 and 3.8 explain why we introduce the semi-logistic growth rate. We use standard stage-structure dynamics (see e.g. [13, 23, 25]), i.e., juvenile individuals use all the consumed energy for growth, development, and maintenance, whereas adult individuals use all their energy for maintenance and reproduction. The dynamics of the biomasses of the juveniles and the adults become

$$\frac{dJ}{dt} = (w_J(R) - v(w_J(R)) - M - F_J)J + w_A(R)A, \qquad (3.1)$$

$$\frac{dA}{dt} = v(w_J(R))J - (M + F_A)A. \qquad (3.2)$$

Here, w_J(R) and w_A(R) are the net biomass production per unit of body mass of juveniles and adults, respectively. Furthermore, M denotes the natural mortality rate. The stage-dependent harvesting rates¹ of juveniles and adults are given by F_J and F_A. The maturation rate, v(w_J(R)), is the resource-dependent rate at which juveniles mature and become adults. Ingested resource is assimilated with a constant efficiency. The maintenance requirements are also assumed to scale linearly with body size, with proportionality constant T. Therefore, the net biomass production rates for juveniles and adults are given by

$$w_J(R) = \max\Big\{0,\ I_{max}\frac{R}{H+R} - T\Big\}, \qquad w_A(R) = \max\Big\{0,\ qI_{max}\frac{R}{H+R} - T\Big\}.$$

Here, H is the half-saturation constant of consumers. We set z = s_birth/s_max to reduce the number of parameters. In the stage model, the juvenile maturation rate is given by

$$v(w_J(R)) = \begin{cases} \dfrac{w_J(R) - M - F_J}{1 - z^{\,1 - (M+F_J)/w_J(R)}}, & w_J(R) \neq M + F_J, \\[2ex] -\dfrac{M + F_J}{\ln(z)}, & \text{otherwise.} \end{cases} \qquad (3.3)$$

By setting γ = γ(R) = 1 − (M + F_J)/w_J(R), the above equation becomes

$$v(w_J(R)) = \begin{cases} \dfrac{\gamma\, w_J(R)}{1 - z^{\gamma}}, & w_J(R) \neq M + F_J, \\[2ex] -\dfrac{M + F_J}{\ln(z)}, & \text{otherwise.} \end{cases}$$

Here, H and R_max are measured in biomass per unit of volume, while T, r, I_max and M are expressed per unit of time. All parameters, as well as the biomass densities J, A and R, can be considered non-dimensional after a rescaling of the equations. The juvenile and adult biomasses at equilibrium of the harvested population are denoted by J* = J*(F_J, F_A) and A* = A*(F_J, F_A), respectively. The yield objective function is defined by

$$\text{yield} = F_J J^* + F_A A^*.$$
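The deterministic model above can be sketched with a simple Euler integration; all parameter values below are hypothetical (chosen only so that the population persists), with the semi-chemostat resource and equal harvesting rates:

```python
import numpy as np

# Hypothetical, non-dimensional parameter values.
r, Rmax, H, Imax, q = 1.0, 2.0, 1.0, 1.0, 0.8
T, M, z, F = 0.1, 0.1, 0.01, 0.05     # F = F_J = F_A (equal harvesting)

def w_J(R):  # net biomass production per unit body mass, juveniles
    return max(0.0, Imax * R / (H + R) - T)

def w_A(R):  # net biomass production per unit body mass, adults
    return max(0.0, q * Imax * R / (H + R) - T)

def v(wj):
    """Maturation rate, gamma-form of (3.3); handles wj -> 0 and the
    removable singularity at wj = M + F."""
    if wj <= 0.0:
        return 0.0
    if abs(wj - (M + F)) < 1e-12:
        return -(M + F) / np.log(z)
    gamma = 1.0 - (M + F) / wj
    if gamma < -150.0:     # z**gamma would overflow; v is numerically 0 here
        return 0.0
    return gamma * wj / (1.0 - z ** gamma)

# Euler integration of the semi-chemostat resource and Eqs. (3.1)-(3.2).
R, J, A = Rmax, 0.1, 0.1
dt = 1e-3
for _ in range(400_000):              # 400 time units
    wj, wa = w_J(R), w_A(R)
    mat = v(wj)
    dR = r * (Rmax - R) - Imax * R / (H + R) * (J + q * A)
    dJ = (wj - mat - M - F) * J + wa * A
    dA = mat * J - (M + F) * A
    R, J, A = R + dR * dt, J + dJ * dt, A + dA * dt

yield_est = F * (J + A)               # yield = F_J J* + F_A A*
print(R, J, A, yield_est)
```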

In this study, we will assume an equal harvesting rate, F = F_J = F_A, on both stages. This is in view of [23], in which the authors conclude that an equal harvesting rate does not affect the properties of the solutions too much.

¹ It can be noted that one could differentiate the mortality rates by stage in a similar way, via M_J and M_A, to get an even more generalized model.


In the absence of harvesting, J_u* = J*(0, 0) and A_u* = A*(0, 0) denote the juvenile and adult biomasses, respectively, at equilibrium. The impact on biomass due to harvesting is measured by the expression

$$\text{impact on biomass} = 1 - \frac{B^*}{B_u^*},$$

where B* = J* + A* and B_u* = J_u* + A_u*. Similarly, the impact on size structure is considered through the expression

$$\text{impact on size structure} = \frac{J^*}{J^* + A^*}\left(\frac{J_u^*}{J_u^* + A_u^*}\right)^{-1} - 1, \qquad (3.4)$$

which equals the relative change in the fraction of juvenile biomass compared to the total biomass under different harvesting rates. If the impact on size structure is positive (negative), then harvesting increases (decreases) the fraction of juveniles in the population [23].

3.3 Stochastic Stage Model

There have been major developments in stochastic analysis in many scientific fields over the last few decades. One commonly uses a system of deterministic partial differential equations to model physiologically structured populations, and from these equations derives a stage-structured population model. To be able to investigate properties such as the probability of extinction of the species, we propose a stochastic model. We modify the stage-structured model described in Sect. 3.2 by including environmental randomness, i.e., we include a stochastic term in the growth rate of the food resource. We will study three different types of stochastic resource dynamics, which in the absence of consumers are:

semi-chemostat: Used in, e.g., [23, 25],

$$dR = r(R_{max} - R)\,dt + \nu\,dW(t),$$

where W(t) is a Wiener process and ν is the standard deviation parameter.²

logistic: Used in, e.g., [31],

$$dR = rR(R_{max} - R)\,dt + \nu R\,dW(t).$$

semi-logistic: The resource dynamics in this model are arrived at through a linear combination of the stochastic semi-chemostat and logistic growth rates, given by

² In finance this model is often referred to as the Vasicek model.


$$dR = p\,\big(r(R_{max} - R)\,dt + \nu\,dW(t)\big) + (1 - p)\,\big(rR(R_{max} - R)\,dt + \nu R\,dW(t)\big) = r\big((1 - p)R + p\big)(R_{max} - R)\,dt + \nu\big((1 - p)R + p\big)\,dW(t),$$

where 0 < p ≪ 1 is a constant.

When consumers are introduced into these models, the rates at which the biomass, R, of the available resource changes are given by the stochastic differential equations

semi-chemostat:
$$dR = r(R_{max} - R)\,dt + \nu\,dW(t) - I_{max}\frac{R}{H+R}(J + qA)\,dt,$$

logistic:
$$dR = rR(R_{max} - R)\,dt + \nu R\,dW(t) - I_{max}\frac{R}{H+R}(J + qA)\,dt,$$

semi-logistic:
$$dR = r\big((1 - p)R + p\big)(R_{max} - R)\,dt + \nu\big((1 - p)R + p\big)\,dW(t) - I_{max}\frac{R}{H+R}(J + qA)\,dt.$$

For the above mentioned models, we investigate the impact of harvesting on the consumer population. In addition to harvesting, the biomasses of juveniles and adults also depend on the resource dynamics, which in turn decreases due to the concerted foraging of all consumers on it. By using the stochastic results for the juvenile and adult biomasses, the yield for the stochastic case is calculated in the same way as in the deterministic case. A bit of caution might be needed here. The stochastic models have quite a complicated feedback interaction between the juveniles, the adults and the resource, and the expectation of the solutions in the stochastic models will thus not always be the solution of the corresponding deterministic models. If the models had a simpler setting, we would have been able to compensate for this difference using techniques similar to the theory developed in, e.g., [16].

Population extinction has been studied as an interesting topic for a variety of stochastic model formulations [31]. A small population size can lead to the extinction of a species through overharvesting, habitat destruction, disasters and other influences. One way of finding the probability of extinction is through the minimum viable population (MVP). The MVP is an estimate of the minimum number of organisms that is capable of persisting in the wild [5, 30]. The probability of extinction is defined as the probability of the event

$$A + P_A J < \text{MVP}, \qquad (3.5)$$

where P_A is the probability that juveniles will become adults. In Sect. 3.5, we present an alternative approach to finding the probability of extinction. Moreover, the impact on biomass of harvesting in the stochastic case is measured by the expression

$$\text{impact on biomass} = 1 - \frac{E[B^*]}{E[B_u^*]}, \qquad (3.6)$$

where E[B*] = E[B*](F) is the steady state of the expectation of the total biomass under a harvesting rate F, and E[B_u*] = E[B_u*](0). The impact on size structure in the stochastic case is explored in a similar way; compare Eq. (3.4).
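The semi-logistic SDE with consumers can be sketched with an Euler–Maruyama step. In the sketch below all parameter values are hypothetical, and the consumer biomasses J and A are held fixed for transparency, whereas in the full model they evolve according to (3.1)–(3.2):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical parameter values.
r, Rmax, H, Imax, q = 1.0, 2.0, 1.0, 1.0, 0.8
p, nu = 0.1, 0.05          # semi-chemostat proportion (0 < p << 1), noise level
J, A = 0.3, 0.2            # consumer biomasses, frozen in this sketch
dt, n_steps = 1e-3, 50_000

R = Rmax
for _ in range(n_steps):
    drift = (r * ((1 - p) * R + p) * (Rmax - R)
             - Imax * R / (H + R) * (J + q * A))
    diffusion = nu * ((1 - p) * R + p)
    R += drift * dt + diffusion * np.sqrt(dt) * rng.standard_normal()
    R = max(R, 0.0)        # biomass cannot become negative
print(R)
```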

3.4 The Recovery Potential

We consider the basic reproduction ratio, which represents the average number of offspring produced over the lifetime of an individual in the absence of density-dependent competition. The measure of recovery potential is closely related to the basic reproduction ratio in a virgin environment. For the deterministic stage model, the recovery potential, introduced in [25], is defined by using the steady state equation as follows:

$$g(R) = \frac{w_A(R)}{M + F_A} \times \frac{v(w_J(R))}{v(w_J(R)) - w_J(R) + M + F_J}.$$

In a virgin environment, the recovery potential is defined as a function of the harvesting rates by

$$\text{recovery potential} = \frac{w_A(R_{max})}{M + F_A} \times \frac{v(w_J(R_{max}))}{v(w_J(R_{max})) - w_J(R_{max}) + M + F_J}. \qquad (3.7)$$

It has been proved that a unique positive equilibrium of the resource, juvenile and adult biomass densities exists when the recovery potential is larger than 1. However, extinction of the population follows when the recovery potential is smaller than 1; see [25]. Meng et al. [25] derived this recovery potential, Eq. (3.7), by assuming A′ = J′ = 0 in Eqs. (3.1) and (3.2). Nevertheless, we will never reach an equilibrium in the stochastic approach to the recovery potential. Therefore, we will derive a similar expression for the recovery potential under a non-equilibrium condition. We consider the following equations for the rates at which the biomasses of juveniles and adults change:

$$J' = w_A(R)A + (w_J(R) - v(w_J(R)) - M - F_J)J, \qquad (3.8)$$

$$A' = v(w_J(R))J - (M + F_A)A. \qquad (3.9)$$

Rearranging Eq. (3.9) yields

$$v(w_J(R)) = \frac{A' + (M + F_A)A}{J}. \qquad (3.10)$$

Note that the right-hand side does not depend on R. In what follows, our goal is to derive an expression for the recovery potential that does not depend on R, since, in the stochastic case, we will not get an equilibrium solution. When this has been achieved, we can use the expectation of the recovery potential to find the probability of extinction.

We now substitute Eq. (3.10) in Eq. (3.3). When w_J(R) ≠ M + F_J, we get the unique solution w*_J(R). This solution exists since the right-hand side of Eq. (3.3) is a continuous and increasing function with respect to R; see Appendix 3.10 for a complete derivation. By substituting Eq. (3.10) and the unique solution w*_J(R), Eq. (3.8) becomes

w_A(R) = J'/A − w*_J(R)J/A + (A' + (M + F_A)A)/A + (M + F_J)J/A
       = (J' − (w*_J(R) − M − F_J)J + (A' + (M + F_A)A)) / A.   (3.11)

Recall that in this paper we assume equal harvesting rates, i.e., F = F_A = F_J. By using Eqs. (3.10), (3.11) and the unique solution w*_J(R), we define the recovery potential as R(F); see Eq. (3.7). When w_J(R) ≠ M + F, the deterministic recovery potential is

R(F) = ((J' − (w*_J(R) − M − F)J + (A' + (M + F)A)) / (A(M + F)))
       × ((A' + (M + F)A) / (A' + (M + F)A − (w*_J(R) − M − F)J)).   (3.12)

We now study the case w_J(R) = M + F. Equation (3.8) becomes

w_A(R) = J'/A − (M + F)J/A + (A' + (M + F)A)/A + (M + F)J/A
       = (J' + (A' + (M + F)A)) / A.   (3.13)

By using Eqs. (3.10), (3.13) and w_J(R), Eq. (3.7) becomes

R(F) = ((J' + (A' + (M + F)A)) / (A(M + F))) × ((A' + (M + F)A) / (A' + (M + F)A − (M + F − M − F)J))
     = (J' + (A' + (M + F)A)) / (A(M + F)).

That is, for the non-equilibrium case when w_J(R) = M + F, the recovery potential for the deterministic case is

R(F) = (J' + (A' + (M + F)A)) / (A(M + F)).   (3.14)

For the stochastic case, we find the mean and standard deviation of the recovery potential through Eqs. (3.12) and (3.14) for both cases.
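A minimal sketch of how the non-equilibrium recovery potential in Eq. (3.14) (the case w_J(R) = M + F) could be evaluated: J' and A' are estimated by central difference quotients from two short simulated time steps, as described in the text. The trajectory values below are hypothetical.

```python
def recovery_potential(Jm, Jp, Am, A0, Ap, dt, M, F):
    """Eq. (3.14): R(F) = (J' + A' + (M+F)A) / (A(M+F)), with J', A'
    from central differences over a step dt.  Jm/Jp and Am/Ap are the
    values one step before/after the evaluation point (A0 = A there)."""
    Jprime = (Jp - Jm) / (2.0 * dt)   # central difference estimate of J'
    Aprime = (Ap - Am) / (2.0 * dt)   # central difference estimate of A'
    return (Jprime + Aprime + (M + F) * A0) / (A0 * (M + F))

# Hypothetical small-population trajectory values near (J0, A0):
print(recovery_potential(Jm=0.0099, Jp=0.0101, Am=0.0099, A0=0.0100,
                         Ap=0.0101, dt=0.01, M=0.1, F=0.5))
```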


3.5 Probability of Extinction for Stochastic Case

The extinction of a population of organisms may be caused by a range of processes that fall under either environmental fluctuation or population regulation. In this paper, we investigate the probability of extinction through the results of the recovery potential. The recovery potential provides an indication of the population's probability of surviving a potential extinction caused by, e.g., environmental stochasticity or overexploitation. In other words, the biomass of an initially small population will increase in a virgin environment when the recovery potential is larger than 1, and the population will go extinct when the recovery potential is smaller than 1.

We then find the probability of extinction, which is calculated by investigating the mean of the recovery potential. To do this, we use initial data R = R_max, J = J_0, and A = A_0, where J_0 and A_0 are close to zero. We then simulate the solution for two short time steps, and use the central difference quotient to find an estimate for J' and A'. The simulations are done for a range of harvesting rates F = F_1, F_2, ..., F_N. Since the model is stochastic, we repeat this procedure many times, and use the central limit theorem to estimate the recovery potential of our model. That is, we estimate the expected recovery potential, μ_R(F) = E[R(F)] ≈ R̄(F), where R̄(F) is the sample mean of the simulated values of R(F); the standard deviation, σ_R, of the recovery potential is estimated in a similar way.

In view of the central limit theorem, we use the cumulative distribution function, CDF(x, μ, σ), of the normal distribution with mean μ and standard deviation σ, evaluated at x = 1 for the mean of the recovery potential with harvesting rate F. That is, we assume that R(F) ~ N(μ_R(F), σ_R(F)) and get the probability of extinction by

P(R(F) < 1) = CDF(1, μ_R(F), σ_R(F)).
The reason for evaluating the cumulative distribution function at x = 1 is that if the recovery potential is smaller than 1, then the population goes extinct; see [25].
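The recipe above can be sketched as follows, with the normal CDF expressed through the error function; the recovery-potential samples in the example are hypothetical Monte Carlo draws at a fixed F.

```python
import math
import statistics

def extinction_probability(rp_samples):
    """P(R(F) < 1) = CDF(1, mu, sigma) for a normal fit to simulated
    recovery potentials, as in Sect. 3.5.  The samples are assumed to
    be i.i.d. Monte Carlo draws at a fixed harvesting rate F."""
    mu = statistics.mean(rp_samples)
    sigma = statistics.stdev(rp_samples)
    # Normal CDF at x = 1 via the error function: Phi((1 - mu) / sigma)
    return 0.5 * (1.0 + math.erf((1.0 - mu) / (sigma * math.sqrt(2.0))))

# Hypothetical simulated recovery potentials at one harvesting rate:
print(extinction_probability([1.4, 1.1, 0.9, 1.2, 1.0, 1.3]))
```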

3.6 Resilience for Stage Model

Resilience is one of the important components of the stability of an ecosystem; it is a measure of how fast the ecosystem recovers after a population perturbation. It is strongly influenced by the types of environmental fluctuations commonly encountered by an ecosystem [21]. We consider the resilience of the population, for both the deterministic and the stochastic case, by measuring the reciprocal of the time needed for the population to recover to the positive equilibrium after a random perturbation [23]. We denote by (J*, A*, R*) the equilibrium of the biomass densities and let κ > 0 scale the maximum displacement of the population from the equilibrium. A trajectory of the stage model is started from a random point uniformly distributed in the cube


(0, κJ*) × (0, κA*) × (0, κR*). We then find the return time as the time needed for this trajectory to be close enough to the equilibrium in the sense that

( ((J* − J(t))/J*)² + ((A* − A(t))/A*)² + ((R* − R(t))/R*)² )^(1/2) ≤ ε,   (3.15)

for some small ε > 0. To find the resilience, we use the definition by [23]. That is, after repeating this procedure N times, the resilience is defined by

resilience = 1 / (average value of the return times).   (3.16)

The higher the resilience, the smaller the risk of extinction due to random drift [23]. To estimate the resilience in the stochastic case, we find the mean value of the biomasses of the resource, the juveniles and the adults. This is done by performing a number of simulations using the stochastic model, in which the initial values are randomly picked in the cube defined above. We then find the return time to be the time it takes for the solution to come close enough to the equilibrium in the sense of Eq. (3.15) by using the mean values. The average value of the return times is calculated after repeating the procedure N times to find the resilience by using Eq. (3.16) for the stochastic case. Finally, we repeat the above procedure many times to find the mean and standard deviation of the resilience.
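A sketch of the return-time procedure, Eqs. (3.15)–(3.16), under an assumed one-step integrator `step`; the toy relaxation used at the bottom is illustrative only, not the stage model, and the equilibrium values are hypothetical.

```python
import math
import random

def resilience(step, eq, kappa=1.5, eps=0.01, dt=1e-3, n_runs=100, t_max=1e3, seed=0):
    """Eqs. (3.15)-(3.16): start trajectories uniformly in
    (0, kappa*J*) x (0, kappa*A*) x (0, kappa*R*), record the first time
    the relative distance to the equilibrium drops below eps, and
    return the reciprocal of the average return time."""
    rng = random.Random(seed)
    Jstar, Astar, Rstar = eq
    times = []
    for _ in range(n_runs):
        J = rng.uniform(0.0, kappa * Jstar)
        A = rng.uniform(0.0, kappa * Astar)
        R = rng.uniform(0.0, kappa * Rstar)
        t = 0.0
        while t < t_max:
            dist = math.sqrt(((Jstar - J) / Jstar) ** 2 +
                             ((Astar - A) / Astar) ** 2 +
                             ((Rstar - R) / Rstar) ** 2)   # Eq. (3.15)
            if dist <= eps:
                break
            J, A, R = step(J, A, R, dt)
            t += dt
        times.append(t)
    return 1.0 / (sum(times) / len(times))                 # Eq. (3.16)

# Toy relaxation toward a hypothetical equilibrium, for illustration only:
eq = (0.44, 0.38, 1.0)
step = lambda J, A, R, dt: (J + (eq[0] - J) * dt, A + (eq[1] - A) * dt, R + (eq[2] - R) * dt)
print(resilience(step, eq))
```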

3.7 Simulation Results

In this section, we investigate emergent properties in the stage-structured biomass models. The properties we study are the steady state biomasses, yield, resilience, recovery potential, probability of extinction, impact on biomass, and impact on size structure. These properties are compared between the different models under a range of harvesting rates. In particular, the models differ in the growth model for the resource, semi-chemostat versus semi-logistic growth, in combination with deterministic versus stochastic dynamics. The simulations concerning the logistic growth model are postponed to Appendix 3.9.

In this section (as well as in the rest of this paper), all parameters as well as the biomass densities for the stage-structured biomass model are considered non-dimensional, using rescaling as in [13, 23]. In our simulations, we have used the parameters given in Table 3.1. When solving the stochastic models, we have simulated each model 10,000 times in order to find good estimates for the expectation and standard deviation, using the sample mean and the sample standard deviation.


Table 3.1 Ecological and economic parameters. The column Unit is the unit dimensions before rescaling into non-dimensional parameters

Symbol   | Value | Unit    | Interpretation
H        | 1     | kg/a.u. | Half-saturation constant of consumers
T        | 1     | day⁻¹   | Mass-specific metabolic rate
r        | 1     | day⁻¹   | Resource turnover rate
R_max    | 2     | kg/a.u. | Maximum resource density
–        | 0.5   | –       | Efficiency of resource ingestion
s_birth  | 0.1   | cm      | Size at birth
s_max    | 10    | cm      | Size at maturation
I_max    | 10    | day⁻¹   | Maximum ingestion rate per unit of biomass
q        | 0.85  | –       | Differences in ingestion ability between juveniles and adults
M        | 0.1   | day⁻¹   | Natural mortality rate
F        | –     | day⁻¹   | Harvesting rate of juveniles and adults
ρ        | 0.2   | –       | Volatility in resource growth
MVP      | 0.007 | –       | Minimum viable population

3.7.1 Stage-Structured Biomass Dynamics and Yield

Our consumer-resource models are based on the models derived by [13], which are reliable approximations of fully size-structured population models. They showed that stage-structured population models formulated in this way incorporate key individual life-history processes. In this subsection, we investigate the model with the two types of resource dynamics under different harvesting rates for the stochastic case, and include the deterministic model as a base case. The stochastic models are the same as the deterministic models, with the exception that a random perturbation is added to the resource growth rate; see Sect. 3.3. We investigate the mean value and standard deviation of all our findings once the solution has reached a steady state according to Eq. (3.15).

In Fig. 3.2, we see that the resource density increases with the harvesting rate as the population becomes overexploited (compare the biomasses of the juveniles, Fig. 3.3, and the adults, Fig. 3.4, as well as the probability of extinction, Fig. 3.10). We


Fig. 3.2 Resource dynamics of semi-chemostat (left figure) and semi-logistic (right figure) growth rate at steady state w.r.t. harvesting rate. The grey trajectories show the resource density from different simulations for the stochastic case. The yellow curve represents the resource mean value of the simulations. The light blue curves represent the interval of standard deviation of resource density. The black curve represents the resource for the deterministic case

see that the resource reaches R_max after all individuals die out, due to the harvesting strategies. In Fig. 3.3, for both growth models, the juvenile biomass first increases with the harvesting rate: due to the difference in ingestion rates, the juveniles can synthesize more proteins at metabolic costs close to the theoretical minimum compared to the adults, which compensates for the reduced reproduction of offspring. At higher harvesting rates, the increase in food can no longer compensate for the loss of reproduction of offspring, and the juvenile biomass therefore starts to decrease. Furthermore, the juvenile biomass for semi-chemostat growth decreases faster than the juvenile biomass for semi-logistic growth as the harvesting rate increases, and the two reach the same biomass at a harvesting rate around F = 1.8. When the harvesting rate becomes too high, around F = 2.2, both the juvenile and the adult population go extinct; see Figs. 3.3 and 3.4. In Fig. 3.4, we see that the adult biomass decreases, for both models, as the harvesting rate increases.

Figure 3.5 shows how the yield changes with respect to the harvesting rate. The yield for semi-chemostat growth first increases faster than the yield for semi-logistic growth as the harvesting rate increases. The yield decreases as the population approaches the MVP, with the yield for both semi-chemostat and semi-logistic growth going to zero as the harvesting rate reaches F = 2.2. The semi-logistic model reaches its maximum yield closer to the point of extinction than the semi-chemostat model does. In our model, the harvesting rate is deterministic, but in real life this is not realistic and might also cause extinction; see Sect. 3.8 for a deeper discussion.


Fig. 3.3 Juvenile biomass of semi-chemostat (left figure) and semi-logistic (right figure) growth rate at steady state w.r.t. harvesting rate. The grey trajectories show the juvenile biomass from different simulations for the stochastic case. The yellow curve represents the juvenile biomass mean value of the simulations. The light blue curves represent the interval of standard deviation of juvenile biomass. The black curve represents the juvenile biomass for the deterministic case

Fig. 3.4 Adult biomass of semi-chemostat (left figure) and semi-logistic (right figure) growth rate at steady state w.r.t. harvesting rate. The grey trajectories show the adult biomass from different simulations for the stochastic case. The yellow curve represents the adult biomass mean value of the simulations. The light blue curves represent the interval of standard deviation of adult biomass. The black curve represents the adult biomass for the deterministic case

3.7.2 Impact on Size Structure and Biomass

The population size structure is completely determined by the distribution of biomass between juveniles and adults, as compared to the distribution of the same stages at steady state without any harvesting. We investigate changes in the population size structure and biomass in response to harvesting of juveniles and adults at equal rates. In the deterministic model by [25], harvesting juveniles and adults equally always leads to an increase in the percentage of juvenile biomass in the population, but in other deterministic stage-structured models, e.g. cannibalistic models, this percentage might decrease. We follow these ideas and hence study the consequences


Fig. 3.5 The yield of semi-chemostat (left figure) and semi-logistic (right figure) growth rates at steady state w.r.t. harvesting rate. The grey trajectories show the yield from different simulations for the stochastic case. The yellow curve represents the yield mean value of the simulations. The light blue curves represent the interval of standard deviation of the yield. The black curve represents the yield for the deterministic case

Fig. 3.6 Impact on size structure of semi-chemostat (left figure) and semi-logistic (right figure) growth rate at steady state w.r.t. harvesting rate. The grey trajectories show the impact on size structure from different simulations for the stochastic case. The yellow curve represents the mean value of the simulations. The light blue curves represent the interval of standard deviation. The black curve represents the impact on size structure for the deterministic case. Equation (3.4) is used to derive the above graphs

of harvesting through the impact measures on biomass and size structure for the deterministic and stochastic cases. In contrast to the yield (see Fig. 3.5), we note that the impact on size structure, Fig. 3.6 (right), for the mean value of the semi-logistic model is relatively close to the deterministic solution, whereas in the semi-chemostat growth rate model, the mean value of the stochastic case differs quite significantly from the deterministic case (Fig. 3.6 (left)). The impact on size structure for both growth models is positive, which means that the fraction of juvenile stage biomass in comparison with the total biomass increases


Fig. 3.7 Impact on biomass of semi-chemostat (left figure) and semi-logistic (right figure) growth rate at steady state w.r.t. harvesting rate. The grey trajectories show the impact on biomass from different simulations for the stochastic case. The yellow curve represents the mean value of the simulation. The light blue curves represent the interval of standard deviation. The black curve represents the impact on biomass for the deterministic case. The above graphs are calculated using Eq. (3.6)

with higher harvesting rates. The reason for this follows the same reasoning as in Sect. 3.7.1.

Figure 3.7 shows that the impact on biomass in the semi-chemostat case (left figure) increases almost linearly from zero up to one, whereas in the semi-logistic case, the impact on biomass resembles an exponential growth curve. The relative flatness of the impact on biomass in the semi-logistic case implies that the population size is hardly affected at low harvesting rates; this is further discussed in Sect. 3.8. It is also worth noting that, in contrast to the intuition that a population must decrease when harvesting is introduced, in the stochastic semi-logistic case, the mean value of the impact on biomass may be negative. This implies that the total population biomass can increase under small harvesting rates in comparison with the steady state solution without any harvesting. This phenomenon has also been observed by [28], but does not occur in our deterministic model.

3.7.3 The Stock Recovery Potential

The recovery potential is the generational net biomass production (per unit of biomass) in a virgin environment. It approximates the net reproduction (expressed in biomass) of a small population introduced into an environment which is close to its maximum value R_max; see [25]. We have named the recovery potential expressed by Eq. (3.7) the Meng et al. recovery potential. As mentioned above, the Meng et al. recovery potential is evaluated in a virgin environment at equilibrium.


Fig. 3.8 Meng et al. recovery potential (red curve) and non-equilibrium recovery potential (blue curve) w.r.t. harvesting rate. The left figure represents the deterministic semi-chemostat growth rate and the right figure shows the deterministic semi-logistic growth rate

Fig. 3.9 The deterministic Meng et al. recovery potential (red curve), the mean value of the stochastic non-equilibrium recovery potential (yellow curve), and the interval of standard deviation of the stochastic non-equilibrium recovery potential (light blue curves) w.r.t. harvesting rate. The left figure represents the semi-chemostat growth rate and the right figure shows the semi-logistic growth rate

Since our stochastic model never reaches equilibrium, we reformulated the recovery potential in non-equilibrium terms through Eqs. (3.12) and (3.14). We refer to this formulation as the non-equilibrium recovery potential. In this section, we find that the two formulations for the recovery potential agree in the deterministic case (see Fig. 3.8). Also, the Meng et al. recovery potential in the deterministic case falls within one standard deviation of the mean value of the non-equilibrium recovery potential (see Fig. 3.9). Because of this agreement, we can now use the non-equilibrium recovery potential when finding the probability of extinction; see Sect. 3.7.4.


Fig. 3.10 Probability of extinction with respect to the harvesting rate

3.7.4 The Probability of Extinction

Stability in ecological systems is important since a lack of stability promotes extinction. Under a random environment, the lack of a steady state can amplify the probability that the population goes extinct.[3] We have two ways of finding the probability of extinction: we can either use the MVP formulation, Eq. (3.5), or we can use the method explained in Sect. 3.5, which we call the RP formulation.[4] As can be seen in Fig. 3.10, the two formulations for the probability of extinction give very similar results, both in the semi-chemostat model (top row figures) and in the semi-logistic model (bottom row figures); the MVP formulation is presented in the left figures and the RP formulation in the right figures.

The probability of extinction is calculated for both the semi-chemostat dynamics (top figures) and the semi-logistic dynamics (bottom figures). The yellow curve (left figures) shows the mean of the MVP formulation of the probability of extinction, and the blue curves represent the interval of the standard deviation of this probability of extinction. For the RP formulation of the probability of extinction (right figures), the mean is shown by the yellow curve and the standard deviation by the light blue curves.

The numerical values in these calculations are as follows. Each model is run for 10,000 simulations. For the MVP formulation of the probability of extinction, we have used P_A = 0.5, which we deduced from our simulations, and MVP = 0.007, which is based on the assumptions that: (1) the simulated lake contains an expected value of 10,000 adult individuals at steady state without harvesting, (2) MVP = 117 (individuals), which is the average value for freshwater fish species given in [34], and (3) the steady state of the expected density A + P_A · J = 0.38 + 0.5 · 0.44 (weight/volume) in our simulations. For the RP formulation of the probability of extinction, we have used the right graphs in Fig. 3.10. First, we find the mean value of the non-equilibrium recovery potential for 50 simulations, and call this stochastic variable R̄_1. We then repeat this procedure 200 times, and follow the explanations given in Sect. 3.5 to find the probability of extinction.

[3] In the deterministic setting, the probability of extinction can be evaluated using Eq. (3.5), which produces a graph with zero probability of extinction up to a certain point, after which the probability of extinction is one.
[4] Here, RP stands for the recovery potential formulation used to find the probability of extinction.
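One way to reproduce the stated MVP density from assumptions (1)–(3) is the scaling below; this is our reading of the calculation, which is not spelled out in the text.

```python
# Back-of-the-envelope check of the MVP density used in Sect. 3.7.4:
# if 10,000 adults at the unharvested steady state correspond to the
# effective density A + P_A * J = 0.38 + 0.5 * 0.44, then 117 individuals
# (the freshwater-fish MVP from [34]) correspond to the density below.
density_per_10000 = 0.38 + 0.5 * 0.44   # weight/volume at steady state
mvp_individuals = 117
mvp_density = mvp_individuals * density_per_10000 / 10_000
print(round(mvp_density, 3))  # matches the MVP value 0.007 in Table 3.1
```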

3.7.5 Resilience

Resilience is increasingly used in ecology and in fishery management contexts [23]. In simple terms, one can say that the higher the resilience, the shorter the time it takes for a disturbed population to converge back toward its steady-state solution. In this section, we use the ideas in Sect. 3.6 to consider the resilience of a population, using both the semi-chemostat resource dynamics and the semi-logistic resource dynamics. The initial values for the resource, the juvenile, and the adult biomass in each simulation are chosen at random in the cube. We then run the simulations, and for each trajectory we determine the return time by using inequality (3.15). The resilience is measured by taking the reciprocal of the average value of the return times over a large number of simulations, by the use of Eq. (3.16).

We observe that the resilience for the deterministic models in Fig. 3.11 achieves its maximum around the same harvesting rates as the corresponding maximum values of the yield; see Fig. 3.5. Thus, in both models, one might want to aim for a slightly lower yield in order to increase the resilience; compare this with the concept of "pretty good yield" as defined in [23] (Fig. 3.12).

3.8 Conclusions and Discussion

An aquatic ecological system consisting of one fish species and its food resource is considered. We study the impact on, and emergent properties of, a two-stage structured population model using different growth rate expressions for the unstructured resource under different harvesting rates. The growth rate models we consider


Fig. 3.11 The resilience for semi-chemostat (left figure) and semi-logistic (right figure) growth is calculated by taking the reciprocal of the mean value of return time over the number of simulations for the deterministic case. We find the return times when the solution trajectory is close enough to the equilibrium in the sense of Eq. (3.15) by the mean value of 50 such simulations. The mean (yellow curve) and the standard deviation (light blue curves) of the resilience are shown for both models

Fig. 3.12 The resilience for semi-chemostat (left figure) and semi-logistic (right figure) stochastic growth rate. The mean (yellow curve) and the standard deviation (light blue curves) of the resilience are shown for both models

are logistic growth, semi-chemostat growth, and semi-logistic growth. The semi-chemostat growth rate is a natural model when plants or phytoplankton are considered as the food resource, but it may also be appropriate for zooplankton and insects when these migrate into the relevant ecosystem. The logistic growth rate may be considered when the resource consists of zooplankton and insects in a closed ecosystem, but in general, the fish population will not be able to reach all of the resource, due to coverage or other types of inaccessibility (see e.g. [26] for a deeper discussion of these phenomena in ecosystems). For this reason, the logistic model is not a realistic natural growth rate model. In addition, when we simulate the ecosystem with logistic growth, we most often get unstable solutions; see Appendix 3.9. Many authors have investigated the behavior of stage-structured models; see e.g. [13, 23, 25].


To compensate for influx and inaccessibility of the resource, we have also defined a new growth rate model, which we call semi-logistic growth. The semi-logistic growth model is a linear combination of the other two models, in which only a small portion of semi-chemostat growth is used.

The main purpose of this paper is the study of stochastic growth rate models. To the deterministic semi-chemostat growth rate model, we added white noise in terms of a Brownian motion multiplied by a constant volatility, and to the deterministic logistic growth rate model, the same white noise is added, but with the volatility proportional to the resource. Both these models have been considered by other authors; see e.g. [16, 24, 27, 31]. Some of the population properties can be directly derived in this stochastic setting by estimating the expectation and variance via Monte Carlo simulations. These properties include the biomasses, yield, impact on biomass, impact on size structure, and so on. Among these properties, it is worth mentioning that both models shift their size structure towards juvenile individuals as harvesting is increased; see Fig. 3.6. In contrast to this, there is a big discrepancy between the models in that the total biomass in the semi-logistic growth rate model is not much affected under moderate harvesting rates, whereas the total biomass in the semi-chemostat growth rate model declines rapidly with increasing harvesting rates; see Fig. 3.7. Also, in some scenarios, the total biomass actually increases under small harvesting rates in the semi-logistic growth rate model.

Some properties cannot directly be translated from the deterministic case to the stochastic case. For example, the recovery potential defined in [25] is evaluated assuming an equilibrium in the governing differential equations, and since this cannot be achieved with a stochastic growth rate, we have, in Sect. 3.4, derived an equivalent formulation in a non-equilibrium setting.

Furthermore, new emergent properties become available when stochastic models are used, for example the probability of extinction. In the literature (see e.g. [34]), the probability of extinction can be evaluated by comparing the solution to the rather inexact formulation in terms of the minimum viable population (MVP). In Sect. 3.5, we propose another approach to evaluating the probability of extinction. In Sect. 3.7.4, our approach is corroborated with the aforementioned MVP formulation.

The stochasticity that we introduced is focused on a random growth rate for the resource. This is a natural assumption, since the resource depends on the fluctuations in the environment. Another feature that cannot be naturally controlled is the harvesting. In upcoming work, we will study a more general stochastic stage-structured model, in which we include randomness in a broader way. We introduced the stochasticity in the already derived system of ordinary differential equations that describes the stage-structured population dynamics. In future work, however, we will try to include the randomness in the basic assumptions. This will lead to a physiologically structured population model, consisting of a system of stochastic partial differential equations, from which we would like to derive the stochastic stage-structured model.


Fig. 3.13 Resource dynamics (left figure), juvenile biomass (middle figure) and adult biomass (right figure) w.r.t. harvesting rates. The grey trajectories show the results from different simulations for the stochastic case. The yellow curve represents the mean value of simulations and the light blue curves represent the confidence interval for one standard deviation for the stochastic case

Acknowledgements This research was supported by International Science Programme (ISP) in collaboration with South-East Asia Mathematical Network (SEAMaN). The authors are grateful to Lennart Persson for his insightful comments about growth rate models in our manuscript. Tin Nwe Aye is grateful to the research environment Mathematics and Applied Mathematics (MAM), Division of Applied Mathematics, Mälardalen University for providing an excellent and inspiring environment for research education and research.

3.9 Appendix 3.1. Stage-Structured Biomass Model with Logistic Resource Dynamics

For completeness, in this appendix we study the instability of the logistic growth rate model. This instability shows that caution must be taken when designing models containing the logistic growth rate. We simulate the biomass-based consumer-resource model and examine the total biomass in the juvenile and adult stages. We also study the yield, the impact on biomass and size structure, the resilience, and the recovery potential.

Our results for the logistic case reveal that the solution is unstable in a large region of harvesting rates; see Figs. 3.13, 3.14, 3.15, 3.16, 3.17 and 3.18. Except for the logistic growth rate, we have used the same parameters as in Sect. 3.7. From the instability of the solution, it follows that the probability of extinction of the population is positive over a large range of harvesting rates (see Fig. 3.17) when using the logistic growth rate for the resource. Also, the impact on biomass (Fig. 3.15) gives an obviously unrealistic result due to this instability.

3.10 Appendix 3.2. Proof of Uniqueness of the Solution w*_J(R)

For simplicity, we set θ = w_J(R) and K = M + F_J. Equation (3.3) then becomes


Fig. 3.14 The yield of logistic growth rate at steady state w.r.t. harvesting rate. The grey trajectories show the yield from different stochastic simulations w.r.t. harvesting rate. The yellow curve represents the mean value of the stochastic simulations. The light blue curves represent the interval of standard deviation of the yield. The black curve represents the yield for the deterministic case

Fig. 3.15 Impact on size structure (left figure) and biomass (right figure) of logistic growth rate at steady state w.r.t. harvesting rate. The grey trajectories show the impact on size structure and biomass from different simulations for the stochastic case. The yellow curve represents the mean value of the simulation. The light blue curves represent the interval of standard deviation. The black curve represents the impact on size structure and biomass for the deterministic case

v(θ) = (θ − K) / (1 − z^(1 − K/θ))   for θ ≠ K,
v(θ) = −K / ln(z)                    for θ = K.

Our goal is to show that this function is injective. Since this function is continuous for all θ ≥ 0,[5] and differentiable for all θ > 0 such that θ ≠ K, we study its derivative. We compute the derivative of v(θ) to check whether v(θ) is an increasing function:

dv(θ)/dθ = z^(−K/θ) θ^(−2) (z^(1−K/θ) − 1)^(−2) ( z^(K/θ) θ² − zθ² − K² z ln(z) + K z θ ln(z) ).

[5] This function is actually not defined at θ = 0, but the limit exists and equals zero.


Fig. 3.16 Recovery potential w.r.t. harvesting rate. Deterministic model, left figure: Meng et al. recovery potential (red curve) and non-equilibrium recovery potential (blue curve). Stochastic model, right figure: Meng et al. recovery potential (red curve), mean value of the non-equilibrium recovery potential (yellow curve), and interval of one standard deviation of the non-equilibrium recovery potential (light blue curves)

Fig. 3.17 The probability of extinction w.r.t. harvesting rate: the MVP formulation (left figure) and the RP formulation (right figure). The yellow curve represents the mean value of the probability of extinction and the light blue curves represent the confidence interval of one standard deviation

Fig. 3.18 The resilience with respect to the harvesting rate, for the deterministic model (left figure) and the stochastic model (right figure). The mean (yellow curve) and the confidence interval of one standard deviation (light blue curves) are shown in both figures

3 Method Development for Emergent Properties in Stage-Structured …


We see that the first factor in the above derivative is positive; thus it is enough to show that the second factor is positive. We rewrite

z^{K/θ}θ² − zθ² − K²z ln(z) + Kzθ ln(z) = θ² ( z^{K/θ} − z − (K/θ)² z ln(z) + (K/θ) z ln(z) ) = θ² g(K/θ).

Then g(K/θ) is zero in the case K = θ. Setting t = K/θ gives

g(t) = (z^t − z) − t² z ln(z) + t z ln(z) = (z^t − z) + t(1 − t) z ln(z).

We must show that g(t) > 0 when t > 0 and t ≠ 1. The derivative of g(t) is

dg(t)/dt = ln(z)(z^t − 2tz + z) = z ln(z)(1 − 2t + z^{t−1}).

Observe that, since 0 < z = s_birth/s_max < 1, we get z ln(z) < 0. We now denote L(t) = 1 − 2t + z^{t−1}. It is then sufficient to show that L(t) < 0 when t > 1 and L(t) > 0 when 0 < t < 1. Since L(1) = 0 and the derivative dL(t)/dt = z^{t−1} ln(z) − 2 < 0, the function L is strictly decreasing and the claim follows: g is decreasing on (0, 1) and increasing on (1, ∞), with g(1) = 0, hence g(t) > 0 for t > 0, t ≠ 1, and we are done.
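The positivity of g can also be probed empirically. The following sketch is not part of the chapter: the grid and the sample values of z = s_birth/s_max are arbitrary choices, used only as a sanity check of the formula g(t) = (z^t − z) + t(1 − t) z ln(z) derived above.

```python
import math

def g(t, z):
    # g(t) = (z^t - z) + t(1 - t) z ln(z), as in the proof above
    return (z ** t - z) + t * (1.0 - t) * z * math.log(z)

# sample values of 0 < z < 1 and a grid of t > 0 with t != 1
for z in (0.1, 0.5, 0.9):
    for i in range(1, 300):
        t = i / 100.0
        if t == 1.0:
            continue
        assert g(t, z) > 0.0, (z, t)
print("g(t) > 0 on the sampled grid")
```

The check passes even very close to t = 1, where g vanishes to second order, which is consistent with g(1) = 0 and g''(1) = z ln²(z) − 2z ln(z) > 0.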

References

1. Ackleh, A.S., Jang, S.R.J.: A discrete two-stage population model: continuous versus seasonal reproduction. J. Differ. Eq. Appl. 13(4), 261–274 (2007)
2. Aiello, W.G., Freedman, H.I., Wu, J.: Analysis of a model representing stage-structured population growth with state-dependent time delay. SIAM J. Appl. Math. 52(3), 855–869 (1992)
3. Aye, T.N., Carlsson, L.: Increasing efficiency in the EBT algorithm. In: ASMDA2019, 18th Applied Stochastic Models and Data Analysis International Conference, pp. 179–205. ISAST: International Society for the Advancement of Science and Technology (2019)
4. Bardgett, R.D., van der Putten, W.H.: Belowground biodiversity and ecosystem functioning. Nature 515(7528), 505–511 (2014)
5. Boyce, M.S.: Population viability analysis. Annu. Rev. Ecol. Syst. 23(1), 481–497 (1992)
6. Brännström, Å., Carlsson, L., Rossberg, A.G.: Rigorous conditions for food-web intervality in high-dimensional trophic niche spaces. J. Math. Biol. 63(3), 575–592 (2011)
7. Brown, J.H., Gillooly, J.F., Allen, A.P., Savage, V.M., West, G.B.: Toward a metabolic theory of ecology. Ecology 85(7), 1771–1789 (2004)
8. Burgman, M.A., Gerard, V.A.: A stage-structured, stochastic population model for the giant kelp Macrocystis pyrifera. Mar. Biol. 105(1), 15–23 (1990)
9. Castañera, M.B., Aparicio, J.P., Gürtler, R.E.: A stage-structured stochastic model of the population dynamics of Triatoma infestans, the main vector of Chagas disease. Ecol. Model. 162(1–2), 33–53 (2003)
10. de Roos, A.M., Persson, L.: Size-dependent life-history traits promote catastrophic collapses of top predators. Proc. Natl. Acad. Sci. 99(20), 12907–12912 (2002)
11. de Roos, A.M., Persson, L.: The Influence of Individual Growth and Development on the Structure of Ecological Communities, pp. 89–100. Academic Press (2005)


12. de Roos, A.M., Metz, H., Evers, E., Leipoldt, A.: A size dependent predator-prey interaction: who pursues whom? J. Math. Biol. 28(6), 609–643 (1990)
13. de Roos, A.M., Schellekens, T., van Kooten, T., van de Wolfshaar, K., Claessen, D., Persson, L.: Simplifying a physiologically structured population model to a stage-structured biomass model. Theor. Popul. Biol. 73(1), 47–62 (2008)
14. Flather, C.H., Hayward, G.D., Beissinger, S.R., Stephens, P.A.: Minimum viable populations: is there a 'magic number' for conservation practitioners? Trends Ecol. Evol. 26(6), 307–316 (2011)
15. Flores, C.O., Kortsch, S., Tittensor, D., Harfoot, M., Purves, D.: Food Webs: Insights from a General Ecosystem Model. BioRxiv, p. 588665 (2019)
16. Giet, J.S., Vallois, P., Wantz-Mézieres, S.: The logistic SDE. Theory Stoch. Process. 20(1), 28–62 (2015)
17. Hasan, M.R.: Nutrition and feeding for sustainable aquaculture development in the third millennium. In: Aquaculture in the Third Millennium. Technical Proceedings of the Conference on Aquaculture in the Third Millennium, pp. 193–219 (2001)
18. Liu, S., Hu, Z., Wu, S., Li, S., Li, Z., Zou, J.: Methane and nitrous oxide emissions reduced following conversion of rice paddies to inland crab-fish aquaculture in southeast China. Environ. Sci. Technol. 50(2), 633–642 (2016)
19. Liz, E., Pilarczyk, P.: Global dynamics in a stage-structured discrete-time population model with harvesting. J. Theor. Biol. 297, 148–165 (2012)
20. Loreau, M.: Biodiversity and ecosystem functioning: recent theoretical advances. Oikos 91(1), 3–17 (2000)
21. Loreau, M., Behera, N.: Phenotypic diversity and stability of ecosystem processes. Theor. Popul. Biol. 56(1), 29–47 (1999)
22. Loreau, M., de Mazancourt, C.: Biodiversity and ecosystem stability: a synthesis of underlying mechanisms. Ecol. Lett. 16, 106–115 (2013)
23. Lundström, N.L., Loeuille, N., Meng, X., Bodin, M., Brännström, Å.: Meeting yield and conservation objectives by harvesting both juveniles and adults. Am. Nat. 193(3), 373–390 (2019)
24. Lv, Q., Pitchford, J.W.: Stochastic von Bertalanffy models, with applications to fish recruitment. J. Theor. Biol. 244(4), 640–655 (2007)
25. Meng, X., Lundström, N.L., Bodin, M., Brännström, Å.: Dynamics and management of stage-structured fish stocks. Bull. Math. Biol. 75(1), 1–23 (2013)
26. Persson, L., Leonardsson, K., de Roos, A.M., Gyllenberg, M., Christensen, B.: Ontogenetic scaling of foraging rates and the dynamics of a size-structured consumer-resource model. Theor. Popul. Biol. 54(3), 270–293 (1998)
27. Rupšys, P., Petrauskas, E., Bartkevičius, E., Memgaudas, R.: Re-examination of the taper models by stochastic differential equations. In: Recent Advances in Signal Processing, Computational Geometry and Systems Theory, pp. 43–47 (2011)
28. Schröder, A., van Leeuwen, A., Cameron, T.C.: When less is more: positive population-level effects of mortality. Trends Ecol. Evol. 29(11), 614–624 (2014)
29. Scranton, K., Knape, J., de Valpine, P.: An approximate Bayesian computation approach to parameter estimation in a stochastic stage-structured population model. Ecology 95(5), 1418–1428 (2014)
30. Shaffer, M.L.: Minimum population sizes for species conservation. BioScience 31(2), 131–134 (1981)
31. Shah, M.A.: Stochastic logistic model for fish growth. Open J. Stat. 4(01), 11 (2014)
32. Sribhibhadh, A.: Role of aquaculture in economic development within southeast Asia. J. Fish. Res. Board Can. 33 (2011)
33. Subasinghe, R., Soto, D., Jia, J.: Global aquaculture and its role in sustainable development. Rev. Aquac. 1(1), 2–9 (2009)
34. Wang, T., Fujiwara, M., Gao, X., Liu, H.: Minimum viable population size and population growth rate of freshwater fishes and their relationships with life history traits. Sci. Rep. 9(1), 1–8 (2019)

Chapter 4

Representations of Polynomial Covariance Type Commutation Relations by Linear Integral Operators on L^p Over Measure Spaces

Domingos Djinja, Sergei Silvestrov, and Alex Behakanira Tumwesigye

Abstract Representations of polynomial covariance type commutation relations by linear integral operators on L^p over measure spaces are constructed. Conditions for such representations are described in terms of the kernels of the corresponding integral operators. Representations by integral operators are studied both for general polynomial covariance commutation relations and for important classes of polynomial covariance commutation relations associated to arbitrary monomials and to affine functions. Examples of integral operators on L^p spaces representing the covariance commutation relations are constructed. Representations of commutation relations by integral operators with special classes of kernels, such as separable kernels and convolution kernels, are investigated.

Keywords Integral operators · Covariance commutation relations · Convolution

MSC 2020 47G10 · 47L80 · 81D05 · 47L65

D. Djinja
Department of Mathematics and Informatics, Faculty of Sciences, Eduardo Mondlane University, Box 257, Maputo, Mozambique

D. Djinja (B) · S. Silvestrov
Division of Mathematics and Physics, School of Education, Culture and Communication, Mälardalen University, Box 883, 72123 Västerås, Sweden
e-mail: [email protected]; [email protected]

S. Silvestrov
e-mail: [email protected]

A. B. Tumwesigye
Department of Mathematics, College of Natural Sciences, Makerere University, Box 7062, Kampala, Uganda
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_4


4.1 Introduction

Commutation relations of the form

AB = B F(A)    (4.1)

where A, B are elements of an associative algebra and F is a function of the elements of the algebra, are important in many areas of mathematics and its applications. Such commutation relations are usually called covariance relations, crossed product relations or semi-direct product relations. Elements of an algebra that satisfy (4.1) are called a representation of this relation in that algebra. Representations of covariance commutation relations (4.1) by linear operators are important for the study of actions and induced representations of groups and semigroups, crossed product operator algebras, dynamical systems, harmonic analysis, wavelets and fractals analysis, and applications in physics and engineering [4, 5, 16–18, 26–28, 34, 35, 42]. A description of the structure of representations for the relation (4.1), and for more general families of self-adjoint operators satisfying such relations, by bounded and unbounded self-adjoint linear operators on a Hilbert space uses reordering formulas for functions of the algebra elements and operators satisfying covariance commutation relations, functional calculus and spectral representation of operators, and the interplay with dynamical systems generated by iteration of the maps involved in the commutation relations [3, 7–13, 19–21, 29–34, 36–40, 42–55]. In this paper, we construct representations of the covariance commutation relations (4.1) by linear integral operators on Banach spaces L^p over measure spaces. When B = 0, the relation (4.1) is trivially satisfied for any A. Thus, we focus on the construction and properties of nontrivial representations of (4.1). We consider representations by linear integral operators defined by kernels satisfying different conditions. We derive conditions on such kernel functions so that the corresponding operators satisfy (4.1) for polynomial F when both operators are of linear integral type.
In particular, we prove that there are no nonzero one-sided convolution linear integral operators representing the covariance type commutation relation for the monomial t^m, where m is a nonnegative integer different from 1. This paper is organized in four sections. After the introduction, we present in Sect. 4.2 some preliminaries, notations, basic definitions and two useful lemmas. In Sect. 4.3, we present some representations when both operators A and B are linear integral


operators acting on the Banach space L^p. In particular, we consider the cases when the operators are of convolution type and when the operators have separable kernels.

4.2 Preliminaries and Notations

In this section we present preliminaries, basic definitions and notations for this article [1, 2, 6, 14, 22–24, 41]. Let R be the set of all real numbers, X a non-empty set, and S ⊆ X. Let (S, Σ, μ) be a σ-finite measure space, where Σ is a σ-algebra of measurable subsets of S, μ is a measure, and S can be covered by at most countably many disjoint sets E₁, E₂, E₃, … such that E_i ∈ Σ and μ(E_i) < ∞, i = 1, 2, …. For 1 ≤ p < ∞, we denote by L^p(S, μ) the set of all classes of measurable functions f : S → R (equivalent if they differ only on a set of measure zero) such that ∫_S |f(t)|^p dμ < ∞. This is a Banach space (a Hilbert space when p = 2) with norm

‖f‖_p = ( ∫_S |f(t)|^p dμ )^{1/p}.

We denote by L^∞(S, μ) the set of all classes of equivalent measurable functions f : S → R such that there exists C > 0 with |f(t)| ≤ C almost everywhere. This is a Banach space with norm ‖f‖_∞ = ess sup_{t∈S} |f(t)|. The support of a function f : X → R is supp f = {t ∈ X : f(t) ≠ 0}. We will use the notation

Q_G(u, v) = ∫_G u(t)v(t) dμ    (4.2)

for G ∈ Σ and functions u, v : G → R for which the integral exists and is finite. The convolution of functions f : R → R and g : R → R is defined by

(f ⋆ g)(t) = ∫_{−∞}^{+∞} f(τ) g(t − τ) dτ.

We will now consider two useful lemmas for integral operators which will be used throughout the article. Lemma 1 is used in the proof of Theorem 1 and Lemma 2 is used in the proof of Theorem 2.
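Before turning to the lemmas, the convolution definition above can be sanity-checked numerically. The sketch below is ours, not the chapter's: the Gaussian test functions and the Riemann-sum discretization are arbitrary choices, used because the convolution of two Gaussians has a simple closed form, e^{−t²/2}·√(π/2).

```python
import math

def conv(f, g, t, a=-5.0, b=5.0, n=4000):
    # Riemann-sum approximation of (f * g)(t) = \int f(tau) g(t - tau) dtau;
    # accurate here because the integrand decays rapidly outside [a, b]
    h = (b - a) / n
    return sum(f(a + i * h) * g(t - (a + i * h)) for i in range(n)) * h

f = lambda u: math.exp(-u * u)
g = lambda u: math.exp(-u * u)

t = 0.7
exact = math.sqrt(math.pi / 2.0) * math.exp(-t * t / 2.0)  # Gaussian * Gaussian
assert abs(conv(f, g, t) - exact) < 1e-6
print("convolution of two Gaussians matches the closed form")
```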

Lemma 1 Let (X, Σ, μ) be a σ-finite measure space. Let f, g ∈ L^q(X, μ) for 1 ≤ q ≤ ∞, and let G₁, G₂ ∈ Σ with μ(G_i) < ∞, i = 1, 2. Let G = G₁ ∩ G₂. Then the following statements are equivalent:

1. For all x ∈ L^p(X, μ), 1 ≤ p ≤ ∞, such that 1/p + 1/q = 1,

Q_{G₁}(f, x) = ∫_{G₁} f(t)x(t)dμ = ∫_{G₂} g(t)x(t)dμ = Q_{G₂}(g, x).

2. The following conditions hold:
(a) for almost every t ∈ G, f(t) = g(t);
(b) for almost every t ∈ G₁ \ G, f(t) = 0;
(c) for almost every t ∈ G₂ \ G, g(t) = 0.

Proof 2 ⇒ 1. By additivity of the integral with respect to the measure μ on Σ,

∫_{G₁} f(t)x(t)dμ = ∫_{G₁\G} f(t)x(t)dμ + ∫_G f(t)x(t)dμ = ∫_G f(t)x(t)dμ
= ∫_G g(t)x(t)dμ = ∫_{G₂\G} g(t)x(t)dμ + ∫_G g(t)x(t)dμ = ∫_{G₂} g(t)x(t)dμ.

1 ⇒ 2. For the indicator function x(t) = I_{H₁}(t) of the set H₁ = G₁ ∪ G₂,

∫_{G₁} f(t)x(t)dμ = ∫_{G₂} g(t)x(t)dμ, that is, ∫_{G₁} f(t)dμ = ∫_{G₂} g(t)dμ = η,

where η is a constant. Now, taking x(t) = I_{G₁\G}(t), we get

∫_{G₁} f(t)x(t)dμ = ∫_{G₁\G} f(t)dμ and ∫_{G₂} g(t)x(t)dμ = ∫_{G₂} g(t) · 0 dμ = 0.

Then ∫_{G₁\G} f(t)dμ = 0. Analogously, taking x(t) = I_{G₂\G}(t), we get ∫_{G₂\G} g(t)dμ = 0. We claim that f(t) = 0 for almost every t ∈ G₁ \ G and g(t) = 0 for almost every t ∈ G₂ \ G. Take a partition S₁, S₂, …, S_n, … of the set G₁ \ G such that each set S_i, i = 1, 2, 3, …, has positive measure. For each x_i(t) = I_{S_i}(t), i = 1, 2, 3, …, we have ∫_{G₁} f(t)x_i(t)dμ = ∫_{S_i} f(t)dμ and ∫_{G₂} g(t)x_i(t)dμ = 0. Thus ∫_{S_i} f(t)dμ = 0, i = 1, 2, 3, …. Since the partition can be chosen arbitrarily with positive measure on each of its elements, f(t) = 0 for almost every t ∈ G₁ \ G. Analogously, g(t) = 0 for almost every t ∈ G₂ \ G. Therefore

η = ∫_{G₁} f(t)dμ = ∫_G f(t)dμ = ∫_G g(t)dμ = ∫_{G₂} g(t)dμ.

Then, for every function x ∈ L^p(X, μ), we have ∫_G f(t)x(t)dμ = ∫_G g(t)x(t)dμ, that is, ∫_G [f(t) − g(t)]x(t)dμ = 0. This implies that f(t) = g(t) for almost every t ∈ G. □
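A discrete illustration may help build intuition for Lemma 1 (this toy example is ours, not the chapter's): with counting measure on a finite set, the integrals become finite sums, and conditions (a)-(c) visibly force the two pairings to coincide.

```python
import random

# Discrete analogue of Lemma 1 (counting measure on a finite set X):
# Q_{G1}(f, x) = sum over G1 of f*x and Q_{G2}(g, x) = sum over G2 of g*x.
random.seed(0)
X = range(10)
G1, G2 = set(range(0, 6)), set(range(3, 9))
G = G1 & G2                       # the intersection {3, 4, 5}
f = {t: 0.0 for t in X}
g = {t: 0.0 for t in X}
for t in G:                       # condition (a): f = g on G
    f[t] = g[t] = random.uniform(-1, 1)
# conditions (b) and (c): f = 0 on G1 \ G and g = 0 on G2 \ G already hold.
for _ in range(100):
    x = {t: random.uniform(-1, 1) for t in X}
    Q1 = sum(f[t] * x[t] for t in G1)
    Q2 = sum(g[t] * x[t] for t in G2)
    assert abs(Q1 - Q2) < 1e-12
print("Q_G1(f, x) == Q_G2(g, x) for all sampled x")
```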

Let n be a positive integer, (Rⁿ, Σ, μ) the standard Lebesgue measure space, and Ω ∈ Σ. We denote by C(Ω) the set of all continuous functions f : Ω → R. This is a Banach space with norm ‖f‖ = max_{t∈Ω} |f(t)|. We denote by C_c(Rⁿ) the set of all continuous functions with compact support. The following statement is analogous to Lemma 1 under the conditions that X = Rⁿ and that the sets G₁, G₂ may have infinite measure.


Lemma 2 Let (Rⁿ, Σ, μ) be the standard Lebesgue measure space and f, g ∈ L^q(Rⁿ, μ) for 1 < q < ∞, G₁ ∈ Σ and G₂ ∈ Σ. Let G = G₁ ∩ G₂. Then the following statements are equivalent:

1. For all x ∈ L^p(Rⁿ, μ), where 1 < p < ∞ is such that 1/p + 1/q = 1,

Q_{G₁}(f, x) = ∫_{G₁} f(t)x(t)dμ = ∫_{G₂} g(t)x(t)dμ = Q_{G₂}(g, x).

2. The following conditions hold:
(a) for almost every t ∈ G, f(t) = g(t);
(b) for almost every t ∈ G₁ \ G, f(t) = 0;
(c) for almost every t ∈ G₂ \ G, g(t) = 0.

Proof 2 ⇒ 1. This follows by direct computation as in the proof of Lemma 1.

1 ⇒ 2. Suppose that 1 holds. If G₁ ∈ Σ and G₂ ∈ Σ have finite measure, then the claim follows from Lemma 1. Suppose that either G₁ or G₂ has infinite measure. For any α > 0 and Ω_α = [−α, α]ⁿ ⊂ Rⁿ, the set V_α = {x ∈ C_c(Rⁿ) : x(t) = 0 for all t ∈ Rⁿ \ Ω_α} is a subspace of C_c(Rⁿ). Since condition 1 is satisfied for any x ∈ V_α, and any x ∈ V_α vanishes outside the set Ω_α, which has finite measure, Lemma 1 yields:
(a) for almost every t ∈ G ∩ Ω_α, f(t) = g(t);
(b) for almost every t ∈ (G₁ ∩ Ω_α) \ G, f(t) = 0;
(c) for almost every t ∈ (G₂ ∩ Ω_α) \ G, g(t) = 0.
These conclusions hold for any fixed α > 0, and hence for the corresponding Ω_α, V_α. Since f, g ∈ L^q(Rⁿ, μ), 1 < q < ∞, there exist compact sets K_m such that

lim_{m→+∞} μ({t ∈ Rⁿ \ K_m : f(t) > 0}) = lim_{m→+∞} μ({t ∈ Rⁿ \ K_m : g(t) > 0}) = 0.

Hence condition 1 holds for all x ∈ C_c(Rⁿ) if and only if condition 2 holds. The conclusion follows from [6, Theorems 4.3 and 4.12], that is, from the fact that the set C_c(Rⁿ) is dense in L^p(Rⁿ, μ) for 1 < p < ∞. □

4.3 Representations by Linear Integral Operators

Let (X, Σ, μ) be a σ-finite measure space. In this section we consider representations of the covariance type commutation relation (4.1) when both A and B are linear integral operators acting from the Banach space L^p(X, μ) to itself, for a fixed p such that 1 ≤ p ≤ ∞, defined as follows:

(Ax)(t) = ∫_{S_A} k_A(t, s) x(s) dμ_s,  (Bx)(t) = ∫_{S_B} k_B(t, s) x(s) dμ_s,

almost everywhere, where the index in μ_s indicates the variable of integration, S_A, S_B ∈ Σ, μ(S_A) < ∞, μ(S_B) < ∞, and k_A : X × S_A → R, k_B : X × S_B → R are measurable functions satisfying the conditions below. For 1 < p < ∞ we have from [15] that the operators A : L^p(X, μ) → L^p(X, μ) and B : L^p(X, μ) → L^p(X, μ) are well defined if the kernels satisfy the conditions

∫_X ( ∫_{S_A} |k_A(t, s)|^q dμ_s )^{p/q} dμ_t < ∞,  ∫_X ( ∫_{S_B} |k_B(t, s)|^q dμ_s )^{p/q} dμ_t < ∞,    (4.3)

where 1 < q < ∞ is such that 1/p + 1/q = 1. For p = 1, the operators A : L¹(X, μ) → L¹(X, μ) and B : L¹(X, μ) → L¹(X, μ) are well defined if the kernels satisfy the conditions

∫_X ess sup_{s∈S_A} |k_A(t, s)| dμ_t < ∞,  ∫_X ess sup_{s∈S_B} |k_B(t, s)| dμ_t < ∞.    (4.4)

For p = ∞, the operators A : L^∞(X, μ) → L^∞(X, μ) and B : L^∞(X, μ) → L^∞(X, μ) are well defined if the kernels satisfy the conditions

ess sup_{t∈X} ∫_{S_A} |k_A(t, s)| dμ_s < ∞,  ess sup_{t∈X} ∫_{S_B} |k_B(t, s)| dμ_s < ∞.    (4.5)

Theorem 1 Let (X, Σ, μ) be a σ-finite measure space. Let A : L^p(X, μ) → L^p(X, μ), B : L^p(X, μ) → L^p(X, μ), 1 ≤ p ≤ ∞, be nonzero operators defined by

(Ax)(t) = ∫_{G_A} k_A(t, s)x(s)dμ_s,  (Bx)(t) = ∫_{G_B} k_B(t, s)x(s)dμ_s,

almost everywhere, where the index in μ_s indicates the variable of integration, G_A, G_B ∈ Σ, μ(G_A) < ∞, μ(G_B) < ∞, and k_A : X × G_A → R, k_B : X × G_B → R are measurable functions satisfying either relation (4.3), (4.4) or (4.5), respectively. Consider a polynomial F : R → R defined by F(z) = Σ_{j=0}^{n} δ_j z^j, where δ_j ∈ R, j = 0, 1, 2, …, n. Set G = G_A ∩ G_B, and

k_{0,A}(t, s) = k_A(t, s),  k_{m,A}(t, s) = ∫_{G_A} k_A(t, τ)k_{m−1,A}(τ, s)dμ_τ,  m = 1, 2, 3, …, n,

F_n(k_A(t, s)) = Σ_{j=1}^{n} δ_j k_{j−1,A}(t, s),  n = 1, 2, 3, ….

Then AB = B F(A) if and only if the following conditions are fulfilled:

1. for almost every (t, τ) ∈ X × G,

∫_{G_A} k_A(t, s)k_B(s, τ)dμ_s − δ₀ k_B(t, τ) = ∫_{G_B} k_B(t, s)F_n(k_A(s, τ))dμ_s,

where μ_s indicates that integration is taken with respect to the variable s;
2. for almost every (t, τ) ∈ X × (G_B \ G), ∫_{G_A} k_A(t, s)k_B(s, τ)dμ_s = δ₀ k_B(t, τ);
3. for almost every (t, τ) ∈ X × (G_A \ G), ∫_{G_B} k_B(t, s)F_n(k_A(s, τ))dμ_s = 0.

Proof By applying the Fubini theorem from [1] and iterated kernels from [25] we have

(A²x)(t) = ∫_{G_A} k_A(t, s)(Ax)(s)dμ_s = ∫_{G_A} k_A(t, s) ( ∫_{G_A} k_A(s, τ)x(τ)dμ_τ ) dμ_s
= ∫_{G_A} ( ∫_{G_A} k_A(t, s)k_A(s, τ)dμ_s ) x(τ)dμ_τ = ∫_{G_A} k_{1,A}(t, τ)x(τ)dμ_τ,

k_{1,A}(t, s) = ∫_{G_A} k_A(t, τ)k_A(τ, s)dμ_τ;

(A³x)(t) = ∫_{G_A} k_A(t, s)(A²x)(s)dμ_s = ∫_{G_A} k_A(t, s) ( ∫_{G_A} k_{1,A}(s, τ)x(τ)dμ_τ ) dμ_s
= ∫_{G_A} ( ∫_{G_A} k_A(t, s)k_{1,A}(s, τ)dμ_s ) x(τ)dμ_τ = ∫_{G_A} k_{2,A}(t, τ)x(τ)dμ_τ,

k_{2,A}(t, s) = ∫_{G_A} k_A(t, τ)k_{1,A}(τ, s)dμ_τ;

and, in general,

(Aⁿx)(t) = ∫_{G_A} k_{n−1,A}(t, s)x(s)dμ_s,  n ≥ 1,

k_{m,A}(t, s) = ∫_{G_A} k_A(t, τ)k_{m−1,A}(τ, s)dμ_τ,  m = 1, 2, 3, …, n,  k_{0,A}(t, s) = k_A(t, s).

Hence

(F(A)x)(t) = δ₀x(t) + Σ_{j=1}^{n} δ_j (A^j x)(t) = δ₀x(t) + Σ_{j=1}^{n} δ_j ∫_{G_A} k_{j−1,A}(t, s)x(s)dμ_s
= δ₀x(t) + ∫_{G_A} F_n(k_A(t, s))x(s)dμ_s,

F_n(k_A(t, s)) = Σ_{j=1}^{n} δ_j k_{j−1,A}(t, s),  n = 1, 2, 3, …;

(B F(A)x)(t) = ∫_{G_B} k_B(t, s)(F(A)x)(s)dμ_s = ∫_{G_B} k_B(t, s) ( δ₀x(s) + ∫_{G_A} F_n(k_A(s, τ))x(τ)dμ_τ ) dμ_s
= δ₀ ∫_{G_B} k_B(t, s)x(s)dμ_s + ∫_{G_A} ( ∫_{G_B} k_B(t, s)F_n(k_A(s, τ))dμ_s ) x(τ)dμ_τ
= δ₀ ∫_{G_B} k_B(t, s)x(s)dμ_s + ∫_{G_A} k_{BFA}(t, τ)x(τ)dμ_τ,

k_{BFA}(t, τ) = ∫_{G_B} k_B(t, s)F_n(k_A(s, τ))dμ_s;

(ABx)(t) = ∫_{G_A} k_A(t, s)(Bx)(s)dμ_s = ∫_{G_A} k_A(t, s) ( ∫_{G_B} k_B(s, τ)x(τ)dμ_τ ) dμ_s
= ∫_{G_B} ( ∫_{G_A} k_A(t, s)k_B(s, τ)dμ_s ) x(τ)dμ_τ = ∫_{G_B} k_{AB}(t, τ)x(τ)dμ_τ,

k_{AB}(t, τ) = ∫_{G_A} k_A(t, s)k_B(s, τ)dμ_s.

Therefore, (ABx)(t) = (B F(A)x)(t) for all x ∈ L^p(X, μ) if and only if

∫_{G_B} [k_{AB}(t, τ) − δ₀ k_B(t, τ)]x(τ)dμ_τ = ∫_{G_A} k_{BFA}(t, τ)x(τ)dμ_τ.

By applying Lemma 1 we have AB = B F(A) if and only if

1. for almost every (t, τ) ∈ X × G,
∫_{G_A} k_A(t, s)k_B(s, τ)dμ_s − δ₀ k_B(t, τ) = ∫_{G_B} k_B(t, s)F_n(k_A(s, τ))dμ_s;
2. for almost every (t, τ) ∈ X × (G_B \ G), ∫_{G_A} k_A(t, s)k_B(s, τ)dμ_s = δ₀ k_B(t, τ);
3. for almost every (t, τ) ∈ X × (G_A \ G), ∫_{G_B} k_B(t, s)F_n(k_A(s, τ))dμ_s = 0. □

Remark 1 In Theorem 1, when G_A = G_B = G, conditions 2 and 3 are taken on a set of measure zero, so we can ignore them; thus only condition 1 remains. When G_A ≠ G_B, we also need to check conditions 2 and 3 outside the intersection G = G_A ∩ G_B. Moreover, condition 3, that for almost every (t, τ) ∈ X × (G_A \ G),

∫_{G_B} k_B(t, s)F_n(k_A(s, τ))dμ_s = 0,    (4.6)

does not imply B Σ_{k=1}^{n} δ_k A^k = 0, because the kernel has to satisfy (4.6) only on the set X × (G_A \ G) and not on the whole set of definition. On the other hand, the same kernel has to satisfy condition 1, which is, for almost every (t, τ) ∈ X × G,

∫_{G_A} k_A(t, s)k_B(s, τ)dμ_s − δ₀ k_B(t, τ) = ∫_{G_B} k_B(t, s)F_n(k_A(s, τ))dμ_s.

Note that Theorem 1 does not imply Σ_{k=1}^{n} δ_k A^k = 0. In fact, Σ_{k=1}^{n} δ_k A^k = 0 implies B Σ_{k=1}^{n} δ_k A^k = 0, but as mentioned above the latter can be nonzero in general.

Example 1 Let (R, Σ, μ) be the standard Lebesgue measure space. Consider integral operators acting on L^p(R, μ) for 1 < p < ∞. Let A : L^p(R, μ) → L^p(R, μ), B : L^p(R, μ) → L^p(R, μ), 1 < p < ∞, be defined by

(Ax)(t) = ∫_0^π k_A(t, s)x(s)dμ_s,  (Bx)(t) = ∫_0^π k_B(t, s)x(s)dμ_s,

almost everywhere, where the index in μ_s indicates the variable of integration,

k_A(t, s) = I_{[α,β]}(t) (2/π)(cos t cos s + sin t sin s + cos t sin s),
k_B(t, s) = I_{[α,β]}(t) (2/π)(cos t cos s + 2 sin t sin s),

almost everywhere (t, s) ∈ R × [0, π], where α, β are real constants such that α ≤ 0, β ≥ π, and I_E(t) is the indicator function of the set E. These operators are well defined, since the kernels satisfy (4.3). In fact,

∫_R ( ∫_0^π |k_A(t, s)|^q dμ_s )^{p/q} dμ_t = ∫_α^β ( ∫_0^π (1/π)^q |2(cos t cos s + sin t sin s + cos t sin s)|^q dμ_s )^{p/q} dμ_t ≤ ∫_α^β (6^p/π) dμ_t = 6^p(β − α)/π < ∞,

∫_R ( ∫_0^π |k_B(t, s)|^q dμ_s )^{p/q} dμ_t = ∫_α^β ( ∫_0^π (1/π)^q |2(cos t cos s + 2 sin t sin s)|^q dμ_s )^{p/q} dμ_t ≤ ∫_α^β (6^p/π) dμ_t = 6^p(β − α)/π < ∞,

where 1 < q < ∞ is such that 1/p + 1/q = 1. In the estimates above we used the inequalities

|2(cos t cos s + sin t sin s + cos t sin s)|^q ≤ 2^q · 3^q = 6^q,  |2(cos t cos s + 2 sin t sin s)|^q ≤ 2^q · 3^q = 6^q,  1 < q < ∞.

Note that in this case conditions 1, 2 and 3 of Theorem 1 reduce to condition 1 alone, because the sets G_A = G_B = [0, π], and so G = [0, π] and G_A \ G = G_B \ G = ∅. Therefore, according to Remark 1, conditions 2 and 3 are taken on a set of measure zero. Consider the polynomial F(z) = z², z ∈ R. These operators satisfy AB = B F(A). In fact, by applying Theorem 1 we have δ₀ = δ₁ = 0, δ₂ = 1, n = 2, and

k_{AB}(t, τ) = ∫_0^π k_A(t, s)k_B(s, τ)dμ_s
= (4/π²) ∫_0^π I_{[α,β]}(t)(cos t cos s + sin t sin s + cos t sin s) · I_{[α,β]}(s)(cos s cos τ + 2 sin s sin τ)dμ_s
= (4/π) I_{[α,β]}(t) ( (cos t cos τ)/2 + cos t sin τ + sin t sin τ )
= (2/π) I_{[α,β]}(t)(cos t cos τ + 2 cos t sin τ + 2 sin t sin τ),

for almost every (t, τ) ∈ R × [0, π]. Moreover,

F₂(k_A(t, s)) = k_{1,A}(t, s) = ∫_0^π k_A(t, τ)k_A(τ, s)dμ_τ
= (4/π²) ∫_0^π I_{[α,β]}(t)(cos t cos τ + sin t sin τ + cos t sin τ) · I_{[α,β]}(τ)(cos τ cos s + sin τ sin s + cos τ sin s)dμ_τ
= (4/π) I_{[α,β]}(t) ( (cos t cos s)/2 + cos t sin s + (sin t sin s)/2 )
= (2/π) I_{[α,β]}(t)(cos t cos s + 2 cos t sin s + sin t sin s),

for almost every (t, s) ∈ R × [0, π]. Therefore,

k_{BFA}(t, τ) = ∫_0^π k_B(t, s)F₂(k_A(s, τ))dμ_s
= (4/π²) ∫_0^π I_{[α,β]}(t)(cos t cos s + 2 sin t sin s) · I_{[α,β]}(s)(cos s cos τ + sin s sin τ + 2 cos s sin τ)dμ_s
= (4/π) I_{[α,β]}(t) ( (cos t cos τ)/2 + cos t sin τ + sin t sin τ )
= (2/π) I_{[α,β]}(t)(cos t cos τ + 2 cos t sin τ + 2 sin t sin τ),

for almost every (t, τ) ∈ R × [0, π], which coincides with the kernel k_{AB}. Thus, the conditions of Theorem 1 are fulfilled and AB = B A². Moreover, B A² ≠ 0, as mentioned in Remark 1; in fact,

(B A² x)(t) = (2/π) I_{[α,β]}(t) ∫_0^π (cos t cos τ + 2 cos t sin τ + 2 sin t sin τ)x(τ)dμ_τ

almost everywhere.

The following corollary is a special case of Theorem 1 for the important class of covariance commutation relations associated to affine (degree at most 1) polynomials F.

Corollary 1 Let (X, Σ, μ) be a σ-finite measure space. Let A : L^p(X, μ) → L^p(X, μ), B : L^p(X, μ) → L^p(X, μ), 1 ≤ p ≤ ∞, be nonzero operators defined by

(Ax)(t) = ∫_{G_A} k_A(t, s)x(s)dμ_s,  (Bx)(t) = ∫_{G_B} k_B(t, s)x(s)dμ_s,

where the index in μ_s indicates the variable of integration, G_A, G_B ∈ Σ, μ(G_A) < ∞, μ(G_B) < ∞, and k_A : X × G_A → R, k_B : X × G_B → R are measurable functions satisfying either relation (4.3), (4.4) or (4.5). Let F : R → R be a polynomial of degree at most 1 given by F(z) = δ₀ + δ₁z, where δ₀, δ₁ ∈ R. We set G = G_A ∩ G_B. Then AB − δ₁ B A = δ₀ B if and only if the following conditions are fulfilled:

1. for almost every (t, τ) ∈ X × G,
∫_{G_A} k_A(t, s)k_B(s, τ)dμ_s − δ₀ k_B(t, τ) = δ₁ ∫_{G_B} k_B(t, s)k_A(s, τ)dμ_s;
2. for almost every (t, τ) ∈ X × (G_B \ G), ∫_{G_A} k_A(t, s)k_B(s, τ)dμ_s = δ₀ k_B(t, τ);
3. for almost every (t, τ) ∈ X × (G_A \ G), δ₁ ∫_{G_B} k_B(t, s)k_A(s, τ)dμ_s = 0.

The following corollary of Theorem 1 is concerned with representations by integral operators of another important family of covariance commutation relations, associated to monomials F.

Corollary 2 Let (X, Σ, μ) be a σ-finite measure space. Let A : L^p(X, μ) → L^p(X, μ), B : L^p(X, μ) → L^p(X, μ), 1 ≤ p ≤ ∞, be nonzero operators defined by

(Ax)(t) = ∫_{G_A} k_A(t, s)x(s)dμ_s,  (Bx)(t) = ∫_{G_B} k_B(t, s)x(s)dμ_s,

where the index in μ_s indicates the variable of integration, G_A, G_B ∈ Σ, μ(G_A) < ∞, μ(G_B) < ∞, and k_A : X × G_A → R, k_B : X × G_B → R are measurable functions satisfying either relation (4.3), (4.4) or (4.5). Let F : R → R be a monomial defined by F(z) = δz^d, where δ ≠ 0 is a real number and d is a positive integer. Let G = G_A ∩ G_B and

k_{0,A}(t, s) = k_A(t, s),  k_{m,A}(t, s) = ∫_{G_A} k_A(t, τ)k_{m−1,A}(τ, s)dμ_τ,  m = 1, 2, 3, …, d.

Then AB = δ B A^d if and only if the following conditions are fulfilled:

1. for almost every (t, τ) ∈ X × G,
∫_{G_A} k_A(t, s)k_B(s, τ)dμ_s = δ ∫_{G_B} k_B(t, s)k_{d−1,A}(s, τ)dμ_s;
2. for almost every (t, τ) ∈ X × (G_B \ G), ∫_{G_A} k_A(t, s)k_B(s, τ)dμ_s = 0;
3. for almost every (t, τ) ∈ X × (G_A \ G), ∫_{G_B} k_B(t, s)k_{d−1,A}(s, τ)dμ_s = 0.
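The kernel computations of Example 1 can also be checked by numerical quadrature. The sketch below is our own sanity check, not part of the chapter: the midpoint rule, the grid size and the sample points are arbitrary choices, and the indicator I_{[α,β]}(t) is omitted because it equals 1 for the sample points when α ≤ 0 and β ≥ π.

```python
import math

def k_A(t, s):
    return (2.0 / math.pi) * (math.cos(t) * math.cos(s)
                              + math.sin(t) * math.sin(s)
                              + math.cos(t) * math.sin(s))

def k_B(t, s):
    return (2.0 / math.pi) * (math.cos(t) * math.cos(s)
                              + 2.0 * math.sin(t) * math.sin(s))

def midpoint(f, a, b, n=600):
    # midpoint-rule quadrature
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def k1_A(t, s):    # iterated kernel of A, i.e. the kernel of A^2
    return midpoint(lambda u: k_A(t, u) * k_A(u, s), 0.0, math.pi)

def k_AB(t, tau):
    return midpoint(lambda s: k_A(t, s) * k_B(s, tau), 0.0, math.pi)

def k_BFA(t, tau):
    return midpoint(lambda s: k_B(t, s) * k1_A(s, tau), 0.0, math.pi)

for t, tau in [(0.3, 1.1), (2.0, 0.7), (1.5, 2.9)]:
    closed = (2.0 / math.pi) * (math.cos(t) * math.cos(tau)
                                + 2.0 * math.cos(t) * math.sin(tau)
                                + 2.0 * math.sin(t) * math.sin(tau))
    assert abs(k_AB(t, tau) - closed) < 5e-3
    assert abs(k_BFA(t, tau) - closed) < 5e-3
print("k_AB and k_BFA agree with the closed form on sample points")
```

Both kernels match the closed form (2/π)(cos t cos τ + 2 cos t sin τ + 2 sin t sin τ) computed in Example 1, confirming condition 1 of Theorem 1 for this pair.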


Remark 2 Example 1 describes a specific case of Corollary 2 with G_A = G_B = [0, π], δ = 1 and d = 2.

Consider now the case when X = R^l and μ is the Lebesgue measure. In the following theorem we allow the sets G_A and G_B to have infinite measure.

Theorem 2 Let (R^l, Σ, μ) be the standard Lebesgue measure space. Let A : L^p(R^l, μ) → L^p(R^l, μ), B : L^p(R^l, μ) → L^p(R^l, μ), 1 < p < ∞, be nonzero operators defined by

(Ax)(t) = ∫_{G_A} k_A(t, s)x(s)dμ_s,  (Bx)(t) = ∫_{G_B} k_B(t, s)x(s)dμ_s,

where the index in μ_s indicates the variable of integration, G_A ∈ Σ and G_B ∈ Σ, and the kernels k_A : R^l × G_A → R, k_B : R^l × G_B → R are measurable functions satisfying either relation (4.3) or (4.4). Consider a polynomial F : R → R defined by F(z) = Σ_{j=0}^{n} δ_j z^j, where δ_j ∈ R, j = 0, 1, 2, …, n. Let G = G_A ∩ G_B and

k_{0,A}(t, s) = k_A(t, s),  k_{m,A}(t, s) = ∫_{G_A} k_A(t, τ)k_{m−1,A}(τ, s)dμ_τ,  m = 1, 2, 3, …, n,

F_m(k_A(t, s)) = Σ_{j=1}^{m} δ_j k_{j−1,A}(t, s),  m = 1, 2, 3, …, n.

Then AB = B F(A) if and only if the following conditions are fulfilled:

1. for almost every (t, τ) ∈ R^l × G,
∫_{G_A} k_A(t, s)k_B(s, τ)dμ_s − δ₀ k_B(t, τ) = ∫_{G_B} k_B(t, s)F_n(k_A(s, τ))dμ_s;
2. for almost every (t, τ) ∈ R^l × (G_B \ G), ∫_{G_A} k_A(t, s)k_B(s, τ)dμ_s = δ₀ k_B(t, τ);
3. for almost every (t, τ) ∈ R^l × (G_A \ G), ∫_{G_B} k_B(t, s)F_n(k_A(s, τ))dμ_s = 0.

Proof By applying Fubini theorem from [1] and iterative kernels from [25] we have (A2 x)(t) =

 GA

=

  

GA

k1,A (t, s) =





k A (t, s)(Ax)(s)dμs =







k A (t, s)k A (s, τ )ds x(τ )dτ =

k1,A (t, τ )x(τ )dμτ ,

GA

k A (t, τ )k A (τ, s)dμτ ; k A (t, s)(A2 x)(s)dμs =

  

GA

k2,A (t, s) =

 k A (s, τ )x(τ )dμτ dμs

GA



GA

GA

=

 

GA

GA

(A3 x)(t) =

k A (t, s)



k A (t, s)

 

GA

 k1,A (s, τ )x(τ )dμτ dμs

GA

  k A (t, s)k1,A (s, τ )dμs x(τ )dμτ = k2,A (t, τ )x(τ )dμτ ,

GA

GA

k A (t, τ )k1,A (τ, s)dμτ ;

GA

(An x)(t) =



kn−1,A (t, s)x(s)dμs , n ≥ 1

GA

km,A (t, s) =



k A (t, τ )km−1,A (τ, s)dμτ , m = 1, 2, 3, . . . , n, k0,A (t, s) = k A (t, s);

GA

(F(A)x)(t) = δ0 x(t) +

n

j=1

δ j (A j x)(t) = δ0 x(t) +

n

j=1

δj

 GA

k j−1,A (t, s)x(s)dμs

72

D. Djinja et al.

= δ0 x(t) +



Fn (k A (t, s))x(s)dμs ,

GA

Fn (k A (t, s)) =

n

δ j k j−1,A (t, s), n = 1, 2, 3, . . . ;

j=1

(B F(A)x)(t) = =



k B (t, s)(F(A)x)(s)dμs =

GB

   k B (t, s) δ0 x(s) + Fn (k A (s, τ )x(τ )dμτ ) dμs

GB





= δ0

GA

G2



= δ0 k B F (t, τ ) =



  

k B (t, s)x(s)dμs +

GA



k B (t, s)x(s)dμs +

GB



k A (t, s)(Bx)(s)dμs =

  

GB

k AB (t, τ ) =



k B F (t, τ )x(τ )dμτ

k B (t, s)Fn (k A (s, τ ))dμs ;

GB

=

GB

GA

GB

(ABx)(t) =

 k B (t, s)Fn (k B (s, τ ))dμs x(τ )dμτ =



k A (t, s)

 

GA

 k B (s, τ )x(τ )dμτ dμs

GB

  k A (t, s)k B (s, τ )dμs x(τ )dμτ = k AB (t, τ )x(τ )dμτ ,

GA

GB

k A (t, s)k B (s, τ )dμs .

GA

Thus for all $x \in L^p(\mathbb{R}^l,\mu)$, $1 < p < \infty$, we have $(ABx)(t) = (BF(A)x)(t)$ almost everywhere if and only if
$$\int_{G_B} [k_{AB}(t,\tau) - \delta_0\,k_B(t,\tau)]\,x(\tau)\,d\mu_\tau = \int_{G_A} k_{BF}(t,\tau)\,x(\tau)\,d\mu_\tau$$
almost everywhere. By Lemma 2 we have $AB = BF(A)$ if and only if

1. for almost every $(t,\tau) \in \mathbb{R}^n \times G$,
$$\int_{G_A} k_A(t,s)\,k_B(s,\tau)\,d\mu_s - \delta_0\,k_B(t,\tau) = \int_{G_B} k_B(t,s)\,F_n(k_A(s,\tau))\,d\mu_s;$$
2. for almost every $(t,\tau) \in \mathbb{R}^n \times (G_B \setminus G)$,
$$\int_{G_A} k_A(t,s)\,k_B(s,\tau)\,d\mu_s = \delta_0\,k_B(t,\tau);$$
3. for almost every $(t,\tau) \in \mathbb{R}^n \times (G_A \setminus G)$,
$$\int_{G_B} k_B(t,s)\,F_n(k_A(s,\tau))\,d\mu_s = 0. \qquad\square$$
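The Fubini-based kernel calculus used throughout this proof (composing two integral operators by integrating out the middle variable) can be sanity-checked numerically. The sketch below is an added illustration, not part of the original text; the kernels, grid size, and test function are arbitrary choices.

```python
import numpy as np

# Discretizing (Ax)(t) = ∫_G k_A(t, s) x(s) ds on a uniform grid turns the
# operator into multiplication by the matrix h * K_A, and the kernel of the
# composition AB becomes h * K_A @ K_B -- the discrete counterpart of
# k_AB(t, tau) = ∫_{G_A} k_A(t, s) k_B(s, tau) d mu_s.
N = 400
s, h = np.linspace(0.0, 1.0, N, retstep=True)
T, S = np.meshgrid(s, s, indexing="ij")
K_A = np.exp(-(T - S) ** 2)          # sample kernel k_A(t, s)
K_B = np.cos(T) * (S + 1.0)          # sample kernel k_B(t, s)

x = np.sin(2.0 * np.pi * s)          # sample input function
ABx = h * K_A @ (h * K_B @ x)        # apply B, then A
K_AB = h * K_A @ K_B                 # composed kernel k_AB on the grid
print(np.allclose(ABx, h * K_AB @ x))  # prints True
```

Composing the operators and composing the kernels give the same result, which is exactly the identity $(ABx)(t) = \int_{G_B} k_{AB}(t,\tau)x(\tau)\,d\mu_\tau$ exploited above.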

Remark 3. Similarly to Remark 1, when $G_A = G_B = G$ in Theorem 2, conditions 2 and 3 are imposed on a set of measure zero, so they can be ignored and only condition 1 remains. When $G_A \ne G_B$ we must also check conditions 2 and 3 outside the intersection $G = G_A \cap G_B$. Moreover, condition 3, which states that for almost every $(t,\tau) \in \mathbb{R}^n \times (G_A \setminus G)$,
$$\int_{G_B} k_B(t,s)\,F_n(k_A(s,\tau))\,d\mu_s = 0, \tag{4.7}$$

4 Representations of Polynomial Covariance Type Commutation …

does not imply $B\sum_{k=1}^n \delta_k A^k = 0$, because its kernel has to satisfy (4.7) only on the set $\mathbb{R}^n \times (G_A \setminus G)$ and not on its whole set of definition. On the other hand, the same kernel has to satisfy condition 1, that is, for almost every $(t,\tau) \in \mathbb{R}^n \times G$,
$$\int_{G_A} k_A(t,s)\,k_B(s,\tau)\,d\mu_s - \delta_0\,k_B(t,\tau) = \int_{G_B} k_B(t,s)\,F_n(k_A(s,\tau))\,d\mu_s.$$
Note also that Theorem 2 does not imply $\sum_{k=1}^n \delta_k A^k = 0$. In fact, $\sum_{k=1}^n \delta_k A^k = 0$ implies $B\sum_{k=1}^n \delta_k A^k = 0$, but, as mentioned above, the latter can be nonzero in general.

Proposition 1. Let $(\mathbb{R},\Sigma,\mu)$ be the standard Lebesgue measure space. Let $A : L^p(\mathbb{R},\mu) \to L^p(\mathbb{R},\mu)$, $B : L^p(\mathbb{R},\mu) \to L^p(\mathbb{R},\mu)$, $1 < p < \infty$, be nonzero operators defined as follows:
$$(Ax)(t) = \int_{\mathbb{R}} \tilde{k}_A(t-s)\,x(s)\,d\mu_s,\qquad (Bx)(t) = \int_{\mathbb{R}} \tilde{k}_B(t-s)\,x(s)\,d\mu_s,$$
where the index in $\mu$ indicates the variable of integration, and the kernels $\tilde{k}_A(\cdot) \in L^1(\mathbb{R},\mu)$, $\tilde{k}_B(\cdot) \in L^1(\mathbb{R},\mu)$, that is,
$$\int_{\mathbb{R}} |\tilde{k}_A(t)|\,d\mu_t < \infty,\qquad \int_{\mathbb{R}} |\tilde{k}_B(t)|\,d\mu_t < \infty.$$
Consider a polynomial $F : \mathbb{R} \to \mathbb{R}$, $F(z) = \sum_{j=0}^n \delta_j z^j$, where $\delta_j \in \mathbb{R}$, $j = 0,1,2,\ldots,n$. Then $AB = BF(A)$ if and only if, for almost every $t \in \mathbb{R}$,
$$\Big(\tilde{k}_B * \big(\tilde{k}_A - \delta_0 - \sum_{j=1}^n \delta_j \underbrace{(\tilde{k}_A * \tilde{k}_A * \cdots * \tilde{k}_A)}_{j \text{ times}}\big)\Big)(t) = 0. \tag{4.8}$$
In particular, if $\delta_0 = 0$, that is, $F(z) = \delta_1 z + \delta_2 z^2 + \cdots + \delta_n z^n$, then $AB = BF(A)$ if and only if the set $\operatorname{supp} K_B \cap \operatorname{supp}\big(K_A - \sum_{j=1}^n \delta_j K_A^j\big)$ has measure zero in $\mathbb{R}$, where
$$K_B(s) = \int_{-\infty}^{\infty} \exp(-st)\,\tilde{k}_B(t)\,d\mu_t,\qquad K_A(s) = \int_{-\infty}^{\infty} \exp(-st)\,\tilde{k}_A(t)\,d\mu_t.$$

Proof. Operators $A$ and $B$ are well defined by the Young theorem ([6], Theorem 4.15). By the Fubini theorem for the compositions of the operators $A$, $B$ and $A^n$, similarly to the proof of Theorem 2 with $k_A(t,s) = \tilde{k}_A(t-s)$, $k_B(t,s) = \tilde{k}_B(t-s)$ and $G_A = G_B = \mathbb{R}$, we get from Lemma 2 that $AB = BF(A)$ if and only if, for almost every $(t,s) \in \mathbb{R}^2$,
$$\int_{\mathbb{R}} \tilde{k}_A(t-\tau)\,\tilde{k}_B(\tau-s)\,d\mu_\tau - \delta_0\,\tilde{k}_B(t-s) = \int_{\mathbb{R}} \tilde{k}_B(t-\tau)\,F_n(k_A(\tau,s))\,d\mu_\tau, \tag{4.9}$$
where
$$\tilde{k}_{0,A}(t,s) = \tilde{k}_A(t-s),\qquad \tilde{k}_{m,A}(t,s) = \int_{\mathbb{R}} \tilde{k}_A(t-\tau)\,\tilde{k}_{m-1,A}(\tau,s)\,d\mu_\tau,\quad m = 1,2,3,\ldots,n,$$
$$F_m(\tilde{k}_A(t,s)) = \sum_{j=1}^m \delta_j\,\tilde{k}_{j-1,A}(t,s),\quad m = 1,2,3,\ldots,n.$$
Computing $\tilde{k}_{m,A}(t,s)$, we have for $m = 1$,
$$\tilde{k}_{1,A}(t,s) = \int_{\mathbb{R}} \tilde{k}_A(t-\tau)\,\tilde{k}_A(\tau-s)\,d\mu_\tau = \int_{\mathbb{R}} \tilde{k}_A(t-s-\nu)\,\tilde{k}_A(\nu)\,d\mu_\nu = (\tilde{k}_A * \tilde{k}_A)(t-s),$$
for $m = 2$,
$$\tilde{k}_{2,A}(t,s) = \int_{\mathbb{R}} \tilde{k}_A(t-\tau)\,(\tilde{k}_A * \tilde{k}_A)(\tau-s)\,d\mu_\tau = \int_{\mathbb{R}} \tilde{k}_A(t-s-\nu)\,(\tilde{k}_A * \tilde{k}_A)(\nu)\,d\mu_\nu = (\tilde{k}_A * \tilde{k}_A * \tilde{k}_A)(t-s),$$
and, for all $2 \le m \le n$, $\tilde{k}_{m-1,A}(t,s) = \underbrace{(\tilde{k}_A * \tilde{k}_A * \cdots * \tilde{k}_A)}_{m \text{ times}}(t-s)$. Thus, for all $1 \le m \le n$,
$$F_m(\tilde{k}_A(t,s)) = \sum_{j=1}^m \delta_j\,\underbrace{(\tilde{k}_A * \tilde{k}_A * \cdots * \tilde{k}_A)}_{j \text{ times}}(t-s),$$
$$\int_{\mathbb{R}} \tilde{k}_B(t-\tau)\,F_n(\tilde{k}_A(\tau,s))\,d\mu_\tau = \int_{\mathbb{R}} \tilde{k}_B(t-\tau)\,\sum_{j=1}^n \delta_j\,\underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{j \text{ times}}(\tau-s)\,d\mu_\tau$$
$$= \sum_{j=1}^n \delta_j \int_{\mathbb{R}} \tilde{k}_B(t-s-\nu)\,\underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{j \text{ times}}(\nu)\,d\mu_\nu = \sum_{j=1}^n \delta_j\,\Big(\tilde{k}_B * \underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{j \text{ times}}\Big)(t-s).$$
Therefore, for almost every pair $(t,s) \in \mathbb{R}^2$, the equality (4.9) is equivalent to
$$(\tilde{k}_A * \tilde{k}_B)(t-s) = \delta_0\,\tilde{k}_B(t-s) + \sum_{j=1}^n \delta_j\,\Big(\tilde{k}_B * \underbrace{(\tilde{k}_A * \tilde{k}_A * \cdots * \tilde{k}_A)}_{j \text{ times}}\Big)(t-s),$$
which is equivalent to (4.8). If $\delta_0 = 0$, then by applying the two-sided Laplace transform we get that (4.8) is equivalent to
$$\int_{-\infty}^{\infty} \exp(-st)\,\Big(\tilde{k}_B * \big(\tilde{k}_A - \sum_{j=1}^n \delta_j\,\underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{j \text{ times}}\big)\Big)(t)\,d\mu_t = 0,$$
which is equivalent to
$$K_B(s)\cdot\Big(K_A(s) - \sum_{j=1}^n \delta_j K_A(s)^j\Big) = 0, \tag{4.10}$$
where $K_B(s) = \int_{-\infty}^{\infty}\exp(-st)\,\tilde{k}_B(t)\,d\mu_t$ and $K_A(s) = \int_{-\infty}^{\infty}\exp(-st)\,\tilde{k}_A(t)\,d\mu_t$.

Equation (4.10) is equivalent to the set $\operatorname{supp} K_B \cap \operatorname{supp}\big(K_A - \sum_{j=1}^n \delta_j K_A^j\big)$ having measure zero in $\mathbb{R}$. $\square$

Proposition 2. Let $(\mathbb{R},\Sigma,\mu)$ be the standard Lebesgue measure space. Let $A : L^p(\mathbb{R},\mu) \to L^p(\mathbb{R},\mu)$, $B : L^p(\mathbb{R},\mu) \to L^p(\mathbb{R},\mu)$, $1 < p < \infty$, be non-zero operators defined as follows:
$$(Ax)(t) = \int_{\mathbb{R}} \tilde{k}_A(t-s)\,x(s)\,d\mu_s,\qquad (Bx)(t) = \int_{\mathbb{R}} \tilde{k}_B(t-s)\,x(s)\,d\mu_s, \tag{4.11}$$
where $\tilde{k}_A(\cdot) \in L^1(\mathbb{R},\mu)$, $\tilde{k}_B(\cdot) \in L^1(\mathbb{R},\mu)$, that is, $\int_{\mathbb{R}}|\tilde{k}_A(t)|\,d\mu_t < \infty$, $\int_{\mathbb{R}}|\tilde{k}_B(t)|\,d\mu_t < \infty$, and the index in $\mu$ indicates the variable of integration. Suppose that
$$\int_{-\infty}^{\infty} \exp(-st)\,\tilde{k}_A(t)\,d\mu_t = K_A(s),\qquad \int_{-\infty}^{\infty} \exp(-st)\,\tilde{k}_B(t)\,d\mu_t = K_B(s) \tag{4.12}$$
exist and the domain of $K_A(\cdot)$ is equal to the domain of $K_B(\cdot)$ with the exception of a set of measure zero. Then $AB = \delta B A^n$, for a fixed $n \in \mathbb{Z}$, $n \ge 2$, and $\delta \in \mathbb{R}\setminus\{0\}$, if and only if $(\tilde{k}_A * \tilde{k}_B)(t) = 0$ almost everywhere.

Proof. Operators $A$ and $B$ are well defined by the Young theorem ([6], Theorem 4.15). Let $n \ge 1$. By the Fubini theorem for the compositions of the operators $A$, $B$ and $A^n$, similarly to the proof of Theorem 2 with $k_A(t,s) = \tilde{k}_A(t-s)$, $k_B(t,s) = \tilde{k}_B(t-s)$, $G_A = G_B = \mathbb{R}$, we get from Lemma 2 that $AB = \delta B A^n$ if and only if, for almost every $(t,s) \in \mathbb{R}^2$,
$$\int_{\mathbb{R}} \tilde{k}_A(t-\tau)\,\tilde{k}_B(\tau-s)\,d\mu_\tau = \delta\int_{\mathbb{R}} \tilde{k}_B(t-\tau)\,\tilde{k}_{n-1,A}(\tau,s)\,d\mu_\tau, \tag{4.13}$$
$$\tilde{k}_{0,A}(t,s) = \tilde{k}_A(t-s),\qquad \tilde{k}_{n,A}(t,s) = \int_{\mathbb{R}} \tilde{k}_A(t-\tau)\,\tilde{k}_{n-1,A}(\tau,s)\,d\mu_\tau,\quad n \ge 1.$$
Computing $\tilde{k}_{n,A}(t,s)$, we get for $n = 1$,
$$\tilde{k}_{1,A}(t,s) = \int_{\mathbb{R}} \tilde{k}_A(t-\tau)\,\tilde{k}_A(\tau-s)\,d\mu_\tau = \int_{\mathbb{R}} \tilde{k}_A(t-s-\nu)\,\tilde{k}_A(\nu)\,d\mu_\nu = (\tilde{k}_A * \tilde{k}_A)(t-s),$$
for $n = 2$,
$$\tilde{k}_{2,A}(t,s) = \int_{\mathbb{R}} \tilde{k}_A(t-\tau)\,(\tilde{k}_A * \tilde{k}_A)(\tau-s)\,d\mu_\tau = \int_{\mathbb{R}} \tilde{k}_A(t-s-\nu)\,(\tilde{k}_A * \tilde{k}_A)(\nu)\,d\mu_\nu = (\tilde{k}_A * \tilde{k}_A * \tilde{k}_A)(t-s),$$
and, for all $n \ge 2$,
$$\tilde{k}_{n-1,A}(t,s) = \underbrace{(\tilde{k}_A * \tilde{k}_A * \cdots * \tilde{k}_A)}_{n \text{ times}}(t-s), \tag{4.14}$$
$$\int_{\mathbb{R}} \tilde{k}_B(t-\tau)\,\tilde{k}_{n-1,A}(\tau,s)\,d\mu_\tau = \int_{\mathbb{R}} \tilde{k}_B(t-\tau)\,\underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{n \text{ times}}(\tau-s)\,d\mu_\tau = \int_{\mathbb{R}} \tilde{k}_B(t-s-\nu)\,\underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{n \text{ times}}(\nu)\,d\mu_\nu.$$
Therefore, for almost all pairs $(t,s) \in \mathbb{R}^2$, the equality (4.13) is equivalent to
$$(\tilde{k}_A * \tilde{k}_B)(t-s) = \delta\,\Big(\tilde{k}_B * \underbrace{(\tilde{k}_A * \tilde{k}_A * \cdots * \tilde{k}_A)}_{n \text{ times}}\Big)(t-s),$$
which is equivalent to
$$(\tilde{k}_A * \tilde{k}_B)(t) = \delta\,\Big(\tilde{k}_B * \underbrace{(\tilde{k}_A * \tilde{k}_A * \cdots * \tilde{k}_A)}_{n \text{ times}}\Big)(t) \tag{4.15}$$
almost everywhere. By applying the two-sided Laplace transform, for $n \ge 2$ we get that (4.15) is equivalent to $\int_{-\infty}^{\infty} \exp(-st)\,\big(\tilde{k}_B * (\tilde{k}_A - \delta\,\tilde{k}_A^{*n})\big)(t)\,d\mu_t = 0$ almost everywhere, which can be written as follows:
$$K_B(s)\cdot\big(K_A(s) - \delta K_A^n(s)\big) = 0,\quad n \ge 2, \tag{4.16}$$
almost everywhere, where $K_B(s) = \int_{-\infty}^{\infty}\exp(-st)\,\tilde{k}_B(t)\,d\mu_t$, $K_A(s) = \int_{-\infty}^{\infty}\exp(-st)\,\tilde{k}_A(t)\,d\mu_t$. Equation (4.16) is equivalent to the set $\operatorname{supp} K_B \cap \operatorname{supp}(K_A - \delta K_A^n)$, $n \ge 2$, having measure zero in $\mathbb{R}$, that is, $K_B(\cdot)\cdot I_{\operatorname{supp}(K_A-\delta K_A^n)}(\cdot) = 0$ almost everywhere and $(K_A(\cdot) - \delta K_A^n(\cdot))\cdot I_{\operatorname{supp} K_B}(\cdot) = 0$ almost everywhere, where $I_E(\cdot)$ is the indicator function of the set $E$. If $\operatorname{supp}(K_A - \delta K_A^n) = \mathbb{R}$, then $\operatorname{supp} K_B$ has measure zero, that is, $B = 0$. Similarly, if $\operatorname{supp} K_B = \mathbb{R}$, then $A = 0$. Suppose now that $\operatorname{supp} K_B \ne \mathbb{R}$ and has positive measure. If $(K_A(\cdot) - \delta K_A^n(\cdot))\cdot I_{\operatorname{supp} K_B}(\cdot) = 0$ almost everywhere, then $K_A(s) - \delta K_A^n(s) = 0$ for almost every $s \in \operatorname{supp} K_B$. Let $p(z) = z - \delta z^n$ and suppose that $p(z)$ has $m > 0$ roots $z_i$, $i = 1,2,\ldots,m$, $m \le n$, $n \ge 2$. We consider the following cases:

- If $n > 1$ and $p(z)$ has $m \ge 2$ roots $z_i$, $i = 1,2,\ldots,m$, $m \le n$, then $\tilde{k}_A(t) = \sum_{i=1}^m z_i\,\Delta(t-z_i)$, where $\Delta(t-t_0)$, $t, t_0 \in \mathbb{R}$, is the Dirac function defined as follows: $\Delta(t-t_0) = 0$ for $t \ne t_0$ and $\Delta(t-t_0) = \infty$ for $t = t_0$. In this case $K_A(s) - \delta K_A^n(s) = 0$ for almost every $s$ in $\operatorname{supp} K_B$. But this implies $K_A(s) = 0$ for almost every $s$ in $\operatorname{supp} K_B$, since the Dirac function $\Delta(\cdot)$ is equivalent to the zero function.
- If $n > 1$ and $p(z)$ has only one real root, which is $z = 0$, then $\operatorname{supp}(K_A - \delta K_A^n) = \operatorname{supp} K_A$ for all $s$ in $\operatorname{supp} K_B$. This implies that equality (4.16) is satisfied if and only if $K_A(\cdot) = 0$ almost everywhere in $\operatorname{supp} K_B$.

In both cases we conclude that $K_A(\cdot) = 0$ almost everywhere in $\operatorname{supp} K_B$. Outside of $\operatorname{supp} K_B$, the function $K_A(\cdot)$ can be nonzero. This implies that equality (4.16) is equivalent to $K_A(s)K_B(s) = 0$ almost everywhere, which is equivalent to $(\tilde{k}_A * \tilde{k}_B)(t) = 0$ almost everywhere. $\square$

Remark 4. Let $(\mathbb{R},\Sigma,\mu)$ be the standard Lebesgue measure space. The operators $A$ and $B$ defined in (4.11) as $A : L^p(\mathbb{R},\mu) \to L^p(\mathbb{R},\mu)$ and $B : L^p(\mathbb{R},\mu) \to$

$L^p(\mathbb{R},\mu)$, $1 < p < \infty$, that is,
$$(Ax)(t) = \int_{\mathbb{R}} \tilde{k}_A(t-s)\,x(s)\,d\mu_s,\qquad (Bx)(t) = \int_{\mathbb{R}} \tilde{k}_B(t-s)\,x(s)\,d\mu_s$$
almost everywhere, with $\tilde{k}_A(\cdot) \in L^1(\mathbb{R},\mu)$, $\tilde{k}_B(\cdot) \in L^1(\mathbb{R},\mu)$, commute, that is, $AB = BA$. In fact, by applying the Fubini theorem for the composition of $A$, $B$ and Lemma 2,
$$AB = BA \iff \int_{\mathbb{R}} \tilde{k}_A(t-s)\,\tilde{k}_B(s-\tau)\,d\mu_s = \int_{\mathbb{R}} \tilde{k}_B(t-s)\,\tilde{k}_A(s-\tau)\,d\mu_s \iff (\tilde{k}_A * \tilde{k}_B)(t-\tau) = (\tilde{k}_B * \tilde{k}_A)(t-\tau)$$
for almost every $(t,\tau) \in \mathbb{R}^2$, which holds true by the commutativity property of convolution.

Remark 5. If the operators $A$ and $B$ commute, then they satisfy simultaneously the relations $AB = BF(A)$, $BA = F(A)B$ and $B(A - F(A)) = 0$. In fact, if $A$ and $B$ commute, then $AB = BA$ and $BF(A) = F(A)B$, and thus $AB = BF(A)$ is equivalent to $BA = F(A)B$, which can then also be written as $B(A - F(A)) = 0$.

Proposition 3. Let $([0,\infty),\Sigma,\mu)$ be the standard Lebesgue measure space. Let $A : L^p([0,\infty),\mu) \to L^p([0,\infty),\mu)$, $B : L^p([0,\infty),\mu) \to L^p([0,\infty),\mu)$, $1 < p < \infty$, be non-zero operators defined by
$$(Ax)(t) = \int_0^\infty \tilde{k}_A(t-s)\,I_{[0,\infty)}(t-s)\,x(s)\,d\mu_s,\qquad (Bx)(t) = \int_0^\infty \tilde{k}_B(t-s)\,I_{[0,\infty)}(t-s)\,x(s)\,d\mu_s, \tag{4.17}$$
with $\tilde{k}_A(\cdot) \in L^1([0,\infty),\mu)$, $\tilde{k}_B(\cdot) \in L^1([0,\infty),\mu)$, that is, $\int_0^\infty |\tilde{k}_A(t)|\,d\mu_t < \infty$, $\int_0^\infty |\tilde{k}_B(t)|\,d\mu_t < \infty$, where $I_E(\cdot)$ is the indicator function of the set $E$ and the index in $\mu$ is the variable of integration. Then there are no non-zero operators $A$ and $B$ satisfying $AB = \delta B A^n$ for a fixed $n \in \mathbb{Z}$, $n \ge 2$, and $\delta \in \mathbb{R}\setminus\{0\}$.

Proof. Operators $A$ and $B$ are well defined by Young's theorem ([6], Theorem 4.15). Let $n \ge 1$. By applying the Fubini theorem for the compositions of the operators $A$, $B$ and $A^n$, similarly to the proof of Theorem 2 with $k_A(t,s) = \tilde{k}_A(t-s)I_{[0,\infty)}(t-s)$, $k_B(t,s) = \tilde{k}_B(t-s)I_{[0,\infty)}(t-s)$ and $G_A = G_B = [0,\infty)$, we get from Lemma 2 that $AB = \delta B A^n$ if and only if, for almost every $(t,s) \in \mathbb{R}^2$,

$$\int_0^\infty \tilde{k}_A(t-\tau)\,I_{[0,\infty)}(t-\tau)\,\tilde{k}_B(\tau-s)\,I_{[0,\infty)}(\tau-s)\,d\mu_\tau = \delta\int_0^\infty \tilde{k}_B(t-\tau)\,\tilde{k}_{n-1,A}(\tau,s)\,d\mu_\tau, \tag{4.18}$$
$$\tilde{k}_{0,A}(t,s) = \tilde{k}_A(t-s)\,I_{[0,\infty)}(t-s),\qquad \tilde{k}_{n,A}(t,s) = \int_0^\infty \tilde{k}_A(t-\tau)\,I_{[0,\infty)}(t-\tau)\,\tilde{k}_{n-1,A}(\tau,s)\,d\mu_\tau,\quad n \ge 1.$$
Computing $\tilde{k}_{n-1,A}(t,s)$ for $n \ge 1$, using (4.14), yields
$$\tilde{k}_{n-1,A}(t,s) = \underbrace{(\tilde{k}_A * \tilde{k}_A * \cdots * \tilde{k}_A)}_{n \text{ times}}(t-s)\,I_{[0,\infty)}(t-s) = \int_s^t \tilde{k}_A(t-\tau)\,\underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{n-1 \text{ times}}(\tau-s)\,d\mu_\tau$$
$$= \int_0^{t-s} \tilde{k}_A(t-s-\nu)\,\underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{n-1 \text{ times}}(\nu)\,d\mu_\nu = \underbrace{(\tilde{k}_A * \tilde{k}_A * \cdots * \tilde{k}_A)}_{n \text{ times}}(t-s).$$
Therefore, from (4.18) we have, for $n \ge 2$,
$$\int_0^{t-s} \tilde{k}_A(t-s-\tau)\,\tilde{k}_B(\tau)\,d\mu_\tau = \int_0^{t-s} \tilde{k}_B(t-s-\tau)\,\delta\,\underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{n \text{ times}}(\tau)\,d\mu_\tau,$$
which we can write as follows:
$$(\tilde{k}_A * \tilde{k}_B)(t-s) = \delta\,\Big(\tilde{k}_B * \underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{n \text{ times}}\Big)(t-s). \tag{4.19}$$
By the commutativity and linearity of convolution and the Titchmarsh convolution theorem, (4.19) is equivalent to either $\tilde{k}_B(t-s) = 0$ or $\delta\,\underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{n \text{ times}}(t-s) = \tilde{k}_A(t-s)$ for almost every $(t,s) \in \mathbb{R}^2$ such that $t \ge 0$, $0 \le s \le t$. This is equivalent to either $\tilde{k}_B(t) = 0$ or $\delta\,\underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{n \text{ times}}(t) = \tilde{k}_A(t)$

almost everywhere, $n \ge 2$. Suppose that $\tilde{k}_B(t) \ne 0$ for almost every $t$ in a set of positive measure. Then $\delta\,\underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{n \text{ times}}(t) = \tilde{k}_A(t)$ for almost every $t \in [0,\infty)$. By applying the one-sided Laplace transform $K_A(s) = \int_0^\infty \tilde{k}_A(t)\exp(-ts)\,dt$, which exists for certain $s > 0$ since $\exp(-st) \in L^p([0,\infty),\mu)$, $1 < p < \infty$, we have for $n \ge 2$
$$\delta\,\underbrace{(\tilde{k}_A * \cdots * \tilde{k}_A)}_{n \text{ times}}(t) = \tilde{k}_A(t) \iff \delta K_A^n(s) = K_A(s).$$
Let $p(z) = z - \delta z^n$ and suppose that $p(z)$ has $m > 0$ roots $z_i$, $i = 1,2,\ldots,m$, $m \le n$, $n > 1$. We consider the following cases:

- If $n > 1$ and $p(z)$ has $m \ge 2$ roots, then $\tilde{k}_A(t) = \sum_{i=1}^m z_i\,\Delta(t-z_i)$. In this case $K_A(s) - \delta K_A^n(s) = 0$ for all $s$ in the domain of $K_A(\cdot)$. But this implies $A = 0$, since the Dirac function $\Delta(\cdot)$ is equivalent to the zero function.
- If $n > 1$ and $p(z)$ has only one real root, which is $z = 0$, then $K_A(s) - \delta K_A^n(s) = 0$ implies $A = 0$. $\square$

Remark 6. Let $([0,\infty),\Sigma,\mu)$ be the standard Lebesgue measure space. The operators $A : L^p([0,\infty),\mu) \to L^p([0,\infty),\mu)$, $B : L^p([0,\infty),\mu) \to L^p([0,\infty),\mu)$, $1 \le p < \infty$, defined in (4.17) as
$$(Ax)(t) = \int_0^\infty \tilde{k}_A(t-s)\,I_{[0,\infty)}(t-s)\,x(s)\,d\mu_s,\qquad (Bx)(t) = \int_0^\infty \tilde{k}_B(t-s)\,I_{[0,\infty)}(t-s)\,x(s)\,d\mu_s,$$

almost everywhere, with $\tilde{k}_A(\cdot) \in L^1([0,\infty),\mu)$, $\tilde{k}_B(\cdot) \in L^1([0,\infty),\mu)$ (where $I_E(\cdot)$ denotes the indicator function of the set $E$, and the index in $\mu$ indicates the variable of integration), commute: $AB = BA$. In fact, by applying the Fubini theorem for the composition of the operators $A$, $B$ and Lemma 2, we have $AB = BA$ if and only if
$$\int_0^\infty \tilde{k}_A(t-s)\,I_{[0,\infty)}(t-s)\,\tilde{k}_B(s-\tau)\,I_{[0,\infty)}(s-\tau)\,d\mu_s = \int_0^\infty \tilde{k}_B(t-s)\,I_{[0,\infty)}(t-s)\,\tilde{k}_A(s-\tau)\,I_{[0,\infty)}(s-\tau)\,d\mu_s$$
$$\iff \int_\tau^t \tilde{k}_A(t-s)\,\tilde{k}_B(s-\tau)\,d\mu_s = \int_\tau^t \tilde{k}_B(t-s)\,\tilde{k}_A(s-\tau)\,d\mu_s$$
$$\iff \int_0^{t-\tau} \tilde{k}_A(t-\tau-\nu)\,\tilde{k}_B(\nu)\,d\mu_\nu = \int_0^{t-\tau} \tilde{k}_B(t-\tau-\nu)\,\tilde{k}_A(\nu)\,d\mu_\nu, \tag{4.20}$$
for almost every $(t,\tau) \in \mathbb{R}^2$. By the change of variable $\xi = t-\tau-\nu$ on the right-hand side of (4.20) we get

$$\int_0^{t-\tau} \tilde{k}_B(t-\tau-\nu)\,\tilde{k}_A(\nu)\,d\mu_\nu = -\int_{t-\tau}^{0} \tilde{k}_B(\xi)\,\tilde{k}_A(t-\tau-\xi)\,d\mu_\xi = \int_0^{t-\tau} \tilde{k}_A(t-\tau-\xi)\,\tilde{k}_B(\xi)\,d\mu_\xi,$$
which proves (4.20). This completes the proof.

In the following theorem we consider a special case of the operators in Theorem 1, when the kernels have separated variables.

Theorem 3. Let $(X,\Sigma,\mu)$ be a $\sigma$-finite measure space. Let $A : L^p(X,\mu) \to L^p(X,\mu)$, $B : L^p(X,\mu) \to L^p(X,\mu)$, $1 \le p \le \infty$, be nonzero operators defined as follows:
$$(Ax)(t) = \int_{G_A} a(t)b(s)\,x(s)\,d\mu_s,\qquad (Bx)(t) = \int_{G_B} c(t)e(s)\,x(s)\,d\mu_s, \tag{4.21}$$
almost everywhere, where the index in $\mu_s$ indicates the variable of integration, $G_A \in \Sigma$ and $G_B \in \Sigma$ have finite measure, $a, c \in L^p(X,\mu)$, $b \in L^q(G_A,\mu)$, $e \in L^q(G_B,\mu)$, $1 \le q \le \infty$, $\frac1p + \frac1q = 1$. Consider a polynomial $F : \mathbb{R} \to \mathbb{R}$ defined by $F(z) = \sum_{j=0}^n \delta_j z^j$, where $\delta_j \in \mathbb{R}$, $j = 0,1,2,\ldots,n$. Let $G = G_A \cap G_B$, and
$$k_1 = \sum_{j=1}^n \delta_j\,Q_{G_A}(a,b)^{j-1}\,Q_{G_B}(a,e),\qquad k_2 = Q_{G_B}(b,c),$$
where $Q_\Lambda(u,v)$, $\Lambda \in \Sigma$, is defined by (4.2). Then $AB = BF(A)$ if and only if the following conditions are fulfilled:

1. (a) for almost every $(t,s) \in \operatorname{supp} c \times [(\operatorname{supp} e) \cap G]$:
   (i) if $k_2 \ne 0$, then $b(s)k_1 = \lambda e(s)$ and $a(t) = \frac{\delta_0+\lambda}{k_2}\,c(t)$ for some real scalar $\lambda$;
   (ii) if $k_2 = 0$, then $k_1 b(s) = -\delta_0 e(s)$.
   (b) If $t \notin \operatorname{supp} c$, then either $k_2 = 0$ or $a(t) = 0$ for almost all $t \notin \operatorname{supp} c$.
   (c) If $s \in G \setminus \operatorname{supp} e$, then either $k_1 = 0$ or $b(s) = 0$ for almost all $s \in G \setminus \operatorname{supp} e$.
2. $k_2 a(t) - \delta_0 c(t) = 0$ for almost every $t \in X$, or $e(s) = 0$ for almost every $s \in G_B \setminus G$.
3. $k_1 = 0$ or $b(s) = 0$ for almost every $s \in G_A \setminus G$.

Proof. We observe that since $a, c \in L^p(X,\mu)$, $1 \le p \le \infty$, $b \in L^q(G_A,\mu)$, $e \in L^q(G_B,\mu)$, $1 \le q \le \infty$, with $\frac1p + \frac1q = 1$, either condition (4.3), (4.4) or (4.5) is satisfied, and therefore the operators $A$ and $B$ are well defined. By direct calculation, we have

$$(A^2x)(t) = \int_{G_A} a(t)b(s)\,(Ax)(s)\,d\mu_s = \int_{G_A} a(t)b(s)a(s)\,d\mu_s \int_{G_A} b(\tau_1)\,x(\tau_1)\,d\mu_{\tau_1} = Q_{G_A}(a,b)\,(Ax)(t),$$
$$(A^3x)(t) = A(A^2x)(t) = Q_{G_A}(a,b)\,(A^2x)(t) = Q_{G_A}(a,b)^2\,(Ax)(t)$$
almost everywhere. Suppose that $(A^mx)(t) = Q_{G_A}(a,b)^{m-1}(Ax)(t)$, $m = 1,2,\ldots$, almost everywhere. Then
$$(A^{m+1}x)(t) = A(A^mx)(t) = Q_{G_A}(a,b)^{m-1}(A^2x)(t) = Q_{G_A}(a,b)^m\,(Ax)(t)$$
almost everywhere. Next we compute
$$(ABx)(t) = \int_{G_A} a(t)b(s)c(s)\,d\mu_s \int_{G_B} e(\tau_1)\,x(\tau_1)\,d\mu_{\tau_1} = k_2\int_{G_B} a(t)e(\tau_1)\,x(\tau_1)\,d\mu_{\tau_1}, \tag{4.22}$$
$$(F(A)x)(t) = \delta_0\,x(t) + a(t)\left(\sum_{j=1}^n \delta_j\,Q_{G_A}(a,b)^{j-1}\right)\int_{G_A} b(\tau)\,x(\tau)\,d\mu_\tau, \tag{4.23}$$
$$(BF(A)x)(t) = \delta_0\,c(t)\int_{G_B} e(\tau_1)\,x(\tau_1)\,d\mu_{\tau_1} + c(t)\sum_{j=1}^n \delta_j\,Q_{G_A}(a,b)^{j-1}\int_{G_B} e(\tau)a(\tau)\,d\mu_\tau\int_{G_A} b(\tau_1)\,x(\tau_1)\,d\mu_{\tau_1}$$
$$= \delta_0\,c(t)\int_{G_B} e(\tau_1)\,x(\tau_1)\,d\mu_{\tau_1} + c(t)\,k_1\int_{G_A} b(\tau_1)\,x(\tau_1)\,d\mu_{\tau_1}.$$
Thus $(ABx)(t) = (BF(A)x)(t)$ for all $x \in L^p(X,\mu)$ if and only if
$$\int_{G_B} [k_2 a(t) - \delta_0 c(t)]\,e(s)\,x(s)\,d\mu_s = \int_{G_A} k_1\,c(t)\,b(s)\,x(s)\,d\mu_s. \tag{4.24}$$
Then by Lemma 1, $AB = BF(A)$ if and only if

1. for almost every $(t,s) \in X \times G$,

$$[k_2 a(t) - \delta_0 c(t)]\,e(s) = k_1\,c(t)\,b(s);$$
2. $k_2 a(t) - \delta_0 c(t) = 0$ for almost every $t \in X$, or $e(s) = 0$ for almost every $s \in G_B \setminus G$;
3. $k_1 = 0$, or $c(t) = 0$ for almost every $t \in X$, or $b(s) = 0$ for almost every $s \in G_A \setminus G$.

We can rewrite the first condition as follows:
(a) Suppose $(t,s) \in \operatorname{supp} c \times [(\operatorname{supp} e) \cap G]$.
(i) If $k_2 \ne 0$, then $k_1\frac{b(s)}{e(s)} = k_2\frac{a(t)}{c(t)} - \delta_0 = \lambda$ for some real scalar $\lambda$. From this it follows that $k_1 b(s) = \lambda e(s)$ and $a(t) = \frac{\delta_0+\lambda}{k_2}\,c(t)$.
(ii) If $k_2 = 0$, then $-\delta_0 c(t)e(s) = k_1 c(t)b(s)$, from which we get $k_1 b(s) = -\delta_0 e(s)$.
(b) If $t \notin \operatorname{supp} c$, then $k_2 a(t)e(s) = 0$, from which we get that either $k_2 = 0$, or $a(t) = 0$ for almost all $t \notin \operatorname{supp} c$, or $e(s) = 0$ almost everywhere (this implies $B = 0$).
(c) If $s \in G \setminus \operatorname{supp} e$, then $k_1 c(t)b(s) = 0$, which implies that either $k_1 = 0$, or $b(s) = 0$ for almost all $s \in G \setminus \operatorname{supp} e$, or $c(t) = 0$ almost everywhere (this implies $B = 0$). $\square$

Remark 7. Observe that the operators $A$ and $B$ defined in (4.21) take the form $(Ax)(t) = a(t)\varphi(x)$ and $(Bx)(t) = c(t)\psi(x)$ for some functions $a, c \in L^p(X,\mu)$, $1 \le p \le \infty$, and linear functionals $\varphi, \psi : L^p(X,\mu) \to \mathbb{R}$. In this case $AB = BF(A)$ if and only if $\varphi(\psi(x)\,c)\,a(t) = \psi\big(F(\varphi(x)\,a)\big)\,c(t)$ in $L^p(X,\mu)$, $1 \le p \le \infty$.

Corollary 3. Let $(X,\Sigma,\mu)$ be a $\sigma$-finite measure space. Let $A : L^p(X,\mu) \to L^p(X,\mu)$, $B : L^p(X,\mu) \to L^p(X,\mu)$, $1 \le p \le \infty$, be nonzero operators such that
$$(Ax)(t) = \int_G a(t)b(s)\,x(s)\,d\mu_s,\qquad (Bx)(t) = \int_G c(t)e(s)\,x(s)\,d\mu_s$$
(the index in $\mu_s$ indicates the variable of integration) almost everywhere, $G \in \Sigma$ is a set with finite measure, $a, c \in L^p(X,\mu)$, $b, e \in L^q(G,\mu)$, $1 \le q \le \infty$, $\frac1p + \frac1q = 1$. Consider a polynomial $F(z) = \delta_0 + \delta_1 z + \cdots + \delta_n z^n$, where $z \in \mathbb{R}$, $\delta_j \in \mathbb{R}$, $j = 0,1,2,\ldots,n$. Set
$$k_1 = \sum_{j=1}^n \delta_j\,Q_G(a,b)^{j-1}\,Q_G(a,e),\qquad k_2 = Q_G(b,c).$$
Then $AB = BF(A)$ if and only if the following is true:

1. for almost every $(t,s) \in \operatorname{supp} c \times \operatorname{supp} e$, we have:
   (a) if $k_2 \ne 0$, then $k_1 b(s) = \lambda e(s)$ and $a(t) = \frac{\delta_0+\lambda}{k_2}\,c(t)$ for some $\lambda \in \mathbb{R}$;
   (b) if $k_2 = 0$, then $k_1 b(s) = -\delta_0 e(s)$;
2. if $t \notin \operatorname{supp} c$, then either $k_2 = 0$ or $a(t) = 0$ for almost all $t \notin \operatorname{supp} c$;
3. if $s \in G \setminus \operatorname{supp} e$, then either $k_1 = 0$ or $b(s) = 0$ for almost all $s \in G \setminus \operatorname{supp} e$.

Proof. This follows immediately from Theorem 3, as $G_A = G_B = G$.
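The key reduction in the proof of Theorem 3, namely that powers of a separated-kernel operator collapse to $A^m = Q_G(a,b)^{m-1}A$, is easy to confirm by quadrature. The sketch below is an added illustration, not part of the original text; the factors $a$, $b$ and the test function are arbitrary sample choices.

```python
import numpy as np

# Numerical sanity check of A^m = Q_G(a, b)**(m-1) * A for a separated kernel
# a(t) b(s) on G = [0, 1], where Q_G(a, b) = ∫_G a(s) b(s) ds.
N = 1000
t, h = np.linspace(0.0, 1.0, N, retstep=True)
w = np.full(N, h); w[0] = w[-1] = h / 2.0   # trapezoidal quadrature weights

a = np.exp(t)                                # sample factor a(t)
b = np.sin(t) + 2.0                          # sample factor b(s)

def apply_A(x):
    # (Ax)(t) = a(t) * ∫_0^1 b(s) x(s) ds
    return a * np.dot(w, b * x)

Q = np.dot(w, a * b)                         # Q_G(a, b)
x = t**2 - 0.3                               # sample input function
A3x = apply_A(apply_A(apply_A(x)))           # A^3 x
print(np.allclose(A3x, Q**2 * apply_A(x)))   # prints True: A^3 = Q^2 A
```

The identity holds exactly at the discrete level, since every application of $A$ reuses the same quadrature functional.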



Remark 8. From Theorem 3 and Corollary 3 we observe that if $k_1, k_2 \ne 0$, then, given the operator $B$ as defined by (4.21), we can obtain the kernel of the operator $A$ using the relations $a(t) = \frac{\delta_0+\lambda}{k_2}\,c(t)$ and $b(s) = \frac{\lambda}{k_1}\,e(s)$ for some $\lambda \in \mathbb{R}$. In the next two propositions we state necessary and sufficient conditions for the choice of $\lambda$.

Proposition 4. Let $(X,\Sigma,\mu)$ be a $\sigma$-finite measure space. Let $A : L^p(X,\mu) \to L^p(X,\mu)$, $B : L^p(X,\mu) \to L^p(X,\mu)$, $1 \le p \le \infty$, be nonzero operators such that
$$(Ax)(t) = \int_G a(t)b(s)\,x(s)\,d\mu_s,\qquad (Bx)(t) = \int_G c(t)e(s)\,x(s)\,d\mu_s$$
(the index in $\mu_s$ indicates the variable of integration) almost everywhere, $G \in \Sigma$ is a set with finite measure, and $a, c \in L^p(X,\mu)$, $b, e \in L^q(G,\mu)$, $1 \le q \le \infty$, $\frac1p + \frac1q = 1$. Consider a polynomial $F(z) = \delta_0 + \delta_1 z + \cdots + \delta_n z^n$, where $z \in \mathbb{R}$, $\delta_j \in \mathbb{R}$, $j = 0,1,2,3,\ldots,n$. Set
$$k_1 = \sum_{j=1}^n \delta_j\,Q_G(a,b)^{j-1}\,Q_G(a,e),\qquad k_2 = Q_G(b,c).$$
Suppose that $AB = BF(A)$. If $k_2 \ne 0$ and $k_1 \ne 0$ in condition 1(a) of Corollary 3, then the corresponding nonzero $\lambda$ satisfies
$$F(\lambda + \delta_0) = \lambda + \delta_0. \tag{4.25}$$

Proof. By definition, $k_1 = \sum_{j=1}^n \delta_j Q_G(a,b)^{j-1} Q_G(a,e)$ and $k_2 = Q_G(b,c)$. If $k_1 \ne 0$ and $k_2 \ne 0$, by condition 1(a) in Corollary 3 we have $a(t) = \frac{\lambda+\delta_0}{k_2}\,c(t)$, $b(s) = \frac{\lambda}{k_1}\,e(s)$ almost everywhere. If $\lambda \ne 0$, then we substitute $k_2 = Q_G(b,c) = Q_G\!\left(\frac{\lambda}{k_1}e,\,c\right)$ into the equality
$$k_1 = \sum_{j=1}^n \delta_j\,Q_G\!\left(\frac{\lambda+\delta_0}{k_2}\,c,\;\frac{\lambda}{k_1}\,e\right)^{j-1} Q_G\!\left(\frac{\lambda+\delta_0}{k_2}\,c,\;e\right).$$
Then, by using the bilinearity of $Q_G(\cdot,\cdot)$ and after simplification, this is equivalent to $\lambda = \sum_{j=1}^n \delta_j(\lambda+\delta_0)^j$. By adding $\delta_0$ to both sides we can write this as (4.25). $\square$

Proposition 5. Let $(X,\Sigma,\mu)$ be a $\sigma$-finite measure space. Let $A : L^p(X,\mu) \to L^p(X,\mu)$, $B : L^p(X,\mu) \to L^p(X,\mu)$, $1 \le p \le \infty$, be nonzero operators such that

$$(Ax)(t) = \int_G a(t)b(s)\,x(s)\,d\mu_s,\qquad (Bx)(t) = \int_G c(t)e(s)\,x(s)\,d\mu_s$$
(the index in $\mu_s$ indicates the variable of integration) almost everywhere, $G \in \Sigma$ is a set with finite measure, $a, c \in L^p(X,\mu)$, $b, e \in L^q(G,\mu)$, $1 \le q \le \infty$, $\frac1p + \frac1q = 1$. Consider a polynomial $F(z) = \delta_0 + \delta_1 z + \cdots + \delta_n z^n$, where $z \in \mathbb{R}$, $\delta_j \in \mathbb{R}$, $j = 0,1,2,3,\ldots,n$. Suppose that for almost every $(t,s) \in \operatorname{supp} c \times \operatorname{supp} e$ we have
$$a(t) = \frac{\lambda+\delta_0}{k_2}\,c(t),\qquad b(s) = \frac{\lambda}{k_1}\,e(s)$$
for nonzero constants $\lambda$, $k_1$ and $k_2$. If $F(\lambda+\delta_0) = \lambda+\delta_0$ and $k_2 = \frac{\lambda}{k_1}\,Q_G(e,c)$, then

1. $A = \dfrac{\lambda+\delta_0}{Q_G(e,c)}\,B$;
2. for all $x \in L^p(X,\mu)$ and almost all $t \in \operatorname{supp} c$, $(ABx)(t) = (BF(A)x)(t)$.

Proof. We have, almost everywhere,
$$(Ax)(t) = \int_G a(t)b(s)\,x(s)\,d\mu_s = \frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\int_G c(t)e(s)\,x(s)\,d\mu_s = \frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\,(Bx)(t),$$
$$(ABx)(t) = \frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\,(B^2x)(t) = \frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\,Q_G(c,e)\,(Bx)(t),$$
$$(A^2x)(t) = \left(\frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\right)^2(B^2x)(t) = \left(\frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\right)^2 Q_G(e,c)\,(Bx)(t).$$
Similarly, for $m \ge 2$, almost everywhere,
$$(A^mx)(t) = \left(\frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\right)^m Q_G(c,e)^{m-1}\,(Bx)(t),$$
$$(F(A)x)(t) = \delta_0\,x(t) + \sum_{j=1}^n \delta_j\left(\frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\right)^j Q_G(c,e)^{j-1}\,(Bx)(t).$$
Therefore, almost everywhere,
$$(BF(A)x)(t) = \delta_0\,(Bx)(t) + \sum_{j=1}^n \delta_j\left(\frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\right)^j Q_G(c,e)^{j-1}\,(B^2x)(t)$$
$$= \delta_0\,(Bx)(t) + \sum_{j=1}^n \delta_j\left(\frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\right)^j Q_G(c,e)^j\,(Bx)(t) = F\!\left(\frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\,Q_G(c,e)\right)(Bx)(t).$$
Hence $(ABx)(t) = (BF(A)x)(t)$ for all $x \in L^p(X,\mu)$ and almost all $t \in \operatorname{supp} c$ if and only if
$$\frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\,Q_G(c,e) = F\!\left(\frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\,Q_G(c,e)\right) \tag{4.26}$$
for almost every $t \in \operatorname{supp} c$. If $k_2 = \frac{\lambda}{k_1}\,Q_G(c,e)$ and $\lambda$ satisfies (4.25), then (4.26) holds. $\square$
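The admissible values of $\lambda$ in (4.25) are simply the fixed points of $F$ shifted by $\delta_0$, and can be computed numerically. The sketch below is an added illustration with a sample polynomial: for $F(z) = z^2 + z - 1$ (so $\delta_0 = -1$), the polynomial that also appears in Example 2 below, the substitution $\mu = \lambda + \delta_0$ reduces (4.25) to the fixed-point equation $F(\mu) = \mu$.

```python
import numpy as np

# Solve the compatibility condition (4.25): F(lambda + delta_0) = lambda + delta_0,
# for the sample polynomial F(z) = -1 + z + z^2 (delta_0 = -1).
delta = [-1.0, 1.0, 1.0]                    # coefficients delta_0, delta_1, delta_2
F = np.polynomial.Polynomial(delta)         # F(z) = -1 + z + z^2

# Fixed points of F: roots of F(mu) - mu = mu^2 - 1
mu = (F - np.polynomial.Polynomial([0.0, 1.0])).roots()
lam = np.sort(mu.real - delta[0])           # recover lambda = mu - delta_0
print(lam)                                  # admissible values: lambda = 0 or lambda = 2
```

Example 2 below chooses the nonzero solution $\lambda = 2$.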

Corollary 4. Let $(X,\Sigma,\mu)$ be a $\sigma$-finite measure space. Let $A : L^p(X,\mu) \to L^p(X,\mu)$, $B : L^p(X,\mu) \to L^p(X,\mu)$, $1 < p < \infty$, be nonzero operators such that
$$(Ax)(t) = \int_G a(t)b(s)\,x(s)\,d\mu_s,\qquad (Bx)(t) = \int_G c(t)e(s)\,x(s)\,d\mu_s$$
(the index in $\mu_s$ indicates the variable of integration) almost everywhere, $G \in \Sigma$ is a set with finite measure, $a, c \in L^p(X,\mu)$, $b, e \in L^q(G,\mu)$, $1 < q < \infty$, $\frac1p + \frac1q = 1$. Consider a polynomial $F(z) = \delta_0 + \delta_1 z + \delta_2 z^2$, where $z \in \mathbb{R}$, $\delta_j \in \mathbb{R}$, $j = 0,1,2$. Suppose that for almost every $(t,s) \in \operatorname{supp} c \times \operatorname{supp} e$, $a(t) = \frac{\lambda+\delta_0}{k_2}\,c(t)$, $b(s) = \frac{\lambda}{k_1}\,e(s)$ for nonzero constants $\lambda$, $k_1$ and $k_2$. If $k_2 = \frac{\lambda}{k_1}\,Q_G(e,c)$, then $(ABx)(t) = (BF(A)x)(t)$ for all $x \in L^p(G,\mu)$ and almost all $t \in \operatorname{supp} c$ if either $\delta_0\delta_2 < 0$, or $\delta_0\delta_2 \ge 0$ and either $\delta_1 \ge 1 + 2\sqrt{\delta_0\delta_2}$ or $\delta_1 \le 1 - 2\sqrt{\delta_0\delta_2}$.

Proof. From Propositions 4 and 5 we have that $AB = BF(A)$ if $F(\lambda+\delta_0) = \lambda+\delta_0$. This is equivalent to
$$\delta_2\lambda^2 + (2\delta_0\delta_2 + \delta_1 - 1)\lambda + \delta_2\delta_0^2 + \delta_1\delta_0 = 0. \tag{4.27}$$
Equation (4.27) has real solutions if and only if $(\delta_1-1)^2 - 4\delta_0\delta_2 \ge 0$. This is equivalent to: either $\delta_0\delta_2 < 0$, or $\delta_0\delta_2 \ge 0$ and either $\delta_1 \ge 1 + 2\sqrt{\delta_0\delta_2}$ or $\delta_1 \le 1 - 2\sqrt{\delta_0\delta_2}$, which completes the proof. $\square$

Example 2. Let $(\mathbb{R},\Sigma,\mu)$ be the standard Lebesgue measure space. Let $A : L^p(\mathbb{R},\mu) \to L^p(\mathbb{R},\mu)$, $B : L^p(\mathbb{R},\mu) \to L^p(\mathbb{R},\mu)$, $1 < p < \infty$, be nonzero operators defined as follows:
$$(Ax)(t) = \int_0^1 a(t)b(s)\,x(s)\,ds,\qquad (Bx)(t) = \int_0^1 c(t)e(s)\,x(s)\,ds,$$
where $a \in L^p(\mathbb{R},\mu)$, $b \in L^q([0,1],\mu)$, $1 < q < \infty$, $\frac1p + \frac1q = 1$, and $c(t) = t\,I_{[0,1]}(t)$, $e(s) = s+1$. Consider the polynomial $F(z) = z^2 + z - 1$ and suppose that for almost every $(t,s) \in \operatorname{supp} c \times \operatorname{supp} e$, $a(t) = \frac{\lambda+\delta_0}{k_2}\,c(t)$, $b(s) = \lambda e(s)$ for a nonzero constant $\lambda$ and $k_2 = \lambda\,Q_{[0,1]}(e,c) = \frac56\lambda$. From Propositions 4 and 5 we have that $AB = BF(A)$ if $F(\lambda-1) = \lambda-1$, that is, $\lambda^2 - 2\lambda = 0$. Therefore we take $\lambda = 2$. Then
$$A = \frac{\lambda+\delta_0}{Q_{[0,1]}(e,c)}\,B = \frac65\,B.$$
Hence $A^2 = \frac65 B\left(\frac65 B\right) = \left(\frac65\right)^2 B^2$. But
$$(B^2x)(t) = \int_0^1 t\,I_{[0,1]}(t)(s+1)\left(\int_0^1 s\,I_{[0,1]}(s)(\tau+1)\,x(\tau)\,d\tau\right)ds = \frac56\,(Bx)(t).$$
Therefore $A^2 = \left(\frac65\right)^2 B^2 = \frac65\,B = A$. Thus $F(A) = A^2 + A - I = 2A - I = \frac{12}{5}B - I$ and $BF(A) = B\left(\frac{12}{5}B - I\right) = \frac{12}{5}B^2 - B = \frac{12}{5}\cdot\frac56\,B - B = B$. Finally, $AB = \frac65 B^2 = \frac65\cdot\frac56\,B = B = BF(A)$.
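The computations of Example 2 can be verified by quadrature. The sketch below is an added illustration, not from the original text: it discretizes $B$ (kernel $c(t)e(s) = t(s+1)$ on $[0,1]$), sets $A = \frac65 B$, and checks that $AB = BF(A) = B$ for $F(z) = z^2 + z - 1$ on a sample function.

```python
import numpy as np

# Quadrature check of Example 2: B has kernel c(t)e(s) = t*(s+1) on [0, 1],
# A = (6/5) B, F(z) = z^2 + z - 1; then AB = B F(A) = B.
N = 2000
s, h = np.linspace(0.0, 1.0, N, retstep=True)
w = np.full(N, h); w[0] = w[-1] = h / 2.0   # trapezoidal quadrature weights

def apply_B(x):
    # (Bx)(t) = t * ∫_0^1 (s + 1) x(s) ds
    return s * np.dot(w, (s + 1.0) * x)

def apply_A(x):
    return 1.2 * apply_B(x)                 # A = (6/5) B

x = np.cos(3.0 * s) + s**2                  # sample test function
ABx = apply_A(apply_B(x))
BFAx = apply_B(apply_A(apply_A(x)) + apply_A(x) - x)   # B F(A) x
Bx = apply_B(x)
print(np.allclose(ABx, BFAx, atol=1e-6), np.allclose(ABx, Bx, atol=1e-6))
```

Both comparisons hold up to quadrature error, confirming $AB = BF(A) = B$ for this pair of operators.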

Remark 9. Example 2 is a case when the operator $B\sum_{k=1}^n \delta_k A^k \ne 0$, as mentioned in Remark 1 and Remark 3. In this case we have $G_A = G_B = G = [0,1]$, the operators $A, B : L^p(\mathbb{R},\mu) \to L^p(\mathbb{R},\mu)$, $1 < p < \infty$, are defined as follows:
$$(Ax)(t) = \frac65\,(Bx)(t),\qquad (Bx)(t) = \int_0^1 t\,I_{[0,1]}(t)(s+1)\,x(s)\,ds$$
almost everywhere, and the polynomial is $F(z) = -1 + z + z^2$ with coefficients $\delta_0 = -1$, $\delta_1 = \delta_2 = 1$. Conditions 2 and 3 are satisfied because they are imposed on the set $\mathbb{R} \times \emptyset = \emptyset$, which has measure zero in $\mathbb{R} \times [0,1]$. Condition 1 is satisfied as shown in Example 2. Moreover,
$$B(A + A^2) = 2BA = 2\cdot\frac65\,B^2 = 2B \ne 0,\qquad A + A^2 = 2A = \frac{12}{5}\,B \ne 0.$$

Remark 10. Let $(X,\Sigma,\mu)$ be a $\sigma$-finite measure space. From Proposition 5 we have the following: if $A : L^p(X,\mu) \to L^p(X,\mu)$, $B : L^p(X,\mu) \to L^p(X,\mu)$, $1 \le p \le \infty$, are nonzero operators defined as $(Ax)(t) = \int_G a(t)b(s)\,x(s)\,d\mu_s$, $(Bx)(t) = \int_G c(t)e(s)\,x(s)\,d\mu_s$, almost everywhere, $G \in \Sigma$ is a set with finite measure, $a, c \in L^p(X,\mu)$, $b, e \in L^q(G,\mu)$, $1 \le q \le \infty$, $\frac1p + \frac1q = 1$, and $F(z) = \delta_0 + \delta_1 z + \cdots + \delta_n z^n$, $z \in \mathbb{R}$, $\delta_j \in \mathbb{R}$, $j = 0,1,2,3,\ldots,n$; if we suppose that for almost every $(t,s) \in \operatorname{supp} c \times \operatorname{supp} e$, $a(t) = \frac{\lambda+\delta_0}{k_2}\,c(t)$ and $b(s) = \frac{\lambda}{k_1}\,e(s)$ for some nonzero constants $\lambda$, $k_1$ and $k_2$, and if $F(\lambda+\delta_0) = \lambda+\delta_0$ and $k_2 = \frac{\lambda}{k_1}\,Q_G(e,c)$, then $A = \frac{\lambda+\delta_0}{Q_G(e,c)}\,B$ and $AB = BF(A)$. Now suppose that $A = \omega B$ for some $\omega \in \mathbb{R}$. Then $AB = BF(A)$ if and only if
$$F(\omega\,Q_G(c,e)) = \omega\,Q_G(c,e). \tag{4.28}$$
This relation is the same as Equation (4.26) with $\omega = \frac{(\lambda+\delta_0)\lambda}{k_1 k_2}$.

Corollary 5. Let $(X,\Sigma,\mu)$ be a $\sigma$-finite measure space. Let $A : L^p(X,\mu) \to L^p(X,\mu)$, $B : L^p(X,\mu) \to L^p(X,\mu)$, $1 \le p \le \infty$, be nonzero operators defined by
$$(Ax)(t) = \int_{G_A} a(t)b(s)\,x(s)\,d\mu_s,\qquad (Bx)(t) = \int_{G_B} c(t)e(s)\,x(s)\,d\mu_s,$$
almost everywhere, where $G_A \in \Sigma$, $G_B \in \Sigma$ are sets with finite measure, $a, c \in L^p(X,\mu)$, $b \in L^q(G_A,\mu)$, $e \in L^q(G_B,\mu)$, $1 \le q \le \infty$ and $\frac1p + \frac1q = 1$. Consider a polynomial $F(z) = \delta_0 + \delta_1 z + \cdots + \delta_n z^n$, where $z \in \mathbb{R}$, $\delta_j \in \mathbb{R}$, $j = 0,1,2,3,\ldots,n$. Let $G = G_A \cap G_B$, and
$$k_1 = \sum_{j=1}^n \delta_j\,Q_{G_A}(a,b)^{j-1}\,Q_{G_B}(a,e),\qquad k_2 = Q_{G_B}(b,c).$$
Then:

1. if $k_1 \ne 0$ and $k_2 \ne 0$, then $AB = BF(A)$ if and only if $A = \omega B$ for some constant $\omega$ which satisfies (4.28);
2. if $k_2 = 0$, then $AB = 0$ and $AB = BF(A)$ if and only if $BF(A) = 0$. Moreover,
   (a) if $k_1 \ne 0$, then $BF(A) = 0$ if and only if $b(s) = -\frac{\delta_0}{k_1}\,e(s)\,I_G(s)$ almost everywhere;
   (b) if $k_1 = 0$, then $AB = BF(A)$ if $\delta_0 = 0$, that is, $F(z) = \sum_{j=1}^n \delta_j z^j$;
3. if $k_2 \ne 0$ and $k_1 = 0$, then $AB = BF(A)$ if and only if $AB = \delta_0 B$, that is,
$$(Ax)(t) = \frac{\delta_0}{k_2}\int_{G_A} c(t)\,b(s)\,x(s)\,d\mu_s.$$

Proof. 1. By applying Theorem 3, if $k_1 \ne 0$ and $k_2 \ne 0$ we have $AB = BF(A)$ if and only if the following is true:
- for almost every $t \in \operatorname{supp} c$, $a(t) = \frac{\delta_0+\lambda}{k_2}\,c(t)$, and $b(s) = \frac{\lambda}{k_1}\,e(s)$ for almost every $s \in G \cap \operatorname{supp} e$, with a nonzero constant $\lambda$ satisfying (4.26);
- $e(s) = 0$ for almost every $s \in G_B \setminus G$;
- $b(s) = 0$ for almost every $s \in G_A \setminus G$.
From this we have
$$(Ax)(t) = \int_G a(t)b(s)\,x(s)\,d\mu_s + \int_{G_A \setminus G} a(t)b(s)\,x(s)\,d\mu_s = \frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\int_G c(t)e(s)\,x(s)\,d\mu_s = \frac{(\lambda+\delta_0)\lambda}{k_1 k_2}\,(Bx)(t)$$
almost everywhere. If $\lambda = 0$ then $A = 0$.
2. If $k_2 = 0$, then from (4.22) we have $AB = 0$ and hence $AB = BF(A)$ if and only if $BF(A) = 0$. Moreover, by applying Theorem 3 we have:
(a) if $k_1 \ne 0$, then $AB = BF(A)$ if and only if $b(s) = -\frac{\delta_0}{k_1}\,e(s)$ for almost every $s \in \operatorname{supp} e \cap G$, $b(s) = 0$ for almost every $s \in G \setminus \operatorname{supp} e$, $e(s) = 0$ for almost every $s \in G_B \setminus G$, and $b(s) = 0$ for almost every $s \in G_A \setminus G$. Therefore, almost everywhere, $b(s) = -\frac{\delta_0}{k_1}\,e(s)\,I_G(s)$;
(b) if $k_1 = 0$ and $\delta_0 = 0$, then $AB = BF(A)$.
3. By applying Theorem 3, if $k_2 \ne 0$ and $k_1 = 0$ we have $AB = BF(A)$ if and only if, for almost every $t \in \operatorname{supp} c$, $a(t) = \frac{\delta_0+\lambda}{k_2}\,c(t)$ and $\lambda e(s) = 0$ for almost every $s \in G \cap \operatorname{supp} e$, from which we get $\lambda = 0$. Therefore $a(t) = \frac{\delta_0}{k_2}\,c(t)$ almost everywhere, so we can write $(Ax)(t) = \int_{G_A} a(t)b(s)\,x(s)\,d\mu_s = \frac{\delta_0}{k_2}\int_{G_A} c(t)b(s)\,x(s)\,d\mu_s$ almost everywhere. Hence, almost everywhere,
$$(ABx)(t) = \frac{\delta_0}{k_2}\int_{G_A} c(t)b(s)\left(\int_{G_B} c(s)e(\tau)\,x(\tau)\,d\mu_\tau\right)d\mu_s = \frac{\delta_0}{k_2}\,c(t)\int_{G_A} c(s)b(s)\,d\mu_s\int_{G_B} e(\tau)\,x(\tau)\,d\mu_\tau$$
$$= \frac{\delta_0}{k_2}\,Q_G(b,c)\int_{G_B} c(t)e(\tau)\,x(\tau)\,d\mu_\tau = \delta_0\,(Bx)(t).$$
On the other hand, it follows from (4.24) that $BF(A) = \delta_0 B$ when $k_1 = 0$. $\square$

Example 3. Let $(\mathbb{R},\Sigma,\mu)$ be the standard Lebesgue measure space. Let $A : L^p(\mathbb{R},\mu) \to L^p(\mathbb{R},\mu)$, $B : L^p(\mathbb{R},\mu) \to L^p(\mathbb{R},\mu)$, $1 < p < \infty$, be the nonzero operators
$$(Ax)(t) = \int_0^1 a(t)b(s)\,x(s)\,ds,\qquad (Bx)(t) = \int_0^1 c(t)e(s)\,x(s)\,ds,$$
where $a(t) = t^2 I_{[0,1]}(t)$, $b(s) = s^3$, $c(t) = -6t^2 I_{[0,1]}(t)$ and $e(s) = s$. Consider a polynomial $F(t) = \delta_0 + \delta_1 t + \delta_2 t^2$, where $t \in \mathbb{R}$, $\delta_j \in \mathbb{R}$, $j = 0,1,2$. We have
$$k_2 = Q_{[0,1]}(b,c) = \int_0^1 b(s)c(s)\,ds = \int_0^1 (-6s^2)\,s^3\,ds = -1.$$
To have $k_1 = \delta_1 Q_{[0,1]}(a,e) + \delta_2 Q_{[0,1]}(a,b)\,Q_{[0,1]}(a,e) = 0$, choose $\delta_i$, $i = 1,2$, such that
$$0 = \delta_1 + \delta_2\,Q_{[0,1]}(a,b) = \delta_1 - \tfrac16\,\delta_2\,Q_{[0,1]}(c,b) = \delta_1 + \tfrac16\,\delta_2.$$
Thus $\delta_2 = -6\delta_1$, and $\frac{\delta_0}{k_2} = -\frac16$, from which we get $\delta_0 = \frac16$. Hence $F(t) = -6\delta_1 t^2 + \delta_1 t + \frac16$. We have, almost everywhere,
$$(Ax)(t) = \int_0^1 t^2 I_{[0,1]}(t)\,s^3\,x(s)\,ds,\qquad (Bx)(t) = -6\int_0^1 t^2 I_{[0,1]}(t)\,s\,x(s)\,ds,$$
and thus, almost everywhere,
$$(ABx)(t) = \int_0^1 t^2 I_{[0,1]}(t)\,s^3\left(-6\int_0^1 s^2 I_{[0,1]}(s)\,\tau\,x(\tau)\,d\tau\right)ds = \frac16\,(Bx)(t),$$
$$(A^2x)(t) = \int_0^1 t^2 I_{[0,1]}(t)\,s^3\left(\int_0^1 s^2 I_{[0,1]}(s)\,\tau^3\,x(\tau)\,d\tau\right)ds = \frac16\,(Ax)(t).$$
Finally, we have
$$BF(A) = B\left(-6\delta_1 A^2 + \delta_1 A + \tfrac16\,I\right) = -\delta_1 BA + \delta_1 BA + \tfrac16\,B = \tfrac16\,B = AB.$$

Example 4. Let $(\mathbb{R},\Sigma,\mu)$ be the standard Lebesgue measure space. Let $A : L^p(\mathbb{R},\mu) \to L^p(\mathbb{R},\mu)$, $B : L^p(\mathbb{R},\mu) \to L^p(\mathbb{R},\mu)$, $1 < p < \infty$, be nonzero operators

Finally, we have   1 1 1 2 B F(A) = B −6δ1 A + δ1 A + I = −δ1 B A + δ1 B A + B = B = AB. 6 6 6 Example 4 Let (R, Σ, μ) be the standard Lebesgue measure space. Let A : L p (R, μ) → L p (R, μ), B : L p (R, μ) → L p (R, μ), 1 < p < ∞ be nonzero operators (Ax)(t) =

β α

a(t)b(s)x(s)ds, (Bx)(t) =



c(t)e(s)x(s)ds,

α

where α, β ∈ R, −∞ < α ≤ β < ∞, a, c ∈ L p (R, μ), b, e ∈ L q ([α, β], μ) where 1 < q < ∞ such that 1p + q1 = 1. Consider a polynomial F(t) = δ0 + δ1 t + δ2 t 2 , where t ∈ R, δ j ∈ R, j = 0, 1, 2. We set k2 = Q [α,β] (b, c) =

β α

b(s)c(s)ds, k1 = δ1 Q [α,β] (a, e) + δ2 Q [α,β] (a, b)Q [α,β] (a, e).

If k2 = 0 and k1 = 0 then we choose either Q [α,β] (a, e) = 0 or δi , i = 1, 2 such that δ1 + δ2 Q [α,β] (a, b) = 0. Thus from Corollary 5 we have a(t) = kδ02 c(t) almost everywhere. Thus k1 = 0 implies that either Q [α,β] (a, e) = 0 or δ1 + kδ02 δ2 k2 = 0. We choose coefficients δ j , j = 0, 1, 2 such that δ1 = −δ0 δ2 , and hence F(t) = δ2 t 2 − δ0 δ2 t + δ0 . Then, the operators δ0 (Ax)(t) = k2



β c(t)b(s)x(s)ds, (Bx)(t) =

α

almost everywhere, satisfy the relation

c(t)e(s)x(s)ds α

4 Representations of Polynomial Covariance Type Commutation …

AB = δ2 B A2 − δ0 δ2 B A + δ0 B.

91

(4.29)

In fact, δ0 (ABx)(t) = k2

β α



δ0 (A2 x)(t) = k2

α

⎛ c(t)b(s) ⎝ ⎛

δ0 c(t)b(s) ⎝ k2



⎞ c(s)e(τ )x(τ )dτ ⎠ ds = δ0 (Bx)(t),

α



⎞ c(s)b(τ )x(τ )dτ ⎠ ds = δ0 (Ax)(t),

α

  almost everywhere. Finally, we have B F(A) = B δ2 A2 − δ2 δ0 A + δ0 I = δ2 δ0 B A − δ2 δ0 B A + δ0 B = δ0 B = AB. In particular, if α = 0, β = 1, b(s) = s and c(t) = t 2 I[0,1] (t), e(s) = s 3 we have k2 = Q [0,1] (b, c) = 41 . Hence the operators 1 (Ax)(t) = 4δ0

1 t I[0,1] (t)sx(s)ds, (Bx)(t) =

t 2 I[0,1] (t)s 3 x(s)ds

2

0

(4.30)

0

satisfy the Relation (4.29). In particular, if δ2 = 1 and δ0 = −1, that is, F(t) = t 2 + t − 1 then the corresponding operators in (4.30) satisfy AB = B A2 + B A − B. Corollary 6 Let (X, Σ, μ) be a σ -finite measure space. Let A : L p (X, μ) → L p (X, μ), B : L p (X, μ) → L p (X, μ), 1 ≤ p ≤ ∞ be nonzero operators defined by   (Ax)(t) = a(t)b(s)x(s)dμs , (Bx)(t) = c(t)e(s)x(s)dμs , GA

GB

almost everywhere, G A ∈ Σ, G B ∈ Σ are sets with finite measure, a, c ∈ L p (X, μ), b ∈ L q (G A , μ), e ∈ L q (G B , μ), 1 ≤ q ≤ ∞ and 1p + q1 = 1. Consider a poly+ δn z n , where z ∈ R, δ j ∈ R, j = 0, 1, 2, 3, . . . , n. nomial F(z) = δ0 + δ1 z + . . .

Let G = G A ∩ G B and k1 = nj=1 δ j Q G A (a, b) j−1 Q G B (a, e), k2 = Q G B (b, c). If k2 = 0 and Q G B (a, e) = 0, then AB = B F(A) if and only if AB = δ0 B, that is a(t) = kδ02 c(t), almost everywhere. Proof This follows by Corollary 5 since k2 = 0 and k1 = 0.
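The particular operators in (4.30) with δ₂ = 1 and δ₀ = −1 can be checked numerically. The sketch below is not part of the chapter: it discretizes the rank-one integral kernels on a Gauss-Legendre grid (an arbitrary choice of ours, exact for the polynomial kernels involved) and verifies AB = BA² + BA − B as matrix identities.

```python
import numpy as np

# Gauss-Legendre nodes/weights mapped from [-1, 1] to [0, 1];
# exact for polynomials of degree < 16, enough for the kernels below.
nodes, weights = np.polynomial.legendre.leggauss(8)
s = 0.5 * (nodes + 1.0)
w = 0.5 * weights

delta0 = -1.0
# (Ax)(t) = 4*delta0 * t^2 * int_0^1 s x(s) ds,  (Bx)(t) = t^2 * int_0^1 s^3 x(s) ds
A = 4.0 * delta0 * np.outer(s**2, s) * w   # A[i, j] = K_A(t_i, s_j) * w_j
B = np.outer(s**2, s**3) * w               # B[i, j] = K_B(t_i, s_j) * w_j

lhs = A @ B
rhs = B @ A @ A + B @ A - B                # B A^2 + B A - B

assert np.allclose(lhs, rhs, atol=1e-12)
assert np.allclose(lhs, delta0 * B, atol=1e-12)   # indeed AB = delta_0 B
print("max deviation:", np.max(np.abs(lhs - rhs)))
```

Because the quadrature integrates the polynomial kernels exactly, the identities hold to machine precision rather than only up to discretization error.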



Corollary 7 Let (X, Σ, μ) be a σ-finite measure space. Let A : L^p(X, μ) → L^p(X, μ), B : L^p(X, μ) → L^p(X, μ), 1 ≤ p ≤ ∞, be nonzero operators defined by

$$(Ax)(t) = \int_{G_A} a(t)b(s)x(s)\,d\mu(s), \qquad (Bx)(t) = \int_{G_B} c(t)e(s)x(s)\,d\mu(s),$$

almost everywhere, where G_A ∈ Σ, G_B ∈ Σ are sets with finite measure, a, c ∈ L^p(X, μ), b ∈ L^q(G_A, μ), e ∈ L^q(G_B, μ), 1 ≤ q ≤ ∞ and 1/p + 1/q = 1. Consider a monomial F(z) = δz^d, where z ∈ ℝ, d is a positive integer and δ ≠ 0 is a real number. Let G = G_A ∩ G_B and

$$k_1 = \delta\, Q_{G_A}(a, b)^{d-1}\, Q_{G_B}(a, e), \qquad k_2 = Q_{G_B}(b, c).$$

Then AB = δ B A^d if and only if the following conditions are fulfilled:

1. (a) For almost every (t, s) ∈ supp c × [(supp e) ∩ G] we have the following:
   (i) If k₂ ≠ 0, then k₁ b(s) = e(s)λ and a(t) = (λ/k₂) c(t) for some λ ∈ ℝ.
   (ii) If k₂ = 0, then either k₁ = 0 or b(s) = 0 for almost all s ∈ supp e ∩ G.
   (b) If t ∉ supp c, then either k₂ = 0 or a(t) = 0 for almost all t ∉ supp c.
   (c) If s ∈ G \ supp e, then either k₁ = 0 or b(s) = 0 for almost all s ∈ G \ supp e.
2. k₂ = 0, or e(s) = 0 for almost every s ∈ G_B \ G.
3. k₁ = 0, or b(s) = 0 for almost every s ∈ G_A \ G.

Proof This follows from Theorem 3 and the fact that δ₀ = 0 in this case. □



Example 5 Let (ℝ, Σ, μ) be the standard Lebesgue measure space. Let A : L²([α, β], μ) → L²([α, β], μ), B : L²([α, β], μ) → L²([α, β], μ) be defined by

$$(Ax)(t) = \int_{\alpha}^{\beta} a(t)b(s)x(s)\,ds, \qquad (Bx)(t) = \int_{\alpha}^{\beta} c(t)e(s)x(s)\,ds,$$

where α, β are real numbers with α < β, and a, b, c, e ∈ L²([α, β], μ) are such that a ⊥ b and b ⊥ c, that is,

$$\int_{\alpha}^{\beta} a(t)b(t)\,dt = \int_{\alpha}^{\beta} b(t)c(t)\,dt = 0.$$

Then the above operators satisfy

AB = δ B A^d, d = 2, 3, .... In fact, by using Corollary 7 and putting F(t) = δt^d, d = 2, 3, ..., k₁ = δ Q_{[α,β]}(a, b)^{d−1} Q_{[α,β]}(a, e), k₂ = Q_{[α,β]}(b, c), we get k₁ = k₂ = 0, so all the conditions in Corollary 7 are satisfied. In particular, if a(t) = (\tfrac{5}{2}t³ − \tfrac{3}{2}t) I_{[−1,1]}(t), b(s) = \tfrac{3}{2}s² − \tfrac{1}{2} and c(t) = t I_{[−1,1]}(t), then the operators

$$(Ax)(t) = \int_{-1}^{1} \left(\frac{5}{2}t^3 - \frac{3}{2}t\right) I_{[-1,1]}(t) \left(\frac{3}{2}s^2 - \frac{1}{2}\right) x(s)\,ds, \qquad (Bx)(t) = \int_{-1}^{1} t\, I_{[-1,1]}(t)\, e(s)\, x(s)\,ds$$

satisfy the relation AB = B A^d, d = 2, 3, .... In fact, a, b, c are pairwise orthogonal in [−1, 1].

Acknowledgements This work was supported by the Swedish International Development Cooperation Agency (Sida) through the bilateral capacity development program in Mathematics with Mozambique.


Domingos Djinja is grateful to Dr. Lars Hellström and Dr. Yury Nepomnyashchkh for useful comments, and to the Mathematics and Applied Mathematics research environment MAM, Division of Mathematics and Physics, School of Education, Culture and Communication, Mälardalen University, for an excellent environment for research in Mathematics.


Chapter 5

Computable Bounds of Exponential Moments of Simultaneous Hitting Time for Two Time-Inhomogeneous Atomic Markov Chains

Vitaliy Golomoziy

Abstract In this paper, we study the first simultaneous hitting of an atom by two discrete-time, time-inhomogeneous Markov chains with values in a general phase space. We establish conditions for the existence of, and find computable bounds for, the exponential moment of the hitting time, using a geometric drift condition adapted to time-inhomogeneous Markov chains.

Keywords Markov chain · Hitting time

MSC 2020 60J10

5.1 Introduction

Properties of hitting times play an important role in the theory of Markov chains, and various drift conditions are practical tools used in applications when dealing with such moments. The theory of hitting times and convergence of homogeneous Markov chains is well developed; many books are devoted to that topic, see, for example, [7, 23, 30]. The first of these includes a good overview of recent results, and we will refer to this book repeatedly. Hitting times play such an extraordinary role in the theory of homogeneous Markov chains because of two important methods used in research nowadays: splitting and coupling. The splitting method was introduced in the seminal work of Nummelin [25] and was further developed by other authors. The famous book [23] presents a comprehensive theory of homogeneous Markov chains developed using the splitting technique. The coupling method was first used by Doeblin [6] and became very popular afterward. The essence of the coupling method is covered in the books [20, 31]. In recent years, the coupling method has been used extensively (see [1, 8, 14–19, 27]).

V. Golomoziy (B)
Taras Shevchenko National University of Kyiv, 64 Volodymyrska st., 01033 Kyiv, Ukraine
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_5

At the same time, the majority of papers have been devoted to homogeneous Markov chains, while the inhomogeneous theory is not as well developed. Inhomogeneous chains play an important role in applications, for example, in actuarial mathematics [16], risk theory [2], stochastic optimization [4, 22], etc. In addition, there is great theoretical interest in this area of research. Foundational results in the theory of inhomogeneous Markov chains were established by Dobrushin [5] and in the subsequent works [21, 26, 28]. Most of the aforementioned works are related to the ergodicity of time-inhomogeneous chains. However, the coupling technique can also be applied to the study of the stability of inhomogeneous chains. By stability, we mean not only stability of the same chain with respect to different initial distributions, but also proximity, in some sense, of two different time-inhomogeneous chains. Stability results for inhomogeneous chains can be found in [14–19]. The essential tool in the application of the coupling method to stability research in the inhomogeneous case is renewal theory. This theory is well developed for homogeneous renewal sequences, where such classical results as the Blackwell and Kendall theorems and the Key Renewal Theorem play an important role. However, there are no comparably strong results for inhomogeneous renewal sequences (such sequences are generated by inhomogeneous Markov chains). We can highlight the works of Chow and Robbins [3] and Smith [29] in this domain. Properties of renewal sequences (such as estimation of the expectation of the simultaneous hitting time) in the context of inhomogeneous Markov chains were studied in the papers [9–13]. The present paper can be considered as a contribution to the renewal theory of time-inhomogeneous Markov chains. An important aspect of any research is the ability to verify the conditions in practical applications. The standard tool used for that purpose in the theory of Markov chains is drift conditions.

In this paper, we develop a drift condition that is sufficient to ensure the existence of the exponential moment of the return time, and we evaluate its bounds. Such conditions are extensively used in the theory of both homogeneous and inhomogeneous Markov chains; see [1, 8] for examples of drift conditions being used to study ergodicity of inhomogeneous Markov chains. The paper is organized as follows. In Sect. 5.2 we construct the probability space and introduce the notation used in the rest of the paper. Section 5.3 introduces drift conditions, adapted to time-inhomogeneous chains, that ensure the existence of an exponential moment. Section 5.4 presents theorems that guarantee the existence of an exponential moment of the simultaneous return time of a couple of different inhomogeneous chains, computable bounds for such moments, and an application to ergodicity. Section 5.5 contains auxiliary lemmas used in the proofs of the main results. In the Appendix we provide the well-known Comparison Theorem for adapted processes with discrete time.


5.2 Notation

Let (E, 𝓔) be a measurable space, and let M₁(E), F₊(E), F_b(E) be the spaces of all probability measures, positive measurable functions, and bounded measurable functions on (E, 𝓔), respectively. Denote by ℕ₀ the set of all nonnegative integers, ℕ₀ := {0, 1, 2, ...}. In this paper, we study time-inhomogeneous Markov chains taking values in the space (E, 𝓔). We associate a time-inhomogeneous Markov chain with a sequence of Markov kernels P_t : E × 𝓔 → [0, 1], where t ∈ ℕ₀. The Markov kernel P_t(x, A) stands for the probability for the chain, being at state x ∈ E at time t, to hit the set A ∈ 𝓔 at time t + 1. We introduce a special notation for a product of transition kernels:

$$P^{t,n}(x, A) = \left(\prod_{k=0}^{n-1} P_{t+k}\right)(x, A) = \int_E \cdots \int_E P_t(x, dx_1) \cdots P_{t+n-1}(x_{n-1}, A), \quad n \ge 1,$$

$$P^{t,0}(x, A) = 1_A(x).$$

It is very well known that a sequence of Markov kernels (P_t) together with a starting measure λ ∈ M₁(E) defines a time-inhomogeneous Markov chain (see [24], Theorem 5.1). Our main goal in this section is to introduce notation that is not overwhelmed with indexes and that allows us to use intuition from the theory of homogeneous Markov chains while emphasizing the differences brought by time-inhomogeneity.

Let Ω = E^∞ be the set of all infinite sequences ω = (ω₀, ω₁, ...), ω_j ∈ E, and let 𝓕 = 𝓔^∞ be the sigma-field generated by all cylinder sets. For each fixed t ∈ ℕ₀ and λ ∈ M₁(E) there exists a unique probability measure P^t_λ (see [24], Chap. 5 for details of the construction) such that for any cylinder set A₀ × A₁ × ... × A_{t+n} × E^∞ ∈ 𝓕:

$$P^t_\lambda\{A_0 \times \cdots \times A_{t+n}\} = \int_{A_t}\int_{A_{t+1}}\cdots\int_{A_{t+n}} \lambda(dx_0)\, P_t(x_0, dx_1) \cdots P_{t+n-1}(x_{n-1}, dx_n).$$

Define a sequence of random variables X_{t,n}(ω) = ω_{t+n}, n ≥ 0, such that for all A₀, ..., A_n ∈ 𝓔:

$$P^t_\lambda\{X_{t,0} \in A_0, \ldots, X_{t,n} \in A_n\} = P^t_\lambda\{E^t \times A_0 \times \cdots \times A_n\}.$$

Denote by 𝓕_{t,n} = σ(X_{t,k}, 0 ≤ k ≤ n) the natural filtration associated with the random sequence (X_{t,n}, n ≥ 0), and use the notation E^t[f(X_{t,n+m}) | 𝓕_{t,n}] for the conditional expectation associated with (X_{t,n}, n ≥ 0) (here f ∈ F₊(E), n, m ≥ 0). Then the Markov property holds true: E^t[f(X_{t,n+m}) | 𝓕_{t,n}] = E^t[f(X_{t,n+m}) | X_{t,n}].

So far, we have defined a space (Ω, 𝓕) and a sequence P^t_λ of probability measures on that space, as well as a double-indexed sequence X_{t,n} : Ω → E, t, n ≥ 0. Naturally, we associate X_{t,n} with the probability space (Ω, 𝓕, P^t_λ). Now we establish how P^t_λ and X_{t,n} are connected for different t. Let us introduce the shift operators θ_n : Ω → Ω, n ≥ 1, where for all ω = (ω₀, ω₁, ...) ∈ Ω: θ(ω) = θ₁(ω) = (ω₁, ω₂, ...) and θ_n(ω) = (θ_{n−1} ∘ θ)(ω) = (ω_n, ω_{n+1}, ...), n > 1. It is clear that, for any t, s, n ∈ ℕ₀, X_{t+s,n} = X_{t,n+s} = X_{t,n} ∘ θ_s, but, for B ∈ 𝓔, P^{t+s}_λ{X_{t+s,n} ∈ B} ≠ P^t_λ{X_{t,n+s} ∈ B}. For a set C ∈ 𝓔

define hitting and return times by

$$\tau_{t,C} = \inf\{n \ge 0 : X_{t,n} \in C\}, \qquad \sigma_{t,C} = \inf\{n \ge 1 : X_{t,n} \in C\}.$$

In order to simplify further transformations and to use the intuition coming from the time-homogeneous case, we make the following agreement: we always use the random element X_{t,n} in the context of the probability P^t_λ and never in the context of P^s_λ, s ≠ t. So, we will omit the lower index t in X_{t,n}, 𝓕_{t,n}, τ_{t,C} and σ_{t,C} in the context of P^t_λ. For example, we can write

$$P^t_\lambda\{\sigma_C > n\} = P^t_\lambda\{\sigma_{t,C} > n\} = P^t_\lambda\{X_1 \notin C, \ldots, X_n \notin C\} = P^t_\lambda\{X_{t,1} \notin C, \ldots, X_{t,n} \notin C\} = P^t_\lambda\{\omega \in \Omega : \omega_{t+1} \notin C, \ldots, \omega_{t+n} \notin C\}.$$

Similarly, for f ∈ F₊(E), using the Markov property, we get

$$E^t[f(X_{n+m}) \mid \mathcal{F}_n] = E^t[f(X_{t,n+m}) \mid \mathcal{F}_{t,n}] = E^t[f(X_{n+m}) \mid X_n] = P^{t+n,m} f(X_{t,n}) = \int_E \cdots \int_E P_{t+n}(\omega_{t+n}, dx_1) P_{t+n+1}(x_1, dx_2) \cdots P_{t+n+m-1}(x_{m-1}, dx_m) f(x_m).$$

Note that in the formula above the index t must be specified in P^{t+n,m} f(X_{t,n}). On the other hand, it is obvious that for x ∈ E,

$$E^t[f(X_{n+m}) \mid X_n = x] = E^{t+n}_x[f(X_m)],$$

which is a typical expression in the theory of homogeneous Markov chains.

We conclude this section with the definition of an atom.

Definition 1 We say that a set α ∈ 𝓔 is an atom for the sequence of Markov kernels (P_t, t ∈ ℕ₀) if there exists a sequence of probability measures μ_t ∈ M₁(E) such that for any t ∈ ℕ₀, A ∈ 𝓔 and x ∈ α: P_t(x, A) = μ_t(A). We say that the atom α is aperiodic if there exists m ≥ 1 such that

$$\inf_t\,\{P^{t,m}(\alpha, \alpha),\, P^{t,m+1}(\alpha, \alpha)\} > 0. \tag{5.1}$$

Remark 1 For a homogeneous Markov chain with kernel P, an aperiodic atom satisfies P^n(α, α) > 0 for all n ≥ m, where m is some positive integer. Note that condition (5.1) implies that there exists m ≥ 1 such that P^{t,m+n}(α, α) > 0 for all n ≥ 0. In contrast to the homogeneous case, in the inhomogeneous case it is possible that P^{t,m+n}(α, α) → 0 as t → ∞. That is why we require in (5.1) that inf_t{P^{t,m}(α, α), P^{t,m+1}(α, α)} > 0. Since the greatest common divisor of m and m + 1 equals 1, the latter condition can be rewritten in the following form, which we use in the proof of the main result:

$$\exists m \ge 1,\ \forall n \ge 0,\ \exists \gamma_n > 0 : \inf_{t,\, 0 \le k \le n} P^{t,m+k}(\alpha, \alpha) \ge \gamma_n > 0. \tag{5.2}$$

k

(5.3)

n j , k ≥ 1.

j=1

2. Sequence {λk , k ≥ 0} defined in item 1., satisfies ⎛ ⎞−1 ∞ k

 ⎝ λ j ∨ 1⎠ (1 − λk )+ = ∞. k=0

j=0

Here a ∨ b = max{a, b}, and a + = max{a, 0}. We find it convenient to use the following notation k(t) = min{k : Nk ≥ t}, N (t) = Nk(t) , τ = inf{ j ≥ 1 : X t,Nk(t)+ j −t ∈ C}, where t ∈ N0 .

(5.4)

Variable τ here depends on selection of t, which should be clear from the context. Theorem 1 Let (Pt ) be a sequence of Markov transition kernels, C ∈ E be some set and Condition (D) hold true. Then the following two statements hold true. 1. For any t ∈ N0 and x ∈ E such that P t,N (t)−t VN (t) (x) < ∞:

102

V. Golomoziy

Ptx {τ < ∞} = Ptx {σC < ∞} = 1. 2. For any x ∈ E, t ∈ N0 : ⎡ Etx ⎣

τ 

⎤ t,N (t)−t ⎦ ≤ P t,N (t)−t VN (t) (x) + λ−1 λ−1 (x, C), Nk(t)+1 b N (t) P N (t)+ j

j=1

where k(t), N(t) and τ are defined in (5.4).

Proof In the proof we use the notation from Sect. 5.2. The key tool of the proof is the Comparison Theorem (see Appendix). Assume that t ∈ ℕ₀ is fixed. First, for readability purposes, we define an increasing sequence of positive numbers {m_k, k ≥ 0}, an inhomogeneous Markov chain Z_k and a filtration 𝓕*_n (all depending on t) in the following way:

$$m_j = N_{k(t)+j}, \quad j \ge 0, \qquad Z_j = X_{t,\, m_j - t}, \quad j \ge 0, \qquad \mathcal{F}^*_n = \mathcal{F}_{t, m_n}.$$

So, m₀ ≥ t is the first number N_k that is greater than or equal to t, m₁ is the second such number, and so on. It is also clear that τ is a stopping time for the filtration 𝓕*_n, and it can be written as τ = inf{j ≥ 1 : Z_j ∈ C}. First we show that if x ∈ E is such that P^{t,N(t)−t} V_{N(t)}(x) < ∞, then

$$P^t_x\{\tau < \infty\} = P^t_x\{\sigma_C < \infty\} = 1. \tag{5.5}$$

Define

$$A_n = \left(\prod_{j=0}^{n} \lambda_{m_j} \vee 1\right)^{-1}, \qquad \mathcal{V}_n = A_n V_{m_n}(Z_n), \qquad \mathcal{Z}_n = A_{n+1}(1 - \lambda_{m_{n+1}})^+ V_{m_n}(Z_n), \qquad Y_n = b_{N(t)} A_{n+1} 1_C(Z_n).$$

Then we can write

$$E^t[\mathcal{V}_{n+1} \mid \mathcal{F}^*_n] + \mathcal{Z}_n = A_{n+1} E^t[V_{m_{n+1}}(Z_{n+1}) \mid \mathcal{F}_{m_n}] + A_{n+1}(1 - \lambda_{m_{n+1}})^+ V_{m_n}(Z_n)$$
$$= A_{n+1}\left(P^{m_n,\, m_{n+1}-m_n} V_{m_{n+1}}(Z_n) + (1 - \lambda_{m_{n+1}})^+ V_{m_n}(Z_n)\right)$$
$$\le A_{n+1}\left(\lambda_{m_{n+1}} V_{m_n}(Z_n) + (1 - \lambda_{m_{n+1}})^+ V_{m_n}(Z_n) + b_{N(t)} 1_C(Z_n)\right).$$

Consider now two cases. If λ_{m_{n+1}} ≤ 1, then A_{n+1} = A_n and λ_{m_{n+1}} V_{m_n}(Z_n) + (1 − λ_{m_{n+1}})⁺ V_{m_n}(Z_n) = V_{m_n}(Z_n), which yields, for λ_{m_{n+1}} ≤ 1,

$$E^t[\mathcal{V}_{n+1} \mid \mathcal{F}^*_n] + \mathcal{Z}_n \le A_n V_{m_n}(Z_n) + A_{n+1} b_{N(t)} 1_C(Z_n) = \mathcal{V}_n + Y_n. \tag{5.6}$$

Now let λ_{m_{n+1}} > 1. In this case A_{n+1} λ_{m_{n+1}} = A_n and (1 − λ_{m_{n+1}})⁺ = 0. Then, for λ_{m_{n+1}} > 1,

$$E^t[\mathcal{V}_{n+1} \mid \mathcal{F}^*_n] + \mathcal{Z}_n \le A_n V_{m_n}(Z_n) + A_{n+1} b_{N(t)} 1_C(Z_n) = \mathcal{V}_n + Y_n. \tag{5.7}$$

Combining (5.6) and (5.7), we conclude that these inequalities hold for all λ_{m_{n+1}} > 0. The Comparison Theorem then yields

$$E^t_x[\mathcal{V}_\tau 1_{\tau < \infty}] + E^t_x\left[\sum_{n=0}^{\tau-1} \mathcal{Z}_n\right] \le E^t_x[\mathcal{V}_0] + E^t_x\left[\sum_{k=0}^{\tau-1} Y_k\right] = A_0 P^{t,N(t)-t} V_{N(t)}(x) + E^t_x\left[A_1 b_{N(t)} 1_C(X_{N(t)-t})\right] < \infty.$$

Since V_{m_n}(Z_n) ≥ 1,

$$\sum_{n \ge 0} A_n (1 - \lambda_{m_n})^+ P^t_x\{\tau > n\} \le E^t_x\left[\sum_{n=0}^{\tau-1} A_n (1 - \lambda_{m_n})^+ V_{m_n}(Z_n)\right] < \infty.$$

Therefore, we get the relation $\sum_{n \ge 0} A_n (1 - \lambda_{m_n})^+ P^t_x\{\tau > n\} < \infty$. It follows from Condition (D) that $\sum_{n \ge 0} A_n (1 - \lambda_{m_n})^+ = \infty$, which implies P^t_x{τ > n} → 0. This proves (5.5), since P^t_x{σ_C < ∞} ≥ P^t_x{τ < ∞} = 1.

The rest of the proof of the theorem follows the arguments of [7], Proposition 4.3.3 (ii). We apply the Comparison Theorem once again. Put

$$\Lambda_0 = 1, \quad \Lambda_n = \prod_{k=1}^{n} \lambda^{-1}_{m_k},\ n \ge 1, \qquad \mathcal{V}_n = \Lambda_n V_{m_n}(Z_n),\ n \ge 0, \qquad \mathcal{Z}_n = 0, \qquad Y_n = \Lambda_{n+1} b_{N(t)} 1_C(Z_n),\ n \ge 0.$$

Then, for all n ≥ 0,

$$E^t[\mathcal{V}_{n+1} \mid \mathcal{F}^*_n] + \mathcal{Z}_n = \Lambda_{n+1} P^{m_n,\, m_{n+1}-m_n} V_{m_{n+1}}(Z_n) \le \Lambda_{n+1} \lambda_{m_{n+1}} V_{m_n}(Z_n) + \Lambda_{n+1} b_{N(t)} 1_C(Z_n) = \Lambda_n V_{m_n}(Z_n) + Y_n = \mathcal{V}_n + Y_n.$$

Assume that x ∈ E satisfies the inequality P^{t,N(t)−t} V_{N(t)}(x) < ∞. Taking into account (5.5), the Comparison Theorem yields

$$E^t_x[\Lambda_\tau] \le E^t_x[\mathcal{V}_\tau] \le E^t_x[\mathcal{V}_0] + E^t_x\left[\sum_{k=0}^{\tau-1} Y_k\right] = P^{t,N(t)-t} V_{N(t)}(x) + \lambda^{-1}_{N_{k(t)+1}}\, b_{N(t)}\, P^{t,N(t)-t}(x, C),$$

which completes the proof. □

Corollary 1 It follows from Theorem 1 that sufficient conditions for the existence of the exponential moment of σ_C, for given t ∈ ℕ₀ and x ∈ E, are the following:

1. Condition (D) holds true.
2. There exist β > 1 and C_β > 0 such that for all n, k ≥ 0:

$$\beta^k \le C_\beta \prod_{j=1}^{k} \lambda^{-1}_{n+j}.$$

3. x ∈ E is such that P^{t,N(t)−t} V_{N(t)}(x) < ∞.

Then the following inequality is valid:

$$C_\beta^{-1} E^t_x\left[\beta^{\sigma_C}\right] \le P^{t,N(t)-t} V_{N(t)}(x) + \lambda^{-1}_{N_{k(t)+1}}\, b_{N(t)}\, P^{t,N(t)-t}(x, C).$$

Proof Since for all ω ∈ Ω the return time σ_C(ω) satisfies the inequality σ_C(ω) ≤ τ(ω), we can conclude that

$$E^t_x\left[\beta^{\sigma_C}\right] \le E^t_x\left[\beta^{\tau}\right] \le C_\beta E^t_x\left[\prod_{j=1}^{\tau} \lambda^{-1}_{N_{k(t)+j}}\right].$$

The required statement then follows from Theorem 1. □

Remark 2 In the case where n_k = 1 for every k ≥ 0, the drift condition and the exponential moment bound can be rewritten in the simpler form

$$P_t V_{t+1}(x) \le \lambda_{t+1} V_t(x) + b_t 1_C(x), \qquad C_\beta^{-1} E^t_x\left[\beta^{\sigma_C}\right] \le V_t(x) + \lambda^{-1}_{t+1} b_t 1_C(x),$$

assuming that the conditions of Corollary 1 are satisfied.

In fact, Condition (D) implies the one-step drift condition with special functions V_t under some additional assumptions. The following proposition is a straightforward analog of a well-known homogeneous result (see [7], Proposition 4.3.3 (i)).

Proposition 1 Let {λ_t, t ∈ ℕ₀} be a set of positive constants such that

$$b := \sup_{t \in \mathbb{N}_0,\ x \in C} E^t_x\left[\prod_{k=0}^{\sigma_C} \lambda^{-1}_{t+k}\right] < \infty.$$

Then the drift condition P_t V_{t+1}(x) ≤ λ_t V_t(x) + b 1_C(x) holds true for the function

$$V_t(x) = E^t_x\left[\prod_{k=0}^{\tau_C} \lambda^{-1}_{t+k}\right].$$

Proof The Markov property yields:

$$\begin{aligned}
P_t V_{t+1}(x) &= E^t_x\left[V_{t+1}(X_1)\right] = E^t_x\left[E^{t+1}_{X_{t,1}}\left[\prod_{k=0}^{\tau_{t+1,C}} \lambda^{-1}_{t+1+k}\right]\right] = E^t_x\left[\prod_{k=0}^{\tau_C \circ \theta} \lambda^{-1}_{t+1+k}\right] \\
&= \sum_{j \ge 1} E^t_x\left[\prod_{k=0}^{j-1} \lambda^{-1}_{t+1+k}\, 1_{\sigma_C = j}\right] = \sum_{j \ge 1} E^t_x\left[\prod_{k=1}^{j} \lambda^{-1}_{t+k}\, 1_{\sigma_C = j}\right] \\
&= \lambda_t \sum_{j \ge 1} E^t_x\left[\prod_{k=0}^{j} \lambda^{-1}_{t+k}\, 1_{\sigma_C = j}\right] = \lambda_t E^t_x\left[\prod_{k=0}^{\sigma_C} \lambda^{-1}_{t+k}\right].
\end{aligned}$$

For x ∉ C we have P^t_x{σ_C = τ_C} = 1, which means that P_t V_{t+1}(x) = λ_t V_t(x). Additionally, for any x ∈ C:

$$P_t V_{t+1}(x) = \lambda_t E^t_x\left[\prod_{k=0}^{\sigma_C} \lambda^{-1}_{t+k}\right] \le \lambda_t \sup_{x \in C} E^t_x\left[\prod_{k=0}^{\sigma_C} \lambda^{-1}_{t+k}\right] \le \lambda_t \lambda_t^{-1} b = b.$$

Combining the inequalities for x ∈ C and x ∉ C, we get

$$P_t V_{t+1}(x) \le \lambda_t V_t(x) 1_{C^c}(x) + b 1_C(x) \le \lambda_t V_t(x) + b 1_C(x).$$

So, the statement is proved. □
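The one-step drift condition of Remark 2 can be illustrated numerically. The following sketch is our own toy construction, not from the paper: a time-inhomogeneous birth-death chain on the nonnegative integers with C = {0}, a test function V(x) = z^x independent of t, and a Monte Carlo check that the empirical exponential moment of the return time stays below the bound V(x) coming from Remark 2 (here C_β = 1 works because βλ ≤ 1).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy chain: from x >= 1 step down with prob p_t, up with prob 1 - p_t;
# from 0 jump to 1.  C = {0}.  (Illustrative example, not from the paper.)
def p(t):
    return 0.6 + 0.3 / (t + 1)       # down probabilities, always >= 0.6

# For x >= 1 and V(x) = z**x:  P_t V(x) = V(x) * (p_t / z + (1 - p_t) * z).
z = 1.2
lam = max(p(t) / z + (1 - p(t)) * z for t in range(10_000))
assert lam < 1                       # geometric drift outside C

def sigma_C(x=3, t=0):
    """Simulated return time to C = {0}, starting from x at time t."""
    n = 0
    while x != 0:
        x += -1 if rng.random() < p(t + n) else 1
        n += 1
    return n

beta = 1.01
assert beta * lam < 1                # condition 2 of Corollary 1 with C_beta = 1
emp = np.mean([beta ** sigma_C() for _ in range(20_000)])
assert np.isfinite(emp) and emp < z ** 3   # bound V(3) = z**3 from Remark 2
print("lambda =", lam, " empirical E[beta^sigma] =", emp)
```

Here the supremum over a finite range of t is used as a uniform contraction rate; in the notation of the chapter, λ_{t+1} may genuinely vary with t, which is exactly what Condition (D) is designed to accommodate.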

5.3.2 Constructing a Sequence that Dominates Return Time

The existence of a dominating sequence, i.e., a sequence of positive real numbers $\{\hat G_n,\ n\ge 0\}$ such that $\hat G_n \ge \mathsf{P}^t_x\{\sigma_C > n\}$, plays an important role in a series of results for inhomogeneous Markov chains (see [9--13]). In practice, however, it is not always easy to find such a sequence. We will show that the drift condition can be used to address this problem.

Lemma 1 Consider an inhomogeneous Markov chain defined by a sequence of Markov kernels $(P_t,\ t\in\mathbb{N}_0)$. Assume that the conditions of Corollary 1 are satisfied. Then
$$\mathsf{P}^t_x\{\sigma_C > n\} \le C_\beta\,\frac{P^{t,N(t)-t}V_{N(t)}(x) + \lambda_{N(t)+1}^{-1}\,b_{N(t)}\,P^{t,N(t)-t}(x,C)}{e^{(n+1)\ln\beta}}. \qquad (5.8)$$
In particular, when the conditions of Remark 2 are satisfied, (5.8) reduces to
$$\mathsf{P}^t_x\{\sigma_C > n\} \le C_\beta\,\frac{V_t(x) + b_t\lambda_{t+1}^{-1}}{e^{(n+1)\ln\beta}}. \qquad (5.9)$$


V. Golomoziy

Proof The proof is a trivial application of the Chernoff inequality:
$$\mathsf{P}^t_x\{\sigma_C > n\} = \mathsf{P}^t_x\{\sigma_C \ge n+1\} \le \frac{\mathsf{E}^t_x\big[e^{\sigma_C\ln\beta}\big]}{e^{(n+1)\ln\beta}}.$$
Formulas (5.8) and (5.9) follow from Corollary 1 and Remark 2.

Using Lemma 1, we may construct dominating sequences under the assumption that the right-hand sides of (5.8) or (5.9) are bounded as functions of $t$. A nice property of such dominating sequences is that they admit finite exponential moments.
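The Chernoff bound behind Lemma 1 is easy to illustrate by simulation. In the sketch below, the toy two-state chain, its alternating return probabilities and the value of $\beta$ are all assumptions, chosen only so that $\mathsf{E}[\beta^{\sigma_C}]$ is finite.

```python
import numpy as np

# Monte Carlo illustration of the Chernoff tail bound with C = {0}.
rng = np.random.default_rng(1)

def sigma_C(t0):
    # return time to C for a chain whose one-step return probability alternates
    n = 0
    while True:
        n += 1
        p_return = 0.6 if (t0 + n - 1) % 2 == 0 else 0.5   # assumption
        if rng.random() < p_return:
            return n

beta = 1.2
samples = np.array([sigma_C(0) for _ in range(20_000)])
moment = np.mean(beta ** samples)            # estimate of E_x^t[beta^{sigma_C}]
for n in range(1, 8):
    tail = np.mean(samples > n)              # empirical P_x^t{sigma_C > n}
    # dominating sequence G_n = E[beta^sigma] * beta^{-(n+1)}, plus MC slack
    assert tail <= moment / beta ** (n + 1) + 3e-3
```

The resulting sequence $\hat G_n = \mathsf{E}[\beta^{\sigma_C}]\beta^{-(n+1)}$ is geometric, which is exactly the "finite exponential moment" property used later.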

5.4 Main Result

In this section we consider a pair of sequences of Markov kernels $(P_{0,t},\ t\in\mathbb{N}_0)$ and $(P_{1,t},\ t\in\mathbb{N}_0)$ defined on $(E,\mathcal{E})$. Let $\mathcal{E}\otimes\mathcal{E}$ be the sigma-field generated by all products $A\times B$, $A,B\in\mathcal{E}$, and for all $\lambda,\lambda'\in M_1(E)$ denote by $\lambda\otimes\lambda'$ the product measure defined on $\mathcal{E}\otimes\mathcal{E}$. We may construct the sequence of Markov kernels $\bar P_t : E^2\times\mathcal{E}\otimes\mathcal{E}\to[0,1]$ such that for all $t\in\mathbb{N}_0$, $x,y\in E$, $A\in\mathcal{E}\otimes\mathcal{E}$:
$$\bar P_t\big((x,y),A\big) = \int_{(z_0,z_1)\in A} P_{0,t}(x,dz_0)\,P_{1,t}(y,dz_1).$$
We can build the canonical space $(\bar\Omega,\bar{\mathcal{F}})$ and a family of probability measures $\mathsf{P}^t_{\lambda_0\otimes\lambda_1}$ ($\lambda_0,\lambda_1\in M_1(E)$) using the same approach as in Sect. 5.2. It is clear that every $\bar\omega\in\bar\Omega$ can be written as $\bar\omega=(\bar\omega_0,\bar\omega_1,\ldots)$, where $\bar\omega_j=(\omega_j^{(0)},\omega_j^{(1)})$, $\omega_j^{(i)}\in E$, $i\in\{0,1\}$, $j\ge 0$. For each $t\in\mathbb{N}_0$ we then have a pair of time-inhomogeneous Markov chains $(X^{(0)}_{t,n},X^{(1)}_{t,n},\ n\ge 0)$ such that $X^{(0)}_{t,n}(\bar\omega)=\omega^{(0)}_{t+n}$ and, similarly, $X^{(1)}_{t,n}(\bar\omega)=\omega^{(1)}_{t+n}$. It follows from the construction that for all $A\in\mathcal{E}$, $i\in\{0,1\}$:
$$\mathsf{P}^t_{\lambda_0\otimes\lambda_1}\big\{X^{(i)}_n\in A\big\} = \int_E \lambda_i(dx)\,P_i^{t,n}(x,A),$$
where, for $i\in\{0,1\}$, $P_i^{t,n}(x,A)=\big(\prod_{k=0}^{n-1}P_{i,t+k}\big)(x,A)$. For a given set $C\in\mathcal{E}$ we define hitting and return times to $C\times C$:
$$\bar\tau_{t,C\times C} = \inf\big\{n\ge 0 : \big(X^{(0)}_{t,n},X^{(1)}_{t,n}\big)\in C\times C\big\}, \qquad
\bar\sigma_{t,C\times C} = \inf\big\{n\ge 1 : \big(X^{(0)}_{t,n},X^{(1)}_{t,n}\big)\in C\times C\big\},$$
and the shift operator on $\bar\Omega$: $\bar\theta\big((\bar\omega_0,\bar\omega_1,\ldots)\big)=(\bar\omega_1,\bar\omega_2,\ldots)$.


We will also need the individual probabilities and expectations $\mathsf{P}^t_{i,\lambda}$, $\mathsf{E}^t_{i,\lambda}$, $i\in\{0,1\}$, $\lambda\in M_1(E)$. They should be understood as the canonical probabilities and expectations generated by the sequences $(P_{0,t},\ t\in\mathbb{N}_0)$ or $(P_{1,t},\ t\in\mathbb{N}_0)$ separately. Going forward we will drop the bottom index $t$ in the context of $\mathsf{P}_{\lambda_0\otimes\lambda_1}$, as described in Sect. 5.2. We will need the following conditions:

Condition A: There exists a set $\alpha\in\mathcal{E}$ which is an aperiodic atom for both $(P_{0,t})$ and $(P_{1,t})$.

Condition D1: Assume that Condition (D) holds true for each of the sequences $(P_{i,t},\ t\in\mathbb{N}_0,\ i\in\{0,1\})$ with $V^{(i)}_t$, $\lambda^{(i)}_t$ and $\beta^{(i)}_t$. Assume also that there exist $\beta>1$ and constants $C^{(i)}_\beta>0$ such that for $i\in\{0,1\}$ and $t,n\ge 0$,
$$\lambda^{(i)}_t\,\beta^{n} \le C^{(i)}_\beta \prod_{k=1}^{n}\big(\lambda^{(i)}_{t+k}\big)^{-1}. \qquad (5.10)$$

Now we introduce notation specific to the proof of the main result. Let Condition (A) hold true, so that $\alpha$ is an aperiodic atom for both chains. Then we can assume, without loss of generality, that there are $m>0$ and $\gamma_0>0$ such that
$$\gamma_0 = \inf_{t\in\mathbb{N}_0,\ i\in\{0,1\}}\big\{P_i^{t,m}(\alpha,\alpha),\,P_i^{t,m+1}(\alpha,\alpha),\ldots,P_i^{t,2m-1}(\alpha,\alpha)\big\} > 0. \qquad (5.11)$$

Let us define a sequence of "coupling trials" $\nu_{t,k}$:
$$\nu_{t,-1} = \min\{\bar\sigma_{t,\alpha\times E},\ \bar\sigma_{t,E\times\alpha}\}, \qquad \nu_{t,0} = \max\{\bar\sigma_{t,\alpha\times E},\ \bar\sigma_{t,E\times\alpha}\},$$
$$\nu_{t,n+1} = \begin{cases} \infty, & \text{if } \nu_{t,n}=\infty,\\[2pt] \min\{k\ge\nu_{t,n}+m : X^{(1)}_{t,k}\in\alpha\}, & \text{if } X^{(0)}_{t,\nu_{t,n}}\in\alpha,\\[2pt] \min\{k\ge\nu_{t,n}+m : X^{(0)}_{t,k}\in\alpha\}, & \text{if } X^{(1)}_{t,\nu_{t,n}}\in\alpha, \end{cases} \qquad (5.12)$$
where $n\ge 0$ and $m$ is from (5.11). We also introduce the notation
$$U_{t,n} = \nu_{t,n}-\nu_{t,n-1},\ n\ge 0, \qquad \tau_t = \min\{k\ge 0 : \nu_{t,k-1}=\nu_{t,k}\}. \qquad (5.13)$$

$U_{t,n}$ can be understood as the time of the next hit of $\alpha$ by $X^{(1-i)}$ after an $m$-step delay, given that $X^{(i)}_{t,\nu_{t,n}}\in\alpha$; $\tau_t$ is the number of the first successful coupling trial, and $\nu_{t,\tau_t}$ is an index such that $\big(X^{(0)}_{t,\nu_{t,\tau_t}},X^{(1)}_{t,\nu_{t,\tau_t}}\big)\in\alpha\times\alpha$ for the first time. The main reason why we added $m$ steps of delay is to ensure that the renewal probabilities are bounded away from $0$, which is the critical element of the proof. Let us also define a family of sigma-fields:
$$\mathcal{B}_{t,n} = \sigma\big(\bar{\mathcal{F}}_{\nu_{t,n-1}},\,U_{t,n}\big),\ n\ge 0. \qquad (5.14)$$
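The coupling-trial construction above can be sketched in simulation. The code below is a simplified version of the bookkeeping: it stops as soon as both chains sit in $\alpha$ simultaneously rather than tracking the condition $\nu_{t,k-1}=\nu_{t,k}$ explicitly, and the two kernel sequences, the atom and the delay $m$ are all illustrative assumptions.

```python
import numpy as np

# Simulation sketch of the coupling trials (5.12) for two chains on {0, 1},
# with atom alpha = {0}; kernels and the delay m are assumptions.
rng = np.random.default_rng(2)
alpha, m, T = 0, 2, 2_000

def simulate():
    X = np.ones((2, T + 1), dtype=int)
    for n in range(T):
        for i in (0, 1):
            p = 0.55 if (n + i) % 2 == 0 else 0.45   # two different kernel sequences
            X[i, n + 1] = alpha if rng.random() < p else 1
    return X

def simultaneous_hit(X):
    # nu_0: the later of the two first visits to alpha; then m-delayed trials
    nu = int(max(np.argmax(X[0, 1:] == alpha), np.argmax(X[1, 1:] == alpha)) + 1)
    while X[0, nu] != alpha or X[1, nu] != alpha:
        i = 0 if X[0, nu] == alpha else 1            # chain currently sitting in alpha
        later = np.nonzero(X[1 - i, nu + m:] == alpha)[0]
        if later.size == 0:
            return None                              # no coupling within the horizon
        nu = nu + m + int(later[0])
    return nu

hits = [simultaneous_hit(simulate()) for _ in range(50)]
assert all(h is not None for h in hits)              # coupling occurred in every run
```

Each iteration of the loop is one trial: the chain that is not in $\alpha$ is given at least $m$ steps before its next visit is inspected, mirroring the delayed minimum in (5.12).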

Theorem 2 Let $(P_{0,t},\ t\in\mathbb{N}_0)$ and $(P_{1,t},\ t\in\mathbb{N}_0)$ be two sequences of Markov kernels. Assume that Condition (A) is satisfied and there exist a constant $\beta>1$ and sets $\tilde E_0,\tilde E_1\in\mathcal{E}$ such that $\alpha\subset\tilde E_0\cap\tilde E_1$ and for all $(x,y)\in\tilde E_0\times\tilde E_1$:
$$\sup_t\Big(\mathsf{E}^t_{0,x}\big[\beta^{\sigma_\alpha}\big] + \mathsf{E}^t_{1,y}\big[\beta^{\sigma_\alpha}\big]\Big) < \infty.$$
Then there exist a constant $M>0$ and $\delta\in(1,\beta]$ such that the following inequality holds true:
$$\mathsf{E}^t_{x,y}\big[\delta^{\bar\sigma_{\alpha\times\alpha}}\big] \le M\Big(\mathsf{E}^t_{0,x}\big[\beta^{\sigma_\alpha}\big] + \mathsf{E}^t_{1,y}\big[\beta^{\sigma_\alpha}\big]\Big). \qquad (5.15)$$
The constant $M$ can be expressed as
$$M = 1 + \frac{1}{1-\sqrt{(1-\gamma)(1+\varepsilon)}}, \qquad (5.16)$$
where $\gamma,\varepsilon>0$ are constants such that $(1-\gamma)(1+\varepsilon)<1$.

Proof Since for every $\bar\omega\in\bar\Omega$ we have the inequality $\bar\sigma_{t,\alpha\times\alpha}(\bar\omega)\le\nu_{t,\tau_t}(\bar\omega)$, for all $(x,y)\in\tilde E_0\times\tilde E_1$ we get:

$$
\begin{aligned}
\mathsf{E}^t_{x,y}\big[\beta^{\bar\sigma_{\alpha\times\alpha}}\big] &\le \mathsf{E}^t_{x,y}\big[\beta^{\nu_\tau}\big] = \sum_{k=0}^{\infty}\mathsf{E}^t_{x,y}\big[1_{\tau=k}\,\beta^{\nu_k}\big] \\
&\le \mathsf{E}^t_{x,y}\big[\beta^{\nu_0}\big] + \sum_{k=1}^{\infty}\mathsf{E}^t_{x,y}\big[1_{\tau>k-1}\,\beta^{\nu_k}\big] \qquad (5.17) \\
&\le \mathsf{E}^t_{x,y}\big[\beta^{\nu_0}\big] + \sum_{k=0}^{\infty}\Big(\mathsf{P}^t_{x,y}\{\tau>k\}\Big)^{1/2}\Big(\mathsf{E}^t_{x,y}\big[\beta^{2\nu_{k+1}}\big]\Big)^{1/2}.
\end{aligned}
$$

The last inequality follows from the Cauchy--Schwarz inequality. By Lemma 6 with $r(k)=1$ there exists $\gamma\in(0,1)$ such that $\bar{\mathsf{P}}^t\big(\tau>j\mid\bar{\mathcal{F}}_{\nu_{j-1}}\big)\le(1-\gamma)1_{\tau_t>j-1}$, which entails
$$\mathsf{P}^t_{x,y}\{\tau>k\} \le (1-\gamma)^k. \qquad (5.18)$$
Note that $\nu_{t,k+1}=\nu_{t,k}+U_{t,k+1}$ and $\nu_{t,k}$ is $\mathcal{B}_{t,k}$-measurable. Let us select $\varepsilon$ such that $(1+\varepsilon)(1-\gamma)<1$. Since $\sup_{t,i}\mathsf{E}^t_{i,\alpha}[\beta^{\sigma_\alpha}]<\infty$, we can apply Lemma 4 and find $\delta\in(1,\beta)$ such that
$$\mathsf{E}^t_{x,y}\big[\delta^{2\nu_{k+1}}\big] = \mathsf{E}^t_{x,y}\Big[\delta^{2\nu_k}\,\mathsf{E}^t\big[\delta^{2U_{k+1}}\mid\mathcal{B}_k\big]\Big] \le (1+\varepsilon)\,\mathsf{E}^t_{x,y}\big[\delta^{2\nu_k}\big]. \qquad (5.19)$$
Applying (5.19) recursively we obtain the estimate
$$\mathsf{E}^t_{x,y}\big[\delta^{2\nu_{k+1}}\big] \le (1+\varepsilon)^{k+1}\,\mathsf{E}^t_{x,y}\big[\delta^{2\nu_0}\big]. \qquad (5.20)$$
Plugging (5.20) and (5.18) into (5.17), and taking into account that (5.17) remains true if we replace $\beta$ with $\delta$, we get:

$$
\begin{aligned}
\mathsf{E}^t_{x,y}\big[\delta^{\bar\sigma_{\alpha\times\alpha}}\big] &\le \mathsf{E}^t_{x,y}\big[\delta^{\nu_0}\big]\Big(1+\sum_{k=0}^{\infty}\big((1-\gamma)(1+\varepsilon)\big)^{k/2}\Big) \\
&\le \mathsf{E}^t_{x,y}\big[\delta^{\nu_0}\big]\Big(1+\frac{1}{1-\sqrt{(1-\gamma)(1+\varepsilon)}}\Big) \qquad (5.21) \\
&\le \Big(\mathsf{E}^t_{0,x}\big[\delta^{\sigma_\alpha}\big]+\mathsf{E}^t_{1,y}\big[\delta^{\sigma_\alpha}\big]\Big)\Big(1+\frac{1}{1-\sqrt{(1-\gamma)(1+\varepsilon)}}\Big).
\end{aligned}
$$
Since $\delta\le\beta$, (5.21) renders
$$\mathsf{E}^t_{x,y}\big[\delta^{\bar\sigma_{\alpha\times\alpha}}\big] \le \Big(\mathsf{E}^t_{0,x}\big[\beta^{\sigma_\alpha}\big]+\mathsf{E}^t_{1,y}\big[\beta^{\sigma_\alpha}\big]\Big)\Big(1+\frac{1}{1-\sqrt{(1-\gamma)(1+\varepsilon)}}\Big), \qquad (5.22)$$
which proves the theorem with $M = 1+\dfrac{1}{1-\sqrt{(1-\gamma)(1+\varepsilon)}}$.

Theorem 2 establishes the existence of the exponential moment; however, it can be difficult to verify its conditions and to find the constants $\delta,\varepsilon,\gamma$ which are necessary to calculate $M$ using (5.16). To address this problem, we state the next result.

Theorem 3 Let $(P_{i,t},\ i\in\{0,1\},\ t\in\mathbb{N}_0)$ be two sequences of Markov kernels. Assume that Condition (A) and Condition (D1) hold true. Assume additionally:

1. There exist constants $\hat C>0$, $\hat\beta>\beta$ such that
$$\mathsf{P}^t_{i,\alpha}\{\sigma_\alpha>n\} \le \hat C\hat\beta^{-n}, \quad\text{so that}\quad \hat m := \sum_{n\ge 0}\hat C\hat\beta^{-n} = \frac{\hat C\hat\beta}{\hat\beta-1} < \infty.$$
2. There exist $m>0$ and $\gamma_0>0$ such that for $i\in\{0,1\}$,
$$\inf_{t\in\mathbb{N}_0}\big\{P_i^{t,m}(\alpha,\alpha),\ldots,P_i^{t,2m-1}(\alpha,\alpha)\big\} \ge \gamma_0.$$
3. There exist sets $A_i\in\mathcal{E}$, $A_i\ne\emptyset$, $i\in\{0,1\}$, such that for all $x\in A_i$,
$$\sup_t P_i^{t,N(t)-t}V^{(i)}_{N(t)}(x) < \infty.$$

Then the following inequality holds true for $x\in A_0\cup\alpha$, $y\in A_1\cup\alpha$:
$$\mathsf{E}^t_{x,y}\big[\delta^{\bar\sigma_{\alpha\times\alpha}}\big] \le M\Big(C^{(0)}_\beta W^{(0)}_t(x) + C^{(1)}_\beta W^{(1)}_t(y)\Big), \qquad (5.23)$$
where
$$W^{(0)}_t(x) = P_0^{t,N(t)-t}V^{(0)}_{N(t)}(x) + \frac{b^{(0)}_{N(t)}}{\lambda^{(0)}_{N(t)+1}}\,\mathsf{P}^t_{0,x}\big\{X_{N(t)-t}\in\alpha\big\},$$
$$W^{(1)}_t(y) = P_1^{t,N(t)-t}V^{(1)}_{N(t)}(y) + \frac{b^{(1)}_{N(t)}}{\lambda^{(1)}_{N(t)+1}}\,\mathsf{P}^t_{1,y}\big\{X_{N(t)-t}\in\alpha\big\}, \qquad (5.24)$$
$$M = 1+\frac{1}{1-\sqrt{(1-\gamma)(1+\varepsilon)}}, \qquad \gamma = \gamma_0\big(1-\hat G_m\big)^{\frac{\hat m-\hat G_m}{\hat G_m}}, \qquad \delta = (1+\varepsilon/2)^{\frac{1}{m+n_0}},$$
$$n_0 = \bigg\lfloor \ln\Big(\frac{\varepsilon(\hat\beta-\beta)}{2\hat C\beta^{m+1}}\Big)\Big/\ln\Big(\frac{\beta}{\hat\beta}\Big)\bigg\rfloor + 3.$$
Here $\varepsilon$ is an arbitrary constant such that $\varepsilon<\frac{\gamma}{1-\gamma}$, and $\lfloor a\rfloor$ is the integer part of a real number $a$.

Proof Since Condition (D1) is satisfied, we can apply Corollary 1 and get, for every $x\in A_i\setminus\alpha$, $i\in\{0,1\}$: $\sup_t\mathsf{E}^t_{i,x}[\beta^{\sigma_\alpha}]<\infty$. Condition 1 implies that $\sup_t\mathsf{E}^t_{i,\alpha}[\beta^{\sigma_\alpha}]<\infty$. So the conditions of Theorem 2 are satisfied with $\tilde E_i = A_i\cup\alpha$. The formulas for $W^{(0)}_t(x)$ and $W^{(1)}_t(y)$ follow from Theorem 1, the formula for the constant $M$ is proven in Theorem 2, and the formulas for $\delta$ and $n_0$ are from Lemma 4. The formula for $\gamma$ follows from Lemmas 2 and 3.

Remark 3 In the case when all $n_k$ from Condition (D) are equal to 1, the formulas for $W^{(0)}_t(x)$ and $W^{(1)}_t(y)$ in (5.24) can be simplified to
$$W^{(0)}_t(x) = V^{(0)}_t(x) + \frac{b^{(0)}_t}{\lambda^{(0)}_{t+1}}1_\alpha(x), \qquad W^{(1)}_t(y) = V^{(1)}_t(y) + \frac{b^{(1)}_t}{\lambda^{(1)}_{t+1}}1_\alpha(y). \qquad (5.25)$$

Remark 4 Condition 1 in Theorem 3 seems more restrictive than Condition (D1), but in fact it can be derived from Condition (D1), as shown in Lemma 1. In this case we should find $\beta'\in(1,\beta)$ and set $\hat\beta=\beta$ and $\beta=\beta'$, which will satisfy the conditions of Theorem 3. We stated Condition 1 in Theorem 3 as a separate condition because $\hat m$ and $\hat G_m$ are used in the obtained bounds. And, of course, for some particular chains it is possible to find a better $\hat G_n$ than the one provided by Lemma 1.

Next, we show how the bounds for an exponential moment can be applied to the ergodicity of inhomogeneous Markov chains. Conditions that guarantee strong and weak ergodicity of inhomogeneous Markov chains are well known. Strong ergodicity was investigated in the papers [5, 28], and a criterion for weak ergodicity was established in [21, 26]. Condition (D) does not imply even weak ergodicity as defined in [21, 26] (unless $\sup_{x,t}V_t(x)<\infty$, which does not hold in practice), but this condition is sufficient for convergence in norm of $n$-step transition probabilities. Rates of such convergence have been studied in the papers [1, 8]. In [1], a geometric drift condition was used to establish convergence rates. However, Condition (D) in the present paper is less restrictive than the one in [1], since we allow some $\lambda_t$ to be greater than 1. The main difference with the result in [1] is that we establish bounds for geometric sums $\sum_{k=0}^{\infty}\delta^k\|P^{t,k}(x,\cdot)-P^{t,k}(y,\cdot)\|$, while [1] is concerned with bounds for


a single term $\|P^{t,k}(x,\cdot)-P^{t,k}(y,\cdot)\|$. In order to prove an estimate for the geometric sum defined above, in the next theorem we apply the coupling method to two copies of the same inhomogeneous Markov chain started with different initial distributions. This allows us to show that the sum is bounded by the exponential moment of the simultaneous hitting time, and we can use Theorem 2 or 3 to obtain computable bounds in terms of exponential moments of each chain or of the test functions from Condition (D). The next theorem is a well-known fact for homogeneous Markov chains, and the proof follows the same arguments as used for homogeneous chains (see [7], Chaps. 8 and 13). We state the theorem here to demonstrate one possible application of Theorems 2 and 3 and to highlight the importance of the existence of the exponential moment.

Theorem 4 Let $(P_t,\ t\in\mathbb{N}_0)$ be a sequence of Markov kernels that admits an aperiodic atom $\alpha\in\mathcal{E}$, and let $\lambda,\lambda'\in M_1(E)$ be two probability measures such that for all $t\in\mathbb{N}_0$, $\mathsf{P}^t_\lambda\{\sigma_\alpha<\infty\}=\mathsf{P}^t_{\lambda'}\{\sigma_\alpha<\infty\}=1$. Assume that there exists $\beta>1$ such that $\mathsf{E}^t_\alpha[\beta^{\sigma_\alpha}]<\infty$. Then there exists $\delta\in(1,\beta)$ satisfying the following inequality:
$$\sum_{k\ge 0}\delta^k\big\|\lambda P^{t,k}-\lambda'P^{t,k}\big\| \le \frac{1}{\delta-1}\Big(\mathsf{E}^t_{\lambda\otimes\lambda'}\big[\delta^{\bar\sigma_{\alpha\times\alpha}}\big]-1\Big).$$

Proof We conduct the proof using the standard coupling technique adapted for time-inhomogeneous chains. Let $f:E\to\mathbb{R}$ be a bounded measurable function. Consider the chains $X^{(0)}_{t,n}$ and $X^{(1)}_{t,n}$ as two copies of the same time-inhomogeneous chain with the sequence of Markov kernels $(P_t,\ t\in\mathbb{N}_0)$. Then
$$
\begin{aligned}
\mathsf{E}^t_{0,\lambda}\big[f(X^{(0)}_n)\big] &= \mathsf{E}^t_{\lambda\otimes\lambda'}\big[f(X^{(0)}_n)\big]
= \sum_{k=0}^{n}\mathsf{E}^t_{\lambda\otimes\lambda'}\big[f(X^{(0)}_n)1_{\bar\sigma_{\alpha\times\alpha}=k}\big] + \mathsf{E}^t_{\lambda\otimes\lambda'}\big[f(X^{(0)}_n)1_{\bar\sigma_{\alpha\times\alpha}>n}\big] \\
&= \sum_{k=0}^{n}\mathsf{E}^t_{\lambda\otimes\lambda'}\Big[\mathsf{E}^{t+k}_{\alpha\times\alpha}\big[f(X^{(0)}_{n-k})\big]1_{\bar\sigma_{\alpha\times\alpha}=k}\Big] + \mathsf{E}^t_{\lambda\otimes\lambda'}\big[f(X^{(0)}_n)1_{\bar\sigma_{\alpha\times\alpha}>n}\big] \\
&= \sum_{k=0}^{n}\mathsf{P}^t_{\lambda\otimes\lambda'}\{\bar\sigma_{\alpha\times\alpha}=k\}\,P^{t+k,n-k}f(\alpha) + \mathsf{E}^t_{\lambda\otimes\lambda'}\big[f(X^{(0)}_n)1_{\bar\sigma_{\alpha\times\alpha}>n}\big].
\end{aligned}
$$
A similar identity holds true for $\mathsf{E}^t_{1,\lambda'}\big[f(X^{(1)}_n)\big]$. By the identities obtained above,
$$\Big|\mathsf{E}^t_{0,\lambda}\big[f(X^{(0)}_n)\big]-\mathsf{E}^t_{1,\lambda'}\big[f(X^{(1)}_n)\big]\Big| \le \mathsf{E}^t_{\lambda\otimes\lambda'}\Big[\big|f(X^{(0)}_n)-f(X^{(1)}_n)\big|\,1_{\bar\sigma_{\alpha\times\alpha}>n}\Big] \le \sup_{x,y\in E}|f(x)-f(y)|\;\mathsf{P}^t_{\lambda\otimes\lambda'}\{\bar\sigma_{\alpha\times\alpha}>n\}.$$


Then
$$\big\|\lambda P^{t,n}-\lambda'P^{t,n}\big\| = \sup_{A\in\mathcal{E}}\Big|\mathsf{E}^t_{0,\lambda}\big[1_A(X^{(0)}_n)\big]-\mathsf{E}^t_{1,\lambda'}\big[1_A(X^{(1)}_n)\big]\Big| \le \mathsf{P}^t_{\lambda\otimes\lambda'}\{\bar\sigma_{\alpha\times\alpha}>n\}.$$
Finally, we arrive at
$$\sum_{n\ge 0}\delta^n\big\|\lambda P^{t,n}-\lambda'P^{t,n}\big\| \le \sum_{n\ge 0}\delta^n\,\mathsf{P}^t_{\lambda\otimes\lambda'}\{\bar\sigma_{\alpha\times\alpha}>n\} = \frac{1}{\delta-1}\Big(\mathsf{E}^t_{\lambda\otimes\lambda'}\big[\delta^{\bar\sigma_{\alpha\times\alpha}}\big]-1\Big).$$
The theorem is proved.
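Theorem 4 can be spot-checked exactly on a finite state space, where both sides are computable by iterating distributions. In the sketch below the two-state kernels, the initial laws and the value of $\delta$ are illustrative assumptions; the exponential moment of $\bar\sigma_{\alpha\times\alpha}$ is evaluated on the product chain by harvesting the mass arriving in $(\alpha,\alpha)$.

```python
import numpy as np

# Exact check of the geometric-sum bound with atom alpha = {0}.
def kernel(t):
    return (np.array([[0.5, 0.5], [0.6, 0.4]]) if t % 2 == 0
            else np.array([[0.55, 0.45], [0.5, 0.5]]))

lam0, lam1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
delta, N = 1.05, 400

# E[delta^{sigma-bar}] on the product chain: step the joint law, harvest (0,0)
mu = np.outer(lam0, lam1).ravel()
E = 0.0
for n in range(1, N):
    P = kernel(n - 1)
    mu = mu @ np.kron(P, P)        # independent coordinates, same kernel sequence
    E += delta ** n * mu[0]
    mu[0] = 0.0                    # absorb at (0,0): sigma-bar has occurred
rhs = (E - 1.0) / (delta - 1.0)

# left-hand side: sum_n delta^n ||lam P^{t,n} - lam' P^{t,n}|| (total variation)
lhs, a, b = 0.0, lam0.copy(), lam1.copy()
for n in range(N):
    lhs += delta ** n * 0.5 * np.abs(a - b).sum()
    a, b = a @ kernel(n), b @ kernel(n)
assert lhs <= rhs
```

On this example the geometric sum is far below the bound, because the two marginal laws merge much faster than the worst case that the coupling argument allows.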

5.5 Auxiliary Lemmas

Lemma 2 For a sequence of Markov kernels $(P_t,\ t\in\mathbb{N}_0)$, let $\alpha$ be an aperiodic atom, and let
$$\gamma_0 = \inf_t\big\{P^{t,m}(\alpha,\alpha),\,P^{t,m+1}(\alpha,\alpha),\ldots,P^{t,2m-1}(\alpha,\alpha)\big\} > 0.$$
Then, for all $t,n\ge 0$,
$$P^{t,2m+n}(\alpha,\alpha) \ge \gamma_0\prod_{k=0}^{n}\mathsf{P}^{t+k}_\alpha\{\sigma_\alpha\le m+n-k\} = \gamma_0\prod_{k=0}^{n}\mathsf{P}^{t+n-k}_\alpha\{\sigma_\alpha\le m+k\} > 0. \qquad (5.26)$$

Proof We prove the lemma by induction. Let us start with $n=0$ in (5.26):
$$P^{t,2m}(\alpha,\alpha) = \sum_{k=1}^{2m}\mathsf{P}^t_\alpha\{\sigma_\alpha=k\}\,P^{t+k,2m-k}(\alpha,\alpha) \ge \sum_{k=1}^{m}\mathsf{P}^t_\alpha\{\sigma_\alpha=k\}\,P^{t+k,2m-k}(\alpha,\alpha) \ge \gamma_0\sum_{k=1}^{m}\mathsf{P}^t_\alpha\{\sigma_\alpha=k\} = \gamma_0\,\mathsf{P}^t_\alpha\{\sigma_\alpha\le m\}.$$

Assume that inequality (5.26) is true for all $t\in\mathbb{N}_0$ and $k\le n$; let us check it for $n+1$. Using the first-entrance decomposition (see [23], Chap. 8, p. 174) we can write
$$
\begin{aligned}
P^{t,2m+n+1}(\alpha,\alpha) &= \sum_{k=1}^{2m+n+1}\mathsf{P}^t_\alpha\{\sigma_\alpha=k\}\,P^{t+k,2m+n+1-k}(\alpha,\alpha)
\ \ge\ \sum_{k=1}^{m+n+1}\mathsf{P}^t_\alpha\{\sigma_\alpha=k\}\,P^{t+k,2m+n+1-k}(\alpha,\alpha) \\
&= \sum_{k=1}^{n+1}\mathsf{P}^t_\alpha\{\sigma_\alpha=k\}\,P^{t+k,2m+n+1-k}(\alpha,\alpha) + \sum_{k=n+2}^{m+n+1}\mathsf{P}^t_\alpha\{\sigma_\alpha=k\}\,P^{t+k,2m+n+1-k}(\alpha,\alpha) \\
&\ge \gamma_0\sum_{k=1}^{n+1}\mathsf{P}^t_\alpha\{\sigma_\alpha=k\}\prod_{j=0}^{n+1-k}\mathsf{P}^{t+n+1-j}_\alpha\{\sigma_\alpha\le m+j\} + \gamma_0\,\mathsf{P}^t_\alpha\{n+2\le\sigma_\alpha\le m+n+1\} \\
&\ge \gamma_0\sum_{k=1}^{n+1}\mathsf{P}^t_\alpha\{\sigma_\alpha=k\}\prod_{j=0}^{n}\mathsf{P}^{t+n+1-j}_\alpha\{\sigma_\alpha\le m+j\} + \gamma_0\,\mathsf{P}^t_\alpha\{n+2\le\sigma_\alpha\le m+n+1\} \\
&\ge \Big(\gamma_0\prod_{j=0}^{n}\mathsf{P}^{t+n+1-j}_\alpha\{\sigma_\alpha\le m+j\}\Big)\mathsf{P}^t_\alpha\{\sigma_\alpha\le m+n+1\}
= \gamma_0\prod_{j=0}^{n+1}\mathsf{P}^{t+n+1-j}_\alpha\{\sigma_\alpha\le m+j\}.
\end{aligned}
$$
Here the third line uses the inductive hypothesis for the first sum and the definition of $\gamma_0$ for the second, and the fourth line uses that a product of probabilities does not increase when factors are added.

Since for every $t\in\mathbb{N}_0$ we have $\mathsf{P}^t_\alpha\{\sigma_\alpha\le m\}\ge P^{t,m}(\alpha,\alpha)>0$, and the sequence $\mathsf{P}^t_\alpha\{\sigma_\alpha\le n\}$ is increasing in $n$, each factor in the product (5.26) is positive. Therefore $P^{t,2m+n}(\alpha,\alpha)>0$ for all $n\ge 0$.

Lemma 3 Let the conditions of Lemma 2 hold true, and assume there exists a sequence of decreasing, non-negative numbers $\{\hat G_n,\ n\ge 0\}$ such that $\hat G_n\ge\sup_t\mathsf{P}^t_\alpha\{\sigma_\alpha>n\}$ and $\sum_{n\ge 0}\hat G_n = M<\infty$. Assume also that $\hat G_m<1$. Then for
$$\gamma = \gamma_0\big(1-\hat G_m\big)^{\frac{M-\hat G_m}{\hat G_m}} > 0$$
and for all $t,n\ge 0$ we have the lower bound
$$P^{t,2m+n}(\alpha,\alpha) \ge \gamma > 0. \qquad (5.27)$$

Proof Lemma 2 yields
$$P^{t,2m+n}(\alpha,\alpha) \ge \gamma_0\prod_{k=0}^{n}\big(1-\hat G_{m+k}\big). \qquad (5.28)$$

The fact that (5.28) entails (5.27) is proved in Theorem 4.1 of [10]. Note that the condition $\hat G_m<1$ is not restrictive: since $\sum_{n=0}^{\infty}\hat G_n<\infty$ and $\{\hat G_n,\ n\ge 0\}$ is nonincreasing, there necessarily exists $n_0$ such that $\hat G_k<1$ for all $k>n_0$. In case $m\le n_0$, Lemma 2 shows that it is always possible to choose another, bigger $m$ at the cost of a smaller $\gamma_0$.

The next three lemmas are adjusted versions of known results from the homogeneous theory (see [7], Chap. 13 for more details). The main difference is that we study here two different inhomogeneous chains rather than two copies of the same homogeneous chain. In the next lemma, for example, we have conditions and estimates that differ from the homogeneous analogue.

Lemma 4 The following statements hold.

1. Let $(P_t,\ t\in\mathbb{N}_0)$ be a sequence of Markov kernels that admits an aperiodic atom $\alpha$, and let there exist a constant $\beta>1$ such that
$$\sup_t\mathsf{E}^t_\alpha\big[\beta^{\sigma_\alpha}\big] < \infty. \qquad (5.29)$$

Then for every $m\ge 0$ and every $\varepsilon>0$ there exists $\delta=\delta(m,\varepsilon)\in(1,\beta)$ such that
$$\sup_{t,n}\mathsf{E}^t_\alpha\big[\delta^{m+\tau_\alpha\circ\theta_n}\big] \le 1+\varepsilon. \qquad (5.30)$$

2. If, additionally, there exist a dominating sequence $\hat G_n$ and constants $\hat C>0$, $\hat\beta>\beta$ such that for all $t,k\ge 0$
$$\mathsf{P}^t_\alpha\{\sigma_\alpha>k\} \le \hat G_k \le \hat C\hat\beta^{-k}, \qquad (5.31)$$
then
$$\delta = (1+\varepsilon/2)^{\frac{1}{m+n_0}}, \qquad n_0 = \bigg\lfloor\ln\Big(\frac{\varepsilon(\hat\beta-\beta)}{2\hat C\beta^{m+1}}\Big)\Big/\ln\Big(\frac{\beta}{\hat\beta}\Big)\bigg\rfloor+3, \qquad (5.32)$$

where $\lfloor a\rfloor$ is the integer part of a real number $a$.

Proof First we wish to establish the inequality
$$\mathsf{P}^t_\alpha\{\tau_\alpha\circ\theta_n=k\} \le \sum_{j=1}^{n}\mathsf{P}^{t+n-j}_\alpha\{\sigma_\alpha=k+j\}. \qquad (5.33)$$
In order to do this we perform the following transformations:
$$
\begin{aligned}
\mathsf{P}^t_\alpha\{\tau_\alpha\circ\theta_n=k\} &= \sum_{j\ge 0}\mathsf{P}^t_\alpha\big\{\sigma^{(j)}_\alpha<n\le\sigma^{(j+1)}_\alpha,\ \tau_\alpha\circ\theta_n=k\big\}
= \sum_{j=0}^{\infty}\sum_{i=0}^{n-1}\mathsf{P}^t_\alpha\big\{\sigma^{(j)}_\alpha=i\big\}\,\mathsf{P}^{t+i}_\alpha\{\sigma_\alpha=k+n-i\} \\
&= \sum_{i=0}^{n-1}\mathsf{P}^{t+i}_\alpha\{\sigma_\alpha=k+n-i\}\,\mathsf{P}^t_\alpha\{X_i\in\alpha\}
\le \sum_{i=0}^{n-1}\mathsf{P}^{t+i}_\alpha\{\sigma_\alpha=k+n-i\} = \sum_{j=1}^{n}\mathsf{P}^{t+n-j}_\alpha\{\sigma_\alpha=k+j\}.
\end{aligned}
$$


Now we can derive that for all $l\ge 0$
$$\sum_{k=l}^{\infty}\beta^k\,\mathsf{P}^t_\alpha\{\tau_\alpha\circ\theta_n=k\} \le \sum_{k=l}^{\infty}\sum_{j=1}^{n}\beta^k\,\mathsf{P}^{t+n-j}_\alpha\{\sigma_\alpha=k+j\} = \sum_{j=1}^{n}\beta^{-j}\sum_{k\ge l+j}\beta^k\,\mathsf{P}^{t+n-j}_\alpha\{\sigma_\alpha=k\}. \qquad (5.34)$$

1. We assume that (5.29) holds true. Let us denote $\beta_1=\sqrt\beta$ and $\xi_t=\beta_1^{\sigma_{t,\alpha}}$; note that the random variables $\xi_t$ are defined on different probability spaces, as described in Sect. 5.2. Condition (5.29) implies $\sup_t\mathsf{E}^t_\alpha|\xi_t|^2 = \sup_t\mathsf{E}^t_\alpha[\beta^{\sigma_\alpha}]<\infty$, which means that the family of distributions of $\xi_t$ is uniformly integrable. We introduce a special notation for its tails: $a^{(t)}_n = \sum_{k\ge n}\beta_1^k\,\mathsf{P}^t_\alpha\{\sigma_\alpha=k\}$. So we have $\sup_t a^{(t)}_n\to 0$ as $n\to\infty$. Then (5.34) yields
$$\sup_n\sum_{k=l}^{\infty}\beta_1^k\,\mathsf{P}^t_\alpha\{\tau_\alpha\circ\theta_n=k\} \le \sum_{j=1}^{\infty}\beta_1^{-j}\sup_t a^{(t)}_{l+j} \to 0, \quad l\to\infty.$$
The latter expression implies that we can find a number $n_0>0$ such that
$$\sum_{k>n_0}\beta_1^k\,\mathsf{P}^t_\alpha\{\tau_\alpha\circ\theta_n=k\} \le \frac{\varepsilon}{2\beta_1^m}. \qquad (5.35)$$
We now choose $\delta\in(1,\beta_1)$ such that $\delta^{m+n_0}\le 1+\varepsilon/2$. Then we have:
$$
\begin{aligned}
\mathsf{E}^t_\alpha\big[\delta^{m+\tau_\alpha\circ\theta_n}\big] &= \mathsf{E}^t_\alpha\big[\delta^{m+\tau_\alpha\circ\theta_n}1_{\tau_\alpha\circ\theta_n\le n_0}\big] + \mathsf{E}^t_\alpha\big[\delta^{m+\tau_\alpha\circ\theta_n}1_{\tau_\alpha\circ\theta_n>n_0}\big] \\
&\le \delta^{m+n_0} + \beta_1^m\sum_{k>n_0}\mathsf{E}^t_\alpha\big[\beta_1^k 1_{\tau_\alpha\circ\theta_n=k}\big] \qquad (5.36) \\
&= \delta^{m+n_0} + \beta_1^m\sum_{k>n_0}\beta_1^k\,\mathsf{P}^t_\alpha\{\tau_\alpha\circ\theta_n=k\} \le 1+\varepsilon/2+\frac{\varepsilon\beta_1^m}{2\beta_1^m} = 1+\varepsilon.
\end{aligned}
$$

2. Assume that condition (5.31) holds true. Using trivial β k = (β − 1) we get for all l ≥ 1,

k−1 i=0

β i + 1,

116

V. Golomoziy ∞

j {σα = k} ≤ (β − 1) β k Pt+n− α

k= j+l

⎡ ≤ (β − 1) ⎣

j+l−2



∞ k−1

j {σα = k} + Gˆ l+ j−1 β i Pt+n− α

k=l+ j i=0

β i Gˆ j+l−1 +

i=0

⎤ j {σα > i}⎦ + Gˆ l+ j−1 β i Pt+n− α

i> j+l

⎤ l+ j−1

β − 1 ≤ (β − 1) ⎣ β i Gˆ i ⎦ + Gˆ l+ j−1 Gˆ j+l−1 + β −1 i> j+l

j+l−1 ˆ =β β i Gˆ i G j+l−1 + (β − 1) ⎡

i> j+l

  j+l−1

 β i  β  j+l−1 β(βˆ − 1) β ˆ ˆ − 1) C. ≤ Cˆ + C(β = ˆ βˆ βˆ βˆ − β i> j+l β Plugging this inequality into (5.34) we get sup n



Ptα {τα ◦ θn = k} ≤ Cˆ

k=l

  ∞ β(βˆ − 1) − j β j+l−1 β βˆ − β j=1 βˆ

β(βˆ − 1) = Cˆ βˆ − β



 l−1 l−1 ∞ β β β βˆ − j = Cˆ . ˆ hatβ β − β βˆ j=1

(5.37) We can now find number n 0 ≥ 2 such that

Ptα {τα ◦ θn = k} ≤ Cˆ

k>n 0

 n 0 −2 β ε ≤ . 2β m βˆ − β βˆ β

(5.38)

From (5.38) we can derive a direct expression for n 0 ,   ( ε(βˆ − β) β + 3, n 0 = ln / ln m+1 ˆ βˆ 2Cβ ' 

which proves the formula for n 0 in (5.32). The proof is completed by setting 1 δ = (1 + ε/2) m+n0 and applying transformations (5.36) with β instead of β1 . In the next two lemmas, we will use notation from Sect. 5.4 and assume that conditions of Theorem 2 hold true. Lemma 5 Let h : E → R+ be a measurable function. Then ∀t ∈ N0 :   "   #   t,U 1α X ν(i)j−1 Et h X ν(i)j |B j = 1α X ν(i)j−1 Pi j h(α).

(5.39)

5 Computable Bounds of Exponential Moments of Simultaneous Hitting …

117

Proof We prove formula (5.39) using the definition of conditional expectation. The t,U random variable Pi j h(α) is B j -measurable by construction of B j . It is enough to prove that for any set A ∈ Fν j−1 : " " # # t,U Etx,y 1 A 1U j =k 1α (X ν(i)j−1 )h(X ν(i)j ) = Etx,y 1 A 1U j =k 1α (X ν(i)j−1 )Pi j h(α) .

(5.40)

Using the definition of ν j we get " " " ## # Etx,y 1 A 1U j =k 1α (X ν(i)j−1 )h(X ν(i)j ) = Etx,y 1 A 1α (X ν(i)j−1 )Et 1U j =k h(X ν(i)j )|Fν j−1 " " ## = Etx,y 1 A 1α (X ν(i)j−1 )Et 1U j =k h(X ν(i)j−1 +k )|Fν j−1 " " ## = Etx,y 1 A 1α (X ν(i)j−1 )Et 1ν j−1 +q+τ (1−i) ◦θν j−1 +q=k h(X ν(i)j−1 +k )|Fν j−1  " # = Etx,y 1 A 1α (X ν(i)j−1 )EtX (i) ,X (1−i) 1τ (1−i) ◦θq =k−q h(X k(i) ) ν j−1 ν j−1 " " # # t h(X k(i) ) P X ν(1−i) {τ (1−i) ◦ θq = k − q} = Etx,y 1 A 1α (X ν(i)j−1 )Ei,α j−1 " " # # (i) t t (i) t = Ei,α h(X k ) Ex,y 1 A 1α (X ν j−1 )P {U j = k|Fν j−1 } " " # # t,U = Pit,k h(α)Etx,y 1 A 1U j =k 1α (X ν(i)j−1 ) = Etx,y 1 A 1U j =k 1α (X ν(i)j−1 )Pi j h(α) . So, formula (5.40) and thus (5.39) is proved. Lemma 6 Let r (n), n ≥ 0 be a nonnegative sequence. Then there is γ < 1 such that



Et 1τ > j r (ν j )|Fν j−1 ≤ (1 − γ )1τ > j−1 Et r (ν j )|Fν j−1 .

(5.41)

t Proof In this proof all random variables    outsideE should be understood as having (i) lower index t, that is, 1α X ν(i)j−1 = 1α X t,νt, j−1 . We have

 

  "   # 1α X ν(i)j−1 Et 1τ > j r (ν j )|Fν j−1 = 1α X ν(i)j−1 Et 1τ > j−1 1αc X ν(i)j r (ν j )|Fν j−1   " "   # # = 1τ > j−1 1α X ν(i)j−1 Et Et 1αc X ν(i)j |B j r (ν j )|Fν j−1 . Using Lemma 5 we get   " "   # # 1τ > j−1 1α X ν(i)j−1 Et Et 1αc X ν(i)j |B j r (ν j )|Fν j−1   " # t,U = 1τ > j−1 1α X ν(i)j−1 Et Pi j (α, α c )r (ν j )|Fν j−1   " # t,U = 1τ > j−1 1α X ν(i)j−1 Et (1 − Pi j (α, α))r (ν j )|Fν j−1 .

118

V. Golomoziy

By Lemma 3, Pit,2m+n (α, α) ≥ γ , ∀n ≥ 0, and since U j ≥ 2m:   " # t,U 1τ > j−1 1α X ν(i)j−1 Et (1 − Pi j (α, α))r (ν j )|Fν j−1  

≤ (1 − γ )1τ > j−1 1α X ν(i)j−1 Et r (ν j )|Fν j−1 . It means, that we have established the following relation  

 

1α X ν(i)j−1 Et 1τ > j r (ν j )|Fν j−1 ≤ (1 − γ )1τ > j−1 1α X ν(i)j−1 Et r (ν j )|Fν j−1 . (5.42) We may note that by the definition of ν j−1 : "    # 1α X ν(0)j−1 + 1α X ν(1)j−1 1τ > j−1 = 1.

(5.43)

Now we sum inequalities (5.42) for i ∈ {0, 1} and using (5.43) derive (5.41).

Appendix We state here the Comparison Theorem, it is proved in [7], Theorem 4.3.1. Theorem 5 Let {Vn , n ≥ 0}, {Yn , n ≥ 0}, and {Z n , n ≥ 0} be three {Fn , n ≥ 0}adapted nonnegative processes such that for all n ≥ 0,

E Vn+1 |Fn + Zn ≤ Vn + Yn , P– a.s.. Then for every {Fn , n ≥ 0}-stopping time τ , E [Vτ 1τ 0. Denote St as the price of this undert . lying asset at a discrete-time t, and define the price relative dynamics as X t = SSt−1 Assume that X t depends on the situation of the economy Z at time t, more specifically X j , which is the price dynamics given Z = j, follows a probability distribution  qj xj = uj P(X = x j ) = 1 − q j x j = d j = 1/u j j

e(r −δ)h −d

(6.1)



with q j = u j −d j j , u j = eσ j h , where σ j > σ j  for j < j  . Here r > 0 is the continuously compounded risk-free rate and to exclude arbitrage opportunities assume d j < e(r −δ)h < u j ,

j ∈ Z.

(6.2)

Note that our economy is classified according to the volatility σi , with high volatility referring to the bad economy and low volatility to good economy. Indeed, low volatility indicates usually a stable market. By the risk-neutral option valuation theory, the price of a European option is the expected payoff under a risk-neutral probability measure, discounted at the risk-free interest rate. We assume here that a risk-neutral probability measure Q is chosen by the market and is given to us. All expectations and distributions in this paper are under this probability measure Q. Indeed, the probability distribution (6.1) given above is the well-known risk-neutral probability distribution for a binomial model with a continuous dividend yield.



> Important

Our model is not a particular case of the generic model in [14] as it does not satisfy its assumption 3.1 (i). Indeed, Assumption 3.1 (i) in [14] states that X j is stochastically  less than or equal to X j for two different states if j ≤ j  . However, in our model, it  is trivial to show that E[X j ] = e(r −δ)h for any state j, and Var[X j ] > Var[X j ] for  j < j  , implying that X j and X j cannot be ordered in terms of stochastic increasing.

6 Valuation and Optimal Strategies for American Options …

125



! Attention

It is never optimal to early exercise an American call if the underlying is a nondividend-paying stock and r > 0, which is a well-known model-free result on American call options. Hence to avoid triviality we assume the dividend yield δ > 0 in all our discussions on American call options below.

At each time period, the economic situation in the spot market cannot be observed directly. However, we are able to receive observation Y , such as economic indicators that provide incomplete information related to the real economic situation Z . Observation Y comes from a finite set, Y = {1, 2, ..., m}. Let  = [γ jθ ] j∈Z,θ∈Y be an observed conditional probability matrix that describes the relationships between the economic situation and the observations. Here, γ jθ = P(Y = θ |Z = j) is the element of  in j-th row and θ -th column. Let π = (π1 , ..., πn ) be a probability vector that the information about expresses n the economic situation. Here, πi = P (Z = i) , i=1 πi = 1. In this research, π is called the economy information vector. At any time period, the pair (s, π ) is called a process state, meaning that the current asset price is s and the information vector which reflects the economic situation is π . At the beginning of every time period, the holder can select one of two actions: early exercise or hold. If the holder decides to early exercise, a payoff of ve (st ) = max{K − st , 0} (reps. ve (st ) = max{st − K , 0}) is received for put (reps. call) option, where K is the strike price and st is the underlying asset price at time t.

6.3 Pricing of American Options At the beginning of every time period, an early-exercising decision is made based on the current process state (s, π ). Under the decision to hold for one more time period, the information vector at the beginning of the next time period is updated to T(π , θ ), given the observation θ with probability ψ(θ |π). Here, ψ(θ |π) =

n n  

πi pi j γ jθ ,

(6.3)

j=1 i=1

and the j-th element of the updated information vector T(π , θ ) is n T j (π , θ ) =

πi pi j γ jθ . ψ(θ |π)

i=1

(6.4)

126

L. Jin et al.

We now formulate the optimal stopping problem using a partially observable Markov decision process. Let N be the number of the remaining time periods to maturity. i.e. N = T / h at the beginning of option transaction, and N = 0 at maturity. Consider an American put option with the current process state (s, π ), strike price K and remaining periods to maturity N . The option price v N (s, π ), is given by ⎧ − s, 0} = veN (s) ⎪ ⎨ max{K m 2  

v N (s, π ) = max β ψ(θ |π) v N −1 sx kj , T(π, θ ) P(x kj ) = v hN (s, π ) ⎪ ⎩ θ=1

k=1

where x 1j = u j , x 2j = d j , and β = e−r h (0 < β < 1) is the discount factor. Let the quantity veN (s, π ) be the value/payoff if the holder exercises the option at the beginning of the current time period, and v hN (s, π ) be the value if the holder decides to hold and follow the optimal strategy in the remaining periods. Note that v hN (s, π ) is valued as the discounted expected payoff for one time period, i.e. in a similar way to an European option. Since the payoff of early exercise does not depend on the remaining time periods N , we use ve (s) instead of veN (s) in the following. When the time period expires, v0h (s, π ) = 0, hence v0 (s, π ) = max{ve (s), v0h (s, π )} = ve (s).

(6.5)

6.4 Some Properties for Optimal Strategy In this section, the structural properties of the optimal total payoff function are derived.

6.4.1 Preliminaries First define a totally positive property of order 2 (see [4]), abbreviated as TP2 , which is used in this research. Definition 1 If for two vectors x = (x1 , x2 , ..., xn ), and y = (y1 , y2 , ..., yn ) xi x j yi y j ≥ 0,

1 ≤ i < j ≤ n,

holds, it is said that y dominates x in the sense of totally positive ordering of order TP2

2, denoted by x ≺ y. Definition 2 Let X = [xi j ]i j be an n × m matrix for which det(B) ≥ 0 for every submatrix B = [xik jl ]kl of dimensions 2 × 2 where 1 ≤ i 1 < i 2 ≤ n, 1 ≤ j1 < j2 ≤

6 Valuation and Optimal Strategies for American Options …

127

m. Matrix X is said to have a property of totally positive of order two, denoted by X ∈ TP2 . In this research, the following important conditions are assumed. Note that we consider functions as increasing or decreasing in the weak sense throughout this paper. (A-1) The transition probability matrix for economic situation P has a TP2 property. (A-2) The conditional probability matrix for observation  has a TP2 property. Assumption (A-1) asserts that, as the economy gets better, it tends to move to a more progressing situation in the next time period. Assumption (A-2) implies that a better economic situation gives rise to higher output levels for the observations probabilistically.

6.4.2 Lemmata From the assumptions (A-1) and (A-2), we obtain the following lemmata and properties to establish the structural properties of American option prices. We begin our preparation by citing two results on TP2 ordering from [4, 11]. These are fundamental properties of TP2 vectors and matrices. n Lemma 1 ([4]) If f (i) is a decreasing/increasing function of i,then i=1 πi1 f (i) decreases/increases in π in the sense of TP2 ordering. Lemma 2 ([11]) If P is a (k P × k) TP2 matrix, and Q is a (k × k Q ) TP2 matrix, then PQ is a (k P × k Q ) TP2 matrix. TP2

Lemma 3 Under assumptions (A-1) and (A-2), T(π , θ1 ) ≺ T(π, θ2 ) holds for any π and 1 ≤ θ1 < θ2 ≤ m. TP2

Lemma 4 Under assumptions (A-1) and (A-2), T(π 1 , θ ) ≺ T(π 2 , θ ) holds for any TP2

θ and π 1 ≺ π 2 . Lemmas 3 and 4 establish the monotonicity on θ and π of T(π , θ ) in the sense of TP2 , respectively. Here, T(π, θ ) is the updated information vector of the next time period given the current information vector π . We omit the proofs for Lemmas 3 and 4 since they can be obtained by developing Eq. (6.4) from assumptions (A-1) and (A-2). Lemma 5 The following inequality holds for j < j  : 2  k=1

max(K − sx kj , 0)P(x kj ) ≥

2  k=1

max(K − sx kj , 0)P(x kj ).

128

L. Jin et al.

Proof Recall that P(x 1j ) = P(sx j = su j ) = q j ,

P(x 2j ) = P(sx j = sd j ) = 1 − q j ,

P(x 1j  ) = P(sx j  = su j  ) = q j  ,

P(x 2j  ) = P(sx j  = sd j  ) = 1 − q j  .

It is straightforward to show that E[sx j ] = E[sx j  ] = e(r −δ)h s. Hence, the following is true (6.6) q j su j + (1 − q j )sd j = q j  su j  + (1 − q j  )sd j  , and Eq. (6.6) is the starting point of our proof. Note that s > 0 and sd j < sd j  < su j  < su j . Depending on the value of strike price K , we have one of the following cases: (i) K < sd j , (ii) sd j ≤ K < sd j  , (iii) sd j  ≤ K < su j  ; (iv) su j  ≤ K < su j and (v) K ≥ su j . We prove the lemma for each of the above cases. For case (i) 2 

max(K − sx kj , 0)P(x kj ) =

k=1

2 

max(K − sx kj , 0)P(x kj ) = 0.

k=1

The lemma holds with equality. For case (ii), the lemma holds due to the facts that 2 

max(K −

sx kj , 0)P(x kj )

> 0,

k=1

2 

max(K − sx kj , 0)P(x kj ) = 0.

k=1

For case (iii), the lemma reduces to (1 − q j )(K − sd j ) ≥ (1 − q j  )(K − sd j  ). In Eq. (6.6), we replace su j in the left hand side and su j  by the strike price K in the right hand side. Since su j > su j  and q j , q j  ∈ (0, 1), by Eq. (6.6) we obtain q j K + (1 − q j )sd j < q j  K + (1 − q j  )sd j  , which leads to (1 − q j )(K − sd j ) > (1 − q j  )(K − sd j  ). Then, the proof for case (iii) is complete. Consider case (iv). The lemma now reduces to (1 − q j )(K − sd j ) ≥ q j  (K − su j  ) + (1 − q j  )(K − sd j  ). Since K < su j , we have (1 − q j )sd j < q j  su j  + (1 − q j  )sd j  − q j K from Eq. (6.6), and the above can be written as (1 − q j )(K − sd j ) > q j  (K − su j  ) + (1 − q j  )(K − sd j  ). The proof for this case is complete. Finally for case (v), the lemma in this case is q j (K − su j ) + (1 − q j )(K − sd j ) ≥ q j  (K − su j  ) + (1 −  q j  )(K − sd j  ) and it holds with equality from Eq. (6.6). Lemma 6 The following is true for for j < j  : 2 

max(sx kj − K , 0)P(x kj ) ≥

k=1

Proof Let f 1 = Note that

2 

max(sx kj − K , 0)P(x kj ).

k=1

2 k=1

max(sx kj − K , 0)P(x kj ),

f2 =

2 k=1

max(K − sx kj , 0)P(x kj )

6 Valuation and Optimal Strategies for American Options …

f1 − f2 =

129

2  [max(sx kj − K , 0) − max(K − sx kj , 0)]P(x kj ) k=1

=

2  (sx kj − K )P(x kj ) = e(r −δ)h s − K k=1

Hence f 1 = f 2 + e(r −δ)h s − K . Now it is obvious that f 1 and f 2 have the same behavior with respect to the economy state, which completes the proof.  TP2

Lemma 7 For both American put and call options, for $\pi^1 \overset{TP_2}{\prec} \pi^2$, the inequality $v_1^h(s, \pi^1) \ge v_1^h(s, \pi^2)$ holds.

Proof We prove the lemma for American put options. From Eq. (6.3),
$$v_1^h(s, \pi^1) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi^1) \sum_{k=1}^{2} v_0\bigl(sx_j^k, T(\pi^1, \theta)\bigr)P(x_j^k) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi^1) \sum_{k=1}^{2} v^e(sx_j^k)P(x_j^k) = \beta \sum_{j=1}^{n} \sum_{i=1}^{n} \pi_i^1 p_{ij} \sum_{k=1}^{2} \max(K - sx_j^k, 0)P(x_j^k).$$
Let $\pi P$ be the vector with $\sum_{i=1}^{n} \pi_i p_{ij}$ as its $j$-th element; then $\pi^1 P \overset{TP_2}{\prec} \pi^2 P$ for $\pi^1 \overset{TP_2}{\prec} \pi^2$ from Assumption (A-1) and Definition 1. Therefore, we obtain
$$v_1^h(s, \pi^1) = \beta \sum_{j=1}^{n} \sum_{i=1}^{n} \pi_i^1 p_{ij} \sum_{k=1}^{2} \max(K - sx_j^k, 0)P(x_j^k) \ge \beta \sum_{j=1}^{n} \sum_{i=1}^{n} \pi_i^2 p_{ij} \sum_{k=1}^{2} \max(K - sx_j^k, 0)P(x_j^k) = v_1^h(s, \pi^2)$$
from Lemmata 1 and 5. The case of an American call can be proved similarly from Lemmata 1 and 6. $\square$

6.4.3 Properties

From the above assumptions and lemmata, we provide some properties which are important for investigating the optimal strategies for an American option. A decision of the option buyer is made at the beginning of every time period. This discretization of decision making enables us to use induction on the steps of the option price iteration as a proof technique.


L. Jin et al.

At first, we obtain some properties of the value function for holding, $v_N^h(s, \pi)$.

Proposition 1 For a put (call) option, $v_N^h(s, \pi)$ is decreasing (increasing) in $s$ for any $N$ and $\pi$ under Assumptions (A-1) and (A-2).

Proof We focus on American put options first and prove the claim by mathematical induction. For $N = 1$, the following holds for $s < s'$ using Eq. (6.5):
$$v_1^h(s, \pi) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} v_0\bigl(sx_j^k, T(\pi, \theta)\bigr)P(x_j^k) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} v^e(sx_j^k)P(x_j^k) \ge \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} v^e(s'x_j^k)P(x_j^k) = v_1^h(s', \pi).$$

For $N = n - 1$, assume that $v_{n-1}^h(s, \pi)$ is decreasing in $s$. Next, we prove the monotonicity in $s$ of $v_n^h(s, \pi)$. For $N = n$,
$$v_n^h(s, \pi) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} v_{n-1}\bigl(sx_j^k, T(\pi, \theta)\bigr)P(x_j^k) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} \max\bigl\{v^e(sx_j^k),\, v_{n-1}^h\bigl(sx_j^k, T(\pi, \theta)\bigr)\bigr\}P(x_j^k) \ge \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} \max\bigl\{v^e(s'x_j^k),\, v_{n-1}^h\bigl(s'x_j^k, T(\pi, \theta)\bigr)\bigr\}P(x_j^k) = v_n^h(s', \pi)$$
holds for $s < s'$ by the inductive hypothesis for $N = n - 1$, which proves the claim. The monotone increase of the holding value function for an American call can be derived similarly. $\square$

Proposition 2 For both put and call options, $v_N^h(s, \pi)$ is increasing in the number of remaining time periods $N$ for any $s$ and $\pi$ under Assumptions (A-1) and (A-2).

Proof We prove the proposition by mathematical induction for the American put. For $N = 1$, it is obvious that $v_1^h(s, \pi) \ge v_0^h(s, \pi)$. For $N = n - 1$, assume that $v_{n-1}^h(s, \pi) \ge v_{n-2}^h(s, \pi)$; then for $N = n$,

$$v_n^h(s, \pi) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} v_{n-1}\bigl(sx_j^k, T(\pi, \theta)\bigr)P(x_j^k) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} \max\bigl\{v^e(sx_j^k),\, v_{n-1}^h\bigl(sx_j^k, T(\pi, \theta)\bigr)\bigr\}P(x_j^k) \ge \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} \max\bigl\{v^e(sx_j^k),\, v_{n-2}^h\bigl(sx_j^k, T(\pi, \theta)\bigr)\bigr\}P(x_j^k) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} v_{n-2}\bigl(sx_j^k, T(\pi, \theta)\bigr)P(x_j^k) = v_{n-1}^h(s, \pi)$$

from the inductive hypothesis for $N = n - 1$. Therefore, Proposition 2 holds, and we obtain the same result for the American call. $\square$

Proposition 3 For both put and call options, $v_N^h(s, \pi)$ is decreasing in $\pi$ in the sense of $TP_2$ for any $N$ and $s$ under Assumptions (A-1) and (A-2).

Proof We consider the American put first. Since $v_0^h(s, \pi) = 0$ for every $s$ and $\pi$, we have $v_0^h(s, \pi^1) = v_0^h(s, \pi^2)$, and
$$v_1^h(s, \pi^1) \ge v_1^h(s, \pi^2)$$
holds for $\pi^1 \overset{TP_2}{\prec} \pi^2$ from Lemma 7. Next, assume that
$$v_{n-1}^h(s, \pi^1) \ge v_{n-1}^h(s, \pi^2) \quad \text{for } \pi^1 \overset{TP_2}{\prec} \pi^2 \tag{6.7}$$
and prove that
$$v_n^h(s, \pi^1) \ge v_n^h(s, \pi^2) \quad \text{for } \pi^1 \overset{TP_2}{\prec} \pi^2 \tag{6.8}$$
holds for $N = n$. We first focus on $\sum_{k=1}^{2} v_{n-1}\bigl[sx_j^k, T(\pi^1, \theta)\bigr]P(x_j^k)$.

$$\sum_{k=1}^{2} v_{n-1}\bigl(sx_j^k, T(\pi^1, \theta)\bigr)P(x_j^k) = \sum_{k=1}^{2} \max\bigl\{v^e(sx_j^k),\, v_{n-1}^h\bigl(sx_j^k, T(\pi^1, \theta)\bigr)\bigr\}P(x_j^k) \ge \sum_{k=1}^{2} \max\bigl\{v^e(sx_j^k),\, v_{n-1}^h\bigl(sx_j^k, T(\pi^1, \theta')\bigr)\bigr\}P(x_j^k) = \sum_{k=1}^{2} v_{n-1}\bigl(sx_j^k, T(\pi^1, \theta')\bigr)P(x_j^k)$$


for $\theta < \theta'$ from the induction hypothesis given by Eq. (6.7) and Lemma 3. This means that $\sum_{k=1}^{2} v_{n-1}\bigl(sx_j^k, T(\pi^1, \theta)\bigr)P(x_j^k)$ is a decreasing function of $\theta$. Similarly,
$$\sum_{k=1}^{2} v_{n-1}\bigl(sx_j^k, T(\pi^1, \theta)\bigr)P(x_j^k) \ge \sum_{k=1}^{2} v_{n-1}\bigl(sx_j^k, T(\pi^2, \theta)\bigr)P(x_j^k). \tag{6.9}$$

Next, look at $\psi(\cdot|\pi) = (\psi(1|\pi), \ldots, \psi(m|\pi))$. By Lemma 2,
$$\psi(\cdot|\pi^1) \overset{TP_2}{\prec} \psi(\cdot|\pi^2) \tag{6.10}$$
for $\pi^1 \overset{TP_2}{\prec} \pi^2$ under Assumptions (A-1) and (A-2). From Eqs. (6.9) and (6.10), the following holds:
$$v_n^h(s, \pi^1) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi^1) \sum_{k=1}^{2} v_{n-1}\bigl(sx_j^k, T(\pi^1, \theta)\bigr)P(x_j^k) \ge \beta \sum_{\theta=1}^{m} \psi(\theta|\pi^2) \sum_{k=1}^{2} v_{n-1}\bigl(sx_j^k, T(\pi^1, \theta)\bigr)P(x_j^k) \ge \beta \sum_{\theta=1}^{m} \psi(\theta|\pi^2) \sum_{k=1}^{2} v_{n-1}\bigl(sx_j^k, T(\pi^2, \theta)\bigr)P(x_j^k) = v_n^h(s, \pi^2)$$
on the basis of Lemmata 1 and 4. This establishes Eq. (6.8). The property for an American call can be derived in the same way. $\square$



Since the value of early exercise is given by $v^e(s) = \max\{K - s, 0\}$ ($v^e(s) = \max\{s - K, 0\}$), which is a decreasing (increasing) function of $s$, we obtain the following properties of the American put (call) option price from the above properties.

Proposition 4 Under Assumptions (A-1) and (A-2), $v_N(s, \pi)$ is monotonically decreasing (increasing) in $s$ for any $\pi$ for an American put (call) option.

Proposition 5 $v_N(s, \pi)$ is increasing in the number of remaining time periods $N$ for any $s$ and $\pi$ under Assumptions (A-1) and (A-2).

Proposition 6 Under Assumptions (A-1) and (A-2), $v_N(s, \pi)$ is monotonically decreasing in $\pi$ (in the sense of $TP_2$ ordering) for any $s$ for both American put and call options.

Propositions 4, 5, and 6 provide a set of sufficient conditions under which the American option price is monotonic in $N$, $s$, and $\pi$. Note that in this research we investigate the strategy in the sense of the $TP_2$ ordering of $\pi$. This means that the price of the American option is monotonic in the remaining time, the asset price, and the progression of the economy.


To explore the optimal investment strategy for buyers, we also need to investigate the relationship between the value functions under the holding and early exercising decisions. Define the holding value premium $L_N(s, \pi) = \max\bigl\{0,\ v_N^h(s, \pi) - v^e(s)\bigr\}$. We study the properties of $L_N(s, \pi)$ in $N$ and $\pi$ for American options.

Proposition 7 For a put (call) option, (i) $v_N^h(s, \pi)$ is a convex function of $s$; (ii) the decreasing (increasing) rate of $v_N^h(s, \pi)$ in $s$ is less than 1 for any $\pi$ under Assumptions (A-1) and (A-2).

Proof First, we prove the convexity of $v_N^h(s, \pi)$ in $s$ for any given $\pi$ inductively. For $N = 0$, $v_0^h(s, \pi) = 0$. For $N = 1$,
$$v_1^h(s, \pi) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} v_0\bigl(sx_j^k, T(\pi, \theta)\bigr)P(x_j^k) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} v^e(sx_j^k)P(x_j^k) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} \max\bigl(K - sx_j^k, 0\bigr)P(x_j^k)$$

is a convex function of $s$.

Next, assume that $v_{n-1}^h(s, \pi)$ is a convex function of $s$ for $N = n - 1$; then
$$\lambda v_{n-1}^h(s_1, \pi) + (1 - \lambda)v_{n-1}^h(s_2, \pi) \ge v_{n-1}^h\bigl[\lambda s_1 + (1 - \lambda)s_2, \pi\bigr]$$
for $0 < \lambda < 1$ and $s_1 < s_2$. Since $v_{n-1}(s, \pi)$, the maximum of the two convex functions $v_{n-1}^h(s, \pi)$ and $v^e(s) = \max\{K - s, 0\}$, is itself a convex function of $s$, we also have
$$\lambda v_{n-1}(s_1, \pi) + (1 - \lambda)v_{n-1}(s_2, \pi) \ge v_{n-1}\bigl[\lambda s_1 + (1 - \lambda)s_2, \pi\bigr] \tag{6.11}$$

for $0 < \lambda < 1$ and $s_1 < s_2$. For $N = n$, it follows that
$$\lambda v_n^h(s_1, \pi) + (1 - \lambda)v_n^h(s_2, \pi) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} \bigl[\lambda v_{n-1}\bigl(s_1 x_j^k, T(\pi, \theta)\bigr) + (1 - \lambda)v_{n-1}\bigl(s_2 x_j^k, T(\pi, \theta)\bigr)\bigr]P(x_j^k) \ge \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} v_{n-1}\bigl((\lambda s_1 + (1 - \lambda)s_2)x_j^k, T(\pi, \theta)\bigr)P(x_j^k) = v_n^h\bigl(\lambda s_1 + (1 - \lambda)s_2, \pi\bigr)$$
from the inductive hypothesis on the convexity of $v_{n-1}(s, \pi)$ given by Eq. (6.11) for $N = n - 1$.

Next, we investigate the decreasing rate of $v_n^h(s, \pi)$ in $s$. For $N = 0$,


$$v_0^h(s_1, \pi) - v_0^h(s_2, \pi) = 0 \le s_2 - s_1$$
for $s_1 < s_2$. Assume $v_{n-1}^h(s_1, \pi) - v_{n-1}^h(s_2, \pi) \le s_2 - s_1$ for $N = n - 1$. Since $v_{n-1}(s_1, \pi) - v_{n-1}(s_2, \pi) = \max\bigl\{v_{n-1}^h(s_1, \pi), v^e(s_1)\bigr\} - \max\bigl\{v_{n-1}^h(s_2, \pi), v^e(s_2)\bigr\} \le s_2 - s_1$, then

$$v_n^h(s_1, \pi) - v_n^h(s_2, \pi) = \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} \bigl[v_{n-1}\bigl(s_1 x_j^k, T(\pi, \theta)\bigr) - v_{n-1}\bigl(s_2 x_j^k, T(\pi, \theta)\bigr)\bigr]P(x_j^k) \le \beta \sum_{\theta=1}^{m} \psi(\theta|\pi) \sum_{k=1}^{2} (s_2 - s_1)x_j^k P(x_j^k) = e^{-rh}e^{(r-\delta)h}(s_2 - s_1) = e^{-\delta h}(s_2 - s_1) \le s_2 - s_1,$$
since $0 < e^{-\delta h} < 1$. Hence the decreasing rate of $v_n^h(s, \pi)$ is less than 1. The same result can be proven for a call option. $\square$



Proposition 8 For an American put or call option, $L_N(s, \pi)$ is increasing in $N$ for any $s$ and $\pi$ under Assumptions (A-1) and (A-2).

Proposition 9 For an American put or call option, $L_N(s, \pi)$ is decreasing in $\pi$ in the sense of $TP_2$ ordering for any $N$ and $s$ under Assumptions (A-1) and (A-2).

Propositions 8 and 9 follow directly from Propositions 2 and 3 together with the fact that $v^e(s)$ is constant in both $N$ and $\pi$.

6.4.4 Optimal Strategy

Based on the properties obtained in Sect. 6.4.3, we study the structural properties of the optimal strategy for an American option. Define the stopping region and the holding region for any $N$ as follows:

• Stopping region for early exercise:
$$D_N^e = \bigl\{(s, \pi) \mid v_N^h(s, \pi) < v^e(s)\bigr\} = \bigl\{(s, \pi) \mid v_N(s, \pi) = v^e(s)\bigr\}$$
• Holding region:
$$D_N^h = \bigl\{(s, \pi) \mid v_N^h(s, \pi) > v^e(s)\bigr\} = \bigl\{(s, \pi) \mid v_N(s, \pi) = v_N^h(s, \pi)\bigr\}.$$

We consider American put options first. Figure 6.1 plots the values for holding and for early exercising as functions of $s$. From Propositions 1 and 7, we know that $v_N^h(s, \pi)$ is a convex and decreasing function of $s$. Moreover, the decreasing rate of


Fig. 6.1 Relationship between holding value, exercise value and asset price s for the case of American put option

$v_N^h(s, \pi)$ in $s$ is less than 1. For a put option, the decreasing rate of $v^e(s)$ is $-1$ for $s \in [0, K)$ and $0$ for $s \in [K, \infty)$. Consequently, there is at most one threshold $s_N^*(\pi)$ for each $\pi$ when the number of remaining periods is $N$. As shown in Fig. 6.1, the threshold separates the space of $s$ into two regions: the stopping (early exercise) region and the holding region. Furthermore, $s_N^*(\pi)$ increases with $\pi$, since $v_N^h(s, \pi)$ decreases with $\pi$ from Proposition 3. As shown in Fig. 6.1, $s_N^*(\pi) < s_N^*(\pi')$ for $\pi \overset{TP_2}{\prec} \pi'$. This means that it is preferable to hold the option in a worse economic situation, i.e., in a more volatile market. Figure 6.2 plots the values of holding and early exercise as functions of $\pi$ (within a $TP_2$-ordered set). From Proposition 3, we know that $v_N^h(s, \pi)$ is a decreasing function of $\pi$ in the sense of $TP_2$, and $v^e(s)$ is constant in $\pi$. Therefore, there exists at most one threshold $\pi^*(s)$ (as shown in Fig. 6.2). From Proposition 1, $\pi^*(s)$ decreases with $s$, which implies that it is better to exercise earlier if the asset price is lower.

Fig. 6.2 Relationship between holding value, exercise value and economy situation $\pi$ for the case of American put option


Similar properties for American call options can also be derived. This means that the information space of (s, π ) is divided into two regions for both American put and call options. We illustrate these regions using numerical examples in Sect. 6.5.

6.5 Numerical Examples

In this section, numerical examples are introduced to illustrate the monotonicity of the functions discussed above. A tree model was used to compute the prices of the American put option and of the American call option with dividend yield in a three-state economy.
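Before the three-state construction of Sect. 6.5.1, the backward induction underlying all of these computations can be sketched in a stripped-down, single-regime form. This is our simplification for illustration only: constant volatility collapses the information tree, and the CRR-style choice of $u$ and $d$ is an assumption, not the chapter's model:

```python
import math

def american_put_crr(S0, K, r, delta, sigma, T, M):
    """Single-regime CRR binomial backward induction for an American put.
    A simplified stand-in for the regime-switching tree of Sect. 6.5.1."""
    h = T / M
    u = math.exp(sigma * math.sqrt(h))
    d = 1.0 / u
    q = (math.exp((r - delta) * h) - d) / (u - d)   # risk-neutral probability
    disc = math.exp(-r * h)
    # Terminal payoffs nu_M over the M + 1 terminal nodes (j = number of ups).
    values = [max(K - S0 * u**j * d**(M - j), 0.0) for j in range(M + 1)]
    for k in range(M - 1, -1, -1):
        values = [
            max(max(K - S0 * u**j * d**(k - j), 0.0),              # exercise now
                disc * (q * values[j + 1] + (1 - q) * values[j]))  # hold
            for j in range(k + 1)
        ]
    return values[0]

# Parameters echoing Table 6.1 (sigma fixed to the middle state, our choice).
price = american_put_crr(S0=93.3, K=100.0, r=0.02, delta=0.0,
                         sigma=0.3, T=8 / 252, M=4)
assert price >= max(100.0 - 93.3, 0.0)   # American value dominates exercise value
```

In the full model the same recursion runs per information-tree node, with the mixture $\sum_i \sum_\theta h_{i,k}\gamma_{i,\theta}$ replacing the single pair $(q, 1-q)$.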

6.5.1 Model Implementation

Consider a three-state economy with three pieces of information. Let $P$ and $\Gamma$ be a transition probability matrix and a conditional probability matrix, respectively, defined by
$$P = \begin{bmatrix} p_{11} & p_{12} & p_{13} \\ p_{21} & p_{22} & p_{23} \\ p_{31} & p_{32} & p_{33} \end{bmatrix}, \qquad \Gamma = \begin{bmatrix} \gamma_{11} & \gamma_{12} & \gamma_{13} \\ \gamma_{21} & \gamma_{22} & \gamma_{23} \\ \gamma_{31} & \gamma_{32} & \gamma_{33} \end{bmatrix},$$
and the economy information vector $\pi = (\pi_1, \pi_2, \pi_3) = (\pi_1, \pi_2, 1 - \pi_1 - \pi_2)$. Assume that $P, \Gamma \in TP_2$. Let $S_0$ be the initial asset price and $K$ the strike price. Consider a tree of $M$ steps, with expiration time $T$, volatility $\sigma = (\sigma_1, \sigma_2, \sigma_3)$, interest rate $r$, and economy information vector $\pi = (\pi_1, \pi_2, \pi_3)$. The time duration of a step is $h = T/M$.

The asset price after $n$ steps on the tree depends on the state and on the numbers of times the price went up and down. Denote by $n_i$ the number of times the price increased in state $i$, $i = 1, 2, 3$, and by $m_j$ the number of times the price decreased in state $j$, $j = 1, 2, 3$. Hence,
$$s_n(n_1, n_2, n_3, m_1, m_2, m_3) = S_0 u_1^{n_1}u_2^{n_2}u_3^{n_3}d_1^{m_1}d_2^{m_2}d_3^{m_3},$$
where $n_1 + n_2 + n_3 + m_1 + m_2 + m_3 = n$. Note that $s_N = s_{M-n}$ ($n = M - N$), where $N$ is the number of remaining time periods to maturity.

The pricing of American options is explained in Sect. 6.3. Notice that the information vector $\pi$ also yields a tree. In step $n$, each node represents a vector $\pi_n^{i,j} = (\pi_{1,n}^{i,j}, \pi_{2,n}^{i,j}, \pi_{3,n}^{i,j})$, $j = 1, 2, \ldots, 3^{n-1}$, $i = 3j - 2, 3j - 1, 3j$, where $\pi_0 = \pi$. The index $j$ represents the node in step $n - 1$ to which it is connected; e.g., nodes $\pi_3^{i,3}$, $i = 7, 8, 9$, are connected to node $\pi_2^{3,1}$. Therefore, in step $n$ there is a total of $3^{n-1} \times 3 = 3^n$ nodes. To obtain the vectors $\pi_n^{i,j}$, first find the $3^{n-1}$ probability mass functions $\mathbf{h}_n^j$, $j = 1, 2, \ldots, 3^{n-1}$, from $\mathbf{h}_n^j = (h_{1,n}^j, h_{2,n}^j, h_{3,n}^j) = \pi_{n-1}^{j,k}P$, where $k$ is the only integer in the set $\bigl\{\frac{j+2}{3}, \frac{j+1}{3}, \frac{j}{3}\bigr\}$. Then, for $\pi_n^{i,j}$ and $j = 1, 2, \ldots, 3^{n-1}$, the following is true

$$\pi_n^{3j-2,j} = \bigl(\pi_{1,n}^{3j-2,j}, \pi_{2,n}^{3j-2,j}, \pi_{3,n}^{3j-2,j}\bigr) = \left(\frac{h_{1,n}^j\gamma_{1,1}}{f_n^{1,j}}, \frac{h_{2,n}^j\gamma_{2,1}}{f_n^{1,j}}, \frac{h_{3,n}^j\gamma_{3,1}}{f_n^{1,j}}\right),$$
$$\pi_n^{3j-1,j} = \bigl(\pi_{1,n}^{3j-1,j}, \pi_{2,n}^{3j-1,j}, \pi_{3,n}^{3j-1,j}\bigr) = \left(\frac{h_{1,n}^j\gamma_{1,2}}{f_n^{2,j}}, \frac{h_{2,n}^j\gamma_{2,2}}{f_n^{2,j}}, \frac{h_{3,n}^j\gamma_{3,2}}{f_n^{2,j}}\right),$$
$$\pi_n^{3j,j} = \bigl(\pi_{1,n}^{3j,j}, \pi_{2,n}^{3j,j}, \pi_{3,n}^{3j,j}\bigr) = \left(\frac{h_{1,n}^j\gamma_{1,3}}{f_n^{3,j}}, \frac{h_{2,n}^j\gamma_{2,3}}{f_n^{3,j}}, \frac{h_{3,n}^j\gamma_{3,3}}{f_n^{3,j}}\right),$$
where $\mathbf{f}_n^j = (f_n^{1,j}, f_n^{2,j}, f_n^{3,j}) = \mathbf{h}_n^j\Gamma$ collects the normalizing constants.

The American put option pricing starts at the final nodes and ends at the tree's first node. In the $M$-th step, the value of the American put option is
$$\nu_M(x_n^m) = [K - s_M(x_n^m)]^+ = \max\{K - s_M(x_n^m), 0\},$$
where $x_n^m = (n_1, n_2, n_3, m_1, m_2, m_3)$ and $n_1 + n_2 + n_3 + m_1 + m_2 + m_3 = M$. For step $k$, each node in the original tree carries $3^k$ option prices, for $j = 1, 2, \ldots, 3^k$:
$$\nu_k(x_n^m; \mathbf{h}_k^j) = \max\bigl\{[K - s_k(x_n^m)]^+,\ \tilde{B}\nu_{k+1}\bigr\},$$
where
$$\tilde{B}\nu_{k+1}(j, \mathbf{h}_{k+1}^j) = e^{-rh}\sum_{i=1}^{3}\sum_{\theta=1}^{3} h_{i,k}^j\gamma_{i,\theta}B_{k+1}(j, \mathbf{h}_{k+1}^l, i),$$
and the next-step option prices $B_{k+1}(j, \mathbf{h}_{k+1}^l, i)$, $l = 1, 2, \ldots, 3^{k+1}$, are given by
$$B_{k+1}(j, \mathbf{h}_{k+1}^l, 1) = q_1\nu_{k+1}\bigl(n_1 + 1, n_2, n_3, m_1, m_2, m_3; \mathbf{h}_{k+1}^l\bigr) + (1 - q_1)\nu_{k+1}\bigl(n_1, n_2, n_3, m_1 + 1, m_2, m_3; \mathbf{h}_{k+1}^l\bigr),$$
$$B_{k+1}(j, \mathbf{h}_{k+1}^l, 2) = q_2\nu_{k+1}\bigl(n_1, n_2 + 1, n_3, m_1, m_2, m_3; \mathbf{h}_{k+1}^l\bigr) + (1 - q_2)\nu_{k+1}\bigl(n_1, n_2, n_3, m_1, m_2 + 1, m_3; \mathbf{h}_{k+1}^l\bigr),$$
$$B_{k+1}(j, \mathbf{h}_{k+1}^l, 3) = q_3\nu_{k+1}\bigl(n_1, n_2, n_3 + 1, m_1, m_2, m_3; \mathbf{h}_{k+1}^l\bigr) + (1 - q_3)\nu_{k+1}\bigl(n_1, n_2, n_3, m_1, m_2, m_3 + 1; \mathbf{h}_{k+1}^l\bigr).$$

Let $\mathcal{S}$ be a set of initial asset prices $S_0$, and let $\Pi_i$, $i \in I$, where $I$ is an index set, be a family of sets such that if economy information vectors $\pi, \pi' \in \Pi_i$, $i \in I$, then $\pi$ and $\pi'$ are $TP_2$-comparable. The thresholds in the numerical examples are obtained as follows. First, take a finite subset $\mathcal{S}^* \subseteq \mathcal{S}$ and a set $\Pi_{TP_2}^* \in \Pi$. Then, for a fixed economy information vector $\pi \in \Pi_{TP_2}^*$, compute the option price for every $S_0 \in \mathcal{S}^*$. The initial asset price $s^* \in \mathcal{S}^*$ is the single threshold that splits the set $\mathcal{S}^*$ into an (early) exercise region $D_N^e$ and a hold region $D_N^h$.

The tree obtained from the information vector grows exponentially. This growth can be controlled by discretizing the $\pi$ space as follows. First, choose a finite number of evenly spread information vectors from the $\pi$ space. Then, find the information vector in this finite set that is closest to the one obtained at the node


and substitute it. In that way the number of different information vectors can be controlled.
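The information-vector update of Sect. 6.5.1 — prediction with $P$, then Bayesian correction using the observed signal's column of $\Gamma$ — can be sketched as follows. The helper name is ours; the test values reuse the matrices of Sect. 6.5.2 and the vector $\pi = (0.92, 0.04, 0.04)$ used in the numerical examples:

```python
def update_information_vector(pi, P, Gamma, theta):
    """One child node of the information-vector tree: predict with the
    transition matrix P, then condition on the observed signal theta
    (0-based here) via the likelihood column Gamma[:, theta]."""
    n = len(pi)
    # Prediction step: h = pi P.
    h = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    # Correction step: pi'_i proportional to h_i * Gamma[i][theta].
    unnorm = [h[i] * Gamma[i][theta] for i in range(n)]
    f = sum(unnorm)                 # normalizing constant f_n^{theta, j}
    return [x / f for x in unnorm]

P = [[0.7, 0.2, 0.1], [0.1, 0.4, 0.5], [0.05, 0.25, 0.7]]
Gamma = [[0.6, 0.2, 0.2], [0.1, 0.4, 0.5], [0.05, 0.4, 0.55]]
pi0 = [0.92, 0.04, 0.04]

for theta in range(3):
    pi1 = update_information_vector(pi0, P, Gamma, theta)
    assert abs(sum(pi1) - 1.0) < 1e-12 and all(x >= 0 for x in pi1)
```

Iterating this over all three signals at every node reproduces the $3^n$-node information tree described above.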

6.5.2 Numerical Results

In this subsection, unless said otherwise, the parameters used for computation are given in Table 6.1. The transition probability matrix (TPM) and the conditional probability matrix (CPM), which encodes the probabilistic relation between a signal and a state of the economy, are, respectively,
$$P = \begin{bmatrix} 0.7 & 0.2 & 0.1 \\ 0.1 & 0.4 & 0.5 \\ 0.05 & 0.25 & 0.7 \end{bmatrix}, \qquad \Gamma = \begin{bmatrix} 0.6 & 0.2 & 0.2 \\ 0.1 & 0.4 & 0.5 \\ 0.05 & 0.4 & 0.55 \end{bmatrix}.$$
This choice of parameters satisfies Assumptions (A-1) and (A-2). It can be seen that both matrices have the $TP_2$ property.

(i) To show the monotonicity of $\nu_N^h(s, \pi)$, $\nu_N(s, \pi)$, and $L_N(s, \pi)$ in $\pi = (\pi_1, \pi_2, \pi_3)$ (in the sense of $TP_2$ ordering) for American put options and every $s$ and $N$, a set of economy information vectors such that
$$\pi_2 = 0.05, \quad \pi_1 = \pi_2 + 0.03 \times i, \quad i = 0, 1, \ldots, 30, \quad \pi_3 = 1 - \pi_1 - \pi_2, \tag{6.12}$$
denoted by $\Pi_{TP_2}^*$, was used. Note that the sequence $\Pi_{TP_2}^* \in \Pi$ is in reverse order. Indeed, let $\pi^1 = (p_1, p_2, 1 - p_1 - p_2)$ and $\pi^2 = (q_1, q_2, 1 - q_1 - q_2)$ with $\pi^1 \ne \pi^2$. If $p_1 = 0$ and $p_1 \ne q_1$, then $\pi^1$ and $\pi^2$ are not $TP_2$ comparable.

If $p_1 \ge q_1$ and $p_2 = q_2$, then $\pi^1 \overset{TP_2}{\succ} \pi^2$. This claim can easily be derived from Definition 1, so we omit the proof here. By this result, we also note that other information vectors from the above partitioned space cannot be added to this

Table 6.1 General model test parameters

Name                             Notation    Parameters
Maturity time                    $T$         8/252
Number of steps                  $M$         4
Time duration of a step          $h$         2/252
Volatility vector                $\sigma$    (0.5, 0.3, 0.1)
Strike price                     $K$         100
Interest rate                    $r$         0.02
Dividend yield (American call)   $\delta$    0.1
TPM                              $P$         $[p_{ij}]_{i,j=1,2,3}$
CPM                              $\Gamma$    $[\gamma_{ij}]_{i,j=1,2,3}$


sequence, as each of them is not comparable to at least one of the information vectors in $\Pi_{TP_2}^*$. Note that in all relevant figures in this section, if the horizontal axis refers to $\pi$ information vectors, we plot them in descending order with respect to the $TP_2$ ordering. Figures 6.3, 6.4, and 6.5 show that $\nu_N^h(s, \pi)$, $L_N(s, \pi)$, and $\nu_N(s, \pi)$, respectively, are decreasing in $\pi$ for initial asset price $s = 93.3$ and $N = M = 4$. Figure 6.6 shows that $\nu_N(s, \pi)$ is decreasing in $\pi$ for an American call option with dividend yield $\delta = 0.1$ and initial asset price $s = 105$. Note that the no-arbitrage condition (6.2) is satisfied by our choice of parameter values.

(ii) To show the monotonicity of $\nu_N(s, \pi)$ in the number of remaining time periods $N$ for every $s$ and $\pi$, a set $N^* = \{3, 6, \ldots, 90\}$ of remaining time periods was used. Figure 6.7 shows that $\nu_N(s, \pi)$ is increasing in $N$ for $s = 93.3$ and $\pi = (0.92, 0.04, 0.04)$ for an American put option.

(iii) To show the monotonicity of $\nu_N(s, \pi)$ in $s$ for every $\pi$ and $N$, a set
$$S^* = \left\{\left(0.7 + \frac{0.4 \times i}{30}\right) \times K : i \in \{0, 1, \ldots, 30\}\right\}$$

Fig. 6.3 An example of the monotonicity in π for the holding value of an American put ν Nh (s, π) with parameters given in Table 6.1 and in (i)

Fig. 6.4 An example of the monotonicity in π of holding value premium L N (s, π ) with parameters given in Table 6.1 and in (i)


Fig. 6.5 An example of the monotonicity in π of the value of an American put option ν N (s, π) with parameters given in Table 6.1 and in (i)

Fig. 6.6 An example of the monotonicity in π of the value of an American call option with dividend yield ν N (s, π) with parameters given in Table 6.1 and in (i)

of initial asset prices $s$ was used for an American put option. Figure 6.8 shows that $\nu_N(s, \pi)$ is decreasing in $s$ for $N = M = 4$ and $\pi = (0.92, 0.04, 0.04)$.

(iv) In Sect. 6.4.4 we discussed the existence of a single threshold for the early exercising decision. To show the exercise and hold regions, as well as the monotonicity of the threshold in $\pi$, the set $\Pi_{TP_2}^*$ defined by Eq. (6.12) and the sets
$$S^* = \left\{\left(0.7 + \frac{0.3 \times i}{5000}\right) \times K : i \in \{0, 1, \ldots, 5000\}\right\}$$
for an American put, and
$$S^* = \left\{\left(1 + \frac{0.3 \times i}{5000}\right) \times K : i \in \{0, 1, \ldots, 5000\}\right\}$$
for an American call option with dividend yield $\delta = 0.1$, were used. Figures 6.9 and 6.10 show that the threshold is decreasing/increasing in $\pi$ for $N = M = 4$, as well as the exercise and hold regions for the buyer, for the American call option with dividend yield and the American put, respectively.
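All of these results rest on $P$ and $\Gamma$ being $TP_2$. For a matrix, the $TP_2$ property means every $2 \times 2$ minor is nonnegative, which is quick to verify for the matrices of this subsection (the helper function is ours):

```python
from itertools import combinations

def is_tp2(A):
    """A matrix is TP2 when every 2x2 minor is nonnegative, i.e.
    A[i1][j1]*A[i2][j2] - A[i1][j2]*A[i2][j1] >= 0 for all i1<i2, j1<j2."""
    rows, cols = len(A), len(A[0])
    return all(
        A[i1][j1] * A[i2][j2] - A[i1][j2] * A[i2][j1] >= 0
        for i1, i2 in combinations(range(rows), 2)
        for j1, j2 in combinations(range(cols), 2)
    )

P = [[0.7, 0.2, 0.1], [0.1, 0.4, 0.5], [0.05, 0.25, 0.7]]
Gamma = [[0.6, 0.2, 0.2], [0.1, 0.4, 0.5], [0.05, 0.4, 0.55]]
assert is_tp2(P) and is_tp2(Gamma)
```

The same check fails, for instance, for the anti-diagonal-heavy matrix [[0.1, 0.4], [0.4, 0.1]], whose single minor is negative.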


Fig. 6.7 An example of the monotonicity in N of the value of an American put option ν N (s, π) with parameters given in Table 6.1 and in (ii)

Fig. 6.8 An example of the monotonicity in s of the value of an American put option ν N (s, π) with parameters given in Table 6.1 and in (iii)

Fig. 6.9 An example of the optimal stopping regions for an American put option and the monotonicity of the threshold in π with parameters given in Table 6.1 and in (iv)

Figures 6.11 and 6.12 show the monotonicity of the threshold in $\pi$ for different choices of $\pi_2$ in Eq. (6.12): $\pi_2 = 0.02$ and $\pi_2 = 0.07$, respectively.

Fig. 6.10 An example of the optimal stopping regions for an American call option with dividend yield and the monotonicity of the threshold in π with parameters given in Table 6.1 and in (iv)

Fig. 6.11 An example of the optimal stopping regions for an American put option and the monotonicity of the threshold in π with parameters given in Table 6.1 and in (iv) for π2 = 0.02

Fig. 6.12 An example of the optimal stopping regions for an American put option and the monotonicity of the threshold in π with parameters given in Table 6.1 and in (iv) for π2 = 0.07


6.6 Conclusion and Future Research

We have studied American option pricing and the corresponding optimal exercising strategies under a novel model. Under our model, the asset price follows an extended binomial tree whose volatility parameter is governed by a discrete-time hidden Markov chain. We have formulated the problem as a partially observable Markov decision process and derived analytical structural properties of the American option prices and optimal exercising strategies, under a set of sufficient conditions on the transition probability matrix of the economy evolution and the conditional probabilities of observations. Our analytical results are fully illustrated with numerical examples.

For future research, we consider generalizing our model by permitting a more general probability distribution for the asset price. We also plan to conduct extensive numerical studies of the structural properties under less restrictive conditions. Such information is useful when the model is applied in practice. The results of this research are limited to the pricing of short-maturity options, because over short horizons the changes in the economic situation are simple. Over a long time period, decision-makers often face more complex economic situations, so as future work we would also like to study extensions of our model to options with longer maturities.

References

1. Aingworth, D.D., Das, S.R., Motwani, R.: A simple approach for pricing equity options with Markov switching state variables. Quant. Finance 6(2), 95–105 (2006)
2. Broadie, M., Detemple, J.: American options on dividend-paying assets. Fields Inst. Commun. 22, 69–97 (1999)
3. Hamilton, J.D.: A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57, 357–384 (1989)
4. Karlin, S.: Total Positivity, Volume I. Stanford University Press, Stanford (1968)
5. Jönsson, H.: Optimal Stopping Domains and Reward Functions for Discrete Time American Type Options. Doctoral Dissertation, No. 22, Mälardalen University (2005)
6. Kim, I.J., Byun, S.J.: Optimal exercise boundary in a binomial option pricing model. J. Financ. Eng. 3(2), 137–158 (1994)
7. Kim, I.J., Byun, S.J., Lim, S.: Valuing and hedging American options under time-varying volatility. J. Derivat. Account. 1, 195–204 (2004)
8. Kijima, M., Yoshida, T.: A simple option pricing model with Markovian volatilities. J. Oper. Res. Soc. Japan 36(3), 149–166 (1993)
9. Kukush, A.G., Silvestrov, D.S.: Structure of optimal stopping strategies for American type options. In: Uryasev, S. (ed.) Probabilistic Constrained Optimization: Methodology and Applications, Nonconvex Optim. Appl., vol. 49, pp. 173–185. Kluwer, Dordrecht (2000)
10. Kukush, A.G., Silvestrov, D.S.: Optimal pricing of American type options with discrete time. Theory Stoch. Process. 10(26)(1–2), 72–96 (2004)
11. Marshall, A.W., Olkin, I., Arnold, B.C.: Inequalities: Theory of Majorization and Its Applications. Academic Press Inc., Orlando (1979)
12. Naik, V.: Option valuation and hedging strategies with jumps in the volatility of asset returns. J. Financ. 48, 1969–1984 (1993)

13. Sato, K., Sawaki, K.: The dynamic pricing for callable securities with Markov-modulated prices. J. Oper. Res. Soc. Japan 57, 87–103 (2014)
14. Sato, K., Sawaki, K.: The dynamic valuation of callable contingent claims with a partially observable regime switch (2018). Available at SSRN: https://ssrn.com/abstract=3284489
15. Shen, Y., Siu, T.-K.: Pricing variance swaps under a stochastic interest rate and volatility model with regime-switching. Oper. Res. Lett. 41, 180–187 (2013)
16. Silvestrov, D.: American-Type Options. Stochastic Approximation Methods, Volume 1. De Gruyter Studies in Mathematics, vol. 56, p. x+509. Walter de Gruyter, Berlin (2014)
17. Silvestrov, D.: American-Type Options. Stochastic Approximation Methods, Volume 2. De Gruyter Studies in Mathematics, vol. 57, p. xi+558. Walter de Gruyter, Berlin (2015)

Chapter 7

Inequalities for Moments of Branching Processes in a Varying Environment

Ya. M. Khusanbaev and Kh. E. Kudratov

Abstract In the present paper, we give upper bounds for the moments and central moments of branching processes in a varying environment starting with a random number of particles.

Keywords Branching process · Central moments

MSC 2020 60J80, 60J85

7.1 Introduction

Let $Y_1, Y_2, \ldots$ be a sequence of random variables with range $\mathbb{N}_0 = \{0, 1, 2, \ldots\}$ and generating functions $f_1, f_2, \ldots$. Further, let $Y_{n,i}$, $n, i \in \mathbb{N}$, be independent random variables such that for each $n, i \ge 1$ the distribution of $Y_{n,i}$ coincides with the distribution of $Y_n$. Also let $\eta$ be a random variable that takes nonnegative integer values, has generating function $\varphi$, and is independent of the random variables $\{Y_{k,j},\ k, j \in \mathbb{N}\}$. We define the random variables $Z_n$, $n \ge 1$, taking values in $\mathbb{N}_0$, by the following recurrence relations:
$$Z_0 = \eta, \qquad Z_n = \sum_{j=1}^{Z_{n-1}} Y_{n,j}. \tag{7.1}$$

Ya. M. Khusanbaev (B) Uzbekistan Academy of Sciences, V. I. Romanovskiy Institute of Mathematics, 81, Mirzo-Ulugbek street, Tashkent, Uzbekistan e-mail: [email protected] Kh. E. Kudratov Samarkand State University named after Sharof Rashidov, 15, University street, Samarkand, Uzbekistan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_7


The process (7.1) is called a branching process in a varying environment starting with a random number of particles (see [1]). If all generating functions $f_i$ are the same, process (7.1) is the Galton–Watson process. Now suppose that $\varepsilon_n$, $n \in \mathbb{N}$, are random variables that take nonnegative integer values, have generating functions $h_1, h_2, \ldots$, respectively, and are independent of the random variables $\{Y_{k,j},\ k, j \in \mathbb{N}\}$ and $\eta$. We define the random variables $W_n$, $n \ge 0$, taking values in $\mathbb{N}_0$, by the following recurrence relations:
$$W_0 = \eta, \qquad W_n = \sum_{j=1}^{W_{n-1}} Y_{n,j} + \varepsilon_n. \tag{7.2}$$

The process (7.2) is called a branching process in a varying environment with varying immigration starting with a random number of particles.

In solving many problems of probability theory, estimation of the moments of random variables is one of the most important tasks. In this respect, many results have been obtained for sums of independent random variables (see [3, 4]). To calculate the moments $E Z_n^p$ for the process defined by relation (7.1), one can differentiate its generating function $E s^{Z_n}$ and obtain explicit formulas. However, the resulting expressions become more complex as $p$ increases. Therefore, upper bounds for $E Z_n^p$ play an important role. In this work, we obtain estimates for the moments and central moments of branching processes in a varying environment of the forms (7.1) and (7.2). We mainly use probabilistic methods and known inequalities for sums of independent random variables.

Introduce the following notations:
$$H_n(s) := E s^{Z_n}, \quad m_n := f_n'(1), \quad \nu_n := \frac{f_n''(1)}{(f_n'(1))^2}, \quad \Delta_n := \frac{f_n'''(1)}{(f_n'(1))^3},$$
$$\mu_0 := 1, \quad \mu_n := \prod_{k=1}^{n} m_k, \quad \gamma_n^{(p)} := E Y_{n,1}^p, \quad \Gamma_n^{(p)} := \prod_{j=1}^{n} \gamma_j^{(p)}, \quad \theta_n^{(p)} := E|Y_{n,1} - m_n|^p,$$
$$\alpha_p := E\eta^p, \quad \beta_p := E|\eta - E\eta|^p,$$
$$G_n(s) := E s^{W_n}, \quad \lambda_n := E\varepsilon_n, \quad b_n^2 := \operatorname{Var}\varepsilon_n, \quad \tau_n^{(p)} := E\varepsilon_n^p, \quad \delta_n^{(p)} := E|\varepsilon_n - \lambda_n|^p.$$
Throughout what follows, we assume that all considered moments are finite.
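Processes (7.1) and (7.2) are straightforward to simulate. The helper below is ours, and the offspring and immigration laws used in the checks are illustrative choices, not part of the paper:

```python
import random

def simulate_bpve(n, eta, offspring, immigration=None, rng=None):
    """Simulate n generations of the varying-environment branching
    process (7.1), or of (7.2) when an immigration law is supplied.
    offspring(k, rng) samples Y_{k,j}; immigration(k, rng) samples eps_k."""
    rng = rng or random.Random()
    z = eta
    for k in range(1, n + 1):
        z = sum(offspring(k, rng) for _ in range(z))
        if immigration is not None:
            z += immigration(k, rng)
    return z

rng = random.Random(0)
# Bernoulli survival: each particle survives generation k w.p. 1/(k + 1).
surv = lambda k, r: 1 if r.random() < 1.0 / (k + 1) else 0
z5 = simulate_bpve(5, eta=1000, offspring=surv, rng=rng)
assert 0 <= z5 <= 1000          # pure-death offspring can only shrink the line

# Degenerate check: unit offspring plus constant immigration gives W_n = eta + n.
w3 = simulate_bpve(3, eta=2, offspring=lambda k, r: 1,
                   immigration=lambda k, r: 1)
assert w3 == 5
```

Such simulations are handy for sanity-checking the moment bounds of the next section on concrete environments.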


7.2 Main Results

Theorem 1 Let the process (7.1) be given. Then the following inequalities hold:
1. for $0 < p \le 1$, $E Z_n^p \le \alpha_1\gamma_n^{(p)}\mu_{n-1}$;
2. for $p > 1$, $E Z_n^p \le \alpha_p\Gamma_n^{(p)}$.

Theorem 1 implies that the following statement holds for the Galton–Watson branching process starting with one particle.

Corollary 1 If $\eta \equiv 1$ and $f_i \equiv f$, $i \ge 1$, then the following inequalities hold:
1. for $0 < p \le 1$, $E Z_n^p \le m^{n-1}\gamma_p$;
2. for $p > 1$, $E Z_n^p \le (\gamma_p)^n$.

Here $m = E Y_{1,1}$ and $\gamma_p = E Y_{1,1}^p$.

Theorem 2 Let the process (7.1) be given. Then the following inequalities hold:
1. for $0 < p \le 1$, $\displaystyle E|Z_n - E Z_n|^p \le \mu_n^p\left(\alpha_1\sum_{k=1}^{n}\frac{\theta_k^{(p)}}{\mu_{k-1}^{p-1}m_k^p} + \beta_p\right)$;
2. for $1 < p \le 2$, $\displaystyle E|Z_n - E Z_n|^p \le \mu_n^p\left(2\alpha_1\sum_{k=1}^{n}2^{(n+1-k)(p-1)}\frac{\theta_k^{(p)}}{\mu_{k-1}^{p-1}m_k^p} + 2^{n(p-1)}\beta_p\right)$;
3. for $p > 2$, $\displaystyle E|Z_n - E Z_n|^p \le \mu_n^p\left(C(p)\,\alpha_{p/2}\sum_{k=1}^{n}2^{(n+1-k)(p-1)}\frac{\theta_k^{(p)}\Gamma_{k-1}^{(p/2)}}{\mu_{k-1}^{p-1}m_k^p} + 2^{n(p-1)}\beta_p\right)$.
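Corollary 1's bound for $p > 1$ is easy to probe by simulation. The critical offspring law below ($Y \in \{0, 2\}$ with probability $1/2$ each, so $m = 1$ and $\gamma_2 = 2$) is our choice; for it $E Z_n^2 = n + 1$, comfortably below the bound $(\gamma_2)^n = 2^n$:

```python
import random

rng = random.Random(42)

def z_n(n, rng):
    """Galton-Watson process with Z_0 = 1 and offspring Y in {0, 2},
    each with probability 1/2 (so m = EY = 1 and gamma_2 = EY^2 = 2)."""
    z = 1
    for _ in range(n):
        z = sum(2 if rng.random() < 0.5 else 0 for _ in range(z))
    return z

n, trials = 6, 4000
sample_mean = sum(z_n(n, rng) ** 2 for _ in range(trials)) / trials
# Corollary 1 with p = 2: E Z_n^2 <= (gamma_2)^n = 2^n (here E Z_n^2 = n + 1 = 7).
assert sample_mean <= 2 ** n
```

The bound is far from tight for critical processes; it is geometric in $n$ while the true second moment grows only linearly.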

Here $C(p)$ is a number depending only on $p$. This theorem implies the following statement for the Galton–Watson branching process starting with one particle.

Corollary 2 If $\eta \equiv 1$ and $f_i \equiv f$, $i \ge 1$, then the following inequalities hold:
1. for $0 < p \le 1$, $m \ne 1$: $\displaystyle E|Z_n - E Z_n|^p \le \frac{\theta_p m^{(n-1)p}\bigl(m^{n(1-p)} - 1\bigr)}{m^{(1-p)} - 1}$;
2. for $0 < p \le 1$, $m = 1$: $E|Z_n - E Z_n|^p \le \theta_p n$;
3. for $1 < p \le 2$, $m \ne \frac{1}{2}$: $\displaystyle E|Z_n - E Z_n|^p \le \frac{2^p\theta_p m^{n-1}\bigl((2m)^{n(p-1)} - 1\bigr)}{(2m)^{(p-1)} - 1}$;
4. for $1 < p \le 2$, $m = \frac{1}{2}$: $E|Z_n - E Z_n|^p \le 2^{(p+1-n)}\theta_p n$;
5. for $p > 2$, $(2m)^{(p-1)} \ne \gamma_{p/2}$: $\displaystyle E|Z_n - E Z_n|^p \le \frac{2^{(p-1)}C(p)\theta_p m^{n-1}\bigl((2m)^{n(p-1)} - (\gamma_{p/2})^n\bigr)}{(2m)^{(p-1)} - \gamma_{p/2}}$;
6. for $p > 2$, $(2m)^{(p-1)} = \gamma_{p/2}$: $E|Z_n - E Z_n|^p \le 2^{(p-1)}C(p)\theta_p\bigl(2^{p-1}m^p\bigr)^{n-1}n$.

Here $m = E Y_{1,1}$, $\theta_p = E|Y_{1,1} - m|^p$, and $\gamma_{p/2} = E Y_{1,1}^{p/2}$.

Theorem 3 Let the process (7.2) be given. Then the following inequalities hold:
1. for $0 < p \le 1$, $\displaystyle E W_n^p \le \gamma_n^{(p)}\mu_{n-1}\left(\alpha_1 + \sum_{j=1}^{n-1}\frac{\lambda_j}{\mu_j}\right) + \tau_n^{(p)}$;
2. for $p > 1$, $\displaystyle E W_n^p \le 2^{n(p-1)}\Gamma_n^{(p)}\alpha_p + \Gamma_n^{(p)}\sum_{j=1}^{n}\frac{\tau_j^{(p)}2^{(n+1-j)(p-1)}}{\Gamma_j^{(p)}}$.

p

1. for 0 < p ≤ 1 and m = 1, E Wn ≤ p m−1 + τp; p 2. for 0 < p ≤ 1 and m = 1, E Wn ≤ λγ p (n − 1) + τ p ; n 2 p−1 γ p ) −1) p 3. for p > 1 and 2 p−1 γ = 1, E W ≤ 2 p−1 τ (( n p

p

p

4. for p > 1 and 2 p−1 γ p = 1, E Wn ≤ 2 p−1 τ p n. p

2 p−1 γ p −1

p

Here m = EY1,1 , γ p = EY1,1 , λ = Eε1 , τ p = Eε1 . Theorem 4 Let the process (7.2) be given. Then the following inequalities hold: 1. for 0 < p ≤ 1, E|Wn − E Wn | ≤ p

μnp

 n 

 −p μk



( p) θk μk−1

k=1

k−1  λi α1 + μi i=1



 +

( p) δk

 + βp ;

2. for 1 < p ≤ 2,  E|Wn − E Wn | p ≤ μnp

n 

 −p μk 2(n+1−k)( p−1)



( p) 2 p θk μk−1

k=1 ( p)

k−1  λi α1 + μ i i=1



+ 2 p−1 δk )) + μnp 2n( p−1) β p ; 3. for p > 2, E|Wn − E Wn | p ≤ μnp

n 

 p  p −p ( p) ( 2 ) 2(k−1)( 2 −1) α 2p μk 2(n+2−k)( p−1) C( p)θk k−1

k=1

+

k−1  i=1

( 2p ) (k−i)( p −1) 2

τi

2



( p) i 2

 +

( p) δk

+ μnp 2n( p−1) β p ,

here C( p) is a number depending only on p. Theorem 4 implies the following statement for branching processes with immigration in the special case when 0 < p ≤ 1.

7 Inequalities for Moments of Branching Processes in a Varying Environment

149

Corollary 4 If η ≡ 0, f_i ≡ f, h_i ≡ h, i ≥ 1, then the following inequalities hold:

1. for 0 < p ≤ 1 and m ≠ 1,
$$E|W_n - EW_n|^p \le \frac{\lambda\theta_p}{m-1}\left(m^{p(n-1)}\frac{m^{n(1-p)}-1}{m^{(1-p)}-1} - \frac{m^{np}-1}{m^p-1}\right) + \delta_p\,\frac{m^{np}-1}{m^p-1};$$

2. for 0 < p ≤ 1 and m = 1,
$$E|W_n - EW_n|^p \le \frac{\theta_p\lambda(n-1)n}{2} + n\delta_p.$$

Here $m = EY_{1,1}$, $\theta_p = E|Y_{1,1}-m|^p$, $\lambda = E\varepsilon_1$, $\delta_p = E|\varepsilon_1 - \lambda|^p$.
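The critical-case bound of Corollary 4 (item 2) can be checked by Monte Carlo; a minimal sketch with an assumed offspring law Y ∈ {0, 2} (probability 1/2 each, so m = 1 and $\theta_{1/2} = 1$) and Poisson(1) immigration — the distributions and function names are illustrative choices, not from the chapter:

```python
import math, random

def poisson(lam, rng):
    # Knuth's method for sampling a Poisson random variate
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate_W(n, rng):
    # W_0 = 0 (eta = 0); offspring 0 or 2 w.p. 1/2 (mean m = 1); immigration Poisson(1)
    w = 0
    for _ in range(n):
        w = sum(2 * (rng.random() < 0.5) for _ in range(w)) + poisson(1.0, rng)
    return w

rng = random.Random(0)
n, p, reps = 5, 0.5, 20000
samples = [simulate_W(n, rng) for _ in range(reps)]
ew = sum(samples) / reps                      # E W_n = n here (m = 1, lambda = 1)
lhs = sum(abs(w - ew) ** p for w in samples) / reps
delta_p = sum(abs(poisson(1.0, rng) - 1.0) ** p for _ in range(20000)) / 20000
bound = 1.0 * 1.0 * (n - 1) * n / 2 + n * delta_p   # theta_p*lambda*n(n-1)/2 + n*delta_p
print(lhs < bound)
```

As expected, the simulated centered moment stays comfortably below the bound, which sums one contribution per generation.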

7.3 Auxiliary Results

To prove the main results, we need several lemmas.

Lemma 1 Let the process (7.1) be given. Then the following equalities hold for the generating functions $H_n(s)$, n ≥ 1, of the variables $Z_n$, n ≥ 1:

$$H_n(s) = \varphi(f_{1,n}(s)), \quad n \in \mathbb{N}, \tag{7.3}$$

where $f_{i,n} = f_i \circ \cdots \circ f_n$, i = 1, …, n, and $f_{i,i} = f_i$.

Proof We prove equality (7.3) by mathematical induction. First we verify it for n = 1. Taking into account the identical distribution of the random variables $\{Y_{1,i}, i \in \mathbb{N}\}$, we get

$$H_1(s) = Es^{Z_1} = E\big(E(s^{Y_{1,1}+\cdots+Y_{1,\eta}}/\eta)\big) = E(f_1(s))^{\eta} = \varphi(f_1(s)).$$

Assume that equality (7.3) holds for n = k; we prove it for n = k + 1. Taking into account the independence and identical distribution of the random variables $\{Y_{k+1,j}, j \in \mathbb{N}\}$, we obtain

$$H_{k+1}(s) = Es^{Z_{k+1}} = E\big(E(s^{Y_{k+1,1}+\cdots+Y_{k+1,Z_k}}/Z_k)\big) = E(f_{k+1}(s))^{Z_k} = H_k(f_{k+1}(s)) = \varphi(f_{1,k+1}(s)),$$

where the induction hypothesis is used in the last step. It means that equality (7.3) is true for all n ∈ ℕ.

Lemma 2 Let the process (7.1) be given. Then the following equalities hold for the first, second, and third factorial moments and the covariance of the random variables $Z_n$, n ≥ 1:

$$EZ_n = \alpha_1\mu_n, \qquad EZ_n(Z_n - 1) = \mu_n^2\Big[(\alpha_2 - \alpha_1) + \alpha_1\sum_{k=1}^{n}\frac{\nu_k}{\mu_{k-1}}\Big];$$


$$EZ_n(Z_n-1)(Z_n-2) = \mu_n^3\Big[(\alpha_3 - 3\alpha_2 + 2\alpha_1) + (\alpha_2-\alpha_1)\sum_{j=1}^{n}\frac{\nu_j}{\mu_{j-1}} - \frac{(\alpha_2-\alpha_1)^2}{\alpha_1}\Big] + \frac{(\alpha_2-\alpha_1)^2}{\alpha_1}\,\mu_n^3\Big(\sum_{j=1}^{n}\frac{\nu_j}{\mu_{j-1}}\Big)^2 + \alpha_1\mu_n^3\big((\alpha_2-\alpha_1)+\alpha_1\big)\sum_{j=1}^{n}\frac{\nu_j}{\mu_{j-1}} + \alpha_1\mu_n^3\sum_{j=1}^{n-1}\sum_{k=j+1}^{n}\frac{\nu_j\nu_k}{\mu_{j-1}\mu_{k-1}};$$

$$\mathrm{Cov}(Z_k, Z_n) = \frac{\mu_{k\vee n}}{\mu_{k\wedge n}}\,\mathrm{Var}\,Z_{k\wedge n}.$$

Proof We prove the first equality using Lemma 1. Differentiating (7.3), we obtain

$$H_n'(s) = \varphi'(f_{1,n}(s))\,f_1'(f_{2,n}(s))\,f_2'(f_{3,n}(s))\cdots f_{n-1}'(f_n(s))\,f_n'(s). \tag{7.4}$$

From this, taking into account $f_{k,n}(1) = 1$, k = 1, …, n, we get $H_n'(1) = \varphi'(1)f_1'(1)f_2'(1)\cdots f_n'(1)$.

Now we prove the second equality of Lemma 2. Taking into account that $\varphi'(s) > 0$ and $f_k'(s) > 0$, k = 1, …, n, for s ∈ (0, 1], and taking the logarithm of both sides of (7.4), we obtain

$$\ln H_n'(s) = \ln\varphi'(f_{1,n}(s)) + \ln f_1'(f_{2,n}(s)) + \ln f_2'(f_{3,n}(s)) + \cdots + \ln f_{n-1}'(f_n(s)) + \ln f_n'(s). \tag{7.5}$$

Now, differentiating equality (7.5), we have

$$\frac{H_n''(s)}{H_n'(s)} = \frac{\varphi''(f_{1,n}(s))f_1'(f_{2,n}(s))\cdots f_n'(s)}{\varphi'(f_{1,n}(s))} + \frac{f_1''(f_{2,n}(s))f_2'(f_{3,n}(s))\cdots f_n'(s)}{f_1'(f_{2,n}(s))} + \cdots + \frac{f_{n-1}''(f_n(s))f_n'(s)}{f_{n-1}'(f_n(s))} + \frac{f_n''(s)}{f_n'(s)}. \tag{7.6}$$

Since $f_{k,n}(1) = 1$, k = 1, …, n, we have

$$\frac{H_n''(1)}{H_n'(1)} = \frac{\varphi''(1)f_1'(1)\cdots f_n'(1)}{\varphi'(1)} + \frac{f_1''(1)f_2'(1)\cdots f_n'(1)}{f_1'(1)} + \cdots + \frac{f_{n-1}''(1)f_n'(1)}{f_{n-1}'(1)} + \frac{f_n''(1)}{f_n'(1)},$$


which implies

$$H_n''(1) = \alpha_1\mu_n^2\left[\frac{\varphi''(1)}{\varphi'(1)} + \frac{f_1''(1)}{m_1^2} + \frac{f_2''(1)}{\mu_1 m_2^2} + \cdots + \frac{f_{n-1}''(1)}{\mu_{n-2}m_{n-1}^2} + \frac{f_n''(1)}{\mu_{n-1}m_n^2}\right].$$

The second equality of Lemma 2 follows from the last equality. And now we turn to the proof of the third equality. Differentiating (7.6) and applying the quotient rule to each of its n + 1 summands, we obtain

$$\frac{H_n'''(s)H_n'(s) - (H_n''(s))^2}{(H_n'(s))^2} = \frac{\big[\varphi'''(f_{1,n}(s))(f_1'(f_{2,n}(s))\cdots f_n'(s))^2 + \cdots\big]\varphi'(f_{1,n}(s)) - (\varphi''(f_{1,n}(s))f_1'(f_{2,n}(s))\cdots f_n'(s))^2}{(\varphi'(f_{1,n}(s)))^2} + \cdots + \frac{f_n'''(s)f_n'(s) - (f_n''(s))^2}{(f_n'(s))^2},$$

where the elided groups are the analogous expressions arising from the summands of (7.6) for $f_1, \ldots, f_{n-1}$, each composed with the corresponding iterate $f_{k,n}$.

From this, taking into account that f k,n (1) = 1, k = 1, . . . , n, we get


$$\frac{H_n'''(1)H_n'(1) - (H_n''(1))^2}{(H_n'(1))^2} = \frac{[\varphi'''(1)(f_1'(1)\cdots f_n'(1))^2 + \varphi''(1)f_1''(1)(f_2'(1)\cdots f_n'(1))^2 + \cdots + \varphi''(1)f_1'(1)\cdots f_n''(1)]\varphi'(1) - (\varphi''(1)f_1'(1)\cdots f_n'(1))^2}{(\varphi'(1))^2}$$
$$+ \frac{[f_1'''(1)(f_2'(1)\cdots f_n'(1))^2 + f_1''(1)f_2''(1)(f_3'(1)\cdots f_n'(1))^2 + \cdots + f_1''(1)f_2'(1)\cdots f_n''(1)]f_1'(1) - (f_1''(1)f_2'(1)\cdots f_n'(1))^2}{(f_1'(1))^2}$$
$$+ \frac{[f_2'''(1)(f_3'(1)\cdots f_n'(1))^2 + f_2''(1)f_3''(1)(f_4'(1)\cdots f_n'(1))^2 + \cdots + f_2''(1)f_3'(1)\cdots f_n''(1)]f_2'(1) - (f_2''(1)f_3'(1)\cdots f_n'(1))^2}{(f_2'(1))^2} + \cdots$$
$$+ \frac{[f_{n-1}'''(1)(f_n'(1))^2 + f_{n-1}''(1)f_n''(1)]f_{n-1}'(1) - (f_{n-1}''(1)f_n'(1))^2}{(f_{n-1}'(1))^2} + \frac{f_n'''(1)f_n'(1) - (f_n''(1))^2}{(f_n'(1))^2}.$$

The third equality of Lemma 2 is obtained from the last equality, taking into account our notations. The last equality of Lemma 2 follows from the following recurrence relation, which holds for 0 ≤ k < n:

$$E((Z_k - EZ_k)(Z_n - EZ_n)/\mathcal{F}_{n-1}) = (Z_k - EZ_k)\,E((Z_n - m_n EZ_{n-1})/\mathcal{F}_{n-1}) = (Z_k - EZ_k)\,E\Big(\Big(\sum_{j=1}^{Z_{n-1}}(Y_{n,j}-m_n) + m_n(Z_{n-1} - EZ_{n-1})\Big)\Big/\mathcal{F}_{n-1}\Big) = m_n(Z_k - EZ_k)(Z_{n-1} - EZ_{n-1}),$$

where $\mathcal{F}_i = \sigma\{Z_0, Z_1, \ldots, Z_i\}$ is the σ-field generated by the random variables $Z_0, Z_1, \ldots, Z_i$. Hence $\mathrm{Cov}(Z_k, Z_n) = m_n\mathrm{Cov}(Z_k, Z_{n-1})$, 0 ≤ k < n, and the last equality of Lemma 2 follows from this recurrence relation.

Lemma 3 Let the process (7.2) be given. Then the following equalities hold for the generating functions $G_n(s)$, n ≥ 1, of the random variables $W_n$, n ≥ 0:

$$G_n(s) = h_n(s)\,h_{n-1}(f_n(s))\,h_{n-2}(f_{n-1,n}(s))\cdots h_1(f_{2,n}(s))\,\varphi(f_{1,n}(s)). \tag{7.7}$$

Proof We prove equality (7.7) by mathematical induction. To do this, we first show its validity for n = 1. Taking into account that the random variables $\{Y_{1,i}, i \in \mathbb{N}\}$ are identically distributed and independent of $\{\varepsilon_1\}$, we obtain


$$G_1(s) = Es^{W_1} = E\big(E(s^{Y_{1,1}+\cdots+Y_{1,\eta}+\varepsilon_1}/\eta)\big) = h_1(s)\,E(f_1(s))^{\eta} = h_1(s)\,\varphi(f_1(s)).$$

Suppose that equality (7.7) is true for n = k. Under this condition, we prove its validity for n = k + 1. Since the random variables $\{Y_{k+1,j}, j \in \mathbb{N}\}$ are identically distributed and independent of $\{\varepsilon_{k+1}\}$, we conclude that

$$G_{k+1}(s) = Es^{W_{k+1}} = E\big(E(s^{Y_{k+1,1}+\cdots+Y_{k+1,W_k}+\varepsilon_{k+1}}/W_k)\big) = h_{k+1}(s)\,E(f_{k+1}(s))^{W_k} = h_{k+1}(s)\,h_k(f_{k+1}(s))\,h_{k-1}(f_{k,k+1}(s))\cdots h_1(f_{2,k+1}(s))\,\varphi(f_{1,k+1}(s)).$$

Thus, equality (7.7) is valid for all n ∈ ℕ.

Lemma 4 Let the process (7.2) be given. Then the following equalities are valid, respectively, for the expectations, variances, and covariances of the random variables $W_n$, n ≥ 0:

$$EW_n = \mu_n\Big(\alpha_1 + \sum_{k=1}^{n}\frac{\lambda_k}{\mu_k}\Big);$$

$$\mathrm{Var}\,W_n = \mu_n^2\left[\sum_{k=1}^{n}\frac{b_k^2-\lambda_k}{\mu_k^2} + \sum_{k=2}^{n}\frac{\nu_k}{\mu_{k-1}}\sum_{j=1}^{k-1}\frac{\lambda_j}{\mu_j} + \alpha_1\sum_{k=1}^{n}\frac{\nu_k}{\mu_{k-1}} + (\alpha_2-\alpha_1) - \alpha_1^2\right] + \mu_n\Big(\alpha_1 + \sum_{k=1}^{n}\frac{\lambda_k}{\mu_k}\Big);$$

$$\mathrm{Cov}(W_k, W_n) = \frac{\mu_{k\vee n}}{\mu_{k\wedge n}}\,\mathrm{Var}\,W_{k\wedge n}.$$
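The first equality of Lemma 4 is easy to check by simulation; a minimal sketch with Bernoulli($m_k$) offspring and Poisson($\lambda_k$) immigration in a varying environment — this toy specification and the function names are illustrative assumptions, not from the chapter:

```python
import math, random

def poisson(lam, rng):
    # Knuth's method for sampling a Poisson random variate
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate_W(ms, lams, eta, rng):
    """One trajectory of W_n: Bernoulli(m_k) offspring, Poisson(lambda_k) immigration."""
    w = eta
    for m, lam in zip(ms, lams):
        w = sum(1 for _ in range(w) if rng.random() < m) + poisson(lam, rng)
    return w

rng = random.Random(0)
ms, lams, eta = [0.5, 0.8], [1.0, 2.0], 2   # mu_1 = 0.5, mu_2 = 0.4, alpha_1 = 2
reps = 40000
mean = sum(simulate_W(ms, lams, eta, rng) for _ in range(reps)) / reps
# Lemma 4: E W_2 = mu_2 * (alpha_1 + lambda_1/mu_1 + lambda_2/mu_2) = 0.4 * (2 + 2 + 5)
expected = 0.4 * (2 + 1.0 / 0.5 + 2.0 / 0.4)
print(round(mean, 2), expected)
```

The Monte Carlo mean agrees with the closed form $\mu_n(\alpha_1 + \sum_k \lambda_k/\mu_k)$ up to sampling error.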

Proof At first, we prove the validity of the first equality of Lemma 4. For convenience, we take the logarithm of both sides of equality (7.7):

$$\ln G_n(s) = \ln h_n(s) + \ln h_{n-1}(f_n(s)) + \ln h_{n-2}(f_{n-1,n}(s)) + \cdots + \ln h_1(f_{2,n}(s)) + \ln\varphi(f_{1,n}(s)). \tag{7.8}$$

Now, differentiating both sides of (7.8), we obtain

$$\frac{G_n'(s)}{G_n(s)} = \frac{h_n'(s)}{h_n(s)} + \frac{h_{n-1}'(f_n(s))f_n'(s)}{h_{n-1}(f_n(s))} + \frac{h_{n-2}'(f_{n-1,n}(s))f_{n-1}'(f_n(s))f_n'(s)}{h_{n-2}(f_{n-1,n}(s))} + \cdots + \frac{h_1'(f_{2,n}(s))f_2'(f_{3,n}(s))\cdots f_n'(s)}{h_1(f_{2,n}(s))} + \frac{\varphi'(f_{1,n}(s))f_1'(f_{2,n}(s))\cdots f_n'(s)}{\varphi(f_{1,n}(s))}. \tag{7.9}$$

Taking into account that $G_n(1) = 1$, $h_k(1) = 1$ and $f_{k,n}(1) = 1$ for k = 1, …, n, and $\varphi(1) = 1$, we conclude that


$$G_n'(1) = h_n'(1) + h_{n-1}'(1)f_n'(1) + h_{n-2}'(1)f_{n-1}'(1)f_n'(1) + \cdots + h_1'(1)f_2'(1)\cdots f_n'(1) + \varphi'(1)f_1'(1)\cdots f_n'(1).$$

The first equality of Lemma 4 follows from the last equality. Now we show the validity of the second equality. To do this, we differentiate equality (7.9) and obtain the following:

$$\frac{G_n''(s)G_n(s) - (G_n'(s))^2}{G_n^2(s)} = \frac{h_n''(s)h_n(s) - (h_n'(s))^2}{h_n^2(s)} + \frac{[h_{n-1}''(f_n(s))(f_n'(s))^2 + h_{n-1}'(f_n(s))f_n''(s)]h_{n-1}(f_n(s)) - (h_{n-1}'(f_n(s))f_n'(s))^2}{h_{n-1}^2(f_n(s))} + \cdots + \frac{[\varphi''(f_{1,n}(s))(f_1'(f_{2,n}(s))\cdots f_n'(s))^2 + \cdots]\varphi(f_{1,n}(s)) - (\varphi'(f_{1,n}(s))f_1'(f_{2,n}(s))\cdots f_n'(s))^2}{\varphi^2(f_{1,n}(s))},$$

where the elided groups are the analogous quotient-rule expressions arising from the remaining summands of (7.9), each composed with the corresponding iterate $f_{k,n}$.

7 Inequalities for Moments of Branching Processes in a Varying Environment

155

G n (1) − (G n (1))2 = h n (1) − (h n (1))2 + h n−1 (1)( f n (1))2 + h n−1 (1) f n (1)  (1) f n (1))2 − (h n−1 (1) f n (1))2 + h n−2 (1)( f n−1   + h n−2 (1) f n−1 (1)( f n (1))2 + h n−2 (1) f n−1 (1) f n (1)  (1) f n (1))2 + · · · + h 1 (1)( f 2 (1) · · · f n (1))2 − (h n−2 (1) f n−1

+ h 1 (1) f 2 (1)( f 3 (1) · · · f n (1))2 + · · · + h 1 (1) f 2 (1) · · · f n (1) − (h 1 (1) f 2 (1) · · · f n (1))2 + ϕ  (1)( f 1 (1) · · · f n (1))2 + ϕ  (1) f 1 (1)( f 2 (1) · · · f n (1))2 + · · · + ϕ  (1) f 1 (1) · · · f n (1) − (ϕ  (1) f 1 (1) · · · f n (1))2 .

The second equality of Lemma 4 follows from the last relation taking into account our notations. Considering E((Wk − E Wk )(Wn − E Wn )/An−1 ) = (Wk − E Wk )E((Wn − m n E Wn−1 − λn )/An−1 ) ⎞ ⎛⎛ ⎞ Wn−1  = (Wk − E Wk )E ⎝⎝ (Yn, j − m n ) + m n (Wn−1 − E Wn−1 ) + (εn − λn )⎠ /An−1 ⎠ j=1

= m n (Wk − E Wk )(Wn−1 − E Wn−1 ),

where Ai = σ {W0 , W1 , . . . , Wi } is the σ -field generating by the random variables W0 , W1 , . . . , Wi . we obtain Cov(Wk , Wn ) = m n Cov(Wk , Wn−1 ), 0 ≤ k < n. The third equality of Lemma 4 follows from the last recurrence relation. Lemma 5 Suppose that Y1 , Y2 , . . . , Yn are independent, identically distributed random variables, and ζ is a random variable independent on this sequence of variables and taking nonnegative values. Let Eζ < ∞ and EY1 = 0, E|Y1 | p < ∞ for some 1 < p ≤ 2. Then   ζ   p   Yk  ≤ 2Eζ E|Y1 | p . E  

(7.10)

k=1

Proof Since ζ and random variables {Yk , k = 1, n} are independent, taking into account the total probability law, we obtain the following relation E|

ζ  k=1

Yk | p =

∞  N =0

E|

N  k=1

Yk | p I {ζ = N } =

∞  N =0

E|

N 

Yk | p P{ζ = N }.

k=1

Applying the von Bahr-Esseen inequality (see [3]), we obtain the following

(7.11)

156

Ya. M. Khusanbaev and Kh. E. Kudratov ∞  N =0

≤2

∞ 

E

N =0

N 

   N  N ∞   p   p      E Yk  P{ζ = N } ≤ 2 E Yk  P{ζ = N }     N =0

k=1

|Yk | p P{ζ = N } = 2E|Y1 | p

∞ 

k=1

N P(ζ = N ) = 2Eζ E|Y1 | p . (7.12)

N =0

k=1

Inequality (7.10) immediately follows from relations (7.11) and (7.12). Lemma 6 Suppose that Y1 , Y2 , . . . , Yn are independent, identically distributed random variables, and ζ is a random variable independent on this sequence of p variables and taking nonnegative values. Let Eζ 2 < ∞, EY1 = 0, E|Y1 | p < ∞ for some p ≥ 2. Then   ζ   p p   Yk  ≤ C( p)Eζ 2 E|Y1 | p , (7.13) E   k=1

where C( p) is a constant depending only on p. Proof Taking into account the independence of random variables ζ and {Yk , k = 1, . . . , n}, we obtain from the total probability formula the following relation  ζ     N  N ∞ ∞   p    p   p        E Yk  = E Yk  I {ζ = N } = E Yk  P{ζ = N }.       k=1

N =0

N =0

k=1

(7.14)

k=1

  Since the random variables Yk , k = 1, n are identically distributed, given the Marcinkiewicz-Zygmund inequality (see [3]), we obtain the following ∞  N =0

  N ∞ N   p   p   E Yk  P{ζ = N } ≤ C( p) N 2 −1 E |Yk | p P{ζ = N }   N =0

k=1

= C( p)E|Y1 | p

∞ 

k=1

p

p

N 2 P(ζ = N ) = C( p)Eζ 2 E|Y1 | p . (7.15)

N =0

Inequality (7.13) follows from relations (7.14) and (7.15). We also need the following well-known inequalities (see, for example [4]) to prove our theorems. Lemma 7 The following inequalities are valid: n  p n p 1. if 0 < p ≤ 1, then ai ≤ i=1 ai (ai ≥ 0); i=1 p  p n n 2. if 0 < p > 1, then ≤ n p−1 i=1 ai (ai ≥ 0). i=1 ai

7 Inequalities for Moments of Branching Processes in a Varying Environment

157

7.4 Proof of the Main Results Proof (Theorem 1) At first, consider the case of 0 < p ≤ 1. By Lemma 7, we obtain the following equality ⎛⎛ E(Z n /Z n−1 ) = E ⎝⎝ p

Z n−1

⎞p



⎛⎛

Yn, j ⎠ /Z n−1 ⎠ ≤ E ⎝⎝

j=1

Z n−1



⎞ ( p)

Yn, j ⎠ /Z n−1 ⎠ = Z n−1 γn . p

j=1 ( p)

p

( p)

Using E Z n = α1 μn , by the last equality, E Z n ≤ γn E Z n−1 = α1 γn μn−1 . Now consider the case of p > 1. According to Lemma 7, the following relations hold ⎛⎛ p E(Z n /Z n−1 )

= E ⎝⎝



Z n−1

⎞p



⎛⎛

Yn, j ⎠ /Z n−1 ⎠ ≤ E

⎝⎝ Z p−1 n−1

j=1

p

( p)



Z n−1





p Yn, j ⎠ /Z n−1 ⎠

p

( p)

= Z n−1 γn .

j=1

p

Thus, E Z n ≤ γn E Z n−1 ≤ · · · ≤ α p

n j=1

( p)

γj

( p)

= α p n . Theorem 1 is proved.

Proof (Theorem 2) First consider the case of 0 < p ≤ 1. We have Z n − E Z n = Z n − E(Z n /Z n−1 ) + E(Z n /Z n−1 ) − E Z n =

Z n−1 

Yn, j − Z n−1 m n + Z n−1 m n − α1 μn

j=1

=

Z n−1 

(Yn, j − m n ) + m n (Z n−1 − E Z n−1 ).

(7.16)

j=1

It follows from here, taking into account Lemma 7 that  p  Z n−1     p  E|Z n − E Z n | ≤ E  (Yn, j − m n ) + m np E|Z n−1 − E Z n−1 | p .  j=1 

(7.17)

According to Lemma 7, we obtain  p    Z n−1   E  (Yn, j − m n ) ≤ E|Yn,1 − m n | p E Z n−1 = α1 μn−1 θn( p) ,  j=1 

(7.18)

which, together with (7.17), implies the following E|Z n − E Z n | p ≤ α1 μn−1 θn( p) + m np E|Z n−1 − E Z n−1 | p .

(7.19)

158

Ya. M. Khusanbaev and Kh. E. Kudratov ( p)

p

Denote An = E|Z n − E Z n | p , Cn = α1 μn−1 θn , Bn = m n . These notations allows us to rewrite (7.19) in the following form: An ≤ Cn + Bn An−1 . This inequality implies An ≤ Cn + Bn An−1 ≤ Cn + Bn (Cn−1 + Bn−1 An−2 ) = Cn + Bn Cn−1 + Bn Bn−1 An−2 ≤ Cn + Bn Cn−1 + Bn Bn−1 (Cn−2 + Bn−2 An−3 ) = Cn + Bn Cn−1 + Bn Bn−1 Cn−2 + Bn Bn−1 Bn−2 An−3 (7.20) ≤ · · · ≤ Cn + Bn Cn−1 + Bn Bn−1 Cn−2 + Bn Bn−1 Bn−2 Cn−3 + · · · n  p p Ck p +Bn Bn−1 · · · B2 C1 + Bn Bn−1 · · · B2 B1 A0 = μn p + μn E|η − Eη| . μ k=1

k

The first inequality of Theorem 2 follows from (7.20). Now we consider the case of 1 < p ≤ 2. According to (7.16) and Lemma 7,  p   Z n−1   E|Z n − E Z n | p ≤ 2 p−1 E  (Yn, j − m n ) + 2 p−1 m np E|Z n−1 − E Z n−1 | p .  j=1  (7.21) Lemma 5 implies the following  p    Z n−1   E  (Yn, j − m n ) ≤ 2E|Yn,1 − m n | p E Z n−1 = 2α1 μn−1 θn( p) .  j=1 

(7.22)

Further, we obtain from (7.21) and (7.22) E|Z n − E Z n | p ≤ 2 p α1 μn−1 θn( p) + 2 p−1 m np E|Z n−1 − E Z n−1 | p .

(7.23)

( p)

Denoting Dn =2α1 μn−1 θn rewrite (7.23) as follows, An ≤ 2 p−1 Dn + 2 p−1 Bn An−1 . In turn, this inequality implies An ≤ 2 p−1 Dn + 2 p−1 Bn An−1 ≤ 2 p−1 Dn + 2 p−1 Bn (2 p−1 Dn−1 + 2 p−1 Bn−1 An−2 ) = 2 p−1 Dn + 22( p−1) Bn Dn−1 + 22( p−1) Bn Bn−1 An−2 ≤ 2 p−1 Dn + 22( p−1) Bn Dn−1 +22( p−1) Bn Bn−1 (2 p−1 Dn−2 + 2 p−1 Bn−2 An−3 ) = 2 p−1 Dn + 22( p−1) Bn Dn−1 + 23( p−1) Bn Bn−1 Dn−2 +23( p−1) Bn Bn−1 Bn−2 An−3 ≤ · · · ≤ 2 p−1 Dn + 22( p−1) Bn Dn−1 + 23( p−1) Bn Bn−1 Dn−2 +24( p−1) Bn Bn−1 Bn−2 Dn−3 + · · · + 2n( p−1) Bn Bn−1 . . . B2 D1 + 2(n+1)( p−1) Bn Bn−1 . . . B2 B1 A0 n  p p 2(n+1−k)( p−1) Dk = μn + 2n( p−1) μn E|η − Eη| p . p μ k=1

k

(7.24)

7 Inequalities for Moments of Branching Processes in a Varying Environment

159

The second inequality of Theorem 2 follows from the last inequality. And now consider the case of p > 2. According to Lemma 6 and Theorem 1, we obtain the following: p    p   Z n−1 ( 2p ) ( p) 2 E|Yn,1 − m n | p ≤ C( p)α 2p n−1 θn . E  (Yn, j − m n ) ≤ C( p)E Z n−1   j=1

(7.25)

We have ( p)

2 E|Z n − E Z n | p ≤ 2 p−1 C( p)α 2p n−1 θn( p) + 2 p−1 m np E|Z n−1 − E Z n−1 | p .

( p)

(7.26)

( p)

2 Denoting Tn = C( p)α 2p n−1 θn , (7.26) becomes An ≤ 2 p−1 Tn + 2 p−1 Bn An−1 . As in solving inequality (7.24), we have

E|Z n − E Z n | p ≤ μnp

n  2(n+1−k)( p−1) Tk p

k=1

μk

+ 2n( p−1) μnp E|η − Eη| p .

The third inequality of Theorem 2 follows from the last inequality. Thus, Theorem 2 is proved. Proof (Theorem 3) At first we consider the case of 0 < p ≤ 1. Lemma 7 implies ⎛⎛ E(Wnp /Wn−1 )

= E ⎝⎝



Wn−1

⎞p



⎛⎛

Yn, j + εn ⎠ /Wn−1 ⎠ ≤ E ⎝⎝

j=1



Wn−1

⎞p



Yn, j ⎠ /Wn−1 ⎠

j=1

⎛⎛ ⎞ ⎞ Wn−1  p  p  +E εn /Wn−1 ≤ E ⎝⎝ Yn, j ⎠ /Wn−1 ⎠ + τn( p) = γn( p) Wn−1 + τn( p) . j=1

  p ( p) ( p) Hence, E Wn ≤ γn E Wn−1 + τn . Hence, since E Wn = μn α1 + nj=1    λj p ( p) ( p) inequality E Wn ≤ γn μn−1 α1 + n−1 j=1 μ j + τn holds. Now consider the case of p > 1. We get from Lemma 7 the following

λj μj



, the


$$E(W_n^p/W_{n-1}) = E\Big(\Big(\sum_{j=1}^{W_{n-1}} Y_{n,j} + \varepsilon_n\Big)^p\Big/W_{n-1}\Big) \le 2^{(p-1)} E\Big(\Big(\sum_{j=1}^{W_{n-1}} Y_{n,j}\Big)^p\Big/W_{n-1}\Big) + 2^{(p-1)} E(\varepsilon_n^p/W_{n-1}) \le 2^{(p-1)} E\Big(W_{n-1}^{p-1}\sum_{j=1}^{W_{n-1}} Y_{n,j}^p\Big/W_{n-1}\Big) + 2^{(p-1)}\tau_n^{(p)} = 2^{(p-1)}\gamma_n^{(p)} W_{n-1}^p + 2^{(p-1)}\tau_n^{(p)}.$$

It follows from here that

$$EW_n^p \le 2^{(p-1)}\gamma_n^{(p)} EW_{n-1}^p + 2^{(p-1)}\tau_n^{(p)} \le \cdots \le 2^{n(p-1)}\Lambda_n^{(p)}\alpha_p + \Lambda_n^{(p)}\sum_{j=1}^{n}\frac{\tau_j^{(p)}\,2^{(n+1-j)(p-1)}}{\Lambda_j^{(p)}}.$$

Thus, Theorem 3 is proved.

Proof (Theorem 4) At first we consider the case of 0 < p ≤ 1. We have

$$W_n - EW_n = W_n - E(W_n/W_{n-1}) + E(W_n/W_{n-1}) - EW_n = \sum_{j=1}^{W_{n-1}} Y_{n,j} + \varepsilon_n - m_n W_{n-1} - \lambda_n + m_n W_{n-1} + \lambda_n - m_n EW_{n-1} - \lambda_n = \sum_{j=1}^{W_{n-1}}(Y_{n,j} - m_n) + (\varepsilon_n - \lambda_n) + m_n(W_{n-1} - EW_{n-1}). \tag{7.27}$$

We obtain from (7.27) and Lemma 7 the following:

$$E|W_n - EW_n|^p \le E\Big|\sum_{j=1}^{W_{n-1}}(Y_{n,j} - m_n)\Big|^p + E|\varepsilon_n - \lambda_n|^p + m_n^p\, E|W_{n-1} - EW_{n-1}|^p \le E|Y_{n,1} - m_n|^p\, EW_{n-1} + \delta_n^{(p)} + m_n^p\, E|W_{n-1} - EW_{n-1}|^p = \theta_n^{(p)}\mu_{n-1}\Big(\alpha_1 + \sum_{i=1}^{n-1}\frac{\lambda_i}{\mu_i}\Big) + \delta_n^{(p)} + m_n^p\, E|W_{n-1} - EW_{n-1}|^p. \tag{7.28}$$

Denote $Q_n = E|W_n - EW_n|^p$ and $R_n = \theta_n^{(p)}\mu_{n-1}\big(\alpha_1 + \sum_{i=1}^{n-1}\frac{\lambda_i}{\mu_i}\big) + \delta_n^{(p)}$. According to this notation, one can rewrite (7.28) as $Q_n \le R_n + B_n Q_{n-1}$. As in solving inequality (7.20), we have $E|W_n - EW_n|^p \le \mu_n^p\sum_{k=1}^{n}\frac{R_k}{\mu_k^p} + \mu_n^p\, E|\eta - E\eta|^p$, which implies the first inequality of Theorem 4.


Now consider the case of 1 < p ≤ 2. According to (7.27) and Lemma 7,

$$E|W_n - EW_n|^p \le 2^{p-1} E\Big|\sum_{j=1}^{W_{n-1}}(Y_{n,j} - m_n) + (\varepsilon_n - \lambda_n)\Big|^p + 2^{p-1} m_n^p\, E|W_{n-1} - EW_{n-1}|^p. \tag{7.29}$$

Lemmas 5 and 7 imply

$$E\Big|\sum_{j=1}^{W_{n-1}}(Y_{n,j} - m_n) + (\varepsilon_n - \lambda_n)\Big|^p \le 2^{p-1} E\Big|\sum_{j=1}^{W_{n-1}}(Y_{n,j} - m_n)\Big|^p + 2^{p-1} E|\varepsilon_n - \lambda_n|^p = 2^p E|Y_{n,1} - m_n|^p\, EW_{n-1} + 2^{p-1}\delta_n^{(p)} = 2^p\theta_n^{(p)}\mu_{n-1}\Big(\alpha_1 + \sum_{j=1}^{n-1}\frac{\lambda_j}{\mu_j}\Big) + 2^{p-1}\delta_n^{(p)}. \tag{7.30}$$

It follows from (7.29) and (7.30) that

$$E|W_n - EW_n|^p \le 2^{p-1}\left(2^p\theta_n^{(p)}\mu_{n-1}\Big(\alpha_1 + \sum_{j=1}^{n-1}\frac{\lambda_j}{\mu_j}\Big) + 2^{p-1}\delta_n^{(p)}\right) + 2^{p-1} m_n^p\, E|W_{n-1} - EW_{n-1}|^p. \tag{7.31}$$

Now, denoting $S_n = 2^p\theta_n^{(p)}\mu_{n-1}\big(\alpha_1 + \sum_{j=1}^{n-1}\frac{\lambda_j}{\mu_j}\big) + 2^{p-1}\delta_n^{(p)}$ and taking into account the above notations, we rewrite (7.31) as $Q_n \le 2^{p-1}S_n + 2^{p-1}B_n Q_{n-1}$. As in solving inequality (7.24), we have

$$E|W_n - EW_n|^p \le \mu_n^p\sum_{k=1}^{n}\frac{2^{(n+1-k)(p-1)}S_k}{\mu_k^p} + 2^{n(p-1)}\mu_n^p\, E|\eta - E\eta|^p, \tag{7.32}$$

which implies the second inequality of Theorem 4. It remains to consider the case of p > 2. Lemmas 6 and 7 and Theorem 3 imply

$$E\Big|\sum_{j=1}^{W_{n-1}}(Y_{n,j} - m_n) + (\varepsilon_n - \lambda_n)\Big|^p \le 2^{p-1} E\Big|\sum_{j=1}^{W_{n-1}}(Y_{n,j} - m_n)\Big|^p + 2^{p-1} E|\varepsilon_n - \lambda_n|^p \le 2^{p-1}C(p)\,E|Y_{n,1} - m_n|^p\, EW_{n-1}^{\frac{p}{2}} + 2^{p-1}\delta_n^{(p)} \le 2^{p-1}C(p)\,\theta_n^{(p)}\left(2^{(n-1)(\frac{p}{2}-1)}\Lambda_{n-1}^{(\frac{p}{2})}\alpha_{\frac{p}{2}} + \Lambda_{n-1}^{(\frac{p}{2})}\sum_{j=1}^{n-1}\frac{\tau_j^{(\frac{p}{2})}\,2^{(n-j)(\frac{p}{2}-1)}}{\Lambda_j^{(\frac{p}{2})}}\right) + 2^{p-1}\delta_n^{(p)}. \tag{7.33}$$

We obtain from (7.29) and (7.33)


$$E|W_n - EW_n|^p \le 2^{p-1}\left(2^{p-1}C(p)\,\theta_n^{(p)}\Big(2^{(n-1)(\frac{p}{2}-1)}\Lambda_{n-1}^{(\frac{p}{2})}\alpha_{\frac{p}{2}} + \Lambda_{n-1}^{(\frac{p}{2})}\sum_{j=1}^{n-1}\frac{\tau_j^{(\frac{p}{2})}\,2^{(n-j)(\frac{p}{2}-1)}}{\Lambda_j^{(\frac{p}{2})}}\Big) + 2^{p-1}\delta_n^{(p)}\right) + 2^{p-1} m_n^p\, E|W_{n-1} - EW_{n-1}|^p. \tag{7.34}$$

Denote

$$U_n = 2^{p-1}C(p)\,\theta_n^{(p)}\left(2^{(n-1)(\frac{p}{2}-1)}\Lambda_{n-1}^{(\frac{p}{2})}\alpha_{\frac{p}{2}} + \Lambda_{n-1}^{(\frac{p}{2})}\sum_{j=1}^{n-1}\frac{\tau_j^{(\frac{p}{2})}\,2^{(n-j)(\frac{p}{2}-1)}}{\Lambda_j^{(\frac{p}{2})}}\right) + 2^{p-1}\delta_n^{(p)}.$$

Taking into account this notation, (7.34) becomes $Q_n \le 2^{p-1}U_n + 2^{p-1}B_n Q_{n-1}$. As in solving inequality (7.24), we have

$$E|W_n - EW_n|^p \le \mu_n^p\sum_{k=1}^{n}\frac{2^{(n+1-k)(p-1)}U_k}{\mu_k^p} + 2^{n(p-1)}\mu_n^p\, E|\eta - E\eta|^p.$$

The third inequality of Theorem 4 follows from the last inequality. Theorem 4 is proved.

References

1. Kersting, G., Vatutin, V.A.: Discrete Time Branching Processes in Random Environment. ISTE Limited (2017)
2. Athreya, K.B., Ney, P.E.: Branching Processes. Springer, Heidelberg (1972)
3. Petrov, V.V.: Sums of Independent Random Variables. Springer (1975)
4. Lin, Z., Bai, Z.: Probability Inequalities. Springer, Heidelberg (2010)

Chapter 8

A Law of the Iterated Logarithm for the Empirical Process Based Upon Twice Censored Data Abderrahim Kitouni and Fatiha Messaci

Abstract We give a functional law of the iterated logarithm for the increment functions of empirical processes with twice censored data. In this setting, the lifetime of interest X is right censored by a variable R and min(X, R) is itself left censored by a variable L. This model is, however, characterized by the independence of the latent variables X, R and L. We also derive strong laws for kernel estimators of the density and the failure rate of the lifetime X. Our result extends the results available for complete or singly censored data.

Keywords Lifetime of interest · Latent variable · Empirical process

MSC 2020 62N01

8.1 Introduction

The study of the empirical distribution function is an important topic in statistics. Its natural normalization defines the empirical process, whose increments lead to the local empirical process [16]. A motivation behind the study of local empirical processes is that they are instrumental in the derivation of limit laws for statistics which can be expressed as local functionals of these processes. Typical examples of such statistics are estimators of the density function and those of the failure rate. We refer to [2–4] and the references therein for more investigations of this problem for complete and right censored data. The aim of the present work is to provide a law of the iterated logarithm for the local empirical process based upon twice censored data sets in the neighborhood of a fixed point. For such data, instead of observing the variable of interest X, with distribution function $F_X$, we only have a sample of the random variable $((X \wedge R) \vee L, \delta)$, where

A. Kitouni (B) · F. Messaci Département de Mathématiques, Université frères Mentouri Constantine 1, route d'Ain El Bey, 25017 Constantine, Algeria e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_8

163

164

A. Kitouni and F. Messaci

R and L are nonnegative censoring variables and δ is an indicator of which latent variable is observed. The variables X, R and L are assumed independent. The article [13] gives an application of this twice-censoring model to farm trees, whereas [14] shows the appropriateness of this model for studying a reliability system and estimates the survival function 1 − F by means of a product limit estimator, which leads to the empirical process under consideration in this paper. This model is similar to, but different from, the doubly censored data model studied in [18, 19]: the Turnbull model assumes that L < R, which may make more sense but considerably complicates the derivation of various results. For the sake of avoiding repetitions, we do not develop the application of our result in order to obtain limit laws for estimators based on the twice-censoring model studied in this work. Notice, for example, that the derivation of strong laws for the kernel density estimator, based upon the above Patilea–Rolin estimator, can be conducted exactly as that of the kernel density estimator based on right censored data (see e.g., [2, 7, 17, 20, 21]). This paper is organized as follows. Section 8.2 is devoted to recalling the ideas permitting the introduction of the Patilea–Rolin estimator and to making its expression explicit. Our main result is stated and proved in Sect. 8.3. In Sect. 8.4, a simulation study illustrates the good behavior of kernel estimators of the density and the failure rate based upon twice censored data. The proofs of some useful technical lemmas are relegated to the Appendix section.
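The observation scheme just described is straightforward to simulate; a minimal Python sketch (the exponential latent distributions and the function name are illustrative assumptions, not from the chapter):

```python
import random

def sample_twice_censored(n, rng):
    """Draw (Z_i, delta_i) from the twice-censoring model: Z = max(min(X, R), L),
    delta = 0 if L < X <= R,  1 if L < R < X,  2 if min(X, R) <= L."""
    out = []
    for _ in range(n):
        x = rng.expovariate(1.0)        # lifetime of interest X
        r = rng.expovariate(0.5)        # right-censoring variable R
        l = rng.expovariate(2.0)        # left-censoring variable L
        y = min(x, r)
        if y <= l:
            out.append((l, 2))          # left censored: only L is seen
        elif x <= r:
            out.append((y, 0))          # the death X is observed
        else:
            out.append((y, 1))          # right censored by R
    return out

data = sample_twice_censored(1000, random.Random(0))
print(sorted(set(d for _, d in data)))
```

With independent continuous latent variables, all three values of δ occur with positive probability, so a sample of this size exhibits each censoring pattern.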

8.2 Product-Limit Estimator

We denote by $F_V(t) := P(V \le t)$ the (right-continuous) distribution function of a real random variable V and by $S_V = 1 - F_V$ its survival function. The upper and lower endpoints of $F_V$ are denoted by $T_V$ (= sup{t : F_V(t) < 1}) and $I_V$ (= inf{t : F_V(t) > 0}) respectively. For any function R, we set $R(t^-) = \lim_{\varepsilon\downarrow 0} R(t-\varepsilon)$ whenever this limit exists. For any set C ⊆ ℝ, we denote by B(C) the space of all bounded real-valued functions defined on C, endowed with the topology of uniform convergence.

Consider independent random variables X, R and L, denoting a survival time, a right censoring random variable and a left censoring random variable respectively. In the model I of [14], we observe a sample $(Z_i, \delta_i)_{1\le i\le n}$ of independent random copies of (Z, δ), where $Z := \max(\min(X, R), L)$ and

$$\delta := \begin{cases} 0 & \text{if } L < X \le R,\\ 1 & \text{if } L < R < X,\\ 2 & \text{if } \min(X, R) \le L. \end{cases}$$

Consider the subdistribution functions of Z defined for k = 0, 1, 2 by $H^{(k)}(t) = P(Z \le t, \delta = k)$. We have the relations

8 A Law of the Iterated Logarithm for the Empirical Process …

165

$$H^{(0)}(t) = \int_0^t F_L(u^-)S_R(u^-)\,dF_X(u), \qquad H^{(1)}(t) = \int_0^t F_L(u^-)S_X(u)\,dF_R(u), \qquad H^{(2)}(t) = \int_0^t \{1 - S_X(u)S_R(u)\}\,dF_L(u), \tag{8.1}$$

and the distribution function of Z is $H = H^{(0)} + H^{(1)} + H^{(2)}$. Set $Y = \min(X, R)$ and $H^{(01)} = H^{(0)} + H^{(1)}$; we have $H^{(01)}(t) = \int_0^t F_L(u^-)\,dF_Y(u)$ and $H^{(2)}(t) = \int_0^t F_Y(u)\,dF_L(u)$.

We first consider estimation with the left censoring model. For this, consider the reverse hazard measures $M_2(dt) = \frac{dH^{(2)}(t)}{H(t)}$ and $M_{01}(dt) = \frac{dH^{(01)}(t)}{H(t^-) + \Delta H^{(01)}(t)}$ (with $\Delta H^{(01)}(t) = H^{(01)}(t) - H^{(01)}(t^-)$), and let $F^{(2)}$ and $F^{(01)}$ (resp. $S^{(2)}$ and $S^{(01)}$) be the associated distribution functions (resp. the survival functions), which can be directly estimated using the available data. (If M is a reverse hazard measure, then the corresponding distribution function is given by $F(t) = \prod_{]t,\infty]}(1 - M(ds))$, where $\prod_{]t,\infty]}$ denotes the product integral; see [8].) We obtain that $H(t) = F^{(2)}(t)F^{(01)}(t)$. Equation (8.1) and the definition of $S_X$ imply that

$$\frac{dH^{(0)}(t)}{F_L(t^-)S_Y(t^-)} = \frac{dF_X(t)}{S_X(t^-)}. \tag{8.2}$$

This suggests to define the following hazard measure:

$$\Lambda(dt) = \frac{dH^{(0)}(t)}{F^{(2)}(t^-)S^{(01)}(t^-)}. \tag{8.3}$$

1{Z i ≤t} , Hn (t) = n i=1

Hn(k) (t)

n 1

= 1{Z i ≤t,δi =k} , for k = 0, 1, 2, n i=1

(8.4)

and let Fn(2) , Sn(01) , n and Fn denote the functions obtained when replacing H (0) , H (1) , H (2) by their empirical versions in the expressions of F (2) , S (01) ,  and FXI respectively. Fn is the estimator of FX proposed by [14] (Note that FX = FXI under some identification conditions). Its expression is recalled below. Denote by {Z j , 1 ≤ j ≤ M} the distinct values in increasing order of {Z i , 1 ≤ i ≤ n}. Define Dk j = n

 1 − D2l /(n Hn (Z l )) . Then 1{Z i =Z j ,Ai =k} and U j−1 = n i=1

j≤l≤M


$$1 - F_n(t) = \prod_{j:\,Z_j \le t}\big(1 - D_{0j}/(U_{j-1} - n H_n(Z_{j-1}))\big). \tag{8.5}$$

One can see that $1 - F_n$ generalizes the well-known estimator given in [9].
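In the special case with no left censoring ($F_L \equiv 1$, so δ = 2 never occurs), $D_{2l} = 0$ and $U_{j-1} = n$, and (8.5) reduces to the Kaplan–Meier product-limit estimator of [9]: the denominator $n - nH_n(Z_{j-1})$ is then the number at risk just before $Z_j$. A minimal Python sketch of that special case (function name illustrative):

```python
def kaplan_meier(z, delta, t):
    """Kaplan-Meier estimate of the survival function S(t) from right-censored data.

    z:     observed times min(X, R)
    delta: 0 if the death X was observed, 1 if right censored
           (the chapter's coding, restricted to no left censoring)
    """
    times = sorted(set(z))
    s = 1.0
    for tj in times:
        if tj > t:
            break
        at_risk = sum(1 for zi in z if zi >= tj)                              # n - n*H_n(t_j^-)
        deaths = sum(1 for zi, di in zip(z, delta) if zi == tj and di == 0)   # D_{0j}
        s *= 1.0 - deaths / at_risk
    return s

# toy data: deaths at times 1 and 3, censored observations at 2 and 4
z = [1.0, 2.0, 3.0, 4.0]
delta = [0, 1, 0, 1]
print(kaplan_meier(z, delta, 3.5))   # (1 - 1/4) * (1 - 1/2) = 0.375
```

Each factor removes the conditional probability mass of a death at an observed death time, which is exactly the product structure of (8.5).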

8.3 Results

The estimator defined in (8.5) leads to an empirical process defined by setting, for any t ∈ ℝ,

$$a_n(t) = \sqrt{n}\,(F_n(t) - F_X(t)). \tag{8.6}$$

Let $(h_n)_{n\ge 0}$ be a sequence of positive constants satisfying, as n → ∞,

H1 $h_n \downarrow 0$ and $nh_n \uparrow \infty$;
H2 $nh_n/\log\log n \to \infty$.

We will need the following identification hypothesis:

H3 $\max(I_L, I_R) < I_X$ and $T_X < T_R$.

Fix M > 0 and $z \in\, ]I_X, T_X[$. Set $b_n = \sqrt{2h_n\log\log n}$ and define the increments of $(a_n)$, also known as the local empirical process, for any $u \in [-M, M]$ by

$$\xi_n(u) = \frac{1}{b_n}\big(a_n(z + h_n u) - a_n(z)\big). \tag{8.7}$$

Our main result is as follows.

Theorem 1 Let $z \in\, ]I_X, T_X[$. Assume that $F_X$, $F_L$ and $F_R$ are continuous and that the derivative $f_X$ of $F_X$ at z exists. Then under (H1), (H2) and (H3) the sequence $\{\xi_n, n \ge 1\}$ is almost surely relatively compact in $B([-M, M])$ with limit set equal to the set of all functions h of the form

$$h(u) = \int_0^u \psi(s)\,ds, \quad u \in [-M, M], \quad\text{where}\quad \int_{-M}^{M}\psi^2(s)\,ds \le \frac{f_X(z)}{F_L(z)S_R(z)}. \tag{8.8}$$

This result is similar to Theorem 1.2 of [2], where the data are only right-censored. The point of such a result is that it can give convergence rates for estimators which depend on the increments of the cumulative distribution function or of an estimator of the latter. An example of such an estimator is the kernel estimator of the density introduced by [15]: $f_n(x) = \int_{-\infty}^{+\infty}\frac{1}{h_n}K\big(\frac{x-u}{h_n}\big)\,dG_n(u)$, where $G_n$ is the empirical distribution function, K is a kernel (or weighting function), and $h_n$ is a sequence of strictly positive real numbers called the bandwidth. Replacing $G_n$ by the estimator defined in (8.5), we get the kernel density estimator for twice censored data introduced in [10]. Assume K satisfies the following conditions:


K1 K has bounded variation on ℝ;
K2 K has compact support;
K3 $\int_{-\infty}^{\infty} K(u)\,du = 1$.

Define the quantity $\hat{E}f_n(z) = \int_{-\infty}^{+\infty}\frac{1}{h_n}K\big(\frac{z-u}{h_n}\big)\,dF(u)$. When the data are completely observed, $\hat{E}f_n(z)$ is equal to the expectation of $f_n(z)$, but this is not the case in general. It is worth noticing that the treatment of the term $\hat{E}f_n(z) - f(z)$ is classical under suitable conditions on K and $h_n$ and appropriate smoothness assumptions on $f_X$; it is the same as for complete data. The following corollary, whose proof is omitted since it is the same as that of Theorem 1.1 of [2], gives the rate of convergence of the estimator.

Corollary 1 Let $z \in\, ]I_X, T_X[$. Assume that $F_X$, $F_L$ and $F_R$ are continuous and that the derivative $f_X$ of $F_X$ at z exists. Then under conditions (H1), (H2), (H3), (K1), (K2) and (K3) we have

$$\limsup_{n\to\infty} \pm \left(\frac{n h_n}{\log\log n}\right)^{1/2} \left(f_n(z) - \hat{E} f_n(z)\right) = \left(\frac{f_X(z)}{F_L(z)\,S_R(z)} \int K^2(u)\,du\right)^{1/2}.$$

This estimator can be used to define an estimator λ_n of the hazard rate λ(z) = f(z)/(1 − F(z)) by setting λ_n(z) = f_n(z)/(1 − F_n(z)) whenever (1 − F_n(z)) ≠ 0. The following result is an immediate consequence of Corollary 1.

Corollary 2 Let z ∈ ]I_X, T_X[ and M > 0 be fixed. Assume that F_X, F_R and F_L are continuous and that the derivative f_X of F_X at z exists. Then under conditions (H1),

(H2), (H3), (K1), (K2) and (K3):

$$\left|\lambda_n(z) - \frac{\hat{E} f_n(z)}{1 - F(z)}\right| = O\!\left(\sqrt{\frac{\log\log n}{n h_n}}\right).$$

Compared to the results of [10], we obtain a better convergence rate, that is, $\sqrt{\log\log n/(n h_n)}$ versus $\log n/(n h_n^2)$, albeit for a weaker convergence mode (pointwise almost sure versus uniform almost complete convergence).

The rest of this section is devoted to the proof of Theorem 1, divided into a sequence of lemmas (the proofs of which are given in the next section). As is often the case with results about empirical processes, we make use of the reduction to [0, 1]. Motivated by the idea in [2], we construct a sequence of random variables with uniform law on [0, 1] such that the sequence of observations (Z_i, δ_i) can be rewritten in terms of this sequence. For this, set p = P(δ = 0) and q = P(δ = 1), and assume that p > 0, q > 0 and p + q < 1. Notice that p + q = 1 (resp. q = 0) corresponds to right (resp. left) censored data. The latter case can be derived from the result of [2] by reversing time. Now define the quantile functions Q^{(0)}, Q^{(1)} and Q^{(2)} of H^{(0)}, H^{(1)} and H^{(2)}:

$$\begin{aligned}
Q^{(0)}(s) &= \inf\{x : H^{(0)}(x) \ge s\}, & 0 < s < p;\\
Q^{(1)}(s) &= \inf\{x : H^{(1)}(x) \ge s\}, & 0 < s < q;\\
Q^{(2)}(s) &= \inf\{x : H^{(2)}(x) \ge s\}, & 0 < s < 1 - p - q.
\end{aligned}$$

168

A. Kitouni and F. Messaci

These definitions imply

$$\begin{aligned}
Q^{(0)}(s) \le x &\iff s \le H^{(0)}(x) & \text{for } 0 < s < p;\\
Q^{(1)}(s) \le x &\iff s \le H^{(1)}(x) & \text{for } 0 < s < q;\\
Q^{(2)}(s) \le x &\iff s \le H^{(2)}(x) & \text{for } 0 < s < 1 - p - q.
\end{aligned} \tag{8.9}$$

We can now state the result giving the reduction to [0, 1].

Lemma 1 On a sufficiently rich probability space, we can define a Uniform(0, 1) random variable U such that, almost surely, δ = 1{p …

… ${}^{>}w_i(t, n) = \sum_{m=n+1}^{\infty} w_i(t, m) = \sum_{m=n+1}^{\infty} \sum_{j\in S} p_{i,j}(t)\, h_{i,j}(m)$. The basic parameter of the SMC is the core matrix and

182

P. Kolias and A. Papadopoulou

it is defined as C(t, m) = {c_{i,j}(t, m)}_{i,j∈S} = P(t) ∘ H(m), where the operator ∘ denotes the element-wise product of matrices (Hadamard product). Also, we define the interval transition probabilities q_{i,j}(t, n), which are the probabilities for the SMC to be in state j after n time units given that it entered state i at time t, so that

$$Q(t, n) = \{q_{i,j}(t, n)\}_{i,j\in S} = {}^{>}W(t, n) + \sum_{m=1}^{n} \big[P(t) \circ H(m)\big]\, Q(t+m,\ n-m), \tag{9.1}$$

where ${}^{>}W(t, n) = \operatorname{diag}\{{}^{>}w_i(t, n)\}$. The elements of the matrix Q(t, n) are

$$q_{i,j}(t, n) = \delta_{i,j}\, {}^{>}w_i(t, n) + \sum_{r\in S} \sum_{m=1}^{n} c_{i,r}(t, m)\, q_{r,j}(t+m,\ n-m), \quad i, j \in S,\ t, n \in \mathbb{N}.$$
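The recursion (9.1) can be evaluated numerically; the sketch below implements the homogeneous special case (t dropped), with function names and the toy two-state chain as our own illustrative assumptions.

```python
import numpy as np

def interval_transition(P, H, n_max):
    """Homogeneous version of the recursion (9.1):
    Q(n) = diag(>w(n)) + sum_{m=1}^{n} C(m) Q(n-m), with C(m) = P * H[m]
    (Hadamard product) and >w_i(n) = sum_{m>n} sum_j p_ij h_ij(m).
    H has shape (m_max+1, N, N); H[0] is taken as 0."""
    N = P.shape[0]
    m_max = H.shape[0] - 1
    C = P * H                                          # core matrices C(m)
    # survival function of the waiting times, one row of values per n
    tail = np.array([C[n + 1:].sum(axis=(0, 2)) for n in range(n_max + 1)])
    Q = [np.eye(N)]                                    # Q(0) = I
    for n in range(1, n_max + 1):
        Qn = np.diag(tail[n])
        for m in range(1, min(n, m_max) + 1):
            Qn = Qn + C[m] @ Q[n - m]
        Q.append(Qn)
    return np.array(Q)

# toy chain that alternates states, with holding time 1 or 2 equally likely
P = np.array([[0.0, 1.0], [1.0, 0.0]])
H = np.zeros((3, 2, 2)); H[1] = 0.5; H[2] = 0.5
Q = interval_transition(P, H, 2)
```

Each Q(n) has rows summing to one, as interval transition probabilities must.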

9.2.1 The Homogeneous Case

In the following, we consider the DNA sequence to be a homogeneous semi-Markov chain, so that p_{i,j}(t) = p_{i,j} for all t ∈ ℕ. Furthermore, we assume that DNA sequences do not contain virtual transitions, so that subsequent appearances of the same state count as holding and p_{i,i}(t) = 0 for all i ∈ S, t ∈ ℕ. For our purposes the time parameter indicates position, in keeping with the nature of DNA sequences, whose evolution depends on the index position of each letter in the sequence.

In order to study the d-periodic behaviour of a DNA sequence, we would like to examine the probability of a letter reappearing after d positions. Moreover, for a sequence with strong d-periodic behaviour, it is expected that, for every periodic state, the frequency of the state's appearances every kd positions will be high. Therefore, an interesting question is whether the chain is in the same state not only for the first cycle of length d, but also for a number n of successive cycles of the same length. Thus, we define the following probabilities.

Definition 1 Let p_i(1, d) be the probability that the SMC will be in state i in position d, given that in the initial position it was observed to be in state i, that is,

p_i(1, d) = Prob[the SMC will be in state i in position d / the initial state was observed to be i].

Similarly, we define the probability that the SMC will be in state i every d positions for n cycles, given that in the initial position it was observed to be in state i, as follows:

9 Investigating Some Attributes of Periodicity in DNA Sequences …

183

p_i(n, d) = Prob[the SMC will be in state i every d positions for n cycles / the initial state was observed to be i].

It is important to note that for a given DNA sequence we do not know whether the initial position is due to a letter transition or to a reappearance of the same letter, therefore we have to include both cases in order to calculate the probability above. If we observed the process to be in state i in the initial position, it would be unlikely that upon the first observation the SMC had just entered this state. It is more plausible that we started to observe the process in a position where the entrance to a state had already been achieved. As a result, the process will stay in state i for the remaining positions and then make a transition to state j. The basic parameters of the SMC under random starting concern only the behaviour of the process until the first transition. Hence, let us denote by ${}_{r}p_{i,j}(\cdot)$ the transition probabilities under random starting and by ${}_{r}h_{i,j}(\cdot)$ the distributions of the holding positions under random starting. A more detailed specification of the SMC under random starting can be found in the book of Howard [15].

Lemma 1 Let P(1, d) and P(n, d) be the (N × 1) vectors which consist of the probabilities p_i(1, d) and p_i(n, d), i ∈ S, respectively, following Definition 1. Then,

(a)
$$P(1, d) = \Big[{}^{>}_{r}W(d) + \sum_{x=1}^{d} I \circ {}_{r}C(x)\big[Q(d-x) \circ (U - I)\big]\Big] \cdot \mathbf{1}. \tag{9.2}$$

(b)
$$P(n, d) = P(n-1, d) \circ P(1, d), \tag{9.3}$$

where I is the identity matrix, ${}^{>}_{r}W(d) = \operatorname{diag}\{{}^{>}_{r}w_i(d)\}$ denotes the survival function of the waiting time distribution under random starting, ${}_{r}C(x) = \{{}_{r}c_{i,j}(x)\}$ denotes the core matrix of the SMC under random starting, which consists of the elements ${}_{r}c_{i,j}(x) = {}_{r}p_{i,j} \cdot {}_{r}h_{i,j}(x)$, U = {u_{i,j}} with u_{i,j} = 1 for every i, j ∈ S, and $\mathbf{1} = [1, 1, \ldots, 1]^T$.

Proof Let $S_x = \underbrace{i\, i\, \cdots\, i}_{x\ \text{times}}\ j\, u\, u \cdots u\, i$ be the sequence of states of length d, where

x = 1, 2, …, d, j denotes any state different from i, and u denotes any state from the state space S. For a given sequence, let us now consider the following instances, which are mutually exclusive and exhaustive events:

S_1 = i j u u ⋯ u i
S_2 = i i j u u ⋯ u i
S_3 = i i i j u u ⋯ u i
⋮
S_{d−2} = i i ⋯ i j u i
S_{d−1} = i i i ⋯ i j i
S_d = i i i i i ⋯ i

According to the previous, the semi-Markov chain, with initial observed state i, will be in state i after d positions if either it holds for more than d steps in the initial state, or it makes a transition to a different state j at some position x before the end of the cycle, in any case occupying state i in the final position. Thus, using a probabilistic argument and summing over all possible states and holding times, we conclude that

$$p_i(1, d) = {}^{>}_{r}w_i(d) + \sum_{j\ne i} \sum_{x=1}^{d} {}_{r}c_{i,j}(x)\, q_{j,i}(d-x).$$

Let the element of the ith row of the vector P(1, d) be the probability p_i(1, d). The matrix notation in Eq. (9.2) is immediately deduced by keeping only the non-diagonal elements, i.e. multiplying by the matrix [U − I]. Similarly, concerning Eq. (9.3), let the elements of the vector P(n, d) be the probabilities p_i(n, d). Hence, in order for the SMC to be in the same state after n successive cycles of length d, we have

$$p_i(n, d) = \Big[{}^{>}_{r}w_i(d) + \sum_{j\ne i} \sum_{x=1}^{d} {}_{r}c_{i,j}(x)\, q_{j,i}(d-x)\Big]^n.$$

The matrix form is deduced immediately from the result above. □
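Equation (9.3) says that, with the model parameters held fixed, P(n, d) is the elementwise n-th power of P(1, d), so the ratio of successive cycles is constant; the per-cycle re-estimation used later in Sect. 9.4 is what makes R(n) vary. A minimal numpy illustration (the numeric values are those of P(1, 3) reported for the periodic sequence of Example 1):

```python
import numpy as np

# one-cycle reappearance probabilities p_i(1, d) for S = {A, C, G, T}
P1 = np.array([0.83, 0.18, 0.20, 0.25])

def P_cycles(P1, n):
    """p_i(n, d) = p_i(1, d)^n: Hadamard product of n copies of P(1, d), Eq. (9.3)."""
    return P1 ** n

# with fixed parameters the ratio R(n) = p_i(n, d) / p_i(n-1, d) is constant in n
R = P_cycles(P1, 5) / P_cycles(P1, 4)
```

The state with the largest p_i(1, d) (here A) dominates ever more strongly as n grows.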

Remark 1 For the interval transition probability matrix Q(n), instead of using the recursive formula (9.1), one can apply the closed analytic form proposed by Vassiliou and Papadopoulou [28]:

$$Q(n) = {}^{>}W(n) + C(n) + \sum_{j=2}^{n} \Big\{ C(j-1) + \sum_{k=1}^{j-2} S_j(k, m_k) \Big\} \times \big\{ {}^{>}W(n-j+1) + C(n-j+1) \big\}, \tag{9.4}$$

where

$$S_j(k, m_k) = \sum_{m_k=2}^{j-k}\ \sum_{m_{k-1}=1+m_k}^{j-k+1} \cdots \sum_{m_1=1+m_2}^{j-1}\ \prod_{r=-1}^{k-1} C(m_{k-r-1} - m_{k-r})$$

for j ≥ k + 2, while if j < k + 2 we have S_j(k, m_k) = 0.

9.2.2 The Case of Partial Non-homogeneity

The partial non-homogeneous semi-Markov chain (PNHSMC) is constructed based on the fact that every amino acid is encoded by three nucleotides (a codon). Using this information, we can create three discrete coding positions k ∈ {1, 2, 3}, and for the PNHSMC we have three stochastic matrices P(k), k = 1, 2, 3, for the embedded Markov chain. As in the homogeneous case, it is of interest to find the probability for the PNHSMC to be in the same state after a length of d positions, and also for n successive cycles of length d.


Definition 2 Let p_i(k, 1, d) be the probability that the PNHSMC will be in state i in position d, given that in the initial position it was observed to be in state i, in coding position k, that is,

p_i(k, 1, d) = Prob[the SMC will be in state i in position d / the initial state was observed to be i in coding position k].

Furthermore, we define p_i(k, n, d) to be the probability that the PNHSMC will be in state i every d positions for n cycles, given that in the initial position it was observed to be in state i, in coding position k, that is,

p_i(k, n, d) = Prob[the SMC will be in state i every d positions for n cycles / the initial state was observed to be i in coding position k].

Lemma 2 Let P(k, 1, d) and P(k, n, d) be (N × 1) vectors, consisting of the probabilities p_i(k, 1, d) and p_i(k, n, d), i ∈ S, respectively, following Definition 2. Then

(a)
$$P(k, 1, d) = \Big[{}^{>}_{r}W(k, d) + \sum_{x=1}^{d} I \circ {}_{r}C(k, x)\big[Q\big((k+x) \bmod s,\ d-x\big) \circ (U - I)\big]\Big] \cdot \mathbf{1}, \tag{9.5}$$

(b)
$$P(k, n, d) = P(k, n-1, d) \circ P(k, 1, d), \tag{9.6}$$

where ${}^{>}_{r}W(k, d) = \operatorname{diag}\{{}^{>}_{r}w_i(k, d)\}$ denotes the survival function of the waiting time distribution of the PNHSMC under random starting, ${}_{r}C(k, x) = \{{}_{r}c_{i,j}(k, x)\}$ denotes the core matrix of the PNHSMC under random starting, which consists of the elements ${}_{r}c_{i,j}(k, x) = {}_{r}p_{i,j}(k) \cdot {}_{r}h_{i,j}(x)$, and U = {u_{i,j}} with u_{i,j} = 1.

Proof Let

$$S_x = \underbrace{i_k\, i_{k+1} \cdots i_{(k+x-1) \bmod s}}_{x\ \text{times}}\ j_{(k+x) \bmod s}\ u_{(k+x+1) \bmod s} \cdots u_{(k+d-1) \bmod s}\ i_{(k+d) \bmod s}$$

be the sequence of states of length d, where x = 1, 2, …, d, j denotes any state different from i, u denotes any state from the state space S, k denotes the coding position, and s denotes the total number of different coding positions. For a given sequence, let us define the following instances, which are mutually exclusive and exhaustive events:


S_1 = i_k j_{k+1} u_{k+2} u_{k+3} u_{k+4} ⋯ u_{(k+d−1) mod s} i_{(k+d) mod s}
S_2 = i_k i_{k+1} j_{k+2} u_{k+3} u_{k+4} ⋯ u_{(k+d−1) mod s} i_{(k+d) mod s}
S_3 = i_k i_{k+1} i_{k+2} j_{k+3} u_{k+4} ⋯ u_{(k+d−1) mod s} i_{(k+d) mod s}
⋮
S_{d−2} = i_k i_{k+1} i_{k+2} ⋯ j_{(k+d−2) mod s} u_{(k+d−1) mod s} i_{(k+d) mod s}
S_{d−1} = i_k i_{k+1} i_{k+2} i_{k+3} ⋯ j_{(k+d−1) mod s} i_{(k+d) mod s}
S_d = i_k i_{k+1} i_{k+2} i_{k+3} i_{k+4} i_{k+5} ⋯ i_{(k+d) mod s}.

The PNHSMC, with initial observed state i in coding position k, will be in state i after d positions if either it holds for more than d positions in the initial state, or it moves to a different state j at position (k + x) mod s before the end of the cycle, in any case occupying state i in the final position. Thus, using a probabilistic argument and summing over all possible states and holding positions, we obtain

$$p_i(k, 1, d) = {}^{>}_{r}w_i(k, d) + \sum_{j\ne i} \sum_{x=1}^{d} {}_{r}c_{i,j}(k, x)\, q_{j,i}\big((k+x) \bmod s,\ d-x\big).$$

Let the element of the ith row of the vector P(k, 1, d) be the probability p_i(k, 1, d). The matrix notation in Eq. (9.5) is deduced immediately by multiplying with the matrix [U − I]. Similarly, concerning Eq. (9.6), let the elements of the vector P(k, n, d) be the probabilities p_i(k, n, d). In order for the PNHSMC to be in the same state after n successive cycles of length d, we have

$$p_i(k, n, d) = \Big[{}^{>}_{r}w_i(k, d) + \sum_{j\ne i} \sum_{x=1}^{d} {}_{r}c_{i,j}(k, x)\, q_{j,i}\big((k+x) \bmod s,\ d-x\big)\Big]^n.$$

The matrix form in (9.6) is deduced immediately by applying the Hadamard product over n matrices of the form P(k, 1, d). □

Remark 2 For the interval transition probability matrix Q(t, n), instead of using the recursive formula, we can apply the closed analytic form [28]:

$$Q(k, n) = {}^{>}W(k, n) + C(k, n) + \sum_{j=2}^{n} \Big\{ C(k, j-1) + \sum_{x=1}^{j-2} S_j(x, k, m_x) \Big\} \times \big\{ {}^{>}W(k+j-1,\ n-j+1) + C(k+j-1,\ n-j+1) \big\}, \tag{9.7}$$

where

$$S_j(x, k, m_x) = \sum_{m_x=2}^{j-x}\ \sum_{m_{x-1}=1+m_x}^{j-x+1} \cdots \sum_{m_1=1+m_2}^{j-1}\ \prod_{r=-1}^{x-1} C\big(k + m_{x-r} - 1,\ m_{x-r-1} - m_{x-r}\big)$$

for j ≥ x + 2, while if j < x + 2 we have S_j(x, k, m_x) = 0.


9.3 Quasiperiodicity

The previous results, for both the homogeneous and the non-homogeneous case, give the probability of a state i reappearing after d positions and over n successive cycles. However, for the model to be more coherent, we also have to include the event that the periodicity is not strict and state i does not appear exactly after d positions, but in the interval (d − ε, d + ε). Also, we are interested in the quasiperiodic behaviour of the SMC not only for a cycle of length d, but also for a number n of successive cycles. For simplicity we assume that ε = 1, although the results for ε > 1 are straightforward. For this purpose, let us define the entrance probabilities under random starting ${}_{r}e_{i,j}(n)$, which are the probabilities that the SMC will enter state j at position n, given that in the initial position the SMC was observed to be in state i [15]. The equation for calculating these probabilities is

$${}_{r}e_{i,j}(n) = \delta_{i,j}\,\delta(n) + \sum_{r\in S} \sum_{m=0}^{n} {}_{r}c_{i,r}(m)\, e_{r,j}(n-m).$$

Furthermore, let us define the first passage time probabilities f_{i,j}(n), which are the probabilities that the SMC will transition to state j for the first time after n positions, given that it entered state i in the initial position [15]. The recursive formula for the probabilities f_{i,j}(n) is

$$f_{i,j}(n) = \sum_{r\ne j} \sum_{m=0}^{n} p_{i,r}\, h_{i,r}(m)\, f_{r,j}(n-m) + p_{i,j}\, h_{i,j}(n).$$
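The two recursions above can be evaluated jointly. The sketch below implements the ordinary (non-random-starting) versions for a homogeneous SMC; the random-starting variants would replace the first core matrix by the core matrix under random starting. Function names and the deterministic two-state test chain are our own assumptions.

```python
import numpy as np

def entrance_and_first_passage(P, H, n_max):
    """Entrance probabilities e_ij(n) and first passage probabilities f_ij(n)
    for a homogeneous SMC with embedded matrix P and holding-time
    distributions H[m] = {h_ij(m)} (H[0] is taken as 0)."""
    N = P.shape[0]
    m_max = H.shape[0] - 1
    C = P * H                                   # core matrices c_ij(m) = p_ij h_ij(m)
    E = np.zeros((n_max + 1, N, N))
    F = np.zeros((n_max + 1, N, N))
    E[0] = np.eye(N)                            # e_ij(0) = delta_ij
    for n in range(1, n_max + 1):
        if n <= m_max:
            F[n] = C[n]                         # direct first transition i -> j at step n
        for m in range(1, min(n, m_max) + 1):
            E[n] += C[m] @ E[n - m]             # enter r at step m, then enter j later
            d = np.diagonal(F[n - m])           # f_jj(n - m), to exclude r = j below
            F[n] += C[m] @ F[n - m] - C[m] * d[np.newaxis, :]
    return E, F

# two-state chain that alternates deterministically in one step
P = np.array([[0.0, 1.0], [1.0, 0.0]])
H = np.zeros((2, 2, 2)); H[1] = 1.0             # every holding time equals 1
E, F = entrance_and_first_passage(P, H, 3)
```

For this chain the first return to the starting state occurs exactly at step 2, and no first passage happens at step 3.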

Definition 3 Let ε p_i(1, d), assuming ε = 1, be the probability that the SMC will be in state i at least once in the position interval d ± ε, given that in the initial position the SMC was observed to be in state i. Also, let ε p_i(n, d) be the probability that the SMC will be in state i in the interval (d − 1, d + 1) for n successive cycles, that is,

(a)
ε p_i(1, d) = Prob[the SMC will be in state i either in position d − 1, or d, or d + 1 / the initial state was observed to be i], (9.8)

(b)
ε p_i(n, d) = Prob[the SMC will be in state i either in position d − 1, or d, or d + 1, for n cycles / the initial state was observed to be i]. (9.9)

Theorem 1 Let ε P(1, d) and ε P(n, d) be (N × 1) vectors, consisting of the probabilities ε p_i(1, d) and ε p_i(n, d), i ∈ S, respectively, following Definition 3. Then

(a)
$$_{\varepsilon}P(1, d) = P(d-1) + \Big[\sum_{m=1}^{d-1} I \circ \Big({}_{r}E(m)\big[(F(d-m) + F(d+1-m)) \circ (U - I)\big]\Big)\Big] \cdot \mathbf{1}, \tag{9.10}$$

(b)
$$_{\varepsilon}P(n, d) = {}_{\varepsilon}P(n-1, d) \circ \Big\{ P(d-1) + \Big[\sum_{m=1}^{d-1} I \circ \Big({}_{r}E(m)\big[(F(d-m) + F(d+1-m)) \circ (U - I)\big]\Big)\Big] \cdot \mathbf{1} \Big\}, \tag{9.11}$$


where ${}_{r}E(\cdot) = \{{}_{r}e_{i,j}(\cdot)\}$ is the matrix which consists of the entrance probabilities under random starting and F(·) = {f_{i,j}(·)} is the matrix of the first passage time probabilities.

Proof Let us define the events A_0, A_1, A_2 as

A_0 = [the SMC is in state i in position d − 1 / the initial state was observed to be i],
A_1 = [the SMC is in state i in position d and in state r ≠ i in position d − 1 / the initial state was observed to be i],
A_2 = [the SMC is in state i in position d + 1 and in state r ≠ i in positions d − 1 and d / the initial state was observed to be i].

Schematically, we can visualize these events as the following sequences:

A_0 = i u u u ⋯ u i
A_1 = i u u u ⋯ u r i
A_2 = i u u u ⋯ u r r i,

where u denotes any state from the state space S, r denotes a state different from i, and the final i of A_0 occupies position d − 1. It is obvious that the events are mutually exclusive, therefore Prob[A_0 ∪ A_1 ∪ A_2] = Prob[A_0] + Prob[A_1] + Prob[A_2]. The probability of the event A_0 is

Prob[A_0] = p_i(1, d − 1) = Prob[the SMC will be in state i in position d − 1 / the initial state was observed to be i].

For the event A_1 to happen, the SMC must be in a state r ≠ i in position d − 1 and transition to state i in position d. Therefore, the SMC could have entered state r ≠ i at a position m ≤ d − 1 and then transitioned to state i for the first time after the remaining d − m positions. Using a probabilistic argument and summing over all the different positions and states, we deduce

$$\text{Prob}[A_1] = \sum_{r\ne i} \sum_{m=0}^{d-1} {}_{r}e_{i,r}(m)\, f_{r,i}(d-m).$$

Similarly, for the event A_2,

$$\text{Prob}[A_2] = \sum_{r\ne i} \sum_{m=0}^{d-1} {}_{r}e_{i,r}(m)\, f_{r,i}(d+1-m).$$

For the sum of the probabilities of the three events we derive the following expression:

$$\begin{aligned}
{}_{\varepsilon}p_i(d) &= \text{Prob}[A_0] + \text{Prob}[A_1] + \text{Prob}[A_2]\\
&= p_i(d-1) + \sum_{r\ne i} \sum_{m=0}^{d-1} {}_{r}e_{i,r}(m)\, f_{r,i}(d-m) + \sum_{r\ne i} \sum_{m=0}^{d-1} {}_{r}e_{i,r}(m)\, f_{r,i}(d+1-m)\\
&= p_i(d-1) + \sum_{r\ne i} \sum_{m=0}^{d-1} {}_{r}e_{i,r}(m)\big[f_{r,i}(d-m) + f_{r,i}(d+1-m)\big].
\end{aligned}$$

This equation can be written in matrix form as

$$_{\varepsilon}P(d) = P(d-1) + \Big[\sum_{m=1}^{d-1} I \circ \Big({}_{r}E(m)\big[(F(d-m) + F(d+1-m)) \circ (U - I)\big]\Big)\Big] \cdot \mathbf{1}.$$

Finally, by applying Lemmas 1 and 2, we derive the corresponding equations for the probabilities ε p_i(n, d), which in matrix notation read

$$_{\varepsilon}P(n, d) = {}_{\varepsilon}P(n-1, d) \circ \Big\{ P(d-1) + \Big[\sum_{m=1}^{d-1} I \circ \Big({}_{r}E(m)\big[(F(d-m) + F(d+1-m)) \circ (U - I)\big]\Big)\Big] \cdot \mathbf{1} \Big\}. \qquad \square$$
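The quasiperiodic window probability can be sanity-checked by simulation. The sketch below uses a toy two-state Markov chain (a special SMC with geometric holding times) as a stand-in for the model, an assumption made only for this illustration; the function name and the transition matrix are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy two-state Markov chain used as a stand-in for the SMC
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

def window_prob(P, i, d, eps=1, trials=20000):
    """Monte Carlo estimate of the quasiperiodic probability: starting from
    state i, the chain visits i at least once among positions d-eps..d+eps."""
    hits = 0
    for _ in range(trials):
        s, visited = i, False
        for t in range(1, d + eps + 1):
            s = rng.choice(len(P), p=P[s])
            if abs(t - d) <= eps and s == i:
                visited = True
        hits += visited
    return hits / trials

est = window_prob(P, 0, 5)
```

By construction the window probability dominates the strict-periodicity probability of being in state i exactly at position d.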



9.4 Illustrations of Real and Synthetic Data

For the illustrations of the homogeneous semi-Markov model, synthetic DNA sequences as well as real genomic and mRNA sequences were used. The coding sequence used was the human dystrophin mRNA, and the non-coding sequence used for comparison was the human b-nerve growth factor gene (BNGF). These sequences have already been examined using spectral density analysis by Tsonis [27]. We assumed that each sequence could be described by a homogeneous semi-Markov chain {X_t}_{t=0}^∞ with state space S = {A, C, G, T}, where the index t denotes the position of each nucleotide in the sequence. The basic parameters p_{i,j}(k) and h_{i,j}(m) of the SMC were estimated using the empirical estimators

$$\hat{p}_{i,j}(k) = \frac{N(i(k) \to j)}{\sum_{x\in S} N(i(k) \to x)} \quad \text{and} \quad \hat{h}_{i,j}(m) = \frac{N(i \to j, m)}{\sum_{x\in S} N(i \to x, m)},$$

where N(i(k) → j) denotes the number of transitions from state i to state j starting from coding position k, and N(i → j, m) denotes the number of transitions from state i to state j after the SMC remained in state i for m positions. In order to estimate the initial condition, that is, the probabilities of the vector P(1, d), the first 10 cycles of length 3 have been used and the basic parameters P and H(m) have been estimated. After that, for each cycle n, the core matrix C(m) has been re-estimated using the letters of the sequence up to position 30 + n · d. This procedure has been implemented to correct the estimations, as in the current application the length of each period is small (d = 3), resulting in an inadequate sample size for each cycle. However, if we were interested in examining the periodic behaviour for larger periods, this correction procedure would not be necessary. Finally, the probability for the chain to be in the same state every n · d positions has been calculated using the recursive equation for P(n, d). Let us define the ratio

$$R(n) = \big(\big[P(n-1, d)\,\mathbf{1}^T\big] \circ I\big)^{-1} \cdot P(n, d), \quad \mathbf{1} = [1, 1, \ldots, 1]^T.$$

The quantity R(n) is an (N × 1) vector whose ith element is the ratio of the probability p_i(n, d) over p_i(n − 1, d) for every n; it illustrates the variation between the probabilities p_i(n, d) and p_i(n − 1, d), in order to investigate the periodicity over a number of cycles. It is obvious that the probabilities p_i(n, d) converge to zero, as they are a product of n probabilities. The most important quantities in the periodicity investigation are the initial probability vector P(1, d), which contains the probabilities for the chain to be in the same state after d positions, and the ratio R(n), which measures the relationship between the probabilities of the current cycle and the previous one using the correction procedure. For higher values of R(n), the probabilities p_i(n, d) decrease at a slower rate, while for lower values of R(n), the probabilities p_i(n, d) converge to zero faster.
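The empirical estimators above can be sketched as follows; this is a homogeneous simplification that ignores the coding position k of p_{i,j}(k), and the function name is our own.

```python
from collections import defaultdict

def estimate_smc(seq):
    """Empirical estimators of p_ij and h_ij(m) from one observed sequence,
    counting runs: a run of m copies of i followed by j gives one (i -> j, m).
    h is normalized as in the text: h_ij(m) = N(i -> j, m) / sum_x N(i -> x, m)."""
    N = defaultdict(int)                 # N(i -> j, m)
    k = 0
    while k < len(seq):
        m = 1
        while k + m < len(seq) and seq[k + m] == seq[k]:
            m += 1                       # holding time of the current run
        if k + m < len(seq):
            N[(seq[k], seq[k + m], m)] += 1
        k += m
    p_tot = defaultdict(int)             # sum over destinations and m, per origin i
    h_tot = defaultdict(int)             # sum over destinations, per (i, m)
    for (i, j, m), c in N.items():
        p_tot[i] += c
        h_tot[(i, m)] += c
    p = {}
    for (i, j, m), c in N.items():
        p[(i, j)] = p.get((i, j), 0) + c / p_tot[i]
    h = {(i, j, m): c / h_tot[(i, m)] for (i, j, m), c in N.items()}
    return p, h

p, h = estimate_smc("AABAB")             # runs: A(2)->B, B(1)->A, A(1)->B
```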

9.4.1 DNA Sequences of Synthetic Data

Example 1 (Comparison between random and periodic DNA sequences) Let L be a DNA sequence of length N = 1000 of the form L = {U, U, U, …, U}, where the letter U corresponds to any nucleotide drawn from a uniform distribution; thus Prob[U = A] = Prob[U = C] = Prob[U = G] = Prob[U = T] = 1/4. Such a sequence should not exhibit any periodic behaviour; nevertheless the probability vector P(n, d), for d = 3, is estimated for comparison. The estimate of the embedded Markov matrix is

$$P = \begin{pmatrix} 0 & 0.2 & 0.8 & 0\\ 0.375 & 0 & 0.5 & 0.125\\ 0.125 & 0.5 & 0 & 0.375\\ 0.25 & 0.75 & 0 & 0 \end{pmatrix}$$

and the core matrices C(m) are

$$C(1) = \begin{pmatrix} 0 & 0 & 0.8 & 0\\ 0.375 & 0 & 0.5 & 0.125\\ 0.125 & 0.375 & 0 & 0.375\\ 0.25 & 0.5 & 0 & 0 \end{pmatrix}, \quad C(2) = \begin{pmatrix} 0 & 0.2 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0.125 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix},$$


Fig. 9.1 R(n) for the synthetic DNA sequence of a uniform distribution

while the only non-zero element of C(3) is c_{4,2}(3) = 0.25. The initial condition is

$$P(1, 3) = \begin{pmatrix} 0.32\\ 0.34\\ 0.42\\ 0.27 \end{pmatrix}.$$

Figure 9.1 visualizes the ratio R(n) for the whole sequence. We observe that, as expected, there is no clear tendency for any state to achieve a stronger periodic behaviour compared to the other states.

Now, let L be a DNA sequence of length N = 1000 of the form L = {A, U, U, A, U, U, …}, where the letter A corresponds to adenine and the letter U corresponds to any nucleotide from a uniform distribution, so that Prob[U = A] = Prob[U = C] = Prob[U = G] = Prob[U = T] = 1/4. We investigate the periodic behaviour of period d = 3. One can notice that the letter A can have a non-zero waiting time probability w_A(m) for every m. On the other hand, for the other three letters C, G, T, the waiting time probabilities are zero for m greater than two, since among every three letters the letter A always appears at least once. The estimated embedded Markov transition matrix is

$$P = \begin{pmatrix} 0 & 0.30 & 0.30 & 0.40\\ 0.73 & 0 & 0.15 & 0.12\\ 0.69 & 0.17 & 0 & 0.14\\ 0.70 & 0.14 & 0.16 & 0 \end{pmatrix}$$

and the core matrices are

$$C(1) = \begin{pmatrix} 0 & 0.19 & 0.16 & 0.27\\ 0.60 & 0 & 0.15 & 0.13\\ 0.56 & 0.17 & 0 & 0.15\\ 0.50 & 0.14 & 0.16 & 0 \end{pmatrix}, \quad C(2) = \begin{pmatrix} 0 & 0.08 & 0.11 & 0.09\\ 0.13 & 0 & 0 & 0\\ 0.13 & 0 & 0 & 0\\ 0.20 & 0 & 0 & 0 \end{pmatrix},$$

while the matrices C(m) for m > 2 have non-zero elements only in the first row, that is, for the letter A. The initial condition


is

$$P(1, 3) = \begin{pmatrix} 0.83\\ 0.18\\ 0.20\\ 0.25 \end{pmatrix}.$$

Fig. 9.2 R(n) for the synthetic DNA sequence with 3-base periodicity of adenine

The probability for the chain to be in state A every d = 3 positions, starting from state A, is greater than for the other three states, as expected. This is also confirmed by the ratio presented in Fig. 9.2, which shows that state A exhibits higher values compared to the other states.

Example 2 (Detection of periodic regions inside a sequence) Let L be a DNA sequence of length N = 5000 of the form L = {U, U, U, …, U}, where the letter U corresponds to a random nucleotide from a uniform distribution. In the position intervals 1500–2000 and 3000–3500, which correspond to the 3-base cycles 500–666 and 1000–1166 respectively, the letter U has been substituted with the letter A, starting from the first position of the interval and at every 3 positions thereafter. Figure 9.3 shows the values of the ratio R(n) for the letter A, where the green regions are the 3-base cycles in which R(n) is increasing, while the red regions are the 3-base cycles in which R(n) decreases. It is observed that the regions in which we have synthetically added periodic behaviour for the letter A have an increasing ratio R(n) for A, indicating that in these regions the periodic behaviour of A is stronger.
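A raw empirical check of the 3-base reappearance frequencies for a synthetic sequence like those above can be done by direct counting; this measures plain frequencies, not the model-based p_i(1, d), and the function name is our own.

```python
import numpy as np

rng = np.random.default_rng(1)
letters = np.array(list("ACGT"))

# synthetic sequence L = {A, U, U, A, U, U, ...} as in Example 1: every third
# position is forced to A, the remaining letters are uniform on {A, C, G, T}
L = rng.choice(letters, size=1000)
L[::3] = "A"

def reappearance_freq(seq, base, d=3):
    """Empirical frequency of observing `base` again d positions after each
    occurrence of `base` (a sanity check, not the semi-Markov computation)."""
    idx = np.flatnonzero(seq[:-d] == base)
    return float(np.mean(seq[idx + d] == base))

f_A = reappearance_freq(L, "A")
f_C = reappearance_freq(L, "C")
```

As in Fig. 9.2, adenine dominates: every A at a multiple-of-3 position is surely followed by A three positions later, while the other letters reappear only by chance.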

9.4.2 DNA Sequences of Real Data

The information about the periodic behaviour of the coding regions of the genome could possibly be used to distinguish these regions within a DNA sequence of great length. For the coding sequence of real DNA, the human dystrophin mRNA has been used, while for the non-coding region the human b-nerve growth


Fig. 9.3 R(n) of the letter A of the synthetic sequence with periodicity in the cycles 500-666 and 1000-1166

Fig. 9.4 R(n) for the human dystrophin mRNA sequence

factor has been used. These sequences have lengths greater than 5000 bases and have already been studied for periodic behaviour [27]. One can notice from Fig. 9.4 that, for the human dystrophin mRNA sequence, the nucleotide A has a higher chance to reappear every 3 positions, while all the other nucleotides behave almost identically. The ratio for the nucleotide A is higher compared to the other three states for the human dystrophin mRNA sequence, indicating stronger periodic behaviour for adenine. However, Fig. 9.5 indicates that for the human b-nerve growth factor gene, which consists of more than 90% intronic sequence, the results are similar to those of the random sequence created in the first example.


Fig. 9.5 R(n) for the human b-nerve growth factor sequence

9.5 Conclusion

In the present paper, a method is developed to investigate some attributes related to the periodicity of DNA sequences. The applied model is a semi-Markov chain with a discrete, finite state space and discrete time, where the elements of the state space are the four nucleotides, i.e. S = {A, C, G, T}, and time denotes the index position in the sequence. The purpose of the model is to describe the periodic behaviour of a given DNA sequence, something that could possibly discriminate between coding and non-coding regions. It is known that the coding regions of the genome have a different structure from the non-coding regions, as they exhibit a characteristic tendency of repetition of some nucleotides every 3 bases. Considering this fact and modelling a DNA sequence by a semi-Markov chain, a recursive equation is constructed that can be used as an identification tool for regions with strong or weak d-periodic behaviour. The corresponding probabilities are calculated in terms of the basic parameters of the model in closed analytic form. The theoretical results are also generalized to the non-homogeneous case, considering the triplet nature of DNA and assuming that each coding position corresponds to a different transition matrix P(k). In addition, the case of quasiperiodicity of a state is examined. This theory is developed considering the fact that small perturbations in the cycle of the period may appear, such as a shift of the position of a letter due to genetic mutations, which lead the chain to lose its periodic behaviour for a number of cycles. Therefore, the state will appear not exactly after a period of d positions, but within a radius of d ± ε positions. The numerical results of the implementation of the model on actual data confirmed previous studies, as it was apparent that periodic behaviour is a characteristic of the coding segments, unlike the non-coding segments, which did not show similar behaviour.
For the estimation of the parameters, a correction procedure was applied, due to the short duration of the period (d = 3) in the specific application. The approach could potentially be used as an initial method for investigating periodicity in any DNA sequence, and it could also be used to separate two different DNA segments in terms of their periodic behaviour. Although the examples produced satisfactory results, they should be interpreted with caution, due to the complexity of the structure of DNA and its various peculiarities. For example, additional parameters could be included in the model, such as the sequence length, the frequencies of each nucleotide, the open reading frames (ORFs), the target organism, specific mutations, and others.

References

1. Almagor, H.: A Markov analysis of DNA sequences. J. Theor. Biol. 104(4), 633–645 (1983)
2. Almirantis, Y.: A standard deviation based quantification differentiates coding from non-coding DNA sequences and gives insight to their evolutionary history. J. Theor. Biol. 196(3), 297–308 (1999)
3. Avery, P.J., Henderson, D.A.: Fitting Markov chain models to discrete state series such as DNA sequences. J. R. Stat. Soc.: Ser. C (Appl. Stat.) 48(1), 53–61 (1999)
4. Bartholomew, D., Forbes, A., McClean, S.: Statistical Techniques for Manpower Planning. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. Wiley (1991)
5. Benson, G.: Tandem repeats finder: a program to analyze DNA sequences. Nucl. Acids Res. 27(2), 573–580 (1999)
6. Burge, C., Karlin, S.: Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268(1), 78–94 (1997)
7. Chechetkin, V.R., Turygin, A.Yu.: Search of hidden periodicities in DNA sequences. J. Theor. Biol. 175(4), 477–494 (1995)
8. Chechetkin, V.R., Turygin, A.Yu.: On the spectral criteria of disorder in nonperiodic sequences: application to inflation models, symbolic dynamics and DNA sequences. J. Phys. A: Math. Gen. 27(14), 4875–4898 (1994)
9. Cheever, E.A., Overton, G.C., Searls, D.B.: Fast Fourier transform-based correlation of DNA sequences using complex plane encoding. Comput. Appl. Biosci.: CABIOS 7(2), 143–154 (1991)
10. Cohanim, A.B., Trifonov, E.N., Kashi, Y.: Specific selection pressure at the third codon positions: contribution to 10- to 11-base periodicity in prokaryotic genomes. J. Mol. Evol. 63(3), 393–400 (2006)
11. D'Amico, G., Petroni, F., Prattico, F.: First and second order semi-Markov chains for wind speed modeling. Phys. A: Stat. Mech. Its Appl. 392(5), 1194–1201 (2013)
12. Eskesen, S.T., Eskesen, F.N., Kinghorn, B., Ruvinsky, A.: Periodicity of DNA in exons. BMC Mol. Biol. 5(1), 12 (2004)
13. Garden, P.W.: Markov analysis of viral DNA/RNA sequences. J. Theor. Biol. 82(4), 679–684 (1980)
14. Herzel, H., Weiss, O., Trifonov, E.N.: 10–11 bp periodicities in complete genomes reflect protein structure and DNA folding. Bioinformatics 15(3), 187–193 (1999)
15. Howard, R.A.: Dynamic Probabilistic Systems, vol. 2. Courier Corporation (1971)
16. Janssen, J.: Semi-Markov Models: Theory and Applications. Springer (1999)
17. Janssen, J., Manca, R.: Applied Semi-Markov Processes. Springer Science & Business Media (2006)
18. Papadopoulou, A.: Counting transitions–entrance probabilities in non-homogeneous semi-Markov systems. Appl. Stoch. Models Data Anal. 13(3–4), 199–206 (1997)
19. Papadopoulou, A.A.: Some results on modeling biological sequences and web navigation with a semi Markov chain. Commun. Stat.-Theory Methods 42(16), 2853–2871 (2013)
20. Provata, A., Almirantis, Y.: Scaling properties of coding and non-coding DNA sequences. Phys. A: Stat. Mech. Its Appl. 247(1–4), 482–496 (1997)
21. Reinert, G., Schbath, S., Waterman, M.S.: Probabilistic and statistical properties of words: an overview. J. Comput. Biol. 7(1–2), 1–46 (2000)
22. Salih, B., Tripathi, V., Trifonov, E.N.: Visible periodicity of strong nucleosome DNA sequences. J. Biomol. Struct. Dyn. 33(1), 1–9 (2015)
23. Schbath, S., Prum, B., De Turckheim, E.: Exceptional motifs in different Markov chain models for a statistical analysis of DNA sequences. J. Comput. Biol. 2(3), 417–437 (1995)
24. Tavare, S., Giddings, B.W.: Some statistical aspects of the primary structure of nucleotide sequences. In: Waterman, M.S. (ed.) Mathematical Methods for DNA Sequences (1989)
25. Trifonov, E.N.: 3-, 10.5-, 200- and 400-base periodicities in genome sequences. Phys. A: Stat. Mech. Its Appl. 249(1–4), 511–516 (1998)
26. Trifonov, E.N., Sussman, J.L.: The pitch of chromatin DNA is reflected in its nucleotide sequence. Proc. Natl. Acad. Sci. 77(7), 3816–3820 (1980)
27. Tsonis, A.A., Elsner, J.B., Tsonis, P.A.: Periodicity in DNA coding sequences: implications in gene evolution. J. Theor. Biol. 151(3), 323–331 (1991)
28. Vassiliou, P.C.G., Papadopoulou, A.: Non-homogeneous semi-Markov systems and maintainability of the state sizes. J. Appl. Probab. 29(3), 519–534 (1992)
29. Waterman, M.: Introduction to Computational Biology: Maps, Sequences, and Genomes. Interdisciplinary Statistics. Chapman & Hall/CRC, New York (1995)
30. Wu, T.J., Hsieh, Y.C., Li, L.A.: Statistical measures of DNA sequence dissimilarity under Markov chain models of base composition. Biometrics 57(2), 441–448 (2001)
31. Yin, C., Wang, J.: Periodic power spectrum with applications in detection of latent periodicities in DNA sequences. J. Math. Biol. 73(5), 1053–1079 (2016)
32. Yin, C., Yau, S.S.T.: Prediction of protein coding regions by the 3-base periodicity analysis of a DNA sequence. J. Theor. Biol. 247(4), 687–694 (2007)

Chapter 10

Limit Theorems of Baxter Type for Generalized Random Gaussian Processes with Independent Values

Sergey Krasnitskiy, Oleksandr Kurchenko, and Olga Syniavska

Abstract In this paper we obtain a Baxter-type theorem for Gaussian generalized random processes with independent values. The main results are formulated in terms of a general representation of the covariance functionals of such processes. Sufficient conditions, in the same terms, for the singularity of the probability measures corresponding to such processes are also given.

Keywords Lévy–Baxter theorems · Generalized random process · Test function · Representation of the covariance functional · Singularity of measures

MSC 2020 60G15

10.1 Introduction

Let $K$ be the space of compactly supported infinitely differentiable functions (another common notation is $D$) [1–3], let $\{b(n),\ n = 1, 2, \ldots\}$ be a monotone unbounded sequence of natural numbers, and let

$$\left\{\chi_{k,n};\ k = 0, 1, \ldots, b(n) - 1,\ n = 1, 2, \ldots\right\} \tag{10.1}$$

be a family of functions from the space $K$ such that

$$\operatorname{supp} \chi_{k,n} \subset \left[\frac{k}{b(n)}, \frac{k+1}{b(n)}\right], \quad 0 \le k \le b(n) - 1,\ n \in \mathbb{N}.$$

Further, let $\xi = (\xi, \varphi),\ \varphi \in K$, be a generalized random process over the space $K$ [4]. The random variable

$$S_n(\xi) = S_n(\xi, \chi_{k,n}) = \sum_{k=0}^{b(n)-1} (\xi, \chi_{k,n})^2 \tag{10.2}$$

will be called the quadratic Baxter sum (hereafter simply the Baxter sum) of the generalized random process $\xi$ relative to the function collection $\{\chi_{0,n}, \chi_{1,n}, \ldots, \chi_{b(n)-1,n}\}$. Statements about the convergence, in some sense, of the sequence of Baxter sums to a non-random constant will be called Baxter-type theorems for generalized random processes.

Baxter-type theorems (also called Lévy–Baxter theorems) date back to the results of Lévy [5] for standard Brownian motion and of Baxter [6] for a much wider class of Gaussian random processes. Subsequently, the convergence of sequences of Baxter sums to a non-random constant for random processes and fields was investigated by many mathematicians. Among the works on Lévy–Baxter theorems for Gaussian random processes we note [7–9]. Theorems of Baxter type for Gaussian random fields were obtained in [10–12], among others. Lévy–Baxter theorems for non-Gaussian processes were investigated in [13], and for one class of non-Gaussian random fields in [14]. Baxter-type theorems for generalized random functions were obtained in the papers [15–19].

In this paper we obtain a Baxter-type theorem for generalized random Gaussian processes with independent values. In a certain sense, such processes are analogues of regular processes with independent increments and play an essential role in the theory of stochastic differential equations [20] and Markov random fields [21].

S. Krasnitskiy (B) — Kyiv National University of Technology and Design, Kyiv, Ukraine; e-mail: [email protected]
O. Kurchenko — Kyiv National Taras Shevchenko University, Kyiv, Ukraine; e-mail: [email protected]
O. Syniavska — Uzhhorod National University, Uzhhorod, Ukraine; e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_10
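As a simple illustration (our own, not from the paper), take $\xi$ to be Gaussian white noise on $[0, 1]$, i.e. the generalized process with covariance functional $B(\varphi, \psi) = \int_0^1 \varphi \psi\, dx$. For a family (10.1) with $\int \chi^2_{k,n}\, dx = 1/b(n)$, the pairings $(\xi, \chi_{k,n})$ have disjoint supports and hence are independent $N(0, 1/b(n))$ variables, so the Baxter sum (10.2) concentrates at $1$. A minimal Monte Carlo sketch (all parameter choices are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
b = 5000                      # b(n): number of subintervals
h = 1.0 / b                   # common variance of the pairings
# White noise has independent values: (xi, chi_{k,n}) are independent N(0, h)
pairings = rng.normal(0.0, np.sqrt(h), size=b)
S_n = np.sum(pairings**2)     # quadratic Baxter sum (10.2)
print(S_n)                    # concentrates near 1 as b(n) grows
```

The standard deviation of `S_n` here is of order $\sqrt{2/b(n)}$, which is why the sum is tightly concentrated around the limit.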

10.2 The Covariance Functional of a Generalized Random Process with Independent Values

Definition 1 A generalized random process $\xi = (\xi, \varphi),\ \varphi \in K$, is said to be a generalized random process with independent values if the random variables $(\xi, \varphi)$ and $(\xi, \psi)$ are independent for arbitrary functions $\varphi, \psi \in K$ whose supports have no common interior points.

Assume, without loss of generality, that the mean of the generalized random process $\xi$ is zero: $E(\xi, \varphi) = 0,\ \varphi \in K$. We denote by $B(\varphi, \psi),\ \varphi, \psi \in K$, the covariance functional of the generalized random process $\xi = (\xi, \varphi),\ \varphi \in K$: $B(\varphi, \psi) = E\left((\xi, \varphi)(\xi, \psi)\right)$.

Theorem 1 ([4, p. 355]) The covariance functional $B(\varphi, \psi)$ of a generalized random process with independent values is given by the formula

$$B(\varphi, \psi) = \int_{-\infty}^{+\infty} \sum_{j,k} R_{jk}(x)\, \varphi^{(j)}(x)\, \psi^{(k)}(x)\, dx, \quad \varphi, \psi \in K, \tag{10.3}$$

where the functions $R_{jk}$ are continuous on $\mathbb{R}$ and only a finite number of the $R_{jk}$ are non-zero on each finite segment.

Remark 1 ([4, p. 356]) For every positive definite bilinear functional of the form (10.3) there exists a generalized Gaussian random process with independent values whose covariance functional is (10.3). In what follows we consider only the Gaussian case, i.e. all finite-dimensional distributions of the processes under consideration are Gaussian.

Remark 2 Let $\tilde{\xi}$ be the restriction of a generalized random process $\xi$ to the space $C_0^{(\infty)}([0, 1])$ of infinitely differentiable functions with supports on the segment $[0, 1]$. By virtue of Theorem 1, the covariance functional $\tilde{B}(\varphi, \psi),\ \varphi, \psi \in C_0^{(\infty)}([0, 1])$, of $\tilde{\xi}$ can be represented as

$$\tilde{B}(\varphi, \psi) = \sum_{j,k \ge 0,\ j+k \le M} \int_0^1 R_{jk}(x)\, \varphi^{(j)}(x)\, \psi^{(k)}(x)\, dx, \quad \varphi, \psi \in C_0^{(\infty)}([0, 1]), \tag{10.4}$$

where $M$ is a non-negative integer.

Under the assumption of sufficient smoothness of the functions $R_{jk}(x)$, the right-hand side of equality (10.4) can be transformed in many ways by integration by parts. The following lemma contains the result of these transformations in the form convenient for us.

Lemma 1 Let $M$ be a non-negative integer, let $\left\{R_{jk} \mid j, k \ge 0,\ j + k \le M\right\} \subset C^{(M)}([0, 1])$ be a family of functions, and let

$$g(\varphi, \psi) = \sum_{j,k \ge 0,\ j+k \le M} \int_0^1 R_{jk}(x)\, \varphi^{(j)}(x)\, \psi^{(k)}(x)\, dx, \quad \varphi, \psi \in C_0^{(\infty)}([0, 1]), \tag{10.5}$$

be a bilinear symmetric functional on the space $C_0^{(\infty)}([0, 1])$. Then the functional $g(\varphi, \psi)$ can be represented as

$$g(\varphi, \psi) = \sum_{k=0}^{N} \int_0^1 \tilde{R}_{kk}(x)\, \varphi^{(k)}(x)\, \psi^{(k)}(x)\, dx, \quad \varphi, \psi \in C_0^{(\infty)}([0, 1]), \tag{10.6}$$

where $N = \lfloor M/2 \rfloor$ and $\tilde{R}_{kk},\ 0 \le k \le N$, are continuous functions on the segment $[0, 1]$ which can be expressed through the functions $R^{(l)}_{jk},\ j, k \ge 0,\ j + k \le M,\ 0 \le l \le M$.

Proof It is sufficient to consider $M = 2N + 1$. Call the number $j + k$ the order of the symbol $\int_0^1 R_{jk}(x)\, \varphi^{(j)}(x)\, \psi^{(k)}(x)\, dx$. We denote by $\Omega_L = \{\omega_L\}$ the set of all linear combinations of symbols $\int_0^1 R_{jk}(x)\, \varphi^{(j)}(x)\, \psi^{(k)}(x)\, dx$ with $j + k \le L$, $0 \le L \le 2N + 1$. For the summands of order $2N + 1$ in the sum on the right side of equality (10.5) we use integration by parts:

$$\int_0^1 R_{jk}\, \varphi^{(j)} \psi^{(k)}\, dx = -\int_0^1 R_{jk}\, \varphi^{(j+1)} \psi^{(k-1)}\, dx - \int_0^1 R'_{jk}\, \varphi^{(j)} \psi^{(k-1)}\, dx, \quad j < k, \tag{10.7}$$

$$\int_0^1 R_{jk}\, \varphi^{(j)} \psi^{(k)}\, dx = -\int_0^1 R_{jk}\, \varphi^{(j-1)} \psi^{(k+1)}\, dx - \int_0^1 R'_{jk}\, \varphi^{(j-1)} \psi^{(k)}\, dx, \quad j > k. \tag{10.8}$$

As a result, the expression $\sum_{j,k \ge 0,\ j+k=2N+1} \int_0^1 R_{jk}(x)\, \varphi^{(j)}(x)\, \psi^{(k)}(x)\, dx$ takes the form

$$\int_0^1 \tilde{R}(x)\, \varphi^{(N)}(x)\, \psi^{(N+1)}(x)\, dx + \omega_{2N},$$

where the function $\tilde{R}(x)$ is a linear combination of the functions $R_{jk}(x)$, $j + k = 2N + 1$, and therefore $\tilde{R} \in C^{(M)}([0, 1])$, while $\omega_{2N} \in \Omega_{2N}$. Thus $g(\varphi, \psi) = \int_0^1 \tilde{R}(x)\, \varphi^{(N)}(x)\, \psi^{(N+1)}(x)\, dx + \tilde{\omega}_{2N}$, where $\tilde{\omega}_{2N} \in \Omega_{2N}$. On account of the symmetry of the bilinear functional $g(\varphi, \psi)$, we have $g(\varphi, \psi) = \frac{1}{2}\left(g(\varphi, \psi) + g(\psi, \varphi)\right)$, and for the leading terms

$$\int_0^1 \tilde{R}\, \varphi^{(N)} \psi^{(N+1)}\, dx + \int_0^1 \tilde{R}\, \varphi^{(N+1)} \psi^{(N)}\, dx = -\int_0^1 \tilde{R}\, \varphi^{(N+1)} \psi^{(N)}\, dx - \int_0^1 \tilde{R}'\, \varphi^{(N)} \psi^{(N)}\, dx + \int_0^1 \tilde{R}\, \varphi^{(N+1)} \psi^{(N)}\, dx = -\int_0^1 \tilde{R}'\, \varphi^{(N)} \psi^{(N)}\, dx \in \Omega_{2N}.$$

Therefore $g(\varphi, \psi) \in \Omega_{2N}$. So we have

$$g(\varphi, \psi) = \sum_{j=0}^{2N} \int_0^1 Q_{j,2N-j}(x)\, \varphi^{(j)}(x)\, \psi^{(2N-j)}(x)\, dx + \omega_{2N-1}, \tag{10.9}$$

where the functions $Q_{j,2N-j}(x)$, $0 \le j \le 2N$, are linearly expressed by means of the functions $R^{(l)}_{jk}$, $2N \le j + k \le 2N + 1$, $l = 0, 1$. To every integral on the right-hand side of equality (10.9) with $j \ne N$ we apply integration by parts $|N - j|$ times and obtain

$$\int_0^1 Q_{j,2N-j}(x)\, \varphi^{(j)}(x)\, \psi^{(2N-j)}(x)\, dx = (-1)^{N-j} \int_0^1 Q_{j,2N-j}(x)\, \varphi^{(N)}(x)\, \psi^{(N)}(x)\, dx + \alpha_{2N-1},$$

where $\alpha_{2N-1} \in \Omega_{2N-1}$. Then we have the following representation of the functional $g(\varphi, \psi)$:

$$g(\varphi, \psi) = \int_0^1 \tilde{R}_{NN}(x)\, \varphi^{(N)}(x)\, \psi^{(N)}(x)\, dx + g_1(\varphi, \psi), \quad \tilde{R}_{NN}(x) = \sum_{j=0}^{2N} (-1)^{N-j} Q_{j,2N-j}(x), \quad x \in [0, 1], \tag{10.10}$$

where $g_1(\varphi, \psi) \in \Omega_{2N-1}$. The bilinear symmetric functional $g_1(\varphi, \psi)$ belongs to $\Omega_{2N-1}$ and the previous arguments are applicable to it. We repeat these arguments until we get the representation (10.6). $\square$

Taking into account Lemma 1, in what follows, unless otherwise stated, we assume that the covariance functional of the process $\tilde{\xi}$ is presented in the form

$$B(\varphi, \psi) = \sum_{k=0}^{N} \int_0^1 \tilde{R}_{kk}(x)\, \varphi^{(k)}(x)\, \psi^{(k)}(x)\, dx, \tag{10.11}$$

where $N$ is a non-negative integer, the functions $\tilde{R}_{kk} \in C([0, 1])$, $0 \le k \le N$, and $\tilde{R}_{NN} \not\equiv 0$.
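The integration-by-parts identities (10.7), (10.8) that drive the reduction can be checked symbolically. The sketch below verifies (10.7) for one concrete choice of coefficient and polynomial test functions whose relevant derivatives vanish at $0$ and $1$ (all concrete choices are ours, for illustration only):

```python
import sympy as sp

x = sp.symbols('x')
R = 1 + x + x**2                     # a smooth coefficient R_{jk} (our choice)
phi = x**5 * (1 - x)**5              # stand-ins for compactly supported test
psi = x**5 * (1 - x)**5 * (1 + x)    # functions: boundary terms vanish
j, k = 1, 3                          # a summand of order j + k = 4 with j < k

lhs = sp.integrate(R * phi.diff(x, j) * psi.diff(x, k), (x, 0, 1))
rhs = (-sp.integrate(R * phi.diff(x, j + 1) * psi.diff(x, k - 1), (x, 0, 1))
       - sp.integrate(sp.diff(R, x) * phi.diff(x, j) * psi.diff(x, k - 1), (x, 0, 1)))
print(lhs - rhs)   # 0: one integration by parts lowers the order of psi by one
```

Since all integrands are polynomials, the computation is exact rational arithmetic and the difference is identically zero.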

10.3 The Families of Test Functions

Further, when constructing function families of the form (10.1), it will be more convenient for us to proceed from a two-parameter family of test functions of the form

$$A = \left\{\chi_{t,h} = \chi_{t,h}(\cdot) \mid t \in \mathbb{R},\ h \in (0, 1),\ \operatorname{supp} \chi_{t,h} \subset [t, t + h]\right\}, \tag{10.12}$$

assuming the formation of the sums (10.2) by the equality

$$\chi_{k,n} := \chi_{t,h} \quad \text{for } t = \frac{k}{b(n)},\ h = \frac{1}{b(n)},\ k = 0, 1, \ldots, b(n) - 1,\ n \ge 1. \tag{10.13}$$

Families of functions of the form (10.12), (10.13) for which relation (10.3) is satisfied will be called suitable for a generalized random process $\xi$. For the purpose of the subsequent construction of suitable families, we give several definitions and examples.

Definition 2 A family of test functions (10.12) is called an $O_2$ type family (has type, or is of type, $O_2$) if for these functions

$$\int_t^{t+h} \chi^2_{t,h}(x)\, dx = h + o(h), \quad h \to 0+, \tag{10.14}$$

uniformly over $t \in \mathbb{R}$. A family of test functions (10.12) is called an $o_2$ type family (has type, or is of type, $o_2$) if for these functions

$$\int_t^{t+h} \chi^2_{t,h}(x)\, dx = o(h), \quad h \to 0+, \tag{10.15}$$

uniformly over $t \in \mathbb{R}$.

Example 1 Let $\{\rho_{0,h} \mid h \in (0, 1)\} \subset K$ be a one-parameter function family such that $\operatorname{supp} \rho_{0,h} \subset [0, h]$ and $\rho^2_{0,h}(x) = 1$ for all $x \in [h^2, h - h^2]$. Then the function family $\{\rho_{t,h} \mid \rho_{t,h}(x) = \rho_{0,h}(x - t),\ t \in \mathbb{R},\ h \in (0, 1)\}$ has type $O_2$.

Example 2 Let a function $\varphi \in K$ have support on the segment $[0, 1]$ and satisfy the condition $\int_0^1 \varphi^2(x)\, dx = 1$. Then the function family

$$\left\{\chi_{t,h}(x) : t \in \mathbb{R},\ h \in (0, 1),\ \chi_{t,h}(x) = \varphi\!\left(\frac{x - t}{h}\right)\right\}$$

has type $O_2$. Indeed, $\int_t^{t+h} \chi^2_{t,h}(x)\, dx = \int_t^{t+h} \varphi^2\!\left(\frac{x - t}{h}\right) dx = h$.

Example 3 Let the function family $\{\chi_{t,h}\}$ have type $O_2$. Then for every $\alpha > 0$ the function family $\{h^\alpha \chi_{t,h}\}$ has type $o_2$.

Definition 3 Let $N \ge 0$ be an integer. A family of test functions $\{\alpha_{t,h} \mid t \in \mathbb{R},\ h \in (0, 1),\ \operatorname{supp} \alpha_{t,h} \subset [t, t + h]\}$ is called an $O_2^{(N)}$ type family (has type, or is of type, $O_2^{(N)}$) if:

1. the family $\{\alpha^{(N)}_{t,h}\}$ has type $O_2$;
2. each family of derivatives $\{\alpha^{(l)}_{t,h}\}$, where $l < N$, is of type $o_2$.

By definition, $O_2^{(0)} = O_2$.

Example 4 The function family $\alpha_{t,h} = \frac{h^N}{d_N}\, \chi_{t,h}$, where the functions $\chi_{t,h}$ are defined in Example 2 and $d_N = \left(\int_0^1 \left(\varphi^{(N)}(x)\right)^2 dx\right)^{1/2}$, has type $O_2^{(N)}$. Indeed,

$$\int_t^{t+h} \left(\alpha^{(l)}_{t,h}(x)\right)^2 dx = \frac{h^{2(N-l)+1}}{d_N^2} \int_0^1 \left(\varphi^{(l)}(x)\right)^2 dx = \begin{cases} h, & l = N, \\ o(h),\ h \to 0+, & 0 \le l < N. \end{cases}$$
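The scaling in Example 4 can be evaluated numerically for $N = 1$: the change of variables $x = t + hu$ gives $\int \left(\alpha^{(l)}_{t,h}\right)^2 dx = h^{2(N-l)+1} d_N^{-2} \int \left(\varphi^{(l)}\right)^2 du$, so the family of first derivatives is of type $O_2$ while the family itself is of type $o_2$. A sketch with a concrete bump function (our choice of $\varphi$, not from the paper):

```python
import numpy as np

def trapz(y, x):
    # simple trapezoid rule (avoids NumPy-version naming differences)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

# Bump function supported on [0, 1], normalized so that its L2 norm is 1.
M = 200_001
u = np.linspace(0.0, 1.0, M)
w = u[1:-1]
phi = np.zeros(M)
phi[1:-1] = np.exp(-1.0 / (w * (1.0 - w)))
phi /= np.sqrt(trapz(phi**2, u))
# Analytic derivative: phi'(u) = phi(u) * (1 - 2u) / (u(1 - u))^2
dphi = np.zeros(M)
dphi[1:-1] = phi[1:-1] * (1.0 - 2.0 * w) / (w * (1.0 - w))**2

d1 = np.sqrt(trapz(dphi**2, u))                 # d_N for N = 1
for h in (1e-1, 1e-2, 1e-3):
    int_l1 = h * trapz(dphi**2, u) / d1**2      # integral of (alpha')^2, l = N = 1
    int_l0 = h**3 * trapz(phi**2, u) / d1**2    # integral of alpha^2,    l = 0
    print(h, int_l1 / h, int_l0 / h)            # ratio 1, and ratio -> 0
```

The $l = N$ ratio is identically $1$ (the normalization by $d_N$ is designed for exactly this), while the $l = 0$ ratio decays like $h^2$.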

Let us show how a function family of type $O_2^{(N)}$ can be obtained from an arbitrary $O_2$ type family. Define the operator $S : A \to A$ by the rule $\chi_{t,h} \mapsto S\chi_{t,h}$, where

$$\left(S\chi_{t,h}\right)(x) = \int_{-\infty}^{x} \left[\chi_{t,h}(2y - t) - \chi_{t,h}(3t + 2h - 2y)\right] dy, \quad x \in \mathbb{R}. \tag{10.16}$$

Remark 3 The supports of the functions $\chi_{t,h}(2y - t)$ and $\chi_{t,h}(3t + 2h - 2y)$ under the integral sign in (10.16) belong to the segments $\left[t, t + \frac{h}{2}\right]$ and $\left[t + \frac{h}{2}, t + h\right]$, respectively. We note that a similar construction of an operator mapping a family of type $O_2$ to a family of type $O_2^{(N)}$ was applied in [19].

Lemma 2 Let $\{\chi_{t,h}\}$ be an $O_2$ type family. Then the function family

$$A_N = \left\{\alpha_{t,h} = 2^{-\frac{N(N-1)}{2}}\, S^N \chi_{t,h}\right\}$$

has type $O_2^{(N)}$.

Proof Let us first prove that the function $S\chi_{t,h}$ has type $o_2$. We have

$$\int_t^{t+h} \left(S\chi_{t,h}(x)\right)^2 dx = \int_t^{t+h} \left(\int_t^x \left[\chi_{t,h}(2y - t) - \chi_{t,h}(3t + 2h - 2y)\right] dy\right)^2 dx.$$

Due to the Cauchy–Bunyakovskii inequality the last expression does not exceed

$$\int_t^{t+h} (x - t) \int_t^{t+h} \left[\chi_{t,h}(2y - t) - \chi_{t,h}(3t + 2h - 2y)\right]^2 dy\, dx = \frac{h^2}{2}\left(\int_t^{t+h} \chi^2_{t,h}(2y - t)\, dy + \int_t^{t+h} \chi^2_{t,h}(3t + 2h - 2y)\, dy\right) = \frac{h^2}{2}\left(h + o(h)\right) = o(h), \quad h \to 0,$$

uniformly over $t \in [0, 1]$ (the cross term vanishes because, by Remark 3, the supports of the two functions have no common interior points). Similarly one can prove that the operator $S$ preserves type $o_2$.

Let us now consider the function family $\{S^N \chi_{t,h}\}$. By Leibniz's rule for differentiation of an integral with variable limits of integration and by Remark 3,

$$\frac{d^l}{dx^l}\, S^N \chi_{t,h} = 2^{\frac{l(l-1)}{2}} \sum_{k=0}^{2^l - 1} \gamma_k \omega_k, \quad 1 \le l \le N, \tag{10.17}$$

where $\gamma_k = \pm 1$ and $\omega_k$ is a function whose graph is obtained from the graph of $S^{N-l}\chi_{t,h}$ by the corresponding compression and parallel-translation transformations, so that $\operatorname{supp} \omega_k \subset \left[t + \frac{kh}{2^l}, t + \frac{(k+1)h}{2^l}\right]$, $0 \le k \le 2^l - 1$. Since the function family $\{S\chi_{t,h}\}$ has type $o_2$ and the operator $S$ preserves type $o_2$, the function family $\{S^{N-l}\chi_{t,h}\}$ has type $o_2$ for $N - l > 0$. Consequently the function family $\frac{d^l}{dx^l} S^N \chi_{t,h}$ has type $o_2$ for $l < N$. For $l = N$,

$$\frac{d^N}{dx^N}\, \alpha_{t,h} = \sum_{k=0}^{2^N - 1} \gamma_k \omega_k,$$

where $\gamma_k = \pm 1$ and $\omega_k$ is a function whose graph is obtained from the graph of the function $\chi_{t,h}$ by the compression and shift transformations mentioned above, so that $\operatorname{supp} \omega_k \subset \left[t + \frac{kh}{2^N}, t + \frac{(k+1)h}{2^N}\right]$, $0 \le k \le 2^N - 1$. Since the function family $\{\chi_{t,h}\}$ has type $O_2$, we get

$$\int_t^{t+h} \left(\alpha^{(N)}_{t,h}(x)\right)^2 dx = \int_t^{t+h} \left(\sum_{k=0}^{2^N - 1} \gamma_k \omega_k\right)^2 dx = \sum_{k=0}^{2^N - 1} \int_t^{t+h} \omega_k^2\, dx = 2^N \int_t^{t+h} \omega_0^2\, dx = h + o(h), \quad h \to 0+,$$

that is, the function family $\{\alpha^{(N)}_{t,h}\}$ has type $O_2$. $\square$
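For $N = 1$ the construction of Lemma 2 can be checked numerically: $(S\chi_{t,h})'(x) = \chi_{t,h}(2x - t) - \chi_{t,h}(3t + 2h - 2x)$, whose square integrates, by the disjointness of supports noted in Remark 3, to $\int \chi^2_{t,h}\, dx = h + o(h)$, while $S\chi_{t,h}$ itself is of type $o_2$. A sketch with $t = 0$ and a bump as in Example 2 (all numerical choices are ours):

```python
import numpy as np

def trapz(y, x):
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

M = 100_001
h = 0.01
x = np.linspace(0.0, h, M)
u = x / h
phi = np.zeros(M)
phi[1:-1] = np.exp(-1.0 / (u[1:-1] * (1.0 - u[1:-1])))
phi /= np.sqrt(trapz(phi**2, u))                  # normalized bump: L2 norm 1
chi = lambda z: np.interp(z, x, phi, left=0.0, right=0.0)

# (S chi)(x) = integral from 0 to x of [chi(2y) - chi(2h - 2y)] dy  (t = 0)
integrand = chi(2 * x) - chi(2 * h - 2 * x)       # this is (S chi)'
Schi = np.concatenate(([0.0],
        np.cumsum((integrand[1:] + integrand[:-1]) * np.diff(x) / 2)))

deriv_sq = trapz(integrand**2, x)   # should be h + o(h)  (O2 for the derivative)
self_sq  = trapz(Schi**2, x)        # should be o(h)      (o2 for S chi itself)
print(deriv_sq / h, self_sq / h)
```

The first ratio is close to $1$; the second is of order $h^2$, consistent with the proof of Lemma 2.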



10.4 Convergence of Baxter Sums

Theorem 2 ([17, Corollary 2.2]) Let $\xi = (\xi, \varphi),\ \varphi \in K$, be a generalized Gaussian random process with independent values and zero mean, let $\chi_{k,n}$ be a sequence of test functions (10.1), and let $(S_n(\xi))$ be the sequence of Baxter sums (10.2). Then the condition

$$v_n^{(0)} = \sum_{k=0}^{b(n)-1} \left(E(\xi, \chi_{k,n})^2\right)^2 \to 0, \quad n \to \infty, \tag{10.18}$$

is necessary and sufficient for the convergence

$$S_n(\xi) - E S_n(\xi) \to 0 \tag{10.19}$$

in the mean square. If the series $\sum_{n=1}^{\infty} v_n^{(0)}$ converges, then (10.19) holds almost surely.

Theorem 3 Let $\xi$ be a generalized Gaussian random process having zero mathematical expectation and a covariance functional whose restriction to $[0, 1]$ has the form (see Lemma 1)

$$B(\varphi, \psi) = \sum_{k=0}^{N} \int_0^1 R_k(x)\, \varphi^{(k)}(x)\, \psi^{(k)}(x)\, dx, \quad \varphi, \psi \in C_0^{(\infty)}([0, 1]), \tag{10.20}$$

where the functions $R_k(x)$, $0 \le k \le N$, are continuous on the segment $[0, 1]$. Then any family of test functions $\left\{\chi_{t,h} = \chi_{t,h}(\cdot) \mid t \in \mathbb{R},\ h \in (0, 1),\ \operatorname{supp} \chi_{t,h} \subset [t, t + h]\right\}$ of type $O_2^{(N)}$ is suitable for the random process $\xi$, and

$$S_n(\xi) = \sum_{k=0}^{b(n)-1} (\xi, \chi_{k,n})^2 \to \int_0^1 R_N(x)\, dx \tag{10.21}$$

in the mean square as $n \to \infty$. If the series $\sum_{n=1}^{\infty} \frac{1}{b(n)}$ converges, then (10.21) holds almost surely.

Proof We verify the conditions of Theorem 2. First we prove the convergence of the sequence $(E S_n(\xi))$ and find its limit. Let $l \in \{0, 1, \ldots, N\}$. By the mean value theorem for definite integrals, there is $\theta_{k,l} \in \left[\frac{k}{b(n)}, \frac{k+1}{b(n)}\right]$, $0 \le k \le b(n) - 1$, $n \ge 1$, such that

$$\int_{k/b(n)}^{(k+1)/b(n)} R_l(x) \left(\chi^{(l)}_{k,n}(x)\right)^2 dx = R_l(\theta_{k,l}) \int_{k/b(n)}^{(k+1)/b(n)} \left(\chi^{(l)}_{k,n}(x)\right)^2 dx.$$

Since the family of functions $\{\chi_{t,h}\}$ has type $O_2^{(N)}$,

$$\int_{k/b(n)}^{(k+1)/b(n)} \left(\chi^{(l)}_{k,n}(x)\right)^2 dx = \begin{cases} o\!\left(\dfrac{1}{b(n)}\right), & 0 \le l \le N - 1, \\[2mm] \dfrac{1}{b(n)} + o\!\left(\dfrac{1}{b(n)}\right), & l = N, \end{cases}$$

as $n \to \infty$, uniformly over $k \in \{0, 1, \ldots, b(n) - 1\}$. Furthermore,

$$E S_n(\xi) = \sum_{k=0}^{b(n)-1} B\left(\chi_{k,n}, \chi_{k,n}\right) = \sum_{k=0}^{b(n)-1} \sum_{l=0}^{N} \int_0^1 R_l(x) \left(\chi^{(l)}_{k,n}(x)\right)^2 dx = \sum_{l=0}^{N} \sum_{k=0}^{b(n)-1} R_l(\theta_{k,l}) \int_{k/b(n)}^{(k+1)/b(n)} \left(\chi^{(l)}_{k,n}(x)\right)^2 dx.$$

For $0 \le l \le N - 1$ the inner sum tends to $0$ as $n \to \infty$. For $l = N$ we get

$$\sum_{k=0}^{b(n)-1} R_N(\theta_{k,N}) \int_{k/b(n)}^{(k+1)/b(n)} \left(\chi^{(N)}_{k,n}(x)\right)^2 dx = \sum_{k=0}^{b(n)-1} R_N(\theta_{k,N}) \left(\frac{1}{b(n)} + o\!\left(\frac{1}{b(n)}\right)\right) \to \int_0^1 R_N(x)\, dx, \quad n \to \infty.$$

Thus,

$$E S_n(\xi) \to \int_0^1 R_N(x)\, dx, \quad n \to \infty. \tag{10.22}$$

Let us check condition (10.18) of Theorem 2. From the above reasoning it follows that for any $k \in \{0, 1, \ldots, b(n) - 1\}$

$$E(\xi, \chi_{k,n})^2 = \sum_{l=0}^{N} \int_{k/b(n)}^{(k+1)/b(n)} R_l(x) \left(\chi^{(l)}_{k,n}(x)\right)^2 dx = O\!\left(\frac{1}{b(n)}\right), \quad n \to \infty,$$

uniformly over $k \in \{0, 1, \ldots, b(n) - 1\}$. Further,

$$v_n^{(0)} = \sum_{k=0}^{b(n)-1} \left(E(\xi, \chi_{k,n})^2\right)^2 = b(n)\, O\!\left(\frac{1}{b(n)^2}\right) = O\!\left(\frac{1}{b(n)}\right) \to 0, \quad n \to \infty.$$

So, Theorem 3 follows from Theorem 2 and relation (10.22). $\square$
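Theorem 3 lends itself to a quick Monte Carlo sanity check (our construction, not from the paper). Take $N = 1$ and $B(\varphi, \psi) = \int_0^1 R_1(x)\, \varphi'(x)\, \psi'(x)\, dx$ with $R_1(x) = 1 + x$. Since the $\chi_{k,n}$ have disjoint supports, the pairings $(\xi, \chi_{k,n})$ are independent Gaussians, and for an $O_2^{(1)}$-type family their variances are $R_1(\theta_k)/b(n) + o(1/b(n))$, as in the proof above. Sampling them directly:

```python
import numpy as np

rng = np.random.default_rng(1)
b = 5000                                   # b(n)
theta = (np.arange(b) + 0.5) / b           # points theta_k in [k/b, (k+1)/b]
R1 = 1.0 + theta                           # R_1(x) = 1 + x (our example)
# Independent pairings with Var(xi, chi_{k,n}) ~ R_1(theta_k)/b(n)
pairings = rng.normal(0.0, np.sqrt(R1 / b))
S_n = np.sum(pairings**2)
print(S_n)                                 # near the limit 3/2 in (10.21)
```

Here $\int_0^1 R_1(x)\, dx = 3/2$, and the fluctuation of `S_n` around this limit is of order $b(n)^{-1/2}$.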



Remark 4 Let the conditions of Theorem 3 be satisfied. Then, as can be seen from the proof of Theorem 3, for every family $\left\{\chi_{t,h} = \chi_{t,h}(\cdot) \mid t \in \mathbb{R},\ h \in (0, 1),\ \operatorname{supp} \chi_{t,h} \subset [t, t + h]\right\}$ of type $O_2^{(M)}$ with $M > N$ we have $S_n(\xi) = \sum_{k=0}^{b(n)-1} (\xi, \chi_{k,n})^2 \to 0$ in the mean square as $n \to \infty$. If the series $\sum_{n=1}^{\infty} \frac{1}{b(n)}$ converges, then the convergence holds almost surely.

Corollary 1 below is applicable directly to the case when the functional $B(\varphi, \psi)$ is given in the general form (10.23) and is not reduced to the "canonical" form (10.20).

Corollary 1 Let $\xi$ be a generalized Gaussian random process with zero mean and covariance functional

$$B(\varphi, \psi) = \sum_{j,k \ge 0,\ j+k \le 2N} \int_0^1 R_{jk}(x)\, \varphi^{(j)}(x)\, \psi^{(k)}(x)\, dx, \quad \varphi, \psi \in C_0^{(\infty)}([0, 1]), \tag{10.23}$$

where $\left\{R_{jk} \mid j, k \ge 0,\ j + k \le 2N\right\} \subset C^{(2N)}([0, 1])$. Then every family of functions $\left\{\chi_{t,h} = \chi_{t,h}(\cdot) \mid t \in \mathbb{R},\ h \in (0, 1),\ \operatorname{supp} \chi_{t,h} \subset [t, t + h]\right\}$ of type $O_2^{(N)}$ is suitable for the process $\xi$, and

$$S_n(\xi) = \sum_{k=0}^{b(n)-1} (\xi, \chi_{k,n})^2 \to \sum_{l=0}^{2N} (-1)^{N-l} \int_0^1 R_{l,2N-l}(x)\, dx \tag{10.24}$$

in the mean square as $n \to \infty$. If the series $\sum_{n=1}^{\infty} \frac{1}{b(n)}$ converges, then the convergence in (10.24) takes place almost surely.

Proof We bring equality (10.23) to the form (10.20). From the proof of Lemma 1 (equality (10.10)) it follows that

$$R_N(x) = \sum_{l=0}^{2N} (-1)^{N-l} R_{l,2N-l}(x). \tag{10.25}$$

Now the corollary follows from Theorem 3. $\square$

It follows from the proof of Theorem 3 that the segment $[0, 1]$ in its conditions can be replaced by an arbitrary segment $[u, v] \subset \mathbb{R}$. For this purpose, while forming the Baxter sums (10.2), instead of equality (10.13) we can put

$$\chi_{k,n} := \chi_{t,h} \quad \text{for } t = u + \frac{k(v - u)}{b(n)},\ h = \frac{v - u}{b(n)},\ k = 0, 1, \ldots, b(n) - 1,\ n \ge 1.$$

Thus, we obtain the following two corollaries.

Corollary 2 Let the conditions of Theorem 3 hold true. Then for any segment $[u, v] \subset [0, 1]$ and every family of functions $\left\{\chi_{t,h} = \chi_{t,h}(\cdot) \mid t \in \mathbb{R},\ h \in (0, 1),\ \operatorname{supp} \chi_{t,h} \subset [t, t + h]\right\}$ of type $O_2^{(N)}$ we have the relation

$$\sum_{k=0}^{b(n)-1} (\xi, \chi_{k,n})^2 \to \int_u^v R_N(x)\, dx \tag{10.26}$$

in the mean square as $n \to \infty$. If the series $\sum_{n=1}^{\infty} \frac{1}{b(n)}$ converges, then (10.26) holds almost surely.

Corollary 3 Let the conditions of Corollary 1 hold true. Then for any segment $[u, v] \subset [0, 1]$ and every family $\left\{\chi_{t,h} = \chi_{t,h}(\cdot) \mid t \in \mathbb{R},\ h \in (0, 1),\ \operatorname{supp} \chi_{t,h} \subset [t, t + h]\right\}$ of type $O_2^{(N)}$ we have the relation

$$\sum_{k=0}^{b(n)-1} (\xi, \chi_{k,n})^2 \to \sum_{l=0}^{2N} (-1)^{N-l} \int_u^v R_{l,2N-l}(x)\, dx \tag{10.27}$$

in the mean square as $n \to \infty$. If the series $\sum_{n=1}^{\infty} \frac{1}{b(n)}$ converges, then (10.27) holds almost surely.

Remark 5 When, under the conditions of Corollaries 2 and 3, the family type $O_2^{(N)}$ is replaced by $O_2^{(M)}$ with $M > N$, the Baxter sums in (10.26), (10.27) converge to $0$.

10.5 Conditions of Singularity of Measures

Let the statistical structure $(\Omega, \Sigma, P_1, P_2)$ [22] be such that the generalized Gaussian random process $\xi = (\xi, \varphi),\ \varphi \in C_0^{(\infty)}([0, 1])$, has zero mean and the covariance functional

$$B_1(\varphi, \psi) = \sum_{j,k \ge 0,\ j+k \le 2N_1} \int_0^1 R_{jk}(x)\, \varphi^{(j)}(x)\, \psi^{(k)}(x)\, dx, \quad R_{jk} \in C^{(2N_1)}([0, 1]), \tag{10.28}$$

with respect to the measure $P_1$, and zero mean and the covariance functional

$$B_2(\varphi, \psi) = \sum_{j,k \ge 0,\ j+k \le 2N_2} \int_0^1 Q_{jk}(x)\, \varphi^{(j)}(x)\, \psi^{(k)}(x)\, dx, \quad Q_{jk} \in C^{(2N_2)}([0, 1]), \tag{10.29}$$

with respect to the measure $P_2$.

Theorem 4 Let relations (10.28), (10.29) be satisfied and let the functions

$$\sum_{l=0}^{2N_1} (-1)^{N_1-l} R_{l,2N_1-l}(x), \qquad \sum_{l=0}^{2N_2} (-1)^{N_2-l} Q_{l,2N_2-l}(x)$$

not be identically equal to zero on $[0, 1]$. Then for $N_1 \ne N_2$ the measures $P_1$, $P_2$ are orthogonal on the $\sigma$-field $\Sigma$. For $N_1 = N_2 = N$ these measures are orthogonal if there is a point $x_0 \in [0, 1]$ such that

$$\sum_{l=0}^{2N} (-1)^{N-l} R_{l,2N-l}(x_0) \ne \sum_{l=0}^{2N} (-1)^{N-l} Q_{l,2N-l}(x_0). \tag{10.30}$$

Proof Let $N_1 < N_2$ and let there exist a point $y_0 \in [0, 1]$ such that

$$\sum_{l=0}^{2N_2} (-1)^{N_2-l} Q_{l,2N_2-l}(y_0) = a > 0.$$

By virtue of the continuity of the functions $Q_{l,2N_2-l}$ ($0 \le l \le 2N_2$), there exists a segment $[u, v] \subset [0, 1]$, $[u, v] \ni y_0$, such that the inequality $\sum_{l=0}^{2N_2} (-1)^{N_2-l} Q_{l,2N_2-l}(x) > \frac{a}{2} > 0$ holds for any $x \in [u, v]$. Let us choose a family of test functions $\{\chi_{t,h}\}$ of type $O_2^{(N_2)}$ and a sequence $(b(n))$ for which the series $\sum_{n=1}^{\infty} \frac{1}{b(n)}$ converges. We form the sequence of Baxter sums $S_n(\xi) = S_n(\xi, \chi_{k,n})$ for the segment $[u, v]$ and denote by $X$ the event $\left\{S_n(\xi) \to 0,\ n \to \infty\right\}$. According to Remark 5, we have $P_1(X) = 1$. On the other hand, according to Corollary 3,

$$\sum_{k=0}^{b(n)-1} (\xi, \chi_{k,n})^2 \to \int_u^v \sum_{l=0}^{2N_2} (-1)^{N_2-l} Q_{l,2N_2-l}(x)\, dx > (v - u)\,\frac{a}{2} > 0, \quad n \to \infty,$$

$P_2$-almost surely. Therefore $P_2(X) = 0$.

Let $N_1 = N_2 = N$ and let condition (10.30) hold. Suppose that

$$\sum_{l=0}^{2N} (-1)^{N-l} R_{l,2N-l}(x_0) - \sum_{l=0}^{2N} (-1)^{N-l} Q_{l,2N-l}(x_0) = b > 0.$$

Then there is a segment $[u, v] \subset [0, 1]$ containing $x_0$ such that for any $x \in [u, v]$

$$\sum_{l=0}^{2N} (-1)^{N-l} R_{l,2N-l}(x) - \sum_{l=0}^{2N} (-1)^{N-l} Q_{l,2N-l}(x) > \frac{b}{2},$$

and thus

$$\int_u^v \sum_{l=0}^{2N} (-1)^{N-l} R_{l,2N-l}(x)\, dx \ne \int_u^v \sum_{l=0}^{2N} (-1)^{N-l} Q_{l,2N-l}(x)\, dx.$$

Let $\{\chi_{t,h}\}$ be a family of test functions of type $O_2^{(N)}$. In view of Corollary 3, the probability $P_1$ is concentrated on the event

$$\left\{\sum_{k=0}^{b(n)-1} (\xi, \chi_{k,n})^2 \to \int_u^v \sum_{l=0}^{2N} (-1)^{N-l} R_{l,2N-l}(x)\, dx,\ n \to \infty\right\},$$

while the probability $P_2$ is concentrated on the event

$$\left\{\sum_{k=0}^{b(n)-1} (\xi, \chi_{k,n})^2 \to \int_u^v \sum_{l=0}^{2N} (-1)^{N-l} Q_{l,2N-l}(x)\, dx,\ n \to \infty\right\}.$$

Thus, the probabilities $P_1$, $P_2$ are concentrated on disjoint subsets of $\Omega$, i.e. the measures $P_1$ and $P_2$ are singular (orthogonal). $\square$

Since, as is known (see, for example, [23]), two Gaussian measures are either equivalent or orthogonal, Theorem 4 implies the following statement.

Corollary 4 Suppose that the conditions of Theorem 4 are satisfied and that the measures $P_1$ and $P_2$ are equivalent on the $\sigma$-field generated by the random variables $\xi(\varphi),\ \varphi \in C_0^{(\infty)}([0, 1])$. Then:

1. $N_1 = N_2 = N$;
2. the functions $\sum_{l=0}^{2N} (-1)^{N-l} R_{l,2N-l}(x)$ and $\sum_{l=0}^{2N} (-1)^{N-l} Q_{l,2N-l}(x)$ coincide on the segment $[0, 1]$.
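The mechanism behind Theorem 4 can be illustrated numerically with a toy example of ours (with $N_1 = N_2 = 0$): under $P_1$ let $\xi$ be white noise with $R(x) \equiv 1$, and under $P_2$ white noise with $Q(x) \equiv 2$. The Baxter sums converge to $1$ under $P_1$ and to $2$ under $P_2$, so an event such as $\{S_n \to 1\}$ receives probability $1$ under one measure and $0$ under the other:

```python
import numpy as np

rng = np.random.default_rng(2)
b = 4000
# Pairings (xi, chi_{k,n}) for an O2-type family:
# N(0, R/b) under P1 and N(0, Q/b) under P2
S_under_P1 = np.sum(rng.normal(0.0, np.sqrt(1.0 / b), b)**2)
S_under_P2 = np.sum(rng.normal(0.0, np.sqrt(2.0 / b), b)**2)
print(S_under_P1, S_under_P2)   # near 1 and near 2: the limits separate the measures
```

Because the two limits differ, the event on which one sequence of sums converges carries full mass under one measure and zero mass under the other, which is exactly the singularity argument in the proof.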

References

1. Gelfand, I.M., Shilov, G.E.: Generalized Functions. Volume I: Properties and Operations. Dobroswet, Moscow (2000)
2. Vladimirov, V.S.: Generalized Functions in Mathematical Physics. Mir, Moscow (1979)
3. Hörmander, L.: The Analysis of Linear Partial Differential Operators I–IV. Mir, Moscow (1986)
4. Gelfand, I.M., Vilenkin, N.Ya.: Applications of Harmonic Analysis. Equipped Hilbert Spaces. Fizmatgiz, Moscow (1961)
5. Lévy, P.: Le mouvement brownien plan. Am. J. Math. 62, 487–550 (1940)
6. Baxter, G.: A strong limit theorem for Gaussian processes. Proc. Am. Math. Soc. 7(3), 522–527 (1956)
7. Gladyshev, E.G.: A new limit theorem for stochastic processes with Gaussian increments. Teor. Veroyatnost. i Primenen. 6(1), 57–66 (1961)
8. Ryzhov, Yu.M.: Limit distributions of some functionals of a stationary Gaussian process. Theory Probab. Appl. 14(2), 229–243 (1969)
9. Gine, E., Klein, R.: On quadratic variation of processes with Gaussian increments. Ann. Probab. 3(4), 716–721 (1975)
10. Berman, S.M.: A version of the Levy–Baxter theorem for Gaussian processes. Proc. Am. Math. Soc. 18, 1051–1055 (1967)
11. Krasnitskiy, S.M.: On some limit theorems for random fields with Gaussian m-order increments. Teor. Veroyatnost. i Mat. Statist. 5, 71–80 (1971)
12. Kawada, T.: The Levy–Baxter theorem for Gaussian random fields: a sufficient condition. Proc. Am. Math. Soc. 53, 463–469 (1975)
13. Kozachenko, Yu.V., Kurchenko, O.O.: Levy–Baxter theorems for one class of non-Gaussian stochastic processes. Random Oper. Stoch. Equ. 19(4), 313–326 (2011)
14. Kozachenko, Yu.V., Kurchenko, O.O., Syniavska, O.O.: The Levy–Baxter theorems for one class of non-Gaussian random fields. Monte Carlo Methods Appl. 19(3), 171–182 (2013)
15. Arato, N.M.: On a limit theorem for generalized Gaussian random fields corresponding to stochastic partial differential equations. Teor. Veroyatnost. i Primenen. 34(2), 409–411 (1989)
16. Goryainov, V.B.: On Levy–Baxter theorems for stochastic elliptic equations. Teor. Veroyatnost. i Primenen. 33(1), 176–179 (1988)
17. Krasnitskiy, S.M., Kurchenko, O.O.: Baxter type theorems for generalized random Gaussian processes. Theory Stoch. Process. 21(37), no. 1, 45–52 (2016)
18. Krasnitskiy, S.M., Kurchenko, O.O.: On Baxter type theorems for generalized random Gaussian fields. Springer Proc. Math. Stat. 271, 91–102. Springer, Cham (2018)
19. Krasnitskiy, S.M., Kurchenko, O.O.: On Baxter type theorems for generalized random Gaussian processes with independent values. Cybern. Syst. Anal. 56(1), 66–74 (2020)
20. Rozanov, Yu.A.: Random Fields and Stochastic Partial Differential Equations. Nauka, Moscow (1995)
21. Rozanov, Yu.A.: Markov Random Fields. Nauka, Moscow (1981)
22. Soler, J.-L.: Notion de liberté en statistique mathématique. Russian translation: Mir, Moscow (1972)
23. Rozanov, Yu.A.: Infinite-Dimensional Gaussian Distributions. Nauka, Moscow (1968)

Chapter 11

On Explicit Formulas of Steady-State Probabilities for the [M/M/c/c + m]-Type Retrial Queue

Eugene Lebedev, Vadym Ponomarov, and Hanna Livinska

Abstract The paper deals with the study of a bivariate Markov process $\{X(t),\ t \ge 0\}$ whose state space is the lattice semistrip $S(X) = \{0, 1, \ldots, c + m\} \times \mathbb{Z}_+$. The process $\{X(t),\ t \ge 0\}$ describes the service policy of a multi-server retrial queue in which the rate of the repeated flow does not depend on the number of sources of repeated calls. First, we study the ergodicity conditions. Then we obtain a vector-matrix representation of the steady-state distribution through the parameters of the system. The investigative technique uses an approximation of the initial model by means of a truncated one and a direct passage to the limit. The application of the obtained results is demonstrated via numerical examples.

Keywords Markov process · Queueing system

MSC 2020 60J05 · 60J27 · 60K25

11.1 Introduction

Retrial queues arise in studies of various systems such as telecommunications, call centers, computer networks, etc. (for details see [9]). Classic queueing systems assume that an arriving customer who finds all servers busy is blocked and lost forever. There are modifications with an infinite waiting capacity, in which a blocked customer waits until being served. However, there are many scenarios and various systems in which customers do not want to wait and temporarily leave the system. After a random period of time, they return and try to get service again. Such customer behavior is modeled by retrial queues.

In a retrial queue with the classic retrial policy, each blocked customer joins the orbit and returns to the servers in a random period of time independently of the other customers in the orbit. This means that the retrial flow rate directly correlates with the number of sources of retrial calls at any given moment. However, there are several types of systems where the retrials of customers are controlled, so that the retrial flow rate does not depend on the number of customers in the orbit. For example, [7] studies a retrial queue model with a constant retrial rate in the application to the CSMA/CD protocol. In [5, 6] the authors model TCP traffic using similar models. A constant retrial rate can be interpreted as "calling for a blocked customer": when a server becomes idle, it calls blocked customers one by one, and the time for the server to pick a blocked customer is interpreted as the retrial time.

Artalejo et al. [3] formulate a multiserver M/M/c/c retrial queue with constant retrial rate as a level-independent QBD process, which can be analyzed efficiently using the matrix analytic methods invented by Neuts [15]. As in the classic retrial case, the block matrices of the level-independent QBD process are sparse, leading to efficient algorithms for the stationary distribution. These algorithms are discussed by Artalejo et al. [3] using a matrix analytic method and by Do et al. [8] using the spectral expansion method.

As one can see, queueing systems with constant retrial rate are widely used for actual practical problems. The research of these problems will greatly benefit from analytical representations of the probabilistic characteristics of the underlying models. Unfortunately, explicit formulas for the steady-state probabilities of such systems were obtained only in the simplest cases [1]. A recurrent algorithm for M/M/r/r + d queues with constant retrial rate was presented in [10]. The main performance measures, such as the waiting time and the busy period, for a similar model were discussed in [11].

The structure and the aim of the rest of this paper are as follows. First, we describe the process under investigation and determine the conditions of its ergodicity. In Sect. 11.3 we obtain an analytical representation of the steady-state probabilities of the service process through its parameters. The implementation of the results is demonstrated on numerical examples in which the behavior of the main integral characteristics of the queues is shown.

E. Lebedev (B) · V. Ponomarov · H. Livinska — Taras Shevchenko National University of Kyiv, Volodymyrska str., 64, Kyiv 01601, Ukraine; e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_11

11.2 Mathematical Model and Ergodicity Condition Let us consider the following retrial queue, that contains c identical servers and m waiting positions. The input flow of calls is a Poisson process with parameter λ. If an input call finds the available server, it is served immediately. Service time is exponentially distributed with parameter ν. If there are no free servers on call arrival, it tries to occupy a waiting position. If all places in waiting line are occupied, the input call becomes a source of retrial calls and joins the orbit. The retrial calls rate does not depend on the number of calls in the orbit and is equal to μ. The service process of the above retrial queue can be expressed in terms of twodimensional continuous time Markov chain X (t) = (X 1 (t), X 2 (t))T , X 1 (t) ∈ {0, 1, . . . , c + m}, X 2 (t) ∈ {0, 1, . . .},

11 On Explicit Formulas of Steady-State Probabilities …

213

defined by its local parameters q(i, j)(i  , j  ) , (i, j), (i  , j  ) ∈ S(X ) = {0, 1, . . . , c + m} × {0, 1, . . .} in the following way: 1. For 0 ≤ i < c, j > 0 ⎧ ⎪ ⎪ ⎪ ⎪ ⎨

q(i, j)(i  , j  )



λ, when (i ,  iν, when (i ,  = μ, when (i , ⎪  ⎪ ⎪ −(λ + iν + μ), when (i , ⎪ ⎩ 0, other wise.



j ) = (i + 1, j);  j ) = (i − 1, j);  j ) = (i + 1, j − 1);  j ) = (i, j);

2. For c ≤ i < c + m, j > 0 ⎧ ⎪ ⎪ ⎨

q(i, j)(i  , j  )





λ, when (i , j ) = (i + 1, j);   cν, when (i , j ) = (i − 1, j); =   ⎪ −(λ + cν), when (i , j ) = (i, j); ⎪ ⎩ 0, other wise.

3. For i = c + m, j ≥ 0 ⎧ ⎪ ⎪ ⎨

q(c+m, j)(i  , j  )





λ, when (i , j ) = (c + m, j + 1);   cν, when (i , j ) = (c + m − 1, j); =   ⎪ −(λ + cν), when (i , j ) = (c + m, j); ⎪ ⎩ 0, other wise.

4. For 0 ≤ i < c + m, j = 0 ⎧ ⎪ ⎪ ⎨

q(i,0)(i  , j  )





λ, when (i , j ) = (i + 1, 0);   (i ∧ c)ν, when (i , j ) = (i − 1, 0); =   ⎪ − [λ + (i ∧ c)ν] , when (i , j ) = (i, 0); ⎪ ⎩ 0, other wise,

where (i ∧ c) = min(i, c). The first component X_1(t) ∈ {0, 1, ..., c+m} indicates the total number of busy servers and calls in the queue at the instant t ≥ 0, and the second one, X_2(t) ∈ {0, 1, ...}, is the number of retrial sources. In what follows, the process X(t) = (X_1(t), X_2(t))^T is the main subject of our investigation.

Let us write the states of X(t) as S(X) = {(0,0), (1,0), ..., (c+m,0), (0,1), (1,1), ..., (c+m,1), ...}. Then the infinitesimal matrix of the Markov chain X(t) can be represented in the matrix-block form

\[
Q = \begin{pmatrix}
Q^{(0,0)} & Q^{(0,+1)} & & \\
Q^{(-1)} & Q^{(0)} & Q^{(+1)} & \\
 & Q^{(-1)} & Q^{(0)} & Q^{(+1)} \\
 & & \ddots & \ddots
\end{pmatrix}, \tag{11.1}
\]

where Q^{(0,0)} = (q^{(0,0)}_{ij})_{i,j=0}^{c+m} = (q_{(i,0)(j,0)})_{i,j=0}^{c+m} is a tridiagonal matrix with

\[
q^{(0,0)}_{ij} = \begin{cases}
\lambda, & j = i+1,\ i = 0,\dots,c+m-1;\\
(i\wedge c)\nu, & j = i-1,\ i = 1,\dots,c+m;\\
-[\lambda + (i\wedge c)\nu], & i = j = 0,\dots,c+m;\\
0, & \text{otherwise};
\end{cases}
\]

Q^{(0,+1)} = (q^{(0,+1)}_{ij})_{i,j=0}^{c+m} = (q_{(i,0)(j,1)})_{i,j=0}^{c+m} = Λ(0, ..., 0, λ) is a diagonal matrix with the vector (0, ..., 0, λ)^T on the principal diagonal; Q^{(0)} = (q^{(0)}_{ij})_{i,j=0}^{c+m} (= (q_{(i,k)(j,k)})_{i,j=0}^{c+m} for any k = 1, 2, ...) is a tridiagonal matrix with

\[
q^{(0)}_{ij} = \begin{cases}
\lambda, & j = i+1,\ i = 0,\dots,c+m-1;\\
(i\wedge c)\nu, & j = i-1,\ i = 1,\dots,c+m;\\
-(\lambda + c\nu), & i = j = c,\dots,c+m;\\
-(\lambda + \mu + i\nu), & i = j = 0,\dots,c-1;\\
0, & \text{otherwise};
\end{cases}
\]

Q^{(+1)} = Q^{(0,+1)} = Λ(0, ..., 0, λ) (= (q_{(i,k)(j,k+1)})_{i,j=0}^{c+m} for any k = 1, 2, ...) is a diagonal matrix; and Q^{(-1)} = (q^{(-1)}_{ij})_{i,j=0}^{c+m} (= (q_{(i,k)(j,k-1)})_{i,j=0}^{c+m} for any k = 1, 2, ...) has

\[
q^{(-1)}_{ij} = \begin{cases}
\mu, & j = i+1,\ i = 0,\dots,c-1;\\
0, & \text{otherwise}.
\end{cases}
\]

The representation of Q in the form (11.1) guarantees that X(t) is a level-independent quasi-birth-and-death process (QBD process; see, for example, [2, p. 189]). To find the ergodicity condition for X(t), let us consider the tridiagonal

infinitesimal matrix \tilde{Q} = Q^{(-1)} + Q^{(0)} + Q^{(+1)} = (\tilde{q}_{ij})_{i,j=0}^{c+m}, where

\[
\tilde{q}_{ij} = \begin{cases}
\lambda, & j = i+1,\ i = c,\dots,c+m-1;\\
\lambda + \mu, & j = i+1,\ i = 0,\dots,c-1;\\
(i\wedge c)\nu, & j = i-1,\ i = 1,\dots,c+m;\\
-c\nu, & i = j = c+m;\\
-(\lambda + c\nu), & i = j = c,\dots,c+m-1;\\
-(\lambda + \mu + i\nu), & i = j = 0,\dots,c-1;\\
0, & \text{otherwise}.
\end{cases}
\]


Solving the system of equations ρ^T \tilde{Q} = 0^T_{c+m+1}, ρ^T 1_{c+m+1} = 1, where 0_{c+m+1}, 1_{c+m+1} are the (c+m+1)-dimensional vectors formed by zeros and ones, respectively, we find

\[
\rho_i = \frac{1}{i!}\left(\frac{\lambda+\mu}{\nu}\right)^{i}\left[\sum_{k=0}^{c-1}\frac{1}{k!}\left(\frac{\lambda+\mu}{\nu}\right)^{k}+\frac{1}{c!}\left(\frac{\lambda+\mu}{\nu}\right)^{c}\sum_{k=0}^{m}\left(\frac{\lambda}{c\nu}\right)^{k}\right]^{-1},\quad i = 0,\dots,c-1,
\]

\[
\rho_i = \frac{1}{c!}\left(\frac{\lambda+\mu}{\nu}\right)^{c}\left(\frac{\lambda}{c\nu}\right)^{i-c}\left[\sum_{k=0}^{c-1}\frac{1}{k!}\left(\frac{\lambda+\mu}{\nu}\right)^{k}+\frac{1}{c!}\left(\frac{\lambda+\mu}{\nu}\right)^{c}\sum_{k=0}^{m}\left(\frac{\lambda}{c\nu}\right)^{k}\right]^{-1},\quad i = c,\dots,c+m.
\]

An ergodicity condition for a level-independent QBD process is defined by the inequality (see, for example, [2, p. 196]) ρ^T Q^{(+1)} 1_{c+m+1} < ρ^T Q^{(-1)} 1_{c+m+1}. For the process X(t) it takes the form

\[
\frac{\lambda}{c!}\left(\frac{\lambda+\mu}{\nu}\right)^{c}\left(\frac{\lambda}{c\nu}\right)^{m} < \mu\sum_{i=0}^{c-1}\frac{1}{i!}\left(\frac{\lambda+\mu}{\nu}\right)^{i}.
\]
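The closed-form vector ρ and the ergodicity inequality above are easy to evaluate numerically. The following Python sketch is ours (it is not the authors' code, and the parameter values are illustrative); it builds ρ from the explicit expressions and checks the inequality:

```python
from math import factorial

def rho_vector(lam, mu, nu, c, m):
    """Stationary distribution of the tridiagonal generator Q~:
    birth rate lam+mu below level c, lam above; death rate min(i, c)*nu."""
    a = (lam + mu) / nu
    norm = sum(a**k / factorial(k) for k in range(c)) \
         + a**c / factorial(c) * sum((lam / (c * nu))**k for k in range(m + 1))
    rho = [a**i / factorial(i) / norm for i in range(c)]
    rho += [a**c / factorial(c) * (lam / (c * nu))**(i - c) / norm
            for i in range(c, c + m + 1)]
    return rho

def is_ergodic(lam, mu, nu, c, m):
    """Ergodicity condition: lam * rho_{c+m} < mu * sum_{i<c} rho_i."""
    rho = rho_vector(lam, mu, nu, c, m)
    return lam * rho[c + m] < mu * sum(rho[:c])

# Illustrative parameters (not taken from the chapter).
rho = rho_vector(1.0, 2.0, 1.0, 2, 2)
stable = is_ergodic(1.0, 2.0, 1.0, 2, 2)
```

Since Q̃ is a birth-and-death generator, the closed form can be validated against its detailed-balance relations, which is what the check below relies on.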

The matrix B is strictly diagonally dominant: for its rows,

\[
G_i = \lambda + \mu + [(i-1)\wedge c]\,\nu - [(i-1)\wedge c]\,\nu - \lambda = \mu,\quad i = 1,\dots,c+m-1,
\]
\[
G_{c+m} = \lambda + \mu + c\nu - c\nu = \lambda + \mu,
\]

so G_i > 0 for all i = 1, ..., c+m, which means nonsingularity of B (see Theorem 6.1.10 from [12]). B^{-1} can be obtained by a decomposition of the inverse matrix: B^{-1} = (F(E − B_1))^{-1} = (E − B_1)^{-1} F^{-1} = Σ_{p=0}^{∞} B_1^p F^{-1}. Now let us proceed to the matrix C. Since it is triangular, |C| = ∏_{i=1}^{c+m-1} b_{i,i+1} = (−1)^{c+m-1} c! c^{m-1} ν^{c+m-1} ≠ 0. So nonsingularity of C holds and the lemma is proved. □

In order to construct the steady-state distribution of the service process in the [M/M/c/c+m]-queue, we consider a similar queue with a truncated state space. New calls in such a queue are lost when all servers are occupied, there are no free waiting places, and there are already N calls in the orbit. Formally, the service process in such a queue is described by the Markov chain X(t, N) = (X_1(t, N), X_2(t, N))^T, X_1(t, N) ∈ {0, 1, ..., c+m}, X_2(t, N) ∈ {0, 1, ..., N}.

Its infinitesimal transition rates q^{(N)}_{(i,j)(i',j')}, (i,j), (i',j') ∈ S(X, N) = {0, ..., c+m} × {0, ..., N}, are equal to the rates q_{(i,j)(i',j')} of the chain X(t) in all phase points except the boundary case i = c+m, j = N, where

\[
q^{(N)}_{(c+m,N)(i',j')} = \begin{cases}
c\nu, & (i',j') = (c+m-1,\ N);\\
-c\nu, & (i',j') = (c+m,\ N);\\
0, & \text{otherwise}.
\end{cases}
\]
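Since the state space of X(t, N) is finite, its stationary distribution can also be computed purely numerically, which is useful for cross-checking the explicit formulas derived below. The following Python sketch is our own illustration (not the authors' MATLAB code): it assembles the generator of X(t, N) directly from the local rates above and applies power iteration to the uniformized transition matrix.

```python
def truncated_rates(lam, mu, nu, c, m, N):
    """Transition rates of the truncated chain X(t, N) on
    {0..c+m} x {0..N}, assembled from the local rates given above."""
    K = c + m
    rates = {}
    def add(s, t, r):
        if r > 0:
            rates[(s, t)] = r
    for i in range(K + 1):
        for j in range(N + 1):
            if i == K:                           # all servers and places busy
                if j < N:
                    add((i, j), (i, j + 1), lam)  # new call joins the orbit;
                                                  # at j = N it is lost
                add((i, j), (i - 1, j), c * nu)
            else:
                add((i, j), (i + 1, j), lam)
                add((i, j), (i - 1, j), min(i, c) * nu)
                if i < c and j > 0:               # successful retrial
                    add((i, j), (i + 1, j - 1), mu)
    return rates

def stationary(rates, states, sweeps=20000):
    """Stationary vector via uniformization: pi <- pi (I + Q / Lambda)."""
    out = {s: 0.0 for s in states}
    for (s, _), r in rates.items():
        out[s] += r
    lam_u = 1.05 * max(out.values())
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(sweeps):
        new = {s: pi[s] * (1.0 - out[s] / lam_u) for s in states}
        for (s, t), r in rates.items():
            new[t] += pi[s] * r / lam_u
        pi = new
    return pi

# Illustrative parameters (not taken from the chapter).
lam, mu, nu, c, m, N = 0.5, 1.0, 1.0, 1, 1, 10
states = [(i, j) for i in range(c + m + 1) for j in range(N + 1)]
rates = truncated_rates(lam, mu, nu, c, m, N)
pi = stationary(rates, states)
blocking = sum(pi[(c + m, j)] for j in range(N + 1))
```

For larger models one would replace the power iteration by a sparse linear solve, but for the small illustrative values here the simple scheme converges quickly.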

The phase space S(X, N) of the process X(t, N) is finite; therefore X(t, N) has a steady-state distribution. By π_{ij}(N), (i, j) ∈ S(X, N), we will denote its stationary probabilities. Let π_j(N) = (π_{0j}(N), ..., π_{c+m−1,j}(N))^T be the vector of the stationary probabilities. The following result holds.

Lemma 3 The steady-state probabilities of the process X(t, N) can be represented in the following form: π_j(N) = Φ_j(N) · π_{00}(N), j = 0, 1, ..., where

\[
\Phi_0(N) = B_0^{-1}A\,\Phi_1(N),\qquad
\Phi_j(N) = \frac{(B^{-1}A)^{N-j}\,C^{-1}e_1}{e_1^T B_0^{-1}B\,(B^{-1}A)^{N}C^{-1}e_1},\quad j = 1, 2, \dots,
\]

and e_1 = (δ_{11}, ..., δ_{1,c+m})^T.

Proof The probabilities π_{ij}(N), (i, j) ∈ S(X, N), of the truncated queue satisfy the set of Eqs. (11.3)–(11.6) for the original system. It can be rewritten in the vector-matrix form

\[
A\pi_1(N) = B_0\pi_0(N),\qquad A\pi_{j+1}(N) = B\pi_j(N),\quad j = 1,\dots,N-1. \tag{11.7}
\]

For the boundary case j = N we have

\[
(\lambda+\mu)\,\pi_{0N}(N) = \nu\,\pi_{1N}(N),
\]
\[
(\lambda+\mu+i\nu)\,\pi_{iN}(N) = \lambda\,\pi_{i-1,N}(N) + (i+1)\nu\,\pi_{i+1,N}(N),\quad i = 1,\dots,c-1,
\]
\[
(\lambda+\mu+c\nu)\,\pi_{iN}(N) = \lambda\,\pi_{i-1,N}(N) + c\nu\,\pi_{i+1,N}(N),\quad i = c,\dots,c+m-1.
\]

We can supplement the above set of equations by the identity π_{0N}(N) = π_{0N}(N) and write it in the vector-matrix form Cπ_N(N) = e_1 · π_{0N}(N). From this equation we find

\[
\pi_N(N) = C^{-1}e_1\cdot\pi_{0N}(N). \tag{11.8}
\]

Combining (11.7) and (11.8), we arrive at

\[
\pi_j(N) = \left(B^{-1}A\right)^{N-j}C^{-1}e_1\cdot\pi_{0N}(N),\quad j = 1,\dots,N.
\]

For j = 0, Eqs. (11.3)–(11.6) are reduced to

\[
\pi_0(N) = B_0^{-1}A\,\pi_1(N) = B_0^{-1}B\,(B^{-1}A)^{N}C^{-1}e_1\,\pi_{0N}(N). \tag{11.9}
\]


From the last equation we find the probability π_{0N}(N):

\[
\pi_{0N}(N) = \left(e_1^T B_0^{-1}B\,(B^{-1}A)^{N}C^{-1}e_1\right)^{-1}\pi_{00}(N).
\]

Substituting the last expression into Eq. (11.9), we obtain the claim of the lemma. □

Now we can proceed to studying the limit behavior of the vectors Φ_j(N) as N → ∞. The following result takes place.

Lemma 4 The limits of the vectors Φ_j(N), j = 0, 1, ..., as N → ∞, are represented in the form

\[
\Phi_0 = B_0^{-1}A\,\Phi_1,\qquad
\Phi_j = \lim_{N\to\infty}\Phi_j(N) = \frac{u v^T C^{-1}e_1}{e_1^T B_0^{-1}B\,(B^{-1}A)^{j}\,u v^T C^{-1}e_1},\quad j = 1, 2, \dots,
\]

where u^T = (u_1, u_2, ..., u_{c+m}) > 0 and v^T = (v_1, v_2, ..., v_{c+m}) > 0 are the right and left eigenvectors of the matrix B^{-1}A corresponding to the Perron root.

Proof Since the matrix B^{-1}A > 0 (Lemma 2, point 2), by virtue of Theorem 8.2.8 from [12] the N-th power of B^{-1}A can be written in the following form:

\[
\left(B^{-1}A\right)^{N} = r^{N}u v^T + o(r_1^{N}),
\]

where r_1 < r, r is the Perron root of the matrix B^{-1}A, u, v are the right and left Perron vectors, and moreover u^T v = 1. We substitute this expression into the formula for the vector Φ_j(N), j = 1, 2, ...:

\[
\Phi_j(N) = \frac{(B^{-1}A)^{N-j}C^{-1}e_1}{e_1^T B_0^{-1}B\,(B^{-1}A)^{N}C^{-1}e_1}
= \frac{\left(u v^T r^{N-j} + o(r_1^{N-j})\right)C^{-1}e_1}{e_1^T B_0^{-1}B\,(B^{-1}A)^{j}\left(u v^T r^{N-j} + o(r_1^{N-j})\right)C^{-1}e_1}. \tag{11.10}
\]

Let us divide the numerator and the denominator of (11.10) by r^{N-j} and take the limit as N → ∞:

\[
\Phi_j = \lim_{N\to\infty}\Phi_j(N) = \frac{u v^T C^{-1}e_1}{e_1^T B_0^{-1}B\,(B^{-1}A)^{j}\,u v^T C^{-1}e_1}.
\]

The matrix uv^T is present both in the numerator and the denominator of the expression for Φ_j, so we can omit the normalizing condition u^T v = 1 for the Perron vectors u and v. □

As N → ∞, the steady-state probabilities π_{ij}(N) approximate the corresponding probabilities π_{ij} of the original model (see Theorems 2.3 and 2.4 from [9], Chap. 2, on the stochastic ordering of probability distributions for migration processes). In what follows we obtain explicit representations of π_{ij}(N) and π_{ij} via the model parameters, which allows evaluating the degree of convergence. On the basis of this approximation and the propositions proved earlier, we can obtain the principal result.


Theorem 1 If the process X(t) satisfies the condition of Lemma 1, then

\[
\pi_j = \lim_{N\to\infty}\pi_j(N) = \Phi_j\cdot\pi_{00},\quad j = 0, 1, \dots,\qquad
\pi_{c+m,j} = \frac{\mu}{\lambda}\,1_{c+m}^T\Phi_{j+1}\,\pi_{00},\quad j = 0, 1, \dots, \tag{11.11}
\]

where

\[
\pi_{00} = \left[1_{c+m}^T\Phi_0 + \frac{\lambda+\mu}{\lambda}\sum_{j=1}^{\infty}1_{c+m}^T\Phi_j\right]^{-1}.
\]

The claim of Theorem 1 is a direct consequence of Lemmas 3 and 4. The probabilities π_{c+m,j}, j = 0, 1, ..., can be found from Eq. (11.3), and the probability π_{00} from the normalizing condition. Taking into account Theorem 1, we can obtain an explicit form of the stationary probabilities via the spectral characteristics of the matrix B^{-1}A.

Corollary 1 If the process X(t) satisfies the condition of Lemma 1, then

\[
\pi_0 = B_0^{-1}A\,u v^T C^{-1}e_1 H^{-1},\qquad
\pi_j = r^{-j}\,u v^T C^{-1}e_1 H^{-1},\quad j = 1, 2, \dots,
\]
\[
\pi_{c+m,j} = \frac{\mu}{\lambda}\,r^{-j-1}\,1_{c+m}^T u v^T C^{-1}e_1 H^{-1},\quad j = 0, 1, \dots, \tag{11.12}
\]

where

\[
H = 1_{c+m}^T\left(B_0^{-1}A + \frac{1+\mu/\lambda}{r-1}\,E\right)u v^T C^{-1}e_1.
\]

Note that the formulas (11.12) are simpler than the modified matrix-geometric form of the stationary probabilities that follows from the commonly accepted theory of QBD processes (see, for example, [2, p. 196]). They describe the structure of the stationary distribution (a weighted geometric type) more precisely and provide explicit formulas for the main performance measures.

11.4 Numerical Results

In this section, let us calculate some performance measures of retrial queues using the obtained formulas. The blocking probability π_b and the average number of calls in the orbit E[X_2] were chosen among the main integral characteristics of retrial queues. Taking into account the results of Theorem 1 and Corollary 1, they are calculated by the following formulas:

\[
\pi_b = \sum_{j=0}^{\infty}\pi_{c+m,j} = \frac{\mu/\lambda}{r-1}\,1_{c+m}^T u v^T C^{-1}e_1 H^{-1},
\]
\[
E[X_2] = \sum_{j=0}^{\infty} j\sum_{i=0}^{c+m}\pi_{ij} = \frac{r+1}{(r-1)^2}\,1_{c+m}^T u v^T C^{-1}e_1 H^{-1}.
\]


Fig. 11.1 Blocking probability πb and average number of retrials E[X 2 ] versus λ

Fig. 11.2 Blocking probability πb and average number of retrials E[X 2 ] versus ν

Fig. 11.3 Blocking probability πb and average number of retrials E[X 2 ] versus μ

The effect of different parameters on the blocking probability and the average number of calls in the orbit is shown graphically in Figs. 11.1, 11.2 and 11.3. These figures plot the blocking probability π_b and the average number of retrials E[X_2] against the parameters of a retrial queue with constant retrial rate. One can see that these characteristics can be significantly improved if one is able to change or control some of the system's parameters.


11.5 Conclusion and Future Research

In this paper, we presented a steady-state analysis of retrial queues with c servers and m waiting places, in which the rate of the secondary flow of repeated calls does not depend on the number of customers in the orbit. We obtained the steady-state probabilities in closed form through the system parameters. This representation can be applied for further detailed analysis of this model, calculation of its performance measures, and solution of optimization problems. Works [4, 7–13] contain variants of such optimization problems for various retrial queues and their analysis.

References

1. Artalejo, J.R.: Stationary analysis of the characteristics of the M/M/2 queue with constant repeated attempts. Opsearch 33, 83–95 (1996)
2. Artalejo, J.R., Gomez-Corral, A.: Retrial Queueing Systems. A Computational Approach. Springer, Berlin, Heidelberg (2008)
3. Artalejo, J.R., Gomez-Corral, A., Neuts, M.F.: Analysis of multiserver queues with constant retrial rate. Eur. J. Oper. Res. 135, 569–581 (2001)
4. Atencia, I., Lebedev, E., Ponomarov, V., Livinska, H.: Special retrial queues with state-dependent input rate. Commun. Comput. Inf. Sci. 1109, 73–85 (2019)
5. Avrachenkov, K., Yechiali, U.: Retrial networks with finite buffers and their application to internet data traffic. Probab. Eng. Inf. Sci. 22, 519–536 (2008)
6. Avrachenkov, K., Yechiali, U.: On tandem blocking queues with a common retrial queue. Comput. Oper. Res. 37, 1174–1180 (2010)
7. Choi, B.D., Shin, Y.W., Ahn, W.C.: Retrial queues with collision arising from unslotted CSMA/CD protocol. Queueing Syst. 11, 335–356 (1992)
8. Do, T.V., Chakka, R.: An efficient method to compute the rate matrix for retrial queues with large number of servers. Appl. Math. Lett. 23, 638–643 (2010)
9. Falin, G.I., Templeton, J.G.C.: Retrial Queues. Chapman & Hall, London (1997)
10. Gomez-Corral, A., Ramalhoto, M.F.: The stationary distribution of a Markovian process arising in the theory of multiserver retrial queueing systems. Math. Comput. Model. 30, 141–158 (1999)
11. Gomez-Corral, A., Ramalhoto, M.F.: On the waiting time distribution and the busy period of a retrial queue with constant retrial rate. Stoch. Model. Appl. 3(2), 37–47 (2000)
12. Horn, R.A., Johnson, Ch.R.: Matrix Analysis. Cambridge University Press (1986)
13. Lebedev, E.A., Ponomarov, V.D.: Finite source retrial queues with state-dependent service rate. Commun. Comput. Inf. Sci. 356, 140–146 (2013)
14. Lebedev, E.A., Prischepa, O.V.: On one multichannel queueing system with retrial calls. Cybern. Syst. Anal. 53(3), 127–137 (2017)
15. Neuts, M.F.: Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach. Johns Hopkins University Press (1981)
16. Walrand, J.: An Introduction to Queueing Networks. Prentice Hall, Englewood Cliffs, New Jersey (1988)

Chapter 12

Testing Cubature Formulae on Wiener Space Versus Explicit Pricing Formulae

Anatoliy Malyarenko and Hossein Nohrouzian

Abstract Cubature is an effective way to calculate integrals in a finite-dimensional space. Extending the idea of cubature to the infinite-dimensional Wiener space would have practical uses in pricing financial instruments. In this paper, we calculate and use cubature formulae of degree 5 and 7 on Wiener space to price European options in the classical Black–Scholes model. This problem has a closed-form solution, and thus we compare the obtained numerical results with that solution. In this procedure, we study some characteristics of the obtained cubature formulae and discuss some of their applications to pricing American options.

Keywords Wiener space · Black–Scholes model · Cubature formulae

MSC 2020 60H05 · 91-10 · 28C20 · 60J70

12.1 Introduction and Background

One of the challenges in mathematical finance and physics is to calculate expected values of functionals defined on solutions of stochastic differential equations. It is not an easy task, and in many cases solving the above problem requires either some (proven) available theoretical techniques or some proper numerical techniques.

A. Malyarenko · H. Nohrouzian (B) Division of Mathematics and Physics, Mälardalen University, 721 23 Västerås, Sweden e-mail: [email protected] A. Malyarenko e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_12


A well-known example of using a theory to solve a version of the above problem is to use the Feynman–Kac representation formula, under some technical conditions, to derive the Black–Scholes European call and put option pricing formulae.¹ That is, solving a boundary value problem (BVP) for the Black–Scholes partial differential equation (PDE) yields the Black–Scholes European call and put option pricing formulae. Monte Carlo simulation of stochastic differential equations (SDEs) is one of the most popular numerical methods; it transforms solving parabolic PDEs into simulating the values of the stochastic integral of objective functions on Wiener space. Furthermore, among the different numerical methods in finite dimensions, the cubature formula is an effective approach to calculate or estimate integrals. Generally speaking, the idea presented in [7] is to extend the idea of a cubature formula to Wiener space such that the obtained cubature formulae on Wiener space are valid for high-dimensional SDEs and semi-elliptic PDEs.

In this paper, an attempt is made to examine the possible applications of cubature formulae within mathematical finance. To achieve this, we follow the approach presented in [7] and extend the idea to find cubature formulae of degree 5 and 7 on the Wiener space of continuous real-valued functions. Then we use our obtained cubature formulae to calculate the price of European options and compare the results with the explicit Black–Scholes pricing formulae. We will also see that our cubature formulae can be used to develop a lattice approximation with applications in pricing American and European options. In more detail, in Sects. 12.2 and 12.3 we quickly review the construction and implementation of cubature formulae on Wiener space. Then, in Sects. 12.4 and 12.5, we present cubature formulae of degree 5 and 7. We also use the obtained cubature formulae to find the fair price of European call and put options and compare the obtained results with those given by the Black–Scholes pricing formula. After that, we discuss how one can use the obtained cubature formulae to construct a lattice tree to price American options. Finally, we conclude this work and mention possible future work extending the idea.

12.2 Implementation of Cubature Formulae on Wiener Space

In this section, we give a quick review of Itô and Stratonovich representations of SDEs and stochastic integrals. Then, like [1] and [10], we assume that the asset price process S follows a geometric Brownian motion. In other words, we use the representation of [14] for the return process on a risky asset S.

¹ Black and Scholes, in their original paper, constructed an instantaneously delta-hedged risk-free portfolio. Then they applied the following no-arbitrage argument: a risk-free portfolio must earn the risk-free interest rate. This yields the famous Black–Scholes PDE.


12.2.1 SDE and Stochastic Integral (Itô)

Let μ be an n-dimensional drift function, let σ be an n × m-dimensional diffusion function, and let {W(t), t ≥ 0} be the standard m-dimensional Wiener process. Then an n-dimensional Itô process is given by [5]

\[
\begin{cases}
dX_1 = \mu_1\,dt + \sigma_{11}\,dW^1 + \dots + \sigma_{1m}\,dW^m,\\
\qquad\vdots\\
dX_n = \mu_n\,dt + \sigma_{n1}\,dW^1 + \dots + \sigma_{nm}\,dW^m.
\end{cases}
\]

The above system of SDEs can be expressed in the following equivalent matrix form:

\[
dX(t) = \mu(t, X(t))\,dt + \sigma(t, X(t))\,dW(t). \tag{12.1}
\]

For 1 ≤ i ≤ n and 1 ≤ j ≤ m, we rewrite Eq. (12.1) in its equivalent integral form

\[
X_i(t) = X_i(0) + \int_0^t \mu_i(u, X(u))\,du + \sum_{j=1}^{m}\int_0^t \sigma_{ij}(u, X(u))\,dW_u^j, \tag{12.2}
\]

where the first integral is a Riemann integral, while the second one is an Itô integral.

12.2.2 SDE and Stochastic Integral (Stratonovich)

The Stratonovich representation of the SDE in Eq. (12.1) is given by [13]

\[
dX_t = \tilde{\mu}(t, X_t)\,dt + \sigma(t, X_t)\circ dW_t, \tag{12.3}
\]

where

\[
\tilde{\mu}_i(t, x) = \mu_i(t, x) - \frac{1}{2}\sum_{j=1}^{m}\sum_{k=1}^{n}\sigma_{kj}(t, x)\,\frac{\partial\sigma_{ij}(t, x)}{\partial x_k}
\]

is the Stratonovich correction. For 1 ≤ i ≤ n and 1 ≤ j ≤ m, we rewrite Eq. (12.3) in its equivalent integral form

\[
X_t^i = X_0^i + \int_0^t \tilde{\mu}_i(u, X_u)\,du + \sum_{j=1}^{m}\int_0^t \sigma_{ij}(u, X_u)\circ dW_u^j, \tag{12.4}
\]

where the second integral is a Stratonovich integral.
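To make the correction term concrete, consider the one-dimensional case n = m = 1 with μ(x) = r x and σ(x) = s x (geometric Brownian motion), where the corrected drift is (r − s²/2)x. The following Python sketch is our own illustration; the function name strat_drift and the parameter values are not from the paper:

```python
def strat_drift(mu, sigma, x, h=1e-6):
    """mu(x) - 0.5 * sigma(x) * sigma'(x), the Stratonovich-corrected
    drift of Eq. (12.3) in one dimension; sigma' is approximated by a
    central difference."""
    dsigma = (sigma(x + h) - sigma(x - h)) / (2.0 * h)
    return mu(x) - 0.5 * sigma(x) * dsigma

r, s = 0.07, 0.20                     # illustrative parameter values
drift = strat_drift(lambda x: r * x, lambda x: s * x, 10.0)
# For geometric Brownian motion this equals (r - s**2 / 2) * x.
```

Because σ is linear here, the central difference is exact and the sketch recovers the closed-form correction up to floating-point error.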

12.2.3 Itô Integral Versus Stratonovich Integral

To begin with, one of the biggest advantages of the Stratonovich integral arises under a change of variables: there are no second-order terms, so the change of variables follows the ordinary chain rule [13], and one can skip calculating the second derivatives.

In the [1] and [10] model, the drift function r and the diffusion function σ are constants, and the dynamics of a (non-dividend-paying) risky asset under the risk-neutral probability measure satisfies the following SDE:

\[
dS(t) = r S(t)\,dt + \sigma S(t)\,dW(t), \tag{12.5}
\]

where r is called the mean rate of return, σ is the volatility, and {W(t), t ≥ 0} is the standard Wiener process (Brownian motion). The above SDE is equivalent to the SDE in Eq. (12.1) when n = m = 1, r (= μ) and σ are constants, and dX = dS/S.

12.2.3.1 Itô Representation of SDE (12.5)

Rearranging and integrating both sides of Eq. (12.5) yields

\[
\frac{dS(t)}{S(t)} = r\,dt + \sigma\,dW(t).
\]

To solve the integral on the left-hand side, one can change variables from S(t) to ln S(t), apply Itô's lemma, and obtain

\[
S(t) = S(0)\exp\left\{\left(r - \frac{1}{2}\sigma^2\right)t + \sigma W(t)\right\}.
\]

12.2.3.2 Stratonovich Representation of SDE (12.5)

We want to write Eq. (12.5) in its Stratonovich representation form. Using the formula in Eq. (12.3), we obtain the following SDE and stochastic integral:

\[
\text{SDE: } dS(t) = \left(r - \frac{1}{2}\sigma^2\right)S(t)\,dt + \sigma S(t)\circ dW(t), \tag{12.6}
\]
\[
\text{stochastic integral: } S(t) = S(0) + \left(r - \frac{1}{2}\sigma^2\right)\int_0^t S(u)\,du + \sigma\int_0^t S(u)\circ dW(u).
\]

Now our objective is to derive a general formula for calculating the price of a financial derivative (option) using Stratonovich integrals. We will do this in the next section, after constructing cubature formulae on Wiener space.


12.3 Construction of Cubature Formulae on Wiener Space

In this section, we review the construction of paths (trajectories) and the calculation of the probabilities and weights of the possible trajectories of cubature formulae on Wiener space. More specifically, we follow [7] and briefly explain the procedure for constructing cubature formulae on Wiener space. For more details, proofs, and explicit explanations, the reader is referred to [7, 8].

To begin with, recall that the stochastic Taylor expansion can be approximated by a sum of Stratonovich iterated integrals. Denote W_t^0 = t. Let k be a nonnegative integer. A multi-index I with k components is defined as the empty set if k = 0, or a finite sequence I = (i_1, ..., i_k) such that each i_j is an integer with 0 ≤ i_j ≤ d for 1 ≤ j ≤ k. The degree deg of the empty multi-index is defined to be equal to 0; for a multi-index (i_1, ..., i_k), its degree is defined as k plus the number of zeroes among i_1, ..., i_k. Let A be the set of all multi-indices. Now, a cubature formula of degree M is defined as follows.

Definition 12.1 Let C^0_{bv}([0, T]; R^{d+1}) be the space of all functions ω: [0, T] → R^{d+1} such that each component ω^i: [0, T] → R^1 is a continuous function of bounded variation with ω^i(0) = 0 and ω^0(s) = s. Let N be a positive integer, let λ_1, ..., λ_N be positive weights summing up to 1, and let ω_1, ..., ω_N be paths in C^0_{bv}([0, T]; R^{d+1}). Then the mentioned weights and paths satisfy a moment matching condition of order M [6], or form a cubature formula on Wiener space of degree M and size N [7], if and only if (iff) for all multi-indices (i_1, ..., i_k) with deg(i_1, ..., i_k) ≤ M,

\[
E\left[\int_{0\le t_1\le\dots\le t_k\le T}\circ dW_{t_1}^{i_1}\circ\dots\circ dW_{t_k}^{i_k}\right]
= \sum_{n=1}^{N}\lambda_n\int_{0\le t_1\le\dots\le t_k\le T} d\omega_n^{i_1}(t_1)\cdots d\omega_n^{i_k}(t_k).
\]

Let {ε_i : 0 ≤ i ≤ d} be the standard basis of the space R^{d+1}. For a multi-index I ∈ A, define the tensor

\[
\varepsilon_I = \begin{cases}
1, & \text{if } I = \varnothing,\\
\varepsilon_{i_1}\otimes\dots\otimes\varepsilon_{i_k}, & \text{if } I = (i_1, \dots, i_k).
\end{cases}
\]

Recall that the tensor product ε_{i_1} ⊗ ... ⊗ ε_{i_k} is the k-linear form on the space (R^{d+1})^k acting by

\[
\varepsilon_{i_1}\otimes\dots\otimes\varepsilon_{i_k}(x_1, \dots, x_k) = (\varepsilon_{i_1}, x_1)\cdots(\varepsilon_{i_k}, x_k),\qquad x_i\in R^{d+1},\ 1\le i\le k.
\]

The map I → ε_I maps A to the tensor algebra T̂(R^{d+1}) of sequences {a_k : k ≥ 0}, where a_k belongs to the space (R^{d+1})^{⊗k} of all k-linear forms on R^{d+1}. The signature of a path ω ∈ C^0_{bv}([s, T]; R^{d+1}) is an element of T̂(R^{d+1}) defined by the series


\[
S_{s,t}(\omega) = \sum_{I\in A}\int_{s\le t_1\le\dots\le t_k\le T} d\omega_{t_1}^{i_1}\cdots d\omega_{t_k}^{i_k}\,\varepsilon_I,
\]

which converges in the topology of coordinate-wise convergence. The Brownian signature is a random element of T̂(R^{d+1}) defined by the series

\[
S_{s,t}(W) = \sum_{I\in A}\int_{s\le t_1\le\dots\le t_k\le T}\circ dW_{t_1}^{i_1}\circ\dots\circ dW_{t_k}^{i_k}\,\varepsilon_I,
\]

with random coefficients. It is proved in [7] that under the above mappings the moment matching condition becomes

\[
\pi_M\exp\left(\varepsilon_0 + \frac{1}{2}\sum_{i=1}^{d}\varepsilon_i\otimes\varepsilon_i\right) = \sum_{n=1}^{N}\lambda_n\,\pi_M\exp(L_n),
\]

where π_M is the projection to the linear space generated by the tensors ε_I with deg I ≤ M, and the L_n are Lie polynomials. There exist paths ω_1, ..., ω_N with π_M S_{0,1}(ω_n) = π_M exp(L_n). The paths ω_1, ..., ω_N and the weights λ_1, ..., λ_N form a cubature formula on Wiener space of degree M and size N. The methods of finding the weights and Lie polynomials are described in [4]. The reconstruction of the paths is trickier and will be discussed later.
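For d = 1, the moment matching condition can be sanity-checked at the level of terminal values: the degree-5 paths constructed in Sect. 12.4 end at ±√3 and 0 with weights 1/6, 2/3, 1/6, and a degree-5 formula must reproduce the moments E[W(1)^p] for p ≤ 5. A short Python check (our own sketch, assuming those endpoint values):

```python
from math import sqrt

# Endpoint values and weights of the three degree-5 paths (d = 1).
nodes = [(-sqrt(3.0), 1 / 6), (0.0, 2 / 3), (sqrt(3.0), 1 / 6)]
normal_moments = [1, 0, 1, 0, 3, 0]   # E[Z**p] for Z ~ N(0, 1), p = 0..5
for p, m in enumerate(normal_moments):
    approx = sum(w * z**p for z, w in nodes)
    assert abs(approx - m) < 1e-12
```

This is exactly the three-point Gauss–Hermite rule, which explains why the weighted endpoints match the Gaussian moments up to order 5.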

12.3.1 The Trajectories of SDE (12.6)

Recall that we wrote Eq. (12.5) in its Stratonovich representation form: using the formula in Eq. (12.3), we obtained the SDE given in Eq. (12.6). Furthermore, we briefly explained the construction of cubature formulae on Wiener space at the beginning of this section. Now assume that there exist l (l ∈ Z_+) possible trajectories in the cubature formula of degree M. Let ω_k (1 ≤ k ≤ l) be the k-th possible trajectory. Then Eq. (12.6) for the k-th trajectory can be rewritten as

\[
S_k(t) = S(0) + \left(r - \frac{1}{2}\sigma^2\right)\int_0^t S_k(u)\,du + \sigma\int_0^t S_k(u)\,d\omega_k(u).
\]

Taking the derivative of both sides of the above equation and applying the fundamental theorem of calculus yields

\[
dS_k(t) = \left(r - \frac{1}{2}\sigma^2\right)S_k(t)\,dt + \sigma S_k(t)\,d\omega_k(t).
\]

Now it is easy to rearrange the above equation and integrate both sides. After doing that, for i = 1, ..., l, we let 0 ≤ t_i ≤ 1 and write


Table 12.1 Information for cubature formulae of degree 5

\[
S_k(t_i) = S_k(t_{i-1})\exp\left\{\left(r - \frac{1}{2}\sigma^2\right)(t_i - t_{i-1}) + \sigma\left[\omega_k(t_i) - \omega_k(t_{i-1})\right]\right\}. \tag{12.7}
\]

The next challenge is to use the cubature formulae on Wiener space in order to calculate ω_k(t_i) − ω_k(t_{i−1}). Let us consider some examples and investigate some numerical results in the next sections.

12.4 Cubature Formula of Degree 5

To begin with, it is possible to do all calculations by hand and derive the cubature formula of degree 5 on the Wiener space of continuous real-valued functions. We did this in Section 6, Example 2 of our previous work [8]. There we also showed that the linear space U is spanned by 6 tensors, and the cubature formula (exponentiating Lie polynomials) of degree 5 takes the following form (also see [4, p. 100]):

\[
L_k = \pi_U\left(\log\left(X_{0,1}^{k}(\omega)\right)\right) = \varepsilon_0 + \alpha_{k,1}\varepsilon_1 + \alpha_{k,2}\left[\varepsilon_1, [\varepsilon_0, \varepsilon_1]\right],\quad 1\le k\le 3, \tag{12.8}
\]

with π_U denoting the projection to the linear space U. The coefficients α and weights λ in the above equation are summarized in Table 12.1a. Further, Eq. (12.8), for piecewise-linear paths and for k = 1, 2, 3, is equivalent to

\[
L_k = \pi_5\log\left(\otimes_{i=1}^{3}\exp\left(\tfrac{1}{3}\varepsilon_0 + \theta_{k,i}\varepsilon_1\right)\right),\quad 1\le k\le 3,
\]

which yields the following (non-linear) system of polynomial equations:

\[
\begin{cases}
\theta_{k,1} + \theta_{k,2} + \theta_{k,3} = \alpha_{k,1},\\
\theta_{k,1} - \theta_{k,3} = 0,\\
(\theta_{k,1} - \theta_{k,2})^2 + (\theta_{k,1} - \theta_{k,3})^2 + (\theta_{k,2} - \theta_{k,3})^2 = 3.
\end{cases} \tag{12.9}
\]

The above system has two exact solutions, summarized in Table 12.1b. Now assume that the trajectories we consider on Wiener space start from 0 and stop at time T. We refer to such a time interval from 0 to some arbitrary T as

Fig. 12.1 Two cubature formulae of degree 5

one unit of time. Then we divide one unit of time into three equal sub-intervals, i.e., for i = 1, 2, 3 we have 0 = t_0 < t_1 < t_2 < t_3 = 1 and t_i − t_{i−1} = 1/3. After that, the line equation for the trajectories is given by

\[
\omega_k(t_i) = 3\theta_{k,i}(t_i - t_{i-1}) + \omega_k(t_{i-1}),\quad i = 1, 2, 3,\qquad \omega_k(0) = 0. \tag{12.10}
\]

Substituting the values summarized in Table 12.1b into Eq. (12.10) gives our trajectories. Figure 12.1 was created in MATLAB® and depicts the two possible sets of trajectories for the cubature formula of degree 5. From Eq. (12.10) we can calculate ω_k(t_i) − ω_k(t_{i−1}) and substitute the result into Eq. (12.7). It is then easy to see that one obtains sample points of the random price (random variable) S_k for each k. To find the fair price of an arbitrary option, we use each of the obtained S_k in the considered option's payoff function. Finally, we calculate the sum of the products of the payoffs with the corresponding weights λ_k and discount it. This gives the discounted expected payoff, that is, the fair price of the option.

Remark 12.1 Using the symmetric property of the first and the third paths, and the fact that the second path starts from zero and ends at zero, one may solve the system of equations in (12.9) for α_{k,1} = 1. The solutions are then the slopes for the first path; multiplying the results by −1 gives the information for the third path, and we simply put the slopes of the second path equal to zero. After that, multiplying the right-hand side of Eq. (12.10) by √3 (recall that the nonzero α_{k,1} are in fact ±√3) yields different trajectories but the same results in this paper. The above approach yields Table 12.2 and Fig. 12.2.

Example 12.1 Assume that the current price of a stock is S(0) = 10 and the volatility of the stock over one unit of time is σ = 20%. Substituting (12.10) in (12.7),

\[
S_k(t_i) = S_k(t_{i-1})\exp\left\{\left(r - \frac{1}{2}\sigma^2\right)(t_i - t_{i-1}) + 3\sigma\theta_{k,i}(t_i - t_{i-1})\right\}.
\]
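The system (12.9) can be solved by hand under the symmetry θ_{k,1} = θ_{k,3} of Remark 12.1: with α_{k,1} = 1, the third equation forces 2(3θ_{k,1} − 1)² = 3, hence θ_{k,1} = (2 ± √6)/6, which are exactly the slopes appearing in Table 12.2. A short Python verification (our own sketch):

```python
from math import sqrt, isclose

# The two closed-form solutions of (12.9) with alpha_{k,1} = 1 and
# theta_{k,1} = theta_{k,3}.
for t1 in [(2 + sqrt(6.0)) / 6, (2 - sqrt(6.0)) / 6]:
    t2, t3 = 1 - 2 * t1, t1
    assert isclose(t1 + t2 + t3, 1.0)                        # first equation
    assert isclose(t1 - t3, 0.0, abs_tol=1e-12)              # second equation
    assert isclose((t1 - t2)**2 + (t1 - t3)**2 + (t2 - t3)**2, 3.0)
```

Negating these slopes gives the third path of Table 12.2, and the second path has all slopes equal to zero, as stated in the Remark.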


Table 12.2 Information for another cubature formula of degree 5

k | λ_k | θ_{k,1} = θ_{k,3} | θ_{k,2}
1 | 1/6 | −(2 ± √6)/6 | −(1 − 2(2 ± √6)/6)
2 | 2/3 | 0 | 0
3 | 1/6 | (2 ± √6)/6 | 1 − 2(2 ± √6)/6

Fig. 12.2 Another two cubature formulae of degree 5

Substituting the values from Table 12.1b into the above equation enables us to find the price trajectories of the stock. On the other hand, we can first multiply the right-hand side of Eq. (12.10) by √3 and then substitute it into Eq. (12.7), which gives

\[
S_k(t_i) = S_k(t_{i-1})\exp\left\{\left(r - \frac{1}{2}\sigma^2\right)(t_i - t_{i-1}) + 3\sqrt{3}\,\sigma\theta_{k,i}(t_i - t_{i-1})\right\}. \tag{12.11}
\]

Now substituting the values from Table 12.2 into the above equation enables us to find other price trajectories of the mentioned stock. For the arbitrary interest rates r = 7% and r = −2.5%, the price trajectories of the stock are graphically illustrated in Figs. 12.3 and 12.4.

12.4.1 Black–Scholes Versus Cubature Pricing Formula (Degree 5)

In this part, we implement our cubature formula (12.11) in MATLAB® to find the prices of European call and put options, both using the Black–Scholes pricing formula (we call these the exact solutions) and the cubature pricing formula (we call these the approximated solutions). After that, we compare the graphical behavior of the exact and approximated prices, and also the absolute error, i.e., the difference between the exact and approximated solutions, or the relative error, i.e., the absolute error divided by the exact solution.
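The comparison itself is easy to reproduce. Since a European payoff depends only on the terminal asset price, the three cubature trajectories enter only through their endpoints, S(0)exp{(r − σ²/2)T + σ√T z} with z ∈ {−√3, 0, √3} and weights {1/6, 2/3, 1/6}. The following Python sketch is our own (the authors worked in MATLAB; the function names and the strike K = 96 are our choices) and prices a call both ways:

```python
from math import exp, log, sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S0, K, r, sigma, T=1.0):
    """Closed-form Black–Scholes European call price."""
    d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S0 * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def cubature_call(S0, K, r, sigma, T=1.0):
    """Degree-5 cubature price: discounted weighted payoff over the
    three path endpoints z in {-sqrt(3), 0, sqrt(3)}."""
    nodes = [(-sqrt(3.0), 1 / 6), (0.0, 2 / 3), (sqrt(3.0), 1 / 6)]
    mean = 0.0
    for z, w in nodes:
        ST = S0 * exp((r - 0.5 * sigma**2) * T + sigma * sqrt(T) * z)
        mean += w * max(ST - K, 0.0)
    return exp(-r * T) * mean

S0, r, sigma = 100.0, 0.0001, 0.016   # the daily parameters used below
exact = bs_call(S0, 96.0, r, sigma)
approx = cubature_call(S0, 96.0, r, sigma)
```

In line with Fig. 12.5, the two prices agree closely away from the money, while the relative error is largest at K = S(0).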

Fig. 12.3 Price trajectories for S0 = 10, r = 0.07 and σ = 0.20

Fig. 12.4 Price trajectories for S0 = 10, r = −0.025 and σ = 0.20

Fig. 12.5 Changing strike price (European call and put option prices and absolute errors versus strike price, cubature versus Black–Scholes)

Fig. 12.6 Changing interest rate (option price and relative error versus interest rate, for European call and put options)

To begin with, we set one unit of time to be one day (time to maturity). We choose the small overnight interest rate r = 0.0001 and the daily volatility σ = 0.016 (equivalent to the yearly volatility σy = 0.016·√252 = 25.4%). Let us assume that the current asset price is S(0) = 100$.

Changing strike price K
Let us consider a scenario where the strike price can deviate ±4% from the asset price. We set K = {96, 96.04, 96.08, …, 103.96, 104}. Figure 12.5 depicts the result.

Changing interest rate r
In the previous part, we saw that the worst performance of our cubature formula occurs when K = S(0). So, we choose S(0) = K = 100 and vary r = {0.0001, 0.0002, …, 0.0199, 0.0200}. Figure 12.6 illustrates the outcome.

Changing daily volatility σ
Again, we choose S(0) = K = 100 and r = 0.0001. We increase the daily volatility (in 100 steps) from σ = 0.016 (equivalent to the yearly volatility σy = 0.016·√252 =

Fig. 12.7 Changing daily volatility (option price and relative error versus daily volatility)

Fig. 12.8 n-day Black–Scholes versus cubature pricing formula, K = 99 (option price and relative error versus number of days)

25.4%) to σ = 0.024 (equivalent to the yearly volatility σy = 0.024·√252 = 38.1%). Figure 12.7 summarizes the obtained output.
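The one-day comparison just described can be sketched in a few lines. This is a hedged illustration, not the authors' MATLAB code: it assumes the standard degree-5 cubature data in dimension one, namely path endpoints ω(1) ∈ {√3, 0, −√3} with weights {1/6, 2/3, 1/6} (the θ values in Table 12.2, not reproduced here, correspond to these endpoints).

```python
from math import exp, log, sqrt
from statistics import NormalDist

def black_scholes(S0, K, r, sigma, T=1.0, call=True):
    """Classical Black-Scholes price (the 'exact' solution in the text)."""
    d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    N = NormalDist().cdf
    if call:
        return S0 * N(d1) - K * exp(-r * T) * N(d2)
    return K * exp(-r * T) * N(-d2) - S0 * N(-d1)

def cubature5_price(S0, K, r, sigma, T=1.0, call=True):
    """One-step degree-5 cubature price (the 'approximated' solution).

    Three deterministic paths replace the Brownian motion; endpoints and
    weights are the standard degree-5 choice (an assumption here)."""
    ends = [sqrt(3.0), 0.0, -sqrt(3.0)]
    weights = [1/6, 2/3, 1/6]
    value = 0.0
    for lam, e in zip(weights, ends):
        ST = S0 * exp((r - 0.5 * sigma**2) * T + sigma * sqrt(T) * e)
        value += lam * (max(ST - K, 0.0) if call else max(K - ST, 0.0))
    return exp(-r * T) * value
```

With the chapter's inputs (S(0) = 100, r = 0.0001, σ = 0.016) the three one-day prices are 102.8072, 99.9972 and 97.2640, and the cubature price is noticeably below Black–Scholes at the money, matching the behavior reported above.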

12.4.1.1 Changing Time to Maturity

Assume that the overnight rate and daily volatility are constant through time and given by r = 0.0001 and σ = 0.016. We would like to calculate the price of an n-day European option. The idea is to repeat Eq. (12.7) on day n for nodes starting at day n − 1. Since our cubature formula of degree 5 gives three possible trajectories, we have to calculate the prices and corresponding probabilities at the nodes of an n-step trinomial tree. That is, at the end of day n we have 3^n possible price trajectories. We will deal with this exponential growth in the number of trajectories in Sect. 12.4.2. Figures 12.8, 12.9 and 12.10 show the results for K = {99, 100, 101} when 1 ≤ n ≤ 9.
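The brute-force enumeration of all 3^n trajectories can be sketched as follows (our hedged Python illustration; the chapter's implementation is in MATLAB, and the endpoints ±√3, 0 with weights 1/6, 2/3, 1/6 are assumed as the degree-5 cubature data):

```python
import itertools
from math import exp, sqrt

def cubature5_nday(S0, K, r, sigma, n, call=True):
    """n-day European price by enumerating all 3**n cubature trajectories,
    one degree-5 cubature step per day."""
    # per-day moves: (weight, path endpoint omega(1))
    moves = [(1/6, sqrt(3.0)), (2/3, 0.0), (1/6, -sqrt(3.0))]
    price = 0.0
    for path in itertools.product(moves, repeat=n):
        prob, S = 1.0, S0
        for lam, e in path:
            prob *= lam                              # product of per-step weights
            S *= exp((r - 0.5 * sigma**2) + sigma * e)  # Eq. (12.7), one day
        price += prob * (max(S - K, 0.0) if call else max(K - S, 0.0))
    return exp(-r * n) * price
```

The 3^n cost makes this usable only for small n (the nine days shown in the figures); the recombining construction of Sect. 12.4.2 reduces the node count to 2n + 1.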

Fig. 12.9 n-day Black–Scholes versus cubature pricing formula, K = 100 (option price and relative error versus number of days)

Fig. 12.10 n-day Black–Scholes versus cubature pricing formula, K = 101 (option price and relative error versus number of days)

Remark 12.2
1. Choosing proper daily, weekly, monthly, quarterly and yearly volatilities and interest rates, it is possible to change the time unit from one day to week(s), month(s), quarter(s) or year(s).
2. We can choose a negative interest rate in our cubature formula.
3. We chose the Samuelson representation of the asset price process, i.e., geometric Brownian motion for the asset price dynamics, which is log-normally distributed. It is possible to choose another representation, e.g., Bachelier's, which assumes that the asset price process is normally distributed.
4. The solution St of Eq. (12.6) satisfies the following condition: if r < 0, then St → 0, and if r > 0, then St → ∞ as t → ∞. For a proof, see [13]. The proof is based on the Law of the Iterated Logarithm for one-dimensional Brownian motion. Figure 12.11 gives a graphical illustration of this fact.
5. It is possible to choose different lengths of time units, volatilities and interest rates in different steps. Figure 12.12 depicts the idea for a three-step trinomial tree where the time units are {1, 2, 3} days, the daily interest rates are {−0.001, 0, 0.0005} and the volatilities are {0.016, 0.014, 0.018}.

Fig. 12.11 Five-step trinomial model based on the cubature formula of degree 5 (σ = 0.1 and r = ±0.075)

Fig. 12.12 Three-step trinomial model with different time units, volatilities and interest rates

6. Choosing equal time units, volatility and interest rate, the trinomial tree in our model becomes recombining, which makes our model much easier to deal with and gives more flexibility to implement it in computer software (see Fig. 12.11). We will discuss the idea in the upcoming subsection.

12.4.2 Construction of a Trinomial Model Based on the Cubature Formula

Assume that we have equal time units, equal volatilities and equal interest rates in each step of an n-step trinomial model. Then, an immediate application of


the cubature formula of degree 5 on the Wiener space of continuous real-valued functions is to use it to construct a recombining trinomial model. The idea presented in Sect. 12.4.1.1 can be developed in an optimal way. Such a development involves the construction of a recombining trinomial tree. We start by defining an up factor fu, a middle factor fm and a down factor fd. The price at each node can follow the upward trajectory ω1 with probability λ1, the middle trajectory ω2 with probability λ2, and the downward trajectory ω3 with probability λ3. If the unit of time is equal in all steps of the trinomial tree, then it suffices to use Eq. (12.7) for 1 ≤ k ≤ 3 to find fu, fm and fd. That is,

fu = S1(1)/S(0) = exp{(r − σ²/2) + σ ω1(1)},
fm = S2(1)/S(0) = exp{r − σ²/2},
fd = S3(1)/S(0) = exp{(r − σ²/2) + σ ω3(1)}.

Since ω1 and ω3 are symmetric, i.e., ω1 = −ω3, and the price process is log-normal, i.e., Sk > 0, it is easy to see that fm = √(fu fd). Thus, such a construction leads to a recombining trinomial tree. As a consequence, the number of nodes in the constructed recombining trinomial tree decreases from 3^n to 2n + 1. It is easy to perform a trinomial expansion and find the trinomial coefficients, which represent the number of possible paths to reach each node. Also, one can simply calculate the corresponding probabilities and prices at each step of the tree and for each node. The up, middle and down factors can easily be found using the cubature formula for one unit of time. As an example, we consider a scenario where a unit of time is equal to one day, the current asset price is S(0) = 100, the overnight interest rate is r = 0.0001, the daily volatility is σ = 0.016 and the strike price is K = 100. Then, we substitute our cubature formula of degree 5 in Eq. (12.7), which gives the possible asset prices after one day. The asset price at the end of day one can be either S1(1) = 102.8072, S2(1) = 99.9972 or S3(1) = 97.2640. Dividing S1(1), S2(1) and S3(1) by S(0) at the end of day one gives us the up, middle and down factors, i.e., fu, fm, fd. We use the trinomial expansion to find all (n + 1)(n + 2)/2 trinomial coefficients at step n of the constructed tree. Having the trinomial coefficients, we can calculate the number of possible paths to reach each node at any step n. Calculating the prices at each node and the probability of reaching each node is then straightforward. Finally, we calculate the price of European call and put options for 1 ≤ n ≤ 252. Figure 12.13 compares the obtained results with the Black–Scholes pricing formula (as can be seen in the left figure, the prices are almost identical).
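Backward induction on this recombining tree can be sketched as follows (our Python illustration of the construction just described, not the authors' code; λ = (1/6, 2/3, 1/6) and ω1(1) = √3 = −ω3(1) are assumed as the degree-5 cubature data, which is consistent with the one-day prices 102.8072, 99.9972, 97.2640 quoted above):

```python
from math import exp, sqrt

def trinomial_european(S0, K, r, sigma, n, call=True):
    """European price on the recombining trinomial tree: 2n+1 terminal
    nodes S0 * fm**n * u**j, j = -n..n, where u = exp(sigma*sqrt(3))."""
    u = exp(sigma * sqrt(3.0))      # ratio f_u / f_m = f_m / f_d
    fm = exp(r - 0.5 * sigma**2)    # middle factor
    lam_u, lam_m, lam_d = 1/6, 2/3, 1/6
    disc = exp(-r)                  # one-day discount factor
    # terminal payoffs at the 2n+1 nodes
    values = [max((S0 * fm**n * u**j) - K, 0.0) if call
              else max(K - (S0 * fm**n * u**j), 0.0)
              for j in range(-n, n + 1)]
    # backward induction: step m has 2m+1 nodes
    for m in range(n, 0, -1):
        values = [disc * (lam_d * values[j] + lam_m * values[j + 1]
                          + lam_u * values[j + 2])
                  for j in range(2 * m - 1)]
    return values[0]
```

Running this up to n = 252 steps is cheap (quadratic in n), in contrast with the 3^n enumeration of Sect. 12.4.1.1.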


12.4.2.1 An Application in Pricing American Options

The constructed n-step trinomial model based on the cubature formula of degree 5 can be developed and used to evaluate the price of American options. The idea is simple. We would like to construct an n-step trinomial tree from today until maturity. Assume that the yearly volatility, the annual continuously compounded interest rate and the time to maturity (in years) are given. We interpret the time to maturity as one unit of time. Then, we use the yearly volatility and the annual continuously compounded interest rate to calculate the corresponding rate and volatility for the chosen unit of time. As an example, assume that today's asset price is S(0) = 100, the annual continuously compounded interest rate is r = 0.05, the yearly volatility is σ = 0.26 and, for 6-month European options, i.e., T = 0.5 (or 126 trading days), the strike price is K = 90. To construct an n-step trinomial tree within the chosen unit of time, we proceed as follows. Since we consider 6-month (half-year) options and we would like to have these 6 months as one unit of time, we convert the yearly volatility to the volatility of one unit of time using σ1 = √T σ. The continuously compounded interest rate for one unit of time can be approximated by r1 = (1 + r)^T. Now, for each step n ≥ 2 in our trinomial tree, we calculate fu, fm and fd, substituting σn = √(1/n) σ1 and rn = (1 + r1)^(1/n) in Eq. (12.7). Then, similarly to the previous examples, we calculate the possible prices, payoffs and probabilities at the nodes of our constructed trinomial tree. Finally, we discount the expected payoff. Figure 12.14 shows the results. As another example, we compare our method with the classical Cox–Ross–Rubinstein (CRR) binomial model [3]. As Fig. 12.15 shows, sometimes the result of our model and sometimes that of the CRR model is closer to the Black–Scholes pricing formula.
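The early-exercise step can be sketched with the same recombining tree; this is our hedged illustration, not the authors' implementation, and for simplicity it uses a per-step continuously compounded rate r·T/n rather than the (1 + r1)^(1/n) convention of the text:

```python
from math import exp, sqrt

def trinomial_option(S0, K, r, sigma, T, n, call=False, american=True):
    """Recombining trinomial tree over [0, T] with n steps, built from the
    degree-5 cubature factors; optional early-exercise check."""
    sig = sigma * sqrt(T / n)   # per-step volatility
    rho = r * T / n             # per-step rate (simplified convention)
    u = exp(sig * sqrt(3.0))
    fm = exp(rho - 0.5 * sig**2)
    lam_u, lam_m, lam_d = 1/6, 2/3, 1/6
    disc = exp(-rho)

    def payoff(S):
        return max(S - K, 0.0) if call else max(K - S, 0.0)

    values = [payoff(S0 * fm**n * u**j) for j in range(-n, n + 1)]
    for m in range(n, 0, -1):
        new = []
        for j in range(2 * m - 1):
            cont = disc * (lam_d * values[j] + lam_m * values[j + 1]
                           + lam_u * values[j + 2])
            S = S0 * fm**(m - 1) * u**(j - (m - 1))  # price at this node
            new.append(max(cont, payoff(S)) if american else cont)
        values = new
    return values[0]
```

With S(0) = 100, K = 90, r = 0.05, σ = 0.26 and T = 0.5 as above, the European put value lands near the Black–Scholes value (about 2.4), and the American value dominates it, as it must.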
Examining different inputs, we observed that our trinomial model gives lower prices for European call options and higher prices for European put options compared with the Black–Scholes pricing formulae. These differences increased for higher interest rates, for example, when r > 0.1.


Fig. 12.13 252 steps (days) recombining trinomial model versus Black–Scholes pricing formula


12.5 Cubature Formula of Degree 7

In Section 6, Example 3 of our work [8], we showed that the cubature formula (exponentiating Lie polynomials) of degree 7 gives the information summarized in Table 12.3 (see also [4, p. 102]). Additionally, the elements of the Philip Hall basis up to commutators with 6 elements are

Fig. 12.14 126-step trinomial model versus Black–Scholes pricing formula (fixed maturity; option price and relative error versus number of steps)

Fig. 12.15 126-step trinomial model versus CRR binomial model

Table 12.3 The coefficients αk,j and λk

k    αk,1                  αk,2       αk,3     αk,4      λk
1    −2.856970013872805     0         −1/12    −1/360    0.011257411327721
2    −1.355626179974266     0         −1/12    −1/360    0.222075922005613
3     0                     √(5/32)   −1/12    −1/360    0.266666666666667
4     0                    −√(5/32)   −1/12    −1/360    0.266666666666667
5     1.355626179974266     0         −1/12    −1/360    0.222075922005613
6     2.856970013872805     0         −1/12    −1/360    0.011257411327721


ε0, ε1, [ε0, ε1], [ε0, [ε0, ε1]], [ε1, [ε0, ε1]], [ε0, [ε0, [ε0, ε1]]], [ε1, [ε0, [ε0, ε1]]], [ε1, [ε1, [ε0, ε1]]], [ε0, [ε0, [ε0, [ε0, ε1]]]], [ε1, [ε0, [ε0, [ε0, ε1]]]], [ε1, [ε1, [ε0, [ε0, ε1]]]], [ε1, [ε1, [ε1, [ε0, ε1]]]], [[ε0, ε1], [ε0, [ε0, ε1]]], [[ε0, ε1], [ε1, [ε0, ε1]]], [ε1, [ε0, [ε0, [ε0, [ε0, ε1]]]]], [ε1, [ε1, [ε0, [ε0, [ε0, ε1]]]]], [ε1, [ε1, [ε1, [ε0, [ε0, ε1]]]]], [ε1, [ε1, [ε1, [ε0, [ε0, ε1]]]]], [ε1, [ε1, [ε1, [ε1, [ε0, ε1]]]]], [[ε0, ε1], [ε0, [ε0, [ε0, ε1]]]], [[ε0, ε1], [ε1, [ε0, [ε0, ε1]]]], [[ε0, ε1], [ε1, [ε1, [ε0, ε1]]]], [[ε0, [ε0, ε1]], [ε1, [ε0, ε1]]].

Furthermore, the linear space U is spanned by the 12 tensors shown in bold. By [4, p. 102], the cubature formula of degree 7 takes the following form:

πU(log(X0,1(ω))) = ε0 + αk,1 ε1 + αk,2 [ε0, ε1] + αk,3 [ε1, [ε0, ε1]] + αk,4 [ε1, [ε1, [ε1, [ε0, ε1]]]].   (12.12)

Now, the objective is to find the paths ωk, 1 ≤ k ≤ 6, satisfying (12.12). Note that for the path ω(u) = (u, ϕ1 u, …, ϕd u) we have

Ss,t(ω) = exp((t − s)(ε0 + ϕ1 ε1 + · · · + ϕd εd)).

Divide the interval [0, 1] into 6 subintervals of lengths θ11, θ12, …, θ16. Consider the piecewise-linear path ω such that the slope of the function ω1 on the ith interval is equal to θ2i/θ1i. Its signature is the tensor product of the signatures of all the pieces,

S0,1(ω) = exp(θ11 ε0 + θ21 ε1) ⊗ · · · ⊗ exp(θ16 ε0 + θ26 ε1).

Equation (12.12) takes the form

πU(log(exp(θ11 ε0 + θ21 ε1) ⊗ · · · ⊗ exp(θ16 ε0 + θ26 ε1))) = ε0 + αk,1 ε1 + αk,2 [ε0, ε1] + αk,3 [ε1, [ε0, ε1]] + αk,4 [ε1, [ε1, [ε1, [ε0, ε1]]]].   (12.13)

Both sides of this equation are elements of the linear space U of dimension 12. We need to expand the left-hand side in the basis in which the right-hand side is expanded, and equate the coefficients of both expansions. In this way, we obtain a system of 12 equations with 12 unknowns. To proceed with the above expansion, we need a technical tool called the Baker–Campbell–Hausdorff theorem. Let e0 and e1 be two elements of the free Lie algebra L with 2 generators, say ε0 and ε1. For every element b of the Philip Hall basis BL of L, replace each symbol εi with ei and call the result b̃.


Theorem 12.1 (Baker–Campbell–Hausdorff) There exist rational numbers {rb : b ∈ BL} such that

log(exp(e0) ⊗ exp(e1)) = Σ_{b ∈ BL} rb b̃.

In particular, Casas and Murua [2] calculated the coefficients rb for 111013 elements of the Philip Hall basis containing up to 20 symbols. The terms lying in πU L are as follows:

πU log(exp(e0) ⊗ exp(e1)) = e0 + e1 + (1/2)[e0, e1] + (1/12)[e0, [e0, e1]] − (1/12)[e1, [e0, e1]] − (1/24)[e1, [e0, [e0, e1]]] + (1/180)[e1, [e1, [e0, [e0, e1]]]] + (1/720)[e1, [e1, [e1, [e0, e1]]]] − (1/120)[[e0, e1], [e1, [e0, e1]]].

How do we find the corresponding expansion for the logarithm of the tensor product of 6 terms? We use the fact that the tensor product is associative. In particular,

log(exp(e0) ⊗ exp(e1) ⊗ exp(e2)) = log((exp(e0) ⊗ exp(e1)) ⊗ exp(e2)).

We replace the first two terms on the right-hand side using the ordinary Baker–Campbell–Hausdorff formula, simplify the result, obtain the formula for the logarithm of the tensor product of 3 terms, and so on up to 6 terms. To perform these calculations, we use the MATLAB® codes for "Differential Equations on Manifolds" provided by [11]. In particular, we found that the Baker–Campbell–Hausdorff formula for

log(exp(e1) ⊗ · · · ⊗ exp(e6))

contains 7639 nonzero coefficients for elements of the Philip Hall basis containing up to 6 symbols. In the obtained formula, we substituted ei with θ1i ε0 + θ2i ε1, simplified the result and compared the coefficients in Eq. (12.13). As a result, we obtained a system of 12 polynomial equations with 12 unknowns that occupies 48 pages of A4 format. It will not be presented here but is available upon request. The next obstacle is to find the solution of the obtained polynomial system and, consequently, the paths. Finding the paths exactly requires extensive calculations, e.g., with Gröbner bases. We chose, however, to use Newton's method for solving the nonlinear system of equations, since it gives more accurate (and real-valued) solutions. The first equation is

θ11 + θ12 + θ13 + θ14 + θ15 + θ16 = 1,

and therefore we set θ1i = 1/6 for all paths.
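The truncated Baker–Campbell–Hausdorff expansion above can be checked numerically. The sketch below (our added illustration, not part of the original code) represents e0 and e1 by 4×4 strictly upper-triangular matrices; then every bracket of weight 4 or more vanishes, exp and log become finite sums, and the BCH series terminates exactly after the weight-3 terms:

```python
import numpy as np

def nexp(N):
    # exp of a 4x4 strictly upper-triangular (nilpotent) matrix: N**4 = 0
    return np.eye(4) + N + N @ N / 2 + (N @ N @ N) / 6

def nlog(M):
    # log(I + X) = X - X^2/2 + X^3/3 for nilpotent X
    X = M - np.eye(4)
    return X - X @ X / 2 + (X @ X @ X) / 3

def br(X, Y):
    # Lie bracket
    return X @ Y - Y @ X

rng = np.random.default_rng(1)
e0 = np.triu(rng.standard_normal((4, 4)), k=1)
e1 = np.triu(rng.standard_normal((4, 4)), k=1)

lhs = nlog(nexp(e0) @ nexp(e1))
# BCH terms up to weight 3 (exact here, since weight-4 brackets vanish)
rhs = (e0 + e1 + br(e0, e1) / 2
       + br(e0, br(e0, e1)) / 12 - br(e1, br(e0, e1)) / 12)
```

In this nilpotent setting the two sides agree up to floating-point roundoff, which confirms the coefficients 1/2, 1/12 and −1/12 of the low-order terms.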
After that, we choose only 6 of the obtained Lie polynomials and, using Newton's method for solving systems of nonlinear equations (see [9]), we find all θ2i. For path k, 1 ≤ k ≤ 6, the values θ2i, 1 ≤ i ≤ 6, are summarized in Table 12.4.
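A generic Newton iteration with a finite-difference Jacobian, of the kind used here, can be sketched as follows (our illustration on a toy 2×2 system; the actual 12×12 polynomial system is too long to reproduce):

```python
import numpy as np

def newton_system(F, x0, tol=1e-12, max_iter=50, h=1e-7):
    """Newton's method for F(x) = 0 with a forward-difference Jacobian."""
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        f = np.asarray(F(x), dtype=float)
        if np.linalg.norm(f) < tol:
            break
        J = np.empty((f.size, x.size))
        for j in range(x.size):          # build Jacobian column by column
            xh = x.copy()
            xh[j] += h
            J[:, j] = (np.asarray(F(xh), dtype=float) - f) / h
        x = x - np.linalg.solve(J, f)    # Newton step
    return x

# toy system: intersection of the unit circle with the line x = y
sol = newton_system(lambda v: [v[0]**2 + v[1]**2 - 1.0, v[0] - v[1]],
                    [1.0, 0.3])
```

As the remark below notes for the actual system, the converged root depends on the initial guess, so several starting points should be tried.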


Table 12.4 Obtained θ2i using Newton's method (rows k = 1, …, 6; columns i = 1, …, 6)

k = 1:   0.122115547246675  −2.129635156086790   0.579034601903712   0.579034601903712  −2.129635156086790   0.122115547246675
k = 2:   0.057943566794764  −1.010507340772309   0.274750683990411   0.274750683990411  −1.010507340772309   0.057943566794764
k = 3:  −0.699645095895579   0.138305465548990   0.711600837704638  −0.711600837704638  −0.138305465548990   0.699645095895579
k = 4:   0.699645095895579  −0.138305465548990  −0.711600837704638   0.711600837704638   0.138305465548990  −0.699645095895579
k = 5:  −0.057943566794764   1.010507340772309  −0.274750683990411  −0.274750683990411   1.010507340772309  −0.057943566794764
k = 6:  −0.122115547246675   2.129635156086790  −0.579034601903712  −0.579034601903712   2.129635156086790  −0.122115547246675

Putting all the above discussion together, the line equation for the trajectories can be written in general as

ωk(ti) = 6 θk,2i (ti − ti−1) + ωk(ti−1), i = 1, …, 6, t0 = 0, ωk(0) = 0.   (12.14)

Figure 12.16 is created in MATLAB® and depicts two sets of possible trajectories in the cubature formula of degree 7. In the left sub-figure (data shown in Table 12.4) we chose initial guesses of zeros, and in the right sub-figure initial guesses of ones. Substituting the results shown in Table 12.4 in all 12 Lie polynomials gives a smaller error, and therefore we will use them.

Remark 12.3 Newton's method may give several answers depending on the (proper) initial guesses; however, the end points of each trajectory (constructed from the different answers) would be almost the same. Additionally, we chose only 6 of the obtained Lie polynomials to find the θ2i, since otherwise Newton's method would not converge or would give bigger errors. Finally, the 6 chosen polynomials were not the same when estimating the θ2i for every kth row of Table 12.4.
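With equal subinterval lengths θ1i = 1/6, Eq. (12.14) makes each increment of ωk equal to θk,2i, so the trajectories can be rebuilt directly from Table 12.4 (our sketch below hard-codes row k = 1). A useful consistency check, via Eq. (12.12), is that the endpoint ωk(1) reproduces αk,1 from Table 12.3.

```python
import numpy as np

# Row k = 1 of Table 12.4: theta_{1,2i}, i = 1..6
theta_k1 = [0.122115547246675, -2.129635156086790, 0.579034601903712,
            0.579034601903712, -2.129635156086790, 0.122115547246675]

def trajectory(thetas):
    """Piecewise-linear cubature path from Eq. (12.14).

    With t_i - t_{i-1} = 1/6 the update
        omega(t_i) = 6 * theta_i * (t_i - t_{i-1}) + omega(t_{i-1})
    reduces to omega(t_i) = omega(t_{i-1}) + theta_i."""
    ts = np.linspace(0.0, 1.0, len(thetas) + 1)
    omega = np.concatenate(([0.0], np.cumsum(thetas)))
    return ts, omega
```

For row k = 1 the endpoint comes out as −2.856970013872805, i.e., the value α1,1 in Table 12.3, which matches the symmetric shape of the plotted trajectories.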


Fig. 12.16 Two possible sets of trajectories in the cubature formula of degree 7


12.5.1 Black–Scholes Versus Cubature Formula (Degree 7)

In this part, we repeat exactly the same experiments as in Sect. 12.4.1, with the only difference that we use the cubature formula of degree 7 instead of the cubature formula of degree 5.

Changing strike price K
Let the strike price deviate ±4% from the asset price. We set K = {96, 96.04, 96.08, …, 103.96, 104}. Figure 12.17 depicts the result.

Fig. 12.17 Changing strike price (option price and absolute error versus strike price, for European call and put options)

Fig. 12.18 Changing interest rate (option price and relative error versus interest rate, for European call and put options)

Fig. 12.19 Changing daily volatility

Changing interest rate r
Figure 12.18 illustrates the outcome of putting S(0) = K = 100 and varying r = {0.0001, 0.0002, …, 0.0199, 0.0200}.

Changing daily volatility σ
Put S(0) = K = 100 and r = 0.0001. Increase the daily volatility (in 100 steps) from σ = 0.016 to σ = 0.024. Figure 12.19 summarizes the obtained output.

12.5.1.1 Changing Time to Maturity

Assume that the overnight rate is r = 0.0001 and the daily volatility is σ = 0.016, and that both are constant through time. We would like to calculate the price of an n-day European option. To do this, we adapt and program an n-step hexanomial tree based on our cubature formula. That is, at day n we have 6^n price trajectories. Figures 12.20, 12.21 and 12.22 show the results for K = {99, 100, 101}.

Fig. 12.20 n-day Black–Scholes versus cubature pricing formulae, K = 99 (option price and relative error versus number of days)

Fig. 12.21 n-day Black–Scholes versus cubature pricing formulae, K = 100 (option price and relative error versus number of days)

Fig. 12.22 n-day Black–Scholes versus cubature pricing formulae, K = 101 (option price and relative error versus number of days)

Fig. 12.23 3-step hexanomial (reduced to pentanomial) tree constructed based on the cubature formula of degree 7


Fig. 12.24 63 steps pentanomial model versus CRR binomial model

Figure 12.23 shows a 3-step hexanomial tree constructed based on the cubature formula of degree 7. As can be seen in the figure, the hexanomial tree reduces to a pentanomial tree, since the trajectories ω3 and ω4 have exactly the same value at the end of one unit of time. With the same argument we made for the trinomial tree, one can see that the hexanomial (pentanomial) tree is also recombining. Generally, the cubature formula of degree 7 performs better than the cubature formula of degree 5. As an example with fixed maturity time, we used the same inputs, except K = 105, and followed the same approach as in Sect. 12.4.2.1. Then, we programmed a pentanomial model (tree) with (n + 4)(n + 3)(n + 2)(n + 1)/4! nodes. When we reduce the hexanomial tree to the pentanomial tree, we decrease the number of probability parameters from 6 to 5 by adding the original λ3 and λ4. Figure 12.24 compares the prices and errors of the CRR model and our pentanomial approach versus the Black–Scholes model.
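The node count (n + 4)(n + 3)(n + 2)(n + 1)/4! quoted above is the number of distinct occupation vectors (m1, …, m5) with m1 + · · · + m5 = n, i.e., the binomial coefficient C(n + 4, 4); a quick check (our illustration, not from the chapter):

```python
from math import comb
from itertools import product

def pentanomial_nodes(n):
    # (n+4)(n+3)(n+2)(n+1)/4! = C(n+4, 4)
    return (n + 4) * (n + 3) * (n + 2) * (n + 1) // 24

def brute_force(n):
    """Count distinct occupation vectors over all 5**n move sequences."""
    seen = set()
    for path in product(range(5), repeat=n):
        seen.add(tuple(path.count(m) for m in range(5)))
    return len(seen)
```

This polynomial growth (versus 6^n for the raw hexanomial enumeration) is what makes the pentanomial lattice practical.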


12.6 Conclusion and Future Works

In this work, we studied cubature formulae on Wiener space with the purpose of extending the idea to different (financial) applications. We obtained cubature formulae of degrees 5 and 7 on the Wiener space, then we used the obtained formulae to approximate the price of European call and put options, and after that we compared the obtained approximated results with the exact results of the classical Black–Scholes option pricing formulae. We examined different inputs, and consequently the deviations between exact and approximated solutions varied. Choosing real-market data with accurate prevailing volatilities and interest rates would have helped us to draw a better conclusion. Last but not least, we were able to use our cubature formulae to develop lattice (tree) approaches which have practical applications, e.g., in pricing American options. Despite the mentioned facts, we would say that the results were satisfactory at this stage and indicated that we are most probably on the right track. To make a proper conclusion about the accuracy of our cubature formulae, and their practical and possible usages in different areas of science with different applications, further and deeper studies are indubitably vital. For example, it might be a good idea to use our cubature formulae to find approximate solutions to some other boundary value problems which have explicit solutions. Considering our research procedure and future works, we have already written (see [12]) about the construction of an arbitrage-free large market model within the Heath–Jarrow–Morton (HJM) framework, where the forward spread curves for a given finite tenor structure are described as a mild solution to a BVP for a system of infinite-dimensional SDEs.
In other words, we have reviewed the construction of a financial market which contains finitely many overnight index swap (OIS) zero-coupon bonds and forward rate agreement (FRA) contracts with all possible maturities (see also [8]), that is, including as many sources of uncertainty as possible to evaluate forward spread curves. Using the results of our previous works and the current one, we would like to apply the method of cubature on infinite-dimensional spaces in order to find solutions to the above system of SDEs.

References

1. Black, F., Scholes, M.: The pricing of options and corporate liabilities. J. Polit. Econ. 81(3), 637–654 (1973)
2. Casas, F., Murua, A.: An efficient algorithm for computing the Baker–Campbell–Hausdorff series and some of its applications. J. Math. Phys. 50(3), 033513 (2009)
3. Cox, J.C., Ross, S.A., Rubinstein, M.: Option pricing: a simplified approach. J. Financ. Econ. 7(3), 229–263 (1979)
4. Gyurkó, L.G., Lyons, T.J.: Efficient and practical implementations of cubature on Wiener space. In: Crisan, D. (ed.) Stochastic Analysis 2010, pp. 73–111. Springer, Heidelberg (2011)
5. Kijima, M.: Stochastic Processes with Applications to Finance, 2nd edn. Chapman and Hall/CRC Financial Mathematics Series. CRC Press, Boca Raton, FL (2013)


6. Kusuoka, S.: Approximation of expectation of diffusion process and mathematical finance. In: Taniguchi Conference on Mathematics Nara ’98, Advanced Studies in Pure Mathematics, vol. 31, pp. 147–165. Mathematical Society of Japan, Tokyo (2001)
7. Lyons, T., Victoir, N.: Cubature on Wiener space. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 460(2041), 169–198 (2004)
8. Malyarenko, A., Nohrouzian, H., Silvestrov, S.: An algebraic method for pricing financial instruments on post-crisis market. In: Silvestrov, S., Malyarenko, A., Rančić, M. (eds.) Algebraic Structures and Applications. SPAS 2017, Springer Proceedings in Mathematics & Statistics, vol. 317, pp. 839–856. Springer (2020)
9. Mathews, J.H., Fink, K.K.: Numerical Methods Using Matlab, 4th edn. Pearson (2004)
10. Merton, R.C.: Theory of rational option pricing. Bell J. Econ. Manage. Sci. 4, 141–183 (1973)
11. Munthe-Kaas, H., Owren, B.: Computations in a free Lie algebra. R. Soc. Lond. Philos. Trans. Ser. A Math. Phys. Eng. Sci. 357(1754), 957–981 (1999)
12. Nohrouzian, H., Ni, Y., Malyarenko, A.: An arbitrage-free large market model for forward spread curves. In: Proceedings of the ASMDA Conference (2019)
13. Øksendal, B.: Stochastic Differential Equations. An Introduction with Applications, 6th edn. Universitext. Springer, Berlin (2003)
14. Samuelson, P.A.: Rational theory of warrant pricing. In: Grünbaum, F.A., van Moerbeke, P., Moll, V.H. (eds.) Henry P. McKean Jr. Selecta, pp. 195–232. Springer International Publishing, Cham (2015)

Chapter 13
Gaussian Processes with Volterra Kernels

Yuliya Mishura, Georgiy Shevchenko, and Sergiy Shklyar

Abstract We study Volterra processes $X_t = \int_0^t K(t,s)\,dW_s$, where $W$ is a standard Wiener process and the kernel has the form $K(t,s) = a(s)\int_s^t b(u)c(u-s)\,du$. This form generalizes the Volterra kernel of fractional Brownian motion (fBm) with Hurst index $H > 1/2$. We establish smoothness properties of $X$, including continuity and Hölder continuity. It turns out that the Hölder smoothness of $X$ is close to the well-known Hölder smoothness of fBm, but slightly worse. Each smoothness theorem is accompanied by a comparison with fBm. We then investigate the problem of the inverse representation of $W$ via $X$ in the case where $c \in L^1[0,T]$ creates a Sonine pair, i.e., there exists $h \in L^1[0,T]$ such that $c * h \equiv 1$. This is a natural extension of the corresponding property of fBm, which generates the same filtration as the underlying Wiener process. Since the inverse representations of the Gaussian processes under consideration are based on the properties of Sonine pairs, we provide several examples of Sonine pairs, both well known and new.

Keywords Gaussian process · Volterra process · Sonine pair · Continuity · Hölder property · Inverse representation

MSC 2020 60G15

Y. Mishura · G. Shevchenko · S. Shklyar (B) Department of Probability Theory, Statistics and Actuarial Mathematics, Taras Shevchenko National University of Kyiv, 64, Volodymyrs’ka St., 01601 Kyiv, Ukraine e-mail: [email protected] Y. Mishura e-mail: [email protected] G. Shevchenko e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_13


13.1 Introduction

Among the various classes of Gaussian processes, consider the class of processes admitting an integral representation via some Wiener process. Such processes arise in finance, see e.g. [4]. They are a natural extension of fractional Brownian motion (fBm), which admits an integral representation via the Wiener process in which the Volterra kernel consists of power functions. The solution of many problems related to fBm is based on the Hölder properties of its trajectories. It is therefore interesting to study the smoothness properties of Gaussian processes admitting an integral representation via some Wiener process with a kernel that generalizes the kernel in the representation of fBm. The next question is what properties the kernel should have in order for the Wiener process and the corresponding Gaussian process to generate the same filtration. It turns out that the functions composing the kernel should form, in a specific way, a so-called Sonine pair, a property that the components of the kernel generating fBm possess. Thus, the properties of the Gaussian process are directly related to the analytical properties of the generating kernel. The present work is devoted to the study of these properties.

It is organized as follows. Section 13.2 is devoted to the smoothness properties of the Gaussian processes generated by Volterra kernels. Assumptions that ensure the existence and continuity of the Gaussian process are provided, and then the Hölder properties are established. These have certain peculiarities. Namely, under reasonable assumptions on the kernel we can establish the Hölder property only up to order $1/2$, while fBm with Hurst index $H$ has trajectories that are Hölder up to order $H$, and for $H > 1/2$ (exactly the case from which we start) fBm thus has better smoothness properties. In this connection, we establish conditions for smoothness comparable with that of fBm, but only on intervals separated from zero. Finally, we establish conditions on the kernel ensuring the Hölder property at zero. Section 13.3 describes how the generalized fractional calculus related to a Volterra process with Sonine kernel can be used to invert the corresponding covariance operator. Section 13.4 contains examples of Sonine pairs, and Sect. 13.5 contains all necessary auxiliary results.

13.2 Gaussian Volterra Processes and Their Smoothness Properties

Let $(\Omega, \mathcal{F}, \mathbb{F} = \{\mathcal{F}_t, t \ge 0\}, \mathsf{P})$ be a stochastic basis with filtration, and let $W = \{W_t, t \ge 0\}$ be a Wiener process adapted to this filtration. Consider a Gaussian process of the form

$$ X_t = \int_0^t K(t,s)\,dW_s \qquad (13.1) $$


where $K \in L^2([0,T]^2)$ is a Volterra kernel, i.e. $K(t,s) = 0$ for $s > t$. Obviously, $X$ is adapted to the filtration $\mathbb{F}$. Recall that a very common example of such a process is fractional Brownian motion (fBm) with Hurst index $H$, i.e., a Gaussian process $B^H = \{B^H_t, t \ge 0\}$ admitting a representation $B^H_t = \int_0^t K(t,s)\,dW_s$ with some Wiener process $W$ and Volterra kernel

$$ K(t,s) = c_H \left( s^{1/2-H} \big(t(t-s)\big)^{H-1/2} - \Big(H - \tfrac12\Big) s^{1/2-H} \int_s^t u^{H-3/2} (u-s)^{H-1/2}\,du \right) \mathbf{1}_{0<s<t}, \qquad (13.2) $$

where $c_H$ is a normalizing constant. For $H > 1/2$ this kernel can be rewritten as

$$ K(t,s) = \Big(H - \tfrac12\Big) c_H\, s^{1/2-H} \int_s^t u^{H-1/2} (u-s)^{H-3/2}\,du\; \mathbf{1}_{0<s<t}. \qquad (13.3) $$

Motivated by this form, in the case $H > 1/2$ we assume that the kernel in the representation (13.1) is given by

$$ K(t,s) = a(s) \int_s^t b(u) c(u-s)\,du, \qquad (13.4) $$

where $a, b, c : [0,T] \to \mathbb{R}$ are some measurable functions. Since many applications of fBm are based on its smoothness properties, we consider which properties of the functions $a$, $b$, $c$ provide a certain smoothness of the process $X$, which in the case under consideration takes the form

$$ X_t = \int_0^t \left( a(s) \int_s^t b(u) c(u-s)\,du \right) dW_s, \quad t \in [0,T]. \qquad (13.5) $$

Our first goal is to investigate the assumptions that ensure the existence and continuity of the process $X$. Concerning the $L$-spaces, we put, as is standard, $1/\infty = 0$ and $1/0 = \infty$.

Theorem 1 Assume that

(K1) $a \in L^p[0,T]$, $b \in L^q[0,T]$, and $c \in L^r[0,T]$ for $p \in [2,\infty]$, $q \in [1,\infty]$, $r \in [1,\infty]$, such that $1/p + 1/q + 1/r \le 3/2$.


(1) Then $\sup_{t \in [0,T]} \|K(t,\cdot)\|_{L^2[0,t]} < \infty$, which means that the process $X$ is well defined.

(2) If, in addition, $1/p + 1/r < 3/2$, then the process $X$ has a continuous modification.

Remark 1 In the case of fBm with $H > 1/2$ we have $a(t) = (H - \frac12) c_H t^{1/2-H}$, $b(t) = t^{H-1/2}$ and $c(t) = t^{H-3/2}$. Therefore, $p$ can be any number such that $\frac12 > 1/p > H - \frac12$, $q$ can be any number from $[1,\infty]$, and $r$ can be any number such that $1 > 1/r > \frac32 - H$. It means that both conditions of Theorem 1 are satisfied if we put $1/p = H - \frac12 + \frac{\varepsilon}{3}$, $1/q = \frac{\varepsilon}{3}$ and $1/r = \frac32 - H + \frac{\varepsilon}{3}$, where $0 < \varepsilon < \min\{3(H - \frac12),\ 3(1-H),\ \frac12\}$.

Proof For both statements we can assume, without loss of generality, that $1/q + 1/r \ge 1$. Considering statement (2), we can assume that $q < \infty$.

(1) Extend the functions $a, b, c$ onto the entire set $\mathbb{R}$ by setting $a(s) = b(s) = c(s) = 0$ for $s \notin [0,T]$. Extend the kernel $K(t,s)$ by setting $K(t,s) = 0$ for $s \notin [0,t]$. Then we have

$$ K(t,s) = a(s)\,\big(b \mathbf{1}_{[0,t]} * \tilde c\big)(s) \quad \text{for all } 0 \le t \le T,\ s \in \mathbb{R}, \qquad (13.6) $$

where $\tilde c(v) = c(-v)$. By Young's convolution inequality (13.20),

$$ \big\| b \mathbf{1}_{[0,t]} * \tilde c \big\|_{(1/q+1/r-1)^{-1}} \le \| b \mathbf{1}_{[0,t]} \|_q \|\tilde c\|_r \le \|b\|_q \|c\|_r. \qquad (13.7) $$

(Here we applied the inequality $1/q + 1/r \ge 1$.) By the Hölder inequality (13.21) for non-conjugate exponents,

$$ \|K(t,\cdot)\|_{(1/p+1/q+1/r-1)^{-1}} = \big\| a\,(b\mathbf{1}_{[0,t]} * \tilde c) \big\|_{(1/p+1/q+1/r-1)^{-1}} \le \|a\|_p \big\| b\mathbf{1}_{[0,t]} * \tilde c \big\|_{(1/q+1/r-1)^{-1}} \le \|a\|_p \|b\|_q \|c\|_r. \qquad (13.8) $$

Hence $K(t,\cdot) \in L^{(1/p+1/q+1/r-1)^{-1}}[0,t]$. Since $(1/p + 1/q + 1/r - 1)^{-1} \ge 2$, we conclude that $K(t,\cdot) \in L^2[0,t]$, and it follows from (13.8) that the norms are uniformly bounded. This completes the proof of the first statement.

(2) Let $0 \le t_1 < t_2 \le T$. It follows from (13.6) that

$$ K(t_2, s) - K(t_1, s) = a(s)\,\big(b\mathbf{1}_{(t_1,t_2]} * \tilde c\big)(s), \quad s \in \mathbb{R}. \qquad (13.9) $$

Similarly to (13.7) and (13.8),

$$ \big\| b\mathbf{1}_{(t_1,t_2]} * \tilde c \big\|_{(1/q+1/r-1)^{-1}} \le \big\| b\mathbf{1}_{(t_1,t_2]} \big\|_q \|\tilde c\|_r \le \big\| b\mathbf{1}_{(t_1,t_2]} \big\|_q \|c\|_r, $$

$$ \big\| K(t_2,\cdot) - K(t_1,\cdot) \big\|_{(1/p+1/q+1/r-1)^{-1}} = \big\| a\,(b\mathbf{1}_{(t_1,t_2]} * \tilde c) \big\|_{(1/p+1/q+1/r-1)^{-1}} \le \|a\|_p \big\| b\mathbf{1}_{(t_1,t_2]} * \tilde c \big\|_{(1/q+1/r-1)^{-1}} \le \|a\|_p \big\| b\mathbf{1}_{(t_1,t_2]} \big\|_q \|c\|_r. $$

Notice that $2 < (1/p + 1/q + 1/r - 1)^{-1}$, and the function $K(t_2,\cdot) - K(t_1,\cdot)$ vanishes outside the interval $[0,t_2]$. Applying the inequality (13.22) between the norms in $L^2[0,t_2]$ and $L^{(1/p+1/q+1/r-1)^{-1}}[0,t_2]$:

$$ \big\| K(t_2,\cdot) - K(t_1,\cdot) \big\|_2 \le \big\| K(t_2,\cdot) - K(t_1,\cdot) \big\|_{(1/p+1/q+1/r-1)^{-1}}\, t_2^{3/2 - 1/p - 1/q - 1/r} \le \|a\|_p \big\| b\mathbf{1}_{(t_1,t_2]} \big\|_q \|c\|_r\, t_2^{3/2 - 1/p - 1/q - 1/r} \le C \big\| b\mathbf{1}_{(t_1,t_2]} \big\|_q, $$

with $C = T^{3/2 - 1/p - 1/q - 1/r} \|a\|_p \|c\|_r$. Hence

$$ \mathsf{E}\big[(X_{t_2} - X_{t_1})^2\big] = \big\| K(t_2,\cdot) - K(t_1,\cdot) \big\|_2^2 \le C^2 \big\| b\mathbf{1}_{(t_1,t_2]} \big\|_q^2 = C^2 \left( \int_{t_1}^{t_2} |b(s)|^q\,ds \right)^{2/q} = \big(F(t_2) - F(t_1)\big)^{2/q}, $$

where $F(t) = C^q \int_0^t |b(s)|^q\,ds$ is a nondecreasing function. By Lemma 5, the process $\{X_t, t \in [0,T]\}$ has a continuous modification. $\square$

Now let us establish conditions ensuring Hölder properties of $X$.

Lemma 1 Assume that $a \in L^p[0,T]$, $b \in L^q[0,T]$, and $c \in L^r[0,T]$ with $p \in [2,\infty]$, $q \in (1,\infty]$, $r \in [1,\infty]$, so that $1/p + 1/r \ge \frac12$ and $1/p + 1/q + 1/r < \frac32$. Then the stochastic process $X$ defined by (13.5) has a modification satisfying the Hölder condition up to order $\frac32 - 1/p - 1/q - 1/r$.

Remark 2 As was mentioned in Remark 1, in the case of fractional Brownian motion, for any small positive $\varepsilon$ we have chosen $p$, $q$ and $r$ so that $1 \le 1/p + 1/q + 1/r \le 1 + \varepsilon$. Therefore under the conditions of Lemma 1 we get for fBm the Hölder property only up to order $1/2$, while in reality the Hölder property holds up to order $H > 1/2$.

Proof Extend the functions $a$, $b$, $c$ and $K(t,s)$ as in the proof of Theorem 1. Let $0 \le t_1 < t_2 \le T$. We are going to find an upper bound for $\|K(t_2,\cdot) - K(t_1,\cdot)\|_2$ using the representation (13.9). By the Hölder inequality for non-conjugate exponents (13.21),

$$ \big\| b\mathbf{1}_{(t_1,t_2]} \big\|_{(\frac32 - 1/p - 1/r)^{-1}} \le \|b\|_q \big\| \mathbf{1}_{(t_1,t_2]} \big\|_{(\frac32 - 1/p - 1/q - 1/r)^{-1}} = \|b\|_q (t_2 - t_1)^{\frac32 - 1/p - 1/q - 1/r}. $$

Here we use that $1/p + 1/q + 1/r \le \frac32$. By Young's convolution inequality (13.20),

$$ \big\| b\mathbf{1}_{(t_1,t_2]} * \tilde c \big\|_{(\frac12 - 1/p)^{-1}} \le \big\| b\mathbf{1}_{(t_1,t_2]} \big\|_{(\frac32 - 1/p - 1/r)^{-1}} \|\tilde c\|_r \le \|b\|_q \|c\|_r (t_2 - t_1)^{\frac32 - 1/p - 1/q - 1/r}. $$

Here $\tilde c(v) = c(-v)$; we used the inequalities $r \ge 1$ and $\frac12 \le 1/p + 1/r < \frac32$, so that $(\frac32 - 1/p - 1/r)^{-1} \ge 1$, and $p \ge 2$, so that $(\frac12 - 1/p)^{-1} \ge 2$. Again, by the Hölder inequality for non-conjugate exponents,

$$ \big\| K(t_2,\cdot) - K(t_1,\cdot) \big\|_2 = \big\| a\,(b\mathbf{1}_{(t_1,t_2]} * \tilde c) \big\|_2 \le \|a\|_p \big\| b\mathbf{1}_{(t_1,t_2]} * \tilde c \big\|_{(\frac12 - 1/p)^{-1}} \le \|a\|_p \|b\|_q \|c\|_r (t_2 - t_1)^{\frac32 - 1/p - 1/q - 1/r}. \qquad (13.10) $$

Hence

$$ \mathsf{E}\big[(X_{t_2} - X_{t_1})^2\big] = \big\| K(t_2,\cdot) - K(t_1,\cdot) \big\|_2^2 \le \|a\|_p^2 \|b\|_q^2 \|c\|_r^2 (t_2 - t_1)^{3 - 2(1/p + 1/q + 1/r)}. $$

By Corollary 1, the process $\{X_t, t \in [0,T]\}$ has a modification that satisfies the Hölder condition up to order $\frac32 - 1/p - 1/q - 1/r$. $\square$

The following statement follows, to some extent, from Lemma 1. Now we drop the condition $1/p + 1/r \ge \frac12$ and simultaneously relax the assertion of that lemma.

Theorem 2 Let $a \in L^p[0,T]$, $b \in L^q[0,T]$, and $c \in L^r[0,T]$ with $p \in [2,\infty]$, $q \in (1,\infty]$, and $r \in [1,\infty]$, which satisfy the inequality $1/p + 1/q + 1/r < \frac32$. Then the stochastic process $X$ defined in (13.5) has a modification that satisfies the Hölder condition up to order $\frac32 - 1/q - \max(\frac12,\ 1/p + 1/r)$.

Remark 3 For the fBm with Hurst index $H \in (\frac12, 1)$ and the functions $a$, $b$, $c$ and exponents $p$, $q$, $r$ defined in Remark 1, Theorem 2 provides the Hölder condition up to order $\frac32 - \frac{\varepsilon}{3} - \max(\frac12,\ 1 + \frac{2\varepsilon}{3}) = \frac12 - \varepsilon$. However, since the conditions of Lemma 1 hold true in this case, Lemma 1 gives the same result.

Proof Set $r' = \big(\max(1/r,\ \frac12 - 1/p)\big)^{-1}$. Then $r' \in [1,+\infty]$, $r' \le r$, $c \in L^{r'}[0,T]$, $1/p + 1/r' \ge \frac12$, and $1/p + 1/q + 1/r' < \frac32$. Applying Lemma 1 to the functions $a$, $b$, $c$ and exponents $p$, $q$ and $r'$, we obtain that the process $X$ has a modification that satisfies the Hölder condition up to order $\frac32 - 1/p - 1/q - 1/r' = \frac32 - 1/q - \max(\frac12,\ 1/p + 1/r)$. $\square$

Now let us formulate stronger conditions on the functions $a$, $b$ and $c$, providing better Hölder properties on any interval "close" to $[0,T]$, but not on the whole of $[0,T]$.

Theorem 3 Let $t_1 \ge 0$, $t_2 \ge 0$ and $t_1 + t_2 < T$. Let the functions $a$, $b$ and $c$ and the constants $p$, $p_1$, $q$, $q_1$, $r$, $r_1$ satisfy the following assumptions:

$a \in L^p[0,T] \cap L^{p_1}[t_1, T]$, where $2 \le p \le p_1$;
$b \in L^q[0,T] \cap L^{q_1}[t_1+t_2, T]$, where $1 < q \le q_1$;
$c \in L^r[0,T] \cap L^{r_1}[t_2, T]$, where $1 \le r \le r_1$.

Also, let $1/p + 1/q + 1/r \le \frac32$ and $1/q_1 + \max\big(\frac12,\ 1/p + 1/r_1,\ 1/p_1 + 1/r\big) < \frac32$. Then the stochastic process $\{X_t, t \in [t_1+t_2, T]\}$ has a modification that satisfies the Hölder condition up to order $\frac32 - 1/q_1 - \max\big(\frac12,\ 1/p + 1/r_1,\ 1/p_1 + 1/r\big)$.




Remark 4 Consider the fBm with Hurst index $H \in (\frac12, 1)$ on the interval $[0,T]$. Define the functions $a$, $b$, $c$ and exponents $p$, $q$, $r$ as in Remark 1. Let $p_1 = q_1 = r_1 = 3/\varepsilon$, where $\varepsilon$ comes from Remark 1, and let $t_1 = t_2 = t_0/2$ for some $t_0 \in (0,T)$. Then the conditions of Theorem 3 are satisfied and, according to Theorem 3, the fBm has a modification on the interval $[t_0, T]$ which satisfies the Hölder condition up to order $\frac32 - \frac{\varepsilon}{3} - \max\big(\frac12,\ \frac32 - H + \frac{2\varepsilon}{3},\ H - \frac12 + \frac{2\varepsilon}{3}\big) = H - \varepsilon$. This is equivalent to the fact that the fBm satisfies the Hölder condition on the interval $[t_0, T]$ up to order $H$.

Proof Extend the functions $a(s)$, $b(s)$, $c(s)$ and $K(t,s)$ as in the proof of Theorem 1. With this extension, (13.4) holds true for all $t \in [0,T]$ and $s \in \mathbb{R}$. Denote

$$ a_1(s) = a(s)\mathbf{1}_{[0,t_1)}(s), \quad a_2(s) = a(s)\mathbf{1}_{[t_1,T]}(s), \quad b_1(s) = b(s)\mathbf{1}_{[t_1+t_2,\,T]}(s), \quad c_1(s) = c(s)\mathbf{1}_{[t_2,\,T]}(s), $$
$$ \tilde c(s) = c(-s), \qquad \tilde c_1(s) = c_1(-s) = c(-s)\mathbf{1}_{[-T,\,-t_2]}(s). $$

The process $\{X_t, t \in [0,T]\}$ is well defined according to Theorem 1. We consider the increments of the process $\{X_t, t \in [t_1+t_2, T]\}$. Let $t_3$ and $t_4$ be such that $t_1 + t_2 \le t_3 < t_4 < T$. Then

$$ K(t_4,s) - K(t_3,s) = a(s)\int_{t_3}^{t_4} b(u)c(u-s)\,du = a(s)\int_{t_3}^{t_4} b_1(u)c(u-s)\,du \quad \text{for all } s \in \mathbb{R}; $$
$$ K(t_4,s) - K(t_3,s) = a_1(s)\int_{t_3}^{t_4} b_1(u)c_1(u-s)\,du \quad \text{for } 0 \le s < t_1; $$
$$ K(t_4,s) - K(t_3,s) = a_2(s)\int_{t_3}^{t_4} b_1(u)c(u-s)\,du \quad \text{for } t_1 \le s \le T. $$

Thus, for all $s \in \mathbb{R}$,

$$ K(t_4,s) - K(t_3,s) = a_1(s)\int_{t_3}^{t_4} b_1(u)c_1(u-s)\,du + a_2(s)\int_{t_3}^{t_4} b_1(u)c(u-s)\,du = a_1(s)\,\big(b_1\mathbf{1}_{(t_3,t_4]} * \tilde c_1\big)(s) + a_2(s)\,\big(b_1\mathbf{1}_{(t_3,t_4]} * \tilde c\big)(s). $$

The functions $a_1$, $b_1$ and $c_1$ with exponents $p$, $q_1$ and $\big(\max(1/r_1,\ \frac12 - 1/p)\big)^{-1}$ satisfy the conditions of Lemma 1. The functions $a_2$, $b_1$ and $c$ with exponents $p_1$, $q_1$ and $\big(\max(1/r,\ \frac12 - 1/p_1)\big)^{-1}$ also satisfy the conditions of Lemma 1. By inequality (13.10) in the proof of Lemma 1,

$$ \big\| a_1 \big(b_1\mathbf{1}_{(t_3,t_4]} * \tilde c_1\big) \big\|_2 \le \|a_1\|_p \|b_1\|_{q_1} \|c_1\|_{(\max(1/r_1, \frac12 - 1/p))^{-1}} (t_4 - t_3)^{\lambda_1}, $$
$$ \big\| a_2 \big(b_1\mathbf{1}_{(t_3,t_4]} * \tilde c\big) \big\|_2 \le \|a_2\|_{p_1} \|b_1\|_{q_1} \|c\|_{(\max(1/r, \frac12 - 1/p_1))^{-1}} (t_4 - t_3)^{\lambda_2}, $$

where

$$ \lambda_1 = \frac32 - \frac1p - \frac1{q_1} - \max\Big(\frac1{r_1},\ \frac12 - \frac1p\Big) = \frac32 - \frac1{q_1} - \max\Big(\frac12,\ \frac1p + \frac1{r_1}\Big), $$
$$ \lambda_2 = \frac32 - \frac1{p_1} - \frac1{q_1} - \max\Big(\frac1{r},\ \frac12 - \frac1{p_1}\Big) = \frac32 - \frac1{q_1} - \max\Big(\frac12,\ \frac1{p_1} + \frac1{r}\Big). $$

Denote $\lambda = \min(\lambda_1, \lambda_2) = \frac32 - \frac1{q_1} - \max\big(\frac12,\ \frac1p + \frac1{r_1},\ \frac1{p_1} + \frac1r\big)$. Then

$$ \big\| K(t_4,\cdot) - K(t_3,\cdot) \big\|_2 \le \big\| a_1 \big(b_1\mathbf{1}_{(t_3,t_4]} * \tilde c_1\big) \big\|_2 + \big\| a_2 \big(b_1\mathbf{1}_{(t_3,t_4]} * \tilde c\big) \big\|_2 \le C (t_4 - t_3)^{\lambda}, $$

where $C = \|a_1\|_p \|b_1\|_{q_1} \|c_1\|_{(\max(1/r_1, \frac12 - 1/p))^{-1}} T^{\lambda_1 - \lambda} + \|a_2\|_{p_1} \|b_1\|_{q_1} \|c\|_{(\max(1/r, \frac12 - 1/p_1))^{-1}} T^{\lambda_2 - \lambda}$. Finally,

$$ \mathsf{E}\big[(X_{t_4} - X_{t_3})^2\big] = \int_0^{t_4} \big(K(t_4,s) - K(t_3,s)\big)^2\,ds = \big\| K(t_4,\cdot) - K(t_3,\cdot) \big\|_2^2 \le C^2 (t_4 - t_3)^{2\lambda}. $$

By Corollary 1, the stochastic process $\{X_t, t \in [t_1+t_2, T]\}$ has a modification that satisfies the Hölder condition up to order $\lambda$. $\square$

The next result, Lemma 2, generalizes Lemma 1 and Theorem 2. It allows us to apply the mentioned lemma directly to the power functions $a(s) = s^{-1/p_0}$ and $c(s) = s^{-1/r_0}$.

Lemma 2 Let $p_0 \in (0,+\infty]$, $q_0 \in (1,+\infty]$, $r_0 \in (0,+\infty]$ with $1/p_0 + 1/q_0 + 1/r_0 < \frac32$. Also, for any $p \in (0,p_0)$ let $a \in L^{\max(2,p)}[0,T]$, for any $q \in [1,q_0)$ let $b \in L^q[0,T]$, and for any $r \in (0,r_0)$ let $c \in L^{\max(1,r)}[0,T]$. Then the stochastic process $X$ defined in (13.5) has a modification that satisfies the Hölder condition up to order $\lambda = \frac32 - 1/q_0 - \max\big(\frac12,\ 1/p_0 + 1/r_0\big)$.

Remark 5 In Remark 1 we applied Lemma 1 and obtained that the fBm with Hurst index $H > \frac12$ has a modification that satisfies the Hölder condition up to order $\frac12$. With Lemma 2, we can obtain the same result more easily: we just apply Lemma 2 with $p_0 = (H - \frac12)^{-1}$, $q_0 = \infty$ and $r_0 = (\frac32 - H)^{-1}$ and do not bother with $\varepsilon$.
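The exponent bookkeeping in Remarks 1, 2 and 5 is easy to get wrong, so a short script that checks the hypotheses of (K1) and Lemma 1, and evaluates the Hölder orders of Lemmas 1 and 2, for the fBm choices may be useful. The helper names below are ours, not the chapter's; everything is stated in terms of the reciprocals $1/p$, $1/q$, $1/r$.

```python
# Check the exponent conditions of (K1), Lemma 1 and Lemma 2 for the
# fBm choices of Remarks 1 and 5 (helper names are ours).

def fbm_exponents(H, eps):
    """Remark 1: 1/p = H - 1/2 + eps/3, 1/q = eps/3, 1/r = 3/2 - H + eps/3."""
    return H - 0.5 + eps / 3, eps / 3, 1.5 - H + eps / 3

def check_K1(ip, iq, ir):
    """(K1): p in [2, inf], q and r in [1, inf], 1/p + 1/q + 1/r <= 3/2."""
    return 0 <= ip <= 0.5 and 0 <= iq <= 1 and 0 <= ir <= 1 and ip + iq + ir <= 1.5

def lemma1_order(ip, iq, ir):
    """Lemma 1 Hölder order, valid when 1/p + 1/r >= 1/2 and the sum is < 3/2."""
    assert ip + ir >= 0.5 and ip + iq + ir < 1.5
    return 1.5 - ip - iq - ir

def lemma2_order(ip0, iq0, ir0):
    """Lemma 2: lambda = 3/2 - 1/q0 - max(1/2, 1/p0 + 1/r0)."""
    return 1.5 - iq0 - max(0.5, ip0 + ir0)

H, eps = 0.7, 0.05
ip, iq, ir = fbm_exponents(H, eps)
print(check_K1(ip, iq, ir))                  # True: both conditions of Theorem 1 hold
print(round(lemma1_order(ip, iq, ir), 3))    # 1/2 - eps = 0.45
print(lemma2_order(H - 0.5, 0.0, 1.5 - H))   # 0.5, i.e. Remark 5 without eps
```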




Proof Notice that $0 < \lambda \le 1$. Denote by $A = \big\{ m \in \mathbb{N} : m > \max\big(3,\ \lambda q_0/(q_0 - 1)\big) \big\}$ a set of "large enough" positive integers. Let $n \in A$, and let $p_n$, $q_n$ and $r_n$ be real numbers such that $1/p_n = \min\big(\frac12,\ 1/p_0 + \lambda/n\big)$, $1/q_n = 1/q_0 + \lambda/n$, and $1/r_n = \min\big(1,\ 1/r_0 + \lambda/n\big)$. Then $p_n \in [2,\infty)$, $q_n \in (1,\infty)$, $r_n \in [1,\infty)$, and $1/p_n + 1/q_n + 1/r_n < \frac32$. Apply Lemma 1 to the functions $a$, $b$, $c$ and exponents $p_n$, $q_n$ and $r_n$. By Lemma 1, the process $X$ has a modification $X^{(n)}$ that satisfies the Hölder condition up to order $\frac32 - 1/q_n - \max\big(\frac12,\ 1/p_n + 1/r_n\big) \ge (n-3)\lambda/n$. For different $n \in A$, the processes $X^{(n)}$ coincide almost surely on $[0,T]$. Let

$$ B = \big\{ \forall m \in A\ \forall n \in A\ \forall t \in [0,T] : X_t^{(m)} = X_t^{(n)} \big\} $$

be the random event on which all these processes coincide. Then $\mathsf{P}(B) = 1$, and $\widetilde X = X^{(k)}\mathbf{1}_B$ (where $k = \min A$ is the least element of the set $A$) is a modification of $X$ that satisfies the Hölder condition up to order $\lambda$. $\square$

Lemma 3 Let $a \in L^p[0,T]$, $b \in L^q[0,T]$, $c \in L^r[0,T]$, where the exponents satisfy $p \in [2,\infty]$, $q \in [1,\infty)$, $r \in [1,\infty]$, and $1/p + 1/q + 1/r \le \frac32$. Let there exist $\lambda > 0$ and $C \in \mathbb{R}$ such that

$$ \forall t \in (0,T] : \quad 0 < \big\| b\mathbf{1}_{[0,t]} \big\|_q \le C t^{\lambda}. $$

Then the stochastic process $\{X_t, t \in [0,T]\}$ has a modification which is continuous on $[0,T]$ and satisfies the Hölder condition at the point $0$ up to order $\lambda$.

Remark 6 For the fBm with Hurst index $H > \frac12$, apply Lemma 3 to the functions $a$, $b$ and $c$ defined in Remark 1, but with exponents $1/p = H - \frac12 + \frac{\varepsilon}{2}$, $1/q = \frac12 - \varepsilon$, and $1/r = \frac32 - H + \frac{\varepsilon}{2}$ for some $\varepsilon$ such that $0 < \varepsilon < \min\big\{2(1-H),\ \frac12,\ 2(H - \frac12)\big\}$. Let us verify the conditions of Lemma 3. We have $H - \frac12 < 1/p < \frac12$, $0 < 1/q < 1$, $\frac32 - H < 1/r < 1$ (whence $a \in L^p[0,T]$ and $c \in L^r[0,T]$; the relation $b \in L^q[0,T]$ holds true for all $q \ge 1$) and $1/p + 1/q + 1/r = \frac32$. Moreover, $\big\| b\mathbf{1}_{[0,t]} \big\|_q = C' t^{H - 1/2 + 1/q}$, where $C' = \big((H - \frac12)q + 1\big)^{-1/q}$. According to Lemma 3, the fBm satisfies the Hölder condition at the point $0$ up to order $H - \frac12 + 1/q = H - \varepsilon$. As this can be proved for any small enough $\varepsilon > 0$, the fBm satisfies the Hölder condition at the point $0$ up to order $H$.

Proof Without loss of generality we can assume that $1/q + 1/r \ge 1$. Indeed, under the original conditions of the lemma, let $r' = \min\big(r,\ q/(q-1)\big)$. Then $1 \le r' \le r$, $1/q + 1/r' \ge 1$, $1/p + 1/q + 1/r' \le \frac32$, and $c \in L^{r'}[0,T]$. The inequality $1/p + 1/q + 1/r' \le \frac32$ can be proved as follows:

$$ \frac1p + \frac1q + \frac1{r'} = \frac1p + \frac1q + \frac1r \le \frac32 \quad \text{if } r \le \frac{q}{q-1}; $$
$$ \frac1p + \frac1q + \frac1{r'} = \frac1p + \frac1q + \frac{q-1}{q} = \frac1p + 1 \le \frac12 + 1 = \frac32 \quad \text{if } r \ge \frac{q}{q-1}. $$

The other relations can be proved easily. Thus, after substituting $r'$ for $r$, all conditions of Lemma 3 still hold true, as well as $1/q + 1/r \ge 1$.

Denote $F(t) = \int_0^t |b(s)|^q\,ds + t^{\lambda q}$. Then $F : [0,T] \to [0,+\infty)$ is a strictly increasing function such that $F(0) = 0$,

$$ F(t) \le C_1 t^{\lambda q} \quad \text{if } 0 \le t \le T, \qquad \big\| b\mathbf{1}_{(t_1,t_2]} \big\|_q < \big(F(t_2) - F(t_1)\big)^{1/q} \quad \text{if } 0 \le t_1 < t_2 \le T. $$

Let $0 \le t_1 < t_2 \le T$. Again, denote $\tilde c(v) = c(-v)$. Let us construct an upper bound for $\| K(t_2,\cdot) - K(t_1,\cdot) \|_2 = \| a (b\mathbf{1}_{(t_1,t_2]} * \tilde c) \|_2$, see (13.9). By Young's convolution inequality (13.20),

$$ \big\| b\mathbf{1}_{(t_1,t_2]} * \tilde c \big\|_{(1/q+1/r-1)^{-1}} \le \big\| b\mathbf{1}_{(t_1,t_2]} \big\|_q \|\tilde c\|_r \le \big(F(t_2) - F(t_1)\big)^{1/q} \|c\|_r. $$

Here we used that $q \ge 1$, $r \ge 1$ and $1/r + 1/q \ge 1$. The function $a (b\mathbf{1}_{(t_1,t_2]} * \tilde c)$ is equal to $0$ outside the interval $[0,t_2]$. Noticing that $2 \le (1/p + 1/q + 1/r - 1)^{-1}$, using the inequality (13.22) for the norms in $L^2[0,t_2]$ and $L^{(1/p+1/q+1/r-1)^{-1}}[0,t_2]$ and the Hölder inequality for non-conjugate exponents (13.21), we get

$$ \big\| a (b\mathbf{1}_{(t_1,t_2]} * \tilde c) \big\|_2 \le \big\| a (b\mathbf{1}_{(t_1,t_2]} * \tilde c) \big\|_{(1/p+1/q+1/r-1)^{-1}}\, t_2^{3/2-1/p-1/q-1/r} \le \|a\|_p \big\| b\mathbf{1}_{(t_1,t_2]} * \tilde c \big\|_{(1/q+1/r-1)^{-1}}\, t_2^{3/2-1/p-1/q-1/r} \le \|a\|_p \big(F(t_2) - F(t_1)\big)^{1/q} \|c\|_r\, t_2^{3/2-1/p-1/q-1/r}. $$

Hence

$$ \mathsf{E}\big[(X_{t_2} - X_{t_1})^2\big] = \big\| K(t_2,\cdot) - K(t_1,\cdot) \big\|_2^2 = \big\| a (b\mathbf{1}_{(t_1,t_2]} * \tilde c) \big\|_2^2 \le \|a\|_p^2 \big(F(t_2) - F(t_1)\big)^{2/q} \|c\|_r^2\, t_2^{3 - 2(1/p+1/q+1/r)}. $$

Consider the stochastic process $Y = \{Y_s : s \in [0, F(T)]\}$ with $Y_{F(t)} = X_t$ for all $t \in [0,T]$. This process $Y$ satisfies the inequality

$$ \mathsf{E}\big[(Y_{s_2} - Y_{s_1})^2\big] \le \|a\|_p^2 (s_2 - s_1)^{2/q} \|c\|_r^2\, T^{3 - 2(1/p+1/q+1/r)} $$

whenever $0 \le s_1 < s_2 \le F(T)$. By Corollary 1, the process $Y$ has a modification $\widetilde Y$ that satisfies the Hölder condition up to order $1/q$. Therefore, for any $\lambda_1 \in (0,\lambda)$,

$$ \exists C_2\ \forall s_1 \in [0,F(T)]\ \forall s_2 \in [0,F(T)] : \big|\widetilde Y_{s_2} - \widetilde Y_{s_1}\big| \le C_2 |s_2 - s_1|^{\lambda_1/(\lambda q)}, $$

where $C_2$ is a random variable, $C_2 < \infty$ almost surely. In particular,

$$ \exists C_2\ \forall s \in [0,F(T)] : \big|\widetilde Y_s - \widetilde Y_0\big| \le C_2 s^{\lambda_1/(\lambda q)}. $$

The stochastic process $\widetilde X = \{\widetilde X_t, t \in [0,T]\} = \{\widetilde Y_{F(t)}, t \in [0,T]\}$ is a modification of the stochastic process $X$. It satisfies the inequalities

$$ \exists C_2\ \forall t \in [0,T] : \big|\widetilde X_t - \widetilde X_0\big| \le C_2 F(t)^{\lambda_1/(\lambda q)}; \qquad \exists C_3\ \forall t \in [0,T] : \big|\widetilde X_t - \widetilde X_0\big| \le C_3 t^{\lambda_1}. $$

Thus, all paths of the stochastic process $\widetilde X$ satisfy the Hölder condition at the point $0$ with exponent $\lambda_1$. $\square$
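As a numerical sanity check of the $L^2$-increment estimates above, one can compute $\|K(t_2,\cdot) - K(t_1,\cdot)\|_2^2$ directly for the fBm kernel and compare it with the exact fBm value $\mathsf{E}\big(B^H_{t_2} - B^H_{t_1}\big)^2 = (t_2 - t_1)^{2H}$. The normalization $c_H^2 = H(2H-1)/\mathrm{B}(2-2H, H-\frac12)$ is a standard convention assumed here (not stated in the chapter), and the quadrature substitutions are ours.

```python
import math

H = 0.7
ALPHA = H - 0.5

def inner(s, a, b, n=600):
    # Integral of u^{H-1/2} (u-s)^{H-3/2} over u in [a, b], s <= a < b,
    # via the substitution u = s + w^{1/ALPHA}
    w_lo = (a - s) ** ALPHA
    w_hi = (b - s) ** ALPHA
    step = (w_hi - w_lo) / n
    acc = 0.0
    for i in range(n):
        w = w_lo + (i + 0.5) * step
        acc += (s + w ** (1.0 / ALPHA)) ** (H - 0.5)
    return acc * step / ALPHA

beta_fn = math.gamma(2 - 2 * H) * math.gamma(H - 0.5) / math.gamma(1.5 - H)
cH = math.sqrt(H * (2 * H - 1) / beta_fn)   # assumed normalization

def delta_K(s, t1, t2):
    # K(t2, s) - K(t1, s): the integral of b(u) c(u-s) runs over [t1, t2]
    # for s <= t1, and over [s, t2] for t1 < s <= t2 (where K(t1, s) = 0)
    lo = t1 if s <= t1 else s
    return cH * s ** (0.5 - H) * inner(s, lo, t2)

def increment_variance(t1, t2, n=600):
    # Integral of (K(t2,s) - K(t1,s))^2 over s in [0, t2], midpoint rule
    step = t2 / n
    acc = 0.0
    for i in range(n):
        s = (i + 0.5) * step
        acc += delta_K(s, t1, t2) ** 2
    return acc * step

t1, t2 = 0.5, 0.6
print(increment_variance(t1, t2))       # close to (t2 - t1)^{2H} = 0.1^{1.4}
```

For interior points $t_1, t_2$ separated from zero, this exact $(t_2 - t_1)^{2H}$ scaling is precisely the order-$H$ Hölder behavior that Theorem 3 recovers on $[t_0, T]$.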

13.3 Gaussian Volterra Processes with Sonine Kernels

13.3.1 Fractional Brownian Motion and Sonine Kernels

Consider now a natural question: for which kernels $K$ of the form (13.4) does the Gaussian process of the form (13.5) with Volterra kernel $K$ generate the same filtration as the Wiener process $W$? A sufficient condition for this is a representation of the Wiener process $W$ as

$$ W_t = \int_0^t L(t,s)\,dX_s, \qquad (13.11) $$

where $L \in L^2([0,T]^2)$ is a Volterra kernel and the integral is well defined in some sense. As an example, let us consider fractional Brownian motion $B^H$, $H > 1/2$, admitting a representation (13.1) with Volterra kernel (13.3). For any $0 < \varepsilon < 1$ consider the approximation

$$ B_t^{H,\varepsilon} = d_H \int_0^t s^{1/2-H} \left( \int_s^t u^{H-1/2} (u - \varepsilon s)^{H-3/2}\,du \right) dW_s, \quad t \ge 0. $$

Unlike the original process, in this approximation we can change the order of integration and get that

$$ B_t^{H,\varepsilon} = d_H \int_0^t u^{H-1/2} \left( \int_0^u s^{1/2-H} (u - \varepsilon s)^{H-3/2}\,dW_s \right) du. $$

This representation allows us to write the equality

$$ \int_0^t u^{1/2-H}\,dB_u^{H,\varepsilon} = d_H \int_0^t \left( \int_0^u s^{1/2-H} (u - \varepsilon s)^{H-3/2}\,dW_s \right) du, \qquad (13.12) $$

and it follows immediately from (13.12) that


$$ \int_0^t (t-u)^{1/2-H} u^{1/2-H}\,dB_u^{H,\varepsilon} = d_H \int_0^t (t-u)^{1/2-H} \left( \int_0^u s^{1/2-H} (u - \varepsilon s)^{H-3/2}\,dW_s \right) du = d_H \int_0^t s^{1/2-H} \left( \int_s^t (t-u)^{1/2-H} (u - \varepsilon s)^{H-3/2}\,du \right) dW_s. \qquad (13.13) $$

Applying Theorem 3.3 from [2], p. 160, we can pass to the limit in (13.13) and get that

$$ \int_0^t (t-u)^{1/2-H} u^{1/2-H}\,dB_u^{H} = d_H \int_0^t s^{1/2-H} \left( \int_s^t (t-u)^{1/2-H} (u-s)^{H-3/2}\,du \right) dW_s. $$

Now the highlight is that the integral $\int_s^t (t-u)^{1/2-H} (u-s)^{H-3/2}\,du$ is a constant:

$$ \int_s^t (t-u)^{1/2-H} (u-s)^{H-3/2}\,du = \int_0^{t-s} (t-s-u)^{1/2-H} u^{H-3/2}\,du = \mathrm{B}\big(3/2 - H,\ H - 1/2\big), $$

where $\mathrm{B}$ is the beta function. Once this is noticed, everything is simple:

$$ Y_t := \int_0^t (t-u)^{1/2-H} u^{1/2-H}\,dB_u^{H} = d_H\, \mathrm{B}\big(3/2 - H,\ H - 1/2\big) \int_0^t s^{1/2-H}\,dW_s, $$

and finally we get that $W_t = e_H \int_0^t s^{H-1/2}\,dY_s$ with some constant $e_H$. It means that we have the representation (13.11) and, in particular, that $W$ and $B^H$ generate the same filtration. Of course, these transformations can be performed much faster, but our goal here was to draw attention to the role of the property that the convolution of two functions is a constant. This property is a characterization of Sonine kernels.
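The constancy of $\int_s^t (t-u)^{1/2-H}(u-s)^{H-3/2}\,du$ is easy to confirm numerically. The sketch below substitutes $u = s + (t-s)x$, which reduces the integral to $\int_0^1 x^{H-3/2}(1-x)^{1/2-H}\,dx = \mathrm{B}(H-\frac12, \frac32-H)$ independently of $s$ and $t$, and evaluates it by taming the endpoint singularities with power substitutions (our choice of quadrature, not the chapter's).

```python
import math

H = 0.7
a = H - 0.5       # x^{a-1} singularity at 0
b = 1.5 - H       # (1-x)^{b-1} singularity at 1

def beta_numeric(a, b, n=4000):
    """Integral of x^{a-1} (1-x)^{b-1} over [0, 1], with the endpoint
    singularities removed by x = y^{1/a} on [0, 1/2] and 1-x = z^{1/b}
    on [1/2, 1]."""
    acc = 0.0
    step = 0.5 ** a / n
    for i in range(n):
        y = (i + 0.5) * step
        acc += (1.0 - y ** (1.0 / a)) ** (b - 1.0) * step / a
    step = 0.5 ** b / n
    for i in range(n):
        z = (i + 0.5) * step
        acc += (1.0 - z ** (1.0 / b)) ** (a - 1.0) * step / b
    return acc

# Since a + b = 1 here, B(a, b) = Gamma(a) Gamma(b) = pi / sin(pi a)
print(beta_numeric(a, b))
print(math.pi / math.sin(math.pi * a))
```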

13.3.2 A General Approach to Volterra Processes with Sonine Kernels

First we give basic information about Sonine kernels; more details can be found in [12]. We also consider, in a simplified form, the related generalized fractional calculus introduced in [6].


Definition 1 A function $c \in L^1[0,T]$ is called a Sonine kernel if there exists a function $h \in L^1[0,T]$ such that

$$ \int_0^t c(s) h(t-s)\,ds = 1, \quad t \in (0,T]. \qquad (13.14) $$

The functions $c, h$ are called a Sonine pair; equivalently, we say that $c$ and $h$ form (or create) a Sonine pair.

If $\hat c$ and $\hat h$ denote the Laplace transforms of $c$ and $h$ respectively, then (13.14) is equivalent to $\hat c(\lambda) \hat h(\lambda) = \lambda^{-1}$, $\lambda > 0$. Since the Laplace transform characterizes a function uniquely, for any $c$ there can be no more than one function $h$ satisfying (13.14). Examples of Sonine pairs are given in Sect. 13.4.

Let the functions $c$ and $h$ form a Sonine pair. For a function $f \in L^1[0,T]$ consider the operator $I_{0+}^c f(t) = \int_0^t c(t-s) f(s)\,ds$. It is an analogue of the forward fractional integration operator. Let us identify an inverse operator. In order to do this, for $g \in AC[0,T]$ define $D_{0+}^h g(t) = \int_0^t h(t-s) g'(s)\,ds + h(t) g(0)$.

Note that

$$ \int_0^t D_{0+}^h g(u)\,du = \int_0^t \left( \int_0^u h(u-s) g'(s)\,ds + h(u) g(0) \right) du = \int_0^t \int_0^u h(s) g'(u-s)\,ds\,du + g(0)\int_0^t h(u)\,du $$
$$ = \int_0^t h(s) \left( \int_s^t g'(u-s)\,du \right) ds + g(0)\int_0^t h(u)\,du = \int_0^t h(s) \big( g(t-s) - g(0) \big)\,ds + g(0)\int_0^t h(u)\,du = \int_0^t h(s) g(t-s)\,ds, $$

so we can also write

$$ D_{0+}^h g(t) = \frac{d}{dt}\int_0^t h(s) g(t-s)\,ds = \frac{d}{dt}\int_0^t h(t-s) g(s)\,ds, \qquad (13.15) $$

where the derivative is understood in the weak sense. Similarly, we can define an analogue of the backward fractional integral, $I_{T-}^c f(s) = \int_s^T c(t-s) f(t)\,dt$, $f \in L^1[0,T]$, and the corresponding differentiation operator

$$ D_{T-}^h g(s) = g(T) h(T-s) - \int_s^T h(t-s) g'(t)\,dt. $$


Lemma 4 Let $g \in AC[0,T]$. Then $I_{0+}^c \big( D_{0+}^h g \big)(t) = g(t)$ and $I_{T-}^c \big( D_{T-}^h g \big)(s) = g(s)$.

Proof We have

$$ I_{0+}^c \big( D_{0+}^h g \big)(t) = \int_0^t c(t-s) \left( \int_0^s h(s-u) g'(u)\,du + h(s) g(0) \right) ds = \int_0^t \left( \int_u^t c(t-s) h(s-u)\,ds \right) g'(u)\,du + g(0) \int_0^t c(t-s) h(s)\,ds = \int_0^t g'(u)\,du + g(0) = g(t), $$

as required. Similarly,

$$ I_{T-}^c \big( D_{T-}^h g \big)(s) = \int_s^T c(t-s) \left( h(T-t) g(T) - \int_t^T h(u-t) g'(u)\,du \right) dt = g(T) \int_s^T c(t-s) h(T-t)\,dt - \int_s^T \left( \int_s^u c(t-s) h(u-t)\,dt \right) g'(u)\,du = g(T) - \int_s^T g'(u)\,du = g(s), $$

as required. $\square$
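Lemma 4 can be checked numerically. The sketch below uses the classical power pair $c(s) = s^{-\alpha}/\Gamma(1-\alpha)$, $h(s) = s^{\alpha-1}/\Gamma(\alpha)$ (normalized so that $c * h \equiv 1$) and the test function $g(t) = t^2 + 1$. The term $h(t)g(0)$ in $D_{0+}^h g$ contributes exactly $g(0)$ after applying $I_{0+}^c$, by (13.14) itself, so the code verifies numerically that the remaining part reproduces $g(t) - g(0) = t^2$. The quadrature substitutions and parameter choices are ours.

```python
import math

ALPHA = 0.4
GA = math.gamma(ALPHA)
G1A = math.gamma(1.0 - ALPHA)

def g_prime(s):
    return 2.0 * s          # g(t) = t^2 + 1

def dh_int_part(u, n=300):
    """Integral of h(u-s) g'(s) over [0, u], h(x) = x^{ALPHA-1}/Gamma(ALPHA).
    Substituting u - s = w^{1/ALPHA} removes the singularity at s = u."""
    w_max = u ** ALPHA
    step = w_max / n
    acc = 0.0
    for i in range(n):
        w = (i + 0.5) * step
        acc += g_prime(u - w ** (1.0 / ALPHA))
    return acc * step / (ALPHA * GA)

def ic_of(func, t, n=300):
    """Integral of c(t-u) func(u) over [0, t], c(x) = x^{-ALPHA}/Gamma(1-ALPHA).
    Substituting t - u = v^{1/(1-ALPHA)} removes the singularity at u = t."""
    v_max = t ** (1.0 - ALPHA)
    step = v_max / n
    acc = 0.0
    for i in range(n):
        v = (i + 0.5) * step
        acc += func(t - v ** (1.0 / (1.0 - ALPHA)))
    return acc * step / ((1.0 - ALPHA) * G1A)

t = 0.8
# I^c applied to the integral part of D^h g; the h(u) g(0) part of D^h g
# contributes exactly g(0) = 1 after I^c, by the Sonine identity (13.14).
recovered = ic_of(dh_int_part, t) + 1.0
print(recovered)    # should be close to g(t) = t^2 + 1 = 1.64
```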

Now consider a Gaussian process $X$ given by the integral transformation of type (13.1) with a kernel of the form (13.4) satisfying condition (K1) of Theorem 1. Define the integral operator

$$ \mathsf{K} f(t) = \int_0^t \left( a(s) \int_s^t b(u) c(u-s)\,du \right) f(s)\,ds. $$

Note that for $f \in L^2[0,T]$, $\mathsf{K} f \in AC[0,T]$. Indeed, by definition,

$$ \mathsf{K} f(t) = \int_0^t K(t,s) f(s)\,ds = \int_0^t \int_s^t \frac{\partial}{\partial u} K(u,s)\,du\; f(s)\,ds. $$

Since $f$ and $\frac{\partial}{\partial t} K(t,s)$ are square integrable, the product $f \frac{\partial}{\partial u} K$ is integrable on $\{(s,u) : 0 \le s \le u \le t\}$. Therefore, we can apply the Fubini theorem to get $\mathsf{K} f(t) = \int_0^t \int_0^u \frac{\partial}{\partial u} K(u,s) f(s)\,ds\,du = \int_0^t \alpha(u)\,du$, where $\alpha \in L^1[0,t]$ for all $t \in [0,T]$, so $\alpha \in L^1[0,T]$. Consequently, for $f \in L^2[0,T]$ we can denote by

$$ J f(t) = \int_0^t \frac{\partial}{\partial t} K(t,s) f(s)\,ds $$

the weak derivative of $\mathsf{K} f$. Further, for a measurable $g : [0,T] \to \mathbb{R}$ such that

$$ \|g\|_{\mathcal{H}_X}^2 := \int_0^T \left( \int_s^T \frac{\partial}{\partial u} K(u,s) g(u)\,du \right)^2 ds < \infty, $$

define the integral operator $J^* g(s) = \int_s^T \frac{\partial}{\partial u} K(u,s) g(u)\,du$. It can be extended to the completion $\mathcal{H}_X$ of the set of measurable functions with finite norm $\|\cdot\|_{\mathcal{H}_X}^2$, so that $\|g\|_{\mathcal{H}_X}^2 = \int_0^T \big( J^* g(t) \big)^2\,dt$, $g \in \mathcal{H}_X$. The operator $J^*$ is related to the adjoint $\mathsf{K}^*$ of $\mathsf{K}$ in the following way: for a finite signed measure $\mu$ on $[0,T]$, $\mathsf{K}^* \mu = J^* h$ with $h(t) = \mu([t,T])$.

We are going to identify inverses to the operators $J$ and $J^*$. Clearly, this is not possible in general, so we will assume that

(S) the function $c$ forms a Sonine pair with some $h \in L^1[0,T]$.

In this case the operators $J$ and $J^*$ can be written in terms of the "fractional" operators defined above:

$$ J f(t) = \int_0^t \frac{\partial}{\partial t} K(t,s) f(s)\,ds = b(t) \int_0^t a(s)\, c(t-s)\, f(s)\,ds = b(t)\, I_{0+}^c (a f)(t), $$
$$ J^* g(s) = \int_s^T a(s)\, b(t)\, c(t-s)\, g(t)\,dt = a(s)\, I_{T-}^c (b g)(s). $$

In order for these operators to be injective, we assume

(K2) the functions $a$, $b$ are positive a.e. on $[0,T]$.

For $f$ such that $f b^{-1} \in AC[0,T]$, define

$$ L f(t) = a(t)^{-1} D_{0+}^h \big( f b^{-1} \big)(t) = a(t)^{-1} \left( \int_0^t h(t-s) \big( f b^{-1} \big)'(s)\,ds + h(t) \big( f b^{-1} \big)(0) \right), $$

and for $g$ such that $g a^{-1} \in AC[0,T]$, define

$$ L^* g(s) = b(s)^{-1} D_{T-}^h \big( g a^{-1} \big)(s) = b(s)^{-1} \left( h(T-s) \big( g a^{-1} \big)(T) - \int_s^T h(t-s) \big( g a^{-1} \big)'(t)\,dt \right). $$


Proposition 1 Assume that (S), (K1) and (K2) hold. Then the operators $J$ and $J^*$ are injective, and for functions $f, g$ such that $f b^{-1} \in AC[0,T]$ and $g a^{-1} \in AC[0,T]$,

$$ J L f(t) = f(t), \qquad J^* L^* g(s) = g(s). $$

Proof Assume that $J f = 0$ for some $f \in L^2[0,T]$. Since $J f = b\, I_{0+}^c(a f)$ and $b > 0$ a.e. by (K2), it follows that $I_{0+}^c (a f) = 0$ a.e. on $[0,T]$. Therefore, for any $t \in [0,T]$,

$$ 0 = \int_0^t h(t-s)\, I_{0+}^c (a f)(s)\,ds = \int_0^t h(t-s) \int_0^s c(s-u)\, a(u) f(u)\,du\,ds = \int_0^t \left( \int_u^t h(t-s) c(s-u)\,ds \right) a(u) f(u)\,du = \int_0^t a(u) f(u)\,du, $$

whence $a f = 0$ a.e. on $[0,T]$, so, using (K2) once more, $f = 0$ a.e. The injectivity of $J^*$ is shown similarly, and the second statement follows from Lemma 4. $\square$



of X . We

(K3) a −1 ∈ C 1 [0, T ], d := b−1 ∈ C 2 [0, T ] and either d(0) = d (0) = 0 or a −2 h ∈ C 1 [0, T ]. Proposition 2 Let the assumptions (S), (K1) – (K3) hold, and f ∈ C 3 [0, T ] with f (0) = 0. Then for h = L ∗ L f , the measure μ([t, T ]) = h(t) is such that Rμ = f . Proof Thanks to (K3), f b−1 ∈ AC[0, T ] and ⎞ ⎛ t 



a(t)−1 L f (t) = a(t)−2 ⎝ h(t − s) f b−1 (s)ds + h(t) f b−1 (0)⎠ . (13.16) 0

Similarly to (13.15),

t



h(t − s) f b−1 (s)ds is absolutely continuous with

0

d dt

t



−1

h(t − s) f b 0

t (s)ds =



h(s) f b−1 (t − s)ds + h(t) f b−1 (0).

0

Then, thanks to 13.3.2, both summands in the right-hand side of (13.16) are absolutely continuous with bounded derivatives. So by Proposition 1, K ∗μ = J ∗h = J ∗L ∗L f = L f , J K ∗μ = J L f = f . Thus,

Rμ (t) = K K ∗ μ (t) =

t 0

required.

J K ∗ μ (s) ds =

t 0

f (s) ds = f (t)

as 


Now we recall the definition of the integral with respect to $X$ given by (13.5); for more details see [1]. Define $I_X(\mathbf{1}_{[0,t]}) = \int_0^T \mathbf{1}_{[0,t]}(s)\,dX_s = X_t$ and extend this by linearity to the set $\mathcal{S}$ of piecewise constant functions. Then, for any $g \in \mathcal{S}$, $\mathsf{E}\big[ I_X(g)^2 \big] = \|g\|_{\mathcal{H}_X}^2$. Therefore, $I_X$ can be extended to an isometry between $\mathcal{H}_X$ and a subspace of $L^2(\Omega)$. Moreover, for any $g \in \mathcal{H}_X$,

$$ \int_0^T g(t)\,dX_t = \int_0^T J^* g(t)\,dW_t. \qquad (13.17) $$

Proposition 3 Assume that (S), (K1)–(K3) hold, and let $X$ be given by (13.5). Then $W_t = \int_0^t k(t,s)\,dX_s$, where

$$ k(t,s) = p(t)\, b(s)^{-1} h(t-s) - b(s)^{-1} \int_s^t p'(v)\, h(v-s)\,dv, \qquad p = a^{-1}. $$

Proof Write $k(t,s) = k_1(t,s) - k_2(t,s)$, where $k_1(t,s) = p(t) b(s)^{-1} h(t-s)$ and $k_2(t,s) = b(s)^{-1} \int_s^t p'(v) h(v-s)\,dv$, and transform

$$ J^* \big( k_1(t,\cdot) \mathbf{1}_{[0,t]} \big)(s) = \int_s^T \frac{\partial}{\partial u} K(u,s)\; p(t)\, b(u)^{-1} h(t-u)\, \mathbf{1}_{[0,t]}(u)\,du = p(t) \int_s^t a(s)\, b(u)\, c(u-s)\, b(u)^{-1} h(t-u)\,du\; \mathbf{1}_{[0,t]}(s) $$
$$ = p(t)\, a(s) \int_s^t c(u-s)\, h(t-u)\,du\; \mathbf{1}_{[0,t]}(s) = p(t)\, a(s)\, \mathbf{1}_{[0,t]}(s). $$

Similarly,

$$ J^* \big( k_2(t,\cdot) \mathbf{1}_{[0,t]} \big)(s) = \int_s^t a(s)\, c(u-s) \int_u^t p'(v)\, h(v-u)\,dv\,du\; \mathbf{1}_{[0,t]}(s) = a(s) \int_s^t p'(v) \int_s^v c(u-s)\, h(v-u)\,du\,dv\; \mathbf{1}_{[0,t]}(s) $$
$$ = a(s) \int_s^t p'(v)\,dv\; \mathbf{1}_{[0,t]}(s) = a(s) \big( p(t) - p(s) \big) \mathbf{1}_{[0,t]}(s). $$

Consequently,

$$ J^* \big( k(t,\cdot) \mathbf{1}_{[0,t]} \big)(s) = p(t)\, a(s)\, \mathbf{1}_{[0,t]}(s) - a(s) \big( p(t) - p(s) \big) \mathbf{1}_{[0,t]}(s) = a(s)\, p(s)\, \mathbf{1}_{[0,t]}(s) = \mathbf{1}_{[0,t]}(s). $$

Therefore, thanks to (13.17),

$$ \int_0^T k(t,s)\,dX_s = \int_0^T J^* \big( k(t,\cdot) \mathbf{1}_{[0,t]} \big)(s)\,dW_s = \int_0^T \mathbf{1}_{[0,t]}(s)\,dW_s = W_t, $$

as required. $\square$

13.4 Examples of Sonine Kernels

Example 1 The functions $c(s) = s^{-\alpha}$ and $h(s) = s^{\alpha-1}$ with some $\alpha \in (0, 1/2)$ were considered in connection with fractional Brownian motion, see Sect. 13.3.1.

Example 2 For $\alpha \in (0,1)$ and $A \in \mathbb{R}$, let $\gamma = -\Gamma'(1)$ be the Euler–Mascheroni constant and $l = \gamma - A$. Then

$$ c(x) = \frac{x^{\alpha-1}}{\Gamma(\alpha)} \Big( \ln\frac1x + A \Big) \quad \text{and} \quad h(x) = \int_0^\infty \frac{x^{t-\alpha}\, e^{lt}}{\Gamma(1-\alpha+t)}\,dt $$

create a Sonine pair, see [12].

Example 3 This example was proposed by Sonine himself [13]: for $\nu \in (0,1)$,

$$ h(x) = x^{-\nu/2} J_{-\nu}(2\sqrt{x}), \qquad c(x) = x^{(\nu-1)/2} I_{\nu-1}(2\sqrt{x}), $$

where $J$ and $I$ are, respectively, the Bessel and modified Bessel functions of the first kind,

$$ J_\nu(y) = \frac{y^\nu}{2^\nu} \sum_{k=0}^\infty \frac{(-1)^k y^{2k}\, 2^{-2k}}{k!\,\Gamma(\nu+k+1)}, \qquad I_\nu(y) = \frac{y^\nu}{2^\nu} \sum_{k=0}^\infty \frac{y^{2k}\, 2^{-2k}}{k!\,\Gamma(\nu+k+1)}. $$

In particular, setting $\nu = 1/2$, we get the following Sonine pair:

$$ h(x) = \frac{\cos 2\sqrt{x}}{\sqrt{\pi x}}, \qquad c(x) = \frac{\cosh 2\sqrt{x}}{\sqrt{\pi x}}. \qquad (13.18) $$
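Both the power pair of Example 1 (after normalization by $\Gamma(\alpha)\Gamma(1-\alpha)$, so that the convolution is exactly $1$) and the trigonometric pair (13.18) can be checked against the defining identity (13.14) numerically. For the power pair the substitution $s = tx$ shows that the convolution does not depend on $t$; for (13.18) the substitution $s = t\sin^2\theta$ removes both square-root singularities. This is an illustration sketch of ours, not taken from the chapter.

```python
import math

def power_pair_convolution(alpha, n=4000):
    """(c * h)(t) for c(s) = s^{-alpha}/Gamma(1-alpha), h(s) = s^{alpha-1}/Gamma(alpha).
    With s = t x the t-dependence cancels, leaving the Beta integral of
    x^{-alpha} (1-x)^{alpha-1} over [0, 1], evaluated here with
    endpoint-taming power substitutions."""
    a, b = 1.0 - alpha, alpha            # x^{a-1} at 0, (1-x)^{b-1} at 1
    acc = 0.0
    step = 0.5 ** a / n
    for i in range(n):
        y = (i + 0.5) * step
        acc += (1.0 - y ** (1.0 / a)) ** (b - 1.0) * step / a
    step = 0.5 ** b / n
    for i in range(n):
        z = (i + 0.5) * step
        acc += (1.0 - z ** (1.0 / b)) ** (a - 1.0) * step / b
    return acc / (math.gamma(alpha) * math.gamma(1.0 - alpha))

def trig_pair_convolution(t, n=2000):
    """(h * c)(t) for the pair (13.18), via the substitution s = t sin^2(theta)."""
    step = (math.pi / 2) / n
    acc = 0.0
    for i in range(n):
        th = (i + 0.5) * step
        acc += math.cos(2 * math.sqrt(t) * math.sin(th)) * \
               math.cosh(2 * math.sqrt(t) * math.cos(th))
    return acc * step * 2.0 / math.pi

print(power_pair_convolution(0.4))   # close to 1.0, independent of t
print(trig_pair_convolution(1.0))    # close to 1.0
print(trig_pair_convolution(0.3))    # close to 1.0
```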

Remark 7 It is interesting that the construction of Sonine pairs allows one to obtain relations between special functions (see [9, Sect. 1.14]). Let $c(x) = x^{-1/2}\cosh(ax^{1/2})$ and let
$$h(x) = \int_0^x s^{\nu/2}J_\nu(as^{1/2})(x-s)^\gamma\,ds$$
be a fractional integral of $s^{\nu/2}J_\nu(as^{1/2})$, where $-1 < \nu < -\frac12$ and $\gamma + \nu = -\frac32$. If we denote by $F_y(\lambda)$ the Laplace transform of a function $y$ at the point $\lambda$, then the Laplace transforms of these functions equal

13 Gaussian Processes with Volterra Kernels

267

$$F_c(\lambda) = (\pi/\lambda)^{1/2}\exp(a^2/4\lambda),$$
$$F_h(\lambda) = \Gamma(\gamma+1)2^{-\nu}a^\nu\lambda^{-\nu-1}\exp(-a^2/4\lambda)\,\lambda^{-\gamma-1} = \Gamma(\gamma+1)2^{-\nu}a^\nu\lambda^{-1/2}\exp(-a^2/4\lambda), \quad \lambda > 0,$$
$$F_c(\lambda)F_h(\lambda) = \Gamma(\gamma+1)2^{-\nu}\sqrt{\pi}\,a^\nu\lambda^{-1},$$
whence their convolution equals $(c*h)_t = \Gamma(\gamma+1)2^{-\nu}\sqrt{\pi}\,a^\nu$, $t > 0$. Therefore $c(x)$ and $(\Gamma(\gamma+1)2^{-\nu}\sqrt{\pi}\,a^\nu)^{-1}h(x)$ create a Sonine pair. However, comparing with Example 3 with $a = 2$, and taking into account that the partner in a Sonine pair is unique, we get
$$\frac{\sqrt{\pi}}{\Gamma(\gamma+1)}\int_0^x s^{\nu/2}J_\nu(2s^{1/2})(x-s)^\gamma\,ds = \frac{\cos 2\sqrt{x}}{\sqrt{x}}.$$
Similarly, let
$$c(x) = \int_0^x t^{-1/2}\cosh(at^{1/2})(x-t)^\gamma\,dt, \qquad h(x) = x^{\nu/2}J_\nu(ax^{1/2})$$
with $\gamma \in (-1, -\frac12)$, $\nu \in (-1, 0)$, $\gamma + \nu = -\frac32$. Then $F_c(\lambda) = \pi^{1/2}\Gamma(\gamma+1)\lambda^{-\gamma-3/2}\exp(a^2/4\lambda)$ and $F_h(\lambda) = \frac{a^\nu}{2^\nu}\lambda^{-\nu-1}\exp(-a^2/4\lambda)$, whence $F_c(\lambda)F_h(\lambda) = \pi^{1/2}\Gamma(\gamma+1)\frac{a^\nu}{2^\nu}\lambda^{-1}$. If we put $a = 2$ and compare with (13.18), we get the following representation:
$$\pi^{-1/2}(\Gamma(\gamma+1))^{-1}\int_0^x t^{-1/2}\cosh(2t^{1/2})(x-t)^\gamma\,dt = x^{(-\nu-1)/2}I_{-\nu-1}(2\sqrt{x}).$$
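The last representation can be confirmed numerically. The following sketch (our code; `lhs` and `rhs` are our names) picks $\nu = \gamma = -3/4$, so that $\gamma + \nu = -3/2$, and compares both sides after the substitution $t = x\sin^2 a$, which leaves a single integrable endpoint singularity.

```python
# Numerical check of the representation
#   pi^{-1/2} Gamma(gamma+1)^{-1} int_0^x t^{-1/2} cosh(2 t^{1/2}) (x-t)^gamma dt
#     = x^{(-nu-1)/2} I_{-nu-1}(2 sqrt(x)),    gamma + nu = -3/2.
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as Gamma, iv

def lhs(x, g):
    # int_0^x t^{-1/2} cosh(2 sqrt(t)) (x-t)^g dt, via t = x sin^2(a)
    integrand = lambda a: 2.0 * x**(g + 0.5) * np.cos(a)**(2*g + 1) * np.cosh(2*np.sqrt(x)*np.sin(a))
    val, _ = quad(integrand, 0.0, np.pi/2)
    return val / (np.sqrt(np.pi) * Gamma(g + 1.0))

def rhs(x, nu):
    return x**((-nu - 1)/2) * iv(-nu - 1, 2*np.sqrt(x))

nu, g = -0.75, -0.75
for x in (0.25, 1.0, 2.0):
    print(lhs(x, g), rhs(x, nu))   # the two columns should agree
```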

Example 4 In creating new Sonine pairs, a natural idea is to consider $g(s) = e^{\beta s}s^{\alpha-1}$ with $\beta \in \mathbb{R}$ and to examine whether this function admits a Sonine partner. It turns out that the answer to this question is positive, but far from obvious and not simple. All preliminary results are contained in Sect. 13.5.4. Let
$$g(x) = \frac{\exp(\beta x)}{\Gamma(\alpha)x^{1-\alpha}}, \quad 0 < \alpha < 1,\ \beta < 0; \qquad y(x) = 1.$$
Then $h(x) = \alpha\beta\,{}_1F_1(\alpha+1;2;\beta x) < 0$, $x \in [0,T]$, and the conditions of Theorem 7 hold true. Equation (13.35) has a unique solution in $L^1[0,T]$ (more precisely, any two solutions are equal almost everywhere). The solution has a representative that is continuous and attains only positive values on the left-open interval $(0,T]$, and this representative is the Sonine partner of $g(s) = e^{\beta s}s^{\alpha-1}$.

13.5 Appendix

13.5.1 Inequalities for Norms of Convolutions and Products

Recall the notation $\|f\|_p$ for the norm of a function $f \in L^p(\mathbb{R})$, $p \in [1,\infty]$. The convolution of two measurable functions $f$ and $g$ is defined by the integral
$$(f*g)(t) = \int_{\mathbb{R}} f(s)g(t-s)\,ds. \tag{13.19}$$
Now we state an inequality for the norm of the convolution of two functions (Young's inequality). If $p \in [1,\infty]$, $q \in [1,\infty]$ and $1/p + 1/q \ge 1$, $f \in L^p(\mathbb{R})$, $g \in L^q(\mathbb{R})$, then the convolution $f*g$ is well-defined almost everywhere (that is, the integral in (13.19) converges absolutely for almost all $t \in \mathbb{R}$), $f*g \in L^{(1/p+1/q-1)^{-1}}(\mathbb{R})$, and
$$\|f*g\|_{(1/p+1/q-1)^{-1}} \le \|f\|_p\,\|g\|_q. \tag{13.20}$$
Next, we state an inequality for the norm of the product $(fg)(t) = f(t)g(t)$; we call it the Hölder inequality for non-conjugate exponents. If $p \in [1,\infty]$, $q \in [1,\infty]$, $1/p + 1/q \le 1$, $f \in L^p(\mathbb{R})$, $g \in L^q(\mathbb{R})$, then $fg \in L^{(1/p+1/q)^{-1}}(\mathbb{R})$ and
$$\|fg\|_{(1/p+1/q)^{-1}} \le \|f\|_p\,\|g\|_q. \tag{13.21}$$
Finally, we state an inequality relating the norms in $L^p[a,b]$ and $L^q[a,b]$. If $-\infty < a < b < \infty$, $1 \le p \le q \le \infty$, $f \in L^q(\mathbb{R})$ and $f(t) = 0$ for all $t \notin [a,b]$, then $f \in L^p(\mathbb{R})$ and
$$\|f\|_p \le (b-a)^{1/p-1/q}\,\|f\|_q. \tag{13.22}$$
Remark 8 The conditions for inequalities (13.21) and (13.22) are over-restrictive because of the restrictive notation $\|f\|_p$. This notation can be extended to all $p \in (0,\infty]$ and all measurable functions $f$; then the conditions for inequalities (13.21) and (13.22) may be relaxed. Inequality (13.20) is proved in [7, Theorem 4.2]; see item (2) in the remarks after that theorem and part (A) of its proof. If $p < \infty$ and $q < \infty$, then inequality (13.21) follows from the conventional Hölder inequality; if $p = \infty$ or $q = \infty$, then inequality (13.21) is trivial. Inequality (13.22) can be rewritten as $\|f\mathbf{1}_{[a,b]}\|_p \le \|\mathbf{1}_{[a,b]}\|_{(1/p-1/q)^{-1}}\,\|f\|_q$, and so follows from (13.21).
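As a quick sanity check, inequalities (13.20) and (13.22) can be tested on a grid. The sketch below (ours) uses $f = \mathbf{1}_{[0,1]}$ and $g(t) = e^{-t}$ truncated to $[0,3]$, with $p = 2$, $q = 1$ for (13.20); the discretization error is far smaller than the slack in the inequalities.

```python
# Grid-based sanity check of Young's inequality (13.20) with p = 2, q = 1
# (target exponent (1/p + 1/q - 1)^{-1} = 2) and of the embedding (13.22).
import numpy as np

dx = 1e-3
t = np.arange(0.0, 3.0, dx)
f = np.where(t <= 1.0, 1.0, 0.0)           # f = indicator of [0, 1]
g = np.exp(-t)                             # g supported on [0, 3] after truncation

conv = np.convolve(f, g)[: t.size] * dx    # (f * g) restricted to [0, 3]
lhs = np.sqrt(np.sum(conv**2) * dx)        # ||f * g||_2 (restriction only decreases it)
rhs = np.sqrt(np.sum(f**2) * dx) * np.sum(np.abs(g)) * dx   # ||f||_2 ||g||_1
print(lhs, rhs)                            # lhs <= rhs

# Embedding (13.22) on [a, b] = [0, 3] with p = 1, q = 2:
norm1 = np.sum(np.abs(g)) * dx
norm2 = np.sqrt(np.sum(g**2) * dx)
print(norm1, (3.0 - 0.0)**(1 - 1/2) * norm2)   # norm1 <= sqrt(3) * norm2
```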

13.5.2 Continuity of Trajectories and Hölder Condition

The Kolmogorov continuity theorem provides sufficient conditions for a stochastic process to have a continuous modification. The following theorem aggregates Theorems 2, 4 and 5 in [3].

Theorem 4 (Kolmogorov continuity theorem) Let $\{X_t, t \in [0,T]\}$ be a stochastic process. If there exist $K \ge 0$, $\alpha > 0$ and $\beta > 0$ such that
$$\mathsf{E}|X_t - X_s|^\alpha \le K|t-s|^{1+\beta} \quad\text{for all } 0 \le s \le t \le T,$$
then
1. the process $X$ has a continuous modification;
2. for every continuous modification of the process $X$, its trajectories almost surely satisfy the Hölder condition for every exponent $\gamma \in (0, \beta/\alpha)$;
3. there exists a modification of the process $X$ that satisfies the Hölder condition for all exponents $\gamma \in (0, \beta/\alpha)$ simultaneously.

This theorem can be applied to Gaussian processes.

Corollary 1 Let $\{X_t, t \in [0,T]\}$ be a centered Gaussian process. If there exist $K \ge 0$ and $\delta > 0$ such that $\mathsf{E}(X_t - X_s)^2 \le K|t-s|^\delta$ for all $0 \le s \le t \le T$, then the following holds true:
1. The process $X$ has a modification $\widetilde X$ that has continuous trajectories.
2. For every $\gamma$, $0 < \gamma < \frac12\delta$, the trajectories of $\widetilde X$ satisfy the $\gamma$-Hölder condition almost surely.
3. The process $X$ has a modification that satisfies the Hölder condition for all exponents $\gamma \in (0, \frac12\delta)$.

Proof Since $X_t - X_s$ is a centered Gaussian variable,
$$\mathsf{E}|X_t - X_s|^\alpha = \frac{2^{\alpha/2}}{\sqrt{\pi}}\,\Gamma\Bigl(\frac{\alpha+1}{2}\Bigr)\bigl(\mathsf{E}(X_t - X_s)^2\bigr)^{\alpha/2}.$$
The first statement of the corollary can be proved by applying the Kolmogorov continuity theorem with $\alpha > 2/\delta$ and $\beta = \frac12\alpha\delta - 1$. The second statement can be proved by applying the Kolmogorov continuity theorem with $\alpha > \frac{2}{\delta - 2\gamma}$ and $\beta = \frac12\alpha\delta - 1$. Consider the random event
$$A = \Bigl\{\forall\,\gamma \in \bigl(0, \tfrac12\delta\bigr):\ \widetilde X \text{ satisfies the } \gamma\text{-Hölder condition}\Bigr\} = \Bigl\{\forall\,n \in \mathbb{N}:\ \widetilde X \text{ satisfies the } \tfrac12\bigl(1 - \tfrac1n\bigr)\delta\text{-Hölder condition}\Bigr\}.$$
(The measurability of $A$ follows from the continuity of the process $\widetilde X$.) By the second statement of Corollary 1, $\mathsf{P}(A) = 1$. Thus $\{\widetilde X_t\mathbf{1}_A, t \in [0,T]\}$ is the desired modification, which satisfies the Hölder condition for all exponents $\gamma \in (0, \frac12\delta)$. □

Remark 9 1. Corollary 1 holds true even without the assumption that the Gaussian process $X$ is centered. 2. The first statement of Corollary 1 can also be proved with Fernique's continuity criterion [5].

Lemma 5 Let $\{X_t, t \in [0,T]\}$ be a centered Gaussian process. Suppose that there exist $\delta > 0$ and a nondecreasing continuous function $F : [0,T] \to \mathbb{R}$ such that
$$\mathsf{E}(X_t - X_s)^2 \le (F(t) - F(s))^\delta \quad\text{for all } 0 \le s \le t \le T. \tag{13.23}$$
Then
1. The process $X$ has a modification $\widetilde X$ that has continuous trajectories.
2. If the function $F$ satisfies the Lipschitz condition on an interval $[a,b] \subset [0,T]$, then for every $\gamma$, $0 < \gamma < \frac12\delta$, the process $X$ has a modification whose trajectories satisfy the $\gamma$-Hölder property on the interval $[a,b]$.

Proof Without loss of generality, we can assume that the function $F$ is strictly increasing. Indeed, if condition (13.23) holds true for a continuous nondecreasing function $F = F_1$, it also holds true for $F = F_2$ with $F_2(t) = F_1(t) + t$, where $F_2$ is a continuous strictly increasing function. With this additional assumption, the inverse function $F^{-1}$ is a one-to-one, strictly increasing continuous function $[F(0), F(T)] \to [0,T]$. Consider the stochastic process $\{Y_u, u \in [F(0), F(T)]\}$ with $Y_u = X_{F^{-1}(u)}$. The process $Y$ is centered and Gaussian; it satisfies
$$\mathsf{E}(Y_v - Y_u)^2 = \mathsf{E}\bigl(X_{F^{-1}(v)} - X_{F^{-1}(u)}\bigr)^2 \le \bigl(F(F^{-1}(v)) - F(F^{-1}(u))\bigr)^\delta = (v-u)^\delta$$
for all $F(0) \le u \le v \le F(T)$. According to Corollary 1, the process $Y$ has a modification $\widetilde Y$ with continuous trajectories. Then $\widetilde X_t = \widetilde Y_{F(t)}$ is a modification of the process $X$ with continuous trajectories.

The second statement of the lemma is a direct consequence of Corollary 1: if the function $F$ satisfies the Lipschitz condition with constant $L$ on the interval $[a,b]$, then
$$\mathsf{E}(X_t - X_s)^2 \le L^\delta(t-s)^\delta \quad\text{for all } a \le s \le t \le b,$$
which is the main condition of Corollary 1. □
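The Gaussian absolute-moment identity used in the proof of Corollary 1 is easy to check by quadrature; the short sketch below (ours) compares the integral defining $\mathsf{E}|Z|^\alpha$ with the closed form for several parameter choices.

```python
# Quadrature check of the identity: for Z ~ N(0, sigma^2),
#   E|Z|^alpha = 2^{alpha/2} Gamma((alpha+1)/2) / sqrt(pi) * sigma^alpha.
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as Gamma

def abs_moment(alpha, sigma):
    density = lambda z: np.exp(-z**2 / (2*sigma**2)) / (sigma*np.sqrt(2*np.pi))
    val, _ = quad(lambda z: z**alpha * density(z), 0.0, np.inf)
    return 2.0 * val   # integrand is even

for alpha, sigma in [(1.0, 1.0), (2.5, 0.7), (4.0, 1.3)]:
    closed_form = 2**(alpha/2) * Gamma((alpha+1)/2) / np.sqrt(np.pi) * sigma**alpha
    print(abs_moment(alpha, sigma), closed_form)
```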

13.5.3 Application of Fractional Calculus

The lower and upper Riemann–Liouville fractional integrals of a function $f \in L^1[a,b]$ are defined as follows:
$$(I_{a+}^\alpha f)(x) = \frac{1}{\Gamma(\alpha)}\int_a^x \frac{f(t)\,dt}{(x-t)^{1-\alpha}}, \qquad (I_{b-}^\alpha f)(x) = \frac{1}{\Gamma(\alpha)}\int_x^b \frac{f(t)\,dt}{(t-x)^{1-\alpha}}.$$
The integrals $(I_{a+}^\alpha f)(x)$ and $(I_{b-}^\alpha f)(x)$ are well-defined for almost all $x \in [a,b]$, and are integrable functions of $x$, that is, $I_{a+}^\alpha f \in L^1[a,b]$ and $I_{b-}^\alpha f \in L^1[a,b]$. Thus $I_{a+}^\alpha$ and $I_{b-}^\alpha$ may be considered as linear operators $L^1[a,b] \to L^1[a,b]$.

A reflection relation for the functions $g(x) = f(a+b-x)$ implies the following relation for their fractional integrals:
$$(I_{b-}^\alpha g)(x) = (I_{a+}^\alpha f)(a+b-x); \tag{13.24}$$
see [11, Chap. 1, Sect. 2.3]. The integration-by-parts formula is given, e.g., in [11, Chap. 1, Sect. 2.3].

Proposition 4 (integration-by-parts formula) Let $\alpha > 0$, $f \in L^p[a,b]$, $g \in L^q[a,b]$, $p \in [1,+\infty]$, $q \in [1,+\infty]$, with $\frac1p + \frac1q \le 1 + \alpha$ and $\max\bigl(1 + \alpha - \frac1p - \frac1q,\ \min\bigl(1 - \frac1p,\ 1 - \frac1q\bigr)\bigr) > 0$. Then
$$\int_a^b (I_{a+}^\alpha f)(t)\,g(t)\,dt = \int_a^b f(t)\,(I_{b-}^\alpha g)(t)\,dt.$$

Now we establish conditions for a function to be in the range of the fractional operator $I_{a+}^\alpha$, and we provide formulas for the preimage, which is called a fractional derivative. The following statements are modifications of Theorem 2.1 and the subsequent corollary in [11, Chapter 1]. The formulas for the fractional derivative are also provided in [8, Sect. 2.5].

Theorem 5 Let $0 < \alpha < 1$. Consider the integral equation
$$I_{a+}^\alpha f = g \tag{13.25}$$
with unknown function $f \in L^1[a,b]$ and known function (i.e., a parameter) $g \in L^1[a,b]$. Denote
$$h(x) = \begin{cases} (I_{a+}^{1-\alpha}g)(x) & \text{if } a < x \le b, \\ 0 & \text{if } x = a. \end{cases}$$
If $h \in AC[a,b]$, then Eq. (13.25) has a unique (up to equality almost everywhere on $[a,b]$) solution $f$, namely $f(x) = h'(x)$. Otherwise, if $h \notin AC[a,b]$, then Eq. (13.25) has no solutions in $L^1[a,b]$. If for some $x \in (a,b]$ the integral $(I_{a+}^{1-\alpha}g)(x)$ is not well-defined, then Eq. (13.25) does not have solutions in $L^1[a,b]$.

Corollary 2 Let $0 < \alpha < 1$. The integral equation (13.25) with unknown function $f \in L^1[a,b]$ and known function $g \in AC[a,b]$ has a unique solution. The solution is equal to
$$f(x) = (I_{a+}^{1-\alpha}(g'))(x) + \frac{g(a)}{\Gamma(1-\alpha)(x-a)^\alpha} = \frac{1}{\Gamma(1-\alpha)}\left(\int_a^x \frac{g'(t)\,dt}{(x-t)^\alpha} + \frac{g(a)}{(x-a)^\alpha}\right).$$
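Corollary 2 can be illustrated numerically. In the sketch below (our code), for $g(x) = x$ on $[0,1]$ and $\alpha = 1/2$ the corollary gives $f(x) = 2\sqrt{x}/\sqrt{\pi}$ (here $g(0) = 0$ and $g'(t) = 1$), and applying $I_{0+}^{1/2}$ to this $f$ recovers $g$.

```python
# Verify I_{0+}^{1/2} f = g for g(x) = x and its fractional derivative
# f(x) = 2 sqrt(x) / sqrt(pi), computed from the formula of Corollary 2.
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as Gamma

def I_alpha(func, alpha, x):
    # Riemann-Liouville fractional integral (I_{0+}^alpha func)(x)
    val, _ = quad(lambda t: func(t) * (x - t)**(alpha - 1), 0.0, x)
    return val / Gamma(alpha)

f = lambda t: 2*np.sqrt(t)/np.sqrt(np.pi)

for x in (0.25, 0.5, 1.0):
    print(I_alpha(f, 0.5, x), x)   # the two columns should agree
```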

13.5.4 Existence of the Solution to a Volterra Integral Equation Whose Integral Operator Is a Convolution with an Integrable Singularity at 0

Consider the Volterra integral equation of the first kind
$$\int_0^x f(t)\,g(x-t)\,dt = y(x), \qquad x \in (0,T], \tag{13.26}$$
with known (parameter) functions $g(x)$ and $y(x)$ and unknown function $f(x)$. Suppose that the function $g(x)$ is integrable on the interval $(0,T]$ but behaves asymptotically as a power function in a neighborhood of $0$: $g(x) \sim \frac{K}{x^{1-\alpha}}$, $x \to 0$, where $0 < \alpha < 1$. More specifically, assume that $g(x)$ admits a representation
$$g(x) = \frac{1}{\Gamma(\alpha)x^{1-\alpha}} + (I_{0+}^\alpha h)(x) = \frac{1}{\Gamma(\alpha)}\left(\frac{1}{x^{1-\alpha}} + \int_0^x \frac{h(t)\,dt}{(x-t)^{1-\alpha}}\right), \tag{13.27}$$
where $\Gamma(\alpha)$ is the gamma function, $I_{0+}^\alpha h$ is the lower Riemann–Liouville fractional integral of $h$, $(I_{0+}^\alpha h)(x) = \frac{1}{\Gamma(\alpha)}\int_0^x \frac{h(t)\,dt}{(x-t)^{1-\alpha}}$, and $h(x)$ is an absolutely continuous function.

The sufficient conditions for existence and uniqueness of the solution to an integral equation claimed in [8, Sect. 2.1–2] are not satisfied: the kernel of the integration operator in (13.26) is unbounded, and $y(0)$ might be nonzero. Instead, we use Remark 2 in [8, Sect. 2.1–2] and reduce the Volterra integral equation of the first kind to a Volterra integral equation of the second kind, similarly to what is done for regular functions $g(x)$; compare with [8, Sect. 2.3] for the case of regular $g(x)$. For the next theorem we keep in mind that if a function $f$ is a solution to (13.26), then every function that is equal to $f$ almost everywhere on $[0,T]$ is also a solution to (13.26).

Theorem 6 Let $y, h \in C^1[0,T]$ and let $g$ be defined by (13.27). Then Eq. (13.26) has a unique (up to equality almost everywhere) solution $f \in L^1[0,T]$. The solution is (more precisely, one of the almost-everywhere equal solutions is) continuous on the left-open interval $(0,T]$.

Proof Substitute (13.27) into (13.26):



$$\int_0^x f(t)\left(\frac{1}{\Gamma(\alpha)(x-t)^{1-\alpha}} + (I_{0+}^\alpha h)(x-t)\right)dt = y(x),$$
$$(I_{0+}^\alpha f)(x) + \int_0^x f(t)\,(I_{0+}^\alpha h)(x-t)\,dt = y(x).$$
Denote $h_x(t) = h(x-t)$. According to Eq. (13.24), the fractional integrals of $h$ and $h_x$ satisfy the relation $(I_{0+}^\alpha h)(x-t) = (I_{x-}^\alpha h_x)(t)$. Hence, Eq. (13.26) is equivalent to the following one:
$$(I_{0+}^\alpha f)(x) + \int_0^x f(t)\,(I_{x-}^\alpha h_x)(t)\,dt = y(x). \tag{13.28}$$
Now apply the integration-by-parts formula. We have $f \in L^1[0,x]$, $h_x \in L^\infty[0,x]$, and $1 + 0 < 1 + \alpha$. Hence, by Proposition 4,
$$\int_0^x f(t)\,(I_{x-}^\alpha h_x)(t)\,dt = \int_0^x (I_{0+}^\alpha f)(t)\,h_x(t)\,dt.$$
This means that Eq. (13.28) is equivalent to the following ones:
$$(I_{0+}^\alpha f)(x) + \int_0^x (I_{0+}^\alpha f)(t)\,h_x(t)\,dt = y(x),$$
$$(I_{0+}^\alpha f)(x) + \int_0^x (I_{0+}^\alpha f)(t)\,h(x-t)\,dt = y(x).$$
Denote $F = I_{0+}^\alpha f$, and obtain a Volterra integral equation of the second kind:
$$F(x) = y(x) - \int_0^x F(t)\,h(x-t)\,dt. \tag{13.29}$$
Equation (13.29) has a unique solution in $C[0,T]$, as well as in $L^1[0,T]$. In other words, (13.29) has a unique integrable solution, and this solution is a continuous function. According to Theorem 5, either a unique (up to almost-everywhere equality) function $f$, or no function $f$ at all, corresponds to the function $F$. Thus, all integrable solutions to the integral equation (13.26) are equal almost everywhere.

Now we construct a solution to Eq. (13.26) that is continuous and integrable on $(0,T]$. Differentiating (13.29), we obtain
$$F'(x) = y'(x) - F(x)h(0) - \int_0^x F(t)\,h'(x-t)\,dt,$$
whence $F \in C^1[0,T]$. According to Corollary 2, the integral equation $F = I_{0+}^\alpha f$ has a unique solution $f \in L^1[0,T]$, which is equal to
$$f(x) = \frac{1}{\Gamma(1-\alpha)}\left(\int_0^x \frac{F'(t)\,dt}{(x-t)^\alpha} + \frac{F(0)}{x^\alpha}\right). \tag{13.30}$$
The constructed function $f(x)$ is continuous and integrable on $(0,T]$, and $f(x)$ is a solution to (13.26). □


Remark 10 In Theorem 6 the condition h ∈ C 1 [0, T ] can be relaxed and replaced with the condition h ∈ AC[0, T ]. In other words, if the function h is absolutely continuous but is not continuously differentiable, the statement of Theorem 6 still holds true.
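The reduction to the second-kind equation (13.29) is convenient numerically as well. Below is a sketch (our code; the solver name and toy data are ours) of a trapezoidal product-quadrature scheme for (13.29); for the toy data $y \equiv 1$, $h \equiv -1$ the exact solution is $F(x) = e^x$, which the scheme reproduces.

```python
import numpy as np

def solve_volterra2(y, h, T, n):
    # Trapezoidal scheme for F(x) = y(x) - int_0^x F(t) h(x - t) dt on [0, T].
    x = np.linspace(0.0, T, n + 1)
    dx = T / n
    F = np.empty(n + 1)
    F[0] = y(x[0])
    hv = h(x)                       # h evaluated on the grid
    for i in range(1, n + 1):
        # trapezoid weights: w_0 = w_i = 1/2, interior weights = 1
        s = 0.5*F[0]*hv[i] + np.dot(F[1:i], hv[i-1:0:-1])
        F[i] = (y(x[i]) - dx*s) / (1.0 + 0.5*dx*hv[0])
    return x, F

# Toy data: y = 1, h = -1  =>  F(x) = 1 + int_0^x F(t) dt, i.e. F(x) = e^x.
x, F = solve_volterra2(lambda x: 1.0, lambda x: -np.ones_like(x), 1.0, 2000)
print(np.max(np.abs(F - np.exp(x))))    # small discretization error
```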

13.5.4.1 Example: $g(x) = \exp(\beta x)x^{\alpha-1}/\Gamma(\alpha)$ and $y(x) = 1$

It is well known that
$$\int_0^x \frac{1}{\Gamma(1-\alpha)t^\alpha}\cdot\frac{1}{\Gamma(\alpha)(x-t)^{1-\alpha}}\,dt = 1. \tag{13.31}$$
In this section, we prove that the equation
$$\int_0^x f(t)\,\frac{e^{(x-t)\beta}}{\Gamma(\alpha)(x-t)^{1-\alpha}}\,dt = 1 \tag{13.32}$$
has an integrable solution. According to (13.31), $f(x) = x^{-\alpha}/\Gamma(1-\alpha)$ is a solution to (13.32) if $\beta = 0$. Denote
$$g(x) = \frac{\exp(\beta x)}{\Gamma(\alpha)x^{1-\alpha}}. \tag{13.33}$$
Let us demonstrate that $g(x)$ admits a representation (13.27). To construct $h$, we need the Kummer confluent hypergeometric function [14]:
$${}_1F_1(a;b;z) = \frac{1}{B(a,b-a)}\int_0^1 e^{zt}t^{a-1}(1-t)^{b-a-1}\,dt, \qquad 0 < a < b,\ z \in \mathbb{C}.$$
For $a$ and $b$ fixed, ${}_1F_1(a;b;\,\cdot\,)$ is an entire function. Its derivative equals
$$\frac{\partial}{\partial z}\,{}_1F_1(a;b;z) = \frac{a}{b}\,{}_1F_1(a+1;b+1;z).$$
For all $0 < a < b$ and $z \in \mathbb{R}$, ${}_1F_1(a;b;z) > 0$ and ${}_1F_1(a;b;0) = 1$. Notice that if $0 < \alpha < 1$ and $x > 0$, then
$$\frac{1}{B(\alpha,1-\alpha)}\int_0^x \frac{\exp(zt)\,dt}{t^{1-\alpha}(x-t)^\alpha} = {}_1F_1(\alpha;1;xz).$$
Considering (13.27) as an equation for the unknown $h$, it is equivalent to $I_{0+}^\alpha h = g_0$, where $g_0(x) = g(x) - \frac{1}{\Gamma(\alpha)x^{1-\alpha}} = \frac{e^{\beta x}-1}{\Gamma(\alpha)x^{1-\alpha}}$. Then

$$(I_{0+}^{1-\alpha}g_0)(x) = \frac{1}{B(\alpha,1-\alpha)}\int_0^x \frac{e^{\beta t}-1}{t^{1-\alpha}(x-t)^\alpha}\,dt = {}_1F_1(\alpha;1;\beta x) - 1.$$
Besides, ${}_1F_1(\alpha;1;\beta x) - 1$ is an absolutely continuous function of $x$, and ${}_1F_1(\alpha;1;\beta x) - 1 = 0$ if $x = 0$. According to Theorem 5, the equation $I_{0+}^\alpha h = g_0$ has a unique solution $h \in L^1[0,T]$, which is equal to
$$h(x) = \frac{\partial\bigl({}_1F_1(\alpha;1;\beta x) - 1\bigr)}{\partial x} = \alpha\beta\,{}_1F_1(\alpha+1;2;\beta x). \tag{13.34}$$
The constructed function $h(x)$ is a solution to (13.27) and is continuously differentiable. In summary, $h \in C^1[0,T]$, $y(x) = 1$, and $y \in C^1[0,T]$. According to Theorem 6, the integral equation
$$\int_0^x f(t)\,g(x-t)\,dt = 1, \qquad x \in (0,T], \tag{13.35}$$
has a unique solution $f \in L^1[0,T]$ (up to equality almost everywhere). The solution is continuous on $(0,T]$.

Remark 11 The fact that the functions $g$ and $h$ defined in (13.33) and (13.34), respectively, satisfy (13.27) can be checked directly. For such a verification, one can apply Lemma 2.2(i) from [10].
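A direct verification in the spirit of Remark 11 can also be done numerically. The sketch below (our code; `frac_part` is our name) checks that $g$ from (13.33) and $h$ from (13.34) satisfy (13.27), i.e. $x^{\alpha-1}e^{\beta x} = x^{\alpha-1} + \int_0^x h(t)(x-t)^{\alpha-1}\,dt$; the substitution $u = x\,w^{1/\alpha}$ (with $u = x - t$) makes the integrand smooth.

```python
# Check representation (13.27) for g(x) = e^{beta x} x^{alpha-1} / Gamma(alpha)
# with h(t) = alpha*beta*1F1(alpha+1; 2; beta*t).
import numpy as np
from scipy.integrate import quad
from scipy.special import hyp1f1

alpha, beta = 0.6, -1.5
h = lambda t: alpha * beta * hyp1f1(alpha + 1, 2, beta * t)

def frac_part(x):
    # int_0^x h(t)(x-t)^{alpha-1} dt = (x^alpha/alpha) int_0^1 h(x(1 - w^{1/alpha})) dw
    val, _ = quad(lambda w: h(x * (1.0 - w**(1.0/alpha))), 0.0, 1.0)
    return x**alpha / alpha * val

for x in (0.3, 1.0, 2.0):
    lhs = x**(alpha - 1) * np.exp(beta * x)
    rhs = x**(alpha - 1) + frac_part(x)
    print(lhs, rhs)    # the two columns should agree
```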

13.5.4.2 Positive Solution to the Volterra Integral Equation

Theorem 7 Let the conditions of Theorem 6 hold true. Additionally, let
$$y(x) > 0, \qquad y'(x) \ge 0, \qquad h(x) < 0 \qquad \text{for all } x \in [0,T].$$
Then the continuous solution $f(x)$ to (13.26) attains only positive values on $(0,T]$.

Proof Notice that (13.29) implies $F(0) = y(0) > 0$. Taking this into account, let us differentiate both sides of (13.29), written the other way around:
$$F(x) = y(x) - \int_0^x F(x-t)\,h(t)\,dt,$$
$$F'(x) = y'(x) - F(0)h(x) - \int_0^x F'(x-t)\,h(t)\,dt. \tag{13.36}$$
Let us prove that $F'(x) > 0$ on $[0,T]$ by contradiction. Assume the contrary, that is, $\exists\,x \in [0,T] : F'(x) \le 0$. Since the function $F'(x)$ is continuous on $[0,T]$, this assumption implies the existence of the minimum $x_0 = \min\{x \in [0,T] : F'(x) \le 0\}$. But for $x = x_0$ the left-hand side of (13.36) is less than or equal to zero, while the right-hand side is greater than zero: indeed, $y'(x_0) \ge 0$, $-F(0)h(x_0) > 0$, and $F'(x_0 - t) > 0$ while $h(t) < 0$ for $t \in (0, x_0)$. Thus, (13.36) does not hold true, and we have a contradiction; the argument also works for $x_0 = 0$. Hence we have proved that $F'(x) > 0$ for all $x \in [0,T]$. By (13.30), $f(x) > 0$ for all $x \in (0,T]$. □
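Theorem 7 can be illustrated on the data of Example 4 (Sect. 13.5.4.1): $y \equiv 1$ and $h(x) = \alpha\beta\,{}_1F_1(\alpha+1;2;\beta x) < 0$ for $\beta < 0$. The sketch below (ours) solves the second-kind equation (13.29) by a trapezoidal scheme and observes that $F(0) = 1$ and $F$ is strictly increasing, as the theorem predicts.

```python
# Solve F(x) = 1 - int_0^x F(t) h(x-t) dt with h < 0 and watch F increase.
import numpy as np
from scipy.special import hyp1f1

alpha, beta, T, n = 0.6, -1.0, 1.0, 1000
x = np.linspace(0.0, T, n + 1)
dx = T / n
hv = alpha * beta * hyp1f1(alpha + 1, 2, beta * x)   # h on the grid, negative

F = np.empty(n + 1)
F[0] = 1.0                                           # F(0) = y(0)
for i in range(1, n + 1):
    s = 0.5*F[0]*hv[i] + np.dot(F[1:i], hv[i-1:0:-1])
    F[i] = (1.0 - dx*s) / (1.0 + 0.5*dx*hv[0])
print(F[0], F[-1], np.all(np.diff(F) > 0))           # F increases on [0, T]
```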

References

1. Alòs, E., Mazet, O., Nualart, D.: Stochastic calculus with respect to Gaussian processes. Ann. Probab. 29(2), 766–801 (2001)
2. Banna, O., Mishura, Y., Ralchenko, K., Shklyar, S.: Fractional Brownian Motion: Weak and Strong Approximations and Projections. Mathematics and Statistics Series. ISTE and Wiley, London and Hoboken, NJ (2019)
3. Bell, J.: The Kolmogorov continuity theorem, Hölder continuity, and Kolmogorov–Chentsov theorem. Lecture Notes, University of Toronto (2015). https://individual.utoronto.ca/jordanbell/notes/kolmogorovcontinuity.pdf
4. Boguslavskaya, E., Mishura, Y., Shevchenko, G.: Replication of Wiener-transformable stochastic processes with application to financial markets with memory. In: Silvestrov, S., Malyarenko, A., Rancic, M. (eds.) Stochastic Processes and Applications, Springer Proceedings in Mathematics & Statistics, vol. 271, pp. 335–361 (2018)
5. Fernique, X.: Continuité des processus Gaussiens. C. R. Acad. Sci. Paris 258, 6058–6060 (1964)
6. Kochubei, A.N.: General fractional calculus, evolution equations, and renewal processes. Integr. Equ. Oper. Theory 71(4), 583–600 (2011)
7. Lieb, E.H., Loss, M.: Analysis. AMS, Providence (2001)
8. Manzhirov, A.V., Polyanin, A.D.: Integral Equations Handbook. Factorial Press, Moscow (2000) (in Russian)
9. Mishura, Y.S.: Stochastic Calculus for Fractional Brownian Motion and Related Processes. Lecture Notes in Mathematics, vol. 1929. Springer-Verlag, Berlin (2008)
10. Norros, I., Valkeila, E., Virtamo, J.: An elementary approach to a Girsanov formula and other analytical results on fractional Brownian motions. Bernoulli 5(4), 571–587 (1999)
11. Samko, S.G., Kilbas, A.A., Marichev, O.I.: Fractional Integrals and Derivatives: Theory and Applications. Gordon and Breach, Amsterdam (1993)
12. Samko, S.G., Cardoso, R.P.: Sonine integral equations of the first kind in L_p(0,b). Fract. Calc. Appl. Anal. 6(3), 235–258 (2003)
13. Sonin, N.Ya.: Investigations of cylinder functions and special polynomials. Gosudarstv. Izdat. Tehn.-Teor. Lit., Moscow (1954) (in Russian)
14. Wolfram Research: Kummer confluent hypergeometric function 1F1, The Wolfram Functions Site (2008). www.functions.wolfram.com/HypergeometricFunctions/Hypergeometric1F1. Accessed 05 November 2019

Chapter 14
Stochastic Differential Equations Driven by Additive Volterra–Lévy and Volterra–Gaussian Noises

Giulia Di Nunno, Yuliya Mishura, and Kostiantyn Ralchenko

Abstract We study the existence and uniqueness of solutions to stochastic differential equations with Volterra processes driven by Lévy noise. For this purpose, we study in detail the smoothness properties of these processes. Special attention is given to two kinds of Volterra–Gaussian processes that generalize the compact interval representation of fractional Brownian motion, and to stochastic equations with such processes.

Keywords Volterra process · Lévy process · Gaussian process · Sonine pair · continuity · Hölder property · weak solution · strong solution

MSC 2020 60G15 · 60G51 · 60H10

G. Di Nunno: Department of Mathematics, University of Oslo, Blindern, P.O. Box 1053, 0316 Oslo, Norway. e-mail: [email protected]
Y. Mishura · K. Ralchenko (B): Department of Probability Theory, Statistics and Actuarial Mathematics, Taras Shevchenko National University of Kyiv, 64, Volodymyrs'ka St., 01601 Kyiv, Ukraine. e-mail: [email protected]; [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_14

14.1 Introduction

The main objects studied in the present paper are stochastic differential equations with additive noise, admitting the form
$$dX_t = u(X_t)\,dt + dY_t, \quad t \ge 0, \qquad X\big|_{t=0} = X_0 \in \mathbb{R}, \tag{14.1}$$
where $u : \mathbb{R} \to \mathbb{R}$ is a measurable function and $Y = \{Y_t, t \ge 0\}$ is a Volterra–Lévy process. Equations of the form (14.1), with different coefficients and different noises, have been the subject of long and careful study. Namely, the most popular case


is the Langevin equation, where $u(x) = ax$, $x \in \mathbb{R}$, with some coefficient $a \ne 0$, and a Wiener process as the noise. The corresponding solution is called the Ornstein–Uhlenbeck process, or the Vasicek process, and it serves as a mathematical model in many areas of science. Initially, Eq. (14.1) was proposed as a model for the velocity of particles in the theory of Brownian motion in [9]; the corresponding mathematical theory was then developed in [21, 24]; see, e.g., the book [22] for applications of the Ornstein–Uhlenbeck process in physics. Since the seminal paper by Vasicek [23], the Ornstein–Uhlenbeck process has become a very popular model in mathematical finance, see, e.g., [6, 7, 10, 11, 17–19, 25].

A Volterra–Lévy process has the form $Y_t = \int_0^t g(t,s)\,dZ_s$, where $g(t,s)$ is a given deterministic Volterra-type kernel and $Z$ is a Lévy process. The conditions on $g$ and $Z$ ensuring the existence of Volterra–Lévy processes were studied in [2], together with a theory of pathwise stochastic integration with respect to such processes. Some approximations and first numerical results can be found in [1]. The goal of the present paper is to study stochastic differential equations with additive noise represented by a Volterra–Lévy process.

We start with an investigation of the continuity and Hölder properties of Volterra–Lévy processes. In order to apply the Kolmogorov–Chentsov theorem, we establish moment upper bounds for the increments of these processes. In particular, we study in detail the case when the kernel $g$ satisfies certain power restrictions. Two examples of such kernels are considered, namely the Molchan–Golosov kernel, which arises in the compact interval representation of fractional Brownian motion, and a subfractional kernel, which corresponds to sub-fractional Brownian motion. For both kernels, the sample paths of the corresponding Volterra–Lévy processes satisfy the Hölder condition up to order $H - \frac12$, where $H$ denotes the Hurst index. However, in the particular case of Gaussian $Z$, one has Hölder continuity up to order $H$. This agrees with the theory of fractional Brownian motion and with the paper [20], where the authors study the case when $g(t,s)$ is the Molchan–Golosov kernel and $Z$ is a Lévy process without Gaussian component.

Special attention in the paper is given to Volterra–Gaussian processes, which arise when the Lévy process $Z$ is a Brownian motion. We investigate two types of kernels that generalize the Molchan–Golosov kernel of fractional Brownian motion. One of these kernels corresponds to fractional Brownian motion with Hurst index $H > \frac12$. It was introduced in [12], where conditions for its existence and Hölder continuity were investigated; also, in [12] the inverse representation of the underlying Wiener process via the Volterra–Gaussian process was studied, based on the properties of Sonine pairs. In the present paper we also introduce another type of Volterra–Gaussian process, which extends fractional Brownian motion with $H < \frac12$, and we study the smoothness of this process. We also derive the inverse operators for both types of Volterra–Gaussian processes in terms of generalized fractional integrals and derivatives for Sonine pairs.

Then we apply the results mentioned above to the investigation of stochastic differential equations with Volterra–Lévy processes. We start with a deterministic analog of Eq. (14.1), where the stochastic term $Y_t$ is replaced by a non-random function that is locally integrable or locally bounded. We study the solvability of this equation


under a Lipschitz condition on the drift coefficient $u$. Then we prove that the stochastic Eq. (14.1) with a locally Lipschitz coefficient of linear growth has a unique solution under certain conditions on the underlying process $Z$ and power restrictions on the kernel $g(t,s)$.

We also study stochastic differential equations with two kinds of Volterra–Gaussian processes. In this case we can prove solvability of the equation under weaker assumptions on the drift coefficient: namely, we assume sublinear growth of this coefficient and its Hölder continuity. We generalize the results of [14], where the noise was a fractional Brownian motion, to the case of a more general Volterra–Gaussian noise. We prove the existence and uniqueness of a weak solution, the pathwise uniqueness of two weak solutions, and the existence and uniqueness of a strong solution.

The paper is organized as follows. In Sect. 14.2 we recall the definition of a Volterra–Lévy process, necessary conditions for its existence, and a priori estimates for its moments. Section 14.3 is devoted to the Hölder properties of Volterra–Lévy processes. As auxiliary results, we establish upper bounds for the incremental moments in the general case (Subsect. 14.3.1) as well as in the case of power restrictions on the kernel (Subsect. 14.3.2). In Subsect. 14.3.3 we apply these bounds to the investigation of continuity and Hölder properties of three types of Volterra–Lévy processes. Two examples of appropriate kernels are given in Subsect. 14.3.4. In Subsect. 14.3.5 two kinds of Volterra–Gaussian processes are studied. Section 14.4 is devoted to the existence and uniqueness of the solution to Eq. (14.1). Stochastic differential equations with Volterra–Gaussian processes are studied in Sect. 14.5. In Appendix A we prove some auxiliary results related to fractional calculus for Sonine pairs. In Appendix B a deterministic analog of Eq. (14.1) is investigated.

Throughout the paper, we use the notation $C$ for various constants whose value is not important and may change from line to line, and even within the same line.

14.2 Brief Description of Volterra–Lévy Processes

We start with a Lévy process $Z$. In order to describe it, define
$$\tau(z) := \begin{cases} z, & |z| \le 1, \\ \dfrac{z}{|z|}, & |z| > 1. \end{cases}$$
Then the characteristic function of $Z_t$ can be represented in the following form (see, e.g., [16]): $\mathsf{E}\exp\{i\mu Z_t\} = \exp\{t\psi(\mu)\}$, where
$$\psi(\mu) = ib\mu - \frac{a\mu^2}{2} + \int_{\mathbb{R}}\bigl(e^{i\mu x} - 1 - i\mu\tau(x)\bigr)\,\pi(dx),$$
$b \in \mathbb{R}$, $a \ge 0$, and $\pi$ is a Lévy measure on $\mathbb{R}$, that is, a $\sigma$-finite Borel measure satisfying $\int_{\mathbb{R}}\bigl(x^2 \wedge 1\bigr)\,\pi(dx) < \infty$, with $\pi(\{0\}) = 0$. The triplet $(a, b, \pi)$ is shortly called the characteristic triplet of $Z$.

Let us fix some $T > 0$ and introduce the following Volterra–Lévy process:
$$Y_t = \int_0^t g(t,s)\,dZ_s, \quad t \in [0,T], \tag{14.2}$$
where $g(t,s)$ is a given deterministic Volterra-type kernel. The integral in (14.2) is understood in the sense of [15] as the limit in probability of elementary integrals; its construction is described in [2, Thm. 2.2]. According to [2], in order to guarantee the existence of the process $Y$ and of its moments, we need stricter assumptions on the base-process $Z$ (as we call it here) and the kernel $g(t,s)$. More precisely, in what follows we assume that the Volterra–Lévy process (14.2) has $b = 0$ (i.e., $Z$ is a Lévy process without drift), the measure $\pi$ is symmetric, and one of the following conditions holds:

(A1) There exists $p \in [1,2)$ such that $g = g(t,\cdot) \in L^p([0,t])$ for any $t \in [0,T]$; $a = 0$ and $\int_{\mathbb{R}}|x|^p\,\pi(dx) < \infty$.
(A2) There exists $p \ge 2$ such that $g = g(t,\cdot) \in L^p([0,t])$ for any $t \in [0,T]$ and $\int_{\mathbb{R}}|x|^p\,\pi(dx) < \infty$.

Then, according to [2, Thm. 2.2], the integral $\int_0^t g(t,s)\,dZ_s$ exists for any $t \in [0,T]$. Moreover, in the case when condition (A1) holds, we have the following a priori estimate:
$$\mathsf{E}\left|\int_0^t g(t,s)\,dZ_s\right|^p \le C\,\|g(t,\cdot)\|_{L^p([0,t])}^p\int_{\mathbb{R}}|x|^p\,\pi(dx), \tag{14.3}$$
and in the case when condition (A2) holds, we have the following a priori estimate:
$$\mathsf{E}\left|\int_0^t g(t,s)\,dZ_s\right|^p \le C\left(a^{p/2}\|g(t,\cdot)\|_{L^2([0,t])}^p + \|g(t,\cdot)\|_{L^p([0,t])}^p\int_{\mathbb{R}}|x|^p\,\pi(dx)\right). \tag{14.4}$$
The constant $C$ in (14.3) and (14.4) does not depend on the function $g$; however, it may depend on $p$ and $T$.
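For a finite-activity base-process, (14.2) can be simulated directly. The minimal sketch below (ours) takes $Z$ to be a symmetric compound Poisson process (so $a = 0$ and the jump moment conditions hold for Gaussian jumps); the integral then reduces to a sum over the jumps of $Z$, $Y_t = \sum_{s_i \le t} g(t,s_i)\,\xi_i$. The kernel $g(t,s) = (t-s)^{H-1/2}$ is a hypothetical fractional-type choice, not one of the specific kernels of this paper.

```python
# Simulate a Volterra-Levy path (14.2) driven by symmetric compound Poisson noise.
import numpy as np

rng = np.random.default_rng(1)
T, H, lam = 1.0, 0.7, 50.0                  # horizon, kernel exponent, jump rate

n_jumps = rng.poisson(lam * T)
jump_times = np.sort(rng.uniform(0.0, T, n_jumps))
jump_sizes = rng.normal(0.0, 1.0, n_jumps)  # symmetric jump distribution

def Y(t):
    mask = jump_times <= t
    # g(t, s) = (t - s)^{H - 1/2}; H - 1/2 = 0.2 > 0, so s = t causes no blow-up
    return np.sum((t - jump_times[mask])**(H - 0.5) * jump_sizes[mask])

grid = np.linspace(0.0, T, 201)
path = np.array([Y(t) for t in grid])
print(path[0], path[-1])                    # Y_0 = 0; the path is finite
```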

14.3 Moment Upper Bounds and Hölder Properties of Volterra–Lévy Processes

In our approach, in order to use a Volterra–Lévy process as a noise, we need smoothness properties of its trajectories. The present section is therefore devoted to its Hölder properties. Obviously, these properties depend both on the properties of the kernel $g$ and on the Lévy base-process $Z$.


14.3.1 General Upper Bounds for the Incremental Moments

In this subsection we establish upper bounds for $\mathsf{E}|Y_t - Y_s|^p$ under assumptions (A1) and (A2).

Lemma 1 Consider $0 \le s \le t \le T$. Let assumption (A1) hold. Then
$$\mathsf{E}|Y_t - Y_s|^p \le C\int_{\mathbb{R}}|x|^p\,\pi(dx)\left(\int_s^t |g(t,u)|^p\,du + \int_0^s |g(t,u) - g(s,u)|^p\,du\right). \tag{14.5}$$
Let assumption (A2) hold. Then
$$\mathsf{E}|Y_t - Y_s|^p \le C\int_{\mathbb{R}}|x|^p\,\pi(dx)\left(\int_s^t |g(t,u)|^p\,du + \int_0^s |g(t,u) - g(s,u)|^p\,du\right)$$
$$+\ Ca^{p/2}\left(\left(\int_s^t |g(t,u)|^2\,du\right)^{p/2} + \left(\int_0^s |g(t,u) - g(s,u)|^2\,du\right)^{p/2}\right). \tag{14.6}$$

Proof Note that the increment of $Y$ is given by
$$Y_t - Y_s = \int_0^t g(t,u)\,dZ_u - \int_0^s g(s,u)\,dZ_u = \int_s^t g(t,u)\,dZ_u + \int_0^s \bigl(g(t,u) - g(s,u)\bigr)\,dZ_u.$$
Therefore,
$$\mathsf{E}|Y_t - Y_s|^p \le C\left(\mathsf{E}\left|\int_s^t g(t,u)\,dZ_u\right|^p + \mathsf{E}\left|\int_0^s \bigl(g(t,u) - g(s,u)\bigr)\,dZ_u\right|^p\right). \tag{14.7}$$
In order to conclude the proof, it suffices to apply the bounds (14.3) and (14.4) to the integrals on the right-hand side of (14.7). □

We remark that Hölder continuity of paths is a central property also, e.g., in the rough-paths approach to the study of stochastic (partial) differential equations, and our results can find application in that framework. We refer to, e.g., [8] for a study of Volterra-driven stochastic differential equations with multiplicative noise via rough paths. Note that, in contrast to our work, the base-process there is assumed to be Hölder continuous.

14.3.2 Incremental Moments and Hölder Continuity Under Power Restrictions on the Kernel $g$

As one can see from the inequalities (14.5) and (14.6), the incremental moments of $Y$ are bounded by certain integrals containing $g$, its powers, and its increments. Now let us consider a more specific class of kernels $g$. Assume that the function $g$ satisfies the following power restrictions with some $p \ge 1$.

(B1) There exist constants $\alpha \in \mathbb{R}$, $\beta > -\frac1p$ and $\gamma > -\frac1p$ such that
$$|g(t,u)| \le Ct^\alpha u^\beta(t-u)^\gamma \quad\text{for all } 0 < u < t \le T.$$
(B2) There exist a constant $\delta > 0$ and a function $h(t,s,u)$ such that
$$|g(t,u) - g(s,u)| \le |t-s|^\delta h(t,s,u) \quad\text{for all } 0 < u < s < t \le T,$$
and $\sup_{0 < s < t \le T}\int_0^s \bigl(h(t,s,u)\bigr)^p\,du < \infty$.

For $p$ such that $\delta p > 1$, we will be able to apply the Kolmogorov continuity theorem and investigate the Hölder properties of $Y$. Taking into account Lemma 1, we need to estimate the integrals of the form $\int_s^t |g(t,u)|^p\,du$ and $\int_0^s |g(t,u) - g(s,u)|^p\,du$. Obviously, under assumption (B2), the second integral satisfies the inequality
$$\int_0^s |g(t,u) - g(s,u)|^p\,du \le C|t-s|^{\delta p}. \tag{14.8}$$
The study of the first integral is more delicate. We start with the following auxiliary result.

14 Stochastic Differential Equations Driven by Additive Volterra-Lévy …

**Lemma 2** Let $\mu > -1$ and $\nu > -1$. Then for all $0 \le s < t \le T$,
$$
\int_s^t u^{\mu}(t-u)^{\nu}\,du \le C t^{\mu}(t-s)^{\nu+1}. \qquad (14.9)
$$

The positive constant $C$ in (14.9) may depend on $\mu$, $\nu$ and $T$.

*Proof* Write
$$
\int_s^t u^{\mu}(t-u)^{\nu}\,du = \int_s^{\frac{s+t}{2}} u^{\mu}(t-u)^{\nu}\,du + \int_{\frac{s+t}{2}}^t u^{\mu}(t-u)^{\nu}\,du =: I_1 + I_2. \qquad (14.10)
$$
For $s \le u \le \frac{s+t}{2}$, we have $(t-u)^{\nu} = (t-u)^{\nu+1}(t-u)^{-1} \le (t-s)^{\nu+1}(t-u)^{-1}$. Therefore,
$$
I_1 \le (t-s)^{\nu+1} \int_s^{\frac{s+t}{2}} \frac{u^{\mu}}{t-u}\,du = (t-s)^{\nu+1} t^{-1} \int_s^{\frac{s+t}{2}} \frac{u^{\mu}(t-u+u)}{t-u}\,du
$$
$$
= (t-s)^{\nu+1} t^{-1} \int_s^{\frac{s+t}{2}} u^{\mu}\,du + (t-s)^{\nu+1} t^{-1} \int_s^{\frac{s+t}{2}} \frac{u^{\mu+1}}{t-u}\,du =: I_{11} + I_{12}. \qquad (14.11)
$$
The term $I_{11}$ can be bounded as follows:
$$
I_{11} = C(t-s)^{\nu+1} t^{-1} \left( \left(\tfrac{s+t}{2}\right)^{\mu+1} - s^{\mu+1} \right) \le C t^{\mu}(t-s)^{\nu+1}, \qquad (14.12)
$$
since $\left(\frac{s+t}{2}\right)^{\mu+1} - s^{\mu+1} \le \left(\frac{s+t}{2}\right)^{\mu+1} \le t^{\mu+1}$.

In order to bound $I_{12}$, we use the inequality $u^{\mu+1} \le \left(\frac{s+t}{2}\right)^{\mu+1} \le t^{\mu+1}$. We get
$$
I_{12} \le (t-s)^{\nu+1} t^{\mu} \int_s^{\frac{s+t}{2}} \frac{du}{t-u} = (t-s)^{\nu+1} t^{\mu} \left( \log(t-s) - \log\frac{t-s}{2} \right) = t^{\mu}(t-s)^{\nu+1}\log 2 = C t^{\mu}(t-s)^{\nu+1}. \qquad (14.13)
$$
Consider $I_2$. Note that for $\frac{s+t}{2} < u < t$,
$$
u^{\mu} \le \left(\tfrac{s+t}{2}\right)^{\mu} \le \left(\tfrac{t}{2}\right)^{\mu} \ \text{ if } \mu < 0, \qquad u^{\mu} \le t^{\mu} \ \text{ if } \mu \ge 0.
$$
Hence, in both cases we have the bound $u^{\mu} \le C t^{\mu}$. Therefore,
$$
I_2 \le C t^{\mu} \int_{\frac{s+t}{2}}^t (t-u)^{\nu}\,du = C t^{\mu} \left( t - \tfrac{s+t}{2} \right)^{\nu+1} = C t^{\mu}(t-s)^{\nu+1}. \qquad (14.14)
$$
Combining (14.10)–(14.14), we get (14.9). □

Lemma 2 allows us to obtain an upper bound for the integral $\int_s^t |g(t,u)|^p\,du$.

**Lemma 3** Assume that condition (B1) holds with some $p \ge 1$. Then
$$
\int_s^t |g(t,u)|^p\,du \le C(t-s)^{\kappa p + 1} \quad \text{for all } 0 \le s < t \le T,
$$
where
$$
\kappa = \kappa(\alpha,\beta,\gamma) = \begin{cases} \alpha+\beta+\gamma, & \text{if } \alpha+\beta < 0,\\ \gamma, & \text{if } \alpha+\beta \ge 0. \end{cases} \qquad (14.15)
$$
The constant $C$ may depend on $\alpha$, $\beta$, $\gamma$, $p$ and $T$.

*Proof* According to condition (B1), $\int_s^t |g(t,u)|^p\,du \le C t^{\alpha p} \int_s^t u^{\beta p}(t-u)^{\gamma p}\,du$. Applying the upper bound (14.9), one gets $\int_s^t |g(t,u)|^p\,du \le C t^{(\alpha+\beta)p}(t-s)^{\gamma p+1}$. If $\alpha+\beta < 0$, then $t^{(\alpha+\beta)p} \le (t-s)^{(\alpha+\beta)p}$, and we obtain the inequality $\int_s^t |g(t,u)|^p\,du \le C(t-s)^{(\alpha+\beta+\gamma)p+1}$. If $\alpha+\beta \ge 0$, then $t^{(\alpha+\beta)p} \le T^{(\alpha+\beta)p}$, hence $\int_s^t |g(t,u)|^p\,du \le C(t-s)^{\gamma p+1}$. □
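The scaling in Lemma 3 can be checked numerically. The following sketch (our own helper, not from the text) approximates $\int_{t-h}^t |g(t,u)|^p\,du$ for the power kernel by a midpoint rule and estimates the exponent of $h$ from two values of $h$; it should be close to $\kappa p + 1$.

```python
import math

def I_p(s, t, alpha, beta, gamma, p, n=200_000):
    """Midpoint-rule approximation of int_s^t |g(t,u)|^p du for
    g(t,u) = t^alpha * u^beta * (t-u)^gamma."""
    w = (t - s) / n
    total = 0.0
    for k in range(n):
        u = s + (k + 0.5) * w  # midpoints avoid the endpoint singularity at u = t
        total += (t ** alpha * u ** beta * (t - u) ** gamma) ** p * w
    return total

# g(t,u) = u^{0.3} (t-u)^{-0.25}, p = 2: alpha + beta = 0.3 >= 0, so kappa = gamma
# = -0.25 by (14.15), and the predicted exponent is kappa*p + 1 = 0.5.
t, p, alpha, beta, gamma = 1.0, 2, 0.0, 0.3, -0.25
h1, h2 = 0.1, 0.05
est = math.log(I_p(t - h1, t, alpha, beta, gamma, p) /
               I_p(t - h2, t, alpha, beta, gamma, p)) / math.log(h1 / h2)
print(est)
```

The log-ratio estimate converges to $\kappa p + 1 = 0.5$ as $h \to 0$; for these moderate $h$ it already agrees to within a few percent.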

14.3.3 Application of the Upper Bounds for the Incremental Moments to Volterra–Lévy Processes of Three Types

Now, based on Lemma 3, we can better specify the upper bounds (14.5) and (14.6) for the moments of increments of the Volterra–Lévy process $Y$ satisfying (B1)–(B2). As a consequence, we shall also state its Hölder properties. We consider three cases: (1) $Z$ is a Lévy process without Brownian part; (2) $Z$ is a Brownian motion; (3) $Z$ is a Lévy process of a general form.

14.3.3.1 Lévy-Based Process Without Brownian Part

We start with the case of a Lévy process in (14.2) without Brownian part, that is, $a = 0$.

**Lemma 4** Assume that $p \ge 1$, $a = 0$, $\int_{\mathbb{R}} |x|^p\,\pi(dx) < \infty$, and that the conditions (B1) and (B2) hold with some $\alpha \in \mathbb{R}$, $\delta > 0$, $\beta > -\frac1p$, $\gamma > -\frac1p$ such that $\alpha+\beta+\gamma > -\frac1p$. Then for all $0 \le s < t \le T$,
$$
\mathbf{E}|Y_t-Y_s|^p \le C(t-s)^{\min\{\kappa p+1,\,\delta p\}},
$$
where $\kappa$ is defined by (14.15). If $\kappa > 0$ and $\delta > \frac1p$, then the trajectories of $Y$ are a.s. Hölder continuous up to order $\min\{\kappa,\,\delta-\frac1p\}$.

*Proof* According to Lemma 1, we have
$$
\mathbf{E}|Y_t-Y_s|^p \le C\left( \int_s^t |g(t,u)|^p\,du + \int_0^s |g(t,u)-g(s,u)|^p\,du \right).
$$
Applying Lemma 3 and (14.8), we get
$$
\mathbf{E}|Y_t-Y_s|^p \le C(t-s)^{\kappa p+1} + C(t-s)^{\delta p} \le C T^{\kappa p+1}\left(\frac{t-s}{T}\right)^{\min\{\kappa p+1,\delta p\}} + C T^{\delta p}\left(\frac{t-s}{T}\right)^{\min\{\kappa p+1,\delta p\}} \le C(t-s)^{\min\{\kappa p+1,\delta p\}}.
$$
Hölder continuity follows from the Kolmogorov continuity theorem. □

14.3.3.2 The Brownian Case

**Lemma 5** Assume that $Z$ is a Brownian motion and the conditions (B1) and (B2) hold with $p = 2$, $\alpha \in \mathbb{R}$, $\beta > -\frac12$, $\gamma > -\frac12$ such that $\alpha+\beta+\gamma > -\frac12$. Then for all $p \ge 2$ and all $0 \le s < t \le T$,
$$
\mathbf{E}|Y_t-Y_s|^p \le C(t-s)^{p\min\{\kappa+\frac12,\,\delta\}},
$$
where $\kappa$ is defined by (14.15). If $\kappa > -\frac12$, then the trajectories of $Y$ are a.s. Hölder continuous up to order $\min\{\kappa+\frac12,\,\delta\}$.

*Proof* In the Brownian case, (14.6) becomes
$$
\mathbf{E}|Y_t-Y_s|^p \le C\left( \Big(\int_s^t |g(t,u)|^2\,du\Big)^{p/2} + \Big(\int_0^s |g(t,u)-g(s,u)|^2\,du\Big)^{p/2} \right).
$$
Then by Lemma 3 and (14.8), we get
$$
\mathbf{E}|Y_t-Y_s|^p \le C(t-s)^{\frac{p}{2}(2\kappa+1)} + C(t-s)^{\delta p} \le C(t-s)^{p\min\{\kappa+\frac12,\delta\}}.
$$
By the Kolmogorov continuity theorem, if $p\min\{\kappa+\frac12,\delta\} > 1$, then the trajectories of $Y$ are a.s. Hölder up to order $\min\{\kappa+\frac12,\delta\} - \frac1p$. Since $p$ can be chosen arbitrarily large, we get Hölder continuity up to order $\min\{\kappa+\frac12,\delta\}$, if $\kappa > -1/2$. □
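For a Gaussian Volterra process the second moment of an increment can be computed exactly from the Itô isometry, so the order predicted by Lemma 5 can be verified without simulation. A sketch (our own check, using the Riemann–Liouville-type kernel $g(t,u)=(t-u)^{\gamma}$, for which $\alpha=\beta=0$ and $\kappa=\gamma$):

```python
import math

def var_increment(s, t, gamma, n=200_000):
    """E(Y_t - Y_s)^2 for Y_t = int_0^t (t-u)^gamma dW_u via the Ito isometry:
    int_s^t (t-u)^{2 gamma} du + int_0^s ((t-u)^gamma - (s-u)^gamma)^2 du."""
    v = (t - s) ** (2 * gamma + 1) / (2 * gamma + 1)   # first integral, exact
    w = s / n
    for k in range(n):                                 # midpoint rule for the second
        u = (k + 0.5) * w
        v += ((t - u) ** gamma - (s - u) ** gamma) ** 2 * w
    return v

gamma = -0.2                     # kappa = gamma, so the predicted order is 2*gamma+1 = 0.6
s = 0.5
v1 = var_increment(s, s + 0.10, gamma)
v2 = var_increment(s, s + 0.05, gamma)
est = math.log(v1 / v2) / math.log(2.0)
print(est)
```

The estimated exponent of $t-s$ in $\mathbf{E}(Y_t-Y_s)^2$ is close to $2(\kappa+\frac12) = 2\gamma+1 = 0.6$, in line with Lemma 5 for $p=2$.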

14.3.3.3 Lévy-Based Process of a General Form

Now let us consider a Lévy process $Z$ of a general form. In this case we need to assume that $p \ge 2$ in order to guarantee the existence of $Y$ and its moments, see [2, Thm. 2.2]. It turns out that under this assumption we have the same upper bound for the incremental moments as in the case $a = 0$.

**Lemma 6** Assume that for some $p \ge 2$ we have $\int_{\mathbb{R}} |x|^p\,\pi(dx) < \infty$ and the conditions (B1) and (B2) hold with some $\alpha \in \mathbb{R}$, $\beta > -\frac1p$, $\gamma > -\frac1p$ such that $\alpha+\beta+\gamma > -\frac1p$. Then for all $0 \le s < t \le T$, $\mathbf{E}|Y_t-Y_s|^p \le C(t-s)^{\min\{\kappa p+1,\,\delta p\}}$, where $\kappa$ is defined by (14.15). If $\kappa > 0$ and $\delta > \frac1p$, then the trajectories of $Y$ are a.s. Hölder continuous up to order $\min\{\kappa,\,\delta-\frac1p\}$.

*Proof* Applying Lemma 1, Lemma 3 and (14.8), we obtain
$$
\mathbf{E}|Y_t-Y_s|^p \le C\Bigg( \int_s^t |g(t,u)|^p\,du + \int_0^s |g(t,u)-g(s,u)|^p\,du + \Big(\int_s^t |g(t,u)|^2\,du\Big)^{p/2} + \Big(\int_0^s |g(t,u)-g(s,u)|^2\,du\Big)^{p/2} \Bigg)
$$
$$
\le C(t-s)^{\kappa p+1} + C(t-s)^{\delta p} + C(t-s)^{\frac{p}{2}(2\kappa+1)} + C(t-s)^{\delta p} \le C(t-s)^{\min\{\kappa p+1,\,\delta p,\,\frac{p}{2}(2\kappa+1)\}} = C(t-s)^{\min\{\kappa p+1,\,\delta p\}},
$$
since $\frac{p}{2}(2\kappa+1) = \kappa p + \frac{p}{2} \ge \kappa p + 1$ for $p \ge 2$. Hölder continuity follows from the Kolmogorov continuity theorem. □

**Remark 1** The assumption (B1) can be replaced by the following more general condition:

(B1′) There exist constants $\alpha_i \in \mathbb{R}$, $\beta_i > -\frac1p$ and $\gamma_i > -\frac1p$, $i = 1,2,\dots,m$, such that for all $0 < u < t \le T$,
$$
|g(t,u)| \le C \sum_{i=1}^m t^{\alpha_i} u^{\beta_i} (t-u)^{\gamma_i}.
$$
In this case the statements of Lemmas 3–6 hold true with $\kappa = \min_{1 \le i \le m} \kappa_i$, where $\kappa_i = \kappa(\alpha_i,\beta_i,\gamma_i)$, $i = 1,\dots,m$, are defined by (14.15). Indeed, in order to prove Lemma 3 under the assumption (B1′), it suffices to apply the bound $(x_1+\dots+x_m)^p \le C(x_1^p + \dots + x_m^p)$ and follow the same reasoning as in the case of the condition (B1). The other lemmas are then easily deduced from Lemma 3.


14.3.4 Examples of Volterra–Lévy Processes with Power Restrictions on the Kernel

14.3.4.1 The Molchan–Golosov Kernel

Let us verify the assumptions (B1) and (B2) for the Molchan–Golosov kernel, which is defined as
$$
K_H(t,s) = C_H\, s^{\frac12-H}\left( t^{H-\frac12}(t-s)^{H-\frac12} - \Big(H-\tfrac12\Big)\int_s^t u^{H-\frac32}(u-s)^{H-\frac12}\,du \right), \qquad (14.16)
$$
where $H \in (0,1)$, $C_H = \left( \dfrac{2H\,\Gamma(\frac32-H)}{\Gamma(H+\frac12)\,\Gamma(2-2H)} \right)^{1/2}$. This kernel arises in the compact interval representation of the fractional Brownian motion as an integral with respect to a Wiener process $W$, see, e.g., [13, Sect. 2.8]. More precisely, the Volterra process
$$
B_t^H = \int_0^t K_H(t,s)\,dW_s, \quad t \ge 0, \qquad (14.17)
$$
is a fractional Brownian motion with the Hurst parameter $H$, that is, a zero mean Gaussian process with covariance function $\mathbf{E} B_t^H B_s^H = \frac12\left( s^{2H} + t^{2H} - |t-s|^{2H} \right)$. Note that the precise value of $C_H$ is irrelevant in the context of our study; the following results concerning Hölder continuity of Volterra processes are valid for any $C > 0$ instead of $C_H$. Hereafter we consider the Volterra process
$$
Y_t^H = \int_0^t K_H(t,s)\,dZ_s, \quad t \in [0,T], \qquad (14.18)
$$
where $Z$ is a Lévy base-process. We recall that if $Z$ is without Gaussian component, then the process (14.18) is known as a fractional Lévy process by Molchan–Golosov transformation. It was introduced and studied in [20].

**Proposition 1** Let $H \in (0,1)$, $\varepsilon \in (0,H)$.
1. Let $0 < \int_{\mathbb{R}} x^2\,\pi(dx) < \infty$. Then for all $0 \le s < t \le T$,
$$
\mathbf{E}\big| Y_t^H - Y_s^H \big|^2 \le C(t-s)^{2(H-\varepsilon)}.
$$
If $H \in (\frac12,1)$, then the trajectories of $Y^H$ are $\kappa$-Hölder continuous for any $\kappa \in (0, H-\frac12)$.
2. Let $Z$ be a Brownian motion. Then for all $p \ge 2$ and all $0 \le s < t \le T$,

$$
\mathbf{E}\big| Y_t^H - Y_s^H \big|^p \le C(t-s)^{p(H-\varepsilon)},
$$
and the trajectories of $Y^H$ are $\kappa$-Hölder continuous for any $\kappa \in (0,H)$.

*Proof* We prove both statements simultaneously. Without loss of generality, assume that $0 < \varepsilon < \min\{1-H, \frac12\}$. Indeed, if the result of the proposition holds for some $\varepsilon = \varepsilon^* > 0$, then it holds also for all $\varepsilon > \varepsilon^*$. We consider the cases $H = \frac12$, $H > \frac12$ and $H < \frac12$ separately.

*Case $H = \frac12$.* Note that if $H = \frac12$, then $K_H \equiv \mathrm{const}$. Hence, for any $p$, (B1) and (B2) are valid with $\alpha = \beta = \gamma = 0$ and with any $\delta > 0$. If $\int_{\mathbb{R}} x^2\,\pi(dx) < \infty$, then, by Lemma 6, $\mathbf{E}|Y_t^H - Y_s^H|^2 \le C(t-s)$ for all $0 \le s < t \le T$. If $Z$ is a Brownian motion, then by Lemma 5, $\mathbf{E}|Y_t^H - Y_s^H|^p \le C(t-s)^{p/2}$ for all $0 \le s < t \le T$ and $p \ge 2$. Hence, both statements of the proposition hold even for $\varepsilon = 0$ (consequently, they hold for any $\varepsilon > 0$).

*Case $H \in (\frac12,1)$.* In this case the kernel (14.16) can be rewritten using integration by parts in the following form:
$$
K_H(t,s) = C s^{\frac12-H} \int_s^t u^{H-\frac12}(u-s)^{H-\frac32}\,du. \qquad (14.19)
$$
For $0 < s < t \le T$, we have
$$
|K_H(t,s)| \le C s^{\frac12-H} t^{H-\frac12} \int_s^t (u-s)^{H-\frac32}\,du = C t^{H-\frac12} s^{\frac12-H} (t-s)^{H-\frac12}.
$$
Therefore the condition (B1) holds with $\alpha = H-\frac12$, $\beta = \frac12-H$, $\gamma = H-\frac12$. In order to verify the condition (B2), we need to estimate the difference $|K_H(t,u) - K_H(s,u)|$. We have for $0 < u < s < t \le T$,
$$
|K_H(t,u) - K_H(s,u)| = C u^{\frac12-H} \int_s^t z^{H-\frac12}(z-u)^{H-\frac32}\,dz \le C u^{\frac12-H} \int_s^t (z-u)^{2H-2}\,dz + C \int_s^t (z-u)^{H-\frac32}\,dz \qquad (14.20)
$$
(here we have used the inequality $z^{H-\frac12} \le (z-u)^{H-\frac12} + u^{H-\frac12}$). Let $\varepsilon \in (0, 1-H)$. Then the integrals in the right-hand side of (14.20) can be bounded as follows:

$$
\int_s^t (z-u)^{2H-2}\,dz \le (s-u)^{H+\varepsilon-1} \int_s^t (z-s)^{H-\varepsilon-1}\,dz = C(s-u)^{H+\varepsilon-1}(t-s)^{H-\varepsilon},
$$
$$
\int_s^t (z-u)^{H-\frac32}\,dz \le (s-u)^{\varepsilon-\frac12} \int_s^t (z-s)^{H-\varepsilon-1}\,dz = C(s-u)^{\varepsilon-\frac12}(t-s)^{H-\varepsilon}.
$$

Hence,
$$
|K_H(t,u) - K_H(s,u)| \le (t-s)^{H-\varepsilon} h(s,u), \quad \text{where} \quad h(s,u) = C\left( u^{\frac12-H}(s-u)^{H+\varepsilon-1} + (s-u)^{\varepsilon-\frac12} \right).
$$
If $p$ …

… $H > \frac12$. In this case $h(s) = s^{\frac12-H}$. Other examples of Sonine pairs $(c,h)$ are given in [12].
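The defining feature of a Sonine pair, as used in [12], is that the convolution $\int_0^t c(s)h(t-s)\,ds$ is the same constant for every $t$. A quick numerical sanity check of this for the power pair $c(s)=s^{H-\frac12}$, $h(s)=s^{-\frac12-H}$ (our own helper; the function name and the substitution are not from the text):

```python
import math

# For c(s) = s^{a-1}, h(s) = s^{-a} with a = H + 1/2, the convolution
# int_0^t c(s) h(t-s) ds equals B(a, 1-a) = Gamma(a) Gamma(1-a) for every t,
# which is the constancy defining a Sonine-type pair (up to normalisation).
H = 0.3
a = H + 0.5

def convolution(n=200_000):
    """int_0^t s^{a-1} (t-s)^{-a} ds computed via s = t sin^2(theta);
    the substitution cancels the t-dependence exactly and tames both
    endpoint singularities for the midpoint rule."""
    w = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        th = (k + 0.5) * w
        total += 2.0 * math.sin(th) ** (2 * a - 1) * math.cos(th) ** (1 - 2 * a) * w
    return total

val = convolution()
exact = math.gamma(a) * math.gamma(1 - a)   # = pi / sin(pi * a) by reflection
print(val, exact)
```

The quadrature value agrees with $\Gamma(a)\Gamma(1-a)$, and dividing $h$ by this constant normalises the convolution to $1$.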

Let us consider the operator $K$ associated with the kernel $K(t,s)$ in (14.33):
$$
K f(t) = \int_0^t K(t,s) f(s)\,ds = \int_0^t a(s) \int_s^t b(u) c(u-s)\,du\, f(s)\,ds. \qquad (14.34)
$$
In order to find an inverse operator to $K$, let us apply the elements of "fractional" calculus related to the Sonine pair $(c,h)$. More precisely, we use notions similar to the fractional integral and the fractional derivative, as given in Definition 3 from Appendix A, see also [12]. In terms of the fractional integral $I_{0+}^c$ from Definition 3, the operator $K$ can be rewritten as follows:
$$
K f(t) = \int_0^t b(u) \int_0^u a(s) c(u-s) f(s)\,ds\,du = \int_0^t b(u)\, I_{0+}^c(af)(u)\,du. \qquad (14.35)
$$

**Lemma 7** Consider the equation
$$
K f(t) = \int_0^t a(s) \int_s^t b(u) c(u-s)\,du\, f(s)\,ds = \int_0^t u(z)\,dz, \quad t \in [0,T].
$$
Then its solution has the form
$$
f(t) = a^{-1}(t)\, D_{0+}^h\big( u b^{-1} \big)(t), \qquad (14.36)
$$
under the assumption that the right-hand side of (14.36) is well defined and $D_{0+}^h(u b^{-1}) \in L_1([0,T])$. Here $D_{0+}^h$ stands for the fractional derivative, see Definition 3.
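A discrete analogue of Lemma 7 can be checked directly: on a grid, the composition in (14.35) is a product of a diagonal operator, a lower-triangular convolution, and a cumulative sum, and each layer can be undone in turn. The sketch below uses smooth, nonsingular illustrative coefficients of our own choosing (not the singular Sonine setting of the text):

```python
import math

# Build K f(t) = int_0^t b(u) int_0^u a(s) c(u-s) f(s) ds du on a midpoint grid,
# then invert layer by layer: difference/divide by b, forward-substitute the
# triangular convolution with c, divide by a.
N, T = 400, 1.0
d = T / N
grid = [(j + 0.5) * d for j in range(N)]
a = [1.0 + s for s in grid]
b = [2.0 for _ in grid]
c = lambda r: math.exp(-r)
f_true = [math.cos(s) for s in grid]

# forward pass
af = [a[j] * f_true[j] for j in range(N)]
y = [d * sum(c(grid[i] - grid[j]) * af[j] for j in range(i + 1)) for i in range(N)]
U, acc = [], 0.0
for i in range(N):
    acc += d * b[i] * y[i]
    U.append(acc)

# inverse pass
y_rec = [(U[i] - (U[i - 1] if i else 0.0)) / (d * b[i]) for i in range(N)]
z = [0.0] * N
for i in range(N):
    z[i] = (y_rec[i] - d * sum(c(grid[i] - grid[j]) * z[j] for j in range(i))) / (d * c(0.0))
f_rec = [z[i] / a[i] for i in range(N)]
err = max(abs(f_rec[i] - f_true[i]) for i in range(N))
print(err)
```

Since the inverse pass undoes exactly the discrete forward operators, $f$ is recovered up to floating-point round-off; the continuous inversion formula (14.36) plays the same role via the Sonine pair $(c,h)$.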


*Proof* According to (14.35), $b(t) I_{0+}^c(af)(t) = u(t)$ a.e., or
$$
I_{0+}^c(af)(t) = b^{-1}(t) u(t). \qquad (14.37)
$$
Assume that $af \in L_1([0,T])$ and apply Lemma 10, item (i), to (14.37). As a result, we arrive at $af(t) = D_{0+}^h(b^{-1}u)(t)$, and the proof follows. □

As already mentioned, condition (C1) is sufficient for the existence of the process $Y$. However, in order to guarantee its Hölder continuity, a stronger assumption is required. The following proposition summarizes the results in Lemma 1 and Theorem 3 of [12].

**Proposition 3**
1. Let the coefficients $a$, $b$, $c$ satisfy the assumption
(C4) $a \in L_p([0,T])$, $b \in L_q([0,T])$, $c \in L_r([0,T])$, where $p \ge 2$, $q, r \ge 1$, $\frac1p + \frac1r \le \frac12$, and $\frac1p + \frac1q + \frac1r \le 1 + \varepsilon$ for some $\varepsilon \in (0, 1/2)$.
Then the stochastic process $Y$ has a modification satisfying the Hölder condition up to order $\nu = \frac32 - \frac1p - \frac1q - \frac1r > 1/2 - \varepsilon$.
2. Let the coefficients $a$, $b$, $c$ satisfy the assumption
(C5) for any $t_1 \ge 0$, $t_2 \ge 0$, $t_1 + t_2 < T$: $a \in L_p([0,T]) \cap L_{p_1}([t_1,T])$, where $2 \le p \le p_1$; $b \in L_q([0,T]) \cap L_{q_1}([t_1+t_2,T])$, where $1 < q \le q_1$; $c \in L_r([0,T]) \cap L_{r_1}([t_2,T])$, where $1 \le r \le r_1$; $\frac1p + \frac1q + \frac1r \le \frac32$; and
$$
\frac{1}{q_1} + \max\left\{ \frac12,\ \frac1p + \frac{1}{r_1},\ \frac{1}{p_1} + \frac1r \right\} < 1.
$$
Then the process $Y$ on any interval $[t_1+t_2, T]$ has a modification that satisfies the Hölder condition up to order
$$
\mu = \frac32 - \frac{1}{q_1} - \max\left\{ \frac12,\ \frac1p + \frac{1}{r_1},\ \frac{1}{p_1} + \frac1r \right\} > 1/2.
$$

In [12] details about the value of these conditions for fractional Brownian motion are given. Briefly, condition (C4) supplies its Hölder property up to order $1/2$, and condition (C5) supplies the Hölder property up to order $H$ on any interval separated from zero.

(b) In the present paper, we consider the kernel (14.16) with Hurst index $H \in (0, 1/2)$. Then we introduce its generalization in the form
$$
\hat{K}(t,s) = \hat{a}(s)\left[ \hat{b}(t)\hat{c}(t-s) - \int_s^t \hat{b}'(u)\hat{c}(u-s)\,du \right], \qquad (14.38)
$$
where $\hat{a}, \hat{b}, \hat{c} : [0,T] \to \mathbb{R}$ are measurable functions. In what follows, we assume that the following conditions hold.


(Ĉ1) The function $\hat{a}$ is nondecreasing, $\hat{b}$ is absolutely continuous, $\hat{a}\hat{b}$ is bounded, $\hat{c} \in L_2([0,T])$, and
$$
A(T) = \int_0^T \int_0^T |\hat{b}'(u)|\,|\hat{b}'(z)| \int_0^{u \wedge z} \hat{a}^2(s)\,|\hat{c}(u-s)|\,|\hat{c}(z-s)|\,ds\,du\,dz < \infty.
$$
(Ĉ2) The functions $\hat{a}$, $\hat{b}$ are positive a.e. on $[0,T]$.
(Ĉ3) The function $\hat{c}$ creates a Sonine pair with some $\hat{h} \in L_1([0,T])$.

**Remark 4** A sufficient condition for (Ĉ1) is
$$
(\text{Ĉ1}')\qquad \int_0^T |\hat{b}'(u)| \left( \int_0^u \hat{a}^2(z)\,\hat{c}^2(u-z)\,dz \right)^{1/2} du < \infty.
$$
Indeed, under (Ĉ1′), by the Cauchy–Schwarz inequality,
$$
\int_0^T \int_0^T |\hat{b}'(u)|\,|\hat{b}'(z)| \int_0^{u\wedge z} \hat{a}^2(s)\,|\hat{c}(u-s)|\,|\hat{c}(z-s)|\,ds\,du\,dz
\le \int_0^T \int_0^T |\hat{b}'(u)|\,|\hat{b}'(z)| \left( \int_0^{u\wedge z} \hat{a}^2(s)\hat{c}^2(u-s)\,ds \right)^{1/2} \left( \int_0^{u\wedge z} \hat{a}^2(s)\hat{c}^2(z-s)\,ds \right)^{1/2} du\,dz
$$
$$
\le \left( \int_0^T |\hat{b}'(u)| \left( \int_0^u \hat{a}^2(s)\hat{c}^2(u-s)\,ds \right)^{1/2} du \right)^2 < \infty.
$$

=C

1 2

−H



T 0

u H− 2

3

⎞ 21 ⎛ u  ⎝ s 1−2H (u − s)2H −1 ds ⎠ du 0

300

G. Di Nunno et al.

=C

1 2

−H



T

u H −1 du < ∞.

0

(t, ·) L 2 ([0,t]) < ∞ holds, and for Lemma 8 Under assumption ( C1), supt∈[0,T ] K  t = t K (t, s) dWs , t ∈ [0, T ], any Wiener process W = {Wt , t ∈ [0, T ]} a process Y 0 is well defined. Proof Obviously, (t, ·) 2L ([0,t]) K 2

t

ˆ2

≤ C b (t)

aˆ 2 (s)cˆ2 (t − s) ds 0

t +C

⎛ t ⎞2  ˆ c(u aˆ 2 (s) ⎝ b(u) ˆ − s) du ⎠ ds. s

0

If aˆ bˆ is bounded, aˆ is nondecreasing, and cˆ ∈ L 2 ([0, T ]), then ˆ2

t

b (t)

ˆ2

t

aˆ (s)cˆ (t − s) ds ≤ b (t)aˆ (t) 2

2

0

cˆ2 (t − s) ds

2

0

 2 ≤ aˆ bˆ (t) c ˆ 2L 2 ([0,T ]) < ∞.

Furthermore, t

⎛ t ⎞2  ˆ c(u aˆ 2 (s) ⎝ b(u) ˆ − s) du ⎠ ds s

0

t

t aˆ (s)



2

0

 t t =

  c(u ˆ ˆ − s) du b(u)

s

t

  ˆ c(v ˆ − s) dv ds b(v)

s

u∧v    ˆb(u)b(v) ˆ ˆ − s) ds du dv ≤ A(T ) < ∞, ˆ − s) c(v aˆ 2 (s) c(u

0 0

and the proof follows.

0



Let us now consider the operator $\hat{K}$ associated with the kernel $\hat{K}(t,s)$ in (14.38) (similarly to the operator $K$ from (14.34) associated with $K(t,s)$). In this case $\hat{K}$ has the form
$$
\hat{K} f(t) = \int_0^t \hat{K}(t,s) f(s)\,ds = \hat{b}(t)\, I_{0+}^{\hat{c}}(\hat{a}f)(t) - \int_0^t (\hat{a}f)(s) \int_s^t \hat{b}'(u)\hat{c}(u-s)\,du\,ds, \quad f \in L_2([0,T]),
$$
and under the assumptions (Ĉ1)–(Ĉ3) we can apply the Fubini theorem and get
$$
\hat{K} f(t) = \hat{b}(t)\, I_{0+}^{\hat{c}}(\hat{a}f)(t) - \int_0^t \hat{b}'(u) \int_0^u \hat{c}(u-s)(\hat{a}f)(s)\,ds\,du = \int_0^t \hat{b}(u)\, \frac{d}{du} \int_0^u \hat{c}(u-z)(\hat{a}f)(z)\,dz\,du = \int_0^t \hat{b}(u)\, D_{0+}^{\hat{c}}(\hat{a}f)(u)\,du.
$$

Consider the following Gaussian process
$$
\hat{Y}_t = \int_0^t \hat{K}(t,s)\,dW_s, \quad t \in [0,T], \qquad (14.39)
$$
where $W = \{W_t, t \in [0,T]\}$ is a Wiener process. Under assumptions (Ĉ1)–(Ĉ3) it is well defined on $[0,T]$. Taking Lemma 10 from Appendix A into account, it is easy to establish, similarly to Lemma 7, the following result.

**Lemma 9** Consider the equation $\hat{K} f(t) = \int_0^t u(z)\,dz$, $t \in [0,T]$. Then its solution has the form $f(t) = \hat{a}^{-1}(t)\, I_{0+}^{\hat{h}}\big( \hat{b}^{-1} u \big)(t)$.

Furthermore, we prove the following result on the Hölder continuity of paths.

**Theorem 1** Let the conditions (Ĉ1)–(Ĉ3) hold, together with the following assumptions:
1. $\hat{a}(t)\,|\hat{b}'(t)| \le C t^{-1}$, $t \in [0,T]$;
2. there exists $\gamma \in (0,2)$ such that
$$
\int_0^t \hat{c}^2(s)\,ds \le C t^{\gamma}, \quad t \in [0,T], \qquad \int_0^{T-t} \big( \hat{c}(t+s) - \hat{c}(s) \big)^2 ds \le C t^{\gamma}, \quad t \in [0,T].
$$
Then the trajectories of the process $\hat{Y}$ satisfy the $\delta$-Hölder condition a.s. for any $\delta \in (0, \gamma/2)$.

**Remark 6** In the case when $\hat{a}(s) = C s^{\frac12-H}$, $\hat{b}(s) = \hat{c}(s) = s^{H-\frac12}$, we have that $\hat{a}(s)\,|\hat{b}'(s)| = C s^{-1}$, $\int_0^t \hat{c}^2(s)\,ds = C t^{2H}$, and
$$
\int_0^{T-t} \big( \hat{c}(t+s) - \hat{c}(s) \big)^2 ds = \int_0^{T-t} \Big( (t+s)^{H-\frac12} - s^{H-\frac12} \Big)^2 ds = t^{2H} \int_0^{(T-t)/t} \Big( (1+z)^{H-\frac12} - z^{H-\frac12} \Big)^2 dz < t^{2H} \int_0^{\infty} \Big( (1+z)^{H-\frac12} - z^{H-\frac12} \Big)^2 dz \le C t^{2H},
$$
since $(1+z)^{H-\frac12} - z^{H-\frac12} \sim C z^{H-\frac32}$ as $z \to \infty$, and so $\int_0^{\infty} \big( (1+z)^{H-\frac12} - z^{H-\frac12} \big)^2 dz \le C$. Therefore, in this case we can put $\gamma = 2H$, and the assumptions 1–2 of Theorem 1 hold.
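The two bounds of Remark 6 are easy to probe numerically. The sketch below (our own check; helper names are not from the text) estimates the growth exponent of $\int_0^{T-t}(\hat{c}(t+s)-\hat{c}(s))^2\,ds$ in $t$ for $\hat{c}(s)=s^{H-\frac12}$ with $H<\frac12$; it should approach $\gamma = 2H$.

```python
import math

H, T = 0.3, 1.0
chat = lambda s: s ** (H - 0.5)

def increment_integral(t, n=400_000):
    """Midpoint approximation of int_0^{T-t} (chat(t+s) - chat(s))^2 ds;
    the s^{2H-1} singularity at s = 0 is integrable."""
    w = (T - t) / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * w
        total += (chat(t + s) - chat(s)) ** 2 * w
    return total

e1 = math.log(increment_integral(0.02) / increment_integral(0.01)) / math.log(2.0)
e2 = 2 * H   # exact exponent of the first bound, int_0^t chat^2 = C t^{2H}
print(e1, e2)
```

For small $t$ the estimated exponent matches $2H$, consistent with choosing $\gamma = 2H$ in Theorem 1.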

*Proof* For $t_1 < t_2$,
$$
\mathbf{E}\big( \hat{Y}_{t_2} - \hat{Y}_{t_1} \big)^2 = \mathbf{E}\left( \int_0^{t_1} \big( \hat{K}(t_2,s) - \hat{K}(t_1,s) \big)\,dW_s + \int_{t_1}^{t_2} \hat{K}(t_2,s)\,dW_s \right)^2 = \int_0^{t_1} \big( \hat{K}(t_2,s) - \hat{K}(t_1,s) \big)^2 ds + \int_{t_1}^{t_2} \hat{K}^2(t_2,s)\,ds
$$
$$
\le 2\Bigg( \int_0^{t_1} \hat{a}^2(s) \big( \hat{b}(t_2)\hat{c}(t_2-s) - \hat{b}(t_1)\hat{c}(t_1-s) \big)^2 ds + \hat{b}^2(t_2) \int_{t_1}^{t_2} \hat{a}^2(s)\hat{c}^2(t_2-s)\,ds
$$
$$
+ \int_0^{t_1} \hat{a}^2(s) \left( \int_{t_1}^{t_2} \hat{b}'(u)\hat{c}(u-s)\,du \right)^2 ds + \int_{t_1}^{t_2} \hat{a}^2(s) \left( \int_s^{t_2} \hat{b}'(u)\hat{c}(u-s)\,du \right)^2 ds \Bigg) =: 2(I_1 + I_2 + I_3 + I_4).
$$
Let us show that each term in the right-hand side is bounded by $C(t_2-t_1)^{\gamma}$. We make the analysis term by term.

1. The first term can be rewritten as follows:
$$
I_1 = \int_0^{t_1} \hat{a}^2(s) \big( \hat{b}(t_2)\hat{c}(t_2-s) - \hat{b}(t_1)\hat{c}(t_1-s) \big)^2 ds \le 2\hat{b}^2(t_2) \int_0^{t_1} \hat{a}^2(s) \big( \hat{c}(t_2-s) - \hat{c}(t_1-s) \big)^2 ds + 2\big( \hat{b}(t_2) - \hat{b}(t_1) \big)^2 \int_0^{t_1} \hat{a}^2(s)\hat{c}^2(t_1-s)\,ds =: J_1 + J_2. \qquad (14.40)
$$
The first term in the right-hand side of (14.40) is bounded as follows:
$$
\hat{b}^2(t_2) \int_0^{t_1} \hat{a}^2(s) \big( \hat{c}(t_2-s) - \hat{c}(t_1-s) \big)^2 ds \le \hat{b}^2(t_2)\hat{a}^2(t_2) \int_0^{t_1} \big( \hat{c}(t_2-s) - \hat{c}(t_1-s) \big)^2 ds \le C \int_0^{t_1} \big( \hat{c}(t_2-t_1+z) - \hat{c}(z) \big)^2 dz \le C(t_2-t_1)^{\gamma},
$$
and the second one can be bounded as follows:
$$
\big( \hat{b}(t_2) - \hat{b}(t_1) \big)^2 \int_0^{t_1} \hat{a}^2(s)\hat{c}^2(t_1-s)\,ds = \int_0^{t_1} \hat{a}^2(s)\hat{c}^2(t_1-s) \left( \int_{t_1}^{t_2} \hat{b}'(v)\,dv \right)^2 ds \le \int_0^{t_1} \hat{c}^2(t_1-s) \left( \int_{t_1}^{t_2} \hat{a}(v)\,|\hat{b}'(v)|\,dv \right)^2 ds \le C t_1^{\gamma} \left( \int_{t_1}^{t_2} v^{-1}\,dv \right)^2
$$
$$
= C \left( t_1^{\gamma/2} \int_{t_1}^{t_2} v^{-1}\,dv \right)^2 \le C \left( \int_{t_1}^{t_2} v^{\gamma/2-1}\,dv \right)^2 \le C \big( t_2^{\gamma/2} - t_1^{\gamma/2} \big)^2 \le C(t_2-t_1)^{\gamma}.
$$
Here we have used the monotonicity of $\hat{a}$ and then assumptions 1 and 2 of the theorem.

2. The second term can be bounded with the help of condition (Ĉ1) and assumption 2:
$$
I_2 = \hat{b}^2(t_2) \int_{t_1}^{t_2} \hat{a}^2(s)\hat{c}^2(t_2-s)\,ds \le \hat{a}^2(t_2)\hat{b}^2(t_2) \int_{t_1}^{t_2} \hat{c}^2(t_2-s)\,ds \le C \int_0^{t_2-t_1} \hat{c}^2(z)\,dz \le C(t_2-t_1)^{\gamma}.
$$

3. By Fubini's theorem and the monotonicity of $\hat{a}$, the third term can be estimated as follows:
$$
I_3 = \int_0^{t_1} \hat{a}^2(s) \left( \int_{t_1}^{t_2} \hat{b}'(u)\hat{c}(u-s)\,du \right)^2 ds = \int_{t_1}^{t_2} \int_{t_1}^{t_2} \hat{b}'(u)\hat{b}'(v) \int_0^{t_1} \hat{a}^2(s)\hat{c}(u-s)\hat{c}(v-s)\,ds\,du\,dv \le \int_{t_1}^{t_2} \int_{t_1}^{t_2} \hat{a}(u)\,|\hat{b}'(u)|\,\hat{a}(v)\,|\hat{b}'(v)| \int_0^{t_1} |\hat{c}(u-s)\hat{c}(v-s)|\,ds\,du\,dv.
$$
Then, applying successively assumption 1, the Cauchy–Schwarz inequality and assumption 2, we obtain:
$$
I_3 \le C \int_{t_1}^{t_2} \int_{t_1}^{t_2} u^{-1} v^{-1} \left( \int_0^{t_1} \hat{c}^2(u-s)\,ds \right)^{1/2} \left( \int_0^{t_1} \hat{c}^2(v-s)\,ds \right)^{1/2} du\,dv \le C \int_{t_1}^{t_2} \int_{t_1}^{t_2} u^{\gamma/2-1} v^{\gamma/2-1}\,du\,dv \le C \big( t_2^{\gamma/2} - t_1^{\gamma/2} \big)^2 \le C(t_2-t_1)^{\gamma}.
$$

4. The fourth term can be bounded similarly to the third one:
$$
I_4 = \int_{t_1}^{t_2} \hat{a}^2(s) \left( \int_s^{t_2} \hat{b}'(u)\hat{c}(u-s)\,du \right)^2 ds = \int_{t_1}^{t_2} \int_{t_1}^{t_2} \hat{b}'(u)\hat{b}'(v) \int_{t_1}^{u\wedge v} \hat{a}^2(s)\hat{c}(u-s)\hat{c}(v-s)\,ds\,du\,dv
$$
$$
\le C \int_{t_1}^{t_2} \int_{t_1}^{t_2} u^{-1} v^{-1} \left( \int_{t_1}^{u} \hat{c}^2(u-s)\,ds \right)^{1/2} \left( \int_{t_1}^{v} \hat{c}^2(v-s)\,ds \right)^{1/2} du\,dv \le C \int_{t_1}^{t_2} \int_{t_1}^{t_2} u^{-1} v^{-1} (u-t_1)^{\gamma/2}(v-t_1)^{\gamma/2}\,du\,dv
$$
$$
\le C \int_{t_1}^{t_2} \int_{t_1}^{t_2} (u-t_1)^{\gamma/2-1}(v-t_1)^{\gamma/2-1}\,du\,dv \le C(t_2-t_1)^{\gamma}.
$$

Combining the bounds, we get $\mathbf{E}\big( \hat{Y}_{t_2} - \hat{Y}_{t_1} \big)^2 \le C(t_2-t_1)^{\gamma}$, whence the result follows. □

14.4 Equations with Locally Lipschitz Drift of Linear Growth

In this section we study stochastic differential equations with additive Volterra–Lévy noise. The noise considered has the Hölder regularity of paths discussed in the first part of this work. We shall adopt pathwise considerations and, for this reason, we start the study with deterministic equations, then we move on to discuss the stochastic cases. Let $T > 0$ be fixed, and let $f = f(t)$, $t \in [0,T]$, and the coefficient $u = u(x)$, $x \in \mathbb{R}$, be measurable functions. Introduce the equation
$$
X_t = \int_0^t u(X_s)\,ds + f(t), \quad t \in [0,T], \quad X|_{t=0} = X_0 \in \mathbb{R}. \qquad (14.41)
$$
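An equation of the form (14.41) can be solved pathwise by an Euler scheme once a realization of $f$ is fixed. A minimal sketch (our own illustration, assuming the simple drift $u(x)=-x$ and the constant forcing $f \equiv 1$, for which the exact solution is $X_t = e^{-t}$):

```python
import math

def euler(u, f, T=1.0, n=100_000):
    """Euler scheme for X_t = int_0^t u(X_s) ds + f(t); note X_0 = f(0)
    is forced by the equation itself."""
    d = T / n
    x = f(0.0)
    for k in range(n):
        t = k * d
        x = x + u(x) * d + (f(t + d) - f(t))
    return x

x_T = euler(lambda x: -x, lambda t: 1.0, T=1.0)
print(x_T, math.exp(-1.0))
```

With a Volterra–Lévy realization in place of the constant $f$, the same recursion applies path by path, which is exactly the pathwise viewpoint adopted here.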

This equation is studied in Appendix B. Now let us return to Eq. (14.1), that is, let us consider the Volterra–Lévy process $Y_t = \int_0^t g(t,s)\,dZ_s$ instead of the deterministic function $f$. According to Lemma 11 in Appendix B, in order to obtain the existence and uniqueness of a solution, it suffices to establish either local integrability or local boundedness of $Y$.

First, we study sufficient conditions for integrability. Namely, we present conditions supplying $\mathbf{E}\int_0^T |Y_t|\,dt < \infty$. If the assumption (A1) holds, then by (14.3),
$$
\mathbf{E}\int_0^T |Y_t|\,dt \le \int_0^T \big( \mathbf{E}|Y_t|^p \big)^{1/p}\,dt \le C \left( \int_{\mathbb{R}} |x|^p\,\pi(dx) \right)^{1/p} \int_0^T \|g(t,\cdot)\|_{L_p([0,t])}\,dt,
$$
therefore, a sufficient condition for integrability is $\int_0^T \|g(t,\cdot)\|_{L_p([0,t])}\,dt < \infty$. Similarly, if the assumption (A2) holds, then using (14.4) we get
$$
\mathbf{E}\int_0^T |Y_t|\,dt \le C a^{1/2} \int_0^T \|g(t,\cdot)\|_{L_2([0,t])}\,dt + C \left( \int_{\mathbb{R}} |x|^p\,\pi(dx) \right)^{1/p} \int_0^T \|g(t,\cdot)\|_{L_p([0,t])}\,dt.
$$
Since $p \ge 2$, we see that again the sufficient condition for integrability has the form $\int_0^T \|g(t,\cdot)\|_{L_p([0,t])}\,dt < \infty$. In the Gaussian case the second term vanishes, hence a weaker condition is required, namely $\int_0^T \|g(t,\cdot)\|_{L_2([0,t])}\,dt < \infty$.

Now let the kernel $g$ satisfy the assumption (B1). Then
$$
\int_0^T \|g(t,\cdot)\|_{L_p([0,t])}\,dt \le C \int_0^T t^{\alpha} \left( \int_0^t s^{\beta p}(t-s)^{\gamma p}\,ds \right)^{1/p} dt \le C \int_0^T t^{\alpha+\beta+\gamma+\frac1p}\,dt,
$$
where we have used the equality $\int_0^t s^{\beta p}(t-s)^{\gamma p}\,ds = B(\beta p + 1, \gamma p + 1)\, t^{\beta p + \gamma p + 1}$ (assuming that $\beta > -\frac1p$, $\gamma > -\frac1p$). Consequently, under the assumption (B1) the condition $\int_0^T \|g(t,\cdot)\|_{L_p([0,t])}\,dt < \infty$ holds if $\alpha+\beta+\gamma+\frac1p > -1$. Similarly to Lemmas 4–6, we can consider three cases. Thus, we arrive at the following result.

**Theorem 2** Assume that one of the following assumptions holds:
1. $p \ge 1$, $a = 0$, $\int_{\mathbb{R}} |x|^p\,\pi(dx) < \infty$, and the condition (B1) holds with some $\alpha \in \mathbb{R}$, $\beta > -\frac1p$, $\gamma > -\frac1p$ such that $\alpha+\beta+\gamma > -\frac1p - 1$;
2. $p \ge 2$, $\int_{\mathbb{R}} |x|^p\,\pi(dx) < \infty$, and the condition (B1) holds with some $\alpha \in \mathbb{R}$, $\beta > -\frac1p$, $\gamma > -\frac1p$ such that $\alpha+\beta+\gamma > -\frac1p - 1$;
3. $Z$ is a Brownian motion and the condition (B1) holds with $p = 2$, $\alpha \in \mathbb{R}$, $\beta > -\frac12$, $\gamma > -\frac12$ such that $\alpha+\beta+\gamma > -\frac32$.
Then $\mathbf{E}\int_0^T |Y_t|\,dt < \infty$. Consequently, if the coefficient $u$ satisfies the assumption (D1) 1) of Lemma 11, then Eq. (14.1) has a unique solution.

Now we adapt the condition (D2) 3) of Lemma 11 to the stochastic case. Since continuity is a sufficient condition for local boundedness, we obtain the following corollary of Lemmas 4–6.

**Theorem 3** Assume that one of the following assumptions holds:
1. $p \ge 1$, $a = 0$, $\int_{\mathbb{R}} |x|^p\,\pi(dx) < \infty$, and the conditions (B1) and (B2) hold with some $\alpha \in \mathbb{R}$, $\beta > -\frac1p$, $\gamma > -\frac1p$, $\delta > \frac1p$ such that $\alpha+\beta+\gamma > -\frac1p$, $\kappa > 0$;
2. $p \ge 2$, $\int_{\mathbb{R}} |x|^p\,\pi(dx) < \infty$, and the conditions (B1) and (B2) hold with some $\alpha \in \mathbb{R}$, $\beta > -\frac1p$, $\gamma > -\frac1p$, $\delta > \frac1p$ such that $\alpha+\beta+\gamma > -\frac1p$, $\kappa > 0$;
3. $Z$ is a Brownian motion and the conditions (B1) and (B2) hold with $p = 2$, $\alpha \in \mathbb{R}$, $\beta > -\frac12$, $\gamma > -\frac12$, $\delta > 0$ such that $\alpha+\beta+\gamma > -\frac12$, $\kappa > -\frac12$.
Then $Y$ has a.s. continuous (hence, locally bounded) sample paths. Consequently, if the coefficient $u$ satisfies the assumptions (D2) 1), 2) of Lemma 11, then Eq. (14.1) has a unique solution.
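The exponent bookkeeping behind Theorems 2 and 3 is mechanical and easy to encode. The following helper (our own naming, not from the text) evaluates $\kappa$ from (14.15) and the two sets of conditions:

```python
def kappa(alpha, beta, gamma):
    """kappa(alpha, beta, gamma) as in (14.15)."""
    return alpha + beta + gamma if alpha + beta < 0 else gamma

def unique_solution_conditions(alpha, beta, gamma, delta, p, brownian=False):
    """Return (integrability_ok, continuity_ok) per Theorems 2 and 3."""
    k = kappa(alpha, beta, gamma)
    if brownian:  # case 3, p = 2
        integ = beta > -0.5 and gamma > -0.5 and alpha + beta + gamma > -1.5
        cont = (beta > -0.5 and gamma > -0.5 and delta > 0
                and alpha + beta + gamma > -0.5 and k > -0.5)
    else:         # cases 1 and 2
        integ = beta > -1 / p and gamma > -1 / p and alpha + beta + gamma > -1 / p - 1
        cont = (beta > -1 / p and gamma > -1 / p and delta > 1 / p
                and alpha + beta + gamma > -1 / p and k > 0)
    return integ, cont

# Molchan-Golosov kernel with H = 0.7: alpha = H - 1/2, beta = 1/2 - H, gamma = H - 1/2
H = 0.7
print(unique_solution_conditions(H - 0.5, 0.5 - H, H - 0.5, H, p=2, brownian=True))
```

For the Molchan–Golosov exponents with $H = 0.7$, $\alpha + \beta = 0$, so $\kappa = \gamma = 0.2$, and both sufficient conditions in the Brownian case hold.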
We remark that there seem to be no general results on solutions of stochastic differential equations (14.1) with Volterra–Lévy noise without some form of Lipschitz continuity assumptions. There are instead some papers dealing with classes of such equations, also with exploding drift. We refer, e.g., to [3] for a short survey and the study of a class of such equations. In the next section we address another class of equations without Lipschitz drift. We focus on Volterra–Gaussian processes. The particular case of fractional Brownian motion was considered in [14].

14.5 Equations with Volterra–Gaussian Processes

Now our goal is to consider equations with additive noise represented by various Volterra–Gaussian processes, some of which were introduced in [12]. Our aim is to relax the conditions on the drift coefficient, in a similar fashion to what was done in the paper [14]. Remark that, in [14], the noise was a fractional Brownian motion, while here we deal with more general noise.

14.5.1 Girsanov Theorem. Definition of Weak and Strong Solutions

Let $\{\mathcal{F}_t^V, t \in [0,T]\}$ denote the natural filtration of $V$, where $V$ can be either $Y$ defined by (14.32) and (14.33), or it can be $\hat{Y}$ defined by (14.39) and (14.38). For some process $u = \{u_t, t \in [0,T]\}$ with integrable trajectories, denote
$$
z(s) = a^{-1}(s)\, D_{0+}^h\big( u b^{-1} \big)(s), \qquad \hat{z}(s) = \hat{a}^{-1}(s)\, I_{0+}^{\hat{h}}\big( \hat{b}^{-1} u \big)(s),
$$
and let, respectively,
$$
\xi_T = \exp\left\{ -\int_0^T z(s)\,dW_s - \frac12 \int_0^T z^2(s)\,ds \right\}, \qquad \hat{\xi}_T = \exp\left\{ -\int_0^T \hat{z}(s)\,dW_s - \frac12 \int_0^T \hat{z}^2(s)\,ds \right\}.
$$

**Theorem 4** (1) Let the assumptions (C1)–(C3) hold, and let $u = \{u_t, t \in [0,T]\}$ be an $\mathcal{F}^Y$-adapted process with integrable trajectories. Consider the transformation
$$
V_0(t) = Y_t + \int_0^t u_s\,ds. \qquad (14.42)
$$
Assume that
1. $z \in L_2([0,T])$ a.s.,
2. $\mathbf{E}\xi_T = 1$.
Then $V_0$ can be represented as $V_0(t) = \int_0^t K(t,s)\,dB_s$, $t \in [0,T]$, where $B$ is an $\mathcal{F}^Y$-Wiener process under the new probability $\mathbf{P}_B$ defined by $d\mathbf{P}_B/d\mathbf{P} = \xi_T$.

(2) Let the assumptions (Ĉ1)–(Ĉ3) hold, and let $u = \{u_t, t \in [0,T]\}$ be an $\mathcal{F}^{\hat{Y}}$-adapted process with integrable trajectories. Consider the transformation
$$
\hat{V}_0(t) = \hat{Y}_t + \int_0^t u_s\,ds.
$$
Assume that
1. $\hat{z} \in L_2([0,T])$ a.s., and
2. $\mathbf{E}\hat{\xi}_T = 1$.
Then $\hat{V}_0$ can be represented as $\hat{V}_0(t) = \int_0^t \hat{K}(t,s)\,d\hat{B}_s$, $t \in [0,T]$, where $\hat{B}$ is an $\mathcal{F}^{\hat{Y}}$-Wiener process under the new probability $\mathbf{P}_{\hat{B}}$ defined by $d\mathbf{P}_{\hat{B}}/d\mathbf{P} = \hat{\xi}_T$.

*Proof* Let us prove only (1), since both statements are proved similarly. Inserting (14.32) into (14.42) yields $V_0(t) = \int_0^t K(t,s)\,dW_s + \int_0^t u_s\,ds = \int_0^t K(t,s)\,dB_s$, where $B_t = W_t + \int_0^t K^{-1}\big( \int_0^{\cdot} u_s\,ds \big)(r)\,dr$. Using (14.36), we get
$$
B_t = W_t + \int_0^t a^{-1}(r)\, D_{0+}^h\big( u b^{-1} \big)(r)\,dr.
$$
Finally, by the standard Girsanov theorem, $B$ is an $\mathcal{F}^Y$-Wiener process under the probability $\mathbf{P}_B$. □

In the sequel, we study the two stochastic differential equations
$$
X_t = x + V_t + \int_0^t u(s, X_s)\,ds, \quad t \in [0,T], \qquad (14.43)
$$
where $x \in \mathbb{R}$, $u : [0,T] \times \mathbb{R} \to \mathbb{R}$ is a measurable function, and $V = Y, \hat{Y}$, where $Y$ is defined by (14.32) and (14.33), while $\hat{Y}$ is defined by (14.39) and (14.38). We shall consider both strong and weak solutions according to the definition below.

**Definition 2** (i) By a weak solution of Eq. (14.43) we mean a couple of processes $(V, X)$ on the filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}^V, \mathbf{P})$, such that

$$
V_t = \int_0^t K(t,s)\,dW_s \quad \text{or} \quad V_t = \int_0^t \hat{K}(t,s)\,dW_s, \qquad (14.44)
$$
respectively, with some Wiener process $W$, and $(V, X)$ satisfy (14.43).
(ii) By a strong solution of Eq. (14.43) we understand a process $X$ on $(\Omega, \mathcal{F}, \mathbb{F}^V, \mathbf{P})$, where $V$ is of the form (14.44) with the fixed Wiener process $W$.

14.5.2 Weak Existence and Weak Uniqueness

Let the coefficients $a$, $b$, $c$ satisfy the assumptions (C1)–(C4). Then, according to Proposition 3, the stochastic process $Y$ has a modification satisfying the Hölder condition up to some order $\nu \in (0, 1/2)$.

**Theorem 5** (i) Assume that $u(s,x)$ satisfies the sublinear growth condition: there exist $0 < \alpha < 1$ and $C > 0$ such that
$$
|u(t,x)| \le C(1 + |x|^{\alpha}), \qquad (14.45)
$$
and the Hölder condition in space and time: there exist $0 < \beta \le 1$, $0 < \gamma < 1$ and $C > 0$ such that for any $s, t \in [0,T]$ and any $x, y \in \mathbb{R}$,
$$
|u(t,x) - u(s,y)| \le C\big( |t-s|^{\beta} + |y-x|^{\gamma} \big).
$$
Additionally to (C1)–(C4), let the functions $a$, $b$ and $h$ satisfy the following assumption: there exist $C > 0$ and $\nu' \in (0, \nu)$ such that
$$
\int_0^T a^{-2}(s) h^2(s) b^{-2}(s)\,ds \le C,
$$
$$
\int_0^T a^{-2}(t) \left( \int_0^t \big| h'(t-r) \big|\, \big| b^{-1}(t) - b^{-1}(r) \big|\,dr \right)^2 dt \le C,
$$
$$
\int_0^T a^{-2}(t) b^{-2}(t) \left( \int_0^t \big| h'(t-r) \big|\, (t-r)^{\beta}\,dr \right)^2 dt \le C,
$$
$$
\int_0^T a^{-2}(t) b^{-2}(t) \left( \int_0^t \big| h'(t-r) \big|\, (t-r)^{\gamma\nu'}\,dr \right)^2 dt \le C. \qquad (14.46)
$$
Then Eq. (14.43) with $V = Y$ has a unique weak solution.

(ii) Assume that $u(s,x)$ satisfies the sublinear growth condition (14.45) and, additionally to (Ĉ1)–(Ĉ3), the functions $\hat{a}$, $\hat{b}$ and $\hat{h}$ satisfy the following assumption: there exists $C > 0$ such that
$$
\hat{a}^{-1}(s) \int_0^s \big| \hat{h}(s-r) \big|\, \big| \hat{b}^{-1}(r) \big|\,dr \le C. \qquad (14.47)
$$
Then Eq. (14.43) with $V = \hat{Y}$ has a unique weak solution.

**Remark 7** Let us check the conditions (14.46) and (14.47) in the case when $V$ is a fractional Brownian motion.
(i) Let $H > \frac12$, $a(s) = s^{\frac12-H}$, $b(s) = s^{H-\frac12}$, $c(s) = s^{H-\frac32}$, $h(s) = s^{\frac12-H}$. Then,

$$
\int_0^T a^{-2}(s) h^2(s) b^{-2}(s)\,ds = \int_0^T s^{1-2H}\,ds = (2-2H)^{-1} T^{2-2H};
$$
$$
\int_0^T a^{-2}(t) \left( \int_0^t \big| h'(t-r) \big|\, \big| b^{-1}(t) - b^{-1}(r) \big|\,dr \right)^2 dt = C \int_0^T t^{2H-1} \left( \int_0^t (t-r)^{-\frac12-H} \big( r^{\frac12-H} - t^{\frac12-H} \big)\,dr \right)^2 dt
$$
$$
= C \int_0^T t^{2H-1} \cdot t^{-1-2H} \cdot t^{1-2H} \cdot t^2 \left( \int_0^1 (1-r)^{-\frac12-H} \big( r^{\frac12-H} - 1 \big)\,dr \right)^2 dt = C T^{2-2H} \left( \int_0^1 (1-r)^{-\frac12-H} \big( r^{\frac12-H} - 1 \big)\,dr \right)^2.
$$
The integral $\int_0^1 (1-r)^{-\frac12-H}\big( r^{\frac12-H} - 1 \big)\,dr$ is finite, since around zero $(1-r)^{-\frac12-H}\big( r^{\frac12-H} - 1 \big) \sim r^{\frac12-H} - 1$, and around $1$, $(1-r)^{-\frac12-H}\big( r^{\frac12-H} - 1 \big) \sim C(1-r)^{\frac12-H}$. Further,
$$
\int_0^T a^{-2}(t) b^{-2}(t) \left( \int_0^t \big| h'(t-r) \big|\,(t-r)^{\beta}\,dr \right)^2 dt = C \int_0^T \left( \int_0^t (t-r)^{-\frac12-H+\beta}\,dr \right)^2 dt \le C
$$
if $-\frac12 - H + \beta > -1$, i.e., $\beta > H - \frac12$. Finally,
$$
\int_0^T a^{-2}(t) b^{-2}(t) \left( \int_0^t \big| h'(t-r) \big|\,(t-r)^{\gamma\nu'}\,dr \right)^2 dt = C \int_0^T \left( \int_0^t (t-r)^{-\frac12-H+\gamma\nu'}\,dr \right)^2 dt \le C
$$
if $-\frac12 - H + \gamma\nu' > -1$, i.e., $\gamma\nu' > H - \frac12$. But in this case $\nu'$ can be any number from $0$ to $H$; therefore, the condition $\gamma\nu' > H - \frac12$ holds if $\gamma H > H - \frac12$, or $\gamma > 1 - \frac{1}{2H}$. Therefore the assumptions (14.46) hold for $\beta > H - \frac12$, $\gamma > 1 - \frac{1}{2H}$.

(ii) Let $H < \frac12$. Then $\hat{a}(s) = C s^{\frac12-H}$, $\hat{b}(s) = \hat{c}(s) = s^{H-\frac12}$, $\hat{h}(s) = s^{-\frac12-H}$, therefore
$$
\hat{a}^{-1}(s) \int_0^s \big| \hat{h}(s-r) \big|\, \big| \hat{b}^{-1}(r) \big|\,dr = C s^{H-\frac12} \int_0^s (s-r)^{-\frac12-H} r^{\frac12-H}\,dr = C s^{\frac12-H} \le C,
$$

0

so (14.47) holds.

Proof First, we give some upper bounds for $z(s)$ and $\hat z(s)$ in order to confirm that the theorem assumptions supply the Novikov condition for $\xi_T$ and $\tilde\xi_T$, and therefore $\xi_T$ and $\tilde\xi_T$ satisfy Theorem 4. The proofs of (i) and (ii) are then similar, so we continue only with the second statement, dividing the proof into several steps; we refer to the paper [14] for additional detail. Concerning $z(s)$, by Lemma 10 (iii) we have that

$$z(s) = \big(a^{-1}hb^{-1}\big)(s)\,u(s, Y_s+x)
+ a^{-1}(s)\int_0^s \big(u(z, Y_z+x)\,b^{-1}(z) - u(s, Y_s+x)\,b^{-1}(s)\big)\,h'(s-z)\,dz
= J_1(s) + J_2(s).$$

Let us construct upper bounds for $J_1$ and $J_2$. Namely, we are interested in two integrals. First,

14 Stochastic Differential Equations Driven by Additive Volterra-Lévy …

$$\int_0^T J_1^2(s)\,ds \le C\Big(1+\sup_{0\le s\le T}|Y_s+x|^{2\alpha}\Big)\int_0^T \big(a^{-2}h^2b^{-2}\big)(s)\,ds
\le C\Big(1+\sup_{0\le s\le T}|Y_s|^{2\alpha}\Big), \tag{14.48}$$

according to the first assumption in (14.46). Second,

$$\int_0^T J_2^2(s)\,ds \le C\Big(1+\sup_{0\le s\le T}|Y_s+x|^{2\alpha}\Big)\int_0^T a^{-2}(s)\left(\int_0^s \big|b^{-1}(z)-b^{-1}(s)\big|\,|h'(s-z)|\,dz\right)^2 ds$$
$$+ \int_0^T \big(a^{-2}b^{-2}\big)(s)\left(\int_0^s |u(s, Y_s+x)-u(z, Y_z+x)|\,|h'(s-z)|\,dz\right)^2 ds = M_1 + M_2.$$

Obviously,

$$M_1 \le C\Big(1+\sup_{0\le s\le T}|Y_s|^{2\alpha}\Big), \tag{14.49}$$

according to the second assumption in (14.46). Concerning $M_2$, it admits the following upper bound:

$$M_2 \le C\int_0^T (ab)^{-2}(s)\left(\int_0^s (s-z)^{\beta}\,|h'(s-z)|\,dz\right)^2 ds
+ C\int_0^T (ab)^{-2}(s)\left(\int_0^s |Y_s-Y_z|^{\gamma}\,|h'(s-z)|\,dz\right)^2 ds = N_1 + N_2.$$

According to the third assumption in (14.46), $N_1 \le C$. Further, due to the fourth assumption from (14.46),

314

G. Di Nunno et al.

$$N_2 \le C\left(\sup_{0\le s<t\le T}\frac{|Y_t-Y_s|}{(t-s)^{\nu}}\right)^{2\gamma} < \infty \quad \text{a.s.}$$

Similarly, under assumption (14.47),

$$\hat z^{\,2}(s) \le C\Big(1+\sup_{0\le s\le T}|Y_s|^{2\alpha}\Big) \tag{14.50}$$

for any $s \in [0,T]$, and this inequality supplies the Novikov condition for $\tilde\xi_T$.

Now we continue with the proof of (ii). We consider the two cases of $V$.

(a) Together with Theorem 4, we can conclude that $\widetilde Y$ is a Volterra–Gaussian process of the form $\widetilde Y_t = \int_0^t \widehat K(t,s)\,d\widetilde B_s$, where $\widetilde B$ is a Wiener process with respect to the probability measure $\widetilde P$ defined by $d\widetilde P/dP = \tilde\xi_T$, where

$$\tilde\xi_T = \exp\left\{\int_0^T \hat z(s)\,dW_s - \frac12\int_0^T \hat z^{\,2}(s)\,ds\right\}.$$

It means that the couple $(\widetilde Y,\, Y+x)$ creates a weak solution of (14.43) with $V = \widetilde Y$.

(b) Now let us apply and modify the approach from [14] concerning the proof of uniqueness in law and pathwise uniqueness of equations with additive fractional


noise. Namely, consider any solution of the equation

$$X_t = x + \int_0^t u(s, X_s)\,ds + \widetilde Y_t,$$

where $\widetilde Y_t = \int_0^t \widehat K(t,s)\,dB_s$, $B$ is some Wiener process, and define

$$\hat z(s) = \hat a^{-1}(s)\int_0^s h(s-r)\,\hat b^{-1}(r)\,u(r, X_r)\,dr.$$

Note that $X \in C([0,T])$; therefore, due to the sublinear growth condition,

$$\sup_{0\le v\le r}|u(v, X_v)| \le C\Big(1+\sup_{0\le v\le r}|X_v|^{\alpha}\Big) < \infty \quad \text{a.s.}$$

Also, $\sup_{0\le t\le T}|\widetilde Y_t| < \infty$ a.s. Therefore, from the Gronwall inequality we get

$$\sup_{0\le t\le T}|X_t| \le \Big(|x| + \sup_{0\le t\le T}|\widetilde Y_t| + CT\Big)e^{CT},$$

and in turn this implies that, similarly to (14.50), under assumption (14.47),

$$\hat z^{\,2}(s) \le C\Big(1+\sup_{0\le t\le T}|X_t|^{2\alpha}\Big) \le C_1\Big(1+\sup_{0\le t\le T}|\widetilde Y_t|^{2\alpha}\Big)$$

for any $s \in [0,T]$. It means that with respect to the measure $\widetilde P$ such that

$$\frac{d\widetilde P_T}{dP_T} = \exp\left\{-\int_0^T \hat z(s)\,dB_s - \frac12\int_0^T \hat z^{\,2}(s)\,ds\right\}, \tag{14.51}$$

$X_t - x$ has the same distribution as the process $\int_0^t \widehat K(t,s)\,dV_s$, where $V$ is a Wiener process, $V_s = B_s + \int_0^s \hat z(u)\,du$, and the right-hand side of (14.51) indeed defines a probability measure. Further, for any bounded measurable functional $\Phi$ on $C([0,T])$,

$$E_P\,\Phi(X-x) = E_{\widetilde P}\left(\Phi(X-x)\,\frac{dP_T}{d\widetilde P_T}\right)
= E_{\widetilde P}\left(\Phi(X-x)\exp\left\{\int_0^T \hat z(s)\,dB_s + \frac12\int_0^T \hat z^{\,2}(s)\,ds\right\}\right)$$
$$= E_{\widetilde P}\left(\Phi(X-x)\exp\left\{\int_0^T \hat a^{-1}(s)\int_0^s h(s-r)\,\hat b^{-1}(r)\,u(r, X_r)\,dr\,dB_s
+ \frac12\int_0^T\left(\hat a^{-1}(s)\int_0^s h(s-r)\,\hat b^{-1}(r)\,u(r, X_r)\,dr\right)^2 ds\right\}\right)$$
$$= E_{\widetilde P}\left(\Phi(X-x)\exp\left\{\int_0^T \hat a^{-1}(s)\int_0^s h(s-r)\,\hat b^{-1}(r)\,u(r, X_r)\,dr\,dV_s
- \frac12\int_0^T\left(\hat a^{-1}(s)\int_0^s h(s-r)\,\hat b^{-1}(r)\,u(r, X_r)\,dr\right)^2 ds\right\}\right)$$
$$= E_{\widetilde P}\,\Phi\left(\int_0^{\cdot}\widehat K(\cdot,s)\,dV_s\right)
\exp\left\{\int_0^T \hat a^{-1}(s)\int_0^s h(s-r)\,\hat b^{-1}(r)\,u\Big(r,\, x+\int_0^r \widehat K(r,z)\,dV_z\Big)\,dr\,dV_s\right.$$
$$\left.-\ \frac12\int_0^T\left(\hat a^{-1}(s)\int_0^s h(s-r)\,\hat b^{-1}(r)\,u\Big(r,\, x+\int_0^r \widehat K(r,z)\,dV_z\Big)\,dr\right)^2 ds\right\}. \tag{14.52}$$

Since $V$ is a Wiener process with respect to $\widetilde P$, the right-hand side of (14.52) does not depend on the particular weak solution.

Taking (14.52) into account, we conclude that any two weak solutions have the same distribution; thus weak uniqueness is established. $\square$
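The change-of-measure argument above concerns equations of the form $X_t = x + \int_0^t u(s,X_s)\,ds + (\text{Volterra noise})$. As a purely illustrative numerical sketch (ours, not taken from the chapter), the following Euler-type discretization simulates such an equation, with the Volterra–Gaussian noise replaced for simplicity by a plain Brownian path and with an arbitrary bounded Lipschitz drift $u$:

```python
import numpy as np

# Illustrative only: Euler scheme for X_t = x + int_0^t u(s, X_s) ds + noise.
# The Brownian path Y below stands in for the Volterra noise of the chapter;
# u(s, x) = cos(x) is an arbitrary bounded Lipschitz choice, not prescribed there.
rng = np.random.default_rng(1)
n, T, x0 = 1000, 1.0, 0.5
dt = T / n

def u(s, x):
    return np.cos(x)  # bounded, Lipschitz in x

# cumulative Brownian path with Y_0 = 0
Y = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))))
X = np.empty(n + 1)
X[0] = x0
for k in range(n):
    X[k + 1] = X[k] + u(k * dt, X[k]) * dt + (Y[k + 1] - Y[k])
```

Replacing the increments of `Y` by those of a simulated fractional Brownian path would bring the sketch closer to the fBm cases of Remark 7, at the cost of generating correlated Gaussian increments.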

14.5.3 Pathwise Uniqueness of Weak Solution. Existence and Uniqueness of Strong Solution

Now we consider only the equation

$$X_t = x + Y_t + \int_0^t u(s, X_s)\,ds, \quad t \in [0,T], \tag{14.53}$$

where $x \in \mathbb R$, $u : [0,T]\times\mathbb R \to \mathbb R$ is a measurable function, and $Y$ is defined by (14.32) and (14.33).

Theorem 6 Let the coefficients $a, b, c$ satisfy assumptions (C1)–(C3) and (C5). Let also the coefficient $u(s,x)$ satisfy the conditions of item (i) of Theorem 5. Then any two weak solutions of Eq. (14.53) with the same Wiener process $W$ occurring in the representation of $Y$ coincide a.s.

Proof According to Proposition 3, condition (C5) implies that the process $Y$ on any interval $[\varepsilon, T]$, $0 < \varepsilon < T$, has a modification that satisfies a Hölder condition up to order

$$\mu = \tfrac32 - \tfrac1q - \max\Big\{\tfrac12,\ \tfrac1p+\tfrac1q,\ \tfrac1q+\tfrac1r\Big\} > \tfrac12.$$

So, consider any $0 < \varepsilon < T$, and on the interval $[\varepsilon, T]$ apply the Itô formula to the process $\max\big(X^1_t, X^2_t\big)$, where $X^1$ and $X^2$ are two weak solutions with the same Wiener process $W$. Observing that $X^1$ and $X^2$ are Hölder up to order $\mu > \frac12$ on $[\varepsilon, T]$, which implies that the quadratic variation of $X^1 - X^2$ is zero, we get that for any $t \in [\varepsilon, T]$

$$\max\big(X^1_t, X^2_t\big) - \max\big(X^1_\varepsilon, X^2_\varepsilon\big)
= X^1_t - X^1_\varepsilon + \big(X^2_t - X^1_t\big)^+ - \big(X^2_\varepsilon - X^1_\varepsilon\big)^+$$

$$= Y_t - Y_\varepsilon + \int_\varepsilon^t u\big(s, X^1_s\big)\,ds
+ \int_\varepsilon^t \Big(u\big(s, X^2_s\big) - u\big(s, X^1_s\big)\Big)\mathbb 1_{\{X^2_s > X^1_s\}}\,ds
= Y_t - Y_\varepsilon + \int_\varepsilon^t u\big(s, \max\big(X^1_s, X^2_s\big)\big)\,ds.$$

Let $\varepsilon \to 0$. Then it follows from the continuity of $Y$ and $u$ that $Y_\varepsilon \to 0$ a.s., and

$$\int_\varepsilon^t u\big(s, \max\big(X^1_s, X^2_s\big)\big)\,ds \to \int_0^t u\big(s, \max\big(X^1_s, X^2_s\big)\big)\,ds \quad \text{a.s.}$$

Moreover, $\max\big(X^1_\varepsilon, X^2_\varepsilon\big) \to x$ a.s. Finally,

$$\max\big(X^1_t, X^2_t\big) = x + Y_t + \int_0^t u\big(s, \max\big(X^1_s, X^2_s\big)\big)\,ds.$$

It means that $\max\big(X^1, X^2\big)$ (and similarly $\min\big(X^1, X^2\big)$) satisfies Eq. (14.53). Due to the weak uniqueness proved in Theorem 5, $\max\big(X^1_t, X^2_t\big)$ and $\min\big(X^1_t, X^2_t\big)$ have the same distribution, whence $X^1_t = X^2_t$ a.s., and from the continuity of $X^1$ and $X^2$, $X^1_t = X^2_t$, $t \in [0,T]$, a.s. $\square$

Remark 8 1. Condition (C5) is fulfilled in the case when $Y = B^H$ with $H > \frac12$. In this case we can put

$$\tfrac1p = H - \tfrac12 + \tfrac{\varepsilon}{3}, \quad \tfrac1q = \tfrac{\varepsilon}{3}, \quad \tfrac1r = \tfrac32 - H + \tfrac{\varepsilon}{3},$$

where $0 < \varepsilon < \min\big\{H-\tfrac12,\ 3(1-H),\ \tfrac12\big\}$. Then

$$\mu = \tfrac32 - \tfrac{\varepsilon}{3} - \max\Big\{\tfrac12,\ \tfrac32 - H + \tfrac{2\varepsilon}{3},\ H - \tfrac12 + \tfrac{2\varepsilon}{3}\Big\} = H - \varepsilon > \tfrac12.$$

2. In the case when we cannot guarantee that $Y$ is Hölder up to some order $\mu > \frac12$ (for example, in the case when $Y = B^H$ with $H < \frac12$), the Itô formula for $\max\big(X^1_t, X^2_t\big)$ has a different form, and a statement like Theorem 6 is an open problem.

We conclude with a straightforward consequence of Theorems 5 and 6.


Theorem 7 Under the assumptions of Theorem 6, Eq. (14.53) has a unique strong solution.

Acknowledgements The authors acknowledge that the present research is carried through within the frame and support of the ToppForsk project nr. 274410 of the Research Council of Norway with title STORM: Stochastics for Time-Space Risk Models.

Appendix A: Elements of Fractional Calculus for Sonine Pairs

Here we consider some notions similar to the notions of the fractional integral and of the fractional derivative proper to classical fractional calculus.

Definition 3 Let functions $c$ and $h$ from $L^1([0,T])$ create a Sonine pair. Introduce the operators, similar to the operators of fractional integral and fractional derivative:

$$\big(I^{c}_{0+}f\big)(t) = \int_0^t c(t-s)f(s)\,ds, \quad f \in L^1([0,T]),$$

$$\big(D^{h}_{0+}f\big)(t) = \frac{d}{dt}\left(\int_0^t h(t-s)f(s)\,ds\right),$$

where $f : [0,T] \to \mathbb R$ is such that $\int_0^t h(t-s)f(s)\,ds \in AC([0,T])$.

Now we can establish some properties of the operators $I^{c}_{0+}$ and $D^{h}_{0+}$. Denote

$$I^{c}_{0+}\big(L^1([0,T])\big) = \Big\{\psi : [0,T] \to \mathbb R \ :\ \psi(t) = \big(I^{c}_{0+}\varphi\big)(t),\ \varphi \in L^1([0,T])\Big\}.$$

Lemma 10 (i) Let $f \in L^1([0,T])$. Then $\big(D^{h}_{0+}I^{c}_{0+}f\big)(t) = f(t)$ a.e.
(ii) Let $f \in I^{c}_{0+}\big(L^1([0,T])\big)$. Then $\big(I^{c}_{0+}D^{h}_{0+}f\big)(t) = f(t)$, $t \in [0,T]$.
(iii) Let $h \in C^1(0,T)$, and let there exist $\beta > 0$ such that $\lim_{s\to 0} s^{\beta+1}h'(s) < \infty$. Also, let $f$ be a Hölder function of order $\gamma$, with $\gamma > \beta$. Then for any $t \in [0,T]$,

$$\big(D^{h}_{0+}f\big)(t) = h(t)f(t) + \int_0^t \big[f(z)-f(t)\big]\,h'(t-z)\,dz.$$

Proof (i) Obviously,


$$\big(D^{h}_{0+}I^{c}_{0+}f\big)(t) = \frac{d}{dt}\left(\int_0^t h(t-s)\left(\int_0^s c(s-u)f(u)\,du\right)ds\right)
= \frac{d}{dt}\left(\int_0^t f(u)\left(\int_u^t h(t-s)c(s-u)\,ds\right)du\right)
= \frac{d}{dt}\left(\int_0^t f(u)\,du\right) = f(t) \quad \text{a.e.},$$

where we used that $c$ and $h$ create a Sonine pair, so that $\int_u^t h(t-s)c(s-u)\,ds = 1$.

(ii) Let $f(t) = \big(I^{c}_{0+}\varphi\big)(t)$, $\varphi \in L^1([0,T])$. Then, according to (i),

$$\big(I^{c}_{0+}D^{h}_{0+}f\big)(t) = \big(I^{c}_{0+}D^{h}_{0+}I^{c}_{0+}\varphi\big)(t) = \big(I^{c}_{0+}\varphi\big)(t) = f(t), \quad t \in [0,T].$$

(iii) For any $t \in (0,T)$ and $\Delta t > 0$ (other values can be considered similarly),

$$\Delta f := \int_0^{t+\Delta t} h(t+\Delta t-s)f(s)\,ds - \int_0^t h(t-s)f(s)\,ds$$
$$= \int_0^t \big[h(t+\Delta t-s)-h(t-s)\big]f(s)\,ds + \int_t^{t+\Delta t} h(t+\Delta t-s)f(s)\,ds$$
$$= \int_0^t \big[h(t+\Delta t-s)-h(t-s)\big]\big[f(s)-f(t)\big]\,ds
+ \int_t^{t+\Delta t} h(t+\Delta t-s)\big[f(s)-f(t)\big]\,ds + f(t)\int_t^{t+\Delta t} h(s)\,ds.$$

Evidently,

$$\frac{1}{\Delta t}\,f(t)\int_t^{t+\Delta t} h(s)\,ds \to f(t)h(t), \quad \text{a.e., as } \Delta t \to 0.$$

Furthermore,

$$\frac{1}{\Delta t}\left|\int_t^{t+\Delta t} h(t+\Delta t-s)\big[f(s)-f(t)\big]\,ds\right|
= |h(t+\Delta t-\theta_{\Delta t})|\,|f(\theta_{\Delta t})-f(t)|,$$

where $\theta_{\Delta t} \in [t, t+\Delta t]$. According to condition (iii) and L'Hôpital's rule, for some constant $C > 0$,

$$\lim_{\Delta t\to 0}|h(t+\Delta t-\theta_{\Delta t})|\,|f(\theta_{\Delta t})-f(t)| \le C\lim_{\Delta t\to 0}\Delta t^{\,\gamma-\beta} = 0.$$

Finally, for $0 < \varepsilon < t$,


$$\left|\int_0^t \left(\frac{h(t+\Delta t-s)-h(t-s)}{\Delta t} - h'(t-s)\right)\big[f(s)-f(t)\big]\,ds\right|
= \left|\int_0^t \big[h'(\theta_{\Delta t}-s)-h'(t-s)\big]\big[f(s)-f(t)\big]\,ds\right|$$
$$\le \left|\int_0^{t-\varepsilon} \big[h'(\theta_{\Delta t}-s)-h'(t-s)\big]\big[f(s)-f(t)\big]\,ds\right|
+ \int_{t-\varepsilon}^t |h'(\theta_{\Delta t}-s)|\,|f(s)-f(t)|\,ds
+ \int_{t-\varepsilon}^t |h'(t-s)|\,|f(s)-f(t)|\,ds.$$

The first term, $\big|\int_0^{t-\varepsilon}[h'(\theta_{\Delta t}-s)-h'(t-s)][f(s)-f(t)]\,ds\big|$, tends to $0$ as $\Delta t \to 0$ for any $\varepsilon > 0$. Concerning the second term, for sufficiently small $\varepsilon$ it follows from condition (iii) that

$$\int_{t-\varepsilon}^t |h'(\theta_{\Delta t}-s)|\,|f(s)-f(t)|\,ds \le C\int_{t-\varepsilon}^t (t-s)^{-1-\beta}(t-s)^{\gamma}\,ds = C\varepsilon^{\gamma-\beta},$$

and the same is true for $\int_{t-\varepsilon}^t |h'(t-s)|\,|f(s)-f(t)|\,ds$, and the proof follows. $\square$
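A quick numerical sanity check (ours, not part of the text) of the Sonine property for the canonical power pair underlying the examples above: with $c(s) = s^{\alpha-1}/\Gamma(\alpha)$ and $h(s) = s^{-\alpha}/\Gamma(1-\alpha)$, $0 < \alpha < 1$, the convolution $\int_0^t c(s)h(t-s)\,ds$ equals $1$ for every $t > 0$ (by the Beta-function identity $B(\alpha, 1-\alpha) = \Gamma(\alpha)\Gamma(1-\alpha)$):

```python
# Illustrative check of the Sonine identity for the normalized power pair.
import math
import numpy as np

def sonine_convolution(alpha: float, t: float, n: int = 200_000) -> float:
    # midpoint rule; the integrand has integrable singularities at both ends
    s = (np.arange(n) + 0.5) * (t / n)
    c = s ** (alpha - 1.0) / math.gamma(alpha)
    h = (t - s) ** (-alpha) / math.gamma(1.0 - alpha)
    return float(np.sum(c * h) * (t / n))

for alpha in (0.3, 0.5, 0.7):
    for t in (0.5, 1.0, 2.0):
        # the convolution is identically 1, independent of t
        assert abs(sonine_convolution(alpha, t) - 1.0) < 0.02
```

The independence of the result from $t$ is exactly what makes $D^h_{0+} I^c_{0+} = \mathrm{id}$ work in the proof of Lemma 10 (i).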

Appendix B: Deterministic Equations with Locally Lipschitz Drift of Linear Growth

In this appendix we investigate the deterministic Eq. (14.41).

Lemma 11 Let either of the two following groups of conditions hold.

(D1) (1) The coefficient $u$ is Lipschitz: there exists $C > 0$ such that for any $x, y \in \mathbb R$, $|u(x)-u(y)| \le C|x-y|$. (2) The function $f$ is locally integrable.


(D2) (1) The coefficient $u$ is of linear growth: there exists $C > 0$ such that for any $x \in \mathbb R$, $|u(x)| \le C(1+|x|)$. (2) The coefficient $u$ is locally Lipschitz: for any $R > 0$ there exists $C_R > 0$ such that for any $x, y \in \mathbb R$ with $|x|, |y| < R$, $|u(x)-u(y)| \le C_R|x-y|$. (3) The function $f$ is locally bounded.

Then Eq. (14.41) has a unique solution $X$ on $[0,T]$. If condition (D1) holds, then $X$ is locally integrable. If condition (D2) holds, then $X$ is locally bounded.

Proof First, we assume that (D1) holds. Let $t_0 > 0$ be some number. We apply successive approximations with $X^{(0)}_t = 0$, $X^{(1)}_t = f(t) \in L^1([0,t_0])$,

$$X^{(n)}_t = \int_0^t u\big(X^{(n-1)}_s\big)\,ds + f(t) \in L^1([0,t_0]). \tag{14.54}$$

Then for any $0 < t \le t_0$,

$$\int_0^t \big|X^{(n)}_s - X^{(n-1)}_s\big|\,ds \le \int_0^t \int_0^s \big|u\big(X^{(n-1)}_v\big)-u\big(X^{(n-2)}_v\big)\big|\,dv\,ds
\le C\int_0^t \big|X^{(n-1)}_v - X^{(n-2)}_v\big|(t-v)\,dv \le \cdots$$
$$\le C^{\,n-1}\int_0^t |f(s)|\,\frac{(t-s)^{n-1}}{(n-1)!}\,ds \le \frac{(Ct)^{n-1}}{(n-1)!}\int_0^t |f(s)|\,ds.$$

This means that $X^{(n)}$ is a Cauchy sequence in $L^1([0,t_0])$; therefore there exists a limit $X_t = \lim_{n\to\infty} X^{(n)}_t$ in $L^1([0,t_0])$. It is clear that $X$ is a solution of (14.41). Uniqueness follows from the Gronwall inequality.

Now let us consider the case when (D2) holds. As before, let $t_0 > 0$ be fixed; $f$ is locally bounded, so $|f(t)| \le C = C(t_0)$ on $[0,t_0]$. With $X^{(0)}_t = 0$, $X^{(1)}_t = f(t)$ is locally bounded, and every $X^{(n)}$ that is defined by (14.54) is locally bounded as well. Moreover,


$$\big|X^{(n)}_t\big| \le |f(t)| + Ct + C\int_0^t \big|X^{(n-1)}_s\big|\,ds
\le |f(t)| + Ct + C\int_0^t \big(|f(s)|+Cs\big)\,ds + C^2\int_0^t \big|X^{(n-2)}_s\big|(t-s)\,ds \le \cdots$$
$$\le |f(t)| + Ct + C\int_0^t \big(|f(s)|+Cs\big)e^{C(t-s)}\,ds;$$

therefore, the $X^{(n)}$ are uniformly locally bounded. The existence of the limit, which is the unique solution of (14.41), follows as in the first case. $\square$
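The successive-approximation scheme (14.54) can be illustrated numerically. The sketch below (our own toy instance satisfying (D1), not taken from the text) uses $u(x) = -x$ and $f(t) = t$, for which the limit solves $X' = -X + 1$, $X(0) = 0$, i.e. $X_t = 1 - e^{-t}$:

```python
# Illustration of the Picard scheme X^(n)_t = int_0^t u(X^(n-1)_s) ds + f(t).
import numpy as np

def picard(u, f_vals, dt, n_iter=40):
    x = np.zeros_like(f_vals)                  # X^(0) = 0
    for _ in range(n_iter):
        # left Riemann sum for int_0^{t_k} u(X^(n-1)_s) ds
        integral = np.concatenate(([0.0], np.cumsum(u(x)[:-1]) * dt))
        x = integral + f_vals                  # next approximation X^(n)
    return x

t = np.linspace(0.0, 1.0, 1001)
dt = t[1] - t[0]
# u(x) = -x is Lipschitz, f(t) = t is continuous; exact limit: 1 - e^{-t}
x = picard(lambda v: -v, t.copy(), dt)
assert np.max(np.abs(x - (1.0 - np.exp(-t)))) < 1e-2
```

The factorial decay $(Ct)^{n-1}/(n-1)!$ in the proof is visible in practice: a few dozen iterations already reproduce the discretized fixed point to machine precision.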

References

1. Di Nunno, G., Fiacco, A., Karlsen, E.H.: On the approximation of Lévy driven Volterra processes and their integrals. J. Math. Anal. Appl. 476(1), 120–148 (2019)
2. Di Nunno, G., Mishura, Y., Ralchenko, K.: Fractional calculus and pathwise integration for Volterra processes driven by Lévy and martingale noise. Fract. Calc. Appl. Anal. 19(6), 1356–1392 (2016)
3. Di Nunno, G., Mishura, Y., Yurchenko-Tytarenko, A.: Sandwiched SDEs with unbounded drift driven by Hölder noises. To appear in Advances in Applied Probability 55 (2023)
4. Dzhaparidze, K., van Zanten, H.: A series expansion of fractional Brownian motion. Probab. Theory Related Fields 130(1), 39–55 (2004)
5. Fernique, X.: Régularité des trajectoires des fonctions aléatoires gaussiennes. In: École d'Été de Probabilités de Saint-Flour, IV-1974, pp. 1–96. Lecture Notes in Math., vol. 480 (1975)
6. Föllmer, H., Schweizer, M.: A microeconomic approach to diffusion models for stock prices. Math. Finance 3(1), 1–23 (1993)
7. Gibson, R., Schwartz, E.S.: Stochastic convenience yield and the pricing of oil contingent claims. J. Finance 45(3), 959–976 (1990)
8. Harang, F.A., Tindel, S.: Volterra equations driven by rough signals. Stochast. Processes Appl. 142, 34–78 (2021)
9. Langevin, P.: Sur la théorie du mouvement brownien. Compt. Rendus 146, 530–533 (1908)
10. Mishura, Y.: Diffusion approximation of recurrent schemes for financial markets, with application to the Ornstein-Uhlenbeck process. Opuscula Math. 35(1), 99–116 (2015)
11. Mishura, Y.: The rate of convergence of option prices on the asset following a geometric Ornstein-Uhlenbeck process. Lith. Math. J. 55(1), 134–149 (2015)
12. Mishura, Y., Shevchenko, G., Shklyar, S.: Gaussian processes with Volterra kernels. In: Silvestrov, S., Malyarenko, A., Rancic, M. (eds.) Stochastic Processes, Statistical Methods and Engineering Mathematics. Springer (2020)
13. Mishura, Y., Zili, M.: Stochastic Analysis of Mixed Fractional Gaussian Processes. ISTE Press, London; Elsevier Ltd, Oxford (2018)
14. Nualart, D., Ouknine, Y.: Regularization of differential equations by fractional noise. Stochastic Process. Appl. 102(1), 103–116 (2002)
15. Rajput, B.S., Rosiński, J.: Spectral representations of infinitely divisible processes. Probab. Theory Related Fields 82(3), 451–487 (1989)
16. Sato, K.: Lévy Processes and Infinitely Divisible Distributions. Cambridge Studies in Advanced Mathematics, vol. 68. Cambridge University Press, Cambridge (1999)


17. Schöbel, R., Zhu, J.: Stochastic volatility with an Ornstein-Uhlenbeck process: an extension. Rev. Finance 3(1), 23–46 (1999)
18. Stein, E.M., Stein, J.C.: Stock price distributions with stochastic volatility: an analytic approach. Rev. Financ. Stud. 4(4), 727–752 (1991)
19. Su, X., Wang, W.: Pricing options with credit risk in a reduced form model. J. Korean Statist. Soc. 41(4), 437–444 (2012)
20. Tikanmäki, H., Mishura, Y.: Fractional Lévy processes as a result of compact interval integral transformation. Stoch. Anal. Appl. 29(6), 1081–1101 (2011)
21. Uhlenbeck, G.E., Ornstein, L.S.: On the theory of the Brownian motion. Phys. Rev. 36(5), 823 (1930)
22. Van Kampen, N.G.: Stochastic Processes in Physics and Chemistry, 3rd edn. Elsevier (2007)
23. Vasicek, O.: An equilibrium characterization of the term structure. J. Financ. Econ. 5(2), 177–188 (1977)
24. Wang, M.C., Uhlenbeck, G.E.: On the theory of the Brownian motion II. Rev. Modern Phys. 17(2–3), 323 (1945)
25. Wiggins, J.B.: Option values under stochastic volatility: theory and empirical estimates. J. Financ. Econ. 19(2), 351–372 (1987)

Chapter 15

Fixed Point Results of Generalized Cyclic Contractive Mappings in Multiplicative Metric Spaces

Talat Nazir and Sergei Silvestrov

Abstract Several generalized contractive type conditions are established for existence, uniqueness and well-posedness of fixed point results, the limit shadowing property, and also for the coincidence of the sets of periodic points and fixed points for cyclic contractive maps on multiplicative metric spaces.

Keywords Fixed point · Generalized contractive mapping · Well-posedness · Limit shadowing property · Periodic point property · Multiplicative metric space

MSC 2020 Classification 47H09 · 47H10 · 54C60 · 54H25

15.1 Introduction

The study of fixed points of maps with certain types of contractive restrictions is a powerful approach towards solving a variety of scientific problems in various areas of mathematics, as well as an important methodology for computational algorithms in physics and other natural sciences and engineering subjects. In this work, after reviewing some relevant definitions, fundamental results and examples from the existing literature on multiplicative metric spaces, we establish several generalized contractive type conditions for existence, uniqueness and well-posedness of fixed point results, the limit shadowing property, and also for the coincidence of the sets of periodic points and fixed points for cyclic contractive maps on multiplicative metric spaces.

T. Nazir (B): Department of Mathematical Sciences, University of South Africa, Florida 0003, South Africa. e-mail: [email protected]
S. Silvestrov: Division of Mathematics and Physics, School of Education, Culture and Communication, Mälardalen University, Box 883, 72123 Västerås, Sweden. e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_15

15.1.1 Multiplicative Metric Spaces

In the theory of metric spaces, the Banach contraction principle is considered a powerful tool, and it has been extended and generalized in many ways: either by considering generalized contractive conditions on the mappings or by extending the domain of the mappings. Ozavsar and Cevikel [20] proved an analogue of the Banach contraction principle in the framework of multiplicative metric spaces; they also obtained the relevant topological properties in the setting of multiplicative metric spaces. Bashirov et al. [7] studied the concept of multiplicative calculus and proved a fundamental theorem of multiplicative calculus; they also illustrated its usefulness with many interesting applications. Multiplicative calculus provides a natural and straightforward way to compute the derivative of the product and quotient of two functions [7]. It was shown that multiplicative differential equations are more suitable than ordinary differential equations for investigating some problems in economics and finance. Due to its operational simplicity and its support of Newtonian calculus, multiplicative calculus has attracted the attention of several researchers in recent years. Furthermore, based on the definition of the multiplicative absolute value function, Bashirov et al. defined the multiplicative distance between two non-negative real numbers and between two positive square matrices; this provided the basis for multiplicative metric spaces. Bashirov et al. [8] obtained the modeling of mathematical problems with multiplicative differential equations. Florack and van Assen [13] gave applications of multiplicative calculus in biomedical image analysis. He et al. [14] studied common fixed points for weakly commutative mappings on a multiplicative metric space. Recently, Yamaod and Sintunavarat [27] obtained some fixed point results for generalized contraction mappings with cyclic (α, β)-admissible mappings in multiplicative metric spaces.
We start with the definition of multiplicative metric space [20]. Henceforth, $\mathbb R$, $\mathbb C$, $\mathbb R_{>0}$, $\mathbb R^n_{>0}$, $\mathbb C^n$ and $\mathbb N$ denote the sets of real numbers, complex numbers, positive real numbers, $n$-tuples of positive real numbers, $n$-tuples of complex numbers and natural numbers, respectively.

Definition 15.1 (multiplicative metric space) Let $X$ be a non-empty set. A mapping $d : X \times X \to \mathbb R_{>0}$ is said to be a multiplicative metric on $X$ if for any $x, y, z \in X$ the following conditions hold:
(i) $d(x,y) \ge 1$, and $d(x,y) = 1$ if and only if $x = y$;
(ii) $d(x,y) = d(y,x)$;
(iii) $d(x,y) \le d(x,z)\cdot d(z,y)$.
The pair $(X,d)$ is called a multiplicative metric space.

The multiplicative absolute-value function $|\cdot|_* : \mathbb R \to \mathbb R_{>0}$ is defined as

$$|\alpha|_* = \begin{cases} \alpha & \text{if } \alpha \ge 1, \\ \frac{1}{\alpha} & \text{if } \alpha \in (0,1), \\ 1 & \text{if } \alpha = 0, \\ -\frac{1}{\alpha} & \text{if } \alpha \in (-1,0), \\ -\alpha & \text{if } \alpha \le -1. \end{cases}$$

For arbitrary $x, y \in \mathbb R_{>0}$, the multiplicative absolute value function $|\cdot|_* : \mathbb R_{>0} \to \mathbb R_{>0}$ satisfies the following properties: $|x|_* \ge 1$, $\frac{1}{|x|_*} \le x \le |x|_*$, as well as $|x\cdot y|_* \le |x|_*\,|y|_*$.

Example 15.1 ([20]) The mappings

$$d_1(x,y) = \left|\frac{x_1}{y_1}\right|_* \cdot \left|\frac{x_2}{y_2}\right|_* \cdots \left|\frac{x_n}{y_n}\right|_*, \qquad
d_2(x,y) = \max\left\{\left|\frac{x_1}{y_1}\right|_*, \left|\frac{x_2}{y_2}\right|_*, \ldots, \left|\frac{x_n}{y_n}\right|_*\right\},$$

for $x = (x_1, x_2, \ldots, x_n)$, $y = (y_1, y_2, \ldots, y_n) \in \mathbb R^n_{>0}$, define multiplicative metrics on $X = \mathbb R^n_{>0}$.

Example 15.2 ([20]) Let $\alpha > 1$ be a fixed real number. Then $d_\alpha : \mathbb R^n \times \mathbb R^n \to \mathbb R_{>0}$ defined by

$$d_\alpha(x,y) := \alpha^{\sum_{i=1}^{n}|x_i-y_i|}$$

for $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n) \in \mathbb R^n$ (or $\in \mathbb C^n$) satisfies the multiplicative metric conditions on $\mathbb R^n$ (or on $\mathbb C^n$).

Example 15.3 ([20]) Let $C^*[\alpha,\beta]$ be the collection of all real-valued multiplicative continuous functions from $[\alpha,\beta] \subset \mathbb R$ to $\mathbb R_{>0}$. If $d(f,g) = \sup_{c\in[\alpha,\beta]}\left|\frac{f(c)}{g(c)}\right|_*$ for arbitrary $f, g \in C^*[\alpha,\beta]$, then $d$ is a multiplicative metric on $C^*[\alpha,\beta]$.
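A brute-force numerical spot-check (illustrative, not from the text) of the multiplicative-metric axioms of Definition 15.1 for the metric $d_\alpha$ of Example 15.2:

```python
# Spot-check the three multiplicative-metric axioms on random points of R^3.
import itertools
import random

def d_alpha(x, y, alpha=2.0):
    return alpha ** sum(abs(a - b) for a, b in zip(x, y))

random.seed(0)
pts = [tuple(random.uniform(-5, 5) for _ in range(3)) for _ in range(20)]
for x, y, z in itertools.product(pts, repeat=3):
    assert d_alpha(x, y) >= 1.0                       # axiom (i), lower bound
    assert abs(d_alpha(x, y) - d_alpha(y, x)) < 1e-9  # axiom (ii), symmetry
    # axiom (iii), multiplicative triangle inequality (tiny float slack)
    assert d_alpha(x, y) <= d_alpha(x, z) * d_alpha(z, y) * (1 + 1e-12)
```

The multiplicative triangle inequality here is just the ordinary triangle inequality for $\sum_i |x_i - y_i|$ lifted through the strictly increasing map $t \mapsto \alpha^t$.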


Definition 15.2 ([20]) Let $(X,d)$ be a multiplicative metric space, $z_0$ an arbitrary point in $X$, and $\varepsilon > 1$. The set $B(z_0, \varepsilon) = \{z \in X \mid d(z, z_0) < \varepsilon\}$ is called a multiplicative open ball. A multiplicative closed ball is the set $\{z \in X \mid d(z, z_0) \le \varepsilon\}$.

Definition 15.3 A sequence $\{x_n\}$ in a multiplicative metric space $(X,d)$ is said to be multiplicative convergent to some point $x \in X$ if for any given $\varepsilon > 1$ there exists $n_0 \in \mathbb N$ such that $x_n \in B(x, \varepsilon)$ for all $n \ge n_0$. If $\{x_n\}$ converges to $x$, then we write $x_n \to x$ as $n \to \infty$.

Definition 15.4 ([20]) A sequence $\{x_n\}$ in a multiplicative metric space $(X,d)$ is multiplicative convergent to $x$ in $X$ if and only if $d(x_n, x) \to 1$ as $n \to \infty$.

Definition 15.5 Let $(X, d_X)$ and $(Y, d_Y)$ be two multiplicative metric spaces, and $x_0$ an arbitrary but fixed element of $X$. A mapping $f : X \to Y$ is said to be multiplicative continuous at $x_0$ if and only if $x_n \to x_0$ in $(X, d_X)$ implies that $f(x_n) \to f(x_0)$ in $(Y, d_Y)$. That is, given arbitrary $\varepsilon > 1$, there exists $\delta > 1$, depending on $x_0$ and $\varepsilon$, such that $d_Y(fx, fx_0) < \varepsilon$ for all those $x$ in $X$ for which $d_X(x, x_0) < \delta$.

Definition 15.6 ([20]) A sequence $\{x_n\}$ in a multiplicative metric space $(X,d)$ is called a multiplicative Cauchy sequence if for any $\varepsilon > 1$ there exists $n_0 \in \mathbb N$ such that $d(x_m, x_n) < \varepsilon$ for all $m, n \ge n_0$. A multiplicative metric space $(X,d)$ is called complete if every multiplicative Cauchy sequence $\{x_n\}$ in $X$ is multiplicative convergent in $X$.

Theorem 15.1 ([20]) A sequence $\{x_n\}$ in a multiplicative metric space $(X,d)$ is a multiplicative Cauchy sequence if and only if $d(x_m, x_n) \to 1$ as $n, m \to \infty$.

Example 15.4 ([20]) The multiplicative metric space $(C^*[\alpha,\beta], d)$ of Example 15.3 is complete.

15.1.2 Fixed Points of Maps in Multiplicative Metric Space

Let $X$ be a non-empty set and $f : X \to X$ any map. If a point $p$ in $X$ satisfies $f(p) = p$, then we call it a fixed point of $f$. We denote the collection of all fixed points of the self-map $f : X \to X$ by $F(f)$.

Definition 15.7 ([20]) Let $(X,d)$ be a multiplicative metric space. We say that a mapping $f : X \to X$ is a multiplicative Banach contraction if there is a constant $\lambda \in [0,1)$ such that $d(f(x), f(y)) \le d(x,y)^{\lambda}$ for all $x, y \in X$.

Theorem 15.2 ([20]) If $(X,d)$ is a complete multiplicative metric space and $f : X \to X$ is a multiplicative Banach contraction, then the set of fixed points $F(f) \ne \emptyset$ is a singleton (that is, $f$ has a unique fixed point).
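A minimal illustration (our own example, not from [20]) of Theorem 15.2: on $X = \mathbb R_{>0}$ with $d(x,y) = e^{|\ln x - \ln y|}$, the map $f(x) = \sqrt{x}$ satisfies $d(fx, fy) = d(x,y)^{1/2}$, i.e. it is a multiplicative Banach contraction with $\lambda = \frac12$, and iteration converges to the unique fixed point $x^* = 1$:

```python
# Multiplicative Banach contraction on (R_{>0}, d) with d(x,y) = e^{|ln x - ln y|}.
import math

def d(x, y):
    return math.exp(abs(math.log(x) - math.log(y)))

# d(sqrt(x), sqrt(y)) = e^{|ln x - ln y| / 2} = d(x, y)^{1/2}
x = 37.0
for _ in range(60):
    x = math.sqrt(x)

# the orbit converges to the fixed point 1, i.e. d(x_n, 1) -> 1
assert abs(d(x, 1.0) - 1.0) < 1e-9
```

Note that convergence here means $d(x_n, x^*) \to 1$, in accordance with Definition 15.4, rather than $d \to 0$ as in ordinary metric spaces.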

15.2 Cyclic Contraction Mappings

In 2003, Kirk et al. [19] studied the cyclical contractive condition for self-mappings and proved some fixed point results. A noteworthy feature of cyclic contraction mappings is that, while mappings satisfying the Banach contraction condition are always continuous, a mapping satisfying a cyclic contractive condition need not be continuous. Păcurar and Rus [21] obtained some fixed point results for maps that satisfy cyclic weak contractive conditions. Piatek [22] studied some results on cyclic Meir-Keeler contractions in metric spaces. Using a fixed point result for weakly contractive maps, Karapinar [17] established some interesting fixed point results for cyclic φ-weak contraction mappings. Derafshpour and Rezapour [12] obtained results on the existence of best proximity points of cyclic contractions. Recently, Abbas et al. [1] obtained fixed point results for generalized cyclic contractions in partial metric spaces. Several other useful results on cyclic contractive mappings appear in [2, 4, 9, 15, 25].


15.2.1 Fixed Point Results of Cyclic Contraction Mappings

In this section, we obtain several fixed point results for self-maps satisfying certain generalized cyclic contractions defined on a complete multiplicative metric space. We start with the following definition.

Definition 15.8 Let $X$ be a non-empty set and $f : X \to X$ a self-mapping. Then $X = \bigcup_{i=1}^{m} A_i$ is a cyclic representation of $X$ with respect to $f$ if
(a) each $A_i$, for $i = 1, 2, \ldots, m$, is a non-empty subset of $X$;
(b) $f(A_1) \subseteq A_2$, $f(A_2) \subseteq A_3$, $\ldots$, $f(A_{m-1}) \subseteq A_m$, $f(A_m) \subseteq A_1$.

Theorem 15.3 Let $(X,d)$ be a complete multiplicative metric space and $A_1, A_2, \ldots, A_m$ non-empty closed subsets of $X$ with $X = \bigcup_{i=1}^{m} A_i$. Suppose that $f : X \to X$ satisfies
(1) $\bigcup_{i=1}^{m} A_i$ is a cyclic representation of $X$ with respect to $f$;
(2) for all $(x,y) \in A_i \times A_{i+1}$, with $A_{m+1} = A_1$, and some $\lambda \in [0,1)$,
$$d(fx, fy) \le M_\lambda(x,y), \tag{15.1}$$
where $M_\lambda(x,y) = \max\big\{d(x,y)^\lambda,\ d(fx,x)^\lambda,\ d(fy,y)^\lambda,\ [d(fx,y)\cdot d(fy,x)]^{\lambda/2}\big\}$.
Then the map $f$ has a unique fixed point $z \in X$. Moreover, $z \in \bigcap_{i=1}^{m} A_i$.
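Before the proof, a small toy instance (ours, not from the text) of this setting: with $A_1 = [0, \frac12]$, $A_2 = [\frac12, 1]$ and $f(x) = \frac34 - \frac{x}{2}$, one has $f(A_1) \subseteq A_2$ and $f(A_2) \subseteq A_1$, and $d(fx, fy) = d(x,y)^{1/2}$ for $d(x,y) = 2^{|x-y|}$, so (15.1) holds with $\lambda = \frac12$; the orbit alternates between $A_1$ and $A_2$ and converges to the fixed point $\frac12 \in A_1 \cap A_2$:

```python
# Toy cyclic contraction: f maps A1 = [0, 1/2] into A2 = [1/2, 1] and back,
# contracting d(x, y) = 2^{|x - y|} with exponent 1/2.
def f(x):
    return 0.75 - 0.5 * x

x = 0.0
orbit = []
for _ in range(100):
    x = f(x)
    orbit.append(x)

# odd iterates land in A2, even iterates in A1, and the orbit tends to 1/2
assert all(o >= 0.5 for o in orbit[0::2])
assert all(o <= 0.5 for o in orbit[1::2])
assert abs(x - 0.5) < 1e-12
```

This alternation of the orbit between the sets of the cyclic representation is exactly the mechanism used in the proof below.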

Proof Suppose $x_0$ is an arbitrary point of $X = \bigcup_{i=1}^{m} A_i$. Then there exists some $i_0$ such that $x_0 \in A_{i_0}$. Now $f(A_{i_0}) \subseteq A_{i_0+1}$ implies that $fx_0 \in A_{i_0+1}$; thus there exists $x_1$ in $A_{i_0+1}$ such that $fx_0 = x_1$. Similarly, $fx_n = x_{n+1}$, where $x_n \in A_{i_n}$. Hence for $n \ge 1$ there exists $i_n \in \{1, 2, \ldots, m\}$ such that $x_n \in A_{i_n}$ and $x_{n+1} \in A_{i_n+1}$. If $x_{n_0} = x_{n_0+1}$ for some $n_0 = 1, 2, \ldots$, then it is clear that $x_{n_0}$ is a fixed point of $f$. Now we assume $x_n \ne x_{n+1}$ for all $n \ge 1$. By (15.1),

$$d(x_{n+1}, x_{n+2}) = d(fx_n, fx_{n+1}) \le M_\lambda(x_{n+1}, x_n), \tag{15.2}$$

where

$$M_\lambda(x_{n+1}, x_n) = \max\big\{d(x_n, x_{n+1})^\lambda,\ d(x_n, fx_n)^\lambda,\ d(x_{n+1}, fx_{n+1})^\lambda,\ [d(fx_n, x_{n+1})\cdot d(fx_{n+1}, x_n)]^{\lambda/2}\big\}$$
$$= \max\big\{d(x_n, x_{n+1})^\lambda,\ d(x_n, x_{n+1})^\lambda,\ d(x_{n+1}, x_{n+2})^\lambda,\ [d(x_n, x_{n+2})\cdot d(x_{n+1}, x_{n+1})]^{\lambda/2}\big\}$$
$$\le \max\big\{d(x_n, x_{n+1})^\lambda,\ d(x_{n+2}, x_{n+1})^\lambda,\ [d(x_n, x_{n+1})\cdot d(x_{n+2}, x_{n+1})]^{\lambda/2}\big\}.$$

Now different possibilities arise. If

$$\max\big\{d(x_n, x_{n+1})^\lambda,\ d(x_{n+2}, x_{n+1})^\lambda,\ [d(x_n, x_{n+1})\cdot d(x_{n+2}, x_{n+1})]^{\lambda/2}\big\} = d(x_n, x_{n+1})^\lambda,$$

then from (15.2), $d(x_{n+1}, x_{n+2}) \le d(x_{n+1}, x_n)^\lambda$. Also, when the maximum equals $d(x_{n+2}, x_{n+1})^\lambda$, (15.2) gives $d(x_{n+1}, x_{n+2}) \le d(x_{n+1}, x_{n+2})^\lambda$; since $\lambda < 1$, we get $d(x_{n+1}, x_{n+2}) = 1$ and $x_{n+1} = x_{n+2}$, a contradiction, as $x_n \ne x_{n+1}$ for all $n \in \mathbb N$. Finally, if the maximum equals $[d(x_n, x_{n+1})\cdot d(x_{n+2}, x_{n+1})]^{\lambda/2}$, then (15.2) implies that $d(x_{n+1}, x_{n+2}) \le [d(x_n, x_{n+1})\cdot d(x_{n+2}, x_{n+1})]^{\lambda/2}$, which gives $d(x_{n+1}, x_{n+2}) \le d(x_n, x_{n+1})^\lambda$. Thus in all cases $d(x_{n+1}, x_{n+2}) \le d(x_n, x_{n+1})^\lambda$ for all $n \ge 1$, and

$$d(x_{n+1}, x_{n+2}) \le d(x_n, x_{n+1})^\lambda \le d(x_{n-1}, x_n)^{\lambda^2} \le \cdots \le d(x_0, x_1)^{\lambda^{n+1}},$$

so that $d(x_{n+1}, x_n) \le d(x_0, x_1)^{\lambda^n}$. Now for $n, m \in \mathbb N$ with $m > n$, we obtain

$$d(x_n, x_m) \le d(x_m, x_{m-1})\cdot d(x_{m-1}, x_{m-2})\cdots d(x_{n+1}, x_n)
\le d(x_0, x_1)^{\lambda^{m-1}}\cdot d(x_0, x_1)^{\lambda^{m-2}}\cdots d(x_0, x_1)^{\lambda^{n}}$$
$$= d(x_0, x_1)^{\lambda^{m-1}+\lambda^{m-2}+\cdots+\lambda^{n}} \le d(x_0, x_1)^{\frac{\lambda^n}{1-\lambda}} \to 1 \quad \text{as } n, m \to \infty.$$

Hence $\{x_n\}$ is a multiplicative Cauchy sequence in $(X,d)$. As $(X,d)$ is complete, we obtain $z \in X$ for which $x_n \to z$ as $n \to \infty$, i.e., $\lim_{n\to\infty} d(x_n, z) = 1$. First, we shall prove that $z \in \bigcap_{i=1}^{m} A_i$. From condition (1), and since $x_0 \in A_1$, we have a subsequence $\{x_{n_k}\} \subseteq A_1$. As $A_1$ is closed, $z \in A_1$. Again, from condition (1), we have $\{x_{n_k+1}\} \subseteq A_2$; from the closedness of $A_2$ we get $z \in A_2$. Continuing this way, we obtain $z \in \bigcap_{i=1}^{m} A_i$, and thus $\bigcap_{i=1}^{m} A_i \ne \emptyset$. Now, for $x_n \in A_i$, $i \in \{1, 2, \ldots, m\}$, and $z \in A_{i+1}$,

$$d(x_n, fz) = d(fx_{n-1}, fz) \le M_\lambda(x_{n-1}, z),$$
$$M_\lambda(x_{n-1}, z) = \max\big\{d(x_{n-1}, z)^\lambda,\ d(fx_{n-1}, x_{n-1})^\lambda,\ d(z, fz)^\lambda,\ [d(x_{n-1}, fz)\cdot d(z, fx_{n-1})]^{\lambda/2}\big\}$$
$$= \max\big\{d(x_{n-1}, z)^\lambda,\ d(x_n, x_{n-1})^\lambda,\ d(z, fz)^\lambda,\ [d(x_{n-1}, fz)\cdot d(z, x_n)]^{\lambda/2}\big\},$$

and taking the limit as $n \to \infty$ implies that

$$d(fz, z) \le \max\big\{d(z,z)^\lambda,\ d(z,z)^\lambda,\ d(z, fz)^\lambda,\ [d(fz,z)\cdot d(z,z)]^{\lambda/2}\big\} = d(z, fz)^\lambda,$$

which implies $d(z, fz) = 1$ and hence $z = fz$. Finally, we show that the fixed point of $f$ is unique in $\bigcap_{i=1}^{m} A_i$. For this, suppose that there exists another $u \in \bigcap_{i=1}^{m} A_i$ such that $u = fu$. Then (15.1) gives

$$d(u, z) = d(fu, fz) \le M_\lambda(u, z), \tag{15.3}$$
$$M_\lambda(u, z) = \max\big\{d(u,z)^\lambda,\ d(u, fu)^\lambda,\ d(z, fz)^\lambda,\ [d(u, fz)\cdot d(fu, z)]^{\lambda/2}\big\}
= \max\big\{d(u,z)^\lambda,\ d(u,u)^\lambda,\ d(z,z)^\lambda,\ [d(u,z)\cdot d(z,u)]^{\lambda/2}\big\} = d(u,z)^\lambda.$$

1 1 , A2 = , 1 with A3 = A1 . For Y = A1 A2 , we define pose that A1 = 0, 3 3 ⎧1 ⎨ if x ∈ [0, 1), . Clearly A1 and A2 are closed subsets of f : Y → Y by f (x) = 3 ⎩ 0 if x = 1    

1 1 ⊆ A2 and f (A2 ) = 0, ⊆ A1 . So, A1 A2 is (X, d) . Note that f (A1 ) = 3 3 the cyclic representation with respect to f. To show that Theorem 15.3 is satisfied 1 for λ = , we have the following cases. 2

 1 1 and y ∈ , 1 , d ( f x, f y) = (1) When x ∈ A1 , y ∈ A2 , then for x ∈ 0, 3 3   1 d 13 , 13 = α 0 and (15.1) is satisfied and, when x ∈ 0, and y = 1, implies 3 d ( f x, f y) = d

1

 1 1 , 0 = α 3 < α 2 = d(y, f y)λ

3 λ

λ

≤ max{d(x, y)λ , d( f x, x) , d( f y, y)λ , [d( f x, y) · d( f y, x)] 2 }.

   1 1 (2) If y ∈ A1 , x ∈ A2 , then for y ∈ 0, ,x ∈ , 1 , d ( f x, f y) = d 13 , 13 = 3 3

1 0 α , and when y ∈ 0, and x = 1, we have 3   1 1 d ( f x, f y) = d 0, 13 = α 3 < α 2 = d( f x, x)λ λ

≤ max{d(x, y)λ , d( f x, x)λ , d( f y, y)λ , [d( f x, y) · d( f y, x)] 2 }. 1 So, all conditions of Theorem 15.3 hold. Moreover, is the unique fixed point of f 3 in A1 A2 .

332

T. Nazir and S. Silvestrov

Theorem 15.4 Let (X, d) be complete multiplicative metric space, A1 , A2 , . . . , m

Am , be non-empty closed subsets of X with X = Ai . Suppose that f : X → X i=1

satisfy (1)

m

Ai is the cyclic representation with respect to f ,

i=1

(2) for all (x, y) ∈ Ai × Ai+1 , with Am+1 = A1 , and some λi ≥ 0 for i = {1, 2, 3, 4, 5}, such that λ1 + λ2 + λ3 + λ4 + 2λ5 < 1, d( f x, f y) ≤ M ∗ (x, y), (15.4) ∗

(x, y) = d(x,y)λ1 · d( f x, x)λ2 · d(y, f y)λ3 · d( f x, y)λ4 · d( f y, x)λ5 .

Then, f has a unique fixed point z ∈ X . Moreover z ∈

m

Ai .

i=1

Proof Suppose x0 is arbitrary element of X =

n

Ai . Then there exists some i 0

i=1

such that x0 ∈ Ai0 . Now f (Ai0 ) ⊆ Ai0 +1 implies that f x0 ∈ Ai0 +1 . Thus there exists x1 in Ai0 +1 such that f x0 = x1 . Similarly, f xn = xn+1 , where xn ∈ Ain . Hence for n ≥ 1, there exists i n ∈ {1, 2, . . . , m} such that xn ∈ Ain and xn+1 ∈ Ain+1 . In case, xn 0 = xn 0 +1 for some n 0 = 1, 2, . . . , then it is clear that xn 0 is a fixed point of f . For xn = xn+1 for all n ∈ N, by (15.4), d(xn+2 , xn+1 ) = d( f xn , f xn+1 ) ≤ M ∗ (xn+1 , xn ), where M ∗ (xn+1 , xn ) = d(xn , xn+1 )λ1 · d( f xn , xn )λ2 · d( f xn+1 , xn+1 )λ3 · d( f xn , xn+1 )λ4 · d( f xn+1 , xn )λ5 = d(xn , xn+1 )λ1 · d(xn+1 , xn )λ2 · d(xn+2 , xn+1 )λ3 · d(xn+1 , xn+1 )λ4 · d(xn+2 , xn )λ5 ≤ d(xn , xn+1 )λ1 · d(xn+1 , xn )λ2 · d(xn+2 , xn+1 )λ3 · d(xn , xn+1 )λ5 · d(xn+1 , xn+2 )λ5 = d(xn+1 , xn )λ1 +λ2 +λ5 · d(xn+2 , xn+1 )λ3 +λ5 . Now, d(xn+1 , xn+2 ) ≤ d(xn+1 , xn )θ , where θ = (λ1 + λ2 + λ5 )/(1 − λ3 − λ5 ). Thus 2 n for all n ∈ N d(xn+1 , xn+2 ) ≤ d(xn+1 , xn )θ ≤ d(xn-1 , xn )θ ≤ . . . ≤ d(x1 , x0 )θ . Now for n, m ∈ N, with n < m, d(xn , xm ) ≤ d(xm , xm-1 ) · d(xm−1 , xm−2 ) · . . . · d(xn+1 , xn ) ≤ d(x0 , x1 )θ

· d(x0 , x1 )θ · . . . · d(x0 , x1 )θ = d(x0 , x1 )θ +θ θn n 2 = d(x1 , x0 )θ (1+θ+θ +... ) ≤ d(x0 , x1 ) 1−θ → 1 when n → ∞. m-1

m−2

n

m−1

m−2

+...+θ

Hence {xn } is the multiplicative Cauchy sequence in space (X, d). As given (X, d) is complete, so we obtain z ∈ X for which xn → z as n → ∞, or lim d(xn , z) = 1. n→∞

n

15 Fixed Point Results of Generalized Cyclic …

First, let us prove that z ∈ ⋂_{i=1}^{m} A_i. From condition (1), and since x_0 ∈ A_1, there is a subsequence {x_{n_k}} ⊆ A_1. As A_1 is closed, z ∈ A_1. Again from condition (1), we have {x_{n_k+1}} ⊆ A_2, and from the closedness of A_2 we get z ∈ A_2. Continuing this way, we get z ∈ ⋂_{i=1}^{m} A_i, and thus ⋂_{i=1}^{m} A_i ≠ ∅. Now, for x_n ∈ A_i, i ∈ {1, 2, ..., m}, and z ∈ A_{i+1},

d(x_n, fz) = d(fx_{n−1}, fz) ≤ M*(x_{n−1}, z),

where

M*(x_{n−1}, z) = d(x_{n−1}, z)^{λ_1} · d(fx_{n−1}, x_{n−1})^{λ_2} · d(fz, z)^{λ_3} · d(fx_{n−1}, z)^{λ_4} · d(fz, x_{n−1})^{λ_5}
= d(x_{n−1}, z)^{λ_1} · d(x_n, x_{n−1})^{λ_2} · d(fz, z)^{λ_3} · d(x_n, z)^{λ_4} · d(fz, x_{n−1})^{λ_5},

and hence, taking the limit as n → ∞,

d(z, fz) ≤ d(z, z)^{λ_1} · d(z, z)^{λ_2} · d(z, fz)^{λ_3} · d(z, z)^{λ_4} · d(fz, z)^{λ_5} = d(fz, z)^{λ_3+λ_5},

which implies d(z, fz) = 1 and thus z = fz. Finally, we show that the fixed point of f is unique in ⋂_{i=1}^{m} A_i. For this, suppose that there exists another v ∈ ⋂_{i=1}^{m} A_i such that v = fv and v ≠ z. Then from (15.4) we have

d(v, z) = d(fv, fz) ≤ M*(v, z),

where

M*(v, z) = d(v, z)^{λ_1} · d(fv, v)^{λ_2} · d(fz, z)^{λ_3} · d(fv, z)^{λ_4} · d(fz, v)^{λ_5}
= d(v, z)^{λ_1} · d(v, v)^{λ_2} · d(z, z)^{λ_3} · d(v, z)^{λ_4} · d(z, v)^{λ_5} = d(v, z)^{λ_1+λ_4+λ_5},

which gives d(v, z) = 1 and hence v = z. □
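The geometric decay of successive multiplicative distances that drives the Cauchy argument above can be checked numerically. The sketch below is our own illustration, not part of the chapter: the map f(x) = x/2 on [0, 1] with d(x, y) = e^{|x−y|} satisfies d(fx, fy) = d(x, y)^{1/2}, i.e. condition (15.4) with λ_1 = 1/2 and λ_2 = ⋯ = λ_5 = 0 (taking the trivial cyclic representation A_1 = X), so θ = 1/2:

```python
import math

# Multiplicative metric on [0, 1]: d(x, y) = e^{|x - y|}  (illustrative choice)
def d(x, y):
    return math.exp(abs(x - y))

# f(x) = x/2 satisfies d(fx, fy) = d(x, y)^{1/2}, i.e. (15.4) with
# lambda_1 = 1/2 and lambda_2 = ... = lambda_5 = 0, hence theta = 1/2.
f = lambda x: x / 2
theta = 0.5

x0, x1 = 1.0, 0.5
xs = [x0]
for _ in range(20):
    xs.append(f(xs[-1]))

# d(x_n, x_{n+1}) <= d(x_0, x_1)^{theta^n} for every n, as in the proof
for n in range(20):
    assert d(xs[n], xs[n + 1]) <= d(x0, x1) ** (theta ** n) + 1e-12

# The iterates converge to the fixed point z = 0: d(x_n, z) -> 1
assert abs(d(xs[-1], 0.0) - 1.0) < 1e-5
```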



Example 15.6 Let X = [0, 2] and d : X × X → R>0 be defined by d (x, y) = 1 |x−y| , . Then (X, d) is complete multiplicative metric space. Suppose A1 = 0, e 7



1 A2 = , 2 with A3 = A1 . For Y = A1 A2 , define f : Y → Y by f (x) = 7 ⎧1 ⎨ if x ∈ [0, 2), 7 . Clearly A1 and A2 are closed subsets of (X, d) . Note that ⎩ 0 if x = 2    

1 1 ⊆ A2 and f (A2 ) = 0, ⊆ A1 . So, A1 A2 is the cyclic repf (A1 ) = 7 7 1 resentation with respect to f. To show that Theorem 15.4 is satisfied for λi = for 8 i ∈ {1, 2, 3, 4, 5}, we shall distinguish the following cases.     (1) If x ∈ A1 , y ∈ A2 , then for x ∈ 0, 17 and y ∈ 17 , 2 , we deduce that

334

T. Nazir and S. Silvestrov

 d ( f x, f y) = d

1 1 , 7 7

 = e0 ≤ e

|x−y| 8

·e

|x− 17 | 8

·e

| y− 71 | 8

·e

| y− 71 | 8

·e

|x− 17 | 8

= d(x, y)λ1 · d( f x, x)λ2 · d(y, f y)λ3 · d( f x, y)λ4 · d( f y, x)λ5   and when x ∈ 0, 17 and y = 2, we have  d ( f x, f y) = d

 | 17 −x | 2 |2− 17 | x 1 1 2−x ,0 = e7 ≤ e 8 · e 8 · e8 · e 8 · e8 7

= d(x, y)λ1 · d( f x, x)λ2 · d(y, f y)λ3 · d( f x, y)λ4 · d( f y, x)λ5 .     (2) If y ∈ A1 , x ∈ A2 , then for y ∈ 0, 17 , x = 17 , 2 , we obtain  d ( f x, f y) = d

1 1 , 7 7

 = e0 ≤ e

x−y 8

·e

x− 17 8

·e

| y− 71 | 8

·e

| y− 71 | 8

·e

|x− 17 | 8

= d(x, y)λ1 · d( f x, x)λ2 · d(y, f y)λ3 · d( f x, y)λ4 · d( f y, x)λ5 .   and also for y ∈ 0, 17 and x = 2, we have   | y− 71 | |2− 17 | |2−y| y 1 1 2 d ( f x, f y) = d 0, = e7 ≤ e 8 · e8 · e 8 · e8 · e 8 7 = d(x, y)λ1 · d( f x, x)λ2 · d(y, f y)λ3 · d( f x, y)λ4 · d( f y, x)λ5 . 1 Hence all the conditions of Theorem 15.4 are satisfied. Moreover, is the fixed point 7 of f in A1 A2 .

15.2.2 Well-Posedness Results for Cyclic Contractive Maps

The notion of well-posedness of a fixed point problem has attracted the interest of many mathematicians. Recently, Karapinar [17] studied the well-posed problem for a cyclic weak φ-contraction mapping. Some useful results regarding well-posedness of fixed point problems appeared in [3, 5, 24]. In this section, well-posedness of the fixed point problem for cyclic contraction mappings is obtained. First, we define well-posedness of fixed point problems in multiplicative metric spaces.

Definition 15.9 Let (X, d) be a multiplicative metric space. The fixed point problem of a self-map f : X → X is called well-posed if the set F(f) is a singleton, with F(f) = {x*}, and for every sequence {x_n} in X such that lim_{n→∞} d(fx_n, x_n) = 1 it holds that lim_{n→∞} d(x_n, x*) = 1.
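Definition 15.9 can be illustrated with the map of Example 15.6 (an informal numeric check of ours, not a proof): along a sequence of approximate fixed points, i.e. one with d(fx_n, x_n) → 1, the points x_n are forced toward the fixed point x* = 1/7:

```python
import math

d = lambda x, y: math.exp(abs(x - y))   # multiplicative metric on [0, 2]
f = lambda x: 1 / 7 if x < 2 else 0.0   # map of Example 15.6, fixed point 1/7

# A sequence with d(f x_n, x_n) -> 1:  x_n = 1/7 + 1/(n+2)
xs = [1 / 7 + 1 / (n + 2) for n in range(200)]

approx_errors = [d(f(x), x) for x in xs]   # tends to 1
dist_to_fp = [d(x, 1 / 7) for x in xs]     # tends to 1 as well (well-posedness)

assert approx_errors[-1] < approx_errors[0]
assert abs(approx_errors[-1] - 1.0) < 0.01
assert abs(dist_to_fp[-1] - 1.0) < 0.01
```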


Theorem 15.5 Let (X, d) be a complete multiplicative metric space and A_1, A_2, ..., A_m be non-empty closed subsets of X with X = ⋃_{i=1}^{m} A_i. Suppose that f : X → X satisfies

(1) ⋃_{i=1}^{m} A_i is a cyclic representation with respect to f,
(2) for all (x, y) ∈ A_i × A_{i+1}, with A_{m+1} = A_1, and some λ ∈ [0, 1), d(fx, fy) ≤ M_λ(x, y), where

M_λ(x, y) = max{d(x, y)^λ, d(fx, x)^λ, d(fy, y)^λ, [d(fx, y) · d(fy, x)]^{λ/2}}.

Then the fixed point problem of f is well-posed.

Proof It follows from Theorem 15.3 that f has a unique fixed point z ∈ ⋂_{i=1}^{m} A_i. Let us take a sequence {x_n} in ⋃_{i=1}^{m} A_i that satisfies d(fx_n, x_n) → 1 as n → ∞. Then we have

d(z, x_n) ≤ d(fx_n, fz) · d(fx_n, x_n) ≤ M_λ(x_n, z) · d(fx_n, x_n)
= max{d(x_n, z)^λ, d(fx_n, x_n)^λ, d(fz, z)^λ, [d(z, fx_n) · d(fz, x_n)]^{λ/2}} · d(fx_n, x_n)
= max{d(x_n, z)^λ, d(fx_n, x_n)^λ, [d(fx_n, z) · d(z, x_n)]^{λ/2}} · d(fx_n, x_n)
≤ max{d(x_n, z)^λ, d(fx_n, x_n)^λ, [d(fx_n, x_n) · d(x_n, z)²]^{λ/2}} · d(fx_n, x_n).   (15.5)

Now, if max{d(x_n, z)^λ, d(fx_n, x_n)^λ, [d(fx_n, x_n) · d(x_n, z)²]^{λ/2}} = d(x_n, z)^λ, then (15.5) implies d(z, x_n) ≤ d(x_n, z)^λ · d(fx_n, x_n), which implies d(x_n, z) ≤ d(fx_n, x_n)^{1/(1−λ)}. Taking the limit as n → ∞ yields d(z, x_n) → 1. Also, if the maximum equals d(fx_n, x_n)^λ, then (15.5) implies d(z, x_n) ≤ d(fx_n, x_n)^λ · d(fx_n, x_n) = d(fx_n, x_n)^{λ+1}. Taking the limit as n → ∞ yields d(z, x_n) → 1. Finally, when the maximum equals [d(fx_n, x_n) · d(x_n, z)²]^{λ/2}, (15.5) implies d(z, x_n) ≤ [d(fx_n, x_n) · d(x_n, z)²]^{λ/2} · d(fx_n, x_n), which yields d(z, x_n) ≤ [d(fx_n, x_n)]^{(λ+2)/(2(1−λ))}. Taking the limit as n → ∞ gives d(z, x_n) → 1. Hence the fixed point problem of f is well-posed. □

Theorem 15.6 Let (X, d) be a complete multiplicative metric space and A_1, A_2, ..., A_m be non-empty closed subsets of X with X = ⋃_{i=1}^{m} A_i. Suppose that f : X → X satisfies

(1) ⋃_{i=1}^{m} A_i is a cyclic representation with respect to f,


(2) for all (x, y) ∈ A_i × A_{i+1}, with A_{m+1} = A_1, and some λ_i ≥ 0 for i ∈ {1, 2, 3, 4, 5} such that λ_1 + λ_2 + λ_3 + λ_4 + 2λ_5 < 1, d(fx, fy) ≤ M*(x, y), where

M*(x, y) = d(x, y)^{λ_1} · d(fx, x)^{λ_2} · d(y, fy)^{λ_3} · d(fx, y)^{λ_4} · d(fy, x)^{λ_5}.

Then the fixed point problem of f is well-posed.

Proof It follows from Theorem 15.4 that f has a unique fixed point z ∈ ⋂_{i=1}^{m} A_i. Let us take a sequence {x_n} in ⋃_{i=1}^{m} A_i such that d(fx_n, x_n) → 1 as n → ∞. Then we have

d(z, x_n) ≤ d(fx_n, fz) · d(fx_n, x_n) ≤ M*(x_n, z) · d(fx_n, x_n)
= d(x_n, z)^{λ_1} · d(fx_n, x_n)^{λ_2} · d(fz, z)^{λ_3} · d(fx_n, z)^{λ_4} · d(fz, x_n)^{λ_5} · d(fx_n, x_n)
= d(x_n, z)^{λ_1} · d(fx_n, x_n)^{λ_2} · d(z, z)^{λ_3} · d(fx_n, z)^{λ_4} · d(z, x_n)^{λ_5} · d(fx_n, x_n)
≤ d(x_n, z)^{λ_1} · d(fx_n, x_n)^{λ_2} · d(fx_n, x_n)^{λ_4} · d(x_n, z)^{λ_4} · d(z, x_n)^{λ_5} · d(fx_n, x_n)
= d(x_n, z)^{λ_1+λ_4+λ_5} · d(fx_n, x_n)^{λ_2+λ_4+1},

which further implies d(z, x_n) ≤ [d(fx_n, x_n)]^{(1+λ_2+λ_4)/(1−λ_1−λ_4−λ_5)}. Taking the limit as n → ∞ yields d(z, x_n) → 1. Hence the fixed point problem of f is well-posed. □

15.2.3 Limit Shadowing Property for Cyclic Contractive Maps

In this section, we study the limit shadowing property of cyclic contractive self-maps on a multiplicative metric space.

Definition 15.10 Let (X, d) be a multiplicative metric space. The self-map f : X → X is said to have the limit shadowing property if for any convergent sequence {x_n} in X such that d(x_{n+1}, fx_n) → 1 as n → ∞, there is y ∈ X such that d(x_n, fⁿy) → 1 as n → ∞.

Theorem 15.7 Let (X, d) be a complete multiplicative metric space and A_1, A_2, ..., A_m be non-empty closed subsets of X with X = ⋃_{i=1}^{m} A_i. Suppose that f : X → X satisfies

(1) ⋃_{i=1}^{m} A_i is a cyclic representation with respect to f,
(2) for all (x, y) ∈ A_i × A_{i+1}, with A_{m+1} = A_1, and some λ ∈ [0, 1), d(fx, fy) ≤ M_λ(x, y), where

M_λ(x, y) = max{d(x, y)^λ, d(fx, x)^λ, d(fy, y)^λ, [d(fx, y) · d(fy, x)]^{λ/2}}.


Then f has the limit shadowing property.

Proof By Theorem 15.3, the map f has a unique fixed point z in ⋂_{i=1}^{m} A_i. Let y be the limit of {x_n} in X with d(x_{n+1}, fx_n) → 1 as n → ∞. Then we have

d(x_{n+1}, z) ≤ d(x_{n+1}, fx_n) · d(fx_n, fz) ≤ d(x_{n+1}, fx_n) · M_λ(x_n, z)
= d(x_{n+1}, fx_n) · max{d(x_n, z)^λ, d(fx_n, x_n)^λ, d(z, fz)^λ, [d(x_n, fz) · d(z, fx_n)]^{λ/2}}
= d(x_{n+1}, fx_n) · max{d(x_n, z)^λ, d(fx_n, x_n)^λ, [d(x_n, z) · d(z, fx_n)]^{λ/2}}
≤ d(x_{n+1}, fx_n) · max{d(x_n, z)^λ, [d(fx_n, x_{n+1}) · d(x_n, x_{n+1})]^λ, [d(x_n, z) · d(z, x_{n+1}) · d(x_{n+1}, fx_n)]^{λ/2}}.

Taking the limit as n → ∞ yields d(y, z) ≤ d(y, z)^λ, which is possible only when d(y, z) = 1, and hence y = z. Thus d(x_n, fⁿy) = d(x_n, z) → 1 as n → ∞. □

Theorem 15.8 Let (X, d) be a complete multiplicative metric space and A_1, A_2, ..., A_m be non-empty closed subsets of X with X = ⋃_{i=1}^{m} A_i. Suppose that f : X → X satisfies

(1) ⋃_{i=1}^{m} A_i is a cyclic representation with respect to f,
(2) for all (x, y) ∈ A_i × A_{i+1}, with A_{m+1} = A_1, and some λ_i ≥ 0, i ∈ {1, 2, 3, 4, 5}, obeying λ_1 + λ_2 + λ_3 + λ_4 + 2λ_5 < 1, it holds that d(fx, fy) ≤ M*(x, y), where

M*(x, y) = d(x, y)^{λ_1} · d(fx, x)^{λ_2} · d(y, fy)^{λ_3} · d(fx, y)^{λ_4} · d(fy, x)^{λ_5}.

Then f has the limit shadowing property.

Proof It follows from Theorem 15.4 that f has a unique fixed point z in ⋂_{i=1}^{m} A_i. Let y be the limit of {x_n} in X with d(x_{n+1}, fx_n) → 1 as n → ∞. Then we have

d(x_{n+1}, z) ≤ d(x_{n+1}, fx_n) · d(fx_n, fz) ≤ d(x_{n+1}, fx_n) · M*(x_n, z)
= d(x_{n+1}, fx_n) · d(x_n, z)^{λ_1} · d(fx_n, x_n)^{λ_2} · d(fz, z)^{λ_3} · d(fx_n, z)^{λ_4} · d(fz, x_n)^{λ_5}
= d(x_{n+1}, fx_n) · d(x_n, z)^{λ_1} · d(fx_n, x_n)^{λ_2} · d(fx_n, z)^{λ_4} · d(z, x_n)^{λ_5}
≤ d(x_{n+1}, fx_n) · d(x_n, z)^{λ_1} · d(fx_n, x_n)^{λ_2} · d(fx_n, x_n)^{λ_4} · d(x_n, z)^{λ_4} · d(z, x_n)^{λ_5}.

Taking the limit as n → ∞ yields d(y, z) ≤ d(y, z)^{λ_1+λ_4+λ_5}, which gives d(y, z) = 1 and hence y = z. Thus d(x_n, fⁿy) = d(x_n, z) → 1 as n → ∞. □
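The limit shadowing property of Theorems 15.7 and 15.8 can be observed numerically for the map of Example 15.6 (our own illustration, not part of the chapter): a convergent pseudo-orbit, i.e. one with d(x_{n+1}, fx_n) → 1, ends up tracking the true orbit fⁿy of any point y ∈ [0, 2), since fⁿy = 1/7 for all n ≥ 1:

```python
import math

d = lambda x, y: math.exp(abs(x - y))   # multiplicative metric on [0, 2]
f = lambda x: 1 / 7 if x < 2 else 0.0   # map of Example 15.6

# A convergent "pseudo-orbit": x_{n+1} is a small perturbation of f(x_n),
# so d(x_{n+1}, f x_n) = e^{1/(10n)} -> 1
xs = [1.0]
for n in range(1, 100):
    xs.append(f(xs[-1]) + 1 / (10 * n))

# Shadowing point y: here any y in [0, 2) works, since f^n y = 1/7 for n >= 1
y = 0.3
orbit = [y]
for _ in range(99):
    orbit.append(f(orbit[-1]))

# d(x_n, f^n y) -> 1
assert abs(d(xs[-1], orbit[-1]) - 1.0) < 0.01
```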


15.2.4 Periodic Points of Cyclic Contractive Maps

If a point u is a fixed point of a self-map f, then u is also a fixed point of fⁿ for every n > 1 in N. However, the converse is not always true.

Definition 15.11 For a non-empty set X and a self-map f : X → X, if F(f) = F(fⁿ) holds for each n ∈ N, then we say that f has property P.

Jeong and Rhoades [16] showed that maps satisfying many contractive conditions have property P. Abbas and Rhoades [6] studied the same problem in cone metric spaces (see also [18, 23]). Chaipunya et al. [10] studied the property P and periodic points of order ∞. Chen, Karapınar and Rakočević [11] considered mappings satisfying a contractive condition in the setting of generalized quasi metric spaces. It is also interesting to mention that, in the totally different context of the interplay of dynamical systems and C*-algebras, Silvestrov and Tomiyama [26] obtained several general equivalent conditions for the coincidence of the sets of recurrent and periodic points of homeomorphism dynamical systems of topological spaces, and also discussed some examples and classes of homeomorphism dynamical systems satisfying the property of coincidence of the sets of periodic and fixed points (property P). In this section, we study the property P of cyclic contractive self-maps.

Theorem 15.9 Let (X, d) be a complete multiplicative metric space and A_1, A_2, ..., A_m be non-empty closed subsets of X with X = ⋃_{i=1}^{m} A_i. Suppose that f : X → X satisfies

(1) ⋃_{i=1}^{m} A_i is a cyclic representation with respect to f,
(2) for all (x, y) ∈ A_i × A_{i+1}, with A_{m+1} = A_1, and some λ ∈ [0, 1), d(fx, fy) ≤ M_λ(x, y), where

M_λ(x, y) = max{d(x, y)^λ, d(fx, x)^λ, d(fy, y)^λ, [d(fx, y) · d(fy, x)]^{λ/2}}.

Then f has property P.

Proof From Theorem 15.3, we have F(f) ≠ ∅. The case n = 1 is trivially true, so assume n > 1 and let u ∈ F(fⁿ). Then

d(u, fu) = d(f(f^{n−1}u), f(fⁿu)) ≤ M_λ(f^{n−1}u, fⁿu)
= max{d(f^{n−1}u, fⁿu)^λ, d(fⁿu, f^{n−1}u)^λ, d(f^{n+1}u, fⁿu)^λ, [d(f^{n−1}u, f^{n+1}u) · d(fⁿu, fⁿu)]^{λ/2}}
= max{d(f^{n−1}u, u)^λ, d(u, f^{n+1}u)^λ, [d(f^{n−1}u, f^{n+1}u)]^{λ/2}}
= max{d(f^{n−1}u, u)^λ, d(u, fu)^λ, d(f^{n−1}u, fu)^{λ/2}}
≤ max{d(f^{n−1}u, u)^λ, d(u, fu)^λ, [d(f^{n−1}u, u) · d(u, fu)]^{λ/2}}.

If max{d(f^{n−1}u, u)^λ, d(u, fu)^λ, [d(f^{n−1}u, u) · d(u, fu)]^{λ/2}} = d(f^{n−1}u, u)^λ, then

d(u, fu) ≤ d(f^{n−1}u, u)^λ = d(f^{n−1}u, fⁿu)^λ ≤ d(f^{n−2}u, f^{n−1}u)^{λ²} ≤ ⋯ ≤ d(u, fu)^{λⁿ},

and since λⁿ < 1 this gives d(u, fu) = 1, that is, u = fu. If the maximum equals d(u, fu)^λ, then d(fu, u) ≤ d(u, fu)^λ and u = fu. Finally, when the maximum equals [d(f^{n−1}u, u) · d(u, fu)]^{λ/2}, we get d(u, fu) ≤ [d(f^{n−1}u, u) · d(u, fu)]^{λ/2}, which implies d(u, fu) ≤ d(f^{n−1}u, u)^h, where h = λ/(2 − λ) < 1, and continuing this way we get

d(u, fu) ≤ d(f^{n−1}u, u)^h = d(f^{n−1}u, fⁿu)^h ≤ d(f^{n−2}u, f^{n−1}u)^{h²} ≤ ⋯ ≤ d(u, fu)^{hⁿ},

and hence u = fu. Thus in all cases we have u = fu, and hence F(fⁿ) = F(f). □

Theorem 15.10 Let (X, d) be a complete multiplicative metric space and A_1, A_2, ..., A_m be non-empty closed subsets of X with X = ⋃_{i=1}^{m} A_i. Suppose that f : X → X satisfies

(1) ⋃_{i=1}^{m} A_i is a cyclic representation with respect to f,
(2) for all (x, y) ∈ A_i × A_{i+1}, with A_{m+1} = A_1, and some λ_i ≥ 0 for i ∈ {1, 2, 3, 4, 5} such that λ_1 + λ_2 + λ_3 + λ_4 + 2λ_5 < 1, it holds that d(fx, fy) ≤ M*(x, y), where

M*(x, y) = d(x, y)^{λ_1} · d(fx, x)^{λ_2} · d(y, fy)^{λ_3} · d(fx, y)^{λ_4} · d(fy, x)^{λ_5}.

Then f has property P.

Proof It follows from Theorem 15.4 that f has a fixed point. The case n = 1 is trivially true, so assume n > 1 and let u ∈ F(fⁿ). Now,


d(u, fu) = d(f(f^{n−1}u), f(fⁿu)) ≤ M*(f^{n−1}u, fⁿu)
= d(f^{n−1}u, fⁿu)^{λ_1} · d(f(f^{n−1}u), f^{n−1}u)^{λ_2} · d(f(fⁿu), fⁿu)^{λ_3} · d(f(f^{n−1}u), fⁿu)^{λ_4} · d(f^{n−1}u, f(fⁿu))^{λ_5}
= d(f^{n−1}u, u)^{λ_1} · d(u, f^{n−1}u)^{λ_2} · d(fu, u)^{λ_3} · d(u, u)^{λ_4} · d(f^{n−1}u, fu)^{λ_5}
≤ d(f^{n−1}u, u)^{λ_1+λ_2} · d(fu, u)^{λ_3} · [d(f^{n−1}u, u) · d(u, fu)]^{λ_5}
= d(f^{n−1}u, u)^{λ_1+λ_2+λ_5} · d(fu, u)^{λ_3+λ_5},

which implies d(u, fu) ≤ d(f^{n−1}u, u)^τ, where τ = (λ_1 + λ_2 + λ_5)/(1 − λ_3 − λ_5) < 1. Thus

d(u, fu) ≤ d(f^{n−1}u, u)^τ = d(f^{n−1}u, fⁿu)^τ ≤ d(f^{n−2}u, f^{n−1}u)^{τ²} ≤ ⋯ ≤ d(u, fu)^{τⁿ},

which gives d(u, fu) = 1, and so u = fu. Hence F(fⁿ) = F(f). □
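Property P can be checked directly for the map of Example 15.6 (again an informal check of ours, not part of the chapter): on a finite grid, f and each iterate fⁿ have the same fixed point set {1/7}:

```python
f = lambda x: 1 / 7 if x < 2 else 0.0   # map of Example 15.6

def iterate(f, n):
    """Return the n-fold composition f^n."""
    def fn(x):
        for _ in range(n):
            x = f(x)
        return x
    return fn

# Grid on [0, 2] including the fixed point 1/7 and the endpoint 2
grid = [i / 7 for i in range(15)] + [2.0]

def fixed_points(g):
    return {x for x in grid if abs(g(x) - x) < 1e-12}

F1 = fixed_points(f)
assert F1 == {1 / 7}

# F(f^n) = F(f) for several n, i.e. property P on this grid
for n in range(2, 6):
    assert fixed_points(iterate(f, n)) == F1
```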



Acknowledgements Talat Nazir is grateful to ERASMUS MUNDUS "Featured eUrope and South/south-east Asia mobility Network FUSION" and its Swedish node, the MAM research milieu in Mathematics and Applied Mathematics, Division of Mathematics and Physics, School of Education, Culture and Communication at Mälardalen University, for support and an excellent research and research education environment during his visits.

References

1. Abbas, M., Nazir, T., Romaguera, S.: Fixed point results for generalized cyclic contraction mappings in partial metric spaces. Rev. R. Acad. Cienc. Exactas Fis. Nat. Ser. A Math. 106, 287–297 (2012)
2. Agarwal, R.P., Alghamdi, M.A., Shahzad, N.: Fixed point theory for cyclic generalized contractions in partial metric spaces. Fixed Point Theory Appl. 2012, 40, 11 pp. (2012)
3. Akkouchi, M., Popa, V.: Well-posedness of fixed point problem for three mappings under strict contractive conditions. Bull. Math. Inform. Phys. Pet.-Gash Univ. Ploiesti 61(2), 1–10 (2009)
4. Aydi, H., Vetro, C., Sintunavarat, W., Kumam, P.: Coincidence and fixed points for contractions and cyclical contractions in partial metric spaces. Fixed Point Theory Appl. 2012, 124, 18 pp. (2012)
5. Abbas, M., Fisher, B., Nazir, T.: Well-posedness and periodic point property of mappings satisfying a rational inequality in an ordered complex valued metric space. Sci. Stud. Res. Ser. Math. Info. 22(1), 5–24 (2012)
6. Abbas, M., Rhoades, B.E.: Fixed and periodic point results in cone metric spaces. Appl. Math. Lett. 22, 511–515 (2009)
7. Bashirov, A.E., Kurpınar, E.M., Ozyapıcı, A.: Multiplicative calculus and its applications. J. Math. Anal. Appl. 337, 36–48 (2008)
8. Bashirov, A.E., Mısırlı, E., Tandoğdu, Y., Ozyapıcı, A.: On modeling with multiplicative differential equations. Appl. Math. J. Chin. Univ. 26(4), 425–438 (2011)
9. Chaipunya, P., Cho, Y.J., Sintunavarat, W., Kumam, P.: Fixed point and common fixed point theorems for cyclic quasi-contractions in metric and ultrametric spaces. Adv. Pure Math. 2, 401–407 (2012)
10. Chaipunya, P., Cho, Y.J., Kumam, P.: A remark on the property P and periodic points of order ∞. Mat. Vesnik 66(4), 357–363 (2014)
11. Chen, C., Karapınar, E., Rakočević, V.: Existence of periodic fixed point theorems in the setting of generalized quasi metric spaces. J. Appl. Math. 2014, 353765, 1–8 (2014)
12. Derafshpour, M., Rezapour, S.: On the existence of best proximity points of cyclic contractions. Adv. Dyn. Sys. Appl. 6(1), 33–40 (2011)
13. Florack, L., van Assen, H.: Multiplicative calculus in biomedical image analysis. J. Math. Imag. Vision 42(1), 64–75 (2012)
14. He, X., Song, M., Chen, D.: Common fixed points for weak commutative mappings on a multiplicative metric space. Fixed Point Theory Appl. 2014, 48, 9 pp. (2014)
15. Nashine, H.K., Sintunavarat, W., Kumam, P.: Cyclic generalized contractions and fixed point results with applications to an integral equation. Fixed Point Theory Appl. 2012, 217, 13 pp. (2012)
16. Jeong, G.S., Rhoades, B.E.: Maps for which F(T) = F(Tⁿ). Fixed Point Theory 6, 87–131 (2005)
17. Karapınar, E.: Fixed point theory for cyclic weak φ-contraction. Appl. Math. Lett. 24, 822–825 (2011)
18. Kumam, P., Rahimi, H., Rad, G.S.: The existence of fixed and periodic point theorems in cone metric type spaces. J. Nonlinear Sci. Appl. 7, 255–263 (2014)
19. Kirk, W.A., Srinivasan, P.S., Veeramani, P.: Fixed points for mappings satisfying cyclical contractive conditions. Fixed Point Theory 4(1), 79–89 (2003)
20. Özavşar, M., Çevikel, A.C.: Fixed points of multiplicative contraction mappings on multiplicative metric spaces (2012). arXiv:1205.5131v1 [math.GN]
21. Păcurar, M., Rus, I.A.: Fixed point theory for cyclic φ-contractions. Nonlinear Anal. 72(3–4), 1181–1187 (2010)
22. Piątek, B.: On cyclic Meir–Keeler contractions in metric spaces. Nonlinear Anal. 74, 35–40 (2011)
23. Rahimi, H., Rhoades, B.E., Radenović, S., Rad, G.S.: Fixed and periodic point theorems for T-contractions on cone metric spaces. Filomat 27(5), 881–888 (2013)
24. Reich, S., Zaslavski, A.J.: Well posedness of fixed point problems. Far East J. Math. Sci., Spec. Vol. Part 3, 393–401 (2001)
25. Rus, I.A.: Cyclic representations and fixed points. Ann. T. Popoviciu Seminar Funct. Eq. Approx. Convexity 3, 171–178 (2005)
26. Silvestrov, S.D., Tomiyama, J.: Topological dynamical systems of type I. Expo. Math. 20(2), 117–142 (2002)
27. Yamaod, O., Sintunavarat, W.: Some fixed point results for generalized contraction mappings with cyclic (α, β)-admissible mapping in multiplicative metric spaces. J. Inequal. Appl. 2014(488), 1–15 (2014)

Chapter 16

Fixed Points of T-Hardy Rogers Type Mappings and Coupled Fixed Point Results in Multiplicative Metric Spaces Talat Nazir and Sergei Silvestrov

Abstract The fixed point results of T-Hardy Rogers type mappings satisfying generalized contractive conditions in the setup of multiplicative metric spaces are investigated. The well-posedness and limit shadowing property of T-Hardy Rogers type mappings are also established. Furthermore, the periodic point property of these contraction mappings is shown. Several examples are presented to show the validity of the main results. Coupled fixed point results are also obtained in multiplicative metric spaces. An application to solving integral equations is established in the framework of multiplicative metric spaces.

Keywords Fixed point · Periodic point · T-Hardy Rogers type mapping · Limit shadowing property · Multiplicative metric space

MSC 2020 47H09 · 47H10 · 54C60 · 54H25

16.1 Introduction

Ozavsar and Cevikel [24] proved an analogue of the Banach contraction principle in the framework of multiplicative metric spaces. They also obtained topological properties in the setup of multiplicative metric spaces. Bashirov et al. [6] studied the concept of multiplicative calculus and proved a fundamental theorem of multiplicative calculus. They also illustrated the usefulness of multiplicative calculus with many interesting applications. Multiplicative calculus provides a natural and straightforward way to compute the derivative of the product and quotient of two functions [6].

T. Nazir (B) Department of Mathematical Sciences, University of South Africa, Florida 0003, South Africa, e-mail: [email protected]
S. Silvestrov Division of Mathematics and Physics, School of Education, Culture and Communication, Mälardalen University, Box 883, 72123 Västerås, Sweden, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_16



Abbas et al. [1] obtained common fixed points of locally contractive mappings in multiplicative metric spaces, with an application. Several useful fixed point results in multiplicative metric spaces appeared in [7, 13, 15, 22, 32]. Order-oriented fixed point theory is studied in the framework created by a class of partially ordered sets with appropriate mappings satisfying certain order conditions such as monotonicity, expansivity or order continuity. Existence of fixed points in partially ordered metric spaces was first investigated in 2004 by Ran and Reurings [26], and then by Nieto and Lopez [23]. Further results in this direction under different contractive conditions were proved in [2, 3, 5, 12, 21]. This section deals with the study of fixed point results for T-Hardy Rogers type contraction mappings in ordered multiplicative metric spaces.

Throughout this work, R, R_{>0} and N denote the sets of real numbers, positive real numbers and natural numbers, respectively. The following definitions are needed in the sequel. We start with the definition of a multiplicative metric space [24].

Definition 1 (multiplicative metric space) Let X be a non-empty set. A mapping d : X × X → R_{>0} = {t ∈ R | t > 0} is said to be a multiplicative metric on X if for any x, y, z ∈ X the following conditions hold:
(i) d(x, y) ≥ 1, and d(x, y) = 1 if and only if x = y;
(ii) d(x, y) = d(y, x);
(iii) d(x, y) ≤ d(x, z) · d(z, y).
The pair (X, d) is called a multiplicative metric space.

Definition 2 Let (X, d) be a multiplicative metric space. A mapping T : X → X is
(i) continuous at a point x in X if for any {x_n} in X convergent to x, the sequence {Tx_n} converges to Tx, that is, d(x_n, x) → 1 as n → ∞ implies d(Tx_n, Tx) → 1 as n → ∞;
(ii) sequentially convergent if for any sequence {y_n} in X such that {Ty_n} is convergent in X, the sequence {y_n} is convergent in X, that is, d(Ty_n, x) → 1 as n → ∞ implies the existence of y ∈ X such that d(y_n, y) → 1 as n → ∞;
(iii) subsequentially convergent if for any sequence {y_n} in X such that {Ty_n} is convergent in X, the sequence {y_n} has a convergent subsequence in X, that is, d(Ty_n, x) → 1 as n → ∞ implies the existence of y ∈ X and a subsequence {y_{n_i}} of {y_n} such that d(y_{n_i}, y) → 1 as i → ∞.

Definition 3 Let X be any non-empty set. The triplet (X, ⪯, d) is called an ordered multiplicative metric space if (i) d is a multiplicative metric on X, and (ii) ⪯ is a partial order on X.

Definition 4 Let (X, ⪯) be a partially ordered set. Elements u, v ∈ X are called comparable when u ⪯ v or v ⪯ u holds.

16 Fixed Points of T-Hardy Rogers Type Mappings …


Definition 5 A subset Y of a partially ordered set X is said to be well ordered if every two elements of Y are comparable. By F(f) we denote the set of all fixed points of a self-map f on X, that is, u ∈ F(f) if and only if u = f(u).
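Definition 1 is easy to test mechanically. The following sketch (ours, not from the text) takes the standard example d(x, y) = e^{|x−y|} and checks the three axioms on random triples:

```python
import math, random

def d(x, y):
    """Multiplicative metric on R: d(x, y) = e^{|x - y|}."""
    return math.exp(abs(x - y))

random.seed(0)
for _ in range(1000):
    x, y, z = (random.uniform(0, 2) for _ in range(3))
    assert d(x, y) >= 1                                 # (i)  d(x, y) >= 1
    assert (d(x, y) == 1) == (x == y)                   # (i)  d(x, y) = 1 iff x = y
    assert math.isclose(d(x, y), d(y, x))               # (ii) symmetry
    assert d(x, y) <= d(x, z) * d(z, y) * (1 + 1e-12)   # (iii) multiplicative triangle
```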

16.2 Fixed Points of T-Hardy Rogers Type Contractive Maps

In this section, we obtain fixed point results for self-maps satisfying a T-Hardy Rogers type contraction condition defined on a complete multiplicative metric space.

Theorem 1 Let (X, ⪯, d) be an ordered complete multiplicative metric space. Let T : X → X be injective and continuous, and let f : X → X be a non-decreasing mapping satisfying, for all comparable u, v ∈ X with u ≠ v,

d(Tfu, Tfv) ≤ d(Tu, Tv)^α · [d(Tv, Tfv) · d(Tu, Tfu)]^β · [d(Tfu, Tv) · d(Tfv, Tu)]^γ,   (16.1)

where α, β, γ are non-negative with α + 2β + 2γ < 1. If there exists x_0 ∈ X with x_0 ⪯ fx_0 and one of the following two conditions is satisfied:
(a) f is a continuous self-map on X;
(b) for any non-decreasing sequence {x_n} in X which converges to z, it holds that x_n ⪯ z for all n ∈ N,
then F(f) ≠ ∅, provided that T is subsequentially or sequentially convergent. Furthermore, F(f) is a singleton if and only if F(f) is well ordered.

Proof As f is non-decreasing, we have x_1 = fx_0 ⪯ f²x_0 ⪯ f³x_0 ⪯ ⋯ ⪯ fⁿx_0 ⪯ f^{n+1}x_0 ⪯ ⋯. Define a sequence {x_n} in X with x_n = fⁿx_0, so x_{n+1} = fx_n for n ∈ N. If x_n = fx_n for some n, then x_n is a fixed point. So we assume that x_n ≠ fx_n for all n ∈ N. Since x_n ⪯ x_{n+1} for all n ∈ N, replacing x by x_n and y by x_{n+1} in (16.1) gives

d(Tx_{n+1}, Tx_{n+2}) = d(Tfx_n, Tfx_{n+1})
≤ d(Tx_n, Tx_{n+1})^α · [d(Tx_n, Tfx_n) · d(Tx_{n+1}, Tfx_{n+1})]^β · [d(Tfx_n, Tx_{n+1}) · d(Tfx_{n+1}, Tx_n)]^γ
= d(Tx_n, Tx_{n+1})^α · [d(Tx_n, Tx_{n+1}) · d(Tx_{n+1}, Tx_{n+2})]^β · [d(Tx_{n+1}, Tx_{n+1}) · d(Tx_{n+2}, Tx_n)]^γ
= d(Tx_n, Tx_{n+1})^α · d(Tx_{n+1}, Tx_n)^β · d(Tx_{n+2}, Tx_{n+1})^β · d(Tx_{n+2}, Tx_n)^γ


≤ d(Tx_n, Tx_{n+1})^{α+β} · d(Tx_{n+2}, Tx_{n+1})^β · d(Tx_{n+1}, Tx_n)^γ · d(Tx_{n+1}, Tx_{n+2})^γ
= d(Tx_n, Tx_{n+1})^{α+β+γ} · d(Tx_{n+1}, Tx_{n+2})^{β+γ},

which implies d(Tx_{n+1}, Tx_{n+2}) ≤ d(Tx_n, Tx_{n+1})^λ, where λ = (α + β + γ)/(1 − β − γ). Obviously, λ ∈ [0, 1). Therefore, for all n ∈ N,

d(Tx_n, Tx_{n+1}) ≤ d(Tx_{n−1}, Tx_n)^λ ≤ d(Tx_{n−2}, Tx_{n−1})^{λ²} ≤ ⋯ ≤ d(Tx_0, Tx_1)^{λⁿ}.

Now for m, n ≥ 1 with n < m, we obtain

d(Tx_m, Tx_n) ≤ d(Tx_n, Tx_{n+1}) · d(Tx_{n+1}, Tx_{n+2}) · ⋯ · d(Tx_{m−1}, Tx_m)
≤ d(Tx_0, Tx_1)^{λⁿ} · d(Tx_0, Tx_1)^{λ^{n+1}} · ⋯ · d(Tx_0, Tx_1)^{λ^{m−1}}
= d(Tx_0, Tx_1)^{λⁿ + λ^{n+1} + ⋯ + λ^{m−1}} ≤ d(Tx_0, Tx_1)^{λⁿ/(1−λ)},

which implies d(Tx_n, Tx_m) → 1 as m, n → ∞. Hence {Tx_n} is a multiplicative Cauchy sequence. As X is complete, there exists z ∈ X such that {Tx_n} converges to z as n → ∞.

Suppose that T is subsequentially convergent. Then {x_n} has a convergent subsequence {x_{n_i}} in X with lim_{i→∞} x_{n_i} = u for some u ∈ X. As T is continuous, lim_{i→∞} Tx_{n_i} = Tu, and uniqueness of the limit implies that Tu = z. In case f is continuous on X, then Tfu = Tu; as T is injective, we have fu = u. If f is not continuous, then by the given assumption we have fx_{n_i} ⪯ u, that is, x_{n_i+1} ⪯ u, for all i ∈ N. So from (16.1),

d(Tx_{n_i+1}, Tfu) = d(Tfx_{n_i}, Tfu)
≤ d(Tx_{n_i}, Tu)^α · [d(Tx_{n_i}, Tfx_{n_i}) · d(Tfu, Tu)]^β · [d(Tfx_{n_i}, Tu) · d(Tfu, Tx_{n_i})]^γ
= d(Tx_{n_i}, Tu)^α · [d(Tx_{n_i}, Tx_{n_i+1}) · d(Tu, Tfu)]^β · [d(Tx_{n_i+1}, Tu) · d(Tfu, Tx_{n_i})]^γ.

On taking the limit as i → ∞ we obtain

d(Tu, Tfu) ≤ d(Tu, Tu)^α · [d(Tu, Tu) · d(Tu, Tfu)]^β · [d(Tu, Tu) · d(Tfu, Tu)]^γ = d(Tu, Tfu)^{β+γ},

which, since β + γ < 1, yields d(Tu, Tfu) = 1 and Tu = Tfu. Now injectivity of T gives u = fu. Following similar arguments, the result holds when T is sequentially convergent.

Now suppose that F(f) is well ordered. We show that f has a unique fixed point in X. Suppose w is another fixed point of f. As u and w are comparable and x_{n_i} ⪯ u, from (16.1) we have

d(Tx_{n_i+1}, Tw) = d(Tfx_{n_i}, Tfw)
≤ d(Tx_{n_i}, Tw)^α · [d(Tx_{n_i}, Tfx_{n_i}) · d(Tw, Tfw)]^β · [d(Tfx_{n_i}, Tw) · d(Tfw, Tx_{n_i})]^γ
= d(Tx_{n_i}, Tw)^α · [d(Tx_{n_i}, Tx_{n_i+1}) · d(Tw, Tfw)]^β · [d(Tx_{n_i+1}, Tw) · d(Tfw, Tx_{n_i})]^γ,

and passing to the limit as i → ∞ we obtain

d(Tu, Tw) ≤ d(Tu, Tw)^α · [d(Tu, Tu) · d(Tw, Tfw)]^β · [d(Tu, Tw) · d(Tfw, Tu)]^γ
= d(Tu, Tw)^α · d(Tw, Tw)^β · [d(Tu, Tw) · d(Tw, Tu)]^γ = d(Tu, Tw)^{α+2γ},

and hence d(Tw, Tu) = 1. Since T is injective, w = u. □

Corollary 1 Let (X, ⪯, d) be an ordered complete multiplicative metric space. Let T : X → X be injective and continuous, and let f : X → X be a non-decreasing mapping satisfying, for all comparable u, v ∈ X with u ≠ v,

d(Tf^m u, Tf^m v) ≤ d(Tu, Tv)^α · [d(Tv, Tf^m v) · d(Tu, Tf^m u)]^β · [d(Tf^m u, Tv) · d(Tf^m v, Tu)]^γ,

where α, β, γ are non-negative with α + 2β + 2γ < 1 and m ∈ N. If there exists x_0 ∈ X with x_0 ⪯ f^m x_0 and one of the following two conditions is satisfied:
(a) f^m is a continuous self-map on X;
(b) for any non-decreasing sequence {x_n} in X which converges to z, it holds that x_n ⪯ z for all n ∈ N,
then F(f) ≠ ∅, provided that T is subsequentially or sequentially convergent. Furthermore, F(f) is a singleton if and only if F(f) is well ordered.

Proof It follows from Theorem 1 that f^m has a unique fixed point z. Now f(z) = f(f^m(z)) = f^{m+1}(z) = f^m(f(z)) implies that fz is also a fixed point of f^m, so by uniqueness z = fz. □

Corollary 2 Let (X, ⪯, d) be an ordered complete multiplicative metric space, and let f : X → X be a non-decreasing mapping satisfying, for all comparable u, v ∈ X with u ≠ v,

d(fu, fv) ≤ d(u, v)^α · [d(u, fu) · d(v, fv)]^β · [d(fu, v) · d(fv, u)]^γ,

where α, β, γ are non-negative with α + 2β + 2γ < 1. If there exists u_0 ∈ X with u_0 ⪯ fu_0 and one of the following two conditions is satisfied:
(a) f is a continuous self-map on X;


(b) for any non-decreasing sequence {xn } in X which converges to z, it holds that xn  z for all n ∈ N, then F ( f ) = ∅. Furthermore, F ( f ) is singleton if and only if F ( f ) is well ordered. Proof By taking T as identity map, result follows from Theorem 1. Theorem 2 Let (X, , d) be ordered complete multiplicative metric space. Let T : X → X be injective, continuous and f : X → X be non-decreasing mapping satisfying for all comparable u, v ∈ X , u = v, d (T f u, T f v) ≤ max{d (T u, T v) , [d (T v, T f v) · d (T u, T f u)]1/2 , [d (T f u, T v) · d (T f v, T u)]}λ (16.2) where λ ∈ [0, 1). If there exists x0 ∈ X with x0  f x0 and one of the following two conditions is satisfied: (a) f is continuous self-map on X ; (b) for any non-decreasing sequence {xn } in X which converges to z, it holds that xn  z for all n ∈ N, then F ( f ) = ∅, provided that T is subsequentially or sequentially convergent. Furthermore, F ( f ) is singleton if and only if F ( f ) is well ordered. Proof As f is non-decreasing, we have x1 = f x0  f 2 x0  f 3 x0  · · ·  f n x0  f n+1 x0  . . . . Define a sequence {xn } in X with xn = f n x0 and so xn+1 = f xn for n ∈ N. If xn = f xn for some n, then xn is a fixed point. So we assume that xn = f xn for all n ∈ N. Since xn  xn+1 for all n ∈ N, by replacing x by xn , y by xn+1 in (16.2), we have d(T xn+1 , T xn+2 ) = d(T f xn , T f xn+1 )

  ≤ max{d(T xn , T xn+1 ), [d(T f xn , T xn ) · d T xn+1 , T f xn+1 ]1/2 ,

[d(T f xn , T xn+1 ) · d(T f xn+1 , T xn )]1/2 }λ = max{d(T xn , T xn+1 ), [d(T xn , T xn+1 ) · d(T xn+1 , T xn+2 )]1/2 ,   [d T xn+1 , T xn+1 · d(T xn+2 , T xn )]1/2 }λ = max{d(T xn , T xn+1 ), [d(T xn , T xn+1 ) · d(T xn+1 , T xn+2 )]1/2 , d(T xn+2 , T xn )1/2 }λ =: ST (xn , xn+1 ) (just notation).

Now, different possibilities arise. If S_T(x_n, x_{n+1}) = d(T x_n, T x_{n+1}), then d(T x_{n+1}, T x_{n+2}) ≤ d(T x_n, T x_{n+1})^λ.

16 Fixed Points of T-Hardy Rogers Type Mappings …


Also, if S_T(x_n, x_{n+1}) = [d(T x_n, T x_{n+1}) · d(T x_{n+1}, T x_{n+2})]^{1/2}, then

d(T x_{n+1}, T x_{n+2}) ≤ [d(T x_n, T x_{n+1}) · d(T x_{n+1}, T x_{n+2})]^{λ/2},

so d(T x_{n+1}, T x_{n+2}) ≤ d(T x_n, T x_{n+1})^λ. If S_T(x_n, x_{n+1}) = d(T x_{n+2}, T x_n)^{1/2}, then

d(T x_{n+1}, T x_{n+2}) ≤ d(T x_{n+2}, T x_n)^{λ/2} ≤ [d(T x_n, T x_{n+1}) · d(T x_{n+1}, T x_{n+2})]^{λ/2},

and thus d(T x_{n+1}, T x_{n+2}) ≤ d(T x_n, T x_{n+1})^λ. Hence, in all cases, for all n ∈ N,

d(T x_{n+1}, T x_{n+2}) ≤ d(T x_n, T x_{n+1})^λ.

Iterating this inequality,

d(T x_{n+1}, T x_{n+2}) ≤ d(T x_n, T x_{n+1})^λ ≤ d(T x_{n−1}, T x_n)^{λ²} ≤ · · · ≤ d(T x_0, T x_1)^{λ^{n+1}}.

For m, n ≥ 1 with n < m, we obtain

d(T x_n, T x_m) ≤ d(T x_n, T x_{n+1}) · d(T x_{n+1}, T x_{n+2}) · … · d(T x_{m−1}, T x_m)
≤ d(T x_0, T x_1)^{λ^n} · d(T x_0, T x_1)^{λ^{n+1}} · … · d(T x_0, T x_1)^{λ^{m−1}}
= d(T x_0, T x_1)^{λ^n + λ^{n+1} + ··· + λ^{m−1}}
≤ d(T x_0, T x_1)^{λ^n/(1−λ)},

which implies d(T x_n, T x_m) → 1 as m, n → ∞. Hence {T x_n} is a multiplicative Cauchy sequence. As X is a complete space, there exists z ∈ X such that {T x_n} converges to z as n → ∞.

Suppose that T is subsequentially convergent. Then {x_n} has a convergent subsequence {x_{n_i}} in X with lim_{i→∞} x_{n_i} = u for some u ∈ X. As T is continuous, lim_{i→∞} T x_{n_i} = Tu, and uniqueness of the limit implies that Tu = z. If f is continuous on X, it follows that T f u = Tu; as T is injective, we have f u = u. If f is not continuous, then by the given assumption we have x_{n_i} ≼ u for all n_i ∈ N. So from (16.2),

d(T x_{n_i+1}, T f u) = d(T f x_{n_i}, T f u)
≤ max{d(T x_{n_i}, Tu), [d(Tu, T f u) · d(T x_{n_i}, T f x_{n_i})]^{1/2}, [d(T f x_{n_i}, Tu) · d(T f u, T x_{n_i})]^{1/2}}^λ,

and taking the limit as i → ∞ yields

d(Tu, T f u) ≤ max{d(Tu, Tu), [d(Tu, T f u) · d(Tu, Tu)]^{1/2}, [d(Tu, Tu) · d(T f u, Tu)]^{1/2}}^λ = d(Tu, T f u)^{λ/2}


which implies that d(Tu, T f u) = 1 and Tu = T f u. Injectivity of T gives u = f u. When T is sequentially convergent, the result follows by arguments similar to those given above.

Now suppose that F(f) is well ordered. We show that f has a unique fixed point in X. Suppose w is another fixed point of f. As u and w are comparable and x_{n_i} ≼ u, from (16.2) we have

d(T x_{n_i+1}, Tw) = d(T f x_{n_i}, T f w)
≤ max{d(T x_{n_i}, Tw), [d(T x_{n_i}, T f x_{n_i}) · d(Tw, T f w)]^{1/2}, [d(T f x_{n_i}, Tw) · d(T f w, T x_{n_i})]^{1/2}}^λ
= max{d(T x_{n_i}, Tw), [d(T x_{n_i}, T x_{n_i+1}) · d(Tw, Tw)]^{1/2}, [d(T x_{n_i+1}, Tw) · d(Tw, T x_{n_i})]^{1/2}}^λ,

and on passing to the limit as i → ∞ we obtain

d(Tu, Tw) ≤ max{d(Tu, Tw), [d(Tu, Tu) · d(Tw, Tw)]^{1/2}, [d(Tu, Tw) · d(Tw, Tu)]^{1/2}}^λ = d(Tu, Tw)^λ,

hence d(Tw, Tu) = 1. Since T is injective, w = u.

Example 1 Let X = {1, 2, 3} be endowed with the usual ordering. Define the map d : X × X → [1, ∞) by d(x, y) = e^{|1/x − 1/y|}. Let f, T : X → X be defined by

x      1  2  3
f(x)   1  1  2
T(x)   3  2  1

To check the contractive condition

d(T f x, T f y) ≤ max{d(T x, T y), [d(T x, T f x) · d(T f y, T y)]^{1/2}, [d(T f x, T y) · d(T f y, T x)]^{1/2}}^λ

for all x, y ∈ X with x ≠ y, the following cases arise:

(1) x = 1, y = 2: d(T f x, T f y) = d(3, 3) = e^0 = 1, so the condition holds trivially.

(2) x = 1, y = 3: d(T f x, T f y) = d(3, 2) = e^{1/6}, while
max{d(3, 1), [d(3, 3) · d(2, 1)]^{1/2}, [d(3, 1) · d(2, 3)]^{1/2}}^{1/2} = max{e^{2/3}, e^{1/4}, e^{5/12}}^{1/2} = e^{1/3} ≥ e^{1/6}.

(3) x = 2, y = 1: d(T f x, T f y) = d(3, 3) = 1, so the condition holds trivially.

(4) x = 2, y = 3: d(T f x, T f y) = d(3, 2) = e^{1/6}, while
max{d(2, 1), [d(2, 3) · d(2, 1)]^{1/2}, [d(3, 1) · d(2, 2)]^{1/2}}^{1/2} = max{e^{1/2}, e^{1/3}, e^{1/3}}^{1/2} = e^{1/4} ≥ e^{1/6}.

(5) x = 3, y = 1: d(T f x, T f y) = d(2, 3) = e^{1/6}, while
max{d(1, 3), [d(1, 2) · d(3, 3)]^{1/2}, [d(2, 3) · d(3, 1)]^{1/2}}^{1/2} = max{e^{2/3}, e^{1/4}, e^{5/12}}^{1/2} = e^{1/3} ≥ e^{1/6}.

(6) x = 3, y = 2: d(T f x, T f y) = d(2, 3) = e^{1/6}, while
max{d(1, 2), [d(1, 2) · d(3, 2)]^{1/2}, [d(2, 2) · d(3, 1)]^{1/2}}^{1/2} = max{e^{1/2}, e^{1/3}, e^{1/3}}^{1/2} = e^{1/4} ≥ e^{1/6}.

Hence the conditions of Theorem 2 hold with λ = 1/2. Moreover, 1 is the unique fixed point of f.
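The case analysis above is mechanical, so it can also be checked numerically. The following sketch hard-codes the maps f and T of Example 1 and verifies the contractive condition of Theorem 2 for every ordered pair x ≠ y with λ = 1/2 (the tolerance is only to guard against floating-point rounding):

```python
import math
from itertools import permutations

# Multiplicative metric on X = {1, 2, 3} from Example 1: d(x, y) = e^{|1/x - 1/y|}.
def d(x, y):
    return math.exp(abs(1 / x - 1 / y))

f = {1: 1, 2: 1, 3: 2}   # the non-decreasing map f
T = {1: 3, 2: 2, 3: 1}   # the injective map T
lam = 0.5                # the exponent lambda = 1/2 claimed in Example 1

def rhs(x, y):
    # max{d(Tx,Ty), [d(Tx,Tfx) d(Tfy,Ty)]^{1/2}, [d(Tfx,Ty) d(Tfy,Tx)]^{1/2}}^lam
    m = max(
        d(T[x], T[y]),
        math.sqrt(d(T[x], T[f[x]]) * d(T[f[y]], T[y])),
        math.sqrt(d(T[f[x]], T[y]) * d(T[f[y]], T[x])),
    )
    return m ** lam

# The contractive condition holds for every ordered pair x != y.
ok = all(d(T[f[x]], T[f[y]]) <= rhs(x, y) + 1e-12
         for x, y in permutations([1, 2, 3], 2))
print(ok)          # True
print(f[1] == 1)   # 1 is a fixed point of f
```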


16.2.1 Well-Posedness Results for T-Hardy Rogers Type Contractions

The notion of well-posedness of a fixed point problem has attracted the interest of several mathematicians. Recently, Abbas et al. [2] obtained well-posedness of mappings satisfying a rational inequality in an ordered complex valued metric space. Further results in this direction under different contractive conditions were proved in [4, 27]. Here we study the well-posedness of the fixed point problem for self-maps satisfying a T-Hardy Rogers type contraction in the setup of ordered multiplicative metric spaces.

Definition 6 Let (X, ≼, d) be an ordered multiplicative metric space. The fixed point problem of a self-map f : X → X is called well-posed if f has a unique fixed point, say x̂, in X, and for every sequence {x_n} in X such that lim_{n→∞} d(f x_n, x_n) = 1 it holds that lim_{n→∞} d(x_n, x̂) = 1.

Theorem 3 Let (X, ≼, d) be an ordered complete multiplicative metric space. Let T : X → X be injective and continuous and f : X → X be a non-decreasing mapping satisfying, for all comparable u, v ∈ X with u ≠ v,

d(T f u, T f v) ≤ d(Tu, Tv)^α · [d(Tv, T f v) · d(Tu, T f u)]^β · [d(T f u, Tv) · d(T f v, Tu)]^γ,

where α, β, γ are non-negative with α + 2β + 2γ < 1. If there exists x_0 ∈ X with x_0 ≼ f x_0 and one of the following two conditions is satisfied:
(a) f is a continuous self-map on X;
(b) for any non-decreasing sequence {x_n} in X which converges to z, it holds that x_n ≼ z for all n ∈ N,
and if, in addition, T is sequentially convergent and F(f) is well ordered, then the fixed point problem of f is well-posed.

Proof It follows from Theorem 1 that the mapping f has a unique fixed point, say z ∈ X. Let {x_n} be a sequence in X whose every term is comparable with z and lim_{n→∞} d(f x_n, x_n) = 1. Assume that z ≠ x_n for every non-negative integer n. Now,

d(Tz, T x_n) ≤ d(T x_n, T f x_n) · d(T f x_n, T f z)
= d(T x_n, T f x_n) · d(T x_n, Tz)^α · [d(T x_n, T f x_n) · d(Tz, T f z)]^β · [d(T f x_n, Tz) · d(T f z, T x_n)]^γ
= d(T x_n, T f x_n) · d(T x_n, Tz)^α · d(T x_n, T x_{n+1})^β · [d(T f x_n, Tz) · d(Tz, T x_n)]^γ
≤ d(T x_n, T f x_n) · d(T x_n, Tz)^α · d(T x_n, T x_{n+1})^β · [d(T f x_n, T x_n) · d(T x_n, Tz) · d(Tz, T x_n)]^γ
= d(T x_n, T f x_n)^{1+β+γ} · d(T x_n, Tz)^{α+2γ},

which further implies d(Tz, T x_n) ≤ [d(T x_n, T f x_n)]^{(1+β+γ)/(1−α−2γ)}; note that 1 − α − 2γ > 0 since α + 2β + 2γ < 1. Taking the limit as n → ∞ yields d(T x_n, Tz) → 1.
Assume that T is sequentially convergent; then {x_n} converges in X, that is, lim_{n→∞} x_n = u for some u ∈ X. Since T is continuous, lim_{n→∞} T x_n = Tu, and uniqueness of the limit gives Tu = Tz. The injectivity of T implies u = z. Hence lim_{n→∞} x_n = z and the fixed point problem of f is well-posed.

Theorem 4 Let (X, ≼, d) be an ordered complete multiplicative metric space. Let T : X → X be injective and continuous and f : X → X be a non-decreasing mapping satisfying, for all comparable u, v ∈ X with u ≠ v,

d(T f u, T f v) ≤ max{d(Tu, Tv), [d(Tv, T f v) · d(Tu, T f u)]^{1/2}, [d(T f u, Tv) · d(T f v, Tu)]^{1/2}}^λ,

where λ ∈ [0, 1). If there exists x_0 ∈ X with x_0 ≼ f x_0 and one of the following two conditions is satisfied:
(a) f is a continuous self-map on X;
(b) for any non-decreasing sequence {x_n} in X which converges to z, it holds that x_n ≼ z for all n ∈ N,
and if, in addition, T is sequentially convergent and F(f) is well ordered, then the fixed point problem of f is well-posed.

Proof It follows from Theorem 2 that the mapping f has a unique fixed point, say z ∈ X. Let {x_n} be a sequence in X whose every term is comparable with z and lim_{n→∞} d(f x_n, x_n) = 1. Suppose z ≠ x_n for every non-negative integer n. Then from the triangle inequality we have

d(Tz, T x_n) ≤ d(T f x_n, T f z) · d(T x_n, T f x_n)
≤ max{d(T x_n, Tz), [d(T x_n, T f x_n) · d(Tz, T f z)]^{1/2}, [d(T f x_n, Tz) · d(T f z, T x_n)]^{1/2}}^λ · d(T x_n, T f x_n)
= max{d(T x_n, Tz), [d(T x_n, T f x_n) · d(Tz, Tz)]^{1/2}, [d(T f x_n, Tz) · d(Tz, T x_n)]^{1/2}}^λ · d(T x_n, T f x_n)
≤ max{d(T x_n, Tz), d(T x_n, T f x_n)^{1/2}, d(T f x_n, T x_n)^{1/2}}^λ · d(T x_n, T f x_n)
= max{d(T x_n, Tz), d(T f x_n, T x_n)^{1/2}}^λ · d(T x_n, T f x_n).   (16.3)

Now if max{d(T x_n, Tz), d(T f x_n, T x_n)^{1/2}}^λ = d(T x_n, Tz)^λ, then (16.3) implies d(Tz, T x_n) ≤ d(T x_n, Tz)^λ · d(T x_n, T f x_n), and taking the limit as n → ∞ yields d(Tz, T x_n) → 1. If max{d(T x_n, Tz), d(T f x_n, T x_n)^{1/2}}^λ = d(T f x_n, T x_n)^{λ/2}, then (16.3) implies

d(Tz, T x_n) ≤ d(T f x_n, T x_n)^{λ/2} · d(T x_n, T f x_n) = d(T f x_n, T x_n)^{(λ+2)/2}.


Taking the limit as n → ∞ yields d(Tz, T x_n) → 1. Assume that T is sequentially convergent; then {x_n} converges in X, that is, lim_{n→∞} x_n = u for some u ∈ X. Since T is continuous, lim_{n→∞} T x_n = Tu, and uniqueness of the limit gives Tu = Tz. The injectivity of T implies u = z. Hence lim_{n→∞} x_n = z and the fixed point problem of f is well-posed.

16.2.2 Limit Shadowing Property for T-Hardy Rogers Type Contractions

In this section, we study the limit shadowing property of self-maps on ordered multiplicative metric spaces.

Definition 7 Let (X, ≼, d) be an ordered multiplicative metric space. A self-map f : X → X is said to have the limit shadowing property if, whenever there is a convergent sequence {x_n} in X such that d(x_{n+1}, f x_n) → 1 as n → ∞, there is some y in X such that d(x_n, f^n y) → 1 as n → ∞.

Theorem 5 Let (X, ≼, d) be an ordered complete multiplicative metric space. Let T : X → X be injective and continuous and f : X → X be a non-decreasing mapping satisfying, for all comparable u, v ∈ X with u ≠ v,

d(T f u, T f v) ≤ d(Tu, Tv)^α · [d(Tv, T f v) · d(Tu, T f u)]^β · [d(T f u, Tv) · d(T f v, Tu)]^γ,

where α, β, γ are non-negative with α + 2β + 2γ < 1. If there exists x_0 ∈ X with x_0 ≼ f x_0 and one of the following two conditions is satisfied:
(a) f is a continuous self-map on X;
(b) for any non-decreasing sequence {x_n} in X which converges to z, it holds that x_n ≼ z for all n ∈ N,
and if, in addition, T is sequentially convergent and F(f) is well ordered, then f has the limit shadowing property.

Proof It follows from Theorem 1 that the map f has a unique fixed point z. Let y be the limit of {x_n} in X. Since T is continuous, T x_n → Ty with d(T x_{n+1}, T f x_n) → 1 as n → ∞. Now,

d(T x_{n+1}, Tz) ≤ d(T x_{n+1}, T f x_n) · d(T f x_n, T f z)
= d(T x_{n+1}, T f x_n) · d(T x_n, Tz)^α · [d(T x_n, T f x_n) · d(Tz, T f z)]^β · [d(T f x_n, Tz) · d(T f z, T x_n)]^γ
≤ d(T x_{n+1}, T f x_n) · d(T x_n, Tz)^α · [d(T x_n, T x_{n+1}) · d(T x_{n+1}, T f x_n)]^β · [d(T f x_n, T x_{n+1}) · d(T x_{n+1}, Tz) · d(Tz, T x_n)]^γ.


Taking the limit as n → ∞ yields d(Ty, Tz) ≤ d(Ty, Tz)^{α+β+2γ}, which, since α + β + 2γ < 1, is only possible when d(Ty, Tz) = 1. Since T is injective, y = z. Thus d(x_n, f^n y) → 1 as n → ∞.

Theorem 6 Let (X, ≼, d) be an ordered complete multiplicative metric space. Let T : X → X be injective and continuous and f : X → X be a non-decreasing mapping satisfying, for all comparable u, v ∈ X with u ≠ v,

d(T f u, T f v) ≤ max{d(Tu, Tv), [d(Tv, T f v) · d(Tu, T f u)]^{1/2}, [d(T f u, Tv) · d(T f v, Tu)]^{1/2}}^λ,

where λ ∈ [0, 1). If there exists x_0 ∈ X with x_0 ≼ f x_0 and one of the following two conditions is satisfied:
(a) f is a continuous self-map on X;
(b) for any non-decreasing sequence {x_n} in X which converges to z, it holds that x_n ≼ z for all n ∈ N,
and if, in addition, T is sequentially convergent and F(f) is well ordered, then f has the limit shadowing property.

Proof It follows from Theorem 2 that the map f has a unique fixed point z. Let y be the limit of {x_n} in X. Since T is continuous, T x_n → Ty with d(T x_{n+1}, T f x_n) → 1 as n → ∞. Now,

d(T x_{n+1}, Tz) ≤ d(T x_{n+1}, T f x_n) · d(T f x_n, T f z)
≤ d(T x_{n+1}, T f x_n) · max{d(T x_n, Tz), [d(T x_n, T f x_n) · d(Tz, T f z)]^{1/2}, [d(T f x_n, Tz) · d(T f z, T x_n)]^{1/2}}^λ
≤ d(T x_{n+1}, T f x_n) · max{d(T x_n, Tz), [d(T x_n, T x_{n+1}) · d(T x_{n+1}, T f x_n)]^{1/2}, [d(T f x_n, T x_{n+1}) · d(T x_{n+1}, Tz) · d(Tz, T x_n)]^{1/2}}^λ,

and taking the limit as n → ∞ yields d(Ty, Tz) ≤ d(Ty, Tz)^λ, which gives d(Ty, Tz) = 1 and so y = z. Hence d(x_n, f^n y) → 1 as n → ∞.
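A toy numerical illustration of the limit shadowing property may help fix ideas. The map, metric, and noise schedule below are illustrative choices, not taken from the theorems: for f(x) = x/2 on [0, ∞) with d(x, y) = e^{|x−y|} and T the identity, a pseudo-orbit whose defect d(x_{n+1}, f x_n) tends to 1 is shadowed by the genuine orbit of y = x_0.

```python
import math

# Illustrative setting (assumed, not from the text): X = [0, inf) with
# multiplicative metric d(x, y) = e^{|x - y|}, contraction f(x) = x / 2.
def d(x, y):
    return math.exp(abs(x - y))

def f(x):
    return x / 2.0

# A pseudo-orbit: x_{n+1} differs from f(x_n) by a decaying error,
# so d(x_{n+1}, f(x_n)) -> 1 as n -> infinity.
x = [5.0]
for n in range(60):
    x.append(f(x[-1]) + 0.5 ** (n + 1))

# The genuine orbit of y = x_0 shadows the pseudo-orbit: d(x_n, f^n(y)) -> 1.
y = x[0]
fn_y = y
gaps = []
for n in range(len(x)):
    gaps.append(d(x[n], fn_y))
    fn_y = f(fn_y)

print(round(gaps[-1], 6))  # close to 1
```

Here the accumulated deviation x_n − f^n(y) equals n/2^n, so the multiplicative gap e^{n/2^n} collapses to 1.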

16.2.3 Periodic Point Property for T-Hardy Rogers Type Contractive Maps

Now, we study the periodic point property for maps satisfying T-Hardy Rogers type contractive conditions in ordered multiplicative metric spaces. Recall that, for a nonempty set X and a self-map f : X → X, if F(f) = F(f^n) holds for each n ∈ N, then f is said to have property P [17]. Further results in this direction under different contractive conditions were proved in [9, 10, 19, 25].


Theorem 7 Let (X, ≼, d) be an ordered complete multiplicative metric space. Let T : X → X be injective and continuous and f : X → X be a non-decreasing mapping satisfying, for all comparable u, v ∈ X with u ≠ v,

d(T f u, T f v) ≤ d(Tu, Tv)^α · [d(Tv, T f v) · d(Tu, T f u)]^β · [d(T f u, Tv) · d(T f v, Tu)]^γ,

where α, β, γ are non-negative with α + 2β + 2γ < 1. If there exists x_0 ∈ X with x_0 ≼ f x_0 and one of the following two conditions is satisfied:
(a) f is a continuous self-map on X;
(b) for any non-decreasing sequence {x_n} in X which converges to z, it holds that x_n ≼ z for all n ∈ N,
and if, in addition, T is sequentially convergent and F(f) is well ordered, then f has property P.

Proof It follows from Theorem 1 that F(f) ≠ ∅. Assume that n > 1, since for n = 1 the conclusion is trivially true. Let u ∈ F(f^n), so that f^n u = u. Now,

d(Tu, T f u) = d(T f(f^{n−1}u), T f(f^n u))
≤ d(T f^{n−1}u, T f^n u)^α · [d(T f^{n−1}u, T f(f^{n−1}u)) · d(T f^n u, T f(f^n u))]^β · [d(T f(f^{n−1}u), T f^n u) · d(T f(f^n u), T f^{n−1}u)]^γ
= d(T f^{n−1}u, Tu)^α · [d(T f^{n−1}u, Tu) · d(Tu, T f u)]^β · [d(Tu, Tu) · d(T f u, T f^{n−1}u)]^γ
≤ d(T f^{n−1}u, Tu)^α · [d(T f^{n−1}u, Tu) · d(Tu, T f u)]^β · [d(T f u, Tu) · d(Tu, T f^{n−1}u)]^γ,

which further implies d(Tu, T f u) ≤ [d(T f^{n−1}u, Tu)]^{(α+β+γ)/(1−β−γ)}. Let θ = (α+β+γ)/(1−β−γ); since α + 2β + 2γ < 1, clearly θ < 1. Thus,

d(Tu, T f u) ≤ d(T f^{n−1}u, Tu)^θ = d(T f^{n−1}u, T f^n u)^θ ≤ d(T f^{n−2}u, T f^{n−1}u)^{θ²} ≤ · · · ≤ d(Tu, T f u)^{θ^n},

which gives d(Tu, T f u) = 1 and so Tu = T f u. Since T is injective, we get u = f u, and hence F(f^n) = F(f).

Theorem 8 Let (X, ≼, d) be an ordered complete multiplicative metric space. Let T : X → X be injective and continuous and f : X → X be a non-decreasing mapping satisfying, for all comparable u, v ∈ X with u ≠ v,

d(T f u, T f v) ≤ max{d(Tu, Tv), [d(Tv, T f v) · d(Tu, T f u)]^{1/2}, [d(T f u, Tv) · d(T f v, Tu)]^{1/2}}^λ


where λ ∈ [0, 1). If there exists x_0 ∈ X with x_0 ≼ f x_0 and one of the following two conditions is satisfied:
(a) f is a continuous self-map on X;
(b) for any non-decreasing sequence {x_n} in X which converges to z, it holds that x_n ≼ z for all n ∈ N,
and if, in addition, T is sequentially convergent and F(f) is well ordered, then f has property P.

Proof From Theorem 2, f has a fixed point. Assume n > 1 and let u ∈ F(f^n), so that f^n u = u. Now,

d(Tu, T f u) = d(T f(f^{n−1}u), T f(f^n u))
≤ max{d(T f^{n−1}u, T f^n u), [d(T f^{n−1}u, T f(f^{n−1}u)) · d(T f^n u, T f(f^n u))]^{1/2}, [d(T f(f^{n−1}u), T f^n u) · d(T f(f^n u), T f^{n−1}u)]^{1/2}}^λ
= max{d(T f^{n−1}u, Tu), [d(T f^{n−1}u, Tu) · d(Tu, T f u)]^{1/2}, [d(Tu, Tu) · d(T f u, T f^{n−1}u)]^{1/2}}^λ,

and in each of the three cases this yields d(Tu, T f u) ≤ d(T f^{n−1}u, Tu)^λ, where λ ∈ [0, 1). Thus

d(Tu, T f u) ≤ d(T f^{n−1}u, Tu)^λ = d(T f^{n−1}u, T f^n u)^λ ≤ d(T f^{n−2}u, T f^{n−1}u)^{λ²} ≤ · · · ≤ d(Tu, T f u)^{λ^n},

which gives d(Tu, T f u) = 1 and so Tu = T f u. Since the map T is injective, we get u = f u, and hence F(f^n) = F(f).
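Property P can be checked directly for the map f of Example 1: its only periodic point is the fixed point 1, so F(f^n) = F(f) = {1} for every n. A quick sketch:

```python
# Quick check of property P (F(f) = F(f^n)) for the map f of Example 1.
f = {1: 1, 2: 1, 3: 2}

def iterate(f, n):
    # pointwise n-fold composition of f on X = {1, 2, 3}
    g = {x: x for x in f}
    for _ in range(n):
        g = {x: f[g[x]] for x in g}
    return g

def fixed_points(g):
    return {x for x in g if g[x] == x}

F1 = fixed_points(iterate(f, 1))
print(F1)                                                           # {1}
print(all(fixed_points(iterate(f, n)) == F1 for n in range(1, 6)))  # True
```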

16.3 Coupled Fixed Points in Multiplicative Metric Spaces

Guo and Lakshmikantham [14] initiated the study of coupled fixed points of maps in the setup of metric spaces endowed with a partial order. Since then, many researchers have obtained useful results in this direction; see for example [11, 16, 18, 20, 28–31]. Bhaskar and Lakshmikantham [8] defined the mixed monotone property of mappings in the setup of partially ordered metric spaces and proved some coupled fixed point theorems. Furthermore, they studied sufficient conditions under which a unique solution of a periodic boundary value problem for a first-order ordinary differential equation exists.


We establish the existence of some coupled fixed points of mappings in partially ordered multiplicative metric spaces. We also show that our coupled fixed point problems are well-posed. Sufficient conditions for a unique solution of integral equations are also obtained.

16.3.1 Coupled Fixed Points

In this section, we give some basic definitions and notation.

Definition 8 Let X be a nonempty set, ≼ a partial order on X, and F : X × X → X. We say that F has the mixed-monotone property if, for any u, v ∈ X, u_1, u_2 ∈ X with u_1 ≼ u_2 implies F(u_1, v) ≼ F(u_2, v), and v_1, v_2 ∈ X with v_1 ≼ v_2 implies F(u, v_1) ≽ F(u, v_2).

Definition 9 An element (u, v) ∈ X × X is called a coupled fixed point of the mapping F : X × X → X if F(u, v) = u and F(v, u) = v.

16.3.2 Coupled Fixed Point Results

In this section, some coupled fixed point results for maps satisfying coupled contractions on a complete multiplicative metric space are obtained. We start with the following theorem.

Theorem 9 Let (X, ≼) be a partially ordered set and (X, d) a complete multiplicative metric space. Suppose that the mapping F : X × X → X is continuous, has the mixed-monotone property on X, and there exists η ∈ [0, 1) such that for all u ≽ w, v ≼ z,

d(F(u, v), F(w, z)) ≤ [d(u, w) · d(v, z)]^{η/2}.

Then there exist points u, v ∈ X such that u = F(u, v) and v = F(v, u), provided that there exist u_0, v_0 ∈ X such that u_0 ≼ F(u_0, v_0) and v_0 ≽ F(v_0, u_0).

Proof Let x_0, y_0 ∈ X satisfy x_0 ≼ F(x_0, y_0) and y_0 ≽ F(y_0, x_0), and set x_1 = F(x_0, y_0), y_1 = F(y_0, x_0). Then x_0 ≼ x_1 and y_0 ≽ y_1. Letting x_2 = F(x_1, y_1) and y_2 = F(y_1, x_1), we have

x_2 = F²(x_0, y_0) = F(x_1, y_1) = F(F(x_0, y_0), F(y_0, x_0)),
y_2 = F²(y_0, x_0) = F(y_1, x_1) = F(F(y_0, x_0), F(x_0, y_0)).

With this notation, we now have, by the mixed-monotone property of the map F,


x_1 = F(x_0, y_0) ≼ F(x_1, y_1) = F²(x_0, y_0) = x_2,
y_1 = F(y_0, x_0) ≽ F(y_1, x_1) = F²(y_0, x_0) = y_2.

Further, for n ∈ N,

x_{n+1} = F^{n+1}(x_0, y_0) = F(F^n(x_0, y_0), F^n(y_0, x_0)),
y_{n+1} = F^{n+1}(y_0, x_0) = F(F^n(y_0, x_0), F^n(x_0, y_0)).

One easily verifies that

x_0 ≼ x_1 ≼ x_2 ≼ · · · ≼ x_{n+1} ≼ · · · ,  y_0 ≽ y_1 ≽ y_2 ≽ · · · ≽ y_{n+1} ≽ · · · .

We now show that, for n ∈ N,

d(x_{n+1}, x_n) ≤ [d(F(x_0, y_0), x_0) · d(F(y_0, x_0), y_0)]^{η^n/2},   (16.4)

d(y_{n+1}, y_n) ≤ [d(F(y_0, x_0), y_0) · d(F(x_0, y_0), x_0)]^{η^n/2}.   (16.5)

Indeed, for n = 1, using x_0 ≼ F(x_0, y_0) and y_0 ≽ F(y_0, x_0), we get

d(x_2, x_1) = d(F(x_1, y_1), F(x_0, y_0)) ≤ [d(x_1, x_0) · d(y_1, y_0)]^{η/2} = [d(F(x_0, y_0), x_0) · d(F(y_0, x_0), y_0)]^{η/2},

and similarly for d(y_2, y_1). Assume that inequalities (16.4) and (16.5) hold for n. Writing P_n = d(x_{n+1}, x_n) · d(y_{n+1}, y_n), the contractive condition, together with F^n(x_0, y_0) ≼ F^{n+1}(x_0, y_0) and F^n(y_0, x_0) ≽ F^{n+1}(y_0, x_0), gives

d(x_{n+2}, x_{n+1}) = d(F(x_{n+1}, y_{n+1}), F(x_n, y_n)) ≤ [d(x_{n+1}, x_n) · d(y_{n+1}, y_n)]^{η/2} = P_n^{η/2},

and the same bound for d(y_{n+2}, y_{n+1}). Consequently P_{n+1} ≤ P_n^η, so P_n ≤ P_0^{η^n}, and therefore

d(x_{n+2}, x_{n+1}) ≤ P_n^{η/2} ≤ [d(F(x_0, y_0), x_0) · d(F(y_0, x_0), y_0)]^{η^{n+1}/2},

which is (16.4) for n + 1; similarly for (16.5). This implies that {F^n(x_0, y_0)} and {F^n(y_0, x_0)} are Cauchy sequences in X. Suppose that m > n. Then we get

d(F^m(x_0, y_0), F^n(x_0, y_0)) ≤ d(F^m(x_0, y_0), F^{m−1}(x_0, y_0)) · d(F^{m−1}(x_0, y_0), F^{m−2}(x_0, y_0)) · … · d(F^{n+1}(x_0, y_0), F^n(x_0, y_0))
≤ [d(F(x_0, y_0), x_0) · d(F(y_0, x_0), y_0)]^{(η^{m−1} + ··· + η^n)/2}
≤ [d(F(x_0, y_0), x_0) · d(F(y_0, x_0), y_0)]^{η^n/(2(1−η))} → 1 as m, n → ∞.

Hence the sequence is multiplicative Cauchy. By the completeness of X, there exist elements x, y in X such that

lim_{k→∞} x_k = lim_{k→∞} F^k(x_0, y_0) = x and lim_{k→∞} y_k = lim_{k→∞} F^k(y_0, x_0) = y.

Finally, we claim that x = F(x, y) and y = F(y, x). Let ε > 1. Since F is continuous at the pair (x, y), there exists δ > 1 such that d(x, u) · d(y, v) < δ implies d(F(x, y), F(u, v)) < ε. Choose n so large that d(x_n, x) · d(y_n, y) < δ and d(x_{n+1}, x) < ε. For such n,

d(F(x, y), x) ≤ d(F(x, y), F^{n+1}(x_0, y_0)) · d(F^{n+1}(x_0, y_0), x) = d(F(x, y), F(x_n, y_n)) · d(x_{n+1}, x) < ε · ε = ε².

Since ε > 1 was arbitrary, d(F(x, y), x) = 1, so x = F(x, y). Following similar arguments, we have y = F(y, x).

Theorem 10 Let (X, ≼) be a partially ordered set and (X, d) a complete multiplicative metric space. Suppose that the mapping F : X × X → X has the mixed-monotone property on X and that there exists η ∈ [0, 1) such that

d(F(u, v), F(w, z)) ≤ [d(u, w) · d(v, z)]^{η/2}

holds for all u ≽ w, v ≼ z. Then there exist points u, v in X such that u = F(u, v) and v = F(v, u), provided that X has the following properties:
(a) if a non-decreasing sequence {u_n} → u, then u_n ≼ u for all n;
(b) if a non-increasing sequence {w_n} → w, then w ≼ w_n for all n.


Proof Following the proof of Theorem 9, we only need to show that x = F(x, y) and F(y, x) = y. Let ε > 1. Since {F^n(x_0, y_0)} → x and {F^n(y_0, x_0)} → y, there exist n_1, n_2 ∈ N such that for all n ≥ n_1 and m ≥ n_2 we have d(F^n(x_0, y_0), x) < ε^{1/3} and d(F^m(y_0, x_0), y) < ε^{1/3}. Take n ∈ N with n ≥ max{n_1, n_2}. By properties (a) and (b), F^n(x_0, y_0) ≼ x and F^n(y_0, x_0) ≽ y, so the contractive condition applies and we get

d(F(x, y), x) ≤ d(F(x, y), F^{n+1}(x_0, y_0)) · d(F^{n+1}(x_0, y_0), x)
= d(F(x, y), F(F^n(x_0, y_0), F^n(y_0, x_0))) · d(F^{n+1}(x_0, y_0), x)
≤ [d(x, F^n(x_0, y_0)) · d(y, F^n(y_0, x_0))]^{η/2} · d(F^{n+1}(x_0, y_0), x)
≤ d(x, F^n(x_0, y_0)) · d(y, F^n(y_0, x_0)) · d(F^{n+1}(x_0, y_0), x)
< ε^{1/3} · ε^{1/3} · ε^{1/3} = ε.

Since ε > 1 was arbitrary, F(x, y) = x. Following a similar argument, we obtain F(y, x) = y.
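A minimal numerical illustration of the iteration scheme in Theorems 9 and 10 (the map F below is an assumed toy example, not from the text): F(u, v) = (u − v)/4 on R × R is mixed monotone and satisfies d(F(u, v), F(w, z)) ≤ [d(u, w) · d(v, z)]^{η/2} with η = 1/2 for d(x, y) = e^{|x−y|}, and the iterates x_{n+1} = F(x_n, y_n), y_{n+1} = F(y_n, x_n) converge to the coupled fixed point (0, 0).

```python
# Toy coupled fixed point iteration (illustrative choice, not from the paper):
# F(u, v) = (u - v)/4 is increasing in u and decreasing in v (mixed monotone),
# and |F(u,v) - F(w,z)| <= (|u-w| + |v-z|)/4, i.e. the eta = 1/2 contraction
# for the multiplicative metric d(x, y) = e^{|x - y|}.
def F(u, v):
    return (u - v) / 4.0

u, v = 3.0, -2.0
for _ in range(50):
    u, v = F(u, v), F(v, u)   # simultaneous update of the coupled iterates

# The iterates converge to the coupled fixed point (0, 0):
print(abs(u) < 1e-12, abs(v) < 1e-12)  # True True
print(F(0.0, 0.0) == 0.0)              # u = F(u, v) and v = F(v, u) at (0, 0)
```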

16.3.3 Well-Posedness Result for Coupled Maps

In this section we study the well-posedness of the coupled fixed point problem in a multiplicative metric space defined on a partially ordered set.

Definition 10 Let (X, d) be a multiplicative metric space. The coupled fixed point problem of F : X × X → X is called well-posed if F has a unique coupled fixed point, say (u*, v*), in X × X, and for any two sequences {u_n} and {v_n} in X such that lim_{n→∞} d(F(u_n, v_n), u_n) = 1 and lim_{n→∞} d(F(v_n, u_n), v_n) = 1 it holds that lim_{n→∞} d(u_n, u*) = 1 and lim_{n→∞} d(v_n, v*) = 1.

Theorem 11 Let (X, ≼) be a partially ordered set and (X, d) a complete multiplicative metric space. Suppose F : X × X → X is a continuous mapping satisfying the mixed-monotone property on X, and there exists η ∈ [0, 1) such that for all x ≽ z, y ≼ w,

d(F(x, y), F(z, w)) ≤ [d(x, z) · d(y, w)]^{η/2}.

If there exist α_0, β_0 ∈ X such that α_0 ≼ F(α_0, β_0) and β_0 ≽ F(β_0, α_0), then the coupled fixed point problem of F is well-posed.

Proof By Theorem 9, the mapping F has a unique coupled fixed point (x*, y*) in X × X. Take sequences {x_n} and {y_n} in X such that lim_{n→∞} d(F(x_n, y_n), x_n) = 1 and lim_{n→∞} d(F(y_n, x_n), y_n) = 1. Then we have


d(x*, x_n) ≤ d(x*, F(x_n, y_n)) · d(F(x_n, y_n), x_n)
= d(F(x_n, y_n), F(x*, y*)) · d(x_n, F(x_n, y_n))
≤ [d(x*, x_n) · d(y*, y_n)]^{η/2} · d(x_n, F(x_n, y_n))
= d(x*, x_n)^{η/2} · d(y*, y_n)^{η/2} · d(x_n, F(x_n, y_n)),

that is,

d(x*, x_n) ≤ d(y*, y_n)^{η/(2−η)} · d(F(x_n, y_n), x_n)^{2/(2−η)}.   (16.6)

Similarly, we obtain

d(y*, y_n) ≤ d(x*, x_n)^{η/(2−η)} · d(F(y_n, x_n), y_n)^{2/(2−η)}.   (16.7)

From (16.6) and (16.7), we get

d(x*, x_n) · d(y*, y_n) ≤ [d(F(x_n, y_n), x_n) · d(F(y_n, x_n), y_n)]^{1/(1−η)}.

Thus lim_{n→∞} [d(x*, x_n) · d(y*, y_n)] = 1, and so lim_{n→∞} d(x_n, x*) = 1 and lim_{n→∞} d(y_n, y*) = 1.

16.3.4 Application

Let Ω = [0, T] be a bounded interval in R for T > 0, and let X = C(Ω, R) be the space of real-valued continuous maps on Ω. Consider the integral equations

u(t) = ∫_Ω q(t, u(s), v(s)) ds + κ(t),
v(t) = ∫_Ω q(t, v(s), u(s)) ds + κ(t),   (16.8)

where q : Ω × R × R → R and κ : Ω → R are given continuous mappings. We study sufficient conditions for the existence of a solution of the integral equations (16.8) in the framework of multiplicative metric spaces. Define d : X × X → [1, ∞) by d(x, y) = e^{sup_{t∈Ω} |x(t) − y(t)|}. Then (X, d) is a complete multiplicative metric space. We consider X with the partial order ≼ given by: for α, β ∈ X, α ≼ β if and only if α(t) ≤ β(t) for all t ∈ Ω. Suppose that

(i) there exist α(t), β(t) ∈ C(Ω, R) such that, for all t ∈ Ω,

α(t) ≤ ∫_Ω q(t, α(s), β(s)) ds + κ(t),  β(t) ≥ ∫_Ω q(t, β(s), α(s)) ds + κ(t);

(ii) for all x, y, z, w ∈ R with x ≥ z and y ≤ w, we have, for each τ ∈ Ω,

0 ≤ q(τ, x, y) − q(τ, z, w) ≤ (λ/T)(x − y + w − z),

where λ ∈ (0, 1/2).

Then there exists a solution of the integral equations (16.8) in C(Ω, R).

Proof We define F(ω_1, ω_2)(τ) = ∫_Ω q(τ, ω_1(s), ω_2(s)) ds + κ(τ) for ω_1, ω_2 ∈ X and τ ∈ Ω. Now, for u(τ) ≥ x(τ) and v(τ) ≤ y(τ) for all τ ∈ Ω, using (ii), we obtain

d(F(u, v), F(x, y)) = e^{sup_{τ∈Ω} |F(u,v)(τ) − F(x,y)(τ)|}
= e^{sup_{τ∈Ω} |∫_Ω [q(τ, u(s), v(s)) − q(τ, x(s), y(s))] ds|}
≤ e^{sup_{τ∈Ω} ∫_Ω |q(τ, u(s), v(s)) − q(τ, x(s), y(s))| ds}
≤ e^{(λ/T) ∫_Ω [sup_{τ∈Ω} |u(τ)−x(τ)| + sup_{τ∈Ω} |y(τ)−v(τ)|] ds}
= e^{λ[sup_{τ∈Ω} |u(τ)−x(τ)| + sup_{τ∈Ω} |y(τ)−v(τ)|]}
= (d(u, x) · d(v, y))^λ.
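The contraction estimate above suggests that a Picard-type iteration of the operator F converges to a solution of (16.8). A hedged numerical sketch (the kernel q, the function κ, and the grid below are illustrative assumptions, chosen so that q satisfies a Lipschitz bound of the type in condition (ii)):

```python
# Hedged numerical sketch of Picard iteration for the coupled system (16.8).
# Assumed concrete data: q(t, x, y) = lam/(2T) * (x - y), kappa(t) = t,
# a uniform grid on Omega = [0, T], and left Riemann sums for the integral.
T = 1.0
lam = 0.4
N = 200                      # grid points on Omega = [0, T]
h = T / N
ts = [i * h for i in range(N)]

def kappa(t):
    return t

def q(t, x, y):
    return lam / (2.0 * T) * (x - y)

def apply_F(u, v):
    # F(u, v)(t) = integral over Omega of q(t, u(s), v(s)) ds + kappa(t)
    return [sum(q(t, u[j], v[j]) for j in range(N)) * h + kappa(t) for t in ts]

u, v = [1.0] * N, [0.0] * N
for _ in range(40):
    u, v = apply_F(u, v), apply_F(v, u)   # simultaneous coupled update

# The iterates settle on a coupled solution: u = F(u, v) and v = F(v, u).
residual = max(abs(a - b) for a, b in zip(u, apply_F(u, v)))
print(residual < 1e-10)  # True
```

For this particular kernel the limit is u(t) = v(t) = κ(t); the iteration contracts the gap between u and v by the factor λ at each step.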

Thus, d(F(u, v), F(x, y)) ≤ (d(u, x) · d(v, y))^λ holds for λ ∈ (0, 1/2), which is the contractive condition of Theorem 9 with η = 2λ ∈ (0, 1). Moreover, by (i) it is easy to see that there exists (α_0, β_0) ∈ C(Ω, R) × C(Ω, R) such that α_0 ≼ F(α_0, β_0) and β_0 ≽ F(β_0, α_0). Thus all the conditions of Theorem 9 are satisfied, and it follows from Theorem 9 that there exists (x̂, ŷ) ∈ C(Ω, R) × C(Ω, R) which is a solution of the integral equations (16.8).

Acknowledgements Talat Nazir is grateful to the ERASMUS MUNDUS project "Featured eUrope and South/south-east Asia mobility Network FUSION" and its Swedish node, the MAM research milieu in Mathematics and Applied Mathematics, Division of Mathematics and Physics, School of Education, Culture and Communication at Mälardalen University, for support and an excellent research and research education environment during his visits.

References
1. Abbas, M., Ali, B., Suleiman, Y.I.: Common fixed points of locally contractive mappings in multiplicative metric spaces with application. Inter. J. Math. Sci. 1–7 (2015). Article ID 218683
2. Abbas, M., Fisher, B., Nazir, T.: Well-posedness and periodic point property of mappings satisfying a rational inequality in an ordered complex valued metric space. Sci. Stud. Res. Ser. Math. Info. 22(1), 5–24 (2012)
3. Agarwal, R.P., El-Gebeily, M.A., O'Regan, D.: Generalized contractions in partially ordered metric spaces. Appl. Anal. 87(1), 109–116 (2008)
4. Akkouchi, M., Popa, V.: Well-posedness of fixed point problem for three mappings under strict contractive conditions. Bull. Math. Inform. Phys. Petrol.-Gas Univ. Ploiesti 61(2), 1–10 (2009)


5. Altun, I., Simsek, H.: Some fixed point theorems on ordered metric spaces and application. Fixed Point Theory Appl. 2010, 1–17 (2010). Article ID 621492
6. Bashirov, A.E., Kurpınar, E.M., Özyapıcı, A.: Multiplicative calculus and its applications. J. Math. Anal. Appl. 337, 36–48 (2008)
7. Bashirov, A.E., Mısırlı, E., Tandoğdu, Y., Özyapıcı, A.: On modeling with multiplicative differential equations. Appl. Math. J. Chin. Univ. 26(4), 425–438 (2011)
8. Bhaskar, T.G., Lakshmikantham, V.: Fixed point theorems in partially ordered cone metric spaces and applications. Nonlinear Anal. 65(7), 825–832 (2006)
9. Chaipunya, P., Cho, Y.J., Kumam, P.: A remark on the property ρ and periodic points of order ∞. Math. Vesnik 66(4), 357–363 (2014)
10. Chen, C., Karapınar, E., Rakočević, V.: Existence of periodic fixed point theorems in the setting of generalized quasi metric spaces. J. Appl. Math. 1–8 (2014). Article ID 353765
11. Choudhury, B.S., Kundu, A.: A coupled coincidence point result in partially ordered metric spaces for compatible mappings. Nonlinear Anal. 73, 2524–2531 (2010)
12. Ćirić, L., Cakić, N., Rajović, M., Ume, J.S.: Monotone generalized nonlinear contractions in partially ordered metric spaces. Fixed Point Theory Appl. 1–11 (2008). Article ID 131294
13. Florack, L., van Assen, H.: Multiplicative calculus in biomedical image analysis. J. Math. Imag. Vis. 42(1), 64–75 (2012)
14. Guo, D., Lakshmikantham, V.: Coupled fixed points of nonlinear operators with applications. Nonlinear Anal. 11, 623–632 (1987)
15. He, X., Song, M., Chen, D.: Common fixed points for weak commutative mappings on a multiplicative metric space. Fixed Point Theory Appl. 2014(48), 1–9 (2014)
16. Hussain, N., Abbas, M., Azam, A., Ahmad, J.: Coupled coincidence point results for a generalized compatible pair with applications. Fixed Point Theory Appl. 2014(62), 1–21 (2014)
17. Jeong, G.S., Rhoades, B.E.: Maps for which F(T) = F(T^n). Fixed Point Theory 6, 87–131 (2005)
18. Kadelburg, Z., Kumam, P., Radenović, S., Sintunavarat, W.: Common coupled fixed point theorems for Geraghty-type contraction mappings using monotone property. Fixed Point Theory Appl. 2015(27), 1–15 (2015)
19. Kumam, P., Rahimi, H., Rad, G.S.: The existence of fixed and periodic point theorems in cone metric type spaces. J. Nonlinear Sci. Appl. 7, 255–263 (2014)
20. Lakshmikantham, V., Ćirić, Lj.: Coupled fixed point theorems for nonlinear contractions in partially ordered metric space. Nonlinear Anal. 70(12), 4341–4349 (2009)
21. Nashine, H.K., Samet, B., Vetro, C.: Fixed point theorems in partially ordered metric spaces and existence results for integral equations. Numer. Funct. Anal. Optim. 33(11), 1304–1320 (2012)
22. Nazir, T., Silvestrov, S.: Common fixed point and periodic point results in multiplicative metric spaces. Waves Wavelets Fractals Adv. Anal. 3, 61–74 (2017)
23. Nieto, J.J., Lopez, R.R.: Contractive mapping theorems in partially ordered sets and applications to ordinary differential equations. Order 22, 223–239 (2005)
24. Özavşar, M., Çevikel, A.C.: Fixed point of multiplicative contraction mappings on multiplicative metric space (2012). arXiv:1205.5131v1 [math.GN]
25. Rahimi, H., Rhoades, B.E., Radenović, S., Rad, G.S.: Fixed and periodic point theorems for T-contractions on cone metric spaces. FILOMAT 27(5), 881–888 (2013)
26. Ran, A.C.M., Reurings, M.C.B.: A fixed point theorem in partially ordered sets and some application to matrix equations. Proc. Am. Math. Soc. 132, 1435–1443 (2004)
27. Reich, S., Zaslavski, A.J.: Well posedness of fixed point problems. Far East J. Math. Sci. Spec. Vol. 3, 393–401 (2001)
28. Sabetghadam, F., Masiha, H.P., Sanatpour, A.H.: Some coupled fixed point theorems in cone metric space. Fixed Point Theory Appl. 1–8 (2009). Article ID 125426
29. Samet, B.: Coupled fixed point theorems for a generalized Meir-Keeler contraction in partially ordered metric spaces. Nonlinear Anal. 72, 4508–4517 (2010)
30. Shatanawi, W.: Partially ordered cone metric spaces and coupled fixed point results. Comput. Math. Appl. 60, 2508–2515 (2010)

16 Fixed Points of T-Hardy Rogers Type Mappings …

365

31. Shatanawi, W., Kumam, P., Cho, Y.J.: Coupled fixed point theorems for nonlinear contractions without mixed monotone property. Fixed Point Theory Appl. 2012(170), 1–16 (2012) 32. Yamaod, O., Sintunavarat, W.: Some fixed point results for generalized contraction mappings with cyclic (α, β)-admissible mapping in multiplicative metric spaces. J. Inequalities Appl. 2014(488), 1–15 (2014)

Chapter 17

Some Periodic Point and Fixed Point Results in Multiplicative Metric Spaces

Talat Nazir and Sergei Silvestrov

Abstract We investigate periodic points and common fixed points of generalized contractive self-mappings in the setup of multiplicative metric spaces. We also study the well-posedness of the obtained results. Common fixed point results for mappings involved in cyclic representations are also obtained. Moreover, some applications to the common solution of integral equations are presented.

Keywords Periodic point · Common fixed point · Cyclic representation · Multiplicative metric space

MSC 2020 Classification 47H09 · 47H10 · 54C60 · 54H25

T. Nazir (B) Department of Mathematical Sciences, University of South Africa, Florida 0003, South Africa. e-mail: [email protected]

S. Silvestrov Division of Mathematics and Physics, School of Education, Culture and Communication, Mälardalen University, Box 883, 72123 Västerås, Sweden. e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_17

17.1 Introduction

Fixed point theory is a very effective and powerful tool for solving various kinds of mathematical problems. The study of fixed points of mappings has several applications in the solution of optimization problems, differential equations and integral equations (see, for example, [6, 7, 18, 19, 21, 25]). Theorems dealing with fixed points of certain mappings inspired and motivated the investigation of many other important kinds of points, such as periodic points, intersection points, sectional points, etc. It is an obvious fact that if S is a self-map which has a fixed point x, that is, Sx = x, then x is also a fixed point of S^n for every natural number n, that is, S^n x = x. However, fixed points of S^n for a natural number n > 1 need not be fixed points of S, as they can be periodic points of period larger than 1. For example, consider X = R and S defined by S(x) = a − x for some a ∈ R\{0}. Then S has a unique fixed point at x = a/2, but every even iterate of S is the identity map, which has every point of R as a fixed point. On the other hand, if X = [0, π] and S(x) = cos x, then every iterate of S has the same fixed point as S (see [4, 11, 12, 14, 20, 22, 26, 28]).

If a self-map S satisfies F(S) = F(S^n) for each n ∈ N, where F(S) denotes the set of all fixed points of S, then S is said to have property P. Also, if two self-maps S and T satisfy F(S) ∩ F(T) = F(S^n) ∩ F(T^n) for each n ∈ N, then the pair (S, T) is said to have property Q. Jeong and Rhoades [20] showed that maps satisfying many contractive conditions have property P. Abbas and Rhoades [5] studied the same problem in cone metric spaces (see also [22, 26]), and in [25] the mappings satisfying a contractive condition of integral type for which fixed points and periodic points coincide were considered. Chaipunya, Cho and Kumam [11] studied the property P and periodic points of order ∞. Chen, Karapınar and Rakočević [12] considered mappings satisfying a contractive condition in the setting of generalized quasi metric spaces. It is also interesting to mention that, in the quite different context of the interplay of dynamical systems and C*-algebras, Silvestrov and Tomiyama [27] obtained several general equivalent conditions for the coincidence of the sets of recurrent and periodic points of homeomorphism dynamical systems on topological spaces, and discussed some examples and classes of homeomorphism dynamical systems satisfying the property of coincidence of the sets of periodic and fixed points (property P).

Recently, Özavşar and Çevikel [23] proved an analogue of the Banach contraction principle in the framework of multiplicative metric spaces. They also studied some topological properties of the relevant multiplicative metric space. Bashirov, Kurpınar and Ozyapıcı [9] studied the concept of multiplicative calculus and proved a fundamental theorem of multiplicative calculus. They also illustrated the usefulness of multiplicative calculus with some interesting applications. Multiplicative calculus provides a natural and straightforward way to compute the derivative of the product and quotient of two functions [10]. Florack and van Assen [13] gave applications of multiplicative calculus in biomedical image analysis. He, Song and Chen [16] studied common fixed points for weak commutative mappings on a multiplicative metric space (see also [1, 3]). Recently, Yamaod and Sintunavarat [29] obtained some fixed point results for generalized contraction mappings with a cyclic (α, β)-admissible mapping in multiplicative metric spaces.

We study common fixed point problems for mappings satisfying property Q in the framework of multiplicative metric spaces. We also show the well-posedness of these results. Furthermore, we study sufficient conditions for the existence of common fixed points of a pair of power contractive type mappings involved in a cyclic representation of a non-empty subset of a multiplicative metric space. Some applications of the obtained results are also given. By R, R_{>0}, R^n_{>0} and N we denote the set of all real numbers, the set of all positive real numbers, the set of all n-tuples of positive real numbers and the set of all natural numbers, respectively.
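The involution from the introductory example can be checked numerically. The sketch below is our own illustration (the value a = 4 and the helper names `S`, `iterate` are arbitrary, not from the text): S(x) = a − x fixes only a/2, while its second iterate is the identity and fixes every point.

```python
# Hypothetical numerical check of the introductory example (not from the
# paper): S(x) = a - x has the unique fixed point x = a/2, while every
# even iterate of S is the identity map, so S^2 fixes every point.
def S(x, a=4.0):
    """The involution S(x) = a - x for a fixed real a != 0."""
    return a - x

def iterate(f, x, n):
    """Apply f to x exactly n times."""
    for _ in range(n):
        x = f(x)
    return x

a = 4.0
assert S(a / 2, a) == a / 2                                    # a/2 is fixed by S
for x in [-3.0, 0.0, 1.7, 10.0]:
    assert abs(iterate(lambda t: S(t, a), x, 2) - x) < 1e-12   # S^2 = identity
```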

The following definitions and results will be needed in the sequel [9, 23].

Definition 1 (multiplicative metric space) Let X be a non-empty set. A mapping d : X × X → R_{>0} is said to be a multiplicative metric on X if for any x, y, z ∈ X the following conditions hold:
(i) d(x, y) ≥ 1, and d(x, y) = 1 if and only if x = y;
(ii) d(x, y) = d(y, x);
(iii) d(x, y) ≤ d(x, z) · d(z, y).
The pair (X, d) is called a multiplicative metric space.

Definition 2 ([23]) A sequence {x_n} in a multiplicative metric space (X, d) is multiplicative convergent to x in X if and only if d(x_n, x) → 1 as n → ∞.

Definition 3 Let (X, d_X) and (Y, d_Y) be two multiplicative metric spaces, and x_0 an arbitrary but fixed element of X. A mapping S : X → Y is said to be multiplicative continuous at x_0 if and only if x_n → x_0 in (X, d_X) implies S(x_n) → S(x_0) in (Y, d_Y), that is, for any ε > 1 there exists δ > 1, depending on x_0 and ε, such that d_Y(Sx, Sx_0) < ε for all those x in X for which d_X(x, x_0) < δ.

Definition 4 ([23]) A sequence {x_n} in a multiplicative metric space (X, d) is said to be a multiplicative Cauchy sequence if for any ε > 1 there exists n_0 ∈ N such that d(x_n, x_m) < ε for all m, n ≥ n_0. A multiplicative metric space (X, d) is said to be complete if every multiplicative Cauchy sequence {x_n} in X is multiplicative convergent in X.

Theorem 1 ([23]) A sequence {x_n} in a multiplicative metric space (X, d) is multiplicative Cauchy if and only if d(x_n, x_m) → 1 as n, m → ∞.

The multiplicative absolute-value function |·|_* : R → R_{>0} is defined as

|α|_* = α if α ≥ 1;  1/α if α ∈ (0, 1);  1 if α = 0;  −1/α if α ∈ (−1, 0);  −α if α ≤ −1.

For arbitrary x, y ∈ R, the multiplicative absolute-value function has the following basic properties: |x|_* ≥ 1, x ≤ |x|_*, x ≤ 1/|x|_* if x ≤ 0 and 1/|x|_* ≤ x if x > 0, as well as |x · y|_* ≤ |x|_* · |y|_*.
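The multiplicative absolute value and Definition 1 can be spot-checked numerically. The sketch below is our own illustration (the helper names `mabs` and `d` are assumptions, not from the paper): it implements |·|_* restricted to R_{>0} and verifies that d(x, y) = |x/y|_* satisfies the three axioms of a multiplicative metric on a small sample of points.

```python
import math

# Illustrative implementation (added here, not from the paper) of the
# multiplicative absolute value on R_{>0} and the induced multiplicative
# metric d(x, y) = |x/y|_*; we spot-check axioms (i)-(iii) of Definition 1.
def mabs(x):
    """Multiplicative absolute value |x|_* restricted to x > 0."""
    return x if x >= 1 else 1.0 / x

def d(x, y):
    """Multiplicative metric on R_{>0}: d(x, y) = |x/y|_* = e^{|ln x - ln y|}."""
    return mabs(x / y)

pts = [0.25, 0.5, 1.0, 2.0, 3.0, 10.0]
for x in pts:
    for y in pts:
        assert d(x, y) >= 1.0                          # (i)  d(x, y) >= 1
        assert (d(x, y) == 1.0) == (x == y)            # (i)  d(x, y) = 1 iff x = y
        assert math.isclose(d(x, y), d(y, x))          # (ii) symmetry
        for z in pts:
            # (iii) multiplicative triangle inequality (tiny float slack)
            assert d(x, z) <= d(x, y) * d(y, z) * (1 + 1e-12)
```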

Example 1 Let X = C*[a, b] be the collection of all real-valued multiplicative continuous functions over [a, b] ⊂ R_{>0}, with the multiplicative metric d defined for arbitrary S, T ∈ X by d(S, T) = sup_{x∈[a,b]} |S(x)/T(x)|_*, where |·|_* : R_{>0} → R_{>0} is the multiplicative absolute-value function. Then (C*[a, b], d) is a complete multiplicative metric space.

The notion of well-posedness of a fixed point problem has evoked much interest among mathematicians. Recently, Karapınar [21] studied the well-posed problem for a cyclic weak contraction mapping on a complete metric space (see also [2, 24]). We define well-posedness of a common fixed point problem in a multiplicative metric space.

Definition 5 Let (X, d) be a multiplicative metric space and S and T two self-maps on X. The common fixed point problem of S and T is called well-posed on X if S and T have at most one common fixed point (say u), and for any sequence {x_n} in X, lim_{n→∞} d(Sx_n, x_n) = 1 or lim_{n→∞} d(Tx_n, x_n) = 1 implies lim_{n→∞} d(x_n, u) = 1.

17.2 Periodic Point Results

In this section, we obtain several periodic point and common fixed point results for self-maps satisfying certain power contractive conditions in the framework of multiplicative metric spaces. We start with the following result.

Theorem 2 Let (X, d) be a complete multiplicative metric space and S, T : X → X. Suppose that there exists an upper semi-continuous and nondecreasing function φ : [1, ∞) → [1, ∞), obeying φ(t) < t for all t > 1, such that for all x, y ∈ X,

d(Sx, Ty)^δ ≤ φ(d(x, y)^α · d(x, Sx)^β · d(y, Ty)^γ),    (17.1)

where α, β, γ ≥ 0 with δ = α + β + γ ∈ (0, ∞). Then F(S) ∩ F(T) is singleton and the pair (S, T) has property Q.

Proof First, we show that F(S) ∩ F(T) ≠ ∅. We divide the proof into four facts.

Fact 1. If S or T has a fixed point u in X, then u is a common fixed point of S and T.

Indeed, let u be a fixed point of S. Assume that d(u, Tu) > 1. From (17.1) with x = y = u, we have

d(u, Tu)^δ = d(Su, Tu)^δ ≤ φ(d(u, u)^α · d(u, Su)^β · d(u, Tu)^γ) = φ(d(u, u)^{α+β} · d(u, Tu)^γ) ≤ φ(d(u, Tu)^γ) ≤ φ(d(u, Tu)^δ) < d(u, Tu)^δ,

a contradiction. Hence d(u, Tu) = 1 and so u = Tu. Therefore u is a common fixed point of S and T. Similarly, if u is a fixed point of T, then it is also a fixed point of S.

Fact 2. The sequence {x_n} constructed from S and T satisfies lim_{n→∞} d(x_n, x_{n+1}) = 1.

Indeed, let x_0 be an arbitrary point of X. If Sx_0 = x_0, then the proof is finished, so we assume that Sx_0 ≠ x_0. Define a sequence {x_n} in X by Sx_{2n} = x_{2n+1} and Tx_{2n+1} = x_{2n+2} for n ∈ N. We may assume that d(x_{2n}, x_{2n+1}) > 1 for all n ∈ N. If not, then x_{2k} = x_{2k+1} for some k, so Sx_{2k} = x_{2k+1} = x_{2k}, and thus x_{2k} is a fixed point of S; hence x_{2k} is also a fixed point of T by Fact 1. Now, taking d(x_{2n}, x_{2n+1}) > 1 for all n ∈ N, from (17.1) we consider

d(x_{2n+1}, x_{2n+2})^δ = d(Sx_{2n}, Tx_{2n+1})^δ ≤ φ(d(x_{2n}, x_{2n+1})^α · d(x_{2n}, Sx_{2n})^β · d(x_{2n+1}, Tx_{2n+1})^γ) = φ(d(x_{2n}, x_{2n+1})^α · d(x_{2n}, x_{2n+1})^β · d(x_{2n+1}, x_{2n+2})^γ) = φ(d(x_{2n}, x_{2n+1})^{α+β} · d(x_{2n+1}, x_{2n+2})^γ) < d(x_{2n}, x_{2n+1})^{α+β} · d(x_{2n+1}, x_{2n+2})^γ,

which implies that d(x_{2n+1}, x_{2n+2})^{α+β} < d(x_{2n}, x_{2n+1})^{α+β}. If α + β = 0, a contradiction arises. Thus α + β > 0, and we have d(x_{2n+1}, x_{2n+2}) < d(x_{2n}, x_{2n+1}) for all n ∈ N. Again from (17.1), we have

d(x_{2n+2}, x_{2n+3})^δ = d(Tx_{2n+1}, Sx_{2n+2})^δ = d(Sx_{2n+2}, Tx_{2n+1})^δ ≤ φ(d(x_{2n+2}, x_{2n+1})^α · d(x_{2n+2}, Sx_{2n+2})^β · d(x_{2n+1}, Tx_{2n+1})^γ) = φ(d(x_{2n+1}, x_{2n+2})^α · d(x_{2n+2}, x_{2n+3})^β · d(x_{2n+1}, x_{2n+2})^γ) = φ(d(x_{2n+1}, x_{2n+2})^{α+γ} · d(x_{2n+2}, x_{2n+3})^β) < d(x_{2n+1}, x_{2n+2})^{α+γ} · d(x_{2n+2}, x_{2n+3})^β,

which implies that d(x_{2n+2}, x_{2n+3})^{α+γ} < d(x_{2n+1}, x_{2n+2})^{α+γ}. If α + γ = 0, a contradiction arises. Hence α + γ > 0 and d(x_{2n+2}, x_{2n+3}) < d(x_{2n+1}, x_{2n+2}) for all n ∈ N. Consequently, for all n ∈ N,

d(x_n, x_{n+1}) < d(x_{n−1}, x_n).    (17.2)

Therefore, the decreasing sequence of positive real numbers {d(x_n, x_{n+1})} converges to some c ≥ 1. If we assume that c > 1, then from (17.2) we deduce that

c^δ = lim_{n→∞} d(x_{2n+1}, x_{2n+2})^δ = lim_{n→∞} d(Sx_{2n}, Tx_{2n+1})^δ ≤ lim sup_{n→∞} φ(d(x_{2n}, x_{2n+1})^α · d(x_{2n}, Sx_{2n})^β · d(x_{2n+1}, Tx_{2n+1})^γ) = lim sup_{n→∞} φ(d(x_{2n}, x_{2n+1})^α · d(x_{2n}, x_{2n+1})^β · d(x_{2n+1}, x_{2n+2})^γ) ≤ φ(c^{α+β+γ}) = φ(c^δ) < c^δ,

a contradiction, so c = 1, that is, lim_{n→∞} d(x_n, x_{n+1})^δ = 1 and so lim_{n→∞} d(x_n, x_{n+1}) = 1.

Fact 3. The sequence {x_n} constructed in Fact 2 is multiplicative Cauchy in (X, d).

We have d(x_n, x_{n+1})^δ ≤ φ(d(x_{n−1}, x_n)^δ) ≤ … ≤ φ^n(d(x_0, x_1)^δ). For m, n ∈ N with m > n,

d(x_n, x_m)^δ ≤ d(x_n, x_{n+1})^δ · d(x_{n+1}, x_{n+2})^δ · … · d(x_{m−1}, x_m)^δ ≤ φ^n(d(x_0, x_1)^δ) · φ^{n+1}(d(x_0, x_1)^δ) · … · φ^{m−1}(d(x_0, x_1)^δ),

which implies that d(x_n, x_m)^δ converges to 1 as n, m → ∞. Thus lim_{n,m→∞} d(x_n, x_m) = 1, that is, {x_n} is a multiplicative Cauchy sequence in (X, d).

Fact 4. F(S) ∩ F(T) ≠ ∅.

Indeed, since (X, d) is a complete multiplicative metric space, there exists u in X such that lim_{n→∞} d(u, x_n) = 1. Assume on the contrary that d(Su, u) > 1; then from (17.1) we have

d(Su, x_{2n+2})^δ = d(Su, Tx_{2n+1})^δ ≤ φ(d(u, x_{2n+1})^α · d(u, Su)^β · d(x_{2n+1}, Tx_{2n+1})^γ) = φ(d(u, x_{2n+1})^α · d(u, Su)^β · d(x_{2n+1}, x_{2n+2})^γ).    (17.3)

Taking the upper limit as n → ∞ in (17.3), we deduce that

d(Su, u)^δ ≤ lim sup_{n→∞} φ(d(u, x_{2n+1})^α · d(u, Su)^β · d(x_{2n+1}, Tx_{2n+1})^γ) ≤ φ(d(u, u)^α · d(u, Su)^β · d(u, u)^γ) ≤ φ(d(u, Su)^β) < d(u, Su)^δ,

a contradiction. Hence u = Su, and thus u is a common fixed point of S and T by Fact 1.

Now, let us show that F(S) ∩ F(T) is a singleton. Assume on the contrary that Su = Tu = u and Sv = Tv = v but u ≠ v. Then

d(u, v)^δ = d(Su, Tv)^δ ≤ φ(d(u, v)^α · d(u, Su)^β · d(v, Tv)^γ) = φ(d(u, v)^α · d(u, u)^β · d(v, v)^γ) ≤ φ(d(u, v)^δ) < d(u, v)^δ,

a contradiction because d(u, v)^δ > 1. Hence u = v and F(S) ∩ F(T) = {u}.

Let u ∈ F(S^n) ∩ F(T^n) be arbitrary for n > 1 (the statement for n = 1 is trivial). Now,

d(u, Tu)^δ = d(S(S^{n−1}u), T(T^n u))^δ ≤ φ(d(S^{n−1}u, T^n u)^α · d(S^{n−1}u, S^n u)^β · d(T^n u, T^{n+1}u)^γ) = φ(d(S^{n−1}u, T^n u)^α · d(S^{n−1}u, T^n u)^β · d(u, Tu)^γ) < d(S^{n−1}u, T^n u)^{α+β} · d(u, Tu)^γ,

which implies that d(u, Tu) < d(S^{n−1}u, T^n u)^λ, where λ = (α+β)/(δ−γ) = 1. Thus

d(u, Tu) = d(S^n u, T^{n+1}u) < d(S^{n−1}u, T^n u) < d(S^{n−2}u, T^{n−1}u) < … < d(u, Tu),

a contradiction. Hence d(u, Tu) = 1, that is, u = Tu. Similarly, it can be shown that Su = u. Thus u ∈ F(S) ∩ F(T), and hence S and T have property Q.

Theorem 3 Let (X, d) be a complete multiplicative metric space and S, T : X → X. If there exists an upper semi-continuous nondecreasing function φ : [1, ∞) → [1, ∞), obeying φ(t) < t for all t > 1, such that for all x, y ∈ X,

d(Sx, Ty)^δ ≤ φ(d(x, y)^α · d(x, Sx)^β · d(y, Ty)^γ),

where α, β, γ > 0 with δ = α + β + γ ∈ (0, ∞), then the common fixed point problem of S and T is well-posed on X.

Proof By Theorem 2, S and T have a unique common fixed point u ∈ X. Let {x_n} be a sequence in X such that d(Sx_n, x_n) → 1 as n → ∞ (the case d(Tx_n, x_n) → 1 is analogous). Then

d(x_n, u)^δ ≤ d(x_n, Sx_n)^δ · d(Sx_n, Tu)^δ ≤ d(x_n, Sx_n)^δ · φ(d(x_n, u)^α · d(x_n, Sx_n)^β · d(u, Tu)^γ) = d(x_n, Sx_n)^δ · φ(d(x_n, u)^α · d(x_n, Sx_n)^β) ≤ d(x_n, Sx_n)^{δ+β} · d(x_n, u)^α,

which implies that d(x_n, u)^{β+γ} ≤ d(x_n, Sx_n)^{δ+β}, that is, d(x_n, u) ≤ [d(x_n, Sx_n)]^{(δ+β)/(β+γ)}. Taking the limit as n → ∞ implies d(x_n, u) → 1.

Corollary 1 Let (X, d) be a complete multiplicative metric space and S, T : X → X. If there exists an upper semi-continuous nondecreasing function φ : [1, ∞) → [1, ∞), with ∏_{n=1}^{∞} φ^n(t) convergent for each t > 1, such that for all x, y ∈ X,

d(S^s x, T^t y)^δ ≤ φ(d(x, y)^α · d(x, S^s x)^β · d(y, T^t y)^γ),

where α, β, γ ≥ 0 with δ = α + β + γ ∈ (0, ∞) and s, t ∈ N, then F(S) ∩ F(T) is singleton and the pair (S, T) has property Q.

Proof It follows from Theorem 2 that S^s and T^t have a unique common fixed point w. Now S(w) = S(S^s(w)) = S^{s+1}(w) = S^s(S(w)) and T(w) = T(T^t(w)) = T^{t+1}(w) = T^t(T(w)) imply that Sw and Tw are also fixed points of S^s and T^t, respectively. Since the common fixed point of S^s and T^t is unique, we deduce that w = Sw = Tw. It is obvious that every fixed point of S is a fixed point of T, and conversely.

If we take φ(t) = t^k for k ∈ [0, 1) in Theorem 2, we obtain the following corollary.

Corollary 2 Let (X, d) be a complete multiplicative metric space and S, T : X → X. Suppose that there exists k ∈ [0, 1) such that for all x, y ∈ X,

d(Sx, Ty)^δ ≤ (d(x, y)^α · d(x, Sx)^β · d(y, Ty)^γ)^k,

where α, β, γ ≥ 0 with δ = α + β + γ ∈ (0, ∞). Then F(S) ∩ F(T) is singleton and the pair (S, T) has property Q.
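A toy instance of this power contraction can be checked numerically. The sketch below is our own assumed example (not from the paper): on X = R_{>0} with the multiplicative metric d(x, y) = e^{|ln x − ln y|}, taking S = T with S(x) = x^k gives d(Sx, Ty) = d(x, y)^k exactly, so the hypothesis holds with α = δ = 1, β = γ = 0, and the unique common fixed point is x = 1.

```python
import math

# Assumed toy instance of Corollary 2 (not from the paper): S = T with
# S(x) = x**k on (R_{>0}, d) satisfies d(Sx, Ty) = d(x, y)**k, and the
# Picard iterates converge multiplicatively to the fixed point x = 1.
def d(x, y):
    """Multiplicative metric d(x, y) = e^{|ln x - ln y|} on R_{>0}."""
    return math.exp(abs(math.log(x) - math.log(y)))

k = 0.5
def S(x):
    return x ** k
T = S

# Contraction identity d(Sx, Ty) = d(x, y)**k on sample pairs:
for x, y in [(2.0, 3.0), (0.1, 5.0), (7.0, 7.0)]:
    assert math.isclose(d(S(x), T(y)), d(x, y) ** k)

# Picard iterates x_{n+1} = S(x_n) converge multiplicatively to 1:
x = 100.0
for _ in range(30):
    x = S(x)
assert d(x, 1.0) < 1.0 + 1e-6
```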

Corollary 3 Let (X, d) be a complete multiplicative metric space, S, T : X → X, and suppose that there exists an upper semi-continuous nondecreasing function φ : [1, ∞) → [1, ∞), with ∏_{n=1}^{∞} φ^n(t) convergent for each t > 1, such that one of the following conditions is satisfied for all x, y ∈ X:
(i) d(Sx, Ty) ≤ φ(d(x, y));
(ii) d(Sx, Ty) ≤ φ(d(x, Sx));
(iii) d(Sx, Ty) ≤ φ(d(y, Ty)).
Then F(S) ∩ F(T) is singleton and the pair (S, T) has property Q.

Proof Taking (i) α = 1 and β = γ = 0; (ii) β = 1, α = γ = 0; (iii) γ = 1, α = β = 0 in Theorem 2, respectively, the conclusion of Corollary 3 follows from Theorem 2 immediately.

Corollary 4 Let (X, d) be a complete multiplicative metric space and S, T : X → X. If there exists an upper semi-continuous nondecreasing function φ : [1, ∞) → [1, ∞), with ∏_{n=1}^{∞} φ^n(t) convergent for each t > 1, such that one of the following conditions is satisfied for all x, y ∈ X:
(i) d(Sx, Ty)^2 ≤ φ(d(x, y) · d(x, Sx));
(ii) d(Sx, Ty)^2 ≤ φ(d(x, y) · d(y, Ty));
(iii) d(Sx, Ty)^2 ≤ φ(d(x, Sx) · d(y, Ty)),
then F(S) ∩ F(T) is singleton and the pair (S, T) has property Q.

Proof Taking (i) α = β = 1 and γ = 0; (ii) α = γ = 1, β = 0; (iii) β = γ = 1, α = 0 in Theorem 2, respectively, the conclusion of Corollary 4 follows from Theorem 2 immediately.

Corollary 5 Let (X, d) be a complete multiplicative metric space and S, T : X → X. Suppose that there exists an upper semi-continuous nondecreasing function φ : [1, ∞) → [1, ∞), with ∏_{n=1}^{∞} φ^n(t) convergent for each t > 1, such that for all x, y ∈ X, d(Sx, Ty)^3 ≤ φ(d(x, y) · d(x, Sx) · d(y, Ty)). Then F(S) ∩ F(T) is singleton and the pair (S, T) has property Q.

Proof Taking α = β = γ = 1 in Theorem 2, the conclusion of Corollary 5 follows from Theorem 2 immediately.

Corollary 6 Let (X, d) be a complete multiplicative metric space and T : X → X. If there exists an upper semi-continuous nondecreasing function φ : [1, ∞) → [1, ∞), with ∏_{n=1}^{∞} φ^n(t) convergent for each t > 1, such that for all x, y ∈ X,

d(Tx, T²y)^δ ≤ φ(d(x, y)^α · d(x, Tx)^β · d(y, T²y)^γ),

where α, β, γ ≥ 0 with δ = α + β + γ ∈ (0, ∞), then F(T) ∩ F(T²) is singleton and the pair (T, T²) has property Q.

Proof Take S = T and replace T by T² in (17.1); then Corollary 6 follows from Theorem 2.

17.3 Cyclic Contractions

Now we obtain a common fixed point result for self-maps satisfying a cyclic contraction defined on a multiplicative metric space.

Definition 6 Let {X_i : i = 1, 2, …, p} be a finite collection of non-empty subsets of a set X, where p is some positive integer, and S, T : X → X. The set X is said to have a cyclic representation with respect to the collection {X_i : i = 1, 2, …, p} and a pair (S, T) if
(1) X = ∪_{i=1}^{p} X_i;
(2) S(X_1) ⊆ X_2, T(X_2) ⊆ X_3, …, S(X_{p−1}) ⊆ X_p, T(X_p) ⊆ X_1.

Theorem 4 Let (X, d) be a multiplicative metric space, A_1, A_2, …, A_p non-empty closed subsets of X and Y = ∪_{i=1}^{p} A_i. Suppose that S, T : Y → Y are such that
(i) Y has a cyclic representation with respect to the pair (S, T) and the collection {A_i : i = 1, 2, …, p};
(ii) there exists an upper semi-continuous nondecreasing function φ : [1, ∞) → [1, ∞), with ∏_{n=1}^{∞} φ^n(t) convergent for each t > 1, such that for any (x, y) ∈ A_i × A_{i+1}, i = 1, 2, …, p,

d(Sx, Ty)^δ ≤ φ(d(x, y)^α · d(x, Sx)^β · d(y, Ty)^γ)    (17.4)

holds with A_{p+1} = A_1, where α, β, γ ≥ 0 with δ = α + β + γ ∈ (0, ∞).

Then F(S) ∩ F(T) is singleton with F(S) ∩ F(T) ⊆ ∩_{i=1}^{p} A_i.

Proof To show that F(S) ∩ F(T) is a singleton, we divide the proof into four facts.

Fact 1. The sequence {x_n} constructed from S and T satisfies lim_{n→∞} d(x_n, x_{n+1}) = 1.

Let x_0 be a given point of ∪_{i=1}^{p} A_i, say x_0 ∈ A_{i_0}. Choose a point x_1 in A_{i_0+1} and a point x_2 in A_{i_0+2} such that S(x_0) = x_1 and T(x_1) = x_2; this can be done because S(A_{i_0}) ⊆ A_{i_0+1} and T(A_{i_0+1}) ⊆ A_{i_0+2}. Continuing this process, for each n > 0 there exists i_n ∈ {1, 2, …, p} such that, having chosen x_{2n} in A_{i_n}, we obtain x_{2n+1} in A_{i_n+1} and x_{2n+2} in A_{i_n+2} with S(x_{2n}) = x_{2n+1} and T(x_{2n+1}) = x_{2n+2}. If for some n_0 ≥ 0 we have x_{2n_0} = x_{2n_0+1}, then x_{2n_0} = S(x_{2n_0}), so x_{2n_0} is a fixed point of S. And from (17.4),

d(x_{2n_0+1}, x_{2n_0+2})^δ = d(Sx_{2n_0}, Tx_{2n_0+1})^δ ≤ φ(d(x_{2n_0}, x_{2n_0+1})^α · d(x_{2n_0}, Sx_{2n_0})^β · d(x_{2n_0+1}, Tx_{2n_0+1})^γ) = φ(d(x_{2n_0+1}, x_{2n_0+2})^γ) ≤ d(x_{2n_0+1}, x_{2n_0+2})^γ,

which implies that x_{2n_0+1} = x_{2n_0+2}. Thus x_{2n_0} = S(x_{2n_0}) = T(x_{2n_0}), so x_{2n_0} is a common fixed point of S and T. Now, taking x_{2n} ≠ x_{2n+1} for all n ∈ N, from (17.4) we consider

d(x_{2n+1}, x_{2n+2})^δ = d(Sx_{2n}, Tx_{2n+1})^δ ≤ φ(d(x_{2n}, x_{2n+1})^α · d(x_{2n}, Sx_{2n})^β · d(x_{2n+1}, Tx_{2n+1})^γ) = φ(d(x_{2n}, x_{2n+1})^{α+β} · d(x_{2n+1}, x_{2n+2})^γ) < d(x_{2n}, x_{2n+1})^{α+β} · d(x_{2n+1}, x_{2n+2})^γ,

which implies that d(x_{2n+1}, x_{2n+2})^{α+β} < d(x_{2n}, x_{2n+1})^{α+β}. If α + β = 0, a contradiction arises. Thus α + β > 0, and for all n ∈ N, d(x_{2n+1}, x_{2n+2}) < d(x_{2n}, x_{2n+1}). Again from (17.4), we have

d(x_{2n+2}, x_{2n+3})^δ = d(Tx_{2n+1}, Sx_{2n+2})^δ = d(Sx_{2n+2}, Tx_{2n+1})^δ ≤ φ(d(x_{2n+2}, x_{2n+1})^α · d(x_{2n+2}, Sx_{2n+2})^β · d(x_{2n+1}, Tx_{2n+1})^γ) = φ(d(x_{2n+1}, x_{2n+2})^{α+γ} · d(x_{2n+2}, x_{2n+3})^β) < d(x_{2n+1}, x_{2n+2})^{α+γ} · d(x_{2n+2}, x_{2n+3})^β,

which implies that d(x_{2n+2}, x_{2n+3})^{α+γ} < d(x_{2n+1}, x_{2n+2})^{α+γ}. If α + γ = 0, a contradiction arises. Hence α + γ > 0, and for all n ∈ N, d(x_{2n+2}, x_{2n+3}) < d(x_{2n+1}, x_{2n+2}). Consequently, for all n ∈ N,

d(x_n, x_{n+1}) < d(x_{n−1}, x_n).    (17.5)

Therefore, the decreasing sequence of positive real numbers {d(x_n, x_{n+1})} converges to some c ≥ 1. If we assume that c > 1, then from (17.5) we deduce that

c^δ = lim_{n→∞} d(x_{2n+1}, x_{2n+2})^δ = lim_{n→∞} d(Sx_{2n}, Tx_{2n+1})^δ ≤ lim sup_{n→∞} φ(d(x_{2n}, x_{2n+1})^α · d(x_{2n}, Sx_{2n})^β · d(x_{2n+1}, Tx_{2n+1})^γ) = lim sup_{n→∞} φ(d(x_{2n}, x_{2n+1})^α · d(x_{2n}, x_{2n+1})^β · d(x_{2n+1}, x_{2n+2})^γ) ≤ φ(c^{α+β+γ}) = φ(c^δ) < c^δ,

a contradiction, so c = 1, that is, lim_{n→∞} d(x_n, x_{n+1})^δ = 1, and so lim_{n→∞} d(x_n, x_{n+1}) = 1.

Fact 2. The sequence {x_n} constructed in Fact 1 is multiplicative Cauchy in (X, d).

Indeed, d(x_n, x_{n+1})^δ ≤ φ(d(x_{n−1}, x_n)^δ) ≤ … ≤ φ^n(d(x_0, x_1)^δ). Now, for m, n ∈ N such that m > n,

d(x_n, x_m)^δ ≤ d(x_n, x_{n+1})^δ · d(x_{n+1}, x_{n+2})^δ · … · d(x_{m−1}, x_m)^δ ≤ φ^n(d(x_0, x_1)^δ) · φ^{n+1}(d(x_0, x_1)^δ) · … · φ^{m−1}(d(x_0, x_1)^δ),

which implies that d(x_n, x_m)^δ converges to 1 as n, m → ∞. Thus lim_{n,m→∞} d(x_n, x_m) = 1, that is, {x_n} is a multiplicative Cauchy sequence in (X, d).

Fact 3. F(S) ∩ F(T) ≠ ∅.

Indeed, since (X, d) is a complete multiplicative metric space, there exists u in X such that lim_{n→∞} d(u, x_n) = 1.

Now we show that u ∈ ∩_{i=1}^{p} A_i. By the cyclic representation (condition (i)), and since x_0 ∈ A_{i_0} for some i_0 ∈ {1, 2, …, p}, we can choose a subsequence {x_{n_k}} of {x_n} contained in A_{i_0}. Obviously, {Sx_{n_k}} ⊆ S(A_{i_0}) ⊆ A_{i_0+1}; as A_{i_0+1} is closed, u ∈ A_{i_0+1}. Similarly, we can choose a subsequence {x_{n_k+1}} of {x_n} contained in A_{i_0+1}; then {Tx_{n_k+1}} ⊆ T(A_{i_0+1}) ⊆ A_{i_0+2}, and as A_{i_0+2} is closed, u ∈ A_{i_0+2}. Continuing this way, we obtain u ∈ ∩_{i=1}^{p} A_i, and hence ∩_{i=1}^{p} A_i ≠ ∅.

Now we show that S(u) = u. Since u ∈ ∩_{i=1}^{p} A_i, there exists some i in {1, 2, …, p} such that u ∈ A_i. Choose a subsequence {x_{2n_k+1}} of {x_n} with x_{2n_k+1} ∈ A_{i+1}. Assume on the contrary that d(Su, u) > 1; then from (17.4) we have

d(Su, x_{2n_k+2})^δ = d(Su, Tx_{2n_k+1})^δ ≤ φ(d(u, x_{2n_k+1})^α · d(u, Su)^β · d(x_{2n_k+1}, Tx_{2n_k+1})^γ) = φ(d(u, x_{2n_k+1})^α · d(u, Su)^β · d(x_{2n_k+1}, x_{2n_k+2})^γ).    (17.6)

Taking the upper limit as k → ∞ in (17.6), we deduce that

d(Su, u)^δ ≤ lim sup_{k→∞} φ(d(u, x_{2n_k+1})^α · d(u, Su)^β · d(x_{2n_k+1}, Tx_{2n_k+1})^γ) ≤ φ(d(u, u)^α · d(u, Su)^β · d(u, u)^γ) ≤ φ(d(u, Su)^β) < d(u, Su)^δ,

a contradiction. Hence u = Su. Similarly, we have u = Tu, and thus u is a common fixed point of S and T.

Fact 4. The set F(S) ∩ F(T) is a singleton.

Assume on the contrary that Su = Tu = u and Sv = Tv = v but u ≠ v. Then

d(u, v)^δ = d(Su, Tv)^δ ≤ φ(d(u, v)^α · d(u, Su)^β · d(v, Tv)^γ) = φ(d(u, v)^α · d(u, u)^β · d(v, v)^γ) ≤ φ(d(u, v)^δ) < d(u, v)^δ,

a contradiction because d(u, v)^δ > 1. Hence u = v and F(S) ∩ F(T) = {u}.

Example 2 Let X = R, and let d be the multiplicative metric on X defined by d(x, y) = a^{|x−y|}, where a > 1 is a real number. For some c > 1, set A_1 = [−c, 0], A_2 = [0, c] and A_3 = A_1. Define S, T : A_1 ∪ A_2 → A_1 ∪ A_2 by S(x) = −(k_1/c)x and T(x) = −(k_2/c)x, where 0 < k_1 ≤ k_2 ≤ c/2. Note that S(A_1) = [0, k_1] ⊆ [0, c] = A_2 and T(A_2) = [−k_2, 0] ⊆ [−c, 0] = A_1, so Y = A_1 ∪ A_2 has a cyclic representation with respect to the pair (S, T).

Define φ : [1, ∞) → [1, ∞) by

φ(t) = t^{4/5} if t ∈ [1, a^{δc}),  φ(t) = a^{4δc/5} if t ≥ a^{δc},

where δ ∈ (0, ∞). Clearly φ is upper semi-continuous and nondecreasing with ∏_{n=1}^{∞} φ^n(t) convergent for all t > 1.

We show that condition (17.4) is satisfied for α = δ/2, β = γ = δ/4. For x ∈ A_1, y ∈ A_2,

d(Sx, Ty)^δ = d(−k_1 x/c, −k_2 y/c)^δ = a^{(δ/c)(k_2 y − k_1 x)} ≤ a^{(4/5)[(δ/2)(y−x) + (δ/4)(1 + k_1/c)|x| + (δ/4)(1 + k_2/c)y]} = φ(a^{α(y−x) + β(1 + k_1/c)|x| + γ(1 + k_2/c)y}) = φ(d(x, y)^α · d(x, Sx)^β · d(y, Ty)^γ).

When x ∈ A_2, y ∈ A_1,

d(Sx, Ty)^δ = d(−k_1 x/c, −k_2 y/c)^δ = a^{(δ/c)(k_1 x − k_2 y)} ≤ a^{(4/5)[(δ/2)(x−y) + (δ/4)(1 + k_1/c)x + (δ/4)(1 + k_2/c)|y|]} = φ(a^{α(x−y) + β(1 + k_1/c)x + γ(1 + k_2/c)|y|}) = φ(d(x, y)^α · d(x, Sx)^β · d(y, Ty)^γ).

Thus S and T satisfy all the conditions of Theorem 4. Moreover, S and T have a unique common fixed point.
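A numerical companion to Example 2 (with assumed sample parameters c = 4, k_1 = 1, k_2 = 2, which satisfy 0 < k_1 ≤ k_2 ≤ c/2): the maps cycle between A_1 = [−c, 0] and A_2 = [0, c], and the alternating iterates converge to the unique common fixed point x = 0.

```python
# Numerical companion to Example 2 (assumed parameter values, our own
# choice): S(x) = -(k1/c) x and T(x) = -(k2/c) x cycle between
# A1 = [-c, 0] and A2 = [0, c], and the alternating iterates
# x_{2n+1} = S(x_{2n}), x_{2n+2} = T(x_{2n+1}) shrink to the common
# fixed point x = 0 (each full cycle scales |x| by k1*k2/c**2).
c, k1, k2 = 4.0, 1.0, 2.0

def S(x):
    return -(k1 / c) * x

def T(x):
    return -(k2 / c) * x

x = -3.0                     # start in A1 = [-c, 0]
for n in range(50):
    x = S(x)                 # S maps A1 into A2
    assert 0.0 <= x <= c
    x = T(x)                 # T maps A2 back into A1
    assert -c <= x <= 0.0
assert abs(x) < 1e-12        # iterates converge to the common fixed point 0
```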

17.4 Applications

Let Ω = [0, 1] be a bounded set in R and let L²(Ω) be the set of measurable functions on Ω whose square is integrable on Ω. Consider the integral equations

x(t) = ∫_Ω q_1(t, s, x(s)) ds + k(t),
y(t) = ∫_Ω q_2(t, s, y(s)) ds + k(t),    (17.7)

where q_1, q_2 : Ω × Ω × R → R and k : Ω → R are given continuous mappings. Altun and Simsek [8] obtained the common solution of the integral equations (17.7) as an application in ordered Banach spaces. We study a sufficient condition for the existence of a common solution of these integral equations in the framework of multiplicative metric spaces.

Let X = L²(Ω) and define d : X × X → [1, ∞) by d(x, y) = e^{sup_{t∈Ω} |x(t) − y(t)|}. Then (X, d) is a complete multiplicative metric space. Suppose that there exists an upper semi-continuous nondecreasing function φ : [1, ∞) → [1, ∞), with ∏_{n=1}^{∞} φ^n(t) convergent for each t > 1, such that

sup_{t∈Ω} ∫_Ω |q_1(t, s, u(s)) − q_2(t, s, v(s))| ds ≤ ln [φ(e^{δ sup_{t∈Ω} |u(t)−v(t)|})]^{1/δ}

for all u, v ∈ X, where δ ∈ (0, ∞). Then the integral equations (17.7) have a common solution in L²(Ω).

Proof Let (Sx)(t) = ∫_Ω q_1(t, s, x(s)) ds + k(t) and (Tx)(t) = ∫_Ω q_2(t, s, x(s)) ds + k(t). For all x, y ∈ X,

d(Sx, Ty)^δ = e^{δ sup_{t∈Ω} |(Sx)(t) − (Ty)(t)|} = e^{δ sup_{t∈Ω} |∫_Ω q_1(t, s, x(s)) ds − ∫_Ω q_2(t, s, y(s)) ds|} ≤ e^{δ sup_{t∈Ω} ∫_Ω |q_1(t, s, x(s)) − q_2(t, s, y(s))| ds} ≤ φ(e^{δ sup_{t∈Ω} |x(t)−y(t)|}) = φ(d(x, y)^α · d(x, Sx)^β · d(y, Ty)^γ).

Thus (17.1) is satisfied for δ = α + β + γ ∈ (0, ∞) with β = γ = 0 (so α = δ). Now Theorem 2 yields a common solution of the integral equations (17.7) in L²(Ω).

Now, consider another integral equation

p(t, x(t)) = ∫_Ω q(t, s, x(s)) ds,    (17.8)

where p : Ω × R → R and q : Ω × Ω × R → R are two mappings. Hussain, Khan and Agarwal [17] obtained the solution of the implicit integral equation (17.8) as an application of a Ky Fan type fixed point theorem in ordered Banach spaces (see also [15]). We study a sufficient condition for the existence of a solution of this integral equation in the framework of multiplicative metric spaces. We assume that there exists a function G : Ω × R → R_{>0} such that

(i) p(s, v(t)) ≥ ∫_Ω q(t, s, u(s)) ds ≥ G(s, v(t)) for each s, t ∈ Ω;
(ii) sup_{t∈Ω} [p(s, v(t)) − G(s, v(t))] ≤ ln [φ(e^{δ sup_{t∈Ω} |p(s, v(t)) − v(t)|})]^{1/δ} for each s ∈ Ω,

where φ : [1, ∞) → [1, ∞) is an upper semi-continuous and nondecreasing function with ∏_{n=1}^{∞} φ^n(t) convergent for each t > 1 and δ ∈ (0, ∞). Then the integral equation (17.8) has a solution in L²(Ω).

Proof Define (Sx)(t) = p(t, x(t)) and (Tx)(t) = ∫_Ω q(t, s, x(s)) ds. Now, by conditions (i) and (ii),

d(Sx, Ty)^δ = e^{δ sup_{t∈Ω} |(Sx)(t) − (Ty)(t)|} = e^{δ sup_{t∈Ω} |p(t, x(t)) − ∫_Ω q(t, s, y(s)) ds|} ≤ φ(e^{β sup_{t∈Ω} |p(t, x(t)) − x(t)|}) = φ(d(x, Sx)^β).

Thus, for all x, y ∈ X,

d(Sx, Ty)^δ ≤ φ(d(x, y)^α · d(x, Sx)^β · d(y, Ty)^γ),

where δ = β ∈ (0, ∞) (and α = γ = 0). Now Theorem 2 yields a solution of the integral equation (17.8) in L²(Ω).

Acknowledgements Talat Nazir is grateful to the ERASMUS MUNDUS "Featured europe and South/south-east Asia mobility Network FUSION" project and its Swedish node, the MAM research milieu in Mathematics and Applied Mathematics, Division of Mathematics and Physics, School of Education, Culture and Communication at Mälardalen University, for support and an excellent research and research education environment during his visits.

References

1. Abbas, M., Ali, B., Suleiman, Y.I.: Common fixed points of locally contractive mappings in multiplicative metric spaces with application. Int. J. Math. Math. Sci. 2015, 1–7 (2015). Article ID 218683
2. Abbas, M., Fisher, B., Nazir, T.: Well-posedness and periodic point property of mappings satisfying a rational inequality in an ordered complex valued metric space. Sci. Stud. Res. Ser. Math. Inform. 22(1), 2405–2416 (2012)
3. Abbas, M., De la Sen, M., Nazir, T.: Common fixed points of generalized rational type cocyclic mappings in multiplicative metric spaces. Discrete Dyn. Nat. Soc. 2015, 1–10 (2015). Article ID 532725
4. Abbas, M., Nazir, T., Radenović, S.: Some periodic point results in generalized metric spaces. Appl. Math. Comput. 217(8), 4094–4099 (2010)
5. Abbas, M., Rhoades, B.E.: Fixed and periodic point results in cone metric spaces. Appl. Math. Lett. 22, 511–515 (2009)
6. Alber, Y.I., Guerre-Delabriere, S.: Principle of weakly contractive maps in Hilbert space. In: Gohberg, I., Lyubich, Yu. (eds.) New Results in Operator Theory, Advances and Applications, vol. 98, pp. 7–22. Birkhäuser, Basel (1997)
7. Altun, I., Abbas, M., Simsek, H.: A fixed point theorem on cone metric spaces with new type contractivity. Banach J. Math. Anal. 5(2), 15–24 (2011)
8. Altun, I., Simsek, H.: Some fixed point theorems on ordered metric spaces and application. Fixed Point Theory Appl. 2010, 1–17 (2010). Article ID 621492
9. Bashirov, A.E., Kurpınar, E.M., Ozyapıcı, A.: Multiplicative calculus and its applications. J. Math. Anal. Appl. 337, 36–48 (2008)
10. Bashirov, A.E., Mısırlı, E., Tandoğdu, Y., Ozyapıcı, A.: On modeling with multiplicative differential equations. Appl. Math. J. Chin. Univ. 26(4), 425–438 (2011)
11. Chaipunya, P., Cho, Y.J., Kumam, P.: A remark on the property P and periodic points of order ∞. Math. Vesnik 66(4), 357–363 (2014)
12. Chen, C., Karapınar, E., Rakočević, V.: Existence of periodic fixed point theorems in the setting of generalized quasi metric spaces. J. Appl. Math. 2014, 1–8 (2014). Article ID 353765
13. Florack, L., van Assen, H.: Multiplicative calculus in biomedical image analysis. J. Math. Imaging Vis. 42(1), 64–75 (2012)
14. Gornicki, J., Rhoades, B.E.: A general fixed point theorem for involutions. Indian J. Pure Appl. Math. 27, 13–23 (1996)
15. Feckan, M.: Nonnegative solutions of nonlinear integral equations. Comment. Math. Univ. Carolin. 36, 615–627 (1995)
16. He, X., Song, M., Chen, D.: Common fixed points for weak commutative mappings on a multiplicative metric space. Fixed Point Theory Appl. 2014(48), 1–9 (2014)
17. Hussain, N., Khan, A.R., Agarwal, R.P.: Krasnoselskii and Ky Fan type fixed point theorems in ordered Banach spaces. J. Nonlinear Convex Anal. 11(3), 475–489 (2010)
18. Hussain, N., Latif, A., Salimi, P.: New fixed point results for contractive maps involving dominating auxiliary functions. J. Nonlinear Sci. Appl. 9, 4114–4126 (2016)
19. Latif, A., Mongkolkeha, C., Sintunavarat, W.: Fixed point theorems for generalized α-β-weakly contraction mappings in metric spaces and applications. Sci. World J. 2014, 1–14 (2014). Article ID 784207
20. Jeong, G.S., Rhoades, B.E.: Maps for which F(T) = F(T^n). Fixed Point Theory 6, 87–131 (2005)
21. Karapınar, E.: Fixed point theory for cyclic weak φ-contraction. Appl. Math. Lett. 24, 822–825 (2011)
22. Kumam, P., Rahimi, H., Rad, G.S.: The existence of fixed and periodic point theorems in cone metric type spaces. J. Nonlinear Sci. Appl. 7, 255–263 (2014)
23. Özavşar, M., Çevikel, A.C.: Fixed point of multiplicative contraction mappings on multiplicative metric space (2012). arXiv:1205.5131v1 [math.GN]
24. Păcurar, M., Rus, I.A.: Fixed point theory for cyclic φ-contractions. Nonlinear Anal. 72(3–4), 1181–1187 (2010)
25. Rhoades, B.E., Abbas, M.: Maps satisfying a contractive condition of integral type for which fixed point and periodic point coincide. Int. J. Pure Appl. Math. 45(2), 225–231 (2008)
26. Rahimi, H., Rhoades, B.E., Radenović, S., Rad, G.S.: Fixed and periodic point theorems for T-contractions on cone metric spaces. Filomat 27(5), 881–888 (2013)
27. Silvestrov, S.D., Tomiyama, J.: Topological dynamical systems of type I. Expo. Math. 20(2), 117–142 (2002)
28. Singh, K.L.: Sequences of iterates of generalized contractions. Fund. Math. 105, 115–126 (1980)
29. Yamaod, O., Sintunavarat, W.: Some fixed point results for generalized contraction mappings with cyclic (α, β)-admissible mapping in multiplicative metric spaces. J. Inequal. Appl. 2014(488), 1–15 (2014)

Chapter 18
Bochner Integrability of the Random Fixed Point of a Generalized Random Operator and Almost Sure Stability of Some Faster Random Iterative Processes

Godwin Amechi Okeke, Mujahid Abbas, and Sergei Silvestrov

Abstract We introduce a random version of some known faster fixed point iterative processes and approximate the random fixed point of a generalized random operator using these random iterative processes. Moreover, the Bochner integrability of the random fixed points of this kind of generalized random operator and the almost sure T-stability of these random iterative processes are proved. We apply our results to prove the existence of a solution of a nonlinear stochastic integral equation of Hammerstein type.

Keywords Random fixed point · Bochner integrability · Random iterative process · Almost sure T-stability

MSC 2020 47H09 · 47H10 · 49M05 · 54H25

G. A. Okeke
Department of Mathematics, School of Physical Sciences, Federal University of Technology, Owerri, P.M.B. 1526, Owerri, Imo State, Nigeria
e-mail: [email protected]

M. Abbas
Department of Mathematics, Government College University, 54000 Lahore, Pakistan
Department of Mathematics and Applied Mathematics, University of Pretoria, Pretoria, South Africa

S. Silvestrov (B)
Division of Mathematics and Physics, School of Education, Culture and Communication, Mälardalen University, Box 883, 72123 Västerås, Sweden
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_18

18.1 Introduction and Preliminaries

Random nonlinear analysis, an important branch of probabilistic functional analysis, deals with the solution of various classes of random operator equations and related problems (see [21]). The development of random methods has revolutionized the financial markets. Random fixed point theorems are stochastic generalizations of classical (deterministic) fixed point theorems and are required for the theory of random equations, random matrices, random partial differential equations and various classes of random operators arising in physical systems (see [15]). Random fixed point theory was initiated in the 1950s by the Prague school of probabilists: Spacek [32] and Hanš [12] established a stochastic analogue of the Banach fixed point theorem in a separable complete metric space. In 1979, Itoh [14] generalized and extended the Spacek–Hanš theorem to multivalued contraction random operators. Since then, several authors have proved interesting results in this direction (see, e.g., [6–8, 12, 14, 15, 17, 21–26, 29, 30, 32, 36]).

The purpose of this paper is to introduce random versions of some known faster fixed point iterative processes and to approximate the random fixed point of a generalized random operator using these processes. Moreover, the Bochner integrability of the random fixed points of this kind of generalized random operator and the almost sure T-stability of these random iterative processes are proved. We construct numerical examples to demonstrate the applicability of our results.

Let (Ω, Σ, μ) be a complete probability measure space and (E, B(E)) a measurable space, where E is a separable Banach space, B(E) is the Borel σ-algebra of E, and μ is a probability measure on the σ-algebra Σ, that is, a measure with total mass μ(Ω) = 1.
A mapping ξ : Ω → E is called
(a) an E-valued random variable if ξ is (Σ, B(E))-measurable;
(b) strongly μ-measurable if there exists a sequence {ξn} of μ-simple functions converging μ-almost everywhere to ξ.

Due to the separability of the Banach space E, the sum of two E-valued random variables is an E-valued random variable. A mapping T : Ω × E → E is called a random operator if for each fixed e in E, the mapping T(·, e) : Ω → E is measurable. Throughout this study, we assume that (Ω, ξ, μ) is a complete probability measure space and E is a nonempty subset of a separable Banach space X.

Definition 1 ([15]) A random variable x : Ω → E is Bochner integrable if

∫_Ω ‖x(ω)‖ dμ(ω) < ∞,

that is, if x(ω) ∈ L¹(Ω, ξ, μ). The Bochner integral is a natural generalization of the Lebesgue integral to vector-valued functions.

Proposition 1 ([15]) A random variable x : Ω → E is Bochner integrable if and only if there exists a sequence of random variables {xn}_{n=1}^∞ converging strongly to x almost surely such that

lim_{n→∞} ∫_Ω ‖xn(ω) − x(ω)‖ dμ(ω) = 0.

Definition 2 ([36]) Let (Ω, ξ, μ) be a complete probability measure space and E be a nonempty subset of a separable Banach space X. For a random operator T : Ω × E → E, denote by RF(T) = {x∗(ω) ∈ E : T(ω, x∗(ω)) = x∗(ω), ω ∈ Ω} the random fixed point set of T. For any given arbitrary measurable mapping x0 : Ω → E, let {xn(ω)}_{n=0}^∞ be a sequence of measurable mappings from Ω to E with

xn+1(ω) = f(T, xn(ω)), n = 0, 1, 2, . . . ,    (18.1)

where f is some function measurable in the second variable. Let x∗(ω) be a random fixed point of T that is Bochner integrable with respect to {xn(ω)}_{n=0}^∞. Let {yn(ω)}_{n=0}^∞ ⊆ E^Ω be an arbitrary sequence of measurable mappings, denote εn(ω) = ‖yn+1(ω) − f(T, yn(ω))‖, and assume that εn(ω) ∈ L¹(Ω, ξ, μ), n = 0, 1, 2, . . . . Then the iterative scheme (18.1) is T-stable almost surely (or stable with respect to T almost surely) if and only if

lim_{n→∞} ∫_Ω εn(ω) dμ(ω) = 0

implies that x∗(ω) is Bochner integrable with respect to {yn(ω)}_{n=0}^∞.

Definition 3 ([17]) A random operator T : Ω × C → C is of generalized φ-weakly contractive type if there exist L(ω) ≥ 0 and a continuous non-decreasing function φ : R+ → R+ with φ(t) > 0 for each t ∈ (0, ∞) and φ(0) = 0 such that for each x, y ∈ C and ω ∈ Ω,

∫_Ω ‖T(ω, x) − T(ω, y)‖ dμ(ω) ≤ e^{L(ω)‖x−T(ω,x)‖} [ ∫_Ω ‖x − y‖ dμ(ω) − φ( ∫_Ω ‖x − y‖ dμ(ω) ) ].    (18.2)

Thakur et al. [33] introduced the Thakur iteration and proved that it converges faster than the Picard, Mann, Ishikawa, S, Noor and Abbas iteration processes for Suzuki generalized nonexpansive mappings. In 2018, Ullah and Arshad [34] introduced the M-iteration and proved that it converges faster than all of the S [3], Picard-S [11], Picard, Mann [19], Ishikawa [13], Noor [20], SP [28], CR [9], S∗ [16], Abbas [1] and Normal-S [31] iteration processes.

Let T : Ω × C → C be a random operator, where C is a nonempty convex subset of X. For arbitrary measurable mappings x0 : Ω → C and u0 : Ω → C, the random versions of the M and Thakur iterations are given as follows:

x0(ω) ∈ C,
zn(ω) = (1 − αn)xn(ω) + αn T(ω, xn(ω)),
yn(ω) = T(ω, zn(ω)),
xn+1(ω) = T(ω, yn(ω));    (18.3)

u0(ω) ∈ C,
zn(ω) = (1 − βn)un(ω) + βn T(ω, un(ω)),
yn(ω) = T(ω, (1 − αn)un(ω) + αn zn(ω)),
un+1(ω) = T(ω, yn(ω)).    (18.4)
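For readers who want to experiment with the two schemes, the following minimal Python sketch runs both iterations for the illustrative operator T(ω, x) = (ω + x)/4 of Example 1, whose random fixed point is x∗(ω) = ω/3. The sample point of Ω, the parameter sequences and the iteration count are choices made for illustration only, not part of the theory.

```python
import random

def T(omega, x):
    # Illustrative operator from Example 1: T(omega, x) = (omega + x)/4,
    # with random fixed point x*(omega) = omega/3.
    return (omega + x) / 4.0

def m_iteration(omega, x0, alphas):
    # Random M-iteration (18.3): z_n, y_n, then x_{n+1}.
    x = x0
    for a in alphas:
        z = (1 - a) * x + a * T(omega, x)
        y = T(omega, z)
        x = T(omega, y)
    return x

def thakur_iteration(omega, u0, alphas, betas):
    # Random Thakur iteration (18.4).
    u = u0
    for a, b in zip(alphas, betas):
        z = (1 - b) * u + b * T(omega, u)
        y = T(omega, (1 - a) * u + a * z)
        u = T(omega, y)
    return u

if __name__ == "__main__":
    random.seed(1)
    omega = random.random()  # one sample point of Omega = [0, 1]
    n = 30
    alphas = [k * k / (1 + k * k) for k in range(1, n + 1)]
    betas = [k ** 3 / (1 + k ** 3) for k in range(1, n + 1)]
    xm = m_iteration(omega, 1.0, alphas)
    xt = thakur_iteration(omega, 1.0, alphas, betas)
    print(abs(xm - omega / 3), abs(xt - omega / 3))  # both errors are tiny
```

Since T(ω, ·) is a contraction with constant 1/4, each step of either scheme applies T several times, so both sequences approach ω/3 very quickly.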

The random M-iteration {xn(ω)} is given by relation (18.3), whereas the random Thakur iteration {un(ω)} is given by relation (18.4).

Lemma 1 ([4]) Let {γn} and {λn} be two sequences of nonnegative real numbers, and let {σn} be a sequence of positive numbers satisfying ∑_{n=1}^∞ σn = ∞ and lim_{n→∞} γn/σn = 0. If the following condition is satisfied:

λn+1 ≤ λn − σn φ(λn) + γn, ∀n ≥ 1,

where φ : R+ → R+ is a continuous and strictly increasing function with φ(0) = 0, then {λn} converges to 0 as n → ∞.

18.2 Bochner Integrability of the Fixed Point of a Generalized Random Operator

In this section we prove some random fixed point results for a generalized random operator satisfying condition (18.2).

Theorem 1 Let C be a nonempty closed and convex subset of a separable Banach space X, and let T : Ω × C → C be a random generalized φ-weakly contractive-type operator satisfying condition (18.2) with RF(T) ≠ ∅. Suppose x∗(ω) is the random fixed point of T and {xn(ω)} is the random M-iteration process defined by (18.3), where {αn} is a real sequence in (0, 1) such that ∑_{n=1}^∞ αn = ∞. Then the random fixed point x∗(ω) of T is Bochner integrable.

Proof To prove that x∗(ω) is Bochner integrable, it suffices to prove that

lim_{n→∞} ∫_Ω ‖xn(ω) − x∗(ω)‖ dμ(ω) = 0.

Using relations (18.2) and (18.3) we have

∫_Ω ‖xn+1(ω) − x∗(ω)‖ dμ(ω) = ∫_Ω ‖T(ω, yn(ω)) − T(ω, x∗(ω))‖ dμ(ω)
≤ e^{L(ω)‖x∗(ω)−T(ω,x∗(ω))‖} [ ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) ) ]
≤ ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω)
= ∫_Ω ‖T(ω, zn(ω)) − T(ω, x∗(ω))‖ dμ(ω)
≤ ∫_Ω ‖zn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖zn(ω) − x∗(ω)‖ dμ(ω) )
≤ ∫_Ω ‖zn(ω) − x∗(ω)‖ dμ(ω)
= ∫_Ω ‖(1 − αn)xn(ω) + αn T(ω, xn(ω)) − x∗(ω)‖ dμ(ω)
≤ (1 − αn) ∫_Ω ‖xn(ω) − x∗(ω)‖ dμ(ω) + αn ∫_Ω ‖T(ω, xn(ω)) − x∗(ω)‖ dμ(ω)
≤ (1 − αn) ∫_Ω ‖xn(ω) − x∗(ω)‖ dμ(ω) + αn [ ∫_Ω ‖xn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖xn(ω) − x∗(ω)‖ dμ(ω) ) ]
= ∫_Ω ‖xn(ω) − x∗(ω)‖ dμ(ω) − αn φ( ∫_Ω ‖xn(ω) − x∗(ω)‖ dμ(ω) ).

Next, we take λn = ∫_Ω ‖xn(ω) − x∗(ω)‖ dμ(ω), σn = αn and γn = 0. Using the condition on {αn} in Theorem 1, we see that the conditions of Lemma 1 are satisfied. Therefore, we obtain

lim_{n→∞} ∫_Ω ‖xn(ω) − x∗(ω)‖ dμ(ω) = 0.

Hence, x∗(ω) ∈ RF(T) is Bochner integrable. □

Theorem 2 Let C be a nonempty closed and convex subset of a separable Banach space X, and let T : Ω × C → C be a random generalized φ-weakly contractive-type operator satisfying condition (18.2) with RF(T) ≠ ∅. Suppose x∗(ω) is the random fixed point of T and {un(ω)} is the random Thakur iteration process defined by (18.4), where {αn} and {βn} are real sequences in (0, 1) such that ∑_{n=1}^∞ αn βn = ∞. Then the random fixed point x∗(ω) of T is Bochner integrable.

Proof To prove that x∗(ω) is Bochner integrable, it suffices to prove that

lim_{n→∞} ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) = 0.

Using relations (18.2) and (18.4) we have

∫_Ω ‖un+1(ω) − x∗(ω)‖ dμ(ω) = ∫_Ω ‖T(ω, yn(ω)) − T(ω, x∗(ω))‖ dμ(ω)
≤ e^{L(ω)‖x∗(ω)−T(ω,x∗(ω))‖} [ ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) ) ]
≤ ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω)
= ∫_Ω ‖T(ω, (1 − αn)un(ω) + αn zn(ω)) − T(ω, x∗(ω))‖ dμ(ω)
≤ ∫_Ω ‖(1 − αn)un(ω) + αn zn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖(1 − αn)un(ω) + αn zn(ω) − x∗(ω)‖ dμ(ω) )
≤ (1 − αn) ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) + αn ∫_Ω ‖zn(ω) − x∗(ω)‖ dμ(ω).    (18.5)

Now, we obtain the following estimate:

∫_Ω ‖zn(ω) − x∗(ω)‖ dμ(ω) = ∫_Ω ‖(1 − βn)un(ω) + βn T(ω, un(ω)) − x∗(ω)‖ dμ(ω)
≤ (1 − βn) ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) + βn ∫_Ω ‖T(ω, un(ω)) − x∗(ω)‖ dμ(ω)
≤ (1 − βn) ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) + βn [ ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) ) ]
= ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) − βn φ( ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) ).    (18.6)

Using (18.6) in (18.5), we have

∫_Ω ‖un+1(ω) − x∗(ω)‖ dμ(ω) ≤ (1 − αn) ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) + αn [ ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) − βn φ( ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) ) ]
= ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) − αn βn φ( ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) ).

Next, we take λn = ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω), σn = αn βn and γn = 0. Using the conditions on {αn} and {βn} in Theorem 2, we see that the conditions of Lemma 1 are satisfied. Therefore, we obtain

lim_{n→∞} ∫_Ω ‖un(ω) − x∗(ω)‖ dμ(ω) = 0.

This means that x∗(ω) ∈ RF(T) is Bochner integrable. □

Next, we prove the following theorem using the random version of the Picard-S iteration process.

Theorem 3 Let C be a nonempty closed and convex subset of a separable Banach space X, and let T : Ω × C → C be a random generalized φ-weakly contractive-type operator satisfying condition (18.2) with RF(T) ≠ ∅. Suppose x∗(ω) is the random fixed point of T and {pn(ω)} is the random Picard-S iteration process defined by

p0(ω) ∈ C,
pn+1(ω) = T(ω, yn(ω)),
yn(ω) = (1 − αn)T(ω, pn(ω)) + αn T(ω, zn(ω)),
zn(ω) = (1 − βn)pn(ω) + βn T(ω, pn(ω)),

where {αn} and {βn} are real sequences in (0, 1) such that ∑_{n=1}^∞ αn βn = ∞. Then the random fixed point x∗(ω) of T is Bochner integrable.

Proof The proof of Theorem 3 follows similar lines as in the proofs of Theorems 1 and 2. □

Next, we prove the following existence result in separable Banach spaces.

Theorem 4 Suppose X is a separable Banach space and (Ω, Σ, μ) is a complete probability measure space. Let T : Ω × X → X be a continuous random operator such that

‖T(ω, x1) − T(ω, x2)‖ ≤ e^{L(ω)‖x1−T(ω,x1)‖} ‖x1 − x2‖ − φ(‖x1 − x2‖) e^{L(ω)‖x1−T(ω,x1)‖}    (18.7)

almost surely for all x1, x2 ∈ X, where L(ω) ≥ 0 and φ : R+ → R+ is a continuous and non-decreasing function with φ(t) > 0 for each t ∈ (0, ∞) and φ(0) = 0 almost surely. Then T has a unique random fixed point.

Proof Suppose

A = {ω ∈ Ω : T(ω, x) is a continuous function of x},

C_{x1,x2} = {ω ∈ Ω : ‖T(ω, x1) − T(ω, x2)‖ ≤ e^{L(ω)‖x1−T(ω,x1)‖} ‖x1 − x2‖ − φ(‖x1 − x2‖) e^{L(ω)‖x1−T(ω,x1)‖}},

B = {φ : R+ → R+ | φ is continuous, non-decreasing, φ(0) = 0, and φ(t) > 0 for all t ∈ (0, ∞)},

K = {ω ∈ Ω : L(ω) ≥ 0}.

Suppose H is a countable dense subset of X. We show that

⋂_{x1,x2∈X} (C_{x1,x2} ∩ A ∩ B ∩ K) = ⋂_{h1,h2∈H} (C_{h1,h2} ∩ A ∩ B ∩ K).

First, we show that

⋂_{h1,h2∈H} C_{h1,h2} ∩ A ∩ B ∩ K ⊆ ⋂_{x1,x2∈X} C_{x1,x2} ∩ A ∩ B ∩ K.

Let ω ∈ ⋂_{h1,h2∈H} (C_{h1,h2} ∩ A ∩ B ∩ K); then for each h1, h2 ∈ H we have

‖T(ω, h1) − T(ω, h2)‖ ≤ e^{L(ω)‖h1−T(ω,h1)‖} ‖h1 − h2‖ − φ(‖h1 − h2‖) e^{L(ω)‖h1−T(ω,h1)‖}.

For x1, x2 ∈ X, we obtain

‖T(ω, x1) − T(ω, x2)‖ ≤ ‖T(ω, x1) − T(ω, h1)‖ + ‖T(ω, h1) − T(ω, h2)‖ + ‖T(ω, h2) − T(ω, x2)‖
≤ ‖T(ω, x1) − T(ω, h1)‖ + ‖T(ω, h2) − T(ω, x2)‖ + e^{L(ω)‖h1−T(ω,h1)‖} ‖h1 − h2‖ − φ(‖h1 − h2‖) e^{L(ω)‖h1−T(ω,h1)‖}
≤ ‖T(ω, x1) − T(ω, h1)‖ + ‖T(ω, h2) − T(ω, x2)‖ + e^{L(ω)‖h1−T(ω,h1)‖} [ ‖h1 − x1‖ + ‖x1 − x2‖ + ‖x2 − h2‖ ] − φ(‖x1 − x2‖) e^{L(ω)‖h1−T(ω,h1)‖}
= ‖T(ω, x1) − T(ω, h1)‖ + ‖T(ω, h2) − T(ω, x2)‖ + e^{L(ω)‖x1−T(ω,x1)‖} ‖h1 − x1‖ + e^{L(ω)‖x1−T(ω,x1)‖} ‖x1 − x2‖ + e^{L(ω)‖x1−T(ω,x1)‖} ‖x2 − h2‖ − φ(‖x1 − x2‖) e^{L(ω)‖x1−T(ω,x1)‖}.    (18.8)

Since for each ω ∈ Ω, T(ω, x) is a continuous function of x, for arbitrary ε > 0 there exist δi(xi) > 0 (i = 1, 2) such that ‖T(ω, x1) − T(ω, h1)‖ < ε/4 whenever ‖x1 − h1‖ < δ1(x1), and ‖T(ω, h2) − T(ω, x2)‖ < ε/4 whenever ‖h2 − x2‖ < δ2(x2).

Now choose δ1 = min{δ1(x1), ε/4} and δ2 = min{δ2(x2), ε/4}. With this choice of δ1, δ2, inequality (18.8) becomes

‖T(ω, x1) − T(ω, x2)‖ ≤ ε/4 + ε/4 + ε/4 + ε/4 + e^{L(ω)‖x1−T(ω,x1)‖} ‖x1 − x2‖ − φ(‖x1 − x2‖) e^{L(ω)‖x1−T(ω,x1)‖}
≤ ε + e^{L(ω)‖x1−T(ω,x1)‖} ‖x1 − x2‖ − φ(‖x1 − x2‖) e^{L(ω)‖x1−T(ω,x1)‖}.    (18.9)

Since ε > 0 is arbitrary, it follows from (18.9) that

‖T(ω, x1) − T(ω, x2)‖ ≤ e^{L(ω)‖x1−T(ω,x1)‖} ‖x1 − x2‖ − φ(‖x1 − x2‖) e^{L(ω)‖x1−T(ω,x1)‖}.

This means that ω ∈ ⋂_{x1,x2∈X} C_{x1,x2} ∩ A ∩ B ∩ K, which implies that

⋂_{h1,h2∈H} C_{h1,h2} ∩ A ∩ B ∩ K ⊆ ⋂_{x1,x2∈X} C_{x1,x2} ∩ A ∩ B ∩ K.    (18.10)

Similarly, we can easily show that

⋂_{x1,x2∈X} C_{x1,x2} ∩ A ∩ B ∩ K ⊆ ⋂_{h1,h2∈H} C_{h1,h2} ∩ A ∩ B ∩ K.    (18.11)

Hence, by (18.10) and (18.11), we have

⋂_{x1,x2∈X} C_{x1,x2} ∩ A ∩ B ∩ K = ⋂_{h1,h2∈H} C_{h1,h2} ∩ A ∩ B ∩ K.

Suppose M′ = ⋂_{h1,h2∈H} C_{h1,h2} ∩ A ∩ B ∩ K. Then μ(M′) = 1. Therefore, for all ω ∈ M′, T(ω, x) is a deterministic operator satisfying (18.7). Hence T has a unique random fixed point in X. □

Remark 1 Theorem 4 is a generalization of several results in the literature, including the results of Okeke, Bishop and Akewe [25, Theorem 3.4].

Example 1 Suppose Ω = [0, 1] and Σ is the σ-algebra of the Lebesgue measurable subsets of Ω. Let X = R, C = [0, 1] and define the generalized random operator T : Ω × C → C by T(ω, x) = (ω + x)/4. Then the measurable mapping x∗ : Ω → X defined by x∗(ω) = ω/3, for every ω ∈ Ω, is a random fixed point of T. With φ(t) = t/2 and L(ω) = 2, we have

∫_Ω ‖T(ω, x) − T(ω, y)‖ dμ(ω) ≤ e^{2‖x−T(ω,x)‖} [ ∫_Ω ‖x − y‖ dμ(ω) − φ( ∫_Ω ‖x − y‖ dμ(ω) ) ].

Clearly, T satisfies condition (18.2). Choose the prototype sequences αn = n²/(1 + n²) and βn = n³/(1 + n³). Then 0 < αn βn < 1 and ∑_{n=1}^∞ αn βn = ∑_{n=1}^∞ n⁵/((1 + n²)(1 + n³)) = ∞. Hence, all the conditions of Theorems 1, 2 and 3 are satisfied. Therefore, the random fixed point x∗(ω) of T(ω, x) is Bochner integrable.
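The Bochner integrability claim of Example 1 can be checked numerically. The sketch below approximates ∫_Ω |xn(ω) − ω/3| dμ(ω) for the random M-iteration by midpoint quadrature on Ω = [0, 1]; the grid size, the initial mapping x0(ω) = 1 and the step counts are illustrative assumptions, not part of the example.

```python
def T(omega, x):
    # Operator of Example 1; its random fixed point is x*(omega) = omega/3.
    return (omega + x) / 4.0

def integral_distance(n_steps, grid=1000):
    # Approximates int_Omega |x_n(omega) - omega/3| dmu(omega) by midpoint
    # quadrature, running the random M-iteration (18.3) with
    # alpha_k = k^2 / (1 + k^2) independently at each grid point omega.
    total = 0.0
    for i in range(grid):
        omega = (i + 0.5) / grid
        x = 1.0  # measurable initial mapping x_0(omega) = 1
        for k in range(1, n_steps + 1):
            a = k * k / (1.0 + k * k)
            z = (1 - a) * x + a * T(omega, x)
            x = T(omega, T(omega, z))  # y_n = T(z_n), x_{n+1} = T(y_n)
        total += abs(x - omega / 3.0) / grid
    return total

print(integral_distance(2), integral_distance(20))  # decreases toward 0
```

The integral distance shrinks by roughly a constant factor per step, consistent with the conclusion of Theorem 1 for this operator.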

18.3 Almost Sure T-Stability Results

In this section, we prove that the random iterative processes discussed in the previous section are T-stable almost surely. Moreover, we construct a numerical example to demonstrate the applicability of our results.

Theorem 5 Let C be a nonempty closed and convex subset of a separable Banach space X, and let T : Ω × C → C be a random generalized φ-weakly contractive-type operator satisfying condition (18.2) with RF(T) ≠ ∅. Suppose x∗(ω) is the random fixed point of T and {xn(ω)} is the random M-iteration process defined by (18.3), where {αn} is a real sequence in (0, 1) such that 0 < α ≤ αn. Then {xn(ω)} is T-stable almost surely.

Proof Let {yn(ω)} be an arbitrary sequence of measurable mappings in C and

εn(ω) = ‖yn+1(ω) − T(ω, an(ω))‖, n = 0, 1, 2, 3, . . . ,    (18.12)

where

an(ω) = T(ω, bn(ω)),
bn(ω) = (1 − αn)yn(ω) + αn T(ω, yn(ω)).    (18.13)

Suppose lim_{n→∞} ∫_Ω εn(ω) dμ(ω) = 0. We now prove that x∗(ω) is Bochner integrable with respect to the sequence {yn(ω)}. Using (18.2), (18.12) and (18.13) we have





∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) ≤ ∫_Ω ‖yn+1(ω) − T(ω, an(ω))‖ dμ(ω) + ∫_Ω ‖T(ω, an(ω)) − x∗(ω)‖ dμ(ω)
≤ ∫_Ω εn(ω) dμ(ω) + e^{L(ω)‖x∗(ω)−T(ω,x∗(ω))‖} [ ∫_Ω ‖an(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖an(ω) − x∗(ω)‖ dμ(ω) ) ]
≤ ∫_Ω εn(ω) dμ(ω) + ∫_Ω ‖an(ω) − x∗(ω)‖ dμ(ω)
= ∫_Ω εn(ω) dμ(ω) + ∫_Ω ‖T(ω, bn(ω)) − T(ω, x∗(ω))‖ dμ(ω)
≤ ∫_Ω εn(ω) dμ(ω) + ∫_Ω ‖bn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖bn(ω) − x∗(ω)‖ dμ(ω) )
≤ ∫_Ω εn(ω) dμ(ω) + ∫_Ω ‖bn(ω) − x∗(ω)‖ dμ(ω)
= ∫_Ω εn(ω) dμ(ω) + ∫_Ω ‖(1 − αn)yn(ω) + αn T(ω, yn(ω)) − x∗(ω)‖ dμ(ω)
≤ ∫_Ω εn(ω) dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn ∫_Ω ‖T(ω, yn(ω)) − T(ω, x∗(ω))‖ dμ(ω)
≤ ∫_Ω εn(ω) dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn [ ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) ) ]
= ∫_Ω εn(ω) dμ(ω) + ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) − αn φ( ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) ).    (18.14)

Using the assumptions that lim_{n→∞} ∫_Ω εn(ω) dμ(ω) = 0 and 0 < α ≤ αn for each n ∈ N, we have

lim_{n→∞} (1/αn) ∫_Ω εn(ω) dμ(ω) ≤ lim_{n→∞} (1/α) ∫_Ω εn(ω) dμ(ω) = 0.

In Lemma 1, we take

λn = ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω), σn = αn, γn = ∫_Ω εn(ω) dμ(ω).

Clearly, all the conditions of Lemma 1 are satisfied. Hence, we obtain

lim_{n→∞} ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) = 0.

Conversely, suppose x∗(ω) is Bochner integrable with respect to the sequence {yn(ω)}. Then we have

∫_Ω εn(ω) dμ(ω) = ∫_Ω ‖yn+1(ω) − T(ω, an(ω))‖ dμ(ω)
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + ∫_Ω ‖x∗(ω) − T(ω, an(ω))‖ dμ(ω)
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + e^{L(ω)‖x∗(ω)−T(ω,x∗(ω))‖} [ ∫_Ω ‖an(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖an(ω) − x∗(ω)‖ dμ(ω) ) ]
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + ∫_Ω ‖an(ω) − x∗(ω)‖ dμ(ω)
= ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + ∫_Ω ‖T(ω, bn(ω)) − T(ω, x∗(ω))‖ dμ(ω)
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + ∫_Ω ‖bn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖bn(ω) − x∗(ω)‖ dμ(ω) )
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + ∫_Ω ‖bn(ω) − x∗(ω)‖ dμ(ω)
= ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + ∫_Ω ‖(1 − αn)yn(ω) + αn T(ω, yn(ω)) − x∗(ω)‖ dμ(ω)
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn ∫_Ω ‖T(ω, yn(ω)) − T(ω, x∗(ω))‖ dμ(ω)
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn [ ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) ) ]
= ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) − αn φ( ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) ).    (18.15)

Hence, we have lim_{n→∞} ∫_Ω εn(ω) dμ(ω) = 0. This means that the random M-iteration process {xn(ω)} is T-stable almost surely. The proof of Theorem 5 is completed. □

Theorem 6 Let C be a nonempty closed and convex subset of a separable Banach space X, and let T : Ω × C → C be a random generalized φ-weakly contractive-type operator satisfying condition (18.2) with RF(T) ≠ ∅. Suppose x∗(ω) is the random fixed point of T and {pn(ω)} is the random Picard-S iteration process defined by

p0(ω) ∈ C,
pn+1(ω) = T(ω, yn(ω)),
yn(ω) = (1 − αn)T(ω, pn(ω)) + αn T(ω, zn(ω)),
zn(ω) = (1 − βn)pn(ω) + βn T(ω, pn(ω)),

where {αn} and {βn} are real sequences in (0, 1) such that 0 < α ≤ αn and 0 < β ≤ βn. Then {pn(ω)} is T-stable almost surely.

Proof Suppose {yn(ω)} is an arbitrary sequence of measurable mappings from Ω to C and εn(ω) = ‖yn+1(ω) − T(ω, an(ω))‖, n = 0, 1, 2, 3, . . . , where

an(ω) = (1 − αn)T(ω, yn(ω)) + αn T(ω, bn(ω)),
bn(ω) = (1 − βn)yn(ω) + βn T(ω, yn(ω)).

Let lim_{n→∞} ∫_Ω εn(ω) dμ(ω) = 0; we prove that x∗(ω) is Bochner integrable with respect to the sequence {yn(ω)}. Using (18.2) and the definitions of an(ω) and bn(ω), we have





∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) ≤ ∫_Ω ‖yn+1(ω) − T(ω, an(ω))‖ dμ(ω) + ∫_Ω ‖T(ω, an(ω)) − x∗(ω)‖ dμ(ω)
≤ ∫_Ω εn(ω) dμ(ω) + e^{L(ω)‖x∗(ω)−T(ω,x∗(ω))‖} [ ∫_Ω ‖an(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖an(ω) − x∗(ω)‖ dμ(ω) ) ]
≤ ∫_Ω εn(ω) dμ(ω) + ∫_Ω ‖an(ω) − x∗(ω)‖ dμ(ω)
= ∫_Ω εn(ω) dμ(ω) + ∫_Ω ‖(1 − αn)T(ω, yn(ω)) + αn T(ω, bn(ω)) − x∗(ω)‖ dμ(ω)
≤ ∫_Ω εn(ω) dμ(ω) + (1 − αn) ∫_Ω ‖T(ω, yn(ω)) − x∗(ω)‖ dμ(ω) + αn ∫_Ω ‖T(ω, bn(ω)) − x∗(ω)‖ dμ(ω)
≤ ∫_Ω εn(ω) dμ(ω) + (1 − αn) [ ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) ) ] + αn ∫_Ω ‖T(ω, bn(ω)) − x∗(ω)‖ dμ(ω)
≤ ∫_Ω εn(ω) dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn ∫_Ω ‖T(ω, bn(ω)) − T(ω, x∗(ω))‖ dμ(ω)
≤ ∫_Ω εn(ω) dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn [ ∫_Ω ‖bn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖bn(ω) − x∗(ω)‖ dμ(ω) ) ]
≤ ∫_Ω εn(ω) dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn ∫_Ω ‖bn(ω) − x∗(ω)‖ dμ(ω)
= ∫_Ω εn(ω) dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn ∫_Ω ‖(1 − βn)yn(ω) + βn T(ω, yn(ω)) − x∗(ω)‖ dμ(ω)
≤ ∫_Ω εn(ω) dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn(1 − βn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn βn ∫_Ω ‖T(ω, yn(ω)) − x∗(ω)‖ dμ(ω)
≤ ∫_Ω εn(ω) dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn(1 − βn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn βn [ ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) ) ]
= ∫_Ω εn(ω) dμ(ω) + ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) − αn βn φ( ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) ).

Using the assumptions that lim_{n→∞} ∫_Ω εn(ω) dμ(ω) = 0, 0 < α ≤ αn and 0 < β ≤ βn for each n ∈ N, we have

lim_{n→∞} (1/(αn βn)) ∫_Ω εn(ω) dμ(ω) ≤ lim_{n→∞} (1/(αβ)) ∫_Ω εn(ω) dμ(ω) = 0.

By Lemma 1, taking

λn = ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω), σn = αn βn, γn = ∫_Ω εn(ω) dμ(ω),

we see that all conditions of Lemma 1 are satisfied. Therefore, we have

lim_{n→∞} ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) = 0.

Conversely, suppose x∗(ω) is Bochner integrable with respect to the sequence {yn(ω)}. Then



∫_Ω εn(ω) dμ(ω) = ∫_Ω ‖yn+1(ω) − T(ω, an(ω))‖ dμ(ω)
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + ∫_Ω ‖x∗(ω) − T(ω, an(ω))‖ dμ(ω)
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + ∫_Ω ‖an(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖an(ω) − x∗(ω)‖ dμ(ω) )
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + ∫_Ω ‖an(ω) − x∗(ω)‖ dμ(ω)
= ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + ∫_Ω ‖(1 − αn)T(ω, yn(ω)) + αn T(ω, bn(ω)) − x∗(ω)‖ dμ(ω)
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + (1 − αn) ∫_Ω ‖T(ω, yn(ω)) − x∗(ω)‖ dμ(ω) + αn ∫_Ω ‖T(ω, bn(ω)) − x∗(ω)‖ dμ(ω)
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + (1 − αn) [ ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) ) ] + αn [ ∫_Ω ‖bn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖bn(ω) − x∗(ω)‖ dμ(ω) ) ]
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn ∫_Ω ‖bn(ω) − x∗(ω)‖ dμ(ω)
= ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn ∫_Ω ‖(1 − βn)yn(ω) + βn T(ω, yn(ω)) − x∗(ω)‖ dμ(ω)
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn(1 − βn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn βn ∫_Ω ‖T(ω, yn(ω)) − x∗(ω)‖ dμ(ω)
≤ ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + (1 − αn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn(1 − βn) ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) + αn βn [ ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) − φ( ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) ) ]
= ∫_Ω ‖yn+1(ω) − x∗(ω)‖ dμ(ω) + ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) − αn βn φ( ∫_Ω ‖yn(ω) − x∗(ω)‖ dμ(ω) ).

Hence, we have

lim_{n→∞} ∫_Ω εn(ω) dμ(ω) = 0.

This means that the random Picard-S iteration process {pn(ω)} is T-stable almost surely. The proof of Theorem 6 is completed. □

Theorem 7 Let C be a nonempty closed and convex subset of a separable Banach space X, and let T : Ω × C → C be a random generalized φ-weakly contractive-type operator satisfying condition (18.2) with RF(T) ≠ ∅. Suppose x∗(ω) is the random fixed point of T and {un(ω)} is the random Thakur iteration process defined by (18.4), where {αn} and {βn} are real sequences in (0, 1) such that 0 < α ≤ αn and 0 < β ≤ βn. Then {un(ω)} is T-stable almost surely.

Proof The proof of Theorem 7 follows similar lines as in the proofs of Theorems 5 and 6. □

Next, we give a numerical example to demonstrate the applicability of our results.

Example 2 Suppose that Ω = [0, 1] and Σ is the σ-algebra of the Lebesgue measurable subsets of Ω. Let X = R with the usual metric, C = [0, 1] and define the

generalized random operator T : Ω × C → C by T(ω, x) = (ω + x)/3. Then the measurable mapping x∗ : Ω → X defined by x∗(ω) = ω/2, for every ω ∈ Ω, is a random fixed point of T. Let φ(t) = t/2 and L(ω) = 3. Clearly, T satisfies condition (18.2). Choose the prototype sequence αn = n²/(1 + n²). Suppose {xn(ω)} is the random M-iteration (18.3). Let {yn(ω) = ω/n²} be a sequence of measurable mappings from Ω to C and

|εn(ω)| = |yn+1(ω) − T(ω, an(ω))|,

where

an(ω) = T(ω, bn(ω)),
bn(ω) = (1 − αn)yn(ω) + αn T(ω, yn(ω)).

Then the random M-iteration {xn(ω)} is T-stable almost surely. Suppose that lim_{n→∞} ∫_Ω |εn(ω)| dμ(ω) = 0. We now show that x∗(ω) = ω/2 is Bochner integrable. Using (18.14) and (18.15), we have

∫_Ω |yn+1(ω) − x∗(ω)| dμ(ω) ≤ ∫_Ω |εn(ω)| dμ(ω) + ∫_Ω |yn(ω) − x∗(ω)| dμ(ω) − αn φ( ∫_Ω |yn(ω) − x∗(ω)| dμ(ω) )
≤ ∫_0^1 | ω/(n+1)² − ω/2 | dμ(ω) + ∫_0^1 | ω/n² − ω/2 | dμ(ω) − (n²/(1 + n²)) · (1/2) ∫_0^1 | ω/n² − ω/2 | dμ(ω)
= ∫_0^1 | (2ω − ω(n+1)²)/(2(n+1)²) | dμ(ω) + ∫_0^1 | (2ω − ωn²)/(2n²) | dμ(ω) − (n²/(2(1 + n²))) ∫_0^1 | (2ω − n²ω)/(2n²) | dμ(ω) → 0 as n → ∞.

Therefore, lim_{n→∞} ∫_Ω |yn(ω) − x∗(ω)| dμ(ω) = 0.

Conversely, supposing that x∗(ω) = ω/2 is Bochner integrable with respect to the sequence {yn(ω)}, from (18.15) we have

∫_Ω |εn(ω)| dμ(ω) ≤ ∫_Ω |yn+1(ω) − x∗(ω)| dμ(ω) + ∫_Ω |yn(ω) − x∗(ω)| dμ(ω) − αn φ( ∫_Ω |yn(ω) − x∗(ω)| dμ(ω) )
= ∫_0^1 | ω/(n+1)² − ω/2 | dμ(ω) + ∫_0^1 | ω/n² − ω/2 | dμ(ω) − (n²/(2(1 + n²))) ∫_0^1 | ω/n² − ω/2 | dμ(ω) → 0 as n → ∞.

Therefore, lim_{n→∞} ∫_Ω |εn(ω)| dμ(ω) = 0. This means that the random M-iteration {xn(ω)} defined by (18.3) is T-stable almost surely. □

18.4 Application to a Random Nonlinear Integral Equation of Hammerstein Type

In this section, we apply Theorem 4 to prove the existence of a solution in a Banach space of a random nonlinear integral equation of the form

x(t; ω) = h(t; ω) + ∫_S k(t, s; ω) f(s, x(s; ω)) dμ0(s),    (18.16)

where
(i) S is a locally compact metric space with a metric d on S × S, equipped with a complete σ-finite measure μ0 defined on the collection of Borel subsets of S;
(ii) ω ∈ Ω, where ω is a supporting element of a probability measure space (Ω, β, μ);
(iii) x(t; ω) is the unknown vector-valued random variable for each t ∈ S;
(iv) h(t; ω) is the stochastic free term defined for t ∈ S;
(v) k(t, s; ω) is the stochastic kernel defined for t and s in S;
(vi) f(t, x) is a vector-valued function of t ∈ S and x.

The integral in equation (18.16) is interpreted as a Bochner integral (see Padgett [27]). Furthermore, we shall assume that S is the union of a countable family of compact sets {Cn} with the properties that C1 ⊂ C2 ⊂ · · · and that for any compact subset of S there is a Ci which contains it (see Arens [5]).

Definition 4 ([10]) We define the space C(S, L²(Ω, β, μ)) to be the space of all continuous functions from S into L²(Ω, β, μ) with the topology of uniform convergence on compacta, i.e. for each fixed t ∈ S, x(t; ω) is a vector-valued random variable such that

‖x(t; ω)‖²_{L²(Ω,β,μ)} = ∫_Ω |x(t; ω)|² dμ(ω) < ∞.

Note that C(S, L²(Ω, β, μ)) is a locally convex space whose topology is defined by the countable family of seminorms (see Yosida [35])

‖x(t; ω)‖_n = sup_{t∈Cn} ‖x(t; ω)‖_{L²(Ω,β,μ)}, n = 1, 2, . . . .

Moreover, C(S, L²(Ω, β, μ)) is complete relative to this topology, since the space L²(Ω, β, μ) is complete. We define BC = BC(S, L²(Ω, β, μ)) to be the Banach space of all bounded continuous functions from S into L²(Ω, β, μ) with norm

‖x(t; ω)‖_{BC} = sup_{t∈S} ‖x(t; ω)‖_{L²(Ω,β,μ)}.

The space BC ⊂ C(S, L²(Ω, β, μ)) is the space of all second-order vector-valued stochastic processes defined on S which are bounded and continuous in mean square. We will consider the functions h(t; ω) and f(t, x(t; ω)) to be in the space C(S, L²(Ω, β, μ)). For the stochastic kernel we assume that, for each pair (t, s), k(t, s; ω) ∈ L∞(Ω, β, μ), and we denote the norm by

‖k(t, s; ω)‖ = ‖k(t, s; ω)‖_{L∞(Ω,β,μ)} = μ-ess sup_{ω∈Ω} |k(t, s; ω)|.

Suppose that k(t, s; ω) is such that ‖k(t, s; ω)‖ · ‖x(s; ω)‖_{L²(Ω,β,μ)} is μ0-integrable with respect to s for each t ∈ S and each x(s; ω) in C(S, L²(Ω, β, μ)), and that there exists a real-valued function G, defined μ0-a.e. on S, such that G(s)‖x(s; ω)‖_{L²(Ω,β,μ)} is μ0-integrable and, for each pair (t, s) ∈ S × S,

‖k(t, u; ω) − k(s, u; ω)‖ · ‖x(u; ω)‖_{L²(Ω,β,μ)} ≤ G(u)‖x(u; ω)‖_{L²(Ω,β,μ)}   μ0-a.e.

Furthermore, for almost all s ∈ S, k(t, s; ω) is continuous in t from S into L∞(Ω, β, μ). Now, we define the random integral operator T on C(S, L²(Ω, β, μ)) by

(T x)(t; ω) = ∫_S k(t, s; ω) x(s; ω) dμ0(s),    (18.17)

where the integral is a Bochner integral. Moreover, for each t ∈ S, (T x)(t; ω) ∈ L²(Ω, β, μ), and (T x)(t; ω) is continuous in mean square by the Lebesgue dominated convergence theorem, so (T x)(t; ω) ∈ C(S, L²(Ω, β, μ)).

Definition 5 ([2, 18]) Let B and D be Banach spaces. The pair (B, D) is said to be admissible with respect to a random operator T(ω) if T(ω)(B) ⊂ D.

Lemma 2 ([15]) The linear operator T defined by (18.17) is a continuous operator from C(S, L²(Ω, β, μ)) into itself.

Lemma 3 ([15, 18]) Let T be a continuous linear operator from C(S, L²(Ω, β, μ)) into itself, and let B, D ⊂ C(S, L²(Ω, β, μ)) be Banach spaces stronger than C(S, L²(Ω, β, μ)) such that (B, D) is admissible with respect to T. Then T is continuous from B into D.

Remark 2 ([27]) The operator T defined by (18.17) is a bounded linear operator from B into D. Note that a random solution of equation (18.16) means a function x(t; ω) in C(S, L²(Ω, β, μ)) which satisfies equation (18.16) μ-a.e.

We now prove the following theorem.

Theorem 8 Consider the stochastic integral equation (18.16) subject to the following conditions:
(a) B and D are Banach spaces stronger than C(S, L²(Ω, β, μ)) such that (B, D) is admissible with respect to the integral operator defined by (18.17);


G. A. Okeke et al.

(b) x(t; ω) → f(t, x(t; ω)) is an operator from the set Q(ρ) = {x(t; ω) : x(t; ω) ∈ D, ‖x(t; ω)‖_D ≤ ρ} into the space B satisfying
\[
\|f(t,x_1(t;\omega)) - f(t,x_2(t;\omega))\|_B \le e^{L(\omega)\|x_1(t;\omega) - f(t,x_1(t;\omega))\|_D}\, \|x_1(t;\omega) - x_2(t;\omega)\|_D - \varphi\big(\|x_1(t;\omega) - x_2(t;\omega)\|_D\big)\, e^{L(\omega)\|x_1(t;\omega) - f(t,x_1(t;\omega))\|_D} \tag{18.18}
\]
for all x₁(t; ω), x₂(t; ω) ∈ Q(ρ), where φ : ℝ⁺ → ℝ⁺ is a continuous and nondecreasing function such that φ(t) > 0 for each t ∈ (0, ∞) and φ(0) = 0;
(c) h(t; ω) ∈ D.

Then there exists a unique random solution of (18.16) in Q(ρ), provided
\[
\|h(t;\omega)\|_D + c(\omega)\,\|f(t;0)\|_B \le \rho\,(1 - c(\omega)),
\]
where c(ω) is the norm of T(ω).

Proof We define the operator U(ω) from Q(ρ) into D as follows:
\[
(U x)(t;\omega) = h(t;\omega) + \int_S k(t,s;\omega)\, f(s,x(s;\omega))\, d\mu_0(s).
\]
Next we have
\[
\|(U x)(t;\omega)\|_D \le \|h(t;\omega)\|_D + c(\omega)\,\|f(t,x(t;\omega))\|_B \le \|h(t;\omega)\|_D + c(\omega)\,\|f(t;0)\|_B + c(\omega)\,\|f(t,x(t;\omega)) - f(t;0)\|_B. \tag{18.19}
\]
Using condition (18.18), we have
\[
\|f(t,x(t;\omega)) - f(t;0)\|_B \le e^{L(\omega)\|x(t;\omega)\|_D}\, \|x(t;\omega)\|_D - \varphi\big(\|x(t;\omega)\|_D\big)\, e^{L(\omega)\|x(t;\omega)\|_D} \le \rho - \varphi(\rho) \le \rho. \tag{18.20}
\]
Using (18.20) in (18.19), we have ‖(U x)(t; ω)‖_D ≤ ‖h(t; ω)‖_D + c(ω)‖f(t; 0)‖_B + c(ω)ρ ≤ ρ. This means that (U x)(t; ω) ∈ Q(ρ). Then for each x₁(t; ω), x₂(t; ω) ∈ Q(ρ), we have by using assumption (b) that


\[
\|(U x_1)(t;\omega) - (U x_2)(t;\omega)\|_D = \left\| \int_S k(t,s;\omega)\,\big[f(s,x_1(s;\omega)) - f(s,x_2(s;\omega))\big]\, d\mu_0(s) \right\|_D \le e^{L(\omega)\|x_1(t;\omega) - f(t,x_1(t;\omega))\|_D}\, \|x_1(t;\omega) - x_2(t;\omega)\|_D - \varphi\big(\|x_1(t;\omega) - x_2(t;\omega)\|_D\big)\, e^{L(\omega)\|x_1(t;\omega) - f(t,x_1(t;\omega))\|_D}.
\]
Since φ : ℝ⁺ → ℝ⁺ is a continuous and nondecreasing function such that φ(t) > 0 for each t ∈ (0, ∞) and φ(0) = 0, it follows that U(ω) is a nonlinear contractive operator on Q(ρ). Therefore, by Theorem 4 there exists a unique random fixed point x*(t, ω) of U(ω), which is the random solution of equation (18.16). The proof of Theorem 8 is completed. □

The following example demonstrates the applicability of Theorem 8.

Example 3 We consider the following nonlinear stochastic integral equation:
\[
x(t;\omega) = \frac{e}{3} \int_0^{\infty} \frac{e^{-t-s}}{16\,(1+|x(s;\omega)|)}\, ds - \frac{1}{4} \int_0^{\infty} \frac{e^{-t-s}}{16\,(1+|x(s;\omega)|)}\, ds. \tag{18.21}
\]
By comparing relation (18.21) with (18.16), we observe that
\[
h(t;\omega) = 0, \qquad k(t,s;\omega) = \frac{1}{4}\, e^{-t-s}, \qquad f(s,x(s;\omega)) = \frac{1}{4\,(1+|x(s;\omega)|)}.
\]
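As a purely numerical aside (not part of the original proof), the fixed point of the integral operator in Example 3 can be approximated by Picard iteration. The sketch below is illustrative only: it fixes ω so the equation becomes deterministic, uses only the identification h = 0, k(t, s; ω) = (1/4)e^{−t−s}, f(s, x) = 1/(4(1 + |x|)) from the comparison above, truncates the half-line at an arbitrary horizon T, and discretizes the integral with the trapezoid rule. Since the kernel factors as k(t, s) = (e^{−t}/4)e^{−s}, each application of the operator reduces to a single quadrature over s.

```python
import math

# Picard iteration for the deterministic specialization of Example 3:
#   x(t) = int_0^inf e^(-t-s) / (16 (1 + |x(s)|)) ds.
# T (truncation horizon) and N (grid size) are arbitrary numerical choices.
T, N = 30.0, 2000
h = T / N
s_grid = [i * h for i in range(N + 1)]

def apply_operator(x):
    # Separable kernel: (Tx)(t) = e^(-t) * int_0^T e^(-s) f(s, x(s)) ds.
    vals = [math.exp(-s) / (16.0 * (1.0 + abs(xs)))
            for s, xs in zip(s_grid, x)]
    integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))  # trapezoid rule
    return [math.exp(-t) * integral for t in s_grid]

x = [0.0] * (N + 1)          # start the iteration from the zero function
for _ in range(15):
    x_new = apply_operator(x)
    diff = max(abs(a - b) for a, b in zip(x_new, x))
    x = x_new

print(diff, x[0])
```

Because the operator's Lipschitz constant here is at most 1/16, the iterates contract geometrically, and the printed sup-distance between the last two iterates falls far below the quadrature error of the discretization.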

By the usual computation, we clearly see that (18.18) is satisfied with φ(j) = j/4 and φ(0) = 0/4 = 0 for all j ∈ (0, ∞). Hence, all the conditions of Theorem 8 are satisfied. Therefore, there exists a unique random fixed point x*(t, ω) of the integral operator T defined by (18.17).

Acknowledgements Professor Dr. Mujahid Abbas is grateful to the Mathematics and Applied Mathematics research milieu MAM, Division of Applied Mathematics, School of Education, Culture and Communication, Mälardalen University, for support, hospitality and an excellent research environment during his visit in Autumn 2019.

References

1. Abbas, M., Nazir, T.: A new faster iteration process applied to constrained minimization and feasibility problems. Mat. Vesn. 66, 223–234 (2014)
2. Achari, J.: On a pair of random generalized non-linear contractions. Int. J. Math. Math. Sci. 6(3), 467–475 (1983)
3. Agarwal, R.P., O'Regan, D., Sahu, D.R.: Iterative construction of fixed points of nearly asymptotically nonexpansive mappings. J. Nonlinear Convex Anal. 8, 61–79 (2007)
4. Alber, Y.I., Guerre-Delabriere, S.: Principle of weakly contractive maps in Hilbert spaces. In: Gohberg, I., Lyubich, Y. (eds.) New Results in Operator Theory and its Applications, pp. 7–22. Birkhäuser Verlag, Basel, Switzerland (1997)
5. Arens, R.F.: A topology for spaces of transformations. Annals Math. 47(2), 480–495 (1946)
6. Beg, I., Abbas, M.: Equivalence and stability of random fixed point iterative procedures. J. Appl. Math. Stochast. Anal., Article ID 23297, 1–19 (2006). https://doi.org/10.1155/JAMSA/2006/23297
7. Beg, I., Abbas, M.: Random fixed point theorems for Caristi type random operators. J. Appl. Math. Comput. 25(1–2), 425–434 (2007)
8. Beg, I., Abbas, M., Azam, A.: Periodic fixed points of random operators. Ann. Math. et Informat. 37, 39–49 (2010)
9. Chugh, R., Kumar, V., Kumar, S.: Strong convergence of a new three step iterative scheme in Banach spaces. Am. J. Comput. Math. 2, 345–357 (2012)
10. Dey, D., Saha, M.: Application of random fixed point theorems in solving nonlinear stochastic integral equation of the Hammerstein type. Malaya J. Matematik 2(1), 54–59 (2013)
11. Gursoy, F., Karakaya, V.: A Picard-S hybrid type iteration method for solving a differential equation with retarded argument. arXiv:1403.2546v2 [math.FA] (2014)
12. Hanš, O.: Random operator equations. In: Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Vol. II, Part I, pp. 185–202. University of California Press, California (1961)
13. Ishikawa, S.: Fixed points by a new iteration method. Proc. Am. Math. Soc. 44, 147–150 (1974)
14. Itoh, S.: Random fixed point theorems with an application to random differential equations in Banach spaces. J. Math. Anal. Appl. 67(2), 261–273 (1979)
15. Joshi, M.C., Bose, R.K.: Some Topics in Nonlinear Functional Analysis. Wiley Eastern Limited, New Delhi (1985)
16. Karahan, I., Ozdemir, M.: A general iterative method for approximation of fixed points and their applications. Adv. Fixed Point Theory 3, 510–526 (2013)
17. Khan, A.R., Kumar, V., Narwal, S., Chugh, R.: Random iterative algorithms and almost sure stability in Banach spaces. Filomat 31(12), 3611–3626 (2017)
18. Lee, A.C.H., Padgett, W.J.: On random nonlinear contractions. Math. Syst. Theory 11, 77–84 (1977)
19. Mann, W.R.: Mean value methods in iteration. Proc. Am. Math. Soc. 4, 506–510 (1953)
20. Noor, M.A.: New approximation schemes for general variational inequalities. J. Math. Anal. Appl. 251, 217–229 (2000)
21. Okeke, G.A., Abbas, M.: Convergence and almost sure T-stability for a random iterative sequence generated by a generalized random operator. J. Inequal. Appl. 2015(146), 11 (2015)
22. Okeke, G.A., Kim, J.K.: Convergence and summable almost T-stability of the random Picard-Mann hybrid iterative process. J. Inequal. Appl. 2015(290), 14 (2015)
23. Okeke, G.A., Eke, K.S.: Convergence and almost sure T-stability for random Noor-type iterative scheme. Int. J. Pure Appl. Math. 107(1), 1–16 (2016)
24. Okeke, G.A., Kim, J.K.: Convergence and (S, T)-stability almost surely for random Jungck-type iteration processes with applications. Cogent Math. 3, 1258768, p. 15 (2016)
25. Okeke, G.A., Bishop, S.A., Akewe, H.: Random fixed point theorems in Banach spaces applied to a random nonlinear integral equation of the Hammerstein type. Fixed Point Theory Appl. 2019(15), 24 (2019)
26. Okeke, G.A.: Random fixed point theorems in certain Banach spaces. J. Nonlinear Convex Anal. 20(10), 2155–2170 (2019)
27. Padgett, W.J.: On a nonlinear stochastic integral equation of the Hammerstein type. Proc. Amer. Math. Soc. 38(3), 625–631 (1973)
28. Phuengrattana, W., Suantai, S.: On the rate of convergence of Mann, Ishikawa, Noor and SP-iterations for continuous functions on an arbitrary interval. J. Comput. Appl. Math. 235, 3006–3014 (2011)
29. Rashwan, R.A., Hammad, H.A.: Random fixed point theorems for random mappings. Asia Pac. J. Math. 3(2), 114–135 (2016)
30. Rashwan, R.A., Hammad, H.A., Okeke, G.A.: Convergence and almost sure (S, T)-stability for random iterative schemes. Int. J. Adv. Math. 2016(1), 1–16 (2016)
31. Sahu, D.R., Petrusel, A.: Strong convergence of iterative methods by strictly pseudocontractive mappings in Banach spaces. Nonlinear Anal. Theory Methods Appl. 74, 6012–6023 (2011)
32. Špaček, A.: Zufällige Gleichungen. Czech. Math. J. 5, 462–466 (1955)
33. Thakur, B.S., Thakur, D., Postolache, M.: A new iterative scheme for numerical reckoning fixed points of Suzuki's generalized nonexpansive mappings. Appl. Math. Comput. 275, 147–155 (2016)
34. Ullah, K., Arshad, M.: Numerical reckoning fixed points for Suzuki's generalized nonexpansive mappings via new iteration process. Filomat 32(1), 187–196 (2018)
35. Yosida, K.: Functional Analysis. Academic Press, New York; Springer, Berlin (1965)
36. Zhang, S.S., Wang, X.R., Liu, M.: Almost sure T-stability and convergence for random iterative algorithms. Appl. Math. Mech.-Engl. Ed. 32(6), 805–810 (2011)

Chapter 19

An Approach to the Absence of Price Bubbles Through State-Price Deflators

Salvador Cruz Rambaud

Abstract The objective of this chapter is to present some mathematical results for the absence of asset price bubbles by using, as an algebraic tool, a state-price deflator across an infinite time horizon. The methodology used in this work is the martingale analysis of financial markets, because the existence of state-price deflators is equivalent to the No-Arbitrage condition. In particular, this chapter analyzes the existing relation between the divergence of the sum of dividend-price ratios and the absence of price bubbles. The framework used in this work describes a financial market with uncertainty, a finite number of corporate securities and a countable number of trading dates.

Keywords Bubble · Martingale · No-Arbitrage · Dividend-price ratio · Corporate security

MSC 2020: 91G15, 91G80

19.1 Introduction

According to Gilles and LeRoy [14], a bubble is a payoff at infinity. Take into account that the market value of a security can be considered as the present value of all payoffs corresponding to countably many dates. Traditionally, the concept of a bubble has been studied from two perspectives: rational and speculative. From a rational point of view, a price bubble arises when the price of a security is greater than its fundamental value. According to [26], price bubbles cannot exist in equilibrium in the standard dynamic asset pricing model [31]. Some rational agents believe that the supply of money is exogenously given by a central bank, while other agents believe that it is endogenously controlled by the banking sector. In this context, the existence of bubbles is due to borrowing constraints. From a speculative point of view, a price bubble can arise when the price of an asset is based on behavioral foundations or because of the marginal valuation of future dividends [15].

S. Cruz Rambaud, Departamento de Economía y Empresa, Universidad de Almería, La Cañada de San Urbano, s/n, 04120 Almería, Spain. e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_19

The framework of this paper is a discrete-time model over an infinite time horizon where both prices and dividends are uncertain. This setting is general enough to describe short- and long-lived securities, and spot and forward markets. The basic tool used in this analysis is the martingale, which has been successfully employed in financial mathematics. In this way, the Martingale Convergence Theorem allows us to reinforce the idea of a bubble as a gap between the fundamental and the market values of an asset. According to [2, 12, 19], trading constraints may introduce asset price bubbles into the market. Malinvaud [20] states that the source of economic inefficiency can be found in an excess of capital accumulation at infinity.

The objective of this chapter is to analyze the behavior of a specific series related to the existence of price bubbles. Montrucchio [22] introduces the following short-run pricing equilibria:
\[
a_t p_t = E_t\left[a_{t+1}\left(p_{t+1} + d_{t+1}\right)\right],
\]
where p_t is the spot price vector of an asset, d_t is its dividend vector, whilst a_t denotes a state-price process. He also analyzes the relation existing between arbitrage-free prices satisfying this equation and the behavior of the series
\[
\sum_{t=0}^{\infty} \frac{|d_t|}{|p_t|},
\]
where | · | is the vector l¹ norm. Several results showed the close connection between the divergence of this series and the absence of bubbles.

In a different way, we will use the usual state-price deflators (SPD) of finance to develop some mathematical results for the absence of asset price bubbles by introducing a state-price deflator and its characteristics. It is well known that the existence of a state-price deflator is equivalent to the No-Arbitrage (NA) condition. In effect, it can be shown that, if all state prices are positive, then there are no arbitrage opportunities. Since the probabilities of the states are all positive (by definition), the absence of arbitrage opportunities implies the existence of a (strictly positive) state-price deflator. The equivalence between NA and the existence of a state-price deflator can be found in [26].

Following [16–18], the NA condition can be defined from several perspectives. One variant of NA is No Unbounded Profits with Bounded Risk (NUPBR), which is equivalent to the existence of a local martingale deflator process. Another variant of NA is No Free Lunch with Vanishing Risk (NFLVR), which is shown to be equivalent to NA and NUPBR. The first fundamental theorem demonstrates the equivalence


between NFLVR and the existence of an equivalent local martingale measure. The second fundamental theorem relates NFLVR and the completeness of the market with the uniqueness of the equivalent local martingale measure. The third fundamental theorem sets the equivalence between NFLVR and No Dominance (ND) with the existence of an equivalent probability Q such that the price S is a Q-martingale. Thus, the existence of bubbles is related to the first and third fundamental theorems. More specifically, given NFLVR, the asset price bubble β_t is defined as [16]:
\[
\beta_t := S_t - E_Q\left[S_T \mid F_t\right],
\]
where E_Q[S_T | F_t] is the asset's fundamental value [4]. For a complete revision of the relationship between the main types of NA, see [11]. Additionally, the absence of price bubbles is related to the boundedness of S_t and the hypothesis of ND.

The objective of this chapter will be to characterize the absence of price bubbles by imposing some conditions on the state-price deflator, in a discrete securities market with infinite time horizon, which seems more natural from an economics perspective. In particular, these conditions refer to the uniform integrability and the closure property of the state-price deflator.

The organization of this chapter is as follows. Section 19.2 presents the stochastic framework for the analysis of price bubbles, the development of the NA condition and the introduction of the concept of a state-price deflator, whose existence is a necessary and sufficient condition for NA. Starting from a date-0 multi-period SPD, a date-t multi-period SPD is defined, together with some properties which facilitate later calculations with SPDs. Additionally, a multi-period dividend, independent of the state-price deflator, appears, which will be of vital importance in the remaining sections. Section 19.3 states some necessary and sufficient conditions for the absence of price bubbles with respect to any state-price deflator. Moreover, this section presents some clarifying examples and the concept of a closed martingale. Section 19.4 describes the working of trading strategies with portfolios in a financial market and derives some algebraic properties of the space of self-financing strategies involving the dividend D^i_{t,s}. Finally, Sect. 19.5 summarizes and concludes.
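Before turning to the formal framework, the gap β_t can be made concrete in a deterministic toy market (all constants below are arbitrary illustrative assumptions, not taken from the text): with a constant one-period discount factor γ < 1 and constant dividend d, the fundamental value is the discounted dividend stream, and adding a term B₀γ^{−t} preserves the one-period pricing recursion p_t = γ(p_{t+1} + d) exactly, while the deflated price γ^t p_t converges to B₀ ≠ 0 — a bubble.

```python
# Deterministic bubble sketch: fundamental value vs. bubbly price.
gamma, d, B0, horizon = 0.95, 1.0, 0.1, 200

fundamental = d * gamma / (1.0 - gamma)   # sum_{s>=1} gamma**s * d
prices = [fundamental + B0 * gamma ** (-t) for t in range(horizon + 1)]

# The one-period pricing equation p_t = gamma * (p_{t+1} + d) holds exactly:
resid = max(abs(prices[t] - gamma * (prices[t + 1] + d))
            for t in range(horizon))

# The deflated price gamma**t * p_t tends to B0, not to 0 (the bubble):
deflated_tail = gamma ** horizon * prices[horizon]
print(resid, deflated_tail)
```

The same recursion admits infinitely many such bubbly solutions (one for each B₀ > 0); ruling them out is exactly what the transversality-type conditions of Sect. 19.3 accomplish.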

19.2 Securities Market Model and State-Price Deflators

Let us consider a complete probability space (Ω, F, P) with state space Ω, σ-algebra F, and probability measure P. There are a countable number of trading dates t indexed by the non-negative integers. The available information is represented by a filtration {F_t}_{t=0}^∞ such that F₀ = {∅, Ω} and F_t is a refinement of F_{t−1}, for every date t ≥ 1, i.e., F_t is a collection of partitions of the events in F_{t−1}. Moreover, F = σ(∪_{t=0}^∞ F_t). The information F_t is accessible to all market agents at precisely date t (see, for example, [8]). Let X_t denote the vector space of all F_t-measurable random variables, for every date t ≥ 0. A sequence of random variables {x_t}_{t=0}^∞ is adapted provided that x_t ∈ X_t,


for every t ≥ 0. The market consists of a finite number n of corporate securities paying dividends. Let the finite set I index the corporate securities. The ith corporate security is represented by the (ex-dividend) adapted price process {z_t^i}_{t=0}^∞ and the adapted dividend process {d_t^i}_{t=0}^∞, with d_0^i = 0. Both the price and the dividend processes for a corporate security are positive, by virtue of limited liability, and adapted to the filtration. Let z_t := [z_t^1, z_t^2, ..., z_t^n] and d_t := [d_t^1, d_t^2, ..., d_t^n], where z_t^i and d_t^i are the price and the dividend corresponding to the ith corporate security at time t, respectively. All equalities and inequalities are presumed to hold almost surely in the following development.

In this context, we are going to define an arbitrage opportunity. An arbitrage is a trading strategy which produces something for nothing. This allows us to define the absence of arbitrage opportunities as follows.

Definition 1 Let Δ_t^i := z_t^i − z_{t−1}^i denote the capital gains on one share of the ith risky security at date t ≥ 1. For any date t ≥ 1, we will say that the No-Arbitrage (NA) condition holds at date t provided that, for any portfolio {θ_t^i}_{i∈I},
\[
\sum_{i \in I} \theta_t^i \left( \Delta_t^i + d_t^i \right) \ge 0 \;\Rightarrow\; \sum_{i \in I} \theta_t^i \left( \Delta_t^i + d_t^i \right) = 0. \tag{19.1}
\]

More broadly, we will say that NA holds provided that it holds at every date t ≥ 1. In summary, NA asserts that no portfolio yields positive, nonzero earnings. As indicated in the Introduction, it can be shown that if all state prices are positive, then there are no arbitrage opportunities. This justifies the following definition.

Definition 2 A state-price deflator (SPD) is a real-valued, strictly positive martingale {λ_t}_{t=0}^∞, bounded away from zero everywhere, such that λ₀ = 1 and
\[
E_P\left[\lambda_{t+1}\left(z_{t+1}^i + d_{t+1}^i\right) \mid F_t\right] = \lambda_t z_t^i, \tag{19.2}
\]
for every corporate security i ∈ I and date 0 ≤ t < ∞. In vectorial notation,
\[
E_P\left[\lambda_{t+1}\left(z_{t+1} + d_{t+1}\right) \mid F_t\right] = \lambda_t z_t, \tag{19.3}
\]
for every date 0 ≤ t < ∞.

The process {λ_t}_{t=0}^∞ has a variety of other names: deflator, pricing kernel and stochastic discount function. State-price deflators provide a useful theoretical framework when we are working in a multicurrency setting. In such a setting there is a different risk-neutral measure Q_i for each currency i. In contrast, the measure used in pricing with state-price deflators is unaffected by the base currency. In what follows, we will consider both the finite-state case and the case of infinitely many states. However, in order to guarantee that the various conditional expectations are well-defined, we will assume that all prices and dividends are integrable.

As indicated, {λ_t}_{t=0}^∞ is a SPD, but Eq. (19.9) in [6] allows us to obtain the prices at instant 0 of a corporate security at any instant t. For this reason, we will say


that {λ_t}_{t=0}^∞ is a date-0 multi-period SPD. Our aim is to define a multi-period SPD between a fixed date t ≥ 0 and s, for every t ≤ s < ∞. This SPD will be called a date-t multi-period SPD, which will give us the prices at time t of a corporate security at any time s ≥ t. Both (date-0 and date-t) multi-period SPDs will be constructed and justified in Theorem 1. In [24] a sequence of one-period SPDs {λ_t/λ_{t−1}} is deduced from {λ_t}_{t=0}^∞. Nevertheless, we will generalize this construction.

Theorem 1 The following four conditions are equivalent:

(i) {λ_t}_{t=0}^∞ is a SPD with respect to the price process {z_t^i}_{t=0}^∞ and the dividend process {d_t^i}_{t=0}^∞.
(ii) For every t ≥ 0, {λ_{t,s} = λ_s/λ_t}_{s=t}^∞ is a SPD with respect to the price process {z_s^i}_{s=t}^∞ and the dividend process {d_{t,s}^i}_{s=t}^∞, where d_{t,t}^i = 0 and d_{t,s}^i = Σ_{h=t+1}^s (λ_h/λ_s) d_h^i, for every s > t.
(iii) For every t ≥ 0 and with respect to {λ_{t,s} = λ_s/λ_t}_{s=t}^∞, the deflated price plus cumulative deflated dividend at t,
\[
\left\{ \lambda_{t,s} z_s^i + \sum_{h=t+1}^{s} \lambda_{t,h} d_h^i \right\}_{s=t}^{\infty},
\]
is a martingale.
(iv) With respect to {λ_t}_{t=0}^∞, the deflated price plus cumulative deflated dividend at 0,
\[
\left\{ \lambda_t z_t^i + \sum_{h=1}^{t} \lambda_h d_h^i \right\}_{t=0}^{\infty},
\]
is a martingale.

Proof (i) ⇒ (ii). Let {λ_t}_{t=0}^∞ be a date-0 SPD. We are going to construct a date-t multi-period SPD by recurrence, for every t ≥ 0. By definition of a date-0 SPD, one has
\[
E_P\left[\lambda_{t+1}\left(z_{t+1}^i + d_{t+1}^i\right) \mid F_t\right] = \lambda_t z_t^i,
\]
for every date 0 ≤ t < ∞, from where
\[
E_P\left[\frac{\lambda_{t+1}}{\lambda_t}\left(z_{t+1}^i + d_{t+1}^i\right) \mid F_t\right] = z_t^i. \tag{19.4}
\]
The ratio λ_{t+1}/λ_t is a factor which allows us to convert prices at instant t + 1 into prices at moment t. It will be labelled λ_{t,t+1}. On the other hand, d_{t+1}^i is the dividend generated during the period ]t, t + 1]. In what follows, it will be denoted by d_{t,t+1}^i (see Fig. 19.1):

Fig. 19.1 Price at time t + 1 and dividend generated during ]t, t + 1]

Fig. 19.2 Price at time s and dividend generated during ]t, s]

So, with the new notation, Eq. (19.4) reads
\[
E_P\left[\lambda_{t,t+1}\left(z_{t+1}^i + d_{t,t+1}^i\right) \mid F_t\right] = z_t^i. \tag{19.5}
\]
Let us assume that we have constructed a factor λ_{t,s} = λ_s/λ_t, t < s, which allows us to convert prices at instant s into prices at moment t, and that we have generated a dividend d_{t,s}^i = Σ_{h=t+1}^s (λ_h/λ_s) d_h^i during the period ]t, s] (see Fig. 19.2), satisfying
\[
E_P\left[\lambda_{t,s}\left(z_s^i + d_{t,s}^i\right) \mid F_t\right] = z_t^i. \tag{19.6}
\]
Observe that, for s = t + 1, d_{t,s}^i = d_{t,t+1}^i. By Eq. (19.5),
\[
z_s^i = E_P\left[\lambda_{s,s+1}\left(z_{s+1}^i + d_{s+1}^i\right) \mid F_s\right], \tag{19.7}
\]
where λ_{s,s+1} = λ_{s+1}/λ_s. Therefore, substituting (19.7) into Eq. (19.6), one has
\[
E_P\left[\lambda_{t,s}\, E_P\left[\lambda_{s,s+1}\left(z_{s+1}^i + d_{s+1}^i\right) \mid F_s\right] + \lambda_{t,s}\, d_{t,s}^i \mid F_t\right] = z_t^i.
\]
As F_t ⊆ F_s, from the Law of Iterated Expectations (see [30] or [21]),
\[
E_P\left[\lambda_{t,s}\, \lambda_{s,s+1}\left(z_{s+1}^i + d_{s+1}^i\right) + \lambda_{t,s}\, d_{t,s}^i \mid F_t\right] = z_t^i, \tag{19.8}
\]
which leads to
\[
E_P\left[\frac{\lambda_{s+1}}{\lambda_t}\left(z_{s+1}^i + d_{s+1}^i + \frac{1}{\lambda_{s,s+1}}\, d_{t,s}^i\right) \mid F_t\right] = z_t^i, \tag{19.9}
\]
or
\[
E_P\left[\lambda_{t,s+1}\left(z_{s+1}^i + d_{t,s+1}^i\right) \mid F_t\right] = z_t^i, \tag{19.10}
\]
where λ_{t,s+1} = λ_{s+1}/λ_t and
\[
d_{t,s+1}^i = d_{s+1}^i + \frac{1}{\lambda_{s,s+1}}\, d_{t,s}^i = \sum_{h=t+1}^{s+1} \frac{\lambda_h}{\lambda_{s+1}}\, d_h^i.
\]
Obviously, for every t ≥ 0, {λ_{t,s} = λ_s/λ_t}_{s=t}^∞ is a martingale.

(ii) ⇒ (iii). The following sequence of equations holds:
\[
\begin{aligned}
E_P\left[\lambda_{t,s+1} z_{s+1}^i + \sum_{h=t+1}^{s+1} \lambda_{t,h} d_h^i \mid F_s\right]
&= E_P\left[\lambda_{t,s+1}\left(z_{s+1}^i + d_{s+1}^i\right) + \sum_{h=t+1}^{s} \lambda_{t,h} d_h^i \mid F_s\right] \\
&= \lambda_{t,s}\, E_P\left[\frac{\lambda_{t,s+1}}{\lambda_{t,s}}\left(z_{s+1}^i + d_{s+1}^i\right) \mid F_s\right] + E_P\left[\sum_{h=t+1}^{s} \lambda_{t,h} d_h^i \mid F_s\right] \\
&= \lambda_{t,s}\, E_P\left[\lambda_{s,s+1}\left(z_{s+1}^i + d_{s+1}^i\right) \mid F_s\right] + E_P\left[\sum_{h=t+1}^{s} \lambda_{t,h} d_h^i \mid F_s\right] \\
&= \lambda_{t,s} z_s^i + \sum_{h=t+1}^{s} \lambda_{t,h} d_h^i.
\end{aligned}
\]
Therefore, {λ_{t,s} z_s^i + Σ_{h=t+1}^s λ_{t,h} d_h^i}_{s=t}^∞ is a martingale.

(iii) ⇒ (iv). This is the particular case when t = 0.

(iv) ⇒ (i). As {λ_t z_t^i + Σ_{h=1}^t λ_h d_h^i}_{t=0}^∞ is a martingale,
\[
E_P\left[\lambda_{t+1} z_{t+1}^i + \sum_{h=1}^{t+1} \lambda_h d_h^i \mid F_t\right] = \lambda_t z_t^i + \sum_{h=1}^{t} \lambda_h d_h^i. \tag{19.11}
\]
On the other hand, by the properties of conditional expectations,
\[
E_P\left[\lambda_{t+1} z_{t+1}^i + \sum_{h=1}^{t+1} \lambda_h d_h^i \mid F_t\right] = E_P\left[\lambda_{t+1}\left(z_{t+1}^i + d_{t+1}^i\right) \mid F_t\right] + \sum_{h=1}^{t} \lambda_h d_h^i. \tag{19.12}
\]
From Eqs. (19.11) and (19.12), it can be deduced that
\[
E_P\left[\lambda_{t+1}\left(z_{t+1}^i + d_{t+1}^i\right) \mid F_t\right] = \lambda_t z_t^i,
\]


and so {λ_t}_{t=0}^∞ is a date-0 multi-period SPD. □

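Condition (iv) of Theorem 1 can be verified mechanically on a small finite-state sketch (all numerical choices below are arbitrary illustrations, not taken from the text): on a three-period binomial tree with fair coin flips, take a strictly positive martingale deflator with λ₀ = 1 and positive dividends, construct the prices backwards from the defining identity (19.2) with zero terminal price, and then check node by node that the deflated price plus cumulative deflated dividend is a martingale.

```python
from itertools import product

T = 3
paths = list(product((0, 1), repeat=T))       # each path has probability 2**-T

def deflator(path, t):                        # lambda_0 = 1, strictly positive
    lam = 1.0                                 # steps average to 1 => martingale
    for step in path[:t]:
        lam *= 0.90 if step else 1.10
    return lam

def dividend(path, t):                        # d_0 = 0, positive afterwards
    return 0.0 if t == 0 else 0.1 + 0.05 * path[t - 1]

def pad(prefix):                              # extend a node to a full path
    return prefix + (0,) * (T - len(prefix))

def price(prefix):
    # Backward induction from z_T = 0 using the SPD identity
    # z_t = E[lambda_{t+1} (z_{t+1} + d_{t+1}) | F_t] / lambda_t.
    t = len(prefix)
    if t == T:
        return 0.0
    ev = 0.5 * sum(deflator(pad(prefix + (b,)), t + 1)
                   * (price(prefix + (b,)) + dividend(pad(prefix + (b,)), t + 1))
                   for b in (0, 1))
    return ev / deflator(pad(prefix), t)

def M(prefix):
    # Deflated price plus cumulative deflated dividend at the node `prefix`.
    t = len(prefix)
    full = pad(prefix)
    return (deflator(full, t) * price(prefix)
            + sum(deflator(full, h) * dividend(full, h) for h in range(t + 1)))

worst = 0.0                                   # martingale check at every node
for t in range(T):
    for prefix in set(p[:t] for p in paths):
        cond = 0.5 * sum(M(prefix + (b,)) for b in (0, 1))
        worst = max(worst, abs(cond - M(prefix)))
print(worst)
```

The check succeeds up to floating-point rounding because, by construction, E[λ_{t+1}(z_{t+1} + d_{t+1}) | F_t] = λ_t z_t at every node, which is exactly the cancellation used in the (ii) ⇒ (iii) step above.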
Proposition 1 If {λ_t}_{t=0}^∞ is a date-0 multi-period SPD with respect to the price process {z_t^i}_{t=0}^∞ and the dividend process {d_t^i}_{t=0}^∞, then {{λ_{t,s} = λ_s/λ_t}_{s=t}^∞}_{t=0}^∞ is a family of martingales satisfying
\[
E_P\left[\lambda_{t,s}\left( E_P\left[\lambda_{s,r}\left(z_r^i + d_{s,r}^i\right) \mid F_s\right] + d_{t,s}^i \right) \mid F_t\right] = E_P\left[\lambda_{t,r}\left(z_r^i + d_{t,r}^i\right) \mid F_t\right] = z_t^i, \tag{19.13}
\]
where d_{t,t}^i = 0, d_{t,s}^i = Σ_{h=t+1}^s (λ_h/λ_s) d_h^i and t < s < r.

Proof It is evident by taking into account that
\[
\begin{aligned}
E_P&\left[\lambda_{t,s}\left( E_P\left[\lambda_{s,r}\left(z_r^i + d_{s,r}^i\right) \mid F_s\right] + d_{t,s}^i \right) \mid F_t\right] \\
&= E_P\left[\frac{\lambda_s}{\lambda_t}\, \frac{\lambda_r}{\lambda_s}\left(z_r^i + \sum_{h=s+1}^{r} \frac{\lambda_h}{\lambda_r}\, d_h^i\right) + \frac{\lambda_s}{\lambda_t} \sum_{h=t+1}^{s} \frac{\lambda_h}{\lambda_s}\, d_h^i \mid F_t\right] \\
&= E_P\left[\lambda_{t,r}\left(z_r^i + d_{t,r}^i\right) \mid F_t\right] = z_t^i,
\end{aligned}
\]
this last equality being due to condition (ii) of Theorem 1. Obviously, for every t ≥ 0, {λ_{t,s} = λ_s/λ_t}_{s=t}^∞ is a martingale. □

Equation (19.13) will be called the additivity of the state-price deflator {λ_t}_{t=0}^∞. Obviously, the disadvantage of the multi-period dividend defined above is that its expression depends on a specific SPD, namely {λ_t}_{t=0}^∞. Thus, we will try to define a new multi-period dividend whose expression does not depend on any SPD. In effect, starting from Eq. (19.3) and assuming, as usual, that the processes {z_t^i}_{t=0}^∞ and {d_t^i}_{t=0}^∞ are positive, it can be deduced that
\[
E_P\left[\lambda_s\, \frac{z_{s-1}^i + d_{s-1}^i}{z_{s-1}^i}\left(z_s^i + d_s^i\right) \mid F_{s-1}\right] = \lambda_{s-1}\left(z_{s-1}^i + d_{s-1}^i\right). \tag{19.14}
\]
On the other hand, taking mathematical expectations in both members of Eq. (19.14), one has
\[
E_P\left[ E_P\left[\lambda_s\, \frac{z_{s-1}^i + d_{s-1}^i}{z_{s-1}^i}\left(z_s^i + d_s^i\right) \mid F_{s-1}\right] \mid F_{s-2}\right] = E_P\left[\lambda_{s-1}\left(z_{s-1}^i + d_{s-1}^i\right) \mid F_{s-2}\right].
\]
As F_{s−2} ⊆ F_{s−1}, from the Law of Iterated Expectations and the definition of a SPD, one has
\[
E_P\left[\frac{\lambda_s}{\lambda_{s-2}}\, \frac{z_{s-1}^i + d_{s-1}^i}{z_{s-1}^i}\left(z_s^i + d_s^i\right) \mid F_{s-2}\right] = z_{s-2}^i. \tag{19.15}
\]
We can continue this process for a finite number of steps, leading to:

\[
E_P\left[\frac{\lambda_s}{\lambda_t}\, \frac{z_{s-1}^i + d_{s-1}^i}{z_{s-1}^i}\, \frac{z_{s-2}^i + d_{s-2}^i}{z_{s-2}^i} \cdots \frac{z_{t+1}^i + d_{t+1}^i}{z_{t+1}^i}\left(z_s^i + d_s^i\right) \mid F_t\right] = z_t^i,
\]
or
\[
E_P\left[\frac{\lambda_s}{\lambda_t} \prod_{k=t+1}^{s-1} \frac{z_k^i + d_k^i}{z_k^i}\left(z_s^i + d_s^i\right) \mid F_t\right] = z_t^i, \tag{19.16}
\]
that is to say,
\[
E_P\left[\frac{\lambda_s}{\lambda_t} \prod_{k=t+1}^{s-1} \left(1 + \frac{d_k^i}{z_k^i}\right)\left(z_s^i + d_s^i\right) \mid F_t\right] = z_t^i.
\]
Recall that we want to obtain a new multi-period dividend, denoted by D_{t,s}^i, such that
\[
E_P\left[\frac{\lambda_s}{\lambda_t}\left(z_s^i + D_{t,s}^i\right) \mid F_t\right] = z_t^i.
\]
So, we can take
\[
z_s^i + D_{t,s}^i = \prod_{k=t+1}^{s-1} \left(1 + \frac{d_k^i}{z_k^i}\right)\left(z_s^i + d_s^i\right),
\]
from where
\[
D_{t,s}^i = z_s^i \left[ \prod_{k=t+1}^{s} \left(1 + \frac{d_k^i}{z_k^i}\right) - 1 \right]. \tag{19.17}
\]
The following proposition shows the property satisfied by the new multi-period dividend (19.17).

Proposition 2 If {λ_t}_{t=0}^∞ is a date-0 multi-period SPD with respect to the price process {z_t^i}_{t=0}^∞ and the dividend process {d_t^i}_{t=0}^∞, then {{λ_{t,s} = λ_s/λ_t}_{s=t}^∞}_{t=0}^∞ is a family of martingales satisfying the condition of additivity of discount functions with respect to the family of multi-period dividends D_{t,s}^i, that is to say,
\[
E_P\left[\lambda_{t,s}\left( E_P\left[\lambda_{s,r}\left(z_r^i + D_{s,r}^i\right) \mid F_s\right] + D_{t,s}^i \right) \mid F_t\right] = E_P\left[\lambda_{t,r}\left(z_r^i + D_{t,r}^i\right) \mid F_t\right] = z_t^i, \tag{19.18}
\]
for every 0 ≤ t ≤ s ≤ r < ∞.

Equation (19.16) allows the graphic representation shown in Fig. 19.3. In Sect. 19.3, we will see that the former product is intimately related to the sum of dividends divided by prices of an asset. Thus, this sum being infinite, together with another condition, is sufficient to derive the absence of price bubbles for that asset.


Fig. 19.3 A representation of Eq. 19.16
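The step from the product form to (19.17) is pure algebra and can be sanity-checked on any positive path of prices and dividends (the numbers below are arbitrary illustrations):

```python
# Check on an arbitrary positive path that
#   prod_{k=t+1}^{s-1} (1 + d_k/z_k) * (z_s + d_s)
# equals z_s + D_{t,s} with D_{t,s} = z_s * (prod_{k=t+1}^{s} (1 + d_k/z_k) - 1).
z = [2.0, 1.7, 1.9, 1.4, 1.6, 1.3]      # prices z_0 .. z_5 (illustrative)
d = [0.0, 0.2, 0.1, 0.3, 0.15, 0.25]    # dividends d_0 .. d_5 (illustrative)

def prod(factors):
    out = 1.0
    for f in factors:
        out *= f
    return out

t, s = 1, 5
lhs = prod(1 + d[k] / z[k] for k in range(t + 1, s)) * (z[s] + d[s])
D_ts = z[s] * (prod(1 + d[k] / z[k] for k in range(t + 1, s + 1)) - 1)
print(abs(lhs - (z[s] + D_ts)))
```

The identity holds because (z_s + d_s) = z_s(1 + d_s/z_s) absorbs the last factor of the product.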

19.3 About the Existence of Price Bubbles

Let us start this section with the following definition.

Definition 3 The price process {z_t^i}_{t=0}^∞ and the dividend process {d_t^i}_{t=0}^∞ unambiguously (resp. ambiguously) involve no bubble if lim_{t→∞} E_P(λ_t z_t^i) = 0, for every (resp. for a specific) state-price deflator {λ_t}_{t=0}^∞.

In order to guarantee the absence of price bubbles, [22] requires that the random sequence {λ_t z_t^i}_{t=0}^∞ is equicontinuous, i.e., for every ε > 0, there is some η = η(ε) > 0 such that E_P(1_A λ_t z_t^i) ≤ ε, for every t and every A ∈ F with P(A) ≤ η. However, we have to point out the following remarks:

• The definition of an equicontinuous random sequence is not well known in the existing literature.
• This is a very immediate condition from which to deduce the absence of price bubbles.
• Montrucchio [22] recognizes that the so-defined equicontinuity property is not so simple to check.

The following definition provides sufficient conditions (Theorem 2 and Corollary 1) for the absence of price bubbles [9, 10].

Definition 4 A random sequence {X_t}_{t=0}^∞ is locally bounded if there exist ε₀ > 0 and a positive number M such that |X_t| ≤ M on every A with P(A) < ε₀ and for every t.

Theorem 2 Let {λ_t}_{t=0}^∞ be a state-price deflator. For each fixed asset i, if Σ_{t=1}^∞ d_t^i/z_t^i = +∞ almost surely and {λ_t z_t^i}_{t=0}^∞ is locally bounded, then the price process {z_t^i}_{t=0}^∞ and the dividend process {d_t^i}_{t=0}^∞ involve no bubble with respect to {λ_t}_{t=0}^∞ (ambiguously).

Corollary 1 Let {λ_t}_{t=0}^∞ be a state-price deflator. For each fixed asset i, if Σ_{t=1}^∞ d_t^i/z_t^i = +∞ almost surely and {λ_t z_t^i}_{t=0}^∞ is decreasing, then the price process {z_t^i}_{t=0}^∞ and the dividend process {d_t^i}_{t=0}^∞ involve no bubble with respect to {λ_t}_{t=0}^∞ (ambiguously).

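A minimal deterministic illustration of the hypotheses of Theorem 2 (the processes below are assumed for illustration only): with λ_t = 1, z_t = 2^{−t} and d_t = 2^{−t}, the pricing identity z_t = z_{t+1} + d_{t+1} holds exactly, the dividend-price ratios d_t/z_t = 1 sum to +∞, and the deflated price λ_t z_t = 2^{−t} is bounded and tends to zero — no bubble, as the theorem predicts.

```python
# Deterministic check of the Theorem 2 hypotheses for
# lambda_t = 1, z_t = 2**-t, d_t = 2**-t.
def z(t): return 2.0 ** (-t)
def d(t): return 2.0 ** (-t)

# SPD identity z_t = z_{t+1} + d_{t+1} holds exactly (no expectation needed,
# and powers of two are exact in binary floating point):
resid = max(abs(z(t) - (z(t + 1) + d(t + 1))) for t in range(60))

partial = sum(d(t) / z(t) for t in range(1, 1001))  # grows like t
deflated = z(1000)                                  # lambda_t * z_t -> 0
print(resid, partial, deflated)
```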

Example 1 The state space Ω consists of the natural numbers, the σ-algebra is given by F = 2^Ω, and the probability measure P on (Ω, F) is induced by the geometric distribution with parameter 1/2. Thus, P({ω}) = 2^{−ω}, for every ω ∈ Ω. The filtration {F_t}_{t=0}^∞ is given by F₀ = {∅, Ω} and F_t = σ({1}, {2}, ..., {t}), for every 1 ≤ t < ∞. Thus, the random variable x_t : Ω → ℝ is F_t-measurable if, and only if, x_t is constant on the event {t + 1, t + 2, ...}. The market consists of one corporate security whose dividend process {d_t}_{t=0}^∞ is given by d₀ = 0 and
\[
d_t(\omega) = \begin{cases} 2^{-t+1} - 2^{-t}, & \text{if } 1 \le \omega \le t, \\ 3^{-t}, & \text{if } t < \omega < \infty, \end{cases}
\]
for every date 1 ≤ t < ∞. On the other hand, the price process {z_t}_{t=0}^∞ is given by z₀ = 2 and
\[
z_t(\omega) = \begin{cases} 2^{-t}, & \text{if } 1 \le \omega \le t, \\ \frac{2}{3}\, 2^{-t} + \frac{3}{5}\, 3^{-t}, & \text{if } t < \omega < \infty, \end{cases}
\]
for every date 1 ≤ t < ∞. To arrive at both expressions, we have proposed
\[
d_t(\omega) = \begin{cases} 2^{-t+1} - 2^{-t}, & \text{if } 1 \le \omega \le t, \\ 3^{-t}, & \text{if } t < \omega < \infty, \end{cases}
\qquad
z_t(\omega) = \begin{cases} 2^{-t}, & \text{if } 1 \le \omega \le t, \\ h(t), & \text{if } t < \omega < \infty. \end{cases}
\]
To deduce the expression of h(t), we use the SPD condition:
\[
\left[2^{-t-1} + \left(2^{-t} - 2^{-t-1}\right)\right] 2^{-1} + \left[h(t+1) + 3^{-t}\right] 2^{-1} = h(t),
\]
that is,
\[
2^{-t}\, 2^{-1} + \left[h(t+1) + 3^{-t}\right] 2^{-1} = h(t).
\]
We divide by 2^{−1}:
\[
2^{-t} + h(t+1) + 3^{-t} = 2 h(t), \qquad \text{i.e.} \qquad -h(t+1) + 2 h(t) = 2^{-t} + 3^{-t},
\]
which is a finite-difference equation, whose solution is h(t) = (2/3) 2^{−t} + (3/5) 3^{−t}. Moreover, if λ_t = 1, for every date 1 ≤ t < ∞, one has:

• {λ_t}_{t=0}^∞ is a state-price deflator with respect to the price process {z_t^i}_{t=0}^∞ and the dividend process {d_t^i}_{t=0}^∞.
• On the other hand, one has
\[
\frac{d_t(\omega)}{z_t(\omega)} = \begin{cases} \dfrac{2^{-t+1} - 2^{-t}}{2^{-t}} = 2 - 1 = 1, & \text{if } 1 \le \omega \le t, \\[2ex] \dfrac{3^{-t}}{\frac{2}{3}\, 2^{-t} + \frac{3}{5}\, 3^{-t}} \to 0, & \text{if } t < \omega < \infty, \end{cases}
\]
because
\[
\lim_{t\to\infty} \frac{\frac{2}{3}\, 2^{-t} + \frac{3}{5}\, 3^{-t}}{3^{-t}} = \lim_{t\to\infty} \left[ \frac{2}{3} \left(\frac{3}{2}\right)^t + \frac{3}{5} \right] = \infty + \frac{3}{5} = \infty.
\]
• E_P(λ_t z_t) = E_P(z_t) = 2^{−t}(1 − 2^{−t}) + [(2/3) 2^{−t} + (3/5) 3^{−t}] 2^{−t} → 0.

Obviously,
\[
\sum_{t=1}^{\infty} \frac{d_t}{z_t} = \infty
\]
almost surely. Moreover, {λ_t z_t}_{t=0}^∞ is locally bounded.

Definition 5 ([23]) The limit
\[
\lim_{t\to\infty} \sum_{k=0}^{t} \frac{d_k^i}{z_k^i} = +\infty
\]
holds uniformly if, for every scalar M, there is some t_M such that
\[
P\left( \sum_{k=0}^{t_M} \frac{d_k^i}{z_k^i} \ge M \right) = 1.
\]

Clearly, this condition is stronger than almost sure divergence of the series, so we can state the following

Corollary 2 Let $\{\lambda_t\}_{t=0}^{\infty}$ be a state-price deflator. For each fixed asset $i$, if $\sum_{t=1}^{\infty} \frac{d_t^i}{z_t^i} = +\infty$ uniformly, then the price process $\{z_t^i\}_{t=0}^{\infty}$ and the dividend process $\{d_t^i\}_{t=0}^{\infty}$ involve no bubble with respect to $\{\lambda_t\}_{t=0}^{\infty}$.

Proof It is obvious because uniform divergence implies almost sure divergence of the series. □

Next, we look for other sufficient conditions guaranteeing the absence of price bubbles; before that, we introduce the following definition.

Definition 6 ([13]) Suppose $\{X_t\}_{t=0}^{\infty}$ is a positive supermartingale. Then $\{X_t\}_{t=0}^{\infty}$ is said to be a potential if $\lim_{t\to\infty} E_P(X_t) = 0$.

The following lemma is inspired by the Riesz Decomposition Theorem and is a version of the Krickeberg Decomposition Theorem [25].

Lemma 1 Let $\{X_t\}_{t=0}^{\infty}$ be a positive submartingale, convergent in $L^1(\Omega, \mathcal{F}, P)$ to an a.s. positive $X_\infty$. Then $\{X_t\}_{t=0}^{\infty}$ can be decomposed as the difference of a positive martingale $\{Y_t\}_{t=0}^{\infty}$ with a positive limit and a potential $\{Z_t\}_{t=0}^{\infty}$.

Proof For every $t \in \mathbb{N}$ and $p > 0$, define $X_{t,p} := E_P(X_{t+p} \mid F_t)$.


• Because $\{X_t\}_{t=0}^{\infty}$ is a submartingale, $E_P(X_{t+p} \mid F_t) \ge X_t$ a.s., so $X_{t,p} \ge X_t$ a.s.
• Let us see the monotonicity of $X_{t,p}$ with respect to $p$:
$$X_{t,p+1} := E_P(X_{t+p+1} \mid F_t) = E_P[E_P(X_{t+p+1} \mid F_{t+p}) \mid F_t] \ge E_P(X_{t+p} \mid F_t) =: X_{t,p} \ \text{a.s.},$$
by the properties of conditional expectations and the submartingale condition. Therefore, $\{X_{t,p}\}_{p=0}^{\infty}$ is, almost surely, increasing.
• Define $Y_t := \lim_{p\to\infty} X_{t,p}$, which exists by hypothesis. So, $Y_t \ge X_t$, whereby
$$\lim_{t\to\infty} Y_t \ge \lim_{t\to\infty} X_t > 0, \qquad\text{so } \lim_{t\to\infty} Y_t > 0.$$
• For every $m \ge 0$:
$$E_P(Y_{t+m} \mid F_t) := E_P\left(\lim_{p\to\infty} X_{t+m,p} \mid F_t\right) = \lim_{p\to\infty} E_P(X_{t+m,p} \mid F_t)$$
(by Lebesgue's Monotone Convergence Theorem: $\{X_{t,p}\}_{p=0}^{\infty}$ is, almost surely, increasing and $X_{t,p} \ge X_t$)
$$= \lim_{p\to\infty} E_P[E_P(X_{t+m+p} \mid F_{t+m}) \mid F_t] = \lim_{p\to\infty} E_P(X_{t+m+p} \mid F_t) = \lim_{p\to\infty} X_{t,m+p} =: Y_t \ \text{a.s.}$$
Therefore, $\{Y_t\}_{t=0}^{\infty}$ is a martingale.
• Define $Z_t := Y_t - X_t$. Clearly, $Z_t(\omega) \ge 0$ a.s., and so $\{Z_t\}_{t=0}^{\infty}$ is a non-negative supermartingale:
$$E_P(Z_{t+m} \mid F_t) = E_P(Y_{t+m} \mid F_t) - E_P(X_{t+m} \mid F_t) \le Y_t - X_t = Z_t,$$


so $E_P(Z_{t+m} \mid F_t) \le Z_t$.
• From the definition of $\{Y_t\}_{t=0}^{\infty}$:
$$\lim_{p\to\infty} E_P(Z_{t+p} \mid F_t) = \lim_{p\to\infty} E_P(Y_{t+p} \mid F_t) - \lim_{p\to\infty} E_P(X_{t+p} \mid F_t) = Y_t - \lim_{p\to\infty} X_{t,p} = Y_t - Y_t = 0 \ \text{a.s.},$$
for every $t \in \mathbb{N}$. In particular, when $t = 0$,
$$\lim_{p\to\infty} E_P(Z_p \mid F_0) = \lim_{p\to\infty} E_P(Z_p) = 0.$$
By Lebesgue's Monotone Convergence Theorem [1, 5], as $E_P(Z_{t+p}) \le E_P(Z_t)$, then
$$\lim_{p\to\infty} E_P(Z_{t+p}) = 0,$$
so $\{Z_t\}_{t=0}^{\infty}$ is a potential. □

The following theorem has been inspired by [3].

Theorem 3 Let $\{\lambda_t\}_{t=0}^{\infty}$ be a state-price deflator such that, for every $p > 1$,
$$E_P(\lambda_{t+p} z_{t+p}^i \mid F_t) \le \lambda_{t+p-1} z_{t+p-1}^i. \tag{19.19}$$
In these conditions, for each fixed asset $i$, if $\sum_{t=1}^{\infty} \frac{d_t^i}{z_t^i} = +\infty$ almost surely, then the price process $\{z_t^i\}_{t=0}^{\infty}$ and the dividend process $\{d_t^i\}_{t=0}^{\infty}$ involve no bubble with respect to $\{\lambda_t\}_{t=0}^{\infty}$ (ambiguously).

Proof Obviously, $X_t := \sum_{k=1}^{t} \lambda_k d_k^i$ is a submartingale, because $E_P(\lambda_{t+1} d_{t+1}^i \mid F_t) \ge 0$ implies $E_P\left(\sum_{k=1}^{t+1} \lambda_k d_k^i \mid F_t\right) \ge \sum_{k=1}^{t} \lambda_k d_k^i$. So, we can apply Lemma 1 and construct a martingale and a potential. The steps to construct these random sequences are:


• Step #1:
$$X_{t,p} := E_P(X_{t+p} \mid F_t) := E_P\!\left(\sum_{k=1}^{t+p} \lambda_k d_k^i \,\Big|\, F_t\right) = E_P\!\left(\lambda_{t+p} z_{t+p}^i + \sum_{k=1}^{t+p} \lambda_k d_k^i \,\Big|\, F_t\right) - E_P\!\left(\lambda_{t+p} z_{t+p}^i \mid F_t\right)$$
(by (8) in [7])
$$= \lambda_t z_t^i + \sum_{k=1}^{t} \lambda_k d_k^i - E_P\!\left(\lambda_{t+p} z_{t+p}^i \mid F_t\right) \ge \lambda_t z_t^i + \sum_{k=1}^{t} \lambda_k d_k^i - \lambda_{t+p-1} z_{t+p-1}^i,$$
by hypothesis.
• Step #2:
$$Y_t := \lim_{p\to\infty} X_{t,p} \ge \lambda_t z_t^i + \sum_{k=1}^{t} \lambda_k d_k^i - \lim_{p\to\infty} \lambda_{t+p-1} z_{t+p-1}^i = \lambda_t z_t^i + \sum_{k=1}^{t} \lambda_k d_k^i,$$
because, by Theorem 1 (see its proof in [9]), $\lambda_{t+p-1} z_{t+p-1}^i \to 0$.
• Step #3:
$$Z_t := Y_t - X_t \ge \lambda_t z_t^i + \sum_{k=1}^{t} \lambda_k d_k^i - \sum_{k=1}^{t} \lambda_k d_k^i = \lambda_t z_t^i.$$
So, applying Lemma 1, $\{Z_t\}_{t=0}^{\infty}$ is a potential and, thus, $\lim_{t\to\infty} E_P(\lambda_t z_t^i) \le \lim_{t\to\infty} E_P(Z_t) = 0$. Therefore,
$$\lim_{t\to\infty} E_P(\lambda_t z_t^i) = 0$$
and there is no price bubble. □

The converse of Theorem 2 is Theorem 3 in [9].

Theorem 4 Let $\{\lambda_t\}_{t=0}^{\infty}$ be a state-price deflator verifying that $P(A_\lambda) = 1$.¹ For each fixed asset $i$, if the price process $\{z_t^i\}_{t=0}^{\infty}$ and the dividend process $\{d_t^i\}_{t=0}^{\infty}$ involve no bubble with respect to $\{\lambda_t\}_{t=0}^{\infty}$, then $\sum_{t=1}^{\infty} \frac{d_t^i}{z_t^i} = +\infty$.

Example 2 Observe that, in the case of Example 1, the hypothesis of Theorem 3 holds. In effect,

¹ $A_\lambda$ is the maximal set where $\zeta^i := \lambda_t z_t^i \prod_{k=0}^{t} \left(1 + \frac{d_k^i}{z_k^i}\right) > 0$.


$$\lambda_t(\omega) z_t(\omega) \prod_{k=0}^{t}\left(1 + \frac{d_k(\omega)}{z_k(\omega)}\right) = \begin{cases} 2^{-t}\prod_{k=1}^{t}(1+1) = 2^{-t}\cdot 2^{t} = 1, & \text{if } 1 \le \omega \le t, \\[2mm] \left(\frac{2}{3}2^{-t} + \frac{3}{5}3^{-t}\right)\prod_{k=1}^{t}\left(1 + \dfrac{3^{-k}}{\frac{2}{3}2^{-k} + \frac{3}{5}3^{-k}}\right), & \text{otherwise.} \end{cases}$$
Obviously,
$$\lambda_\infty(\omega) z_\infty(\omega) \prod_{k=0}^{\infty}\left(1 + \frac{d_k(\omega)}{z_k(\omega)}\right) = 1.$$
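The positivity condition $\zeta^i > 0$ from the footnote of Theorem 4 can be checked for Example 1 in exact rational arithmetic. The sketch below (plain Python; the functions `d`, `z` and `zeta` are our own transcriptions of the example's definitions, with $\lambda_t = 1$) verifies that $\zeta_t(\omega)$ stays strictly positive and stabilizes once $t \ge \omega$:

```python
from fractions import Fraction as F

def d(t, w):
    # dividend d_t(ω) from Example 1 (t ≥ 1)
    return F(2)**(-t + 1) - F(2)**(-t) if w <= t else F(3)**(-t)

def z(t, w):
    # price z_t(ω) from Example 1 (t ≥ 1)
    return F(2)**(-t) if w <= t else F(2, 3)*F(2)**(-t) + F(3, 5)*F(3)**(-t)

def zeta(t, w):
    # ζ_t(ω) = λ_t z_t(ω) ∏_{k=1}^t (1 + d_k(ω)/z_k(ω)), with λ_t = 1
    prod = F(1)
    for k in range(1, t + 1):
        prod *= 1 + d(k, w) / z(k, w)
    return z(t, w) * prod

for w in range(1, 6):
    vals = [zeta(t, w) for t in range(w, w + 10)]
    assert all(v > 0 for v in vals)          # ζ stays strictly positive
    assert all(v == vals[0] for v in vals)   # and is constant once t ≥ ω
print("ζ_t(ω) > 0 and stabilizes for every sampled ω")
```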

A last result about the possible existence of price bubbles is provided by the following theorem [9].

Theorem 5 Let $\{\lambda_t\}_{t=0}^{\infty}$ be a state-price deflator. Assume that, for each fixed asset $i$, the price process $\{z_t^i\}_{t=0}^{\infty}$ and the dividend process $\{d_t^i\}_{t=0}^{\infty}$ verify the condition $z_\infty + d_\infty > 0$. Then there is no price bubble with respect to $\{\lambda_t\}_{t=0}^{\infty}$ if, and only if, $\lambda_\infty = 0$ almost surely.

The following examples have been modified from [27–29]. In spite of the fact that the following sequences are not stochastic, they will be useful for our purpose.

Example 3 The firm never pays any dividend:
• $d_t^i = 0$.
• $z_t^i = 2$.
• $\lambda_t z_t^i = (\frac{1}{2})^{t-1}$.
• $\lambda_t = (\frac{1}{2})^{t}$.

In this case,
$$\sum_{t=1}^{\infty} \frac{d_t^i}{z_t^i} = 0, \quad z_\infty^i = 2 \quad\text{and}\quad \lambda_\infty = 0,$$
i.e., $\{\lambda_t\}_{t=0}^{\infty}$ is not regular and so we can apply neither Theorem 3 nor Theorem 4.

Example 4 The firm pays a positive but declining dividend:
• $d_t^i = (\frac{1}{2})(\frac{1}{4})^{t}$.
• $z_t^i = \prod_{\tau=0}^{t-1}\left[1 - (\frac{1}{2})^{\tau+1}\right]$.
• $\lambda_t z_t^i = (\frac{1}{2})^{t}$.
• $\lambda_t = \prod_{\tau=0}^{t-1} \frac{1}{2 - (\frac{1}{2})^{\tau}}$.

In this case,

$0$ a.s. if, and only if, $\lambda_t > 0$ a.s., for every $t$, and this last condition is true by the definition of a SPD. □

19.4 Vector Spaces Associated to Marketed Strategies

In this section, our aim is to define the vector space $\Theta$ associated to the set of the possible trading strategies within this framework. More specifically, we will try to characterize the NA condition through the space of self-financing strategies.

A portfolio $(\{\theta_t^i\}_{i\in I}, \xi_t)$ specifies the number of shares $\theta_t^i$ of the $i$th corporate security, for every $i \in I$, and the number $\xi_t$ of government bills held by an agent from date $t-1$ to date $t$ (see Fig. 19.4). We always assume that each $\theta_0^i$ and $\xi_0$ are constants, and that each $\theta_t^i$ and $\xi_t$ are $\mathcal{F}_{t-1}$-measurable random variables, for every $t \ge 1$. Notice that $\theta_t^i$ and $\xi_t$ are magnitudes corresponding to period $t$: $[t-1, t]$. Thus, within period $t$ we have decided to purchase some proportions of the commodities, $\theta_t^i$, and the government bill, $\xi_t$. On the other hand, at instant $t$, we receive the value of these commodities and of their dividends, $z_t^i$ and $d_t^i$, respectively. Subsequently, within period $t+1$, the new proportions of purchase are $\theta_{t+1}^i$ and $\xi_{t+1}$. Notice that $z_t^i$ is a magnitude corresponding to instant $t$, whilst $d_t^i$ corresponds to period $t$ (see Fig. 19.5):


Fig. 19.4 Composition of a portfolio during the period t

Fig. 19.5 A short-term double transaction at instant t

Fig. 19.6 A long-term double transaction at instant t

$$\theta_{t+1}^i \cdot z_t^i + \xi_{t+1} = \theta_t^i (z_t^i + d_t^i) + \xi_t.$$
Generalizing, let $\theta_t^i$ denote the number of shares purchased by an agent from date $t-1$ until date $t$ at price $z_{t-1}^i$. Notice that these shares do not have to be sold at date $t$; they may be sold at a later date $s$ (see Fig. 19.6). Nevertheless, in the following paragraphs, we will not consider the government bill.

Definition 8 A trading strategy is the selection of a feasible portfolio $(\{\theta_t^i\}_{i\in I}, \xi_t)$ at every date $t > 0$.

Given a pair price-dividend, $\{z_t^i\}_{t=0}^{\infty}$ and $\{d_t^i\}_{t=0}^{\infty}$, for $n$ commodities:
• Let $X_t$ denote the vector space of all $\mathcal{F}_t$-measurable random variables, for every date $t$.
• $X := \prod_{t=0}^{\infty} X_t$.
• Let $\Theta_t$ denote the vector space of trading strategies within period $t$: $]t-1, t]$.
• $\Theta := \{\Theta_t\}_{t=0}^{\infty}$.
• Let $L$ denote the vector space of the adapted processes.
• Let $M = \{\delta^\theta : \theta \in \Theta\}$ denote the marketed subspace of the dividend processes generated by the marketed strategies. In effect, for every $\theta$ and $\phi$ in $\Theta$ and scalars $a$ and $b$, we have


$$a\delta^{\theta} + b\delta^{\phi} = \delta^{a\theta + b\phi}.$$
It is verified that $M$ is a linear subspace of $L$ which, in turn, is a linear subspace of $X$.

Definition 9 The trading strategy $\{\theta_t^i\}_{t=0}^{\infty}$ is self-financing provided that
$$\theta_{t+1}^i z_t^i = \theta_t^i (z_t^i + d_t^i),$$
for every $t \ge 0$.

Theorem 6 If the NA condition holds, then the space of self-financing strategies in $\Theta$ is the direct limit of $\{\Theta_t\}_{t=0}^{\infty}$.

Proof Let us consider the set of non-negative integers $\mathbb{Z}_+ = \{0, 1, 2, \ldots\}$, which is a partially ordered set by the relation …

… for $\sigma > 0$, $\rho \in \mathbb{C}$ and $\gamma > 0$, cf. [12, Eq. (2.2.13)],
$$\int_0^1 t^{\rho-1}(1-t)^{\sigma-1} E_{\beta,\alpha}(x t^{\gamma})\,dt = \Gamma(\sigma)\,{}_2\Psi_2\!\left[\begin{matrix}(\rho,\gamma),\,(1,1)\\(\alpha,\beta),\,(\sigma+\rho,\gamma)\end{matrix}\,\Big|\,x\right], \tag{20.8}$$

20 Form Factors for Stars Generalized Grey Brownian Motion

where ${}_2\Psi_2$ is the Fox-Wright function (also called generalized Wright function [7], [2, Appendix F, Eq. (F.2.14)] and [10]), given for $x, a_i, c_i \in \mathbb{C}$ and $b_i, d_i \in \mathbb{R}$ by
$${}_2\Psi_2\!\left[\begin{matrix}(a_1,b_1),\,(a_2,b_2)\\(c_1,d_1),\,(c_2,d_2)\end{matrix}\,\Big|\,x\right] = \sum_{n=0}^{\infty}\frac{\Gamma(a_1+b_1 n)\,\Gamma(a_2+b_2 n)}{\Gamma(c_1+d_1 n)\,\Gamma(c_2+d_2 n)}\,\frac{x^n}{n!}.$$

In particular, when $\rho = \alpha$ and $\gamma = \beta$, Eq. (20.8) yields
$$\int_0^1 t^{\alpha-1}(1-t)^{\sigma-1} E_{\beta,\alpha}(x t^{\beta})\,dt = \Gamma(\sigma) E_{\beta,\alpha+\sigma}(x). \tag{20.9}$$
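Identity (20.9) can be spot-checked numerically. The sketch below (plain Python; the function names `ml` and `lhs` are ours) compares a midpoint quadrature of the left-hand side with the series value of $\Gamma(\sigma) E_{\beta,\alpha+\sigma}(x)$ for one admissible parameter choice:

```python
import math

def ml(beta, alpha, x, terms=60):
    # Two-parameter Mittag-Leffler function E_{beta,alpha}(x) via its power series
    return sum(x**n / math.gamma(beta*n + alpha) for n in range(terms))

def lhs(beta, alpha, sigma, x, steps=20000):
    # Midpoint quadrature of ∫_0^1 t^(alpha-1) (1-t)^(sigma-1) E_{beta,alpha}(x t^beta) dt
    coeffs = [x**n / math.gamma(beta*n + alpha) for n in range(60)]
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        e = sum(c * t**(beta*n) for n, c in enumerate(coeffs))
        total += t**(alpha - 1) * (1 - t)**(sigma - 1) * e
    return total / steps

beta, alpha, sigma, x = 0.7, 1.2, 1.5, -0.8
left = lhs(beta, alpha, sigma, x)
right = math.gamma(sigma) * ml(beta, alpha + sigma, x)
assert abs(left - right) < 1e-4, (left, right)
print(left, right)
```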

Both integrals (20.8) and (20.9) will be used in Sect. 20.3 below. The M-Wright function with two variables $M_{\beta}^{1}$ of order $\beta$ (1 dimension in space) is defined by
$$M_{\beta}^{1}(x,t) := M_{\beta}(x,t) := \frac{1}{2}\,t^{-\beta} M_{\beta}(|x|\,t^{-\beta}), \quad 0 < \beta < 1,\ x \in \mathbb{R},\ t \in \mathbb{R}_+, \tag{20.10}$$
which is a probability density in $x$ evolving in time $t$ with self-similarity exponent $\beta$. The following integral representation for the M-Wright function is valid, see [11]:
$$M_{\beta/2}(x,t) = 2\int_0^{\infty} \frac{e^{-\frac{x^2}{4\tau}}}{\sqrt{4\pi\tau}}\,t^{-\beta} M_{\beta}(\tau t^{-\beta})\,d\tau, \quad 0 < \beta \le 1,\ x \in \mathbb{R}. \tag{20.11}$$

This representation is valid in a more general form, see [11, Eq. (6.3)], but for our purpose it is sufficient in view of its generalization for $x \in \mathbb{R}^d$. In fact, Eq. (20.11) may be extended to a general spatial dimension $d$ by the extension of the Gaussian function, namely
$$M_{\beta/2}^{d}(x,t) := 2\int_0^{\infty} \frac{e^{-\frac{1}{4\tau}|x|^2}}{(4\pi\tau)^{d/2}}\,t^{-\beta} M_{\beta}(\tau t^{-\beta})\,d\tau, \quad x \in \mathbb{R}^d,\ t \ge 0,\ 0 < \beta \le 1. \tag{20.12}$$
The function $M_{\beta/2}^{d}$ is nothing but the density of the fundamental solution of a time-fractional diffusion equation, see [13]. The Mittag-Leffler measures $\mu_\beta$, $0 < \beta \le 1$, are a family of probability measures on $S_d$ whose characteristic functions are given by the Mittag-Leffler functions. On $S_d$ we choose the Borel $\sigma$-algebra $\mathcal{B}$ generated by the cylinder sets, that is
$$\mathcal{F}C_b^{\infty}(S_d) := \left\{ f(l_1, \ldots, l_n) \mid n \in \mathbb{N},\ f \in C_b^{\infty}(\mathbb{R}^n),\ l_1, \ldots, l_n \in S_d \right\},$$
where $C_b^{\infty}(\mathbb{R}^n)$ is the space of bounded, infinitely often differentiable functions on $\mathbb{R}^n$ whose partial derivatives are also bounded. Using the Bochner-Minlos theorem, see [1] or [6], the following definition makes sense.

Definition 20.2 (cf. [3, Def. 2.5]) For any $\beta \in (0, 1]$ the Mittag-Leffler measure is defined as the unique probability measure $\mu_\beta$ on $S_d$ by fixing its characteristic functional

J. L. da Silva et al.

$$\int_{S_d} e^{i\langle w, \varphi\rangle_0}\,d\mu_\beta(w) = E_\beta\left(-\frac{1}{2}|\varphi|_0^2\right), \quad \varphi \in S_d. \tag{20.13}$$

Remark 20.1
1. The measure $\mu_\beta$ is also called grey noise (reference) measure, cf. [4] and [3].
2. The range $0 < \beta \le 1$ ensures the complete monotonicity of $E_\beta(-x)$, see Pollard [18], that is, $(-1)^n E_\beta^{(n)}(-x) \ge 0$ for all $x \ge 0$ and $n \in \mathbb{N}_0 := \{0, 1, 2, \ldots\}$. In other words, this is sufficient to show that
$$S_d \ni \varphi \mapsto E_\beta\left(-\frac{1}{2}|\varphi|_0^2\right) \in \mathbb{R}$$
is the characteristic function of a measure on $S_d$.

We consider the Hilbert space of complex square integrable measurable functions defined on $S_d$, $L^2(\mu_\beta) := L^2(S_d, \mathcal{B}, \mu_\beta)$, with scalar product defined by
$$((F, G))_{L^2(\mu_\beta)} := \int_{S_d} \overline{F(w)}\,G(w)\,d\mu_\beta(w), \quad F, G \in L^2(\mu_\beta).$$
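The complete monotonicity in Remark 20.1(2) can be illustrated numerically. The sketch below (plain Python; the helper name is ours) differentiates the series $E_\beta(-x) = \sum_k (-x)^k/\Gamma(\beta k + 1)$ term by term and checks the sign pattern $(-1)^n E_\beta^{(n)}(-x) \ge 0$ on a small grid:

```python
import math

def ml_neg_deriv(beta, x, n, terms=150):
    # n-th derivative in x of E_beta(-x) = sum_k (-x)^k / Gamma(beta*k + 1),
    # obtained by term-by-term differentiation of the power series.
    total = 0.0
    for k in range(n, terms):
        falling = 1.0
        for j in range(n):
            falling *= (k - j)            # k (k-1) ... (k-n+1)
        total += (-1)**k * falling * x**(k - n) / math.gamma(beta*k + 1)
    return total

# complete monotonicity: (-1)^n d^n/dx^n E_beta(-x) >= 0 for 0 < beta <= 1
for beta in (0.5, 0.75, 1.0):
    for n in range(4):
        for x in (0.0, 0.5, 1.0, 2.0):
            assert (-1)**n * ml_neg_deriv(beta, x, n) >= -1e-10
print("(-1)^n E_beta^{(n)}(-x) >= 0 on the sampled grid")
```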

The corresponding norm is denoted by $\|\cdot\|_{L^2(\mu_\beta)}$. It follows from (20.13) that all moments of $\mu_\beta$ exist and we have

Lemma 20.1 For any $\varphi \in S_d$ and $n \in \mathbb{N}$ we have
$$\int_{S_d} \langle w, \varphi\rangle_0^{2n+1}\,d\mu_\beta(w) = 0, \qquad \int_{S_d} \langle w, \varphi\rangle_0^{2n}\,d\mu_\beta(w) = \frac{(2n)!}{2^n\,\Gamma(\beta n + 1)}\,|\varphi|_0^{2n}.$$
In particular, $\|\langle \cdot, \varphi\rangle\|_{L^2}^2 = \frac{1}{\Gamma(\beta+1)}|\varphi|_0^2$ and, by polarization, for any $\varphi, \psi \in S_d$ we obtain
$$\int_{S_d} \langle w, \varphi\rangle_0\,\langle w, \psi\rangle_0\,d\mu_\beta(w) = \frac{1}{\Gamma(\beta+1)}\langle \varphi, \psi\rangle_0.$$

20.2.2 Generalized Grey Brownian Motion

For any test function $\varphi \in S_d$ we define the random variable
$$X^{\beta}(\varphi): S_d \longrightarrow \mathbb{R}^d, \quad w \mapsto X^{\beta}(\varphi, w) := \left(\langle w_1, \varphi_1\rangle, \ldots, \langle w_d, \varphi_d\rangle\right).$$


The random variable $X^{\beta}(\varphi)$ has the following properties, which are a consequence of Lemma 20.1 and the characteristic function of $\mu_\beta$ given in (20.13).

Proposition 20.2 Let $\varphi, \psi \in S_d$, $k \in \mathbb{R}^d$ be given. Then
1. The characteristic function of $X^{\beta}(\varphi)$ is given by
$$E\left(e^{i(k, X^{\beta}(\varphi))}\right) = E_\beta\left(-\frac{1}{2}\sum_{j=1}^{d} k_j^2\,|\varphi_j|_{L^2}^2\right). \tag{20.14}$$
2. The characteristic function of the random variable $X^{\beta}(\varphi) - X^{\beta}(\psi)$ is
$$E\left(e^{i(k, X^{\beta}(\varphi) - X^{\beta}(\psi))}\right) = E_\beta\left(-\frac{1}{2}\sum_{j=1}^{d} k_j^2\,|\varphi_j - \psi_j|_{L^2}^2\right). \tag{20.15}$$
3. The moments of $X^{\beta}(\varphi)$ are given by
$$\int_{S_d} \left|X^{\beta}(\varphi, w)\right|^{2n+1}\,d\mu_\beta(w) = 0, \qquad \int_{S_d} \left|X^{\beta}(\varphi, w)\right|^{2n}\,d\mu_\beta(w) = \frac{(2n)!}{2^n\,\Gamma(\beta n + 1)}\,|\varphi|_0^{2n}. \tag{20.16}$$

The property (20.16) of $X^{\beta}(\varphi)$ gives the possibility to extend the definition of $X^{\beta}$ to any element in $L_d^2$. In fact, if $f \in L_d^2$, then there exists a sequence $(\varphi_k)_{k=1}^{\infty} \subset S_d$ such that $\varphi_k \to f$, $k \to \infty$, in the norm of $L_d^2$. Hence, the sequence $\left(X^{\beta}(\varphi_k)\right)_{k=1}^{\infty} \subset L^2(\mu_\beta)$ forms a Cauchy sequence which converges to an element denoted by $X^{\beta}(f) \in L^2(\mu_\beta)$. So, defining $1\!\!1_{[0,t)} \in L_d^2$, $t \ge 0$, by $1\!\!1_{[0,t)} := (1\!\!1_{[0,t)} \otimes e_1, \ldots, 1\!\!1_{[0,t)} \otimes e_d)$, we may consider the process $X^{\beta}(1\!\!1_{[0,t)}) \in L^2(\mu_\beta)$, such that the following definition makes sense.

Definition 20.3 For any $0 < \alpha < 2$ we define the process
$$S_d \ni w \mapsto B^{\beta,\alpha}(t, w) := \left(\langle w, (M_-^{\alpha/2} 1\!\!1_{[0,t)}) \otimes e_1\rangle, \ldots, \langle w, (M_-^{\alpha/2} 1\!\!1_{[0,t)}) \otimes e_d\rangle\right) = \left(\langle w_1, M_-^{\alpha/2} 1\!\!1_{[0,t)}\rangle, \ldots, \langle w_d, M_-^{\alpha/2} 1\!\!1_{[0,t)}\rangle\right), \quad t > 0, \tag{20.17}$$
as an element in $L^2(\mu_\beta)$, and call this process $d$-dimensional generalized grey Brownian motion (ggBm for short). Its characteristic function has the form


$$E\left(e^{i(k, B^{\beta,\alpha}(t))}\right) = E_\beta\left(-\frac{|k|^2}{2}\,t^{\alpha}\right), \quad k \in \mathbb{R}^d. \tag{20.18}$$

Remark 20.2
1. The $d$-dimensional ggBm $B^{\beta,\alpha}$ exists as an $L^2(\mu_\beta)$-limit and hence the map $S_d \ni \omega \mapsto \langle \omega, M_-^{\alpha/2} 1\!\!1_{[0,t)}\rangle$ yields a version of ggBm, $\mu_\beta$-a.s., but not in the pathwise sense.
2. For any fixed $0 < \alpha < 2$ one can show, by the Kolmogorov-Centsov continuity theorem, that the paths of the process are $\mu_\beta$-a.s. continuous, cf. [3, Prop. 3.8].
3. Below we mainly deal with expectations of functions of ggBm, therefore the version of ggBm defined above is sufficient.

Proposition 20.3
1. For any $0 < \alpha < 2$, the process $B^{\beta,\alpha} := \{B^{\beta,\alpha}(t), t \ge 0\}$ is $\frac{\alpha}{2}$-self-similar with stationary increments.
2. The finite dimensional probability density functions are given, for any $0 \le t_1 < t_2 < \ldots < t_n < \infty$, by
$$\rho_n^{\beta,\alpha}(x, Q) = \frac{1}{(2\pi)^{\frac{dn}{2}}\sqrt{\det(Q)}}\int_0^{\infty} \frac{1}{\tau^{\frac{dn}{2}}}\,e^{-\frac{\|x\|_Q^2}{4\tau}}\,M_\beta(\tau)\,d\tau, \quad x \in \mathbb{R}^{dn},$$
where $Q = (a_{ij})$ is the covariance matrix given by
$$a_{ij} = E\left((B^{\beta,\alpha}(t_i), B^{\beta,\alpha}(t_j))\right) = \frac{d}{2\Gamma(\beta+1)}\left(t_i^{\alpha} + t_j^{\alpha} - |t_i - t_j|^{\alpha}\right)$$
and $\|x\|_Q^2 := (x, Q^{-1}x)_{\mathbb{R}^{dn}}$.

Proof The proof can be found in [17]. □
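The covariance matrix of Proposition 20.3 is straightforward to code. The sketch below (plain Python; the function name is ours) builds $Q = (a_{ij})$ from the formula above and verifies that rescaling all times by $c$ multiplies every entry by $c^{\alpha}$, consistent with $\frac{\alpha}{2}$-self-similarity:

```python
import math

def ggbm_cov(times, alpha, beta, d=1):
    # Covariance matrix a_ij of ggBm from Proposition 20.3:
    # a_ij = d / (2 Γ(β+1)) * (t_i^α + t_j^α - |t_i - t_j|^α)
    c = d / (2 * math.gamma(beta + 1))
    return [[c * (s**alpha + t**alpha - abs(s - t)**alpha) for t in times]
            for s in times]

times = [0.5, 1.0, 2.0, 3.5]
alpha, beta = 1.2, 0.7
Q = ggbm_cov(times, alpha, beta)

# α/2-self-similarity: scaling time by c scales the covariance by c**alpha
c = 2.0
Qs = ggbm_cov([c * t for t in times], alpha, beta)
for i in range(len(times)):
    for j in range(len(times)):
        assert abs(Qs[i][j] - c**alpha * Q[i][j]) < 1e-12
print("covariance scales like c**alpha under t -> c t")
```

Note that for $\alpha = \beta = 1$ (and $d = 1$) this reduces to the Brownian covariance $\min(t_i, t_j)$.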

Remark 20.3 The family $\{B^{\beta,\alpha}(t),\ t \ge 0,\ \beta \in (0,1],\ \alpha \in (0,2)\}$ forms a class of $\frac{\alpha}{2}$-self-similar processes with stationary increments ($\frac{\alpha}{2}$-sssi) which includes:
1. For $\beta = \alpha = 1$, the process $\{B^{1,1}(t), t \ge 0\}$ is a standard $d$-dimensional Bm.
2. For $\beta = 1$ and $0 < \alpha < 2$, $\{B^{1,\alpha}(t), t \ge 0\}$ is a $d$-dimensional fBm with Hurst parameter $\frac{\alpha}{2}$.
3. For $\alpha = 1$, $\{B^{\beta,1}(t), t \ge 0\}$ is a $\frac{1}{2}$-self-similar non-Gaussian process with
$$E\left(e^{i(k, B^{\beta,1}(t))}\right) = E_\beta\left(-\frac{|k|^2}{2}\,t\right), \quad k \in \mathbb{R}^d. \tag{20.19}$$
4. For $0 < \alpha = \beta < 1$, the process $\{B^{\beta}(t) := B^{\beta,\beta}(t), t \ge 0\}$ is $\frac{\beta}{2}$-self-similar and is called $d$-dimensional grey Brownian motion (gBm for short). Its characteristic function is given by
$$E\left(e^{i(k, B^{\beta}(t))}\right) = E_\beta\left(-\frac{|k|^2}{2}\,t^{\beta}\right), \quad k \in \mathbb{R}^d. \tag{20.20}$$


For $d = 1$, this process was introduced by W. Schneider in [20, 21].
5. For other choices of the parameters $\beta$ and $\alpha$ we obtain, in general, non-Gaussian processes.

20.3 Form Factors for Different Classes of Star Generalized Grey Brownian Motion

In this section we investigate the form factors of ggBm and of the two particular classes introduced in Sect. 20.2.2, cf. Remark 20.3. Let $B_s^{\beta,\alpha} := B^{\beta,\alpha}$ be a $d$-dimensional ggBm indexed by an $n_A$-harms star. In particular, $B^{\beta,\alpha}$ is $\frac{\alpha}{2}$-self-similar and, if $B^{\beta,\alpha}(t)$ and $B^{\beta,\alpha}(s)$ are on the same harm, then
$$E\left(e^{i(k, B^{\beta,\alpha}(t) - B^{\beta,\alpha}(s))}\right) = E_\beta\left(-\frac{|k|^2}{2}|t-s|^{\alpha}\right),$$
while if $B^{\beta,\alpha}(t)$ and $B^{\beta,\alpha}(s)$ are located on different harms, we have
$$E\left(e^{i(k, B^{\beta,\alpha}(t) - B^{\beta,\alpha}(s))}\right) = E_\beta\left(-\frac{|k|^2}{2}|t+s|^{\alpha}\right).$$
The form factor for $B^{\beta,\alpha}$ indexed by the star with $n_A$ harms is computed as
$$S_s^{\beta,\alpha}(k) := \frac{1}{N^2}\sum_{n=1}^{n_A}\int_0^N\!\!\int_0^N E\left(e^{i(k, B^{\beta,\alpha}(t) - B^{\beta,\alpha}(s))}\right)\,dt\,ds + \frac{1}{N^2}\sum_{n=1}^{n_A}\sum_{\substack{l=1\\ l\neq n}}^{n_A}\int_0^N\!\!\int_0^N E\left(e^{i(k, B^{\beta,\alpha}(t) - B^{\beta,\alpha}(s))}\right)\,ds\,dt$$
$$= n_A S^{\beta,\alpha}(k) + \frac{n_A(n_A-1)}{2}\int_0^1\!\!\int_0^1 E_\beta\left(-\frac{N^{\alpha}|k|^2}{2}|t+s|^{\alpha}\right)\,ds\,dt = n_A S^{\beta,\alpha}(k) + n_A(n_A-1)\underbrace{\int_0^1\!\!\int_0^t E_\beta\left(-\frac{N^{\alpha}|k|^2}{2}|t+s|^{\alpha}\right)\,ds\,dt}_{(A)},$$
where in the second-to-last equality we have used the $\frac{\alpha}{2}$-self-similarity of $B^{\beta,\alpha}$ and denoted by $S^{\beta,\alpha}$ the form factor for ggBm, see [22], given by
$$n_A S^{\beta,\alpha}(k) = 2n_A\sum_{n=0}^{\infty}\frac{(-y^2)^n}{\Gamma(\beta n+1)(\alpha n+1)(\alpha n+2)}, \qquad y^2 := \frac{N^{\alpha}|k|^2}{2}.$$


Using the change of variable $\tau = t + s$, the integral (A) is equal to
$$\int_0^1\!\!\int_0^{\tau/2} E_\beta(-y^2\tau^{\alpha})\,ds\,d\tau + \int_1^2\!\!\int_{\tau-1}^{\tau/2} E_\beta(-y^2\tau^{\alpha})\,ds\,d\tau = \underbrace{\frac{1}{2}\int_0^1 \tau\,E_\beta(-y^2\tau^{\alpha})\,d\tau}_{(A_1)} + \underbrace{\int_1^2\left(-\frac{\tau}{2}+1\right)E_\beta(-y^2\tau^{\alpha})\,d\tau}_{(A_2)}.$$
The integral $(A_1)$ is equal to
$$(A_1) = \frac{1}{2}\int_0^1 \tau\,E_\beta(-y^2\tau^{\alpha})\,d\tau = \frac{1}{2}\sum_{n=0}^{\infty}\frac{(-y^2)^n}{\Gamma(\beta n+1)(\alpha n+2)}$$
and the integral $(A_2)$ gives
$$(A_2) = \int_1^2\left(-\frac{\tau}{2}+1\right)E_\beta(-y^2\tau^{\alpha})\,d\tau = -\frac{1}{2}\int_1^2 \tau\,E_\beta(-y^2\tau^{\alpha})\,d\tau + \int_1^2 E_\beta(-y^2\tau^{\alpha})\,d\tau$$
$$= -\frac{1}{2}\sum_{n=0}^{\infty}\frac{(-y^2)^n}{\Gamma(\beta n+1)}\int_1^2 \tau^{\alpha n+1}\,d\tau + \sum_{n=0}^{\infty}\frac{(-y^2)^n}{\Gamma(\beta n+1)}\int_1^2 \tau^{\alpha n}\,d\tau$$
$$= -\frac{1}{2}\sum_{n=0}^{\infty}\frac{(-y^2)^n\left(2^{\alpha n+2}-1\right)}{\Gamma(\beta n+1)(\alpha n+2)} + \sum_{n=0}^{\infty}\frac{(-y^2)^n\left(2^{\alpha n+1}-1\right)}{\Gamma(\beta n+1)(\alpha n+1)}.$$
Putting together and simplifying, the integral (A) yields
$$(A) = \sum_{n=0}^{\infty}\frac{(-y^2)^n\left(2^{\alpha n+1}-1\right)}{\Gamma(\beta n+1)(\alpha n+1)(\alpha n+2)}.$$

We finally obtain the form factor $S_s^{\beta,\alpha}$ explicitly as
$$S_s^{\beta,\alpha}(k) = 2n_A\sum_{n=0}^{\infty}\frac{(-y^2)^n}{\Gamma(\beta n+1)(\alpha n+1)(\alpha n+2)} + n_A(n_A-1)\sum_{n=0}^{\infty}\frac{(-y^2)^n\left(2^{\alpha n+1}-1\right)}{\Gamma(\beta n+1)(\alpha n+1)(\alpha n+2)}$$
$$= n_A\sum_{n=0}^{\infty}\frac{(-y^2)^n\left[2+(n_A-1)(2^{\alpha n+1}-1)\right]}{\Gamma(\beta n+1)(\alpha n+1)(\alpha n+2)},$$


which can also be represented as
$$S_s^{\beta,\alpha}(k) = n_A(3-n_A)\,{}_2\Psi_2\!\left[\begin{matrix}(1,1),\,(1,\alpha)\\(1,\beta),\,(3,\alpha)\end{matrix}\,\Big|\,-y^2\right] + 2n_A(n_A-1)\,{}_2\Psi_2\!\left[\begin{matrix}(1,1),\,(1,\alpha)\\(1,\beta),\,(3,\alpha)\end{matrix}\,\Big|\,-2^{\alpha}y^2\right].$$
For fixed $\beta, \alpha$, the form factor $S_s^{\beta,\alpha}$ depends only on $y^2$ via the so-called Debye function, that is,
$$f_D(y, \beta, \alpha, n_A) = n_A\sum_{n=0}^{\infty}\frac{(-y^2)^n\left[2+(n_A-1)(2^{\alpha n+1}-1)\right]}{\Gamma(\beta n+1)(\alpha n+1)(\alpha n+2)}.$$
The Debye function $f_D$ may be written in terms of the Fox-Wright function ${}_2\Psi_2$ as
$$f_D(y, \beta, \alpha, n_A) = n_A(3-n_A)\,{}_2\Psi_2\!\left[\begin{matrix}(1,1),\,(1,\alpha)\\(1,\beta),\,(3,\alpha)\end{matrix}\,\Big|\,-y^2\right] + 2n_A(n_A-1)\,{}_2\Psi_2\!\left[\begin{matrix}(1,1),\,(1,\alpha)\\(1,\beta),\,(3,\alpha)\end{matrix}\,\Big|\,-2^{\alpha}y^2\right].$$
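Both representations of the Debye function are easy to evaluate numerically. The sketch below (plain Python; function names are ours) checks that the explicit series and its ${}_2\Psi_2$ form agree, using that ${}_2\Psi_2[(1,1),(1,\alpha);(1,\beta),(3,\alpha)\,|\,x] = \sum_n x^n/(\Gamma(\beta n+1)(\alpha n+1)(\alpha n+2))$:

```python
import math

def psi22(x, alpha, beta, terms=60):
    # 2Ψ2[(1,1),(1,α); (1,β),(3,α) | x] = Σ x^n / (Γ(βn+1)(αn+1)(αn+2))
    return sum(x**n / (math.gamma(beta*n + 1) * (alpha*n + 1) * (alpha*n + 2))
               for n in range(terms))

def f_debye(y, beta, alpha, nA, terms=60):
    # Debye function of the star-ggBm form factor (explicit series above)
    return nA * sum(
        (-y**2)**n * (2 + (nA - 1)*(2**(alpha*n + 1) - 1))
        / (math.gamma(beta*n + 1) * (alpha*n + 1) * (alpha*n + 2))
        for n in range(terms)
    )

y, beta, alpha, nA = 0.8, 0.7, 1.2, 5
series = f_debye(y, beta, alpha, nA)
foxwright = (nA*(3 - nA)*psi22(-y**2, alpha, beta)
             + 2*nA*(nA - 1)*psi22(-2**alpha * y**2, alpha, beta))
assert abs(series - foxwright) < 1e-10
print(series)
```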

20.4 Form Factors for Star Fractional Brownian Motion

Let $B^H$ be a $d$-dimensional fBm indexed by an $n_A$-harms star. In particular, $B^H$ is an $H$-self-similar process and, if $B^H(t)$ and $B^H(s)$, $t \neq s$, are on the same harm, then
$$E\left(e^{i(k, B^H(t) - B^H(s))}\right) = \exp\left(-\frac{|k|^2}{2}|t-s|^{2H}\right),$$
while if $B^H(t)$ and $B^H(s)$ are located on different harms, we have
$$E\left(e^{i(k, B^H(t) - B^H(s))}\right) = \exp\left(-\frac{|k|^2}{2}|t+s|^{2H}\right).$$
The form factor for $B^H$ indexed by the star with $n_A$ harms is computed as

$$S_{\mathrm{fBm}}^s(k) := \frac{1}{N^2}\sum_{n=1}^{n_A}\int_0^N\!\!\int_0^N E\left(e^{i(k, B^H(t) - B^H(s))}\right)\,dt\,ds + \frac{1}{N^2}\sum_{n=1}^{n_A}\sum_{\substack{l=1\\ l\neq n}}^{n_A}\int_0^N\!\!\int_0^N E\left(e^{i(k, B^H(t) - B^H(s))}\right)\,ds\,dt$$
$$= n_A S^{\mathrm{fBm}}(k) + \frac{n_A(n_A-1)}{2}\int_0^1\!\!\int_0^1 e^{-\frac{N^{2H}|k|^2}{2}|t+s|^{2H}}\,ds\,dt = n_A S^{\mathrm{fBm}}(k) + n_A(n_A-1)\underbrace{\int_0^1\!\!\int_0^t e^{-\frac{N^{2H}|k|^2}{2}|t+s|^{2H}}\,ds\,dt}_{(A)},$$
where in the second-to-last equality we have used the $H$-self-similarity of $B^H$ and denoted
$$S^{\mathrm{fBm}}(k) = \frac{1}{H(y^2)^{1/H}}\left[(y^2)^{1/(2H)}\,\gamma\!\left(\frac{1}{2H}, y^2\right) - \gamma\!\left(\frac{1}{H}, y^2\right)\right], \qquad y^2 := \frac{N^{2H}|k|^2}{2}.$$
Using the change of variable $\tau = t + s$, the integral (A) is equal to

$$\int_0^1\!\!\int_0^{\tau/2} e^{-y^2\tau^{2H}}\,ds\,d\tau + \int_1^2\!\!\int_{\tau-1}^{\tau/2} e^{-y^2\tau^{2H}}\,ds\,d\tau = \underbrace{\frac{1}{2}\int_0^1 \tau\,e^{-y^2\tau^{2H}}\,d\tau}_{(A_1)} + \underbrace{\int_1^2\left(-\frac{\tau}{2}+1\right)e^{-y^2\tau^{2H}}\,d\tau}_{(A_2)}.$$

The integral $(A_1)$ is equal to
$$\frac{1}{2}\int_0^1 \tau\,e^{-y^2\tau^{2H}}\,d\tau = \frac{1}{4H(y^2)^{1/H}}\,\gamma\!\left(\frac{1}{H}, y^2\right)$$
and the integral $(A_2)$ gives
$$\int_1^2\left(-\frac{\tau}{2}+1\right)e^{-y^2\tau^{2H}}\,d\tau = -\frac{1}{4H(y^2)^{1/H}}\left[\gamma\!\left(\frac{1}{H}, 2^{2H}y^2\right) - \gamma\!\left(\frac{1}{H}, y^2\right)\right] + \frac{1}{2H(y^2)^{1/(2H)}}\left[\gamma\!\left(\frac{1}{2H}, 2^{2H}y^2\right) - \gamma\!\left(\frac{1}{2H}, y^2\right)\right].$$

Putting together and simplifying, the integral (A) yields
$$(A) = \frac{1}{2H(y^2)^{1/H}}\left[\gamma\!\left(\frac{1}{H}, y^2\right) - \frac{1}{2}\gamma\!\left(\frac{1}{H}, 2^{2H}y^2\right)\right] + \frac{1}{2H(y^2)^{1/(2H)}}\left[\gamma\!\left(\frac{1}{2H}, 2^{2H}y^2\right) - \gamma\!\left(\frac{1}{2H}, y^2\right)\right].$$
We finally obtain the form factor $S_{\mathrm{fBm}}^s$ explicitly as
$$S_{\mathrm{fBm}}^s(k) = \frac{n_A}{H(y^2)^{1/H}}\left[(y^2)^{1/(2H)}\,\gamma\!\left(\frac{1}{2H}, y^2\right) - \gamma\!\left(\frac{1}{H}, y^2\right)\right] + \frac{n_A(n_A-1)}{2H(y^2)^{1/H}}\left[\gamma\!\left(\frac{1}{H}, y^2\right) - \frac{1}{2}\gamma\!\left(\frac{1}{H}, 2^{2H}y^2\right)\right] + \frac{n_A(n_A-1)}{2H(y^2)^{1/(2H)}}\left[\gamma\!\left(\frac{1}{2H}, 2^{2H}y^2\right) - \gamma\!\left(\frac{1}{2H}, y^2\right)\right].$$

For fixed $H$, the form factor depends only on $y^2$ via the so-called Debye function, that is,
$$f_D^{\mathrm{sfBm}}(y, H, n_A) = \frac{n_A}{H(y^2)^{1/H}}\left[(y^2)^{1/(2H)}\,\gamma\!\left(\frac{1}{2H}, y^2\right) - \gamma\!\left(\frac{1}{H}, y^2\right)\right] + \frac{n_A(n_A-1)}{2H(y^2)^{1/H}}\left[\gamma\!\left(\frac{1}{H}, y^2\right) - \frac{1}{2}\gamma\!\left(\frac{1}{H}, 2^{2H}y^2\right)\right] + \frac{n_A(n_A-1)}{2H(y^2)^{1/(2H)}}\left[\gamma\!\left(\frac{1}{2H}, 2^{2H}y^2\right) - \gamma\!\left(\frac{1}{2H}, y^2\right)\right].$$
In Fig. 20.1 we show the plots of the Debye function $f_D^{\mathrm{sfBm}}$ with $n_A = 5$ and $n_A = 10$ for different values of the parameter $H$. The radius of gyration for the star fBm is computed as
$$(R_g^{\mathrm{sfBm}})^2 = \frac{1}{2}\frac{n_A}{N^2}\int_0^N\!\!\int_0^N E\left(|B^H(t) - B^H(s)|^2\right)\,dt\,ds + \frac{1}{4}\frac{n_A(n_A-1)}{N^2}\int_0^N\!\!\int_0^N E\left(|B^H(t) - B^H(s)|^2\right)\,dt\,ds,$$
which may be obtained by expanding the form factor to lowest order. It is simple to compute and gives
$$(R_g^{\mathrm{sfBm}})^2 = \frac{n_A N^{2H}}{2(H+1)(2H+1)}\left[1 + \frac{n_A-1}{2}\left(2^{2H+1}-1\right)\right].$$
On the other hand, the mean square end-to-end length is
$$(R_e^{\mathrm{sfBm}})^2 = E\left(|B^H(N)|^2\right) = n_A N^{2H},$$


Fig. 20.1 The Debye function $f_D^{\mathrm{sfBm}}$. Top: linear scale; bottom: log-log scale

from which follows the relation
$$\frac{(R_e^{\mathrm{sfBm}})^2}{2(H+1)(2H+1)}\left[1 + \frac{n_A-1}{2}\left(2^{2H+1}-1\right)\right] = (R_g^{\mathrm{sfBm}})^2.$$
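Since $E_1(-x) = e^{-x}$, the star-fBm Debye function above must coincide with the star-ggBm Debye series of Sect. 20.3 at $\beta = 1$, $\alpha = 2H$. The sketch below (plain Python; helper names are ours, and the lower incomplete gamma is computed from its standard power series) checks this numerically:

```python
import math

def lower_gamma(s, x, terms=300):
    # Lower incomplete gamma γ(s, x) via the power series
    # γ(s, x) = x^s e^{-x} Σ_{n≥0} x^n / (s (s+1) ... (s+n))
    total, term = 0.0, 1.0 / s
    for n in range(terms):
        total += term
        term *= x / (s + n + 1)
    return x**s * math.exp(-x) * total

def f_debye_sfbm(y, H, nA):
    # Star-fBm Debye function in the incomplete-gamma closed form above
    y2 = y * y
    a = nA / (H * y2**(1/H)) * (
        y2**(1/(2*H)) * lower_gamma(1/(2*H), y2) - lower_gamma(1/H, y2))
    b = nA*(nA - 1) / (2*H * y2**(1/H)) * (
        lower_gamma(1/H, y2) - 0.5 * lower_gamma(1/H, 2**(2*H) * y2))
    c = nA*(nA - 1) / (2*H * y2**(1/(2*H))) * (
        lower_gamma(1/(2*H), 2**(2*H) * y2) - lower_gamma(1/(2*H), y2))
    return a + b + c

def f_debye_series(y, beta, alpha, nA, terms=80):
    # Star-ggBm Debye series from Sect. 20.3
    return nA * sum(
        (-y*y)**n * (2 + (nA - 1)*(2**(alpha*n + 1) - 1))
        / (math.gamma(beta*n + 1) * (alpha*n + 1) * (alpha*n + 2))
        for n in range(terms))

H, nA, y = 0.6, 5, 0.7
assert abs(f_debye_sfbm(y, H, nA) - f_debye_series(y, 1.0, 2*H, nA)) < 1e-8
print(f_debye_sfbm(y, H, nA))
```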

20.5 Conclusion

In this paper we have computed explicit analytic expressions for the form factors and the corresponding Debye functions for a class of star non-Gaussian processes, namely generalized grey Brownian motion. They are given in terms of the Fox-Wright function ${}_2\Psi_2$. We also emphasize the important special case of star fractional Brownian motion. These stars may serve as models for the conformations of star polymers in solvents.

Acknowledgements This work has been partially supported by the Center for Research in Mathematics and Applications (CIMA), related with the Statistics, Stochastic Processes and Applications (SSPA) group, through the grant UIDB/MAT/04674/2020 of FCT-Fundação para a Ciência e a Tecnologia, Portugal.


References
1. Berezansky, Y.M., Kondratiev, Y.G.: Spectral Methods in Infinite-Dimensional Analysis, vol. 1. Kluwer Academic Publishers, Dordrecht (1995)
2. Gorenflo, R., Kilbas, A.A., Mainardi, F., Rogosin, S.V.: Mittag-Leffler Functions, Related Topics and Applications. Springer (2014)
3. Grothaus, M., Jahnert, F.: Mittag-Leffler analysis II: application to the fractional heat equation. J. Funct. Anal. 270(7), 2732–2768 (2016). https://doi.org/10.1016/j.jfa.2016.01.018
4. Grothaus, M., Jahnert, F., Riemann, F., Silva, J.L.: Mittag-Leffler analysis I: construction and characterization. J. Funct. Anal. 268(7), 1876–1903 (2015). https://doi.org/10.1016/j.jfa.2014.12.007
5. Hammouda, B.: Probing nanoscale structures - the SANS toolbox (2016). http://www.ncnr.nist.gov/staff/hammouda/
6. Hida, T., Kuo, H.H., Potthoff, J., Streit, L.: White Noise. An Infinite Dimensional Calculus. Kluwer Academic Publishers, Dordrecht (1993)
7. Kilbas, A.A., Saigo, M., Trujillo, J.J.: On the generalized Wright function. Fract. Calc. Appl. Anal. 5(4), 437–460 (2002)
8. Kilbas, A.A., Srivastava, H.M., Trujillo, J.J.: Theory and Applications of Fractional Differential Equations. North-Holland Mathematics Studies, vol. 204. Elsevier Science B.V., Amsterdam (2006)
9. Mainardi, F.: Fractional Calculus and Waves in Linear Viscoelasticity: An Introduction to Mathematical Models. World Scientific (2010)
10. Mainardi, F., Pagnini, G.: The role of the Fox-Wright functions in fractional sub-diffusion of distributed order. J. Comput. Appl. Math. 207, 245–257 (2007). https://doi.org/10.1016/j.cam.2006.10.014
11. Mainardi, F., Pagnini, G., Gorenflo, R.: Mellin transform and subordination laws in fractional diffusion processes. Fract. Calc. Appl. Anal. 6(4), 441–459 (2003)
12. Mathai, A.M., Haubold, H.J.: Special Functions for Applied Scientists. Springer (2008)
13. Mentrelli, A., Pagnini, G.: Front propagation in anomalous diffusive media governed by time-fractional diffusion. J. Comput. Phys. 293, 427–441 (2015)
14. Mittag-Leffler, G.M.: Sur la nouvelle fonction Eα(x). CR Acad. Sci. Paris 137(2), 554–558 (1903)
15. Mittag-Leffler, G.M.: Sopra la funzione Eα(x). Rend. Accad. Lincei 5(13), 3–5 (1904)
16. Mittag-Leffler, G.M.: Sur la représentation analytique d'une branche uniforme d'une fonction monogène. Acta Math. 29(1), 101–181 (1905). https://doi.org/10.1007/BF02403200
17. Mura, A., Pagnini, G.: Characterizations and simulations of a class of stochastic processes to model anomalous diffusion. J. Phys. A Math. Theor. 41(28), 285003 (2008). https://doi.org/10.1088/1751-8113/41/28/285003
18. Pollard, H.: The completely monotonic character of the Mittag-Leffler function Ea(−x). Bull. Amer. Math. Soc. 54, 1115–1116 (1948)
19. Samko, S.G., Kilbas, A.A., Marichev, O.I.: Fractional Integrals and Derivatives: Theory and Applications. Gordon and Breach Science Publishers, Yverdon (1993). Edited and with a foreword by S.M. Nikol'skiĭ; translated from the 1987 Russian original, revised by the authors
20. Schneider, W.R.: Grey noise. In: Albeverio, S., Casati, G., Cattaneo, U., Merlini, D., Moresi, R. (eds.) Stochastic Processes, Physics and Geometry, pp. 676–681. World Scientific Publishing, Teaneck, NJ (1990)
21. Schneider, W.R.: Grey noise. In: Albeverio, S., Fenstad, J.E., Holden, H., Lindstrøm, T. (eds.) Ideas and Methods in Mathematical Analysis, Stochastics, and Applications (Oslo, 1988), pp. 261–282. Cambridge Univ. Press, Cambridge (1992)
22. da Silva, J.L., Streit, L.: Structure factors for generalized grey Brownian motion. Fract. Calc. Appl. Anal. 22(2), 396–411 (2019). https://doi.org/10.1515/fca-2019-0024
23. Teraoka, I.: Polymer Solutions: An Introduction to Physical Properties. Wiley, New York (2002)

Chapter 21

Flows of Rare Events for Regularly Perturbed Semi-Markov Processes Dmitrii Silvestrov

Abstract Necessary and sufficient conditions for convergence in distribution and in the Skorokhod J-topology for counting processes generated by flows of rare events for perturbed semi-Markov processes with finite phase space are obtained.

Keywords Semi-Markov process · Rare event · Counting process · Limit theorem

MSC 2020 60K15

D. Silvestrov (B) Department of Mathematics, Stockholm University, 106 91 Stockholm, Sweden. e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_21

21.1 Introduction

Random functionals similar to first-rare-event times are known under different names, such as first hitting times, first passage times, and absorption times in theoretical studies, and as lifetimes, first failure times, extinction times, etc., in applications. Limit theorems for such functionals for Markov-type processes have been studied by many researchers. The case of Markov chains and semi-Markov processes with finite phase spaces is the most deeply investigated. We refer here to the works [2–11, 13, 15–21, 27–32, 35, 40–43, 46, 50, 51, 54, 58, 60, 63–68, 73, 74, 77–79, 81, 82, 84, 87–93]. The case of Markov chains and semi-Markov processes with countable or arbitrary phase spaces was treated in the works [1, 6, 7, 24, 26, 33, 34, 36–39, 44, 45, 47–49, 52, 56, 57, 61, 62, 67–71, 75, 76, 83]. We also refer to the books [31, 72, 81] and the papers [55] and [80], where one can find comprehensive bibliographies of works in the area. The main feature of most previous results is that they give sufficient conditions of convergence for such functionals. As a rule, those conditions involve assumptions which imply convergence in distribution of sums of i.i.d. random variables, distributed as sojourn times for the semi-Markov process (for every state), to


some infinitely divisible random variables, plus some ergodicity condition for the embedded Markov chain, plus a condition of convergence to zero of the probability that a rare event occurs during one transition step of the semi-Markov process.

In the context of necessary and sufficient conditions of convergence in distribution for first-rare-event-time type functionals, we would like to point out the paper [53] and the books [12] and [25], where one can find some related results for geometric sums of random variables, and the papers [44] and [83], where one can find some related results for first-rare-event-time type functionals defined on Markov chains with an arbitrary phase space.

The results of the present paper relate to the model of perturbed semi-Markov processes with a finite phase space. Instead of conditions based on "individual" distributions of sojourn times, we use more general and weaker conditions imposed on the distributions of sojourn times averaged by the stationary distributions of the corresponding embedded Markov chains. The present paper can be considered as the second part of the paper [74], in which necessary and sufficient conditions for convergence in distribution of first-rare-event times and for convergence in the Skorokhod J-topology of first-rare-event-time processes were obtained. In the present paper, we extend these results to the counting processes generated by flows of rare events for perturbed semi-Markov processes with a finite phase space. These results give some kind of "final solution" for limit theorems for such counting processes.

The results presented in the paper generalize and improve the results concerning necessary and sufficient conditions of weak convergence for counting processes generated by flows of rare events for perturbed semi-Markov processes obtained in the papers [77–79] and [18–20]. Firstly, a weakened model condition of asymptotically uniform ergodicity is imposed on the corresponding embedded Markov chains. Secondly, more general multivariate counting processes generated by flows of rare events for perturbed semi-Markov processes are considered. Thirdly, new proofs, partly based on general limit theorems for randomly stopped stochastic processes, developed and extensively presented in [72], are given. This makes it possible to formulate the corresponding results also in the more advanced form of functional limit theorems.

21.2 First-Rare-Event Times for Perturbed Semi-Markov Processes

In this section we formulate the main results about weak and J-convergence of first-rare-event processes, together with some related results concerning asymptotically uniformly ergodic Markov chains, obtained in [74], where one can find the proofs of Lemmas 1 and 2 and Theorem 1 formulated below, as well as additional comments.

21 Flows of Rare Events for Regularly Perturbed Semi-Markov Processes


21.2.1 First-Rare-Event Times

Let (ηε,n, κε,n, ζε,n), n = 0, 1, …, be, for every ε ∈ (0, 1], a Markov renewal process, i.e., a homogeneous Markov chain with the phase space Z = X × [0, ∞) × {0, 1} (where X = {1, 2, …, m} is a finite set), an initial distribution q̄ε = ⟨qε,i = P{ηε,0 = i, κε,0 = 0, ζε,0 = 0} = P{ηε,0 = i}, i ∈ X⟩ and transition probabilities, for i, j ∈ X, s, t ≥ 0, ı, ȷ = 0, 1,

P{ηε,n+1 = j, κε,n+1 ≤ t, ζε,n+1 = ȷ / ηε,n = i, κε,n = s, ζε,n = ı}
= P{ηε,n+1 = j, κε,n+1 ≤ t, ζε,n+1 = ȷ / ηε,n = i} = Qε,ij(t, ȷ). (21.1)

As is known, the first component ηε,n of the above Markov renewal process is itself a homogeneous Markov chain, with the phase space X = {1, 2, …, m}, the initial distribution q̄ε = ⟨qε,i = P{ηε,0 = i}, i ∈ X⟩ and the transition probabilities, for i, j ∈ X,

pε,ij = Qε,ij(+∞, 0) + Qε,ij(+∞, 1). (21.2)

Also, the random sequence (ηε,n, ζε,n), n = 0, 1, …, is a Markov renewal process with the phase space X × {0, 1}, the initial distribution q̄ε = ⟨qε,i = P{ηε,0 = i, ζε,0 = 0} = P{ηε,0 = i}, i ∈ X⟩ and the transition probabilities, for i, j ∈ X, ı, ȷ = 0, 1,

pε,iı,jȷ = Qε,ij(+∞, ȷ). (21.3)

The random variables κε,n, n = 1, 2, …, can be interpreted as sojourn times, and the random variables τε,n = κε,1 + ⋯ + κε,n, n = 1, 2, …, τε,0 = 0, as the moments of jumps for a semi-Markov process ηε(t), t ≥ 0, defined by the following relation,

ηε(t) = ηε,n for τε,n ≤ t < τε,n+1, n = 0, 1, … . (21.4)

The transition probabilities of this semi-Markov process have the following form, for i, j ∈ X, s, t ≥ 0,

P{ηε,n+1 = j, κε,n+1 ≤ t / ηε,n = i, κε,n = s}
= P{ηε,n+1 = j, κε,n+1 ≤ t / ηε,n = i} = Qε,ij(t) = Qε,ij(t, 0) + Qε,ij(t, 1). (21.5)

As far as the random variables ζε,n, n = 1, 2, …, are concerned, they are interpreted as so-called "flag variables" and are used to record the events {ζε,n = 1}, which we interpret as "rare" events. Let us introduce the random variable

D. Silvestrov

ξε = ∑_{n=1}^{νε} κε,n, (21.6)

where
νε = min(n ≥ 1 : ζε,n = 1). (21.7)

The random variable νε counts the number of transitions of the imbedded Markov chain ηε,n up to the first occurrence of the "rare" event, while the random variable ξε can be interpreted as the first-rare-event time, i.e., the time of the first occurrence of the "rare" event for the semi-Markov process ηε(t). We also consider the first-rare-event-time process,

ξε(t) = ∑_{n=1}^{[tνε]} κε,n, t ≥ 0. (21.8)
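The objects νε and ξε defined in (21.6)–(21.8) can be illustrated by direct simulation. The following sketch uses a hypothetical two-state chain with exponential sojourn times; the transition matrix, rates, and flag probabilities are illustrative assumptions, not taken from the paper:

```python
import random

def first_rare_event_time(p_trans, rate, p_flag, state=0, rng=random):
    """Simulate (eta_n, kappa_n, zeta_n) until the first flag zeta_n = 1;
    return (nu, xi): the number of steps nu and the accumulated time xi."""
    nu, xi = 0, 0.0
    while True:
        nu += 1
        xi += rng.expovariate(rate[state])      # sojourn time kappa_n
        if rng.random() < p_flag[state]:        # flag variable zeta_n = 1
            return nu, xi
        # move the imbedded Markov chain to the next state
        state = 0 if rng.random() < p_trans[state][0] else 1

# illustrative two-state example with a small rare-event probability
random.seed(1)
p_trans = [[0.3, 0.7], [0.6, 0.4]]   # transition matrix of eta_n
rate = [1.0, 2.0]                    # exponential sojourn rates
p_flag = [0.01, 0.01]                # P{zeta_n = 1} in each state
nu, xi = first_rare_event_time(p_trans, rate, p_flag)
print(nu, xi)
```

With flag probability 0.01 in every state, νε is geometric with mean 100, so averaged over many runs the simulated νε concentrates near that value.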

It is useful to note that the traditional definition of the first-rare-event time as the first hitting time of the semi-Markov process ηε(t) into some asymptotically absorbing state or domain is a particular case of the above one.

Here and henceforth, we use the symbol ⇒ to indicate weak convergence of distributions, the symbol →d to indicate convergence in distribution for random variables (equivalent to the weak convergence of their distribution functions, or of finite-dimensional distributions for stochastic processes), the symbol →P to indicate convergence of random variables in probability, and the symbols →U and →J to indicate convergence, respectively, in the uniform U-topology and in the Skorokhod J-topology for real-valued càdlàg stochastic processes defined on the time interval [0, ∞). We refer to the books [14, 23] and [72] for details concerning the above forms of functional convergence.

The problems formulated above are solved under three general model assumptions. Let us introduce the probabilities of occurrence of the rare event during one transition step of the semi-Markov process ηε(t),

pε,i = Pi{ζε,1 = 1}, i ∈ X.

Here and henceforth, Pi and Ei denote, respectively, conditional probability and expectation calculated under the condition that ηε,0 = i.

The first model assumption A1, imposed on the probabilities pε,i, specifies the interpretation of the event {ζε,n = 1} as "rare" and guarantees the possibility for such an event to occur:

A1: 0 < max_{i∈X} pε,i → 0 as ε → 0.


21.2.2 Asymptotically Uniformly Ergodic Markov Chains

Let us introduce the random variables,

με,i(n) = ∑_{k=1}^{n} I(ηε,k−1 = i), n = 0, 1, …, i ∈ X. (21.9)

If the Markov chain ηε,n is ergodic, i.e., X is one class of communicating states for this Markov chain, then its stationary distribution is given by the following ergodic relation,

με,i(n)/n →P πε,i as n → ∞, for i ∈ X. (21.10)

The ergodic relation (21.10) holds for any initial distribution q̄ε, and the stationary distribution πε,i, i ∈ X, does not depend on the initial distribution. Also, all stationary probabilities are positive, i.e., πε,i > 0, i ∈ X. As is known, the stationary probabilities πε,i, i ∈ X, are the unique solution of the system of linear equations,

πε,i = ∑_{j∈X} πε,j pε,ji, i ∈ X,  ∑_{i∈X} πε,i = 1. (21.11)

The second model assumption is a condition of asymptotically uniform ergodicity for the embedded Markov chains ηε,n:

B1: There exists a ring chain of states i0, i1, …, iN = i0, which contains all states from the phase space X and is such that lim inf_{ε→0} pε,i_{k−1} i_k > 0, for k = 1, …, N.

Let η̃ε,n be, for every ε ∈ (0, 1], a Markov chain with the phase space X and a matrix of transition probabilities ‖p̃ε,ij‖. We shall also use the following condition:

B2: pε,ij − p̃ε,ij → 0 as ε → 0, for i, j ∈ X.

If the transition probabilities p̃ε,ij ≡ p0,ij, i, j ∈ X, do not depend on ε, then condition B2 reduces to the following condition:

B3: pε,ij → p0,ij as ε → 0, for i, j ∈ X.

Lemma 1 Let condition B1 hold for the Markov chains ηε,n. Then:
(i) There exists ε0 ∈ (0, 1] such that the Markov chain ηε,n is ergodic, for every ε ∈ (0, ε0], and 0 < lim inf_{ε→0} πε,i ≤ lim sup_{ε→0} πε,i < 1, for i ∈ X.
(ii) If, together with B1, condition B2 holds, then there exists ε̃0 ∈ (0, ε0] such that the Markov chain η̃ε,n is ergodic, for every ε ∈ (0, ε̃0], and its stationary distribution π̃ε,i, i ∈ X, satisfies the asymptotic relation, πε,i − π̃ε,i → 0 as ε → 0, for i ∈ X.
(iii) If condition B3 holds, then the matrix ‖p0,ij‖ is stochastic, condition B1 is equivalent to the assumption that a Markov chain η0,n, with the matrix of transition probabilities ‖p0,ij‖, is ergodic, and the following asymptotic relation holds, πε,i → π0,i


as ε → 0, for i ∈ X, where π0,i, i ∈ X, is the stationary distribution of the Markov chain η0,n.

Remark 1 Proposition (iii) of Lemma 1 implies that, in the case where the transition probabilities pε,ij = p0,ij, i, j ∈ X, do not depend on the parameter ε, or pε,ij → p0,ij as ε → 0, for i, j ∈ X, condition B1 reduces to the standard assumption that the Markov chain η0,n, with the matrix of transition probabilities ‖p0,ij‖, is ergodic.

According to Lemma 1, condition B1 implies that there exists ε0 ∈ (0, 1] such that the phase space X is one class of communicating states for the Markov chain ηε,n, for every ε ∈ (0, ε0]. In what follows, we assume that ε ∈ (0, ε0].

Let αε,i,0 = 0 and αε,i,n = min(k > αε,i,n−1 : ηε,k = i), n = 1, 2, …, be the sequential moments of hitting state i ∈ X for the Markov chain ηε,n.

Lemma 2 Let condition B1 hold. Then, for any initial distribution q̄ε and 0 < uε → ∞ as ε → 0,

α*ε,i(t) = αε,i([πε,i t uε]) / uε, t ≥ 0 →U t, t ≥ 0 as ε → 0, (21.12)

and

μ*ε,i(t) = με,i([t uε]) / (πε,i uε), t ≥ 0 →U t, t ≥ 0 as ε → 0. (21.13)
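The stationary distributions πε,i entering the relations above are, for each fixed ε, the unique solution of the linear system (21.11). A minimal numerical sketch (the 3×3 transition matrix is an illustrative assumption, not taken from the paper):

```python
import numpy as np

def stationary_distribution(P):
    """Solve the system (21.11): pi = pi P together with the
    normalisation sum(pi) = 1, for an ergodic stochastic matrix P."""
    m = P.shape[0]
    A = np.vstack([np.eye(m) - P.T, np.ones(m)])   # stack the normalisation row
    b = np.concatenate([np.zeros(m), [1.0]])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)     # consistent least-squares solve
    return pi

# illustrative 3-state transition matrix (values assumed for the sketch)
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4]])
pi = stationary_distribution(P)
print(pi)   # positive components summing to one, with pi = pi P
```

The over-determined system is consistent for an ergodic chain, so the least-squares solve returns the exact stationary distribution.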

21.2.3 Necessary and Sufficient Conditions of Weak Convergence for First-Rare-Event Times

The third model assumption is the following condition, which guarantees that the last summand κε,νε in the random sum ξε is asymptotically negligible:

C1: Pi{κε,1 > δ / ζε,1 = 1} → 0 as ε → 0, for δ > 0, i ∈ X.

Note that we define the probability Pi{κε,1 > δ / ζε,1 = 1} = 0 in the cases where Pi{ζε,1 = 1} = 0.

Let us consider, for every ε ∈ (0, ε0], the step-sum stochastic process,

κε(t) = ∑_{n=1}^{[t uε]} κε,n, t ≥ 0, (21.14)

where the normalising function uε is defined in (21.15) below.

The random variables κε(t) can be interpreted as rewards accumulated on the trajectories of the Markov chain ηε,n. Respectively, the random variable ξε can be interpreted as the reward accumulated on the trajectories of the Markov chain ηε,n up to the first occurrence of the "rare" event.


Let us define the probability pε, which is the result of averaging the probabilities of occurrence of the rare event in one transition step by the stationary distribution of the imbedded Markov chain ηε,n,

pε = ∑_{i∈X} πε,i pε,i, and let uε = pε⁻¹. (21.15)

Let us introduce the distribution functions of the sojourn times κε,1 for the semi-Markov processes ηε(t),

Gε,i(t) = Pi{κε,1 ≤ t}, t ≥ 0, i ∈ X. (21.16)

Let θε,n, n = 1, 2, …, be i.i.d. random variables with the distribution function Gε(t), which is the result of averaging the distribution functions of the sojourn times by the stationary distribution of the imbedded Markov chain ηε,n,

Gε(t) = ∑_{i∈X} πε,i Gε,i(t), t ≥ 0. (21.17)
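The averaged characteristics (21.15) and (21.17) are plain mixtures over the stationary distribution. A small sketch, with all numerical values and the exponential form of the Gε,i assumed purely for illustration:

```python
import numpy as np

# stationary distribution and per-state rare-event probabilities
# (illustrative values, assumed for the sketch)
pi = np.array([0.35, 0.40, 0.25])
p_state = np.array([0.02, 0.01, 0.005])   # p_{eps,i} = P_i{zeta_1 = 1}

p_eps = pi @ p_state       # averaged probability (21.15)
u_eps = 1.0 / p_eps        # normalisation u_eps = 1 / p_eps

# averaged sojourn distribution (21.17): a mixture of the G_{eps,i};
# here each G_{eps,i} is taken exponential with rate r_i as an assumption
rates = np.array([1.0, 2.0, 4.0])
rng = np.random.default_rng(0)
i = rng.choice(3, size=100_000, p=pi)       # pick a state according to pi
theta = rng.exponential(1.0 / rates[i])     # draw a sojourn time from G_{eps,i}
print(p_eps, u_eps, theta.mean())           # mean of G_eps = sum_i pi_i / r_i
```

For these assumed values, pε = 0.01225 and the mixture mean is 0.35/1 + 0.40/2 + 0.25/4 = 0.6125.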

Now, we can formulate the necessary and sufficient condition for convergence in distribution of first-rare-event times:

D1: θε = ∑_{n=1}^{[uε]} θε,n →d θ0 as ε → 0, where θ0 is a non-negative random variable with distribution not concentrated in zero.

As is well known, (d1) the limiting random variable θ0 appearing in condition D1 must be infinitely divisible and, thus, its Laplace transform has the form Ee^{−sθ0} = e^{−A(s)}, where A(s) = gs + ∫_{(0,∞)} (1 − e^{−sv}) G(dv), s ≥ 0, g is a non-negative constant, and G(dv) is a measure on the interval (0, ∞) such that ∫_{(0,∞)} v/(1+v) G(dv) < ∞; (d2) g + ∫_{(0,∞)} v/(1+v) G(dv) > 0 (this is equivalent to the assumption that P{θ0 = 0} < 1).

Let us also consider the homogeneous step-sum process with independent increments (the summands are i.i.d. random variables),

θε(t) = ∑_{n=1}^{[t uε]} θε,n, t ≥ 0. (21.18)

As is known (see, for example, [85, 86]), condition D1 is necessary and sufficient for holding of the asymptotic relation,

θε(t) = ∑_{n=1}^{[t uε]} θε,n, t ≥ 0 →J θ0(t), t ≥ 0 as ε → 0, (21.19)


where θ0(t), t ≥ 0, is a nonnegative Lévy process (a càdlàg homogeneous process with independent increments) with the Laplace transforms Ee^{−sθ0(t)} = e^{−tA(s)}, s, t ≥ 0.

Let us define the Laplace transforms,

φε,i(s) = Ei e^{−sκε,1} = ∫_0^∞ e^{−st} Gε,i(dt), s ≥ 0, i ∈ X, (21.20)

and

φε(s) = E e^{−sθε,1} = ∫_0^∞ e^{−st} Gε(dt) = ∑_{i∈X} πε,i φε,i(s), s ≥ 0. (21.21)
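For concreteness, if the sojourn distributions Gε,i are taken to be exponential (an assumption made only for this sketch), the transforms (21.20) are rational, φε,i(s) = rᵢ/(rᵢ + s), and the averaged transform (21.21) is their π-mixture, which can be checked against a Monte Carlo estimate:

```python
import numpy as np

# Sketch of (21.20)-(21.21) with assumed exponential sojourn laws:
# phi_{eps,i}(s) = r_i / (r_i + s), and phi_eps(s) is their pi-mixture.
pi = np.array([0.35, 0.40, 0.25])
rates = np.array([1.0, 2.0, 4.0])
s = 0.7

phi_i = rates / (rates + s)          # per-state transforms (21.20)
phi = pi @ phi_i                     # averaged transform (21.21)

# Monte Carlo estimate of E exp(-s * theta) for theta distributed as G_eps
rng = np.random.default_rng(1)
i = rng.choice(3, size=200_000, p=pi)
theta = rng.exponential(1.0 / rates[i])
print(phi, np.exp(-s * theta).mean())   # the two values agree closely
```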

Condition D1 can be reformulated (see, for example, [22]) in the equivalent form, in terms of the above Laplace transforms:

D2: uε(1 − φε(s)) → A(s) as ε → 0, for s > 0, where the limiting function A(s) > 0, for s > 0, and A(s) → 0 as s → 0.

In this case, (d3) A(s) is the cumulant of a non-negative random variable with distribution not concentrated in zero. Moreover, (d4) A(s) must be the cumulant of an infinitely divisible distribution of the form given in the above conditions (d1) and (d2).

The following condition, which is a variant of the so-called central criterion of convergence (see, for example, [59]), is equivalent to condition D1, with the Laplace transform of the limiting random variable θ0 given in the above conditions (d1) and (d2):

D3: (a) uε(1 − Gε(u)) → G(u) as ε → 0, for all u > 0 which are points of continuity of the limiting function G(u), which is a nonnegative, non-increasing, right-continuous function defined on the interval (0, ∞), has the limiting value G(+∞) = 0, and is connected with the measure G(dv) by the relation G((u′, u″]) = G(u′) − G(u″), 0 < u′ ≤ u″ < ∞; (b) uε ∫_{(0,u]} v Gε(dv) → g + ∫_{(0,u]} v G(dv) as ε → 0, for some u > 0 which is a point of continuity of G(u).

It is useful to note that (d5) the asymptotic relation appearing in condition D3 (b) holds, under condition D3 (a), for any u > 0 which is a point of continuity of the function G(u).

The following theorem takes place.

Theorem 1 Let conditions A1, B1 and C1 hold. Then:
(i) Condition D1 is necessary and sufficient for holding (for some or any initial distributions q̄ε, respectively, in the statements of necessity and sufficiency) of the asymptotic relation ξε = ξε(1) →d ξ0 as ε → 0, where ξ0 is a non-negative random variable with distribution not concentrated in zero. In this case:


(ii) The distribution function F(u) of the limiting random variable ξ0 has the Laplace transform ∫_0^∞ e^{−su} F(du) = 1/(1 + A(s)), where A(s) is the cumulant of the infinitely divisible distribution defined in condition D1.

(iii) The stochastic processes κε(t), t ≥ 0 →J θ0(t), t ≥ 0 as ε → 0, where (a) θ0(t), t ≥ 0, is a nonnegative Lévy process with the Laplace transforms Ee^{−sθ0(t)} = e^{−tA(s)}, s, t ≥ 0, and the stochastic processes ξε(t), t ≥ 0 →J ξ0(t) = θ0(tν0), t ≥ 0 as ε → 0, where (b) ν0 is a random variable which has the exponential distribution with parameter 1, and (c) the random variable ν0 and the process θ0(t), t ≥ 0, are independent.

Remark 2 According to Theorem 1, the class F of all possible nonnegative, non-decreasing, càdlàg, stochastically continuous processes ξ0(t), t ≥ 0, with the distributions of the random variables ξ0(t), t > 0, not concentrated in zero, and such that the asymptotic relation ξε(t), t ≥ 0 →J ξ0(t), t ≥ 0 as ε → 0, holds, coincides with the class of limiting processes described in proposition (iii). Condition D1 is a necessary and sufficient condition not only for the asymptotic relation given in propositions (i)–(ii), but also for the much stronger asymptotic relation given in proposition (iii).

Remark 3 The statement "for some or any initial distributions q̄ε, respectively, in statements of necessity and sufficiency" used in the formulation of Theorem 1 should be understood in the sense that the asymptotic relation appearing in proposition (i) should hold for at least one family of initial distributions q̄ε, ε ∈ (0, ε0], in the statement of necessity, and for any family of initial distributions q̄ε, ε ∈ (0, ε0], in the statement of sufficiency.

Remark 4 It is possible to modify condition D1 and to assume that the random variable θ0 appearing in this condition has no atom at zero. In terms of the corresponding cumulant A(s) = gs + ∫_0^∞ (1 − e^{−sv}) G(dv), this holds if and only if A(s) → ∞ as s → ∞ or, equivalently, g > 0, or g = 0 but G((0, ∞)) = ∞. In this case, the random variable ξ0 appearing in proposition (i) of Theorem 1 also has a distribution function without an atom at zero.

Remark 5 The specific Markov property possessed by the Markov renewal process (ηε,n, κε,n, ζε,n), represented by relation (21.1), and condition C1 imply that, for i ∈ X and δ > 0, and any initial distributions q̄ε,

κε,νε →P 0 as ε → 0. (21.22)

This relation implies that the first-rare-event times ξε = ∑_{n=1}^{νε} κε,n can be replaced in Theorem 1 by the modified first-rare-event times ξ′ε = ∑_{n=1}^{νε−1} κε,n and, moreover, by any random variable ξ″ε such that ξ′ε ≤ ξ″ε ≤ ξε.

Remark 6 Simpler variants of the asymptotic ergodicity condition, based on condition B3 and the assumption of ergodicity of the Markov chain η0,n, combined with averaging of the characteristics in condition D1 by its stationary distribution π0,i, i ∈ X, have been used in the above-mentioned works [78] and [18] for proving analogues of


propositions (i) and (ii) of Theorem 1. In this case, the averaging of characteristics in the necessary and sufficient condition D1 relates, in fact, mainly to the distributions of sojourn times. Condition B1 balances in a natural way the averaging of characteristics in condition D1 between the distributions of sojourn times and the stationary distributions of the corresponding embedded Markov chains.
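A simple special case of Theorem 1 can be checked by simulation. Assume (purely for illustration, not the paper's setting) sojourn times κε,n = ε·Exp(1) and a constant flag probability pε,i ≡ ε. Then uε = ε⁻¹, θε → 1, so A(s) = s, and the limit law of ξε is ξ0 = θ0(ν0) = ν0, exponential with parameter 1:

```python
import random

# Illustrative special case of Theorem 1 (all parameters assumed):
# scaled sojourn times eps * Exp(1) and flag probability eps per step,
# so that xi_eps converges in distribution to Exp(1).
random.seed(2)
eps = 0.01

def xi(eps, rng=random):
    t = 0.0
    while True:
        t += eps * rng.expovariate(1.0)   # scaled sojourn time
        if rng.random() < eps:            # rare flag zeta_n = 1
            return t

sample = [xi(eps) for _ in range(5000)]
mean = sum(sample) / len(sample)
tail = sum(s > 1.0 for s in sample) / len(sample)
print(mean, tail)   # close to 1 and to exp(-1) = 0.368, respectively
```

In fact, in this toy model the geometric sum of i.i.d. exponentials is itself exactly exponential, so the agreement is exact up to Monte Carlo error.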

21.3 Counting Processes Generated by Flows of Rare Events

In this section, we present necessary and sufficient conditions of weak convergence and of convergence in the J-topology for counting processes generated by flows of rare events for perturbed semi-Markov processes.

21.3.1 Counting Processes for Rare Events

Let us define recurrently the random variables, for k = 1, 2, …,

νε(k) = min(n > νε(k − 1) : ζε,n = 1), (21.23)

where νε(0) = 0. The random variable νε(k) counts the number of transitions of the imbedded Markov chain ηε,n up to the k-th occurrence of the "rare" event. Obviously, νε(1) = νε.

Let us also define the inter-rare-event times, for k = 1, 2, …,

κε(k) = ∑_{n=νε(k−1)+1}^{νε(k)} κε,n. (21.24)

Let us also introduce the random variables showing the positions of the imbedded Markov chain ηε,n at the moments νε(k), k = 0, 1, …,

ηε(k) = ηε,νε(k). (21.25)

Obviously, (ηε(k), κε(k)), k = 0, 1, … (here, κε(0) = 0), is a Markov renewal process, i.e., a homogeneous Markov chain with the phase space X × [0, ∞) and transition probabilities, for i, j ∈ X, s, t ≥ 0,

P{ηε(k + 1) = j, κε(k + 1) ≤ t / ηε(k) = i, κε(k) = s}
= P{ηε(k + 1) = j, κε(k + 1) ≤ t / ηε(k) = i}
= Pi{ηε,νε = j, ξε ≤ t} = Q(ε)ij(t). (21.26)


Let us now define the random variables,

ξε(k) = ∑_{n=1}^{νε(k)} κε,n = ∑_{n=1}^{k} κε(n), k = 0, 1, …, ξε(0) = 0. (21.27)

The random variable ξε(k) can be interpreted as the time of the k-th occurrence of the rare event for the semi-Markov process ηε(t). Obviously, ξε(1) = ξε.

Now, we can define the counting stochastic process generated by the flow of rare events,

Nε(t) = max(k ≥ 0 : ξε(k) ≤ t), t ≥ 0. (21.28)

Let us also denote by N the class of integer-valued, non-negative, non-decreasing, stepwise, right-continuous processes defined on the interval [0, ∞), with a finite number of jumps in every finite interval. Obviously, any process from the class N is a càdlàg process. The counting process Nε(t) belongs to the class N.
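The jump times ξε(k) of (21.27) and the counting process Nε(t) of (21.28) can be generated directly. In the scaled toy model used above (an illustrative assumption, not the paper's setting), the flow of rare events behaves like a Poisson flow of rate 1:

```python
import random
import bisect

random.seed(3)
eps = 0.01

def rare_event_times(eps, horizon, rng=random):
    """Jump times xi_eps(k) of the counting process on [0, horizon]
    for the illustrative scaled model (assumed, not from the paper)."""
    times, t = [], 0.0
    while t <= horizon:
        t += eps * rng.expovariate(1.0)   # scaled sojourn time
        if rng.random() < eps:            # rare flag
            times.append(t)               # k-th rare-event time xi_eps(k)
    return times

def N(times, t):
    """Counting process N_eps(t) = max(k >= 0 : xi_eps(k) <= t)."""
    return bisect.bisect_right(times, t)

times = rare_event_times(eps, horizon=1000.0)
print(N(times, 10.0), len(times) / 1000.0)   # empirical rate close to 1
```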

21.3.2 Necessary and Sufficient Conditions of Convergence for Counting Processes Generated by Flows of Rare Events

Let κ(k), k = 1, 2, …, be a sequence of i.i.d. non-negative random variables with distribution function F(u) not concentrated in zero, and let ξ(k) = ∑_{n=1}^{k} κ(n), k = 0, 1, …. Let us also define the standard renewal counting process,

N(t) = max(k ≥ 0 : ξ(k) ≤ t), t ≥ 0. (21.29)

Let us denote by N the set of stochastic continuity points t > 0 of the process N(t). The complement N̄ = (0, ∞) \ N of this set is at most countable. Moreover, N̄ ⊆ K = ∪_{n≥1} Kn, where Kn is the set of discontinuity points u > 0 of the distribution function F^(∗n)(u) of the random variable ξ(n), for n = 1, 2, …. Also, any standard renewal counting process N(t), t ≥ 0, belongs to the class N.

The following theorem takes place.

Theorem 2 Let conditions A1, B1, and C1 hold. Then:

(i) Condition D1 is necessary and sufficient for holding (for some or any initial distributions q̄ε, respectively, in the statements of necessity and sufficiency) of the asymptotic relation Nε(t), t ∈ N →d N(t), t ∈ N as ε → 0, where N(t), t ≥ 0, is some process from the class N such that P{N(t) ≥ 1} = F(t) is a distribution function on [0, ∞) not concentrated in zero, and N is the set of stochastic continuity points t > 0 of this process.


(ii) In this case, N(t), t ≥ 0, is a standard renewal counting process defined by relation (21.29) via some sequence of i.i.d. non-negative random variables κ(k), k = 1, 2, …, with distribution function F(u) not concentrated in zero and with the Laplace transform φ(s) = ∫_0^∞ e^{−st} F(dt) = 1/(1 + A(s)), where A(s) is the cumulant of the infinitely divisible distribution defined in condition D1.

Proof Obviously, the random variable κε(1) has the following conditional distribution function, for i ∈ X,

F(ε)i(u) = Pi{κε(1) ≤ u} = Pi{ξε ≤ u} = ∑_{j∈X} Q(ε)ij(u), u ≥ 0. (21.30)

According to Theorem 1, conditions A1–D1 imply that, for i ∈ X,

F(ε)i(·) = Pi{κε(1) ≤ ·} ⇒ F(·) as ε → 0, (21.31)

where F(u) is the distribution function on [0, ∞) not concentrated in zero, with the Laplace transform φ(s) = ∫_0^∞ e^{−st} F(dt) = 1/(1 + A(s)), appearing in condition D1.

Using the Markov property of the Markov renewal process (ηε(k), κε(k)), we get the following formula for the joint conditional distribution function of the random variables κε(k), k = 1, …, n, for u1, …, un ≥ 0, and i0 ∈ X, n = 1, 2, …,

Pi0{κε(k) ≤ uk, k = 1, …, n}
= ∑_{i_{n−1}∈X} Pi0{κε(k) ≤ uk, k = 1, …, n − 1, ηε(n − 1) = i_{n−1}} F(ε)i_{n−1}(un). (21.32)

Using relations (21.32) and (21.31), we get that, under conditions A1–D1, the following convergence relation holds for the joint conditional distribution function of the random variables κε(k), k = 1, 2, for all points ū2 = (u1, u2), u1, u2 ≥ 0, which are points of continuity for the corresponding multi-dimensional distribution function, and i0 ∈ X,

|Pi0{κε(k) ≤ uk, k = 1, 2} − F(u1)F(u2)|
≤ |Pi0{κε(k) ≤ uk, k = 1, 2} − Pi0{κε(1) ≤ u1}F(u2)|
+ |Pi0{κε(1) ≤ u1}F(u2) − F(u1)F(u2)|
≤ ∑_{i1∈X} Pi0{κε(1) ≤ u1, ηε(1) = i1} |F(ε)i1(u2) − F(u2)|
+ |Pi0{κε(1) ≤ u1} − F(u1)| F(u2) → 0 as ε → 0. (21.33)

Analogously, using relations (21.31) and (21.32), we get that, under conditions A1–D1, the following convergence relation holds for the joint conditional distribution function of the random variables κε(k), k = 1, …, n, for all points ūn = (u1, …, un),


u1, …, un ≥ 0, which are points of continuity for the corresponding multi-dimensional distribution function, and i0 ∈ X, n = 1, 2, …,

|Pi0{κε(k) ≤ uk, k = 1, …, n} − ∏_{k=1}^{n} F(uk)| → 0 as ε → 0. (21.34)

Relation (21.34) means that the inter-renewal times κε(k), k = 1, 2, …, are asymptotically independent.

Let κ(k), k = 1, 2, …, be i.i.d. random variables with distribution function F(u), let ξ(n) = ∑_{k=1}^{n} κ(k), n = 1, 2, …, and let H(u1, …, un) be the joint distribution function of the random variables ξ(k), k = 1, …, n, for n = 1, 2, ….

Relation (21.34) obviously implies that, under conditions A1–D1, the following convergence relation holds for the joint conditional distribution functions of the random variables ξε(k) = ∑_{r=1}^{k} κε(r), k = 1, …, n, for all points ūn = (u1, …, un), u1, …, un ≥ 0, which are points of continuity for the corresponding multi-dimensional distribution function, and i ∈ X, n = 1, 2, …,

Pi{ξε(k) ≤ uk, k = 1, …, n} → P{ξ(k) ≤ uk, k = 1, …, n} = H(u1, …, un) as ε → 0. (21.35)

Relation (21.35) implies in an obvious way that, under conditions A1–D1, the following convergence relation holds for 0 < t1 ≤ ⋯ ≤ tn < ∞, which are points of continuity for the corresponding limiting distribution functions (of the random variables ξ(rk), k = 1, …, n), integers 0 ≤ r1 ≤ ⋯ ≤ rn < ∞, and i ∈ X, n = 1, 2, …,

Pi{Nε(tk) ≥ rk, k = 1, …, n} = Pi{ξε(rk) ≤ tk, k = 1, …, n}
→ P{ξ(rk) ≤ tk, k = 1, …, n} = P{N(tk) ≥ rk, k = 1, …, n} as ε → 0. (21.36)

Relation (21.36), in an obvious way, implies that the following asymptotic relation holds, for any initial distributions q̄ε,

Nε(t), t ∈ K̄ →d N(t), t ∈ K̄ as ε → 0, (21.37)

where K̄ = (0, ∞) \ K.

Since N(t), t ≥ 0, is a non-decreasing process and the set K is at most countable, so that the set K̄ ⊆ N is dense in the interval [0, ∞), relation (21.37) can, in the standard way, be extended to the asymptotic relation given in proposition (i), with the limiting standard renewal counting process N(t) described in proposition (ii) of Theorem 2.

Let us now assume that conditions A1–C1 hold and, for some initial distributions q̄ε, the asymptotic relation Nε(t), t ∈ N →d N(t), t ∈ N as ε → 0 takes place, where N(t) is some stochastic process described in proposition (i) of Theorem 2, and N is the set of stochastic continuity points t > 0 of this process. This asymptotic relation implies that the following relation holds, for t ∈ N,


Pq̄ε{ξε ≤ t} = Pq̄ε{Nε(t) ≥ 1} → P{N(t) ≥ 1} = F(t) as ε → 0. (21.38)

Since F(t) is a distribution function on [0, ∞) not concentrated in zero and the set N is dense in [0, ∞), relation (21.38) implies that

Pq̄ε{ξε ≤ ·} ⇒ F(·) as ε → 0. (21.39)

Relation (21.39) implies, by Theorem 1, that condition D1 holds. Thus, proposition (ii) of Theorem 2 takes place and N(t) is the standard renewal counting process described in this proposition. □

Remark 7 In the case where condition D1 is modified in the way described in Remark 4, i.e., it is assumed that the distribution function F(u) appearing in this condition has no atom at zero, then N(0) = 0 with probability 1, the point 0 can additionally be included in the set N, and the distribution function P{N(t) ≥ 1} = F(t), which appears in proposition (i) of Theorem 2, has no atom at zero.

Remark 8 In the case described in Remark 7, relation (21.33) implies (by Theorem 4.4.1 from [72]) that, under conditions A1–D1, the more general asymptotic relation Nε(t), t ≥ 0 →J N(t), t ≥ 0 as ε → 0, takes place.

Remark 9 Conditions A1–D1 of Theorem 1 guarantee that the distribution functions of the first-rare-event times Fε(·) = P{ξε ≤ ·} ⇒ F(·) as ε → 0, where F(·) is the distribution function appearing in condition D1. However, this weak convergence relation does not guarantee that the probabilities Fε(0) converge to the probability F(0) if F(0) > 0, because, in this case, 0 is a discontinuity point of the distribution function F(·). Since P{Nε(0) = r} = Fε(0)^r (1 − Fε(0)), r = 0, 1, …, the point 0 cannot, in this case, be included in the set of weak convergence N appearing in the asymptotic relation given in proposition (i) of Theorem 2. Also, in this case, J-convergence of the processes Nε(t) cannot be guaranteed.

21.4 Markov Renewal Processes Generated by Flows of Rare Events

In this section, we present necessary and sufficient conditions of weak convergence for transition probabilities and finite-dimensional distributions of Markov renewal processes generated by flows of rare events for perturbed semi-Markov processes.


21.4.1 Return Times and Rare Events

According to Lemma 1, condition B1 implies that the Markov chain ηε,n is ergodic for ε ∈ (0, ε0], for some ε0 ∈ (0, 1]. In what follows, we assume that ε ∈ (0, ε0].

Interestingly, conditions A1–D1 are not sufficient for weak convergence of the transition probabilities of the Markov renewal process (ηε(k), κε(k)) that forms the counting process Nε(t). Let us first prove some useful lemmas.

Let us introduce the moments of sequential return of the Markov chain ηε,n to the state i ∈ X,

αε,i(k) = min(n > αε,i(k − 1) : ηε,n = i), k = 1, 2, …, αε,i(0) = 0. (21.40)

Also, let us define the probabilities, for i, r ∈ X,

qε,ji(r) = Pj{νε ≤ αε,i(1), ηε,νε = r}, (21.41)

and, for i ∈ X,

qε,ji = Pj{νε ≤ αε,i(1)} = ∑_{r∈X} qε,ji(r). (21.42)

Let us also denote, for i, r ∈ X,

pε,i(r) = Pi{ηε,1 = r, ζε,1 = 1}, (21.43)

and, for r ∈ X,

pε(r) = ∑_{i=1}^{m} πε,i pε,i(r). (21.44)

Obviously, for i ∈ X,

pε,i = Pi{ζε,1 = 1} = ∑_{r∈X} pε,i(r), (21.45)

and

pε = ∑_{i∈X} πε,i pε,i = ∑_{i∈X} πε,i ∑_{r∈X} pε,i(r) = ∑_{r∈X} pε(r). (21.46)

Lemma 3 Let conditions A1–C1 hold. Then, for every i ∈ X,

πε,i qε,ii / pε → 1 as ε → 0. (21.47)

Proof Let us introduce the following matrices, for i ∈ X,

iPε = ‖pε,jk I(k ≠ i)‖, (21.48)

i.e., the matrix of transition probabilities ‖pε,jk‖ with its i-th column replaced by zeroes.

Let us introduce the random variable δε,ik, which is the number of visits of the imbedded Markov chain ηε,n to the state k up to the first visit to the state i, for i, k ∈ X,

δε,ik = ∑_{n=1}^{αε,i(1)} I(ηε,n−1 = k). (21.49)

As is well known, due to the ergodicity of the Markov chain ηε,n, for j, i, k ∈ X,

Ej δε,ik < ∞. (21.50)

Moreover, for every i ∈ X, there exists the inverse matrix,

[I − iPε]⁻¹ = ‖Ej δε,ik‖. (21.51)

Let us also introduce, for i ∈ X, the matrix,

iP̃ε = ‖p̃ε,jk I(k ≠ i)‖, (21.52)

i.e., the matrix ‖p̃ε,jk‖ with its i-th column replaced by zeroes, where, for j, k ∈ X,

p̃ε,jk = Pj{ηε,1 = k, ζε,1 = 0}. (21.53)

Let us also introduce the random variable δ̃ε,ik, which is the number of visits of the imbedded Markov chain ηε,n to the state k before the first visit to the state i or the occurrence of the first rare event, for i, k ∈ X,

δ̃ε,ik = ∑_{n=1}^{αε,i(1)∧νε} I(ηε,n−1 = k). (21.54)

Obviously, 0 ≤ δ̃ε,ik ≤ δε,ik and, therefore, for j, i, k ∈ X,

Ej δ̃ε,ik ≤ Ej δε,ik < ∞. (21.55)

Moreover, the matrix iP̃εⁿ has the following form, for i ∈ X and n ≥ 1,

iP̃εⁿ = ‖Pj{ηε,n = k, αε,i(1) ∧ νε > n}‖, (21.56)


and, therefore, there exists the inverse matrix,

[I − iP̃ε]⁻¹ = I + iP̃ε + iP̃ε² + ⋯ = ‖Ej δ̃ε,ik‖. (21.57)

The probabilities pε,jk ∈ [0, 1], j, k ∈ X, for ε ∈ (0, 1]. Therefore, any sequence εn ∈ (0, 1], n = 1, 2, …, such that εn → 0 as n → ∞, contains a subsequence εn_r, r = 1, 2, …, such that, for j, k ∈ X,

pεn_r,jk → p0,jk ∈ [0, 1] as r → ∞. (21.58)

According to the above remarks, there exists, for every i ∈ X, the inverse matrix [I − iPεn_r]⁻¹ for all r such that εn_r ∈ (0, ε0]. Obviously, the matrix P0 = ‖p0,jk‖ is stochastic. Moreover, condition B1 implies that the phase space X of a Markov chain with the matrix of transition probabilities P0 is one class of communicating states, i.e., this Markov chain is ergodic. This implies, for reasons analogous to those mentioned in connection with (21.50) and (21.51), that, for every i ∈ X, there exists the inverse matrix [I − iP0]⁻¹; thus, det(I − iP0) ≠ 0. The elements of the matrices [I − iPεn_r]⁻¹ and [I − iP0]⁻¹ are continuous rational functions of the elements of the matrices [I − iPεn_r] and [I − iP0], respectively. That is why relation (21.58) implies that, for i ∈ X,

[I − iPεn_r]⁻¹ → [I − iP0]⁻¹ as r → ∞. (21.59)

Condition A1 obviously implies that, for j, k ∈ X,

|p̃ε,jk − pε,jk| ≤ pε,j → 0 as ε → 0, (21.60)

and, therefore, for j, k ∈ X,

p̃εn_r,jk → p0,jk ∈ [0, 1] as r → ∞. (21.61)

Relation (21.61) implies, for reasons analogous to those mentioned in connection with (21.59), that, for i ∈ X,

[I − iP̃εn_r]⁻¹ → [I − iP0]⁻¹ as r → ∞. (21.62)

The matrices iP0, i ∈ X, depend on the choice of the sequence εn and the subsequence εn_r. However, relations (21.59) and (21.62) imply that, for i ∈ X,

[I − iP̃εn_r]⁻¹ − [I − iPεn_r]⁻¹ → 0 as r → ∞, (21.63)

where 0 is the m × m matrix with all zero elements. Since the sequence 0 < εn → 0 was chosen arbitrarily, and the limit 0 in relation (21.63) does not depend on the choice of the sequence εn and the subsequence εn_r, this relation implies that


[I − iP̃ε]⁻¹ − [I − iPε]⁻¹ → 0 as ε → 0, (21.64)

or, equivalently, that, for j, i, k ∈ X,

Ej δ̃ε,ik − Ej δε,ik → 0 as ε → 0. (21.65)

The probabilities qε,ji, j ∈ X, satisfy, for every i ∈ X, the following system of linear equations,

qε,ji = pε,j + ∑_{k≠i} p̃ε,jk qε,ki, j ∈ X. (21.66)

System (21.66) can be rewritten, for every i ∈ X, in the following matrix form,

qε,i = pε + iP̃ε qε,i, (21.67)

where

qε,i = [qε,1i, …, qε,mi]′, pε = [pε,1, …, pε,m]′. (21.68)

Since the matrix I − iP̃ε has an inverse for every ε ∈ (0, ε0] and i ∈ X, the solution of system (21.66) has the following form, for every i ∈ X,

qε,i = [I − iP̃ε]⁻¹ pε. (21.69)

Relations (21.57) and (21.69) imply that, for i ∈ X,

qε,ii = ∑_{k∈X} Ei δ̃ε,ik pε,k. (21.70)

As is known, since the Markov chain ηε,n is ergodic, the following formula holds, for i, k ∈ X,

Ei δε,ik = πε,k / πε,i. (21.71)

Using relations (21.65), (21.70), (21.71), and proposition (i) of Lemma 1, according to which lim inf_{ε→0} πε,k > 0, we get the following inequalities, for i ∈ X,

|qε,ii πε,i / pε − 1| ≤ ∑_{k∈X} |Ei δ̃ε,ik − Ei δε,ik| · πε,i pε,k / ∑_{j∈X} πε,j pε,j
≤ ∑_{k∈X} |Ei δ̃ε,ik − Ei δε,ik| · πε,i / πε,k → 0 as ε → 0. (21.72)


Relation (21.72) implies that relation (21.47) holds. □

This follows from the lemma formulated below, which describes the asymptotic behaviour of the so-called "absorbing" probabilities, for i, j ∈ X,

Q^{(ε)}_{ij}(∞) = P_i{η_{ε,ν_ε} = j}.     (21.73)

Lemma 4. Let conditions A1–C1 hold. Then the following relation holds, for i, r ∈ X,

Q^{(ε)}_{ir}(∞) − p_ε(r)/p_ε → 0 as ε → 0.     (21.74)

Proof. Taking into account that the Markov renewal process (η_{ε,n}, κ_{ε,n}, ζ_{ε,n}) regenerates at moments of return of the component η_{ε,n} to any state i ∈ X, and that ν_ε is a Markov moment for this process, we can obtain the following cyclic representation for the absorbing probabilities Q^{(ε)}_{ir}(∞), for i, r ∈ X,

Q^{(ε)}_{ir}(∞) = Σ_{n=0}^∞ P_i{α_{ε,i}(n) < ν_ε ≤ α_{ε,i}(n+1), η_{ε,ν_ε} = r}
= Σ_{n=0}^∞ (1 − q_{ε,ii})^n q_{ε,ii}(r) = q_{ε,ii}(r) / q_{ε,ii}.     (21.75)
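The cyclic representation (21.75) says that the absorbing probabilities do not depend on how many regeneration cycles precede the rare event. A small Monte Carlo sketch illustrates this on a toy chain (the chain, the rare-event probabilities, and the rule that the recorded "type" r is the state entered at the rare jump are all illustrative assumptions); the exact values come from the standard first-step linear system, not from the chapter's formulas:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy chain and rare-event mechanism (assumptions for illustration only).
P = np.array([[0.0, 0.6, 0.4],
              [0.5, 0.0, 0.5],
              [0.3, 0.7, 0.0]])
p_eps = np.array([0.05, 0.10, 0.05])   # rare-event probability at a jump from j

# Exact absorbing probabilities by first-step analysis:
# Q[j, r] = b[j, r] + sum_k P_tilde[j, k] Q[k, r].
P_tilde = (1 - p_eps)[:, None] * P
b = p_eps[:, None] * P                 # rare event of "type r" at the next jump
Q = np.linalg.solve(np.eye(3) - P_tilde, b)

def absorb_type(i):
    """Run the chain from i until the first rare event; return its type."""
    j = i
    while True:
        k = rng.choice(3, p=P[j])
        if rng.random() < p_eps[j]:
            return k
        j = k

samples = [absorb_type(0) for _ in range(5000)]
Q_hat = np.bincount(samples, minlength=3) / len(samples)
print(Q_hat)                           # close to the exact row Q[0]
```

The rows of Q sum to one, reflecting the fact that a rare event eventually occurs with probability 1.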

Probabilities q_{ε,ji}(r), j ∈ X, satisfy, for every i, r ∈ X, the following system of linear equations,

q_{ε,ji}(r) = p_{ε,j}(r) + Σ_{k≠i} p̃_{ε,jk} q_{ε,ki}(r),  j ∈ X.     (21.76)

This system has the same matrix of coefficients _iP̃_ε as the system of linear equations (21.66) and differs from it only in the free terms. Thus, by repeating the reasoning given in the proof of Lemma 3, we can get the following formula, similar to formula (21.57), for i, r ∈ X,

q_{ε,ii}(r) = Σ_{k=1}^m E_i δ̃_{ε,ik} p_{ε,k}(r).     (21.77)

Using relations (21.65), (21.71), (21.77), proposition (i) of Lemma 1, according to which lim_{ε→0} π_{ε,k} > 0, and relation (21.47), we get the following inequalities, for i ∈ X,


| q_{ε,ii}(r)/q_{ε,ii} − p_ε(r)/(π_{ε,i} q_{ε,ii}) |
≤ Σ_{k∈X} | E_i δ̃_{ε,ik} − π_{ε,k}/π_{ε,i} | · π_{ε,i} p_{ε,k}(r) / (π_{ε,i} q_{ε,ii})
≤ Σ_{k∈X} | E_i δ̃_{ε,ik} − E_i δ_{ε,ik} | · (π_{ε,i}/π_{ε,k}) · p_ε/(π_{ε,i} q_{ε,ii}) → 0 as ε → 0.     (21.78)

Now, using relations (21.46) and (21.47) given in Lemma 3, we get, for i ∈ X,

| p_ε(r)/(π_{ε,i} q_{ε,ii}) − p_ε(r)/p_ε | = (p_ε(r)/p_ε) · | p_ε/(π_{ε,i} q_{ε,ii}) − 1 |
≤ | p_ε/(π_{ε,i} q_{ε,ii}) − 1 | → 0 as ε → 0.     (21.79)

Relations (21.75), (21.78), and (21.79) imply that relation (21.74) holds. □

Let us introduce the following balancing condition:

A2: p_ε(r)/p_ε → Q_r as ε → 0, for r ∈ X.

The constants Q_r automatically satisfy the following relations,

Q_r ≥ 0, r ∈ X,  Σ_{r∈X} Q_r = 1.     (21.80)

The following lemma is a direct corollary of Lemma 4.

Lemma 5. Let conditions A1–C1 hold. Then condition A2 is necessary and sufficient for the following relation to hold (for some or every i ∈ X, respectively, in the statements of necessity and sufficiency),

Q^{(ε)}_{ir}(∞) → Q_r as ε → 0, for r ∈ X.     (21.81)

Let us introduce the Laplace transforms, for j, i, r ∈ X,

ψ_{ε,jir}(s) = E{e^{−sξ_ε} I(η_{ε,ν_ε} = r) / η_{ε,0} = j, ν_ε ≤ α_{ε,i}(1)}, s ≥ 0,     (21.82)

and

ψ̃_{ε,jir}(s) = E_j e^{−sξ_ε} I(η_{ε,ν_ε} = r, ν_ε ≤ α_{ε,i}(1)), s ≥ 0.     (21.83)

The following lemma takes place.

Lemma 6. Let conditions A1–D1 hold. Then, for i, r ∈ X and s ≥ 0,

p_ε(r)/p_ε − ψ_{ε,iir}(s) → 0 as ε → 0.     (21.84)


Proof. Let us also introduce the Laplace transforms, for j, r ∈ X,

p̃_{ε,jr}(s) = E_j e^{−sκ_{ε,1}} I(ζ_{ε,1} = 0, η_{ε,1} = r) = φ̂_{ε,j}(s) p_{ε,j}(r), s ≥ 0,     (21.85)

and

p̃_{ε,j}(s) = E_j e^{−sκ_{ε,1}} I(ζ_{ε,1} = 1) = φ̂_{ε,j}(s) p_{ε,j}, s ≥ 0,     (21.86)

where

φ̂_{ε,j}(s) = E{e^{−sκ_{ε,1}} / η_{ε,0} = j, ζ_{ε,1} = 1}, s ≥ 0.     (21.87)

Functions ψ_{ε,jir}(s), j ∈ X, satisfy, for every s ≥ 0 and i, r ∈ X, the following system of linear equations,

ψ_{ε,jir}(s) = p̃_{ε,jr}(s) + Σ_{k≠i} p̃_{ε,jk}(s) ψ_{ε,kir}(s),  j ∈ X.     (21.88)

System (21.88) can be rewritten in the following equivalent matrix form,

ψ̃_{ε,ir}(s) = p̃_{ε,r}(s) + _iP̃_ε(s) ψ̃_{ε,ir}(s),     (21.89)

where

ψ̃_{ε,ir}(s) = [ψ̃_{ε,1ir}(s), …, ψ̃_{ε,mir}(s)]^T,  p̃_{ε,r}(s) = [p̃_{ε,1r}(s), …, p̃_{ε,mr}(s)]^T,     (21.90)

and

_iP̃_ε(s) =
[ p̃_{ε,11}(s) … p̃_{ε,1 i−1}(s) 0 p̃_{ε,1 i+1}(s) … p̃_{ε,1m}(s) ]
[     ⋮              ⋮          ⋮       ⋮               ⋮      ]
[ p̃_{ε,m1}(s) … p̃_{ε,m i−1}(s) 0 p̃_{ε,m i+1}(s) … p̃_{ε,mm}(s) ].     (21.91)

Let us also introduce the random variables δ̃_{ε,ik}(s), for s ≥ 0, i, k ∈ X,

δ̃_{ε,ik}(s) = Σ_{n=1}^{α_{ε,i}(1) ∧ ν_ε} e^{−sτ_{ε,n}} I(η_{ε,n−1} = k).     (21.92)

Obviously, 0 ≤ δ̃_{ε,ik}(s) ≤ δ̃_{ε,ik} and, therefore, for s ≥ 0 and j, i, k ∈ X,

E_j δ̃_{ε,ik}(s) ≤ E_j δ̃_{ε,ik} < ∞.     (21.93)

Moreover, the matrix _iP̃_ε^n(s) has the following form, for s ≥ 0, i ∈ X and n ≥ 1,


_iP̃_ε^n(s) = ‖ E_j e^{−sτ_{ε,n}} I(η_{ε,n} = k, α_{ε,i}(1) ∧ ν_ε > n) ‖,     (21.94)

and, therefore, there exists the inverse matrix,

[I − _iP̃_ε(s)]^{−1} = I + _iP̃_ε(s) + _iP̃_ε^2(s) + ⋯ = ‖ E_j δ̃_{ε,ik}(s) ‖.     (21.95)

Therefore, the solution of system (21.89) has the following form,

ψ̃_{ε,ir}(s) = [I − _iP̃_ε(s)]^{−1} p̃_{ε,r}(s).     (21.96)

The following part of the proof is analogous to that presented in relations (21.58)–(21.59). Let us choose a sequence ε_n → 0 as n → ∞ and a subsequence ε_{n_r} → 0 as r → ∞ in the way described in relation (21.58). This implies that det(I − _iP_0) ≠ 0, i.e., there exists the inverse matrix [I − _iP_0]^{−1}, and relation (21.59) holds.

Condition D1 implies that, for δ > 0,

1 − G_ε(δ) = Σ_{j∈X} π_{ε,j} (1 − G_{ε,j}(δ)) → 0 as ε → 0.     (21.97)

Since, according to condition B1 and Lemma 1, lim_{ε→0} π_{ε,i} > 0, for i ∈ X, relation (21.97) implies that, for j ∈ X and δ > 0,

P_j{κ_{ε,1} > δ} = 1 − G_{ε,j}(δ) → 0 as ε → 0.     (21.98)

Relation (21.98) and condition B1 imply that, for any s ≥ 0, j, k ∈ X and δ > 0,

| p_{ε,jk} − p̃_{ε,jk}(s) | ≤ | p_{ε,jk} − p̃_{ε,jk} | + | E_j (1 − e^{−sκ_{ε,1}}) I(η_{ε,1} = k, ζ_{ε,1} = 0) |
≤ p_{ε,j} + (1 − e^{−sδ}) + P_j{κ_{ε,1} > δ} → 1 − e^{−sδ} as ε → 0.     (21.99)

Due to the possibility of an arbitrary choice of δ > 0 in relation (21.99), this relation implies that, for s ≥ 0 and j, k ∈ X,

p̃_{ε,jk}(s) − p_{ε,jk} → 0 as ε → 0.     (21.100)

Relations (21.58) and (21.100) imply that, for s ≥ 0 and j, k ∈ X,

p̃_{ε_{n_r},jk}(s) → p_{0,jk} ∈ [0, 1] as r → ∞.     (21.101)


Relation (21.101) implies that det(I − _iP̃_{ε_{n_r}}(s)) → det(I − _iP_0) ≠ 0 as r → ∞, for s ≥ 0, i ∈ X. Thus, for every s ≥ 0, i ∈ X, there exists the inverse matrix [I − _iP̃_{ε_{n_r}}(s)]^{−1}, for r large enough.

Elements of the matrices [I − _iP̃_{ε_{n_r}}(s)]^{−1} and [I − _iP_0]^{−1} are continuous rational functions of the elements of the matrices I − _iP̃_{ε_{n_r}}(s) and I − _iP_0, respectively. That is why relation (21.101) implies that, for s ≥ 0, i ∈ X,

[I − _iP̃_{ε_{n_r}}(s)]^{−1} → [I − _iP_0]^{−1} as r → ∞.     (21.102)

The matrices _iP_0, i ∈ X, depend on the choice of the sequence ε_n and the subsequence ε_{n_r}. However, relations (21.59) and (21.102) imply that, for s ≥ 0, i ∈ X,

[I − _iP̃_{ε_{n_r}}(s)]^{−1} − [I − _iP_{ε_{n_r}}]^{−1} → 0 as r → ∞,     (21.103)

where 0 is a matrix with all zero elements. Since the sequence 0 < ε_n → 0 was chosen arbitrarily, and the limit 0 in relation (21.103) does not depend on the choice of the sequence ε_n and the subsequence ε_{n_r}, this relation implies that, for s ≥ 0, i ∈ X,

[I − _iP̃_ε(s)]^{−1} − [I − _iP_ε]^{−1} → 0 as ε → 0,     (21.104)

or, equivalently, that, for s ≥ 0, j, i, k ∈ X,

E_j δ̃_{ε,ik}(s) − E_j δ_{ε,ik} → 0 as ε → 0.     (21.105)

Taking into account formulas (21.85), (21.86), (21.95), and (21.96), we get, for every s ≥ 0 and i, r ∈ X,

ψ̃_{ε,iir}(s) = Σ_{k∈X} E_i δ̃_{ε,ik}(s) φ̂_{ε,k}(s) p_{ε,k}(r).     (21.106)

The imposed conditions imply that, for every s ≥ 0, k ∈ X and δ > 0,

0 ≤ lim_{ε→0} (1 − φ̂_{ε,k}(s)) ≤ 1 − e^{−sδ} + lim_{ε→0} P_k{κ_{ε,1} > δ / ζ_{ε,1} = 1} = 1 − e^{−sδ}.     (21.107)

Due to the possibility of an arbitrary choice of δ > 0 in relation (21.107), this relation implies that, for s ≥ 0 and k ∈ X,

φ̂_{ε,k}(s) → 1 as ε → 0.     (21.108)

Obviously, for i, r ∈ X and s ≥ 0,


ψ_{ε,iir}(s) = ψ̃_{ε,iir}(s) / q_{ε,ii}.     (21.109)

Lemma 3, proposition (i) of Lemma 1, according to which lim_{ε→0} π_{ε,k} > 0, and relations (21.71), (21.105), (21.108) imply that, for every s ≥ 0 and i, r ∈ X,

| ψ_{ε,iir}(s) − p_ε(r)/p_ε |
≤ | p_ε(r)/(π_{ε,i} q_{ε,ii}) − p_ε(r)/p_ε | + Σ_{k∈X} | E_i δ̃_{ε,ik}(s) φ̂_{ε,k}(s) − π_{ε,k}/π_{ε,i} | · p_{ε,k}(r)/q_{ε,ii}
≤ | p_ε(r)/(π_{ε,i} q_{ε,ii}) − p_ε(r)/p_ε | + Σ_{k∈X} | E_i δ̃_{ε,ik}(s) φ̂_{ε,k}(s) − π_{ε,k}/π_{ε,i} | · (π_{ε,i}/π_{ε,k}) · p_ε/(π_{ε,i} q_{ε,ii})
→ 0 as ε → 0.     (21.110)

The proof is complete. □

The following lemma is a direct corollary of Lemma 6.

Lemma 7. Let conditions A1–D1 hold. Then condition A2 is necessary and sufficient for the following relation to hold (for some or every i ∈ X, respectively, in the statements of necessity and sufficiency),

ψ_{ε,iir}(s) → Q_r as ε → 0, for s ≥ 0, r ∈ X.     (21.111)

21.4.2 Necessary and Sufficient Conditions of Convergence for Markov Renewal Processes Generated by Flows of Rare Events

The following theorem shows that the first-rare-event times ξ_ε and the random functional η_{ε,ν_ε} are asymptotically independent, and completes the description of the asymptotic behaviour of the transition probabilities Q^{(ε)}_{ij}(t) for the Markov renewal process (η_ε(k), κ_ε(k)).

Theorem 3. Let conditions A1, B1 and C1 hold. Then:

(i) Conditions D1 and A2 are necessary and sufficient for holding (for some or every initial distributions q̄_ε, respectively, in the statements of necessity and sufficiency) the asymptotic relation (η_ε(1), ξ_ε(1)) →d (η_0(1), ξ_0(1)) as ε → 0, where (η_0(1), ξ_0(1)) is a random vector which takes values in X × [0, ∞) and has the joint distribution function P{η_0(1) = r, ξ_0(1) ≤ u} = F(r, u), r ∈ X, u ≥ 0, such that the distribution function F(u) = Σ_{r∈X} F(r, u) is not concentrated at zero.


(ii) In this case, F(r, u) = Q_r F(u), r ∈ X, u ≥ 0, where F(u) is the distribution function with Laplace transform 1/(1 + A(s)) appearing in condition D1, and Q_r, r ∈ X, are the limits appearing in condition A2.

(iii) Moreover, for any r_k ∈ X, k = 1, …, n, points u_k ≥ 0, k = 1, …, n, which are points of continuity for the distribution function F(u), and any i ∈ X, n = 1, 2, …,

P_i{η_ε(k) = r_k, ξ_ε(k) ≤ u_k, k = 1, …, n} → Q(r_k, u_k, k = 1, …, n) = Π_{k=1}^n Q_{r_k} F(u_k) as ε → 0.     (21.112)

Proof. The statement of necessity in proposition (i) of Theorem 3 follows from Theorem 1 and Lemma 5, since the asymptotic relation appearing in this proposition implies that ξ_ε(1) →d ξ_0(1) as ε → 0, where ξ_0(1) is a non-negative random variable with the distribution function P{ξ_0(1) ≤ u} = F(u) = Σ_{r∈X} F(u, r), u ≥ 0, and that η_ε(1) →d η_0(1) as ε → 0, where η_0(1) is a random variable taking values in the space X such that P{η_0(1) = r} = Q_r = F(∞, r), r ∈ X.

Let us prove that conditions A1–C1 and D1, A2 imply that the asymptotic relations appearing in propositions (i) and (ii) of Theorem 3 hold. Let us introduce the random variables, for i ∈ X,

β_{ε,i}(k) = Σ_{n=α_{ε,i}(k−1)+1}^{α_{ε,i}(k)} κ_{ε,n},  k = 1, 2, …,     (21.113)

and the Laplace transforms, for j, i ∈ X,

ψ_{ε,ji}(s) = E{e^{−sβ_{ε,i}(1)} / η_{ε,0} = j, ν_ε > α_{ε,i}(1)}, s ≥ 0.     (21.114)

Analogously to relation (21.75), the following representation can be written down, for s ≥ 0, i ∈ X,

E_i exp{−s Σ_{k=1}^{ν_{ε,i}} β_{ε,i}(k)}
= Σ_{n=0}^∞ E_i exp{−s Σ_{k=1}^n β_{ε,i}(k)} I(α_{ε,i}(n) < ν_ε ≤ α_{ε,i}(n+1))
= Σ_{n=0}^∞ (1 − q_{ε,ii})^n ψ_{ε,ii}(s)^n q_{ε,ii}
= q_{ε,ii} / (1 − (1 − q_{ε,ii}) ψ_{ε,ii}(s)).     (21.115)
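The geometric-compounding identity in (21.115) can be checked numerically. In the sketch below (all parameters are illustrative assumptions), each regeneration cycle has an Exp(λ) duration, so its Laplace transform is ψ(s) = λ/(λ + s), and the number of complete cycles before the rare event is geometric:

```python
import numpy as np

rng = np.random.default_rng(1)

lam, q, s = 2.0, 0.1, 0.7              # assumed toy parameters
psi = lam / (lam + s)                  # Laplace transform of one Exp(lam) cycle
exact = q / (1 - (1 - q) * psi)        # right-hand side of (21.115)

# Monte Carlo: number of complete cycles ~ Geometric(q) on {0, 1, ...},
# total duration = sum of that many Exp(lam) cycle lengths.
n_cycles = rng.geometric(q, size=20_000) - 1
totals = np.array([rng.exponential(1 / lam, n).sum() for n in n_cycles])
mc = np.exp(-s * totals).mean()
print(mc, exact)                       # the two values should nearly agree
```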


Let us also introduce the random variables, for i ∈ X,

β_{ε,i} = Σ_{n=α_{ε,i}(ν_{ε,i})+1}^{ν_ε} κ_{ε,n},     (21.116)

where

ν_{ε,i} = max{n ≥ 0 : α_{ε,i}(n) ≤ ν_ε} = μ_{ε,i}(ν_ε).     (21.117)

Analogously to relation (21.75), the following representation can be written down, for s ≥ 0, i, r ∈ X,

E_i exp{−sβ_{ε,i}} I(η_{ε,ν_ε} = r)
= Σ_{n=0}^∞ E_i {exp{−s Σ_{k=α_{ε,i}(n)+1}^{ν_ε} κ_{ε,k}} I(η_{ε,ν_ε} = r) / α_{ε,i}(n) < ν_ε ≤ α_{ε,i}(n+1)} P{α_{ε,i}(n) < ν_ε ≤ α_{ε,i}(n+1)}
= Σ_{n=0}^∞ ψ_{ε,iir}(s) (1 − q_{ε,ii})^n q_{ε,ii} = ψ_{ε,iir}(s).     (21.118)

Finally, let us introduce the Laplace transforms, for i, r ∈ X,

Ψ_{ε,ir}(s) = E_i e^{−sξ_ε} I(η_{ε,ν_ε} = r), s ≥ 0.     (21.119)

Analogously to relation (21.75), the following representation can be written down, for s ≥ 0, i, r ∈ X,

Ψ_{ε,ir}(s) = E_i exp{−s(Σ_{k=1}^{ν_{ε,i}} β_{ε,i}(k) + β_{ε,i})} I(η_{ε,ν_ε} = r)
= Σ_{n=0}^∞ E_i exp{−s Σ_{k=1}^n β_{ε,i}(k)} exp{−sβ_{ε,i}} I(η_{ε,ν_ε} = r, α_{ε,i}(n) < ν_ε ≤ α_{ε,i}(n+1))
= Σ_{n=0}^∞ (1 − q_{ε,ii})^n ψ_{ε,ii}(s)^n q_{ε,ii} ψ_{ε,iir}(s)
= (q_{ε,ii} / (1 − (1 − q_{ε,ii}) ψ_{ε,ii}(s))) · ψ_{ε,iir}(s).     (21.120)

Also, for s ≥ 0, i ∈ X,


E_i exp{−s Σ_{n=1}^{ν_{ε,i}} β_{ε,i}(n)} = E_i exp{−s Σ_{n=1}^{α_{ε,i}(ν_{ε,i})} κ_{ε,n}}.     (21.121)

The random variable α_{ε,i}(ν_{ε,i}) can be represented, for i ∈ X, in the following form,

α_{ε,i}(ν_{ε,i}) = α_{ε,i}(μ_{ε,i}(ν_ε)).     (21.122)

Using Lemma 1 and applying Theorem 3.2.1 from [72], we get the following asymptotic relation, for i ∈ X and the normalisation function u_ε = p_ε^{−1},

α_{ε,i}(μ_{ε,i}([t p_ε^{−1}])) / p_ε^{−1} = α*_{ε,i}(μ*_{ε,i}(t)), t ≥ 0 −U→ t, t ≥ 0 as ε → 0,     (21.123)

or, equivalently,

α_{ε,i}(μ_{ε,i}([t p_ε^{−1}])) / p_ε^{−1} − t = α*_{ε,i}(μ*_{ε,i}(t)) − t, t ≥ 0 −U→ 0(t), t ≥ 0 as ε → 0,     (21.124)

where 0(t) ≡ 0, t ≥ 0.

Lemma 2, the Slutsky theorem (see, for example, Theorem 1.2.3 in [72]) and relation (21.124) imply that the following asymptotic relation holds, for i ∈ X,

(α*_{ε,i}(μ*_{ε,i}(ν*_ε)), κ_ε(t)) = (α*_{ε,i}(μ*_{ε,i}(ν*_ε)) − ν*_ε + ν*_ε, κ_ε(t)), t ≥ 0 −d→ (ν_0, θ_0(t)), t ≥ 0 as ε → 0.     (21.125)

The asymptotic relation given in proposition (iii) of Theorem 1 and relation (21.125) let us apply Theorem 3.4.1 from [72] to the stochastic processes κ_ε(t), t ≥ 0, randomly stopped at the moment ν*_ε = p_ε^{−1} ν_ε, which yields the following relation,

κ_ε(α*_{ε,i}(μ*_{ε,i}(ν*_ε))) −d→ ξ_0 = θ_0(ν_0) as ε → 0,     (21.126)

where ξ_0 = θ_0(ν_0) is the random variable appearing in condition D1.

Lemma 7 and relations (21.120), (21.121) and (21.126) imply that the following relation holds, for i, r ∈ X and s ≥ 0,

Ψ_{ε,ir}(s) = (1 / (1 + (1 − q_{ε,ii})(1 − ψ_{ε,ii}(s))/q_{ε,ii})) · ψ_{ε,iir}(s)
→ (1 / (1 + A(s))) · Q_r as ε → 0.     (21.127)

Relation (21.127) is equivalent to the sufficiency statement of proposition (i) and to proposition (ii) of Theorem 3.


Let conditions A1–D1 and A2 hold. Obviously, the random vector (η_ε(1), κ_ε(1)) has the following conditional distribution, for i, r ∈ X, u ≥ 0,

Q^{(ε)}_{ir}(u) = P_i{η_ε(1) = r, κ_ε(1) ≤ u} = P_i{η_{ε,ν_ε} = r, ξ_ε ≤ u}.     (21.128)

Using the Markov property of the Markov renewal process (η_ε(k), κ_ε(k)), we get the following formula for the joint conditional distribution function of the random variables η_ε(k), κ_ε(k), k = 1, …, n, for i_k ∈ X, u_k ≥ 0, k = 1, …, n, and i_0 ∈ X, n = 1, 2, …,

P_{i_0}{η_ε(k) = i_k, κ_ε(k) ≤ u_k, k = 1, …, n}
= P_{i_0}{η_ε(k) = i_k, κ_ε(k) ≤ u_k, k = 1, …, n − 1} Q^{(ε)}_{i_{n−1}, i_n}(u_n)
= ⋯ = Π_{k=1}^n Q^{(ε)}_{i_{k−1}, i_k}(u_k).     (21.129)

According to propositions (i) and (ii) of Theorem 3, conditions A1–D1 and A2 imply that, for i, r ∈ X,

Q^{(ε)}_{ir}(·) ⇒ F(·) Q_r as ε → 0,     (21.130)

where F(u) is the distribution function on [0, ∞) not concentrated at zero appearing in condition D1 and Q_r, r ∈ X, are the limits appearing in condition A2.

Using relations (21.129) and (21.130), we get that, under conditions A1–D1 and A2, the following convergence relation holds for the joint conditional distribution function of the random variables η_ε(k), κ_ε(k), k = 1, …, n, for all r_k ∈ X, k = 1, …, n, and u_k ≥ 0, k = 1, …, n, which are points of continuity for the distribution function F(u), and i ∈ X, n = 1, 2, …,

P_i{η_ε(k) = r_k, κ_ε(k) ≤ u_k, k = 1, …, n}
→ P{η(k) = r_k, κ(k) ≤ u_k, k = 1, …, n} = Q(r_k, u_k, k = 1, …, n) as ε → 0.     (21.131)

Relation (21.131) completes the proof of proposition (iii) of Theorem 3. □


21.5 Vector Counting Processes Generated by Flows of Rare Events

In this section, we present necessary and sufficient conditions for convergence in distribution and in the Skorokhod J-topology for more general vector counting processes generated by flows of rare events for perturbed semi-Markov processes.

21.5.1 Vector Counting Process for Rare Events

Let us also consider the vector counting process M̄_ε(t) = (M_{ε,r}(t), r ∈ X), t ≥ 0, where, for r ∈ X,

M_{ε,r}(t) = Σ_{k=1}^{[t]} I(η_ε(k) = r), t ≥ 0,     (21.132)

and the vector counting process N̄_ε(t) = (N_{ε,r}(t), r ∈ X), t ≥ 0, where, for r ∈ X,

N_{ε,r}(t) = M_{ε,r}(N_ε(t)), t ≥ 0.     (21.133)

Obviously, the following relation takes place,

N_ε(t) = Σ_{r∈X} N_{ε,r}(t), t ≥ 0.     (21.134)

Let η(k), κ(k), k = 1, 2, …, be mutually independent random variables such that the random variables η(k), k = 1, 2, …, take value r with probability Q_r, for r ∈ X, while κ(k), k = 1, 2, …, are non-negative random variables with distribution function F(u) not concentrated at zero. Let us also define the random variables ξ(n) = Σ_{k=1}^n κ(k), n = 1, 2, ….

Let us define the vector frequency process M̄(t) = (M_r(t), r ∈ X), t ≥ 0, where, for r ∈ X,

M_r(t) = Σ_{k=1}^{[t]} I(η(k) = r), t ≥ 0,     (21.135)

and the counting process N̄(t) = (N_r(t), r ∈ X), t ≥ 0, where

N_r(t) = M_r(N(t)), t ≥ 0.     (21.136)

Obviously, the following relation takes place,

N(t) = Σ_{r∈X} N_r(t), t ≥ 0.     (21.137)


Since N(t), t ≥ 0, and M_r(t), t ≥ 0, r ∈ X, are integer-valued, non-negative, step-wise, non-decreasing, càdlàg processes, M_r(N(t)), t ≥ 0, r ∈ X, also are integer-valued, non-negative, step-wise, non-decreasing, càdlàg processes.

Note also that the processes N(t) and M̄(N(t)) have the same set N of stochastic continuity points t > 0.

That is why the role of the finite-dimensional distributions for the process N(t), t ≥ 0, is played by the following probabilities (which can be expressed via the joint distributions of the sums ξ(n) = Σ_{k=1}^n κ(k) of i.i.d. random variables κ(k), k = 1, 2, …, with distribution F(·)), for 0 ≤ t_1 ≤ t_2 ≤ ⋯ and integers 0 ≤ m[t_1] ≤ m[t_2] ≤ ⋯,

P{N(t_1) = m[t_1]} = P{N(t_1) ≥ m[t_1]} − P{N(t_1) ≥ m[t_1] + 1}
= P{ξ(m[t_1]) ≤ t_1} − P{ξ(m[t_1] + 1) ≤ t_1},

P{N(t_l) = m[t_l], l = 1, 2} = P{N(t_1) = m[t_1], N(t_2) ≥ m[t_2]} − P{N(t_1) = m[t_1], N(t_2) ≥ m[t_2] + 1}
= P{N(t_1) ≥ m[t_1], N(t_2) ≥ m[t_2]} − P{N(t_1) ≥ m[t_1] + 1, N(t_2) ≥ m[t_2]}
− P{N(t_1) ≥ m[t_1], N(t_2) ≥ m[t_2] + 1} + P{N(t_1) ≥ m[t_1] + 1, N(t_2) ≥ m[t_2] + 1}
= P{ξ(m[t_1]) ≤ t_1, ξ(m[t_2]) ≤ t_2} − P{ξ(m[t_1] + 1) ≤ t_1, ξ(m[t_2]) ≤ t_2}
− P{ξ(m[t_1]) ≤ t_1, ξ(m[t_2] + 1) ≤ t_2} + P{ξ(m[t_1] + 1) ≤ t_1, ξ(m[t_2] + 1) ≤ t_2}, etc.     (21.138)
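The inversion formulas in (21.138) rest on the pathwise identity {N(t) ≥ m} = {ξ(m) ≤ t}. A short sketch checks this (the choice F = Exp(1) for the inter-event distribution is an illustrative assumption, under which N(t) is Poisson(t)):

```python
import numpy as np
from math import exp

rng = np.random.default_rng(2)

t, m = 5.0, 4
kappa = rng.exponential(1.0, size=(100_000, 30))   # kappa(k) ~ F = Exp(1)
xi = kappa.cumsum(axis=1)                          # xi(n) = kappa(1)+...+kappa(n)
N_t = (xi <= t).sum(axis=1)                        # N(t) = max{n : xi(n) <= t}

lhs = (N_t >= m).mean()                            # P{N(t) >= m}
rhs = (xi[:, m - 1] <= t).mean()                   # P{xi(m) <= t}

# For Exp(1) inter-event times, N(t) ~ Poisson(t), so both probabilities
# equal 1 - P{Poisson(t) <= m - 1}.
exact = 1 - exp(-t) * (1 + t + t**2 / 2 + t**3 / 6)
print(lhs, rhs, exact)
```

The two empirical probabilities coincide path by path, which is exactly what makes the algebra in (21.138) work.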

Let G(k, l̄_k, p̄) be the multinomial probabilities defined by the following formula, for vectors p̄ = (p_r, r ∈ X) with components p_r ≥ 0, r ∈ X, such that Σ_{r∈X} p_r = 1, and vectors l̄_k = (l_{k,r}, r ∈ X) with non-negative integer components l_{k,r}, r ∈ X, such that Σ_{r∈X} l_{k,r} = k, for k = 1, 2, …,

G(k, l̄_k, p̄) = (k! / Π_{r∈X} l_{k,r}!) Π_{r∈X} p_r^{l_{k,r}}.     (21.139)
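Formula (21.139) is the ordinary multinomial probability mass function; it can be evaluated directly (the probability vector p below is an illustrative assumption):

```python
from math import factorial, prod

def G(l, p):
    """Multinomial probability (21.139) for counts l = (l_r) and probabilities p = (p_r)."""
    k = sum(l)
    coeff = factorial(k) // prod(factorial(x) for x in l)
    return coeff * prod(pr ** lr for pr, lr in zip(p, l))

p = (0.5, 0.3, 0.2)
print(G((2, 1, 0), p))   # 3!/(2!1!0!) * 0.5^2 * 0.3^1 * 0.2^0 = 0.225
```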

The process M̄(t), t ≥ 0, is a vector process with independent increments, whose increments M̄(t) − M̄(s) have multinomial joint distributions of components, i.e., for the vector Q̄ = (Q_r, r ∈ X) and vectors l̄_{m[s,t]}, where m[s, t] = [t] − [s], and 0 ≤ s ≤ t < ∞,

P{M̄(t) − M̄(s) = l̄_{m[s,t]}} = G(m[s, t], l̄_{m[s,t]}, Q̄).     (21.140)

Note also that the counting process N(t), t ≥ 0, and the process M̄(t), t ≥ 0, are independent.


That is why the finite-dimensional distributions for the vector process N̄(t), t ≥ 0, take the following form, for 0 ≤ t_1 ≤ t_2 ≤ ⋯, integers 0 ≤ m[t_1] ≤ m[t_2] ≤ ⋯ < ∞ and n = 1, 2, …,

P{N̄(t_k) = l̄_{m[t_k]}, k = 1, …, n} = P{N(t_k) = m[t_k], k = 1, …, n} × P{M̄(m[t_k]) = l̄_{m[t_k]}, k = 1, …, n}.     (21.141)

Here, relation (21.137) is taken into account.
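The factorization (21.141) reflects the construction of N̄(t) from independent ingredients. The following sketch simulates the limiting process under illustrative assumptions (F = Exp(1), so N(t) is Poisson(t), and Q = (0.5, 0.3, 0.2)); with these assumptions the split counts N_r(t) are Poisson(t·Q_r) by Poisson thinning:

```python
import numpy as np

rng = np.random.default_rng(3)

Q = np.array([0.5, 0.3, 0.2])          # assumed limiting type probabilities Q_r
t, n_paths, n_steps = 4.0, 50_000, 40

kappa = rng.exponential(1.0, size=(n_paths, n_steps))   # kappa(k) ~ F = Exp(1)
eta = rng.choice(3, size=(n_paths, n_steps), p=Q)       # eta(k) ~ Q, independent
N_t = (kappa.cumsum(axis=1) <= t).sum(axis=1)           # N(t)

# N_r(t) = M_r(N(t)): number of marks r among the first N(t) events.
first = np.arange(n_steps) < N_t[:, None]
N_r = np.stack([(first & (eta == r)).sum(axis=1) for r in range(3)], axis=1)

print(N_r.mean(axis=0))                # close to t * Q = [2.0, 1.2, 0.8]
```

Relation (21.137) holds pathwise here: the three components always sum to N(t).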

21.5.2 Necessary and Sufficient Conditions of Convergence for Vector Counting Processes Generated by Flows of Rare Events

The following theorem generalises Theorem 2 and describes the asymptotic behaviour of the vector counting processes N̄_ε(t).

Theorem 4. Let conditions A1, B1 and C1 hold. Then:

(i) Conditions D1 and A2 are necessary and sufficient for holding (for some or any initial distributions q̄_ε, respectively, in the statements of necessity and sufficiency) the asymptotic relation N̄_ε(t), t ∈ N −d→ N̄(t), t ∈ N as ε → 0, where N̄(t), t ≥ 0, is a stochastic process with integer-valued, non-negative, non-decreasing, step-wise, càdlàg components such that P{N(t) = Σ_{r∈X} N_r(t) ≥ 1} = F(t) is a distribution function on [0, ∞) not concentrated at zero, and N is the set of stochastic continuity points t > 0 for this process.

(ii) In this case, N̄(t), t ≥ 0, is the process defined by relations (21.29), (21.135), and (21.136) via a sequence of independent random variables η(k), κ(k), k = 1, 2, …, where κ(k), k = 1, 2, …, are non-negative random variables with distribution function F(u) not concentrated at zero and Laplace transform φ(s) = ∫_0^∞ e^{−st} F(dt) = 1/(1 + A(s)), where A(s) is the cumulant of the infinitely divisible distribution defined in condition D1, and η(k), k = 1, 2, …, are random variables taking values r ∈ X with the probabilities Q_r, r ∈ X, appearing in condition A2; N is the set of stochastic continuity points t > 0 for the process N̄(t).

Proof. Relation (21.112) given in proposition (iii) of Theorem 3 implies that, under conditions A1–D1 and A2, the following convergence relation holds for all vectors l̄_k = (l_{k,r}, r ∈ X), k = 1, …, n, and u_k ≥ 0, k = 1, …, n, which are points of continuity for the distribution function F(u), and i ∈ X, n = 1, 2, …,

P_i{M̄_ε(k) = l̄_k, ξ_ε(k) ≤ u_k, k = 1, …, n}
→ P{M̄(k) = l̄_k, k = 1, …, n} × P{ξ(k) ≤ u_k, k = 1, …, n} as ε → 0.     (21.142)


Relation (21.35) implies in an obvious way that, under conditions A1–D1, the following convergence relation holds for 0 < t_1 ≤ ⋯ ≤ t_n < ∞, which are points of continuity for the corresponding limiting distribution functions (of the random variables ξ(r_k), k = 1, …, n), integers 0 ≤ m[1] ≤ ⋯ ≤ m[n] < ∞, 0 ≤ r_1 ≤ ⋯ ≤ r_n < ∞, and i ∈ X, n = 1, 2, …,

P_i{M̄_ε(m[k]) = l̄_{m[k]}, N_ε(t_k) ≥ r_k, k = 1, …, n}
= P_i{M̄_ε(m[k]) = l̄_{m[k]}, ξ_ε(r_k) ≤ t_k, k = 1, …, n}
→ P{M̄(m[k]) = l̄_{m[k]}, k = 1, …, n} P{ξ(r_k) ≤ t_k, k = 1, …, n}
= P{M̄(m[k]) = l̄_{m[k]}, k = 1, …, n} P{N(t_k) ≥ r_k, k = 1, …, n} as ε → 0.     (21.143)

The following relations, analogous to (21.138), take place, for 0 < t_1 ≤ ⋯ ≤ t_n < ∞, integers 0 ≤ m[1] ≤ m[2] ≤ ⋯ < ∞, 0 ≤ m[t_1] ≤ m[t_2] ≤ ⋯, and i ∈ X, n = 1, 2, …,

P_i{M̄_ε(m[1]) = l̄_{m[1]}, N_ε(t_1) = m[t_1]}
= P_i{M̄_ε(m[1]) = l̄_{m[1]}, N_ε(t_1) ≥ m[t_1]} − P_i{M̄_ε(m[1]) = l̄_{m[1]}, N_ε(t_1) ≥ m[t_1] + 1}
= P_i{M̄_ε(m[1]) = l̄_{m[1]}, ξ_ε(m[t_1]) ≤ t_1} − P_i{M̄_ε(m[1]) = l̄_{m[1]}, ξ_ε(m[t_1] + 1) ≤ t_1},

P_i{M̄_ε(m[l]) = l̄_{m[l]}, N_ε(t_l) = m[t_l], l = 1, 2}
= P_i{M̄_ε(m[l]) = l̄_{m[l]}, l = 1, 2, N_ε(t_1) = m[t_1], N_ε(t_2) ≥ m[t_2]}
− P_i{M̄_ε(m[l]) = l̄_{m[l]}, l = 1, 2, N_ε(t_1) = m[t_1], N_ε(t_2) ≥ m[t_2] + 1}
= P_i{M̄_ε(m[l]) = l̄_{m[l]}, l = 1, 2, N_ε(t_1) ≥ m[t_1], N_ε(t_2) ≥ m[t_2]}
− P_i{M̄_ε(m[l]) = l̄_{m[l]}, l = 1, 2, N_ε(t_1) ≥ m[t_1] + 1, N_ε(t_2) ≥ m[t_2]}
− P_i{M̄_ε(m[l]) = l̄_{m[l]}, l = 1, 2, N_ε(t_1) ≥ m[t_1], N_ε(t_2) ≥ m[t_2] + 1}
+ P_i{M̄_ε(m[l]) = l̄_{m[l]}, l = 1, 2, N_ε(t_1) ≥ m[t_1] + 1, N_ε(t_2) ≥ m[t_2] + 1}
= P_i{M̄_ε(m[l]) = l̄_{m[l]}, l = 1, 2, ξ_ε(m[t_1]) ≤ t_1, ξ_ε(m[t_2]) ≤ t_2}
− P_i{M̄_ε(m[l]) = l̄_{m[l]}, l = 1, 2, ξ_ε(m[t_1] + 1) ≤ t_1, ξ_ε(m[t_2]) ≤ t_2}
− P_i{M̄_ε(m[l]) = l̄_{m[l]}, l = 1, 2, ξ_ε(m[t_1]) ≤ t_1, ξ_ε(m[t_2] + 1) ≤ t_2}
+ P_i{M̄_ε(m[l]) = l̄_{m[l]}, l = 1, 2, ξ_ε(m[t_1] + 1) ≤ t_1, ξ_ε(m[t_2] + 1) ≤ t_2}, etc.     (21.144)


Relations (21.143) and (21.144) imply that, for 0 < t_1 ≤ ⋯ ≤ t_n < ∞, which are points of continuity for the corresponding limiting distribution functions (of the random variables ξ(r_k), k = 1, …, n), integers 0 ≤ m[1] ≤ ⋯ ≤ m[n] < ∞, 0 ≤ m[t_1] ≤ ⋯ ≤ m[t_n] ≤ ⋯, and i ∈ X, n = 1, 2, …,

P_i{M̄_ε(m[l]) = l̄_{m[l]}, N_ε(t_l) = m[t_l], l = 1, …, n}
→ P{M̄(m[l]) = l̄_{m[l]}, l = 1, …, n} × P{N(t_l) = m[t_l], l = 1, …, n} as ε → 0.     (21.145)

Finally, we get that, under conditions A1–D1 and A2, relations (21.134) and (21.145) imply that, for 0 < t_1 ≤ ⋯ ≤ t_n < ∞, which are points of continuity for the corresponding limiting distribution functions (of the random variables ξ(r_k), k = 1, …, n), integers 0 ≤ m[t_1] ≤ m[t_2] ≤ ⋯ < ∞ and i ∈ X, n = 1, 2, …,

P_i{N̄_ε(t_k) = l̄_{m[t_k]}, k = 1, …, n} = P_i{N_ε(t_k) = m[t_k], M̄_ε(m[t_k]) = l̄_{m[t_k]}, k = 1, …, n}
→ P{N(t_k) = m[t_k], k = 1, …, n} × P{M̄(m[t_k]) = l̄_{m[t_k]}, k = 1, …, n}
= P{N̄(t_k) = l̄_{m[t_k]}, k = 1, …, n} as ε → 0.     (21.146)

Relation (21.146), in an obvious way, implies that the following asymptotic relation holds, for any initial distributions q̄_ε,

N̄_ε(t), t ∈ K −d→ N̄(t), t ∈ K as ε → 0.     (21.147)

Since N̄(t), t ≥ 0, is a càdlàg process with non-decreasing components, and the set K ⊆ N is at most countable and dense in the interval [0, ∞), relation (21.147) can, in the standard way, be extended to the asymptotic relation given in proposition (i), with the limiting counting process N̄(t) described in proposition (ii) of Theorem 4.

Let us now assume that conditions A1–C1 hold and, for some initial distributions q̄_ε, the following asymptotic relation takes place,

N̄_ε(t), t ∈ N −d→ N̄(t), t ∈ N as ε → 0,     (21.148)

where N̄(t) is some stochastic process described in proposition (i) of Theorem 4, and N is its set of stochastic continuity points t > 0. Let us prove that, in this case, conditions D1 and A2 hold.

Relation (21.148) implies that, for the above initial distributions q̄_ε, the following relation holds, for t ∈ N,


N_ε(t) = Σ_{r∈X} N_{ε,r}(t), t ∈ N −d→ N(t) = Σ_{r∈X} N_r(t), t ∈ N as ε → 0.     (21.149)

Obviously, N(t) is an integer-valued, non-negative, non-decreasing, step-wise, càdlàg stochastic process such that P{N(t) ≥ 1} = F(t) is a distribution function on [0, ∞) not concentrated at zero, and N is the set of stochastic continuity points t > 0 for this process. Therefore, according to proposition (i) of Theorem 2, condition D1 holds, and, according to proposition (ii) of Theorem 2, F(u) is the distribution function with Laplace transform 1/(1 + A(s)) appearing in condition D1.

Since the functions p_ε(r)/p_ε ∈ [0, 1] for ε ∈ (0, 1], r ∈ X, any sequence 0 < ε_n → 0 as n → ∞ contains a subsequence 0 < ε_{n_l} → 0 as l → ∞ such that the following relation holds,

p_{ε_{n_l}}(r)/p_{ε_{n_l}} → Q_r as l → ∞, for r ∈ X.     (21.150)

Obviously, conditions A1–C1 hold for the Markov renewal processes (ξ_{ε_{n_l}}(k), η_{ε_{n_l}}(k)). According to the above remarks, condition D1 also holds for these processes. Relation (21.150) plays the role of condition A2 for the Markov renewal processes (ξ_{ε_{n_l}}(k), η_{ε_{n_l}}(k)).

Therefore, proposition (ii) of Theorem 4 takes place for the Markov renewal processes (ξ_{ε_{n_l}}(k), η_{ε_{n_l}}(k)). According to this proposition, the following asymptotic relation takes place for t ∈ N, vectors l̄_k = (l_{k,r}, r ∈ X) with non-negative integer components l_{k,r}, r ∈ X, such that Σ_{r∈X} l_{k,r} = k, k = 0, 1, …, and i ∈ X,

P_i{N̄_{ε_{n_l}}(t) = l̄_k} = P_i{N_{ε_{n_l}}(t) = k, M̄_{ε_{n_l}}(k) = l̄_k}
→ P{N̄(t) = l̄_k} = P{N(t) = k} P{M̄(k) = l̄_k} = P{N(t) = k} G(k, l̄_k, Q̄) as l → ∞.     (21.151)

Note that, according to relations (21.148) and (21.149), the probabilities P{N̄(t) = l̄_k} and P{N(t) = k} do not depend on the choice of the sequence ε_n and the subsequence ε_{n_l}, while the vector Q̄ = (Q_r, r ∈ X) and, thus, the probabilities G(k, l̄_k, Q̄) may depend on this choice.

Obviously, N(t) −P→ ∞ as t → ∞. That is why P{N(t) ≥ 1} = Σ_{k≥1} P{N(t) = k} > 0 for t large enough. Therefore, there exists t ∈ N such that P{N(t) = k} > 0 for some k ≥ 1.

Let us choose an arbitrary r ∈ X and, then, the vector l̄_k = (k I(i = r), i ∈ X). In this case, the equality in relation (21.151) takes the following form,

P{N̄(t) = l̄_k} = P{N(t) = k} Q_r^k.     (21.152)


This equality implies that the probability Q_r is the same for any sequence ε_n and subsequence ε_{n_l} chosen in such a way that the asymptotic relation (21.150) holds. This is so for any r ∈ X. Thus, the following asymptotic relation takes place,

p_ε(r)/p_ε → Q_r as ε → 0, for r ∈ X,     (21.153)

where Q_r ≥ 0, r ∈ X, and Σ_{r∈X} Q_r = 1. Therefore, not only condition D1 but also condition A2 holds. □

Remark 10. In the case where condition D1 is modified in the way described in Remark 4, N_r(0) = 0 with probability 1, for r ∈ X, and the point 0 can be additionally included in the set N.

Remark 11. In the case described in Remark 10, relation (21.33) implies (by Theorem 4.4.1 from [72]) that, under conditions A1–D1 and A2, the more general asymptotic relation N̄_ε(t), t ≥ 0 −J→ N̄(t), t ≥ 0 as ε → 0, takes place.

References

1. Aldous, D.J.: Markov chains with almost exponential hitting times. Stoch. Process. Appl. 13, 305–310 (1982)
2. Alimov, D., Shurenkov, V.M.: Markov renewal theorems in triangular array model. Ukr. Mat. Zh. 42, 1443–1448 (English translation in Ukr. Math. J. 42, 1283–1288) (1990)
3. Alimov, D., Shurenkov, V.M.: Asymptotic behavior of terminating Markov processes that are close to ergodic. Ukr. Mat. Zh. 42, 1701–1703 (English translation in Ukr. Math. J. 42, 1535–1538) (1990)
4. Anisimov, V.V.: Limit theorems for sums of random variables on a Markov chain, connected with the exit from a set that forms a single class in the limit. Teor. Veroyatn. Mat. Stat. 4, 3–17 (English translation in Theory Probab. Math. Stat. 4, 1–13) (1971)
5. Anisimov, V.V.: Limit theorems for sums of random variables in array of sequences defined on a subset of states of a Markov chain up to the exit time. Teor. Veroyatn. Mat. Stat. 4, 18–26 (English translation in Theory Probab. Math. Stat. 4, 15–22) (1971)
6. Anisimov, V.V.: Random Processes with Discrete Components, p. 183. Vysshaya Shkola and Izdatel'stvo Kievskogo Universiteta, Kiev (1988)
7. Anisimov, V.V.: Switching Processes in Queueing Models. Applied Stochastic Methods Series, p. 345. ISTE, London and Wiley, Hoboken, NJ (2008)
8. Asmussen, S.: Busy period analysis, rare events and transient behavior in fluid flow models. J. Appl. Math. Stoch. Anal. 7(3), 269–299 (1994)
9. Asmussen, S.: Applied Probability and Queues. Second edition. Stochastic Modelling and Applied Probability, vol. 51, xii+438 pp. Springer, New York (2003)
10. Asmussen, S., Albrecher, H.: Ruin Probabilities. Second edition. Adv. Ser. Stat. Sci. Appl. Probab., vol. 14, xviii+602 pp. World Scientific, Hackensack, NJ (2010)
11. Avrachenkov, K.E., Filar, J.A., Howlett, P.G.: Analytic Perturbation Theory and Its Applications, xii+372 pp. SIAM, Philadelphia (2013)


12. Bening, V.E., Korolev, V.Yu.: Generalized Poisson Models and their Applications in Insurance and Finance. Modern Probability and Statistics, p. 432. VSP, Utrecht (2002)
13. Benois, O., Landim, C., Mourragui, M.: Hitting times of rare events in Markov chains. J. Stat. Phys. 153(6), 967–990 (2013)
14. Billingsley, P.: Convergence of Probability Measures. Wiley Series in Probability and Statistics, x+277 pp. Wiley, New York (1968)
15. Brown, M., Shao, Y.: Identifying coefficients in spectral representation for first passage-time distributions. Prob. Eng. Inf. Sci. 1, 69–74 (1987)
16. Darroch, J., Seneta, E.: On quasi-stationary distributions in absorbing discrete-time finite Markov chains. J. Appl. Probab. 2, 88–100 (1965)
17. Darroch, J., Seneta, E.: On quasi-stationary distributions in absorbing continuous-time finite Markov chains. J. Appl. Probab. 4, 192–196 (1967)
18. Drozdenko, M.: Weak convergence of first-rare-event times for semi-Markov processes. I. Theory Stoch. Process. 13(29), 29–63 (2007)
19. Drozdenko, M.: Weak Convergence of First-Rare-Event Times for Semi-Markov Processes. Doctoral dissertation 49, Mälardalen University, Västerås (2007)
20. Drozdenko, M.: Weak convergence of first-rare-event times for semi-Markov processes. II. Theory Stoch. Process. 15(31), 99–118 (2009)
21. Eleĭko, Y.I., Shurenkov, V.M.: Transient phenomena in a class of matrix-valued stochastic evolutions. Teor. Imovirn. Mat. Stat. 52, 72–76 (English translation in Theory Probab. Math. Stat. 52, 75–79) (1995)
22. Feller, W.: An Introduction to Probability Theory and Its Applications, vol. II, p. 669. Wiley Series in Probability and Statistics, Wiley, New York (1971)
23. Gikhman, I.I., Skorokhod, A.V.: Theory of Random Processes. 1. Probability Theory and Mathematical Statistics, 664 pp. Nauka, Moscow (1971) (English edition: The Theory of Stochastic Processes. 1. Fundamental Principles of Mathematical Sciences, vol. 210, viii+574 pp. Springer, New York (1974) and Berlin (1980))
24. Glynn, P.: On exponential limit laws for hitting times of rare sets for Harris chains and processes. In: Glynn, P., Mikosch, T., Rolski, T. (eds.) New Frontiers in Applied Probability: A Festschrift for Søren Asmussen. J. Appl. Probab. Spec. 48A, 319–326 (2011)
25. Gnedenko, B.V., Korolev, V.Yu.: Random Summation. Limit Theorems and Applications, p. 288. CRC Press, Boca Raton, FL (1996)
26. Gusak, D.V., Korolyuk, V.S.: Asymptotic behaviour of semi-Markov processes with a decomposable set of states. Teor. Veroyatn. Mat. Stat. 5, 43–50 (English translation in Theory Probab. Math. Stat. 5, 43–51) (1971)
27. Gut, A., Holst, L.: On the waiting time in a generalized roulette game. Stat. Probab. Lett. 2(4), 229–239 (1984)
28. Gyllenberg, M., Silvestrov, D.S.: Quasi-stationary distributions of a stochastic metapopulation model. J. Math. Biol. 33, 35–70 (1994)
29. Gyllenberg, M., Silvestrov, D.S.: Quasi-stationary phenomena for semi-Markov processes. In: Janssen, J., Limnios, N. (eds.) Semi-Markov Models and Applications, pp. 33–60. Kluwer, Dordrecht (1999)
30. Gyllenberg, M., Silvestrov, D.S.: Nonlinearly perturbed regenerative processes and pseudo-stationary phenomena for stochastic systems. Stoch. Process. Appl. 86, 1–27 (2000)
31. Gyllenberg, M., Silvestrov, D.S.: Quasi-Stationary Phenomena in Nonlinearly Perturbed Stochastic Systems. De Gruyter Expositions in Mathematics, vol. 44, ix+579 pp. Walter de Gruyter, Berlin (2008)
32. Hassin, R., Haviv, M.: Mean passage times and nearly uncoupled Markov chains. SIAM J. Disc. Math. 5, 386–397 (1992)
33. Kaplan, E.I.: Limit theorems for exit times of random sequences with mixing. Teor. Veroyatn. Mat. Stat. 21, 53–59 (English translation in Theory Probab. Math. Stat. 21, 59–65) (1979)
34. Kaplan, E.I.: Limit Theorems for Sum of Switching Random Variables with an Arbitrary Phase Space of Switching Component. Candidate of Science dissertation, Kiev State University (1980)

21 Flows of Rare Events for Regularly Perturbed Semi-Markov Processes

483

35. Kalashnikov, V.V.: Geometric Sums: Bounds for Rare Events with Applications. Mathematics and its Applications, vol. 413, xviii+265 pp. Kluwer, Dordrecht (1997) 36. Kartashov, N.V.: Estimates for the geometric asymptotics of Markov times on homogeneous chains. Teor. Veroyatn. Mat. Stat. 37, 66–77 (English translation in Theory Probab. Math. Stat. 37, 75–88) (1987) 37. Kartashov, N.V.: Inequalities in Rénei’s theorem. Teor. ˇImovirn. Mat. Stat. 45, 27–33 (English translation in Theory Probab. Math. Stat. 45, 23–28) (1991) 38. Kartashov, N.V.: Strong Stable Markov Chains, p. 138. VSP, Utrecht and TBiMC, Kiev (1996) 39. Kartashov, M.V.: Quantitative and qualitative limits for exponential asymptotics of hitting times for birth-and-death chains in a scheme of series. Teor. ˇImovirn. Mat. Stat. 89, 40–50 (English translation in Theory Probab. Math. Stat. 89, 45–56) (2013) 40. Keilson, J.: A limit theorem for passage times in ergodic regenerative processes. Ann. Math. Stat. 37, 866–870 (1966) 41. Keilson, J.: Markov Chain Models—Rarity and Exponentiality. Applied Mathematical Sciences, vol. 28, xiii+184 pp. Springer, New York (1979) 42. Kijima, M.: Markov Processes for Stochastic Modelling. Stochastic Modeling Series, x+341 pp. Chapman & Hall, London (1997) 43. Kingman, J.F.: The exponential decay of Markovian transition probabilities. Proc. London Math. Soc. 13, 337–358 (1963) 44. Korolyuk, D.V., Silvestrov D.S.: Entry times into asymptotically receding domains for ergodic Markov chains. Teor. Veroyatn. Primen. 28, 410–420 (English translation in Theory Probab. Appl. 28, 432–442) (1983) 45. Korolyuk, D.V., Silvestrov D.S.: Entry times into asymptotically receding regions for processes with semi-Markov switchings. Teor. Veroyatn. Primen. 29, 539–544 (English translation in Theory Probab. Appl. 29, 558–563) (1984) 46. Korolyuk, V.S.: On asymptotical estimate for time of a semi-Markov process being in the set of states. Ukr. Mat. Zh. 21, 842–845 (1969) 47. 
Korolyuk, V.S., Korolyuk, V.V.: Stochastic Models of Systems. Mathematics and its Applications, vol. 469, xii+185 pp. Kluwer, Dordrecht (1999) 48. Koroliuk, V.S., Limnios, N.: Stochastic Systems in Merging Phase Space, xv+331 pp. World Scientific, Singapore (2005) 49. Korolyuk, V., Swishchuk, A.: Semi-Markov Random Evolutions. Naukova Dumka, Kiev, 254 pp. (English revised edition: Semi-Markov Random Evolutions. Mathematics and its Applications, vol. 308. Kluwer, Dordrecht, 1995, x+310 pp.) (1992) 50. Korolyuk, V.S., Turbin, A.F.: On the asymptotic behaviour of the occupation time of a semiMarkov process in a reducible subset of states. Teor. Veroyatn. Mat. Stat. 2, 133–143 (English translation in Theory Probab. Math. Stat. 2, 133–143) (1970) 51. Korolyuk, V.S., Turbin, A.F.: Semi-Markov Processes and its Applications, p. 184. Naukova Dumka, Kiev (1976) 52. Korolyuk, V.S., Turbin, A.F.: Mathematical Foundations of the State Lumping of Large Systems. Naukova Dumka, Kiev, 218 pp. (English edition: Mathematics and its Applications, vol. 264, Kluwer, Dordrecht, 1993, x+278 pp.) (1978) 53. Kovalenko, I.N.: On the class of limit distributions for thinning flows of homogeneous events. Litov. Mat. Sbornik 5, 569–573 (1965) 54. Kovalenko, I.N.: An algorithm of asymptotic analysis of a sojourn time of Markov chain in a set of states. Dokl. Acad. Nauk Ukr. SSR Ser. A, 6, 422–426 (1973) 55. Kovalenko, I.N.: Rare events in queuing theory: a survey. Queuing Syst. Theory Appl. 16(1–2), 1–49 (1994) 56. Kovalenko, I.N., Kuznetsov, M.J.: Renewal process and rare events limit theorems for essentially multidimensional queueing processes. Math. Oper. Stat. Ser. Stat. 12(2), 211–224 (1981) 57. Kupsa, M., Lacroix, Y.: Asymptotics for hitting times. Ann. Probab. 33(2), 610–619 (2005) 58. Latouch, G., Louchard, G.: Return times in nearly decomposible stochastic processes. J. Appl. Probab. 15, 251–267 (1978)

484

D. Silvestrov

59. Loève, M.: Probability Theory. I. Fourth edition. Graduate Texts in Mathematics, vol. 45, xvii+425 pp. Springer, New York (1977) 60. Masol, V.I., Silvestrov, D.S.: Record values of the occupation time of a semi-Markov process. Visnik Kiev. Univ. Ser. Mat. Meh. 14, 81–89 (1972) 61. Motsa, A.I., Silvestrov, D.S.: Asymptotics of extremal statistics and functionals of additive type for Markov chains. In: Klesov, O., Korolyuk, V., Kulldorff, G., Silvestrov, D. (Eds.) Proceedings of the First Ukrainian–Scandinavian Conference on Stochastic Dynamical Systems, Uzhgorod, 1995. Theory Stoch. Process 2(18)(1-2), 217–224 (1996) 62. Serlet, L.: Hitting times for the perturbed reflecting random walk. Stoch. Process. Appl. 123(1), 110–130 (2013) 63. Shurenkov, V.M.: Transition phenomena of the renewal theory in asymptotical problems of theory of random processes 1. Mat. Sbornik, 112, 115–132 (English translation in Math. USSR: Sbornik, 40(1), 107–123 (1981)) (1980) 64. Shurenkov, V.M.: Transition phenomena of the renewal theory in asymptotical problems of theory of random processes 2. Mat. Sbornik, 112, 226–241 (English translation in Math. USSR: Sbornik, 40(2), 211–225 (1981)) (1980) 65. Silvestrov, D.S.: Limit theorems for semi-Markov processes and their applications. 1, 2. Teor. Veroyatn. Mat. Stat., 3, 155–172, 173–194 (English translation in Theory Probab. Math. Stat., 3, 159–176, 177–198) (1970) 66. Silvestrov, D.S.: Limit theorems for semi-Markov summation schemes. 1. Teor. Veroyatn. Mat. Stat. 4, 153–170 (English translation in Theory Probab. Math. Stat. 4, 141–157) (1971) 67. Silvestrov, D.S.: Limit Theorems for Composite Random Functions, p. 318. Vysshaya Shkola and Izdatel’stvo Kievskogo Universiteta, Kiev (1974) 68. Silvestrov, D.S.: Semi-Markov Processes with a Discrete State Space. Library for an Engineer in Reliability, p. 272. Sovetskoe Radio, Moscow (1980) 69. Silvestrov, D.S.: Theorems of large deviations type for entry times of a sequence with mixing. Teor. 
Veroyatn. Mat. Stat. 24, 129–135 (English translation in Theory Probab. Math. Stat. 24, 145–151) (1981) 70. Silvestrov, D.S.: Exponential asymptotic for perturbed renewal equations. Teor. ˇImovirn. Mat. Stat. 52, 143–153 (English translation in Theory Probab. Math. Stat. 52, 153–162) (1995) 71. Silvestrov, D.S.: Nonlinearly perturbed Markov chains and large deviations for lifetime functionals. In: Limnios, N., Nikulin, M. (eds.) Recent Advances in Reliability Theory: Methodology, Practice and Inference, pp. 135–144. Birkhäuser, Boston (2000) 72. Silvestrov D.S.: Limit Theorems for Randomly Stopped Stochastic Processes. Probability and Its Applications, xvi+398 pp. Springer, London (2004) 73. Silvestrov, D.S.: Improved asymptotics for ruin probabilities. In: Silvestrov, D., Martin-Löf, A. (eds.) Modern Problems in Insurance Mathematics, Chapter 5, pp. 93–110. EAA series, Springer, Cham (2014) 74. Silvestrov, D.: Necessary and sufficient conditions for convergence of first-rare-event times for perturbed s emi-Markov processes. Teor. ˇImovirn. Mat. Stat. 95, (2016), 119–137 (Also, in Theor. Probab. Math. Stat. 95, 131–151) (2016) 75. Silvestrov, D.S., Abadov, Z.A.: Uniform asymptotic expansions for exponential moments of sums of random variables defined on a Markov chain and distributions of entry times. 1. Teor. Veroyatn. Mat. Stat. 45, 108–127 (English translation in Theory Probab. Math. Stat. 45, 105– 120) (1991) 76. Silvestrov, D.S., Abadov, Z.A.: Uniform representations of exponential moments of sums of random variables defined on a Markov chain, and of distributions of passage times. 2. Teor. Veroyatn. Mat. Stat. 48, 175–183 (English translation in Theory Probab. Math. Stat. 48, 125– 130) (1993) 77. Silvestrov, D.S., Drozdenko, M.O.: Necessary and sufficient conditions for the weak convergence of the first-rare-event times for semi-Markov processes. Dopov. Nac. Akad. Nauk Ukr., Mat. Prirodozn. Tekh Nuki, 11, 25–28 (2005) 78. 
Silvestrov, D.S., Drozdenko, M.O.: Necessary and sufficient conditions for weak convergence of first-rare-event times for semi-Markov processes. I. Theory Stoch. Process., 12(28), 3–4, 151–186 (2006)

21 Flows of Rare Events for Regularly Perturbed Semi-Markov Processes

485

79. Silvestrov, D.S., Drozdenko, M.O.: Necessary and sufficient conditions for weak convergence of first-rare-event times for semi-Markov processes. II. Theory Stoch. Process., 12(28), 3–4, 187–202 (2006b) 80. Silvestrov, D., Silvestrov, S.: Asymptotic expansions for stationary distributions of perturbed semi-Markov processes. In: Silvestrov, S., Rancic, M. (eds.) Engineering Mathematics II. Algebraic, Stochastic and Analysis Structures for Networks, Data Classification and Optimization. Springer Proceedings in Mathematics & Statistics, vol. 179, Chapter 10, pp. 151–222. Springer, Heidelberg (2016) 81. Silvestrov, D., Silvestrov, S.: Nonlinearly Perturbed Semi-Markov Processes. Springer Briefs in Probability & Mathematical Statistics, xiv+143 pp. Springer, Cham (2017) 82. Silvestrov, D., Silvestrov, S.: Asymptotic expansions for power-exponential moments of hitting times for nonlinearly perturbed semi-Markov processes. Teor. Imovirn. Mat. Stat. 97, 171–187 (Also, in Theory Probab. Math. Stat. 97, 183–200) (2017) 83. Silvestrov, D.S., Velikii, Y.A.: Necessary and sufficient conditions for convergence of attainment times. In: Zolotarev, V.M., Kalashnikov, V.V. (eds.) Stability Problems for Stochastic Models. Trudy Seminara, VNIISI, Moscow, pp. 129–137 (English translation in J. Soviet. Math., 57, 3317–3324 (1991)) (1988) 84. Simon, H.A., Ando, A.: Aggregation of variables in dynamic systems. Econometrica 29, 111– 138 (1961) 85. Skorokhod, A.V.: Random Processes with Independent Increments. Probability Theory and Mathematical Statistics, Nauka, Moscow, 278 pp. (English edition: Nat. Lending Library for Sci. and Tech., Boston Spa, 1971) (1964) 86. Skorokhod, A.V.: Random Processes with Independent Increments. Second edition, Probability Theory and Mathematical Statistics, Nauka, Moscow, 320 pp. (English edition: Mathematics and its Applications, vol. 47, xii+279 pp. Kluwer, Dordrecht, 1991) (1986) 87. Stewart, G.W.: Matrix Algorithms. Basic Decompositions, vol. 
I, xx+458 pp. SIAM, Philadelphia, PA (1998) 88. Stewart, G.W.: Matrix Algorithms. Eigensystems, vol. II, xx+469 pp. SIAM, Philadelphia, PA (2001) 89. Turbin, A.F.: On asymptotic behavior of time of a semi-Markov process being in a reducible set of states. Linear case. Teor. Verotatn. Mat. Stat. 4, 179–194 (English translation in Theory Probab. Math. Stat. 4, 167–182) (1971) 90. Yin, G.G., Zhang, Q.: Discrete-Time Markov Chains. Two-Time-Scale Methods and Applications. Stochastic Modelling and Applied Probability, xix+348 pp. Springer, New York (2005) 91. Yin, G.G., Zhang, Q.: Continuous-Time Markov Chains and Applications. A Two-Time-Scale Approach. Second edition. Stochastic Modelling and Applied Probability, vol. 37, xxii+427 pp. Springer, New York (2013) 92. Zakusilo, O.K.: Thinning semi-Markov processes. Teor. Veroyatn. Mat. Stat. 6, 54–59 (English translation in Theory Probab. Math. Stat. 6, 53–58) (1972) 93. Zakusilo, O.K.: Necessary conditions for convergence of semi-Markov processes that thin. Teor. Veroyatn. Mat. Stat. 7, 65–69 (English translation in Theory Probab. Math. Stat. 7, 63– 66) (1972)

Part II

Statistical Methods

Part II presents new developments in statistical methods. It begins with the chapter by D'Amico, Di Basilio, Petroni, and Gismondi (Chap. 22), in which two drawdown-based risk measures for managing market crises are considered. These two measures are then analyzed using high-frequency market data and synthetic data generated by ARMA, GARCH, and EGARCH models. Chapter 23 by Anisimov and Austin deals with patient enrolment modelling and forecasting. Here, modelling of enrolment on different levels and with restrictions is discussed, and new analytical techniques are proposed to find the solution to the corresponding optimization problem. Chapter 24 by Anguzu, Engström, Kasumba, Mango, and Silvestrov deals with centrality measures in graph theory. Algorithms are developed for recalculating two centrality measures, namely the alpha and eigenvector centrality measures, using graph partitioning techniques. In Chapter 25, Kozachenko and Rozora consider the estimation of the impulse response function of a time-invariant continuous linear system with a real-valued impulse response function. Statistical properties of the estimator and a criterion on its shape are given. Chapters 26 and 27 by Muhumuza, Lundengaard, Malyarenko, Silvestrov, Mango, and Kakuba are devoted to useful applications of the extreme points of Vandermonde determinants. In Chap. 26, the extreme points, optimized on various surfaces, are used to conduct the risk-minimization task in asset pricing and optimal portfolio selection. In Chap. 27, the extreme points maximize the Wishart probability distribution on the boundary of the symmetric cones in Jordan algebras. Finally, Part II ends with Chap. 28 by Shchestyuk and Tyshchenko, in which a new approach to option pricing is proposed. The concept of investor optimal price is defined as the optimal decision of an investor maximizing expected profit.
This investor optimal pricing, integrated with risk management, is then carried out by stochastic optimization.

Chapter 22

An Econometric Analysis of Drawdown Based Measures

Guglielmo D'Amico, Bice Di Basilio, Filippo Petroni, and Fulvio Gismondi

Abstract In this chapter, we discuss two risk measures based on the drawdown process and closely related to market crises: the drawdown of a fixed level and the speed of market crash. They allow us to study, respectively, the first time that the asset's price deviates from its current maximum by a certain threshold value and the velocity at which this drop occurs. Consequently, the former is a relative measure of the losses linked to an asset, while the latter quantifies the speed at which these losses occur. In order to study these risk measures, we consider tick-by-tick prices of two assets listed on the Italian Stock Exchange. We implement an empirical investigation involving estimation and simulation of widely used econometric models such as Autoregressive Moving Average (ARMA) models, Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models and Exponential GARCH (EGARCH) models. We test the ability of each model to reproduce the volatility autocorrelation, a typical feature of financial time series, and then we analyze their capacity to reproduce the drawdown of a fixed level and the speed of market crash, compared to real data.

Keywords Risk measure · Econometric models

MSC 2020 91B05

G. D’Amico (B) · B. Di Basilio Department of Economics, University of Chieti-Pescara, 65127 Pescara, Italy e-mail: [email protected] B. Di Basilio e-mail: [email protected] F. Petroni Department of Management, Marche Polytechnic University, 60121 Ancona, Italy e-mail: [email protected] F. Gismondi Department of Economic and Business Science, ‘Guglielmo Marconi’ University, Rome, Italy e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_22


22.1 Introduction

In portfolio management, the concept of financial risk, which expresses uncertainty regarding the future value of a financial asset, is very important. Typically, financial markets are characterized by continuous upward and downward fluctuations in prices, able to generate market crises. As a consequence, it is in the investors' interest to take all the necessary precautions to control the uncertainty factors related to financial activities, trying to limit the effects of unwanted events. Over the years, an extensive literature on risk measures, able to satisfy all stakeholders' needs, has been developed. A well-known and intuitive family of financial indicators is that based on quantiles, to which Value-at-Risk (VaR) and Expected Shortfall (ES) belong (see, e.g. [4, 6]). The main disadvantage of these indicators is that, being based on quantiles, they do not consider the temporal order of data, which is central to the analysis of financial time series. However, there are other families of risk measures, such as the drawdown-based one, which take into account the timeline of data. This class includes many indicators, among which we mention: the maximum drawdown (MDD), the average drawdown (AvDD), the drawdown of a fixed level ($\tau_K^D$) and the speed of market crash (S). The first indicator, the maximum drawdown, is very widespread and, intuitively, it describes the maximum drawdown value over a fixed time interval (see, e.g. [14, 19]). Like the maximum drawdown, the average drawdown is also very understandable. It is an average of the drawdowns over a certain time horizon (see, e.g. [5]). Finally, the drawdown of a fixed level and the speed of market crash are less immediate than the previous ones, but very important in the management of market collapses (see, e.g. [22]). They describe, respectively, the first time at which the drawdown of an asset reaches a pre-specified level and the velocity at which this level is achieved.
To analyze these risk measures, which consider the time path of the data, it is necessary to use stochastic models for asset returns. In the literature, various models have been proposed, which shape returns directly or indirectly. The most famous are the econometric models (see, e.g. [2, 11, 12, 16, 21]) and the diffusive models (see, e.g. [1, 13]). Recently, valuable alternatives based on semi-Markovian models were also considered (see, e.g. [7–9, 15, 17]). In this work, we focus on both the drawdown of a fixed level and the speed of market crash as measures closely related to market collapses. Both risk measures depend on the threshold K, which expresses the riskiness of an event and thus the intensity of a market crash. Specifically, small K-values indicate low-risk events while large K-values refer to very risky events. In detail, we analyze these risk indicators by means of intra-day prices of Fiat and Tenaris stocks. Essentially, our aim is to explore whether the classical econometric models are suitable for reproducing these risk measures. In order to do this, we carry out estimation and simulation using ARMA models, GARCH models and EGARCH models, which are widely used in the financial field. Firstly, we show the capacity of these models to reproduce the long-range volatility autocorrelation; as a result, EGARCH models reproduce this property better than GARCH and ARMA models on both assets. Secondly, we analyze the behavior of the average values of the two risk measures on simulated and real data, for fixed values of K. Thirdly, we quantify the distance between real and simulated distributions, for selected values of K. In general, for both these financial indicators, EGARCH models seem to perform better for the Fiat stock, but are less suitable for the Tenaris stock, for which GARCH and ARMA models are alternately preferable. The rest of the chapter is organized as follows: Sect. 22.2 provides a short excursus on risk measures and their rigorous definitions. Section 22.3 formally presents the chosen econometric models. Section 22.4 shows both the analysis and the discussion of the obtained results. Section 22.5 presents some concluding remarks.

22.2 Risk Measures

A market crash is a rapid, unexpected and sometimes abrupt drop in prices, which burns investors' earnings in a short time. The roots of these crises often lie not only in economic and financial factors, but also in the general panic among investors following the reception of negative news. In fact, stakeholders play a fundamental role, since their loss of confidence in investments on a certain market can trigger the massive sale of the related shares and therefore a sharp drop in prices. To contain the sudden value changes of a security and avoid crises, stock market investors are interested in quantifying the risk associated with their portfolio by means of risk measures. In the literature, there are several risk indicators, which have been developed in different historical-political contexts. The most widespread and commonly used are Value-at-Risk (VaR) and Expected Shortfall (ES). VaR was introduced by the J. P. Morgan bank in 1994 and its diffusion is related to the Basel regulatory framework. Formally, the VaR of level $\alpha \in (0, 1)$ is defined as follows:

$$\mathrm{VaR}_\alpha(Y_T) := -q_\alpha(Y_T), \qquad (22.1)$$

where $q_\alpha$ is the quantile of level $\alpha$ of the random variable $Y_T$ that stands for the profits and losses (P&L) of a portfolio over a time horizon T. In practice, $\alpha$ is close to 0 and T ranges from 1 day to 10 days up to 1 year. VaR represents the maximum hypothetical loss deriving from holding a financial asset, on a fixed time horizon and with a given level of probability. However, when VaR is used to limit the risks assumed by investors, it can lead to undesirable results because it does not take into account the size of the tail and it is not a coherent measure. To remedy the problem, a more suitable risk measure, known as Expected Shortfall, has been introduced. This is also sometimes referred to as Conditional VaR (CVaR), Tail Conditional Expectation (TCE) or Expected Tail Loss (ETL). Mathematically, the ES of level $\alpha \in (0, 1)$ is defined as:


$$\mathrm{ES}_\alpha(Y) := \frac{1}{\alpha} \int_0^\alpha \mathrm{VaR}_u(Y)\, du. \qquad (22.2)$$

ES, like VaR, depends on both the time horizon and the level of confidence. It describes the expected loss over a certain time interval, given that we are in the tail of the loss distribution, to the right of the percentile $\alpha$ (i.e. to the right of the VaR). The main drawback of these measures is the use of quantiles, which disregard the temporal order of data. Nevertheless, there are risk indicators that overcome this issue, such as those based on the drawdown process, which take into account the historical evolution of data. In order to rigorously define the drawdown of an asset, it is necessary to introduce the time-varying asset price process $X_t$ and its running maximum process, determined as:

$$\overline{X}_t := \sup_{s \in [0,t]} X_s. \qquad (22.3)$$

Consequently, the drawdown process $D_t$ is described by the difference between the above processes:

$$D_t := \overline{X}_t - X_t, \qquad t \geq 0. \qquad (22.4)$$

It expresses the correction of the asset price with respect to a previous relative maximum. The most popular drawdown-based risk measures are the maximum drawdown (MDD) and the average drawdown (AvDD). The first indicator represents the greatest drawdown attained in a certain time horizon. More formally, the MDD in the time interval $[0, t]$ is defined as:

$$\mathrm{MDD} := \sup_{z \in [0,t]} D_z. \qquad (22.5)$$

Practically, it describes the maximum loss an investor would suffer if he bought a security at its maximum price and resold it at its minimum price. The second measure, the average drawdown, is an average of the drawdowns over the interval $[0, t]$. Considering time as a continuous variable, it is identified as:

$$\mathrm{AvDD} := \frac{1}{t} \int_0^t D_r\, dr. \qquad (22.6)$$

The above formula also has a discrete-time version:

$$\mathrm{AvDD} := \frac{1}{t} \sum_{j=0}^{t} D_j. \qquad (22.7)$$
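Definitions (22.3)–(22.7) translate directly into code. The sketch below (NumPy; the function name and the toy price path are ours, for illustration only) computes the running maximum, the drawdown process, the maximum drawdown, and the discrete-time average drawdown.

```python
import numpy as np

def drawdown_measures(prices):
    """Running maximum (22.3), drawdown process (22.4), MDD (22.5), AvDD (22.7)."""
    prices = np.asarray(prices, dtype=float)
    running_max = np.maximum.accumulate(prices)   # X-bar_t = sup_{s<=t} X_s
    drawdown = running_max - prices               # D_t = X-bar_t - X_t >= 0
    mdd = drawdown.max()                          # maximum drawdown over the sample
    avdd = drawdown.mean()                        # discrete-time average drawdown
    return drawdown, mdd, avdd

# toy price path: peak at 110, trough at 95
prices = [100, 105, 110, 102, 95, 101, 108]
dd, mdd, avdd = drawdown_measures(prices)
print(dd)    # [0, 0, 0, 8, 15, 9, 2]
print(mdd)   # 15.0
```

Note that the drawdown resets to a small value as soon as the price approaches its former peak, while the MDD keeps the memory of the deepest dip.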


Unlike the maximum drawdown, which captures the worst possible scenario, it is a measure of the losses that are normally expected. In this chapter, we discuss two risk measures based on drawdown and connected to market crashes: the drawdown of a fixed level K and the speed of market crash. The former, denoted by $\tau_K^D$, is defined as the first time that the drawdown process reaches a certain threshold K:

$$\tau_K^D := \inf\{t \geq 0 \mid D_t \geq K\}, \quad \text{where } K \geq 0. \qquad (22.8)$$

It explores whether the drawdown of a stock has reached a certain threshold, or even exceeded it, in the past, in order to understand whether it was risky and how risky it was. To define the second measure, denoted by S, we have to introduce the quantity $\rho$, the last visit time of the maximum before the time $\tau_K^D$:

$$\rho := \sup\{t \in [0, \tau_K^D] \mid X_t = \overline{X}_t\}. \qquad (22.9)$$

Using the definitions of $\rho$ and $\tau_K^D$, we can formalize S as follows:

$$S := \tau_K^D - \rho. \qquad (22.10)$$

It represents the velocity at which a market crash takes place and thus how fast portfolio losses occur.
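The two crash-related measures can be sketched in a few lines as well (NumPy; the function name and the toy path are ours): $\tau_K^D$ in (22.8) is the first index at which the drawdown reaches K, and S in (22.10) is $\tau_K^D$ minus the last visit $\rho$ to the running maximum before $\tau_K^D$.

```python
import numpy as np

def crash_measures(prices, K):
    """First passage time tau_K^D of the drawdown to level K, and crash speed S."""
    prices = np.asarray(prices, dtype=float)
    running_max = np.maximum.accumulate(prices)
    drawdown = running_max - prices
    hits = np.flatnonzero(drawdown >= K)
    if hits.size == 0:
        return None, None                      # level K never reached in the sample
    tau = int(hits[0])                         # tau_K^D, eq. (22.8)
    at_max = np.flatnonzero(prices[:tau + 1] == running_max[:tau + 1])
    rho = int(at_max[-1])                      # last visit to the maximum, eq. (22.9)
    return tau, tau - rho                      # S = tau_K^D - rho, eq. (22.10)

prices = [100, 105, 110, 102, 95, 101, 108]
tau, speed = crash_measures(prices, K=10)
print(tau, speed)   # drawdown first reaches 10 at index 4; last max visit at index 2
```

A small S relative to K signals a violent crash: the drawdown level K was reached in only a few time steps after leaving the maximum.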

22.3 Mathematical Models

As market crashes are caused by snap drops in asset prices, or equivalently in asset returns, it is essential to introduce the definition of financial returns. Let $X_t$ be the price of a stock at time $t \in \mathbb{N}$. The time-varying log returns, denoted by $R_t$, are determined as:

$$R_t := \log(X_t) - \log(X_{t-1}). \qquad (22.11)$$

Since it is our objective to study log returns by means of econometric models, and we want to keep this chapter self-contained, we present in the next subsections some of the most widely used models in econometric studies. However, it should be stressed right away that models of the ARMA family model returns directly, while GARCH/EGARCH models model volatility, so that returns are obtained from the volatility process by applying a noise process transformation.
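Definition (22.11) in code (NumPy; the price array is a made-up example):

```python
import numpy as np

prices = np.array([100.0, 101.5, 100.8, 102.2])
# R_t = log(X_t) - log(X_{t-1}), eq. (22.11)
log_returns = np.diff(np.log(prices))
print(log_returns.round(6))
```

By construction the log returns telescope: their sum equals the log return over the whole window, a convenience plain percentage returns do not share.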


22.3.1 ARMA Model

The Autoregressive Moving Average (ARMA) models are a combination of the Autoregressive (AR) models, introduced in [21] by Yule, and the Moving Average (MA) models, proposed in [16] by Slutzky. Basically, these were presented to overcome the problem according to which AR and MA models often required a very large number of parameters to correctly describe the data structure. Therefore ARMA models are more parsimonious, in terms of parameters to be estimated, than AR and MA models. Although this class of processes was first proposed in [20] by Wold, only in [3] did they become popular, thanks to Box and Jenkins. The delay in their use is mainly explained by the absence of machines able to perform complex calculations and thus estimate the parameters of these models. Mathematically, a general ARMA(p,q) model for asset returns is of the form:

$$R_t = \phi_0 + \sum_{i=1}^{p} \phi_i R_{t-i} + a_t - \sum_{j=1}^{q} \theta_j a_{t-j}, \qquad (22.12)$$

where p and q are nonnegative integers and $\{a_t\}$ is a white noise series. An ARMA(p,q) model expresses the conditional mean of $R_t$ as a function of both the past values of the returns, $R_{t-1}, R_{t-2}, \ldots, R_{t-p}$, and the past values of the innovations, $a_{t-1}, a_{t-2}, \ldots, a_{t-q}$. The number of past observations that $R_t$ depends on, p, is the AR degree. Conversely, the number of past innovations that $R_t$ depends on, q, is the MA degree. As a consequence, AR and MA models can be seen as particular cases of ARMA models, which occur when q = 0 and p = 0, respectively. Generally, the selection of the best ARMA(p,q) is done by means of the extended autocorrelation function (EACF) or the information criteria (see, e.g. [18]). Once an ARMA(p,q) model is identified, its parameters can be estimated by means of the maximum likelihood estimator (see, e.g. [10]). In the literature, many strengths and weaknesses of ARMA processes are discussed. In particular, this class of models has the advantage of being well known, both theoretically and computationally, as it turns out to be very simple to model data with an ARMA structure, also thanks to the large number of statistical packages available for this purpose. However, with the evolution of the literature on stylized facts, new phenomena have emerged, such as volatility clustering, which ARMA-type processes fail to explain because they assume a constant variance.
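Equation (22.12) can be simulated directly from its recursion. The sketch below (pure NumPy; function name and parameter values are ours, chosen arbitrarily for illustration) generates an ARMA(1,1) return series with Gaussian white noise.

```python
import numpy as np

def simulate_arma(n, phi0=0.0, phi=(0.5,), theta=(0.3,), sigma=0.01, seed=0):
    """Simulate R_t = phi0 + sum_i phi_i R_{t-i} + a_t - sum_j theta_j a_{t-j}."""
    rng = np.random.default_rng(seed)
    p, q = len(phi), len(theta)
    a = rng.normal(0.0, sigma, size=n)          # white noise {a_t}
    r = np.zeros(n)
    for t in range(n):
        ar = sum(phi[i] * r[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * a[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        r[t] = phi0 + ar + a[t] - ma
    return r

returns = simulate_arma(1000)
print(returns.mean(), returns.std())
```

Since |phi| < 1, the simulated series is stationary with mean phi0 / (1 - phi) = 0; the sample moments fluctuate around their stationary values.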

22.3.2 GARCH Model

In [11], Engle proposed a new class of nonlinear models able to represent the phenomenon of volatility clustering, according to which periods of high (low) volatility are followed by periods of high (low) volatility. These are referred to as the Autoregressive Conditional Heteroskedasticity (ARCH) models. Although ARCH models are very simple, it is often necessary to estimate many parameters in order to adapt the model to empirical data. To overcome this problem, in [2] Bollerslev introduced a parsimonious improvement of ARCH models, known as Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models. Formally, let $a_t = R_t - \mu_t$ be the innovation process at time t. Then $a_t$ follows a general GARCH(m,s) if

$$a_t = \sigma_t \varepsilon_t, \qquad (22.13)$$

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{m} \alpha_i a_{t-i}^2 + \sum_{j=1}^{s} \beta_j \sigma_{t-j}^2, \qquad (22.14)$$

where m, s are nonnegative integers and {εt } is a sequence of i.i.d. random variables with mean 0 and variance 1. It is often assumed that {εt } follows a standardized Gaussian distribution or a standardized Student-t distribution. Besides, α0 > 0, αi ≥ max(m,s) (αi + βi ) < 1. 0, βi ≥ 0 and i=1 A GARCH(m,s) model, describes the conditional variance of an asset return as a function of both the lagged conditional variances (GARCH component), 2 2 2 , σt−2 , . . . , σt−s , and the lagged squared innovations (ARCH component), σt−1 2 2 2 , at−2 , . . . , at−m . The degrees of the GARCH and the ARCH components are at−1 expressed by s and m, respectively. It is easy to check that, if s = 0, we recover the case of an ARCH(m) model. Moreover, it is interesting to note that, a GARCH process can be regarded as an ARMA process for the at2 series, (see [2]). In order to prove this, we fix ηt = at2 − σt2 2 2 so that σt2 = at2 − ηt . By replacing σt−i = at−i − ηt−i (i = 0, . . . , s) into equation (22.14), we can rewrite the GARCH(m,s) as follows: at2 = α0 +

max(m,s)  i=1

2 (αi + βi )at−i + ηt −

s 

β j ηt− j .

(22.15)

j=1

The equation above shows an ARMA form for the squared series {at2 } where {ηt } is a martingale difference series, (i.e. E(ηt ) = 0 and Cov(ηt , ηt− j ) = 0 for j > 1). To correctly identify the GARCH(m,s), the information criteria-approach, already mentioned for ARMA models, is used. Once the model is specified, its parameters can be estimated using the maximumlikelihood method. Despite the success and the diffusion of GARCH models, these show a fundamental disadvantage, that deserves to be discussed. In fact, this type of process assumes that, positive or negative shocks have the same effect on volatility, as it depends on the square of previous shocks. Unfortunately, this is in contrast to the empirical evidence, which shows as, very often, there are different responses to positive or negative shocks, especially in the stock markets. This phenomenon, commonly known


G. D’Amico et al.

as the leverage effect, was first noted by the economist Black in 1976, who observed that, in any market, downward movements tend to be followed by higher volatility than that resulting from upward movements of the same order.
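The ARMA representation (22.15) can also be checked numerically. The sketch below (our illustration, not code from the chapter) simulates a GARCH(1,1) path with Gaussian innovations and verifies that the identity a_t^2 = α_0 + (α_1+β_1)a_{t−1}^2 + η_t − β_1 η_{t−1}, with η_t = a_t^2 − σ_t^2, holds exactly along the simulated path; the parameter values are hypothetical.

```python
import math
import random

def garch11_arma_check(n=500, alpha0=0.1, alpha1=0.05, beta1=0.9, seed=3):
    """Simulate a GARCH(1,1) path and return the maximum violation of the
    ARMA(1,1) identity (22.15) for the squared series a_t^2."""
    rng = random.Random(seed)
    var0 = alpha0 / (1.0 - alpha1 - beta1)   # unconditional variance as starting value
    a = [0.0]                                # a_0
    sig2 = [var0]                            # sigma_0^2
    eta = [a[0] ** 2 - sig2[0]]              # eta_t = a_t^2 - sigma_t^2
    for _ in range(n):
        v = alpha0 + alpha1 * a[-1] ** 2 + beta1 * sig2[-1]
        shock = math.sqrt(v) * rng.gauss(0.0, 1.0)
        sig2.append(v)
        a.append(shock)
        eta.append(shock ** 2 - v)
    # maximum absolute violation of the identity over t = 1..n
    return max(
        abs(a[t] ** 2 - (alpha0 + (alpha1 + beta1) * a[t - 1] ** 2
                         + eta[t] - beta1 * eta[t - 1]))
        for t in range(1, n + 1)
    )
```

Up to floating-point rounding the identity holds exactly, confirming that {a_t^2} follows the ARMA recursion driven by the martingale differences {η_t}.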

22.3.3 EGARCH Model

To account for the leverage effect, Nelson [12] elaborated an improvement of the GARCH model, known as the Exponential GARCH (EGARCH) model. In detail, to capture the asymmetric effects of positive and negative asset returns, the following weighted innovation is considered:

g(\varepsilon_t) = \theta \varepsilon_t + \gamma [|\varepsilon_t| - E(|\varepsilon_t|)],    (22.16)

where {ε_t} and {|ε_t| − E(|ε_t|)} are i.i.d. sequences with mean zero; as a consequence, E[g(ε_t)] = 0. The real constants θ and γ describe the symmetric and asymmetric effects of the shock, respectively. To make the asymmetry apparent, we can write g(ε_t) as:

g(\varepsilon_t) =
\begin{cases}
(\theta + \gamma)\varepsilon_t - \gamma E(|\varepsilon_t|) & \text{if } \varepsilon_t \ge 0, \\
(\theta - \gamma)\varepsilon_t - \gamma E(|\varepsilon_t|) & \text{if } \varepsilon_t < 0.
\end{cases}    (22.17)

The value of E(|ε_t|) depends on the distribution of the random variable ε_t. For instance, if ε_t follows the standard Gaussian distribution, then E(|ε_t|) = \sqrt{2/\pi}. Alternatively, if ε_t has a standardized Student-t distribution with ν degrees of freedom, we get:

E(|\varepsilon_t|) = \frac{2\sqrt{\nu - 2}\,\Gamma[(\nu + 1)/2]}{(\nu - 1)\,\Gamma(\nu/2)\sqrt{\pi}}.    (22.18)
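Both expressions are easy to evaluate numerically. The sketch below (our illustration) uses log-gamma to keep (22.18) stable for large ν, for which Γ(ν/2) itself would overflow:

```python
import math

def abs_mean_gaussian():
    """E|eps| for a standard Gaussian innovation: sqrt(2/pi)."""
    return math.sqrt(2.0 / math.pi)

def abs_mean_student(nu):
    """E|eps| for a standardized Student-t innovation with nu > 2 d.o.f. (Eq. 22.18),
    computed via lgamma for numerical stability."""
    log_ratio = math.lgamma((nu + 1) / 2.0) - math.lgamma(nu / 2.0)
    return (2.0 * math.sqrt(nu - 2.0) * math.exp(log_ratio)
            / ((nu - 1.0) * math.sqrt(math.pi)))
```

As expected, the Student-t value approaches the Gaussian value √(2/π) ≈ 0.7979 as ν grows.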

A general EGARCH(m,s) takes the form:

a_t = \sigma_t \varepsilon_t,    (22.19)

\log(\sigma_t^2) = \alpha_0 + \sum_{i=1}^{m} \alpha_i g(\varepsilon_{t-i}) + \sum_{j=1}^{s} \beta_j \log(\sigma_{t-j}^2),    (22.20)

where m and s are nonnegative integers and the parameters α_0, α_i, β_j do not necessarily have to be positive. An EGARCH(m,s) models the conditional log-variance of an asset return as a function of both the lagged conditional log-variances (GARCH component), log(σ_{t−1}^2), log(σ_{t−2}^2), ..., log(σ_{t−s}^2), and the lagged innovations (ARCH component), g(ε_{t−1}), g(ε_{t−2}), ..., g(ε_{t−m}). Obviously, m and s represent the orders of the ARCH and GARCH components, respectively.
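A minimal EGARCH(1,1) simulation can be sketched as follows (our illustration with hypothetical parameter values, assuming Gaussian innovations so that E|ε| = √(2/π); note that σ_t² = exp(log σ_t²) is positive by construction, whatever the signs of the parameters):

```python
import math
import random

def g(eps, theta=-0.1, gamma=0.2):
    """Weighted innovation (22.16) for a Gaussian eps: E|eps| = sqrt(2/pi)."""
    return theta * eps + gamma * (abs(eps) - math.sqrt(2.0 / math.pi))

def simulate_egarch11(n, alpha0=-0.1, alpha1=0.2, beta1=0.95, seed=7):
    """a_t = sigma_t * eps_t with
    log sigma_t^2 = alpha0 + alpha1*g(eps_{t-1}) + beta1*log sigma_{t-1}^2."""
    rng = random.Random(seed)
    log_var = alpha0 / (1.0 - beta1)   # stationary mean of the log-variance
    eps_prev = 0.0
    path = []
    for _ in range(n):
        log_var = alpha0 + alpha1 * g(eps_prev) + beta1 * log_var
        eps_prev = rng.gauss(0.0, 1.0)
        path.append(math.exp(0.5 * log_var) * eps_prev)
    return path
```

With θ < 0, a negative shock increases g(ε) and hence next period's log-variance more than a positive shock of the same size, reproducing the leverage effect.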

22 An Econometric Analysis of Drawdown Based Measures


Although the EGARCH model belongs to the GARCH family, it shows some differences compared to it. First, it models the conditional log-variance, and not the conditional variance, as the GARCH model does. This particularity guarantees the positivity of the conditional variance without constraints on the parameters, making it a more flexible model than the previous GARCH model. At the same time, however, forecasts of the conditional variance made with an EGARCH model are biased because, by Jensen's inequality, we have:

E(\sigma_t^2) \ge \exp\{E(\log(\sigma_t^2))\}.    (22.21)
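Jensen's inequality gives E(σ_t²) = E[exp(log σ_t²)] ≥ exp(E[log σ_t²]), so exponentiating a forecast of the log-variance underestimates the variance. A quick Monte Carlo illustration of this gap (our sketch, taking X = log σ² standard normal purely for concreteness):

```python
import math
import random

def jensen_gap(n=100000, seed=0):
    """Compare E[exp(X)] with exp(E[X]) for X ~ N(0, 1).
    Theoretically E[exp(X)] = e^{1/2} ~ 1.6487 while exp(E[X]) = 1."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean_of_exp = sum(math.exp(x) for x in xs) / n
    exp_of_mean = math.exp(sum(xs) / n)
    return mean_of_exp, exp_of_mean
```

The first quantity is systematically larger than the second, which is exactly the bias affecting variance forecasts recovered from log-variance models.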

In addition, it uses the weighted innovation g(ε_t), and not a simple innovation, whose impact on the conditional log-variance is in line with what emerges from the markets: volatility tends to increase in the presence of bad news, while it decreases when news is good.

22.4 Application

The database consists of intra-day prices of Fiat and Tenaris stocks, denoted by the symbols F and TEN, respectively. Data were downloaded from the Italian Stock Exchange (www.borsaitaliana.it) for the period January 2007–December 2010 and then resampled at a 1-min frequency. Overall, we studied 1001 trading days, each consisting of 507 min, for a total of 506506 returns for each asset. Table 22.1 summarizes the descriptive statistics of the dataset.
In Figs. 22.1, 22.2, 22.3 and 22.4, we analyze the behavior of the measures τ_K^D and S on real prices, by means of histograms. The time, expressed in minutes (X-axis), has a logarithmic scale, while the absolute frequencies (Y-axis) have a linear scale. Comparing Figs. 22.1 and 22.2, it is possible to note that a 1%-change in the Fiat drawdown occurs almost always in the first 100 min of the trading day. On the contrary, the Tenaris stock needs more time before reaching the same variation. Analogously, in Figs. 22.3 and 22.4 we show histograms of S for K = 1%. From Fig. 22.3, we observe that the Fiat asset reaches a 1%-change in its drawdown almost always with a speed lower than 60 min. On the other hand, Fig. 22.4 shows that the Tenaris asset achieves the same variation slightly more slowly.
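As a concrete illustration of how such hitting times can be computed from an intra-day price path (our sketch; the formal definitions of τ_K^D and S are given earlier in the chapter, and here we assume τ_K^D is read as the first minute at which the relative drawdown from the running maximum reaches the threshold K):

```python
def drawdown_hitting_time(prices, K):
    """First index (in minutes) at which the relative drawdown from the running
    maximum reaches the threshold K (e.g. K = 0.01 for a 1% change).
    Returns None if the threshold is never hit.  Assumed definition: see lead-in."""
    running_max = prices[0]
    for t, p in enumerate(prices):
        if p > running_max:
            running_max = p
        if (running_max - p) / running_max >= K:
            return t
    return None
```

Applying this function day by day and collecting the hitting times produces the kind of histograms shown in Figs. 22.1 and 22.2.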

Table 22.1 Descriptive statistics of the dataset used for the analysis

Stock   Mean         Median   Standard deviation   Skewness   Kurtosis
F       8.2369e−07   0        8.3534e−04           −0.0169    3.4852
TEN     1.3209e−06   0        8.2227e−04           −0.0090    3.7288


Fig. 22.1 Histogram of τ_K^D computed on Fiat stock prices, for K = 1%. The time, expressed in minutes (X-axis), has a logarithmic scale, while the absolute frequencies (Y-axis) have a linear scale

Fig. 22.2 Histogram of τ_K^D computed on Tenaris stock prices, for K = 1%. The time, expressed in minutes (X-axis), has a logarithmic scale, while the absolute frequencies (Y-axis) have a linear scale

To understand the behaviour of these measures as the threshold K moves, in Tables 22.2 and 22.3 we report the main descriptive statistics of τ_K^D and S for several levels of K (K = 0.5%, K = 1%, K = 1.5%, K = 2%, K = 3%). Focusing on the mean values of τ_K^D in Table 22.2, we can note that, as K increases, τ_K^D also increases on average, for both stocks. However, the Tenaris asset takes longer than the Fiat asset to reach each threshold. For instance, a 0.5%-variation in the drawdown is achieved on average in 31 min for the Fiat stock and in 35 min for the Tenaris stock. To obtain a greater change, such as 2%, it takes 171 and 220 min on average, respectively. This implies both that more extreme events take more time to occur for both assets and that Tenaris needs more time than Fiat to overcome any K value.


Fig. 22.3 Histogram of S computed on Fiat stock prices, for K = 1%. The time, expressed in minutes (X-axis), has a logarithmic scale, while the absolute frequencies (Y-axis) have a linear scale

Fig. 22.4 Histogram of S computed on Tenaris stock prices, for K = 1%. The time, expressed in minutes (X-axis), has a logarithmic scale, while the absolute frequencies (Y-axis) have a linear scale

Likewise, in Table 22.3 we observe that, as K grows, the mean values of S grow too, for both assets. However, Tenaris turns out to be slower than Fiat. Just to give an example, a variation of 0.5% occurs on average at an approximate speed of 13 min for the Fiat asset and 14 min for the Tenaris asset. Conversely, to attain a change of 2%, it takes about 115 and 144 min, respectively. This signifies that more extreme events are reached more slowly by both assets and that Tenaris appears to be less fast for each value of the threshold K.
We reproduced the returns through ARMA, GARCH and EGARCH models. This choice is motivated by the fact that these models are widely used in finance. We obtained the parameters of each model by applying a maximum likelihood optimization algorithm in Matlab. Using the optimized parameters and by means of a Matlab function, we generated a series of returns for each considered model, using Gaussian


Table 22.2 Statistics of τ_K^D for different levels of K

Stock   K (%)   Mean      Standard deviation   Skewness   Kurtosis
F       0.5     31.455    62.982               4.244      23.432
F       1       101.358   129.424              1.613      4.452
F       1.5     138.033   141.105              1.128      2.960
F       2       170.723   148.522              0.770      2.242
F       3       224.982   154.684              0.272      1.641
TEN     0.5     35.481    56.484               3.677      19.641
TEN     1       125.335   138.740              1.199      3.111
TEN     1.5     187.395   156.899              0.514      1.742
TEN     2       219.778   158.605              0.227      1.540
TEN     3       264.596   130.422              −0.182     1.650

Table 22.3 Statistics of S for different levels of K

Stock   K (%)   Mean      Standard deviation   Skewness   Kurtosis
F       0.5     12.495    27.650               6.408      53.634
F       1       58.493    84.176               2.741      10.842
F       1.5     84.865    100.649              1.913      6.257
F       2       115.353   117.269              1.388      4.147
F       3       172.048   137.352              2.205      0.670
TEN     0.5     13.571    21.992               4.494      33.077
TEN     1       61.393    77.971               2.171      7.882
TEN     1.5     110.383   110.552              1.369      4.087
TEN     2       143.723   123.401              0.931      2.881
TEN     3       184.704   130.422              0.422      2.046

innovations. In order to make comparisons between real and simulated series, we produced synthetic data having the same length as the real ones.
A very important empirical feature of financial markets is that returns are not autocorrelated, while the squares of returns, or their absolute values, are characterized by long-range autocorrelation. To test this feature, we computed the autocorrelation functions for both real squared returns and synthetic squared returns. Recall that the autocorrelation of the squared returns (R^2) at time lag τ is given by:

\Sigma(\tau) = \frac{\mathrm{Cov}\left(R^2(t+\tau), R^2(t)\right)}{\mathrm{Var}\left(R^2(t)\right)}.    (22.22)
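A direct sample estimator of Σ(τ) can be sketched as follows (our illustration in pure Python; in practice a vectorized implementation would be used on half a million returns):

```python
def squared_return_acf(returns, max_lag=100):
    """Sample estimate of Sigma(tau) = Cov(R^2(t+tau), R^2(t)) / Var(R^2(t))
    for tau = 1..max_lag, following Eq. (22.22)."""
    r2 = [r * r for r in returns]
    mean = sum(r2) / len(r2)
    dev = [x - mean for x in r2]
    var = sum(d * d for d in dev) / len(dev)
    acf = []
    for tau in range(1, max_lag + 1):
        cov = sum(dev[t + tau] * dev[t] for t in range(len(dev) - tau)) / (len(dev) - tau)
        acf.append(cov / var)
    return acf
```

For a volatility-clustered series the estimates stay well above zero over many lags, which is the long-range autocorrelation feature tested in the text.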

The estimates of Σ(τ) were computed by varying τ from 1 min to 100 min. In Table 22.4, we show the Mean Absolute Percentage Error (MAPE) between real


Table 22.4 Mean absolute percentage error (MAPE) between real and synthetic autocorrelation functions

Stock   Model          MAPE
F       ARMA (1,1)     0.9908
F       ARMA (1,2)     0.9935
F       ARMA (2,1)     0.9908
F       ARMA (2,2)     0.9826
F       GARCH (1,1)    0.9538
F       GARCH (1,2)    0.9581
F       GARCH (2,1)    0.9505
F       EGARCH (1,1)   0.8699
F       EGARCH (1,2)   0.6826
F       EGARCH (2,1)   0.8285
F       EGARCH (2,2)   0.6657
TEN     ARMA (1,1)     0.9881
TEN     ARMA (1,2)     0.9882
TEN     ARMA (2,1)     0.9884
TEN     ARMA (2,2)     0.9898
TEN     GARCH (1,1)    0.9245
TEN     GARCH (1,2)    0.9116
TEN     GARCH (2,1)    0.9096
TEN     EGARCH (1,1)   0.8699
TEN     EGARCH (1,2)   0.5998
TEN     EGARCH (2,1)   0.6731
TEN     EGARCH (2,2)   0.6228

and simulated autocorrelation functions. It can be observed that EGARCH models reproduce the volatility autocorrelation better than the other chosen models, on both stocks.
Furthermore, in Figs. 22.5, 22.6, 22.7, 22.8, 22.9, 22.10, 22.11, 22.12, 22.13, 22.14, 22.15 and 22.16, we analyze the behavior of the average values of the two risk measures on real and simulated prices, as K varies. Specifically, we consider 121 values of the threshold K (K ranges from 0 to 0.0120) and, for each fixed K, we calculate both the real and the simulated average values. In detail, in Figs. 22.5, 22.6, 22.7, 22.8, 22.9 and 22.10, we show a comparison between the mean values of τ_K^D as a function of K, on real and synthetic series, for both stocks. Note that the Upper and Lower Bounds are computed on the simulated series. To complement these graphics, in Table 22.5 we fix three levels of the threshold K (K = 0.003, K = 0.007, K = 0.009) and display the mean values of τ_K^D for real and simulated series. We note that, for each selected K-value, the average values coming from the EGARCH and ARMA models are the closest to the real average values of the Fiat and Tenaris stocks, respectively.
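The MAPE formula itself is not spelled out in the chapter; assuming the standard definition, i.e. the mean of |real − simulated|/|real| over the common lags, a sketch is:

```python
def mape(real, simulated):
    """Mean absolute percentage error between two equal-length sequences
    (here, real vs synthetic autocorrelation values over the same lags).
    Standard definition assumed; see lead-in."""
    if len(real) != len(simulated):
        raise ValueError("sequences must have the same length")
    return sum(abs(r - s) / abs(r) for r, s in zip(real, simulated)) / len(real)
```

A MAPE close to 1 (as for the ARMA rows in Table 22.4) means the synthetic autocorrelations miss essentially all of the real ones; lower values (the EGARCH rows) indicate a better reproduction.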


Fig. 22.5 Plots of the averages of τ_K^D as a function of K, both for real data (F) and simulated data with ARMA models. Upper (UB) and Lower (LB) Bounds calculated on simulated data

Fig. 22.6 Plots of the averages of τ_K^D as a function of K, both for real data (TEN) and simulated data with ARMA models. Upper (UB) and Lower (LB) Bounds calculated on simulated data

Fig. 22.7 Plots of the averages of τ_K^D as a function of K, both for real data (F) and simulated data with GARCH models. Upper (UB) and Lower (LB) Bounds calculated on simulated data


Fig. 22.8 Plots of the averages of τ_K^D as a function of K, both for real data (TEN) and simulated data with GARCH models. Upper (UB) and Lower (LB) Bounds calculated on simulated data

Fig. 22.9 Plots of the averages of τ_K^D as a function of K, both for real data (F) and simulated data with EGARCH models. Upper (UB) and Lower (LB) Bounds calculated on simulated data

Fig. 22.10 Plots of the averages of τ_K^D as a function of K, both for real data (TEN) and simulated data with EGARCH models. Upper (UB) and Lower (LB) Bounds calculated on simulated data


Table 22.5 Average values of τ_K^D for both real and simulated data, considering different levels of K (K = 0.003, K = 0.007, K = 0.009)

Stock   Model           K = 0.003   K = 0.007   K = 0.009
F       Real data       11.493      61.733      79.649
F       ARMA (1,1)      31.593      127.043     168.298
F       ARMA (1,2)      31.706      120.656     167.943
F       ARMA (2,1)      31.194      122.109     161.389
F       ARMA (2,2)      30.266      121.109     164.959
F       GARCH (1,1)     31.215      117.384     158.666
F       GARCH (1,2)     32.220      113.689     153.947
F       GARCH (2,1)     29.981      110.580     154.748
F       EGARCH (1,1)    26.178      89.209      126.617
F       EGARCH (1,2)    19.115      66.707      102.842
F       EGARCH (2,1)    24.607      90.428      128.890
F       EGARCH (2,2)    18.433      63.057      96.505
TEN     Real data       36.634      116.342     137.716
TEN     ARMA (1,1)      23.991      89.451      131.310
TEN     ARMA (1,2)      23.433      92.461      138.651
TEN     ARMA (2,1)      23.300      92.271      134.256
TEN     ARMA (2,2)      24.279      94.404      137.832
TEN     GARCH (1,1)     22.865      81.859      120.126
TEN     GARCH (1,2)     22.721      83.355      118.292
TEN     GARCH (2,1)     22.181      77.996      113.581
TEN     EGARCH (1,1)    18.099      55.607      80.622
TEN     EGARCH (1,2)    52.131      74.261      74.262
TEN     EGARCH (2,1)    15.976      53.851      77.316
TEN     EGARCH (2,2)    25.592      52.909      76.074

Similarly, in Figs. 22.11, 22.12, 22.13, 22.14, 22.15 and 22.16, we propose the same comparison for the measure S. By means of Table 22.6, we compare these figures at three selected levels of K (K = 0.003, K = 0.007, K = 0.009). It can be observed that the measure S has the same behavior as the measure τ_K^D, for each chosen value of K and on both stocks. In fact, the mean values associated with the EGARCH and ARMA models are the most similar to the real ones, for the Fiat and Tenaris assets, respectively.
To gauge the distance between the real and simulated distributions of the two risk measures under study, we compute the Kullback–Leibler divergence. We recall that the


Fig. 22.11 Plots of the averages of S as a function of K, both for real data (F) and simulated data with ARMA models. Upper (UB) and Lower (LB) Bounds calculated on simulated data

Fig. 22.12 Plots of the averages of S as a function of K, both for real data (TEN) and simulated data with ARMA models. Upper (UB) and Lower (LB) Bounds calculated on simulated data

Fig. 22.13 Plots of the averages of S as a function of K, both for real data (F) and simulated data with GARCH models. Upper (UB) and Lower (LB) Bounds calculated on simulated data


Fig. 22.14 Plots of the averages of S as a function of K, both for real data (TEN) and simulated data with GARCH models. Upper (UB) and Lower (LB) Bounds calculated on simulated data

Fig. 22.15 Plots of the averages of S as a function of K, both for real data (F) and simulated data with EGARCH models. Upper (UB) and Lower (LB) Bounds calculated on simulated data

Fig. 22.16 Plots of the averages of S as a function of K, both for real data (TEN) and simulated data with EGARCH models. Upper (UB) and Lower (LB) Bounds calculated on simulated data


Table 22.6 Average values of S for both real and simulated data, considering different levels of K (K = 0.003, K = 0.007, K = 0.009)

Stock   Model           K = 0.003   K = 0.007   K = 0.009
F       Real data       3.713       28.154      41.909
F       ARMA (1,1)      10.304      40.583      59.533
F       ARMA (1,2)      10.732      42.176      62.733
F       ARMA (2,1)      10.000      41.853      60.876
F       ARMA (2,2)      9.742       40.279      59.732
F       GARCH (1,1)     10.024      39.475      58.920
F       GARCH (1,2)     10.299      38.986      60.290
F       GARCH (2,1)     9.825       38.380      56.679
F       EGARCH (1,1)    7.864       27.805      42.506
F       EGARCH (1,2)    6.300       21.673      33.570
F       EGARCH (2,1)    7.771       27.538      42.173
F       EGARCH (2,2)    5.816       20.427      30.938
TEN     Real data       10.652      38.470      50.377
TEN     ARMA (1,1)      7.881       30.482      46.852
TEN     ARMA (1,2)      7.700       30.860      47.124
TEN     ARMA (2,1)      7.910       30.269      45.505
TEN     ARMA (2,2)      7.869       30.512      45.669
TEN     GARCH (1,1)     7.701       27.035      41.156
TEN     GARCH (1,2)     7.559       27.903      40.586
TEN     GARCH (2,1)     7.353       26.503      40.073
TEN     EGARCH (1,1)    5.720       18.012      26.467
TEN     EGARCH (1,2)    4.934       16.369      24.196
TEN     EGARCH (2,1)    5.100       17.356      25.279
TEN     EGARCH (2,2)    4.801       16.036      22.838

Kullback–Leibler divergence of the distribution Q from the distribution P, denoted by D_KL(P||Q), measures the information lost when Q is used to approximate P. It is defined as follows:

D_{KL}(P||Q) := \sum_i P(i) \log_2 \frac{P(i)}{Q(i)},    (22.23)

where P and Q are discrete distributions. In our framework, P is the real distribution of τ_K^D or S for a fixed K-value; accordingly, Q stands for the synthetic distribution of the corresponding risk measure for the selected K-value. In Table 22.7, we show the Kullback–Leibler divergence for the measure τ_K^D, considering three levels of the threshold K (K = 0.003, K = 0.007, K = 0.009). It can be noted that, for the Fiat stock, EGARCH models have the lowest distance for K = 0.003 and K = 0.007, while for K = 0.009 ARMA models perform better than the others. As for the Tenaris stock, ARMA models have less distance for K = 0.003, but the greatest distance for the other values of K. Conversely, in Table 22.8, we propose the same analysis for the measure S. It seems that EGARCH and GARCH models have the smallest distance for every selected K-value, for the Fiat and Tenaris assets, respectively.
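Definition (22.23) translates directly into code. The sketch below (our illustration) treats P and Q as equal-length probability vectors, e.g. normalized histogram bin frequencies, and uses the usual conventions that terms with P(i) = 0 contribute zero and that D_KL is infinite when Q misses part of the support of P:

```python
import math

def kl_divergence(p, q):
    """D_KL(P||Q) = sum_i P(i) * log2(P(i)/Q(i)), Eq. (22.23), for discrete
    distributions given as equal-length sequences of probabilities."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi <= 0.0:
                return float("inf")   # Q fails to cover the support of P
            total += pi * math.log2(pi / qi)
    return total
```

The divergence is zero if and only if the two distributions coincide, and grows as the synthetic distribution departs from the real one, which is exactly how Tables 22.7 and 22.8 are read.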


Table 22.7 Kullback–Leibler divergence computed for the measure τ_K^D, considering three values of K (K = 0.003, K = 0.007, K = 0.009)

Stock   Model           K = 0.003   K = 0.007   K = 0.009
F       ARMA (1,1)      0.6168      0.2701      0.0908
F       ARMA (1,2)      0.7464      0.2779      0.1045
F       ARMA (2,1)      0.6557      0.2675      0.1010
F       ARMA (2,2)      0.6521      0.2774      0.1112
F       GARCH (1,1)     0.6670      0.2341      0.1040
F       GARCH (1,2)     0.6764      0.8106      0.1065
F       GARCH (2,1)     0.5970      0.7483      0.1099
F       EGARCH (1,1)    0.4675      0.2116      0.1933
F       EGARCH (1,2)    0.3282      0.1900      0.2181
F       EGARCH (2,1)    0.4785      0.2128      0.1997
F       EGARCH (2,2)    0.3171      0.1871      0.2163
TEN     ARMA (1,1)      0.0155      0.0334      0.0331
TEN     ARMA (1,2)      0.0121      0.0307      0.0233
TEN     ARMA (2,1)      0.0102      0.0398      0.0313
TEN     ARMA (2,2)      0.0182      0.0325      0.0263
TEN     GARCH (1,1)     0.0081      0.0161      0.0187
TEN     GARCH (1,2)     0.0073      0.0178      0.0214
TEN     GARCH (2,1)     0.0066      0.0165      0.0259
TEN     EGARCH (1,1)    0.0408      0.0093      0.0137
TEN     EGARCH (1,2)    0.0696      0.0136      0.0152
TEN     EGARCH (2,1)    0.0892      0.0127      0.0228
TEN     EGARCH (2,2)    0.0891      0.0122      0.0129

Table 22.8 Kullback–Leibler divergence computed for the measure S, considering three values of K (K = 0.003, K = 0.007, K = 0.009)

Stock   Model           K = 0.003   K = 0.007   K = 0.009
F       ARMA (1,1)      1.2362      0.7430      0.3760
F       ARMA (1,2)      1.0786      0.7943      0.4078
F       ARMA (2,1)      1.1535      0.8034      0.4435
F       ARMA (2,2)      1.0350      0.7963      0.4493
F       GARCH (1,1)     0.7470      0.5000      0.3398
F       GARCH (1,2)     0.7890      0.4547      0.3122
F       GARCH (2,1)     0.6909      0.4571      0.3158
F       EGARCH (1,1)    0.6106      0.3128      0.2750
F       EGARCH (1,2)    0.3837      0.1942      0.2204
F       EGARCH (2,1)    0.6421      0.3027      0.2594
F       EGARCH (2,2)    0.5590      0.1648      0.2087
TEN     ARMA (1,1)      0.2415      0.1616      0.1707
TEN     ARMA (1,2)      1.5560      0.1636      0.1814
TEN     ARMA (2,1)      0.3898      0.4238      0.2077
TEN     ARMA (2,2)      0.2107      0.1772      0.1836
TEN     GARCH (1,1)     0.1061      0.0182      0.0263
TEN     GARCH (1,2)     0.0899      0.0184      0.0278
TEN     GARCH (2,1)     0.1122      0.0202      0.0374
TEN     EGARCH (1,1)    1.2340      0.0681      0.0430
TEN     EGARCH (1,2)    8.4959      0.0830      0.0605
TEN     EGARCH (2,1)    3.1565      0.0790      0.0495
TEN     EGARCH (2,2)    4.2859      0.0969      0.0609


22.5 Conclusions

In this chapter, we considered two risk measures, the drawdown of a fixed level and the speed of market crash, which are useful in managing market crises. In particular, we analyzed the behaviour of these indicators on high-frequency data relating to two assets listed on 'Borsa Italiana'. By applying ARMA, GARCH and EGARCH models to the dataset, we generated synthetic series with which we made comparisons. Firstly, we tested the ability of each considered model to reproduce the volatility autocorrelation by computing the MAPE between real and simulated autocorrelation functions. Secondly, we explored the capacity of each synthetic series to reproduce the risk measures as the threshold changes. Thirdly, using the Kullback–Leibler divergence, we quantified the distance between real and simulated distributions for selected values of K. Globally, the chosen econometric models give partially satisfactory results, which point to the need for better-performing models to reproduce the measures under study.

References

1. Andersen, T.G., Bollerslev, T., Frederiksen, P., Nielsen, M.O.: Continuous-time models, realized volatilities, and testable distribution implications for daily stock returns. J. Appl. Econometrics 25, 233–261 (2010)
2. Bollerslev, T.: Generalized autoregressive conditional heteroskedasticity. J. Econometrics 31, 307–327 (1986)
3. Box, G.E.P., Jenkins, G.: Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco (1970)
4. Christoffersen, P.F.: Elements of Financial Risk Management. Elsevier, Amsterdam (2012)
5. Chekhlov, A., Uryasev, S.P., Zabarankin, M.: Drawdown measure in portfolio optimization. Int. J. Theor. Appl. Finance 8, 13–58 (2005)
6. Cont, R., Deguest, R., Scandolo, G.: Robustness and sensitivity analysis of risk measurement procedures. Quant. Finance 10, 593–606 (2010)
7. D'Amico, G., Petroni, F.: A semi-Markov model with memory for price changes. J. Stat. Mech. Theory Exp. (2011). https://doi.org/10.1088/1742-5468/2011/12/P12009
8. D'Amico, G., Petroni, F.: Multivariate high-frequency financial data via semi-Markov processes. Markov Process. Relat. Fields 20, 415–434 (2014)
9. D'Amico, G., Petroni, F.: Copula based multivariate semi-Markov models with applications in high-frequency finance. Eur. J. Oper. Res. 267, 765–777 (2018)
10. DeGroot, M.H., Schervish, M.J.: Probability and Statistics. Addison-Wesley, Boston (2012)
11. Engle, R.F.: Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflations. Econometrica 50, 987–1007 (1982)
12. Nelson, D.B.: Conditional heteroskedasticity in asset return: A new approach. Econometrica 59, 347–370 (1991)
13. Pospisil, L., Vecer, J., Hadjiliadis, O.: Formulas for stopped diffusion processes with stopping times based on drawdowns and drawups. Stoch. Process. Their Appl. 119, 2563–2578 (2009)
14. Rotundo, G., Navarra, M.: On the maximum drawdown during speculative bubbles. Phys. A 382, 235–246 (2007)
15. Silvestrov, D.S., Stenberg, F.: A pricing process with stochastic volatility controlled by a semi-Markov process. Commun. Stat. Theor. M. 33, 591–608 (2004)


16. Slutzky, E.: The summation of random causes as the source of cyclic processes. Econometrica 5, 106–146 (1937)
17. Swishchuk, A., Vadori, N.: A semi-Markovian modeling of limit order markets. SIAM J. Financ. Math. 8, 240–273 (2017)
18. Tsay, R.S.: Analysis of Financial Time Series. Wiley, New York (2010)
19. Vecer, J.: Preventing portfolio losses by hedging maximum drawdown. Wilmott 5, 1–8 (2007)
20. Wold, H.: A Study in the Analysis of Stationary Time Series. Almqvist and Wiksells, Uppsala (1938)
21. Yule, G.U.: Why do we sometimes get nonsense-correlations between time series? A study in sampling and the nature of the time series. J. Roy. Stat. Soc. 89, 1–63 (1926)
22. Zhang, H., Hadjiliadis, O.: Drawdown and the speed of market crash. Methodol. Comput. Appl. Probab. 14, 739–752 (2012)

Chapter 23

Forecasting and Optimizing Patient Enrolment in Clinical Trials Under Various Restrictions

Vladimir Anisimov and Matthew Austin

Abstract Design and forecasting of patient enrolment is among the greatest challenges that the clinical research enterprise faces today, as inefficient enrolment can be a major cause of drug development delays. Therefore, the development of innovative statistical and artificial intelligence technologies for improving the efficiency of clinical trial operations is an imperative need. This chapter describes further developments in the innovative statistical methodology for modelling and forecasting patient enrolment. The underlying technique uses a Poisson-gamma enrolment model developed by Anisimov and Fedorov in previous publications and is extended to analytic modelling of the enrolment on country/region level. For this purpose, a new analytic technique based on the approximation of the enrolment process in a country/region by a Poisson-gamma process with aggregated parameters is developed. Another innovative direction is modelling the enrolment under some restrictions (enrolment caps in countries). Some discussion on using historic trials for better prediction of the enrolment in new trials is provided. These results are used for solving the problem of optimal trial enrolment design: find an optimal allocation of centres/countries that minimizes the total trial cost given that the probability of reaching an enrolment target is no less than some prescribed probability. Different techniques to find an optimal solution for this high-dimensional optimization problem are proposed.

Keywords Patient enrolment · Poisson-gamma model · Forecasting enrolment · Restricted enrolment · Optimal enrolment design

MSC 2020: 60G55 · 90C30

V. Anisimov (B) Center for Design and Analysis, Amgen Inc., Cambridge, UK e-mail: [email protected] M. Austin Center for Design and Analysis, Amgen Inc., Thousand Oaks, CA, USA e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_23



23.1 Introduction

The multibillion clinical trials market is in outstanding need of innovative statistical and artificial intelligence technologies for improving the efficiency of clinical trial operations, as 80% of clinical trials fail to meet enrolment timelines. Statistical design and trial operation are affected by stochasticity in patient enrolment and in the occurrence of various events. The complexity of clinical trials and the multi-state hierarchic structure of different operational processes require developing new predictive analytic techniques for efficient data analysis, forecasting/monitoring and optimal decision making.
There are many challenging problems in trial design. According to research from the Tufts Center for the Study of Drug Development [28], while nine out of 10 clinical trials worldwide meet their patient enrolment goals, reaching those targets typically means that drug developers need to nearly double their original timelines. "Patient recruitment and retention are among the greatest challenges that the clinical research enterprise faces today, and they are a major cause of drug development delays," said Ken Getz, director of sponsored research at the Tufts Center for the Study of Drug Development.
Patient enrolment is one of the main engines driving the operation of contemporary trials. There are many uncertainties in the input data and randomness in the enrolment over time. The enrolment stage is very costly, and it also affects many other operational characteristics: the follow-up stage, the supply chain and the time to deliver the drug to market. Many companies still use ad-hoc simplified or deterministic models. This may lead to inefficient designs, underpowered and delayed trials, extra costs and drug waste. The key questions for all pharmaceutical companies are: How can the predictability of patient enrolment be improved, with the goal of improving the efficiency and quality of clinical trial operation?
Which countries, and how many sites in each, should we select for a study so that it enrols the fastest at minimum cost while achieving a desired probability of success?
Historically, the main attention of statisticians working in clinical research has been paid to the statistical trial design and sample size analysis, without giving much consideration to the impact of the patient enrolment process on the whole study operational design. However, as at any future time point the numbers of patients at different levels and in different cohorts are uncertain, using proper stochastic models to account for these uncertainties is a key point, because this allows predicting the times of interim and final analyses and evaluating the resources required to reach the trial goals in time.
Nowadays, late-phase clinical trials typically involve hundreds or even thousands of patients recruited by many clinical centers in different countries. Some controversies in the analysis of multicentre clinical trials are considered in [26, 27]. Therefore, we investigate clinical trials where the patients are recruited by multiple clinical centres. At the initial stage of trial design and at the interim stage, the imperative tasks are predicting the number of patients to be recruited in different countries/regions, as this impacts the whole trial operational design.


There is quite an extensive literature on using different approaches for enrolment modelling. A large number of papers is devoted to using mixed Poisson models. In [30] the authors use a Poisson process with gamma distributed rate to model the global enrolment process. Several authors [17, 26, 27] use Poisson processes with fixed recruitment rates to describe the enrolment process in different clinical centres. However, in real trials different centres typically have different capacity and productivity; thus, the enrolment rates in different centres vary. To reflect this variation, Anisimov and Fedorov introduced a so-called Poisson-gamma model, where the variation in the rates is described using a gamma distribution (see [11, 12]). This model fits the empirical Bayesian approach, where the prior distribution of the rates is a gamma distribution whose parameters at the initial stage are evaluated either using historic data or expert estimates of study managers. In [12] a maximum likelihood technique for estimating the parameters of the rates is also proposed, together with a Bayesian technique for adjusting the posterior distribution of the rates at any interim time using enrolment data from the individual centres. Later, in [19], a similar model for modelling enrolment using a Poisson process with gamma distributed rate was independently considered, but assuming that there is only one clinical center. To capture wider realistic scenarios, the technique based on the Poisson-gamma model was developed further to account for random delays and closure of clinical centres and for the analysis of some performance measures [1, 2, 10]. The Poisson-gamma model was used as a baseline methodology in [3] for modelling event counts in event-driven trials, in [4] for forecasting various trial operational characteristics associated with enrolment, and in [9] for centralized statistical monitoring of clinical trial enrolment performance.
The Poisson-gamma model was also used in [13] for evaluating the parameters of the model using meta-analytic techniques on historic trials, and in [23, 24] to investigate the opportunity of using a Pareto distribution for the enrolment rates and to evaluate the duration of recruitment when historic data are available. A survey on using mixed Poisson models is provided in the discussion paper [5]. Note that a mixed Poisson-gamma distribution and the associated negative binomial distribution were also used in other applications, e.g. in [15] for describing the variation of positive variables in modelling flows of various events.
There are also other approaches to enrolment modelling described in the literature; however, they deal mainly with the analysis of global enrolment and therefore have some limitations. Specifically, these approaches typically require a rather large number of centres and patients (to use some approximations) and cannot be applied at the centre/country level for evaluating enrolment performance and forecasting. For the different techniques used, the reader can consult the survey papers [14, 20, 21] and also the discussion paper [6].
The purpose of this chapter is to develop further the basic methodology for analytic modelling of enrolment on different levels, to consider practical cases of restrictions on the enrolment at country level, and also to propose techniques for solving the problem of optimal enrolment design given some cost/timelines constraints.


As we need to model the enrolment characteristics on different levels, the approaches oriented to modelling global enrolment are not suitable here. As the baseline model we use a Poisson-gamma enrolment model for modelling enrolment at centre level. The enrolment processes at country/region level are described by mixed Poisson processes and depend on the centres' initiation and closure.
The chapter is organized as follows. Section 23.1 is devoted to some background and a literature survey. In Sect. 23.2 a Poisson-gamma enrolment model for unrestricted (competitive) enrolment is introduced, as these results are used in the further presentation. Section 23.3 is devoted to predicting enrolment under upper restrictions at country level and to the investigation of the impact of enrolment caps. A brief discussion on using historic data for better prediction of enrolment rates for new trials is also provided. Section 23.4 is devoted to a discussion of different approaches/techniques for creating the optimal enrolment design (choosing an optimal centres/countries allocation) given some cost/timelines constraints. Some results on the approximation of the convolution of Poisson-gamma variables and on the calculation of some characteristics of the restricted enrolment are given in the Appendix.

23.2 Enrolment Modelling In this section, in Sect. 23.2.1 we review some basic notation and properties of a Poisson-gamma enrolment model (referred to as a PG model) that will be used throughout the chapter. The presentation here mainly follows [2]. Section 23.2.2 presents a novel analytic technique for modelling the enrolment process on country level using an approximation by a Poisson-gamma process. These results are essential for developing the technique for modelling enrolment under some restrictions on country level, which is investigated in Sect. 23.3.

23.2.1 Modelling Unrestricted Enrolment Consider a clinical trial where the patients are recruited by different clinical centres and, after a screening period, are randomized to different treatments. Most clinical trials use so-called competitive enrolment (no restrictions on the number of patients to be recruited in particular centres/regions). Nevertheless, sometimes due to geographical or population reasons, clinical teams may use restricted enrolment, e.g. in some countries/regions upper (or lower) thresholds may be set (say, to enrol no more (or no fewer) patients than a given number). In this subsection we consider first a competitive enrolment (no restrictions). Assume that the patients arrive at each clinical centre one at a time and independently of each other. Then the natural model to describe the arrival flow in centre i is a Poisson process with some rate λi. As the value of the rate may not be certain and can be evaluated only up to some uncertainty, it is natural to model a variation in

23 Forecasting and Optimizing Patient Enrolment in Clinical …


the rate using a gamma distribution. Moreover, as patients arrive at different centres independently, we assume that the rates λi are jointly independent random variables. This enrolment model was developed by Anisimov and Fedorov and is called a Poisson-gamma (PG) model [11, 12]. It was extended further in several directions in [2, 7, 8]. Let us introduce some basic notation that will be used throughout the chapter. Denote by Πa(t) an ordinary homogeneous Poisson process with rate a, so, for any t > 0,

$$P(\Pi_a(t) = k) = e^{-at}\,\frac{(at)^k}{k!}, \quad k = 0, 1, \ldots$$

where we set 0! = 1 and 0^0 = 1. Denote also by Π(a) a Poisson random variable with parameter a. Let Ga(α, β) be a gamma distributed random variable with parameters (α, β) (shape and rate) and probability density function

$$f(x, \alpha, \beta) = \frac{\beta^\alpha x^{\alpha-1} e^{-\beta x}}{\Gamma(\alpha)}, \quad x > 0, \qquad (23.1)$$

where $\Gamma(\alpha) = \int_0^\infty e^{-x} x^{\alpha-1}\,dx$ is the gamma function. Assume now that the rate λ has a gamma distribution with parameters (α, β) and introduce a mixed (doubly stochastic) Poisson process Πλ(t). According to [16], Πλ(t) is a Poisson-gamma (PG) process with parameters (t, α, β) and

$$P(\Pi_\lambda(t) = k) = \frac{t^k \beta^\alpha}{(\beta+t)^{\alpha+k}}\,\frac{\Gamma(\alpha+k)}{k!\,\Gamma(\alpha)}, \quad k = 0, 1, \ldots \qquad (23.2)$$

For convenience denote also by PG(t, α, β) a PG random variable that has the same distribution as Πλ(t). For t = 1, Πλ(1) has the same distribution as Π(λ) (a mixed Poisson variable); in this case for simplicity we use the notation PG(α, β) instead of PG(1, α, β). Note that according to [22, p. 199], the distribution of Πλ(t) in (23.2) can also be described as a negative binomial distribution: for any t > 0,

$$P(\Pi_\lambda(t) = k) = P\Big(NB\Big(\alpha, \frac{\beta}{\beta+t}\Big) = k\Big), \quad k = 0, 1, \ldots \qquad (23.3)$$

where NB(α, p) denotes a random variable which has a negative binomial distribution with size α and probability p:

$$P(NB(\alpha, p) = k) = \frac{\Gamma(\alpha+k)}{k!\,\Gamma(\alpha)}\, p^\alpha (1-p)^k, \quad k = 0, 1, \ldots$$

As the R programming language provides standard functions for computing the negative binomial distribution, relation (23.3) can be used for calculating the distributions of PG processes.


For example, the distribution (23.2) can be calculated using the R function

dnbinom(k, size = alpha, prob = beta/(beta + t))

To calculate the cumulative probability P(Πλ(t) ≤ L), where λ = Ga(α, β), we can use the function

pnbinom(L, size = alpha, prob = beta/(beta + t))
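For readers working outside R, the same computation can be cross-checked with a short Python sketch; the function names here are our own, written directly from relations (23.2) and (23.3), and only the standard library is used:

```python
import math

def pg_pmf(k, t, alpha, beta):
    # P(PG(t, alpha, beta) = k), relation (23.2), computed on the log scale
    log_p = (k * math.log(t) + alpha * math.log(beta)
             + math.lgamma(alpha + k) - math.lgamma(k + 1) - math.lgamma(alpha)
             - (alpha + k) * math.log(beta + t))
    return math.exp(log_p)

def nb_pmf(k, size, prob):
    # negative binomial pmf in the same parameterisation as R's dnbinom
    return math.exp(math.lgamma(size + k) - math.lgamma(k + 1)
                    - math.lgamma(size)) * prob ** size * (1 - prob) ** k

# relation (23.3): PG(t, alpha, beta) coincides with NB(alpha, beta/(beta + t))
t, alpha, beta = 2.0, 1.2, 3.0
assert abs(pg_pmf(5, t, alpha, beta) - nb_pmf(5, alpha, beta / (beta + t))) < 1e-12
assert abs(sum(pg_pmf(k, t, alpha, beta) for k in range(400)) - 1.0) < 1e-9
```

The log-scale evaluation via `math.lgamma` avoids overflow of the gamma function for large shape parameters.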

Now let us return to modelling enrolment. Denote by ni(t) the enrolment process in centre i (the number of patients recruited in the time interval [0, t]). Denote also by ui the date of the activation of centre i. These dates may not be known in advance at the initial stage; e.g., ui can be considered as a uniform random variable in some interval [ai, bi] [1, 7, 10]. The cases of beta and gamma distributions are considered in [8]. However, to avoid rather complicated calculations, we restrict our attention to the case when the values ui are known. Then the PG enrolment model assumes that in centre i the enrolment process ni(t) is a mixed Poisson process with rate λi in the time interval [ui, ∞), where λi is viewed as a gamma distributed variable Ga(αi, βi). Thus, ni(t) is a Poisson process with time-dependent rate λi(t), where λi(t) = 0 for t ≤ ui and λi(t) = λi for t > ui. Consider also a more convenient representation via a cumulative rate. Denote x(t, u) = max(0, t − u) (the duration of active enrolment at time t for a centre activated at time u). So, if centre i is active at time t, then x(t, ui) = t − ui. Then the cumulative rate of the process ni(t) is Λi(t, ui) = λi x(t, ui). This means that if λi = Ga(αi, βi), then ni(t) is a PG process with parameters (x(t, ui), αi, βi), and the distribution of ni(t) can be calculated using (23.2), where in the right-hand side we should use x(t, ui) instead of t, and the parameters (αi, βi).

23.2.2 Modelling Enrolment on Country Level Consider some country s with Ns centres. Denote by Is the set of indexes of these centres. Then the enrolment process n(Is, t) in this country is a mixed Poisson process with the cumulative rate

$$\Lambda(I_s, t) = \sum_{i \in I_s} \lambda_i\, x(t, u_i). \qquad (23.4)$$

Consider a special case when the rates in all centres of this country have the same parameters (α, β) of a gamma distribution. Assume in addition that all ui ≡ u and t > u. Then for all i, x(t, ui) = t − u, and in this very special case, in distribution,

$$\Lambda(I_s, t) = (t - u)\,Ga(\alpha N_s, \beta). \qquad (23.5)$$

Thus, n(Is , t) is a PG process with parameters (t − u, α Ns , β) and we can use again relation (23.2) to calculate its distribution.
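This identity also holds at the level of the whole distribution and can be checked numerically: with identical gamma parameters and a common activation date, the convolution of the Ns individual PG distributions coincides with a single PG distribution with shape αNs. A Python sketch with assumed illustrative parameters:

```python
import math

def pg_pmf(k, x, a, b):
    # pmf of PG(x, a, b), relation (23.2) with exposure x in place of t
    return math.exp(k * math.log(x) + a * math.log(b) + math.lgamma(a + k)
                    - math.lgamma(k + 1) - math.lgamma(a) - (a + k) * math.log(b + x))

def convolve(p, q):
    # distribution of the sum of two independent counts
    r = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

x, alpha, beta, Ns = 3.0, 0.8, 2.0, 4   # assumed: t - u = 3 and Ns = 4 identical centres
one = [pg_pmf(k, x, alpha, beta) for k in range(60)]
total = [1.0]
for _ in range(Ns):
    total = convolve(total, one)
merged = [pg_pmf(k, x, Ns * alpha, beta) for k in range(60)]
assert all(abs(total[k] - merged[k]) < 1e-9 for k in range(30))
```

The agreement reflects that a sum of independent Ga(α, β) rates with a common β is Ga(Nsα, β), so the merged process stays within the PG family.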


However, in practice we should not expect that all centres will be activated at the same time. Moreover, the parameters of the rates can also be different. In these cases, as the sum of gamma distributed variables with different rate parameters βi does not have a gamma distribution, the cumulative rate Λ(Is, t) may not have a gamma distribution. Therefore, the process n(Is, t) in general is not a PG process. Thus, to develop the analytic technique for calculating the distribution of n(Is, t) we have to use some approximations. If Ns is large enough (Ns > 10–15), a normal approximation which uses the closed-form expressions for the mean and the variance of the rate Λ(Is, t) was proposed in [2]. This approximation works perfectly well for global enrolment. However, for country predictions the normal approximation may not be appropriate, as in real trials in many countries the number of centres can be less than 10. Therefore, for predicting enrolment on country level we have to develop another type of approximation that works efficiently for a small number of centres. In [2] an approach was proposed to approximate the country processes by PG processes with some aggregated parameters, which was elaborated in detail in [9]. Below we provide the details of this approach, as it is essentially used in the chapter for modelling the restricted enrolment and for creating an optimal trial design. At any time t the cumulative rate of the enrolment process n(Is, t) is defined by (23.4). Consider a general case where the rates λi are gamma distributed with different parameters (αi, βi). Denote for the ease of notation vi = x(t, ui) – the duration of active enrolment in centre i. Then for a centre active at time t, vi = t − ui, and clearly only these centres can contribute to the number of patients enrolled up to time t.
Denote also by mi = αi/βi and si² = αi/βi² the mean and the variance of λi, and introduce the mean and the variance of the cumulative rate Λ(Is, t): E(Is, t) = E[Λ(Is, t)], S²(Is, t) = Var[Λ(Is, t)]. It is easy to see that

$$E(I_s, t) = \sum_{i \in I_s} m_i v_i, \qquad S^2(I_s, t) = \sum_{i \in I_s} s_i^2 v_i^2. \qquad (23.6)$$

Let us introduce the variables

$$A(I_s, t) = E^2(I_s, t)/S^2(I_s, t), \qquad B(I_s, t) = E(I_s, t)/S^2(I_s, t). \qquad (23.7)$$

The following statement is a slight extension of the result in [9] to the case where the rates are gamma distributed with different parameters.

Lemma 1 The distribution of n(Is, t) can be well approximated by the distribution of a PG random variable PG(A(Is, t), B(Is, t)).

In [9] it is shown using numerical calculations that this approximation provides a very good fit even for a small number of centres (2, 3), and as the number of centres grows the difference between the exact and approximate distributions decreases (see Appendix 23.6.1). The explanation of this result is the following: the cumulative rate Λ(Is, t) has the same mean and variance as a gamma distributed variable Ga(A(Is, t), B(Is, t)).


Thus, the distribution of n(Is, t) can be approximated by the distribution of the variable Π(Ga(A(Is, t), B(Is, t))), which by definition is PG(A(Is, t), B(Is, t)). Note that this approximation resembles in some sense the Welch–Satterthwaite approximation [25, 29] that was originally used to approximate linear combinations of independent chi-squared random variables. A PG approximation can be applied for any number of centres and is therefore much preferable to a normal approximation, as it provides a unified way of approximating the global and country enrolment processes. Using an approximation of the country process n(Is, t) by a PG process PG(A(Is, t), B(Is, t)), we can calculate directly the mean value as E[n(Is, t)] = E(Is, t) and, using formulae for a NB distribution, calculate the predictive bounds for any confidence level Q. Indeed, the Q-quantile of n(Is, t) can be calculated in R as

qnbinom(Q, size = A(Is,t), prob = B(Is,t)/(B(Is,t) + 1))
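The aggregation (23.6)–(23.7) and the quantile computation can also be sketched in Python; the centre parameters below are illustrative assumptions, and the quantile is obtained by accumulating the pmf (23.2) with t = 1:

```python
import math

def pg_quantile(q, a, b):
    # smallest k with P(PG(a, b) <= k) >= q, accumulating relation (23.2) at t = 1
    cum, k = 0.0, 0
    while True:
        cum += math.exp(a * math.log(b) + math.lgamma(a + k) - math.lgamma(k + 1)
                        - math.lgamma(a) - (a + k) * math.log(b + 1.0))
        if cum >= q:
            return k
        k += 1

# assumed country: three centres with gamma rate parameters and active exposures v_i
centres = [(1.2, 3.0, 6.0), (1.2, 3.0, 4.5), (2.0, 5.0, 3.0)]  # (alpha_i, beta_i, v_i)
E = sum(a / b * v for a, b, v in centres)           # mean of the rate, (23.6)
S2 = sum(a / b ** 2 * v ** 2 for a, b, v in centres)  # variance of the rate, (23.6)
A, B = E * E / S2, E / S2                           # aggregated parameters, (23.7)
lo, hi = pg_quantile(0.05, A, B), pg_quantile(0.95, A, B)
print(f"mean = {E:.2f}, 90% predictive interval = [{lo}, {hi}]")
```

This mirrors the `qnbinom` call above, since PG(A, B) coincides with NB(A, B/(B + 1)).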

The quantiles for Q = 0.05 and Q = 0.95 reflect a 90% predictive interval for n(Is, t). It is also possible to calculate the distribution of the time to reach a specific target for the number of patients in a country. Denote by τ(Is, Ls) the time to reach a given number of patients Ls in country s. As for any t > 0,

$$P(\tau(I_s, L_s) \le t) = P(n(I_s, t) \ge L_s), \qquad (23.8)$$

the distribution of τ(Is, Ls) is represented via the PG distribution of n(Is, t). This provides a useful opportunity to calculate the probabilities of reaching specific country goals and to compare the performance of enrolment in different countries.
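As a hedged sketch of this calculation (all centre parameters and the target below are illustrative assumptions), relation (23.8) lets us scan calendar days and report the first day by which the target is reached with a desired probability:

```python
import math

def pg_cdf(L, a, b):
    # P(PG(a, b) <= L), relation (23.2) with t = 1
    return sum(math.exp(a * math.log(b) + math.lgamma(a + k) - math.lgamma(k + 1)
                        - math.lgamma(a) - (a + k) * math.log(b + 1.0))
               for k in range(L + 1))

# assumed country: (alpha_i, beta_i) in patients/day and activation day u_i per centre
centres = [(1.2, 30.0, 0.0), (1.2, 30.0, 20.0), (1.2, 30.0, 45.0)]
target = 15

def prob_reached(t):
    # P(tau(I_s, L_s) <= t) = P(n(I_s, t) >= L_s), relation (23.8) with Lemma 1
    vs = [max(0.0, t - u) for _, _, u in centres]
    E = sum(a / b * v for (a, b, _), v in zip(centres, vs))
    S2 = sum(a / b ** 2 * v ** 2 for (a, b, _), v in zip(centres, vs))
    if S2 == 0.0:
        return 0.0
    A, B = E * E / S2, E / S2
    return 1.0 - pg_cdf(target - 1, A, B)

t90 = next(t for t in range(1, 3000) if prob_reached(t) >= 0.9)
print("target reached with 90% probability by day", t90)
```

Since only active centres contribute (vi = 0 before activation), the aggregated parameters A, B are recomputed for each candidate day.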

23.2.3 Modelling Global Enrolment Assume that the trial involves S countries. The global enrolment process n(t) is a sum of country processes and is a mixed Poisson process with the global cumulative rate

$$\Lambda(t) = \sum_{s=1}^{S} \Lambda(I_s, t), \qquad (23.9)$$

where the country rates Λ(Is, t) are defined in (23.4). Assuming for simplicity that all centres are active at time t and using relations (23.6), we get the relations for the mean E(t) and the variance S²(t) of the rate Λ(t):

$$E(t) = \sum_{s=1}^{S} E(I_s, t); \qquad S^2(t) = \sum_{s=1}^{S} S^2(I_s, t). \qquad (23.10)$$


Then, using Lemma 1, we can approximate the distribution of n(t) by the distribution of a PG random variable PG(A(t), B(t)), where

$$A(t) = E^2(t)/S^2(t), \qquad B(t) = E(t)/S^2(t). \qquad (23.11)$$

Using this approximation and formulae for a negative binomial distribution, we can calculate the mean, median and Q-predictive bounds for the process n(t). Correspondingly, denote by τ(n) the time to reach the planned number of patients n (to complete enrolment). As

$$P(\tau(n) \le t) = P(n(t) \ge n), \qquad (23.12)$$

the probability to complete enrolment before time t is represented via the calculated PG distribution of n(t). Therefore, the PoS (to complete enrolment before a planned date Tplan) is calculated as

$$P(\tau(n) \le T_{plan}) = P(n(T_{plan}) \ge n). \qquad (23.13)$$
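The PoS computation (23.11)–(23.13) thus reduces to a single survival probability of a PG variable. A minimal sketch, where the global mean and variance are assumed illustrative values as if obtained from (23.10):

```python
import math

def pg_sf(n, a, b):
    # P(PG(a, b) >= n) = 1 - P(PG(a, b) <= n - 1), relation (23.2) with t = 1
    cum = sum(math.exp(a * math.log(b) + math.lgamma(a + k) - math.lgamma(k + 1)
                       - math.lgamma(a) - (a + k) * math.log(b + 1.0))
              for k in range(n))
    return 1.0 - cum

E_T, S2_T = 220.0, 380.0        # assumed E(T_plan) and S^2(T_plan) from (23.10)
n_target = 200                  # planned number of patients
A, B = E_T ** 2 / S2_T, E_T / S2_T          # relation (23.11)
pos = pg_sf(n_target, A, B)                 # relation (23.13)
print(f"PoS to complete enrolment by T_plan: {pos:.3f}")
```

The variance of n(Tplan) is E(Tplan) + S²(Tplan), so with these assumed inputs the planned 200 patients sit below the expected 220, giving a PoS well above one half.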

23.3 Modelling Enrolment with Restrictions In this section we develop a novel technique for modelling and forecasting enrolment under some upper restrictions (caps) on country level.

23.3.1 Modelling Enrolment with Restrictions in One Centre Consider first modelling a restricted enrolment in one centre. Consider a centre i with the enrolment rate λi = Ga(αi, βi) and time of activation ui. Assume that the enrolment in this centre is stopped when the number of patients hits a given upper threshold (cap) Li. For the ease of notation, omit index i at the variables αi, βi, ui, Li. Consider first the unrestricted process ni(t) and denote by P(k, t, u) its distribution, which is defined according to (23.2) as

$$P(k, t, u) = \frac{x^k(t, u)\,\beta^\alpha}{(\beta + x(t, u))^{\alpha+k}}\,\frac{\Gamma(\alpha+k)}{k!\,\Gamma(\alpha)}, \quad k = 0, 1, \ldots \qquad (23.14)$$

Define now the enrolment process $n_i^L(t)$ restricted by cap L as

$$n_i^L(t) = \begin{cases} n_i(t) & \text{if } n_i(t) < L \\ L & \text{if } n_i(t) \ge L \end{cases} \qquad (23.15)$$


Then the distribution of $n_i^L(t)$ can be calculated directly:

$$P(n_i^L(t) = k) = \begin{cases} P(k, t, u) & \text{if } 0 \le k < L \\ 1 - \sum_{j=0}^{L-1} P(j, t, u) & \text{if } k = L \\ 0 & \text{otherwise} \end{cases} \qquad (23.16)$$

Correspondingly, the first two moments are calculated as follows (see Sects. 23.6.2 and 23.6.3 in Appendix):

$$E[n_i^L(t)] = \frac{\alpha x(t,u)}{\beta}\, P(PG(x(t,u), \alpha+1, \beta) \le L-2) + L\,\big[1 - P(PG(x(t,u), \alpha, \beta) \le L-1)\big] \qquad (23.17)$$

$$E[(n_i^L(t))^2] = \frac{\alpha(\alpha+1)x^2(t,u)}{\beta^2}\, P(PG(x(t,u), \alpha+2, \beta) \le L-3) + \frac{\alpha x(t,u)}{\beta}\, P(PG(x(t,u), \alpha+1, \beta) \le L-2) + L^2\,\big[1 - P(PG(x(t,u), \alpha, \beta) \le L-1)\big] \qquad (23.18)$$
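Formula (23.17) can be verified against direct summation over the truncated distribution (23.16); a Python sketch with assumed parameters:

```python
import math

def pg_pmf(k, x, a, b):
    # pmf of PG(x, a, b), relation (23.2) with exposure x
    return math.exp(k * math.log(x) + a * math.log(b) + math.lgamma(a + k)
                    - math.lgamma(k + 1) - math.lgamma(a) - (a + k) * math.log(b + x))

def pg_cdf(L, x, a, b):
    return sum(pg_pmf(k, x, a, b) for k in range(L + 1)) if L >= 0 else 0.0

x, a, b, L = 4.0, 1.5, 2.0, 5          # assumed: 4 months of exposure, cap L = 5
# closed form (23.17): note the shifted shape a + 1 inside the first probability
mean_cap = a * x / b * pg_cdf(L - 2, x, a + 1, b) + L * (1.0 - pg_cdf(L - 1, x, a, b))
# direct summation over the capped distribution (23.16)
direct = (sum(k * pg_pmf(k, x, a, b) for k in range(L))
          + L * (1.0 - pg_cdf(L - 1, x, a, b)))
assert abs(mean_cap - direct) < 1e-12
```

The agreement rests on the identity k·P(PG(x, α, β) = k) = (αx/β)·P(PG(x, α+1, β) = k−1), which shifts the shape parameter inside the partial sum.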

23.3.2 Modelling Enrolment with Restrictions on Country Level In real trials, restrictions are typically imposed on country level based on some regulatory assumptions. Using the results of Sects. 23.2.2 and 23.3.1 we can develop an analytic technique for predicting restricted enrolment on country level. Consider some country s with Ns centres indexed by set Is. According to Lemma 1, the distribution of the unrestricted enrolment process n(Is, t) in this country can be well approximated by the distribution of a PG variable PG(A(Is, t), B(Is, t)), which has the same distribution as a PG process PG(1, A(Is, t), B(Is, t)). That means the distribution of n(Is, t) is described by (23.14), where in the right-hand side we should put x(t, u) = 1, α = A(Is, t), β = B(Is, t). Assume now that there is a cap L(s), so the enrolment in country s is stopped when the number of patients n(Is, t) reaches L(s). To model the process n(Is, t) restricted by cap L(s) (denote it by $n^{L(s)}(I_s, t)$), we can use the same relations as in Sect. 23.3.1 above with x(t, u) = 1, α = A(Is, t), β = B(Is, t). According to (23.16), the distribution of a restricted country process is defined as:

$$P(n^{L(s)}(I_s, t) = k) = \frac{B(I_s,t)^{A(I_s,t)}}{(B(I_s,t)+1)^{A(I_s,t)+k}}\,\frac{\Gamma(A(I_s,t)+k)}{k!\,\Gamma(A(I_s,t))}, \quad k = 0, 1, \ldots, L(s)-1, \qquad (23.19)$$

$$P(n^{L(s)}(I_s, t) = L(s)) = 1 - P(PG(A(I_s,t), B(I_s,t)) \le L(s)-1).$$

Correspondingly, using relations (23.7), (23.17) and (23.18) we get

$$E[n^{L(s)}(I_s, t)] = E(I_s,t)\, P(PG(A(I_s,t)+1, B(I_s,t)) \le L(s)-2) + L(s)\,\big[1 - P(PG(A(I_s,t), B(I_s,t)) \le L(s)-1)\big] \qquad (23.20)$$

$$E[(n^{L(s)}(I_s, t))^2] = \big(E^2(I_s,t) + S^2(I_s,t)\big)\, P(PG(A(I_s,t)+2, B(I_s,t)) \le L(s)-3) + E(I_s,t)\, P(PG(A(I_s,t)+1, B(I_s,t)) \le L(s)-2) + L^2(s)\,\big[1 - P(PG(A(I_s,t), B(I_s,t)) \le L(s)-1)\big]. \qquad (23.21)$$

Consider an important characteristic – the time τ(Is, L(s)) to reach cap L(s) in country s. According to Lemma 1, we can use the following relation: for any t > 0,

$$P(\tau(I_s, L(s)) \le t) = P(n(I_s, t) \ge L(s)) = 1 - P(PG(A(I_s,t), B(I_s,t)) \le L(s)-1). \qquad (23.22)$$

23.3.2.1 Asymptotic Properties

Consider the asymptotic dependence of the country enrolment process restricted by a cap on the time and on the number of centres.

1st case. Consider the case where t → ∞. Denote

$$M(I_s) = \sum_{i \in I_s} m_i, \qquad V^2(I_s) = \sum_{i \in I_s} s_i^2.$$

Lemma 2 Assume that t → ∞ and the other parameters are fixed. Let also M(Is) > 0, V²(Is) > 0. Then,

$$n^{L(s)}(I_s, t) \xrightarrow{P} L(s), \qquad (23.23)$$

where the symbol $\xrightarrow{P}$ means convergence in probability.

Proof As t → ∞, in relation (23.6),

$$E(I_s, t) = M(I_s)\,t\,(1 + o(1)); \qquad S^2(I_s, t) = V^2(I_s)\,t^2\,(1 + o(1)).$$

Thus, in relation (23.7),

$$A(I_s, t) \to M^2(I_s)/V^2(I_s) > 0, \qquad B(I_s, t) = O(1/t).$$

From relation (23.2), for any k ≥ 0, α > 0, as t → ∞ and β = O(1/t),

$$P(PG(\alpha, \beta) = k) \to 0; \quad t\,P(PG(\alpha+1, \beta) = k) \to 0; \quad t^2\,P(PG(\alpha+2, \beta) = k) \to 0.$$

Using these relations together with (23.19) we get from (23.20), (23.21):

$$E[n^{L(s)}(I_s, t)] \to L(s); \qquad E[(n^{L(s)}(I_s, t))^2] \to L^2(s). \qquad (23.24)$$

Thus, $Var[n^{L(s)}(I_s, t)] \to 0$, and relation (23.23) follows from Chebyshev's inequality. Actually, for the restricted process relation (23.23) is expected. Note that the case V²(Is) = 0 corresponds to a Poisson model with fixed rates and can be considered similarly.

2nd case. Consider now the case where the number of centres Ns → ∞.

Lemma 3 Assume that for any t > 0, $E(I_s, t)/N_s \to \widehat{M}_s(t)$ and $S^2(I_s, t)/N_s \to \widehat{V}_s^2(t)$, where $\widehat{M}_s(t)$ and $\widehat{V}_s^2(t)$ are some bounded functions, $\widehat{M}_s(t) > 0$ and $\widehat{V}_s^2(t) > 0$. Then relation (23.23) holds.

Proof In this case, $A(I_s, t) = O(N_s) \to \infty$ and $B(I_s, t) \to \widehat{M}_s(t)/\widehat{V}_s^2(t) > 0$. Note that as α → ∞, in relation (23.2), for any k ≥ 0, Γ(α + k)/Γ(α) = O(α^k), and for any q, 0 < q < 1, α^k q^α → 0. Thus, for any p > 0, k ≥ 0, α^p P(PG(α, β) = k) → 0. Therefore, similarly to Case 1, using these relations together with (23.19) we get relation (23.24). Finally, relation (23.23) follows from Chebyshev's inequality.

This result shows that for a rather large number of centres in a country, the country cap can be reached rather quickly, earlier than the planned stopping time, and after that point this country will not contribute further to the global enrolment. Thus, the caps should be chosen rather carefully, by analyzing and comparing the times to reach country caps with the planned enrolment time. For example, denote by T the planned enrolment time and assume that in relation (23.22), P(τ(Is, L(s)) ≤ T) is rather high (say, more than 0.9). Then it is very likely that the cap in this country will be reached before the planned time T. Thus, if the enrolment goes according to plan, the centres in this country will not be used fully efficiently. If there are caps in many countries such that these caps can be reached with high probabilities before time T, this will lead to closing enrolment in these countries earlier than planned, which may lead to a substantial delay of the global enrolment.


Therefore, in these cases it can be recommended to reconsider the design of enrolment and increase or eliminate caps in such countries if possible.
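Lemma 2 can be illustrated numerically through the closed form (23.20): as t grows, the expected value of the capped country process climbs monotonically towards the cap. The centre parameters below are assumed for illustration:

```python
import math

def pg_cdf(L, a, b):
    # P(PG(a, b) <= L) at t = 1, relation (23.2)
    if L < 0:
        return 0.0
    return sum(math.exp(a * math.log(b) + math.lgamma(a + k) - math.lgamma(k + 1)
                        - math.lgamma(a) - (a + k) * math.log(b + 1.0))
               for k in range(L + 1))

def capped_mean(t, centres, cap):
    # E[n^L(I_s, t)] via (23.20) with the aggregated parameters (23.6)-(23.7)
    E = sum(a / b * t for a, b in centres)
    S2 = sum(a / b ** 2 * t * t for a, b in centres)
    A, B = E * E / S2, E / S2
    return E * pg_cdf(cap - 2, A + 1, B) + cap * (1.0 - pg_cdf(cap - 1, A, B))

centres = [(1.0, 10.0), (1.5, 12.0)]     # assumed (alpha, beta), all active from day 0
cap = 25
means = [capped_mean(t, centres, cap) for t in (50, 200, 1000, 5000)]
assert all(m1 < m2 for m1, m2 in zip(means, means[1:]))   # monotone increase
assert means[-1] > 0.99 * cap                             # approaches L(s), Lemma 2
```

Note that A(Is, t) stays constant here while B(Is, t) decays like 1/t, exactly the regime of the proof of Lemma 2.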

23.3.3 Forecasting Global Enrolment Under Country Restrictions Consider now forecasting the global enrolment when there are enrolment caps in some countries. There are two main cases. The first is when the number of countries is rather large (more than 10–15). Then for the global enrolment process n(t), which is a sum of country processes, we can use a normal approximation. Using expressions (23.20) and (23.21) for the mean and the 2nd moment of the country processes, for any t > 0 we can calculate the mean and the variance of the global enrolment process n(t) (as sums of the means and variances of the country processes) and use them to calculate the predictive bounds and the probability to stop enrolment before the deadline based on a normal approximation, similarly to the unrestricted process in [2]. If the number of countries is not so large, we can use the following numeric procedure. For every country and any time t > 0, the distribution of the restricted process is defined by relations (23.19). Then the distribution of the global process is a convolution of the country processes. In R this distribution can be calculated using a very fast numeric procedure based on a discrete Fourier transform and the R function convolve(). This algorithm works very efficiently and calculates for any t the vector distribution of the global process. Then for this process we can calculate numerically the predictive mean, median and predictive bounds. Correspondingly, using relation (23.12) for the global enrolment time and the calculated distribution of n(t), we can also calculate the probability to complete enrolment before time t. Note that for practical reasons it is enough to provide calculations on a daily basis. Therefore, to create the predictions of country and global enrolment processes, we need first to evaluate the upper predictive bound for the enrolment time using a rather high confidence level (usually 0.95).
This can be done numerically using (23.12) and calculating sequentially the first time T0.95 such that

$$P(n(T_{0.95}) \ge n) \ge 0.95. \qquad (23.25)$$

Then we consider a sequence of times tk (usually (1, 2, . . . , T0.95)), and for every tk calculate numerically the predictive mean, median and bounds for n(tk) for a given confidence level (usually 0.9) using the calculated distribution of n(tk). The probability to complete enrolment up to any time tk can be calculated using (23.12). The main probability of interest (Probability of Success – PoS) is the probability to complete enrolment before the planned time Tplan, which is P(τ(n) ≤ Tplan). Then PoS can be calculated using the distribution of n(t) at time t = Tplan and (23.12).
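The convolution step can be sketched in Python without the FFT (a direct convolution of the capped country distributions (23.19); the country parameters and caps below are assumed):

```python
import math

def pg_pmf(k, a, b):
    # pmf of PG(a, b) at t = 1, relation (23.2)
    return math.exp(a * math.log(b) + math.lgamma(a + k) - math.lgamma(k + 1)
                    - math.lgamma(a) - (a + k) * math.log(b + 1.0))

def capped_pmf(a, b, cap):
    # restricted country distribution (23.19): all remaining mass sits at the cap
    p = [pg_pmf(k, a, b) for k in range(cap)]
    p.append(max(0.0, 1.0 - sum(p)))
    return p

def convolve(p, q):
    # direct convolution; R achieves the same with convolve() via a Fourier transform
    r = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

countries = [(6.0, 0.5, 20), (4.0, 0.8, 10)]   # assumed (A, B, cap) per country
total = [1.0]
for a, b, cap in countries:
    total = convolve(total, capped_pmf(a, b, cap))
assert abs(sum(total) - 1.0) < 1e-12
prob_25 = sum(total[25:])                      # P(global enrolment >= 25 patients)
print(f"P(at least 25 patients) = {prob_25:.4f}")
```

Because each country's support is finite (0 to its cap), the global distribution is an exact finite vector, so the predictive mean, median and bounds can be read off directly.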


PoS plays the main role at the initial study design. If PoS is not very large, then it is likely that the study can be delayed. Therefore, it can be recommended to improve the enrolment design, where one of the options can be adding more clinical centres and then recalculating PoS.

23.3.3.1 Analysis of the Impact of Enrolment Caps

In Sect. 23.3.2 it is noted that the enrolment caps in countries may lead to a substantial delay of the global enrolment and to the inefficient use of centres in these countries. Consider some numeric approaches for analysing and comparing the impact of caps. Assume that there are several countries (1, . . . , J) with restrictive caps L(j). Using relation (23.22) and the formula for the distribution of the unrestricted PG process n(Ij, t) in country j, the probability P(T, Ij, L(j)) to reach the cap in this country before the planned enrolment time T is calculated as

$$P(T, I_j, L(j)) = 1 - P(n(I_j, T) \le L(j) - 1).$$

Correspondingly, using the results of Sect. 23.3.3 we can calculate the PoS P(T) to complete the global enrolment before time T. Now, if for country j, P(T, Ij, L(j)) > P(T), then it is likely that the cap L(j) in this country will be reached before stopping the global enrolment. Thus, for country j it can be recommended to increase the value of the cap if possible. Another opportunity is to compare the quantiles of the times to reach caps with the quantile of the global enrolment time. Consider some value Q (e.g. Q = 0.9). Using formula (23.22) for the distribution of the time τ(Ij, L(j)), we can calculate its Q-quantile S(Q, j, L(j)). Correspondingly, using the results of Sect. 23.3.3, we can calculate the Q-quantile S(Q, n) of the global enrolment time τ(n). Then we can compare the values S(Q, j, L(j)) and S(Q, n). If for some country j, S(Q, j, L(j)) < S(Q, n), then it is likely that the cap in country j will be reached before stopping the global enrolment. Thus, for this country it can be recommended to increase the value of the cap if possible. It can also be proposed to compare the mean times in countries to reach the caps with the mean of the global enrolment time; however, this approach in general leads to similar conclusions.
Therefore, the enrolment design involving country restrictions should be first evaluated by analyzing the impact of caps in different countries.


23.3.4 Using Historic Data for Better Prediction of the Enrolment Rates for New Trials The technique for modelling and forecasting enrolment uses some input parameters, specifically, the mean and the variance of the enrolment rates and the times of centre activation. At the initial (planning) stages the enrolment rates are not known in advance. Therefore, a practical question of paramount interest is: how can the parameters of the rates be estimated efficiently at the planning stage, when real trial data are not yet available, using historic data from similar trials? Typically, the enrolment rates are provided by clinical teams using expert estimates and their knowledge of the specifics of particular trials. At the current stage, pharmaceutical companies have access to very large databases of historic trials. These data can be used to evaluate the values of historic rates and use them as the initial rates for new trials. As there are many factors which can influence the enrolment, standard regression models may not work well. Therefore, one of the directions is using machine learning algorithms trained on large databases of historic studies using different features: therapeutic area, study indication, number of sites, study start-up times, phase, country, enrolment windows, etc. This is a very important area which requires a separate discussion. Some approaches on using a PG model for predicting new trials were proposed in [24].

23.4 Optimal Enrolment Design One of the cornerstone problems at the planning stage is: find an optimal allocation of centres/countries that minimizes the total trial cost given that the PoS is no less than a given value and there are certain restrictions on the numbers of centres in countries. To formalize this problem let us introduce the basic notation. Consider a given set of countries (1, . . . , S) and assume that we have chosen some numbers of centres (N1, . . . , NS) in these countries. Let T = Tplan be the target enrolment time. Suppose also that for any given country s and value Ns, the times of centre activation (us1, . . . , usNs) are generated according to some algorithm, e.g. it can be a uniform grid on some interval [as, bs] or a piece-wise uniform grid using the expected quartiles of the times of centre activation (e.g. the times when 25, 50, 75, 100% of centres are to be activated). Assume for simplicity that the mean and the variance (m(s), σ²(s)) of the enrolment rates in any country s are the same for all centres in this country and all centres are planned to be activated before the target time T.


Consider the following costs:
1. the vector of costs per selecting one centre in each country, C̄ = (Cs, s = 1, . . . , S);
2. the vector of costs per one enrolled patient in each country, c̄ = (cs, s = 1, . . . , S);
3. the vector of costs per including country s with a non-zero number of centres, Q̄ = (Qs, s = 1, . . . , S).

Denote by C(T, N̄, C̄, c̄, Q̄) the total mean cost of the trial in the time interval [0, T] for a given centres allocation N̄ = (N1, . . . , NS). Assume also that there is some planned set of restrictions W on the numbers of centres, e.g. the minimal and maximal numbers of centres for each country. Denote by P(n, T, N̄) a PoS – the probability to reach a planned number of patients n for a given centres allocation N̄ = (N1, . . . , NS) before the target time T. Then the optimal enrolment design is a solution of the following problem:

Optimization problem 1: For a given probability P find an optimal centres allocation N̄ = (N1, . . . , NS) that minimizes the total cost C(T, N̄, C̄, c̄, Q̄) given that

$$P(n, T, \bar N) \ge P, \qquad (23.26)$$
$$\bar N \in W,$$

where P is an agreed confidence level (e.g. 0.8, 0.9, …).

23.4.1 Unrestricted Enrolment Consider two main approaches for calculating the PoS and the optimal trial design in the case of unrestricted enrolment, depending on whether the number of countries S is rather large or not.

23.4.1.1 The Number of Countries Is Rather Large

Assume that S ≥ 10–15, so we can use a normal approximation for the global enrolment process n(T) = n(T, N̄) as a sum of country processes n(Is, T). The global cumulative enrolment rate at time T has the form

$$\Lambda(T, \bar N) = \sum_{s=1}^{S} \sum_{i \in I_s} \lambda_{is} (T - u_{is}), \qquad (23.27)$$


where λis are the enrolment rates in the centres in country s with mean m(s) and variance σ²(s). Therefore, the values E(Is, T) and S²(Is, T) defined in (23.6) have the form

$$E(I_s, T) = m(s) \sum_{i \in I_s} (T - u_{is}), \qquad S^2(I_s, T) = \sigma^2(s) \sum_{i \in I_s} (T - u_{is})^2, \qquad (23.28)$$

and the mean E(T, N̄) and the variance S²(T, N̄) of Λ(T, N̄) are expressed as

$$E(T, \bar N) = \sum_{s=1}^{S} E(I_s, T), \qquad S^2(T, \bar N) = \sum_{s=1}^{S} S^2(I_s, T). \qquad (23.29)$$

Denote

$$G^2(T, \bar N) = E(T, \bar N) + S^2(T, \bar N). \qquad (23.30)$$

Note that G²(T, N̄) = Var[n(T)]. Using relation (23.12) and a normal approximation for the process n(T) we can easily derive the following criterion.

Criterion (to complete enrolment in time): The study for a chosen countries allocation N̄ will complete enrolment up to time T with probability P if the following inequality is satisfied:

$$\frac{E(T, \bar N) - n}{\sqrt{G^2(T, \bar N)}} \ge z_P, \qquad (23.31)$$

where z_P is the P-quantile of a standard normal distribution.

Consider now the calculation of the global costs. The cost for the centres involved is

$$Cost(centres, \bar N) = \sum_{s=1}^{S} C_s N_s. \qquad (23.32)$$

The cost for the mean number of patients recruited in the interval [0, T] is

$$Cost(patients, \bar N) = \sum_{s=1}^{S} c_s m(s) \sum_{i \in I_s} (T - u_{is}). \qquad (23.33)$$

The cost for the countries with a non-zero number of centres is

$$Cost(countries, \bar N) = \sum_{s=1}^{S} Q_s I(N_s > 0), \qquad (23.34)$$

where I(A) is the indicator of the event A. Thus, for any given allocation of centres N̄, the global cost C(T, N̄, C̄, c̄, Q̄) is the sum of the costs defined by relations (23.32)–(23.34).


Note also that the condition N̄ ∈ W typically has the following form: the vector H̄ = (H1, . . . , HS) of the lower bounds and the vector Ū = (U1, . . . , US) of the upper bounds for the number of centres in each country are given, so the condition N̄ ∈ W means:

$$H_s \le N_s \le U_s, \quad s = 1, 2, \ldots, S. \qquad (23.35)$$

In this setting, the optimization problem has the following general form:

Optimization problem 2: For a given probability P find an optimal centres allocation N̄ that minimizes the global cost C(T, N̄, C̄, c̄, Q̄) given conditions (23.31) and (23.35).

Note that the set of possible allocations should not be empty, so that the probability P can be reached for some allocation. This will be guaranteed if the following condition is satisfied.

Condition of feasibility for probability P:

$$\frac{E(T, \bar U) - n}{\sqrt{G^2(T, \bar U)}} \ge z_P. \qquad (23.36)$$

As the total cost and condition (23.31) have a non-linear dependence on the vector N̄, this general problem can be solved using methods of constrained optimization or random search.
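As a crude baseline (not the simplex-based algorithm of the next subsection), a random search over integer allocations can be sketched as follows; all costs, rates, exposures and bounds are invented for illustration, and the normal-approximation constraint (23.31) is checked in the equivalent form (23.41):

```python
import random
from statistics import NormalDist

# assumed per-country inputs: centre cost C, patient cost c, start-up cost Q,
# rate mean m, rate variance s2, mean exposure R and mean squared exposure V
countries = [
    dict(C=30.0, c=5.0, Q=100.0, m=1.0, s2=0.3, R=8.0, V=70.0, H=1, U=40),
    dict(C=20.0, c=6.0, Q=80.0,  m=0.8, s2=0.2, R=9.0, V=85.0, H=1, U=40),
    dict(C=45.0, c=4.0, Q=150.0, m=1.2, s2=0.4, R=7.0, V=55.0, H=1, U=40),
]
n, P = 300, 0.9
z = NormalDist().inv_cdf(P)

def feasible(N):
    E = sum(Nk * d["m"] * d["R"] for Nk, d in zip(N, countries))        # (23.38)
    G2 = E + sum(Nk * d["s2"] * d["V"] for Nk, d in zip(N, countries))  # (23.30)
    return E - z * G2 ** 0.5 >= n                                       # (23.41)

def cost(N):
    # centre + patient + country start-up costs, as in (23.32)-(23.34); here H >= 1
    return sum(Nk * d["C"] + Nk * d["c"] * d["m"] * d["R"] + d["Q"]
               for Nk, d in zip(N, countries))

random.seed(1)
best = None
for _ in range(20000):
    N = [random.randint(d["H"], d["U"]) for d in countries]
    if feasible(N) and (best is None or cost(N) < cost(best)):
        best = N
print("quasi-optimal allocation:", best, "cost:", round(cost(best), 1))
```

Random search makes no use of the linear structure exploited below, so it scales poorly with S; it is shown only to make the constraint and objective concrete.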

23.4.1.2 Approach Using Step-Wise Linearization

Assume that in restrictions (23.35), Hs > 0 for all s, so at the design stage all countries plan to involve some centres, which is quite natural. Assume for simplicity that the times (us1, . . . , usNs) of centre activation in country s are chosen as a uniform grid in some time interval [a(s), b(s)] defined for this country at the planning stage. In general, more sophisticated algorithms can be considered. For a given centres allocation N̄ = (N1, . . . , NS), define for every country s, assuming Ns > 0, the average enrolment time R(s) of a generic centre in this country:

$$R(s) = \frac{1}{N_s} \sum_{i \in I_s} (T - u_{is}). \qquad (23.37)$$

In this case

$$E(T, \bar N) = \sum_{s=1}^{S} N_s\, m(s)\, R(s), \qquad (23.38)$$


and the patient cost in (23.33) can be written as

Cost(patients, N̄) = \sum_{s=1}^{S} N_s c_s m(s) R(s).    (23.39)

Note that in the case when in country s the values u_{is} are generated according to a uniform distribution on the interval [a(s), b(s)], the mean enrolment time is

R(s) = E\Big[\frac{1}{N_s} \sum_{i \in I_s} (T - u_{is})\Big] = E(T - u_{s1}) = T - (a(s) + b(s))/2,    (23.40)

so R(s) does not depend on N_s. Thus, we can keep the linear representation (23.39) for any other vector of the numbers of centres in countries, assuming that the activation times of centres in country s are chosen as a uniform grid. With this representation, all costs depend linearly on the running vector of centres N̄, which essentially accelerates the computations at each step of the optimization algorithm. At the final stage, once the optimal number of centres is calculated, we can compute PoS exactly using the specific centres allocation in each country. Numerical calculations show, however, that the difference in PoS between the proportional method above and the specific uniform grid of centre activation appears only in the second or third decimal place, so this approach can be used efficiently in practice.

The remaining point is how to deal with the non-linear condition (23.31). This condition can be written in the form

E(T, N̄) - z_P \sqrt{G^2(T, N̄)} \ge n.    (23.41)

The value E(T, N̄) can be represented in a linear form with respect to the vector N̄ as in (23.38). The value G²(T, N̄) in (23.30) can also be represented in a linear form with respect to N̄, using for every country the averaged quadratic enrolment time of a generic centre:

V(s) = \frac{1}{N_s} \sum_{i \in I_s} (T - u_{is})^2.    (23.42)

Then, according to (23.28),

G^2(T, N̄) = \sum_{s=1}^{S} N_s \big( m(s) R(s) + \sigma^2(s) V(s) \big).

However, relation (23.41) is still non-linear with respect to the vector N̄, except in the case P = 0.5, since z_{0.5} = 0.


To resolve this problem, we developed a step-wise recurrent algorithm in which on each step we set linear restrictions and use the simplex method for linearly constrained optimization, which works extremely fast even for a very large number of countries (up to several hundred). Note that the simplex method assumes that the variables involved in the optimization can also take non-integer values. Under this assumption, we can find a solution of the optimization problem in the space of continuous variables, and then at the last step use a simple search, checking for every non-integer variable x_k which of the two nearest integer values, lower N_k^{Low} or upper N_k^{Upp}, gives the smaller total cost while keeping condition (23.41). In this way we find a quasi-optimal discrete allocation N̄_opt satisfying the conditions of optimization problem 2.

The step-wise recurrent algorithm is described as follows. First, for any running centres allocation N̄ we introduce the new vector variable x̄ = N̄ − H̄. Then E(T, N̄) = E(T, x̄) + E(T, H̄), and the global costs have the form

C(T, N̄, C̄, c̄, Q̄) = C(T, H̄, C̄, c̄, Q̄) + C(T, x̄, C̄, c̄, Q̄),

where C(T, x̄, C̄, c̄, Q̄) depends linearly on x̄, and 0 ≤ x̄ ≤ Ū − H̄ componentwise. Now start with the initial vector x̄^(0) = 0̄ and find the next value x̄^(1) as a solution of the optimization problem with linear constraints, using the simplex method with respect to the vector x̄ = (x_1, …, x_S), where condition (23.41) is re-written to give linear restrictions on x̄:

\sum_{s=1}^{S} x_s m(s) R(s) \ge n + z_P \sqrt{G^2(T, \bar x^{(0)} + \bar H)} - E(T, \bar H).    (23.43)

Correspondingly, denote by x̄^(k) the solution of the linearly constrained optimization problem at step k. The next value x̄^(k+1) is calculated as a solution of the linearly constrained optimization problem with respect to x̄, where (23.43) takes the form

\sum_{s=1}^{S} x_s m(s) R(s) \ge n + z_P \sqrt{G^2(T, \bar x^{(k)} + \bar H)} - E(T, \bar H).    (23.44)
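As an illustration, the iteration above can be sketched in Python (the chapter's computations use R; the function name, the two-country test data, the use of SciPy's `linprog` for the simplex-type LP step, and simple rounding in place of the per-variable integer search are all our own assumptions):

```python
import numpy as np
from scipy.optimize import linprog
from scipy.stats import norm

def stepwise_allocation(m, R, V, sigma2, c_pat, c_centre, H, U, n, P,
                        tol=0.5, max_iter=100):
    """Step-wise linearization: fix the square-root term of (23.44) at the
    current iterate, solve the resulting LP, and repeat until the global
    cost stabilises (stopping rule: cost change below tol)."""
    z = norm.ppf(P)
    mR = m * R                       # expected patients per centre, m(s) R(s)
    cost = c_pat * mR + c_centre     # linear cost coefficient per extra centre
    E_H = float(H @ mR)              # enrolment mean contributed by H
    x = np.zeros_like(mR)
    prev_cost = np.inf
    for _ in range(max_iter):
        G2 = float((x + H) @ (mR + sigma2 * V))   # G^2 at the current iterate
        rhs = n + z * np.sqrt(G2) - E_H           # right-hand side of (23.44)
        res = linprog(cost, A_ub=[-mR], b_ub=[-rhs],
                      bounds=list(zip(np.zeros_like(mR), U - H)))
        x = res.x
        total = float(cost @ (x + H))
        if abs(prev_cost - total) < tol:
            break
        prev_cost = total
    return np.round(x + H).astype(int)   # crude rounding, not the exact search
```

Each step is a plain linear program, which is why the linearization pays off: only the right-hand side changes between iterations.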

Convergence of this algorithm can be proved in the one-dimensional case. Indeed, consider a trial with country {1} only and assume for simplicity that H_1 = 0. Then relation (23.44) at step k reduces to

x^{(k+1)} = \frac{n}{E} + \frac{V}{E} \sqrt{x^{(k)}},    (23.45)

where x^{(0)} = 0 and x^{(k)} ≤ U_1, and, according to (23.28)–(23.30), E and V are constants, specifically E = m(1)T/2 and V = z_P \sqrt{m(1)T/2 + \sigma^2(1) T^2/4}.

Thus,

x^{(1)} = \frac{n}{E} > 0; \quad
x^{(2)} = \frac{n}{E} + \frac{V}{E}\sqrt{\frac{n}{E}} > x^{(1)}; \quad
x^{(3)} = \frac{n}{E} + \frac{V}{E}\sqrt{x^{(2)}} > x^{(2)}, \ldots    (23.46)

and so on. Therefore, x^{(k)} is a monotonically increasing sequence bounded by U_1, so the algorithm converges. In the multidimensional case we were not able to prove convergence rigorously. However, numerical calculations for many scenarios show that if we set a stopping rule, e.g. stopping the sequential algorithm when the difference in global costs is less than 0.5, then the number of iterations does not exceed 10–15 steps. As a result, for any feasible probability P this step-wise optimization algorithm calculates the optimal centres allocation satisfying the conditions of optimization problem 2 with the optimal cost.
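The monotone convergence of the one-dimensional recursion (23.45) is easy to check numerically; the values of n, E, V below are illustrative, not taken from the chapter:

```python
def fixed_point_iteration(n, E, V, max_iter=200, tol=1e-10):
    """Iterate x <- n/E + (V/E)*sqrt(x) from x = 0, as in (23.45).
    The map is increasing, so the iterates form a monotone sequence that
    converges to the unique positive fixed point."""
    xs = [0.0]
    for _ in range(max_iter):
        xs.append(n / E + (V / E) * xs[-1] ** 0.5)
        if xs[-1] - xs[-2] < tol:
            break
    return xs
```

The returned list reproduces exactly the chain (23.46): each iterate exceeds the previous one, and the limit solves x = n/E + (V/E)√x.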

23.4.1.3 Numerical Example

Consider an artificial case study whose design is very similar to real studies. In this study it is planned to recruit 250 patients during 2 years. There are 16 countries, and all sites in each country are planned to be activated in the interval between 30 and 210 days. The first four columns of Table 23.1 describe the enrolment design for this study. The columns "Low" and "Upp" reflect the vectors H̄ and Ū of lower and upper bounds for the number of centres in condition (23.35). The column "Rate" shows the mean monthly enrolment rate for each centre in the corresponding country. The column "Cost" shows the cost in USD per patient enrolled in each country. It is assumed that the coefficient of variation of the enrolment rates is the same everywhere and equal to 1.2, which corresponds to medium variation, and that the cost of including one centre is the same in all countries and equal to $5000.

Using the approach proposed in Sect. 23.4.1.2, it is possible to solve "Optimization problem 2" and, for a given range of PoS, calculate the optimal allocations of centres in these countries. The columns named "Opt.alloc" in Table 23.1 show, for each target PoS in the range 0.5, 0.6, …, 0.9, the optimal allocation of centres such that the corresponding PoS is reached with minimal total cost. The last row "Opt cost" shows the total cost of the study design, including patient and centre costs, for each optimal allocation. For example, in "Country1" the cost per patient is rather high, so it is not efficient to include centres from this country. On the contrary, in "Country5" the cost is not that high and the mean rate is medium; the optimization shows that this country is preferable, and it is cost-efficient to include all 6 centres (out of a maximum of 6) in the study design.


Table 23.1 Optimal centres allocation. The five "Opt.alloc." columns give the optimal allocation for target probabilities P = 0.5, 0.6, 0.7, 0.8, 0.9.

Country     Low  Upp  Rate  Cost    P=0.5  P=0.6  P=0.7  P=0.8  P=0.9
Country1     0    7   0.42  15600     0      0      0      0      0
Country2     0    4   0.43  14250     0      0      0      0      1
Country3     2    5   0.22  13550     2      4      5      5      5
Country4     0    4   0.55  14200     3      4      4      4      4
Country5     0    6   0.30  13800     6      6      6      6      6
Country6     1    7   0.57  14300     1      1      2      4      6
Country7     1    5   0.21  13400     5      5      5      5      5
Country8     1    7   0.25  14250     1      1      1      1      1
Country9     2    5   0.16  12300     5      5      5      5      5
Country10    0    7   0.19  13800     1      1      0      0      0
Country11    2    7   0.18  14600     2      2      2      2      2
Country12    2    7   0.62  16380     2      2      2      2      2
Country13    0    4   0.45  13400     4      4      4      4      4
Country14    0    5   0.23  11200     5      5      5      5      5
Country15    0    5   0.30  14000     1      1      1      1      1
Country16    2    7   0.39  14100     2      2      2      2      2
Total       14   92                  40     43     45     46     49

Opt cost                 3,643,470  3,902,851  4,135,948  4,415,110  4,879,621

The dimension of this problem is 7.11 × 10^{12}, so it cannot be solved by the method of direct search, which is proposed in the next Sect. 23.4.1.4 for studies with a not so large number of countries.

23.4.1.4 The Number of Countries Is Not so Large

If the number of countries is not so large, we can use direct search. Consider the general setting in "Optimization problem 1". If there are no restrictions on the enrolment, the global cost C(T, N̄, C̄, c̄, Q̄) is the sum of costs defined by relations (23.32)–(23.34), where, to accelerate computations, we represent the cost for patients in the linear form (23.39). Using Lemma 1, for any given allocation of centres N̄ we can approximate the distribution of the global enrolment process at time T, n(T), by the distribution of a PG random variable PG(A(T, N̄), B(T, N̄)), where, according to (23.11),

A(T, N̄) = E^2(T, N̄)/S^2(T, N̄), \quad B(T, N̄) = E(T, N̄)/S^2(T, N̄),    (23.47)

and the functions E(T, N̄) and S²(T, N̄), using relations (23.37) and (23.42), are calculated according to (23.10) as

E(T, N̄) = \sum_{s=1}^{S} N_s m(s) R(s); \quad S^2(T, N̄) = \sum_{s=1}^{S} N_s \sigma^2(s) V(s).    (23.48)

Therefore, based on the results of Sect. 23.2.3, the function P(n, T, N̄) in (23.26) is

P(n, T, N̄) = P(PG(A(T, N̄), B(T, N̄)) \ge n).    (23.49)

Correspondingly, the probability P is feasible (can be reached for some allocation) if

P(PG(A(T, Ū), B(T, Ū)) \ge n) \ge P,    (23.50)

where Ū is the vector of upper bounds for the number of centres in countries. Note that the representation (23.48), being linear in the vector N̄, substantially accelerates computations: the values R(s) and V(s) can be calculated in advance, and then at each step only the dependence on N̄ is used.

The recurrent step-by-step algorithm (complete search) is designed as follows. Denote by W_2 the set of all possible allocations of the vector N̄ given restrictions (23.35). The dimension of this set is

Dim = \prod_{s=1}^{S} (U_s - H_s + 1).    (23.51)

Consider any recurrent algorithm that chooses at step k some allocation N̄_k without repetition, in such a way that the set {N̄_k, k = 1, …, Dim} coincides with W_2, with N̄_1 = H̄ and N̄_Dim = Ū. For ease of notation denote P(N̄) = P(n, T, N̄) and C(N̄) = C(T, N̄, C̄, c̄, Q̄), and take a desirable feasible probability P to complete enrolment in time. Consider the following recurrent procedure. Introduce the target vector Z̄ and set the initial value Z̄ = (C(Ū), Ū); denote the first component of Z̄ by Z̄[1]. Then at any step k: if P(N̄_k) < P, go to step k + 1; if P(N̄_k) ≥ P, then check: if C(N̄_k) ≥ Z̄[1], go to step k + 1; if C(N̄_k) < Z̄[1], set the new target vector Z̄ = (C(N̄_k), N̄_k) and go to step k + 1. Finally, this algorithm arrives at the optimal target vector Z̄_opt, whose components (2, …, S + 1) define a feasible allocation N̄_opt satisfying the condition P(N̄_opt) ≥ P with minimal cost C(N̄_opt). Computations for different scenarios show that, using R, for Dim = 10^9 the time of calculation is about 60 min. For example, for a study with 12 countries and a variation of about 5 centres in every country, the time of calculation is about 15 min, which suits practical purposes.
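A compact sketch of this complete search in Python (the function name and the toy two-country PoS and cost functions used for testing are our own; any PoS calculator, e.g. the PG-based (23.49), can be plugged in as `prob_fn`):

```python
import itertools

def direct_search(H, U, prob_fn, cost_fn, P):
    """Complete search over all allocations H_s <= N_s <= U_s (the set W2),
    mirroring the target-vector update: keep the cheapest allocation whose
    PoS prob_fn(N) reaches the desired probability P."""
    best = (cost_fn(U), tuple(U))          # initial target Z = (C(U), U)
    for N in itertools.product(*(range(h, u + 1) for h, u in zip(H, U))):
        if prob_fn(N) < P:
            continue                        # PoS not reached, skip
        c = cost_fn(N)
        if c < best[0]:
            best = (c, N)                   # new cheaper feasible target
    return best[1], best[0]
```

Because every allocation is visited exactly once, the returned allocation is the exact optimum whenever the feasibility condition (23.50) holds.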


Therefore, the problem of finding an optimal enrolment design for unrestricted enrolment can be solved efficiently: for studies with not so many countries (up to 12) we can use the exact algorithm based on direct search, and for larger studies we can use the approach based on the normal approximation of the global enrolment process and the step-wise linearization recurrent algorithm using the simplex method.

23.4.2 Restricted Enrolment

Here we also consider two cases: a not so large number of countries, and vice versa. For the case of a not so large number of countries, we can design an algorithm based on direct search, using steps similar to those described in Sect. 23.4.1.4. However, for restricted enrolment the calculations of PoS are based on rather complicated formulae, as described in Sect. 23.3, and take a longer time. Therefore, this algorithm runs longer and can realistically be applied to studies with up to 8–10 countries and up to 4–6 centres in each country. For larger dimensions we can use so-called evolution, or genetic, algorithms.

23.4.2.1 Evolution Algorithms

Evolution algorithms (Differential Evolution, DE) were designed as a type of random search algorithm exploiting similarity with genetic mutations [18]. They belong to the class of genetic algorithms, which use biology-inspired operations of crossover, mutation and selection on a population in order to minimize an objective function over successive generations. Like other evolutionary algorithms, DE solves optimization problems by evolving a population of candidate solutions using alteration and selection operators. DE uses floating-point instead of bit-string encoding of population members, and arithmetic operations instead of logical operations in mutation. DE is particularly well suited to finding the global optimum of a real-valued function of real-valued parameters, and does not require the function to be continuous or differentiable.

The advantage of these algorithms is that they can be applied to the general setting of Optimization problem 1 (23.26), where PoS is calculated using the algorithms described in Sect. 23.4.1.4 for restricted enrolment. These algorithms are suitable for solving large-dimensional problems and can be applied to our optimization problem. Note that by nature this is a special form of random search; thus the outputs can differ between runs, it may take substantial time to calculate the optimal point, and there is no guarantee that the output will be a global optimum. However, a comparison with the results obtained by direct search shows that in all considered examples the evolution algorithms lead to the same results as the exact algorithm.
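A sketch using SciPy's implementation of Differential Evolution (our own choice of library; the penalty treatment of the PoS constraint and the rounding to integer allocations are our simplifications, not the authors' algorithm):

```python
import numpy as np
from scipy.optimize import differential_evolution

def de_allocation(H, U, prob_fn, cost_fn, P, penalty=1e9, seed=1):
    """DE over the continuous box [H, U]; candidate vectors are rounded to
    integer allocations, and allocations whose PoS falls below the target P
    receive a large penalty, pushing the population towards feasibility."""
    def objective(x):
        N = np.round(x).astype(int)
        if prob_fn(N) < P:
            return penalty            # infeasible: push the search away
        return cost_fn(N)
    res = differential_evolution(objective, bounds=list(zip(H, U)),
                                 seed=seed, polish=False)
    return np.round(res.x).astype(int)
```

Being a randomized method, different seeds may give different (near-optimal) allocations; on small test problems the result typically coincides with the direct search optimum.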


23.5 Conclusions

A new analytic technique for modelling and predicting patient enrolment at country level is developed, using the approximation of the enrolment process in a country by a Poisson-gamma process with aggregated parameters. A novel analytic technique for modelling enrolment under restrictions (enrolment caps in countries) is also developed. These techniques form the basis for solving the problem of optimal trial enrolment design: find an optimal allocation of centres/countries that minimizes the total trial cost under the condition that the probability of reaching a given number of patients in time is no less than a given probability. Different techniques for finding an optimal solution for low- and high-dimensional optimization problems are proposed. The developed techniques, supported by R software, have huge potential for improving the efficiency and quality of clinical trial operations, with additional benefits and cost savings.

Acknowledgements The authors are thankful to the Data Science team at the Center for Design & Analysis, Amgen Inc. for useful discussions and for providing data from real clinical studies. The authors would also like to thank an anonymous referee for useful editorial remarks.

23.6 Appendix

23.6.1 Approximation of the Convolution of PG Variables

Let us provide some numerical calculations to support the results of Lemma 1. Consider a country I with K centres. Assume that the enrolment rates λ_i in all centres are gamma distributed with the same parameters (α, β). Consider some interim time t and denote by v_i the enrolment duration in centre i up to time t. Then, according to (23.4), the global cumulative enrolment rate in country I is

\Lambda(I, t) = \sum_{i=1}^{K} \lambda_i v_i.

As noted in Sect. 23.2.2, if v_i ≡ v for all i ∈ I, then Λ(I, t) has the same distribution as Ga(αK, β/v), and the enrolment process n(I, t) in country I is a PG process with the same distribution as a PG(v, αK, β) variable. However, in realistic cases the enrolment durations v_i are different, so Λ(I, t) does not have a gamma distribution and n(I, t) does not have a PG distribution. Nevertheless, using Lemma 1 it can be approximated by a PG process with some parameters.


The accuracy of this approximation was evaluated using numerical calculations for many different scenarios, and the results led to the same conclusions. For illustration we provide the analysis using only one example (see [9]). Put α = 1.5, β = 150. Assume that v_i, i = 1, …, K, are taken on a uniform grid in the interval [1, 300] as round((1:K)*300/K); this reflects a reasonably large variation in the v_i. In centre i, the distribution of the number of enrolled patients is calculated as a vector pp_i of length L + 1 using a PG distribution with parameters (v_i, α, β) and the following formula in R:

dnbinom(0:L, size = alf, prob = be/(be + v[i]))

We take L = 50, since for k > 50 the probabilities of enrolling k patients are zero to 4 digits. The probability distribution of the enrolment process in the country is the convolution of the distributions pp_i and can be calculated numerically using a very fast procedure in R based on the function convolve(). Denote the resulting distribution by ppK; this is the exact distribution up to the accuracy of computation. We can also approximate the probability distribution of the enrolment process in the country by a PG distribution using relations (23.6), (23.7) and Lemma 1; denote the approximating distribution by ppKPG. The computations show that the two distributions are very close even for small K = 2, 3, and the absolute difference

Dif(K) = \max_{i = 0, \ldots, L} \big| ppK[i] - ppKPG[i] \big|

decreases as K increases. The table below gives the values of Dif(K) for different K.

K        2       3       5       8        10       15       20
Dif(K)   0.0019  0.0017  0.0011  0.00075  0.00059  0.00039  0.00029

The plot in Fig. 23.1 illustrates the case K = 3. Here the difference between the exact and approximating probability distributions is negligible. Thus, the result of Lemma 1 can be used very efficiently in practice.
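The same computation can be reproduced in Python (our translation of the R code above; treating relations (23.6), (23.7) and Lemma 1 as matching the first two moments of the country-level process is our reading, since those relations appear earlier in the chapter):

```python
import numpy as np
from scipy.stats import nbinom

alpha, beta, K, L = 1.5, 150.0, 3, 50
v = np.round(np.arange(1, K + 1) * 300.0 / K)      # round((1:K)*300/K)

# Exact distribution ppK: convolution of the K per-centre PG pmfs,
# each PG(v_i, alpha, beta) being negative binomial NB(alpha, beta/(beta+v_i)).
pp = np.array([1.0])
for vi in v:
    pp = np.convolve(pp, nbinom.pmf(np.arange(L + 1), alpha, beta / (beta + vi)))
ppK = pp[:L + 1]

# Approximating PG variable matched to the first two moments of the sum.
mean = alpha * v.sum() / beta
var = mean + alpha * (v ** 2).sum() / beta ** 2
size = mean ** 2 / (var - mean)                    # = E^2 / S^2, cf. (23.11)
prob = mean / var                                  # = B/(1 + B) with B = E/S^2
ppKPG = nbinom.pmf(np.arange(L + 1), size, prob)

dif = np.max(np.abs(ppK - ppKPG))                  # Dif(K) for K = 3
```

The resulting `dif` agrees with the Dif(3) ≈ 0.0017 reported in the table above.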

23.6.2 Calculation of the Mean of the Restricted Process

Relation (23.16) implies that

E[n_i^L(t)] = \sum_{k=0}^{L-1} k P(k, t, u) + L \big(1 - P(PG(\alpha, \beta, x(t, u)) \le L - 1)\big).    (23.52)


[Figure: "Probabilities for convolution and approximation of sums of PG variables"]

Fig. 23.1 Approximation of the distribution of the enrolment process in the country with three centres by the distribution of a PG variable. A continuous line shows the exact values of the distribution of the enrolment process; a dotted line shows its approximation

Denote the first sum on the right-hand side by S_1. To simplify calculations, we write t instead of x(t, u). Using the change of variable (k − 1 → k) and the relation Γ(α + 1) = αΓ(α), we get

S_1 = \sum_{k=0}^{L-1} k \frac{\Gamma(k+\alpha)}{k!\,\Gamma(\alpha)} \frac{\beta^{\alpha} t^{k}}{(\beta+t)^{\alpha+k}}
    = \frac{\alpha t}{\beta} \sum_{k=1}^{L-1} \frac{\Gamma(k-1+\alpha+1)}{(k-1)!\,\Gamma(\alpha+1)} \frac{\beta^{\alpha+1} t^{k-1}}{(\beta+t)^{\alpha+1+k-1}}
    = \frac{\alpha t}{\beta} \sum_{k=0}^{L-2} \frac{\Gamma(k+\alpha+1)}{k!\,\Gamma(\alpha+1)} \frac{\beta^{\alpha+1} t^{k}}{(\beta+t)^{\alpha+1+k}}
    = \frac{\alpha t}{\beta} P(PG(\alpha+1, \beta, t) \le L-2).

Finally, putting back t = x(t, u), we get relation (23.17).
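The closed form can be checked numerically (a small self-check we added; the parameterization PG(α, β, t) = NB(α, β/(β + t)) follows the pmf appearing in the sum above):

```python
from scipy.stats import nbinom

def truncated_mean_identity(alpha, beta, t, L):
    """Compare the direct sum S1 = sum_{k<L} k * pmf(k) with the closed form
    (alpha*t/beta) * P(PG(alpha+1, beta, t) <= L-2) from the derivation."""
    p = beta / (beta + t)                 # PG(alpha, beta, t) is NB(alpha, p)
    s1 = sum(k * nbinom.pmf(k, alpha, p) for k in range(L))
    closed = alpha * t / beta * nbinom.cdf(L - 2, alpha + 1, p)
    return s1, closed
```

The two values agree to floating-point precision for any positive α, β, t.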


23.6.3 Calculation of the 2nd Moment of the Restricted Process

Relation (23.16) implies that

E[(n_i^L(t))^2] = \sum_{k=0}^{L-1} k^2 P(k, t, u) + L^2 \big(1 - P(PG(\alpha, \beta, x(t, u)) \le L - 1)\big).    (23.53)

To simplify calculations, we again write t instead of x(t, u). Denote the first sum on the right-hand side by M_2, and consider first the auxiliary sum

A = \sum_{k=0}^{L-1} k(k-1) \frac{\Gamma(k+\alpha)}{k!\,\Gamma(\alpha)} \frac{\beta^{\alpha} t^{k}}{(\beta+t)^{\alpha+k}}.

By the definition of S_1 we see that M_2 = A + S_1. Then, using the change of variable (k − 2 → k) and the relation Γ(α + 2) = α(α + 1)Γ(α), we get

A = \frac{\alpha(\alpha+1) t^2}{\beta^2} \sum_{k=2}^{L-1} \frac{\Gamma(k-2+\alpha+2)}{(k-2)!\,\Gamma(\alpha+2)} \frac{\beta^{\alpha+2} t^{k-2}}{(\beta+t)^{\alpha+2+k-2}}
  = \frac{\alpha(\alpha+1) t^2}{\beta^2} \sum_{k=0}^{L-3} \frac{\Gamma(k+\alpha+2)}{k!\,\Gamma(\alpha+2)} \frac{\beta^{\alpha+2} t^{k}}{(\beta+t)^{\alpha+2+k}}
  = \frac{\alpha(\alpha+1) t^2}{\beta^2} P(PG(\alpha+2, \beta, t) \le L-3).

Thus,

M_2 = \frac{\alpha(\alpha+1) t^2}{\beta^2} P(PG(\alpha+2, \beta, t) \le L-3) + \frac{\alpha t}{\beta} P(PG(\alpha+1, \beta, t) \le L-2).

Finally, putting back t = x(t, u), we get relation (23.18).
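As with the mean, this identity is easy to verify numerically (our own self-check, using the same NB parameterization of the PG distribution):

```python
from scipy.stats import nbinom

def truncated_second_moment_identity(alpha, beta, t, L):
    """Compare M2 = sum_{k<L} k^2 * pmf(k) with the closed form combining the
    two truncated PG probabilities from the derivation above."""
    p = beta / (beta + t)
    m2 = sum(k * k * nbinom.pmf(k, alpha, p) for k in range(L))
    closed = (alpha * (alpha + 1) * t ** 2 / beta ** 2
              * nbinom.cdf(L - 3, alpha + 2, p)
              + alpha * t / beta * nbinom.cdf(L - 2, alpha + 1, p))
    return m2, closed
```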


References

1. Anisimov, V.: Using mixed Poisson models in patient recruitment in multicentre clinical trials. Proc. World Congr. Eng. 2, 1046–1049 (2008)
2. Anisimov, V.: Statistical modeling of clinical trials (recruitment and randomization). Commun. Stat. Theory Methods 40(19–20), 3684–3699 (2011)
3. Anisimov, V.: Predictive event modelling in multicentre clinical trials with waiting time to response. Pharm. Stat. 10(6), 517–522 (2011)
4. Anisimov, V.: Predictive hierarchic modelling of operational characteristics in clinical trials. Commun. Stat. Simul. Comput. 45(05), 1477–1488 (2016)
5. Anisimov, V.: Discussion on the paper "Real-time prediction of clinical trial enrollment and event counts: a review" by Heitjan, D.F., Ge, Z., Ying, G.S. Contemp. Clin. Trials 46, 7–10 (2016)
6. Anisimov, V.: Discussion on the paper "Prediction of accrual closure date in multi-center clinical trials with discrete-time Poisson process models" by G. Tang, Y. Kong, C. Chang, L. Kong, and J. Costantino. Pharm. Stat. 11(5), 357–358 (2012)
7. Anisimov, V.: Predictive modelling of recruitment and drug supply in multicenter clinical trials. In: Proceedings of the Joint Statistical Meeting, Biopharmaceutical Section, Washington, DC, pp. 1248–1259. American Statistical Association (2009)
8. Anisimov, V.: Modern analytic techniques for predictive modelling of clinical trial operations. In: Marchenko, O., Katenka, N. (eds.) Quantitative Methods in Pharmaceutical Research and Development: Concepts and Applications, pp. 361–408. Springer International Publ. (2020)
9. Anisimov, V., Austin, M.: Centralized statistical monitoring of clinical trial enrollment performance. Commun. Stat. Case Stud. Data Anal. Appl. 6(4), 392–410 (2020)
10. Anisimov, V., Downing, D., Fedorov, V.: Recruitment in multicentre trials: prediction and adjustment. In: Lopez-Fidalgo, J., Rodriguez-Diaz, J.M., Torsney, B. (eds.) mODa 8—Advances in Model-Oriented Design and Analysis, pp. 1–8. Physica-Verlag (2007)
11. Anisimov, V., Fedorov, V.: Design of multicentre clinical trials with random enrolment. In: Balakrishnan, N., Auget, J.-L., Mesbah, M., Molenberghs, G. (eds.) Advances in Statistical Methods for the Health Sciences, Series: Statistics for Industry and Technology, Ch. 25, pp. 387–400. Birkhauser (2007)
12. Anisimov, V., Fedorov, V.: Modeling, prediction and adaptive adjustment of recruitment in multicentre trials. Stat. Med. 26(27), 4958–4975 (2007)
13. Bakhshi, A., Senn, S., Phillips, A.: Some issues in predicting patient recruitment in multi-centre clinical trials. Stat. Med. 32(30), 5458–5468 (2013)
14. Barnard, K.D., Dent, L., Cook, A.: A systematic review of models to predict recruitment to multicentre clinical trials. BMC Med. Res. Method. 10(63) (2010)
15. Bates, G.E., Neyman, J.: Contributions to the theory of accident proneness. Univ. Calif. Publ. Stat. 1(9), 215–254 (1952)
16. Bernardo, J.M., Smith, A.F.M.: Bayesian Theory. Wiley, Hoboken, NJ (2004)
17. Carter, R.E., Sonne, S.C., Brady, K.T.: Practical considerations for estimating clinical trial accrual periods: application to a multi-center effectiveness study. BMC Med. Res. Methodol. 5, 11–15 (2005)
18. Du, K.-L., Swamy, M.N.S.: Search and Optimization by Metaheuristics: Techniques and Algorithms Inspired by Nature. Birkhauser, Basel (2016)
19. Gajewski, B.J., Simon, S.D., Carlson, S.E.: Predicting accrual in clinical trials with Bayesian posterior predictive distributions. Stat. Med. 27, 2328–2340 (2008)
20. Gkioni, E., Riusd, R., Dodda, S., Gamblea, C.: A systematic review describes models for recruitment prediction at the design stage of a clinical trial. J. Clin. Epidemiol. 115, 141–149 (2019)
21. Heitjan, D.F., Ge, Z., Ying, G.S.: Real-time prediction of clinical trial enrollment and event counts: a review. Contemp. Clin. Trials 45(Part A), 26–33 (2015)
22. Johnson, N.L., Kotz, S., Kemp, A.W.: Univariate Discrete Distributions, 2nd edn. Wiley, New York (1993)
23. Mijoule, G., Savy, S., Savy, N.: Models for patients' recruitment in clinical trials and sensitivity analysis. Stat. Med. 31(16), 1655–1674 (2012)
24. Minois, N.N., Lauwers-Cances, V., Savy, S., Attal, M., Andrieua, S., Anisimov, V., Savy, N.: Using Poisson-gamma model to evaluate the duration of recruitment process when historical trials are available. Stat. Med. 36(23), 3605–3620 (2017)
25. Satterthwaite, F.E.: An approximate distribution of estimates of variance components. Biom. Bull. 2(6), 110–114 (1946)
26. Senn, S.: Statistical Issues in Drug Development. Wiley, Chichester (1997)
27. Senn, S.: Some controversies in planning and analysis multi-center trials. Stat. Med. 17, 1753–1756 (1998)
28. Tufts CSDD impact report: 89% of trials meet enrolment, but timelines slip, half of sites under-enrol. Tufts Center for the Study of Drug Development, Impact report, vol. 15, no. 1 (2013)
29. Welch, B.L.: The generalization of Student's problem when several different population variances are involved. Biometrika 34, 28–35 (1947)
30. Williford, W.O., Bingham, S.F., Weiss, D.G., Collins, J.F., Rains, K.T., Krol, W.F.: The 'constant intake rate' assumption in interim recruitment goal methodology for multicenter clinical trials. J. Chronic Dis. 40, 297–307 (1987)

Chapter 24

Algorithms for Recalculating Alpha and Eigenvector Centrality Measures Using Graph Partitioning Techniques

Collins Anguzu, Christopher Engström, Henry Kasumba, John Magero Mango, and Sergei Silvestrov

Abstract In graph theory, centrality measures are crucial for ranking the vertices of a graph in order of their importance. Alpha and eigenvector centralities are among the most prominent centrality measures, applied especially in social network analysis, disease diffusion networks and mechanical infrastructural developments. In this study we focus on recalculating alpha and eigenvector centralities using graph partitioning techniques. We write an algorithm for partitioning, sorting and efficiently computing these centralities for a graph, and then numerically demonstrate the technique on some sample small-sized networks.

Keywords Alpha centrality · Eigenvector centrality · Graph partitioning

MSC 2020 05C82, 05C70, 94C15, 05C76

C. Anguzu (B) · H. Kasumba · J. M. Mango: Department of Mathematics, School of Physical Sciences, Makerere University, Box 7062, Kampala, Uganda; e-mail: [email protected]; H. Kasumba e-mail: [email protected]; J. M. Mango e-mail: [email protected]. C. Engström · S. Silvestrov: Division of Mathematics and Physics, The School of Education, Culture and Communication, Mälardalen University, Box 883, 72123 Västerås, Sweden; e-mail: [email protected]; S. Silvestrov e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_24


24.1 Introduction

Researchers in co-authorship collaborations [23], genes or proteins in a biological structure [20, 26] and neurons in a neural network [19] are a few examples of entities that can be represented and studied using graphs [28]. In each of these cases the objects are referred to as vertices, whereas the interactions between them are edges. Data for a graph involving interactions within each of these structures usually contains a vast number of interacting vertices and edges, which can be of the order of millions [15]; an example is the Google matrix used by Brin and Page [14] in the computation of PageRank. This renders unsophisticated tools and techniques less useful in the mathematical analysis of such data [10]. Various researchers have, however, attempted to address this challenge by partitioning the large graph into blocks or components. The factor binding the vertices here is the similarity in their contact patterns and interactive features. Following [30], vertices with the same adjacency structure are partitioned into the same block; this, however, is only one of the many criteria used in graph partitioning. For more discussion on partitioning we refer the reader to [7, 9, 11, 12, 24, 25].

Network partitioning, also referred to as graph partitioning, is a technique of splitting a network into relatively smaller and autonomous subsystems or components. The splitting is based on some key characteristics of networks, such as vertices being strongly connected, vertices forming connected components, and so on. In practice, a network can be partitioned for different reasons, among which are:

• the ease of optimisation of the whole system owing to the autonomy of the individual components, for instance, maximum accuracy of results;
• the limited storage space required for each component;
• short computation time, since components at the same level can be run in parallel;
• the ease of identifying a problem, for instance, the cyclic nature of the graph;
• continuity in performance or use of a network device whenever there is a failure in the device. For instance, suppose vertices v_p, v_q, v_r and v_s define a network with v_p and v_q in one component and v_r and v_s in another. If a problem occurs with the edge linking the components, they can keep working independently, as in the case when they were all connected.

The authors in [3] assert that graph partitioning has of late gained ground as a result of its application to clustering and the detection of cliques in social, pathological and biological networks. According to [8, 18, 27], many widely used network measures, referred to as vertex centrality measures, target the evaluation, ranking and identification of important vertices. This can be subject to the power, influence or relevance of the vertices in the network. The use of centrality measures in large-scale and complex networks requires well-founded techniques [10]. This is our motivation for using graph partitioning in this study: to reduce large and complex networks into simpler and manageable subgraphs or components.

In spite of the huge volume of literature on graph centralities, there has been little on computing the eigenvector and alpha centralities by graph partitioning


techniques. In graph theory, eigenvector centrality gives comparative (relative) ranks to all vertices in the network. The idea behind the distribution of the ranks is that a vertex connected to highly ranked vertices attains a high rank, as opposed to one connected to many low-ranked vertices. Suppose that G := (V, E) is a graph with |V| vertices and |E| edges. The comparative rank x(i) of a vertex v_i is given by

x(i) = \frac{1}{\lambda} \sum_{j \in N(i)} x(j) = \frac{1}{\lambda} \sum_{j \in V} a_{ij} x(j),    (24.1)

where N(i) is the set of neighbours of v_i, λ is a constant (an eigenvalue of A), and a_{ij} = 1 if vertex v_i has influence on vertex v_j and zero otherwise. Relation (24.1) can be rewritten in matrix form as the standard eigenvector equation

\lambda x = A x.    (24.2)

In practice, several distinct values of the eigenvalue λ exist that give nonzero eigenvectors, but the condition for eigenvector centrality is that all elements of the vector must be non-negative. We therefore employ the Perron-Frobenius theorem [4], which allows us to consider only the dominant (greatest) eigenvalue and thereby fulfil the condition of non-negative elements in x. Since the eigenvector is only defined up to a constant factor, it needs to be normalised in some way; since this normalisation never changes the relative differences in the rank vector, the choice of normalisation is usually not important. The eigenvector x corresponding to the eigenvalue λ of a graph can be obtained by a variety of methods. One of the most powerful and easiest to apply, and the one we shall focus on, is the power method. The eigenvector centrality x(i) of a vertex v_i is the collection of influence x(j) from its directly connected neighbours v_j, and so is given by relation (24.1).

Alpha centrality is a way to incorporate information from outside the network (or part of the network) through a weight vector describing this influence. The alpha centrality x_α(i) of vertex v_i can be calculated using the relation

x_\alpha(i) = \alpha \sum_{j} a_{ij} x_\alpha(j) + e(i),    (24.3)

where α is a constant whose value establishes the difference or similarity between the two centralities. The vector e(i) encodes the external influence received by the vertex vi . It is observed that for zero external influence, the relation in (24.3) above reduces to an equation of a right eigenvector [21], with α becoming λ1 where λ is an eigenvalue of the adjacency matrix. In particular, we choose λ to be the dominant eigenvalue. The idea behind the eigenvector centrality is in two fold: In the first case, if a node v influences node u and node u does the same to other vertices with those vertices doing


C. Anguzu et al.

the same, then node $v$ is very influential. Secondly, eigenvector centrality provides a model of risk at each connected vertex: the risk of a vertex receiving influence in the long-term steady state depends on the number of connections it has [2]. In eigenvector centrality, the centrality $x(i)$ of a node $v_i$ is proportional to the sum of the centralities $x(j)$ of its neighbours, as written in Eq. (24.1). If we let $\lambda$ be the maximum of the absolute values of the eigenvalues of the adjacency matrix $A$ (the spectral radius $\rho(A)$), then by the Perron–Frobenius theorem there exists a unique non-negative eigenvector $x$ that satisfies Eq. (24.2) and is the eigenvector centrality of the network. The worth of the node $v_i$ is read off as the $i$th entry of $x$. According to [22], there are three advantages of the eigenvector centrality: (i) when computing eigenvector centrality, local information is captured, because the centrality of a vertex depends entirely on the centrality of its neighbours; (ii) global information is also used in computing the eigenvector centrality of a vertex, since any network has an extended neighbourhood reaching throughout the graph; (iii) with eigenvector centrality it is relatively fast to analyse a very large network, due to the availability of numerous numerical methods for efficiently computing eigenvalues and eigenvectors. We are motivated to consider these two centralities because of their vast application in social networks [1], networks of biological systems and disease diffusion networks [15], and because there is seemingly inadequate literature on the componentwise computation of these centralities.
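The power method mentioned above, which we use throughout for computing eigenvector centrality, can be sketched as follows. This is our own minimal NumPy illustration, not the authors' code; the example graph and the tolerance are arbitrary choices:

```python
import numpy as np

def eigenvector_centrality(A, tol=1e-10, max_iter=1000):
    """Power iteration on the transposed adjacency matrix (the convention
    of Eq. (24.2)), 1-norm normalised at every step."""
    x = np.ones(A.shape[0])
    for _ in range(max_iter):
        x_new = A.T @ x
        x_new /= np.abs(x_new).sum()   # normalisation: relative ranks unchanged
        if np.abs(x_new - x).sum() < tol:
            return x_new
        x = x_new
    return x

# A small strongly connected example graph (our own choice):
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [0, 1, 0]], dtype=float)
x = eigenvector_centrality(A)
lam = (A.T @ x)[0] / x[0]   # estimate of the dominant eigenvalue
```

Since the eigenvector is only defined up to a factor, any other normalisation could be used here without changing the ranking.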

24.1.1 Notation and Abbreviations

The words graph and network shall be used interchangeably to mean a collection of vertices with edges connecting the vertices. The adjacency matrix of a graph and its transpose shall be denoted by $A$ and $A^{\top}$ respectively. The vector of all ones shall be denoted by $e$. Throughout this work, alpha centrality and α-centrality shall be used interchangeably to mean the same thing. The eigenvalues of a matrix shall be denoted by $\lambda$. By $\lambda$, unless otherwise defined, we shall be referring to the dominant eigenvalue of a component, which is the dominant eigenvalue of the adjacency matrix representing the component. Generally, the dominant eigenvalue $\lambda_{\max}$ is the eigenvalue with the largest absolute value. In this paper $\lambda_{\max}$ will be positive, since we are using non-negative adjacency matrices of strongly connected components, which by the Perron–Frobenius theorem [14] have a positive dominant eigenvalue. It should be noted here that in all our computations we adopt the linear algebra convention of the transposed adjacency matrix acting on a column vector.

24 Algorithms for Recalculating Alpha and Eigenvector Centrality Measures …


Fig. 24.1 Example of a graph and corresponding components from SCC partitioning of the graph with 4 SCCs and one 1-vertex component. Component labels denote the level of each component in the SCC partition

24.1.2 Graph Concepts

A graph can be partitioned into three common and prominent types of components, namely the directed acyclic graph (DAG), the strongly connected components (SCCs) and the single-vertex components. We now give simple and precise definitions of these types of components and state some of their features.

Definition 1 A directed acyclic graph (DAG) is a directed graph with no directed cycles.

Definition 2 A strongly connected component (SCC) of a directed graph $G$ is a subgraph $G_1$ of $G$ such that for any pair of vertices $v_1$ and $v_2$ in $G_1$ there exists a path from $v_1$ to $v_2$ and vice versa. In addition, the subgraph is maximal: adding any further set of vertices and edges of $G$ to $G_1$ would break this property.

Definition 3 A single-vertex component is a component comprising only one vertex. Such a vertex can be a root vertex or a leaf vertex. A vertex is referred to as a root vertex if it has no incoming edges but at least one outgoing edge, whereas a vertex is a leaf vertex if it has at least one incoming edge but no outgoing edge.

Among the several centrality measures based on powers of the adjacency matrix, we focus on the α- and eigenvector centralities. We investigate how a graph can be ranked efficiently by the aforementioned centrality measures using the notion of components. The ranking of vertices is done by calculating the centrality measure one component at a time: first the rank of the top component is calculated normally, then the prior of the lower component is adjusted and the rank of this lower component is calculated normally, and this process continues until all the components are dealt with. In this paper we consider only graphs that are simple, directed and unweighted, with partitions that are strongly connected. Consequently, contracting the SCCs yields a directed acyclic graph (DAG): each component is treated as a single vertex and the resultant graph is a DAG, as shown in Fig. 24.1.

Theorem 1 Suppose that a graph $G$ is partitioned into strongly connected components. Contracting each component in $G$ into a single vertex contracts every directed cycle


to a single vertex, and this turns $G$ into a DAG. Then the existence of a directed edge between a pair of vertices in two neighbouring components implies the existence of a corresponding edge between the contracted vertices in the resultant DAG. Levels are then assigned to the vertices of the DAG beginning with the dangling vertices: each vertex of the DAG is in the level corresponding to the longest path length from it down to the lowest level. We note here that single-vertex components can in fact be regarded as SCCs.

Definition 4 (Regular graphs) A graph is $d$-regular if all its vertices have degree $d$.

Definition 5 Let $x \in \mathbb{R}^n \setminus \{0\}$ be an eigenvector of a graph $G$ with corresponding eigenvalue $\lambda \in \mathbb{R}$. Then for every $i = 1, 2, \ldots, n$,
$$\lambda x_i = \sum_{j \in N^+(i)} x_j,$$
where $N^+(i)$ is the out-neighbourhood of $i$.

We shall be handling regular graphs or subgraphs, with or without self-loops, in the subsequent sections. In such a case the corresponding maximum eigenvalue $\lambda$ equals $d$. In this section we have given a brief background of alpha and eigenvector centralities, notation and abbreviations, and some graph concepts. The rest of the article is organized as follows. In Sect. 24.2 we give the algorithms for graph partitioning and recalculation of the α-centrality measure, and show how Katz centrality can be determined from α-centrality. In Sect. 24.3 we rewrite the power method to suit the partitioning technique and show how such a representation converges; finally, we recalculate eigenvector centrality for graphs with different structural arrangements of the components. Section 24.4 presents conclusions of the study.

24.2 The Alpha Centrality Algorithm

For a complete and meaningful α-centrality algorithm, the steps in the following subsection are paramount.

24.2.1 Stages for Algorithm Formulation

In this subsection we consider three steps: partitioning the graph, formation of the relevant matrices and weight vectors, and computation of alpha centrality.

Partitioning of the graph

It is well known that SCCs can be found using Tarjan's algorithm [29]; variations in the use of the algorithm come at the implementation level. Here is a brief overview of the algorithm. In this step, every vertex $v$ is assigned six values, each of which is significant in the partitioning process. Their importance ranges from


holding the order in which the vertex $v$ is discovered during the depth-first search (DFS) to showing the level of the component in which $v$ is found. The following three steps summarise the partitioning process during the DFS:

(i) Discover: this step initialises the values for the vertex.
(ii) Explore: this step ensures that all neighbours of $v$ are visited and searched by DFS before going to the next vertex. Once a vertex is visited, $v.\mathit{lowlink}$ (the lowest index of any vertex that can be reached from $v$ within an SCC) and $v.\mathit{level}$ (the level of the component in which $v$ is found) are updated.
(iii) Finish: once all neighbours of $v$ are visited, a new component is created if possible.

More detail on finding SCCs by DFS can be found in [5]. As previously mentioned in Sect. 24.1.2, when the SCCs are contracted, the resultant graph is a DAG. This is a tree-like structure in which the vertex (contracted component) in the lowest position is assigned the lowest level; the levels then increase upwards through the structure.

Step of formation of relevant matrices and weight vectors

This is an intermediate step in which the relevant matrices of the components and their weight vectors are created from the data containing the list of edges and the weights of the vertices. It is in this step that varying implementation strategies can be employed, including, among others, solving a linear system that emanates from the permutation matrix obtained from the SCC partitioning [6] rather than solving the recursive reordering algorithm [13]. Here, components are first sorted by level and then in order of size, all from the largest component downwards. One component is then handled at a time at the computation stage, beginning with the largest component in each level. In this step we intend to separate the lists of edges within each component from those between levels.
The components and edges can be sorted efficiently in linear time, since all the different values are known and bounded, so no comparisons are needed. This is similar to sorting a large number of coins (edges) by first checking their individual values (levels), putting each coin into a pre-made bag for each kind of coin, and then taking the coins out of each bag in order. Finally, all 1-vertex components are merged so as to ease the computation of the α-centrality of many such components at a given level. At the end of this intermediate step, the matrices AuL and the weight vectors ζ for all the components are readily available, awaiting calculations.

Computation of the alpha centrality measure

We compute the α-centrality of the components beginning with the component(s) at the highest level, and then the largest component at each level. The size and the type of the component determine the technique to be used to compute the α-centrality of that component. In this work we assumed that there is no self-looping in the vertices
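The partitioning stage just described can be sketched as follows; this is our own illustrative implementation (Tarjan's algorithm for the SCCs, followed by longest-path level assignment on the contracted DAG), with all names invented and the paper's six per-vertex values reduced to the essentials:

```python
from collections import defaultdict

def tarjan_scc(adj, n):
    """Tarjan's algorithm: returns the list of SCCs of a digraph on
    vertices 0..n-1, adj[v] being the list of out-neighbours of v."""
    index, lowlink = {}, {}       # discovery order and lowest reachable index
    on_stack = [False] * n
    stack, sccs = [], []
    counter = [0]

    def strongconnect(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack[v] = True
        for w in adj[v]:
            if w not in index:
                strongconnect(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif on_stack[w]:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:          # v is the root of an SCC
            comp = []
            while True:
                w = stack.pop()
                on_stack[w] = False
                comp.append(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in range(n):
        if v not in index:
            strongconnect(v)
    return sccs

def scc_levels(adj, n):
    """Contract the SCCs and assign each component the length of its
    longest path down to the lowest level of the resulting DAG."""
    sccs = tarjan_scc(adj, n)
    comp_of = {v: c for c, comp in enumerate(sccs) for v in comp}
    dag = defaultdict(set)
    for v in range(n):
        for w in adj[v]:
            if comp_of[v] != comp_of[w]:
                dag[comp_of[v]].add(comp_of[w])
    level = {}
    def depth(c):                 # memoised longest-path depth in the DAG
        if c not in level:
            level[c] = 1 + max((depth(d) for d in dag[c]), default=-1)
        return level[c]
    for c in range(len(sccs)):
        depth(c)
    return sccs, level
```

On a graph with two roots feeding a 3-cycle that feeds a leaf, this assigns the roots level 2, the cycle level 1 and the leaf level 0, matching the layering used in the example of Sect. 24.2.3.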


for the case of alpha centrality, and that only SCC partitioning is done. The techniques used are as follows:

(a) For a collection of 1-vertex components without loops, the initial weight of the vertex is its α-centrality.
(b) For SCCs, we employ appropriate iterative method(s) to compute the centrality of the component.

For effective computation of the centralities, the following steps are necessary:

(i) The maximum component level in the whole graph is initialised as $L$.
(ii) Using an appropriate method, we calculate the α-centrality for each component at level $L$.
(iii) The weight vectors ζ for components at levels lower than $L$ are updated. This updating is done for all components at all levels below $L$ (i.e. $L-1$, $L-2$, ...) where there are connections with components at level $L$. At this stage of updating, in every case only vertices in the lower components with direct connections to vertices at level $L$ have their weights updated.

Once the update is finished for all components at levels below $L$, we leave level $L$ and shift attention to level $L-1$. We compute the α-centrality for each component at level $L-1$; thereafter the weights of components at levels below $L-1$ (i.e. $L-2$, $L-3$, ...) are updated similarly. As above, updating is only possible for components with vertices having a direct connection to vertices at level $L-1$. After updating all components at levels below $L-1$, attention shifts to level $L-2$. This process continues until we compute the ranks of the vertices in the components at the lowest level.
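The level-by-level procedure in steps (i)–(iii) can be sketched as follows. This is a simplified illustration under our own assumptions: the components and their levels are taken as given, each component's α-centrality is obtained by a direct linear solve of $(I - \alpha A^{\top})x = w$ rather than by an iterative method, and $\alpha$ is assumed small enough for convergence:

```python
import numpy as np

def componentwise_alpha(A, components, levels, alpha, weights):
    """Componentwise alpha centrality, processed from the highest level down.

    A: adjacency matrix with A[i, j] = 1 if v_i influences v_j.
    components: list of vertex-index lists; levels[c]: level of component c.
    weights: vector of external influences (the initial priors).
    """
    n = A.shape[0]
    x = np.zeros(n)
    w = weights.astype(float).copy()
    # handle components level by level, from the top of the DAG downwards
    for c in sorted(range(len(components)), key=lambda c: -levels[c]):
        idx = np.array(components[c])
        Asub = A[np.ix_(idx, idx)]
        # alpha centrality of the component with its current prior:
        # x_sub = (I - alpha * Asub^T)^{-1} w_sub
        x[idx] = np.linalg.solve(np.eye(len(idx)) - alpha * Asub.T, w[idx])
        # step (iii): update the priors of vertices outside the component
        # that receive edges directly from it
        for j in np.setdiff1d(np.arange(n), idx):
            w[j] += alpha * (A[idx, j] @ x[idx])
    return x

# A small 6-vertex layered graph (our own example, in the spirit of
# Fig. 24.2): two roots feeding a 3-cycle, which feeds a leaf.
A = np.zeros((6, 6))
for i, j in [(0, 2), (1, 2), (2, 3), (3, 4), (4, 2), (3, 5)]:
    A[i, j] = 1.0
x = componentwise_alpha(A, [[0], [1], [2, 3, 4], [5]], [2, 2, 1, 0],
                        0.1, np.ones(6))
```

The componentwise result agrees with the direct global solve $(I - \alpha A^{\top})^{-1}e$, which is the point of the scheme.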

24.2.2 Computing Other Centrality Measures by Using the α-Centrality Measure Algorithm

In this section we look at the viability of computing another centrality measure using the algorithm for α-centrality computation. We compare the candidate centrality measure with the α-centrality measure in two main aspects: the metric to which the centrality measures apply, and their mathematical formulation. Satisfying these conditions is the Katz centrality measure, which can then be calculated as follows. According to Katz [17], influence can be measured by a weighted column sum of all the powers of the adjacency matrix $A$, that is, $\sum_{l=1}^{\infty} (\alpha A)^l$, where $\alpha$ is an attenuation factor. The column sums are then given by multiplying the transpose of this matrix by the vector of all ones, $e$. The Katz centrality measure is then obtained as
$$x_z = \sum_{l=1}^{\infty} \alpha^l (A^{\top})^l e. \qquad (24.4)$$


Fig. 24.2 A simple directed graph G = (V, E) with |V | = 6 and |E| = 7

The solution to Eq. (24.3) is
$$x_\alpha = (I - \alpha A^{\top})^{-1} e. \qquad (24.5)$$
By applying the Neumann series, relation (24.5) becomes
$$x_\alpha = \sum_{l=0}^{\infty} \alpha^l (A^{\top})^l e, \qquad (24.6)$$
which is the α-centrality of the same graph with adjacency matrix $A$. Since the $l = 0$ term of the sum can be separated, (24.6) can thus be expressed as
$$x_\alpha = e + \sum_{l=1}^{\infty} \alpha^l (A^{\top})^l e. \qquad (24.7)$$
So, from relations (24.4) and (24.7) we have
$$x_z = x_\alpha - e. \qquad (24.8)$$
In this case, we simply subtract the vector of all ones from the result for the α-centrality so as to obtain the Katz centrality of the same graph.
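The relationship (24.8) between the two measures can be checked numerically. The sketch below is our own construction (a random digraph, with $\alpha$ chosen below $1/\lambda_{\max}$ so that both series converge), comparing the truncated Katz series (24.4) with $x_\alpha - e$:

```python
import numpy as np

# Random digraph (our own test data); a 2-cycle is forced so lambda_max > 0.
rng = np.random.default_rng(0)
A = (rng.random((8, 8)) < 0.4).astype(float)
np.fill_diagonal(A, 0)
A[0, 1] = A[1, 0] = 1.0
lam_max = max(abs(np.linalg.eigvals(A)))
alpha = 0.5 / lam_max            # alpha < 1/lambda_max: both series converge
e = np.ones(8)

# alpha centrality via the closed form (24.5)
x_alpha = np.linalg.solve(np.eye(8) - alpha * A.T, e)

# Katz centrality via the series (24.4), truncated at 200 terms
x_z = np.zeros(8)
term = e.copy()
for _ in range(200):
    term = alpha * (A.T @ term)  # next term alpha^l (A^T)^l e
    x_z += term
```

The truncation error is of order $(\alpha\lambda_{\max})^{200} = 2^{-200}$, so the series and the closed form agree to machine precision.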

24.2.3 Example

In this subsection we give an example of how alpha centrality can be calculated by partitioning a simple directed graph. Consider the simple directed graph in Fig. 24.2, with six vertices, each with an initial weight of one. Suppose that the sets of vertices $\{v_1, v_2\}$, $\{v_3, v_4, v_5\}$ and $\{v_6\}$ constitute the levels 2, 1 and 0 respectively. We compute the α-centrality beginning with the top (level 2) components and end with the level 0 component. First the α-centrality of the top components is determined; this is followed by updating the weights of the component in level 1, after which the α-centrality of the component in level 1 is calculated. The rest of the procedure follows the description in Sect. 24.2, as shown in the following subsections.


Computing the α-centrality of the top (level 2) components

This level consists of two single-vertex components. Since the components are at the topmost level, their centrality measures remain the same as the initial weights assigned to the vertices. Let $x(1)$ and $x(2)$ denote the alpha centralities of vertices $v_1$ and $v_2$ respectively in level 2, and let each vertex in the graph be assigned an initial weight 1. Then for vertices $v_1$ and $v_2$ respectively we have $x(1) = (1, 0)^{\top}$ and $x(2) = (0, 1)^{\top}$; the dimension of the vectors $x(1)$ and $x(2)$ corresponds to the number of vertices in level 2. The single vector $x_{\alpha,2} = x(1) + x(2) = (1, 1)^{\top}$ gives the alpha centrality measure of the vertices $v_1$ and $v_2$ in level 2. Since the level 2 components are connected to level 1, the centralities of the level 2 components will influence the rank of the level 1 component as shown below.

Computing the rank of the level 1 component normally

Let $\alpha$ be the weight of each edge. We observe in Fig. 24.2 that the connection between the level 2 and level 1 components goes via vertex $v_3$. The total weight received by $v_3$ from $v_1$ and $v_2$ is $2\alpha$. Since the other vertices $v_4$ and $v_5$ have not received any weights from the level 2 components, they have no additional weights. Hence the overall vector of weights received by the component in level 1 takes the form $(2\alpha, 0, 0)^{\top}$, and the weights of the level 1 component are given by
$$w = \begin{pmatrix}1\\1\\1\end{pmatrix} + \begin{pmatrix}2\alpha\\0\\0\end{pmatrix} = \begin{pmatrix}1+2\alpha\\1\\1\end{pmatrix}.$$
Using the above prior, we can now compute the rank of the level 1 component in the following way. The adjacency matrix corresponding to this component is denoted by $A_1$ and is first obtained as
$$A_1 = \begin{pmatrix}0&1&0\\0&0&1\\1&0&0\end{pmatrix}, \qquad A_1^{\top} = \begin{pmatrix}0&0&1\\1&0&0\\0&1&0\end{pmatrix}.$$
So, from the definition of α-centrality, we obtain the rank $x_{\alpha,1}$ of the component at level 1 as
$$x_{\alpha,1} = \sum_{l=0}^{\infty} \alpha^l (A_1^{\top})^l\, w.$$
Writing out the terms for $l = 0, 1, 2, 3$:
$$x_{\alpha,1} = \begin{pmatrix}1+2\alpha\\1\\1\end{pmatrix} + \begin{pmatrix}\alpha\\\alpha+2\alpha^2\\\alpha\end{pmatrix} + \begin{pmatrix}\alpha^2\\\alpha^2\\\alpha^2+2\alpha^3\end{pmatrix} + \begin{pmatrix}\alpha^3+2\alpha^4\\\alpha^3\\\alpha^3\end{pmatrix} + \cdots$$
Thus, the α-centrality of the level 1 component in the general sense is
$$x_{\alpha,1} = \begin{pmatrix}1 + 3\alpha + \alpha^2 + \alpha^3 + 2\alpha^4 + \cdots\\ 1 + \alpha + 3\alpha^2 + \alpha^3 + \cdots\\ 1 + \alpha + \alpha^2 + 3\alpha^3 + \cdots\end{pmatrix}. \qquad (24.9)$$

The summation in relation (24.9) continues until convergence is reached.

Computing the rank of the level 0 component and the alpha centrality of the entire graph

The weight of the single-vertex component $v_6$ is then $1 + r_4\alpha$, where $r_4 = 1 + \alpha + 3\alpha^2 + \alpha^3 + \cdots$ is the rank of vertex $v_4$. Then the rank of $v_6$, denoted by $x_{\alpha,0}$, is
$$x_{\alpha,0} = 2 + \alpha + 3\alpha^2 + \alpha^3 + \cdots. \qquad (24.10)$$

It is observed that $x_{\alpha,2}$ represents the α-centrality measures of the single-vertex components $v_1$ and $v_2$, $x_{\alpha,1}$ in relation (24.9) represents that of $v_3$, $v_4$ and $v_5$, whereas $x_{\alpha,0}$ in relation (24.10) gives the centrality measure of $v_6$. The vector $x_\alpha$ corresponding to the α-centrality measure of the vertices of the entire graph is then obtained by simply combining $x_{\alpha,2}$, $x_{\alpha,1}$ and $x_{\alpha,0}$:
$$x_\alpha = \begin{pmatrix}1\\1\\1+3\alpha+\alpha^2+\alpha^3+2\alpha^4+\cdots\\1+\alpha+3\alpha^2+\alpha^3+\cdots\\1+\alpha+\alpha^2+3\alpha^3+\cdots\\2+\alpha+3\alpha^2+\alpha^3+\cdots\end{pmatrix}.$$
Following (24.8), we compute the Katz centrality for the above graph:
$$x_z = x_\alpha - e = \begin{pmatrix}0\\0\\3\alpha+\alpha^2+\alpha^3+2\alpha^4+\cdots\\\alpha+3\alpha^2+\alpha^3+\cdots\\\alpha+\alpha^2+3\alpha^3+\cdots\\1+\alpha+3\alpha^2+\alpha^3+\cdots\end{pmatrix}.$$
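The level 1 computation above can be verified numerically: the truncated Neumann series $\sum_l \alpha^l (A_1^{\top})^l w$ must agree with the closed form $(I - \alpha A_1^{\top})^{-1} w$, and its first four terms with the coefficients displayed before (24.9). A sketch, with $\alpha = 0.1$ as our own choice (any $\alpha < 1$ works here, since the 3-cycle has spectral radius 1):

```python
import numpy as np

alpha = 0.1                              # our own choice of attenuation factor
A1 = np.array([[0., 1., 0.],
               [0., 0., 1.],
               [1., 0., 0.]])            # the 3-cycle v3 -> v4 -> v5 -> v3
w = np.array([1 + 2 * alpha, 1.0, 1.0])  # prior updated by the level 2 components

# Neumann series sum_l alpha^l (A1^T)^l w, truncated at 100 terms
x = w.copy()
term = w.copy()
for l in range(1, 100):
    term = alpha * (A1.T @ term)
    x += term
    if l == 3:
        partial = x.copy()   # partial sum through l = 3, cf. the displayed terms

closed = np.linalg.solve(np.eye(3) - alpha * A1.T, w)   # exact alpha centrality
```

The partial sum reproduces exactly the polynomial coefficients listed in (24.9), and the full series matches the closed-form solve.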

24.3 The Eigenvector Centrality for Large Directed Graphs

24.3.1 The Eigenvector Centrality Algorithm

Here we have two main steps.

Finding components. This step involves partitioning the graph into components and then finding the levels of the components, as described in Sect. 24.1 of this paper. Our focus still lies on the strongly connected components, except that in this case of the eigenvector centrality we allow the vertices to have self-loops.

Formation of relevant matrices. The relevant matrices are established from the available data pertaining to the list of edges and the weights of the vertices. In addition to the formation of the relevant matrices, this step incorporates computation of the dominant eigenvalues and their corresponding eigenvectors. Consider two components: one at the top, with dominant eigenvalue $\lambda_1$ and eigenvector $x_1$, and the other below, with dominant eigenvalue $\lambda_2$ and eigenvector $x_2$. Here, we shall assume that the dominant eigenvalue and its corresponding eigenvector are known for the top component. We then compute the dominant eigenvalue and eigenvector for the lower component at the limiting point. The state of convergence for this case, and for the subsequent case $\lambda_1 \le \lambda_2$, is shown in Cases 1 and 2 of Sect. 24.3.2.


24.3.2 Reformulation of the Power Method

The original, normalised power method for a whole network with adjacency matrix $A$ is
$$x^{(k+1)} = \frac{A^{\top} x^{(k)}}{\|A^{\top} x^{(k)}\|_1}. \qquad (24.11)$$
We disregard the normalisation $\|A^{\top} x^{(k)}\|_1$, since that can be applied at any iteration, and instead consider the simpler formula
$$x^{(k+1)} = A^{\top} x^{(k)}. \qquad (24.12)$$
In a graph partitioned into components, there are edges linking one component to another. Suppose that two components are linked with edges running from the one at the top, with dominant eigenvalue $\lambda_1$, to the one at the bottom, with dominant eigenvalue $\lambda_2$, such that $\lambda_1 > \lambda_2$. We decompose $A^{\top}$ as
$$A^{\top} = \begin{pmatrix} A_1^{\top} & 0 \\ B^{\top} & A_2^{\top} \end{pmatrix},$$
where $A_1$ corresponds to the top component, $A_2$ to the bottom component and $B$ to the edges from the top component to the bottom component; finally, $0$ is a zero matrix, since there are no edges from the bottom to the top component. Our goal is to find a simple and efficiently calculated expression for the part of the vector $x^{(k+1)}$ corresponding to the bottom component, depending on the eigenvalues of the top and bottom components when considered alone.

Lemma 1 Let $\bar{A} = A_2^{\top}/\lambda_1$, where $A_1$, $\lambda_1$ are the matrix and eigenvalue of the top component, $A_2$, $\lambda_2$ those of the bottom component, and $B$ is the part of the original matrix corresponding to the edges from the top to the bottom component. Then the eigenvector centrality as calculated in Eq. (24.12) using non-normalised power iterations can be calculated by
$$x_2^{(k)} = (A_2^{\top})^k x_2^{(0)} + \lambda_1^k \sum_{i=0}^{k-1} \bar{A}^i v,$$
where $v = B^{\top} x_1$ is calculated from the eigenvalue equation $\lambda_1 x_1 = A_1^{\top} x_1$, normalised such that $|v| = |x_1^{(0)}|$.

Proof We start from (24.12), considering only the part $x_2^{(k)}$ corresponding to the vertices in the lower component after $k$ iterations:
$$x_2^{(1)} = A_2^{\top} x_2^{(0)} + B^{\top} x_1^{(0)},$$
$$x_2^{(2)} = A_2^{\top}\bigl(A_2^{\top} x_2^{(0)} + B^{\top} x_1^{(0)}\bigr) + B^{\top} x_1^{(1)} = (A_2^{\top})^2 x_2^{(0)} + A_2^{\top} B^{\top} x_1^{(0)} + B^{\top} A_1^{\top} x_1^{(0)},$$
$$\vdots$$
$$x_2^{(k)} = (A_2^{\top})^k x_2^{(0)} + \sum_{i=0}^{k-1} (A_2^{\top})^{k-1-i} B^{\top} (A_1^{\top})^i x_1^{(0)}.$$
Next, we let the $x_1^{(0)}$ corresponding to the top component start in the steady state (this corresponds to a really lucky "guess" of initial values), which gives us

$$x_2^{(k)} = (A_2^{\top})^k x_2^{(0)} + \sum_{i=0}^{k-1} \lambda_1^i (A_2^{\top})^{k-1-i} B^{\top} x_1^{(0)} = (A_2^{\top})^k x_2^{(0)} + \lambda_1^k \sum_{i=0}^{k-1} \left(\frac{A_2^{\top}}{\lambda_1}\right)^i v,$$
where the constant factor arising from reindexing the sum is absorbed into the normalisation of $v$, completing the proof. $\square$

Case 1: $\lambda_1 > \lambda_2$

Next we consider what happens with the ranking vector for the lower component when $\lambda_1 > \lambda_2$ (the top component having the larger eigenvalue).

Theorem 2 Let $\bar{A} = A_2^{\top}/\lambda_1$, where $A_1$, $\lambda_1$ are the matrix and eigenvalue of the top component, $A_2$, $\lambda_2$ those of the bottom component, and $B$ is the part of the original matrix corresponding to the edges from the top to the bottom component. Then the eigenvector centrality as calculated in Eq. (24.12) using non-normalised power iterations can be calculated by
$$x_2^{(k)} = (A_2^{\top})^k x_2^{(0)} + \lambda_1^k \sum_{i=0}^{k-1} \bar{A}^i v, \qquad (24.13)$$
where $v = B^{\top} x_1$ is calculated from the eigenvalue equation $\lambda_1 x_1 = A_1^{\top} x_1$, normalised such that $|v| = |x_1^{(0)}|$. If $\lambda_1 > \lambda_2$, then in the limit $k \to \infty$ a normalised rank vector can be written as
$$x_2^{(k)} = \sum_{i=0}^{k-1} \left(\frac{A_2^{\top}}{\lambda_1}\right)^i v. \qquad (24.14)$$

Proof We already proved the first part of the theorem in Lemma 1, so we only need to check what happens with Eq. (24.13) as $k \to \infty$ when $\lambda_1 > \lambda_2$. Writing out the sum we get
$$x_2^{(k)} = (A_2^{\top})^k x_2^{(0)} + \lambda_1^k \left( I + \frac{A_2^{\top}}{\lambda_1} + \cdots + \left(\frac{A_2^{\top}}{\lambda_1}\right)^{k-1} \right) v.$$
To compare the speed of growth of the two terms we apply a normalisation at step $k$, namely we divide the result by $\lambda_1^k$:
$$\frac{x_2^{(k)}}{\lambda_1^k} = \left(\frac{A_2^{\top}}{\lambda_1}\right)^k x_2^{(0)} + \left( I + \frac{A_2^{\top}}{\lambda_1} + \cdots + \left(\frac{A_2^{\top}}{\lambda_1}\right)^{k-1} \right) v.$$
Since $\lambda_1 > \lambda_2$, the first term tends towards zero, while the second term is equal to the result we seek:
$$\lim_{k\to\infty} \frac{x_2^{(k)}}{\lambda_1^k} = \sum_{i=0}^{\infty} \left(\frac{A_2^{\top}}{\lambda_1}\right)^i v.$$
Hence the result in the theorem follows, since the term $\lambda_1^k$ is just a normalisation constant used to prevent the vector from growing in every iteration. $\square$

This means that if we know $\lambda_1 > \lambda_2$ and have already calculated the rank vector for the top component, we have an efficient way to calculate the rank vector for the bottom component using a series of matrix–vector multiplications and vector additions on a smaller matrix than the original.
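Theorem 2 can be checked numerically on a small block-triangular example of our own construction: a top all-ones $4\times 4$ block with $\lambda_1 = 4$, a bottom all-ones $3\times 3$ block with $\lambda_2 = 3$, and a single edge from the top to the bottom component:

```python
import numpy as np

# Top component: 4-clique with self-loops (lambda_1 = 4); bottom component:
# 3-clique with self-loops (lambda_2 = 3); one top-to-bottom edge.
A1 = np.ones((4, 4))
A2 = np.ones((3, 3))
lam1 = 4.0
x1 = np.ones(4)            # dominant eigenvector of A1^T: A1^T x1 = 4 x1
B = np.zeros((4, 3))
B[0, 0] = 1.0              # the single top-to-bottom edge
v = B.T @ x1               # contribution received by the bottom: v = (1, 0, 0)

# Series of Eq. (24.14): x2 = sum_i (A2^T / lam1)^i v
x2 = np.zeros(3)
term = v.copy()
for _ in range(200):
    x2 += term
    term = (A2.T @ term) / lam1

# Cross-check: normalised power iteration on the full transposed matrix
At = np.block([[A1.T, np.zeros((4, 3))], [B.T, A2.T]])
y = np.ones(7)
for _ in range(200):
    y = At @ y
    y /= y.sum()           # all entries stay positive, so sum-normalise
```

The bottom part of the whole-graph rank vector `y` is proportional to the series result `x2`, so the componentwise computation and the direct one agree.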


Case 2: $\lambda_1 \le \lambda_2$

Next we consider what happens with the ranking vector for the lower component in the opposite scenario, where $\lambda_1 \le \lambda_2$.

Theorem 3 Let $\bar{A} = A_2^{\top}/\lambda_1$, where $A_1$, $\lambda_1$ are the matrix and eigenvalue of the top component, $A_2$, $\lambda_2$ those of the bottom component, and $B$ is the part of the original matrix corresponding to the edges from the top to the bottom component. Then the eigenvector centrality, as calculated in (24.12) using non-normalised power iterations, can be calculated by
$$x_2^{(k)} = (A_2^{\top})^k x_2^{(0)} + \lambda_1^k \sum_{i=0}^{k-1} \bar{A}^i v, \qquad (24.15)$$
where $v = B^{\top} x_1$ is calculated from the eigenvalue equation $\lambda_1 x_1 = A_1^{\top} x_1$, normalised such that $|v| = |x_1^{(0)}|$. If $\lambda_1 \le \lambda_2$, then in the limit $k \to \infty$ a normalised rank vector can be written as
$$x_2^{(k)} = \frac{(A_2^{\top})^k v}{\|(A_2^{\top})^k v\|}.$$

Proof Again we do not need to prove the first part, since it is already proven in Lemma 1; we only need to check what happens with (24.15) as $k \to \infty$ when $\lambda_1 \le \lambda_2$. As in the proof of Theorem 2 we write out the sum, but this time divide by $\lambda_2^k$ instead:
$$\frac{x_2^{(k)}}{\lambda_2^k} = \left(\frac{A_2^{\top}}{\lambda_2}\right)^k x_2^{(0)} + \left(\frac{\lambda_1}{\lambda_2}\right)^k \left( I + \frac{A_2^{\top}}{\lambda_1} + \cdots + \left(\frac{A_2^{\top}}{\lambda_1}\right)^{k-1} \right) v.$$
Now neither the first nor the second term goes towards zero; however, we can show that this does not matter, since both terms converge to the same vector (up to multiplication by some constant). As $k$ increases, the first term $\left(\frac{A_2^{\top}}{\lambda_2}\right)^k x_2^{(0)}$ converges towards the (right) eigenvector of the dominant eigenvalue of $A_2^{\top}$. Similarly, as $k$ increases the newly added terms in the sum $Iv + \frac{A_2^{\top}}{\lambda_1} v + \cdots + \left(\frac{A_2^{\top}}{\lambda_1}\right)^{k-1} v$ converge towards the same eigenvector, just scaled differently and increasing in magnitude over time. Hence, to get the vector we need, we can pick either term of the expression and normalise the final result as we see fit. One option is to pick the last term in the sum on the right-hand side and normalise by dividing by the norm of the result, to get
$$x_2^{(k)} = \frac{(A_2^{\top})^k v}{\|(A_2^{\top})^k v\|}. \qquad \square$$

We conclude this section with a remark on what happens with the top component in case λ1 ≤ λ2 . In the case of strict inequality, it is clear that the top component will in fact lose all rank previously calculated since a new larger normalisation constant would be used for the graph as a whole. In the case where they have equal eigenvalue the situation is a bit more tricky, especially if there are more than two components all with the same largest eigenvalue. Thankfully, this scenario should be rare for most common networks, especially larger networks due to the presence of a supercomponent containing a large proportion of the vertices and therefore almost surely having the largest eigenvalue. Finally, while the final rank could be represented in various ways, the reason we picked this one is that it allows the user to first calculate the top component completely
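The behaviour in Theorem 3 and the remark above can likewise be checked numerically; again this is our own construction, inverting the previous example so that the bottom eigenvalue $\lambda_2 = 4$ dominates $\lambda_1 = 3$:

```python
import numpy as np

# Inverted setup (our own example): top 3-clique with self-loops
# (lambda_1 = 3), bottom 4-clique with self-loops (lambda_2 = 4).
A1 = np.ones((3, 3))
A2 = np.ones((4, 4))
v = np.array([1.0, 0.0, 0.0, 0.0])   # v = B^T x1 for a single top-to-bottom edge

# Theorem 3: x^(k) = (A2^T)^k v / ||(A2^T)^k v|| converges to the dominant
# eigenvector of A2^T.
x = v.copy()
for _ in range(100):
    x = A2.T @ x
    x /= np.linalg.norm(x)

# Global power iteration: the top component's share of the rank washes out.
B = np.zeros((3, 4))
B[0, 0] = 1.0
At = np.block([[A1.T, np.zeros((3, 4))], [B.T, A2.T]])
y = np.ones(7)
for _ in range(200):
    y = At @ y
    y /= y.sum()
```

As the remark predicts, the top component's share of the normalised global rank vanishes in the limit, since the whole-graph normalisation is governed by the larger eigenvalue $\lambda_2$.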


Fig. 24.3 A simple setup of two components formed from seven vertices, each having a self-loop, with the top component having a bigger dominant eigenvalue than the one below. The adjacency matrix $A$ corresponds to the graph

(rank vector and eigenvalue), and then, when focusing on the second component, only the power series in Eq. (24.14) needs to be calculated. If this converges, then we know $\lambda_2 < \lambda_1$ and we have our rank; if it does not, then we continue with ordinary power iterations from the last term in the sum to get the new largest $\lambda$ and the corresponding rank vector.

24.3.3 Computing Eigenvector Centrality of a Graph Componentwise

In this section we consider complete directed graphs of varying component compositions. The variation is based on the magnitude of the dominant eigenvalue of each of the components, which is observed in the state of convergence or non-convergence of the eigenvectors of these components. Throughout this work, we denote the dominant eigenvalues of the components in the network by $\lambda_2$ (for the lower component) and $\lambda_1$ (for the upper component). This is divided into the following cases.

Case 1: Top component having $\lambda_1$ ($\lambda_1 > \lambda_2$).

Here, we consider a case where a graph with 7 vertices is partitioned into two components, each vertex with a self-loop (see Fig. 24.3). One component on top, with four vertices, has its dominant eigenvalue computed as 4, and it is assumed that no information is known about the dominant eigenvalue of the bottom component. The two components are each strongly connected. We initially assign a score of one to all the vertices in the graph. For the top component, the adjacency matrix is
$$A_1 = \begin{pmatrix}1&1&1&1\\1&1&1&1\\1&1&1&1\\1&1&1&1\end{pmatrix}.$$
The eigenvector for $\lambda_1 = 4$, obtained by using the usual power method (Algorithm 1), is


$$x_{u_3} = \begin{pmatrix}1\\1\\1\\1\end{pmatrix}, \qquad (24.16)$$
where $u_3$ denotes the upper component of Fig. 24.3.

Algorithm 1: To compute the eigenvector centrality of any component of a graph normally
Step 1: Input $A^{\top}$ (adjacency matrix); choose $x^{(0)}$ (initial vector of ones); choose $\epsilon$ (an arbitrary small positive number).
Step 2: Compute $x^{(i+1)} = A^{\top} x^{(i)} / \|A^{\top} x^{(i)}\|_1$.
Step 3: If $\|x^{(i+1)} - x^{(i)}\| < \epsilon$, then STOP ($x^{(i+1)}$ is the centrality vector); else set $i := i + 1$ and repeat from Step 2.

For the lower component, the adjacency matrix is
$$A_2 = \begin{pmatrix}1&1&1\\1&1&1\\1&1&1\end{pmatrix}.$$
Let the initial vector be $x^{(0)} = (1, 1, 1)^{\top}$. Then relation (24.11) can now be used to iteratively compute

the eigenvectors of the lower component until the steady state. In the first iteration, we multiply the matrix $A_2$ by the initial vector and then add the weight vector $v = (1, 0, 0)^{\top}$, representing the contribution from the top component, as shown below.

Iteration 1:
$$\begin{pmatrix}1&1&1\\1&1&1\\1&1&1\end{pmatrix} \begin{pmatrix}1\\1\\1\end{pmatrix} + \begin{pmatrix}1\\0\\0\end{pmatrix} = \begin{pmatrix}4\\3\\3\end{pmatrix} = 4 \begin{pmatrix}1\\3/4\\3/4\end{pmatrix}.$$
From the result $(4, 3, 3)^{\top}$ in iteration 1 above, we factor out the dominant eigenvalue 4 of the top component. The resulting vector $x_2^{(1)} = (1, 3/4, 3/4)^{\top}$ then replaces $x_2^{(0)}$ in the second iteration. The above procedure

is repeated in the successive iterations until a steady vector is achieved.

Iteration 2:
$$\begin{pmatrix}1&1&1\\1&1&1\\1&1&1\end{pmatrix} \begin{pmatrix}1\\3/4\\3/4\end{pmatrix} + \begin{pmatrix}1\\0\\0\end{pmatrix} = \begin{pmatrix}3.5\\2.5\\2.5\end{pmatrix} = 4 \begin{pmatrix}3.5/4\\2.5/4\\2.5/4\end{pmatrix}.$$
At the 20th iteration we get the approximate vector $x_2^{(20)}$, at the steady state, as
$$x_2^{(20)} = \begin{pmatrix}1.9968\\0.9968\\0.9968\end{pmatrix}. \qquad (24.17)$$

Following the trend of the vectors in the iterations preceding the 20th, we observe that the steady-state vector is not far from the vector $x_2^{(20)}$ above. To obtain the overall steady-state vector of the graph under the condition $\lambda_1 > \lambda_2$, we use the following conjecture.

Conjecture 1 Let two components be connected such that rank flows from the upper component $u$ to the lower component $l$. Let $\lambda_1$, $\lambda_2$ ($\lambda_1 > \lambda_2$) and $x_u$, $x_l$ be the dominant eigenvalues and the rank vectors of $u$ and $l$ respectively. The overall rank vector of the graph at steady state is given by
$$x = \frac{1}{S} \begin{pmatrix} x_u \\ \frac{1}{\lambda_1} x_l \end{pmatrix},$$
where $S$ is the algebraic sum of all entries of the vectors $x_u$ and $\frac{1}{\lambda_1} x_l$.

By Conjecture 1, $x_{u_3}$ in (24.16) and $x_2^{(20)}$ in (24.17) are combined and normalised into
$$x = \begin{pmatrix}0.2001\\0.2001\\0.2001\\0.2001\\0.0999\\0.0499\\0.0499\end{pmatrix}. \qquad (24.18)$$

Meanwhile, for the whole graph with the adjacency matrix as in Fig. 24.3, the normalised eigenvector centrality computed by Algorithm 1 is
$$\bar{x} = \begin{pmatrix}0.2000\\0.2000\\0.2000\\0.2000\\0.1000\\0.0500\\0.0500\end{pmatrix}. \qquad (24.19)$$

Since the ordering of entries in the vectors in relations (24.18) and (24.19) is the same, the two approaches of computation are consistent. It is observed from the above computations that the vertex that receives the scores from the donating component will always have a higher entry than its counterparts in the same component. For this scenario, where the top component has the larger dominant eigenvalue, Algorithm 2 can be used to compute the ranks of the lower component.

Algorithm 2: To compute eigenvector centrality of a component with contribution from another component
Step 1: Input $A_2$, $\lambda_1$, $v$ (adjacency matrix of the lower component, dominant eigenvalue of the top component, contribution vector from the top component).
Set $x_2^{(0)} = e$ and $y^{(0)} = v$, where $e$ is the all-ones vector.
Set $i = 0$ (initial iteration counter).
Set $\bar{A} = A_2/\lambda_1$.
Choose $\epsilon$ (an arbitrary small positive number).
Choose $\tau$ (a small positive integer setting when to start checking for non-convergence).
Step 2: Compute $x_2^{(i+1)} = \bar{A} x_2^{(i)}$.
Step 3: Compute $y^{(i+1)} = y^{(i)} + x_2^{(i+1)}$.
Step 4: If $\|y^{(i+1)} - y^{(i)}\| < \epsilon$, then STOP ($y^{(i+1)}$ is the centrality vector).
Step 5: If $i > \tau$ and $\|x_2^{(i+1)}\|_1 > \|x_2^{(i)}\|_1$, calculate the rank vector using Algorithm 1 with $y^{(i+1)}$ as the initial rank vector.
Else: set $i = i + 1$ and repeat from Step 2.
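The steps of Algorithm 2 can be sketched as a short routine (an illustration under the assumptions above; the tolerances $\epsilon$ and $\tau$ are the quantities named in Step 1):

```python
def algorithm2(A2, lam1, v, eps=1e-8, tau=5, max_iter=10_000):
    """Sketch of Algorithm 2: accumulate y <- y + (A2/lam1)^i e.
    Returns (y, converged). converged=False signals the non-convergent case,
    where Algorithm 1 should be rerun with y as the initial rank vector."""
    n = len(A2)
    x = [1.0] * n          # x_2^(0) = e (all-ones vector)
    y = list(v)            # y^(0) = v
    for i in range(max_iter):
        x_new = [sum(A2[r][c] * x[c] for c in range(n)) / lam1 for r in range(n)]
        y_new = [y[r] + x_new[r] for r in range(n)]
        diff = sum((a - b) ** 2 for a, b in zip(y_new, y)) ** 0.5
        if diff < eps:
            return y_new, True         # Step 4: steady state reached
        if i > tau and sum(map(abs, x_new)) > sum(map(abs, x)):
            return y_new, False        # Step 5: increments growing, no convergence
        x, y = x_new, y_new
    return y, True
```

On the Case 1 data (3×3 all-ones $A_2$, $\lambda_1 = 4$, $v = (1,0,0)^T$) the accumulation settles near $(4, 3, 3)^T$; on 4×4 all-ones data with $\lambda_1 = 3$ the increments grow and the non-convergence branch fires.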

Case 2: Top component having $\lambda_1$ ($\lambda_1 < \lambda_2$).


C. Anguzu et al.

Fig. 24.4 A simple setup of two components with the corresponding adjacency matrix A. The graph consists of seven vertices, each having a self-loop, with the top component having a smaller dominant eigenvalue than the one below

In this case, Fig. 24.4, we have chosen simply to invert the graph structure of Case 1. For the top component of three vertices, each with a self-loop, we have the matrix
$$A_1 = \begin{pmatrix}1&1&1\\1&1&1\\1&1&1\end{pmatrix}$$
with the dominant eigenvalue $\lambda_1 = 3$, and so, applying Algorithm 1, the rank vector is $x_{u4} = (1, 1, 1)^T$. For the lower component, the adjacency matrix is given by
$$A_2 = \begin{pmatrix}1&1&1&1\\1&1&1&1\\1&1&1&1\\1&1&1&1\end{pmatrix}.$$
We want the long-term equilibrium vector $x_2$ for the second component, with the consideration that it continues to receive scores from the top component via the edge connecting vertices 3 and 4. We use Algorithm 2 with some slight alterations: the value of $\lambda_1$ is now 3 and the vector $v = (1, 0, 0, 0)^T$. In this case the iterations do not converge. We are then motivated to establish the dominant eigenvalue of the lower component, which is found to be 4. By the Perron–Frobenius theorem, one can conclude that the eigenvector of the component with dominant eigenvalue 4 outgrows that of eigenvalue 3. As the iterations increase, the lower component, with the larger dominant eigenvalue, draws all the weight of the top component with the smaller dominant eigenvalue. In the long term, the entries of the eigenvector of the weaker, donating component tend to zero while the stronger component settles at non-zero values. We therefore use Algorithm 2 to compute the rank vector of the lower component and set the one above to zero, which results in the overall rank vector of Fig. 24.4, that is, $x = (0, 0, 0, 1, 1, 1, 1)^T$.

Many other scenarios exist, with different structural arrangements of the components. However, the computations in Cases 1 and 2 above give the basis of the computation for all other structures. Table 24.1 gives an overview of how the rank vectors can be calculated for some of the structures noted in the network. In the table, the representation $x = [x_u^T \; x_l^T]^T$ means that $x$ is the rank vector of the whole network with some normalisation, $x_u$ is the rank vector of the component in the top position, and $x_l$ is the rank vector of the component in the bottom position.


Table 24.1 An overview of the steady state rank vectors for different structural arrangements of components (comp $c_i$ means component $c_i$; the graph-structure column of the original contains small diagrams of the linked components, of which only the component labels are recoverable here)

Components | $\lambda_{max}$ | Steady state vector | Description
$c_1, c_2$ | $\lambda_{max} = \lambda_1 > \lambda_2$ | $x = [x_1^T \; \tfrac{1}{\lambda_1}x_2^T]^T$ | $x_1, x_2$ – rank vectors of comp $c_1$ and $c_2$
$c_3, c_4$ | $\lambda_3 < \lambda_4 = \lambda_{max}$ | $x = [x_3^T \; \tfrac{1}{\lambda_4}x_4^T]^T$ | $x_3$ – zero vector for $c_3$, $x_4$ – rank vector of comp $c_4$
$c_5, c_6, c_7$ | $\lambda_{max} = \lambda_5 > \lambda_6 = \lambda_7$ | $x = [x_5^T \; \tfrac{1}{\lambda_5}x_6^T \; (\tfrac{1}{\lambda_5})^2 x_7^T]^T$ | $x_5, x_6, x_7$ – rank vectors of comp $c_5$, $c_6$ and $c_7$
$c_8, c_9, c_{10}$ | $\lambda_{max} = \lambda_8 > \lambda_{10} > \lambda_9$ | $x = [x_8^T \; \tfrac{1}{\lambda_8}x_9^T \; (\tfrac{1}{\lambda_8})^2 x_{10}^T]^T$ | $x_8, x_9, x_{10}$ – rank vectors of comp $c_8$, $c_9$ and $c_{10}$
$c_{11}, c_{12}, c_{13}$ | $\lambda_{max} = \lambda_{11} > \lambda_{12} > \lambda_{13}$ | $x = [x_{11}^T \; \tfrac{1}{\lambda_{11}}x_{12}^T \; (\tfrac{1}{\lambda_{11}})^2 x_{13}^T]^T$ | $x_{11}, x_{12}, x_{13}$ – rank vectors of comp $c_{11}$, $c_{12}$ and $c_{13}$
$c_{14}, c_{15}, c_{16}$ | $\lambda_{max} = \lambda_{16} > \lambda_{15} > \lambda_{14}$ | $x = [x_{14,15}^T \; x_{16}^T]^T$ | $x_{14,15}$ – zero vector, $x_{16}$ – rank vector of comp $c_{16}$
$c_{17}, c_{18}, c_{19}$ | $\lambda_{max} = \lambda_{19} > \lambda_{18} = \lambda_{17}$ | $x = [x_{17,18}^T \; x_{19}^T]^T$ | $x_{17,18}$ – zero vector, $x_{19}$ – rank vector of comp $c_{19}$
$c_{20}, c_{21}, c_{22}$ | $\lambda_{max} = \lambda_{22} > \lambda_{21} > \lambda_{20}$ | $x = [x_{20,21}^T \; x_{22}^T]^T$ | $x_{20,21}$ – zero vectors, $x_{22}$ – rank vector of comp $c_{22}$
$c_{23}, c_{24}, c_{25}$ | $\lambda_{max} = \lambda_{23} = \lambda_{24} > \lambda_{25}$ | $x = [x_{23,24}^T \; \tfrac{1}{\lambda_{23}}x_{25}^T]^T$ | $x_{23} = x_{24}$, $x_{25}$ – rank vectors of comp $c_{23}$, $c_{24}$ and $c_{25}$
$c_{26}, c_{27}, c_{28}$ | $\lambda_{max} = \lambda_{27} > \lambda_{26} > \lambda_{28}$ | $x = [x_{26,27}^T \; \tfrac{1}{\lambda_{27}}x_{28}^T]^T$ | $x_{26,27}$ – zero vectors, $x_{28}$ – rank vector of comp $c_{28}$


Fig. 24.5 A simple setup of three components

For scenarios such as in Fig. 24.5, in which at least two of the dominant eigenvalues of the components, say $c_1$, $c_2$ and $c_3$, are equal, we do not obtain satisfactory results with this approach. For now we choose to handle such components by combining them into one graph. We reserve a closer look at these components with this method for future work.

24.4 Discussion and Conclusion

The aim of this paper was to develop algorithms for recalculating alpha and eigenvector centralities using graph partitioning. We began with alpha centrality, showing how a large graph can be partitioned into smaller components and how these components are levelled. Using a concrete example, we have shown how alpha centrality is recalculated with updates of the ranks of the successive levels. With the top component $c_t$ having the maximum dominant eigenvalue in the set of components, the rank vector of $c_t$ is simply obtained by setting the norm of the rank vector equal to the number of vertices of the component, that is, $\|x_n\| = n$. When the order is reversed, such that the component $c_s$ with the smaller dominant eigenvalue is on top, the rank vector for $c_s$ runs to zero whereas that of $c_t$ is maintained at non-zero values, subject to some normalisation. In this case the component $c_t$ with the highest dominant eigenvalue $\lambda_{max}$, appearing at the base of a series of components, sets the rank vectors of all the components above it to zero. This is a consequence of the fact that convergence of rank vectors occurs at the dominant eigenvalue [16]: the component with $\lambda_{max}$ draws all the weight towards itself, leaving the top components with zero. The weight contribution of $c_t$ to the components below it, with lower dominant eigenvalues, sets the lower rank vectors to non-zero values, with some normalisation. The principle of the dominant eigenvalue prevailing applies even in situations of branching, as for the vertices from $c_{17}$ to $c_{28}$ in Table 24.1. The experimental and theoretical calculations we have carried out with small-size graphs reveal the same results, that is, the same centrality measures as for the whole graph. We note that the component-wise recalculation introduces very minimal complications in terms of storage space on the computer. It is thus easier and cheaper to work with smaller components of a very large graph and then combine them than to work with the whole large graph.


Acknowledgements This research was supported by the Swedish International Development Cooperation Agency (Sida), International Science Program (ISP) in Mathematical Sciences (IPMS), Sida Bilateral Research Program (Makerere University). We also highly recognise the input of the research environment Mathematics and Applied Mathematics (MAM), Division of Mathematics and Physics, Mälardalen University, and of the Department of Mathematics, Makerere University, for providing us with a conducive environment for research.

References

1. Bonacich, P., Lloyd, P.: Eigenvector-like measures of centrality for asymmetric relations. Soc. Netw. 23(3), 191–201 (2001)
2. Borgatti, S.P.: Centrality and network flow. Soc. Netw. 27(1), 55–71 (2005)
3. Buluç, A., Meyerhenke, H., Safro, I., Sanders, P., Schulz, C.: Recent advances in graph partitioning. In: Algorithm Engineering, Springer, pp. 117–158 (2016)
4. Engström, C.: PageRank as a Solution to a Linear System, PageRank in Changing Systems and Non-normalized Versions of PageRank. Lund University (2011)
5. Engström, C.: PageRank in evolving networks and applications of graphs in natural language processing and biology. Ph.D. thesis, Mälardalen University (2016)
6. Engström, C., Silvestrov, S.: Graph partitioning and a componentwise PageRank algorithm. arXiv:1609.09068 (2016)
7. Feinberg, S.E., Meyer, M.M., Wasserman, S.: Analyzing data from multivariate directed graphs: an application to social networks. Technical Report, University of Minnesota (1980)
8. Freeman, L.C.: Centrality in social networks conceptual clarification. Soc. Netw. 1(3), 215–239 (1978)
9. Gonzalez, T.F.: Approximation algorithms for multilevel graph partitioning. In: Handbook of Approximation Algorithms and Metaheuristics, Chapman and Hall/CRC, pp. 943–958 (2007)
10. Grando, F., Granville, L.Z., Lamb, L.C.: Machine learning in network centrality measures: tutorial and outlook. ACM Comput. Surv. (CSUR) 51(5), 102 (2018)
11. Holland, P.W., Laskey, K.B., Leinhardt, S.: Stochastic blockmodels: first steps. Soc. Netw. 5(2), 109–137 (1983)
12. Kim, J., Hwang, I., Kim, Y., Moon, B.: Genetic approaches for graph partitioning: a survey. In: Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, pp. 473–480 (2011)
13. Langville, A.N., Meyer, C.D.: A reordering for the PageRank problem. SIAM J. Sci. Comput. 27(6), 2112–2120 (2006)
14. Langville, A.N., Meyer, C.D.: Google's PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press (2011)
15. Lü, L., Chen, D., Ren, X., Zhang, Q., Zhang, Y., Zhou, T.: Vital nodes identification in complex networks. Phys. Rep. 650, 1–63 (2016)
16. Meyer, C.D.: Matrix Analysis and Applied Linear Algebra, vol. 71. SIAM (2000)
17. Nathan, E., Bader, D.A.: Approximating personalized Katz centrality in dynamic graphs. In: International Conference on Parallel Processing and Applied Mathematics, Springer, pp. 290–302 (2017)
18. Nieminen, J.: On the centrality in a graph. Scand. J. Psychol. 15(1), 332–336 (1974)
19. Pang, X., Zhou, Y., Wang, P., Lin, W., Chang, V.: An innovative neural network approach for stock market prediction. J. Supercomput. 76(3), 2098–2118 (2020)
20. Pellegrini, M., Haynor, D., Johnson, J.M.: Protein interaction networks. Expert Rev. Proteomics 1(2), 239–249 (2004)
21. Rohde, A.: Eigenvalues and eigenvectors of the Euler equations in general geometries. In: 15th AIAA Computational Fluid Dynamics Conference, p. 2609 (2001)


22. Saad, Y.: Numerical Methods for Large Eigenvalue Problems: Revised Edition, vol. 66. SIAM (2011)
23. Savić, M., Ivanović, M., Jain, L.C.: Co-authorship networks: an introduction. In: Complex Networks in Software, Knowledge, and Social Systems. Springer, pp. 179–192 (2019)
24. Schloegel, K., Karypis, G., Kumar, V.: Graph partitioning for high performance scientific simulations. Army High Performance Computing Research Center (2000)
25. Schulz, C., Strash, D.: Graph partitioning: formulations and applications to big data. In: Encyclopedia of Big Data Technologies. Springer, Cham, pp. 1–7 (2018)
26. Sharma, P., Bhattacharyya, D.K., Kalita, J.K.: Centrality analysis in PPI networks. In: 2016 International Conference on Accessibility to Digital World (ICADW), IEEE, pp. 135–140 (2016)
27. Shaw, M.E.: Group structure and the behavior of individuals in small groups. J. Psychol. 38(1), 139–149 (1954)
28. Tallberg, C.: Testing centralization in random graphs. Soc. Netw. 26(3), 205–219 (2004)
29. Tarjan, R.: Depth-first search and linear graph algorithms. SIAM J. Comput. 1(2), 146–160 (1972)
30. White, H.C., Boorman, S.A., Breiger, R.L.: Social structure from multiple networks. I. Blockmodels of roles and positions. Am. J. Sociol. 81(4), 730–780 (1976)

Chapter 25
On Statistical Properties of the Estimator of Impulse Response Function

Yuriy Kozachenko and Iryna Rozora

Abstract In this paper a time-invariant continuous linear system is considered with a real-valued impulse response function which is defined on a bounded domain. A sample input-output cross-correlogram is taken as an estimator of the response function. The input processes are supposed to be zero-mean stationary Gaussian processes that can be represented as truncated series of a Fourier expansion. A criterion on the shape of the impulse response function is given. For this purpose, a theory of square-Gaussian stochastic processes is used.

Keywords Square-Gaussian stochastic process · Gaussian process · Cross-correlogram

MSC 2020 60G15 · 62G05

25.1 Introduction

The problem of identification and estimation of a stochastic linear system has been a matter of active research in recent years. System identification means the building of mathematical models of dynamic systems from observed input-output data. This generates a great number of models that can be considered. The sphere of applications of these models is very extensive: signal processing, automatic control, financial markets, medicine, machine learning and so on. For more details, see [18, 32, 33, 35, 36] and [37]. The issue of the estimation of the impulse response function is similar to the inverse and deconvolution problems that are used, for example, for restoring signals or images and for signal detection [1, 4, 12–15, 19, 30, 31, 38, 39].

Y. Kozachenko (B) · I. Rozora
Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
e-mail: [email protected]
I. Rozora
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_25


We are interested in the estimation of the so-called impulse response function from observations of responses of a SISO (single-input single-output) system to certain input signals. To solve this problem, different statistical approaches have been used. Let us mention two monographs on this problem, by Bendat and Piersol [5] and Schetzen [35]. Akaike [2] studied a MISO (multiple-input single-output) linear system and obtained estimates of the Fourier transform of the response function in each component. He later considered a scenario involving non-Gaussian processes [3]. Some methods for the estimation of the unknown impulse response function of a linear system, and the study of the properties of the corresponding estimators, were considered in the works of Buldygin and his followers. These methods are based on constructing a sample cross-correlogram between the input stochastic process and the response of the system (see, e.g., [7, 9, 10]). An inequality for the distribution of the supremum of the estimation error in the space of continuous functions, in the case of an integral-type cross-correlogram estimator, was obtained by Kozachenko and Rozora in [24]. In the paper [25] a time-invariant continuous linear system with a real-valued impulse response function was considered; the input signal process was supposed to be a zero mean Gaussian stochastic process represented as a truncated sum with respect to an orthonormal basis in L2(R), and the case of Hermite polynomials as the orthonormal basis in L2(R) was studied. In [34], for the integral cross-correlogram estimator, an algorithm of statistical hypothesis testing for the response function was given, applying an upper estimate for the probability that a square-Gaussian random process overruns a level specified by a continuous function.

In this paper a time-invariant continuous linear system is considered with a real-valued impulse response function which is defined on a bounded domain. An input-output cross-correlogram based on one single observation is taken as an estimator of the response function. The input processes are supposed to be zero-mean stationary Gaussian processes that can be represented as truncated series of a Fourier decomposition. In such a model the built estimator of the impulse response function depends on the length T of the averaging interval for the cross-correlogram and on the cut-off level N of the input signal. An estimate of the large deviation probability for the errors in the space of continuous functions is found. This allows us to develop a criterion on the shape of the impulse response function. For this purpose, a theory of square-Gaussian stochastic processes is used.

The paper consists of 7 sections and its structure is as follows. Section 25.2 covers the main definitions and general properties of the estimator. The input signal process is supposed to be a zero mean Gaussian stochastic process which is represented as a truncated sum with respect to an orthonormal basis in L2([0, Λ]). In Sect. 25.3 we suppose that the input process of the system can be represented as a series with respect to the trigonometric basis on [0, Λ]. Estimates of the mathematical expectation, the variance and the variance of the increments for the estimator of the impulse response function are found, which provide asymptotic unbiasedness as N → ∞ and consistency as N, T → ∞. Section 25.4 deals with square-Gaussian random variables and processes; an inequality for the C(T) norm of a square-Gaussian stochastic process is shown. In Sect. 25.5 the convergence rate for the estimator of the unknown impulse response function in the space of continuous functions is obtained. In Sect. 25.6 a criterion is developed on the shape of the impulse response function. Section 25.7 is devoted to software simulation: in one particular case, the critical values of the length of the averaging interval T are found for different accuracy and reliability and N (the upper limit of the summation in the model), using the software environment for statistical computing and graphics R.

25.2 The Estimator of an Impulse Response Function and Its Properties

Consider a time-invariant continuous linear system with a real-valued square integrable impulse response function $H(\tau)$ which is defined on the finite domain $\tau \in [0, \Lambda]$. This means that the response of the system to an input signal $X(t)$, which is observed on $t \in \mathbb{R}$, has the following form:
$$Y(t) = \int_0^\Lambda H(\tau)X(t-\tau)\,d\tau, \quad t \in \mathbb{R}, \tag{25.1}$$
and $H \in L_2([0,\Lambda])$. One of the problems arising in the theory of linear systems is to estimate the function $H$ from observations of responses of the system to certain input signals.

Let the system of functions $\{\varphi_0(t), \varphi_k(t), \psi_k(t), k \geq 1\}$ be an orthonormal basis in $L_2[0,\Lambda]$. Assume that the function $\varphi_0(t)$ is a constant; this means that it should be equal to $\varphi_0(t) = \frac{1}{\sqrt{\Lambda}}$. Consider as input of the linear system a real-valued Gaussian zero mean stationary stochastic process $X = X_N = (X_N(u), u \in \mathbb{R})$ that can be represented as
$$X_N(u) = \sum_{k=0}^N \xi_k \varphi_k(u) + \sum_{k=1}^N \eta_k \psi_k(u), \tag{25.2}$$

where $N > 0$ is a fixed integer and the random variables $\xi_k, \eta_k$, $k \geq 0$, are independent with $\mathsf{E}\xi_k = \mathsf{E}\eta_k = 0$, $\mathsf{E}\xi_k^2 = \mathsf{E}\eta_k^2$.

Remark 1 There are many techniques that allow one to expand a stochastic process in a series, such as the Karhunen–Loève expansion [26, 27], Fourier series [23] and so on. In the manuscripts [22, 23], dealing with the modelling of stochastic processes with given accuracy and reliability, the main idea is to represent the model as a finite sum of a series if the process itself can be represented as a series with stochastic terms. The cutting-off moment of such a finite (truncated) series is based on conditions that depend on the given accuracy and reliability. That is why the stochastic process $X_N(u)$ from (25.2) can be considered as a model of the random process
$$X(u) = \sum_{k=0}^\infty \xi_k \varphi_k(u) + \sum_{k=1}^\infty \eta_k \psi_k(u).$$

Remark 2 Since the process $X(t)$ is defined on $\mathbb{R}$ and the functions $\varphi_k(t), \psi_k(t)$, $k \geq 1$, are an orthonormal basis in $L_2[0,\Lambda]$, we will further assume that these functions are periodic with period $\Lambda$. Hence, $\varphi_0(t) = \varphi_0(t + n\Lambda)$, $\varphi_k(t) = \varphi_k(t + n\Lambda)$, $\psi_k(t) = \psi_k(t + n\Lambda)$, $k \geq 1$, $t \in [0,\Lambda]$, $n \in \mathbb{Z}$.

If the system (25.1) is perturbed by the stochastic process $X_N$, then for the output process we obtain $Y_N(t) = \int_0^\Lambda H(\tau)X_N(t-\tau)\,d\tau$. It is easy to find the covariance function of the stationary stochastic process $X_N$. Really,
$$r_N(t-s) = \mathsf{E}X_N(t)X_N(s) = \sum_{k=1}^N \left(\varphi_k(t)\varphi_k(s) + \psi_k(t)\psi_k(s)\right). \tag{25.3}$$

By $a_0$ we define the output (response) of the system on the constant signal:
$$a_0 = \frac{1}{\sqrt{\Lambda}} \int_0^\Lambda H(t)\,dt. \tag{25.4}$$

Set $H^*(\tau) = H(\tau) - a_0$. As an estimator of $H^*(\tau)$, the difference of the impulse response function and $a_0$, we will consider an integral cross-correlogram
$$\hat{H}(\tau) = \hat{H}_{N,T,\Lambda}(\tau) = \frac{1}{T}\int_0^T Y_N(t) X_N(t-\tau)\,dt, \tag{25.5}$$
where $T > 0$ is a parameter for averaging.

Remark 3 The integral in (25.1) is considered as the mean-square Riemann integral. The integral in (25.1) exists if and only if there exists the Riemann integral (see [16])
$$\int_0^\Lambda \int_0^\Lambda H(\tau)\, r_N(s-\tau)\, H(s)\,ds\,d\tau. \tag{25.6}$$
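A discretised Monte Carlo sketch of the whole setup may help fix ideas: $X_N$ is simulated from the truncated expansion (25.2) with the trigonometric basis used later in Sect. 25.3, $Y_N$ by numerical convolution (25.1), and $\hat{H}$ by the time average (25.5). The concrete $H$, $\Lambda$, $N$, $T$ and step size are illustrative assumptions, not values from the text:

```python
import math
import random

random.seed(1)
Lam, N, T, dt = 1.0, 10, 50.0, 0.05

# One realisation of the Gaussian coefficients in (25.2)
xi = [random.gauss(0, 1) for _ in range(N + 1)]
eta = [random.gauss(0, 1) for _ in range(N + 1)]

def X(u):                     # truncated input process X_N(u), trigonometric basis
    s = xi[0] / math.sqrt(Lam)
    for k in range(1, N + 1):
        w = 2 * k * math.pi * u / Lam
        s += math.sqrt(2 / Lam) * (xi[k] * math.cos(w) + eta[k] * math.sin(w))
    return s

H = lambda tau: tau * (1 - tau)          # sample impulse response on [0, Lam]

def Y(t):                                 # Y_N(t) = int_0^Lam H(v) X_N(t - v) dv
    m = int(Lam / dt)
    return sum(H((j + 0.5) * dt) * X(t - (j + 0.5) * dt) for j in range(m)) * dt

def H_hat(tau):                           # (1/T) int_0^T Y_N(t) X_N(t - tau) dt, as in (25.5)
    m = int(T / dt)
    return sum(Y((i + 0.5) * dt) * X((i + 0.5) * dt - tau) for i in range(m)) * dt / T

est = [H_hat(tau) for tau in (0.25, 0.5)]
```

The realised values of `est` are random; the point of the sketch is only the mechanical structure of (25.1), (25.2) and (25.5).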


Since $r_N(s-\tau) = \sum_{k=1}^N (\varphi_k(s)\varphi_k(\tau) + \psi_k(s)\psi_k(\tau))$, the functions $\varphi_k$, $\psi_k$ are square integrable on $[0,\Lambda]$ and $H \in L_2([0,\Lambda])$, the integral in (25.6) exists.

Denote now
$$a_k = \int_0^\Lambda H(t)\varphi_k(t)\,dt, \qquad b_k = \int_0^\Lambda H(t)\psi_k(t)\,dt. \tag{25.7}$$

Suppose that $H \in C^1([0,\Lambda])$. Then the function $H$ can be expanded into a series with respect to the orthonormal basis $\{\varphi_k(t), \psi_k(t), k \geq 0\}$ on the domain $[0,\Lambda]$. We obtain
$$H(\tau) = \sum_{k=0}^\infty a_k \varphi_k(\tau) + \sum_{k=1}^\infty b_k \psi_k(\tau), \tag{25.8}$$

where $a_k$ and $b_k$ are from (25.7). Moreover, the series in (25.8) converges uniformly on $[0,\Lambda]$.

Lemma 1 The following relations hold true:
$$\mathsf{E}\hat{H}_{N,T,\Lambda}(\tau) = \int_0^\Lambda H(v)\, r_N(v-\tau)\,dv = \sum_{k=1}^N \left(\varphi_k(\tau)a_k + \psi_k(\tau)b_k\right), \quad \tau \in [0,\Lambda], \tag{25.9}$$
$$H^*(\tau) - \mathsf{E}\hat{H}_{N,T,\Lambda}(\tau) = \sum_{k=N+1}^\infty \left(\varphi_k(\tau)a_k + \psi_k(\tau)b_k\right), \quad \tau \in [0,\Lambda]. \tag{25.10}$$

Proof The joint covariance function of the processes $X_N$ and $Y_N$ equals
$$\mathsf{E}Y_N(t)X_N(t-\tau) = \mathsf{E}\int_0^\Lambda H(v)X_N(t-\tau)X_N(t-v)\,dv = \int_0^\Lambda H(v)\, r_N(v-\tau)\,dv. \tag{25.11}$$
From (25.11) it follows that the estimator $\hat{H}_{N,T,\Lambda}(\tau)$ is, in the general case, biased:
$$\mathsf{E}\hat{H}_{N,T,\Lambda}(\tau) = \frac{1}{T}\int_0^T \mathsf{E}Y_N(t)X_N(t-\tau)\,dt = \int_0^\Lambda H(v)\, r_N(v-\tau)\,dv. \tag{25.12}$$
Substituting in (25.12) the values from (25.8) and (25.3), we have


$$\mathsf{E}\hat{H}_{N,T,\Lambda}(\tau) = \int_0^\Lambda \left(\sum_{k=0}^\infty (a_k\varphi_k(v) + b_k\psi_k(v))\right)\left(\sum_{l=1}^N (\varphi_l(v)\varphi_l(\tau) + \psi_l(v)\psi_l(\tau))\right) dv$$
$$= \sum_{k=0}^\infty \sum_{l=1}^N \left( a_k\varphi_l(\tau)\int_0^\Lambda \varphi_k(v)\varphi_l(v)\,dv + a_k\psi_l(\tau)\int_0^\Lambda \varphi_k(v)\psi_l(v)\,dv \right.$$
$$\left. + b_k\varphi_l(\tau)\int_0^\Lambda \psi_k(v)\varphi_l(v)\,dv + b_k\psi_l(\tau)\int_0^\Lambda \psi_k(v)\psi_l(v)\,dv \right).$$
Since $\int_0^\Lambda \varphi_k(v)\varphi_l(v)\,dv = \int_0^\Lambda \psi_k(v)\psi_l(v)\,dv = \delta_{kl}$ and $\int_0^\Lambda \psi_k(v)\varphi_l(v)\,dv = 0$, where $\delta_{kl}$ is the Kronecker delta, the relationship above can be transformed as
$$\mathsf{E}\hat{H}_{N,T,\Lambda}(\tau) = \sum_{k=1}^N \left(\varphi_k(\tau)a_k + \psi_k(\tau)b_k\right).$$
So, the first statement (25.9) of the Lemma is proved. Equalities (25.8) and (25.9) imply
$$H^*(\tau) - \mathsf{E}\hat{H}_{N,T,\Lambda}(\tau) = \sum_{k=N+1}^\infty \left(\varphi_k(\tau)a_k + \psi_k(\tau)b_k\right), \quad \tau \in [0,\Lambda].$$
The Lemma is completely proved. □

In the following Lemma we calculate the joint moments for the estimator of the impulse response function.

Lemma 2 The joint moment of $\hat{H}_{N,T,\Lambda}$ is equal to
$$\mathsf{E}\hat{H}_{N,T,\Lambda}(\tau)\hat{H}_{N,T,\Lambda}(\theta) = \int_0^\Lambda H(u)\, r_N(\tau-u)\,du \times \int_0^\Lambda H(v)\, r_N(\theta-v)\,dv$$
$$+ \frac{1}{T^2}\int_0^T\!\!\int_0^T \left[ \int_0^\Lambda\!\!\int_0^\Lambda H(v)H(u)\, r_N(t-s+u-v)\,du\,dv \cdot r_N(t-s+\theta-\tau) \right.$$
$$\left. + \int_0^\Lambda H(v)\, r_N(t-s+\theta-v)\,dv \times \int_0^\Lambda H(u)\, r_N(s-t+\tau-u)\,du \right] dt\,ds, \tag{25.13}$$
where $r_N(t-s) = \mathsf{E}X_N(t)X_N(s)$ is the covariance of $X_N$ and the coefficients $a_k$ and $b_k$ are defined in (25.7).


Proof By the definition of $\hat{H}_{N,T,\Lambda}(\tau)$ we can write the joint moment
$$\mathsf{E}\hat{H}_{N,T,\Lambda}(\tau)\hat{H}_{N,T,\Lambda}(\theta) = \mathsf{E}\,\frac{1}{T}\int_0^T Y_N(t)X_N(t-\tau)\,dt \times \frac{1}{T}\int_0^T Y_N(s)X_N(s-\theta)\,ds$$
$$= \frac{1}{T^2}\int_0^T\!\!\int_0^T \mathsf{E}\,Y_N(t)X_N(t-\tau)Y_N(s)X_N(s-\theta)\,dt\,ds. \tag{25.14}$$
Consider the integrand in (25.14) separately. Since the processes involved are jointly Gaussian and zero mean, the fourth moment factorises into pairwise covariances:
$$\mathsf{E}Y_N(t)X_N(t-\tau)Y_N(s)X_N(s-\theta) = \mathsf{E}Y_N(t)X_N(t-\tau)\cdot\mathsf{E}Y_N(s)X_N(s-\theta)$$
$$+ \mathsf{E}Y_N(t)Y_N(s)\cdot\mathsf{E}X_N(t-\tau)X_N(s-\theta) + \mathsf{E}Y_N(t)X_N(s-\theta)\cdot\mathsf{E}X_N(t-\tau)Y_N(s). \tag{25.15}$$
The covariance function for the response will be
$$\mathsf{E}Y_N(t)Y_N(s) = \mathsf{E}\int_0^\Lambda\!\!\int_0^\Lambda H(\tau)H(v)X_N(t-\tau)X_N(s-v)\,d\tau\,dv = \int_0^\Lambda\!\!\int_0^\Lambda H(\tau)H(v)\, r_N(t-\tau-s+v)\,d\tau\,dv. \tag{25.16}$$
The joint moment of the input and output processes is equal to
$$\mathsf{E}Y_N(t)X_N(s-\theta) = \mathsf{E}\int_0^\Lambda H(v)X_N(t-v)X_N(s-\theta)\,dv = \int_0^\Lambda H(v)\, r_N(t-s+\theta-v)\,dv. \tag{25.17}$$
It follows from (25.16), (25.11) and (25.17) that the expression in (25.15) can be calculated as
$$\mathsf{E}Y_N(t)X_N(t-\tau)Y_N(s)X_N(s-\theta) = \int_0^\Lambda H(u)\, r_N(\tau-u)\,du \int_0^\Lambda H(v)\, r_N(\theta-v)\,dv$$
$$+ \int_0^\Lambda\!\!\int_0^\Lambda H(v)H(u)\, r_N(t-s+u-v)\,du\,dv \cdot r_N(t-s+\theta-\tau)$$
$$+ \int_0^\Lambda H(v)\, r_N(t-s+\theta-v)\,dv \cdot \int_0^\Lambda H(u)\, r_N(s-t+\tau-u)\,du. \tag{25.18}$$

Substituting (25.18) in (25.14) we obtain
$$\mathsf{E}\hat{H}_{N,T,\Lambda}(\tau)\hat{H}_{N,T,\Lambda}(\theta) = \int_0^\Lambda H(u)\, r_N(\tau-u)\,du \cdot \int_0^\Lambda H(v)\, r_N(\theta-v)\,dv$$
$$+ \frac{1}{T^2}\int_0^T\!\!\int_0^T \left[\int_0^\Lambda\!\!\int_0^\Lambda H(v)H(u)\, r_N(t-s+u-v)\,du\,dv \cdot r_N(t-s+\theta-\tau)\right.$$
$$\left. + \int_0^\Lambda H(v)\, r_N(t-s+\theta-v)\,dv \cdot \int_0^\Lambda H(u)\, r_N(s-t+\tau-u)\,du\right] dt\,ds. \qquad\square$$

Corollary 1 The variance of the estimator $\hat{H}_{N,T,\Lambda}$ is equal to
$$\mathrm{Var}\,\hat{H}_{N,T,\Lambda}(\tau) = \frac{1}{T^2}\int_0^T\!\!\int_0^T \left[\int_0^\Lambda\!\!\int_0^\Lambda H(v)H(u)\, r_N(t-s+u-v)\,du\,dv \cdot r_N(t-s)\right.$$
$$\left. + \int_0^\Lambda H(v)\, r_N(t-s+\tau-v)\,dv \cdot \int_0^\Lambda H(u)\, r_N(s-t+\tau-u)\,du\right] dt\,ds, \tag{25.19}$$
$$\mathrm{Var}(\hat{H}_{N,T,\Lambda}(\tau) - \hat{H}_{N,T,\Lambda}(\theta)) = \frac{1}{T^2}\int_0^T\!\!\int_0^T \left[\,2\int_0^\Lambda\!\!\int_0^\Lambda H(v)H(u)\, r_N(t-s+u-v)\,du\,dv\right.$$
$$\times \left(r_N(t-s) - r_N(t-s+\theta-\tau)\right)$$
$$+ \int_0^\Lambda H(v)\left(r_N(t-s+\tau-v) - r_N(t-s+\theta-v)\right)dv \times \int_0^\Lambda H(u)\, r_N(s-t+\tau-u)\,du$$
$$+ \int_0^\Lambda H(v)\, r_N(t-s+\theta-v)\,dv \times \left.\int_0^\Lambda H(u)\left(r_N(s-t+\theta-u) - r_N(s-t+\tau-u)\right)du\right] dt\,ds. \tag{25.20}$$

Proof From relation (25.13) it follows that
$$\mathsf{E}\hat{H}^2_{N,T,\Lambda}(\tau) = \left(\int_0^\Lambda H(u)\, r_N(\tau-u)\,du\right)^2 + \frac{1}{T^2}\int_0^T\!\!\int_0^T \left[\int_0^\Lambda\!\!\int_0^\Lambda H(v)H(u)\, r_N(t-s+u-v)\,du\,dv \cdot r_N(t-s)\right.$$
$$\left. + \int_0^\Lambda H(v)\, r_N(t-s+\tau-v)\,dv \cdot \int_0^\Lambda H(u)\, r_N(s-t+\tau-u)\,du\right] dt\,ds. \tag{25.21}$$
By (25.12) and (25.21) we have
$$\mathsf{E}(\hat{H}_{N,T,\Lambda}(\tau) - \mathsf{E}\hat{H}_{N,T,\Lambda}(\tau))^2 = \mathsf{E}\hat{H}^2_{N,T,\Lambda}(\tau) - (\mathsf{E}\hat{H}_{N,T,\Lambda}(\tau))^2$$
$$= \frac{1}{T^2}\int_0^T\!\!\int_0^T \left[\int_0^\Lambda\!\!\int_0^\Lambda H(v)H(u)\, r_N(t-s+u-v)\,du\,dv \cdot r_N(t-s)\right.$$
$$\left. + \int_0^\Lambda H(v)\, r_N(t-s+\tau-v)\,dv \cdot \int_0^\Lambda H(u)\, r_N(s-t+\tau-u)\,du\right] dt\,ds.$$
So, the first statement of the corollary is proved.

The second statement of this corollary follows from Lemma 2. Relations (25.9) and (25.13) imply
$$\mathrm{cov}(\hat{H}_{N,T,\Lambda}(\tau), \hat{H}_{N,T,\Lambda}(\theta)) = \mathsf{E}\hat{H}_{N,T,\Lambda}(\tau)\hat{H}_{N,T,\Lambda}(\theta) - \mathsf{E}\hat{H}_{N,T,\Lambda}(\tau)\,\mathsf{E}\hat{H}_{N,T,\Lambda}(\theta)$$
$$= \frac{1}{T^2}\int_0^T\!\!\int_0^T \left[\int_0^\Lambda\!\!\int_0^\Lambda H(v)H(u)\, r_N(t-s+u-v)\,du\,dv \cdot r_N(t-s+\theta-\tau)\right.$$
$$\left. + \int_0^\Lambda H(v)\, r_N(t-s+\theta-v)\,dv \cdot \int_0^\Lambda H(u)\, r_N(s-t+\tau-u)\,du\right] dt\,ds. \tag{25.22}$$
Therefore, by (25.19) and (25.22), making an elementary reduction we obtain
$$\mathrm{Var}(\hat{H}_{N,T,\Lambda}(\tau) - \hat{H}_{N,T,\Lambda}(\theta)) = \frac{1}{T^2}\int_0^T\!\!\int_0^T \left[\,2\int_0^\Lambda\!\!\int_0^\Lambda H(v)H(u)\, r_N(t-s+u-v)\,du\,dv\right.$$
$$\times \left(r_N(t-s) - r_N(t-s+\theta-\tau)\right)$$
$$+ \int_0^\Lambda H(v)\left(r_N(t-s+\tau-v) - r_N(t-s+\theta-v)\right)dv \times \int_0^\Lambda H(u)\, r_N(s-t+\tau-u)\,du$$
$$+ \int_0^\Lambda H(v)\, r_N(t-s+\theta-v)\,dv \times \left.\int_0^\Lambda H(u)\left(r_N(s-t+\theta-u) - r_N(s-t+\tau-u)\right)du\right] dt\,ds,$$
which completes the proof. □

25.3 Trigonometric Basis

Let us now consider the system of functions
$$\left\{\frac{1}{\sqrt{\Lambda}},\ \sqrt{\frac{2}{\Lambda}}\cos\left(\frac{2k\pi t}{\Lambda}\right),\ \sqrt{\frac{2}{\Lambda}}\sin\left(\frac{2k\pi t}{\Lambda}\right),\ k \geq 1\right\}, \tag{25.23}$$
which is an orthonormal basis in $L_2([0,\Lambda])$. Under the notation of the previous section,
$$\varphi_0(t) = \frac{1}{\sqrt{\Lambda}}, \quad \varphi_k(t) = \sqrt{\frac{2}{\Lambda}}\cos\left(\frac{2k\pi t}{\Lambda}\right), \quad \psi_k(t) = \sqrt{\frac{2}{\Lambda}}\sin\left(\frac{2k\pi t}{\Lambda}\right), \quad k \geq 1,$$
and the coefficients $a_k$, $b_k$ are equal to

$$a_k = \int_0^\Lambda H(\tau)\varphi_k(\tau)\,d\tau = \sqrt{\frac{2}{\Lambda}}\int_0^\Lambda H(\tau)\cos\left(\frac{2k\pi\tau}{\Lambda}\right)d\tau, \quad k \geq 1, \tag{25.24}$$
$$b_k = \int_0^\Lambda H(\tau)\psi_k(\tau)\,d\tau = \sqrt{\frac{2}{\Lambda}}\int_0^\Lambda H(\tau)\sin\left(\frac{2k\pi\tau}{\Lambda}\right)d\tau, \quad k \geq 1. \tag{25.25}$$



Suppose now that the input signal processes of the system (25.1) are zero mean stationary Gaussian stochastic processes formed by (25.23). This means that the process $X_N(u)$ is given by
$$X_N(u) = \sqrt{\frac{2}{\Lambda}}\sum_{k=1}^N \left(\xi_k\cos\left(\frac{2k\pi u}{\Lambda}\right) + \eta_k\sin\left(\frac{2k\pi u}{\Lambda}\right)\right), \quad u \in \mathbb{R}. \tag{25.26}$$

It follows from (25.3) that the covariance function of the stationary Gaussian process $X_N$ can be written as
$$r_N(t-s) = \sum_{k=1}^N \left(\varphi_k(t)\varphi_k(s) + \psi_k(t)\psi_k(s)\right) \tag{25.27}$$
$$= \frac{2}{\Lambda}\sum_{k=1}^N \left(\cos\left(\frac{2k\pi t}{\Lambda}\right)\cos\left(\frac{2k\pi s}{\Lambda}\right) + \sin\left(\frac{2k\pi t}{\Lambda}\right)\sin\left(\frac{2k\pi s}{\Lambda}\right)\right) = \frac{2}{\Lambda}\sum_{k=1}^N \cos\left(\frac{2k\pi(t-s)}{\Lambda}\right).$$
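The equality of the two forms of $r_N$ in (25.27) is just the cosine difference identity; it can be confirmed numerically for sample values (the values of $\Lambda$, $N$, $t$, $s$ below are arbitrary test choices, not from the text):

```python
import math

# Check (25.27): sum of basis products equals (2/Lam) * sum of cos(2 k pi (t-s)/Lam).
Lam, N = 2.0, 5
t, s = 0.37, 1.12

def phi(k, u): return math.sqrt(2 / Lam) * math.cos(2 * k * math.pi * u / Lam)
def psi(k, u): return math.sqrt(2 / Lam) * math.sin(2 * k * math.pi * u / Lam)

basis_form = sum(phi(k, t) * phi(k, s) + psi(k, t) * psi(k, s) for k in range(1, N + 1))
cosine_form = (2 / Lam) * sum(math.cos(2 * k * math.pi * (t - s) / Lam) for k in range(1, N + 1))
# The two forms agree up to floating-point error.
```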

Consider the following conditions.

Condition A. The function $H(\tau)$ is two times differentiable on $[0,\Lambda]$, the functions $H'(\tau)$ and $H''(\tau)$ are continuous on $[0,\Lambda]$ and
$$I_0 = I_0(\Lambda) = \int_0^\Lambda |H(\tau)|\,d\tau < \infty, \quad 0 < I_1 = \left(\int_0^\Lambda |H'(\tau)|^2\,d\tau\right)^{1/2} < \infty, \quad I_2 = I_2(\Lambda) = \int_0^\Lambda |H''(\tau)|\,d\tau < \infty.$$

Condition B. The following relation holds true: $H(0) = H(\Lambda)$.


Remark 4 Condition B means that the impulse action is completely finished on the interval $[0,\Lambda]$. Therefore, the value of the function at the beginning, $H(0)$, should be equal to the value at the end, $H(\Lambda)$, while $H(\tau)$ cannot be a constant.

Let us denote
$$d := |H'(0)| + |H'(\Lambda)|. \tag{25.28}$$

Lemma 3 Assume that conditions A and B are satisfied. Then
$$|H^*(\tau) - \mathsf{E}\hat{H}_{N,T,\Lambda}(\tau)| \leq \frac{\Lambda(d + I_2(\Lambda))}{2\pi^2 N}, \tag{25.29}$$
$$\mathrm{Var}\,\hat{H}_{N,T,\Lambda}(\tau) \leq \frac{\Lambda^3(\Lambda+2)I_1^2}{\pi^4 T^2}\left(2 - \frac{1}{N}\right)^2, \tag{25.30}$$

where I1 (Λ) and I2 (Λ) are defined in condition A. Proof From (25.10), (25.24) and (25.25) it follows that H ∗ (τ ) − EHˆ N ,T ,Λ (τ ) = =

2 Λ

=

2 Λ

∞ Λ  k=N +1 0 ∞  

k=N +1

(ϕk (τ )ak + ψk (τ )bk

         cos 2kπτ + sin 2kπu sin 2kπτ du H (u) cos 2kπu Λ Λ Λ Λ

Λ 0

k=N +1

∞ 

H (u) cos

 2kπ Λ

(25.31)

 (u − τ ) du.

Estimate now the integral in equality above using first partial integration and condition B: Λ 0

H (u) cos  =



2kπ(u−τ ) Λ

Λ H (u) 2kπ

Λ = − 2kπ



sin

du

2kπ(u−τ ) Λ

H  (u) sin



 Λ − 0

2kπ(u−τ ) Λ

Λ 2kπ

Λ 0

du 0

 ) Λ Λ Λ = − 2kπ −H  (u) 2kπ cos 2kπ(u−τ 0 Λ 

Λ  ) Λ du + 2kπ H  (u) cos 2kπ(u−τ Λ 0 (H  (0) cos(2kπτ /Λ)−cos(2kπτ /Λ)H  (Λ))Λ2 (2kπ)2

Λ  2kπ(u−τ ) Λ2  du. − (2kπ) H (u) cos 2 Λ 0

=



H (u) sin



2kπ(u−τ ) Λ



du

(25.32)

25 On Statistical Properties of the Estimator of Impulse …

575

Then by conditions A, B, notation (25.28) and relationship (25.32) we obtain |ϕk (τ )ak + ψk (τ )bk | ≤

2Λ(d + I2 ) (2kπ)2

(25.33)

Relations (25.31) and (25.33) imply that ∞ 

|H (τ ) − EHˆ N ,T ,Λ (τ )| ≤ ∗

k=N +1

Since

∞ Λ(d + I2 )  1 |ϕk (τ )ak + ψk (τ )bk | ≤ . 2π 2 k2 k=N +1

k ∞ ∞   1 1 1 < dx = , k2 x2 N

k=N +1

(25.34)

k=N +1k−1

the biasedness of Hˆ N ,T ,Λ (τ ) to the parameter H ∗ (τ ) can be evaluated as follows, +I2 ) , and inequality (25.29) is completely proved. To |H ∗ (τ ) − EHˆ N ,T ,Λ (τ )| ≤ Λ(d 2π 2 N estimate the variance we use (25.19). Denoting RH (t, s)

=

R1 (t, s, τ ) = R2 (t, s, τ ) =

 Λ Λ 0

Λ 0 Λ

H (v)H (u)rN (t − s + u − v)dudv,

0

H (v)rN (t − s + τ − v)dv, H (u)rN (s − t + τ − u)du,

0

Var Hˆ N ,T ,Λ (τ )  T T Λ Λ 1 = T2 H (v)H (u)rN (t − s + u − v)dudv · rN (t − s) 0 0 0 0  Λ Λ + H (v)rN (t − s + τ − v)dv · H (u)rN (s − t + τ − u)du dtds 0  0 T T 1 = T2 RH (t, s)rN (t − s)dtds 0 0  T T + R1 (t, s, τ ) · R2 (t, s, τ )dtds . 0 0

By (25.27) we have

(25.35)

576

Y. Kozachenko and I. Rozora

Λ Λ RH (t, s) = H (v)H (u)rN (t − s + u − v)dudv 0 0



N Λ  2kπ(s+v) du dv = Λ2 H (u) cos 2kπ(t+u) H (v) cos Λ Λ k=1 0 0 



Λ 2kπ(s+v) + H (u) sin 2kπ(t+u) du dv H (v) sin Λ Λ 0

=

2 Λ

N 

(25.36)

0

(cH (t)cH (s) + sH (t)sH (s)) ,

k=1

Λ Λ 2kπ(t+u) du, s du. (t) = H (u) sin where cH (t) = 0 H (u) cos 2kπ(t+u) H 0 Λ Λ Similarly to (25.32) we can estimate the functions cH (t) and sH (t). Indeed, it follows from conditions A, B

$$
\begin{aligned}
|c_H(t)|&=\Bigl|\int_0^{\Lambda}H(u)\cos\frac{2k\pi(t+u)}{\Lambda}\,du\Bigr|
=\Bigl|\frac{\Lambda}{2k\pi}\,H(u)\sin\frac{2k\pi(u+t)}{\Lambda}\Big|_0^{\Lambda}-\frac{\Lambda}{2k\pi}\int_0^{\Lambda}H'(u)\sin\frac{2k\pi(u+t)}{\Lambda}\,du\Bigr|\\
&=\frac{\Lambda}{2k\pi}\Bigl|\int_0^{\Lambda}H'(u)\sin\frac{2k\pi(u+t)}{\Lambda}\,du\Bigr|
\le\frac{\Lambda}{2k\pi}\Bigl(\int_0^{\Lambda}(H'(u))^2\,du\Bigr)^{\frac12}\Bigl(\int_0^{\Lambda}\sin^2\frac{2k\pi(u+t)}{\Lambda}\,du\Bigr)^{\frac12}
=\frac{\Lambda I_1(\Lambda)}{2k\pi}\sqrt{\frac{\Lambda}{2}}.
\end{aligned}\tag{25.37}
$$
The same evaluation can be made for $s_H(t)$:
$$|s_H(t)|\le\frac{\Lambda I_1(\Lambda)}{2k\pi}\sqrt{\frac{\Lambda}{2}}.\tag{25.38}$$

Using the same method of estimation as in (25.34), it is easy to prove that
$$\sum_{k=1}^{N}\frac{1}{k^2}\le 2-\frac{1}{N}.\tag{25.39}$$
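Both integral-comparison bounds, the tail estimate in (25.34) and the partial-sum estimate in (25.39), are easy to check numerically; the following sketch (an illustrative check, not part of the original argument) verifies them for several values of N.

```python
# Numerical check of the integral-comparison bounds
#   sum_{k=N+1}^infty 1/k^2 < 1/N      (used in (25.34))
#   sum_{k=1}^N 1/k^2 <= 2 - 1/N       (used in (25.39)).
# The infinite tail is evaluated as pi^2/6 minus the partial sum.
import math

for N in (1, 2, 5, 10, 100):
    partial = sum(1.0 / k**2 for k in range(1, N + 1))
    tail = math.pi**2 / 6 - partial      # exact value of the tail series
    assert tail < 1.0 / N                # bound from (25.34)
    assert partial <= 2.0 - 1.0 / N      # bound from (25.39)
print("both bounds hold for all tested N")
```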

Then substituting (25.37) and (25.38) in (25.36) and using (25.39) we obtain
$$R_H(t,s)\le\frac{4}{\Lambda}\sum_{k=1}^{N}\Bigl(\frac{\Lambda I_1(\Lambda)}{2k\pi}\sqrt{\frac{\Lambda}{2}}\Bigr)^2=\frac{\Lambda^2I_1^2}{2\pi^2}\sum_{k=1}^{N}\frac{1}{k^2}\le\frac{\Lambda^2I_1^2}{2\pi^2}\Bigl(2-\frac{1}{N}\Bigr),\tag{25.40}$$
which does not depend on $t$ and $s$. To evaluate the first summand in $\operatorname{Var}\hat H_{N,T,\Lambda}(\tau)$, we compute


$$
\begin{aligned}
\frac{1}{T^2}\int_0^{T}\!\!\int_0^{T}r_N(t-s)\,dt\,ds
&=\frac{2}{T^2\Lambda}\sum_{k=1}^{N}\left[\Bigl(\int_0^{T}\cos\frac{2k\pi t}{\Lambda}\,dt\Bigr)^2+\Bigl(\int_0^{T}\sin\frac{2k\pi t}{\Lambda}\,dt\Bigr)^2\right]\\
&=\frac{2}{T^2\Lambda}\sum_{k=1}^{N}\left[\Bigl(\frac{\Lambda}{2k\pi}\sin\frac{2k\pi T}{\Lambda}\Bigr)^2+\Bigl(\frac{\Lambda}{2k\pi}\Bigr)^2\Bigl(1-\cos\frac{2k\pi T}{\Lambda}\Bigr)^2\right]\\
&=\frac{\Lambda^2}{2\pi^2T^2}\sum_{k=1}^{N}\frac{2-2\cos(2k\pi T/\Lambda)}{k^2}
\le\frac{2\Lambda^2}{\pi^2T^2}\Bigl(2-\frac{1}{N}\Bigr).
\end{aligned}\tag{25.41}
$$



$$
\begin{aligned}
R_1(t,s,\tau)&=\int_0^{\Lambda}H(v)\,r_N(t-s+\tau-v)\,dv\\
&=\frac{2}{\Lambda}\sum_{k=1}^{N}\left[\cos\frac{2k\pi(t+\tau)}{\Lambda}\int_0^{\Lambda}H(v)\cos\frac{2k\pi(s+v)}{\Lambda}\,dv+\sin\frac{2k\pi(t+\tau)}{\Lambda}\int_0^{\Lambda}H(v)\sin\frac{2k\pi(s+v)}{\Lambda}\,dv\right]\\
&=\frac{2}{\Lambda}\sum_{k=1}^{N}\Bigl(\cos\frac{2k\pi(t+\tau)}{\Lambda}\,c_H(s)+\sin\frac{2k\pi(t+\tau)}{\Lambda}\,s_H(s)\Bigr)
\le\frac{I_1}{\pi}\sqrt{\frac{\Lambda}{2}}\,\sum_{k=1}^{N}\frac{\cos\frac{2k\pi(t+\tau)}{\Lambda}+\sin\frac{2k\pi(t+\tau)}{\Lambda}}{k}.
\end{aligned}\tag{25.42}
$$
Similarly,
$$R_2(t,s,\tau)\le\frac{I_1}{\pi}\sqrt{\frac{\Lambda}{2}}\,\sum_{k=1}^{N}\frac{\cos\frac{2k\pi(s+\tau)}{\Lambda}+\sin\frac{2k\pi(s+\tau)}{\Lambda}}{k}.\tag{25.43}$$

By (25.35), (25.40), (25.41), (25.42) and (25.43) we have
$$
\begin{aligned}
\operatorname{Var}\hat H_{N,T,\Lambda}(\tau)
&\le\frac{\Lambda^2I_1^2}{2\pi^2}\Bigl(2-\frac1N\Bigr)\cdot\frac{2\Lambda^2}{\pi^2T^2}\Bigl(2-\frac1N\Bigr)
+\frac{1}{T^2}\left(\frac{I_1}{\pi}\sqrt{\frac{\Lambda}{2}}\sum_{k=1}^{N}\frac1k\int_0^{T}\Bigl(\cos\frac{2k\pi(t+\tau)}{\Lambda}+\sin\frac{2k\pi(t+\tau)}{\Lambda}\Bigr)dt\right)^2\\
&\le\frac{\Lambda^4I_1^2}{\pi^4T^2}\Bigl(2-\frac1N\Bigr)^2+\frac{1}{T^2}\Bigl(\frac{2I_1\Lambda}{\pi^2}\Bigr)^2\frac{\Lambda}{2}\Bigl(2-\frac1N\Bigr)^2
\le\frac{\Lambda^3(\Lambda+2)I_1^2}{\pi^4T^2}\Bigl(2-\frac1N\Bigr)^2,
\end{aligned}
$$
which completes the proof. □

The following lemma gives an estimate for the variance of the difference $\hat H_{N,T,\Lambda}(\tau)-\hat H_{N,T,\Lambda}(\theta)$.

Lemma 4 Suppose that the conditions of Lemma 3 are fulfilled. Then
$$\operatorname{Var}\bigl(\hat H_{N,T,\Lambda}(\tau)-\hat H_{N,T,\Lambda}(\theta)\bigr)\le\tilde C(N,T,\Lambda)\,|\tau-\theta|^{\alpha},\quad\alpha\in(0,1],\ \tau,\theta\in[0,\Lambda],$$


where
$$\tilde C(N,T,\Lambda)=\begin{cases}\dfrac{(4+\sqrt2)\Lambda^{3-\alpha}I_1^2(2N-1)\bigl((2-\alpha)N^{1-\alpha}+1\bigr)}{(1-\alpha)\pi^{4-\alpha}T^2N^{2-\alpha}},&\alpha\in(0,1),\\[3mm]\dfrac{(4+\sqrt2)\Lambda^2I_1^2(2N-1)(1+\ln N)}{\pi^3T^2N},&\alpha=1.\end{cases}\tag{25.44}$$

Proof Use relationship (25.20) from Corollary 1. Then the variance of the difference can be written as
$$\operatorname{Var}\bigl(\hat H_{N,T,\Lambda}(\tau)-\hat H_{N,T,\Lambda}(\theta)\bigr)=2S_1+S_2+S_3,\tag{25.45}$$
where
$$
\begin{aligned}
S_1&=\frac{1}{T^2}\int_0^{T}\!\!\int_0^{T}R_H(t,s)\bigl(r_N(t-s)-r_N(t-s+\theta-\tau)\bigr)\,dt\,ds,\\
S_2&=\frac{1}{T^2}\int_0^{T}\!\!\int_0^{T}\Bigl[\int_0^{\Lambda}H(v)\bigl(r_N(t-s+\tau-v)-r_N(t-s+\theta-v)\bigr)\,dv\Bigr]R_2(t,s,\tau)\,dt\,ds,\\
S_3&=\frac{1}{T^2}\int_0^{T}\!\!\int_0^{T}R_1(t,s,\theta)\Bigl[\int_0^{\Lambda}H(u)\bigl(r_N(s-t+\theta-u)-r_N(s-t+\tau-u)\bigr)\,du\Bigr]\,dt\,ds,
\end{aligned}
$$

and the functions $R_H(t,s)$, $R_i(t,s,\tau)$ are those from (25.35). We now estimate each summand separately. By (25.40) we have $|R_H(t,s)|\le\frac{\Lambda^2I_1^2}{2\pi^2}\bigl(2-\frac1N\bigr)$. By (25.27),
$$r_N(t-s)-r_N(t-s+\theta-\tau)=\frac{2}{\Lambda}\sum_{k=1}^{N}\Bigl(\cos\frac{2\pi k(t-s)}{\Lambda}-\cos\frac{2\pi k(t-s+\theta-\tau)}{\Lambda}\Bigr)=\frac{4}{\Lambda}\sum_{k=1}^{N}\sin\frac{\pi k(2t-2s+\theta-\tau)}{\Lambda}\cdot\sin\frac{\pi k(\theta-\tau)}{\Lambda}.$$

For $u\ge0$ and $v>0$ the inequality $\bigl|\sin\frac{u}{v}\bigr|\le\frac{u^{\alpha}}{v^{\alpha}}$, $\alpha\in(0,1]$, holds (see, for example, [22]). Then
$$\bigl|r_N(t-s)-r_N(t-s+\theta-\tau)\bigr|\le\frac{4}{\Lambda}\sum_{k=1}^{N}\Bigl|\sin\frac{\pi k(2t-2s+\theta-\tau)}{\Lambda}\Bigr|\cdot\Bigl(\frac{\pi k|\theta-\tau|}{\Lambda}\Bigr)^{\alpha}.$$
By (25.40) and the inequality above we have
$$
\begin{aligned}
|S_1|&\le\frac{\Lambda^2I_1^2}{2\pi^2}\Bigl(2-\frac1N\Bigr)\frac{4\pi^{\alpha}|\theta-\tau|^{\alpha}}{T^2\Lambda^{1+\alpha}}\sum_{k=1}^{N}k^{\alpha}\Bigl|\int_0^{T}\!\!\int_0^{T}\sin\frac{\pi k(2t-2s+\theta-\tau)}{\Lambda}\,dt\,ds\Bigr|\\
&\le\frac{2\Lambda^{1-\alpha}I_1^2|\theta-\tau|^{\alpha}}{T^2\pi^{2-\alpha}}\Bigl(2-\frac1N\Bigr)\sum_{k=1}^{N}k^{\alpha}\Bigl(\frac{\Lambda}{k\pi}\Bigr)^2
\le\frac{2\Lambda^{3-\alpha}I_1^2}{T^2\pi^{4-\alpha}}\Bigl(2-\frac1N\Bigr)f_N\,|\theta-\tau|^{\alpha},
\end{aligned}\tag{25.46}
$$

where the sum $\sum_{k=1}^{N}\frac{1}{k^{2-\alpha}}$ is bounded by $f_N$ using an approach similar to the estimation in (25.34), and the function $f_N$ equals $f_N=1+\frac{1}{1-\alpha}\bigl(1-\frac{1}{N^{1-\alpha}}\bigr)$ for $\alpha\in(0,1)$ and $f_N=1+\ln N$ for $\alpha=1$. In the same way as in (25.46), utilizing the estimates (25.42) and (25.43), we obtain
$$|S_i|\le\frac{\Lambda^{3-\alpha}I_1^2}{\sqrt2\,T^2\pi^{4-\alpha}}\Bigl(2-\frac1N\Bigr)f_N\,|\theta-\tau|^{\alpha},\quad i=2,3.\tag{25.47}$$


Substituting in (25.45) the values from (25.46) and (25.47), we have
$$\operatorname{Var}\bigl(\hat H_{N,T,\Lambda}(\tau)-\hat H_{N,T,\Lambda}(\theta)\bigr)\le\tilde C(N,T,\Lambda)\,|\tau-\theta|^{\alpha},\quad\alpha\in(0,1],$$
where $\tilde C(N,T,\Lambda)$ is given in (25.44). □

25.4 Square Gaussian Random Variables and Processes

In this section the definition and some properties of square Gaussian random variables and processes are presented. Let $(\Omega,\mathcal{L},\mathbf{P})$ be a probability space and let $(\mathbb{T},\rho)$ be a compact metric space with metric $\rho$.

Definition 1 [8] Let $\Xi=\{\xi_t,\ t\in\mathbb{T}\}$ be a family of jointly Gaussian random variables with $\mathbf{E}\xi_t=0$ (e.g., $\xi_t$, $t\in\mathbb{T}$, is a Gaussian stochastic process). The space $SG_{\Xi}(\Omega)$ is the space of square Gaussian random variables if any element $\eta\in SG_{\Xi}(\Omega)$ can be presented as
$$\eta=\bar\xi^{\top}A\bar\xi-\mathbf{E}\bar\xi^{\top}A\bar\xi,\tag{25.48}$$
where $\bar\xi=(\xi_1,\xi_2,\ldots,\xi_n)$, $\xi_k\in\Xi$, $k=1,\ldots,n$, and $A$ is a real-valued matrix, or the element $\eta\in SG_{\Xi}(\Omega)$ is the mean-square limit of a sequence of the form (25.48),
$$\eta=\lim_{n\to\infty}\bigl(\bar\xi_n^{\top}A\bar\xi_n-\mathbf{E}\bar\xi_n^{\top}A\bar\xi_n\bigr).$$

Definition 2 [8] A stochastic process $\xi(t)=\{\xi(t),\ t\in\mathbb{T}\}$ is square Gaussian if for any $t\in\mathbb{T}$ the random variable $\xi(t)$ belongs to the space $SG_{\Xi}(\Omega)$.

The properties of square Gaussian random processes can be found, for example, in [8, 20–23, 28]. Denote by $N(u)$ the metric massiveness, that is, the least number of closed balls of radius $u$ covering the set $\mathbb{T}$ with respect to the metric $\rho$. Let $\xi(t)=\{\xi(t),\ t\in\mathbb{T}\}$ be a square Gaussian stochastic process. Assume that there exists a monotonically increasing continuous function $\sigma(h)$, $h>0$, such that $\sigma(h)\to0$ as $h\to0$, and the inequality $\sup_{\rho(t,s)\le h}\bigl(\operatorname{Var}(\xi(t)-\xi(s))\bigr)^{1/2}\le\sigma(h)$ holds true. Define now the following values: $\varepsilon_0=\inf_{t\in\mathbb{T}}\sup_{s\in\mathbb{T}}\rho(t,s)$, $t_0=\sigma(\varepsilon_0)$, $\gamma_0=\sup_{t\in\mathbb{T}}(\operatorname{Var}\xi(t))^{1/2}$, and let $C$ be the maximum of $t_0$ and $\gamma_0$, $C=\max\{t_0,\gamma_0\}$.

The next theorem gives an estimate for the large deviation probability of a square Gaussian process in the norm of continuous functions. The proof of the theorem can be found in the article [24] or in the monograph [23].

Theorem 1 Let $\xi(t)=\{\xi(t),\ t\in\mathbb{T}\}$ be a separable square Gaussian stochastic process. Suppose that there exists an increasing function $r(u)\ge0$, $u\ge1$, with the properties: $r(u)\to\infty$ as $u\to\infty$, and let the function $r(\exp\{t\})$ be convex. Assume that the integral $\int_0^{t_0}r\bigl(\sigma^{(-1)}(u)\bigr)\,du$ is convergent. Then for all $x>0$


$$\mathbf{P}\Bigl\{\sup_{t\in\mathbb{T}}|\xi(t)|>x\Bigr\}\le 2\inf_{p}\,r^{(-1)}\Bigl(\frac{1}{t_0p}\int_0^{t_0p}r\bigl(\sigma^{(-1)}(v)\bigr)\,dv\Bigr)\cdots$$

$\ldots>0$ ($X^{\top}X$ is positive definite); thus $\hat\beta$ is a minimum. Recalling that $\varepsilon\sim N(0,\sigma^2)$, we have $\mathbf{E}[\varepsilon\varepsilon^{\top}]=\sigma^2I_N$. Then we can derive the variance-covariance matrix of the OLS estimates as follows:
$$\hat\beta=(X^{\top}X)^{-1}X^{\top}Y=(X^{\top}X)^{-1}X^{\top}(X\beta+\varepsilon)=(X^{\top}X)^{-1}(X^{\top}X)\beta+(X^{\top}X)^{-1}X^{\top}\varepsilon=\beta+(X^{\top}X)^{-1}X^{\top}\varepsilon.$$


A. K. Muhumuza et al.

It follows that
$$
\begin{aligned}
\mathbf{E}[(\hat\beta-\beta)(\hat\beta-\beta)^{\top}]
&=\mathbf{E}\bigl[(X^{\top}X)^{-1}X^{\top}\varepsilon\varepsilon^{\top}X(X^{\top}X)^{-1}\bigr]
=(X^{\top}X)^{-1}X^{\top}\,\mathbf{E}[\varepsilon\varepsilon^{\top}]\,X(X^{\top}X)^{-1}\\
&=\sigma^2(X^{\top}X)^{-1}X^{\top}X(X^{\top}X)^{-1}=\sigma^2(X^{\top}X)^{-1}.
\end{aligned}\tag{26.25}
$$
We estimate $\sigma^2$ with $\hat\sigma^2=\varepsilon^{\top}\varepsilon/(n-k)$. The structure of the covariance matrix is of the form
$$\mathbf{E}[(\hat\beta-\beta)(\hat\beta-\beta)^{\top}]=\begin{pmatrix}\operatorname{var}(\hat\beta_1)&\operatorname{cov}(\hat\beta_1,\hat\beta_2)&\cdots&\operatorname{cov}(\hat\beta_1,\hat\beta_k)\\\operatorname{cov}(\hat\beta_2,\hat\beta_1)&\operatorname{var}(\hat\beta_2)&\cdots&\operatorname{cov}(\hat\beta_2,\hat\beta_k)\\\vdots&\vdots&\ddots&\vdots\\\operatorname{cov}(\hat\beta_k,\hat\beta_1)&\operatorname{cov}(\hat\beta_k,\hat\beta_2)&\cdots&\operatorname{var}(\hat\beta_k)\end{pmatrix}.\tag{26.26}$$
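The decomposition $\hat\beta=\beta+(X^{\top}X)^{-1}X^{\top}\varepsilon$ underlying (26.25) can be verified numerically; the sketch below uses arbitrary illustrative data (the dimensions and coefficients are assumptions, not taken from the chapter) and confirms that the OLS estimate computed from the normal equations differs from the true $\beta$ by exactly $(X^{\top}X)^{-1}X^{\top}\varepsilon$.

```python
# Verify beta_hat = beta + (X'X)^{-1} X' eps for a single noise draw.
import numpy as np

rng = np.random.default_rng(0)
N, p = 50, 3                        # observations and parameters (illustrative)
X = rng.normal(size=(N, p))
beta = np.array([1.0, -2.0, 0.5])   # illustrative "true" coefficients
eps = rng.normal(scale=0.3, size=N)
Y = X @ beta + eps

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ Y        # OLS via the normal equations
# beta_hat - beta equals (X'X)^{-1} X' eps up to rounding error
assert np.allclose(beta_hat - beta, XtX_inv @ X.T @ eps)
print(beta_hat)
```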

26.7.1 Optimum Value of Generalized Variance $V[\hat\beta]$ with Extreme Points of Vandermonde Determinant

By the singular value decomposition, let $X$ be an $N\times p$ matrix with $p\le N$; then $X=U\Lambda V^{\top}$, where $U$ is an $N\times p$ matrix whose columns have length 1 and are mutually orthogonal (in brief, they are the principal components of $U$); $V$ is a $p\times p$ matrix whose columns have length 1 and are mutually orthogonal (in brief, $V$ is a rotation of $\mathbb{R}^p$); and $\Lambda$ is a diagonal $p\times p$ matrix whose diagonal elements $\lambda_{11},\lambda_{22},\ldots,\lambda_{pp}$ are non-negative, that is, the singular values of $X$, and may be ordered from largest to smallest. In brief, these conditions can be summarized as $U^{\top}U=I_p$, $V^{\top}V=I_p$. The structure of the singular value decomposition (SVD) can be expressed as follows.

Definition 4 Let $X\in\mathbb{R}^{N\times N}$. Then the (full rank) singular value decomposition of $X$ is given by

26 Connections Between the Extreme Points for Vandermonde Determinants …



$$X=U\Lambda V^{\top}=\begin{pmatrix}U_1&U_2&\cdots&U_N\end{pmatrix}\begin{pmatrix}\lambda_{11}&0&\cdots&0\\0&\lambda_{22}&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&\lambda_{NN}\\0&0&\cdots&0\\\vdots&&&\vdots\\0&0&\cdots&0\end{pmatrix}\begin{pmatrix}(V_1)^{\top}\\(V_2)^{\top}\\\vdots\\(V_p)^{\top}\end{pmatrix},\tag{26.27}$$

where $U$, $V$ are orthogonal matrices and $\Lambda$ is diagonal. The $\lambda_{ii}$ are the singular values of $X$ and by convention are arranged in nonincreasing order $\lambda_{11}\ge\lambda_{22}\ge\ldots\ge\lambda_{NN}\ge0$. The columns of $U$ are termed left-singular vectors of $X$ and the columns of $V$ are termed right-singular vectors of $X$. This scheme helps to express $X^{\top}X$ in a decomposed way. The least squares solution of the regression model (26.21) via the normal equations is $\hat\beta=(X^{\top}X)^{-1}X^{\top}Y$. Using the SVD $X=U\Lambda V^{\top}$, it follows that
$$(X^{\top}X)^{-1}X^{\top}=\bigl(V\Lambda U^{\top}U\Lambda V^{\top}\bigr)^{-1}V\Lambda U^{\top}=V\Lambda^{-2}V^{\top}V\Lambda U^{\top}=V\Lambda^{-2}\Lambda U^{\top}=V\Lambda^{-1}U^{\top},\tag{26.28}$$
where $U^{\top}U=I_N$ and $V^{\top}V=I_N$. The only difference between the expressions $(X^{\top}X)^{-1}X^{\top}=V\Lambda^{-1}U^{\top}$ and $X^{\top}=V\Lambda U^{\top}$ is that the reciprocals of the elements of $\Lambda$ are used. From the covariance of the estimates,
$$\hat\beta=(X^{\top}X)^{-1}X^{\top}Y,\qquad\operatorname{cov}(\hat\beta)=\sigma^2(X^{\top}X)^{-1}.$$
Using the SVD as in (26.28), we obtain
$$\sigma^2(X^{\top}X)^{-1}=\sigma^2\bigl(V\Lambda U^{\top}U\Lambda V^{\top}\bigr)^{-1}=\sigma^2V\Lambda^{-2}V^{\top}.\tag{26.29}$$
In other words, the covariance acts like that of $p$ orthogonal variables, each with variance $\sigma^2/\lambda_{ii}^2$, that have been rotated in $\mathbb{R}^p$. The same SVD criterion can also be applied to other special matrices, for instance the hat matrix:
$$H=X(X^{\top}X)^{-1}X^{\top}=U\Lambda V^{\top}V\Lambda^{-1}U^{\top}=UU^{\top}=I_N.$$
In terms of eigenvalue analysis,
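The identity $(X^{\top}X)^{-1}X^{\top}=V\Lambda^{-1}U^{\top}$ in (26.28), and the reduction of the hat matrix to $UU^{\top}=I_N$ for a square full-rank $X$, can be checked with a standard SVD routine; note that `numpy.linalg.svd` returns $X=U\,\mathrm{diag}(s)\,V^{\top}$, so the $V$ of the text is the transpose of the returned third factor (illustrative sketch with a random matrix).

```python
# Check (X'X)^{-1} X' = V Lambda^{-1} U' for a square full-rank matrix.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 4))            # square, full rank with probability 1
U, s, Vt = np.linalg.svd(X)            # X = U @ diag(s) @ Vt
lhs = np.linalg.inv(X.T @ X) @ X.T
rhs = Vt.T @ np.diag(1.0 / s) @ U.T    # V Lambda^{-1} U'
assert np.allclose(lhs, rhs)
# The hat matrix X (X'X)^{-1} X' then reduces to U U' = I_N:
assert np.allclose(X @ lhs, np.eye(4))
print("SVD identities verified")
```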


$$X^{\top}X=V\Lambda U^{\top}U\Lambda V^{\top}=V\Lambda^2V^{\top},\qquad XX^{\top}=U\Lambda V^{\top}V\Lambda U^{\top}=U\Lambda^2U^{\top}.\tag{26.30}$$

It follows immediately that the eigenvalues of $X^{\top}X$ and $XX^{\top}$ are the squares of the singular values. The columns of $V$ are the eigenvectors of $X^{\top}X$ and the columns of $U$ are the eigenvectors of $XX^{\top}$. We summarize the above relationships, (26.17) through (26.30), with the result stated in Lemma 1, which captures the general relationship between the generalized variance and the Vandermonde determinant, whereby the points that maximize the determinant minimize the variance and hence the risk.

Lemma 1 The determinant of the variance-covariance matrix of the estimates $\hat\beta$, also referred to as the generalized variance $V[\hat\beta]$, can be expressed in terms of $\sigma^2$ and the determinant of $\Lambda^2$, where $\det(\Lambda)=\det(X)$ as given in (26.23).

Proof Applying determinant techniques to (26.29), it follows that
$$\bigl|\sigma^2(X^{\top}X)^{-1}\bigr|=\sigma^2\bigl|(V\Lambda U^{\top}U\Lambda V^{\top})^{-1}\bigr|=\sigma^2|V||\Lambda|^{-2}|V^{\top}|=\sigma^2|\Lambda|^{-2}|VV^{\top}|=\sigma^2|\Lambda|^{-2}|I|=\sigma^2|\Lambda|^{-2}=\frac{\sigma^2}{|\Lambda|^2},\tag{26.31}$$
since $VV^{\top}=I$ and $|\cdot|$ denotes the determinant of the given matrix. Thus, combining the expressions in (26.25), (26.26) and (26.31), it follows immediately that the generalized variance $V[\hat\beta]$ is given by
$$V[\hat\beta]\equiv\bigl|\mathbf{E}[(\hat\beta-\beta)(\hat\beta-\beta)^{\top}]\bigr|=\frac{\sigma^2}{|\Lambda|^2},\tag{26.32}$$
where $\det(X)=|X|=|\Lambda|=\prod_{1\le i<j\le N}(x_i-x_j)$. □
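The relation $\det(X^{\top}X)=\det(X)^2=\bigl(\prod_{i<j}(x_j-x_i)\bigr)^2$ used in Lemma 1 can be confirmed directly for a small Vandermonde matrix (the nodes below are illustrative):

```python
# det(X'X) equals the squared Vandermonde determinant prod_{i<j}(x_j - x_i)^2.
import numpy as np
from itertools import combinations

x = np.array([0.5, 1.0, 1.5, 2.0])              # illustrative nodes
X = np.vander(x, increasing=True)               # rows (1, x_i, x_i^2, x_i^3)
vand = np.prod([x[j] - x[i] for i, j in combinations(range(len(x)), 2)])
assert np.isclose(np.linalg.det(X), vand)
assert np.isclose(np.linalg.det(X.T @ X), vand**2)
print(vand**2)
```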

Since the variance-covariance matrix is the risk measure in asset pricing theory [2, 43], this leads to an important result in pricing with extreme points of the Vandermonde determinant. In Theorem 2, we state and prove our first general result, which relates the generalized variance and the square of the Vandermonde determinant, and shows how the extreme points that maximize the determinant minimize the variance and hence the risk, in relation to a surface defined as the efficient frontier depending on the returns $x_i$ or the rates $r_i$ of a given asset $i$.

Theorem 2 If $\mathbf{E}[(\hat\beta-\beta)(\hat\beta-\beta)^{\top}]$ is the variance-covariance matrix of the ordinary least squares estimates $\hat\beta$ of the expected returns on $N$ risky assets, then the risk involved in investing in such assets can be minimized by maximizing the square of the Vandermonde determinant; that is, the determinant of the variance-covariance matrix, also called the generalized variance, is inversely proportional to the square of the Vandermonde determinant.


Proof Using (26.27), the properties of the determinant and the fact that $X$ is a Vandermonde matrix as defined in the system (26.21), we have
$$\det(X)=|X|\equiv|\Lambda|=\prod_{1\le i<j\le N}(x_i-x_j).$$
Substituting this directly into (26.32) we obtain
$$V[\hat\beta]=\bigl|\mathbf{E}[(\hat\beta-\beta)(\hat\beta-\beta)^{\top}]\bigr|=\frac{\sigma^2}{|\Lambda|^2}=\frac{\sigma^2}{\prod_{1\le i<j\le N}|x_i-x_j|^2}.\tag{26.33}$$

Since $x_i=(1+r_i)^{-1}$, the term in the denominator can be expressed as
$$\prod_{1\le i<j\le N}|x_i-x_j|^2=\prod_{1\le i<j\le N}\Bigl(\frac{1}{1+r_i}-\frac{1}{1+r_j}\Bigr)^2=\prod_{1\le i<j\le N}\Bigl(\frac{(1+r_j)-(1+r_i)}{(1+r_i)(1+r_j)}\Bigr)^2=\prod_{1\le i<j\le N}\Bigl(\frac{r_j-r_i}{(1+r_i)(1+r_j)}\Bigr)^2.$$
Thus, substituting this into (26.33) and simplifying gives
$$V[\hat\beta]=\sigma^2\cdot\frac{\prod_{1\le i<j\le N}\bigl[(1+r_i)(1+r_j)\bigr]^2}{\prod_{1\le i<j\le N}|r_i-r_j|^2}.\tag{26.34}$$

In continuous-time discounting $x_i=e^{-r_i}$, and thus
$$V[\hat\beta]=\sigma^2\cdot\frac{\prod_{1\le i<j\le N}\bigl(e^{r_i}e^{r_j}\bigr)^2}{\prod_{1\le i<j\le N}|e^{r_i}-e^{r_j}|^2}=\sigma^2\cdot\frac{\prod_{1\le i<j\le N}e^{2(r_i+r_j)}}{\prod_{1\le i<j\le N}|e^{r_i}-e^{r_j}|^2}=\sigma^2\cdot\exp\Bigl(2\sum_{1\le i<j\le N}(r_i+r_j)\Bigr)\Bigm/\prod_{1\le i<j\le N}|e^{r_i}-e^{r_j}|^2.$$
Therefore, to minimize $V[\hat\beta]$, and thereby sharpen the estimate of the expected returns $\mathbf{E}[\hat\beta]=\beta$, one aims at maximizing the denominator, which is the square of the Vandermonde determinant. The points that maximize the Vandermonde determinant are called Fekete points, as explained in [50]. □
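The algebraic step from (26.33) to (26.34), rewriting $\prod(x_i-x_j)^2$ with $x_i=(1+r_i)^{-1}$, can be sanity-checked numerically (the rates below are illustrative):

```python
# Check prod (x_i - x_j)^2 == prod (r_j - r_i)^2 / prod ((1+r_i)(1+r_j))^2
# for the discrete discounting x_i = 1/(1 + r_i).
import math
from itertools import combinations

r = [0.05, 0.10, 0.20, 0.35]                   # illustrative interest rates
x = [1.0 / (1.0 + ri) for ri in r]
lhs = 1.0
rhs = 1.0
for i, j in combinations(range(len(r)), 2):
    lhs *= (x[i] - x[j]) ** 2
    rhs *= (r[j] - r[i]) ** 2 / ((1 + r[i]) * (1 + r[j])) ** 2
assert math.isclose(lhs, rhs, rel_tol=1e-9)
print(lhs)
```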


We notice that the matrix $W=X^{\top}X$ is called a Wishart matrix, which is a random matrix of moments of $x_i$ [52]. The joint eigenvalue probability distribution of Wishart-type matrices in random matrix theory is as expressed in Theorem 3, which is also stated and proved in [17, 44, 45].

Theorem 3 If the matrix $W$ has eigenvalues $\lambda=(\lambda_1,\lambda_2,\ldots,\lambda_N)$, then $W$ has a joint eigenvalue probability density function given by
$$P_W(\lambda)=\frac{1}{Z_{N,\beta}}\prod_{1\le i<j\le N}|\lambda_i-\lambda_j|^{\beta}\cdot\exp\Bigl(-\frac{\beta}{4}\sum_{k=1}^{N}P(\lambda_k)\Bigr),\tag{26.35}$$
where $Z_{N,\beta}$ is a normalizing constant, $P(\lambda_k)=\lambda_k^{p}$, $p\ge2$, and $\beta=1,2,4$.

The relation (26.35) gives the connection between the extreme points of the Vandermonde determinant and the eigenvalues of the Wishart-type matrix $W$ as well as the returns on the assets. Lemma 2, also stated and proved in [52], gives the basis of the efficient frontier characterization based on the extreme points of the Vandermonde determinant.

Lemma 2 The extreme points of the square of the Vandermonde determinant of $X$ are also the extreme points of the determinant of the matrix $W=X^{\top}X$, and these points are given by the eigenvalues of $W$, which lie on the surface defined by the $p$-norm
$$S_N^p(\lambda)=\sum_{i=1}^{N}\lambda_i^p=\operatorname{Tr}(P(W)),\tag{26.36}$$
where $S_N^p(\lambda)=\bigl\{\lambda\in\mathbb{R}^N:\sum_{i=1}^{N}\lambda_i^p=r^p\bigr\}$, $p\ge2$, and $r\ge1$ is the radius of the $p$-sphere.

Proof A detailed statement of the proof can be found in [52]. □

Thus, based on the results stated in Lemma 1, Theorems 2 and 3, and Lemma 2, models such as those described in (26.15) and (26.16) can be modified for nonlinear constraints. In Theorem 4, we state and prove the key result demonstrating that the extreme points of the Vandermonde determinant, when optimized over the unit sphere, are the same points lying on an efficient frontier represented by a smooth surface such as a sphere. These same points minimize the risk defined by the determinant of the variance-covariance matrix.

Theorem 4 If the risk measure is defined by the generalized variance $V[\hat\beta]$, which is equal to the determinant of the variance-covariance matrix of the estimates $\hat\beta$, then one can construct an efficient frontier based on the optimization model


$$\begin{cases}\text{minimize}&V[\hat\beta]=\bigl|\mathbf{E}[(\hat\beta-\beta)(\hat\beta-\beta)^{\top}]\bigr|\\[1mm]\text{subject to}&S_N^2(\lambda)=\sum_{i=1}^{N}\lambda_i^2=1,\end{cases}\tag{26.37}$$
or equivalently,
$$\begin{cases}\text{maximize}&v_N(\lambda)=\prod_{1\le i<j\le N}|\lambda_i-\lambda_j|^2\\[1mm]\text{subject to}&S_N^2(\lambda)=\sum_{i=1}^{N}\lambda_i^2=1,\end{cases}\tag{26.38}$$
whereby, based on (26.32), (26.35) and (26.36), the solution points $\lambda_i$ to (26.37) and (26.38) are the eigenvalues of the variance-covariance matrix of $\hat\beta$ and lie on the surface defined by the sphere $S_N^2(\lambda)$. The solution points $\lambda_i$ are also given by the zeros of the Hermite orthogonal polynomial $H_N(x)$, where $x$ represents a general $\lambda_i$.

Proof Use the Lagrange multiplier criterion to find the extreme points of the Vandermonde determinant on a given surface: for the function $f(\lambda)=v_N(\lambda)$ and the constraint set $\{\lambda\in\mathbb{R}^n:g(\lambda)=S_N^2(\lambda)-1=0\}$, any $\lambda$ that satisfies
$$\frac{\partial f}{\partial\lambda_k}=\rho_0\,\frac{\partial g}{\partial\lambda_k},\qquad k=1,\ldots,n,$$
is a stationary point, where $\rho_0$ is the Lagrange multiplier. Since the derivative of the Vandermonde polynomial $v_N(\lambda)$ exists, these conditions are well defined, and the stationary points of $v_N(\lambda)$ must lie in the intersection of the surface $g(\lambda)=S_N^2(\lambda)-1=0$ with the set of points satisfying
$$\frac{\partial f}{\partial\lambda_k}-\rho_0\frac{\partial g}{\partial\lambda_k}=0,\qquad k=1,\ldots,n.$$

Based on (26.38), the derivative of the Vandermonde determinant $v_N(\lambda)$ is given by
$$\frac{\partial v_N(\lambda)}{\partial\lambda_k}=\sum_{\substack{i=1\\ i\ne k}}^{n}\frac{v_N(\lambda)}{\lambda_k-\lambda_i},\tag{26.39}$$
and that of the sphere $S_N^2(\lambda)$ by
$$\frac{\partial S_N^2(\lambda)}{\partial\lambda_k}=2\lambda_k.\tag{26.40}$$


Denoting the maximum value of the Vandermonde determinant $v_N(\lambda)$ by $v_{\max}$, it follows from (26.39) and (26.40), by the method of Lagrange multipliers, that
$$\sum_{\substack{i=1\\ i\ne k}}^{N}\frac{v_{\max}}{\lambda_k-\lambda_i}-2\rho_0\lambda_k=0\quad\text{for all }1\le k\le N,\tag{26.41}$$
where $\rho_0$ is the Lagrange multiplier. Equation (26.41) can also be rewritten as
$$\sum_{\substack{i=1\\ i\ne k}}^{N}\frac{1}{\lambda_k-\lambda_i}=\frac{2\rho_0}{v_{\max}}\lambda_k=\frac{\rho}{N}\lambda_k\quad\text{for all }1\le k\le N,\tag{26.42}$$
where $\rho=2\rho_0N/v_{\max}$ is a constant. From the general monic polynomial $P_N(x)=(x-\lambda_1)(x-\lambda_2)\cdots(x-\lambda_N)$ it can be shown that
$$\sum_{\substack{i=1\\ i\ne k}}^{N}\frac{1}{\lambda_k-\lambda_i}=\frac{1}{2}\,\frac{P_N''(\lambda_k)}{P_N'(\lambda_k)}.\tag{26.43}$$

It follows immediately from (26.42) and (26.43) that
$$\frac{1}{2}\,\frac{P_N''(\lambda_k)}{P_N'(\lambda_k)}=\frac{\rho}{N}\lambda_k\ \Longleftrightarrow\ P_N''(\lambda_k)-\frac{2\rho}{N}\lambda_kP_N'(\lambda_k)=0\quad\text{for all }1\le k\le N.$$
Since $P_N$ has degree $N$ and the left-hand side of this relation is a polynomial of the same degree vanishing at each $\lambda_k$, it must be a scalar multiple of $P_N$ itself. Exchanging $\lambda_k$ for a general $x$ and comparing terms gives the second-order ordinary differential equation
$$P_N''(x)-\frac{2\rho}{N}xP_N'(x)+2\rho P_N(x)=0.\tag{26.44}$$
According to [1, 67], the solutions of the o.d.e. (26.44) are of the form
$$P_N(x)=cH_N\Bigl(\sqrt{\frac{N(N-1)}{2}}\,x\Bigr),\qquad c\in\mathbb{C},$$
where $H_N(x)=N!\sum_{k=0}^{\lfloor N/2\rfloor}\frac{(-1)^k(2x)^{N-2k}}{k!\,(N-2k)!}$ is the Hermite polynomial whose zeros are the extreme points of the Vandermonde determinant on the surface defined by the sphere. □
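The conclusion of the proof can be checked with `numpy.polynomial.hermite`: the sum of squares of the zeros of the physicists' polynomial $H_N$ equals $N(N-1)/2$, so after the rescaling $x=y/\sqrt{N(N-1)/2}$ the extreme points indeed lie on the unit sphere (illustrative sketch):

```python
# Zeros of H_N(sqrt(N(N-1)/2) x) lie on the unit sphere sum x_i^2 = 1.
import numpy as np
from numpy.polynomial import hermite

for N in (3, 4, 6):
    coeffs = [0] * N + [1]                 # H_N in the Hermite series basis
    y = hermite.hermroots(coeffs)          # zeros of the physicists' H_N
    x = y / np.sqrt(N * (N - 1) / 2)       # rescaling from Theorem 4
    assert np.isclose(np.sum(x**2), 1.0)   # the points satisfy S_N^2(x) = 1
print("Hermite extreme points lie on the unit sphere")
```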


These same extreme points that maximize the square of the Vandermonde determinant, as expressed in (26.38), and lie on the sphere as an efficient frontier are the same points that minimize the generalized variance, as expressed in (26.37). The results of Theorem 4 are illustrated in Fig. 26.9.

We notice that if, for instance, the interest rates $r_i$ are defined in the interval $[0,\frac12]\subset[0,1]$, then the discount factor $x_i=(1+r_i)^{-1}$ is defined in the interval $[\frac23,1]\subset[0,1]$. That is, from $0\le r_i\le\frac12$ and $x_i=(1+r_i)^{-1}$, so that $r_i=\frac{1}{x_i}-1$, we get $0\le\frac{1}{x_i}-1\le\frac12$, or $\frac23\le x_i\le1$. This is illustrated in the first subfigure of Fig. 26.7 and leads to the case of optimization of the Vandermonde determinant over $[\frac23,1]^3$. Applying the principles of optimization of the Vandermonde determinant as stated in [41, 42, 49–52, 67], we can generalize as follows.

Lemma 3 If $x=(x_1,x_2,\ldots,x_n)$ is a critical point of the Vandermonde determinant on a surface $S\subset\mathbb{R}^n$, then $(cx_1+a,cx_2+a,\ldots,cx_n+a)$, where $a,c\in\mathbb{R}$ and $c\ne0$, is a critical point of the Vandermonde determinant on the surface $\{cx+a\mathbf{1}\in\mathbb{R}^n\mid x\in S\}$.

$

$

  c · x j + a − c · xi − a

1≤i< j≤n

c · (x j − xi ) = c

n(n−1) 2

vn (x1 , . . . , xn ).

1≤i< j≤n

the proof follows immediately.
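The scaling relation in Lemma 3's proof, $v_n(c\cdot x+a\mathbf{1})=c^{n(n-1)/2}v_n(x)$, is easy to verify numerically (the values of $c$ and $a$ below are illustrative):

```python
# Affine maps scale the Vandermonde determinant by c^{n(n-1)/2}.
import math
from itertools import combinations

def vand(xs):
    """Vandermonde determinant prod_{i<j} (x_j - x_i)."""
    return math.prod(xs[j] - xs[i] for i, j in combinations(range(len(xs)), 2))

x = [0.1, 0.4, 0.7, 0.9]
c, a = 6.0, -2.5                          # arbitrary scale and shift
n = len(x)
lhs = vand([c * xi + a for xi in x])
rhs = c ** (n * (n - 1) // 2) * vand(x)
assert math.isclose(lhs, rhs)
print(lhs)
```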



Lemma 3 expresses the property that the Vandermonde determinant (or polynomial) is homogeneous, which is also utilized in its optimization on various surfaces.

Lemma 4 If the interest rates $r_i\in[0,1]$ for $N$ different assets $i=1,2,\ldots,N$, then their corresponding discount factors satisfy $x_i\in[0,1]$. Thus the total risk of the investment in the $N$ assets can be minimized by maximizing the Vandermonde determinant over the cube with bounds $[0,1]^N$, as shown in Fig. 26.7.

The proof of Lemma 4 is direct and can be generalized using the geometric illustration in Fig. 26.7. In Lemma 4 we illustrate the extreme points of the Vandermonde determinant lying on an efficient frontier defined by a unit cube with boundary points given by the infinity norm.

Lemma 5 If $r_i\in[0,\frac12]$ are interest rates and $x_i\in[\frac23,1]$, $x_i=(1+r_i)^{-1}$, are their corresponding discount factors for $N$ different assets $i=1,2,\ldots,N$, then the total risk of the investment in the $N$ assets can be minimized by maximizing the Vandermonde determinant over the cube bounded by $[\frac23,1]^N$.


Fig. 26.7 Illustration of the unit cube $[-1,1]^3$, its subsets $[0,\frac12]^3$ and $[\frac23,1]^3$, and $S_3^2=x_1^2+x_2^2+x_3^2=1$

The proof of Lemma 5 is also direct and can be generalized using the geometric illustration in Fig. 26.7, for the case where a unit sphere is fitted inside a cube $[0,2]^3$. In Lemma 5 we give a result to illustrate the extreme points of the Vandermonde determinant lying on an efficient frontier defined by a cube enclosed within a unit cube bounded by $[0,1]^N$. The extreme points of the Vandermonde determinant on the cube have been known since the early 20th century [64] and can be described by the roots of a Legendre polynomial. The form of the solution is given in Theorem 5 below.

Theorem 5 [26] The points $x_i$ on the unit cube $[-1,1]^N$ that maximize or minimize the Vandermonde determinant have coordinates that can be written as a permutation of the case where $x_1<\ldots<x_N$ are given by the roots of
$$F_N(x)=(x^2-1)P_{N-1}'(x),\tag{26.45}$$
where $P_m'(x)$ is the derivative of the $m$:th Legendre polynomial
$$P_m(x)=\sum_{k=0}^{m}\binom{m}{k}\binom{m+k}{k}\Bigl(\frac{x-1}{2}\Bigr)^k.$$
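The points described in Theorem 5 can be computed with `numpy.polynomial.legendre`: the roots of $(x^2-1)P_{N-1}'(x)$ are $\pm1$ together with the zeros of $P_{N-1}'$ (a sketch under this classical form of the solution; for $N=3$ the points are $-1,0,1$):

```python
# Extreme points on [-1,1]^N: the N roots of (x^2 - 1) P'_{N-1}(x).
import numpy as np
from numpy.polynomial import legendre

def cube_extreme_points(N):
    # P_{N-1} in the Legendre basis, then the roots of its derivative,
    # together with the endpoint roots +-1 of the factor (x^2 - 1).
    dcoeffs = legendre.legder([0] * (N - 1) + [1])
    inner = legendre.legroots(dcoeffs)        # zeros of P'_{N-1}
    return np.sort(np.concatenate(([-1.0], inner, [1.0])))

assert np.allclose(cube_extreme_points(3), [-1.0, 0.0, 1.0])
print(cube_extreme_points(4))
```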

Now, combining the results of Lemmas 3, 4 and 5 and Theorem 5, we obtain a general result for the extreme points of the Vandermonde determinant defined on an efficient frontier represented by the cube bounded by $[\frac23,1]^3$, as stated in Corollary 1 below and illustrated in Fig. 26.8.

Corollary 1 If $r_i\in[0,\frac12]$ are interest rates for $N$ different assets $i=1,2,\ldots,N$ and $x_i\in[\frac23,1]$, $x_i=(1+r_i)^{-1}$, are their corresponding discount factors, then the total risk of the investment in the $N$ assets can be minimized by choosing interest


Fig. 26.8 Illustration of the cube $[\frac23,1]^3$ coloured according to the value of the Vandermonde determinant on its surface, with the extreme points, given by permutations of $(\frac23,\frac56,1)$, marked in blue

rates such that $r_i=\frac{1-x_i}{5+x_i}$, $i=1,\ldots,N-1$, and $r_N=0$, where $x_i$ are the roots of the polynomial given by (26.45), or some permutation thereof.

Proof The cube $[\frac23,1]^N$ can be created by scaling the cube $[-1,1]^N$ by $\frac16$ and translating it by $(\frac56,\ldots,\frac56)$, so the corollary follows from applying Lemma 3, Lemma 5 and the relation $x_i=(1+r_i)^{-1}$. □
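Corollary 1's rate map can be checked for $N=3$: the cube points $-1,0,1$ scale to the discount factors $\frac23,\frac56,1$, and $r=(1-x)/(5+x)$ recovers rates consistent with $x=(1+r)^{-1}$ (illustrative sketch):

```python
# Check the rate map r = (1 - x)/(5 + x) of Corollary 1 for N = 3.
cube_points = [-1.0, 0.0, 1.0]               # roots of (x^2 - 1) P'_2(x)
for x in cube_points:
    d = (x + 5.0) / 6.0                      # point scaled into [2/3, 1]
    r = (1.0 - x) / (5.0 + x)                # interest rate from Corollary 1
    assert abs(1.0 / (1.0 + r) - d) < 1e-12  # discount factor matches
    assert 2.0 / 3.0 <= d <= 1.0
print([((x + 5) / 6, (1 - x) / (5 + x)) for x in cube_points])
```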

Related to the cube as an efficient frontier, we can also generalize to the sphere as an efficient frontier. Based on Theorem 4 and Lemma 4, we generalize the results for a sphere as an efficient frontier as stated in Theorem 6 below.

Theorem 6 The points $x_i$ on the unit sphere enclosed within the unit cube with bounds $[-1,1]^N$ that maximize or minimize the Vandermonde determinant have coordinates that can be written as a permutation of the case where $x_1<\ldots<x_N$ are given by the roots of
$$P_N(x)=H_N\Bigl(\sqrt{\frac{N(N-1)}{2}}\,x\Bigr),\tag{26.46}$$
where $H_N(x)$ is the $N$:th (physicists') Hermite polynomial [67],
$$H_N(x)=N!\sum_{k=0}^{\lfloor N/2\rfloor}\frac{(-1)^k(2x)^{N-2k}}{k!\,(N-2k)!}.$$


Fig. 26.9 The visualization of the maximum points of the square of the Vandermonde determinant on a unit sphere in 4 dimensions, mapped to a unit sphere in 3 dimensions using the technique described in [42], coloured according to the value of the square of the Vandermonde determinant. The coordinates of the extreme points are given by permutations of
$$\Bigl(-\tfrac12\sqrt{1+\sqrt{\tfrac23}},\ -\tfrac12\sqrt{1-\sqrt{\tfrac23}},\ \tfrac12\sqrt{1-\sqrt{\tfrac23}},\ \tfrac12\sqrt{1+\sqrt{\tfrac23}}\Bigr)$$

Also, combining the results of Theorem 4, Lemma 3 and Theorem 6, we obtain a general result for the extreme points of the Vandermonde determinant defined on an efficient frontier represented by a unit sphere enclosed within the cube bounded by $[-1,1]^N$, as stated in Corollary 2 below and illustrated in Fig. 26.9.

Corollary 2 If the interest rates are chosen such that $r_i\in[0,\frac12]$ for $N$ different assets $i=1,2,\ldots,N$ and $x_i\in[\frac23,1]$, $x_i=(1+r_i)^{-1}$, are their corresponding discount factors, then the total risk of the investment in the $N$ assets can be minimized by choosing interest rates such that $r_i=\frac{1-x_i}{5+x_i}$, $i=1,\ldots,N-1$, and $r_N=0$, where $x_i$ are the roots of the polynomial given by (26.46), or some permutation thereof.

Proof The cube $[\frac23,1]^N$ can be created by scaling the cube $[-1,1]^N$ by $\frac16$ and translating it by $(\frac56,\ldots,\frac56)$, so the corollary follows from applying Lemma 3 and the relation $x_i=(1+r_i)^{-1}$. □

Constructing a surface that describes the efficient frontier might not be easy, but the extreme points of such a surface can be approximated using extreme points of the Vandermonde determinant on a set of simpler surfaces, for example cubes and spheres. The surface can be discretized by, e.g., cubes or spheres whose extreme points lie close to the efficient frontier, as illustrated in Fig. 26.10a and b respectively. Thus


Fig. 26.10 Illustration of the construction of the efficient frontier by discretization using extreme points of the Vandermonde determinant on a a cube, b a sphere, where $\sigma$ is the risk and $E$ is the returns

a smaller set of points on or near the efficient frontier can be examined and used to create a portfolio with optimum returns and minimum risk.

26.8 Conclusion

In this study we have established that the concepts of polynomials and least squares can be used in portfolio analysis and construction. Among our results, we have constructed the relationship between the generalized variance, which is the determinant of the variance-covariance matrix, and the Vandermonde determinant, as expressed in Lemma 1. We further proved in Theorem 2 that the risk of investing in a given number of assets is inversely proportional to the square of the Vandermonde determinant, implying that it is possible to minimize the risk $\sigma$ by maximizing the Vandermonde determinant.

In Theorem 3, we adopted the fact that the eigenvalues of the variance-covariance matrix, which is a Wishart-type matrix, have a joint eigenvalue probability density [44]. This further revealed that we can optimize the square of the Vandermonde determinant using the trace appearing in the exponential term of the density as the constraint, as demonstrated in [52]. The trace function represents the connection between the extreme points of the Vandermonde determinant and the efficient frontier, where the extreme points that maximize the determinant and also minimize the risk lie on the same surface, say a sphere.

In Theorem 4, we stated and proved the general result whereby the efficient frontier is described by a smooth sphere. The extreme points of the square of the Vandermonde determinant are the same points that minimize the risk on a given set of assets, and these points are given by the zeros of the classical


Hermite polynomials. This result was further extended to other surfaces, such as cubes, as discussed in Theorem 5.

Regarding error analysis in the use of the method of extreme points of the Vandermonde determinant to approximate the efficient frontier, it should be noted that in this study we used analytic methods in the computations, so we could not carry out an error analysis in comparison with other well-established methods. However, in our recent work [49] we established that the stability of the Vandermonde determinant, or the conditioning of the Vandermonde matrix, in numerical approximation and error analysis is highly dependent on the extreme points. Thus our method of extreme points for approximating the efficient frontier can be considered efficient in terms of stability and reliability. We hope in future work to apply experimental data for further analysis.

Therefore, extreme points of the Vandermonde determinant optimized over various surfaces can be applied in approximating the efficient frontier and can play a useful role in asset pricing and portfolio construction, for instance, to determine the most appropriate asset allocation, that is, which assets make the best combination based on their risk measure so as to maximize the returns by minimizing the risk.

Acknowledgements We acknowledge the financial support for this research by the Swedish International Development Cooperation Agency (Sida), Grant No. 316, International Science Programme (ISP) in Mathematical Sciences (IPMS). We are also grateful to the Division of Mathematics and Physics, Mälardalen University, for providing an excellent and inspiring environment for research education and research.

References

1. Abramowitz, M., Stegun, I.: Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover, New York (1964)
2. Bartholomew-Biggs, M.C.: Non-Linear Optimization with Financial Applications. Kluwer Academic Publishers, NetLibrary Inc (2005)
3. Benth, F.E.: Option Theory with Stochastic Analysis: An Introduction to Mathematical Finance. Universitext, Springer, Berlin (2004)
4. Bonello, N., Sheng, C., Lajos, H.: Construction of regular quasi-cyclic protograph LDPC codes based on Vandermonde matrices. IEEE Trans. Veh. Technol. 57(8), 2583–2588 (2008)
5. Bose, R.C., Ray-Chaudhuri, D.K.: On a class of error correcting binary group codes. Inf. Control 3(1), 68–79 (1960)
6. Björk, T.: Arbitrage Theory in Continuous Time. Oxford University Press (2000)
7. Cirafici, M., Sinkovics, A., Szabo, R.J.: Cohomological gauge theory, quiver matrix models and Donaldson–Thomas theory. Nucl. Phys. Sect. B 809(3), 452–518 (2009)
8. Cox, J.C., Ross, S.A., Rubinstein, M.: Option pricing: a simplified approach. J. Financ. Econ. 7(3), 229–263 (1979)
9. Dana, R.-A., Monique, J.: Financial Markets in Continuous Time. Springer Finance. Springer, Berlin (2007)
10. Davis, M.H.A.: Martingale representation and all that. In: Abed, E.H. (ed.) Advances in Control, Communication Networks, and Transportation Systems. Systems and Control: Foundations and Applications, pp. 57–68. Birkhäuser, Boston (2005)
11. Davis, P.J.: Interpolation and Approximation. Blaisdell, New York (1963)


12. Delbaen, F., Schachermayer, W.: The Mathematics of Arbitrage. Springer Finance. Springer, Berlin (2006)
13. Delbaen, F., Shirakawa, H.: A note on option pricing for the constant elasticity of variance model. Asia-Pacific Financ. Markets 9, 85–99 (2002)
14. Dothan, M.U.: Efficiency and arbitrage in financial markets. Int. Res. J. Finan. Econ. 19, 102–106 (2008)
15. Dothan, M.U.: Prices in Financial Markets. The Clarendon Press, Oxford University Press, New York (1990)
16. Duffie, D.: Dynamic Asset Pricing Theory, 3rd edn. Princeton University Press, Princeton, NJ (2001)
17. Edelman, A., Rao, N.R.: Random matrix theory. Acta Numerica 14, 233–297 (2005)
18. Elliott, J.R., Kopp, E.P.: Mathematics of Financial Markets. Springer Finance, 2nd edn. Springer, New York (2005)
19. Epps, T.W.: Pricing Derivative Securities. World Scientific, Singapore (2000)
20. Fabozzi, F.J.: A Handbook of Fixed Income Securities, 7th edn. McGraw-Hill Publishing (2005)
21. Fischer, B., Scholes, M.: The pricing of options and corporate liabilities. J. Polit. Econ. 81, 637–659 (1975)
22. Forrester, P.J.: Log-Gases and Random Matrices. Princeton University Press (2010)
23. Föllmer, H., Schied, A.: Stochastic Finance. An Introduction in Discrete Time, 2nd edn. De Gruyter Studies in Mathematics 27. Walter de Gruyter, New York (2002)
24. Gazizov, R.K., Ibragimov, N.H.: Lie symmetry analysis of differential equations in finance. Nonlinear Dyn. 17(4), 387–407 (1998)
25. Glasserman, P.: Monte Carlo Methods in Financial Engineering. Stochastic Modelling and Applied Probability (SMAP), vol. 53. Springer, New York (2004)
26. Guest, P.G.: The spacing of observations in polynomial regression. Ann. Math. Stat. 29(1), 294–299 (1958)
27. Hocquenghem, A.: Codes correcteurs d'erreurs. Chiffres 2, 147–156 (1959)
28. Huang, C.-F., Litzenberger, R.H.: Foundations for Financial Economics. North-Holland Publishing Co., New York (1988)
29. Hull, J.C.: Options, Futures and Other Derivatives. Prentice Hall College Div, New York (2000)
30. Hull, J.C.: Options, Futures and Other Derivatives, 7th edn. Pearson/Prentice Hall (2015)
31. Ibragimov, N.H. (ed.): CRC Handbook of Lie Group Analysis of Differential Equations. CRC Press, Boca Raton, FL. Vol. 1 (1994), Vols. 2, 3 (1995)
32. Ingersoll, J.E.: Theory of Financial Decision Making. Blackwell, Oxford (1997)
33. Karatzas, I.: Lectures on the Mathematics of Finance. CRM Monograph Series, vol. 8. American Mathematical Society, Providence, RI (1988)
34. Karatzas, I., Shreve, S.E.: Methods of Mathematical Finance. Applications of Mathematics, vol. 39. Springer, New York (1998)
35. Karatzas, I., Shreve, S.E.: Brownian Motion and Stochastic Calculus, 2nd edn. Graduate Texts in Mathematics, vol. 113. Springer, New York (1991)
36. Kijima, M.: Stochastic Processes with Applications to Finance. CRC Press (2013)
37. Klein, A.: Matrix algebraic properties of the Fisher information matrix of stationary processes. Entropy 16, 2013–2055 (2014)
38. Lamberton, D., Lapeyre, B.: Introduction to Stochastic Calculus Applied to Finance, 2nd edn. Chapman & Hall/CRC Financial Mathematics Series. Chapman & Hall/CRC, Boca Raton, FL (2008)
39. Laurence, P.: Quantitative Modeling of Derivative Securities. From Theory to Practice. Routledge (2017)
40. Lipton, A.: Mathematical Methods for Foreign Exchange. A Financial Engineer's Approach. World Scientific Publishing Co., Inc., River Edge, NJ (2001)
41. Lundengård, K.: Extreme Points of the Vandermonde Determinant and Phenomenological Modelling with Power Exponential Functions. Doctoral dissertation, Mälardalen University (2019)

622

A. K. Muhumuza et al.

42. Lundengård, K., Österberg, J., Silvestrov, S.: Optimization of the determinant of the Vandermonde matrix and related matrices. Methodol. Comput. Appl. Probab. 19(4), 1–12 (2017) 43. Markowitz, H.: Portfolio selection. J. Finan. 7(1), 77–91 (1952) 44. Mehta, M.L.: Random Matrices. Elsevier (2004) 45. Mehta, M.L.: Random Matrices and the Statistical Theory of Energy Levels. Academic Press, New York, London (1967) 46. Merton, R.C.: Continuous-Time Finance. Blackwell, Cambridge, MA (1999) 47. Merton, R.C.: The theory of rational option pricing. Bell J. Econ. Manag. Sci. 4, 141–183 (1973) 48. Moya-Cessa, H.M., Soto-Eguibar, F.: Discrete fractional Fourier transform: Vandermonde approach. arxiv: 1604.06686v1 [math.GM] (2016) 49. Muhumuza, A.K., Lundengård, K., Silvestrov, S, Mango, J.M., Kakuba, G.: Properties of the extreme points of the joint eigenvalue probability density function of the random Wishart matrix. In: Dimotikalis, Y., Karagrigoriou, A., Parpoula, C., Skiadas, C.H. (eds.), Applied Modeling Techniques and Data Analysis 2: Financial, Demographic, Stochastic and Statistical Models and Methods, Vol. 8, Ch.14, pp. 195–209 (2021). (first appered In: Skiadas, C.H. (Ed.), ASMDA2019, 18th Applied Stochastic Models and Data Analysis International Conference. ISAST: International Society for the Advancement of Science and Technology, 559–571 (2019)) 50. Muhumuza, A. K., Lundengård, K., Österberg, J., Silvestrov, S, Mango, J. M., Kakuba, G.: The generalized Vandermonde interpolation polynomial based on divided differences. In: Skiadas, C. H. (Ed.), Proceedings of the 5th Stochastic Modeling Techniques and Data Analysis International Conference with Demographics Workshop, Chania, Crete, Greece, 2018, ISAST: International Society for the Advancement of Science and Technology, pp 443–456 (2018) 51. Muhumuza, A. K., Lundengård, K., Österberg, J., Silvestrov, S., Mango, J. 
M., Kakuba, G.: Extreme points of the Vandermonde determinant on surfaces implicitly determined by a univariate polynomial In: Silvestrov, S., Malyarenko, A., Rancic, M. (Eds.), Algebraic Structures and Applications, Springer Proceedings in Mathematics and Statistics, Vol 317, Ch. 33, pp. 791–818 (2020) 52. Muhumuza, A. K., Lundengå, K., Österberg, J., Silvestrov, S., Mango, J.M., Kakuba, G.: Optimization of the Wishart joint eigenvalue probability density distribution based on the Vandermonde determinant. In: Silvestrov, S., Malyarenko, A., Rancic, M. (Eds.), Algebraic Structures and Applications, Springer Proceedings in Mathematics and Statistics, Vol 317, Ch. 34, pp. 819–838 (2020) 53. Muhumuza, A.K., Malyarenko, A., Silvestrov, S.: Lie symmetries of the Black–Scholes type equations in financial mathematics. In: Skiadas, C.H. (Ed.), Proceedings of the 17th Applied Stochastic Models and Data Analysis International Conference with the 6th Demographics Workshop London, UK (ASMDA2017): 6-9 June, 2017. ISAST: International Society for the Advancement of Science and Technology, pp. 723-740 (2017) 54. Musiela, M., Rutkowski, M.: Martingale Methods in Financial Modelling, 2nd ed.. Stochastic Modelling and Applied Probability, vol. 36. Springer, Berlin (2005) 55. Neftci, S.N.: Introduction to the Mathematics of Financial Derivatives, 2nd edn. Academic Press, Orlando, FL (2000) 56. Pliska, S.R.: Introduction to the Mathematics of Financial Derivatives Discrete Models. Wiley (1997) 57. Reed, I.S., Solomon, G.: Polynomial codes over certain finite fields. J. Soc. Ind. Appl. Math. 8(2), 300–304 (1960) 58. Rouge, R., El Karoui, N.: Pricing via utility maximization and entropy. Math. Finan. 10(2), 259–276 (2000) 59. Rubinstein, A., Romero, C., Paolone, M., Rachidi, F., Rubinstein, M., Zweiacker, P., Daout, B.: Lightning measurement station on mount Säntis in Switzerland. In: Proceedings of X International Symposium on Lightning Protection, Curitiba, Brazil, pp. 
463–468 (2009) 60. Sharp, K.P.: Stochastic differential equations in finance. Appl. Math. Comput. 38, 207–413 (1990)

26 Connections Between the Extreme Points for Vandermonde Determinants …

623

61. Sharp, F.W.: Capital asset prices: a theory of market equilibrium under conditions of risk. J. Finan. XIX(3), 425–442 (1990) 62. Shreve, S.E.: Stochastic Calculus for Finance. I. The Binomial Asset Pricing Model. Springer Finance, Springer, New York (2004) 63. Shreve, S.E.: Stochastic Calculus for Finance. II. Continuous-Time Models. Springer Finance, Springer, New York (2004) 64. Schur, I.: Über die Verteilung der Wurzeln bei gewissen algebraischen Gleichungen mit ganzzahligen Koeffizienten. Matematische Zeitschrift 1(4), 377–402 (1918) 65. Staff Investopedia: Portfolio, Investopedia. Archived from the original on 2018-04-20, Retrieved 2018-04-19, 2003-11-25 66. Steele, J.M.: Stochastic Calculus and Financial Applications. Applications of Mathematics (New York), vol. 45. Springer, New York (2001) 67. Szeg˝o, G.: Orthogonal Polynomials. American Mathematics Society (1939) 68. Vein, R., Dale, P.: Determinants and Their Applications in Mathematical Physics. Springer, New York (1999) 69. Wilmott, P., Howison, S., Dewynne, J.: The Mathematics of Financial Derivatives: A Student Introduction. Cambridge University Press, Cambridge (1995) 70. Zhu, Y.-L., Wu, X., Chernm, I.-L.: Derivative Securities and Difference Methods. Springer Finance, Springer, New York (2004)

Chapter 27

Extreme Points of the Vandermonde Determinant and Wishart Ensemble on Symmetric Cones

Asaph Keikara Muhumuza, Anatoliy Malyarenko, Karl Lundengård, Sergei Silvestrov, John Magero Mango, and Godwin Kakuba

Abstract In this paper we demonstrate the extreme points of the Wishart joint eigenvalue probability distributions in higher dimensions based on the boundary points of the symmetric cones in Jordan algebras. The extreme points of the Vandermonde determinant are defined to be a set of boundary points of the symmetric cones that occur in both the discrete and the continuous part of the Gindikin set. The symmetric cones form a basis for the construction of the degenerate and non-degenerate Wishart ensembles in Herm(m, C), Herm(m, H), and Herm(3, O), which denote, respectively, the Jordan algebras of Hermitian matrices with entries in the complex numbers C, in the skew field H of quaternions, and in the algebra O of octonions.

Keywords Jordan algebras · Symmetric cones · Vandermonde determinant · Wishart joint eigenvalue distributions

MSC 2020 15A15, 91G10, 17A15

A. K. Muhumuza (B)
Department of Mathematics, Busitema University, Box 236, Tororo, Uganda
e-mail: [email protected]

A. Malyarenko · K. Lundengård · S. Silvestrov
Division of Mathematics and Physics, Mälardalen University, Box 883, 721 23 Västerås, Sweden
e-mail: [email protected]
S. Silvestrov e-mail: [email protected]

J. M. Mango · G. Kakuba
Department of Mathematics, Makerere University, Box 7062, Kampala, Uganda
e-mail: [email protected]
G. Kakuba e-mail: [email protected]; [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_27


27.1 Introduction

The maximal invariants that generally arise in statistical hypothesis testing, as discussed in [19], are functions of the eigenvalues of sample variance-covariance matrices; see, for example, [1, 2, 15] and [24]. This motivates the problem of determining the distribution of the ordered eigenvalues of the Wishart matrix in the domain of symmetric cones generated by positive definite matrices over the real, complex, quaternion and octonion division algebras. The study of distributions of ordered eigenvalues of random matrices with values in the domain of classical symmetric cones coincides with similar studies in random matrix theory, fully discussed in [6, 7] and [22]. Thus, the applications of the joint eigenvalue distributions are not limited to hypothesis testing; they also appear in many other problems, for instance in quantum mechanics, principal component analysis, signal processing, and random fields, see [1, 16, 22, 24–27] and [21].

27.1.1 Gaussian and Chi-Square Distributions

Consider the following classical statistical problem. Let X be a normal random variable with mean μ and variance σ². Let x₁, …, x_N be a sample from the normal population distributed like X. It is well known that the maximum likelihood estimates of the parameters μ and σ² are

\hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \hat{\sigma}^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \hat{\mu})^2.

Moreover, the random variable Y = N σ̂²/σ² has the chi-square distribution with N degrees of freedom. The probability density of the chi-square distribution with N degrees of freedom was derived by William S. Gosset, a brewer of Guinness beer, in [31]. Gosset published his research under the pen name "Student". According to [34],

At Guinness the scientific brewers, including Gosset, were allowed by the company to publish research so long as they did not mention (1) beer, (2) Guinness, or (3) their own surname.

The above probability density has the form

f_Y(x) = \frac{1}{2^{N/2}\,\Gamma(N/2)} \exp(-x/2)\, x^{N/2-1}\, \mathbf{1}_{(0,\infty)}(x),

where Γ is the gamma function:

\Gamma(s) = \int_{0}^{\infty} \exp(-x)\, x^{s-1}\, dx.    (27.1)

It follows from (27.1) that the function

f_Y(x) = \frac{x^{\lambda-1}\exp(-x/\sigma)}{\sigma^{\lambda}\,\Gamma(\lambda)}\, \mathbf{1}_{(0,\infty)}(x)

is a probability density as long as λ > 0 and σ > 0. The corresponding probability distribution is the gamma distribution with shape parameter λ and scale parameter σ. For the particular values λ = N/2 and σ = 2, we return to the chi-square distribution with N degrees of freedom.
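As a quick numerical sanity check (an illustrative sketch, not part of the chapter), the gamma density with λ = N/2 and σ = 2 can be compared pointwise with the chi-square density above, and its total mass can be approximated by a Riemann sum:

```python
import math

def gamma_density(x, lam, sigma):
    # f(x) = x^(lam-1) exp(-x/sigma) / (sigma^lam * Gamma(lam)) for x > 0
    return x ** (lam - 1) * math.exp(-x / sigma) / (sigma ** lam * math.gamma(lam))

def chi2_density(x, N):
    # f(x) = exp(-x/2) x^(N/2-1) / (2^(N/2) Gamma(N/2)) for x > 0
    return math.exp(-x / 2) * x ** (N / 2 - 1) / (2 ** (N / 2) * math.gamma(N / 2))

N = 5
# The two parametrisations agree pointwise when lam = N/2 and sigma = 2.
pointwise_gap = max(abs(gamma_density(x, N / 2, 2.0) - chi2_density(x, N))
                    for x in (0.1 * k for k in range(1, 300)))

# The density integrates to approximately 1 (Riemann sum on (0, 60]).
h = 0.001
total_mass = h * sum(gamma_density(k * h, N / 2, 2.0) for k in range(1, 60001))
```

The tolerance in the mass check only accounts for discretisation and truncation of the integral.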

27.1.2 Laplace Transform and the Wishart Density

Recall that the Laplace transform of a random variable Y is defined as L_Y(s) = E[exp(sY)] for all complex numbers s such that the expectation exists.

Remark 27.1 In many sources, the above definition contains the opposite sign: L_Y(s) = E[exp(−sY)]. We choose the convention of [5] and subsequent papers.

In particular, for the gamma distribution we have

\mathcal{L}_{\lambda,\sigma}(s) = (1 - \sigma s)^{-\lambda}, \qquad \operatorname{Re} s < \sigma^{-1}.

Observe that lim_{λ↓0} L_{λ,σ}(s) = 1. The right-hand side is the Laplace transform of the random variable Y = 0. We say that this random variable has the gamma distribution with shape parameter λ = 0 and an arbitrary scale parameter σ > 0.

It is easy to generalise the above discussion to the case of random vectors. Specifically, let X be an m-dimensional normal random vector with mean µ and covariance matrix Σ. Let x₁, …, x_N be a sample from a normal population distributed like X. The maximum likelihood estimates of the parameters µ and Σ are

\hat{\boldsymbol{\mu}} = \frac{1}{N}\sum_{i=1}^{N} \mathbf{x}_i, \qquad \hat{\Sigma} = \frac{1}{N}\sum_{i=1}^{N} (\mathbf{x}_i - \hat{\boldsymbol{\mu}})(\mathbf{x}_i - \hat{\boldsymbol{\mu}})^{\top}.

Moreover, the random variable Y = N Σ̂ has the classical Wishart distribution with N degrees of freedom and covariance matrix Σ, see [24]. Denote this distribution by W_m^c(N, Σ). If N ≥ m, then the probability density of the above distribution has the form

f_Y(x) = \frac{1}{2^{Nm/2}\,\Gamma_m(N/2)\,(\det\Sigma)^{N/2}} \exp(-\operatorname{tr}(\Sigma^{-1}x)/2)\,(\det(x))^{(N-m-1)/2}\, \mathbf{1}_{\Omega}(x),    (27.2)


where Ω is the set of all symmetric positive-definite m × m matrices and Γ_m is the multivariate gamma function, defined for all s ∈ C with Re s > (m − 1)/2 as

\Gamma_m(s) = \int_{\Omega} \exp(-\operatorname{tr}(x))\,(\det(x))^{s-(m+1)/2}\, dx,    (27.3)

see [24, Definition 2.1.10], and where tr denotes the trace of a matrix. For the case of m = 2, this density was derived in [9]. The case of an arbitrary m was considered in [33]. The integral on the right-hand side of Eq. (27.3) was calculated in [14].

Theorem 27.1 ([14]) We have

\int_{\Omega} \exp(-\operatorname{tr} x)\,(\det x)^{s-(m+1)/2}\, dx = \pi^{m(m-1)/4} \prod_{i=0}^{m-1} \Gamma(s - i/2).

Later, this integral appeared in [30] and became known in the number-theoretical community as the Siegel integral.

Let E = Sym(m, R) be the linear space of all symmetric m × m matrices with real entries. Introduce the scalar product on E by (x|y) = tr(xy). The Laplace transform of an E-valued random matrix Y is defined by L_Y(x) = E[exp((x|Y))] for all x ∈ E for which the expectation exists. In the case of the classical Wishart distribution we obtain

\mathcal{L}_Y(x) = (\det(I - 2\Sigma x))^{-N/2},    (27.4)

where I is the m × m identity matrix.

It is convenient to change slightly the parametrisation of the classical Wishart distribution. Denote W_m(N, Σ) = W_m^c(2N, Σ^{-1}/2). The probability density (27.2) becomes

f_Y(x) = \frac{(\det\Sigma)^{N}}{\Gamma_\Omega(N)} \exp(-\operatorname{tr}(\Sigma x))\,(\det x)^{N-(m+1)/2}\, \mathbf{1}_{\Omega}(x),

while the Laplace transform (27.4) becomes L_Y(x) = (det(I − Σ^{-1}x))^{-N}. In particular, when m = 1, we obtain an alternative parametrisation of the chi-square distribution:

f_Y(x) = \frac{1}{\Gamma(N)} \exp(-x)\, x^{N-1}\, \mathbf{1}_{(0,\infty)}(x).

We introduce a family of Wishart distributions as a particular case of the following general construction. Let μ be a measure defined on the Borel σ-field of a finite-dimensional Euclidean space E. Let L_μ(y) = ∫_E exp((x|y)) dμ(x) be the Laplace transform of μ, and assume that the interior Y(μ) of the set of all y ∈ E for which L_μ(y) < ∞ is not empty.

Definition 27.1 The set F(μ) = { P_{y,μ} : y ∈ Y(μ) } of probability measures on E defined by

d P_{y,\mu}(x) = \frac{1}{\mathcal{L}_\mu(y)} \exp((x|y))\, d\mu(x)

is called the natural exponential family generated by μ. Observe that

\mathcal{L}_{P_{y,\mu}}(z) = \frac{1}{\mathcal{L}_\mu(y)} \int_E \exp((x|z)) \exp((x|y))\, d\mu(x) = \frac{1}{\mathcal{L}_\mu(y)} \int_E \exp((x|z+y))\, d\mu(x) = \frac{\mathcal{L}_\mu(z+y)}{\mathcal{L}_\mu(y)}.

The standard reference for natural exponential families is [4].

Example 27.1 Define the measure μ_λ by

d\mu_\lambda(x) = \frac{1}{\Gamma_m(\lambda)} (\det x)^{\lambda-(m+1)/2}\, \mathbf{1}_{\Omega}(x)\, dx, \qquad \lambda > \frac{m-1}{2}.

The Laplace transform of this measure is

\mathcal{L}_{\mu_\lambda}(y) = (\det(-y))^{-\lambda}, \qquad y \in -\Omega = \{\, -x : x \in \Omega \,\}.

The corresponding natural exponential family is

d P_{\Sigma,\mu_\lambda}(x) = \frac{1}{\mathcal{L}_{\mu_\lambda}(\Sigma)} \exp((x|\Sigma))\, d\mu_\lambda(x) = \frac{(\det(-\Sigma))^{\lambda}}{\Gamma_m(\lambda)} \exp((x|\Sigma))\,(\det x)^{\lambda-(m+1)/2}\, \mathbf{1}_{\Omega}(x)\, dx

for Σ ∈ −Ω. We would like to run Σ over Ω. For that, replace Σ with −Σ. We obtain the distribution

d P_{\Sigma,\mu_\lambda}(x) = \frac{(\det\Sigma)^{\lambda}}{\Gamma_m(\lambda)} \exp(-\operatorname{tr}(\Sigma x))\,(\det x)^{\lambda-(m+1)/2}\, \mathbf{1}_{\Omega}(x)\, dx.

This distribution is called the Wishart distribution with shape parameter λ > (m − 1)/2 and scale parameter Σ ∈ Ω. In particular, the Wishart distribution with shape parameter λ = N/2 and scale parameter 2Σ^{-1} is the classical one. When m = 1, we obtain an alternative parametrisation of the exponential distribution:

d P_{\sigma,\mu_\lambda}(x) = \frac{\sigma^{\lambda}}{\Gamma(\lambda)} \exp(-\sigma x)\, x^{\lambda-1}\, \mathbf{1}_{(0,\infty)}(x)\, dx.
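For the one-dimensional case, the Laplace transform (1 − s/σ)^{−λ} of this distribution can be spot-checked against a direct numerical integration of E[exp(sY)] (a sketch for illustration only, not part of the chapter):

```python
import math

lam, sigma, s = 2.5, 1.7, 0.3   # shape, scale-type parameter, evaluation point (s < sigma)

def density(x):
    # dP_{sigma,mu_lam}(x) = sigma^lam / Gamma(lam) * exp(-sigma x) x^(lam-1) dx on (0, inf)
    return sigma ** lam / math.gamma(lam) * math.exp(-sigma * x) * x ** (lam - 1)

# E[exp(sY)] approximated by a Riemann sum on (0, 50]
h = 0.0005
numeric = h * sum(math.exp(s * k * h) * density(k * h) for k in range(1, 100001))

# Laplace transform of the gamma-type distribution: (1 - s/sigma)^(-lam)
closed_form = (1 - s / sigma) ** (-lam)
```

The two values agree up to the discretisation error of the sum.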

The Laplace transform of the Wishart distribution is


\mathcal{L}_{P_{\Sigma,\mu_\lambda}}(y) = \frac{\mathcal{L}_{\mu_\lambda}(y-\Sigma)}{\mathcal{L}_{\mu_\lambda}(-\Sigma)} = \frac{(\det\Sigma)^{\lambda}}{(\det(\Sigma-y))^{\lambda}} = (\det(I - \Sigma^{-1}y))^{-\lambda}.

Does such a distribution exist for the remaining values of λ? Let Ω̄ be the closure of Ω in E, that is, the set of all positive-semidefinite m × m matrices with real entries.

Definition 27.2 An Ω̄-valued random matrix Y has the Wishart distribution with shape parameter λ and scale parameter Σ if and only if

\mathcal{L}_Y(y) = (\det(I - \Sigma^{-1}y))^{-\lambda}.    (27.5)

This is a particular case of the general definition given by [5]. The following definition and theorem are particular cases of the results of [11].

27.1.3 The Gindikin Set and Wishart Joint Eigenvalue Distribution

Definition 27.3 The Gindikin set is

\left\{ 0, \frac{1}{2}, 1, \ldots, \frac{m-1}{2} \right\} \cup \left( \frac{m-1}{2}, \infty \right).    (27.6)

Theorem 27.2 The right-hand side of Eq. (27.5) defines the Laplace transform of a random variable if and only if λ belongs to the Gindikin set (27.6). When λ = ℓ/2, 0 ≤ ℓ ≤ m − 1, the Wishart distribution is supported by the subset of the topological boundary ∂Ω̄ of the set Ω̄ which consists of all positive-semidefinite m × m matrices with real entries and of rank ℓ.

In particular, when m = 1, this subset is the singleton {0}, and the gamma distribution with shape parameter 0 is indeed supported by the above singleton.

Theorem 27.2 was conjectured in [20] for the case of m = 2. After [11], it was proved again independently in different mathematical communities, see [18, 28, 32], and [8]. In the functional analysis community, the Gindikin set is called the Wallach set.

In physics, ensembles are understood as joint distributions of finitely many real objects, see [17]. As an example, consider the Wishart ensemble, that is, the joint distribution of the eigenvalues λ₁ ≥ λ₂ ≥ ⋯ ≥ λ_m > 0 of the Wishart matrix. This distribution was calculated almost simultaneously by [10, 12, 13], and [29]. The general formula (see e.g. [24]) includes a complicated integral. We describe a particular case, when Σ = I; that is, the probability density of the matrix is

f_Y(x) = \frac{1}{\Gamma_m(\lambda)} \exp(-\operatorname{tr}(x))\,(\det x)^{\lambda-(m+1)/2}\, \mathbf{1}_{\Omega}(x), \qquad \lambda > \frac{m-1}{2}.
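The membership condition in the Gindikin set (27.6) is easy to state in code; the following helper (a hypothetical name, for illustration only) uses exact rational arithmetic to distinguish the discrete half-integer part from the continuous part:

```python
from fractions import Fraction

def in_gindikin(lam, m):
    """True iff lam lies in {0, 1/2, 1, ..., (m-1)/2} or in ((m-1)/2, infinity)."""
    lam = Fraction(lam)
    edge = Fraction(m - 1, 2)
    if lam > edge:
        return True  # continuous part ((m-1)/2, infinity)
    # discrete part: the half-integers 0, 1/2, 1, ..., (m-1)/2
    return lam >= 0 and (2 * lam).denominator == 1
```

For m = 4 the discrete part is {0, 1/2, 1, 3/2}, so for example 5/4 is excluded while 7/4 lies in the continuous part.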


Table 27.1 Classification of simple Euclidean Jordan algebras

E              | Ω       | n           | r | d
R^1            | (0, ∞)  | 1           | 1 | 0
R^1 × R^{m−1}  | Λ_m     | m           | 2 | m − 2
Sym(m, R)      | Π_m(R)  | m(m + 1)/2  | m | 1
Herm(m, C)     | Π_m(C)  | m²          | m | 2
Herm(m, H)     | Π_m(H)  | m(2m − 1)   | m | 4
Herm(3, O)     | Π_3(O)  | 27          | 3 | 8
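The columns of Table 27.1 satisfy the standard dimension relation n = r + d r(r − 1)/2 for simple Euclidean Jordan algebras; the following sketch (an assumption-free arithmetic check, not from the chapter) verifies this row by row for a few instantiations of m:

```python
def rows(m):
    # (name, n, r, d) as in Table 27.1; the Lorentz row assumes m >= 3
    return [
        ("R^1",            1,                1, 0),
        ("R^1 x R^(m-1)",  m,                2, m - 2),
        ("Sym(m,R)",       m * (m + 1) // 2, m, 1),
        ("Herm(m,C)",      m * m,            m, 2),
        ("Herm(m,H)",      m * (2 * m - 1),  m, 4),
        ("Herm(3,O)",      27,               3, 8),
    ]

def dimension_ok(m):
    # n = r + d * r * (r - 1) / 2 must hold in every row
    return all(n == r + d * r * (r - 1) // 2 for _, n, r, d in rows(m))
```

For instance, for Herm(m, H) the identity reads m(2m − 1) = m + 4·m(m − 1)/2.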

Theorem 27.3 The probability density of the ordered eigenvalues of the Wishart matrix with shape parameter λ and scale parameter I is

f_Y(\lambda_1, \ldots, \lambda_m) = \frac{m!\,\pi^{m^2/2}}{\Gamma_m(m/2)\,\Gamma_m(\lambda)} \prod_{i=1}^{m} \lambda_i^{\lambda-(m+1)/2} \prod_{1 \le i < j \le m} (\lambda_i - \lambda_j) \exp\left( -\sum_{i=1}^{m} \lambda_i \right).    (27.7)

27.2 A Quick Jump into Wishart Distribution on Symmetric Cones

In a companion paper, we introduced the irreducible symmetric cones Ω. These are the half-line (0, ∞), the Lorentz cone Λ_m = { (λ, u) ∈ R¹ × R^{m−1} : λ² − (u|u) > 0, λ > 0 }, and the sets Π_m(R), Π_m(C), Π_m(H), and Π_3(O) of positive-definite matrices in simple Euclidean Jordan algebras. The density of the non-degenerate Wishart distribution with parameters λ ∈ ((r − 1)d/2, ∞) and Σ ∈ Ω is given by

f_Y(x) = \frac{(\det\Sigma)^{\lambda}}{\Gamma_\Omega(\lambda)} \exp(-\operatorname{tr}(\Sigma \circ x))\,(\det(x))^{\lambda-n/r}\, \mathbf{1}_{\Omega}(x),

where the numbers r and d are given in Table 27.1 and the gamma function determined by the cone Ω is

\Gamma_\Omega(s) = \int_{\Omega} \exp(-\operatorname{tr}(x))\,(\det(x))^{s-n/r}\, dx, \qquad \operatorname{Re} s > n/r - 1.

In small dimensions, we have the following isomorphisms:

Sym(1, R) ∼ Herm(1, C) ∼ Herm(1, H) ∼ Herm(1, O) ∼ R¹,
Sym(2, R) ∼ R¹ ⊕ R²,  Herm(2, C) ∼ R¹ ⊕ R³,
Herm(2, H) ∼ R¹ ⊕ R⁵,  Herm(2, O) ∼ R¹ ⊕ R⁹.

Theorem 27.4 ([23]) Let λ be a real number that belongs to the interior of the Gindikin set (27.6). The probability density of the distribution of the ordered spectral eigenvalues of the Wishart random variable with Laplace transform (27.5) is given by

f(\lambda_1, \ldots, \lambda_r) = \frac{r!\,[\Gamma(d/2)]^{r}\,(2\pi)^{n-r}}{\Gamma_\Omega(\lambda)\,\Gamma(rd/2)} \prod_{i=1}^{r} \lambda_i^{\lambda-n/r} \prod_{1 \le i < j \le r} (\lambda_i - \lambda_j)^{d} \exp\left( -\sum_{i=1}^{r} \lambda_i \right).

For the Lorentz cone Λ_m, the distribution of the spectral eigenvalues has the density

f(\lambda_1, \lambda_2) = C(\lambda, m)\,(\lambda_1\lambda_2)^{\lambda-m/2}\,(\lambda_2 - \lambda_1)^{m-2}\,\exp(-\lambda_1 - \lambda_2),    (27.8)

where C(\lambda, m) = \frac{\sqrt{\pi}}{\Gamma(\lambda)\,\Gamma(\lambda - m/2 + 1)\,2^{m-4}\,\Gamma((m-1)/2)} and λ > m/2 − 1.

For the cone Herm(3, C), the above density is

f(\lambda_1, \lambda_2, \lambda_3) = \frac{3\,(\lambda_1\lambda_2\lambda_3)^{\lambda-3}}{\Gamma(\lambda)\,\Gamma(\lambda-1)\,\Gamma(\lambda-2)} \prod_{1 \le i < j \le 3} (\lambda_j - \lambda_i)^2 \exp(-\lambda_1 - \lambda_2 - \lambda_3),    (27.9)

where λ > 2. For the cone Herm(3, H), the above density is

f(\lambda_1, \lambda_2, \lambda_3) = \frac{(\lambda_1\lambda_2\lambda_3)^{\lambda-5}}{120\,\Gamma(\lambda)\,\Gamma(\lambda-2)\,\Gamma(\lambda-4)} \prod_{1 \le i < j \le 3} (\lambda_j - \lambda_i)^4 \exp(-\lambda_1 - \lambda_2 - \lambda_3),    (27.10)

where λ > 4. For the cone Herm(3, O), the above density is

f(\lambda_1, \lambda_2, \lambda_3) = \frac{6^3\,(\lambda_1\lambda_2\lambda_3)^{\lambda-9}}{11!\,7!\,\Gamma(\lambda)\,\Gamma(\lambda-4)\,\Gamma(\lambda-8)} \prod_{1 \le i < j \le 3} (\lambda_j - \lambda_i)^8 \exp(-\lambda_1 - \lambda_2 - \lambda_3),    (27.11)

where λ > 8. The degenerate Wishart distribution is supported by the set

\partial_\ell\Omega = \{\, x \in \partial\Omega : \operatorname{rank}(x) = \ell \,\}.
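The eigenvalue exponents λ − 3, λ − 5, λ − 9 in (27.9)–(27.11) are instances of the exponent λ − n/r in the general density, with (n, r) read off Table 27.1 at m = 3; a minimal check:

```python
# (algebra, (n, r)) for the three rank-3 Hermitian algebras, m = 3 (values from Table 27.1)
cases = {
    "Herm(3,C)": (9, 3),    # n = m^2
    "Herm(3,H)": (15, 3),   # n = m(2m - 1)
    "Herm(3,O)": (27, 3),
}
# in each case the eigenvalue exponent is lambda - n/r
shifts = {name: n // r for name, (n, r) in cases.items()}
```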


Table 27.2 The data associated to simple Euclidean Jordan algebras

E              | Ω       | Ω_ℓ          | K        | M_ℓ
R^1            | (0, ∞)  | {0}          | {1}      | {1}
R^1 × R^{m−1}  | Λ_m     | (0, ∞)       | SO(m−1)  | SO(m−2)
Sym(m, R)      | Π_m(R)  | Sym(ℓ, R)    | SO(m)    | SO(ℓ) × SO(m−ℓ)
Herm(m, C)     | Π_m(C)  | Herm(ℓ, C)   | SU(m)    | S(U(ℓ) × U(m−ℓ))
Herm(m, H)     | Π_m(H)  | Herm(ℓ, H)   | Sp(m)    | Sp(ℓ) × Sp(m−ℓ)
Herm(3, O)     | Π_3(O)  | Herm(ℓ, O)   | F4(−52)  | Spin(9)

We found that the set ∂Ω̄ is stratified into strata. Each stratum is a rotated symmetric cone of a certain Euclidean Jordan algebra; call this cone Ω_ℓ. The strata are enumerated by the elements of the set K/M_ℓ of left cosets of a certain group M_ℓ in a certain group K. These groups are given in Table 27.2, where the symbol F4(−52) denotes the simple simply connected real compact exceptional Lie group of type F₄.

Let dξ be the Lebesgue measure on Ω_ℓ, and let dc be the probabilistic K-invariant measure on the set K/M_ℓ. The degenerate Wishart distribution is given by

d P_{\Sigma,\mu_{\ell d/2}}(\xi, c) = \frac{(\det(\Sigma))^{\ell d/2}}{\Gamma_\Omega(rd/2)} \exp(-(\xi|\Sigma))\,(\det(\xi + e - c))^{(r+1-\ell)d/2-1}\, d\xi\, dc.

Theorem 27.5 The probability density of the nonzero ordered spectral eigenvalues of the degenerate Wishart distribution with Σ = e is given by

f(\lambda_{r-\ell+1}, \ldots, \lambda_r) = \frac{\ell!\,[\Gamma(d/2)]^{\ell}\,(2\pi)^{n_\ell-\ell}}{\Gamma_{\Omega_\ell}(\ell d/2)\,\Gamma_\Omega(rd/2)} \prod_{i=r-\ell+1}^{r} \lambda_i^{(r+1-\ell)d/2-1} \prod_{r-\ell+1 \le i < j \le r} (\lambda_j - \lambda_i)^{d} \exp\left( -\sum_{i=r-\ell+1}^{r} \lambda_i \right),    (27.12)

where e is the identity element of the Jordan algebra E.

Assume ℓ = 2 and E is a matrix algebra of rank r = 3. When F = R, the probability density of the distribution of the nonzero spectral eigenvalues of the degenerate Wishart matrix [23] is

f(\lambda_2, \lambda_3) = \frac{4}{\sqrt{\pi}}\,(\lambda_3 - \lambda_2)\,\exp(-\lambda_2 - \lambda_3).    (27.13)

When F = C, the above density is

f(\lambda_2, \lambda_3) = \sqrt{\pi}\,\lambda_2\lambda_3\,(\lambda_3 - \lambda_2)^2\,\exp(-\lambda_2 - \lambda_3).    (27.14)

When F = H, the above density is

f(\lambda_2, \lambda_3) = \frac{2^8}{15!!}\,(\lambda_2\lambda_3)^3\,(\lambda_3 - \lambda_2)^4\,\exp(-\lambda_2 - \lambda_3).    (27.15)

Finally, when F = O, the above density is

f(\lambda_2, \lambda_3) = \frac{2^{11}}{11!\,21!!}\,(\lambda_2\lambda_3)^7\,(\lambda_3 - \lambda_2)^8\,\exp(-\lambda_2 - \lambda_3).    (27.16)
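The eigenvalue exponents 0, 1, 3, 7 appearing in (27.13)–(27.16) all come from the expression (r + 1 − ℓ)d/2 − 1 with r = 3, ℓ = 2 and d = 1, 2, 4, 8; a one-line check:

```python
r, ell = 3, 2  # rank-3 matrix algebra, degenerate rank ell = 2

def eigen_exponent(d):
    # exponent of each nonzero eigenvalue in the degenerate density (Theorem 27.5)
    return (r + 1 - ell) * d // 2 - 1

# d = 1, 2, 4, 8 correspond to F = R, C, H, O respectively
exponents = [eigen_exponent(d) for d in (1, 2, 4, 8)]
```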

27.3 Extreme Points of the Degenerate Wishart Distribution and Vandermonde Determinant

Setting p = (r + 1 − ℓ)d/2 − 1 gives a related result as in [3, Theorem 8.5.4], and thus we give the following generalization.

Theorem 27.6 The maximum of

U(\lambda_1, \ldots, \lambda_r) = \left[ \prod_{i=1}^{r} \lambda_i^{p/d} \exp\left( -\frac{1}{d}\lambda_i \right) \prod_{1 \le i < j \le r} |\lambda_i - \lambda_j| \right]^{d}, \qquad p > 0,    (27.17)

is attained when λ₁, …, λ_r are the zeros of the Laguerre polynomial L_r^{(2p/d-1)}\!\left( \frac{2}{d}\lambda \right). If p = 0 in (27.17), then the maximum is instead given by the roots of \lambda L_{r-1}^{(1)}\!\left( \frac{2}{d}\lambda \right). Note that this means that one of the coordinates must be equal to 0.

Proof We notice that the case d = 1 follows exactly as in [3, Theorem 8.5.4], where the points λ₁, …, λ_r that maximize U are the zeros of the Laguerre polynomial L_r^{(2p-1)}. To prove the statement for general d = 2, 4, 8 we proceed from (27.17) as follows. The change of variables y_i = λ_i/d, so that λ_i = d y_i, gives

U(y) = d^{\,d\left( r(r-1)/2 + rp/d \right)} \left[ \prod_{i=1}^{r} y_i^{p/d} \exp(-y_i) \prod_{1 \le i < j \le r} |y_i - y_j| \right]^{d}.

It follows that the maximum of U occurs when ∂T/∂y_i = 0, i = 1, 2, …, r, where

T(y) = -\log U(y) = -d\left( \frac{r(r-1)}{2} + \frac{rp}{d} \right)\ln d - d\left[ \sum_{i=1}^{r} \frac{p}{d}\ln(y_i) - \sum_{i=1}^{r} y_i + \sum_{1 \le i < j \le r} \ln|y_j - y_i| \right].

It follows that

\frac{\partial T}{\partial y_i} = -d\left[ \frac{p}{d}\frac{1}{y_i} - 1 + \sum_{j \ne i} \frac{1}{y_i - y_j} \right] = 0.    (27.18)

Since d is not zero, this generates a set of r nonlinear equations in r unknowns. Now consider a monic polynomial f of degree r whose zeros y₁, …, y_r satisfy (27.18); that is, set f(y) = \prod_{i=1}^{r}(y - y_i), so that the discriminant D of f becomes (the square of the Vandermonde determinant)

D(y_1, \ldots, y_r) = \prod_{1 \le i < j \le r} (y_j - y_i)^2 = \prod_{i=1}^{r} f'(y_i).

Taking the natural logarithm of this discriminant and differentiating, we obtain

\frac{\partial}{\partial y_i} \ln D(y_1, \ldots, y_r) = \sum_{j \ne i} \frac{2}{y_i - y_j} = \frac{f''(y_i)}{f'(y_i)}, \qquad i = 1, 2, \ldots, r.

Substituting this into (27.18), we obtain

\frac{p}{d}\frac{1}{y_i} - 1 + \frac{1}{2}\frac{f''(y_i)}{f'(y_i)} = 0 \quad \Longleftrightarrow \quad y_i f''(y_i) + 2\left( \frac{p}{d} - y_i \right) f'(y_i) = 0, \qquad i = 1, 2, \ldots, r.

The left-hand side of the equation

y_i f''(y_i) + 2\left( \frac{p}{d} - y_i \right) f'(y_i) = 0, \qquad i = 1, 2, \ldots, r,

is a polynomial of degree at most r which vanishes at the r points y_i; since f is a polynomial of degree r, the left-hand side must be a multiple of f(y). Therefore, we deduce that f(y) satisfies the differential equation

y f''(y) + 2\left( \frac{p}{d} - y \right) f'(y) + \lambda f(y) = 0    (27.19)

for some constant λ. To find the value of λ we can match the terms containing y^r on both sides of the expression. This gives −2r + λ = 0 and thus λ = 2r.

Now, Eq. (27.19) is equivalent to the differential equation for the Laguerre polynomials, L̂(y) = L_r^{(2α-1)}(2y), given by

y \frac{d^2 \hat{L}(y)}{dy^2} + 2(\alpha - y)\frac{d\hat{L}(y)}{dy} + 2r\,\hat{L}(y) = 0,

if we set α = p/d. Since p = (r + 1 − ℓ)d/2 − 1, the points λ₁, …, λ_r that maximize (27.17) are the zeros of the Laguerre polynomial

L_r^{\left( 2\left( (r+1-\ell)d/2 - 1 \right)/d - 1 \right)}\!\left( \frac{2}{d}\lambda \right).

In the case when p = 0, the differential equation (27.19) becomes y f''(y) − 2y f'(y) + 2r f(y) = 0, whose polynomial solutions are given by f(y) = y L_{r-1}^{(1)}(2y), and the coordinates are given by the roots of \lambda L_{r-1}^{(1)}\!\left( \frac{2}{d}\lambda \right).
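Theorem 27.6 can be spot-checked numerically. The following sketch (illustrative, not part of the chapter) takes the case r = 2, d = 1, p = 1/2 from Example 27.2 below Table 27.3, verifies the stationarity equations at the Laguerre zeros, and checks that U is strictly smaller at nearby grid points:

```python
import math

# Lorentz-cone case with lambda = 2, m = 3, so p = 1/2, d = 1, r = 2 (assumed parameters)
p, d, r = 0.5, 1, 2

def U(lam):
    # U(lam_1,...,lam_r) = [prod_i lam_i^(p/d) exp(-lam_i/d) * prod_{i<j} |lam_i - lam_j|]^d
    val = 1.0
    for x in lam:
        val *= x ** (p / d) * math.exp(-x / d)
    for i in range(r):
        for j in range(i + 1, r):
            val *= abs(lam[i] - lam[j])
    return val ** d

# Claimed maximizer: zeros of L_2^{(0)}(2*lam), i.e. 2*lam^2 - 4*lam + 1 = 0
crit = (1 - math.sqrt(2) / 2, 1 + math.sqrt(2) / 2)

# Stationarity for d = 1: p/lam_i - 1 + sum_{j != i} 1/(lam_i - lam_j) = 0
grad = [p / crit[i] - 1 + sum(1 / (crit[i] - crit[j]) for j in range(r) if j != i)
        for i in range(r)]

u_star = U(crit)
# U evaluated on the eight surrounding grid points is strictly smaller
neighbours = max(U((crit[0] + a, crit[1] + b))
                 for a in (-0.05, 0.0, 0.05) for b in (-0.05, 0.0, 0.05)
                 if (a, b) != (0.0, 0.0))
```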

Example 27.2 In Eq. (27.8), which is the two-dimensional non-degenerate case of the Lorentz cone, the coordinates of the maximum point are given by the roots of the polynomial L_2^{(2p/d-1)}\!\left( \frac{2}{d}\lambda \right) with p = λ − m/2 and d = m − 2, provided λ > m/2. If λ = m/2, then the coordinates of the maximum point are given by the roots of \lambda L_1^{(1)}\!\left( \frac{2}{d}\lambda \right).

For the case when λ = m/2 it is easy to give an expression for the coordinates. Since \lambda L_1^{(1)}\!\left( \frac{2}{d}\lambda \right) = 2\lambda\left( 1 - \frac{\lambda}{m-2} \right), the maximum point is given by λ₁ = 0 and λ₂ = m − 2. For the case when λ > m/2 we illustrate the location of the maximum for some combinations of λ and m (see Table 27.3).

As a simple example we take λ = 2 and m = 3. These values of λ and m give p = 1/2 and d = 1. In this case Eq. (27.8) reduces to

f(\lambda_1, \lambda_2) = \frac{2\sqrt{\pi}}{\Gamma(2)\,\Gamma(3/2)}\,(\lambda_1\lambda_2)^{1/2}\,(\lambda_2 - \lambda_1)\,\exp(-\lambda_1 - \lambda_2).

Theorem 27.6 gives that the maximum is given by the solutions of

L_2^{(0)}(2\lambda) = 2\lambda^2 - 4\lambda + 1 = 0.

Thus, the zeros become λ₁ = 1 − √2/2 and λ₂ = 1 + √2/2. In a similar fashion the rest of the results in Table 27.3 can be computed. Illustrations of the locations of the extreme points are shown in Figs. 27.1, 27.2, 27.3, 27.4, and 27.5.

Thus, the zeros become λ1 = 22 − 1 and λ2 = 22 − 1. In a similar fashion the rest of the results in Table 27.3 can be computed. Illustration of the locations of the extreme points are shown in Figs. 27.1, 27.2, 27.3, 27.4, and 27.5. Example 27.3 In Eqs. (27.9), (27.10) and (27.11) we have the case of threedimensional non-degenerate Wishart ensembles whose distributions are special cases of the formula in Theorem 27.4. The maximum points can be found by finding the (2(λ− n3 )d −1 −1)  2  x . zeros of the Laguerre polynomial L 3 d


Table 27.3 Examples illustrating Eq. (27.8) for the Lorentz case of non-degenerate Wishart ensembles, with chosen values of λ and m and the corresponding Laguerre polynomials given by Theorem 27.6. When λ = m/2 the coordinates are given by (λ₁, λ₂) = (0, m − 2).

λ | m  | p = λ − m/2 | d = m − 2 | α = 2p/d − 1 | L₂^(α)(2x/d)           | λ₁         | λ₂
2 | 3  | 1/2  | 1 | 0    | 2x² − 4x + 1            | 1 − √2/2   | 1 + √2/2
3 | 3  | 3/2  | 1 | 2    | 2x² − 8x + 6            | 1          | 3
3 | 4  | 1    | 2 | 0    | x²/2 − 2x + 1           | 2 − √2     | 2 + √2
3 | 5  | 1/2  | 3 | −2/3 | 2x²/9 − 8x/9 + 2/9      | 2 − √3     | 2 + √3
4 | 3  | 5/2  | 1 | 4    | 2x² − 12x + 15          | 3 − √6/2   | 3 + √6/2
4 | 4  | 2    | 2 | 1    | x²/2 − 3x + 3           | 3 − √3     | 3 + √3
4 | 5  | 3/2  | 3 | 0    | 2x²/9 − 4x/3 + 1        | 3 − 3√2/2  | 3 + 3√2/2
4 | 6  | 1    | 4 | −1/2 | x²/8 − 3x/4 + 3/8       | 3 − √6     | 3 + √6
4 | 7  | 1/2  | 5 | −4/5 | 2x²/25 − 12x/25 + 3/25  | 3 − √30/2  | 3 + √30/2
6 | 3  | 9/2  | 1 | 8    | 2x² − 20x + 45          | 5 − √10/2  | 5 + √10/2
6 | 4  | 4    | 2 | 3    | x²/2 − 5x + 10          | 5 − √5     | 5 + √5
6 | 5  | 7/2  | 3 | 4/3  | 2x²/9 − 20x/9 + 35/9    | 5 − √30/2  | 5 + √30/2
6 | 6  | 3    | 4 | 1/2  | x²/8 − 5x/4 + 15/8      | 5 − √10    | 5 + √10
6 | 8  | 2    | 6 | −1/3 | x²/18 − 5x/9 + 5/9      | 5 − √15    | 5 + √15
6 | 10 | 1    | 8 | −3/4 | x²/32 − 5x/16 + 5/32    | 5 − 2√5    | 5 + 2√5
8 | 3  | 13/2 | 1 | 12   | 2x² − 28x + 91          | 7 − √14/2  | 7 + √14/2
8 | 4  | 6    | 2 | 5    | x²/2 − 7x + 21          | 7 − √7     | 7 + √7
8 | 5  | 11/2 | 3 | 8/3  | 2x²/9 − 28x/9 + 77/9    | 7 − √42/2  | 7 + √42/2
8 | 6  | 5    | 4 | 3/2  | x²/8 − 7x/4 + 35/8      | 7 − √14    | 7 + √14
8 | 8  | 4    | 6 | 1/3  | x²/18 − 7x/9 + 14/9     | 7 − √21    | 7 + √21
8 | 10 | 3    | 8 | −1/4 | x²/32 − 7x/16 + 21/32   | 7 − 2√7    | 7 + 2√7

Fig. 27.1 Illustration of the extreme point of the distribution given by (27.8) with λ = 2 and m = 3, 4; the blue dot marks the maximum point, with coordinates given by the zeros of the corresponding Laguerre polynomial given in Table 27.3. When m = 4 we have λ = m/2 and the coordinates are given by (λ₁, λ₂) = (0, 2)

0.25

5

0.2

4

0.15

3

4

0.15

2

2

2 1

1

2

3

4

0.1

2

0.05

1

0

0 0

1

0.1 0.05

2

0

4 1

= 3, m = 5

6

5

= 3, m = 6

0.3

4

0.6

4

2

0.2

2 0.1

1 2

4 1

0

0.4

2

3

0 0

0.25 0.2

3

0 0

= 3, m = 4

2

0 0

0.2

2

4

6

0

1

Fig. 27.2 Illustration of the extreme points of the distribution given by (27.8) with λ = 3 and m = 3, 4, 5, the blue dot marks the maximum point with coordinates given by the zeros of the corresponding Laguerre polynomial given in Table 27.3. When m = 4 we have λ = m2 and the coordinates are given be (λ1 , λ2 ) = (0, 4)

Fig. 27.3 Illustration of the extreme points of the distribution given by (27.8) with λ = 4 and m = 3, 4, 5, 6, 7, 8; the blue dot marks the maximum point, with coordinates given by the zeros of the corresponding Laguerre polynomial given in Table 27.3. When m = 8 we have λ = m/2 and the coordinates are given by (λ₁, λ₂) = (0, 6)

Using the classical form of the Laguerre polynomial, that is,

L_n^{(\alpha)}(y) = \sum_{i=0}^{n} (-1)^i \binom{n+\alpha}{n-i} \frac{y^i}{i!},

we obtain

-\frac{3d^3}{4}\, L_3^{\left( 2(\lambda - n/3)d^{-1} - 1 \right)}\!\left( \frac{2}{d}x \right) = \hat{a}x^3 + \hat{b}x^2 + \hat{c}x + \hat{d},    (27.20)

Fig. 27.4 Illustration of the extreme points of the distribution given by (27.8) with λ = 6 and m = 3, 4, 5, 6, 8, 10; the blue dot marks the maximum point, with coordinates given by the zeros of the corresponding Laguerre polynomial given in Table 27.3

Fig. 27.5 Illustration of the extreme points of the distribution given by (27.8) with λ = 8 and m = 3, 4, 5, 6, 8, 10, the blue dot marks the maximum point with coordinates given by the zeros of the corresponding Laguerre polynomial given in Table 27.3


where
$$\hat{a} = 1, \qquad \hat{b} = -3\lambda - 3d + n,$$
$$\hat{c} = 3\lambda^2 - \frac{(-486d + 216n)\lambda}{108} + \frac{3}{2}d^2 - \frac{3n}{2}d + \frac{n^2}{3},$$
$$\hat{d} = -\lambda^3 - \frac{162d - 108n}{108}\lambda^2 - \frac{54d^2 - 108nd + 36n^2}{108}\lambda + \frac{d^2 n}{6} - \frac{n^2 d}{6} + \frac{n^3}{27}.$$
Using the substitution $x = t - \hat{b}/(3\hat{a})$ in (27.20) gives
$$t^3 + \left(\frac{(n - 3\lambda)d}{2} - \frac{3}{2}d^2\right)t + \frac{(n - 3\lambda)d^2}{6} - \frac{d^3}{2} = t^3 + pt + q, \qquad (27.21)$$
from which we obtain
$$p = \hat{c} - \frac{1}{3}\hat{b}^2 = -\frac{3}{2}d^2 + \frac{(n - 3\lambda)d}{2},$$
$$q = \hat{d} - \frac{1}{3}\hat{b}\hat{c} + \frac{2}{27}\hat{b}^3 = -\frac{d^3}{2} + \frac{(n - 3\lambda)d^2}{6}.$$
It follows that to have three distinct roots for (27.20) and (27.21) the discriminant must be greater than zero, that is
$$\Delta = \hat{b}^2\hat{c}^2 - 4\hat{a}\hat{c}^3 - 4\hat{b}^3\hat{d} - 27\hat{a}^2\hat{d}^2 + 18\hat{a}\hat{b}\hat{c}\hat{d} = -4p^3 - 27q^2 = \frac{1}{4}d^3(3d + 6\lambda - 2n)(3d + 3\lambda - n)^2 > 0.$$
The three distinct roots can be generated using the Cardano formula
$$t_1 = \gamma_1 + \gamma_2,$$
$$t_2 = \gamma_1 \frac{-1 + i\sqrt{3}}{2} + \gamma_2 \frac{-1 - i\sqrt{3}}{2},$$
$$t_3 = \gamma_1 \frac{-1 - i\sqrt{3}}{2} + \gamma_2 \frac{-1 + i\sqrt{3}}{2}, \qquad (27.22)$$
where
$$\gamma_1 = \sqrt[3]{-\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}} = \sqrt[3]{-\frac{q}{2} + \sqrt{-\frac{\Delta}{108}}},$$
$$\gamma_2 = \sqrt[3]{-\frac{q}{2} - \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}} = \sqrt[3]{-\frac{q}{2} - \sqrt{-\frac{\Delta}{108}}}.$$
With $x = t - \frac{\hat{b}}{3\hat{a}}$, it follows from (27.22) that
$$x_1 = -\frac{\hat{b}}{3\hat{a}} + \gamma_1 + \gamma_2,$$
$$x_2 = -\frac{\hat{b}}{3\hat{a}} + \gamma_1 \frac{-1 + i\sqrt{3}}{2} + \gamma_2 \frac{-1 - i\sqrt{3}}{2},$$
$$x_3 = -\frac{\hat{b}}{3\hat{a}} + \gamma_1 \frac{-1 - i\sqrt{3}}{2} + \gamma_2 \frac{-1 + i\sqrt{3}}{2}.$$
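These formulas can be checked numerically. The sketch below uses illustrative values λ = 5, n = 3, d = 2 (an assumption of the sketch, chosen so the discriminant is positive), builds $\hat b, \hat c, \hat d$ from the coefficient expressions, and solves the depressed cubic with the trigonometric form of Cardano's solution, which is equivalent to (27.22) when all three roots are real:

```python
import math

# illustrative values (an assumption of this sketch) with positive discriminant
lam, n, d = 5.0, 3.0, 2.0

b_hat = -3*lam - 3*d + n
c_hat = 3*lam**2 - (-486*d + 216*n)*lam/108 + 1.5*d**2 - 1.5*n*d + n**2/3
d_hat = (-lam**3 - (162*d - 108*n)/108*lam**2
         - (54*d**2 - 108*n*d + 36*n**2)/108*lam
         + d**2*n/6 - n**2*d/6 + n**3/27)

# depressed cubic t^3 + p t + q obtained from x = t - b_hat/3
p = c_hat - b_hat**2/3
q = d_hat - b_hat*c_hat/3 + 2*b_hat**3/27

disc = -4*p**3 - 27*q**2   # positive: three distinct real roots

# trigonometric form of Cardano's solution for the three-real-root case
roots = []
for k in range(3):
    t = 2*math.sqrt(-p/3)*math.cos(
        (math.acos(3*q/(2*p)*math.sqrt(-3/p)) - 2*math.pi*k)/3)
    roots.append(t - b_hat/3)

# each root should annihilate the original cubic x^3 + b x^2 + c x + d
residuals = [x**3 + b_hat*x**2 + c_hat*x + d_hat for x in roots]
```

With these values the cubic is x³ − 18x² + 90x − 120, and the three computed roots sum to −b̂ = 18, as they must.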

Now, for Eq. (27.9) with, for example, λ = 5, we get
$$f(\lambda_1, \lambda_2, \lambda_3) = \frac{3(\lambda_1\lambda_2\lambda_3)^2}{\Gamma(5)\Gamma(4)\Gamma(3)} \prod_{1 \le i < j \le 3} (\lambda_j - \lambda_i)^2 \exp(-\lambda_1 - \lambda_2 - \lambda_3),$$
and the corresponding Laguerre polynomial becomes
$$L_3^{(1)}(x) = -\frac{1}{6}x^3 + 2x^2 - 6x + 4 \qquad (27.23)$$
and its roots are λ1 = 7.7588, λ2 = 3.3054 and λ3 = 0.9358.
In Eq. (27.10) with, for example, λ = 7, we get
$$f(\lambda_1, \lambda_2, \lambda_3) = \frac{(\lambda_1\lambda_2\lambda_3)^2}{120\,\Gamma(7)\Gamma(5)\Gamma(3)} \prod_{1 \le i < j \le 3} (\lambda_j - \lambda_i)^4 \exp(-\lambda_1 - \lambda_2 - \lambda_3),$$
and the corresponding Laguerre polynomial becomes
$$L_3^{(0)}\left(\frac{x}{2}\right) = -\frac{1}{48}x^3 + \frac{3}{8}x^2 - \frac{3}{2}x + 1, \qquad (27.24)$$
and its roots are λ1 = 12.5799, λ2 = 4.5886 and λ3 = 0.8315.
In Eq. (27.11), with, for example, λ = 11, we get
$$f(\lambda_1, \lambda_2, \lambda_3) = \frac{6^3(\lambda_1\lambda_2\lambda_3)^2}{11!\,7!\,\Gamma(11)\Gamma(7)\Gamma(3)} \prod_{1 \le i < j \le 3} (\lambda_j - \lambda_i)^8 \exp(-\lambda_1 - \lambda_2 - \lambda_3).$$
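The quoted roots can be reproduced with a numerical root finder; the monic cubics below are obtained by clearing the leading coefficients of (27.23)–(27.25):

```python
import numpy as np

# monic multiples of the polynomials in (27.23)-(27.25)
cases = {
    "27.23": [1, -12, 36, -24],    # -6   * L_3^{(1)}(x)
    "27.24": [1, -18, 72, -48],    # -48  * L_3^{(0)}(x/2)
    "27.25": [1, -30, 180, -120],  # -384 * L_3^{(-1/2)}(x/4)
}
roots = {k: sorted(np.roots(v).real, reverse=True) for k, v in cases.items()}
```

The descending root lists match the four-decimal values reported in the text and in Table 27.4.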


Table 27.4 The example to illustrate Eqs. (27.9), (27.10), and (27.11) for non-degenerate Wishart ensembles

E            Ω        λ    d   α      L_3^{(α)}(2x/d)                            λ1        λ2       λ3
Herm(3, C)   Π3(C)    5    2   1      −x³/6 + 2x² − 6x + 4                       7.7588    3.3054   0.9358
Herm(3, H)   Π3(H)    7    4   0      −x³/48 + 3x²/8 − 3x/2 + 1                  12.5799   4.5886   0.8315
Herm(3, O)   Π3(O)    11   8   −1/2   −x³/384 + 5x²/64 − 15x/32 + 5/16           22.1014   7.1380   0.7607

and the corresponding Laguerre polynomial becomes
$$L_3^{(-1/2)}\left(\frac{x}{4}\right) = -\frac{1}{384}x^3 + \frac{5}{64}x^2 - \frac{15}{32}x + \frac{5}{16} \qquad (27.25)$$
and its roots are given by λ1 = 22.1014, λ2 = 7.1380 and λ3 = 0.7607.
The location of the extreme points of the distribution in Theorem 27.5 can be visualized. First we note that expression (27.12) is of the same form as (27.17) and the critical points of an expression of that form will satisfy the equation system (27.18),

$$\frac{p}{y_i} - 1 + \sum_{\substack{j=1 \\ j \ne i}}^{r} \frac{1}{y_i - y_j} = 0, \qquad i = 1, \dots, r,$$
where $y_i = \lambda_i/d$. If all equations in the equation system above are summed together the following equation is obtained
$$\sum_{i=1}^{r} \frac{d}{\lambda_i} = \frac{r}{p}. \qquad (27.26)$$
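For the first row of Table 27.4 (λ = 5, d = 2, r = 3) the system and its summed form can be verified numerically; the value p = 1 used below is inferred from (27.26) for this row and is an assumption of the sketch:

```python
import numpy as np

# lambda_i are the zeros of L_3^{(1)}(x), via the monic multiple -6 * L_3^{(1)}
lam = np.sort(np.roots([1, -12, 36, -24]).real)
d, r, p = 2.0, 3, 1.0          # p inferred from (27.26) for this row
y = lam / d

# each critical-point equation p/y_i - 1 + sum_{j != i} 1/(y_i - y_j) = 0
residuals = [p/y[i] - 1 + sum(1/(y[i] - y[j]) for j in range(r) if j != i)
             for i in range(r)]

# summing the system gives (27.26): sum_i d/lambda_i = r/p
lhs_2726 = float(np.sum(d / lam))
```

The residuals vanish to machine precision and the summed identity gives 3 = r/p.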

When r = 3 and 0 < λ1 < λ2 < λ3, Eq. (27.26) defines a surface that can be drawn and coloured according to the value of (27.12); some examples of this visualisation are drawn in Fig. 27.6a, c and a summary of the corresponding results is given in Table 27.4.

Example 27.4 In this example it will be illustrated where the maximum points of the distribution can be found for the degenerate Wishart distributions given by


Fig. 27.6 Illustration of the extreme points of the distributions for non-degenerate Wishart ensembles given by (27.9)–(27.11). The blue dots mark the location of the extreme points given by the roots of the corresponding polynomial. The coordinates of the extreme points can be found in Table 27.4


Eqs. (27.13), (27.14), (27.15), and (27.16). Using the classical form of the Laguerre polynomial, $L_n^{(\alpha)}(x)$, that is,
$$L_n^{(\alpha)}(x) = \sum_{i=0}^{n} (-1)^i \binom{n+\alpha}{n-i} \frac{x^i}{i!},$$
from which
$$L_2^{(\alpha)}(x) = \frac{x^2}{2} - (\alpha + 2)x + \frac{(\alpha + 1)(\alpha + 2)}{2} = ax^2 + bx + c,$$
where $a = \frac{1}{2}$, $b = -(\alpha + 2)$ and $c = \frac{1}{2}(\alpha + 1)(\alpha + 2)$. Thus for distinct real roots to exist the discriminant must be positive:
$$b^2 - 4ac = [-(\alpha + 2)]^2 - 4 \times \frac{1}{2} \times \frac{1}{2}(\alpha + 1)(\alpha + 2) = \alpha + 2 > 0,$$
that is α > −2 or α ∈ (−2, ∞).
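The simplification of the discriminant to α + 2 is easy to confirm numerically:

```python
def discriminant_L2(alpha):
    """Discriminant b^2 - 4ac of L_2^{(alpha)}(x) = x^2/2 - (alpha+2)x + (alpha+1)(alpha+2)/2."""
    a = 0.5
    b = -(alpha + 2.0)
    c = 0.5 * (alpha + 1.0) * (alpha + 2.0)
    return b*b - 4.0*a*c

# the simplified value alpha + 2 should agree for any sample alpha
checks = [abs(discriminant_L2(al) - (al + 2.0)) < 1e-12
          for al in (-1.0, 0.0, 0.5, 0.75, 3.0)]
```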


According to Theorem 27.6 we are looking for the roots of $L_n^{\left(\frac{2p}{d} - 1\right)}\left(\frac{2\lambda}{d}\right)$, where p = (r + 1 − )d/2 − 1. Thus α = 2p/d − 1 ≥ −1 > −2 and we will always have real roots.

For the distribution given by (27.13), where r = 2, p = 0 and d = 1, the extreme points will occur when the spectral eigenvalues λ2 and λ3 are the roots of the polynomial $L_2^{(-1)}(2\lambda) = 2\lambda^2 - 2\lambda$. Thus we get λ2 = 0 and λ3 = 1. Putting these values into (27.13) gives $f(\lambda_2, \lambda_3) = \frac{4}{e\sqrt{\pi}}$.

For the distribution given by (27.14), where r = 2, p = 1 and d = 2, the extreme points will occur when the spectral eigenvalues λ2 and λ3 are the roots of the polynomial $L_2^{(0)}(\lambda) = \frac{\lambda^2}{2} - 2\lambda + 1$. Thus we get $\lambda_2 = 2 - \sqrt{2}$ and $\lambda_3 = 2 + \sqrt{2}$. Putting these values into (27.14) gives $f(\lambda_2, \lambda_3) = 16\sqrt{\pi}\, e^{-4}$.

For the distribution given by (27.15), where r = 2, p = 3 and d = 4, the extreme points will occur when the spectral eigenvalues λ2 and λ3 are the roots of the polynomial $L_2^{(1/2)}\left(\frac{\lambda}{2}\right) = \frac{\lambda^2}{8} - \frac{5}{4}\lambda + \frac{15}{8}$. Thus we get $\lambda_2 = 5 - \sqrt{10}$ and $\lambda_3 = 5 + \sqrt{10}$. Putting these values into (27.15) gives $f(\lambda_2, \lambda_3) = \frac{360000}{10395}\, e^{-10}$.

For the distribution given by (27.16), where r = 2, p = 7 and d = 8, the extreme points will occur when the spectral eigenvalues λ2 and λ3 are the roots of the polynomial $L_2^{(3/4)}\left(\frac{\lambda}{4}\right) = \frac{\lambda^2}{32} - \frac{11}{16}\lambda + \frac{77}{32}$. Thus we get $\lambda_2 = 11 - 2\sqrt{11}$ and $\lambda_3 = 11 + 2\sqrt{11}$. Putting these values into (27.16) gives $f(\lambda_2, \lambda_3) = \frac{2968221073324638208}{51655573125}\, e^{-22}$.

The results for the distributions given by (27.13)–(27.16) are compiled in Table 27.5 and illustrated in Fig. 27.7.

Table 27.5 The spectral eigenvalues that give the extreme points of the distributions for degenerate Wishart ensembles given by (27.13)–(27.16)

E            Ω        p   d   λ2           λ3           f(λ2, λ3)
Sym(2, R)    Π2(R)    0   1   0            1            4/(e√π)
Herm(2, C)   Π2(C)    1   2   2 − √2       2 + √2       16√π e^{−4}
Herm(2, H)   Π2(H)    3   4   5 − √10      5 + √10      (360000/10395) e^{−10}
Herm(2, O)   Π2(O)    7   8   11 − 2√11    11 + 2√11    (2968221073324638208/51655573125) e^{−22}

Fig. 27.7 Illustration of the extreme points of the distributions for degenerate Wishart ensembles given by (27.13)–(27.16). The blue dots mark the location of the extreme points as given in Table 27.5
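The four root pairs in Table 27.5 follow from the quadratics above; a sketch (using monic multiples of the four polynomials):

```python
import math

# monic multiples of the quadratics for (27.13)-(27.16)
cases = {
    "27.13": (1.0, -1.0, 0.0),    # (1/2) * L_2^{(-1)}(2 lam)
    "27.14": (1.0, -4.0, 2.0),    # 2     * L_2^{(0)}(lam)
    "27.15": (1.0, -10.0, 15.0),  # 8     * L_2^{(1/2)}(lam/2)
    "27.16": (1.0, -22.0, 77.0),  # 32    * L_2^{(3/4)}(lam/4)
}

def quad_roots(a, b, c):
    """Both real roots of a x^2 + b x + c, smallest first."""
    s = math.sqrt(b*b - 4*a*c)
    return (-b - s) / (2*a), (-b + s) / (2*a)

roots = {k: quad_roots(*v) for k, v in cases.items()}
```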


27.4 Conclusion

The Wishart probability distributions can be generalized to higher dimensions based on the boundary points of the symmetric cones in Jordan algebras. This density is mainly characterised by the structure of the Vandermonde determinant and an exponential weight that depends on the trace of the given matrix. The symmetric cones, especially the Gindikin set, form a suitable basis for the construction of the degenerate and non-degenerate Wishart distributions on Herm(m, C), Herm(m, H) and Herm(3, O), which denote respectively the Jordan algebras of Hermitian m × m matrices with entries in the complex numbers C, the skew field H of quaternions, and the algebra O of octonions. As has been demonstrated, the extreme points of the Vandermonde determinant that maximize the joint eigenvalue probability density for both degenerate and non-degenerate Wishart ensembles are indeed the zeros of the classical Laguerre polynomial of degree n.

Acknowledgements We acknowledge the financial support for this research by the Swedish International Development Agency (Sida), Grant No. 316, International Science Program (ISP) in Mathematical Sciences (IPMS). We are also grateful to the Division of Applied Mathematics, Mälardalen University, for providing an excellent and inspiring environment for research education and research.


Chapter 28

Option Pricing and Stochastic Optimization

Nataliya Shchestyuk and Serhii Tyshchenko

Abstract In this paper we propose an approach to option pricing which is based on the solution of the investor problem. We demonstrate that the link between optimal option pricing from the investor's point of view and risk measuring is especially close, and it is given by stochastic optimization. We consider the optimal option price X∗ as the optimal decision of the investor, who should maximize the expected profit. This is possible because the average value-at-risk AV@R is related to a simple stochastic optimization problem with a piecewise linear profit/cost function and, as was proved in [12], the maximal value is attained. If we consider investing in a European option, then the profit/cost function is a payoff function Y(S) of a European call or put option and the optimal decision can be found as X∗ = V@R_α(Y), where the parameter α can be computed using interest rates for borrowing and lending and reflects the level of the real economic environment. We illustrate our results for the GBM model and Student-like models with dependence (FAT models) and determine the optimal option price as the optimal amount to invest for these cases. Meanwhile we measure and manage risk for these models.

Keywords European option · Payoff function · Value-at-risk

MSC 2020 91G20

N. Shchestyuk (B) · S. Tyshchenko
NaUKMA, Skovoroda str., 2, Kyiv, Ukraine
e-mail: [email protected]
S. Tyshchenko
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_28

28.1 Introduction

As a matter of fact the Black–Scholes model (BS) [2], also known as the Black–Scholes–Merton (BSM) model, is one of the most important concepts for valuing options in modern financial theory, both in terms of approach and applicability. The model


N. Shchestyuk and S. Tyshchenko

had a huge impact on theoretical studies in the theory of stochastic processes and financial mathematics and led to a boom in options trading. The Black–Scholes model assumes that the market consists of at least one risky asset with price $S_t$, usually called the stock, and one riskless asset, usually called the money market, cash, or bond $B_t$. Assume the price of the bond evolves as
$$B_t = B_0 e^{rt}, \qquad (28.1)$$

and the movement of the underlying risky asset $S_t$ follows a geometric Brownian motion (GBM):
$$dS_t = \mu S_t\, dt + \sigma S_t\, dW_t, \qquad (28.2)$$
where r is the annualized risk-free interest rate, continuously compounded; μ is the annualized drift rate of S(t); σ is the annualized standard deviation of the stock's returns; $W_t$ is a standard Brownian motion (BM).

With some assumptions holding, suppose there is a derivative security also trading in this market. We specify that this security will have a certain payoff at a specified date in the future, depending on the value(s) taken by the stock up to that date. For the special case of a European call or put option, Black and Scholes showed that "it is possible to create a hedged position, consisting of a long position in the stock and a short position in the option, whose value will not depend on the price of the stock" [2]. Their dynamic hedging strategy led to a partial differential equation which governs the price of the option. Its solution is called the fair price of the option and is given by the Black–Scholes formula:
$$C(T, K, S_0, r, \sigma) = S_0 \Phi(d_1) - K e^{-rT} \Phi(d_2), \qquad (28.3)$$
where
$$d_1 = \frac{\log\frac{S_0}{K} + rT + \frac{1}{2}\sigma^2 T}{\sigma\sqrt{T}}, \qquad d_2 = \frac{\log\frac{S_0}{K} + rT - \frac{1}{2}\sigma^2 T}{\sigma\sqrt{T}} \qquad (28.4)$$
are both functions of five parameters: T, K, S₀, r, σ, and Φ(·) is the standard normal cumulative distribution function.

The model is often good as a first approximation, but even Fischer Black pointed out the holes in his model: "People like the model because they can easily understand its assumptions. The model is often good as a first approximation and if you can see the holes in the assumptions you can use the model in more sophisticated ways" [3]. Actually, recent history in the wake of numerous market events has questioned some assumptions of the model, and it has been shown that the GBM model lacks many empirically found features: real financial data have heavy-tailed returns, squared returns are positively correlated, and the volatility is time-varying and clustered. Finally, one of the most significant assumptions of the Black–Scholes theory is its assumption that the market is complete. However, in reality markets are incomplete, meaning that some payoffs cannot be replicated by trading in marketed securities. The classic no-arbitrage theory of valuation in a complete market, based on the unique price of a self-financing replicating portfolio, is not adequate for nonreplicable payoffs in incomplete markets.

What can be done? An extensive literature already exists, where more elaborate models are considered. First of all there is the use of Lévy processes instead of BM in GBM (see for example Eberlein and Raible, 1999, or Andersen, A., 2008). There is also the use of the fractional BM (B. Mandelbrot, 1997; Y. Mishura, 2008). Later, general stochastic volatility models (e.g. the Heston model, GARCH model and SABR volatility model) were proposed. After that, two-factor volatility models [10] (Canhanga, B., Malyarenko, A., Ni, Y. and Silvestrov, S.) appeared. These models include one fast-changing and another slow-changing stochastic volatility of mean-reversion type. The different changing frequencies of the volatilities can be interpreted as the effects of weekends and the effects of seasons of the year (summer and winter) on the asset price. In contrast to the two-factor volatility models, in 1999 Heyde presented a model which does not only have a calendar clock, but a "fractal clock" too. This "fractal clock" can be interpreted as the time over which market prices evolve and is often associated with trading volume or the flow of new price-sensitive information [7–9]. Also, a variety of approaches have been suggested to get round the problem of option valuation in incomplete markets, none of them perhaps entirely satisfactory. The following aspects were considered: the theory of no-arbitrage bounds (El Karoui and Quenez, 1995), indifference prices (Jaschke and Küchler, 2001), good deal bounds (Cochrane and Requejo, 2000), min-max pricing measures (Skiadas, 2006), expected utility in quadratic (Pham, 2000 and Schweizer, 2001) or in exponential forms (Delbaen et al., 2002), and the stochastic discount-factor approach (John H. Cochrane, 2005). For more details see, for instance, the papers [4, 5], which have reviewed the literature on option pricing in incomplete markets.
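For concreteness, the Black–Scholes benchmark (28.3)–(28.4) translates directly into code (a sketch; writing the standard normal CDF via the error function is an implementation choice, and the sample parameters in the usage note are illustrative):

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(T, K, S0, r, sigma):
    """European call price by the Black-Scholes formula (28.3)-(28.4)."""
    d1 = (log(S0 / K) + r*T + 0.5*sigma**2*T) / (sigma * sqrt(T))
    d2 = (log(S0 / K) + r*T - 0.5*sigma**2*T) / (sigma * sqrt(T))
    return S0 * norm_cdf(d1) - K * exp(-r*T) * norm_cdf(d2)
```

For S0 = K = 100, r = 0.05, σ = 0.2 and T = 1 this evaluates to approximately 10.45, the standard textbook value.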
It is important that the real problem of pricing in incomplete markets essentially depends on the objective [5]. For example, one of the goals in setting bid and ask prices is to ensure that any trade undertaken at these prices is advantageous to the firm. Another objective is marking to market, the goal of which is to assign to the firm's portfolio of derivative securities a value, not a price, that is accurate from an accounting or actuarial perspective. For risk management the aim is to propose a pricing decision based on quantitative risk assessment, using models of incomplete markets [5]. The value-at-risk (V@R) as a quantitative risk assessment was developed as a systematic way to segregate extreme events, which are studied qualitatively over long-term history and broad market events, from everyday price movements, which are studied quantitatively using short-term data in specific markets. It was hoped that "Black Swans" would be preceded by increases in estimated V@R or increased frequency of V@R breaks, in at least some markets. The extent to which this has proven to be true is controversial.

In this paper we consider an approach to option pricing which is based on the problem of the investor. We demonstrate that the link between measuring and managing is especially close: it is given by stochastic optimization and uses some quantitative risk assessments (V@R, AV@R).


The paper is organized as follows. In Sect. 28.2 we first consider the investor problem, which is linked to stochastic optimization, and then review probability functionals and their properties to solve this problem. In Sect. 28.3 we discuss the main results. We demonstrate an approach of applying the investor problem to option pricing in an incomplete market. We define the optimal investment price for an option and how we can find it if the distribution of the payoff and interest rates for depositing and lending are known to the investor. In Sect. 28.4 we illustrate our approach for both the GBM model and a Student-like model with dependence (FAT models) and determine the optimal option price as the optimal amount to invest in these cases. Meanwhile we measure and manage risk for these models. In Sect. 28.3.3, we present numerical results.

28.2 Problem of Investor

28.2.1 Problem Statement

If the world does not obey a model's predictions, we can conclude that the model needs improvement. However, we can also conclude that the world is wrong, that some assets are "mispriced" and there are trading opportunities for a shrewd investor. Suppose that the shrewd investor has to make a decision about the amount X to invest if the actual available income is given to him by a random variable Y. The present value of each unit of investment is 1. If the income is less than the committed sum X, then a shortfall occurs. For the shortfall a price of $u|Y - X|^-$ has to be paid, where u > 1. On the other hand, if the income is larger than X, the remaining amount has only a value of $l|Y - X|^+$, where l < 1. The total sum (profit) function (see [6]) is
$$H(X, Y) = X - u|Y - X|^- + l|Y - X|^+. \qquad (28.5)$$
We can rewrite this function as
$$H(X, Y) = (1 - l)\left(X - \frac{1}{\alpha}|Y - X|^-\right) + lY,$$
where
$$\alpha = \frac{1 - l}{u - l}.$$
The optimal decision X should maximize the expected profit E(H):
$$E(H(X, Y)) = (1 - l)E\left(X - \frac{1}{\alpha}|Y - X|^-\right) + lE(Y). \qquad (28.6)$$
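The two forms of the profit function can be checked against each other on sample values (a sketch; the rates u, l and the sample incomes are illustrative):

```python
def profit(X, Y, u, l):
    """H(X, Y) = X - u |Y - X|^- + l |Y - X|^+, as in (28.5)."""
    return X - u * max(X - Y, 0.0) + l * max(Y - X, 0.0)

def profit_alt(X, Y, u, l):
    """Equivalent form H = (1 - l)(X - |Y - X|^-/alpha) + l Y with alpha = (1 - l)/(u - l)."""
    alpha = (1.0 - l) / (u - l)
    return (1.0 - l) * (X - max(X - Y, 0.0) / alpha) + l * Y

# illustrative values: u > 1 (shortfall penalty), 0 < l < 1 (leftover value)
pairs = [(10.0, 7.0), (10.0, 13.0), (5.0, 5.0)]
diffs = [abs(profit(X, Y, 1.2, 0.9) - profit_alt(X, Y, 1.2, 0.9)) for X, Y in pairs]
```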


To solve this optimization problem (as you will see later) we need to use some probability functionals.

28.2.2 Stochastic Optimization and Probability Functionals

Probability functionals have been objects of many theoretical and empirical investigations of risk measuring. For background on probability (risk) functionals see Pflug [6], for instance. For a seminal work on axiomatic definitions for risk functionals see Artzner et al. [1]. Let us recall the definitions of the probability (risk) functionals V@R and AV@R of level α. Let (Ω, F, P) be the probability space and suppose for p ∈ [1, +∞) a linear space L(p) of real valued random variables Y : Ω → R¹ such that E(|Y|^p) < ∞ is defined on it. The linear space L(p) becomes a normed space by defining
$$\|Y\|_p := \left(\int_{\Omega} |Y(\omega)|^p \, dP(\omega)\right)^{1/p} = \left(E(|Y|^p)\right)^{1/p}.$$

Note that the role of the parameter p in the context of stochastic optimization becomes clear when there is a need to build for the linear normed space L(p) its dual space L(q) for 1/p + 1/q = 1, including the pairs p = ∞, q = 1 and p = 1, q = ∞ [6].

Definition 28.1 The value-at-risk of level α, 0 < α ≤ 1 for a random variable Y ∈ L(p) is a probability functional, defined as the α-quantile of the profit (loss) function
$$V@R_\alpha(Y) = G^{-1}(\alpha) = \inf\{ y \in \mathbb{R} : \alpha \le G(y) \}, \qquad (28.7)$$
where G is the distribution function of Y ∈ L(p) and G^{-1} is the quantile function of α, 0 < α ≤ 1.

In general, even though the distribution function G may fail to possess a left or right inverse, the quantile function G^{-1} behaves as an "almost sure left inverse" for the distribution function, in the sense that G^{-1}(G(Y)) = Y almost surely. Moreover, the probability functional V@R has the following properties. The value-at-risk V@R_α, 0 < α ≤ 1 is
(i) translation-equivariant:
$$V@R_\alpha(Y + C) = V@R_\alpha(Y) + C$$


for all C ∈ R,
(ii) positively homogeneous:
$$V@R_\alpha(\lambda Y) = \lambda V@R_\alpha(Y)$$
for all λ ≥ 0,
(iii) comonotone additive:
$$V@R_\alpha(Y_1 + Y_2) = V@R_\alpha(Y_1) + V@R_\alpha(Y_2)$$
for any two comonotonic random variables Y₁, Y₂. Notice that the comonotonic case for random variables Y₁, Y₂ means P(Y₁ < u, Y₂ < v) = min(G₁(u), G₂(v)).

Although V@R is a very popular measure of risk, it has undesirable mathematical characteristics such as a lack of subadditivity and convexity. As an alternative measure of risk, the average value-at-risk AV@R is known to have better properties than V@R.

Definition 28.2 The average value-at-risk of level α, 0 < α ≤ 1 for a random variable Y ∈ L(p) is defined as
$$AV@R_\alpha(Y) = \frac{1}{\alpha} \int_0^\alpha G^{-1}(u) \, du, \qquad (28.8)$$
where G^{-1} is defined as in (28.7).

The average value-at-risk is also known under the names of conditional value-at-risk (CV@R), tail value-at-risk (TV@R) and expected shortfall. Moreover, defining the value-at-risk as the quantile, the AV@R appears as the average of these values, the average over u ∈ [0, α], and this justifies the name. Pflug [6] proved that AV@R has some useful properties. The average value-at-risk AV@R_α, 0 < α ≤ 1 is
(i) translation-equivariant:
$$AV@R_\alpha(Y + C) = AV@R_\alpha(Y) + C$$
for all C ∈ R,
(ii) positively homogeneous:
$$AV@R_\alpha(\lambda Y) = \lambda AV@R_\alpha(Y)$$
for all λ ≥ 0,
(iii) isotonic:


$$AV@R_\alpha(Y) \le AV@R_\alpha(E(Y|\mathcal{F}_1))$$
for all σ-algebras $\mathcal{F}_1$,
(iv) concave:
$$AV@R_\alpha(\lambda Y_1 + (1 - \lambda)Y_2) \ge \lambda AV@R_\alpha(Y_1) + (1 - \lambda)AV@R_\alpha(Y_2)$$
for all 0 ≤ λ ≤ 1,
(v) comonotone additive:
$$AV@R_\alpha(Y_1 + Y_2) = AV@R_\alpha(Y_1) + AV@R_\alpha(Y_2)$$
for any two comonotonic random variables Y₁, Y₂ (see the properties of V@R above);
(vi) strict:
$$AV@R_\alpha(Y) \le E(Y),$$
(vii) Lipschitz continuous on L¹ with constant 1/α:
$$|AV@R_\alpha(Y_1) - AV@R_\alpha(Y_2)| \le \frac{1}{\alpha}\|Y_1 - Y_2\|_1.$$

The AV@R_α(Y) is related to the value-at-risk V@R_α(Y) by the definitions (28.7), (28.8) and by the inequality (see Pflug [6])
$$AV@R_\alpha(Y) \le V@R_\alpha(Y). \qquad (28.9)$$
The next theorem of Rockafellar and Uryasev [12], which we use in the form proposed by Pflug and Römisch [6], states that the average value-at-risk can be represented as the optimal value of the following optimization problem
$$AV@R_\alpha(Y) = \max\left(X - \frac{1}{\alpha}E(|Y - X|^-)\right) \qquad (28.10)$$
over X ∈ R. The maximum is attained. The power of the formula (28.10) is apparent because such optimization problems are especially easy to solve numerically. In the context of the investor problem this theorem allows us to formulate the optimization problem (28.6) in terms of AV@R and V@R and states that the expected profit takes the maximal value
$$\max E(H(X, Y)) = (1 - l)AV@R_\alpha(Y) + lE(Y) \qquad (28.11)$$
and the maximum is attained at
$$\arg\max E(H(X, Y)) = G_\alpha^{-1}(Y). \qquad (28.12)$$
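The representation (28.10) can be illustrated by Monte Carlo: scanning X over a grid, the objective X − E(|Y − X|⁻)/α peaks near the α-quantile of Y, in line with (28.12). A sketch with an illustrative normal income distribution:

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.normal(10.0, 2.0, 50_000)   # illustrative income sample
alpha = 0.25

def objective(x):
    # inner function of (28.10): X - (1/alpha) E(|Y - X|^-)
    return x - np.mean(np.maximum(x - Y, 0.0)) / alpha

grid = np.linspace(5.0, 12.0, 701)
vals = np.array([objective(x) for x in grid])

X_star = grid[np.argmax(vals)]      # approximates the argmax in (28.12)
v_at_r = np.quantile(Y, alpha)      # empirical V@R_alpha(Y)
av_at_r = vals.max()                # approximates AV@R_alpha(Y)
```

The grid maximizer sits at the empirical α-quantile, and the attained maximum respects both (28.9) and the strictness property (vi).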


28.3 Applying Investor Problem for Option Pricing

28.3.1 Investor Optimal Price

We consider a similar investment problem for a market, which consists of banking processes with price $B_t$ and risky stocks with price $S_t$. The price of the banking processes evolves according to (28.1), where B₀ is a given constant that we take to be 1 without loss of generality, and r is the interest rate, which differs in terms of borrowing and depositing. Namely, the interest rate can be defined as LIBOR (London Interbank Offered Rate) r⁻ for borrowing or as LIBID (London Interbank Bid Rate) r⁺ for depositing. (LIBOR is always higher than LIBID, r⁻ > r⁺.) The price of the stock $S_t$ evolves according to some SDE. Suppose the SDE has a unique solution $S_t$; it is not necessarily GBM. Moreover, we don't make an assumption about constant volatility. But we do assume that the distribution of $S_T$ and all its moments are known to the investor.

Suppose that the investor is interested in option trading. If we take time to maturity T and strike price K, then for a call option the income Y of the investor is the payoff |S − K|⁺, where the distribution of the underlying risky asset is given to him. For a put option the income Y is |S − K|⁻. There are two scenarios in option trading: the actual payoff Y is less than the market price of the option X, and the actual payoff Y is more than the market price of the option X. In order to unite these scenarios, we introduce the function H(X, Y), (28.5), where 0 < l < 1 is the discounting factor, which is defined as
$$l = e^{-(r^-)T}, \qquad (28.13)$$
and u > 1 is the increasing factor, which is defined as
$$u = e^{(r^+)T}. \qquad (28.14)$$

For fair option pricing it is natural to find E(H(X, Y)). But for the investor problem we would like to find max E(H(X, Y)) over X ∈ R. This value is called the investor optimal option price.

Definition 28.3 The investor optimal price X∗ for an option with time to maturity T, strike price K and payoff Y = Y(S, K) is defined as a solution of the following optimization problem
$$E(H(X, Y)) = (1 - e^{-(r^-)T})E\left(X - \frac{1}{\alpha}|Y - X|^-\right) + e^{-(r^-)T}E(Y) \to \max \qquad (28.15)$$
over X ∈ R. The parameter α is given as
$$\alpha = \frac{1 - e^{-(r^-)T}}{e^{(r^+)T} - e^{-(r^-)T}}. \qquad (28.16)$$


Remark 28.1 In our model we use r⁻ as LIBOR (or borrowing rate) and r⁺ as LIBID (or deposit rate) instead of one risk-neutral rate r. The difference between the two is the bid-ask spread on these transactions. A bid-ask spread is the amount by which the ask price exceeds the bid price for money in the inter-bank market. The bid-ask spread is the de facto measure of market liquidity and can be used for describing the state of the economic environment. Markets at certain moments in time are more liquid than at others, for example, at crises. This should be reflected in lower bid-ask spreads. On the other hand, for any given currency many different types of interest rates are regularly quoted. These include mortgage rates, deposit rates, prime borrowing rates and so on. We can consider r⁺ as the deposit rate and r⁻ as the borrowing rate. Whether the interest spread is applicable in a situation depends on the credit risk: the higher the credit risk, the higher the spread between deposit and borrowing rates. For a real data series that includes the interest rate spread see the World Bank Group site [13].

Remark 28.2 If r⁺ coincides with r⁻ and equals r, then
$$\alpha = \frac{1}{1 + e^{rT}}.$$
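The level (28.16) and its special case in Remark 28.2 can be verified directly (a sketch; the rates and maturity are illustrative):

```python
from math import exp

def alpha_level(r_minus, r_plus, T):
    """alpha from (28.16), with l = exp(-r^- T) and u = exp(r^+ T)."""
    l = exp(-r_minus * T)
    u = exp(r_plus * T)
    return (1.0 - l) / (u - l)

r, T = 0.05, 1.0
a_equal = alpha_level(r, r, T)   # Remark 28.2: should equal 1/(1 + e^{rT})
```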

Proposition 28.1 The investor optimal price X^* for an option with time to maturity T, strike price K and payoff Y(S) can be found as

X^* = G_α^{-1}(Y) = V@R_α(Y), (28.17)

where G(Y) is the distribution function of the payoff Y(S) = |S − K|^+ for a call option and Y(S) = |S − K|^− for a put option.

Proof follows from (28.12), if we assume that the investor has to make a decision about the option premium X when the actual available payoff Y(S(T)) is given to him by the random variable S.

Remark 28.3 In order to find the quantile of the payoff distribution function G = G(Y) we need to construct this function. Suppose the probability density function f_S(x) for the underlying asset S is given. Then for a call option payoff with strike price K,

Y(S) = |S − K|^+ =
  S − K,  S > K,
  0,      S ≤ K,

its distribution density function can be written in the form:

g_Y(y) =
  f_S(y + K),                  y > 0,
  ∫_{−∞}^{0} f_S(u + K) du,    y = 0,
  0,                           y < 0,


N. Shchestyuk and S. Tyshchenko

because

P(Y = 0) = P(S(T) < K) = ∫_{−∞}^{K} f_S(u) du = ∫_{−∞}^{0} f_S(u + K) du.

The cumulative distribution function (CDF) for Y(S) = |S − K|^+ is given by

G_Y(y) =
  ∫_{−∞}^{y} f_S(u + K) du,  y ≥ 0,
  0,                          y < 0.
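The mixed distribution of Remark 28.3 is easy to tabulate numerically. A small sketch (our helper names; a lognormal S as in the Black–Scholes example of Sect. 28.3.2) evaluates G_Y and checks that its jump at zero equals P(S < K):

```python
import math

def lognorm_cdf(x, mu, sigma):
    """CDF of S = exp(mu + sigma*Z), Z standard normal; zero for x <= 0."""
    if x <= 0.0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))

def payoff_cdf(y, K, mu, sigma):
    """G_Y(y) for the call payoff Y = |S - K|^+; the point mass at 0 is P(S < K)."""
    if y < 0.0:
        return 0.0
    return lognorm_cdf(y + K, mu, sigma)

mu, sigma, K = 0.0, 0.25, 1.0
# The jump of G_Y at zero equals the probability of a worthless option, P(S < K).
assert abs(payoff_cdf(0.0, K, mu, sigma) - lognorm_cdf(K, mu, sigma)) < 1e-12
```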

Proposition 28.2 For the investor optimal price X^* the expected profit E(H(X, Y)) takes the maximal value (28.11), where l is a discounting factor (28.13), and

max E(H(X^*, Y)) ≤ E(Y). (28.18)

Proof If we use (28.10) then from (28.15) we get

max E(H(X, Y)) = (1 − e^{−(r^−)T}) max E(X − (1/α)|Y − X|^−) + e^{−(r^−)T} E(Y) =
= (1 − e^{−(r^−)T}) AV@R(Y) + e^{−(r^−)T} E(Y). (28.19)

Then, according to the strictness property (vi) of AV@R, AV@R_α(Y) ≤ E(Y), we obtain:

max E(H(X, Y)) ≤ (1 − e^{−(r^−)T}) E(Y) + e^{−(r^−)T} E(Y) = E(Y).

Proposition 28.3 For the investor optimal price X^* for an option with time to maturity T, strike price K and payoff Y(S) the following inequality holds:

X^* ≥ (max E(H(X, Y)) − e^{−(r^+)T} E(Y)) / (1 − e^{−(r^+)T}). (28.20)

Proof If we use property (28.9), then from (28.19) we get

max E(H(X, Y)) ≤ (1 − e^{−(r^+)T}) V@R(Y) + e^{−(r^+)T} E(Y).

Because of (28.17) we obtain

max E(H(X, Y)) ≤ (1 − e^{−(r^+)T}) X^* + e^{−(r^+)T} E(Y).

From this

(max E(H(X, Y)) − e^{−(r^+)T} E(Y)) / (1 − e^{−(r^+)T}) ≤ X^*.

Remark 28.4 If S_t evolves according to (28.1) and (r^−) = (r^+) = r, then e^{−rT} E(Y) = c is the discounted mathematical expectation of the payoff function with respect to the risk-neutral measure, and

X^* > (max E(H(X, Y)) − c) / (1 − e^{−rT}). (28.21)

Remark 28.5 The optimal investor pricing decision X^* for an option with time to maturity T and payoff Y(S) is based on a quantitative risk assessment, namely V@R_α(Y) (28.17). It is now interesting to compare the optimal investor price for the option with the standard p-value at risk. For a given portfolio, time horizon T, and probability p, the p-V@R can be defined informally as the maximum possible loss during that time after we exclude all worse outcomes whose combined probability is at most p. More formally, p-V@R is defined such that the probability of a loss greater than V@R is (at most) p, while the probability of a loss less than V@R is (at least) 1 − p. Common parameters for standard V@R are 1% and 5% probabilities and one-day and two-week horizons, although other combinations are in use. Formula (28.17) helps to choose the p-value according to the state of the real economic environment, using r^− and r^+ (28.16). Thus, the optimal investor price summarizes the distribution of possible losses by a quantile, a point with a probability of greater losses computed by (28.16).

Remark 28.6 The option pricing using the investor problem approach can be embedded in a utility maximization framework. In this case an investor with utility function H and starting with initial cash X expects income whose cash value at time t is Y^π(X, t), when he uses trading strategy π from the set of admissible trading strategies Π. His objective is to maximize the expected utility of wealth at a fixed final time T:

V(X) = sup E(H(Y^π(X, T))), over π ∈ Π.

In our case the set of admissible trading strategies Π includes only the set of possible investments X ∈ R. For a more general case see [4].


28.3.2 Examples

In this section we would like to show how it is possible to find the optimal investor price for the Black–Scholes model and for the Fractal Activity Time model.

28.3.2.1 Classical Black–Scholes Model

Let us consider a market for the classical Black–Scholes model, which consists of a risk-free bond with price B_t and risky stocks with price S_t. The risk-free interest rate is r. The price of the underlying traded assets S_t follows a geometric Brownian motion (GBM) (28.2). Log returns have a normal distribution, whose parameters are known to the investor from statistical data. The fair price for a call option with time to maturity T and strike price K can be computed from the Black–Scholes formula (28.3). For finding the optimal investor price we should assume that r^− and r^+ are known to the investor. Then the optimal investor price is a quantile of the payoff distribution Y = Y(S) at level α, where α is given by (28.16), and S has the lognormal distribution

f_S(x) = f_logNorm(x) =
  1/(xσ√(2π)) e^{−(log x − μ)²/(2σ²)},  x > 0,
  0,                                    otherwise,

and g_Y(y) and G_Y(y) can be obtained as in Remark 28.3.
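In the lognormal case the quantile can be written in closed form: for a call, X^* = 0 when α does not exceed the point mass P(S < K), and otherwise X^* equals the α-quantile of S minus K. A minimal sketch (our function name; the chapter does not specify the drift calibration behind Table 28.1, so the parameter choices below are purely illustrative):

```python
import math
from statistics import NormalDist

def investor_call_price(alpha, K, mu, sigma):
    """X* = G_Y^{-1}(alpha) for Y = |S - K|^+ with log S ~ N(mu, sigma^2) (Remark 28.3)."""
    z = NormalDist(mu, sigma)
    p_worthless = z.cdf(math.log(K))       # P(S < K): the point mass of Y at zero
    if alpha <= p_worthless:
        return 0.0
    return math.exp(z.inv_cdf(alpha)) - K  # alpha-quantile of S, shifted by the strike

# Illustrative parameters only (a risk-neutral drift is assumed here).
S0, r, sig, T, K = 277.0, 0.058, 0.337, 1.0 / 12.0, 260.0
mu = math.log(S0) + (r - sig**2 / 2.0) * T
price = investor_call_price(0.82, K, mu, sig * math.sqrt(T))
assert price > 0.0
```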

28.3.2.2 Fractal Activity Time (FAT) Model

The fractal activity time model was introduced by Heyde (1999) to try to encompass the empirically found characteristics of real data, and was elaborated on for Variance Gamma, normal inverse Gaussian and Student distributions [7–9]. In paper [9], we considered two constructions of activity time. The first construction is based on reciprocal gamma diffusion type processes and leads to stationary returns with an exact Student marginal distribution. The second construction uses a superposition of two reciprocal gamma diffusion type processes and leads to a Student-like marginal distribution. The market for the FAT model consists of a risk-free bond with price B_t and risky stocks with price S_t. The price of the bond evolves according to (28.1). The price of the underlying traded assets S_t is a strong solution of the following stochastic differential equation:

dS_t = μS_t dt + (θ + σ²/2)S_t dT_t + σS_t dW_{T_t},  t ≥ 0,


where T_t, t ≥ 0, is a random time change or fractal activity time, that is, a positive nondecreasing process such that T_0 = 0, and W_t, t ≥ 0, is a standard Brownian motion independent of the process T_t. The increments over unit time are τ_t = T_t − T_{t−1}, t = 1, 2, …, and the returns are given by

X_t = log(S_t / S_{t−1}) =_d μ + θτ_t + στ_t^{1/2} W_1,

where =_d denotes equality in distribution. If the increments τ_t ∼ RΓ(ν/2, δ²/2), with PDF

f_RΓ(x) = ((δ²/2)^{ν/2} / Γ(ν/2)) x^{−ν/2−1} e^{−δ²/(2x)},  x > 0, (28.22)

then, assuming θ = 0 and σ = 1, the log return X_t is a stationary process with marginal Student T(μ, δ, ν) distribution

f_St(x) = (Γ((ν+1)/2) / (δ√π Γ(ν/2))) · 1 / [1 + ((x − μ)/δ)²]^{(ν+1)/2},  x ∈ R,

where μ ∈ R is a location parameter, δ > 0 is a scaling parameter, and ν > 0 is a tail index. The fair price for a call option with time to maturity T and strike price K can be computed as in paper [9]. The optimal investor price can be found as a quantile of the distribution of Y at level α, where α is given by (28.16), and S has the log Student (Gosset [11]) distribution

f_S(x) = f_logSt(x) = (Γ((ν+1)/2) / (xσδ√π Γ(ν/2))) · 1 / [1 + ((log x − μ)/δ)²]^{(ν+1)/2},  x > 0, (28.23)

g_Y(y) and G_Y(y) can be obtained as in Remark 28.3, if r^− and r^+ are known to the investor.

28.3.3 Numerical Results

In this section, numerical comparisons between the investor optimal price (28.17) and the fair price from the Black–Scholes formula (28.3) for European call options on Apple Inc. stock are demonstrated. We considered the spot price S_0 = 277.0 for March 14, 2020. The strike price for call options with maturity T = 1/12 year is set at K = 255; 260; 265; 270, the yearly


Table 28.1 Numerical comparisons

Strike price | Last (Market) | Optimal FAT | Optimal GBM | Fair Black–Scholes
255          | 32.6          | 32.49       | 32.28       | 26.67
260          | 28.67         | 28.56       | 27.28       | 22.76
265          | 27            | 23.34       | 22.27       | 19.17
270          | 23            | 18.59       | 17.28       | 15.92

volatility for returns of the underlying asset is computed at σ = 33.7%, and the yearly riskless interest rate is set at r = 5.8%. To illustrate the approach we propose, we consider the case where the yearly interest rates are r^− = 5.4% for borrowing and r^+ = 1.2% for lending. Then α = 0.82 for one month. To find the optimal investor price we need to build, empirically or theoretically, the distribution of the payoff function and calculate a quantile of this distribution at level α. So, for the FAT model we just need to compute the quantile function for a transformed (as in Remark 28.3) log Student distribution at level α = 0.82, and for the GBM model we calculate the quantile function for a transformed lognormal distribution at level α = 0.82. The stated conditions remain the same for both call and put options. Some basic results for European call options are presented in Table 28.1. In order to evaluate the accuracy of the proposed method and to compare it with fair pricing according to the Black–Scholes formula, we can compare their percentage errors. The percentage error for a given strike price K can be computed as the following ratio:

Percentage Error = |c_theoretical − c_market| / c_market · 100%.

Overall, the results show that the average percentage errors for Optimal FAT pricing (8.3%) and for Optimal GBM pricing (12%) are less than for Fair Black–Scholes pricing (24%). However, these errors may indicate that the Black–Scholes formula is very sensitive to the volatility σ, and the optimal investor price is sensitive to the interest rates r^−, r^+. Thus, we offer our approach not as an alternative to the Black–Scholes formula, but as an additional investor tool for decision making.
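The averages quoted in the text can be reproduced directly from the values in Table 28.1 (rounding to one decimal gives figures slightly above the quoted 8.3%, 12% and 24%):

```python
# Prices copied from Table 28.1, keyed by strike.
market = {255: 32.6, 260: 28.67, 265: 27.0, 270: 23.0}
fat    = {255: 32.49, 260: 28.56, 265: 23.34, 270: 18.59}
gbm    = {255: 32.28, 260: 27.28, 265: 22.27, 270: 17.28}
bs     = {255: 26.67, 260: 22.76, 265: 19.17, 270: 15.92}

def avg_percentage_error(model):
    """Average of |c_theoretical - c_market| / c_market * 100% over the four strikes."""
    errs = [abs(model[k] - market[k]) / market[k] * 100.0 for k in market]
    return sum(errs) / len(errs)

print(round(avg_percentage_error(fat), 1))  # 8.4
print(round(avg_percentage_error(gbm), 1))  # 12.1
print(round(avg_percentage_error(bs), 1))   # 24.6
```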

28.4 Summary

Following the financial tsunami experiences of 2008 and the Covid-19 crisis in 2020, the risk controls of derivative instruments on stocks and other financial assets have become extremely important. Thanks to our approach, the investor gets a tool,


which allows him to integrate option pricing with risk management. This method gives an opportunity to evaluate the optimal investor price in incomplete markets without unrealistic assumptions such as a unique constant risk-free interest rate, the absence of transaction costs, or perfect liquidity. All the investor needs to do is compute α (which, being driven by the bid-ask spread, is a de facto measure of market liquidity and can be used for describing the state of the economic environment), build empirically or theoretically a distribution function for the payoff, and find a quantile of this distribution at level α.

References

1. Artzner, P., Delbaen, F., Eber, J.-M., Heath, D.: Coherent measures of risk. Math. Finance 9, 203–228 (1999)
2. Black, F., Scholes, M.: The pricing of options and corporate liabilities. J. Polit. Econ. 81, 637–654 (1973)
3. Black, F.: The holes in Black–Scholes. RISK Magazine 26, 47–51 (1988)
4. Davis, M.H.A.: Option pricing in incomplete markets. In: Mathematics of Derivative Securities, pp. 216–226. Cambridge University Press, Cambridge (1997)
5. Staum, J.: Incomplete markets. In: Handbooks in OR and MS, vol. 15, Chap. 12, pp. 511–563 (2007)
6. Pflug, G.C., Werner, R.: Modeling, Measuring and Managing Risk, p. 285. World Scientific Publishing Co. Pte. Ltd., Singapore (2007)
7. Heyde, C.C., Leonenko, N.N.: Student processes. J. Appl. Probab. 37, 342–365 (2005)
8. Leonenko, N.N., Petherick, S., Sikorskii, A.: A normal inverse Gaussian model for a risky asset with dependence. Statist. Probab. Lett. 82, 109–115 (2012)
9. Castelli, F., Leonenko, N.N., Shchestyuk, N.: Student-like models for risky asset with dependence. Stoch. Anal. Appl. 35(3), 452–464 (2017)
10. Canhanga, B., Malyarenko, A., Ying, N., Silvestrov, S.: Perturbation methods for pricing European options in a model with two stochastic volatilities. In: New Trends in Stochastic Modelling and Data Analysis, pp. 199–210 (2015)
11. Cassidy, D.T., Hamp, M.J., Ouyed, R.: Pricing European options with a log Student's t-distribution: a Gosset formula. Phys. A: Stat. Mech. Appl. 389(24), 5736–5748 (2010)
12. Rockafellar, R.T., Uryasev, S.: Optimization of conditional value-at-risk. J. Risk 2, 21–41 (2000)
13. https://data.worldbank.org/indicator/FR.INR.LNDP

Part III

Engineering Mathematics

Part III covers various applications of computational mathematics in different engineering fields. Contributions range from applications related to climate models (Chap. 29), multiscale problems arising when studying fluid flow in porous media, thermal conduction or wave propagation in composite materials (Chaps. 30, 31), to permanent magnet shape optimization (Chap. 32). The next two chapters tackle problems related to the population dynamics of forage resources and livestock populations in a grassland ecosystem (Chap. 33) and interactions of a consumer-resource system with harvesting (Chap. 34). Chapters 35–38, by Prashant G. Metri and coauthors, are concerned with applications of numerical and analytical methods to the investigation of solutions of boundary and initial value problems for systems of partial differential equations in fluid mechanics and electromagnetism applications. Finally, Chaps. 39 and 40 cover a new stochastic digital measurement method and its role in designing low-cost digital high-precision power grid electrical energy meters.

Chapter 29

Stochastic Solutions of Stefan Problems with General Time-Dependent Boundary Conditions Magnus Ögren

Abstract This work deals with the one-dimensional Stefan problem with a general time-dependent boundary condition at the fixed boundary. Stochastic solutions are obtained using discrete random walks, and the results are compared with analytic formulae when they exist, otherwise with numerical solutions from a finite difference method. The innovative part is to model the moving boundary with a random walk method. The results show statistical convergence for many random walkers when Δx → 0. Stochastic methods are very competitive in large domains in higher dimensions and have the advantages of generality and ease of implementation. A drawback of the stochastic method is that longer execution times are required for increased accuracy. Since the code is easily adapted for parallel computing, it is possible to speed up the calculations. Regarding applications, Stefan problems have historically been used to model the dynamics of melting ice, and we give such an example here where the fixed boundary condition follows data from observed day temperatures at Örebro airport. Nowadays, there is a large range of applications, such as climate models, the diffusion of lithium ions in lithium-ion batteries and modelling steam chambers for petroleum extraction.

Keywords Random walk · Heat equation

MSC 2020 35K05

M. Ögren (B) School of Science and Technology, Örebro University, 70182 Örebro, Sweden Hellenic Mediterranean University, P. O. Box 1939, GR-71004 Heraklion, Greece e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_29



29.1 Introduction

The Stefan problem has its name from Josef Stefan (1835–1893), who was the first to investigate problems including a moving boundary in detail. This was described in his report on ice formation in polar seas [13], where he also presented the analytical solution, see Eq. (29.23), to the problem where the fixed boundary has constant temperature. However, for the general case where the temperature at the fixed boundary f(t) is an arbitrary function, no explicit solution has been obtained, though there are power series formulations described in the literature, see e.g. [15]. In addition to the ice formation problem originally examined by Stefan, moving boundary problems now have many other applications, see e.g. [3, 6, 7]. The aim here is to show how to solve Stefan problems for arbitrary boundary conditions using stochastic methods. We will present a discrete Random Walk Method (RWM) that solves the Stefan problem, which is a PDE consisting of the heat equation defined in a phase-changing medium. There are different types of formulations of this problem, but one of its characteristics is that it has a free or moving boundary governed by a so-called Stefan condition, which describes the position of the interface between the phases. Beyond the moving boundary, the general formulation of the problem usually also includes a fixed boundary with a boundary condition different from the moving one. For physical reasons the boundary condition at the moving boundary is here set to be the transition temperature, i.e. the melting point T_M of ice. At the fixed boundary, the condition for the temperature may be set to an arbitrary function f(t) of time. For the case where we have a constant temperature at the fixed boundary, f(t) = T_0, and for one other special form of f(t), there are analytical solutions to the Stefan problem. In addition, a specific time-dependent incoming heat flux is illustrated to be equivalent to the constant temperature condition.
However, in most cases we need numerical calculations to evaluate a solution. As a practical example of such a case, we model the melting of ice where the surface temperature is defined according to the variations in the air temperature.

29.1.1 Random Walk and the Heat Equation

In this Section, we will study the heat equation

∂T/∂t = α ∂²T/∂x², (29.1)

and describe how to translate it into an RWM [2, 11]. In our one-dimensional model, we want to let one walker represent the temperature difference of 1 °C on the volume element Δx · 1 · 1 m³. To make a simple illustration for the heat equation, denote the number of walkers in the volume element i with width Δx at time t as N_i(t). If we let the probability for a walker to go either to the left or to the right be equal


during a time step Δt, we have equal probabilities P = 1/2. Then we expect to have P·N_i walkers going to volume element i + 1 and the same amount going to volume element i − 1. At the same time, walkers from volume elements i − 1 and i + 1 will walk into the volume element i, giving the following balancing equation for N_i:

N_i(t + Δt) = N_i(t) − (walkers to i − 1) − (walkers to i + 1) + (walkers from i − 1) + (walkers from i + 1) (29.2)

⇒ N_i(t + Δt) − N_i(t) = P(N_{i+1} − 2N_i + N_{i−1}).

We divide Eq. (29.2) by Δt and introduce the constant

α = (P/Δt)(Δx)², (29.3)

such that

(N_i(t + Δt) − N_i(t)) / Δt = α (N_{i+1}(t) − 2N_i(t) + N_{i−1}(t)) / (Δx)², (29.4)

which is a discretized partial differential equation for N(x, t). We then see that Eq. (29.4) has the same form as the heat equation (29.1). In general, the same arguments can be made to derive the corresponding equation in D dimensions, since a symmetric Cartesian grid has 2D directions for a walker to go with equal probability P = 1/(2D), hence α = (Δx)²/(2DΔt). We now first present an introductory example without boundary conditions. Consider the heat conduction problem for an infinite rod, with a central heat impulse at t = 0:

∂T/∂t = α ∂²T/∂x²,  x ∈ R,  t > 0, (29.5)
T(x, 0) = δ(x). (29.6)

The well-known solution to this problem is

T(x, t) = (1/√(4παt)) e^{−x²/(4αt)}. (29.7)

This problem is straightforward to model with the RWM since it is defined for all real values and thus has no boundary conditions to consider. In Fig. 29.1 we can see a comparison between the analytic solution and the discrete probability density function for random walks with n = 10^5 initial walkers at x_0 = 0.
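To make the translation concrete, a minimal sketch (our function name; α = 1, P = 1/2, so the step follows from Eq. (29.3)) simulates walkers for the impulse problem of Eqs. (29.5)–(29.6) and checks the empirical variance against that of the Gaussian (29.7), 2αt:

```python
import math
import random

def walk_impulse(n_walkers, n_steps, dt, rng):
    """Random walks for Eqs. (29.5)-(29.6): all walkers start at x0 = 0.
    With P = 1/2 and alpha = 1, Eq. (29.3) gives the step dx = sqrt(2*dt)."""
    dx = math.sqrt(2.0 * dt)
    return [sum(rng.choice((-dx, dx)) for _ in range(n_steps))
            for _ in range(n_walkers)]

rng = random.Random(0)
t, n_steps = 1.0, 100
pos = walk_impulse(20000, n_steps, t / n_steps, rng)
# The Gaussian (29.7) has variance 2*alpha*t = 2 at t = 1.
var = sum(p * p for p in pos) / len(pos)
assert abs(var - 2.0) < 0.1
```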


Fig. 29.1 Analytical versus random walk solutions for the introductory problem of Eqs. (29.5)–(29.6) for t = 1. The red dashed curves are from Eq. (29.7). The blue bins have the width 2Δx. (Left) N = 5 time steps, i.e. Δt = 1/5 and Δx = √(2/5) ≈ 0.63. (Right) N = 100 time steps, i.e. Δt = 1/100 and Δx = √2/10 ≈ 0.14

29.1.2 A Random Walk Model with Boundary Conditions

As the next problem, a heat equation with fixed boundaries and homogeneous Dirichlet conditions is considered:

∂T/∂t = α ∂²T/∂x²,  0 < x < L,  t > 0, (29.8)
T(0, t) = T(L, t) = 0,  t > 0, (29.9)
T(x, 0) = g(x). (29.10)

Using separation of variables on Eq. (29.8), the general solution to this problem can be written

T(x, t) = Σ_{n=1}^{∞} c_n e^{−α(nπ/L)² t} sin(nπx/L). (29.11)

We write the initial condition as

g(x) = Σ_{n=1}^{∞} c_n sin(nπx/L). (29.12)

By recognizing this as the Fourier series expansion of g(x) on 0 < x < L, c_n can be determined according to

c_n = (2/L) ∫_0^L g(x) sin(nπx/L) dx. (29.13)


Here we set g(x) = 1 as an example, which gives the solution from Eq. (29.11) in the form

T(x, t) = (4/π) Σ_{k=0}^{∞} [exp(−(απ²/L²)(2k + 1)² t) / (2k + 1)] sin((2k + 1)πx/L). (29.14)

In the previous problem of Eqs. (29.5)–(29.6) we adapted an RWM to a problem defined on the whole x-axis. If we instead want to solve the problem of Eqs. (29.8)–(29.10), it is necessary to implement boundary conditions. This is done by discretizing space and time on the finite domain according to Eq. (29.3). In this example we choose the following discretization, where we for simplicity set α = 1:

Δx = x_{i+1} − x_i = 10^{−2},  i = 0, 1, …, N − 1,  x_0 = 0,  x_N = 1,
Δt = t_{j+1} − t_j = 5 · 10^{−5},  j = 0, 1, …, M − 1,  t_0 = 0,  t_M = 1. (29.15)

The initial condition of Eq. (29.10) with T(x, 0) = 1 will here be represented by one walker starting at T(x_i, t_0) for all i. By the next timestep t_1, all walkers will have moved one step either to the right or to the left. For homogeneous Dirichlet conditions, the walkers that reach the boundaries will be absorbed and disappear, such that T(x_0, t_j) = T(x_N, t_j) = 0. In the case of inhomogeneous Dirichlet conditions, i.e. T(x_0, t_j) ≠ 0, as in the upcoming Stefan problem, see Eq. (29.17), we also have walkers starting from the boundary. The number of walkers starting at T(x_0, t_j) will here be set according to f(t), where f(t) = 1 will be represented by one walker starting at T(x_0, t_j) for all j. We then iterate over time until all walkers have reached the boundaries or the maximum time t_M is attained. A statistical problem so far is that the result of our model with one walker, representing a temperature difference of 1 °C per volume unit, might differ a lot depending on how each random walk turns out. The real moving particles that cause thermal diffusion and represent such a rise in temperature are large in number. Therefore, to get an accurate result we multiply the number of walkers starting at all points defined by initial or boundary conditions by a large number n, and at the end we divide the temperature at all points by n. In Eq. (29.14) we presented an analytic solution for the heat conduction problem Eqs. (29.8)–(29.10) for the initial temperature g(x) = 1 °C. Figure 29.2 shows the temperature distributions T(x, t) for the analytical result and the RWM solution with α = 1. In Fig. 29.3 we see a comparison between the analytical result and the RWM in the cross-section x = 0.5.
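The procedure above can be sketched in a few lines (our function name; a coarser grid than (29.15) is used so the check runs quickly, and a generous tolerance absorbs the statistical noise). The walker count at an interior site, divided by n, estimates the temperature and can be compared against the Fourier solution (29.14):

```python
import random

def rwm_dirichlet(n, n_sites, n_steps, rng):
    """Walkers for Eqs. (29.8)-(29.10) with g(x) = 1: n walkers start at each interior
    site; absorbing (homogeneous Dirichlet) boundaries at sites 0 and n_sites."""
    counts = [0] * (n_sites + 1)
    for i in range(1, n_sites):
        for _ in range(n):
            pos, alive = i, True
            for _ in range(n_steps):
                pos += rng.choice((-1, 1))
                if pos == 0 or pos == n_sites:
                    alive = False  # absorbed: contributes T = 0 at the boundary
                    break
            if alive:
                counts[pos] += 1
    return [c / n for c in counts]  # temperature estimate per site

rng = random.Random(2)
dx = 0.05                # 21 sites on [0, 1]
dt = dx * dx / 2.0       # alpha = 1 via Eq. (29.3) with P = 1/2
T = rwm_dirichlet(500, 20, round(0.1 / dt), rng)
# The Fourier solution (29.14) at x = 0.5, t = 0.1 is about 0.4746.
assert abs(T[10] - 0.4746) < 0.15
```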

Fig. 29.2 Solutions for the heat conduction problem for t ∈ [0, 0.4] with initial condition g(x) = 1. (Left) Analytic solution from Eq. (29.14) with 100 terms in the Fourier series. (Right) RWM solution with n = 10^4 and Δx = 0.01

Fig. 29.3 Comparison between the analytic solution, Eq. (29.14) (green) for x = 0.5 and t ∈ [0, 0.4], and the RWM (red) with n = 10^4 and Δx = 0.01

29.2 The Stefan Problem

In our model for the one-dimensional Stefan problem we consider an initial block of ice, i.e. a solid (S), with semi-infinite extent (x → ∞) and one surface to air at x = 0. At t = 0 there is no water phase and the temperature of the ice phase is T_ice = T_M = 0 °C. For t > 0 the ice can start to melt and thus we can have a water phase, i.e. a liquid (L), to the left of the ice. We presently treat only the so-called one-phase Stefan problem, which means that the temperature in the ice phase does not change in time. The temperature at the x = 0 boundary, i.e. the interface between air and water for t > 0, is allowed to change over time according to f(t), and to simulate a melting process, we initially assume f(t) > 0. This yields the following equations [5]:

∂T/∂t = α_L ∂²T/∂x²,  0 < x < s(t),  t > 0, (29.16)
T(0, t) = f(t),  t > 0, (29.17)
T(x, 0) = 0, (29.18)
ρl ds/dt = −k_L ∂T/∂x |_{x=s(t)},  t > 0, (29.19)
s(0) = 0, (29.20)
T(s(t), t) = T_M = 0 °C,  t > 0. (29.21)

Here the thermal diffusivity in the liquid part, α_L [m²/s] in (29.16), is defined as

α_L = k_L / (ρ_L c_L), (29.22)

where k_L [W/(mK)] is the heat conductivity, ρ_L [kg/m³] the density and c_L [J/(kgK)] the specific heat capacity in the liquid phase. Note that these physical properties differ between the solid and liquid part, e.g. α_L ≠ α_S. But since T = 0 °C in the solid phase and the temperature distribution is only evaluated in the liquid phase, α_S is not taken into consideration in this one-phase Stefan problem. In equation (29.19), l is the specific latent heat and ρ is the density. Here it is assumed that ρ = ρ_L = ρ_S for simplicity. The analytic solution to the problem when f(t) = T_0 is constant is [13]

T(x, t) = T_0 [1 − erf(x/(2√(α_L t))) / erf(λ)],
s(t) = 2λ√(α_L t), (29.23)
β√π λ e^{λ²} erf(λ) = T_0,

where β = l/c_L and erf(x) is the error function defined as (2/√π) ∫_0^x e^{−y²} dy. A different special case where an analytic solution also exists is when f(t) = e^t − 1. Provided β = 1, the solution is then [10, 13]

T(x, t) = e^{t−x} − 1,
s(t) = t. (29.24)

We will also use this case for a numerical comparison with the RWM in Sect. 29.3.
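The transcendental equation in (29.23) has no closed-form solution for λ, but it is monotone in λ and can be solved by bisection with the standard-library error function (a sketch with our own helper name):

```python
import math

def stefan_lambda(T0, beta, lo=1e-9, hi=5.0):
    """Bisection for lambda in Eq. (29.23): beta*sqrt(pi)*lam*exp(lam^2)*erf(lam) = T0."""
    f = lambda lam: beta * math.sqrt(math.pi) * lam * math.exp(lam * lam) * math.erf(lam) - T0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

lam = stefan_lambda(T0=1.0, beta=1.0)
print(round(lam, 3))  # about 0.62; the front then moves as s(t) = 2*lam*sqrt(alpha_L*t)
```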


Fig. 29.4 Volume of melted ice between times t_0 and t_1

29.2.1 The Stefan Condition

The position of the free boundary, i.e. the interface between the two phases, is time-dependent and denoted as x = s(t). At time t_0 the entire domain x > 0 is divided into two subdomains consisting of the water phase x < s(t_0) and the ice phase x > s(t_0). Here we consider a one-phase problem, which means that the temperature in one of the phases (here the ice phase) is constant at the melting temperature T_M = 0 °C. We here briefly derive the Stefan condition stated in Eq. (29.19), which will later be used in the formulation of the stochastic model for the interface s(t). More details on the derivation of the Stefan condition can be found e.g. in [5]. In the case of melting ice, the water phase at time t_1 > t_0 will have increased, resulting in s(t_1) > s(t_0). If we imagine a block of ice with cross-sectional area S, the volume V of the melted ice in the time interval t ∈ [t_0, t_1] is S(s(t_1) − s(t_0)), see Fig. 29.4. The thermal energy Q [J] required for the melting of this block is determined according to

Q = Vρl = S(s(t_1) − s(t_0))ρl, (29.25)

where l [J/kg] is the specific latent heat for the phase transition. As we here assume that the heat is only spread by diffusion, the heat transport obeys Fourier's law

q = −k_i dT/dx, (29.26)

where q is the local heat flux density [W/m²]. By energy conservation and the expressions for the heat fluxes from the liquid and solid phases, Q can be written

Q = ∫_{t_0}^{t_1} ∫ [−k_L ∂T(s(τ), τ)/∂x · e_x − k_S ∂T(s(τ), τ)/∂x · (−e_x)] dS dτ
  = S ∫_{t_0}^{t_1} [−k_L ∂T(s(τ), τ)/∂x + k_S ∂T(s(τ), τ)/∂x] dτ. (29.27)


Combining Eqs. (29.25) and (29.27), dividing by t_1 − t_0, and letting t_1 → t_0 will yield Eq. (29.19) for the Stefan condition:

ρl S lim_{t_1→t_0} (s(t_1) − s(t_0))/(t_1 − t_0) = S lim_{t_1→t_0} 1/(t_1 − t_0) ∫_{t_0}^{t_1} [−k_L ∂T(s(τ), τ)/∂x + k_S ∂T(s(τ), τ)/∂x] dτ
⇒ ρl ds/dt = −k_L ∂T(s(t), t)/∂x + k_S ∂T(s(t), t)/∂x. (29.28)

Here t_0 has been replaced by t since t_0 can be chosen arbitrarily. In the present case, where we assume T = 0 °C for x > s(t), diffusion only occurs in the liquid phase and Eq. (29.28) reduces to

ρl ds/dt = −k_L ∂T(s(t), t)/∂x. (29.29)

29.2.2 Modelling the Moving Boundary

To be able to solve the Stefan problem with the RWM, the critical part is how to handle the moving boundary s(t). To set up a model for the movement of the boundary s(t) we start from Sect. 29.2.1. In Eq. (29.25) we established that the heat required to move the boundary a small step Δs is

Q = SΔsρl, (29.30)

and thus

Δs = Q / (Sρl). (29.31)

We want to compare this with the heat represented by one walker as it raises the temperature of the volume SΔx [m³] by 1 °C, see Fig. 29.5. This can be expressed as (c = c_L)

Q_walker = cρVΔT = cρSΔx · 1 °C. (29.32)

By combining Eqs. (29.31) and (29.32), we have

Δs = cρSΔx / (lρS) = (c/l)Δx. (29.33)

So for every walker absorbed by the moving boundary at s(t) the boundary will move the increment Δs [14]. To adjust for the multiplication with the factor n at the


Fig. 29.5 One walker raises the temperature of the gray volume by 1 °C

starting points, as discussed in Sect. 29.1.2, we also need to correct the step length Δs by dividing by n. Hence, the moving boundary will have the position i in the x-grid when

i ≤ s_k/Δx < i + 1,  s_k = s_{k−1} + Δs/n. (29.34)

It is of interest to see how the ratio between Δs and Δx turns out as we insert realistic physical parameter values for c and l. For water at 0 °C we have c = 4.22 kJ/(kg·K) and l = 334 kJ/kg [9], which gives c/l ≈ 0.0126, and we see from Eq. (29.33) that Δs ≪ Δx. Note that in the opposite case, if Δs ≫ Δx, the boundary will move several Δx-steps as it is reached by one walker, and this will lead to poor results when modelling the movement of the boundary. Thus, in the case that we have c/l > 1 we have to compensate by increasing the number n and thereby decreasing the step size Δs/n in Eq. (29.34). So a rule of thumb to yield a good approximation of the boundary is to choose n such that Δs/n ≪ Δx.
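The bookkeeping of Eqs. (29.33)–(29.34) is only a few lines of code. A minimal sketch (variable names are ours; the water data quoted above), where every walker absorbed at the moving boundary advances s by Δs/n and the boundary occupies grid cell i while i ≤ s/Δx < i + 1:

```python
c, l = 4.22, 334.0         # kJ/(kg*K), kJ/kg  ->  c/l is about 0.0126 for water at 0 C
dx, n = 0.01, 100
ds = (c / l) * dx          # Eq. (29.33)

s, cell = 0.0, 0
for _ in range(50000):     # pretend 50000 walkers hit the moving boundary
    s += ds / n            # Eq. (29.34): each absorbed walker advances s by ds/n
    cell = int(s / dx)     # current boundary cell index in the x-grid

# 50000 * (c/l) / n = 6.3 cells: the front has crossed six grid cells.
assert cell == 6
```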

29.2.3 Stefan Problem with an Incoming Heat Flux

In the Stefan problem Eqs. (29.16)–(29.21) the temperature at the fixed boundary (x = 0) is described by the Dirichlet condition of Eq. (29.17). Changing instead to a Neumann condition

∂T/∂x (0, t) = h(t), (29.35)

allows us to model a prescribed heat flux. In fact, there is a specific form of heat flux that is equivalent to the constant Dirichlet condition f(t) = T_0 in Eq. (29.17), that is [1]

∂T/∂x (0, t) = −q_0 / (k_L √t). (29.36)


Fig. 29.6 Solutions for the Stefan problem for t ∈ [0, 0.6] with boundary condition f(t) = 1. (Left) Analytic solution from Eq. (29.23). (Right) RWM solution with n = 10^4 and Δx = 0.01

Hence, given a relation between q_0 and T_0, the analytic solution Eq. (29.23) is applicable also in this case, as we illustrate numerically in the upcoming Section. The implementation of Dirichlet boundary conditions was described in Sect. 29.1.2. Here we sketch an implementation of the Neumann boundary condition (29.35). At the first time step, we seed the temperature for the fixed boundary with the order of unity, i.e. T(x_0, t_0) ∼ 1. Using a forward difference approximation

∂T/∂x (0, t) ≈ (T(x_1, t_j) − T(x_0, t_j)) / Δx, (29.37)

we in the consecutive time steps (j > 0) update the temperature at the fixed boundary according to

T(x_0, t_j) = round(−nΔx h(t_j) + T(x_1, t_j)), (29.38)

where round rounds a number to the nearest integer.
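The update (29.38) amounts to a one-line seeding rule for the number of walkers at the fixed boundary. A sketch under the flux (29.36) (our function name; the q_0 and k_L values are only illustrative, matching Fig. 29.11):

```python
import math

def neumann_boundary_value(T1, t, n, dx, q0=0.9108, kL=1.0):
    """Eq. (29.38): walkers to seed at the fixed boundary for h(t) = -q0/(kL*sqrt(t))."""
    h = -q0 / (kL * math.sqrt(t))
    return round(-n * dx * h + T1)

# With n = 10^4 and dx = 0.01 the seeded count decays like 1/sqrt(t):
early = neumann_boundary_value(T1=0, t=0.01, n=10**4, dx=0.01)
late = neumann_boundary_value(T1=0, t=1.0, n=10**4, dx=0.01)
assert early > late > 0
```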

29.3 Numerical Results for Stefan Problems

29.3.1 Stefan Problem with Constant Boundary Condition f(t) = T_0

In Eq. (29.23) we presented the analytic solution for the Stefan problem Eqs. (29.16)–(29.21) when f(t) = T_0. Figure 29.6 shows the temperature distributions T(x, t) for the analytic result and the RWM solution with T_0 = 1 °C, for α = 1 and β = 1. The green and red curves, respectively, denote the solid–liquid interface.


M. Ögren

Fig. 29.7 Analytic solution T (x, 0.5), x ∈ [0, 0.4], of Eq. (29.23) (green). RWM solutions with Δx = 0.01 and different values of n, see inset legend. Numerical results from a finite difference method (FDM) [8] (blue)

Fig. 29.8 Analytic solution s(t) of Eq. (29.23) (green). Numerical solution of s(t) from RWM for t ∈ [0, 0.5] and n = 10⁴, with different step lengths Δt, see inset legend

In Fig. 29.7 we compare different values of n for the RWM in the cross-section t = 0.5. In Fig. 29.8 we compare different sizes of the step length Δt in a plot of the moving boundary s(t).

29.3.2 Stefan Problem with a Special Boundary Condition f (t) = eᵗ − 1

In Eq. (29.24) we presented the analytical solution for the Stefan problem (29.16)–(29.21) in the special case when f(t) = eᵗ − 1. Figure 29.9 shows the temperature distributions T(x, t) for the analytical result and the RWM solution for α = 1 and β = 1. The green and red curves show the solid-liquid interface. In Fig. 29.10


Fig. 29.9 Solutions for the Stefan problem with the boundary condition f (t) = eᵗ − 1 for t ∈ [0, 1]. (Left) Analytic solution Eq. (29.24). (Right) RWM solution with n = 10⁴ and Δx = 0.01

Fig. 29.10 Analytic solution T (x, 0.5), x ∈ [0, 0.4], of Eq. (29.24) (green). RWM solutions with n = 10⁴ and Δx = 0.01 (red). Numerical results from a finite difference method (FDM) [8] (blue)

we see a comparison between the analytic result, the RWM, and the FDM in the cross-section t = 0.5.

29.3.3 Stefan Problem with a Special Heat Flux Boundary Condition h(t) = −q0/√t

We now estimate what value of q0 is required in order for the temperature to be T(0, t) = T0. The total heat entering during the time t is

Q = S (q0/kL) ∫₀ᵗ dτ/√τ = (2S q0/kL) √t.   (29.39)


Fig. 29.11 Solutions of the Stefan problem with the special heat flux boundary condition h(t) = −q0/√t for t ∈ [0, 0.6]. (Left) RWM solution for q0 = 0.9108 with n = 10⁴ and Δx = 0.01, to be compared with Fig. 29.6. (Right) Cross section x = 0 of the RWM solution

From Fig. 29.7 we obtain the approximation T(x, t = 0.5) ≈ 1 − 1.25x for the constant temperature case. Hence, during the time interval t ∈ [0, 0.5], the solid phase has received the heat Q_S = ρVl ≈ ρS(s(0.5) − s(0))l = 0.8ρl, and the liquid phase has received the heat Q_L = ρVcΔT ≈ ρS(s(0.5) − s(0))c(T(0, 0.5) + T(0.4, 0.5))/2 = 0.6ρc. With ρ = S = l = c = kL = 1 (α = β = 1), we have the total heat Q = Q_S + Q_L = 1.4. Solving for q0 from Eq. (29.39), we obtain the estimation q0 ≈ 0.99. If one instead calculates ∂T(0, t)/∂x from the analytic solution Eq. (29.23), one obtains q0 = 1/(√π erf(λ)) = 0.9108. Numerically we find that q0 ≈ 0.9108 gives a constant temperature T(x = 0, t) ≈ 1 for α = 1 and β = 1, see Fig. 29.11, which is in agreement with [1].
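The value q0 = 1/(√π erf(λ)) can be checked in a few lines. The sketch below assumes the classical one-phase Neumann solution, where λ solves the transcendental equation √π λ e^{λ²} erf(λ) = cT0/l (equal to 1 here since α = β = 1 and T0 = 1); the bisection solver is our own stdlib-only choice, not code from the paper:

```python
import math

def stefan_lambda(stefan_number=1.0, lo=1e-9, hi=5.0):
    # Bisection for the transcendental equation of the one-phase Stefan
    # problem: sqrt(pi)*lam*exp(lam**2)*erf(lam) = stefan_number.
    f = lambda lam: (math.sqrt(math.pi) * lam * math.exp(lam * lam)
                     * math.erf(lam) - stefan_number)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

lam = stefan_lambda()                              # approximately 0.620
q0 = 1.0 / (math.sqrt(math.pi) * math.erf(lam))
print(q0)                                          # approximately 0.9108
```

This reproduces the value q0 = 0.9108 quoted in the text.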

29.3.4 Stefan Problem with Oscillating Boundary Condition

In the introduction we proposed to model a general time dependent fixed boundary condition T(0, t) = f(t) with the RWM. Due to limitations in the existing code for the finite difference method (FDM) [8], we are presently restricted to considering f(t) = sin(t) at the boundary when comparing the two numerical methods. The RWM solution for the Stefan problem Eqs. (29.16)–(29.21) yields the temperature distribution T(x, t) as seen in the left part of Fig. 29.12. The RWM is compared to the FDM for the cross-section t = 1 in the right part of Fig. 29.12. Here we have set α = 1 and β = 2.


Fig. 29.12 (Left) RWM solution for f (t) = sin(t) with n = 10⁴ and Δx = 0.01 on the interval t ∈ [0, 1]. (Right) Comparison between RWM and FDM for f (t) = sin(t) at t = 1 on the interval x ∈ [0, 0.4]

Fig. 29.13 RWM model where f (t) is set to the observed day temperatures at Örebro airport 1–3 March 2019, x is in mm and t ∈ [0, 62 h]

29.3.5 Stefan Problem with Boundary Condition According to Daytime Temperature Variations

To finally apply our RWM model with an arbitrary time dependent temperature at the fixed boundary in a simulation of melting ice, we set the physical constants for water to α ≈ 0.1429 mm²/s and β ≈ 79.9 K [9]. We model the melting of ice according to the daytime temperature variations, and therefore we set f(t) at the fixed boundary to the observed air temperatures from Örebro airport 1–3 March 2019 [16]. Using the observed air temperature at the fixed surface is a simplification that does not take into account the temperature gradient between air and ice/water, or heat transport by convection or radiation. Nevertheless, Fig. 29.13 gives a qualitative view of the dynamics of the melting ice, and we see for example that it is freezing again during the first night, although the present one-phase implementation with negative temperatures in the liquid is quantitatively unrealistic.


29.4 Discussion

From the numerical results of the previous section, we can see qualitatively from Figs. 29.6, 29.7, 29.8, 29.9, 29.10 and 29.11 that the RWM solution to Stefan problems converges to the analytical one as Δx → 0. An oscillatory boundary condition was successfully evaluated against a finite difference method in Fig. 29.12. Finally, an arbitrary time dependent function for the fixed boundary was used to model the melting of ice with realistic temperature data in Fig. 29.13.

There are a few simplifications in our model for the Stefan problem that can be improved in a more detailed study. Among the physical simplifications, we have mentioned our assumption of the same density for water and ice, ρ_L = ρ_S, which does not hold in reality. We may also want to consider a temperature distribution in the solid phase, T_ice(x, t) ≠ 0, which leads to a two-phase Stefan problem with a system of PDEs. Some cases of two-phase problems also have analytic solutions, see e.g. [4].

There are several applications for Stefan problems in different fields of engineering. Recalling the original purpose of Stefan's article in 1891, which was to model the arctic ices, this is highly relevant today due to the demand for better climate models. According to Hunke et al. [7], Stefan's one dimensional thermodynamical model is still in use for global climate models, although complete thermodynamical sea-ice models are of course more complex. Hence thermodynamical sea-ice models may be a subject for future work with the RWM approach. Other areas where a solid-liquid interface moves include 3D-printing, freezing of food, and solidification of building components. Also, in lithium-ion batteries, the diffusion of lithium ions in the battery is separated into two phases, one where lithium ions are evenly distributed, and one where they are not present.
To be able to compute the properties of batteries in a better way, such as life-time and capacity, one can estimate the movement of the interface between these two phases as a Stefan problem [6].

29.5 Conclusions

In accordance with our opening objective, we have successfully used a stochastic method to calculate numerical solutions with arbitrary accuracy to the Stefan problem with a general time-dependent boundary condition at the fixed boundary. In comparison with the finite difference method, our experience is that the RWM is easier to implement and more flexible in terms of switching between different boundary conditions. This further motivates the use of stochastic methods in more complex applied problems in higher dimensions [12].

Acknowledgements We thank the students Andreas Lockby, Daniel Stoor, and Emil Gestsson for fruitful discussions about the Stefan problem. We are also grateful to Tobias Jonsson for sharing the finite difference code, used here for comparisons with the stochastic method. Finally we thank Daniel Edström and Bair Budaev for proofreading.


Appendix

Example of a MATLAB code that can reproduce Figs. 29.6, 29.7, 29.8, 29.9, 29.10, 29.11 and 29.12

% RWM_Stefan.m
% (can be downloaded from the arXiv:2006.04939 [math.AP] Ancillary files)
clear all; close all
% PARAMETERS:
alpha=1 % K/(rho*c); % Thermal diffusivity.
beta=1 % l/c; % Parameter with unit [K].
L=1 % Length of domain
t_max=0.5 % Maximum time
T_0=1; % [degree C] Temperature for constant temperature BC.
% Parameter for the constant heat flux BC.
q_0 = 0.9108 % = 1/(sqrt(pi)*erf(lambda)).
n=1e2; % Number of walkers.
dx=0.01; dt=dx^2/(2*alpha); % Steplengths in x and t
ds=dx/(n*beta); % Increment for s(t) when absorbing a walker.
% Number of points in space and time.
N_x=ceil(L/dx); N_t=ceil(t_max/dt);
% Matrix representing T(x,t), initially set to 0 degree C.
T=zeros(N_x,N_t);
s_vector=zeros(1,N_t); % Vector representing s(t).
j_t=1; j_s=1; % Indices for time and the position of s(t).
s=dx; % Initial value for s(t), approximately 0.
% Loop for all time steps as long as s(t) < L.
while j_t < N_t && j_s < N_x
  % Examples of boundary conditions (BC) for the fixed boundary.
  T(1,j_t)=n*T_0; % Constant Dirichlet BC.
  % T(1,j_t)=n*(exp(j_t*dt)-1); % Exponential BC.
  % T(1,j_t)=n*sin(j_t*dt); % Oscillating BC.
  % % Heat flux
  % if j_t==1 % First timestep.
  %   T(1,1)=n*1; % Seed temperature of order unity.
  % else % Consecutive timesteps.
  %   T(1,j_t)=round( (n*dx*q_0/(j_t*dt)^(0.5)+T(2,j_t)) );
  % end % if
  s_vector(j_t)=s;
  for j_x = 1:N_x
    % If T is below 0 degree C (unrealistic one-phase model).


    if T(j_x,j_t) < 0
      sign=-1;
    else
      sign=1;
    end
    for k=1:sign*T(j_x,j_t) % Move all walkers at (j_x,j_t).
      p=2*round(rand)-1; % =+-1, with P(+1)=P(-1)=1/2.
      % A walker moves if it has not reached the boundaries.
      if j_x+p > 1 && j_x+p

Several promising algorithms have been developed to achieve such a reduced error bound for the resonance error. These approaches can be classified into two groups: (i) time dependent local problems such as [2, 4, 8], (ii) elliptic local problems such as [3, 9, 10]. The common idea behind all these methods is to modify the cell problem (30.5) so that the effect of the artificial boundary conditions (e.g., posed in (30.5)) on the computed homogenized coefficient (30.4) is significantly reduced. The time dependent approaches from [2, 4, 8] use either the wave equation or the heat equation as the local problem, and an averaging in time is also needed when the homogenized coefficient a⁰ is to be approximated. On the other hand, elliptic local problems from [3, 9, 10] are based on adding a correction term to the cell problem (30.5) to reduce the effect of the inaccurate boundary conditions. The main aim of this paper is connected to the wave approach from [8]. Unlike all other available methods in the literature, which aim at reducing the boundary error, the wave approach is known to remove the boundary error totally due to the finite

³ This error holds for periodic multiscale coefficients with period ε.


D. Arjmand

speed of propagation of waves; namely the waves near the boundary will not have enough time to pollute the interior solution if the computational domain is chosen to be sufficiently large. On the other hand, one of the main disadvantages with the wave approach is that the size of the computational domain will increase linearly with the wave speed, and the usefulness of this strategy is questionable when the wave speed increases. The objective of the present report is to bypass this computational problem by integrating a perfectly matched layer (PML) to the time dependent local problem from [8], which uses the second order wave equation as the local problem, and to explore how the PML affects the computational cost and accuracy. This will then allow for decreasing the computational complexity of the wave approach as the wave speed increases.

30.2 The Wave Approach to Approximate a⁰

Before reviewing the wave approach from [8], we introduce the notion of averaging kernels, which is needed when computing the averages of oscillating functions.

Definition 30.1 We say that a function μ : [−1/2, 1/2] → ℝ₊ belongs to the space K^q with q ≥ 0 if

(i) μ ∈ C^q([−1/2, 1/2]) ∩ W^{q+1,∞}((−1/2, 1/2)),
(ii) ∫_{−1/2}^{1/2} μ(x) dx = 1,
(iii) μ^{(k)}(−1/2) = μ^{(k)}(1/2) = 0 for all k ∈ {0, . . . , q − 1}.

In multi-dimensions the filter μ_η with η > 0 is defined by

μ_η(x) := η^{−d} ∏_{i=1}^{d} μ(x_i/η),

where x = (x_1, x_2, . . . , x_d) ∈ ℝ^d. The averaging kernels μ_η can be used to accelerate the computation of averages of oscillating functions. In particular, the following theorem, proved in [8], shows an estimate for the decay of the averaging error in the context of periodic functions.

Theorem 30.1 Let f be a bounded 1-periodic function such that f ∈ L^∞([−1/2, 1/2]), and let μ ∈ K^q. Then

| ∫_{−η/2}^{η/2} μ_η(x) f(x/ε) dx − ∫_{−1/2}^{1/2} f(x) dx | ≤ C | f |_∞ (ε/η)^{q+2},   (30.6)

where C is a constant which is independent of ε, η, f but may depend on μ.

30 Numerical Upscaling via the Wave Equation with Perfectly Matched Layers


Fig. 30.1 Left: An averaging kernel μη with q = 4 and η = 0.2, as well as an oscillatory function f (x/ε) = 1.1 + sin(2π x/ε)2 with period ε = 0.1 are depicted. Right: The decay of the error as ε → 0 is demonstrated, see the estimate (30.6). The convergence rate is observed to be O((ε/η)6 ) as expected theoretically
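The experiment of Fig. 30.1 can be reproduced in a few lines. The sketch below uses a simple polynomial kernel μ(x) = 30(1/4 − x²)², which lies in K² (the figure uses a q = 4 kernel), and compares it against a naive box average; the values of ε, η and the quadrature resolution are our own choices:

```python
import math

def mu(x):
    # Normalized polynomial kernel on [-1/2, 1/2]: mu and mu' vanish at the
    # endpoints, so mu belongs to K^2 in the sense of Definition 30.1.
    return 30.0 * (0.25 - x * x) ** 2

def weighted_avg(f, eta, kernel, n=100000):
    # Trapezoidal rule for int_{-eta/2}^{eta/2} kernel(x/eta)/eta * f(x) dx.
    h = eta / n
    total = 0.0
    for i in range(n + 1):
        x = -eta / 2.0 + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * kernel(x / eta) / eta * f(x)
    return total * h

eps, eta = 0.03, 0.5
f = lambda x: 1.1 + math.sin(2.0 * math.pi * x / eps) ** 2  # exact mean 1.6
err_smooth = abs(weighted_avg(f, eta, mu) - 1.6)
err_box = abs(weighted_avg(f, eta, lambda y: 1.0) - 1.6)
print(err_smooth, err_box)  # the smooth kernel is far more accurate
```

With q = 2 the estimate (30.6) predicts a decay of order (ε/η)⁴ for the smooth kernel, against first order for the box average.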

Note that a naive averaging (when μ_η = 1/η for −η/2 < x < η/2) corresponds to q = −1, and will lead to a first order convergence rate in (30.6). This is excluded from the very definition of K^q, but the estimate (30.6) still holds in this case, see [8]. For a numerical illustration of the convergence rate see the numerical results in Fig. 30.1, where the error is computed by subtracting the weighted average (with q = 4) from the true average of a periodic function.

Now we are ready to introduce the wave approach from [8]. For this, let I_τ := (0, τ/2] and Ω_{x_M,η} := x_M + [−L_{η,τ}, L_{η,τ}]^d be the localized temporal and spatial domains centered at t = 0 and x_M ∈ Ω respectively, and L_{η,τ} := η/2 + (τ/2)√(|a^ε|_∞). The local cell problem is modelled by the following second order wave equation

∂_tt u_i^{ε,η}(t, x) − ∇·( a^ε(x) ∇u_i^{ε,η}(t, x) ) = ∇·( a^ε(x) e_i ),  in I_τ × Ω_{x_M,η},
u_i^{ε,η}(0, x) = ∂_t u_i^{ε,η}(0, x) = 0,   u_i^{ε,η}(t, x) = 0 on ∂Ω_{x_M,η}.   (30.7)

The dependency of the computational domain Ω_{x_M,η} on the wave speed is there to ensure that the boundary conditions in (30.7) do not affect the solution in the region I_τ × [x_M − η/2, x_M + η/2]^d, which is used in the computation of the homogenized coefficient, cf. (30.8). Then the approximate homogenized coefficient is given by

e_i · ã⁰(x_M) · e_j = ∫_{−τ/2}^{τ/2} ∫_{Ω_{x_M,η}} μ_η(x − x_M) μ_τ(t) ( a_{ij}^ε(x) + Σ_{k=1}^{d} a_{ik}^ε(x) ∂_{x_k} u_i^{ε,η}(t, x) ) dx dt.   (30.8)


Note that in (30.8) the averaging formula requires the solution of the wave equation backward in time. This does not incur any additional computational cost since u_i^{ε,η}(t, x) = u_i^{ε,η}(−t, x), which can be easily verified using (30.7).

Remark 30.1 The motivation behind the wave approach is as follows. In formula (30.8) there are averaging kernels both in space and time. The only time dependent quantity in the right hand side is the solution u_i^{ε,η}(t, x) to the wave equation (30.7). Therefore, we can apply the time averaging first. In other words, let

χ^ε(x) := ∫_{−τ/2}^{τ/2} μ_τ(t) u_i^{ε,η}(t, x) dt.

Then one can use an eigenfunction expansion and prove that χ^ε(x) = χ(x/ε), where χ solves the cell problem (30.5) up to some averaging error, see [7, 8] for mathematical details.

If a^ε is periodic with period ε < η ≪ 1, and η = τ, then the following estimate follows (see [8]):

‖a⁰ − ã⁰‖_F ≤ C (ε/η)^{q+2},   (30.9)

where ‖·‖_F is the standard Frobenius norm for matrices. Moreover, in the setting of locally-periodic coefficients when a^ε(x) = a(x, x/ε), and a is 1-periodic in the second argument, the following estimate holds (see [7]):

‖a⁰ − ã⁰‖_F ≤ C ( (ε/η)^{q−1} + ε + ε^{−5} η^{7} ).

The main idea in this paper is to use perfectly matched layers in combination with the wave equation (30.7) in order to get rid of the strong linear dependency of the size of the localized domain Ω_{x_M,η} on the wave speed a^ε. Another important goal is to numerically check if the convergence rate in (30.9) still holds in the presence of a PML.

30.3 Perfectly Matched Layer for the Second Order Wave Equation

To motivate the basic idea behind PML, suppose that we are interested in the behaviour of a wave in a limited region while assuming that the waves propagate over the entire ℝ^d. Simulating such a problem directly as an infinite domain problem is impossible due to the limited computational power (memory) of present computers. The general idea behind PML is then to first select a computational


domain, say D, and then surround the domain with perfectly matched layers, where the outgoing waves are exponentially damped once they enter the PML region. Therefore no reflections from the artificial boundary of the extended computational domain are obtained. This allows for simulating the wave propagating over the entire ℝ^d. The idea of PML gained popularity after the celebrated work by Engquist and Majda [13] on absorbing boundary conditions for numerical simulation of waves. The amount of literature addressing the design of PML for various problems is extensive, but, to cite a few, we refer the reader to [1, 5, 11] and the references therein for the general theory and construction of PMLs. For the sake of completeness, we review an efficient design of PML presented in [11]. The advantage of using the PML from [11] lies in the fact that it uses the wave equation directly in the second order form, while other works in the literature transform the wave equation into a system of first order hyperbolic PDEs, introducing additional auxiliary variables to deal with in numerics. To present the construction idea from [11], we consider the second order wave equation

∂_tt u(t, x) − ∂_x( a(x) ∂_x u(t, x) ) = f(t, x),  in ℝ × (0, T],
u(0, x) = g(x),   ∂_t u(0, x) = h(x).   (30.10)

Here it is assumed that the initial data g and h are nonzero only in a region of interest, say D := [−δ, δ]. We start by taking the Laplace transform of u. Therefore, let

û(x, s) := ∫₀^∞ e^{−st} u(x, t) dt,   s ∈ ℂ.

Then

s² û(x, s) − s g(x) − h(x) = ∂_x( a ∂_x û(x, s) ) + f̂(s, x).   (30.11)

Consider the coordinate transformation

x̃ = x + (1/s) ∫₀ˣ ζ(z) dz,   and   ∂/∂x̃ = ( s/(s + ζ) ) ∂/∂x.

Here ζ is a damping profile which vanishes inside the domain D, and is nonzero inside the absorbing layer defined in the region D̃\D, where D̃ := [−δ − l, δ + l]. Note that when x ∈ D then x = x̃, and the coordinate x̃ is stretched once x is in the absorbing layer. Moreover, let γ = (s + ζ)/s, and let us require that (30.11) is satisfied also for the stretched coordinate. Then it follows that

s² γ û − s γ g − γ h = ∂_x( a γ^{−1} ∂_x û ) + γ f̂,

which can be re-written as

s² û − s g − h + (s û − g) ζ = ∂_x( a ∂_x û ) − ∂_x( a (ζ/(s + ζ)) ∂_x û ) + f̂ + (ζ/s)( f̂(s, x) + h ),

where we define Φ̂(s, x) := −a (ζ/(s + ζ)) ∂_x û. Hence, taking the inverse Laplace transform we obtain

∂_tt u(t, x) + ζ(x) ∂_t u(t, x) = ∂_x( a ∂_x u(t, x) ) + ∂_x Φ(t, x) + f(t, x) + ζ(x) ( ∫₀ᵗ f(τ, x) dτ + h(x) ),
∂_t Φ(t, x) = −ζ(x) Φ(t, x) − a ζ(x) ∂_x u(t, x).   (30.12)
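The auxiliary variable Φ is defined so that its Laplace transform satisfies Φ̂ = −a ζ/(s + ζ) ∂_x û; multiplying through by (s + ζ) gives exactly the ODE ∂_tΦ = −ζΦ − aζ∂_x u in (30.12). The algebra can be verified numerically on arbitrary sample values (our own check, not from the chapter):

```python
# Check the Laplace-domain identity behind the Phi-equation in (30.12):
# Phi_hat = -a*zeta/(s+zeta)*ux_hat  implies  s*Phi_hat = -zeta*Phi_hat - a*zeta*ux_hat.
s, z, a, ux = 2.0, 3.0, 1.5, 0.7   # arbitrary sample values
Phi = -a * z / (s + z) * ux
assert abs(s * Phi - (-z * Phi - a * z * ux)) < 1e-12
print("identity holds")
```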

∂t Φ = −ζ Φ − aζ ∂x u(t, x). (30.12) Remark 30.2 Note that to solve (30.12) we need the initial data for Φ(t, x). In general, the initial data Φ(0, x) can be computed using the transformation mentioned above in the derivation. Remark 30.3 In principle, the choice of the damping factor ζ (x) in (30.12) is arbitrary provided that it vanishes inside D. Later, in the numerical experiments, the choice of ζ will be made explicit for the reader. To simplify the exposition, the ideas in this section were presented in a onedimensional setting. Generalizations to multi-dimensions relies on the very same principle, with the exception that there will be d damping factors responsible for the wave attenuation in every dimension, see [11]. Moreover, the questions in relation with the strong stability of the PML presented in this section is also addressed in [11], and are skipped in the present report to ease the readability.

30.3.1 The New Local Problem Based on the Wave Equation Combined with PML

We are now in a position to present our new formulation for approximating the homogenized coefficients for multiscale elliptic PDEs. Again, we adopt a one-dimensional notation, but the ideas are easily generalizable to any dimension. In the new formulation, we apply the PML formulation (30.12) from the previous section to the microscale problem (30.7). Therefore, setting g = h = 0 and f = ∂_x a^ε(x), we obtain

∂_tt u^{ε,η}(t, x) + ζ(x) ∂_t u^{ε,η}(t, x) = ∂_x( a^ε(x) ∂_x u^{ε,η}(t, x) ) + ∂_x Φ(t, x) + (1 + ζ(x) t) ∂_x a^ε(x),  in I_τ × K_{x_M,η,δ_b,δ},
∂_t Φ(t, x) = −ζ(x) Φ(t, x) − a^ε(x) ζ(x) ∂_x u^{ε,η}(t, x).   (30.13)


Here it is assumed that K_{x_M,η,δ_b,δ} := x_M + [−(η + δ_b + δ)/2, (η + δ_b + δ)/2], where δ > 0 is the size of the absorbing layer, which is essentially independent of the wave speed, η is the size of the averaging domain, and δ_b is a buffer zone between the averaging domain and the absorbing layer. As a part of this study, a nonzero buffer zone is observed to be essential for obtaining a decaying error as the parameter η increases. This is mainly due to the fact that the PML formulation typically assumes a constant wave speed in the absorbing layer; our original wave speed, however, is oscillatory, and that causes small (but computationally acceptable) reflections from the absorbing layer. Apart from the introduction of the PML, the main difference between (30.13) and (30.7) is related to the domain size. In other words, the domain size in (30.7) increases linearly with respect to the wave speed, while the domain K_{x_M,η,δ_b,δ} used in the new formulation (30.13) is independent of the wave speed, thereby leading to a significant reduction in the computational cost of the upscaling procedure. Finally, once Eq. (30.13) is solved, one can use the following averaging formula to approximate the homogenized coefficient:

e_i · ã⁰(x_M) · e_j = 2 ∫_{I_τ} ∫_{Ω_{x_M,η}} μ_η(x − x_M) μ_τ(t) ( a_{ij}^ε(x) + Σ_{k=1}^{d} a_{ik}^ε(x) ∂_{x_k} u_i^{ε,η}(t, x) ) dx dt.   (30.14)

It is worth mentioning that, in principle any η and τ larger than ε can be chosen in a numerical simulation. In practice, however, it is favourable to choose η and τ as small as possible, e.g., τ = η = 10ε, so that the cost of solving the problem (30.13) becomes independent of the small scale ε. Moreover, note that the factor 2 in front of (30.14) is due to the fact that the solution of the wave equation has symmetry with ε,η ε,η respect to time, i.e., u i (t, x) = u i (−t, x).

30.4 Numerical Discretization

For the numerical discretization, we consider finite difference approximations in time and space of the derivatives in (30.13). For this we let x_i = x_M − (η + δ_b + δ)/2 + iΔx, i = 0, 1, . . . , N_x, with N_x Δx = η + δ_b + δ, which corresponds to the discrete points in space. Similarly, we discretize the time interval I_τ by t_n = nΔt, n = 0, 1, . . . , N_t, with N_t Δt = τ/2. Moreover, by⁴ u^{n,i} we denote the approximation to u^{ε,η}(t_n, x_i) in (30.13) and establish the following difference formula for a numerical approximation:

⁴ For simplicity we skip the dependency of the discrete solution on the parameters η and ε.


( u^{n+1,i} − 2u^{n,i} + u^{n−1,i} )/Δt² + ζ(x_i) ( u^{n+1,i} − u^{n−1,i} )/(2Δt)
  = ( a^{ε,i+1/2}( u^{n,i+1} − u^{n,i} ) − a^{ε,i−1/2}( u^{n,i} − u^{n,i−1} ) )/Δx² + ( Φ^{n,i+1} − Φ^{n,i−1} )/(2Δx) + (1 + ζ(x_i) t_n)( a^{ε,i+1/2} − a^{ε,i−1/2} )/Δx,

( Φ^{n,i} − Φ^{n−1,i} )/Δt = −ζ(x_i) Φ^{n,i} − a^{ε,i} ζ(x_i) ( u^{n,i+1/2} − u^{n,i−1/2} )/Δx.

Here the initial data u^{0,i} = Φ^{0,i} = 0, and u^{1,i} is computed using Eq. (30.13), similar to the standard leap-frog scheme. Note that the differential equation describing the dynamics of Φ is discretized using a first order difference approximation in time (unlike the first equation). The difference approximation can easily be generalized to higher order methods in time and space. This issue is not included in this report, since the main objective has been to study the accuracy of the microscale model (30.13) in relation to the approximation of the homogenized coefficient a⁰. As previously stated, the choice of damping factor is arbitrary, but here we use the very same function as defined in [11]. The damping factor reads as

ζ(x) = 0,   x ∈ [−η/2, η/2],
ζ(x) = C ( |x − η/2|/δ − sin( 2π|x − η/2|/δ )/(2π) ),   η/2 < |x| < η/2 + δ/2.

Here C is an appropriate constant which can be adjusted according to the spatial step-size as well as the size of the absorbing layer. Moreover, the size of the absorbing layer δ will be made clear in the computational results section.

30.5 Computational Results

In this section, we present numerical results to demonstrate the following:

i. A nonzero buffer zone between the averaging domain and the absorbing layer is needed to obtain decaying upscaling errors.
ii. We aim at understanding the effect of enlarging the absorbing layer on the error.
iii. We are interested in understanding the effect of reducing the size of the averaging domain on the error.

To simulate a multiscale wave propagation problem, we consider the ε-periodic multiscale coefficient a^ε(x) = 2.1 + sin(2πx/ε). The exact homogenized coefficient is given by


a⁰(x) = √(2.1² − 1).
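In one dimension the homogenized coefficient is the harmonic mean of the coefficient over one period, which gives this closed form; a quick standalone numerical check (not part of the chapter's solver):

```python
import math

# 1-D homogenization: a0 is the harmonic mean of a(y) = 2.1 + sin(2*pi*y)
# over one period, with closed form sqrt(2.1**2 - 1) = sqrt(3.41).
N = 100_000  # midpoint rule; spectrally accurate for smooth periodic integrands
mean_inv = sum(1.0 / (2.1 + math.sin(2.0 * math.pi * (i + 0.5) / N))
               for i in range(N)) / N
a0 = 1.0 / mean_inv
print(a0)  # close to 1.84662
```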

We solve the microscale problem using the discretization introduced in the previous section. In all the simulations we use 50 points in space per period, and we let the overall domain size η + δ_b + δ increase, while τ = η. Moreover, we fix the degrees of freedom in time to 3000 points uniformly distributed in I_τ. In all the simulations the small scale parameter is fixed to ε = 0.01.

In Fig. 30.2, we study the effect of the buffer zone on the convergence rate. As expected, the numerical simulation demonstrates that if there is no buffer zone in the computations, then the overall error is dominated by an error stemming from the variable wave speed, which causes reflections from the absorbing layer that pollute the solution in the averaging domain. This is evident from the plateau depicted in Fig. 30.2, which becomes more pronounced as the size of the computational domain increases. In Fig. 30.3, we add a nonzero buffer zone of size δ_b = 3|K_{x_M,η,δ_b,δ}|/8, with the very same discrete parameters as in the previous simulation. The addition of a small buffer zone seems to result in a better accuracy for the approximation of the homogenized coefficient, and a plateau is now observed only at lower error tolerances. The size of the buffer zone is then changed to δ_b = 2|K_{x_M,η,δ_b,δ}|/8 in Fig. 30.4, while the size of the absorbing layer is increased accordingly. The simulations demonstrate that this strategy does not lead to a significant change in the decay of the error in comparison to Fig. 30.3. In Figs. 30.5 and 30.6, we change the size of the averaging domain to η = 3|K_{x_M,η,δ_b,δ}|/8 and η = |K_{x_M,η,δ_b,δ}|/4 respectively. We observe that as the size of the averaging domain decreases, the plateau seems to disappear and we recover the precise convergence rate O((ε/η)^{q+2}), consistent with the ideal (expected) estimate from (30.9). The simulation from Fig. 30.6 results in the best approximation due to the combined effect of reducing the size of the averaging domain and enlarging the buffer zone.

Fig. 30.2 The effect of having δb = 0 is depicted. This result demonstrates the need for having a nonzero buffer zone to get decaying errors as the domain size |K x M ,η,δb ,δ | increases. Moreover, δ = |K x M ,η,δb ,δ |/8 is used in the simulation


Fig. 30.3 Here we choose η = |K x M ,η,δb ,δ |/2, and δ = |K x M ,η,δb ,δ |/8, adding a nonzero buffer zone to the previous computation from Fig. 30.2. The plateau from Fig. 30.2 seems to occur at a lower error tolerance of order 10−4

Fig. 30.4 Here we choose η = |K x M ,η,δb ,δ |/2, and δ = 3|K x M ,η,δb ,δ |/8, adding a nonzero buffer zone (but smaller than Fig. 30.3). This change does not seem to result in any drastic change in the decay of the error

30.6 Concluding Remarks

In the present article, we use a perfectly matched layer in combination with a second order wave equation to reduce the resonance error present in typical multiscale numerical methods. We have run numerical simulations to analyze the dependency of the resonance error on the parameters associated with the PML. In particular, we found that by simultaneously reducing the size of the averaging domain and enlarging the buffer zone in the PML formulation, the previously known convergence rates can be recovered at a lower computational cost, as the overall computational geometry no longer depends on the wave speed. The approach is general, as it


Fig. 30.5 Here we choose η = 3|K x M ,η,δb ,δ |/8, and δ = 3|K x M ,η,δb ,δ |/8. This change seems to result in a decrease in the error and recovering the correct convergence rate

Fig. 30.6 Here we choose η = |K x M ,η,δb ,δ |/4, and δ = 3|K x M ,η,δb ,δ |/8. This change seems to result in a decrease in the error and recovering the correct convergence rate

can easily be applied to wave propagation problems in higher dimensions. Moreover, when the main problem is in non-divergence form, the present approach can be glued together with known multiscale algorithms such as [6].


References

1. Abarbanel, S., Gottlieb, D., Hesthaven, J.S.: Long time behavior of the perfectly matched layer equations in computational electromagnetics. J. Sci. Comput. 17(1–4), 405–422 (2002)
2. Abdulle, A., Arjmand, D., Paganoni, E.: A parabolic local problem with exponential decay of the resonance error for numerical homogenization (2020). arXiv:2001.05543
3. Abdulle, A., Arjmand, D., Paganoni, E.: An elliptic local problem with exponential decay of the resonance error for numerical homogenization (2020). arXiv:2001.06315
4. Abdulle, A., Arjmand, D., Paganoni, E.: Exponential decay of the resonance error in numerical homogenization via parabolic and elliptic cell problems. Comptes Rendus Mathematique 357(6), 545–551 (2019)
5. Appelo, D., Hagstrom, T., Kreiss, G.: Perfectly matched layers for hyperbolic systems: general formulation, well-posedness, and stability. SIAM J. Appl. Math. 67(1), 1–23 (2006)
6. Arjmand, D., Kreiss, G.: An equation-free approach for second order multiscale hyperbolic problems in non-divergence form. Commun. Math. Sci. 16(8), 2317–2343 (2018)
7. Arjmand, D., Runborg, O.: Estimates for the upscaling error in heterogeneous multiscale methods for wave propagation problems in locally periodic media. Multiscale Model. Simul. 15(2), 948–976 (2017)
8. Arjmand, D., Runborg, O.: A time dependent approach for removing the cell boundary error in elliptic homogenization problems. J. Comput. Phys. 314, 206–227 (2016)
9. Blanc, X., Le Bris, C.: Improving on computation of homogenized coefficients in the periodic and quasi-periodic settings. Netw. Heterog. Media 5(1), 1–29 (2010)
10. Gloria, A., Habibi, Z.: Reduction in the resonance error in numerical homogenization II: correctors and extrapolation. Found. Comput. Math. 16(1), 217–296 (2016)
11. Grote, M.J., Sim, I.: Efficient PML for the wave equation (2010). arXiv:1001.0319
12. Weinan, E., Engquist, B.: The heterogeneous multiscale methods. Commun. Math. Sci. 1(1), 87–133 (2003)
13. Engquist, B., Majda, A.: Absorbing boundary conditions for numerical simulation of waves. Proc. Natl. Acad. Sci. 74(5), 1765–1766 (1977)
14. Hou, T.Y., Wu, X.H.: A multiscale finite element method for elliptic problems in composite materials and porous media. J. Comput. Phys. 134, 169–189 (1997)

Chapter 31

Homotopy Analysis Method (HAM) for Differential Equations Pertaining to the Mixed Convection Boundary-Layer Flow over a Vertical Surface Embedded in a Porous Medium

Imran M. Chandarki and Brijbhan Singh

Abstract The objective of the present work is to revisit the problem of a vertically flowing fluid past a model of a thin vertical fin in a saturated porous medium. The governing equations have been simplified using a similarity transformation to yield ordinary differential equations. These equations have been solved by the homotopy analysis method (HAM). It is shown that the solution has two branches in a certain range of the mixed convection and surface temperature parameters. The effects of these parameters on the velocity distribution are presented graphically. The results obtained by HAM have been found to be in good agreement with the corresponding results obtained by other workers.

Keywords Similarity boundary layer equations · Homotopy analysis method · Mixed convection

MSC 2020 76D10 · 76D09 · 76B99

I. M. Chandarki (B)
Department of General Science and Engineering, N. B. Navale Sinhgad College of Engineering, Solapur 413004, Maharashtra, India
e-mail: [email protected]

B. Singh
Department of Mathematics, Dr. Babasaheb Ambedkar Technological University, Lonere 402103, Raigad, Maharashtra, India
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_31

31.1 Introduction

The study of mixed convection boundary-layer flows has received much attention in the last few decades. These flows have applications in fields such as oil reservoir modelling, the analysis of insulating systems, food processing, casting and welding in manufacturing processes, the dispersion of chemical contaminants in industrial processes and in the environment, geothermal energy extraction, etc. The leading contributions pertaining to convection flows in porous media are due to Ingham and Pop [11], Nield and Bejan [21], Vafai [24], Pop and Ingham [22], Bejan and Kraus [4], Ingham et al. [10], Bejan et al. [3], Johnson and Cheng [12], Merkin [18], Harris et al. [9], Magyari et al. [17], and Aly et al. [2]. Similarity solutions have been obtained by numerous research workers for the situation where the free-stream velocity and the surface temperature distribution vary according to the same power function of the distance along the surface. Aly et al. [2] investigated the numerical and asymptotic solutions of the boundary-layer equations of mixed convection resulting from the flow over a heated vertical or horizontal flat plate, for both the aiding situation, where the flow is directed vertically upwards, and the opposing one, where the flow is directed vertically downwards. Merkin [18] examined the effect of opposing buoyancy forces on the uniform boundary-layer flow over a semi-infinite vertical flat plate at constant temperature. This problem was further studied by Wilks [25, 26] and by Hunt and Wilks [8], who also considered the case of uniform flow over a semi-infinite flat plate, but now heated at a constant heat-flux rate. Dual solutions of the mixed convection boundary-layer equations for opposing flows on a vertical surface have been studied by Wilks and Bramley [27], Merkin [18], Merkin and Mahmood [19], Ridha [23], Merkin and Pop [20], among others. In this chapter, the Homotopy Analysis Method (HAM) is used to obtain solutions for the flow past a thin vertical fin, modeled as a fixed semi-infinite vertical surface in a vertically flowing fluid-saturated porous medium maintained at a constant temperature $T_\infty$.
The temperature of the fin, above the ambient temperature $T_\infty$, is assumed to vary as $x^\lambda$, where $x$ is the distance from the tip of the fin and $\lambda$ is a pre-assigned constant. It was Liao [14] who developed an analytic method for strongly nonlinear problems, namely the homotopy analysis method (HAM), which has been successfully applied to many nonlinear problems in science and engineering over the last few decades. The HAM is based on the traditional concept of homotopy in topology. However, in the frame of the HAM, the concept of homotopy is generalized by introducing an auxiliary parameter and an auxiliary function. One first connects a selected initial guess and the unknown solution of a nonlinear problem by constructing such a generalized homotopy with respect to an embedding parameter $q \in [0, 1]$. Then the solution can be expressed as a kind of Taylor series with respect to the embedding parameter $q$ at $q = 0$, and each term of the solution series is governed by a linear equation. In this way, a nonlinear problem is transformed into an infinite sequence of linear problems. However, unlike perturbation techniques, this kind of transformation does not depend on any small or large parameters at all. Besides, unlike all previous analytic techniques, the HAM provides us with a simple way to control and adjust the convergence of the solution series, and also gives us great freedom to choose a proper set of base functions. Furthermore, it logically contains other non-perturbation techniques, such as Lyapunov's small parameter method (Lyapunov [16]), the δ-expansion method (Karmishin et al. [13]), and Adomian's decomposition method (Adomian [1]), as proved by Liao [14]. So, the HAM


is rather general. Liao [15] successfully applied the HAM to the unsteady boundary-layer flows caused by an impulsively stretching plate and obtained analytic series solutions valid for all times $0 \le t < \infty$. In this chapter, we further apply the HAM to give analytic series solutions of the considered problem, which are valid and accurate in the whole region.
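The mechanics of the method can be seen in miniature on a toy problem (our own illustrative example, not part of the chapter): solving $x^2 - 2 = 0$ with linear operator $L[x] = x$, initial guess $x_0 = 1$ and auxiliary parameter $\hbar$. Matching powers of the embedding parameter $q$ in the deformation equation $(1 - q)(\Phi - x_0) = q\hbar(\Phi^2 - 2)$ yields a chain of linear problems, exactly as described above; the names below are ours.

```python
def ham_sqrt2(hbar=-0.5, terms=200):
    """Toy HAM: series terms x_m for x^2 - 2 = 0 with L[x] = x, x0 = 1."""
    x = [1.0]                                   # x0: initial guess
    for m in range(1, terms):
        # q-coefficient of Phi^2 at order m-1: Cauchy convolution of the x_k
        conv = sum(x[k] * x[m - 1 - k] for k in range(m))
        if m == 1:
            x.append(hbar * (conv - 2.0))       # x1 = hbar * N[x0]
        else:
            x.append(x[m - 1] + hbar * conv)    # x_m = x_{m-1} + hbar * [Phi^2]_{m-1}
    return sum(x)                               # homotopy series evaluated at q = 1
```

With $\hbar = -0.5$ the partial sums approach $\sqrt{2}$; the auxiliary parameter $\hbar$ plays exactly the convergence-control role described in the text.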

31.2 Governing Equations

We assume that the convecting fluid and the porous medium are isotropic, are in thermal equilibrium, and have constant physical properties (viscosity, thermal conductivity, thermal expansion coefficient, specific heat and permeability). The flow is assumed to be described by Darcy's law, and the Boussinesq approximation is taken to hold (Fig. 31.1). Let us use the following non-dimensional quantities:

$$x = \frac{\bar{x}}{L}, \quad y = \frac{(Pe)^{1/2}\,\bar{y}}{L}, \quad \psi = \frac{\bar{\psi}}{\alpha_m (Pe)^{1/2}}, \quad \Theta = \frac{T - T_\infty}{T_0 - T_\infty}, \quad (31.1)$$

where $\bar{x}$, $\bar{y}$ are the vertical and horizontal coordinates, respectively, $L$ is an arbitrary length scale, $Pe = U_0 L/\alpha_m$ is the Péclet number, $U_0 = B L^\lambda$ is the reference velocity with $B > 0$ and $\lambda$ a pre-assigned constant, $\bar{\psi}$ is the stream function, $\alpha_m$ is the effective thermal diffusivity of the porous medium, and $T_0 = T_\infty + |A| L^\lambda$ is the reference temperature. Then the governing equations are given by Cheng and Minkowycz [7]:

$$\frac{1}{Pe}\frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} = \frac{Ra}{Pe}\frac{\partial \Theta}{\partial y}, \quad (31.2)$$

Fig. 31.1 Mathematical model of a vertically flowing fluid past a thin vertical fin in a saturated porous medium

$$\frac{\partial \psi}{\partial y}\frac{\partial \Theta}{\partial x} - \frac{\partial \psi}{\partial x}\frac{\partial \Theta}{\partial y} = \frac{1}{Pe}\frac{\partial^2 \Theta}{\partial x^2} + \frac{\partial^2 \Theta}{\partial y^2}, \quad (31.3)$$

where $Ra = \dfrac{g_a K_k \beta_T L (T_0 - T_\infty)}{\nu \alpha_m}$ is the Rayleigh number, $g_a$ is the magnitude of the acceleration due to gravity, $K_k$ is the permeability of the porous medium, $\beta_T$ is the coefficient of thermal expansion, $\nu$ is the kinematic viscosity of the fluid, and $\alpha_m$ is the effective thermal diffusivity. Assuming that the Péclet number is very large, the resulting temperature boundary layer becomes analogous to that in classical boundary-layer theory. Therefore, letting $Pe \to \infty$ in Eqs. (31.2) and (31.3), we obtain the following boundary-layer equations:

$$\frac{\partial^2 \psi}{\partial y^2} = \varepsilon \frac{\partial \Theta}{\partial y}, \quad (31.4)$$

$$\frac{\partial \psi}{\partial y}\frac{\partial \Theta}{\partial x} - \frac{\partial \psi}{\partial x}\frac{\partial \Theta}{\partial y} = \frac{\partial^2 \Theta}{\partial y^2}, \quad (31.5)$$

where the mixed convection parameter $\varepsilon$ is defined as

$$\varepsilon = \frac{Ra}{Pe}. \quad (31.6)$$

Equations (31.4) and (31.5) have to be solved subject to the boundary conditions that there is no normal velocity on the plate, the temperature of the plate is $T_\infty + A\bar{x}^\lambda$, and the vertical component of the fluid velocity at the edge of the boundary layer is $B\bar{x}^\lambda$. On using the non-dimensional quantities (31.1), the boundary conditions become

$$\psi = 0, \quad \Theta = x^\lambda \ (A > 0) \quad \text{or} \quad \Theta = -x^\lambda \ (A < 0), \quad \text{on } y = 0, \ 0 < x < \infty,$$

with $\partial \psi/\partial y \to x^\lambda$ as $y \to \infty$, the aiding and opposing flows corresponding to $\varepsilon > 0$ and $\varepsilon < 0$, respectively.

31.3 Homotopy Analysis Solution

To seek the HAM solution of Eq. (31.14), we select

$$f_0(\eta) = 1 + \eta - e^{-\varepsilon\eta} \quad (31.16)$$

as the initial approximation of $f$, and

$$\mathcal{L}\left[\hat{f}(\eta; q)\right] = \frac{\partial^3 \hat{f}(\eta; q)}{\partial \eta^3} + \frac{\partial^2 \hat{f}(\eta; q)}{\partial \eta^2} \quad (31.17)$$

as the auxiliary linear operator, which satisfies

$$\mathcal{L}\left[C_1 + \eta C_2 + C_3 e^{-\eta}\right] = 0, \quad (31.18)$$


where $C_i$ ($i = 1, 2, 3$) are arbitrary constants. If $q \in [0, 1]$ is an embedding parameter and $\hbar$ is a non-zero auxiliary parameter (also called the convergence-control parameter), the zeroth-order deformation problem corresponding to (31.14)–(31.15) becomes

$$(1 - q)\,\mathcal{L}\left[\hat{f}(\eta; q) - f_0(\eta)\right] = q\hbar\, \mathcal{N}\left[\hat{f}(\eta; q)\right], \quad (31.19)$$

$$\hat{f}(0; q) = 0, \quad \hat{f}'(0; q) = 1 + \varepsilon, \quad \hat{f}'(\infty; q) = 1, \quad (31.20)$$

where

$$\mathcal{N}\left[\hat{f}(\eta; q)\right] = \frac{\partial^3 \hat{f}(\eta; q)}{\partial \eta^3} + (1 + \lambda)\,\hat{f}(\eta; q)\,\frac{\partial^2 \hat{f}(\eta; q)}{\partial \eta^2} + 2\lambda\left(1 - \frac{\partial \hat{f}(\eta; q)}{\partial \eta}\right)\frac{\partial \hat{f}(\eta; q)}{\partial \eta}, \quad (31.21)$$

and the $m$th-order deformation problem becomes

$$\mathcal{L}\left[f_m(\eta) - \chi_m f_{m-1}(\eta)\right] = \hbar R_m(\eta), \quad (31.22)$$

$$f_m(0) = 0, \quad f_m'(0) = f_m'(\infty) = 0, \quad (31.23)$$

$$R_m(\eta) = f_{m-1}'''(\eta) + 2\lambda f_{m-1}'(\eta) + \sum_{k=0}^{m-1}\left[(1 + \lambda)\, f_{m-k-1}(\eta)\, f_k''(\eta) - 2\lambda\, f_{m-k-1}'(\eta)\, f_k'(\eta)\right], \quad (31.24)$$

where

$$\chi_m = \begin{cases} 0, & m \le 1, \\ 1, & m > 1, \end{cases} \qquad \text{and} \qquad f_m(\eta) = \sum_{n=0}^{2m+1} a_{m,n}\, e^{-n\eta}. \quad (31.25)$$

Here, Mathematica has been used to find the solution of Eq. (31.22) up to the first few orders of approximation. Now, from (31.25), we obtain

$$f_m'(\eta) = \sum_{n=0}^{2m+1} (-n)\, a_{m,n}\, e^{-n\eta} = \sum_{n=0}^{2m+1} a1_{m,n}\, e^{-n\eta}, \quad (31.26)$$

where $a1_{m,n} = (-n)\, a_{m,n}$. From (31.26), we obtain

$$f_m''(\eta) = \sum_{n=0}^{2m+1} (-n)\, a1_{m,n}\, e^{-n\eta} = \sum_{n=0}^{2m+1} a2_{m,n}\, e^{-n\eta}, \quad (31.27)$$

where $a2_{m,n} = (-n)\, a1_{m,n}$. From (31.27), we obtain

$$f_m'''(\eta) = \sum_{n=0}^{2m+1} (-n)\, a2_{m,n}\, e^{-n\eta} = \sum_{n=0}^{2m+1} a3_{m,n}\, e^{-n\eta}, \quad (31.28)$$

where $a3_{m,n} = (-n)\, a2_{m,n}$. From (31.28), we obtain

$$f_{m-1}'''(\eta) = \sum_{n=0}^{2m-1} a3_{m-1,n}\, e^{-n\eta}. \quad (31.29)$$

Again, from (31.26), we obtain

$$f_{m-1}'(\eta) = \sum_{n=0}^{2m-1} a1_{m-1,n}\, e^{-n\eta}. \quad (31.30)$$
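The bookkeeping of (31.26)–(31.28) simply multiplies each exponential coefficient by $-n$; a small helper (an illustrative sketch with our own naming, not from the chapter) makes this explicit.

```python
def deriv_coeffs(a):
    # if f(eta) = sum_n a[n] * exp(-n*eta), then
    # f'(eta) = sum_n (-n * a[n]) * exp(-n*eta), i.e. a1[n] = -n * a[n]
    return [-n * c for n, c in enumerate(a)]
```

Applying it once, twice and three times produces the arrays $a1$, $a2$ and $a3$ of (31.26)–(31.28).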

From (31.25) and (31.27), we obtain

$$f_{m-k-1}(\eta)\, f_k''(\eta) = \sum_{r=0}^{2m-2k-1} a_{m-k-1,r}\, e^{-r\eta} \sum_{s=0}^{2k+1} a2_{k,s}\, e^{-s\eta} = \sum_{r=0}^{2m-2k-1} \sum_{s=0}^{2k+1} a2_{k,s}\, a_{m-k-1,r}\, e^{-(r+s)\eta}. \quad (31.31)$$

Let us put $r + s = n$ in (31.31) to obtain

$$f_{m-k-1}(\eta)\, f_k''(\eta) = \sum_{n=0}^{2m} e^{-n\eta} \sum_{s=\max\{0,\, n-2m+2k+1\}}^{\min\{n,\, 2k+1\}} a2_{k,s}\, a_{m-k-1,n-s}. \quad (31.32)$$

Taking $\sum_{k=0}^{m-1}$ on both sides, we obtain

$$\sum_{k=0}^{m-1} f_{m-k-1}(\eta)\, f_k''(\eta) = \sum_{k=0}^{m-1} \sum_{n=0}^{2m} e^{-n\eta} \left[\sum_{s=\max\{0,\, n-2m+2k+1\}}^{\min\{n,\, 2k+1\}} a2_{k,s}\, a_{m-k-1,n-s}\right], \quad (31.33)$$

$$\sum_{k=0}^{m-1} f_{m-k-1}(\eta)\, f_k''(\eta) = \sum_{n=0}^{2m} e^{-n\eta} \left[\sum_{k=0}^{m-1} \sum_{s=\max\{0,\, n-2m+2k+1\}}^{\min\{n,\, 2k+1\}} a2_{k,s}\, a_{m-k-1,n-s}\right] = \sum_{n=0}^{2m} e^{-n\eta}\, \delta1_{m,n}, \quad (31.34)$$

where $\delta1_{m,n} = \sum_{k=0}^{m-1} \sum_{s=\max\{0,\, n-2m+2k+1\}}^{\min\{n,\, 2k+1\}} a2_{k,s}\, a_{m-k-1,n-s}$.

Similarly, along the lines of (31.34) and from (31.26) and (31.30),

$$\sum_{k=0}^{m-1} f_{m-k-1}'(\eta)\, f_k'(\eta) = \sum_{n=0}^{2m} e^{-n\eta} \left[\sum_{k=0}^{m-1} \sum_{s=\max\{0,\, n-2m+2k+1\}}^{\min\{n,\, 2k+1\}} a1_{k,s}\, a1_{m-k-1,n-s}\right] = \sum_{n=0}^{2m} e^{-n\eta}\, \delta2_{m,n}, \quad (31.35)$$

where $\delta2_{m,n} = \sum_{k=0}^{m-1} \sum_{s=\max\{0,\, n-2m+2k+1\}}^{\min\{n,\, 2k+1\}} a1_{k,s}\, a1_{m-k-1,n-s}$.
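The inner sums defining $\delta1$ and $\delta2$ are exactly the Cauchy product of two coefficient arrays; a compact sketch (our own helper, names assumed) is:

```python
def product_coeffs(a, b):
    # coefficients of (sum_r a[r] e^{-r eta}) * (sum_s b[s] e^{-s eta});
    # entry c[n] collects all pairs with r + s = n, which is precisely the
    # truncated inner sum over s appearing in (31.32)-(31.35)
    c = [0.0] * (len(a) + len(b) - 1)
    for r, ar in enumerate(a):
        for s, bs in enumerate(b):
            c[r + s] += ar * bs
    return c
```

$\delta1_{m,n}$ is then the sum over $k$ of `product_coeffs` applied to the $a2$-coefficients of $f_k''$ and the $a$-coefficients of $f_{m-k-1}$, and $\delta2_{m,n}$ is the analogue with the $a1$-coefficients.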

Now, from Eqs. (31.22) and (31.24), we obtain

$$\hbar R_m = \hbar\left[f_{m-1}'''(\eta) + 2\lambda f_{m-1}'(\eta) + \sum_{k=0}^{m-1}\left((1+\lambda)\, f_{m-k-1}(\eta)\, f_k''(\eta) - 2\lambda\, f_{m-k-1}'(\eta)\, f_k'(\eta)\right)\right]$$
$$= \hbar\sum_{n=0}^{2m-1} e^{-n\eta}\left(a3_{m-1,n} + 2\lambda\, a1_{m-1,n}\right) + \hbar\sum_{n=0}^{2m} e^{-n\eta}\left((1+\lambda)\,\delta1_{m,n} - 2\lambda\,\delta2_{m,n}\right), \quad (31.36)$$

$$\hbar R_m = \hbar\sum_{n=0}^{2m+1} \chi_{2m-n+1}\, e^{-n\eta}\left(a3_{m-1,n} + 2\lambda\, a1_{m-1,n}\right) + \hbar\sum_{n=0}^{2m+1} \chi_{2m-n+1}\, e^{-n\eta}\left((1+\lambda)\,\delta1_{m,n} - 2\lambda\,\delta2_{m,n}\right), \quad (31.37)$$

$$\hbar R_m = \sum_{n=0}^{2m+1} e^{-n\eta}\, \Delta_{m,n}, \quad (31.38)$$

where $\Delta_{m,n} = \hbar\,\chi_{2m-n+1}\left[a3_{m-1,n} + 2\lambda\, a1_{m-1,n} + (1+\lambda)\,\delta1_{m,n} - 2\lambda\,\delta2_{m,n}\right]$. From Eq. (31.22), we therefore have

$$\mathcal{L}\left[f_m(\eta) - \chi_m f_{m-1}(\eta)\right] = \hbar R_m(\eta) = \sum_{n=0}^{2m+1} \Delta_{m,n}\, e^{-n\eta}. \quad (31.39)$$

Taking $\mathcal{L}^{-1}$ on both sides of the above equation, we obtain

$$f_m(\eta) - \chi_m f_{m-1}(\eta) = \sum_{n=0}^{2m+1} \Delta_{m,n}\, \mathcal{L}^{-1}\left[e^{-n\eta}\right]$$
$$= \sum_{n=0}^{2m+1} \Delta_{m,n}\, \frac{1}{D^3 + D^2}\, e^{-n\eta} + c1_m + c2_m \eta + c3_m e^{-\eta}$$
$$= \sum_{n=2}^{2m+1} \Delta_{m,n}\, \frac{e^{-n\eta}}{-n^3 + n^2} + c1_m + c2_m \eta + c3_m e^{-\eta}$$
$$= -\sum_{n=2}^{2m+1} \Delta_{m,n}\, \frac{e^{-n\eta}}{n^2(-1 + n)} + c1_m + c2_m \eta + c3_m e^{-\eta}$$
$$= -\sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{e^{-(n+2)\eta}}{(n+2)^2(1+n)} + c1_m + c2_m \eta + c3_m e^{-\eta}. \quad (31.40)$$

Using Eq. (31.23) in (31.40), we get $c1_m$, $c2_m$ and $c3_m$ as follows:

$$c1_m = \sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{1}{(n+1)(n+2)}\left(\frac{1}{n+2} - 1\right), \quad c2_m = 0, \quad c3_m = \sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{1}{(n+1)(n+2)}.$$

Substituting $c1_m$, $c2_m$ and $c3_m$ in (31.40), we obtain

$$f_m(\eta) = \chi_m f_{m-1}(\eta) - \sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{e^{-(n+2)\eta}}{(n+2)^2(1+n)} + \sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{1}{(n+1)(n+2)}\left(\frac{1}{n+2} - 1\right) + \sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{e^{-\eta}}{(n+1)(n+2)}, \quad (31.41)$$

so that, using (31.25),

$$\sum_{n=0}^{2m+1} a_{m,n}\, e^{-n\eta} = \chi_m \sum_{n=0}^{2m-1} a_{m-1,n}\, e^{-n\eta} - \sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{e^{-(n+2)\eta}}{(n+2)^2(1+n)} + \sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{1}{(n+1)(n+2)}\left(\frac{1}{n+2} - 1\right) + \sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{e^{-\eta}}{(n+1)(n+2)}, \quad (31.42)$$

which gives, after simplification and proper rearrangement,

$$\sum_{n=0}^{2m+1} a_{m,n}\, e^{-n\eta} = \sum_{n=0}^{2m+1} \chi_m \chi_{2m-n+1}\, a_{m-1,n}\, e^{-n\eta} - \sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{e^{-(n+2)\eta}}{(n+2)^2(1+n)} + \sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{1}{(n+1)(n+2)}\left(\frac{1}{n+2} - 1\right) + \sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{e^{-\eta}}{(n+1)(n+2)}. \quad (31.43)$$

By putting $n = 0, 1, \dots$ and comparing the coefficients of like terms on the left- and right-hand sides of (31.43), we get

$$a_{m,0} = \chi_m \chi_{2m+1}\, a_{m-1,0} + \sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{1}{(n+1)(n+2)}\left(\frac{1}{n+2} - 1\right),$$

$$a_{m,1} = \chi_m \chi_{2m}\, a_{m-1,1} + \sum_{n=0}^{2m+1} \Delta_{m,n+2}\, \frac{1}{(n+1)(n+2)},$$

$$a_{m,n+2} = \chi_m \chi_{2m-n-1}\, a_{m-1,n+2} - \Delta_{m,n+2}\, \frac{1}{(n+1)(n+2)^2}.$$

By the HAM, at $q = 0$ and $q = 1$ we have $\hat{f}(\eta; 0) = f_0(\eta)$ and $\hat{f}(\eta; 1) = f(\eta)$, respectively. So, as the embedding parameter $q \in [0, 1]$ increases from 0 to 1, the solution $\hat{f}(\eta; q)$ of the zeroth-order deformation equations varies (or deforms) from the initial guess $f_0(\eta)$ to the exact solution $f(\eta)$ of the original nonlinear differential equation $\mathcal{N}[f(\eta)] = 0$. Such a continuous variation is called a deformation in topology, and this is the reason why (31.19) is called the zeroth-order deformation equation. Since $\hat{f}(\eta; q)$ depends on the embedding parameter $q \in [0, 1]$, we can expand it into a Maclaurin series with respect to $q$:

$$\hat{f}(\eta; q) = f_0(\eta) + \sum_{m=1}^{\infty} f_m(\eta)\, q^m; \quad (31.44)$$

Eq. (31.44) is called the homotopy-Maclaurin series. Assuming that the auxiliary linear operator $\mathcal{L}$ and the initial guess $f_0(\eta)$ are so properly chosen that the homotopy-Maclaurin series converges at $q = 1$, we have the so-called homotopy-series solution

$$f(\eta) = f_0(\eta) + \sum_{m=1}^{\infty} f_m(\eta). \quad (31.45)$$

Using Eqs. (31.43) and (31.45), we obtain

$$f(\eta) = \sum_{m=0}^{\infty} f_m(\eta) = \lim_{N \to \infty}\left[\sum_{m=0}^{N} a_{m,0} + \sum_{n=1}^{2N+1} e^{-n\eta} \sum_{m=n-1}^{2N} a_{m,n}\right], \quad (31.46)$$

in which the coefficients $a_{m,n}$ of $f_m(\eta)$ can be found by using the given boundary conditions together with the initial-guess approximation (31.16). The numerical data are presented graphically.
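As an independent cross-check of the HAM results, the similarity problem can also be integrated numerically by a classical shooting method. The sketch below is our own construction, and it assumes the ODE implied by the nonlinear operator (31.21), namely $f''' + (1+\lambda) f f'' + 2\lambda f'(1 - f') = 0$ with $f(0) = 0$, $f'(0) = 1 + \varepsilon$, $f'(\infty) = 1$; its exact printed form (31.14)–(31.15) is not reproduced in this chunk.

```python
def rhs(y, lam):
    # state y = (f, f', f''); f''' from the assumed similarity equation
    f, fp, fpp = y
    return (fp, fpp, -(1.0 + lam) * f * fpp - 2.0 * lam * fp * (1.0 - fp))

def fp_at_infinity(s, eps, lam, eta_max=10.0, n=2000):
    # classical RK4 from eta = 0 with shooting parameter s = f''(0)
    h = eta_max / n
    y = (0.0, 1.0 + eps, s)
    for _ in range(n):
        k1 = rhs(y, lam)
        k2 = rhs(tuple(y[i] + 0.5 * h * k1[i] for i in range(3)), lam)
        k3 = rhs(tuple(y[i] + 0.5 * h * k2[i] for i in range(3)), lam)
        k4 = rhs(tuple(y[i] + h * k3[i] for i in range(3)), lam)
        y = tuple(y[i] + h / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
                  for i in range(3))
    return y[1]

def skin_friction(eps, lam, lo, hi):
    # bisect on s = f''(0) so that f'(eta_max) = 1
    g = lambda s: fp_at_infinity(s, eps, lam) - 1.0
    glo = g(lo)
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        gm = g(mid)
        if glo * gm <= 0.0:
            hi = mid
        else:
            lo, glo = mid, gm
    return 0.5 * (lo + hi)
```

For $\varepsilon = -1$, $\lambda = 0$ the problem reduces to the Blasius equation, and the shooting result reproduces the value $f''(0) \approx 0.4696$ quoted in Table 31.1.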

31.4 Results and Discussion

Liao [14] proved that, as long as a solution series given by the HAM converges, it must be one of the solutions. The analytic solution of the problem has been computed from the equations containing the non-zero auxiliary parameter $\hbar$, which can adjust and control the convergence of the solutions; the $\hbar$-curve is a horizontal line segment above the region of all admissible values of $\hbar$. In the present case, the 10th-order $\hbar$-curves have been plotted in Fig. 31.2 for $\lambda = \varepsilon = 0$; $\lambda = 0$ and $\varepsilon = -1$; $\lambda = 0.05$ and $\varepsilon = 0$; and $\lambda = 0.05$ and $\varepsilon = -1.35$. From this figure, we can see that the permissible range for $\hbar$ is $-0.2 \le \hbar \le 0.1$. So, to assure the convergence of the HAM solution, the value of $\hbar$ should be chosen from this range. We have here studied two patterns of the flow, namely, flows aided and flows opposed by the convection. We have solved Eqs. (31.14)–(31.15) for $\lambda = 0$. Table 31.1 shows the variation of $f''(0)$ as a function of $\varepsilon$ for $\varepsilon \le 0$. We observe that for $\varepsilon = -1$ and $\lambda = 0$, the differential Eqs. (31.14)–(31.15) reduce to the Blasius nonlinear differential equation (Blasius [5]), and we recover the solution of the Blasius problem with $f''(0) = 0.4696$, which is the greatest value of $f''(0)$. Furthermore, for $-1.354 \le \varepsilon \le -1$, two patterns of solutions are observed. At $\varepsilon = -1$, $f_l''(0) = 0$ and $f_u''(0) = 0.469760$ have been found to be the minimum and maximum values of $f''(0)$. It is seen from Table 31.1 that the present results

Fig. 31.2 $\hbar$-curves for $\lambda = 0, 0.05$ and $\varepsilon = 0, -1.35$

Table 31.1 Values of the skin friction coefficient $f''(0)$ for different values of $\varepsilon$ and $\lambda = 0$

| ε | fl″(0) Present | fl″(0) Aly et al. [2] | fu″(0) Present | fu″(0) Aly et al. [2] | f″(0) Present | f″(0) Aly et al. [2] | f″(0) Blasius [5] |
|---|---|---|---|---|---|---|---|
| 0 | – | – | – | – | 0 | 0 | – |
| −0.1 | – | – | – | – | 0.0711476 | 0.072059 | – |
| −0.2 | – | – | – | – | 0.144934 | 0.145847 | – |
| −0.3 | – | – | – | – | 0.206244 | 0.207156 | – |
| −0.4 | – | – | – | – | 0.271992 | 0.272904 | – |
| −0.5 | – | – | – | – | 0.323463 | 0.324376 | – |
| −0.6 | – | – | – | – | 0.369295 | 0.370208 | – |
| −0.7 | – | – | – | – | 0.405889 | 0.406801 | – |
| −0.8 | – | – | – | – | 0.431564 | 0.432477 | – |
| −0.9 | – | – | – | – | 0.450881 | 0.451793 | – |
| −1 | 0 | 0 | 0.469760 | 0.469600 | – | – | 0.469602 |
| −1.1 | 0.0038393 | 0.002565 | 0.447402 | 0.448834 | – | – | – |
| −1.2 | 0.033714 | 0.032440 | 0.411288 | 0.412720 | – | – | – |
| −1.3 | 0.095263 | 0.093989 | 0.316504 | 0.317936 | – | – | – |
| −1.354 | 0.179728 | 0.178454 | 0.179728 | 0.181160 | – | – | – |
are in excellent agreement with those reported by Aly et al. [2] and Blasius [5]. This lends credence to the accuracy of the present results. Again, for $\lambda = 0.05$, Eqs. (31.14)–(31.15) admit two solutions in the range $-1.3640 \le \varepsilon \le 0.4$. Table 31.2 shows the variation of $f''(0)$ for $-1.3640 \le \varepsilon \le 0.4$. We observe that both solutions are negative in the range $0 \le \varepsilon \le 0.4$, and they remain negative for $0.4 < \varepsilon < \infty$ as well. It is evident from Table 31.2 that the present results are in good agreement with those reported by Aly et al. [2], which validates the accuracy of the present results. In the range $-1.364 < \varepsilon < 0$, the upper solution $f_u''(0)$ is positive, while the lower solution $f_l''(0)$ is positive in the range $-1.364 < \varepsilon < -1.2164$ and negative in the range $-1.2164 < \varepsilon < 0$. In Fig. 31.3, the upper solutions of the velocity profiles are depicted for $\lambda = 0.05$. The effect of $\varepsilon$ on the upper solution is very small. At the same time, $\varepsilon$ strongly affects the lower solutions, as shown in Fig. 31.4. Again, it can be seen from Fig. 31.4 that when $\varepsilon > 0$, the lower solutions decrease rapidly from some positive value, cross the $\eta$-axis, and then return to a positive value, levelling out at the value of unity.


Table 31.2 Values of the skin friction coefficient $f''(0)$ for different values of $\varepsilon$ and $\lambda = 0.05$

| ε | fl″(0) Present | fl″(0) Aly et al. [2] | fu″(0) Present | fu″(0) Aly et al. [2] |
|---|---|---|---|---|
| −1.364 | 0.290980 | 0.289706 | 0.290980 | 0.28970639 |
| −1.3 | 0.1215686 | – | 0.460915 | – |
| −1.2 | 0.036862 | 0.355887 | 0.545098 | 0.546960 |
| −1.1 | 0.0290196 | – | 0.576470 | – |
| −1 | 0.0284967 | 0.027745 | 0.585359 | 0.587221 |
| −0.9 | 0.009673 | – | 0.582222 | – |
| −0.8 | −0.037385 | −0.038659 | 0.560784 | 0.562646 |
| −0.7 | −0.084967 | – | 0.522091 | – |
| −0.6 | −0.194248 | −0.195522 | 0.483398 | 0.485260 |
| −0.5 | −0.214117 | – | 0.435816 | – |
| −0.4 | −0.2590849 | −0.260358 | 0.372026 | 0.373888 |
| −0.3 | −0.407058 | – | 0.292026 | – |
| −0.2 | −0.504313 | −0.505587 | 0.227712 | 0.229574 |
| −0.1 | −0.578039 | – | 0.143006 | – |
| 0 | −0.733333 | −0.734607 | 0 | 0 |
| 0.1 | −0.866143 | – | −0.048366 | – |
| 0.2 | −0.973333 | −0.97460733 | −0.145620 | −0.143758 |
| 0.3 | −1.140130 | – | −0.284705 | – |
| 0.4 | −1.22274 | −1.224019 | −0.374117 | −0.372255 |

Fig. 31.3 Profiles of the vertical component of the fluid velocity for λ = 0.05 and different values of ε corresponding to the upper solution


Fig. 31.4 Profiles of the vertical component of the fluid velocity for λ = 0.05 and different values of ε corresponding to the lower solution

31.5 Concluding Remarks

In this work, we have revisited the problem of Aly et al. [2] of mixed convection boundary-layer flow over a vertical surface embedded in a porous medium. The results are presented graphically and the effects of the pertinent parameters have been discussed. It is concluded that the HAM provides a simple and easy way to control and adjust the convergence region for strong nonlinearity and is applicable to highly nonlinear problems. The results obtained by the HAM are in good agreement with those of Aly et al. [2] and Blasius [5] for $\varepsilon = -1$ and $\lambda = 0$. It is also evident that there are two HAM solutions over a range of values of $\varepsilon$ for a given value of $\lambda > 0$. The variations of $f''(0)$ as a function of $\varepsilon$ and the velocity profiles for the range $-1.3640 \le \varepsilon < \infty$, $\lambda = 0.05$, have also been studied. Again, for the aiding flow (i.e., for $\varepsilon > 0$) there exist two solutions, the upper and the lower, for $\lambda = 0.05$; the upper solution is likely to be the physically stable one. But for the opposing flows (i.e., for $\varepsilon < 0$), as expected, the boundary layer breaks down when the opposing flow is too strong.

References

1. Adomian, G.: A review of the decomposition method in applied mathematics. J. Math. Anal. Appl. 135(2), 501–544 (1988)
2. Aly, E.H., Elliot, L., Ingham, D.B.: Mixed convection boundary layer flow over a vertical surface embedded in a porous medium. Eur. J. Mech. B. Fluids 22, 529–543 (2003)
3. Bejan, A., Dincer, I., Lorente, S., Miguel, A.F., Reis, A.H.: Porous and Complex Flow Structures in Modern Technologies. Springer, New York (2004)
4. Bejan, A., Kraus, A.D.: Heat Transfer Handbook. Wiley, New York (2003)
5. Blasius, H.: Grenzschichten in Flüssigkeiten mit kleiner Reibung. Z. Math. Phys. 56, 1–37 (1908)
6. Chandarki, I.M.: On the laminar similarity boundary layer equations. Ph.D. thesis, Mathematics, Lonere (2014)
7. Cheng, P., Minkowycz, W.J.: Free convection about a vertical flat plate embedded in a porous medium with application to heat transfer from a dike. J. Geophys. Res. 82, 2040–2044 (1977)
8. Hunt, R., Wilks, G.: On the behaviour of the laminar boundary layer equations of mixed convection near a point of zero skin friction. J. Fluid Mech. 101, 377–391 (1980)
9. Harris, S.D., Ingham, D.B., Pop, I.: Unsteady mixed convection boundary-layer flow on a vertical surface in a porous medium. Int. J. Heat Mass Transf. 42, 357–372 (1999)
10. Ingham, D.B., Bejan, A., Mamut, E., Pop, I.: Emerging Technologies and Techniques in Porous Media. Kluwer, Dordrecht (2004)
11. Ingham, D.B., Pop, I.: Transport Phenomena in Porous Media. Pergamon, Oxford (1998)
12. Johnson, C.H., Cheng, P.: Possible similarity solutions for free convection boundary layers adjacent to flat plates in porous media. Int. J. Heat Mass Transf. 21, 709–718 (1978)
13. Karmishin, A.V., Zhukov, A.T., Kolosov, V.G.: Methods of Dynamics Calculation and Testing for Thin-Walled Structures, vol. 135. Mashinostroyenie, Moscow (1990) (in Russian)
14. Liao, S.J.: Beyond Perturbation: Introduction to the Homotopy Analysis Method. Chapman & Hall/CRC Press, Boca Raton (2003)
15. Liao, S.J.: An analytic solution of unsteady boundary-layer flows caused by an impulsively stretching plate. Commun. Nonlinear Sci. Numer. Simul. 11(3), 326–339 (2006)
16. Lyapunov, A.M.: The general problem of the stability of motion. Int. J. Control 55(3), 531–534 (1992)
17. Magyari, E., Pop, I., Keller, B.: Analytic solutions for unsteady free convection in porous media. J. Eng. Math. 48, 93–104 (2004)
18. Merkin, J.H.: Mixed convection boundary layer flow on a vertical surface in a saturated porous medium. J. Eng. Math. 14, 301–313 (1980)
19. Merkin, J.H., Mahmood, T.: Mixed convection boundary layer similarity solutions: prescribed wall heat flux. J. Appl. Math. Phys. (ZAMP) 40, 51–68 (1989)
20. Merkin, J.H., Pop, I.: Mixed convection along a vertical surface: similarity solutions for uniform flow. Fluid Dyn. Res. 30, 233–250 (2002)
21. Nield, D.A., Bejan, A.: Convection in Porous Media, 2nd edn. Springer, New York (1999)
22. Pop, I., Ingham, D.B.: Convective Heat Transfer: Mathematical and Computational Modelling of Viscous Fluids and Porous Media. Pergamon, Oxford (2001)
23. Ridha, A.: Aiding flows non-unique similarity solutions of mixed convection boundary layer equations. J. Appl. Math. Phys. (ZAMP) 47, 341–352 (1996)
24. Vafai, K.: Handbook of Porous Media. Marcel Dekker, New York (2000)
25. Wilks, G.: Combined forced and free convection flow on vertical surfaces. Int. J. Heat Mass Transf. 16, 1958–1963 (1973)
26. Wilks, G.: A separated flow in mixed convection. J. Fluid Mech. 16, 359–368 (1974)
27. Wilks, G., Bramley, S.J.: Dual solutions in mixed convection. Proc. Roy. Soc. Edinburgh Sect. A 87, 349–358 (1981)

Chapter 32

Magnetic Force Calculation Between Truncated Cone Shaped Permanent Magnet and Soft Magnetic Cylinder Using Hybrid Boundary Element Method

Ana Vučković, Dušan Vučković, Mirjana Perić, and Nebojša Raičević

Abstract The paper presents the modeling of a truncated cone shaped permanent magnet in the vicinity of a body of finite dimensions made of soft magnetic material. The force calculation between the permanent magnet and the soft magnetic cylinder is performed using the hybrid boundary element method along with a semi-analytical approach based on fictitious magnetization charges and a discretization technique. In many electromechanical devices the use of other permanent magnet shapes may result in a performance improvement, and there is a constant need for permanent magnet shape optimization and size reduction. The presented method enables force calculation for an atypical permanent magnet shape and the modelling of a configuration that contains an object of finite dimensions made of soft magnetic material. The results of the presented approach are compared with the results of the finite element method (FEMM software). Since only complete elliptic integrals of the second kind are used and other additional integrations are avoided in this calculation, the advantages of the presented method are its simplicity and time efficiency.

Keywords Magnetic material · Finite element method

MSC 2020 78M10 · 78M15

A. Vuˇckovi´c (B) · D. Vuˇckovi´c · M. Peri´c · N. Raiˇcevi´c Faculty of Electronic Engineering, University of Niš, Niš, Serbia e-mail: [email protected] D. Vuˇckovi´c e-mail: [email protected] M. Peri´c e-mail: [email protected] N. Raiˇcevi´c e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_32


32.1 Introduction

Although permanent magnetism is one of the oldest continuously studied branches of science, the utilization of permanent magnets was limited in the past because of the low force they could generate. Thanks to the advances in rare-earth materials with high coercivity and energy product developed in the 1970s and 1980s, the application of permanent magnets gained new interest. The quality of electrical devices that contain permanent magnets depends on the magnetic material they are made of, their magnetization, as well as their size. There is a constant need for their optimization and size reduction, which leads to new mathematical methods for their field calculation. Since Yonnet presented the first study concerning different permanent magnet configurations [1], many scientists around the world have been dealing with this issue, carrying out various studies in order to determine the magnetic field of permanent magnets and the force between them accurately and efficiently [2–5]. There are numerous techniques for analyzing permanent magnet configurations, and both simplified and complex formulations of the interaction forces for different PM configurations have been proposed. In most cases, analytical methods for permanent magnet field calculation based on the distribution of magnetic charges or Ampere's microscopic currents are limited to block and cylindrical structures [6, 7]. A major drawback of the numerical approach is its high computational cost. Moreover, numerical algorithms are not as precise as analytical calculations. On the other hand, it is difficult to obtain a fully analytical expression for the magnetic field created by a ring permanent magnet, so the determination of the force is more difficult still. In order to overcome the drawbacks of analytical and numerical methods, numerous semi-numerical approaches have been proposed by different authors.
Many authors have proposed different methods for determining the magnetic field generated by PMs of various shapes and the interaction force between them, aiming towards the simplest and fastest analysis of magnetic structures in relation to different parameters [8–11]. A special group of problems is presented by permanent magnet structures in the vicinity of different bodies with magnetic material composition. In the existing literature, in most cases only the calculation of the field or force between a permanent magnet and an infinite magnetic plane was available [12]. Problems of this kind were solved by first applying the method of images, after which Ampere's approach [13, 14] or the Coulombian approach [15, 16] was used to model the configuration. The field models of permanent magnets in the literature have mainly been focused on magnet assemblies with cylindrical, ring or cuboidal permanent magnets. However, in many electromechanical devices the use of other magnet shapes may result in a performance improvement. This paper therefore presents the modeling of a truncated cone shaped permanent magnet in the vicinity of a body of finite dimensions made of soft magnetic material. The force calculation between the permanent magnet and the soft magnetic cylinder is performed using the hybrid boundary element method along with a semi-analytical approach based on fictitious magnetization charges and a discretization technique. The Hybrid Boundary Element Method (HBEM) was developed at the Department of Theoretical Electrical Engineering, Faculty of Electronic Engineering of Niš, and has been successfully used for the analysis of multilayered electromagnetic problems [17–20].

32.2 Problem Definition

The truncated cone shaped permanent magnet, placed above a cylinder made of linear magnetic material with relative permeability $\mu_{r2}$, is considered (Fig. 32.1). It is presumed that the magnet is magnetized in the axial direction, so the fictitious magnetization charges are given by

$$\eta_m = \hat{n} \cdot \mathbf{M}, \quad \rho_m = -\nabla \cdot \mathbf{M}. \quad (32.1)$$

It is obvious that the fictitious surface magnetic charges $\eta_m$ exist on both bases of the magnet and on its cover:

$$\eta_{m1} = \hat{n}_1 \cdot \mathbf{M} = M, \quad \eta_{m2} = \hat{n}_2 \cdot \mathbf{M} = -M, \quad \eta_{mc} = \hat{n}_3 \cdot \mathbf{M} = M\cos\alpha, \quad (32.2)$$

$$\rho_m = -\nabla \cdot \mathbf{M} = 0, \quad (32.3)$$

where $\cos\alpha = \dfrac{c-a}{\sqrt{(c-a)^2 + L_1^2}}$ and $\hat{n}_1$, $\hat{n}_2$ and $\hat{n}_3$ are the corresponding unit normal vectors.
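For concreteness, the cover charge density of (32.2) can be evaluated directly from the cone geometry. The helper below is our own illustrative sketch (function and parameter names are assumptions, not from the chapter).

```python
import math

def cone_cover_charge(M, a, c, L1):
    # fictitious surface charge density eta_mc = M * cos(alpha) on the slanted
    # cover of the truncated cone, with cos(alpha) = (c - a)/sqrt((c - a)^2 + L1^2)
    cos_alpha = (c - a) / math.sqrt((c - a) ** 2 + L1 ** 2)
    return M * cos_alpha
```

In the cylinder limit $a = c$ the cover carries no charge, consistent with a purely axially magnetized cylinder.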

Fig. 32.1 Permanent magnet above soft magnetic cylinder

Volume charges $\rho_m$ do not exist in this case. Therefore, each permanent magnet base and its cover can be discretized into a system of circular loops loaded with different magnetization charges (Fig. 32.2). Since the cylinder is made of linear magnetic material with relative permeability $\mu_{r2}$, the influence of the magnetic material can be replaced with a system of thin toroidal magnetic sources placed on the boundary surface of the two different materials (on the magnetic cylinder cover and bases). The discretization model and the distributions of the magnetic charges as well as of the thin toroidal sources are shown in Fig. 32.2. To derive the magnetic force exerted between the permanent magnet and the soft magnetic cylinder, the superposition of the results obtained for the axial magnetic force between two circular loops is used [21].

Fig. 32.2 Discretization model

32.2.1 Force Calculation Between Circular Loops Loaded with Magnetization Charges

The goal of this approach is first to determine the interaction magnetic force between two circular loops uniformly loaded with different magnetization charges, Qm1 and Qm2. The dimensions and positions of the loops are presented in Fig. 32.3.

Fig. 32.3 Circular loops uniformly loaded with different magnetization charges

32 Magnetic Force Calculation Between Truncated Cone Shaped Permanent …

For determining the interaction force between the two circular loops, the magnetic scalar potential, magnetic field and magnetic flux density generated by the lower loop will be calculated. The elementary magnetic scalar potential generated by the elementary point magnetization charge, dQm2, is

dϕm = (dQm2/4π) · (1/R).  (32.4)

Since dQm2 = (Qm2/2πr0) r0 dθ′ = (Qm2/2π) dθ′, the elementary magnetic scalar potential has the form

dϕm = (Qm2/8π²) · (1/R) dθ′,  (32.5)

and the resulting magnetic scalar potential generated by the lower circular loop at an arbitrary point P(r, θ, z) is

ϕm(r, z) = (Qm2/8π²) ∫₀^{2π} dθ′/√(r² + r0² + (z − z0)² − 2 r0 r cos(θ − θ′)).  (32.6)

Considering the existing symmetry, in the θ = 0 plane the magnetic scalar potential has the form

ϕm(r, z) = (Qm2/4π²) ∫₀^{π} dθ′/√(r² + r0² + (z − z0)² − 2 r0 r cos θ′).  (32.7)

Substituting θ′ = π − 2α in Eq. (32.7), the magnetic scalar potential is obtained as

ϕm(r, z) = (Qm2/2π²) ∫₀^{π/2} dα/√((r + r0)² − 4 r r0 sin²α + (z − z0)²).  (32.8)

After some simple mathematical operations the magnetic scalar potential can be given in the form

ϕm(r, z) = (Qm2/2π²) K(k)/√((r + r0)² + (z − z0)²),  (32.9)

where the complete elliptic integral of the first kind is

K(k) = ∫₀^{π/2} dα/√(1 − k² sin²α),

with modulus k² = 4 r r0/((r + r0)² + (z − z0)²). The magnetic field generated by the lower loop at an arbitrary point can be determined as

H_ext(r, z) = −grad ϕm(r, z) = Hr_ext(r, z) r̂ + Hz_ext(r, z) ẑ.  (32.10)
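As a numerical sanity check of Eq. (32.9), the closed form can be compared with direct quadrature of the integral form (32.8). The sketch below uses illustrative geometry values (not from the paper) and evaluates K(k) with the arithmetic-geometric mean.

```python
# Numerical check of Eq. (32.9): the loop potential written with the complete
# elliptic integral K(k) must agree with direct integration of Eq. (32.8).
# Symbols follow the text; the sample geometry values are illustrative only.
import math

def ellip_k(k2):
    """Complete elliptic integral of the first kind K(k), parameter k2 = k^2,
    computed with the arithmetic-geometric mean."""
    a, b = 1.0, math.sqrt(1.0 - k2)
    while abs(a - b) > 1e-15:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def phi_closed(Qm2, r, z, r0, z0):
    """Magnetic scalar potential of a charged loop, Eq. (32.9)."""
    s = (r + r0) ** 2 + (z - z0) ** 2
    k2 = 4.0 * r * r0 / s
    return Qm2 / (2.0 * math.pi ** 2) * ellip_k(k2) / math.sqrt(s)

def phi_direct(Qm2, r, z, r0, z0, n=20000):
    """Midpoint-rule evaluation of the integral form, Eq. (32.8)."""
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        a = (i + 0.5) * h
        total += h / math.sqrt((r + r0) ** 2 - 4.0 * r * r0 * math.sin(a) ** 2
                               + (z - z0) ** 2)
    return Qm2 / (2.0 * math.pi ** 2) * total

p1 = phi_closed(1.0, 0.7, 0.4, 1.0, 0.0)
p2 = phi_direct(1.0, 0.7, 0.4, 1.0, 0.0)
print(abs(p1 - p2))  # agreement of the two forms
```

The two evaluations agree to quadrature accuracy, which confirms the substitution chain (32.6) through (32.9).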


The external magnetic flux density is

B_ext(r, z) = μ0 H_ext(r, z),  (32.11)
B_ext(r, z) = Br_ext(r, z) r̂ + Bz_ext(r, z) ẑ,  (32.12)

with components

Br_ext(r, z) = −μ0 ∂ϕm(r, z)/∂r,  (32.13)

Br_ext(r, z) = μ0 (Qm2/2π²) [ (r² − r0² − (z − z0)²) E(k)/(2r ((r − r0)² + (z − z0)²) √((r + r0)² + (z − z0)²)) + K(k)/(2r √((r + r0)² + (z − z0)²)) ],  (32.14)

and

Bz_ext(r, z) = −μ0 ∂ϕm(r, z)/∂z,  (32.15)

Bz_ext(r, z) = μ0 Qm2 (z − z0) E(k)/(2π² ((r − r0)² + (z − z0)²) √((r + r0)² + (z − z0)²)),  (32.16)

where the complete elliptic integral of the second kind is

E(k) = ∫₀^{π/2} √(1 − k² sin²α) dα,

with modulus k² = 4 r r0/((r + r0)² + (z − z0)²). The elementary magnetization charge of the upper circular loop is

dQm1 = (Qm1/2πrm) dl = (Qm1/2π) dβ,  (32.17)

and the interaction magnetic force on it is

dF = dQm1 B_ext(rm, zm).  (32.18)

Finally, the interaction magnetic force components can be expressed as:

Fr = μ0 (Qm1 Qm2/2π²) [ (rm² − r0² − (zm − z0)²) E(k0)/(2rm ((rm − r0)² + (zm − z0)²) √((rm + r0)² + (zm − z0)²)) + K(k0)/(2rm √((rm + r0)² + (zm − z0)²)) ] = 0,  (32.19)

Fz = μ0 Qm1 Qm2 (zm − z0) E(k0)/(2π² ((rm − r0)² + (zm − z0)²) √((rm + r0)² + (zm − z0)²)),  (32.20)

Fz = μ0 (Qm1 Qm2/2π²) Fzp(r0, rm, z0, zm),  (32.21)

with elliptic integral modulus k0² = 4 rm r0/((rm + r0)² + (zm − z0)²). The axial component of the force, Eq. (32.20), represents the interaction force between the two magnetized circular loops, since Fr = 0 because of the axial symmetry.
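Equation (32.20) is straightforward to evaluate numerically. The sketch below (illustrative charges and geometry, not from the paper) also checks the expected sign structure: the axial force is antisymmetric in zm − z0, and like charges repel.

```python
# Sketch of the loop-loop axial force, Eq. (32.20), using the complete elliptic
# integral of the second kind E(k). Charge and geometry values are illustrative.
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability

def ellip_e(k2, n=20000):
    """E(k) by midpoint quadrature, parameter k2 = k^2."""
    h = (math.pi / 2.0) / n
    return sum(h * math.sqrt(1.0 - k2 * math.sin((i + 0.5) * h) ** 2)
               for i in range(n))

def axial_force(Qm1, Qm2, rm, zm, r0, z0):
    """Axial force between two charged loops, Eq. (32.20)."""
    s = (rm + r0) ** 2 + (zm - z0) ** 2
    k2 = 4.0 * rm * r0 / s
    num = MU0 * Qm1 * Qm2 * (zm - z0) * ellip_e(k2)
    den = 2.0 * math.pi ** 2 * ((rm - r0) ** 2 + (zm - z0) ** 2) * math.sqrt(s)
    return num / den

f_up = axial_force(1.0, 1.0, 0.8, 0.5, 1.0, 0.0)
f_dn = axial_force(1.0, 1.0, 0.8, -0.5, 1.0, 0.0)
print(f_up, f_dn)  # equal magnitude, opposite sign
```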

32.2.2 Force Calculation Between Permanent Magnet and Soft Magnetic Cylinder

The magnetic scalar potential of the considered system is

ϕm(r, z) = (1/2π²) { S(Qm1, r, rn1, z, zn1, n1, Nb1) + S(Qm2, r, rn2, z, zn2, n2, Nb2) + S(Qm3, r, rm, z, zm, m, Nc) },  (32.22)

where

S(Q, r, r0, z, z0, n, N) = Σ_{n=1}^{N} Q K(k)/√((r + r0)² + (z − z0)²).  (32.23)

The parameters of the permanent magnet's upper base, from Fig. 32.2, are

rn1 = (2n1 − 1)/(2Nb1) a,  zn1 = h + L1,  n1 = 1, 2, …, Nb1,  (32.24)

while the magnetic charges of the upper base are

Qm1 = 2π rn1 M a/Nb1,  n1 = 1, 2, …, Nb1.  (32.25)

Nb1 is the number of discretization segments (circular loops) of the upper permanent magnet base. The parameters of the permanent magnet's lower base are

rn2 = (2n2 − 1)/(2Nb2) c,  zn2 = h,  n2 = 1, 2, …, Nb2,  (32.26)

while the magnetic charges of the lower base are

Qm2 = −2π rn2 M c/Nb2,  n2 = 1, 2, …, Nb2.  (32.27)

Nb2 is the number of discretization segments (circular loops) of the lower permanent magnet base. The parameters of the permanent magnet cover segments are

rm = c − ((c − a)/L1)(zm − h),  zm = h + ((2m − 1)/(2Nc)) L1,  m = 1, 2, …, Nc.  (32.28)

The corresponding magnetic charges of the cover are

Qm3 = 2π rm cos α M √((c − a)² + L1²)/Nc,  cos α = (c − a)/√((c − a)² + L1²),  m = 1, 2, …, Nc.  (32.29)

Nc is the number of cover segments. Nb1, Nb2 and Nc are calculated from the initial number of surface segments, Ns, and they depend on the permanent magnet dimensions. The positions of the toroidal magnetic sources along the magnetic cylinder cover and bases are (ri, zi). The positions of the sources along the upper base are

ri = (2i − 1)/(2N1) b,  zi = 0,  i = 1, 2, …, N1,  (32.30)

while the cross-section radius of the toroidal sources is

ae1 = Δr1/π,  Δr1 = b/N1.  (32.31)

For the cover sources the following relations are fulfilled:

ri = b,  zi = (2N1 − 2i + 1) L2/(2N2),  i = N1 + 1, N1 + 2, …, N1 + N2,  (32.32)

with radius

ae2 = Δz2/π,  Δz2 = L2/N2.  (32.33)

For the lower base, the toroidal sources' positions are

ri = b − (2i − 2N1 − 2N2 − 1)/(2N1) b,  zi = −L2,  i = N1 + N2 + 1, …, 2N1 + N2,  (32.34)

and the cross-section radius is

ae3 = ae1 = Δr1/π,  Δr1 = b/N1.  (32.35)
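The magnet-surface discretization of Eqs. (32.24)-(32.29) can be sketched numerically; the geometry and magnetization values below are illustrative only. A useful consistency check: since ∇ · M = 0 and the magnet surface is closed, the total fictitious charge must vanish, and the midpoint ring discretization reproduces this exactly.

```python
# Discretization sketch for the magnet surface, Eqs. (32.24)-(32.29): rings on
# the two bases and on the conical cover, each carrying a lumped charge.
import math

def magnet_rings(a, c, L1, h, M, Nb1, Nb2, Nc):
    """Return lists of (radius, z, charge) for upper base, lower base, cover."""
    slant = math.sqrt((c - a) ** 2 + L1 ** 2)
    cos_alpha = (c - a) / slant                      # Eq. (32.29)
    upper = [((2 * n - 1) / (2 * Nb1) * a, h + L1,
              2 * math.pi * ((2 * n - 1) / (2 * Nb1) * a) * M * a / Nb1)
             for n in range(1, Nb1 + 1)]             # Eqs. (32.24)-(32.25)
    lower = [((2 * n - 1) / (2 * Nb2) * c, h,
              -2 * math.pi * ((2 * n - 1) / (2 * Nb2) * c) * M * c / Nb2)
             for n in range(1, Nb2 + 1)]             # Eqs. (32.26)-(32.27)
    cover = []
    for m in range(1, Nc + 1):
        zm = h + (2 * m - 1) / (2 * Nc) * L1         # Eq. (32.28)
        rm = c - (c - a) / L1 * (zm - h)
        cover.append((rm, zm,
                      2 * math.pi * rm * cos_alpha * M * slant / Nc))
    return upper, lower, cover

up, lo, cov = magnet_rings(a=2.0, c=2.5, L1=1.0, h=0.5, M=1.0,
                           Nb1=50, Nb2=50, Nc=50)
total = sum(q for _, _, q in up + lo + cov)
print(total)  # ~ 0: the closed surface carries zero net charge
```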

N1 is the number of sources for each cylinder base, while N2 is the number of cylinder cover sources. These segment numbers are calculated starting from the initial number of toroidal sources, N, and they depend on the cylinder dimensions. The total number of cylinder magnetic sources is Ntot = 2N1 + N2. Starting from the expression for the magnetic scalar potential, the magnetic field strength vector can be expressed as

H = −grad(ϕm).  (32.36)

The normal component of the magnetic field vector and the cylinder surface charges have to satisfy the relation

n̂k · H(0+) = −(μr2/(μr1 − μr2)) ηmi,  ηmi = Qi/(2π ri Δri),  i = 1, 2, …, Ntot,  k = 1, 2, 3,  (32.37)

where n̂k is the unit normal vector (n̂1 = ẑ, n̂2 = r̂, n̂3 = −ẑ). The point matching method is applied to the normal component of the magnetic field, and a system of linear equations is formed. The solution of this linear system gives the values of the unknown charges of the toroidal sources, Qi, placed on the cover and the bases of the soft magnetic cylinder. Figure 32.4 presents the distribution of normalized magnetic charges, Qi_nor = Qi/(M L1²), along the boundary surface of the two different magnetic materials. It is determined for the configuration parameters a/L1 = 3.0, b/L1 = 3.0, c/L1 = 3.5, L2/L1 = 1.0, h/L1 = 0.5, μr1 = 1, μr2 = 3, Ns = 200 and N = 200. After calculating the values of the unknown magnetic sources, the magnetic scalar potential of the soft magnetic cylinder can be determined from the expression

Fig. 32.4 Distribution of magnetic sources along the boundary surface of two different magnetic materials


ϕmc = (1/2π²) Σ_{i=1}^{Ntot} Qi K(ki)/√((r + ri)² + (z − zi)²),  (32.38)

along with the magnetic field and the magnetic flux density vector anywhere in the cylinder's vicinity. The final step is the calculation of the force between the permanent magnet segments and the toroidal magnetic sources. It is performed starting from the expression for the force between two circular loops loaded with different magnetic charges. Using the superposition of contributions over all of the permanent magnet's magnetic charges and toroidal magnetic sources, the expression for the force between the permanent magnet and the cylinder made of linear magnetic material is derived. The assumption is that the system is placed in an environment with permeability μ1 = μ0.
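The point-matching step described above reduces, schematically, to assembling and solving a dense linear system for the unknown toroidal charges Qi. The sketch below shows only the solve itself on a toy 3x3 system with made-up coefficients; the actual matrix entries would come from the field kernels of Sect. 32.2.

```python
# Schematic of the point-matching step: enforcing the boundary condition
# (32.37) at one matching point per toroidal source yields A q = b for the
# unknown charges. Coefficients below are a toy stand-in, not field kernels.
def solve(A, b):
    """Plain Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

A = [[4.0, 1.0, 0.5],
     [1.0, 3.0, 1.0],
     [0.5, 1.0, 5.0]]
b = [1.0, -2.0, 0.3]
q = solve(A, b)
residual = max(abs(sum(A[i][j] * q[j] for j in range(3)) - b[i]) for i in range(3))
print(residual)  # ~ 0
```

In practice a library solver would replace this routine; the point is that the HBEM charge determination is a single dense linear solve.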

Fz = (μ0/2π²) [ Σ_{n1=1}^{Nb1} Σ_{i=1}^{Ntot} Qm1 Qi (zn1 − zi) E(k1i)/( ((rn1 − ri)² + (zn1 − zi)²) √((rn1 + ri)² + (zn1 − zi)²) )

+ Σ_{n2=1}^{Nb2} Σ_{i=1}^{Ntot} Qm2 Qi (zn2 − zi) E(k2i)/( ((rn2 − ri)² + (zn2 − zi)²) √((rn2 + ri)² + (zn2 − zi)²) )

+ Σ_{m=1}^{Nc} Σ_{i=1}^{Ntot} Qm3 Qi (zm − zi) E(k3i)/( ((rm − ri)² + (zm − zi)²) √((rm + ri)² + (zm − zi)²) ) ],  (32.39)

with moduli k1i² = 4 rn1 ri/((rn1 + ri)² + (zn1 − zi)²), k2i² = 4 rn2 ri/((rn2 + ri)² + (zn2 − zi)²) and k3i² = 4 rm ri/((rm + ri)² + (zm − zi)²).
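The double sums in Eq. (32.39) are plain superpositions of the loop-loop force of Eq. (32.20). A quick structural check on such a superposition (with illustrative ring charges and positions, not a solved configuration): the force of a set of rings on a test ring is the exact negative of the reaction force.

```python
# Superposition sketch for Eq. (32.39): the total axial force is a double sum
# of pairwise loop-loop contributions; Newton's third law holds pairwise,
# hence for the sums. Charges and geometry are illustrative.
import math

MU0 = 4.0e-7 * math.pi

def ellip_e(k2, n=4000):
    h = (math.pi / 2.0) / n
    return sum(h * math.sqrt(1.0 - k2 * math.sin((i + 0.5) * h) ** 2)
               for i in range(n))

def fz_pair(q1, r1, z1, q2, r2, z2):
    """Axial force on ring 1 due to ring 2, Eq. (32.20)."""
    s = (r1 + r2) ** 2 + (z1 - z2) ** 2
    k2 = 4.0 * r1 * r2 / s
    return (MU0 * q1 * q2 * (z1 - z2) * ellip_e(k2)
            / (2.0 * math.pi ** 2 * ((r1 - r2) ** 2 + (z1 - z2) ** 2)
               * math.sqrt(s)))

magnet_rings = [(0.5, 1.0, 2.0), (1.0, 1.0, 3.0), (1.5, -2.5, 0.7)]  # (r, z, Q)
source_ring = (2.0, 0.0, 0.7)                                        # (r, z, Q)

rs, zs, qs = source_ring
f_on_magnet = sum(fz_pair(q, r, z, qs, rs, zs) for r, z, q in magnet_rings)
f_on_source = sum(fz_pair(qs, rs, zs, q, r, z) for r, z, q in magnet_rings)
print(f_on_magnet, f_on_source)  # equal magnitude, opposite sign
```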

32.3 Numerical Results

The results obtained with the proposed approach are presented graphically and confirmed using the finite element method (FEMM 4.2 software) [23]. Since it was shown in previously published papers that high accuracy is achieved when the number of permanent magnet surface segments is Ns = 200, the convergence of the results is tested here to determine the optimal number of toroidal sources. Figure 32.5 presents convergence results for the configuration parameters a/L1 = 2.0, c/L1 = 2.5, b/L1 = 3.0, L2/L1 = 0.5, h/L1 = 0.5, μr1 = 1, μr2 = 3. The normalized magnetic force, Fz_nor = Fz/(μ0 M² L1²), is calculated for different numbers of toroidal sources and compared with the FEMM 4.2 results. When the initial number of sources is N = 200 the force is |Fz_nor| = 0.212707. For the same parameters, the force calculated in FEMM 4.2 is |Fz_nor| = 0.212720, so the relative error is δ = 0.006%. Therefore, the initial number of toroidal sources is limited to N = 200 in order to reduce the calculation time.

Fig. 32.5 Convergence of the results

The normalized axial force between the permanent magnet and the soft magnetic cylinder, versus the normalized axial displacement of the permanent magnet, h/L1, for variable values of the cylinder radius, b/L1, is shown in Fig. 32.6. It is presented for the system parameters a/L1 = 2.5, c/L1 = 3.0, L2/L1 = 0.5, μr1 = 1, μr2 = 10.

Fig. 32.6 Normalized axial magnetic force versus ratio h/L1 for different cylinder radius

Figure 32.7 presents the normalized magnetic force versus the distance h/L1 for different values of the permanent magnet radius, a/L1, for the configuration parameters c/L1 = 3.0, b/L1 = 4.0, L2/L1 = 1.0, μr1 = 1, μr2 = 3.

Fig. 32.7 Normalized axial magnetic force versus ratio h/L1 for different permanent magnet radius, a/L1

The dependence of the normalized magnetic force on the relative magnetic permeability of the soft magnetic cylinder is also calculated. It is shown in Fig. 32.8 for different ratios L2/L1 and for the configuration parameters a/L1 = 4.5, c/L1 = 5.0, b/L1 = 5.0, h/L1 = 0.5, μr1 = 1, and in Fig. 32.9 for different values of the axial displacement of the permanent magnet, h/L1, with the parameters a/L1 = 3.0, c/L1 = 3.5, b/L1 = 4.0, L2/L1 = 1.0, μr1 = 1.

Fig. 32.8 Normalized axial magnetic force versus cylinder permeability for different ratio between permanent magnet and magnetic cylinder heights

Fig. 32.9 Normalized axial magnetic force versus cylinder permeability for different distance h/L1

The derived expression for the axial force can also be used for calculating the force in the case of a cylindrical permanent magnet (a/L1 = c/L1). The normalized axial force between a cylindrical permanent magnet and the soft magnetic cylinder, versus the normalized axial displacement of the permanent magnet, h/L1, for variable relative permeability of the soft magnetic cylinder is presented in Fig. 32.10. In this case the configuration parameters are a/L1 = c/L1 = 1.0, b/L1 = 2.0, L2/L1 = 0.5, μr1 = 1.

Fig. 32.10 Axial magnetic force versus ratio h/L1 for different cylinder permeability


32.4 Conclusion

The derivation of the magnetic force between a soft magnetic cylinder and an axially magnetized truncated cone shaped permanent magnet is presented in this paper. It is performed using a semi-analytical approach based on magnetization charges and the HBEM. The algorithm presented here can be easily implemented in any standard software development environment, and rapid parametric studies of the magnetic force are enabled. The results of the presented approach are successfully confirmed with FEM results, with about 300 times lower execution time than the FEMM 4.2 software. The expression for the force contains only complete elliptic integrals of the second kind, and redundant integrations are avoided with this approach. Therefore, its simplicity, accuracy and time efficiency are evident. The majority of published papers deal with the calculation of the interaction force between a permanent magnet and an infinite magnetic plane, while this paper presents the force determination for an object made of soft magnetic material with finite dimensions.

Acknowledgements This work has been supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia.

References 1. Yonnet, J.P.: Passive magnetic bearings with permanent magnets. IEEE Trans. Magn. 14, 803– 805 (1978) 2. Bekinal, S., Anil, Rao, T.R., Jana, S.: Analysis of the magnetic field created by permanent magnet rings in permanent magnet bearings. Int. J. Appl. Electromagn. Mech. 46, 255–269 (2014) 3. Yang, W., Zhou, G., Gong, K., Li, Y.: The numeric calculation for cylindrical magnet’s magnetic flux intensity. Int. J. Appl. Electromagn. Mech. 40, 227–235 (2012) 4. Babic, S., Akyel, C.: Magnetic force calculation between thin coaxial circular coils in air. IEEE Trans. Magn. 47, 445–452 (2008) 5. Ravaud, R., Lemarquand, G., Babic, S., Lemarquand, V., Akyel, C.: Cylindrical magnets and coils: fields, forces and inductances. IEEE Trans. Magn. 46, 3585–3590 (2010) 6. Akoun, G., Yonnet, J.P.: 3D analytical calculation of the forces exerted between two cuboidal magnets. IEEE Trans. Magn. 20, 1962–1964 (1984) 7. Furlani, E.P., Reznik, S., Kroll, A.: A three-dimensional field solution for radially polarized cylinders. IEEE Trans. Magn. 31, 844–851 (1995) 8. Robertson, W.S., Cazzolato, B.S., Zander, A.C.: A simplified force equation for coaxial cylindrical magnets and thin coils. IEEE Trans. Magn. 47, 2045–2049 (2011) 9. Ravaud, R., Lemarquand, G., Lemarquand, V.: Force and stiffness of passive magnetic bearings using permanent magnets. Part 1: axial magnetization. IEEE Trans. Magn. 45, 2996–3002 (2009) 10. Bekinal S.I., Rao Anil, T.R., Jana, S.: Analysis of the magnetic field created by permanent magnet rings in permanent magnet bearings. Int. J. Appl. Electromagn. Mech. 46, 255–269 (2014) 11. Vuckovic, A.N., Ilic, S.S., Aleksic, S.R.: Interaction magnetic force calculation of permanent magnets using magnetization charges and discretization technique. Electromagnetics 33, 421– 436 (2013)

32 Magnetic Force Calculation Between Truncated Cone Shaped Permanent …

733

12. Beleggia, M., Vokoun, D., Graef, M.D.: Forces between a permanent magnet and a soft magnetic plate. IEEE Magn. Lett. 3 (2012) 13. Vuckovic, A.N., Ilic, S.S., Aleksic, S.R.: Interaction magnetic force calculation of ring permanent magnets using Ampere’s microscopic surface currents and discretization technique. Electromagnetics 32, 117–134 (2012) 14. Braneshi, M., Zavalani, O., Pijetri, A.: The use of calculating function for the evaluation of axial force between two coaxial disk coils. In: Third International Ph.D. Seminar on Computational Electromagnetics and Technical Application, vol. 1, pp. 21–30. Banja Luka, Bosnia and Hertzegovina (2006) 15. Rakotoarison, H.L., Yonnet, J.P., Delinchant, B.: Using Coulombian approach for modeling scalar potential and magnetic field of a permanent magnet with radial polarization. IEEE Trans. Magn. 43, 1261–1264 (2007) 16. Ravaud, R., Lemarquand, G., Lemarquand, V.: Force and stiffness of passive magnetic bearings using permanent magnets. Part 2: radial magnetization. IEEE Trans. Magn. 45, 3334–3342 (2009) 17. Raicevic, N.B., Aleksic, S.R., Ilic, S.S.: A hybrid boundary element method for multilayer electrostatic and magnetostatic problems. Electromagnetics 30, 507–524 (2010) 18. Peric, M.T., Ilic, S.S., Aleksic, S.R., Raicevic, N.B.: Application of hybrid boundary element method to 2D microstrip lines analysis. Int. J. Appl. Electromagn. Mech. 42, 179–190 (2013) 19. Peric, M.T., Ilic, S.S., Aleksic, S.R., Raicevic, N.B.: Characteristic parameters determination of different striplines configurations using HBEM. ACES J 28, 858–865 (2013) 20. Vuckovic, A.N., Raicevic, N.B., Peric, M.T.: Hybrid boundary element method for force calculation of permanent magnet and soft magnetic cylinder. In: CD Proceeding of International Symposium on Electromagnetic Fields in Mechatronics, Electrical and Electronic Engineering— ISEF 2015, Valencia, Spain (2015) 21. 
Vuckovic, A.N., Raicevic, N.B., Ilic, S.S., Aleksic, S.R.: Interaction magnetic force calculation of radial passive magnetic bearing using magnetization charges and discretization technique. Int. J. Appl. Electromagn. Mech. 42, 311–323 (2013) 22. Vuckovic, A.N., Raicevic, N.B., Peric, M.T.: Radially magnetized ring permanent magnet modelling in the vicinity of a soft magnetic cylinder. Saf. Eng. 8, 33–37 (2018) 23. Meeker, D.: FEMM 4.2 (2009). http://www.femm.info/wiki/Download

Chapter 33

A Mathematical Model for Harvesting in a Stage-Structured Cannibalistic System Loy Nankinga and Linus Carlsson

Abstract To increase the production of proteins in East Africa, aquaculture has recently gained increased attention. In this paper, we study the interactions of a consumer-resource system with harvesting, in which African Catfish (Clarias gariepinus) consume a food resource. The cannibalistic behavior of African Catfish is captured by using a four-stage-structured system. The dynamics of the food resource and African Catfish result in a system of ordinary differential equations called a stage-structured fish population model. Existence and stability of steady states are analyzed quantitatively. We have investigated eight different harvesting scenarios which account for the yield of the fish stock. Results from the simulations revealed that harvesting large juveniles and small adults under equal harvesting rates gives the highest maximum sustainable yield compared to the other harvesting scenarios. In contrast to non-cannibalistic models, we find an increase in the proportion of adult individuals under harvesting.

Keywords Harvesting rate · Fish population model

MSC 2020 92D15 · 92D25 · 92D50

33.1 Introduction

In East Africa, fish is a rich source of animal protein for human consumption and provides raw materials (fish meal) for processing animal feeds [19]. The fish industry is vital in creating employment, and it generates income and foreign exchange earnings through fish exports to regional and international markets [1]. Due to the nutritional importance of fish for so many people, demand is very high and has resulted in

L. Nankinga (B) Department of Mathematics and Statistics, Kyambogo University, Kampala, Uganda e-mail: [email protected]

L. Carlsson Division of Mathematics and Physics, Mälardalen University, 721 23 Västerås, Sweden e-mail: [email protected]

33.1 Introduction In East Africa, fish is a rich source of animal protein for human consumption and provides raw materials (fish meal) for processing animal feeds [19]. The fish industry is vital in creating employment, generates income and foreign exchange earnings through fish exports to regional and international markets [1]. Due to the nutritional importance of fish for so many people, demand is very high and has resulted into L. Nankinga (B) Department of Mathematics and Statistics, Kyambogo University, Kampala, Uganda e-mail: [email protected] L. Carlsson Division of Mathematics and Physics, Malardalen University, 721 23 Vasteras, Sweden e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_33


overfishing and hence a decrease in the fish population stock. Therefore, there is a need to develop sustainable fisheries-management practices. Most fisheries-management theory is based on unstructured models [10, 14, 27]. Over the last three decades, there has been a trend towards increased realism in ecological theory. This trend has been supported by the introduction of physiologically structured population models, see e.g., [3, 4, 6, 8, 9, 11, 17, 18, 21, 23]. Whereas unstructured population models treat all individuals as identical, physiologically structured population models distinguish individuals based on physiological characteristics, e.g., size, mass, weight, length, and height. Size is a common choice in the context of fisheries because it is a good predictor of life-history state and is associated with key rates such as mortality, reproduction and growth, see [2, 16, 25, 28]. The foundation for our model was formulated by [24]. In their paper they developed a two-stage-structured population model (SSPM) in which individuals were divided into two stages, juveniles and adults, depending only on their size. This SSPM incorporates key individual life-history processes such as food-dependent growth, maturation and reproduction. They also showed that the SSPM is a reliable approximation of the fully size-structured population model. Recently, [15] adopted the model introduced by [24] and compared the properties of this SSPM with those of an age-structured population model. They investigated how stage-dependent harvesting strategies that qualify for pretty good yield (PGY) can account for conservation. Their results showed that PGY harvesting strategies give large conservation benefits and that equal harvesting rates of juveniles and adults is always a good strategy. Our work extends the model of [24] by introducing cannibalism, using four stages in the population, and including different harvesting scenarios. This paper is structured as follows.
In Sect. 33.2, we present the description of a four-stage-structured fish population model including harvesting and cannibalism. In Sect. 33.3, we introduce different harvesting scenarios for juveniles and adults, with different consequences for yield, impact on biomass and impact on size structure. Finally, in the concluding Sect. 33.4, we discuss the results obtained.

33.2 Model Description

In this section, we consider interactions of African Catfish and a food resource. The African Catfish population is divided into four stages: small juveniles, large juveniles, small adults and large adults. The stages depend only on size. In cannibalism, the ability of a predator to capture, kill and handle prey depends on both the predator size and the prey size, see [7, 26, 29]. Generally, a cannibal is considerably larger than its victim [20]. In addition to the food resource, small adults feed on small juveniles, while large adults feed on large juveniles.


Table 33.1 Variables used in the four stage-structured catfish-food resource model

X1(t): Biomass of small juvenile African Catfish at time t
X2(t): Biomass of large juvenile African Catfish at time t
X3(t): Biomass of small adult African Catfish at time t
X4(t): Biomass of large adult African Catfish at time t
R(t): Biomass of food resource at time t

33.2.1 Biological Assumptions

In the model description, we make the following biological assumptions:

1. Individuals are characterized by their sizes, s. They are assumed to be born with size sb. The size of small juveniles is in the interval sb ≤ s < s1, the size of large juveniles is in the interval s1 ≤ s < s2, the size of small adults is in the interval s2 ≤ s < smax, and the size of large adults is smax.
2. Small and large adults prefer cannibalism to eating the food resource.
3. Large adults of size smax do not grow but invest all their energy in reproduction.
4. Small adults use the available energy for growth, maturation and reproduction.
5. Small and large juveniles use all available energy for growth and maturation.
6. The growth rates and reproduction rates depend on food abundance.
7. Juveniles and adults do not produce biomass when the energy intake is insufficient to cover maintenance requirements.
8. Maturation rates depend on the net biomass ingestion rates.
9. The net ingestion rates are assumed to equal the balance between ingestion and maintenance rates.

We follow the definitions in de Roos et al. [24] of the net biomass ingestion rates of individuals in the different stages and of the maturation rates from one stage to another. The biomass of small juvenile Catfish, X1(t), increases due to recruitment of biomass to the stage through somatic growth at a rate ω1(R), reproduction at a rate ω4(R, X2) from large adult Catfish and reproduction at a rate kν3(ω3(R, X1)) from small adult Catfish. It decreases due to maturation into the large juvenile stage at a rate ν1(ω1(R)), harvesting at a rate h1, cannibalism at a rate C13(X1, X3) and death from natural causes at a rate μ1. The net biomass ingestion rate of small juveniles is given by

ω1(R) = max{0, σ1 I1 R/(H1 + R) − T1},

Table 33.2 Parameters used in the four stage-structured catfish-food resource model

Hi: Half saturation constant for consumer stage i, (i = 1, 2, 3, 4)
r: Intrinsic per capita growth rate of food resource
Rmax: Carrying capacity of food resource
sb: Size of small juvenile Catfish at birth
s1: Size of maturation to large juvenile Catfish
s2: Size of maturation to small adult Catfish
smax: Size of large adult Catfish
k: Reproduction ratio of small adult Catfish
Ti: Maintenance rate at stage i, (i = 1, 2, 3, 4)
Ii: Maximum ingestion rate per unit biomass at stage i, (i = 1, 2, 3, 4)
μi: Natural mortality rate at stage i, (i = 1, 2, 3, 4)
h: Harvesting strategy h = (h1, h2, h3, h4)
σi: Conversion efficiency of ingested biomass at stage i, (i = 1, 2, 3, 4)

where σ1 is the conversion efficiency of food resource into small juvenile biomass, I1 is the maximum ingestion rate per unit biomass of small juveniles, T1 is the maintenance rate of small juveniles and H1 is the half saturation food level of small juveniles. The net biomass ingestion rate of large adults is given by

ω4(R, X2) = max{0, I4 σ4 F(X2, R) − T4},

where

F(X2, R) = X2/(H4 + X2) + (1 − X2/(H4 + X2)) R/(H4 + R)  (33.1)

and σ4 is the conversion efficiency of food resource and large juveniles into large adult biomass, T4 is the maintenance rate of large adults and H4 is the half saturation constant. We designed the feeding rate, F(X2, R), in such a way that cannibalism is preferred over eating the food resource, R. In particular, for large values of the biomass X2 the feeding rate F(X2, R) approaches 1, while for small X2 it reduces to R/(H4 + R).
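The ingestion terms above can be sketched directly; the parameter values here are illustrative, not the paper's calibration. The checks confirm the two limits of the feeding preference F(X2, R) noted in the text.

```python
# Sketch of the ingestion terms in Sect. 33.2: the net biomass ingestion rate
# omega1(R) and the cannibalistic feeding preference F(X2, R) of Eq. (33.1).
# Parameter values are illustrative only.
def omega1(R, sigma1=0.5, I1=1.0, T1=0.1, H1=3.0):
    """Net biomass ingestion rate of small juveniles (non-negative)."""
    return max(0.0, sigma1 * I1 * R / (H1 + R) - T1)

def feeding_rate(X2, R, H4=3.0):
    """Eq. (33.1): cannibalism preferred over the food resource R."""
    p = X2 / (H4 + X2)            # share of the diet taken from large juveniles
    return p + (1.0 - p) * R / (H4 + R)

print(feeding_rate(0.0, 2.0))     # no juveniles: reduces to R/(H4+R) = 0.4
print(feeding_rate(1e6, 2.0))     # abundant juveniles: approaches 1
print(omega1(0.0))                # no resource: rate clipped at 0
```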

vl(F) = Imax F/(F + Fh) − E0  if F > Fh/(Imax/E0 − 1), and vl(F) = 0 otherwise,  (34.6)

vA(F, N) = q Imax F/(F + Fh) − E0 + I*max N(t)/(N(t) + Nh)  if F > Fh/(q Imax/E0 − 1), and vA(F, N) = 0 otherwise,  (34.7)

respectively, where E0 is the maintenance rate per unit body mass per unit time. The growth rate of the juveniles and the reproduction rate for the adults are defined as

g(x, F) = x vl(F),  (34.8)
b(xm, F) = (xm/xb) vA(F),  (34.9)

respectively. The natural mortality is modeled by the sum of the background and the starvation mortality, where the starvation mortality becomes non-zero when the resource density is insufficient to cover maintenance costs. Therefore, the mortality rates of the juveniles and adults are

μl(F) = μl  if F ≥ Fh/(Imax/E0 − 1), and μl(F) = μl − vl(F) otherwise,

μa(F) = μa  if F ≥ Fh/(q Imax/E0 − 1), and μa(F) = μa − vA(F) otherwise.

For the sake of algebraic simplicity, we suppress the argument of the food resource, writing F for F(t), and write vl, vA, μl and μa for the vital rates.
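A small sketch of the (reconstructed) juvenile rate vl from Eq. (34.6), with illustrative parameter values: at the starvation threshold F* = Fh/(Imax/E0 − 1) the functional response exactly covers maintenance, so the clipped rate vanishes continuously there.

```python
# Sketch of the juvenile net production rate vl of Eq. (34.6) as reconstructed
# above; parameter values are illustrative. At the threshold
# F* = Fh / (Imax/E0 - 1) intake exactly balances maintenance.
def nu_l(F, Imax=1.0, Fh=2.0, E0=0.25):
    raw = Imax * F / (F + Fh) - E0
    return max(0.0, raw)

Imax, Fh, E0 = 1.0, 2.0, 0.25
F_star = Fh / (Imax / E0 - 1.0)
print(F_star, nu_l(F_star))  # rate vanishes exactly at the threshold
```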

34.2.4 Modification of the Physiologically Structured Population Model

In this section, we formulate a three-stage-structured population model using an averaging process over the basic PSPM. Our stage-structured population model considers two juvenile distributions, l1(x, t) and l2(x, t), and the adult livestock population N(t). The former class represents the young juveniles, which are kept at home

760

S. Canpwonyi and L. Carlsson

and fed by the dams, while the latter consists of the larger juveniles, which are taken to graze together with the rest of the adult livestock. In addition, we subject the larger juveniles and the adult livestock to harvesting based on their sizes and health status. Following the traditional practice of the pastoral communities in the study area, livestock are taken out for grazing on a daily basis, and therefore we neglect starvation mortality for the sake of model simplicity. Using the ideas from [26], we assume the following dynamics for the grazing system:

∂l1(x, t)/∂t + ∂(g1(x, N) l1(x, t))/∂x = −μl l1(x, t),  xb ≤ x < xw,  (34.10)
∂l2(x, t)/∂t + ∂(g2(x, F) l2(x, t))/∂x = −(μ2 + h1) l2(x, t),  xw ≤ x < xm,  (34.11)
g1(xb, N) l1(xb, t) = b(xm, F) N,  (34.12)
l1(t, xw) g1(N, xw) = l2(t, xw) g2(F, xw),  (34.13)
dN/dt = g2(xm, F) l2(xm, t) − (N/(N + Nh)) I*max ∫_{xb}^{xw} x l1(x, t) dx − (μa + h2) N,  (34.14)
dF/dt = r F (1 − F/Fmax) − Imax F/(F + Fh) ( ∫_{xw}^{xm} x l2(x, t) dx + q xm N ),  (34.15)

where N and F are functions of the time t. Furthermore, the growth rates of the juvenile classes l1(x, t) and l2(x, t) are, respectively, given by

g1(x, N) = x vL1(N),  g2(x, F) = x vL2(F).  (34.16)

Finally, I*max is the maximum ingestion rate for the smaller juveniles. The PDEs (34.10) and (34.11) describe the changes in the juvenile size distributions l1(x, t) and l2(x, t) due to growth, mortality and harvesting. The boundary conditions (34.12) and (34.13) account for the change in population due to reproduction of newborns by the adult livestock and the influx of juveniles from l1(t, x) into l2(t, x) across the size xw. The ODE (34.14) relates the adult dynamics to the maturation term, balanced by the birth of new individuals, the feeding of the newly born calves, and mortality and harvesting.

34 On the Approximation of Physiologically Structured Population Model …

761

34.3 Stage-Structured Population Model

We now develop a three-stage-structured population model (SSPM), an extension carried out along the lines of de Roos [42], by assuming that the equilibrium solutions of the resulting SSPM and the basic PSPM are identical. The two-stage-structured population model in [42] categorizes the population on the basis of size in their life history, considering broadly juveniles and adults. Further studies of two-stage-structured population models considering different harvesting strategies can be found in [23], and a four-stage-structured population model has been investigated in [29]. We will generate a system of ordinary differential equations that links the growth of the juveniles into the adult population by considering an interacting livestock-forage model. To do this, we associate the stages with the total biomass of the small juveniles (which derive their energy requirements from the dams), the large juveniles and the adult livestock. Thus, the three stages become the biomasses L1, L2 and A defined, respectively, by

L1(t) = ∫_{xb}^{xw} x l1(x, t) dx,  L2(t) = ∫_{xw}^{xm} x l2(x, t) dx,  (34.17)
A(t) = xm N(t).  (34.18)

Now we can obtain the differential equations describing the grazing system in terms of L1, L2 and A. Differentiating Eqs. (34.17) and (34.18) with respect to time gives

dL1/dt = d/dt ∫_{xb}^{xw} x l1(x, t) dx,    dL2/dt = d/dt ∫_{xw}^{xm} x l2(x, t) dx,    (34.19)

dA/dt = xm dN/dt.    (34.20)

Using Leibniz's rule on the small juveniles, L1, in Eq. (34.19), the differential equation becomes

dL1/dt = d/dt ∫_{xb}^{xw} x l1(x, t) dx = ∫_{xb}^{xw} x ∂l1(x, t)/∂t dx.

Applying Eq. (34.10), we obtain

dL1/dt = ∫_{xb}^{xw} x [ −∂/∂x (g1(x, N) l1(x, t)) − μ1 l1(x, t) ] dx
       = −∫_{xb}^{xw} x ∂/∂x (g1(x, N) l1(x, t)) dx − μ1 ∫_{xb}^{xw} x l1(x, t) dx,    (34.21)

where we denote the first integral on the right-hand side by I.


S. Canpwonyi and L. Carlsson

Using integration by parts on I together with Eqs. (34.16) and (34.17), we have

I = −[x g1(x, N) l1(x, t)]_{xb}^{xw} + ∫_{xb}^{xw} g1(x, N) l1(x, t) dx
  = xb g1(xb, N) l1(xb, t) − xw g1(xw, N) l1(xw, t) + vL1(N) L1.

Thus, Eq. (34.21) now becomes

dL1/dt = xb g1(xb, N) l1(xb, t) − xw g1(xw, N) l1(xw, t) + vL1(N) L1 − μ1 L1,    (34.22)

where we denote the first term on the right-hand side by II. With the boundary condition (34.12), Definition (34.9), and Eq. (34.20), the reproduction rate of the adult livestock reads

II = xb g1(xb, N) l1(xb, t) = xb vA(F) (xm N / xb) = vA A.

By this substitution, Eq. (34.22) becomes

dL1/dt = vA A − xw g1(xw, N) l1(xw, t) + vL1(N) L1 − μ1 L1.

It should be remarked that this expression will later be rewritten in terms of L1, L2 and A. Similar manipulations applied to the second term of Eq. (34.19) yield

dL2/dt = xw g1(xw, N) l1(xw, t) − xm g2(xm, F) l2(xm, t) + vL2(F) L2 − (μ2 + h1) L2.

Turning to the dynamics of the adult livestock given by Eq. (34.14) and using Definition (34.18), we obtain

dA/dt = xm g2(xm, F) l2(xm, t) − (1/(N + Nh)) I*_max L1 A − (μa + h2) A.

Thus the differential equations describing the dynamics of the grazing system become

dL1/dt = vA A − xw g1(xw, N) l1(xw, t) + vL1(N) L1 − μ1 L1,    (34.23)

dL2/dt = xw g1(xw, N) l1(xw, t) − xm g2(xm, F) l2(xm, t) + vL2(F) L2 − (μ2 + h1) L2,    (34.24)

dA/dt = xm g2(xm, F) l2(xm, t) − (1/(N + Nh)) I*_max L1 A − (μa + h2) A,    (34.25)

dF/dt = rF (1 − F/Fmax) − Imax (F/(F + Fh)) (L2 + q A).    (34.26)

The terms xw g1(xw, N) l1(xw, t) and xm g2(xm, F) l2(xm, t) are the maturation rates of the juveniles from stage L1 to L2 and from L2 to A, respectively. We now express these rates in terms of L1, L2 and A. Recall that the PSPM and the SSPM have identical biomass equilibria L1*, L2* and A*, obtained when ∂l1*(x, t)/∂t = ∂l2*(x, t)/∂t = 0, so that Eqs. (34.10) and (34.11) become

∂/∂x (g1(x, N) l1*(x, t)) = −μ1 l1*(x, t),    (34.27)

∂/∂x (g2(x, F) l2*(x, t)) = −(μ2 + h1) l2*(x, t).    (34.28)

Dividing Eq. (34.27) on both sides by g1(x, N) l1*(x, t) gives

[∂/∂x (g1(x, N) l1*(x, t))] / [g1(x, N) l1*(x, t)] = −μ1 l1*(x, t) / [g1(x, N) l1*(x, t)].    (34.29)

Integrating from size xb to any size x and using Eq. (34.8), we obtain (see the Appendix "Systematic derivation of the maturation rates of juveniles and adults" for a careful derivation) the l1-equilibrium distribution

l1*(x, t) = [g1(xb, N) l1*(xb, t) / (x vL1)] xb^{μ1/vL1} x^{−μ1/vL1}.    (34.30)

Now we find the juvenile biomass density for L1 at equilibrium by substituting Eq. (34.30) in Definition (34.17) for L1; after some algebraic manipulations (see the Appendix "Systematic derivation of the maturation rates of juveniles and adults") we get

L1*(t) = [g1(xb, N) l1*(xb, t) xb^{μ1/vL1} / (vL1 − μ1)] (xw^{1 − μ1/vL1} − xb^{1 − μ1/vL1}),    (34.31)

or equivalently,

g1(xb, N) l1*(xb, t) xb^{μ1/vL1} = [(vL1 − μ1) / (xw^{1 − μ1/vL1} − xb^{1 − μ1/vL1})] L1*(t).    (34.32)

Multiplying Eq. (34.30) by x² vL1 on both sides, evaluating at x = xw and using Eq. (34.8) gives

xw g1(xw, N) l1*(xw, t) = g1(xb, N) l1*(xb, t) xb^{μ1/vL1} xw^{1 − μ1/vL1}.    (34.33)

Inserting Eq. (34.32) into Eq. (34.33) and applying some algebra, we obtain an expression for the maturation rate of the juveniles from l1 into l2, attained at xw:

xw g1(xw, N) l1*(xw, t) = [(vL1 − μ1) / (xw^{1 − μ1/vL1} − xb^{1 − μ1/vL1})] L1*(t) xw^{1 − μ1/vL1}
  = [(vL1 − μ1) / (xw^{1 − μ1/vL1} (1 − (xb/xw)^{1 − μ1/vL1}))] L1*(t) xw^{1 − μ1/vL1}
  = [(vL1 − μ1) / (1 − (xb/xw)^{1 − μ1/vL1})] L1*(t)
  = γ(vL1(A)) L1*(t),

where

γ(vL1(A)) = (vL1 − μ1) / (1 − (xb/xw)^{1 − μ1/vL1}).

Since A = xm N by (34.18), the growth rate vL1 = vL1(N) may equivalently be regarded as a function of A.
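As a sanity check on the algebra above, the following sketch verifies numerically that the stage-structured maturation rate γ(vL1) L1*(t) coincides with the direct boundary-flux expression (34.33). All parameter values and the boundary flux are purely illustrative assumptions, not values from the paper:

```python
# Hypothetical illustrative values (not from the paper)
v_L1, mu1 = 0.8, 0.3     # juvenile growth and mortality rates
x_b, x_w = 1.0, 5.0      # birth size and weaning size
flux_b = 2.0             # assumed boundary flux g1(x_b, N) l1*(x_b, t)

p = 1.0 - mu1 / v_L1     # common exponent 1 - mu1/vL1

# Equilibrium biomass L1*, Eq. (34.31)
L1 = flux_b * x_b**(mu1 / v_L1) / (v_L1 - mu1) * (x_w**p - x_b**p)

# Maturation rate at x_w, direct form, Eq. (34.33)
mat_direct = flux_b * x_b**(mu1 / v_L1) * x_w**p

# Stage-structured form gamma(vL1) * L1*, as derived in the text
gamma = (v_L1 - mu1) / (1.0 - (x_b / x_w)**p)
mat_stage = gamma * L1

# The two expressions agree up to floating-point rounding
print(abs(mat_direct - mat_stage))
```

The agreement is exact algebraically, since xw^p − xb^p = xw^p (1 − (xb/xw)^p).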

To rewrite Eq. (34.28) in terms of L2, we divide it by g2(x, F) l2*(x, t) to obtain

[∂/∂x (g2(x, F) l2*(x, t))] / [g2(x, F) l2*(x, t)] = −(μ2 + h1) l2*(x, t) / [g2(x, F) l2*(x, t)].

Using derivations analogous to those above, the l2-equilibrium distribution is

l2*(x, t) = [g2(xw, F) l2*(xw, t) / (x vL2)] xw^{(μ2+h1)/vL2} x^{−(μ2+h1)/vL2},

and the juvenile biomass density for L2 can be derived as follows:

L2*(t) = ∫_{xw}^{xm} x [g2(xw, F) l2*(xw, t) / (x vL2)] xw^{(μ2+h1)/vL2} x^{−(μ2+h1)/vL2} dx
  = [g2(xw, F) l2*(xw, t) xw^{(μ2+h1)/vL2} / vL2] ∫_{xw}^{xm} x^{−(μ2+h1)/vL2} dx
  = [g2(xw, F) l2*(xw, t) xw^{(μ2+h1)/vL2} / (vL2 − (μ2 + h1))] (xm^{1 − (μ2+h1)/vL2} − xw^{1 − (μ2+h1)/vL2}),

or equivalently,

g2(xw, F) l2*(xw, t) xw^{(μ2+h1)/vL2} = [(vL2 − (μ2 + h1)) / (xm^{1 − (μ2+h1)/vL2} − xw^{1 − (μ2+h1)/vL2})] L2*(t).    (34.34)
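As a quick numerical cross-check of the integration leading to Eq. (34.34), the sketch below compares a trapezoid-rule quadrature of ∫ x l2*(x) dx against the closed form. The parameter values and the boundary flux g2(xw, F) l2*(xw, t) are assumed constants chosen for illustration only:

```python
# Illustrative values only (not from the paper)
vL2, mu2, h1 = 0.7, 0.15, 0.05
xw, xm = 5.0, 20.0
flux_w = 1.5                 # assumed boundary flux g2(xw, F) l2*(xw, t)
m = (mu2 + h1) / vL2         # exponent (mu2 + h1)/vL2

def l2_star(x):
    """Equilibrium distribution l2*(x) derived in the text."""
    return flux_w * xw**m * x**(-m) / (x * vL2)

# composite trapezoid rule for L2* = integral of x * l2*(x) over [xw, xm]
n = 20000
h = (xm - xw) / n
quad = sum((xw + i * h) * l2_star(xw + i * h) for i in range(1, n))
quad += 0.5 * (xw * l2_star(xw) + xm * l2_star(xm))
quad *= h

# closed form from Eq. (34.34)'s derivation
closed = flux_w * xw**m / (vL2 - (mu2 + h1)) * (xm**(1 - m) - xw**(1 - m))
print(abs(quad - closed) / closed)   # small relative error
```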

Recalling that the maturation rate from the L2-stage to the A-stage, attained at size xm, is given by

xm g2(xm, F) l2(xm, t) = g2(xw, F) l2(xw, t) xw^{(μ2+h1)/vL2} xm^{1 − (μ2+h1)/vL2},    (34.35)

and substituting Eq. (34.34) in Eq. (34.35) gives

xm g2(xm, F) l2*(xm, t) = [(vL2 − (μ2 + h1)) / (xm^{1 − (μ2+h1)/vL2} − xw^{1 − (μ2+h1)/vL2})] L2*(t) xm^{1 − (μ2+h1)/vL2}
  = [(vL2 − (μ2 + h1)) / (1 − (xw/xm)^{1 − (μ2+h1)/vL2})] L2*(t)
  = γ(vL2) L2*(t),

where

γ(vL2) = (vL2 − (μ2 + h1)) / (1 − (xw/xm)^{1 − (μ2+h1)/vL2}).

Calibrating the solutions at equilibrium, we use the resulting three-stage-structured system of ODEs as the population model for the grazing system:

dL1/dt = vA(F) A + vL1(A) L1 − γ(vL1(A)) L1 − μ1 L1,    (34.36)

dL2/dt = γ(vL1(A)) L1 + vL2(F) L2 − γ(vL2(F)) L2 − (μ2 + h1) L2,    (34.37)

dA/dt = γ(vL2(F)) L2 − (1/(N + Nh)) I*_max L1 A − (μa + h2) A,    (34.38)

dF/dt = rF (1 − F/Fmax) − Imax (F/(F + Fh)) (L2 + q A),    (34.39)

where N = A/xm.
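To illustrate how the system (34.36)-(34.39) can be simulated, here is a minimal forward-Euler sketch. The saturating rate functions vA, vL1, vL2 and every parameter value below are placeholders chosen only so the sketch runs; the paper defines these functions elsewhere and calibrates parameters from data:

```python
def gamma(v, mu_h, x_in, x_out):
    """Maturation-rate factor derived in the text (requires v > mu_h)."""
    p = 1.0 - mu_h / v
    return (v - mu_h) / (1.0 - (x_in / x_out) ** p)

# Placeholder parameters (illustrative only)
mu1, mu2, mua = 0.1, 0.1, 0.05          # mortality rates
h1, h2 = 0.02, 0.02                     # harvesting rates
r, Fmax, Fh, Imax, q = 1.0, 10.0, 2.0, 0.5, 0.8
Istar, Nh, xm = 0.3, 1.0, 20.0          # I*_max, half-saturation, adult size
xb, xw = 1.0, 5.0                       # birth and weaning sizes

# Placeholder saturating rate functions (chosen so v > relevant loss rate)
def vL1(A): return 0.4 * A / (A + 1.0) + 0.2
def vL2(F): return 0.5 * F / (F + Fh) + 0.2
def vA(F):  return 0.3 * F / (F + Fh)

def rhs(L1, L2, A, F):
    """Right-hand side of (34.36)-(34.39), with N = A/xm."""
    N = A / xm
    g1 = gamma(vL1(A), mu1, xb, xw)
    g2 = gamma(vL2(F), mu2 + h1, xw, xm)
    dL1 = vA(F) * A + vL1(A) * L1 - g1 * L1 - mu1 * L1
    dL2 = g1 * L1 + vL2(F) * L2 - g2 * L2 - (mu2 + h1) * L2
    dA = g2 * L2 - Istar * L1 * A / (N + Nh) - (mua + h2) * A
    dF = r * F * (1 - F / Fmax) - Imax * F / (F + Fh) * (L2 + q * A)
    return dL1, dL2, dA, dF

# Forward Euler over t in [0, 5]
state = [1.0, 1.0, 1.0, 5.0]   # L1, L2, A, F
dt = 0.01
for _ in range(500):
    d = rhs(*state)
    state = [s + dt * ds for s, ds in zip(state, d)]
print([round(s, 3) for s in state])
```

A production run would of course use an adaptive ODE solver rather than forward Euler; the point here is only the structure of the right-hand side.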


34.4 Discussion and Conclusion

In this paper, we present a three-stage-structured population model of a grazing system. This three-stage model accounts for the growth characteristics of the juvenile population: the younger juveniles are kept apart from the older ones and are fed on milk by the dams, while the older juveniles are taken out for external grazing together with the adult livestock. We have carefully explained all calculations in the derivation of our three-stage-structured population model from the basic physiologically structured population model by calibrating their equilibrium solutions, placed in the context of previous studies [23, 37, 42]. Having formulated this model, derived by an averaging approximation of a physiologically structured population model, we find it more realistic than (34.1), because it incorporates more biological detail about the characteristics of the consumer population. We intend to build upon this work by analytically investigating and numerically simulating the three-stage-structured population model using empirical data and parameter values. In particular, the results can be used to make predictions about sustainable livestock production and forage biomass conservation, as well as about maintaining a healthy grassland ecosystem.

Acknowledgements This research work was supported by the Swedish International Development Cooperation Agency (Sida) and the International Science Programme (ISP) in collaboration with the Sida-Makerere Bilateral Research Cooperation. Canpwonyi is very grateful to the research environment at Mathematics and Applied Mathematics (MAM), Division of Applied Mathematics, Mälardalen University, for providing a conducive and enabling atmosphere for education and research. Canpwonyi also wants to thank B.K. Nannyonga, G.M. Malinga, and A. Ssematimba for their guidance in the course of his studies.

Appendix: Systematic Derivation of the Maturation Rates of Juveniles and Adults

Our task is to write the maturation rates of the juveniles, xw g1(xw, N) l1(xw, t) and xm g2(xm, F) l2(xm, t), in terms of L1, L2 and A. At equilibrium, ∂l1*(x, t)/∂t = ∂l2*(x, t)/∂t = 0, and the juvenile dynamics become

∂/∂x (g1(x, N) l1*(x, t)) = −μ1 l1*(x, t),    (34.40)

∂/∂x (g2(x, F) l2*(x, t)) = −(μ2 + h1) l2*(x, t).    (34.41)

Now, considering Eq. (34.40) and dividing both sides by g1(x, N) l1*(x, t), we have

[∂/∂x (g1(x, N) l1*(x, t))] / [g1(x, N) l1*(x, t)] = −μ1 l1*(x, t) / [g1(x, N) l1*(x, t)].

Integrating from size xb to any size x and using the juvenile growth rate g1(s, N) = s vL1, we obtain

∫_{xb}^{x} [∂/∂s (g1(s, N) l1*(s, t))] / [g1(s, N) l1*(s, t)] ds = −∫_{xb}^{x} μ1 l1*(s, t) / [g1(s, N) l1*(s, t)] ds

[ln(g1(s, N) l1*(s, t))]_{xb}^{x} = −∫_{xb}^{x} μ1 / g1(s, N) ds

ln(g1(x, N) l1*(x, t)) − ln(g1(xb, N) l1*(xb, t)) = −(μ1/vL1) ∫_{xb}^{x} (1/s) ds

ln[ g1(x, N) l1*(x, t) / (g1(xb, N) l1*(xb, t)) ] = −(μ1/vL1) [ln s]_{xb}^{x}
  = −(μ1/vL1) (ln x − ln xb)
  = −(μ1/vL1) ln(x/xb)
  = ln (x/xb)^{−μ1/vL1}

or equivalently,

g1(x, N) l1*(x, t) / [g1(xb, N) l1*(xb, t)] = (x/xb)^{−μ1/vL1},

g1(x, N) l1*(x, t) = g1(xb, N) l1*(xb, t) (x/xb)^{−μ1/vL1} = g1(xb, N) l1*(xb, t) xb^{μ1/vL1} x^{−μ1/vL1}.

Therefore the maturation rate of l1 into l2, attained at xw, is given by

xw g1(xw, N) l1*(xw, t) = g1(xb, N) l1*(xb, t) xb^{μ1/vL1} xw^{1 − μ1/vL1}.    (34.42)

The equilibrium juvenile distribution for l1 is therefore

l1*(x, t) = [g1(xb, N) l1*(xb, t) / (x vL1)] xb^{μ1/vL1} x^{−μ1/vL1}.    (34.43)
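The closed form (34.43) can be checked numerically: with g1(x, N) = x vL1, a central finite difference of d/dx (g1 l1*) should reproduce −μ1 l1*. The boundary flux and rate values below are illustrative assumptions, not values from the paper:

```python
# Illustrative values only (not from the paper)
vL1, mu1, xb = 0.8, 0.3, 1.0
flux_b = 2.0   # assumed boundary flux g1(xb, N) l1*(xb, t)

def l1_star(x):
    """Equilibrium distribution (34.43)."""
    return flux_b * xb**(mu1 / vL1) * x**(-mu1 / vL1) / (x * vL1)

def g_l(x):
    """g1(x, N) l1*(x) with g1 = x vL1."""
    return x * vL1 * l1_star(x)

x, h = 2.0, 1e-6
lhs = (g_l(x + h) - g_l(x - h)) / (2 * h)   # d/dx (g1 l1*) by central difference
rhs = -mu1 * l1_star(x)                     # right-hand side of (34.40)
print(abs(lhs - rhs))                       # small discretization error
```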

Next we calculate the juvenile biomass density for L1 by substituting Eq. (34.43) in Eq. (34.17), so that

L1*(t) = ∫_{xb}^{xw} x [g1(xb, N) l1(xb, t) / (x vL1)] xb^{μ1/vL1} x^{−μ1/vL1} dx
  = [g1(xb, N) l1(xb, t) xb^{μ1/vL1} / vL1] ∫_{xb}^{xw} x^{−μ1/vL1} dx
  = [g1(xb, N) l1(xb, t) xb^{μ1/vL1} / vL1] [ x^{1 − μ1/vL1} / (1 − μ1/vL1) ]_{xb}^{xw}
  = [g1(xb, N) l1(xb, t) xb^{μ1/vL1} / vL1] (xw^{1 − μ1/vL1} − xb^{1 − μ1/vL1}) / (1 − μ1/vL1)
  = [g1(xb, N) l1(xb, t) xb^{μ1/vL1} / (vL1 − μ1)] (xw^{1 − μ1/vL1} − xb^{1 − μ1/vL1}),

or equivalently,

g1(xb, N) l1*(xb, t) xb^{μ1/vL1} = [(vL1 − μ1) / (xw^{1 − μ1/vL1} − xb^{1 − μ1/vL1})] L1*(t),

and substituting this in Eq. (34.42), we obtain the juvenile maturation rate of L1 from xb to xw:

xw g1(xw, N) l1(xw, t) = [(vL1 − μ1) / (xw^{1 − μ1/vL1} − xb^{1 − μ1/vL1})] L1*(t) xw^{1 − μ1/vL1}
  = [(vL1 − μ1) / (xw^{1 − μ1/vL1} (1 − (xb/xw)^{1 − μ1/vL1}))] L1*(t) xw^{1 − μ1/vL1}
  = [(vL1 − μ1) / (1 − (xb/xw)^{1 − μ1/vL1})] L1*(t)
  = γ(vL1) L1*(t),

where

γ(vL1) = (vL1 − μ1) / (1 − (xb/xw)^{1 − μ1/vL1}).

The juvenile maturation rate of L2, from size xw to size xm, can be derived in a similar fashion to give

xm g2(xm, F) l2(xm, t) = γ(vL2) L2*(t),

where

γ(vL2) = (vL2 − (μ2 + h1)) / (1 − (xw/xm)^{1 − (μ2+h1)/vL2}).

With these maturation rates for the juveniles substituted in the juvenile dynamics, we obtain the set of ordinary differential equations describing the three-stage-structured population model for the grazing system given in (34.36)-(34.39).

References

1. Berryman, A.: On principles, laws and theory in population ecology. Oikos 103(3), 695–701 (2003)
2. Berryman, A., Michalski, J., Gutierrez, A., Arditi, R.: Logistic theory of food web dynamics. Ecology 76(2), 336–343 (1995)
3. Bertness, M., Callaway, R.: Positive interactions in communities. Trends Ecol. Evol. 9(5), 191–193 (1994)
4. Brännström, Å., Carlsson, L., Simpson, D.: On the convergence of the escalator boxcar train. SIAM J. Numer. Anal. 51(6), 3213–3231 (2013)
5. Byström, P., Andersson, J.: Size-dependent foraging capacities and intercohort competition in an ontogenetic omnivore (Arctic char). Oikos 110(3), 523–536 (2005)
6. Cherrett, J.: Ecological concepts: the contribution of ecology to an understanding of the natural world (1989)
7. Chesson, P.: MacArthur's consumer-resource model. Theor. Popul. Biol. 37(1), 26–38 (1990)
8. Cohen, J., Pimm, S., Yodzis, P., Saldaña, J.: Body sizes of animal predators and animal prey in food webs. J. Anim. Ecol. 67–78 (1993)
9. Connell, J.: On the prevalence and relative importance of interspecific competition: evidence from field experiments. Am. Nat. 122(5), 661–696 (1983)
10. Cropp, R., Norbury, J.: Population interactions in ecology: a rule-based approach to modeling ecosystems in a mass-conserving framework. SIAM Rev. 57(3), 437–465 (2015)
11. Cuddington, K.: The "balance of nature" metaphor and equilibrium in population ecology. Biol. Philos. 16(4), 463–479 (2001)
12. Diekmann, O., Gyllenberg, M., Metz, J.: Physiologically structured population models: towards a general mathematical theory. In: Mathematics for Ecology and Environmental Sciences, pp. 5–20. Springer (2007)
13. Durinx, M., Metz, J.H., Meszéna, G.: Adaptive dynamics for physiologically structured population models. J. Math. Biol. 56(5), 673–742 (2008)
14. Ginzburg, L.: The theory of population dynamics: I. Back to first principles. J. Theor. Biol. 122(4), 385–399 (1986)
15. Gross, J., Shipley, L.A., Hobbs, N.T., Spalinger, D., Wunder, B.: Functional response of herbivores in food-concentrated patches: tests of a mechanistic model. Ecology 74(3), 778–791 (1993)
16. Hastings, A.: Global stability in Lotka-Volterra systems with diffusion. J. Math. Biol. 6(2), 163–168 (1978)
17. Hastings, A.: McKendrick-von Foerster models for patch dynamics. In: Differential Equations Models in Biology, Epidemiology and Ecology, pp. 189–199. Springer (1991)
18. Jackson, L., Trebitz, A., Cottingham, K.: An introduction to the practice of ecological modeling. Bioscience 50(8), 694–706 (2000)
19. Krementz, D., Brown, P.W., Kehoe, F., Houston, C.: Population dynamics of white-winged scoters. J. Wildl. Manag. 222–227 (1997)
20. Lawton, J.: Are there general laws in ecology? Oikos 177–192 (1999)
21. Liu, Y., He, Z.: Behavioral analysis of a nonlinear three-staged population model with age-size structure. Appl. Math. Comput. 227, 437–448 (2014)
22. Lundberg, S., Persson, L.: Optimal body size and resource density. J. Theor. Biol. 164(2), 163–180 (1993)
23. Lundström, N., Loeuille, N., Meng, X., Brännström, Å.: Meeting yield and conservation objectives by harvesting both juveniles and adults. Am. Nat. 193(3), 373–390 (2019)
24. Metz, J., De Roos, A.: The role of physiologically structured population models within a general individual-based modelling perspective. In: Individual Based Models and Approaches in Ecology: Populations, Communities, and Ecosystems, pp. 88–111 (1992)
25. Metz, J., Diekmann, O.: Age dependence. In: The Dynamics of Physiologically Structured Populations, pp. 136–184. Springer (1986)
26. Metz, J., Diekmann, O.: The Dynamics of Physiologically Structured Populations, vol. 86. Springer (1986)
27. Meza, M., Bhaya, A., Kaszkurewicz, E., da Silveira, C.: On–off policy and hysteresis on–off policy control of the herbivore-vegetation dynamics in a semi-arid grazing system. Ecol. Eng. 28(2), 114–123 (2006)
28. Mittelbach, G.: Foraging efficiency and body size: a study of optimal diet and habitat use by bluegills. Ecology 62(5), 1370–1386 (1981)
29. Nankinga, L., Carlsson, L.: A mathematical model for harvesting in a stage-structured cannibalistic system. In: Submitted to Proceedings of SPAS 2019, pp. 735–751. Springer (2020)
30. Neubert, M., Caswell, H.: Density-dependent vital rates and their population dynamic consequences. J. Math. Biol. 41(2), 103–121 (2000)
31. Odenbaugh, J.: The "structure" of population ecology: philosophical reflections on unstructured and structured models (2005)
32. Owen-Smith, N.: Credible models for herbivore-vegetation systems: towards an ecology of equations: Starfield festschrift. S. Afr. J. Sci. 98(9), 445–449 (2002)
33. Owen-Smith, N.: A metaphysiological modelling approach to stability in herbivore-vegetation systems. Ecol. Model. 149(1–2), 153–178 (2002)
34. Pennycuick, C., Compton, R., Beckingham, L.: A computer model for simulating the growth of a population, or of two interacting populations. J. Theor. Biol. 18(3), 316–329 (1968)
35. Persson, L., Leonardsson, K., Christensen, B.: Ontogenetic scaling of foraging rates and the dynamics of a size-structured consumer-resource model. Theor. Popul. Biol. 54(3), 270–293 (1998)
36. de Roos, A.: Numerical methods for structured population models: the escalator boxcar train. Numer. Methods Part. Differ. Equ. 4(3), 173–195 (1988)
37. de Roos, A.: A gentle introduction to physiologically structured population models. In: Structured-Population Models in Marine, Terrestrial, and Freshwater Systems, pp. 119–204. Springer (1997)
38. de Roos, A.: Interplay between individual growth and population feedbacks shapes body-size distributions. In: Body Size: The Structure and Function of Aquatic Ecosystems, pp. 225–244. Cambridge University Press (2007)
39. de Roos, A., Diekmann, O., Metz, J.: Studying the dynamics of structured population models: a versatile technique and its application to Daphnia. Am. Nat. 139(1), 123–147 (1992)
40. de Roos, A., Persson, L.: Physiologically structured models: from versatile technique to ecological theory. Oikos 94(1), 51–71 (2001)
41. de Roos, A., Persson, L.: Population and Community Ecology of Ontogenetic Development, vol. 59. Princeton University Press (2013)
42. de Roos, A., Schellekens, T., van Kooten, T., van de Wolfshaar, K., Claessen, D., Persson, L.: Simplifying a physiologically structured population model to a stage-structured biomass model. Theor. Popul. Biol. 73(1), 47–62 (2008)
43. Sabelis, M., Diekmann, O., Jansen, V.: Metapopulation persistence despite local extinction: predator-prey patch models of the Lotka-Volterra type. Biol. J. Lin. Soc. 42(1–2), 267–283 (1991)
44. Sæther, B.E., Bakke, Ø.: Avian life history variation and contribution of demographic traits to the population growth rate. Ecology 81(3), 642–653 (2000)
45. Sisodiya, A., Singh, B., Joshi, B.: Effect of two interacting populations on resource following generalized logistic growth. Appl. Math. Sci. 5(9), 407–420 (2011)
46. Turchin, P.: Does population ecology have general laws? Oikos 94(1), 17–26 (2001)
47. Walzer, A.: Logic and rhetoric in Malthus's Essay on the Principle of Population, 1798. Q. J. Speech 73(1), 1–17 (1987)
48. Webb, G.: Logistic models of structured population growth. In: Hyperbolic Partial Differential Equations, pp. 527–539. Elsevier (1986)
49. Weisberg, P., Coughenour, M., Bugmann, H.: Modelling of large herbivore-vegetation interactions in a landscape context. Conservation Biology Series - Cambridge 11, 348 (2006)

Chapter 35

Magnetohydrodynamic Casson Nanofluid Flow Over a Nonlinear Stretching Sheet with Velocity Slip and Convective Boundary Conditions

Prashant G. Metri, M. Subhas Abel, and Deena Sunil Sharanappa

Abstract The present work explores the combined effect of a convective boundary condition, Brownian motion and thermophoresis on the flow of a Casson fluid over a nonlinear stretching sheet, taking a momentum slip condition into account. The transport equations are used in the analysis of the effects of thermophoresis and Brownian motion. Similarity transformations reduce the nonlinear partial differential equations to a set of nonlinear ordinary differential equations. A precise numerical solution of the boundary value problem describing flow, heat and mass transfer is obtained by the efficient Runge-Kutta-Fehlberg method with a shooting technique. Numerical results are obtained for the velocity, temperature and concentration distributions. The effects of various material parameters, such as the Casson fluid parameter β, suction/injection parameter fw, Lewis number Le, Brownian motion parameter Nb, thermophoresis parameter Nt, and Prandtl number Pr, on the momentum, thermal and concentration boundary layers are investigated. The influence of the flow, heat and mass transfer parameters on the behavior of the velocity, temperature and nanoparticle concentration profiles is discussed in detail. A comparison with previously published work in the literature has been carried out and found to be in excellent agreement.

Keywords Casson fluid · Similarity solutions · Brownian motion · Transport equations · Prandtl number

MSC 2020: 76D10 · 76D09 · 76B99 · 76M45

P. G. Metri (B)
Department of Mechanical Engineering and Mathematics, Walchand Institute of Technology, Solapur, Maharashtra, India
e-mail: [email protected]

M. S. Abel
Department of Mathematics, Gulbarga University, Kalaburagi, Karnataka, India

D. S. Sharanappa
Department of Mathematics, Indira Gandhi Tribal University, Madhya Pradesh, India
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_35


35.1 Introduction

The boundary layer flow over a stretching surface has been studied extensively in fluid dynamics. Numerous studies have addressed the flow in the boundary layer of a continuously moving surface because of its industrial and technological applications, and the topic is gaining importance in the face of the increasing consumption of geothermal energy and in astrophysical problems. A better understanding of mass, momentum and energy transport in fluid flow is also helpful in many other engineering applications, namely the cooling of nuclear reactors, underground disposal of nuclear waste, operation of oil tanks, insulation of buildings, food processing, and casting and welding in industrial processes. The fluid flow and heat transfer characteristics over a stretching surface have important industrial applications, for instance in metallurgy and the chemical industry: in the extrusion process, heat-treated material moving between the feed roller and the wind-up roller, or on a conveyor belt, presents a moving solid surface. The rates of stretching and cooling significantly affect the quality of the final product with the desired properties. In these processes the molten liquid is cooled by being drawn into a cooling system. The desired product properties depend primarily on two factors: the refrigerant used and the rate of stretching. Non-Newtonian fluids of low conductivity can be selected as refrigerants, so that the heat transfer rate can be controlled externally. An optimal stretching speed is important, because rapid stretching leads to sudden hardening, thereby destroying the properties expected of the product.

Nanofluids can be considered a new generation of heat transfer fluids, as they offer exciting new opportunities to improve heat transfer performance compared to pure fluids. They are estimated to have better properties than conventional heat transfer fluids, as well as than liquids containing metal microparticles; nanofluids can also offer improved scratch resistance compared to conventional solid/liquid mixtures. The development of nanofluids is still limited by a number of factors, such as inconsistent results, poor suspension stability, and a lack of theoretical understanding of the mechanisms governing nanoparticles suspended in various base liquids. Nanofluids are colloidal suspensions of nanoparticles introduced into a base fluid. The nanoparticles used in nanofluids are usually composed of metals, oxides, carbides, or carbon nanotubes; common base liquids are water, ethylene glycol and oil. Nanofluids have new properties that make them useful for heat transfer in a wide range of applications, including microelectronics, fuel cells, pharmaceutical processes, hybrid engines, vehicle cooling and thermal management, chillers, heat exchangers, grinding, machining, and lowering boiler flue gas temperature. They have an increased thermal conductivity and convective heat transfer coefficient compared to the base fluid. Knowledge of the rheological behavior of nanofluids is very important in deciding whether they are suitable for convective heat transfer applications (Table 35.1).

Table 35.1 Nomenclature

Bi      Biot number
a       Positive constant associated with linear stretching
Cw      Nanoparticle volume fraction at the wall
C∞      Nanoparticle volume fraction at large values of y (ambient)
Cp      Specific heat at constant pressure [J kg⁻¹ K⁻¹]
DB      Brownian diffusion coefficient
DT      Thermophoretic diffusion coefficient
f(η)    Dimensionless stream function
g       Acceleration due to gravity [m s⁻²]
h       Convective heat transfer coefficient
k       Thermal conductivity of the nanofluid
Le      Lewis number
M       Magnetic field parameter, M = 2σB0² / (aρ(n + 1))
Nb      Brownian motion parameter
Nt      Thermophoresis parameter
Nu      Nusselt number
Nur     Reduced Nusselt number
Pr      Prandtl number
Re      Reynolds number
Sh      Sherwood number
Shr     Reduced Sherwood number
T       Temperature [K]
Tf      Temperature of the hot fluid
Tw      Sheet surface (wall) temperature
T∞      Ambient temperature
u       Velocity [m s⁻¹]
x, y    Dimensionless coordinates
α       Thermal diffusivity
β       Casson fluid parameter
η       Similarity variable
θ       Dimensionless temperature
μ       Viscosity [kg m⁻¹ s⁻¹]
ν       Kinematic viscosity
ρ       Density of the fluid [kg m⁻³]
ρf      Density of the base fluid
ρp      Nanoparticle mass density
φ       Dimensionless volume fraction


The Casson fluid model is classified as a subclass of non-Newtonian fluids with several applications in food processing, metallurgy, drilling operations, and bioengineering operations. The Casson model was introduced by Casson in 1959 for the prediction of the flow behavior of pigment-oil suspensions [1]. Mustafa et al. [20] studied analytical and numerical solutions of unsteady Casson fluid flow over a continuously moving plate with viscous dissipation; they observed that the boundary layer thickness decreases as the Casson parameter increases. Mukhopadhyay [19] studied non-Newtonian (Casson) fluid flow and heat transfer over a nonlinear stretching surface. Nadeem et al. [21] studied Casson nanoliquid flow and heat transfer over a stretching sheet with convective boundary conditions; analytical solutions were obtained using the optimal homotopy analysis method. Mabood et al. [10] studied electrically conducting boundary layer flow, heat and mass transfer of nanoliquids over a nonlinear stretching surface, considering the Buongiorno nanoliquid model. Sandeep et al. [28] investigated the influence of thermal radiation on electrically conducting flow and heat transfer of a dusty nanoliquid over an exponentially stretching sheet; an effective medium theory (EMT) based model was applied for the thermal conductivity of the liquid, with metal and metal oxide nanoparticles in a carboxymethyl cellulose (CMC)-water base liquid. It was observed that an external magnetic field tends to reduce the skin friction coefficient and the Nusselt number. Ullah et al. [32] studied the effects of chemical reaction and thermal radiation on electrically conducting flow, heat and mass transfer of a nanoliquid over a nonlinear porous stretching surface. Sulochana et al. [29] examined three-dimensional electrically conducting flow, heat and mass transfer of a Casson nanoliquid over a stretching surface with a convective boundary condition, employing the Buongiorno nanoliquid model; Brownian motion was observed to influence the mass transfer rate. A research group [11-15, 17, 31] investigated fluid flow and heat transfer over unsteady/steady stretching surfaces, considering different geometries. Umavathi et al. [25, 33] theoretically studied the linear stability of Maxwell nanofluid flow in a saturated porous medium, and MHD flow in a vertical double-passage channel. Ibrahim et al. [8] developed a numerical and mathematical model for electrically conducting Casson nanoliquid flow, heat and mass transfer over a nonlinear permeable stretching surface with thermal radiation, internal heating and chemical reaction. Kumaran et al. [9] studied electrically conducting Williamson and Casson fluid flow, heat and mass transfer over a cone, plate and sheet. Narayana et al. [16, 23] studied the thermocapillary flow of a non-Newtonian nanoliquid over an unsteady stretching surface. Ghadikolaei et al. [4] studied the influence of thermal radiation and chemical reaction on electrically conducting Casson nanoliquid flow, heat and mass transfer over a nonlinear inclined stretching sheet with internal heating, employing the Buongiorno nanoliquid model. Nagaraja et al. [22] studied the influence of chemical reaction and internal heating on electrically conducting Casson fluid flow over a curved stretching sheet in the presence of convective boundary conditions; the Biot number was observed to increase the temperature and concentration distributions due to the convection effect. Gangadhar et al. [3] examined the influence of buoyancy on mixed convection Casson fluid flow and heat transfer


over a stretching sheet. Haldar et al. [7] studied the casson fluid flow and heat transfer over an exponential shrinking sheet with convective boundary condition. Oyelakin et al. [24] numerically studied the three dimensional tangent hyperbolic electrically conducting Casson fluid flow and heat transfer of nanoliquid over a stretching surface with slip. Mittal et al. [18] studied the two dimensional electrically conducting mixed convection Casson nano fluid flow, heat and mass transfer over a stretching surface in presence of thermal radiation and internal heating. Gireesha et al. [5] examined the electrically conducting dusty Casson fluid flow and melting heat transfer over a stretching surface by considering modified Fourier’s law through Cattaneo-Christov heat flux model. Venkata Ramudu et al. [34] examined the influence of thermal radiation and non-uniform heat source/sink on electrically conducting mixed convection Casson fluid flow and heat transfer over a stretching surface with convective boundary condition and velocity slip. Gnaneswara Reddy et al. [6] studied the electrical conducting Casson fluid flow over contract cylinder in presence of thermal radiation. Das et al. [2] examined the influence of thermal radiation, internal heating and chemical reaction on mixed convection Casson fluid flow, heat and mass transfer over a stretching sheet with velocity slip and convective boundary condition. Most recently Tarakaramu et al. [30] examined the three dimensional electrically conducting Casson fluid flow and heat transfer over a porous stretching surface by considering internal heating. It is observed that temperature of Casson fluid is more than that of Casson fluid at infinity. Venkateswara Raju et al. [35] studied influence of thermal radiation and internal heating on the electrically conducting two dimensional mixed convection Casson fluid flow and heat transfer over a stretching sheet with solutal slip. Samantha Kumari et al. 
[27] examined three-dimensional electrically conducting Casson fluid flow, heat and mass transfer over a stretching sheet with convective boundary conditions. The above literature review shows that no work has been reported so far on flow, heat and mass transfer, and nanoparticle concentration profiles over a nonlinear stretching sheet in a Casson nanofluid. The main objective of the present work is therefore to investigate the combined effect of a convective boundary condition and a slip condition on the boundary layer flow, heat transfer and nanoparticle fraction profiles over a nonlinear stretching sheet in a Casson nanofluid. The behavior of all studied variables is presented graphically through hydrothermal and velocity characteristics, the heat and mass transfer rates are tabulated, and the effects of the various governing parameters on the velocity, nanoparticle concentration, and temperature profiles are discussed.

35.2 Mathematical Formulation

Consider the steady, two-dimensional, incompressible viscous flow of a water-based nanofluid past a nonlinearly stretching sheet with stretching velocity u_w = ax^n along the x-axis, where n is the nonlinear stretching parameter, a is a positive constant, and the coordinate x is measured along the stretched surface.


P. G. Metri et al.

Fig. 35.1 Physical configuration and coordinate system.

The fluid is electrically conducting due to an applied magnetic field B(x) normal to the stretching surface. The magnetic Reynolds number is assumed to be very small, so the induced magnetic field is negligible. The wall temperature T_w and nanoparticle concentration C_w are assumed constant at the stretching surface, and both are presumed higher than the ambient temperature T∞ and ambient nanoparticle fraction C∞, respectively. The coordinate system and flow model are shown in Fig. 35.1. The governing equations are:

\[
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0, \tag{35.1}
\]

\[
u \frac{\partial u}{\partial x} + v \frac{\partial u}{\partial y} = \nu \left(1 + \frac{1}{\beta}\right) \frac{\partial^2 u}{\partial y^2} - \frac{\sigma B_0^2 u}{\rho}, \tag{35.2}
\]

\[
u \frac{\partial T}{\partial x} + v \frac{\partial T}{\partial y} = \alpha \frac{\partial^2 T}{\partial y^2} + \tau \left[ D_B \frac{\partial C}{\partial y} \frac{\partial T}{\partial y} + \frac{D_T}{T_\infty} \left( \frac{\partial T}{\partial y} \right)^{2} \right], \tag{35.3}
\]

\[
u \frac{\partial C}{\partial x} + v \frac{\partial C}{\partial y} = D_B \frac{\partial^2 C}{\partial y^2} + \frac{D_T}{T_\infty} \frac{\partial^2 T}{\partial y^2}. \tag{35.4}
\]

where u and v are the velocity components along the x and y directions, respectively, ρ_f is the density of the base fluid, ν is the kinematic viscosity of the base fluid, α is its thermal diffusivity, τ = (ρc)_p/(ρc)_f is the ratio of the nanoparticle heat capacity to the base fluid heat capacity, D_B is the Brownian diffusion coefficient, D_T is the thermophoretic diffusion coefficient, and T is the temperature. The subscript ∞ denotes values at large y, where the fluid is quiescent.


The associated boundary conditions are:

\[
y = 0:\quad u = a x^{n} + L\left(1 + \frac{1}{\beta}\right)\frac{\partial u}{\partial y}, \quad v = v_w, \quad -k\,\frac{\partial T}{\partial y} = h\,(T_f - T), \quad C = C_w, \tag{35.5}
\]

\[
y \to \infty:\quad u \to 0, \quad v \to 0, \quad T \to T_\infty, \quad C \to C_\infty. \tag{35.6}
\]

We introduce the following dimensionless quantities:

\[
\eta = y\,\sqrt{\frac{a(n+1)}{2\nu}}\; x^{\frac{n-1}{2}}, \quad u = a x^{n} f'(\eta), \quad \theta = \frac{T - T_\infty}{T_f - T_\infty}, \quad \varphi = \frac{C - C_\infty}{C_w - C_\infty},
\]
\[
v = -\sqrt{\frac{a\nu(n+1)}{2}}\; x^{\frac{n-1}{2}} \left[ f + \frac{n-1}{n+1}\,\eta f' \right]. \tag{35.7}
\]

Substituting (35.7) into (35.1)–(35.6), the continuity equation (35.1) is satisfied identically, and we obtain the following set of equations:

\[
\left(1 + \frac{1}{\beta}\right) f''' + f f'' - \frac{2n}{n+1}\, f'^{2} - M f' = 0, \tag{35.8}
\]

\[
\theta'' + \Pr f \theta' + \Pr Nb\, \varphi' \theta' + \Pr Nt\, \theta'^{2} = 0, \tag{35.9}
\]

\[
\varphi'' + Le\, f \varphi' + \frac{Nt}{Nb}\, \theta'' = 0, \tag{35.10}
\]

subject to the boundary conditions

\[
f(0) = f_w, \quad f'(0) = 1 + \gamma \left(1 + \frac{1}{\beta}\right) f''(0), \quad \theta'(0) = -Bi\,[1 - \theta(0)], \quad \varphi(0) = 1, \tag{35.11}
\]

\[
f'(\infty) = 0, \quad \theta(\infty) = 0, \quad \varphi(\infty) = 0. \tag{35.12}
\]

where primes denote differentiation with respect to η, and the parameters appearing in Eqs. (35.8)–(35.12) are defined as follows:

\[
\Pr = \frac{\nu}{\alpha}, \quad Le = \frac{\nu}{D_B}, \quad Nb = \frac{(\rho c)_p D_B (C_w - C_\infty)}{(\rho c)_f\, \nu}, \quad Nt = \frac{(\rho c)_p D_T (T_f - T_\infty)}{(\rho c)_f\, \nu\, T_\infty},
\]
\[
Bi = \frac{h}{k}\sqrt{\frac{\nu}{a}}, \quad f_w = -v_w \sqrt{\frac{2}{a\nu(n+1)}}, \quad M = \frac{2\sigma B_0^{2}\, x^{\,n-1}}{a \rho_f (n+1)}. \tag{35.13}
\]

With Nb = 0 there is no transport due to buoyancy effects created as a result of nanoparticle concentration gradients. In Eqs. (35.8)–(35.13), β, M, f_w, γ, Pr, Le, Nb, Nt and Bi denote the Casson fluid parameter, magnetic field parameter, suction/injection parameter, slip parameter, Prandtl number, Lewis number, Brownian motion parameter, thermophoresis parameter and Biot number, respectively. The reduced Nusselt number Nur = −θ'(0) and the reduced Sherwood number Shr = −φ'(0) are obtained from the dimensionless temperature and concentration gradients at the sheet surface.
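As a concrete illustration of the definitions in Eq. (35.13), the first two dimensionless groups can be evaluated directly from tabulated transport properties. The values below are generic figures for water near room temperature and an assumed Brownian diffusivity, not data taken from this chapter:

```python
# Hypothetical property values: generic figures for water near 300 K;
# D_B is a typical order of magnitude for nanometre-scale particles,
# not a value reported in the chapter.
nu    = 8.9e-7    # kinematic viscosity [m^2/s]
alpha = 1.43e-7   # thermal diffusivity [m^2/s]
D_B   = 4.0e-10   # Brownian diffusion coefficient [m^2/s]

Pr = nu / alpha   # Prandtl number, Eq. (35.13)
Le = nu / D_B     # Lewis number, Eq. (35.13)
print(f"Pr = {Pr:.2f}, Le = {Le:.0f}")
```

The large Lewis number typical of nanofluids (species diffusion far slower than heat diffusion) is what makes the concentration boundary layer much thinner than the thermal one in the discussion below.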

35.3 Results and Discussion

The boundary layer flow of an electrically conducting Casson nanofluid, with heat and mass transfer over a nonlinear stretching sheet subject to momentum slip and a convective boundary condition, is now discussed in detail, using the Buongiorno nanoliquid model. The governing flow equations are coupled and nonlinear. Equations (35.8)–(35.12) constitute a two-point nonlinear boundary value problem, which is solved numerically by the Runge-Kutta-Fehlberg method combined with a shooting technique. The resulting velocity, temperature and nanoparticle volume fraction profiles are depicted in Figs. 35.2, 35.3, 35.4, 35.5, 35.6, 35.7, 35.8, 35.9, 35.10, 35.11, 35.12, 35.13 and 35.14, and comparison values are given in Table 35.2.
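A minimal numerical sketch of this boundary value problem: the chapter uses a Runge-Kutta-Fehlberg shooting method, but the same similarity system (35.8)–(35.12) can equally be solved with SciPy's collocation solver `solve_bvp`, truncating the far-field conditions at a finite η_max. All parameter values below are illustrative choices, not the ones behind the chapter's figures and tables:

```python
import numpy as np
from scipy.integrate import solve_bvp

# Illustrative parameters (not taken from the chapter's tables):
beta, n, M = 1.0, 1.0, 0.5        # Casson, stretching, magnetic
Pr, Le, Nb, Nt = 2.0, 2.0, 0.3, 0.3  # Prandtl, Lewis, Brownian, thermophoresis
gamma, Bi, fw = 0.2, 0.5, 0.0     # slip, Biot, suction/injection
A = 1.0 + 1.0 / beta              # the factor (1 + 1/beta)

def rhs(eta, y):
    # State vector y = [f, f', f'', theta, theta', phi, phi'].
    f, fp, fpp, th, thp, ph, php = y
    fppp = (2.0 * n / (n + 1.0) * fp**2 + M * fp - f * fpp) / A  # Eq. (35.8)
    thpp = -Pr * (f * thp + Nb * php * thp + Nt * thp**2)        # Eq. (35.9)
    phpp = -Le * f * php - (Nt / Nb) * thpp                      # Eq. (35.10)
    return np.vstack([fp, fpp, fppp, thp, thpp, php, phpp])

def bc(y0, yinf):
    return np.array([
        y0[0] - fw,                        # f(0) = fw
        y0[1] - 1.0 - gamma * A * y0[2],   # momentum slip, Eq. (35.11)
        y0[4] + Bi * (1.0 - y0[3]),        # convective (Biot) condition
        y0[5] - 1.0,                       # phi(0) = 1
        yinf[1], yinf[3], yinf[5],         # f', theta, phi -> 0 far away
    ])

eta = np.linspace(0.0, 10.0, 400)          # truncated far field
y0 = np.zeros((7, eta.size))
y0[0] = 1.0 - np.exp(-eta)                 # rough initial guesses
y0[1] = np.exp(-eta)
y0[2] = -np.exp(-eta)
y0[3] = 0.4 * np.exp(-eta)
y0[5] = np.exp(-eta)
sol = solve_bvp(rhs, bc, eta, y0, tol=1e-6, max_nodes=20000)

Nur = -sol.sol(0.0)[4]   # reduced Nusselt number,  -theta'(0)
Shr = -sol.sol(0.0)[6]   # reduced Sherwood number, -phi'(0)
print(sol.status, Nur, Shr)
```

Reading off −θ'(0) and −φ'(0) from the converged solution is exactly how comparison values like those in Table 35.2 are produced; note that with the convective condition, Nur = Bi[1 − θ(0)] is bounded above by Bi.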

35.4 Velocity Profiles

Figure 35.2 shows the effect of the Casson fluid parameter β on the velocity profiles: the velocity decreases as β increases, a decline attributable to the yield stress.

Fig. 35.2 Velocity profile for different values of Casson fluid parameter β

Fig. 35.3 Velocity profile for different values of magnetic field M

Fig. 35.4 Velocity profile for different values of momentum slip parameter γ

Figure 35.3 illustrates the effect of the magnetic field on the velocity distribution. The velocity decreases with an increase in the magnetic field parameter M, because the Lorentz force obstructs the flow in the boundary layer region; this further results in a thinning of the momentum boundary layer. Figure 35.4 shows the effect of the momentum slip parameter γ on the velocity profile: the velocity distribution is a decreasing function of γ. The reduction in momentum boundary layer thickness is attributed to the decline in the velocity of the stretching surface under the momentum slip boundary condition.


Fig. 35.5 Temperature profile for different values of Casson fluid β

Fig. 35.6 Temperature profile for different values of N t and N b

35.5 Temperature Profiles

Figure 35.5 illustrates the effect of the Casson parameter β on the temperature distribution: as the Casson parameter increases, the thermal boundary layer thickness increases, an effect due to the yield stress. Hence an increase in the Casson fluid parameter enhances the thermal boundary layer thickness. The impacts of Nb and Nt on θ(η) are elucidated in Fig. 35.6: enhancing Nb and Nt leads to faster random motion of the nanoparticles in the flow, which thickens the thermal boundary layer and rapidly raises the temperature of the Casson nanofluid. In practice, through thermophoresis the more heated particles adjacent to the surface travel away from the heated region towards the cold region and raise the temperature there, and consequently the collective temperature of the whole system


Fig. 35.7 Temperature profile for different values of Lewis number Le

Fig. 35.8 Temperature profile for different values of Prandtl number Pr

rises. The random motion of nanoparticles within the base fluid, called Brownian motion, arises from the continuous collisions between the nanoparticles and the molecules of the base fluid. As observed in Fig. 35.7, the effect of the Lewis number on the temperature profiles is noticeable only in a region close to the sheet, as the curves tend to merge at larger distances from it. The Lewis number expresses the relative contribution of the thermal diffusion rate to the species diffusion rate in the boundary layer regime. An increase in the Lewis number reduces the thermal boundary layer thickness and is accompanied by a decrease in temperature. A larger Lewis number also suppresses concentration values, i.e. inhibits nanoparticle species diffusion; the reduction in concentration boundary layer thickness is much greater than that in thermal boundary layer thickness for the same increment in Lewis number.

Fig. 35.9 Temperature profile for different values of magnetic field M

Fig. 35.10 Temperature profile for different values of Biot number Bi

Fig. 35.11 Concentration profile for different values of thermophoresis parameter Nt

Fig. 35.12 Concentration profile for different values of Lewis number Le

Fig. 35.13 Concentration profile for different values of Brownian motion parameter Nb

Figure 35.8 illustrates the effect of the Prandtl number Pr on the temperature distribution. As the Prandtl number increases, the boundary layer thickness decreases; as a consequence, the reduced Nusselt number, being proportional to the initial slope, increases. This pattern is reminiscent of free convective boundary layer flow in a regular fluid. In Fig. 35.9 the variation with the magnetic field parameter M is noticeable only in a region close to the sheet, as the curves tend to merge at larger distances from it. Amplified values of M increase the heat transfer rate, and hence the temperature profile rises: for higher values of M the Lorentz force increases, opposing the motion of the fluid particles and raising the temperature in the thermal boundary layer region. Figure 35.10 illustrates the effect of the Biot number on the thermal boundary layer. A larger Biot number corresponds to stronger convective heating at the surface; as expected, the stronger convection results in higher surface temperatures, causing the thermal effect to penetrate deeper into the quiescent fluid.


Fig. 35.14 Concentration profile for different values of Biot number Bi

Table 35.2 Comparison of Nusselt and Sherwood numbers for different values of n, Nt and Nb with Rana et al. [26] and Mabood et al. [10]

                    Rana et al. [26]       Mabood et al. [10]     Present results
  n     Nt = Nb    −θ'(0)    −φ'(0)       −θ'(0)    −φ'(0)       −θ'(0)    −φ'(0)
  0.2   0.1        0.5160    0.9062       0.5148    0.9014       0.5140    0.9012
  0.2   0.3        0.4553    0.8395       0.4520    0.8402       0.4502    0.8399
  0.2   0.5        0.3999    0.8048       0.3987    0.8059       0.3990    0.8398
  3.0   0.1        0.4864    0.8445       0.4852    0.8447       0.4850    0.8450
  3.0   0.3        0.4282    0.7785       0.4271    0.7791       0.4270    0.7790
  3.0   0.5        0.3786    0.7379       0.3775    0.7390       0.3776    0.3780
  10.0  0.1        0.4799    0.8323       0.4788    0.8325       0.4790    0.8330
  10.0  0.3        0.4227    0.7654       0.4216    0.7660       0.4220    0.7665
  10.0  0.5        –         –            –         –            –         –

35.6 Concentration Profiles

Figure 35.11 shows the effect of the thermophoresis parameter Nt on the concentration distribution. The concentration is enhanced near the stretching wall up to a certain value of η, beyond which the opposite tendency is observed; up to that point the concentration boundary layer thickness increases. This is due to the revised state of the nanoparticle concentration. The effect of the Lewis number Le on the nanoparticle concentration profiles is shown in Fig. 35.12. Unlike the temperature profiles, the concentration profiles are only slightly affected by the strength of the Brownian motion and thermophoresis. A comparison of Figs. 35.7 and 35.12 shows that the Lewis number significantly affects the concentration distribution (Fig. 35.12) but has little influence on the temperature distribution (Fig. 35.7). For a base fluid of given kinematic viscosity ν, a higher Lewis number implies a lower Brownian diffusion coefficient D_B (see Eq. (35.13)), which must result in a shorter penetration depth for the concentration boundary layer. The effect of the Brownian motion parameter Nb is seen in Fig. 35.13: with an increase in Nb, the concentration boundary layer thickness decreases. The magnitude of the concentration gradient at the surface of the sheet decreases with increasing Nb, so the reduced Sherwood number −φ'(0), which indicates the rate of mass transfer at the surface, increases with increasing Nb. This may be because Nb, as a parameter of Brownian motion, reduces the nanoparticle concentration in the boundary layer, and as a result the rate of mass transfer at the surface increases. Figure 35.14 shows that as the convective heating of the sheet is enhanced, i.e. as Bi increases, the thermal penetration depth increases. Because the concentration distribution is driven by the temperature field, one anticipates that a higher Biot number Bi would promote a deeper penetration of the concentration. This anticipation is indeed realized in Fig. 35.14, which predicts higher concentration at higher values of the Biot number.

35.7 Conclusion

A numerical study of the boundary layer flow of a Casson nanofluid induced by the motion of a nonlinearly stretching sheet has been performed. The use of a convective heating boundary condition instead of a constant temperature or a constant heat flux makes this study more general and novel. The main conclusions are as follows:

1. For an infinitely large Biot number characterizing the convective heating (which corresponds to the constant temperature boundary condition), the present results and those reported by Rana et al. [26] and Mabood et al. [10] agree up to four decimal places.
2. The concentration boundary layer is suppressed as the Lewis number and Brownian motion parameter increase.
3. The thermal boundary layer becomes thinner as the Lewis number and Prandtl number increase.
4. The momentum boundary layer becomes thinner as the magnetic field parameter increases.
5. An increase in the velocity slip parameter reduces the velocity profile.
6. For fixed Pr, Le, γ and Biot number Bi, the concentration boundary layer thickens and the local temperature rises as the Brownian motion and thermophoresis effects intensify; when Nb, Nt, Le and Bi are kept fixed and the Prandtl number Pr is increased, the temperature distribution is affected only minimally.

Acknowledgements Prashant Metri is also grateful to the FUSION network and its Swedish node, the MAM research milieu in Mathematics and Applied Mathematics, Division of Mathematics and Physics, School of Education, Culture and Communication at Mälardalen University, for support and an excellent research and research education environment during his visits.


References

1. Casson, N.: In: Rheology of Disperse Systems. Pergamon Press, Oxford, UK (1959)
2. Das, M., Mahanta, G., Shaw, S.: Heat and mass transfer effect on an unsteady MHD radiative chemically reactive Casson fluid over a stretching sheet in porous medium. Heat Trans. 49, 4350–4369 (2020)
3. Gangadhar, K., Edukondala Nayak, R., Venkata Subha Rao, M.: Buoyancy effect on mixed convection boundary layer flow of Casson fluid over a nonlinear stretching sheet using spectral relaxation method. Int. J. Ambient Energy 43(1), 1994–2002 (2020)
4. Ghadikolaei, S.S., Hosseinzadeh, K.H., Ganji, D.D., Jafari, B.: Nonlinear thermal radiation effect on magneto Casson nanoliquid flow with Joule heating effect over an inclined porous stretching sheet. Case Stud. Thermal Eng. 12, 176–187 (2018)
5. Gireesha, B.J., Shankaralingappa, B.M., Prasannakumar, B.C., Nagaraja, B.: MHD flow and melting heat transfer of dusty Casson fluid over a stretching sheet with Cattaneo-Christov heat flux model. Int. J. Ambient Energy 43(1), 2931–2939 (2022)
6. Gnaneswara Reddy, M., Vijayakumari, P., Sudharani, M.V.V.N.L., Ganesh Kumar, K.: Quadratic convective heat transport of Casson nanoliquid over a contract cylinder: an unsteady case. BioNanoSci. 10, 344–350 (2020)
7. Haldar, S., Mukhopadhyay, S., Layek, G.C.: Flow and heat transfer of Casson fluid over an exponentially shrinking permeable sheet in presence of exponentially moving free stream with convective boundary condition. Mech. Adv. Mater. Struct. 26(17), 1498–1504 (2019)
8. Ibrahim, S.M., Lorenzini, G., Vijay Kumar, P., Raju, C.S.K.: Influence of chemical reaction and heat source on dissipative MHD mixed convection flow of a Casson nanofluid over a nonlinear permeable stretching sheet. Int. J. Heat Mass Trans. 111, 346–355 (2017)
9. Kumaran, G., Sandeep, N.: Thermophoresis and Brownian motion effects on parabolic flow of MHD Casson and Williamson fluids with cross diffusion. J. Mol. Liq. 233, 262–269 (2017)
10. Mabood, F., Khan, W.A., Ismail, A.I.M.: MHD boundary layer flow and heat transfer of nanofluids over nonlinear stretching sheet: numerical study. J. Magn. Magn. Mater. 374, 569–576 (2015)
11. Metri, P.G., Metri, P.G., Abel, M.S., Silvestrov, S.: Heat transfer in MHD mixed convection viscoelastic fluid flow over a stretching sheet embedded in a porous medium with viscous dissipation and non-uniform heat source/sink. Procedia Eng. 157, 309–316 (2016)
12. Metri, P.G., Bablad, V.M., Metri, P.G., Abel, M.S., Silvestrov, S.: Mixed convection heat transfer in MHD non-Darcian flow due to an exponential stretching sheet embedded in a porous medium in presence of non-uniform heat source/sink. In: Silvestrov, S., Rancic, M. (eds.) Engineering Mathematics I. Springer Proceedings in Mathematics and Statistics, vol. 178, pp. 187–201. Springer, Cham (2016)
13. Metri, P.G., Abel, M.S.: Hydromagnetic flow of a thin nanoliquid film over an unsteady stretching sheet. Int. J. Adv. Appl. Math. Mech. 3(4), 121–134 (2016)
14. Metri, P.G., Abel, M.S., Tawade, J., Metri, P.G.: Fluid flow and radiative nonlinear heat transfer in a liquid film over an unsteady stretching sheet. In: Proceedings of the 2016 7th International Conference on Mechanical and Aerospace Engineering (ICMAE), London, pp. 83–87. IEEE (2016)
15. Metri, P.G., Abel, M.S., Silvestrov, S.: Heat and mass transfer in MHD boundary layer flow over a nonlinear stretching sheet in a nanofluid with convective boundary condition and viscous dissipation. In: Silvestrov, S., Rancic, M. (eds.) Engineering Mathematics I. Springer Proceedings in Mathematics and Statistics, vol. 178, pp. 203–219. Springer, Cham (2016)
16. Metri, P.G., Narayana, M., Silvestrov, S.: Hypergeometric steady solution of hydromagnetic nano liquid film flow over an unsteady stretching sheet. AIP Conf. Proc. 1798, 020097 (2017)
17. Metri, P.G., Guariglia, E., Silvestrov, S.: Lie group analysis for MHD boundary layer flow and heat transfer over stretching sheet in presence of viscous dissipation and uniform heat source/sink. AIP Conf. Proc. 1798, 020096 (2017)
18. Mittal, A.S., Patel, H.R.: Influence of thermophoresis and Brownian motion on mixed convection two dimensional MHD Casson fluid flow with non-linear radiation and heat generation. Physica A 537, 122710 (2020)
19. Mukhopadhyay, S.: Casson fluid flow and heat transfer over nonlinearly stretching surface. Chin. Phys. B 22(7), 074701 (2013)
20. Mustafa, M., Hayat, T., Pop, I., Aziz, A.: Unsteady boundary layer flow of a Casson fluid due to an impulsively started moving flat plate. Heat Trans.-Asian Res. 40(6), 563–576 (2011)
21. Nadeem, S., Mehmood, R., Akbar, N.: Optimized analytical solution for oblique flow of a Casson-nano fluid with convective boundary conditions. Int. J. Thermal Sci. 78, 90–100 (2014)
22. Nagaraja, B., Gireesha, B.J.: Exponential space-dependent heat generation impact on MHD convective flow of Casson fluid over a curved stretching surface with chemical reaction. J. Thermal Anal. Calorim. 143(3), 4071–4079 (2021)
23. Narayana, M., Metri, P.G., Silvestrov, S.: Thermocapillary flow of a non-Newtonian nanoliquid film over an unsteady stretching sheet. AIP Conf. Proc. 1798, 020109 (2017)
24. Oyelakin, I.S., Lalramneihmawii, P.C., Mondal, S., Sibanda, P.: Analysis of double diffusion convection on three-dimensional MHD stagnation point flow of a tangent hyperbolic Casson nanofluid. Int. J. Ambient Energy 43(1), 1854–1865 (2022)
25. Pratap Kumar, J., Umavathi, J.C., Metri, P.G., Silvestrov, S.: Effect of first order chemical reaction on magneto convection in a vertical double passage channel. In: Silvestrov, S., Rancic, M. (eds.) Engineering Mathematics I. Springer Proceedings in Mathematics and Statistics, vol. 178, pp. 247–279. Springer, Cham (2016)
26. Rana, P., Bhargava, R.: Flow and heat transfer of a nanofluid over a nonlinearly stretching sheet: a numerical study. Commun. Nonlinear Sci. Numer. Simul. 17, 212–226 (2012)
27. Samantha Kumari, S., Sankara Sekhar Raju, G.: Casson nanoliquid flow due to a nonlinear stretched sheet with convective conditions. In: Rushi Kumar, B., Sivaraj, R., Prakash, J. (eds.) Advances in Fluid Dynamics. Lecture Notes in Mechanical Engineering. Springer, Singapore (2021)
28. Sandeep, N., Sulochana, C., Rushi Kumar, C.: Unsteady MHD radiative flow and heat transfer of dusty nanofluid over an exponentially stretching surface. Eng. Sci. Tech. Int. J. 19, 227–240 (2016)
29. Sulochana, C., Ashwinkumar, G.P., Sandeep, N.: Similarity solution of 3D Casson nanoliquid flow over a stretching sheet with convective boundary conditions. J. Nigerian Math. Soc. 35, 128–141 (2016)
30. Tarakaramu, N., Satya Narayana, P.V.: Influence of heat generation/absorption on 3D magnetohydrodynamic Casson fluid flow over a porous stretching surface. In: Rushi Kumar, B., Sivaraj, R., Prakash, J. (eds.) Advances in Fluid Dynamics. Lecture Notes in Mechanical Engineering. Springer, Singapore (2021)
31. Tawade, J., Metri, P.G., Abel, M.S.: Thin film flow and heat transfer over an unsteady stretching sheet with thermal radiation, internal heating in presence of external magnetic field. Int. J. Adv. Appl. Math. Mech. 3(4), 29–40 (2016). arXiv:1603.03664
32. Ullah, I., Khan, I., Shafie, S.: MHD natural convection flow of Casson nanofluid over nonlinearly stretching sheet through porous medium with chemical reaction and thermal radiation. Nanoscale Res. Lett. 11, 527 (2016)
33. Umavathi, J.C., Vajravelu, K., Metri, P.G., Silvestrov, S.: Effect of time-periodic boundary temperature modulations on the onset of convection in a Maxwell fluid nanofluid saturated porous layer. In: Silvestrov, S., Rancic, M. (eds.) Engineering Mathematics I. Springer Proceedings in Mathematics and Statistics, vol. 178, pp. 221–245. Springer, Cham (2016)
34. Venkata Ramudu, A.C., Anantha Kumar, K., Sugunamma, V., Sandeep, N.: Influence of suction/injection on MHD Casson fluid flow over a vertical stretching surface. J. Thermal Anal. Calorim. 139, 3675–3682 (2020)
35. Venkateswara Raju, K., Durga Prasad, P., Raju, M.C., Sivaraj, R.: MHD Casson fluid flow past a stretching sheet with convective boundary and heat source. In: Rushi Kumar, B., Sivaraj, R., Prakash, J. (eds.) Advances in Fluid Dynamics. Lecture Notes in Mechanical Engineering. Springer, Singapore (2021)

Chapter 36

Mathematical and Computational Analysis of MHD Viscoelastic Fluid Flow and Heat Transfer Over Stretching Surface Embedded in a Saturated Porous Medium

Jagadish Tawade and Prashant G. Metri

Abstract This chapter studies MHD flow and heat transfer over a stretching sheet embedded in a saturated porous medium, with a space- and temperature-dependent internal heat source/sink. Two different heating processes are considered, namely prescribed surface temperature (PST) and prescribed heat flux (PHF). The nonlinear partial differential equations of the momentum boundary layer are converted into nonlinear ordinary differential equations by means of a suitable similarity transformation; the heat transfer equations, likewise partial differential equations, are reduced to ordinary differential equations by the same approach. The resulting flow and heat transfer problems are solved analytically. The effects of the viscoelastic parameter, porous parameter, magnetic field, suction, and space- and temperature-dependent heat source/sink on both the flow and heat transfer characteristics are presented graphically.

Keywords Boundary layer flow · Heat transfer · Kummer's function · MHD · Stretching sheet · Viscoelastic fluid

MSC 2020 76D10 · 76D09 · 76B99 · 76M45

J. Tawade (B) Faculty of Science and Technology, Vishwakarma University, Pune 411048, Maharashtra, India, e-mail: [email protected]

P. G. Metri Department of Mechanical Engineering and Mathematics, Walchand Institute of Technology, Solapur, Maharashtra, India

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_36

36.1 Introduction

The study of a continuously moving surface in a laminar boundary layer of non-Newtonian fluids is an important type of flow that occurs in various technological



processes. Examples of practical applications include the aerodynamic extrusion of plastic sheets, the cooling of endless metal sheets along cooling paths, electrolyte crystal growth, the condensation of liquid films, and the continuous deformation of polymer sheets or filaments in molds. Glass blowing, continuous casting, and fiber spinning likewise involve flow over a stretching surface. There are numerous practical applications in the metallurgical and chemical industries, such as extruded and heat-treated materials that move between a feed roller and a wind-up roller or travel on a conveyor belt, and in the polymer industry, where plastic films and man-made fibers are drawn: the velocity rises from almost zero at the die opening to a maximum value at which it remains stable. The moving fiber boundary creates a flow in the surrounding medium; eventually the fiber cools, and this affects the final properties of the yarn. Flow through a fluid-saturated porous medium is important in many technological applications, and its importance increases with the growing interest in problems of geothermal energy and astrophysics. Many other applications also benefit from an understanding of the fundamentals of mass, energy and momentum transfer in porous media, namely the cooling of nuclear reactors and the underground disposal of nuclear waste, oil reservoir operations, building insulation, food processing, casting and welding, and other manufacturing processes. Heat transfer in a porous medium plays a key role in these applications; the books by Ingham and Pop [7] and Nield and Bejan [20] show that flow in porous media has become a classical topic, in which early developments have been confirmed by a large number of subsequent studies. Most studies of heat transfer in porous media have considered liquids with constant physical properties.
However, it is well known that the viscosity of a liquid changes significantly with temperature, and this affects the velocity and temperature fields of the flow. Therefore, in practical heat transfer problems where there is a large temperature difference between the surface and the liquid, the constant-property assumption can lead to significant errors. For example, the viscosity of water decreases by roughly a factor of 2.4 as the temperature rises from 10 to 50 ◦C. Accounting for the effect of variable properties on heat transfer is a very difficult task for several reasons. Firstly, the temperature dependence of the properties differs from one liquid to another, and sometimes it cannot be expressed analytically. For practical applications, however, a reliable correlation can be built on results obtained under the constant-property assumption, to be invoked when the influence of a variable property becomes important. Mahapatra et al. [8] studied the heat transfer and MHD stagnation point flow of a viscoelastic fluid over a stretching sheet; when the sheet is stretched in its own plane, the speed is proportional to the distance from the stagnation point. Babaelahi et al. [3] analyzed viscoelastic magnetohydrodynamic flow and heat transfer over a stretching sheet, taking into account Ohmic and viscous dissipation. Hsiao [6] studied the heat and mass transfer of mixed convection magnetohydrodynamic viscoelastic fluid flow over a stretching sheet in the presence of Ohmic dissipation. Aiboud et al. [1] applied the second law of thermodynamics to MHD viscoelastic flow over a stretching sheet. The


entropy analysis showed a strong influence of the viscoelastic parameter and the internal heat source/sink parameter. Magnetic fields have also been used to clean molten metals of non-metallic inclusions. Numerous studies have been reported on the flow and heat transfer of electrically conducting fluids over a stretched surface in the presence of a magnetic field. Turkyilmazoglu [27, 28] studied slip flow of a viscoelastic electrically conducting fluid over a stretching surface; the key contribution is an analytical study of the solution structure and the determination of thresholds above which multiple solutions exist. Closed-form formulas for the boundary layer flow equations are given for two fluid classes, second-grade fluids and Walters' liquid B, and the heat transfer analysis is performed for two cases, a prescribed power-law surface temperature and a prescribed power-law heat flux. Rushi Kumar et al. [23] studied viscoelastic electrically conducting fluid flow, heat and mass transfer over a vertical cone and a flat plate, taking into account variable viscosity, viscous dissipation and chemical reaction. Turkyilmazoglu [29] investigated three-dimensional viscoelastic magnetohydrodynamic flow and heat transfer over porous stretching and shrinking surfaces. Eswaramoorthi et al. [5] studied three-dimensional viscoelastic electrically conducting convective flow and heat transfer over a stretching surface in the presence of thermal radiation and internal heating. Nayak et al. [18] studied viscoelastic electrically conducting fluid flow, heat and mass transfer over a porous surface, taking into account viscous dissipation, thermal radiation and chemical reaction. Metri et al. [9] studied the mathematical and computational analysis of viscoelastic electrically conducting mixed convection flow over a porous stretching surface, considering the effects of viscous dissipation and a non-uniform heat source/sink. Nayak [19] studied heat and mass transfer of viscoelastic MHD flow over a stretching sheet embedded in a porous medium, considering the effects of thermal radiation and chemical reaction. Metri et al. [10, 11] studied heat and mass transfer over a nonlinear stretching surface in the presence of viscous dissipation. Tawade et al. [26] studied the effects of thermal radiation on electrically conducting fluid flow over an unsteady stretching surface with internal heating. Umavathi et al. [30] studied the linear stability of a Maxwell nanofluid in a saturated porous medium when the walls of the porous layer are subjected to periodic temperature modulation; a modified Darcy-Maxwell model is used to describe the fluid motion. Metri et al. [12, 13] studied MHD nanoliquid flow and heat transfer over an unsteady stretching surface. Pratap Kumar et al. [21] studied the effect of chemical reaction on magnetohydrodynamic flow in a vertical double passage channel. Baag et al. [2] carried out a second-law analysis for viscoelastic electrically conducting fluid flow, heat and mass transfer over a porous stretching surface. Metri et al. [14] studied the Lie symmetry analysis of magnetohydrodynamic flow over a stretching surface in the presence of viscous dissipation and internal heating. Metri et al. [15] studied an analytical solution for magnetohydrodynamic nanoliquid flow over an unsteady stretching surface. Narayana et al. [17] investigated the thermocapillary effect on laminar flow of a thin film of a power-law nanoliquid over an unsteady stretching surface. Mishra et al. [16] studied analytical and numerical solutions for viscoelastic electrically conducting fluid


flow over a porous surface with a non-uniform heat source/sink. Seth et al. [24] studied the radiation effect on magnetohydrodynamic viscoelastic free convective flow and heat transfer over a stretching surface in the presence of partial slip; the Galerkin finite element method is used to solve the nonlinear fluid flow and heat transfer equations. Sureshkumar Raju [25] studied viscoelastic Darcy-Forchheimer flow and heat transfer over a moving needle with viscous dissipation. Ramesh et al. [22] studied the effects of thermal radiation and chemical reaction on viscoelastic nanoparticle flow, heat and mass transfer over a stretching surface with a convective boundary condition. It is worth mentioning that heat transfer in porous media induced by internal heat generation arises in various physical problems, such as heat removal from nuclear fuel debris in nuclear reactors, the underground disposal of radioactive waste materials, fire and combustion modeling, the development of metal waste from spent nuclear fuel, and exothermic chemical reactions in packed-bed reactors. Exact modeling of the internal heat source/sink is impossible, and hence simple mathematical models that capture its average behavior are used in most physical situations. In the present work, the analysis available for Newtonian liquids is extended to viscoelastic liquid flows. The main aim of the article is to analyze the effect of the space- and temperature-dependent heat generation/absorption parameters, Prandtl number, magnetic parameter, viscoelastic parameter, and porosity parameter on viscoelastic boundary layer flow and heat transfer over a stretching sheet with suction/blowing effects.

36.2 Mathematical Formulation

Two-dimensional flow of an incompressible electrically conducting viscoelastic fluid of the type Walters' liquid B past a porous stretching sheet embedded in a porous medium is considered. The flow is generated by stretching the sheet along the x-axis through the application of two equal and opposite forces. The flow obeys the rheological equation of state derived by Beard and Walters [4], and the flow field is exposed to a uniform transverse magnetic field. Hence, under the usual boundary layer assumptions, the equations of continuity, momentum and energy for the MHD flow of a viscoelastic fluid of Walters' B model are:

$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0, \tag{36.1}$$

$$u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} = \nu\frac{\partial^2 u}{\partial y^2} - \frac{\sigma B_0^2}{\rho}\,u - k_0\left(u\frac{\partial^3 u}{\partial x\,\partial y^2} + v\frac{\partial^3 u}{\partial y^3} + \frac{\partial u}{\partial x}\frac{\partial^2 u}{\partial y^2} - \frac{\partial u}{\partial y}\frac{\partial^2 u}{\partial x\,\partial y}\right) - \frac{\nu}{k'}\,u, \tag{36.2}$$

$$\rho c_p\left(u\frac{\partial T}{\partial x} + v\frac{\partial T}{\partial y}\right) = k\frac{\partial^2 T}{\partial y^2} + q'''. \tag{36.3}$$

Here, $B_0$ is the applied magnetic field, $\sigma$ is the electrical conductivity of the fluid, $k_0$ is the first moment of the distribution function of relaxation times, $\nu$ is the kinematic viscosity and $k'$ is the permeability of the porous medium. The magnetic field $B_0$ is


applied in the transverse direction of the sheet, and the induced magnetic field is assumed to be negligible. Here $u$, $v$ and $T$ are the x-component of velocity, the y-component of velocity and the temperature of the fluid, respectively; $\rho$, $\nu$, $k$ and $c_p$ are the fluid density, kinematic viscosity, thermal conductivity and specific heat at constant pressure, respectively, and $q'''$ is the rate of internal heat source ($> 0$) or sink ($< 0$). The internal heat source or sink term is modeled according to the following equation:

$$q''' = \frac{k\,u_w(x)}{x\,\nu}\left[A^*(T_w - T_\infty)\,e^{-\alpha\eta} + B^*(T - T_\infty)\right]. \tag{36.4}$$

In (36.4), the first term represents the dependence of the internal heat source or sink on the space coordinates while the latter represents its dependence on the temperature. Note that when both A∗ > 0 and B ∗ > 0, this case corresponds to internal heat source while for both A∗ < 0 and B ∗ < 0, this case corresponds to internal heat sink.

36.3 Boundary Conditions

The boundary conditions for the flow situation are

$$u_w(x) = cx, \quad v = -v_0 \ \text{ at } y = 0, \qquad u \to 0, \ \ \frac{\partial u}{\partial y} \to 0 \ \text{ as } y \to \infty. \tag{36.5}$$

36.3.1 Prescribed Surface Temperature (PST)

For this case the boundary conditions are

$$T = T_w = T_\infty + A\,x^{l} \ \text{ at } y = 0 \quad\text{and}\quad T \to T_\infty \ \text{ as } y \to \infty, \tag{36.6}$$

where $l$ is the wall temperature parameter, $T_w$ is the temperature at the wall, and $A$ is a constant. When $l = 0$, the thermal boundary conditions become isothermal. We define the non-dimensional temperature profile as

$$\theta(\eta) = \frac{T - T_\infty}{T_w - T_\infty}. \tag{36.7}$$

36.3.2 Prescribed Wall Heat Flux (PHF)

The boundary conditions are


$$-k\left(\frac{\partial T}{\partial y}\right)_w = q_w = D\,x^{m} \ \text{ at } y = 0, \quad\text{and}\quad T \to T_\infty \ \text{ as } y \to \infty. \tag{36.8}$$

Here, $m$ is the wall heat flux parameter; for $m = 0$ the stretching sheet is subjected to a uniform heat flux. Defining

$$T - T_\infty = E\,x^{m}(\nu c)^{1/2}\, g(\eta), \tag{36.9}$$

where $E$ is another constant, the change of dependent variable (36.9) is used below to reduce the energy equation.

36.4 Dimensionless Quantities

Equations (36.1) and (36.2) admit a self-similar solution of the form

$$u = c\,x\,f'(\eta), \qquad v = -(c\nu)^{1/2} f(\eta), \qquad \text{where } \eta = \left(\frac{c}{\nu}\right)^{1/2} y. \tag{36.10}$$

Substituting (36.10) into Eqs. (36.1)–(36.3), together with (36.5)–(36.7), we obtain the following equations.

36.5 Reduced Non-linear Ordinary Differential Equations

$$f'^2 - f f'' = f''' - Mn\,f' - k_1\left(2 f' f''' - f''^2 - f f^{iv}\right) - k_2\, f', \tag{36.11}$$

$$\theta'' + Pr\,f\,\theta' + \left(B^* - Pr\,l\,f'\right)\theta = -A^* f', \tag{36.12}$$

$$g'' + Pr\,f\,g' + \left(B^* - Pr\,m\,f'\right)g = -A^* f', \tag{36.13}$$

where $k_1 = k_0 c/\nu$ is the viscoelastic parameter, $Mn = \sigma B_0^2/(\rho c)$ the magnetic parameter, $k_2 = \nu/(k' c)$ the porosity parameter, and $Pr = \mu c_p/k$ the Prandtl number; $A^*$ and $B^*$ are the space- and temperature-dependent internal heat generation/absorption parameters. The corresponding boundary conditions are given below.
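For readers who want to experiment with the parameter ranges used later in the figures, the dimensionless groups defined above are straightforward to compute. The property values below are arbitrary illustrative numbers, not data from the chapter:

```python
# Illustrative computation of the dimensionless groups; all physical
# property values are arbitrary sample numbers, not from the paper.
sigma, B0, rho, c = 1.0, 0.5, 1000.0, 2.0   # conductivity, field, density, stretching rate
k0, nu, k_perm = 1e-4, 1e-6, 1e-3           # relaxation moment, kinematic viscosity, permeability
mu, cp, k_th = 1e-3, 4180.0, 0.6            # dynamic viscosity, specific heat, conductivity

k1 = k0 * c / nu                  # viscoelastic parameter k1 = k0 c / nu
Mn = sigma * B0**2 / (rho * c)    # magnetic parameter Mn = sigma B0^2 / (rho c)
k2 = nu / (k_perm * c)            # porosity parameter k2 = nu / (k' c)
Pr = mu * cp / k_th               # Prandtl number Pr = mu cp / k

print(k1, Mn, k2, Pr)
```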


36.6 Reduced Boundary Conditions

$$f'(0) = 1, \quad f(0) = R \ \text{ at } \eta = 0; \qquad f'(\infty) = 0, \quad f''(\infty) = 0 \ \text{ as } \eta \to \infty, \tag{36.14}$$

where $R = v_0/\sqrt{\nu c}$ is the suction parameter. The flow behavior permits us to assume a solution of (36.11) which satisfies the continuity equation (36.1) and the boundary conditions (36.14), namely

$$f(\eta) = A + B\exp(-\alpha\eta), \qquad \alpha > 0, \tag{36.15}$$

with $A = R + 1/\alpha$ and $B = -1/\alpha$. Here $\alpha$ is the positive real root of the cubic equation

$$\alpha^3 + \frac{(k_1 - 1)}{R k_1}\,\alpha^2 + \frac{1}{k_1}\,\alpha + \frac{(1 + Mn + k_2)}{R k_1} = 0. \tag{36.16}$$

Hence, the resulting velocity components are

$$u = c\,x\exp(-\alpha\eta), \qquad v = -(c\nu)^{1/2}\left\{A + B\exp(-\alpha\eta)\right\}. \tag{36.17}$$
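A small script can recover $\alpha$ from the cubic (36.16) and evaluate the closed-form flow solution (36.15)–(36.17). This is an illustrative sketch: the parameter values $k_1$, $Mn$, $k_2$, $R$, $c$ are arbitrary sample choices, and the smallest positive real root is selected (the cubic may admit more than one positive root for some parameter combinations):

```python
import math

k1, Mn, k2, R, c = 0.1, 1.0, 0.5, 0.5, 1.0   # sample parameters, not from the paper

def cubic(a):
    # alpha^3 + (k1-1)/(R k1) alpha^2 + alpha/k1 + (1+Mn+k2)/(R k1), Eq. (36.16)
    return a**3 + (k1 - 1.0) / (R * k1) * a**2 + a / k1 + (1.0 + Mn + k2) / (R * k1)

# Scan for the first sign change on (0, 50], then refine by bisection.
lo, hi = 1e-6, None
a = 1e-6
while a < 50.0:
    if cubic(a) * cubic(a + 0.1) < 0.0:
        lo, hi = a, a + 0.1
        break
    a += 0.1
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if cubic(lo) * cubic(mid) <= 0.0:
        hi = mid
    else:
        lo = mid
alpha = 0.5 * (lo + hi)

A = R + 1.0 / alpha
B = -1.0 / alpha

def f(eta):              # similarity stream function, Eq. (36.15)
    return A + B * math.exp(-alpha * eta)

def u(x, eta):           # streamwise velocity, Eq. (36.17)
    return c * x * math.exp(-alpha * eta)

print(alpha, f(0.0))     # f(0) must equal the suction parameter R
```

By construction $f(0) = A + B = R$, so the suction boundary condition in (36.14) is satisfied identically whatever root is chosen.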

36.6.1 Prescribed Surface Temperature (PST)

In similarity variables the boundary conditions (36.6) take the form

$$\theta(0) = 1 \ \text{ at } \eta = 0, \qquad \theta(\infty) = 0 \ \text{ as } \eta \to \infty. \tag{36.18}$$

36.6.1.1 Introducing a New Independent Variable

$$\xi = \frac{Pr\,B\,e^{-\alpha\eta}}{\alpha}. \tag{36.19}$$

Substituting (36.19) into (36.12) and using the form of $f$, we obtain

$$\xi\theta'' + \left(1 - \frac{Pr\,A}{\alpha} - \xi\right)\theta' - lB^*\theta = \frac{A^*}{Pr}, \tag{36.20}$$

where the primes now denote differentiation with respect to $\xi$. The corresponding boundary conditions are


$$\theta\!\left(\frac{Pr\,B}{\alpha}\right) = 1, \qquad \theta(0) = 0. \tag{36.21}$$

The solution of (36.20) subject to the boundary conditions (36.21) is

$$\theta(\eta) = c_1\left(e^{-\alpha\eta}\right)^{p_1} M\!\left(p_1 + lB^*,\ p_1 + 1,\ \frac{Pr\,B\,e^{-\alpha\eta}}{\alpha}\right) + c_2\left(\frac{Pr\,B\,e^{-\alpha\eta}}{\alpha}\right)^2. \tag{36.22}$$

Here $M$ denotes Kummer's function, with

$$c_1 = \frac{1 - c_2\left(\frac{Pr\,B}{\alpha}\right)^2}{\left(\frac{Pr\,B}{\alpha}\right)^{p_1} M\!\left[p_1 + lB^*,\ p_1 + 1,\ \frac{Pr\,B}{\alpha}\right]}, \tag{36.23}$$

$$p_1 = \frac{Pr\,A}{\alpha}, \qquad c_2 = \frac{A^*}{Pr\left[4 - 2p_1 - lB^*\right]}, \tag{36.24}$$

$$A = R + \frac{1}{\alpha}, \qquad B = -\frac{1}{\alpha}. \tag{36.25}$$

The non-dimensional wall temperature gradient derived from (36.22)–(36.25) is

$$\theta'(0) = c_1\left[-\alpha p_1\, M\!\left(p_1 + lB^*,\ p_1 + 1;\ \frac{Pr\,B}{\alpha}\right) - Pr\,B\,\frac{p_1 + lB^*}{p_1 + 1}\, M\!\left(p_1 + lB^* + 1,\ p_1 + 2;\ \frac{Pr\,B}{\alpha}\right)\right] - \frac{2 c_2\, Pr^2 B^2}{\alpha}. \tag{36.26}$$
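The temperature solutions above are expressed through Kummer's confluent hypergeometric function $M(a, b, z)$. A minimal series implementation is sketched below (in SciPy the same function is available as `scipy.special.hyp1f1`); the sanity checks use the standard identities $M(a, b, 0) = 1$ and $M(1, 2, z) = (e^z - 1)/z$:

```python
def kummer_M(a, b, z, terms=60):
    """M(a, b, z) = sum_n (a)_n z^n / ((b)_n n!) via its power series,
    where (x)_n is the Pochhammer (rising factorial) symbol."""
    total, term = 1.0, 1.0
    for n in range(terms):
        # term recurrence: t_{n+1} = t_n * (a+n) z / ((b+n)(n+1))
        term *= (a + n) * z / ((b + n) * (n + 1))
        total += term
    return total

# M(1, 2, 1) should equal e - 1 = 1.71828...
print(kummer_M(1.0, 2.0, 1.0))
```

The truncated series converges rapidly for the moderate arguments $\xi = Pr\,B/\alpha$ that arise here; for large $|z|$ an asymptotic expansion or a library routine is preferable.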

36.6.2 Prescribed Heat Flux (PHF)

$$g'(0) = -1 \ \text{ at } \eta = 0 \quad\text{and}\quad g(\infty) = 0 \ \text{ as } \eta \to \infty. \tag{36.27}$$

Here the prime denotes differentiation with respect to $\eta$. Using the transformation (36.19), Eqs. (36.13) and (36.27) take the respective forms

$$\xi g'' + \left(1 - \frac{Pr\,A}{\alpha} - \xi\right)g' - mB^* g = \frac{A^*}{Pr}, \tag{36.28}$$

$$g'\!\left(\frac{Pr\,B}{\alpha}\right) = \frac{1}{Pr\,B} \quad\text{and}\quad g(0) = 0, \tag{36.29}$$


where the prime now denotes differentiation with respect to $\xi$. The solution of (36.28) subject to the boundary conditions (36.29) is

$$g(\eta) = c_3\left(e^{-\alpha\eta}\right)^{p_1} M\!\left(p_1 + mB^*,\ p_1 + 1;\ \frac{Pr\,B\,e^{-\alpha\eta}}{\alpha}\right) + c_2\left(\frac{Pr\,B\,e^{-\alpha\eta}}{\alpha}\right)^2, \tag{36.30}$$

where

$$c_2 = \frac{A^*}{Pr\left[4 - 2p_1 - mB^*\right]}, \qquad p_1 = \frac{Pr\,A}{\alpha},$$

$$c_3 = \frac{1 - 2 c_2\, Pr\,B}{\alpha p_1\, M\!\left[p_1 + mB^*,\ p_1 + 1;\ \frac{Pr\,B}{\alpha}\right] - \frac{Pr\,B}{\alpha}\,\frac{p_1 + mB^*}{p_1 + 1}\, M\!\left[p_1 + mB^* + 1,\ p_1 + 2;\ \frac{Pr\,B}{\alpha}\right]}.$$

36.7 Results and Discussion

A boundary layer problem for momentum and heat transfer in MHD viscoelastic fluid flow over a stretching surface in a porous medium with a space- and temperature-dependent internal heat source/sink is examined in this paper. Linear stretching of the porous boundary, the temperature-dependent and space-dependent heat source/sink, the porosity and the magnetic parameter are taken into consideration. The basic boundary layer partial differential equations, which are highly non-linear, have been converted into a set of non-linear ordinary differential equations by applying suitable similarity transformations, and their analytical solutions are obtained in terms of the confluent hypergeometric function (Kummer's function). Different analytical expressions are obtained for the non-dimensional temperature profile for two general cases of boundary conditions, namely (i) prescribed power-law surface temperature (PST) and (ii) prescribed power-law heat flux (PHF). In order to gain some insight into the flow and heat transfer characteristics, results are plotted graphically for typical choices of the physical parameters.

Figure 36.1a, b shows the horizontal velocity profiles $f'(\eta)$ for different values of the viscoelastic parameter $k_1$ and the porous parameter $k_2$. Figure 36.1a shows that increasing the viscoelastic parameter decreases the horizontal velocity profile. This is because the tensile stress introduced by viscoelasticity causes transverse contraction of the boundary layer, and hence the velocity decreases. The effect of the porosity parameter on the horizontal velocity profile in the boundary layer is shown in Fig. 36.1b: increasing the permeability parameter $k_2$ decreases the horizontal velocity profiles, which enhances the deceleration of the flow.
Figure 36.1c illustrates the effect of the magnetic parameter: the introduction of a transverse magnetic field normal to the flow direction tends to create a drag, which resists the flow, and hence the horizontal


Fig. 36.1 Velocity profiles for various values of parameters of interest

velocity boundary layer decreases. This remains true in the presence of the porous parameter $k_2$. The presence of a magnetic field in an electrically conducting fluid produces a body force against the flow. This resistive force slows down the motion of the fluid in the boundary layer, which in turn reduces the rate of heat transfer in the flow and raises the fluid temperature. Figure 36.1d, e depicts the influence of the suction/blowing parameter $R$ on the velocity profiles in the boundary layer. It is known that imposition of wall suction ($R > 0$)


Fig. 36.2 Temperature profiles for various values of k1

has the tendency to reduce the momentum boundary layer thickness, which causes a reduction in the velocity profiles. The opposite behaviour is observed under wall fluid blowing, or injection ($R < 0$). In Fig. 36.2a, b the temperature distribution $\theta(\eta)$ is plotted for the PST and PHF cases, respectively, for different values of the viscoelastic parameter $k_1$. Figure 36.2a, b reveals that an increase of the viscoelastic parameter $k_1$ increases the temperature profile $\theta(\eta)$ in the boundary layer. This is consistent with the fact that thickening of the thermal boundary layer occurs due to the increase of the non-Newtonian viscoelastic normal stress. The effect of the porosity parameter $k_2$ on the temperature profiles for the PST and PHF cases is shown in Fig. 36.3a and b, respectively: the porosity parameter $k_2$ decreases the temperature profile in the boundary layer. The effect of the magnetic parameter on the temperature profile for the PST and PHF cases, in the presence of the porosity and heat source/sink parameters, is shown in Fig. 36.4a and b, respectively. It is observed that the magnetic parameter increases the temperature profile in the boundary layer. The Lorentz force has the tendency

Fig. 36.3 Temperature profiles for various values of k2


Fig. 36.4 Temperature profiles for various values of Mn

to increase the temperature profile, and its effect on the flow and thermal fields becomes more pronounced as the strength of the magnetic field increases. The effect of the magnetic parameter is to increase the wall temperature gradient in both the PST and PHF cases. Figure 36.5a, b depicts the influence of the suction/blowing parameter $R$ on the temperature profile in the boundary layer. It is observed that imposition of wall suction ($R > 0$) tends to reduce the thermal boundary layer thickness, causing a reduction in the temperature profile. The opposite behaviour is observed under wall fluid blowing, or injection ($R < 0$), as shown in Fig. 36.6a, b. The influence of a space-dependent internal heat source ($A^* > 0$) or sink ($A^* < 0$) on the temperature field in the boundary layer is presented in Fig. 36.7a and b for the PST and PHF cases, respectively. It is clear from these graphs that increasing the value of $A^*$ increases the temperature distribution of the fluid. This is expected, since the presence of a heat source ($A^* > 0$) in the boundary layer generates energy, which causes the temperature of the fluid to increase. This increase in temperature produces an increase in the flow field due to the buoyancy effect.

Fig. 36.5 Temperature profiles for various values of R (R > 0)


Fig. 36.6 Temperature profiles for various values of R (R < 0)

Fig. 36.7 Temperature profiles for various values of A∗

However, when the heat source effect becomes large ($A^* = 1$, $A^* = 2$), a distinctive peak in the temperature profile occurs in the fluid adjacent to the wall. This means that the temperature of the fluid near the sheet is higher than the sheet temperature, and consequently heat is expected to transfer to the wall. On the contrary, a heat sink ($A^* < 0$) has the opposite effect, namely cooling of the fluid. The effect of the internal heat source is especially pronounced for high values of $A^*$: the fluid temperature is greater when an internal heat source exists. This is logical, because heat transfer increases close to the plate, which induces more flow along the plate. The influence of the temperature-dependent internal heat source ($B^* > 0$) or sink ($B^* < 0$) on the temperature field in the boundary layer is the same as that of the space-dependent internal heat source or sink: for $B^* > 0$ (heat source) the fluid temperature increases, while it decreases for $B^* < 0$ (heat sink). These behaviours are depicted in Fig. 36.8a and b for the PST and PHF cases, respectively. In Fig. 36.9a and b several temperature profiles are drawn for the PST and PHF cases, respectively. The effect of the Prandtl number on heat transfer may be analyzed


Fig. 36.8 Temperature profiles for various values of B ∗

Fig. 36.9 Temperature profiles for various values of Pr

from these figures. Both graphs indicate that an increase of the Prandtl number decreases the temperature distribution at a given point. This is due to the decrease of the thermal boundary layer thickness with increasing Prandtl number. In both situations the temperature distribution asymptotically approaches zero in the free stream region.

36.8 Conclusion

The influence of heat transfer in MHD boundary layer viscoelastic fluid flow over a stretching surface in a porous medium with a space- and temperature-dependent internal heat source/sink, where the flow is subject to suction/blowing through the porous boundary, is studied in this work. Analytical solutions of the governing boundary layer problem have been obtained in terms of the confluent hypergeometric function (Kummer's function) and its special forms. Different analytical expressions are obtained for the non-dimensional temperature profile for two general cases of boundary conditions, namely (i) prescribed surface temperature (PST) and (ii) prescribed wall heat flux (PHF). Explicit analytical expressions are also obtained for the dimensionless temperature gradient $\theta'(0)$ and the heat flux $q_w$, for general as well as special cases of different physical situations. The main conclusions derived from this study are as follows.

1. Explicit expressions are obtained for various heat transfer characteristics in the form of the confluent hypergeometric function (Kummer's function); several expressions are also obtained in terms of other elementary functions as special cases of Kummer's function.
2. The combined effect of increasing the viscoelastic parameter $k_1$ and the porosity parameter $k_2$ is to decrease the velocity of the fluid significantly in the boundary layer region: the tensile stress introduced by viscoelasticity causes transverse contraction of the boundary layer, and increasing the porosity parameter enhances the deceleration of the flow, so the velocity decreases.
3. The effect of increasing the magnetic parameter $Mn$ is to decrease the velocity of the fluid: the transverse magnetic field normal to the flow direction creates a drag that resists the flow, and hence the horizontal velocity boundary layer decreases.
4. The effect of increasing the suction parameter ($R > 0$) is to decrease the velocity, whereas blowing ($R < 0$) has the opposite effect.
5. The combined effect of increasing the magnetic parameter $Mn$ and the viscoelastic parameter $k_1$ is to increase the temperature distribution in the flow region: the Lorentz force tends to increase the temperature profile, and its effect on the flow and thermal fields becomes more pronounced as the strength of the magnetic field increases.
Increasing the viscoelastic parameter likewise increases the temperature, consistent with the thickening of the thermal boundary layer caused by the increased non-Newtonian viscoelastic normal stress.
6. The effect of increasing the porosity parameter $k_2$ is to decrease the temperature distribution in the flow region.
7. The effect of increasing the suction parameter $R$ is to decrease the temperature distribution, while that of blowing is to increase it: wall suction ($R > 0$) tends to reduce the momentum boundary layer thickness, which reduces the velocity profiles, and the opposite behaviour is observed under wall fluid blowing, or injection ($R < 0$).
8. The combined effect of increasing the space-dependent and temperature-dependent heat source/sink parameters $A^*$ and $B^*$, respectively, is to increase the temperature distribution in the boundary layer flow region; the fluid temperature is greater when an internal heat source exists, because heat transfer increases close to the plate and induces more flow along the plate.


9. The effect of increasing the Prandtl number $Pr$ is to reduce the temperature strongly in the boundary layer flow region, owing to the decrease of the thermal boundary layer thickness with increasing Prandtl number. In both cases the temperature distribution asymptotically approaches zero in the free stream region.

Acknowledgements Prashant Metri is grateful to the FUSION network and its Swedish node, the MAM research milieu in Mathematics and Applied Mathematics, Division of Mathematics and Physics, School of Education, Culture and Communication at Mälardalen University, for support and an excellent research and research education environment during his visits.

References

1. Aiboud, S., Saouli, S.: Entropy analysis for viscoelastic magnetohydrodynamic flow over a stretching surface. Int. J. Non-Lin. Mech. 45, 482–489 (2010)
2. Baag, S., Mishra, S.R., Dash, G.C., Acharya, M.R.: Entropy generation analysis for viscoelastic MHD flow over a stretching sheet embedded in a porous medium. Ain Shams Eng. J. 8, 623–632 (2017)
3. Babaelahi, M., Domairry, G., Joneidi, A.A.: Viscoelastic MHD flow boundary layer over a stretching surface with viscous and ohmic dissipation. Meccanica 45, 817–827 (2010)
4. Beard, D.W., Walters, K.: Elasto-viscous boundary layer flow. Proc. Camb. Phil. Soc. 60, 667–674 (1964)
5. Eswaramoorthi, S., Bhuvaneshwari, M., Sivasankaran, S., Rajan, S.: Effect of radiation on MHD convective flow and heat transfer of a viscoelastic fluid over a stretching surface. Proc. Eng. 127, 916–923 (2015)
6. Hsiao, K.L.: Heat and mass mixed convection for MHD viscoelastic fluid past a stretching sheet with ohmic dissipation. Comm. Non Lin. Sci. Num. Sim. 15, 1803–1812 (2010)
7. Ingham, D.B., Pop, I.: Transport Phenomena in Porous Media. Pergamon, Oxford (1988)
8. Mahapatra, T.R., Dholey, S., Gupta, A.S.: Momentum and heat transfer in the magnetohydrodynamic stagnation point flow of a viscoelastic fluid towards a stretching surface. Meccanica 42, 263–272 (2007)
9. Metri, P.G., Abel, M.S., Silvestrov, S.: Heat transfer in MHD mixed convection viscoelastic fluid flow over a stretching sheet embedded in a porous medium with viscous dissipation and non-uniform heat source/sink. Proc. Eng. 157, 309–316 (2016)
10. Metri, P.G., Bablad, V.M., Metri, P.G., Abel, M.S., Silvestrov, S.: Mixed convection heat transfer in MHD non-Darcian flow due to an exponential stretching sheet embedded in a porous medium in presence of non-uniform heat source/sink. In: Silvestrov, S., Rancic, M. (eds.) Engineering Mathematics I. Springer Proceedings in Mathematics and Statistics, vol. 178, pp. 187–201. Springer, Cham (2016)
11.
Metri, P.G., Abel, M.S., Silvestrov, S.: Heat and mass transfer in MHD boundary layer flow over a nonlinear stretching sheet in a nanofluid with convective boundary condition and viscous dissipation. In: Silvestrov, S., Rancic, M. (eds.) Engineering Mathematics I. Springer Proceedings in Mathematics and Statistics, vol. 178, pp. 203–219. Springer, Cham (2016)
12. Metri, P.G., Abel, M.S.: Hydromagnetic flow of a thin nanoliquid film over an unsteady stretching sheet. Int. J. Adv. Appl. Math. Mech. 3(4), 121–134 (2016)
13. Metri, P.G., Abel, M.S., Tawade, J., Metri, P.G.: Fluid flow and radiative nonlinear heat transfer in a liquid film over an unsteady stretching sheet. In: Proceedings of the 2016 7th International Conference on Mechanical and Aerospace Engineering (ICMAE), London, pp. 83–87. IEEE (2016)


14. Metri, P.G., Guariglia, E., Silvestrov, S.: Lie group analysis for MHD boundary layer flow and heat transfer over stretching sheet in presence of viscous dissipation and uniform heat source/sink. AIP Conf. Proc. 1798, 020096 (2017)
15. Metri, P.G., Narayana, M., Silvestrov, S.: Hypergeometric steady solution of hydromagnetic nano liquid film flow over an unsteady stretching sheet. AIP Conf. Proc. 1798, 020097 (2017)
16. Mishra, S.R., Tripathy, R.S., Dash, G.G.: MHD viscoelastic fluid flow through porous medium over a stretching sheet in presence of non-uniform heat source/sink. Rend. Circ. Mat. Palermo II(67), 129–143 (2018)
17. Narayana, M., Metri, P.G., Silvestrov, S.: Thermocapillary flow of a non-Newtonian nanoliquid film over an unsteady stretching sheet. AIP Conf. Proc. 1798, 020109 (2017)
18. Nayak, M.K., Dash, G.C., Sing, L.P.: Heat and mass transfer effects on MHD viscoelastic fluid over a stretching sheet through porous medium in presence of chemical reaction. Propuls. Power Res. 5(1), 70–80 (2016)
19. Nayak, M.K.: Chemical reaction effect on MHD viscoelastic fluid over a stretching sheet through porous medium. Meccanica 51, 1699–1711 (2016)
20. Nield, D.A., Bejan, A.: Convection in Porous Media, 2nd edn. Springer (1999)
21. Pratap Kumar, J., Umavathi, J.C., Metri, P.G., Silvestrov, S.: Effect of first order chemical reaction on magneto convection in a vertical double passage channel. In: Silvestrov, S., Rancic, M. (eds.) Engineering Mathematics I. Springer Proceedings in Mathematics and Statistics, vol. 178, pp. 247–279. Springer, Cham (2016)
22. Ramesh, G.K.: Analysis of active and passive control of nanoparticles in viscoelastic nanomaterial inspired by activation energy and chemical reaction. Phys. A 550, 123964 (2020)
23. Rushi Kumar, B., Sivaraj, R.: Heat and mass transfer in MHD viscoelastic fluid flow over a vertical cone and flat plate with variable viscosity. Int. J. Heat Mass Transf. 56, 370–379 (2013)
24.
Seth, G.S., Mishra, M.K., Tripathy, R.S.: MHD free convective heat transfer in Walters' liquid-B fluid past a convectively heated stretching sheet with partial slip. J. Braz. Soc. Mech. Sci. Eng. 40(103) (2018)
25. Sureshkumar Raju, S., Ganesh Kumar, K., Rahimi-Gorji, M., Khan, I.: Darcy-Forchheimer flow and heat transfer augmentation of a viscoelastic fluid over an incessant moving needle in the presence of viscous dissipation. Microsyst. Technol. 25, 3399–3405 (2019)
26. Tawade, J., Metri, P.G., Abel, M.S.: Thin film flow and heat transfer over an unsteady stretching sheet with thermal radiation, internal heating in presence of external magnetic field. Int. J. Adv. Appl. Math. Mech. 3(4), 29–40 (2016). arXiv:1603.03664
27. Turkyilmazoglu, M.: Multiple solutions of heat and mass transfer of MHD slip flow for viscoelastic fluid over a stretching sheet. Int. J. Therm. Sci. 50, 2264–2275 (2011)
28. Turkyilmazoglu, M.: Multiple analytical solutions of heat and mass transfer of magnetohydrodynamic slip flow for two types of viscoelastic fluid over a stretching surface. J. Heat Transf. 134(7), 071701, 9 pp. (2012)
29. Turkyilmazoglu, M.: Three dimensional MHD flow and heat transfer over a stretching/shrinking surface in a viscoelastic fluid with various physical effects. Int. J. Heat Mass Transf. 78, 150–155 (2014)
30. Umavathi, J.C., Vajravelu, K., Metri, P.G., Silvestrov, S.: Effect of time-periodic boundary temperature modulations on the onset of convection in a Maxwell fluid nanofluid saturated porous layer. In: Silvestrov, S., Rancic, M. (eds.) Engineering Mathematics I. Springer Proceedings in Mathematics and Statistics, vol. 178, pp. 221–245. Springer, Cham (2016)

Chapter 37

Numerical Solution of Boundary Layer Flow Problem of a Maxwell Fluid Past a Porous Stretching Surface

Jagadish Tawade and Prashant G. Metri

Abstract The present study deals with MHD boundary layer flow and heat transfer of a Maxwell fluid over a stretching sheet with a non-uniform heat source/sink in a porous medium. The effects of the emerging dimensionless parameters are discussed for two different heating processes, namely (i) a surface with prescribed wall temperature (PST) and (ii) a surface with prescribed wall heat flux (PHF). The nonlinear partial differential equations governing the momentum and heat transfer are reduced to nonlinear ordinary differential equations by suitable similarity transformations. The resulting nonlinear boundary value problems are solved numerically by the Runge-Kutta-Fehlberg method with an efficient shooting technique. The behaviour of the velocity and temperature is presented graphically for different non-dimensional parameters for both the PST and PHF cases and discussed in detail. It is observed that the boundary layer thickness increases with an increase in the Eckert number and the non-uniform heat source/sink parameter for both the PST and PHF cases. It is also seen that the thermal boundary layer thickness decreases with an increase in the Prandtl number for both cases.

Keywords Boundary layer flow · Maxwell fluid · Porous media · Magnetic field · Skin friction coefficient · PST and PHF cases

MSC 2020 76D10 · 76D09 · 76B99 · 76M45

J. Tawade (B): Faculty of Science and Technology, Vishwakarma University, 411048 Maharashtra, India. e-mail: [email protected]
P. G. Metri: Department of Mechanical Engineering and Mathematics, Walchand Institute of Technology, Solapur, Maharashtra, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_37

37.1 Introduction

The study of flow induced by a stretching surface has scientific and engineering applications such as aerodynamic extrusion of plastic sheets and fibres, drawing,


annealing and tinning of copper wire, paper production, crystal growing and glass blowing. Such applications involve cooling of a molten liquid by drawing it into a cooling system. In drawing the liquid into the cooling system it is sometimes stretched, as in the polymer extrusion process. The fluid mechanical properties desired for the final outcome of such a process depend mainly on two things: the rate of cooling and the rate of stretching. The choice of an appropriate cooling liquid is crucial, as it has a direct impact on the rate of cooling, and care must be taken to exercise an optimum stretching rate, since sudden stretching may spoil the properties desired for the final product. These two aspects demand a thorough understanding of the flow and heat transfer characteristics, which is the main theme of the present investigation. From the standpoint of many applications akin to the polymer extrusion process, Crane [5] initiated the analytical study of boundary layer flow due to a stretching sheet. He assumed the velocity of the sheet to be a linear function of the distance from the slit. The solution obtained by Crane for the flow driven by a stretching sheet belongs to an important class of exact solutions of the Navier-Stokes equations, and its uniqueness is well established. Hayat et al. [6] studied the effects of thermal radiation and viscous dissipation on magnetohydrodynamic boundary layer flow, heat and mass transfer of a Maxwell fluid past a stretching surface with thermophoresis. Mahmoud [11] analyzed the effect of variable fluid properties and thermal conductivity on magnetohydrodynamic flow of a Maxwell fluid over a stretching sheet with internal heating; the temperature distribution increases with an increase in the thermal conductivity and internal heating. Hayat et al.
[7] examined the effects of thermal radiation and viscous dissipation on magnetohydrodynamic flow in a channel with porous walls, with the Maxwell fluid filling the porous space between the walls. The problem is solved by the homotopy analysis method. The authors [7] observed that the Nusselt number increases with increases in the magnetic parameter, Eckert number, Prandtl number, porosity parameter and Deborah number, and decreases with an increase in the radiation parameter. Abel et al. [2, 3] studied the effects of thermal radiation on magnetohydrodynamic flow and heat transfer of a Maxwell fluid over a stretching surface. Mukhopadhyay et al. [20] analyzed the effects of transpiration and a uniform heat source/sink on magnetohydrodynamic flow and heat transfer of a Maxwell fluid over an unsteady stretching surface. It is noticed that the temperature distribution increases with an increase in the Maxwell parameter. Vajravelu et al. [30] studied magnetohydrodynamic flow, heat and mass transfer of a Maxwell fluid past a stretching surface embedded in a porous medium. It is seen that the chemical reaction parameter increases the thickness of the diffusion boundary layer. Mosta et al. [19] examined magnetohydrodynamic flow of a Maxwell fluid past a porous stretching surface, employing the successive Taylor series linearization method for the nonlinear equations. Shateyi [27] studied the effects of thermal radiation on magnetohydrodynamic flow, heat and mass transfer of a Maxwell fluid over a Darcian porous surface with thermophoresis, chemical reaction and viscous dissipation. Abbasbandy [1] examined numerical and analytical solutions for Falkner-Skan flow of a magnetohydrodynamic Maxwell fluid. Ramesh et al. [26] examined the effect of a uniform heat source/sink on a Maxwell fluid over a stretching surface with convective boundary conditions. It is observed that the temperature distribution increases with the Biot number, since convective heat exchange at the stretching surface tends to increase the thickness of the thermal boundary layer.

37 Numerical Solution of Boundary Layer Flow Problem …

J. Tawade and P. G. Metri

811

In recent years, magnetohydrodynamic fluid flow and heat transfer have attracted much attention because of the influence of the magnetic field on flow control in the boundary layer and on the performance of many systems using electrically conducting fluids; such flow problems are found in industrial applications such as heat exchangers and MHD generators. Metri et al. [12–14] examined magnetohydrodynamic flow, heat and mass transfer over a stretching surface with viscous dissipation, a non-uniform heat source/sink, mixed convection, a viscoelastic fluid, porous media and a convective boundary condition. Tawade et al. [28] studied the effect of internal heating and thermal radiation on flow and heat transfer over an unsteady stretching surface with an external magnetic field. Umavathi et al. [29] theoretically and numerically studied the linear stability of Maxwell nanofluid flow in a saturated porous layer when the porous wall is subjected to time-periodic temperature modulation. The modified Darcy-Maxwell model is used to describe the fluid motion, and the nanofluid model used includes the effects of Brownian motion. Metri et al. [15, 16] studied magnetohydrodynamic flow and heat transfer over an unsteady stretching sheet with viscous dissipation. Pratap et al. [23] studied the influence of chemical reaction on magnetohydrodynamic flow in a vertical double passage channel. The governing equations are solved using the regular perturbation method and numerically by the differential transform method; the two methods are found to agree exactly. Bala Anki Reddy et al. [4] studied the MHD flow, heat and mass transfer of a Maxwell fluid past an exponentially stretching surface with thermophoresis and boundary layer slip. It is noticed that the thermophoresis parameter tends to increase the thickness of the thermal and diffusion boundary layers. Metri et al. [17] studied magnetohydrodynamic flow and heat transfer over a stretching surface with internal heating and viscous dissipation using Lie group transformations. The governing partial differential equations are reduced to nonlinear ordinary differential equations by a scaling group of transformations. Madhu et al. [10] examined the influence of thermal radiation on magnetohydrodynamic flow and heat transfer of a Maxwell nanofluid over a stretching surface. Metri et al. [18, 22] examined the effects of thermocapillary flow and heat transfer over an unsteady stretching surface with viscous dissipation and a magnetic field. Kumaran et al. [9] examined MHD Casson and Maxwell fluid flow, heat and mass transfer over a stretching sheet in the presence of cross diffusion and a non-uniform heat source/sink; the results for the Maxwell and Casson fluids are compared numerically, and it is noticed that the Maxwell fluid is more strongly influenced by the external magnetic field than the Casson fluid. Ramesh et al. [25] examined the influence of thermal radiation and internal heating on three-dimensional incompressible Maxwell nanofluid flow, heat and mass transfer in a saturated porous medium with chemical reaction. Ramana Reddy et al. [24] studied the influence of a non-uniform heat source/sink on magnetohydrodynamic Casson and Maxwell fluid flow, heat and mass transfer over a stretching surface with cross diffusion. It is noticed that viscous dissipation has a more significant impact on the Casson fluid than on the Maxwell fluid. Murtaza et al. [21] studied the three-dimensional incompressible magnetohydrodynamic and ferrohydrodynamic flow of a Maxwell fluid over a stretching surface with a non-uniform heat source/sink. Ibrahim et al. [8] examined the influence of thermal radiation on magnetohydrodynamic flow, heat and mass transfer of a Maxwell fluid past a stretching surface with internal heating, viscous dissipation, chemical reaction and thermal slip. All the above investigations motivate us to propose, in the present paper, the study of the effects of thermal conductivity and a non-uniform heat source/sink on the heat transfer characteristics of boundary layer flow of a Maxwell fluid over a stretching sheet. The effects of the various parameters, such as the elastic parameter, the magnetic parameter, the porous parameter, the Prandtl number and the non-uniform heat source/sink parameters, are discussed in detail and presented graphically.

37.2 Mathematical Formulation

The boundary layer equations can be derived for any viscoelastic fluid starting from the Cauchy equations of motion. For steady two-dimensional flow, the equations governing the transport of momentum and heat can be written as

\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0, \qquad (37.1)

u\,\frac{\partial u}{\partial x} + v\,\frac{\partial u}{\partial y} + \lambda\Big(u^2\frac{\partial^2 u}{\partial x^2} + v^2\frac{\partial^2 u}{\partial y^2} + 2uv\,\frac{\partial^2 u}{\partial x\,\partial y}\Big) = \nu\,\frac{\partial^2 u}{\partial y^2} - \frac{\sigma B_0^2}{\rho}\,u - \frac{\nu}{k'}\,u, \qquad (37.2)

u\,\frac{\partial T}{\partial x} + v\,\frac{\partial T}{\partial y} = \frac{k}{\rho C_p}\,\frac{\partial^2 T}{\partial y^2} + \frac{\mu}{\rho C_p}\Big(\frac{\partial u}{\partial y}\Big)^2 + \frac{q'''}{\rho C_p}, \qquad (37.3)

where u and v are the velocity components along the x and y directions respectively, T is the temperature of the fluid, ρ is the density, ν is the kinematic viscosity, k′ is the permeability of the porous medium, C_p is the specific heat at constant pressure, k is the thermal conductivity of the liquid far away from the sheet, B₀ is the strength of the magnetic field, and λ is the relaxation time parameter of the fluid. The non-uniform heat source/sink q''' is modeled as

q''' = \frac{k\,u_w(x)}{x\,\nu}\left[A^*(T_s - T_0)\,f' + (T - T_0)\,B^*\right], \qquad (37.4)

where A* and B* are the coefficients of the space- and temperature-dependent heat source/sink respectively. Here we note that the case A* > 0, B* > 0 corresponds to an internal heat source and that A* < 0, B* < 0 corresponds to an internal heat sink. In deriving these equations it is assumed, in addition to the usual boundary layer approximations, that the contributions due to the normal stress and the shearing stress are of the same order of magnitude.

The boundary conditions applicable to the flow problem are

u = bx, \quad v = 0, \quad T = T_w = T_\infty + A\Big(\frac{x}{l}\Big)^2 \ \text{(PST case)} \quad \text{at } y = 0, \qquad (37.5)

-k\,\frac{\partial T}{\partial y} = Q_w = D\Big(\frac{x}{l}\Big)^2 \ \text{(PHF case)} \quad \text{at } y = 0, \qquad (37.6)

u \to 0, \quad u_y \to 0, \quad T \to T_\infty \quad \text{as } y \to \infty, \qquad (37.7)

where A and D are constants, b is the constant known as the stretching rate, l is the characteristic length, T_w is the wall temperature and T_∞ is the constant temperature far away from the sheet. In order to obtain the dimensionless form of the solution we define the following variables:

u = bx\,f_\eta(\eta), \quad v = -\sqrt{b\nu}\,f(\eta), \quad \text{where } \eta = \sqrt{\frac{b}{\nu}}\,y, \qquad (37.8)

\theta(\eta) = \frac{T - T_\infty}{T_w - T_\infty}, \quad \text{where } T_w - T_\infty = A\Big(\frac{x}{l}\Big)^2 \ \text{(PST case)}, \quad T_w - T_\infty = D\Big(\frac{x}{l}\Big)^2 \ \text{(PHF case)}, \qquad (37.9)

where the subscript η denotes the derivative with respect to η; below, primes likewise denote differentiation with respect to η. Clearly u and v satisfy Eq. (37.1) identically. Substituting these new variables into Eqs. (37.2) and (37.3), we obtain

f''' - M f' - (f')^2 + f f'' + \beta\,(2 f f' f'' - f^2 f''') + k_2\,f' = 0, \qquad (37.10)

\Pr\left[2 f' \theta - \theta' f\right] = \theta'' + Ec\,\Pr\,(f'')^2 + (A^* f' + B^* \theta), \qquad (37.11)

\Pr\left[2 f' g - g' f\right] = g'' + Ec\,\Pr\,(f'')^2 + (A^* f' + B^* g), \qquad (37.12)

where g(η) plays the role of the temperature function θ(η) in the PHF case. The boundary conditions (37.5)–(37.7) transform to:

PST case:

f'(\eta) = 1, \quad \theta(\eta) = 1, \quad f(\eta) = 0 \quad \text{at } \eta = 0, \qquad (37.13)

f'(\eta) \to 0, \quad \theta(\eta) \to 0, \quad f''(\eta) \to 0 \quad \text{as } \eta \to \infty. \qquad (37.14)

PHF case:

f'(\eta) = 1, \quad \theta'(\eta) = -1, \quad f(\eta) = 0 \quad \text{at } \eta = 0, \qquad (37.15)

f'(\eta) \to 0, \quad \theta(\eta) \to 0, \quad f''(\eta) \to 0 \quad \text{as } \eta \to \infty. \qquad (37.16)
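The claim that the velocity components defined through the similarity variables of Eq. (37.8) satisfy the continuity equation (37.1) identically can be checked symbolically. The following sketch uses SymPy, writing ν for the kinematic viscosity; the function name `f` stands for the dimensionless stream-function profile of Eq. (37.8).

```python
import sympy as sp

x, y, b, nu = sp.symbols('x y b nu', positive=True)
f = sp.Function('f')

eta = sp.sqrt(b / nu) * y            # similarity variable of Eq. (37.8)
F = f(eta)

# u = b x f'(eta); the chain rule gives d/dy f(eta) = f'(eta) sqrt(b/nu),
# so multiplying F.diff(y) by sqrt(nu/b) recovers f'(eta).
u = b * x * F.diff(y) * sp.sqrt(nu / b)
v = -sp.sqrt(b * nu) * F             # v = -sqrt(b nu) f(eta)

continuity = sp.simplify(sp.diff(u, x) + sp.diff(v, y))
print(continuity)                     # -> 0: Eq. (37.1) holds identically
```

Since η does not depend on x, ∂u/∂x = b f′(η), while ∂v/∂y = −b f′(η), so the two terms cancel exactly.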


where the subscript η denotes differentiation with respect to η, β denotes the elastic parameter, k₂ is the porosity parameter, M is the magnetic parameter, Pr and Ec denote the Prandtl number and Eckert number respectively, and A* and B* denote the space- and temperature-dependent heat source/sink parameters respectively. The physical parameters are defined as

k_2 = \frac{\nu}{b\,k'}, \quad \Pr = \frac{\mu C_p}{k}, \quad M = \frac{\sigma B_0^2}{b\rho}, \quad Ec = \frac{b^2 l^2}{A\,C_p}. \qquad (37.17)
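As a quick illustration of one plausible reading of Eq. (37.17) (with ν the kinematic viscosity and k′ the permeability of the porous medium), the dimensionless groups can be evaluated from physical data. All numerical values below are hypothetical and chosen only to demonstrate the computation; they are not taken from the paper.

```python
# Hypothetical physical data (SI units) -- for illustration only.
mu    = 8.9e-4    # dynamic viscosity [Pa s]
rho   = 998.0     # density [kg/m^3]
Cp    = 4180.0    # specific heat at constant pressure [J/(kg K)]
k     = 0.6       # thermal conductivity [W/(m K)]
b     = 2.0       # stretching rate [1/s]
l     = 0.1       # characteristic length [m]
A     = 5.0       # PST wall-temperature amplitude [K]
kperm = 1.0e-6    # permeability of the porous medium [m^2]
sigma = 0.05      # electrical conductivity [S/m]
B0    = 0.5       # magnetic field strength [T]

nu = mu / rho                    # kinematic viscosity
k2 = nu / (b * kperm)            # porosity parameter
Pr = mu * Cp / k                 # Prandtl number
M  = sigma * B0**2 / (b * rho)   # magnetic parameter
Ec = b**2 * l**2 / (A * Cp)      # Eckert number

print(f"k2={k2:.3f}  Pr={Pr:.2f}  M={M:.2e}  Ec={Ec:.2e}")
```

For water-like properties this gives a Prandtl number near 6, consistent with the ranges typically explored in such parametric studies.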

37.3 Physical Quantities

Our interest lies in investigating the flow behavior and the heat transfer characteristics by analyzing the non-dimensional local shear stress τ_w and the Nusselt number Nu. These non-dimensional parameters are defined as:

\tau_w = \frac{\tau^*}{\mu b x \sqrt{b/\nu}} = -f''(0), \quad \text{where } \tau^* = -\mu\Big(\frac{\partial u}{\partial y}\Big)_{y=0}, \qquad (37.18)

Nu = -\theta'(0) \quad \text{(PST case)}, \qquad (37.19)

Nu = \frac{1}{\theta(0)} \quad \text{(PHF case)}. \qquad (37.20)

37.4 Numerical Solution of the Problem

We adopt the efficient shooting method with a fourth-order Runge-Kutta integration scheme to solve the boundary value problems in the PST and PHF cases mentioned in the previous section. The non-linear Eqs. (37.10) and (37.11) in the PST case are transformed into a system of five first-order differential equations as follows:

\frac{df_0}{d\eta} = f_1, \qquad (37.21)

\frac{df_1}{d\eta} = f_2, \qquad (37.22)

\frac{df_2}{d\eta} = \frac{(f_1)^2 + M f_1 - f_0 f_2 - 2\beta f_0 f_1 f_2 - k_2 f_1}{1 - \beta f_0^2}, \qquad (37.23)

\frac{d\theta_0}{d\eta} = \theta_1, \qquad (37.24)

\frac{d\theta_1}{d\eta} = \Pr\left[2 f_1 \theta_0 - \theta_1 f_0\right] - Ec\,\Pr f_2^2 - (A^* f_1 + B^* \theta_0). \qquad (37.25)

Subsequently, the boundary conditions (37.13) and (37.14) take the form

f_0(0) = 0, \quad f_1(0) = 1, \quad f_1(\infty) = 0, \quad \theta_0(0) = 1, \quad \theta_0(\infty) = 0. \qquad (37.26)

Here f₀ = f(η) and θ₀ = θ(η). The aforementioned boundary value problem is first converted into an initial value problem by appropriately guessing the missing slopes f₂(0) and θ₁(0). The resulting IVP is solved by the shooting method for a given set of the parameters appearing in the governing equations. Convergence depends largely on fairly good guesses of the initial conditions in the shooting technique. The iterative process is repeated until the current iterative value of f₂(0) matches the previous iterative value of f₂(0) to within a tolerance of 10⁻⁶. Once convergence is achieved, the resulting ordinary differential equations are integrated using the standard fourth-order Runge-Kutta method with the given set of parameters to obtain the required solution.

37.5 Results and Discussion

Numerical computations have been carried out for different physical parameters, such as the elastic parameter β, the porosity parameter k₂, the magnetic parameter M, the Prandtl number Pr and the Eckert number Ec, and the results are presented graphically in Figs. 37.1–37.16. The non-linear ordinary differential Eqs. (37.10)–(37.12), subject to the boundary conditions (37.13)–(37.16), were solved numerically using the fourth-order Runge-Kutta method with an efficient shooting technique. An appropriate similarity transformation is adopted to transform the governing partial differential equations of flow and heat transfer into a system of non-linear ordinary differential equations. The effects of the several parameters controlling the velocity and temperature profiles are shown graphically and discussed briefly. Figures 37.1 and 37.2 show the effect of the magnetic parameter M, with elastic parameter β = 1, on the velocity profile above the sheet. An increase in the magnetic parameter leads to a decrease of both the u and v velocity components at any given point above the sheet. This is due to the fact that the applied transverse magnetic field produces a drag in the form of the Lorentz force, thereby decreasing the magnitude of the velocity. Figures 37.3 and 37.4 show the effect of the elastic parameter β, in the presence of the magnetic parameter M, on the velocity profile above the sheet. An increase in the


Fig. 37.1 The effect of MHD parameter M on u velocity component f  at β = 1

Fig. 37.2 The effect of MHD parameter M on v velocity component f  at β = 1

Fig. 37.3 The effect of elastic parameter β on u velocity component f  at β = 1

Fig. 37.4 The effect of elastic parameter β on v velocity component f  at β = 1


Fig. 37.5 The effect of Porous parameter γ on u velocity component f  at M = β = 1

Fig. 37.6 The effect of Porous parameter γ on v velocity component f  at M = β = 1

Fig. 37.7 The effect of Prandtl number for temperature distribution: PST case

Fig. 37.8 The effect of Prandtl number for temperature distribution: PHF case


Fig. 37.9 The effect of Eckert number for temperature distribution: PST case

Fig. 37.10 The effect of Eckert number for temperature distribution: PHF case


Fig. 37.11 The effect of non-uniform heat source A* for temperature distribution: PST case

Fig. 37.12 The effect of non-uniform heat source A* for temperature distribution: PHF case


Fig. 37.13 The effect of non-uniform heat sink B* for temperature distribution: PST case

Fig. 37.14 The effect of non-uniform heat sink B* for temperature distribution: PHF case

Fig. 37.15 Effect of porous parameter γ on temperature distribution: PST case

Fig. 37.16 Effect of porous parameter γ on temperature distribution: PHF case


elastic parameter is noticed to decrease both the u and v velocity components at any given point above the sheet. Figures 37.5 and 37.6 reveal the effect of porosity, in the presence of the magnetic and elastic parameters, on the velocity profile above the sheet. An increase in the porous parameter leads to an increase of both the u and v velocity components above the sheet. Figures 37.7 and 37.8 demonstrate the effect of the Prandtl number Pr on the temperature profiles for the two cases, PST and PHF. These plots reveal that for a particular value of Pr the temperature increases monotonically from the free-stream temperature to the wall temperature. The thermal boundary layer thickness decreases drastically for high values of Pr, i.e., low thermal diffusivity. Figures 37.9 and 37.10 show the effect of the Eckert number Ec on the temperature profiles for both the PST and PHF cases. The effect of viscous dissipation is to enhance the temperature of the fluid, i.e., increasing values of Ec contribute to a thickening of the thermal boundary layer. For effective cooling of the sheet a fluid of low viscosity is preferable. Figures 37.11 and 37.12 depict the effect of the space-dependent heat source/sink parameter A* on the temperature profile for the PST and PHF cases. It is observed that for A* > 0 the thermal boundary layer generates energy, which causes the temperature (in both the PST and PHF cases) to increase in magnitude with increasing values of A*, whereas for A* < 0 the boundary layer absorbs energy, resulting in a substantial fall in temperature with decreasing values of |A*|. It is observed in all these plots that there is a transfer of heat from the boundary layer region to the sheet for some negative values of A*. The effect of the temperature-dependent heat source/sink parameter B* on heat transfer is demonstrated in Figs. 37.13 and 37.14 for the PST and PHF cases. These graphs show that energy is released for increasing values of B* > 0, which causes the magnitude of the temperature to increase in both the PST and PHF cases, whereas energy is absorbed for decreasing values of B* < 0, resulting in the temperature dropping significantly near the boundary layer. Figures 37.15 and 37.16 reveal the effect of the porosity parameter γ, in the presence of the magnetic and elastic parameters at M = β = 1, on the temperature profile above the sheet. An increase in the porous parameter leads to a decrease of the temperature in the PST case, whereas the opposite effect is seen in the PHF case. Figures 37.17 and 37.18 illustrate the variation of the wall shear stress parameter (or skin friction coefficient) with the elastic parameter β and the magnetic parameter M, respectively. The shear stress parameter takes a higher value at larger values of β and M. The drop in skin friction investigated in this paper has an important implication for free coating operations: elastic properties of the coating formulations may be beneficial for the whole process, meaning that less force may be needed to pull a moving sheet at a given withdrawal velocity or, equivalently, that higher withdrawal speeds can be achieved for a given driving force, resulting in an increase in the rate of production. The dimensionless wall temperature gradient takes a higher value at large Prandtl number Pr, as shown in Fig. 37.19.

Fig. 37.17 Variation of wall shear stress parameter −f′′(0) with elastic parameter β

Fig. 37.18 Variation of wall shear stress parameter −f′′(0) with magnetic parameter M


Fig. 37.19 Dimensionless heat flux −θ′(0) at the sheet vs Prandtl number

37.6 Conclusions

The magnetohydrodynamic flow of a Maxwell fluid past a stretching sheet in the presence of viscous dissipation and a non-uniform heat source/sink has been studied. Two different heating processes have been considered, namely prescribed surface temperature (PST) and prescribed heat flux (PHF). The effects of the parameters M, β, γ, Pr, Ec, A* and B* on the velocity and temperature distributions have been examined and are represented graphically. Some of the important findings of the paper are:

1. The effect of a transverse magnetic field on a viscous incompressible electrically conducting fluid is to suppress the velocity field, which in turn causes an enhancement of the temperature field.

2. The viscous dissipation effect is characterized by the Eckert number Ec in the present analysis. Compared with the results without viscous dissipation, one can see that the dimensionless temperature increases when the fluid is being heated (Ec > 0) but decreases when the fluid is being cooled (Ec < 0). This reveals that the effect of viscous dissipation is to enhance the temperature in the thermal boundary layer.

3. The effect of the non-uniform heat source/sink parameters is to generate energy for increasing positive values and to absorb energy for decreasing negative values. Hence non-uniform heat sinks are better suited for cooling purposes.


Acknowledgements Prashant Metri is also grateful to the FUSION network and its Swedish node, the MAM research milieu in Mathematics and Applied Mathematics, Division of Mathematics and Physics, School of Education, Culture and Communication at Mälardalen University, for support and an excellent research and research education environment during his visits.

References

1. Abbasbandy, S., Naz, R., Hayat, T., Alsaedi, A.: Numerical and analytical solutions for Falkner-Skan flow of MHD Maxwell fluid. Appl. Math. Comput. 242, 569–575 (2014)
2. Abel, M.S., Tawade, J.V., Nandeppanavar, M.M.: MHD flow and heat transfer for the upper-convected Maxwell fluid over a stretching sheet. Meccanica 47, 385–393 (2012)
3. Abel, M.S., Tawade, J.V., Shinde, J.N.: The effects of MHD flow and heat transfer for the UCM fluid over a stretching surface in presence of thermal radiation. Adv. Math. Phys. 2012, 702681, 21 (2012)
4. Bala Anki Reddy, P., Suneetha, S., Bhaskar Reddy, N.: Numerical study of magnetohydrodynamics (MHD) boundary layer slip flow of a Maxwell nanofluid over an exponentially stretching surface with convective boundary condition. Propul. Power Res. 6(4), 259–268 (2017)
5. Crane, L.J.: Flow past a stretching plate. ZAMP 21, 645 (1970)
6. Hayat, T., Qasim, M.: Influence of thermal radiation and Joule heating on MHD flow of a Maxwell fluid in the presence of thermophoresis. Int. J. Heat Mass Transfer 53, 4780–4788 (2010)
7. Hayat, T., Sajjid, R., Abbas, Z., Sajjid, M., Hendi, A.A.: Radiation effects on MHD flow of Maxwell fluid in a channel with porous medium. Int. J. Heat Mass Transfer 54, 854–862 (2011)
8. Ibrahim, W., Negera, M.: MHD slip flow of upper-convected Maxwell nanofluid over a stretching sheet with chemical reaction. J. Egypt. Math. Soc. 28(7) (2020)
9. Kumaran, G., Sandeep, N., Ali, M.E.: Computational analysis of magnetohydrodynamic Casson and Maxwell flows over a stretching sheet with cross diffusion. Results Phys. 7, 147–155 (2017)
10. Madhu, M., Kishan, N., Chamkha, A.J.: Unsteady flow of Maxwell nanofluid over a stretching surface in presence of magnetohydrodynamic and thermal radiation effects. Propul. Power Res. 6(1), 31–40 (2017)
11. Mahmoud, M.A.A.: The effect of variable fluid properties on MHD Maxwell fluid over a stretching surface in the presence of heat generation/absorption. Chem. Eng. Comm.
198, 131–146 (2011)
12. Metri, P.G., Metri, P.G., Abel, M.S., Silvestrov, S.: Heat transfer in MHD mixed convection viscoelastic fluid flow over a stretching sheet embedded in a porous medium with viscous dissipation and non-uniform heat source/sink. Procedia Eng. 157, 309–316 (2016)
13. Metri, P.G., Bablad, V.M., Metri, P.G., Abel, M.S., Silvestrov, S.: Mixed convection heat transfer in MHD non-Darcian flow due to an exponential stretching sheet embedded in a porous medium in presence of non-uniform heat source/sink. In: Silvestrov, S., Rancic, M. (eds.) Engineering Mathematics I. Springer Proceedings in Mathematics and Statistics, vol. 178, pp. 187–201. Springer, Cham (2016)
14. Metri, P.G., Abel, M.S., Silvestrov, S.: Heat and mass transfer in MHD boundary layer flow over a nonlinear stretching sheet in a nanofluid with convective boundary condition and viscous dissipation. In: Silvestrov, S., Rancic, M. (eds.) Engineering Mathematics I. Springer Proceedings in Mathematics and Statistics, vol. 178, pp. 203–219. Springer, Cham (2016)
15. Metri, P.G., Abel, M.S.: Hydromagnetic flow of a thin nanoliquid film over an unsteady stretching sheet. Int. J. Adv. Appl. Math. Mech. 3(4), 121–134 (2016)
16. Metri, P.G., Abel, M.S., Tawade, J., Metri, P.G.: Fluid flow and radiative nonlinear heat transfer in a liquid film over an unsteady stretching sheet. In: 2016 7th International Conference on Mechanical and Aerospace Engineering (ICMAE), London, pp. 83–87. IEEE (2016)


17. Metri, P.G., Guariglia, E., Silvestrov, S.: Lie group analysis for MHD boundary layer flow and heat transfer over stretching sheet in presence of viscous dissipation and uniform heat source/sink. AIP Conf. Proc. 1798, 020096 (2017)
18. Metri, P.G., Narayana, M., Silvestrov, S.: Hypergeometric steady solution of hydromagnetic nano liquid film flow over an unsteady stretching sheet. AIP Conf. Proc. 1798, 020097 (2017)
19. Mosta, S.S., Hayat, T., Aldossary, O.M.: MHD flow of upper convected Maxwell fluid past porous stretching sheet using successive Taylor series linearization method. Appl. Math. Eng. Ed. 33(8), 975–990 (2012)
20. Mukhopadhyay, S., Vajravelu, K.: Effects of transpiration and internal heat generation/absorption on the unsteady flow of Maxwell fluid at a stretching surface. J. Appl. Mech. 79(4), 044508 (2012)
21. Murtaza, M.G., Ferdows, M., Mishra, J.C., Tzirtzilakis, E.E.: Three dimensional biomagnetic Maxwell fluid flow over a stretching surface in presence of heat source/sink. Int. J. Biomath. 12(3), 1950036 (2019)
22. Narayana, M., Metri, P.G., Silvestrov, S.: Thermocapillary flow of a non-Newtonian nanoliquid film over an unsteady stretching sheet. AIP Conf. Proc. 1798, 020109 (2017)
23. Pratap Kumar, J., Umavathi, J.C., Metri, P.G., Silvestrov, S.: Effect of first order chemical reaction on magneto convection in a vertical double passage channel. In: Silvestrov, S., Rancic, M. (eds.) Engineering Mathematics I. Springer Proceedings in Mathematics and Statistics, vol. 178, pp. 247–279. Springer, Cham (2016)
24. Ramana Reddy, J.V., Ananth Kumar, K., Sugunamma, V., Sandeep, N.: Effect of cross diffusion on MHD non-Newtonian fluids flow past a stretching sheet with non-uniform heat source/sink: A comparative study. Alexandria Eng. J. 57, 1829–1838 (2018)
25.
Ramesh, G.K., Prasannakumara, B.C., Gireesha, B.J., Shehzad, S.A., Abbasi, F.M.: Three dimensional flow of Maxwell fluid with suspended nanoparticles past a bidirectional porous stretching surface with thermal radiation. Therm. Sci. Eng. Progress 1, 6–14 (2017)
26. Ramesh, G.K., Gireesha, B.J.: Influence of heat source/sink on a Maxwell fluid over a stretching surface with convective boundary condition in the presence of nanoparticles. Ain Shams Eng. J. 5, 991–998 (2014)
27. Shateyi, S.: A numerical approach to MHD flow of upper convected Maxwell fluid past a vertical stretching sheet in the presence of thermophoresis and chemical reaction. Bound. Value Probl. 2013, 196 (2013)
28. Tawade, J., Metri, P.G., Abel, M.S.: Thin film flow and heat transfer over an unsteady stretching sheet with thermal radiation, internal heating in presence of external magnetic field. Int. J. Adv. Appl. Math. Mech. 3(4), 29–40. arXiv:1603.03664 (2016)
29. Umavathi, J.C., Vajravelu, K., Metri, P.G., Silvestrov, S.: Effect of time-periodic boundary temperature modulations on the onset of convection in a Maxwell fluid nanofluid saturated porous layer. In: Silvestrov, S., Rancic, M. (eds.) Engineering Mathematics I. Springer Proceedings in Mathematics and Statistics, vol. 178, pp. 221–245. Springer, Cham (2016)
30. Vajravelu, K., Prasad, K.V., Sujatha, A., NG, C.: MHD flow and mass transfer of chemically reactive upper convected Maxwell fluid past porous surface. Appl. Math. Eng. Ed. 33(7), 899–910 (2012)

Chapter 38

Effect of Electromagnetic Field on Mixed Convection of Two Immiscible Conducting Fluids in a Vertical Channel J. C. Umavathi, Prashant G. Metri, and Sergei Silvestrov

Abstract An analysis is carried out to study the flow and heat transfer of electrically conducting immiscible viscous fluids in a parallel vertical channel. Both fluids are incompressible and the flow is assumed to be steady, one-dimensional and fully developed. Combined free and forced convection inside the channel is considered. Through a proper choice of dimensionless variables, the governing equations are developed and three types of thermal boundary conditions are prescribed. These thermal boundary conditions are isothermal-isothermal, isoflux-isothermal and isothermal-isoflux for the left-right walls of the channel respectively. The basic equations are solved analytically using the regular perturbation method and numerically using the Runge-Kutta-Gill method. Solutions for the velocity and temperature for various special cases are reported. A selected set of graphical results illustrating the effects of the various parameters involved in the problem on the velocity and temperature profiles, as well as the flow reversal situation and the Nusselt number, is presented and discussed.

Keywords Two immiscible fluids · Vertical channel · Mixed convection · Conducting fluid · Regular perturbation method

MSC 2020 76D10 · 76D09 · 76B99 · 76M45

J. C. Umavathi Department of Mathematics, Gulbarga University, Gulbarga, Karnataka, India P. G. Metri (B) Department of Mechanical Engineering and Mathematics, Walchand Institute of Technology, Solapur, Maharashtra, India e-mail: [email protected] S. Silvestrov Division of Mathematics and Physics, School of Education, Culture and Communication, Mälardalen University, Box 883, 72123 Västerås, Sweden e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_38


38.1 Introduction

Buoyancy-driven flow and heat transfer in open-ended enclosures have received increasing attention from many researchers in recent years. Natural convection in open-ended cavities has been the subject of interest both experimentally and numerically. This is primarily because many configurations of practical interest, such as nuclear reactors, fire research, thermal insulation, thermal storage systems, electric transmission cables and the brake housing of an aircraft (Desai and Vafai [11]), can be modeled by various extensions of this type of geometry. In most cases, the studies in this field are stimulated by the need to enhance heat transfer, for instance in the design of compact heat exchangers or of solar collectors. A wide literature refers to the simplest non-circular ducts, i.e. parallel plate and rectangular ducts. Hartnett and Kostic [13] provided a very deep review of the most important results on heat transfer in rectangular ducts both for forced and mixed convection flows. One of the first theoretical analyses of laminar convection in rectangular ducts appeared in Han's work [12]. Valuable references on mixed convection flow through a heated channel with viscous dissipation can be found in Aung [2], Aung and Worku [3, 4] and Barletta [6, 7]. The problem concerning the flow of immiscible fluids plays a definite role in chemical engineering and in medicine. In view of this, Bird et al. [9] obtained an exact solution for the laminar flow of two immiscible fluids between two parallel plates. Recent advances in the study of two-phase flow are remarkable, and much has been clarified about its various phenomena. However, many more aspects remain to be studied in order to achieve sufficient understanding and satisfactory prediction of two-phase flow.
In particular, the microscopic structures of two-phase flow, such as velocity components and phase distributions, interfacial structures and turbulence phenomena, are quite important topics, and many efforts have been made in these research areas in recent years. This is partly due to scientific interest in the physical phenomena of two-phase flow and partly due to industrial demands for more precise predictions of two-phase flow behavior in various industrial devices. The study of an electrically conducting viscous fluid under the action of a transversely applied magnetic field has immediate applications in many devices such as magnetohydrodynamic (MHD) power generators, MHD pumps, the petroleum industry and fluid droplet sprays. Postlethwaite and Sluyter [22] presented an overview of the heat transfer problems associated with an MHD generator. The fluid mechanics and the heat transfer characteristics of the generator channel are significantly influenced by the presence of the magnetic field. There has been some theoretical and experimental work on the stratified laminar flow of two immiscible liquids in a horizontal pipe (see Charles and Lilleleht [10], Bentwich [8], Packham and Shail [21]). The interest in this configuration stems from the possibility of reducing the power required to pump oil in a pipeline by the suitable addition of water. Packham and Shail [21] analyzed stratified laminar flow of two immiscible liquids in a horizontal pipe. Shail [24] investigated theoretically

the possibility of using a two-phase system to obtain increased flow rates in an electromagnetic pump. Hartmann flow of a conducting fluid with a non-conducting fluid layer contained in a channel has been studied by Shail [24]. As a result of this study, Shail [24] reported that an increase of the order of 30% can be achieved in the flow rate of an electromagnetic pump for suitable depth and viscosity ratios of the two fluids and realistic values of the Hartmann number. Loharsbi and Sahai [14] dealt with two-phase MHD flow and heat transfer in a parallel-plate channel. Both phases are incompressible and the flow is assumed to be steady, one-dimensional and fully developed. The study was expected to be useful in understanding the effect of the presence of slag layers on the heat transfer characteristics of a coal-fired MHD generator. Alireza and Sahai [1] studied the effect of temperature-dependent transport properties on the developing MHD flow and heat transfer in a parallel plate channel whose walls were held at constant and equal temperatures. Malashetty and Leela [15, 16] reported closed-form solutions for the two-phase flow and heat transfer situation in a horizontal channel for which both phases are electrically conducting. Malashetty and Umavathi [17] studied two-phase MHD flow and heat transfer in an inclined channel in the presence of buoyancy effects for the situation where only one of the phases is electrically conducting. Malashetty et al. [18–20] analyzed the problem of fully developed two-fluid magnetohydrodynamic flows with and without an applied electric field in an inclined channel. Umavathi et al. [25, 26, 28, 29] studied steady and unsteady flow and heat transfer of a conducting fluid in a vertical channel. Umavathi et al. [30] studied the effect of thermal modulation on the onset of convection in a Maxwell fluid and nanofluid saturated porous medium. Pratap Kumar et al.
[23] investigated study MHD flow nature by inserting a baffle in a vertical channel filled with chemically reacting conducting fluid. Electromagnetic conduction has been increasing in the manufacturing process of semi-conducting materials such as silicon crystal or gallium arsenide. In the literature there is no much work on Hartmann flow with applied electric field. Hence keeping in view the applications cited above, the authors project on the study of mixed convection of two-phase magneto-hydrodynamic channel flows for open and short circuits.

38.2 Mathematical Formulation

The geometry under consideration, illustrated in Fig. 38.1, consists of two infinite parallel plates maintained at different or equal constant temperatures, extending in the $X$ and $Z$ directions. The region $-h_1/2 \le Y \le 0$ is occupied by a viscous incompressible electrically conducting fluid of density $\rho_1$, viscosity $\mu_1$, thermal conductivity $K_1$, thermal expansion coefficient $\beta_1$ and electrical conductivity $\sigma_{e1}$. The region $0 \le Y \le h_2/2$ is occupied by another viscous incompressible electrically conducting fluid of density $\rho_2$, viscosity $\mu_2$, thermal conductivity $K_2$, thermal expansion coefficient $\beta_2$ and electrical conductivity $\sigma_{e2}$. A uniform magnetic field $B_0$ is applied normal to the plates and a uniform electric field $E_0$ is applied across the channel. The fluids

J. C. Umavathi et al.

Fig. 38.1 Physical configuration

are assumed to have constant properties except for the density in the buoyancy term of the momentum equation. The fluid rises in the channel driven by buoyancy forces. The transport properties of both fluids are assumed to be constant. We consider a flow that is steady, laminar and fully developed. It is assumed that the only non-zero component of the velocity vector $\vec{q}$ is the $X$-component $U_i$ ($i = 1, 2$). Thus, as a consequence of the mass balance equation, one obtains

$$\frac{\partial U_i}{\partial X} = 0,$$

so that $U_i$ depends only on $Y$. The streamwise and transverse momentum balance equations become

$$g\beta_i (T_i - T_0) - \frac{1}{\rho_i}\frac{\partial P}{\partial X} + \nu_i \frac{d^2 U_i}{dY^2} - \frac{\sigma_{ei}}{\rho_i}\,(E_0 + B_0 U_i)B_0 = 0. \tag{38.1}$$

The $Y$-momentum balance equation can be expressed as

$$\frac{\partial P}{\partial Y} = 0, \tag{38.2}$$

where $P = p + \rho_0 g X$ (with $P_1 = P_2 = P$) is the difference between the pressure and the hydrostatic pressure. On account of (38.2), $P$ depends only on $X$, so that (38.1) can be written as

$$g\beta_i (T_i - T_0) - \frac{1}{\rho_i}\frac{dP}{dX} + \nu_i \frac{d^2 U_i}{dY^2} - \frac{\sigma_{ei}}{\rho_i}\,(E_0 + B_0 U_i)B_0 = 0. \tag{38.3}$$

Let us assume that the walls of the channel are isothermal. In particular, the temperature of the boundary at $Y = -h_1/2$ is $T_{w1}$ and the temperature of the boundary at $Y = h_2/2$ is $T_{w2}$, with $T_{w2} \ge T_{w1}$. These boundary conditions are compatible with (38.3) if and only if $dP/dX$ is independent of $X$. Therefore, there exists a constant $A$ such that

$$\frac{dP}{dX} = A. \tag{38.4}$$

On account of (38.4) and by evaluating the derivative of (38.3) with respect to $X$, one obtains

$$\frac{dT_i}{dX} = 0,$$

so that the temperature also depends only on $Y$. Taking into account the effect of viscous and Ohmic dissipation, the energy balance equation can be written as

$$\alpha_i \frac{d^2 T_i}{dY^2} + \frac{\nu_i}{C_p}\left(\frac{dU_i}{dY}\right)^2 + \frac{\sigma_{ei}}{\rho_i C_p}\,(E_0 + B_0 U_i)^2 = 0. \tag{38.5}$$

Equations (38.3) and (38.5) allow one to obtain differential equations for $U_i$, namely

$$\frac{d^4 U_i}{dY^4} - \frac{\sigma_{ei} B_0^2}{\mu_i}\frac{d^2 U_i}{dY^2} - \frac{\rho_i g\beta_i}{K_i}\left(\frac{dU_i}{dY}\right)^2 - \frac{g\beta_i \sigma_{ei}}{K_i \nu_i}\,(E_0 + B_0 U_i)^2 = 0. \tag{38.6}$$

The boundary conditions on $U_i$ are the no-slip conditions

$$U_1 = 0 \quad \text{at } Y = -\frac{h_1}{2}, \qquad U_2 = 0 \quad \text{at } Y = \frac{h_2}{2},$$

and those induced by the boundary conditions on $T$ and by (38.3), namely

$$U_1 = U_2 \quad \text{at } Y = 0,$$

$$\frac{d^2 U_1}{dY^2} = \frac{A}{\mu_1} + \frac{\sigma_{e1} E_0 B_0}{\mu_1} - \frac{g\beta_1 (T_{w1} - T_0)}{\nu_1} \quad \text{at } Y = -\frac{h_1}{2},$$

$$\frac{d^2 U_2}{dY^2} = \frac{A}{\mu_2} + \frac{\sigma_{e2} E_0 B_0}{\mu_2} - \frac{g\beta_2 (T_{w2} - T_0)}{\nu_2} \quad \text{at } Y = \frac{h_2}{2},$$

$$\mu_1 \frac{dU_1}{dY} = \mu_2 \frac{dU_2}{dY} \quad \text{at } Y = 0,$$

$$\frac{d^2 U_1}{dY^2} - \frac{\sigma_{e1}}{\mu_1}(E_0 + B_0 U_1)B_0 = \frac{\mu_2 \rho_1 \beta_1}{\mu_1 \rho_2 \beta_2}\frac{d^2 U_2}{dY^2} + \frac{A}{\mu_1}\left(1 - \frac{\rho_1 \beta_1}{\rho_2 \beta_2}\right) - \frac{\sigma_{e2}\rho_1 \beta_1}{\mu_1 \rho_2 \beta_2}(E_0 + B_0 U_2)B_0 \quad \text{at } Y = 0,$$

$$\frac{d^3 U_1}{dY^3} - \frac{\sigma_{e1} B_0^2}{\mu_1}\frac{dU_1}{dY} = \frac{\mu_2 \rho_1 K_2 \beta_1}{\mu_1 \rho_2 K_1 \beta_2}\frac{d^3 U_2}{dY^3} - \frac{\sigma_{e2} B_0^2 \rho_1 K_2 \beta_1}{\mu_1 \rho_2 K_1 \beta_2}\frac{dU_2}{dY} \quad \text{at } Y = 0. \tag{38.7}$$

Equations (38.6)–(38.7) can be written in dimensionless form by employing the dimensionless quantities

$$u_1 = \frac{U_1}{U_0^{(1)}}, \quad u_2 = \frac{U_2}{U_0^{(2)}}, \quad \theta_1 = \frac{T_1 - T_0}{\Delta T}, \quad \theta_2 = \frac{T_2 - T_0}{\Delta T}, \quad y_1 = \frac{Y_1}{D_1}, \quad y_2 = \frac{Y_2}{D_2}, \tag{38.8}$$

$$Gr = \frac{g\beta_1 \Delta T D_1^3}{\nu_1^2}, \quad Re = \frac{U_0^{(1)} D_1}{\nu_1}, \quad GR = \frac{Gr}{Re}, \quad R_T = \frac{T_{w2} - T_{w1}}{\Delta T}, \quad Br = \frac{\mu_1 U_0^{(1)2}}{K_1 \Delta T},$$

$$M^2 = \frac{\sigma_{e1} B_0^2 D_1^2}{\mu_1}, \quad E = \frac{E_0}{B_0 U_0^{(1)}}, \quad D_1 = 2h_1, \quad D_2 = 2h_2.$$

The reference velocities $U_0^{(i)}$ and the reference temperature $T_0$ are given by

$$U_0^{(i)} = -\frac{A D_i^2}{48\mu_i}, \qquad T_0 = \frac{T_{w1} + T_{w2}}{2}.$$

Moreover, the temperature difference $\Delta T$ is given by $\Delta T = T_{w2} - T_{w1}$ if $T_{w1} < T_{w2}$.
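For readers implementing the model, the dimensionless groups of (38.8) can be computed directly from dimensional data. The following sketch uses the chapter's notation but purely illustrative property values (none of the numbers below come from the chapter):

```python
import math

def dimensionless_groups(g, beta1, dT, D1, nu1, mu1, K1, sigma_e1, B0, E0, U01):
    """Dimensionless groups of (38.8); all inputs are dimensional (SI units)."""
    Gr = g * beta1 * dT * D1**3 / nu1**2       # Grashof number
    Re = U01 * D1 / nu1                        # Reynolds number
    GR = Gr / Re                               # mixed-convection parameter
    Br = mu1 * U01**2 / (K1 * dT)              # Brinkman number
    M = D1 * B0 * math.sqrt(sigma_e1 / mu1)    # Hartmann number
    E = E0 / (B0 * U01)                        # electric loading parameter
    return {"Gr": Gr, "Re": Re, "GR": GR, "Br": Br, "M": M, "E": E}

# Illustrative (hypothetical) inputs: a water-like fluid in a 2 cm channel.
groups = dimensionless_groups(g=9.81, beta1=2e-4, dT=10.0, D1=0.02,
                              nu1=1e-6, mu1=1e-3, K1=0.6,
                              sigma_e1=1.0, B0=0.5, E0=0.01, U01=0.05)
```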

As a consequence, the dimensionless parameter $R_T$ can only take the values 0 or 1: $R_T = 1$ for asymmetric heating with $T_{w1} < T_{w2}$, and $R_T = 0$ for symmetric heating with $T_{w1} = T_{w2}$. Equation (38.4) implies that $A$ can be either positive or negative. If $A > 0$, then $U_0^{(i)}$, $Re$ and $GR$ are negative, i.e. the flow is downward. On the contrary, if $A < 0$, the flow is upward, so that $U_0^{(i)}$, $Re$ and $GR$ are positive (Barletta [5]). The flow is termed assisting when $GR > 0$ and opposing when $GR < 0$. The dimensionless momentum equations are as follows.

Region-I

$$\frac{d^4 u_1}{dy^4} - M^2 \frac{d^2 u_1}{dy^2} = GR\,Br\left[\left(\frac{du_1}{dy}\right)^2 + M^2E^2 + M^2u_1^2 + 2M^2Eu_1\right] \tag{38.9}$$

Region-II

$$\frac{d^4 u_2}{dy^4} - B\frac{d^2 u_2}{dy^2} = GR\,Br\,mnbKh^4\left[\left(\frac{du_2}{dy}\right)^2 + BA_4^2 + Bu_2^2 + 2BA_4u_2\right], \tag{38.10}$$

where $B = M^2mh^2\sigma_r$ and $A_4 = \dfrac{E}{mh^2}$ are not independent parameters. The boundary and interface conditions become

$$u_1 = 0, \quad \frac{d^2u_1}{dy^2} = -48 + M^2E + \frac{GR\,R_T}{2} \quad \text{at } y = -\frac{1}{4}, \tag{38.11}$$

$$u_2 = 0, \quad \frac{d^2u_2}{dy^2} = -48 + M^2E\sigma_r - \frac{nb\,GR\,R_T}{2} \quad \text{at } y = \frac{1}{4},$$

$$u_1 = mh^2u_2, \quad \frac{du_1}{dy} = h\frac{du_2}{dy} \quad \text{at } y = 0,$$

$$\frac{d^2u_1}{dy^2} - M^2u_1 = \frac{1}{nb}\left[\frac{d^2u_2}{dy^2} - Bu_2 - M^2\sigma_rE + nbM^2E + 48(1-nb)\right] \quad \text{at } y = 0,$$

$$\frac{d^3u_1}{dy^3} - M^2\frac{du_1}{dy} = \frac{1}{nbKh}\left[\frac{d^3u_2}{dy^3} - B\frac{du_2}{dy}\right] \quad \text{at } y = 0.$$

38.3 Analytical Solutions

38.3.1 Special Cases

38.3.1.1 Without Dissipation Effect (Br = 0)

The solutions of (38.9) and (38.10) using (38.11), in the absence of viscous and Ohmic dissipations ($Br = 0$), are given by

Region-I

$$u_1 = G_1 + G_2 y + G_3 \cosh(My) + G_4 \sinh(My)$$

Region-II

$$u_2 = G_5 + G_6 y + G_7 \cosh(\sqrt{B}\,y) + G_8 \sinh(\sqrt{B}\,y)$$

Using (38.8) in (38.3), we obtain the energy balance equations as

Region-I

$$\theta_1 = -\frac{1}{GR}\left[48 + \frac{d^2u_1}{dy^2} - M^2(E+u_1)\right]$$

Region-II

$$\theta_2 = -\frac{1}{nb\,GR}\left[48 + \frac{d^2u_2}{dy^2} - M^2E\sigma_r - Bu_2\right]$$
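As a quick consistency check (with arbitrary illustrative constants, not the chapter's integration constants $G_1$–$G_4$), the quoted Region-I form annihilates the homogeneous part of the operator in (38.9): each hyperbolic term gains a factor $M^2$ per double differentiation while the linear part drops out.

```python
import math

def hyperbolic_part(y, G3, G4, M):
    return G3 * math.cosh(M * y) + G4 * math.sinh(M * y)

def u1(y, G1, G2, G3, G4, M):
    """Br = 0 solution in Region-I: u1 = G1 + G2*y + G3*cosh(M*y) + G4*sinh(M*y)."""
    return G1 + G2 * y + hyperbolic_part(y, G3, G4, M)

def d2u1(y, G1, G2, G3, G4, M):
    # G1 + G2*y differentiates away; the cosh/sinh pair reproduces itself times M**2
    return M**2 * hyperbolic_part(y, G3, G4, M)

def d4u1(y, G1, G2, G3, G4, M):
    return M**4 * hyperbolic_part(y, G3, G4, M)

def residual(y, G1, G2, G3, G4, M):
    """Left-hand side d4u/dy4 - M^2 d2u/dy2 with the solution inserted."""
    return d4u1(y, G1, G2, G3, G4, M) - M**2 * d2u1(y, G1, G2, G3, G4, M)
```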

38.3.1.2 Purely Forced Convection (GR = 0)

When buoyancy forces are negligible while viscous and Ohmic dissipations dominate ($GR = 0$), purely forced convection occurs. Under this condition the solutions of (38.9) and (38.10), subject to the boundary and interface conditions (38.11), are given by

Region-I

$$u_1 = F_1 + F_2 y + F_3 \cosh(My) + F_4 \sinh(My) \tag{38.12}$$

Region-II

$$u_2 = F_5 + F_6 y + F_7 \cosh(\sqrt{B}\,y) + F_8 \sinh(\sqrt{B}\,y) \tag{38.13}$$

Using (38.8) in (38.5), we obtain the energy balance equations as

Region-I

$$\frac{d^2\theta_1}{dy^2} = -Br\left[\left(\frac{du_1}{dy}\right)^2 + M^2E^2 + M^2u_1^2 + 2M^2Eu_1\right] \tag{38.14}$$

Region-II

$$\frac{d^2\theta_2}{dy^2} = -Br\,mKh^4\left[\left(\frac{du_2}{dy}\right)^2 + BA_4^2 + Bu_2^2 + 2BA_4u_2\right]$$

The boundary and interface conditions for the temperature are

$$\theta_1 = -\frac{R_T}{2} \ \text{at } y = -\frac14, \qquad \theta_2 = \frac{R_T}{2} \ \text{at } y = \frac14, \tag{38.15}$$

$$\theta_1 = \theta_2, \qquad \frac{d\theta_1}{dy} = \frac{1}{Kh}\frac{d\theta_2}{dy} \quad \text{at } y = 0.$$

Using (38.12), (38.13) and (38.15), the energy balance equations (38.14) can be integrated; the resulting expressions are not presented here.

Perturbation solution

We define the dimensionless parameter (as in Aung and Worku [3, 4] and Barletta [5–7]):

$$\varepsilon = \frac{Gr\,Br}{Re} = GR\,Br.$$

This definition shows that $\varepsilon$ does not depend on the reference temperature difference $\Delta T$. The product $GR\,Br = \varepsilon$ is assumed to be very small, which can be exploited to use the regular perturbation method. To this end the solutions are assumed in the form

$$u_i(y) = u_{i0}(y) + \varepsilon u_{i1}(y) + \varepsilon^2 u_{i2}(y) + \cdots = \sum_{n=0}^{\infty} \varepsilon^n u_{in}(y). \tag{38.16}$$
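The stated $\Delta T$-independence of $\varepsilon$ is easy to confirm numerically, since $Gr$ is proportional to $\Delta T$ while $Br$ is proportional to $1/\Delta T$. The property values below are illustrative placeholders, not data from the chapter:

```python
def epsilon(dT, g=9.81, beta1=2e-4, D1=0.02, nu1=1e-6, mu1=1e-3, K1=0.6, U01=0.05):
    """epsilon = Gr*Br/Re = GR*Br for a given temperature difference dT."""
    Gr = g * beta1 * dT * D1**3 / nu1**2   # proportional to dT
    Re = U01 * D1 / nu1                    # independent of dT
    Br = mu1 * U01**2 / (K1 * dT)          # proportional to 1/dT
    return Gr * Br / Re
```

Evaluating `epsilon(5.0)` and `epsilon(50.0)` gives the same value to machine precision, confirming that the perturbation parameter is fixed by the fluid properties and the geometry alone.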

Using (38.16) in (38.9) and (38.10) and equating the coefficients of like powers of $\varepsilon$ to zero, we obtain the zeroth- and first-order equations as follows.

Isothermal–isothermal ($T_{w1} - T_{w2}$) wall conditions

Region-I

Zeroth-order equation

$$\frac{d^4u_{10}}{dy^4} - M^2\frac{d^2u_{10}}{dy^2} = 0 \tag{38.17}$$

First-order equation

$$\frac{d^4u_{11}}{dy^4} - M^2\frac{d^2u_{11}}{dy^2} = \left(\frac{du_{10}}{dy}\right)^2 + M^2E^2 + M^2u_{10}^2 + 2M^2Eu_{10} \tag{38.18}$$

Region-II

Zeroth-order equation

$$\frac{d^4u_{20}}{dy^4} - B\frac{d^2u_{20}}{dy^2} = 0 \tag{38.19}$$

First-order equation

$$\frac{d^4u_{21}}{dy^4} - B\frac{d^2u_{21}}{dy^2} = mnbKh^4\left[\left(\frac{du_{20}}{dy}\right)^2 + BA_4^2 + Bu_{20}^2 + 2BA_4u_{20}\right] \tag{38.20}$$

The corresponding boundary and interface conditions given in (38.11) reduce to the following.

Zeroth-order conditions:

$$u_{10} = 0, \quad \frac{d^2u_{10}}{dy^2} = -48 + M^2E + \frac{GR\,R_T}{2} \quad \text{at } y = -\frac14,$$

$$u_{20} = 0, \quad \frac{d^2u_{20}}{dy^2} = -48 + M^2E\sigma_r - \frac{nb\,GR\,R_T}{2} \quad \text{at } y = \frac14,$$

$$u_{10} = mh^2u_{20}, \quad \frac{du_{10}}{dy} = h\frac{du_{20}}{dy} \quad \text{at } y = 0,$$

$$\frac{d^2u_{10}}{dy^2} - M^2u_{10} = \frac{1}{nb}\left[\frac{d^2u_{20}}{dy^2} - Bu_{20} - M^2E\sigma_r + nbM^2E + 48(1-nb)\right] \quad \text{at } y = 0,$$

$$\frac{d^3u_{10}}{dy^3} - M^2\frac{du_{10}}{dy} = \frac{1}{nbKh}\left[\frac{d^3u_{20}}{dy^3} - B\frac{du_{20}}{dy}\right] \quad \text{at } y = 0. \tag{38.21}$$

First-order conditions:

$$u_{11} = 0, \quad \frac{d^2u_{11}}{dy^2} = 0 \quad \text{at } y = -\frac14,$$

$$u_{21} = 0, \quad \frac{d^2u_{21}}{dy^2} = 0 \quad \text{at } y = \frac14,$$

$$u_{11} = mh^2u_{21}, \quad \frac{du_{11}}{dy} = h\frac{du_{21}}{dy} \quad \text{at } y = 0,$$

$$\frac{d^2u_{11}}{dy^2} - M^2u_{11} = \frac{1}{nb}\left[\frac{d^2u_{21}}{dy^2} - Bu_{21}\right] \quad \text{at } y = 0,$$

$$\frac{d^3u_{11}}{dy^3} - M^2\frac{du_{11}}{dy} = \frac{1}{nbKh}\left[\frac{d^3u_{21}}{dy^3} - B\frac{du_{21}}{dy}\right] \quad \text{at } y = 0. \tag{38.22}$$

Solutions of the zeroth-order equations (38.17), (38.19) using the boundary and interface conditions (38.21), and of the first-order equations (38.18), (38.20) using the conditions (38.22), can be obtained in closed form; they are not presented here. Using (38.8) in (38.3), we obtain the energy balance equations as

Region-I

$$\theta_1 = -\frac{1}{GR}\left[48 + \frac{d^2u_1}{dy^2} - M^2(E+u_1)\right]. \tag{38.23}$$

Region-II

$$\theta_2 = -\frac{1}{nb\,GR}\left[48 + \frac{d^2u_2}{dy^2} - M^2E\sigma_r - Bu_2\right]. \tag{38.24}$$

Using the velocities obtained from (38.17)–(38.20), the energy balance equations (38.23) and (38.24) can be evaluated; the expressions are not presented here.

Isoflux–isothermal ($q_1 - T_{w2}$) wall conditions

For this case, the thermal boundary conditions for the channel walls can be written in dimensional form as

$$q_1 = -K_1\frac{dT_1}{dY} \ \text{at } Y = -\frac{h_1}{2}, \qquad T_2 = T_{w2} \ \text{at } Y = \frac{h_2}{2}.$$

The dimensionless form of the above equations can be obtained by using (38.8) along with $\Delta T = \dfrac{q_1 D_1}{K_1}$ to give

$$\frac{d\theta_1}{dy} = -1 \ \text{at } y = -\frac14, \qquad \theta_2 = \frac{R_{qt}}{2} \ \text{at } y = \frac14, \tag{38.25}$$

where $R_{qt} = \dfrac{T_{w2} - T_0}{\Delta T}$ is the thermal ratio parameter for isoflux–isothermal walls. Other than the no-slip conditions at the channel walls, two more boundary conditions in terms of $U_1$ are required to solve (38.6). One is induced by the conditions given in (38.25); the other is obtained from (38.3) as follows. Differentiating (38.3) with respect to $Y$, with $\dfrac{dP}{dX} = A$, gives

$$\frac{d^3U_1}{dY^3} + \frac{g\beta_1}{\nu_1}\frac{dT_1}{dY} - \frac{\sigma_{e1}B_0^2}{\mu_1}\frac{dU_1}{dY} = 0.$$

This equation is non-dimensionalized by using (38.8) to give

$$\frac{d^3u_1}{dy^3} - M^2\frac{du_1}{dy} + GR\frac{d\theta_1}{dy} = 0.$$

Evaluating it at the left wall $y = -\frac14$, where $\dfrac{d\theta_1}{dy} = -1$, yields

$$\frac{d^3u_1}{dy^3} - M^2\frac{du_1}{dy} = GR \ \text{at } y = -\frac14.$$

The other boundary condition, at the right wall, can be shown to be the same as that given for the isothermal–isothermal wall with $R_T$ replaced by $R_{qt}$, namely

$$\frac{d^2u_2}{dy^2} = -48 + M^2E\sigma_r - \frac{nb\,GR\,R_{qt}}{2} \ \text{at } y = \frac14.$$

The integration constants appearing in the solutions are evaluated using the boundary conditions (38.21), (38.22) and the two conditions above; the expressions are not presented here.

Isothermal–isoflux ($T_{w1} - q_2$) wall conditions

For this case, the thermal boundary conditions for the channel walls can be written in dimensional form as

$$T_1 = T_{w1} \ \text{at } Y = -\frac{h_1}{2}, \qquad q_2 = -K_2\frac{dT_2}{dY} \ \text{at } Y = \frac{h_2}{2}. \tag{38.26}$$

The dimensionless form of the above equations can be obtained by using (38.8) along with $\Delta T = \dfrac{q_2 D_2}{K_2}$ to give

$$\theta_1 = R_{tq} \ \text{at } y = -\frac14, \qquad \frac{d\theta_2}{dy} = -1 \ \text{at } y = \frac14, \tag{38.27}$$

where $R_{tq} = \dfrac{T_{w1} - T_0}{\Delta T}$ is the thermal ratio parameter for isothermal–isoflux walls. Other than the no-slip conditions at the channel walls, two more boundary conditions in terms of $U_2$ are required to solve (38.6). One is induced by the conditions given in (38.27); the other is obtained from (38.3) as follows. Differentiating (38.3) with respect to $Y$, with $\dfrac{dP}{dX} = A$, gives

$$\frac{d^3U_2}{dY^3} + \frac{g\beta_2}{\nu_2}\frac{dT_2}{dY} - \frac{\sigma_{e2}B_0^2}{\mu_2}\frac{dU_2}{dY} = 0. \tag{38.28}$$

Equation (38.28) is non-dimensionalized by using (38.8) to give

$$\frac{d^3u_2}{dy^3} - B\frac{du_2}{dy} + GR\,nb\,\frac{d\theta_2}{dy} = 0. \tag{38.29}$$

Evaluating (38.29) at the right wall $y = \frac14$, where $\dfrac{d\theta_2}{dy} = -1$, yields

$$\frac{d^3u_2}{dy^3} - B\frac{du_2}{dy} = GR\,nb \ \text{at } y = \frac14. \tag{38.30}$$

The other boundary condition, at the left wall, can be shown to be the same as that given for the isothermal–isothermal wall with $R_T$ replaced by $R_{tq}$, namely

$$\frac{d^2u_1}{dy^2} = -48 + M^2E + \frac{GR\,R_{tq}}{2} \ \text{at } y = -\frac14. \tag{38.31}$$

The integration constants appearing in (38.17)–(38.20) are evaluated using the boundary conditions (38.21), (38.22), (38.30) and (38.31); the expressions are not presented here.

Rate of heat transfer

The heat transfer at the walls is expressed in terms of the Nusselt numbers, which in non-dimensional form become

$$Nu_1 = \left(1 + \frac{1}{h}\right)\frac{d\theta_1}{dy} \ \text{at } y = -\frac14, \tag{38.32}$$

$$Nu_2 = \left(1 + h\right)\frac{d\theta_2}{dy} \ \text{at } y = \frac14.$$
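Given the wall gradients of the dimensionless temperature, (38.32) is a one-line computation. As a sanity check with a hypothetical, purely conductive profile (not a chapter result): for a linear asymmetric-heating profile $\theta(y) = 2y$, so that $\theta(\pm\frac14) = \pm\frac12$, with $h = 1$ both wall gradients equal 2 and $Nu_1 = Nu_2 = 4$, which is consistent with the entries of Table 38.4 clustering near 4 when dissipation is weak.

```python
def nusselt(dtheta1_dy, dtheta2_dy, h):
    """Nusselt numbers of (38.32): Nu1 at the wall y = -1/4, Nu2 at y = +1/4."""
    nu1 = (1.0 + 1.0 / h) * dtheta1_dy  # cold wall
    nu2 = (1.0 + h) * dtheta2_dy        # hot wall
    return nu1, nu2

# Linear conduction profile theta(y) = 2*y with width ratio h = 1:
print(nusselt(2.0, 2.0, 1.0))  # -> (4.0, 4.0)
```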

38.4 Results and Discussion

In this section the fluid flow and heat transfer results for an electrically conducting fluid in a vertical enclosure are discussed in the presence of an applied magnetic field $B_0$ normal to gravity and an applied electric field $E_0$ parallel to the $z$-axis, considering both viscous and Ohmic dissipations. The electric loading parameter $E = 0$ corresponds to the short-circuit configuration and $E \neq 0$ to the open circuit; $E$ may be positive or negative depending on the polarity of $E_0$. The electromagnetic force when $E = -1$ is found to accelerate the flow, so that the device acts as an MHD generator. Further, the direction of the flow when $E > 0$ is opposite to that when $E < 0$, and hence the present configuration can be used effectively for the flow reversal situations required in many practical problems. The flow equations are coupled and nonlinear; hence finding exact solutions is not possible. Therefore, the regular perturbation technique is used to find approximate analytical solutions, which is applicable only for small $\varepsilon = GR\,Br$. However, it is essential to analyze the flow nature for large $\varepsilon$ as well. This is achieved by solving the basic equations numerically using the Runge–Kutta–Gill method. It is expected that the analytical and numerical solutions are in good agreement when $\varepsilon$ is relatively small.

The flow field for asymmetric heating ($R_T = 1$) is obtained and depicted in Figs. 38.2, 38.3, 38.4, 38.5, 38.6, 38.7, 38.8, 38.9 and 38.10, and also shown in Tables 38.1, 38.2 and 38.3, for the parameter values $m = 1$, $b = 1$, $h = 1$, $K = 1$, $n = 1$, $M = 2$, $GR = 500$ and $\varepsilon = 0.1$, except for the varying ones. Equations (38.12) and (38.13) give the velocity field in both regions in the absence of the Brinkman number ($Br = 0$); the corresponding profiles are depicted graphically in Fig. 38.2. There is a flow reversal near the cold wall at $y = -\frac14$ for $GR = 400$, and the profile is symmetric for $GR = 0$.
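The Runge–Kutta–Gill method mentioned above is the four-stage Gill variant of classical fourth-order Runge–Kutta. A minimal scalar-ODE sketch (a generic illustration, not the authors' solver, which must additionally handle the two-region boundary value problem, e.g. by shooting) is:

```python
import math

SQRT2 = math.sqrt(2.0)

def rkg_step(f, x, y, h):
    """One step of the 4th-order Runge-Kutta-Gill method for y' = f(x, y)."""
    k1 = f(x, y)
    k2 = f(x + h / 2, y + h * k1 / 2)
    k3 = f(x + h / 2, y + h * ((-0.5 + 1 / SQRT2) * k1 + (1 - 1 / SQRT2) * k2))
    k4 = f(x + h, y + h * (-k2 / SQRT2 + (1 + 1 / SQRT2) * k3))
    return y + h / 6 * (k1 + (2 - SQRT2) * k2 + (2 + SQRT2) * k3 + k4)

def rkg_integrate(f, x0, y0, x1, n):
    """Integrate y' = f(x, y) from x0 to x1 in n Gill steps."""
    h = (x1 - x0) / n
    x, y = x0, y0
    for _ in range(n):
        y = rkg_step(f, x, y, h)
        x += h
    return y
```

Gill's variant was historically favored for its reduced round-off error and storage; for a linear test problem it reproduces the classical RK4 accuracy, e.g. integrating $y' = y$ over $[0, 1]$ recovers $e$ to roughly eighth-order decimal accuracy with 100 steps.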
Equations (38.16) and (38.17) are evaluated for the temperature field in the absence of the Grashof number ($GR = 0$); the results are depicted graphically in Fig. 38.3 for different values of the Brinkman number Br.


Fig. 38.2 Velocity profiles for different values of G R

Fig. 38.3 Temperature profiles for different values of Br


Fig. 38.4 Velocity profiles for different values of G R and ε

Fig. 38.5 Temperature profiles for different values of Br


Fig. 38.6 Velocity profiles for different values of G R and ε

Fig. 38.7 Velocity profiles for different values of Hartmann number M


Fig. 38.8 Velocity profiles for different values of viscosity ratio m

Fig. 38.9 Velocity profiles for different values of width ratio h


Fig. 38.10 Temperature profiles for different values of width ratio h

Table 38.1 Temperature values for different values of GR and ε

            GR = −500, ε = −0.1              GR = 500, ε = 0.1
y           E = −1    E = 0     E = 1        E = −1    E = 0     E = 1
−0.25       −0.5      −0.5      −0.5         −0.5      −0.5      −0.5
−0.2        −0.3997   −0.3997   −0.3997      −0.3998   −0.3998   −0.3998
−0.15       −0.2996   −0.2996   −0.2996      −0.2997   −0.2997   −0.2997
−0.1        −0.1995   −0.1995   −0.1995      −0.1995   −0.1996   −0.1996
−0.05       −0.0995   −0.0995   −0.0994      −0.0994   −0.0995   −0.0995
0           5E−4      6E−4      6E−4         6E−4      6E−4      5E−4
0.05        0.1005    0.1005    0.1006       0.1006    0.1005    0.1005
0.1         0.2004    0.2004    0.2005       0.2005    0.2005    0.2005
0.15        0.3003    0.3003    0.3003       0.3004    0.3004    0.3004
0.2         0.4002    0.4002    0.4002       0.4003    0.4003    0.4003
0.25        0.5       0.5       0.5          0.5       0.5       0.5

The temperature field increases with increasing values of the Brinkman number Br. Figures 38.2 (Br = 0) and 38.3 (GR = 0) are similar to the graphs obtained by Barletta [5] and Umavathi et al. [27] for the one-fluid model considering a permeable fluid and a viscous fluid, respectively. The effect of GR and ε on the velocity and temperature fields is shown in Figs. 38.4, 38.5, 38.6 and 38.7 for short (E = 0) and open (E ≠ 0) circuits. Figures 38.4 and 38.5 show that the dimensionless velocity and temperature at each position are increasing functions of ε irrespective of the value of the electric load parameter E. This


Table 38.2 Temperature values for different values of Hartmann number M

            M = 2                            M = 6
y           E = −1    E = 0     E = 1        E = −1    E = 0     E = 1
−0.25       −0.5      −0.5      −0.5         −0.5      −0.5      −0.5
−0.2        −0.3998   −0.3998   −0.3998      −0.3998   −0.3999   −0.3998
−0.15       −0.2997   −0.2997   −0.2997      −0.2997   −0.2998   −0.2997
−0.1        −0.1996   −0.1996   −0.1995      −0.1996   −0.1997   −0.1996
−0.05       −0.0995   −0.0995   −0.0994      −0.0996   −0.0996   −0.0994
0           5E−4      6E−4      6E−4         5E−4      5E−4      6E−4
0.05        0.1005    0.1005    0.1006       0.1004    0.1005    0.1007
0.1         0.2005    0.2005    0.2005       0.2004    0.2004    0.2006
0.15        0.3004    0.3004    0.3004       0.3003    0.3004    0.3005
0.2         0.4003    0.4003    0.4003       0.4003    0.4002    0.4003
0.25        0.5       0.5       0.5          0.5       0.5       0.5

Table 38.3 Temperature values for different values of viscosity ratio m

            m = 0.5                          m = 2
y           E = −1    E = 0     E = 1        E = −1    E = 0     E = 1
−0.25       −0.5      −0.5      −0.5         −0.5      −0.5      −0.5
−0.2        −0.3999   −0.3999   −0.3999      −0.3997   −0.3997   −0.3997
−0.15       −0.2998   −0.2998   −0.2998      −0.2995   −0.2995   −0.2995
−0.1        −0.1997   −0.1997   −0.1997      −0.1993   −0.1993   −0.1992
−0.05       −0.0997   −0.0997   −0.0996      −0.0991   −0.0991   −0.0991
0           3E−4      3E−4      4E−4         9E−4      1E−3      1E−3
0.05        0.1003    0.1003    0.1004       0.1009    0.1009    0.101
0.1         0.2003    0.2003    0.2003       0.2008    0.2008    0.2009
0.15        0.3002    0.3002    0.3003       0.3007    0.3007    0.3007
0.2         0.4002    0.4002    0.4002       0.4005    0.4005    0.4005
0.25        0.5       0.5       0.5          0.5       0.5       0.5

is due to the fact that a greater energy generation by viscous dissipation yields a greater fluid temperature and, as a consequence, a stronger buoyancy force. The increase of the buoyancy force implies an increase of the velocity in the upward direction for both open and short circuits. One can also observe from Figs. 38.4 and 38.5 that the analytical and numerical solutions agree very well for small values of ε, while the difference becomes large as ε grows. Figure 38.6 displays the velocity profiles for GR = ±500 (both assisting and opposing flows) for open and short circuits. It is seen that for large values of GR, e.g. GR = 500, flow reversal occurs both at the cold and hot walls, depending on whether the flow is assisting (GR > 0) or opposing (GR < 0). In addition, the effect of E is found to inhibit the fluid motion, that is, the velocity decreases as E increases. The intensity of


flow reversal is therefore enhanced by increasing E, irrespective of whether the flow is assisting or opposing. The flow nature for GR = ±100, ±500 is similar to the one-fluid model for permeable and viscous fluids (see Barletta [5] and Umavathi et al. [27]). The effect of GR and ε on the temperature for E = −1, 0, 1 is shown in Table 38.1. For positive and negative values of GR and ε, the temperature values are the same up to three decimal places, and the effect is almost invariant for both open and short circuits.

The effect of the Hartmann number on the velocity is shown in Fig. 38.7. This graph also shows the effect of the electric field load parameter E on the flow for both open and short circuits. The effect of the Hartmann number is to suppress the velocity field in Region-II, which is the typical retarding effect of the Lorentz force on the flow field. It is also observed that there is a flow reversal near the cold wall, whose intensity is enhanced for increasing values of the electric loading parameter. The effect of the Hartmann number on the temperature appears only at the fourth decimal place, as shown in Table 38.2, for both open and short circuits.

The effect of the viscosity ratio m on the velocity is displayed in Fig. 38.8. As the viscosity ratio m increases, the velocity increases in Region-I and decreases in Region-II for both open and short circuits. Flow reversal is also observable near the cold wall, and the downward flow intensity increases for decreasing values of m. It is interesting to note that the curvature of the velocity profiles changes sign near the interface for both m = 0.5 and m = 2, owing to the interface conditions in (38.11). However, the slopes of the velocity profiles for m = 0.5 and m = 2 are of opposite signs near the interface. The effect of the viscosity ratio m on the temperature is likewise small and is therefore shown in Table 38.3. In Figs. 38.9 and 38.10, the influence of the width ratio h on the velocity and temperature is shown, respectively, for open and short circuits.
As the width ratio h increases, both the velocity and temperature fields decrease, with the other parameters fixed. The temperature field remains invariant for the specified values E = −1, 0, 1, as seen in Fig. 38.10. The sign change of the velocity curvature at the interface persists in the cases h = 0.5 and h = 2, as for the varying-m case. Again, the sign of the slope of the velocity profile for h = 0.5 is opposite to that for h = 2 at the interface. In addition, nearly linear temperature profiles are observed, since the dissipation effect is very small for the present parameter values. The slope of the temperature changes from Region-I to Region-II, however, owing to the continuity of heat flux at the interface.

Figures 38.11 and 38.12 exhibit the effect of the thermal conductivity ratio K on the flow and temperature fields. It is seen that increasing K decreases both the velocity and the temperature in both regions. There is a flow reversal near the cold wall for the large value K = 2, but this phenomenon disappears for the small value K = 0.5. Differently from the cases of varying m and h, both the slope and the curvature of the velocity profiles for K = 0.5 and K = 2 keep the same sign around the interface. There is no apparent effect of the electric field load parameter E on the flow for varying K, as seen in Fig. 38.12.

Figures 38.13, 38.14, 38.15 and 38.16 illustrate the effect of the Hartmann number M on the flow and temperature for isoflux–isothermal (Rqt = 0.5) and isothermal–isoflux (Rtq = −0.5) wall conditions with E = −1, 0, 1. The effect of the Hartmann

Fig. 38.11 Velocity profiles for different values of conductivity ratio K

Fig. 38.12 Temperature profiles for different values of conductivity ratio K


Fig. 38.13 Velocity profiles for different values of Hartmann number M in isoflux-isothermal wall conditions

Fig. 38.14 Temperature profiles for different values of Hartmann number M in isoflux-isothermal wall conditions


Fig. 38.15 Velocity profiles for different values of Hartmann number M in isothermal-isoflux wall conditions

Fig. 38.16 Temperature profiles for different values of Hartmann number M in isothermal-isoflux wall conditions


Table 38.4 Nusselt numbers for different values of Hartmann number M

        E = −1                        E = 0                         E = 1
M       Nu1          Nu2              Nu1          Nu2              Nu1          Nu2
1       4.00680222   3.97998195       4.00677148   3.98014715       4.00693668   3.98011641
2       4.00677246   3.98034703       4.00664771   3.98096168       4.00726236   3.98083693
3       4.00673726   3.98089185       4.00645211   3.98213106       4.00769132   3.98184591
4       4.00671053   3.98154949       4.00619830   3.98347436       4.00812317   3.98296213

number M is to suppress the velocity and temperature in both regions for the isoflux–isothermal case, as seen in Figs. 38.13 and 38.14, respectively. The suppression of the velocity and temperature with increasing M is found for both open and short circuits, whereas the effect of the electric load parameter E on the temperature is invariant. For the isothermal–isoflux wall conditions (Rtq = −0.5), the effect of the Hartmann number is to reduce the downward velocity and the temperature in both regions, but the direction of the flow is reversed when compared to the isoflux–isothermal wall conditions (Rqt = 0.5). In this case the effect of the electric load parameter E is significant on the velocity and only slight on the temperature, as shown in Figs. 38.15 and 38.16. It is also observed from Figs. 38.2, 38.3, 38.4, 38.5, 38.6, 38.7, 38.8, 38.9, 38.10, 38.11, 38.12, 38.13, 38.14, 38.15 and 38.16 that the velocity and temperature profiles for the short circuit E = 0 lie between those for the open circuits E = ±1. The variations of the Nusselt number for varying Hartmann number with symmetric heating, for both open and short circuits, are shown in Table 38.4. As the Hartmann number M increases, the rate of heat transfer decreases near the cold wall and increases near the hot wall for E = −1 and E = 0, but increases at both the cold and hot walls for E = 1.

38.5 Conclusion

The effect of an electromagnetic field on the mixed convection of two immiscible conducting fluids in a vertical channel is examined both analytically and numerically. The effects of the selected parameters on the velocity and temperature characteristics are drawn graphically and explained physically. The effect of the parameters GR, Br, E, M, m, h and K on the flow and heat transfer is examined. For the range of parameters considered, the following conclusions can be drawn:

1. The velocity in Region-I increases as m and Br increase and decreases as GR, E, h and K increase, whereas the velocity in Region-II increases as GR and Br increase and decreases as E, M, m, h and K increase.

2. Flow reversal can occur for larger values of GR, and its intensity is enhanced by increasing the values of Br, E, h and K and by decreasing the value of m. A similar flow reversal phenomenon can also be observed for opposing flow.


3. The temperatures in Region-I and Region-II increase as Br increases and decrease as h and K increase. They are insensitive to variations of E, M and m if Br or ε is small.

4. The Nusselt number at the cold wall slightly decreases with increasing E and M, whereas Nu2 slightly increases as E and M increase.

5. For the q − T case with Rqt = 0.5, the velocity and temperature in both regions decrease as M increases. For the T − q case with Rtq = −0.5, the downward velocity and temperature decrease as M increases.

38.6 Nomenclature

A         Constant (dP/dX = A)
b         Ratio of thermal expansion coefficients (β2/β1)
B0        Magnetic field
Br        Brinkman number (μ1 U0^(1)² / (K1 ΔT))
Cp        Specific heat at constant pressure [J kg⁻¹ K⁻¹]
E0        Applied electric field
E         Electrical loading parameter (E0 / (B0 U0^(1)))
g         Acceleration due to gravity [m s⁻²]
Gr        Grashof number (gβ1 D1³ ΔT / ν1²)
GR        Dimensionless parameter (Gr/Re)
h         Width ratio (h2/h1)
h1, h2    Twice the width of region-I, II [m]
K         Ratio of thermal conductivities (K1/K2)
K1, K2    Thermal conductivity of the fluid in region-I, II [W m⁻¹ K⁻¹]
m         Ratio of viscosities (μ1/μ2)
M         Hartmann number (D1 B0 √(σe1/μ1))
n         Ratio of densities (ρ1/ρ2)
P         Dimensional pressure
Re        Reynolds number (U0^(1) D1 / ν1)
T         Temperature [K]
Tw1, Tw2  Temperature of the boundaries [K]
u         Velocity [m s⁻¹]
U0^(i)    Reference velocity (−(dp/dX) Di² / (48 μi))
X, Y      Space coordinates [m]
x, y      Dimensionless coordinates
α         Thermal diffusivity
β         Coefficient of thermal expansion
ΔT        Difference in temperature [K]
ε         Dimensionless parameter (GR Br)
θi        Non-dimensional temperature ((Ti − T0)/ΔT)
μ         Viscosity [kg m⁻¹ s⁻¹]
ν         Kinematic viscosity
ρ         Density of the fluid [kg m⁻³]
σe1, σe2  Electrical conductivity of region-I, II
σr        Ratio of electrical conductivities (σe2/σe1)

Acknowledgements J. C. Umavathi is thankful to Prof. Maurizio Sasso (supervisor) and Prof. Matteo Savino (coordinator) for the financial support under the ERASMUS MUNDUS scheme "FUSION: Featured Europe and South/South-East Asia Mobility Network" for post-doctoral research. Prashant Metri is also grateful to the FUSION network and its Swedish node, the MAM research milieu in Mathematics and Applied Mathematics, Division of Mathematics and Physics, School of Education, Culture and Communication at Mälardalen University, for support and an excellent research and research education environment during his visits.

854

J. C. Umavathi et al.


Chapter 39

Stochastic Smart Grid Meter for Industry 4.0—From an Idea to the Practical Prototype

Marjan Urekar and Jelena Djordjević Kozarov

Abstract There is demand for digital high-precision power grid electrical energy meters at low price, as millions of units are needed for households across the EU. Most solutions are based on standard analog-to-digital converters. A different approach, the Stochastic Digital Measurement Method, uses cheap, high-speed, low-resolution flash A/D converters with added dither, a well-defined and controlled statistical random noise signal with a uniform probability density function. It decouples the quantization error of low-resolution converters and produces the average value of the input signal. High precision and high speed/bandwidth are two opposing demands in regular sampling, but by measuring signals over finite periods of time with this novel method, high-precision results are obtained. A fully detailed mathematical model was developed for a multibit stochastic instrument, and a hardware prototype of a 4-bit resolution Stochastic Digital Electrical Energy Meter was produced. It was shown that full digital control over its main functions, high precision at wide bandwidth, and low cost make it an ideal meter for Smart Grid applications in Industry 4.0 environments. The prototype has a measurement uncertainty of 80 ppm, up to 25 times better than commercially available models.

Keywords Random noise signal · Uniform probability density function

MSC 2020 60G35

M. Urekar (B) Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia e-mail: [email protected] J. Djordjević Kozarov Faculty of Electronic Engineering, University of Niš, Niš, Serbia e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_39


39.1 Introduction

The modern industrial technology concept of Industry 4.0 (I4.0) [1–3] presents the next step in the "industrial revolutions", where current equipment is either updated or replaced with novel smart electronic devices: fully digitally operated; interconnected in local and global networks via the Internet of Things (IoT), enabling remote access and control; implementing Virtual and Augmented Reality for off-site experience and monitoring; with Artificial Intelligence replacing human operators and decision makers; and with Cyber-Physical Systems integrating computer algorithms, networking, and physical/mechanical processes. Measurements are an integral and fundamental part of any modern technology, as we must test and monitor every step of the industrial process rather than just hope that everything goes as planned. As with any other technological breakthrough, I4.0 is expected to provide advanced-performance industrial equipment, and for measurements in particular: better accuracy and precision, lower measurement uncertainty, and higher reliability and stability of performance over time. This is not a trivial task of simply using new and better parts, as we must measure ultrahigh frequencies, low-level signals in noise, and mixed EM sources, all in real time. Another requisite is that modern instrumentation be fully digital, controllable, and configurable, with various networking options (wired, wireless, PC, Internet, etc.). Many instruments still in use today are mechanical, analog, or older-generation digital, and they all need to be replaced with up-to-date equipment. This leads to another important condition: the price and cost/benefit ratio (CBR). Some applications need the highest possible performance (e.g. calibration of national-level standards), where cost is not the most relevant condition.

Opposed to the previous instance are mass-produced measurement systems, where low cost per unit is the most significant parameter, but a satisfactorily low level of error must still be provided, e.g. in residential Smart Grid (SG) electrical energy meters, as hundreds of millions of units are needed for the EU market alone. This introduces another important condition: the ability to perform measurements on a large number of channels simultaneously, at an economical cost/benefit ratio. Most current digital instruments are based on analog-to-digital converters (ADC) and the Standard Sampling Method (SSM) [4]. The main idea of the SSM is measuring over an infinitely short period Δt (or "in a point"). At this instant, a sample of the input signal amplitude is taken and converted from analog to digital by the ADC. The sampling frequency is f_s = 1/Δt. To obtain higher sampling frequencies, Δt must be very short; the most advanced ADCs today (also the most expensive) have Δt ≈ 0.5 ns. For low-resolution ADCs (below 10 bits), the Bennett quantization error model [5] cannot be applied, so the quantization error cannot be modeled as a stationary white noise signal with zero mean [6]. Now we can see the main problem in I4.0 measurements: how to accommodate the two opposing demands of high performance and low cost. Hence, two new problems arise. First, to get better performance, we need the most advanced ADCs in our designs. Those come at an ever-increasing price, degrading the CBR. In addition, some of today's top-end ADCs have reached the current technological


maximum, and there is little or no room for future improvement. Second, an inherent ADC problem presents another dilemma. High-accuracy, high-precision ADCs have low bandwidth and operate at very low frequencies (down to DC). On the other hand, ADCs optimized for high-frequency signals have large voltage offsets and lower accuracy as trade-offs. An ADC operating at high frequencies (for multi-channel measurements), with low error, at low cost, and with high reliability is almost impossible to obtain today. In an attempt to overcome this methodological contradiction, the Stochastic Digital Measurement Method (SDMM) was successfully developed at the Faculty of Technical Sciences, University of Novi Sad [7, 8]. SDMM follows a completely different paradigm of digital measurement compared to the SSM and represents a radical step forward. In SDMM, unlike SSM, the measured quantity is not observed and measured at a single time instant, but rather over a set of points in a time interval. The fastest low-resolution flash ADCs (FADC) are used. They are extremely simple, robust, and reliable, with a small number of sources of systematic error that are easily corrected; therefore, SDMM is inherently very accurate. Typically, a FADC has a low resolution of 2–10 bits, resulting in a large quantization error, so precision seems to be an issue. This problem is efficiently solved by adding dither (h), a random uniform noise, creating a Stochastic Flash ADC (SFADC). This gives SDMM high precision, and the short digital word at the SFADC output (only several bits) results in a very simple block for the basic multiplication and accumulation needed to process the results. High resolution and wide SFADC bandwidth would lead to complicated hardware and a large number of components, introducing more sources of possible systematic error, so it is preferable to use SFADCs with low resolution (2–10 bits), with fewer sources of systematic error.

The synergy of simple low-resolution A/D conversion with a FADC and simple processing yields simple multichannel operation, measurements, processing of results, and hardware. This allows Stochastic Measurement Instruments (SMI) based on SDMM to be built with control over the value of systematic error, high sampling rate, high accuracy and precision, high linearity, and low measurement uncertainty. This simple structure enables a SMI to be realized and integrated in an ASIC (Application-Specific Integrated Circuit) chip. A detailed and practical SDMM theory has been developed, but mainly for the simplest 2-bit version of a SFADC. A working hardware prototype was built, rigorously tested in the laboratory, and verified by software simulations and practical experiments. The results confirmed, with a high level of confidence, that the basic idea and the developed theory are correct. One important conclusion of the theoretical analysis is that higher SFADC resolution improves precision, but for each new bit of resolution the hardware and the number of sources of systematic error double, compromising accuracy and reliability. To determine whether this compromise between the speed, reliability, and precision of the SFADC is acceptable for I4.0 purposes, a full mathematical model had to be developed for the general case of a SFADC with q bits of resolution. With the theoretical model, a practical multibit hardware prototype can be built as a design proof-of-concept and tested with the best available calibration equipment.


39.2 Multibit SFADC

A generalized model of a multibit (m-bit) SFADC is presented [9] as an electrical instrument for averaging values over a measurement time interval. The input signal y = f(t) is summed in the adder with the uniform dither signal h, a uniform stochastic signal whose probability density function (PDF) is p(h) and whose maximum amplitude is equal to the voltage threshold g (Fig. 39.1). Dither is a voltage signal of controlled noise whose amplitude PDF in the time domain is uniform, with value 1/(2g), in contrast to regular white noise, which has a Gaussian distribution. Two examples of multibit SFADCs are given, 3-bit in Fig. 39.2 and 4-bit in Fig. 39.3, as typical practical implementations of this model.

Fig. 39.1 Dither PDF and dither in time domain

The sum of these signals is routed to the inverting inputs of all 2^m voltage comparators (VC). The SFADC generates 2^m DC voltage thresholds V_T(VC_i) using two voltage sources of (2^m − 1)g and −(2^m − 1)g volts, connected to the ends of a voltage divider made of (2^m − 1) series resistors R_d of equal resistance. Here g denotes the lowest voltage threshold applied to one of the comparators. Each of these thresholds is fed to the non-inverting input of the corresponding comparator VC_i, in order from the highest to the lowest:

V_T(VC_1) = (2^m - 1)\,g, \quad V_T(VC_2) = (2^m - 3)\,g, \quad V_T(VC_3) = (2^m - 5)\,g, \ \ldots
V_T(VC_{2^m - 1}) = \bigl(2^m - (2^{m+1} - 3)\bigr)\,g = -(2^m - 3)\,g,
V_T(VC_{2^m}) = \bigl(2^m - (2^{m+1} - 1)\bigr)\,g = -(2^m - 1)\,g. \qquad (39.1)

When the composite dithered signal y + h becomes higher than the i-th voltage threshold, all comparators from the first to the i-th have a logic one set at their output, i.e. the positive comparator supply voltage +V. All other comparators, whose comparison thresholds are higher than the current level at the inverting inputs, have a logic zero at their output, i.e. the negative supply voltage −V. Clearly, as the input voltage varies, all comparators with voltage thresholds below that level are activated. This type of digital coding is called the thermometer code, by analogy with a thermometer whose mercury level covers all the divisions on the scale below the highest point it currently shows. This code is not suitable for a simple binary accumulator, so the signed binary code (binary number with a sign) was chosen as the most favorable, enabling the simplest form of digital accumulator at the output.

Fig. 39.2 Schematic of a 3-bit SFADC

The output of each VC is connected to a level limiter (LL) that adjusts the output voltage level for further coupling with digital logic circuits and their logic signal levels. The outputs of each two adjacent VCs are connected to one exclusive-OR (XOR) logic circuit. From here, we get (2^m − 1) outputs with values from −(2^{m−1} − 1) to +(2^{m−1} − 1). These outputs are then easily encoded into the signed binary code via a simple diode AND logic network. The output code has (m − 1) bits plus an additional sign bit, making a binary word with 2^m possible states and m bits, where each bit has its own line leading to the digital adder or m-bit accumulator (ACC), which adds each new FADC output state to the previous sum in the accumulator after each A/D conversion cycle at f_s. If the input signal exceeds the permitted voltage range, the comparator with the highest voltage threshold activates and changes state from logic "1" to "0", with all other lower comparators activated at the same time. This value is inverted to "1" and multiplied by the clock signal in a

862

Fig. 39.3 Schematic of a 4-bit SFADC



logic AND circuit. This generates a positive overflow signal (+OV), signaling that the upper (positive) limit of the input voltage range has been exceeded. If the signal level drops below the allowed lower voltage threshold, only then does the lowest VC change its output from "0" to "1", while all other VCs output "1". Multiplying this signal with the clock signal generates a negative overflow signal (−OV), signaling that the lower (negative) limit of the input voltage range has been exceeded. Both of these signals are also routed to the accumulator, which is designed so that when one of these lines is active, the current sample is skipped and not added to the sum. This prevents measurement error due to short-term interference on the measured signal (peaks, glitches, fast transients). Also, in a special register or in an external processor circuit, the ACC records the number of clock cycles during the measurement period, which corresponds to the number of generated samples N. If +OV or −OV are active, N is decreased by the number of cycles during which the input range limits are exceeded.
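The comparator-and-divider structure described above can be sketched in software. The following Python model is an illustrative sketch of the text's description, not the authors' hardware or firmware; all function and variable names are our own. It builds the threshold ladder of (39.1), applies uniform dither, converts the thermometer code into a signed integer code, and flags overflow so that such samples can be skipped, as the accumulator does.

```python
import random

def sfadc_sample(y, m, g, rng):
    """One conversion cycle of an m-bit stochastic flash ADC (illustrative model).

    y : input voltage, g : lowest comparator threshold.
    Returns (code, ov): code * 2g reconstructs the voltage; ov = +1/-1 flags
    positive/negative overflow, in which case the sample should be skipped.
    """
    s = y + rng.uniform(-g, +g)                               # composite dithered signal
    thresholds = [(2**m - 1 - 2*i) * g for i in range(2**m)]  # ladder of (39.1)
    fired = sum(1 for vt in thresholds if s > vt)             # thermometer code
    if fired == 2**m:
        return 2**(m - 1), +1                                 # +OV: above the range
    if fired == 0:
        return -2**(m - 1), -1                                # -OV: below the range
    return fired - 2**(m - 1), 0                              # signed binary code

def sfadc_average(y_values, m, g, seed=0):
    """Average of the accepted samples in volts; OV samples are skipped."""
    rng = random.Random(seed)
    acc = n = 0
    for y in y_values:
        code, ov = sfadc_sample(y, m, g, rng)
        if ov == 0:
            acc += code
            n += 1
    return 2 * g * acc / n                                    # one code step = 2g = one quantum
```

Averaging many dithered 4-bit samples of a constant input reproduces that input to well below one quantum, which is the effect the mathematical model below quantifies.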

39.3 Base Conditions and Limitations

For a q-bit SFADC, the following set of conditions must be met:

1.
A = R - g, \quad |y + h| \le (2^q - 1)\,g = A + g = R. \qquad (39.2)

The sum of the input signal amplitude A and the uniform dither signal must be within the full voltage range ±R of the SFADC input stage, which equals the lowest comparator voltage threshold g times (2^q − 1), where q is the number of bits of SFADC resolution. It should be emphasized that the overflow protection built into the SFADC serves as an error indicator and prevents the counting of samples that would introduce an additional indeterminate error. If the input signal voltage exceeds the upper limit by any margin, be it 1% or 100%, the indicator cannot assess the size of this margin and simply reports the event as a faulty reading not related to the input voltage; therefore, we characterize this error as indeterminate. Overflow and skipped samples increase the measurement error, due to the decrease in the number of samples N. When an over-range voltage occurs, the received value is not included in the ACC sum and the clock counter is reduced by one, shortening the measurement by one sample, which corresponds to a reduction in the length of the measurement period t_m. One clock cycle lasts t_c = 1/f_c, where f_c is the clock frequency. If we did not take the number of occurrences of OV into account, we would get a result calculated not from the expected number of samples N_1 = t_m/t_c, but from N_2 = N_1 − N_OV, where N_OV is the number of samples that exceeded the input voltage range. This is why the ACC must provide an indication of the value N_2 or N_OV, in order to determine the corrected number of samples or the corrected measuring time t_mc = N_2 · t_c over which the value was calculated. Another solution is to prolong the


measurement period further, until the expected number of samples N is gathered in the ACC (a simpler solution). An important conclusion is that the occurrence of input voltage levels outside the range limits degrades the measurement characteristics of the SFADC and of the SMI in general, so correcting mechanisms must be included in the design. Since f_c is equal to the sampling frequency f_s of a given SFADC, according to the Nyquist–Shannon sampling theorem the maximum frequency of the measured input signal is f_y = f_s/2 = f_c/2.

2.
-R \le y \le R \ \Rightarrow\ |y| \le A = R - |h| = (2^q - 1)\,g - g = (2^q - 2)\,g. \qquad (39.3)

The maximum amplitude A of the measured input signal must be limited and is determined by the values of the quantities q and g.

3.
\Delta = 2g = \frac{2R}{2^q - 1} = \frac{A}{2^{q-1} - 1}. \qquad (39.4)

We can define the smallest step (quantum) Δ of the uniform quantizer that this ADC represents, i.e. the quantum of the multibit SFADC, as equal to twice the value of the lowest comparator voltage threshold g. The threshold decreases with increasing resolution (number of bits q), so the value of the SFADC quantum decreases with an increasing number of bits q. From here, it is possible to set the following criteria for designing a multibit SFADC:

(a)
g = \frac{A}{2^q - 2} = \frac{R}{2^q - 1} = \frac{V_{ref}}{2^q}. \qquad (39.5)

If a q-bit SFADC with input range ±R is designed for a signal with maximum amplitude A, the threshold g can be determined. This is the most common base condition in the SMI design process, and we can consider it a purely technical limitation: if we know which components will be used in the device itself, we can determine the maximum allowed R in advance, based on their known characteristics.

(b)
\pm V_{ref} = \pm 2^q g. \qquad (39.6)

Criterion (39.6) also enables the calculation of the required reference voltage V_ref that is applied to the ends of the voltage divider generating the required voltage thresholds. A positive reference voltage equal to the threshold of the first (upper) comparator, V_T(VC_1) = (2^q − 1) g = +V_ref − g, is applied to the upper end of the divider. A negative reference voltage, equal to the threshold of the last (lowest) comparator, V_T(VC_{2^q}) = −(2^q − 1) g = −V_ref + g, is applied to the lower end. The voltage divider further divides these voltages into the required thresholds. If −V_ref is generated from +V_ref, thermal tracking of all voltage thresholds is provided. Since only a single reference voltage source is needed (instead of 2^q separate threshold voltage sources), complexity is reduced and the scatter due to individual tolerances is eliminated. If the voltage V_ref drifts,


all threshold levels will change their absolute values at the same time, but their mutual relative differences remain the same. If the voltage changes with temperature, it also affects the resistors that make up the voltage divider, so their temperature coefficients must be matched. The best solution is the selection of laser-trimmed resistor networks with a small or almost zero temperature coefficient, i.e. a negligible change of resistance with ambient temperature.

(c)
|y| \le R - g = A = (2^q - 2)\,g = \frac{2^{q-1} - 1}{2^{q-1}} \cdot V_{ref}. \qquad (39.7)

If we have a given q-bit SFADC with a known threshold g, we can determine the maximum amplitude A of the input signal y being measured. This is useful in theoretical considerations comparing the performance of several SFADCs with different numbers of bits of resolution.

4.
|h| \le \frac{\Delta}{2} = g = \frac{R}{2^q - 1} = \frac{A}{2^q - 2} = \frac{V_{ref}}{2^q},
p(h) = \frac{1}{\Delta} = \frac{1}{2g} = \frac{2^q - 1}{2R} = \frac{2^{q-1} - 1}{A} = \frac{2^{q-1}}{V_{ref}}. \qquad (39.8)

The stochastic uniform dither signal h [10] must have an amplitude limited to half of one SFADC quantum Δ, i.e. to the minimum threshold g, in order to satisfy the Widrow condition [11] (band-limited characteristic function of the input, for a quantization error with uniform PDF). According to the previous consideration, the dither amplitude can also be determined from the value of the voltage reference V_ref (39.5). Another requisite for the dither is that its PDF p(h) must always be the same: the reciprocal of twice the threshold g, i.e. of one quantum Δ. It is also possible to specify the PDF using predefined values of A, R or V_ref, for a given number of bits q.
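The design criteria (39.4)–(39.8) chain together, so a single predefined quantity fixes all the others. A small Python helper (our own illustrative sketch and naming, not from the paper) makes this bookkeeping explicit and can double as a sanity check of the relations:

```python
def sfadc_design(q, R):
    """Derive the base quantities of a q-bit SFADC from its input range +/-R.

    Follows (39.3)-(39.8): g is the lowest threshold, Delta the quantum,
    A the maximum input amplitude, Vref the reference voltage, and p_h the
    height of the uniform dither PDF.
    """
    g = R / (2**q - 1)          # (39.5)
    Delta = 2 * g               # (39.4)
    A = (2**q - 2) * g          # (39.3)
    Vref = 2**q * g             # (39.6)
    p_h = 1 / Delta             # (39.8)
    return {"g": g, "Delta": Delta, "A": A, "Vref": Vref, "p_h": p_h}
```

For example, a 4-bit SFADC with R = 1.5 V gives g = 0.1 V, Δ = 0.2 V, A = 1.4 V and V_ref = 1.6 V, consistent with every alternative form in (39.5) and (39.8).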

39.4 Mathematical Model of the SFADC Measurement Uncertainty

The output function of a q-bit SFADC is denoted Ψ_q. The mean value of the output function over the measurement interval [t_1, t_2], with a total of N samples taken, is given as:

\bar{\Psi}_q = \frac{1}{N} \sum_{i=1}^{N} \Psi_q(i) \qquad (39.9)

where t_m = t_2 − t_1 is the measurement period and f_s is the sampling frequency of the SFADC.


N = f_s \cdot t_m \qquad (39.10)

In the time domain, the output function is:

\bar{\Psi}_q = \int_{t_1}^{t_2} \Psi_q \, dt \qquad (39.11)

The SFADC output is a random variable that depends on y and h; in turn, y depends on time t. The essential probability dP_Ψ (the differential of the PDF of the random quantity Ψ_q) can be represented by the essential probabilities of the quantities it depends on:

dP_\Psi = dP_{y/t} \cdot dP_t \cdot dP_h \qquad (39.12)

Let y = f(t) be an integrable function whose mean value is measured over the time interval from t_1 to t_2. We consider time t as a random variable with uniform distribution p(t):

p(t) = \frac{1}{t_2 - t_1} = \frac{1}{t_m} \qquad (39.13)

Then y is also a random variable depending on t, so the mean value of this function can be expressed using its PDF p(y) and the Dirac delta function δ:

\bar{y} = \int_{-\infty}^{+\infty} y \, p(y) \, dy = \int_{-\infty}^{+\infty} \int_{t_1}^{t_2} y \, p(y/t) \, p(t) \, dy \, dt = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} dt \int_{-\infty}^{+\infty} y \, \delta\bigl(y - f(t)\bigr) \, dy \qquad (39.14)

\bar{y} = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} \left[ \int_{-\infty}^{+\infty} y \, \delta\bigl(y - f(t)\bigr) \, dy \right] dt = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} f(t) \, dt \qquad (39.15)

At the same time, this also represents the average value of the input signal during the period t_m:

\bar{y} = \frac{1}{t_m} \int_0^{t_m} f(t) \, dt \qquad (39.16)


In a similar way, using (39.11), (39.12) and (39.14), we can write:

dP_\Psi = \delta\bigl(y - f(t)\bigr) \, dy \cdot \frac{1}{t_2 - t_1} \, dt \cdot \frac{1}{2g} \, dh \qquad (39.17)

\bar{\Psi} = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} dt \int_{-2g}^{+2g} \delta\bigl(y - f(t)\bigr) \, dy \int_{-g}^{+g} \Psi \cdot \frac{1}{2g} \, dh \qquad (39.18)

y = f(t) = \int_{-g}^{+g} \Psi \cdot \frac{1}{2g} \, dh \qquad (39.19)

The average value during the time interval t_m can be given as:

\bar{\Psi}_q = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} f(t) \, dt = \frac{1}{t_m} \int_0^{t_m} f(t) \, dt \qquad (39.20)

In the ideal case, we get:

\bar{\Psi}_q = \bar{y} \ \Rightarrow\ \frac{1}{N} \sum_{i=1}^{N} \Psi_q(i) = \frac{1}{t_m} \int_0^{t_m} f(t) \, dt \qquad (39.21)

Equation (39.20) is valid only in the theoretical case of an infinitely high sampling frequency f_s and measurement over a finite time interval t_m, which corresponds to measurement with an infinite number of samples N → ∞ (39.21). In practice, the number of samples N can be large but is still finite, which results in a measurement error called the SFADC quantization error e_q:

\Psi_q \approx y, \qquad \Psi_q = y + e_q \qquad (39.22)

Since the value of Ψ_q is a random variable due to the dither, while the input signal y is deterministic in nature [12], the quantization error e_q must also be a random variable. From this, we conclude that e_q is uncorrelated with the deterministic value of the signal y; therefore:


\bar{\Psi}_q \approx \bar{y}, \qquad \bar{\Psi}_q = \bar{y} + \bar{e}_q \qquad (39.23)

Ideally, when N is infinite, the mean quantization error of a multibit SFADC is zero:

\bar{e}_q = 0 \qquad (39.24)

When a multibit SFADC measures the mean value of the input signal over a time period t_m with a finite sampling rate f_s, the number of samples N is also finite, so:

\bar{y} = \bar{\Psi}_q \approx \frac{1}{N} \sum_{i=1}^{N} \Psi_q(i) \ \longrightarrow\ \bar{e}_q \ne 0 \qquad (39.25)

The mean value of the quantization error \bar{e}_q is not zero in the real case, which means that the standard measurement uncertainty u_q, estimated for a 95% confidence interval [13], is:

u_q = 2\,\sigma_{\bar{e}_q} \qquad (39.26)

where \sigma_{\bar{e}_q} is the standard deviation of the average error \bar{e}_q. The upper limit of the quantization error e_q and its PDF are constants (C_1 and C_2):

|e_q| \le \Delta = 2g = \frac{2R}{2^q - 1} = C_1, \qquad p(e_q) = \frac{1}{2\Delta} = \frac{1}{4g} = \frac{2^q - 1}{4R} = C_2 \qquad (39.27)

If the conditions of the Central Limit Theorem (CLT) and the Statistical Theory of Sampling (STS) are satisfied, then every third moment of the error e is limited, including the third central moment M_3 (skewness) of the quantization error e_q [14]:

M_3(e_q) = \overline{\bigl(e_q - \bar{e}_q\bigr)^3} = \int_{-2g}^{+2g} e_q^3 \, p(e_q) \, de_q \le 2 \int_0^{2g} \bigl|e_q\bigr|^3 p(e_q) \, de_q \le 2 \left( \frac{2R}{2^q - 1} \right)^3 \frac{2^q - 1}{4R} \cdot 2g = (2g)^3 \qquad (39.28)


The condition for the validity of the CLT is that every third moment must be limited, including M_3, as in (39.28). The variance \sigma_{\bar{e}_q}^2 of the average value \bar{e}_q of the error e_q is:

\sigma_{\bar{e}_q}^2 = \frac{\sigma_{e_q}^2}{N} \qquad (39.29)

When the CLT is valid, the measurement uncertainty assessment u_q is:

u_q^2 = 4\sigma_{\bar{e}_q}^2 = \frac{4\sigma_{e_q}^2}{N}, \qquad \sigma_{\bar{e}_q} = \frac{\sigma_{e_q}}{\sqrt{N}} \qquad (39.30)

where \sigma_{e_q} is the standard deviation of the error e_q during the measurement with N samples. In [15] it was shown that the square of the standard deviation (variance) of the quantization error, \sigma_{e_q}^2, when measuring the signal f(t) in the continuous time domain, can be determined as:

\sigma_{e_q}^2 = \frac{2g}{t_2 - t_1} \int_{t_1}^{t_2} |f(t)| \, dt - \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} f^2(t) \, dt = \frac{\Delta}{t_m} \int_0^{t_m} |f(t)| \, dt - \frac{1}{t_m} \int_0^{t_m} f^2(t) \, dt = \overline{e_q^2} \qquad (39.31)

At the same time, it is shown that:

\sigma_{e_q}^2 = \overline{e_q^2} \approx \frac{\Delta^2}{N} \sum_{i=1}^{N} \bigl|\Psi_q(i)\bigr| - \frac{\Delta^2}{N - 1} \sum_{i=1}^{N-1} \Psi_q(i) \cdot \Psi_q(i+1) \qquad (39.32)

The measurement uncertainty u_q of a multibit SFADC can then be determined, based on (39.31) and (39.32), as:

u_q^2 = \frac{\Delta}{N t_m} \int_0^{t_m} |f(t)| \, dt - \frac{1}{N t_m} \int_0^{t_m} f^2(t) \, dt \approx \frac{\Delta^2}{N^2} \sum_{i=1}^{N} \bigl|\Psi_q(i)\bigr| - \frac{\Delta^2}{N(N-1)} \sum_{i=1}^{N-1} \Psi_q(i) \cdot \Psi_q(i+1)

u_q \approx \frac{\Delta}{N} \sqrt{ \sum_{i=1}^{N} \bigl|\Psi_q(i)\bigr| - \frac{N}{N-1} \sum_{i=1}^{N-1} \Psi_q(i) \cdot \Psi_q(i+1) } \qquad (39.33)


If the exact value of u_q for a given mean value is not needed, the calculation can be performed in a much simpler way, using the upper limit U_q of the measurement uncertainty u_q:

|U_q| = \max |u_q| \qquad (39.34)

The upper limit of the standard deviation for any multibit SFADC, according to [7], is:

\sigma_{e_{max}}^2 \le \frac{\Delta^2}{4} \qquad (39.35)

The upper limit of the absolute measurement uncertainty U of a SMI (regardless of the number of bits of resolution) is given in [16] as:

|U| \approx \sqrt{2} \cdot \frac{\Delta}{\sqrt{2N}} = \frac{\Delta}{\sqrt{N}} \qquad (39.36)

Based on (39.4), (39.35) and (39.36), we can determine the upper limit of the measurement uncertainty U_q for a q-bit SFADC in several ways (39.37), depending on whether the quantities g, R or V_ref are predefined:

|U_q| \le \begin{cases} \dfrac{\Delta}{\sqrt{N}} \\[4pt] \dfrac{2g}{\sqrt{N}} \\[4pt] \dfrac{2R}{(2^q - 1)\sqrt{N}} \\[4pt] \dfrac{V_{ref}}{2^{q-1}\sqrt{N}} \end{cases} \qquad (39.37)

The value of the measurement uncertainty can be determined with any of the previous four equations; depending on which parameter is defined in advance, we choose the one that allows calculation based on that parameter. This is, in fact, the process for designing a SFADC where, based on the given constraints and the universal list of equations, we can set the desired performance of the SFADC in advance. This principle has been applied to other quantities that can be calculated in several different ways. This value represents the worst-case scenario: the real measurement uncertainty can never be higher, only equal or lower. This secure (but conservatively estimated) value allows quick and easy computation, reducing the requirements on the speed and complexity of the data processing hardware and software. At the same time, the value determined in this way is a reliable parameter that allows an unbiased comparison of different SMIs based on SFADC technology, regardless of the specific performance and the number of bits of resolution.

Fig. 39.4 Block diagram of a SMI measuring the average value of the product of two signals
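A quick Monte-Carlo experiment illustrates the scaling behind the bound (39.36)/(39.37): the error of the averaged dithered quantizer output falls roughly as Δ/√N. The sketch below (our own, with an ideal uniform-dither mid-tread quantizer rather than the hardware SFADC) compares the empirical averaging error for a constant input against the Δ/√N upper limit.

```python
import math
import random

def quantize_dithered(y, Delta, rng):
    """Ideal mid-tread quantizer of step Delta with one-quantum uniform dither."""
    h = rng.uniform(-Delta / 2, Delta / 2)
    return Delta * math.floor((y + h) / Delta + 0.5)

def average_error(y, Delta, N, seed=0):
    """|average of N dithered samples - y| for a constant input y."""
    rng = random.Random(seed)
    est = sum(quantize_dithered(y, Delta, rng) for _ in range(N)) / N
    return abs(est - y)

Delta, y = 0.2, 0.3737
for N in (10_000, 100_000):
    bound = Delta / math.sqrt(N)   # upper limit of (39.36)/(39.37)
    print(N, average_error(y, Delta, N, seed=2), bound)
```

With uniform one-quantum dither, the average is an unbiased estimate of y, so the observed error shrinks with √N just as the conservative bound does; individual runs scatter around a fraction of that bound.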

39.5 Multibit SMI

Figure 39.4 presents a SMI, based on two q-bit SFADCs, which measures the average value of the product (AVP) of two input signals, y_1 and y_2. This circuit can be referred to as an AVP SMI (AS). Two uncorrelated dither signals, h_1 and h_2, are used. The outputs Ψ_q1 and Ψ_q2 are multiplied in a multibit integer digital multiplier, and the multiplication result is added to the accumulator. This is a very important step in the development of a SMI. If the multiplier function were developed in the usual way, an analog multiplier in the form of an integrated chip, or an analog circuit with the same function, would have to be used. The main disadvantages of the analog multiplier are: a large influence of voltage offsets on the multiplication result, error due to tolerances of discrete components, variations due to temperature changes, limited maximum frequency of the input signals (on the order of tens of kHz), low bandwidth, and a delay and propagation time through the circuit that causes a phase shift of the signal. In the digital domain, the multiplier can be realized with a combination of AND and OR logic gates, which have a high upper operating frequency limit (on the order of GHz). The effects of voltage offset, temperature, component tolerances, and delay time are minimized by the digital nature of the circuit. From (39.9) we get that the AVP of the input signals at the output of the SMI accumulator [17] is given as:

\bar{\Psi}_{AS} = \frac{1}{N} \sum_{i=1}^{N} \Psi_{q_1}(i) \cdot \Psi_{q_2}(i) \approx \frac{1}{t_m} \int_0^{t_m} y_1(t) \cdot y_2(t) \, dt = \overline{y_1 \cdot y_2} \qquad (39.38)

If the signal y_1 is proportional to the power load voltage and y_2 is proportional to the current of the same power load, then this AVP SMI becomes a Stochastic Digital Electric Energy Meter (SDEEM) [18].
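The product-averaging principle of (39.38) can be sketched numerically: two ideal quantizers with independent dithers sample a voltage-like and a current-like sine wave, and the accumulated product of the two codes approaches the true average power. This is our own illustrative sketch with ideal uniform-dither quantizers, not the prototype's fixed-point pipeline; all names and the example signals are assumptions.

```python
import math
import random

def dithered_code(y, Delta, rng):
    """Signed integer code of an ideal quantizer with one-quantum uniform dither."""
    h = rng.uniform(-Delta / 2, Delta / 2)
    return math.floor((y + h) / Delta + 0.5)

def average_product(y1, y2, Delta, seed=0):
    """Accumulator estimate of mean(y1 * y2) per (39.38), with uncorrelated dithers."""
    r1, r2 = random.Random(seed), random.Random(seed + 1)
    acc = sum(dithered_code(a, Delta, r1) * dithered_code(b, Delta, r2)
              for a, b in zip(y1, y2))
    return Delta * Delta * acc / len(y1)

# Voltage- and current-like signals with a 30-degree phase shift (illustrative).
N, Delta = 200_000, 0.2
t = [k / N for k in range(N)]                    # exactly one period: t_m = c*T, c = 1
u = [1.0 * math.sin(2 * math.pi * x) for x in t]
i_ = [0.8 * math.sin(2 * math.pi * x - math.pi / 6) for x in t]
p_true = 0.5 * 1.0 * 0.8 * math.cos(math.pi / 6)  # (U*I/2) * cos(phi)
p_est = average_product(u, i_, Delta, seed=3)
```

Because the dithers are uncorrelated, the expectation of the code product for each sample equals y_1(t)·y_2(t), so the accumulated average converges to the true average power even though each individual 4-bit-like code is very coarse.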


The outputs of the AVP multiplier and accumulator are:

\Psi = y_1 \cdot y_2 + e_{AS}, \qquad \bar{\Psi} = \overline{y_1 \cdot y_2} + \bar{e}_{AS} \qquad (39.39)

We consider the case when both SFADCs are identical, with the same q, R and Δ. We can also consider the special case where the same signal is present at both inputs, y = y_1 = y_2, given that both inputs have the same input range ±R. The RMS value of the input signal y is:

y_{RMS} = \sqrt{ \frac{1}{t_m} \int_0^{t_m} y^2(t) \, dt } \qquad (39.40)

Using (39.9) and [16], the upper limit of the standard deviation of this AS is:

\sigma_{e_{AS}}^2 \le \frac{\Delta^2}{4} \cdot \frac{1}{t_m} \int_0^{t_m} \bigl( y_1^2(t) + y_2^2(t) \bigr) \, dt + \frac{\Delta^4}{16}

For y_1 = y_2 = y, neglecting the small Δ⁴/16 term:

\sigma_{e_{AS}}^2 \le \frac{\Delta^2}{4} \cdot \frac{2}{t_m} \int_0^{t_m} y^2(t) \, dt = \frac{\Delta^2}{2} \cdot (y_{RMS})^2 \qquad (39.41)

\sigma_{e_{AS}} \le \frac{\Delta}{\sqrt{2}} \cdot y_{RMS} = \sqrt{2}\, g \cdot y_{RMS} \qquad (39.42)

The squared RMS value of the input signal y is:

y_{RMS}^2 = \frac{1}{t_m} \int_0^{t_m} y^2(t) \, dt = \overline{y^2} \qquad (39.43)

The measurement time interval must be t_m = c · T, where c is an integer and T is the period of the input signal y = f(t). This is an important condition if we want the lowest possible measurement error. If it is not met, an error of at most ±(f_s · T) samples can occur, i.e. one signal period is not fully measured, since the integer number of periods in the accumulator is always taken into account.

39 Stochastic Smart Grid Meter for Industry 4.0 …

873

The consequence is that the data processing block, in addition to the information on the set measurement length, must also receive data on the frequency of the signal during the measurement, i.e. that frequency must be measured by the SMI. A more important consequence of this condition is that the frequency of the signal must be stable and unchanged during the measurement. If this is not met, the actual number of samples N in the accumulator will not correspond to the expected number for a given measurement period, leading to an error that cannot be quantified. However, small changes and instabilities in the signal frequency during the measurement time are averaged out by a large number of samples, i.e. a higher sampling frequency and/or a longer measurement interval. The upper limit of the measurement uncertainty of the AS (SDEEM, in this case) is defined as the expanded measurement uncertainty with a coverage factor of k = 2 and a 95% confidence interval [13], using (39.30), (39.36) and (39.43):

$$\left|U_{AS}\right| = k\cdot\sigma_{\bar{e}_{AS}} = 2\cdot\frac{\sigma_{e_{AS}}}{\sqrt{N}} \qquad (39.44)$$

Based on (39.42) and (39.44), the measurement uncertainty |U_AS| can be expressed in several ways, via Δ, g, R or V_ref:

$$\left|U_{AS}\right| \lesssim \begin{cases} \dfrac{\sqrt{2}\,\Delta}{\sqrt{N}}\cdot y_{RMS} \\[2mm] \dfrac{2\sqrt{2}\,g}{\sqrt{N}}\cdot y_{RMS} \\[2mm] \dfrac{2\sqrt{2}\,R}{(2^q-1)\sqrt{N}}\cdot y_{RMS} \\[2mm] \dfrac{2\sqrt{2}}{2^q\sqrt{N}}\cdot V_{ref}\cdot y_{RMS} \end{cases} \qquad (39.45)$$

The relative expanded measurement uncertainty of the mean value of the square of the measurement result is:

$$\left|U_{\overline{y^2}}\right| = \frac{|U_{AS}|}{y_{RMS}^2} = \frac{|U_{AS}|}{\overline{y^2}} \qquad (39.46)$$

From (39.45) and (39.46) we determine the dependence of the relative measurement uncertainty |U_{\overline{y^2}}| of the squared measured value, using one of the four available equations, according to the relevant parameters Δ, g, R or V_ref:


$$\left|U_{\overline{y^2}}\right| \lesssim \begin{cases} \dfrac{\sqrt{2}\,\Delta}{y_{RMS}\sqrt{N}} \\[2mm] \dfrac{2\sqrt{2}\,g}{y_{RMS}\sqrt{N}} \\[2mm] \dfrac{2\sqrt{2}\,R}{(2^q-1)\,y_{RMS}\sqrt{N}} \\[2mm] \dfrac{2\sqrt{2}}{2^q\,y_{RMS}\sqrt{N}}\cdot V_{ref} \end{cases} \qquad (39.47)$$

The relative measurement uncertainty of the RMS value of the measured voltage is half the relative measurement uncertainty of the squared measured signal [18]:

$$\left|U_{\sqrt{\overline{y^2}}}\right| = \frac{1}{2}\cdot\left|U_{\overline{y^2}}\right| \qquad (39.48)$$

$$\left|U_{\sqrt{\overline{y^2}}}\right|_{\%} = 100\cdot\left|U_{\sqrt{\overline{y^2}}}\right| = 50\cdot\left|U_{\overline{y^2}}\right| = 50\cdot\frac{|U_{AS}|}{y_{RMS}^2}\ \% \qquad (39.49)$$

This value corresponds to the measurement precision Γ for an AS (or SDEEM) composed of two identical SFADCs with equal resolutions [18]:

$$\left|\Gamma\right|_{\%} = \left|U_{\sqrt{\overline{y^2}}}\right|_{\%} \qquad (39.50)$$

$$\left|\Gamma\right|_{\%} \lesssim \begin{cases} \dfrac{100\,\Delta}{\sqrt{2N}\cdot y_{RMS}} \\[2mm] \dfrac{200\,g}{\sqrt{2N}\cdot y_{RMS}} \\[2mm] \dfrac{200\,R}{(2^q-1)\,y_{RMS}\sqrt{2N}} \\[2mm] \dfrac{200}{2^q\,y_{RMS}\sqrt{2N}}\cdot V_{ref} \end{cases} \qquad (39.51)$$

The AVP SMI is often used as a meter of active electric energy for residential consumers (an electricity meter), whose current and voltage signals are mostly close to a sinusoidal waveform (denoted sin), excluding distortion due to higher-order harmonics. In addition, when calibrating, adjusting and verifying the AS in laboratory conditions, sinusoidal waveforms are always used as the source of reference signals (calibration standards).


For these reasons, it is useful to determine the previous relations for a pure sine waveform of the input signal with amplitude A:

$$y_{RMS\,sin} = \frac{A}{\sqrt{2}}, \qquad \sigma_{e_{AS}\,sin} \lesssim \frac{\Delta}{\sqrt{2}}\cdot y_{RMS\,sin} = g\cdot A \qquad (39.52)$$

Based on (39.51), one of the key properties of the SDMM and the SMI, which distinguishes this method from the SSM measurement, can be shown:

$$\left|U_{AS}\right|_{sin} \lesssim \begin{cases} \dfrac{\Delta}{\sqrt{N}}\cdot A \\[2mm] \dfrac{2g}{\sqrt{N}}\cdot A \\[2mm] \dfrac{2R}{(2^q-1)\sqrt{N}}\cdot A \\[2mm] \dfrac{2}{2^q\sqrt{N}}\cdot V_{ref}\cdot A \end{cases} \qquad (39.53)$$

The measurement error in all classical methods is fixed and depends on various parameters. A reduction of the error can be achieved only by changing the circuit parameters: using higher-quality, more accurate/precise components, greater stability of the supply voltage, reducing the impact of noise, transients and electromagnetic fields, ensuring a stable ambient temperature, etc. With the SDMM, on the other hand, the measurement error is adjustable and can be regulated by the user with only one variable: the measurement time. For higher precision, we choose longer measurement periods, while lower precision is obtained with shorter measurement intervals. This is a unique feature of the SDMM that gives us control over the measurement error, which can be adjusted according to the current conditions and demands (better performance, optimal performance, or performance with a set cost/benefit ratio). From (39.53), in analogy to (39.48)–(39.51):

$$\left|U_{\overline{y^2}}\right|_{sin} = \frac{|U_{AS}|_{sin}}{y_{RMS\,sin}^2} = \frac{2\,|U_{AS}|_{sin}}{A^2} \lesssim \begin{cases} \dfrac{2\Delta}{A\sqrt{N}} \\[2mm] \dfrac{4g}{A\sqrt{N}} \\[2mm] \dfrac{4R}{(2^q-1)\,A\sqrt{N}} \\[2mm] \dfrac{4}{2^q A\sqrt{N}}\cdot V_{ref} \end{cases} \qquad (39.54)$$

$$\left|U_{\sqrt{\overline{y^2}}}\right|_{\%\,sin} = 100\cdot\left|U_{\sqrt{\overline{y^2}}}\right|_{sin} = 50\cdot\left|U_{\overline{y^2}}\right|_{sin} = 100\cdot\frac{|U_{AS}|_{sin}}{A^2} \qquad (39.55)$$

$$\left|\Gamma\right|_{\%\,sin} = \left|U_{\sqrt{\overline{y^2}}}\right|_{\%\,sin} \lesssim \begin{cases} \dfrac{100\,\Delta}{A\sqrt{N}} \\[2mm] \dfrac{200\,g}{A\sqrt{N}} \\[2mm] \dfrac{200\,R}{(2^q-1)\,A\sqrt{N}} \\[2mm] \dfrac{200}{2^q A\sqrt{N}}\cdot V_{ref} \end{cases} \qquad (39.56)$$

If we know exactly all the relevant parameters in the circuit (39.57), we can determine the minimum required measurement time t_m,min, i.e. the minimum number of samples N_min, to achieve the desired level of measurement precision (39.58):

$$\left|\Gamma\right|_{\%} \approx \frac{200\,R}{(2^q-1)\sqrt{2N}\cdot y_{RMS}} = \frac{200\,R}{(2^q-1)\sqrt{2\,t_m\cdot f_s}\cdot y_{RMS}} \qquad (39.57)$$

$$t_{m\,min} = \frac{2\cdot 10^4}{(2^q-1)^2\cdot f_s\cdot\left|\Gamma\right|_{\%}^2}\cdot\frac{R^2}{y_{RMS}^2} = \frac{2\cdot 10^4}{f_s\cdot\left|\Gamma\right|_{\%}^2}\cdot\frac{g^2}{y_{RMS}^2}, \qquad N_{min} = \frac{2\cdot 10^4}{(2^q-1)^2\cdot\left|\Gamma\right|_{\%}^2}\cdot\frac{R^2}{y_{RMS}^2} = \frac{2\cdot 10^4}{\left|\Gamma\right|_{\%}^2}\cdot\frac{g^2}{y_{RMS}^2} \qquad (39.58)$$

It can be seen that the measurement time depends on the waveform of the input signal and its amplitude, the sampling frequency, the number of SFADC bits of resolution, and the projected precision. From here, we can obtain a simpler expression for the minimum length of the measurement time period for a given precision, when measuring a sinusoidal voltage:

$$t_{m\,min\,sin} = \frac{4\cdot 10^4}{(2^q-1)^2\cdot f_s\cdot\left|\Gamma\right|_{\%\,sin}^2}\cdot\frac{R^2}{A^2} \approx \frac{10^4}{\left(2^{q-1}-1\right)^2\cdot f_s\cdot\left|\Gamma\right|_{\%\,sin}^2} = \frac{4\cdot 10^4}{f_s\cdot\left|\Gamma\right|_{\%\,sin}^2}\cdot\frac{g^2}{A^2}$$

$$N_{min\,sin} \approx \frac{10^4}{\left(2^{q-1}-1\right)^2\cdot\left|\Gamma\right|_{\%\,sin}^2} = \frac{4\cdot 10^4}{\left|\Gamma\right|_{\%\,sin}^2}\cdot\frac{g^2}{A^2} \qquad (39.59)$$

Based on the previous consideration of the mathematical model, we can observe the specific error values for the sinusoidal signal, which is taken as the standard for determining and comparing the characteristics of different SMIs. With a small number of bits of resolution q, it is possible to keep the systematic error under a certain limit and, at the same time, achieve very high measurement precision. A direct overview of the measurement precision is given for the practical cases of resolutions of 2 up to 6 bits, when measuring a sinusoidal signal with amplitude A:

(a) q = 2, 2g = A, Δ = A:

$$\sigma_{\bar e_{AS}}^2(2) = \frac{\sigma_{e_{AS}}^2}{N} \approx \frac{\Delta^2}{4}\cdot\frac{1}{N}\cdot y_{RMS}^2 = \frac{A^2}{4}\cdot\frac{1}{N}\cdot\frac{A^2}{2} = \frac{A^4}{8N} \qquad (39.60)$$

$$|G|_{max} = \sqrt{2}\,|G| = 2\sqrt{2}\,\sigma_{\bar e_{AS}} \qquad (39.61)$$

$$\left|\Gamma_{\overline{y^2}}(2)\right| = \frac{|G|_{max}}{\left(y_{RMS}\right)^2} = 2\sqrt{2}\,\sigma_{\bar e_{AS}}\cdot\frac{2}{A^2} = \frac{4\sqrt{2}}{2\sqrt{2N}} = \frac{2}{\sqrt{N}} \qquad (39.62)$$

$$\left|\Gamma_{\sqrt{\overline{y^2}}}(2)\right| = \frac{1}{2}\cdot\left|\Gamma_{\overline{y^2}}(2)\right| \approx \frac{1}{\sqrt{N}} \qquad (39.63)$$

(b) q = 3, 6g = A, Δ = A/3:

$$\sigma_{\bar e_{AS}}^2(3) \approx \frac{\Delta^2}{4}\cdot\frac{1}{N}\cdot y_{RMS}^2 = \frac{A^2}{9}\cdot\frac{1}{4}\cdot\frac{1}{N}\cdot\frac{A^2}{2} = \frac{A^4}{72N} \qquad (39.64)$$

$$\left|\Gamma_{\overline{y^2}}(3)\right| \approx 2\sqrt{2}\cdot\frac{A^2}{6\sqrt{2N}}\cdot\frac{2}{A^2} = \frac{2}{3\sqrt{N}} \qquad (39.65)$$

$$\left|\Gamma_{\sqrt{\overline{y^2}}}(3)\right| = \frac{1}{2}\cdot\left|\Gamma_{\overline{y^2}}(3)\right| \approx \frac{1}{3\sqrt{N}} \qquad (39.66)$$

(c) q = 4, 14g = A, Δ = A/7:

$$\sigma_{\bar e_{AS}}^2(4) \approx \frac{\Delta^2}{4}\cdot\frac{1}{N}\cdot y_{RMS}^2 = \frac{A^2}{49}\cdot\frac{1}{4}\cdot\frac{1}{N}\cdot\frac{A^2}{2} = \frac{A^4}{392N} \qquad (39.67)$$

$$\left|\Gamma_{\overline{y^2}}(4)\right| \approx 2\sqrt{2}\cdot\frac{A^2}{14\sqrt{2N}}\cdot\frac{2}{A^2} = \frac{2}{7\sqrt{N}} \qquad (39.68)$$

$$\left|\Gamma_{\sqrt{\overline{y^2}}}(4)\right| \approx \frac{1}{7\sqrt{N}} \qquad (39.69)$$

(d) q = 5, 30g = A, Δ = A/15:

$$\sigma_{\bar e_{AS}}^2(5) \approx \frac{\Delta^2}{4}\cdot\frac{1}{N}\cdot y_{RMS}^2 = \frac{A^2}{225}\cdot\frac{1}{4}\cdot\frac{1}{N}\cdot\frac{A^2}{2} = \frac{A^4}{1800N} \qquad (39.70)$$

$$\left|\Gamma_{\overline{y^2}}(5)\right| \approx 2\sqrt{2}\cdot\frac{A^2}{30\sqrt{2N}}\cdot\frac{2}{A^2} = \frac{2}{15\sqrt{N}} \qquad (39.71)$$

$$\left|\Gamma_{\sqrt{\overline{y^2}}}(5)\right| \approx \frac{1}{15\sqrt{N}} \qquad (39.72)$$

(e) q = 6, 62g = A, Δ = A/31:

$$\sigma_{\bar e_{AS}}^2(6) \approx \frac{\Delta^2}{4}\cdot\frac{1}{N}\cdot y_{RMS}^2 = \frac{A^2}{961}\cdot\frac{1}{4}\cdot\frac{1}{N}\cdot\frac{A^2}{2} = \frac{A^4}{7688N} \qquad (39.73)$$

$$\left|\Gamma_{\overline{y^2}}(6)\right| \approx 2\sqrt{2}\cdot\frac{A^2}{62\sqrt{2N}}\cdot\frac{2}{A^2} = \frac{2}{31\sqrt{N}} \qquad (39.74)$$

$$\left|\Gamma_{\sqrt{\overline{y^2}}}(6)\right| \approx \frac{1}{31\sqrt{N}} \qquad (39.75)$$

In the more general case, for q = m bits of resolution and a sinusoidal voltage at the input:

(f) q = m, (2^q − 2)·g = A, Δ = 2g = A/(2^q − 2) = A/(2^{q−1} − 1):

$$\sigma_{\bar e_{AS}}^2(q) \approx \frac{\Delta^2}{4}\cdot\frac{1}{N}\cdot y_{RMS}^2 = \frac{A^4}{8\left(2^{q-1}-1\right)^2 N} \qquad (39.76)$$

$$\left|\Gamma_{\overline{y^2}}(q)\right| \approx 2\sqrt{2}\cdot\frac{A^2}{2\left(2^{q-1}-1\right)\sqrt{2N}}\cdot\frac{2}{A^2} = \frac{2}{\left(2^{q-1}-1\right)\sqrt{N}} \qquad (39.77)$$

$$\left|\Gamma_{\sqrt{\overline{y^2}}}(q)\right| \approx \frac{1}{\left(2^{q-1}-1\right)\sqrt{N}} = \frac{1}{\left(2^{q-1}-1\right)\sqrt{t_m\cdot f_s}} \qquad (39.78)$$

$$\left|\Gamma(q)\right|_{\%} \approx \frac{100}{\left(2^{q-1}-1\right)\sqrt{t_m\cdot f_s}}\ \%, \qquad \left|\Gamma(q)\right|_{ppm} \approx \frac{10^6}{\left(2^{q-1}-1\right)\sqrt{t_m\cdot f_s}}\ \mathrm{ppm} \qquad (39.79)$$


39.6 Stochastic Digital Electrical Energy Meter

The main limitations for any SDMM-based SMI are the following assumptions [18, 19]:
• Input signals y1 and y2 have the same frequency f_y.
• The input signal period is much shorter than the measurement period: T = 1/f_y ≪ t_m.
• Both input signals are stationary during the period 2T.

All of these assumptions are met in a SMI application for power grid electricity metering, i.e. a SDEEM:
• The input voltages are proportional to the voltage and current of the same power load, hence both have the same frequency.
• Power grid measurements are performed over long periods: hours, days, weeks.
• Short-term fluctuations in the energy consumption of large power consumers are negligible over short periods.

In order to realize a SDEEM for Smart Grid applications, we must monitor the voltage u_L, current i_L, active power P_L and energy E_L on a power load R_L. The digital nature of the SMI enables full control and readout of its main functions via remote communication (IoT/IIoT). To understand how the SDEEM works as a SG meter, we must start from the basic premise of a SMI that measures the mean value of the product of two signals over a finite interval, with precision given in (39.79):

$$\overline{\Psi} = \lim_{\Delta t\to 0}\frac{1}{t_m}\int_{0}^{t_m} f_1(t)\cdot f_2(t)\,dt \;\Rightarrow\; \overline{\Psi} \approx \frac{1}{t_m}\int_{0}^{t_m} f_1(t)\cdot f_2(t)\,dt \qquad (39.80)$$

Let us now consider a high-power load, denoted by R_L, with voltage u_L and current i_L. If the voltage signal f_1 is proportional to the measured voltage u_L, and the voltage signal f_2 is proportional to i_L, then the mean value of the function given by (39.80) is proportional to the mean value of the active power P_L. If we multiply this result by the measurement period t_m, then the output value of this SDEEM corresponds to the active energy E_L measured over that period on the power load. If both SDEEM inputs are shorted together, i.e. f_1(t) = f_2(t), and a voltage proportional to the load voltage or current is brought to the inputs, then by extracting a square root from the output we obtain the RMS value of that voltage or current, respectively [8]. A block diagram of this SDEEM is given in Fig. 39.5. The conversion of the high voltage level u_L (on the order of hundreds or thousands of volts) into the low AC voltage level f_1 (on the order of volts), as well as of the high current i_L (on the order of tens or hundreds of amperes) into the low voltage signal f_2 (on the order of volts), is performed by commercially available industrial-grade high-precision transducers. The voltage and current conversion coefficients to the corresponding low voltages are K_1 and K_2, respectively [18, 19]:

$$f_1(t) = K_1\cdot u_L(t), \qquad f_2(t) = K_2\cdot i_L(t) \qquad (39.81)$$


Fig. 39.5 Block diagram of a 4-bit SMI as SDEEM for SG measuring active power and energy as the mean value of the product of two signals over a period of time, proportional to the input voltage and current of the electric power load, based on 4-bit SFADC

Now we can give the relations defining the quantities needed in monitoring the power quality of a SG, in relation to the SDEEM output. Measuring the RMS voltage u_Lrms on the power load:

$$\Psi_{u_{Lrms}} = \frac{K_1\cdot K_1}{t_m}\int_{0}^{t_m} u_L(t)\cdot u_L(t)\,dt = \frac{K_1^2}{t_m}\int_{0}^{t_m} u_L^2(t)\,dt = K_1^2\cdot u_{Lrms}^2 \qquad (39.82)$$

$$u_{Lrms} = \frac{1}{K_1}\cdot\sqrt{\Psi_{u_L}} \qquad (39.83)$$

Measuring the RMS current i_Lrms on the power load:

$$\Psi_{i_{Lrms}} = \frac{K_2\cdot K_2}{t_m}\int_{0}^{t_m} i_L(t)\cdot i_L(t)\,dt = \frac{K_2^2}{t_m}\int_{0}^{t_m} i_L^2(t)\,dt = K_2^2\cdot i_{Lrms}^2 \qquad (39.84)$$

$$i_{Lrms} = \frac{1}{K_2}\cdot\sqrt{\Psi_{i_L}} \qquad (39.85)$$

Measuring the active power P_L on the power load:

$$\Psi_{P_L} = \frac{K_1\cdot K_2}{t_m}\int_{0}^{t_m} u_L(t)\cdot i_L(t)\,dt = \frac{K_1\cdot K_2}{t_m}\int_{0}^{t_m} P_L(t)\,dt = K_1\cdot K_2\cdot \overline{P}_L \qquad (39.86)$$

$$\overline{P}_L = \frac{1}{K_1\cdot K_2}\cdot\Psi_{P_L} \qquad (39.87)$$

Measuring the active energy E_L on the power load over the period t_m:

$$\Psi_{E_L} = \frac{K_1\cdot K_2}{t_m}\int_{0}^{t_m} u_L(t)\cdot i_L(t)\,dt = \frac{K_1\cdot K_2}{t_m}\int_{0}^{t_m} P_L(t)\,dt = \frac{K_1\cdot K_2}{t_m}\cdot E_L \qquad (39.88)$$

$$E_L = \frac{t_m}{K_1\cdot K_2}\cdot\Psi_{E_L} \qquad (39.89)$$

The variance of the SDEEM output is, in the general case, given as:

$$\sigma_e^2 = \frac{(2g)^2}{t_m}\int_{0}^{t_m}\left|f_1(t)\cdot f_2(t)\right|dt - \frac{1}{t_m}\int_{0}^{t_m} f_1^2(t)\cdot f_2^2(t)\,dt \qquad (39.90)$$

$$\sigma_e^2 \approx \frac{(2g)^4}{N}\cdot\sum_{i=1}^{N}\left|\Psi_1(i)\cdot\Psi_2(i)\right| - \frac{(2g)^4}{N-1}\cdot\sum_{i=1}^{N-1}\Psi_1(i)\cdot\Psi_2(i)\cdot\Psi_1(i+1)\cdot\Psi_2(i+1) \qquad (39.91)$$

From here we can identify the quantities that influence the accuracy of a SDEEM: the waveforms of the input signals, the sampling frequency f_s, the SFADC resolution q and quantum Δ = 2g, and the time period t_m. The measurement precision given in (39.79) is still valid.

39.7 Hardware Prototype of 4-Bit SDEEM

A working hardware prototype of a multibit SMI was produced, Fig. 39.6. The 4-bit version was chosen because of practical limitations on the complexity and cost of the project, but also considering the limitations of the best available calibration standards used to verify the measurement results obtained with this device.


Fig. 39.6 Hardware prototype of a 4-bit SMI as SDEEM for a SG

Fig. 39.7 Block schematic of the 4-bit SMI/SDEEM main functions. A 3-way switch determines the quantity that the SDEEM measures in a SG: position A – RMS voltage, position B – RMS current, position C – active power and energy

Function-wise, it was decided that it should be an electric energy meter with options to measure power, voltage and current. This prototype is a practical proof of concept of the SDMM and multibit SMI theory. In addition, it should prove that a SDEEM is suitable for Smart Grid applications in an Industry 4.0 environment. Basic data of the SDEEM prototype: q = 4, f_s = 100 kHz, y_max = 3 V_rms, ±R = ±A = ±4.6875 V, Δ = 0.625 V, g = 0.3125 V, V_ref = 5 V. Figure 39.7 shows the SDEEM with all the selectable basic modes of operation for measurements in a SG.
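The quoted prototype values are mutually consistent with the parameter relations used earlier in the chapter (Δ = 2R/(2^q − 1), g = Δ/2, and, comparing the first and fourth forms of (39.45), V_ref = 2^{q−1}·Δ). A three-line check:

```python
# Consistency of the quoted 4-bit SDEEM prototype data with the
# chapter's parameter relations.
q, R, delta, g, Vref = 4, 4.6875, 0.625, 0.3125, 5.0

print(2 * R / (2**q - 1))    # equals delta (0.625 V)
print(delta / 2)             # equals g (0.3125 V)
print(2**(q - 1) * delta)    # equals Vref (5 V)
```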


Table 39.1 Calculated and measured precision of the measurement results over various periods

tm (s)   Precision calculated (ppm)   Precision measured (ppm)
10       142.86                       156
30        82.48                        76
40        71.43                        69
60        58.32                        57
120       41.24                        48
180       33.67                        29
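The "Precision calculated" column of Table 39.1 follows directly from (39.79) with the prototype's q = 4 and f_s = 100 kHz; a short script reproduces it:

```python
import math

# Reproduce the calculated column of Table 39.1 from eq. (39.79):
# |Gamma|_ppm ~ 1e6 / ((2**(q-1) - 1) * sqrt(tm * fs)).
q, fs = 4, 100_000

def precision_ppm(tm):
    return 1e6 / ((2**(q - 1) - 1) * math.sqrt(tm * fs))

for tm in (10, 30, 40, 60, 120, 180):
    print(tm, round(precision_ppm(tm), 2))
```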

39.8 Measurement Results

Measurements were performed in the accredited Laboratory for Metrology at the Faculty of Technical Sciences. The best available calibration standard in the Laboratory was the digital multimeter HP3458A, so the SDEEM was tested in the RMS voltage measurement mode, as the standard provided its best accuracy in the AC voltage metering mode. As this accuracy of the multimeter is at the 100 ppm level, it is clear why the 4-bit version of the SMI was chosen for the SDEEM: the limitations of the available calibration equipment, albeit of the highest level, meant that there were no means to test a SMI with a resolution of 5 bits or more. The measurement results are very close to the values obtained by theoretical calculations, as shown in Table 39.1. Some obtained values are even lower than the theoretical ones because:
1. The theory gives the worst-case scenario, the value above which results should not deviate, but lower values are quite possible;
2. The measurement precision obtained is at or above the declared limits of the test equipment.
Figures 39.8, 39.9 and 39.10 show the precision of the measurements over the full input range, for various measurement periods. The two curves in each figure show two repeated measurements, demonstrating the high repeatability of the SDEEM results under the same conditions. Due to the limitations of the PC memory used for storing measurement results, the maximum t_m during the tests was 180 s. As shown in [18], the measurement uncertainty in this case was assessed at 80 ppm. As the best commercially available smart grid meters declare only a measurement precision of 0.2%, without a value for assessed uncertainty, it was concluded that the 4-bit SDEEM prototype exhibited at least 25 times better performance than the commercial meters.


Fig. 39.8 Measurement precision over the full input range for 10 s period (upper) and for 30 s period (lower)


Fig. 39.9 Measurement precision over the full input range for 40 s period (upper) and 60 s period (lower)


Fig. 39.10 Measurement precision over the full input range for 120 s period (upper) and 180 s period (lower)


39.9 Conclusion

Using the fully developed and detailed multibit SMI mathematical model, a working hardware prototype of a 4-bit SDEEM was built and rigorously tested, confirming that the theoretical model is valid. The measurement results proved that the SDEEM is an ideal solution for a Smart Meter in a Smart Grid in Industry 4.0 applications, due to its high precision and accuracy, high reliability, digital controls and ease of interfacing with IoT and IIoT, simple hardware and low cost. The prototype exhibited at least 25 times better measurement uncertainty (80 ppm) than commercially available smart meters, proving the excellent cost/benefit ratio of SMIs.

Acknowledgements This work has been supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia.

References
1. European Commission: Smart grids and meters—Smart Grids Task Force. https://ec.europa.eu/energy/en/topics/market-and-consumers/smart-grids-and-meters (2018)
2. Directorate General for Internal Policies: Industry 4.0, Study. Eur. Parliament (2016)
3. Urekar, M.: Projektovanje industrijskih mernih sistema za Industriju 4.0 i pametne distributivne mreže, Merno-informacione tehnologije MIT2019. Faculty of Technical Sciences, University of Novi Sad, Novi Sad (2019) (in Serbian)
4. D'Antona, G., Ferrero, A.: Digital Signal Processing for Measurement Systems. Springer (2006)
5. Brown, B.D., Card, H.C.: Stochastic neural computation I: computational elements. IEEE Trans. Comput. 50, 891–905 (2001)
6. Manolakis, D.G., Ingle, V.K.: Applied Digital Signal Processing—Theory and Practice. Cambridge University Press (2011)
7. Vujicic, V.: Generalized low frequency stochastic true RMS instrument. IEEE Trans. Instrum. Meas. 50(5), 1089–1092 (2001)
8. Urekar, M., Pejic, D., Vujicic, V., Avramov-Zamurovic, S.: Accuracy improvement of the stochastic digital electrical energy meter. Measurement (Elsevier) 98, 139–150 (2017). https://doi.org/10.1016/j.measurement.2016.11.038
9. Urekar, M.: Contribution to the optimization of digital measurements performance. Ph.D. dissertation (in Serbian), Dept. of Power, Electronic and Telecommunication Engineering, Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia (2018)
10. Kay, S.M.: Fundamentals of Statistical Signal Processing: Estimation Theory, vol. I, 1st edn. Prentice Hall, New Jersey (1993)
11. Zölzer, U.: Digital Audio Signal Processing, 2nd edn. Wiley (2008)
12. Lindman, H.R.: Analysis of Variance in Experimental Design. Springer (1992)
13. JCGM 100:2008 Evaluation of measurement data—Guide to the expression of uncertainty in measurement. JCGM, BIPM (2010)
14. Papoulis, A., Pillai, S.U.: Probability, Random Variables and Stochastic Processes, 4th edn. McGraw-Hill (2002)
15. Zupunski, I., Vujicic, V., Mitrovic, Z., Milovancev, S., Pesaljevic, M.: Online determination of the measurement uncertainty of the stochastic measurement method. Proc. IMEKO XIX World Congress 278, 1048–1051 (2009)
16. Pjevalica, V., Vujičić, V.: Further generalization of the low frequency true RMS instrument. IEEE Trans. Instrum. Meas. 59(3), 736–744 (2010)
17. Pjevalica, N., Pjevalica, V., Petrovic, N.: Advances in concurrent computing for digital stochastic measurement simulation. J. Circuits Syst. Comput. (2019). https://doi.org/10.1142/S0218126620500334
18. Urekar, M., Gazivoda, N., Pejić, D.: The core for high-precision stochastic smart grid meter based on low-resolution flash ADC. IEEE Trans. Instrum. Meas. 68(6), 1705–1713 (2019). https://doi.org/10.1109/TIM.2018.2886868
19. Pejić, D., Urekar, M., Vujičić, V., Avramov-Zamurović, S.: Comparator offset error suppression in stochastic converters used in a watt-hour meter. CPEM 2010, Daejeon, Korea (2010)

Chapter 40

Mathematical Basis of the Stochastic Digital Measurement Method Vladimir Vujiˇci´c, Jelena Djordjevi´c Kozarov, Platon Sovilj, and Bojan Vujiˇci´c

Abstract The stochastic digital measurement method (SDMM) is a significantly different paradigm of digital measurement compared to the standard sampling method (SSM). Namely, unlike the SSM, this method does not assume that the quantization error is negligibly small. The quantization error can even be 100% of the full-scale range, yet the final (measurement) result will be precise as well as accurate. The paper presents the mathematical basis of the two-bit SDMM and gives general formulae for the precision of measuring the mean value of a signal and the mean value of the product of two signals over a time interval. Both the time domain and the Fourier domain are considered. The formulae connect the measurement parameters (the quantum of the two-bit flash AD converter; the sampling frequency and the length of the measurement time interval; the waveforms of the signal) with the main performance of the measuring device or measuring system: the precision. These practical and usable formulae are derived in the paper. They have enabled the design and implementation of a number of devices listed in the paper and references. Accuracy is also considered, and ways to eliminate the systematic error of instruments in which the two-bit SDMM is applied are presented. The result of the latest development, a large parallelization of the measurements and signal processing in these devices and systems, is also shown. That has led to a completely new situation in engineering practice: the previously used

V. Vujiˇci´c · P. Sovilj · B. Vujiˇci´c Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia e-mail: [email protected] P. Sovilj e-mail: [email protected] B. Vujiˇci´c e-mail: [email protected] J. Djordjevi´c Kozarov (B) Faculty of Electronic Engineering, University of Niš, Niš, Serbia e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7_40


Weierstrass approximation theorem and Bernstein polynomial approximation have now, in practice, been replaced by Fourier series theory.

Keywords Digital measurement · Stochastic digital measurement method · Precision · Accuracy · Fourier series

MSC 2020 93C62 and 60G35

40.1 Introduction

A general overview of the SDMM is given in [1]. Its most important mathematical characteristic is summarized in the following quotation from [1]: "The mathematics that governs the operation all types of the above presented instruments is relatively simple but very interesting. Rather than discussing the details of mathematical modeling, here we want to point out an element of philosophy. The average value on an interval is actually the integral of the measured function. As the digital integration is a process of adding, the commutative principle applies. Therefore, the elements of the sum can be added in a deterministic fashion (as they are sampled in time), in an arbitrary chosen order, or in an absolutely random sequence—it is completely unimportant. Consequently, from the average value point of view, the time within the interval can be treated as a stochastic variable with a uniform distribution. In this way, the problem of measurement over the interval can be classified in the Probability theory and the area of Statistic theory of samples".

The SDMM is accepted among engineers and is constantly evolving, as can be seen from [2]. However, so far no mathematician has written any comments about the SDMM, so it is unknown whether or not this method is accepted by mathematicians. The authors believe that this conference is a competent forum for the discussion of this issue. This paper gives an overview of the mathematical (theoretical) basis of the SDMM, with an emphasis on the two-bit SDMM and measurements in the transformation (Fourier) domain.

"There is nothing more practical than a good theory" (Kurt Lewin). The authors (who are all engineers) continuously wonder: is the theory of the SDMM good enough? In the last two decades, they have been working continuously on its theoretical and practical expansion and refinement.
This can be seen in the latest papers describing the implementation of the SDMM: [3] (February 2020) and [4] (April 2020). The paper [3] has shown how to simply and accurately measure wind power and energy using a two-bit SDMM. The paper discusses the problem and generalizes it theoretically, deriving the formula for the precision of measuring the mean value of the product of a finite number of input signals. This result is directly applicable to the measurements of nonlinear quantities and to the application of nonlinear sensors in measurements. In paper [4], a multi-bit SDMM was applied to measure the ERP (Event Related Brain Potential). A significant


advantage of the SDMM over the SSM has been shown, in particular in the measurement of the key parameter, latency. In order to fully understand the idea of the SDMM, the following lemma and its proof are crucial:

Lemma 1 The mean value of the function x = ϕ(t) over the interval t ∈ [t₁, t₂],

$$\bar{x} = \frac{1}{t_2-t_1}\int_{t_1}^{t_2}\varphi(t)\,dt,$$

is the same regardless of whether t is treated as a deterministic variable or as a random variable with uniform distribution

$$p(t) = \frac{1}{t_2-t_1}.$$

Proof (a) If t is a deterministic variable, the proof is trivial: it comes down to the definition of the mean value of an integrable function over an interval.

(b) If t is a random variable with uniform distribution p(t) = 1/(t₂ − t₁), then:

$$\bar{x} = \int_{-\infty}^{+\infty} x\,dP_x,$$

where

$$dP_x = dP_{x/t}\cdot dP_t, \qquad dP_{x/t} = p_{x/t}\cdot dx, \qquad dP_t = \frac{1}{t_2-t_1}\,dt.$$

From Fig. 40.1 it can be seen that p_{x/t} = δ(x − ϕ(t)), so

$$\bar{x} = \frac{1}{t_2-t_1}\int_{t_1}^{t_2}dt\int_{-\infty}^{+\infty}x\,\delta(x-\varphi(t))\,dx.$$

According to the properties of the Dirac delta function,

$$\int_{-\infty}^{+\infty}x\,\delta(x-\varphi(t))\,dx = \varphi(t),$$


Fig. 40.1 Graph of probability Px/t and probability density px/t

we get

$$\bar{x} = \frac{1}{t_2-t_1}\int_{t_1}^{t_2}\varphi(t)\,dt,$$

thus proving the lemma. □

40.2 Measurement of the Mean Value of the Signal

Figure 40.2 shows a schema for the measurement of the mean value of the signal y = f(t) over the interval t ∈ [t₁, t₂] using the two-bit SDMM, where:

y – is an integrable function of time with range |y| ≤ 2g;
h – is a dither with uniform distribution p(h) = 1/(2g), |h| ≤ g and h̄ = 0;
b₁, b₋₁ – are the output bits of the two-bit dithered flash AD converter, b₁, b₋₁ ∈ {0, 1}, b₁·b₋₁ = 0;
Ψ – is the two-bit output of the ADC, Ψ = 2g·(b₁ − b₋₁).


Fig. 40.2 Schema for measurement of the mean value of the signal y = f (t)

From Fig. 40.2 one can observe that:

for −g ≤ y + h ≤ g, the value Ψ = 0, that is b₁ = 0, b₋₁ = 0;
for y + h > g, the value Ψ = 2g, that is b₁ = 1, b₋₁ = 0;
for y + h < −g, the value Ψ = −2g, that is b₁ = 0, b₋₁ = 1.

The mean value of the flash ADC output Ψ is

$$\overline{\Psi} = \frac{1}{N}\sum_{i=1}^{N}\Psi(i) = \frac{\langle COUNTER_1\rangle}{\langle COUNTER_2\rangle}. \qquad (40.1)$$

On the other hand,

$$\overline{\Psi} = \int_{-3g}^{3g}\Psi\,dP_\Psi,$$

where dP_Ψ is the elementary probability of Ψ occurring. The function y = f(t) is deterministic and h is a random dither, therefore Ψ is a random two-bit quantity:

$$dP_\Psi = dP_{y/t}\cdot dP_t\cdot dP_h, \qquad dP_{y/t} = \delta(y-f(t))\,dy, \qquad dP_t = \frac{1}{t_2-t_1}\,dt, \qquad dP_h = \frac{1}{2g}\,dh,$$

so the symbolic integral above becomes

$$\overline{\Psi} = \frac{1}{t_2-t_1}\int_{t_1}^{t_2}dt\int_{-2g}^{2g}\delta(y-f(t))\,dy\int_{-g}^{+g}\Psi\,\frac{dh}{2g}.$$

It can easily be proven that for t = const ⇒ y = const:

$$\int_{-g}^{+g}\Psi\,\frac{dh}{2g} = y,$$

giving

$$\overline{\Psi} = \frac{1}{t_2-t_1}\int_{t_1}^{t_2}dt\int_{-2g}^{2g}y\,\delta(y-f(t))\,dy,$$

and, based on the properties of the Dirac delta function:

$$\overline{\Psi} = \frac{1}{t_2-t_1}\int_{t_1}^{t_2}f(t)\,dt. \qquad (40.2)$$

So, the device from Fig. 40.2 measures the mean value of the signal y = f(t). In the previous derivation it is implicitly assumed that the sampling (clock) frequency is infinitely high, so that in that case:

$$\overline{\Psi} = \lim_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N}\Psi(i) = \frac{1}{t_2-t_1}\int_{t_1}^{t_2}f(t)\,dt.$$

Realistically, the sampling frequency is finite and N is a finite number, so relation (40.1) holds only approximately. Let us observe what happens when the sampling frequency is finite. The quantity Ψ is random, so the error e is a random quantity as well:

$$\overline{\Psi} = \bar{y} + e \qquad (40.3)$$

Based on relations (40.1) and (40.2), it is valid that


$$\overline{\Psi} = \bar{y}, \quad \text{so} \quad \bar{e} = 0.$$

If the third moment of the quantity e is finite (as well as its third central moment), then the Central limit theorem and the Theory of samples [5] apply to the quantity e, so that:

$$\sigma_{\bar e}^2 = \frac{\sigma_e^2}{N}, \qquad (40.4)$$

where σ²_e is the variance of the quantity e and N is the number of samples.

$$M_3(e) = \overline{(e-\bar e)^3} = \overline{e^3} = \int_{-2g}^{2g}e^3\,p(e)\,de \le \int_{-2g}^{2g}|e|^3\,p(e)\,de \le \int_{-2g}^{2g}(2g)^3\,p(e)\,de = (2g)^3.$$

Since M₃(e) ≤ (2g)³, relation (40.4) holds, where N = ⟨COUNTER₂⟩. Now σ²_e remains to be determined. From relation (40.3) it follows that

$$\sigma_\Psi^2 = \sigma_y^2 + \sigma_e^2,$$

because y and e are statistically independent quantities. By definition,

$$\sigma_y^2 = \overline{y^2} - \bar{y}^2 = \frac{1}{t_2-t_1}\int_{t_1}^{t_2}f^2(t)\,dt - \left[\frac{1}{t_2-t_1}\int_{t_1}^{t_2}f(t)\,dt\right]^2.$$

On the other hand, the following holds:

$$\sigma_\Psi^2 = \overline{\Psi^2} - \overline{\Psi}^2.$$

For the quantity Ψ it is easy to prove that Ψ^l = (sgn Ψ)^l·|Ψ|·(2g)^{l−1}, where l is a finite natural number. So, for l = 2, which is our case, we get

$$\Psi^2 = 2g\cdot|\Psi|,$$


Fig. 40.3 Location of the key error U0 of two-bit dithered flash A/D converter

so then

$$\overline{\Psi^2} = 2g\cdot\overline{|\Psi|}.$$

Reasoning in a similar way as when deriving relation (40.2) gives us:

$$\overline{\Psi^2} = 2g\cdot\frac{1}{t_2-t_1}\int_{t_1}^{t_2}|f(t)|\,dt,$$

then

$$\sigma_e^2 = \sigma_\Psi^2 - \sigma_y^2 = \frac{2g}{t_2-t_1}\int_{t_1}^{t_2}|f(t)|\,dt - \frac{1}{t_2-t_1}\int_{t_1}^{t_2}f^2(t)\,dt,$$

and finally

$$\sigma_{\bar e}^2 = \frac{1}{N}\left\{\frac{2g}{t_2-t_1}\int_{t_1}^{t_2}|f(t)|\,dt - \frac{1}{t_2-t_1}\int_{t_1}^{t_2}f^2(t)\,dt\right\}. \qquad (40.5)$$

In position (1) of the switch P, the voltage at the input of the first OP-AMP is h + U₀, where U₀ is the offset voltage of the analog adder (the first OP-AMP), as can be seen from Fig. 40.3. Based on relation (40.2),

$$\overline{\Psi}(1) = U_0 = \overline{U}$$

with precision

$$\sigma_{\bar e} = \sqrt{\frac{1}{N}\left(2g|U_0| - |U_0|^2\right)} \approx \sqrt{\frac{1}{N}\,2g|U_0|}.$$


Let us note that |U0 | 2g. Voltage U0 is very small, slow changing and practically a constant DC voltage. For sampling frequency of 1 MHz and for normalized quantity U0 /2g = 0.001, it is  1 σe = = 1 · 10−5 = 10 ppm, 1010 so the measured value of the offset of analog adder is known with the precision of 10 ppm after only 10 s. In position (2) of the switch P is Ψ (2) = y + U0 , so accurate value y is given by y = f (t) = Ψ (2) − U0 = Ψ (2) − Ψ (1)

(40.6)

thus eliminating the systematic error caused by the offset of analog adder, with the precision of 20 ppm (coverage factor is 2) and far greater accuracy.1 When measuring interval is one hour in position (1) of the switch, the precision of eliminating error, relation (40.6), is lowered below 1 ppm. As we can see from this simple situation, two-bit flash AD converter is accurate and precision is a critical property. Experience in the last 20 years and more has shown that precision is a bottle neck in all the measurements using two-bit SDMM in power distribution networks, apart from measuring energy. Energy measurements are extremely precise and accurate [6] due to the fact that they are performed over a long periods of time. From described example of the measurement of mean value of the signal over the interval it is clear that the accuracy only depends on dithered flash AD converter. Accuracy is very high because systematic errors are easy to identify and eliminate. Precision does depend on the waveform of measured signal, but far more on the number of samples over the measurement interval, N . N is a product of sampling frequency and length of time interval in which measurement is made. Sampling frequency is limited by the frequency of generating samples of dithered signal h. This limit is 1 MHz for 20 bit accuracy of samples [7]. Concluding this chapter we can state the following: 1. Measurement of the mean value of the signal using SDMM over time interval is accurate. 2. Precision of measurement of the mean value of the signal using two-bit SDMM is defined by formula (40.5). 1

¹ Cross-switching [6] eliminates the error caused by the offset of the comparators (Fig. 40.3), so those errors are not discussed here.


V. Vujičić et al.

Fig. 40.4 Measurement of the mean value of the product of two signals

3. The measurement uncertainty of measuring the mean value of a signal using the two-bit SDMM is equal to the sum of the precision and the accuracy. Since the accuracy is of the order of 1 ppm, the measurement uncertainty is defined by the precision alone.
4. Only when measurements are performed over long periods of time (days, weeks or months), when the precision is extremely high, does the accuracy start to play a part in the measurement uncertainty.
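The two-position offset-elimination measurement described above can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: the comparator thresholds at ±g, the uniform dither on [−g, g], the offset value U0 and the constant test signal y are all illustrative assumptions.

```python
import random

G = 1.0  # half-range parameter g of the converter (illustrative value)

def two_bit_sample(y, rng):
    """One dithered two-bit flash conversion: output in {-2g, 0, +2g}.
    Thresholds at +/-g with uniform dither on [-g, g] make each sample
    an unbiased estimate of y for |y| <= 2g."""
    x = y + rng.uniform(-G, G)
    return 2*G if x > G else (-2*G if x < -G else 0.0)

def mean_over_interval(y, n, rng):
    """Average of n conversions: the measured mean value, precision ~ 1/sqrt(n)."""
    return sum(two_bit_sample(y, rng) for _ in range(n)) / n

rng = random.Random(1)
U0 = 0.002 * 2*G        # hypothetical offset of the analog adder
y = 0.3                 # constant test signal for simplicity
N = 200_000

psi1 = mean_over_interval(U0, N, rng)        # switch position (1): measures U0
psi2 = mean_over_interval(y + U0, N, rng)    # switch position (2): measures y + U0
estimate = psi2 - psi1                       # relation (40.6)
print(abs(estimate - y))                     # residual shrinks like 1/sqrt(N)
```

Increasing N tightens the result, mirroring the role of the sample count in the precision formula.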

40.3 Measurement of the Mean Value of the Product of Two Signals Using Two-Bit SDMM

40.3.1 Measurement in Time Domain

In Fig. 40.4, a schema for the measurement of the mean value of the product of two signals, y1 = f1(t) and y2 = f2(t), is shown. Let us note that Ψ = Ψ1 · Ψ2, where Ψ1 = 2g(b11 − b−11) and Ψ2 = 2g(b12 − b−12). If the dither signals h1 and h2 are mutually uncorrelated, then the following holds:

$$\bar\Psi = \lim_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N}\Psi_1(i)\cdot\Psi_2(i) = \frac{1}{T}\int_0^T f_1(t)\cdot f_2(t)\,dt.$$

Let us derive the previous relation.

40 Mathematical Basis of the Stochastic Digital Measurement Method


$$\bar\Psi = \int_{-\infty}^{\infty} \Psi\,dP_\Psi,$$

where dP_Ψ is the elementary probability of Ψ occurring. Note that y1 and y2 are not mutually independent but depend on the same random parameter t, so

$$dP_\Psi = dP_{y_1/t}\cdot dP_{y_2/t}\cdot dP_t\cdot dP_{h_1}\cdot dP_{h_2},$$
$$dP_{y_1/t} = \delta\bigl(y_1 - f_1(t)\bigr)\,dy_1,\qquad dP_{y_2/t} = \delta\bigl(y_2 - f_2(t)\bigr)\,dy_2,$$
$$dP_t = \frac{1}{T}\,dt,\qquad dP_{h_1} = \frac{1}{2g}\,dh_1,\qquad dP_{h_2} = \frac{1}{2g}\,dh_2.$$

Then

$$\bar\Psi = \frac{1}{T}\int_0^T dt\int_{-2g}^{2g}\delta\bigl(y_1-f_1(t)\bigr)\,dy_1\int_{-2g}^{2g}\delta\bigl(y_2-f_2(t)\bigr)\,dy_2\cdot\int_{-g}^{g}\frac{\Psi_1}{2g}\,dh_1\cdot\int_{-g}^{g}\frac{\Psi_2}{2g}\,dh_2.$$

For t = const ⇒ (y1 = const) ∧ (y2 = const), it is easily shown that

$$\int_{-g}^{g}\frac{\Psi_i}{2g}\,dh_i = y_i\qquad(i = 1, 2),$$

so then

$$\bar\Psi = \frac{1}{T}\int_0^T dt\int_{-2g}^{2g} y_1\,\delta\bigl(y_1-f_1(t)\bigr)\,dy_1\cdot\int_{-2g}^{2g} y_2\,\delta\bigl(y_2-f_2(t)\bigr)\,dy_2.$$

Based on the property of the Dirac delta function

$$\int_{-2g}^{2g} y_i\,\delta\bigl(y_i-f_i(t)\bigr)\,dy_i = f_i(t)\qquad(i = 1, 2),$$

we get

$$\bar\Psi = \frac{1}{T}\int_0^T f_1(t)\cdot f_2(t)\,dt = \overline{y_1\cdot y_2}. \tag{40.7}$$
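Relation (40.7) can be verified by simulation. The sketch below is an illustration under assumptions not fixed in the text: comparator thresholds at ±g, uniform dither on [−g, g], example sinusoidal signals, and Monte Carlo sampling of t standing in for dP_t = dt/T.

```python
import math, random

G = 1.0  # quantization range parameter g (assumed value)

def psi(y, rng):
    """One dithered two-bit flash conversion, output in {-2g, 0, +2g}.
    Thresholds at +/-g and uniform dither on [-g, g] are assumptions
    consistent with the identity int Psi dh / 2g = y."""
    x = y + rng.uniform(-G, G)
    return 2*G if x > G else (-2*G if x < -G else 0.0)

rng = random.Random(7)
f1 = lambda t: 0.8 * math.sin(2*math.pi*t)          # example signals, |f| <= 2g
f2 = lambda t: 0.5 * math.sin(2*math.pi*t) + 0.2

# Monte Carlo over t stands in for dP_t = dt/T; h1 and h2 are independent dithers.
N = 400_000
acc = 0.0
for _ in range(N):
    t = rng.random()
    acc += psi(f1(t), rng) * psi(f2(t), rng)
estimate = acc / N

# Reference value (1/T) int_0^T f1 f2 dt over one period T = 1: here 0.2.
M = 100_000
exact = sum(f1((k + 0.5)/M) * f2((k + 0.5)/M) for k in range(M)) / M
print(estimate, exact)
```

The average of the two-bit products converges to the time average of f1·f2, as (40.7) states.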

We have implicitly assumed that the sampling frequency is infinitely high, so then

$$\bar\Psi = \overline{\Psi_1\cdot\Psi_2} = \lim_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N}\Psi_1(i)\cdot\Psi_2(i) = \frac{1}{T}\int_0^T f_1(t)\cdot f_2(t)\,dt = \overline{y_1\cdot y_2}.$$

Relation (40.7) and the relation above show that the device in Fig. 40.4 measures the mean value of the product of two signals over an interval t ∈ [0, T] when the sampling frequency is infinite. When the sampling frequency is finite, i.e. N is a natural number, it is important to show that the third central moment, M3(2), of the quantity e (the error) is finite, where Ψ = y1 · y2 + e. Since the quantity y1 · y2 is deterministic, and h1 and h2 are uncorrelated random dither signals (voltages), the quantity Ψ is random, so e also has to be a random quantity, statistically independent of y1 · y2. Then Ψ = y1 · y2 + e ⇒ ē = 0, and

$$M_3(2) = \overline{(e-\bar e)^3} = \overline{e^3} = \int_{-4g^2}^{4g^2} e^3\,p(e)\,de \le \int_{-4g^2}^{4g^2} |e^3|\,p(e)\,de \le (4g^2)^3\int_{-4g^2}^{4g^2} p(e)\,de = (4g^2)^3.$$

Therefore, M3(2) ≤ (4g²)³. The quantity ē has a Gaussian distribution and the following applies:

$$\sigma_{\bar e}^2 = \frac{\sigma_e^2}{N}$$

for a finite number N, where N = ⟨COUNTER2⟩. It remains to determine σe²:

$$\sigma_\Psi^2 = \sigma_{y_1 y_2}^2 + \sigma_e^2,$$

$$\sigma_{y_1 y_2}^2 = \overline{(y_1 y_2)^2} - \overline{y_1 y_2}^2 = \frac{1}{T}\int_0^T f_1^2(t)\cdot f_2^2(t)\,dt - \overline{y_1 y_2}^2,$$

$$\sigma_\Psi^2 = \overline{\Psi^2} - \bar\Psi^2 = \overline{\Psi^2} - \overline{y_1 y_2}^2,$$

so then

$$\sigma_e^2 = \overline{\Psi^2} - \frac{1}{T}\int_0^T f_1^2(t)\cdot f_2^2(t)\,dt.$$

What is $\overline{\Psi^2}$? Since Ψ² = Ψ1² · Ψ2² and Ψi² = 2g|Ψi| (i = 1, 2), we have

$$\Psi^2 = (2g)^2\cdot|\Psi_1|\cdot|\Psi_2| = (2g)^2\cdot|\Psi|.$$

Acting analogously as in the previous derivation of relation (40.7), we get

$$\overline{\Psi^2} = \frac{(2g)^2}{T}\int_0^T \bigl|f_1(t)\cdot f_2(t)\bigr|\,dt,$$

so finally

$$\sigma_e^2 = \frac{(2g)^2}{T}\int_0^T \bigl|f_1(t)\cdot f_2(t)\bigr|\,dt - \frac{1}{T}\int_0^T f_1^2(t)\cdot f_2^2(t)\,dt,$$

$$\sigma_{\bar e}^2 = \frac{1}{N}\left[\frac{(2g)^2}{T}\int_0^T \bigl|f_1(t)\cdot f_2(t)\bigr|\,dt - \frac{1}{T}\int_0^T f_1^2(t)\cdot f_2^2(t)\,dt\right]. \tag{40.8}$$
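The variance decomposition σ_Ψ² = σ_{y1y2}² + σ_e² behind (40.8) can be checked empirically. The sketch below is an illustration only: the converter model (thresholds at ±g, uniform dither on [−g, g]) and the test signals are assumptions, and t is sampled uniformly so that the total sample variance should match the sum of the two theoretical terms.

```python
import math, random

G = 1.0

def psi(y, rng):
    """Dithered two-bit conversion (thresholds at +/-g are an assumption)."""
    x = y + rng.uniform(-G, G)
    return 2*G if x > G else (-2*G if x < -G else 0.0)

f1 = lambda t: 0.9 * math.sin(2*math.pi*t)   # example signals over T = 1
f2 = lambda t: 0.7 * math.sin(2*math.pi*t)

# Time averages needed by (40.8), computed on a midpoint grid.
M = 20_000
grid = [(k + 0.5)/M for k in range(M)]
mean_f1f2 = sum(f1(t)*f2(t) for t in grid) / M
mean_abs = sum(abs(f1(t)*f2(t)) for t in grid) / M
mean_sq = sum((f1(t)**2)*(f2(t)**2) for t in grid) / M

var_e = (2*G)**2 * mean_abs - mean_sq        # bracket of (40.8): sigma_e^2
var_y = mean_sq - mean_f1f2**2               # sigma_{y1 y2}^2 (randomness of t)
var_tot_theory = var_e + var_y               # sigma_Psi^2

# Empirical variance of one product sample Psi1*Psi2 with t uniform on [0, 1).
rng = random.Random(3)
K = 300_000
s = s2 = 0.0
for _ in range(K):
    t = rng.random()
    p = psi(f1(t), rng) * psi(f2(t), rng)
    s += p
    s2 += p*p
mean_emp = s / K
var_emp = s2/K - mean_emp**2
print(var_emp, var_tot_theory)               # close; dividing var_e by N gives (40.8)
```

Under this model the empirical variance agrees with the theoretical sum, supporting the split of σ_Ψ² into a signal term and a dither-induced error term.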

For the coverage factor of 2, 2σ_ē is the measure of precision of the measurement of the mean value of y1 · y2 over the interval t ∈ [0, T]. Regarding accuracy, the situation is now significantly different from that in the previous chapter. Now we have two analogue adders, two dithered two-bit flash A/D converters and a two-bit multiplier. Systematic errors due to the voltage offsets of the comparators are eliminated using the cross-switching procedure. However, systematic errors due to the voltage offsets of the analogue adders are considerably more complex and in general require a different approach. The question of their elimination will remain open for now.


Fig. 40.5 Measurement of a single Fourier coefficient

40.3.2 Measurement in Fourier Domain

Let us assume that the components of the two-bit dithered FADCs are ideal. Then PRECISION is the only characteristic of such a device:

$$\sigma_{\bar e}^2 = \frac{1}{N}\left[\frac{(2g)^2}{T}\int_0^T \bigl|f_1(t)\cdot f_2(t)\bigr|\,dt - \frac{1}{T}\int_0^T f_1^2(t)\cdot f_2^2(t)\,dt\right].$$

If y2 = f2(t) is a function from the Fourier orthonormal basis stored in memory, then relation (40.8) represents the precision of the measurement of the corresponding (single) Fourier coefficient. As y2 = f2(t) is a known function and h2 is a known dither signal, the two-bit random signal Ψ2 can be formed in a computer, stored in memory and be ready for measurements. The schematic from Fig. 40.4 then turns into the schematic in Fig. 40.5. Let us note that we now have an ideal random signal Ψ2 and only one source of systematic error: the offset of the analog adder in the direct FADC that measures y1 = f1(t). That is a much simpler situation than the one shown in Fig. 40.4. A systematic error due to the offset of the analog adder can be seen as a small additional DC component of the signal y1. This small DC component, which is linearly independent of any other y2 from the Fourier set, does not affect the measurement of any Fourier coefficient. It only affects the accuracy of the measurement of the mean value of y1, that is, of its DC component. Measurement of the Fourier trigonometric polynomial, i.e. of the function y1 = f1(t), is shown in Fig. 40.6. As can be seen from Fig. 40.6, in position (1) of the switch P, the ratio

$$\frac{\langle \mathrm{COUNTER}_1\rangle}{\langle \mathrm{COUNTER}_2\rangle} = -U_0$$


Fig. 40.6 Measurement of the function y1 = f1 (t) in Fourier domain

gives the value of the systematic error due to the offset of the buffer amplifier, the offset of the inverting analog filter and the analogue adder (I). In position (2) of the switch P, U0 is added to the measured mean value $\bar y_1 \approx \bar\Psi^{(2)}$, so then

$$\bar\Psi_1 = \bar\Psi^{(2)} + \bar\Psi^{(1)}. \tag{40.9}$$
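Measuring a single Fourier coefficient with a stored basis signal can be sketched as follows. This is an illustration, not the authors' implementation: the converter model (thresholds at ±g, uniform dither) and the test signal are assumptions, and Ψ2 is generated on the fly here although, as the text notes, it could be precomputed and stored since y2 and h2 are known.

```python
import math, random

G = 1.0

def psi(y, rng):
    """Dithered two-bit conversion; thresholds at +/-g are an assumption."""
    x = y + rng.uniform(-G, G)
    return 2*G if x > G else (-2*G if x < -G else 0.0)

rng = random.Random(11)
f1 = lambda t: 0.6*math.cos(2*math.pi*t) + 0.3*math.sin(4*math.pi*t) + 0.1
y2 = lambda t: math.cos(2*math.pi*t)   # known basis function "stored in memory"

N = 400_000
acc = 0.0
for i in range(N):
    t = i / N                          # deterministic sampling over [0, T), T = 1
    acc += psi(f1(t), rng) * psi(y2(t), rng)
coefficient = acc / N                  # estimates (1/T) int f1(t) cos(2 pi t) dt
print(coefficient)
```

The estimate converges to the cosine coefficient 0.3 of f1; the DC term 0.1 (or any adder offset folded into it) does not bias the result, since the DC component is orthogonal to the basis function.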

40.4 Stochastic Digital DFT Processor (SDDFT Processor)

Figures 40.4 and 40.5 show the hardware-defined two-bit operation MAC (Multiply and Accumulate). As can be seen, it is implemented using 4 two-input AND gates, 2 two-input OR gates and an UP–DOWN counter. It is very simple hardware, so the measurement of the mean value of the product of two signals, or the measurement of a single Fourier coefficient, is easy to perform. Figure 40.7 shows a schema of the stochastic digital DFT processor, which measures 2M Fourier coefficients fully in parallel. Let us note that when the time interval T of the measurement is over, all the Fourier coefficients have already been calculated. That is the essential difference between the SDDFT processor and an FFT processor. Namely, the FFT processor starts to calculate the Fourier coefficients from the sequence of recorded samples only after the end of the interval T. The second important difference is that a standard FFT processor uses floating-point arithmetic instead of simple two-bit integer arithmetic. The third important difference is that a standard FFT processor calculates in series (sequentially), while the SDDFT does it in parallel.
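The gate-level two-bit MAC can be modelled in software. The sketch below is our reading of the logic, not a verified netlist: each converter output is represented by its two comparator bits, the 4 AND / 2 OR gates reduce the product to an up/down count, and the converter model (thresholds at ±g, uniform dither) is an assumption.

```python
import math, random

G = 1.0

def bits(y, rng):
    """Two comparator bits (b_1, b_-1) of the dithered two-bit flash converter."""
    x = y + rng.uniform(-G, G)
    return (x > G), (x < -G)

def mac_step(b11, bm11, b12, bm12):
    """One step of the hardware two-bit MAC: 4 AND gates, 2 OR gates,
    one UP/DOWN count. Returns +1 (up), -1 (down) or 0."""
    up = (b11 and b12) or (bm11 and bm12)     # signs agree  -> product +4g^2
    down = (b11 and bm12) or (bm11 and b12)   # signs differ -> product -4g^2
    return int(up) - int(down)

rng = random.Random(5)
f1 = lambda t: 0.8 * math.sin(2*math.pi*t)
f2 = lambda t: 0.6 * math.sin(2*math.pi*t)

N = 400_000
counter = 0
for i in range(N):
    t = i / N
    counter += mac_step(*bits(f1(t), rng), *bits(f2(t), rng))
mean_product = counter * (2*G)**2 / N   # scale counts back to signal units
print(mean_product)                     # approx 0.8*0.6/2 = 0.24
```

A single integer counter thus accumulates the whole multiply-and-accumulate operation, which is why the hardware stays so simple.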


Fig. 40.7 A stochastic DFT processor for measuring 2M Fourier coefficients

Figure 40.8 shows a schema that ensures the completeness of the measurement of the spectrum of the measured signal. Namely, the first output from the top (the highest one) measures the complete signal power, and the others below it measure the components of the spectrum. When the sum of the squares of the lower components is equal to the value of the upper output, the spectrum is complete. In the general case, it is necessary to expand the SDDFT processor in the frequency domain both up and down to achieve a complete measurement of the signal power spectrum.
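The completeness criterion is essentially Parseval's identity: the total power equals the sum of the squared harmonic contributions. A minimal sketch, with a hypothetical test signal of finite harmonic content and ideal (noise-free) coefficient measurements:

```python
import math

# Hypothetical signal with known finite harmonic content over period T = 1.
f = lambda t: 0.5 + 0.8*math.cos(2*math.pi*t) + 0.3*math.sin(6*math.pi*t)

M = 50_000
grid = [(k + 0.5)/M for k in range(M)]
power = sum(f(t)**2 for t in grid) / M          # "top output": total signal power

def coeff(basis_fn, t_grid):
    """Time average of f times a basis function."""
    return sum(f(t)*basis_fn(t) for t in t_grid) / len(t_grid)

# Parseval: power = a0^2 + (1/2) sum_k (a_k^2 + b_k^2)
a0 = coeff(lambda t: 1.0, grid)
acc = a0**2
for k in range(1, 5):
    ak = 2*coeff(lambda t, k=k: math.cos(2*math.pi*k*t), grid)
    bk = 2*coeff(lambda t, k=k: math.sin(2*math.pi*k*t), grid)
    acc += (ak**2 + bk**2)/2
print(power, acc)   # equal once all harmonics are included: spectrum complete
```

If harmonics were missing (e.g. the loop stopped at k = 2), `acc` would fall short of `power`, signalling that the processor's frequency range must be expanded.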

40.5 Application of SDMM

The two-bit SDMM, described in the previous chapters, has since been used mainly in measurements of power and energy in power distribution networks. In 2013, accurate double three-phase (MM2) and quadruple three-phase (MM4) power analyzers were produced. They detected, identified and measured unauthorised consumption of electric energy (electricity theft). Figure 40.9 shows the double three-phase power analyzer MM2 from 2013. It is worth noting that the key hardware component of this device is an FPGA chip. An FPGA chip can be reprogrammed, so in the MM2 and MM4 devices from 2016 an important additional set of functions has been added [2].


Fig. 40.8 Implementation of on-line criterion for complete energy spectrum of the signal over an interval

In paper [8] it has been shown how, by combining the content of memory and an FPGA to design an SDDFT processor, one can correctly achieve subsampling measurements and quickly detect a fault or irregularity in power distribution networks [9]. In paper [3] it has been shown how to precisely and accurately measure the power and energy of wind using the two-bit SDMM. Paper [4] demonstrated the use of the multi-bit SDMM in measurements of ERP. A complicated procedure such as an examination of the human brain can, using SDMM, be significantly shortened in some aspects without loss of performance. The conclusion of this chapter is that SDMM, especially the two-bit SDMM, is well suited for implementation in FPGA technology. Massive parallelization of measurement and processing, characteristic of the two-bit SDMM, finds its natural environment in FPGA technology.

40.6 Discussion

The two-bit SDMM, considered optimal from the standpoint of precision and accuracy, is characterised primarily by simple hardware. A consequence of this is a small number of systematic errors that are easy to identify and then correct. Furthermore, measurement and processing of signals are carried out together, not separately (as in the standard approach) [10]. This leads to very good synergy effects. The first effect is the possibility of on-line harmonic measurement. Due to the equivalence of the time and Fourier domains, the signal can now be stored, sent or received, and processed in the Fourier domain. As the data in the Fourier domain (harmonics) is


Fig. 40.9 Dual three-phase power analyzer based on a two-bit stochastic ADC

naturally weighted by its amplitude, this introduces the possibility, especially attractive to engineers, of adaptive use and processing of data in accordance with the limitations imposed by the problem. The paper [11] discusses the Big Data reduction problem. The second important synergy effect is the possibility of parallel on-line measurement of a huge number of harmonics. Namely, in one FPGA chip of the latest generation, it is possible to measure tens of thousands of harmonics, which is a completely new situation. Now the Weierstrass Approximation Theorem and the Bernstein approximation of a function on an interval (Bernstein Polynomial Theorem) are losing their significance, and the Fourier series and its properties come to the forefront, from the engineering point of view. The mathematical foundations of the two-bit SDMM, presented in this paper, have helped the authors and their colleagues in solving specific engineering problems. They were able to easily, reliably and quickly evaluate the performance of any solution in the field of measurement and signal processing that was put in front of them.
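The amplitude-weighted nature of Fourier-domain data suggests a simple reduction strategy: keep only the harmonics needed to capture a chosen fraction of the signal power. The sketch below is our illustration of the idea from [11], with a hypothetical harmonic record; the numbers and the 99.9% threshold are assumptions.

```python
# Hypothetical harmonic record: (harmonic index, amplitude) pairs,
# as an SDDFT-style measurement might produce.
harmonics = [(1, 1.00), (3, 0.21), (5, 0.12), (7, 0.04), (9, 0.012), (11, 0.004)]

total_power = sum(a*a for _, a in harmonics) / 2   # Parseval, per-harmonic a^2/2
kept, acc = [], 0.0
for k, a in sorted(harmonics, key=lambda h: -abs(h[1])):
    kept.append((k, a))
    acc += a*a/2
    if acc >= 0.999 * total_power:                 # keep 99.9% of signal power
        break
print(len(kept), "of", len(harmonics))             # prints: 4 of 6
```

The weakest harmonics are dropped first, so the retained set shrinks gracefully as the power threshold is relaxed.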

40.7 Conclusion

The stochastic digital measurement method incorporates measurement and processing of signals at the same time. The optimal SDMM, from the standpoint of accuracy, is the two-bit one. In this paper its mathematical properties are given. Practical and useful formulas are derived which connect the measurement parameters with the key performance figure:


precision. The hardware of the two-bit SDMM is simple, so the sources of systematic errors can easily be identified and the errors corrected. Simple hardware enables large-scale parallelisation of measurements and processing. Now, the Weierstrass Approximation Theorem and Bernstein's Theorem move to the background and the Fourier series comes to the forefront. That fact has numerous consequences that lie ahead of the authors and the scientific community.

Acknowledgements This work has been supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia.

References

1. Vujicic, V., Zupunski, I., Mitrovic, Z., Sokola, M.: Measurement in a point versus measurement over an interval. In: Proceedings of IMEKO XIX World Congress, vol. 480, pp. 1128–1132 (2009)
2. Pejic, D., Naumovic-Vukovic, D., Vujicic, B., Radonjic, A., Sovilj, P., Vujicic, V.: Stochastic digital DFT processor and its application to measurement of reactive power and energy. Measurement 124, 494–504 (2018)
3. Vujicic, V., Licina, B., Pejic, D., Sovilj, P., Radonjic, A.: Stochastic measurement of wind power using a two-bit A/D converter. Measurement 152, 107184 (2020)
4. Novakovic, D.J., Sovilj, P., Petrovic, N., Milovanovic, M., Makal, J., Walendziuk, W.: Measurement of ERP amplitude and latency based on digital stochastic measurement over interval. Elektronika Ir Elektrotechnika 26(2), 59–68 (2020)
5. Vranic, V.: Probability and Statistics (in Croatian), 2nd edn. Technical Book, Zagreb, Croatia (1965)
6. Urekar, M., Pejic, D., Vujicic, V., Avramov Zamurovic, S.: Accuracy improvement of the stochastic digital electrical energy meter. Measurement 98, 139–150 (2017)
7. Analog Devices: Voltage output DAC—1 ppm, 20-bit, ±1 LSB INL. Data Sheet "AD5791", One Technology Way, P.O. Box 9106, Norwood, MA 02062-9106, USA (2010–2020)
8. Vujicic, V., Pejic, D., Radonjic, A.: A brief overview of stochastic instruments for measuring flows of electrical power and energy. Facta Universitatis Series: Electron. Energet. 323, 439–448 (2019)
9. Ghanavati, A., Lev-Ari, H., Stankovic, A.: A sub-cycle approach to dynamic phasors with application to dynamic power quality metrics. IEEE Trans. Power Deliv. 335, 2217–2225 (2018)
10. Sovilj, P., Zupunski, L., Vujicic, B., Radonjic, A., Vujicic, V., Kovacevic, D.: Synergy and completeness of simple A/D conversion and simple signal processing. J. Phys. Conf. Ser. 1379, 012064. Joint IMEKO TC1-TC7-TC13-TC18 Symposium, St. Petersburg, Russian Federation (2019). https://doi.org/10.1088/1742-6596/1379/1/012064
11. Vujicic, V., Sokola, M., Radonjic, A., Sovilj, P.: Measurement in Fourier domain—a natural method of big data volume reduction. In: Proceedings IcETRAN 2019, Silver Lake, Serbia, pp. 471–474 (2019)

Subject Index

A
Arbitrage, 410
Arbitrage-free market, 588
Atom, 100
  aperiodic, 100

B
Baker–Campbell–Hausdorff theorem, 241
Baxter sum, 198
Bond
  corporate, 593
  coupon-bearing, 593
  default-free, 593
  discount, 592
Boundary-layer equations, 706
Brownian motion
  fractional, 251
  geometric, 652

C
Chain
  Markov, 19, 449
  semi-Markov, 181
Commutation relation, 60
Contraction
  multiplicative Banach, 328
Cubature formula on Wiener space, 227

D
Deflator, 410
  state-price, 410
Degree
  of a multi-index, 227
Distribution
  chi-square, 626
  gamma, 627
  phase type, 19
  Wishart, 627, 630

E
Efficient frontier, 603
Efficient Market Hypothesis, 588
Ergodicity, 451

F
Face value, 592
Fekete point, 611
Fixed point, 345
Function
  discount, 593
  Fox-Wright, 434
  Mittag-Leffler, 434
  payoff, 596
  stochastic discount, 410
  Wright, 434

H
Heat equation, 670

I
Iterative scheme
  stable w.r.t. random operator, 385
Itô integral, 225

K
Kolmogorov equations, 215

L
Laplace transform of a random variable, 627

M
Markov chain, 212
  inhomogeneous, 102
Mean value of signal, 894
Measure
  equilibrium, 591
  equivalent martingale, 588, 591
  hazard, 165
  Mittag-Leffler, 435
  risk-neutral, 591
Method
  finite difference, 682
  homotopy analysis, 707
Metric
  multiplicative, 344, 369
Metric space
  multiplicative, 326, 344, 369
  ordered multiplicative, 344
Mixed convection, 706
Model
  ARMA, 494
  Black–Scholes, 652
  double Heston, 4
  double lognormal, 4
  double mean-reverting, 4
  EGARCH, 496
  GARCH, 495
Moment matching condition, 227
Multi-index, 227

N
No Free Lunch with Vanishing Risk, 588
Nusselt number, 841

O
Operator
  integral, 63
  random, 385
Option, 598

P
Polynomial
  Laguerre, 634, 639
Pricing kernel, 410
Probability
  extinction, 43
  conditional, 450
  ruin, 16, 20
  stationary, 451
  transition, 449
Process
  branching, 146
  Galton–Watson, 146
  Galton–Watson branching, 147
  Gaussian stochastic, 579
  Gaussian, 250
  Markovian regime-switching, 124
  Markov renewal, 449
  Poisson, 515
  risk, 20
  square-Gaussian, 579
  Volterra–Lévy, 278

R
Random fixed point, 385
Random fixed point set, 385
Random process
  Gaussian, 204
  generalized, 198
Random variable
  Bochner integrable, 384
Rate
  forward, 595
Resilience, 43
Risky asset, 605

S
Sequence
  DNA, 182
Signature
  Brownian, 228
  of a path, 227
Stochastic variance, 4
Stratonovich correction, 225
Stratonovich integral, 225
Symmetric cone, 631

T
Tensor product, 227
Term structure of the default-free discount bonds, 593
Time
  first-rare-event, 450
  hitting, 100
  return, 100

V
Vandermonde determinant, 606, 635
Vandermonde matrix, 606
Vector
  economy information, 125
  exit rate, 19
  left-singular, 609
  right-singular, 609
Volterra kernel, 251

W
Wishart matrix, 612

Y
Yield
  forward, 595
Yield to maturity, 594

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
A. Malyarenko et al. (eds.), Stochastic Processes, Statistical Methods, and Engineering Mathematics, Springer Proceedings in Mathematics & Statistics 408, https://doi.org/10.1007/978-3-031-17820-7

Author Index

A
Abbas, Mujahid, 383
Abel, M. Subhas, 773
Albuhayri, Mohammed, 3
Ali, Mohammad Jamsher, 15
Anguzu, Collins, 541
Anisimov, Vladimir, 511
Arjmand, Doghonay, 689
Austin, Matthew, 511
Aye, Tin Nwe, 33

C
Canpwonyi, Sam, 753
Carlsson, Linus, 33, 735, 753
Chandarki, I. M., 703

D
D'Amico, Guglielmo, 489
Da Silva, José, 431
Di Basilio, Bice, 489
Dimitrov, Marko, 121
Di Nunno, Giulia, 277
Djinja, Domingos, 59
Djordjević Kozarov, Jelena, 857, 889
Drumond, Custódia, 431

E
Engström, Christopher, 3, 541

G
Gismondi, Fulvio, 489
Golomoziy, Vitaliy, 97

J
Jin, Lu, 121

K
Kakuba, Godwin, 587, 625
Kasumba, Henry, 541
Khusanbaev, Yakubdjan, 145
Kitouni, Abderrahim, 163
Kolias, Pavlos, 179
Kozachenko, Yuriy, 563
Krasnitskiy, Sergey, 197
Kudratov, Kh. E., 145
Kurchenko, Oleksandr, 197

L
Lebedev, Eugene, 211
Livinska, Hanna, 211
Lundengård, Karl, 587, 625

M
Malyarenko, Anatoliy, 3, 223, 625
Mango, John Magero, 541, 587, 625
Messaci, Fatiha, 163
Metri, Prashant G., 773, 791, 809, 829
Mishura, Yuliya, 249, 277
Muhumuza, Asaph Keikara, 587, 625

N
Nankinga, Loy, 735
Nazir, Talat, 325, 343, 367
Ni, Ying, 3, 121
Nohrouzian, Hossein, 223

O
Ögren, Magnus, 669
Okeke, Godwin Amechi, 383

P
Pärna, Kalev, 15
Papadopoulou, Alexandra, 179
Perić, Mirjana, 719
Petroni, Filippo, 489
Ponomarov, Vadym, 211

R
Raičević, Nebojša, 719
Ralchenko, Kostiantyn, 277
Rambaud, Salvador Cruz, 407
Rozora, Iryna, 563

S
Sharanappa, Deena Sunil, 773
Shchestyuk, Nataliya, 651
Shevchenko, Georgiy, 249
Shklyar, Sergiy, 249
Silvestrov, Dmitrii, 447
Silvestrov, Sergei, 3, 59, 325, 343, 367, 383, 541, 587, 625, 829
Singh, B. B., 703
Sovilj, Platon, 889
Streit, Ludwig, 431
Syniavska, Olga, 197

T
Tawade, Jagadish, 791, 809
Tumwesigye, Alex Behakanira, 59
Tyshchenko, Serhii, 651

U
Umavathi, J. C., 829
Urekar, Marjan, 857

V
Vučković, Ana, 719
Vučković, Dušan, 719
Vujičić, Bojan, 889
Vujičić, Vladimir, 889