Nonlinear Systems and Complexity Series Editor: Albert C. J. Luo
Dimitri Volchenkov J. A. Tenreiro Machado Editors
Mathematical Methods in Modern Complexity Science
Nonlinear Systems and Complexity Volume 33
Series Editor Albert C. J. Luo,
Southern Illinois University, Edwardsville, IL, USA
Nonlinear Systems and Complexity provides a place to systematically summarize recent developments, applications, and overall advances in all aspects of nonlinearity, chaos, and complexity as part of the established research literature, beyond the novel and recent findings published in primary journals. The aims of the book series are to publish theories and techniques in nonlinear systems and complexity; stimulate more research interest in nonlinearity, synchronization, and complexity in nonlinear science; and fast-scatter the new knowledge to scientists, engineers, and students in the corresponding fields. Books in this series will focus on the recent developments, findings and progress on theories, principles, methodology, and computational techniques in nonlinear systems and mathematics with engineering applications. The series establishes highly relevant monographs on wide-ranging topics covering fundamental advances and new applications in the field. Topical areas include, but are not limited to:
• Nonlinear dynamics
• Complexity, nonlinearity, and chaos
• Computational methods for nonlinear systems
• Stability, bifurcation, chaos and fractals in engineering
• Nonlinear chemical and biological phenomena
• Fractional dynamics and applications
• Discontinuity, synchronization and control
More information about this series at https://link.springer.com/bookseries/11433
Dimitri Volchenkov • J. A. Tenreiro Machado Editors
Mathematical Methods in Modern Complexity Science
Editors
Dimitri Volchenkov, Department of Mathematics & Statistics, Texas Tech University, Lubbock, TX, USA
J. A. Tenreiro Machado (deceased), ISEP-Institute of Engineering, Polytechnic of Porto, Porto, Portugal
ISSN 2195-9994 ISSN 2196-0003 (electronic) Nonlinear Systems and Complexity ISBN 978-3-030-79411-8 ISBN 978-3-030-79412-5 (eBook) https://doi.org/10.1007/978-3-030-79412-5 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
In memoriam of Professor Valentin Afraimovich (1945–2018), a visionary scientist, respected colleague, generous mentor, and loyal friend.
Preface
The present volume on Mathematical Methods in Modern Complexity Science: From Artificial Intelligence to Relativistic Chaotic Dynamics explores recent developments in mathematical methods applied in complexity science, including modern approaches based on data analysis and artificial intelligence. The volume is dedicated to the memory of our colleague Valentin Afraimovich (1945–2018), a visionary scientist, respected colleague, generous mentor, and loyal friend. Professor Afraimovich was a Soviet, Russian, and Mexican mathematician known for his works in dynamical systems theory, the qualitative theory of ordinary differential equations, bifurcation theory, the concept of attractor, strange attractors, space-time chaos, mathematical models of nonequilibrium media and biological systems, traveling waves in lattices, complexity of orbits, and dimension-like characteristics in dynamical systems.

The collection of works in this edited volume opens with the contribution of Professor Machado on Shannon information analysis of the chromosome code. We discuss probabilistic and information-theoretic methods for risk assessment, relativistic chaotic scattering, artificial intelligence methods for studying human perception, multistability and coexistence of memristive chaotic systems, the evolution of systems with power-law memory, and many other topics recently developed in complexity science. The volume facilitates a better understanding of the mechanisms and phenomena in nonlinear dynamics and develops the corresponding mathematical theory to apply nonlinear design to practical engineering.

Valentin Afraimovich was a generous, gregarious, and energetic presence at the very heart of the nonlinear dynamics and complexity science communities, all of which were transformed by his presence. We hope that the scientific community will benefit from this edited volume.

Lubbock, TX, USA
Porto, Portugal
Dimitri Volchenkov J. A. Tenreiro Machado
Contents
Shannon Information Analysis of the Chromosome Code (J. A. Tenreiro Machado)
An Unfair Coin of the Standard & Poor's 500 (Dimitri Volchenkov and Veniamin Smirnov)
Relativistic Chaotic Scattering (Juan D. Bernal, Jesús M. Seoane, and Miguel A. F. Sanjuán)
Complex Dynamics of Solitons in Rotating Fluids (Lev A. Ostrovsky and Yury A. Stepanyants)
Multistability Coexistence of Memristive Chaotic System and the Application in Image Decryption (Fuhong Min and Chuang Li)
Extreme Events and Emergency Scales (Veniamin Smirnov, Zhuanzhuan Ma, and Dimitri Volchenkov)
Probability Entanglement and Destructive Interference in Biased Coin Tossing (Dimitri Volchenkov)
On the Solvability of Some Systems of Integro-Differential Equations with Drift (Messoud Efendiev and Vitali Vougalter)
Solvability in the Sense of Sequences for Some Non-Fredholm Operators with the Bi-Laplacian (Vitali Vougalter and Vitaly Volpert)
The Preservation of Nonnegativity of Solutions of a Parabolic System with the Bi-Laplacian (Vitali Vougalter)
Inverse Problems for Some Systems of Parabolic Equations with Coefficient Depending on Time (Vitali Vougalter)
Shannon Information Analysis of the Chromosome Code
J. A. Tenreiro Machado
AMS (MSC 2010) 70K50, 35P30, 37M05
1 Introduction

In recent years, the genome sequencing project has produced a large volume of data that is presently available for computational processing [6, 7, 17, 22, 24, 26, 31, 33, 35]. Researchers have been tackling the information content of the deoxyribonucleic acid (DNA), but important questions still remain open [1, 5, 12, 13, 21, 25, 34]. This paper addresses the information flow along each DNA strand [14]. For this purpose, several statistics are developed and the relative frequencies of distinct types of symbol associations are evaluated. The concepts of character, word, space and message are defined, and the information content of each chromosome is quantified.
Professor José António Tenreiro Machado passed away unexpectedly on October 6, 2021, the day of his 64th birthday. He authored over 1160 publications, including 11 books, 575 journal papers, 118 book chapters, 382 presentations, and 60 plenary lectures at national and international meetings and conferences, and 74 courses in national and international universities. He was editor of 22 books, advisory editor of 3 book series and editor-in-chief of 1 book series. He was also scientific director of the journal "Robotics and Automation" (in Portuguese), guest editor of 55 special issues in journals, member of the editorial board or associate editor of several scientific journals, and editor-in-chief of 3 scientific journals. The present contributed volume is his last scientific contribution. We will always miss him, with his wisdom, hard-work attitude, passion for research, and integrity. J. A. Tenreiro Machado (deceased), ISEP-Institute of Engineering, Polytechnic of Porto, Porto, Portugal
It was recognized [16, 18, 27] that DNA has an information structure that reveals long-range behavior, somewhat in the line of thought of systems with dynamics described by the tools of Fractional Calculus (FC) [2, 9, 20, 23, 28–30]. Having these ideas in mind, this paper is organized as follows. Section 2 presents the DNA sequence decoding concepts and the mathematical tools, and formulates the algorithm that computes the information for each chromosome and species. Section 3 analyzes the dynamical information content of the DNA of 463 chromosomes corresponding to a set of 23 species. Finally, Sect. 4 outlines the main conclusions.
2 Preliminary Notes on the DNA Information

In the DNA double helix we find 4 distinct nitrogenous bases, namely thymine, cytosine, adenine and guanine, denoted by the symbols {T, C, A, G}. Each type of base on one strand connects with only one type of base on the other strand, forming the base pairings A−T and G−C. Besides the 4 symbols T, C, A, and G, the available chromosome data includes a small percentage of a fifth symbol N, which is usually considered to have no practical meaning in the DNA decoding.

For processing the DNA information, a possible technique is to convert the symbols into numerical values. In previous papers the direct symbol translation T = 0 + i, C = −1 + i0, A = 1 + i0, G = 0 − i, N = 0 + i0, where i = √(−1), was adopted. We can move along the DNA strip, one symbol (base) at a time, and the resulting values form a signal x(t), where t can be interpreted as a pseudo-time. The signal can be treated by the Fourier transform

F{x(t)} = X(i\omega) = \int_{-\infty}^{+\infty} x(t)\, e^{-i\omega t}\, dt,

where ω represents the angular frequency. This spectrum is typical of fractional-order systems involving long-range memory effects and nonlocality. Figure 1 shows an example with the amplitude of the Fourier transform for chromosome 1 of the human being. A similarity with fractional Brownian noise [15] is clear. The frequency interval 10^{-7} ≤ ω ≤ 10^{0} is adopted and a power-law approximation is fitted, namely |X(iω)| = 6105 · ω^{-0.3}, revealing a strong correlation between both plots.

This technique has, however, one drawback, which is the initial assignment of numerical values to the DNA symbols. Therefore, it is important to design an alternative method of analysis avoiding that problem but, on the other hand, capable of revealing fractional-order phenomena. Bearing this strategy in mind, in this paper an approach based on the histograms of symbol alignment and information theory is adopted. This study focuses on the 23 species listed in Table 1, yielding a total of 463 chromosomes.

For the DNA information decoding we start by defining the fundamental concepts. The first level is the "symbol", which consists of one of the four possibilities {T, C, A, G}, while N is simply disregarded. The second level is the "character" c, represented by an n-tuple association (n = 1, 2, . . .) of symbols, resulting in a total of 4^n possible symbols per character.
Fig. 1 Amplitude of the Fourier transform versus ω for chromosome 1 of the human species and the PL approximation |X(iω)| = 6105 · ω^{-0.3}
The information is obtained when moving sequentially along the DNA. The n-tuple associations of symbols may have different meanings and are divided into two classes, namely the "characters" (c) and the "spaces" (s). The two categories, c and s, have the role of providing information and of delimiting groups of characters, respectively. Several consecutive characters without spaces in the middle form the third level, that is, the "word" (w). Multiple consecutive spaces are read as a single space. When the complete association of consecutive words is fulfilled we obtain the fourth level, represented by the "message", corresponding to the total information of the chromosome. The final level consists of the total information provided by all the chromosomes of a given species, and is simply calculated as the sum of all individual messages. We verify (i) that we may have words with different lengths, (ii) that any repetition of several consecutive spaces is considered as a single space, and (iii) that the only predefined assumption is the choice of n. The message finishes at the end of the DNA strand.

After defining the concepts of symbol, character/space and message, we need to establish the numerical value to be adopted for n and to have a concrete method for measuring the information.
Table 1 The 23 species

 j   Species                              Tag   Group      Nc
 1   Mosquito (Anopheles gambiae)         Ag    Insect      5
 2   Honeybee (Apis mellifera)            Am    Insect     16
 3   Caenorhabditis briggsae              Cb    Nematode    6
 4   Caenorhabditis elegans               Ce    Nematode    6
 5   Chimpanzee                           Ch    Mammal     25
 6   Dog                                  Dg    Mammal     39
 7   Drosophila Simulans                  Ds    Insect      6
 8   Drosophila Yakuba                    Dy    Insect     10
 9   Horse                                Eq    Mammal     32
10   Chicken                              Ga    Bird       31
11   Human                                Ho    Mammal     24
12   Medaka                               Me    Fish       24
13   Mouse                                Mm    Mammal     21
14   Opossum                              Op    Mammal      9
15   Orangutan                            Or    Mammal     24
16   Cow                                  Ox    Mammal     30
17   Pig                                  Po    Mammal     19
18   Rat                                  Rn    Mammal     21
19   Yeast (Saccharomyces cerevisiae)     Sc    Fungus     16
20   Stickleback                          St    Fish       21
21   Zebra Finch                          Tg    Bird       32
22   Tetraodon                            Tn    Fish       21
23   Zebrafish                            Zf    Fish       25
In what concerns n, no a priori optimal value is considered. Therefore, in the experiments to be performed in the follow-up, we analyze the influence of adopting n-tuples with values of n between n = 1 and n = 12, that is to say, when going from 4^1 up to 4^12 possible symbols per character c. This pre-evaluation is performed for one chromosome only, given the huge computational load required by high values of n and the total of 463 chromosomes.

In what concerns the measurement, the Shannon information [3, 8, 10, 11, 32], I = −ln(p), is adopted, where p and I represent the probability of a given event and the quantity of information it provides. With regard to this topic we can mention [4, 17, 19], addressing the calculation of the information embedded in the DNA. For an n-tuple symbol encoding, the occurrence of the k-th character c_k has probability p_n(c_k), leading to the information

I_n(c_k) = -\ln\bigl(p_n(c_k)\bigr).    (1)

Therefore, the total information content I_n(w_r) of the r-th word w_r is given by

I_n(w_r) = -\sum_{k=1}^{k_{max}} \ln\bigl(p_n(c_k)\bigr),    (2)
where k_max represents the total number of characters forming the word w_r, including the first space. Several experiments evaluated numerically the effect of including, or not, the information about one or more spaces; due to its low importance, the final effect is negligible. Therefore, the inclusion of one space is considered as the information delimiting the word, while further consecutive repetitions of spaces are disregarded. The message information of chromosome q is the sum of all word information:

I_n(q) = \sum_{r=1}^{r_{max}} I_n(w_r),    (3)

where r_max denotes the total number of words included in the q-th message (i.e., the q-th chromosome). Finally, the total information for a given species is the sum of the information content of all chromosomes, that is,

I_n = \sum_{q=1}^{N_c} I_n(q),    (4)

where N_c denotes the total number of chromosomes.

The information measurement requires the knowledge of the probability p(c). Therefore, a numerical procedure is adopted that starts by reading the chromosome message based on the n-tuple character setup, leading to the construction of one histogram per chromosome. In the set of 4^n bins, those that are much more frequent (and have smaller information content) are chosen, by inspection, for the role of spaces, the others being interpreted as characters. In a second phase, the relative frequencies calculated from the DNA data are read as approximate probabilities, and the information values are calculated while traveling along the DNA strand. This strategy does not consider some optimal value of n defined a priori. Therefore, as mentioned previously, several distinct values of n are tested before a value of n is fixed.
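The procedure described above can be condensed into a short script. The sketch below is only an illustration of the scheme under simplifying assumptions: the chromosome is given as a plain string over {T, C, A, G, N}, the two most frequent n-tuples are taken as the "spaces", and the function names and toy sequence are hypothetical, not taken from the original code.

```python
from collections import Counter
import math

def ntuple_counts(sequence, n):
    """Count n-tuples by sliding one symbol at a time (counting method (ii))."""
    seq = sequence.replace("N", "")          # the symbol N is disregarded
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))

def message_information(sequence, n, n_spaces=2):
    """Shannon information of one chromosome 'message' for an n-tuple encoding.

    The n_spaces most frequent n-tuples play the role of 'spaces' (in the data
    these are the runs A...A and T...T); all other n-tuples are 'characters'.
    A word ends at the first space; repeated consecutive spaces are ignored.
    """
    counts = ntuple_counts(sequence, n)
    total = sum(counts.values())
    prob = {c: k / total for c, k in counts.items()}       # relative frequencies ~ probabilities
    spaces = {c for c, _ in counts.most_common(n_spaces)}  # chosen by inspection in the paper

    message_info, word_info, in_space = 0.0, 0.0, True
    seq = sequence.replace("N", "")
    for i in range(len(seq) - n + 1):
        c = seq[i:i + n]
        if c in spaces:
            if not in_space:                 # one space delimits the word
                word_info += -math.log(prob[c])
                message_info += word_info    # Eq. (3): message = sum of word information
                word_info, in_space = 0.0, True
        else:
            word_info += -math.log(prob[c])  # Eq. (1): I_n(c_k) = -ln p_n(c_k)
            in_space = False
    return message_info

# toy usage (a real chromosome would be read from a file)
print(message_information("ATTTGCATTTCGGGATTTAAAC", n=3))
```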
3 Capturing the DNA Information

We start by reading the information of the human chromosome 12, while testing values of n in the range n ∈ {1, . . . , 12}. This chromosome is represented by a medium-size file (130 Mbytes) and may be considered a good compromise between length and computational load. Figure 2 depicts the histogram for n = 3, where, to simplify the visualization, the characters are ordered by decreasing magnitude of relative frequency.
Fig. 2 Histogram for the human chromosome 12 and n = 3
For other chromosomes the resulting histograms reveal identical characteristics, namely two characters with a very large relative frequency. These two characters are simply successions of the symbols A or T, and the corresponding n-tuples (i.e., {A · · · A} and {T · · · T}) are adopted in the sequel as "spaces". For the histogram construction two counting methods were investigated, namely (i) counting disjoint sets of n symbols, and (ii) counting the sets while sliding one symbol at a time. At first sight it seems that option (i) is the most straightforward. However, we must consider that we do not have information for starting and synchronizing the counting. Furthermore, we must keep in mind that the standard reading, from left to right in sequence, may not be the appropriate one for the DNA. Therefore, method (ii) seems more robust and is adopted in the sequel.

Figure 3 shows the word information dynamics I_n(c) when traveling along the DNA strand of the human chromosome 12, for n = 3. We verify the emergence of quantized information levels. This effect is due to the finite number of quantifying levels of information that occur before a space terminates a word. The number of quantized levels increases with n. Figures 4 and 5 depict the total chromosome information I_n, and the number of words N_w together with the average word information I_av, versus n, respectively.
Fig. 3 Word information I_n(c) versus t for the human chromosome 12 and n = 3
Fig. 4 Human chromosome 12: total information In versus n
Fig. 5 Human chromosome 12: average word information Iav and number of words Nw versus n
In Fig. 4 we verify a maximum of the total chromosome information for n = 3. For larger values of n the average information I_av decreases slightly, due to the effect of dropping repeated consecutive spaces. Therefore, we can say that large values of n seem to give a slightly better estimate of the total information content, while the cases n = 1 or n = 2 lead to an inferior measurement process. In Fig. 5 we note that the number of words N_w decreases with n, but the average word information I_av varies in the opposite way.

Figure 6 shows the total information, that is, the information resulting from summing the information of all the chromosomes of each species, versus the corresponding number of chromosomes N_c, for character encoding with n = 8. We observe a weak correlation between both variables. The total information for a species, I_n, is a cumulative index. This observation suggests the definition of a second index adopting an orthogonal perspective. In this line of thought, a measure D_n reflecting the differences, that is to say, the discontinuities, between distinct chromosomes is defined as follows:

D_n = \frac{1}{N_c - 1} \sum_{q=1}^{N_c - 1} \bigl| I_n(q+1) - I_n(q) \bigr|.    (5)
Fig. 6 Locus of the total information In versus the number of chromosomes Nc for the 23 species with n = 8
In the computational implementation, for each species, we initially sort all values of I_n(q) before the calculation of (5). This procedure minimizes the value of D_n by reducing the differences between consecutive values of I_n(q). Figure 7 shows D_n versus I_n for the 23 species. We verify that there is a strong correlation between the two indices. Furthermore, we find that species with superior/inferior complexity are located at the upper right/lower left corner of the locus. It is also interesting to observe (i) the limit positions of Sc and Op, and (ii) that the species "evolve" toward Hu. These results support the assumption that the DNA carries information that can be handled by information theory.
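A minimal sketch of this step is shown below; it assumes the chromosome informations I_n(q) of one species are already available as a list, and the function name and toy values are hypothetical.

```python
import numpy as np

def chromosome_jump_index(I_q):
    """Mean absolute difference D_n between consecutive chromosome informations, Eq. (5).

    The values I_n(q) are sorted first, which minimizes D_n by reducing the
    differences between consecutive values, as described in the text.
    """
    I_sorted = np.sort(np.asarray(I_q, dtype=float))
    return np.mean(np.abs(np.diff(I_sorted)))

# toy usage with hypothetical chromosome informations of one species
print(chromosome_jump_index([1.2e8, 0.9e8, 1.5e8, 1.1e8]))
```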
Fig. 7 Locus of Dn versus In for the 23 species with n = 8
4 Conclusions

Chromosomes have a code based on a 4-symbol alphabet that can be analyzed with methods usually adopted in information processing. The information structure has resemblances to those occurring in systems characterized by fractional dynamics. Nevertheless, schemes based on assigning numerical values to the DNA symbols may deform the information, and alternative methods that avoid such a problem need to be implemented. In this paper a scheme based on Shannon information theory was proposed. Bearing these ideas in mind, the chromosomes were processed considering the average information and the total number of words, for distinct values of character encoding. The resulting locus revealed the emergence of clearly interpretable patterns in accordance with current knowledge in phylogenetics. The proposed methodology opens new directions of research for DNA information processing and supports the recent discoveries that fractional phenomena are present in this biological structure.
Acknowledgments We thank the following organizations for allowing access to genome data:
• Gambiae Mosquito, The International Anopheles Genome Project
• Honeybee, The Baylor College of Medicine Human Genome Sequencing Center, http://www.hgsc.bcm.tmc.edu/projects/honeybee/
• Briggsae nematode, Genome Sequencing Center at Washington University in St. Louis School of Medicine
• Elegans nematode, Wormbase, http://www.wormbase.org/
• Common Chimpanzee, Chimpanzee Genome Sequencing Consortium
• Dog, http://www.broad.mit.edu/mammals/dog/
• Drosophila Simulans, http://genome.wustl.edu/genomes/view/drosophila_simulans_white_501
• Drosophila Yakuba, http://genome.wustl.edu/genomes/view/drosophila_yakuba
• Horse, http://www.broad.mit.edu/mammals/horse/
• Chicken, International Chicken Genome Sequencing Consortium
• Human, Genome Reference Consortium, http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/
• Medaka, http://dolphin.lab.nig.ac.jp/medaka/
• Mouse, Mouse Genome Sequencing Consortium, http://www.hgsc.bcm.tmc.edu/projects/mouse/
• Opossum, The Broad Institute, http://www.broad.mit.edu/mammals/opossum/
• Orangutan, Genome Sequencing Center at WUSTL, http://genome.wustl.edu/genome.cgiGENOME=Pongo%20abelii
• Cow, The Baylor College of Medicine Human Genome Sequencing Center, http://www.hgsc.bcm.tmc.edu/projects/bovine/
• Pig, The Swine Genome Sequencing Consortium, http://piggenome.org/
• Rat, The Baylor College of Medicine Human Genome Sequencing Center, http://www.hgsc.bcm.tmc.edu/projects/rat/
• Yeast, Sacchromyces Genome Database, http://www.yeastgenome.org/
• Stickleback, http://www.broadinstitute.org/scientific-community/science/projects/mammalsmodels/vertebrates-invertebrates/stickleback/stickleba
• Zebra Finch, Genome Sequencing Center at Washington University St. Louis School of Medicine
• Tetraodon, Genoscope, http://www.genoscope.cns.fr/
• Zebrafish, The Wellcome Trust Sanger Institute, http://www.sanger.ac.uk/Projects/D_rerio/
References

1. G. Albrecht-Buehler, Asymptotically increasing compliance of genomes with Chargaff's second parity rules through inversions and inverted transpositions. Proc. Nat. Acad. Sci. USA 103(47), 17828–17833 (2006)
2. D. Baleanu, K. Diethelm, E. Scalas, J.J. Trujillo, Fractional Calculus: Models and Numerical Methods. Series on Complexity, Nonlinearity and Chaos (World Scientific Publishing Company, Singapore, 2012)
3. C. Beck, Generalised information and entropy measures in physics. Contemp. Phys. 50(4), 495–510 (2009)
4. H.-D. Chen, C.-H. Chang, L.-C. Hsieh, H.-C. Lee, Divergence and Shannon information in genomes. Phys. Rev. Lett. 94(17), 178103 (2005)
5. P.J. Deschavanne, A. Giron, J. Vilain, G. Fagot, B. Fertil, Genomic signature: characterization and classification of species assessed by chaos game representation of sequences. Molecular Biol. Evol. 16(10), 1391–1399 (1999)
6. C.W. Dunn et al., Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452(10), 745–750 (2008)
7. I. Ebersberger, P. Galgoczy, S. Taudien, S. Taenzer, M. Platzer, A. von Haeseler, Mapping human genetic ancestry. Molecular Biol. Evol. 24(10), 2266–2276 (2007)
8. R.M. Gray, Entropy and Information Theory (Springer, New York, 2009)
9. R. Hilfer, Application of Fractional Calculus in Physics (World Scientific, Singapore, 2000)
10. E.T. Jaynes, Information theory and statistical mechanics. Phys. Rev. 106(6), 620–630 (1957)
11. A.I. Khinchin, Mathematical Foundations of Information Theory (Dover, New York, 1957)
12. M. Kimura, The Neutral Theory of Molecular Evolution (Cambridge University Press, New York, 1983)
13. M. Lynch, The frailty of adaptive hypotheses for the origins of organismal complexity. Proc. Nat. Acad. Sci. USA 104(suppl 1), 8597–8604 (2007)
14. J.A.T. Machado, Shannon information and power law analysis of the chromosome code. Abstract Appl. Analy. 2012(Article ID 439089), 13 (2012)
15. J.T. Machado, Fractional order description of DNA. Appl. Math. Model. 39(14), 4095–4102 (2015)
16. J.A.T. Machado, Bond graph and memristor approach to DNA analysis. Nonlinear Dyn. 88(2), 1051–1057 (2017)
17. J.T. Machado, A. Costa, M. Quelhas, Entropy analysis of DNA code dynamics in human chromosomes. Comput. Math. Appl. 62(3), 1612–1617 (2011)
18. J.T. Machado, A.C. Costa, M.D. Quelhas, Fractional dynamics in DNA. Commun. Nonlinear Sci. Num. Simul. 16(8), 2963–2969 (2011)
19. J.T. Machado, A.C. Costa, M.D. Quelhas, Shannon, Rényi and Tsallis entropy analysis of DNA using phase plane. Nonlinear Analy. Ser. B Real World Appl. 12(6), 3135–3144 (2011)
20. K. Miller, B. Ross, An Introduction to the Fractional Calculus and Fractional Differential Equations (Wiley, New York, 1993)
21. D. Mitchell, R. Bridge, A test of Chargaff's second rule. Biochem. Biophys. Res. Commun. 340(1), 90–94 (2006)
22. W.J. Murphy, T.H. Pringle, T.A. Crider, M.S. Springer, W. Miller, Using genomic data to unravel the root of the placental mammal phylogeny. Genome Res. 17(4), 413–421 (2007)
23. K. Oldham, J. Spanier, The Fractional Calculus: Theory and Application of Differentiation and Integration to Arbitrary Order (Academic, New York, 1974)
24. H. Pearson, Genetics: what is a gene? Nature 441(7092), 398–401 (2006)
25. B. Powdel, S.S. Satapathy, A. Kumar, P.K. Jha, A.K. Buragohain, M. Borah, S.K. Ray, A study in entire chromosomes of violations of the intra-strand parity of complementary nucleotides (Chargaff's second parity rule). DNA Res. 16(6), 325–343 (2009)
26. A.B. Prasad, M.W. Allard, Confirming the phylogeny of mammals by use of large comparative sequence data sets. Molecular Biol. Evol. 25(9), 1795–1808 (2008)
27. R. Antão, A. Mota, J.A.T. Machado, Kolmogorov complexity as a data similarity metric: application in mitochondrial DNA. Nonlinear Dyn. 93(3), 1059–1071 (2018)
28. B. Ross, Fractional calculus. Math. Mag. 50(3), 115–122 (1977)
29. J. Sabatier, O.P. Agrawal, J.T. Machado, Advances in Fractional Calculus: Theoretical Developments and Applications in Physics and Engineering (Springer, Dordrecht, 2007)
30. S. Samko, A. Kilbas, O. Marichev, Fractional Integrals and Derivatives: Theory and Applications (Gordon and Breach Science Publishers, Amsterdam, 1993)
31. H. Seitz, Analytics of Protein-DNA Interactions. Advances in Biochemical Engineering Biotechnology (Springer, Berlin, 2007)
32. C.E. Shannon, A mathematical theory of communication. Bell Syst. Techn. J. 27(3), 379–423, 623–656 (1948)
33. G.E. Sims, S.-R. Jun, G.A. Wu, S.-H. Kim, Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions. Proc. Nat. Acad. Sci. USA 106(8), 2677–2682 (2009)
34. C.-T. Zhang, R. Zhang, H.-Y. Ou, The Z curve database: a graphic representation of genome sequences. Bioinformatics 19(5), 593–599 (2003)
35. H. Zhao, G. Bourque, Recovering genome rearrangements in the mammalian phylogeny. Genome Res. 19(5), 934–942 (2009)
An Unfair Coin of the Standard & Poor's 500
Dimitri Volchenkov and Veniamin Smirnov
Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX, USA
In memoriam of Valentin Afraimovich (1945–2018), a visionary scientist, respected colleague, generous mentor, and loyal friend.
AMS (MSC 2010) 70K50, 35P30, 37M05
1 Introduction

In the famous book "The Intelligent Investor", Benjamin Graham advised against wasting time on forecasting how the market will perform in the future, as markets are fickle and market prices are largely meaningless [1]. Instead, he taught that the intelligent investor should focus on the diligent financial evaluation of a company's market value, buying stocks that are clearly underpriced in the market [2]. Nowadays, we have enough publicly available historical data and adequate mathematical methods to confirm or deny the conclusions of Mr. Graham, the father of value investing and security analysis.

The efficient-market hypothesis (EMH) states that asset prices fully reflect all available information [3], though it is impossible to ascertain what a stock should be worth under an efficient market, since investors value stocks differently. Random events are entirely acceptable under an efficient market, and it is unknown how much time prices need to revert to fair value, so that the price remains uncertain at every moment of time and over any time horizon. Therefore, it is important to ask whether EMH undermines itself in its allowance for random occurrences or environmental eventualities. In our work, we suggest a conceptually novel answer resolving this paradox by demonstrating that the uncertainty of a stock state in phase space (specified by the stock daily return and its time derivative, the roughness) always
contains a quantifiable information component that can neither be predicted from any available data (whether historical, new, or hidden "insider" information), nor has any repercussion for the future stock behavior, but which holds on to the ephemeral present only ("ephemeral information", see Sect. 4). Although the price remains profoundly uncertain, uncertainty does not affect the global ability of the financial market to correct itself by instantly changing to reflect new public information.

We investigate the detailed 5-year history (2013–2018) of the daily stock prices of the companies selected by the S&P 500 committee according to the company's market capitalization (which must be greater than or equal to $6.1 bln), liquidity, domicile, public float, sector classification, financial viability, length of time publicly traded, and stock exchange. These companies are representative of the industries in the United States economy, with indexed assets comprising approximately $3.4 trln, capturing approximately 80% coverage of the available market capitalization. Based on the available data, we reconstruct a discrete model of the S&P 500 phase space, in which every stock is characterized by its daily raw return and raw roughness, and the stock market state corresponds to a certain distribution of stocks in phase space. We study the predictability of market states and individual stock prices in phase space.

The novelty of our approach to forecasting is twofold. First, we have developed the novel concept of predictability of states in phase space. Our approach is based on the idea that with some probability a state might be a t-day precursor of another state in phase space. The introduced measure of t-day predictability of a state is the sum of all information components from its t-day precursors. From such a point of view, Markov chains are characterized by the maximum predictability, as any state of a Markov chain is a predictor for any other state of the chain for all t with probability 1. The total amount of predictable information for any state of the Markov chain equals the total information content of the chain. Second, we have proposed a novel methodology for a quantitative assessment of the amounts of predictable and unpredictable information in any stochastic process, whenever an empirical transition matrix between the states observed over a certain time horizon becomes available. The proposed technique might be seen as an extension of the famous Ulam's method [4] for approximating invariant densities of dynamical systems.

The following conclusions can be drawn from our study:
1. Long-range forecasts based on the stock price history are more efficient for frequently observed events than for rare events, such as market crashes.
2. Short-range forecasts (1–3 days ahead) of sudden stock price changes might be efficient if and only if the predicted states result from processes of exponential decay/growth in time.
3. Stocks have different 'mobility' in phase space, which is irrelevant to the S&P 500 capitalization weight of the company. The time series for highly mobile stocks contain more predictable information, required for the efficient forecasting of the future states of the stocks in phase space, than the time series corresponding to the less mobile stocks.
Interestingly, the relations between the amounts of predictable and unpredictable information in the daily stock prices of the individual companies, as well as between the amounts of predictable information about the future state of a stock available from the historical data and from the present state alone, resemble those in unfair coin tossing, where each state repeats itself with probability 0 ≤ p ≤ 1.
2 Data Source, Data Validation and Preparation

The analysis of the kinematic time series associated with the S&P 500 index has been performed based on the publicly available data collected during 2718 trading days, in the period from January 3, 2000, till October 19, 2018 (see Fig. 1). The predictability analysis was based on the detailed data on the individual daily stock prices of the S&P 500 constituents from December 8, 2013 till September 9, 2018 (1259 trading days), acquired using The Investor's Exchange API; the Python script is available at https://github.com/CNuge/kaggle-code/blob/master/stock_data. We have chosen the largest detailed data set publicly available for free and verifiable from different data sources. The data set contains, for most of the S&P 500 companies during the studied period, the price of the stock at market open, the highest price reached in the day, the lowest price in the day, the price of the stock at market close, and the number of shares traded, labeled by the ticker names. We have used the individual stock prices at market close, since this information was always present in the data set. Moreover, in order to preserve consistency of our data, only 468 companies were considered; those selected were present in the S&P 500 index throughout the entire observation period, and comprise 98.78% of the S&P 500 total market weight.
Fig. 1 The S&P 500 Index of 500 stocks used to measure large-cap U.S. stock market performance. For the year 2008, the S&P 500 fell 38.49%, its worst yearly percentage loss
The other 37 constituents were excluded from the study because their presence in the index was inconsistent and their stock prices were corrupted. The data set has been verified using Yahoo Finance. Computations were made using Python's numerical libraries, such as NumPy and Pandas, as well as Maple and Matlab, to ensure the consistency of computations and eliminate possible errors.
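A minimal sketch of this preparation step is given below; it assumes the downloaded table has one row per ticker and day, and the file name and column names ('date', 'Name', 'close') are assumptions about the layout, not a documented schema.

```python
import pandas as pd

def load_close_prices(csv_path="all_stocks_5yr.csv"):
    """Return a (dates x tickers) table of daily closing prices."""
    raw = pd.read_csv(csv_path, parse_dates=["date"])
    close = raw.pivot_table(index="date", columns="Name", values="close")
    # keep only tickers present on every trading day of the observation window
    return close.dropna(axis=1, how="any")

# usage (commented out, since it requires the downloaded data file):
# prices = load_close_prices(); print(prices.shape)
```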
3 The Discrete Model of Standard & Poor's 500 Phase Space

We use the classical concept of phase space as the collection of all states specified by the observed values of the 1-day raw return (as a 'position'),

R(t) = \frac{Price(t+1) - Price(t)}{Price(t)},    (1)

where Price(t) is the price of the stock at market close and t runs over trading days, and of its discrete time derivative,

\dot{R}(t) = R(t+1) - R(t),    (2)

the raw roughness (as a 'momentum') [5]. A stock is represented by a point in phase space according to its daily return and roughness. The stock's price evolving over time traces a path in phase space (a trajectory) representing the set of states in a phase plot (see Fig. 2). The state of the market corresponds to a certain distribution and coherent movements of points in phase space.
An Unfair Coin of the S & P 500
17
For the sake of computational feasibility, the 5-year range of return values (minimum −0.53405 and maximum of 0.34336) as well as 5-year range of roughness values (minimum −0.59824 and maximum 0.70596) were divided into 50 intervals of equal lengths (of 0.017548 and 0.02608, respectively). The discrete model of market phase space was reconstructed as the ordered set of 2500 distinct ˙ where {R} and {R} ˙ are the sets of intervals of returns and cells, (r, r˙ ) ∈ {R} × {R} roughness, respectively. Our approach to reconstructing market phase space is substantially different from the previous works, in which a phase space was discussed as a metaphor of either all possible market events “where the transfer of rights to real estate takes place and decision-making processes occur” [6], or a collection of recurrence plots containing the information on the systemic trajectory repetition in exchange market times series [7]. While studying the dynamics of the S&P 500 index, we use the log –return and log—roughness (the discrete time derivative of log–return) instead of the raw— return and—roughness values used to reconstruct the phase space.
4 Methods In our work, we use both the raw– return and—roughness (1) and (2) for our analysis. It is important to mention that there is not a one-to-one relationship between mean logarithmic and mean simple returns. The mean of a set of returns calculated using logarithmic returns is less than the mean calculated using simple returns by an amount related to the variance of the set of returns [8].
4.1 Predictability of Stock Prices in Market Phase Space The degree to which a correct prediction of the future value of a company stock or other financial instrument traded on an exchange can be made is an important open question for an intelligent investor. The stock market is prone to the sudden dramatic declines of stock prices driven by panic as much as by underlying economic factors [9]. Although the behavior of stock prices might be highly uncertain, the amount of uncertainty can be quantified, and the relevant degree of predictability can be assessed from publicly available information. Methods discussed in the present section can be used for a quantitative assessment of the amounts of predictable and unpredictable information in any stochastic process—whether Markovian or not—whenever an empirical transition matrix between the states observed over a certain time horizon becomes available. The proposed methodology resembles the rigorous numerical scheme for approximating invariant densities of dynamical systems, Ulam’s method [4], and can be adopted to work with diverse, multimodal data sets.
18
4.1.1
D. Volchenkov and V. Smirnov
Information Content of Cells in Market Phase Space
The amount of uncertainty associated to an event is related to the probability distribution of the event. Once the event X has been observed, some amount of uncertainty is removed, and the relevant amount of information is released. A highly uncertain event X, occurring with small probability P (X) 1, contains more information than frequent events. The information content of the event X occurring with probability P (X) is J(X) = − log2 P (X)
(3)
measured in bits [10]. The information content of tossing a fair coin where the probabilities of heads and tails are p = 0.5 equals − log2 0.5 = 1 bit. 4.1.2
T-Days Mutual Information Between Cells in Market Phase Space
Mutual information measuring the dependence between two random variables had been introduced by Cover and Thomas [11] as a measure of predictability of stochastic states in time, I (t) =
t
t
P (X − → Y ) log2
{X,Y }
P (X − → Y) , P (X)P (Y )
(4)
t
where P (X − → Y ) is the empirical probability found by dividing the number of times the transition between the X and Y cells occurred precisely in t days by the total number of observed transitions from X, P (X) and P (Y ) are the marginal probabilities (of observing the stock in X and Y , independently of each other), and summation is performed over all possible pairs of cells X and Y . If the t transition probability P (X − → Y ) is statistically independent of the cells X and Y , the amount of mutual information associated with such a pair of cells is zero. Mutual information decreases monotonically with time for a stationary Markov process [11]. 4.1.3
T-Days Precursors and Predictability of Market States
Bayes’ theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event [12]. Given that a stock is in the phase space elementary cell X, it may be a t-days precursor of that in t days the stock will be in the cell Y with the following probability: t
Pt (X|Y ) =
→ Y ) P (X) P (X − P (Y )
(5)
An Unfair Coin of the S & P 500
19
t
where P (X − → Y ) is the observed probability of transition of a stock from the phase space cell X to Y precisely in t days; P (X), P (Y ) are the marginal probabilities of X and Y , respectively. Pt (X|Y ) can be interpreted as a density of the t-days precursors for the event Y in phase space. If Pt (X|Y ) is the same as P (X), then Y is unpredictable (i.e., any X is a precursor for Y ). The total amount of available information about the forthcoming event Y can be gained from observing all t-day precursors X is measured by the Kullback-Leibler (KL) divergence [11], Pt (Y ) =
{X}
Pt (X|Y ) → Y) P (X − → Y) P (X − = log2 . P (X) P (X) P (Y ) P (Y ) t
Pt (X|Y ) log2
t
{X}
(6) The KL–divergence (6) has the form of relative entropy [11] that vanishes if and only if the density of the t-days precursors for the event Y in phase space Pt (X|Y ) is identical to P (X). This implies that observation of stock in X is statistically independent of its observation in Y t-days later, i.e. X is not a precursor for Y . Relative entropy was proposed as a measure of predictability in [13, 14], and we consider the relative entropy (6) as a measure of predictability for the forthcoming event Y t days ahead. If the discussed process was Markovian, the marginal probability P (X) is the t major eigenvector of the transition matrix P (X − → Y ), and the probability (5) is Pt (X|Y ) = 1, for all t and any Y . For a Markov chain, any state X is a predictor for any other state Y for all t with probability 1 (5). Then the total amount of predictable information (6) for any state Y of the Markov chain equals PM = −
log2 P (X) =
{X}
J(X),
(7)
{X}
the total information content of the chain. Therefore, the amount of predictable information for the forthcoming state Y normalized to the maximum predictability PM possible, Mt (Y ) =
Pt (Y ) PM
(8)
can be considered as the degree of Markovianity of the state Y in the dynamical process, over the time horizon t. The degree of Markovianity (8) might vary greatly over the process states for the different transition duration t.
20
D. Volchenkov and V. Smirnov
4.2 Predictable and Unpredictable Information in t-Days Transitions t
The empirical transition probability P (X − → Y ) between a finite number of states X and Y historically observed over the time horizon t (days) defines a finite state discrete Markov chain in phase space although the actual dynamics of S&P 500 stock market is neither Markovian, nor even stationary. For every time horizon t, the t corresponding transition matrix P (X − → Y ) is characterized by some uncertainty of a state that can be decomposed into the quantifiable amounts of predictable and unpredictable information [15]. Uncertainty of a state of a Markov chain measured by Shannon’s entropy consists of three independent information components, H = D + U + E,
(9)
where D (“downward causation” [15]) characterizes our capability to predict the forthcoming state of the chain from the past states; U (“upward causation” [15]) characterizes our capability to guess the future state from the present one; and the amount of information E (“ephemeral information” [16]) which can neither be predicted from the past, nor has repercussions for the future. For example, tossing an unfair coin, in which each state (’heads’ or ‘tails’) repeats itself with the probability 0 ≤ p ≤ 1, is described by the Markov chain p 1−p with the transition probabilities T(p) = (see Fig. 3a). For p = 1 1−p p and p = 0, the Markov chain generates the constant sequences of symbols, viz.,
Fig. 3 (a) The state diagram for tossing an unfair coin, in which each state (’heads’ or ‘tails’) repeats itself with the probability 0 ≤ p ≤ 1. (b) The information components D(p), the past-future mutual information (excess entropy [16]); U(p), the conditional mutual information available at the present state of the chain and relevant to the future states; E(p), the ephemeral information existing only in the present state of the chain, being neither a consequence of the past, nor of consequence for the future
An Unfair Coin of the S & P 500
21
. . . 0, 1, 0, 1, 0, (for p = 0), and . . . 1, 1, 1, 1, 1, or . . . 0, 0, 0, 0, 0 (for p = 1). When p = 1/2, the Markov chain Fig. 3a represents tossing a fair coin. The densities of both states are equal, π = [1/2, 1/2], so that throwing an unfair coin to choose between ‘heads’ and ‘tail’ reveals a single bit of information as quantified by Shannon’s entropy, H (Xt ) = − 2i=1 πi log2 πi = − log2 12 = 1, for any value of p [15]. The information associated to the downward causation process, D(p) ≡ H (Xt ) − H (Xt+1 |Xt ) = −
2
i=1 πi
log2 πi −
2
j =1 Tij
log2 Tij
= −p log2 p − (1 − p) log2 (1 − p), (10) quantifies the amount of uncertainty that is released from observation of the sequence of past states of the chain. When p = 0 or p = 1, the series is stationary, and the forthcoming state is determined by observing the past states, so that D(0) = D(1) = 1 bit. When tossing a fair coin (p = 1/2), the information component (10) vanishes (D(1/2) = 0), and the forecast of a future state by observing the historical sequence of symbols loses any predictive power (see Fig. 3b). Although our capability to predict the forthcoming state by observing the historical sequences weakens as p > 0 or p < 1, we can now guess the future state of the chain directly from the presently observed symbol (by alternating / repeating the current symbol with the probability p > 0, or p < 1, respectively). The related information component (’upward causation’) quantifying the goodness of such a guess is the mutual information between the present and future states of the chain conditioned on the past [15], U = H (Xt+1 |Xt−1 ) − H (Xt+1 |Xt−1 ) , viz.,
U(p) = 2i=1 πi 2j =1 Tij log2 Tij − (T 2 )ij log2 (T 2 )ij = p log2 p + (1 − p) log2 (1 − p) − 2p(1 − p) log2 2p(1 − p)
− p2 + (1 − p)2 log2 p2 + (1 − p)2 .
(11)
Our capability to predict the future symbol from the present alone (as quantified by the mutual information (11)) increases as p > 0 (p < 1) till p ≈ 0.121 (p ≈ 0.879) when the effect of destructive interference between the obviously incompatible guesses on alternating the current symbol at the next step (for p > 0) and on repeating the current symbol (for p < 1) causes the attenuation and complete cancellation of this information component in the case of fair coin tossing (U(1/2) = 0) (see Fig. 3b). The remaining conditional entropy, E ≡ H (Xt |Xt+1 , Xt−1 ) , quantifies the portion of uncertainty that can neither be predicted from the historical data, nor having repercussions for the future, viz., E(p) = −2p log2 p − 2(1 − p) log2 (1 − p) + 2p(1 − p) log2 2p(1 − p)
+ p2 + (1 − p)2 log2 p2 + (1 − p)2 . (12)
22
D. Volchenkov and V. Smirnov
Since all information shared between the past and future states in Markov chains goes only through the present, the mutual information between the past states of the chain and its future states conditioned on the present moment is always trivial: I (Xt−1 ; Xt+1 |Xt ) = 0 [15, 16]. We apply the information decomposition (9) for the empirical transition probat bility matrices P (X − → Y ) observed over the time horizon t in order to estimate the amounts of predictable and unpredictable information in the S&P 500 stock market dynamics as the functions of transition duration t. We also use the information decomposition (9) for estimating the amounts of predictable and unpredictable information in the empirical transition matrices of every company in the S&P 500 list over the 5 years of historical data available to us.
5 Results We discuss the degree of predictability and the associated amounts of predictable and unpredictable information released in the empirically observed transitions of stock prices of the S&P 500 companies in stock market phase space over the variable time horizon (2013–2015).
5.1 Visualizing the Stock Market Crashes, Rallies, and Market Tumbling on Shocking Events The state of the US economy can be visualized by daily snapshots of the S&P 500 stocks in phase space as displayed in Fig. 4. Standard & Poor’s calculates the market capitalization weights (currently ranging from 0.00753 for the National Weather Service to 3.963351 for the Apple company) using the number of shares available for public trading. Movements in the prices of stocks with higher market capitalization (the share price times the number of shares outstanding) had a greater impact on the value of the index than do companies with smaller market caps. In Fig. 4, we indicate the capitalization weights of companies by the radii of circles and jet-colors for convenience of the reader. Market crashes reveal themselves as a dramatic decline of stock prices across the market when the most of stocks appear in the ‘red zone’ of phase space (Fig. 4a) characterized by the negative return and often negative roughness values, resulting in a significant loss of paper wealth. During the periods of sustained increases in the prices of stocks (i.e., market rallies), the stocks overwhelmingly move to the ‘green zone’ of phase space, with positive returns and often positive roughness (Fig. 4b). Market tumbling on shocking events constitutes a 2-day synchronization phenomenon (Fig. 4c and e; d and f). On the first day, many stocks across the market get synchronized on the zero–return value that becomes visible as a vertical line emerging on the phase space snapshots (Fig. 4c and d). On the second day, the
An Unfair Coin of the S & P 500
23
Fig. 4 The daily snapshots of S&P 500 stocks in phase space during (a) a stock market crush; (b) a stock market rally; (c) , (e), (d) and (f) a market tumbling phenomenon. The regions of phase space characterized by the positive/ negative values of return are colored by ‘green’ and ‘red’, respectively)
24
D. Volchenkov and V. Smirnov
stock market would become essentially volatile, plunging and snapping back due to coherent sells-off / ramping of stocks that might be affected by the announced news. Indeed, the stock drops may result in the rise of stock prices for corporations competing against the affected corporations. Interestingly, a certain degree of coherence in the apparently volatile market movements might persist for a day as it seen from the continued alignment of stocks in phase space visible on the second day of the market recovery process (Fig. 4e and f). The first tumbling event shown on Fig. 4c and e had occurred on Thursday, 07/18/2014, when a Malaysia Airlines plane (flight MH17) headed from Amsterdam to Kuala Lumpur carrying 298 passengers had been shot down by a Russian military unit invading eastern Ukraine. The second tumbling is observed on August 4, 2014 (see Fig. 4d and f) at the coincidence of alarming events, including Argentina’s default on bond payments, more sanctions against Russia backed by EU and US in response to the downing of flight MH17, and the Islamic State seizure of fifth Iraqi oil field. Within that week, the S&P 500 lost 2.69%, the Dow fell 2.75%, and the Nasdaq slid 2.18%.
5.2 Phase Portrait, t–Days Predictability of States, and t–Days Mutual Information in S&P 500 Phase Space Not all cells in market phase space are visited equally often by stock prices. The central region, about the zero-return zero-roughness equilibrium point was the most visited of all (see Fig. 5a) during 5 years of observations. This observation is in line with the Mean reversion theory suggesting that asset prices and returns eventually return back to the long-run mean or average of the entire data set [17]. Consequently, the information content is minimal for the frequently visited cells located in the central region, but is high for the periphery cells rarely visited by stocks (Fig. 5b). The phase space cells corresponding to market crushes are characterized by the maximum information content. Uneven attendance of cells in S&P 500 phase space results in the inequitable degree of predictability of the corresponding events. The predictability of states calculated as the KL–divergence (6) between the density of t-days precursors for an event and the marginal density of states P (X) across phase space changes with time (see Fig. 5c and d). Summarizing on shortrange predictions based on observation of all possible 1-day precursors for every cell in phase space, we obtain (see Fig. 5c) that the most predictable states are ˙ of the phase portrait in S&P 500 phase aligned along the ‘main diagonal’ (R ∝ R) space, roughly corresponding to the exponential growth (decay) processes with a single unstable equilibrium at the zero-return zero-roughness point (0, 0). The states out of the ’main diagonal’ shown in Fig. 5c cannot be predicted efficiently. Long-range forecasts predicting the behavior of stocks for more than 3 days in advance follow the different predictability pattern shown in Fig. 5d. The efficiently predictable states for long-range forecasts are located overwhelmingly in the central region of the phase portrait, about the zero-return zero-roughness unstable
Fig. 5 (a) The color-coded histogram indicating the number of visits of the phase space cells throughout the entire period of observations; (b) The color-coded histogram representing the information content of phase space cells; (c) The predictability of states based on 1-day precursors; (d) The predictability of states based on 24-day precursors
In Fig. 5d, we have presented the predictability pattern based on 24-day precursors. This observation also supports the general wisdom of the mean reversion theory [17]. Long-range predictions of infrequent events (such as market crashes) situated on the periphery of the phase diagram should be seen as a rough guide, as the accuracy of such predictions falls considerably beyond the few-days mark. The degree of Markovianity (8) estimated by the predictability of market states in the S&P 500 phase space never exceeds 1%. The decay rate of time-dependent mutual information is an important characteristic of the fluctuations of information flow in a complex system [18]. In the case of small fluctuations of information flow, the time-dependent mutual information decays linearly in time; but the decay might be slower than linear in the case of large fluctuations of the information flow in the system [18]. Mutual information decays monotonically with time for Markov processes [11]. In formal languages, the mutual information between two symbols, as a function of the number of symbols between the two, decays exponentially in any probabilistic regular grammar, but can decay like a power law for a context-free grammar [19]. It is worth mentioning that the decay of mutual information in time (4) calculated over the S&P 500 phase space is very heterogeneous and even non-monotonic.
Fig. 6 t-days mutual information of the S&P 500
Fig. 7 The shares of predictable and unpredictable information in the uncertainty of a stock market state with respect to the t-days transitions
The value of mutual information plunges within the first 3 days to approximately 7 Kbits and then gets stuck there, at least up to the 100-day mark (see Fig. 6). The observed non-monotonic decay of mutual information may indicate the presence of long-lasting large fluctuations of information flow that might be attributed to stock market bubbles resulting from groupthink and herd behavior [20]. The shares of predictable information in the uncertainty of a stock market state with respect to the t-days transitions also plunge within the first 3 days, but then stabilize (see Fig. 7).
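The t-days mutual information of Fig. 6 can be estimated from the same discretized data by histogramming joint occurrences of cells separated by t days; a minimal sketch, under the same assumed data layout as above, is given below.

```python
import numpy as np

def lagged_mutual_information(cells, t, n_cells, eps=1e-12):
    """Mutual information (in bits) between the cell occupied on day d and the
    cell occupied on day d + t, pooled over all stocks and days."""
    x = cells[:-t].ravel()            # states today
    y = cells[t:].ravel()             # states t days later
    joint = np.zeros((n_cells, n_cells))
    np.add.at(joint, (x, y), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    return float(np.sum(joint * np.log2((joint + eps) / (px * py + eps))))

# Decay curve of the t-days mutual information (synthetic data for illustration).
rng = np.random.default_rng(1)
cells = rng.integers(0, 100, size=(1250, 468))
mi_curve = [lagged_mutual_information(cells, t, n_cells=100) for t in range(1, 11)]
```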
5.3 Predictable and Unpredictable Information in Company Stock Prices. Stocks of Different Mobility

An intelligent investor is interested in diversification of the portfolio across stocks to reduce the amount of unsystematic risk of each security. Unsystematic risk (related to the occurrence of undesirable events) might be associated with high shares of unpredictable (ephemeral) information in daily stock prices. To quantify the risk levels in market phase space, we have investigated the shares of predictable and unpredictable information in the transitions of stocks between the cells in phase space for all 468 studied companies from the S&P 500 (Fig. 8). We have analyzed the empirical 1-day transition matrices for the stocks of individual companies from the S&P 500 list observed in the last 5 years. Every transition matrix gives us three values (D, U, E) quantifying the predictable and unpredictable information components in the uncertainty of a stock state with respect to 1-day transitions. In Sect. 4.2, we discussed that the transition matrices for unfair coin tossing can be parametrized by the state repetition probability, so that for every value of 0 ≤ p ≤ 1 we get a triplet (D, U, E)(p), shown in Fig. 3. While working with the S&P 500 data, we have parametrized the (D, U, E)-triplets by the uncertainty of the daily stock price assessed by its Shannon entropy H, producing the curves shown in Fig. 8. The functional analogy between the state repetition probability 0 ≤ p ≤ 1 for tossing an unfair coin and the uncertainty of the daily stock price of a company can be intuitively understood in terms of stock "mobility" in phase space.
Fig. 8 The shares of transition information predictable from the historic time series (D), from the present observation (U), and ephemeral information (E) vs. Shannon’s entropy H measuring uncertainty of daily return for every studied company. The lines represent the polynomial trends for the empirically observed relations
The abundance of trajectory patterns pertinent to highly mobile stocks provides more valuable information for the efficient prediction of future states. On the contrary, from the forecasting perspective, the behavior of low-mobility stocks is prone to the high unsystematic risk of unpredictable events, like tossing a fair coin. Since the share of predictable information in the uncertainty of a stock market state is maximal for the 1-day transition (Fig. 7), we study the 1-day transition dynamics of the stock prices of individual companies from the S&P 500 list. The empirically observed relations shown in Fig. 8 demonstrate that the amount of predictable information in the daily transitions of stocks (D + U) grows, while the amount of unpredictable information (E) decays monotonically, with Shannon's entropy H measuring the uncertainty of the daily stock price for every studied company. Unpredictable information associated with unsystematic risk is maximal (exceeding both predictable information components) for the companies characterized by the lowest entropy values. In contrast, the amount of unpredictable information practically vanishes for the companies characterized by the highest uncertainty of daily stock prices, which can be mostly resolved by observation of the historic time series, as indicated by the dominance of the information component D (see Fig. 9). Although entropy is commonly associated with the amount of disorder, or chaos, in a system, our conclusion does not appear as a paradox when judged from the everyday perspective. Stocks have different 'mobility' in phase space, as quantified by the inequitable uncertainty of daily return (Shannon's entropy) H in market phase space. In statistics, entropy is related to the abundance of 'microscopic configurations' (i.e., trajectories of stocks updated daily) that are consistent with the observed 'macroscopic quantities' (i.e., the density of cells obtained over the whole period of observation) [21]. Highly mobile stocks increase the degree to which the probability of visiting a cell is spread out over different possible trajectories in market phase
Fig. 9 Uncertainty of daily stock price measured by Shannon’s entropy H along with the information components, D, U and E, for 468 major companies from the S&P 500
space, providing more data valuable for the efficient forecasting of future states; the more such trajectories are available to a stock with appreciable probability, the greater its entropy. On the contrary, stocks of low mobility, characterized by low levels of Shannon's entropy, do not accumulate enough data for efficient predictions of future states. The shares of information components in complex systems are naturally related to each other. The amount of unpredictable information decreases when the amount of predictable information increases; information predictable from the present state of a system alone is minimal when the future state can be determined from observation of the historic data. Interestingly, there are remarkable similarities between the information relations observed across the S&P 500 stock market and in unfair coin tossing (Fig. 3a), discussed in Sect. 4.2. In Fig. 10a and b, we have presented the relations
Fig. 10 An "unfair coin" of the S&P 500 stock market. (a) Unpredictable (E) vs. predictable information (D + U) in unfair coin tossing (Fig. 3a); (b) Predictable information available from the present state alone (U) vs. predictable information available from the past series (D) in unfair coin tossing (Fig. 3a); (c) Unpredictable (E) vs. predictable information (D + U) in the S&P 500; (d) Predictable information available from the present state alone (U) vs. predictable information available from the past series (D) in the S&P 500
between the amounts of unpredictable (E) and predictable information (D + U) and between D and U in unfair coin tossing: these relations are linear and parabolic, respectively. The corresponding relations between the information components observed across the S&P 500 stock market are shown in Fig. 10c and d. Following the analogy between the behavior of stocks in the S&P 500 phase space and tossing an unfair coin, we may say that stocks highly mobile in phase space, characterized by high entropy values, correspond to the marginal values p ≈ 0 or p ≈ 1 for the probability to repeat a symbol in unfair coin tossing. In both cases, the abundance of patterns in symbolic time series provides more valuable information for the efficient prediction of future states. On the contrary, from the forecasting perspective, the behavior of 'low mobility' stocks prone to the high unsystematic risk of undesirable events is somewhat similar to tossing a fair coin.
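For a single company, the quantities entering Figs. 8, 9 and 10 start from the empirical 1-day transition matrix between phase-space cells and the Shannon entropy H of its daily states. The sketch below shows only this preparatory step (the decomposition into the D, U and E components of Ref. [16] is not reproduced here); the data layout is again an assumption.

```python
import numpy as np

def one_day_transition_matrix(seq, n_cells):
    """Row-stochastic empirical 1-day transition matrix for one stock's
    sequence of daily phase-space cell indices."""
    counts = np.zeros((n_cells, n_cells))
    np.add.at(counts, (seq[:-1], seq[1:]), 1.0)
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

def shannon_entropy(seq, n_cells):
    """Uncertainty H (in bits) of the daily state of one stock."""
    p = np.bincount(seq, minlength=n_cells) / len(seq)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

rng = np.random.default_rng(2)
seq = rng.integers(0, 100, size=1250)          # synthetic daily cell indices
T = one_day_transition_matrix(seq, n_cells=100)
H = shannon_entropy(seq, n_cells=100)
```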
6 Discussion and Conclusion

We have reconstructed the discrete model of the S&P 500 phase space and studied the predictability of the cells and individual S&P 500 constituents. Certain distributions and coherent movements of points in phase space are associated with particular states of the market. The inhomogeneous density of states in the S&P 500 phase space results in their inequitable predictability: more frequent market events are predicted more efficiently than relatively rare events, especially in the long run. The heterogeneous and even non-monotonic decay of mutual information in time might be attributed to strong fluctuations of information flow across the S&P 500 phase space arising from stock market bubbles. The mobility of a stock in phase space can be quantified by the entropy measuring the uncertainty of the daily return of the company. Highly mobile stocks characterized by high entropy values provide more data valuable for the efficient forecasting of future states and are therefore associated with lower unsystematic risk than stocks of 'low mobility'. High stock mobility is usually unrelated to the market capitalization weight of the company. WEC Energy Group Inc., providing electricity and natural gas to 4.4 million customers across four states through its brands, is characterized by the highest value of entropy (HWEC = 8.377 bit) among all S&P 500 constituents. Entergy Corporation, an integrated energy company engaged in electric power production and retail electric distribution operations, and the CMS Energy Corporation share the second and third places with 8.352 bit and 8.279 bit, respectively. For comparison, Apple Inc., the leader of the S&P 500 by market capitalization weight (3.963351), is in the 120th place with HAAPL = 7.5526 bit. We confirm a great deal of the wisdom shared with common investors by Benjamin Graham in his famous book [2]. Reliable long-range predictions of rare events, such as market crashes, are hardly possible. Companies that offer a large margin of safety are not always those of the highest S&P 500 market capitalization
weight. In the last 5 years, the companies of low unsystematic risk are found in the energy industry, forming the bedrock of our society. We believe that our work may help to resolve a logical paradox related to the EMH: if no investor had any clear advantage over another, how could there be investors, such as Benjamin Graham and his disciple Warren Buffett, who have consistently beaten the market?

Acknowledgments VS is grateful to the Department of Mathematics and Statistics, Texas Tech University, for the support during the Summer semesters of 2018.
References

1. S.P. Greiner, Ben Graham was a Quant: Raising the IQ of the Intelligent Investor. Wiley Finance Series (Wiley, Hoboken, 2011)
2. B. Graham, The Intelligent Investor (Harper & Row, New York, 1949, 1954, 1959, 1965)
3. E. Fama, The behavior of stock market prices. J. Business 38, 34–105 (1965)
4. S.M. Ulam, Problems in Modern Mathematics (Interscience, New York, 1964)
5. D.D. Nolte, The tangled tale of phase space. Phys. Today 63(4), 33–38 (2010)
6. A. Radzewicz, Real estate market system theory approach phase space. Real Estate Manag. Val. 21(4), 87–95 (2013)
7. C.-Z. Yao, Q.-W. Lin, Recurrence plots analysis of the CNY exchange markets based on phase space reconstruction. North Amer. J. Econ. Finance 42, 584–596 (2017)
8. R. Hudson, A. Gregoriou, Calculating and Comparing Security Returns is Harder than you Think: A Comparison between Logarithmic and Simple Returns. Available at SSRN: https://ssrn.com/abstract=1549328 (2010)
9. J. Galbraith, The Great Crash 1929 (Houghton Mifflin, Boston, 1988 edition)
10. T. Sun Han, K. Kobayashi, Mathematics of Information and Coding (American Mathematical Society, Providence, 2002)
11. T.M. Cover, J.A. Thomas, Elements of Information Theory (Wiley, Hoboken, 1991)
12. P.M. Lee, Bayesian Statistics (Wiley, Hoboken, 2012). ISBN 978-1-1183-3257-3
13. T. DelSole, Predictability and information theory. Part I: measures of predictability. J. Atmos. Sci. 61, 2425–2440 (2004)
14. R. Kleeman, Measuring dynamical prediction utility using relative entropy. J. Atmos. Sci. 59, 2057–2072 (2002)
15. D. Volchenkov, Grammar of Complexity: From Mathematics to a Sustainable World. World Scientific Series in Nonlinear Physical Science (World Scientific, Singapore, 2018)
16. R.G. James, Ch.J. Ellison, J.P. Crutchfield, Anatomy of a bit: Information in a time series observation. Chaos 21, 037109 (2011)
17. B. Mahdavi Damghani, The non-misleading value of inferred correlation: an introduction to the cointelation model. Wilmott 1, 50–61 (2013)
18. K. Kaneko, I. Tsuda, Complex Systems: Chaos and Beyond: A Constructive Approach with Applications (Springer, Berlin, 2011)
19. H.W. Lin, M. Tegmark, Criticality in formal languages and statistical physics. Entropy 19, 299 (2017)
20. V.L. Smith, G.L. Suchanek, A.W. Williams, Bubbles, crashes, and endogenous expectations in experimental spot asset markets. Econometrica 56(5), 1119–1151 (1988)
21. J.P. Sethna, Statistical Mechanics: Entropy, Order Parameters, and Complexity (Oxford University Press, Oxford, 2006)
Relativistic Chaotic Scattering

Juan D. Bernal, Jesús M. Seoane, and Miguel A. F. Sanjuán
AMS (MSC 2010) 34H10, 65T60, 74H45
1 Introduction

Chaotic scattering in open Hamiltonian systems has been a broad area of study in nonlinear dynamics, with applications in numerous fields in physics (see Refs. [1, 2]). This topic is essentially defined by a scattering region where there are interactions between incident particles and a potential. Outside this region the influence of the potential on the particles is negligible and the motions of the incident particles are uniform. For many applications of physical interest, the equations of motion of the test particles are nonlinear and the resultant dynamics is chaotic in the scattering region. Therefore, slightly different initial conditions may describe completely different trajectories. Since the system is open, this region possesses exits through which the particles may enter or escape. Quite often, particles starting in the scattering region bounce back and forth for a finite time before escaping. In this sense, chaotic scattering can be presented as a physical manifestation of transient chaos [3, 4].
J. D. Bernal · J. M. Seoane Nonlinear Dynamics, Chaos and Complex Systems Group, Departamento de Física, Universidad Rey Juan Carlos, Madrid, Spain e-mail: [email protected]; [email protected] M. A. F. Sanjuán () Nonlinear Dynamics, Chaos and Complex Systems Group, Departamento de Física, Universidad Rey Juan Carlos, Madrid, Spain Department of Applied Informatics, Kaunas University of Technology, Kaunas, Lithuania Institute for Physical Science and Technology, University of Maryland, College Park, MD, USA e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 D. Volchenkov, J. A. Tenreiro Machado (eds.), Mathematical Methods in Modern Complexity Science, Nonlinear Systems and Complexity 33, https://doi.org/10.1007/978-3-030-79412-5_3
Using the Newtonian approximation for modeling the dynamics of the system is the most widely accepted convention in physics and engineering applications when the speed of objects is low compared to the speed of light [5]. Nevertheless, if the dynamics of the system is really sensitive to the initial conditions, the trajectories predicted by the Newtonian scheme rapidly disagree with the ones described by the special relativity theory (see Refs. [6–9]). Recently, some results [10] have pointed out that global properties of dynamical systems, such as the dimension of the nonattracting chaotic invariant set, are more robust, and that the Newtonian approximation actually provides accurate enough results for them in slow chaotic scattering motion. The first goal of this chapter is to show that there are relevant global properties of chaotic scattering systems that do depend on the effect of the Lorentz transformations, and that we should consider the special relativity scheme if we want to describe them in a realistic manner, even for low velocities. Specifically, we focus our study on both the average escape times and the decay law of the particles from the scattering region, which are quite important global properties of scattering systems. These results were presented in Ref. [11]. Furthermore, we analyze in detail some key properties that characterize the exit basin topology of this kind of system: the uncertainty dimension, the Wada property and the basin entropy. This is important since the exit basin topology is useful to achieve a priori very rich global insights. For example, once we know the exit basin topology of a system, we can infer the degree of unpredictability of the final state of the system by just knowing some approximate information about the initial conditions. The research on the exit basin topology was published in Ref. [12]. The authors of the present work have shown in the past the effect of external perturbations, such as noise and dissipation, in some Hamiltonian systems (see for example Ref. [13]). It is worth highlighting that the consideration of the relativistic framework in the system dynamics cannot be regarded as an external perturbation like noise or dissipation, although the global properties of the system also change. From now on, we will refer as relativistic to any effect where the Lorentz transformations have been considered. Likewise, we say that any property or object is nonrelativistic or Newtonian when we do not take into consideration the Lorentz transformations but the Galilean ones. This chapter is organized as follows. In Sect. 2, we describe our prototype model, the relativistic Hénon-Heiles system. The effects of the Lorentz transformation on the average escape time of the particles and their decay law are studied in Sect. 3. In Sect. 4, we give a heuristic reasoning based on energetic considerations to explain the results obtained for both global properties. Moreover, here we characterize the decay law of the relativistic particles. In Sect. 5, we show the effects of the Lorentz transformation on the uncertainty dimension of a typical scattering function. A qualitative description of the influence of special relativity on the exit basins is shown in Sect. 6. Additionally, we use the basin entropy to quantify the uncertainty of the system based on the study of the exit basin topology. Then, we understand the different sources of uncertainty in the relativistic Hénon-Heiles system provoked by the Lorentz transformations.
A discussion and the main conclusions of this chapter are presented in Sect. 8.
2 Model Description

We focus our study on the effects of the relativistic corrections in a paradigmatic chaotic scattering system, the Hénon-Heiles Hamiltonian. The two-dimensional potential of the Hénon-Heiles system is defined by

$$ V(x, y) = \frac{1}{2} k (x^2 + y^2) + \lambda \left( x^2 y - \frac{1}{3} y^3 \right). \qquad (1) $$
We consider the Hénon-Heiles potential in a system of units where k = λ = 1, see Ref. [14]. Its isopotential curves can be seen in Fig. 1. Due to the triangular symmetry of the system, the exits are separated by an angle of 2π/3 radians. We call Exit 1 the upper exit (y → +∞), Exit 2 the left one (y → −∞, x → −∞), and Exit 3 the right exit (y → −∞, x → +∞). We define the nonrelativistic total mechanical energy, which we call the Newtonian energy EN, as EN = T(p) + V(r), where T is the kinetic energy of the particle, T = p²/2m, p is its linear momentum vector, V(r) is the potential energy, and r is its position vector. If EN ∈ [0, 1/6], the trajectory of any incident particle is trapped in the scattering region. For EN > 1/6, the particles may eventually escape up to infinity. There are indeed three different regimes of motion depending on the initial value of the energy: (a) closed-nonhyperbolic, EN ∈ [0, 1/6]; (b) open-nonhyperbolic, EN ∈ (1/6, 2/9); and (c) open-hyperbolic, EN ∈ [2/9, +∞) [15]. In the first energy range, all the trajectories are trapped and there is no exit by which any particle may escape.

Fig. 1 Isopotential curves for the Hénon-Heiles potential: they are closed for energies below the nonrelativistic threshold escape energy Ee = 1/6; three different exits appear for energy values above Ee = 1/6
When EN ∈ (1/6, 2/9), the energy is large enough to allow escapes from the scattering region and the coexistence of stable invariant tori with chaotic saddles, which typically results in an algebraic decay of the survival probability of a particle in the scattering region. On the contrary, when EN ∈ [2/9, +∞), the regime is open-hyperbolic and all the periodic trajectories are unstable; there are no KAM tori in phase space. If we consider the motion of a relativistic particle moving in an external potential energy V(r), the Hamiltonian (or the total energy) is

$$ H = E = \gamma m c^2 + V(\mathbf{r}) = \sqrt{m^2 c^4 + c^2 \mathbf{p}^2} + V(\mathbf{r}), \qquad (2) $$
where m is the particle's rest mass, c is the speed of light and γ is the Lorentz factor, which is defined as

$$ \gamma = \sqrt{1 + \frac{\mathbf{p}^2}{m^2 c^2}} = \frac{1}{\sqrt{1 - \dfrac{v^2}{c^2}}}. \qquad (3) $$
Therefore, Hamilton's canonical equations are

$$ \dot{\mathbf{p}} = -\frac{\partial H}{\partial \mathbf{r}} = -\nabla V(\mathbf{r}), \qquad \dot{\mathbf{r}} = \mathbf{v} = \frac{\partial H}{\partial \mathbf{p}} = \frac{\mathbf{p}}{m\gamma}. \qquad (4) $$
When γ = 1, the Newtonian equations of motion are recovered from Eq. 4. We define β as the ratio v/c, where v is the modulus of the velocity vector v. Then the Lorentz factor can be rewritten as γ = 1/√(1 − β²). Whereas γ ∈ [1, +∞), the range of values for β is [0, 1]. However, γ and β express essentially the same thing: how large the velocity of the object is compared to the speed of light. From now on, we will use β instead of γ to show our results for mere convenience. Taking into consideration Eqs. 1 and 4, the relativistic equations of motion of a scattering particle of unit rest mass (m = 1) interacting with the Hénon-Heiles potential are

$$ \dot{x} = \frac{p}{\gamma}, \quad \dot{y} = \frac{q}{\gamma}, \quad \dot{p} = -x - 2xy, \quad \dot{q} = -y - x^2 + y^2, \qquad (5) $$

where p and q are the two components of the linear momentum vector p.
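A minimal numerical sketch of Eqs. 3–5 is given below (it is an illustration, not the code used for the results of this chapter). It assumes unit rest mass, reads the unit convention described in the next paragraphs as c = v/β for the fixed initial speed v = 0.583, converts the initial velocity into momentum with the corresponding Lorentz factor, and uses a simple distance criterion as a proxy for crossing the Lyapunov orbits.

```python
import numpy as np
from scipy.integrate import solve_ivp

ESCAPE_RADIUS = 5.0          # assumed proxy for having crossed a Lyapunov orbit

def rhs(t, state, c):
    """Relativistic Henon-Heiles equations of motion, Eq. (5), unit rest mass."""
    x, y, p, q = state
    gamma = np.sqrt(1.0 + (p * p + q * q) / c**2)    # Lorentz factor, Eq. (3)
    return [p / gamma, q / gamma, -x - 2.0 * x * y, -y - x * x + y * y]

def escaped(t, state, c):
    return np.hypot(state[0], state[1]) - ESCAPE_RADIUS
escaped.terminal, escaped.direction = True, 1

def trajectory(beta, phi, v=0.583, t_max=400.0):
    """One particle shot from the origin with shooting angle phi."""
    c = v / beta                              # assumed reading of beta = v / c
    gamma0 = 1.0 / np.sqrt(1.0 - beta**2)     # initial speed is exactly v
    state0 = [0.0, 0.0, gamma0 * v * np.cos(phi), gamma0 * v * np.sin(phi)]
    sol = solve_ivp(rhs, (0.0, t_max), state0, args=(c,), events=escaped,
                    rtol=1e-9, atol=1e-9)
    t_escape = sol.t_events[0][0] if sol.t_events[0].size else np.inf
    return sol, t_escape

sol, t_esc = trajectory(beta=0.01, phi=0.8 * np.pi)
print(t_esc)
```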
In the present work, we aim to isolate the effects of the variation of the Lorentz factor γ (or β, as previously shown) from the rest of the variables of the system, i.e., the initial velocity of the particles, its energy, etc. For this reason, during our numerical computations we have used different systems of units so that γ is the only varying parameter in the equations of motion (Eq. 5). Therefore, we analyze the evolution of the properties of the system when β varies, by comparing these properties with the characteristics of the nonrelativistic system. Then we have to choose the same value of the initial velocity, v = 0.583, in different systems of units. This initial velocity corresponds to a Newtonian energy EN = 0.17, which is in the open-nonhyperbolic regime and quite close to the limit Ee. For the sake of clarity, we consider an incident particle coming from infinity to the scattering region. Then, imagine that we measure the properties of the incident particle in the system of Planck units. In this system, the speed of light is c = 1 c, that is, the speed is measured as a multiple of the speed of light c instead of, for instance, in m/s. Now, suppose that, according to our measurements, the rest mass of the particle is m = 1 mP (in Planck units, the mass is expressed as a multiple of the Planck mass mP, which is mP ≈ 2.2 × 10⁻⁸ kg). Likewise, the speed of the particle is v = 0.583 c. According to the Newtonian scheme, the classical energy of the particle is EN = (1/2) v² ≈ 0.17 EP, where EP is the Planck energy, which is the energy unit in the Planck units system (EP ≈ 1.96 × 10⁹ J). Now, we consider another incident particle with different rest mass and velocity, but we choose the International System of Units (SI) to measure its properties. In this case, we obtain that its rest mass is again unity, although now the rest mass is one kilogram, m = 1 kg, and its velocity is v = 0.583 m/s. The speed of light in SI is c ≈ 3 × 10⁸ m/s. The initial Newtonian energy of the particle is EN ≈ 0.17 J in SI, but now the initial velocity is almost negligible compared to the speed of light, that is, β = v/c = 0.583/3 × 10⁸ ≈ 2 × 10⁻⁹. Therefore, from the perspective of the equations of motion of both particles, when we just consider the Galilean transformations, we can conclude that the behavior of the particles will be the same, since V(x, y) of Eq. 1 is equal in both cases, as long as the parameters k = λ = 1 in their respective systems of units. However, they are completely different when the relativistic corrections are considered, because the Lorentz factor γ affects the equations of motion through the variation of β, regardless of the chosen system of units. To summarize, the objective of our numerical computations and analysis is to study the effect of γ in the equations of motion, so the key point is to set the speed of light c as the threshold value for the speed of the particles, regardless of the system of units we may be considering. We present Fig. 2 in order to give a visual example of the effect of the Lorentz factor on the evolution of single trajectories. There, we represent three different trajectories of the same relativistic particle when it is shot at the same initial Newtonian velocity, v = 0.583, from the same initial conditions, x = 0, y = 0, and equal initial shooting angle φ = 0.8π. However, that velocity, measured in different systems of units, represents different values of the parameter β.
Figure 2a is the trajectory of the particle for β = 0.01. The particle leaves the scattering region by Exit 1 at 286.56 s. In Fig. 2b we represent the trajectory for β = 0.1. The particle is trapped in the scattering region forever.
Fig. 2 Single trajectories for relativistic particles in the scattering region. It shows different trajectories of the same relativistic particle when it is shot at the same initial Newtonian velocity, v = 0.583, from the same initial conditions: x = 0, y = 0 and shooting angle φ = 0.1π. That velocity is measured in different systems of units, representing different values of the parameter β. (a) is the trajectory of the particle for β = 0.01. The particle leaves the scattering region by Exit 1 at 286.56 s. (b) represents the trajectory for β = 0.1. The particle is trapped in the scattering region forever. (c) shows the trajectory of the particle for β = 0.8. It leaves the scattering region by Exit 3 in 18.19 s
Figure 2c shows the trajectory of the particle for β = 0.8. It leaves the scattering region by Exit 3 in 18.19 s. We can see that, even for low velocities, the trajectories described by the particles are completely different.
3 Numerical Results on the Escape Time and the Decay Law

Here we study the discrepancies between the relativistic and the nonrelativistic descriptions when we analyze the average escape time, T̄e, of the system. Furthermore, we analyze another fundamental piece of any chaotic scattering system, its time delay statistics P(t), when we consider the Lorentz corrections. Both are essential global characteristics in chaotic scattering problems.
Fig. 3 Average escape time: T¯e of 10,000 particles inside the scattering region with initial velocity v = 0.583. The initial conditions are (x0 , y0 , x˙0 , y˙0 ) = (0, 0, v cos(ϕ), v sin(ϕ)), with shooting angle, ϕ ∈ [0, 2π ]. We use 500 different values of β in our calculations. There is a linear decrease of T¯e up to β ≈ 0.4. Indeed at β ≈ 0.4 there is a leap where the linear decrease trend of T¯e changes abruptly. Figure obtained from Ref. [11]
3.1 Escape Time

The escape time, Te, of an incident particle is defined as the time spent by it in the scattering region. For times above Te, the particle travels to infinity after having crossed one of the three exit boundaries, which are extremely unstable trajectories called Lyapunov orbits (see Ref. [16]). In the case of the Hénon-Heiles system, the Lyapunov orbits exist for energies higher than Ee = 1/6. The higher the energy, the shorter the escape times are. When we consider a large number of particles and average their individual Te, we obtain the global property T̄e, which is a unique and characteristic property of the system. We represent in Fig. 3 the average escape time, T̄e, of 10,000 particles shot inside the scattering region with an initial velocity v = 0.583. The initial conditions are (x0, y0, ẋ0, ẏ0) = (0, 0, v cos(ϕ), v sin(ϕ)), with shooting angle ϕ ∈ [0, 2π]. We use 500 different values of β for our calculations. The Newtonian average escape time in Fig. 3 is the first point of the graph, when β → 0. This value is indeed the inner average escape time of the particles. As a reminder, this is the time as seen by an observer who is stationary with regard to the reference frame of the particle. Then, if we average the measures of the inner escape time of the 10,000 particles, we get the value of T̄e when β → 0. As can be seen in Fig. 3, there is a clear influence of the Lorentz factor variation on the average escape time T̄e. It is worth noting that, in the most general sense, we define scattering as the problem of obtaining the relationship between an input variable taken from outside the scattering region and an output variable, which characterizes the final state of the system after interacting with the scattering region. However, starting the numerical experiments inside the scattering region is a convention frequently used in the scientific literature (see for example Refs. [17–19]). The reason behind this is to take advantage of the well-known topological structure of the escape basins resulting from the Poincaré
Fig. 4 Percentage of trapped particles in the scattering region: ΦKAM, expressed as a decimal, at tmax, is directly proportional to the Lebesgue measure of the KAM islands in the Poincaré surface of section. At β ≈ 0.4 there are just a few particles trapped in the scattering region. Figure obtained from Ref. [11]
surface of section (y, ẏ) for x(0) = 0. Therefore, it is implicitly assumed that the initial conditions chosen for the computations may correspond to trajectories which come from outside the scattering region and, after bouncing back and forth for a certain time in the scattering region, pass through x = 0 with a certain velocity (ẋ, ẏ). This is the precise instant when the simulations start and the initial conditions are set as (x = 0, y, ẋ, ẏ). According to Fig. 3, there is a relevant decrease of T̄e up to β ≈ 0.4. Indeed, at β ≈ 0.4 there is a leap where the linear decreasing trend of T̄e changes abruptly. This can be explained by noting that at β ≈ 0.4 the KAM islands are almost destroyed and there are just a few trajectories trapped in the scattering region forever. In fact, as has been shown in the literature [20], the KAM islands exhibit a certain stickiness, in the sense that their presence in phase space provokes longer transients inside the scattering region. In order to confirm the destruction of the KAM islands at β ≈ 0.4, we analyze in Fig. 4 the percentage, expressed as a decimal, of particles trapped in the scattering region at tmax. We name this percentage ΦKAM, and it is directly related to the presence of KAM islands and their Lebesgue measure in the Poincaré surface of section. To calculate ΦKAM we have considered again 10,000 particles inside the scattering region with initial conditions (x0, y0, ẋ0, ẏ0) = (0, 0, 0.583 cos(ϕ), 0.583 sin(ϕ)) and shooting angle ϕ ∈ [0, 2π]. Then we compare the number of particles remaining in the scattering region after a long transient, tmax, with the total number of initially shot particles, obtaining the quantity ΦKAM for a certain value of β. Finally, we take different values of β and represent ΦKAM vs β to get Fig. 4. The results point out that, even for low velocities (β < 0.2), the number of trapped particles decreases as β increases. When β ≈ 0.4 there are almost no particles trapped in the scattering region, which is direct proof of the destruction of the KAM islands. It is worth noting that the shapes of both curves, T̄e(β) and
ΦKAM(β), as shown in Figs. 3 and 4, are very similar, which expresses the relevance of the KAM destruction mechanism for the global properties of the system. In Sect. 4 we will discuss the reasons behind the trend of the average escape time T̄e of the system under the variation of β in Fig. 3.
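Once an escape time is available for every shot trajectory (for example, from the integrator sketched after Eq. 5, with trajectories still trapped at the cutoff flagged by infinity), the two curves of Figs. 3 and 4 reduce to simple averages. The following is a schematic summary under that assumption, not the authors' code.

```python
import numpy as np

def escape_statistics(escape_times, t_max):
    """Average escape time of escaping orbits and fraction of trapped orbits.

    escape_times : array of escape times, with np.inf marking trajectories
                   that were still inside the scattering region at t_max.
    """
    times = np.asarray(escape_times, dtype=float)
    escaped = np.isfinite(times) & (times <= t_max)
    te_mean = times[escaped].mean() if escaped.any() else np.inf
    phi_kam = 1.0 - escaped.mean()            # share of trapped trajectories
    return te_mean, phi_kam

# Toy usage: pretend 1% of 10,000 shots never escaped within t_max = 1000.
rng = np.random.default_rng(3)
t = rng.exponential(60.0, size=10_000)
t[rng.random(10_000) < 0.01] = np.inf
print(escape_statistics(t, t_max=1000.0))
```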
3.2 Decay Law

Suppose that we pick many different initial conditions at random in some interval of the domain. Then we examine the resulting trajectory for each value and determine the time t that the trajectory spends in the scattering region. The fraction of trajectories with time delay between t and t + dt is P(t)dt. For open nonhyperbolic dynamics with bounding KAM surfaces in the scattering region, one finds that for large t the time delay statistics, P(t), decays algebraically as

$$ P(t) \sim t^{-\alpha}. \qquad (6) $$
An algebraic decay law as described in Eq. 6 is also found in higher-dimensional Hamiltonian systems when the phase space is partially filled with KAM tori (see Ref. [21]). For our simulations we have considered 10,000 particles shot inside the scattering region with initial velocities v ≈ 0.5831. The initial conditions are (x0, y0, ẋ0, ẏ0) = (0, 0, v cos(ϕ), v sin(ϕ)), with shooting angle ϕ ∈ [0, 2π]. Then we get the fraction of particles inside the scattering region between t and t + dt, that is, P(t)dt, and we represent log10(P(t)) vs. log10(t) to get the value of the parameter α (the slope of the resulting straight line). Calculating α for different values of β, we obtain the evolution of the parameter α with β. In Fig. 5 we can see that the numerical values of α(β) fit a quadratic curve α ≈ A0 + A1β + A2β², with A0 = 0.46138, A1 = −2.5311 and A2 = 15.185. We have indeed found that the decay law of the time delay statistics is algebraic, according to Eq. 6, for the range of energies where the regime of the system is open nonhyperbolic. For the initial conditions chosen to perform our computations, this regime extends up to β ≈ 0.4. The values of the coefficients A0, A1 and A2 are exclusively valid in the range of values that we have considered for the nonlinear fitting, [0.05, 0.4]. However, we can expect that the value of the parameter α in the nonrelativistic framework may be similar to the one obtained by the quadratic formula. That is because the minimum
Fig. 5 (Color online) Evolution of the parameter α: we show the exponent α of the algebraic decay law (see Eq. 6) in the relativistic Hénon-Heiles system under the variation of β. The initial velocity is v ≈ 0.5831. There is a quadratic trend, α ≈ A0 + A1 β + A2 β², where A0 = 0.46138, A1 = −2.5311 and A2 = 15.185. Figure obtained from Ref. [11]
Fig. 6 (Color online) Evolution of the parameter τ of the exponential decay law of the relativistic Hénon-Heiles system under the variation of β. The initial velocity is v ≈ 0.5831. The trend is quadratic, τ ≈ τ0 + τ1 β + τ2 β², where τ0 = 0.065207, τ1 = −0.028988 and τ2 = 0.4125. Figure obtained from Ref. [11]
value of the range considered for the fitting, that is 0.05, is relatively close to β → 0. Indeed, the coefficient A0 = 0.461 may be deemed a good approximation of the Newtonian framework, since this yields a value for α equal to 0.386. As the speed of the particles increases and β > 0.4, the measure of bounding KAM surfaces is practically negligible in the scattering region and all the trajectories exit from there. The decay law of the particles becomes exponential according to

$$ P(t) \sim e^{-\tau t}, \qquad (7) $$
where 1/τ is the characteristic time of the scatterer. We can proceed similarly to calculate the evolution of the parameter τ while β is increased. We shoot 10,000 particles inside the scattering region with initial conditions (x0, y0, ẋ0, ẏ0) = (0, 0, v cos(ϕ), v sin(ϕ)), with v ≈ 0.5831 and shooting angle ϕ ∈ [0, 2π]. Getting the fraction of particles inside the scattering region between t and t + dt, we represent ln(P(t)) vs. t. The slope of the resulting straight line is the value of the parameter τ. If we do the same for increasing values of β, we obtain the relation between τ and β. Figure 6 shows the quadratic evolution of the numerical data of the parameter τ while the Lorentz factor varies, according to τ ≈ τ0 + τ1β + τ2β², where τ0 = 0.065207, τ1 = −0.028988 and τ2 = 0.4125. In the next section we will see why the numerical values of α(β) and τ(β) follow a quadratic trend.
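Both fits described above are linear regressions of the empirical survival curve, in log-log coordinates for the algebraic regime and in semi-log coordinates for the exponential regime. The sketch below, which assumes a sample of escape times and arbitrary fit windows, illustrates the procedure.

```python
import numpy as np

def survival_curve(escape_times, t_grid):
    """P(t): fraction of particles still inside the scattering region at time t."""
    times = np.sort(np.asarray(escape_times, dtype=float))
    return 1.0 - np.searchsorted(times, t_grid, side="right") / times.size

def fit_alpha(escape_times, t_min, t_max, n=50):
    """Slope of log10 P(t) vs log10 t gives -alpha (algebraic decay, Eq. 6)."""
    t = np.logspace(np.log10(t_min), np.log10(t_max), n)
    P = survival_curve(escape_times, t)
    mask = P > 0
    slope, _ = np.polyfit(np.log10(t[mask]), np.log10(P[mask]), 1)
    return -slope

def fit_tau(escape_times, t_min, t_max, n=50):
    """Slope of ln P(t) vs t gives -tau (exponential decay, Eq. 7)."""
    t = np.linspace(t_min, t_max, n)
    P = survival_curve(escape_times, t)
    mask = P > 0
    slope, _ = np.polyfit(t[mask], np.log(P[mask]), 1)
    return -slope

# Sanity check on synthetic exponential escape times with tau = 0.1.
rng = np.random.default_rng(4)
print(fit_tau(rng.exponential(scale=10.0, size=10_000), 1.0, 40.0))
```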
4 Discussion About the Escape Time and the Decay Law

4.1 Energetic Reasoning About the Escape Time and the Decay Law

In the present section we follow a qualitative approach to discuss the trends of the global properties of the system that we have studied in Sects. 3.1 and 3.2, namely T̄e(β), α(β) and τ(β). Firstly, we take the relativistic kinetic energy of the system, K = mγc² − mc², as an explicit function of β:

$$ K(\beta) = \frac{v^2}{\beta^2 \sqrt{1 - \beta^2}} - \frac{v^2}{\beta^2}. \qquad (8) $$
In Fig. 7 we represent K(β). If we fit the curve K(β) to a polynomial while the system is in the open nonhyperbolic regime (up to β ≈ 0.4), we see that the numerical values of the relativistic kinetic energy of the system fit a quadratic curve, K(β) ≈ K0 + K1β + K2β², where K0 = 0.25676, K1 = −0.77133 and K2 = 1.2553. The parameter α of Eq. 6 is related to the square of the average speed with which the particles leave the scattering region. The higher the α, the faster P(t) decays and, therefore, the quicker the particles exit from the scattering region. Therefore, we may consider that the parameter α should be directly proportional to the energy of the system, in a linear way. This is indeed the case for the nonrelativistic Hénon-Heiles system. In Fig. 8 we show the linear relation between the parameter α and the total energy of the classical Hénon-Heiles system, EN, in the open nonhyperbolic regime. This numerical result was also demonstrated in previous works [22]. Then, if the energy of the system fits a quadratic curve in β and it is also directly proportional to α, we may expect that the parameter α shows a quadratic trend when β is varied.
Fig. 7 (Color online) Representation of the relativistic kinetic energy of the system as an explicit function of β, K(β) (see Eq. 8). While the system is in the open nonhyperbolic regime, the kinetic energy fits a quadratic curve, K(β) ≈ K0 + K1 β + K2 β², where K0 = 0.25676, K1 = −0.77133 and K2 = 1.2553. Figure obtained from Ref. [11]
Fig. 8 (Color online) Linear correlation between α and the total energy of the nonrelativistic Hénon-Heiles system. Figure obtained from Ref. [11]
As we have done for the open nonhyperbolic regime, we can now proceed to fit the curve K = K(β) to a quadratic curve while the system is in the open hyperbolic regime, β ∈ (0.4, 0.8]. This is shown in Fig. 8. It yields a second-order curve K(β) ≈ K0 + K1β + K2β², where K0 = 0.254, K1 = −0.3869 and K2 = 5968, with R² = 0.9971. The goodness of the fit of the numerical data of K = K(β) in the open hyperbolic regime to a quadratic curve is quite high, so we can conclude that, within this regime of energy, K ∝ β². The parameter τ of Eq. 7 is also related to the square of the average speed with which the particles leave the scattering region. Then we can again conclude that τ is linearly proportional to the energy of the system and, therefore, that explains why the numerical values of τ(β) follow a quadratic trend for the considered range of β. We are now in the position to understand the linear trend of the curve T̄e(β) before the KAM island destruction at β ≈ 0.4, as shown in Fig. 3. If K ∝ α and α ∝ β², considering that β is a magnitude related to the velocity of the particles, and this is inversely proportional to the escape time, then 1/T̄e ∝ β². In Fig. 9, we can see the results obtained from our computations. The relation between 1/T̄e and β² is linear. The interesting result is that a transition has been detected from β ∈ [0, 0.4) (or β² ∈ [0, 0.16) in the graph) to β ∈ [0.4, 0.6] (or β² ∈ [0.16, 0.40]). This transition corresponds to the destruction of the KAM tori (at about β ∼ 0.4). This explains the leap that can be seen in Fig. 3 at β ∼ 0.4. This is also the value where the percentage of trapped particles turns sharply towards zero in Fig. 4. The slopes of the two straight lines of Fig. 9 determine the speed of the particles exiting from the scattering region, which is further numerical evidence of the KAM island stickiness. Likewise, since τ ∝ β² according to Fig. 6, and we can again state that β is inversely proportional to the escape time, then 1/T̄e ∝ β². Therefore, the same reasoning can be used to explain the behavior of Fig. 3 from β ∼ 0.4 on.
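For completeness, the quadratic fits of K(β) quoted above can be reproduced directly from Eq. 8; the short sketch below does so with numpy.polyfit over the nonhyperbolic range used in the text (the grid density is an arbitrary choice).

```python
import numpy as np

def kinetic_energy(beta, v=0.5831):
    """Relativistic kinetic energy of Eq. (8) for a particle of fixed speed v."""
    beta = np.asarray(beta, dtype=float)
    return v**2 / (beta**2 * np.sqrt(1.0 - beta**2)) - v**2 / beta**2

# Quadratic fit K(beta) ~ K0 + K1*beta + K2*beta^2 on the open nonhyperbolic range.
beta = np.linspace(0.05, 0.4, 200)
K2, K1, K0 = np.polyfit(beta, kinetic_energy(beta), 2)
print(K0, K1, K2)
```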
Fig. 9 (Color online) Analysis of the relation between the average escape time of the particles T¯e and β: we show the linear relation between 1/T¯e and β 2 . At β ∼ 0.4 (that is, β 2 ∼ 0.16 in the graph), we can see a transition corresponding to the destruction of the KAM tori. This explains the leap that can be seen in Fig. 3 at β ∼ 0.4 and why the percentage of trapped particles in the scattering region turns sharply towards zero in Fig. 4. Figure obtained from Ref. [11]
4.2 Decay Law Characterization

In previous studies, Zhao and Du derived a formula for the exponential decay law, setting the parameter τ of Eq. 7 as a function of the energy of the nonrelativistic Hénon-Heiles system (see Ref. [22]). The regime of energies considered by them was the open hyperbolic one, simplifying the model with the assumption of the nonexistence of KAM islands for Newtonian energies higher than EN = 1/6. In this section we apply a similar methodology for the open hyperbolic regime, but considering the relativistic corrections, in order to find a theoretical expression for the escape rate of the Hénon-Heiles system. The phase space distribution can be generally expressed as

$$ \psi(p, q) = \frac{\delta(\Delta E - H(p, q))}{\int dp\, dq\; \delta(\Delta E - H(p, q))}, \qquad (9) $$

where p and q are the coordinates of the linear momentum (see Ref. [23]) and δ(·) denotes the Dirac delta of the expression in brackets. ΔE is the difference between the relativistic mechanical energy, K + V = E − mc², and the threshold energy, Ee = 1/6, above which the whole phase space of the system is chaotic and the particles may escape from the scattering region. For convenience and simplicity, we have selected K + V instead of the total relativistic energy E in the following calculations. In fact, the constant value of K + V equals the kinetic energy of the particle when it is freely moving outside the scattering region, according to Eq. 8. When the particle is under the effect of the Hénon-Heiles potential V, the kinetic and the potential energy are continually being exchanged in order to keep the sum K + V constant. ΔE is thus a conserved quantity and the following reasoning is completely valid.
The phase space distribution can be rewritten in terms of (x, y, θ) as ρ(x, y, θ) = 1/(2π S(ΔE)), where θ is the angle between the direction of the momentum p and the y axis, and S(ΔE) is the area of the well. To define the area of the well, we have to consider the straight lines which contain the three saddle points of the Hénon-Heiles system and are perpendicular to the direction of the bisector lines of the equilateral triangle arranged by those three saddle points. Therefore, S is the region bounded by the well contour lines and the aforesaid straight lines. Given N particles in the region S, the number of particles leaving the well through the opening at a saddle point (for instance, P1 = (0, 1)) in a unit time can be expressed as

$$ N \int_{x_A}^{x_B} dx \int_{-\pi/2}^{\pi/2} \rho(x, y, \theta)\, v(x, y) \cos\theta \, d\theta, $$

where the integral in x is along the straight line which contains P1. The limits of integration xA and xB are the points where the contour lines of the Hénon-Heiles potential intersect the straight line that contains P1. If we note the triangular symmetry of the system, the number of particles leaving the well in a unit time from the three openings is just three times the previous result. The change of N with respect to t is

$$ \frac{dN(t)}{dt} = -3 N(t)\, \rho \int_{-\pi/2}^{\pi/2} \cos\theta \, d\theta \int_{-\sqrt{2\Delta E/3}}^{\sqrt{2\Delta E/3}} \sqrt{2\left(\Delta E - \tfrac{3}{2}x^2\right)}\, dx = -2\sqrt{3}\,\pi\, \Delta E\, \rho\, N(t). \qquad (10) $$
If we compare this result with Eq. 7, we obtain the analytical expression for the escape rate as

$$ \tau(\Delta E) = \frac{\sqrt{3}\, \Delta E}{S(\Delta E)}. \qquad (11) $$
There is no algebraic approach to obtain the expression of S = S(ΔE), but we can determine it by applying an indirect method, for instance, the Monte Carlo method. In Fig. 10a we represent the area of the well S as a function of ΔE. The numerical results fit a quadratic polynomial, S(ΔE) = S0 + S1 ΔE + S2 ΔE², with S0 = 1.299, S1 = 6.7271 and S2 = −7.3541. The value S0 is in fact the area of the equilateral triangle whose vertices are the three saddle points of the Hénon-Heiles system, that is, S0 = 3√3/4. Therefore, we can obtain the expression of τ = τ(ΔE) as

$$ \tau(\Delta E) = \frac{\sqrt{3}\, \Delta E}{S_0 + S_1 \Delta E + S_2 \Delta E^2}. \qquad (12) $$
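The Monte Carlo determination of S mentioned above can be organized as in the sketch below. The definition of the sampling region is our reading of the construction described in the text: points with potential energy below ΔE + Ee that lie on the inner side of the three straight lines through the saddle points; both the bounding box and the sample size are arbitrary choices.

```python
import numpy as np

SADDLES = np.array([[0.0, 1.0],
                    [np.sqrt(3) / 2, -0.5],
                    [-np.sqrt(3) / 2, -0.5]])     # saddle points for k = lambda = 1

def potential(x, y):
    return 0.5 * (x**2 + y**2) + x**2 * y - y**3 / 3.0

def well_area(delta_e, e_threshold=1.0 / 6.0, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the well area S(Delta E) (assumed definition)."""
    rng = np.random.default_rng(seed)
    lo, hi = -1.5, 1.5                            # bounding box around the well
    pts = rng.uniform(lo, hi, size=(n_samples, 2))
    inside_lines = np.all(pts @ SADDLES.T <= 1.0, axis=1)   # n . r <= 1
    below_contour = potential(pts[:, 0], pts[:, 1]) <= delta_e + e_threshold
    return np.mean(inside_lines & below_contour) * (hi - lo) ** 2

print(well_area(0.1))
```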
In Fig. 10b we show S as a function of β. Again, the numerical results fit a quadratic relation, S(β) = s0 + s1β + s2β², with s0 = 2.2321, s1 = −3.7433 and s2 = 4.8112. We can obtain the analytic expression of τ = τ(β) from Eq. 12, because the conserved value of ΔE must agree with Eq. 8 when it is applied to a freely moving particle. Then, we express τ(β) as
Fig. 10 (Color online) Area of the well S of the Hénon-Heiles system: in panel (a) we show the evolution of S under the variation of the energy of the system ΔE. The initial velocity is v ≈ 0.5831. The trend is quadratic, S(ΔE) = S0 + S1 ΔE + S2 ΔE², with S0 = 1.299, S1 = 6.7271 and S2 = −7.3541. The regime of considered energies is the hyperbolic one, from β = 0.4 on. The maximum value of the energy ΔE corresponds to β = 0.9. Likewise, in panel (b) we present the area of the well S as a function of β. The trend is quadratic, S(β) = s0 + s1 β + s2 β², with s0 = 2.2321, s1 = −3.7433 and s2 = 4.8112. Figure obtained from Ref. [11]
Fig. 11 (Color online) Comparison between the data of the parameter τ from the numerical computations and the results obtained from the analytic expression according to Eq. 13. Figure obtained from Ref. [11]
$$ \tau(\beta) = \frac{\sqrt{3}\left[(K_0 + K_1 \beta + K_2 \beta^2) - E_e\right]}{s_0 + s_1 \beta + s_2 \beta^2}, \qquad (13) $$
where we have expressed the value of the conserved energy of the system, ΔE, by K0 + K1β + K2β² − Ee. Since Eq. 13 is a ratio of two quadratic polynomials, over the considered range of β it can be well approximated by a quadratic polynomial, τ(β) ≈ Γ0 + Γ1β + Γ2β², in agreement with the quadratic trend shown in Fig. 6. In Fig. 11 we compare the value of the parameter τ obtained from the numerical computations and the results from the analytic formula of Eq. 13. Now we will obtain a reasoning for the parameter α as a function of β according to Eq. 6 in the open nonhyperbolic regime. To this end, we consider the stickiness effect of the KAM islands on the trajectories which leave the scattering region after eventually passing close to the KAM tori. The basic idea is well explained
in [24]. If a process that decays (or grows) exponentially is killed randomly, then the distribution of the killed state will follow a power law in one or both tails. Indeed, we can consider that all the particles leaving the scattering region follow an exponential decay law but, because some of the trajectories pass close to the KAM islands, the exponential decay process is killed during a certain time. The average result is that the decay law when sizable KAM tori exist is algebraic. Therefore, if we consider the exponential decay law of the particles, P(t) = e^{−τt}, killed at a random time T which is exponentially distributed with parameter ν, then the killed state P̄ = e^{−τT} has the probability density function f_{P̄}(t) = (ν/τ) t^{−ν/(1−τ)} for t > 1. Therefore, the average decay law of the particles shows a power-law behavior. For the sake of clarity, τ is the parameter of Eqs. 12 and 13, which regulates the exponential decay law of the particles. When a particle trajectory passes close to a KAM island, the KAM stickiness provokes the killing of the escaping process during a certain time. Indeed, as we showed in Fig. 4, the higher the energy of the system, the smaller the area of the KAM islands. Therefore, we can consider that for higher energies it is more difficult for a certain trajectory to pass close to the KAM islands. The exponential decay of the particles is more likely to be killed at low energies than at higher energies. In that sense, the parameter ν is directly related to the energy of the system, so we can rewrite the expression of f_{P̄}(t) as a function of β as
f_{P̄}(t) = (g(β)/τ) t^{−g(β)/(1−τ)}. The function which relates ν to β is g(β). Comparing f_{P̄}(t) with Eq. 6, we can write α(β) = g(β)/(1 − τ(β)). We have proposed an expression for g(β) that matches quite well the numerical values obtained for α(β), namely
$$ g(\beta) = \frac{1}{D_{PS}\, \sqrt{\Phi_{KAM}}}\,, $$
where DPS is the number of canonical coordinates defined on the phase space. In this case, DPS = 4, since the canonical coordinates are (x, y, p, q) ∈ R⁴. According to the proposed expression, g(β) is inversely proportional to the area of the KAM islands, due to the term 1/√ΦKAM, and also to the dimension of the phase space. The higher the dimension of the phase space, the less probable it is for a particle to reach the KAM region, since it has other directions in which to go. The obtained formula to express the parameter α as a function of β is

$$ \alpha(\beta) = \frac{1}{D_{PS}\, \sqrt{\Phi_{KAM}}\, \left(1 - \tau(\beta)\right)}. \qquad (14) $$
In Fig. 12 we compare the value of the parameter α obtained from the numerical computations and the results from the analytic formula of Eq. 14.
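The random-killing mechanism of Ref. [24] invoked above is easy to verify numerically: an exponentially growing or decaying quantity observed at an exponentially distributed stopping time has a power-law distributed value. The short sketch below demonstrates only this generic mechanism; it does not reproduce the specific g(β) of Eq. 14.

```python
import numpy as np

rng = np.random.default_rng(5)
tau, nu = 0.2, 0.5                         # decay and killing rates (illustrative)
T = rng.exponential(1.0 / nu, size=200_000)      # random killing times
X = np.exp(tau * T)                        # state of the process at the killing time

# The tail of X follows a power law with exponent nu / tau.
x = np.logspace(0.1, 1.5, 30)
survival = np.array([(X > xi).mean() for xi in x])
mask = survival > 0
slope, _ = np.polyfit(np.log(x[mask]), np.log(survival[mask]), 1)
print(slope, -nu / tau)                    # both close to -2.5
```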
Fig. 12 (Color online) Comparison between the data of parameter α from the numerical computations and the results obtained from the analytic expression according to Eq. 14. Figure obtained from Ref. [11]
5 Uncertainty Dimension

The scattering functions are one of the most fundamental footprints of any chaotic scattering system. A scattering function relates an input variable of an incident particle with an output variable characterizing the trajectory of the particle once the scattering occurs. These functions can be obtained empirically and, thanks to them, we can infer relevant information about the system. In Fig. 13, we can see a typical scattering function: the escape time of a test particle vs. the initial shooting angle for the relativistic Hénon-Heiles system. Red (dark gray) dots are the values of the escape times for a relativistic system with β = 0.01, while green (gray) dots denote the escape times for β = 0.5. We use two panels to represent the scattering function. The lower left panel shows the scattering function for a shooting angle φ ∈ [4.71, 6.71]. The upper right panel is a magnification of the scattering function over a narrower range of initial angles, φ ∈ [5.35, 5.44]. In order to obtain both panels, we shoot in both cases 1000 particles from (x, y) = (0, 1) into the scattering region with an initial velocity v = 0.583. In Fig. 13, we can see that the scattering function contains some regions where the escape time of the particle varies smoothly with the shooting angle. However, there are some other, fractal regions with singularities, where a slightly different initial condition in the shooting angle implies an abrupt change in the particle escape time. The crucial point is that, as we have seen in Fig. 13, any small variation in the neighborhood of a singular input variable implies a huge variation in the output variable and, furthermore, the range of variation of the output variable does not tend to zero even when the variation of the input does. This type of behavior of the scattering function means that a small uncertainty in the input variable may make any prediction about the value of the output variable impossible. The fractal dimension D of the set of singular input variable values provides a quantitative characterization of the magnitude of such uncertainty. That is why the fractal dimension D is defined here as the uncertainty dimension. As was previously demonstrated, when the regime of the chaotic scattering system is hyperbolic, all the orbits are unstable and then 0 < D < 1.
Fig. 13 (Color online) Typical scattering function: the escape time Te vs. the initial angle φ of 1000 particles shot into the scattering region from (x, y) = (0, 1) with initial velocity v = 0.583. Red (dark gray) dots represent the escape times for a relativistic system with β = 0.01, while green (gray) dots denote the escape time values for β = 0.5. The lower left panel shows the scattering function for a shooting angle φ ∈ [4.71, 6.71]. Likewise, the upper right panel is a zoom-in of the scattering function, taking a narrower initial angle range, φ ∈ [5.35, 5.44]. The scattering function contains some regions where the escape time of the particle varies smoothly with the shooting angle and some others where a slightly different initial condition in the shooting angle implies an abrupt change in the particle escape time. Figure obtained from Ref. [12]
However, when the dynamics is nonhyperbolic, there are KAM islands in phase space and then D ≈ 1 [25, 26]. In this section we investigate the evolution of the uncertainty dimension, D, in a typical scattering function as the parameter β is varied. In order to compute D, we use the uncertainty algorithm [27]. We select a horizontal line segment defined by y0 = 1 from which we shoot the test particles towards the scattering region with initial velocity v = 0.583. For a certain initial condition on the line segment, for instance (x0, y0) = (0, 1), we choose a perturbed initial condition (x0 + ε, 1), where ε is the amount of perturbation. Then we let both trajectories evolve according to Eqs. 5. We track the time they spend in the scattering region and the exit by which they escape. If the two trajectories escape from the scattering region at the same time or through the same exit, then we consider both trajectories certain with regard to the perturbation ε. Otherwise, we say both trajectories are uncertain. Taking a large number of initial conditions for each value of ε, we find that the fraction of uncertain initial conditions f(ε) scales algebraically with ε as f(ε) ∼ ε^{1−D}, or f(ε)/ε ∼ ε^{−D}. Therefore, D is the uncertainty dimension.
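A compact sketch of the uncertainty algorithm is shown below. It reuses the relativistic equations of motion of Eq. 5 with the same assumptions as in the earlier integrator sketch (c = v/β, a distance proxy for escape), shoots the particles straight down from the line y0 = 1, labels exits by the angular sector in which the trajectory leaves, and counts a pair as uncertain when the two exits differ; the sample sizes are deliberately small.

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, s, c):
    x, y, p, q = s
    g = np.sqrt(1.0 + (p * p + q * q) / c**2)        # Lorentz factor
    return [p / g, q / g, -x - 2 * x * y, -y - x * x + y * y]

def hit_boundary(t, s, c):
    return np.hypot(s[0], s[1]) - 5.0                # assumed escape radius
hit_boundary.terminal, hit_boundary.direction = True, 1

def exit_label(x0, beta, v=0.583, t_max=400.0):
    """Exit (1, 2 or 3) of a particle shot downward from (x0, 1); 0 if trapped."""
    c = v / beta
    gamma0 = 1.0 / np.sqrt(1.0 - beta**2)
    sol = solve_ivp(rhs, (0.0, t_max), [x0, 1.0, 0.0, -gamma0 * v], args=(c,),
                    events=hit_boundary, rtol=1e-8, atol=1e-8)
    if sol.t_events[0].size == 0:
        return 0
    x, y = sol.y[0, -1], sol.y[1, -1]
    if y > abs(x) / np.sqrt(3):                      # upper 120-degree sector
        return 1
    return 2 if x < 0 else 3

def uncertain_fraction(beta, eps, n=100, seed=0):
    rng = np.random.default_rng(seed)
    x0 = rng.uniform(-0.5, 0.5, size=n)
    flips = sum(exit_label(x, beta) != exit_label(x + eps, beta) for x in x0)
    return flips / n

# f(eps) ~ eps**(1 - D): the slope of log f vs log eps estimates 1 - D.
eps_grid = np.logspace(-6, -3, 4)
f = np.array([uncertain_fraction(0.2, e) for e in eps_grid])
slope, _ = np.polyfit(np.log(eps_grid), np.log(np.maximum(f, 1e-6)), 1)
D = 1.0 - slope
```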
Fig. 14 (Color online) Uncertainty dimension D: evolution of the uncertainty dimension D in a scattering function defined on the horizontal initial line segment at y0 = 1 as β is varied. For many values of β ∈ (0, 1) we randomly launch 1000 test particles from the horizontal line segment passing through y0. The particles are shot towards the scattering region with initial velocity v = 0.583. The results indicate that D ≈ 1 when β → 0. Additionally, there is a linear decrease of D with any increment in β up to a value β ≈ 0.625. At this point, there is a crossover behavior since, for values β > 0.625, there is a steeper linear decrease of D. Figure obtained from Ref. [12]
we obtain the evolution of the uncertainty dimension D with β. As we can see in Fig. 14, D ≈ 1 when β → 0. Moreover, for β ∈ [0, 0.625), there is a linear decrease of D with any increment in β. There is a crossover behavior at β ≈ 0.625 such that, for β > 0.625, the linear decrease of D is steeper. In order to provide a theoretical reasoning for the dependence of the uncertainty dimension D on the parameter β, as shown in Fig. 14, we follow the approach explained in previous literature (see Ref. [28]). Firstly, as an informative example, we consider a Cantor set with Lebesgue measure equal to zero and fractal dimension equal to 1. We shall explain below why such a set is relevant to our construction. To construct this set, we proceed iteratively as follows. Iteration 1: starting with the closed interval [0, 1] of the real numbers, we remove the open middle third. There are two remaining intervals of length 1/3 each. Iteration 2: we remove the open middle fourth from each of the two remaining intervals and, therefore, we have four closed intervals of length 1/8 each. Iteration 3: again, we take away the open middle fifth from each of the four remaining intervals. Iteration n: there are N = 2^n intervals, each of length ε_n = 2^{−n}[2/(n + 2)]. The total length of all the intervals is ε_n N ∼ n^{−1} and it goes to zero as n goes to infinity. To cover the set by intervals of size ε_n, the required number of intervals is N(ε) ∼ ε^{−1}(ln ε^{−1})^{−1}. On the other hand, the fractal dimension is D = lim_{ε→0} [ln N(ε)/ln(ε^{−1})], which clearly yields 1. We note that the exponent of the dependence N(ε) ∼ 1/ε^D is the uncertainty dimension D, as previously defined. The weaker logarithmic dependence does not have any influence on the determination of the dimension. However, the logarithmic term is indeed the one
that makes the Cantor set have zero Lebesgue measure, since ε_n N ∼ (ln ε^{−1})^{−1} tends to 0 as ε → 0. In order to generalize this example, we may consider that at each stage we remove a fraction η_n = a/(n + b), where a and b are constants, from the middle of each of the 2^n remaining intervals. Then we find that

N(ε) ∼ (1/ε)[ln(1/ε)]^{−a}.    (15)
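A quick numerical check of this construction can be useful. The short Python sketch below is our own illustration (not part of Ref. [28]): it iterates the removal rule η_n = a/(n + b) with a = 1, b = 2, as in the example above, and prints the total remaining length together with the box-counting estimate of the dimension. The length decays like 2/(n + 2) while the dimension estimate creeps slowly towards 1.

```python
import math

def cantor_like_stats(n_stages, a=1.0, b=2.0):
    """At stage n, remove the middle fraction eta_n = a/(n+b) from every
    remaining interval (a=1, b=2 reproduces the 1/3, 1/4, 1/5, ... example)."""
    length = 1.0                         # length of each remaining interval
    for n in range(1, n_stages + 1):
        eta = a / (n + b)
        length *= (1.0 - eta) / 2.0      # two pieces survive each removal
        total = (2 ** n) * length        # total remaining Lebesgue measure
        dim = n * math.log(2.0) / math.log(1.0 / length)   # ln N / ln(1/eps_n)
        yield n, total, dim

if __name__ == "__main__":
    for n, total, dim in cantor_like_stats(40):
        if n % 10 == 0:
            print(f"stage {n:2d}: total length = {total:.3e}, dimension estimate = {dim:.4f}")
```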
According to Eq. 15, the slope at any point of the curve ln N(ε) vs. ln(1/ε) is, by definition, d ln N(ε)/d ln(1/ε), and it is always less than 1 for small ε, although it approaches 1 logarithmically as ε → 0. Therefore, the result for the fractal dimension is still D = 1. Coming back to the relativistic chaotic scattering analysis, we can now draw a parallel with the fractal dimension of Cantor-like structures. First, we note that chaotic scattering occurs due to a nonattracting chaotic set (i.e., a chaotic saddle) in phase space, where the scattering interactions take place [29]. Moreover, both the stable and the unstable manifolds of the chaotic saddle are fractal [30]. Scattering particles are launched from a line segment straddling the stable manifold of the chaotic saddle outside the scattering region. The set of singularities is the set of intersections of the stable manifold with the line segment, and it can be effectively considered a Cantor-like set. There is an interval of input variables which leads to trajectories that remain in the scattering region for at least a duration of time T0. By time 2T0, a fraction η of these particles has left the scattering region. If these particles are all located in the middle of the original interval, there are then two equal-length subintervals of the input variable that lead to trajectories remaining in the scattering region for at least 2T0. Likewise, we may consider that a different fraction η of incident particles, whose initial conditions are located in the middle of the first two subintervals and which remain at time 2T0, leave the scattering region by 3T0. There are then four subintervals of initial conditions that remain in the scattering region for at least 3T0. If we continue this iterative procedure, we easily recognize the parallelism between the emerging fractal structure made of the incident particles which never escape and a Cantor-like set of zero Lebesgue measure. The fractal dimension D of this Cantor set is then given by

D = ln 2 / ln{[(1 − η)/2]^{−1}}.    (16)
In the nonhyperbolic regime, the decay law of the particles is algebraic and this implies that the fraction η is not constant during the iterative construction of the Cantor set. At the nth stage (for n large enough), the fraction η_n is approximately given by

η_n ≈ −T0 P^{−1} dP/dt ≈ z/n.    (17)
This expression obviously yields a Cantor set with dimension D = 1 when we substitute ηn ≈ z/n in Eq. 16. If we compare this result with the mathematical
construction of the Cantor set as described in Eq. 15, then we realize that the exponent z of the algebraic decay law corresponds to the exponent a of Eq. 15. On the other hand, in the case of hyperbolic chaotic scattering, the incident particles leave the scattering region exponentially. The exponent of the decay law, τ, is related to the fraction η as

τ = T0^{−1} ln(1 − η)^{−1}.    (18)
Now we are in a position to complete the reasoning behind the behavior of the uncertainty dimension D with the relativistic parameter β. As we pointed out in Sect. 4, the relation between the parameter τ of the hyperbolic regime and β is quadratic. Then, according to Eqs. 16 and 18, we find that

D = ln 2 / [ln 2 + T0 (τ0 + τ1 β + τ2 β²)].    (19)
According to Eq. 19, the fractal dimension D is always less than 1 and it decreases as β increases. In the limit β → 1, the quadratic relation between the particle decay rate τ and β is no longer valid, since the kinetic energy of the system grows to +∞. In that case, as β → 1, τ → +∞ and, therefore, D → 0. Likewise, the exponent z of the algebraic decay law and the parameter β are also related in a quadratic manner. Then, if we take into consideration Eqs. 16 and 17, we obtain

D = ln 2 / ln{2/[1 − (A0 + A1 β + A2 β²)/n]}.    (20)
As β → 0, we have D → 1 for large n and, moreover, we find that dD/dβ = 0 since we recover the Newtonian system. When β increases, D decreases. For large values of n, the value of 1 − (A0 + A1 β + A2 β²)/n is always larger than 0 and smaller than 1 because the maximum value of z is z ≈ 1.5, as we noted in Sect. 2. Despite the KAM destruction, the transition from the algebraic regime to the hyperbolic one is not very abrupt. This is the reason why the uncertainty dimension D decreases smoothly with β, as shown in Fig. 14, up to β ≈ 0.625. When we reach this value, the hyperbolic regime is clear and it yields a steeper change in D vs. β.
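The uncertainty algorithm itself is straightforward to code. The Python sketch below is a minimal illustration and does not integrate the relativistic Hénon-Heiles equations; instead, a stand-in fate function with a fractal boundary (the basins of Newton's method for z³ − 1 along a line of initial conditions) plays the role of "through which exit the particle escapes". The function names, the toy system and the parameter values are ours.

```python
import numpy as np

def newton_fate(x0, n_iter=40):
    """Stand-in fate function: index of the cube root of unity reached by
    Newton's method from the initial condition x0 + 0.5i."""
    z = complex(x0, 0.5)
    for _ in range(n_iter):
        if z == 0:
            return -1
        z -= (z**3 - 1.0) / (3.0 * z**2)
    roots = np.exp(2j * np.pi * np.arange(3) / 3)
    return int(np.argmin(np.abs(roots - z)))

def uncertainty_dimension(fate, a=-2.0, b=2.0, n_samples=1500, seed=0):
    """Estimate D on the segment [a, b] from the scaling f(eps) ~ eps^(1-D):
    an initial condition is 'uncertain' if its fate changes under a shift eps."""
    rng = np.random.default_rng(seed)
    epsilons = np.logspace(-6, -2, 9)
    fractions = []
    for eps in epsilons:
        x = rng.uniform(a, b, n_samples)
        uncertain = sum(fate(xi) != fate(xi + eps) for xi in x)
        fractions.append(max(uncertain, 1) / n_samples)      # avoid log(0)
    slope, _ = np.polyfit(np.log(epsilons), np.log(fractions), 1)
    return 1.0 - slope                                        # f ~ eps^(1-D)

if __name__ == "__main__":
    print(f"estimated uncertainty dimension D = {uncertainty_dimension(newton_fate):.3f}")
```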
6 Exit Basins 6.1 Exit Basins Description We define the exit basin as the set of initial conditions whose trajectories converge to a specified exit [31]. Likewise, an initial condition y is a boundary point of a basin B if every open neighborhood of y has a nonempty intersection with basin B and at
least one other basin. The boundary of a basin is the set of all its boundary points. The basin boundary can be a smooth curve but, in chaotic systems, the boundaries are usually fractal. In this case, since the phase space resolution is finite in any real physical situation, those fractal structures impose an extreme dependence on the initial conditions, which obstructs the prediction of the final state of the system. For that reason, understanding the exit basin topology is crucial to foresee the final fate of the system. In this section, we provide a qualitative description of how the exit basins of the Hénon-Heiles system evolve as the ratio β is increased. We use Poincaré surfaces of section (q, y) at t = 0 and x = 0 to represent the exit basins. To carry out our simulations, we shoot 1,000,000 particles from x = 0 and y ∈ [−1, 1], with initial angles φ ∈ [0, π]. Then we follow each trajectory and register through which exit the particle escapes from the scattering region. If a particle leaves the scatterer by Exit 1, we color the initial condition (q0, y0) in brown (gray). Likewise, we color the initial condition in blue (dark gray) if the particle escapes by Exit 2 and in yellow (light gray) if it leaves the scatterer by Exit 3. When a particle remains in the scattering region after tmax, we color its initial condition in black. We have run different simulations to plot the exit basins of the Hénon-Heiles system for a wide range of values of β, as shown in Fig. 15. In Fig. 15a–e, we can see the evolution of the exit basins of the Hénon-Heiles system as the parameter β is increased. In Fig. 15a, we represent the Newtonian case. It corresponds to EN = 0.17, which is fairly close to the escape energy threshold Ee = 1/6. At this energy, the exit basins are quite mixed throughout the phase space and there are many initial conditions (black dots) which do not escape from the scattering region. Likewise, the KAM islands can be easily recognized as the black areas inside the phase space. In Fig. 15b, we can see the relativistic effects of the Lorentz corrections for β = 0.2. The exit basins are still quite mixed, although there are now larger regions showing compact exit basins. In Fig. 15c, we represent the exit basins of the system for β = 0.4. The exit basins now occupy well-defined regions and their boundaries are fractal. There are still some portions of the phase space where the basins corresponding to Exits 1, 2 and 3 are mixed. However, the KAM islands are destroyed and there are just a few trajectories remaining in the scattering region after tmax (colored in black). Figure 15d represents the exit basins of the system for β = 0.625. There are no regions where the exit basins are mixed, and the boundaries are fractal. Lastly, in Fig. 15e, we show the Hénon-Heiles exit basins for β = 0.9. The exit basins are smoothly spread over the phase space and the fractality of the boundaries has decreased. Many dynamical systems of interest, with two or more coexisting attractors (or escapes), exhibit a singular topological property of their basin boundaries that is called here the property of Wada [32]. If every open neighborhood of any boundary point ub of a certain basin, however small, contains points from all the other basins, we say that this boundary has the property of Wada.
A logical consequence of this definition is that a Wada basin boundary is the same boundary for all the basins. The property of Wada is a very interesting
Fig. 15 (Color online) Evolution of the exit basins of the Hénon-Heiles system for different values of β. The sets of brown (gray), blue (dark gray) and yellow (light gray) dots denote initial conditions resulting in trajectories that escape through Exits 1, 2 and 3 (see Fig. 1), respectively. The black regions denote the KAM islands and, generally speaking, the black dots are the initial conditions which do not escape. (a) Newtonian case: the exit basins are quite mixed throughout the phase space and the KAM islands can be easily recognized as the big black regions inside phase space. (b) β = 0.2: the exit basins are still fairly mixed. The regions corresponding to exit basins are larger than in the Newtonian case. (c) β = 0.4: exit basin regions are larger. The exit basin boundaries are fractal. The KAM islands are destroyed. (d) β = 0.625: the exit basins are not mixed anymore. The boundaries are fractal. (e) β = 0.9: the boundaries are smoother, and the exit basins occupy a larger region of phase space. Figure obtained from Ref. [12]
characteristic because the fate of any dynamical system is harder to predict since we cannot foresee a priori by which of the exits any initial condition close to the boundaries is going to escape. In that case, the degree of unpredictability of the destinations can be more severe than the case where there are just fractal basins with only two potential destinations associated. In the energy regime that
we are considering, the Hénon-Heiles system exhibits three symmetric exits to escape from the scattering region, giving rise to three qualitatively distinct scattering destinations. This allows Wada basin boundaries to occur. Each exit of the system has its own associated exit basin. Previous research has described algorithms for the numerical verification of the Wada property in dynamical systems [33–36]. Here, we resort to the appearance of the exit basin boundaries to give visual indications about the persistence of the Wada property as the parameter β is varied. For values of β ≤ 0.625, we have observed that the boundary points of any exit basin magnification seem to be surrounded by points from the three basins. However, when β > 0.625, we have seen that the boundary points are exclusively surrounded by points belonging to just two basins. In order to give visual examples, we show Fig. 16a and b. Both are detailed analyses of the Hénon-Heiles exit basins for β = 0.625 and β = 0.9, respectively, obtained by performing the computations on a tiny portion of the exit basins, y ∈ [−0.001, 0.001]. Therefore, Fig. 16a is a zoom-in of Fig. 15d and Fig. 16b is a zoom-in of Fig. 15e. For β > 0.625, the exit basin representations are similar to the one shown in Fig. 16b, where we can observe that, for example, the boundary located at p = 1 is smooth. Moreover, this boundary divides only two basins, the one corresponding to Exit 1 (brown, gray in print) from the one associated with Exit 3 (yellow, light gray in print). This is a visual indication that the Wada property might not be observable in the relativistic Hénon-Heiles system for β > 0.625, at least at the numerical scale at which we have performed the calculations. In that sense, we may suppose that the unpredictability associated with the final destination of the trajectories is higher for values of β ≤ 0.625. As we can see, considerations of special relativity have qualitative implications on the exit basin topology of the system, even for low values of β. In the following sections we will provide more insights about this statement from a quantitative point of view.
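The visual check just described can be automated in a crude, single-scale way: given a grid of exit labels (an exit-basin diagram already computed by other means), one can count how many boundary cells see all three basins among their neighbours. The helper below is only a sketch of that idea in Python; the grid, the names and the synthetic demo data are ours, and a rigorous test of the Wada property still requires the dedicated algorithms of Refs. [33–36].

```python
import numpy as np

def wada_boundary_statistics(labels, n_basins=3):
    """For every grid cell whose 3x3 neighbourhood contains more than one exit
    label (i.e. the cell touches a basin boundary), check whether the
    neighbourhood already contains all n_basins labels."""
    h, w = labels.shape
    boundary_cells, cells_seeing_all = 0, 0
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            kinds = np.unique(labels[i - 1:i + 2, j - 1:j + 2])
            if kinds.size > 1:
                boundary_cells += 1
                if kinds.size >= n_basins:
                    cells_seeing_all += 1
    return cells_seeing_all, boundary_cells

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.integers(0, 3, size=(200, 200))   # synthetic, thoroughly mixed labels
    seen_all, nbound = wada_boundary_statistics(demo)
    print(f"{seen_all} of {nbound} boundary cells see all three basins")
```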
6.2 Basin Entropy The basin entropy is a novel tool developed to describe quantitatively the exit basin topology of any dynamical system [37]. The main idea behind the concept of basin entropy is that the continuous phase space of the system can be considered a discrete grid due to the finite resolution of any experimental or numerical procedure to determine a point in phase space. In fact, this unavoidable scaling error does induce wrong predictions in chaotic systems even when they are completely deterministic. Therefore, the basin entropy helps to quantify to what extent a system is chaotic according to the topology of its phase space. In particular, considering exit basins of the Hénon-Heiles system such as the ones shown in Fig. 15, we can easily create a discrete grid if we assume a finite precision δ in the determination of the initial conditions and cover the phase space with boxes of size δ. This way, every piece of the grid is surrounded by other pieces, and we may define a ball around a piece as the set of pieces sharing a side with it. The method to
Fig. 16 (Color online) Zoom-in of the exit basins for β = 0.625 and β = 0.9. The sets of brown (gray), blue (dark gray) and yellow (light gray) dots denote initial conditions resulting in trajectories that escape through Exits 1, 2 and 3 (see Fig. 1), respectively. In (a) we can see that the boundary points seem to be surrounded by points from the other basins. Nonetheless, in (b) we can find boundary points which are surrounded by points of just two basins. The boundary located at p = 1 is smooth and it divides just two basins, the one corresponding to Exit 1 from the one corresponding to Exit 3. Figure obtained from Ref. [12]
calculate the basin entropy considers the ball as a random variable whose possible outcomes are the different exit basins. Taking into account that the pieces inside the ball are independent and applying the Gibbs entropy concept, the basin entropy Sb is defined as

Sb = Σ_{k=1}^{k_max} (N_k^0/N_0) δ^{α_k} log(m_k),    (21)
where k is the label for the different exit basin boundaries, m_k is the number of exit basins contained in a certain ball and α_k is the uncertainty dimension of the boundary k as defined in Sect. 5. The ratio N_k^0/N_0 is a term related to the portion of the discretized phase space occupied by the boundaries, that is, the number of pieces lying in the boundaries divided by the total number of pieces in the grid. Therefore, there are three sources that increase the basin entropy: (a) N_k^0/N_0, that is, the larger the portion of the phase space occupied by the boundaries, the higher Sb; (b) the uncertainty dimension term δ^{α_k}, related to the fractality of the boundaries; (c) log(m_k), which is a term related to the number of different exit basins m_k. In the case that the basins exhibit the property of Wada, then there is just one boundary
Fig. 17 (Color online) Evolution of the basin entropy Sb of the relativistic Hénon-Heiles system with β. Four regions are represented: (A) β ∈ (0, 0.2], increase of Sb up to β ≈ 0.2; (B) β ∈ (0.2, 0.4], steep decrease of Sb until β ≈ 0.4; (C) β ∈ (0.4, 0.625] and (D) β ∈ (0.625, 0.9), both regions showing a smoother decrease of Sb. The behavior of Sb in region (A) is explained by the reduction of both the KAM islands and the regions where the basins are mixed; the global effect is that N_k^0/N_0 is increased. In contrast, in region (B) the areas where the basins are mixed are negligible and the KAM islands shrink rapidly. These effects cause a relevant reduction of N_k^0/N_0. In regions (C) and (D) the exit basin sets are progressively larger and smoother (as was shown in Fig. 15). Figure obtained from Ref. [12]
that separates all the basins. In this case, the term log(m_k) is maximum and Sb is increased because all the possible exits are present in every boundary box. As we have shown in Sect. 6, this may be the case for the relativistic Hénon-Heiles system for β ≤ 0.625. In Fig. 17, we can see the evolution of the basin entropy Sb of the Hénon-Heiles system with β. We can distinguish 4 regions: (A) β ∈ (0, 0.2], increase of Sb up to β ≈ 0.2; (B) β ∈ (0.2, 0.4], steep decrease of Sb until β ≈ 0.4; (C) β ∈ (0.4, 0.625] and (D) β ∈ (0.625, 0.9), smoother decrease of Sb. As was shown in Sect. 5, the uncertainty dimension is a monotonically decreasing function of β, so the increase of Sb in region (A) can only be explained by a higher increase of N_k^0/N_0. In region (A), when β is increased, the zones where the basins are mixed are indeed reduced. However, the KAM islands are also reduced. These effects can be seen in the exit basin evolution from Fig. 15a to b. Moreover, there are more pieces in the grid of the discretized phase space occupied by the boundaries. In region (B) there is an important decrease of Sb because of the reduction of N_k^0/N_0. In this region, as β is increased, the areas of the phase space where the basins are mixed become negligible and the KAM islands progressively lose relevance in phase space. At β ≈ 0.4 there is an inflection point, just when the KAM islands are destroyed. There are fewer pieces of the grid occupied by the boundaries. In region (C), β ∈ (0.4, 0.625],
the exit basin areas are larger and they grow as β increases, while the fractality of the boundaries decreases. This is exactly what was found in Fig. 14, where the uncertainty dimension decreases abruptly from β ≈ 0.625 on. In region (D), the fractality of the boundaries is reduced and there have been some visual indications of the disappearance of the Wada basins, as described in Sect. 6. As we have seen in the course of this work, the exit basin topology of the relativistic Hénon-Heiles system varies with β, even for low velocities. From this point of view, the properties of the system that depend on the phase space topology may vary too.
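In practice, Eq. 21 is evaluated from a computed exit-basin diagram. The following Python sketch is a simplified version with our own names and a synthetic demo grid rather than the actual Hénon-Heiles basins: it covers a grid of exit labels with boxes of size δ, computes the Gibbs entropy of the labels inside each box and averages over the boxes, which is the operational recipe behind the basin entropy of Ref. [37].

```python
import numpy as np

def basin_entropy(labels, delta):
    """Average Gibbs entropy of the exit labels inside delta x delta boxes
    covering a 2-D exit-basin diagram."""
    n_rows, n_cols = labels.shape
    entropies = []
    for i in range(0, n_rows - delta + 1, delta):
        for j in range(0, n_cols - delta + 1, delta):
            box = labels[i:i + delta, j:j + delta]
            _, counts = np.unique(box, return_counts=True)
            p = counts / counts.sum()
            entropies.append(float(-(p * np.log(p)).sum()))
    return float(np.mean(entropies))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    mixed = rng.integers(0, 3, size=(600, 600))      # three thoroughly mixed basins
    single = np.zeros((600, 600), dtype=int)         # a single compact basin
    print(f"Sb (mixed three-basin grid) = {basin_entropy(mixed, 25):.3f}  (log 3 = {np.log(3):.3f})")
    print(f"Sb (single basin)           = {basin_entropy(single, 25):.3f}")
```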
7 Applications to Natural Phenomena We have seen along this work that the dynamics of the relativistic Hénon-Heiles system mainly depends on the evolution of the topology of the phase space when we vary β. In particular, we have concluded that the existence of KAM islands in the phase space of the system is the key driver of nonhyperbolic versus hyperbolic dynamics. From this point of view, the global properties of the system which depend on the topology of the phase space may vary even for low velocities. Although the Hénon-Heiles potential was initially developed to model the motion of stars around an axisymmetric galaxy, we think that the phenomena described in the present work may be related to many other phenomena occurring in Nature. For instance, the M-sigma (or M-σ) relation is an empirical correlation between the stellar velocity dispersion σ of a galaxy bulge and the mass M of the supermassive black hole at its center (see Refs. [38, 39]). This correlation is quite relevant since it is commonly used to estimate black hole masses in distant galaxies from the easily measured quantity σ. The M-sigma relation is well described by an algebraic power law and, although the applications of the Hénon-Heiles and the M-sigma models are very different, we speculate that the underlying mathematical properties are similar, including the presence of KAM islands in the phase space of both systems. The KAM islands would then be responsible for the algebraic decay law and, hypothetically, their destruction would imply an exponential decay law relating σ to the mass M of the supermassive black hole. Examples of systems that can be described by the Hénon-Heiles Hamiltonian are, for instance, the planar three-body system, buckled beams and some stationary plasma systems [40]. One relevant topic of special interest that is also modeled by the Hénon-Heiles Hamiltonian is the dynamics of charged particles in a magnetic dipole field (also called the Störmer problem). Scientists have long studied it in the context of the northern lights and cosmic radiation, and it models how a charged particle moves in the magnetic field of the Earth. The analysis of this system leads to the conclusion that charged particles are either trapped in the Earth's magnetosphere or escape to infinity, and the trapping region is bounded by a torus-like surface, the Van Allen inner radiation belt. In the trapping region, the motion of the charged particles can be periodic, quasi-periodic or chaotic [41].
According to the research we have presented in this chapter, if we want to study global properties of the Störmer problem, we should consider the relativistic effects because those properties depend on the exit basin topology. In those cases, we may expect that the uncertainty associated to the prediction of the final state of the particles varies as the parameter β increases.
8 Conclusions In the last years, there has been important progress in understanding relativistic effects in chaotic scattering. Most of the research has focused on studying the discrepancies between the Newtonian and the relativistic approaches for the trajectories of the particles. Here we show that some relevant global properties of chaotic scattering systems do depend on the effect of the Lorentz transformations, and we may need to consider the relativistic corrections if we want to describe them in a realistic manner, even for low velocities. We have used the Hénon-Heiles system in order to undertake our theoretical reasoning and to perform the numerical computations. We consider that the global properties of the Hénon-Heiles system vary because the Lorentz corrections modify the topology of the phase space. In that sense, according to the regime of energies we have chosen for our numerical calculations, the KAM islands are fully destroyed for β ≈ 0.4. We have shown in Fig. 3 that the average escape time T̄e of the Hénon-Heiles system decreases when β increases. Indeed, at β ≈ 0.4 there is a leap where the linear trend of T̄e changes abruptly. This can be easily explained from the perspective of the KAM islands destruction (see Fig. 4). We have explained the shape of the curves T̄e(β), α(β) and τ(β) in Figs. 3, 5 and 6 by energetic considerations. We have also characterized the decay laws of the nonhyperbolic and hyperbolic regimes, obtaining algebraic expressions that fit the data from our numerical computations. We have also focused our attention on describing different characteristics of the exit basin topology, such as the uncertainty dimension, the Wada property and the basin entropy of the relativistic Hénon-Heiles system. We have found that the Lorentz corrections modify these properties. We have shown in Fig. 14 the evolution of the uncertainty dimension, D, in a typical scattering function as the parameter β is varied. We found that D decreases almost linearly as β increases up to a certain value of the parameter, β ≈ 0.625, when a crossover phenomenon occurs and D decreases abruptly. This takes place due to the transition from the algebraic particle decay law to the exponential decay law. We have also described in a qualitative manner the evolution of the exit basin topology with the parameter β (see Fig. 15). Moreover, we have found qualitative evidence that the Wada basin boundaries disappear for β > 0.625. Lastly, we have used the concept of basin entropy to quantify the evolution of the exit basins with any variations of the parameter β (see Fig. 17). All our results point out that the uncertainty in the prediction of the final fate of the system depends
on the considered value of β, and this relation is not linear. There are some intervals, e.g., β ∈ (0, 0.2], where Sb increases as β grows, whereas there are others, e.g., β ∈ (0.4, 0.625], where Sb decreases rapidly. As we have seen, if we want to make accurate predictions about the final state of any chaotic scattering system, we think that the relativistic corrections should always be considered, regardless of the energy of the system. Lastly, we have speculated about the possibility of finding this dependence of the global properties of the system on the topology of the phase space in many other phenomena in Nature. For instance, the M-σ correlation is well described by an algebraic power law. We may consider that this relation is due to the presence of KAM islands in the phase space, whose destruction would involve an exponential decay law. Another example of the application of the results we have obtained to real problems in Nature is the description of the dynamics of charged particles moving through a magnetic dipole field, as modeled by the Störmer problem. Therefore, we consider that our results are useful for a better understanding of relativistic chaotic scattering systems. Acknowledgments We dedicate this work to our colleague and friend Valentin Afraimovich. This work was supported by the Spanish State Research Agency (AEI) and the European Regional Development Fund (FEDER) under Projects No. FIS2016-76883-P and PID2019-105554GB-I00. MAFS acknowledges the jointly sponsored financial support by the Fulbright Program and the Spanish Ministry of Education (Program No. FMECD-ST-2016).
References 1. J.M. Seoane, M.A.F. Sanjuán, Rep. Prog. Phys. 76, 016001 (2012) 2. E. Ott, T. Tél, Chaos 3, 4 (1993) 3. C. Grebogi, E. Ott, J.A. Yorke, Physica D 7, 181 (1983) 4. T. Tél, B.L. Hao (Ed.), Directions in Chaos, ed. by B.L. Hao, vol. 3 (World Scientific, Singapore, 1990), p. 149; in STATPHYS 19, ed. by B.L. Hao (World Scientific, Singapore, 1996), p.243 5. H.C. Ohanian, Special Relativity: A Modern Introduction (Physics Curriculum & Instruction, Lakeville, 2001) 6. B.L. Lan, Chaos 16, 033107 (2006) 7. B.L. Lan, Chaos, Solitons Fractals 42, 534 (2009) 8. B.L. Lan, F. Borondo, Phys. Rev. E 83, 036201 (2011) 9. S.-N. Liang, B.L. Lang, Results Phys. 4, 187 (2014) 10. J.M. Aguirregabiria, arXiv: 1004.4064v1 (2010). 11. J.D. Bernal, J.M. Seoane, M.A.F. Sanjuán, Phys. Rev. E 95, 032205 (2017) 12. J.D. Bernal, J.M. Seoane, M.A.F. Sanjuán, Phys. Rev. E 97, 042214 (2018) 13. J.D. Bernal, J.M. Seoane, M.A.F. Sanjuán, Phys. Rev. E 88, 032914 (2013) 14. M. Hénon, C. Heiles, Astron. J. 69, 73 (1964) 15. R. Barrio, F. Blesa, S. Serrano, Europhys. Lett. 82, 10003 (2008) 16. G. Contopoulos, Astron. Astrophys. 231, 41 (1990) 17. J.M. Seoane, M.A.F. Sanjuán, Phys. Lett. A 372, 110 (2008) 18. J.M. Seoane, L. Huang, M.A.F. Sanjuán, Y.-C. Lai, Phys. Rev. E 79, 047202 (2009) 19. J.M. Seoane, M.A.F. Sanjuán, Int. J. Bifurcation Chaos 20, 2783 (2010)
20. C.F.F. Karney, Physica D 8, 360 (1983); B.V. Chirikov, D.L. Shepelyansky, Physica D 13, 395 (1984); J. Meiss, E. Ott, Physica D 20, 387 (1986); Y.-C. Lai, M. Ding, C. Grebogi, R. Blümel, Phys. Rev. A 46, 4661 (1992) 21. M. Ding, T. Bountis, E. Ott, Phys. Lett. A 8, 395 (1991) 22. H.J. Zhao, M.L. Du, Phys. Rev. E 76, 027201 (2007) 23. M.C. Gutzwiller, Chaos in Classical and Quantum Mechanics (Springer, New York, 1990) 24. W.J. Reed, B.D. Hughes, Phys. Rev. E 66, 067103 (2002) 25. Y.-T. Lau, J.M. Finn, E. Ott, Phys. Rev. Lett. 66, 978 (1991) 26. S. Bleher, E. Ott, C. Grebogy, Phys. Rev. Lett. 63, 919 (1989) 27. C. Grebogi, S.W. McDonald, E. Ott, J.A. Yorke, Phys. Lett. A 99, 415 (1983) 28. J.M. Seoane, M.A.F. Sanjuán, Y.-C. Lai, Phys. Rev. E 76, 016208 (2007) 29. S. Bleher, E. Ott, C. Grebogi, Phys. Rev. Lett. 63, 919 (1989); S. Bleher, C. Grebogi, E. Ott, Physica D 46, 87 (1990) 30. C. Grebogi, S.W. McDonald, E. Ott, J.A. Yorke, Phys. Lett. A 99, 415 (1983); S.W. McDonald, C. Grebogi, E. Ott, J.A. Yorke. Phys. Lett. D 17, 125 (1985) 31. G. Contopoulos, Order and Chaos in Dynamical Astronomy (Springer, New York, 2002) 32. J. Kennedy, J.A. Yorke, Phys. D 51, 213 (1991) 33. H.E. Nusse, J.A. Yorke, Physica D 90, 242 (1996) 34. J. Aguirre, J.C. Vallejo, M.A.F. Sanjuán, Phys. Rev. E 64, 066208 (2001) 35. J.M. Seoane, J. Aguirre, M.A.F. Sanjuán, Y.-C. Lai, Chaos 16, 023101 (2006) 36. A. Daza, A. Wagemakers, M.A.F. Sanjuán, J.A. Yorke, Sci. Rep. 5, 16579 (2015) 37. A. Daza, A. Wagemakers, B. Georgeot, D.G. Odelin, M.A.F. Sanjuán, Sci. Rep. 6, 31416 (2016) 38. F. Ferrarese, D. Merritt, Astrophys. J 539, L9–L12 (2000) 39. K. Gebhardt, R. Bender, G. Bower, A. Dressler, S.M. Faber, A.V. Filippenko, R. Green, C. Grillmair, L.C. Ho, J. Kormendy, T.R. Lauer, J. Magorrian, J. Pinkney, D. Richstone, S. Tremaine, Astrophys. J 530, L13–L16 (2000) 40. J.C. Bastos, C.G.Ragazzo, C.P. Malta, Phys. Lett. A 241, 35 (1998) 41. R. Dilao, R. Alves-Pires, Chaos in the Störmer Problem. Progress in Nonlinear Differential Equations and Their Applications (Birkhäuser Verlag Basel, Switzerland, 2008)
Complex Dynamics of Solitons in Rotating Fluids Lev A. Ostrovsky and Yury A. Stepanyants
AMS (MSC 2010) 35Q53, 76B47
1 Introduction It is currently a well-known fact that solitary waves (or solitons) reveal features similar to those of material particles [1–4]. In particular, two interacting solitons in integrable systems emerge from their collision with the same parameters as before the collision, similarly to the elastic collision of two classical particles [5], which led to the term "soliton". We believe that the same notion can be extended to solitary waves in non-integrable systems, although their collisions are not entirely "elastic" since they can radiate small-amplitude waves. For the integrable equations, exact multi-soliton solutions can often be obtained by the inverse scattering method and other techniques such as the Hirota transform, the Bäcklund transform, or the Darboux–Matveev transform [6, 7]. Along with the exact methods, approximate asymptotic methods have been developed for similar problems; these are equally applicable to integrable and non-integrable systems, where solitary waves behave as attractive or repulsive particles, sometimes demonstrating even more complex formations, such as
L. A. Ostrovsky University of Colorado, Boulder, USA Y. A. Stepanyants () School of Sciences, Faculty of Health, Engineering and Sciences, University of Southern Queensland, Toowoomba, QLD, Australia e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 D. Volchenkov, J. A. Tenreiro Machado (eds.), Mathematical Methods in Modern Complexity Science, Nonlinear Systems and Complexity 33, https://doi.org/10.1007/978-3-030-79412-5_4
stationary or oscillating soliton pairs, multi-soliton structures, and breathers [1–4, 8–10]. A radically different dynamics of impulses close to solitons occurs in rotating systems, such as surface and internal gravity waves in the ocean under the action of the Coriolis force. As demonstrated by field data [11–13] and laboratory experiments [14–16], the effects of nonlinearity and rotation-induced dispersion can be comparable in many real situations. Modeling of such waves leads to the apparently non-integrable rotation-modified Korteweg–de Vries (rKdV) equation [17] which, strictly speaking, does not allow solitary waves at all due to radiation losses. However, solitons can exist on a long-wave background, which pumps energy into the soliton and thus compensates for the losses. In such a configuration, a soliton can perform complex motions, both periodic and unbounded. Even more complex is the dynamics of two or more solitons superimposed on a long wave. Some interesting aspects of this dynamics are briefly discussed below.
2 The Background 2.1 Soliton Interactions Without Rotation First we give some basic information about the behavior of solitons as particles. As mentioned, for the Korteweg–de Vries (KdV) and other integrable equations, interaction of solitons can be described by exact methods. The asymptotic theory of such processes is most thoroughly developed for the cases when the well separated solitons interact via their asymptotics (“tails”) [1, 8–10]. In these cases, the energy always flows from the rear soliton to the frontal one, and two scenarios are possible: (1) Overtaking, when the rear soliton is strong enough to overtake the frontal one, and (2) Exchange, when the frontal soliton is accelerated enough to break away. These processes are schematically shown in Fig. 1. The theory described below includes this process as a particular case. As mentioned, in the integrable systems, the solitons eventually restore the same amplitudes as before the interaction so that the latter only causes an additional time delay or phase shift (the “elastic collision”). In non-integrable systems such
Fig. 1 Two types of interaction of two solitons
processes are commonly accompanied by radiation of small-amplitude wave trains; therefore, soliton parameters after the interaction slightly differ from those before it (the “inelastic collision”).
2.2 Rotating Media Earth rotation makes the problem much more complex. In what follows we base our consideration on the rKdV model equation obtained as early as 1978 [17], which has now become an object of numerous studies:

(u_t + c_0 u_x + α u u_x + β u_xxx)_x = γ u,    (1)
where c_0 is the linear long-wave velocity, α and β are, respectively, the coefficients of nonlinearity and dispersion, and γ is proportional to the squared Coriolis frequency. This equation is not known to be integrable (except in the limits γ = 0 or β = 0); its solutions are defined by the interplay of "two dispersions": the Boussinesq-type (∼β) and the Coriolis-type (∼γ). Some specific features of this model found in different years are:
1. For the periodic and localized solutions, the mass integral M = ∫_{−∞}^{+∞} u(x, t) dx is zero [17].
2. There are no solitary waves on a constant background (the "antisoliton theorem") [18, 19].
3. In the long-wave case (β = 0) there exists a family of stationary periodic waves, including a limiting wave consisting of parabolic pieces [17].
4. An initial KdV soliton attenuates due to radiation of long small-amplitude waves and disappears as a whole entity in a finite time (the "terminal damping") [20].
Here we outline the recent results related to the behavior of a soliton and a pair of solitons interacting with a low-frequency "pump" wave, which compensates radiation losses and thus allows the existence of solitary waves. As will be shown below, solitary waves on the background of a long carrier wave reveal a non-trivial dynamics.
3 Solitons on a Long Wave 3.1 Asymptotic Equations For the sake of convenience we use Eq. (1) in the dimensionless form:

(u_t + u u_x + u_xxx)_x = u.    (2)
Fig. 2 Samples of periodic solutions of Eq. (3) with different amplitudes
Note first that for a long wave one can neglect the term with the third derivative. The resulting "reduced" equation is integrable [21]. Its stationary solutions u1, depending on the single variable S = x − Vt, satisfy the equation

d²/dS² (u1²/2 − c u1) = u1.    (3)
Periodic solutions of this equation are shown in Fig. 2 (a detailed analysis of both periodic and solitary solutions of (3) can be found in Refs. [22, 23]). This family of periodic solutions includes, in particular, a limiting solution with a parabolic profile:

u1(S) = (1/6)(S² − L²/12),   −L/2 ≤ S ≤ L/2,   c = L²/36.    (4)
It represents a sequence of parabolic arcs, one period of which is shown by line 3 in Fig. 2. Assume now that one or more solitons close to the solitary solution of the KdV equation exist on the background of the long wave described by Eq. (4). In the present notation, a KdV soliton, which is a particular solution of Eq. (2) with zero right-hand side, has the well-known form:

u2 = A sech²[(ζ − S)/Δ] − p,   ζ = x − ∫_0^t V dt,    (5)

where

V = A/3 − p + u1(ζ),   Δ = √(12/A),   p = 4√(3A)/Λ.    (6)
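A few lines of Python are enough to evaluate Eqs. (4)–(6) numerically. In the sketch below the values of L and A are purely illustrative, and Λ in the pedestal is taken equal to the period L of the limiting long wave, which is our reading of Eq. (6) rather than something stated explicitly here.

```python
import numpy as np

L = 40.0   # period of the limiting parabolic wave (illustrative value)
A = 10.0   # soliton amplitude (illustrative value)

def u1(S):
    """Limiting parabolic wave of Eq. (4), extended periodically."""
    S = (S + L / 2) % L - L / 2               # wrap S into [-L/2, L/2]
    return (S**2 - L**2 / 12) / 6.0

def u2(zeta, S0=0.0):
    """KdV soliton of Eqs. (5)-(6) on the small pedestal p, centred at S0."""
    Delta = np.sqrt(12.0 / A)                 # soliton width
    p = 4.0 * np.sqrt(3.0 * A) / L            # pedestal enforcing zero mass
    return A / np.cosh((zeta - S0) / Delta) ** 2 - p

if __name__ == "__main__":
    x = np.linspace(-L / 2, L / 2, 4001)
    print("mean of u1 over one period (should be ~0):", u1(x).mean())
    V = A / 3 - 4 * np.sqrt(3 * A) / L + u1(0.0)      # local soliton speed, Eq. (6)
    print("soliton speed at the trough of the long wave:", V)
```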
The small parameter p (the "pedestal") is added here to satisfy the zero-mass condition mentioned above. In a more general case, when N solitons are considered, the solution u2 reads:

u2 = Σ_{n=1}^{N} u_{2n} = Σ_{n=1}^{N} [ A_n sech²((ζ − S_n)/Δ_n) − p_n ],    (7)
where the parameters of each soliton are linked by the relations (6) with the corresponding local values of u1. Now let us seek a solution to Eq. (2) in the form u = u1 + u2, where u1 is given by Eq. (4), and make the natural assumptions that A_n ≫ max(u1) and

d1 > 0
d1 d2 − d3 > 0
d1 (d2 d3 − d1 d4) − d3² > 0
d4 [d1 (d2 d3 − d1 d4) − d3²] > 0    (7)
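These inequalities are the standard Routh-Hurwitz conditions for a quartic characteristic polynomial λ⁴ + d1 λ³ + d2 λ² + d3 λ + d4, i.e. for the four nonzero eigenvalues discussed below; the coefficients for the memristive system are not reproduced in this excerpt, so the demonstration values in the sketch are arbitrary.

```python
def hurwitz_stable(d1, d2, d3, d4):
    """Routh-Hurwitz conditions of Eq. (7) for the quartic
    lambda^4 + d1*lambda^3 + d2*lambda^2 + d3*lambda + d4."""
    delta3 = d1 * (d2 * d3 - d1 * d4) - d3 ** 2
    return d1 > 0 and d1 * d2 - d3 > 0 and delta3 > 0 and d4 * delta3 > 0

if __name__ == "__main__":
    # (lambda + 1)^4: all roots at -1, hence stable
    print(hurwitz_stable(4, 6, 4, 1))     # True
    # (lambda - 1)(lambda + 1)^3: one root in the right half-plane, unstable
    print(hurwitz_stable(2, 0, -2, -1))   # False
```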
As the parameter c increases from 15 to 20 and the parameter |l| varies from 0 to 1.2, the stability distribution of the equilibrium points is shown in Fig. 3, where the green regions are unstable, whereas the cyan regions are stable.
3 Multiple Coexisting Attractors In this section, the dynamical behaviors of the system starting from equilibrium and non-equilibrium points are analyzed, and the coexistence of infinitely many attractors is shown.
3.1 Multiple Coexisting Attractors Depending on c and v(0) Firstly, the system parameters a = 9, b = 30, c = 17, m = −1.2, n = 1.2 are kept unchanged, and the initial conditions are set as [10^{-6}, 0, 0, 0, v(0)] and [−10^{-6}, 0, 0, 0, v(0)], respectively. With v(0) increasing in the region [−1, 1], the bifurcation diagrams of the state variable x and the Lyapunov exponent spectra are presented in Fig. 4. Specifically, the bifurcation behaviors of regions I and II in Fig. 4a and the corresponding three largest Lyapunov exponents are shown in Table 1.
Fig. 4 Bifurcation diagrams and Lyapunov exponent spectrum with the initial condition v(0) increasing (a) x(0) = ±10^{-6}, bifurcation diagrams; (b1), (b2) x(0) = ±10^{-6}, the three maximum Lyapunov exponents

Table 1 Dynamic behavior in different regions
|  | Region I | Region II |
| Fig. 4a, dynamic behavior (blue) | Period → Chaos → Quasi-period | Chaos → Period → Point |
| Fig. 4a, dynamic behavior (red) | Point → Period → Chaos | Period → Chaos → Quasi-period |
| Fig. 4(b1), (L1, L2, L3) | (0, 0, −) → (+, 0, −) → (0, 0, −) | (+, 0, −) → (0, 0, −) → (−, −, −) |
| Fig. 4(b2), (L1, L2, L3) | (−, −, −) → (0, 0, −) → (+, 0, −) | (0, 0, −) → (+, 0, −) → (0, 0, −) |
When v(0) values in different regions are selected, the phase trajectories of the system display the coexisting single-scroll chaotic attractor, periodic attractor and asymmetric chaotic attractor, as shown in Fig. 5a, c. Figure 5a shows the coexisting period-1 and single-band chaotic attractor; Fig. 5c shows the coexisting asymmetric chaotic attractors. Meanwhile, the topological properties of the coexisting phase trajectories are studied by using the Poincaré map, as shown in Fig. 5b, d. Next, when x(0) = ±10^{-6}, v(0) = ±0.2, and c varies in the region [16, 19.5], the four nonzero eigenvalues are λ1 > 0, λ2,3 = α1 ± jω1, λ4 < 0 with α1 < 0, which corresponds to the unstable region I in Fig. 3 and means that the system motion starts from an unstable saddle point. With c increasing in the region [16, 19.5], the bifurcation diagrams of the state variable z and the Lyapunov exponent spectrum are shown in Fig. 6a–d. When v(0) = ±0.63 and the other conditions are kept unchanged, the four nonzero eigenvalues satisfy λ1,2 = α2 ± jω2, λ3,4 = α3 ± jω3 and α2 > 0,
Fig. 5 The coexisting behavior with different v(0) (a) coexistence of period-1 and single scroll chaotic attractor; (b) Poincaré maps of coexisting trajectories; (c) coexistence of asymmetric chaotic attractors; (d) Poincaré maps of coexisting chaotic attractors
α3 < 0, which corresponds to the unstable region II in Fig. 3 and indicates that the system trajectory starts from an unstable saddle-focus point. The corresponding bifurcation diagrams and Lyapunov exponent spectra are shown in Fig. 6e, f. From Fig. 6, four sets of different parameters c are selected, and each parameter corresponds to six sets of different initial conditions. ICi (i = 1, 2, 3, 4) are in the unstable region I shown in Fig. 3; ICi (i = 5, 6) are in the unstable region II shown in Fig. 3, as listed in Table 2. In Table 2, P_i^j represents the i-th type of period-j attractor, Ci represents the i-th type of asymmetric double-scroll chaotic attractor, and Si represents the i-th type of single-scroll chaotic attractor. The phase diagrams of the coexisting attractors listed in Table 2 are shown in Fig. 7. Figure 7a shows six types of coexisting period-1 attractors under c = 19.4; Fig. 7b shows the coexisting periodic limit cycles under c = 18.6; Fig. 7c shows the coexistence of periodic limit cycles and single-scroll chaotic attractors under c = 17.7; Fig. 7d shows the coexistence of limit cycles and asymmetric double-scroll chaotic attractors under c = 17.3.
Fig. 6 The dynamic behaviors of memristor chaotic system (a) |v(0)| = 0.2, bifurcation diagram of the state variable z; (b) Lyapunov exponent spectrum; (c) bifurcation diagram of the state variable z; (d) Lyapunov exponent spectrum; (e) |v(0)| = 0.63, bifurcation diagram of the state variable z; (f) Lyapunov exponent spectrum
3.2 Multiple Coexisting Attractors Depending on x(0) and v(0) The coexistence of infinitely many attractors depending on the two state variables x(0) and v(0) is described by attraction basins. To present clearer dynamics, the range of
Table 2 The coexisting trajectories of the system
| Initial condition ICi (i = 1, ..., 6) | c = 19.4 | c = 18.6 | c = 17.7 | c = 17.3 |
| IC1 = [10^{-6}, 0, 0, 0, 0.2] | P_1^1 | P_7^1 | P_5^2 | P_1^4 |
| IC2 = [10^{-6}, 0, 0, 0, −0.2] | P_2^1 | P_1^2 | S_1 | P_2^4 |
| IC3 = [−10^{-6}, 0, 0, 0, 0.2] | P_3^1 | P_2^2 | S_2 | P_1^4 |
| IC4 = [−10^{-6}, 0, 0, 0, −0.2] | P_4^1 | P_8^1 | P_6^2 | P_2^4 |
| IC5 = [10^{-6}, 0, 0, 0, 0.63] | P_5^1 | P_3^2 | P_1^3 | C_1 |
| IC6 = [10^{-6}, 0, 0, 0, −0.63] | P_6^1 | P_4^2 | P_2^3 | C_2 |
| Phase diagrams | Figure 7a | Figure 7b | Figure 7c | Figure 7d |
Fig. 7 The coexisting phase trajectories of different attractors in different planes (a) c = 19.4; (b) c = 18.6; (c) c = 17.7; (d) c = 17.3
parameter variation is relatively small. Therefore, we attempt to analyze the extreme multistability in a wide parameter range by choosing x(0) as the bifurcation parameter, which means that the initial condition is not an equilibrium point. Through simulation analysis, it is found that the choice of v(0) in different intervals has an influence on the bifurcation modes of x(0), as shown in Table 3. When v(0) ∈ [−1.87, 1.87] and v(0) is chosen as 0, ±0.5, ±0.9, ±1, respectively,
Table 3 The bifurcation modes of x(0) with different v(0)
| The region of v(0) | Some values of v(0) | Bifurcation modes |
| v(0) ∈ [−1.87, 1.87] | v(0) = 0, ±0.5, ±0.9, ±1 | Figure 8a |
| v(0) ∈ [−2.20, −1.87) ∪ (1.87, 2.20] | v(0) = ±1.88, ±1.9, ±1.91 | Figure 8b |
| v(0) ∈ [−3.15, −2.20) ∪ (2.20, 3.15] | v(0) = ±2.25, ±2.27, ±2.3 | Figure 8c |
| v(0) ∈ (−∞, −3.15) ∪ (3.15, +∞) | v(0) = ±3.17, ±3.2, ±3.22 | Figure 8d |
from Table 3, the bifurcation diagrams of the state variable y with increasing x(0) are shown in Fig. 8a. From Fig. 8a, we can find that when v(0) takes any value within the region [−1.87, 1.87], the coexisting periodic, quasi-periodic and chaotic attractors can be obtained by changing the value of the state variable x(0). With v(0) ∈ [−2.20, −1.87) ∪ (1.87, 2.20] and v(0) chosen as ±1.88, ±1.9, ±1.91, respectively, the bifurcation diagrams of the state variable y with increasing x(0) are shown in Fig. 8b. Compared with the bifurcation mode of Fig. 8a, the bifurcation mode of Fig. 8b becomes discontinuous and the bifurcation diagram exhibits discontinuous periodic orbits. When v(0) is chosen as ±2.25, ±2.27, ±2.3, respectively, with v(0) ∈ [−3.15, −2.20) ∪ (2.20, 3.15], the bifurcation diagrams of the state variable y with increasing x(0) are depicted in Fig. 8c, which shows that the bifurcation diagram is also discontinuous and the trajectories of the system change directly from a stable fixed point to a large period-1 orbit. When |v(0)| > 3.15 and v(0) is chosen as ±3.17, ±3.2, ±3.22, respectively, the motion trajectories of the system only include a stable fixed point and a large period-1 orbit, as shown in Fig. 8d.
3.3 Coexisting Strange Attractors Usually, chaotic attractors describe the relationships among the voltage across the capacitor (v), the current in the inductance branch (i), and the charge (q) and magnetic flux (ϕ) of the memristor. If the observed state variables are generalized to power (P) and energy (W) signals, it is found that the folding and extension of the attractors are more complex. With the system parameters a = 9, b = 30, c = 17, m = −1.2, n = 1.2 and initial conditions (10^{-6}, 0, 0, 0, 0), numerical simulations for the relationships of different state variables are performed in Fig. 9. Figure 9a shows a four-wing chaotic attractor, which reflects the relationship between the voltage (v1) across the capacitor C1 and the energy (W(L1)) in the inductor L1. Figure 9b displays a four-scroll chaotic attractor, which describes the relationship between the charge (q) of the memristor M(q) and the energy (W(v2)). Figure 9c exhibits a four-wing and double-scroll chaotic attractor, which implies the relationship between the power of the memristor and the voltage across the capacitor C1. Figure 9d reveals a double-wing and double-scroll chaotic attractor, which expresses the relationship between the power of the memristor M(q) and the energy in the inductor L2.
Fig. 8 Bifurcation diagrams of the parameter x(0) with v(0) in different regions. (a) |v(0)| ≤ 1.87; (b) 1.87 < |v(0)| ≤ 2.20; (c) 2.20 < |v(0)| ≤ 3.15; (d) |v(0)| > 3.15
Fig. 9 Strange chaotic attractors in different planes (a) v1 − W(L1 ); (b) q − W(v2 ); (c) P(M(q)) − v1 ; (d) P(M(q)) − W(L2 )
In addition, the power (P), energy (W), voltage (v) and charge (q) are chosen as independent signals to analyze the coexisting dynamical behaviors. When the system initial values are changed, the symmetry of the strange chaotic attractor is destroyed, and it is found that the system motion has a large number of coexisting behaviors, as shown in Fig. 10. In Fig. 10a, with the initial conditions IC1 = (10^{-6}, 0, 1.3, 0, 0) and IC2 = (10^{-6}, 0, −1.3, 0, 0), respectively, the system exhibits coexisting double-wing chaotic attractors in the v1 − W(L1) plane. Figure 10b displays coexisting double-scroll chaotic attractors in the q − W(v2) plane for the initial conditions IC1 and IC2. In Fig. 10c, for the initial conditions IC3 = (10^{-6}, 0, 0.5, 0, 0) and IC4 = (10^{-6}, 0, −0.5, 0, 0), there are coexisting double-wing and single-scroll chaotic attractors in the P(M(q)) − v1 plane. Figure 10d exhibits coexisting single-wing and single-scroll chaotic attractors in the P(M(q)) − W(L2) plane with the initial conditions IC3 and IC4, respectively.
Fig. 10 Coexisting strange chaotic attractors (a and b) IC1 (blue), IC2 (red); (c and d) IC3 (blue), IC4 (red)
4 Color Image Encryption 4.1 Analysis of Image Encryption Algorithm In this section, we adopt the method of combining pixel value replacement and pixel position scrambling, and design an image encryption scheme based on chaotic sequences, as shown in Fig. 11. The scheme can be outlined as follows:
Step 1 We obtain the RGB pixel sequences C^R_{M×N}, C^G_{M×N}, C^B_{M×N} from the M × N digital color image (M is the number of pixel rows, N the number of pixel columns). Two chaotic sequences of length M × N are selected and processed to obtain new sequences X1 and X2:
X1 = mod(x1 × 1000, M × N),
X2 = mod(x2 × 1000, M × N).    (8)
Fig. 11 Flowchart on encryption and decryption of color digital image
Step 2 We combine the sequence X1 with the pixel sequences C^R_{M×N}, C^G_{M×N}, C^B_{M×N} by XOR, and obtain new sequences X1′, X2′, X3′:

X1′ = bitxor(X1, C^R_{M×N}),
X2′ = bitxor(X1, C^G_{M×N}),
X3′ = bitxor(X1, C^B_{M×N}).    (9)
Step 3 A new sequence is given by T = {t1, t2, . . . , ti, . . . , tM×N} = {1, 2, . . . , M × N} and the element ti of the sequence T is handled as follows: we take the element xi of the sequence X2, exchange the i-th element of T with the xi-th element of T, and then move the element ti backward by xi positions to obtain a new sequence T′.
Step 4 The new sequence T′ is used as an address mapping table to rearrange the elements of the sequences X1′, X2′, X3′; we thus obtain the encrypted RGB pixel matrices X1″, X2″, X3″, which are synthesized into the encrypted color image. The decryption process is the reverse of the encryption process, and we omit it here.
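The following Python sketch mirrors the two stages above for a single colour channel. It is only an illustration of the structure of the scheme, not the authors' code: a logistic map stands in for the chaotic sequences generated by the memristive system, the XOR mask is reduced modulo 256 (the pixel depth) rather than modulo M × N, and the Step 3 shuffle is replaced by a simpler argsort-based permutation; all function names are ours.

```python
import numpy as np

def chaotic_sequence(x0, r, n):
    """Stand-in key stream: logistic-map iterates (the scheme above uses
    sequences produced by the memristive chaotic system instead)."""
    seq, x = np.empty(n), x0
    for i in range(n):
        x = r * x * (1.0 - x)
        seq[i] = x
    return seq

def encrypt_channel(channel, key1, key2):
    """XOR diffusion with one key stream (Step 2), then position scrambling
    driven by the ordering of a second key stream (Steps 3-4, simplified)."""
    flat = channel.ravel().astype(np.uint8)
    xor_key = (np.floor(key1 * 1e6) % 256).astype(np.uint8)
    diffused = np.bitwise_xor(flat, xor_key)
    perm = np.argsort(key2)                      # address mapping table
    return diffused[perm].reshape(channel.shape), perm

def decrypt_channel(cipher, key1, perm):
    flat = cipher.ravel()
    unscrambled = np.empty_like(flat)
    unscrambled[perm] = flat                     # undo the permutation
    xor_key = (np.floor(key1 * 1e6) % 256).astype(np.uint8)
    return np.bitwise_xor(unscrambled, xor_key).reshape(cipher.shape)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    plain = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)   # one channel
    k1 = chaotic_sequence(0.4, 3.99, plain.size)
    k2 = chaotic_sequence(0.3, 3.99, plain.size)
    cipher, perm = encrypt_channel(plain, k1, k2)
    assert np.array_equal(decrypt_channel(cipher, k1, perm), plain)
    print("encryption/decryption round trip OK")
```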
4.2 Simulation Analysis In the following, the standard Lena color digital image, whose size is 256 × 256, is selected. The system parameters of the memristive chaotic system are set as a = 9, b = 30, c = 17, m = −1.2, n = 1.2, and the system initial conditions, with values [0.1, 0.1, 0, 0, 0.4], [0, 0.1, 0, 0, 0.3], [0.1, 0, 0, 0, 0.4], are chosen as the algorithm keys of
Fig. 12 Image encryption results (a–c) RGB plain image; (e–g) RGB cipher image; (d) plain color image; (h) cipher color image
the R, G and B images, respectively. Simulation results are shown in Fig. 12. The cipher RGB images completely hide the useful information, which means that the algorithm makes good use of the chaotic sequences to encrypt the color image. In order to assess the validity of the algorithm, other analysis methods are also adopted, such as image histogram analysis, pixel correlation analysis, and key sensitivity analysis.
1. Statistical analysis. The image histogram is one of the most important statistical tools. Here, the R, G and B channels of the Lena.bmp color image are analyzed statistically, as shown in Fig. 13. Figure 13a–c shows the plain RGB image histograms, whose distributions are extremely uneven; Fig. 13d–f shows the cipher RGB image histograms with uniform distributions. These indicate that the encryption algorithm can effectively diffuse the plain information into a random-looking cipher and resist statistical attacks.
2. Correlation analysis. Correlation analysis is usually applied to study the scrambling performance of an encryption algorithm, and it is one of the important performance indexes to evaluate image encryption. By analyzing the distribution of adjacent pixels in the plain and cipher images, we can assess the effectiveness of the memristive chaotic encryption algorithm. The correlation of adjacent pixels can be calculated by the following formula:
Fig. 13 Image histogram (a–c) plain R G B image histogram; (d–f) cipher R G B image histogram
E(x) = (1/N) Σ_{i=1}^{N} xi,
D(x) = (1/N) Σ_{i=1}^{N} [xi − E(x)]²,
cov(x, y) = (1/N) Σ_{i=1}^{N} [xi − E(x)][yi − E(y)],
ρ_{x,y} = cov(x, y) / (√D(x) √D(y)),    (10)
where x and y are the pixel values of two adjacent pixels, respectively, and ρ_{x,y} indicates the correlation coefficient of adjacent image pixels. Figure 14a–c presents the adjacent-pixel correlation of the plain R, G, B images in the horizontal direction, while Fig. 14d–f shows the adjacent-pixel correlation of the cipher R, G, B images in the horizontal direction. It is obvious that the correlation between the pixels of the plain image is linear, while the correlation between the pixels of the cipher image is random. As can be seen from Table 4, the pixel correlation coefficients of the plain RGB image in the horizontal, vertical and diagonal directions are close to 1, whereas the pixel correlation coefficients of the encrypted RGB image in the horizontal, vertical and diagonal directions are very close to 0 (a short routine implementing Eq. (10) is sketched at the end of this subsection). The encrypted
Fig. 14 Correlation of adjacent pixels in horizontal direction (a–c) plain image R, G, B; (d–f) cipher image R, G, B

Table 4 The correlation coefficient of plain image and cipher image
| Direction | Plain R | Plain G | Plain B | Cipher R | Cipher G | Cipher B |
| Horizontal | 0.9571 | 0.9457 | 0.9291 | 0.0029 | −0.0098 | 0.0056 |
| Vertical | 0.9624 | 0.9481 | 0.9336 | −0.0118 | 0.0036 | −0.0183 |
| Diagonal | 0.9248 | 0.8953 | 0.8512 | 0.1380 | 0.2343 | 0.1939 |
image has no features of the original image, which means it can effectively resist plain-text attacks.
3. Secret key space and sensitivity analysis. The system parameters a, b, c, m, n and the initial values x0, y0, z0, u0, v0 are chosen as secret keys. The key space is up to 10^150, which is much larger than 2^100, which means that exhaustive attacks will lose their effect. For the key sensitivity, we change the system parameter a and the initial value x0 by 10^{-15}, respectively, while the remaining key parameters are kept unchanged. The decrypted images are shown in Fig. 15. The other secret key parameters have the same sensitivity, which indicates that the algorithm is highly sensitive to the secret keys.
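Equation (10) translates directly into a short routine. The Python helper below is our own sketch, evaluated on synthetic data rather than the Lena image: it samples random pairs of adjacent pixels in the requested direction and returns ρ_{x,y}; for a noise-like cipher channel the value is close to 0.

```python
import numpy as np

def adjacent_pixel_correlation(img, direction="horizontal", n_pairs=5000, seed=0):
    """Correlation coefficient rho_{x,y} of Eq. (10) for randomly chosen pairs
    of adjacent pixels of one image channel."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    if direction == "horizontal":
        r, c = rng.integers(0, h, n_pairs), rng.integers(0, w - 1, n_pairs)
        x, y = img[r, c], img[r, c + 1]
    elif direction == "vertical":
        r, c = rng.integers(0, h - 1, n_pairs), rng.integers(0, w, n_pairs)
        x, y = img[r, c], img[r + 1, c]
    else:                                     # diagonal
        r, c = rng.integers(0, h - 1, n_pairs), rng.integers(0, w - 1, n_pairs)
        x, y = img[r, c], img[r + 1, c + 1]
    x, y = x.astype(float), y.astype(float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return cov / (np.sqrt(x.var()) * np.sqrt(y.var()))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    cipher_like = rng.integers(0, 256, size=(256, 256))    # noise-like channel
    for d in ("horizontal", "vertical", "diagonal"):
        print(d, round(adjacent_pixel_correlation(cipher_like, d), 4))
```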
Fig. 15 Test results of key sensitivity: (a) a = a + 10^{-15}, decrypted image; (b) x0 = x0 + 10^{-15}, decrypted image; (c) decrypted image with the correct key
5 Conclusions

By replacing the nonlinear resistance in Chua's circuit with a charge-controlled memristor model, a fifth-order memristive chaotic system is presented in the paper. One feature of the proposed memristive chaotic system is that its multistability depends strongly on the memristor initial condition and the system parameters. Numerical simulations for coexisting attractors are performed, which show the coexisting behaviors of point, period, chaos, and quasi-period. More interestingly, the first state variable of the newly proposed memristive system exhibits different bifurcation structures for memristor initial conditions in different regions, and its multistability closely relies on the memristor initial condition and the first state variable, which reveals that the complex dynamical behaviors of the system motion start from a non-equilibrium point. Another feature is that some strange chaotic attractors of mixed or superimposed type are obtained by observing the power and energy signal, and the dynamic behaviors of the coexisting strange attractors are exhibited by phase diagrams. Finally, color image encryption based on the new chaotic system is studied and a security performance analysis is performed, including key space, key sensitivity, and anti-attack ability. Results show that the new system can be used to encrypt color images in a very secure way.
Extreme Events and Emergency Scales

Veniamin Smirnov, Zhuanzhuan Ma, and Dimitri Volchenkov
1 Introduction

Not a single day passes by without hearing about extreme events, which surround us almost everywhere. On the one hand, climate change results in droughts, heat waves, tornadoes, storms, etc.; the movement of tectonic plates is responsible for earthquakes and volcanic eruptions. On the other hand, certain aspects of human activities may crash stock markets and influence tensions not only among groups of people within a country but also between countries, sometimes leading to military confrontations, mass migrations, etc. While there is no single definition of extreme events [1], they are considered events that cause infrastructure failures, economic and property losses, and risk to health and life. In order to quantify extreme events, practitioners have developed several scales: for instance, the Modified Mercalli Intensity scale [2] (describes the intensity of visible earthquake damage), the Beaufort Wind scale [3] (measures the speed and observed effects of the wind on sea and land), the Saffir-Simpson Hurricane scale [4] (measures wind speed), the Fujita scale [5] (rates the intensity of a tornado after it has passed), the US Homeland Security Terror Alert scale [6] (five color-coded terror alert levels), the U.S. Climate Extremes Index [7], etc. The Rohn Emergency Scale [8] unites emergency scales using three independent dimensions: (1) scope; (2) topographical change (or lack thereof); and (3) speed of change. The intersection of the three dimensions provides a detailed scale for defining any emergency [8]. In
some papers, the threshold for an extreme event is related to the number of standard deviations from the average amplitude [9, 10]. However, existing empirical scales tend to describe the characteristics of the event itself rather than the consequences; such scales are ill-suited to describe emergencies in a way that is meaningful for response [11]. For example, the severity of earthquake shaking (in terms of the amount of seismic energy released) is measured using the Richter magnitude scale, which ranges from 1.0 (microearthquakes, not felt or felt rarely, happening several million times per year) to 9.0 (at or near total destruction, causing permanent changes in ground topography, and happening once per 10 to 50 years) [12]. At the same time, a projection of an earthquake on the Richter magnitude scale may not be an accurate scale of the disaster. The 2010 Haiti earthquake of a catastrophic magnitude 7.0 caused from 100,000 to 160,000 deaths and created between $7.8 billion and $8.5 billion in damage [13]. Later, on July 5, 2019, there was a magnitude 7.1 earthquake near Southern California's Searles Valley. Estimated losses are at least $1 billion, and no deaths were reported related to this catastrophe [14]. Hence, the magnitude of an earthquake is not equivalent to the value of its consequences. We also witness extreme events of different magnitudes in financial markets, from global recessions (defined by a global annual GDP growth rate of 3.0% or less) that happened in 1975, 1982, 1991, and 2009, to "flash crashes" (for instance, on May 6, 2010, the S&P 500 declined 7% in less than 15 min and then quickly rebounded). Unlike the Richter magnitude scale, the severity of extreme events in financial markets is defined by the measures that need to be taken to ease panic, such as halting trading. Under 2012 rules, market-wide circuit breakers (or 'curbs') kick in when the S&P 500 index drops 7% for Level 1, 13% for Level 2, and 20% for Level 3 from the prior day's close. A market decline that triggers a Level 1 or 2 circuit breaker before 3:25 p.m. Eastern Time will halt trading for 15 min, but will not halt trading at or after 3:25 p.m. Circuit breakers can also be imposed on single stocks as opposed to the whole market. Under current rules, a trading halt on an individual security is placed into effect if there is a 10% change in the value of a security that is a member of the S&P 500 Index within a 5-min time frame, a 30% change in the value of a security whose price is equal to or greater than $1 per share, and a 50% change in the value of a security whose price is less than $1 per share [15]. Ironically, in August 2015, single-stock circuit breakers produced unprecedented disruption as 327 exchange-traded funds experienced more than 1000 trading halts during a single day. Statistics of extreme events show that extreme events are found in the tails of probability distributions (i.e., the distributions' extremities). But different distributions may admit asymptotic tails of the same appearance (a power law). In our paper, we apply two approaches existing in practical extreme value analysis to study the S&P 500 time series in the period from January 2, 1980, till December 31, 2018. The first one relies on deriving block maxima series as a preliminary step. We fit the actual data to the Generalized Extreme Value Distribution (GEVD). Our results (Sect. 4.1) show that the distribution of block maxima is a composition of several distributions (Fig. 7b).
The second approach relies on extracting the peak values reached during any period in which values exceed a certain threshold (or fall below a certain threshold, for negative extremes).
Fig. 1 Triality of an extreme event
We will cover major methods of choosing the threshold in Sect. 4.2. Usually, with the second approach, the analysis may involve fitting two distributions: one for the number of events in the time period considered and a second for the size of the exceedances. But in practice a suitable threshold is unknown and must be determined for each analysis [16]. We use graphical methods, rules of thumb, and automatic methods to select the threshold. All the above reveals the triality of the nature of extreme events (Fig. 1). From a scientific point of view, the empirical emergency scales (based on the percentage of the population affected and the economic impact of damages) determine the qualitative characteristics of extreme events. The scales essentially specify the threshold for each level of the severity of extreme events. Such an empirical way of choosing the threshold, called the rule of thumb, is studied in Sect. 4.2.3. A choice of the threshold can also be made using statistical analysis tools, such as graphical approaches (Sect. 4.2) and automatic methods (Sect. 4.2.2). Rather than characterizing extreme events solely in terms of the distribution of weekly maxima, threshold methods take account of all events beyond a given high threshold, also known as exceedances; i.e., the chosen threshold establishes a tail of the distribution that contains the extreme events. This step completes the cycle (Fig. 1).
At each part of the diagram (Fig. 1), we face uncertainty in defining the threshold used to study a tail distribution and to decide whether a particular event is extreme. We applied several methods to select a threshold in Sect. 4.2, such as a rule of thumb and an automatic method, and in each instance the choice depends on the amount of data considered, the method used, etc. Moreover, once new data become available, we cannot guarantee the sustainability of the previously chosen threshold value. In addition, the assessment of the severity of an extreme event is also ruled by its consequences, which may be revealed ex post facto. In Sect. 4.3 we present statistics of extreme events under threshold uncertainty. We analyze the degree of uncertainty of the threshold value (chosen with the rule of thumb) based on the probability of its change on any given day. Figure 2 demonstrates the degree of uncertainty, which depends on both the amount of data available and the values of the thresholds. We observed several cases: (i) If the amount of data is not sufficient (the solid line corresponding to an 18-day window of observation in Fig. 2), then the uncertainty curve forms a skewed profile attaining a single maximum for some value of the threshold. In this situation, an observer's perception of events is reminiscent of the Red Queen from "Through the Looking-Glass and What Alice Found There" by Lewis Carroll [17], who
Fig. 2 Degree of uncertainty of different values of the thresholds based on the amount of trading days taken into account. Three shaded regions mark three scales of emergency: I—subcritical, II—critical, III—extreme and the solid curve represents uncertainty of the Red Queen State
said "When you say hill, I could show you hills, in comparison with which you'd call that a valley". Our understanding of whether the events are extreme or not is very limited, and the uncertainty is blurry. As events become more severe, our uncertainty that the events are extreme decreases. The observer realizes that the events are extreme, but a precise point at which the events become severe cannot be determined. This case is called the Red Queen State. (ii) As the window of observation becomes larger, the uncertainty curve exhibits two maxima, indicating that the amount of data is sufficient. As we further extend the window, the curve splits into sharp peaks (Fig. 2). These peaks clearly separate the threshold values into three regions: three levels of emergency. (I) In region I (subcritical), the threshold values are too small to raise concern about extreme events. Then we can observe a spike with the degree of uncertainty attaining its first maximum. This extremum indicates a transition to the next kind of uncertainty. (II) In region II (critical), uncertainty is conceptualized by the question of whether the magnitude of an event is already critical or extreme, or not yet. Further, we see another jump of uncertainty. At this point we are certain that events are not regular anymore. (III) We consider all events in this region extreme, with our uncertainty decreasing. If these events are not extreme, then what are they? We consider the Red Queen State and the three scales, which are based on the degree of uncertainty, to be our contribution to the discussion on extreme events and emergency scales. We conclude in the last Sect. 6.
2 Data Source and Description

The analysis of extreme events in the S&P 500 time series of log-returns has been performed based on data collected during 9835 trading days (39 years), in the period from January 2, 1980, till December 31, 2018 (see Fig. 3), acquired from the publicly available source at Yahoo Finance (https://finance.yahoo.com/quote/%5EGSPC/). The data set contains the S&P 500 index at market open, the highest point reached in the day, the lowest point in the day, the index at market close, and the number of shares traded. We have used the index at market close, since this information was always present in the data set. Computations were made using Python's numerical libraries, such as NumPy and Pandas, as well as the R language for statistical analysis.
Fig. 3 The S&P 500 Index of 500 large-cap U.S. stocks assessing market performance
Fig. 4 The S&P 500 index log-return at market close
3 Log Returns: Between Noise and Random Walks

Due to the complexity of financial data, in order to analyze the dynamics of the given financial time series, we computed the log-return, denoted R_ln, according to the following formula [18]:

$$R_{\ln}(t) = \ln \frac{\text{S\&P Index}(t)}{\text{S\&P Index}(t-1)}, \qquad (3.1)$$
where S&P Index(t) is the value of the S&P 500 at market close (Fig. 4). The largest drop of the log-return in Fig. 4 happened on Monday, October 19, 1987, known as Black Monday, when the S&P 500 shed nearly 23% of its value [19]. Another significant drop occurred in 2008, when the S&P 500 fell 38.49%, its worst yearly percentage loss [20]. In September 2008, Lehman Brothers collapsed as the financial crisis spread. However, on October 13, 2008, the S&P 500 marked its best daily percentage gain, rising 11.58%, and registered its largest single-day point increase of 104.13 points [20].
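The computation of the log-return (3.1) is straightforward; the following is a minimal sketch using pandas, where the toy price series and the function name are illustrative assumptions rather than the authors' code.

```python
import numpy as np
import pandas as pd

def log_returns(close: pd.Series) -> pd.Series:
    """Daily log-returns R_ln(t) = ln(P(t) / P(t-1)) computed from closing prices, Eq. (3.1)."""
    return np.log(close / close.shift(1)).dropna()

# Toy usage; real input would be the S&P 500 close column from the data set described above.
prices = pd.Series([100.0, 101.5, 99.8, 102.3])
print(log_returns(prices))
```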
Fig. 5 Distribution of log-return values in the log2-linear scale. Solid lines correspond to Zipf's Law (∝ x^{-1}), dotted lines represent a power law (∝ x^{-3.5}), and dashed lines correspond to a Gaussian distribution (∝ exp(−x^2)). Curves are given for reference only
The distribution of R_ln values (Fig. 5) is asymmetrically skewed, with fat, heterogeneous right and left tails. The log-return time series has a scale-invariant structure when the structure repeats itself on sub-intervals of the signal, X(ct) = c^H X(t), where the Hurst exponent H characterizes the asymptotic behavior of the auto-correlation function of the time series [21–23]. A larger Hurst exponent is visually seen as slower-evolving variations (i.e., a more persistent structure) of the time series [22, 24, 25]. Processes with 0 < H < 0.5 exhibit antipersistence, where an increase in the process is likely to be followed by a decrease in the next time interval, resulting in sample paths with a very rough structure [22, 24, 25]. On the contrary, values 0.5 < H < 1 lead to long-range dependence ("long memory") in the time series, with successive increments more likely to have the same sign (persistence) and smoother sample trajectories. Finally, the time series constitutes a random walk when H > 1, with more apparent slowly evolving fluctuations [22, 24, 25]. The q-order Hurst exponent H_q is only one of several types of scaling exponents used to parameterize the multifractal structure of time series [22, 26]. The log-return time series for the S&P 500 exhibits local fluctuations with both extremely small and large magnitudes, as well as short- and long-range dependences on different time scales [27, 28]; it is not normally distributed, and all q-order statistical moments should be considered to describe the spatial and temporal variation that reveals a departure of the log-return time series from simple random walk behavior [22, 24]. The q-order weights the influence of segments with large and small fluctuations. The negative q's are influenced by the time series segments
with small fluctuations, and large fluctuations influence the time series segments for positive q's. In our work, we use the standard multifractal detrended fluctuation analysis (MFDFA) algorithm [22, 26] for estimating the q-order Hurst exponents and the multifractal spectra directly from the time series:

1. The original time series x_k, k = 1, ..., N is aggregated by computing the cumulative sums $Y(n) = \sum_{k=1}^{n}(x_k - \bar{x})$, n = 1, ..., N, where $\bar{x}$ denotes the sample mean;
2. The aggregated data is divided into N_s = N/s non-overlapping segments of length s;
3. The maximum likelihood estimator of the residual variance in segment ν is

$$F^2(\nu, s) = \begin{cases} \frac{1}{s}\sum_{i=1}^{s}\{Y[(\nu-1)s + i] - y_\nu(i)\}^2, & \nu = 1, \ldots, N_s, \\ \frac{1}{s}\sum_{i=1}^{s}\{Y[N - (\nu - N_s)s + i] - y_\nu(i)\}^2, & \nu = N_s + 1, \ldots, 2N_s, \end{cases}$$

where y_ν(i) is the m-degree polynomial fitted to the aggregated observations in the segment;
4. For each segment length s and for each positive or negative value of the moment order q, the q-order fluctuation function

$$F_q(s) = \left\{\frac{1}{2N_s}\sum_{\nu=1}^{2N_s}\left[F^2(\nu, s)\right]^{q/2}\right\}^{1/q}$$

is calculated. The local fluctuations with large and small magnitudes are graded by the positive and negative q-orders, respectively;
5. A linear regression of ln F_q(s) on ln s over all s is performed, and the slope of the linear relation ln F_q(s) ∝ H_q ln s is used as an estimator of the q-order Hurst exponent H_q for each q-order fluctuation function F_q.

The fractal structures of the positive and negative log-return time series, and their deviations within time periods with large and small fluctuations, are assessed by the q-order Hurst exponents (see Fig. 6). The slopes H_q of the regression lines are q-dependent for the multifractal time series of positive (the dashed line) and negative (the bold line) log-returns (see Fig. 6). Decreasing H_q with the q-order indicates that the segments with small fluctuations have a random-walk-like structure, whereas segments with large fluctuations have a noise-like structure.
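The following Python sketch outlines steps 1–5 with NumPy; it is a compact illustration under the assumptions stated in the comments (simple non-overlapping segmentation, linear detrending by default, q = 0 handled by the logarithmic-average limit), not the authors' implementation.

```python
import numpy as np

def mfdfa_hurst(x, scales, q_values, poly_order=1):
    """Estimate q-order Hurst exponents H_q with the MFDFA steps 1-5 (sketch)."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())                       # step 1: aggregated profile Y(n)
    n = len(profile)
    hq = []
    for q in q_values:
        log_f, log_s = [], []
        for s in scales:
            n_seg = n // s                                   # step 2: number of segments N_s
            variances = []
            # step 3: forward and backward segments, polynomial detrending in each
            for start in list(range(0, n_seg * s, s)) + list(range(n - n_seg * s, n, s)):
                seg = profile[start:start + s]
                i = np.arange(s)
                trend = np.polyval(np.polyfit(i, seg, poly_order), i)
                variances.append(np.mean((seg - trend) ** 2))
            variances = np.array(variances)
            if q == 0:                                       # step 4: q-order fluctuation function
                fq = np.exp(0.5 * np.mean(np.log(variances)))
            else:
                fq = np.mean(variances ** (q / 2.0)) ** (1.0 / q)
            log_f.append(np.log(fq))
            log_s.append(np.log(s))
        hq.append(np.polyfit(log_s, log_f, 1)[0])            # step 5: slope of ln F_q(s) vs ln s
    return np.array(hq)

# White noise should yield H_q close to 0.5 for all q; persistent series give larger values.
rng = np.random.default_rng(1)
print(mfdfa_hurst(rng.standard_normal(4096), scales=[16, 32, 64, 128, 256], q_values=[-3, 0, 3]))
```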
4 Tails, Thresholds, and Extreme Events

There are two primary approaches to analyzing extreme values (the extreme deviations from the median of the probability distributions) in data:
Fig. 6 The q-order Hurst exponents Hq for the time series of positive (the dashed line) and negative (the bold line) log-returns
1. The first and more classical approach reduces the data considerably by taking maxima of long blocks of data, e.g., annual maxima. The GEVD function has theoretical justification for fitting to block maxima of data [29]. 2. The second approach is to analyze excesses over a high threshold. For this second approach the generalized Pareto distribution (GPD) function has similar justification for fitting to excesses over a high threshold [29].
4.1 Generalized Extreme Value Distributions

The GEVD is a flexible three-parameter continuous probability distribution that was developed within extreme value theory to combine the Gumbel, Fréchet, and Weibull extreme value distributions into one single distribution [30, 31]. The GEV distribution has the following pdf [32]:

$$f(x; \mu, \sigma, \xi) = \frac{1}{\sigma}\, t(x)^{\xi+1}\, e^{-t(x)},$$

where

$$t(x) = \begin{cases} \left(1 + \xi \frac{x-\mu}{\sigma}\right)^{-1/\xi}, & \xi \neq 0, \\ e^{-(x-\mu)/\sigma}, & \xi = 0, \end{cases}$$

and μ ∈ ℝ is the location parameter, σ > 0 is the scale parameter, and ξ ∈ ℝ is the shape parameter. When the shape parameter ξ is equal to 0, greater than 0, or lower than 0 [29], the GEV distribution is equivalent to the Gumbel [33], Fréchet [34], and "reversed" Weibull [35] distributions, respectively. The Gumbel distribution, also known as the Extreme Value Type I distribution, has the following pdf and cdf:
$$f(x; \mu, \beta) = \frac{1}{\beta}\, e^{-\left(\frac{x-\mu}{\beta} + e^{-\frac{x-\mu}{\beta}}\right)}, \qquad (4.1)$$

$$F(x; \mu, \beta) = e^{-e^{-\frac{x-\mu}{\beta}}}, \qquad (4.2)$$

where x ∈ ℝ, μ is the location parameter, and β > 0 is the scale parameter. In particular, when μ = 0 and β = 1, the distribution becomes the standard Gumbel distribution. Generalizations of the Gumbel distribution, which have flexible skewness and kurtosis due to the addition of one more shape parameter, are widely used for extreme value data as they fit data better [36]. The distribution in (4.1) has been employed as a model for extreme values [37, 38]. The distribution has a light right tail, which declines exponentially, since its skewness and kurtosis coefficients are constant. The Fréchet distribution, also known as the Extreme Value Type II distribution, has the following pdf and cdf, respectively:

$$f(x; \alpha, \beta) = \frac{\alpha}{\beta}\left(\frac{\beta}{x}\right)^{\alpha+1} e^{-(\beta/x)^{\alpha}}, \qquad F(x; \alpha, \beta) = e^{-(\beta/x)^{\alpha}},$$

where α > 0 is the shape parameter and β > 0 is the scale parameter. The Weibull distribution is known as the Extreme Value Type III distribution. The pdf and cdf of a Weibull random variable are, respectively,

$$f(x; \lambda, k) = \begin{cases} \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}}, & x \ge 0, \\ 0, & x < 0, \end{cases} \qquad F(x; \lambda, k) = \begin{cases} 1 - e^{-(x/\lambda)^{k}}, & x \ge 0, \\ 0, & x < 0, \end{cases}$$

where λ > 0 is the scale parameter and k > 0 is the shape parameter. Further, we show the application of the GEV model to the stock market close price using the weekly-return data calculated by

$$R(t) = \frac{(\text{maximum close price of week } t) - (\text{maximum close price of week } (t-1))}{\text{maximum close price of week } (t-1)}.$$
The results of fitting the GEV distribution to the (weekly) block maxima data are presented in Fig. 7 and Table 1, which present the quantile-quantile plot (QQ-plot), i.e., quantiles from a sample drawn from the fitted GEV pdf against the empirical data quantiles, with 95% confidence bands. The maximum likelihood estimators of the GEV distribution are the values of the three parameters (μ, σ, ξ) that maximize the
Fig. 7 (a) A general extreme value QQ-plot with maximum likelihood estimation; (b) Density plot of empirical data where a dashed curve A is based on the empirical data, and a dashed curve B is modeled. N = 2036 and bandwidth is 135.9
log-likelihood. The magnitude, along with the positive sign, of ξ indicates the fat-tailedness of the weekly-return data, which is consistent with the quantile plot. Based on the statistical analysis presented above (Fig. 7a), we see that the distribution of the weekly-return data can be described by a combination of different distributions. The density plot (Fig. 7b), having two humps, validates the idea of a mixture of distributions.
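A minimal sketch of a block-maxima GEV fit with SciPy is given below. It is an illustration rather than the authors' code: the helper names and the toy Gumbel-distributed series are assumptions, and note that scipy.stats.genextreme parameterizes the shape as c = −ξ.

```python
import numpy as np
import pandas as pd
from scipy.stats import genextreme

def weekly_block_maxima(series: pd.Series) -> pd.Series:
    """Reduce a daily series to weekly block maxima (first step of the block-maxima approach)."""
    return series.resample("W").max().dropna()

def fit_gev(block_maxima: pd.Series) -> dict:
    """Maximum likelihood fit of the GEV distribution; scipy's shape c equals -xi."""
    c, loc, scale = genextreme.fit(block_maxima.values)
    return {"xi": -c, "mu": loc, "sigma": scale}

# Toy usage on a synthetic daily series; real input would be the weekly-return data of the chapter.
idx = pd.date_range("2020-01-01", periods=500, freq="B")
toy = pd.Series(np.random.default_rng(0).gumbel(size=500), index=idx)
print(fit_gev(weekly_block_maxima(toy)))
```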
Table 1 Parameter estimates for the GEV model fitted with the maximum likelihood estimator; the 95% confidence intervals for each estimate are included

Parameter     Estimate    95% CI lower bound   95% CI upper bound
Location μ̂    606.3260    576.43               636.22
Scale σ̂       511.1713    486.42               535.92
Shape ξ̂       0.1215      0.05                 0.19
4.2 How to Choose a Threshold

The classical approach for modeling extreme events is based on the GPD. It was proved [39] that if a threshold u is chosen and X_1, X_2, ..., X_n are observations above u, then the limiting distribution of the excess over the threshold is indeed GPD. In applications, the GPD is used as a tail approximation [40] of values x − u exceeding the threshold u. The GPD is determined by scale and shape parameters σ_u > 0 and ξ, respectively, in terms of the threshold excess x − u:

$$G(x \mid u, \sigma_u, \xi) = \Pr(X < x \mid X > u) = \begin{cases} 1 - \left(1 + \xi \frac{x-u}{\sigma_u}\right)_{+}^{-1/\xi}, & \xi \neq 0, \\ 1 - \exp\left(-\frac{x-u}{\sigma_u}\right)_{+}, & \xi = 0, \end{cases} \qquad (4.3)$$

where f_+ = max(f, 0). When ξ > 0, it takes the form of the ordinary Pareto distribution. This case is the most relevant for financial time series, since they are heavy-tailed. For security returns or high-frequency foreign exchange returns, the estimates of ξ are usually less than 0.5. When ξ = 0, the GPD corresponds to the exponential distribution [41]. The GPD has several useful properties [39], such as the 'threshold stability' property: if X is GPD and u > 0, then X − u, provided X > u, is also GPD. Therefore, a Poisson process of exceedance times with generalized Pareto excesses implies the classical extreme value distributions [42]. The above suggests that the generalized Pareto distribution is a practical tool for statistical estimation of extreme values, given a sufficiently high threshold. The rest of this chapter is devoted to the question of how high the threshold should be.
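A peaks-over-threshold fit of the GPD (4.3) to the excesses can be sketched with SciPy as follows; the threshold, the toy heavy-tailed data, and the function name are illustrative assumptions.

```python
import numpy as np
from scipy.stats import genpareto

def fit_gpd_excesses(log_returns, threshold):
    """Fit a GPD to the excesses x - u over a chosen threshold u (peaks-over-threshold sketch)."""
    excesses = log_returns[log_returns > threshold] - threshold
    xi, _, sigma_u = genpareto.fit(excesses, floc=0)   # loc fixed at 0: estimate only xi and sigma_u
    return xi, sigma_u, len(excesses)

# Toy usage with the value u = 0.016 chosen later for positive returns (illustrative only).
rng = np.random.default_rng(0)
r = rng.standard_t(df=3, size=10_000) * 0.01
print(fit_gpd_excesses(r, threshold=0.016))
```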
4.2.1 Graphical Approaches to Estimate Threshold
One of the most common ways to determine a suitable threshold is to graphically inspect the data. This approach [40] requires substantial expertise, can be subjective, and is time consuming. In some cases, when dealing with several data sets, a uniform
threshold may be proposed and kept fixed, making the entire evaluation even more subjective. The most common graphical tools are the mean excess plot [43], the threshold stability plot [40], the QQ-plot [44], the Hill plot [45], return level plots [40], etc. The mean excess plot is a tool widely used in the study of risk, insurance, and extreme values. One use is in validating a generalized Pareto model for the excess distribution. The distribution of the excess over a threshold u for a random variable X with distribution function F is defined as

$$F_u(x) = \Pr(X - u \le x \mid X > u). \qquad (4.4)$$
This excess distribution is the foundation for peaks-over-threshold modeling, which fits appropriate distributions to data on excesses and is widespread, with many applications in hydrology [46, 47], actuarial science [48, 49], and survival analysis [50]. This modeling is based on the GPD, which is suitable for describing the properties of excesses. The mean excess (ME) function is one of the most common tools to determine a suitable threshold u. The ME function of a random variable X is defined as

$$M(u) = E(X - u \mid X > u), \qquad (4.5)$$
provided EX_+ < +∞; M(u) is also known as the mean residual life function. As Ghosh [43] noted, for a random variable X ∼ G_{ξ,β}, E(X) < +∞ if and only if ξ < 1, and in this case the ME function of X is linear in u:

$$M(u) = \frac{\beta}{1-\xi} + \frac{\xi}{1-\xi}\, u, \qquad (4.6)$$

where 0 ≤ u < +∞ if ξ ∈ [0, 1) and u ∈ [0, −β/ξ) if ξ < 0. The linearity of the ME function characterizes the GPD class [43]. Davison and Smith [42] developed a simple graphical tool that checks data against a GPD model. Let X_1 ≥ X_2 ≥ ... ≥ X_n be the order statistics of the data; then the ME plot depicts the points {(X_k, M̂(X_k)) | 1 < k ≤ n}, where M̂ is the empirical ME function defined as

$$\hat{M}(u) = \frac{\sum_{i=1}^{n}(X_i - u)\, I_{[X_i > u]}}{\sum_{i=1}^{n} I_{[X_i > u]}}, \quad u \ge 0. \qquad (4.7)$$

If the ME plot is close to linear for sufficiently large values of the threshold, then there is no evidence against the use of a GPD model. Another problem is to obtain a natural estimate ξ̂ of ξ. There are several methods to estimate ξ̂, such as: (1) least squares [40], (2) maximum likelihood estimation [42], (3) the Hill estimator [45], (4) the Pickands estimator [51], (5) the quantile-quantile plot (QQ-plot) [40], and (6) the moment estimator [52].
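The empirical ME function (4.7) is easy to evaluate on a grid of candidate thresholds; the sketch below (function and variable names are illustrative) computes the values plotted in a mean residual life plot such as Fig. 8.

```python
import numpy as np

def mean_excess(data, thresholds):
    """Empirical mean excess function M-hat(u) of Eq. (4.7) on a grid of thresholds."""
    data = np.asarray(data, dtype=float)
    values = []
    for u in thresholds:
        exceed = data[data > u]
        values.append(np.nan if exceed.size == 0 else np.mean(exceed - u))
    return np.array(values)

# A roughly linear stretch of the curve for large u suggests a threshold above which a GPD fits.
rng = np.random.default_rng(2)
x = np.abs(rng.standard_t(df=3, size=5000)) * 0.01
grid = np.linspace(0.0, np.quantile(x, 0.99), 50)
print(mean_excess(x, grid)[:5])
```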
Fig. 8 Mean residual life plot for the S&P 500 positive returns. The solid jagged line is the empirical MRL with approximate pointwise Wald 95% confidence intervals shown as dashed lines. The threshold u is estimated at 0.016; a vertical dashed line marks this threshold

For example, for ξ > 0 the QQ plot depicts the points

$$Q_{m,n} = \left\{\left(-\log\frac{i}{m},\; \log\frac{X_i}{X_m}\right) \;\Big|\; 1 \le i \le m\right\},$$

where m < n. In the case ξ < 0, the QQ plot is the plot of the points

$$Q_{m,n} = \left\{\left(X_i,\; G^{\leftarrow}_{\hat{\xi},1}\!\left(\frac{i}{n+1}\right)\right) \;\Big|\; 1 \le i \le n\right\},$$

where ξ̂ is an estimate of ξ based on the m upper order statistics. Recently, new graphical diagnostic tools have been introduced: a new multiple-threshold GP model with a piece-wise constant shape parameter [53]; plots measuring surprise at different threshold candidates, using results of Bayesian statistics [54, 55]; and the structure of maximum likelihood estimators has been studied to develop diagnostic plots with more direct interpretability [56]. With this choice for a threshold, u = 0.016, we get 507 exceedances, with the empirical mean residual life becoming close to linear above this choice of the threshold (Fig. 8). Similarly, we find a threshold for negative returns. In this case, all computations were repeated for the absolute values of negative returns (Fig. 9). With this choice for the threshold, u = 0.017, there are 462 exceedances. One can see that the empirical MRL becomes almost linear above u = 0.017. Ghosh and Resnick [43] noted that although graphical diagnostics are commonly accepted by practitioners, there are some problems associated with the methods mentioned above, such as: (1) an analyst needs to be convinced that ξ < 1, since for ξ ≥ 1 random sets are the limits for the normalized ME plot. Such
Fig. 9 Mean residual life plot for the S&P 500 negative returns. Solid jagged line is empirical MRL with approximate point-wise Wald 95% confidence intervals as dashed lines. The threshold u is estimated at 0.017. A vertical dashed line marks this threshold
random limits lead to wrong impressions. Certain methods described above work with ξ defined on specific intervals; (2) distributions that are not close to the GPD can mislead the mean excess diagnostics. Based on the graphical approach, the threshold for negative returns of the S&P 500 was chosen at u = −0.017 and for positive returns at u = 0.016. Distributions of exceedances over the respective thresholds are shown in Fig. 10. The QQ-plots shown in Fig. 11 demonstrate that, with a few exceptions, the exceedances follow the GPD.
4.2.2 Automatic Methods to Estimate Thresholds
As was mentioned above, the graphical approaches, as well as the rules of thumb, can be highly subjective, time consuming, and require a certain professional background. Thus, some authors have proposed automatic selection methods that can treat chunks of Big Data: a pragmatic automated, simple, and computationally inexpensive threshold selection method based on the distribution of the difference of parameter estimates when the threshold is changed [57]; it was shown that better performance is demonstrated by graphical methods and goodness-of-fit metrics that rely on pre-asymptotic properties of the GPD [58], using weighted least squares to fit linear
Fig. 10 Distribution of exceedances normalized by the thresholds u = 0.016 for positive and u = −0.017 for negative returns, respectively. Dotted lines represent Zipf's Law (∝ x^{-1}), dash-dot lines represent a Gaussian distribution (∝ exp(−x^2)), and dashed lines represent a power law (∝ x^{-3.3}). The curves are given for reference only
models in the traditional mean residual life plot; and the recently developed stopping rule ForwardStop [59], which transforms the results of ordered, sequentially tested hypotheses to control the false discovery rate [60] and provides reasonable error control [54]. Of particular interest is a method that suggests a way to determine the threshold automatically, without time-consuming and rather subjective visual approaches, based on the L-moments of the GPD, which summarize probability distributions and allow estimation of parameters and hypothesis testing [54]. Probability weighted moments, defined by Greenwood [61], are precursors of L-moments. Sample probability weighted moments, computed from data values X_1, X_2, ..., X_n arranged in increasing order, are given by

$$b_0 = \frac{1}{n}\sum_{j=1}^{n} X_j, \qquad b_r = \frac{1}{n}\sum_{j=r+1}^{n} \frac{(j-1)(j-2)\cdots(j-r)}{(n-1)(n-2)\cdots(n-r)}\, X_j.$$
L-moments are certain linear combinations of probability weighted moments that have simple interpretations as measures of location, dispersion, and shape of the data sample. The first few L-moments are defined by

$$\lambda_1 = b_0, \quad \lambda_2 = 2b_1 - b_0, \quad \lambda_3 = 6b_2 - 6b_1 + b_0, \quad \lambda_4 = 20b_3 - 30b_2 + 12b_1 - b_0$$
(the coefficients are those of the shifted Legendre polynomials).
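The sample quantities above are simple to compute; the sketch below (an assumed helper, not the authors' code) returns the first two L-moments and the L-ratios τ_3 and τ_4 used later for threshold selection.

```python
import numpy as np

def sample_l_moments(x):
    """Sample probability weighted moments b_r and L-moment summaries of a data sample."""
    x = np.sort(np.asarray(x, dtype=float))    # values arranged in increasing order
    n = len(x)
    j = np.arange(1, n + 1)

    def b(r):
        # b_r = (1/n) * sum_j [(j-1)...(j-r)] / [(n-1)...(n-r)] * X_j
        w = np.ones(n)
        for k in range(1, r + 1):
            w *= (j - k) / (n - k)
        return np.mean(w * x)

    b0, b1, b2, b3 = b(0), b(1), b(2), b(3)
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2, l3 / l2, l4 / l2            # mean, L-scale, L-skewness tau_3, L-kurtosis tau_4
```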
Fig. 11 (a) Quantile-Quantile plot with maximum likelihood estimation for the negative threshold; (b) QQ-plot with maximum likelihood estimation for the positive threshold
The first L-moment is the sample mean, a measure of location. The second L-moment is (a multiple of) Gini's mean difference statistic, a measure of the dispersion of the data values about their mean. By dividing the higher-order L-moments by the dispersion measure, we obtain the L-moment ratios, τ_r = λ_r/λ_2, r = 3, 4, .... These are dimensionless quantities, independent of the units of measurement of the data. τ_3 is a measure of skewness and τ_4 is a measure of kurtosis; these are respectively the L-skewness and the L-kurtosis. They take values between −1 and +1. For a random variable with a GPD with ξ < 1, the particular relationship between L-skewness and L-kurtosis is defined as
Fig. 12 Value of the thresholds for positive and negative log-return based on L-moments. The solid black and blue lines correspond to negative and positive log return thresholds, respectively, and based on a window of 100 trading days. The dotted black and blue lines correspond to negative and positive log return thresholds, respectively, and based on a window of 400 trading days
$$\tau_4 = \tau_3\, \frac{1 + 5\tau_3}{5 + \tau_3}.$$
Given a sample x_1, x_2, ..., x_n, the Automatic L-moment Ratio Selection Method (ALRSM) works as follows [54]; a sketch of this selection rule is given below.

1. Define the set of candidate thresholds {u_i}_{i=1}^{I} as I = 20 sample quantiles, starting at 25% by steps of 3.7%.
2. Compute the sample L-skewness and L-kurtosis of the excesses over each candidate threshold, (τ_{3,u_i}, τ_{4,u_i}), and determine d_{u_i}, the Euclidean distance

$$d_{u_i} = \sqrt{(\tau_{3,u_i} - \tau_3)^2 + (\tau_{4,u_i} - g(\tau_3))^2},$$

for i = 1, ..., I, with

$$g(\tau_3) = \tau_3\, \frac{1 + 5\tau_3}{5 + \tau_3}.$$

3. The threshold above which the behavior of the tail of the underlying distribution can be considered approximately GPD is then automatically selected as

$$u^* = \underset{u_i,\; 1 \le i \le I}{\operatorname{argmin}}\; \{d_{u_i}\},$$

that is, the level above which the corresponding L-statistics fall closest to the curve.

Using the L-moments method, we computed thresholds for the S&P 500 log-returns depending on the observation period, 100 trading days and 400 trading days. Figure 12 indicates that a threshold is not only time dependent but also depends on the size of the data set used. Once again, we cannot choose one value of the threshold that is absolutely accurate.
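A compact sketch of the ALRSM selection loop is shown below. It reuses the sample_l_moments helper from the previous sketch and, as a simplification, measures the distance to the GPD curve τ_4 = g(τ_3) vertically rather than with the full Euclidean distance; all names are illustrative.

```python
import numpy as np

def alrsm_threshold(x, n_candidates=20, start=0.25, step=0.037):
    """Automatic L-moment ratio threshold selection (sketch of steps 1-3)."""
    x = np.asarray(x, dtype=float)
    probs = start + step * np.arange(n_candidates)          # step 1: candidate quantile levels
    candidates = np.quantile(x, probs)
    best_u, best_d = None, np.inf
    for u in candidates:
        excess = x[x > u] - u
        if excess.size < 10:                                 # too few exceedances for stable L-ratios
            continue
        _, _, t3, t4 = sample_l_moments(excess)              # step 2: sample L-skewness and L-kurtosis
        d = abs(t4 - t3 * (1 + 5 * t3) / (5 + t3))           # distance to the GPD curve (vertical)
        if d < best_d:                                       # step 3: keep the closest candidate
            best_u, best_d = u, d
    return best_u
```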
4.2.3 Rules of Thumb to Choose a Threshold
As noted earlier, the threshold sequence is a function of the properties of the GPD, provided that the population is in the domain of attraction of the GPD. In case the distribution function F is known, derivation of the threshold selection is possible; however, in practice, if F is unknown, then there is no general form for the threshold sequence [40]. Practitioners often use so-called rules of thumb, many of which have little to no theoretical justification, for instance, a fixed quantile rule [62]: the upper 10% rule or its derivative, the 5% rule; or the square root rule k = √n [63]. There is a procedure that tries to find a region of stability among the estimates of the extreme value index [64]. This method depends on a tuning parameter, whose choice is further analysed in [65]. Unfortunately, no theoretical analysis exists for this approach [66]. More comprehensive reviews of the threshold selection methods can be found in [40]. Even though most of the methods mentioned above have no theoretical justification for an exact value of a threshold, we can find an approximate location for a threshold given a data set R that has M values of log-return, R = [x_1, x_2, ..., x_M]; a sketch of the resulting moving-window procedure is given after this list.

1. Let R_n be a list of n consecutive values of log-return, R_n = [x_1, x_2, ..., x_n]. Split R_n into two parts, R_n^+ = {x ∈ R_n | x ≥ 0} and R_n^- = {x ∈ R_n | x < 0}, and sort them in increasing order so that R_n^+ = [x_1^+, x_2^+, ..., x_p^+] and R_n^- = [x_1^-, x_2^-, ..., x_q^-], where p, q ∈ ℕ are such that p + q = n and n is the width of the window of observation.
2. Next, we compute the medians of R_n^+ and R_n^-, calling them x_{m+} and x_{m-}. These values will be a lower bound for a positive threshold and an upper bound for a negative threshold, respectively.
3. At this step, an upper bound for a positive threshold x_u and a lower bound for a negative threshold x_l are estimated using the rule of thumb, namely, the fixed quantile rule with the upper 10% for positive log-return values and the lower 10% for negative log-return values. With this we have R_n^+ = [x_1^+, x_2^+, ..., x_{m+}, ..., x_u, ..., x_p^+] and R_n^- = [x_1^-, x_2^-, ..., x_l, ..., x_{m-}, ..., x_q^-]. The indices u, l can be found as u = p/1.111 and l = q/10.
4. A threshold for negative log-return values ranges from x_l to x_{m-}, and a threshold for positive values lies within x_{m+} and x_u, based on the n observations in R_n. Further, we shift R_n to R_n = [x_2, x_3, ..., x_{n+1}] to estimate new values of the thresholds based on the previous n observations. The process is repeated until the entire data set R is exhausted.

We chose a window of 300 days that was moved over the entire dataset, producing a domain for threshold existence as shown in Fig. 13.
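The following sketch implements this moving-window rule of thumb with NumPy; the function name and the use of quantiles for the 10% cut-offs are assumptions made for illustration.

```python
import numpy as np

def threshold_bounds(log_returns, window=300):
    """Moving-window rule-of-thumb bounds for positive and negative thresholds (steps 1-4)."""
    r = np.asarray(log_returns, dtype=float)
    pos_bounds, neg_bounds = [], []
    for start in range(len(r) - window + 1):
        w = r[start:start + window]
        pos, neg = w[w >= 0], w[w < 0]
        # positive threshold: between the median x_m+ and the upper-10% cut-off x_u
        pos_bounds.append((np.median(pos), np.quantile(pos, 0.9)))
        # negative threshold: between the lower-10% cut-off x_l and the median x_m-
        neg_bounds.append((np.quantile(neg, 0.1), np.median(neg)))
    return np.array(pos_bounds), np.array(neg_bounds)
```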
Fig. 13 Possible threshold changing ranges from 03/11/1981 to 12/31/2018 based on 300 preceding trading days. A green strip represents positive log-return and an orange strip shows the threshold domain for negative log-return values
Fig. 14 Possible threshold changing ranges from 05/18/1982 to 12/31/2018 based on 600 preceding trading days. A green strip represents positive log-returns and an orange strip shows the threshold domain for negative log-return values
It is clear from Fig. 13 that certain values of the thresholds cannot be sustained over the entire period and must be updated from time to time. Similarly to the window of 300 trading days, Fig. 14 demonstrates how the ranges of the thresholds change as we move a window of 600 trading days across the available data. Once again, some values of the thresholds can persist for almost the entire period, while others can exist for a few months and must then be replaced with an updated value. Based on Figs. 13 and 14, we can see that the choice of the threshold depends on the size of the data set used. Moreover, the rules of thumb, like the graphical approaches,
require practitioners' involvement in studying the data before making a final choice of the thresholds. Equipped with these results, we can compute a validity period τ(u) for a threshold u. Let n be the window of observation. By moving the window over the data set containing M values of log-returns, we obtain four lists with M − n + 1 elements each,

$$X_{m+} = [x_{m+,n}, x_{m+,n+1}, \ldots, x_{m+,M}], \qquad X_u = [x_{u,n}, x_{u,n+1}, \ldots, x_{u,M}],$$
$$X_{m-} = [x_{m-,n}, x_{m-,n+1}, \ldots, x_{m-,M}], \qquad X_l = [x_{l,n}, x_{l,n+1}, \ldots, x_{l,M}],$$

containing the medians of the positive and negative log-returns, and the upper 10% and lower 10% cut-offs of the positive and negative log-returns, respectively, as computed previously. The following procedure, sketched in code below, is then applied.

1. First, we compute k threshold candidates u_i for positive log-returns as {u_i^+ ∈ ℝ | u_i^+ = min(X_{m+}) + (i/k)(max(X_u) − min(X_{m+})), i = 1, ..., k}. In a similar fashion, we compute a set of threshold candidates for negative log-returns, {u_i^- ∈ ℝ | u_i^- = min(X_l) + (i/k)(max(X_{m-}) − min(X_l)), i = 1, ..., k}.
2. For each threshold candidate u_i found in the previous step, we compute its validity duration, i.e., the number of days N_{u_i} the candidate falls within the admissible ranges. Set N_{u_i} = 0. For j = n, ..., M: if x_{m+,j} < u_i < x_{u,j}, then N_{u_i} = N_{u_i} + 1 for positive threshold candidates; similarly, if x_{l,j} < u_i < x_{m-,j}, then N_{u_i} = N_{u_i} + 1 for negative threshold candidates.
3. The probability that a threshold candidate will be changed on any given day is η(u_i) = 1 − N_{u_i}/(M − n + 1).
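A minimal sketch of this probability computation for the positive thresholds follows; it assumes the pos_bounds array produced by the threshold_bounds sketch above, and the negative case is analogous.

```python
import numpy as np

def threshold_change_probability(pos_bounds, n_candidates=100):
    """Probability eta(u_i) that a positive-threshold candidate is changed on any given day."""
    lower, upper = pos_bounds[:, 0], pos_bounds[:, 1]     # per-window medians and upper-10% cut-offs
    candidates = np.linspace(lower.min(), upper.max(), n_candidates)
    eta = []
    for u in candidates:
        valid_days = np.sum((lower < u) & (u < upper))    # N_{u_i}: days the candidate stays admissible
        eta.append(1.0 - valid_days / len(lower))
    return candidates, np.array(eta)
```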
4.3 Statistics of Extreme Events Under Threshold Uncertainty

The statistics of extreme events under threshold uncertainty can be described with the help of a simple model, in which the log-return of the index and the value of the threshold are treated as random variables that may change inconsistently. The model that we adopt and modify here was first put forward by us in [67] to describe the behavior of systems close to a threshold of instability, and it was used later in [68, 69] to model survival under uncertainty and the events of mass extinction. In the model of extreme events under threshold uncertainty,
Fig. 15 The probability that a threshold of the log-return values will be changed on any given day calculated over the different data windows ranging from 25 to 6000 trading days
the current value of the log-return is quantified by a random number x ∈ [0, 1] drawn according to some probability distribution function Pr{x < z} = F(z). The threshold value, which might change at any time once new data become available, is another random number y ∈ [0, 1], drawn from another probability distribution function, Pr{y < z} = G(z). We assume that the rate of daily variations of the log-return values is greater than or equal to that of the threshold values, ultimately determining whether the current log-return value is extreme or not. In fact, it is the relative rate of random updates of x and y, described in our model by the probability of inconsistency η ≥ 0, that actually determines the statistics of extreme events. At time t = 0, the value of the log-return x is chosen with respect to the probability distribution function F, and the value of the threshold y is chosen with respect to the probability distribution function G. If y ≥ x, the event is regular and the process keeps going to time t = 1. At time t ≥ 1, either, with probability η ≥ 0, the value of the log-return x is drawn anew from the probability distribution function F, but the threshold keeps the value y it had at time t − 1; or, with probability 1 − η, the value of the log-return x is updated anew from the probability distribution function F, and
the threshold value y is updated with respect to the probability distribution function G. As long as the value of the threshold is not exceeded (x ≤ y), the event is classified as regular, but the event is extreme whenever x > y. The value of the probability η > 0 can be interpreted as the reciprocal of the characteristic time interval during which the threshold level remains unchanged; conversely, the probability that a threshold of the log-return values will be changed on any given day can be calculated as the inverse of the time interval during which the threshold value stays put. The probability that a threshold of the log-return values will be changed on any given day, calculated over different data windows ranging from 25 to 6000 trading days, is shown in Fig. 15. We are interested in the probability distribution P_η(t) of the interval duration t between sequential extreme events for some probability distribution functions F and G and a given value of the probability η ≥ 0. A straightforward computation [67, 68] shows that, independently of the value of η, the initial probability of choosing the log-return value below the threshold level (to start the process) is ∫_0^1 dG(z) F(z). The general formulas for P_η(t) can be found in [67, 68]. When η = 0, the log-return and the threshold value are updated coherently, and the resulting probability function,

$$P_{\eta=0}(t) = \left(\int_0^1 dG(z)\, F(z)\right)^{t} \int_0^1 dG(z)\,\left(1 - F(z)\right), \qquad (4.8)$$

decays exponentially fast with t, for any choice of the probability distribution functions F and G. In particular, if the threshold level and the log-returns are drawn uniformly at random, dF(z) = dG(z) = dz, over the interval [0, 1], the occurrence of an extreme event is statistically equivalent to simply flipping a fair coin, for which heads and tails come up equiprobably,

$$P_{\eta=0}(t) = \frac{1}{2^{\,t+1}}. \qquad (4.9)$$
On the contrary, when the threshold level is kept unchanged, η = 1, the statistics of intervals between the sequential extreme events,
$$P_{\eta=1}(t) = \int_0^1 dG(z)\, F(z)^{t}\, \left(1 - F(z)\right), \qquad (4.10)$$

decays asymptotically algebraically for t ≫ 1 [67, 68]. For example, in the special case of uniformly random updates of the threshold and log-return values, the probability function (4.10) decays algebraically as

$$P_{\eta=1}(t) = \frac{1}{(t+1)(t+2)} \simeq \frac{1}{t^{2}}. \qquad (4.11)$$
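To illustrate the model, a small Monte Carlo sketch is given below for uniform F and G, as in the special cases above. It assumes that the process restarts with a fresh threshold after each extreme event; the function name and parameters are illustrative.

```python
import numpy as np
from collections import Counter

def interval_statistics(eta, n_events=20_000, seed=3):
    """Monte Carlo estimate of P_eta(t): regular days between sequential extreme events."""
    rng = np.random.default_rng(seed)
    counts = Counter()
    for _ in range(n_events):
        y = rng.random()                      # threshold y ~ G (uniform here)
        t = 0
        while True:
            x = rng.random()                  # log-return x ~ F (uniform here), redrawn every day
            if x > y:                         # extreme event ends the current interval
                counts[t] += 1
                break
            t += 1
            if rng.random() > eta:            # with probability 1 - eta the threshold is also redrawn
                y = rng.random()
    total = sum(counts.values())
    return {t: c / total for t, c in sorted(counts.items())}

# eta = 0 should reproduce the geometric law (4.9); eta = 1 the algebraic decay (4.11).
print([round(p, 3) for _, p in sorted(interval_statistics(0.0).items())[:4]])
print([round(p, 3) for _, p in sorted(interval_statistics(1.0).items())[:4]])
```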
Consider a general family of invariant measures of a map of the interval [0, 1] with a fixed neutral point, defined by the probability distributions F and G, absolutely continuous with respect to the Lebesgue measure, i.e.,

$$dF(z) = (1+\alpha)\, z^{\alpha}\, dz, \quad \alpha > -1, \qquad dG(z) = (1+\beta)\, (1-z)^{\beta}\, dz, \quad \beta > -1. \qquad (4.12)$$

For this family, Equation (4.10) gives a probability function that exhibits a power law asymptotic decay for t ≫ 1 [67–69]:

$$P_{\eta=1}(t) \simeq \frac{(1+\beta)\, \Gamma(2+\beta)\, (1+\alpha)^{-1-\beta}}{t^{2+\beta}}\left(1 + O\!\left(\frac{1}{t}\right)\right). \qquad (4.13)$$

The asymptotic decay of (4.13) is algebraic,

$$P_{\eta=1}(\tau) \simeq \frac{1}{\tau^{2+\beta}}, \qquad (4.14)$$

for β > −1, for any choice of the distributions F and G, although it is mainly the character of the probability function G that determines the rate of decay of P_{η=1}(t) with time. In the limiting case when the support of the probability distribution G(x), determining the choice of the threshold level, is concentrated close to x = 1, i.e., is zero everywhere in the interval [0, 1] except for a small interval of length ε up to 1, the Zipf power law asymptote ∝ t^{−1−ε}, ε > 0, follows directly from (4.13) [67–69]. A possible modeling function for such a probability distribution, forming a thin spike as x → 1, can be chosen in the form

$$G_{\varepsilon}(x) = 1 - (1-x)^{\varepsilon}, \quad \varepsilon > 0, \qquad (4.15)$$

with the probability density on the interval [0, 1),

$$dG_{\varepsilon}(x) = \frac{\varepsilon\, dx}{(1-x)^{1-\varepsilon}}. \qquad (4.16)$$

The straight line shown in Fig. 16 represents the hyperbolic decay of the time intervals between extreme events.
5 Defining Emergency Scales by Threshold Uncertainty

According to the triality of the nature of extreme events (Fig. 1), once we assume statistics for the log-returns, or once ranges for the threshold values are defined using the different methods shown in Sect. 4.1, we introduce uncertainty into the threshold value, which
Fig. 16 The statistics of time intervals (in days) between sequential extreme events for the fixed threshold values u = 0.016 and u = −0.017, for positive and negative fluctuations of the log-return, respectively. The solid line ∝ t^{-2}, corresponding to the asymptotic quadratic decay (4.11), is given for reference
depends on the method itself and the amount of data we use. We measure this uncertainty to justify an emergency scale representing the extreme events. Using the rule of thumb (Sect. 4.2.3), we determined the ranges of the threshold values depending on the window of observation (Fig. 13) for both positive and negative log-return values. For each admissible positive choice of a threshold u, we determine the probability η(u) that the threshold u can be changed on any given day (Fig. 15); the degrees of uncertainty have been assessed by means of the Shannon entropy (5.1) for each threshold value and window of observation [69] (a short computational sketch is given at the end of this section). The case with negative values of the threshold is analogous.

$$H(\eta) = \begin{cases} -\eta \log \eta - (1-\eta)\log(1-\eta), & 0 < \eta < 1, \\ 0, & \eta = 0, 1, \end{cases} \qquad (5.1)$$

where η(u) is the probability that the threshold will be changed on any given day (Fig. 15). We observed the Red Queen State, and three emergency scales can be readily interpreted.
1. If the amount of data is not sufficient (the solid line corresponding to the 18-day window of observation in Fig. 17), then the uncertainty curve forms a skewed profile attaining a single maximum of 0.69295 for the threshold value 0.008272. In this situation, an observer's perception of the events is reminiscent of the Red Queen from "Through the Looking-Glass and What Alice Found There" by Lewis Carroll [17], who said "When you say hill, I could show you hills, in comparison with
Fig. 17 Degree of uncertainty of different values of the thresholds based on the amount of trading days taken into account. Three shaded regions mark three scales of emergency: I—subcritical, II—critical, III—extreme and the solid curve represents the Red Queen State
which you'd call that a valley". Our understanding of whether the events are extreme or not is very limited, and the uncertainty is blurry. As events become more severe, our uncertainty that the events are extreme decreases. The observer realizes that the events are extreme, but a precise point at which the events become severe cannot be determined. This case is called the Red Queen State.
2. As the window of observation becomes larger, 25 days, for instance, the uncertainty curve exhibits two maxima, indicating that the amount of data is sufficient. This happens because H(η) attains its maximum for η = 1/2 and the η curve admits the value 1/2 twice (Fig. 15). As we further extend the window, the curve splits into sharp peaks (Fig. 17). These peaks clearly separate the threshold values into three regions: three levels of emergency. The locations of the peaks of the curves are summarized in Table 2. The location of the two peaks is not sensitive to the window of observation.
Emergency scales for the S&P 500:
(I) Subcritical. In region I, the threshold values are too small to raise concern about extreme events. Then we can observe a spike with the degree of
Table 2 Location of the extrema of the uncertainty curves from Fig. 17

Window       First maximum              Second maximum
             Threshold   Uncertainty    Threshold   Uncertainty
18 days      0.00827     0.69296        –           –
50 days      0.00461     0.67411        0.01532     0.69290
2000 days    0.00534     0.69315        0.01672     0.69034
4000 days    0.00533     0.68217        0.01580     0.69314
uncertainty attaining its first maximum. This extremum indicates a transition to the next kind of uncertainty. For the window of 2000 days, region I lies in the interval [0, 0.00534) (see Fig. 17, Table 2).
(II) Critical. In region II, the interval [0.00534, 0.01672) for the 2000-day window, uncertainty is conceptualized by the question of whether the magnitude of an event is already critical or extreme, or not yet. Further, we see another jump of uncertainty, reaching 0.69034. At this point we are certain that events are not regular anymore.
(III) Extreme. In the case of the window of 2000 days, the interval [0.01672, ∞) constitutes region III. We consider all events in this region extreme, with our uncertainty decreasing as the threshold values increase.
With the analysis presented above, we define an emergency scale of three levels based on the three regions of the threshold values separated by the peaks of the uncertainty curve. This emergency scale is not sensitive to the size of the window, provided a sufficient amount of data is considered.
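A minimal sketch of the uncertainty computation (5.1) is given below; it assumes the candidates and eta arrays produced by the threshold_change_probability sketch above, and the peaks of the resulting curve delimit the three emergency regions.

```python
import numpy as np

def uncertainty(eta):
    """Shannon entropy H(eta) of Eq. (5.1): the degree of uncertainty of a threshold choice."""
    eta = np.asarray(eta, dtype=float)
    h = np.zeros_like(eta)
    inside = (eta > 0) & (eta < 1)
    p = eta[inside]
    h[inside] = -p * np.log(p) - (1 - p) * np.log(1 - p)
    return h

# Evaluating uncertainty(eta) over the candidate thresholds reproduces curves like those in Fig. 17.
```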
6 Conclusion

The S&P 500 time series in the period from January 2, 1980, till December 31, 2018, exhibits an asymmetric skewness of the distribution, with right and left power law tails. Multifractal detrended fluctuation analysis of the log-return time series for the S&P 500 index reveals a scale-invariant structure for the fluctuations of both small and large magnitudes, as well as short- and long-range dependence on different time scales. Moreover, the segments with small fluctuations have a random-walk-like structure, whereas segments with large fluctuations have a noise-like structure. We have reviewed different methods of threshold selection and studied the extreme events present in the time series using different statistical approaches. We found that the distribution of the weekly-return data can be described by a combination of different distributions. Based on a graphical approach for threshold selection, we chose separate thresholds for the positive and negative values of the log-return, 0.016 and −0.017, respectively. With this choice, we registered 507 instances of extreme events corresponding to rises of the market and 462 extreme events related
to market declines. With a few exceptions, the exceedances over the threshold (or under it, for negative log-return values) follow the GPD. The rule of thumb showed that a threshold value depends on the width of the observation window, and the threshold can change at any moment, once new data become available. The uncertainty of the threshold values can be quantified by the probability of changing the threshold on any given day. The moment we make an assumption about the statistics of the distributions, or fix the dataset, we introduce uncertainty into the threshold value, which can be resolved by emergency scales that are robust to variation in the size of the dataset. We suggested a statistical model that describes the registration frequency of extreme events under threshold uncertainty. Our model fits the statistics of occurrence of extreme values in the S&P 500 time series well.

Acknowledgments SV acknowledges the support of the Department of Mathematics and Statistics of Texas Tech University. We are grateful to Dr. M. Toda for her support. Work was done under contract #W911W6-13-2-0004 with the AVX Aircraft Company.
References 1. S. Kaplan, Tolley’s Handbook of Disaster and Emergency Management: Principles and Best Practices; Disaster and Emergency Management Systems (Lexis Nexis, London, 2004) 2. Richter Scale/Mercalli Scale, https://www.usgs.gov/media/images/modified-mercalliintensity-mmi-scale-assigns-intensities 3. Beaufort Wind Scale, https://www.spc.noaa.gov/faq/tornado/beaufort.html 4. Saffir-Simpson Hurricane Wind Scale, https://www.nhc.noaa.gov/aboutsshws.php 5. Fujita Tornado Damage Scale, https://www.spc.noaa.gov/faq/tornado/f-scale.html 6. Homeland Security Advisory System, https://en.wikipedia.org/wiki/Homeland_Security_ Advisory_System 7. U.S. Climate Extremes Index (CEI), https://www.ncdc.noaa.gov/extremes/cei/introduction 8. E. Rohn, D. Blackmore, A unified localizable emergency events scale. Int. J. Inf. Syst. Crisis Response Manag. 1(4), 1–14 (2009) 9. A.N. Pisarchik, R. Jaimes-Reátegui, R. Sevilla-Escoboza, G. Huerta-Cuellar, M. Taki, Rogue waves in a multistable system. Phys. Rev. Lett. 107, 274101 (2011) 10. A.N. Pisarchik, V.V. Grubov, V.A. Maksimenko et al., Extreme events in epileptic EEG of rodents after ischemic stroke. Eur. Phys. J. Spec. Top. 227, 921–932 (2018) 11. L. Plotnick, E. Gomez, C. White, M. Turoff, Furthering development of a unified emergency scale using Thurstone’s Law of comparative judgment: a progress report ABSTRACT (2007). https://www.dhs.gov/xlibrary/assets/hsas_unified_scale_feedback.pdf 12. Richter Magnitude Scale, https://en.wikipedia.org/wiki/Richter_magnitude_scale 13. 2010 Haiti Earthquake, https://en.wikipedia.org/wiki/2010_Haiti_earthquake 14. 2019 Ridgecrest Earthquakes, https://en.wikipedia.org/wiki/2019_Ridgecrest_earthquakes 15. Trading Halt Definition, James Chen - https://www.investopedia.com/terms/t/tradinghalt.asp 16. J. Lee, Y. Fany, S.A. Sisson, Bayesian threshold selection for extremal models using measures of surprise (2014). arXiv:1311.2994v2 [stat.ME] 17. L. Carroll, Through the Looking-Glass and What Alice Found There (W.B. Conkey Co., Chicago, 1900) 18. R. Hudson, A. Gregoriou, Calculating and comparing security returns is harder than you think: a comparison between logarithmic and simple returns. Int. Rev. Financ. Anal. 38, 151–162 (2015)
Probability Entanglement and Destructive Interference in Biased Coin Tossing Dimitri Volchenkov
AMS (MSC 2010) 49J21, 26A33, 11C08
A weaker man might be moved to re-examine his faith, if in nothing else at least in the law of probability. Tom Stoppard, “Rosencrantz and Guildenstern Are Dead”, Act 1.
1 Introduction

The vanishing probability of winning in a long enough sequence of coin flips features in the opening scene of Tom Stoppard's play "Rosencrantz and Guildenstern Are Dead", where the protagonists are betting on coin flips. Rosencrantz, who bets on heads each time, has won ninety-two flips in a row, leading Guildenstern to suggest that they are within the range of supernatural forces. And he was actually right, as the king had already sent for them [1]. Coin-tossing experiments are ubiquitous in courses on elementary probability theory, since coin tossing is regarded as a prototypical random phenomenon with an unpredictable outcome. Although the tossing of a real coin obeying the physical laws is inherently a deterministic process, with an outcome that, formally speaking, might be determined if the initial state of the system is known [2], the discussion of whether the outcome of naturally tossed coins is truly random [3], or whether it can be manipulated [4, 5], has been around perhaps for as long as coins have existed. All in all, the toss of a coin has been a method used to determine random outcomes for centuries [5], and individuals who are told by the coin toss to make an important change are reported to be much more likely to make a change and to be happier 6 months
D. Volchenkov () Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX, USA e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 D. Volchenkov, J. A. Tenreiro Machado (eds.), Mathematical Methods in Modern Complexity Science, Nonlinear Systems and Complexity 33, https://doi.org/10.1007/978-3-030-79412-5_7
later than those who were told by the coin to maintain the status quo in their lives [6]. The practice of flipping a coin has long been ubiquitous for making decisions under uncertainty, as a chance outcome is often interpreted as the expression of divine will [1]. If the coin is not fair, the outcome of future flipping can be either (1) anticipated intuitively by observing the whole sequence of sides shown in the past, in search of possible patterns and repetitions, or (2) guessed instantly from the side that has just shown up. In our brain, the stored routines and patterns making up our experience are managed by the basal ganglia, while the insula, highly sensitive to any change, takes care of our present awareness and might drive the guess on the coin toss outcome [7]. Trusting our gut, we unconsciously look for patterns in sequences of shown sides, a priori perceiving any coin as unfair. In the present paper, we perform an information-theoretic study of the most general model for tossing a biased coin in fractional time. We demonstrate that this stochastic model is singular (along with many other well-known stochastic models), and therefore its parameters cannot be inferred from assessing the frequencies of shown sides (see Sect. 2). We show that some uncertainty about the coin flipping outcome can be resolved from the sequence of sides already shown, so that the actual level of uncertainty can be lower than assessed by the entropy function, which can therefore be decomposed into predictable and unpredictable information components (Sect. 3). Interestingly, the efficacy of the side forecasting strategies (1) and (2) can be quantified by distinct information-theoretic quantities: the excess entropy and the conditional mutual information, respectively (Sect. 3). We use the backward-shift operator of Hosking for making up the fractional analog of the biased coin tossing model. The fractional time difference operator is defined by a convergent infinite series of binomial type, resulting in a Markov chain for the biased coin tossing in fractional time (Sect. 4). The side repeating probabilities, considered independent of each other in integer time, appear to be entangled with one another in fractional time: in fact, time fractionation defines a smooth dynamical system in the space of model parameters, and the degree of entanglement varies non-linearly over (fractional) time (Sect. 4). Finally, we study the evolution of the predictable and unpredictable information components of entropy over fractional time (Sect. 5). We conclude in the last section.
2 The Model of a Biased Coin

A biased coin prefers one side over another. If this preference is stationary, and the coin tosses are independent of each other, we describe coin flipping by a Markov chain defined by the stochastic transition matrix, viz.,

T(p,q) = \begin{pmatrix} p & 1-p \\ 1-q & q \end{pmatrix},    (1)
in which the states, 'heads' ("0") or 'tails' ("1"), repeat themselves with the probabilities 0 ≤ p ≤ 1 and 0 ≤ q ≤ 1, respectively. The Markov chain Eq. 1 generates the stationary sequences of states, viz., 0, 0, 0, · · · when p = 1, or 1, 1, 1, · · · when q = 1, or 0, 1, 0, 1, · · · when q = p = 0, but describes flipping a fair coin if q = p = 1/2. For a symmetric chain, q = p, the relative frequencies (or densities) of 'head' and 'tail',

\pi_1(p,q) = \frac{1-q}{2-p-q} \quad and \quad \pi_2(p,q) = \frac{1-p}{2-p-q},    (2)

are equal to each other, and therefore the entropy function, expressing the amount of uncertainty about the coin flip outcome, viz.,

H(p,q) = -\sum_{k=1}^{2} \pi_k(p,q)\, \log_2 \pi_k(p,q),    (3)
attains the maximum value, H(p,p) = 1 bit, uniformly for all 0 < p < 1. On the contrary, flipping the coin when p = 1 (or q = 1) generates the stationary sequences of no uncertainty, H(p,q) = 0 (see Fig. 1). In Eq. 3 and throughout the paper, we use the following conventions, reasonable by a limit argument: 0 \cdot \log 0 = \log 0^0 = \log 1 = 0. The information difference between the amounts of uncertainty on a smooth statistical manifold parameterized by the probabilities p and q is calculated using the Fisher information matrix (FIM) [8–10], viz.,

g_{p,q} = \sum_{k=1}^{2} \pi_k(p,q) \left( \frac{\partial}{\partial p} \log_2 \pi_k(p,q) \right) \left( \frac{\partial}{\partial q} \log_2 \pi_k(p,q) \right).    (4)
However, since H(p,p) = 1 bit for 0 < p = q < 1, the FIM,

g = \frac{1}{(\ln 2)^2 (2-p-q)^2} \begin{pmatrix} \frac{1-q}{1-p} & -1 \\ -1 & \frac{1-p}{1-q} \end{pmatrix},    (5)
is degenerate, and therefore the biased coin model Eq. 1 is singular [11], along with many other stochastic models, such as Bayesian networks, neural networks, hidden Markov models, stochastic context-free grammars, and Boltzmann machines. The singularity of the FIM (4) implies that the parameters of the model, p and q, cannot be inferred from assessing relative frequencies of sides in sequences generated by the Markov chain Eq. 1.
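A minimal numerical sketch of Eqs. 2–5 (in Python, with a finite-difference gradient standing in for the analytical derivatives, and arbitrary parameter values) illustrates the degeneracy:

```python
import numpy as np

def stationary(p, q):
    # densities of 'heads' and 'tails', Eq. 2
    return np.array([(1 - q) / (2 - p - q), (1 - p) / (2 - p - q)])

def entropy(p, q):
    # entropy of the stationary distribution, Eq. 3 (in bits)
    pi = stationary(p, q)
    return -np.sum(pi * np.log2(pi))

def fisher_information(p, q, h=1e-6):
    # finite-difference version of the FIM, Eq. 4
    def grad_log(k):
        dp = (np.log2(stationary(p + h, q)[k]) - np.log2(stationary(p - h, q)[k])) / (2 * h)
        dq = (np.log2(stationary(p, q + h)[k]) - np.log2(stationary(p, q - h)[k])) / (2 * h)
        return np.array([dp, dq])
    pi = stationary(p, q)
    return sum(pi[k] * np.outer(grad_log(k), grad_log(k)) for k in range(2))

print(entropy(0.3, 0.6))                             # below 1 bit for an asymmetric chain
print(entropy(0.7, 0.7))                             # exactly 1 bit for any symmetric chain
print(np.linalg.det(fisher_information(0.3, 0.6)))   # numerically zero: the model is singular
```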
Fig. 1 The value of entropy Eq. 3 attains maximum (of 1 bit) for the symmetric chain, q = p, but is zero for the stationary sequences, p = 1, or q = 1
3 Predictable and Unpredictable Information in the Model of Tossing a Biased Coin

Some amount of uncertainty in the model Eq. 1 can be dispelled before tossing a coin. Namely, we can consider the entropy function Eq. 3 as a sum of the predictable and unpredictable information components,

H(p,q) = P(p,q) + U(p,q),    (6)

where the predictable part P(p,q) estimates the amount of apparent uncertainty about the future flipping outcome that might be resolved from the sequence of sides already shown, and U(p,q) estimates the amount of true uncertainty that cannot be inferred from the past and present sides anyway. It is reasonable to assume that both functions, P and U, in Eq. 6 should have the same form as the entropy function Eq. 3:

P = -\sum_{k=1}^{2} \pi_k \log_2 \varphi_k, \qquad U = -\sum_{k=1}^{2} \pi_k \log_2 \psi_k, \qquad \varphi_k \psi_k = \pi_k.    (7)
Furthermore, since the more frequent the side, the higher the forecast accuracy, the partition function φ_k, which captures the predictive potential of already shown sequences for forecasting the side k, is obviously proportional to the relative frequency of that side, φ_k ∝ π_k. Denoting the relevant proportionality coefficient by σ_k in φ_k = π_k σ_k,
we obtain ψ_k = σ_k^{-1} = π_k/φ_k. Given the already shown sequence of coin sides \overleftarrow{S}_t = S_{t-1}, S_{t-2}, S_{t-3}, \ldots, the average amount of uncertainty about flipping a coin is assessed by the entropy rate [10] of the Markov chain Eq. 1, viz.,

H\big(S_t \,\big|\, \overleftarrow{S}_t\big) = H(S_t | S_{t-1}) = -\sum_{k=1}^{2} \pi_k \sum_{r=1}^{2} T_{kr} \log_2 T_{kr},
\quad where \quad
T = \begin{pmatrix} p & 1-p \\ 1-q & q \end{pmatrix},    (8)

and therefore the excess entropy [11–13], quantifying the apparent uncertainty of the flipping outcome that can be resolved by discovering the repetition, rhythm, and patterns over the whole (infinite) sequence of sides shown in the past, \overleftarrow{S}_t, equals

E(p,q) \equiv H(p,q) - H(S_t|S_{t-1}) = -\sum_{k=1}^{2} \pi_k \left( \log_2 \pi_k - \sum_{r=1}^{2} T_{kr} \log_2 T_{kr} \right).    (9)

The excess entropy E(p,q) attains the maximum value of 1 bit over the stationary sequences but equals zero for q = 1 − p (Fig. 2a). Moreover, the next flipping outcome can be guessed from the present state alone, and the level of accuracy of such a guess can be assessed by the mutual information between the present state and the future state conditioned on the past [11, 14], viz.,

G(p,q) \equiv H(S_{t+1}|S_{t-1}) - H(S_t|S_{t-1}) = \sum_{k=1}^{2} \pi_k \sum_{r=1}^{2} \Big( T_{kr} \log_2 T_{kr} - T^2_{kr} \log_2 T^2_{kr} \Big),    (10)

where T^2_{kr} denote the entries of the two-step transition matrix T^2.
Fig. 2 (a) E(p,q), the apparent uncertainty of the flipping outcome that can be resolved by discovering possible patterns and repetitions in the infinite sequence of shown sides; (b) G(p,q), the mutual information between the present state and the future state conditioned on the past, measuring the efficacy of forecasting the coin toss outcome from the present state alone
Fig. 3 (a) The entropy H (p, q) (transparent) and predictable information P (p, q) (hue colored) in the model of a biased coin for the different values of p and q; (b) The entropy H (p, q) (transparent) and unpredictable information U (p, q) (hue colored) for the different values of p and q
The mutual information (10) is a component of the entropy rate (9), growing as p, q → 0 and p, q → 1 until the rise of destructive interference between the incompatible hypotheses on alternating the presently shown side at the next tossing (if p, q > 0) or on repeating it (when p, q < 1) causes the attenuation and cancellation of the mutual information (10) when q = 1 − p (Fig. 2b). By summing (9) and (10), we obtain the amount of predictable information:

P(p,q) = E(p,q) + G(p,q), \qquad U(p,q) = H(p,q) - P(p,q).    (11)
The predictable information component P (p, q) amounts to H (p, q) over the stationary sequences but disappears for q = 1 − p (Fig. 3a). On the contrary, the share of unpredictable information U (p, q) attains the maximum value U (p, 1 − p) = H (p, 1 − p), for q = 1 − p (Fig. 3b).
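As a quick numerical cross-check of the decomposition (6)–(11), the following short Python sketch evaluates H, the entropy rate (8), E, G, P and U for a given pair (p, q); the parameter values in the example are arbitrary.

```python
import numpy as np

def xlog2x(x):
    # x * log2(x) with the convention 0 * log2(0) = 0
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    mask = x > 0
    out[mask] = x[mask] * np.log2(x[mask])
    return out

def information_components(p, q):
    """Entropy H (3), excess entropy E (9), mutual information G (10),
    and the predictable/unpredictable split P, U of Eq. (11)."""
    T = np.array([[p, 1 - p], [1 - q, q]])
    T2 = T @ T
    pi = np.array([(1 - q) / (2 - p - q), (1 - p) / (2 - p - q)])

    H = -np.sum(xlog2x(pi))
    entropy_rate = -np.sum(pi[:, None] * xlog2x(T))        # Eq. (8)
    E = H - entropy_rate                                   # Eq. (9)
    G = np.sum(pi[:, None] * (xlog2x(T) - xlog2x(T2)))     # Eq. (10)
    P = E + G                                              # Eq. (11)
    U = H - P
    return H, E, G, P, U

print(information_components(0.9, 0.9))   # strongly repeating coin: mostly predictable
print(information_components(0.5, 0.5))   # fair coin: P = 0 and U = H = 1 bit
```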
4 The Fractional Time Model of Flipping a Biased Coin: Probabilities Entanglement Over Fractional Time

In our work, we define the fractional time model of flipping a biased coin using the fractional differencing of non-integer order [15, 16] for discrete time stochastic processes [17–19]. The Grünwald–Letnikov fractional difference \Delta^{\alpha}_{\tau} \equiv (1 - T_{\tau})^{\alpha} of order α, with the unit step τ and the time lag operator T_τ, is defined [20–23] by
\Delta^{\alpha}_{\tau}\, x(t) \equiv (1 - T_{\tau})^{\alpha} x(t) = \sum_{m=0}^{\infty} (-1)^m \binom{\alpha}{m}\, x(t - m\tau),    (12)

where T_{\tau} x(t) = x(t - \tau) is the fixed τ-delay, and \binom{\alpha}{m} is the binomial coefficient that can be written for integer or non-integer order α using the Gamma function, viz.,

\binom{\alpha}{m} \equiv (-1)^{m-1} \frac{\alpha\, \Gamma(m - \alpha)}{\Gamma(1 - \alpha)\, \Gamma(m + 1)}.    (13)
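Numerically, the weights (-1)^m \binom{\alpha}{m} of the series (12) are most conveniently generated by the standard recursion w_0 = 1, w_m = w_{m-1}(m - 1 - \alpha)/m, which is equivalent to Eq. 13; a minimal Python sketch (the test signal is arbitrary) reads:

```python
import numpy as np

def gl_weights(alpha, n):
    # weights w_m = (-1)^m * C(alpha, m) of Eq. (12), via the standard recursion
    w = np.empty(n)
    w[0] = 1.0
    for m in range(1, n):
        w[m] = w[m - 1] * (m - 1 - alpha) / m
    return w

def gl_difference(x, alpha):
    # truncated fractional difference (1 - T)^alpha at the last time point (unit step tau = 1)
    w = gl_weights(alpha, len(x))
    return np.dot(w, x[::-1])     # pairs x(t), x(t-1), ... with w_0, w_1, ...

x = np.sin(0.3 * np.arange(200))
print(gl_difference(x, 1.0))      # ordinary first difference x[-1] - x[-2]
print(x[-1] - x[-2])              # same value
print(gl_difference(x, 0.5))      # half-order difference: a long-memory weighted sum
```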
It should be noted that for a Markov chain defined by Eq. 1, the Grünwald–Letnikov fractional difference of a non-integer order 1 − ε takes the form of the following infinite series of binomial type, viz.,

(1 - T)^{1-\varepsilon} = \sum_{k=0}^{\infty} \frac{\Gamma(k - 1 + \varepsilon)}{\Gamma(k + 1)\, \Gamma(-1 + \varepsilon)}\, T^k = 1 + \sum_{k=1}^{\infty} \frac{\Gamma(k - 1 + \varepsilon)}{\Gamma(k + 1)\, \Gamma(-1 + \varepsilon)}\, T^k \equiv 1 - T_{1-\varepsilon},    (14)
that converges absolutely for 0 < ε < 1. In Eq. 14, we have used a formal structural similarity between the fractional order difference operator and the power series of binomial type in order to introduce a fractional backward-shift transition operator T_{1−ε}, for any fractional order 0 < 1 − ε < 1, as a convergent infinite power series of the transition matrix Eq. 1, viz.,

T_{1-\varepsilon}(p,q) \equiv -\sum_{k=1}^{\infty} \frac{\Gamma(k-1+\varepsilon)}{\Gamma(k+1)\,\Gamma(-1+\varepsilon)}\, T^k(p,q)
= \begin{pmatrix} 1 - \frac{1-p}{(2-p-q)^{\varepsilon}} & \frac{1-p}{(2-p-q)^{\varepsilon}} \\ \frac{1-q}{(2-p-q)^{\varepsilon}} & 1 - \frac{1-q}{(2-p-q)^{\varepsilon}} \end{pmatrix}
\equiv \begin{pmatrix} p_{\varepsilon} & 1 - p_{\varepsilon} \\ 1 - q_{\varepsilon} & q_{\varepsilon} \end{pmatrix}.    (15)

The backward-shift fractional transition matrix defined by Eq. 15 is a stochastic matrix preserving the structure of the initial Markov chain Eq. 1, for any 0 < 1 − ε < 1. Since the power series of binomial type in Eq. 15 is convergent and summable for any 0 < 1 − ε < 1, we have also introduced the fractional time probabilities, p_ε and q_ε, as the elements of the fractional time transition matrix. The backward-shift fractional time transition operator Eq. 15 describes a dynamical model of flipping a biased coin over backward fractional times, 0 < ε < 1. Namely, the fractional time transition probabilities in Eq. 15 equal those at integer times as ε → 0,

\lim_{\varepsilon \to 0} p_{\varepsilon} = p, \qquad \lim_{\varepsilon \to 0} q_{\varepsilon} = q,    (16)

but equal the densities Eq. 2 of the 'head' and 'tail' states as ε → 1,

\lim_{\varepsilon \to 1} p_{\varepsilon} = \frac{1-q}{2-p-q} = \pi_1, \qquad and \qquad \lim_{\varepsilon \to 1} q_{\varepsilon} = \frac{1-p}{2-p-q} = \pi_2.    (17)
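The closed form in Eq. 15 is easy to probe numerically; the short Python sketch below (with arbitrary parameter values) checks the limits (16)–(17) and the invariance of the stationary densities under the fractional-time flow.

```python
import numpy as np

def T_frac(p, q, eps):
    # closed-form fractional-time transition matrix T_{1-eps}(p, q), Eq. (15)
    s = (2.0 - p - q) ** eps
    p_eps = 1.0 - (1.0 - p) / s
    q_eps = 1.0 - (1.0 - q) / s
    return np.array([[p_eps, 1.0 - p_eps], [1.0 - q_eps, q_eps]])

p, q = 0.8, 0.3
pi = np.array([(1 - q) / (2 - p - q), (1 - p) / (2 - p - q)])

print(T_frac(p, q, 1e-9))       # eps -> 0: recovers T(p, q), Eq. (16)
print(T_frac(p, q, 1.0))        # eps -> 1: each row becomes (pi_1, pi_2), Eq. (17)
print(pi @ T_frac(p, q, 0.5))   # pi stays stationary at every fractional time
print(pi)
```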
Fig. 4 (a) The (pε , qε ) —flow of the model Eq. 15 over fractional time 1−ε; (b) The time varying entanglement Eq. 18 between the probabilities pε and qε attains the maximum value at ε = 0.855
Thus, the value ε = 0 in the model Eq. 15 (on the top face of the cube shown in Fig. 4a) may be attributed to the moment of time when the future side of the coin is yet to be revealed, and ε = 1 (on the bottom face of the cube in Fig. 4a) corresponds to the present moment of time, when information about all (infinitely many) already shown sides of the coin is available and the density of states of the Markov chain Eq. 1 is known. The backward-shift fractional transition matrix T_{1−ε} defined in Eq. 15 encodes the transition at a fractional time preceding an integer moment as an infinite series of integer-time transitions, as is usual for accounting for long-memory effects in the autoregressive fractionally integrated moving average models used for describing economic processes [15, 16]. The transformation Eq. 15 defines the (p_ε, q_ε)-flow over fractional time 1 − ε shown in Fig. 4a. Since the vector of 'head' and 'tail' densities Eq. 2 is an eigenvector for all powers T^k, it is also an eigenvector for the fractional transition operator T_{1−ε}(p,q), for every intermediate time 1 − ε. Therefore, the fractional time dynamics does not change the distribution of the states in the Markov chain, and the entropy function Eq. 3 is an invariant of the fractional time dynamics in the model Eq. 15 (Fig. 4a). In fractional time, 0 < ε ≤ 1, the state repetition probabilities p_ε and q_ε get entangled with one another due to the normalization factor (2 − p − q)^{−ε} in the transition probabilities Eq. 15. Although the integer time probabilities p and q are independent of each other at ε = 0, they appear to be linearly dependent, π_1 = p_1 = 1 − q_1 = 1 − π_2, at ε = 1. The degree of entanglement over fractional time ε can be assessed by the expected divergence between the transition probabilities in the models Eqs. 1 and 15,
Ent(\varepsilon) = \int_0^1\!\!\int_0^1 dp\, dq\, \left[ \pi_1 \log_2 \frac{p}{p_{\varepsilon}} + \pi_2 \log_2 \frac{q}{q_{\varepsilon}} \right] = 2 \int_0^1\!\!\int_0^1 dp\, dq\, \pi_1 \log_2 \frac{p}{p_{\varepsilon}} = 2 \int_0^1\!\!\int_0^1 dp\, dq\, \pi_2 \log_2 \frac{q}{q_{\varepsilon}}.    (18)
The integrand in Eq. 18 turns to zero when the probabilities are independent of one another (at ε = 0) but equals the doubled (due to the obvious p ↔ q symmetry of the expressions) Kullback–Leibler divergence (relative entropy) [10] between p and π_1 (q and π_2) at ε = 1. The time-varying entanglement function defined by Eq. 18 attains the maximum value at ε = 0.855 (Fig. 4b).
5 Evolution of Predictable and Unpredictable Information Over Fractional Time

The predictable and unpredictable information components defined by Eqs. 9, 10, and 11 can be calculated for the fractional time transition matrix Eq. 15, for any fractional time 0 < ε ≤ 1. The evolution of the predictable and unpredictable information components over fractional time is the main contribution of our work. In the present section, without loss of generality, we discuss the symmetric chains, q = p.

For a symmetric chain, the densities of both states are equal, π = (1/2, 1/2), so that H(p,p) ≡ H(p) = −log_2(1/2) = 1 bit, uniformly for all 0 < p < 1 (Fig. 5a). The excess entropy Eq. 9, quantifying the predictable information encoded in the historical sequence of shown sides, for a symmetric chain reads as follows [24]:

E(p,p) \equiv E(p) = 1 - H(S_t|S_{t-1}) = 1 + p \log_2 p + (1 - p) \log_2 (1 - p).    (19)

Forecasting the future state through discovering patterns in sequences of shown sides, Eq. 19, loses any predictive power when the coin is fair, p = 1/2, but E(p) = 1 bit when the series is stationary (i.e., p = 0 or p = 1). The mutual information Eq. 10, measuring the reliability of the guess about the future state provided the present state is known [24],

G(p) = p \log_2 p + (1 - p) \log_2 (1 - p) - 2p(1-p) \log_2 \big( 2p(1-p) \big) - \big( p^2 + (1-p)^2 \big) \log_2 \big( p^2 + (1-p)^2 \big),    (20)

increases as p → 0 (p → 1), attaining its maximum at p ≈ 0.121 (p ≈ 0.879). The effect of destructive interference between two incompatible hypotheses, alternating the current state (p → 0) and repeating the current state (p → 1), culminates in the fading of this information component when the coin is fair, p = 1/2 (Fig. 5a). The difference between the entropy rate H(S_t|S_{t−1}) and the mutual information G(p) may be viewed as the degree of fairness of the coin; it attains its maximum (U(p) = 1 bit) for the fair coin, p = 1/2 (see Fig. 5a).
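For the symmetric chain the formulas (19)–(20) are one-liners; a small Python sketch (the grid and its resolution are arbitrary choices) evaluates them and locates the maximum of G reported above.

```python
import numpy as np

def xlog2x(v):
    v = np.asarray(v, dtype=float)
    out = np.zeros_like(v)
    m = v > 0
    out[m] = v[m] * np.log2(v[m])
    return out

def E_sym(p):
    # excess entropy of the symmetric chain (q = p), Eq. (19)
    return 1.0 + xlog2x(p) + xlog2x(1 - p)

def G_sym(p):
    # mutual information component of the symmetric chain, Eq. (20)
    r = p ** 2 + (1 - p) ** 2      # diagonal entry of T^2
    s = 2 * p * (1 - p)            # off-diagonal entry of T^2
    return xlog2x(p) + xlog2x(1 - p) - xlog2x(s) - xlog2x(r)

p = np.linspace(0.001, 0.5, 50000)
print(E_sym(0.5), G_sym(0.5))      # both vanish for the fair coin
print(p[np.argmax(G_sym(p))])      # maximum of G near p = 0.121 (mirrored at p = 0.879)
```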
Fig. 5 (a) The decomposition of entropy into information components for a symmetric unfair coin, q = p, at integer time. The symmetric coin is fair when p = 1/2 : the amount of uncertainty of the fair coin tossing cannot be reduced anyway, as the amount of unpredictable information equals U (1/2) = H (p) = 1 bit; (b) Evolution of the information components over fractional time. The decomposition of entropy at integer time shown in Fig. 5a. corresponds to the outer face of the Fig. 5b. (ε = 0)
The entropy decomposition presented in Fig. 5a at integer time (ε = 0) evolves over fractional time, 0 < ε ≤ 1, as shown in Fig. 5b: the decomposition of entropy at integer time (Fig. 5a) corresponds to the outer face of Fig. 5b (ε = 0). When p = 1, the sequence of coin sides shown at integer times is stationary, so that there is no uncertainty about the tossing outcome. However, the amount of uncertainty for p = 1 grows to 1 bit in fractional time as ε → 1. When ε = 1, the repetition probability of coin sides equals its relative frequency, p_1 = π_1 = 1/2, and therefore the uncertainty about the future state of the chain cannot be reduced anyway, H(1/2) = U(1/2) = 1 bit. Interestingly, there is some gain of the predictable information component G(p) for p = 1 as ε → 1 (see Fig. 5b). The information component G(p) quantifies the goodness of the guess of the flipping outcome from the present state of the chain, so that the gain observed in Fig. 5b might be interpreted as the reduction of uncertainty in a stationary sequence due to the choice of the present state, "0" or "1". Despite the dramatic demise of unpredictable information over fractional time as ε → 0, the fair coin (p = 1/2) always stays fair.
6 Discussion and Conclusion A simple Markov chain, generating a binary sequence, in which two states repeat with the given probabilities in integer times, provides us with an analytically computable and telling example for studying conditional information quantities
and their fractional time dynamics in detail. Although the model probabilities cannot be inferred from counting the relative frequencies of shown coin sides, a quantifiable amount of uncertainty about the coin tossing outcome can be resolved by featuring patterns in the generated state sequences and by guessing the coin side to appear from the one shown. The destructive interference between the mutually incompatible hypotheses about the forthcoming state of the chain results in damping of predictable information for a completely unpredictable, fair coin. The fractional time dynamics is defined by a convergent infinite series of binomial type over the Markov chain. The state probabilities, assumed to be the independent parameters of the model in integer time, get nonlinearly entangled with one another over fractional time. The amount of predictable information grows smoothly over fractional time toward the maximum value attained at integer time; however, a fair coin always stays fair.

Acknowledgments D.V. is grateful to the Texas Tech University for the administrative and technical support.
References 1. D. Volchenkov, Survival under Uncertainty. An Introduction to Probability Models of Social Structure and Evolution. Understanding Complex Systems (Springer, Berlin, 2016) 2. R.C. Stefan, T.O. Cheche, Coin toss modeling. Romanian Reports in Physics (2016). https:// arxiv.org/abs/1612.06705 3. J.B. Keller, The probability of heads. Amer. Math. Month. 93, 191–197 (1986) 4. P. Diaconis, S. Holmes, R. Montgomery, Dynamical bias in the coin toss. SIAM Rev. 49, 211 (2007) 5. M.P. Clark, B.D. Westerberg, Holiday review. How random is the toss of a coin? CMAJ 181(12), E306-8 (2009). https://doi.org/10.1503/cmaj.091733 6. S.D. Levitt, Heads or tails: the impact of a coin toss on major life decisions and subsequent happiness. NBER Working Paper No. 22487, JEL No. D12, D8 (2016) 7. F. Fabritius, H.W. Hagemann, The Leading Brain: Neuroscience Hacks to Work Smarter, Better, Happier (Penguin, New York, 2017) 8. R.A. Fisher, Theory of statistical estimation. Proc. Cambridge Philos. Soc. 22(5), 700 (1925) 9. S. Amari, Differential-Geometrical Methods in Statistics. Lecture Notes in Statistics (Springer, Berlin, 1985) 10. T.M. Cover, J.A. Thomas, Elements of Information Theory (Wiley, New York, 1991) 11. S. Watanabe, L. Accardi, W. Freudenberg, M. Ohya, (Eds.), Algebraic Geometrical Method in Singular Statistical Estimation. Quantum Bio-Informatics (World Scientific, Singapore, 2008), pp. 325–336 12. R.G. James, C.J. Ellison, J.P. Crutchfield, Anatomy of a bit: information in a time series observation. Chaos 21, 037109 (2011) 13. N.F. Travers, J.P. Crutchfield, Infinite excess entropy processes with countable-state generators. Entropy 16, 1396–1413 (2014) 14. S. Marzen, J.P. Crutchfield, Information anatomy of stochastic equilibria. Entropy 16, 4713–4748 (2014) 15. C.W.J. Granger, R. Joyeux, An introduction to long memory time series models and fractional differencing. J. Time Ser. Anal. 1, 15–39 (1980) 16. J.R.M. Hosking, Fractional differencing. Biometrika 68(1), 165–176 (1981)
17. E. Ghysels, N.R. Swanson, M.W. Watson, Essays in econometrics collected papers of clive W. J. Granger, in Causality, Integration and Cointegration, and Long Memory, vol. II (Cambridge University Press, Cambridge, 2001), p. 398 18. L.A. Gil-Alana, J. Hualde, Fractional integration and cointegration: an overview and an empirical application, in Palgrave Handbook of Econometrics. ed. by T.C. Mills, K. Patterson. Applied Econometrics, vol. 2 (Springer, Berlin, 2009), pp. 434–469 19. V. Tarasov, V. Tarasova, Long and short memory in economics: fractional-order difference and differentiation. IRA-Int. J. Manag. Soc. Sci. 5(2), 327–334 (2016). ISSN 2455–2267 20. S.G. Samko, A.A. Kilbas, O.I. Marichev, Fractional Integrals and Derivatives Theory and Applications (Gordon and Breach, New York, 1993), p. 1006 21. I. Podlubny, Fractional Differential Equations (Academic Press, San Diego, 1998), p. 340 22. A.A. Kilbas, H.M. Srivastava, J.J. Trujillo, Theory and Applications of Fractional Differential Equations (Elsevier, Amsterdam, 2006), p. 540 23. V.E. Tarasov, Fractional Dynamics: Applications of Fractional Calculus to Dynamics of Particles, Fields and Media (Springer, New York, 2010). https://doi.org/10.1007/978-3-64214003-7 24. D. Volchenkov, Grammar of Complexity: From Mathematics to a Sustainable World. Nonlinear Physical Science (World Scientific, Singapore, 2018)
On the Solvability of Some Systems of Integro-Differential Equations with Drift Messoud Efendiev and Vitali Vougalter
AMS (MSC 2010) 49J21, 26A33, 11C08
1 Introduction

Let us recall that a linear operator L acting from a Banach space E into another Banach space F satisfies the Fredholm property if its image is closed and the dimension of its kernel and the codimension of its image are finite. As a consequence, the equation Lu = f is solvable if and only if φ_i(f) = 0 for a finite number of functionals φ_i from the dual space F^*. These properties of Fredholm operators are widely used in many methods of linear and nonlinear analysis. Elliptic problems in bounded domains with a sufficiently smooth boundary satisfy the Fredholm property if the ellipticity condition, proper ellipticity and Lopatinskii conditions are fulfilled (see e.g. [1, 7, 11, 13]). This is the main result of the theory of linear elliptic equations. In the case of unbounded domains, these conditions may not be sufficient and the Fredholm property may not be satisfied. For instance, the Laplace operator, Lu = Δu, in R^d does not satisfy the Fredholm property when considered in Hölder spaces, L : C^{2+α}(R^d) → C^α(R^d), or in Sobolev spaces, L : H^2(R^d) → L^2(R^d). Linear elliptic equations in unbounded domains satisfy the Fredholm property if and only if, in addition to the conditions stated above, limiting operators are invertible (see [14]). In some trivial cases, limiting operators can be explicitly constructed. For instance, if
M. Efendiev () Helmholtz Zentrum München, Institut für Computational Biology, Neuherberg, Germany e-mail: [email protected] Department of Mathematics, Marmara University, Istanbul, Turkey
V. Vougalter Department of Mathematics, University of Toronto, Toronto, ON, Canada e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 D. Volchenkov, J. A. Tenreiro Machado (eds.), Mathematical Methods in Modern Complexity Science, Nonlinear Systems and Complexity 33, https://doi.org/10.1007/978-3-030-79412-5_8
Lu = a(x)u'' + b(x)u' + c(x)u, \qquad x \in R,

where the coefficients of the operator have limits at infinity,

a_{\pm} = \lim_{x \to \pm\infty} a(x), \qquad b_{\pm} = \lim_{x \to \pm\infty} b(x), \qquad c_{\pm} = \lim_{x \to \pm\infty} c(x),

the limiting operators are given by

L_{\pm} u = a_{\pm} u'' + b_{\pm} u' + c_{\pm} u.

Since the coefficients are constants, the essential spectrum of the operator, that is, the set of complex numbers λ for which the operator L − λ fails to satisfy the Fredholm property, can be found explicitly by virtue of the Fourier transform:

\lambda_{\pm}(\xi) = -a_{\pm} \xi^2 + b_{\pm} i \xi + c_{\pm}, \qquad \xi \in R.

Invertibility of the limiting operators is equivalent to the condition that the essential spectrum does not contain the origin. In the case of general elliptic problems, the same assertions hold true. The Fredholm property is satisfied if the essential spectrum does not contain the origin or if the limiting operators are invertible. However, these conditions may not be written explicitly. In the case of non-Fredholm operators the usual solvability conditions may not be applicable and solvability conditions are, in general, unknown. There are some classes of operators for which solvability relations are derived. Let us illustrate them with the following example. Consider the problem

Lu \equiv \Delta u + a u = f    (1.1)

in R^d, where a is a positive constant. Such an operator L coincides with its limiting operators. The homogeneous equation has a nonzero bounded solution. Hence the Fredholm property is not satisfied. However, since the operator has constant coefficients, we can apply the Fourier transform and find the solution explicitly. Solvability conditions can be formulated as follows. If f ∈ L^2(R^d) and xf ∈ L^1(R^d), then there exists a solution of this equation in H^2(R^d) if and only if

\left( f(x), \frac{e^{ipx}}{(2\pi)^{d/2}} \right)_{L^2(R^d)} = 0, \qquad p \in S^d_{\sqrt{a}} \quad a.e.

(see [17]). Here and further down S^d_r stands for the sphere in R^d of radius r centered at the origin. Hence, though the operator does not satisfy the Fredholm property, solvability relations are formulated similarly. However, this similarity is only formal, because the range of the operator is not closed.
In the case of the operator with a potential, Lu ≡ Δu + a(x)u = f, the Fourier transform is not directly applicable. Nevertheless, solvability relations in R^3 can be derived by a rather sophisticated application of the theory of self-adjoint operators (see [15]). As before, solvability conditions are formulated in terms of orthogonality to solutions of the homogeneous adjoint equation. There are several other examples of linear elliptic non-Fredholm operators for which solvability conditions can be derived (see [14–17, 22]). Solvability relations play a significant role in the analysis of nonlinear elliptic problems. In the case of non-Fredholm operators, in spite of some progress in the understanding of linear equations, there exist only few examples where nonlinear non-Fredholm operators are analyzed (see [6, 17, 18, 20]). In the present work we consider another class of stationary nonlinear systems of equations, for which the Fredholm property may not be satisfied:

\frac{d^2 u_k}{dx^2} + b_k \frac{d u_k}{dx} + a_k u_k + \int_{\Omega} G_k(x - y)\, F_k(u_1(y), u_2(y), \ldots, u_N(y), y)\, dy = 0,    (1.2)

with a_k ≥ 0, b_k ∈ R, b_k ≠ 0, 1 ≤ k ≤ N, N ≥ 2 and x ∈ Ω. Here and further down the vector function

u := (u_1, u_2, \ldots, u_N)^T ∈ R^N.    (1.3)
For the simplicity of presentation we restrict ourselves to the one dimensional case (the multidimensional case will be treated in our forthcoming paper). Thus, Ω is a domain on the real line. Work [8] deals with a single equation analogous to system (1.2) above. In population dynamics the integro-differential equations describe models with intra-specific competition and nonlocal consumption of resources (see e.g. [2, 4]). The studies of systems of integro-differential equations are of interest to us in the context of complicated biological systems, where u_k(x,t), k = 1, \ldots, N denote the cell densities for various groups of cells in the organism. We use the explicit form of the solvability conditions and study the existence of solutions of such nonlinear systems. We would like especially to emphasize that the solutions of the integro-differential equations with the drift term are relevant to the understanding of the emergence and propagation of patterns in the theory of speciation (see [21]). The solvability of the linear equation involving the Laplace operator with the drift term was treated in [16]; see also [3]. In the case of vanishing drift terms, namely when b_k = 0, 1 ≤ k ≤ N, the system analogous to (1.2) was treated in [20] and [18]. Weak solutions of the Dirichlet and Neumann problems with drift were considered in [12].
2 Formulation of the Results

Our technical assumptions are analogous to those of [8], adapted to the work with vector functions. It is also more complicated to work in the Sobolev spaces for vector functions, especially in the problem on the finite interval with periodic boundary conditions, when the constraints are imposed on some of the components. The nonlinear parts of system (1.2) will satisfy the following regularity conditions.

Assumption 1 Let 1 ≤ k ≤ N. The functions F_k(u, x): R^N × Ω → R satisfy the Caratheodory condition (see [10]), such that

\sqrt{\sum_{k=1}^{N} F_k^2(u, x)} \leq K |u|_{R^N} + h(x) \quad for \quad u \in R^N, \; x \in \Omega,    (2.1)

with a constant K > 0 and h(x): Ω → R^+, h(x) ∈ L^2(Ω). Moreover, they are Lipschitz continuous functions, such that for any u^{(1),(2)} ∈ R^N, x ∈ Ω:

\sqrt{\sum_{k=1}^{N} \big( F_k(u^{(1)}, x) - F_k(u^{(2)}, x) \big)^2} \leq L\, |u^{(1)} - u^{(2)}|_{R^N},    (2.2)
with a constant L > 0. Here and below the norm of a vector function given by (1.3) is

|u|_{R^N} := \sqrt{\sum_{k=1}^{N} u_k^2}.
Note that the solvability of a local elliptic problem in a bounded domain in R^N was studied in [5], where the nonlinear function was allowed to have a sublinear growth. For the purpose of the study of the existence of solutions of (1.2), we introduce the auxiliary system of equations, with 1 ≤ k ≤ N,

-\frac{d^2 u_k}{dx^2} - b_k \frac{d u_k}{dx} - a_k u_k = \int_{\Omega} G_k(x - y)\, F_k(v_1(y), v_2(y), \ldots, v_N(y), y)\, dy.    (2.3)

We denote (f_1(x), f_2(x))_{L^2(\Omega)} := \int_{\Omega} f_1(x) \bar{f}_2(x)\, dx, with a slight abuse of notation when these functions are not square integrable, like for example those involved in orthogonality condition (5.5) below. In the first part of the article we consider the case of the whole real line, Ω = R, such that the appropriate Sobolev space is equipped with the norm
\|\phi\|^2_{H^2(R)} := \|\phi\|^2_{L^2(R)} + \left\| \frac{d^2 \phi}{dx^2} \right\|^2_{L^2(R)}.    (2.4)
For a vector function given by (1.3), we have

\|u\|^2_{H^2(R, R^N)} := \sum_{k=1}^{N} \|u_k\|^2_{H^2(R)} = \sum_{k=1}^{N} \left\{ \|u_k\|^2_{L^2(R)} + \left\| \frac{d^2 u_k}{dx^2} \right\|^2_{L^2(R)} \right\}.    (2.5)
Let us also use the norm

\|u\|^2_{L^2(R, R^N)} := \sum_{k=1}^{N} \|u_k\|^2_{L^2(R)}.
Due to Assumption 1 above, we are not allowed to consider nonlinearities of powers higher than the first, which is restrictive from the point of view of applications. But this guarantees that our nonlinear vector function is a bounded and continuous map from L^2(Ω, R^N) to L^2(Ω, R^N). The main issue for the problem above is that in the absence of the drift terms we were dealing with the self-adjoint, non-Fredholm operators

-\frac{d^2}{dx^2} - a_k : \; H^2(R) \to L^2(R), \qquad a_k \geq 0,

which was the obstacle to solving our system. Similar situations, but in linear problems, both self-adjoint and non-self-adjoint, involving non-Fredholm differential operators, have been treated extensively in recent years (see [14–17, 22]). However, the situation is different when the constants in the drift terms b_k ≠ 0. For 1 ≤ k ≤ N, the operators

L_{a,b,k} := -\frac{d^2}{dx^2} - b_k \frac{d}{dx} - a_k : \; H^2(R) \to L^2(R)    (2.6)

with a_k ≥ 0 and b_k ∈ R, b_k ≠ 0 involved in the left side of system (2.3) are non-selfadjoint. By virtue of the standard Fourier transform, it can be easily verified that the essential spectra of the operators L_{a,b,k} are given by

\lambda_{a,b,k}(p) = p^2 - a_k - i b_k p, \qquad p \in R.

Evidently, when a_k > 0 the operators L_{a,b,k} are Fredholm, since their essential spectra stay away from the origin. But for a_k = 0 our operators L_{a,b,k} do not satisfy the Fredholm property since the origin belongs to their essential spectra. We manage to prove that under reasonable technical assumptions system (2.3) defines a map T_{a,b}: H^2(R, R^N) → H^2(R, R^N), which is a strict contraction.
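The Fredholm dichotomy above is easy to visualize numerically: the following small Python sketch (the parameter values and the sampling grid are arbitrary choices) samples the essential spectrum curve λ_{a,b,k}(p) = p^2 − a_k − i b_k p and measures its distance from the origin.

```python
import numpy as np

def essential_spectrum_distance(a, b, n_grid=200001, p_max=50.0):
    # distance from the origin to the curve lambda(p) = p^2 - a - i*b*p, p real
    p = np.linspace(-p_max, p_max, n_grid)
    lam = p ** 2 - a - 1j * b * p
    return np.min(np.abs(lam))

print(essential_spectrum_distance(a=1.0, b=2.0))   # positive: the operator is Fredholm
print(essential_spectrum_distance(a=0.0, b=2.0))   # zero (at p = 0): the Fredholm property fails
```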
Theorem 1 Let Ω = R, N ≥ 2, 1 ≤ l ≤ N − 1, 1 ≤ k ≤ N, b_k ∈ R, b_k ≠ 0 and G_k(x): R → R, G_k(x) ∈ L^1(R), and let Assumption 1 hold. (I) Let a_k > 0 for 1 ≤ k ≤ l. (II) Let a_k = 0 for l + 1 ≤ k ≤ N; in addition, xG_k(x) ∈ L^1(R), orthogonality relations (5.5) hold and 2\sqrt{\pi}\, N_{a,b}\, L < 1 with N_{a,b} defined by (5.4) below. Then the map v → T_{a,b}v = u on H^2(R, R^N) defined by system (2.3) has a unique fixed point v^{(a,b)}, which is the only solution of the system of equations (1.2) in H^2(R, R^N). The fixed point v^{(a,b)} is nontrivial provided that for some 1 ≤ k ≤ N the intersection of supports of the Fourier transforms, supp \widehat{F_k}(0, x) ∩ supp \widehat{G_k}, is a set of nonzero Lebesgue measure in R.
(m)
d 2 uk du + bk k +ak u(m) k + dx dx 2
∞
−∞
(m) (m) Gk,m (x − y)Fk (u(m) 1 (y), u2 (y), . . . , uN (y), y)dy = 0.
(2.7) Each sequence of kernels {Gk,m (x)}∞ m=1 converges to Gk (x) as m → ∞ in the appropriate function spaces discussed further down. We will prove that, under the certain technical assumptions, each of systems (2.7) admits a unique solution u(m) (x) ∈ H 2 (R, RN ), limiting system (1.2) has a unique solution u(x) ∈ H 2 (R, RN ), and u(m) (x) → u(x) in H 2 (R, RN ) as m → ∞, which is the socalled existence of solutions in the sense of sequences. In this case, the solvability conditions can be formulated for the iterated kernels Gk,m . They imply the convergence of the kernels in terms of the Fourier transforms (see the Appendix) and, as a consequence, the convergence of the solutions (Theorems 2, 4). Similar ideas in the sense of standard Schrödinger type operators were exploited in [19]. Our second main statement is as follows. Theorem 2 Let = R, N ≥ 2, 1 ≤ l ≤ N − 1, 1 ≤ k ≤ N, bk ∈ R, bk = 0, m ∈ N, Gk,m (x) : R → R, Gk,m (x) ∈ L1 (R) are such that Gk,m (x) → Gk (x) in L1 (R) as m → ∞. Let Assumption 1 hold. (I) Let ak > 0 for 1 ≤ k ≤ l. (II) Let ak = 0 for l + 1 ≤ k ≤ N . Assume that xGk,m (x) ∈ L1 (R), xGk,m (x) → xGk (x) in L1 (R) as m → ∞, orthogonality relations (5.11) hold along with upper bound (5.12). Then each system (2.7) admits a unique solution u(m) (x) ∈ H 2 (R, RN ), and limiting problem (1.2) possesses a unique solution u(x) ∈ H 2 (R, RN ), such that u(m) (x) → u(x) in H 2 (R, RN ) as m → ∞.
On the Solvability of Some Systems of Integro-Differential Equations with Drift
147
The unique solution u(m) (x) of each system (2.7) is nontrivial provided that for some 1 ≤ k ≤ N the intersection of supports of the Fourier transforms of functions suppF k (0, x) ∩ supp G k,m is a set of nonzero Lebesgue measure in R. Analogously, the unique solution u(x) of limiting system (1.2) does not vanish identically if 0k is a set of nonzero Lebesgue measure in R for a certain suppF k (0, x) ∩ supp G 1 ≤ k ≤ N. In the second part of the work we consider the analogous system on the finite interval with periodic boundary conditions, i.e. = I := [0, 2π ] and the appropriate functional space is H 2 (I ) = {v(x) : I → R | v(x), v (x) ∈ L2 (I ), v(0) = v(2π ), v (0) = v (2π )}, aiming at uk (x) ∈ H 2 (I ), 1 ≤ k ≤ l. For the technical purposes, we introduce the following auxiliary constrained subspace H02 (I ) = {v(x) ∈ H 2 (I ) | (v(x), 1)L2 (I ) = 0}.
(2.8)
The aim is to have uk (x) ∈ H02 (I ), l + 1 ≤ k ≤ N . The constrained subspace (2.8) is a Hilbert space as well (see e.g. Chapter 2.1 of [9]). The resulting space used for proving the existence in the sense of sequences of solutions u(x) : I → RN of system (1.2) will be the direct sum of the spaces given above, namely 2 Hc2 (I, RN ) := ⊕lk=1 H 2 (I ) ⊕N k=l+1 H0 (I ).
The corresponding Sobolev norm is given by u2H 2 (I,RN ) := c
N
{uk 2L2 (I ) + uk 2L2 (I ) },
k=1
where u(x) : I → RN . Another useful norm is given by u2L2 (I,RN )
=
N
uk 2L2 (I ) .
k=1
Let us prove that system (2.3) in this situation defines a map τa,b : Hc2 (I, RN ) → Hc2 (I, RN ), which will be a strict contraction under the given technical assumptions. Theorem 3 Let = I, N ≥ 2, 1 ≤ l ≤ N − 1, 1 ≤ k ≤ N, bk ∈ R, bk = 0 and Gk (x) : I → R, Gk (x) ∈ L∞ (I ), Gk (0) = Gk (2π ), Fk (u, 0) = Fk (u, 2π ) for u ∈ RN and Assumption 1 holds.
148
M. Efendiev and V. Vougalter
(I) Let ak > 0 for 1 ≤ k ≤ l. (II) Let √ ak = 0 for l + 1 ≤ k ≤ N, orthogonality conditions (5.27) hold and 2 πNa, b L < 1, where Na, b is defined by (5.26). Then the map τa,b v = u on Hc2 (I, RN ) defined by system (2.3) possesses a unique fixed point v (a,b) , the only solution of the system of equations (1.2) in Hc2 (I, RN ). The fixed point v (a,b) is nontrivial provided that the Fourier coefficients Gk,n Fk (0, x)n = 0 for a certain 1 ≤ k ≤ N and for some n ∈ Z. Remark 1 We use the constrained subspace H02 (I ) involved in the direct sum of 2
d d 2 2 spaces Hc2 (I, RN ), such that the Fredholm operators − dx 2 −bk dx : H0 (I )→L (I ), have the trivial kernels.
To study the existence in the sense of sequences of solutions for our integrodifferential system on the interval I , we consider the sequence of iterated systems of equations, analogously to the whole real line case with m ∈ N, 1 ≤ k ≤ N, ak ≥ 0, bk ∈ R, bk = 0, such that (m)
(m)
d 2 uk du + bk k dx dx 2
(m)
+ ak uk +
0
2π
(m)
(m)
(m)
Gk,m (x − y)Fk (u1 (y), u2 (y), . . . , uN (y), y)dy = 0.
(2.9) Our final main proposition is as follows. Theorem 4 Let = I, N ≥ 2, 1 ≤ l ≤ N − 1, 1 ≤ k ≤ N, bk ∈ R, bk = 0, m ∈ N, Gk,m (x) : I → R, Gk,m (0) = Gk,m (2π ), Gk,m (x) ∈ L∞ (I ) are such that Gk,m (x) → Gk (x) in L∞ (I ) as m → ∞, Fk (u, 0) = Fk (u, 2π ) for u ∈ RN . Let Assumption 1 hold. (I) Let ak > 0 for 1 ≤ k ≤ l. (II) Let ak = 0 for l + 1 ≤ k ≤ N . Assume that orthogonality conditions (5.33) hold along with estimate from above (5.34) Then each system (2.9) has a unique solution u(m) (x) ∈ Hc2 (I, RN ) and the limiting system of equations (1.2) admits a unique solution u(x) ∈ Hc2 (I, RN ), such that u(m) (x) → u(x) in Hc2 (I, RN ) as m → ∞. The unique solution u(m) (x) of each system (2.9) is nontrivial provided that the Fourier coefficients Gk,m,n Fk (0, x)n = 0 for a certain 1 ≤ k ≤ N and some n ∈ Z. Analogously, the unique solution u(x) of the limiting system of equations (1.2) does not vanish identically if Gk,n Fk (0, x)n = 0 for some 1 ≤ k ≤ N and a certain n ∈ Z. Remark 2 Note that in the article we work with real valued vector functions by virtue of the assumptions on Fk (u, x), Gk,m (x) and Gk (x) involved in the nonlocal terms of the iterated and limiting systems of equations discussed above.
On the Solvability of Some Systems of Integro-Differential Equations with Drift
149
Remark 3 The importance of Theorems 2 and 4 above is the continuous dependence of solutions with respect to the integral kernels.
3 The Whole Real Line Case Proof of Theorem 1. Let us first suppose that in the case of = R for some v ∈ H 2 (R, RN ) there exist two solutions u(1),(2) ∈ H 2 (R, RN ) of system (2.3). Then their difference w(x) := u(1) (x) − u(2) (x) ∈ H 2 (R, RN ) will be a solution of the homogeneous system of equations −
dwk d 2 wk − ak wk = 0, − bk dx dx 2
1 ≤ k ≤ N.
Since the operator La, b, k defined in (2.6) acting on the whole real line does not possess any nontrivial square integrable zero modes, w(x) = 0 on R. We choose arbitrarily v(x) ∈ H 2 (R, RN ). Let us apply the standard Fourier transform (5.1) to both sides of (2.3). This yields u1k (p) =
√
2π
0k (p)f1k (p) G , − ak − ibk p
p2
1 ≤ k ≤ N,
(3.1)
where f1k (p) denotes the Fourier image of Fk (v(x), x). Evidently, for 1 ≤ k ≤ N , we have the estimates from above |u1k (p)| ≤
√ 2π Na, b, k |f1k (p)| and
|p2 u1k (p)| ≤
√
2π Na, b, k |f1k (p)|,
where Na, b, k < ∞ by means of Lemma 5 of the Appendix without any orthogonality relations for ak > 0 and under orthogonality condition (5.5) when ak = 0. This enables us to estimate the norm u2H 2 (R,RN ) =
N N 2 2 {u1k (p)2L2 (R) + p 2 u1k (p)2L2 (R) } ≤ 4π Na, b, k Fk (v(x), x)L2 (R) , k=1
k=1
which is finite due to (2.1) of Assumption 1 because |v(x)|RN ∈ L2 (R). Hence, for an arbitrary v(x) ∈ H 2 (R, RN ) there exists a unique solution u(x) ∈ H 2 (R, RN ) of system (2.3), such that its Fourier image is given by (3.1) and the map Ta,b : H 2 (R, RN ) → H 2 (R, RN ) is well defined. This allows us to choose arbitrarily v (1),(2) (x) ∈ H 2 (R, RN ), such that their images u(1),(2) = Ta,b v (1),(2) ∈ H 2 (R, RN ). Clearly,
150
M. Efendiev and V. Vougalter
(1) 0k (p)f2 √ G 2 (1) k (p) , uk (p)= 2π 2 p − ak − ibk p
(2) 0k (p)f2 √ G 2 (2) k (p) uk (p) = 2π 2 , p − ak − ibk p
1 ≤ k ≤ N,
2 2 (1) (2) where fk (p) and fk (p) stand for the Fourier transforms of Fk (v (1) (x), x) and Fk (v (2) (x), x) respectively. Thus, for 1 ≤ k ≤ N, we easily obtain & & & & &2 & (1) & √ & 2 (2) (2) &u(1) (p) − u2 & ≤ 2π Na, b, k &f2 &, (p) (p) − f (p) k k & k & k & & & & & & & 22 &2 & √ & 2 (2) (1) (2) &p u(1) (p) − p2 u2 & & & k k (p)& ≤ 2π Na, b, k &fk (p) − fk (p)&. & For the appropriate norms of vector functions we arrive at 2 u(1) − u(2) 2H 2 (R,RN ) ≤ 4π Na, b
N
Fk (v (1) (x), x) − Fk (v (2) (x), x)2L2 (R) .
k=1 (1),(2)
Note that vk (2.2) yields N
(x) ∈ H 2 (R) ⊂ L∞ (R) via the Sobolev embedding. Condition
Fk (v (1) (x), x) − Fk (v (2) (x), x)2L2 (R) ≤ L2 v (1) − v (2) 2L2 (R,RN ) .
k=1
Therefore, √ Ta,b v (1) − Ta,b v (2) H 2 (R,RN ) ≤ 2 πNa, b Lv (1) − v (2) H 2 (R,RN ) and the constant in the right side of this inequality is less than one as assumed. Hence, by means of the Fixed Point Theorem, there exists a unique vector function v (a,b) ∈ H 2 (R, RN ) with the property Ta,b v (a,b) = v (a,b) , which is the only solution of system (1.2) in H 2 (R, RN ). Suppose v (a,b) (x) = 0 identically on the real line. This will contradict to our assumption that for a certain 1 ≤ k ≤ N, the Fourier transforms of Gk (x) and Fk (0, x) do not vanish on a set of nonzero Lebesgue measure in R. Let us turn our attention to showing the existence in the sense of sequences of the solutions for our system of integro-differential equations on the real line. Proof of Theorem 2. By virtue of the result of Theorem 1 above, each system (2.7) has a unique solution u(m) (x) ∈ H 2 (R, RN ), m ∈ N. Limiting system (1.2) possesses a unique solution u(x) ∈ H 2 (R, RN ) by means of Lemma 6 below along with Theorem 1. Let us apply the standard Fourier transform (5.1) to both sides of (1.2) and (2.7), which yields for 1 ≤ k ≤ N, m ∈ N
On the Solvability of Some Systems of Integro-Differential Equations with Drift
u1k (p) =
0k (p)ϕ1k (p) √ G , 2π 2 p − ak − ibk p
√ G 2 k,m (p)ϕ2 k,m (p) (m) uk (p) = 2π 2 , p − ak − ibk p
151
(3.2)
where ϕ1k (p) and ϕ2 k,m (p) denote the Fourier images of Fk (u(x), x) and Fk (u(m) (x), x) respectively. Apparently, & . & . 0k (p) &2 . & √ . G G k,m (p) &u(m) (p) − u1k (p)& ≤ 2π . . |ϕ1k (p)|+ & k & . p2 − a − ib p − p2 − a − ib p . k k k k L∞ (R) . √ . + 2π . .
. . G k,m (p) . |ϕ2 1k (p)|. k,m (p) − ϕ 2 p − ak − ibk p .L∞ (R)
Therefore, (m)
uk −uk L2 (R) ≤
. √ . + 2π . .
√
. . 0k (p) . . G G k,m (p) . − 2π . . p 2 − a − ib p p 2 − a − ib p . ∞ Fk (u(x), x)L2 (R) + k k k k L (R)
. . G k,m (p) . Fk (u(m) (x), x) − Fk (u(x), x)L2 (R) . 2 p − ak − ibk p .L∞ (R)
By virtue of inequality (2.2) of Assumption 1, we derive + ,N , Fk (u(m) (x), x) − Fk (u(x), x)2L2 (R) ≤ Lu(m) (x) − u(x)L2 (R,RN ) . k=1
(3.3) 2 ∞ Clearly, u(m) k (x), uk (x) ∈ H (R) ⊂ L (R) for 1 ≤ k ≤ N, m ∈ N due to the Sobolev embedding. Therefore, we arrive at
u(m) (x)−u(x)2L2 (R,RN ) ≤ 4π
.2 N . 0k (p) . . G G2 k,m (p) . . Fk (u(x), x)2L2 (R) + . p2 − a − ib p − p2 − a − ib p . k k k k L∞ (R) k=1
(2
' +4π
(m) Na, b
L2 u(m) (x) − u(x)2L2 (R,RN ) ,
152
M. Efendiev and V. Vougalter
such that u(m) (x) − u(x)2L2 (R,RN ) ≤ .2 N . 0k (p) . . G G 4π k,m (p) . . Fk (u(x), x)2L2 (R) . ≤ . p2 − a − ib p − p2 − a − ib p . ε(2 − ε) k k k k L∞ (R) k=1
By virtue of inequality (2.1) of Assumption 1, we have Fk (u(x), x) ∈ L2 (R), 1 ≤ k ≤ N for u(x) ∈ H 2 (R, RN ). This implies that u(m) (x) → u(x),
m→∞
(3.4)
in L2 (R, RN ) via the result of Lemma 6 of the Appendix. Evidently, for 1 ≤ k ≤ N, m ∈ N, we have p2 u1k (p) =
√
2π
0k (p)ϕ1k (p) p2 G , 2 p − ak − ibk p
√ p2 G 2 k,m (p)ϕ2 k,m (p) (m) p2 uk (p) = 2π . p2 − ak − ibk p
Therefore, . & & . 2 2 0k (p) . & √ . & 22 p2 G . &p u(m) (p) − p 2 u1k (p)& ≤ 2π . p Gk,m (p) − |ϕ1k (p)|+ k & & . p 2 − a − ib p p 2 − ak − ibk p .L∞ (R) k k
. √ . + 2π . .
. p2 G k,m (p) . . |ϕ2 1k (p)|. k,m (p) − ϕ p2 − ak − ibk p .L∞ (R)
By virtue of (3.3), for 1 ≤ k ≤ N, m ∈ N, we arrive at . . 2 (m) . . 0k (p) . √ . p2 G2 . d uk d 2 uk . p2 G k,m (p) . . . . − ≤ 2π Fk (u(x), x)L2 (R) + − . dx 2 . p 2 − a − ib p p 2 − a − ib p . dx 2 .L2 (R) k k k k L∞ (R)
. √ . + 2π . .
. p2 G k,m (p) . . Lu(m) (x) − u(x)L2 (R,RN ) . p2 − ak − ibk p .L∞ (R)
By means of the result of Lemma 6 of the Appendix along with (3.4), we derive that d 2 u(m) d 2u → in L2 (R, RN ) as m → ∞. Definition (2.5) of the norm gives us 2 dx dx 2 that u(m) (x) → u(x) in H 2 (R, RN ) as m → ∞. Let us assume that the solution u(m) (x) of system (2.7) studied above vanishes on the real line for a certain m ∈ N. This will contradict to our assumption that the Fourier images of Gk,m (x) and Fk (0, x) are nontrivial on a set of nonzero Lebesgue measure in R. The analogous argument holds for the solution u(x) of limiting system (1.2).
On the Solvability of Some Systems of Integro-Differential Equations with Drift
153
4 The Problem on the Finite Interval Proof of Theorem 3. Apparently, each operator involved in the left side of system (2.3) La, b, k := −
d d2 − bk − ak : 2 dx dx
H 2 (I ) → L2 (I ),
(4.1)
where 1 ≤ k ≤ N, ak > 0, bk ∈ R, bk = 0 is Fredholm, non-selfadjoint, its set of eigenvalues is given by λa, b, k (n) = n2 − ak − ibk n,
n∈Z
(4.2)
einx and its eigenfunctions are the standard Fourier harmonics √ , n ∈ Z. When ak = 2π 0, we will use the similar ideas in the constrained subspace (2.8) instead of H 2 (I ). Evidently, the eigenvalues of each operator La, b, k are simple, as distinct from the analogous situation without the drift term, when the eigenvalues corresponding to n = 0 have the multiplicity of two (see [20]). Let us first suppose that for some v(x) ∈ Hc2 (I, RN ) there exist two solutions (1),(2) u (x) ∈ Hc2 (I, RN ) of system (2.3) with = I . Then the vector function w(x) := u(1) (x) − u(2) (x) ∈ Hc2 (I, RN ) will be a solution of the homogeneous system of equations −
dwk d 2 wk − ak wk = 0, − bk 2 dx dx
1 ≤ k ≤ N.
But the operator $L_{a,b,k} : H^2(I) \to L^2(I)$, $a_k > 0$ discussed above does not possess nontrivial zero modes. Hence, $w(x)$ vanishes in $I$. We choose arbitrarily $v(x) \in H_c^2(I, \mathbb{R}^N)$ and apply the Fourier transform (5.23) to system (2.3) considered on the interval $I$. This yields
$$u_{k,n} = \sqrt{2\pi}\, \frac{G_{k,n} f_{k,n}}{n^2 - a_k - i b_k n}, \qquad n^2 u_{k,n} = \sqrt{2\pi}\, \frac{n^2 G_{k,n} f_{k,n}}{n^2 - a_k - i b_k n}, \quad 1 \le k \le N, \quad n \in \mathbb{Z}, \tag{4.3}$$
with $f_{k,n} := F_k(v(x), x)_n$. Thus, we obtain
$$|u_{k,n}| \le \sqrt{2\pi}\, N_{a,b,k} |f_{k,n}|, \qquad |n^2 u_{k,n}| \le \sqrt{2\pi}\, N_{a,b,k} |f_{k,n}|,$$
where $N_{a,b,k} < \infty$ under the given auxiliary assumptions by virtue of Lemma 7 of the Appendix. Hence,
$$\|u\|_{H_c^2(I,\mathbb{R}^N)}^{2} = \sum_{k=1}^{N} \left\{ \sum_{n=-\infty}^{\infty} |u_{k,n}|^2 + \sum_{n=-\infty}^{\infty} |n^2 u_{k,n}|^2 \right\} \le 4\pi N_{a,b}^{2} \sum_{k=1}^{N} \|F_k(v(x), x)\|_{L^2(I)}^{2} < \infty$$
due to (2.1) of Assumption 1 for $|v(x)|_{\mathbb{R}^N} \in L^2(I)$. Thus, for any $v(x) \in H_c^2(I, \mathbb{R}^N)$ there exists a unique $u(x) \in H_c^2(I, \mathbb{R}^N)$ solving system (2.3), with its Fourier transform given by (4.3), and the map $\tau_{a,b} : H_c^2(I, \mathbb{R}^N) \to H_c^2(I, \mathbb{R}^N)$ is well defined. Let us consider arbitrary $v^{(1),(2)}(x) \in H_c^2(I, \mathbb{R}^N)$ with their images under the map discussed above, $u^{(1),(2)} = \tau_{a,b} v^{(1),(2)} \in H_c^2(I, \mathbb{R}^N)$. By applying the Fourier transform (5.23), we easily derive
$$u_{k,n}^{(1)} = \sqrt{2\pi}\, \frac{G_{k,n} f_{k,n}^{(1)}}{n^2 - a_k - i b_k n}, \qquad u_{k,n}^{(2)} = \sqrt{2\pi}\, \frac{G_{k,n} f_{k,n}^{(2)}}{n^2 - a_k - i b_k n}, \quad 1 \le k \le N, \quad n \in \mathbb{Z},$$
with $f_{k,n}^{(j)} := F_k(v^{(j)}(x), x)_n$, $j = 1, 2$. Hence,
$$|u_{k,n}^{(1)} - u_{k,n}^{(2)}| \le \sqrt{2\pi}\, N_{a,b,k} |f_{k,n}^{(1)} - f_{k,n}^{(2)}|, \qquad |n^2 (u_{k,n}^{(1)} - u_{k,n}^{(2)})| \le \sqrt{2\pi}\, N_{a,b,k} |f_{k,n}^{(1)} - f_{k,n}^{(2)}|.$$
Therefore,
$$\|u^{(1)} - u^{(2)}\|_{H_c^2(I,\mathbb{R}^N)}^{2} = \sum_{k=1}^{N} \left\{ \sum_{n=-\infty}^{\infty} |u_{k,n}^{(1)} - u_{k,n}^{(2)}|^2 + \sum_{n=-\infty}^{\infty} |n^2 (u_{k,n}^{(1)} - u_{k,n}^{(2)})|^2 \right\} \le 4\pi N_{a,b}^{2} \sum_{k=1}^{N} \|F_k(v^{(1)}(x), x) - F_k(v^{(2)}(x), x)\|_{L^2(I)}^{2}.$$
Clearly, $v_k^{(1),(2)}(x) \in H^2(I) \subset L^\infty(I)$, $1 \le k \le N$ due to the Sobolev embedding. By virtue of (2.2) we easily arrive at
$$\|\tau_{a,b} v^{(1)} - \tau_{a,b} v^{(2)}\|_{H_c^2(I,\mathbb{R}^N)} \le 2\sqrt{\pi}\, N_{a,b} L \|v^{(1)} - v^{(2)}\|_{H_c^2(I,\mathbb{R}^N)},$$
with the constant in the right side of this estimate less than one as assumed. Thus, the Fixed Point Theorem yields the existence and uniqueness of a vector function $v^{(a,b)} \in H_c^2(I, \mathbb{R}^N)$ satisfying $\tau_{a,b} v^{(a,b)} = v^{(a,b)}$, which is the only solution of system (1.2) in $H_c^2(I, \mathbb{R}^N)$. Suppose $v^{(a,b)}(x) = 0$ identically in $I$. This gives us a contradiction to the assumption that $G_{k,n} F_k(0, x)_n \ne 0$ for some $1 \le k \le N$ and a certain $n \in \mathbb{Z}$. We proceed to establishing the final main statement of the article.
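The fixed-point construction behind Theorem 3 can be mimicked numerically. The sketch below is a minimal illustration for a single component ($N = 1$) on a $2\pi$-periodic grid, with an assumed kernel $G$, an assumed nonlinearity $F$ and assumed parameters $a > 0$, $b \ne 0$; none of these choices come from the chapter. It inverts the symbol $n^2 - a - i b n$ on the Fourier side, which is the discrete analogue of formula (4.3) up to normalization constants, and then iterates the resulting map $\tau_{a,b}$.

```python
# Minimal sketch of the map tau_{a,b} on I = [0, 2*pi] for one component;
# kernel, nonlinearity and parameters are illustrative assumptions.
import numpy as np

M = 256
x = np.linspace(0.0, 2.0 * np.pi, M, endpoint=False)
dx = x[1] - x[0]
n = np.fft.fftfreq(M, d=dx) * 2.0 * np.pi     # integer frequencies on the 2*pi-periodic interval

a, b = 1.3, 0.7                               # a > 0, b != 0: the symbol n^2 - a - i b n never vanishes
G = np.exp(np.cos(x))
G /= G.sum() * dx                             # normalize the sample kernel (illustrative)

def F(u, x):
    # Sample nonlinearity with a small Lipschitz constant, in the spirit of (2.2)
    return 0.2 * np.sin(u) + 0.1 * np.cos(x)

def tau(v):
    # Solve -u'' - b u' - a u = (G * F(v, .))(x) spectrally: divide the Fourier
    # coefficients of the periodic convolution by the symbol n^2 - a - i b n.
    rhs_hat = np.fft.fft(G) * np.fft.fft(F(v, x)) * dx
    return np.real(np.fft.ifft(rhs_hat / (n**2 - a - 1j * b * n)))

u = np.zeros(M)
for _ in range(50):                           # Picard iteration; it contracts when a smallness
    u = tau(u)                                # condition like 2*sqrt(pi)*N_{a,b}*L < 1 holds

print("fixed-point residual:", np.max(np.abs(u - tau(u))))
```

For contractive sample data the printed residual is expected to sit at the level of machine precision.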
Proof of Theorem 4. Apparently, the limiting kernels $G_k(x)$, $1 \le k \le N$ are also periodic on the interval $I$ (see the argument of Lemma 8 of the Appendix). Each system (2.9) has a unique solution $u^{(m)}(x)$, $m \in \mathbb{N}$, belonging to $H_c^2(I, \mathbb{R}^N)$ by virtue of Theorem 3 above. The limiting system of equations (1.2) admits a unique solution $u(x)$, which belongs to $H_c^2(I, \mathbb{R}^N)$, by means of Lemma 8 of the Appendix along with Theorem 3. Let us apply the Fourier transform (5.23) to both sides of systems (1.2) and (2.9). This yields
$$u_{k,n} = \sqrt{2\pi}\, \frac{G_{k,n} \varphi_{k,n}}{n^2 - a_k - i b_k n}, \qquad u_{k,n}^{(m)} = \sqrt{2\pi}\, \frac{G_{k,m,n} \varphi_{k,n}^{(m)}}{n^2 - a_k - i b_k n}, \tag{4.4}$$
where $1 \le k \le N$, $n \in \mathbb{Z}$, $m \in \mathbb{N}$, with $\varphi_{k,n}$ and $\varphi_{k,n}^{(m)}$ denoting the Fourier images of $F_k(u(x), x)$ and $F_k(u^{(m)}(x), x)$ respectively under transform (5.23). We easily obtain the upper bound
$$|u_{k,n}^{(m)} - u_{k,n}| \le \sqrt{2\pi}\, \left\| \frac{G_{k,m,n}}{n^2 - a_k - i b_k n} - \frac{G_{k,n}}{n^2 - a_k - i b_k n} \right\|_{l^\infty} |\varphi_{k,n}| + \sqrt{2\pi}\, \left\| \frac{G_{k,m,n}}{n^2 - a_k - i b_k n} \right\|_{l^\infty} |\varphi_{k,n}^{(m)} - \varphi_{k,n}|.$$
Hence,
$$\|u_k^{(m)} - u_k\|_{L^2(I)} \le \sqrt{2\pi}\, \left\| \frac{G_{k,m,n}}{n^2 - a_k - i b_k n} - \frac{G_{k,n}}{n^2 - a_k - i b_k n} \right\|_{l^\infty} \|F_k(u(x), x)\|_{L^2(I)} + \sqrt{2\pi}\, \left\| \frac{G_{k,m,n}}{n^2 - a_k - i b_k n} \right\|_{l^\infty} \|F_k(u^{(m)}(x), x) - F_k(u(x), x)\|_{L^2(I)}.$$
By virtue of bound (2.2) of Assumption 1, we arrive at
$$\sqrt{\sum_{k=1}^{N} \|F_k(u^{(m)}(x), x) - F_k(u(x), x)\|_{L^2(I)}^{2}} \le L \|u^{(m)}(x) - u(x)\|_{L^2(I, \mathbb{R}^N)}. \tag{4.5}$$
Note that $u_k^{(m)}(x), u_k(x) \in H^2(I) \subset L^\infty(I)$ via the Sobolev embedding. Evidently,
$$\|u^{(m)}(x) - u(x)\|_{L^2(I,\mathbb{R}^N)}^{2} \le 4\pi \sum_{k=1}^{N} \left\| \frac{G_{k,m,n}}{n^2 - a_k - i b_k n} - \frac{G_{k,n}}{n^2 - a_k - i b_k n} \right\|_{l^\infty}^{2} \|F_k(u(x), x)\|_{L^2(I)}^{2} + 4\pi \left( N_{a,b}^{(m)} \right)^{2} L^{2} \|u^{(m)}(x) - u(x)\|_{L^2(I,\mathbb{R}^N)}^{2}.$$
Thus, we derive
$$\|u^{(m)}(x) - u(x)\|_{L^2(I,\mathbb{R}^N)}^{2} \le \frac{4\pi}{\varepsilon(2-\varepsilon)} \sum_{k=1}^{N} \left\| \frac{G_{k,m,n}}{n^2 - a_k - i b_k n} - \frac{G_{k,n}}{n^2 - a_k - i b_k n} \right\|_{l^\infty}^{2} \|F_k(u(x), x)\|_{L^2(I)}^{2}.$$
Clearly, $F_k(u(x), x) \in L^2(I)$, $1 \le k \le N$ for $u(x) \in H_c^2(I, \mathbb{R}^N)$ by means of bound (2.1) of Assumption 1. Lemma 8 below implies that
$$u^{(m)}(x) \to u(x), \quad m \to \infty \tag{4.6}$$
in $L^2(I, \mathbb{R}^N)$. Obviously,
$$|n^2 u_{k,n}^{(m)} - n^2 u_{k,n}| \le \sqrt{2\pi}\, \left\| \frac{n^2 G_{k,m,n}}{n^2 - a_k - i b_k n} - \frac{n^2 G_{k,n}}{n^2 - a_k - i b_k n} \right\|_{l^\infty} |\varphi_{k,n}| + \sqrt{2\pi}\, \left\| \frac{n^2 G_{k,m,n}}{n^2 - a_k - i b_k n} \right\|_{l^\infty} |\varphi_{k,n}^{(m)} - \varphi_{k,n}|.$$
By means of (4.5), we obtain
$$\left\| \frac{d^2 u_k^{(m)}}{dx^2} - \frac{d^2 u_k}{dx^2} \right\|_{L^2(I)} \le \sqrt{2\pi}\, \left\| \frac{n^2 G_{k,m,n}}{n^2 - a_k - i b_k n} - \frac{n^2 G_{k,n}}{n^2 - a_k - i b_k n} \right\|_{l^\infty} \|F_k(u(x), x)\|_{L^2(I)} + \sqrt{2\pi}\, \left\| \frac{n^2 G_{k,m,n}}{n^2 - a_k - i b_k n} \right\|_{l^\infty} L \|u^{(m)} - u\|_{L^2(I, \mathbb{R}^N)}.$$
Lemma 8 of the Appendix along with (4.6) give us that $\frac{d^2 u^{(m)}}{dx^2} \to \frac{d^2 u}{dx^2}$ in $L^2(I, \mathbb{R}^N)$ as $m \to \infty$. Hence, $u^{(m)}(x) \to u(x)$ in the $H_c^2(I, \mathbb{R}^N)$ norm as $m \to \infty$. Suppose that $u^{(m)}(x) = 0$ identically in the interval $I$ for some $m \in \mathbb{N}$. This gives us a contradiction to the assumption that $G_{k,m,n} F_k(0, x)_n \ne 0$ for some $1 \le k \le N$ and a certain $n \in \mathbb{Z}$. The analogous argument holds for the solution $u(x)$ of the limiting system of equations (1.2).
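To see the mechanism of the proof of Theorem 4 numerically, the sketch below (an assumed limiting kernel, assumed perturbations $G_m \to G$ and assumed parameters, none taken from the chapter) evaluates the $l^\infty$ distance between the discrete multipliers $\frac{G_{k,m,n}}{n^2 - a_k - i b_k n}$ and $\frac{G_{k,n}}{n^2 - a_k - i b_k n}$, which is the quantity controlling $\|u^{(m)} - u\|$ in the estimates above.

```python
# Illustrative check that kernel convergence G_m -> G on I drives the l^infty
# distance between the discrete multipliers to zero; sample data only.
import numpy as np

M = 512
x = np.linspace(0.0, 2.0 * np.pi, M, endpoint=False)
dx = x[1] - x[0]
n = np.fft.fftfreq(M, d=dx) * 2.0 * np.pi     # integer frequencies on the periodic interval
a, b = 1.0, 0.4                               # a > 0, b != 0, so n^2 - a - i b n never vanishes

G = np.exp(np.cos(x))                         # assumed limiting kernel
G_n = np.fft.fft(G) * dx                      # its Fourier coefficients (up to normalization)

for m in (1, 4, 16, 64):
    G_m = G + np.sin(3.0 * x) / m             # approximating kernels converging to G
    G_mn = np.fft.fft(G_m) * dx
    gap = np.max(np.abs((G_mn - G_n) / (n**2 - a - 1j * b * n)))
    print(f"m = {m:3d}:  sup_n |multiplier difference| ~ {gap:.2e}")
```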
5 Appendix

Let $G_k(x)$ be a function, $G_k(x) : \mathbb{R} \to \mathbb{R}$, for which we denote its standard Fourier transform using the hat symbol as
$$\widehat{G_k}(p) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} G_k(x) e^{-ipx}\, dx, \quad p \in \mathbb{R}, \tag{5.1}$$
such that
$$\|\widehat{G_k}(p)\|_{L^\infty(\mathbb{R})} \le \frac{1}{\sqrt{2\pi}} \|G_k\|_{L^1(\mathbb{R})} \tag{5.2}$$
and $G_k(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \widehat{G_k}(q) e^{iqx}\, dq$, $x \in \mathbb{R}$. For technical purposes we define the auxiliary quantities
$$N_{a,b,k} := \max\left\{ \left\| \frac{\widehat{G_k}(p)}{p^2 - a_k - i b_k p} \right\|_{L^\infty(\mathbb{R})}, \; \left\| \frac{p^2 \widehat{G_k}(p)}{p^2 - a_k - i b_k p} \right\|_{L^\infty(\mathbb{R})} \right\}, \tag{5.3}$$
with $a_k \ge 0$, $b_k \in \mathbb{R}$, $b_k \ne 0$, $1 \le k \le N$, $N \ge 2$. Let $N_{0,b,k}$ stand for (5.3) when $a_k$ vanishes. Under the conditions of Lemma 5 below, the quantities (5.3) will be finite. This will enable us to define
$$N_{a,b} := \max_{1 \le k \le N} N_{a,b,k} < \infty. \tag{5.4}$$
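The following small sketch illustrates why an orthogonality condition like (5.5) of Lemma 5(b) below is needed when $a_k$ vanishes. It models $\widehat{G_k}(p)$ near the origin by a first-order Taylor expansion (an assumption made purely for this illustration): the multiplier $\frac{\widehat{G_k}(p)}{p^2 - i b_k p}$ stays bounded near $p = 0$ only when $\widehat{G_k}(0)$ vanishes.

```python
# Behaviour of G^(p)/(p^2 - i b p) near p = 0 for a model G^(p) ~ G^(0) + slope*p;
# purely illustrative numbers.
import numpy as np

b = 0.8
# frequencies near, but not at, the origin
p = np.concatenate([np.linspace(-1e-3, -1e-6, 1000), np.linspace(1e-6, 1e-3, 1000)])

def multiplier_modulus(G_hat_0, slope):
    G_hat = G_hat_0 + slope * p               # first-order Taylor model of G^ near p = 0
    return np.abs(G_hat / (p**2 - 1j * b * p))

print("nonzero G^(0): sup near 0 ~", multiplier_modulus(1.0, 0.3).max())  # blows up like 1/(|b||p|)
print("zero G^(0):    sup near 0 ~", multiplier_modulus(0.0, 0.3).max())  # stays of order |slope|/|b|
```

Since $(G_k(x), 1)_{L^2(\mathbb{R})} = \sqrt{2\pi}\,\widehat{G_k}(0)$ by (5.1), the bounded case corresponds exactly to condition (5.5).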
The technical lemmas below are adaptations of the ones established in [8] for the study of the single integro-differential equation with drift, analogous to system (1.2). We provide them for the convenience of the readers.

Lemma 5 Let $N \ge 2$, $1 \le k \le N$, $b_k \in \mathbb{R}$, $b_k \ne 0$, $G_k(x) : \mathbb{R} \to \mathbb{R}$, $G_k(x) \in L^1(\mathbb{R})$ and $1 \le l \le N - 1$.
(a) Let $a_k > 0$ for $1 \le k \le l$. Then $N_{a,b,k} < \infty$.
(b) Let $a_k = 0$ for $l + 1 \le k \le N$ and in addition $x G_k(x) \in L^1(\mathbb{R})$. Then $N_{0,b,k} < \infty$ if and only if
$$(G_k(x), 1)_{L^2(\mathbb{R})} = 0 \tag{5.5}$$
holds.

Proof. First of all, let us observe that in both cases (a) and (b) of our lemma the boundedness of $\frac{\widehat{G_k}(p)}{p^2 - a_k - i b_k p}$ yields the boundedness of $\frac{p^2 \widehat{G_k}(p)}{p^2 - a_k - i b_k p}$.
Indeed, we can write $\frac{p^2 \widehat{G_k}(p)}{p^2 - a_k - i b_k p}$ as the following sum:
$$\widehat{G_k}(p) + a_k \frac{\widehat{G_k}(p)}{p^2 - a_k - i b_k p} + i b_k \frac{p \widehat{G_k}(p)}{p^2 - a_k - i b_k p}. \tag{5.6}$$
Evidently, the first term in (5.6) is bounded by means of (5.2), since $G_k(x) \in L^1(\mathbb{R})$ as assumed. The third term in (5.6) can be estimated from above in absolute value via (5.2) as
$$\frac{|b_k| |p| |\widehat{G_k}(p)|}{\sqrt{(p^2 - a_k)^2 + b_k^2 p^2}} \le \frac{1}{\sqrt{2\pi}} \|G_k(x)\|_{L^1(\mathbb{R})} < \infty.$$
Thus, $\frac{\widehat{G_k}(p)}{p^2 - a_k - i b_k p} \in L^\infty(\mathbb{R})$ implies that $\frac{p^2 \widehat{G_k}(p)}{p^2 - a_k - i b_k p} \in L^\infty(\mathbb{R})$. To obtain the result of part (a) of the lemma, we need to estimate
$$\frac{|\widehat{G_k}(p)|}{\sqrt{(p^2 - a_k)^2 + b_k^2 p^2}}. \tag{5.7}$$
Apparently, the numerator of (5.7) can be bounded from above by virtue of (5.2), and the denominator in (5.7) can be easily estimated from below by a finite, positive constant, such that
$$\left| \frac{\widehat{G_k}(p)}{p^2 - a_k - i b_k p} \right| \le C \|G_k(x)\|_{L^1(\mathbb{R})} < \infty.$$
Here and further down, $C$ will denote a finite, positive constant. This yields that under the given conditions, when $a_k > 0$, we have $N_{a,b,k} < \infty$. In the case of $a_k = 0$, we express
$$\widehat{G_k}(p) = \widehat{G_k}(0) + \int_0^p \frac{d\widehat{G_k}(s)}{ds}\, ds,$$
such that
$$\frac{\widehat{G_k}(p)}{p^2 - i b_k p} = \frac{\widehat{G_k}(0)}{p(p - i b_k)} + \frac{\int_0^p \frac{d\widehat{G_k}(s)}{ds}\, ds}{p(p - i b_k)}. \tag{5.8}$$
By means of definition (5.1) of the standard Fourier transform, we easily estimate
$$\left| \frac{d\widehat{G_k}(p)}{dp} \right| \le \frac{1}{\sqrt{2\pi}} \|x G_k(x)\|_{L^1(\mathbb{R})}.$$
Hence, we derive
$$\left| \frac{\int_0^p \frac{d\widehat{G_k}(s)}{ds}\, ds}{p(p - i b_k)} \right| \le \frac{\|x G_k(x)\|_{L^1(\mathbb{R})}}{\sqrt{2\pi}\, |p - i b_k|} \le \frac{\|x G_k(x)\|_{L^1(\mathbb{R})}}{\sqrt{2\pi}\, |b_k|}