English Pages 334 [326] Year 2021
STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health
W. A. Zúñiga-Galindo Bourama Toni Editors
Advances in Non-Archimedean Analysis and Applications The p-adic Methodology in STEAM-H
STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health Series Editor Bourama Toni Department of Mathematics Howard University Washington, DC, USA
This interdisciplinary series highlights the wealth of recent advances in the pure and applied sciences made by researchers collaborating between fields where mathematics is a core focus. As we continue to make fundamental advances in various scientific disciplines, the most powerful applications will increasingly be revealed by an interdisciplinary approach. This series serves as a catalyst for these researchers to develop novel applications of, and approaches to, the mathematical sciences. As such, we expect this series to become a national and international reference in STEAM-H education and research. Interdisciplinary by design, the series focuses largely on scientists and mathematicians developing novel methodologies and research techniques that have benefits beyond a single community. This approach seeks to connect researchers from across the globe, united in the common language of the mathematical sciences. Thus, volumes in this series are suitable for both students and researchers in a variety of interdisciplinary fields, such as: mathematics as it applies to engineering; physical chemistry and material sciences; environmental, health, behavioral and life sciences; nanotechnology and robotics; computational and data sciences; signal/image processing and machine learning; finance, economics, operations research, and game theory. The series originated from the weekly yearlong STEAM-H Lecture series at Virginia State University featuring world-class experts in a dynamic forum. Contributions reflected the most recent advances in scientific knowledge and were delivered in a standardized, self-contained and pedagogically-oriented manner to a multidisciplinary audience of faculty and students with the objective of fostering student interest and participation in the STEAM-H disciplines as well as fostering interdisciplinary collaborative research. 
The series strongly advocates multidisciplinary collaboration with the goal to generate new interdisciplinary holistic approaches, instruments and models, including new knowledge, and to transcend scientific boundaries.
More information about this series at http://www.springer.com/series/15560
Editors W. A. Zúñiga-Galindo University of Texas Rio Grande Valley Brownsville, TX, USA
Bourama Toni Department of Mathematics Howard University Washington, DC, USA
ISSN 2520-193X ISSN 2520-1948 (electronic) STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health ISBN 978-3-030-81975-0 ISBN 978-3-030-81976-7 (eBook) https://doi.org/10.1007/978-3-030-81976-7 Mathematics Subject Classification: 11F85, 11S82, 12J25, 26E30, 35S05, 37P20 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Foreword
It is well known that, given two segments a and b on a straight line with a < b, one can exceed b by laying a end to end some n times along b. This property is called the Archimedean axiom (postulate). It can also be formulated for surfaces and volumes in Euclidean spaces, and can be extended to Riemannian spaces. Its version in the field of real numbers is the following statement: if a and b are two positive real numbers and a < b, then there exists a sufficiently large natural number n such that na > b. In metric spaces, the Archimedean axiom is related to the triangle inequality: d(x, y) ≤ d(x, z) + d(y, z). There are also ultrametric (non-Archimedean) spaces, where the Archimedean axiom is not satisfied and, instead of the triangle inequality, the strong triangle inequality holds: d(x, y) ≤ max{d(x, z), d(y, z)}. In non-Archimedean spaces, for any natural number n one has |1 · n| ≤ 1, where | · | denotes the relevant ultrametric norm. Analyses on the field of p-adic numbers and on the Levi-Civita field are attractive and well-developing examples of non-Archimedean analysis. The Archimedean axiom is inherently related to measurement. General relativity combined with quantum mechanics predicts the Planck length as the smallest length that can be measured. In other words, theoretical physics based on real and complex numbers predicts its own breakdown on approaching the Planck scale. This theoretical observation gave rise to the application of p-adic analysis in string theory by introducing p-adic strings, whose world-sheet is p-adic. Now, in addition to p-adic string theory, p-adic analysis is also applied in the ultrametric modeling of many physical, and some other, systems and phenomena with hierarchical structure. This book is a collection of selected review articles on many recent developments of methods in non-Archimedean analysis aimed at applications in the sciences. The articles are written by experts who have been invited by the editors.
The book covers many directions of contemporary research, and the papers presented here give an excellent insight into the state of the art and perspectives.
I believe that this book will be useful not only to those who are already involved in this rapidly developing research area, but also to the others who want to be informed of what is going on in this subject. It should stimulate further investigations in non-Archimedean analysis and its applications. Institute of Physics, University of Belgrade and Mathematical Institute of the Serbian Academy of Sciences and Arts, Belgrade, Serbia
Branko Dragovich
Preface
This book, Advances in Non-Archimedean Analysis and Applications: The p-adic Methodology in STEAM-H, features recent developments and techniques in non-Archimedean analysis and its applications by world-renowned experts in the field; it will help to re-emphasize the relevance and depth of this important area of mathematics, in particular its expanding reach into the physical, biological, social, and computational sciences. The volume provides an accessible summary of a wide range of active research topics, along with exciting new results. Topics include: the p-adic theory of automata functions, p-adic statistical lattice models, non-Archimedean valued fields, non-Archimedean models of morphogenesis, and p-adic wave equations. The volume's unique feature is that it gathers in a single expert book the most recent theoretical developments as well as state-of-the-art applications of advances in non-Archimedean analysis. It will certainly serve as a useful resource both for graduate students entering this research area and for more established researchers, including as a wide-angle snapshot of this exciting and far-reaching research domain. The volume will also facilitate an in-depth exchange of ideas on recent advances in the various aspects of non-Archimedean analysis and geometry. As such, the volume is an important part of the multidisciplinary STEAM-H series (science, technology, engineering, agriculture, mathematics, and health); the series continues to bring together leading researchers to present their work with a view to advancing their specific fields, and in a way that generates genuine interdisciplinary interaction transcending disciplinary boundaries. All chapters therein were carefully edited and peer reviewed; they are reasonably self-contained and pedagogically presented for a multidisciplinary readership.
Contributions are by invitation only and reflect the most recent advances, delivered to a high standard and self-contained, in line with the goals of the series, that is: (1) To enhance multidisciplinary understanding between the disciplines by showing how some new advances in a particular discipline can be of interest to the other
discipline, or how different disciplines contribute to a better understanding of a relevant issue at the interface of mathematics and the sciences. (2) To promote the spirit of inquiry so characteristic of mathematics for the advances of the natural, physical, and behavioral sciences. (3) To encourage diversity in the readers’ background and expertise, while at the same time structurally fostering genuine interdisciplinary interactions and networking. Current disciplinary boundaries do not encourage effective interactions between scientists; researchers from different fields usually occupy different academic buildings, publish in journals specific to their field, and attend different scientific meetings. Existing scientific meetings usually fall into either small gatherings specializing on specific questions, targeting specific and small group of scientists already aware of each other’s work and potentially collaborating, or large meetings covering a wide field and targeting a diverse group of scientists but usually not allowing specific interactions to develop due to their large size and a crowded program. Here, contributors focus on how to make their work intelligible, accessible to a diverse audience, which, in the process, enforces mastery of their own field of expertise. This volume, with its purposely diversified content, strongly advocates multidisciplinarity with the goal to generate new interdisciplinary approaches, instruments and models including new knowledge, transcending scientific boundaries to adopt a more holistic approach. 
For instance, it should be acknowledged, following Nobel laureate and president of the UK’s Royal Society of Chemistry, Professor Sir Harry Kroto, “that the traditional chemistry, physics, biology departmentalised university infrastructures—which are now clearly out-of-date and a serious hindrance to progress—must be replaced by new ones which actively foster the synergy inherent in multidisciplinarity.” The National Institute of Health and the Howard Hughes Medical Institute have strongly recommended that undergraduate biology education should incorporate mathematics, physics, chemistry, computer science, and engineering until “interdisciplinary thinking and work become second nature.” Young physicists and chemists are encouraged to think about the opportunities waiting for them at the interface with the life sciences. Mathematics is playing an ever more important role in the physical and life sciences, engineering, and technology, blurring the boundaries between scientific disciplines. The STEAM-H series, through contributed volumes such as the current one, is to be a reference of choice for established interdisciplinary scientists and mathematicians, and a source of inspiration for a broad spectrum of researchers and research students, graduates, and postdoctoral fellows; the sheer emphasis of these carefully selected and refereed contributed chapters is on important methods, research directions, and applications of analysis within and beyond mathematics. As such, the volume implicitly promotes mathematical sciences, physical and life sciences, engineering, and technology education, as well as interdisciplinary, industrial, and academic genuine cooperation.
The current book, titled Advances in Non-Archimedean Analysis and Applications—The p-adic Methodology in STEAM-H, as a whole certainly enhances the overall objective of the series, that is, to foster the readership interest and enthusiasm in the STEAM-H disciplines (science, technology, engineering, agriculture, mathematics, and health), to stimulate graduate and undergraduate research, and generate collaboration among researchers on a genuine interdisciplinary basis. The STEAM-H series is hosted at Howard University, Washington DC, USA, an area that is socially, economically, and intellectually very dynamic, and home to some of the most important research centers in the USA. The series, by now well established and published by Springer, a world-renowned publisher, is expected to become a national and international reference in interdisciplinary education and research. Washington, DC, USA
Bourama Toni
Brownsville, TX, USA
W. A. Zúñiga-Galindo
Acknowledgments
We would like to express our sincere appreciation to all the contributors and to all the anonymous referees for their professionalism. They all made this volume a reality for the greater benefit of the community of science, technology, engineering, agriculture, mathematics, and health.
Contents
Introduction: Advancing Non-Archimedean Mathematics
Bourama Toni and W. A. Zúñiga-Galindo
The p-adic Theory of Automata Functions
Vladimir Anashin
Chaos in p-adic Statistical Lattice Models: Potts Model
Farrukh Mukhamedov and Otabek Khakimov
QFT, RG, and All That, for Mathematicians
Abdelmalek Abdesselam
Phase Operator on $L^2(\mathbb{Q}_p)$ and the Zeroes of Fisher and Riemann
Parikshit Dutta and Debashis Ghoshal
On Non-Archimedean Valued Fields: A Survey of Algebraic, Topological and Metric Structures, Analysis and Applications
Khodr Shamseddine and Angel Barría Comicheo
Non-Archimedean Models of Morphogenesis
W. A. Zúñiga-Galindo
p-Adic Wave Equations on Finite Graphs and $T_0$-Spaces
Patrick Erik Bradley
A Riemann-Roch Theorem on Infinite Graphs
Atsushi Atsuji and Hiroshi Kaneko
Index
Contributors
Abdelmalek Abdesselam, Department of Mathematics, University of Virginia, Charlottesville, VA, USA
Vladimir Anashin, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Moscow, Russia; Federal Research Center 'Information and Control', Russian Academy of Sciences, Moscow, Russia
Atsushi Atsuji, Department of Mathematics, Keio University, Yokohama, Japan
Patrick Erik Bradley, Institute of Photogrammetry and Remote Sensing, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Angel Barría Comicheo, Department of Mathematics, University of Manitoba, Winnipeg, MB, Canada
Parikshit Dutta, Asutosh College, Kolkata, India
Debashis Ghoshal, School of Physical Science, Jawaharlal Nehru University, New Delhi, India
Hiroshi Kaneko, Department of Mathematics, Tokyo University of Science, Shinjuku-ku, Japan
Otabek Khakimov, Department of Algebra and Its Applications, Institute of Mathematics, Tashkent, Uzbekistan
Farrukh Mukhamedov, Department of Mathematical Science, College of Science, The United Arab Emirates University, Abu Dhabi, UAE
Khodr Shamseddine, Department of Physics and Astronomy, University of Manitoba, Winnipeg, MB, Canada
Bourama Toni, Department of Mathematics, Howard University, Washington, DC, USA
W. A. Zúñiga-Galindo, University of Texas Rio Grande Valley, School of Mathematical & Statistical Sciences, Brownsville, TX, USA; Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Departamento de Matemáticas, Santiago de Querétaro, Mexico
Introduction: Advancing Non-Archimedean Mathematics Bourama Toni and W. A. Zúñiga-Galindo
Abstract Hermann Weyl is quoted as saying in Philosophie der Mathematik und Naturwissenschaft (1927, p. 36): As a matter of fact, it is by no means impossible to build up a consistent “non-Archimedean” theory of magnitudes in which the axiom of Eudoxus (usually named after Archimedes) does not hold. Indeed, the axiom of Eudoxus/Archimedes is the main difference between real and p-adic/ultrametric spaces; however, the axiom is more of a physical one, concerning the process of measurement: exchanging the real number field for the p-adic number field is tantamount to exchanging axiomatics in quantum physics.
Keywords p-Adic mathematical physics · p-Adic analysis · Ultrametricity · Automaton · Automata functions · Ergodic theory · p-Adic Potts model · p-Adic Gibbs measure · Renormalization group · Quantum field theory · Riemann zeta function · Riemann hypothesis · Partition functions · Non-Archimedean valued fields · Ultrametric spaces · Hahn fields · Levi-Civita fields · Non-Archimedean analysis · Reaction-diffusion equations · Turing patterns · Wave equation · Finite graphs · Finite $T_0$-spaces · Riemann-Roch theorem · Infinite graphs · Laplace operator on graphs
B. Toni, Department of Mathematics, Howard University, Washington, DC, USA
W. A. Zúñiga-Galindo, School of Mathematical and Statistical Sciences, University of Texas, Austin, TX, USA; Centro de Investigación y Estudios Avanzados del IPN, Mexico, Mexico
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. W. A. Zúñiga-Galindo, B. Toni (eds.), Advances in Non-Archimedean Analysis and Applications, STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health, https://doi.org/10.1007/978-3-030-81976-7_1
Hermann Weyl is quoted as saying in Philosophie der Mathematik und Naturwissenschaft (1927, p. 36): As a matter of fact, it is by no means impossible to build up a consistent “non-Archimedean” theory of magnitudes in which the axiom of Eudoxus (usually named after Archimedes) does not hold. Indeed, the axiom of Eudoxus/Archimedes is the main difference between real and p-adic/ultrametric spaces; however, the axiom is more of a physical one, concerning the process of measurement: exchanging the real number field for the p-adic number field is tantamount to exchanging axiomatics in quantum physics.
Ultrametricity in physics means the emergence of ultrametric spaces in physical models. A metric space (M, d) is ultrametric if the distance d satisfies d(A, B) ≤ max{d(A, C), d(C, B)} for any three points A, B, C in M. Ultrametricity was discovered in the 1980s by Parisi and others in the theory of spin glasses and by Frauenfelder and others in the physics of proteins. In both cases, the space of states of a complex system has a hierarchical structure which plays a central role in the physical behavior of the system.
On the other hand, in the 1930s, Bronstein showed that general relativity and quantum mechanics imply that the uncertainty $\Delta x$ of any length measurement satisfies
$$\Delta x \geq L_{\text{Planck}} := \sqrt{\frac{\hbar G}{c^{3}}},$$
where $L_{\text{Planck}}$ is the Planck length ($L_{\text{Planck}} \approx 10^{-33}$ cm). This implies that spacetime is not an infinitely divisible continuum. Mathematically speaking, spacetime must be a totally disconnected topological space. Ultrametric spaces are naturally totally disconnected. There are several possible interpretations of the Bronstein inequality. One of them drives loop quantum gravity. Another interpretation of Bronstein's inequality was given by Volovich in the 1980s. The inequality implies that real numbers cannot be used in models at the level of the Planck length, because the Archimedean axiom, which appears naturally if we use real numbers, implies that lengths can be measured with arbitrary precision. Volovich proposed using p-adic numbers in physical models at the Planck scale.
These ideas have propelled the development of a very large number of areas in mathematics and theoretical physics, which in turn has led to applications in computer science, biology, etc. For instance, p-adic dynamics (discrete time flows) has proved effective in a variety of areas: computer science (p-adic matrix processors; parallel p-adic linear solvers); cryptography (stream ciphers, T-functions); automata theory and formal languages; genetics; and data mining, among other areas, see e.g. [1, 2], see also [3]. To paraphrase Murtagh, ultrametricity is a pervasive property of observational data. It offers the theoretical framework and processing tools to handle “big data” and high-dimensional data sets. One could say that “the p-adic methodology” consists in representing/interpreting phenomena/data in ultrametric spaces in order to produce models that can be studied using non-Archimedean mathematical techniques.
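The ultrametric inequality and the failure of the Archimedean axiom can be checked numerically on a small example. The following is a minimal sketch, not taken from the book: it assumes the standard p-adic absolute value on the integers, $|n|_p = p^{-v_p(n)}$, and the function names (`valuation`, `norm`, `dist`, `is_ultrametric_on`) are illustrative choices.

```python
from fractions import Fraction
from itertools import product


def valuation(n, p):
    """p-adic valuation v_p(n): the exponent of p in a nonzero integer n."""
    if n == 0:
        raise ValueError("v_p(0) is infinite")
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def norm(n, p):
    """p-adic absolute value |n|_p = p^(-v_p(n)), with |0|_p = 0."""
    return Fraction(0) if n == 0 else Fraction(1, p ** valuation(n, p))


def dist(x, y, p):
    """p-adic distance d(x, y) = |x - y|_p."""
    return norm(x - y, p)


def is_ultrametric_on(points, p):
    """Check d(A, B) <= max(d(A, C), d(C, B)) for all triples of sample points."""
    pts = list(points)
    return all(
        dist(a, b, p) <= max(dist(a, c, p), dist(c, b, p))
        for a, b, c in product(pts, repeat=3)
    )


if __name__ == "__main__":
    # Adding 1 to itself n times never leaves the unit ball: |n|_p <= 1.
    print(all(norm(n, 5) <= 1 for n in range(1, 1000)))
    # The strong triangle inequality holds on a sample of integers.
    print(is_ultrametric_on(range(-10, 11), 3))
```

Exact rational arithmetic (`Fraction`) is used so that the comparisons are not affected by floating-point rounding. A finite sample can only illustrate the inequality, not prove it.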
In recent years the connections between non-Archimedean mathematics and mathematical physics have received much attention, see e.g. [3–8] and the references therein. All these developments have been motivated by two physical ideas. The first idea comes from statistical physics, in particular from models which describe relaxation in glasses, macromolecules, and proteins. It has been proposed that the non-exponential nature of those relaxations is a consequence of a hierarchical structure of the state space, which can in turn be connected with p-adic structures. In the middle of the 1980s the idea of using ultrametric spaces to describe the states of complex biological systems, which naturally possess a hierarchical structure, emerged in the works of Frauenfelder, Parisi, and Stein, among others, see e.g. [9, 10]. In protein physics, it is regarded as one of the most profound ideas put forward to explain the nature of distinctive life attributes. As a consequence, stochastic processes on ultrametric spaces and their connections with models of complex systems have received a lot of attention in recent years, see e.g. [3, 4, 7, 8, 11–17], and the references therein. The second idea is Volovich's conjecture, posed in the 1980s, that space-time has a non-Archimedean structure at the level of the Planck scale, which initiated p-adic string theory [18]; see also [19, Chapter 6], [4], and [20, 21]. p-Adic string theory is still a very active research area, see e.g. [22–26] and the references therein. On the other hand, the relevance of constructing p-adic quantum field theories was stressed in [4] and [19]. In the last 35 years p-adic QFT has attracted a lot of attention from physicists and mathematicians, see e.g. [5, 6, 8, 27–36], and the references therein. Nowadays there is very strong research activity in ultrametric analysis and its applications.
This is particularly so in p-adic field and string theories, p-adic dynamical systems, p-adic techniques in cryptography, p-adic reaction-diffusion equations and biological models, p-adic models in geophysics, stochastic processes in ultrametric spaces, applications of ultrametric spaces in data processing, etc. This volume features recent developments of techniques in non-Archimedean analysis and its applications by world-renowned experts in the field. It helps to re-emphasize the relevance and depth of this important area of mathematics, and in particular its expanding reach into the physical, biological, social, and computational sciences as well as engineering and technology. The book is organized in chapters as follows: The chapter titled The p-Adic Theory of Automata Functions, by Vladimir Anashin, is a survey on the p-adic theory of automata functions. An automaton is a sequential machine which maps symbols of a finite input alphabet to symbols of a finite output alphabet so that each output symbol depends on the corresponding input symbol and on the current state of the machine, whereas each input symbol changes the current state of the machine. An automaton whose input and output alphabets consist of p symbols, p a prime, produces a mapping from infinite input words to infinite output words over a p-letter alphabet. The mapping produced by the automaton is a function whose domain and range are the p-adic integers. It can be shown that the function is necessarily 1-Lipschitz w.r.t. the p-adic metric; and moreover,
any 1-Lipschitz mapping from p-adic integers to p-adic integers is a mapping associated with some automaton. The chapter focuses on dynamical (especially ergodic) and other properties of functions from p-adic integers to p-adic integers. The problems considered in the chapter are mostly motivated by applications in computer science, cryptography, pseudorandom numbers, digital economy (smart contract modelling), physics, quantitative biology, etc. The chapter titled Chaos in p-Adic Statistical Lattice Models: Potts Model contains the contribution of Farrukh Mukhamedov and Otabek Khakimov. Models of interacting systems have been intensively studied in recent years, and new methodologies have been developed in the attempt to understand their intriguing features. One of the most promising directions is the combination of statistical mechanics tools with methods adopted from dynamical systems. One such tool is the renormalization group (RG), which has had a profound impact on modern statistical physics. The authors provide recent results on the existence of the phase transition and its relation to the chaotic behavior of the associated p-adic dynamical system. The chapter presents the p-adic q-state Potts model on a Cayley tree using the theory of p-adic measures and non-Archimedean stochastic processes. In the chapter titled QFT, RG, and All That, for Mathematicians, Abdelmalek Abdesselam presents a quick nontechnical introduction to quantum field theory and Wilson's theory of the renormalization group from the point of view of mathematical analysis. The presentation is geared primarily towards an audience in probability theory, harmonic analysis, and dynamical systems theory. The author also emphasizes the use of p-adics in order to set up hierarchical versions of the renormalization group. The latter provide an ideal stepping stone towards the more involved Euclidean space setting.
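The correspondence between automata and 1-Lipschitz functions described for the chapter on automata functions can be illustrated with a toy machine. The sketch below is an assumption-laden example, not taken from that chapter: it uses p = 2, writes 2-adic integers as digit sequences with the least significant digit first, and defines a hypothetical two-state automaton (tables `TRANSITION` and `OUTPUT`) that realizes the map x ↦ x + 1. Agreement of two words on their first k digits means the corresponding 2-adic integers are within distance $2^{-k}$, so the 1-Lipschitz property amounts to outputs agreeing at least as far as inputs do.

```python
from itertools import product


def run_mealy(transition, output, start, digits):
    """Feed digits (least significant first) through a Mealy automaton."""
    state, out = start, []
    for d in digits:
        out.append(output[(state, d)])
        state = transition[(state, d)]
    return out


# Hypothetical example machine: 2-adic "add one".
# State 1 means a carry is pending, state 0 means no carry.
TRANSITION = {(1, 0): 0, (1, 1): 1, (0, 0): 0, (0, 1): 0}
OUTPUT = {(1, 0): 1, (1, 1): 0, (0, 0): 0, (0, 1): 1}


def to_int(digits):
    """Integer represented by a finite digit word, least significant first."""
    return sum(d << i for i, d in enumerate(digits))


def common_prefix_len(a, b):
    """Number of leading positions on which two words agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def check_one_lipschitz(n_digits=6):
    """Outputs must agree on at least as many leading digits as the inputs."""
    words = list(product((0, 1), repeat=n_digits))
    for a in words:
        fa = run_mealy(TRANSITION, OUTPUT, 1, a)
        for b in words:
            fb = run_mealy(TRANSITION, OUTPUT, 1, b)
            if common_prefix_len(fa, fb) < common_prefix_len(a, b):
                return False
    return True


if __name__ == "__main__":
    print(to_int(run_mealy(TRANSITION, OUTPUT, 1, [1, 1, 0, 0])))  # 3 + 1
    print(check_one_lipschitz())
```

The check succeeds for any deterministic automaton of this kind, since the k-th output digit depends only on the first k input digits; that is the intuition behind the 1-Lipschitz statement.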
The chapter titled Phase Operator on $L^2(\mathbb{Q}_p)$ and the Zeroes of Fisher and Riemann contains the contribution of Parikshit Dutta and Debashis Ghoshal. The distribution of the non-trivial zeroes of the Riemann zeta function, according to the Riemann hypothesis, is tantalizingly similar to the zeroes of the partition functions (Fisher and Yang-Lee zeroes) of statistical mechanical models studied by physicists. The authors show that a ‘phase operator’ conjugate to it can be constructed on a subspace $L^2(p^{-1}\mathbb{Z}_p)$ of $L^2(\mathbb{Q}_p)$. They discuss (at a physicist's level of rigor) how to combine this for all primes to possibly relate to the zeroes of the Riemann zeta function. Finally, they extend these results to the family of Dirichlet L-functions, using their recent construction of Vladimirov-derivative-like pseudodifferential operators associated with the Dirichlet characters. The chapter titled On Non-Archimedean Valued Fields: A Survey of Algebraic, Topological and Metric Structures, Analysis and Applications contains the contribution of Khodr Shamseddine and Angel Barría Comicheo. The authors first briefly review basic properties of ultrametric spaces, valued fields, and ordered fields, as well as the connections between these different mathematical objects. As examples, they introduce the so-called general Hahn fields and Levi-Civita fields, and present a summary of their key properties. Then, for the rest of the paper, they focus their attention on two special Levi-Civita fields: R and its complex counterpart C. They review some of their research work on R and C as well as on the spaces
R^2 and R^3: one-dimensional and multi-dimensional calculus, power series and analytic functions, measure theory and integration, unconstrained and constrained optimization, operator theory on the Banach space c_0 of null sequences of elements of C, and computational applications. In the chapter titled Non-Archimedean Models of Morphogenesis, W. A. Zúñiga-Galindo studies a p-adic reaction-diffusion system and the associated Turing patterns. The author establishes an instability criterion and shows that the Turing patterns are not classical patterns consisting of alternating domains. Instead, a Turing pattern consists of several domains (clusters), each of them supporting a different pattern but with the same parameter values. Patterns of this type are typically produced by reaction-diffusion equations on large networks. In the chapter titled p-Adic Wave Equations on Finite Graphs and $T_0$-Spaces, Patrick Erik Bradley studies certain p-adic wave and diffusion equations on finite $T_0$-spaces through their Hasse diagrams. First, a dictionary between graph theory and p-adic analysis is developed. Then the structure of the solutions of homogeneous wave equations on networks with and without damping is studied and compared with the classical case. The chapter titled A Riemann-Roch Theorem on Infinite Graphs presents the contribution of Atsushi Atsuji and Hiroshi Kaneko. Baker and Norine established a Riemann-Roch theorem on finite graphs with uniform unit vertex-weight and uniform unit edge-weight. In this chapter, the authors consider an edge-weighted infinite graph and study the spectral gaps of the Laplace operators defined on its finite subgraphs naturally given by Q-valued positive weights on the edges. The authors build a potential theoretic scheme for a proof of a Riemann-Roch theorem on an edge-weighted infinite graph. The editors want to thank all the people and institutions who made this volume possible.
First of all, the colleagues who contributed the chapters: Vladimir S. Anashin (Lomonosov Moscow State University, Russia); Farrukh Mukhamedov (The United Arab Emirates University, UAE) and Otabek Khakimov (Institute of Mathematics, Uzbekistan); Abdelmalek Abdesselam (University of Virginia, U.S.A.); Parikshit Dutta (Asutosh College Kolkata, India) and Debashis Ghoshal (Jawaharlal Nehru University, India); Angel Barría Comicheo (International College of Manitoba, Canada) and Khodr Shamseddine (University of Manitoba, Canada); W. A. Zúñiga-Galindo (University of Texas Rio Grande Valley, U.S.A. and Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, México); Patrick Erik Bradley (Karlsruhe Institute of Technology, Germany); Atsushi Atsuji (Keio University, Japan) and Hiroshi Kaneko (Tokyo University of Science, Japan). We also acknowledge with thanks the assistance of Anatoly Kochubei, Evgeny Zelenov, José Aguayo, Sangtae Jeong, Livat Tyapaev, and Branko Dragovich, among others. Zúñiga-Galindo acknowledges with thanks the financial support from Conacyt, Grant 217367 (Mexico), and the Debnath Endowed Chairmanship at UTRGV, U.S.A.
B. Toni and W. A. Zúñiga-Galindo
References
1. Anashin Vladimir, Khrennikov Andrei, Applied algebraic dynamics. De Gruyter Expositions in Mathematics, 49. Walter de Gruyter & Co., Berlin, 2009.
2. Murtagh Fionn, Thinking ultrametrically, thinking p-adically. Clusters, orders, and trees: methods and applications, 249–272, Springer Optim. Appl., 92, Springer, New York, 2014.
3. Dragovich B., Khrennikov A. Yu., Kozyrev S. V., Volovich I. V., Zelenov E. I., p-Adic mathematical physics: the first 30 years, p-Adic Numbers Ultrametric Anal. Appl. 9 (2017), no. 2, 87–121.
4. Vladimirov V. S., Volovich I. V., Zelenov E. I., p-adic analysis and mathematical physics. Series on Soviet and East European Mathematics, 1. World Scientific Publishing Co., Inc., River Edge, NJ, 1994.
5. Khrennikov Andrei, Non-Archimedean analysis: quantum paradoxes, dynamical systems and biological models. Mathematics and its Applications, 427. Kluwer Academic Publishers, Dordrecht, 1997.
6. Khrennikov Andrei, p-Adic valued distributions in mathematical physics. Mathematics and its Applications, 309. Kluwer Academic Publishers Group, Dordrecht, 1994.
7. Zúñiga-Galindo W. A., Pseudodifferential equations over non-Archimedean spaces. Lecture Notes in Mathematics, 2174. Springer, Cham, 2016.
8. Khrennikov Andrei Yu., Kozyrev Sergei V., Zúñiga-Galindo W. A., Ultrametric pseudodifferential equations and applications. Encyclopedia of Mathematics and its Applications, 168. Cambridge University Press, Cambridge, 2018.
9. Frauenfelder H., Chan S. S., Chan W. S. (eds), The Physics of Proteins. Springer-Verlag, 2010.
10. Rammal R., Toulouse G., Virasoro M. A., Ultrametricity for physicists, Rev. Modern Phys. 58 (1986), no. 3, 765–788.
11. Albeverio Sergio, Karwowski Witold, Jump processes on leaves of multibranching trees, J. Math. Phys. 49 (2008), no. 9, 093503, 20 pp.
12. Avetisov V. A., Bikulov A. Kh., Osipov V. A., p-adic description of characteristic relaxation in complex systems, J. Phys. A 36 (2003), no. 15, 4239–4246.
13. Avetisov V. A., Bikulov A. H., Kozyrev S. V., Osipov V. A., p-adic models of ultrametric diffusion constrained by hierarchical energy landscapes, J. Phys. A 35 (2002), no. 2, 177–189.
14. Hoffmann K. H., Sibani P., Diffusion in Hierarchies, Phys. Rev. A 38, 4261–4270 (1988).
15. Karwowski W., Diffusion processes with ultrametric jumps, Rep. Math. Phys. 60 (2007), no. 2, 221–235.
16. Kochubei Anatoly N., Pseudo-differential equations and stochastics over non-Archimedean fields. Marcel Dekker, Inc., New York, 2001.
17. Kozyrev S. V., Methods and Applications of Ultrametric and p-Adic Analysis: From Wavelet Theory to Biophysics, Sovrem. Probl. Mat., 12, Steklov Math. Inst., RAS, Moscow, 2008, 3–168.
18. Volovich I. V., p-adic string, Classical Quantum Gravity 4(4), L83–L87 (1987).
19. Varadarajan V. S., Reflections on quanta, symmetries, and supersymmetries. Springer, New York, 2011.
20. Freund Peter G. O., Witten Edward, Adelic string amplitudes, Phys. Lett. B 199(2), 191–194 (1987).
21. Aref’eva I. Ya., Dragović B. G., Volovich I. V., On the adelic string amplitudes, Phys. Lett. B 209(4), 445–450 (1988).
22. Gubser Steven S., Knaute Johannes, Parikh Sarthak, Samberg Andreas, Witaszczyk Przemek, p-adic AdS/CFT, Comm. Math. Phys. 352 (2017), no. 3, 1019–1059.
23. Heydeman Matthew, Marcolli Matilde, Saberi Ingmar A., Stoica Bogdan, Tensor networks, p-adic fields, and algebraic curves: arithmetic and the AdS3/CFT2 correspondence, Adv. Theor. Math. Phys. 22 (2018), no. 1, 93–176.
24. Bocardo-Gaspar M., Veys Willem, Zúñiga-Galindo W. A., Meromorphic continuation of Koba-Nielsen string amplitudes, J. High Energy Phys. 2020, no. 9, 138, 43 pp.
25. García-Compeán H., López Edgar Y., Zúñiga-Galindo W. A., p-Adic open string amplitudes with Chan-Paton factors coupled to a constant B-field, Nuclear Phys. B 951 (2020), 114904, 33 pp.
26. Bocardo-Gaspar M., García-Compeán H., Zúñiga-Galindo W. A., On p-adic string amplitudes in the limit p approaches to one, J. High Energy Phys. 2018, no. 8, 043, front matter+22 pp.
27. Abdelmalek Abdesselam, Ajay Chandra and Gianluca Guadagni, Rigorous quantum field theory functional integrals over the p-adics I: anomalous dimensions. arXiv:1302.5971.
28. Gubser Steven S., A p-adic version of AdS/CFT, Adv. Theor. Math. Phys. 21(7) (2017), 1655–1678.
29. Kochubei A. N. and Sait-Ametov M. R., Interaction measures on the space of distributions over the field of p-adic numbers, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 6(3) (2003), 389–411.
30. Lerner E. Y. and Misarov M. D., Scalar models in p-adic quantum field theory and hierarchical models, Theor. Math. Phys. 78 (1989), 248–257.
31. Missarov M. D., p-adic ϕ⁴-theory as a functional equation problem, Lett. Math. Phys. 39(3) (1997), 253–260.
32. Missarov M. D., p-adic renormalization group solutions and the Euclidean renormalization group conjectures, p-Adic Numbers Ultrametric Anal. Appl. 4(2) (2012), 109–114.
33. Mendoza-Martínez M. L., Vallejo J. A., Zúñiga-Galindo W. A., Acausal quantum theory for non-Archimedean scalar fields, Rev. Math. Phys. 31 (2019), no. 4, 1950011, 46 pp.
34. Arroyo-Ortiz Edilberto, Zúñiga-Galindo W. A., Construction of p-adic covariant quantum fields in the framework of white noise analysis, Rep. Math. Phys. 84 (2019), no. 1, 1–34.
35. Smirnov V. A., Renormalization in p-adic quantum field theory, Modern Phys. Lett. A 6(15) (1991), 1421–1427.
36. Smirnov V. A., Calculation of general p-adic Feynman amplitude, Comm. Math. Phys. 149(3) (1992), 623–636.
The p-adic Theory of Automata Functions
Vladimir Anashin
Abstract This is mostly a survey paper on the p-adic (and, more widely, ultrametric) theory of automata functions, though sketches of proofs are given in a few cases. In the paper, by an automaton we mostly mean a transducer, i.e., a sequential machine which maps symbols of a finite input alphabet to symbols of a finite output alphabet so that any output symbol depends on the corresponding input symbol and on the current state of the machine, whereas any input symbol changes the current state of the machine. Therefore an automaton whose input and output alphabets consist of p symbols, p a prime, produces a mapping from infinite input words to infinite output words over a p-letter alphabet. As the words can naturally be associated with p-adic integers, the mapping produced by the automaton is a function whose domain and range are p-adic integers. It can be shown that the function is necessarily 1-Lipschitz w.r.t. the p-adic metric; moreover, any 1-Lipschitz mapping from p-adic integers to p-adic integers is a mapping associated with some automaton. Therefore one can study the behaviour of automata by studying the dynamics of the corresponding 1-Lipschitz mappings, the automata functions. The paper focuses on dynamical (especially ergodic) and other properties of automata functions as functions from p-adic integers to p-adic integers. The settings of the problems considered in the paper are mostly motivated by (or related to) applications in computer science, cryptography, pseudorandom numbers, digital economy (smart contracts modelling), physics, quantitative biology, etc. Basically the paper is a demonstration of what can be called an ultrametric approach to the phenomenon of causality; that is why we also briefly touch on the problem of automata over continuous time, although automata over discrete time constitute the main body of the paper.

2020 Mathematics Subject Classification Primary 11E95; Secondary 11B85, 68Q70

Keywords Automaton · p-Adic numbers · Automata functions · Ergodic theory · Automata sequences · Ultrametricity · Causality

Supported in part by Russian Foundation for Basic Research grant No 18-29-03124.
V. Anashin, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Moscow, Russia; Federal Research Center ‘Information and Control’, Russian Academy of Sciences, Moscow, Russia. e-mail: [email protected]; [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. W. A. Zúñiga-Galindo, B. Toni (eds.), Advances in Non-Archimedean Analysis and Applications, STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health, https://doi.org/10.1007/978-3-030-81976-7_2
1 Introduction

The p-adic numbers, which appeared more than a century ago in Kurt Hensel’s works as a purely mathematical construction, see e.g. [61], were recognized at the end of the 20th century as a basis for adequate descriptions of physical, biological, cognitive and information processing phenomena. The pioneering papers in these studies were the works of Vladimirov and Volovich [128, 129, 132], followed by the monograph [131]. Although the papers (and the monograph) focus on applications of the p-adic theory to mathematical physics, the impact of these works was much wider than physical models only: inspired by these works, many scientists started applying p-adic methods to their own areas of research. Now the p-adic theory, and more widely ultrametric analysis and ultrametric dynamics, is a rapidly developing area that finds applications in various sciences (physics, biology, genetics, cognitive sciences, information sciences, computer science, cryptology, numerical methods, etc.). For the state of the art the interested reader is referred to, e.g., [10, 12, 14, 75] and references therein. There are numerous more recent publications on the theme, which we will mention in the respective sections of the current paper.

The main goal of the paper is to give an exposition of numerous results which concern causality; namely, we aim to demonstrate that an adequate mathematical model of causality is the automaton, a transducer which maps ‘causes’ to ‘effects’, where ‘causes’ and ‘effects’ can be considered as functions defined on a totally ordered set, ‘time’. Therefore an automaton maps functions to functions, and the ‘causality law’ can be described in terms of this mapping, the ‘automaton function’. Each ‘cause’ results in a permanent transition of ‘states’ of the automaton, and the resulting ‘effect’ at each moment of time depends only on ‘events which already have happened’, i.e., the ones which constitute the ‘cause’ during an interval of ‘time elapsed’.
For that very reason, every automaton function, i.e., a manifestation of ‘causality’, is intrinsically of ultrametric nature, and therefore causality can be investigated by means of ultrametric analysis and ultrametric dynamics. It is also clear that causality strongly depends on properties of ‘time’, the totally ordered set on which ‘causes’ and ‘effects’ are defined. Basically that total order can be of two types only, a well order or a dense order. If the order is a well order, time is called ‘discrete’, and if the order is dense, time is called ‘continuous’. This results in significant differences between the two models of causality, the one related
to discrete time and the other related to continuous time. Thus, the first model of causality results in the theory of automata over discrete time whereas the second one deals with automata over continuous time. In applications, time is a set of at most the cardinality of the continuum, thus we also impose that restriction on the ordered set which represents time. Moreover, in papers on automata over continuous time (e.g., on so-called timed automata and on automata over non-Zeno signals, see [6, 7, 112]), where time is assumed to be the set $\mathbb{R}_{\ge 0}$ of all non-negative real numbers, it is usually stressed that all results of the papers remain true whenever time is assumed to be any dense subset of $\mathbb{R}_{\ge 0}$, for instance, the set $\mathbb{Q}_{\ge 0}$ of all non-negative rational numbers. Thus time can actually be considered as a countable set. It is well known, however, that every countable totally ordered set is order-isomorphic to a subset of the rational numbers $\mathbb{Q}$ with the standard ordering $\le$, see, e.g., [53, Lemma 174]. On the other hand, in the literature one more restriction is imposed on the totally ordered set representing time: in order to work with time shifts, time must be an ordered semigroup, see, e.g., the aforementioned papers. But it is also well known that a subgroup of the ordered topological additive group of the real numbers $\mathbb{R}$ is either dense or closed with respect to the standard topology, which agrees with the standard order $\le$ on $\mathbb{R}$; see, e.g., [1] or [120] for a shorter direct proof. These considerations show why it is reasonable to restrict the totally ordered set representing time either to a subset of $\mathbb{Q}_{\ge 0}$ which is dense in $\mathbb{R}_{\ge 0}$, for continuous time, or to $\mathbb{N}_0$, the set of all non-negative integers, for discrete time.
Both types of automata, over discrete time and over continuous time, have numerous applications: for instance, automata over continuous time serve as mathematical models for so-called hybrid systems in engineering sciences and in computer science, as well as for smart contracts in cryptocurrencies of various types (Bitcoin, Ethereum, etc.) and for the digital economy as a whole. The literature on hybrid systems as a whole, and on timed automata in particular, is vast and deserves a special exposition which we are not going to present in the current paper; we only refer the interested reader to the book [96], the journal ‘Nonlinear Analysis: Hybrid Systems’ published by Elsevier, and the proceedings of the numerous conferences ‘Hybrid Systems: Computation and Control’ (the most recent being the 23rd conference, HSCC 2020). We will however discuss timed automata (as well as automata over non-Zeno signals) in connection with modelling and verification of smart contracts, see Sect. 7.2. The biggest part of the current paper deals with automata over discrete time. The automata over discrete time which we consider in the current paper are letter-to-letter transducers which map words over a finite input alphabet to words over a finite output alphabet so that at each time moment the current input letter results in a transition from the current state of the automaton to another state which is completely determined by the letter and the current state, while the respective output letter is completely determined by the current input letter and the current state. Therefore causality in the case of discrete time is represented by letter-to-letter transducers which map one-sided infinite words over the input alphabet to one-sided infinite words over the output alphabet. The set of all one-sided infinite words over a finite alphabet can be endowed with a natural non-Archimedean metric, and it
turns out that the class of all automata functions coincides with the class of all non-expansive mappings with respect to the said ultrametric. In other words, if the input/output alphabet of an automaton consists of p symbols, then the automaton function can be identified with a mapping of p-adic integers to p-adic integers (under a natural one-to-one correspondence between infinite words over a p-letter alphabet and p-adic integers) which satisfies a Lipschitz condition with constant 1 (referred to as a 1-Lipschitz mapping in what follows) with respect to the p-adic metric; moreover, every 1-Lipschitz mapping from the space $\mathbb{Z}_p$ of p-adic integers to $\mathbb{Z}_p$ is an automaton function for a suitable transducer whose input/output alphabets consist of p symbols. The correspondence between automata functions and p-adic 1-Lipschitz functions immediately supplies researchers with powerful mathematical tools from p-adic analysis and p-adic dynamics. We stress here that this approach, in contrast to the classical one, makes it possible to study properties of automata functions without ‘looking inside’ the automata under research, i.e., without investigating the state transition table of the automata; that investigation is the basis of the common approach in classical automata theory.
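The 1-Lipschitz condition can be probed numerically on ordinary integers. The sketch below relies on the standard reformulation that a map is 1-Lipschitz w.r.t. the p-adic metric exactly when $a \equiv b \pmod{p^k}$ implies $f(a) \equiv f(b) \pmod{p^k}$ for all $k$. The function name `passes_lipschitz_test` and both sample maps are illustrative choices, not taken from the paper, and a finite test of this kind can only refute the property or fail to refute it; it proves nothing about all of $\mathbb{Z}_p$.

```python
def passes_lipschitz_test(f, p, depth):
    """Test f on all residues modulo p**depth.

    The test passes when f(a) = f(a mod p**k) (mod p**k) for every
    0 <= a < p**depth and every 1 <= k <= depth, i.e. when the k
    low-order base-p digits of f(a) depend only on the k low-order
    digits of a, within the tested range.
    """
    for k in range(1, depth + 1):
        modulus = p ** k
        for a in range(p ** depth):
            if (f(a) - f(a % modulus)) % modulus != 0:
                return False
    return True


# A map built from addition, multiplication and bitwise OR (p = 2):
# its low-order bits never depend on higher-order input bits.
def t_function(x):
    return x + ((x * x) | 5)


# Dropping the lowest binary digit lets output digit i depend on
# input digit i + 1, so this map is not 1-Lipschitz.
def shift(x):
    return x // 2


if __name__ == "__main__":
    print(passes_lipschitz_test(t_function, 2, 6))
    print(passes_lipschitz_test(shift, 2, 6))
```

The first map passes and the second fails already at $k = 1$, since `shift(3)` and `shift(1)` have different parities although $3 \equiv 1 \pmod 2$.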
Of course, the approach based on the study of automata functions by means of p-adic analysis and p-adic dynamics is not universal: it works well when one is interested in functional properties of the word transformations performed by automata rather than in how automata perform these transformations. However, the p-adic approach has already proved its effectiveness not only for automata theory (e.g., by developing tools to determine whether an automaton function is bijective) but also in numerous applications to computer science, cryptography, pseudorandom number generation, Monte Carlo methods, etc. We mention these applications in the current paper. Note that we mostly assume that p is a prime in order to avoid making the statements of results too complicated. Note that the case p = 2 is the most important for applications to computer science; automata functions of that type are also known under the name of T-functions. The T-functions play a special role in cryptography.

The paper is organized as follows:
• Section 2 serves as a brief reminder of definitions and notions from automata theory; in that section we also recall some facts about p-adic numbers and explain how automata functions are related to p-adic 1-Lipschitz functions.
• In Sect. 3 we give an exposition of various functional representations of automata functions; e.g., in the Mahler basis, in the van der Put basis, in coordinate form, etc.
• In Sect. 4 we deal with special important classes of automata functions such as functions which correspond to finite automata (also known under the name of bounded determinate functions) and locally analytic functions.
• In Sect. 5 we investigate dynamical (namely, ergodic) properties of automata functions.
• In Sect. 6 we discuss real functions which are defined by automata functions.
• In Sect. 7 we give a short overview of other techniques for studying automata functions, namely, the one based on infinite power series over p-element fields rather than on p-adic integers; we also discuss some general approaches to automata over continuous time.
• We conclude in Sect. 8.

As the paper is of expository nature, no proofs are given except for a few places where the proofs are necessary, short and new.
2 Preliminaries

For the reader’s convenience, in the current section we recall necessary definitions and state some results which will be needed further in the paper.
2.1 A Few Words About Words

An alphabet is just a finite non-empty set $A$; further in the paper usually $A = \{0, 1, \ldots, p-1\} = \mathbb{F}_p$. Elements of $A$ are called symbols, or letters. By definition, a word of length $n$ over the alphabet $A$ is a finite sequence (stretching from right to left) $\alpha_{n-1} \cdots \alpha_1 \alpha_0$, where $\alpha_{n-1}, \ldots, \alpha_1, \alpha_0 \in A$. The number $n$ is called the length of the word $w = \alpha_{n-1} \cdots \alpha_1 \alpha_0$ and is denoted via $\Lambda(w)$. The empty word $\phi$ is a sequence of length 0, that is, the one that contains no symbols. Given a word $w = \alpha_{n-1} \cdots \alpha_1 \alpha_0$, any word $v = \alpha_{k-1} \cdots \alpha_1 \alpha_0$, $k \le n$, is called a prefix of the word $w$, whereas any word $u = \alpha_{n-1} \cdots \alpha_{i+1} \alpha_i$, $0 \le i \le n-1$, is called a suffix of the word $w$. Every word $\alpha_j \cdots \alpha_{i+1} \alpha_i$ where $n-1 \ge j \ge i \ge 0$ is called a subword of the word $w = \alpha_{n-1} \cdots \alpha_1 \alpha_0$. Given words $a = \alpha_{n-1} \cdots \alpha_1 \alpha_0$ and $b = \beta_{k-1} \cdots \beta_1 \beta_0$, the concatenation $ab$ is the following word (of length $n + k$): $ab = \alpha_{n-1} \cdots \alpha_1 \alpha_0 \beta_{k-1} \cdots \beta_1 \beta_0$. Given a word $w$, its $k$-times concatenation is denoted via $(w)^k$:

$$(w)^k = \underbrace{ww \ldots w}_{k \text{ times}}.$$
We denote via $W$ the set of all non-empty words over $A = \{0, 1, \ldots, p-1\}$ and via $W_\phi$ the set of all words including the empty word $\phi$. In the sequel the set of all $n$-letter words over the alphabet $\mathbb{F}_p$ is denoted by $W_n$; so $W = \bigcup_{n=1}^{\infty} W_n$. To every word $w = \alpha_{n-1} \cdots \alpha_1 \alpha_0$ we put into correspondence a non-negative integer $\operatorname{num}(w) = \alpha_0 + \alpha_1 \cdot p + \cdots + \alpha_{n-1} \cdot p^{n-1}$. Thus $\operatorname{num}$ maps the set $W$ of all non-empty finite words over the alphabet $A$ onto the set $\mathbb{N}_0 = \{0, 1, 2, \ldots\}$ of all non-negative integers. We will also consider a map $\rho$ of the set $W$ into the real unit half-open
interval $[0, 1)$; the map $\rho$ is defined as follows: given $w = \beta_{r-1} \ldots \beta_0 \in W$, put

$$\rho(w) = \operatorname{num}(w) \cdot p^{-\Lambda(w)} = \frac{\beta_0 + \beta_1 p + \cdots + \beta_{r-1} p^{r-1}}{p^r} = 0.\beta_{r-1} \ldots \beta_0 \in [0, 1), \tag{2.1}$$

where $\Lambda(w) = r$ is the length of $w$. We also use the notation $0.w$ for $0.\beta_{r-1} \ldots \beta_0$. Along with finite words we also consider (left-)infinite words over the alphabet $A$; these are the infinite sequences of the form $\ldots \alpha_2 \alpha_1 \alpha_0$ where $\alpha_i \in A$, $i \in \mathbb{N}_0$. For infinite words the notions of a prefix and of a subword are defined in the same way as for finite words, whilst a suffix is not defined. Let an infinite word $w$ be eventually periodic, that is, let $w = \ldots \beta_{t-1} \beta_{t-2} \ldots \beta_0 \beta_{t-1} \beta_{t-2} \ldots \beta_0 \alpha_{r-1} \alpha_{r-2} \ldots \alpha_0$ for $\alpha_i, \beta_j \in A$; then the subword $\beta_{t-1} \beta_{t-2} \ldots \beta_0$ is called a period of the word $w$ and the prefix $\alpha_{r-1} \alpha_{r-2} \ldots \alpha_0$ is called the pre-period of the word $w$. Note that a pre-period may be an empty word while a period cannot. We write the eventually periodic word $w$ as $w = (\beta_{t-1} \beta_{t-2} \ldots \beta_0)^\infty \alpha_{r-1} \alpha_{r-2} \ldots \alpha_0$.
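The maps num and ρ on finite words can be sketched in a few lines. In this illustration a word is a Python list written in the same order as in the text, so the last list element is the letter $\alpha_0$; this encoding and the function names are assumptions made for the example only.

```python
from fractions import Fraction


def num(word, p):
    """Integer alpha_0 + alpha_1*p + ... + alpha_{n-1}*p**(n-1).

    `word` lists the letters as written in the text, i.e.
    [alpha_{n-1}, ..., alpha_1, alpha_0].
    """
    if not word:
        raise ValueError("num is defined on non-empty words only")
    if any(not 0 <= letter < p for letter in word):
        raise ValueError("letters must lie in {0, ..., p-1}")
    value = 0
    for letter in word:  # Horner's rule, highest-order letter first
        value = value * p + letter
    return value


def rho(word, p):
    """Exact value of num(w) * p**(-length of w), a point of [0, 1)."""
    return Fraction(num(word, p), p ** len(word))


if __name__ == "__main__":
    # Leading zeros do not change num but they do change rho.
    print(num([1], 2), num([0, 0, 1], 2))
    print(rho([1], 2), rho([0, 0, 1], 2))
```

The example at the bottom illustrates that num is onto $\mathbb{N}_0$ but not one-to-one on words, whereas ρ distinguishes a word from its zero-padded versions because the length enters the exponent.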
2.2 p-adic Numbers

See [52, 71, 83] for an introduction to p-adic analysis or the comprehensive monographs [97, 116] for further reading. Fix a prime number $p$ and denote respectively via $\mathbb{N}_0 = \{0, 1, 2, \ldots\}$ and $\mathbb{Z} = \{0, \pm 1, \pm 2, \ldots\}$ the set of all non-negative rational integers and the ring of all rational integers. Given $n \in \mathbb{N} = \mathbb{N}_0 \setminus \{0\}$, the p-adic absolute value of $n$ is $|n|_p = p^{-\operatorname{ord}_p n}$, where $p^{\operatorname{ord}_p n}$ is the largest power of $p$ which is a factor of $n$; so $n = n' \cdot p^{\operatorname{ord}_p n}$ where $n' \in \mathbb{N}$ is co-prime to $p$. By putting $|0|_p = 0$, $|-n|_p = |n|_p$ and $|n/m|_p = |n|_p / |m|_p$ for $n, m \in \mathbb{Z}$, $m \ne 0$, we extend the p-adic absolute value to the whole field $\mathbb{Q}$ of rational numbers. Given an absolute value $|\cdot|_p$, we define a metric in the standard way: $|a - b|_p$ is a p-adic metric on $\mathbb{Q}$. The field $\mathbb{Q}_p$ of p-adic numbers is the completion of the field $\mathbb{Q}$ of rational numbers w.r.t. the p-adic metric, while the ring $\mathbb{Z}_p$ of p-adic integers is the ring of integers of $\mathbb{Q}_p$; the ring $\mathbb{Z}_p$ is the completion of $\mathbb{Z}$ w.r.t. the p-adic metric. The ring $\mathbb{Z}_p$ is compact w.r.t. the p-adic metric: actually $\mathbb{Z}_p$ is the ball of radius 1 centered at 0; namely $\mathbb{Z}_p = \{r \in \mathbb{Q}_p : |r|_p \le 1\}$. Balls in $\mathbb{Q}_p$ are clopen; that is, both closed and open w.r.t. the p-adic metric. A p-adic number $r \in \mathbb{Q}_p \setminus \{0\}$ admits a unique p-adic canonical expansion $r = \sum_{i=k}^{\infty} \alpha_i p^i$ where $\alpha_i \in \{0, 1, \ldots, p-1\}$, $k \in \mathbb{Z}$, $\alpha_k \ne 0$. Note then that any p-adic integer $z \in \mathbb{Z}_p$ admits a unique representation $z = \sum_{i=0}^{\infty} \alpha_i p^i$ for suitable $\alpha_i \in \{0, 1, \ldots, p-1\}$. The latter representation is called a canonical form (or a canonical representation) of the p-adic integer $z \in \mathbb{Z}_p$; the $i$-th coefficient $\alpha_i$ of the
expansion will be referred to as the $i$-th p-adic digit of $z$ and denoted via $\alpha_i = \delta_i(z)$. It is clear that once $z \in \mathbb{N}_0$, the $i$-th p-adic digit $\delta_i(z)$ of $z$ is just the $i$-th digit in the base-$p$ expansion of $z$. Note also that a p-adic integer $z \in \mathbb{Z}_p$ is a unit of $\mathbb{Z}_p$ (i.e., has a multiplicative inverse $z^{-1} \in \mathbb{Z}_p$) if and only if $\delta_0(z) \ne 0$; so any non-zero p-adic number $z \in \mathbb{Q}_p$ has a unique representation of the form $z = z' \cdot |z|_p^{-1}$ where $z' \in \mathbb{Z}_p$ is a unit.

The p-adic integers may be associated with infinite words over the alphabet $\mathbb{F}_p = \{0, 1, \ldots, p-1\}$ as follows: given a p-adic integer $z \in \mathbb{Z}_p$, consider its canonical expansion $z = \sum_{i=0}^{\infty} \alpha_i \cdot p^i$; then denote via $\operatorname{wrd}(z)$ the infinite word $\ldots \alpha_2 \alpha_1 \alpha_0$ (allowing some freedom of speech, we will sometimes refer to $\operatorname{wrd}(z)$ as a base-$p$ expansion of $z \in \mathbb{Z}_p$). Vice versa, given a left-infinite word $w = \ldots \alpha_2 \alpha_1 \alpha_0$, we denote via $\operatorname{num}(w) = \sum_{i=0}^{\infty} \alpha_i \cdot p^i$ the corresponding p-adic integer whose base-$p$ expansion is $w$, thus extending the mapping $\operatorname{num}$ defined in Sect. 2.1 to the case of infinite words as well. It is worth noticing here that addition and multiplication of p-adic integers can be performed by using the same school-textbook algorithms for addition/multiplication of non-negative integers represented via their base-$p$ expansions, with the only difference that the algorithms are applied to the infinite words that correspond to the p-adic canonical forms of the summands/multipliers rather than to the finite words which are base-$p$ expansions of the summands/multipliers.

Given $n \in \mathbb{N}$ and a canonical expansion $z = \sum_{i=0}^{\infty} \alpha_i p^i$ for $z \in \mathbb{Z}_p$, denote $z \bmod p^n = \sum_{i=0}^{n-1} \alpha_i p^i$. The mapping $\bmod\, p^n \colon z \mapsto z \bmod p^n$ is a ring epimorphism of $\mathbb{Z}_p$ onto the residue ring $\mathbb{Z}/p^n\mathbb{Z}$ (under a natural representation of elements of the residue ring by the least non-negative residues $\{0, 1, \ldots, p^n - 1\}$). The series on the right-hand side of the canonical form converges w.r.t. the p-adic metric; that is, the sequence of partial sums $z \bmod p^n$ converges to $z$ w.r.t. the p-adic metric: $\lim_{n \to \infty} (z \bmod p^n) = z$. It is worth noticing here that an arbitrary infinite series $\sum_{i=0}^{\infty} r_i$ where $r_i \in \mathbb{Q}_p$ converges in $\mathbb{Q}_p$ (i.e., w.r.t. the p-adic metric) if and only if $\lim_{i \to \infty} |r_i|_p = 0$, since the p-adic metric is non-Archimedean; that is, it satisfies the strong triangle inequality $|x - y|_p \le \max\{|x - z|_p, |z - y|_p\}$ for all $x, y, z \in \mathbb{Q}_p$. Note that $z \in \mathbb{N}_0$ if and only if all but a finite number of coefficients $\alpha_i$ in the canonical form are 0, while $z \in \{-1, -2, -3, \ldots\}$ if and only if all but a finite number of the $\alpha_i$ are $p - 1$.

Further we will need a special representation for p-adic integer rationals; that is, for those rational numbers $z$ which at the same time are p-adic integers, i.e., for $z \in \mathbb{Z}_p \cap \mathbb{Q}$. Note that $z \in \mathbb{Z}_p \cap \mathbb{Q}$ if and only if $z$ can be represented by an irreducible fraction $z = a/b$, $a \in \mathbb{Z}$, $b \in \mathbb{N}$, where $b$ is co-prime to $p$. The following proposition is well known, cf., e.g., [47, Theorem 10]:

Proposition 2.1 A p-adic integer $z$ is rational (i.e., $z \in \mathbb{Z}_p \cap \mathbb{Q}$) if and only if the sequence of coefficients of its canonical form is eventually periodic:

$$z = \alpha_0 + \alpha_1 p + \cdots + \alpha_{r-1} p^{r-1} + (\beta_0 + \beta_1 p + \cdots + \beta_{t-1} p^{t-1}) p^r + (\beta_0 + \beta_1 p + \cdots + \beta_{t-1} p^{t-1}) p^{r+t} + (\beta_0 + \beta_1 p + \cdots + \beta_{t-1} p^{t-1}) p^{r+2t} + \cdots \tag{2.2}$$
for suitable $\alpha_j, \beta_i \in \{0, 1, \ldots, p-1\}$, $r \in \mathbb{N}_0$, $t \in \mathbb{N}$ (the sum $\alpha_0 + \alpha_1 p + \cdots + \alpha_{r-1} p^{r-1}$ is absent in the above expression once $r = 0$). In other words, once a p-adic integer $z$ is represented in its canonical form, $z = \sum_{i=0}^{\infty} \gamma_i p^i$, the corresponding infinite word $\ldots \gamma_1 \gamma_0$ is eventually periodic: $\ldots \gamma_1 \gamma_0 = (\beta_{t-1} \ldots \beta_0)^\infty \alpha_{r-1} \ldots \alpha_0$.

It is clear that given $z \in \mathbb{Z}_p \cap \mathbb{Q}$, both $r$ and $t$ are not unique: for instance, $(\beta_{t-1} \ldots \beta_0)^\infty \alpha_{r-1} \ldots \alpha_0 = (\beta_0 \beta_{t-1} \ldots \beta_1 \beta_0 \beta_{t-1} \ldots \beta_1)^\infty \alpha_r \alpha_{r-1} \ldots \alpha_0$, where $\alpha_r = \beta_0$. But once both the pre-periodic and the periodic parts (the prefix $\alpha_{r-1} \ldots \alpha_0$ and the word $\beta_{t-1} \ldots \beta_0$) are taken the shortest possible, both the pre-period length $r$ and the period length $t$ are unique for a given p-adic rational integer $z \in \mathbb{Z}_p \cap \mathbb{Q}$; we refer to $\alpha_{r-1} \ldots \alpha_0$ and to $\beta_{t-1} \beta_{t-2} \ldots \beta_1 \beta_0$ as the pre-period of $z$ and the period of $z$ accordingly. Given $z \in \mathbb{Z}_p \cap \mathbb{Q}$, we mostly assume further that in the representation $z = \alpha_0 + \cdots + \alpha_{r-1} p^{r-1} + (\beta_0 + \cdots + \beta_{t-1} p^{t-1}) \cdot \sum_{j=0}^{\infty} p^{r+tj}$ (respectively, in the eventually periodic infinite word $\operatorname{wrd}(z) = (\beta_{t-1} \ldots \beta_0)^\infty \alpha_{r-1} \ldots \alpha_0$ that corresponds to $z$) $r$ is a pre-period length and $t$ is a period length. Note that a pre-period may be an empty word (i.e., of length 0) while a period cannot. Rational p-adic integers can also be represented as fractions of a special kind:

Proposition 2.2 A p-adic integer $z \in \mathbb{Z}_p$ is rational if and only if there exist $t \in \mathbb{N}$, $c \in \mathbb{Z}$, $d \in \{0, 1, \ldots, p^t - 2\}$ such that

$$z = c + \frac{d}{p^t - 1}. \tag{2.3}$$
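Proof Indeed, $z \in \mathbb{Z}_p \cap \mathbb{Q}$ if and only if $z$ is of the form (2.2); therefore

$$z = (\alpha_0 + \alpha_1 p + \cdots + \alpha_{r-1} p^{r-1} - p^r) + p^r \left(1 - \frac{\beta_0 + \beta_1 p + \cdots + \beta_{t-1} p^{t-1}}{p^t - 1}\right) = (\alpha_0 + \alpha_1 p + \cdots + \alpha_{r-1} p^{r-1} - p^r + q) + \frac{\zeta_0 + \zeta_1 p + \cdots + \zeta_{t-1} p^{t-1}}{p^t - 1} \tag{2.4}$$

where $\zeta_0 + \zeta_1 p + \cdots + \zeta_{t-1} p^{t-1}$ is the base-$p$ expansion of the least $s \in \mathbb{N}_0$ such that $p^r (p^t - 1 - (\beta_0 + \beta_1 p + \cdots + \beta_{t-1} p^{t-1})) \equiv s \pmod{p^t - 1}$, and $q \in \mathbb{N}_0$ is the corresponding quotient, i.e., $p^r (p^t - 1 - (\beta_0 + \beta_1 p + \cdots + \beta_{t-1} p^{t-1})) = q (p^t - 1) + s$.

Note 2.3 Recall that $(1 - p^m)^{-1} = \sum_{i=0}^{\infty} p^{mi} \in \mathbb{Z}_p$ for every $m \in \mathbb{N}$.

Note 2.4 Note that once in (2.4) $r$ is a pre-period length and $t$ is a period length of $z \in \mathbb{Z}_p \cap \mathbb{Q}$, the representation (2.3) is unique; that is, the choice of $c$ and $d$ in (2.3) is unique.

In the sequel we often use the base-$p$ expansions of p-adic rational integers reduced modulo 1 along with their p-adic canonical forms. Recall that if $y \in \mathbb{R}$ then by the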
definition $y \bmod 1 = y - \lfloor y \rfloor \in [0, 1) \subset \mathbb{R}$, where $\lfloor y \rfloor$ is the biggest integer from $\mathbb{Z} = \{0, \pm 1, \pm 2, \ldots\}$ which does not exceed $y$. For the reader’s convenience, we now summarize some facts on connections between these representations. It is very well known that the base-$p$ expansion of a rational number is eventually periodic; that is, given $x \in \mathbb{Q} \cap [0, 1]$, the base-$p$ expansion of $x$ is

$$x = 0.\chi_0 \ldots \chi_{k-1} (\xi_0 \ldots \xi_{n-1})^\infty = \chi_0 p^{-1} + \chi_1 p^{-2} + \cdots + \chi_{k-1} p^{-k} + \xi_0 p^{-k-1} + \xi_1 p^{-k-2} + \cdots + \xi_{n-1} p^{-k-n} + \xi_0 p^{-k-1-n} + \xi_1 p^{-k-2-n} + \cdots + \xi_{n-1} p^{-k-2n} + \cdots = \frac{1}{p^k} (\chi_0 p^{k-1} + \chi_1 p^{k-2} + \cdots + \chi_{k-1}) + \frac{1}{p^k} \cdot \frac{\xi_0 p^{n-1} + \xi_1 p^{n-2} + \cdots + \xi_{n-1}}{p^n - 1}, \tag{2.5}$$

where $\chi_i, \xi_j \in \{0, 1, \ldots, p-1\}$. Note that in the base-$p$ expansions of rationals from $[0, 1]$ we use right-infinite words rather than the left-infinite ones that correspond to canonical expansions of p-adic integers. Given $z \in \mathbb{R}$, by $z \bmod 1$ we denote the real number from the unit right-open interval $[0, 1)$ such that $z = c + z \bmod 1$ where $c \in \mathbb{Z}$.

Proposition 2.5 Given $z \in \mathbb{Z}_p \cap \mathbb{Q}$, represent $z$ in the form (2.2); then

$$z \bmod 1 = 0.(\hat\beta_{t-1-\bar r} \hat\beta_{t-2-\bar r} \ldots \hat\beta_0 \hat\beta_{t-1} \hat\beta_{t-2} \ldots \hat\beta_{t-\bar r})^\infty \bmod 1,$$

where $\hat\beta = p - 1 - \beta$ for $\beta \in \{0, 1, \ldots, p-1\}$ and $\bar r$ is the least non-negative residue of $r$ modulo $t$ if $t > 1$, or $\bar r = 0$ otherwise.

Proof Indeed, by Note 2.3, $\sum_{j=0}^{\infty} p^{r+tj} = -p^r (p^t - 1)^{-1}$ in $\mathbb{Z}_p$; so $z = u - v p^r (p^t - 1)^{-1}$ where $u = \alpha_0 + \alpha_1 p + \cdots + \alpha_{r-1} p^{r-1}$ and $v = \beta_0 + \beta_1 p + \cdots + \beta_{t-1} p^{t-1}$. Therefore

$$z \bmod 1 = \left(-\frac{v p^r}{p^t - 1}\right) \bmod 1.$$

But $(p^t - 1)^{-1} = p^{-t} + p^{-2t} + p^{-3t} + \cdots$ in $\mathbb{R}$; so

$$(p^t - 1)^{-1} = 0.(\underbrace{00 \ldots 0}_{t-1} 1)^\infty$$

and thus $-v \cdot (p^t - 1)^{-1} = -0.(\beta_{t-1} \beta_{t-2} \ldots \beta_0)^\infty$. Now just note that

$$(p - 1 - \gamma_0) + (p - 1 - \gamma_1) p + \cdots + (p - 1 - \gamma_{s-1}) p^{s-1} = p^s - 1 - (\gamma_0 + \gamma_1 p + \cdots + \gamma_{s-1} p^{s-1})$$
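for $\gamma_0, \gamma_1, \ldots \in \{0, 1, \ldots, p-1\}$, $s \in \mathbb{N}$; so

$$\frac{(p - 1 - \gamma_0) + (p - 1 - \gamma_1) p + \cdots + (p - 1 - \gamma_{s-1}) p^{s-1}}{p^s - 1} = 1 - \frac{\gamma_0 + \gamma_1 p + \cdots + \gamma_{s-1} p^{s-1}}{p^s - 1}$$

and therefore

$$(-0.(\gamma_{s-1} \gamma_{s-2} \ldots \gamma_0)^\infty) \bmod 1 = (0.(\hat\gamma_{s-1} \hat\gamma_{s-2} \ldots \hat\gamma_0)^\infty) \bmod 1, \tag{2.6}$$

where $\hat\gamma = p - 1 - \gamma$ for $\gamma \in \{0, 1, \ldots, p-1\}$.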
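Identity (2.6) is easy to check with exact rational arithmetic. The following sketch evaluates a purely periodic base-$p$ expansion via the geometric series used in the proof and compares both sides of (2.6); the helper names are illustrative, and a digit block is given as a list in the order in which the digits appear after the point.

```python
from fractions import Fraction


def periodic_value(block, p):
    """Exact value of 0.(block)^infinity in base p.

    `block` lists the digits of one period, first digit after the
    point first; the value is N / (p**s - 1), where N is the integer
    whose base-p digits are `block` and s is the block length.
    """
    numerator = 0
    for digit in block:
        numerator = numerator * p + digit
    return Fraction(numerator, p ** len(block) - 1)


def complement(block, p):
    """Replace every digit g by p - 1 - g."""
    return [p - 1 - digit for digit in block]


def check_identity(block, p):
    """Compare both sides of (2.6) for one digit block."""
    left = (-periodic_value(block, p)) % 1
    right = periodic_value(complement(block, p), p) % 1
    return left == right


if __name__ == "__main__":
    for block, p in [([0, 1], 2), ([1, 2, 0], 3), ([4, 0, 3, 3], 5)]:
        print(block, p, check_identity(block, p))
```

The reduction modulo 1 on the right-hand side matters only for the all-zero block, whose complement evaluates to exactly 1.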
Combining (2.5) with Proposition 2.2 we see that all real numbers whose base-$p$ expansions are purely periodic must lie in $\mathbb{Z}_p \cap \mathbb{Q}$; therefore the following criterion is true:

Corollary 2.6 A real number $x$ is in $\mathbb{Z}_p \cap \mathbb{Q}$ if and only if the base-$p$ expansion of $x \bmod 1$ is purely periodic: $x \bmod 1 = 0.(\chi_0 \ldots \chi_{n-1})^\infty$ for suitable $\chi_0, \ldots, \chi_{n-1} \in \mathbb{F}_p$.

The following corollary expresses the base-$p$ expansion of a p-adic rational integer via its representation in the form given by Proposition 2.2:

Corollary 2.7 Once a p-adic rational integer $z \in \mathbb{Z}_p \cap \mathbb{Q}$ is represented in the form from Proposition 2.2, then $z \bmod 1 = 0.(\zeta_{t-1} \zeta_{t-2} \ldots \zeta_0)^\infty$ where $d = \zeta_0 + \zeta_1 p + \cdots + \zeta_{t-1} p^{t-1}$.

Now we can find the period length of $z \in \mathbb{Z}_p \cap \mathbb{Q}$ provided $z$ is represented as an irreducible fraction $z = a/b$, where $a \in \mathbb{Z}$, $b \in \mathbb{N}$.

Proposition 2.8 Once a p-adic rational integer $z \ne 0$ is represented as an irreducible fraction $z = a/b$, and if $b > 1$, then the period length $t$ of $z$ is equal to the multiplicative order of $p$ modulo $b$ (i.e., to the smallest $\ell \in \mathbb{N}$ such that $p^\ell \equiv 1 \pmod{b}$).

Now given $b \in \mathbb{N}$, $b$ co-prime to $p$, we denote via $\operatorname{mult}_b p$ the multiplicative order of $p$ modulo $b$ if $b > 1$, or put $\operatorname{mult}_b p = 1$ once $b = 1$. Then $\operatorname{mult}_b p$ is the period length of $z \in \mathbb{Z}_p \cap \mathbb{Q}$ once $z$ is represented as an irreducible fraction $z = a/b$ where $a \in \mathbb{Z}$ and $b \in \mathbb{N}$. Note that we consider here only infinite words that correspond to p-adic rational integers; thus to, e.g., 0 there corresponds the word $(0)^\infty$ (so the period of 0 is 0 and the pre-period is empty) and the base-$p$ expansion of 0 is $0.(0)^\infty$. Also, $1 = 1 + 0 \cdot p + 0 \cdot p^2 + \cdots$, the corresponding infinite word is $(0)^\infty 1$; therefore 1 is the pre-period of 1, 0 is the period of 1, and the representation of 1 in the form (2.3) is $1 = 1 + 0/(p - 1)$.

Example 2.9 Let $p = 2$; then $1/3 = 1 \cdot 1 + 1 \cdot 2 + 0 \cdot 4 + 1 \cdot 8 + 0 \cdot 16 + \cdots = 1 - 2 \cdot 3^{-1}$ is the canonical 2-adic expansion of $1/3$; so the corresponding infinite binary word is
The p-adic Theory of Automata Functions
(01)^∞ 1. Therefore the period length of 1/3 is 2 (and note that the multiplicative order of 2 modulo 3 is indeed 2), the period is 01, and the pre-period is 1. Also, c = 0 and d = 1 once 1/3 is represented in the form of Proposition 2.2; 1/3 = 0.(01)^∞ is the base-2 expansion of 1/3, cf. Proposition 2.5 and Corollary 2.7.
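As an illustration of Proposition 2.8 and Example 2.9, the following Python sketch (not part of the original text) computes the multiplicative order mult_b p and the first canonical p-adic digits of a rational p-adic integer a/b. The function names `mult_order` and `padic_digits` are our own hypothetical choices, and the digit extraction assumes that b is co-prime to p.

```python
from fractions import Fraction
from math import gcd


def mult_order(p, b):
    """Multiplicative order of p modulo b (b co-prime to p); 1 when b == 1."""
    if b == 1:
        return 1
    if gcd(p, b) != 1:
        raise ValueError("b must be co-prime to p")
    t, power = 1, p % b
    while power != 1:
        power = (power * p) % b
        t += 1
    return t


def padic_digits(z, p, n):
    """First n canonical p-adic digits (lowest order first) of z = a/b in Z_p."""
    a, b = z.numerator, z.denominator
    b_inv = pow(b, -1, p)          # inverse of b modulo p (Python 3.8+)
    digits = []
    for _ in range(n):
        d = (a * b_inv) % p        # digit = z mod p
        digits.append(d)
        a = (a - d * b) // p       # z <- (z - d) / p; the division is exact
    return digits


# Example 2.9: p = 2, z = 1/3.
print(mult_order(2, 3))                      # 2
print(padic_digits(Fraction(1, 3), 2, 7))    # [1, 1, 0, 1, 0, 1, 0]
```

Read from right to left, the digit list 1, 1, 0, 1, 0, 1, 0 is the tail of the word (01)^∞ 1 from Example 2.9: pre-period 1 followed by the period 01 of length 2.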
2.3 Automata: Basic Definitions and Properties

Here we recall some basic facts from automata theory (see, e.g., the monographs [28, 30, 43, 115]). By definition, a (non-initial) automaton is a 5-tuple A = ⟨I, S, O, S, O⟩, where I is a finite set, the input alphabet; O is a finite set, the output alphabet; S is a non-empty (possibly infinite) set of states; S : I × S → S is a state transition function; and O : I × S → O is an output function. An automaton whose input alphabet I and output alphabet O are both non-empty is called a transducer, see, e.g., [4]; an automaton whose input alphabet is empty whereas the output alphabet is non-empty is called a generator. The initial automaton A(s_0) = ⟨I, S, O, S, O, s_0⟩ is an automaton A in which one state s_0 ∈ S is fixed; it is called the initial state. We stress that the definition of the initial automaton A(s_0) is nearly the same as that of a Mealy automaton (see, e.g., [28]), with one important difference: the set of states S of A(s_0) is not necessarily finite.

Given an input word w = χ_{n−1} ⋯ χ_1 χ_0 over the alphabet I, an initial transducer A(s_0) = ⟨I, S, O, S, O, s_0⟩ transforms w into the output word w′ = ξ_{n−1} ⋯ ξ_1 ξ_0 over the output alphabet O as follows (cf. Fig. 1): Initially the transducer A(s_0) is in the state s_0; accepting the input symbol χ_0 ∈ I, the transducer outputs the symbol ξ_0 = O(χ_0, s_0) ∈ O and reaches the state s_1 = S(χ_0, s_0) ∈ S; then the transducer accepts the next input symbol χ_1 ∈ I, reaches the state s_2 = S(χ_1, s_1) ∈ S, outputs
Fig. 1 Initial transducer, schematically
V. Anashin
ξ_1 = O(χ_1, s_1) ∈ O, and the routine repeats. This way the transducer A = A(s_0) defines a mapping a = a_{s_0} of the set W_n(I) of all n-letter words over the input alphabet I to the set W_n(O) of all n-letter words over the output alphabet O; thus A defines a map of the set W(I) of all non-empty words over the alphabet I to the set W(O) of all non-empty words over the alphabet O. We denote the latter map by the same symbol a (or by a_{s_0} if we want to stress which initial state is meant), and when it is clear from the context which alphabet A is meant we use the notation W rather than W(A).

Throughout the paper, ‘automaton’ mostly stands for ‘initial automaton’; we make corresponding remarks if not. Further in the paper we mostly consider transducers. Furthermore, throughout the paper we consider only reachable transducers; that is, we assume that all the states of an initial transducer A(s_0) are reachable from s_0: given s ∈ S, there exists an input word w over the alphabet I such that, after the word w has been fed to the automaton A(s_0), the automaton reaches the state s. A reachable transducer is called finite if its set S of states is finite, and is called infinite otherwise.

Moreover, hereinafter in the paper the word ‘automaton’ stands for an initial transducer whose input and output alphabets consist of p symbols. We mostly assume that p is a prime, although many of the further results (for instance, Theorem 2.10 below) are true without this restriction, and so we identify input/output symbols with the p-element field F_p = {0, 1, …, p − 1}. Thus, for every n = 1, 2, 3, … the automaton A(s_0) = ⟨F_p, S, F_p, S, O, s_0⟩ maps n-letter words over F_p to n-letter words over F_p according to the procedure described above, cf. Fig. 1. The figure is a schematic representation (block diagram) of an automaton; this representation is most useful when an automaton has to be represented as a computer program.
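The letter-by-letter procedure described above can be sketched as a short program. The following Python fragment is a minimal illustration under our own assumptions: the names `run_transducer`, `inc_trans`, `inc_out` and `num` are hypothetical, and the example automaton (adding 1 to a number written in base 2, with the pending carry as the state) is chosen by us for illustration and is not taken from the figures of the paper.

```python
def run_transducer(trans, out, s0, word):
    """Feed symbols chi_0, chi_1, ... (lowest order first) to an initial transducer.

    trans(chi, s) plays the role of the state transition function,
    out(chi, s) plays the role of the output function.
    """
    s, result = s0, []
    for chi in word:
        result.append(out(chi, s))   # xi_i = O(chi_i, s_i)
        s = trans(chi, s)            # s_{i+1} = S(chi_i, s_i)
    return result


# Hypothetical example over F_2: z -> z + 1; the state is the pending carry.
def inc_trans(chi, carry):
    return (chi + carry) // 2


def inc_out(chi, carry):
    return (chi + carry) % 2


def num(word, p=2):
    """Number whose base-p digits (lowest order first) are the letters of word."""
    return sum(d * p**i for i, d in enumerate(word))


# Input 3 = 0011 in base 2, fed as [1, 1, 0, 0]; initial state: carry = 1.
print(run_transducer(inc_trans, inc_out, 1, [1, 1, 0, 0]))   # [0, 0, 1, 0], i.e. 4
```

Here the list index plays the role of discrete time: the i-th output letter is produced before the (i+1)-th input letter is read, which is exactly the order of operations in the description above.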
In automata theory, however, it is convenient to represent the ‘internal structure’ of an automaton via its state transition diagram, a directed graph (digraph) whose vertices are the states of the automaton and whose edges (arrows) are labelled by symbols a|b, where a (resp., b) is a letter of the input (resp., output) alphabet: an arrow goes from the i-th vertex to the j-th vertex if for some input letter a the automaton goes from the i-th state to the j-th state, and the arrow is labelled by a|b if the corresponding output symbol is b. Figure 2 depicts the state transition diagram of an automaton A(s_0) whose input/output alphabets are F_2 = {0, 1}.

Given two automata A = A(s_0) and B = B(t_0) of that kind, their sequential composition (or, briefly, composition) C = B ∘ A can be defined in a natural way by sending the output of the automaton A to the input of the automaton B, so that the mapping c : W → W performed by the automaton C is just the composite mapping b ∘ a (cf. any of the monographs [28, 30, 43] for an exact definition and for further facts mentioned in this subsection). Note that a composition of finite automata is a finite automaton.

In a similar manner one can consider automata with multiple inputs/outputs; these can also be treated as automata whose input/output alphabets are Cartesian powers of F_p: for instance, an automaton with m inputs and n outputs over the alphabet F_p can be considered as an automaton with a single input over the alphabet F_p^m and a single output over the alphabet F_p^n. Moreover, as the letters of the alphabet F_p^k
Fig. 2 State transition diagram of the automaton A(s_0), p = 2, S = {s_0, s_1, s_2, …}. Here a directed edge which starts at the vertex s_j and is labelled with χ|O(χ, s_j) means that once the automaton in state s_j is fed χ ∈ F_2, the automaton returns O(χ, s_j) ∈ F_2 as an output symbol and changes its state to the new one to which the edge points
Fig. 3 An automaton A having m inputs and n outputs over a p-symbol alphabet
are in a one-to-one correspondence with residues modulo p^k, the automaton with m inputs and n outputs can be considered (if necessary) as an automaton with a single input over the alphabet Z/p^m Z and a single output over the alphabet Z/p^n Z, cf. Fig. 3.

Compositions of automata with multiple inputs/outputs can also be naturally defined: for instance, given automata A_1, A_2, and A_3 with m_1, m_2, m_3 inputs and n_1, n_2, n_3 outputs respectively, in the case when m_3 = n_1 + n_2 one can consider a composition of these automata by connecting every output of the automata A_1 and A_2 to some input of the automaton A_3 so that every input of the automaton A_3 is connected to a unique output which belongs either to A_1 or to A_2 but not to both. This way one obtains various compositions of the automata A_1 and A_2 with the automaton A_3, and each of these compositions is an automaton with m_1 + m_2 inputs and n_3 outputs. Moreover, each of the compositions is a finite automaton if all three automata A_1, A_2, A_3 are finite.

Automata can be considered as (generally) non-autonomous dynamical systems on different configuration spaces (e.g., W_n, W, etc.); the system is autonomous when neither the state transition function S nor the output function O depends on the input; in this case the automaton A is called autonomous as well. In the latter case the mapping a is a constant map. An example state diagram of an autonomous
automaton (whose input/output alphabet is {0, 1}) is represented by Fig. 5. Note that it produces different mappings depending on which state we choose to be initial. One may consider automata with input/output alphabets A = Fp as dynamical systems on the space Zp of p-adic integers, i.e., to relate an automaton A to a special map fA : Zp → Zp . In the next subsection we recall some facts about the map fA .
2.4 Automata Maps: p-adic View

Any initial automaton maps a (left-side) infinite input word over F_p to a (left-side) infinite output word over F_p according to the following rule:

$\cdots\chi_i\cdots\chi_1\chi_0 \mapsto \cdots\psi_i(\chi_0,\chi_1,\ldots,\chi_i)\cdots\psi_1(\chi_0,\chi_1)\,\psi_0(\chi_0)$, (2.7)

where ψ_i(χ_0, χ_1, …, χ_i) is the i-th output symbol, ψ_i : F_p^{i+1} → F_p, i = 0, 1, 2, …, and vice versa, every map of that form is a map which corresponds to an automaton. These maps are known under different names, e.g., automata maps, determinate functions, causal maps, etc. Causality here means that the ‘present’ letter of the output word does not depend on ‘future’ letters of the input word; that is, the i-th output symbol ξ_i does not depend on χ_j if j > i. Maps of that kind can be viewed as non-expansive non-Archimedean maps; below we explain this in detail.

We identify n-letter words over F_p with non-negative integers in a natural way: given an n-letter word w = χ_{n−1} χ_{n−2} ⋯ χ_0 (i.e., χ_i ∈ F_p for i = 0, 1, 2, …, n − 1), we consider w as a base-p expansion of the number num(w) = χ_0 + χ_1 · p + ⋯ + χ_{n−1} · p^{n−1}. In turn, the latter number can be considered as an element of the residue ring Z/p^n Z modulo p^n. We denote via wrd_n the inverse mapping to num. The mapping wrd_n is a bijection of the set {0, 1, …, p^n − 1} ⊂ N_0 onto the set W_n of all n-letter words over F_p.

As the set {0, 1, …, p^n − 1} is the set of all non-negative residues modulo p^n, to every automaton A = A(s) there corresponds a map f_{n,A} from Z/p^n Z to Z/p^n Z, for every n = 1, 2, 3, …. Namely, for r ∈ Z/p^n Z put f_{n,A}(r) = num(a(wrd_n(r))), where a is the word transformation of W_n performed by the automaton A, cf. Sect. 2.3. Speaking less formally, the mapping f_{n,A} can be defined as follows: given r ∈ {0, 1, …, p^n − 1}, consider the base-p expansion of r, read it as an n-letter word over F_p = {0, 1, …, p − 1} (putting additional zeroes on higher order positions if necessary), and then feed the word to the automaton so that letters on lower order positions (‘less significant digits’) are fed prior to the ones on higher order positions (‘more significant digits’).
Then read the corresponding output n-letter word as a base-p expansion of a number from N_0 keeping the same order, i.e., so that the earliest outputted letters correspond to the lowest order digits in the base-p expansion. We stress the following determinative property of the mapping f_{n,A}, which follows directly from the definition, cf. (2.7): given a, b ∈ {0, 1, …, p^n − 1}, whenever a ≡ b (mod p^k) for some k ∈ N, then necessarily f_{n,A}(a) ≡ f_{n,A}(b) (mod p^k). This implication may be re-stated in terms of the p-adic absolute value | · |_p
as follows:

$|f_{n,A}(a) - f_{n,A}(b)|_p \le |a - b|_p$. (2.8)
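Inequality (2.8) can be checked numerically on a small example. The sketch below is ours and rests on stated assumptions: we take p = 2 and a hypothetical three-state automaton that multiplies by 3 with the carry as its state, build f_{n,A} exactly as num ∘ a ∘ wrd_n, and compare p-adic absolute values. The helper names `wrd`, `num`, `abs_p` and `f_n` are not from the paper.

```python
from fractions import Fraction


def wrd(r, n, p):
    """Base-p digits of r as an n-letter word, lowest order letter first."""
    return [(r // p**i) % p for i in range(n)]


def num(word, p):
    """Inverse of wrd: the number with the given base-p digits."""
    return sum(d * p**i for i, d in enumerate(word))


def abs_p(x, p):
    """p-adic absolute value of an integer x."""
    if x == 0:
        return Fraction(0)
    v = 0
    while x % p == 0:
        x //= p
        v += 1
    return Fraction(1, p**v)


def f_n(r, n, p=2, m=3):
    """f_{n,A} for a hypothetical carry automaton performing multiplication by m."""
    carry, out = 0, []
    for chi in wrd(r, n, p):
        total = m * chi + carry
        out.append(total % p)    # output letter
        carry = total // p       # next state
    return num(out, p)


n = 5
ok = all(abs_p(f_n(a, n) - f_n(b, n), 2) <= abs_p(a - b, 2)
         for a in range(2**n) for b in range(2**n))
print(ok)   # True
```

For this particular automaton f_{n,A}(r) = 3r mod 2^n, so the check is a finite sanity test of (2.8) rather than a proof of it.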
Note that when necessary we may also identify n-letter words over F_p with elements of F_p^n, the n-th Cartesian power of F_p; so further we use these one-to-one correspondences between n-letter words and residues modulo p^n (as well as between the words and elements of F_p^n) without extra comments.

In a similar manner, every automaton A = A(s_0) defines a map f_A from Z_p to Z_p: given an infinite word w = … χ_{n−1} χ_{n−2} ⋯ χ_0 (that is, an infinite sequence) over F_p, we consider the p-adic integer whose canonical p-adic expansion is z = z(w) = χ_0 + χ_1 · p + ⋯ + χ_{n−1} · p^{n−1} + ⋯; so, by definition, for every z ∈ Z_p we put

$\delta_i(f_A(z)) = O(\delta_i(z), s_i)$ (i = 0, 1, 2, …), (2.9)

where s_i = S(δ_{i−1}(z), s_{i−1}), i = 1, 2, …, and δ_i(z) is the i-th ‘p-adic digit’ of z; that is, the coefficient of the i-th term in the canonical p-adic representation of z: δ_i(z) = χ_i ∈ F_p, i = 0, 1, 2, …. The map f_A so defined is called the automaton function (or the automaton map) of the automaton A. Note that from (2.9) it follows that

$\delta_i(f_A(z)) = \psi_i(\delta_0(z), \ldots, \delta_i(z))$, (2.10)
where ψ_i is a map from the (i + 1)-th Cartesian power F_p^{i+1} of F_p into F_p, cf. (2.7). More formally, given z ∈ Z_p, define f_A(z) as follows: consider the sequence (z mod p^n)_{n=1}^∞ and the corresponding sequence (f_{n,A}(z mod p^n))_{n=1}^∞; then, as the sequence (z mod p^n)_{n=1}^∞ converges to z w.r.t. the p-adic metric (cf. Sect. 2.2), the sequence (f_{n,A}(z mod p^n))_{n=1}^∞ in view of (2.8) also converges w.r.t. the p-adic metric (since the latter sequence is fundamental and Z_p is closed in Q_p, which is a complete metric space). Now we just put f_A(z) to be the limit of the sequence (f_{n,A}(z mod p^n))_{n=1}^∞. Thus, the mapping f_A is a well-defined function with domain Z_p and values in Z_p; by (2.8) the function f_A satisfies a Lipschitz condition with constant 1 w.r.t. the p-adic metric.

For instance, for the automaton A whose state diagram is represented by Fig. 4, the automaton function f_A is just multiplication by 5 in the space of all 2-adic integers; i.e., f_A(z) = 5z for all z ∈ Z_2. The automaton B whose state diagram is represented by Fig. 5 is an autonomous automaton; the domain and range of its automaton function are 2-adic integers; f_B(z) = −1/3 = Σ_{j=0}^∞ 2^{2j} = (01)^∞ if we choose s_1 as the initial state, and f_B(z) = −2/3 = Σ_{j=0}^∞ 2^{2j+1} = (10)^∞ if we choose s_2 as the initial state. One more example of a state diagram of an autonomous automaton is given by Fig. 6; its automaton function is the constant 2/7 ∈ Z_2.

As the set N_0 = {0, 1, 2, …} is dense in Z_p, and as any 1-Lipschitz map f : Z_p → Z_p is continuous with respect to the p-adic metric, f is completely
Fig. 4 State transition diagram of an automaton whose automaton function is z → 5z (z ∈ Z2 ); initial state is s0
Fig. 5 State transition diagram of an autonomous automaton
Fig. 6 State transition diagram of an autonomous automaton whose automaton function is a constant 2/7 when state s0 is taken as an initial
determined by its values on N_0; that is, on all finite words (words of finite length) over the alphabet F_p.

There is a natural epimorphism of Z_p onto the residue ring Z/p^n Z modulo p^n, the reduction map modulo p^n; the latter is the map mod p^n : Z_p → Z/p^n Z such that x mod p^n = Σ_{i=0}^{n−1} δ_i(x) p^i. That is, the map mod p^n just deletes all terms starting with the n-th one in the canonical p-adic expansion of x. The reduction map mod p^n is a continuous ring epimorphism of Z_p onto the residue ring Z/p^n Z. Therefore, given a 1-Lipschitz map f : Z_p → Z_p, the reduction of f modulo p^n, that is, the map f mod p^n : Z/p^n Z → Z/p^n Z such that (f mod p^n)(x) = f(x) mod p^n, is well defined; that is, the map f mod p^n so defined does not depend on the choice of representatives in cosets with respect to the ideal p^n Z_p.

Two states s_i and s_j of the automaton A(s_0) are called equivalent whenever f_{A(s_i)} = f_{A(s_j)}; that is, the automata functions coincide independently of which of the states, s_i or s_j, is taken as the initial state. Therefore we can factorise the state transition diagram by this equivalence relation, thus obtaining a factor-automaton (which is called the reduced representation of the initial automaton) whose set of states is the factor-set of the set S by the equivalence relation. It is clear that the automata functions defined by the automaton and by the factor-automaton coincide. Therefore usually all automata are considered to be reduced in that way. Further we will say that an automaton is finite whenever the set of states of its reduced representation is finite, and infinite otherwise.
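For a finite state set, the factorisation by state equivalence described above can be computed mechanically. The following Python sketch is our own illustration and uses the standard partition-refinement technique (states are first grouped by their one-step outputs, then repeatedly split according to the classes of their successors); this is a swapped-in textbook procedure, not a construction from the paper, and the names `canon`, `equivalence_classes` and the three-state example are hypothetical.

```python
def canon(label):
    """Rename arbitrary signatures to consecutive integer class ids."""
    ids = {}
    return {s: ids.setdefault(sig, len(ids)) for s, sig in label.items()}


def equivalence_classes(states, alphabet, trans, out):
    """Map each state of a finite automaton to the id of its equivalence class.

    trans[(a, s)] is the next state and out[(a, s)] the output letter when the
    automaton in state s reads the letter a.
    """
    # Start by separating states with different one-step output behaviour.
    cls = canon({s: tuple(out[(a, s)] for a in alphabet) for s in states})
    while True:
        # Split classes whose members lead to different classes.
        refined = canon({s: (cls[s],) + tuple(cls[trans[(a, s)]] for a in alphabet)
                         for s in states})
        if len(set(refined.values())) == len(set(cls.values())):
            return refined       # partition is stable
        cls = refined


# Hypothetical automaton over F_2: s0 and s1 both copy the input letter,
# s2 inverts it; s0 and s1 are therefore equivalent.
states = ["s0", "s1", "s2"]
alphabet = (0, 1)
trans = {(a, "s0"): "s1" for a in alphabet}
trans.update({(a, "s1"): "s0" for a in alphabet})
trans.update({(a, "s2"): "s2" for a in alphabet})
out = {(a, s): a for a in alphabet for s in ("s0", "s1")}
out.update({(a, "s2"): 1 - a for a in alphabet})

classes = equivalence_classes(states, alphabet, trans, out)
print(classes["s0"] == classes["s1"], classes["s0"] == classes["s2"])   # True False
```

The reduced representation is then obtained by taking one representative per class; the sketch applies only to automata whose state set is finite and given explicitly.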
Though any automaton can be represented by a state transition diagram which is an infinite tree, cf. Fig. 2, the reduced representation of the automaton may not be a tree at all: an example Moore diagram is given by Fig. 4; the diagram represents an automaton which performs multiplication by 5 of integers represented by base-2 expansions.

To the initial automaton A(s_0) we put into correspondence the family F(A) of all sub-automata A(s) = ⟨I, S̃, O, S̃, Õ, s⟩, s ∈ S, where S̃ = S̃(s) ⊂ S is the set of all states that are reachable from the state s, and S̃, Õ are the respective restrictions of the state transition and output functions S, O to I × S̃. A sub-automaton A(s) is called proper if the set S̃ of all its states is a proper subset of S. A sub-automaton A(s) is called minimal if it contains no proper sub-automata; e.g., the automaton from Fig. 4 is minimal. It is obvious that a finite sub-automaton is minimal if and only if each of its states is reachable from any other of its states. The set of all states of a minimal sub-automaton of the automaton A is called an ergodic component of the (set of all states of the) automaton A. It is clear that once the automaton is in a state that belongs to an ergodic component, all its further states will also be in the same ergodic component. Therefore all states of a finite automaton are of two types only: the transient states, which belong to no ergodic component, and the ergodic states, which belong to ergodic components. It is clear that the set of all ergodic states is a disjoint union of ergodic components. Note that we use the term ‘minimal automaton’ in a different meaning compared to the one used in automata theory, see, e.g., [43]: our terminology here is from the theory of Markov chains, see, e.g., [72] (since to the graph of state transitions of every automaton there corresponds a Markov chain).

The automaton from Fig. 7 has two minimal sub-automata; its ergodic components are respectively {s_1, s_2, s_3} and {s_4, s_5, s_6, s_7, s_8}; s_0 is its initial state. The sub-automaton whose set of states is {s_1, s_2, s_3} performs multiplication by 3 of natural numbers represented via their base-2 expansions once the state s_1 is taken for the initial state. The sub-automaton with states {s_4, s_5, s_6, s_7, s_8} is, up to the numbering of states, the same as in Fig. 4; so it performs multiplication by 5 once the state s_4 is taken for the initial state.

Fig. 7 State transition diagram of an automaton which has two minimal sub-automata; initial state of the automaton is s_0

It is well known that the class of all automata functions f that correspond to automata with p-letter input/output alphabets coincides with the class of all non-expansive mappings from Z_p to Z_p, i.e., the mappings that satisfy the p-adic Lipschitz condition with constant 1 (1-Lipschitz maps, for brevity), see, e.g., [21, 55, 115, 133]:

$d_p(f(a), f(b)) \le d_p(a, b)$ for all a, b ∈ Z_p. (2.11)
Here d_p stands for the p-adic distance, which is equal to p^{−n}, where n is the length of the longest common prefix of the two (left-)infinite words $\overleftarrow{a}$ and $\overleftarrow{b}$ corresponding to the p-adic integers a and b, respectively (so d_p(z, z) = 0 for z ∈ Z_p). In other words, d_p(x, y) = |x − y|_p for all x, y ∈ Z_p.

Theorem 2.10 The automaton function f_{A(s_0)} : Z_p → Z_p of the automaton A(s_0) = ⟨F_p, S, F_p, S, O, s_0⟩ is 1-Lipschitz. Conversely, for every 1-Lipschitz function f : Z_p → Z_p there exists an automaton A(s_0) = ⟨F_p, S, F_p, S, O, s_0⟩ such that f = f_{A(s_0)}.

From this theorem one easily derives a topological characterisation of automata functions:

Corollary 2.11 A function f : Z_p → Z_p is an automaton function if and only if it maps every p-adic ball of radius p^{−r} into a p-adic ball of radius p^{−r}, for all r = 1, 2, 3, ….

Recall that a p-adic ball B_{p^{−r}}(z) of radius p^{−r} centered at z ∈ Z_p is a coset with respect to the epimorphism mod p^r; namely, B_{p^{−r}}(z) = {x ∈ Z_p : |x − z|_p ≤ p^{−r}} = z + p^r Z_p. Therefore a map f : Z_p → Z_p is an automaton function if and only if f(z + p^r Z_p) ⊂ f(z) + p^r Z_p for all z ∈ Z_p, r = 1, 2, 3, …. Corollary 2.11 can be re-stated in algebraic terms:

Corollary 2.12 A function f : Z_p → Z_p (respectively, f : Z → Z or f : N_0 → N_0) is (respectively, can be uniquely expanded to) an automaton function if and only if it is compatible with all congruences modulo p^r, for all r = 1, 2, 3, …; that is, a ≡ b (mod p^r) implies f(a) ≡ f(b) (mod p^r).

That is why in the paper we sometimes use the term compatible function along with the term 1-Lipschitz function: the two terms determine the same class of functions from Z_p to Z_p, the automata functions.
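The compatibility condition of Corollary 2.12 is easy to test on a finite sample of integers. The Python sketch below is our own illustration; the name `is_compatible` and the two sample functions are hypothetical choices. A finite test of this kind can refute compatibility but cannot prove it for all of N_0.

```python
def is_compatible(f, p, r_max, sample):
    """Check that a ≡ b (mod p^r) implies f(a) ≡ f(b) (mod p^r) on the sample.

    Returns False as soon as two sample points in the same residue class
    modulo p^r have images in different classes, for some r <= r_max.
    """
    for r in range(1, r_max + 1):
        m = p**r
        image_of_class = {}
        for a in sample:
            value = f(a) % m
            if image_of_class.setdefault(a % m, value) != value:
                return False
    return True


# A polynomial with integer coefficients preserves all congruences.
print(is_compatible(lambda x: x * x + 3 * x + 1, 2, 5, range(64)))   # True
# Dropping the lowest binary digit does not: 0 ≡ 2 (mod 2), but 0 and 1 differ.
print(is_compatible(lambda x: x // 2, 2, 5, range(64)))              # False
```

The second function is continuous in the 2-adic metric on N_0 but not 1-Lipschitz, which is why it fails already for r = 1.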
There is yet another characterization of 1-Lipschitz (thus, automata) functions f : Z_p → Z_p via properties of coordinate functions; the latter are the functions δ_i(f(x)) defined on Z_p and valued in F_p = {0, 1, …, p − 1}. The i-th coordinate function is merely the value of the coefficient of the i-th term in the canonical p-adic expansion of f(x).
Proposition 2.13 (cf. [12, Proposition 3.35]) A function f : Z_p → Z_p is 1-Lipschitz (thus, an automaton function) if and only if for every i = 1, 2, … the i-th coordinate function δ_i(f(x)) does not depend on δ_{i+k}(x), for all k = 1, 2, ….

Further, given a 1-Lipschitz function f : Z_p → Z_p, via A_f we denote an initial transducer ⟨F_p, S, F_p, S, O, s_0⟩ whose automaton function is f; that is, f_{A_f} = f. Note that the automaton A_f is not unique: there are many automata that have the same automaton function. However, this non-uniqueness will not cause misunderstanding since in the paper we are mostly interested in automata functions rather than in the ‘internal structure’ (e.g., state sets, state transition and output functions, etc.) of the automata themselves.

Further we need more detailed information about finite automata functions, that is, about functions f_A : Z_p → Z_p where A = A(s_0) is a finite automaton (i.e., one whose reduced set of states is finite). It is well known (cf. the previous Sect. 2.3) that the class of finite automata functions is closed w.r.t. composition of functions and sums of functions: once f, g : Z_p → Z_p are finite automata functions, each of the mappings x → f(g(x)) and x → f(x) + g(x) (x ∈ Z_p) is a finite automaton function.

Another important property of finite automata functions is that any finite automaton function maps Z_p ∩ Q into itself. In view of (2.2), the latter property is just a re-statement of a well-known property of finite automata, which says that if a finite automaton is fed an eventually periodic sequence then its output sequence is eventually periodic, cf., e.g., [28, Corollary 2.6.9], [43, Chapter XIII, Theorem 2.2]. Since further we often use that property of finite automata, we state it as a lemma for future reference:

Lemma 2.14 If a finite automaton A is being fed a left-infinite periodic word w^∞, where w ∈ W is a finite non-empty word, then the corresponding output left-infinite word is eventually periodic; i.e., it is of the form u^∞ v, where u ∈ W, v ∈ W_φ.

To put it in other words, if a finite automaton is being fed an eventually periodic finite word (w)^k t, where w ∈ W, t ∈ W_φ, and k ∈ N is sufficiently large, then the output word is of the form r(u)^ℓ v, where ℓ ∈ N, u ∈ W, r, v ∈ W_φ, and r is either empty or a prefix of u: u = hr for a suitable h ∈ W_φ. Therefore the output word is of the form (ū)^ℓ v′, where ū is a cyclically shifted word u. From here by Proposition 2.1 we deduce

Corollary 2.15 Any finite automaton A whose input/output alphabets are F_p maps rational p-adic integers to rational p-adic integers; i.e., f_A(Z_p ∩ Q) ⊂ Z_p ∩ Q.

Automata functions of automata with multiple inputs and outputs (cf. Sect. 2.3) over the same alphabet can be considered in a similar manner. We remark that in the case when the alphabet is F_p, these automata can be considered as automata whose input/output alphabets are Cartesian powers F_p^n and F_p^m, for suitable m, n ∈ N. For these automata a theory similar to that of automata with a single input/output can be developed: the corresponding automata functions are then 1-Lipschitz mappings from Z_p^n to Z_p^m w.r.t. the p-adic metrics. Recall that the p-adic absolute value on Z_p^k is defined as follows: given (z_1, …, z_k) ∈ Z_p^k, put |(z_1, …, z_k)|_p = max{|z_i|_p : i =
1, 2, …, k}. The absolute value so defined (and the corresponding metric) are non-Archimedean as well.

It is worth recalling here the well-known fact that addition of two p-adic integers can be performed by a finite automaton which has two inputs and one output: the automaton just finds the sum successively (digit after digit) by the standard addition-with-carry algorithm which is used to find the sum of two non-negative integers represented by their base-p expansions, thus calculating the sum with arbitrarily high accuracy w.r.t. the p-adic metric. On the contrary, no finite automaton can perform multiplication of two arbitrary p-adic integers, since it is well known that no finite automaton can calculate the base-p expansion of the square of an arbitrary non-negative integer given the base-p expansion of the latter, cf., e.g., [28, Theorem 2.2.3]. The following properties of finite automata functions can be proved:

Proposition 2.16 Let A, B be finite automata, and let a, b ∈ Z_p ∩ Q be p-adic rational integers. Then the following is true:
(i) the mapping z → f_A(z) + f_B(z) of Z_p into Z_p is a finite automaton function;
(ii) the composite function f(z) = a · f_A(z) + b (z ∈ Z_p) is a finite automaton function;
(iii) a constant function f(z) = c is a finite automaton function if and only if c ∈ Z_p ∩ Q;
(iv) an affine mapping f(z) = c · z + d is a finite automaton function if and only if c, d ∈ Z_p ∩ Q.

Concluding the subsection, we remark that in the literature (finite) automata functions are also known under the names of (bounded) determinate functions, or (bounded) deterministic functions, cf., e.g., [137].

We note that the p-adic approach (and, more widely, the non-Archimedean one) has already been successfully applied to automata theory. Seemingly, the paper [95] is the first one where p-adic techniques are applied to study automata functions; the paper deals with linearity conditions for automata maps.
For applications of non-Archimedean methods to automata and formal languages see the expository paper [108] and references therein; for applications to automata and group theory (as well as to dynamics, spectral theory, ergodic theory, C*-algebras, graph theory, random walks, and actions on rooted trees) see [54, 55, 57]. In [133–135] 2-adic methods are used to study binary automata (the ones whose input/output alphabet is F_2), in particular, to obtain a finiteness criterion for these automata. In the monograph [12] p-adic ergodic theory is developed (see the numerous references therein) aiming at applications to computer science and cryptography (in particular, to automata theory, to pseudorandom number generation and to stream cipher design) as well as at applications in other areas like quantum theory, cognitive sciences and genetics.

We stress that in many cases, especially in computer science, the approach of deriving properties of automata functions from their functional representations rather than from their representations via state transition diagrams turns out to be more fruitful than the approach based on direct study of the diagrams, since in computer
science the automata functions can be represented by straight line programs (SLP), which are just compositions of computer instructions; therefore the corresponding automaton may have an enormous number of states and, moreover, may even be infinite: it is well known that, e.g., squaring of a natural number cannot be implemented on a finite automaton, see, e.g., [28]; that is, the mapping z → z^2 (z ∈ Z_p) is an automaton function of an infinite automaton, and it is an automaton function of no finite automaton.

It is worth noticing here that along with Mealy automata, Moore automata (or Moore sequential machines) are also often considered; the latter are the ones whose output function O(χ, s) does not depend on the input symbol χ. It is clear from the definition that all Moore automata are Mealy automata; nonetheless, any automaton function of a Mealy automaton can be obtained on a suitable Moore automaton once one assumes that the very first output symbol of the Moore automaton (which does not depend on input words at all) is always the empty symbol (i.e., if there is no output symbol after the first input symbol is fed to the automaton), see, e.g., [115, Theorem 5.1]. Thus one may use either of the two representations, Moore or Mealy, for an automaton function.

Also we note that in the literature the term ‘automaton’ is used in a very broad meaning, much broader than the one we discuss in the paper. Speaking of classical automata, we mean sequential machines, the least powerful algorithms from the computational point of view. There exists a hierarchy of classes of automata over discrete time with respect to their computational capabilities, the Turing machines (and, which is equivalent, cellular automata) being the most powerful; they can perform any algorithm. In between, there are automata with delays, push-down automata, asynchronous automata, etc. In the paper, we do not consider automata over discrete time other than the classical automata, the letter-to-letter transducers.
We note, however, that some results on classical automata which are considered in the paper can be extended to classes of automata which are wider than the letter-to-letter transducers; see, e.g., [32, 124–126] on asynchronous automata and automata with delays. To emphasize the difference between synchronous and asynchronous automata we mention that a mapping Z_p → Z_p is continuous (and not necessarily 1-Lipschitz) with respect to the p-adic metric if and only if it is defined by some non-degenerate asynchronous automaton, [55, Theorem 2.4]; note that the class of all mappings defined in a similar way by Turing machines over a p-letter input alphabet consists of functions Z_p → Z_p which need not be defined everywhere on Z_p, to say nothing of other properties. Also, we omit numerous results on so-called p-adic scaling mappings, p-adic β-shifts, etc., which are continuous but not 1-Lipschitz mappings Z_p → Z_p and therefore are not automata functions defined by letter-to-letter transducers; we only mention here some of the important papers in that area: [50, 76, 77] and [139]; the latter paper has both theoretical and applied value.
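As a concrete companion to the remark earlier in this subsection that addition of two p-adic integers is performed by a finite automaton with two inputs and one output, here is a minimal Python sketch of the addition-with-carry automaton. It is our own illustration: the names `digits`, `num` and `add_automaton` are hypothetical, the pair of current input letters plays the role of a letter of F_p^2, and the carry (0 or 1) is the only state.

```python
def digits(r, n, p):
    """Base-p digits of r as an n-letter word, lowest order letter first."""
    return [(r // p**i) % p for i in range(n)]


def num(word, p):
    """Number whose base-p digits (lowest order first) are the letters of word."""
    return sum(d * p**i for i, d in enumerate(word))


def add_automaton(word_x, word_y, p):
    """Two-input, one-output transducer adding two words digit by digit."""
    carry, out = 0, []               # initial state: no carry
    for chi, eta in zip(word_x, word_y):
        total = chi + eta + carry
        out.append(total % p)        # output letter
        carry = total // p           # next state, always 0 or 1
    return out


p, n = 3, 4
x, y = 50, 70
z = num(add_automaton(digits(x, n, p), digits(y, n, p), p), p)
print(z == (x + y) % p**n)   # True
```

Since the first n output letters depend only on the first n letters of both inputs, the automaton computes the sum modulo p^n for every n, i.e., with arbitrarily high accuracy w.r.t. the p-adic metric. No analogous finite-state sketch exists for multiplication of two arbitrary inputs, in line with the remark above.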
3 Explicit Representations of General Automaton Function

As we already know from Theorem 2.10, the class of all automata functions of automata whose input/output alphabets are F_p is exactly the class of all 1-Lipschitz functions from p-adic integers to p-adic integers. Therefore, to study automata functions it is important to find good representations of these functions by means of p-adic analysis. In p-adic analysis, continuous functions whose domain and range are Z_p can be represented by various convergent series, so in this section we discuss conditions under which such series represent 1-Lipschitz functions.
3.1 Coordinate Representation

Any p-adic integer z ∈ Z_p admits a unique canonical representation, the canonical form z = Σ_{i=0}^∞ ζ_i p^i, where ζ_i ∈ {0, 1, …, p − 1} = F_p, i = 0, 1, 2, …. Therefore, any mapping f : Z_p → Z_p can be represented as

$f(z) = \sum_{i=0}^{\infty} \delta_i(f(z))\, p^i$, (3.12)
cf. (2.9). From Proposition 2.13 we know that the mapping is 1-Lipschitz if and only if δ_i is given by a mapping ψ_i from F_p^{i+1} to F_p such that δ_i(f(z)) = ψ_i(ζ_0, …, ζ_i), where ζ_j = δ_j(z), j = 0, 1, 2, …. It is well known (see, e.g., [88]) that any finite field is polynomially complete; that is, any n-variate mapping from that field to itself can be represented by a polynomial over that field in n variables. Thus we conclude that the function f represented by (3.12) is an automaton function if and only if δ_i(f(z)) is a polynomial in δ_0(z), …, δ_i(z) over the field F_p, for all i = 0, 1, 2, ….
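For p = 2 the polynomial representing a coordinate function can be computed explicitly. The sketch below is our own illustration under stated assumptions: it handles only p = 2, takes an integer-valued 1-Lipschitz function f on N_0, tabulates ψ_i, and converts the table into a polynomial over F_2 by the standard binary Möbius transform (a swapped-in textbook technique, not a method from the paper). The names `coordinate_table` and `anf` are hypothetical.

```python
from itertools import product


def coordinate_table(f, i):
    """Table of psi_i: (chi_0, ..., chi_i) -> i-th binary digit of f(z), p = 2.

    Assumes f is 1-Lipschitz, so the digit depends only on z mod 2^(i+1).
    """
    table = {}
    for bits in product((0, 1), repeat=i + 1):
        z = sum(b << j for j, b in enumerate(bits))
        table[bits] = (f(z) >> i) & 1
    return table


def anf(table, nvars):
    """Monomials (tuples of variable indices) of the F_2-polynomial for the table."""
    coeff = dict(table)
    for j in range(nvars):
        for bits in product((0, 1), repeat=nvars):
            if bits[j] == 1:
                lower = bits[:j] + (0,) + bits[j + 1:]
                coeff[bits] ^= coeff[lower]      # binary Moebius transform step
    return sorted(tuple(k for k, b in enumerate(bits) if b)
                  for bits, c in coeff.items() if c)


# f(z) = z + 1: the digit with index 2 equals chi_2 + chi_0 * chi_1 over F_2.
print(anf(coordinate_table(lambda z: z + 1, 2), 3))   # [(0, 1), (2,)]
```

The output lists the monomials χ_0 χ_1 and χ_2, i.e., δ_2(z + 1) = χ_2 + χ_0 χ_1 over F_2: the digit χ_2 is flipped exactly when a carry propagates through the two lower positions.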
3.2 Representation via Mahler Series

Mahler expansion is a useful technique for studying various properties of automata functions. Every function f : N_0 → Z_p (or, respectively, f : N_0 → Z) has a unique Mahler expansion, that is, a unique representation via the so-called Mahler (interpolation) series

$f(x) = \sum_{i=0}^{\infty} a_i \binom{x}{i}$, (3.13)
where $a_i \in \mathbb{Z}_p$ (respectively, $a_i \in \mathbb{Z}$), $i = 0, 1, 2, \ldots$, and

\[ \binom{x}{i} = \frac{x(x-1)\cdots(x-i+1)}{i!} \]

for $i = 1, 2, \ldots$; $\binom{x}{0} = 1$ by definition. Various properties of a function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ can be expressed via properties of the coefficients of its Mahler expansion. We recall some basic facts about Mahler series, referring to [97] or [116] for their proofs.

If $f$ is uniformly continuous on $\mathbb{N}_0$ with respect to the p-adic metric, it can be uniquely extended to a uniformly continuous function on $\mathbb{Z}_p$. Hence the interpolation series for $f$ converges uniformly on $\mathbb{Z}_p$. The following is true: the series (3.13) converges uniformly on $\mathbb{Z}_p$ if and only if

\[ \overset{p}{\lim_{i\to\infty}}\, a_i = 0 \tag{3.14} \]

(here $\overset{p}{\lim}$ stands for a limit with respect to the p-adic metric). Hence a uniformly convergent series defines a uniformly continuous function on $\mathbb{Z}_p$.

The function $f$ represented by the interpolation series (3.13) is (uniformly) differentiable everywhere on $\mathbb{Z}_p$ if and only if

\[ \overset{p}{\lim_{i\to\infty}}\, \frac{a_{i+n}}{i} = 0 \tag{3.15} \]

for all $n \in \mathbb{N}_0$; in this case the following formula for the derivative holds:

\[ f'(x) = \sum_{i=1}^{\infty} (-1)^{i+1}\, \frac{\Delta^i f(x)}{i}. \tag{3.16} \]

Here $\Delta$ is the difference operator: by definition, $\Delta g(x) = g(x+1) - g(x)$, $\Delta^i g(x) = \Delta(\Delta^{i-1} g(x))$, $\Delta^0 g(x) = g(x)$ for an arbitrary function $g$. The function $f$ is analytic on $\mathbb{Z}_p$ if and only if

\[ \overset{p}{\lim_{i\to\infty}}\, \frac{a_i}{i!} = 0. \tag{3.17} \]
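For a function given on $\mathbb{N}_0$, the Mahler coefficients in (3.13) can be obtained from the standard identity $a_i = \Delta^i f(0) = \sum_{k=0}^{i} (-1)^{i-k}\binom{i}{k} f(k)$. The short Python sketch below (the function names and the sample polynomial are assumptions chosen for illustration) computes the first coefficients and re-evaluates the finite Mahler sum.

```python
from math import comb


def mahler_coefficients(f, n):
    """a_i = (Delta^i f)(0) = sum_k (-1)^(i-k) C(i,k) f(k), for i = 0..n-1."""
    return [sum((-1) ** (i - k) * comb(i, k) * f(k) for k in range(i + 1))
            for i in range(n)]


def mahler_eval(coeffs, x):
    """Evaluate the finite Mahler sum sum_i a_i C(x, i) at a non-negative integer x."""
    return sum(a * comb(x, i) for i, a in enumerate(coeffs))


f = lambda x: x ** 3 + 2 * x
a = mahler_coefficients(f, 6)
print(a)
print(all(mahler_eval(a, x) == f(x) for x in range(20)))
```

For a polynomial of degree $d$ all coefficients beyond $a_d$ vanish, so the finite sum reproduces $f$ exactly.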
The following theorem ([13], see also [12, 15]) gives a complete description of automata functions in terms of Mahler series.
V. Anashin
Theorem 3.1 A function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ represented by Mahler expansion (3.13) is an automaton function if and only if $|a_i|_p \le p^{-\lfloor \log_p i \rfloor}$ for all $i = 1, 2, \ldots$. In other words, a function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is an automaton function if and only if it can be represented as

\[ f(x) = \sum_{i=0}^{\infty} c_i\, p^{\lfloor \log_p i \rfloor} \binom{x}{i}, \tag{3.18} \]

for suitable $c_i \in \mathbb{Z}_p$; $i = 0, 1, 2, \ldots$.

Note 3.2 Recall that for $m \in \mathbb{N}_0$

\[ \lfloor \log_p m \rfloor = (\text{the number of digits in a base-}p\text{ expansion for } m) - 1; \]

that is why further in the paper we assume that $\lfloor \log_p 0 \rfloor = 0$.

Examples The following classes of functions are 1-Lipschitz and thus automata functions (see, e.g., [12]).
• Polynomials over $\mathbb{Z}_p$ are automata functions, since if $\frac{a_i}{i!} \in \mathbb{Z}_p$ in (3.13) then $a_i \equiv 0 \pmod{p^{\lfloor \log_p i \rfloor}}$ for all $i = p, p+1, p+2, \ldots$.
• For arbitrary polynomials $u(x), v(x) \in \mathbb{Z}_p[x]$, the rational functions $f(x) = \frac{u(x)}{1 + p\,v(x)}$ are automata functions.
• Let $u, v\colon \mathbb{Z}_p \to \mathbb{Z}_p$ be automata functions, and let $u(z) \equiv 1 \pmod p$ for all $z \in \mathbb{Z}_p$. Then the function $f(z) = u(z)^{v(z)}$ is well defined for all $z \in \mathbb{Z}_p$, maps $\mathbb{Z}_p$ into $\mathbb{Z}_p$, and is an automaton function.

Note 3.3 A criterion similar to that of Theorem 3.1, for a function $\mathbb{Z}_p \to \mathbb{Z}_p$ represented by a q-Mahler series with $q \in \mathbb{Z}_p$ such that $|q - 1|_p < p^{-\frac{1}{p-1}}$ to be an automaton function, was obtained in [66]. Recall that q-Mahler series are series of the form

\[ f(x) = \sum_{i=0}^{\infty} a_{i,q} \binom{x}{i}_q, \]

where

\[ \binom{x}{i}_q = \frac{(q^x - 1)(q^{x-1} - 1) \cdots (q^{x-i+1} - 1)}{(q^i - 1)(q^{i-1} - 1) \cdots (q - 1)} \ \text{ if } i \ge 1, \quad \text{and} \quad \binom{x}{0}_q = 1. \]
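The criterion of Theorem 3.1 can be tested numerically on the first few Mahler coefficients of an integer-valued function. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, only finitely many coefficients are inspected, and the coefficients are obtained from the standard identity $a_i = \sum_{k=0}^{i} (-1)^{i-k}\binom{i}{k} f(k)$.

```python
from math import comb


def ord_p(n, p):
    """p-adic valuation of a non-zero integer n."""
    n, v = abs(n), 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def floor_log(i, p):
    """(number of base-p digits of i) - 1, with the convention floor_log(0) = 0."""
    k = 0
    while i >= p:
        i //= p
        k += 1
    return k


def satisfies_criterion(f, p, n):
    """Check |a_i|_p <= p^(-floor(log_p i)) for the Mahler coefficients a_1..a_{n-1}."""
    for i in range(1, n):
        a = sum((-1) ** (i - k) * comb(i, k) * f(k) for k in range(i + 1))
        if a != 0 and ord_p(a, p) < floor_log(i, p):
            return False
    return True


print(satisfies_criterion(lambda x: x ** 3 + 2 * x, 2, 32))   # polynomial over Z
print(satisfies_criterion(lambda x: x ^ 5, 2, 32))            # bitwise XOR with a constant
print(satisfies_criterion(lambda x: comb(x, 2), 2, 32))       # a_2 = 1 violates the bound
```

The last function, $\binom{x}{2}$, has $a_2 = 1$ while $\lfloor \log_2 2 \rfloor = 1$, so it fails the criterion for $p = 2$; indeed $\binom{0}{2} = 0$ and $\binom{2}{2} = 1$ differ modulo 2 although $0 \equiv 2 \pmod 2$.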
3.3 Representation via van der Put Series

We now recall the definition and some properties of van der Put series; see, e.g., [97, 116] for details. Given a continuous function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$, there exists a unique sequence $B_0, B_1, B_2, \ldots$ of p-adic integers such that

\[ f(x) = \sum_{m=0}^{\infty} B_m\, \chi(m, x) \tag{3.19} \]

for all $x \in \mathbb{Z}_p$, where

\[ \chi(m, x) = \begin{cases} 1, & \text{if } |x - m|_p \le p^{-n}; \\ 0, & \text{otherwise,} \end{cases} \]

and $n = 1$ if $m = 0$; otherwise $n$ is uniquely defined by the inequality $p^{n-1} \le m \le p^n - 1$. The series on the right-hand side of (3.19) is called the van der Put series of the function $f$. Note that the sequence $B_0, B_1, \ldots, B_m, \ldots$ of van der Put coefficients of the function $f$ tends p-adically to 0 as $m \to \infty$, and the series converges uniformly on $\mathbb{Z}_p$. Conversely, if a sequence $B_0, B_1, \ldots, B_m, \ldots$ of p-adic integers tends p-adically to 0 as $m \to \infty$, then the series on the right-hand side of (3.19) converges uniformly on $\mathbb{Z}_p$ and thus defines a continuous function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$.

The number $n$ in the definition of $\chi(m, x)$ has a very natural meaning: it is just the number of digits in the base-p expansion of $m \in \mathbb{N}_0$; that is, $n = \lfloor \log_p m \rfloor + 1$ for all $m \in \mathbb{N}_0$. The coefficients $B_m$ are related to the values of the function $f$ in the following way: let $m = m_0 + \cdots + m_{n-2}\, p^{n-2} + m_{n-1}\, p^{n-1}$ be the base-p expansion of $m$, i.e., $m_j \in \{0, \ldots, p-1\}$, $j = 0, 1, \ldots, n-1$, and $m_{n-1} \ne 0$; then

\[ B_m = \begin{cases} f(m) - f(m - m_{n-1}\, p^{n-1}), & \text{if } m \ge p; \\ f(m), & \text{otherwise.} \end{cases} \tag{3.20} \]

It is also worth noticing that $\chi(m, x)$ is merely the characteristic function of the ball $B_{p^{-\lfloor \log_p m \rfloor - 1}}(m) = m + p^{\lfloor \log_p m \rfloor + 1}\, \mathbb{Z}_p$ of radius $p^{-\lfloor \log_p m \rfloor - 1}$ centered at $m \in \mathbb{N}_0$:

\[ \chi(m, x) = \begin{cases} 1, & \text{if } x \equiv m \pmod{p^{\lfloor \log_p m \rfloor + 1}}; \\ 0, & \text{otherwise} \end{cases} \;=\; \begin{cases} 1, & \text{if } x \in B_{p^{-\lfloor \log_p m \rfloor - 1}}(m); \\ 0, & \text{otherwise.} \end{cases} \tag{3.21} \]
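Formulas (3.19)-(3.21) translate directly into code when $x$ is a non-negative integer, since then only the indices $m \le x$ can contribute. The following Python sketch is an illustration under that restriction; the function names and the sample polynomial are assumptions, not part of the source.

```python
def num_digits(m, p):
    """Number n of base-p digits of m, with n = 1 for m = 0."""
    n = 1
    while m >= p:
        m //= p
        n += 1
    return n


def van_der_put_coefficient(f, m, p):
    """B_m according to (3.20)."""
    if m < p:
        return f(m)
    n = num_digits(m, p)
    top = (m // p ** (n - 1)) * p ** (n - 1)    # the leading term m_{n-1} p^{n-1}
    return f(m) - f(m - top)


def chi(m, x, p):
    """Characteristic function (3.21) of the ball m + p^n Z_p, n = num_digits(m, p)."""
    return 1 if (x - m) % p ** num_digits(m, p) == 0 else 0


def van_der_put_eval(f, x, p):
    """Right-hand side of (3.19) at a non-negative integer x (terms with m > x vanish)."""
    return sum(van_der_put_coefficient(f, m, p) * chi(m, x, p) for m in range(x + 1))


f = lambda x: x * x + 7
print(all(van_der_put_eval(f, x, 3) == f(x) for x in range(100)))
print(all(van_der_put_coefficient(f, m, 3) % 3 ** (num_digits(m, 3) - 1) == 0
          for m in range(1, 200)))
```

The second check reflects that, for a 1-Lipschitz $f$, formula (3.20) forces $B_m \equiv 0 \pmod{p^{\lfloor \log_p m \rfloor}}$, because $m$ and $m - m_{n-1}p^{n-1}$ agree modulo $p^{n-1}$.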
The following theorem gives a complete characterisation of automata functions in terms of van der Put series, [19].
Theorem 3.4 The function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is an automaton function if and only if it can be represented as

\[ f(x) = \sum_{m=0}^{\infty} b_m\, p^{\lfloor \log_p m \rfloor}\, \chi(m, x), \tag{3.22} \]

where $b_m \in \mathbb{Z}_p$ for $m = 0, 1, 2, \ldots$

Representation of automata functions via van der Put series can be effective in the study of functions of automata whose input/output alphabets are binary, i.e., $\mathbb{F}_2 = \{0, 1\}$. These automata functions are known as T-functions; they have found numerous applications in computer science and cryptography, see, e.g., [10–12, 14, 24, 26, 63, 78–81, 84, 117, 118, 136] and references therein. In fact, numerous computer instructions are T-functions, e.g., arithmetic instructions (addition and multiplication of integers), bitwise logical instructions, and some other machine instructions which are used by most contemporary processors. We now give formal definitions of these basic instructions, bitwise logical and machine:

Definition 3.5 Let $z = \delta_0(z) + \delta_1(z) \cdot 2 + \delta_2(z) \cdot 2^2 + \delta_3(z) \cdot 2^3 + \cdots$ be the 2-adic canonical expansion of $z \in \mathbb{Z}_2$ (that is, $\delta_j(z) \in \{0, 1\}$); then
• $y \operatorname{XOR} z$ is a bitwise addition modulo 2: $\delta_j(y \operatorname{XOR} z) \equiv \delta_j(y) + \delta_j(z) \pmod 2$, for all $j = 0, 1, 2, \ldots$;
• $y \operatorname{AND} z$ is a bitwise multiplication modulo 2: $\delta_j(y \operatorname{AND} z) \equiv \delta_j(y) \cdot \delta_j(z) \pmod 2$, for all $j = 0, 1, 2, \ldots$;
• $\operatorname{NOT}$ is a bitwise logical negation: $\delta_j(\operatorname{NOT}(z)) \equiv \delta_j(z) + 1 \pmod 2$, for all $j = 0, 1, 2, \ldots$;
• $y \operatorname{OR} z$ is a bitwise logical 'or': $\delta_j(y \operatorname{OR} z) \equiv \delta_j(y) \operatorname{OR} \delta_j(z) \pmod 2$, for all $j = 0, 1, 2, \ldots$;
• $\lfloor z/2 \rfloor$, the integral part of $z/2$, is a shift towards less significant bits;
• $2^k \cdot z$, a multiplication by the k-th power of 2, is a k-bit shift towards more significant bits;
• $y \operatorname{AND} z$, where $y$ is a constant, is also called a masking of $z$ with the mask $y$;
• $z \bmod 2^k = z \operatorname{AND} (2^k - 1)$ is a reduction of $z$ modulo $2^k$, that is, a truncation of all high-order bits starting with the k-th one (note that $2^k - 1 = \sum_{i=0}^{k-1} 2^i$).

As a composition of automata functions is an automaton function, a composition of the arithmetic, bitwise logical and machine instructions mentioned above is an automaton function, which may be quite complicated, like the following one:

\[ f(x) = 1 + \delta_0(x) + 6\,\delta_1(x) + \sum_{k=2}^{\infty} \bigl(1 + 2\,(x \operatorname{AND} (2^k - 1))\bigr)\, 2^k\, \delta_k(x). \]

However, by using the representation of the latter via van der Put series it is possible to study the dynamics of automata functions, which is important in the design of T-function-based pseudorandom number generators, see, e.g., [12, 24] and references therein.
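The displayed function can be evaluated on machine integers by truncating the sum, since $\delta_k(x) = 0$ for all large $k$ when $x$ is a non-negative integer. The Python sketch below is a brute-force illustration only: the truncation parameter `bits` and the helper `is_t_function` are assumptions introduced here. It checks the defining property of a T-function, namely that the low $k$ bits of the output depend only on the low $k$ bits of the input.

```python
def f(x, bits=64):
    """f(x) = 1 + d0 + 6*d1 + sum_{k>=2} (1 + 2*(x AND (2^k - 1))) * 2^k * d_k, truncated."""
    d = lambda k: (x >> k) & 1                 # delta_k(x)
    total = 1 + d(0) + 6 * d(1)
    for k in range(2, bits):
        total += (1 + 2 * (x & ((1 << k) - 1))) * (1 << k) * d(k)
    return total


def is_t_function(g, n):
    """Brute force over x < 2^n: g(x) mod 2^k must depend only on x mod 2^k, k = 1..n-1."""
    for k in range(1, n):
        mod = 1 << k
        for x in range(1 << n):
            if (g(x) - g(x % mod)) % mod != 0:
                return False
    return True


print(is_t_function(f, 8))
print(is_t_function(lambda x: x + (x * x | 5), 8))
print(is_t_function(lambda x: x // 3, 8))      # integer division by 3 mixes bit positions
```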
4 Special Classes of Automata Functions

In this section we discuss some significant special classes of automata functions which have been considered in the literature due to their important role in various problems, both theoretical and applied.
4.1 Finite Automata Functions

Finite automata play an outstanding role in general automata theory; moreover, classical automata theory is mostly about finite automata. Therefore it is important to determine which automata functions can be represented by finite automata. To do this, we first recall some notions and facts from the theory of automatic sequences, following [4].

An infinite sequence $a = (a_i)_{i=0}^{\infty}$ over a finite alphabet $A$, $\#A = L < \infty$, is called p-automatic if there exists a finite transducer $T = \langle \mathbb{F}_p, S, A, S, O, s_0 \rangle$ such that for all $n = 0, 1, 2, \ldots$, if $T$ is fed the word $\chi_k \chi_{k-1} \cdots \chi_0$, which is the base-p expansion of $n = \chi_0 + \chi_1 p + \cdots + \chi_k p^k$, $\chi_k \ne 0$ if $n \ne 0$, then the k-th output symbol of $T$ is $a_n$; or, in other words, such that $\delta_k^A(f_T(n)) = a_n$ for all $n \in \mathbb{N}_0$, where $k = \lfloor \log_p n \rfloor$ and $\delta_k^A(r)$ stands for the k-th digit in the base-L expansion of $r$. The p-kernel of the sequence $a$ is the set $\ker_p(a)$ of all subsequences $(a_{jp^m + t})_{j=0}^{\infty}$, $m = 0, 1, 2, \ldots$, $0 \le t < p^m$. The following criterion of automaticity of a sequence holds, cf. [4, Theorem 6.6.2]:

Theorem 4.1 Let $p \ge 2$; then the sequence $a$ is p-automatic if and only if its p-kernel is finite.

Now we are able to state the automata finiteness criterion in terms of van der Put series of automata functions, see [20]:

Theorem 4.2 (Automata Finiteness Criterion) Given a 1-Lipschitz function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ represented by van der Put series (3.22),

\[ f(x) = \sum_{m=0}^{\infty} b_m\, p^{\lfloor \log_p m \rfloor}\, \chi(m, x), \]

the function $f$ is the automaton function of a finite automaton if and only if the following conditions hold simultaneously:
(i) all coefficients $b_m$, $m = 0, 1, 2, \ldots$, constitute a finite subset $B_f \subset \mathbb{Q} \cap \mathbb{Z}_p$, and
(ii) the p-kernel of the sequence $(b_m)_{m=0}^{\infty}$ is finite.

Note 4.3 Condition (ii) of the theorem is equivalent to the condition that the sequence $(b_m)_{m=0}^{\infty}$ is p-automatic, cf. Theorem 4.1.
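The p-kernel of Theorem 4.1 can be explored experimentally by comparing finite prefixes of the subsequences $(a_{jp^m+t})_{j \ge 0}$. The sketch below is a heuristic illustration, not a decision procedure: the Thue-Morse sequence (the parity of the binary digit sum, a standard example of a 2-automatic sequence) and the indicator sequence of perfect squares are examples chosen here and do not come from the source, and agreement of prefixes does not prove equality of infinite subsequences.

```python
from math import isqrt


def p_kernel_prefixes(a, p, max_m, length):
    """Distinct length-`length` prefixes of the subsequences (a(j*p^m + t))_j,
    for m = 0..max_m and 0 <= t < p^m."""
    kernel = set()
    for m in range(max_m + 1):
        for t in range(p ** m):
            kernel.add(tuple(a(j * p ** m + t) for j in range(length)))
    return kernel


thue_morse = lambda n: bin(n).count("1") % 2          # parity of the binary digit sum
squares = lambda n: 1 if isqrt(n) ** 2 == n else 0    # indicator of perfect squares

print(len(p_kernel_prefixes(thue_morse, 2, 6, 32)))
print(len(p_kernel_prefixes(squares, 2, 6, 32)))
```

For the Thue-Morse sequence every kernel element is either the sequence itself or its complement, so the first count stays at 2 regardless of `max_m`; the second sequence is included only for contrast.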
Although in [20] Theorem 4.2 was proved only for prime $p$, it was recently shown in [56] that the theorem remains true for arbitrary $p \in \{2, 3, 4, \ldots\}$; moreover, [56] describes the explicit connection between the Moore automaton producing an automatic sequence and the Mealy automaton inducing the corresponding endomorphism of the p-adic rooted tree, $p \in \{2, 3, 4, \ldots\}$.

We now present an equivalent statement of Theorem 4.2 in terms of formal power series. Given a q-element field $\mathbb{F}_q$, denote by $\mathbb{F}_q[[X]]$ the ring of formal power series in the variable $X$ over $\mathbb{F}_q$:

\[ \mathbb{F}_q[[X]] = \left\{ \sum_{i=0}^{\infty} a_i X^i : a_i \in \mathbb{F}_q \right\}; \]

denote by $\mathbb{F}_q((X))$ the ring of formal Laurent series over $\mathbb{F}_q$:

\[ \mathbb{F}_q((X)) = \left\{ \sum_{i=-n_0}^{\infty} a_i X^i : n_0 \in \mathbb{N}_0,\ a_i \in \mathbb{F}_q \right\}. \]

Denote by $\mathbb{F}_q(X)$ the field of (univariate) rational functions over $\mathbb{F}_q$:

\[ \mathbb{F}_q(X) = \left\{ \frac{u(X)}{v(X)} : u(X), v(X) \in \mathbb{F}_q[X],\ v(X) \ne 0 \right\}, \]
where $\mathbb{F}_q[X]$ is the ring of polynomials in the variable $X$ over $\mathbb{F}_q$. As the field $\mathbb{F}_q((X))$ contains the subfield $\mathbb{F}_q(X)$, it is possible to define algebraicity over $\mathbb{F}_q(X)$: a formal Laurent series $F(X) = \sum_{i=-n_0}^{\infty} a_i X^i$ is algebraic over $\mathbb{F}_q(X)$ if and only if there exist $d \in \mathbb{N}$ and polynomials $u_0(X), \ldots, u_d(X) \in \mathbb{F}_q[X]$, not all zero, such that in the field $\mathbb{F}_q((X))$ the following identity holds:

\[ u_0(X) + u_1(X) \cdot F(X) + \cdots + u_d(X) \cdot (F(X))^d = 0. \]

The following is Christol's theorem; see, e.g., [4, Theorem 12.2.5]:

Theorem 4.4 (Christol) Let $p$ be a prime, and let $a = (a_i)_{i=0}^{\infty}$ be an infinite sequence over a finite non-empty alphabet $A$. The sequence $a$ is p-automatic if and only if there exist an integer $\ell \in \mathbb{N}$ and an injection $\tau\colon A \to \mathbb{F}_{p^\ell}$ such that the formal power series $\sum_{i=0}^{\infty} \tau(a_i) X^i$ is algebraic over $\mathbb{F}_{p^\ell}(X)$.

By Christol's theorem, we may now replace condition (ii) from the statement of Theorem 4.2 by an equivalent one, thus getting an equivalent finiteness criterion:

Theorem 4.5 (Automata Finiteness Criterion, Equivalent) Given a 1-Lipschitz function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ represented by van der Put series (3.22), the function $f$ is the automaton function of a finite automaton if and only if the following conditions hold simultaneously:
(i) all coefficients $b_m$, $m = 0, 1, 2, \ldots$, constitute a finite subset $B_f \subset \mathbb{Q} \cap \mathbb{Z}_p$, and
(ii) under a suitable injection $\tau\colon B_f \to \mathbb{F}_{p^\ell}$, the formal power series

\[ \sum_{m=0}^{\infty} \tau(b_m) X^m \]

over $\mathbb{F}_{p^\ell}$ is algebraic over $\mathbb{F}_{p^\ell}(X)$.

We note an important point from the proof of Theorem 4.2 (as given in [20]), which we state as a lemma:

Lemma 4.6 Given a 1-Lipschitz function $f$, for $n \in \mathbb{N}_0$, $k \ge \lfloor \log_p n \rfloor + 1$ consider functions $f_{n,k}\colon \mathbb{Z}_p \to \mathbb{Z}_p$ defined as follows:

\[ f_{n,k}(z) = \frac{1}{p^k} \Bigl( f(n + p^k z) - \bigl(f(n) \bmod p^k\bigr) \Bigr); \quad z \in \mathbb{Z}_p. \tag{4.23} \]
The function $f$ is the automaton function of a finite automaton if and only if the collection $\mathcal{F}$ of functions $f_{n,k}$, where $n \in \mathbb{N}_0$, $k \in \mathbb{N}$, $k \ge \lfloor \log_p n \rfloor + 1$, contains only finitely many pairwise distinct functions.

Note 4.7 Note that $f_{n,k}$ is the automaton function that corresponds to the automaton $A(s(n_k)) = \langle \mathbb{F}_p, S, \mathbb{F}_p, S, O, s(n_k) \rangle$, where $s(n_k) \in S$ is the state that the automaton $A = A_f = \langle \mathbb{F}_p, S, \mathbb{F}_p, S, O, s_0 \rangle$ reaches after it has been fed the input word $n_k$ (of length $k$) that corresponds to the base-p expansion of $n$ (so the word $n_k$ may contain some leading zeros that correspond to higher-order digits of the expansion).

Other criteria of finiteness of automata are also known in the literature, in terms of truth tables rather than in terms of functional representations. Recall that the truth table $\operatorname{truth}(f)$ is the sequence of higher-order digits of base-p expansions of output values of the automaton function $f$ on lexicographically ordered finite words. The automaton function $f$ is a finite automaton function if and only if $\operatorname{truth}(f) = (e_i)_{i=0}^{\infty}$ is algebraic over $\mathbb{F}_p$; that is, the formal power series $\sum_{i=0}^{\infty} e_i X^i$ is algebraic over $\mathbb{F}_p[X]$, see [121, 134]. It would be interesting to find properties of automata that can be expressed in terms of functional representations of automata functions, for instance, minimality of the automaton, the synchronization property, etc.
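Lemma 4.6 suggests a simple experiment: enumerate the functions $f_{n,k}$ of (4.23) for $0 \le n < p^k$ and count how many of them differ on a few sample points. The Python sketch below is a finite approximation under stated assumptions: the helper names, the sample points $z = 0, \ldots, \text{samples}-1$ and the reduction modulus are choices made here, and differing samples prove that two functions are distinct while equal samples only suggest equality.

```python
def f_nk(f, n, k, p):
    """The function f_{n,k} of (4.23), restricted to non-negative integers z."""
    def g(z):
        value = f(n + p ** k * z) - (f(n) % p ** k)
        assert value % p ** k == 0            # holds whenever f is 1-Lipschitz
        return value // p ** k
    return g


def count_distinct(f, p, max_k, samples, modulus):
    """Count functions f_{n,k}, 0 <= n < p^k, 1 <= k <= max_k, that differ on the
    sample points z = 0..samples-1 after reduction modulo `modulus`."""
    seen = set()
    for k in range(1, max_k + 1):
        for n in range(p ** k):
            g = f_nk(f, n, k, p)
            seen.add(tuple(g(z) % modulus for z in range(samples)))
    return len(seen)


print(count_distinct(lambda x: x + 1, 2, 6, 8, 2 ** 16))   # the 2-adic adding machine
print(count_distinct(lambda x: x * x, 2, 6, 8, 2 ** 16))   # squaring
```

For $f(x) = x + 1$ and $p = 2$ one gets $f_{n,k}(z) = z + \lfloor (n+1)/2^k \rfloor$, so only the two functions $z$ and $z + 1$ occur (carry or no carry), in agreement with the fact that incrementing is performed by a two-state automaton. For $f(x) = x^2$ the count keeps growing with `max_k`.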
4.2 (Locally) Analytic Automata Functions

In this subsection we mostly follow [12]. We first recall that the definition of the p-adic derivative, which is a direct analog of its classical counterpart from real analysis, can be restated for the p-adic metric in terms of congruences rather than in terms of absolute values. The definition in that form is in many cases more convenient when dealing with automata functions, since reduction modulo $p^k$ of a p-adic integer just amounts to taking a k-letter prefix of the infinite word which represents the p-adic integer.

Definition 4.8 (p-adic Differentiability) A function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is said to be differentiable at the point $x \in \mathbb{Z}_p$ if there exists $f'(x) \in \mathbb{Q}_p$ (called the derivative of $f$ at the point $x$) such that, given arbitrary (sufficiently large) $k \in \mathbb{N}$,

\[ f(x + h) \equiv f(x) + f'(x)\,h \pmod{p^{\operatorname{ord}_p h + k}} \tag{4.24} \]

for all sufficiently small $h \in \mathbb{Z}_p$; in other words, there exists $N = N_k \in \mathbb{N}$ such that congruence (4.24) holds once $|h|_p \le p^{-N}$; that is, once $h \equiv 0 \pmod{p^N}$ (equivalently, once $\operatorname{ord}_p h \ge N$). The function $f$ is called uniformly differentiable (on $\mathbb{Z}_p$) if it is differentiable at every point $x \in \mathbb{Z}_p$ and $N_k$ does not depend on $x$; that is, given sufficiently large $k \in \mathbb{N}$, there exists $N = N_k \in \mathbb{N}$ such that congruence (4.24) holds simultaneously for all $x \in \mathbb{Z}_p$ and all $h \in \mathbb{Z}_p$ once $\operatorname{ord}_p h \ge N$.

Note 4.9 If a 1-Lipschitz (thus, automaton) function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is differentiable at the point $x \in \mathbb{Z}_p$ then $f'(x) \in \mathbb{Z}_p$.

We recall that the rules of differentiation (e.g., the chain rule) with respect to the p-adic metric are the same as in the real case; moreover, for $m \in \mathbb{N}$, the function $f(x) = x^m$ is an automaton function which is uniformly differentiable on $\mathbb{Z}_p$, and $f'(x) = m x^{m-1}$. However, in contrast to the real case, in the p-adic case there are so-called pseudo-constants, the non-constant functions whose derivatives vanish everywhere on $\mathbb{Z}_p$. An example of a pseudo-constant which is an automaton function is the function $\delta_0(x)$; it is uniformly differentiable on $\mathbb{Z}_p$ and $\delta_0'(x) = 0$ for all $x \in \mathbb{Z}_p$. For applications in computer science it is important to stress that many bitwise logical computer instructions (cf. Definition 3.5) are uniformly differentiable on $\mathbb{Z}_2$ (they are automata functions as well):

Example 4.10 Let $c \in \mathbb{Z} = \{0, \pm 1, \pm 2, \ldots\}$. Then the following is true:
(i) The function $x \operatorname{AND} c$ is uniformly differentiable on $\mathbb{Z}_2$; $(x \operatorname{AND} c)' = 0$ if $c \ge 0$, and $(x \operatorname{AND} c)' = 1$ if $c < 0$.
(ii) The function $f(x) = x \operatorname{XOR} c$ is uniformly differentiable on $\mathbb{Z}_2$; $f'(x) = 1$ if $c \ge 0$, and $f'(x) = -1$ if $c < 0$.
(iii) The function $x \operatorname{OR} c$ is uniformly differentiable on $\mathbb{Z}_2$; $(x \operatorname{OR} c)' = 1$ if $c \ge 0$, $(x \operatorname{OR} c)' = 0$ if $c < 0$.
(iv) Given $n \in \mathbb{N}$, the function $x \bmod 2^n$ is uniformly differentiable on $\mathbb{Z}_2$; $(x \bmod 2^n)' = 0$.
(v) The function $\operatorname{NOT} x$ is uniformly differentiable on $\mathbb{Z}_2$; $(\operatorname{NOT} x)' = -1$.

Definition 4.11 (Analytic Function) The function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is said to be analytic on the ball $B_{p^{-r}}(a) \subset \mathbb{Z}_p$ if it can be represented by a power series that converges everywhere on $B_{p^{-r}}(a)$:

\[ f(x) = \sum_{i=0}^{\infty} a_i (x - a)^i \]

for all $x \in B_{p^{-r}}(a)$ (here $a_i \in \mathbb{Q}_p$, $i = 0, 1, 2, \ldots$).

If the function $f$ is analytic on the ball $B_{p^{-r}}(a)$, it can be represented by its Taylor series everywhere on the ball; that is, $a_i = \frac{f^{(i)}(a)}{i!}$, $i = 0, 1, 2, \ldots$, see e.g. [116, Theorem 40.2]. Here, as usual, $f^{(i)}(a)$ stands for the i-th derivative of the function $f$ at the point $a \in \mathbb{Z}_p$.

Example 4.12 The following functions are analytic on the respective balls:
• The p-adic exponential function

\[ \exp_p x = \sum_{j=0}^{\infty} \frac{x^j}{j!} \]

is analytic on $B_{p^{-1}}(0) = p\mathbb{Z}_p$ if $p \ge 3$ and on $B_{1/4}(0) = 4\mathbb{Z}_2$ if $p = 2$. Note that the 'p-adic e' does not exist since the series does not converge at $x = 1$.
• In the same way, i.e., by considering the corresponding power series, we can introduce p-adic trigonometric functions:

\[ \sin_p x = \sum_{j=0}^{\infty} \frac{(-1)^j x^{2j+1}}{(2j+1)!}, \qquad \cos_p x = \sum_{j=0}^{\infty} \frac{(-1)^j x^{2j}}{(2j)!}. \]

They are analytic on the same balls as the exponential function.
• The p-adic logarithm $\ln_p x = \sum_{k=1}^{\infty} \frac{(-1)^{k+1} (x-1)^k}{k}$ is analytic on $B_{p^{-1}}(1) = 1 + p\mathbb{Z}_p$.

Recall the following definition (see e.g. [116, Definition 25.3]):

Definition 4.13 (Locally Analytic Function) A function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is said to be locally analytic of order $r$ iff $f$ is analytic on all balls of radius $p^{-r}$.
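The congruence form (4.24) of differentiability makes the derivatives listed in Example 4.10 easy to test on machine integers, because Python's bitwise operators treat negative integers as infinite two's-complement words, which matches their 2-adic expansions. The sketch below is a sampled check only; the helper name `check_derivative`, the constant 13 and the sampled increments are assumptions made for illustration.

```python
def check_derivative(f, d, N, k, xs):
    """Test congruence (4.24) for p = 2 with candidate derivative d:
    f(x + h) = f(x) + d*h (mod 2^(ord_2 h + k)) for sampled h with ord_2 h >= N."""
    for x in xs:
        for e in range(N, N + 4):              # e = ord_2 h
            for odd in (1, 3, 5):
                h = odd << e
                if (f(x + h) - f(x) - d * h) % (1 << (e + k)) != 0:
                    return False
    return True


xs = range(64)
print(check_derivative(lambda x: x & 13, 0, 8, 8, xs))     # (x AND c)' = 0 for c >= 0
print(check_derivative(lambda x: x & -13, 1, 8, 8, xs))    # (x AND c)' = 1 for c < 0
print(check_derivative(lambda x: x ^ 13, 1, 8, 8, xs))     # (x XOR c)' = 1 for c >= 0
print(check_derivative(lambda x: x ^ -13, -1, 8, 8, xs))   # (x XOR c)' = -1 for c < 0
print(check_derivative(lambda x: x | 13, 1, 8, 8, xs))     # (x OR c)' = 1 for c >= 0
print(check_derivative(lambda x: x | -13, 0, 8, 8, xs))    # (x OR c)' = 0 for c < 0
print(check_derivative(lambda x: x % 32, 0, 8, 8, xs))     # (x mod 2^n)' = 0
print(check_derivative(lambda x: ~x, -1, 8, 8, xs))        # (NOT x)' = -1
```

Here $N = 8$ suffices because the constant 13 only involves bit positions below 8, so adding a multiple of $2^8$ to $x$ does not interact with the masked bits.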
Definition 4.13 can be reformulated in an equivalent form:

Definition 4.14 (Locally Analytic Functions) A function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is said to be locally analytic of order $r$ if and only if

\[ f(a + h) = \sum_{i=0}^{\infty} \frac{f^{(i)}(a)}{i!}\, h^i \]

for all $a \in \mathbb{Z}_p$ whenever $|h|_p \le p^{-r}$.

Note that functions that are locally analytic of order 0 are analytic on $\mathbb{Z}_p$, and vice versa. The following theorem by Yvette Amice gives a complete characterization of locally analytic functions represented via Mahler series:

Theorem 4.15 (Amice, [8, Ch. III, Sec. 10, Th. 3, Cor. 1(c)]) The function $f(x) = \sum_{i=0}^{\infty} a_i \binom{x}{i}$, $a_i \in \mathbb{Q}_p$, is locally analytic of order $r$ if and only if

\[ \lim_{i\to\infty} \left( \operatorname{ord}_p a_i + \frac{1}{p-1} \left( \operatorname{wt}_p i - \left\lfloor \frac{i}{p^r} \right\rfloor \right) \right) = +\infty, \]

where $\operatorname{wt}_p i = \delta_0(i) + \delta_1(i) + \cdots$ is the sum of all digits in the base-p expansion of $i \in \mathbb{N}_0$.

Of course, (locally) analytic functions are not necessarily automata functions; but there are numerous automata functions which are (locally) analytic. For instance, the function $f(x) = \exp_p(p^\ell x) = \sum_{j=0}^{\infty} \frac{p^{\ell j} x^j}{j!}$ (which is analytic on $\mathbb{Z}_p$) is an automaton function if either $p \ge 3$, $\ell \ge 1$ or $p = 2$, $\ell \ge 2$; the function $f(x) = \ln_p(1 + pzx) = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{p^i z^i x^i}{i}$ is an automaton function which is analytic on $\mathbb{Z}_p$ for all $z \in \mathbb{Z}_p$. We now consider wider important classes of (locally) analytic automata functions.
4.2.1 C-Functions

In the ring $\mathbb{Z}_p[[x]]$ of all formal power series in one variable $x$ over the ring $\mathbb{Z}_p$, consider the set $\mathcal{C}(x)$ of all series

\[ s(x) = \sum_{i=0}^{\infty} c_i x^i \quad (c_i \in \mathbb{Z}_p,\ i = 0, 1, 2, \ldots) \tag{4.25} \]

that converge everywhere on $\mathbb{Z}_p$. In other words, $s(x) \in \mathcal{C}(x)$ if and only if $\overset{p}{\lim_{i\to\infty}}\, c_i = 0$. Once this condition is satisfied, the series $s(x) \in \mathcal{C}(x)$ defines on $\mathbb{Z}_p$ a function with values in $\mathbb{Z}_p$ (an integer-valued function, for short) $s\colon \mathbb{Z}_p \to \mathbb{Z}_p$, which is called a C-function. It is easy to show that any C-function is 1-Lipschitz and thus an automaton function. The class $\mathcal{C}$ of all C-functions is closed with respect to differentiation; all C-functions are analytic on $\mathbb{Z}_p$. For $\ell \in \mathbb{N}$, the functions $f(x) = \exp_p(p^\ell x)$, $f(x) = \sin_p(p^\ell x)$, $f(x) = \cos_p(p^\ell x)$ are C-functions if either $p \ge 3$, or $p = 2$ and $\ell \ge 2$. The functions $f(x) = \ln_p(1 + px)$ and $f(x) = (1 + px)^{-1}$ are C-functions for any prime $p$.
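As a small illustration of a C-function, $(1 + px)^{-1} = \sum_{i \ge 0} (-p)^i x^i$ has coefficients $c_i = (-p)^i$ that tend p-adically to 0, and modulo $p^k$ only the first $k$ terms matter. The Python sketch below compares the truncated series with the modular inverse computed by the built-in `pow` (negative exponents with a modulus require Python 3.8 or later); the function name and the parameters $p = 5$, $k = 6$ are assumptions chosen for the example.

```python
def geometric_series_inverse(x, p, k):
    """Partial sum of sum_i (-p)^i x^i reduced mod p^k; terms with i >= k vanish mod p^k."""
    return sum((-p * x) ** i for i in range(k)) % p ** k


p, k = 5, 6
matches = all(
    geometric_series_inverse(x, p, k) == pow(1 + p * x, -1, p ** k)   # modular inverse
    for x in range(50)
)
print(matches)
```

The agreement follows from $(1 + px)\sum_{i<k} (-px)^i = 1 - (-px)^k \equiv 1 \pmod{p^k}$.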
4.2.2 B-Functions

Recall that, given $n \in \mathbb{N}_0$, the n-th falling factorial power $x^{\underline{n}}$ is $x^{\underline{n}} = x(x-1)\cdots(x-n+1)$ if $n \ge 1$, and $x^{\underline{0}} = 1$. Consider the class $\mathcal{B}(x)$ of all falling factorial series with p-adic integer coefficients; that is, $f(x) \in \mathcal{B}(x)$ if and only if $f(x) = \sum_{i=0}^{\infty} b_i x^{\underline{i}}$ ($b_i \in \mathbb{Z}_p$). In other words,

\[ \mathcal{B}(x) = \left\{ \sum_{i=0}^{\infty} a_i \binom{x}{i} : \frac{a_i}{i!} \in \mathbb{Z}_p;\ i = 0, 1, 2, \ldots \right\}. \tag{4.26} \]

By the convergence criterion for Mahler interpolation series (see (3.14)), all series from $\mathcal{B}(x)$ are uniformly convergent on $\mathbb{Z}_p$ and thus define uniformly continuous functions on $\mathbb{Z}_p$, which we call B-functions: as $a_i / i! \in \mathbb{Z}_p$, we have $\overset{p}{\lim_{i\to\infty}}\, a_i = 0$ since $\overset{p}{\lim_{i\to\infty}}\, i! = 0$. By Theorem 3.1, any B-function is an automaton function, since it is well known that $\operatorname{ord}_p i! = \frac{1}{p-1}(i - \operatorname{wt}_p i)$ and thus $\operatorname{ord}_p a_i \ge \lfloor \log_p i \rfloor$ for all $i \in \mathbb{N}$. Note also that any B-function is uniformly differentiable on $\mathbb{Z}_p$, cf. (3.15). Denote by $\mathcal{B}$ the class of all functions defined by series from $\mathcal{B}(x)$. It turns out that distinct series define distinct functions, so in the sequel we do not distinguish series from the functions they define.
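The two arithmetic facts used in this argument, $\operatorname{ord}_p i! = \frac{1}{p-1}(i - \operatorname{wt}_p i)$ and $\operatorname{ord}_p i! \ge \lfloor \log_p i \rfloor$, can be cross-checked numerically. In the sketch below the valuation of $i!$ is computed independently via Legendre's formula $\operatorname{ord}_p i! = \sum_{j \ge 1} \lfloor i / p^j \rfloor$, a standard identity that is brought in here for the comparison and is not taken from the source; all function names are illustrative.

```python
def ord_p_factorial(i, p):
    """ord_p(i!) by Legendre's formula: sum of floor(i / p^j) over j >= 1."""
    v, q = 0, p
    while q <= i:
        v += i // q
        q *= p
    return v


def digit_sum(i, p):
    """wt_p(i): the sum of the base-p digits of i."""
    s = 0
    while i:
        s += i % p
        i //= p
    return s


def floor_log(i, p):
    """(number of base-p digits of i) - 1."""
    k = 0
    while i >= p:
        i //= p
        k += 1
    return k


ok = all(
    (p - 1) * ord_p_factorial(i, p) == i - digit_sum(i, p)
    and ord_p_factorial(i, p) >= floor_log(i, p)
    for p in (2, 3, 5, 7)
    for i in range(1, 2000)
)
print(ok)
```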
Proposition 4.16 Any two distinct series from $\mathcal{B}(x)$ (respectively, from $\mathcal{C}(x)$) define two distinct functions on $\mathbb{Z}_p$.

It is clear that $\mathcal{B} \supset \mathcal{C}$; moreover, $\mathcal{B} \ne \mathcal{C}$ since not all B-functions are analytic on $\mathbb{Z}_p$: indeed, the function $\sum_{i=0}^{\infty} x^{\underline{i}}$ is a B-function; however, condition (3.17) is not satisfied. The class $\mathcal{B}$ can be endowed with the non-Archimedean metric $D_p(f, g) = \max\{|f(z) - g(z)|_p : z \in \mathbb{Z}_p\}$. On the other hand, B-functions behave somewhat like real analytic functions: a composition of real analytic functions is again an analytic function; generally this is not the case for p-adic analytic functions, see [116, Section 41] for details. Fortunately, for B-functions the following analog of the Stone-Weierstrass theorem holds:
Theorem 4.17 The class B is closed with respect to additions, multiplications, derivations, and compositions of functions. Polynomials with non-negative rational integer coefficients constitute a dense subset of B. Note It should be noticed that although Theorem 4.17 can be considered as an analog of Stone-Weierstrass theorem, it is not the p-adic Stone-Weierstrass theorem: The latter theorem states conditions when a uniform closure of a class of functions defined on a compact set is a class of all continuous functions on the set, see e.g. [116, Appendix A.4]. Furthermore, the p-adic Stone-Weierstrass theorem says nothing special about differentiability of the uniform closure; moreover, in contrast to Theorem 4.17, the uniform closure of polynomials over Qp contains not only uniformly differentiable functions as in the case of uniform closure of polynomials over N0 . Theorem 4.17 deals with a uniform closure of a smaller class of functions, the polynomials over N0 rather than over Qp , however, the closure is with respect to the same metric as in the p-adic Stone-Weierstrass theorem. From this view, Theorem 4.17 can be considered also as an analog of p-adic Kaplansky’s theorem (see e.g. [116, Theorem 43.3]), however, not for all polynomials over the field Qp as the p-adic Kaplansky’s theorem deals with, but only for polynomials over N0 , or equivalently, over the ring Z. The p-adic Kaplansky’s theorem yields that all continuous functions defined on a compact set can be uniformly approximated (with respect to Dp ) by polynomials over Qp , whereas Theorem 4.17 deals with uniform closure of polynomials over Z (or, which is equivalent, with uniform closure of polynomials over Zp ), and all functions of the closure turn out to be uniformly differentiable, and not only continuous. 
Finally, we note that the classical Weierstrass approximation theorem also yields only that continuous real-valued functions defined on a compact subset of $\mathbb{R}$ can be uniformly approximated by polynomials over $\mathbb{R}$; that is, the limit of a uniformly convergent sequence of polynomials over $\mathbb{R}$ (which are differentiable functions) need not be a differentiable function. However, Theorem 4.17 yields that in the case of polynomials over $\mathbb{Z}_p$ the limit is a uniformly differentiable function. All of the above explains why Theorem 4.17 can be derived neither from the p-adic Stone-Weierstrass theorem nor from the p-adic Kaplansky theorem, and needs a special proof.

Although a B-function is not necessarily analytic on $\mathbb{Z}_p$, it turns out to be analytic on all balls of radii less than 1; that is, all B-functions are locally analytic of order 1, cf. Definition 4.13. Namely, the following Taylor theorem for B-functions holds:

Theorem 4.18 (Taylor theorem for B-functions) For every $f \in \mathcal{B}$, $a, h \in \mathbb{Z}_p$ and $k = 1, 2, 3, \ldots$ the following equality holds:

\[ f(a + p^k h) = f(a) + f'(a) \cdot p^k h + \frac{f''(a)}{2!} \cdot p^{2k} h^2 + \frac{f'''(a)}{3!} \cdot p^{3k} h^3 + \cdots \tag{4.27} \]

Moreover, all $\frac{f^{(j)}(a)}{j!}$ are p-adic integers, $j = 0, 1, 2, \ldots$.
Important examples of B-functions are exponential functions. Consider p-adic functions of the form $f(x) = u(x)^{v(x)}$, where $u, v\colon \mathbb{Z}_p \to \mathbb{Z}_p$ are automata functions. The domain of the function $f$ may be smaller than $\mathbb{Z}_p$ (and may actually be empty), to say nothing of 1-Lipschitzness. The following proposition states sufficient conditions for the function $f$ to be an automaton function and, moreover, to lie in $\mathcal{B}$:

Proposition 4.19 Let $u, v\colon \mathbb{Z}_p \to \mathbb{Z}_p$ be automata functions and let $u(z) \equiv 1 \pmod p$ for all $z \in \mathbb{Z}_p$. Then the function $f(z) = u(z)^{v(z)}$ is well defined for all $z \in \mathbb{Z}_p$, integer-valued and 1-Lipschitz (thus, an automaton function). Moreover, if $w, v \in \mathcal{B}$, $u(z) = 1 + p \cdot w(z)$, then $f \in \mathcal{B}$.

A rational function over $\mathbb{Z}_p$ is a function $f(x) = \frac{g(x)}{u(x)}$, where $g(x), u(x)$ are polynomials with p-adic integer coefficients. From Proposition 4.19 it immediately follows that $f$ is a B-function once $u(x) = 1 + p\,w(x)$, where $w$ is a polynomial over $\mathbb{Z}_p$. However, the latter assertion can be slightly generalized:

Proposition 4.20 The rational function $f(x) = \frac{g(x)}{u(x)}$ is a B-function if the denominator $u(x)$ vanishes modulo $p$ nowhere on $\mathbb{Z}_p$ (that is, if $u(z) \not\equiv 0 \pmod p$ for all $z \in \mathbb{Z}_p$).
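The 1-Lipschitz part of Proposition 4.19 can be probed with modular exponentiation when the exponent takes non-negative integer values. The sketch below is a finite test on sample data: the functions $u(x) = 1 + 3x^2$, $v(x) = x^3 + 2$ and the counterexample $u(x) = 2 + 3x$, $v(x) = x$ are assumptions chosen for illustration, as are the helper names.

```python
def power_mod(u, v, x, p, k):
    """u(x)^v(x) reduced mod p^k, for a non-negative integer exponent v(x)."""
    return pow(u(x), v(x), p ** k)


def is_compatible(u, v, p, depth):
    """Check that x = y (mod p^k) implies u(x)^v(x) = u(y)^v(y) (mod p^k), k = 1..depth."""
    for k in range(1, depth + 1):
        for x in range(p ** k):
            for t in range(1, p):
                y = x + t * p ** k
                if power_mod(u, v, x, p, k) != power_mod(u, v, y, p, k):
                    return False
    return True


# u = 1 (mod 3) everywhere, so the hypothesis of Proposition 4.19 holds on these samples
print(is_compatible(lambda x: 1 + 3 * x * x, lambda x: x ** 3 + 2, 3, 4))
# u = 2 (mod 3): the hypothesis fails, and so does compatibility
print(is_compatible(lambda x: 2 + 3 * x, lambda x: x, 3, 4))
```

The first check succeeds because $(1 + pw)^{p^k} \equiv 1 \pmod{p^{k+1}}$, so changing the exponent by a multiple of $p^k$ does not affect the value modulo $p^k$.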
4.2.3 Class A

Some important functions (for instance, some polynomials over $\mathbb{Q}_p$ that do not necessarily have p-adic integer coefficients yet map $\mathbb{Z}_p$ into $\mathbb{Z}_p$ and satisfy a Lipschitz condition with constant 1 everywhere on $\mathbb{Z}_p$) do not lie in $\mathcal{B}$. However, they lie in a wider class $\mathcal{A}$:

Definition 4.21 A function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ lies in $\mathcal{A}$ (and is said to be an A-function) if and only if $f$ satisfies a Lipschitz condition with constant 1 (i.e., is an automaton function) and $p^n f \in \mathcal{B}$ for some $n \in \mathbb{N}_0$.

From Theorem 4.18 we immediately conclude that the Taylor theorem holds for every A-function $f$ in the following form:

Theorem 4.22 (Taylor theorem for A-functions) For every $f \in \mathcal{A}$, $a, h \in \mathbb{Z}_p$ and $k = 1, 2, 3, \ldots$ the function $f(a + p^k h)$ in the variable $h$ can be represented via the convergent Taylor series

\[ f(a + p^k h) = f(a) + f'(a) \cdot p^k h + \frac{f''(a)}{2!} \cdot p^{2k} h^2 + \frac{f'''(a)}{3!} \cdot p^{3k} h^3 + \cdots. \tag{4.28} \]

Note that the $\frac{f^{(j)}(a)}{j!}$ are not necessarily p-adic integers now; however, in view of the second claim of Theorem 4.18, $\left| \frac{f^{(j)}(a)}{j!} \right|_p \le p^n$ for all $j = 1, 2, \ldots$. Moreover, $f'(a)$ is still a p-adic integer since $f$ is 1-Lipschitz.
The most important examples of A-functions that are not necessarily B-functions are 1-Lipschitz integer-valued polynomials over $\mathbb{Q}_p$ (i.e., functions of the form $f(x) = \sum_{i=0}^{d} a_i\, p^{\lfloor \log_p i \rfloor} \binom{x}{i}$, where $a_i \in \mathbb{Z}_p$, $i = 0, 1, 2, \ldots$), since every automaton function can be uniformly approximated on $\mathbb{Z}_p$ (with respect to the metric $D_p$) by A-functions. For instance, the function $f(x) = \frac{(x^p - x)^2}{p}$ is an A-function but not a B-function.
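The example $f(x) = (x^p - x)^2 / p$ can be inspected numerically for a small prime. The sketch below takes $p = 3$ (an assumption for the example) and checks three things on finite samples: that $f$ is integer-valued, that it is 1-Lipschitz on the tested residues, and that its Mahler coefficient $a_{2p}$, computed from the standard identity $a_i = \sum_k (-1)^{i-k} \binom{i}{k} f(k)$, is not divisible by $(2p)!$, so the condition $a_i / i! \in \mathbb{Z}_p$ from (4.26) fails at $i = 2p$.

```python
from math import comb, factorial

P = 3


def f(x):
    """f(x) = (x^p - x)^2 / p; the division is exact since p divides x^p - x."""
    value = (x ** P - x) ** 2
    assert value % P == 0
    return value // P


def is_one_lipschitz(g, p, depth):
    """Finite test: x = y (mod p^k) should imply g(x) = g(y) (mod p^k), k = 1..depth."""
    for k in range(1, depth + 1):
        for x in range(p ** k):
            for t in range(1, p):
                if (g(x) - g(x + t * p ** k)) % p ** k != 0:
                    return False
    return True


def mahler_coefficient(g, i):
    """a_i = sum_k (-1)^(i-k) C(i,k) g(k)."""
    return sum((-1) ** (i - k) * comb(i, k) * g(k) for k in range(i + 1))


a_2p = mahler_coefficient(f, 2 * P)
print(is_one_lipschitz(f, P, 4))
print(a_2p, a_2p % factorial(2 * P) == 0)
```

Since $f$ is a polynomial of degree $2p$ with leading coefficient $1/p$, one has $a_{2p} = (2p)!/p$, so $a_{2p}/(2p)! = 1/p$ is not a p-adic integer, while $p f$ is a polynomial over $\mathbb{Z}$.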
5 The p-adic Ergodic Theory of Automata

Every automaton performs a mapping of all words of length $\ell$ into words of length $\ell$, for every $\ell \in \mathbb{N}$. The properties of these mappings constitute a wide area of research in classical automata theory as well as in applications (e.g., to computer science and cryptology). For instance, in the monograph [122] the problem of weak invertibility of finite automata is one of the main themes. In our terms, weak invertibility of an automaton $A$ means that its automaton function $f_A\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is injective. It is not easy to verify weak invertibility of an automaton given its representation as a Moore diagram; and it is even more difficult to construct a weakly invertible automaton which satisfies some important restrictions on its automaton function, such as, e.g., transitivity, which means that the transformation the automaton performs on words of length $\ell$ is not only a permutation (which is the case for weak invertibility) but a permutation having only one cycle (that is, a cycle of length $p^\ell$). But the latter mappings are very important when one needs to produce a pseudo-random (e.g., uniformly distributed) sequence for simulation purposes, e.g., in the Monte Carlo method, see [14, 44, 106], or for cryptographic usage (e.g., in stream ciphers). For instance, by methods based on Moore diagrams it would not be easy (if possible at all) to prove that the following wild-looking function is an automaton function which is transitive on all $2^\ell$ binary words of length $\ell$, for all $\ell \in \mathbb{N}$:

\[ f(x) = (1 + x) \operatorname{XOR} 4 \cdot \left( 1 - 2 \cdot \frac{x \operatorname{AND} x^2 + x^3 \operatorname{OR} x^4}{3 - 4 \cdot (5 + 6x^5)^{x^6 \operatorname{XOR} x^7}} \right)^{7 + \operatorname{NOT}\left( \frac{8x^8}{9 + 10x^9} \right)}. \]

However, this can be proved by using the p-adic ergodic theory which we discuss below.
It is worth stressing here the sharp difference between the classical approach, which is based on investigation of Moore diagrams of automata, and the p-adic approach, which is based on investigation of automata functions directly by using p-adic analysis and p-adic dynamics; see more about the latter in [12]. In what follows, we explain methods and present results based on the latter approach. Note that in the current paper we do not consider the numerous results on ergodic decomposition, repellers, attractors and other dynamical properties of general 1-Lipschitz (thus, automata) functions on $\mathbb{Z}_p$, since this is a vast area of research which deserves a special expository paper; we only mention a few works from the area which give the interested reader an impression of that area as well as references to the literature for further reading: [45, 105, 114].
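Transitivity on words of length $\ell$ can at least be brute-forced for small $\ell$ by following the orbit of 0 modulo $2^\ell$ and checking that it returns to 0 only after $2^\ell$ steps. The Python sketch below does not treat the wild-looking function displayed above; it uses the much simpler T-function $f(x) = 1 + x + 4\,(x \operatorname{AND} x^2)$, a choice made here purely for illustration, together with $3x + 1$ as a contrasting example.

```python
def cycle_length(f, start, modulus):
    """Length of the cycle through `start` under x -> f(x) mod modulus,
    or None if the orbit does not return to `start` within `modulus` steps."""
    x, steps = f(start) % modulus, 1
    while x != start:
        if steps >= modulus:
            return None
        x = f(x) % modulus
        steps += 1
    return steps


def is_transitive(f, ell):
    """True if x -> f(x) mod 2^ell is a single cycle of length 2^ell."""
    m = 1 << ell
    return cycle_length(f, 0, m) == m


f = lambda x: 1 + x + 4 * (x & (x * x))
print(all(is_transitive(f, ell) for ell in range(1, 13)))
print(is_transitive(lambda x: 3 * x + 1, 1), is_transitive(lambda x: 3 * x + 1, 2))
```

Such an exhaustive check costs $2^\ell$ steps per word length and says nothing about larger $\ell$, which is exactly why criteria valid for all $\ell$ at once are valuable.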
5.1 Basics of (p-adic) Dynamics

In this subsection we recall some basic notions and concepts from dynamical systems theory, mainly following [59]. In the paper, by a (discrete) dynamical system on a phase space (or configuration space) $\mathbb{S}$ we understand a triple $\langle \mathbb{S}; \mu; f \rangle$, where $\mathbb{S}$ is a measure space endowed with a measure $\mu$ and $f\colon \mathbb{S} \to \mathbb{S}$ is a measurable map; that is, the f-preimage $f^{-1}(S)$ of any $\mu$-measurable subset $S \subset \mathbb{S}$ is a $\mu$-measurable subset of $\mathbb{S}$. Recall that topological dynamical systems are pairs $\langle X; f \rangle$ where $X$ is a topological space and $f$ is a continuous transformation of $X$. In the cases we discuss in the paper, dynamical systems are also topological, since configuration spaces are not only measure spaces but also metric spaces, and the corresponding transformations are not only measurable but also continuous.

The orbit (or trajectory) of a point $x_0$ of the dynamical system is the sequence

\[ x_0 = f^0(x_0),\ x_1 = f(x_0),\ x_2 = f(x_1) = f^2(x_0),\ \ldots,\ x_i = f(x_{i-1}) = f^i(x_0),\ \ldots \]

of points of the space $\mathbb{S}$; that is, the orbit of the point $x_0$ is just the sequence of iterates $(f^i(x_0))_{i=0}^{\infty}$. The point $x_0$ is called the initial point of the orbit $(f^i(x_0))_{i=0}^{\infty}$. The point $x_0$ is called (eventually) periodic if its orbit is (eventually) periodic; eventually periodic points are also called pre-periodic. If the orbit $(f^i(x_0))_{i=0}^{\infty}$ has a period of length $r$, the point $x_0$ is called an r-periodic point. A 1-periodic point is called a fixed point.

If $F\colon \mathbb{S} \to \mathbb{Y}$ is a measurable map of $\mathbb{S}$ to some other measure space $\mathbb{Y}$ endowed with a measure $\nu$ (that is, if the F-preimage of any $\nu$-measurable subset of $\mathbb{Y}$ is a $\mu$-measurable subset of $\mathbb{S}$), the sequence $F(x_0), F(x_1), F(x_2), \ldots$ is called the observable. The map $F\colon \mathbb{S} \to \mathbb{Y}$ is said to be measure-preserving if $\mu(F^{-1}(S)) = \nu(S)$ for each measurable subset $S \subset \mathbb{Y}$.

Also in this case, once $\mathbb{S} = \mathbb{Y}$ and $\mu = \nu$, the measure $\mu$ is said to be invariant with respect to $F$ (or simply 'invariant' when it is clear from the context what $F$ is). In the case when $\mathbb{S} = \mathbb{Y}$, $\mu = \nu$, and $\mu$ is a probability measure, a measure-preserving map $F$ is said to be ergodic if for each measurable subset $S$ such that $F^{-1}(S) = S$ it holds that either $\mu(S) = 1$ or $\mu(S) = 0$. A measurable subset $S \subset \mathbb{S}$ is called an invariant subset of the map $F\colon \mathbb{S} \to \mathbb{S}$ (or F-invariant) if $F^{-1}(S) = S$; so ergodicity of the map $F$ just means that $F$ has no invariant subsets whose measure is neither 0 nor 1.
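On a finite configuration space every orbit is eventually periodic, and the notions of pre-periodic, r-periodic and fixed points introduced above can be computed directly. The following Python sketch (the helper name and the sample maps on residues modulo 16 are assumptions made for illustration) returns the length of the pre-periodic part and the period of an orbit.

```python
def orbit_structure(f, x0):
    """Return (preperiod, period) of the orbit of x0 under a map f of a finite set."""
    seen = {}                    # point -> index of its first occurrence in the orbit
    x, i = x0, 0
    while x not in seen:
        seen[x] = i
        x = f(x)
        i += 1
    return seen[x], i - seen[x]


square_mod_16 = lambda x: (x * x) % 16
shift_mod_16 = lambda x: (x + 1) % 16

print(orbit_structure(square_mod_16, 2))   # 2 -> 4 -> 0 -> 0: pre-periodic, ends in a fixed point
print(orbit_structure(shift_mod_16, 3))    # purely periodic orbit through all 16 residues
```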
5.1.1 Ergodic Theory

Basic questions in general ergodic theory are: How often does an orbit visit a given region (that is, a given measurable subset $S\subset\mathbb{S}$)? What is the frequency with which the orbit of $x=x_0$ hits $S$? For instance, when is this frequency equal to the probability that a randomly chosen point of $\mathbb{S}$ lies in $S$? That is, if $\chi(S,x)$ denotes the characteristic function of $S$ (i.e., $\chi(S,x)=1$ if and only if $x\in S$; otherwise $\chi(S,x)=0$), when does the following condition hold?

\[
\lim_{N\to\infty}\frac{1}{N}\sum_{i=0}^{N-1}\chi(S,x_i)=\mu(S). \tag{5.29}
\]
In what follows, we call a sequence $(x_i)_{i=0}^{\infty}$ uniformly distributed if (5.29) holds for all measurable subsets $S$ of $\mathbb{S}$. A word of caution: this definition must not be taken as a definition of uniform distribution on an arbitrary measure space $\mathbb{S}$ with an arbitrary probability measure $\mu$; some restrictions must be imposed both on the measurable sets $S$ and on the measure $\mu$. However, all these restrictions are satisfied for all dynamical systems we consider further in the paper, so we are free to use the above definition of uniform distribution throughout the paper. Readers interested in further detail on these restrictions, as well as in the theory of uniform distribution of sequences, are referred to [86]. One of the fundamental results of ergodic theory states that once the function $f$ is ergodic, the orbit $(f^i(x_0))_{i=0}^{\infty}$ is uniformly distributed. Again, to make this assertion true in general, some additional restrictions must be imposed (see [86]). However, for the dynamical systems we consider further, the ones on finite spaces and on $\mathbb{Z}_p^n$, the restrictions are satisfied; so we use the assertion just stated without limitations.
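As an illustration of condition (5.29), the following minimal sketch computes the empirical visiting frequency of an orbit on a finite space with the counting measure. The helper names (`orbit`, `visit_frequency`) and the choice of the cyclic shift on Z/16Z are assumptions made only for this example; they are not taken from the text.

```python
from fractions import Fraction

def orbit(f, x0, length):
    """Return the first `length` points x0, f(x0), f(f(x0)), ... of the orbit."""
    points, x = [], x0
    for _ in range(length):
        points.append(x)
        x = f(x)
    return points

def visit_frequency(f, x0, in_region, n_steps):
    """Empirical frequency (1/N) * sum_{i<N} chi(S, x_i), cf. (5.29)."""
    hits = sum(1 for x in orbit(f, x0, n_steps) if in_region(x))
    return Fraction(hits, n_steps)

# Hypothetical example: the cyclic shift x -> x + 1 on Z/16Z.
MODULUS = 16
shift = lambda x: (x + 1) % MODULUS
region = lambda x: x % 4 == 3          # S = {3, 7, 11, 15}, so mu(S) = 4/16
freq = visit_frequency(shift, 0, region, 10 * MODULUS)
```

Over a whole number of periods the frequency coincides with the normalized counting measure of the region, which is what (5.29) asks for in the limit.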
5.1.2 Topological Transitivity

As already mentioned, the dynamical systems we mostly consider further are also topological, since configuration spaces are not only measure spaces but also metric spaces, and the corresponding transformations are not only measurable but also continuous. In the theory of topological dynamical systems there is an important counterpart of the notion of ergodicity, namely topological transitivity.

Definition 5.1 Given a topological space $\mathbb{X}$ and a continuous mapping $F\colon\mathbb{X}\to\mathbb{X}$, the mapping $F$ (respectively, the dynamical system $\langle\mathbb{X};F\rangle$) is called topologically transitive if there exists a dense orbit of $F$; that is, if there exists $x\in\mathbb{X}$ such that the set of iterates $\{F^i(x) : i\in\mathbb{N}_0\}$ is everywhere dense in $\mathbb{X}$. The system $\langle\mathbb{X};F\rangle$ is called minimal if every orbit is dense.

It is worth noticing here that once a configuration space is endowed both with a metric (thus, with a topology) and with a measure, and even if the metric and the measure somehow agree, the properties of ergodicity and of minimality may be quite different. Fortunately, in the cases we consider further in the paper, ergodicity and minimality turn out to be equivalent since we deal with isometries of compact metric spaces. The following very general proposition is true (see e.g. [59, Corollary 4.3.6]):
Proposition 5.2 A minimal isometry of a compact space is uniquely ergodic.

Unique ergodicity means that there exists a unique (up to multiplication by a constant) invariant measure. In the cases considered further in the paper, the invariant measure is a probability measure; namely, a Haar measure which is normalized so that the measure of the whole space is 1. Therefore, speaking further of ergodicity we actually speak of minimality of the corresponding dynamical systems; but this implies ergodicity of the systems by Proposition 5.2.
5.2 Finite Dynamics

Now we consider the notions we have just introduced for a very special (however, important) case when the dynamical system $\langle\mathbb{S};f;\mu\rangle$ is finite; that is, when the configuration space $\mathbb{S}$ contains only finitely many points. The total number of these points (that is, the cardinality of the set $\mathbb{S}$) is denoted via $\#\mathbb{S}$. Of course, when the dynamics is finite, all points are eventually periodic. Now consider measure-preservation and ergodicity for finite dynamical systems. Let $\mathbb{S}$, $\mathbb{T}$ be finite. For $S\subset\mathbb{S}$ and $T\subset\mathbb{T}$ put

\[
\mu(S)=\frac{\#S}{\#\mathbb{S}},\qquad \nu(T)=\frac{\#T}{\#\mathbb{T}}.
\]

That is, $\mu$, $\nu$ are the "standard" probability measures on $\mathbb{S}$, $\mathbb{T}$, respectively. Let $F\colon\mathbb{S}\to\mathbb{T}$ be a surjective (that is, an "onto") map. It is easy to prove that the following is true:

(i) The map $F$ is measure-preserving if and only if $F$ is balanced, i.e., if and only if the number $\#F^{-1}(t)$ of $F$-preimages of the point $t\in\mathbb{T}$ does not depend on the point $t$:
\[
\#F^{-1}(t)=\frac{\#\mathbb{S}}{\#\mathbb{T}}\quad\text{for all } t\in\mathbb{T}.
\]
(ii) In particular, in the case when $\mathbb{S}=\mathbb{T}$ and $\mu=\nu$, the map $F$ is measure-preserving if and only if $F$ is bijective, i.e., if and only if $F$ is a permutation on $\mathbb{S}$.
(iii) Finally, $F$ is ergodic if and only if $F$ is transitive on $\mathbb{S}$, i.e., if and only if $F$ is a cycle of length $\#\mathbb{S}$ (that is, $F$ cyclically permutes the points of $\mathbb{S}$).

It is clear that if a map $f\colon\mathbb{S}\to\mathbb{S}$ is transitive, then every orbit of $f$ is uniformly distributed, since the orbit is a periodic sequence such that the length of its shortest period is $\#\mathbb{S}$, and every element of $\mathbb{S}$ appears in the period exactly once. Sequences having this property are called strictly uniformly distributed.
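The three finite notions above (balanced, bijective, transitive) can be tested by direct enumeration. The sketch below is only an illustration under the assumption that the spaces are given as Python lists and the maps as ordinary functions; the function names are hypothetical.

```python
from collections import Counter

def is_balanced(F, domain, codomain):
    """True if every point of the codomain has the same number of F-preimages."""
    counts = Counter(F(s) for s in domain)
    expected, remainder = divmod(len(domain), len(codomain))
    return remainder == 0 and all(counts[t] == expected for t in codomain)

def is_bijective(F, domain):
    """True if F permutes the points of the (finite) domain."""
    return sorted(F(s) for s in domain) == sorted(domain)

def is_transitive(F, domain):
    """True if F is a single cycle of length #domain."""
    seen, x = set(), domain[0]
    while x not in seen:
        seen.add(x)
        x = F(x)
    return x == domain[0] and len(seen) == len(domain)

Z8 = list(range(8))
Z4 = list(range(4))
lcg = lambda x: (5 * x + 3) % 8        # cyclically permutes Z/8Z
triple = lambda x: (3 * x) % 8         # a permutation with the fixed point 0
```

Here `lcg` is transitive (hence ergodic with respect to the counting measure), while `triple` is measure-preserving but not ergodic.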
5.2.1 Heritable Dynamical Properties

Balanced, bijective and transitive maps may under certain conditions be 'projected' onto smaller domains so that the 'images' of these maps remain balanced, bijective, transitive, respectively. We consider these conditions for the case of maps of finite rings. Let $A$, $B$ be rings, and let $\varphi\colon A\to B$ be a ring epimorphism. Recall (see [88]) that a mapping $f\colon A\to A$ is called compatible with the epimorphism $\varphi$ if there exists a commutative closure $\varphi(f)$ of the following diagram:

\[
\begin{array}{ccc}
A & \xrightarrow{\ f\ } & A\\
{\scriptstyle\varphi}\downarrow & & \downarrow{\scriptstyle\varphi}\\
B & \xrightarrow{\ \varphi(f)\ } & B
\end{array}
\]

Here, by definition, $(\varphi(f))(b)=\varphi(f(a))$ for $b\in B$, where $a\in A$ is a $\varphi$-preimage of $b$: $a\in\varphi^{-1}(b)$. In other words, the compatibility of $f$ with the epimorphism $\varphi$ means that the map $\varphi(f)$ is well defined: given arbitrary $x,y\in A$ such that their $\varphi$-images coincide, $\varphi(x)=\varphi(y)$, we have $\varphi(f(x))=\varphi(f(y))$. So each compatible transformation on $A$ defines a unique transformation on each epimorphic image of $A$. As each epimorphism of $A$ defines a unique ideal $I_\varphi$ of $A$, the kernel of $\varphi$, and vice versa, we will also say that $f$ is compatible with respect to the ideal $I_\varphi$. We will say that $f$ has some property P modulo the ideal $I_\varphi$ or, which is the same, modulo the epimorphism $\varphi$ if the mapping $\varphi(f)$ induced by $f$ on the epimorphic image $B$ of $A$ has the property P.

The notion of compatibility with an epimorphism can be defined for multivariate maps as well: If $F\colon A^n\to A^m$, $m\le n$, is a map from the $n$-th Cartesian power of $A$ onto its $m$-th Cartesian power, and if $I$ is an ideal in $A$, then $I^n$, $I^m$ are ideals in $A^n$, $A^m$, respectively; so the map $F$ is called compatible with respect to the ideal $I$ of $A$ if there exists a commutative closure $\varphi(F)$ of the following diagram:

\[
\begin{array}{ccc}
A^n & \xrightarrow{\ F\ } & A^m\\
{\scriptstyle\varphi}\downarrow & & \downarrow{\scriptstyle\varphi}\\
B^n & \xrightarrow{\ \varphi(F)\ } & B^m
\end{array}
\]

Note that we denote the epimorphisms $A^n\to B^n$ and $A^m\to B^m$ with kernels $I^n$, $I^m$, respectively, by the same symbol $\varphi$ just to avoid complicated notation. We will also say that $F$ has some property P modulo $I$ (modulo $\varphi$) if the map $\varphi(F)$ has the property P.
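For the residue rings Z/nZ and the reduction epimorphism Z/nZ → Z/mZ (with m dividing n), compatibility can be checked by brute force. The following is a minimal sketch under that assumption; `is_compatible` and `induced_map` are hypothetical helper names introduced for this illustration.

```python
def is_compatible(f, n, m):
    """Check that f: Z/nZ -> Z/nZ respects reduction Z/nZ -> Z/mZ, where m divides n."""
    if n % m != 0:
        raise ValueError("m must divide n")
    image_of_class = {}
    for x in range(n):
        residue, value = x % m, f(x) % m
        # setdefault records the first value seen for this residue class
        if image_of_class.setdefault(residue, value) != value:
            return False
    return True

def induced_map(f, n, m):
    """Return the induced map on Z/mZ, defined by b -> f(a) mod m for any preimage a of b."""
    if not is_compatible(f, n, m):
        raise ValueError("f is not compatible with reduction modulo m")
    return lambda b: f(b) % m

square_plus_one = lambda x: x * x + 1   # polynomials with integer coefficients are compatible
halve = lambda x: x // 2                # integer halving is not
reduced = induced_map(square_plus_one, 8, 4)
```

The map `halve` fails the test because 0 and 4 coincide modulo 4 while their images 0 and 2 do not.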
5.3 1-Lipschitz Dynamics on $\mathbb{Z}_p^n$

In this subsection we mainly follow [12]. The configuration spaces of the dynamical systems we focus on are the metric spaces $\mathbb{Z}_p^n$. These spaces are endowed with a natural probability measure. Namely, the ring $\mathbb{Z}_p$ can be endowed with a probability measure $\mu_p$, thus becoming a probability space. This measure is a normalized Haar measure: The base of the corresponding $\sigma$-algebra of measurable subsets of $\mathbb{Z}_p$, the elementary measurable subsets, consists of all balls of non-zero radii. That is, every element of the $\sigma$-algebra, a measurable subset of $\mathbb{Z}_p$, can be constructed from the elementary measurable subsets by taking complements and countable unions. For a ball $B_{p^{-\ell}}(a)$ of radius $p^{-\ell}$ we put $\mu_p(B_{p^{-\ell}}(a))=p^{-\ell}$. We will sometimes refer to this probability measure on $\mathbb{Z}_p$ as the p-adic measure. In a similar manner we define a probability measure $\mu_p$ on $\mathbb{Z}_p^n$: For an $n$-dimensional ball $B_{p^{-\ell}}(a)\subset\mathbb{Z}_p^n$ we put $\mu_p(B_{p^{-\ell}}(a))=p^{-n\ell}$. Recall that the (ultra)metric on $\mathbb{Q}_p^n$ is defined as follows: $|a-b|_p=\max\{|a_i-b_i|_p : i=1,2,\ldots,n\}$ for $a=(a_1,\ldots,a_n)$, $b=(b_1,\ldots,b_n)\in\mathbb{Q}_p^n$.

We recall (see e.g. [88]) that if a measure space $\mathbb{S}$ endowed with a probability measure $\mu$ is also a topological space, the measure $\mu$ is called Borel if all Borel sets in $\mathbb{S}$ are $\mu$-measurable. Recall that a Borel set is any element of the $\sigma$-algebra generated by all open subsets of $\mathbb{S}$; that is, a Borel subset can be constructed from open subsets with the use of complements and countable unions. Recall further that a probability measure $\mu$ is called regular (see e.g. [88]) if

\[
\mu(E)=\sup\{\mu(C) : C\subseteq E,\ C \text{ closed}\}=\inf\{\mu(D) : E\subseteq D,\ D \text{ open}\} \tag{5.30}
\]

for all Borel sets $E$ in $\mathbb{S}$.

Proposition 5.3 The measure $\mu_p$ is Borel and regular.

Further in the paper we simply say that a p-adic function is measure-preserving or ergodic, meaning that it has these properties with respect to the measure $\mu_p$.

Proposition 5.4 A 1-Lipschitz map $F$ from $\mathbb{Z}_p^n$ to $\mathbb{Z}_p^m$ (which is not necessarily an 'onto' map) is measurable.

Recall that once the map $F$ is 1-Lipschitz, it is an automaton function of an automaton having $n$ inputs and $m$ outputs over the input/output alphabet $\mathbb{F}_p=\{0,1,\ldots,p-1\}$. Given a 1-Lipschitz map $F\colon\mathbb{Z}_p^n\to\mathbb{Z}_p^m$ of $\mathbb{Z}_p^n$ onto $\mathbb{Z}_p^m$, $m\le n$, the map $F \bmod p^k\colon(\mathbb{Z}/p^k\mathbb{Z})^n\to(\mathbb{Z}/p^k\mathbb{Z})^m$ is well defined due to the compatibility of $F$ with every epimorphism $\bmod\ p^k$, for all $k$. Actually, the map $F \bmod p^k$ is the mapping of $n$-tuples of $k$-letter input words to $m$-tuples of $k$-letter output words that the corresponding automaton performs. Therefore we can define the notion of heritable properties with respect to the epimorphism $\bmod\ p^k$. As we focus on measure-preservation and ergodicity, the following notions are the most important further in the paper:
Definition 5.5 (Bijectivity, Transitivity, Balance Modulo $p^k$) A 1-Lipschitz (thus, automaton) function $F\colon\mathbb{Z}_p^n\to\mathbb{Z}_p^m$ is said to be balanced modulo $p^k$ (respectively, bijective modulo $p^k$ or transitive modulo $p^k$) if the reduced map (on $k$-letter words) $F \bmod p^k\colon(\mathbb{Z}/p^k\mathbb{Z})^n\to(\mathbb{Z}/p^k\mathbb{Z})^m$ is balanced (respectively, bijective or transitive).
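Definition 5.5 can be explored numerically by tabulating the reduced map. The sketch below assumes that the 1-Lipschitz map is given by an integer-valued Python function returning a tuple of m coordinates; the names `reduce_map` and `is_balanced_mod` are hypothetical.

```python
from collections import Counter
from itertools import product

def reduce_map(F, p, k, n):
    """Tabulate F mod p^k on (Z/p^k Z)^n; F takes n integers and returns a tuple."""
    q = p ** k
    return {u: tuple(c % q for c in F(*u)) for u in product(range(q), repeat=n)}

def is_balanced_mod(F, p, k, n, m):
    """True if every point of (Z/p^k Z)^m has exactly p^(k(n-m)) preimages under F mod p^k."""
    q = p ** k
    counts = Counter(reduce_map(F, p, k, n).values())
    expected = q ** (n - m)
    return len(counts) == q ** m and all(c == expected for c in counts.values())

add = lambda x, y: (x + y,)        # Z_p^2 -> Z_p, balanced modulo every p^k
mul = lambda x, y: (x * y,)        # not balanced already modulo p
shear = lambda x, y: (x + y, y)    # n = m = 2: balanced here means bijective
```

For n = m the expected number of preimages is 1, so the same test detects bijectivity modulo $p^k$.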
5.4 The p-adic Ergodic Theory of General Automata Functions

The main theorem of the p-adic ergodic theory for 1-Lipschitz (thus, automata) functions is the following:

Theorem 5.6 Let $F$ be a 1-Lipschitz map of $\mathbb{Z}_p^n$ onto $\mathbb{Z}_p^m$, $m\le n$. Whenever $m=n$, the map $F$ preserves the p-adic measure $\mu_p$ (or, accordingly, is ergodic with respect to $\mu_p$) if and only if $F$ is bijective (accordingly, transitive) modulo $p^k$ for all $k=1,2,3,\ldots$. For $n\ge m$, the map $F$ preserves the measure $\mu_p$ if and only if $F$ is balanced modulo $p^k$ for all $k=1,2,3,\ldots$.

From the proof of the theorem the following corollary can be derived:

Corollary 5.7 A 1-Lipschitz function $f\colon\mathbb{Z}_p\to\mathbb{Z}_p$ preserves measure if and only if it is bijective modulo $p^k$ for all $k=1,2,\ldots$.

Note 5.8 As a bonus we have that every 1-Lipschitz measure-preserving function $f\colon\mathbb{Z}_p\to\mathbb{Z}_p$ is an isometry: The distance between two points is just the radius of the smallest ball that contains them both; however, a measure-preserving 1-Lipschitz mapping is a bijection that merely permutes balls of pairwise equal radii.

We now take the opportunity to fix a flaw in the proof of the 'if' part of the ergodicity criterion for 1-Lipschitz functions $F\colon\mathbb{Z}_p^n\to\mathbb{Z}_p^n$ that occurred in [9, 12]:

Proposition 5.9 A 1-Lipschitz function $F\colon\mathbb{Z}_p^n\to\mathbb{Z}_p^n$ is ergodic if and only if $F$ is transitive modulo $p^k$ for all $k=1,2,\ldots$.

Proof The proof of the 'only if' part is flawless, see the respective statements in [9, 12]. We need to prove the 'if' part of the statement only. Under the conditions of Proposition 5.9 the function $F$ is a measure-preserving isometry by Note 5.8, which holds for measure-preserving maps of $\mathbb{Z}_p^n$ to $\mathbb{Z}_p^n$ as well. Moreover, the conditions imply that every $F$-orbit is dense in $\mathbb{Z}_p^n$, since given arbitrary $x,y\in\mathbb{Z}_p^n$ and $k\in\mathbb{N}$ there exists $\ell\in\mathbb{N}_0$ such that $\bar y=\bar F_k^{\ell}(\bar x)$, where $\bar F_k$ stands for the reduced map $F \bmod p^k$ and $\bar x$, $\bar y$ are the respective residues modulo $p^k$ in the Cartesian product $(\mathbb{Z}/p^k\mathbb{Z})^n$; so $|F^{\ell}(x)-y|_p\le p^{-k}$. Therefore $F$ is a minimal isometry on a compact metric space; but all such isometries are known to be uniquely ergodic (thus ergodic), cf. Proposition 5.2.

Corollary 5.10 Given a 1-Lipschitz transformation $F\colon\mathbb{Z}_p^n\to\mathbb{Z}_p^n$ and $x\in\mathbb{Z}_p^n$, every sequence $(F^i(x) \bmod p^k)_{i=0}^{\infty}$, $k=1,2,3,\ldots$, is strictly uniformly distributed if and only if $F$ is ergodic.

Corollary 5.11 Given a 1-Lipschitz ergodic transformation $F\colon\mathbb{Z}_p^n\to\mathbb{Z}_p^n$, every orbit $(F^i(x))_{i=0}^{\infty}$ is uniformly distributed with respect to the measure $\mu_p$.

We stress an important practical conclusion which follows from the main ergodic Theorem 5.6: Measure-preservation (respectively, ergodicity) of an automaton function is equivalent to the bijectivity (respectively, transitivity) of the mapping performed by the automaton on $k$-letter words, for all $k=1,2,3,\ldots$.
Later we will see that in many important cases to guarantee measure-preservation (respectively, ergodicity) of an automaton function it is sufficient to provide bijectivity (respectively, transitivity) of the mapping performed by the automaton on k-letter words for small k. This way one can determine bijectivity (respectively, transitivity) of automata mapping on long words just by verifying these properties on short words.
5.4.1 Ergodicity of Affine Mappings

Affine transformations $f(x)=ax+b$ on the space $\mathbb{Z}_p$, where $a,b\in\mathbb{Z}_p$, are an important special case both for the general theory and for applications. Automata whose automaton function is affine are sometimes called linear; we do not use that term, however. Iterating an affine mapping is a well-known method to produce pseudo-random numbers, the mixed congruential method, see, e.g., [82]. In view of Theorem 5.6 it is clear that $f$ is measure-preserving if and only if $a$ has a multiplicative inverse modulo $p^k$ for all $k=1,2,\ldots$ (that is, $a$ is a unit in $\mathbb{Z}_p$); in other words, if and only if $a\not\equiv 0 \pmod p$.

Theorem 5.12 The function $f(x)=ax+b$, where $a,b\in\mathbb{Z}_p$, is an ergodic transformation on $\mathbb{Z}_p$ if and only if the following conditions hold simultaneously:

\[
b\not\equiv 0 \pmod p; \tag{5.31}
\]
\[
a\equiv 1 \pmod p,\ \text{for } p \text{ odd}; \tag{5.32}
\]
\[
a\equiv 1 \pmod 4,\ \text{for } p=2. \tag{5.33}
\]
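By Theorem 5.6, ergodicity of an affine map amounts to transitivity modulo every $p^k$, so conditions (5.31)-(5.33) can be compared with a brute-force search over residues. The sketch below is only an illustration; the helper names are hypothetical, and for p = 2 it should be run with k ≥ 2, since condition (5.33) is only visible modulo 4.

```python
def is_transitive_mod(f, modulus):
    """True if x -> f(x) mod `modulus` is a single cycle through all residues."""
    seen, x = set(), 0
    while x not in seen:
        seen.add(x)
        x = f(x) % modulus
    return x == 0 and len(seen) == modulus

def affine_criterion(a, b, p):
    """Conditions (5.31)-(5.33) of Theorem 5.12 for f(x) = a*x + b."""
    if b % p == 0:
        return False
    return a % 4 == 1 if p == 2 else a % p == 1

def agrees_with_theorem(p, k):
    """Compare the criterion with brute force for all residues a, b modulo p^k."""
    q = p ** k
    return all(
        is_transitive_mod(lambda x, a=a, b=b: a * x + b, q) == affine_criterion(a, b, p)
        for a in range(q)
        for b in range(q)
    )
```

For example, `agrees_with_theorem(3, 2)` enumerates all 81 affine maps on Z/9Z and compares each with the criterion.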
5.4.2 Ergodicity and Measure-Preservation in Terms of Coordinate Functions

Recall that according to Proposition 2.13 every 1-Lipschitz function $f\colon\mathbb{Z}_2\to\mathbb{Z}_2$ can be represented in the form

\[
f\Big(\sum_{i=0}^{\infty}\chi_i\cdot 2^i\Big)=\sum_{i=0}^{\infty}\psi_i(\chi_0,\ldots,\chi_i)\cdot 2^i \tag{5.34}
\]

where $\chi_i\in\{0,1\}$, and each $i$-th coordinate function $\psi_i(\chi_0,\ldots,\chi_i)=\delta_i(f(x))$ is a Boolean function in the Boolean variables $\chi_0,\ldots,\chi_i$; that is, $\psi_i\colon\{0,1\}^{i+1}\to\{0,1\}$, $i=0,1,2,\ldots$. Recall that the algebraic normal form, the ANF, of the Boolean function $\psi_i(\chi_0,\ldots,\chi_i)$ is a representation of this function via $\oplus$ (addition modulo 2, that is, logical 'exclusive or') and $\cdot$ (multiplication modulo 2, that is, logical 'and', or conjunction). In other words, the ANF of the Boolean function $\psi$ is its representation in the form

\[
\psi(\chi_0,\ldots,\chi_j)=\beta\oplus\beta_0\chi_0\oplus\beta_1\chi_1\oplus\ldots\oplus\beta_{0,1}\chi_0\chi_1\oplus\ldots,
\]

where $\beta,\beta_0,\ldots\in\{0,1\}$ and $\chi_0,\ldots,\chi_j$ are Boolean variables; that is, as an element of the residue ring of the ring of polynomials $\mathbb{F}_2[\chi_0,\ldots,\chi_j]$ over $\mathbb{F}_2$ in the variables $\chi_0,\ldots,\chi_j$ modulo the ideal generated by $\chi_k^2-\chi_k$, $k=0,1,\ldots,j$. Recall that the weight of the Boolean function $\psi$ in $(j+1)$ variables is the number of $(j+1)$-bit words that satisfy $\psi$; that is, the weight is the cardinality of the truth set of $\psi$, and the truth set of $\psi$ is the set of all points from $\{0,1\}^{j+1}$ where $\psi$ takes the value 1.

The following theorem has been circulating among people working in the theory of Boolean functions at least since the end of the 1970s; however, the author of the theorem is still not known. We give an equivalent statement of the theorem in terms of measure-preservation/ergodicity rather than in terms of bijectivity/transitivity, as in its original form.

Theorem 5.13 (Folklore) The function $f$ defined by equation (5.34) is measure-preserving if and only if for every $i=0,1,\ldots$ the ANF of the $i$-th coordinate function is $\psi_i(\chi_0,\ldots,\chi_i)=\chi_i\oplus\varphi_i(\chi_0,\ldots,\chi_{i-1})$, where $\varphi_i$ is the ANF of a Boolean function in the Boolean variables $\chi_0,\ldots,\chi_{i-1}$, and $\varphi_0$ is a constant from $\{0,1\}$. The function $f$ is ergodic if and only if, additionally, $\varphi_0=1$, and every Boolean function $\varphi_i$ is of odd weight, that is, takes the value 1 at an odd number of points from $\{0,1\}^i$ for $i=1,2,\ldots$. The latter condition holds if and only if the degree of the ANF of $\varphi_i$ for $i\ge 1$ is exactly $i$, that is, if and only if the ANF of $\varphi_i$ contains the monomial $\chi_0\cdots\chi_{i-1}$.
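The conditions of Theorem 5.13 can be checked bit by bit for a T-function given by integer arithmetic. The sketch below is an illustration only: it inspects finitely many coordinate functions (so it can refute, but not prove, the conditions), it assumes the function is 1-Lipschitz so that bit i of f(x) depends only on the low i+1 bits of x, and all helper names are hypothetical.

```python
def coordinate_function(f, i):
    """Truth table of psi_i: entry x (bits chi_0..chi_i of x) holds bit i of f(x)."""
    return [(f(x) >> i) & 1 for x in range(1 << (i + 1))]

def anf(truth_table):
    """Binary Moebius transform: entry s is the ANF coefficient of the monomial with bitmask s."""
    coeffs = list(truth_table)
    n_vars = len(coeffs).bit_length() - 1
    for j in range(n_vars):
        for s in range(len(coeffs)):
            if s & (1 << j):
                coeffs[s] ^= coeffs[s ^ (1 << j)]
    return coeffs

def folklore_check(f, depth):
    """Return (measure_preserving, ergodic) as far as psi_0..psi_{depth-1} can tell."""
    ergodic = True
    for i in range(depth):
        table = coordinate_function(f, i)
        half = 1 << i
        phi_low = table[:half]                       # psi_i restricted to chi_i = 0
        phi_high = [bit ^ 1 for bit in table[half:]] # psi_i XOR chi_i on chi_i = 1
        if phi_low != phi_high:                      # psi_i is not chi_i XOR phi_i(lower bits)
            return False, False
        if sum(phi_low) % 2 == 0:                    # phi_i has even weight
            ergodic = False
    return True, ergodic

klimov_shamir = lambda x: x + ((x * x) | 5)   # a well-known single-cycle T-function
```

For instance, for $f(x)=x+1$ the function $\varphi_2$ has truth table `[0, 0, 0, 1]`, whose ANF is the single monomial $\chi_0\chi_1$ of degree 2, in agreement with the last sentence of the theorem.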
The theorem is used in proofs of other measure-preservation/ergodicity criteria in terms of different representations of automata functions of automata whose input/output alphabets are $\mathbb{F}_2$. Also, numerous examples of measure-preserving/ergodic automata functions of that kind can be derived from the theorem:

Example 5.14 Given an arbitrary 1-Lipschitz function $g\colon\mathbb{Z}_2\to\mathbb{Z}_2$ and an arbitrary 1-Lipschitz measure-preserving function $f\colon\mathbb{Z}_2\to\mathbb{Z}_2$, both functions $u(x)=f(x)+2\cdot g(x)$ and $v(x)=f(x)\ \mathrm{XOR}\ (2\cdot g(x))$ are measure-preserving. Given an arbitrary 1-Lipschitz function $g\colon\mathbb{Z}_2\to\mathbb{Z}_2$ and an arbitrary 1-Lipschitz ergodic function $f\colon\mathbb{Z}_2\to\mathbb{Z}_2$, all the functions $u(x)=f(x)+4\cdot g(x)$, $u(x)=f(x)\ \mathrm{XOR}\ (4\cdot g(x))$, $u(x)=f(x+4\cdot g(x))$, and $u(x)=f(x\ \mathrm{XOR}\ (4\cdot g(x)))$ are ergodic.

The following function $\mathrm{inv}_p$, the generalized inverse on $\mathbb{Z}_p$, was studied in the literature:

\[
\mathrm{inv}_p(x)=p^{\mathrm{ord}_p x}\cdot\left(\frac{x}{p^{\mathrm{ord}_p x}}\right)^{-1}_{p}. \tag{5.35}
\]

It can be shown that the function $\mathrm{inv}_p$ is well defined on $\mathbb{Z}_p$ (since $\lim_{x\to 0}\mathrm{inv}_p(x)=0$), that it is an automaton function which is infinitely many times differentiable everywhere on $\mathbb{Z}_p\setminus\{0\}$, that it is not differentiable at 0, and that $\mathrm{inv}_p$ is measure-preserving. The following proposition can be proved by using Theorem 5.13:

Proposition 5.15 Let $f$ be any 1-Lipschitz transformation on $\mathbb{Z}_2$. If $f$ is ergodic, then both compositions $f(\mathrm{inv}_2(x))$ and $\mathrm{inv}_2(f(x))$ are ergodic. Vice versa, if either of the transformations $f(\mathrm{inv}_2(x))$ or $\mathrm{inv}_2(f(x))$ is ergodic, then $f$ is ergodic.

From that proposition a number of results on ergodicity of composite automata functions with $\mathrm{inv}_2$ can be derived, both known and new. For instance, in [41] it is proved that given $a,b\in\mathbb{Z}$, the function $f(x)=a\cdot\mathrm{inv}_2(x)+b$ is transitive modulo $2^n$, $n\ge 2$, if and only if $a\equiv 1 \pmod 4$ and $b\equiv 1 \pmod 2$. The result immediately follows by combining Theorems 5.13 and 5.12. More complex ergodic automata functions can be constructed with the use of Proposition 5.15. For instance, the following automata functions are ergodic on $\mathbb{Z}_2$: $f(x)=3\cdot\mathrm{inv}_2(x)+3^{\mathrm{inv}_2(x)}$; $f(x)=\mathrm{inv}_2(1+x)+4\cdot(1+\mathrm{inv}_2(2x))^{\mathrm{inv}_2(x)}$; $f(x)=\mathrm{inv}_2(2x^2)+\mathrm{inv}_2(7x)+1$ and $f(x)=\mathrm{inv}_2(2x^2+7x+1)$, etc.

Due to the importance of Theorem 5.13, some attempts have been made to extend its statement to the case $p>2$. However, there is a serious obstacle, since for $p>2$ the coordinate functions are polynomials over $\mathbb{F}_p$, and to prove the theorem one must have a criterion for when these polynomials are bijective with respect to either variable; but no general criterion is known in the theory of finite fields. We have managed to handle the case $p=2$ since every mapping $\mathbb{F}_2\to\mathbb{F}_2$ can be represented by a polynomial of degree $\le 1$; but for larger $p$ that is not the case: General bijective polynomials are of degree $\le p-2$ for $p>2$, and that is why their characterization is not known in the theory of finite fields. For that reason, currently only the case $p=3$ is handled; i.e., there are measure-preservation/ergodicity criteria similar to that of Theorem 5.13 for that case, cf. [39, 68]. However, for the general case $p>3$ it is still a problem to find such criteria, though some substantial work in that direction has been done, cf. [74]. Also, no result similar to that of Proposition 5.15 is known for the case $p>2$, though it can be shown that either of the compositions $f(\mathrm{inv}_p(x))$ or $\mathrm{inv}_p(f(x))$ is measure-preserving if and only if the automaton function $f$ is measure-preserving.
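The generalized inverse $\mathrm{inv}_2$ discussed above can be evaluated modulo $2^n$ with integer arithmetic, which makes the transitivity statement quoted from [41] easy to experiment with. The sketch below is an illustration under stated assumptions: `inv2_mod` is a hypothetical helper, and it relies on Python 3.8+ where `pow(u, -1, m)` returns a modular inverse.

```python
def inv2_mod(x, n):
    """inv_2(x) reduced modulo 2**n: keep the power of 2, invert the odd part."""
    modulus = 1 << n
    x %= modulus
    if x == 0:
        return 0
    order = (x & -x).bit_length() - 1      # ord_2(x)
    unit = x >> order                      # odd part of x
    return ((1 << order) * pow(unit, -1, modulus)) % modulus

def is_transitive_mod(f, modulus):
    """True if x -> f(x) mod `modulus` is a single cycle through all residues."""
    seen, x = set(), 0
    while x not in seen:
        seen.add(x)
        x = f(x) % modulus
    return x == 0 and len(seen) == modulus

def affine_inv2(a, b, n):
    """The map x -> a * inv_2(x) + b considered modulo 2**n."""
    return lambda x: a * inv2_mod(x, n) + b
```

For example, `is_transitive_mod(affine_inv2(1, 1, 4), 16)` follows the orbit 0, 1, 2, 3, 12, 13, 6, 7, ... of $x\mapsto\mathrm{inv}_2(x)+1$ through all 16 residues.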
5.4.3 Ergodicity and Measure-Preservation in Terms of Mahler Expansion

Recall that every continuous function $f\colon\mathbb{Z}_p\to\mathbb{Z}_p$ can be expressed via the Mahler interpolation series, see (3.13):

\[
f(x)=\sum_{i=0}^{\infty}a_i\binom{x}{i},
\]

where $a_i\in\mathbb{Z}_p$, $i=0,1,2,\ldots$. We are now going to describe how one can determine from the coefficients $a_i$ whether $f$ is a measure-preserving or, respectively, ergodic automaton (that is, 1-Lipschitz) function:

Theorem 5.16 The function $f$ defines a 1-Lipschitz measure-preserving transformation on $\mathbb{Z}_p$ whenever the following conditions hold simultaneously:

\[
a_1\not\equiv 0 \pmod p; \tag{5.36}
\]
\[
a_i\equiv 0 \pmod{p^{\lfloor\log_p i\rfloor+1}},\quad i=2,3,\ldots. \tag{5.37}
\]

The function $f$ defines a 1-Lipschitz ergodic transformation on $\mathbb{Z}_p$ whenever the following conditions hold simultaneously:

\[
a_0\not\equiv 0 \pmod p; \tag{5.38}
\]
\[
a_1\equiv 1 \pmod p,\ \text{for } p \text{ odd}; \tag{5.39}
\]
\[
a_1\equiv 1 \pmod 4,\ \text{for } p=2; \tag{5.40}
\]
\[
a_i\equiv 0 \pmod{p^{\lfloor\log_p (i+1)\rfloor+1}},\quad i=2,3,\ldots. \tag{5.41}
\]

Moreover, in the case $p=2$ these conditions are necessary: Namely, if $f$ is 1-Lipschitz and measure-preserving then conditions (5.36) and (5.37) hold simultaneously; if $f$ is 1-Lipschitz and ergodic then conditions (5.38), (5.40) and (5.41) hold simultaneously.

During the proof of the theorem some extra results were established which are of interest in their own right.
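For a polynomial with integer coefficients the Mahler coefficients are the forward differences at 0, and only finitely many of them are nonzero, so the conditions of Theorem 5.16 can be tested directly. The following is a minimal sketch under that assumption (`count` must exceed the degree); the helper names are hypothetical and `math.comb` requires Python 3.8+.

```python
from math import comb

def mahler_coefficients(f, count):
    """a_i = sum_{j<=i} (-1)^(i-j) C(i,j) f(j), the i-th forward difference of f at 0."""
    return [
        sum((-1) ** (i - j) * comb(i, j) * f(j) for j in range(i + 1))
        for i in range(count)
    ]

def floor_log(p, n):
    """floor(log_p n) for an integer n >= 1, computed without floating point."""
    exponent = 0
    while n >= p:
        n //= p
        exponent += 1
    return exponent

def satisfies_measure_preservation(a, p):
    """Conditions (5.36)-(5.37) on the listed coefficients a[0..]."""
    if a[1] % p == 0:
        return False
    return all(a[i] % p ** (floor_log(p, i) + 1) == 0 for i in range(2, len(a)))

def satisfies_ergodicity(a, p):
    """Conditions (5.38)-(5.41) on the listed coefficients a[0..]."""
    if a[0] % p == 0:
        return False
    if (a[1] % 4 != 1) if p == 2 else (a[1] % p != 1):
        return False
    return all(a[i] % p ** (floor_log(p, i + 1) + 1) == 0 for i in range(2, len(a)))

coeffs = mahler_coefficients(lambda x: 4 * x * x + x + 1, 5)   # [1, 5, 8, 0, 0]
```

Here $4x^2+x+1=1+5\binom{x}{1}+8\binom{x}{2}$ satisfies the ergodicity conditions for $p=2$, whereas $x^2=\binom{x}{1}+2\binom{x}{2}$ violates (5.37).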
Lemma 5.17 Given a 1-Lipschitz function $v\colon\mathbb{Z}_p\to\mathbb{Z}_p$ and p-adic integers $c$, $d$, $c\not\equiv 0 \pmod p$, the function $g(x)=d+cx+p\cdot v(x)$ preserves measure, and the function $h(x)=c+x+p\cdot\Delta v(x)$ is ergodic. (Recall that $\Delta$ is the difference operator: $\Delta v(x)=v(x+1)-v(x)$ by definition.) Moreover, when $p=2$, every measure-preserving (respectively, ergodic) 1-Lipschitz function can be represented as $d+x+2\cdot v(x)$ (respectively, as $1+x+2\cdot\Delta v(x)$) for a suitable $d\in\mathbb{Z}_2$ and a suitable 1-Lipschitz function $v\colon\mathbb{Z}_2\to\mathbb{Z}_2$.

The lemma gives an easy practical way to construct a measure-preserving/ergodic automaton function out of a given automaton function. It is worth noticing here that once $v$ is a finite automaton function, the functions $g$ and $h$ are also finite automaton functions.

From Theorem 5.16 numerous results on measure-preservation/ergodicity for special classes of automata functions can be derived. For instance, for every prime $p$ and every $a\equiv 1 \pmod p$ the function $f(x)=ax+a^x$ is a 1-Lipschitz ergodic automaton function. The function $f$ is a B-function (see Sect. 4.2.2); if $p=2$, the following explicit representations for measure-preserving/ergodic B-functions can be derived from Theorem 5.16:

Proposition 5.18 Let $f$ be a B-function on $\mathbb{Z}_2$; that is, let $f(x)=\sum_{i=0}^{\infty}e_i\cdot x^{\underline{i}}$ for suitable $e_i\in\mathbb{Z}_2$, $i=0,1,2,\ldots$, where $x^{\underline{i}}$ stands for the $i$-th descending factorial power of $x$. The function $f$ is measure-preserving if and only if

\[
e_1\equiv 1 \pmod 2,\qquad e_2\equiv 0 \pmod 2,\qquad e_3\equiv 0 \pmod 2.
\]

The function $f$ is ergodic on $\mathbb{Z}_2$ if and only if

\[
e_0\equiv 1 \pmod 2,\qquad e_1\equiv 1 \pmod 4,\qquad e_2\equiv 0 \pmod 2,\qquad e_3\equiv 0 \pmod 4.
\]
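The construction $h(x)=c+x+p\cdot\Delta v(x)$ from Lemma 5.17 above is easy to try out when $v$ is a polynomial with integer coefficients (such polynomials are 1-Lipschitz). The sketch below is illustrative only; `difference` and `ergodic_from` are hypothetical names, and transitivity is checked by brute force modulo a few powers of p.

```python
def difference(v):
    """Difference operator: (Delta v)(x) = v(x + 1) - v(x)."""
    return lambda x: v(x + 1) - v(x)

def ergodic_from(v, p, c):
    """h(x) = c + x + p * Delta v(x) as in Lemma 5.17; c must be nonzero modulo p."""
    if c % p == 0:
        raise ValueError("c must be nonzero modulo p")
    delta_v = difference(v)
    return lambda x: c + x + p * delta_v(x)

def is_transitive_mod(f, modulus):
    """True if x -> f(x) mod `modulus` is a single cycle through all residues."""
    seen, x = set(), 0
    while x not in seen:
        seen.add(x)
        x = f(x) % modulus
    return x == 0 and len(seen) == modulus

h3 = ergodic_from(lambda x: x * x, p=3, c=1)    # equals 7x + 4
h2 = ergodic_from(lambda x: x ** 3, p=2, c=1)   # equals 6x^2 + 7x + 3
```

With $v(x)=x^2$ and $p=3$ the construction returns the affine map $7x+4$, which also satisfies the conditions of Theorem 5.12; with $v(x)=x^3$ and $p=2$ it returns $6x^2+7x+3$, which induces the same map modulo 8 as $2x^2+3x+3$.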
Moreover, by using Theorem 5.16 the results on explicit representations of ergodic C-functions (see Sect. 4.2.1) can be refined, especially for polynomials over $\mathbb{Z}_2$. We summarize the results in the following proposition:

Proposition 5.19 A C-function $f\colon\mathbb{Z}_2\to\mathbb{Z}_2$ is ergodic on $\mathbb{Z}_2$ if and only if $f$ is transitive modulo 8. A complete list of polynomial transitive transformations on $\mathbb{Z}/8\mathbb{Z}$ is as follows:

\[
\begin{array}{llll}
x+1 & 5x+1 & 2x^2+3x+1 & 2x^2+7x+1\\
x+3 & 5x+3 & 2x^2+3x+3 & 2x^2+7x+3\\
x+5 & 5x+5 & 2x^2+3x+5 & 2x^2+7x+5\\
x+7 & 5x+7 & 2x^2+3x+7 & 2x^2+7x+7
\end{array}
\]
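The list above can be checked mechanically: each listed polynomial should induce a single cycle on Z/8Z, and the sixteen induced maps should be pairwise distinct. The sketch below does exactly that; the encoding of a polynomial as a triple `(c2, c1, c0)` is an assumption made for this illustration.

```python
def is_transitive_mod(f, modulus):
    """True if x -> f(x) mod `modulus` is a single cycle through all residues."""
    seen, x = set(), 0
    while x not in seen:
        seen.add(x)
        x = f(x) % modulus
    return x == 0 and len(seen) == modulus

def poly(c2, c1, c0):
    """The polynomial c2*x^2 + c1*x + c0 as a Python function."""
    return lambda x: c2 * x * x + c1 * x + c0

ODD = (1, 3, 5, 7)
# (c2, c1, c0) for the sixteen entries of the list in Proposition 5.19
LISTED = [(0, c1, c0) for c1 in (1, 5) for c0 in ODD] + \
         [(2, c1, c0) for c1 in (3, 7) for c0 in ODD]

all_transitive = all(is_transitive_mod(poly(*c), 8) for c in LISTED)
value_tables = {tuple(poly(*c)(x) % 8 for x in range(8)) for c in LISTED}
```

The set `value_tables` collects the maps as value tables, so its size counts how many distinct transformations of Z/8Z the list describes.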
Let the C-function $f\colon\mathbb{Z}_2\to\mathbb{Z}_2$ be represented via the power series $f(x)=\sum_{i=0}^{\infty}c_i x^i$, $c_i\in\mathbb{Z}_2$, $i=0,1,2,\ldots$. Then the function $f$ is ergodic if and only if the following conditions hold simultaneously:

\[
\begin{aligned}
c_3+c_5+c_7+\cdots &\equiv 2c_2 \pmod 4;\\
c_4+c_6+c_8+\cdots &\equiv c_1+c_2-1 \pmod 4;\\
c_1 &\equiv 1 \pmod 2;\\
c_0 &\equiv 1 \pmod 2.
\end{aligned}
\]

Historical remarks: Theorem 5.16 is proved in [13] for the case $p=2$ and in [15] for the general case; Proposition 5.18 is proved in [15]. Proposition 5.19 is proved in [87]; a more elementary proof of the second part of the proposition is given in [2] as a consequence of the result on level transitivity of polynomial action on a rooted tree, which constitutes the main result of that paper. Polynomial mappings modulo powers of a prime $p$ were investigated in [36]. A complete analog of Theorem 5.16 for the q-Mahler basis (cf. Note 3.3) was obtained in [66]. It is important to stress here that the proofs of [66] do not rely on Theorem 5.13, in contrast to the proof from [13]. For B-functions a general criterion of measure-preservation was proved in [68], which yields:

Proposition 5.20¹ Let $f\colon\mathbb{Z}_p\to\mathbb{Z}_p$ be a B-function represented in Mahler's expansion as in (3.18). Then $f$ is measure-preserving if and only if the following conditions are satisfied:

• $\{f(0),f(1),\ldots,f(p-1)\}$ is a complete set of distinct residues modulo $p$;
• For all $0\le i<p$,
\[
\sum_{m=0}^{p-1}\lambda_{im}c_m+\sum_{m=0}^{i}\binom{i}{m}c_{m+p}\not\equiv 0 \pmod p,
\qquad\text{where}\quad
\lambda_{im}=\frac{1}{p}\left(\binom{i+p}{m}-\binom{i}{m}\right).
\]
Proof It suffices to show that for $0\le i<p$ the normalized van der Put coefficient $b_{i+p}$ is equal to the left-hand side of the condition. From the Mahler expansion of $f$ we have

\[
\begin{aligned}
b_{i+p}&=\frac{1}{p}\bigl(f(i+p)-f(i)\bigr)
=\frac{1}{p}\left(\sum_{m=0}^{i+p}p^{\lfloor\log_p m\rfloor}c_m\binom{i+p}{m}-\sum_{m=0}^{i}p^{\lfloor\log_p m\rfloor}c_m\binom{i}{m}\right)\\
&=\sum_{m=0}^{p-1}\frac{1}{p}\left(\binom{i+p}{m}-\binom{i}{m}\right)c_m+\sum_{m=0}^{i}\binom{i+p}{m+p}c_{m+p}.
\end{aligned}
\]

The result follows by application of Lucas' congruence theorem to the second sum above. Recall that the latter theorem states the following: Let $r=\sum_{i=0}^{N}r_i p^i$ and $n=\sum_{i=0}^{N}n_i p^i$ be the base-$p$ expansions of $r,n\in\mathbb{N}_0$, where $r_i,n_i\in\{0,1,\ldots,p-1\}$ ($i=0,1,2,\ldots$); then the following congruence for binomial coefficients holds:

\[
\binom{r}{n}\equiv\binom{r_0}{n_0}\binom{r_1}{n_1}\cdots\binom{r_N}{n_N} \pmod p,
\]

see, e.g., [5].

¹ The author is grateful to an anonymous referee for communicating the corrected statement of the proposition and the corresponding proof.
The latter result was extended to automata functions which are uniformly differentiable modulo $p$, cf. [67]; for differentiability modulo $p$ see Definition 5.27 further. For $p\ne 2$, the conditions of Theorem 5.16 are sufficient but not necessary. For $p>2$, conditions which are both sufficient and necessary were found in [102]; the first one is the first condition of Proposition 5.20, but the remaining ones are somewhat too lengthy to be reproduced in the current paper, and it is not very easy to determine whether they hold for a given automaton function in Mahler's expansion. Nevertheless, the conditions are helpful to determine ergodicity of automata functions for smaller $p$, see the examples in [102].
5.4.4 Ergodicity and Measure-Preservation in Terms of van der Put Expansion

Here we consider measure-preservation and ergodicity criteria for 1-Lipschitz transformations on $\mathbb{Z}_p$. For the case $p=2$ these criteria were found in [24]. Recall that for the case $p=2$ the 1-Lipschitz functions are also called T-functions. We use that term to stress that we deal with the case $p=2$ only; that is, when we discuss ergodic properties of automata functions of automata whose input/output alphabets are $\mathbb{F}_2=\{0,1\}$.

Theorem 5.21 Let $f\colon\mathbb{Z}_2\to\mathbb{Z}_2$ be a T-function represented via the van der Put series (3.19):
\[
f(x)=\sum_{m=0}^{\infty}B_m\chi(m,x).
\]
The T-function $f$ is measure-preserving if and only if the following conditions hold simultaneously:
(i) $B_0+B_1\equiv 1 \pmod 2$;
(ii) $|B_m|_2=2^{-\lfloor\log_2 m\rfloor}$, $m=2,3,\ldots$.

Corollary 5.22 A map $f\colon\mathbb{Z}_2\to\mathbb{Z}_2$ is a measure-preserving T-function if and only if it can be represented as
\[
f(x)=b_0\chi(0,x)+b_1\chi(1,x)+\sum_{m=2}^{\infty}2^{\lfloor\log_2 m\rfloor}b_m\chi(m,x),
\]
where $b_m\in\mathbb{Z}_2$, and the following conditions hold simultaneously:
(i) $b_0+b_1\equiv 1 \pmod 2$;
(ii) $b_m\equiv 1 \pmod 2$, $m=2,3,4,\ldots$.

Theorem 5.23 A T-function $f\colon\mathbb{Z}_2\to\mathbb{Z}_2$ is ergodic if and only if it can be represented as
\[
f(x)=b_0\chi(0,x)+b_1\chi(1,x)+\sum_{m=2}^{\infty}2^{\lfloor\log_2 m\rfloor}b_m\chi(m,x)
\]
for suitable $b_m\in\mathbb{Z}_2$ that satisfy the following conditions:
(i) $b_0\equiv 1 \pmod 2$;
(ii) $b_0+b_1\equiv 3 \pmod 4$;
(iii) $|b_m|_2=1$, $m\ge 2$;
(iv) $b_2+b_3\equiv 2 \pmod 4$;
(v) $\sum_{m=2^{n-1}}^{2^n-1}b_m\equiv 0 \pmod 4$, $n\ge 3$.

Proposition 5.24 Let $f\colon\mathbb{Z}_2\to\mathbb{Z}_2$ be a T-function which is represented by the van der Put series (3.19). Then $f$ is ergodic if and only if the following conditions are satisfied simultaneously:
(i) $B_0\equiv 1 \pmod 2$;
(ii) $B_0+B_1\equiv 3 \pmod 4$;
(iii) $|B_m|_2=2^{-(n-1)}$, $n\ge 2$, $2^{n-1}\le m\le 2^n-1$;
(iv) $\left|\sum_{m=2^{n-1}}^{2^n-1}\bigl(B_m-2^{n-1}\bigr)\right|_2\le 2^{-(n+1)}$, $n\ge 2$.
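The criteria above can be tried on T-functions given by integer arithmetic. The sketch below assumes the usual definition of van der Put coefficients, which is not restated in this section: $B_m=f(m)$ for $m<p$, and $B_m=f(m)-f(m-q(m))$ otherwise, where $q(m)$ is the leading term of the base-$p$ expansion of $m$. Only finitely many coefficients are inspected, so the checks can refute but not prove the conditions; all function names are hypothetical.

```python
def van_der_put_coefficients(f, count, p=2):
    """B_0..B_{count-1}, assuming B_m = f(m) - f(m - leading term of m) for m >= p."""
    coeffs = []
    for m in range(count):
        if m < p:
            coeffs.append(f(m))
            continue
        top = 1
        while top * p <= m:
            top *= p
        lead = (m // top) * top          # leading term of the base-p expansion of m
        coeffs.append(f(m) - f(m - lead))
    return coeffs

def ord2(n):
    """2-adic valuation of a nonzero integer."""
    return (n & -n).bit_length() - 1

def is_measure_preserving_T(B):
    """Conditions (i)-(ii) of Theorem 5.21 on the available coefficients."""
    if (B[0] + B[1]) % 2 != 1:
        return False
    return all(B[m] != 0 and ord2(B[m]) == m.bit_length() - 1 for m in range(2, len(B)))

def is_ergodic_T(B, levels):
    """Conditions (i)-(iv) of Proposition 5.24 for n = 2..levels; needs len(B) >= 2**levels."""
    if B[0] % 2 != 1 or (B[0] + B[1]) % 4 != 3:
        return False
    for n in range(2, levels + 1):
        block = range(2 ** (n - 1), 2 ** n)
        if any(B[m] == 0 or ord2(B[m]) != n - 1 for m in block):
            return False
        if sum(B[m] - 2 ** (n - 1) for m in block) % 2 ** (n + 1) != 0:
            return False
    return True

B_succ = van_der_put_coefficients(lambda x: x + 1, 16)
```

For $f(x)=x+1$ one gets $B_0=1$, $B_1=2$ and $B_m=2^{\lfloor\log_2 m\rfloor}$ for $m\ge 2$, so every sum in condition (iv) vanishes.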
The above criteria can be especially helpful when one needs to determine measure-preservation/ergodicity of an automaton function which is a composition of various processor instructions (cf. Definition 3.5); for instance:

Example 5.25 Given a sequence $c,c_0,c_1,c_2,\ldots$ of 2-adic integers, the series
\[
c+\sum_{i=0}^{\infty}c_i\delta_i(x) \tag{5.42}
\]
defines an ergodic T-function $f\colon\mathbb{Z}_2\to\mathbb{Z}_2$ if and only if the following conditions hold simultaneously:
(i) $c\equiv 1 \pmod 2$;
(ii) $c_0\equiv 1 \pmod 4$;
(iii) $|c_i|_2=2^{-i}$, for $i=1,2,3,\ldots$.

Example 5.26 The following T-function $f$ is ergodic on $\mathbb{Z}_2$:
\[
f(x)=1+\delta_0(x)+6\delta_1(x)+\sum_{k=2}^{\infty}\bigl(1+2(x\ \mathrm{AND}\ (2^k-1))\bigr)2^k\delta_k(x).
\]
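The T-function of Example 5.26 is built from bit extraction, masking and arithmetic, so it can be evaluated modulo $2^k$ directly. The sketch below is an illustration only: `example_5_26` and `is_transitive_mod` are hypothetical names, and the check covers only the small moduli chosen by the caller.

```python
def delta(i, x):
    """The i-th binary digit delta_i(x) of a non-negative integer x."""
    return (x >> i) & 1

def example_5_26(x, bits):
    """The series of Example 5.26 modulo 2**bits; terms with k >= bits vanish modulo 2**bits."""
    total = 1 + delta(0, x) + 6 * delta(1, x)
    for k in range(2, bits):
        total += (1 + 2 * (x & ((1 << k) - 1))) * (1 << k) * delta(k, x)
    return total % (1 << bits)

def is_transitive_mod(f, modulus):
    """True if x -> f(x) mod `modulus` is a single cycle through all residues."""
    seen, x = set(), 0
    while x not in seen:
        seen.add(x)
        x = f(x) % modulus
    return x == 0 and len(seen) == modulus

def transitive_on_words(bits):
    """Transitivity of the reduced map on `bits`-letter words."""
    return is_transitive_mod(lambda x: example_5_26(x, bits), 1 << bits)
```

Modulo 8 the orbit of 0 is 0, 1, 2, 7, 4, 5, 6, 3, which is a full cycle, as Theorem 5.6 requires of an ergodic T-function.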
Measure-preservation criteria for automata functions in the case $p>2$ were found in [73] and are actually of a form similar to that of Lemma 5.17, where the affine summand is replaced by a function having a special form in the van der Put basis. The result is a bit too lengthy to be reproduced in the current paper.
5.5 Measure-Preservation and Ergodicity of Uniformly Differentiable Automata Functions

Uniformly differentiable automata functions constitute an important and wide subclass of automata functions; as we already know, the class includes A-, B-, and C-functions. Moreover, there are many automata functions which are not uniformly differentiable on $\mathbb{Z}_p$ but are close to such functions in the following sense.

Definition 5.27 (Differentiability Modulo $p^k$) Given $k\in\mathbb{N}$, a function $f\colon\mathbb{Z}_p\to\mathbb{Z}_p$ is said to be differentiable modulo $p^k$ at the point $x\in\mathbb{Z}_p$ if there exists $f'_k(x)\in\mathbb{Q}_p$ (called the derivative modulo $p^k$ of the function $f$ at the point $x$) such that

\[
f(x+h)\equiv f(x)+f'_k(x)\cdot h \pmod{p^{\mathrm{ord}_p h+k}} \tag{5.43}
\]

for all sufficiently small $h\in\mathbb{Z}_p$. The function $f$ is called uniformly differentiable modulo $p^k$ (on $\mathbb{Z}_p$) if it is differentiable modulo $p^k$ at every point $x\in\mathbb{Z}_p$ and $N_k$ from Definition 4.8 does not depend on $x$; that is, there exists $N\in\mathbb{N}$ such that congruence (5.43) holds simultaneously for all $x\in\mathbb{Z}_p$ and all $h\in\mathbb{Z}_p$ once $\mathrm{ord}_p h\ge N$. The smallest $N$ with this property is denoted via $N_k(f)$.

From Definitions 4.8 and 5.27 we easily conclude that if the function $f$ is differentiable at $x$ then it is differentiable modulo $p^k$ for all $k\in\mathbb{N}$, and that $f'(x)\equiv f'_k(x) \pmod{p^k}$. Speaking loosely, the differentiability of $f$ means that

\[
\frac{f(x+h)-f(x)}{h}\approx f'_k(x)
\]

with arbitrarily high accuracy, whereas in the case of differentiability modulo $p^k$ the accuracy is only not worse than $p^{-k}$.

Note 5.28 It is not difficult to show that derivatives modulo $p^k$ of automata functions are p-adic integers, cf. Note 4.9. Hence, as the derivative modulo $p^k$ is defined up to a term that is 0 modulo $p^k$, for a 1-Lipschitz function $f$ we may assume when convenient that $f'_k(x)\in\mathbb{Z}/p^k\mathbb{Z}=\{0,1,\ldots,p^k-1\}$; so given a 1-Lipschitz function $f\colon\mathbb{Z}_p\to\mathbb{Z}_p$ that is differentiable modulo $p^k$ at every point $x\in\mathbb{Z}_p$, the derivative $f'_k$ maps $\mathbb{Z}_p$ into the residue ring $\mathbb{Z}/p^k\mathbb{Z}$.

To deal with multivariate p-adic functions, recall that the (ultra)metric on $\mathbb{Q}_p^n$ is defined as follows: Given $a=(a_1,\ldots,a_n)$, $b=(b_1,\ldots,b_n)\in\mathbb{Q}_p^n$, $|a-b|_p=\max\{|a_i-b_i|_p : i=1,2,\ldots,n\}$. Let $s\in\mathbb{N}$; let $a=(a_1,\ldots,a_n)$, $b=(b_1,\ldots,b_n)\in\mathbb{Q}_p^n$. We write $a\equiv b \pmod{p^s}$ if and only if $|a_i-b_i|_p\le p^{-s}$; or, which is the same, if and only if $a_i=b_i+c_i p^s$ for suitable $c_i\in\mathbb{Z}_p$, $i=1,2,\ldots,n$. In other words, further $a\equiv b \pmod{p^s}$ stands for $|a-b|_p\le p^{-s}$; that is, both $a$ and $b$ lie in a ball of radius $p^{-s}$ of the space $\mathbb{Q}_p^n$. For instance, the function $F=(f_1,\ldots,f_m)\colon\mathbb{Z}_p^n\to\mathbb{Z}_p^m$ is 1-Lipschitz (i.e., $|F(a)-F(b)|_p\le|a-b|_p$ for all $a,b\in\mathbb{Z}_p^n$) if and only if

\[
F(a)\equiv F(b) \pmod{p^{\ell}}\ \text{ once }\ a\equiv b \pmod{p^{\ell}}. \tag{5.44}
\]
Recall that the class of all 1-Lipschitz functions $F = (f_1, \ldots, f_m): \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ coincides with the class of all automata functions with $n$ inputs and $m$ outputs over the alphabet $\mathbb{F}_p$, cf. Sect. 2.4.

Definition 5.29 (Differentiability Modulo $p^k$ of Multivariate Functions) Given $k \in \mathbb{N}$, a function $F = (f_1, \ldots, f_m): \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is said to be differentiable modulo $p^k$ at the point $u = (u_1, \ldots, u_n) \in \mathbb{Z}_p^n$ if there exist a positive rational integer $N$ and an $n \times m$ matrix $F'_k(u)$ over $\mathbb{Q}_p$ (called the Jacobi matrix modulo $p^k$ of the function $F$ at the point $u$) such that for every positive rational integer $K \ge N$ and every $h = (h_1, \ldots, h_n) \in \mathbb{Z}_p^n$ the congruence

$$F(u + h) \equiv F(u) + h \cdot F'_k(u) \pmod{p^{k+K}} \tag{5.45}$$

holds whenever $|h|_p \le p^{-K}$. In the case $m = 1$ the Jacobi matrix modulo $p^k$ is called a differential modulo $p^k$. In the case $m = n$ the determinant of the Jacobi matrix modulo $p^k$ is called a Jacobian modulo $p^k$. Entries of the Jacobi matrix modulo $p^k$ are called partial derivatives modulo $p^k$ of the function $F$ at the point $u$. Similarly to univariate functions, the function $F$ is called differentiable at the point $u$ if it is differentiable modulo $p^k$ at $u$ for all $k \in \mathbb{N}$.
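Congruence (5.45) can be tested numerically on truncations of p-adic integers. The following is a minimal sketch, not part of the original text: it assumes $p = 2$, $k = 1$, represents 2-adic integers by non-negative Python integers, and uses the bitwise XOR and AND maps purely as test functions. The helper name `satisfies_5_45` is hypothetical, and only integer candidate matrices with entries in {0, 1} are tried.

```python
from itertools import product

def satisfies_5_45(F, jac, u, p, k, K):
    """Check congruence (5.45) at the point u for one value of K.

    F   : maps a tuple of n non-negative ints to a tuple of m ints
    jac : candidate Jacobi matrix modulo p^k, given as n rows of m ints
    """
    n, m = len(jac), len(jac[0])
    modulus = p ** (k + K)
    base = F(u)
    # For a 1-Lipschitz F it suffices to let h = p^K * t with t running
    # over residues modulo p^k, since everything is reduced modulo p^(k+K).
    for t in product(range(p ** k), repeat=n):
        h = tuple(p ** K * ti for ti in t)
        shifted = F(tuple(ui + hi for ui, hi in zip(u, h)))
        for j in range(m):
            linear = sum(h[i] * jac[i][j] for i in range(n))
            if (shifted[j] - base[j] - linear) % modulus != 0:
                return False
    return True

def xor_map(u):
    return (u[0] ^ u[1],)

def and_map(u):
    return (u[0] & u[1],)

points = list(product(range(8), repeat=2))
# XOR with both partial derivatives equal to 1 passes at every sampled point.
xor_ok = all(satisfies_5_45(xor_map, [[1], [1]], u, 2, 1, K)
             for u in points for K in (1, 2, 3))
# AND fails at (0, 0) for every candidate matrix with entries in {0, 1}.
and_ok = any(satisfies_5_45(and_map, [[d1], [d2]], (0, 0), 2, 1, 1)
             for d1 in (0, 1) for d2 in (0, 1))
print(xor_ok, and_ok)  # True False
```

A finite check of this kind can only refute differentiability modulo $p^k$ or support it on the sampled points; it does not replace a proof.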
We denote a partial derivative (respectively, a differential) modulo $p^k$ by $\frac{\partial_k f_i(u)}{\partial_k x_j}$ (respectively, by $d_k F(u) = \sum_{i=1}^{n} \frac{\partial_k F(u)}{\partial_k x_i}\, d_k x_i$). In cases when all partial derivatives modulo $p^k$ at all points of $\mathbb{Z}_p^n$ are p-adic integers we say that the function $F$ has integer-valued derivatives modulo $p^k$. In these cases we can associate to each partial derivative modulo $p^k$ a unique element of the ring $\mathbb{Z}/p^k\mathbb{Z}$; a Jacobi matrix modulo $p^k$ at every point $u \in \mathbb{Z}_p^n$ can then be considered as a matrix over the ring $\mathbb{Z}/p^k\mathbb{Z}$. As in the univariate case, if a 1-Lipschitz function $F = (f_1, \ldots, f_m): \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is differentiable modulo $p^k$ then all its partial derivatives modulo $p^k$ are integer-valued; that is, one may assume that they take values in the residue ring $\mathbb{Z}/p^k\mathbb{Z} = \{0, 1, \ldots, p^k - 1\}$. Integer-valued functions that have integer-valued derivatives are sometimes called twice integer-valued.

Definition 5.30 (Uniform Differentiability Modulo $p^k$ for Multivariate Functions) A function $F: \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ is said to be uniformly differentiable modulo $p^k$ on $\mathbb{Z}_p^n$ if and only if there exists $K \in \mathbb{N}$ such that the congruence (5.45) holds simultaneously for all $u \in \mathbb{Z}_p^n$ whenever $|h|_p \le p^{-K}$. The smallest such $K$ is denoted by $N_k(F)$.

Example 5.31 Let $p = 2$. The function $f(x, y) = x \text{ XOR } y: \mathbb{Z}_2^2 \to \mathbb{Z}_2$ is not uniformly differentiable on $\mathbb{Z}_2^2$ as a bivariate function; however, $f$ is uniformly differentiable modulo 2 on $\mathbb{Z}_2^2$, and its partial derivatives modulo 2 are 1 everywhere on $\mathbb{Z}_2^2$.

Rules of derivation modulo $p^k$ are similar to those for usual derivation, with the only difference that they are congruences modulo $p^k$ rather than equalities:

Proposition 5.32 Let the functions $G: \mathbb{Z}_p^s \to \mathbb{Z}_p^n$ and $F: \mathbb{Z}_p^n \to \mathbb{Z}_p^m$ be differentiable modulo $p^k$ at the points, respectively, $v = (v_1, \ldots, v_s)$ and $u = G(v)$, and let all partial derivatives modulo $p^k$ of the functions $G$ and $F$ at the points, respectively, $v$ and $u$ be p-adic integers. Then the composition $F \circ G: \mathbb{Z}_p^s \to \mathbb{Z}_p^m$ is differentiable modulo $p^k$ at the point $v$, all its partial derivatives modulo $p^k$ at this point are p-adic integers, and $(F \circ G)'_k(v) \equiv G'_k(v)\,F'_k(u) \pmod{p^k}$. In particular, if the functions $f, g: \mathbb{Z}_p \to \mathbb{Z}_p$ are differentiable modulo $p^k$ at the point $u \in \mathbb{Z}_p$, and if their derivatives modulo $p^k$ at this point are integer-valued, then

$$(f + g)'_k(u) \equiv f'_k(u) + g'_k(u) \pmod{p^k}; \qquad (f \cdot g)'_k(u) \equiv f'_k(u)\,g(u) + f(u)\,g'_k(u) \pmod{p^k}.$$
If, moreover, there exists a ball $B \ni u$ such that $g(r) \not\equiv 0 \pmod p$ at every point $r \in B$, then the function $\frac{f}{g}: B \to \mathbb{Z}_p$ is differentiable modulo $p^k$ at the point $u$, has an integer-valued derivative modulo $p^k$ at this point, and

$$\left(\frac{f}{g}\right)'_k(u) \equiv \frac{f'_k(u)\,g(u) - f(u)\,g'_k(u)}{g(u)^2} \pmod{p^k}.$$

If additionally the functions $F$, $G$, $f$, $g$ are uniformly differentiable modulo $p^k$, and if their derivatives modulo $p^k$ are integer-valued everywhere on $\mathbb{Z}_p$, then the same is true for the functions $F \circ G$, $f + g$, and $f \cdot g$. Finally, if $g(v) \not\equiv 0 \pmod p$ for all $v \in \mathbb{Z}_p$, then the function $\frac{f}{g}$ is integer-valued and uniformly differentiable modulo $p^k$ everywhere on $\mathbb{Z}_p$, and its derivative modulo $p^k$ is integer-valued at all points of $\mathbb{Z}_p$.

Example 5.33 The following automaton function of an automaton having 2 inputs and 2 outputs over $\mathbb{F}_2$ is uniformly differentiable modulo 2 as a bivariate function, and $N_1(F) = 1$:

$$F(x, y) = (f(x, y), g(x, y)) = \bigl(x \text{ XOR } (2 \cdot (x \text{ AND } y)),\ (y + 3x^3) \text{ XOR } x\bigr).$$

Namely,

$$F(x + 2^n t,\ y + 2^m s) \equiv F(x, y) + (2^n t,\ 2^m s) \cdot \begin{pmatrix} 1 & x + 1 \\ 0 & 1 \end{pmatrix} \pmod{2^{k+1}}$$

for all $m, n \ge 1$ (here $k = \min\{m, n\}$). The matrix $\begin{pmatrix} 1 & x + 1 \\ 0 & 1 \end{pmatrix} = F'_1(x, y)$ is a Jacobi matrix modulo 2 of $F$ (see Definition 5.29). Here is how we calculate partial derivatives modulo 2. For instance,

$$\frac{\partial_1 g(x, y)}{\partial_1 x} = \frac{\partial_1 (y + 3x^3)}{\partial_1 x} \cdot \left.\frac{\partial_1 (u \text{ XOR } x)}{\partial_1 u}\right|_{u = y + 3x^3} + \frac{\partial_1 x}{\partial_1 x} \cdot \left.\frac{\partial_1 (u \text{ XOR } x)}{\partial_1 x}\right|_{u = y + 3x^3} = 9x^2 \cdot 1 + 1 \cdot 1 \equiv x + 1 \pmod 2.$$

Note that a partial derivative modulo 2 of the function $2 \cdot (x \text{ AND } y)$ is always 0 modulo 2, due to the multiplier 2: the function $x \text{ AND } y$ is not differentiable modulo 2 as a bivariate function; however, the function $2 \cdot (x \text{ AND } y)$ is. So the Jacobian of the function $F$ is $\det F'_1 \equiv 1 \pmod 2$.

5.5.1 Conditions for Measure-Preservation
The following theorem answers the question of when a function that is uniformly differentiable modulo $p$ is measure-preserving, provided that all its derivatives modulo $p$ are integer-valued; thus, the theorem gives conditions for measure-preservation of automata functions that are uniformly differentiable modulo $p$:
Theorem 5.34 Let the function $F: \mathbb{Z}_p^n \to \mathbb{Z}_p^m$, $m \le n$, be uniformly differentiable modulo $p$, and let all partial derivatives modulo $p$ of the function $F$ be integer-valued. Then $F$ is measure-preserving whenever the following two conditions hold simultaneously:
(i) $F$ is balanced modulo $p^k$ for some $k \ge N_1(F)$.
(ii) The rank $\operatorname{rk} F'_1(y)$ of the Jacobi matrix $F'_1(y)$ modulo $p$ is $m$ at all points $y \in \mathbb{Z}_p^n$.
Moreover, in the case when $m = n$ the mentioned conditions are also necessary: if $F: \mathbb{Z}_p^n \to \mathbb{Z}_p^n$ is measure-preserving then $F$ is bijective modulo $p^k$ for all $k \ge N_1(F)$, and $\det F'_1(y) \not\equiv 0 \pmod p$ for all $y \in \mathbb{Z}_p^n$. Finally, the function $F: \mathbb{Z}_p^n \to \mathbb{Z}_p^n$ is measure-preserving if and only if $F$ is bijective modulo $p^k$ for some $k \ge N_1(F) + 1$.

Note 5.35 The bound given by Theorem 5.34 is sharp; for instance, the automaton function $f(x) = 1 + x^p: \mathbb{Z}_p \to \mathbb{Z}_p$ satisfies the following properties:
• $f$ is uniformly differentiable modulo $p$,
• $f'_1$ is integer-valued,
• $f$ is bijective modulo $p^{N_1(f)}$,
• $f$ is not bijective modulo $p^{N_1(f)+1}$, and
• $f$ is not measure-preserving.
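The sharpness example of Note 5.35 is easy to confirm by brute force. The sketch below is an illustration only: it assumes $N_1(f) = 1$ for $f(x) = 1 + x^p$ (so that the two moduli of interest are $p$ and $p^2$), and the helper name `is_bijective_mod` is hypothetical.

```python
def is_bijective_mod(f, modulus):
    """True if x -> f(x) mod modulus permutes the residues 0..modulus-1."""
    return len({f(x) % modulus for x in range(modulus)}) == modulus

for p in (2, 3, 5, 7):
    f = lambda x, p=p: 1 + x ** p
    # Bijective modulo p, but not modulo p^2.
    print(p, is_bijective_mod(f, p), is_bijective_mod(f, p * p))
```

The failure modulo $p^2$ comes from the fact that $x^p \bmod p^2$ depends only on $x \bmod p$.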
Numerous results of both theoretical and practical value follow from Theorem 5.34. Some of them were known before (and proved by other methods), some of them are new. We list corresponding examples, re-stating them in terms of measure-preservation according to Theorem 5.6:

Example 5.36 A polynomial from $\mathbb{Z}_p[x]$ is measure-preserving if and only if it is bijective modulo $p$ and its derivative vanishes modulo $p$ nowhere on $\mathbb{Z}/p\mathbb{Z}$; equivalently, the polynomial is measure-preserving if and only if it is bijective modulo $p^2$ (cf., e.g., [88]).

A C-function $f = \sum_{i=0}^{\infty} c_i x^i$ is measure-preserving on $\mathbb{Z}_2$ if and only if the following conditions hold simultaneously:

$$c_2 + c_4 + c_6 + \cdots \equiv 0 \pmod 2; \qquad c_3 + c_5 + c_7 + \cdots \equiv 0 \pmod 2; \qquad c_1 \equiv 1 \pmod 2. \tag{5.46}$$

The T-function $f(x) = c_0 \odot_1 c_1 x \odot_2 c_2 x^2 \odot_3 \cdots \odot_n c_n x^n$, where $\odot_j \in \{+, \text{XOR}\}$, $j = 1, 2, \ldots, n$, is measure-preserving on $\mathbb{Z}_2$ if and only if conditions (5.46) hold simultaneously (cf. [78]).

Under the assumptions of Theorem 5.34 assume that $m = 1$. Then $F$ is measure-preserving whenever $F$ is balanced modulo $p^k$ for some $k \ge N_1(F)$, and all partial derivatives modulo $p$ of the function $F$ vanish simultaneously at no point of $(\mathbb{Z}/p^k\mathbb{Z})^n$. If additionally $n = 1$, then $F$ is measure-preserving if and only if
it is bijective modulo $p^{N_1(F)}$ and its derivative modulo $p$ vanishes at no point of $\{0, 1, \ldots, p^{N_1(F)} - 1\}$. Equivalently, if $m = n = 1$ then $F$ is measure-preserving if and only if $F$ is bijective modulo $p^{N_1(F)+1}$.

Note 5.37 From Lemma 5.17 it can be derived that if $p = 2$, then any measure-preserving automaton function $f: \mathbb{Z}_2 \to \mathbb{Z}_2$ is uniformly differentiable modulo 2, and its derivative modulo 2 is congruent to 1 modulo 2 everywhere.

Moreover, Theorem 5.34 can be used in combinatorics (and its applications) to construct large classes of Latin squares and orthogonal Latin squares as automata functions. Recall that an $L \times L$ Latin square can be viewed as a 2-variate mapping $f: A^2 \to A$, where $A = \{0, 1, \ldots, L - 1\}$, which is invertible (i.e., bijective) with respect to each variable. Latin squares are used widely: for games (recall sudoku), and for more serious applications such as private communication networks (for password distribution), coding theory, some cryptographic algorithms (under the name of multipermutations), etc.; see the monographs [35, 89]. However, known methods (e.g., the ones from the mentioned books) may not work efficiently in some practical cases. For instance, a real problem is to write software that produces a number of large Latin squares; however, this is only a part of the problem. Another part of the problem is that in some constrained environments (e.g., in smart cards) it is impossible to store the whole matrix: given two numbers $a, b \in \{0, 1, \ldots, L - 1\}$, the software must calculate the $(a, b)$-th entry of the matrix on the fly. This problem can be solved by using Theorem 5.34 as follows.

Let us say that a bivariate 1-Lipschitz function $f: \mathbb{Z}_p^2 \to \mathbb{Z}_p$ is a Latin square modulo $p^k$ whenever the reduced mapping $\bar f = f \bmod p^k: \mathbb{Z}/p^k\mathbb{Z} \times \mathbb{Z}/p^k\mathbb{Z} \to \mathbb{Z}/p^k\mathbb{Z}$ is a Latin square on $A = \mathbb{Z}/p^k\mathbb{Z} = \{0, 1, \ldots, p^k - 1\}$.
Corollary 5.38 (of Theorem 5.34) A uniformly differentiable modulo $p$ 1-Lipschitz function $f: \mathbb{Z}_p^2 \to \mathbb{Z}_p$ (thus, an automaton function of an automaton having 2 inputs and 1 output over $\mathbb{F}_p$) is a Latin square modulo $p^k$ for all $k = 1, 2, \ldots$ whenever $f$ is a Latin square modulo $p^{N_1(f)}$ and $\frac{\partial_1 f(u)}{\partial_1 x_i} \not\equiv 0 \pmod p$ for all $u \in (\mathbb{Z}/p^{N_1(f)}\mathbb{Z})^2$, $i = 1, 2$. Equivalent statement: if and only if $f$ is bijective modulo $p^{N_1(f)+1}$ with respect to either variable.

Example 5.39 Given an arbitrary automaton function $v: \mathbb{Z}_2^2 \to \mathbb{Z}_2$ with 2 inputs and 1 output over the alphabet $\mathbb{F}_2$, and an arbitrary integer $\gamma \in \mathbb{Z}$, the map $f_{2^k}(x, y) = (x + y + \gamma + 2 \cdot v(x, y)) \bmod 2^k$ is a Latin square on $2^k$ symbols for all $k = 1, 2, \ldots$.

Note 5.40 By using Corollary 5.38 along with the Chinese Remainder Theorem, it is possible to construct numerous Latin squares modulo large $N$ given small Latin squares modulo prime divisors of $N$, since the latter small Latin squares can be represented by polynomials over the respective finite fields.
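The on-the-fly computation promised by Example 5.39 can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the function names are hypothetical, and the default choice $v(x, y) = x \text{ AND } y$ is just one convenient 1-Lipschitz (automaton) function; any other automaton function $v$ may be substituted.

```python
def latin_entry(a, b, k, gamma=0, v=lambda x, y: x & y):
    """(a, b)-th entry of the 2^k x 2^k square of Example 5.39,
    computed without storing the matrix."""
    return (a + b + gamma + 2 * v(a, b)) % (1 << k)

def is_latin_square(entry, size):
    """Brute-force check that every row and column is a permutation."""
    symbols = set(range(size))
    rows = all({entry(a, b) for b in range(size)} == symbols
               for a in range(size))
    cols = all({entry(a, b) for a in range(size)} == symbols
               for b in range(size))
    return rows and cols

k = 4
print(is_latin_square(lambda a, b: latin_entry(a, b, k, gamma=3), 1 << k))
```

Only the two arguments and the parameters $k$, $\gamma$ need to be kept in memory, which is the point of the construction for constrained devices.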
Recall that two $L \times L$ Latin squares are said to be orthogonal if, after the two squares are superimposed, each of the $L^2$ ordered pairs of symbols appears in the resulting $L \times L$ table exactly once. Here is an example of a pair of orthogonal Latin squares on 3 symbols. The Latin squares

$$\begin{matrix} 0 & 1 & 2 \\ 1 & 2 & 0 \\ 2 & 0 & 1 \end{matrix} \qquad\qquad \begin{matrix} 0 & 1 & 2 \\ 2 & 0 & 1 \\ 1 & 2 & 0 \end{matrix}$$

are orthogonal since after we superimpose them, we get the square

$$\begin{matrix} (0,0) & (1,1) & (2,2) \\ (1,2) & (2,0) & (0,1) \\ (2,1) & (0,2) & (1,0) \end{matrix}$$

where all pairs are different. Pairs of orthogonal Latin squares are used in experiment design to provide consistent testing of samples, as well as in cryptography (e.g., as block mixers for block ciphers, and as cipher combiners), etc.

We say that bivariate 1-Lipschitz functions $f, g: \mathbb{Z}_p^2 \to \mathbb{Z}_p$ are orthogonal Latin squares modulo $p^k$ whenever the reduced mappings $\bar f = f \bmod p^k: \mathbb{Z}/p^k\mathbb{Z} \times \mathbb{Z}/p^k\mathbb{Z} \to \mathbb{Z}/p^k\mathbb{Z}$ and $\bar g = g \bmod p^k: \mathbb{Z}/p^k\mathbb{Z} \times \mathbb{Z}/p^k\mathbb{Z} \to \mathbb{Z}/p^k\mathbb{Z}$ constitute a pair of orthogonal Latin squares on $A = \mathbb{Z}/p^k\mathbb{Z} = \{0, 1, \ldots, p^k - 1\}$.

Corollary 5.41 (of Theorem 5.34) Let $g, f: \mathbb{Z}_p^2 \to \mathbb{Z}_p$ be uniformly differentiable modulo $p$ 1-Lipschitz functions, and let $f$ and $g$ be Latin squares modulo $p^k$ for all $k = 1, 2, \ldots$ (cf. Corollary 5.38). These Latin squares are orthogonal modulo $p^k$ for all $k = 1, 2, \ldots$ if and only if the function $F(x, y) = (f(x, y), g(x, y)): \mathbb{Z}_p^2 \to \mathbb{Z}_p^2$ preserves measure. This holds if and only if $f$ and $g$ are orthogonal modulo $p^k$ for some $k \ge \max\{N_1(f), N_1(g)\}$, and

$$\det \begin{pmatrix} \dfrac{\partial_1 f(x,y)}{\partial_1 x} & \dfrac{\partial_1 g(x,y)}{\partial_1 x} \\[2ex] \dfrac{\partial_1 f(x,y)}{\partial_1 y} & \dfrac{\partial_1 g(x,y)}{\partial_1 y} \end{pmatrix} \not\equiv 0 \pmod p$$

for all $(x, y) \in (\mathbb{Z}/p^{N_1(F)}\mathbb{Z})^2$.

Corollary 5.41 implies a method to construct large orthogonal Latin squares out of small orthogonal Latin squares. For instance, let $p = 3$, and let

$$f(x, y) \bmod 3 = \begin{pmatrix} 0 & 1 & 2 \\ 1 & 2 & 0 \\ 2 & 0 & 1 \end{pmatrix}, \qquad g(x, y) \bmod 3 = \begin{pmatrix} 0 & 1 & 2 \\ 2 & 0 & 1 \\ 1 & 2 & 0 \end{pmatrix}$$

be a pair of orthogonal Latin squares of order 3 each. Then, given arbitrary polynomials $v(x, y), w(x, y) \in \mathbb{Z}_3[x, y]$, the functions $f(x, y) = x + y + 3 \cdot v(x, y)$ and $g(x, y) = 2x + y + 3 \cdot w(x, y)$ define a pair of orthogonal Latin squares modulo $3^k$ for all $k = 1, 2, \ldots$, since

$$\det \begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix} \equiv 2 \pmod 3.$$
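The construction for $p = 3$ can be checked by exhaustive search for small $k$. In the sketch below the perturbations $v(x, y) = xy$ and $w(x, y) = x^2 + y$ are arbitrary polynomials chosen only for illustration, and the helper names are hypothetical.

```python
def f(x, y):
    return x + y + 3 * (x * y)            # v(x, y) = x*y, illustrative choice

def g(x, y):
    return 2 * x + y + 3 * (x * x + y)    # w(x, y) = x^2 + y, illustrative choice

def is_latin_mod(h, size):
    """Every row and every column of h mod size is a permutation."""
    full = set(range(size))
    return (all({h(x, y) % size for y in range(size)} == full for x in range(size))
            and all({h(x, y) % size for x in range(size)} == full for y in range(size)))

def are_orthogonal_mod(f, g, size):
    """All size^2 superimposed pairs (f, g) mod size are distinct."""
    pairs = {(f(x, y) % size, g(x, y) % size)
             for x in range(size) for y in range(size)}
    return len(pairs) == size * size

for k in (1, 2, 3):
    size = 3 ** k
    print(size, is_latin_mod(f, size), is_latin_mod(g, size),
          are_orthogonal_mod(f, g, size))
```

For $k = 1$ the two reduced squares coincide with the order-3 pair displayed above.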
5.5.2 Differentiable Ergodic Transformations on $\mathbb{Z}_p$
The following theorem gives necessary and sufficient conditions of ergodicity for functions that are uniformly differentiable modulo $p^2$.

Theorem 5.42 Let an automaton function $f: \mathbb{Z}_p \to \mathbb{Z}_p$ be uniformly differentiable modulo $p^2$. Then $f$ is ergodic if and only if it is transitive modulo $p^n$ for some (equivalently, for every) $n \ge N_2(f) + 1$ whenever $p$ is odd or, respectively, for some (equivalently, for every) $n \ge N_2(f) + 2$ whenever $p = 2$.

Theorem 5.42 has recently been extended to automata functions $f$ which are uniformly differentiable modulo $p$ rather than modulo $p^2$, as follows:

Theorem 5.43 ([103]) Let $f$ be a measure-preserving and uniformly differentiable modulo $p$ automaton function $\mathbb{Z}_p \to \mathbb{Z}_p$ such that $N_1(f) = 1$. Then $f$ is ergodic on $\mathbb{Z}_p$ if and only if the following conditions are satisfied:
(i) $f$ is transitive modulo $p$.
(ii) For every positive integer $k$, $f^{p^k}(0) \not\equiv 0 \pmod{p^{k+1}}$.
(iii) For every positive integer $k$,
$$\frac{\prod_{j=0}^{p^k - 1} B_{j + p^k}}{p^{k p^k}} \equiv 1 \pmod p.$$

Here $B_i$ are van der Put coefficients of $f$ (cf. Sect. 3.3); $f^i$ stands for the $i$-th iterate of $f$, $i = 0, 1, 2, \ldots$, and $f^0(z) = z$.

Of course, to determine whether the conditions of Theorem 5.43 hold is more difficult than to determine whether the conditions of Theorem 5.42 hold. However, Theorem 5.43 is a useful tool to study ergodicity of automata functions when $p$ is small, see examples in [103]. Unfortunately, neither Theorem 5.42 nor Theorem 5.43 can be extended to the multivariate case, due to the following result:

Theorem 5.44 Let an automaton function $F: \mathbb{Z}_p^n \to \mathbb{Z}_p^n$ of an automaton having $n$ inputs and $n$ outputs over the alphabet $\mathbb{F}_p$ be uniformly differentiable modulo $p$, and let $F$ be ergodic. Then $n = 1$.

Note 5.45 The bound given by Theorem 5.42 is sharp: for an odd prime $p$ consider the automaton function $f(x) = \delta_0(x + 1)$; then
• $f$ is uniformly differentiable modulo $p^2$,
• $f$ is transitive modulo $p^{N_2(f)}$,
• $f$ is not transitive modulo $p^{N_2(f)+1}$,
• $f$ is not ergodic.
However, for narrower classes of functions these bounds can obviously be sharpened; e.g., for affine functions Theorem 5.12 implies that an affine function $f(x) = ax + b$ is ergodic if and only if it is transitive modulo $p$ whenever $p$ is odd, or modulo 4 whenever $p = 2$, and not modulo $p^2$ and modulo 8, respectively, as would follow from Theorem 5.42. For some important classes of functions considered above, the said bounds can be significantly reduced and calculated explicitly, in contrast to those from Theorems 5.34 and 5.42 (note that given $f$, it might not be an easy problem to find $N_1(f)$ and $N_2(f)$).

We start with A-functions. Let $f \in \mathcal{A}$; then, according to Definition 4.21 of A-functions, $p^n \cdot f \in \mathcal{B}$ for a suitable $n \in \mathbb{N}_0$. Given $f \in \mathcal{A}$, denote $\rho(f) = \min\{n \in \mathbb{N}_0 : p^n \cdot f \in \mathcal{B}\}$; put

$$\lambda(f) = \min\left\{k \in \mathbb{N} : 2 \cdot \frac{p^k - 1}{p - 1} - k > \rho(f)\right\}.$$

The following theorem is true.

Theorem 5.46 An A-function $f$ is measure-preserving if and only if it is bijective modulo $p^{\lambda(f)+1}$. The function $f$ is ergodic if and only if it is transitive modulo $p^{\lambda(f)+1}$ whenever $p \notin \{2, 3\}$, or modulo $p^{\lambda(f)+2}$ whenever $p \in \{2, 3\}$.

By using Theorem 5.46 one can determine whether a polynomial $f(x) \in \mathbb{Q}_p[x]$ is an automaton measure-preserving (respectively, ergodic) transformation on $\mathbb{Z}_p$ by evaluating $f$ at $\approx p^3 \cdot \deg f$ points:

Proposition 5.47 A polynomial $f(x) \in \mathbb{Q}_p[x]$ induces a 1-Lipschitz measure-preserving (respectively, ergodic) transformation on $\mathbb{Z}_p$ if and only if the mapping $z \mapsto f(z) \bmod p^{\lfloor \log_p(\deg f) \rfloor + 3}$ is a compatible and bijective (respectively, transitive) transformation on the residue ring $\mathbb{Z}/p^{\lfloor \log_p(\deg f) \rfloor + 3}\mathbb{Z}$.

For B-functions, the bounds from Theorem 5.46 can be significantly refined:

Corollary 5.48 A B-function (and thus a C-function) $f$ is measure-preserving if and only if $f$ is bijective modulo $p^2$. The function $f$ is ergodic if and only if $f$ is transitive modulo $p^2$ whenever $p \notin \{2, 3\}$, or modulo $p^3$ whenever $p \in \{2, 3\}$.

That is, given an automaton whose automaton function belongs to the class $\mathcal{B}$, to determine whether the automaton performs a bijective (respectively, transitive) transformation on the set of all $N$-letter words for large $N$, it is necessary and sufficient to check these properties for short words, of length not exceeding $p^3$.

Note The bounds given by Corollary 5.50 (and therefore by Corollary 5.48) are sharp: the polynomial $2x^3 + 3x + 5$ is transitive modulo 4, yet it is not transitive modulo 8 (whence, it is not ergodic on $\mathbb{Z}_2$); the polynomial
1 + x − x(x − 1)(x − 2)(x − 3)(x − 4)(x − 6)(x − 7) is transitive modulo 9, yet it is not transitive modulo 27 (whence, it is not ergodic on $\mathbb{Z}_3$); the polynomial $1 + x^p$ is transitive modulo $p$, yet it is not transitive (and not even bijective) modulo $p^2$; whence, it is not measure-preserving on $\mathbb{Z}_p$. The first two examples are due to M. V. Larin, [87].

Theorem 5.42 as well as its corollaries mentioned above can be used to establish transitivity of numerous previously known automata functions as well as to obtain new results. For instance, the following result was obtained in [80] and used in stream ciphers; by using Theorem 5.42, the proof of transitivity of the corresponding automaton function becomes an easy exercise:

Example 5.49 The T-function $f(x) = x + (x^2 \text{ OR } 5)$ is ergodic: it is easy to see that $N_2(f) \le 3$, so it is sufficient to prove that $f$ is transitive modulo 32, which can be done by direct calculation.

From Corollary 5.48 we immediately deduce

Corollary 5.50 (cf. [36, 87]) A polynomial $f \in \mathbb{Z}_p[x]$ is ergodic if and only if $f$ is transitive modulo $p^2$ whenever $p \notin \{2, 3\}$, or modulo $p^3$ whenever $p \in \{2, 3\}$.

Numerous other examples of ergodic automata functions can be constructed by using Theorem 5.42; for instance, given an arbitrary 1-Lipschitz function $g: \mathbb{Z}_p \to \mathbb{Z}_p$ and an arbitrary ergodic B-function $u: \mathbb{Z}_p \to \mathbb{Z}_p$, the function $f(x) = u(x) + p^2 \cdot g(x)$ is ergodic.

Historical remark: Most results of Sect. 5.5 (namely, the ones for which no references are given) were originally proved in [15].
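The "direct calculation" of Example 5.49 and the sharpness example $2x^3 + 3x + 5$ can both be reproduced with a few lines of code. The sketch below is illustrative; the helper name `is_transitive_mod` is hypothetical, and the test follows the orbit of 0 and checks that it first returns to 0 after exactly `modulus` steps.

```python
def is_transitive_mod(f, modulus):
    """True if x -> f(x) mod modulus is a single cycle on 0..modulus-1."""
    x = 0
    for step in range(1, modulus + 1):
        x = f(x) % modulus
        if x == 0:
            # The orbit of 0 closes; it is a full cycle only if it has
            # length exactly `modulus`.
            return step == modulus
    return False

t_function = lambda x: x + (x * x | 5)       # Example 5.49
larin = lambda x: 2 * x ** 3 + 3 * x + 5     # sharpness example for p = 2

print(is_transitive_mod(t_function, 32))     # True
print(is_transitive_mod(larin, 4), is_transitive_mod(larin, 8))  # True False
```

The same helper applies to $1 + x^p$: it is transitive modulo $p$ but not modulo $p^2$.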
5.6 Automata Functions Which Are Ergodic on p-adic Balls and p-adic Spheres

Consider a measure $\hat\mu_p$ induced on a subspace $S$ of the space $\mathbb{Z}_p$ by the Haar measure $\mu_p$ defined on the whole space $\mathbb{Z}_p$; assume that $\hat\mu_p$ is normalized so that $\hat\mu_p(S) = 1$. Now, if $f: S \to S$ is a 1-Lipschitz map, we can speak of ergodicity of this map with respect to the measure $\hat\mu_p$. In the sequel, speaking of ergodicity (and of measure preservation) of a map $f$ on a subspace $S$ we mean that $S$ is invariant under the action of $f$ and the measure is $\hat\mu_p$. The study of automata functions on $S$ rather than on the whole space $\mathbb{Z}_p$ is important both for general p-adic dynamical theory and for applications, e.g., for pseudo-random number generation. For instance, in [40] a pseudo-random generator based on iterations of the function $f(x) = ax^{-1} + b$ modulo $2^n$, where $a + b \equiv 1 \pmod 2$, is considered; in [70] a pseudo-random generator based on iterations of the function $f(x) = ax^{-1} + b + cx$ modulo $2^n$, where $a + b + c \equiv 1 \pmod 2$, is
studied. In these papers, the generators are judged to be pseudo-random if they produce the longest possible cycle; i.e., one of length $2^{n-1}$. However, it is easy to see that the problem of finding conditions on $a, b, c$ which guarantee that the cycle is the longest possible is equivalent to the problem of finding conditions under which $f$ is ergodic on the 2-adic ball $B_{1/2}(1)$ of radius 1/2 centered at 1.

The problem of determining ergodicity of a 1-Lipschitz transformation on the p-adic ball $B_{p^{-k}}(a) = a + p^k\mathbb{Z}_p$ can be reduced to the problem of determining ergodicity of a 1-Lipschitz map from $\mathbb{Z}_p$ to $\mathbb{Z}_p$. Indeed, if $f$ is a 1-Lipschitz transformation such that $f(a + p^k\mathbb{Z}_p) \subset a + p^k\mathbb{Z}_p$, then necessarily $f(a) = a + p^k y$ for a suitable $y \in \mathbb{Z}_p$. Thus, $f(a + p^k z) = f(a) + p^k \cdot g(z)$ for any $z \in \mathbb{Z}_p$; so we can relate to $f$ the following 1-Lipschitz transformation on $\mathbb{Z}_p$:

$$g: z \mapsto g(z) = \frac{1}{p^k}\left(f(a + p^k z) - a - p^k y\right); \quad z \in \mathbb{Z}_p.$$

It is clear that the transformation $f$ is ergodic on the ball $B_{p^{-k}}(a)$ if and only if the transformation $g$ is ergodic on $\mathbb{Z}_p$. By applying that idea to the function $f(x) = ax^{-1} + b$ (respectively, to the function $f(x) = ax^{-1} + b + cx$) and using the ergodicity criteria established in the preceding subsections, it is not difficult to prove that the function $f(x) = ax^{-1} + b$ (respectively, the function $f(x) = ax^{-1} + b + cx$) is ergodic on $B_{1/2}(1)$ if and only if $a \equiv 1 \pmod 4$ and $b \equiv 2 \pmod 4$ (respectively, if and only if $a + c \equiv 1 \pmod 4$ and $b \equiv 2 \pmod 4$). That method can also be applied to other generators mentioned in [42].

The 1-Lipschitz dynamics on 2-adic spheres can be reduced to 1-Lipschitz dynamics on 2-adic balls, since a 2-adic sphere $S_{2^{-r}}(a)$ of radius $2^{-r}$ centered at the point $a \in \{0, \ldots, 2^r - 1\}$ coincides with the ball $B_{2^{-r-1}}(a + 2^r)$ of radius $2^{-r-1}$ centered at the point $a + 2^r$. Indeed, let $S_{p^{-r}}(y)$ be a sphere of radius $\frac{1}{p^r}$, $r \ge 1$, centered at $y \in \mathbb{Z}_p$; that is,

$$S_{p^{-r}}(y) = \left\{z \in \mathbb{Z}_p : |z - y|_p = \frac{1}{p^r}\right\}.$$

We recall (see any book on p-adic analysis, e.g., [71, 83, 97, 116]) that the sphere is a disjoint union of balls of radius $\frac{1}{p^{r+1}}$ each,

$$S_{p^{-r}}(y) = \bigcup_{s=1}^{p-1} \left(y + p^r s + p^{r+1}\mathbb{Z}_p\right), \tag{5.47}$$

since $S_{p^{-r}}(y)$ is the set-theoretic complement of the ball $y + p^{r+1}\mathbb{Z}_p$ in the ball $y + p^r\mathbb{Z}_p$. So $S_{p^{-r}}(y)$ is a closed and simultaneously an open (whence, a $\mu_p$-measurable) subset of $\mathbb{Z}_p$.
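The criterion for $f(x) = ax^{-1} + b$ on the ball $B_{1/2}(1)$ stated earlier in this subsection can be probed experimentally: ergodicity on the ball corresponds to a cycle of the maximal length $2^{n-1}$ on the odd residues modulo $2^n$. The sketch below is an illustration only; the helper name is hypothetical, and it relies on Python 3.8+ where `pow(x, -1, m)` returns a modular inverse.

```python
def has_full_cycle_on_odds(a, b, n):
    """True if x -> a*x^(-1) + b (mod 2^n), started at x = 1, runs through
    all 2^(n-1) odd residues before returning to 1.
    Assumes a + b is odd, so that odd residues are mapped to odd residues."""
    m = 1 << n
    x = 1
    for step in range(1, m // 2 + 1):
        x = (a * pow(x, -1, m) + b) % m
        if x == 1:
            return step == m // 2
    return False

n = 5
for a, b in [(1, 2), (5, 6), (3, 2), (1, 4), (7, 0)]:
    expected = (a % 4 == 1 and b % 4 == 2)
    print(a, b, has_full_cycle_on_odds(a, b, n), expected)
```

A check modulo a single power $2^n$ is of course only evidence, not a proof of ergodicity.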
From the definition of a p-adic sphere it immediately follows that a 2-adic sphere is a 2-adic ball: $S_{2^{-r}}(a) = \{a + 2^r + 2^{r+1}x : x \in \mathbb{Z}_2\} = B_{2^{-r-1}}(a + 2^r)$. Let the sphere $S_{2^{-r}}(a)$ be invariant under the action of $f$; that is, let $f(S_{2^{-r}}(a)) \subset S_{2^{-r}}(a)$. As $S_{2^{-r}}(a) = B_{2^{-r-1}}(a + 2^r)$, the sphere $S_{2^{-r}}(a)$ is $f$-invariant if and only if $f(a + 2^r + 2^{r+1}\mathbb{Z}_2) \subset a + 2^r + 2^{r+1}\mathbb{Z}_2$; that is, if and only if

$$f(a + 2^r) \equiv a + 2^r \pmod{2^{r+1}} \tag{5.48}$$

as a 1-Lipschitz function maps a ball of radius $2^{-\ell}$ into a ball of radius $2^{-\ell}$. This way we reduce 1-Lipschitz 2-adic dynamics on spheres to the dynamics on balls.

By using this approach, in [23] a criterion of ergodicity of an arbitrary 1-Lipschitz function on the sphere $S_{2^{-r}}(a)$ was established in terms of van der Put coefficients of the function. Namely, given a 1-Lipschitz function $f: \mathbb{Z}_2 \to \mathbb{Z}_2$, in view of Theorem 3.4 and Eq. (3.22), $f$ has a unique representation via van der Put series:

$$f(x) = \sum_{m=0}^{\infty} B_f(m)\,\chi(m, x) = \sum_{m=0}^{\infty} 2^{\lfloor \log_2 m \rfloor}\, b_f(m)\,\chi(m, x), \tag{5.49}$$

where $b_f(m) \in \mathbb{Z}_2$; so $B_f(m) = 2^{\lfloor \log_2 m \rfloor} b_f(m)$ for all $m = 0, 1, 2, \ldots$.

Theorem 5.51 The function $f$ represented by van der Put series (5.49) is ergodic on the sphere $S_{2^{-r}}(a)$ if and only if the following conditions hold simultaneously:
(i) $f(a + 2^r) \equiv a + 2^r + 2^{r+1} \pmod{2^{r+2}}$;
(ii) $|b_f(a + 2^r + m \cdot 2^{r+1})|_2 = 1$, for $m \ge 1$;
(iii) $b_f(a + 2^r + 2^{r+1}) \equiv 1 \pmod 4$;
(iv) $b_f(a + 2^r + 2^{r+2}) + b_f(a + 2^r + 3 \cdot 2^{r+1}) \equiv 2 \pmod 4$;
(v) $\sum_{m=2^{n-1}}^{2^n - 1} b_f(a + 2^r + m \cdot 2^{r+1}) \equiv 0 \pmod 4$, for $n \ge 3$.
Also, in that paper [23] the following criterion of ergodicity of a perturbed monomial function on 2-adic spheres was found:

Theorem 5.52 Let $u: \mathbb{Z}_2 \to \mathbb{Z}_2$ be an arbitrary 1-Lipschitz function, and let $s, r \in \mathbb{N}$. The function $f(x) = x^s + 2^{r+1}u(x)$ is ergodic on the sphere $S_{2^{-r}}(1)$ if and only if $s \equiv 1 \pmod 4$ and $u(1) \equiv 1 \pmod 2$.

In [101] explicit conditions of ergodicity in terms of coefficients of polynomials on the sphere $S_{1/2}(0) = B_{1/2}(1) = \mathbb{Z}_2 \setminus 2\mathbb{Z}_2$ were found, whereas in [100] explicit conditions of ergodicity on the sphere in terms of linear relations on coefficients (somewhat similar to those of Proposition 5.19) were found for rational functions. Note that the 2-adic sphere $S_{1/2}(0)$ is the group of all units (invertible elements) of the ring $\mathbb{Z}_2$; so the results of [100, 101] are of special interest (we do not present the results here since the corresponding conditions of ergodicity are somewhat lengthy).

However, the general problem of determining ergodicity of automata functions on p-adic spheres is much more complicated than the one for p-adic balls, since p-adic spheres are not p-adic balls when $p > 2$. This case was investigated in [9]; the
results obtained in that paper constitute the rest of the current Subsection. The following easy proposition holds:

Proposition 5.53 If $S_{p^{-r}}(y)$ is invariant under the action of a 1-Lipschitz map $f$, then $f(y) \equiv y \pmod{p^r}$.

From this proposition we derive

Corollary 5.54 Let all spheres around $y \in \mathbb{Z}_p$ of radii less than $\varepsilon > 0$ be invariant under the action of a 1-Lipschitz map $f$. Then $y$ is a fixed point of $f$.

What is important, the analogue of Theorem 5.6 for a sphere (rather than for the whole space $\mathbb{Z}_p$) remains true:

Proposition 5.55 A 1-Lipschitz mapping $f: \mathbb{Z}_p \to \mathbb{Z}_p$ is ergodic on the sphere $S_{p^{-r}}(y)$ if and only if it induces on the residue ring $\mathbb{Z}/p^{k+1}\mathbb{Z}$ a mapping which is transitive on all subsets

$$S_{p^{-r}}(y) \bmod p^{k+1} = \{y + p^r s + p^{r+1}\mathbb{Z} : s = 1, 2, \ldots, p - 1\} \subset \mathbb{Z}/p^{k+1}\mathbb{Z}, \quad k = r, r + 1, \ldots.$$

(That is, the reduced mapping $f \bmod p^{k+1}$ permutes cyclically the elements of every subset $S_{p^{-r}}(y) \bmod p^{k+1}$.)

It is worth noticing also that whenever a 1-Lipschitz mapping $f$ is ergodic on the sphere $S_{p^{-r}}(y)$, $f$ is a bijection of this sphere onto itself; moreover, it is an isometry on this sphere, cf. Note 5.8. The same holds for balls.

Now we are going to state a criterion of ergodicity on p-adic spheres for B-functions. In order to do this, recall that a p-adic number $z \in \mathbb{Z}_p$ is called primitive modulo $p^k$ whenever $z \bmod p^k$ generates the whole group $(\mathbb{Z}/p^k\mathbb{Z})^*$ of invertible elements of the residue ring $\mathbb{Z}/p^k\mathbb{Z}$. Note that whenever $k > 2$ we speak of primitivity modulo $p^k$ only for odd $p$, since it is well known that the multiplicative group $(\mathbb{Z}/2^k\mathbb{Z})^*$ is not cyclic if $k > 2$; see, e.g., [107].

Theorem 5.56 Let $f$ be a B-function. The function $f$ is ergodic on the sphere $S_{p^{-r}}(y)$ of a sufficiently small radius $p^{-r}$ if and only if one of the following alternatives holds:
(i) Whenever $p$ is odd, then simultaneously
• $f(y) \equiv y \pmod{p^{r+1}}$,
• $f'(y)$ is primitive modulo $p^2$.
(ii) Whenever $p = 2$, then simultaneously
• $f(y) \equiv y \pmod{2^{r+1}}$,
• $f(y) \not\equiv y \pmod{2^{r+2}}$,
• $f'(y) \equiv 1 \pmod 4$.

Note Within the context of the theorem, ‘sufficiently small’ means that $r \ge 2$ if $p > 3$, or $r \ge 3$ if $p \le 3$.
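Alternative (i) of Theorem 5.56 can be illustrated on the simplest case, chosen here only as a hypothetical test case: the linear map $f(z) = az$ with fixed point $y = 0$ and $f'(0) = a$. The sketch below takes $p = 3$, $r = 3$, and uses Proposition 5.55 to test transitivity on the reduced spheres for $k = 3, 4, 5$; all function names are assumptions of this example.

```python
def multiplicative_order(a, m):
    """Order of a in (Z/mZ)^*; assumes gcd(a, m) == 1."""
    x, order = a % m, 1
    while x != 1:
        x = x * a % m
        order += 1
    return order

def is_primitive_mod_p2(a, p):
    """True if a generates the group of units modulo p^2."""
    return a % p != 0 and multiplicative_order(a, p * p) == p * (p - 1)

def is_transitive_on_sphere(f, y, p, r, k):
    """Does f mod p^(k+1) cyclically permute the residues z with
    ord_p(z - y) = r, i.e. the sphere S_{p^-r}(y) reduced modulo p^(k+1)?"""
    m = p ** (k + 1)
    sphere = {z for z in range(m)
              if (z - y) % p ** r == 0 and (z - y) % p ** (r + 1) != 0}
    start = min(sphere)
    x = start
    for step in range(1, len(sphere) + 1):
        x = f(x) % m
        if x not in sphere:
            return False
        if x == start:
            return step == len(sphere)
    return False

p, r = 3, 3
for a in (2, 4, 5, 7, 8):
    ergodic_looking = all(is_transitive_on_sphere(lambda z, a=a: a * z, 0, p, r, k)
                          for k in (3, 4, 5))
    print(a, is_primitive_mod_p2(a, p), ergodic_looking)
```

For each sampled $a$ the two printed flags agree, which is what the theorem predicts for this family.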
Corollary 5.57 Let $y \in \mathbb{Z}_p$ be a fixed point of the function $f \in \mathcal{B}$, and let $p$ be odd. Then $f$ is ergodic on all spheres around $y$ of sufficiently small radii if and only if $f$ is ergodic on some sphere around $y$ of a sufficiently small radius.

From Theorem 5.56 we derive a complete characterization of B-functions that are ergodic on p-adic spheres.

Theorem 5.58 Let $f$ be a B-function. Whenever $p$ is odd, the mapping $z \mapsto f(z)$ is an ergodic transformation on every sufficiently small sphere centered at $y \in \mathbb{Z}_p$ if and only if the following two conditions hold simultaneously:
• $f(y) = y$, and
• the derivative $f'(y)$ of the function $f$ at the point $y \in \mathbb{Z}_p$ is primitive modulo $p^2$.
In the case $p = 2$ no B-function exists such that the mapping $z \mapsto f(z)$ is ergodic on all spheres around $y \in \mathbb{Z}_2$ of radii less than $\varepsilon$, whatever $\varepsilon > 0$ is taken.

A number of results, both previously known and new, can be derived from the theorems stated above; for instance:

Examples 5.59 Let $p$ be an odd prime.
• Consider the automaton function $M_{a,\ell}(z) = az^{\ell}$, where $\ell \in \mathbb{N}$, $\ell \not\equiv 1 \pmod p$ and $a \in B_{p^{-1}}(1)$. The map $M_{a,\ell}$ has a unique fixed point $x_0 \in B_{p^{-1}}(1)$, and $M_{a,\ell}$ is ergodic on $S_{p^{-r}}(x_0)$ if and only if $\ell$ is primitive modulo $p^2$ (cf. [29]).
• The affine map $T_{a,b}: z \mapsto az + b$, where $a, b \in \mathbb{Z}_p$, $a \ne 1$, is an automaton function. It has a fixed point $y = \frac{b}{1-a} \in \mathbb{Q}_p$. When $y \in \mathbb{Z}_p$, the map $T_{a,b}$ is ergodic on $S_{p^{-r}}(y)$ if and only if $a$ is primitive modulo $p^2$ (cf. [29]).
• The perturbed monomial mapping $f: x \mapsto x^{\ell} + q(x)$, where $q(x) = p^{r+1}u(x)$ and $u$ is a B-function (whence, $f$ is an automaton function), is ergodic on the sphere $S_{p^{-r}}(1)$, $r > 1$, if and only if $\ell$ is primitive modulo $p^2$. This solves the problem posed in [58]. Note that the result is true for $p = 2$ as well (which follows from Theorem 5.56), and for $r = 1$ (cf. [111]).
• Let $\ell \in \mathbb{N}$ be primitive modulo $p^2$. Then the automata functions $f(x) = 1 + \ell \cdot (-1 + x + p^2 \cdot v(x))$ and $g(x) = \ell \cdot (ax + a^x - 2a) + 1$ are ergodic on all (sufficiently small) spheres around 1, for every $a \in 1 + p^2\mathbb{Z}_p$ and every B-function $v$; and the automata functions $f(x) = \ell \cdot x + \ln_p(1 + p^2 x)$ and $g(x) = \frac{\ell \cdot x}{1 + p^2 x}$ are ergodic on all (sufficiently small) spheres around 0.

Many important automata functions do not lie in $\mathcal{B}$; however, they lie in the wider class $\mathcal{A}$. We can determine whether an A-function is ergodic on a p-adic sphere as well:

Theorem 5.60 The statement of Theorem 5.56 remains true for A-functions.

Note The definition of an A-function implies that if $f \in \mathcal{A}$ then $f = \frac{1}{p^n}\bar f$ for a suitable B-function $\bar f$ and a suitable non-negative rational integer $n$. In contrast to
Theorem 5.56, under the conditions of Theorem 5.60 it depends also on $n$ (i.e., on the function $f$) how small the sphere $S_{p^{-r}}(y)$ must be to satisfy the Theorem. Here are examples of A-functions (which are not B-functions) that are ergodic on all sufficiently small spheres around 0 if $\ell \in \mathbb{Z}$ is primitive modulo $p^2$:

$$f(x) = \ell \cdot x + \ln_p(1 + p^2 x) + \frac{1}{p}(x^p - x)^2; \qquad f(x) = \frac{\ell \cdot x}{1 + p^2 x} + \frac{1}{p}(x^p - x)^2.$$
5.7 Transitivity of Automata From Sect. 5.5.1 we already know how the p-adic ergodic theory can be applied to determine weak invertibility of automata (cf. very beginning of Sect. 5) since the weak invertibility is equivalent to the measure-preservation of the corresponding automaton function. Similar approach can be applied to determine if an automaton function is ergodic; however, the transitivity of automata is a wider notion which in general is not reduced only to the ergodicity of the automaton function; moreover, it is related to how the points of orbits are distributed. In this Subsection, we deal with various aspects of transitivity mainly following [21]. Conventions: Before, speaking of an automaton we have meant initial transducer A(s0 ) = Fp , S, Fp , S, O, s0 ; but since now we will also consider non-initial automata A(s0 ) = Fp , S, Fp , S, O. To avoid possible confusion, the latter will be referred as to discrete systems, or, for brevity, just as to systems. The justification of the latter term is as follows: According to the most general definition of a system in general mathematical systems theory (see e.g. [69]), by a discrete system people usually understand a stationary dynamical system with discrete time, that is, a 5tuple A = I, S, O, S, O where I is a non-empty finite set, the input alphabet; O is a non-empty finite set, the output alphabet; S is a non-empty (possibly, infinite) set of states; S : I × S → S is a state transition function; O : I × S → O is an output function. That is, the stationary dynamical systems with discrete time are just non-initial automata in our terminology. Obviously, the system A corresponds to the family F(A) of all automata A(s) = I, S, O, S, O, s, s ∈ S. To the latter family, we relate a family of automata functions fA(s) , s ∈ S. 
We will additionally assume that I = O = F_p, p a prime (though some further results are true without this limitation), and that there exists a state s_0 ∈ S such that all the states of the system A are reachable from s_0. To introduce the main notion of the Subsection, we recall the notion of transitivity of a collection of maps:

Definition 5.61 (Transitivity) A collection F of mappings of a finite non-empty set M into M is called transitive whenever given a pair (a, b) ∈ M × M, there exists f ∈ F such that f(a) = b.
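Definition 5.61 can be checked by brute force on a small set. The following is only a minimal sketch: the function names and the two sample maps on M = {0, ..., 7} are illustrative assumptions, not objects from the text.

```python
from itertools import product


def is_transitive_collection(maps, M):
    """Definition 5.61: every pair (a, b) in M x M is joined by some f in maps."""
    M = list(M)
    return all(any(f(a) == b for f in maps) for a, b in product(M, repeat=2))


def iterates(f, M):
    """Return f^0, f^1, ..., f^(|M|-1) as functions (hypothetical helper)."""
    def power(i):
        def g(x):
            for _ in range(i):
                x = f(x)
            return x
        return g
    return [power(i) for i in range(len(list(M)))]


M = range(8)
shift = lambda x: (x + 1) % 8    # cyclically permutes M
double = lambda x: (2 * x) % 8   # not a permutation of M

print(is_transitive_collection(iterates(shift, M), M))
print(is_transitive_collection(iterates(double, M), M))
```

For a single map, taking the first |M| iterates is enough here, since a cyclic permutation of M reaches every element from any starting point within |M| − 1 steps.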
V. Anashin
Note that whenever F consists of only one mapping f, the latter is transitive if it is bijective and the collection {e = f^0, f = f^1, f^2, f^3, ...} is transitive in the sense of the above definition (here as usual f^i stands for the i-th iterate of f). In other words, the mapping f : M → M is transitive if and only if it cyclically permutes the elements of M. Now we are able to state the main notion of the subsection:

Definition 5.62 (Automata Transitivity) The automaton A(s_0) (equivalently, the system A) is said to be
• n-word transitive, if the mapping f_{n,A(s_0)} is transitive on the set W_n of all words of length n;
• word transitive, if A(s_0) is n-word transitive for all n ∈ N;
• completely transitive, if for every n ∈ N, the collection f_{n,A(s)}, s ∈ S, is transitive on W_n;
• absolutely transitive, if for every s ∈ S the automaton A(s) is completely transitive; that is, if for every n ∈ N the collection f_{n,A(t)}, t ∈ S_{A(s)}, is transitive on W_n, where S_{A(s)} is the set of all reachable states of the automaton A(s).

The transitivity properties may be defined in an equivalent way:

Definition 5.63 (Automata Transitivity, Equivalent)
(i) Word transitivity means that given two finite words w, w′ of equal length n, the word w can be transformed into w′ by a sequential composition of a sufficient number of copies of A = A(s_0):
(ii) Complete transitivity means that given finite words w, w′ of equal length, there exists a finite word y (maybe of length other than that of w and w′) such that the automaton A(s_0) transforms the input word w ∘ y (whose prefix is y) to the output word w′ ∘ y′ whose suffix is w′:
(iii) Absolute transitivity means that given finite words x, w, w′ such that w and w′ are of equal length (while the length of x may differ from that of w), there exists a finite word y such
The p-adic Theory of Automata Functions
that the automaton A(s_0) transforms the input word w ∘ y ∘ x to the output word w′ ∘ y′ ∘ x′:
By Theorem 5.6, an automaton A = A(s_0) is word transitive if and only if its automaton function f_A is ergodic; so to determine whether the automaton is word transitive one may apply the various techniques developed before. For instance, the following theorem is just a re-statement of Theorem 5.42:

Theorem 5.64 Let the automaton function f = f_A : Z_p → Z_p be uniformly differentiable modulo p^2. Then the automaton A is word transitive if and only if it is n-word transitive for a sufficiently large n.

The notions of complete and absolute transitivity are more complex: they are related to ergodicity of a collection of maps rather than to the ergodicity of a single map.

Definition 5.65 (Ergodicity of a Collection of Maps) A collection F = {f_i : i ∈ I} of measurable maps f_i : S → S (which are not necessarily ‘onto’ maps) of a measure space S endowed with a probability measure μ is called ergodic if the maps f_i, i ∈ I, have no common μ-measurable invariant subset other than sets of measure 0 or 1; that is, if there exists a μ-measurable subset S′ ⊂ S such that f_i^{-1}(S′) = S′ for all i ∈ I, then necessarily either μ(S′) = 0 or μ(S′) = 1.

Note that if a collection consists of only one map, Definition 5.65 yields the general definition of ergodicity of a single map, with no assumption about measure-preservation of the latter map, although everywhere in the paper, speaking of ergodicity of a single map, we additionally assume that the map is measure-preserving. Theorem 5.6 can be re-stated for collections of maps in the following form:

Theorem 5.66 If for all k = 1, 2, ... a collection F = {f_i : i ∈ I} of 1-Lipschitz maps f_i : Z_p^n → Z_p^n is transitive modulo p^k (that is, if the collection F mod p^k = {f_i mod p^k : i ∈ I} of reduced maps is transitive on (Z/p^k Z)^n for all k = 1, 2, ...) then the collection F is ergodic with respect to the p-adic measure μ_p.
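As an illustration of n-word transitivity, one can test whether the reduced map x ↦ f(x) mod p^n is a single cycle on Z/p^n Z. The sketch below assumes that the map f_{n,A} on words of length n is represented by f mod p^n (words of length n being identified with residues modulo p^n); the function name and the sample maps are illustrative only.

```python
def is_n_word_transitive(f, p, n):
    """Return True if x -> f(x) mod p^n cyclically permutes Z/p^n Z."""
    modulus = p ** n
    x = 0
    seen = set()
    # Follow the orbit of 0 until some residue repeats.
    while x not in seen:
        seen.add(x)
        x = f(x) % modulus
    # A single cycle visits every residue and closes up at the start.
    return x == 0 and len(seen) == modulus


affine_a = lambda x: 5 * x + 1   # full period modulo every power of 2
affine_b = lambda x: 3 * x + 1   # a single cycle modulo 2 only
step_two = lambda x: x + 2       # never leaves the even residues

print([is_n_word_transitive(affine_a, 2, n) for n in (1, 2, 3, 8)])
print([is_n_word_transitive(affine_b, 2, n) for n in (1, 2, 3)])
print(is_n_word_transitive(step_two, 2, 3))
```

The second map shows why a check at a single small n is not sufficient in general: it passes for n = 1 and fails from n = 2 on.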
It turns out that to study transitivity of automata (that is, ergodicity of the corresponding collections of 1-Lipschitz maps) extra techniques are needed, which are tightly connected with real functions defined by automata. We consider that theme in the next Section. Concluding the current Section, we note that ergodicity of automata maps can be studied with respect to measures other than the Haar measure on Z_p, see [48, 49], but this is quite another story which we do not touch in the paper.
6 Plots of Automata Functions in R^n

When studying real functions that can be computed by an automaton A whose input/output alphabets are A = {0, 1, ..., p − 1} (where p > 1 is an integer from N = {1, 2, 3, ...}) most authors follow a common approach which is described in, e.g., [43, Chapter XIII, Section 4]: They associate an infinite word α_1 α_2 ... α_n ... over A to the real number whose base-p expansion is 0.α_1 α_2 ... α_n ... = Σ_{i=1}^∞ α_i p^{-i} and consider a real function f^A defined as follows: Given x ∈ [0, 1], take its base-p expansion x = Σ_{i=1}^∞ α_i p^{-i}; then produce an infinite output sequence β_1 β_2 ... β_n ... of A by successively feeding the automaton with the letters α_1, α_2, etc., and put f^A(x) = Σ_{i=1}^∞ β_i p^{-i}. Being fed an infinite input sequence α_1 α_2 ... α_n ..., the automaton A produces a unique infinite output sequence β_1 β_2 ... β_n ...; therefore the function f^A is well defined everywhere on the real closed unit interval (segment) I = [0, 1] with the possible exception of a countable set D ⊂ [0, 1] of points; namely, of those having two different base-p expansions 0.γ_1 γ_2 ... γ_n 0 ... 0 ... = 0.γ_1 γ_2 ... γ_{n−1} (γ_n − 1)(p − 1) ... (p − 1) .... The point set M(A) = {(x; f^A(x)) ∈ R^2 : x ∈ [0, 1]} can be considered as a graph of the real function f^A specified by the automaton A (note that every time, before being fed the very first letter of each infinite input word, the automaton A is assumed to be in a fixed state s_0, the initial state). Indeed, f^A(x) is defined uniquely for x ∈ [0, 1] \ D and f^A(x) can take at most two values for x ∈ D; so f^A can be treated as a real function which is defined on the unit segment [0, 1] and has no more than a countable number of points of discontinuity in [0, 1]. In the sequel we refer to M(A) as the Monna graph of the automaton A. The said common approach (and its various generalizations) is utilized, e.g., in [31–33, 51, 62, 85, 94, 119].
Speaking loosely, the common approach looks as if one feeds the automaton A with the base-p expansion of a real number x ∈ [0, 1] so that the leftmost (i.e., the most significant) digits are fed to the automaton prior to the rightmost ones, and observes the output as real numbers (since the automaton accordingly outputs the leftmost digits of the base-p expansion of f^A(x) ∈ [0, 1] prior to the rightmost ones), thus ascribing to the automaton A the real function f^A. We stress that the function f^A is well defined almost everywhere on [0, 1] precisely because of the order in which the digits of the base-p expansion are fed to (and output from) the automaton A. A crucial difference of the approach used in papers [16, 17, 21, 22] from the mentioned one is that in the latter papers digits are fed to (and are read from) the automaton in another (i.e., inverse) order: Namely,
(i) given a real number x ∈ [0, 1], represent x via its base-p expansion x = 0.α_1 α_2 ... α_n ... (take both expansions if x has two different ones);
(ii) from the base-p expansion 0.α_1 α_2 ... α_n ... derive the corresponding sequence α_1, α_1 α_2, α_1 α_2 α_3, ... of words; then
(iii) feeding the automaton A successively with the words α_1, α_1 α_2, α_1 α_2 α_3, ... so that the rightmost letters are fed to A prior to the leftmost ones, construct the corresponding output word sequence ζ_1^1, ζ_1^2 ζ_2^2, ζ_1^3 ζ_2^3 ζ_3^3, ...;
(iv) to the output sequence put into correspondence the sequence S(x) of rational numbers whose base-p expansions are 0.ζ_1^1, 0.ζ_1^2 ζ_2^2, 0.ζ_1^3 ζ_2^3 ζ_3^3, ..., thus obtaining a point set X(x) = {(0.α_1 ... α_i; 0.ζ_1^i ζ_2^i ... ζ_i^i) : i = 1, 2, ...} in the real unit square I^2 = [0, 1] × [0, 1]; after that
(v) consider the set F(x) of all cluster points of the sequence S(x);
(vi) finally, specify a real plot (or, briefly, a plot) of the automaton A as the union P(A) = ∪_{x∈[0,1], y∈F(x)} ((x; y) ∪ X(x)).

To put this in other words, P(A) is the closure in the unit square I^2 of the union ∪_{i=1}^∞ L_i(A) where L_i(A) = {(0.α_1 ... α_i; 0.ζ_1^i ζ_2^i ... ζ_i^i) : x ∈ I} is the i-th layer of the plot P(A). That is, the plot P(A) can be considered as a ‘limit’ of the sequence of sets ∪_{i=1}^N L_i(A), the approximate plots at word length N, as N → ∞. We stress the crucial difference between real plots and Monna graphs: In contrast to the Monna graph M(A), a real plot P(A) is capable of showing the long-term behavior of the automaton A (i.e., when sufficiently long words are used as input of A) rather than the short-term behaviour displayed by the Monna graph M(A): Due to the very construction of the real plot, the higher order (i.e., the most significant) digits of the real number represented by the output word are formed by the latest output letters of the output word, whereas the construction of the Monna graph assumes that the higher order digits are formed by the earliest output letters. This results in drastically different appearances of the real plot and of the Monna graph: For instance, Figs. 8, 9, 10 show that the real plot clearly demonstrates ‘ultimate linearity’ of the corresponding automaton (i.e., that the automaton exhibits linear long-term behavior) whereas the Monna graph is incapable of revealing this important feature of the automaton, cf. Fig. 11. This is the main reason why in the paper we focus on real plots of automata rather than on their Monna graphs.
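Steps (i)-(iv) can be sketched for a concrete transducer. The two-state binary automaton below, which computes z ↦ z + 1, is a hypothetical example chosen for brevity (it is not taken from the text), and all function names are illustrative.

```python
from fractions import Fraction


def run_increment_automaton(word):
    """Feed a binary word alpha_1 ... alpha_i to a hypothetical 2-state
    transducer computing z -> z + 1; the rightmost letter is fed first,
    as in step (iii)."""
    carry = 1                        # initial state s_0: a carry is pending
    out = []
    for chi in reversed(word):       # rightmost letter enters first
        out.append(chi ^ carry)      # output letter
        carry = chi & carry          # state update
    return out[::-1]                 # output word in the same orientation


def as_fraction(word, p=2):
    """Read the word a_1 a_2 ... a_i as the base-p number 0.a_1 a_2 ... a_i."""
    return sum(Fraction(a, p ** (j + 1)) for j, a in enumerate(word))


def plot_points(alpha):
    """Points of X(x) for the first len(alpha) digits of x = 0.alpha_1 alpha_2 ..."""
    return [
        (as_fraction(alpha[:i]), as_fraction(run_increment_automaton(alpha[:i])))
        for i in range(1, len(alpha) + 1)
    ]


for point in plot_points([0, 1, 1, 1]):
    print(point)
```

Because the latest output letter becomes the most significant digit, the second coordinate is governed by what the automaton does at the end of a long word; this is the feature of real plots emphasized above.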
Therefore when specifying a notion of computability of a real-valued function g : G → [0, 1] ⊂ R (where G ⊂ [0, 1]) on automata, at least two different approaches exist: The first one is to speak of the case when the graph G(g) = {(x; g(x)) : x ∈ G} of the function g lies completely in M(A) for some automaton A, while the second one is to consider the case when G(g) ⊂ P(A). The aforementioned papers [31–33, 51, 62, 85, 94, 119] basically deal with computability in the first meaning, whereas in the current paper we consider computability of the second kind.
Fig. 8 Approximate plot of an automaton at word length 16
Fig. 9 Approximate plot of the same automaton at word length 17
Fig. 10 Cluster points of the plot of the same automaton
Fig. 11 Monna graph of the same automaton
To the best of our knowledge, the approach which is based on real plots rather than on Monna graphs originally emerged in [12] and was never considered before by other authors.
6.1 Automata 0-1 Law

As before, in the Subsection ‘automaton’ stands for an initial transducer A(s_0) = ⟨F_p, S, F_p, S, O, s_0⟩ such that all states from S are reachable from the initial state s_0. Given an automaton A(s_0), consider the corresponding automaton function f = f_{A(s_0)} : Z_p → Z_p. For k = 1, 2, ... let the k-th layer L_k(f) = L_k(A) be the set of all the following points e_k^f(x) of the Euclidean unit square I^2 = [0, 1] × [0, 1] ⊂ R^2:
\[ e_k^f(x) = \left( \frac{x \bmod p^k}{p^k}, \frac{f(x) \bmod p^k}{p^k} \right), \]
where x ∈ Z_p. Note that x mod p^k corresponds to the prefix of length k of the infinite word x ∈ Z_p, i.e., to the input word of length k of the automaton A(s_0), while f(x) mod p^k corresponds to the respective output word of length k. That is, given an input word w = χ_{k−1} ··· χ_1 χ_0 and the corresponding output word w′ = ξ_{k−1} ··· ξ_1 ξ_0, consider in I^2 the set of all points (χ_{k−1} p^{-1} + ··· + χ_1 p^{-k+1} + χ_0 p^{-k}, ξ_{k−1} p^{-1} + ··· + ξ_1 p^{-k+1} + ξ_0 p^{-k}), for all pairs (w, w′) of input/output words of length k; cf. Fig. 12. It can be observed by computer experiments that basically the so obtained snapshots of the behaviour of an automaton A can be of two types only:
(i) As k → ∞, the point set L_k(f) is getting more and more dense (cf. Figs. 13, 14, 15, 16, p = 2), or
(ii) L_k(f) is getting less and less dense as k → ∞, cf. Figs. 17, 18, 19, 20 (p = 2).
Now we explain this experimental phenomenon. Denote by P(f) the real plot of the automaton A whose automaton function is f = f_A; that is, P(f) is the closure of the set L(f) = ∪_{k=1}^∞ L_k(f) in the topology of the real plane R^2. As P(f) is a closed set, it is measurable with respect to the Lebesgue measure on the real plane R^2. Let α(f) be the Lebesgue measure of P(f). It is clear that 0 ≤ α(f) ≤ 1; however, it turns out that in fact only the two extreme cases occur: α(f) = 0 or α(f) = 1. The following theorem is true, [12, 21]:
Fig. 12 The point (0.χk−1 . . . χ1 χ0 , 0.ξk−1 . . . ξ1 ξ0 ) ∈ I2 corresponds to transformation of the input word χk−1 · · · · · · · · · χ1 χ0 to output word ξk−1 · · · · · · · · · ξ1 ξ0 by the automaton A
Fig. 13 f(x) = 2x^2 + 3x + 1, k = 16
Fig. 14 Same function f , k = 18
Fig. 15 Same function f , k = 20
Fig. 16 Same function f , k = 23
Fig. 17 The function f(x) = x + ((x^2) OR (−131065)), k = 16
Fig. 18 The function f(x) = x + ((x^2) OR (−131065)), k = 17
Fig. 19 Same function, k = 18
Theorem 6.1 (The Automata 0-1 Law) For an automaton function f : Z_p → Z_p, the following alternative holds: Either α(f) = 0 (equivalently, P(f) is nowhere dense in I^2), or α(f) = 1 (equivalently, P(f) = I^2). Moreover, α(f) = 1 if and only if the automaton A_f whose automaton function is f is completely transitive.

Recall that nowhere dense sets can nevertheless have positive Lebesgue measure, cf. fat Cantor sets (e.g. the Smith-Volterra-Cantor set), also known as ε-Cantor sets, see e.g. [3]. In view of Theorem 6.1, from now on we say for short that a 1-Lipschitz function f : Z_p → Z_p (respectively, a transducer A = A_f) is of measure 1 if and only if α(f) = 1, and of measure 0 otherwise.
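The layers L_k(f) and the two types of behaviour described before Theorem 6.1 can be explored numerically. The sketch below is a rough experiment only, not a proof technique: it computes a layer exactly and then counts which cells of a p^j × p^j grid are hit. The helper names and the grid-counting heuristic are assumptions made for illustration; the polynomial is the one from Fig. 13.

```python
from fractions import Fraction


def layer(f, p, k):
    """k-th layer L_k(f): all points ((x mod p^k)/p^k, (f(x) mod p^k)/p^k).

    Since f is 1-Lipschitz, f(x) mod p^k depends only on x mod p^k,
    so it suffices to let x run over the residues 0, ..., p^k - 1.
    """
    q = p ** k
    return {(Fraction(x, q), Fraction(f(x) % q, q)) for x in range(q)}


def occupied_fraction(f, p, k, j):
    """Share of the p^j x p^j grid cells of the unit square hit by L_k(f), j <= k."""
    q, cells = p ** k, p ** j
    shift = p ** (k - j)
    hit = {(x // shift, (f(x) % q) // shift) for x in range(q)}
    return Fraction(len(hit), cells * cells)


poly = lambda x: 2 * x * x + 3 * x + 1   # function of Fig. 13
identity = lambda x: x                   # an affine map, for comparison

print(sorted(layer(poly, 2, 1)))
for k in (6, 9, 12):
    print(k, float(occupied_fraction(poly, 2, k, 3)),
          float(occupied_fraction(identity, 2, k, 3)))
```

For the identity map the occupied share stays at p^{-j} for every k, since all points lie on the diagonal; for a function of measure 1 one would expect the share to grow towards 1 as k increases.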
Fig. 20 Same function, k = 22
Open question 6.2 Characterize automata of measure 1 (or, equivalently, automata of measure 0). Currently only sufficient conditions are known; we are going to discuss the conditions.
6.1.1 Conditions for Complete Transitivity
According to Theorem 6.1, to determine whether an automaton is completely transitive is to determine whether it is of measure 1 or not; so below we reproduce some sufficient conditions for an automaton to be of measure 0 or 1 from [12, 21].

Theorem 6.3 (Finite Automata Are All of Measure 0) Whenever a 1-Lipschitz function f : Z_p → Z_p is an automaton function of a finite automaton, f is of measure 0. Therefore a finite automaton cannot be completely transitive even if it is ergodic.

Theorem 6.4 Let f : Z_p → Z_p be a 1-Lipschitz function, and let f be differentiable everywhere in a ball B ⊂ Z_p of a non-zero radius. The function f is of measure 1 whenever the following two conditions hold simultaneously:
(i) f(B ∩ N_0) ⊂ N_0;
(ii) f is two times differentiable at some point v ∈ B ∩ N_0, and f″(v) ≠ 0.

Corollary 6.5 Under conditions of Theorem 6.4, let B = Z_p and let f″(x) have no more than a finite number of zeros in N_0. Then the automaton whose automaton function is f is absolutely transitive. In particular, if the automaton function f = f_A of an automaton A is a univariate polynomial of degree ≥ 2 with rational integer coefficients then the automaton A is absolutely transitive.

Examples 6.6 The following automata functions are of measure 1:
• f(x) = cx + c^x if c ∈ {2, 3, 4, ...} and c ≡ 1 (mod p);
• f(x) = (x AND c) + ((x^2) OR c) if c ∈ Z and p = 2.
• The constant function which corresponds to the van der Corput word ...100011111010110011101 (recall that the latter word is a concatenation of the base-2 expansions of natural numbers 1, 2, 3, 4, 5, 6, 7, 8, ...).
The following automata functions are of measure 0:
• A polynomial of degree 1 over Z_p ∩ Q; note that there exist ergodic as well as non-ergodic polynomials of this kind, cf. Theorem 5.12. Note also that this polynomial is a finite automaton function, cf. Proposition 2.16.
• A constant function c ∈ Z_p ∩ Q. This is a finite automaton function as well, cf. Proposition 2.16.
• The automaton function f(x) = 1 + x + 8 · (x^2 AND (−1/3)), p = 2. This is an ergodic infinite automaton function; i.e., f is an automaton function but no finite automaton may have f as its automaton function.
6.1.2 Distribution of Orbits of Automata Functions in R^n
The automata 0-1 law stated at the beginning of Sect. 6.1 can be generalized to the n-dimensional case as follows. Given an automaton function f = f_A : Z_p → Z_p and n ≥ 2, consider the set of all points e_{k,n}^f(x) of the Euclidean unit hypercube I^n = [0, 1]^n ⊂ R^n:
\[ e_{k,n}^f(x) = \left( \frac{x \bmod p^k}{p^k}, \frac{f(x) \bmod p^k}{p^k}, \ldots, \frac{f^{n-1}(x) \bmod p^k}{p^k} \right); \]
let P_n(f), the n-dimensional plot of the automaton function f, be the closure in R^n of all these points, for all x ∈ Z_p and all k = 1, 2, 3, ..., and let α_n(f) be the Lebesgue measure of P_n(f). It turns out that the automata 0-1 law (cf. Theorem 6.1) holds for n > 2 as well: The only values which α_n(f) can take are 0 and 1, cf. [12]. Thus a natural question arises: Do there exist automata A such that α_n(f_A) = 1 for all n = 2, 3, 4, ...? This question was answered in [91, 92]; namely, the following is true:

Theorem 6.7 Let f be a polynomial over Z, let deg f ≥ 2. Then α_n(f) = 1, for all n = 2, 3, 4, ....

Note that the theorem holds for arbitrary p ≥ 2 which is not necessarily a prime. Moreover, in [91, 92] it was shown that under conditions of Theorem 6.7 the distribution of points e_{k,n}^f(x) in I^n tends to uniform as k → ∞. This result is not only of significant theoretical value but also of high importance in applications to pseudorandom number generator design since it means that taking any initial point x as a ‘seed’ and generating an orbit x, f(x), f^2(x), ... of the polynomial f, one
obtains a uniformly distributed point set in the real unit hypercube [0, 1]^n when the number of digits k of every iterate f^i(x) is taken sufficiently large. In other words, the distribution of n-tuples in the point sequence (f^i(x) mod p^k)/p^k can be made ‘sufficiently uniform’ by choosing k sufficiently large. Moreover, it is possible to establish a uniformity criterion for a collection of finite automata functions by imposing some synchronization condition the automata should satisfy, see details in [90].
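The points e_{k,n}^f(x) built from an orbit can be computed directly. This is a minimal sketch with an illustrative function name; it assumes f maps integers to integers and is 1-Lipschitz (as integer polynomials are), so that iterates may be reduced modulo p^k at every step.

```python
from fractions import Fraction


def orbit_point(f, x, p, k, n):
    """Point ((x mod p^k)/p^k, (f(x) mod p^k)/p^k, ..., (f^(n-1)(x) mod p^k)/p^k)."""
    q = p ** k
    coords = []
    for _ in range(n):
        coords.append(Fraction(x % q, q))
        x = f(x) % q   # reduction mod p^k commutes with a 1-Lipschitz f
    return tuple(coords)


poly = lambda x: 2 * x * x + 3 * x + 1   # a polynomial over Z of degree 2

print(orbit_point(poly, 0, 2, 4, 3))
print(orbit_point(lambda x: x + 1, 7, 2, 3, 3))
```

Collecting such tuples over many seeds x and a large k gives the point sets whose distribution Theorem 6.7 and the results of [91, 92] describe.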
6.2 Plots of Finite Automata

From Theorem 6.3 we know that finite automata are all of measure 0; that is, their plots cannot contain ‘figures’; but the plots may contain ‘lines’. It turns out that the lines can be only straight ones, cf. [16, 17, 22]. To state the corresponding theorem we need the following definition:

Definition 6.8 (Limit Plot) Given an automaton A, we call a limit plot of the automaton A the point set LP(A) which is defined as follows: A point (x; y) ∈ R^2 lies in LP(A) if and only if there exist z ∈ Z_p and a strictly increasing infinite sequence k_1 < k_2 < ... of numbers from N such that simultaneously
\[ \lim_{i\to\infty} \frac{z \bmod p^{k_i}}{p^{k_i}} = x; \qquad \lim_{i\to\infty} \frac{f_A(z) \bmod p^{k_i}}{p^{k_i}} = y. \tag{6.50} \]
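The sequences appearing in (6.50) need not converge, which is why subsequences k_1 < k_2 < ... are used. The sketch below evaluates (z mod p^k)/p^k for a rational p-adic integer z = a/b; the choice z = −1/3 in Z_2 and the function names are illustrative assumptions.

```python
from fractions import Fraction


def residue(a, b, p, k):
    """(a/b) mod p^k for a rational p-adic integer a/b (p must not divide b)."""
    q = p ** k
    return (a * pow(b, -1, q)) % q   # modular inverse via pow (Python 3.8+)


def scaled_residues(a, b, p, ks):
    """The numbers ((a/b) mod p^k) / p^k for k in ks."""
    return [Fraction(residue(a, b, p, k), p ** k) for k in ks]


# z = -1/3 = 1 + 4 + 16 + ... in Z_2
print([float(v) for v in scaled_residues(-1, 3, 2, range(1, 21, 2))])  # odd k
print([float(v) for v in scaled_residues(-1, 3, 2, range(2, 22, 2))])  # even k
```

Here z mod 2^k equals (4^{j+1} − 1)/3 for k = 2j + 1 and (4^j − 1)/3 for k = 2j, so the scaled residues tend to 2/3 along odd k and to 1/3 along even k: the full sequence has two accumulation points and no limit.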
Within the Subsection we consider, when appropriate, both plots and limit plots of automata as point sets in the unit torus T^2 = R^2/Z^2 rather than in the unit square I^2; the torus is obtained by ‘gluing together’ opposite sides of the unit square I^2; i.e., by reducing coordinates modulo 1. This leads to a more transparent description of the aforementioned straight lines which turn out to be windings of the torus T^2, cf. Fig. 10. We stress here once again a crucial difference in the construction of plots and of Monna graphs of automata: Given a canonical expansion z = Σ_{i=0}^∞ γ_i p^i of a p-adic integer z, we put into correspondence to z a single real number mon(z) = Σ_{i=0}^∞ γ_i p^{-i-1} while constructing Monna graphs; whereas while constructing plots we associate to z the whole set of all accumulation points of the sequence (p^{-m}(z mod p^m))_{m=1}^∞, and the latter set may not consist of a single point; moreover, ‘usually’ the set does not consist of a single point since with probability 1 the set is the whole segment [0, 1]. Therefore to investigate the structure of plots we need to deal with sets of all limit points of (usually non-convergent) sequences rather than with limits of convergent sequences, as is the case for Monna maps. We will need a few concepts from torus knot theory; details may be found in numerous books on knot theory, see e.g. [34, 98]. For our purposes it is enough to recall only two notions, the knot and the link. Recall that a knot is a smooth embedding of a circle S into R^3 and a link is a smooth embedding of several disjoint circles in R^3, cf. [98]. We will consider only special types of knots and links, namely,
torus knots and torus links. Informally, a torus knot is a smooth closed curve without intersections which lies completely in the surface of a torus T^2 ⊂ R^3, and a link (of torus knots) is a collection of torus knots, see e.g. [38, Section 26] for formal definitions.

Definition 6.9 (Cable of the Torus) A cable of the torus is an image of a straight line in R^2 under the map mod 1 : (x; y) ↦ (x mod 1; y mod 1) of the Euclidean plane R^2 onto the 2-dimensional real torus T^2 = R^2/(Z × Z) = S × S ⊂ R^3. If the line is defined by the equation y = ax + b we say that a is the slope of the cable C(a, b). We denote by C(∞, b) the cable which corresponds to the line x = b, the meridian, and say that the slope is ∞ in this case. Cables C(0, b) whose slope is 0 (i.e., the ones that correspond to straight lines y = b) are called parallels.

In dynamics, cables of the torus T^2 are viewed as orbits of linear flows on the torus; that is, of dynamical systems on T^2 defined by a pair of differential equations of the form dx/dt = β; dy/dt = α on T^2, whence, by a pair of parametric equations x = (βt + τ) mod 1; y = (αt + σ) mod 1 in Cartesian coordinates, cf. e.g. [60, Subsection 4.2.3].

Note 6.10 It is well known that a cable defined by the straight line y = ax + b is dense in T^2 if and only if −∞ < a < +∞ and the slope a = α/β is irrational, see e.g. [60, Proposition 4.2.8] or [104, Section 5.4].

Given a Cartesian coordinate system XYZ of R^3, a torus can be obtained by rotation around the Z-axis of a circle which lies in the plane XZ.
If the radius of the circle is r and the circle is centered at a point lying on the X-axis at distance R from the origin, then in cylindrical coordinates (r_0, θ, z) of R^3 (where r_0 is the radius-vector in the Cartesian coordinate system XY, θ is the angle of the radius-vector in coordinates XY, z is the Z-coordinate in the Cartesian coordinate system XYZ) the torus is defined by the equation (r_0 − R)^2 + z^2 = r^2 and a cable (with a rational slope α/β where α ∈ Z and β ∈ N) of the torus is defined by the system of parametric equations (with parameter s ∈ R) of the form
\[ \begin{bmatrix} r_0 \\ \theta \\ z \end{bmatrix} = \begin{bmatrix} R + r\cos\left(\frac{\alpha}{\beta}s + \omega\right) \\ s \\ r\sin\left(\frac{\alpha}{\beta}s + \omega\right) \end{bmatrix}, \quad s \in \mathbb{R}. \tag{6.51} \]
The cable defined by the above equations winds β times around the Z-axis and |α| times around the interior of the torus (the sign of α determines whether the rotation is clockwise or counter-clockwise); for an example of the corresponding torus knot see Figs. 21 and 22 where α = 5 and β = 3. Letting ω in the above equations take a finite number of values we get an example of a torus link, see e.g. Figs. 26 and 27 which illustrate a link consisting of a pair of torus knots whose slopes are 3/5. Note that Figs. 28 and 29 illustrate a union of two distinct torus links (of two and of three knots respectively) rather than a single torus link of 5 knots. Finally, due to the above representation of a torus link in the form of equations in cylindrical coordinates, we
Fig. 21 Limit plot of the function f(z) = (5/3)z, z ∈ Z_2, in R^2
Fig. 22 Limit plot of the same function on the torus T2
naturally associate the torus link consisting of N cables whose slopes are α/β to a family of complex-valued functions ψ_j : R → C of real variable s ∈ R,
\[ \left\{ \psi_j(s) = e^{i\left(\frac{\alpha}{\beta}s + \omega_j\right)} : j = 0, 1, 2, \ldots, N-1 \right\}, \]
where i stands for the imaginary unit i ∈ C: i^2 = −1. Recall that an automaton A(s_0) = ⟨I, S, O, S, O, s_0⟩ is called autonomous once neither its state update function S nor its output function O depends on input; i.e., when s_{i+1} = S(s_i), ξ_i = O(χ_i, s_i) = O(s_i) (i = 0, 1, 2, ...), cf. Fig. 1. It is clear that the automaton function of an autonomous automaton is a constant; however, the limit plot of this function is not necessarily a straight line. For instance, the limit plot of a constant c ∈ Z_p is the whole unit square I^2 once c = Σ_{i=0}^∞ α_i p^i where the infinite word u = ... α_2 α_1 α_0 over F_p is such that every non-empty finite word w = γ_{k−1} γ_{k−2} ... γ_0 over F_p occurs as a subword of u; that is, if there exist a finite word v and an infinite word s over F_p such that u is a concatenation of v, w and s: u = swv, cf. [21]. On the other hand, once an autonomous automaton A is finite, the corresponding infinite output word must necessarily be eventually periodic. That is, c = α_0 + α_1 p + ··· + α_{r−1} p^{r−1} + (β_0 + β_1 p + ··· + β_{t−1} p^{t−1}) · Σ_{j=0}^∞ p^{r+tj} for suitable α_i, β_j ∈ F_p; therefore an automaton function of a finite autonomous automaton is a rational constant, i.e., c ∈ Z_p ∩ Q, cf. Propositions 2.1 and 2.16. Furthermore, the numbers that correspond to (sufficiently long) finite output words are then all of the form
0.β_k β_{k−1} ... β_0 β_{t−1} β_{t−2} ... β_0 β_{t−1} β_{t−2} ... β_0 ... β_{t−1} β_{t−2} ... β_0 α_{r−1} α_{r−2} ... α_0 for k = 0, 1, ..., t − 1. Consequently, the limit plot of the automaton (in R^2) consists of t pairwise parallel straight lines which correspond to the numbers 0.β_k β_{k−1} ... β_0 β_{t−1} β_{t−2} ... β_0 β_{t−1} β_{t−2} ... β_0 ... = 0.β_k β_{k−1} ... β_0 (β_{t−1} β_{t−2} ... β_0)^∞
where k = 0, 1, ..., t − 1, cf. the very beginning of Sect. 6; or (which is the same) to the numbers 0.(β_k β_{k−1} ... β_0 β_{t−1} β_{t−2} ... β_{k+1})^∞. That is, all the lines from the limit plot are y = p^ℓ h mod 1, ℓ ∈ N_0, for any line y = h belonging to the limit plot; thus the number of lines in the limit plot does not exceed t. Respectively, being considered as a point set on the torus T^2, the limit plot consists of not more than t parallels, cf., e.g., Figs. 23 and 24. We now summarize all these considerations in a proposition:

Proposition 6.11 Let f_A : z ↦ q be an automaton function of a finite automaton A (therefore q ∈ Z_p ∩ Q by Proposition 2.16); then LP(A) ⊂ T^2 is a disjoint union of t parallels C(0, e), e ∈ C(q), and t is the period length of q (cf. (2.2)).

Note 6.12 In conditions of Proposition 6.11 the constant q ∈ Z_p ∩ Q can be represented as an irreducible fraction q = a/b where a ∈ Z, b ∈ N, p ∤ b (we put b = 1 and a = 0 if q = 0). Then the limit plot LP(A) ⊂ T^2 is a torus link that consists of t = mult_b p trivial torus cables (parallels) with slopes 0; to the link there corresponds a collection of t complex constants (which are b-th roots of 1)
\[ \left\{ \psi_\ell = e^{-2\pi i p^\ell q} : \ell = 0, 1, \ldots, (\operatorname{mult}_b p) - 1 \right\}, \]
where i stands for the imaginary unit i ∈ C: i^2 = −1. Recall that mult_b p is the multiplicative order of p modulo b if b > 1 and mult_b p = 1 if b = 1, cf. Sect. 2.2.

Example 6.13 Let p = 2 and q = 2/7. Then mult_7 2 = 3 and the limit plot consists of 3 lines. The binary infinite word that corresponds to the 2-adic canonical representation of 2/7 is (011)^∞10, so the period of 2/7 is 011, the pre-period is 01, and u = 2 = 0 + 1 · 2 + 0 · 2^2. Therefore the three lines which constitute the limit plot are: y = 0.(101)^∞ = 5/7 = (−2/7) mod 1 = c(3, 0, 2), y = 0.(110)^∞ = 6/7 = (−1/7) mod 1, y = 0.(011)^∞ = 3/7 = (−4/7) mod 1. The limit plot (on the unit square and on the torus) is illustrated by Figs. 23 and 24 accordingly; the state diagram of the automaton is given by Fig. 6. Note that the plot does not depend on the state which is taken as initial; the plot is completely determined by the minimal sub-automaton whose set of states is {s_2, s_3, s_4}.

Proposition 6.14 Given c ∈ Z_p ∩ Q, represent c = a/b, where a ∈ Z, b ∈ N, a, b are coprime, p ∤ b. If A is an automaton such that f_A(z) = cz (z ∈ Z_p) then LP(A) = {(x mod 1; (cx) mod 1) : x ∈ R} = C(c, 0) is a cable (with slope c) of the unit 2-dimensional real torus T^2. For every c ∈ Z_p ∩ Q the automaton A may be taken finite.
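The arithmetic of Example 6.13 can be reproduced with exact fractions. The sketch below assumes, following the heights listed in that example, that the parallels of the limit plot of a constant q are located at (−p^ℓ q) mod 1 for ℓ = 0, ..., mult_b p − 1; the function names are illustrative.

```python
from fractions import Fraction
from math import gcd


def mult(b, p):
    """Multiplicative order of p modulo b; equals 1 if b == 1."""
    if b == 1:
        return 1
    if gcd(b, p) != 1:
        raise ValueError("p must be coprime to b")
    t, power = 1, p % b
    while power != 1:
        power = (power * p) % b
        t += 1
    return t


def parallels(q, p):
    """Heights (-p^l q) mod 1, l = 0, ..., mult_b(p) - 1, for the constant q = a/b."""
    t = mult(q.denominator, p)
    return {(-(p ** l) * q) % 1 for l in range(t)}


print(mult(7, 2))
print(sorted(parallels(Fraction(2, 7), 2)))
```

For q = 2/7 and p = 2 this yields the three heights 3/7, 5/7 and 6/7 of Example 6.13.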
Fig. 23 Limit plot of the constant function f(z) = 2/7 (z ∈ Z_2), in R^2
Fig. 24 Limit plot of the same function on the torus T2
Fig. 25 State diagram of the automaton whose function is f(z) = (5/3)z, z ∈ Z_2 (state s_0 is initial)
Example 6.15 Take p = 2 and c = 5/3. Figures 21 and 22 illustrate the limit plot of the function f(z) = (5/3) · z in I^2 and in T^2, respectively. The state diagram of the corresponding automaton is given by Fig. 25.

The following theorem gives a complete description of finite affine automata functions and their limit plots.

Theorem 6.16 Given c, q ∈ Z_p, a map z ↦ cz + q of Z_p into Z_p is an automaton function of a finite automaton if and only if c, q ∈ Z_p ∩ Q. Given a finite automaton A whose automaton function is f(z) = cz + q for c, q ∈ Z_p ∩ Q, represent c, q as irreducible fractions c = a/b, q = a′/b′, where a, a′ ∈ Z, b, b′ ∈ N and gcd(a, b) = gcd(a′, b′) = gcd(b, p) = gcd(b′, p) = 1; then the limit plot LP(A) ⊂ T^2 is a link of mult_m p torus knots, where m = b′/gcd(b, b′), and every knot of the link is a cable C(c, e) for e ∈ C(q):
\[ LP(A) = \{ (y \bmod 1; (cy + e) \bmod 1) : y \in \mathbb{R}, e \in C(q) \}. \tag{6.52} \]
Moreover, C(c, e_1) = C(c, e_2) for e_1, e_2 ∈ C(q) if and only if r_1 ≡ r_2 (mod m) where e_i = (−p^{r_i} q) mod 1, i = 1, 2.

Note 6.17 Once m = 1, i.e., once b′ | b, the congruence r_1 ≡ r_2 (mod m) holds trivially, mult_1 p = 1 and the link consists of a single knot; so in that case C(c, e_1) = C(c, e_2) for all e_1, e_2 ∈ C(q).

Corollary 6.18 There is a one-to-one correspondence between maps of the form f : z ↦ (a/b)z + a′/b′ on Z_p (where a/b, a′/b′ ∈ Z_p ∩ Q; a, a′ ∈ Z; b, b′ ∈ N) and collections of mult_m p complex-valued exponential functions ψ_k : R → C of real variable y ∈ R,
\[ \left\{ \psi_k(y) = e^{i\left(\frac{a}{b}y - 2\pi p^k \frac{a'}{b'}\right)} : k = 0, 1, 2, \ldots, (\operatorname{mult}_m p) - 1 \right\}. \]
Here i ∈ C is the imaginary unit and m = b′/gcd(b, b′).

Example 6.19 Let p = 2 and f(z) = (3/5) · z + (1/3). Then in conditions of Theorem 6.16 we have that m = 3 and therefore the link consists of mult_3 2 = 2 cables whose slopes are 3/5, cf. Figs. 26 and 27.

Given a real function g : D → R with domain D ⊂ R, a graph of the function (on the torus T^2) is the point subset G_D(g) = {(x mod 1; g(x) mod 1) : x ∈ D} ⊂ T^2.
Fig. 26 Limit plot of the function f(z) = (3/5)z + 1/3, z ∈ Z_2, in R^2
Fig. 27 Limit plot of the same function on the torus T2
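The knot count of Theorem 6.16 is a short computation. The following sketch evaluates m = b′/gcd(b, b′) and mult_m p for an affine map z ↦ cz + q with rational c and q; the function names are illustrative, and the inputs are assumed to satisfy the coprimality conditions of the theorem.

```python
from fractions import Fraction
from math import gcd


def mult(b, p):
    """Multiplicative order of p modulo b; equals 1 if b == 1."""
    if b == 1:
        return 1
    if gcd(b, p) != 1:
        raise ValueError("p must be coprime to b")
    t, power = 1, p % b
    while power != 1:
        power = (power * p) % b
        t += 1
    return t


def knot_count(c, q, p):
    """Number mult_m(p) of torus knots, m = b' / gcd(b, b'), for f(z) = c z + q."""
    b, b_prime = c.denominator, q.denominator
    m = b_prime // gcd(b, b_prime)
    return mult(m, p)


print(knot_count(Fraction(3, 5), Fraction(1, 3), 2))   # map of Example 6.19
print(knot_count(Fraction(5, 3), Fraction(0), 2))      # map of Example 6.15
```

For the map of Example 6.19 this gives m = 3 and mult_3 2 = 2 knots, while for f(z) = (5/3)z it gives the single cable of Proposition 6.14.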
We call the function g finitely computable if G_D(g) ⊂ P(A) for a finite automaton A. The following theorem completely characterizes C^2-smooth finitely computable functions:

Theorem 6.20 Consider a finite automaton A and a continuous function g which is defined everywhere on [a, b] ⊂ [0, 1) and which takes values in [0, 1). Let G(g) ⊂ P(A), let g be two times differentiable on [a, b], and let the second derivative g″ of g be continuous on [a, b]. Then there exist A, B ∈ Q ∩ Z_p such that g(x) = (Ax + B) mod 1 for all x ∈ [a, b]; moreover, the graph G_{[a,b]}(g) of the function g lies completely in the cable C(A, B) ⊂ LP(A) and C(A, B̄) ⊂ LP(A) for all B̄ ∈ C(B mod 1). Given a finite automaton A, there are no more than a finite number of pairwise distinct cables C(A, B) of the unit torus T^2 such that C(A, B) ⊂ P(A) (note that A, B ∈ Z_p ∩ Q).

Note The cables of the torus mentioned in Theorem 6.16 correspond to minimal sub-automata; that is, each link is a limit plot of a minimal sub-automaton (of the automaton A) whose automaton function is affine; cf. Figs. 28 and 29. By Theorem 6.20, the smooth curves from the plot of a finite automaton A can be described by families of complex-valued exponential functions of the form ψ_k(y) = e^{i(Ay − 2πp^k B)}, k = 0, 1, 2, ..., for suitable A, B ∈ Z_p ∩ Q, cf. Corollary 6.18.
Fig. 28 Limit plot in R² of an automaton that has two affine subautomata A and B; f_A(z) = −2z + 1/3 and f_B(z) = (3/5)z + 2/7, where z ∈ Z2
Fig. 29 Limit plot of the same automaton on the torus T2 in R3 . The plot consists of two torus links (of 2 and of 3 knots accordingly)
The p-adic Theory of Automata Functions
91
Historical remark: All results presented in the current Sect. 6.2 were obtained in [16, 17, 22]. We also note that Theorem 6.20 can be generalized to the case when finite automata have multiple inputs/outputs, see [16, 22]. In connection with these results it is worth mentioning that a similar situation takes place for Monna maps; namely, if the graph M(A) of a finite automaton A is a real C¹-smooth curve, then the curve is necessarily a graph of an affine function with rational coefficients, see [51]. Therefore the following natural open questions arise:

Open question 6.21 In the conditions of Theorem 6.20, can C²-smoothness be replaced by C¹-smoothness? What are continuous finitely computable functions?

Concluding the section, we note that yet another approach, which relates finite automata to their geometrical images in Euclidean space in order to study the respective dynamics, is undertaken in [123, 127]. Namely, to the word $w = \omega_{k-1}\omega_{k-2}\cdots\omega_0$ over the alphabet P = {1, 2, ..., P} the authors put into correspondence the number $\sum_{i=0}^{k-1} \omega_i (P+1)^{-(k-i+1)}$, and relate to a finite automaton whose input/output alphabets are accordingly Q = {1, 2, ..., Q} and P = {1, 2, ..., P} sets of points in R² whose X-coordinate corresponds to a single-letter word over Q and whose Y-coordinate corresponds to the respective output word over P of the automaton, for all finite single-letter input words and all letters from Q. Then the authors give an explicit formula for real functions whose graphs contain all such points, for every letter from Q. Note that such functions are not necessarily affine; even for a two-state automaton with P = 2 the functions are of the form $a \cdot (2 - x)^{\log_2 3} + b \cdot 3^{c \cdot \frac{\log_2 (2 - x)}{2}} + d$.
7 Other Non-Archimedean Theories of Automata Functions

The p-adic theory offers effective and powerful tools to study automata by investigating the properties and behaviour of automata functions considered as p-adic 1-Lipschitz functions, i.e., functions whose domain and range are the p-adic integers and which satisfy a Lipschitz condition with constant 1 with respect to the p-adic metric. However, there are other approaches based on non-Archimedean (rather than on p-adic) analysis which can be applied to study automata functions, and not only automata functions defined by classical automata, which can be regarded as automata with ‘discrete’ time N0. In the current section we briefly discuss the approach based on the representation of automata functions as 1-Lipschitz mappings of the ring Fp[[X]] of formal power series into itself, as well as some topics related to automata with ‘continuous’ time R.
7.1 Automata Functions over Fp[[X]]

Basically, an automaton whose input/output alphabets are Fp = {0, 1, ..., p − 1} just maps words from W∞ to words from W∞ so that each letter ξi of the output word does not depend on the letters χi+1, χi+2, ... for all i ∈ N0, cf. Eq. (2.7); moreover, it is well known that any such mapping W∞ → W∞ can be produced by a suitable automaton (which need not be finite). The set W∞ can be naturally endowed with a metric dp such that, given u, v ∈ W∞, u ≠ v, one puts

$$d_p(u, v) = p^{-(\text{length of the longest common prefix of } u \text{ and } v)}. \qquad (7.53)$$
This way the set W∞ becomes a compact non-Archimedean metric space (actually, a ball of radius 1 with respect to dp), and thus automata functions constitute the class of all 1-Lipschitz mappings W∞ → W∞ with respect to the (ultra)metric dp. The space Wp = ⟨W∞, dp⟩ can be endowed with a natural probability measure in exactly the same way as the probability measure is defined on Zp, by taking all balls as elementary measurable sets and assuming that the measure of every ball is equal to its radius. Therefore we can speak of measure-preservation and ergodicity of 1-Lipschitz maps on Wp; moreover, the measure-preservation/ergodicity of a 1-Lipschitz (thus, automaton) map Wp → Wp is equivalent to bijectivity/transitivity of the action of the corresponding automaton on finite words Wn, for all n = 1, 2, 3, ..., cf. Theorem 5.6. But on the space Wp it is possible to define a structure of a commutative integral domain (a ring without zero divisors) which is complete and compact with respect to the metric dp, and which is not isomorphic to the ring Zp of p-adic integers. It is well known that such an alternative ring structure is unique: it is the ring Fp[[X]] of formal power series in the indeterminate X over the p-element finite field Fp. Therefore it is quite natural to develop a theory of automata functions as 1-Lipschitz mappings Fp[[X]] → Fp[[X]], in view of not only the theoretical but also the applied value of the theory. Indeed, if p = 2, then ring addition in F2[[X]] is just bitwise logical addition modulo 2, the XOR instruction of modern computers, cf. Definition 3.5. Unfortunately, in contrast to ordinary multiplication of numbers represented via base-2 expansions, multiplication in the ring F2[[X]] (‘multiplication without carry’, speaking loosely) is not a standard processor instruction; nevertheless, multiplication in Fp[[X]] is easily programmable.
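To illustrate the remark that arithmetic in F2[[X]] is easily programmable, here is a minimal sketch. It assumes a truncated series a_0 + a_1 X + a_2 X² + ... is stored as a non-negative integer whose bit i equals a_i; the function names and the sample maps `example_map` and `shift_down` are hypothetical and chosen only for illustration. The check `is_one_lipschitz` tests the defining property on truncations: the first n coefficients of the output must depend only on the first n coefficients of the input.

```python
def add_f2(a: int, b: int) -> int:
    """Addition in F_2[[X]] on truncated series: bitwise XOR."""
    return a ^ b


def mul_f2(a: int, b: int) -> int:
    """Carry-less ('without carry') multiplication of polynomials over F_2."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # add the current shifted copy of a, no carries
        a <<= 1
        b >>= 1
    return result


def truncate(a: int, n: int) -> int:
    """Keep the coefficients of X^0, ..., X^(n-1) only."""
    return a & ((1 << n) - 1)


def is_one_lipschitz(F, n_max: int) -> bool:
    """Check F(x) mod X^n == F(x mod X^n) mod X^n for all n <= n_max
    and all inputs with fewer than n_max coefficients."""
    for n in range(1, n_max + 1):
        for x in range(1 << n_max):
            if truncate(F(x), n) != truncate(F(truncate(x, n)), n):
                return False
    return True


def example_map(x: int) -> int:
    """The polynomial map x -> x*x + x + 1 over F_2[[X]]."""
    return add_f2(add_f2(mul_f2(x, x), x), 1)


def shift_down(x: int) -> int:
    """Drops the constant term and divides by X: looks one letter ahead."""
    return x >> 1


print(is_one_lipschitz(example_map, 6), is_one_lipschitz(shift_down, 6))
```

The polynomial map passes the check on these truncations, while the shift, whose output letter number i depends on input letter number i + 1, does not.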
Therefore it may happen that some useful applications of automata functions as 1-Lipschitz mappings on Fp[[X]] will be found in the near future, similar to those of the theory of T-functions. Currently the ergodic theory of 1-Lipschitz mappings Fp[[X]] → Fp[[X]] is under development. The following results within that theory have been obtained:
• Criteria of automaticity (i.e., of 1-Lipschitzness) of a mapping F2[[X]] → F2[[X]], as well as criteria of measure-preservation/ergodicity of a 1-Lipschitz mapping F2[[X]] → F2[[X]], were found in terms of the van der Put basis and in terms of the Carlitz basis over F2[[X]], see [93].
• For a general mapping Fp[[X]] → Fp[[X]], criteria of 1-Lipschitzness in terms of the van der Put basis, the Carlitz basis, the digit derivatives basis, and the digit shifts basis were found in [64, 65, 138]. The criterion in terms of the van der Put basis over Fp[[X]] looks similar to the one from Theorem 3.4; namely, the mapping F : Fp[[X]] → Fp[[X]] represented via the van der Put series $F(x) = \sum_{g \in \mathbb{F}_p[X]} B_g \cdot \chi(g, x)$ is an automaton function if and only if $|B_g|_X \le p^{-\deg g}$, where | · |X is the absolute value on Fp[[X]] associated to the metric dp (i.e. |G|X = dp(G, 0), where 0 is the additive neutral element of the ring Fp[[X]], the formal power series in X all of whose coefficients are zero), Bg ∈ Fp[[X]], χ(g, x) is the characteristic function of the ball in Fp[[X]] centered at g ∈ Fp[X] of radius $p^{-\deg g - 1}$, and g ∈ Fp[X] are polynomials in the indeterminate X over the p-element field Fp.
• Criteria for measure-preservation/ergodicity for 1-Lipschitz mappings Fp[[X]] → Fp[[X]] represented in the van der Put basis and in the Carlitz basis over Fp[[X]] were found in [93] for p = 2; for the general case, as well as for representations in other bases, see [64, 65, 138]. We note that the criteria in terms of the van der Put basis over F2[[X]] look very similar to those in terms of van der Put series over Z2, cf. Theorems 5.21 and 5.23.

Note that the non-Archimedean theory of automata functions considered as 1-Lipschitz mappings Fp[[X]] → Fp[[X]] is still less developed in comparison to its counterpart over Zp.
7.2 Automata over Continuous Time

In this subsection we discuss different approaches to what can be regarded as a (finite-state) automaton over ‘continuous’ (rather than ‘discrete’) time. But first we must agree on what is ‘continuous time’ and what is ‘discrete time’.
7.2.1 General Considerations
In the literature, ‘discrete time’ is usually assumed to be the set N0 = {0, 1, 2, ...} of all non-negative integers. This means, in loose terms, that an automaton over so defined discrete time works (i.e., accepts an input symbol, changes its state and produces an output symbol) only at certain moments of physical time, being in ‘idle mode’ between those moments. Therefore the inputs/outputs of an automaton over ‘discrete’ time N0 constitute a set of sequences, that is, of functions whose domain is N0 and whose range is the input/output alphabet(s). Whence the formal definition of an automaton whose input/output alphabets are Fp = {0, 1, ..., p − 1} is the one from Sect. 2.3. The determinative property of automata mappings defined by automata with discrete time N0 is causality; that is, the automaton function maps sequences over Fp (i.e.,
one-sided infinite words over the alphabet {0, 1, ..., p − 1}) W∞ to W∞ according to the rule (2.7). Accordingly, automata over continuous time are basically understood in the literature as causal mappings which map signals to signals, where a signal is a function of time, i.e., a function σ whose domain is an infinite ordered set T, called time, with the order ≼, and whose range is some nonempty set E, the set of events. Time is called continuous if the order ≼ is dense; that is, given a, b ∈ T, a ≺ b, there exists c ∈ T such that a ≺ c ≺ b. Here, as usual, x ≺ y means that x ≼ y and x ≠ y. Moreover, it is usually assumed that time has a beginning, i.e. there exists a unique least element with respect to the order ≼. Time is called discrete if the order ≼ is a well-order, i.e., if any non-empty subset of T has a least element; therefore any element t ∈ T (except for a possible greatest element) has a unique successor in discrete time. It is worth noting that for various applications the aforementioned definition of discrete/continuous time may be too general, since ordered sets of the same cardinality may have different ordinal numbers, i.e., may be of different order types, and therefore the orders may be non-isomorphic even when time is discrete and countable. For instance, the standard order on N0 with respect to the relation ≤ ‘less or equal’ is not isomorphic to the order on N0 such that 0 ≺ 2 ≺ 4 ≺ 6 ≺ ··· ≺ 1 ≺ 3 ≺ 5 ≺ 7 ≺ 9 ≺ ···. On the other hand, from Cantor’s theorem we know that every two non-empty dense totally ordered countable sets without lower or upper bounds are order-isomorphic. Moreover, every countable totally ordered set is order-isomorphic to a subset of the rational numbers Q with the standard ordering ≤, see, e.g., [53, Lemma 174].
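The two orders on N0 mentioned above can be compared on finite initial segments. The sketch below is only an illustration: the sort key `evens_first_key` and the helper `count_predecessors` are hypothetical names. It encodes the order 0 ≺ 2 ≺ 4 ≺ ··· ≺ 1 ≺ 3 ≺ ··· by a sort key and counts the predecessors of the element 1 among the first `bound` integers.

```python
def evens_first_key(n: int) -> tuple:
    """Sort key for the order 0 < 2 < 4 < ... < 1 < 3 < 5 < ... on N0."""
    return (n % 2, n)


def count_predecessors(n: int, bound: int, key) -> int:
    """Number of m in {0, ..., bound - 1} that strictly precede n."""
    return sum(1 for m in range(bound) if key(m) < key(n))


print(sorted(range(8), key=evens_first_key))
for bound in (10, 100, 1000):
    # Under the standard order the element 1 has exactly one predecessor;
    # under the 'evens first' order the count grows with the bound.
    print(bound,
          count_predecessors(1, bound, lambda n: n),
          count_predecessors(1, bound, evens_first_key))
```

In the second order the element 1 is preceded by every even number, so it has infinitely many predecessors, whereas in the standard order every element has finitely many; an order isomorphism would preserve that count, which is one way to see that the two orders are not isomorphic.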
We will see further that in applications to computer science or, more broadly, to IT, time can always be considered as a countable ordered set; therefore one may always assume, if necessary, that time is order-isomorphic to Q or to any subset of Q which is order-isomorphic to Q; for instance, to a subset of the set Zp ∩ Q of rational p-adic integers. Nonetheless, we start the discussion here with the most general approach to automata functions regarded as causal mappings. The approach originates from [99], where ‘time’ is understood just as an ordered set, and causality is understood as non-expansiveness of a mapping with respect to the generalized ultrametric distance defined in [109, 110]. We describe the approach briefly and in a somewhat less general form than in [99]. Let (Γ, ≤) be a totally ordered set with respect to the total order ≤, having a minimal element 0. Let X be a non-empty set. A mapping d : X × X → Γ is called a generalized ultrametric distance when the following properties are satisfied for all x, y, z ∈ X:
(i) d(x, y) = 0 if and only if x = y.
(ii) d(x, y) = d(y, x).
(iii) If d(x, y) ≤ γ and d(y, z) ≤ γ then d(x, z) ≤ γ, for all γ ∈ Γ.
The triple (X, d, Γ) is called a generalized ultrametric space. In [99] the fixed-point theory for strictly causal functions (the latter are defined below) is developed, with applications to the design of programming languages and model-based design tools for timed systems. Given an ordered set T with the order ≼, denote via L⟨T,≼⟩ the set of all lower sets of T. Recall that L ⊂ T is a lower set (also called a down-set or an order ideal) if and only if for any t1, t2 ∈ T, if t1 ≼ t2 and t2 ∈ L, then t1 ∈ L. The set L⟨T,≼⟩ is totally ordered with respect to the relation ⊇, the subset inclusion, has the least element T, and is used as the set where the ultrametric distance takes values, namely: given signals σ1, σ2 ∈ S from the set S of all signals, let d(σ1, σ2) = {t ∈ T : σ1(t′) = σ2(t′) for every t′ ≼ t}; this way S becomes a generalized ultrametric space with respect to the generalized ultrametric distance d. Given t ∈ T and a signal σ ∈ S, a t-prefix of σ is the restriction σ|D(t) of σ to the down-set D(t) generated by t; i.e. D(t) = {t′ ∈ T : t′ ≼ t}.

Definition 7.1 (General Causal Mapping) A mapping F : S → S of signals to signals is called causal if σ1|D(t) = σ2|D(t) implies F(σ1)|D(t) = F(σ2)|D(t), for all t ∈ T; the mapping F : S → S is called strictly causal if σ1|D(t′) = σ2|D(t′) for all t′ ∈ T, t′ ≺ t, implies F(σ1)|D(t) = F(σ2)|D(t).

Note 7.2 In terms of the generalized ultrametric distance d, Definition 7.1 can be restated as follows: the mapping F : S → S is causal if and only if d(F(σ1), F(σ2)) ⊇ d(σ1, σ2), and F is strictly causal if and only if d(F(σ1), F(σ2)) ⊃ d(σ1, σ2). We stress that L⟨T,≼⟩ is ordered with respect to ⊇ rather than with respect to ⊆; that is, A precedes B with respect to that order if and only if A ⊃ B, for A, B ∈ L⟨T,≼⟩. In other words, (strictly) causal mappings are exactly the mappings which are non-expansive (accordingly, strictly contractive) with respect to the generalized ultrametric distance d.
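The causality condition of Note 7.2 can be tested by brute force when time is a finite initial segment of N0. The sketch below rests on that simplifying assumption (the text allows arbitrary ordered time); signals are tuples indexed by t = 0, 1, ..., horizon − 1, and the names `agreement_set`, `is_causal`, `running_sum_mod2` and `look_ahead` are hypothetical.

```python
from itertools import product


def agreement_set(s1: tuple, s2: tuple) -> set:
    """d(s1, s2): the lower set of all t such that s1 and s2 agree at every t' <= t."""
    times = set()
    for t, (a, b) in enumerate(zip(s1, s2)):
        if a != b:
            break
        times.add(t)
    return times


def is_causal(F, alphabet: tuple, horizon: int) -> bool:
    """Check d(F(s1), F(s2)) ⊇ d(s1, s2) for all pairs of signals of the given length."""
    signals = list(product(alphabet, repeat=horizon))
    return all(
        agreement_set(F(s1), F(s2)) >= agreement_set(s1, s2)
        for s1 in signals
        for s2 in signals
    )


def running_sum_mod2(s: tuple) -> tuple:
    """Output at time t is the sum modulo 2 of the inputs at times 0..t."""
    out, acc = [], 0
    for x in s:
        acc = (acc + x) % 2
        out.append(acc)
    return tuple(out)


def look_ahead(s: tuple) -> tuple:
    """Output at time t is the input at time t + 1 (padded with 0)."""
    return s[1:] + (0,)


print(is_causal(running_sum_mod2, (0, 1), 4), is_causal(look_ahead, (0, 1), 4))
```

The running sum depends only on the past and present of its input and passes the check; the look-ahead map uses a future value and fails it.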
Note that by putting ⟨T, ≼⟩ = ⟨N0, ≤⟩ and E = Fp in the aforementioned definition of a causal mapping F, we get exactly the definition of an automaton function f, cf. (2.7), since in that case S = W∞ and σ|D(t) is merely f mod p^t for t ∈ N, the restriction of the automaton function f to finite words of length t. That is, when time is ‘discrete’, causality is equivalent to 1-Lipschitzness, cf. (2.11), whereas strict causality is equivalent to strict 1-Lipschitzness, the latter being the property similar to (2.11) where the ‘less or equal’ relation ≤ is replaced by the ‘strictly less’ relation <.
It is worth noting that modelling by timed automata has some drawbacks, the most significant of which is that the general verification problem is undecidable for these automata. In order to overcome that drawback, a special class of timed automata, the event-clock automata, was introduced in [7]. In our view, that class is better suited to modelling smart contracts, but people still use general timed automata (rather than event-clock automata) to represent smart contracts. We will not go into further details.

Note 7.6 We stress that by their very construction timed automata can also be considered as automata which have two inputs, one for data (symbols of a finite alphabet) and another for time stamps. Moreover, as mentioned in [6], the only condition the set of time stamps must satisfy is that the set must be dense in R≥0. Note that, as a matter of fact, the set of all time stamps must satisfy one more condition; namely, it must be closed under addition in R≥0. This follows immediately from the aforementioned description of how the T-automaton works, cf. how it processes time stamps. Therefore, one can consider timed automata whose time stamps constitute, e.g., the set Zp ∩ Q≥0 or any other dense subsemigroup of R≥0. It is worth mentioning that all additive subgroups of Q are completely described, see, e.g., [27]. We will use these observations later when considering approximations of timed automata by classical ones.
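The closure condition of Note 7.6 is easy to check for the set Zp ∩ Q≥0. A minimal sketch, assuming the standard description of Zp ∩ Q as the rationals whose reduced denominator is not divisible by p; the function name `is_time_stamp` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def is_time_stamp(x: Fraction, p: int) -> bool:
    """True if x lies in Z_p ∩ Q≥0: x is non-negative and its reduced
    denominator is not divisible by the prime p."""
    return x >= 0 and x.denominator % p != 0


p = 2
stamps = [Fraction(a, b) for a, b in product(range(0, 6), (1, 3, 5, 7, 9))]

# Sums of admissible time stamps stay admissible: the denominator of a sum
# divides the product of the denominators, which is coprime to p.
print(all(is_time_stamp(s + t, p) for s in stamps for t in stamps))
print(is_time_stamp(Fraction(1, 3), p), is_time_stamp(Fraction(1, 2), p))
```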
7.2.3 Finite Transducers over Continuous Time
One of the most important questions for automata over continuous time is: What are finite automata over continuous time, if automata mappings over continuous time are understood as causal mappings of signals over continuous time to signals over continuous time, cf. Definition 7.1? To answer the question we must rigorously define what finiteness means, since we have not yet defined any notion similar to the notion of ‘state’ of classical automata for general automata over continuous time and for general signals. To define that notion, we use an idea similar to the one we used when finding an expression for a finite automaton function of a classical automaton via the van der Put basis, cf. Lemma 4.6 and Note 4.7. To avoid overly general considerations, we only illustrate the notion of states of an automaton T over continuous time T by an example when T = R≥0 and the set S of all signals is the set of functions whose domains/ranges are R≥0. Namely, given t′ ∈ R≥0 and signals u, v, denote

$$(u|_{t'} v)(t) = \begin{cases} u(t), & \text{if } t \in [0, t'); \\ v(t - t'), & \text{if } t \in [t', \infty) \end{cases}$$
and put

$$(F_{T,u,t'}(g))(t) = \left(F_T(u|_{t'} g) - F_T(u|_{t'} 0)\right)(t + t'), \qquad (7.54)$$
where FT : S → S is automaton function of the automaton T over continuous time T = R≥0 . The states of the automaton T are pairwise distinct functions defined by (7.54). As usual, an automaton is called finite if and only if the set of all its states is finite. Note that we may define an ultrametric distance D on the set S of all signals in that case as follows: Given any pair of functions g, h ∈ S, put
$$D(g, h) = e^{-\sup\{t' \in \mathbb{R}_{\ge 0} \,:\, g(t) = h(t) \text{ for all } t' > t \ge 0\}} \qquad (7.55)$$
so that the causality condition from Definition 7.1 is just equivalent to 1-Lipschitzness of the mapping FT with respect to the ultrametric distance D. With numerous possible applications in mind, it would be useful to develop a theory of finite automata over continuous time for the case of general signals like the aforementioned functions R≥0 → R≥0. Unfortunately, to the best of our knowledge, no such theory for general signals exists, even under restriction to the case of locally integrable signals R≥0 → R≥0, or to the case of continuous signals, piecewise differentiable signals, etc., under a reasonable modification of the definition of distance (7.55) (say, if the condition ‘g(t) = h(t) for all’ is replaced by the condition ‘g(t) = h(t) for all but a set of measure 0’). For the case when continuous time is Q≥0 (or Zp ∩ Q≥0) rather than R≥0, cf. Note 7.6, no such theory is known either. However, if one restricts oneself to non-Zeno signals, such a theory exists. We now give a brief exposition of the theory. Speaking loosely, the non-Zeno property means that during a finite time interval only a finite number of events may happen. That is why a non-Zeno signal σ over continuous time T is usually understood in the following sense:

Definition 7.7 A non-Zeno signal over continuous time T is any function σ : T → E, where E is a non-empty set of events, T is a totally ordered set whose order ≼ is dense, and for all t1, t2 ∈ T, t1 ≺ t2, the set {σ(t) : t1 ≼ t ≼ t2} ⊂ E is finite.

Note that the set E of events need not be finite; only the set of all pairwise distinct values which a non-Zeno signal takes over any finite time segment must be finite. In [112], finite-state transducers which map input non-Zeno signals to output non-Zeno signals are considered as follows. Let T = R≥0, and let E be a finite subset of R. Take all piecewise constant right-continuous functions with domain R≥0 and range E as the set S of all signals, and denote via S̄ ⊂ S the subset of all non-Zeno signals of S.
Definition 7.8 (See [112, Definition 8]) A finite-state transducer over non-Zeno signals (which we denote via Z) has the following components:
(i) A finite set of states S,
(ii) An initial state s0 ∈ S,
(iii) An input alphabet I and output alphabet O,
(iv) An output function O : I × S → O, and
(v) A transition function δ : I × I → (S → S) such that δ(a, b) ∘ δ(b, b) = δ(a, b).

Note that the definition is similar to the definition of a finite initial automaton from Sect. 2.3, the only difference being in the definition of the state-transition function δ, see (v). In (v), to every tuple (a, b) of letters of the input alphabet the transition function δ puts into correspondence a mapping from the set S of all states to S so that the state transition procedure satisfies the following condition: if the first input symbol at time t1 was b, the second one at time t2 > t1 was also b, and the third one at time t3 > t2 was a, and if the state at t1 was s1, the state at t2 became s2 and the state at t3 became s3, then s2 = s1. That is, the state may be updated to another state different from the current one only if the new input symbol differs from the preceding one. Therefore automata functions of transducers from Definition 7.8 constitute a narrower class in comparison to the class of all finite automata functions. Though by definition the transducer Z is a classical transducer which maps infinite words over the alphabet I to infinite words over the alphabet O, to the transducer there nevertheless corresponds a causal mapping FZ of input non-Zeno signals which take their values in I to output non-Zeno signals which take their values in O. Namely, as input signals are piecewise constant and right-continuous functions with domain R≥0 and range I, for any signal σ and every moment t ∈ R≥0 of time there exists a moment t′ > t such that σ(t) = σ(t′); so whilst accepting the input signal σ during the time interval [t, t′] the state of the transducer will not be updated, and thus the output signal (which is a piecewise constant and right-continuous function with domain R≥0 and range O) during the time interval [t, t′] will also be a constant.
In other words, states can be updated only at the moments when the input signal changes its value, and therefore the output signal may change its value only at exactly those moments, being constant during time intervals which do not contain those moments. This way the finite-state transducer maps non-Zeno signals which are piecewise constant and right-continuous functions with domain R≥0 and range I to non-Zeno signals which are piecewise constant and right-continuous functions with domain R≥0 and range O. Let us explain this more formally. By definition, a non-Zeno signal σ : R≥0 → I is a piecewise constant right-continuous function having not more than countably many points of discontinuity; let all these points of discontinuity be t1(σ), t2(σ), ... ∈ R, t1(σ) < t2(σ) < ···. Put t0(σ) = 0; then t0(σ) < t1(σ). If σ has only finitely many (say, n ∈ N0) points of discontinuity, then put tn+1(σ) = +∞. For i = 0, 1, 2, ... denote ai = σ(t) for t ∈ [ti(σ), ti+1(σ)); note that σ is constant on every interval [ti(σ), ti+1(σ)). The automaton function FZ maps any signal σ : R≥0 → I to the signal FZ(σ) : R≥0 → O as follows: for any t ∈ R≥0, there exists a unique i ∈ N0 such that [ti(σ), ti+1(σ)) ∋ t; then the value of the signal FZ(σ) at t ∈ R≥0 is equal to the i-th (i.e., leftmost) symbol bi ∈ O of the output word bi ··· b2 b1 b0 of the automaton Z if the automaton is fed the input word ai ··· a2 a1 a0, that is, (FZ(σ))(t) = bi.
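The construction of FZ(σ) from the output word of a classical transducer can be sketched as follows. This is a simplified illustration, not the formalism of [112]: a signal with finitely many discontinuities is stored as the list of its discontinuity points t_0 = 0 < t_1 < ... together with the values a_i taken on [t_i, t_{i+1}); the transducer is given by a single hypothetical function `step(state, a) -> (new_state, b)`; and the sample machine `count_changes_mod2` is invented for the example.

```python
from bisect import bisect_right


def apply_transducer(step, initial_state, values: list) -> list:
    """Feed the letters a_0, a_1, ... to a classical transducer and
    collect the output letters b_0, b_1, ..."""
    state, outputs = initial_state, []
    for a in values:
        state, b = step(state, a)
        outputs.append(b)
    return outputs


def evaluate(breakpoints: list, values: list, t: float):
    """Value at time t of the right-continuous piecewise constant signal
    equal to values[i] on [breakpoints[i], breakpoints[i + 1])."""
    i = bisect_right(breakpoints, t) - 1
    return values[i]


def count_changes_mod2(state: int, a):
    """Hypothetical transducer: flips its state on every new letter
    and outputs the new state."""
    new_state = (state + 1) % 2
    return new_state, new_state


breakpoints = [0.0, 1.5, 4.0]      # t_0, t_1, t_2
inputs = ["x", "y", "x"]           # a_0, a_1, a_2
outputs = apply_transducer(count_changes_mod2, 0, inputs)

# The output signal has the same discontinuity points as the input signal.
for t in (0.0, 1.0, 1.5, 2.0, 10.0):
    print(t, evaluate(breakpoints, inputs, t), evaluate(breakpoints, outputs, t))
```

Because consecutive letters a_i of a signal written this way are distinct, reading them one by one matches the rule that the state is updated only when the input changes; in particular a constant input signal yields a constant output signal.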
As mentioned in [112], all results of the paper remain true if R≥0 is replaced by a time domain T which satisfies the following conditions:
• T is totally ordered with respect to an order relation ≼, with a minimal element and with no maximal element.
• There exists an associative binary operation + on T (that is, a two-variate function + : T × T → T which satisfies the associative law (t1 + t2) + t3 = t1 + (t2 + t3) for all t1, t2, t3 ∈ T) such that for every t ∈ T the function τ ↦ t + τ is an order-preserving bijection from T to {t′ : t ≼ t′} ⊆ T.
7.2.4 Approximation of Automata over Continuous Time by Classical Ones
Now we are going to compare both classes of automata over continuous time, the timed automata from Sect. 7.2.2 and the transducers of non-Zeno signals over continuous time from Sect. 7.2.3, and to show that both types of automata can be approximated with any desirable accuracy by classical automata from Sect. 2.3. Firstly note that to any piecewise constant right-continuous non-Zeno signal σ : R≥0 → E, where E is finite, we may associate infinite timed words $((a_i, \tau_i))_{i=0}^{\infty}$ over the alphabet E, see Definition 7.3, where ai ∈ E, τi ∈ R≥0, i ∈ N0, and the sequence $(\tau_i)_{i=0}^{\infty}$ is monotone and unboundedly increasing. This follows from the definition of right-open E-signals, cf. [112, Section 5], which are exactly the piecewise constant right-continuous non-Zeno signals in our terminology. Indeed, we just put τ0 = 0, for i ∈ N we put τi ∈ R≥0 to be the i-th point of discontinuity of σ, and put ai ∈ E to be the value the function σ takes at any point of the right-open interval [τi, τi+1) ⊂ R≥0. As σ is piecewise constant, right-continuous and non-Zeno, it has not more than countably many points of discontinuity. If σ has infinitely many points of discontinuity, we this way put into correspondence to σ a unique timed word $w_\sigma = ((a_i, \tau_i))_{i=0}^{\infty}$. Now for every pair τi, τi+1, i ∈ N0, and for every N = Ni ∈ N0 choose an arbitrary finite strictly increasing sequence $(\tau_i^j)_{j=0}^{N_i+1}$,

$$\tau_i = \tau_i^0 < \tau_i^1 < \cdots < \tau_i^{N_i} < \tau_i^{N_i+1} = \tau_{i+1},$$

and consider the infinite timed word $((b_{ij}, \tau_i^j))$ over the alphabet E, where $(\tau_i^j)$ is the following infinite monotone and unboundedly increasing sequence over R≥0:

$$0 = \tau_0 < \tau_0^1 < \cdots < \tau_0^{N_0} < \tau_1 < \tau_1^1 < \cdots < \tau_1^{N_1} < \tau_2 < \cdots$$

and $b_{ij} = \sigma(t)$ if $t \in [\tau_i^j, \tau_i^{j+1})$, where j ≤ Ni. Denote via W(σ) the set of all the so constructed timed words. Let σ have only finitely many points of discontinuity, and let τn ∈ R≥0 be the largest one (n = 0 if σ has no points of discontinuity; that is, σ is a constant function).
Then σ(t) = an for all t ∈ [τn, ∞). For all pairs τi, τi+1 where i + 1 ≤ n and for all Ni ∈ N0 choose an arbitrary finite strictly increasing sequence $(\tau_i^j)_{j=0}^{N_i+1}$,

$$\tau_i = \tau_i^0 < \tau_i^1 < \cdots < \tau_i^{N_i} < \tau_i^{N_i+1} = \tau_{i+1},$$

and consider the finite timed word $((b_{ij}, \tau_i^j))$ over the alphabet E, where $(\tau_i^j)$ is the following finite monotone increasing sequence over R≥0:

$$0 = \tau_0 < \tau_0^1 < \cdots < \tau_0^{N_0} < \tau_1 < \tau_1^1 < \cdots < \tau_1^{N_1} < \tau_2 < \cdots < \tau_{n-1}^{N_{n-1}} < \tau_n$$

and $b_{ij} = \sigma(t)$ if $t \in [\tau_i^j, \tau_i^{j+1})$, where j ≤ Ni. Further, for all the so constructed sequences $(\tau_i^j)$ and all unboundedly increasing sequences $\tau_{n+1}^0 < \tau_{n+2}^0 < \cdots$ over R≥0 such that $\tau_{n+1}^0 > \tau_n$, consider the set Tσ of all timed words $((a_{ij}, \tau_i^j))$ where $a_{ij} = \sigma(t)$ for arbitrary $t \in [\tau_i^j, \tau_i^{j+1})$ if i < n and j < Ni + 1, and $a_{ij} = \sigma(t)$ for arbitrary t ∈ [τn, ∞) if i ≥ n. Put into correspondence to σ the set W(σ) of all these timed words.

Now to every timed word $w = ((a_i, \tau_i))_{i=0}^{\infty}$ over the alphabet E put into correspondence the function $\sigma^w$ : R≥0 → E such that $\sigma^w(t) = a_i$ if t ∈ [τi, τi+1), i ∈ N0. It is obvious that $\sigma^w = \sigma^v$ if and only if $W(\sigma^w) = W(\sigma^v)$. Without loss of generality we may always assume that τ0 = 0 for all timed words we consider further. Also without loss of generality we may assume that the aforementioned finite-state transducer Z from Sect. 7.2.3 on non-Zeno signals has input and output alphabets which coincide with the finite set E where non-Zeno signals take their values: I = O = E. As by its definition the transducer Z is a classical transducer which maps infinite words over the alphabet E to infinite words over the alphabet E, let fZ denote its automaton function. Note that the mapping fZ maps left-infinite words over E to left-infinite words over E, unlike the mapping FZ, which is also associated to the transducer Z but maps signals to signals, cf. Sect. 7.2.3. Let now TZ be a mapping from timed words over the alphabet E to timed words over the alphabet E which maps any given timed word $w = ((a_i, \tau_i))_{i=0}^{\infty}$ to the timed word $w' = ((a'_i, \tau_i))_{i=0}^{\infty} \in W(F_Z(\sigma^w))$. The mapping TZ is well defined since from the definition of the transducer Z it follows that, given a timed word w, the timed word w′ = TZ(w) is unique by the construction of W(σ): actually, $\ldots a'_2 a'_1 a'_0 = f_Z(\ldots a_2 a_1 a_0)$, where fZ is the automaton function of the transducer Z, i.e., a mapping of infinite words over E to infinite words over E; so $a'_i$ is the i-th letter of the output word of the classical transducer Z which is fed the input word $\ldots a_i \ldots a_2 a_1 a_0$.
Finally we conclude that to the finite transducer Z it corresponds a unique timed automaton which we denote via TZ . Note that the latter timed automaton has no clocks (and therefore no time constraints), cf. Definitions 7.5 and 7.4; so TZ is just a classical finite automaton whose state update procedure and output symbols depend only on input symbols and do not depend on time stamps of input symbols, in contrast to general timed automaton.
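The relation between a signal and the timed words in W(σ) described above can be made concrete for finite timed words. In the sketch below, which is a simplification restricted to finite words whose last letter is held forever, a timed word is a list of pairs (letter, time stamp) with τ_0 = 0; refining a timed word by inserting extra time stamps inside an interval does not change the signal it represents. The names `canonical` and `same_signal` are hypothetical.

```python
def canonical(timed_word: list) -> list:
    """Merge consecutive entries carrying the same letter, keeping the
    earliest time stamp of each run; this recovers the points of
    discontinuity of the represented piecewise constant signal."""
    result = []
    for letter, stamp in timed_word:
        if not result or result[-1][0] != letter:
            result.append((letter, stamp))
    return result


def same_signal(w: list, v: list) -> bool:
    """Two timed words (both starting at time 0) represent the same
    signal exactly when their canonical forms coincide."""
    return canonical(w) == canonical(v)


w = [("a", 0), ("b", 2), ("a", 5)]
v = [("a", 0), ("a", 1), ("b", 2), ("b", 3.5), ("a", 5)]   # a refinement of w
u = [("a", 0), ("b", 3)]

print(same_signal(w, v), same_signal(w, u))
```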
It is easy to show that the correspondence Z ↦ TZ is injective: given Z1 and Z2 such that fZ1 ≠ fZ2 (which is true if and only if FZ1 ≠ FZ2), then necessarily TZ1 ≠ TZ2; that is, the respective mappings of timed words to timed words which correspond to TZ1 and TZ2 are different. Further, to every timed word $w = ((a_i, \tau_i))_{i=0}^{\infty}$ over the alphabet E there corresponds a unique non-Zeno signal σw such that σw(t) = ai, where i is uniquely defined by the condition t ∈ [τi, τi+1); recall that we assume τ0 = 0 for all timed words. Denote the mapping w → σw via . Clearly, w ∈ W (σw ), and moreover, that (TZ (v)) = σ for all v ∈ W (σw ). Therefore, FZ ((TZ (v))) = FZ (σw ) for all v ∈ W (σw ) and TZ (W (FZ ((w))) = TZ (w) for all timed words w. Thus, speaking loosely, to every finite transducer Z on non-Zeno signals we put into correspondence a unique timed automaton TZ which acts on the timed words that correspond to non-Zeno signals in exactly the same way as the transducer Z acts on the non-Zeno signals which correspond to the timed words. These timed automata TZ have neither clocks nor clock constraints, and that is why the class they constitute is significantly narrower than the class of all timed automata. Concluding the section, we explain why timed automata (and therefore finite transducers on non-Zeno signals) can be approximated ‘with any desirable accuracy’ by classical automata with multiple inputs/outputs. This observation is important in applications to smart contract modelling and verification, since smart contracts are usually modelled by timed automata, and therefore the interaction of smart contracts in media (e.g., in blockchain) should be considered as the interaction of computer programs (namely, contracts) in physical time. However, it is impossible to use real numbers in computer simulations; only approximations of the reals by rational numbers can be used.
Therefore a natural question arises: what is the precision of such models, given that in the models the automata over continuous time are actually replaced by classical automata over discrete time? This question was considered in [18]; we briefly explain the idea here. As already mentioned in Sects. 7.2.2 and 7.2.3, all results for both timed automata and automata over non-Zeno signals over continuous time remain true if the time $T$ is an arbitrary dense additive semigroup in $\mathbb{R}_{\ge 0}$. Therefore we may take $T = \mathbb{Z}_p \cap \mathbb{Q}_{\ge 0}$ for an arbitrary prime $p$. Of course, for computer simulations it is better to take $p = 2$. But every 2-adic integer can be approximated by a number from $\{0, 1, \ldots, 2^n - 1\}$ with any desirable 2-adic accuracy just by choosing $n$ sufficiently large. This means that by taking a sufficiently large number of low-order digits in the base-2 representation of rational 2-adic integers we can approximate all real numbers with any desirable accuracy, cf. Sect. 2.2. Note that currently the smallest measured time interval is about $10^{-20}$ seconds, and according to the paradigms of contemporary physics, the smallest time interval which can be measured is Planck’s time, $\approx 10^{-43}$ seconds. As $10^{43} \approx 2^{143}$, roughly 145 bits suffice to represent the fractional part of a real number which represents time with the highest theoretically possible accuracy; so roughly 180 bits suffice to represent every time moment within a 3000-year interval with that accuracy. As a timed automaton can be considered as an automaton with two inputs, one for time and another one for data, it is clear that for all practical purposes one can model any timed automaton with any desirable and physically possible accuracy by a classical automaton: 256 inputs (i.e., 256 binary entries) for the classical automaton are much more than enough to represent both time stamps and data. Thus, by using 1-Lipschitz mappings $\mathbb{Z}_2^{256} \to \mathbb{Z}_2$ (all of them are classical automata functions), one can model any timed automaton with any physically possible accuracy. Of course, for most practical purposes the 256-bit estimate is too high.

Having in mind the purpose of modelling and verification of smart contracts, we note also that in most cases finite transducers of non-Zeno signals over continuous time (and therefore their timed automaton counterparts) seem not to be an adequate model of smart contracts, since any such automaton maps a constant signal to a constant signal, cf. [112]. This is obviously not the case for most legal contracts, and thus for the corresponding smart contracts: for instance, a constant signal which means ‘no payment done’ must finally result in a non-constant response. Currently it is not clear whether Zeno-type signals can occur during the interaction of smart contracts; moreover, there are numerous Zeno cases in the theory of hybrid automata, e.g., cf. [140] and other relevant papers from the proceedings in which the cited paper is published. However, smart contracts can definitely be considered as a sort of hybrid automata.
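The bit counts quoted above can be checked with a few lines of arithmetic. The following Python sketch is an illustration only and is not taken from [18]: the helper names (`bits_for_resolution`, `bits_for_span`, `truncate_2adic`) and the value used for seconds per year are assumptions introduced here. It recomputes the number of bits needed for Planck-scale resolution over a 3000-year span, and shows how a rational 2-adic integer (a fraction with odd denominator) is reduced to its n low-order binary digits. The modular inverse via `pow(den, -1, m)` assumes Python 3.8 or later.

```python
import math

PLANCK_TIME_S = 1e-43                    # order of magnitude quoted in the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # assumption: Julian year


def bits_for_resolution(resolution_s):
    """Bits after the binary point so that one unit in the last place
    is no larger than resolution_s seconds."""
    return math.ceil(math.log2(1.0 / resolution_s))


def bits_for_span(years):
    """Bits needed to count whole seconds over the given number of years."""
    return math.ceil(math.log2(years * SECONDS_PER_YEAR))


def truncate_2adic(num, den, n):
    """Return the unique r in {0, ..., 2**n - 1} with den * r = num (mod 2**n).

    num/den must be a rational 2-adic integer, i.e. den must be odd.
    The 2-adic distance between num/den and r is then at most 2**(-n).
    """
    if den % 2 == 0:
        raise ValueError("denominator must be odd for a 2-adic integer")
    modulus = 2 ** n
    return (num * pow(den, -1, modulus)) % modulus


frac_bits = bits_for_resolution(PLANCK_TIME_S)   # fractional part of a second
span_bits = bits_for_span(3000)                  # whole seconds in 3000 years
total_bits = frac_bits + span_bits

# The 8 low-order binary digits of the 2-adic integers 1/3 and -1.
one_third_mod_256 = truncate_2adic(1, 3, 8)
minus_one_mod_256 = truncate_2adic(-1, 1, 8)
```

Under these assumptions the sketch gives 143 fractional bits and 37 bits for the span, 180 in total, which agrees with the rough estimates in the text and leaves ample room within 256 bits for the data input.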
7.2.5 On the p-adic Time
It has already been mentioned at the beginning of the current subsection that, when dealing with automata over continuous time, one actually uses a countable ordered dense subset of $\mathbb{R}_{\ge 0}$ as a ‘time scale’; thus the time scale is actually a dense subset of $\mathbb{Q}_{\ge 0}$ whose order is inherited from the natural order on $\mathbb{R}_{\ge 0}$. This is why one can naturally consider $\mathbb{Z}_p \cap \mathbb{Q}_{\ge 0}$ as a time scale. However, the set $\mathbb{Z}_p \cap \mathbb{Q}_{\ge 0}$ is dense in $\mathbb{R}_{\ge 0}$ as well as in $\mathbb{Z}_p$ with respect to the respective metrics; thus, by taking closures of $\mathbb{Z}_p \cap \mathbb{Q}_{\ge 0}$ with respect to the corresponding metrics, we obtain automata over continuous time of two different types: the first over the Archimedean time $\mathbb{R}_{\ge 0}$, and the second over the non-Archimedean time $\mathbb{Z}_p$. The p-adic time is not too unusual in theoretical physics: corresponding theories of p-adic (and, even more generally, adelic) space-time have been developed by theorists since at least the 1980s, see, e.g., [25, 37, 130, 131]; the main motivation for considering non-Archimedean time is that the smallest quantum of space and time exists and is of the order of the Planck scale [25]. But when considering automata over p-adic time one has to explain what causality is in that case, since it is well known that on $\mathbb{Q}_p$ as well as on $\mathbb{Z}_p$ no order exists which agrees both with addition and with the p-adic metric, see, e.g., [113]. However, an order which agrees both with addition and with the real metric exists on $\mathbb{Z}_p \cap \mathbb{Q}$; that order is the one inherited from the natural order on $\mathbb{R}$. Therefore no problem with causality arises when using $\mathbb{Z}_p \cap \mathbb{Q}_{\ge 0}$ as a time arrow of automata over continuous time. Moreover, as $\mathbb{Z}_p \cap \mathbb{Q}_{\ge 0}$ is dense in $\mathbb{R}_{\ge 0}$, it is impossible to distinguish between real numbers from $\mathbb{R}_{\ge 0}$ and p-adic rational numbers from $\mathbb{Z}_p \cap \mathbb{Q}_{\ge 0}$ in measurements, whatever $p$ is taken. So models whose time arrow is
$\mathbb{Z}_p \cap \mathbb{Q}_{\ge 0}$ rather than $\mathbb{R}_{\ge 0}$ may be adequate at all scales, from Planck’s scale to the cosmological scale.

Within that context, it is interesting to mention that discrete-time dynamics, the cascades, defined by large classes of classical automata (which are obviously automata over discrete time $\mathbb{N}_0$) can be uniquely extended to flows over p-adic time $\mathbb{Z}_p$; therefore, to automata dynamics over rational p-adic time $\mathbb{Z}_p \cap \mathbb{Q}_{\ge 0}$. Namely, in [12, Subsection 4.8.1] the following problem is studied: as $\mathbb{N}_0$ is dense (with respect to the p-adic metric) in $\mathbb{Z}_p$, given an automaton (that is, 1-Lipschitz) map $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$, can one extend the discrete dynamics of $f$ on $\mathbb{Z}_p$ to continuous dynamics on $\mathbb{Z}_p$ by taking p-adic limits of the iterates $f^n(x)$ when $n \in \mathbb{N}_0$ tends p-adically to $t \in \mathbb{Z}_p$? In other words, when does the p-adic limit

$$f^t(x) = \lim_{n_j \to t} f^{n_j}(x)$$

exist for every $x, t \in \mathbb{Z}_p$ and every sequence $(n_j \in \mathbb{N}_0)_{j=0}^{\infty}$ that tends p-adically to $t \in \mathbb{Z}_p$? The answer is: this holds for every $p$ if $f$ is ergodic; moreover, if $p = 2$ then this is true if $f$ is measure-preserving. We state the corresponding result as a theorem:

Theorem 7.9 Given a 1-Lipschitz ergodic transformation $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$, the 2-variate function $f^t(x)$ is a 1-Lipschitz function which is well-defined for all $(t, x) \in \mathbb{Z}_p^2$ and which takes values in $\mathbb{Z}_p$. Furthermore, for every $x \in \mathbb{Z}_p$ the function $f^t(x)$ is measure-preserving as a function of the variable $t \in \mathbb{Z}_p$. Moreover, when $p = 2$, the same is true for any measure-preserving (and not necessarily ergodic) 1-Lipschitz function $f\colon \mathbb{Z}_2 \to \mathbb{Z}_2$.

If in the conditions of Theorem 7.9 one assumes that $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is an affine function, then one can express $f^t(x)$ explicitly. Namely, given an ergodic affine transformation $f(x) = ax + b$ on $\mathbb{Z}_p$, the 2-variate function from Theorem 7.9 is of the form $f^t(x) = bt + x$ if $a = 1$, and

$$f^t(x) = b \cdot \frac{a^t - 1}{a - 1} + a^t x$$

if $a \ne 1$. Note that by Theorem 5.12, $p \nmid b$ and $a \equiv 1 \pmod{p}$; thus by Proposition 4.19 the function $a^t$ is a B-function, and so is $f^t(x)$ as a function of $t$. Moreover, it is not difficult to show that actually $f^t(x)$ is a C-function as a function of $t$, for every $x \in \mathbb{Z}_p$. These observations suggest why it may be possible to develop a theory of automata as causal mappings over continuous time $\mathbb{Z}_p \cap \mathbb{Q}_{\ge 0}$ whose input/output signals are functions $\mathbb{Z}_p \cap \mathbb{Q}_{\ge 0} \to \mathbb{Z}_p \cap \mathbb{Q}$ which are 1-Lipschitz (thus, classical automata) functions with respect to the p-adic metric and which at the same time are continuous functions with respect to the real metric on $\mathbb{R}$. To the best of our knowledge, no such theory currently exists.
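For an affine ergodic map, the behaviour described above can be tested numerically modulo $p^k$, that is, on the $k$ low-order p-adic digits. The Python sketch below is a hedged illustration rather than the construction of [12]: the parameters p = 3, a = 4, b = 2, k = 4 and all function names are assumptions chosen here so that $a \equiv 1 \pmod{p}$ and $p \nmid b$. It compares direct iteration with the closed formula for natural $t$, measures the cycle length of $f$ modulo $p^k$, and lets one check that exponents which are p-adically close give p-adically close iterates.

```python
# Illustrative parameters (assumptions, not taken from the text):
# p is an odd prime, a = 1 (mod p), a != 1, and p does not divide b.
P, K = 3, 4
A, B = 4, 2
MOD = P ** K          # work modulo p^k, i.e. with k low-order p-adic digits


def f(x, mod=MOD):
    """The affine map f(x) = a*x + b reduced modulo mod."""
    return (A * x + B) % mod


def iterate(n, x, mod=MOD):
    """Apply f exactly n times to x (n a non-negative integer)."""
    for _ in range(n):
        x = f(x, mod)
    return x


def closed_form(n, x, mod=MOD):
    """f^n(x) = b*(a^n - 1)/(a - 1) + a^n * x for natural n.

    (a^n - 1) // (a - 1) is exact: it equals 1 + a + ... + a^(n-1).
    """
    geometric_sum = (A ** n - 1) // (A - 1)
    return (B * geometric_sum + A ** n * x) % mod


def orbit_length(x0=0, mod=MOD):
    """Number of steps until the orbit of x0 under f returns to x0."""
    x, steps = f(x0, mod), 1
    while x != x0:
        x, steps = f(x, mod), steps + 1
    return steps


# Exponents n and n + p^j are p-adically close (distance at most p^-j);
# compare the corresponding iterates modulo p^j for j = 1, ..., k.
close_times_agree = all(
    iterate(5, 7) % P ** j == iterate(5 + P ** j, 7) % P ** j
    for j in range(1, K + 1)
)

# n = p^k - 1 approximates t = -1 p-adically, so f^n acts as the inverse of f.
approx_inverse_of_7 = iterate(MOD - 1, 7)
```

With these parameters the orbit of 0 has length $p^k = 81$, so $f$ is a single cycle modulo $p^k$; this is the finite-level reason why $f^{n}(x)$ depends only on $n$ modulo $p^j$ when read modulo $p^j$, and hence why the limit over $n_j \to t$ is well defined in this example.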
8 Conclusion
The main goal of the paper is to show that causality is intrinsically non-Archimedean; that is, a causal mapping, under every reasonable definition of what could be called causality, is a non-expansive mapping with respect to a (generalized) non-Archimedean distance. Causality is tightly related to the notion of time, and the latter notion can be formalized in basically two different ways: as discrete time and as continuous time. The two types, while having a lot in common, differ in many significant details. In the case of discrete time, causal mappings are 1-Lipschitz mappings with respect to the p-adic metric. The class of p-adic 1-Lipschitz mappings coincides with the class of automata functions, the mappings performed by letter-to-letter transducers which map one-sided infinite words to one-sided infinite words over finite alphabets. The theory of these mappings is a part of p-adic dynamics, which is currently a rich and rapidly developing mathematical theory with a number of applications to computer science, cryptography, physics, quantitative biology, genetics, cognitive sciences, etc. The case of causal mappings over continuous time and related automata mappings, though very important for various applications, is less developed than the case of automata over discrete time, and many problems are still open, for instance, the ones related to transducers of Zeno signals. The problems may be rooted in the very understanding (and hence, modelling) of physical time, which is usually considered to be continuous and whose adequate model is mostly assumed to be the real numbers. But it is precisely in that case that most mathematical difficulties with causal mappings occur.
A possible way to overcome these difficulties may be to consider infinite dense subsets of the rational (rather than real) numbers as a time scale; thus the p-adic rational integers, with the ordering inherited from the natural order on the real numbers, might serve as a reasonable time arrow, since mappings defined by automata over discrete as well as over continuous time are well-defined functions on p-adic rational integers.

Last but not least: the author is grateful to colleagues from the Faculty of Computational Mathematics and Cybernetics of Lomonosov Moscow State University, from the Federal Research Center ‘Information and Control’ of the Russian Academy of Sciences, and from the Department of Mathematical Physics of the Steklov Mathematical Institute of the Russian Academy of Sciences for encouraging discussions on applications of the p-adic theory of automata functions to information sciences, computer sciences, and physical sciences. The author would also like to thank the referees for their careful reading of the paper and for numerous recommendations and suggestions which improved the overall exposition. The author appreciates support from the Russian Foundation for Basic Research through grant No 18-20-03124.
References
1. Herbert Abels and Antonios Manoussos. Topological generators of abelian Lie groups and hypercyclic finitely generated abelian semigroups of matrices. Advances in Mathematics, 229:1862–1872, 2012.
2. Elsayed Ahmed and Dmytro Savchuk. Endomorphisms of regular rooted trees induced by the action of polynomials on the ring Zd of d-adic integers. J. Algebra Appl., 19(8):2050154, 2020.
3. Charalambos D. Aliprantis and Owen Burkinshaw. Principles of real analysis. Academic Press, Inc., third edition, 1998.
4. J.-P. Allouche and J. Shallit. Automatic Sequences. Theory, Applications, Generalizations. Cambridge Univ. Press, 2003.
5. R. C. Alperin. p-adic binomial coefficients mod p. The Amer. Math. Month., 92(8):576–578, 1985.
6. Rajeev Alur and David Dill. The theory of timed automata. Theoretical Computer Science, 126:183–235, 1994.
7. Rajeev Alur, Limor Fix, and Thomas A. Henzinger. A determinizable class of timed automata. In David L. Dill, editor, Computer Aided Verification. 6th Internat. Conf., CAV’94, volume 818 of Lecture Notes in Computer Science, pages 1–13, California, USA, Jun 21–23 1994. Stanford, Springer.
8. Y. Amice. Interpolation p-adique. Bull. Soc. Math. France, 92:117–180, 1964.
9. V. Anashin. Ergodic transformations in the space of p-adic integers. In Andrei Yu. Khrennikov, Zoran Rakić, and Igor V. Volovich, editors, p-adic Mathematical Physics. 2nd Int’l Conference (Belgrade, Serbia and Montenegro 15–21 September 2005), volume 826 of AIP Conference Proceedings, pages 3–24, Melville, New York, 2006. American Institute of Physics.
10. V. Anashin. Non-Archimedean theory of T-functions. In Proc. Advanced Study Institute Boolean Functions in Cryptology and Information Security, volume 18 of NATO Sci. Peace Secur. Ser. D Inf. Commun. Secur., pages 33–57, Amsterdam, 2008. IOS Press.
11. V. Anashin. Non-Archimedean ergodic theory and pseudorandom generators. The Computer Journal, 53(4):370–392, 2010.
12. V. Anashin and A. Khrennikov. Applied Algebraic Dynamics, volume 49 of de Gruyter Expositions in Mathematics. Walter de Gruyter GmbH & Co., Berlin–N.Y., 2009.
13. V. S. Anashin. Uniformly distributed sequences of p-adic integers. Mathematical Notes, 55(2):109–133, 1994.
14. V. S. Anashin. Uniformly distributed sequences in computer algebra, or how to construct program generators of random numbers. J. Math. Sci., 89(4):1355–1390, 1998.
15. V. S. Anashin. Uniformly distributed sequences of p-adic integers, II. Discrete Math. Appl., 12(6):527–590, 2002.
16. V. S. Anashin. Quantization causes waves: Smooth finitely computable functions are affine. p-Adic Numbers, Ultrametric Analysis Appl., 7(3):169–227, 2015.
17. V. S. Anashin. Smooth finitely computable functions are affine, or why quantum systems are wave systems. Doklady Mathematics, 92(3):165–167, 2015.
18. V. S. Anashin. On automata models of blockchain. Informatics and Applications, 13(2):29–36, 2019. In Russian, English summary.
19. V. S. Anashin, A. Yu. Khrennikov, and E. I. Yurova. Characterization of ergodicity of p-adic dynamical systems by using van der Put basis. Doklady Mathematics, 83(3):306–308, 2011.
20. Vladimir Anashin. Automata finiteness criterion in terms of van der Put series of automata functions. p-Adic Numbers, Ultrametric Analysis and Applications, 4(2):151–160, 2012.
21. Vladimir Anashin. The non-Archimedean theory of discrete systems. Math. Comp. Sci., 6(4):375–393, 2012.
22. Vladimir Anashin. Discreteness causes waves. Facta Universitatis, 14(6):143–196, 2016.
23. Vladimir Anashin, Andrei Khrennikov, and Ekaterina Yurova. Ergodicity criteria for nonexpanding transformations of 2-adic spheres. Discrete and Continuous Dynamical Systems, 34(2):367–377, 2014.
24. Vladimir Anashin, Andrei Khrennikov, and Ekaterina Yurova. T-functions revisited: new criteria for bijectivity/transitivity. Designs, Codes, and Cryptography, 71(3):383–407, 2014.
25. I. Ya. Aref’eva. Physics at the Planck length and p-adic field theories. In Ling-Lie Chau and Werner Nahm, editors, Differential Geometrical Methods in Theoretical Physics, NATO Science Series B: Physics and Geometry, pages 387–398, NY, 1990. Plenum Press.
26. Ekaterina Yurova Axelsson and Andrei Khrennikov. Description of (fully) homomorphic cryptographic primitives within the p-adic model of encryption. In Karl-Olof Lindahl, Torsten Lindström, Luigi G. Rodino, Joachim Toft, and Patrik Wahlberg, editors, Analysis, Probability, Applications, and Computation. Proceedings of the 11th ISAAC Congress, pages 241–248. Birkhäuser, 2017.
27. Ross A. Beaumont and Herbert S. Zuckerman. A characterization of the subgroups of the additive rationals. Pacific J. Math, 1(2):169–177, 1951.
28. W. Brauer. Automatentheorie. B. G. Teubner, Stuttgart, 1984.
29. J. Bryk and C. E. Silva. Measurable dynamics of simple p-adic polynomials. Amer. Math. Monthly, 112(3):212–232, 2005.
30. John Carroll and Darrell Long. Theory of Finite Automata. Prentice-Hall Inc., 1989.
31. Swarat Chaudhuri, Sriram Sankaranarayanan, and Moshe Y. Vardi. Regular real analysis. In 28th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2013), pages 509–518, Los Alamitos, CA, 2013. IEEE Computer Soc.
32. A. N. Cherepov. On approximation of continuous functions by determinate functions with delay. Discrete Math. Appl., 22(1):1–24, 2010.
33. A. N. Cherepov. Approximation of continuous functions by finite automata. Discrete Math. Appl., 22(4):445–453, 2012.
34. R. Crowell and R. Fox. Introduction to the Knot Theory. Ginn and Co., Boston, 1963.
35. J. Dénes and A. D. Keedwell. Latin squares. North-Holland, Amsterdam, 1991.
36. D. L. Desjardins and M. E. Zieve. On the structure of polynomial mappings modulo an odd prime power. Available at http://arXiv.org/math.NT/0103046, 2001.
37. B. G. Dragović, P. H. Frampton, and B. V. Urošević. Classical p-adic space-time. Modern Physics Letters, A5:1521–1528, 1990.
38. B. A. Dubrovin, A. T. Fomenko, and S. P. Novikov. Modern Geometry - Methods and Applications, volume II. Springer-Verlag, NY–Berlin–Heidelberg–Tokyo, 1985.
39. F. Durand and F. Paccaut. Minimal polynomial dynamics on the set of 3-adic integers. Bull. London Math. Soc., 41(2):302–314, 2009.
40. J. Eichenauer, J. Lehn, and A. Topuzoğlu. A nonlinear congruential pseudorandom number generator with power of two modulus. Math. Comp., 51:757–759, 1988.
41. J. Eichenauer-Herrmann and H. Grothe. A new inversive congruential pseudorandom number generator with power of two modulus. ACM Trans. Modelling and Computer Simulation, 2:1–11, 1992.
42. J. Eichenauer-Herrmann, E. Herrmann, and S. Wegenkittl. A survey of quadratic and inversive congruential pseudorandom numbers. In P. Hellekalek, G. Larcher, H. Niederreiter, and P. Zinterhof, editors, Monte Carlo and Quasi-Monte Carlo Methods 1996, volume 127 of Lecture Notes in Statistics, pages 66–97, N.Y., 1998. Springer.
43. Samuel Eilenberg. Automata, Languages, and Machines, volume A. Academic Press, 1974.
44. G. Everest, A. van der Poorten, I. Shparlinsky, and T. Ward. Recurrence Sequences, volume 104 of American Mathematical Society Surveys. American Mathematical Society, 2003.
45. Aihua Fan, Shilei Fan, Lingmin Liao, and Yuefei Wang. On minimal decomposition of p-adic homographic dynamical systems. Advances in Mathematics, 257:92–135, 2014.
46. Mark D. Flood and Oliver R. Goodenough. Contract as automaton: The computational representation of financial agreements. SSRN Electronic Journal, March 2015. DOI: 10.2139/ssrn.2538224.
47. C. Frougny and K. Klouda. Rational base number systems for p-adic numbers. RAIRO Theor. Inform. Appl., 46(1):87–106, 2012.
48. Joanna Furno. Orbit equivalence of p-adic transformations and their iterates. Monatsh. Math., 175:249–276, 2014.
49. Joanna Furno. Singular p-adic transformations for Bernoulli product measures. New York J. Math., 20:799–812, 2014.
50. Joanna Furno. Natural extensions for p-adic β-shifts and other scaling maps. Indagationes Mathematicae, 30:1099–1108, 2019.
51. Alexi Block Gorman et al. Continuous regular functions. Logical Methods in Computer Science, 16(1):17:1–17:24, 2020.
52. F. Q. Gouvêa. p-adic Numbers, An Introduction. Springer-Verlag, Berlin–Heidelberg–New York, second edition, 1997.
53. George Grätzer. Lattice Theory: Foundation. Birkhäuser, 2011.
54. R. I. Grigorchuk. Some topics in the dynamics of group actions on rooted trees. Proc. Steklov Institute of Mathematics, 273:64–175, 2011.
55. R. I. Grigorchuk, V. V. Nekrashevich, and V. I. Sushchanskii. Automata, dynamical systems, and groups. Proc. Steklov Institute Math., 231:128–203, 2000.
56. Rostislav Grigorchuk and Dmytro Savchuk. Solenoid maps, automatic sequences, van der Put series, and Mealy-Moore automata. ArXiv:2006.02316v1 [cs.FL] 3 Jun 2020.
57. Rostislav Grigorchuk and Dmytro Savchuk. Ergodic decomposition of group actions on rooted trees. Proc. Steklov Institute Math., 292:94–111, 2016.
58. V. M. Gundlach, A. Yu. Khrennikov, and K.-O. Lindahl. Ergodicity on p-adic sphere. In German Open Conference on Probability and Statistics, pages 15–21, Hamburg, 2000. University of Hamburg Press.
59. B. Hasselblatt and A. Katok, editors. Handbook of Dynamical Systems, volume 1A. Elsevier Science B. V., Amsterdam, 2002.
60. B. Hasselblatt and A. Katok. A First Course in Dynamics. Cambridge Univ. Press, Cambridge, etc., 2003.
61. K. Hensel. Über eine neue Begründung der Theorie der algebraischen Zahlen. Jahresbericht der Deutschen Mathematiker-Vereinigung, 6(3):83–88, 1897.
62. Philipp Hieronymi and Erik Walsberg. On continuous functions definable in expansions of the ordered real additive group. arXiv:1709.03150v1 [math.LO] 10 Sep 2017.
63. J. Hong, D. Lee, Y. Yeom, and D. Han. A new class of single cycle T-functions. In Fast Software Encryption, FSE 2005, number 3557 in Lect. Notes Comp. Sci., pages 68–82. Springer-Verlag, 2005.
64. Y. Jang, S. Jeong, and C. Li. Criteria of measure-preservation for 1-Lipschitz functions on Fq[[T]] in terms of the van der Put and its applications. Finite Fields Appl., 37:131–157, 2016.
65. Youngho Jang, Sangtae Jeong, and Chunlan Li. Measure-preservation criteria for 1-Lipschitz functions on Fq[[T]] in terms of the three bases of Carlitz polynomials, digit derivatives, and digit shifts. Finite Fields Appl., 46:304–325, 2017.
66. Sangtae Jeong. Characterization of the ergodicity of 1-Lipschitz functions on Z2 using the q-Mahler basis. J. Number Theory, 151:116–128, 2015.
67. Sangtae Jeong. Measure-preservation and the existence of a root of p-adic 1-Lipschitz functions in Mahler’s expansion. p-Adic Numbers, Ultrametric Analysis and Applications, 10(3):192–208, 2018.
68. Sangtae Jeong and Chunlan Li. Measure-preservation criteria for a certain class of 1-Lipschitz functions on Zp Mahler’s expansion. Discrete and Continuous Dynamical Systems, 37(7):3787–3804, 2017.
69. R. E. Kalman, P. L. Falb, and M. A. Arbib. Topics in mathematical system theory. McGraw-Hill, N. Y., 1969.
70. T. Kato, L.-M. Wu, and N. Yanagihara. On a nonlinear congruential pseudorandom number generator. Math. Comp., 65:227–233, 1996.
71. S. Katok. p-adic analysis in comparison with real. Mass. Selecta. American Mathematical Society, 2003.
72. John G. Kemeny and J. Laurie Snell. Finite Markov Chains. Springer-Verlag, 1976.
73. Andrei Khrennikov and Ekaterina Yurova. Criteria of measure-preserving for p-adic dynamical systems in terms of the van der Put basis. Journal of Number Theory, 133:484–491, 2013.
74. Andrei Khrennikov and Ekaterina Yurova. Criteria of ergodicity for p-adic dynamical systems in terms of coordinate functions. Chaos, Solitons & Fractals, 60:11–30, 2014.
75. Andrei Khrennikov and Ekaterina Yurova. Automaton model of protein: Dynamics of conformational and functional states. Progress in Biophysics and Molecular Biology, 130:2–14, Nov. 2017.
76. J. Kingsbery, A. Levin, A. Preygel, and C. E. Silva. On measure-preserving c1 transformations of compact-open subsets of non-archimedean local fields. Trans. Amer. Math. Soc., 361(1):61–85, 2009.
77. J. Kingsbery, A. Levin, A. Preygel, and C. E. Silva. Dynamics of the p-adic shift and applications. Discrete and Continuous Dynamical Systems, 30(1):209–218, 2011.
78. A. Klimov and A. Shamir. New cryptographic primitives based on multiword T-functions. In Bimal Roy and Willi Meier, editors, Fast Software Encryption: 11th International Workshop, FSE 2004, Delhi, India, February 5–7, 2004. Revised Papers, pages 1–15. Springer-Verlag GmbH, 2004.
79. A. Klimov and A. Shamir. Cryptographic applications of T-functions. In Selected Areas in Cryptography -2003, 2003.
80. A. Klimov and A. Shamir. A new class of invertible mappings. In B. S. Kaliski Jr. et al., editor, Cryptographic Hardware and Embedded Systems 2002, volume 2523 of Lect. Notes in Comp. Sci, pages 470–483. Springer-Verlag, 2003.
81. A. Klimov and A. Shamir. New applications of T-functions in block ciphers and hash functions. In Fast Software Encryption, FSE 2005, number 3557 in Lect. Notes Comp. Sci., pages 18–31. Springer-Verlag, 2005.
82. D. Knuth. The Art of Computer Programming, volume 2: Seminumerical Algorithms. Addison-Wesley, Third edition, 1997.
83. N. Koblitz. p-adic numbers, p-adic analysis, and zeta-functions, volume 58 of Graduate texts in math. Springer-Verlag, second edition, 1984.
84. N. Kolokotronis. Cryptographic properties of nonlinear pseudorandom number generators. Designs, Codes and Cryptography, 46:353–363, 2008.
85. Michal Konečný. Real functions computable by finite automata using affine representations. Theor. Comput. Sci., 284:373–396, 2002.
86. L. Kuipers and H. Niederreiter. Uniform Distribution of Sequences. John Wiley & Sons, N.Y. etc., 1974.
87. M. V. Larin. Transitive polynomial transformations of residue class rings. Discrete Mathematics and Applications, 12(2):141–154, 2002.
88. Hans Lausch and Wilfried Nöbauer. Algebra of Polynomials. North-Holl. Publ. Co, American Elsevier Publ. Co, 1973.
89. C. F. Laywine and G. L. Mullen. Discrete mathematics using Latin squares. John Wiley & Sons, Inc., New York, 1998.
90. E. Lerner. On synchronizing automata and uniform distribution. In Y.-S. Han and K. Salomaa, editors, Implementation and Application of Automata, volume 9705 of Lecture Notes Comp. Sci., pages 202–212. Springer, 2016.
91. E. E. Lerner. Uniform distribution of sequences generated by iterated polynomials. Doklady Math., 92(3):704–706, 2015.
92. Emil Lerner. The uniform distribution of sequences generated by iterated polynomials. p-Adic Numbers, Ultrametric Analysis and Applications, 11(4):280–298, 2019.
93. Dongdai Lin, Tao Shi, and Zifeng Yang. Ergodic theory over F2[[T]]. Finite Fields and Appl., 18:473–491, 2012.
94. L. P. Lisovik and O. Yu. Shkaravskaya. Real functions defined by transducers. Cybernetics and System Analysis, 34(1):69–76, 1998.
95. A. G. Lunts. The p-adic apparatus in the theory of finite automata. Problemy Kibernetiki, 14:17–30, 1965. In Russian.
96. Jan Lunze and Françoise Lamnabhi-Lagarrigue, editors. Handbook of Hybrid Systems Control. Cambridge University Press, 2009.
97. K. Mahler. p-adic numbers and their functions. Cambridge Univ. Press, 1981. (2nd edition).
98. V. Manturov. Knot theory. Chapman & Hall/CRC, Boca Raton - London - NY - Washington, 2004.
99. Eleftherios Matsikoudis and Edward A. Lee. The fixed-point theory of strictly causal functions. Theoretical Computer Science, 574:39–77, 2015.
100. Nacima Memić. Characterization of ergodic rational functions on the set of 2-adic units. International Journal of Number Theory, 13(05):1119–1128, 2017.
101. Nacima Memić. Ergodic polynomials on 2-adic spheres. Bulletin Polish Acad. Sci. Math., 65:35–44, 2017.
102. Nacima Memić. Mahler coefficients of 1-Lipschitz measure-preserving functions on Zp. International Journal of Number Theory, 16(6):1247–1261, 2020.
103. Nacima Memić and Jasmina Muminović Huremović. Ergodic uniformly differentiable functions modulo p on Zp. p-Adic Numbers, Ultrametric Analysis and Applications, 12(1):49–59, 2020.
104. A. Mishchenko and A. Fomenko. A course of differential geometry and topology. Mir, Moscow, 1988.
105. F. M. Mukhamedov and O. N. Khakimov. On metric properties of unconventional limit sets of contractive non-Archimedean dynamical systems. Dyn. Syst., 31(4):506–524, 2016.
106. H. Niederreiter. Random number generation and quasi-Monte Carlo methods. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1992.
107. D. Passman. Permutation groups. W. A. Benjamin, Inc., New York–Amsterdam, 1968.
108. J.-E. Pin. Profinite methods in automata theory. In Symposium on Theoretical Aspects of Computer Science – STACS 2009, pages 31–50, Freiburg, 2009.
109. S. Priess-Crampe and P. Ribenboim. Fixed points, combs, and generalized power series. Abh. Math. Sem. Univ. Hamburg, 63:227–244, 1993.
110. Sibylla Priess-Crampe and Paulo Ribenboim. Ultrametric dynamics. Illinois Journal of Mathematics, 55(1):287–303, 2011.
111. Chengqin Qu, Zhiwei Zhu, and Zuoling Zhou. A note on the perturbed monomial mapping. Appl. Math. J. Chinese Univ., 34(1):76–81, 2019.
112. Alexander Rabinovich. Automata over continuous time. Theoretical Computer Science, 300:331–363, 2003.
113. G. Rangan. On orderability of topological groups. Internat. J. Math. & Math. Sci., 8(4):747–754, 1985.
114. U. A. Rozikov and I. A. Sattarov. Dynamical systems of the p-adic (2, 2)-rational functions with two fixed points. Results Math, 75(100), 2020.
115. A. Salomaa. Theory of Automata. Pergamon Press, 1969.
116. W. H. Schikhof. Ultrametric calculus. Cambridge University Press, 1984.
117. Tao Shi, Vladimir Anashin, and Dongdai Lin. Linear weaknesses in T-functions. In T. Helleseth and J. Jedwab, editors, SETA 2012, volume 7280 of Lecture Notes Comp. Sci., pages 279–290, Berlin–Heidelberg, 2012. Springer-Verlag.
118. Tao Shi, Vladimir Anashin, and Dongdai Lin. Fast evaluation of T-functions via time-memory trade-offs. In Information Security and Cryptology, volume 7763 of Lecture Notes in Computer Science, pages 263–275, Berlin–Heidelberg, 2013. Springer.
119. O. Yu. Shkaravskaya. Affine mappings defined by finite transducers. Cybernetics and System Analysis, 34(5):781–783, 1998.
120. Jitender Singh. Subgroups of the additive group of real line. ArXiv:1312.7067v3 [math.NT] 20 May 2014.
121. T. I. Smyshlyaeva. A criterion for functions defined by automata to be bounded-determinate. Diskret. Mat., 25(2):121–134, 2013.
122. Renji Tao. Finite Automata and Application to Cryptography. Tsinghua Univ. Press, Springer, 2008.
123. L. B. Tyapaev. Solving some problems of automata behaviour. Izv. Saratov Univ. (N.S.), Ser. Math. Mech. Inform., 6(1–2):121–133, 2006. In Russian; abstract in English.
124. L. B. Tyapaev. Measure-preserving and ergodic asynchronous automata mappings. In O. M. Kasim-Zade, editor, Proceedings of the 12th International Workshop on Discrete Mathematics and its Applications (June 20–26, 2016, Moscow), pages 398–400. Lomonosov Moscow State University, 2016. In Russian.
125. L. B. Tyapaev. Transitive families and measure-preserving an n-unit delay mappings. In Proceedings of the International Conference on Computer Science and Information Technologies (June 30–July 2, 2016, Saratov), pages 425–429, Saratov, 2016. Publishing Center Nauka.
126. L. B. Tyapaev. Ergodic automata mappings with delay. In Yu. I. Zhuravlev, editor, Proceedings of the International Conference on Problems of Theoretical cybernetics (June 19–23, 2017, Penza), pages 242–244, Moscow, 2017. Maks Press. In Russian.
127. L. B. Tyapaev, D. V. Vasilenko, and M. V. Karandashov. Discrete dynamical systems defined geometrical images of automata. Izv. Saratov Univ. (N.S.), Ser. Math. Mech. Inform., 13(2(2)):73–78, 2013. In Russian; abstract in English.
128. V. S. Vladimirov and I. V. Volovich. Superanalysis 1. Differential calculus. Teoret. Mat. Fiz., 59:3–27, 1984.
129. V. S. Vladimirov and I. V. Volovich. Superanalysis 2. Integral calculus. Teoret. Mat. Fiz., 60:169–198, 1984.
130. V. S. Vladimirov and I. V. Volovich. p-adic quantum mechanics. Commun. Math. Phys., 123:659–676, 1989.
131. V. S. Vladimirov, I. V. Volovich, and E. I. Zelenov. p-adic Analysis and Mathematical Physics. World Scientific, Singapore, 1994.
132. I. V. Volovich. p-adic string. Class. Quant. Grav., 4:83–87, 1987.
133. J. Vuillemin. On circuits and numbers. IEEE Trans. on Computers, 43(8):868–879, 1994.
134. J. Vuillemin. Finite digital synchronous circuits are characterized by 2-algebraic truth tables. In Advances in computing science - ASIAN 2000, volume 1961 of Lecture Notes in Computer Science, pages 1–7, 2000.
135. J. Vuillemin. Digital algebra and circuits. In Verification: Theory and Practice, volume 2772 of Lecture Notes in Computer Science, pages 733–746, 2003.
136. S. Wang, B. Hu, and Y. Liu. The autocorrelation properties of single cycle polynomial T-function. Des. Codes Cryptogr., 86:1527–1540, 2018.
137. S. V. Yablonsky. Introduction to discrete mathematics. Mir, Moscow, 1989.
138. Zifeng Yang. Ergodic functions over Fq[[T]]. Finite Fields and Their Applications, 53:189–204, 2018.
139. I. A. Yurov. On p-adic functions preserving the Haar measure. Math. Notes, 63(5–6):823–836, 1998.
140. Jun Zhang et al. Dynamical systems revisited: Hybrid systems with Zeno executions. In International Workshop on Hybrid Systems: Computation and Control HSCC2000, volume 1790 of Lecture Notes in Computer Science, pages 451–464. Springer, 2000.
Chaos in p-adic Statistical Lattice Models: Potts Model
Farrukh Mukhamedov and Otabek Khakimov
Abstract Models of interacting systems have been intensively studied in recent years, and new methodologies have been developed in the attempt to understand their intriguing features. One of the most promising directions is the combination of statistical mechanics tools with methods adopted from dynamical systems. One such tool is the renormalization group (RG), which has had a profound impact on modern statistical physics. This approach has yielded many interesting results in statistical mechanics; these investigations have shed light on the phase transition problem for spin models on lattices. In the present work, we review recent developments in the application of the RG method to p-adic lattice models on Cayley trees. It turns out that the investigation of the RG transformation is strongly tied to an associated p-adic dynamical system. In this paper, we present recent results on the existence of a phase transition and its relation to the chaotic behavior of the associated p-adic dynamical system. We restrict ourselves to the p-adic q-state Potts model on a Cayley tree. Our approach uses the theory of p-adic measures and non-Archimedean stochastic processes. One of the main tools for detecting a phase transition is the existence of several p-adic Gibbs measures. The non-Archimedean nature of the norm allows us to prove the existence of chaos rigorously. We point out that in the real case, analogous results with rigorous proofs are not known in the literature.
Keywords p-Adic numbers · p-Adic Potts model · Cayley tree · Chaos · Renormalization group · p-Adic Gibbs measure
F. Mukhamedov
Department of Mathematical Sciences, College of Science, The United Arab Emirates University, Abu Dhabi, UAE e-mail: [email protected]
O. Khakimov
Department of Algebra and Its Applications, Institute of Mathematics, Tashkent, Uzbekistan e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 W. A. Zúñiga-Galindo, B. Toni (eds.), Advances in Non-Archimedean Analysis and Applications, STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health, https://doi.org/10.1007/978-3-030-81976-7_3
1 Introduction
The q-state Potts model is one of the most studied models in statistical mechanics (see [128]). It is of wide theoretical interest and has practical applications. Originally, the Potts model was introduced as a generalization of the Ising model to more than two spin components [28]. The model has a rich enough structure to illustrate almost every conceivable nuance of statistical mechanics. In [32, 110, 111] the phase diagrams of the model on Bethe lattices (Cayley trees in our terminology [109]) were studied and the pure phases of the ferromagnetic Potts model were found. Note that Bethe lattices have been used fruitfully, providing deeper insight into the behavior of Potts models [13]. To investigate rigorously a phase transition problem for statistical mechanics models on lattices, one uses a measure-theoretic approach based on the theory of Gibbs measures [29]. This theory originates with Boltzmann and Gibbs, who introduced a statistical approach to thermodynamics in order to deduce collective macroscopic behavior from individual microscopic information. A Gibbs measure associated with the Hamiltonian of a physical system (a model) generalizes the notion of a canonical ensemble (see [29]). In the classical case (where the mathematical model is prescribed over the real numbers), the physical phenomenon of phase transition should be reflected in the mathematical model by the non-uniqueness of the Gibbs measures, or by the size of the set of Gibbs measures for the prescribed model. Due to the convex structure of the set of Gibbs measures over the field of real numbers, in order to describe the size of this set it is sufficient to study the number of its extreme elements. Hence, in the classical case, to predict a phase transition, the main attention has been paid to finding all possible extreme Gibbs measures. A complete analysis of this set is often a difficult problem.
Many papers have been devoted to these studies when the underlying lattice is a Cayley tree (see [116] for a review). On the other hand, it is well known that the modern axiomatics of probability theory was provided by A.N. Kolmogorov [69]; it reduces probability to the theory of σ-additive measures taking values in the segment [0, 1]. However, there is another path to probability, based on von Mises' frequency approach [74]. It should be stressed that von Mises' approach could not compete with the precisely and naturally formulated Kolmogorov theory. We mention von Mises' approach here because frequency probabilities played an essential part in the formulation of the conventional axiomatics of probability theory. It is natural to ask: is Kolmogorov's theory enough? From the mathematical point of view such a theory is enough; a physicist, however, may be less optimistic. It seems that Kolmogorov's model, despite its generality, does not provide a reasonable mathematical description of all probabilistic structures that appear in physics (see [58] for discussion). In particular, one can recall the old problem of negative probabilities. There are many objects that must be probabilities by their physical origin, but they can take negative values (see, e.g., [39, 75]). As a consequence,
physicists have to work with such objects at the physical level of rigor. Nevertheless, negative probabilities appear again and again in different domains of physics. On the other hand, most of modern science is based on mathematical analysis over the real and complex numbers. In practice, the results of any measurement are always rational numbers, i.e. elements of Q. To do mathematical analysis, one needs a completion of the field Q. Due to the Ostrowski theorem (see [68]) there are only two kinds of completions of the rationals. The first one gives the field R of real numbers, and the second one gives the field Qp of p-adic numbers, where p is any prime number, with the corresponding p-adic norm |x|p, which is non-Archimedean. Therefore, another kind of probability, the p-adic one, appears naturally in theoretical physics [53]. Such probabilities commonly arise in p-adic physical models, for example the p-adic string suggested by I. Volovich [125]. Moreover, there are many applications of p-adic analysis to mathematical physics [2–5, 12, 55]. The reader is referred to [25, 26] for recent developments of the subject. In fact, there are two sorts of p-adic models: (A) the variables are p-adic, but the functions are C-valued; (B) both the variables and the functions take p-adic values. The (A)-models of p-adic physics and their relation to conventional probability theory on locally compact groups (especially totally disconnected ones) are briefly discussed in [8, 61, 124, 129], whereas the (B)-models are the most interesting for our present considerations. In this setting, the probabilities (by their physical origin) belong to the field Qp of p-adic numbers. However, in such an approach, Kolmogorov's axiomatics cannot be employed, see [56]. Therefore, in [55, 58, 66, 67, 73, 115], the theory of p-adic probability was rigorously constructed. We stress that a statistical interpretation of this kind of probability was provided in [58].
Furthermore, by means of p-adic measure theory, the theory of stochastic processes with values in p-adic and more general non-Archimedean fields, having probability distributions with non-Archimedean values, has been developed in [58, 62, 73]. In particular, a non-Archimedean analog of the Kolmogorov theorem was proven (see also [33, 63]). Such a result allows us to construct wide classes of stochastic processes using finite-dimensional probability distributions. It therefore makes it possible to develop statistical mechanics in the p-adic context, since that theory rests on the theory of probability and stochastic processes. In this paper, as an application, we employ the tools and methods of p-adic stochastic processes to investigate phase transitions in statistical mechanical models. After Wilson's seminal work in the early 1970s [126], the renormalization group (RG) has had a profound impact on modern statistical physics. Nowadays, RG is a pillar of theoretical physics, explaining how long-distance collective behavior emerges from microscopic models. Critical phenomena are thus understood in terms of RG fixed points, and universality is explained in terms of basins of attraction. RG techniques provide a powerful tool to describe analytically and capture quantitatively both static and dynamic critical phenomena near continuous phase transitions that are governed by strong interactions, fluctuations,
and correlations (see for review [42]). The RG method has been applied in statistical mechanics and has yielded many interesting results [27, 30]. Investigations of phase transitions of spin models on hierarchical lattices have shown that such lattices permit the exact calculation of various physical quantities [13, 29]. One of the simplest hierarchical lattices is the Cayley tree (or Bethe lattice). This lattice is not a realistic lattice; however, spin models on trees such as the Cayley tree admit the exact calculation of various physical quantities. It is believed that several of its interesting thermal properties could persist for regular lattices, for which exact calculation is intractable. The renormalization methods are widely illustrated in the study of the Ising model [28], since it is of wide theoretical interest and has practical applications. One of the generalizations of the Ising model is the so-called Potts model on the Cayley tree (see [128]). Such a model has a rich enough structure to illustrate almost every conceivable nuance of statistical mechanics. We point out that the rigorous mathematical foundation of the theory of Gibbs measures on Cayley trees was presented in [116]. One of the main aims of the current paper is to explore phase transition phenomena by means of p-adic probability theory. The p-adic counterpart of the theory of Gibbs measures on Cayley trees has also been initiated [33, 106, 107]. In the present paper, we review the latest results on Potts models on the Cayley tree, since such models have broad theoretical and practical applications [13]. Moreover, we also provide new results in this direction. The study of Potts models began with the analysis of p-adic Gibbs measures within the p-adic probability framework [31, 48–52, 78–93, 96–98, 103, 104, 106–108, 113].
However, in this article, we study more general p-adic Gibbs measures which have not been investigated previously. In the p-adic setting, due to the lack of a convex structure on the set of p-adic (quasi) Gibbs measures, it is quite difficult to characterize a phase transition through features of that set. Moreover, unlike the real case [70], the set of p-adic Gibbs measures of lattice models on the Cayley tree has a complex structure. In [82, 83, 86] we developed the RG method to study phase transitions for several p-adic models on Cayley trees. The RG method is closely related to the investigation of the p-adic dynamical system associated with a given model (see [10]). The investigation of its fixed points is strongly tied to a Diophantine problem over Q. In general, the same Diophantine problem may have different solutions in the field of p-adic numbers and in the field of real numbers because of their different topological structures. Recently, this problem was fully studied for several kinds of equations [19, 99, 101, 102]. In the real setting, a nice relation between the phase transition and the chaotic behavior of the RG transformation was established in [20]. It turns out that a similar kind of connection exists in the p-adic setting. Namely, in [1, 90, 93, 97, 98] a connection was established between the existence of the phase transition and the chaoticity of the related RG transformation of the p-adic q-state Potts model on a Cayley tree. More precisely, if q is divisible by p (with p ≥ 3), then under some conditions it was shown that the associated p-adic dynamical system is chaotic, i.e. it is conjugate
to the full shift. We notice that in [1, 83, 86, 87, 90, 97, 98, 120] the RG method has been applied to the detection of phase transitions for p-adic Potts models on Cayley trees. It is emphasized that most of the RG transformations take the form of rational p-adic dynamical systems. Many papers have been devoted to extensive investigations of p-adic rational dynamical systems [11, 15, 21, 24, 37, 38, 52, 105, 114]. We stress that some p-adic chaotic dynamical systems have been studied in [6, 14, 36, 123, 127] and have found important applications in coding theory. We emphasize that the theory of p-adic dynamical systems is a rapidly growing branch of dynamical systems [10, 36, 57, 66, 100, 122]. We remark that the first investigations of non-Archimedean dynamical systems appeared in [35, 72]. In the present paper, we review phase transition phenomena of p-adic Potts models by means of renormalization methods in the measure-theoretical scheme. Therefore, in what follows, methods of p-adic dynamical systems and p-adic probability measures will be used. The non-Archimedean nature of the norm allows us to prove the existence of chaos rigorously. We point out that, in the real case, analogous results with rigorous proofs are not known in the literature.
2 Preliminaries

2.1 p-adic Numbers
In what follows p will be a fixed prime number, and $\mathbb{Q}_p$ denotes the field of p-adic numbers, formed by completing $\mathbb{Q}$ with respect to the unique absolute value satisfying $|p|_p = 1/p$. The absolute value $|\cdot|_p$ is non-Archimedean, meaning that it satisfies the ultrametric triangle inequality $|x + y|_p \le \max\{|x|_p, |y|_p\}$. Any p-adic number $x \in \mathbb{Q}_p$, $x \ne 0$, can be uniquely represented in the form
$$x = p^{\gamma(x)}\,(x_0 + x_1 p + x_2 p^2 + \dots), \tag{2.1}$$
where $\gamma = \gamma(x) \in \mathbb{Z}$ and the $x_j$ are integers, $0 \le x_j \le p - 1$, $x_0 > 0$, $j = 0, 1, 2, \dots$ In this case $|x|_p = p^{-\gamma(x)}$. Denote
$$E_p = \{x \in \mathbb{Q}_p : |x - 1|_p < p^{-1/(p-1)}\}.$$
This set is the range of the p-adic exponential function [68, 121]. In the sequel, the following well-known fact will be used frequently without mention.
Lemma 2.2 The set $E_p$ has the following properties: (a) $E_p$ is a group under multiplication; (b) $|a - b|_p < 1$ for all $a, b \in E_p$;
(c) if $a, b \in E_p$ then
$$|a + b|_p = \begin{cases} 2^{-1}, & p = 2,\\ 1, & p \ne 2; \end{cases}$$
(d) if $a \in E_p$, then there is an element $h \in B_{p^{-1/(p-1)}}(0)$ such that $a = \exp_p(h)$.
We recall that $\mathbb{Z}_p = \{x \in \mathbb{Q}_p : |x|_p \le 1\}$ and $\mathbb{Z}_p^* = \{x \in \mathbb{Q}_p : |x|_p = 1\}$ are the sets of all p-adic integers and p-adic units, respectively. Note that $E_p \subset \mathbb{Z}_p^* \subset \mathbb{Z}_p$.
Lemma 2.3 ([97]) Let $k \ge 2$ and $\alpha, \beta \in E_p$. Then there exists a unique $\gamma \in \mathbb{Z}_p$ such that
$$\sum_{j=0}^{k-1} \alpha^{k-j-1} \beta^{j} = k\gamma. \tag{2.2}$$
Moreover, if $p \ne 2$ then $\gamma \in E_p$.
Let us consider the following monomial equation in $\mathbb{Q}_p$:
$$x^k = a, \qquad k \in \mathbb{N},\ a \in \mathbb{Q}_p. \tag{2.3}$$
Let us first notice that Eq. (2.3) can be considered over $\mathbb{Z}_p^*$. Indeed, any nonzero p-adic number x has a unique representation of the form $x = x^*/|x|_p$, where $x^* \in \mathbb{Z}_p^*$. After substituting the forms $x = x^*/|x|_p$, $a = a^*/|a|_p$ into (2.3), we get
$$\frac{(x^*)^k}{|x|_p^k} = \frac{a^*}{|a|_p}.$$
This means that Eq. (2.3) with $a \in \mathbb{Q}_p$ has a solution in $\mathbb{Q}_p$ if and only if $|a|_p = p^{kl}$ for some $l \in \mathbb{Z}$ and the equation $(x^*)^k = a^*$ has a solution in $\mathbb{Z}_p^*$. Hence, we may always assume that $a \in \mathbb{Z}_p^*$ when we consider Eq. (2.3). We recall that an integer $a \in \mathbb{Z}$ is called a k-th power residue modulo p if the congruence $x^k \equiv a \pmod p$ has a solution $x \in \mathbb{Z}$. Let $a \in \mathbb{Z}_p^*$ have the canonical form $a = a_0 + a_1 p + a_2 p^2 + \dots$ We denote
$$\mathrm{Sol}_p(x^k - a) = \{\xi \in \mathbb{F}_p : \xi^k \equiv a \pmod p\} \quad\text{and}\quad \kappa_p = \mathrm{card}\big(\mathrm{Sol}_p(x^k - a)\big),$$
where $\mathbb{F}_p$ is the ring of integers modulo p and card(A) stands for the cardinality of a set A. We notice that $0 \le \kappa_p \le k$. We observe that the condition $\mathrm{Sol}_p(x^k - a) \ne \emptyset$ is equivalent to $a_0$ being a k-th power residue modulo p.
Theorem 2.4 ([102]) Let p = 2 and $a \in \mathbb{Z}_2^*$. Then the following statements are true: (i) if k is odd then Eq. (2.3) has a unique solution; (ii) if $k = 2^s(2m - 1)$, $s, m \in \mathbb{N}$, then Eq. (2.3) has a solution iff $a \equiv 1 \pmod{2^{s+2}}$. Moreover, if k is even then Eq. (2.3) either has no root or has two distinct roots.
Theorem 2.5 ([99, 102]) Let $p \ge 3$ and $k = mp^s$, where $(p, m) = 1$, $s \ge 0$. Assume that $a \in \mathbb{Z}_p^*$ and $\mathrm{Sol}_p(x^k - a) \ne \emptyset$. Then the following statements are equivalent: (i) Eq. (2.3) has a solution; (ii) $a \equiv a_0^{p^s} \pmod{p^{s+1}}$; (iii) for any $\xi \in \mathrm{Sol}_p(x^k - a)$, Eq. (2.3) has a unique solution in $B_1(\xi)$.
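Condition (ii) of Theorem 2.5 is easy to test numerically. The following is a minimal sketch, not part of the cited results: it assumes that a is an ordinary integer coprime to p (viewed as a p-adic unit), and it compares the criterion with a brute-force search modulo a fixed power of p, which by Hensel's lemma decides solvability in the small cases tried. The function names are illustrative only.

```python
def split_exponent(k, p):
    """Write k = m * p**s with gcd(m, p) = 1 and return (m, s)."""
    s = 0
    while k % p == 0:
        k //= p
        s += 1
    return k, s


def criterion_2_5(a, k, p):
    """Test solvability of x**k = a in Q_p via condition (ii) of Theorem 2.5.

    Assumes p is an odd prime and a is an integer coprime to p;
    a0 = a mod p is the first digit of the canonical form of a.
    """
    _, s = split_exponent(k, p)
    a0 = a % p
    # Sol_p(x^k - a) must be non-empty: a0 is a k-th power residue mod p.
    if not any(pow(xi, k, p) == a0 for xi in range(1, p)):
        return False
    return (a - pow(a0, p ** s)) % p ** (s + 1) == 0


def brute_force(a, k, p, extra=1):
    """Search for a unit root of x**k = a modulo p**(2s + 1 + extra).

    A root modulo p**(2s + 1) already lifts to Z_p by Hensel's lemma,
    so this finite search decides solvability for a given integer a.
    """
    _, s = split_exponent(k, p)
    modulus = p ** (2 * s + 1 + extra)
    target = a % modulus
    return any(pow(x, k, modulus) == target
               for x in range(1, modulus) if x % p)


def agree(k, p, bound):
    """True if criterion and brute force agree for all units 1 <= a < bound."""
    units = [a for a in range(1, bound) if a % p]
    return all(criterion_2_5(a, k, p) == brute_force(a, k, p) for a in units)
```

For instance, `criterion_2_5(10, 3, 3)` holds because 10 ≡ 1³ (mod 9), whereas `criterion_2_5(4, 3, 3)` fails, so under these assumptions x³ = 4 has no solution in Q₃.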
2.6 Dynamical Systems in Qp
In this subsection we recall some standard terminology of the theory of dynamical systems (see for example [10, 66]). Given $r, s > 0$ ($r < s$) and $a \in \mathbb{Q}_p$, denote
$$B_r(a) = \{x \in \mathbb{Q}_p : |x - a|_p < r\}, \qquad \overline{B}_r(a) = \{x \in \mathbb{Q}_p : |x - a|_p \le r\}, \tag{2.4}$$
$$B_{r,s}(a) = \{x \in \mathbb{Q}_p : r < |x - a|_p < s\}, \qquad S_r(a) = \{x \in \mathbb{Q}_p : |x - a|_p = r\}. \tag{2.5}$$
It is clear that $\overline{B}_r(a) = B_r(a) \cup S_r(a)$. A function $f : B_r(a) \to \mathbb{Q}_p$ is said to be analytic if it can be represented by
$$f(x) = \sum_{n=0}^{\infty} f_n (x - a)^n, \qquad f_n \in \mathbb{Q}_p,$$
which converges uniformly on the ball $B_r(a)$. Consider a dynamical system $(f, B)$ in $\mathbb{Q}_p$, where $f : x \in B \to f(x) \in B$ is an analytic function and $B = B_r(a)$ or $\mathbb{Q}_p$. Denote $x^{(n)} = f^n(x^{(0)})$, where $x^{(0)} \in B$ and $f^n(x) = \underbrace{f \circ \dots \circ f}_{n}(x)$. If $f(x^{(0)}) = x^{(0)}$ then $x^{(0)}$ is called a fixed point. A fixed point $x^{(0)}$ is called an attractor if there exists a neighborhood $U(x^{(0)}) \subset B$ of $x^{(0)}$ such that for all points $y \in U(x^{(0)})$ it holds that $\lim_{n \to \infty} y^{(n)} = x^{(0)}$, where $y^{(n)} = f^n(y)$.
If $x^{(0)}$ is an attractor then its basin of attraction is
$$A(x^{(0)}) = \{y \in \mathbb{Q}_p : y^{(n)} \to x^{(0)},\ n \to \infty\}.$$
A fixed point $x^{(0)}$ is called a repeller if there exists a neighborhood $U(x^{(0)})$ of $x^{(0)}$ such that $|f(x) - x^{(0)}|_p > |x - x^{(0)}|_p$ for $x \in U(x^{(0)})$, $x \ne x^{(0)}$. For a fixed point $x^{(0)}$ of a function $f(x)$, a ball $B_r(x^{(0)})$ (contained in B) is said to be a Siegel disc if each sphere $S_\rho(x^{(0)})$, $\rho < r$, is an invariant sphere of $f(x)$, i.e. if $x \in S_\rho(x^{(0)})$ then all iterated points $x^{(n)} \in S_\rho(x^{(0)})$ for all $n = 1, 2, \dots$. The union of all Siegel discs with center at $x^{(0)}$ is said to be a maximum Siegel disc and is denoted by $SI(x^{(0)})$.
Remark 2.7 In non-Archimedean geometry, a center of a disc is nothing but a point which belongs to the disc; therefore, in principle, different fixed points may have the same Siegel disc (see [7]).
Let $x^{(0)}$ be a fixed point of an analytic function $f(x)$. Set
$$\lambda = \frac{d}{dx} f(x^{(0)}).$$
The point $x^{(0)}$ is called attractive if $0 \le |\lambda|_p < 1$, indifferent if $|\lambda|_p = 1$, and repelling if $|\lambda|_p > 1$.
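The classification by the multiplier λ can be illustrated for rational multipliers, which lie in Q_p for every prime p. The sketch below is an illustration under that assumption; the helper names `padic_valuation`, `padic_norm` and `classify_fixed_point` are hypothetical and do not come from the cited literature.

```python
from fractions import Fraction


def padic_valuation(x, p):
    """Exponent gamma(x) in x = p**gamma * (p-adic unit), for a nonzero rational x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is +infinity")
    num, den, gamma = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        gamma += 1
    while den % p == 0:
        den //= p
        gamma -= 1
    return gamma


def padic_norm(x, p):
    """|x|_p = p**(-gamma(x)), with |0|_p = 0, as an exact Fraction."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    return Fraction(p) ** (-padic_valuation(x, p))


def classify_fixed_point(multiplier, p):
    """Classify a fixed point from lambda = f'(x0): attractive, indifferent or repelling."""
    norm = padic_norm(multiplier, p)
    if norm < 1:
        return "attractive"
    if norm == 1:
        return "indifferent"
    return "repelling"
```

As an example, f(x) = x² has the fixed points 0 and 1 with multipliers 0 and 2. The point 1 is attractive for p = 2 (since |2|₂ = 1/2) and indifferent for every odd p, which shows how the same rational map behaves differently over different fields Q_p.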
2.8 Non-Archimedean Measure
In this subsection, we provide a non-Archimedean analogue of Kolmogorov's extension theorem, which was first proved in [33]. That paper is not available in English; therefore, we provide its proof for the sake of completeness. We notice that all p-adic measures are considered to be finitely additive, since only discrete measures are countably additive [115]. Let $X \ne \emptyset$ and let K be a non-Archimedean field. Assume that $\mathcal{S}$ is a semiring of subsets of X and $\mu : \mathcal{S} \to K$ is a set function. The set function μ is called a non-Archimedean measure on $\mathcal{S}$ if $\mu(A_1 \cup A_2) = \mu(A_1) + \mu(A_2)$ for any $A_1, A_2 \in \mathcal{S}$ with $A_1 \cap A_2 = \emptyset$. If $X \in \mathcal{S}$ and $\mu(X) = 1$ then the measure μ on $\mathcal{S}$ is called a probability measure (see [54]). As in the real case, we first extend this measure from the semiring $\mathcal{S}$ to the minimal ring $\tilde{\mathcal{S}}$ containing $\mathcal{S}$. Recall that a non-Archimedean measure $\tilde\mu$ on $\tilde{\mathcal{S}}$ is called an extension of a non-Archimedean measure μ on $\mathcal{S}$ if $\mathcal{S} \subset \tilde{\mathcal{S}}$ and
$$\tilde\mu(A) = \mu(A), \qquad \forall A \in \mathcal{S}.$$
We notice that, by means of finite additivity, the statements on extensions of measures are standard (see, for example [16, Proposition 1.3.9]). For the sake of completeness, we provide such an extension in the p-adic setting.
Proposition 2.9 For any non-Archimedean measure μ on a semiring $\mathcal{S}$ there exists a unique extension $\tilde\mu$ on the minimal ring $\tilde{\mathcal{S}}$ containing $\mathcal{S}$.
Proof Since the operations ($\cup$, $\cap$, $\setminus$ and $\triangle$) on subsets of X do not depend on the existence of a topology on X, we infer that for any $A \in \tilde{\mathcal{S}}$ one can find finitely many sets $B_1, B_2, \dots, B_n \in \mathcal{S}$ such that
$$A = \bigcup_{k=1}^{n} B_k, \qquad B_i \cap B_j = \emptyset,\ i \ne j. \tag{2.6}$$
Then we put
$$\tilde\mu(A) = \sum_{k=1}^{n} \mu(B_k). \tag{2.7}$$
We show that (2.7) depends only on $A \in \tilde{\mathcal{S}}$, i.e. it does not depend on the representation (2.6). Assume that $A \in \tilde{\mathcal{S}}$ has the following two representations:
$$A = \bigcup_{k=1}^{n} B_k = \bigcup_{j=1}^{m} C_j, \qquad B_k, C_j \in \mathcal{S}.$$
Since $\mathcal{S}$ is a semiring we have $B_k \cap C_j \in \mathcal{S}$ for any $k \le n$ and $j \le m$. Then the additivity of μ implies
$$\sum_{k=1}^{n} \mu(B_k) = \sum_{k=1}^{n} \sum_{j=1}^{m} \mu(B_k \cap C_j) = \sum_{j=1}^{m} \sum_{k=1}^{n} \mu(C_j \cap B_k) = \sum_{j=1}^{m} \mu(C_j),$$
so both representations give the same value $\tilde\mu(A)$.
Let us show that the set function given by (2.7) is a measure on $\tilde{\mathcal{S}}$. Take any $A_1, A_2 \in \tilde{\mathcal{S}}$ such that $A_1 \cap A_2 = \emptyset$. Assume that
$$A_1 = \bigcup_{k=1}^{n} B_k^{(1)}, \qquad A_2 = \bigcup_{k=1}^{m} B_k^{(2)}.$$
We notice that $B_i^{(1)} \cap B_j^{(2)} = \emptyset$ for any pair $(i, j)$ with $i \le n$, $j \le m$. Then $A_1 \cup A_2$ has the following representation:
$$A_1 \cup A_2 = \bigcup_{k=1}^{n+m} C_k, \qquad C_k = B_k^{(1)},\ k \le n, \qquad C_{n+j} = B_j^{(2)},\ 1 \le j \le m.$$
Then
$$\tilde\mu(A_1 \cup A_2) = \sum_{k=1}^{n+m} \mu(C_k) = \sum_{k=1}^{n} \mu(C_k) + \sum_{k=n+1}^{n+m} \mu(C_k) = \tilde\mu(A_1) + \tilde\mu(A_2).$$
Hence, we have shown that the function given by (2.7) is a measure on $\tilde{\mathcal{S}}$. Now, to prove its uniqueness, we suppose that ν is another extension of μ. Then, for any $A \in \tilde{\mathcal{S}}$, using its representation (2.6) and keeping in mind that $\nu(B_k) = \mu(B_k)$ for any $k \le n$, one gets
$$\nu(A) = \sum_{k=1}^{n} \nu(B_k) = \sum_{k=1}^{n} \mu(B_k) = \tilde\mu(A).$$
So, the arbitrariness of $A \in \tilde{\mathcal{S}}$ implies $\nu = \tilde\mu$.
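Corollary 2.10 An extension of any probability measure is a probability measure as well.
Let $(X, \mathcal{B})$ be a measurable space, where $\mathcal{B}$ is an algebra of subsets of X. Denote
$$(X^\infty, \mathcal{B}^\infty) = \prod_{j=1}^{\infty} (X, \mathcal{B}),$$
where $\mathcal{B}^\infty$ is the minimal algebra containing the cylindric subsets of $X^\infty$, i.e. the sets
$$I_n(B) = \{x \in X^\infty : (x_1, x_2, \dots, x_n) \in B\}, \qquad B \in \bigotimes_{j=1}^{n} \mathcal{B}.$$
We say that a sequence of non-Archimedean probability measures $\{P_n\}_{n=1}^{\infty}$ on $\{(X^n, \bigotimes_{j=1}^{n} \mathcal{B})\}_{n=1}^{\infty}$ is compatible if
$$P_{n+1}(B \times X) = P_n(B), \qquad \forall B \in \bigotimes_{j=1}^{n} \mathcal{B},\ \forall n \ge 1.$$
Theorem 2.11 (Non-Archimedean Analogue of Kolmogorov's Extension Theorem) Let $\{P_n\}_{n=1}^{\infty}$ be a sequence of non-Archimedean probability measures on $\{(X^n, \bigotimes_{j=1}^{n} \mathcal{B})\}_{n=1}^{\infty}$. If it is compatible then there exists a unique non-Archimedean probability measure P defined on $(X^\infty, \mathcal{B}^\infty)$ such that
$$P(I_n(B)) = P_n(B), \qquad \forall B \in \bigotimes_{j=1}^{n} \mathcal{B},\ \forall n \ge 1.$$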
Proof We first define a non-Archimedean probability measure P on cylindrical sets. Let $B_n \in \bigotimes_{j=1}^{n} \mathcal{B}$ and let $I_n(B_n)$ be a cylindric set. We put $P(I_n(B_n)) = P_n(B_n)$. Let us show that this measure is well defined, i.e. the value of $P(I_n(B))$ does not depend on the representation of $I_n(B)$. Assume that a cylindric set is given by two representations, i.e. $I_n(B_n) = I_{n+k}(B_{n+k})$. Hence, if $(x_1, x_2, \dots, x_{n+k}) \in X^{n+k}$ then
$$(x_1, x_2, \dots, x_n) \in B_n \iff (x_1, x_2, \dots, x_{n+k}) \in B_{n+k}. \tag{2.8}$$
From (2.8) together with the compatibility, one gets
$$P_n(B_n) = P_{n+1}(\{(x_1, \dots, x_{n+1}) \mid (x_1, \dots, x_n) \in B_n\}) = \dots = P_{n+k}(\{(x_1, \dots, x_{n+k}) \mid (x_1, \dots, x_n) \in B_n\}) = P_{n+k}(B_{n+k}).$$
Let us suppose that $B^{(1)}, B^{(2)}, \dots, B^{(k)}$ is a collection of pairwise disjoint cylindrical sets. One can find a collection of pairwise disjoint sets $\{B_n^{(j)}\}_{j=1}^{k}$ in $\bigotimes_{j=1}^{n} \mathcal{B}$ with $B^{(j)} = I_n(B_n^{(j)})$ for any $j \le k$. Then
$$P\Big(\bigcup_{j=1}^{k} B^{(j)}\Big) = P\Big(\bigcup_{j=1}^{k} I_n(B_n^{(j)})\Big) = P_n\Big(\bigcup_{j=1}^{k} B_n^{(j)}\Big) = \sum_{j=1}^{k} P_n\big(B_n^{(j)}\big) = \sum_{j=1}^{k} P\big(B^{(j)}\big).$$
This implies the additivity of P, and $P(X^\infty) = 1$. Thus, we have constructed a probability measure on the semiring of cylindric sets. Thanks to Corollary 2.10 it can be extended uniquely to a probability measure on the minimal algebra containing the semiring of cylindrical sets.
Remark 2.12 We notice that in [62, 63] certain results on the existence of probability on a product of non-Archimedean probabilistic spaces have been proved. One of the important conditions (which was already introduced in the first Monna–Springer theory of non-Archimedean integration [76]) is boundedness, namely a
p-adic probability measure μ is called bounded if $\sup\{|\mu(A)|_p : A \in \mathcal{B}\} < \infty$. We draw attention to an important special case in which the boundedness condition by itself provides a fruitful integration theory (see for example [58]). Note that, in general, a p-adic probability measure need not be bounded [54, 62, 68]. For more detailed information about p-adic measures we refer to [66, 115].
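The compatibility condition of Theorem 2.11 and the notion of boundedness from Remark 2.12 can be made concrete with a small computation. The sketch below is an illustration only: it assumes a finite set X = {0, ..., r−1} and product measures built from rational weights summing to 1 (rationals lie in Q_p for every p), and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def padic_norm(x, p):
    """|x|_p of a rational x as an exact Fraction, with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, gamma = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        gamma += 1
    while den % p == 0:
        den //= p
        gamma -= 1
    return Fraction(p) ** (-gamma)


def product_measure(weights, n):
    """P_n on X**n: the value on a word (x_1, ..., x_n) is prod weights[x_i]."""
    measure = {}
    for word in product(range(len(weights)), repeat=n):
        value = Fraction(1)
        for letter in word:
            value *= weights[letter]
        measure[word] = value
    return measure


def is_compatible(weights, n):
    """Check P_{n+1}(B x X) = P_n(B) for singletons B; additivity gives the rest."""
    small = product_measure(weights, n)
    large = product_measure(weights, n + 1)
    return all(
        sum(large[word + (letter,)] for letter in range(len(weights))) == value
        for word, value in small.items()
    )


def max_norm(weights, n, p):
    """Largest p-adic norm of P_n over single words of length n."""
    return max(padic_norm(v, p) for v in product_measure(weights, n).values())
```

With the weights (1/3, 2/3) the family is compatible, and `max_norm` equals 3ⁿ for p = 3 but stays equal to 1 for p = 5. So the same rational-valued measure is unbounded 3-adically, while all its values are 5-adic integers. Signed weights such as (3, −2) are also admissible, since only the total mass 1 is required.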
2.13 Semi-Infinite Cayley Tree and Its Coordinate Structure
Let $\Gamma_+^k = (V, L)$ be a semi-infinite Cayley tree of order $k \ge 1$ with root $x^{(0)}$ (each vertex of which has exactly $k + 1$ edges, except for the root $x^{(0)}$, which has k edges). Here V is the set of vertices and L is the set of edges. Vertices x and y are called nearest neighbors, denoted by $l = \langle x, y \rangle$, if there exists an edge connecting them. A collection of pairs $\langle x, x_1 \rangle, \dots, \langle x_{d-1}, y \rangle$ is called a path from the vertex x to the vertex y. The distance $d(x, y)$ between the vertices x and y on the Cayley tree is the length of the shortest path from x to y.
Recall the coordinate structure in $\Gamma_+^k$: every vertex x (except for $x^{(0)}$) of $\Gamma_+^k$ has coordinates $(i_1, \dots, i_n)$, where $i_m \in \{1, \dots, k\}$, $1 \le m \le n$, and for the vertex $x^{(0)}$ we put $(0)$. Namely, the symbol $(0)$ constitutes level 0, and the sites $(i_1, \dots, i_n)$ form level n (i.e. $d(x^{(0)}, x) = n$) of the lattice. For $x \in \Gamma_+^k$, $x = (i_1, \dots, i_n)$, put
$$S(x) = \{(x, i) : 1 \le i \le k\}, \tag{2.9}$$
where $(x, i)$ is short for $(i_1, \dots, i_n, i)$. This set is called the set of direct successors of x.
Let us define on $\Gamma_+^k$ a binary operation $\circ : \Gamma_+^k \times \Gamma_+^k \to \Gamma_+^k$ as follows: for any two elements $x = (i_1, \dots, i_n)$ and $y = (j_1, \dots, j_m)$ put
$$x \circ y = (i_1, \dots, i_n) \circ (j_1, \dots, j_m) = (i_1, \dots, i_n, j_1, \dots, j_m) \tag{2.10}$$
and
$$x \circ x^{(0)} = x^{(0)} \circ x = (i_1, \dots, i_n) \circ (0) = (i_1, \dots, i_n). \tag{2.11}$$
By means of the defined operation, $\Gamma_+^k$ becomes a noncommutative semigroup with a unit. Using this semigroup structure one defines translations $\tau_g : \Gamma_+^k \to \Gamma_+^k$, $g \in \Gamma_+^k$, by
$$\tau_g(x) = g \circ x. \tag{2.12}$$
It is clear that $\tau_{(0)} = \mathrm{id}$.
Let $H \subset \Gamma_+^k$ be a sub-semigroup of $\Gamma_+^k$ and let $h : \Gamma_+^k \to Y$ be a Y-valued function defined on $\Gamma_+^k$. We say that h is H-periodic if $h(\tau_g(x)) = h(x)$ for all $g \in H$ and $x \in \Gamma_+^k$. Any $\Gamma_+^k$-periodic function is called translation-invariant. For each $m \ge 2$ we put
$$H_m = \{x \in \Gamma_+^k : d(x, x^{(0)}) \equiv 0 \pmod m\}. \tag{2.13}$$
One can check that $H_m$ is a sub-semigroup. Let us set
$$W_n = \{x \in V : d(x, x^{(0)}) = n\}, \qquad V_n = \bigcup_{m=0}^{n} W_m, \qquad L_n = \{\langle x, y \rangle \in L : x, y \in V_n\}.$$
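The coordinate structure above translates directly into code. In the following minimal sketch, vertices are modelled as tuples of coordinates and the root (0) as the empty tuple; this encoding and the function names are choices made for illustration, not notation from the text.

```python
from itertools import product

ROOT = ()  # the root x^(0), written (0) in the text, modelled by the empty tuple


def successors(x, k):
    """Direct successors S(x) = {(x, i) : 1 <= i <= k}, see (2.9)."""
    return [x + (i,) for i in range(1, k + 1)]


def compose(x, y):
    """The operation of (2.10)-(2.11): concatenate coordinates; ROOT is the unit."""
    return x + y


def translate(g, x):
    """Translation tau_g(x) = g o x, see (2.12)."""
    return compose(g, x)


def level(n, k):
    """W_n: all vertices at distance n from the root."""
    return list(product(range(1, k + 1), repeat=n))


def ball(n, k):
    """V_n: the union of W_0, ..., W_n."""
    return [x for m in range(n + 1) for x in level(m, k)]


def edges(n, k):
    """L_n: nearest-neighbour pairs <x, y> with both ends in V_n."""
    return [(x, y) for x in ball(n - 1, k) for y in successors(x, k)]


def in_H(x, m):
    """Membership in H_m, see (2.13): the level of x is divisible by m."""
    return len(x) % m == 0
```

With this encoding the distance d(x, x⁽⁰⁾) is simply `len(x)`, so the sub-semigroup property of H_m reduces to the fact that lengths add under concatenation. For k = 2, `ball(2, 2)` has 7 vertices and `edges(2, 2)` has 6 edges.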
3 Construction of Generalized p-adic Gibbs Measure
Let $\mathbb{Q}_p$ be the field of p-adic numbers and let $\Phi$ be a finite set. Assume that $\Gamma_+^k = (V, L)$ is a semi-infinite Cayley tree of order $k \ge 1$. We stress that, in principle, one may consider the whole Cayley tree, but for the sake of convenience we restrict ourselves to the semi-infinite Cayley tree. A configuration σ on V is then defined as a function $x \in V \to \sigma(x) \in \Phi$; in a similar fashion one defines configurations on $V_n$ and $W_n$, respectively. The set of all configurations on V (resp. $V_n$, $W_n$) coincides with $\Omega = \Phi^V$ (resp. $\Omega_{V_n} = \Phi^{V_n}$, $\Omega_{W_n} = \Phi^{W_n}$). Using this, for given configurations $\sigma \in \Omega_{V_{n-1}}$ and $\omega \in \Omega_{W_n}$ we define their concatenation by
$$(\sigma \vee \omega)(x) = \begin{cases} \sigma(x), & \text{if } x \in V_{n-1},\\ \omega(x), & \text{if } x \in W_n. \end{cases}$$
It is clear that $\sigma \vee \omega \in \Omega_{V_n}$.
Remark 3.1 Let H be a sub-semigroup of $\Gamma_+^k$. A function $h_x$ (for example, a configuration $\sigma(x)$) of $x \in \Gamma_+^k$ is called H-periodic if $h_{y \circ x} = h_x$ (resp. $\sigma(y \circ x) = \sigma(x)$) for any $x \in \Gamma_+^k$ and $y \in H$. A $\Gamma_+^k$-periodic function is called translation-invariant.
We consider the p-adic Potts model on a semi-infinite Cayley tree, where the spin takes values in the set $\Phi := \{1, 2, \dots, q\}$ and is assigned to the vertices of the tree. The Hamiltonian of the q-state p-adic Potts model on $V_n$ is
$$H(\sigma) = J \sum_{\langle x, y \rangle \in L_n} \delta_{\sigma(x)\sigma(y)}, \qquad \forall \sigma \in \Omega_{V_n}, \tag{3.1}$$
where $J \in \mathbb{Z}$ is a coupling constant, $\langle x, y \rangle$ stands for nearest neighbor vertices and $\delta_{ij}$ is the Kronecker symbol:
$$\delta_{ij} = \begin{cases} 0, & \text{if } i \ne j,\\ 1, & \text{if } i = j. \end{cases}$$
Remark 3.2 It should be noted that the original Hamiltonian of the Potts model (in the real setting) is the same as in (3.1), and does not initially contain p-adic numbers. Moreover, there the coupling constant J belongs to $\mathbb{R}$. In the current situation, in the construction of p-adic Gibbs measures, we cannot take J to be an arbitrary p-adic number, since the p-adic exponential is not defined everywhere in $\mathbb{Q}_p$. Therefore, we restrict ourselves to $J \in \mathbb{Z}$.
Let us construct generalized p-adic Gibbs measures for the q-state Potts model on $\Gamma_+^k$. Assume that $h : V \setminus \{x^{(0)}\} \to \mathbb{Q}_p^q$ is a function, i.e. $h_x = (h_{1,x}, h_{2,x}, \dots, h_{q,x})$, where $h_{i,x} \in \mathbb{Q}_p \setminus \{0\}$ for every $i \in \{1, 2, \dots, q\}$ and for any $x \in V \setminus \{x^{(0)}\}$. Given $\rho \in \mathbb{Q}_p \setminus \{-1, 0, 1\}$, let us consider a p-adic probability measure $\mu_{h,\rho}^{(n)}$ on $\Omega_{V_n}$ defined by
$$\mu_{h,\rho}^{(n)}(\sigma) = \frac{1}{Z_{n,\rho}^{(h)}}\, \rho^{H(\sigma)} \prod_{x \in W_n} h_{\sigma(x),x}, \qquad \forall \sigma \in \Omega_{V_n}. \tag{3.2}$$
Here $Z_{n,\rho}^{(h)}$ is the corresponding normalizing factor, or partition function, given by
$$Z_{n,\rho}^{(h)} = \sum_{\sigma \in \Omega_{V_n}} \rho^{H(\sigma)} \prod_{x \in W_n} h_{\sigma(x),x}. \tag{3.3}$$
We notice that, in the real setting, the measures given by (3.2) are called conditional Gibbs measures. In that case, the numbers ρ and $\{h_{i,x}\}$ are strictly positive (since they should define a probability). In principle, they can be taken as $\rho = \exp(J)$ ($J \in \mathbb{R}$) and $h_{i,x} = \exp(\tilde h_{i,x})$. However, in the p-adic setting, a probability measure need not be positive (since $\mathbb{Q}_p$ is not ordered). Therefore, ρ and $\{h_{i,x}\}$ could be arbitrary numbers from $\mathbb{Q}_p$. Furthermore, if $\rho, h_{k,x} \in E_p$, then the associated p-adic measures are called p-adic Gibbs measures, otherwise generalized p-adic Gibbs measures.
Remark 3.3 Note that, in general, $Z_{n,\rho}^{(h)}$ could be zero for some h. In this situation, formally, we may assume that $\mu_{h,\rho}^{(n)}(\sigma) = \infty$ for all $\sigma \in \Omega_{V_n}$. However, measures of this kind are not of interest. Hence, when this occurs we say that for h there is no measure.
We are now interested in the construction of an infinite-volume distribution with given finite-dimensional distributions in a p-adic setting. More exactly, we
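want to define a p-adic probability measure $\mu_{h,\rho}$ on $\Omega$ which is compatible with the measures $\mu_{h,\rho}^{(n)}$ defined above, i.e.
$$\mu_{h,\rho}\big(\{\omega \in \Omega : \omega|_{V_n} \equiv \sigma\}\big) = \mu_{h,\rho}^{(n)}(\sigma), \qquad \forall \sigma \in \Omega_{V_n},\ \forall n \in \mathbb{N}. \tag{3.4}$$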
Remark 3.4 It is well known [29] that, in the real case, Markov chains on trees are particular cases of Gibbs measures corresponding to a Hamiltonian with nearest-neighbor interactions. In [118, 119] p-adic Gibbs measures corresponding to Hamiltonians with nearest-neighbor interactions have been characterized in terms of Markov random fields over countable graphs. Recently, in [71], a boundary law argument for studying p-adic Markov chains on general trees has been developed. A notion of ultrametric Markovianity was also considered in [59]. Such Markovianity, which describes the independence of contributions to a random field from different ultrametric balls, has been introduced, and it has been shown that Gaussian random fields on general ultrametric spaces (which are related to hierarchical trees), defined as solutions of a pseudodifferential stochastic equation (see also [41, 46]), satisfy it. In addition, the covariation of the defined random field was computed with the help of wavelet analysis on ultrametric spaces (see also [47]). Some applications of these results to replica matrices related to general ultrametric spaces have been investigated in [60]. In general, the existence of such a measure μ is not known a priori, since there is not much information on topological properties, such as compactness, of the set of all p-adic measures, even those defined on compact spaces. (In the real case, when the state space is compact, the existence follows from the compactness of the set of all probability measures, i.e. Prohorov's Theorem. When the state space is non-compact, there is Dobrushin's Theorem [22, 23], which gives a sufficient condition for the existence of the Gibbs measure for a large class of Hamiltonians.) We point out that certain properties of the set of p-adic measures have been studied in [40, 44, 45], but those properties are not enough to prove the existence of the limiting measure.
Therefore, at the moment, we can only use the p-adic Kolmogorov extension Theorem 2.11. As we know, to employ the p-adic Kolmogorov extension theorem, the measures $\{\mu_{h,\rho}^{(n)}\}$ should satisfy the compatibility condition, i.e.
$$\sum_{\omega \in \Omega_{W_n}} \mu_{h,\rho}^{(n)}(\sigma \vee \omega) = \mu_{h,\rho}^{(n-1)}(\sigma), \qquad \forall \sigma \in \Omega_{V_{n-1}}. \tag{3.5}$$
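For small trees, the measures (3.2), the partition function (3.3) and the compatibility condition (3.5) can be checked by direct enumeration. The sketch below makes several simplifying assumptions that are not in the text: ρ and the entries of h are rational (so all identities verified in Q also hold in Q_p), h is the same vector at every vertex, and n ≥ 2 in the compatibility check, because h is not defined at the root. All names are illustrative.

```python
from fractions import Fraction
from itertools import product


def levels(n, k):
    """W_0, ..., W_n with vertices coded as coordinate tuples (root = ())."""
    return [list(product(range(1, k + 1), repeat=m)) for m in range(n + 1)]


def measure(n, k, q, rho, J, h):
    """The measure (3.2) on Omega_{V_n} as a dict {configuration: value}.

    A configuration is a tuple of spins in {1, ..., q}, listed level by level.
    h = (h_1, ..., h_q) is assumed to be the same at every vertex.
    """
    W = levels(n, k)
    vertices = [x for Wm in W for x in Wm]
    index = {x: i for i, x in enumerate(vertices)}
    # Each non-root vertex is joined to its parent, obtained by dropping the last coordinate.
    edges = [(x[:-1], x) for x in vertices if x]
    weights = {}
    for sigma in product(range(1, q + 1), repeat=len(vertices)):
        energy = J * sum(sigma[index[x]] == sigma[index[y]] for x, y in edges)
        w = Fraction(rho) ** energy
        for x in W[n]:
            w *= h[sigma[index[x]] - 1]
        weights[sigma] = w
    Z = sum(weights.values())  # partition function (3.3)
    if Z == 0:
        raise ZeroDivisionError("Z vanishes: no measure for this h (Remark 3.3)")
    return {sigma: w / Z for sigma, w in weights.items()}


def is_compatible(n, k, q, rho, J, h):
    """Check (3.5): summing mu^(n) over the spins on W_n gives mu^(n-1)."""
    fine = measure(n, k, q, rho, J, h)
    coarse = measure(n - 1, k, q, rho, J, h)
    size = len(next(iter(coarse)))  # |V_{n-1}|; these spins come first
    marginal = {}
    for sigma, value in fine.items():
        marginal[sigma[:size]] = marginal.get(sigma[:size], 0) + value
    return marginal == coarse
```

For k = 2, q = 3, ρ = 2, J = 1 the constant choice h = (1, 1, 1) passes the check for n = 2, whereas h = (2, 1, 1) does not, so an arbitrary h need not produce a compatible family. The choice k = 1, q = 2, h = (1, −1) gives Z = 0 on V_1, which is the degenerate situation of Remark 3.3.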
This condition, according to the theorem, implies the existence of a unique p-adic measure $\mu_{h,\rho}$ defined on $\Omega$ with the required property (3.4). Such a measure $\mu_{h,\rho}$ is said to be a generalized p-adic Gibbs measure corresponding to the model. If $\rho^J \in E_p$ and $h_x \in E_p^q$ for any $x \in V \setminus \{x^{(0)}\}$, then the limiting measure $\mu_{h,\rho}$ is called a p-adic Gibbs measure. For a given Hamiltonian H, we denote by $GG_\rho(H)$ the set of all generalized p-adic Gibbs measures associated with functions $h = \{h_x, x \in V\}$. It is said that a phase
F. Mukhamedov and O. Khakimov
transition occurs if there exist at least two distinct generalized p-adic quasi Gibbs measures $\mu,\nu\in GG_\rho(H)$ such that μ is bounded and ν is unbounded. Moreover, if there is a sequence of sets $\{A_n\}$ such that $A_n\in\Omega_{V_n}$ with $|\mu(A_n)|_p\to 0$ and $|\nu(A_n)|_p\to\infty$ as $n\to\infty$, then we say that a strong phase transition occurs. We say that a quasi phase transition occurs if there are two different functions s and h defined on $\mathbb{N}$ such that the corresponding measures $\mu_{s,\rho}$, $\mu_{h,\rho}$ exist and are either both bounded or both unbounded.
Remark 3.5 We would like to point out that if one considers the usual q-state Potts model (i.e. in the real setting) on a multidimensional lattice, then at low temperature a phase transition occurs [110, 111], i.e. there exist q different Gibbs measures $\mu_i$ ($i=1,\dots,q$) such that $\mu_i(\sigma(0)=i)>1/2$ and $\mu_j(\sigma(0)=i)<1/2$ for $j\neq i$. This implies that the measures $\mu_i$ are mutually singular. The strong phase transition (see the definition above), in the p-adic setting, has a similar meaning to singularity, i.e. the p-adic measures μ and ν are "singular" (in the sense given above). Here we have to stress that absolute continuity and singularity of p-adic measures cannot be defined directly in the same manner as in the real case. Absolute continuity of p-adic measures has been studied in [43]. The singularity we are proposing is consistent with the absolute continuity introduced in [43].
Remark 3.6 Note that in [106] we considered the following sequence of p-adic measures defined by
$$\mu^{(n)}_{h}(\sigma)=\frac{1}{\tilde Z^{(h)}_{n}}\exp_p\{H_n(\sigma)\}\prod_{x\in W_n}h_{\sigma(x),x}, \tag{3.6}$$
where, as usual, $\tilde Z^{(h)}_{n}$ is the corresponding normalizing factor. A limiting p-adic measure generated by (3.6) was called a p-adic Gibbs measure. Such measures and phase transitions, for the Ising and Potts models on the Cayley tree, have been studied in [1, 34, 50, 51, 106, 107]. When the state space is countable, the corresponding p-adic Gibbs measures have been investigated in [64, 65, 79, 81].
The following statement describes conditions on h guaranteeing compatibility of the sequence of probability distributions $\{\mu^{(n)}_{h}\}_{n\ge 1}$.
Theorem 3.7 Let H be a Hamiltonian of the q-state p-adic Potts model on a semi-infinite Cayley tree. For a given vector-valued function $h: V\setminus\{x^{(0)}\}\to\mathbb{Q}_p^q$, the sequence of p-adic probability measures $\{\mu^{(n)}_{h,\rho}\}_{n\ge 1}$ given by (4.8), (3.3) is compatible iff for every $x\in V\setminus\{x^{(0)}\}$ the following equalities hold:
$$\frac{h_{i,x}}{h_{q,x}}=\prod_{y\in S(x)}\frac{\sum_{u=1}^{q}\rho^{J\delta_{iu}}h_{u,y}}{\sum_{u=1}^{q}\rho^{J\delta_{qu}}h_{u,y}},\qquad i\in\{1,2,\dots,q-1\}. \tag{3.7}$$
Chaos in p-adic Statistical Lattice Models: Potts Model
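Proof Necessity. Suppose that (3.5) holds. Substituting (4.8) in (3.5), for any $n\ge 2$ and every $\sigma\in\Omega_{V_{n-1}}$ we obtain
$$\begin{aligned}
\frac{1}{Z^{(h)}_{n-1,\rho}}\,\rho^{H(\sigma)}\prod_{x\in W_{n-1}}h_{\sigma(x),x}
&=\frac{1}{Z^{(h)}_{n,\rho}}\sum_{\omega\in\Omega_{W_n}}\rho^{H(\sigma\vee\omega)}\prod_{x\in W_{n-1}}\prod_{y\in S(x)}h_{\omega(y),y}\\
&=\frac{\rho^{H(\sigma)}}{Z^{(h)}_{n,\rho}}\sum_{\omega\in\Omega_{W_n}}\prod_{x\in W_{n-1}}\prod_{y\in S(x)}\rho^{J\delta_{\sigma(x)\omega(y)}}h_{\omega(y),y}\\
&=\frac{\rho^{H(\sigma)}}{Z^{(h)}_{n,\rho}}\sum_{\omega\in\Omega_{W_n}}\prod_{i=1}^{q}\prod_{\substack{x\in W_{n-1}\\ \sigma(x)=i}}\prod_{y\in S(x)}\rho^{J\delta_{i\omega(y)}}h_{\omega(y),y}\\
&=\frac{\rho^{H(\sigma)}}{Z^{(h)}_{n,\rho}}\prod_{i=1}^{q}\prod_{\substack{x\in W_{n-1}\\ \sigma(x)=i}}\prod_{y\in S(x)}\sum_{u=1}^{q}\rho^{J\delta_{iu}}h_{u,y}.
\end{aligned}\tag{3.8}$$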
Now, we fix $x\in W_{n-1}$ and take a collection of configurations $\{\sigma^{(i)}\}_{i=1}^{q}$ on $V_{n-1}$ such that for any $x'\in W_{n-1}$ it holds
$$\sigma^{(i)}(x')=\begin{cases} i, & x'=x;\\ \sigma^{(1)}(x'), & x'\neq x;\end{cases}\qquad \forall i\in\{1,2,\dots,q\}.$$
We rewrite (3.8) for $\sigma^{(i)}$, $i=\overline{1,q}$, and after dividing all equalities by the one for $\sigma^{(q)}$ we get
$$\frac{h_{i,x}}{h_{q,x}}=\prod_{y\in S(x)}\frac{\sum_{u=1}^{q}\rho^{J\delta_{iu}}h_{u,y}}{\sum_{u=1}^{q}\rho^{J\delta_{qu}}h_{u,y}},\qquad i\in\{1,2,\dots,q-1\}. \tag{3.9}$$
Since $n\ge 2$ and $x\in W_{n-1}$ are arbitrary, we conclude that (3.7) holds.
Sufficiency. Now we assume that (3.7) holds. Then there exists a function $a: V\setminus\{x^{(0)}\}\to\mathbb{Q}_p\setminus\{0\}$ such that for any $x\in V\setminus\{x^{(0)}\}$ the following equalities hold:
$$a(x)h_{i,x}=\prod_{y\in S(x)}\sum_{u=1}^{q}\rho^{J\delta_{iu}}h_{u,y},\qquad i=1,2,\dots,q-1.$$
Hence, for any $n\ge 2$ and every $\sigma\in\Omega_{V_{n-1}}$ we get
$$\prod_{i=1}^{q}\prod_{\substack{x\in W_{n-1}\\ \sigma(x)=i}}a(x)h_{i,x}=\prod_{i=1}^{q}\prod_{\substack{x\in W_{n-1}\\ \sigma(x)=i}}\prod_{y\in S(x)}\sum_{u=1}^{q}\rho^{J\delta_{iu}}h_{u,y}.$$
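Multiplying both sides of the last equality by $\rho^{H(\sigma)}$ and denoting $A_{n-1}=\prod_{x\in W_{n-1}}a(x)$, one has
$$\begin{aligned}
A_{n-1}\,\rho^{H(\sigma)}\prod_{x\in W_{n-1}}h_{\sigma(x),x}
&=\rho^{H(\sigma)}\prod_{i=1}^{q}\prod_{\substack{x\in W_{n-1}\\ \sigma(x)=i}}\prod_{y\in S(x)}\sum_{u=1}^{q}\rho^{J\delta_{iu}}h_{u,y}\\
&=\rho^{H(\sigma)}\sum_{\omega\in\Omega_{W_n}}\prod_{x\in W_{n-1}}\prod_{y\in S(x)}\rho^{J\delta_{\sigma(x)\omega(y)}}h_{\omega(y),y}\\
&=\sum_{\omega\in\Omega_{W_n}}\rho^{H(\sigma\vee\omega)}\prod_{y\in W_n}h_{\omega(y),y}.
\end{aligned}$$
Consequently, we obtain
$$A_{n-1}Z^{(h)}_{n-1,\rho}\,\mu^{(n-1)}_{h,\rho}(\sigma)=Z^{(h)}_{n,\rho}\sum_{\omega\in\Omega_{W_n}}\mu^{(n)}_{h,\rho}(\sigma\vee\omega). \tag{3.10}$$
Keeping in mind that $\{\mu^{(n)}_{h,\rho}\}_{n\ge 1}$ is a sequence of probability measures, and summing (3.10) over all $\sigma\in\Omega_{V_{n-1}}$, one finds
$$A_{n-1}Z^{(h)}_{n-1,\rho}=Z^{(h)}_{n,\rho}. \tag{3.11}$$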
The last equality together with (3.10) implies (3.5).
Remark 3.8 Thanks to Theorem 3.7, one can check that if we denote $\tilde h_{i,x}=h_{i,x}/h_{q,x}$, $x\in V\setminus\{x^{(0)}\}$, for any $i\in\{1,2,\dots,q\}$, then $\{\mu^{(n)}_{\tilde h,\rho}\}_{n\ge 1}$ is compatible if and only if $\{\mu^{(n)}_{h,\rho}\}_{n\ge 1}$ is compatible. Moreover,
$$\mu^{(n)}_{\tilde h,\rho}(\sigma)=\mu^{(n)}_{h,\rho}(\sigma),\qquad \forall n\in\mathbb{N},\ \forall\sigma\in\Omega_{V_n}.$$
We consider the following function $F=(F_1,F_2,\dots,F_q)$ on $\mathbb{Q}_p^q$ given by
$$F_i(h)=\left(\frac{(\rho^J-1)h_i+\sum_{j=1}^{q-1}h_j+1}{\rho^J+\sum_{j=1}^{q-1}h_j}\right)^{k},\qquad h=(h_1,h_2,\dots,h_{q-1},1)\in\mathbb{Q}_p^q. \tag{3.12}$$
Due to Remark 3.1, Theorem 3.7 and Remark 3.8, an m-periodic point of the function F defines an $H_m$-periodic generalized p-adic Gibbs measure for the q-state Potts model, and vice versa. We notice that a fixed point of F corresponds to a translation-invariant generalized p-adic Gibbs measure.
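As a quick sanity check of (3.12), the map F can be evaluated with exact rational arithmetic, since the rationals embed in $\mathbb{Q}_p$. The sketch below is only illustrative: the function name `potts_map`, the variable `theta` (standing for $\rho^J$) and the sample parameters are assumptions made for this example, not notation from the text. It checks that $(1,\dots,1)$ is fixed by F for one choice of parameters.

```python
from fractions import Fraction


def potts_map(h, theta, k):
    """Evaluate F from (3.12) on h = (h_1, ..., h_{q-1}).

    The q-th coordinate is fixed to 1, and theta stands for rho^J
    (a hypothetical name). A ZeroDivisionError is raised when the
    denominator theta + sum(h) vanishes.
    """
    total = sum(h)
    denom = theta + total
    return tuple((((theta - 1) * hi + total + 1) / denom) ** k for hi in h)


# Sample parameters (assumptions for illustration): q = 4, k = 3, rho^J = 5/7.
theta = Fraction(5, 7)
h0 = (Fraction(1),) * 3
is_fixed = potts_map(h0, theta, 3) == h0
print(is_fixed)
```

A point h is m-periodic when applying `potts_map` m times returns h, so the same helper can be used to test candidate periodic points.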
One can see that $h_0=(1,1,\dots,1)$ is a fixed point of F. So, if we denote by $Fix(F)$ the set of all fixed points of F, then $Fix(F)\neq\emptyset$. The following result describes the set of all p-adic Gibbs measures for the q-state Potts model when $|q|_p=1$. Results of this kind do not occur in the real case.
Theorem 3.9 Let $k\ge 1$, $|q|_p=1$ and $\rho\in\mathcal{E}_p$. Then there is a unique p-adic Gibbs measure for the q-state Potts model on $\Gamma^k_+$. Moreover, the unique p-adic Gibbs measure is translation-invariant.
Proof Since $h=\{(1,1,\dots,1)\in\mathbb{Q}_p^q : x\in V\setminus\{x^{(0)}\}\}$ is a solution of (3.7), we infer that there exists at least one p-adic Gibbs measure. Suppose that s is a solution of (3.7). We show that $h=s$. Thanks to Remark 3.8 we may assume that $s=\{(s_{1,x},s_{2,x},\dots,s_{q-1,x},1)\in\mathbb{Q}_p^q : x\in V\setminus\{x^{(0)}\}\}$. For any $x\in V\setminus\{x^{(0)}\}$ we have
$$s_{i,x}=\prod_{y\in S(x)}\frac{(\rho^J-1)s_{i,y}+\sum_{j=1}^{q-1}s_{j,y}+1}{\rho^J+\sum_{j=1}^{q-1}s_{j,y}},\qquad i\in\{1,2,\dots,q-1\}. \tag{3.13}$$
First we show that
$$|s_{i,x}-1|_p\le\frac{1}{p}\max\{|s_{i,y}-1|_p : y\in S(x)\},\qquad i\in\{1,2,\dots,q-1\}. \tag{3.14}$$
Since $\rho^J, s_{i,y}\in\mathcal{E}_p$, due to Lemma 2.2 and $|q|_p=1$, using the strong triangle inequality one gets
$$\left|\frac{(\rho^J-1)s_{i,y}+\sum_{j=1}^{q-1}s_{j,y}+1}{\rho^J+\sum_{j=1}^{q-1}s_{j,y}}-1\right|_p=\left|\frac{(\rho^J-1)(s_{i,y}-1)}{\rho^J-1+\sum_{j=1}^{q-1}(s_{j,y}-1)+q}\right|_p=\left|(\rho^J-1)(s_{i,y}-1)\right|_p\le\frac{1}{p}|s_{i,y}-1|_p.$$
Hence, we find
$$\left|\prod_{y\in S(x)}\left(1+\left(\frac{(\rho^J-1)s_{i,y}+\sum_{j=1}^{q-1}s_{j,y}+1}{\rho^J+\sum_{j=1}^{q-1}s_{j,y}}-1\right)\right)-1\right|_p\le\frac{1}{p}\max\{|s_{i,y}-1|_p : y\in S(x)\},$$
which together with (3.13) implies (3.14).
Due to (3.14), for a given $x\in V\setminus\{x^{(0)}\}$ and for any $N\in\mathbb{N}$ we can choose a collection of vertices $\{x_1,x_2,\dots,x_N\}$ such that $x_1\in S(x)$, $x_2\in S(x_1)$, ..., $x_N\in S(x_{N-1})$ and
$$|s_{i,x}-1|_p\le\frac{1}{p^{N+1}}\max\{|s_{i,y}-1|_p : y\in S(x_N)\},\qquad i\in\{1,2,\dots,q-1\}.$$
Since N is arbitrary, from the last inequality we infer that
$$s_{i,x}=1,\qquad i\in\{1,2,\dots,q-1\}.$$
Finally, since $x\in V\setminus\{x^{(0)}\}$ is arbitrary, we conclude that $h=s$. Moreover, h is a translation-invariant solution of (3.7). Consequently, since $1\in\mathcal{E}_p$, we obtain that $\mu_h$ is a translation-invariant p-adic Gibbs measure.
Remark 3.10 The proved result extends the main results of [106, 107]. We point out that more general types of recursive equations have been investigated in [78, 84, 89].
4 Translation-Invariant Measures
In this section, we consider generalized p-adic Gibbs measures which are translation-invariant, i.e., we assume $h_x=h=(h_1,\dots,h_{q-1},1)\in\mathbb{Q}_p^q$ for all $x\in V$. Then, by (3.12) one gets
$$h_i=\left(\frac{(\rho^J-1)h_i+\sum_{j=1}^{q-1}h_j+1}{\rho^J+\sum_{j=1}^{q-1}h_j}\right)^{k},\qquad i=1,\dots,q-1. \tag{4.1}$$
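Proposition 4.1 Let $h=(h_1,h_2,\dots,h_{q-1})$ be a solution of (4.1) such that $\sum_{i=1}^{q-1}h_i\neq -1$. Then $\mu_h:=\mu_{h,\rho}$ is a translation-invariant generalized p-adic Gibbs measure for the q-state Potts model, where $h_x=(h_1,h_2,\dots,h_{q-1},1)$ for every $x\in V\setminus\{x^{(0)}\}$. Moreover, it holds
$$\mu_h\bigl(\{\sigma\in\Omega : \sigma|_{V_n}\equiv\sigma_n\}\bigr)=\frac{\rho^{H_n(\sigma_n)}\prod_{i=1}^{q-1}h_i^{\sum_{x\in W_n}\delta_{i\sigma_n(x)}}}{\Bigl(\rho^J+\sum_{j=1}^{q-1}h_j\Bigr)^{\frac{k(k^n-1)}{k-1}}\Bigl(\sum_{i=1}^{q-1}h_i+1\Bigr)},\qquad \forall n\in\mathbb{N}. \tag{4.2}$$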
Proof Let $h=(h_1,h_2,\dots,h_{q-1})$ be a solution of (4.1) such that $\sum_{i=1}^{q-1}h_i\neq -1$. Then the vector-valued function $h_x=(h_1,h_2,\dots,h_{q-1},1)$, $x\in V\setminus\{x^{(0)}\}$, is a solution of (3.7). Thanks to Theorem 3.7 we infer that $\mu_{h,\rho}$ is a generalized p-adic Gibbs measure for the q-state Potts model. Since $h_x=h_y$ for every $x,y\in V\setminus\{x^{(0)}\}$, the measure $\mu_{h,\rho}$ is translation-invariant. Since $h_x$ depends only on h, we denote $\mu_h:=\mu_{h,\rho}$. In order to show (4.2) it is enough to prove the following:
$$\mu^{(n)}_{h,\rho}(\sigma)=\frac{\rho^{H_n(\sigma)}\prod_{i=1}^{q-1}h_i^{\sum_{x\in W_n}\delta_{i\sigma(x)}}}{\Bigl(\rho^J+\sum_{j=1}^{q-1}h_j\Bigr)^{\frac{k(k^n-1)}{k-1}}\Bigl(\sum_{i=1}^{q-1}h_i+1\Bigr)},\qquad \forall\sigma\in\Omega_{V_n},\ \forall n\in\mathbb{N}. \tag{4.3}$$
First we prove the following formula for the partition function:
$$Z^{(h)}_{n,\rho}=\Bigl(\rho^J+\sum_{j=1}^{q-1}h_j\Bigr)^{\frac{k(k^n-1)}{k-1}}\Bigl(\sum_{i=1}^{q-1}h_i+1\Bigr),\qquad \forall n\in\mathbb{N}. \tag{4.4}$$
From the proof of Theorem 3.7 we find $Z^{(h)}_{n,\rho}=A_{n-1}Z^{(h)}_{n-1,\rho}$ for any $n\ge 2$. Noting that
$$a(x)h_i=\Bigl(\sum_{j=1}^{q}\rho^{J\delta_{ij}}h_j\Bigr)^{k}=h_i\Bigl(\rho^J+\sum_{j=1}^{q-1}h_j\Bigr)^{k},\qquad \forall x\in W_{n-1},\ \forall i\in\{1,2,\dots,q-1\},$$
one gets
$$A_{n-1}=\prod_{x\in W_{n-1}}a(x)=\Bigl(\rho^J+\sum_{j=1}^{q-1}h_j\Bigr)^{k^n},\qquad \forall n\ge 2. \tag{4.5}$$
On the other hand, we obtain
$$Z^{(h)}_{1,\rho}=\sum_{i=1}^{q-1}\Bigl((\rho^J-1)h_i+\sum_{j=1}^{q-1}h_j+1\Bigr)^{k}+\Bigl(\rho^J+\sum_{j=1}^{q-1}h_j\Bigr)^{k}=\Bigl(\rho^J+\sum_{j=1}^{q-1}h_j\Bigr)^{k}\Bigl(\sum_{i=1}^{q-1}h_i+1\Bigr).$$
The last equality together with (4.5) implies (4.4).
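Let $n\ge 1$ and $\sigma\in\Omega_{V_n}$. Then we have
$$\mu^{(n)}_{h,\rho}(\sigma)=\frac{1}{Z^{(h)}_{n,\rho}}\,\rho^{H_n(\sigma)}\prod_{i=1}^{q}\ \prod_{x\in W_n:\ \sigma(x)=i}h_i=\frac{1}{Z^{(h)}_{n,\rho}}\,\rho^{H_n(\sigma)}\prod_{i=1}^{q-1}h_i^{\sum_{x\in W_n}\delta_{i\sigma(x)}}.$$
From this, keeping in mind (4.4), we immediately get (4.3).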
Theorem 4.2 Let $k=2$. Then for any solution $(h_1,\dots,h_{q-1})$ of (4.1) there exist $M\subset\{1,\dots,q-1\}$ and $h^*\in\mathbb{Q}_p$ such that
$$h_i=\begin{cases}1, & \text{if } i\notin M,\\ h^*, & \text{if } i\in M.\end{cases}$$
Proof It is easy to see that $h_i=1$ is a solution of the i-th equation of the system (4.1) for each $i=1,2,\dots,q-1$. Thus, for a given $M\subset\{1,\dots,q-1\}$ one can take $h_i=1$ for any $i\notin M$. Let $\emptyset\neq M\subset\{1,\dots,q-1\}$; without loss of generality we can take $M=\{1,2,\dots,m\}$, $m\le q-1$, i.e. $h_i=1$, $i=m+1,\dots,q-1$. Now we shall prove that $h_1=h_2=\dots=h_m$. From (4.1) we have
$$h_i=\left(\frac{(\rho^J-1)h_i+\sum_{j=1}^{m}h_j+q-m}{\sum_{j=1}^{m}h_j+q-m-1+\rho^J}\right)^{2},\qquad i=1,\dots,m. \tag{4.6}$$
By assumption, $h_i\neq 1$ for every $i=1,2,\dots,m$. Then from (4.6), after some simplification, we get
$$h_i=\left(\frac{\sum_{j=1}^{m}h_j+q-m}{\rho^J-1}\right)^{2},\qquad i=1,\dots,m.$$
The right-hand side of the last equality does not depend on i. Hence, we conclude that
$$h_i=h_j,\qquad \forall i,j\in\{1,\dots,m\}.$$
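The "simplification" step in the proof above rests on a polynomial identity: writing $S=\sum_{j=1}^{m}h_j+q-m$ and $\theta=\rho^J$, clearing the denominator in (4.6) gives $h_i(S-1+\theta)^2-((\theta-1)h_i+S)^2=(h_i-1)(S^2-(\theta-1)^2h_i)$, so $h_i\neq 1$ forces $h_i=(S/(\theta-1))^2$. The following sketch checks this identity on a few rational sample points; the helper names and sample values are assumptions chosen for illustration, and a finite check of this kind is a plausibility test rather than a proof.

```python
from fractions import Fraction
from itertools import product


def defect(h, s, theta):
    """h * D^2 - N^2 for one equation of (4.6), with s = sum_j h_j + q - m."""
    numer = (theta - 1) * h + s
    denom = s - 1 + theta
    return h * denom ** 2 - numer ** 2


def factored(h, s, theta):
    """Claimed factorization (h - 1) * (s^2 - (theta - 1)^2 * h)."""
    return (h - 1) * (s ** 2 - (theta - 1) ** 2 * h)


# Arbitrary rational sample values (assumptions for illustration only).
samples = [Fraction(-3, 2), Fraction(2), Fraction(5, 7)]
identity_holds = all(
    defect(h, s, t) == factored(h, s, t) for h, s, t in product(samples, repeat=3)
)
print(identity_holds)
```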
By this theorem, any translation-invariant generalized p-adic Gibbs measure for the Potts model on $\Gamma^2_+$ corresponds to a solution $z^*\in\mathbb{Q}_p$ of the following equation
$$z=\left(\frac{(\rho^J+m-1)z+q-m}{mz+q-m-1+\rho^J}\right)^{2} \tag{4.7}$$
for some $m=1,\dots,q-1$.
Remark 4.3 The proved result extends the main result of [117]. If $k=3$ and $\rho\in\mathcal{E}_p$, a similar description has recently been given in [120]. We note that, in the real case, Theorem 4.2 is true for any $k\ge 2$ [70]. However, in the p-adic setting (in general), if $k\ge 3$ then Theorem 4.2 is not true. Indeed,
(1) If $k=q=p=3$ and $\rho^J=-2$, then $h=(64,-125)\in\mathbb{Q}_3^2$ is a solution of (4.1).
(2) If $k=p=3$, $q=6$ and $\rho^J=-\frac{37}{20}$, then $h=(64,-125,1,1,1)\in\mathbb{Q}_3^5$ is a solution of (4.1).
Lemma 4.4 If $z_m$ is a solution of (4.7) for $m\le q-1$, then $z_{q-m}=z_m^{-1}$ is a solution of (4.7) for $q-m$.
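Both counterexamples in Remark 4.3 involve only rational numbers, so they can be checked with exact arithmetic. The sketch below does this, and also tests Lemma 4.4 on one rational solution of (4.7). The helper names, and the sample solution $z=4$ with $q=3$, $m=1$, $\rho^J=4$, are assumptions introduced for illustration (the sample was chosen so that (4.7) holds in $\mathbb{Q}$; it is not taken from the text).

```python
from fractions import Fraction


def rhs_41(h, theta, k):
    """Right-hand side of (4.1) for h = (h_1, ..., h_{q-1}); theta stands for rho^J."""
    total = sum(h)
    denom = theta + total
    return tuple((((theta - 1) * hi + total + 1) / denom) ** k for hi in h)


def solves_41(h, theta, k):
    h = tuple(Fraction(x) for x in h)
    return rhs_41(h, theta, k) == h


def rhs_47(z, theta, q, m):
    """Right-hand side of (4.7)."""
    return (((theta + m - 1) * z + q - m) / (m * z + q - m - 1 + theta)) ** 2


# Remark 4.3 (1): k = q = 3, rho^J = -2.
example1_ok = solves_41((64, -125), Fraction(-2), 3)
# Remark 4.3 (2): k = 3, q = 6, rho^J = -37/20.
example2_ok = solves_41((64, -125, 1, 1, 1), Fraction(-37, 20), 3)

# Lemma 4.4 on a hypothetical sample: z = 4 solves (4.7) for q = 3, m = 1, rho^J = 4.
q, theta, z = 3, Fraction(4), Fraction(4)
lemma_ok = rhs_47(z, theta, q, 1) == z and rhs_47(1 / z, theta, q, q - 1) == 1 / z
print(example1_ok, example2_ok, lemma_ok)
```

In both examples the coordinates 64 and -125 differ from each other and from 1, which is exactly what Theorem 4.2 rules out for $k=2$.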
Let $M\subset\{1,\dots,q-1\}$ with $|M|=m$. The corresponding solution of (4.7) is denoted by $z_M$. Then the vector $h_M=(h_1,h_2,\dots,h_{q-1})$, whose coordinates are defined as $h_i=z_M$ if $i\in M$ and $h_i=1$ otherwise, is a fixed point of F. It is obvious that $h_M$ depends only on the cardinality of M, i.e. we may assume that $h_M=h_m$. Put $1_M=(e_1,\dots,e_{q-1})$, where
$$e_i=\begin{cases}1, & \text{if } i\in M,\\ 0, & \text{if } i\notin M.\end{cases}$$
By $\mu_{h_M 1_M}$ we denote the translation-invariant generalized p-adic Gibbs measure corresponding to $h_M$. The following proposition is useful.
Proposition 4.5 For any finite $\Lambda\subset V$ and any $\sigma\in\{1,\dots,q\}^{\Lambda}$ we have
$$\mu_{h_M 1_M}(\sigma)=\mu_{h_{M^c} 1_{M^c}}(\sigma), \tag{4.8}$$
where $M^c=\{1,\dots,q-1\}\setminus M$ and $h_{M^c}=h_M^{-1}:=(h_1^{-1},h_2^{-1},\dots,h_{q-1}^{-1})$.
Corollary 4.6 Each translation-invariant generalized p-adic Gibbs measure corresponds to a solution of (4.7) with some $m\le[q/2]$, where $[a]$ is the integer part of a. Moreover, for a given $m\le[q/2]$, a fixed solution $z_M$ of (4.7) generates $\binom{q}{m}$ vectors $h_M 1_M$, giving $\binom{q}{m}$ translation-invariant generalized p-adic Gibbs measures.
5 Description of the Set of All Translation-Invariant p-adic Gibbs Measures
In this section, we consider translation-invariant p-adic Gibbs measures, i.e. it is assumed that $\rho^J\in\mathcal{E}_p\setminus\{1\}$.
Thanks to Theorem 3.9 we conclude that if $p\nmid q$, then for the q-state Potts model there is a unique p-adic Gibbs measure, which is translation-invariant. In other words, $p\mid q$ is a necessary condition for the existence of at least two translation-invariant p-adic Gibbs measures.
Theorem 5.1 Let $p=2$ and $\rho\in\mathcal{E}_2$.
(A) For a given $m\le[q/2]$ there exist $2\binom{q}{m}$ translation-invariant 2-adic Gibbs measures if at least one of the following conditions is satisfied:
(A1) $|4m|_2>\max\{|\rho^J-1|_2,|q|_2\}$ and $\rho^J\notin\{1-q,1+q\}$;
(A2) $|4m|_2=|\rho^J-1|_2=|q|_2$ and $\rho^J\notin\{1-q,1+q\}$;
(A3) $|m|_2>|\rho^J-1|_2=|q|_2>|4m|_2$, $q\neq 2m$, $\rho^J\notin\{1-q,1+q\}$ and there exists $\sqrt{1-2a+b^2}$, where $a=\frac{q}{2m}$, $b=\frac{\rho^J-1}{2m}$.
(B) For a given $m\le[q/2]$ there exist $\binom{q}{m}$ translation-invariant 2-adic Gibbs measures if at least one of the following conditions is satisfied:
(B1) $|4m|_2>\max\{|\rho^J-1|_2,|q|_2\}$ and $\rho^J\in\{1-q,1+q\}$;
(B2) $|4m|_2=|\rho^J-1|_2=|q|_2$ and $\rho^J\in\{1-q,1+q\}$;
(B3) $|m|_2=|\rho^J-1|_2=|q|_2>|4m|_2$ and $\rho^J\notin\{1-q,1+q\}$;
(B4) $|m|_2>|\rho^J-1|_2=|q|_2>|4m|_2$ and $\rho^J\in\{1-q,1+q\}$;
(B5) $|\rho^J-1|_2=|q|_2>|m|_2$ and $\rho^J\notin\{1-q,1+q\}$;
(B6) $|m|_2>|\rho^J-1|_2=|q|_2>|4m|_2$, $q=2m$, $\rho^J\notin\{1-q,1+q\}$ and there exists $\sqrt{b^2-1}$, where $b=\frac{\rho^J-1}{q}$.
(C) Otherwise there does not exist any translation-invariant 2-adic Gibbs measure.
Theorem 5.2 Let $p\neq 2$ and $\rho\in\mathcal{E}_p$.
(A) For a given $m\le[q/2]$ there exist $2\binom{q}{m}$ translation-invariant p-adic Gibbs measures if at least one of the following conditions is satisfied:
(A1) $|m|_p>\max\{|\rho^J-1|_p,|q|_p\}$ and $\rho^J\notin\{1-q,1+q\}$;
(A2) $|m|_p=|\rho^J-1|_p=|q|_p$, $0<|(\rho^J-1)^2-q^2|_p<|q^2|_p$, $0<|q-2m|_p<|q|_p$ and there exists an integer $s\ge 1$ such that $\bigl|p^{-2s}\bigl((\rho^J-1)^2-4m(q-m)\bigr)\bigr|_p=1$.
(B) For a given $m\le[q/2]$ there exist $\binom{q}{m}$ translation-invariant p-adic Gibbs measures if at least one of the following conditions is satisfied:
(B1) $|m|_p>\max\{|\rho^J-1|_p,|q|_p\}$ and $\rho^J\in\{1-q,1+q\}$;
(B2) $|m|_p<|\rho^J-1|_p=|q|_p$ and $0<|(\rho^J-1)^2-q^2|_p<|q^2|_p$;
(B3) $|m|_p=|\rho^J-1|_p=|q|_p$ and $0<|(\rho^J-1)^2-q^2|_p<|q^2|_p$, $|q-2m|_p=|q|_p$;
(B4) $|m|_p=|\rho^J-1|_p=|q|_p$ and $\rho^J\in\{1-q,1+q\}$ and $0<|q-2m|_p<|q|_p$;
(B5) $q=2m$, $|\rho^J-1|_p=|q|_p$, $0<|(\rho^J-1)^2-q^2|_p<|q^2|_p$ and there exists an integer $s\ge 1$ such that $\bigl|p^{-2s}\bigl((\rho^J-1)^2-q^2\bigr)\bigr|_p=1$.
(C) Otherwise, for a given $m\in\{1,\dots,[q/2]\}$ there does not exist any translation-invariant p-adic Gibbs measure.
5.3 On Cardinality of the Set of All Translation-Invariant p-adic Gibbs Measures
Denote by $N_{TI}$ the number of all translation-invariant p-adic Gibbs measures for the q-state p-adic Potts model on $\Gamma^2_+$. Note that $N_{TI}$ depends on the parameter $\rho^J$. Since the translation-invariant p-adic Gibbs measure $\mu_0$ exists independently of the parameters, the set of all translation-invariant p-adic Gibbs measures is not empty.
(1) Let $q\notin p\mathbb{N}$. In this case, by Theorem 5.2 there exists a unique translation-invariant p-adic Gibbs measure $\mu_0$, i.e. $N_{TI}=1$.
(2) Let $q=p>2$. (If $q=p=2$ we get the 2-adic Ising model. It is known that for the Ising model there exists a unique p-adic Gibbs measure, which is translation-invariant. So, $N_{TI}=1$.) Then for any integer $m\in\{1,2,\dots,[q/2]\}$ it holds $|m|_p>|q|_p\ge|\rho^J-1|_p$. By Theorem 5.2, for the integer $m\le[q/2]$ there are $2\binom{q}{m}$ translation-invariant p-adic Gibbs measures if $\rho^J\notin\{1-q,1+q\}$, and there are $\binom{q}{m}$ of them if $\rho^J\in\{1-q,1+q\}$. Using
$$\binom{q}{m}=\binom{q}{q-m},\qquad \sum_{m=1}^{q}\binom{q}{m}=2^q-1,$$
we get
$$N_{TI}=1+2\sum_{m=1}^{[q/2]}\binom{q}{m}=2^q-1,\qquad \text{if } \rho^J\notin\{1-q,1+q\},$$
and
$$N_{TI}=1+\sum_{m=1}^{[q/2]}\binom{q}{m}=2^{q-1},\qquad \text{if } \rho^J\in\{1-q,1+q\}.$$
(3) Let $p>2$ and $q=pn$, $n\in\{2,\dots,p-1\}$. Then
$$|m|_p>|q|_p\ge|\rho^J-1|_p\qquad \text{if } m\in\{1,2,\dots,[pn/2]\}\setminus\{p,2p,\dots,[n/2]p\}$$
and
$$|m|_p=|q|_p\ge|\rho^J-1|_p\qquad \text{if } m\in\{p,2p,\dots,[pn/2]\}.$$
By Theorem 5.2 one can show that
$$N_{TI}=\begin{cases}
2^q-1-2\sum_{s=1}^{[n/2]}\binom{pn}{ps}, & \text{if } \rho^J\notin\{1-q,1+q\} \text{ and } q \text{ is odd},\\[4pt]
2^q-1+\binom{q}{[q/2]}-2\sum_{s=1}^{[n/2]}\binom{pn}{ps}, & \text{if } \rho^J\notin\{1-q,1+q\} \text{ and } q \text{ is even},\\[4pt]
2^{q-1}-\sum_{s=1}^{[n/2]}\binom{pn}{ps}, & \text{if } \rho^J\in\{1-q,1+q\} \text{ and } q \text{ is odd},\\[4pt]
2^{q-1}+\binom{q}{[q/2]}-\sum_{s=1}^{[n/2]}\binom{pn}{ps}, & \text{if } \rho^J\in\{1-q,1+q\} \text{ and } q \text{ is even}.
\end{cases}$$
(4) Let $p>2$ and $q=p^s n$, where $s>1$, $n\in\{1,\dots,p-1\}$. If $n=1$ then there are at most $2^q-1$ translation-invariant p-adic Gibbs measures. Note that $N_{TI}=2^q-1$ if and only if $0<|(\rho^J-1)^2-q^2|_p\le|q^2|_p$. If $1<n\le p-1$ and n is odd, then there are at most $2^q-1-2\sum_{m=1}^{[n/2]}\binom{p^s n}{p^s m}$ translation-invariant p-adic Gibbs measures. If $1<n\le p-1$ and n is even, then there are at most $2^q-1+\binom{q}{[q/2]}-2\sum_{m=1}^{[n/2]}\binom{p^s n}{p^s m}$ translation-invariant p-adic Gibbs measures.
(5) Let $p=2$ and $|q|_2>\frac14$. Then by Theorem 5.1 there exists a unique translation-invariant p-adic Gibbs measure. Thus, in this case $N_{TI}=1$.
(6) Let $p=2$ and $q=4$. Then there are at most 15 translation-invariant p-adic Gibbs measures. If $\sqrt{(\rho^J-5)(\rho^J+3)}$ exists in $\mathbb{Q}_2$, then there exist 15 translation-invariant p-adic Gibbs measures. We notice that the number $\sqrt{(\rho^J-5)(\rho^J+3)}$ exists in $\mathbb{Q}_2$ if and only if $\rho^J\in A\cup B\cup C\cup D$, where
$$A=\Bigl\{x\in\mathbb{Q}_2 : |x-29|_2\le\tfrac{1}{128}\Bigr\},\qquad B=\Bigl\{x\in\mathbb{Q}_2 : |x-93|_2\le\tfrac{1}{256}\Bigr\},$$
$$C=\Bigl\{x\in\mathbb{Q}_2 : |x-165|_2\le\tfrac{1}{256}\Bigr\},\qquad D=\bigcup_{s=1}^{\infty}\Bigl\{x\in\mathbb{Q}_2 : |x-5-2^s|_2\le\tfrac{1}{2^{s+3}}\Bigr\}.$$
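The closed forms in case (2) above, where $q=p$ is an odd prime, follow from the symmetry of binomial coefficients. A minimal sketch that recomputes both counts directly from the sums is given below; the function name `count_ti_measures` and the flag `special` (true when $\rho^J\in\{1-q,1+q\}$) are assumptions made for this illustration.

```python
from math import comb


def count_ti_measures(q, special):
    """N_TI for q = p an odd prime, as in case (2).

    Each m <= [q/2] contributes comb(q, m) measures when special is True
    (rho^J in {1 - q, 1 + q}) and 2 * comb(q, m) otherwise; the extra 1
    accounts for the measure mu_0.
    """
    half = sum(comb(q, m) for m in range(1, q // 2 + 1))
    return 1 + (half if special else 2 * half)


for q in (3, 5, 7, 11, 13):
    generic = count_ti_measures(q, special=False)
    exceptional = count_ti_measures(q, special=True)
    print(q, generic, exceptional)
```

For every odd q the two counts agree with $2^q-1$ and $2^{q-1}$, respectively.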
Theorem 5.4 Let $q\in\mathbb{N}$ and let $\mu_h$ be a translation-invariant p-adic Gibbs measure for the q-state Potts model on $\Gamma^2_+$. Then $\mu_h$ is bounded iff $p\nmid q$.
Thanks to Theorem 5.4 we conclude that a phase transition does not occur on the set of all translation-invariant p-adic Gibbs measures for the q-state Potts model on $\Gamma^2_+$. In particular, a strong phase transition does not occur on that set.
6 Existence of a Strong Phase Transition for the q-State Potts Model
In the previous section, we have proved that there is no phase transition (in particular, no strong phase transition) on the set of all translation-invariant p-adic Gibbs measures for the q-state Potts model on $\Gamma^2_+$. In this section, we are interested in the existence of a strong phase transition for the Potts model. First of all, we notice that every p-adic Gibbs measure is a generalized p-adic Gibbs measure. We consider translation-invariant generalized p-adic Gibbs measures on $\Gamma^2_+$.
We consider the following function
$$f_{\rho,k}(z)=\left(\frac{\rho^J z+q-1}{z+\rho^J+q-2}\right)^{k},\qquad \rho\in\mathbb{Q}_p\setminus\{-1,0,1\},\ k\in\mathbb{N}, \tag{6.1}$$
which is called the Potts-Bethe mapping. We notice that if $k=2$ then every fixed point of (6.1) is a solution of (4.7) with $m=1$. Indeed, from (4.7) for $m=1$ one has
$$(z-1)\Bigl(z^2+\bigl(2(q-1)-(\rho^J-1)^2\bigr)z+(q-1)^2\Bigr)=0,$$
from which we find all fixed points of $f_{\rho,2}$. It is obvious that $z_0=1$ is a fixed point of (6.1), and it defines a generalized p-adic Gibbs measure $\mu_0$. Now we are interested in finding the other fixed points of $f_{\rho,2}$, which means we need to solve the following equation:
$$z^2+\bigl(2(q-1)-(\rho^J-1)^2\bigr)z+(q-1)^2=0. \tag{6.2}$$
Observe that the solutions of (6.2) can be formally written as
$$z_{1,2}=\frac{-(2\rho^J-\rho^{2J}+2q-3)\pm(\rho^J-1)\sqrt{D(\rho,q)}}{2},$$
where
$$D(\rho,q)=\rho^{2J}-2\rho^J-4q+5. \tag{6.3}$$
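The factorization of the fixed-point equation and the discriminant behind (6.3) are polynomial identities, so they can be spot-checked with exact rational arithmetic. In the sketch below, `theta` stands for $\rho^J$, and all function names and sample values are assumptions made for illustration; passing on a finite set of samples is a plausibility check, not a proof.

```python
from fractions import Fraction
from itertools import product


def fixed_point_defect(z, theta, q):
    """z * (z + theta + q - 2)^2 - (theta * z + q - 1)^2, i.e. z = f(z) for k = 2 with denominators cleared."""
    return z * (z + theta + q - 2) ** 2 - (theta * z + q - 1) ** 2


def factored(z, theta, q):
    """(z - 1) times the quadratic in (6.2)."""
    return (z - 1) * (z ** 2 + (2 * (q - 1) - (theta - 1) ** 2) * z + (q - 1) ** 2)


def quadratic_discriminant(theta, q):
    """Discriminant b^2 - 4c of the quadratic (6.2)."""
    b = 2 * (q - 1) - (theta - 1) ** 2
    return b * b - 4 * (q - 1) ** 2


def big_d(theta, q):
    """D(rho, q) from (6.3), written in terms of theta = rho^J."""
    return theta ** 2 - 2 * theta - 4 * q + 5


zs = [Fraction(-2), Fraction(1, 3), Fraction(7, 5)]
thetas = [Fraction(9), Fraction(-4, 7), Fraction(5, 2)]
qs = [2, 3, 4, 10]
factorization_ok = all(
    fixed_point_defect(z, t, q) == factored(z, t, q) for z, t, q in product(zs, thetas, qs)
)
discriminant_ok = all(
    quadratic_discriminant(t, q) == (t - 1) ** 2 * big_d(t, q) for t, q in product(thetas, qs)
)
print(factorization_ok, discriminant_ok)
```

The second check explains the factor $(\rho^J-1)$ in front of the square root in the formula for $z_{1,2}$.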
So, if the defined solutions exist in $\mathbb{Q}_p$, then they define generalized p-adic quasi Gibbs measures $\mu_1$ and $\mu_2$, respectively. Note that for such solutions to exist, the expression $\sqrt{D(\rho,q)}$ should make sense in $\mathbb{Q}_p$, since in $\mathbb{Q}_p$ not every quadratic equation has a solution. Therefore, we are going to check when $\sqrt{D(\rho,q)}$ exists. In what follows, we always assume that $|\rho|_p<1$ and $J>0$. Now let us consider several cases with respect to q.
Proposition 6.1 Let $J\ge 1$ and $q=2$. Assume that $\rho^J=p^J\eta$, where $|\eta|_p=1$, $\eta=c_0+\eta_1\cdot p$. Then the following assertions hold true:
(i) If $p=2$, then (6.2) has no solution;
(ii) Let $p=3$. If $J=1$, then (6.2) has two solutions $z_{1,2}$ if and only if $\eta\equiv 1\ (\mathrm{mod}\ p)$ and $\sqrt{\eta_1}$ exists. Otherwise it has no solution;
(iii) Let $p\ge 5$. Then (6.2) has two solutions $z_{1,2}$ if and only if $p\equiv 1\ (\mathrm{mod}\ 12)$ or $p\equiv -5\ (\mathrm{mod}\ 12)$. Otherwise it has no solution.
Hence, we can formulate the following
Theorem 6.2 Let $J\ge 1$ and $q=2$ (ferromagnetic Ising model). Assume that $\rho^J=p^J\eta$, where $|\eta|_p=1$, $\eta=c_0+\eta_1\cdot p$. Then the following assertions hold true:
(i) If $p=2$, then there is a unique translation-invariant generalized p-adic Gibbs measure $\mu_0$;
(ii) Let $p=3$. If $J=1$, $\eta\equiv 1\ (\mathrm{mod}\ p)$ and $\sqrt{\eta_1}$ exists, then there are three translation-invariant generalized p-adic Gibbs measures $\mu_0$, $\mu_1$ and $\mu_2$;
(iii) Let $p\ge 5$ and $p\equiv 1\ (\mathrm{mod}\ 12)$ or $p\equiv -5\ (\mathrm{mod}\ 12)$. Then there are three translation-invariant generalized p-adic Gibbs measures $\mu_0$, $\mu_1$ and $\mu_2$.
The following result can be proved by the same argument used in [80].
Theorem 6.3 Let $J\ge 1$ and $q\ge 3$ (ferromagnetic Potts model). Then the following assertions hold true:
(i) If $|q-1|_p<1$, then there are three translation-invariant generalized p-adic Gibbs measures $\mu_0$, $\mu_1$ and $\mu_2$;
(ii) If $p=3$, $|q-3|_p<1$ or $p\ge 5$, $|4q-5|_p<1$, then there is a unique translation-invariant generalized p-adic Gibbs measure $\mu_0$.
Remark 6.4 We point out that the existence of different Gibbs measures with significantly different behavior (for example, one is finite and the others are infinite) is an indicator of the presence of a phase transition. It should be noted that the problem of describing the set of p-adic Gibbs measures in this model is highly non-trivial. The formulated results show the existence of translation-invariant measures. However, to establish the presence of a phase transition, one needs to study their properties further. This will be discussed in the coming sections via the investigation of the dynamical behavior of the function $f_{\rho,k}$.
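The congruence condition in Proposition 6.1 (iii) can be related to a classical fact. For $q=2$ and $|\rho|_p<1$ one has $D(\rho,2)=\rho^{2J}-2\rho^J-3\equiv -3\ (\mathrm{mod}\ p)$, so, under the assumption that Hensel's lemma is applied for $p\ge 5$, $\sqrt{D(\rho,2)}$ exists in $\mathbb{Q}_p$ exactly when $-3$ is a quadratic residue modulo p. The sketch below compares Euler's criterion for $-3$ with the condition $p\equiv 1$ or $-5\ (\mathrm{mod}\ 12)$, i.e. $p \bmod 12\in\{1,7\}$, for small primes; the function names are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def minus_three_is_square_mod(p):
    """Euler's criterion: -3 is a nonzero square mod an odd prime p (p != 3) iff (-3)^((p-1)/2) == 1 mod p."""
    return pow(-3 % p, (p - 1) // 2, p) == 1


primes = [p for p in range(5, 200) if is_prime(p)]
agreement = all(minus_three_is_square_mod(p) == (p % 12 in (1, 7)) for p in primes)
print(agreement)
```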
6.5 Behavior of the Dynamical System (6.1)
In this section we are going to investigate the dynamical system given by $f_{\rho,2}$. In the previous section, we have established some conditions for the existence of its fixed points. In the sequel, we are going to describe possible attractors of the system, which allows us to find a relation between the behavior of that dynamical system and the phase transitions. In what follows, for the sake of simplicity, we always assume that $p\ge 3$. From (6.1) we easily find the following auxiliary facts:
$$f'_{\rho,2}(z)=\left(\frac{\rho^J z+q-1}{z+\rho^J+q-2}\right)^{2}\cdot\frac{2(\rho^J-1)(\rho^J+q-1)}{(\rho^J z+q-1)(z+\rho^J+q-2)}; \tag{6.4}$$
$$|f_{\rho,2}(z)-f_{\rho,2}(\bar z)|_p=\frac{|\rho^J-1|_p\,|\rho^J+q-1|_p\,|z-\bar z|_p\,|\eta(\rho,q,z,\bar z)|_p}{|z+\rho^J+q-2|_p^2\,|\bar z+\rho^J+q-2|_p^2}, \tag{6.5}$$
where
$$\eta(\rho,q,z,\bar z)=A\rho^J(z+\bar z)+2\rho^J z\bar z+2(q-1)A+(q-1)(z+\bar z), \tag{6.6}$$
and $A=\rho^J+q-2$. Furthermore, we assume that $f_{\rho,2}$ has three fixed points; the existence of such points has been investigated in Theorem 6.3. We denote them by $z_0$, $z_1$, $z_2$. Note that $z_0=1$. For the fixed points $z_1$ and $z_2$, from (6.2) we find that
$$z_1+z_2=-2q+3+\rho^{2J}-2\rho^J,\qquad z_1\cdot z_2=(q-1)^2. \tag{6.7}$$
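Formula (6.5) is the p-adic norm of an exact algebraic identity for $f_{\rho,2}(z)-f_{\rho,2}(\bar z)$, obtained by factoring the difference of two squares. The sketch below verifies that identity over the rationals for a few sample points; `theta` stands for $\rho^J$, `w` plays the role of the second argument $\bar z$, and the names and sample values are assumptions chosen only for illustration.

```python
from fractions import Fraction
from itertools import product


def f2(z, theta, q):
    """The Potts-Bethe mapping (6.1) with k = 2."""
    return ((theta * z + q - 1) / (z + theta + q - 2)) ** 2


def eta(z, w, theta, q):
    """The function eta from (6.6), with A = theta + q - 2."""
    a = theta + q - 2
    return a * theta * (z + w) + 2 * theta * z * w + 2 * (q - 1) * a + (q - 1) * (z + w)


def difference_formula(z, w, theta, q):
    """Right-hand side of (6.5) before taking p-adic norms."""
    a = theta + q - 2
    top = (theta - 1) * (theta + q - 1) * (z - w) * eta(z, w, theta, q)
    return top / ((z + a) ** 2 * (w + a) ** 2)


theta, q = Fraction(7, 3), 4
points = [Fraction(1, 2), Fraction(3), Fraction(-2, 5)]
identity_ok = all(
    f2(z, theta, q) - f2(w, theta, q) == difference_formula(z, w, theta, q)
    for z, w in product(points, repeat=2)
)
print(identity_ok)
```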
In the sequel we assume $|\rho^J|_p<1$; moreover, $|\rho^J|_p\le|q-1|_p^2$ if $|q-1|_p<1$.
Lemma 6.6 Let $z_1$ and $z_2$ be the fixed points of $f_{\rho,2}$. Then the following hold true:
$$|z_1|_p=|q-1|_p^2,\qquad |z_2|_p=1, \tag{6.8}$$
$$|\rho^J z_1+q-1|_p=|q-1|_p,\quad |z_1+\rho^J+q-2|_p=1,\qquad \text{if } |q-1|_p<1; \tag{6.9}$$
$$|\rho^J z_2+q-1|_p=|q-1|_p,\quad |z_2+\rho^J+q-2|_p=|q-1|_p,\qquad \text{if } |\rho^J|_p\le|q-1|_p^2<|q-1|_p<1; \tag{6.10}$$
$$|\rho^J z_i+q-1|_p=1,\quad |z_i+\rho^J+q-2|_p=1,\qquad \text{if } |q-1|_p=1,\ i=1,2. \tag{6.11}$$
Let us determine the behavior of the fixed points. From (6.4) we find
$$|f'_{\rho,2}(z_0)|_p=\left|\frac{\rho^J-1}{\rho^J+q-1}\right|_p=\frac{1}{|q-1|_p}. \tag{6.12}$$
For the other fixed points, again from (6.4) one gets
$$|f'_{\rho,2}(z_i)|_p=\frac{|z_i|_p\,|\rho^J-1|_p\,|\rho^J+q-1|_p}{|\rho^J z_i+q-1|_p\,|z_i+\rho^J+q-2|_p},\qquad i=1,2. \tag{6.13}$$
Now, taking into account (6.8)–(6.11), we derive
$$|f'_{\rho,2}(z_1)|_p=|q-1|_p^2,\qquad |f'_{\rho,2}(z_2)|_p=\frac{1}{|q-1|_p}.$$
Consequently, one has
Proposition 6.7 Let $|\rho^J|_p<1$ and assume that the dynamical system $f_{\rho,2}$ given by (6.1) has three fixed points $z_0$, $z_1$, $z_2$. Then the following assertions hold true:
(i) if $|q-1|_p=1$, then the fixed points are neutral;
(ii) if $|q-1|_p<1$ and $|\rho^J|_p\le|q-1|_p^2$, then $z_1$ is attractive, and $z_0$, $z_2$ are repelling.
Furthermore, we concentrate on the case $|q-1|_p<1$, which is more interesting. For a given set $B\subset\mathbb{Q}_p$, let us denote
$$J(B)=\bigl\{z\in S_1(0) : f^{n}_{\rho,2}(z)\in B \text{ for some } n\ge 0\bigr\}. \tag{6.14}$$
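The attractivity of $z_1$ in Proposition 6.7 (ii) can be observed numerically on a rational orbit. The sketch below uses the hypothetical sample parameters $p=3$, $q=4$ and $\rho^J=9$ (so that $|q-1|_3=1/3$ and $|\rho^J|_3=1/9\le|q-1|_3^2$), starts from $z=0$ in the open unit ball, and records the 3-adic valuation of successive differences $f^{n+1}(0)-f^{n}(0)$. By (6.5), each step should multiply the 3-adic norm of the difference by $|q-1|_3^2=3^{-2}$, i.e. raise the valuation by 2. The helper names are assumptions introduced for this example.

```python
from fractions import Fraction


def valuation(x, p):
    """p-adic valuation of a nonzero rational x (so that |x|_p = p ** -valuation)."""
    if x == 0:
        raise ValueError("valuation of 0 is infinite")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def f2(z, theta, q):
    """The Potts-Bethe mapping (6.1) with k = 2; theta stands for rho^J."""
    return ((theta * z + q - 1) / (z + theta + q - 2)) ** 2


# Sample parameters (assumptions for illustration): p = 3, q = 4, rho^J = 9.
p, q, theta = 3, 4, Fraction(9)
orbit = [Fraction(0)]
for _ in range(5):
    orbit.append(f2(orbit[-1], theta, q))

gaps = [valuation(b - a, p) for a, b in zip(orbit, orbit[1:])]
print(gaps)
```

The valuations grow by 2 at every step, so the orbit is a 3-adic Cauchy sequence; its limit is the attracting fixed point in the open unit ball.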
Theorem 6.8 Let $|q-1|_p<1$ and $|\rho^J|_p\le|q-1|_p^2$. Then one has
$$A(z_1)\supset\bigl(\mathbb{Q}_p\setminus\mathbb{Z}_p^*\bigr)\cup B_{|q-1|_p,\,p}(0)\cup J\bigl(B_{|q-1|_p^2,\,|q-1|_p}(z_0)\bigr)\cup J\bigl(B_{|q-1|_p^2,\,|q-1|_p}(z_2)\bigr).$$
Proof Let us consider several cases with respect to $|z|_p$.
(I) Assume that $z\in B_1(0)$. Then one finds $|f_{\rho,2}(z)|_p=|q-1|_p^2<1$, hence $f_{\rho,2}(B_1(0))\subset B_1(0)$. Note that in the considered case we have $|A|_p=1$; therefore, for $z\in B_1(0)$, from (6.6) one immediately gets $|\eta(\rho,q,z,z_1)|_p=|q-1|_p$. So, (6.5) and (6.9) with $|z+\rho^J+q-2|_p=1$ imply that
$$|f_{\rho,2}(z)-z_1|_p=\bigl|(q-1)^2(z-z_1)\bigr|_p.$$
Hence, $f_{\rho,2}$ is a contraction of $B_1(0)$, which means $f^{n}_{\rho,2}(z)\to z_1$ for every $z\in B_1(0)$, i.e. $B_1(0)\subset A(z_1)$.
Note that $\bar B_1(0)\not\subset A(z_1)$, since $|z_0|_p=|z_2|_p=1$, i.e. $S_1(0)\not\subset A(z_1)$.
(II) Assume that $1<|z|_p\le\frac{|q-1|_p}{|\rho^J|_p}$. Then $|\rho^J z+q-1|_p\le|q-1|_p$, therefore one finds
$$|f_{\rho,2}(z)|_p=\left|\frac{\rho^J z+q-1}{z+\rho^J+q-2}\right|_p^2\le\left(\frac{|q-1|_p}{|z|_p}\right)^{2}<|q-1|_p^2<1.$$
(III) Now let $|z|_p>\frac{|q-1|_p}{|\rho^J|_p}$. Then $|\rho^J z+q-1|_p=|\rho^J z|_p$, so we have
$$|f_{\rho,2}(z)|_p=\frac{|\rho^J z|_p^2}{|z|_p^2}=|\rho^J|_p^2<1.$$
Hence, from (II) and (III) one concludes that $f_{\rho,2}(z)\in B_1(0)$ for any z with $|z|_p>1$, which, due to (I), yields $z\in A(z_1)$. Consequently, we infer that
$$\mathbb{Q}_p\setminus\mathbb{Z}_p^*\subset A(z_1). \tag{6.15}$$
(IV) Now assume that $|z|_p=1$, $|z-1|_p>|q-1|_p$. Then $|z+\rho^J+q-2|_p=|z-1|_p$, so one finds
$$|f_{\rho,2}(z)|_p=\frac{|q-1|_p^2}{|z-1|_p^2}<1,$$
which, due to (I), implies $z\in A(z_1)$.
(V) Suppose that $|z-1|_p<|q-1|_p$. Then $|z+\rho^J+q-2|_p=|q-1|_p$, and from (6.6) we find $|\eta(\rho,q,z,1)|_p=|q-1|_p^2$. Consequently, (6.5) implies
$$|f_{\rho,2}(z)-1|_p=\frac{|z-1|_p}{|q-1|_p}. \tag{6.16}$$
Hence, if $|z-1|_p>|q-1|_p^2$, then $|f_{\rho,2}(z)-1|_p>|q-1|_p$, which, due to (IV), means $z\in A(z_1)$.
(VI) Consider $J(B_{|q-1|_p^2,\,|q-1|_p}(z_0))$. One can see that $J(B_{|q-1|_p^2,\,|q-1|_p}(z_0))\subset A(z_1)$. Indeed, if $z\in J(B_{|q-1|_p^2,\,|q-1|_p}(z_0))$, then $|q-1|_p^2<|f^{n_0}_{\rho,2}(z)-1|_p<|q-1|_p$ for some $n_0\ge 0$, which with (V) yields $z\in A(z_1)$. Now we turn to $z_2$. From (6.7) one finds
$$|z_2-1|_p=|q-1|_p,\qquad |z_2-1+q-1|_p=|q-1|_p. \tag{6.17}$$
Note that the strong triangle inequality implies that $|z-z_2|_p>|q-1|_p$ if and only if $|z-1|_p>|q-1|_p$.
(VII) Therefore, assume that $|z-z_2|_p<|q-1|_p$, which implies that $|z-1|_p=|q-1|_p$. So, by means of (6.10) and (6.17), from (6.6) we derive that $|\eta(\rho,q,z,z_2)|_p=|q-1|_p^2$. Hence, from (6.5) with (6.10) and $|z+\rho^J-1+q-1|_p=|q-1|_p$ one finds
$$|f_{\rho,2}(z)-z_2|_p=\frac{|z-z_2|_p}{|q-1|_p}. \tag{6.18}$$
Now, using the same argument as in (V)–(VII) with (6.18), we obtain that $J(B_{|q-1|_p^2,\,|q-1|_p}(z_2))\subset A(z_1)$. Note that these sets are disjoint. This completes the proof.
Denote
$$g_{\rho,2}(z)=\frac{\rho^J z+q-1}{z+\rho^J+q-2}. \tag{6.19}$$
Note that $f_{\rho,2}(z)=(g_{\rho,2}(z))^2$. Then one can see that
$$|g_{\rho,2}(z)-g_{\rho,2}(\bar z)|_p=\left|\frac{(z-\bar z)(\rho^J-1)(\rho^J+q-1)}{(z+\rho^J+q-2)(\bar z+\rho^J+q-2)}\right|_p, \tag{6.20}$$
$$g^{-1}_{\rho,2}(z)=\frac{(\rho^J+q-2)z-q+1}{\rho^J-z}. \tag{6.21}$$
Moreover, one has the following lemma.
Lemma 6.9 Let $|q-1|_p<1$ and $|\rho^J|_p\le|q-1|_p^2$. The following assertions hold true:
(i) If $|z|_p\neq 1$, then $|g_{\rho,2}(z)|_p\le\max\{|q-1|_p,|\rho^J|_p\}$;
(ii) If $|g_{\rho,2}(z)|_p>1$, then $|z|_p=1$.
Now we are going to investigate solutions of (3.7) over the invariant line $(h_x,1,\dots,1,1)$. Let us introduce some notation. If $x\in W_n$, then instead of $h_x$ we use the symbol $h^{(n)}_x$.
Theorem 6.10 Let $|q-1|_p<1$ and $|\rho^J|_p\le|q-1|_p^2$. Assume that $\{(h_x,1,\dots,1,1)\}_{x\in V\setminus\{x^{(0)}\}}$ is a solution of (3.7) such that $|h_x|_p\neq 1$ for all $x\in V\setminus\{x^{(0)}\}$. Then $h_x=z_1$ for every x.
Proof Let us first show that $|h_x|_p<1$ for all x. Suppose that $|h^{(n_0)}_x|_p>1$ for some $n_0\in\mathbb{N}$ and $x\in W_{n_0}$. Since $\{(h_x,1,\dots,1,1)\}_{x\in V\setminus\{x^{(0)}\}}$ is a solution of (3.7), we have
$$h^{(n_0)}_x=g_{\rho,2}\bigl(h^{(n_0+1)}_{(x,1)}\bigr)\,g_{\rho,2}\bigl(h^{(n_0+1)}_{(x,2)}\bigr), \tag{6.22}$$
where we have used the coordinate structure of the tree. Since $|h^{(n_0+1)}_{(x,1)}|_p\neq 1$ and $|h^{(n_0+1)}_{(x,2)}|_p\neq 1$, Lemma 6.9 (i) implies that $|g_{\rho,2}(h^{(n_0+1)}_{(x,1)})|_p<1$ and $|g_{\rho,2}(h^{(n_0+1)}_{(x,2)})|_p<1$, which with (6.22) means $|h^{(n_0)}_x|_p<1$. This is a contradiction. Hence, $|h_x|_p<1$ for all x. Then from (6.20) we obtain
$$|g_{\rho,2}(h_x)-g_{\rho,2}(z_1)|_p=|q-1|_p\,|h_x-z_1|_p \tag{6.23}$$
for any $x\in V\setminus\{x^{(0)}\}$. Now denote
$$\|h^{(n)}\|_p=\max\{|h^{(n)}_x|_p : x\in W_n\},\qquad n\in\mathbb{N}.$$
Let $\varepsilon>0$ be an arbitrary number. Then from the proof of Lemma 6.9 (i) with (6.23) one finds
$$\begin{aligned}
|h^{(n)}_x-z_1|_p&=\bigl|g_{\rho,2}(h^{(n+1)}_{(x,1)})\,g_{\rho,2}(h^{(n+1)}_{(x,2)})-(g_{\rho,2}(z_1))^2\bigr|_p\\
&=\bigl|g_{\rho,2}(h^{(n+1)}_{(x,1)})\bigl(g_{\rho,2}(h^{(n+1)}_{(x,2)})-g_{\rho,2}(z_1)\bigr)+g_{\rho,2}(z_1)\bigl(g_{\rho,2}(h^{(n+1)}_{(x,1)})-g_{\rho,2}(z_1)\bigr)\bigr|_p\\
&\le|q-1|_p^2\max\bigl\{|h^{(n+1)}_{(x,1)}-z_1|_p,\ |h^{(n+1)}_{(x,2)}-z_1|_p\bigr\}.
\end{aligned}$$
Thus, we derive $\|h^{(n)}-z_1\|_p\le|q-1|_p^2\,\|h^{(n+1)}-z_1\|_p$. So, iterating the last inequality N times one gets
$$\|h^{(n)}-z_1\|_p\le|q-1|_p^{2N}\,\|h^{(n+N)}-z_1\|_p. \tag{6.24}$$
Choosing N such that $|q-1|_p^{2N}<\varepsilon$, from (6.24) we find $\|h^{(n)}-z_1\|_p<\varepsilon$. Since ε is arbitrary, $h_x=z_1$. This completes the proof.
From this theorem we conclude that other solutions of (3.7) may exist only under the condition $|h_x|_p=1$ for all $x\in V\setminus\{x^{(0)}\}$. Note that, using the arguments of [107], one can show the existence of periodic solutions of (3.7).
6.11 Boundedness of Generalized p-adic Gibbs Measures and Phase Transitions
From the results of the previous section, we conclude that to investigate the generalized p-adic Gibbs measures it is enough to study the measures $\mu_0$, $\mu_1$ and $\mu_2$, corresponding to the solutions $z_0$, $z_1$ and $z_2$. In this section we shall study boundedness and unboundedness of these measures.
Furthermore, we are going to consider the generalized p-adic Gibbs measures corresponding to these solutions. For a given configuration $\sigma_n\in\Omega_{V_n}$ denote $\#\sigma_n=|\{x\in W_n : \sigma_n(x)=1\}|$. Then, thanks to Proposition 4.1, we obtain
$$\mu_i\bigl(\{\sigma\in\Omega : \sigma|_{V_n}\equiv\sigma_n\}\bigr)=\frac{\rho^{H_n(\sigma_n)}\,z_i^{\#\sigma_n}}{(z_i+\rho^J+q-2)^{2(2^n-1)}\,(z_i+q-1)},\qquad \forall n\in\mathbb{N}. \tag{6.25}$$
In this section we shall prove the existence of phase transitions. Namely one has the following Theorem 6.12 Assume that |q − 1|p < 1, |ρ J |p ≤ |q − 1|2p . Then for generalized 2 one has: p-adic Gibbs measures μ1 , μ2 , μ3 of q-state Potts model on + (i) the measure μ1 is bounded; (ii) the measures μ0 and μ2 are unbounded. Moreover, there is a strong phase transition. Proof According to Corollary 6.3 conditions |q − 1|p < 1 and |ρ J |p ≤ |q − 1|2p imply the existence of three translation-invariant measures μ0 , μ1 and μ2 . Then from (6.25) with (6.8), (6.9) we obtain H (σ )
|μ1 (σ )|p =
|ρ|p n
· |z1 |#σ p
|q − 1|p
≤ |h0 |2p ,
(6.26)
which implies that the measure $\mu_1$ is bounded. Similarly, from (6.25) with (6.8), (6.10) we find
\[
|\mu_2(\sigma)|_p = \frac{|\rho|_p^{H_n(\sigma)}}{|q-1|_p^{2(2^n-1)}} \ge p^{2(2^n-1)} \cdot |\rho|_p^{H_n(\sigma)}. \tag{6.27}
\]
Now let us choose $\sigma_{0,n} \in \Omega_{V_{2n}}$ as follows:
\[
\sigma_{0,n}(x) = \begin{cases} 1, & x \in W_{2k}, \\ 0, & x \in W_{2k-1}, \end{cases} \qquad 1 \le k \le n.
\]
Then one can see that $H_n(\sigma_{0,n}) = 0$; therefore it follows from (6.27) that
\[
|\mu_2(\sigma_{0,n})|_p \ge p^{2(2^n-1)} \to \infty \quad \text{as } n \to \infty.
\]
This yields that the measure $\mu_2$ is not bounded.
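The two combinatorial facts used here (the alternating configuration has no neighbouring pair with equal spins, and the exponent $2(2^n-1)$ equals the number of edges of the binary tree truncated at level $n$) can be checked by brute force. This is a minimal sketch under stated assumptions: vertices are encoded as paths from the root, the root is given spin 1 (the source does not specify it), and $H_n$ is assumed to count, up to the coupling constant, the nearest-neighbour pairs with equal spins. The function names are hypothetical.

```python
def levels(depth, k=2):
    """Levels W_0, ..., W_depth of the rooted k-ary tree; a vertex is its path from the root."""
    W = [[()]]
    for _ in range(depth):
        W.append([v + (c,) for v in W[-1] for c in range(k)])
    return W

def sigma0(v):
    """Alternating configuration: 1 on even levels, 0 on odd levels (root value is an assumption)."""
    return 1 if len(v) % 2 == 0 else 0

def edge_stats(depth, k=2):
    """Return (number of edges up to the given depth, number of edges with equal spins)."""
    edges = agree = 0
    for level in levels(depth, k)[1:]:
        for v in level:
            parent = v[:-1]          # the unique neighbour one level closer to the root
            edges += 1
            agree += int(sigma0(v) == sigma0(parent))
    return edges, agree

# Depth 2n = 4: the edge count is 2(2^4 - 1) = 30 and no edge carries equal spins.
example_stats = edge_stats(4)
```

Since no edge contributes, the factor $|\rho|_p^{H_n(\sigma_{0,n})}$ equals 1 and only the power of $p$ remains in the lower bound.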
Chaos in p-adic Statistical Lattice Models: Potts Model
Let us consider the measure $\mu_0$. Similarly, we obtain
\[
|\mu_0(\sigma)|_p = \frac{|\rho|_p^{H_n(\sigma)}}{|q-1|_p^{2(2^n-1)}} \ge p^{2(2^n-1)} \cdot |\rho|_p^{H_n(\sigma)}, \tag{6.28}
\]
so we immediately find that $|\mu_0(\sigma_{0,n})|_p \to \infty$ as $n \to \infty$. It follows from (6.27), (6.28) that
\[
\left| \frac{\mu_0(\sigma)}{\mu_2(\sigma)} \right|_p = 1.
\]
Now let us compare $\mu_1$ and $\mu_2$. From (6.26), (6.27) with (6.8) one finds
\[
|\mu_1(\sigma_{0,n}) \mu_2(\sigma_{0,n})|_p = \frac{|z_1|_p^{\#\sigma_{0,n}}}{|q-1|_p^{2(4^n-1)}} = |q-1|_p^{2(4^n - (4^n-1))} = |q-1|_p^2. \tag{6.29}
\]
This implies that $|\mu_1(\sigma_{0,n})|_p \to 0$ as $n \to \infty$. The last relation yields the existence of the strong phase transition.
Remark 6.13 From Proposition 6.7 and Theorem 6.12 we conclude that for $|q-1|_p < 1$, $|\rho^J|_p \le |q-1|_p^2$, the attractivity of the fixed point $x_1$ yields the boundedness of the measure $\mu_1$. The measures $\mu_0$ and $\mu_2$ are unbounded, while the corresponding fixed points are repelling. Therefore, we may at least predict the existence of the strong phase transition by looking at the behavior of the fixed points of the associated dynamical system.

Now assume that $|q-1|_p = 1$. In this case, the solutions $z_1$ and $z_2$ may not exist (see Theorems 6.2 and 6.3). Therefore, we suppose the existence of such solutions. Using an argument similar to the one above, we can prove the following result.

Theorem 6.14 Assume that $|q-1|_p = 1$ and the measures $\mu_1$, $\mu_2$ for the p-adic ferromagnetic q-state Potts model exist. Then the measures $\mu_k$ ($k = 0, 1, 2$) are bounded. In this case, there is a quasi phase transition.

Again, Proposition 6.7 together with Theorem 6.14 implies that the neutrality of the fixed points yields the occurrence of the quasi phase transition.
7 Chaotic Behaviour of Potts-Bethe Mapping

In the previous sections, for the q-state Potts model, the set of all translation-invariant generalized p-adic Gibbs measures on $\Gamma_+^k$ was described when $k = 2$. Similarly, we can describe the set of all fixed points of (6.1) when $k \ge 3$. In this section we give another "good way" to find periodic points of (6.1) (which define periodic generalized p-adic Gibbs measures) for arbitrary $k \in \mathbb{N}$. We notice that for the Potts model the existence of a 2-periodic Gibbs measure was investigated in [107]. However, it is technically very difficult to find other types of periodic Gibbs measures. Therefore, we are going to use another approach based on the chaoticity of the Potts-Bethe mapping. In order to understand the results on the existence of $H_m$-periodic generalized p-adic Gibbs measures for any $m \ge 1$, we need some basic notions on p-adic subshifts.
7.1 p-adic Sub-Shift

Let $f : X \to \mathbb{Q}_p$ be a map from a compact open set $X$ of $\mathbb{Q}_p$ into $\mathbb{Q}_p$. We assume that (i) $f^{-1}(X) \subset X$; (ii) $X = \bigcup_{j \in I} B_r(a_j)$ can be written as a finite disjoint union of balls of centers $a_j$ and of the same radius $r$ such that for each $j \in I$ there is an integer $\tau_j \in \mathbb{Z}$ such that
\[
|f(x) - f(y)|_p = p^{\tau_j} |x - y|_p, \qquad x, y \in B_r(a_j). \tag{7.1}
\]
For such a map $f$, define its Julia set by
\[
J_f = \bigcap_{n=0}^{\infty} f^{-n}(X). \tag{7.2}
\]
It is clear that $f^{-1}(J_f) = J_f$ and then $f(J_f) \subset J_f$. The triple $(X, J_f, f)$ is called a p-adic weak repeller if all $\tau_j$ in (7.1) are nonnegative, but at least one is positive. We call it a p-adic repeller if all $\tau_j$ in (7.1) are positive. For any $i \in I$, we let
\[
I_i := \{ j \in I : B_r(a_j) \cap f(B_r(a_i)) \ne \emptyset \} = \{ j \in I : B_r(a_j) \subset f(B_r(a_i)) \}
\]
(the second equality holds because of the expansiveness and of the ultrametric property). Then define a matrix $A = (a_{ij})_{I \times I}$, called the incidence matrix, as follows:
\[
a_{ij} = \begin{cases} 1, & \text{if } j \in I_i; \\ 0, & \text{if } j \notin I_i. \end{cases}
\]
If $A$ is irreducible, we say that $(X, J_f, f)$ is transitive. Here the irreducibility of $A$ means that for any pair $(i, j) \in I \times I$ there is a positive integer $m$ such that $a_{ij}^{(m)} > 0$, where $a_{ij}^{(m)}$ is the $(i,j)$ entry of the matrix $A^m$.
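The irreducibility condition on the incidence matrix is easy to test mechanically. The sketch below is an illustration only (the function names are hypothetical): it uses the standard fact that, for an $n \times n$ matrix, it suffices to inspect the powers $A^1, \dots, A^n$, and it works with 0/1 entries so that only the positivity pattern is tracked.

```python
def bool_mul(A, B):
    """Boolean matrix product: entry (i, j) is 1 iff some l has A[i][l] and B[l][j] nonzero."""
    n = len(A)
    return [[int(any(A[i][l] and B[l][j] for l in range(n))) for j in range(n)]
            for i in range(n)]

def is_irreducible(A):
    """True iff for every pair (i, j) some power A^m with 1 <= m <= n has a positive (i, j) entry."""
    n = len(A)
    reach = [[0] * n for _ in range(n)]
    P = [row[:] for row in A]            # P runs through the patterns of A^1, ..., A^n
    for _ in range(n):
        reach = [[int(reach[i][j] or P[i][j]) for j in range(n)] for i in range(n)]
        P = bool_mul(P, A)
    return all(all(row) for row in reach)

full_shift = [[1, 1], [1, 1]]     # every ball is mapped over every ball
two_pieces = [[1, 0], [0, 1]]     # two invariant pieces, not transitive
swap = [[0, 1], [1, 0]]           # irreducible although A itself has zero entries
```

The matrix `swap` shows why the definition quantifies over powers of $A$: transitivity does not require $A$ itself to be positive.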
Given $I$ and the irreducible incidence matrix $A$ as above, we denote
\[
\Sigma_A = \{ (x_k)_{k \ge 0} : x_k \in I,\ A_{x_k, x_{k+1}} = 1,\ k \ge 0 \},
\]
which is the corresponding subshift space, and let $\sigma$ be the shift transformation on $\Sigma_A$. We equip $\Sigma_A$ with a metric $d_f$ depending on the dynamics, which is defined as follows. First, for $i, j \in I$, $i \ne j$, let $\kappa(i,j)$ be the integer such that $|a_i - a_j|_p = p^{-\kappa(i,j)}$. It is clear that $\kappa(i,j) < \tau$, where $\tau$ denotes the integer with $r = p^{-\tau}$. By the ultrametric inequality, we have
\[
|x - y|_p = |a_i - a_j|_p, \qquad i \ne j,\ \forall x \in B_r(a_i),\ \forall y \in B_r(a_j).
\]
For $x = (x_0, x_1, \dots, x_n, \dots) \in \Sigma_A$ and $y = (y_0, y_1, \dots, y_n, \dots) \in \Sigma_A$, define
\[
d_f(x, y) = \begin{cases} p^{-\tau_{x_0} - \tau_{x_1} - \cdots - \tau_{x_{n-1}} - \kappa(x_n, y_n)}, & \text{if } n \ne 0, \\ p^{-\kappa(x_0, y_0)}, & \text{if } n = 0, \end{cases}
\]
where $n = n(x, y) = \min\{ i \ge 0 : x_i \ne y_i \}$. It is clear that $d_f$ defines the same topology as the classical metric, which is defined by $d(x, y) = p^{-n(x,y)}$.

Theorem 7.2 ([36]) Let $(X, J_f, f)$ be a transitive p-adic weak repeller with incidence matrix $A$. Then the dynamics $(J_f, f, |\cdot|_p)$ is isometrically conjugate to the shift dynamics $(\Sigma_A, \sigma, d_f)$.

Remark 7.3 Other types of generalizations of the given construction have been studied in [94, 95, 112].
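The metric $d_f$ can be evaluated on finite prefixes of symbolic sequences. The following is a minimal sketch, with assumptions chosen purely for illustration: sequences are given as tuples of symbols, `tau` is a dictionary of the expansion exponents $\tau_j$, `kappa` is a user-supplied function, and prefixes on which no difference is found are treated as equal (distance 0).

```python
from fractions import Fraction

def d_f(x, y, p, tau, kappa):
    """Distance d_f between two symbolic sequences given by finite prefixes x and y."""
    for n, (a, b) in enumerate(zip(x, y)):
        if a != b:
            # n is the first index where the sequences differ
            exponent = sum(tau[s] for s in x[:n]) + kappa(a, b)
            return Fraction(p) ** (-exponent)
    return Fraction(0)   # no difference found within the given prefixes

# Illustrative data (hypothetical): two symbols, p = 3, centers at distance 1.
tau = {0: 1, 1: 2}
kappa = lambda i, j: 0

d_far = d_f((0,), (1,), 3, tau, kappa)                # differ at n = 0
d_near = d_f((0, 1, 0), (0, 1, 1), 3, tau, kappa)     # differ at n = 2
```

Sequences sharing a longer common prefix are closer, and each shared symbol $x_i$ contributes its own factor $p^{-\tau_{x_i}}$, which is how the metric encodes the local expansion rates of $f$.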
7.4 Dynamics of p-adic Potts-Bethe Mapping

In this section, we study the dynamics of the Potts-Bethe mapping (6.1). In what follows we suppose that $\rho^J \in \mathcal{E}_p$ and $p \ne 2$. The results of this section are given in [97, 98]. There are two cases for $q$: either $|q|_p = 1$ or $|q|_p < 1$. We consider both cases separately.

Lemma 7.5 Let $p \ge 3$ and $|q|_p = 1$. Then $f_{\rho,k}(\mathcal{E}_p) \subset \mathcal{E}_p$ and $f_{\rho,k}$ is a contraction on $\mathcal{E}_p$.

Let us denote
\[
B^{(1)}_{q,\rho} = \{ z \in \mathbb{Q}_p : |z + q - 1|_p > |\rho^J - 1|_p \},
\]
\[
B^{(2)}_{q,\rho} = \{ z \in \mathbb{Q}_p : |z + q - 1|_p = |\rho^J - 1|_p \},
\]
\[
B^{(3)}_{q,\rho} = \{ z \in \mathbb{Q}_p : |z + q - 1|_p < |\rho^J - 1|_p \}.
\]
Note that $\mathcal{E}_p \subset B^{(1)}_{q,\rho}$.
Lemma 7.6 Let $p \ge 3$ and $|q|_p = 1$. Then $f_{\rho,k}(B^{(1)}_{q,\rho}) \subset \mathcal{E}_p$.

Let $|q - 1|_p = p^{-s}$, $s \ge 0$. We define the set
\[
Sol_p(z^k + q - 1) = \left\{ p^{-s/k} \xi \in \mathbb{F}_p : \xi^k + q - 1 \equiv 0 \ (\mathrm{mod}\ p^{s+1}) \right\}.
\]
If $Sol_p(z^k + q - 1) \ne \emptyset$, we then denote $\kappa_p := |Sol_p(z^k + q - 1)|$.

Remark 7.7 According to Theorem 2.5, the condition $Sol_p(z^k + q - 1) \ne \emptyset$ implies $\sqrt[k]{1 - q} \in \mathbb{Q}_p$.

Lemma 7.8 Let $p \ge 3$ and $|q - 1|_p = p^{-s}$, $s \ge 0$. Then $Sol_p(z^k + q - 1) \ne \emptyset$ if and only if there exists a p-adic integer $\eta$ such that $|\eta^k + q - 1|_p < |q - 1|_p$.

Proof From the definition of the set $Sol_p(z^k + q - 1)$ we can easily see that $Sol_p(z^k + q - 1) \ne \emptyset$ if and only if $|\eta^k + q - 1|_p \le \frac{1}{p^{s+1}}$ for some p-adic number $\eta$. Since $|q - 1|_p = p^{-s}$, one gets $|\eta^k + q - 1|_p < |q - 1|_p$. From the non-Archimedean norm's property we obtain $|\eta^k|_p = |q - 1|_p$, which means that $\eta$ is a p-adic integer.

Proposition 7.9 Let $p \ge 3$ and $|\rho^J - 1|_p < |q - 1|_p = p^{-s}$, $s \ge 0$. Assume that $Sol_p(z^k + q - 1) = \emptyset$. Then $f_{\rho,k}$ has a unique fixed point $z_0 = 1$. Moreover, $A(z_0) = \mathrm{Dom}(f_{\rho,k})$.

Proof It is clear that $z_0 = 1$ is a fixed point of $f_{\rho,k}$. One has
\[
\frac{\partial}{\partial z} f_{\rho,k}(z_0) = \frac{k(\rho^J - 1)}{\rho^J - 1 + q}.
\]
The last one with $|q|_p = 1$, $|k|_p \le 1$ implies $|f'_{\rho,k}(z_0)|_p \le |\rho^J - 1|_p < 1$. It means that $z_0$ is attracting. Let us show that $A(z_0) = \mathrm{Dom}(f_{\rho,k})$. Indeed, due to $Sol_p(z^k + q - 1) = \emptyset$, from Lemma 7.8 one finds
\[
|z^k + q - 1|_p \ge |q - 1|_p \quad \text{for any } z \in \mathbb{Q}_p.
\]
In particular, for any $z \in \mathrm{Dom}(f_{\rho,k})$ one has $|f_{\rho,k}(z) + q - 1|_p \ge |q - 1|_p$. Since $|q - 1|_p > |\rho^J - 1|_p$, we obtain $f_{\rho,k}(z) \in B^{(1)}_{q,\rho}$. According to Lemma 7.6 one gets $f^2_{\rho,k}(z) \in \mathcal{E}_p$. Finally, the contractivity of $f_{\rho,k}$ on $\mathcal{E}_p$ and $z_0 \in \mathcal{E}_p$ yield that $f^n_{\rho,k}(z) \to z_0$ as $n \to \infty$. The arbitrariness of $z$ means that $A(z_0) = \mathrm{Dom}(f_{\rho,k})$.
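For integer $q$, the set $Sol_p(z^k + q - 1)$ and the number $\kappa_p$ can be found by direct enumeration. The sketch below is illustrative only: it reads the definition as the set of residues $\xi \in \{0, \dots, p-1\}$ with $p^s \xi^k + q - 1 \equiv 0 \pmod{p^{s+1}}$, where $s$ is the p-adic valuation of $q - 1$ (this reading is an assumption about the intended meaning), and the function name `sol_p` is hypothetical.

```python
def sol_p(p, k, q):
    """Return (s, solutions): s = v_p(q - 1) and all residues xi mod p with
    p^s * xi^k + q - 1 congruent to 0 modulo p^(s+1). Assumes integer q != 1."""
    if q == 1:
        raise ValueError("q - 1 must be nonzero")
    s, m = 0, q - 1
    while m % p == 0:
        m //= p
        s += 1
    modulus = p ** (s + 1)
    solutions = [xi for xi in range(p) if (p ** s * xi ** k + q - 1) % modulus == 0]
    return s, solutions

# Illustrative parameter choices (not taken from the source):
case_three = sol_p(7, 3, 2)    # gcd(3, 6) = 3, three solutions
case_one = sol_p(5, 3, 2)      # gcd(3, 4) = 1, a single solution
case_empty = sol_p(3, 2, 4)    # s = 1 and no solution
case_two = sol_p(5, 2, 6)      # s = 1 and two solutions
```

Only the parameter choices with at least two solutions ($\kappa_p \ge 2$) can lead to a shift on more than one symbol; an empty solution set corresponds to the situation of Proposition 7.9.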
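Let us consider the case $Sol_p(z^k + q - 1) \ne \emptyset$. For a given $q \ge 3$ with $|q - 1|_p = p^{-s}$, $s \ge 0$, we define the set
\[
X = \bigcup_{i=1}^{\kappa_p} B_r(z_i), \tag{7.3}
\]
which is a finite union of disjoint balls. Here $r = \left| p^{\frac{s+k}{k}} (\rho^J - 1) \right|_p$ and $z_i$ is defined by (7.4) if $s = 0$ and by (7.5) if $s \ne 0$.

Case $s = 0$.
\[
z_i = \begin{cases} 1 - q + \eta_i(\rho^J - 1), & \text{if } \xi_i + q - 1 \not\equiv 0 \ (\mathrm{mod}\ p), \\ 1 - q, & \text{if } \xi_i + q - 1 \equiv 0 \ (\mathrm{mod}\ p), \end{cases} \tag{7.4}
\]
where $\eta_i$ is a solution of $\eta_i(\xi_i - 1) + \xi_i + q - 1 \equiv 0 \ (\mathrm{mod}\ p)$ for a given $\xi_i \in Sol_p(z^k + q - 1)$, $i = 1, \dots, \kappa_p$.

Case $s > 0$.
\[
z_i = 1 - q + p^{\frac{s}{k}} \xi_i (\rho^J - 1), \tag{7.5}
\]
where $\xi_i \in Sol_p(z^k + q - 1)$, $i = 1, \dots, \kappa_p$.

Proposition 7.10 Let $p \ge 3$ and $|\rho^J - 1|_p < |q - 1|_p$. Assume that $Sol_p(z^k + q - 1) \ne \emptyset$. Then one has $A(z_0) \supset \mathrm{Dom}(f_{\rho,k}) \setminus X$.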
Proof Take any $z \in \mathrm{Dom}(f_{\rho,k}) \setminus X$. Note that $B^{(1)}_{q,\rho} \cap X = \emptyset$. So, first we consider the case $z \notin B^{(1)}_{q,\rho} \cup X$. Then there exists a p-adic integer $\eta$ such that $z = 1 - q + \eta(\rho^J - 1)$. Now consider two cases with respect to $s$.

Case $s = 0$. Suppose that $\eta(\xi - 1) + \xi + q - 1 \not\equiv 0 \ (\mathrm{mod}\ p)$ for any solution $\xi$ of $y^k + q - 1 \equiv 0 \ (\mathrm{mod}\ p)$. Then we get
\[
\left( \frac{1 - q + \eta}{1 + \eta} \right)^k + q - 1 \not\equiv 0 \ (\mathrm{mod}\ p).
\]
The last one with $|\rho^J - 1|_p < 1$ implies that
\[
|f_{\rho,k}(z) + q - 1|_p = \left| \left( \frac{1 - q + \eta + \eta(\rho^J - 1)}{1 + \eta} \right)^k + q - 1 \right|_p \ge 1.
\]
Hence, $f_{\rho,k}(z) \in B^{(1)}_{q,\rho}$.

Case $s > 0$. First, we show that $|f_{\rho,k}(z)|_p \ne |q - 1|_p$ if $|\eta|_p \ne p^{-s/k}$. Assume that $|\eta|_p \ne p^{-s/k}$. If $|\eta|_p \ge 1$ we find
\[
|f_{\rho,k}(z)|_p = \left| \frac{1 - q + \eta \rho^J}{1 + \eta} \right|_p^k = \left| \frac{\eta}{1 + \eta} \right|_p^k \ge 1.
\]
Since $|q - 1|_p < 1$, one gets $|f_{\rho,k}(z)|_p > |q - 1|_p$.

Let $p^{-s/k} \ne |\eta|_p < 1$. Since $|q - 1|_p < p^{-s/k}$, we get
\[
\left| \frac{1 - q + \eta \rho^J}{1 + \eta} \right|_p = |1 - q + \eta \rho^J|_p \ne p^{-s/k}.
\]
Using the last relation we have $|f_{\rho,k}(z)|_p \ne |q - 1|_p$.

Thus, we have shown that $|f_{\rho,k}(z)|_p \ne |q - 1|_p$ if $|\eta|_p \ne p^{-s/k}$. By the non-Archimedean property of the norm, this yields
\[
|f_{\rho,k}(z) + q - 1|_p \ge |q - 1|_p.
\]
Since $|\rho^J - 1|_p < |q - 1|_p$, the last inequality yields that $f_{\rho,k}(z) \in B^{(1)}_{q,\rho}$.

Let us assume that $|\eta|_p = p^{-s/k}$. Then there exists a p-adic integer $\xi$ with $|\xi|_p = 1$ such that $\eta = p^{s/k} \xi$. One has
\[
\rho^J + \frac{1 - q}{p^{s/k} \xi} \in \mathcal{E}_p, \qquad 1 + p^{s/k} \xi \in \mathcal{E}_p.
\]
Then according to Lemma 2.2 we obtain
\[
\left( \frac{\rho^J + \frac{1 - q}{p^{s/k} \xi}}{1 + p^{s/k} \xi} \right)^k \in \mathcal{E}_p.
\]
Hence,
\[
p^s \xi^k \left( \frac{\rho^J + \frac{1 - q}{p^{s/k} \xi}}{1 + p^{s/k} \xi} \right)^k \equiv p^s \xi^k \ (\mathrm{mod}\ p^{s+1}).
\]
From $\xi \notin Sol_p(z^k + q - 1)$ one finds $|p^s \xi^k + q - 1|_p = |q - 1|_p$. Consequently, we have
\[
|f_{\rho,k}(z) + q - 1|_p = |q - 1|_p,
\]
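which yields $f_{\rho,k}(z) \in B^{(1)}_{q,\rho}$.

Thus, we have shown that $f_{\rho,k}(z) \in B^{(1)}_{q,\rho}$ for any $z \notin B^{(1)}_{q,\rho} \cup X$. On the other hand, according to Lemmas 7.5 and 7.6 one has $A(z_0) \supset B^{(1)}_{q,\rho}$. So, we conclude that $A(z_0) \supset \mathrm{Dom}(f_{\rho,k}) \setminus X$.

Proposition 7.11 Let $p \ge 3$ and $|\rho^J - 1|_p < |q - 1|_p = p^{-s}$, $s \ge 0$. If $Sol_p(z^k + q - 1) \ne \emptyset$, then one has
\[
|f_{\rho,k}(z) - f_{\rho,k}(\bar{z})|_p = \frac{|k|_p}{p^{\frac{s(k-1)}{k}} |\rho^J - 1|_p} \, |z - \bar{z}|_p, \quad \text{for any } z, \bar{z} \in B_r(z_i). \tag{7.6}
\]

Proof For any pair $(z, \bar{z}) \in \mathbb{Q}_p^2$ we have
\[
f_{\rho,k}(z) - f_{\rho,k}(\bar{z}) = \frac{(\rho^J - 1)(\rho^J + q - 1) \sum_{j=0}^{k-1} [R(z) Q(\bar{z})]^{k-1-j} [R(\bar{z}) Q(z)]^j}{[Q(z) Q(\bar{z})]^k} \, (z - \bar{z}), \tag{7.7}
\]
where
\[
R(z) = \rho^J z + q - 1, \qquad Q(z) = z + \rho^J + q - 2.
\]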
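The identity (7.7) is purely algebraic, so it can be checked with exact rational arithmetic, and the scaling law (7.6) can be observed on a concrete pair of points. The sketch below is an illustration under assumed parameters that do not come from the source: $p = 7$, $k = 3$, $q = 2$ and $\rho^J = 8$ (written `theta`), for which $s = 0$, $\xi_i = 3$, $\eta_i = 5$ and $z_i = 34$; the names `potts_bethe` and `rhs_77` are hypothetical.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a nonzero rational number."""
    x = Fraction(x)
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v

def potts_bethe(theta, q, k):
    """Return R, Q and f = (R/Q)^k, with theta standing for rho^J."""
    R = lambda z: theta * z + q - 1
    Q = lambda z: z + theta + q - 2
    f = lambda z: (R(z) / Q(z)) ** k
    return R, Q, f

def rhs_77(theta, q, k, z, w):
    """Right-hand side of (7.7) for the pair (z, w)."""
    R, Q, _ = potts_bethe(theta, q, k)
    S = sum((R(z) * Q(w)) ** (k - 1 - j) * (R(w) * Q(z)) ** j for j in range(k))
    return (theta - 1) * (theta + q - 1) * S / (Q(z) * Q(w)) ** k * (z - w)

p, k, q, theta = 7, 3, 2, Fraction(8)
z, w = Fraction(34), Fraction(83)          # both lie in the ball of radius 7^(-2) around 34
_, _, f = potts_bethe(theta, q, k)
lhs = f(z) - f(w)
identity_holds = lhs == rhs_77(theta, q, k, z, w)
# (7.6) with s = 0, written in valuations: v(f(z) - f(w)) = v(k) - v(theta - 1) + v(z - w)
val_lhs = vp(lhs, p)
val_expected = vp(k, p) - vp(theta - 1, p) + vp(z - w, p)
```

In this example the distance between the two points grows by the factor $|k|_p / |\rho^J - 1|_p = 7$ under one application of the map, which is the expansion that makes the ball system a repeller.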
Case $s = 0$. Pick any $z \in B_r(z_i)$. Suppose that $z_i \ne 1 - q$. Then there exists an $\alpha_z \in p\mathbb{Z}_p$ such that $z = 1 - q + (\rho^J - 1)(\eta_i + \alpha_z)$. We have
\[
R(z) = (\rho^J - 1)\left( 1 - q + \eta_i + \alpha_z + (\rho^J - 1)(\eta_i + \alpha_z) \right), \qquad Q(z) = (\rho^J - 1)(1 + \eta_i + \alpha_z).
\]
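It follows from $q \not\equiv 0 \ (\mathrm{mod}\ p)$ and the definition of $\eta_i$ that $|\eta_i + 1|_p = |\eta_i + 1 - q|_p = 1$. So, there exist p-adic integers $\beta_z, \gamma_z \in p\mathbb{Z}_p$ such that
\[
R(z) = (\rho^J - 1)(1 - q + \eta_i)(1 + \beta_z), \qquad Q(z) = (\rho^J - 1)(1 + \eta_i)(1 + \gamma_z).
\]
It follows that $|R(z)|_p = |Q(z)|_p = |\rho^J - 1|_p$. Plugging them into (7.7) and according to Lemma 2.2 one finds
\[
|f_{\rho,k}(z) - f_{\rho,k}(\bar{z})|_p = \frac{|k|_p}{|\rho^J - 1|_p} \, |z - \bar{z}|_p, \quad \text{for any } z, \bar{z} \in B_r(z_i).
\]
Assume that $z_i = 1 - q$. Then for any $z \in B_r(z_i)$ there exists an $\alpha_z \in p\mathbb{Z}_p$ such that $z = 1 - q + \alpha_z(\rho^J - 1)$. In this case, we get
\[
R(z) = (\rho^J - 1)(1 - q)\left( 1 + \frac{\alpha_z \rho^J}{1 - q} \right), \qquad Q(z) = (\rho^J - 1)(1 + \alpha_z).
\]
According to Lemma 2.3, from (7.7) one has
\[
f_{\rho,k}(z) - f_{\rho,k}(\bar{z}) = \frac{k(1 - q)^{k-1}}{\rho^J - 1} \, \gamma \, (z - \bar{z}), \qquad \gamma \in \mathcal{E}_p,
\]
for any $z, \bar{z} \in B_r(z_i)$. Hence,
\[
|f_{\rho,k}(z) - f_{\rho,k}(\bar{z})|_p = \frac{|k|_p}{|\rho^J - 1|_p} \, |z - \bar{z}|_p.
\]
Case $s > 0$. Then for $z' \in B_r(z_i)$ there exists a p-adic integer $\alpha_{z'} \in p\mathbb{Z}_p$ such that
\[
z' = 1 - q + p^{\frac{s}{k}} (\xi_i + \alpha_{z'})(\rho^J - 1).
\]
We have
\[
R(z') = (\rho^J - 1)\left( 1 - q + p^{\frac{s}{k}} (\xi_i + \alpha_{z'}) \rho^J \right), \qquad Q(z') = (\rho^J - 1)\left( 1 + p^{\frac{s}{k}} (\xi_i + \alpha_{z'}) \right).
\]
There exist p-adic integers $\beta_{z'}, \gamma_{z'} \in p\mathbb{Z}_p$ such that
\[
R(z') = p^{\frac{s}{k}} \xi_i (\rho^J - 1)(1 + \beta_{z'}), \qquad Q(z') = (\rho^J - 1)(1 + \gamma_{z'}).
\]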
Putting these into (7.7) and using the non-Archimedean norm's property one gets
\[
|f_{\rho,k}(z') - f_{\rho,k}(z'')|_p = \frac{|k|_p}{p^{\frac{s(k-1)}{k}} |\rho^J - 1|_p} \, |z' - z''|_p, \quad \text{for any } z', z'' \in B_r(z_i).
\]
This completes the proof.

Now we assume that $|q - 1|_p = p^{-s}$, $s \ge 0$ and $\kappa_p \ge 2$.

Proposition 7.12 Let $p \ge 3$ and $\sqrt[k]{1 - q} \in \mathbb{Q}_p$. Let $X$ be the set defined by (7.3). Then the triple $(X, J_{f_{\rho,k}}, f_{\rho,k})$ is a p-adic repeller if and only if
\[
|k|_p > p^{\frac{s(k-1)}{k}} |\rho^J - 1|_p.
\]
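Proof According to Proposition 7.11 it is enough to show that $f^{-1}_{\rho,k}(X) \subset X$. Let us assume that $\sqrt[k]{1 - q} \in \mathbb{Q}_p$. The function $f_{\rho,k}$ has the following inverse branches on $X$:
\[
g_{\rho^J, i}(z) = \frac{(\rho^J + q - 2)\, \hat{\xi}_i \sqrt[k]{z_*} + 1 - q}{\rho^J - \hat{\xi}_i \sqrt[k]{z_*}}, \qquad i = 1, \dots, \kappa_p,
\]
where $\hat{\xi}_i = \sqrt[k]{1 - q}$ and $\sqrt[k]{z_*} = \sqrt[k]{\frac{z}{1 - q}} \in \mathcal{E}_p$.

We show that $g_{\rho^J, i}(z) \in B_r(z_i)$ for any $z \in X$.

Case $s = 0$. Take any $z \in X$. Then we get
\[
g_{\rho^J, i}(z) - z_i = \frac{(\rho^J - 1)\left[ \eta_i(\hat{\xi}_i - 1) + \hat{\xi}_i + q - 1 + (\eta_i + 1)\hat{\xi}_i(\sqrt[k]{z_*} - 1) - \eta_i(\rho^J - 1) \right]}{1 - \hat{\xi}_i - \hat{\xi}_i(\sqrt[k]{z_*} - 1) + \rho^J - 1}. \tag{7.8}
\]
One can see that
\[
|\eta_i(\hat{\xi}_i - 1) + \hat{\xi}_i + q - 1|_p < 1, \quad |\sqrt[k]{z_*} - 1|_p < 1, \quad |\rho^J - 1|_p < 1, \quad |\hat{\xi}_i|_p = |\eta_i|_p = |1 - \hat{\xi}_i|_p = 1.
\]
Plugging these into (7.8) and using the strong triangle inequality we obtain
\[
|g_{\rho^J, i}(z) - z_i|_p < |\rho^J - 1|_p,
\]
which implies that $g_{\rho^J, i}(z) \in B_r(z_i)$. Hence, $f^{-1}_{\rho,k}(X) \subset X$.

Case $s > 0$. Take any $z \in X$. Then we have
\[
g_{\rho^J, i}(z) - z_i = \frac{(\rho^J - 1)\left[ \hat{\xi}_i \sqrt[k]{z_*} + q - 1 - p^{\frac{s}{k}} \xi_i (\rho^J - \hat{\xi}_i \sqrt[k]{z_*}) \right]}{\rho^J - \hat{\xi}_i \sqrt[k]{z_*}}. \tag{7.9}
\]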
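Noting $\hat{\xi}_i = p^{\frac{s}{k}} \xi_i \alpha$, $\alpha \in \mathcal{E}_p$, we have
\[
\left| \rho^J - \hat{\xi}_i \sqrt[k]{z_*} \right|_p = 1, \qquad \left| \hat{\xi}_i \sqrt[k]{z_*} - p^{\frac{s}{k}} \xi_i \rho^J \right|_p < p^{-\frac{s}{k}}, \qquad \left| p^{\frac{s}{k}} \xi_i \hat{\xi}_i \sqrt[k]{z_*} \right|_p = p^{-\frac{2s}{k}}.
\]
Putting the last ones into (7.9) one finds
\[
|g_{\rho^J, i}(z) - z_i|_p \le p^{-\frac{s}{k} - 1} |\rho^J - 1|_p,
\]
which yields $g_{\rho^J, i}(z) \in B_r(z_i)$ for any $z \in X$.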
From the proof of Proposition 7.12, we immediately have the following.

Corollary 7.13 Let $p \ge 3$ and $\sqrt[k]{1 - q} \in \mathbb{Q}_p$. Let $X$ be the set defined by (7.3). If $|k|_p > p^{\frac{s(k-1)}{k}} |\rho^J - 1|_p$, then for any $i, j \in \{1, 2, \dots, \kappa_p\}$ one has $B_r(z_i) \subset f_{\rho,k}(B_r(z_j))$.

Theorem 7.14 Let $|q - 1|_p = p^{-s}$, $s \ge 0$ and $\sqrt[k]{1 - q} \in \mathbb{Q}_p$. Let $X$ be the set defined by (7.3). If $\kappa_p \ge 2$ and $|k|_p > p^{\frac{s(k-1)}{k}} |\rho^J - 1|_p$, then $(X, J_f, f)$ is a transitive p-adic repeller, i.e. this triple is topologically conjugate to the full shift dynamics of $\kappa_p$ symbols.

Proof We have shown that under the conditions $\sqrt[k]{1 - q} \in \mathbb{Q}_p$, $\kappa_p \ge 2$ and $|k|_p > p^{\frac{s(k-1)}{k}} |\rho^J - 1|_p$ the triple $(X, J_{f_{\rho,k}}, f_{\rho,k})$ is a p-adic repeller. Moreover, under these conditions the incidence matrix $A$ (of size $\kappa_p \times \kappa_p$) for the function $f_{\rho,k} : X \to \mathbb{Q}_p$ has the following form (it follows from Corollary 7.13):
\[
A = \begin{pmatrix} 1 & 1 & \dots & 1 \\ 1 & 1 & \dots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \dots & 1 \end{pmatrix}.
\]
This means that the triple $(X, J_{f_{\rho,k}}, f_{\rho,k})$ is transitive; hence, according to Theorem 7.2, we conclude that the dynamics $(J_{f_{\rho,k}}, f_{\rho,k}, |\cdot|_p)$ is isometrically conjugate to the shift dynamics $(\Sigma_A, \sigma, d_{f_{\rho,k}})$.

Remark 7.15 We stress that all conditions of the formulated theorem are important. If one drops any of them, the desired result may not be achieved. Namely, if $\sqrt[k]{1 - q} \notin \mathbb{Q}_p$, then the dynamical system becomes asymptotically stable. If $|k|_p > p^{\frac{s(k-1)}{k}} |\rho^J - 1|_p$ fails, then we cannot get the desired contracting mappings which provide the existence of the Julia set.
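Conjugacy to a full shift makes the count of periodic points explicit: sequences $x$ with $\sigma^m x = x$ correspond to admissible cyclic words of length $m$, and their number equals the trace of $A^m$. The sketch below is an illustration only, with hypothetical function names; it compares a brute-force count with the trace formula for the all-ones matrix on three symbols and, for contrast, for a matrix with a forbidden transition.

```python
from itertools import product

def count_periodic(A, m):
    """Number of sequences x in the subshift of A with sigma^m(x) = x, by enumeration."""
    n = len(A)
    return sum(
        all(A[w[i]][w[(i + 1) % m]] for i in range(m))   # every cyclic transition is allowed
        for w in product(range(n), repeat=m)
    )

def trace_power(A, m):
    """Trace of A^m computed with plain integer arithmetic."""
    n = len(A)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(m):
        P = [[sum(P[i][l] * A[l][j] for l in range(n)) for j in range(n)] for i in range(n)]
    return sum(P[i][i] for i in range(n))

full3 = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]   # incidence matrix of the full shift on 3 symbols
golden = [[1, 1], [1, 0]]                   # a subshift with one forbidden transition

periodic_full = [count_periodic(full3, m) for m in range(1, 5)]
```

For the full shift on $\kappa_p$ symbols the count is $\kappa_p^m$, so periodic points of every period exist as soon as $\kappa_p \ge 2$.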
This theorem opens new perspectives in investigations of generalized p-adic self-similar sets. We point out that the necessary and sufficient conditions for the existence of $\sqrt[k]{1 - q}$ in $\mathbb{Q}_p$ are given in [102]. In particular, if we substitute $q = 2$ into the Potts-Bethe mapping, it reduces to the Ising mapping, which has been considered in [96]. However, in the mentioned paper, we used different techniques to establish the conjugacy of $f$ (on an appropriate set) to the full shift. In the real case, analogous results with rigorous proofs are not known in the literature. In this direction, only numerical analysis predicts the existence of chaos (see, for example, [9, 18, 77]). The non-Archimedean property of the norm allowed us to prove rigorously the existence of chaos (in Devaney's sense).

Now we suppose that $|q|_p < 1$. It is easy to notice that the function (6.1) is defined on $\mathbb{Q}_p \setminus \{z^{(\infty)}\}$, where $z^{(\infty)} = 2 - q - \rho^J$. For the sake of convenience, we write $\mathrm{Dom}(f_{\rho,k}) := \mathbb{Q}_p \setminus \{z^{(\infty)}\}$. Let us denote
\[
P_{z^{(\infty)}} = \bigcup_{n=1}^{\infty} f^{-n}_{\rho,k}(z^{(\infty)}).
\]
One can see that the set $P_{z^{(\infty)}}$ is at most countable, and could be empty for some $k$, $q$ and $\rho^J$. If it is not empty, then for any $z_0 \in P_{z^{(\infty)}}$ there exists an $n \ge 1$ such that after $n$ iterations the point is "lost", i.e. $f^n_{\rho,k}(z_0) = z^{(\infty)}$. Similarly to Theorem 7.14, we get the following result for $|q|_p < 1$.

Theorem 7.16 Let $k \ge 2$, $|q|_p < 1$ and $z_0^* = 1$. Then the dynamical structure of the system $(\mathbb{Q}_p, f)$ is described as follows:
(A). If $|k|_p \le |q + \rho^J - 1|_p$, then $\mathrm{Fix}(f_{\rho,k}) = \{z_0^*\}$ and $A(z_0^*) = \mathrm{Dom}(f_{\rho,k})$.
(B). Assume that $|k|_p > |q + \rho^J - 1|_p$ and $|\rho^J - 1|_p < |q^2|_p$. Then there exists a nonempty set $J_{f_{\rho,k}} \subset \mathrm{Dom}(f_{\rho,k}) \setminus P_{z^{(\infty)}}$ which is invariant with respect to $f$, and $A(z_0^*) = \mathrm{Dom}(f_{\rho,k}) \setminus \left( P_{z^{(\infty)}} \cup J_{f_{\rho,k}} \right)$. Moreover, if $(k, p-1)$ denotes the greatest common factor of $k$ and $p - 1$, then the following hold:
(B1). if $(k, p-1) = 1$, then there exists $z^* \in \mathrm{Fix}(f_{\rho,k})$ such that $z^* \ne z_0^*$ and $J_{f_{\rho,k}} = \{z^*\}$;
(B2). if $(k, p-1) \ge 2$, then $(J_{f_{\rho,k}}, f_{\rho,k}, |\cdot|_p)$ is a transitive p-adic repeller, i.e. this triple is topologically conjugate to the full shift dynamics of $(k, p-1)$ symbols.
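The case distinction of Theorem 7.16 depends only on a few p-adic norms and on $\gcd(k, p-1)$, so it can be tabulated for rational parameters. The sketch below is illustrative: `theta` stands for $\rho^J$, the parameters are assumed to be rational numbers, the function names are hypothetical, and parameter choices not covered by (A) or (B) are reported as `None`.

```python
from fractions import Fraction
from math import gcd

def norm_p(x, p):
    """p-adic norm of a rational number as an exact Fraction (|0|_p = 0)."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return Fraction(p) ** (-v)

def classify(p, k, q, theta):
    """Return (case, number_of_symbols) following the statement of Theorem 7.16."""
    if norm_p(q, p) >= 1:
        raise ValueError("Theorem 7.16 assumes |q|_p < 1")
    if norm_p(k, p) <= norm_p(q + theta - 1, p):
        return ("A", None)                      # unique fixed point attracting the whole domain
    if norm_p(theta - 1, p) < norm_p(q, p) ** 2:
        g = gcd(k, p - 1)
        return ("B1", None) if g == 1 else ("B2", g)
    return (None, None)                         # not covered by the statement

example_b2 = classify(3, 2, 3, Fraction(28))    # gcd(2, 2) = 2: full shift on 2 symbols
example_a = classify(3, 3, 3, Fraction(28))     # |k|_3 = 1/3 <= |30|_3 = 1/3
example_b1 = classify(5, 3, 5, Fraction(126))   # gcd(3, 4) = 1: a single repelling fixed point
```

The number of symbols in case (B2) is governed by $\gcd(k, p-1)$ alone; the norms only decide which regime applies.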
As a corollary of Theorems 7.14 and 7.16 we can formulate the following result.

Theorem 7.17 Let $p \ge 3$, $k \ge 2$, $q \ge 2$ and $|\rho^J - 1|_p < 1$. Assume that the polynomial $x^k - 1 + q$ has at least two roots in $\mathbb{Q}_p$. Then for any $m \ge 2$ there exist $H_m$-periodic generalized p-adic Gibbs measures for the q-state Potts model on $\Gamma_+^k$ if one of the following statements holds:
(1) $|q|_p = 1$ and $|k|_p > \left| (\rho^J - 1)(q - 1)^{\frac{k-1}{k}} \right|_p$;
(2) $|q|_p < 1$ and $|k^2|_p > |q^2|_p > |\rho^J - 1|_p$.

Remark 7.18 It is worth mentioning that if $k = 3$ and $0 < |q|_p^{(m+1)/m} \le |\theta - 1|_p < |q|_p^{(m+2)/(m+1)} < 1$ for some $m \ge 1$, the chaoticity of the Potts-Bethe mapping has been established in [1]. However, their proof is too technical and is based on examining the locations of the roots of the depressed cubic equation over $\mathbb{Q}_p$ [87, 88, 101, 120].
8 Conclusions

In conclusion, it is worth briefly describing the differences in behavior between the classical (real) and p-adic Potts models on the Cayley tree. In the real case, for the ferromagnetic q-state Potts model ($q \ge 2$) there are $q + 1$ distinct translation-invariant Gibbs measures. Moreover, there are two critical temperatures $0 < T'_c < T_c$ such that: if $T \in (T'_c, T_c]$ there are $q + 1$ extreme Gibbs measures; if $T \le T'_c$, $q$ extreme Gibbs measures coexist; if $T > T_c$ then there is only one Gibbs measure (see [17, 29, 110]). In the p-adic setting, for the same model one can find several p-adic Gibbs measures when $k = 2$ and $p \ge 3$. If $q$ is not divisible by $p$, then there is only one p-adic Gibbs measure [106]. If $q = 2$ (then the model reduces to the Ising model), there is only one p-adic Gibbs measure, i.e. there is no phase transition. If $q$ is divisible by $p$, then several regimes for the existence of phase transitions appear. This is one of the interesting differences between the real and p-adic Potts models.

In the present paper, we reviewed the phase transition problem and its connection with the chaoticity of the RG transformation for the p-adic Potts model over the Cayley tree. We considered a more general notion of p-adic Gibbs measure which depends on a parameter $\rho \in \mathbb{Q}_p$. Such a measure is called a generalized p-adic Gibbs measure. When $\rho$ is a p-adic exponential, it coincides with the usual p-adic Gibbs measure (see [106]). In the present paper we have considered two regimes with respect to the values of $|\rho|_p$. Namely, in the first regime, one takes $\rho = \exp_p(J)$ for some $J \in \mathbb{Q}_p$; in the second, we let $|\rho|_p < 1$. In each regime, we first find conditions for the existence of generalized p-adic Gibbs measures.
Acknowledgments This work is supported by the UAEU UPAR Grant No. G00003247 (Fund No. 31S391). The authors are thankful to an anonymous reviewer for his/her useful suggestions which improved the text of the present paper.
References 1. Ahmad M.A.Kh., Liao L.M. Saburov M. Periodic p-adic Gibbs measures of q-state Potts model on Cayley tree: the chaos implies the vastness of p-adic Gibbs measures, J. Stat. Phys., 171:6 (2018), 1000–1034. 2. Albeverio S., Khrennikov A., Cianci R., On the Fourier transform and the spectral properties of the p-adic momentum and Schrodinger operators. J. Phys. A, Math. and General, 30 (1997) 5767–5784. 3. Albeverio S., Khrennikov A., Cianci R., A representation of quantum field hamiltonian in a p-adic Hilbert space. Theor. Math. Phys., 112 (1997) 1081–1096. 4. Albeverio S., Khrennikov A., Cianci R., On the spectrum of the p-adic position operator. J. Phys. A, Math. and General, 30(1997), 881–889. 5. Albeverio S., Cianci R., Khrennikov A. Yu., p-adic valued quantization. P-Adic Numbers, Ultrametric Anal. Appl., 1 (2009), 91–104. 6. Arrowsmith D.K., Vivaldi F., Some p−adic representations of the Smale horseshoe, Phys. Lett. A 176(1993), 292–294. 7. Arrowsmith D.K., Vivaldi F., Geometry of p-adic Siegel discs. Physica D, 71(1994), 222– 236. 8. Arroyo-Ortiz E., Zuniga-Galindo W.A., Construction of p-Adic Covariant Quantum Fields in the Framework of White Noise Analysis, Rep. Math. Phys. 84(2019), 1–34. 9. Ananikian N.S., Dallakian S.K., Hu B., Chaotic Properties of the Q-state Potts Model on the Bethe Lattice: Q < 2, Complex Systems, 11 (1997), 213–222. 10. Anashin V., Khrennikov A., Applied Algebraic Dynamics, Walter de Gruyter, Berlin, New York, 2009. 11. Albeverio S., Rozikov U., Sattarov I.A., p-adic (2, 1)-rational dynamical systems, J. Math. Anal. Appl., 398 (2013), 553–566. 12. Avetisov V.A., Bikulov A.H., Kozyrev S.V. Application of p-adic analysis to models of spontaneous breaking of the replica symmetry, J. Phys. A: Math. Gen. 32(1999) 8785–8791. 13. Baxter R.J., Exactly Solved Models in Statistical Mechanics, Academic Press, London, 1982. 14. Benedetto R., Reduction, dynamics, and Julia sets of rational functions, J. 
Number Theory, 86 (2001), 175–195. 15. Benedetto R., Hyperbolic maps in p-adic dynamics, Ergod. Th.& Dynam. Sys. 21 (2001), 1–11. 16. Bogachev V., Measure theory, Springer, Berlin, 2007. 17. Bogachev L.V., Rozikov U.A., On the uniqueness of Gibbs measure in the Potts model on a Cayley tree with external field, J. Stat. Mech.: Theory and Exper., (2019) 073205 18. Bosco F.A., Jr Goulart R.S., Fractal dimension of the Julia set associated with the Yang-Lee zeros of the ising model on the Cayley tree, Europhys. Let. 4 (1987) 1103–1108. 19. Casas J.M., Omirov B.A., Rozikov U.A, Solvability criteria for the equation x q = a in the field of p-adic numbers, Bull. Malays. Math. Sci. Soc., 37(2014), 853–864. 20. Derrida B., Seze L. De., Itzykson C. Fractal structure of zeros in hierarchical models, J. Stat. Phys. 33(1983) 559–569. 21. Diao H., Silva C.E., Digraph representations of rational functions over the p-adic numbers, p-Adic Numbers, Ultametric Anal. Appl. 3 (2011), 23–38. 22. Dobrushin R.L. The problem of uniqueness of a Gibbsian random field and the problem of phase transitions, Funct.Anal. Appl. 2 (1968) 302–312.
23. Dobrushin R.L. Prescribing a system of random variables by conditional distributions, Theor. Probab. Appl. 15(1970) 458–486. 24. Dragovich B., Khrennikov A., Mihajlovic D. Linear fraction p-adic and adelic dynamical systems, Rep. Math. Phys. 60(2007) 55–68. 25. Dragovich B., Khrennikov A.Yu., Kozyrev S.V., Volovich I.V., On p-adic mathematical physics, p-Adic Numbers, Ultrametric Analysis and Appl. 1 (2009), 1–17. 26. Dragovich B., Khrennikov A.Yu., Kozyrev S.V., Volovich I.V., Zelenov E. I., p -Adic Mathematical Physics: The First 30 Years. p-Adic Numbers Ultrametric Anal. Appl. 9 (2017), 87–121. 27. Efetov K.B., Supersymmetry in disorder and chaos, Cambridge Univ. Press, Cambrdge, 1997. 28. Eggarter T.P., Cayley trees, the Ising problem, and the thermodynamic limit, Phys. Rev. B 9 (1974) 2989–2992. 29. Georgii H.O. Gibbs measures and phase transitions, Walter de Gruyter, Berlin, 1988. 30. Gyorgyi G., Kondor I., Sasvari L., Tel T., From phase transitions to chaos, World Scientific, Singapore, 1992. 31. Gandolfo D., Rozikov U., Ruiz J. On p-adic Gibbs measures for hard core model on a Cayley Tree, Markov Proc. Rel. Topics 18(2012) 701–720. 32. Ganikhodjaev N.N., On pure phases of the three-state ferromagnetic Potts model on the Bethe lattice order two, Theor. Math. Phys. 85 (1990) 163–175. 33. Ganikhodjaev N.N., Mukhamedov F.M., Rozikov U.A. Phase transitions of the Ising model on Z in the p-adic number field, Uzbek. Math. Jour. 4 (1998) 23–29 (Russian). 34. Ganikhodjaev N.N., Mukhamedov F.M., Rozikov U.A. Phase transitions of the Ising model on Z in the p-adic number field, Theor. Math. Phys. 130 (2002), 425–431. 35. Herman M., Yoccoz J.-C., Generalizations of some theorems of small divisors to nonArchimedean fields, In: Geometric Dynamics Rio de Janeiro, 1981, Lec. Notes in Math. 1007, Springer, Berlin, 1983, pp. 408–447. 36. Fan A.H., Liao L.M., Wang Y.F., Zhou D., p-adic repellers in Qp are subshifts of finite type, C. R. Math. Acad. 
Sci Paris, 344 (2007), 219–224. 37. Fan A.H., Fan S.L., Liao L.M., Wang Y.F., On minimal deecomposition of p-adic homographic dynamical systems, Adv. Math. 257(2014) 92–135. 38. Fan A.H., Fan S.L., Liao L.M., Wang Y.F., Minimality of p-adic rational maps with good reduction, Discrete Cont. Dyn. Sys. 37(2017), 3161–3182. 39. Feynman R.P. Negative Probability, in Quantum Implications, Essays in Honour of David Bohm, Ed. by B. J. Hiley and F. D. Peat, Routledge and Kegan Paul, London, 1987, pp. 235– 246. 40. Ilic-Stepic A., Ognjanovic Z., Ikodinovic N., Perovic A., p-adic probability logics, p-Adic Num. Ultra. Anal. Appl. 8 (2016), 177–203. 41. Kaneko H., Kochubei A.N., Weak solutions of stochastic differential equations over the field of p-adic numbers, Tohoku Math. J. 59(2007), 547–564. 42. Kaplan S., A survey of symbolic dynamics and celestial mechanics, Qualitative Theor. Dyn. Sys., 7 (2008), 181–193. 43. Katsaras A.K. Extensions of p-adic vector measures, Indag. Math.N.S. 19 (2008) 579–600. 44. Katsaras A.K. On spaces of p-adic vector measures, P-Adic Numbers, Ultrametric Analysis, Appl. 1 (2009) 190–203. 45. Katsaras A.K. On p-adic vector measures, Jour. Math. Anal. Appl. 365 (2010), 342–357. 46. Kochubei A.N. Pseudo-differential equations and stochastics over non-Archimedean fields, Mongr. Textbooks Pure Appl. Math. 244 Marcel Dekker, New York, 2001. 47. Kozyrev S.V., Wavelets and spectral analysis of ultrametric pseudodifferential operators Sbornik Math. 198(2007), 97–116. 48. Khakimov O. N., On a generalized p-adic gibbs measure for Ising Model on trees, p-Adic Numbers, Ultrametric Anal. Appl.6 (2014) 105–115. 49. Khakimov O.N., p-adic Gibbs quasi measures for the Vannimenus model on a Cayley tree, Theor. Math. Phys. 179(2014) 395–404.
50. Khamraev M., Mukhamedov F.M. On p-adic λ-model on the Cayley tree, J. Math. Phys. 45(2004) 4025–4034. 51. Khamraev M., Mukhamedov F.M., Rozikov U.A. On uniqueness of Gibbs measure for p-adic λ-model on the Cayley tree, Lett. Math. Phys. 70(2004), No. 1, 17–28 52. Khamraev M., Mukhamedov F.M. On a class of rational p-adic dynamical systems, J. Math. Anal. Appl. 315 (2006), 76–89. 53. Khrennikov A. YU., p-Adic Description of Dirac’s Hypothetical World with Negative Probabilities, Int. J. Theor. Phys. 34(1995), 2423–2434. 54. Khrennikov A., p-adic valued probability measures, Indag. Mathem. N.S., 7 (1996) 311–330. 55. Khrennikov A., Non-Archimedean analysis and its applications. Nauka, Fizmatlit, Moscow, 2003 (in Russian). 56. Khrennikov A.Yu. Non-Archimedean analysis: quantum paradoxes, dynamical systems and biological models, Kluwer Academic Publisher, Dordrecht, 1997. 57. Khrennikov A., p-adic description of chaos., In: Nonlinear Physics: Theory and Experiment. Editors E. Alfinito, M. Boti., World Scientific, Singapore, 1996, pp. 177–184. 58. Khrennikov A.Yu., Generalized probabilities taking values in non-Archimedean fields and in topological Groups, Russian J. Math. Phys. 14 (2007), 142–159. 59. Khrennikov A.Yu., Kozyrev S.V., Ultrametric random field, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 9(2006), 199–213. 60. Khrennikov A.Yu., Kozyrev S.V., Replica symmetry breaking related to a general ultrametric space I,II,III, Physica A, 359(2006), 222–240; 241–266; 378(2007), 283–298. 61. Khrennikov A.Yu., Kozyrev S.V., Zuniga-Galindo W.A., Ultrametric Pseudodifferential Equations and Applications, Cambridge Univ. Press, 2018. 62. Khrennikov A.Yu., Ludkovsky S. Stochastic processes on non-Archimedean spaces with values in non-Archimedean fields, Markov Process. Related Fields 9(2003) 131–162. 63. Khrennikov A.Yu., Ludkovsky S., On infinite products of non-Archimedean measure spaces, Indag. Math. N. S. 13(2002), 177–183. 64. 
Khrennikov A.Yu., Mukhamedov F., On uniqueness of Gibbs measure for p-adic countable state Potts model on the Cayley tree, Nonlin. Analysis: Theor. Methods Appl. 71 (2009), 5327– 5331. 65. Khrennikov A., Mukhamedov F., Mendes J.F.F. On p-adic Gibbs measures of countable state Potts model on the Cayley tree, Nonlinearity 20(2007) 2923–2937. 66. Khrennikov A.Yu., Nilsson M. p-adic deterministic and random dynamical systems, Kluwer, Dordreht, 2004. 67. Khrennikov A.Yu., Yamada S., van Rooij A., Measure-theoretical approach to p-adic probability theory, Annals Math. Blaise Pascal 6 (1999) 21–32. 68. Koblitz N., p-adic numbers, p-adic analysis and zeta-function, Berlin, Springer, 1977. 69. Kolmogorov A.N. Foundations of the Probability Theory, Chelsey, New York, 1956. 70. Kulske C., Rozikov U.A., Khakimov R.M., Description of all translation-invariant splitting Gibbs measures for the Potts model on a Cayley tree, J. Stat. Phys. 156 (1) (2013), 189–200. 71. Le Ny A., Liao L., Rozikov U.A., p-adic boundary laws and Markov chains on trees, Lett. Math. Phys. doi.org/10.1007/s11005-020-01316-7. 72. Lubin J., Nonarchimedean dynamical systems, Composito Math., 94 (1994), 321–346. 73. Ludkovsky S. Stochastic processes and their spectral representations over non-archimedean fields, J. Math. Sci. 185(2012), 65–124. 74. von Mises R., The Mathematical Theory of Probability and Statistics, Academic, London, 1964. 75. Muckenheim W., A Review on Extended Probabilities, Phys. Rep. 133(1986), 338–401. 76. Monna A., Springer T., Integration non-Archim’edienne 1, 2. Indag. Math. 25 (1963) 634– 653. 77. Monroe J.L. Julia sets associated with the Potts model on the Bethe lattice and other recursively solved systems, J. Phys. A: Math. Gen., 34 (2001), 6405–6412 78. Mukhamedov F., On a recursive equation over p-adic field, Appl. Math. Lett. 20(2007), 88– 92.
79. Mukhamedov F., On existence of generalized Gibbs measures for one dimensional p-adic countable state Potts model, Proc. Steklov Inst. Math. 265 (2009), 165–176. 80. Mukhamedov F., On p-adic quasi Gibbs measures for q + 1-state Potts model on the Cayley tree, P-adic Numbers, Ultametric Anal. Appl. 2(2010), 241–251. 81. Mukhamedov F.M., Existence of P -adic quasi Gibbs measure for countable state Potts model on the Cayley tree, J. Ineqal. Appl. 2012, 2012:104. 82. Mukhamedov F., Dynamical system appoach to phase transitions p-adic Potts model on the Cayley tree of order two, Rep. Math. Phys., 70 (2012), 385–406. 83. Mukhamedov F., On dynamical systems and phase transitions for q + 1-state p-adic Potts model on the Cayley tree, Math. Phys. Anal. Geom., 53 (2013) 49–87. 84. Mukhamedov F., Recurrence equations over trees in a non-Archimedean context, P-adic Numb. Ultra. Anal. Appl. 6(2014), 310–317. 85. Mukhamedov F. On strong phase transition for one dimensional countable state P -adic Potts model, J. Stat. Mech. (2014) P01007. 86. Mukhamedov F., Renormalization method in p-adic λ-model on the Cayley tree, Int. J. Theor. Phys., 54 (2015), 3577–3595. 87. Mukhamedov F., Akin H. Phase transitions for P -adic Potts model on the Cayley tree of order three, J. Stat. Mech. (2013), P07014. 88. Mukhamedov F., Akin H. The p-adic Potts model on the Cayley tree of order three, Theor. Math. Phys. 176 (2013), 1267–1279. 89. Mukhamedov F., Akin H., On non-Archimedean recurrence equations and their applications, J. Math. Anal. Appl. 423 (2015), 1203–1218. 90. Mukhamedov F., Akin H., Dogan M. On chaotic behavior of the p-adic generalized Ising mapping and its application, J. Difference Eqs Appl. 23(2017), 1542–1561. 91. Mukhamedov F., Dogan M., On p-adic λ-model on the Cayley tree II: phase transitions, Rep. Math. Phys. 75 (2015), 25–46. 92. Mukhamedov F., Khakimov O. On Periodic Gibbs Measures of p-Adic Potts Model on a Cayley Tree, p-Adic Numbers, Ultr. 
Anal.Appl., 8(2016)225–235. 93. Mukhamedov F., Khakimov O. Phase transition and chaos: p-adic Potts model on a Cayley tree, Chaos, Solitons & Fractals 87(2016), 190–196. 94. Mukhamedov F., Khakimov O., On metric properties of unconventional limit sets of contractive non-Archimedean dynamical systems, Dynamical Systems 31 (2016), 506–524. 95. Mukhamedov F., Khakimov O., On generalized self-similarity in p-adic field, Fractals, 24 (2016), No. 4, 16500419. 96. Mukhamedov F., Khakimov O., On Julia set and chaos in p-adic Ising model on the Cayley tree, Math. Phys. Anal. Geom. 20 (2017) 23. 97. Mukhamedov F., Khakimov O., Chaotic behaviour of the p-adic Potts-Bethe mapping, Disc. Cont. Dyn. Syst. 38(2018), 231–245. 98. Mukhamedov F., Khakimov O., Chaotic behaviour of the p-adic Potts-Bethe mapping II, Ergodic Theory Dyn Sys. https://doi.org/10.1017/etds.2021.96 99. Mukhamedov F., Khakimov O., On equation x k = a over Qp and its applications, Izvestiya Math. 84 (2020), 348–360. 100. Mukhamedov F.M., Mendes J.F.F., On the chaotic behavior of a generalized logistic p-adic dynamical system, J. Diff. Eqs. 243 (2007), 125–145 101. Mukhamedov F., Omirov B., Saburov M., On cubic equations over p-adic field. Int. J. Number Theory 10 (2014), 1171–1190. 102. Mukhamedov F., Saburov M, On equation x q = a over Qp , J. Number Theor., 133, (2013), 55–58. 103. Mukhamedov F., Saburov M., Khakimov O., On p-adic Ising-Vannimenus model on an arbitrary order Cayley tree, J. Stat. Mech. (2015), P05032 104. Mukhamedov F., Saburov M., Khakimov O., Translation-invariant p-adic quasi Gibbs measures for the Ising-Vannimenus model on a Cayley tree, Theor. Math. Phys., 187(1), (2016), 583–602.
Chaos in p-adic Statistical Lattice Models: Potts Model
165
105. Mukhamedov F.M., Rozikov U.A., On rational p-adic dynamical systems, Methods of Funct. Anal. and Topology, 10 (2004), No.2, 21–31 106. Mukhamedov F.M., Rozikov U.A. On Gibbs measures of p-adic Potts model on the Cayley tree, Indag. Math. N.S. 15 (2004) 85–100. 107. Mukhamedov F.M., Rozikov U.A. On inhomogeneous p-adic Potts model on a Cayley tree, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 8(2005) 277–290. 108. Mukhamedov F., Rozikov U., Mendes J.F.F. On Phase Transitions for p-Adic Potts Model with Competing Interactions on a Cayley Tree, AIP Conf. Proc. 826(2006) 140–150. 109. Ostilli M., Cayley Trees and Bethe Lattices: A concise analysis for mathematicians and physicists, Physica A, 391 (2012) 3417–3423. 110. Peruggi, F., di Liberto F., Monroy G., Phase diagrams of the q-state Potts model on Bethe lattices. Phys. A 141 (1987), 151–186. 111. Peruggi, F., di Liberto F., Monroy G., The Potts model on Bethe lattices. I. General results. J. Phys. A 16 (1983), 811–827. 112. Qiu W.Y., Wang Y.F., Yang J.H., Yin Y.C., On metric properties of limiting sets of contractive analytic non-Archimedean dynamical systems, J. Math. Anal. App., 414 (2014) 386–401. 113. Rahmatullaev M. M., Khakimov O. N., Tukhtaboev A. M., A p-adic generalized Gibbs measure for the Ising model on a Cayley tree. Theor. Math. Phys., 201(1), (2019) 1521–1530. 114. Rivera-Letelier J., Dynamics of rational functions over local fields, Astérisque, 287 (2003), 147–230. 115. van Rooij A., Non-archimedean functional analysis, Marcel Dekker, New York, 1978. 116. Rozikov U.A. Gibbs Measures on Cayley Trees, World Scientific, 2013. 117. Rozikov U. A., Khakimov O. N., Description of all translation-invariant p-dic Gibbs measures for the Potts model on a Cayley tree, Markov Proces. Rel. Fields, 21 (2015), 177– 204. 118. Rozikov U. A., Khakimov O. N. p-adic Gibbs measures and Markov random fields on countable graphs, Theor. Math. Phys. 175 (2013), 518–525. 119. 
Rozikov U.A., Tugyonov Z.T., Construction of a set of p-adic distributions, Theor.Math. Phys. 193(2017), 1694–1702. 120. Saburov M., Ahmad M.A.Kh., On descriptions of all translation invariant p-adic Gibbs measures for the Potts model on the Cayley tree of order three, Math. Phys. Anal. Geom., 18 (2015) 26. 121. Schikhof W. H., Ultrametric calculus. An introduction to p-adic analysis. Cambridge: Cambridge University Press 1984. 122. Silverman J.H. The arithmetic of dynamical systems, New York, Springer, 2007. 123. Thiran E., Verstegen D., Weters J., p-adic dynamics, J. Stat. Phys., 54 (1989), 893–913. 124. Vladimirov V.S., Volovich I.V., Zelenov E.I. p -adic Analysis and Mathematical Physics, World Scientific, Singapour, 1994. 125. Volovich I.V. p−adic string, Classical Quantum Gravity 4 (1987) L83-L87. 126. Wilson K.G., Kogut J., The renormalization group and the - expansion, Phys. Rep. 12 (1974), 75–200. 127. Woodcock C.F., Smart N.P., p-adic chaos and random number generation, Experiment Math. 7 (1998) 333–342. 128. Wu F.Y., The Potts model, Rev. Mod. Phys. 54 (1982) 235–268. 129. Zuniga-Galindo W.A., Torba S.M., Non-Archimedean Coulomb gases, J. Math. Phys. 61(2020), 013504.
QFT, RG, and All That, for Mathematicians

Abdelmalek Abdesselam

Abstract We present a quick nontechnical introduction to quantum field theory and Wilson's theory of the renormalization group from the point of view of mathematical analysis. The presentation is geared primarily towards a probability theory, harmonic analysis and dynamical systems theory audience. We also emphasize the use of p-adics in order to set up hierarchical versions of the renormalization group. The latter provide an ideal stepping stone towards the more involved Euclidean space setting.

Keywords Quantum field theory · Renormalization group · Hierarchical models · Stochastic processes
A. Abdesselam, Department of Mathematics, University of Virginia, Charlottesville, VA, USA. e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. W. A. Zúñiga-Galindo, B. Toni (eds.), Advances in Non-Archimedean Analysis and Applications, STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health, https://doi.org/10.1007/978-3-030-81976-7_4

1 Introduction

To say that quantum field theory (QFT) has exerted a profound influence on recent mathematical developments is a banal statement. Ideas from QFT have been shown to be relevant for the understanding of knot invariants [66], four-manifolds [67] and questions in enumerative algebraic geometry [17]. Not only low-dimensional topology but also high-dimensional topology has benefited from QFT (see, e.g., [49]). The connections between two-dimensional conformal field theory (CFT) and the geometric Langlands correspondence are well known [9, 31, 32, 34]. Moreover, the latter has been shown to be related to higher-dimensional QFT [44]. It is therefore not surprising that, in recent times, there has been an increased mathematical interest in QFT. Several books have appeared where mathematicians took on the task of explaining QFT to other mathematicians (see, e.g., [20, 28, 30]). Yet, it would be fair to say that this interest came mostly from domains of mathematics such as geometry, topology and representation theory, while the general area known as analysis has been lagging behind. The proportion of young mathematical analysts who are working on the foundational questions posed by QFT, compared to other areas of analysis such as partial differential equations, is very small. Hopefully, the introduction presented here will help make the subject more approachable to such analysts.

The purpose of this article, geared primarily towards a probability theory, harmonic analysis and dynamical systems theory audience, is to give an idea of some of the key mathematical problems posed by QFT and to explain how Wilson's renormalization group (RG) theory offers a strategy for solving them. The main problem relates to the construction of QFT functional integrals. Unfortunately, this is not addressed in the above-mentioned books. Indeed, these books only consider the construction in the sense of formal power series. The main problem will be presented in Sect. 3. Before that, Sect. 2 will provide a motivation for the study of this problem coming from the analysis of scaling limits for models in statistical mechanics such as the famous two-dimensional Ising model. Then in Sect. 4, we will provide a rough outline of the RG strategy for solving this problem. Finally, in Sect. 5 we will discuss hierarchical models which constitute a useful testing ground for rigorous RG methods.

Note that because of the self-imposed page limitation and the intent not to obscure the big picture, many technical details will be omitted from the discussion. We tried to do so while sacrificing as little mathematical precision as possible. At the end of the article, we will list references where the rigorous mathematical details may be found.
2 Scaling Limits

A major theme in today's probability theory is the study of scaling limits of models from statistical mechanics in relation to CFT. A typical example is that of the Ising model on a two-dimensional lattice at the critical temperature. If $(\sigma_x)_{x\in\mathbb{Z}^2}$ denotes the random configuration of Ising spins, one can associate to it a random generalized function or Schwartz distribution

$$\phi_r = L^{\frac{15}{8} r} \sum_{x\in\mathbb{Z}^2} \sigma_x\, \delta_{L^r x}.$$

Here $L$ is some fixed number greater than 1 which serves as a yardstick for measuring changes of scale. In the context of the dyadic decompositions frequently used in harmonic analysis, one picks $L = 2$. The notation $\delta_{L^r x}$ refers to the delta function located at $L^r x \in \mathbb{R}^2$. As for $r \in \mathbb{Z}$, it plays the role of an ultraviolet (UV) or short distance cut-off since one can think of the Ising model as now living on the lattice $(L^r\mathbb{Z})^2$ whose mesh $L^r$ is taken to 0. The scaling limit is the generalized random field $\phi_{-\infty}$, or simply $\Phi$, obtained by this construction when $r \to -\infty$. The uniqueness and conformal invariance of this scaling limit were shown in the recent work [16] which builds on [18, 26].

The main reason to consider such scaling limits which, by definition, live on the continuum is that they are universal objects with enhanced symmetry. Instead of lattice symmetries one gets full invariance by translation, rotation and, here also, by scale transformation. Since interactions are local (e.g., nearest neighbor ones for the Ising model), one expects these symmetries to hold locally, hence conformal invariance. The latter was introduced in the present context by Polyakov in [55] (see also [33] for its early history in physics).

Important quantities of interest are the moments or correlators $E[\Phi(x_1) \cdots \Phi(x_n)]$ of $\Phi$. These are expected to be distributions with singular support on the big diagonal where $x_i = x_j$ for some $i \neq j$. This relates to the fact that such a random field is a generalized one whose sample paths are given by distributions rather than ordinary functions. This fact is also a feature of singular SPDEs which have been the object of intense recent research activity [43]. For the Ising model, the 2-point function $E[\Phi(x_1)\Phi(x_2)]$ decays like $|x_1 - x_2|^{-\frac{1}{4}}$ which reflects a scaling dimension $[\Phi] = \frac{1}{8}$ for the (elementary) field $\Phi$. In two dimensions a free field should behave logarithmically and thus with a scaling dimension 0. The difference $\frac{1}{8} - 0$ is an example of anomalous dimension. Via the Schwartz nuclear theorem, the above correlators seen as distributions are obtained from the moments of honest random variables $\Phi(f) = \int \Phi(x) f(x)\, d^2x$ where the field is smeared with a test function $f$. For a dimension of space $d = 2$, such variables are the limits when $r \to -\infty$ of unit lattice quantities

$$\phi_r(f) = \sum_{x\in\mathbb{Z}^2} \sigma_x\, L^{(d-[\Phi])r} f(L^r x) \qquad (1)$$

which involve the diluted test function $L^{(d-[\Phi])r} f(L^r\,\cdot)$.

Another feature which emerges when considering the scaling limit in the continuum is that correlators have a precise asymptotic expansion in the limit where two of the evaluation points coincide and, moreover, the shape of such an expansion is uniform with respect to the other points. This is the operator product expansion (OPE) which lies at the foundation of CFT in physics. Efforts to mathematically capture this structure can be seen in such frameworks as Borcherds' vertex operator algebras [10, 32], Beilinson and Drinfeld's chiral algebras [9, 34] or Costello and Gwilliam's general factorization algebras [21]. Terms in such expansions may be viewed as mixed moments which, in addition to the original random field, involve suitably defined pointwise squares, third powers etc. of that field. This generalizes the notion of Wick power for a Gaussian field. If the 2-point correlation for the squared field decays with a power which is not twice that of the original field, then one says that the squared field displays an anomalous dimension of its own.

For the two-dimensional Ising model, the successes mentioned earlier (e.g., [16]) were made possible by very special features of two-dimensional lattice models: exact solutions [69] and suitable notions of discrete holomorphic functions [58], or the SLE [56]. For more general models where such tools are not available, Wilson's Nobel Prize winning theory [62, 63] of the RG is one of the rare approaches available.
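A heuristic power count, offered as a sketch rather than a proof, suggests why the factor $L^{(d-[\Phi])r}$ in (1) is the natural normalization. It assumes that at criticality $E[\sigma_x\sigma_y] \approx c\,|x-y|^{-2[\Phi]}$ at large separation (the constant $c$ is left unspecified), writes $[\Phi]$ for the scaling dimension of the limiting field, uses the rescaled points $u = L^r x$, $v = L^r y$, and treats lattice sums as Riemann sums with cell volume $L^{dr}$.

```latex
\begin{align*}
E\big[\phi_r(f)^2\big]
 &= L^{2(d-[\Phi])r} \sum_{x,y\in\mathbb{Z}^d} E[\sigma_x\sigma_y]\, f(L^r x)\, f(L^r y) \\
 &\approx c\, L^{2(d-[\Phi])r} \sum_{x\neq y} |x-y|^{-2[\Phi]}\, f(L^r x)\, f(L^r y) \\
 &= c\, \big(L^{dr}\big)^2 \sum_{u\neq v\,\in\,(L^r\mathbb{Z})^d} |u-v|^{-2[\Phi]}\, f(u)\, f(v)
   \qquad \big(|x-y|^{-2[\Phi]} = L^{2[\Phi]r}\,|u-v|^{-2[\Phi]}\big) \\
 &\xrightarrow[r\to-\infty]{} c \int\!\!\int f(u)\, f(v)\, |u-v|^{-2[\Phi]}\, d^du\, d^dv .
\end{align*}
```

All powers of $L^r$ cancel, so the variance of $\phi_r(f)$ stays of order one. The limiting integral is finite, and the neglected diagonal terms $x = y$ (of order $L^{(d-2[\Phi])r}$) vanish, as long as $2[\Phi] < d$, which holds for $d = 2$ and $[\Phi] = \frac{1}{8}$.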
3 The Fundamental Problem

The origins of the RG come from QFT where, from the mathematical point of view, the fundamental problem is to give a meaning to and study the properties of expressions such as

$$E\big[O_{A_1}(x_1) \cdots O_{A_n}(x_n)\big] = \frac{\int_F O_{A_1}(x_1) \cdots O_{A_n}(x_n)\, e^{-S(\Phi)}\, D\Phi}{\int_F e^{-S(\Phi)}\, D\Phi}. \qquad (2)$$

The integrals are over a space $F$ of "functions" $\Phi : \mathbb{R}^d \to \mathbb{R}$, with $D\Phi$ denoting the "Lebesgue measure" on this infinite-dimensional space. For most applications it is enough to take the space of tempered distributions $F = S'(\mathbb{R}^d)$. As for the functional $S$, a typical example is

$$S(\Phi) = \int_{\mathbb{R}^d} \left\{ \frac{1}{2} (\nabla\Phi)^2(x) + \mu\, \Phi(x)^2 + g\, \Phi(x)^4 \right\} d^dx$$

which corresponds to the so-called $\phi^4_d$ model. Initially, the latter was only thought of as a toy model for more physical ones such as quantum electrodynamics which describes particles seen in nature (photons, electrons and positrons). However, this is no longer the case since the discovery of the Higgs particle [5, 19]. Finally, a local observable $O_A(x)$ stands for a function of the field and its derivatives at the point $x \in \mathbb{R}^d$ such as $\Phi(x)$, $\Phi(x)^2$, $\Phi(x)^3 \partial_i \Phi(x)$, etc. The different species of such observables are labelled by $A \in \mathcal{A}$. One thus avoids precise plethystic notations for "monomials of monomials". The OPE (see, e.g., [68]) is the asymptotic expansion

$$E\big[O_{A_1}(x_1) O_{A_2}(x_2) O_{A_3}(x_3) \cdots O_{A_n}(x_n)\big] = \sum_{j=0}^{\infty} C_j(x_1 - x_2)\, E\big[O_{B_j}(x_1) O_{A_3}(x_3) \cdots O_{A_n}(x_n)\big]$$

when $x_2 \to x_1$. The nontrivial requirement is that the functions $C_j$ and the (composite) fields $O_{B_j}$ should remain the same for any $n$ and for whatever locations $x_3, \ldots, x_n$ of the spectator fields. See [2] for a recent investigation of the OPE from a probability theory perspective.
4 The RG Strategy

The following is a distillation by the author of the ideas of Wilson [62, 64] and Wegner [60] regarding the RG strategy for solving the above fundamental problem. First, one combines the kinetic part with the nonexistent Lebesgue measure $D\Phi$ and turns them into a Gaussian measure $d\mu_{C_{-\infty}}$ with covariance $C_{-\infty}(x, y)$ […] or 0. Most of the physics literature on the RG takes $[\phi] = \frac{d-2}{2}$, but when
a nontrivial infrared fixed point is present, e.g., if $d = 3$, this leads to additional difficulties related to the elementary field $\phi$ itself (in addition to composite fields like $\phi^2$) developing an anomalous dimension (see the remark at the end of this section). One now mollifies this covariance at distance scale $L^r$ by introducing the cut-off […] 0 is a small bifurcation parameter. This is the fractional $\phi^4_3$ model considered in [14] where a new nontrivial fixed point $V_{\mathrm{nontriv}}$ appears in the $\int :\phi^4:$ direction at distance $\epsilon$ from $V_{\mathrm{Gauss}}$. The fixed point $V_{\mathrm{nontriv}}$ has a codimension one stable manifold, with unstable direction in the general direction of the mass term $\int :\phi^2:$ heading for $V_{\mathrm{HT}}$.

A particular example of bare ansatz consists in choosing $g^{(r,r)} = g$ and $\mu^{(r,r)} = \mu$ fixed. Similarly to tuning the temperature in the Ising model to its critical value, one can pick a critical value $\mu = \mu_c(g)$ so that the ($r$-independent) $V^{(r,r)}[0]$ lies on the stable manifold of $V_{\mathrm{nontriv}}$. This gives rise to a scaling limit similar to that for the critical Ising model discussed at the beginning. In fact, it is conjectured that a suitable random field obtained along these lines for $d = 2$ and $[\phi] = 0$ is the same as the random field constructed in [16]. In this case $T_{\mathrm{ideal}}$ is the constant sequence equal to $V_{\mathrm{nontriv}}$. For the fractional $\phi^4_3$ model and this type of bare ansatz, step 1) has been done rigorously in [14]. In [1], the author constructed a connecting orbit which joins $V_{\mathrm{Gauss}}$ to $V_{\mathrm{nontriv}}$. It is not difficult to elaborate on the proof therein in order to produce a bare ansatz which results in a trajectory $T_{\mathrm{ideal}}$ which is that connecting orbit. Thus step 1) is essentially solved also for a random field which is not self-similar but has a short distance scaling limit which is Gaussian and a large distance one which is not and corresponds to the nontrivial fixed point.
These two notions of scaling limit, for a generalized random field in the continuum, are the ones defined, e.g., in [23].

(2) Controlling the Deviations

The previous RG map is defined over the bigger (extended) space $E_{\mathrm{ext}}$ that allows couplings in front of $:\phi^k:(x)$ to depend on the location $x$. The previous space $E_{\mathrm{bulk}} \subset E_{\mathrm{ext}}$ is stable under the map RG. The deviations $V^{(r,q)}[f] - V^{(r,q)}[0]$ due to the introduction of the test function $f$ now live in $E_{\mathrm{ext}}$. Controlling the deviations means showing bounds on these quantities which are uniform in the UV cut-off $r$ and summable over the scales $q$. This is crucial for the convergence of the two-sided series (4) with strong enough control on the $r \to -\infty$ limit defining $S^T(f)$. Let $L^{q_+}$ be the characteristic size of the "support" of $f$ (e.g., defined by the mean square distance to the origin for a density proportional to $|f|^2$). Let $L^{-q_-}$ be the same notion for the Fourier transform $\hat{f}$ […] $q_+$. This rests on the following observations.

The fluctuation covariance $\Gamma$ decays at length scale $L$. By a suitable choice of the cut-off function $\eta$ one can arrange for $\Gamma$ to have compact support in $x$ space (this is the idea of finite-range decompositions in [11]). If one considers the restrictions of the fluctuation field $\zeta$ to different $L$-blocks (sets of the form $L\Delta$, with $\Delta$, $\Delta'$, etc. always denoting unit cells), these are approximately independent. Thus, the RG map acts locally, i.e., can be seen as an infinite number of independent operations performed in parallel, one for each $L$-block (the localization property). Such an operation takes the data for the potential from the $L^d$ unit cells contained in $L\Delta$ and produces the new data for $\Delta$ to be used in the next RG iteration. Another (approximate) property of the fluctuation covariance which holds if $\eta$ is sufficiently close to 1 near the origin is that, almost surely, the sample paths of $\zeta$ have zero spatial average in each $L$-block. Since at the beginning of iterations $\zeta$ is smeared with the diluted test function which is almost constant in each $L$-block, the result is zero. Thus the test function has no notable effect on the RG evolution in the UV sector. This can also be seen in the language of Feynman diagrams where the effect of the test function is first introduced (and later reinforced by a feedback loop) in diagrams with external legs indicating convolutions of $\Gamma$ with the diluted test function which respectively have "Fourier support" in the disjoint ranges $L^{-1} \le |\xi| \le 1$ and $L^{r-q_+} \le |\xi| \le L^{r-q_-}$. This is a property of orthogonality between scales (see, e.g., [54]) as in Littlewood-Paley theory.

The decay in the IR sector is due to a different mechanism: after $q_+ - r$ RG iterations, the deviations $V^{(r,q)}[f] - V^{(r,q)}[0]$ only reside in the unit cell $\Delta_0$ at the origin. In other words, these deviations live in the span of terms of the form $\int_{\Delta_0} :\phi^k:$ rather than $\int_{\mathbb{R}^d} :\phi^k:$ which belong to the bulk. The corresponding eigenvalues are $L^{-k[\phi]}$ instead of $L^{d-k[\phi]}$ and the RG map for the deviations becomes a contraction.

The beauty and power of this presentation of the RG strategy is that it also works if one adds at the beginning another term $:\phi_r^2:_{C_r}(j)$ for some new test function $j$ in order to construct the log-moment generating function $S^T(f, j)$ which produces mixed cumulants for both the elementary field and the squared field. The control of deviations in the UV sector follows a similar line of reasoning, but is now considerably more difficult. For the fractional $\phi^4_3$ model, after the initial rescaling one gets

$$:\phi_r^2:_{C_r}(j) = \int_{\mathbb{R}^3} :\phi^2:_{C_0}(x)\; L^{(3-2[\phi])r}\, j(L^r x)\, d^3x$$

which features a new diluted form of the test function $j$. By the localization property of the RG map and the local constancy of this diluted test function, controlling the deviations which a priori involves the RG action in the extended space $E_{\mathrm{ext}}$ becomes a question which is purely about the RG action in the bulk space $E_{\mathrm{bulk}}$. One has
to show that if $V$ is picked as mentioned earlier on the stable manifold of $V_{\mathrm{nontriv}}$ and if $W$ is a perturbation in the $\int_{\mathbb{R}^3} :\phi^2:$ direction, then (letting $n = q - r$) the limit $\lim_{n\to\infty} RG^n(V + L^{-(3-2[\phi])n} W)$ exists and is nonzero. However, this naive construction gives 0 because the expanding eigenvalue at $V_{\mathrm{nontriv}}$ is strictly smaller than the one at $V_{\mathrm{Gauss}}$, namely, $L^{3-2[\phi]}$. Therefore the correct limit is

$$\Psi(V, W) = \lim_{n\to\infty} RG^n\big(V + Z^n L^{-(3-2[\phi])n} W\big)$$

for $Z = L^{\frac{\kappa}{2}}$ with $\kappa > 0$ so that $Z^{-1} L^{3-2[\phi]}$ equals the expanding eigenvalue at $V_{\mathrm{nontriv}}$. If $\Phi$, $\Phi^2$ respectively denote the self-similar random field and its suitably renormalized square obtained at the end of the day, then their covariances satisfy

$$\mathrm{Cov}\big(\Phi^2(x_1), \Phi^2(x_2)\big) \sim \big[\mathrm{Cov}\big(\Phi(x_1), \Phi(x_2)\big)\big]^2 \times \frac{1}{|x_1 - x_2|^{\kappa}}.$$

Thus the composite field $O_A = \Phi^2$ exhibits an anomalous dimension. In fact the construction of $\Psi$ controls the deviations $V^{(r,q)}[f] - V^{(r,q)}[0]$ but is not enough for the convergence of the series (4), in the UV sector. Some explicit terms linear in $j$ must be extracted from the $\delta b$'s in order to secure convergence. This accounts for an additive correction needed to define the proper $:\phi_r^2:_{C_r}(j)$ input. The correct choice is

$$Z^{-r} \int_{\mathbb{R}^3} \Big[ \phi_r(x)^2 - L^{-2[\phi]r} \big(C_0(0) + Y\big) \Big]\, j(x)\, d^3x$$
for some suitable nonuniversal constant $Y$ (whereas $Z$ is universal, which here means $g$-independent). Since $:\phi_r^2:_{C_r}(x) = \phi_r(x)^2 - L^{-2[\phi]r} C_0(0)$, this is a correction to Gaussian Wick ordering which is already needed for the definition of $\Phi^2$ when $\Phi$ is Gaussian (see [22, 50] for a discussion of composite fields related to multiple stochastic integrals, in the Gaussian case). In terms of the original $\phi^4$-type unbounded spin system $(\phi_x)_{x\in\mathbb{Z}^3}$ whose scaling limit is taken, $C_0(0) + Y$ represents the variance of a single spin $\phi_x$, and $\kappa$ gives the long-distance behavior of

$$\mathrm{Cov}(\phi_x^2, \phi_y^2) \sim \frac{1}{|x - y|^{4[\phi]+\kappa}}. \qquad (5)$$
The explicit relation between the correlators and the RG dynamical system embodied in (4) also gives a handle on questions related to the short distance structure of these correlators: smoothness away from the big diagonal and the OPE. The previous construction of $O_A = \Phi^2$ is an example of composite field renormalization. One can also use the OPE in order to give a tautological definition of this field (in the sense of moments). Relating the two (i.e., proving the OPE) amounts to showing the commutation of limits $|x_1 - x_2| \to 0$ and $r \to -\infty$. The crucial ingredient is reminiscent of Møller wave operators in scattering theory which intertwine free and interacting evolutions. In the classical (rather than quantum) context of the dynamical system RG, such operators realize a conjugation of the nonlinear map to its linearization at a fixed point. If $z$ denotes a curvilinear coordinate which defines the stable manifold of $V_{\mathrm{nontriv}}$ by $z = 0$ and satisfies the relation $z(RG(V)) = Z^{-1} L^{3-2[\phi]} z(V)$, i.e., linearizes the action of RG in the unstable direction, then an easy calculation shows that $\Psi(V, W)$ is the directional derivative of $z$ at $V$ in the direction of $W$. Coordinates such as $z$ are called nonlinear scaling fields and are the basic ingredients of Wegner's theory [60] describing the fine features of statistical mechanics systems at criticality, such as the behavior of higher composite fields.

Now is a good time to take a pause and try to answer: what is a QFT? A safe answer is: an infinite collection of correlators $E[\Phi(x_1) \cdots \Phi(x_n)]$. If one can solve the corresponding moment problem (as [16] did for [18, 26]), then one may say: a generalized random field. As in [70, §2], one could also request the secondary structure consisting of all correlators $E[O_{A_1}(x_1) \cdots O_{A_n}(x_n)]$ generated from the field $\Phi$ by the OPE. Finally, one can identify a QFT with an ideal trajectory $T_{\mathrm{ideal}}$. This corresponds to the modern view in physics which sees a QFT as a sequence of effective theories at scales $L^q$ in theory space (see [24, 25]), i.e., $E_{\mathrm{bulk}}$. Correlators can be recovered, via (4), as directional derivatives around the points of $T_{\mathrm{ideal}}$. Yet, these directions may go out into the bigger space $E_{\mathrm{ext}}$. By choosing one of the entries of $T_{\mathrm{ideal}}$, say the $q = 0$ entry, one can parametrize such sequences or QFTs by the unstable manifold of $V_{\mathrm{UV}}$ which typically is finite dimensional (for instance if $V_{\mathrm{UV}} = V_{\mathrm{Gauss}}$, $[\phi]$ is canonical and $d > 2$). The reparametrization obtained by choosing a different entry in the sequence (the same as rescaling) accounts for the old pre-Wilsonian version of the RG [42, 59].
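The "easy calculation" relating the renormalized limit to the nonlinear scaling field $z$ can be sketched as follows, under assumptions that are ours and are not spelled out in the text. We write $\lambda = Z^{-1} L^{3-2[\phi]} > 1$ for the expanding eigenvalue at $V_{\mathrm{nontriv}}$, use $\Psi(V, W)$ as a label for the limit $\lim_{n\to\infty} RG^n(V + Z^n L^{-(3-2[\phi])n} W) = \lim_{n\to\infty} RG^n(V + \lambda^{-n} W)$, assume that $z$ is twice differentiable near $V$ and continuous at the limit point, and denote by $D_V z[W]$ the directional derivative of $z$ at $V$ in the direction $W$.

```latex
\begin{align*}
z\big(RG^n(V+\lambda^{-n}W)\big)
 &= \lambda^{n}\, z\big(V+\lambda^{-n}W\big)
   && \text{(iterating } z\circ RG=\lambda\, z\text{)} \\
 &= \lambda^{n}\Big( z(V) + \lambda^{-n} D_V z[W] + O(\lambda^{-2n}) \Big)
   && \text{(Taylor expansion at } V\text{)} \\
 &= D_V z[W] + O(\lambda^{-n})
   && \text{(since } z(V)=0 \text{ on the stable manifold)} \\
 &\xrightarrow[n\to\infty]{} D_V z[W].
\end{align*}
```

Hence the $z$-coordinate of $\Psi(V, W)$ is $D_V z[W]$. The same computation without the factor $Z^n$, i.e., with the perturbation $L^{-(3-2[\phi])n} W = (Z\lambda)^{-n} W$, yields $Z^{-n} D_V z[W] + O(Z^{-n}(Z\lambda)^{-n})$, which tends to 0 since $Z > 1$; this is consistent with the naive construction giving 0.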
Remark 1 In the above discussion and in particular for $d = 3$ and $[\phi] = \frac{3-\epsilon}{4}$ with $\epsilon > 0$ small, and for the QFT corresponding to the trajectory from $V_{\mathrm{UV}} = V_{\mathrm{Gauss}}$ to the nontrivial fixed point $V_{\mathrm{IR}}$, the scaling dimension of the elementary field $\phi$ is the same in the IR as it is in the UV. This "nonrenormalization theorem" has recently been proved for the Euclidean lattice model [48] (see also [4] for the simpler hierarchical analogue). By contrast, if $[\phi] = \frac{d-2}{2}$ and $d < 4$, the scaling dimension in the IR is expected to be different from that in the UV because of the wave function renormalization (see [45, §15.5]). As for $[\phi] > \frac{d-2}{2}$ yet close to the value $\frac{d-2}{2}$, this is an even more subtle situation [8] which is beyond present mathematically rigorous methods.
5 Hierarchical Models

Since the implementation of the RG strategy is a difficult enterprise, it is useful to have simplified models on which to test one's methods. An important example of such is that of hierarchical models. Let $N$ be a positive integer and suppose one has a vector of centered Gaussian random variables $(\zeta_1^{(0)}, \ldots, \zeta_N^{(0)})$ whose joint law is specified by a covariance matrix $M = (M_{ij})_{1\le i,j\le N}$ whose entries sum up to zero. This implies $\sum_{i=1}^{N} \zeta_i^{(0)} = 0$ almost surely. One can make infinitely many independent copies of this vector and obtain a lattice Gaussian random field $(\zeta_x^{(0)})_{x\in L_0}$. Here the first layer $L_0$ is the set $\{1, 2, 3, \ldots\}$ with the copies corresponding to the groups of labels $\{kN + 1, \ldots, kN + N\}$, $k \ge 0$. One can then make independent copies $(\zeta_x^{(q)})_{x\in L_q}$ of this field indexed by integers $q \ge 0$. This introduces new layers $L_q$ on which one can put a geometrical structure by identifying the points of $L_q$ with the $N$-groups of the previous layer $L_{q-1}$. One thus obtains a singly infinite tree structure as in the following figure where $N = 3$.

[Figure: tree structure for $N = 3$, showing the layers $L_0$, $L_1$, $L_2$.]

For $x \in L_0$ one easily defines its ancestor $a_q(x)$ in $L_q$. Given a number $\alpha > 1$, and for $x \neq y \in L_0$ one defines $|x - y| = \alpha^q$ where $q$ is the smallest integer for which $a_q(x) = a_q(y)$. This is an ultrametric notion of distance, formally denoted as the norm of a difference. Let $\beta > 1$ be another parameter, then

$$\phi_x = \sum_{q=0}^{\infty} \beta^{-q}\, \zeta^{(q)}_{a_q(x)}$$

defines a random field $(\phi_x)_{x\in L_0}$ which is a hierarchical lattice Gaussian field. Its covariance is $E[\phi_x \phi_y] \sim |x - y|^{-2[\phi]}$ with $[\phi] = \frac{\log\beta}{\log\alpha}$. Consider $\mathbb{R}^d$ discretized using dyadic cubes and identify $L_q$ with the set of cubes of size $2^q$. Setting $N = 2^d$, $\alpha = 2$ and $\beta = 2^{[\phi]}$ produces a reasonable toy model for the massless Gaussian field on $\mathbb{R}^d$ with scaling dimension $[\phi]$ and with a unit cut-off (e.g., discretized on $\mathbb{Z}^d$).

Starting from such a Gaussian hierarchical lattice measure $d\mu_{C_0}$ for $(\phi_x)_{x\in L_0}$ one can perturb it by a product of single spin potentials involving $\phi_x^2$ and $\phi_x^4$ terms and repeat the previous story by integrating out the fluctuation fields $\zeta$ a few layers at a time. Namely, one can use the RG strategy in order to analyze the scaling limit of the resulting non-Gaussian field. A nice feature of such toy models is that many of the properties which earlier were approximately true, for instance the localization property of the RG, now become exact. In order to have a home for the scaling limit, one needs a notion of continuum. This is obtained by subdividing the nodes of the top layer $L_0$ and continuing the tree structure with the introduction of new layers $L_{-1}$, $L_{-2}$, etc. The set $L_{-\infty}$ of leaves or ends at infinity of the resulting doubly infinite tree structure is the needed continuum. The lattice $L_0$ is to $L_{-\infty}$ what the lattice $\mathbb{Z}^d$ is to $\mathbb{R}^d$.
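The ancestor map, the hierarchical distance and the covariance of the Gaussian field can be made concrete with a small sketch. The function names below are hypothetical, and the covariance routine assumes one specific choice of the matrix $M$ (diagonal entries $1 - 1/N$, off-diagonal entries $-1/N$, whose entries sum to zero); the values $N = 3$, $\alpha = 2$, $\beta^2 = 2$ are purely illustrative. Exact rational arithmetic is used so that the power law can be checked without rounding.

```python
from fractions import Fraction


def ancestor(x, q, N):
    """Ancestor a_q(x) in layer L_q of a site x of L_0 = {1, 2, 3, ...}.

    The group {kN+1, ..., kN+N} of one layer is identified with the
    point k+1 of the next layer (a labelling convention assumed here).
    """
    return (x - 1) // N**q + 1


def depth(x, y, N):
    """Smallest q >= 0 such that a_q(x) == a_q(y)."""
    q = 0
    while ancestor(x, q, N) != ancestor(y, q, N):
        q += 1
    return q


def distance(x, y, N, alpha):
    """Hierarchical distance |x - y| = alpha**q for x != y, and 0 if x == y."""
    return 0 if x == y else alpha ** depth(x, y, N)


def covariance(x, y, N, beta2):
    """Exact E[phi_x phi_y] for phi_x = sum_q beta**(-q) zeta^(q)_{a_q(x)}.

    Assumes M has 1 - 1/N on the diagonal and -1/N elsewhere;
    beta2 stands for beta**2 and must be a rational number > 1.
    """
    beta2 = Fraction(beta2)
    q0 = depth(x, y, N)
    # Layers q >= q0: both sites share the same ancestor (diagonal entry of M).
    tail = (1 - Fraction(1, N)) * beta2 ** (-q0) / (1 - 1 / beta2)
    if q0 == 0:
        return tail
    # Layer q0 - 1: distinct ancestors inside the same N-group (off-diagonal entry).
    return tail - Fraction(1, N) * beta2 ** (-(q0 - 1))


if __name__ == "__main__":
    N, alpha, beta2 = 3, 2, 2          # so that [phi] = log(beta)/log(alpha) = 1/2
    for y in (2, 4, 10):
        dist = distance(1, y, N, alpha)
        # With alpha = 2 and beta**2 = 2 one has |x - y|**(2[phi]) = |x - y|.
        print(y, dist, covariance(1, y, N, beta2) * dist)
```

Under the stated assumption on $M$, the product printed in the last column is the same for every $y \neq 1$: for $x \neq y$ the covariance equals $C\,|x - y|^{-2[\phi]}$ exactly, with $C = (1 - \beta^2/N)/(1 - \beta^{-2})$, which is positive when $\beta^2 < N$.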
The introduction of hierarchical models originated from two independent sources. One is the work of Dyson on one-dimensional Ising spin models with long-range interactions [27]. The other is the early work of Wilson on his RG theory. While some features of the latter were already present in the old article [61], one may say that its first systematic exposition was given in [62]. In fact, this article was about a hierarchical model as above and the corresponding RG map was called "the approximate recursion". The relevance of this approximation for physical models over $\mathbb{R}^d$ stemmed from Wilson's anticipation, in that article, of wavelet multiresolution analysis (see [7]). Note that the word "approximation" for the hierarchical RG in relation to the RG for real models is somewhat misleading, since it suggests that this is the zero-th step of a systematic approximation procedure which can lead, through successive improvements, to the "true" RG for models over $\mathbb{R}^d$. Despite some attempts in this direction (e.g., [52] and [53, §14.2]), the existence of such a procedure is unclear. The confusion may be due to the fact that the hierarchical RG is thematically similar to another simplification called the local potential approximation (LPA). The LPA arises from the above description of the RG on $\mathbb{R}^d$ if one ignores the nonlocal kernels which should appear in $V' = RG(V)$ as well as the gradient terms one obtains by the comparison of such nonlocal kernels with their local projection. There is rigorous work in the LPA setting, e.g., [29], as well as a nonrigorous approximation procedure starting from the LPA known as the derivative expansion (see, e.g., [6]). The author's point of view is that one should not try to approximate critical exponents such as anomalous dimensions for models on $\mathbb{R}^d$ by their analogues on hierarchical models.
The latter reflect the different tree-like geometric texture of the underlying continuum (and in particular depend on the choice of $N$ which governs the shape of the underlying tree-like space). The utility of hierarchical models is that they are a good testing ground for RG techniques. Such methodology has been successful, in a rigorous setting, for instance in the work of Gawędzki and Kupiainen on $(\nabla\phi)^4$ lattice models (the hierarchical model testing was done in [35, 36] while the real model was treated in [37, 38]) or that of Brydges and Slade [15] on the weakly self-avoiding walk in four dimensions (their approach was tested on a hierarchical model in [12, 13]). Another example, in a nonrigorous context, of the success of this methodology is the work of Wilson himself when he developed his RG theory in the first place. Indeed, this theory presented in [62] initially had a modest impact, perhaps because it pertained to the hierarchical model and it was not clear, to the skeptical minds at the time, how it could shed light on the physics of real models. The situation drastically changed with the soon-to-follow article [63] which allowed the RG method to produce an expansion (which can be systematically improved) for critical exponents of real models such as the Ising model in three dimensions related to the liquid-vapour critical point of water (i.e., a real-world phenomenon where such exponents can be measured experimentally). This so-called $\epsilon$-expansion whose definitive treatment was later given in [64] was largely responsible for the revolution created by Wilson's RG theory in physics. It is now part of any theoretical physicist's DNA or view of the world (see, e.g., [24, 25]). The key article [63] contained two major conceptual advances. The first is that of introducing the bifurcation parameter $\epsilon$ (as done later
in the fractional $\phi^4_3$ model) and expanding exponents with respect to this parameter, which can be viewed as the difference $4 - d$ between spatial dimensions. This idea can be implemented for both the $\mathbb{R}^d$ model and the hierarchical one. The second idea was the implementation of this $\epsilon$-deformation in the $\mathbb{R}^d$ case using the newly developed dimensional regularization in QFT. One can thus ask if the first idea was initially developed on the hierarchical model. The answer is “yes” or in Wilson’s words “Then, at Michael’s urging, I work out what happens near four dimensions for the approximate recursion formula, and find that d-4 acts as a small parameter. Knowing this it is then trivial, given my field theoretic training, to construct the beginning of the epsilon expansion for critical exponents.” [65]. There is great arbitrariness when setting up a hierarchical model in order to mimic one living in $\mathbb{R}^d$. More precisely, there are lots of ways to pick $N$, $M$, $\alpha$ and $\beta$ for given $d$ and $[\phi]$. There are many versions of the hierarchical model considered by various authors (see [53] for a review). If one does not set up this model carefully, one can end up with rather absurd results such as lack of universality or having critical exponents produced by the RG strategy depend on the artificial yardstick $L$ (see, e.g., [53, §5.2] for a discussion of this issue). There is a particular set-up for the hierarchical RG which, among many other beautiful mathematical properties, avoids such problems, and it uses p-adics. For $d$ an integer, the p-adic set-up consists in taking $N = p^d$ where $p$ is a prime number, $\alpha = p$, $\beta = p^{[\phi]}$ and defining the matrix $M$ by putting $1 - p^{-d}$ on the diagonal and $-p^{-d}$ everywhere else. The p-adic fractional $\phi^4_3$ model is the particular case of hierarchical model obtained in this way when $d = 3$ and $[\phi] = \frac{3-\epsilon}{4}$. From the point of view of probability theory, the restriction to primes is not essential.
However, doing so gives access to a huge “software library” developed for the needs of number theory. One can then identify $L_{-\infty}$ with $\mathbb{Q}_p^d$ where $\mathbb{Q}_p$ is the field (in the algebra sense) of p-adic numbers. The fields (in the QFT or probability theory sense) are still real-valued. Instead of the previous elementary and ad hoc description of the p-adic model, a more elegant approach (see [4]) is to use Fourier analysis and the theory of distributions on the p-adics, for real or complex-valued functions, which come from this software library. Since $\mathbb{Q}_p$ is an additive group, there is a natural notion of translation invariance. The maximal compact subgroup $GL_d(\mathbb{Z}_p)$ (unique up to conjugation), which is the analogue of the orthogonal group in $\mathbb{R}^d$, supplies the notion of rotation invariance. The analogue of the Euclidean norm in $\mathbb{R}^d$ is the maximum of the p-adic absolute values of the components (used to define the previous hierarchical distance $|x - y|$) since it is invariant by $GL_d(\mathbb{Z}_p)$. Instead of $\mathbb{R}_+^*$, the group of scaling transformations $p^{\mathbb{Z}}$ is now discrete. One sets $L = p^l$ for some integer $l \geq 1$ when defining the RG, which then integrates out the first $l$ layers $L_0, \ldots, L_{l-1}$ at each step. One thus avoids the problem of $L$-dependent critical exponents (since the texture of the underlying space depends on $p$, not $L$). This is also the same RG map discussed earlier using a Fourier cut-off. Indeed, by taking the function $\eta$ equal to the sharp characteristic function of the interval $[0, 1]$ one recovers the above ad hoc hierarchical model. One can also try to localize the previous notions of invariances and look for an analogue of conformal invariance (see [3, 46, 47, 51]).
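The norm described above (the maximum of the p-adic absolute values of the components) can be illustrated with a minimal sketch restricted to rational vectors, which form a dense subset of $\mathbb{Q}_p^d$. The function names `p_adic_valuation`, `p_adic_abs` and `p_adic_norm` are illustrative choices and do not come from the text; the sketch only checks the ultrametric inequality and the effect of scaling by $p$ on one example.

```python
from fractions import Fraction

def p_adic_valuation(x, p):
    """Exponent of p in the factorisation of a nonzero rational x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is +infinity")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def p_adic_abs(x, p):
    """p-adic absolute value |x|_p = p^(-v_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    return Fraction(p) ** (-p_adic_valuation(x, p))

def p_adic_norm(vec, p):
    """Maximum of the p-adic absolute values of the components."""
    return max(p_adic_abs(c, p) for c in vec)

p = 3
x = [Fraction(3), Fraction(1, 9), Fraction(6)]
y = [Fraction(6), Fraction(2, 9), Fraction(3)]
x_plus_y = [a + b for a, b in zip(x, y)]
# Ultrametric inequality: |x + y| <= max(|x|, |y|).
print(p_adic_norm(x, p), p_adic_norm(y, p), p_adic_norm(x_plus_y, p))
# Multiplying by p shrinks the norm by a factor p (discrete scaling group).
print(p_adic_norm([p * c for c in x], p))
```

Exact rational arithmetic is used so that the comparisons are not affected by rounding.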
A leitmotiv in the theory of automorphic forms is that completions of $\mathbb{Q}$ such as $\mathbb{R}$ or $\mathbb{Q}_p$ should be treated on equal footing. Beautiful theories in analysis (e.g., the theory of unitary representations of noncompact groups) developed for $\mathbb{R}$ have analogues over $\mathbb{Q}_p$. Such unity is visible in the series of books by Gel’fand and co-authors on generalized functions (e.g., [40, 41] which pertain to self-similar random fields) which included [39] as a sixth volume in the Russian edition. The author believes (and hopes to have convinced the reader) that a similar harmonious unity is present in the context of generalized random fields, QFT and the RG. The previous presentation of QFT and the RG may seem somewhat impressionistic and it indeed avoided discussing many issues which are important for mathematical rigor: the infinite volume limit, dealing with nonlocalities and gradient terms, the specific norms and bounds needed, etc. Nevertheless, the reader can find in the article [4] a complete rigorous substantiation of the story told in Sects. 3 and 4 (with the exception of the OPE), in the case of the self-similar p-adic $\phi^4_3$ model at $V_{\rm nontriv}$. For the model over $\mathbb{R}^3$, the reader is referred to the preliminary results [1, 14, 57] which should be easier to read after seeing a simpler version [4, §6] of the needed RG estimates.

Acknowledgments
The author would like to express his gratitude to those who influenced his thinking about QFT and the RG, over the years. These are D. C. Brydges, J. Magnen, P. K. Mitter and V. Rivasseau. Of course, any shortcoming of the present article is the responsibility of the author alone.
References
1. A. Abdesselam, A complete renormalization group trajectory between two fixed points. Comm. Math. Phys. 276 (2007), no. 3, 727–772.
2. A. Abdesselam, A second-quantized Kolmogorov-Chentsov theorem via the operator product expansion. Preprint arXiv:1604.05259[math.PR], 2016. To appear in Comm. Math. Phys.
3. A. Abdesselam, Towards three-dimensional conformal probability. p-Adic Numbers Ultrametric Anal. Appl. 10 (2018), no. 4, 233–252.
4. A. Abdesselam, A. Chandra and G. Guadagni, Rigorous quantum field theory functional integrals over the p-adics I: anomalous dimensions. Preprint arXiv:1302.5971[math.PR], 2013.
5. ATLAS Collaboration, Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC. Phys. Lett. B 716 (2012), no. 1, 1–29.
6. C. Bagnuls and C. Bervillier, Exact renormalization group equations: an introductory review. Renormalization group theory in the new millennium, II. Phys. Rep. 348 (2001), no. 1–2, 91–157.
7. G. Battle, Wavelet refinement of the Wilson recursion formula. In: “Recent Advances in Wavelet Analysis”, 87–118, Wavelet Anal. Appl., 3, Academic Press, Boston, MA, 1994.
8. C. Behan, L. Rastelli, S. Rychkov and B. Zan, Long-range critical exponents near the short-range crossover. Phys. Rev. Lett. 118 (2017), 241601.
9. A. Beilinson and V. Drinfeld, Chiral Algebras. American Mathematical Society Colloquium Publications, 51. American Math. Soc., Providence, RI, 2004.
10. R. E. Borcherds, Vertex algebras, Kac-Moody algebras, and the monster. Proc. Nat. Acad. Sci. U.S.A. 83 (1986), no. 10, 3068–3071.
11. D. C. Brydges, G. Guadagni, and P. K. Mitter, Finite range decomposition of Gaussian processes. J. Statist. Phys. 115 (2004), no. 1–2, 415–449.
12. D. C. Brydges and J. Z. Imbrie, End-to-end distance from the Green’s function for a hierarchical self-avoiding walk in four dimensions. Comm. Math. Phys. 239 (2003), no. 3, 523–547.
13. D. C. Brydges and J. Z. Imbrie, Green’s function for a hierarchical self-avoiding walk in four dimensions. Comm. Math. Phys. 239 (2003), no. 3, 549–584.
14. D. C. Brydges, P. K. Mitter and B. Scoppola, Critical $(\Phi^4)_{3,\epsilon}$. Comm. Math. Phys. 240 (2003), 281–327.
15. D. Brydges and G. Slade, Renormalisation group analysis of weakly self-avoiding walk in dimensions four and higher. In: “Proceedings of the International Congress of Mathematicians”, Vol. IV, 2232–2257, Hindustan Book Agency, New Delhi, 2010.
16. F. Camia, C. Garban and C. Newman, Planar Ising magnetization field I. Uniqueness of the critical scaling limit. Ann. Probab. 43 (2015), no. 2, 528–571.
17. P. Candelas, X. C. de la Ossa, P. S. Green and L. Parkes, A pair of Calabi-Yau manifolds as an exactly soluble superconformal theory. Nuclear Phys. B 359 (1991), no. 1, 21–74.
18. D. Chelkak, C. Hongler and K. Izyurov, Conformal invariance of spin correlations in the planar Ising model. Ann. of Math. (2) 181 (2015), no. 3, 1087–1138.
19. CMS Collaboration, Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC. Phys. Lett. B 716 (2012), no. 1, 30–61.
20. K. Costello, Renormalization and effective field theory. Mathematical Surveys and Monographs, 170. American Mathematical Society, Providence, RI, 2011.
21. K. Costello and O. Gwilliam, Factorization Algebras in Quantum Field Theory. Vol. 1. New Mathematical Monographs, 31. Cambridge University Press, Cambridge, 2017.
22. R. L. Dobrushin, Gaussian and their subordinated self-similar random generalized fields. Ann. Probab. 7 (1979), no. 1, 1–28.
23. R. L. Dobrushin, Automodel generalized random fields and their renorm group. In: Multicomponent Random Systems, Ed.: R. L. Dobrushin and Ya. G. Sinai, pp. 153–198, Adv. Probab. Related Topics 6, Marcel Dekker, New York, 1980.
24. M. R. Douglas, Spaces of quantum field theories. J. Phys.: Conf. Ser. 462 (2013), 012011.
25. M. R. Douglas, Foundations of quantum field theory. In: “String-Math 2011”, 105–124, Proc. Sympos. Pure Math., 85, American Math. Soc., Providence, RI, 2012.
26. J. Dubedat, Exact bosonization of the Ising model. Preprint arXiv:1112.4399[math.PR], 2011.
27. F. J. Dyson, Existence of a phase-transition in a one-dimensional Ising ferromagnet. Comm. Math. Phys. 12 (1969), no. 2, 91–107.
28. E. de Faria and W. de Melo, Mathematical aspects of quantum field theory. Cambridge Studies in Advanced Mathematics, 127. Cambridge University Press, Cambridge, 2010.
29. G. Felder, Renormalization group in the local potential approximation. Comm. Math. Phys. 111 (1987), no. 1, 101–121.
30. G. B. Folland, Quantum field theory. A tourist guide for mathematicians. Mathematical Surveys and Monographs, 149. American Mathematical Society, Providence, RI, 2008.
31. E. Frenkel, Langlands correspondence for loop groups. Cambridge Studies in Advanced Mathematics, 103. Cambridge University Press, Cambridge, 2007.
32. E. Frenkel and D. Ben-Zvi, Vertex Algebras and Algebraic Curves. Mathematical Surveys and Monographs, 88. American Math. Soc., Providence, RI, 2001.
33. T. Fulton, F. Rohrlich and L. Witten, Conformal Invariance in Physics. Rev. Mod. Phys. 34 (1962), no. 3, 442–457.
34. D. Gaitsgory, Notes on 2D conformal field theory and string theory. In Quantum Fields and Strings: a Course for Mathematicians, Vol. 2 (Princeton, NJ, 1996/1997), Edited by P. Deligne et al. pp. 1017–1089, American Math. Soc., Providence, RI, 1999.
35. K. Gawędzki and A. Kupiainen, Renormalization group study of a critical lattice model. I. Convergence to the line of fixed points. Comm. Math. Phys. 82 (1981/82), no. 3, 407–433.
36. K. Gawędzki and A. Kupiainen, Renormalization group study of a critical lattice model. II. The correlation functions. Comm. Math. Phys. 83 (1982), no. 4, 469–492.
37. K. Gawędzki and A. Kupiainen, Block spin renormalization group for dipole gas and $(\nabla\varphi)^4$. Ann. Physics 147 (1983), no. 1, 198–243.
38. K. Gawędzki and A. Kupiainen, Lattice dipole gas and $(\nabla\varphi)^4$ models at long distances: decay of correlations and scaling limit. Comm. Math. Phys. 92 (1984), no. 4, 531–553.
39. I. M. Gel’fand, M. I. Graev and I. I. Pyatetskii-Shapiro, Representation Theory and Automorphic Functions. Translated by K. A. Hirsch. W. B. Saunders Co., Philadelphia–London–Toronto, 1969.
40. I. M. Gel’fand and G. E. Shilov, Generalized Functions. Vol. 1: Properties and Operations. Translated by E. Saletan. Academic Press, New York–London, 1964.
41. I. M. Gel’fand and N. Ya. Vilenkin, Generalized Functions. Vol. 4: Applications of Harmonic Analysis. Translated by A. Feinstein. Academic Press, New York–London, 1964.
42. M. Gell-Mann and F. E. Low, Quantum electrodynamics at small distances. Phys. Rev. (2) 95 (1954), 1300–1312.
43. M. Hairer, Introduction to regularity structures. Brazilian J. Probab. Stat. 29 (2015), no. 2, 175–210.
44. A. Kapustin and E. Witten, Electric-magnetic duality and the geometric Langlands program. Commun. Number Theory Phys. 1 (2007), no. 1, 1–236.
45. A. Kupiainen, Introduction to The Renormalization Group. Course lecture notes (2014) available at http://www.math.lmu.de/~bohmmech/Teaching/bricmont2014/notes_kupiainen.pdf
46. È. Yu. Lerner, The hierarchical Dyson model and p-adic conformal invariance. Theor. Math. Phys. 97 (1993), no. 2, 1259–1266.
47. È. Yu. Lerner and M. D. Missarov, p-adic conformal invariance and the Bruhat-Tits tree. Lett. Math. Phys. 22 (1991), no. 2, 123–129.
48. M. Lohmann, G. Slade and B. C. Wallace, Critical two-point function for long-range O(n) models below the upper critical dimension. J. Statist. Phys. 169 (2017), no. 6, 1132–1161.
49. J. Lurie, On the classification of topological field theories. Current developments in mathematics, 2008, 129–280, Int. Press, Somerville, MA, 2009.
50. P. Major, Multiple Wiener-Itô Integrals. With Applications to Limit Theorems. Lecture Notes in Mathematics 849, Springer, Berlin, 1981.
51. E. Melzer, Non-Archimedean conformal field theories. Internat. J. Modern Phys. A 4 (1989), no. 18, 4877–4908.
52. Y. Meurice, A perturbative improvement of the hierarchical approximation. Unpublished preprint arXiv:hep-th/9307128, 1993.
53. Y. Meurice, Nonlinear aspects of the renormalization group flows of Dyson’s hierarchical model. J. Phys. A 40 (2007), no. 23, R39–R102.
54. E. Pereira and M. O’Carroll, Orthogonality between scales and wavelets in a representation for correlation functions. The lattice dipole gas and $(\nabla\phi)^4$ models. J. Statist. Phys. 73 (1993), no. 3–4, 695–721.
55. A. M. Polyakov, Conformal symmetry of critical fluctuations. J. Exp. Theor. Phys. Lett. 12 (1970), 381–383.
56. O. Schramm, Conformally invariant scaling limits: an overview and a collection of problems. In: International Congress of Mathematicians, Vol. I, 513–543, European Math. Soc., Zürich, 2007.
57. G. Slade, Critical exponents for long-range O(n) models below the upper critical dimension. Comm. Math. Phys. 358 (2018), no. 1, 343–436.
58. S. Smirnov, Discrete complex analysis and probability. In: “Proceedings of the International Congress of Mathematicians”, Vol. I, 595–621, Hindustan Book Agency, New Delhi, 2010.
59. E. C. G. Stueckelberg and A. Petermann, La normalisation des constantes dans la théorie des quanta. Helvetica Phys. Acta 26 (1953), 499–520.
60. F. J. Wegner, Corrections to scaling laws. Phys. Rev. B 5 (1972), no. 11, 4529–4536.
61. K. G. Wilson, Model Hamiltonians for local quantum field theory. Phys. Rev. 140 (1965), no. 2B, B445–B457.
62. K. G. Wilson, Renormalization group and critical phenomena. II. Phase-space cell analysis of critical behavior. Phys. Rev. B 4 (1971), no. 9, 3184–3205.
63. K. G. Wilson and M. E. Fisher, Critical Exponents in 3.99 Dimensions. Phys. Rev. Lett. 28 (1972), no. 4, 240–243.
64. K. G. Wilson and J. Kogut, The renormalization group and the $\epsilon$ expansion. Phys. Rep. 12 (1974), no. 2, 75–199.
65. K. G. Wilson, cited from Part II of his 07/06/2002 interview in Physics of Scales Activities. Transcript available at http://authors.library.caltech.edu/5456/1/hrst.mit.edu/hrs/renormalization/Wilson/Wilson2.htm
66. E. Witten, Quantum field theory and the Jones polynomial. Comm. Math. Phys. 121 (1989), no. 3, 351–399.
67. E. Witten, Monopoles and four-manifolds. Math. Res. Lett. 1 (1994), no. 6, 769–796.
68. E. Witten, Perturbative quantum field theory. In Quantum Fields and Strings: a Course for Mathematicians, Vol. 1 (Princeton, NJ, 1996/1997), Edited by P. Deligne et al. pp. 419–473, American Math. Soc., Providence, RI, 1999.
69. T. T. Wu, Theory of Toeplitz determinants and the spin correlations of the two-dimensional Ising model. I. Phys. Rev. 149 (1966), no. 1, 380–401.
70. A. B. Zamolodchikov, Renormalization group and perturbation theory about fixed points in two-dimensional field theory. Sov. J. Nucl. Phys. 46 (1987), 1090–1096.
Phase Operator on $L^2(\mathbb{Q}_p)$ and the Zeroes of Fisher and Riemann
Parikshit Dutta and Debashis Ghoshal
Abstract The distribution of the non-trivial zeroes of the Riemann zeta function, according to the Riemann hypothesis, is tantalisingly similar to the zeroes of the partition functions (Fisher and Yang-Lee zeroes) of statistical mechanical models studied by physicists. The resolvent function of an operator akin to the phase operator, conjugate to the number operator in quantum mechanics, turns out to be important in this approach. The generalised Vladimirov derivative acting on the space $L^2(\mathbb{Q}_p)$ of complex valued locally constant functions on the p-adic field is rather similar to the number operator. We show that a ‘phase operator’ conjugate to it can be constructed on a subspace $L^2(p^{-1}\mathbb{Z}_p)$ of $L^2(\mathbb{Q}_p)$. We discuss (at physicists’ level of rigour) how to combine this for all primes to possibly relate to the zeroes of the Riemann zeta function. Finally, we extend these results to the family of Dirichlet L-functions, using our recent construction of Vladimirov derivative like pseudodifferential operators associated with the Dirichlet characters.
Keywords Riemann zeta function · Riemann hypothesis · Partition functions · Vladimirov operator · Dirichlet L-functions
1 Introduction
The statistical distribution of the zeroes of the Riemann zeta function, and the related family of Dirichlet L-functions, qualitatively resembles the eigenvalue distribution of a random ensemble of unitary matrices [1–3]. It is also reminiscent of the distribution of zeroes of partition functions of statistical models. The latter observation is the motivation to search for a suitable model in physicists’ approach to the problem;
P. Dutta Asutosh College, Kolkata, India D. Ghoshal () School of Physical Sciences, Jawaharlal Nehru University, New Delhi, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 W. A. Zúñiga-Galindo, B. Toni (eds.), Advances in Non-Archimedean Analysis and Applications, STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health, https://doi.org/10.1007/978-3-030-81976-7_5
the literature is vast, however, see e.g., [4–9], the review [10] and references therein. This resemblance may be an important guide since the zeroes of the partition function of many systems, by the Yang-Lee type theorems [11], all lie parallel to the imaginary axis (or on the unit circle). These zeroes are called Yang-Lee zeroes or Fisher zeroes, depending upon whether the partition function is viewed as a function of the applied external field, e.g., magnetic field, or of $\beta = 1/(k_B T)$, the inverse temperature in natural units. The arithmetic or primon gas of [5–7] and the number theoretic spin chain of [8, 9], in particular, proposed interesting models for which the partition functions are directly related to the Riemann zeta function. In this approach the non-trivial zeroes of the zeta function are to be identified with the Yang-Lee or Fisher zeroes. Motivated by these, we shall propose a statistical model and compute its partition function. The idea is again to associate the spectrum of an operator with the Fisher zeroes of the partition function. In addition, however, we shall study the spectrum of some relevant operators of these models. The systems that relate to the L-functions of our interest can be thought of as spins in an external magnetic field. Since the spectrum of a Hamiltonian of this type of spin system is discrete (the spins being half-integer valued) this operator is similar to the number operator of an oscillator. A phase operator that is conjugate to this will be a new ingredient in our investigation. The construction of a phase operator which is truly canonically conjugate to the number operator has been the subject of a long-standing quest that may not be completely closed yet. Nevertheless, several different ways to define the phase operator have been proposed; for example, [12–19] is a partial list. In particular, we shall investigate two ways of defining it for the spin models corresponding to the family of L-functions.
In the first construction, we follow [17], where the authors propose an operator by directly constructing eigenstates of phase for a system with a discrete spectrum. The second approach is motivated by the proposal in [15]. We shall argue that there are enough hints in these proposals to understand the correspondence between the spectrum of these operators and the zeroes of the partition function. In the following, we shall first review (in Sect. 2) some of the relevant arguments and results from the cited references, in the context of a simple spin system on a one-dimensional lattice. In Sect. 3, after recalling some properties of the Riemann zeta function and our earlier work on its relation to operators on the Hilbert space of complex valued functions on the p-adic number field Qp [20, 21], we elaborate on a proposal to view it as a statistical model of spins. In Sects. 3.1 and 3.2 we detail two constructions of the phase operators for the spin model for the Riemann zeta function, which are then extended to the family of Dirichlet L-functions in Sect. 4.
2 Quantum Spins in External Field
Spin models in one dimension are among the simplest statistical models, yet they offer an arena rich enough to experiment, before considering more complicated systems. The variables are ‘spins’ $s_n$ at lattice points $n \in \mathbb{Z}$ or $n \in \mathbb{N}$ that can
take $(2j + 1)$ values $\{-j, -j+1, \cdots, j-1, j\}$, in the spin-$j$ representation, where $2j \in \mathbb{Z}$. In models of magnetism, these spins interact locally, usually with the nearest neighbours. In addition, one may turn on an external magnetic field. Let us digress to recall the properties of a simpler model, the Ising model, in which the classical spins take one of two possible values $\pm 1$ and the total Hamiltonian is $H = -J\sum_n s_n s_{n+1} - B\sum_n s_n$, where $J$ is the strength of interaction ($J > 0$ being ferromagnetic and anti-ferromagnetic otherwise) and the second term arises from an interaction with an external magnetic field $B$. The partition function (in the absence of an external field) of an Ising system of size $L$ at inverse temperature $\beta$ is

$$Z(\beta) \equiv \mathrm{Tr}\, e^{-\beta H} = \sum_{\{s_n\}} \exp\Big(\beta J \sum_{n=1}^{L-1} s_n s_{n+1}\Big)$$

In this simple case, one may also change variables to $\sigma_n \equiv \sigma_{\langle n-1,n\rangle} = s_{n-1}s_n$ associated to the edges $\langle n-1, n\rangle$ joining nearest neighbours. Evidently, $\sigma_n = \pm 1$ as well. Thus

$$Z(\beta) = 2\sum_{\{\sigma_n\}} \exp\Big(\beta J \sum_{n=2}^{L} \sigma_n\Big) = 2\sum_{\sigma_n} \langle\sigma_2, \sigma_3, \cdots|\exp\Big(\beta J \sum_j S_j\Big)|\sigma_2, \sigma_3, \cdots\rangle$$

where we have defined vectors $|\sigma_n\rangle$ in a (two-dimensional) Hilbert space corresponding to the spin on the edge $\langle n-1, n\rangle$ and the $S_n$ are spin operators such that $S_n|\sigma_n\rangle = \sigma_n|\sigma_n\rangle$. A generalisation of this model allows the coupling constants $J$ to be position dependent, so that the Hamiltonian is $H = -\sum_n J_n\sigma_n$ and

$$Z(\beta) = 2\sum_{\sigma_n} \langle\sigma_2, \sigma_3, \cdots|\, e^{\beta\sum_n J_n S_n}\,|\sigma_2, \sigma_3, \cdots\rangle$$
is the canonical partition function of the generalised model at the temperature $k_B T = 1/\beta$. We would like to consider the general case where the spins are valued in the spin-$j$ representation of su(2). Although we seek a partition function of the form above, the general spin case cannot be realised as an Ising type model; rather it will be a model of spins in an external local magnetic field $B_n$ at site $n$. It will be useful to think of $S_n$ as the third component $S_n^3$ of the su(2) spin operators on the edge/site, the others being $S_n^{\pm}$. The vectors $|\sigma_2, \sigma_3, \cdots\rangle = |\sigma_2\rangle \otimes |\sigma_3\rangle \otimes \cdots$ belong to the product space. The interaction Hamiltonian $H \sim \mathbf{B}\cdot\mathbf{S}$ between the spin (to be precise, the magnetic moment, which differs from the spin by a constant that is irrelevant for our purpose and can be absorbed in $B_n$) and the external field (assumed
to be along the $z$-direction) described by the Hamiltonian $H = -\sum_n B_n\sigma_n$ leads to the partition function

$$Z(\beta) = \sum_{\sigma_n} \langle\sigma_2, \sigma_3, \cdots|\, e^{\beta\sum_n B_n S_n}\,|\sigma_2, \sigma_3, \cdots\rangle$$

at the temperature $k_B T = 1/\beta$. Our objective is to obtain an identity for the partition function for this model. To this end, we shall seek an operator that, in a certain well defined sense, is formally canonically conjugate to the $z$-component $S_n^3$ of the spin operator at site $n$. There are well known difficulties in defining such an operator, however, we shall see that one needs to make a much weaker demand. In this context, it is useful to remember the Schwinger oscillator realisation of the algebra su(2) in terms of a pair of bosonic creation/annihilation operators $(a_1^\dagger, a_1, a_2^\dagger, a_2)$ at each edge, where we drop the edge index for the time being. Then $S_+ = a_1^\dagger a_2$, $S_- = a_2^\dagger a_1$ and the third component is the difference of the number operators $S_3 = \frac{1}{2}(n_1 - n_2) = \frac{1}{2}\big(a_1^\dagger a_1 - a_2^\dagger a_2\big)$. One can formally introduce the phase operator $\Phi = \frac{1}{2}(\phi_1 - \phi_2)$ such that $[\phi_a, n_b] = i\delta_{ab}$, however, there are several mathematical difficulties in defining the above [12, 13]. We will now review an explicit construction to show how one can still work around this problem.
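As a sanity check of the edge-variable rewriting of the Ising partition function used earlier in this section, the following minimal sketch compares a brute-force sum over site spins of an open chain with the edge form. It assumes position-dependent couplings $J_n$ and uses the elementary fact that the sum over independent edge variables factorises into $\prod_n 2\cosh(\beta J_n)$; the function names and the sample couplings are illustrative, not taken from the text.

```python
import itertools
import math

def ising_partition_brute_force(couplings, beta):
    """Sum exp(beta * sum_n J_n s_n s_{n+1}) over all site spins of an open chain."""
    n_sites = len(couplings) + 1
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=n_sites):
        exponent = sum(J * spins[i] * spins[i + 1] for i, J in enumerate(couplings))
        total += math.exp(beta * exponent)
    return total

def ising_partition_edge_form(couplings, beta):
    """Edge-variable form: 2 * prod_n sum_{sigma_n = +-1} exp(beta * J_n * sigma_n)."""
    result = 2.0  # overall factor 2 from the global spin flip s_n -> -s_n
    for J in couplings:
        result *= 2.0 * math.cosh(beta * J)
    return result

couplings = [1.0, 0.5, -0.7, 1.3, 0.2]  # hypothetical values of J_n
beta = 0.8
print(ising_partition_brute_force(couplings, beta))
print(ising_partition_edge_form(couplings, beta))
```

The two printed numbers should agree up to rounding, which reflects the factor 2 in front of the sum over $\sigma_n$.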
2.1 Phase Operator via Phase Eigenstates
Let us label the eigenstates in the spin-$j$ representation of $S_3$ as $|m\rangle$, for $m = -j, \cdots, j$. One can define an eigenstate of phase as a unitary transform of these states as

$$|\phi_k\rangle = \frac{1}{\sqrt{2j+1}} \sum_{m=-j}^{j} e^{-im\phi_k B}\,|m\rangle \quad\text{where}\quad \phi_k = \frac{2\pi k}{B(2j+1)}, \quad k = -j, \cdots, j \tag{1}$$

are the eigenvalues of the phase. The phase eigenstates satisfy

$$\langle\phi_{k'}|\phi_k\rangle = \frac{1}{2j+1}\sum_{m=-j}^{j} e^{-imB(\phi_k - \phi_{k'})} = \delta_{k,k'} \tag{2}$$

and thus provide an orthonormal basis of the Hilbert space of states.
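The orthonormality relation (2) is easy to confirm numerically. The sketch below builds the components $\langle m|\phi_k\rangle$ from Eq. (1) and measures how far the Gram matrix is from the identity; the helper names and the value $B = 1.3$ are arbitrary illustrative choices.

```python
import cmath
import math

def phase_eigenstate(k, j, B):
    """Components <m|phi_k> for m = -j..j, following Eq. (1)."""
    dim = int(round(2 * j + 1))
    phi_k = 2 * math.pi * k / (B * dim)
    return [cmath.exp(-1j * (-j + a) * phi_k * B) / math.sqrt(dim) for a in range(dim)]

def inner(u, v):
    """Hermitian inner product <u|v>."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def gram_deviation(j, B):
    """Largest entry of |<phi_k'|phi_k> - delta_{k,k'}| over all k, k'."""
    dim = int(round(2 * j + 1))
    states = [phase_eigenstate(-j + a, j, B) for a in range(dim)]
    return max(
        abs(inner(states[r], states[c]) - (1.0 if r == c else 0.0))
        for r in range(dim)
        for c in range(dim)
    )

for j in (1, 1.5, 2):
    print(j, gram_deviation(j, B=1.3))
```

The deviation is at the level of floating-point rounding for both integer and half-integer $j$, since only differences $k - k'$ (which are integers) enter Eq. (2).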
In terms of these, we may define the ‘phase operator’ through spectral decomposition as

$$\hat\phi = \sum_{k=-j}^{j} \phi_k\, |\phi_k\rangle\langle\phi_k| \tag{3}$$

We shall now show that it transforms covariantly when conjugated by $e^{\beta B S_3}$. This works for special values of $\beta$ since an eigenvalue $\phi_k$ of $\hat\phi$, being angle-valued, is only defined modulo $2\pi/B$. In order to see this, we note that

$$e^{-\beta B S_3}\,\hat\phi\, e^{\beta B S_3} = \sum_{k=-j}^{j} \frac{\phi_k}{2j+1} \sum_{m=-j}^{j}\sum_{m'=-j}^{j} e^{-im(\phi_k - i\beta)B + im'(\phi_k - i\beta)B}\, |m\rangle\langle m'|$$

There are two cases to consider. The first is trivial: for $\beta = 0$ or any of its periodic images $i\beta = \frac{2\pi n}{B}$ ($n \in \mathbb{Z}$) in the complex $\beta$-plane, the RHS is the phase operator $\hat\phi$. More interestingly, if $i\beta$ takes any of the specific discrete values $\frac{2\pi k}{B(2j+1)} + \frac{2\pi n}{B}$, where $k = -j, \cdots, j$ (but $k \neq 0$) and $n \in \mathbb{Z}$, i.e., $i\beta$ is a difference between the phase eigenvalues (mod $2\pi/B$), then $\phi_k - i\beta$ is again an allowed eigenvalue of the phase operator (mod $2\pi/B$). In this case, we can add and subtract $i\beta$ to the eigenvalue $\phi_k$ and use the completeness of the basis, to find

$$e^{-\beta B S_3}\,\hat\phi\, e^{\beta B S_3} = \hat\phi + i\beta \quad\text{only for}\quad 0 \neq \beta = -\frac{2\pi i j}{B(2j+1)}, \cdots, \frac{2\pi i j}{B(2j+1)} \quad \Big(\mathrm{mod}\ \frac{2\pi}{B}\Big) \tag{4}$$

This is called a shift covariance relation [17]. It may also be rewritten as a commutator

$$\big[\hat\phi, e^{\beta B S_3}\big] = i\beta\, e^{\beta B S_3} \quad\text{only for}\quad 0 \neq \beta = -\frac{2\pi i j}{B(2j+1)}, \cdots, \frac{2\pi i j}{B(2j+1)} \quad \Big(\mathrm{mod}\ \frac{2\pi}{B}\Big)$$
i.e., at special values of the inverse temperature. To summarise, we find that $\hat\phi$ in Eq. (3) satisfies shift covariance; alternatively, though somewhat loosely, it is ‘canonically conjugate’ to $S_3$ only for a special set of an infinite number of imaginary values of $\beta$, all on the line $\mathrm{Re}\,\beta = 0$ as above. At $\beta = 0$ (mod $\frac{2\pi}{B}$), the commutator is trivial¹. In passing, it is instructive to take the trace of the ‘canonical commutator’. The left hand side evidently vanishes, since the vector space of states is finite, namely $(2j + 1)$, dimensional. On the right hand side, the trace of $e^{-\beta H}$ is the partition function, which vanishes, being a sum over the roots of unity. Thus the values of $\beta$
¹ It is also reflected in the resolvent of the phase operator, as we shall see in the following.
for which Eq. (4) is valid must also satisfy the condition $\mathrm{Tr}\, e^{-\beta H} = 0$. This means that, mod $2\pi/B$, the values of $i\beta \neq 0$ for which the partition function has a zero are the same as the eigenvalues of $\hat\phi$. The resolvent of the exponential of the phase operator at a single site (as a function of $z = e^{i\phi}$) is $\hat R[\hat\phi](\phi) = \big(1 - e^{-i\phi} e^{i\hat\phi}\big)^{-1}$ and its trace is

$$\mathrm{Tr}\,\hat R[\hat\phi](\phi) = \sum_{k=-j}^{j} \langle\phi_k|\,\frac{1}{1 - e^{i(\hat\phi - \phi)}}\,|\phi_k\rangle = \sum_{k=-j}^{j} \frac{1}{1 - e^{i(\phi_k - \phi)}}$$

On the other hand, the partition function at a single site $Z_1(\beta) = \mathrm{Tr}\, e^{-\beta H} = \sum_m e^{\beta B m}$ vanishes at special values of the inverse temperature $\beta = \frac{2\pi m i}{B(2j+1)}$ (mod $\frac{2\pi}{B}$) where $m \in \{-j, \cdots, j\}$ but $m \neq 0$. These zeroes of the partition function in the complex $\beta$-plane are called Fisher zeroes. At precisely these values, the resolvent function develops poles.
3 The Case of Riemann Zeta Function
Before we get to our main goal of interpreting the Riemann zeta function as a partition function, let us briefly recall some of its relevant properties. Originally defined by the analytic continuation of the series

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p\,\in\,\text{primes}} \frac{1}{1 - p^{-s}}, \qquad \mathrm{Re}(s) > 1 \tag{5}$$

to the complex $s$-plane by Riemann, the zeta function has a set of equally spaced zeroes at negative even integers $-2n$, $n \in \mathbb{Z}^+$, called its trivial zeroes. More interestingly, it has another infinite set of zeroes, which, according to the Riemann hypothesis, lie on the critical line $\mathrm{Re}(s) = \frac{1}{2}$. The related Riemann $\xi$-function (sometimes called the symmetric zeta-function) and the adelic zeta function share only the latter (non-trivial) zeroes with Eq. (5) (i.e., the set of trivial zeroes are absent in the following functions)

$$\xi(s) = \frac{1}{2}\, s(s-1)\,\zeta_{\mathbb{A}}(s) = \frac{1}{2}\, s(s-1)\,\pi^{-\frac{s}{2}}\,\Gamma\Big(\frac{s}{2}\Big)\,\zeta(s) \tag{6}$$
both of which satisfy the reflection identity $\xi(s) = \xi(1 - s)$, respectively, $\zeta_{\mathbb{A}}(s) = \zeta_{\mathbb{A}}(1 - s)$, derived from a similar identity for the original zeta function. The former is a holomorphic function while the latter, $\zeta_{\mathbb{A}}(s)$, is meromorphic. The non-trivial zeroes of $\zeta(s)$ (which are the only zeroes of $\xi(s)$ and $\zeta_{\mathbb{A}}(s)$) lie conjecturally on the critical line, and seem to occur randomly, although they are found to be correlated in the same way as the eigenvalues of a Gaussian ensemble of $N \times N$ hermitian or unitary matrices in the limit $N \to \infty$ [1–3]. Starting from Hilbert and Pólya, it has long been thought that these zeroes correspond to the eigenvalues of an operator that is self-adjoint in an appropriately defined sense. A direct analysis of the spectrum of the purported operator may lead to a proof of the Riemann hypothesis. Despite many ingenious efforts, an operator has not yet been found. In [20], in a larger collaboration, we attempted to find a suitable operator by assuming the validity of the hypothesis, specifically, by assuming that the zeroes are the eigenvalues of a unitary matrix model² (UMM). We found that the partition function can be expressed through the trace of an operator on the Hilbert space of complex valued locally constant Bruhat-Schwartz functions supported on a compact subset $p^{-1}\mathbb{Z}_p$ of the p-adic field $\mathbb{Q}_p$. This was achieved in two steps. First a UMM was constructed for each prime $p$ corresponding to the Euler product form in Eq. (5). These (as well as a UMM for the trivial zeroes) were combined to define the random matrix model. In this paper, we shall use some of the technology that was useful in [20], however, our goal will be different. We begin by expanding the prime factors in the Euler product form of the zeta function

$$\zeta(s) = \prod_{p\,\in\,\text{primes}} \frac{1}{1 - p^{-s}} = \prod_{p\,\in\,\text{primes}}\ \sum_{n^{(p)}=0}^{\infty} p^{-s\,n^{(p)}}, \qquad \mathrm{Re}(s) > 1 \tag{7}$$
(8)
is sometimes called the local zeta function at p. It can be thought of as a complex valued function on the field Qp (of p-adic numbers). The prefactor ζR (s) = s π − 2 2s in Eq. (6) is known as the local zeta functions corresponding to R (of real 2 numbers). It is the Mellin transform of the Gaussian function e−π x . In an exactly analogous fashion, ζp (s) in Eq. (8) is the Mellin transform of the equivalent of the Gaussian function (in the sense of a function that is its own Fourier transform) on Qp .
2 Similar
construction have been attempted with quantum mechanical systems, see e.g., [22] and the review [10].
192
P. Dutta and D. Ghoshal
We can express the sum in Eq. (7) as the trace of an operator. To this end, let us recall that the space of (mean-zero) square integrable complex valued functions on $\mathbb{Q}_p$ is spanned by the orthonormal set of Kozyrev wavelets $\psi^{(p)}_{n,m,l}(\xi) \in \mathbb{C}$ (for $\xi \in \mathbb{Q}_p$), which have compact support in $\mathbb{Q}_p$ [23]. On $p$ segments of equal Haar measure, the values of a wavelet are the $p$-th roots of unity. The wavelets are analogous to the generalised Haar wavelets, with the labels $n$, $m$ and $l$ referring to scaling, translation and phase rotation. Interestingly, the Kozyrev wavelets are eigenfunctions of an operator with eigenvalue $p^{\alpha(1-n)}$,
\[
D^{\alpha}_{(p)}\,\psi^{(p)}_{n,m,l}(\xi) = p^{\alpha(1-n)}\,\psi^{(p)}_{n,m,l}(\xi) \tag{9}
\]
where the pseudodifferential operators $D^{\alpha}_{(p)}$, called the generalised Vladimirov derivatives, are defined by the following integral kernel:
\[
D^{\alpha}_{(p)} f(\xi) = \frac{1 - p^{\alpha}}{1 - p^{-\alpha-1}} \int_{\mathbb{Q}_p} d\xi'\, \frac{f(\xi') - f(\xi)}{|\xi' - \xi|_p^{\alpha+1}}, \qquad \alpha \in \mathbb{C}
\]
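They satisfy $D^{\alpha_1}_{(p)} D^{\alpha_2}_{(p)} = D^{\alpha_2}_{(p)} D^{\alpha_1}_{(p)} = D^{\alpha_1+\alpha_2}_{(p)}$. Since the roles of translation and phase are not going to be important in what follows, let us set $m = 0$ and $l = 1$ and define vectors $|n_{(p)}\rangle$ corresponding to $\psi^{(p)}_{-n+1,0,1}(\xi)$,
\[
\psi^{(p)}_{-n+1,0,1}(\xi) \longleftrightarrow |n_{(p)}\rangle \tag{10}
\]
in the Hilbert space $L^2(\mathbb{Q}_p)$. Then
\[
D^{\alpha}_{(p)}\,|n_{(p)}\rangle = p^{\alpha n_{(p)}}\,|n_{(p)}\rangle, \qquad
\log_p D_{(p)}\,|n_{(p)}\rangle = \lim_{\alpha\to 0} \frac{D^{\alpha}_{(p)} - 1}{\alpha \ln p}\,|n_{(p)}\rangle = n_{(p)}\,|n_{(p)}\rangle \tag{11}
\]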
The wavelets, by construction, transform naturally under the affine group of scaling and translation. However, it was shown in [24] that the scaling part of it seems to enhance to a larger SL(2,$\mathbb{R}$) symmetry. In terms of the raising and lowering operators $a^{(p)}_{\pm}|n_{(p)}\rangle = |n_{(p)} \pm 1\rangle$, the generators of SL(2,$\mathbb{R}$) are $J^{(p)}_{\pm} = a^{(p)}_{\pm} \log_p D_{(p)}$ and $J^{(p)}_3 = \log_p D_{(p)}$. The algebra of these generators and their action on the wavelet states are as follows.
\[
\begin{aligned}
\big[J^{(p)}_3, J^{(p)}_{\pm}\big] &= \pm J^{(p)}_{\pm}, & \big[J^{(p)}_+, J^{(p)}_-\big] &= -2 J^{(p)}_3 \\
J^{(p)}_3\,|n_{(p)}\rangle &= n_{(p)}\,|n_{(p)}\rangle, & J^{(p)}_{\pm}\,|n_{(p)}\rangle &= n_{(p)}\,|n_{(p)} \pm 1\rangle
\end{aligned}
\tag{12}
\]
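The relations in Eq. (12) can be checked directly from the stated action on the basis states. The following is a minimal sketch under the assumption that a state is stored as a dictionary mapping the label $n$ to its coefficient; the helper names (`j3`, `j_plus`, `j_minus`, `scale`, `commutator`) are illustrative and do not come from the text.

```python
def j3(state):
    """J_3 |n> = n |n>."""
    return {n: n * c for n, c in state.items()}


def j_plus(state):
    """J_+ |n> = n |n+1>."""
    return {n + 1: n * c for n, c in state.items()}


def j_minus(state):
    """J_- |n> = n |n-1>."""
    return {n - 1: n * c for n, c in state.items()}


def scale(state, factor):
    """Multiply a state by a number and drop vanishing coefficients."""
    return {n: factor * c for n, c in state.items() if factor * c != 0}


def commutator(op_a, op_b, state):
    """Return (op_a op_b - op_b op_a) applied to the state."""
    result = dict(op_a(op_b(state)))
    for n, c in op_b(op_a(state)).items():
        result[n] = result.get(n, 0) - c
    return {n: c for n, c in result.items() if c != 0}
```

With integer coefficients the arithmetic is exact, so the three commutators of Eq. (12) can be compared as dictionaries on any basis state, including negative labels.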
Phase Operator on L2 (Qp ) and the Zeroes of Fisher and Riemann
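We can now write Eq. (7) as
\[
\zeta(s) = \prod_{p\,\in\,\text{primes}}\ \sum_{n_{(p)}=0}^{\infty} \big\langle n_{(p)} \big|\, D^{-s}_{(p)} \,\big| n_{(p)} \big\rangle
= \sum_{\mathbf{n} = (n_{(2)}, n_{(3)}, \cdots)} \langle \mathbf{n} |\, e^{-s \ln \mathbb{D}} \,| \mathbf{n} \rangle \tag{13}
\]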
where we have used the shorthand $\ln \mathbb{D} \equiv \sum_p \ln p\, \log_p D_{(p)}$, and the vectors $|\mathbf{n}\rangle$ belong to the product of the Hilbert spaces for all primes, $\bigotimes_p L^2(\mathbb{Q}_p)$. However, since the sum runs only over the non-negative integers, this subspace is actually $\bigotimes_p L^2(p^{-1}\mathbb{Z}_p)$, spanned by the Bruhat-Schwartz functions restricted to $p^{-1}\mathbb{Z}_p$, due to which the trace is well defined (see [23–25] for details on the wavelet functions). This expression leads us to think of the zeta function as the partition function of a statistical system, in analogy with the systems in Sect. 2, the configurations of which are parametrised by the integers $\mathbf{n} = (n_{(2)}, n_{(3)}, \cdots)$. The $sl_2(\mathbb{R})$ algebra Eq. (12) can be realised in terms of a pair of oscillators in the Schwinger representation
\[
J^{(p)}_3 = \log_p D_{(p)} = \frac{1}{2}\left(N_{I(p)} - N_{II(p)}\right), \qquad J^{(p)}_+ = a^{\dagger}_{I(p)}\, a_{II(p)} \quad\text{and}\quad J^{(p)}_- = a^{\dagger}_{II(p)}\, a_{I(p)} \tag{14}
\]
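The step from the Euler product to the trace in Eq. (13) rests on unique factorisation: each occupation vector $\mathbf{n} = (n_{(2)}, n_{(3)}, \cdots)$ labels exactly one integer $\prod_p p^{n_{(p)}}$, on which $e^{-s\ln\mathbb{D}}$ takes the value $n^{-s}$. A minimal numerical sketch of this bookkeeping follows; the truncation to the primes 2, 3, 5, 7, the cutoff `n_max` and all function names are assumptions made for illustration only.

```python
import itertools
import math

PRIMES = (2, 3, 5, 7)  # illustrative truncation of the set of primes


def integer_for(occupation):
    """Integer prod_p p^{n_(p)} labelled by an occupation vector."""
    value = 1
    for p, k in zip(PRIMES, occupation):
        value *= p ** k
    return value


def truncated_euler_product(s, n_max):
    """Product over PRIMES of sum_{n=0}^{n_max} p^(-s n)."""
    result = 1.0
    for p in PRIMES:
        result *= sum(p ** (-s * n) for n in range(n_max + 1))
    return result


def truncated_trace(s, n_max):
    """Sum over occupation vectors of exp(-s ln n), n being the labelled integer."""
    total = 0.0
    for occupation in itertools.product(range(n_max + 1), repeat=len(PRIMES)):
        total += math.exp(-s * math.log(integer_for(occupation)))
    return total


def labelled_integers(n_max):
    """Set of integers reached by the truncated occupation vectors."""
    return {
        integer_for(occupation)
        for occupation in itertools.product(range(n_max + 1), repeat=len(PRIMES))
    }
```

The two truncated expressions agree up to rounding, and the set returned by `labelled_integers` has as many elements as there are occupation vectors, which is the statement that no integer is counted twice.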
Formally there is a phase difference operator $\Phi_{(p)} = \Phi_{I(p)} - \Phi_{II(p)}$ conjugate to the number difference operator $N_{(p)} = \frac{1}{2}\left(N_{I(p)} - N_{II(p)}\right)$, such that
\[
\big[\Phi_{I(p)}, N_{J(p')}\big] = i\,\delta_{IJ}\,\delta_{pp'}, \qquad \big[\Phi_{(p)}, N_{(p')}\big] = i\,\delta_{pp'} \tag{15}
\]
In Sect. 2 we reviewed a construction for the phase operator following [12–18]. Assuming for the moment that a phase operator with the desired properties can be constructed, we define the operator $\frac{1}{\ln p}\,\Phi_{(p)}\, p^{-N_{(p)}} = \frac{1}{\ln p}\,\Phi_{(p)}\, e^{-N_{(p)} \ln p}$ and evaluate the following commutator
\[
\left[\frac{1}{\ln p}\,\Phi_{(p)}\, p^{-N_{(p)}},\ p'^{\,N_{(p')}}\right] = i\,\delta_{pp'} \tag{16}
\]
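using Eq. (15). Thus, the operator $\frac{1}{\ln p}\,\Phi_{(p)}\, p^{-N_{(p)}}$ is formally canonically conjugate to $p^{N_{(p)}} = D_{(p)}$. We would now like to extend it to the large Hilbert space obtained by combining all primes. Let us first consider all prime numbers up to a fixed prime $\mathsf{p}$. The number of such primes is $\pi(\mathsf{p})$, where $\pi(x)$ is the prime counting function. We now define
\[
\mathcal{O}_{\mathsf{p}} = \frac{1}{\pi(\mathsf{p})} \sum_{p=2}^{\mathsf{p}} \frac{1}{\ln p}\,\Phi_{(p)}\, D^{-1}_{(p)}
\qquad\text{and}\qquad
\ln \mathbb{D}_{\mathsf{p}} = \sum_{p=2}^{\mathsf{p}} \ln D_{(p)}
\]
which are operators in the truncated Hilbert space $\bigotimes_{p=2}^{\mathsf{p}} L^2(p^{-1}\mathbb{Z}_p)$. These are canonically conjugate since $\big[\mathcal{O}_{\mathsf{p}}, \mathbb{D}_{\mathsf{p}}\big] = i$. Now we take the limit $\mathsf{p} \to \infty$ to obtain the canonically conjugate operators
\[
\mathcal{O} = \lim_{\mathsf{p}\to\infty} \mathcal{O}_{\mathsf{p}}, \qquad \mathbb{D} = \lim_{\mathsf{p}\to\infty} \mathbb{D}_{\mathsf{p}}
\]
such that
\[
[\mathcal{O}, \mathbb{D}] = i
\]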
on the large Hilbert space $\bigotimes_p L^2(p^{-1}\mathbb{Z}_p)$. This limit is analogous to the thermodynamic limit of statistical models, as we shall see in Sect. 3.1. Associated to these operators is the Weyl symmetric product
\[
\begin{aligned}
\frac{1}{2}\left(\mathbb{D}\mathcal{O} + \mathcal{O}\mathbb{D}\right)
&= \lim_{\mathsf{p}\to\infty} \frac{1}{2}\left(\mathbb{D}_{\mathsf{p}}\mathcal{O}_{\mathsf{p}} + \mathcal{O}_{\mathsf{p}}\mathbb{D}_{\mathsf{p}}\right) = \mathcal{O}\mathbb{D} - \frac{i}{2} \\
&= \lim_{\mathsf{p}\to\infty} \frac{1}{\pi(\mathsf{p})} \sum_{p=2}^{\mathsf{p}} \frac{1}{\ln p}\ \mathbf{1} \otimes \cdots \otimes e^{\frac{1}{2}\ln D_{(p)}}\, \Phi_{(p)}\, e^{-\frac{1}{2}\ln D_{(p)}} \otimes \mathbf{1} \otimes \cdots
\end{aligned}
\tag{17}
\]
which is (formally) self-adjoint. In the last line, we have a similarity transform of the sum of the $\Phi_{(p)}$ operators. As has been emphasised, e.g. in [15], the operator canonically conjugate to the number operator can only be defined up to a similarity transformation. Hence there ought to be more than one (possibly infinitely many) total phase operators canonically conjugate to the total number operator $\ln \mathbb{D} = \sum_p \ln D_{(p)}$. One may follow proposals in the literature (e.g. [15]) to define $\Phi_{(p)}$, which would result in $\frac{1}{2}\left(\mathbb{D}_{\mathsf{p}}\mathcal{O}_{\mathsf{p}} + \mathcal{O}_{\mathsf{p}}\mathbb{D}_{\mathsf{p}}\right)$, canonically conjugate to $\ln \mathbb{D}_{\mathsf{p}}$, on the outer product of a dense subspace of the Hilbert space $L^2(p^{-1}\mathbb{Z}_p)$ at the $p$-th place.

It is worth reiterating that the construction discussed above is formal. The limit $\mathsf{p} \to \infty$ is far from straightforward. There is a more convenient way to construct the phase operator over a subspace of the Hilbert space. We shall attempt to do so in the next two subsections.
3.1 Aggregate Phase Operator for the Riemann Zeta Function

Let us return to the model of su(2) spin in an external field of Sect. 2 with the Hamiltonian containing a site dependent magnetic field
\[
H = -\sum_{p=2}^{\mathsf{p}} B_p N_p = -\sum_{p=2}^{\mathsf{p}} B_p \left(S_{3,p} + j\,\mathbf{1}\right)
\]
where we have now chosen an unusual convention$^{3}$ of using prime numbers $p$ to label the sites, $B_p$ are the values of the magnetic field at site $p$ and we have shifted the zero of the energy for convenience. The latter amounts to a shift in the spectrum of $S_{3,p}$ by $S_{3,p} \to N_p = S_{3,p} + j\,\mathbf{1}$ so that $N_p$ takes the integer values $0, 1, \cdots, \mathsf{n}$. In this case one can define a phase operator at an individual site, say $\hat{\phi}_p$ at the $p$-th site, as in Sect. 2. Each of these individual operators satisfies the shift covariance relation (or commutator) for special values of $\beta$:
\[
\left[\hat{\phi}_p,\ e^{\beta \sum_{p'=2}^{\mathsf{p}} B_{p'} N_{p'}}\right] = i\beta\, e^{\beta \sum_{p'=2}^{\mathsf{p}} B_{p'} N_{p'}}
\qquad\text{for}\quad \beta = \frac{2\pi i k}{B_p(\mathsf{n}+1)} \ \ (\mathrm{mod}\ 2\pi/B_p) \tag{18}
\]
with $k = 1, \cdots, \mathsf{n}$ and $p = 2, \cdots, \mathsf{p}$.
This is valid over the entire Hilbert space, i.e., on an arbitrary state vector, but only for these special values of $\beta$. Thus there are as many shift covariant phase operators as the number of sites, and each individual phase operator is covariant under the specific choices of $\beta$. Moreover, since each of the Hilbert spaces, labelled by $p$, is finite dimensional, the trace is a product over traces in each Hilbert space. Hence if we take the trace of Eq. (18), exactly as in the case of the spin model in Sect. 2, the trace of the commutator is zero, and therefore $\mathrm{Tr}\, e^{\beta \sum_p B_p N_p} = 0$. Thus the shift covariance relation is valid for those values of $\beta$ which also satisfy the zero trace condition. This relates the zeroes of the partition function to the poles of the resolvent operators
\[
R\big[e^{i\hat{\phi}_p}\big](\phi) = \left(1 - e^{-i\phi}\, e^{i\hat{\phi}_p}\right)^{-1}
\]
for all $p$. The trace of the resolvent is
\[
\sum_{k_p=0}^{\mathsf{n}} \frac{1}{1 - e^{-i\phi + i\phi_{k_p}}}
\]
apart from the pole for $k = 0$, which yields the trivial commutator. The similarity between the spin in a magnetic field and the zeta function is apparent at this stage. (Recall that we have labelled the sites by the prime numbers up to $\mathsf{p}$ with this objective.) Indeed, if we choose the local magnetic field $B_p = \ln p$, then the partition function becomes
\[
Z(\beta) = \prod_{p=2}^{\mathsf{p}}\ \sum_{m_p=0}^{\mathsf{n}} e^{\beta m_p \ln p} = \prod_{p=2}^{\mathsf{p}} \frac{1 - p^{\beta(\mathsf{n}+1)}}{1 - p^{\beta}}
\]
In the thermodynamic limit $\mathsf{p} \to \infty$, even for finite $\mathsf{n}$, the partition function has a simple form in terms of a ratio of Riemann zeta functions
\[
Z(\beta) = \lim_{\mathsf{p}\to\infty} \prod_{p=2}^{\mathsf{p}} \frac{1 - p^{\beta(\mathsf{n}+1)}}{1 - p^{\beta}} = \frac{\zeta(-\beta)}{\zeta(-(\mathsf{n}+1)\beta)} \tag{19}
\]

$^{3}$ It should, however, be mentioned that this type of numbering has been used before in [4–7].
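Eq. (19) can be illustrated numerically in the region where the Euler product converges, $\mathrm{Re}(-\beta) > 1$. The sketch below assumes $\beta = -2$ and $\mathsf{n} = 1$, for which the limit is $\zeta(2)/\zeta(4) = 15/\pi^2$; the function names and the prime cutoff are illustrative choices, not part of the text.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def partition_function(beta, n, cutoff):
    """Closed form: product over primes p <= cutoff of (1 - p^(beta(n+1))) / (1 - p^beta)."""
    z = 1.0
    for p in primes_up_to(cutoff):
        z *= (1 - p ** (beta * (n + 1))) / (1 - p ** beta)
    return z


def partition_function_by_sum(beta, n, cutoff):
    """Same quantity from the site sums: product over p of sum_{m=0}^{n} exp(beta m ln p)."""
    z = 1.0
    for p in primes_up_to(cutoff):
        z *= sum(math.exp(beta * m * math.log(p)) for m in range(n + 1))
    return z
```

For example, `partition_function(-2.0, 1, 10000)` differs from `15 / math.pi ** 2` only by the tail of the product over primes above the cutoff, which is of order $1/(\mathsf{p}\ln\mathsf{p})$.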
Remarkably, this has exactly the same form as the partition function of a $\kappa$-parafermionic primon gas of [5–7, 10] with $\kappa = \mathsf{n} + 1$ and $s = -\beta$. It would be interesting to try to relate the parafermionic variables to the spin degrees of freedom. Notice that $Z(\beta)$ has zeroes at the non-trivial zeroes of $\zeta(-\beta)$ from the numerator, as well as at $\beta = -1/(\mathsf{n}+1)$ from the pole of $\zeta(-(\mathsf{n}+1)\beta)$ in the denominator. The latter is the only real zero, although it is at an unphysical value of the (inverse) temperature. However, the trivial zeroes of $\zeta(-\beta)$ are not zeroes of the partition function. This is due to the fact that at these points, $\beta = 2n$ with $n$ a positive integer, both the numerator and the denominator have simple zeroes, hence
\[
\lim_{\beta\to 2n} \frac{\zeta(-\beta)}{\zeta(-(\mathsf{n}+1)\beta)} = \text{finite}
\]
Thus the nontrivial zeroes are the Fisher zeroes of the spin model in the complex (inverse temperature) $\beta$-plane. However, since the zeroes of the Riemann zeta function are believed to be isolated (and since there is no accumulation point on the real line), these zeroes are not related to any phase transition. This is consistent, as the system of spins in a magnetic field is not expected to undergo a phase transition. Finally, the partition function has additional poles from the zeroes of the zeta function in the denominator. The spectrum of the zeroes of the partition function is then given by
\[
\sum_{p=2}^{\mathsf{p}}\ \sum_{n_p\in\mathbb{Z}}\ \sum_{k_p=0}^{\mathsf{n}} \frac{1}{1 - e^{\,i\phi_{k_p} + i\frac{2\pi n_p}{\ln p} - i\phi}}
\ -\ \sum_{p,\,n_p} \frac{1}{1 - e^{\,i\frac{2\pi n_p}{\ln p} - i\phi}} \tag{20}
\]
where we have subtracted the pole due to $k = 0$. This function may be rewritten as follows.
\[
\begin{aligned}
&\sum_{p=2}^{\mathsf{p}} \sum_{n\in\mathbb{Z}} \frac{1}{1 - e^{\,i\left(\frac{2\pi n}{(\mathsf{n}+1)\ln p} - \phi\right)}}
 - \sum_{p=2}^{\mathsf{p}} \sum_{n\in\mathbb{Z}} \frac{1}{1 - e^{\,i\left(\frac{2\pi n}{\ln p} - \phi\right)}} \\
&\quad\overset{\text{sing.}}{\approx}\ \sum_{p=2}^{\mathsf{p}} \sum_{n\in\mathbb{Z}} \frac{-i}{\phi - \frac{2\pi n}{(\mathsf{n}+1)\ln p}}
 - \sum_{p=2}^{\mathsf{p}} \sum_{n\in\mathbb{Z}} \frac{-i}{\phi - \frac{2\pi n}{\ln p}} \\
&\quad\approx\ \sum_{p=2}^{\mathsf{p}} \left[\frac{d}{d(i\phi)} \ln\!\left(1 - p^{-(\mathsf{n}+1)i\phi}\right) - \frac{d}{d(i\phi)} \ln\!\left(1 - p^{-i\phi}\right)\right] \\
&\quad\approx\ -i\,\frac{d}{d\phi} \ln \prod_{p=2}^{\mathsf{p}} \frac{1 - p^{-(\mathsf{n}+1)i\phi}}{1 - p^{-i\phi}}
\end{aligned}
\]
In the above we have used the Mittag-Leffler expansion, assuming analyticity of the partition function. The expression above, in the limit $\mathsf{p} \to \infty$, becomes
\[
-i\,\frac{d}{d\phi} \ln \frac{\zeta(i\phi)}{\zeta((\mathsf{n}+1)i\phi)}
\]
for $\mathrm{Re}(i\phi) > 1$.

We will now try to construct a single operator that can be understood as 'canonically conjugate' to the Hamiltonian. If we define the total phase operator as $\Phi = \sum_p \hat{\phi}_p$ (which is the sum of the individual phase operators $\hat{\phi}_p$ as defined in Sect. 2), it does not, unfortunately, have the desired shift covariance relation Eq. (4) with the Hamiltonian. This is due to the site dependence of the magnetic field $B_p$, as is apparent from the steps leading to Eq. (4). The commutator there is obtained only at specific discrete values of $\beta$, which are integer multiples of $2\pi i/\big(B_p(\mathsf{n}+1)\big)$. Therefore, unless the magnetic fields $B_p$ at all the sites are commensurate, which is certainly not the case for $B_p = \ln p$, it is not possible to get the desired commutator this way.

Instead, we propose to work with an aggregate phase operator $\varphi$ such that its action on the composite state $\otimes_p |\phi_{p,k_p}\rangle$ is defined to be $e^{i\hat{\phi}_p}$ if one of the eigenvalues $\phi_p \neq 0$, while at the same time all other eigenvalues $\phi_{q\neq p}$ are zero; otherwise this operator acts as the identity. Thus, if two or more of the phases are non-zero, $e^{i\varphi} = 1$. This may be expressed as $e^{i\varphi} =$
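\[
\sum_{p=2}^{\mathsf{p}} e^{i\hat{\phi}_p} \prod_{q\neq p} \delta_{\phi_q,0}
\ +\ \frac{2}{n_{\neq 0}\,(n_{\neq 0} - 1)} \sum_{\substack{p_1, p_2 = 1 \\ p_1 \neq p_2}}^{\mathsf{p}} \left(1 - \delta_{\phi_{p_1},0}\right)\left(1 - \delta_{\phi_{p_2},0}\right) \tag{21}
\]
where $n_{\neq 0} = \sum_p \left(1 - \delta_{\phi_p,0}\right)$ is the number of sites where the phase is non-zero. This is equivalent to projecting on a subspace $\mathcal{H}^{(1)}$ of the Hilbert space, in which one, and exactly one, phase is different from zero$^{4}$. After the projection, one can use the total phase operator in the subspace, $\Phi\big|_{\mathcal{H}^{(1)}} = \Pi_{\mathcal{H}^{(1)}} \Big(\sum_p \hat{\phi}_p\Big) \Pi_{\mathcal{H}^{(1)}}$.

$^{4}$ This is analogous, though not exactly equivalent, to a projection of the Fock space of a quantum field theory of, say, a scalar field, on a subspace with single-particle excitation.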
From either point of view, the action of the above is nontrivial on a subspace of the Hilbert space parametrised by only one of the eigenvalues $\phi_p$ at a time, i.e., on a union of circles $\cup_p S^1_{(p)}$, while the full Hilbert space is parametrised by $(S^1)^{\mathsf{p}}$. In the complement of this subspace, it is the identity. In this subspace $\mathcal{H}^{(1)}$, we can follow the steps leading to Eq. (4) to compute the commutator
\[
\left[\varphi,\ \Pi_{\mathcal{H}^{(1)}}\, e^{-\beta H}\, \Pi_{\mathcal{H}^{(1)}}\right] = i\beta\ \Pi_{\mathcal{H}^{(1)}}\, e^{-\beta H}\, \Pi_{\mathcal{H}^{(1)}}
\]
which holds in $\mathcal{H}^{(1)}$ for all $\beta = \frac{2\pi i k}{B_p(\mathsf{n}+1)}$ (mod $2\pi/B_p$), where $k = 1, \cdots, \mathsf{n}$ and $p = 2, \cdots, \mathsf{p}$. It is worth emphasising that, as in several examples in quantum theory, the domain of the canonical commutator is not the entire Hilbert space, but a direct sum of closed orthogonal subspaces [14] of the type $\mathcal{H}^{(1)}$. In the limit $\mathsf{p} \to \infty$, one takes the closure of this subspace to obtain a closed subspace of the entire Hilbert space. The resolvent function of the exponential of the aggregate phase operator Eq. (21),
\[
R\big[e^{i\varphi}\big](\phi) = \left(1 - e^{-i\phi}\, e^{i\varphi}\right)^{-1} \tag{22}
\]
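has the trace
\[
\begin{aligned}
\mathrm{Tr}\, R\big[e^{i\varphi}\big](\phi)
&= \sum_{p=2}^{\mathsf{p}}\ \sum_{k_1,\cdots,k_{\mathsf{p}}}\ \sum_{n=0}^{\infty} \Big\langle \otimes_{i=1}^{\mathsf{p}} \phi_{k_i} \Big|\, e^{-\beta H}\, e^{in(\varphi - \phi)} \,\Big| \otimes_{i=1}^{\mathsf{p}} \phi_{k_i} \Big\rangle \\
&= \sum_{p=2}^{\mathsf{p}}\ \sum_{n=0}^{\infty} \sum_{\substack{k_p \\ \text{exactly one } \phi_{k_p}\neq 0}} e^{in\phi_{k_p}}\, e^{-in\phi}
\ +\ \sum_{n=0}^{\infty} \sum_{\substack{k_p \\ \text{at least two } \phi_{k_p}\neq 0}} e^{-in\phi} \\
&= \sum_{p=2}^{\mathsf{p}} \sum_{\substack{k_p \\ \text{exactly one } \phi_{k_p}\neq 0}} \frac{1}{1 - e^{i\phi_{k_p} - i\phi}}
\ +\ \sum_{\substack{k_p \\ \text{at least two } \phi_{k_p}\neq 0}} \frac{1}{1 - e^{-i\phi}}
\end{aligned}
\tag{23}
\]
Except for the pole at $\phi = 0$, this behaviour is in fact identical to that of the resolvent $\sum_{p=2}^{\mathsf{p}} \left(1 - e^{-i\phi}\, e^{i\hat{\phi}_p}\right)^{-1}$ in Eq. (20).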
Even though we are not required to take the limit $\mathsf{n} \to \infty$, it is interesting to note that in this limit the phase operator $\hat{\phi}_p$ at the $p$-th site approaches the phase operator described in [15], which is a Toeplitz operator [26–28]. This has been shown in [17]. This fact provides a way to understand the relation to the spectrum without a truncation to finite $\mathsf{n}$. As shown in [15, 17], each pair of operators $(\hat{\phi}_p, N_p)$ satisfies the canonical commutation relation in a subspace $\mathcal{S}_p$ of the $p$-th Hilbert space $L^2(p^{-1}\mathbb{Z}_p)$ as follows
\[
\mathcal{S}_p = \left\{ |f^{(p)}\rangle = \sum_{n_p=0}^{\infty} f_{n_p}\, |n_p\rangle^{(p)} \ :\ \sum_{n_p=0}^{\infty} f_{n_p} = 0 \right\}
\]
where $|n_p\rangle^{(p)}$ is an eigenstate of $N_p$ corresponding to the eigenvalue $n_p$. Thus, each of the phase operators $\hat{\phi}_p$ satisfies the canonical commutator over a dense subspace of the Hilbert space
\[
\left[\hat{\phi}_p,\ \frac{1}{\ln p} \ln D_{(p)}\right] = i, \qquad \text{in}\quad \mathcal{S}_p \otimes \bigotimes_{\substack{p'\neq p \\ p'\in\,\text{prime}}} \mathcal{H}_{p'}
\]
This is similar to the operator defined earlier, but without the sum restricted to a finite prime p (along with the normalisation factor π(p) in the denominator). This is due to the fact that in this case, one gets a contribution from only one of the subspaces (one prime) at a time. In the next subsection we shall take a similar route to define another phase operator.
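3.2 Total Phase Operator for the Riemann Zeta Function

Following [15] (see also [17]) we would like to discuss another construction of the phase operator $\hat{\Phi}$ conjugate to the number operator $\hat{N}$ such that
\[
\hat{N}|n\rangle = n|n\rangle, \quad n = 0, 1, \cdots, \mathsf{n}, \qquad\qquad \hat{\Phi} = \sum_{m\neq n} \frac{i}{m - n}\, |m\rangle\langle n| \tag{24}
\]
This is an $(\mathsf{n}+1) \times (\mathsf{n}+1)$ hermitian Toeplitz matrix [26–28]. When applied on a state $|v\rangle = \sum_n v_n |n\rangle$, we find that
\[
\big[\hat{\Phi}, \hat{N}\big]\,|v\rangle = i\,|v\rangle \qquad\text{if and only if}\qquad \sum_{n=0}^{\mathsf{n}} v_n = 0 \tag{25}
\]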
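The criterion in Eq. (25) follows from $\big([\hat{\Phi}, \hat{N}]v\big)_m = \sum_{n\neq m} \frac{i(n-m)}{m-n}\, v_n = i v_m - i \sum_n v_n$, and it is easy to confirm on a small matrix. The sketch below builds the matrix of Eq. (24) with plain Python lists; the function names and the use of the maximum component of $([\hat{\Phi}, \hat{N}] - i)v$ as a residual are illustrative choices.

```python
import cmath
import math


def phase_matrix(dim):
    """Matrix of Eq. (24): entries i/(m - n) off the diagonal, zero on it."""
    return [[0j if m == n else 1j / (m - n) for n in range(dim)] for m in range(dim)]


def commutator_residual(v):
    """Largest component of ([Phi, N] - i) v in absolute value."""
    dim = len(v)
    phi = phase_matrix(dim)
    residual = 0.0
    for m in range(dim):
        # ([Phi, N] v)_m = sum_n Phi_mn (n - m) v_n
        comm_m = sum(phi[m][n] * (n - m) * v[n] for n in range(dim))
        residual = max(residual, abs(comm_m - 1j * v[m]))
    return residual


def is_hermitian(matrix, tol=1e-15):
    """Check M_mn = conj(M_nm) entry by entry."""
    dim = len(matrix)
    return all(
        abs(matrix[m][n] - matrix[n][m].conjugate()) <= tol
        for m in range(dim)
        for n in range(dim)
    )
```

Taking the components of `v` to be the `dim`-th roots of unity (which sum to zero) gives a residual at rounding level, whereas a constant vector, whose components do not sum to zero, violates the commutator by an amount equal to that sum.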
Thus the commutator is valid in a codimension one subspace. For example, we could choose the $v_n$ to be the nontrivial $(\mathsf{n}+1)$-th roots of unity. Toeplitz matrices and operators have a long history and have been studied extensively (see e.g., [28]). Although the eigenvalues $k_0 \le k_1 \le \cdots \le k_{\mathsf{n}}$ and the corresponding eigenvectors $|k_m\rangle$ ($m = 0, 1, \cdots, \mathsf{n}$) of the matrix above exist, one cannot write them explicitly. Moreover, by Szegő's theorems, the spectrum is bounded by $\pi$ as $\mathsf{n} \to \infty$ (so that the matrix size goes to infinity) and the eigenvalues are distributed uniformly and symmetrically around zero, as one can also check numerically for small values of $\mathsf{n}$.

Let us come back to the problem of our interest, in which the Hamiltonian is $H = \sum_p \ln p\, N_{(p)}$, where $N_{(p)}$ is the number operator at the $p$-th site, which in turn can be expressed in terms of the generalised Vladimirov derivative. For a natural number $n \in \mathbb{N}$, we use the prime factorisation$^{5}$ to associate a vector in $\otimes_p L^2(p^{-1}\mathbb{Z}_p)$ as
\[
n = \prod_p p^{n_{(p)}} \ \longleftrightarrow\ |\mathbf{n}\rangle = \otimes_p |n_{(p)}\rangle \tag{26}
\]
using the wavelet basis. We emphasise that only a finite number of entries in the infinite component vector are non-zero integers. Clearly $|\mathbf{n}\rangle$ is an eigenvector of $H$:
\[
H\,|\mathbf{n}\rangle = \sum_p n_{(p)} \ln p\ |\mathbf{n}\rangle = \ln n\ |\mathbf{n}\rangle
\]
Moreover, these states are orthonormal, $\langle \mathbf{n}_i | \mathbf{n}_j \rangle = \prod_p \big\langle n^{i}_{(p)} \big| n^{j}_{(p)} \big\rangle = \prod_p \delta_{n^{i}_{(p)} n^{j}_{(p)}} = \delta_{n_i n_j}$. When restricted to a fixed value of $p$, the following definition for the phase operator
\[
\frac{i}{\ln p} \sum_{n^{(p)}_a \neq n^{(p)}_b} \frac{|n^{(p)}_a\rangle \langle n^{(p)}_b|}{n^{(p)}_a - n^{(p)}_b}
\]
on $L^2(p^{-1}\mathbb{Z}_p)$ is natural. This is a Toeplitz matrix; therefore, it has eigenvectors $|k_{(p)}\rangle$. Let us define the phase operator on the full space $\otimes_p L^2(p^{-1}\mathbb{Z}_p)$ schematically to be of the form
\[
\Phi_{\text{tot}} \sim \sum_{n_a \neq n_b} \frac{i}{\ln n_a - \ln n_b}\, |\mathbf{n}_a\rangle \langle \mathbf{n}_b|
= \sum_{\text{not all } n^{(p)}_a = n^{(p)}_b} \frac{i}{\sum_p \big(n^{(p)}_a - n^{(p)}_b\big) \ln p}\ \otimes_{p_a} |n^{(p_a)}_a\rangle\ \otimes_{p_b} \langle n^{(p_b)}_b|
\]
$^{5}$ Prime factorisation played an important role in the arithmetic gas models [4–7]. See also [29] for a different aspect of this correspondence.

We need to specify the limits of sums over the integers in the above. However, before we undertake that exercise, we would like to check if $H$ and $\Phi_{\text{tot}}$ could be a canonically conjugate pair, possibly on a subspace spanned by vectors of the form in Eq. (26). To this end, let us now consider a finite linear combination of the form $|v\rangle = \sum_{\mathbf{n}} v_{\mathbf{n}} |\mathbf{n}\rangle$, in which we further require the coefficients to factorise as $v_{\mathbf{n}} \equiv v_{(n_{(2)}, n_{(3)}, \cdots)} = \prod_p v_{n_{(p)}}$. One can compute the commutator on such a state and verify that
\[
\big[\Phi_{\text{tot}}, H\big]\,|v\rangle = i\,|v\rangle \qquad\text{if and only if}\qquad \sum_{\mathbf{n}} v_{\mathbf{n}} = 0 \tag{27}
\]
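Here the upper limit of the sum in Eq. (27) is the maximum integer $n_{\max}$ that appears in the definition of the vector $|v\rangle$. Consider all the vectors $|\mathbf{n}\rangle$ that appear in the linear combination defining the $|v\rangle$ on which we want to check the commutator, and the prime factorisations of the corresponding integers $n$. Let the maximum $\max_p\{n_{(p)}\}$ of these be $\mathsf{n} \in \mathbb{N}$. There is also a highest prime $\mathsf{p}$, i.e., one above which all $n_{(p > \mathsf{p})} = 0$ in the factorisations. We can now make the proposal for the phase operator more precise. It is
\[
\Phi_{\text{tot}} =
\begin{cases}
\displaystyle \sum_{\substack{n^{(p)}_a,\, n^{(p)}_b = 0 \\ \text{not all } n^{(p)}_a = n^{(p)}_b}}^{\mathsf{n}} \frac{i}{\sum_{p\le\mathsf{p}} \big(n^{(p)}_a - n^{(p)}_b\big) \ln p}\ \bigotimes_{p_a\le\mathsf{p}} \big|n^{(p_a)}_a\big\rangle\ \bigotimes_{p_b\le\mathsf{p}} \big\langle n^{(p_b)}_b\big| & \text{for } n^{(p)}_a, n^{(p)}_b \le \mathsf{n} \\[3ex]
\displaystyle \bigotimes_{p>\mathsf{p}} \big|n_{(p)} = 0\big\rangle \big\langle n_{(p)} = 0\big| & \text{otherwise}
\end{cases}
\tag{28}
\]
and acts on a space spanned by vectors of the form
\[
|\mathbf{k}, \mathsf{p}\rangle = \Bigg(\bigotimes_{p\le\mathsf{p}} \big|k_{(p)}\big\rangle\Bigg) \otimes \Bigg(\bigotimes_{p'>\mathsf{p}} \big|n_{(p')} = 0\big\rangle\Bigg)
\]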
where at least one $k_{(p)} \neq 0$ for a prime $p \le \mathsf{p}$ and, for $p > \mathsf{p}$, we have chosen the 'vacuum' state in the number representation. In the limit $\mathsf{p} \to \infty$ (even for finite values of $\mathsf{n}$), we expect this to be a well defined Toeplitz operator on $\otimes_p L^2(p^{-1}\mathbb{Z}_p)$. However, we are not able to offer a rigorous mathematical proof of this assertion.

It is well known that a phase operator cannot be defined uniquely: it is ambiguous up to a similarity transform [15]. Given a phase operator $\Phi$, for example, as defined in Eq. (24), let us consider the operator $\Phi_\beta = e^{-\beta N}\, \Phi\, e^{\beta N}$ related by a similarity transform labelled by a parameter $\beta$. This would have been a trivial statement had the commutator Eq. (25) been true in the full vector space; however, as we have seen, this relation holds only in a subspace of codimension one. It is straightforward to check that the condition that restricts to the subspace is modified to $\sum_n e^{\beta n} v_n = 0$ for $\Phi_\beta$ to be conjugate to $N$. We may choose $v_n = e^{2\pi i m_1 n/(\mathsf{n}+1)}$ and $\beta = 2\pi i m_2/(\mathsf{n}+1)$ with $m_1 + m_2 \neq 0$ (mod $\mathsf{n}+1$). This condition is identical to the vanishing of the partition function $Z = \mathrm{Tr}\, e^{-\beta H}$ for the Hamiltonian $H = -N$ at these special values of $\beta$.
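The modified condition $\sum_n e^{\beta n} v_n = 0$ can be verified in the same way as Eq. (25), since $(\Phi_\beta)_{mn} = e^{-\beta m}\, \Phi_{mn}\, e^{\beta n}$. The following sketch uses the choice of $v_n$ and $\beta$ quoted above, with `dim` standing for $\mathsf{n}+1$; the function names are hypothetical.

```python
import cmath
import math


def transformed_phase_matrix(dim, beta):
    """Entries of exp(-beta N) Phi exp(beta N), with Phi as in Eq. (24)."""
    return [
        [
            0j if m == n else cmath.exp(-beta * m) * (1j / (m - n)) * cmath.exp(beta * n)
            for n in range(dim)
        ]
        for m in range(dim)
    ]


def commutator_residual(v, beta):
    """Largest component of ([Phi_beta, N] - i) v in absolute value."""
    dim = len(v)
    phi_beta = transformed_phase_matrix(dim, beta)
    residual = 0.0
    for m in range(dim):
        comm_m = sum(phi_beta[m][n] * (n - m) * v[n] for n in range(dim))
        residual = max(residual, abs(comm_m - 1j * v[m]))
    return residual


def partition_function(beta, dim):
    """Tr exp(-beta H) for H = -N on a dim-dimensional space."""
    return sum(cmath.exp(beta * n) for n in range(dim))


def example(dim, m1, m2):
    """Return v_n = exp(2 pi i m1 n / dim) and beta = 2 pi i m2 / dim."""
    v = [cmath.exp(2j * math.pi * m1 * n / dim) for n in range(dim)]
    beta = 2j * math.pi * m2 / dim
    return v, beta
```

For `dim = 5`, the pair `(m1, m2) = (1, 2)` satisfies the condition, and `partition_function` evaluated at $2\pi i (m_1 + m_2)/5$ vanishes, while `(2, 3)` has $m_1 + m_2 \equiv 0$ and fails it.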
Now consider $\Phi_{\text{tot},\beta} = e^{-\beta H}\, \Phi_{\text{tot}}\, e^{\beta H}$, the similarity transformation of Eq. (28). The modified condition that defines the subspace is $\prod_p \sum_{n_{(p)}} e^{\beta n_{(p)} \ln p}\, v_{n_{(p)}} = 0$. If we choose the coefficient $v_{n_{(p)}} = \chi(p)^{n_{(p)}}$, where $\chi(p)$ is a Dirichlet character (see Eq. (30)), the subspace is defined by the vanishing of
\[
\prod_{p=2}^{\mathsf{p}}\ \sum_{n_{(p)}=0}^{\mathsf{n}} \left(\chi(p)\, p^{\beta}\right)^{n_{(p)}} = \prod_{p=2}^{\mathsf{p}} \frac{1 - \chi(p^{\mathsf{n}+1})\, p^{\beta(\mathsf{n}+1)}}{1 - \chi(p)\, p^{\beta}} \tag{29}
\]
which, in the limit $\mathsf{p} \to \infty$, is a ratio of Riemann zeta or Dirichlet L-functions, depending on whether the character is trivial or not, as in Eqs. (19) and (39), respectively. Thus the subspace in which the phase operator Eq. (28), or its similarity transform, is canonically conjugate to the Hamiltonian is defined by the vanishing of the Riemann zeta function (at special values of the inverse temperature $\beta \neq 0$). We have previously encountered this in Eq. (19) with the aggregate phase operator defined in Sect. 3.1. As we see, different choices for the coefficients relate to the vanishing of Dirichlet L-functions, to which we shall now turn our attention.
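The commutator criterion of Eq. (27) can also be tested on a small truncation of the multi-prime space. The sketch below assumes the primes 2 and 3 with occupation numbers up to 2, uses the energies $\ln n = \sum_p n_{(p)} \ln p$ (which are distinct by unique factorisation), and takes factorised coefficients $v_{\mathbf{n}} = \prod_p v_{n_{(p)}}$; the constants and function names are illustrative and not taken from the text.

```python
import itertools
import math

PRIMES = (2, 3)  # illustrative cutoff on the primes
N_MAX = 2        # illustrative cutoff on the occupation numbers


def basis():
    """All occupation vectors (n_(2), n_(3)) with entries 0..N_MAX."""
    return list(itertools.product(range(N_MAX + 1), repeat=len(PRIMES)))


def energy(occupation):
    """Eigenvalue of H: sum_p n_(p) ln p = ln n."""
    return sum(k * math.log(p) for p, k in zip(PRIMES, occupation))


def factorised_coefficients(site_vectors):
    """v_n = prod_p v_{n_(p)}, given one coefficient list per prime."""
    coefficients = []
    for occupation in basis():
        value = 1.0
        for vector, k in zip(site_vectors, occupation):
            value *= vector[k]
        coefficients.append(value)
    return coefficients


def commutator_residual(v):
    """Largest component of ([Phi_tot, H] - i) v in absolute value."""
    energies = [energy(occupation) for occupation in basis()]
    residual = 0.0
    for a in range(len(energies)):
        comm_a = 0j
        for b in range(len(energies)):
            if a != b:
                phi_ab = 1j / (energies[a] - energies[b])
                comm_a += phi_ab * (energies[b] - energies[a]) * v[b]
        residual = max(residual, abs(comm_a - 1j * v[a]))
    return residual
```

Because the total sum factorises as $\sum_{\mathbf{n}} v_{\mathbf{n}} = \prod_p \sum_{n_{(p)}} v_{n_{(p)}}$, it is enough for the coefficients at a single prime to sum to zero, which is the mechanism behind the product form in Eq. (29).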
4 Extension to the Dirichlet L-Functions

The Riemann zeta function belongs to a family of functions, called the Dirichlet L-functions, that are defined as the analytic continuation of the Dirichlet series
\[
L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} = \prod_{p\,\in\,\text{primes}} \frac{1}{1 - \chi(p)\, p^{-s}}, \qquad \mathrm{Re}(s) > 1 \tag{30}
\]
to the complex $s$-plane. In the above, $\chi(n)$, called the Dirichlet character, is a homomorphism from the multiplicative group $G(k) = (\mathbb{Z}/k\mathbb{Z})^*$ of invertible elements of $\mathbb{Z}/k\mathbb{Z}$ to $\mathbb{C}^*$, which is then extended as a character for all $\mathbb{Z}$ by setting $\chi(m) = 0$ for all $m$ which are not relatively prime to $k$ [30]. A Dirichlet character so defined satisfies the following properties:

1. For all $m_1, m_2 \in \mathbb{Z}$, $\chi(m_1 m_2) = \chi(m_1)\chi(m_2)$
2. $\chi(m) \neq 0$ if and only if $m$ is relatively prime to $k$
3. $\chi(m_1) = \chi(m_2)$ if $m_1 \equiv m_2$ (mod $k$)

Therefore, $\chi$ is a multiplicative character, defined modulo $k$, on the set of integers. It is this multiplicative property that justifies writing the sum as an infinite product in Eq. (30).

There is a trivial character that assigns the value 1 to all integers, including 0. (This may be taken to correspond to $k = 1$.) The Riemann zeta function corresponds to the choice of the trivial character. In all other cases, only those integers (respectively, primes) whose Dirichlet characters are not zero contribute to the sum (respectively, the product). This of course depends on the periodicity $k$ of the character. Therefore, the product restricts to primes that do not divide $k$:
\[
\prod_{p} \frac{1}{1 - \chi_k(p)\, p^{-s}} = \prod_{(p,k)=1} \frac{1}{1 - \chi_k(p)\, p^{-s}} = \prod_{(p,k)=1}\ \sum_{n_p=0}^{\infty} \chi_k(p)^{n_p}\, p^{-n_p s} \tag{31}
\]
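As a concrete instance of Eqs. (30) and (31), one may take the non-principal character modulo 4, for which $L(2, \chi)$ is Catalan's constant, approximately 0.9159655942. The sketch below compares a partial sum of the Dirichlet series with a truncated Euler product; the choice of character, the cutoffs and the function names are illustrative assumptions.

```python
def chi4(m):
    """Non-principal Dirichlet character modulo 4: 0 on even m, +1 / -1 on m = 1 / 3 mod 4."""
    if m % 2 == 0:
        return 0
    return 1 if m % 4 == 1 else -1


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def dirichlet_series(s, terms):
    """Partial sum of sum_{n >= 1} chi(n) / n^s."""
    return sum(chi4(n) / n ** s for n in range(1, terms + 1))


def euler_product(s, prime_cutoff):
    """Truncated product over primes of 1 / (1 - chi(p) p^(-s)); p = 2 contributes a factor 1."""
    product = 1.0
    for p in primes_up_to(prime_cutoff):
        product *= 1.0 / (1.0 - chi4(p) * p ** (-s))
    return product
```

The prime 2 divides the modulus, so its factor is 1 and the product effectively runs over the odd primes only, as in Eq. (31).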
With this understanding, one can define the inverse $\chi^{-1}$ by restricting to the relevant set of primes. For these primes $p \nmid k$, the Dirichlet character satisfies $\chi_k^{-1}(p) = \chi_k^{*}(p)$. (Formally, for the others, we may take $\chi$ as well as $\chi^{-1}$ to be zero [21].)

Everything we discussed in the context of the Riemann zeta function in Sect. 3, including all the caveats, applies to the Dirichlet L-functions, with obvious modifications at appropriate places. The role of the generalised Vladimirov derivative, acting on complex valued functions on the $p$-adic numbers $\mathbb{Q}_p$, is played by the generalised Vladimirov derivative twisted by the character $\chi$ [21], denoted by $D_{(p)\chi}$. The Kozyrev wavelets are eigenfunctions of these operators for all $\chi$. The eigenvalues, however, are different and involve the Dirichlet character as follows
\[
D_{(p)\chi}\, \psi^{(p)}_{1-n,m,j}(\xi) = \chi_k(p^n)\, p^n\, \psi^{(p)}_{1-n,m,j}(\xi) \tag{32}
\]
We refer to [21] for details of the construction and other properties of these operators. The above equation and Eq. (9) lead to the conclusion that $D$ and $D_\chi$ are simultaneously diagonalisable; hence the Kozyrev wavelets are also eigenfunctions of the unitary operator $U_\chi = D_\chi D^{-1}$:
\[
U_{(p)\chi}\, \psi^{(p)}_{1-n,m,j}(\xi) = D_{(p)\chi}\, D^{-1}_{(p)}\, \psi^{(p)}_{1-n,m,j}(\xi) = \chi_k(p^n)\, \psi^{(p)}_{1-n,m,j}(\xi) \tag{33}
\]
We can define its inverse $U^{-1}_{(p)\chi} = U^{\dagger}_{(p)\chi} = D^{-1}_{(p)}\, D_{(p)\chi^{*}}$ for those $k$ which do not contain $p$ in their factorisation (otherwise it is the identity operator). Conversely, when we consider all primes, for a given $k$, we need to restrict to the set of primes that do not divide $k$, i.e., with the formal extension of the inverse given after Eq. (31). As in the case of the Riemann zeta function, we can combine all the prime factors to write $L(s, \chi_k)$ as a trace
\[
L(s, \chi) = \sum_{\mathbf{n} = (n_{(2)}, n_{(3)}, \cdots)} \big\langle \mathbf{n} \big|\, U_\chi\, e^{-s \ln \mathbb{D}} \,\big| \mathbf{n} \big\rangle
\]
where $U_\chi = \otimes_p U_{(p)\chi}$. In interpreting this as the partition function of a statistical mechanical model, the Hamiltonian is such that
\[
e^{-\beta H_\chi} \ \longleftrightarrow\ U_\chi\, e^{-s \ln \mathbb{D}} = \mathbb{D}^{-1}\, \mathbb{D}_\chi\, \mathbb{D}^{-s} = \mathbb{D}_\chi\, \mathbb{D}^{-s-1}
\]
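which reduces to $e^{-\beta H} \sim \mathbb{D}^{-s}$ corresponding to the Riemann zeta function in Eq. (13), up to a phase. Now since a non-zero $\chi_k(p) = e^{i\omega_p}$ is a root of unity, we can define a new phase state$^{6}$
\[
|\phi^{(\chi)}_{k_p}\rangle = \frac{1}{\sqrt{\mathsf{n}+1}} \sum_{n_p=0}^{\mathsf{n}} e^{-i n_p \left(\phi_{k_p} \ln p + \omega_p\right)}\, |n_p\rangle \tag{34}
\]
These states provide an orthonormal set:
\[
\big\langle \phi^{(\chi)}_{k'_p} \big| \phi^{(\chi)}_{k_p} \big\rangle = \delta_{k_p, k'_p}
\]
One may construct a phase operator
\[
\hat{\phi}_{(p)\chi} = \sum_{k_p} \left(\phi_{k_p} + \frac{\omega_p}{\ln p}\right) \big|\phi^{(\chi)}_{k_p}\big\rangle \big\langle \phi^{(\chi)}_{k_p}\big| \equiv \sum_{k_p} \phi^{(\chi)}_{k_p}\, \big|\phi^{(\chi)}_{k_p}\big\rangle \big\langle \phi^{(\chi)}_{k_p}\big| \tag{35}
\]
using the eigenvalues and eigenstates as before. Now we define the operator $U_{(p)}\, e^{\beta \ln p\, N_p}$ such that
\[
U_{(p)}\, e^{\beta \ln p\, N_p}\, |n_p\rangle = e^{i n_p \omega_p}\, e^{\beta n_p \ln p}\, |n_p\rangle \tag{36}
\]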
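It follows that
\[
\begin{aligned}
\hat{\phi}_{(p)\chi}\, U_{(p)}\, e^{\beta \ln p\, N_p}
&= U_{(p)}\, e^{\beta \ln p\, N_p} \sum_{k_p} \left(\phi^{(\chi)}_{k_p} + \frac{\omega_p}{\ln p} - i\beta\right) \Big|\phi^{(\chi)}_{k_p} + \frac{\omega_p}{\ln p} - i\beta\Big\rangle \Big\langle \phi^{(\chi)}_{k_p} + \frac{\omega_p}{\ln p} - i\beta\Big| \\
&\quad + U_{(p)}\, e^{\beta \ln p\, N_p} \left(i\beta - \frac{\omega_p}{\ln p}\right) \sum_{k_p} \Big|\phi^{(\chi)}_{k_p} + \frac{\omega_p}{\ln p} - i\beta\Big\rangle \Big\langle \phi^{(\chi)}_{k_p} + \frac{\omega_p}{\ln p} - i\beta\Big|
\end{aligned}
\]
This relation can be obtained by the same method used in the earlier sections. If $i\beta$ takes any of the values $\frac{2\pi k}{(\mathsf{n}+1)\ln p} + \frac{\omega_p}{\ln p}$, then in the first term above one gets the phase operator Eq. (35), since $\phi_{k_p}$ is defined modulo $\frac{2\pi}{\ln p}$. Hence, as in the case of the Riemann zeta function,
\[
\left[\hat{\phi}_{(p)\chi},\ U_{(p)\chi}\, e^{\beta \ln p\, N_p}\right] = \left(i\beta - \frac{\omega_p}{\ln p}\right) U_{(p)\chi}\, e^{\beta \ln p\, N_p} \tag{37}
\]
The definition of the resolvent (of the exponential of the phase operator) is completely analogous to the case of the Riemann zeta function, Eq. (22): one only needs to substitute $\hat{\phi}_p \to \hat{\phi}_{(p)\chi}$, resulting in the trace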
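\[
\sum_{p=2}^{\mathsf{p}} \sum_{\substack{k_p \\ \text{exactly one } \phi_{k_p}\neq 0}} \frac{1}{1 - e^{\,i\left(\phi_{k_p} + \frac{\omega_p}{\ln p}\right) - i\phi}}
\ +\ \sum_{\substack{k_p \\ \text{at least two } \phi_{k_p}\neq 0}} \frac{1}{1 - e^{\,\frac{i\omega_p}{\ln p} - i\phi}}
\]
in place of Eq. (23). Once again the poles (apart from that at $\phi = 0$) coincide with the zeroes of the partition function, which is
\[
Z(\beta) = \mathrm{Tr}\left(U_{\chi,\mathsf{p}}\, \exp\Big(\beta \sum_{p=2}^{\mathsf{p}} \ln p\, N_p\Big)\right) = \prod_{p=2}^{\mathsf{p}} \frac{1 - \chi^{\mathsf{n}+1}(p)\, p^{\beta(\mathsf{n}+1)}}{1 - \chi(p)\, p^{\beta}} \tag{38}
\]
where the unitary operator $U_{\chi,\mathsf{p}}$ is the product of the corresponding operators at all sites, $U_{\chi,\mathsf{p}} = \prod_{p=2}^{\mathsf{p}} U_{(p)\chi}$, and we have used the fact that $\chi(p^{\mathsf{n}+1}) = \chi^{\mathsf{n}+1}(p)$ is again a character with the same periodicity. In the thermodynamic limit $\mathsf{p} \to \infty$ we get the following ratio of Dirichlet L-functions
\[
Z(\beta) = \frac{L(-\beta, \chi_k)}{L\big(-(\mathsf{n}+1)\beta,\ \chi_k^{\mathsf{n}+1}\big)} \tag{39}
\]

$^{6}$ This is done by truncating the spectrum to relate with the previous case.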
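The finite product in Eq. (38) is a geometric sum at each site, which can be confirmed by enumerating the truncated basis. The sketch below assumes a particular complex character modulo 5 (fixed by $\chi(2) = i$), the primes 2, 3, 5, 7 and a real $\beta$; these choices and the function names are illustrative and not taken from the text.

```python
import itertools

PRIMES = (2, 3, 5, 7)  # illustrative set of sites; p = 5 divides the modulus


def chi5(m):
    """A Dirichlet character modulo 5, determined by chi(2) = i on the generator 2."""
    return {0: 0j, 1: 1 + 0j, 2: 1j, 3: -1j, 4: -1 + 0j}[m % 5]


def trace_by_enumeration(beta, n_max):
    """Sum over occupation vectors of prod_p (chi(p) p^beta)^{n_p}."""
    total = 0j
    for occupation in itertools.product(range(n_max + 1), repeat=len(PRIMES)):
        weight = 1 + 0j
        for p, k in zip(PRIMES, occupation):
            for _ in range(k):
                weight *= chi5(p) * p ** beta
        total += weight
    return total


def closed_form(beta, n_max):
    """Product over p of (1 - chi(p)^(n+1) p^(beta (n+1))) / (1 - chi(p) p^beta)."""
    result = 1 + 0j
    for p in PRIMES:
        x = chi5(p) * p ** beta
        result *= (1 - x ** (n_max + 1)) / (1 - x)
    return result
```

At the site $p = 5$ the character vanishes, so only the vacuum contributes and the factor is 1. When $\mathsf{n}+1$ is a multiple of $\varphi(5) = 4$, the power $\chi^{\mathsf{n}+1}(p)$ equals 1 for every prime not dividing 5, which is the reduction to the principal character.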
In the special case where $\mathsf{n}+1$ is the Euler totient function $\varphi(k)$ or an integer multiple of it, $\chi(p^{\varphi(k)}) = (\chi(p))^{\varphi(k)}$ reduces to $\chi_{k,0}(p)$, the principal character, which is 1 if $(p, k) = 1$ and 0 otherwise. Except for the trivial zeroes, the discussion above closely follows what we argued for the Riemann zeta function.

In summary, we have proposed to view the Riemann zeta and the Dirichlet L-functions as the partition functions (up to multiplication by a function that plays no essential role) of quantum spins in magnetic fields, the values of which depend on the site. We have argued how to make sense of the phase operator (up to a similarity transformation). The zeroes of the partition function coincide with the poles of the resolvent function of the exponential of the aggregate or total phase operators, as discussed in Sects. 3.1 and 4. A different approach to the phase operator was discussed in Sect. 3.2. Its relation to the partition function, via similarity transforms, seems to relate the zeta function and the L-functions in the same framework.

Acknowledgments We thank Surajit Sarkar for collaboration at the initial stages of this work. It is a pleasure to acknowledge useful discussions with Rajendra Bhatia, Ved Prakash Gupta and Vijay Patankar. We would like to thank Bourama Toni and Wilson Zúñiga-Galindo for the invitation to write this article.
References

1. H. Montgomery, “The pair correlation of zeros of the zeta function,” Analytic number theory (Proc. Sympos. Pure Math., Vol. XXIV, St. Louis Univ., St. Louis, Mo., 1972), pp. 181–193, 1973.
2. B. Hayes, “Computing science: the spectrum of Riemannium,” American Scientist, vol. 91, no. 4, pp. 296–300, 2003.
3. A. Odlyzko, “The $10^{22}$-nd zero of the Riemann zeta function,” in Dynamical, spectral, and arithmetic zeta functions (San Antonio, TX, 1999), Contemp. Math., pp. 139–144, 2001.
4. D. Spector, “Supersymmetry and the Möbius inversion function,” Comm. Math. Phys., vol. 127, pp. 239–252, 1990.
5. B. Julia, “Statistical theory of numbers,” in Number Theory and Physics, J. Luck, P. Moussa, and M. Waldschmidt (Eds.), Springer Proceedings in Physics, Springer, 1990.
6. B. Julia, “Thermodynamic limit in number theory: Riemann-Beurling gases,” Physica A: Statistical Mechanics and its Applications, vol. 203, no. 3, pp. 425–436, 1994.
7. I. Bakas and M. Bowick, “Curiosities of arithmetic gases,” Journal of Mathematical Physics, vol. 32, pp. 1881–1884, 1991.
8. A. Knauf, “Phases of the number-theoretic spin chain,” J. Stat. Phys., vol. 73, pp. 423–431, 1993.
9. A. Knauf, “The number-theoretical spin chain and the Riemann zeroes,” Commun. Math. Phys., vol. 196, pp. 703–731, 1998.
10. D. Schumayer and D. Hutchinson, “Physics of the Riemann hypothesis,” Rev. Mod. Phys., vol. 83, pp. 307–330, 2011, arXiv:1101.3116 [math-ph].
11. C. Itzykson and J.-M. Drouffe, Statistical field theory: vol. 1, From Brownian motion to renormalization and lattice gauge theory. Cambridge Monographs on Mathematical Physics, Cambridge University Press, 1991.
12. L. Susskind and J. Glogower, “Quantum mechanical phase and time operator,” Physics, vol. 1, pp. 49–61, 1964.
13. P. Carruthers and M. Nieto, “Phase and angle variables in quantum mechanics,” Rev. Mod. Phys., vol. 40, pp. 411–440, 1968.
14. J. Garrison and J. Wong, “Canonically conjugate pairs, uncertainty relations and phase operators,” J. Math. Phys., vol. 11, pp. 2242–2249, 1970.
15. A. Galindo, “Phase and number,” Lett. Math. Phys., vol. 8, pp. 495–500, 1984.
16. D. Pegg and S. Barnett, “Unitary phase operator in quantum mechanics,” Europhys. Lett., vol. 6, pp. 483–487, 1988.
17. P. Busch, M. Grabowski, and P. Lahti, Operational quantum physics, vol. 31 of Lecture Notes in Physics. Springer, 1995.
18. X. Ma and W. Rhodes, “Quantum phase operator and phase states,” arXiv e-print, 2015, arXiv:1511.02847 [quant-ph].
19. A. Perez-Leija, L. Andrade-Morales, F. Soto-Eguibar, A. Szameit, and H. Moya-Cessa, “The Pegg–Barnett phase operator and the discrete Fourier transform,” Physica Scripta, vol. 91, p. 043008, 2016.
20. A. Chattopadhyay, P. Dutta, S. Dutta, and D. Ghoshal, “Matrix model for Riemann zeta via its local factors,” Nucl. Phys., vol. B954, p. 114996, 2020, arXiv:1807.07342.
21. P. Dutta and D. Ghoshal, “Pseudodifferential operators on Qp and L-series,” 2020, arXiv:2003.00901.
22. R. Mack, J. Dahl, H. Moya-Cessa, W. Strunz, R. Walser, and W. Schleich, “Riemann ζ-function from wave-packet dynamics,” Phys. Rev. A, vol. 82, p. 032119, 2010.
23. S. Kozyrev, “Wavelet theory as p-adic spectral analysis,” Izv. Math., vol. 66, no. 2, pp. 367–376, 2002, arXiv:math-ph/0012019.
24. P. Dutta, D. Ghoshal, and A. Lala, “Enhanced symmetry of the p-adic wavelets,” Phys. Lett., vol. B783, pp. 421–427, 2018, arXiv:1804.00958.
25. A. Khrennikov, S. Kozyrev, and W. Zúñiga-Galindo, Ultrametric pseudodifferential equations and applications. Encyclopedia of Mathematics and its Applications, Cambridge University Press, 2018.
26. R. Gray, “Toeplitz and circulant matrices: a review,” Foundations and Trends in Communications and Information Theory, vol. 2, pp. 153–239, 2006.
27. H. Widom, “Toeplitz matrices,” in Studies in real and complex analysis, I. Hirschman Jr. (Ed.), The Mathematical Association of America, 1990.
28. N. Nikolski, Toeplitz matrices and operators. Cambridge Studies in Advanced Mathematics, Cambridge University Press, 2020.
29. P. Dutta and D. Ghoshal, “A p-arton Model for Modular Cusp Forms,” 2021, arXiv:2103.02443.
30. J.-P. Serre, A course in arithmetic. Graduate Texts in Mathematics, Springer, 1973.
On Non-Archimedean Valued Fields: A Survey of Algebraic, Topological and Metric Structures, Analysis and Applications Khodr Shamseddine and Angel Barría Comicheo
Abstract In this survey paper, we first briefly review basic properties of ultrametric spaces, valued fields and ordered fields as well as the connection between these different mathematical objects. As examples, we introduce the so-called general Hahn fields and Levi-Civita fields, and we present a summary of their key properties. Then, for the rest of the paper, we focus our attention on two special Levi-Civita fields: $\mathcal{R}$ and its complex counterpart $\mathcal{C}$. Among all the non-Archimedean fields surveyed in the first part of the paper, $\mathcal{R}$ and $\mathcal{C}$ are unique from a pure mathematics point of view: $\mathcal{R}$ (respectively, $\mathcal{C}$) is the smallest non-Archimedean valued field extension of the field of real numbers $\mathbb{R}$ (respectively, the field of complex numbers $\mathbb{C}$) that is real closed (respectively, algebraically closed) and Cauchy-complete in the valuation topology. Moreover, because of the left-finiteness of the supports of the Levi-Civita numbers, those numbers can be used on a computer, thus allowing for many useful computational applications. We review some of our research work on $\mathcal{R}$ and $\mathcal{C}$ as well as on the spaces $\mathcal{R}^2$ and $\mathcal{R}^3$: one-dimensional and multidimensional calculus, power series and analytic functions, measure theory and integration, unconstrained and constrained optimization, operator theory on the Banach space $c_0$ of null sequences of elements of $\mathcal{C}$, and computational applications.

Keywords Non-Archimedean valued fields · Ultrametric spaces · Hahn fields · Levi-Civita fields · Non-Archimedean analysis · Non-Archimedean calculus · Computational applications
This work was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC Discovery Grant # RGPIN/4965-2017) and by the University of Manitoba.
K. Shamseddine, Department of Physics and Astronomy, University of Manitoba, Winnipeg, MB, Canada, e-mail: [email protected]
A. Barría Comicheo, Department of Mathematics, University of Manitoba, Winnipeg, MB, Canada
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. W. A. Zúñiga-Galindo, B. Toni (eds.), Advances in Non-Archimedean Analysis and Applications, STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health, https://doi.org/10.1007/978-3-030-81976-7_6
2020 Mathematics Subject Classification 12J25, 26E30, 11D88, 46S10
1 Introduction

In the first part of the paper, we provide the preliminaries needed to introduce the Levi-Civita fields in a general context, considering their algebraic, metric and order structures. We begin by reviewing the properties of ultrametric spaces and non-Archimedean valued fields in Sect. 2, establishing a close correspondence between the two concepts and introducing the non-Archimedean notion of spherical completeness and related results. Then, in Sect. 3, we introduce the Hahn fields and review their role among ordered fields with the same “level of non-Archimedicity”, which generalizes the role that the field of real numbers $\mathbb{R}$ plays among all Archimedean ordered fields. Finally, we discuss the Levi-Civita fields as particular subfields of the Hahn fields.

In the second part of the paper, we focus our attention on two particular Levi-Civita fields: $\mathcal{R}$ and $\mathcal{C} := \mathcal{R} \oplus i\mathcal{R}$. $\mathcal{R}$ (resp. $\mathcal{C}$) is the smallest non-Archimedean valued field extension of $\mathbb{R}$ (resp. $\mathbb{C}$) that is real closed (resp. algebraically closed) and Cauchy-complete in the valuation topology. After reviewing the algebraic and topological structures of $\mathcal{R}$ and $\mathcal{C}$ in Sect. 4, we review our work on developing calculus on $\mathcal{R}$ and $\mathcal{R}^n$, showing in Sect. 5 that, for the so-called weakly locally uniformly differentiable (WLUD) functions at a point or on an open subset of $\mathcal{R}$ or $\mathcal{R}^n$, the important theorems of real calculus hold locally. Then we summarize in Sect. 6 the convergence and analytical properties of power series, showing that they have the same smoothness behavior as real and complex power series. Moreover, we present in Sect. 7 a Lebesgue-like measure and integration theory on $\mathcal{R}$, $\mathcal{R}^2$ and $\mathcal{R}^3$ with applications of that theory. We also discuss in Sect. 8 solutions to one-dimensional and multi-dimensional optimization problems based on continuity and differentiability concepts that are stronger than the topological ones. Then, in Sect. 9, we review some of the computational applications of the Levi-Civita numbers, which can be used on a computer because of the left-finiteness of the supports of those numbers. Finally, in Sect. 10, we present a quick review of our work on developing an operator theory on the Banach space $c_0$ of null sequences of $\mathcal{C}$.
2 Preliminaries

2.1 Non-Archimedean Valued Fields

We start this subsection with the definition of a valuation on a field.

Definition 2.1 Let $K$ be a field. A valuation on $K$ is a map $|\cdot| : K \to \mathbb{R}$ satisfying the following properties, for all $x, y \in K$:
(1) $|x| \ge 0$, and $|x| = 0$ if and only if $x = 0$,
(2) $|xy| = |x||y|$,
(3) $|x + y| \le |x| + |y|$.
The pair $(K, |\cdot|)$ is called a valued field.

It is not hard to see that $|1_K| = 1$, $|-x| = |x|$ and $|x^{-1}| = |x|^{-1}$ for $x \neq 0$. In the rest of the article we will denote the set $K \setminus \{0\}$ by $K^*$. A valuation $|\cdot|$ on $K$ is called non-Archimedean if it satisfies the strong triangle inequality: $|x + y| \le \max\{|x|, |y|\}$ for all $x, y \in K$. Otherwise it is called Archimedean.

Theorem 2.2 ([55, 1.1], [33, lemma 8.2]) Let $(K, |\cdot|)$ be a valued field. The following conditions are equivalent.
(1) $|\cdot|$ is non-Archimedean.
(2) If $a, b \in K$ and $|a| < |b|$, then $|b - a| = |b|$ (isosceles triangle principle).
(3) The set $\{|n 1_K| : n \in \mathbb{N}\}$ is bounded.
(4) $|n 1_K| \le 1$ for every $n \in \mathbb{N}$.
(5) $|2 \cdot 1_K| \le 1$.
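As a concrete illustration of Theorem 2.2, the following Python sketch computes the $p$-adic absolute value on the rationals, a standard example of a non-Archimedean valuation, and checks conditions (2) and (4) on a few sample values. The function name `padic_abs` is hypothetical and chosen for this illustration only; the checks are numerical spot checks, not a proof.

```python
from fractions import Fraction

def padic_abs(x, p):
    """p-adic absolute value |x|_p = p**(-v), where v is the exponent of p in x."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    v = 0
    num, den = x.numerator, x.denominator
    while num % p == 0:      # powers of p in the numerator raise v
        num //= p
        v += 1
    while den % p == 0:      # powers of p in the denominator lower v
        den //= p
        v -= 1
    return Fraction(1, p) ** v

p = 5
a, b = Fraction(50), Fraction(3)           # |a|_5 = 1/25 < |b|_5 = 1
# Condition (2): |a| < |b| implies |b - a| = |b|.
isosceles_ok = padic_abs(b - a, p) == padic_abs(b, p)
# Condition (4): |n 1_K| <= 1 for every natural number n (sampled up to 200).
integers_bounded = all(padic_abs(n, p) <= 1 for n in range(1, 201))
print(isosceles_ok, integers_bounded)
```

Exact rational arithmetic is used so that equalities of absolute values can be tested without rounding error.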
As the reader can deduce from the definition, the features of a valued field will depend on the algebraic properties of the field (whether it is algebraically closed, or real closed, etc.) and the properties of the valuation (whether it is Archimedean or non-Archimedean, or whether $|K^*|$ is discrete or dense in $(0, \infty)$, etc.). As in the real and complex cases, the valuation defines a metric $\rho(x, y) := |x - y|$ for $x, y \in K$, which allows us to consider convergence of sequences and series, and continuity of functions on $K$. Sometimes, the valued field will admit an order compatible with the field operations, which enriches the properties of the valued field and generates an order topology. It is interesting to ask whether the order topology coincides with the topology induced by the valuation. In [4] and [12] it is shown that for every ordered field, there exists a valuation that induces the order topology of the field, although sometimes the codomain of the valuation needs to be an ordered group that cannot be embedded in the real numbers. Those latter valuations are called Krull valuations or general valuations and are studied in [12].
2.2 Ultrametric Spaces

The importance of non-Archimedean valued fields lies in the alternative models they provide compared to the commonly studied models defined using real or complex numbers. How different the non-Archimedean structures can be from the real or complex ones depends largely on how different the underlying non-Archimedean valued field is from the real or complex number fields. The first aspect of non-Archimedean valued fields that we will review is their metric structure, e.g. how different the convergence criteria for sequences and series can be from the ones in the real or complex number fields. For this, we will leave the algebraic structure of the valued field aside, and we will focus on its properties as a metric space, leading to the notion of an ultrametric space.

Recall that a metric on a set $X$ is a function $\rho : X \times X \to \mathbb{R}$ satisfying the following properties for all $x, y, z \in X$:
(1) $\rho(x, y) \ge 0$, and $\rho(x, y) = 0$ if and only if $x = y$,
(2) $\rho(x, y) = \rho(y, x)$,
(3) $\rho(x, y) \le \rho(x, z) + \rho(z, y)$ (triangle inequality).
The pair $(X, \rho)$ is called a metric space. In particular, when the metric $\rho$ satisfies the so-called strong triangle inequality $\rho(x, y) \le \max\{\rho(x, z), \rho(z, y)\}$ for all $x, y, z \in X$, the pair $(X, \rho)$ is called an ultrametric space.

Any subset of a non-Archimedean valued field $(K, |\cdot|)$ with the map $(x, y) \mapsto |x - y|$ constitutes an ultrametric space. Notice that with this example we have listed all ultrametric spaces, since W. Schikhof proved in [31] that any ultrametric space can be isometrically embedded into a non-Archimedean valued field.

When a metric satisfies the strong triangle inequality, geometrical situations occur that do not occur otherwise. As an example of such unusual situations, we will consider triangles in an ultrametric space. Let $(X, \rho)$ be a metric space. The following condition is called the isosceles triangle principle: for all $x, y, z \in X$, if $\rho(x, z) \neq \rho(z, y)$ then $\rho(x, y) = \max\{\rho(x, z), \rho(z, y)\}$; that is, every triangle with vertices in $X$ is isosceles.

Theorem 2.3 ([28, p. 3], [55, 2.A]) Let $(X, \rho)$ be a metric space. The metric $\rho$ is an ultrametric if and only if it satisfies the isosceles triangle principle.

Before we continue, let us present some important notations.

Notation 2.4 Let $(X, \rho)$ be a metric space and let $a \in X$ and $r > 0$. The sets $B(a, r) := \{x \in X : \rho(x, a) < r\}$ and $B[a, r] := \{x \in X : \rho(x, a) \le r\}$ are called the open and closed balls of center $a$ and radius $r$, respectively. The family of open balls forms a base of neighbourhoods for a uniquely determined Hausdorff topology on $X$.
This topology is called the topology induced by $\rho$ on $X$. With respect to this topology the open balls are open sets and the closed balls are closed sets in $X$. The diameter of a non-empty set $Y \subset X$ is $\operatorname{diam}(Y) := \sup\{\rho(x, y) : x, y \in Y\}$ and the distance between two non-empty sets $Y, Z \subset X$ is $\operatorname{dist}(Y, Z) := \inf\{\rho(y, z) : y \in Y, z \in Z\}$. The set of values of a metric $\rho : X \times X \to \mathbb{R}$ is denoted and defined by $\rho(X \times X) := \{\rho(x, y) : x, y \in X\}$.

The following theorem collects the most remarkable results about an ultrametric space, all of which are direct consequences of the strong triangle inequality.

Theorem 2.5 Let $(X, \rho)$ be an ultrametric space. Then the following properties are satisfied.
(1) Each point of a ball is a center of the ball.
(2) Each ball in $X$ is both closed and open (“clopen”) in the topology induced by the ultrametric.
(3) Each ball has an empty boundary.
(4) Two balls are either disjoint, or one is contained in the other.
(5) Let $a \in Y \subset X$. Then $\operatorname{diam}(Y) = \sup\{\rho(x, a) : x \in Y\}$.
(6) The radii of a ball $B$ form the set $\{r \in \mathbb{R} : r_1 \le r \le r_2\}$, where $r_1 = \operatorname{diam}(B)$, $r_2 = \operatorname{dist}(B, X \setminus B)$ ($r_2 = \infty$ if $B = X$). It may happen that $r_1 < r_2$, so that a ball may have infinitely many radii.
(7) If two balls $B_1, B_2$ are disjoint, then $\operatorname{dist}(B_1, B_2) = \rho(x, y)$ for all $x \in B_1$, $y \in B_2$.
(8) Let $U \neq \emptyset$ be an open subset of $X$. Given a sequence $(r_n)_n$ in $(0, \infty)$, strictly decreasing and convergent to 0, there exists a partition of $U$ formed by balls of the form $B[a, r_n]$, with $a \in U$ and $n \in \mathbb{N}$.
(9) Let $\varepsilon \in \mathbb{R}^+$. For $x, y \in X$, the relation $\rho(x, y) < \varepsilon$ is an equivalence relation and induces a partition of $X$ into open balls of radius $\varepsilon$. Analogously for $\rho(x, y) \le \varepsilon$ and closed balls.
(10) Let $Y \subset X$, $B$ a ball in $X$, $B \cap Y \neq \emptyset$. Then $B \cap Y$ is a ball in $Y$.
(11) Let $(x_n)_n$ be a sequence in $X$ converging to $x \in X$. Then for each $a \in X \setminus \{x\}$, there exists $N \in \mathbb{N}$ such that $\rho(x_n, a) = \rho(x, a)$ for all $n \ge N$.
(12) There are no new values of an ultrametric after completion, i.e. if $(X^\wedge, \rho^\wedge)$ is the completion of $(X, \rho)$, then $\rho(X \times X) = \rho^\wedge(X^\wedge \times X^\wedge)$.
(13) A sequence $(x_n)_n$ in $X$ is Cauchy if and only if $\lim_{n \to \infty} \rho(x_n, x_{n+1}) = 0$.
Proof Property (8) can be found in [33, Theorem 18.6], while (5) is in [32, 1.D]. Property (3) follows directly from (2), and the proofs of the remaining properties can be found in [28, pp. 3–4].

For results regarding compactness and separability of ultrametric spaces, and for results about ultrametrizability of a topological space, we refer the reader to [12].
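Some of the properties in Theorem 2.5 can be checked by brute force on a small finite ultrametric space. The sketch below is an illustration under stated assumptions: it takes $X = \{0, \dots, 31\}$ with the 2-adic distance (a choice made here, not one from the text), and the helper names `rho` and `closed_ball` are hypothetical. It tests the strong triangle inequality, property (1) and property (4).

```python
from fractions import Fraction
from itertools import product

def rho(x, y, p=2):
    """p-adic distance |x - y|_p between two integers."""
    n = x - y
    if n == 0:
        return Fraction(0)
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return Fraction(1, p ** v)

X = range(32)  # a small finite ultrametric space (assumption for this sketch)

def closed_ball(a, r):
    """B[a, r] = {x in X : rho(x, a) <= r}, stored as a frozenset."""
    return frozenset(x for x in X if rho(x, a) <= r)

# Strong triangle inequality on all triples of X.
strong = all(rho(x, y) <= max(rho(x, z), rho(z, y))
             for x, y, z in product(X, repeat=3))

# Property (1): every point of a ball is a center of that ball.
r = Fraction(1, 4)
B = closed_ball(3, r)
every_point_is_center = all(closed_ball(b, r) == B for b in B)

# Property (4): two balls are either disjoint or nested.
radii = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 8))
balls = {closed_ball(a, s) for a in X for s in radii}
nested_or_disjoint = all(b1 <= b2 or b2 <= b1 or not (b1 & b2)
                         for b1 in balls for b2 in balls)

print(strong, every_point_is_center, nested_or_disjoint)
```

In this space the closed ball $B[3, 1/4]$ is the residue class of 3 modulo 4, which makes the "every point is a center" property easy to see by hand.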
2.3 Spherical Completeness

Recall that a metric space is said to be Cauchy complete if every Cauchy sequence is convergent or, equivalently, if each nested sequence of closed balls whose radii form a null sequence has a non-empty intersection. This motivates the following definition.

Definition 2.6 An ultrametric space is called spherically complete if each nested sequence of balls has a non-empty intersection.

Remark 2.7 The concept of spherical completeness plays a key role as a necessary and sufficient condition for the validity of the Hahn-Banach theorem in the non-Archimedean context (see [55, 4.10, 4.15]). Furthermore, spherically complete spaces satisfy important properties as we will see below: a fixed point theorem and best approximations.
It is clear that a spherically complete ultrametric space is Cauchy complete, but the converse is not always true, as we will see when we review the Levi-Civita fields. Nevertheless, the following lemma is a partial converse.

Lemma 2.8 ([35, Lemma 1.7]) Suppose that $(X, \rho)$ is a Cauchy complete ultrametric space. If 0 is the only accumulation point of the set $\rho(X \times X)$, then $(X, \rho)$ is spherically complete.

The concept of spherical completeness is geometrical rather than topological.

Theorem 2.9 ([55, 2.F]) Let $(X, \rho)$ be a complete ultrametric space. Then the formula $\sigma(x, y) := \inf\{2^{-n} : n \in \mathbb{Z}, \rho(x, y) \le 2^{-n}\}$ defines an ultrametric $\sigma$ such that $\rho \le \sigma \le 2\rho$, and $(X, \sigma)$ is spherically complete.

One of the attributes of spherically complete ultrametric spaces is that they satisfy a stronger version of the fixed point theorem for complete metric spaces.

Definition 2.10 Let $(X, \rho)$ be a metric space. A function $f : X \to X$ is called a shrinking map when $\rho(f(x), f(y)) < \rho(x, y)$ for all $x, y \in X$, $x \neq y$. If there exists $k \in (0, 1)$ such that $\rho(f(x), f(y)) < k\rho(x, y)$ for all $x, y \in X$, $x \neq y$, then $f$ is called a contraction.

The fixed point theorem for complete metric spaces states that every contraction of a complete metric space has a unique fixed point [54, 3.7.4]. However, this theorem cannot be extended to shrinking maps. In fact, the map $f : [1, \infty) \to [1, \infty)$, $f(x) = x + \frac{1}{x}$, is a shrinking map defined on a complete space that has no fixed point. In contrast, every shrinking map of a spherically complete ultrametric space has a unique fixed point [29, 2.3].

Definition 2.11 Let $Y$ be a subset of an ultrametric space $(X, \rho)$. Let $a \in X$ and $b \in Y$. Then $b$ is a best approximation of $a$ in $Y$ if $\rho(a, b) = \operatorname{dist}(a, Y)$.

Another attribute of a spherically complete ultrametric space is the existence of best approximations, as stated in the following result.

Theorem 2.12 ([33, 21.2]) Let $Y \neq \emptyset$ be a spherically complete ultrametric space embedded in an ultrametric space $X$. Then each $x \in X$ has a best approximation in $Y$, i.e. $\min\{\rho(y, x) : y \in Y\}$ exists.

In general, best approximations are not unique.

Theorem 2.13 ([33, 21.1]) Let $Y \neq \emptyset$ be a subset of an ultrametric space $X$. Suppose that $Y$ has no isolated points. If an element $a \in X \setminus Y$ has a best approximation in $Y$ then it has infinitely many.

Note that every non-Archimedean valued field has at least one “spherical completion” and that spherical completion is not always unique [12]. Also, the spherical completeness of a non-Archimedean valued field has been characterized in terms of sequences (pseudo-completeness), and in terms of valued field extensions (maximal completeness). For more information about these equivalent concepts and for a description of the structure of spherically complete valued fields in general, please see [12].
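The claim made after Definition 2.10, that $f(x) = x + 1/x$ is a shrinking map on $[1, \infty)$ without a fixed point, can be verified directly. The short computation below is a standard argument supplied here for convenience; it is not taken from the cited references.

```latex
% For x, y in [1, \infty) with x \neq y we have xy > 1, hence 0 < 1 - 1/(xy) < 1.
\[
  |f(x) - f(y)|
  = \left| (x - y) - \frac{x - y}{xy} \right|
  = |x - y| \left( 1 - \frac{1}{xy} \right)
  < |x - y| .
\]
% A fixed point would satisfy x + 1/x = x, i.e. 1/x = 0, which is impossible.
% The factor 1 - 1/(xy) tends to 1 as x, y \to \infty, so no uniform k < 1 exists
% and f is not a contraction.
```

The same computation shows why the contraction constant matters: the shrinking factor is strictly below 1 at every pair of points, but not uniformly so.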
2.4 Completion of Valued Fields

The completion of a valued field as a metric space can be equipped with field operations and a valuation that extend the ones of the original field.

Theorem 2.14 ([14, 1.1.4]) Let $(K, |\cdot|)$ be a valued field. There exists a Cauchy complete valued field $(\widetilde{K}, |\cdot|^{\sim})$ and an embedding $i : K \to \widetilde{K}$ such that $|x| = |i(x)|^{\sim}$ for all $x \in K$, and the image $i(K)$ is dense in $\widetilde{K}$. If $(K', |\cdot|', i')$ is another such trio, … $= |x|$ …

… 0 implies $xy > 0$. Note that an ordered ring is necessarily an integral domain. A field that is an ordered ring will be called an ordered field.

Definition 3.1 A field $K$ is formally real if it satisfies the following condition: given $n \in \mathbb{N}$ and $a_1, \dots, a_n \in K$ such that $\sum_{i=1}^{n} a_i^2 = 0$, then $a_1 = \cdots = a_n = 0$.

The following result characterizes the formally real fields as the fields that can be ordered.

Theorem 3.2 ([4, 1.70(5) and 1.71(6)]) Let $K$ be a field. The following conditions are equivalent.
(1) $K$ is formally real,
(2) $-1$ is not a sum of squares in $K$,
(3) There exists an order $\le$ on $K$ such that $(K, \le)$ is an ordered field.

Recall that the characteristic of a field $K$, denoted by $\operatorname{char}(K)$, is the smallest positive integer $n$ such that $\sum_{i=1}^{n} 1_K = 0$ if such a number $n$ exists, and 0 otherwise.

Examples 3.3
(1) If $K$ is a field of non-zero characteristic, then $0 = \sum_{i=1}^{\operatorname{char}(K)} 1_K^2$. Hence $K$ is not formally real. Thus if $K$ is formally real then $\operatorname{char}(K) = 0$. However, the converse is not true, as can be seen in Example 3.3(4) below.
(2) The field of complex numbers $\mathbb{C}$ cannot be an ordered field since $-1 = i^2$ and therefore it is not formally real.
(3) If $K$ is an ordered field, then we can define an order in the field of formal Laurent series $K((x))$ which is compatible with the addition and multiplication. Thus $K((x))$ can be ordered, and therefore it is formally real. Such an order is defined as follows: for every $z \in K((x))$ there are $v \in \mathbb{Z}$ and $r_i \in K$ such that $z = \sum_{i=v}^{\infty} r_i x^i$, with $r_v \neq 0$ when $z \neq 0$. We say that $z < 0$ if $z \neq 0$ and $r_v < 0$. Then $z_1 \le z_2$ if $z_1 = z_2$ or ($z_1 \neq z_2$ and $z_1 - z_2 < 0$).
(4) $\mathbb{Q}_p$ is not formally real because if $p = 2$, then $-7$ is a square and if $p > 2$ then $1 - p$ is a square ([30, p. 144]). Recall that in a formally real field the squares are non-negative elements. Since $\mathbb{Q} \subset \mathbb{Q}_p$, $\operatorname{char}(\mathbb{Q}_p) = 0$.
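Example 3.3(4) can be made plausible numerically. An integer is a square in the $p$-adic integers exactly when it is a square modulo $p^k$ for every $k$, so the Python sketch below searches for square roots modulo a few prime powers. It checks only finitely many moduli, so it illustrates the claim and does not prove it; the function name `has_square_root_mod` is hypothetical.

```python
def has_square_root_mod(a, m):
    """Return True if x*x is congruent to a modulo m for some residue x."""
    a %= m
    return any((x * x - a) % m == 0 for x in range(m))

# p = 2: -7 has a square root modulo 2**k for k = 1, ..., 12.
minus_seven_ok = all(has_square_root_mod(-7, 2 ** k) for k in range(1, 13))

# p > 2: 1 - p has a square root modulo p**k for a few odd primes p.
one_minus_p_ok = all(has_square_root_mod(1 - p, p ** k)
                     for p in (3, 5, 7) for k in range(1, 5))

# For contrast, -1 is not a square modulo 8, hence not a square in Q_2.
minus_one_mod_8 = has_square_root_mod(-1, 8)

print(minus_seven_ok, one_minus_p_ok, minus_one_mod_8)
```

Once $-7 = a^2$ in $\mathbb{Q}_2$, the identity $a^2 + 7 \cdot 1^2 = 0$ exhibits a vanishing sum of squares with non-zero terms, which is exactly what Definition 3.1 forbids; the case $p > 2$ is analogous with $a^2 + (p - 1) \cdot 1^2 = 0$.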
3.2 General Hahn Fields and the Embedding Theorem

In this subsection we will review the concept of an Archimedean extension of a field as well as the general Hahn fields. Let $K$ be an ordered field. Two elements $x, y \in K^*$ are comparable if there exist $n, m \in \mathbb{N}$ such that $|x|_0 < n|y|_0$ and $|y|_0 < m|x|_0$, where

$$|a|_0 := \max\{a, -a\} = \begin{cases} a, & \text{if } a \ge 0 \\ -a, & \text{if } a < 0. \end{cases}$$

The relation of being comparable is an equivalence relation on $K^*$, and to denote ‘$x$ and $y$ are comparable’ we write $x \sim y$. This relation defines a partition of $K^*$ into equivalence classes, which are called the Archimedean classes of $K$. The equivalence class of $x \in K^*$ is denoted by $[x]$, and the set of all the Archimedean classes is denoted by $G_K$. Then $G_K$ is an ordered abelian group under the order $\prec$ and addition $+$ defined as follows: for every $x, y \in K^*$,
(1) $[x] \prec [y] \iff \forall n \in \mathbb{N},\ n|y|_0 < |x|_0 \iff y \not\sim x$ and $|y|_0 < |x|_0$; and
(2) $[x] + [y] := [xy]$.
In this group, the neutral element is $[1_K]$, and $-[x] = [x^{-1}]$ for $x \in K^*$.

Definition 3.4 An ordered field $K$ is Archimedean if $G_K = \{[1_K]\}$, that is, when any two elements in $K^*$ are comparable.

Theorem 3.5 ([12, 3.7]) An ordered field $K$ is Archimedean if and only if it satisfies the Archimedean property, i.e. for every $x \in K$, there exists $n \in \mathbb{N}$ such that $|x|_0 < n 1_K$.

Thus for every ordered field $K$, the group $G_K$ determines the ‘Archimedicity’ or the ‘non-Archimedicity’ of $K$. The field $\mathbb{R}$ of real numbers (the only ordered, Dedekind complete field up to isomorphism) is characterized by the fact that each Archimedean ordered field can be embedded in $\mathbb{R}$ ([20, 3.5]). Hans Hahn in [19] (1907) generalized this property (see Theorem 3.10 below); and by doing so, he ended up with ordered fields that extend all the ordered fields with a given “level of non-Archimedicity”.

Definition 3.6 Let $E/K$ be an extension of ordered fields, where the order on $E$ restricted to $K$ coincides with that of $K$. The field $E$ is an Archimedean extension of $K$ if every $x \in E$ is comparable to some $y \in K$. In that case, $G_E$ and $G_K$ are isomorphic ordered groups. An ordered field $K$ is called Archimedean complete if it has no proper Archimedean extension fields.

Definition 3.7 Let $K$ be an ordered field. If $G$ is an ordered abelian group isomorphic to $G_K$, then we say that $K$ is of type $G$ and $G$ is called an Archimedean group of $K$.
The simplest Archimedean complete field is $\mathbb{R}$, since it is (up to isomorphism) the only Archimedean complete, ordered field of type $\{0\}$ [12, 3.10]. Archimedean complete fields of other types are given by the general Hahn fields defined in the next result.

Theorem 3.8 ([4, 6.20, 6.21, 7.32], [13, 2.15], [19]) Let $K$ be a field (not necessarily ordered) and $G$ an ordered abelian group. The set $K((G)) := \{f : G \to K : \operatorname{supp}(f) \text{ is well-ordered}\}$, where $\operatorname{supp}(f) := \{x \in G : f(x) \neq 0\}$, is a field under the addition and multiplication defined as follows: for every $f, g \in K((G))$ and $x \in G$,
(1) $(f + g)(x) := f(x) + g(x)$,
(2) $(fg)(x) := \sum_{a + b = x} f(a) g(b)$.

Fields of the form $K((G))$ are called general Hahn fields. When $K$ is an ordered field we can define an order on $K((G))$ generalizing the order on $K((x))$ defined in Example 3.3(3).

Definition 3.9 (Ordered General Hahn Fields) Let $K$ be an ordered field and consider $\lambda : K((G))^* \to G$, $\lambda(f) = \min\{\operatorname{supp}(f)\}$. For $f, g \in K((G))$ we define: $f < g \iff f \neq g$ and $(g - f)(\lambda(g - f)) > 0$. Then $(K((G)), \le)$ is an ordered field.

The next two results are the main features of the general Hahn fields as ordered fields and mimic the relation between $\mathbb{R}$ and other Archimedean fields.

Theorem 3.10 ([11], [21, 3.1], [4, 1.64], [13, 1.35], [19] (Hahn’s Embedding Theorem)) If $K$ is an ordered field, then for every Archimedean group $G$ of $K$, there exists an order-preserving field monomorphism $\sigma$ from $K$ into $\mathbb{R}((G))$ such that $\mathbb{R}((G))$ is an Archimedean extension of $\sigma(K)$.

Theorem 3.11 ([11, pp. 862–863], [21, 3.2], [19] (Hahn’s Completeness Theorem)) If $G$ is an ordered abelian group then the field $\mathbb{R}((G))$ is (up to isomorphism) the only Archimedean complete, ordered field of type $G$.
3.3 Hahn Fields and Levi-Civita Fields

In this subsection we define a non-Archimedean valuation on certain general Hahn fields and introduce the family of Levi-Civita fields.
Definition 3.12 A Hahn field is a general Hahn field $K((G))$ for which $G$ is a subgroup of $(\mathbb{R}, +)$ and $K$ is any field.

The distinctive characteristic of a Hahn field among general Hahn fields is that we can define in a natural way a non-Archimedean valuation on the field.

Theorem 3.13 ([33, A.9 pp. 288–292], [34, II.6 corollary, p. 51]) Let $G$ be a subgroup of $(\mathbb{R}, +)$ and $K$ any field. If the map $|\cdot| : K((G)) \to \mathbb{R}$ is defined by

$$|f| := \begin{cases} e^{-\min\{\operatorname{supp}(f)\}} & \text{if } f \neq 0 \\ 0 & \text{if } f = 0, \end{cases}$$

then $(K((G)), |\cdot|)$ is a Cauchy complete non-Archimedean valued field with residue class field isomorphic to $K$ and value group $|K((G))^*| = \{e^g \in \mathbb{R} : g \in G\}$. Moreover, it is spherically complete.

In the following result we introduce the Levi-Civita fields.

Theorem 3.14 ([55, 1.3]) Let $K$ be any field and let $G$ be a subgroup of $(\mathbb{R}, +)$. Then $L[G, K] := \{f : G \to K \mid \operatorname{supp}(f) \cap (-\infty, n] \text{ is finite for every } n \in \mathbb{Z}\}$ is a subfield of $K((G))$. When we restrict the valuation of $K((G))$ to $L[G, K]$, the latter becomes a Cauchy complete, non-Archimedean valued field with residue class field isomorphic to $K$ and value group $|L[G, K]^*| = \{e^g : g \in G\}$.

Remark 3.15 Fields of the form $L[G, K]$, as defined in Theorem 3.14 above, are called Levi-Civita fields. Also, for a given $f \in L[G, K]$, since $\operatorname{supp}(f) \cap (-\infty, n]$ is finite for every $n \in \mathbb{Z}$, we say that $\operatorname{supp}(f)$ is left-finite.

If a field $K$ has a discrete valuation, like $\mathbb{R}((x))$ and $\mathbb{Q}_p$, i.e. when $|K^*|$ is discrete as a subspace of $\mathbb{R}$, then every nonzero element can be written as a limit of a convergent power series ([14, 1.3.5]). The following result shows that in some Levi-Civita fields this is also possible when the valuation is dense, i.e. when $|K^*|$ is dense in $(0, \infty)$.

Lemma 3.16 ([36, Theorem 4.1]) Let $K$ be a field and let $d : \mathbb{Q} \to K$ be the function defined by

$$d(x) := \begin{cases} 1 & \text{if } x = 1 \\ 0 & \text{if } x \neq 1. \end{cases}$$

Then $d$ is an element of the field $L[\mathbb{Q}, K]$; and for any $r \in \mathbb{Q}$, we have that

$$d^r(x) = \begin{cases} 1 & \text{if } x = r \\ 0 & \text{if } x \neq r. \end{cases}$$

The value group of $(L[\mathbb{Q}, K], |\cdot|)$ is $\{e^{-r} = |d^r| = |d|^r : r \in \mathbb{Q}\}$. Furthermore, every nonzero element $f$ in $L[\mathbb{Q}, K]$ is the sum of a convergent generalized power series with respect to the valuation on $L[\mathbb{Q}, K]$, specifically:

$$f = \sum_{r \in \mathbb{Q}} f(r) d^r = \sum_{r \in \operatorname{supp}(f)} f(r) d^r.$$

Additionally, every generalized power series of the form $\sum_{r \in \mathbb{Q}} a_r d^r$ for which $\{r \in \mathbb{Q} : a_r \neq 0\} \cap (-\infty, n]$ is finite for every $n \in \mathbb{Z}$ is convergent in $L[\mathbb{Q}, K]$; and if two such series differ in at least one coefficient then their sums are different.

Theorem 3.17 ([12, 3.19]) Let $K$ be any field and $G$ a subgroup of $(\mathbb{R}, +)$. Then
(1) The fields $K((G))$ and $L[G, K]$ coincide if and only if $G$ is discrete.
(2) The field $L[G, K]$ is spherically complete if and only if $G$ is discrete.
(3) If $K$ is an ordered field, then $K((G))$ is an Archimedean extension of $L[G, K]$ with respect to the order defined in Definition 3.9. If, in addition, $K$ is Archimedean then both $K((G))$ and $L[G, K]$ are of type $G$ (see Definition 3.7).
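The left-finiteness of supports is what makes elements of $L[\mathbb{Q}, K]$ representable by finite data. The following Python sketch is a minimal illustration under stated assumptions: an element is stored as a dictionary from rational exponents to coefficients, and only exponents up to a fixed cutoff are kept. The names `CUTOFF`, `clean`, `add`, `mul`, `lam` and `valuation` are hypothetical, and the fixed cutoff is a crude simplification (it is exact only for the finitely supported inputs used here); this is not the implementation used in the works cited in this survey.

```python
from fractions import Fraction
from math import exp

CUTOFF = Fraction(5)  # hypothetical truncation: keep exponents <= CUTOFF only

def clean(f):
    """Drop zero coefficients and exponents above the cutoff."""
    return {q: c for q, c in f.items() if c != 0 and q <= CUTOFF}

def add(f, g):
    """Pointwise sum (f + g)(x) = f(x) + g(x)."""
    h = dict(f)
    for q, c in g.items():
        h[q] = h.get(q, 0) + c
    return clean(h)

def mul(f, g):
    """Convolution (fg)(x) = sum over a + b = x of f(a) g(b)."""
    h = {}
    for a, fa in f.items():
        for b, gb in g.items():
            h[a + b] = h.get(a + b, 0) + fa * gb
    return clean(h)

def lam(f):
    """lambda(f) = min supp(f); only defined for f != 0."""
    return min(f)

def valuation(f):
    """|f| = exp(-lambda(f)) for f != 0, and |0| = 0."""
    return exp(-lam(f)) if f else 0.0

d = {Fraction(1): 1}                      # the element d of Lemma 3.16
sqrt_d = {Fraction(1, 2): 1}              # d^(1/2)
x = {Fraction(0): 1, Fraction(1): 1}      # 1 + d

print(mul(sqrt_d, sqrt_d) == d)           # d^(1/2) * d^(1/2) = d
print(mul(x, x))                          # coefficients of (1 + d)^2
print(valuation(d))                       # |d| = e^(-1)
```

With this representation, the multiplicativity $|fg| = |f||g|$ corresponds to `lam(mul(f, g)) == lam(f) + lam(g)`, and the strong triangle inequality to `lam(add(f, g)) >= min(lam(f), lam(g))`.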
3.4 Real Closed Field Extensions of $\mathbb{R}$

Recall that a field $K$ is algebraically closed if every non-constant polynomial in $K[x]$ has a root in $K$. If $L/K$ is a field extension then $a \in L$ is algebraic over $K$ if it is the root of a non-zero polynomial in $K[x]$. If every element of $L$ is algebraic over $K$, then $L$ is an algebraic extension of $K$. Also, $K$ is real closed if $K$ is formally real and does not admit a proper algebraic extension that is formally real. As the reader can see in the following result, a real closed field is an ordered field very similar to $\mathbb{R}$.

Theorem 3.18 ([22, Chapter XI], [4, 1.71(21), 1.71(22)], [10, 5.4.4], [6, Chapter 5, Section 4, Lemma 4.1]) Let $K$ be a field. The following conditions are equivalent.
(1) $K$ is real closed,
(2) $x^2 + 1$ is irreducible in $K[x]$ and $K(i)$ is algebraically closed ($i^2 = -1$),
(3) $K$ is an ordered field, each positive element of $K$ has a square root and every $p \in K[x]$ of odd degree has a root in $K$,
(4) any sentence in the first-order language of fields is true in $K$ if and only if it is true in $\mathbb{R}$,
(5) $K$ is an ordered field and the intermediate value theorem holds for all polynomials over $K$.

Therefore, to develop a calculus over an ordered field in which the intermediate value theorem holds for polynomials, the base field has to be real closed. Some general Hahn fields are real closed. In fact, if $K$ is a field and $G$ an ordered abelian group, then $K((G))$ is real closed if and only if $K$ is real closed and $G$ is divisible [4, 6.23 (1)–(2)]. Additionally, the Levi-Civita field $L[\mathbb{Q}, K]$ is real closed if and only if $K$ is real closed [12, 4.12]. From this and from Theorem 3.18(2) we can deduce that the Hahn field $\mathbb{R}((\mathbb{Q}))(i) = \mathbb{C}((\mathbb{Q}))$ and the Levi-Civita field $L[\mathbb{Q}, \mathbb{R}](i) = L[\mathbb{Q}, \mathbb{C}]$ are algebraically closed.

Definition 3.19 Let $(x_n)$ be a sequence of elements in an ordered field $K$. Then $(x_n)$ is Cauchy if for every 0-neighborhood $U$ with respect to the order topology in $K$, there exists $N \in \mathbb{N}$ such that $x_m - x_n \in U$ for all $m, n \ge N$. The sequence $(x_n)$ is convergent to $x \in K$ if for every 0-neighborhood $U$ with respect to the order topology in $K$, there exists $N \in \mathbb{N}$ such that $x_n - x \in U$ for all $n \ge N$. An ordered field $K$ is said to be Cauchy complete if every Cauchy sequence of $K$ is convergent in the order topology.

As the next result states, the Levi-Civita field $L[\mathbb{Q}, \mathbb{R}]$ is the smallest real closed field extension of $\mathbb{R}$ that is Cauchy complete with respect to a non-Archimedean order.

Theorem 3.20 ([36, 3.11]) Let $K/\mathbb{R}$ be a field extension where $K$ is a Cauchy complete ordered field such that
(1) the order in $K$ extends the one in $\mathbb{R}$;
(2) there exists $\delta \in K$ such that $0 < \delta < r$ for every $r \in \mathbb{R}^+$ and $(\delta^n)$ converges to 0 in the order topology; and
(3) $K$ is real closed.
If $d$ is the function defined in Lemma 3.16 then there exists an order-preserving field monomorphism $\sigma : L[\mathbb{Q}, \mathbb{R}] \to K$ defined by

$$\sigma(f) = \sigma\Big(\sum_{q \in \operatorname{supp}(f)} f(q) d^q\Big) = \sum_{q \in \operatorname{supp}(f)} f(q) \delta^q.$$

Note that the order topology on $L[\mathbb{Q}, \mathbb{R}]$ coincides with the topology induced by the valuation [12, 5.7, 5.10]. Using this fact, it is possible to adapt the proof of Theorem 3.20 to prove that the Levi-Civita field $L[\mathbb{Q}, \mathbb{R}]$ equipped with the valuation presented in Theorem 3.14 is the smallest non-Archimedean valued field that is a real closed field extension of $\mathbb{R}$ and that is Cauchy complete with respect to the valuation. This result is stated in the following corollary.
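Corollary 3.21 Let $K/\mathbb{R}$ be a field extension where $K$ is real closed and Cauchy complete with respect to a non-trivial valuation $|\cdot|^{\sim}$ such that $|x|^{\sim} = 1$ for all $x \in \mathbb{R}^*$. Then, for every $\delta \in K^*$ such that $|\delta|^{\sim} < 1$, there exists a field monomorphism $\sigma : L[\mathbb{Q}, \mathbb{R}] \to K$ defined by

$$\sigma(f) = \sigma\Big(\sum_{q \in \operatorname{supp}(f)} f(q) d^q\Big) = \sum_{q \in \operatorname{supp}(f)} f(q) \delta^q,$$

satisfying $|\sigma(f)|^{\sim} = |f|^{\tau}$ for all $f \in L[\mathbb{Q}, \mathbb{R}]$, where $\tau > 0$ is such that $|\delta|^{\sim} = |d|^{\tau}$.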
4 The Levi-Civita Fields R and C In the rest of the paper, we will focus on presenting an overview of our research on the Levi-Civita fields R := L[Q, R] and C := L[Q, C]. For the further discussion, it is convenient to introduce the following terminology. Definition 4.1 (λ, ∼, ≈) For x = 0 in R or C, we let λ(x) = min(supp(x)), which exists because of the left-finiteness of supp(x); and we let λ(0) = +∞. Moreover, we denote the value of x at q ∈ Q with brackets like x[q]. Given x, y ∈ R∗ or C ∗ , we say x ∼ y if λ(x) = λ(y); and we say x ≈ y if λ(x) = λ(y) and x[λ(x)] = y[λ(y)]. Note that λ describes orders of magnitude; the relation ≈ corresponds to agreement up to infinitely small relative error; while ∼ corresponds to agreement of order of magnitude and, in the case of R, it is the same equivalence relation introduced in Sect. 3.2. Moreover, we can isomorphically embed R and C in R and C, respectively, as subfields via the map E : R, C → R, C defined by
E(x)[q] =
x if q = 0 . 0 else
(4.1)
Recall that R is an ordered subfield of the ordered Hahn field R((Q)), with its order inherited from that of R((Q)); see Definition 3.9. Note that, given a < b in R, we define the R-interval [a, b] = {x ∈ R : a ≤ x ≤ b}, with the obvious adjustments in the definitions of the intervals [a, b[, ]a, b], and ]a, b[. Moreover, the embedding E in Eq. (4.1) of R into R is compatible with the order. The order leads to the definition of an ordinary absolute value on R: |x|0 = max {x, −x} which induces the same topology on R (called the order topology or valuation topology) as that induced by the ultrametric absolute value (nonArchimedean valuation): |x| = e−λ(x) ,
On Non-Archimedean Valued Fields: A Survey of Algebraic, Topological and. . .
223
as was shown in [47]. Moreover, two corresponding absolute values are defined on C in the natural way: |x + iy|0 =
x 2 + y 2 ; and |x + iy| = e−λ(x+iy) = max{|x|, |y|}.
Thus, $\mathcal{C}$ is topologically isomorphic to $\mathcal{R}^2$ provided with the product topology induced by $|\cdot|_0$ (or $|\cdot|$) in $\mathcal{R}$. Besides the usual order relations on $\mathcal{R}$, some other notations are convenient.

Definition 4.2 ($\ll$, $\gg$) Let $x, y \in \mathcal{R}$ be non-negative. We say $x$ is infinitely smaller than $y$ (and write $x \ll y$) if $nx < y$ for all $n \in \mathbb{N}$; we say $x$ is infinitely larger than $y$ (and write $x \gg y$) if $y \ll x$. If $x \ll 1$, we say $x$ is infinitely small; if $x \gg 1$, we say $x$ is infinitely large. Infinitely small numbers are also called infinitesimals or differentials. Infinitely large numbers are also called infinite. Non-negative numbers that are neither infinitely small nor infinitely large are also called finite.

If $d$ is the Levi-Civita number introduced in Lemma 3.16, then it is easy to check that $d^q \ll 1$ if $q > 0$ and $d^q \gg 1$ if $q < 0$ in $\mathbb{Q}$. Moreover, for all $x \in \mathcal{R}$ (resp. $\mathcal{C}$), the elements of $\operatorname{supp}(x)$ can be arranged in ascending order, say $\operatorname{supp}(x) = \{q_1, q_2, \ldots\}$ with $q_j < q_{j+1}$ for all $j$; and $x$ can be written as
\[ x = \sum_{j=1}^{\infty} x[q_j]\, d^{q_j}, \]
where the series converges in the valuation topology.

Besides being the smallest ordered non-Archimedean field extension of the real numbers that is both complete in the order topology and real closed, the Levi-Civita field $\mathcal{R}$ is of particular interest because of its practical usefulness. Since the supports of the elements of $\mathcal{R}$ are left-finite, it is possible to represent these numbers on a computer. One such application is the computation of derivatives of real functions representable on a computer [43], where both the accuracy of formula manipulators and the speed of classical numerical methods are achieved. In the following sections, we present a brief overview of recent research done on $\mathcal{R}$ and $\mathcal{C}$; and we refer the interested reader to the respective papers for a more detailed study of any of the research topics summarized below.
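The idea behind computing derivatives with Levi-Civita numbers can be illustrated on a polynomial: evaluating $f$ at $x_0 + d$ with field arithmetic gives $f(x_0) + f'(x_0)\,d + \cdots$, so the derivative appears as the coefficient of $d$. The Python sketch below is a simplified illustration under stated assumptions (dictionary representation, truncation depth `CUTOFF`, helper names `add`, `mul`, `real`, `derivative_at`); it is not the implementation described in [43].

```python
from fractions import Fraction

CUTOFF = Fraction(3)  # assumed truncation depth: terms d^q with q > 3 are dropped

def add(x, y):
    """Sum of two numbers stored as {exponent: coefficient}."""
    out = dict(x)
    for q, c in y.items():
        out[q] = out.get(q, 0) + c
    return {q: c for q, c in out.items() if c != 0}

def mul(x, y):
    """Product, truncated at exponent CUTOFF."""
    out = {}
    for q1, c1 in x.items():
        for q2, c2 in y.items():
            if q1 + q2 <= CUTOFF:
                out[q1 + q2] = out.get(q1 + q2, 0) + c1 * c2
    return {q: c for q, c in out.items() if c != 0}

def real(r):
    """Embed a real (here rational) number at exponent 0."""
    return {Fraction(0): Fraction(r)} if r != 0 else {}

d = {Fraction(1): Fraction(1)}  # the infinitesimal d

def f(x):
    """Example polynomial f(x) = x^3 - 2x, built from field operations only."""
    return add(mul(mul(x, x), x), mul(real(-2), x))

def derivative_at(func, x0):
    """Read f'(x0) off as the coefficient of d^1 in func(x0 + d)."""
    return func(add(real(x0), d)).get(Fraction(1), Fraction(0))

result = derivative_at(f, 2)
print(result)
```

For this example $f'(x) = 3x^2 - 2$, so the coefficient extracted at $x_0 = 2$ agrees with the value obtained by differentiating by hand.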
5 Calculus on $\mathcal{R}$ and $\mathcal{R}^n$

The following examples show that functions on a finite interval of $\mathcal{R}$ behave in a way that is different from what we would expect under similar conditions in $\mathbb{R}$.

Example 5.1 Let $f_1 : [0, 1] \to \mathcal{R}$ be given by
\[ f_1(x) = \begin{cases} d^{-1} & \text{if } 0 \le x < d \\ d^{-1/\lambda(x)} & \text{if } d \le x \ll 1 \\ 1 & \text{if } x \sim 1. \end{cases} \]
K. Shamseddine and A. Barría Comicheo
Then $f_1$ is continuous on $[0, 1]$; but for $d \le x \ll 1$, $f_1(x)$ grows without bound.

Example 5.2 Let $f_2 : [-1, 1] \to \mathcal{R}$ be given by $f_2(x) = x - x[0]$. Then $f_2$ is continuous on $[-1, 1]$. However, $f_2$ assumes neither a maximum nor a minimum on $[-1, 1]$. The set $f_2([-1, 1]) = \{y \in \mathcal{R} : \lambda(y) > 0\} = \{y \in \mathcal{R} : |y| < 1\}$ is bounded above by any positive real number and below by any negative real number; but it has neither a least upper bound nor a greatest lower bound.

Example 5.3 Let $f_3 : [0, 1] \to \mathcal{R}$ be given by
\[ f_3(x) = \begin{cases} 1 & \text{if } x \sim 1 \\ 0 & \text{if } x \ll 1. \end{cases} \]
Then $f_3$ is continuous on $[0, 1]$ and differentiable on $]0, 1[$, with $f_3'(x) = 0$ for all $x \in ]0, 1[$. We have that $f_3(0) = 0$ and $f_3(1) = 1$; but $f_3(x) \neq 1/2$ for all $x \in [0, 1]$. Moreover, $f_3$ is not constant on $[0, 1]$ even though $f_3'(x) = 0$ for all $x \in ]0, 1[$.

Example 5.4 Let $f_4 : [-1, 1] \to \mathcal{R}$ be given by
\[ f_4(x) = x[0] + \sum_{\nu=1}^{\infty} x_\nu d^{3 q_\nu} \quad \text{when} \quad x = x[0] + \sum_{\nu=1}^{\infty} x_\nu d^{q_\nu}. \]
Then $f_4'(x) = 0$ for all $x \in ]-1, 1[$. But $f_4$ is obviously not constant on $[-1, 1]$.

Remark 5.5 The extension $f$ of $f_4$ to $\mathcal{R}$, that is $f : \mathcal{R} \to \mathcal{R}$ given by $f(x)[q] = x[q/3]$, is differentiable on all of $\mathcal{R}$ with vanishing derivative everywhere. Moreover, $f$ is an example of a nontrivial order preserving field automorphism on $\mathcal{R}$ [39]; in $\mathbb{R}$ (or any other ordered Archimedean field) the identity map is the only order preserving field automorphism.

Example 5.6 Let $f_5 : [-1, 1] \to \mathcal{R}$ be given by $f_5(x) = -f_4(x) + x^4$, where $f_4$ is the function from Example 5.4. Then $f_5'(x) = 4x^3$ for all $x \in ]-1, 1[$. Thus, $f_5' > 0$ on $]0, 1[$; but $f_5$ is not increasing on $]0, 1[$: $f_5(d^2) > f_5(d)$ even though $d^2 < d$. Also $f_5'$ is strictly increasing and $f_5'' \ge 0$ on $]-1, 1[$; but $f_5$ is not convex on $]-1, 1[$ since $f_5(d) = -d^3 + d^4 < 0 = f_5(0) + f_5'(0)\,d$.

Example 5.7 Let $f_6 : [-1, 1] \to \mathcal{R}$ be given by
\[ f_6(x) = -(f_4(x))^2 + x^8, \]
where $f_4$ is again the function from Example 5.4. Then $f_6$ is infinitely often differentiable on $]-1, 1[$ with $f_6^{(j)}(0) = 0$ for $1 \le j \le 7$ and $f_6^{(8)}(0) = 8! > 0$. But $f_6$ has a relative maximum at $0$.

The difficulties embodied in the examples above are not specific to $\mathcal{R}$, but are common to all non-Archimedean ordered fields; and they result from the fact that $\mathcal{R}$ is disconnected in the topology induced by the order. This makes developing Analysis on the field more difficult than in the real case; for example, the existence of nonconstant functions whose derivatives vanish everywhere on an interval (as in Example 5.4) makes integration much harder and renders the solutions of the simplest initial value problems (e.g. $y' = 0$; $y(0) = 0$) not unique. To circumvent such difficulties, different approaches have been employed. For example, by imposing stronger conditions on the function than in the real case, we obtain versions of the intermediate value theorem, the inverse function theorem, the mean value theorem and the implicit function theorem [9, 42, 49, 51]; by carefully defining a measure on $\mathcal{R}$ in [40, 46], we succeed in developing an integration theory with similar properties to those of the Lebesgue integral of Real Analysis; and by using a stronger concept of continuity and differentiability than in the real case, one-dimensional and multi-dimensional optimization results similar to those from Real Analysis have been obtained for $\mathcal{R}$-valued functions [52, 53].
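The claims of Examples 5.4 and 5.6 can be checked on numbers with finite support. The Python sketch below is illustrative only: it assumes a dictionary representation of Levi-Civita numbers and the usual rule that a nonzero number is positive exactly when its leading coefficient is positive; the helper names are hypothetical.

```python
from fractions import Fraction

def add(x, y):
    out = dict(x)
    for q, c in y.items():
        out[q] = out.get(q, 0) + c
    return {q: c for q, c in out.items() if c != 0}

def neg(x):
    return {q: -c for q, c in x.items()}

def mul(x, y):
    out = {}
    for q1, c1 in x.items():
        for q2, c2 in y.items():
            out[q1 + q2] = out.get(q1 + q2, 0) + c1 * c2
    return {q: c for q, c in out.items() if c != 0}

def is_positive(x):
    """Assumed order: x > 0 iff the coefficient at min(supp(x)) is positive."""
    return bool(x) and x[min(x)] > 0

def f4(x):
    """f4(x)[3q] = x[q]: every exponent in the support is tripled."""
    return {3 * q: c for q, c in x.items()}

def f5(x):
    """f5(x) = -f4(x) + x^4, as in Example 5.6."""
    x2 = mul(x, x)
    return add(neg(f4(x)), mul(x2, x2))

d = {Fraction(1): Fraction(1)}
d2 = mul(d, d)

print(f5(d), f5(d2))
print(is_positive(add(d, neg(d2))))          # d^2 < d
print(is_positive(add(f5(d2), neg(f5(d)))))  # yet f5(d^2) > f5(d)
```

The same helpers also let one spot-check that the exponent-tripling map is multiplicative on sample inputs, in line with Remark 5.5.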
5.1 Locally Uniformly Differentiable and Weakly Locally Uniformly Differentiable Functions from $\mathcal{R}$ to $\mathcal{R}$

In [49, 51], we focus our attention on $\mathcal{R}$-valued functions of one variable. We study the properties of locally uniformly differentiable (LUD) functions at a point $x_0 \in \mathcal{R}$ or on an open subset $A$ of $\mathcal{R}$. In particular, we show that LUD functions are $C^1$, they include all polynomial functions, and they are closed under addition, multiplication and composition. Then we generalize the definition of local uniform differentiability to any order. In particular, we study the properties of LUD$^2$ functions at a point $x_0 \in \mathcal{R}$ or on an open subset $A$ of $\mathcal{R}$; and we show that LUD$^2$ functions are $C^2$, they include all polynomial functions, and they are closed under addition, multiplication and composition. Finally, we formulate and prove an inverse function theorem as well as a local intermediate value theorem and a local mean value theorem for these functions. Here we only recall the main definitions and results (without proofs) and refer the reader to [49, 51] for the details.

Definition 5.8 Let $A \subseteq \mathcal{R}$ be open and let $f : A \to \mathcal{R}$. We say that $f$ is uniformly differentiable (UD) on $A$ if $f$ is differentiable on $A$ and for every $\epsilon > 0$ in $\mathcal{R}$ there exists $\delta > 0$ in $\mathcal{R}$ such that, whenever $x, y \in A$ with $|y - x|_0 < \delta$, we have that $|f(y) - f(x) - f'(x)(y - x)|_0 \le \epsilon\, |y - x|_0$.

Definition 5.9 Let $A \subseteq \mathcal{R}$ be open, let $f : A \to \mathcal{R}$, and let $x_0 \in A$ be given. We say that $f$ is locally uniformly differentiable (LUD) at $x_0$ if there exists a
neighborhood $\Omega$ of $x_0$ in $A$ such that $f$ is uniformly differentiable on $\Omega$. Moreover, we say that $f$ is locally uniformly differentiable (LUD) on $A$ if $f$ is LUD at every point in $A$.

Definition 5.10 Let $A \subset \mathcal{R}$ be open, let $f : A \to \mathcal{R}$, and let $n \in \mathbb{N} \cup \{0\}$ be given. Then we say that $f$ is UD$^n$ on $A$ if $f$ is $n$ times differentiable on $A$ and for every $\epsilon > 0$ in $\mathcal{R}$ there exists $\delta > 0$ in $\mathcal{R}$ such that, whenever $x, y \in A$ with $|y - x|_0 < \delta$, we have that
\[ \left| f(y) - \sum_{k=0}^{n} \frac{f^{(k)}(x)}{k!} (y - x)^k \right|_0 \le \epsilon\, |y - x|_0^n. \]
Definition 5.11 Let $A \subseteq \mathcal{R}$ be open, let $f : A \to \mathcal{R}$, let $x_0 \in A$, and let $n \in \mathbb{N} \cup \{0\}$ be given. We say that $f$ is LUD$^n$ at $x_0$ if there exists a neighborhood $\Omega$ of $x_0$ in $A$ such that $f$ is UD$^n$ on $\Omega$. Moreover, we say that $f$ is LUD$^n$ on $A$ if $f$ is LUD$^n$ at every point in $A$.

Note that, for $n = 0$, UD$^n$ means uniformly continuous; and hence LUD$^0$ means locally uniformly continuous.

Definition 5.12 Let $A \subseteq \mathcal{R}$ be open, let $f : A \to \mathcal{R}$, and let $x_0 \in A$ be given. We say that $f$ is LUD$^\infty$ at $x_0$ if $f$ is LUD$^n$ at $x_0$ for every $n \in \mathbb{N}$. Moreover, we say that $f$ is LUD$^\infty$ on $A$ if $f$ is LUD$^\infty$ at every point in $A$.

Theorem 5.13 (Inverse Function Theorem) Let $A \subseteq \mathcal{R}$ be open, let $f : A \to \mathcal{R}$ be LUD on $A$ and let $x_0 \in A$ be such that $f'(x_0) \neq 0$. Then there is a neighborhood $\Omega$ of $x_0$ in $A$ and a function $g : f(\Omega) \to \mathcal{R}$, such that
(1) $g = f|_\Omega^{-1}$;
(2) $f|_\Omega$ is one-to-one;
(3) $f(\Omega)$ is open; and
(4) $g$ is LUD on $f(\Omega)$, with $g' = \dfrac{1}{f' \circ g}$.
Theorem 5.14 (Local Intermediate Value Theorem) Let $A \subseteq \mathcal{R}$ be open, let $f : A \to \mathcal{R}$ be LUD on $A$, and let $x_0 \in A$ be such that $f'(x_0) \neq 0$. Then there is a neighborhood $\Omega$ of $x_0$ such that for any $a < b$ in $f(\Omega)$ and for any $c \in ]a, b[$, there is an $x \in \Omega$, strictly between $f^{-1}(a)$ and $f^{-1}(b)$, such that $f(x) = c$.

Theorem 5.15 (Local Mean Value Theorem) Let $A \subseteq \mathcal{R}$ be open, let $f : A \to \mathcal{R}$ be LUD$^2$ on $A$, and let $x_0 \in A$ be such that $f''(x_0) \neq 0$. Then there exists a neighborhood $\Omega$ of $x_0$ such that $f$ has the mean value property on $\Omega$. That is, for every $a, b \in \Omega$ with $a < b$, there exists $c \in ]a, b[$ such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
As in the real case, the mean value property can be used to prove other important results. In particular, while L’Hôpital’s rule does not hold for differentiable functions
on $\mathcal{R}$, we prove the result in [49] under similar conditions to those of the local mean value theorem. To do this we first prove the local equivalent of the Cauchy mean value theorem (Lemma 5.16). The proof is obtained from the mean value property the same way as in the real case.

Lemma 5.16 Let $A \subseteq \mathcal{R}$ be open, let $f, g : A \to \mathcal{R}$ be LUD$^2$ on $A$ and let $x_0 \in A$ be such that $f''(x_0) \neq 0$ and $g''(x_0) \neq 0$. Then there exists a neighborhood $\Omega$ of $x_0$ such that for every $a, b \in \Omega$ with $a < b$, there exists $c \in ]a, b[$ such that $f'(c)\,(g(b) - g(a)) = g'(c)\,(f(b) - f(a))$.

Theorem 5.17 Let $A \subseteq \mathcal{R}$ be open, let $f, g : A \to \mathcal{R}$ be LUD$^2$ on $A$ and let $a \in A$ be such that $f''(a) \neq 0$ and $g''(a) \neq 0$. Furthermore, suppose that $f(a) = g(a) = 0$, that there exists a neighborhood $\Omega$ of $a$ in $A$ such that $g'(x) \neq 0$ for every $x \in \Omega \setminus \{a\}$, and that $\lim_{x \to a} f'(x)/g'(x)$ exists. Then
\[ \lim_{x \to a} \frac{f'(x)}{g'(x)} = \lim_{x \to a} \frac{f(x)}{g(x)}. \]
In [9], we show that all the results in [49, 51] still hold if we replace local uniform differentiability with the strictly weaker concept of weak local uniform differentiability, which we define below.

Definition 5.18 Let $A \subseteq \mathcal{R}$ be open, let $f : A \to \mathcal{R}$, and let $x_0 \in A$ be given. We say that $f$ is weakly locally uniformly differentiable (abbreviated as WLUD) at $x_0$ if $f$ is differentiable in a neighborhood $\Omega$ of $x_0$ in $A$ and if for every $\epsilon > 0$ in $\mathcal{R}$ there exists $\delta > 0$ in $\mathcal{R}$ such that for every $x, y \in ]x_0 - \delta, x_0 + \delta[ \cap \Omega$ we have that $|f(y) - f(x) - f'(x)(y - x)|_0 \le \epsilon\, |y - x|_0$. Moreover, we say that $f$ is WLUD on $A$ if $f$ is WLUD at every point in $A$.

We extend the WLUD concept to higher orders of differentiability and we define WLUD$^n$ analogously to how LUD$^n$ was defined in [49].

Definition 5.19 Let $A \subseteq \mathcal{R}$ be open, let $f : A \to \mathcal{R}$, let $x_0 \in A$, and let $n \in \mathbb{N}$ be given. We say that $f$ is WLUD$^n$ at $x_0$ if $f$ is $n$ times differentiable in a neighborhood $\Omega$ of $x_0$ in $A$ and if for every $\epsilon > 0$ in $\mathcal{R}$ there exists $\delta > 0$ in $\mathcal{R}$ such that for every $x, y \in ]x_0 - \delta, x_0 + \delta[ \cap \Omega$ we have that
\[ \left| f(y) - \sum_{k=0}^{n} \frac{f^{(k)}(x)}{k!} (y - x)^k \right|_0 \le \epsilon\, |y - x|_0^n. \]
Moreover, we say that $f$ is WLUD$^n$ on $A$ if $f$ is WLUD$^n$ at every point in $A$.

Remark 5.20 A close look at Definition 5.19 shows that $f$ is WLUD$^0$ at $x_0$ if and only if $f$ is continuous in a neighborhood around $x_0$.
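As a quick sanity check of Definition 5.18, consider the polynomial $f(x) = x^2$ on $A = \mathcal{R}$. This worked example is added for illustration and is not taken from [9]; the choice $\delta = \epsilon/2$ is one convenient option among many.

```latex
% f(x) = x^2, f'(x) = 2x; fix x_0 and \epsilon > 0 in \mathcal{R}, and set \delta = \epsilon/2.
% For x, y \in ]x_0 - \delta, x_0 + \delta[ we have |y - x|_0 < 2\delta = \epsilon, hence
\begin{align*}
|f(y) - f(x) - f'(x)(y - x)|_0
  &= |y^2 - x^2 - 2x(y - x)|_0 \\
  &= |y - x|_0^2 \\
  &\le \epsilon\, |y - x|_0 .
\end{align*}
```

Since $\delta$ here does not depend on $x_0$, the same estimate shows that this particular $f$ satisfies the WLUD condition at every point.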
Finally, in [42], we state and prove a Taylor theorem with remainder for WLUD$^n$ functions on $\mathcal{R}$. As in the real case, the proof of the theorem uses the mean value theorem. However, in the non-Archimedean setting, stronger conditions on the function are needed than in the real case.

Theorem 5.21 (Taylor's Theorem with Remainder) Let $A \subseteq \mathcal{R}$ be open, let $n \in \mathbb{N}$ be given, and let $f : A \to \mathcal{R}$ be WLUD$^{n+2}$ on $A$. Assume further that $f^{(m)}$ is WLUD$^2$ on $A$ for $0 \le m \le n$. Then, for every $x \in A$, there exists a neighborhood $U$ of $x$ in $A$ such that, for any $y \in U$, there exists $c \in [\min(y, x), \max(y, x)]$ such that
\[ f(y) = \sum_{k=0}^{n} \frac{f^{(k)}(x)}{k!} (y - x)^k + \frac{f^{(n+1)}(c)}{(n+1)!} (y - x)^{n+1}. \]

Then we generalize the concept of weak local uniform differentiability to functions from $\mathcal{R}^n$ to $\mathcal{R}^m$ with $m, n \in \mathbb{N}$. Moreover, we formulate and prove the inverse function theorem for WLUD functions from $\mathcal{R}^n$ to $\mathcal{R}^n$ and the implicit function theorem for WLUD functions from $\mathcal{R}^n$ to $\mathcal{R}^m$ with $m < n$ in $\mathbb{N}$.
5.2 WLUD Functions from $\mathcal{R}^n$ to $\mathcal{R}^m$

Throughout this section, let $A$ denote an open subset of $\mathcal{R}^n$; consequently, whenever we speak of a ball $B_\delta(x) := \{y \in \mathcal{R}^n : |y - x|_0 < \delta\}$ around a point $x$ in $A$, it is assumed that $\delta > 0$ is small enough so that $B_\delta(x) \subset A$. We will state the main definitions and results here and we refer the reader to [42] for the details.

Notation 5.22 Let $f : A \to \mathcal{R}^m$ be differentiable at $x \in A$. Then $Df(x)$ denotes the linear map from $\mathcal{R}^n$ to $\mathcal{R}^m$ defined by the $m \times n$ Jacobian matrix of $f$ at $x$:
\[
\begin{pmatrix}
f^1_1(x) & f^1_2(x) & \cdots & f^1_n(x) \\
f^2_1(x) & f^2_2(x) & \cdots & f^2_n(x) \\
\vdots & \vdots & \ddots & \vdots \\
f^m_1(x) & f^m_2(x) & \cdots & f^m_n(x)
\end{pmatrix}
\]
with $f^i_j(x) = \dfrac{\partial f_i}{\partial x_j}(x)$ for $1 \le i \le m$ and $1 \le j \le n$.
Definition 5.23 (Uniformly Differentiable) Let $f : A \to \mathcal{R}^m$ be differentiable on $A$. Then we say that $f$ is uniformly differentiable on $A$ if for all $\epsilon > 0$ in $\mathcal{R}$, there exists $\delta > 0$ in $\mathcal{R}$ such that whenever $x, y \in A$ and $|y - x|_0 < \delta$ we have that $|f(y) - f(x) - Df(x)(y - x)|_0 \le \epsilon\, |y - x|_0$.

Definition 5.24 (Weakly Locally Uniformly Differentiable) Let $A \subset \mathcal{R}^n$ be open, let $f : A \to \mathcal{R}^m$, and let $x_0 \in A$ be given. Then we say that $f$ is
weakly locally uniformly differentiable (WLUD) at $x_0$ if $f$ is differentiable in a neighborhood $\Omega$ of $x_0$ in $A$ and if for every $\epsilon > 0$ in $\mathcal{R}$ there exists $\delta > 0$ in $\mathcal{R}$ such that for all $x, y \in B_\delta(x_0) \cap \Omega$, we have that $|f(y) - f(x) - Df(x)(y - x)|_0 \le \epsilon\, |y - x|_0$. Moreover, we say that $f$ is WLUD on $A$ if $f$ is WLUD at every point in $A$.

It is clear from the two definitions above that if $f$ is uniformly differentiable on $A$ then $f$ is WLUD at every point in $A$ and hence $f$ is WLUD on $A$. Moreover, we show that if $f : A \to \mathcal{R}^m$ is WLUD at $x_0 \in A$ (resp. on $A$) then $f$ is $C^1$ at $x_0$ (resp. on $A$). Thus, the class of WLUD functions at $x_0$ (resp. on $A$) is a subset of the class of $C^1$ functions at $x_0$ (resp. on $A$). However, this is still large enough to include all polynomial functions. We also show that any linear combination of WLUD functions at $x_0$ (resp. on $A$) is again WLUD at $x_0$ (resp. on $A$). Moreover, we show that if $f : A \to \mathcal{R}^m$ is WLUD at $x_0 \in A$ (resp. on $A$) and if $g : C \to \mathcal{R}^p$ is WLUD at $f(x_0)$ (resp. on $C$), with $f(A) \subseteq C$, then $g \circ f$ is WLUD at $x_0$ (resp. on $A$).

Theorem 5.25 (Inverse Function Theorem) Let $f : A \to \mathcal{R}^n$ be WLUD on $A$ and let $t_0 \in A$ be such that $Jf(t_0) \neq 0$. Then there is a neighborhood $\Omega$ of $t_0$ such that:
(1) $f|_\Omega$ is one-to-one;
(2) $f(\Omega)$ is open;
(3) the inverse $g$ of $f|_\Omega$ is WLUD on $f(\Omega)$; and $Dg(x) = [Df(t)]^{-1}$ for $t \in \Omega$ and $x = f(t)$.

As in the real case, the inverse function theorem is used to prove the implicit function theorem. Let $A \subseteq \mathcal{R}^n$ be open and let $\Phi : A \to \mathcal{R}^m$ be WLUD on $A$. For $t = (t_1, \ldots, t_{n-m}, t_{n-m+1}, \ldots, t_n) \in A$, let $\hat{t} = (t_1, \ldots, t_{n-m})$ and
\[ \tilde{J}\Phi(t) = \det \frac{\partial(\Phi_1, \ldots, \Phi_m)}{\partial(t_{n-m+1}, \ldots, t_n)}. \]
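Theorem 5.26 Let $\Phi : A \to \mathcal{R}^m$ be WLUD on $A$, where $A \subseteq \mathcal{R}^n$ is open and $1 \le m < n$. Let $t_0 \in A$ be such that $\Phi(t_0) = 0$ and $\tilde{J}\Phi(t_0) \neq 0$. Then there exist a neighborhood $U$ of $t_0$, a neighborhood $R$ of $\hat{t}_0$ and $\phi : R \to \mathcal{R}^m$ that is WLUD on $R$ such that $\tilde{J}\Phi(t) \neq 0$ for all $t \in U$, and
\[ \{t \in U : \Phi(t) = 0\} = \{(\hat{t}, \phi(\hat{t})) : \hat{t} \in R\}. \]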
Remark 5.27 All the results in [9, 42, 49, 51] have been proved in a more general context than the Levi-Civita field. More specifically, in those papers, $\mathcal{R}$ is replaced by a field $\mathcal{N}$, where $\mathcal{N}$ denotes any non-Archimedean ordered field extension of the real numbers that is real closed and Cauchy complete in the topology induced by the order.
6 Review of Power Series and Analytic Functions

Power series on the Levi-Civita field $\mathcal{R}$ have been studied in detail in [36, 38, 44, 47, 48]; work prior to that had been mostly restricted to power series with real coefficients. In [23–25, 27], such series could be studied for infinitely small arguments only, while in [7], using the newly introduced weak topology (see Definition 6.4 below), finite arguments were also possible. Moreover, power series over complete valued fields in general have been studied by Schikhof [33], Alling [4] and others in valuation theory, but always in the valuation topology. In [44], we study the general case when the coefficients in the power series are Levi-Civita numbers (i.e. elements of $\mathcal{R}$ or $\mathcal{C}$). We study the convergence of sequences and series in both the valuation (order) topology and the weak topology; and we derive convergence criteria for power series in both topologies. In [47] it is shown that, within their domain of convergence, power series are infinitely often differentiable and the derivatives to any order are obtained by differentiating the power series term by term. Also, power series can be reexpanded around any point in their domain of convergence. We then study a class of functions that are given locally by power series (which we call analytic functions) and show that they are closed under arithmetic operations and compositions and they are infinitely often differentiable with the derivative functions of all orders being analytic themselves. In [48], we focus on the proof of the intermediate value theorem for analytic functions. Given a function $f$ that is analytic on an interval $[a, b]$ and a value $S$ between $f(a)$ and $f(b)$, we use iteration to construct a sequence of numbers in $[a, b]$ that converges in the valuation topology to a point $c \in [a, b]$ such that $f(c) = S$. The proof is quite involved, making use of many of the results proved in [44, 47] as well as some results from Real Analysis.
Finally, in [38], we state and prove necessary and sufficient conditions for the existence of relative extrema. Then we use that as well as the intermediate value theorem and its proof to prove the extreme value theorem, the mean value theorem, and the inverse function theorem for functions that are analytic on an interval [a, b], thus showing that such functions behave as nicely as real analytic functions. In the following, we summarize some of the key results in [38, 44, 47, 48]. We start with a brief review of the convergence of sequences in two different topologies.
6.1 Convergence of Sequences in Two Topologies

Definition 6.1 A sequence $(s_n)$ in $\mathcal{R}$ or $\mathcal{C}$ is called regular if the union of the supports of all members of the sequence is a left-finite subset of $\mathbb{Q}$.

Definition 6.2 We say that a sequence $(s_n)$ converges strongly in $\mathcal{R}$ or $\mathcal{C}$ if it converges in the valuation topology.

The fields $\mathcal{R}$ and $\mathcal{C}$ are complete with respect to the valuation topology; and a detailed study of strong convergence can be found in [36, 44]. Since power series with real (complex) coefficients do not converge strongly for any nonzero real (complex) argument, it is advantageous to study a new kind of convergence. We do that by defining a family of semi-norms on $\mathcal{R}$ or $\mathcal{C}$, which induces a topology that is weaker than the valuation topology and is called the weak topology [7, 36, 37, 44].

Definition 6.3 Given $r \in \mathbb{R}$, we define a mapping $\|\cdot\|_r : \mathcal{R}$ or $\mathcal{C} \to \mathbb{R}$ as follows:
\[ \|x\|_r = \max\{|x[q]|_0 : q \in \mathbb{Q} \text{ and } q \le r\}. \]
The maximum in Definition 6.3 exists in $\mathbb{R}$ since, for any $r \in \mathbb{R}$, only finitely many of the $x[q]$'s considered do not vanish.

Definition 6.4 A sequence $(s_n)$ in $\mathcal{R}$ (resp. $\mathcal{C}$) is said to be weakly convergent if there exists $s \in \mathcal{R}$ (resp. $\mathcal{C}$), called the weak limit of the sequence $(s_n)$, such that for all $\epsilon > 0$ in $\mathbb{R}$, there exists $N \in \mathbb{N}$ such that $\|s_m - s\|_{1/\epsilon} < \epsilon$ for all $m \ge N$.

It is shown [7] that $\mathcal{R}$ and $\mathcal{C}$ are not Cauchy complete with respect to the weak topology and that strong convergence implies weak convergence to the same limit. A detailed study of weak convergence is found in [7, 36, 37, 44].
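The difference between the two notions of convergence can be seen on the real sequence $s_n = 1/n$, viewed inside the Levi-Civita field: its ultrametric absolute value is $e^{0} = 1$ for every $n$, so it does not converge strongly to $0$, while its semi-norms shrink like $1/n$. The Python sketch below illustrates Definitions 6.3 and 6.4 on this sequence; the dictionary representation and the function names are assumptions made for the example.

```python
from fractions import Fraction
import math

def lam(x):
    """lambda(x) = min(supp(x)) for x stored as {exponent: coefficient}."""
    return min(x) if x else math.inf

def seminorm(x, r):
    """||x||_r = max{|x[q]|_0 : q <= r}; 0 if no support point is <= r."""
    return max((abs(c) for q, c in x.items() if q <= r), default=0)

def s(n):
    """The real number 1/n, embedded at exponent 0."""
    return {Fraction(0): Fraction(1, n)}

eps = Fraction(1, 100)
N = 101  # for this sequence, any index N > 1/eps works

# Weak convergence to 0: ||s_m - 0||_{1/eps} < eps for the sampled m >= N.
weak_ok = all(seminorm(s(m), 1 / eps) < eps for m in range(N, N + 50))

# Ultrametric absolute value |s_m| = exp(-lambda(s_m)) stays equal to 1.
ultrametric = [math.exp(-lam(s(m))) for m in (1, 10, 1000)]

print(weak_ok, ultrametric)
```

Only finitely many indices are sampled here, so the check illustrates the definition rather than proving convergence.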
6.2 Power Series

In the following, we review strong and weak convergence criteria for power series, Theorems 6.5 and 6.6, the proofs of which are given in [44]. We also note that Theorem 6.5 is a special case of the result on page 59 of [33].

Theorem 6.5 (Strong Convergence Criterion for Power Series) Let $(a_n)$ be a sequence in $\mathcal{R}$ (resp. $\mathcal{C}$), and let
\[ \lambda_0 = \limsup_{n \to \infty} \left( \frac{-\lambda(a_n)}{n} \right) \quad \text{in } \mathbb{R} \cup \{-\infty, \infty\}. \]
Let $x_0 \in \mathcal{R}$ (resp. $\mathcal{C}$) be fixed and let $x \in \mathcal{R}$ (resp. $\mathcal{C}$) be given. Then the power series $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ converges strongly if $\lambda(x - x_0) > \lambda_0$ and diverges in the valuation topology if $\lambda(x - x_0) < \lambda_0$ or if $\lambda(x - x_0) = \lambda_0$ and $-\lambda(a_n)/n > \lambda_0$ for infinitely many $n$.
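A small worked instance may help in reading Theorem 6.5. The coefficients $a_n = d^{-n}$ and the center $x_0 = 0$ below are chosen purely for illustration and do not come from [44].

```latex
% Illustrative coefficients (an assumption of this example): a_n = d^{-n}, x_0 = 0.
\begin{align*}
\lambda(a_n) &= -n, \qquad
\lambda_0 = \limsup_{n\to\infty} \frac{-\lambda(a_n)}{n} = 1, \\
x = d^2: &\quad \lambda(x) = 2 > \lambda_0, \quad
\sum_{n=0}^{\infty} d^{-n} d^{2n} = \sum_{n=0}^{\infty} d^{n} = \frac{1}{1-d}
\quad \text{(strongly convergent)}, \\
x = d^{1/2}: &\quad \lambda(x) = \tfrac12 < \lambda_0, \quad
\lambda\bigl(a_n x^n\bigr) = -\tfrac{n}{2} \to -\infty
\quad \text{(divergent)}.
\end{align*}
```

In the convergent case the partial sums differ from $1/(1-d)$ by $d^{N+1}/(1-d)$, whose order of magnitude $N+1$ tends to infinity, which is exactly convergence in the valuation topology.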
Theorem 6.6 (Weak Convergence Criterion for Power Series) Let $(a_n)$ be a sequence in $\mathcal{R}$ (resp. $\mathcal{C}$), and let $\lambda_0 = \limsup_{n \to \infty} (-\lambda(a_n)/n) \in \mathbb{Q}$. Let $x_0 \in \mathcal{R}$ (resp. $\mathcal{C}$) be fixed, and let $x \in \mathcal{R}$ (resp. $\mathcal{C}$) be such that $\lambda(x - x_0) = \lambda_0$. For each $n \ge 0$, let $b_n = a_n d^{n \lambda_0}$. Suppose that the sequence $(b_n)$ is regular and write $\bigcup_{n=0}^{\infty} \operatorname{supp}(b_n) = \{q_1, q_2, \ldots\}$, with $q_{j_1} < q_{j_2}$ if $j_1 < j_2$. For each $n$, write $b_n = \sum_{j=1}^{\infty} b_{nj}\, d^{q_j}$, where $b_{nj} = b_n[q_j]$. Let
\[ \eta = \frac{1}{\sup\left\{ \limsup_{n \to \infty} |b_{nj}|_0^{1/n} : j \ge 1 \right\}} \quad \text{in } \mathbb{R} \cup \{\infty\}, \tag{6.1} \]
with the conventions $1/0 = \infty$ and $1/\infty = 0$. Then $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ converges absolutely in the weak topology if $|(x - x_0)[\lambda_0]|_0 < \eta$ and diverges in the weak topology if $|(x - x_0)[\lambda_0]|_0 > \eta$.

Remark 6.7 The number $\eta$ in Eq. (6.1) is referred to as the radius of weak convergence of the power series $\sum_{n=0}^{\infty} a_n (x - x_0)^n$.

As an immediate consequence of Theorem 6.6, we obtain the following result, which allows us to extend real and complex functions representable by power series to the Levi-Civita fields $\mathcal{R}$ and $\mathcal{C}$. This result is of particular interest for the application [43] mentioned in Sect. 4 above and discussed in Sect. 9 below.

Corollary 6.8 (Power Series with Purely Real or Complex Coefficients) Let $\sum_{n=0}^{\infty} a_n X^n$ be a power series with purely real (resp. complex) coefficients and with classical radius of convergence equal to $\eta$. Let $x \in \mathcal{R}$ (resp. $\mathcal{C}$), and let $A_n(x) = \sum_{j=0}^{n} a_j x^j \in \mathcal{R}$ (resp. $\mathcal{C}$). Then, for $|x|_0 < \eta$ and $|x|_0 \not\approx \eta$, the sequence $(A_n(x))$ converges absolutely weakly. We define the limit to be the continuation of the power series to $\mathcal{R}$ (resp. $\mathcal{C}$).

Definition 6.9 (The Functions Exp, Cos, Sin, Cosh, and Sinh) By Corollary 6.8, the series
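\[ \sum_{n=0}^{\infty} \frac{x^n}{n!}, \quad \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}, \quad \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}, \quad \sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!}, \quad \text{and} \quad \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!} \]
converge absolutely weakly in $\mathcal{R}$ (resp. $\mathcal{C}$) for any $x \in \mathcal{R}$ (resp. $\mathcal{C}$), at most finite in (ordinary) absolute value (that is, for $\lambda(x) \ge 0$). For any such $x$, define
\begin{align*}
\exp(x) &= \sum_{n=0}^{\infty} \frac{x^n}{n!}; \\
\cos(x) &= \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}; \\
\sin(x) &= \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}; \\
\cosh(x) &= \sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!}; \\
\sinh(x) &= \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!}.
\end{align*}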
A detailed study of the transcendental functions introduced on R and C in Definition 6.9 can be found in [36]. In particular, we show that addition theorems similar to the real ones hold, which is essential for the implementation of these functions on a computer (see Section 1.5 in [36]).
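One way an addition theorem can be used in practice is to split a finite argument as $x = x[0] + h$ with $h$ infinitely small, evaluate the real exponential at $x[0]$, and sum the series in $h$ only up to a truncation depth, where it terminates. The Python sketch below follows this idea under stated assumptions (dictionary representation, floating-point coefficients, truncation depth `CUTOFF`, hypothetical name `lc_exp`); it is not the implementation of [36].

```python
import math
from fractions import Fraction

CUTOFF = Fraction(3)  # assumed truncation depth: terms d^q with q > 3 are dropped

def add(x, y):
    out = dict(x)
    for q, c in y.items():
        out[q] = out.get(q, 0.0) + c
    return {q: c for q, c in out.items() if c != 0}

def mul(x, y):
    out = {}
    for q1, c1 in x.items():
        for q2, c2 in y.items():
            if q1 + q2 <= CUTOFF:
                out[q1 + q2] = out.get(q1 + q2, 0.0) + c1 * c2
    return {q: c for q, c in out.items() if c != 0}

def lc_exp(x):
    """exp(x) for lambda(x) >= 0, using exp(x[0] + h) = exp(x[0]) * exp(h)."""
    if x and min(x) < 0:
        raise ValueError("argument must be at most finite in absolute value")
    x0 = x.get(Fraction(0), 0.0)
    h = {q: c for q, c in x.items() if q > 0}  # infinitely small part
    total = {Fraction(0): 1.0}
    term = {Fraction(0): 1.0}
    k = 0
    while term:  # h^k / k! is empty once its leading exponent exceeds CUTOFF
        k += 1
        term = {q: c / k for q, c in mul(term, h).items()}
        total = add(total, term)
    return {q: math.exp(x0) * c for q, c in total.items()}

e_d = lc_exp({Fraction(1): 1.0})                     # exp(d)
e_1d = lc_exp({Fraction(0): 1.0, Fraction(1): 1.0})  # exp(1 + d)
print(e_d)
print(e_1d)
```

Up to the truncation depth, the coefficients of `e_d` are the Taylor coefficients $1/k!$, and those of `e_1d` are the same coefficients scaled by the real number $e$.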
6.3 Analytic Functions

In this subsection, we review the algebraic and analytical properties of a class of functions that are given locally by power series and we refer the reader to [38, 47, 48] for a more detailed study.

Definition 6.10 Let $a < b$ in $\mathcal{R}$ be given and let $f : [a, b] \to \mathcal{R}$. Then we say that $f$ is analytic on $[a, b]$ if for all $x \in [a, b]$ there exists a positive $\delta \sim b - a$ in $\mathcal{R}$, and there exists a regular sequence $(a_n(x))$ in $\mathcal{R}$ such that, under weak convergence, $f(y) = \sum_{n=0}^{\infty} a_n(x)\,(y - x)^n$ for all $y \in ]x - \delta, x + \delta[ \cap [a, b]$.

It is shown in [47] that if $f$ is analytic on $[a, b]$ then $f$ is bounded on $[a, b]$; also, if $g$ is analytic on $[a, b]$ and $\alpha \in \mathcal{R}$ then $f + \alpha g$ and $f \cdot g$ are analytic on $[a, b]$. Moreover, the composition of analytic functions is analytic. Furthermore, using the fact that power series on $\mathcal{R}$ are infinitely often differentiable within their domain of convergence and the derivatives to any order are obtained by differentiating the power series term by term [47], we obtain the following result.

Theorem 6.11 Let $a < b$ in $\mathcal{R}$ be given, and let $f : [a, b] \to \mathcal{R}$ be analytic on $[a, b]$. Then $f$ is infinitely often differentiable on $[a, b]$, and for any positive integer $m$, we have that $f^{(m)}$ is analytic on $[a, b]$. Moreover, if $f$ is given locally around $x_0 \in [a, b]$ by $f(x) = \sum_{n=0}^{\infty} a_n(x_0)\,(x - x_0)^n$, then $f^{(m)}$ is given by
\[ f^{(m)}(x) = \sum_{n=m}^{\infty} n(n-1) \cdots (n-m+1)\, a_n(x_0)\,(x - x_0)^{n-m}. \]
In particular, we have that $a_m(x_0) = f^{(m)}(x_0)/m!$ for all $m = 0, 1, 2, \ldots$.
In [48], we prove the intermediate value theorem for analytic functions on an interval $[a, b]$.

Theorem 6.12 (Intermediate Value Theorem) Let $a < b$ in $\mathcal{R}$ be given and let $f : [a, b] \to \mathcal{R}$ be analytic on $[a, b]$. Then $f$ assumes on $[a, b]$ every intermediate value between $f(a)$ and $f(b)$.

Since Theorem 6.12 is a central result in the study of power series and analytic functions, we present in the following the key steps of the proof and refer the reader to [48] for the detailed (lengthy) proof.

• Without loss of generality, we may assume that $f$ is not constant on $[a, b]$. Let $F : [0, 1] \to \mathcal{R}$ be given by
\[ F(x) = f((b - a)x + a) - \frac{f(a) + f(b)}{2}. \]
Then $F$ is analytic on $[0, 1]$; and $f$ assumes on $[a, b]$ every intermediate value between $f(a)$ and $f(b)$ if and only if $F$ assumes on $[0, 1]$ every intermediate value between $F(0) = (f(a) - f(b))/2$ and $F(1) = (f(b) - f(a))/2 = -F(0)$. So without loss of generality, we may assume that $a = 0$, $b = 1$, and $f = F$. Also, since scaling the function by a constant factor does not affect the existence of intermediate values, we may assume that the index of $f$, $i(f) := \min\{\operatorname{supp}(f(x)) : x \in [0, 1]\}$, is equal to $0$.
• We define $f_R : [0, 1] \cap \mathbb{R} \to \mathbb{R}$ by $f_R(X) = f(X)[0]$. Then $f_R$ is a real-valued analytic function on the real interval $[0, 1] \cap \mathbb{R}$. Let $S$ be between $f(a) = f(0)$ and $f(b) = f(1)$; and let $S_R = S[0]$. Then $S_R$ is a real value between $f_R(0)$ and $f_R(1)$. We use the classical intermediate value theorem to find a real point $X_0 \in [0, 1]$ such that $f_R(X_0) = S_R$.
• We use iteration to construct a convergent sequence $(x_n)$ such that $\lambda(x_n) > 0$ and $\lambda(x_{n+2} - x_{n+1}) > \lambda(x_{n+1} - x_n)$ for all $n \in \mathbb{N}$. Let $x = \lim_{n \to \infty} x_n$; then $\lambda(x) > 0$, and we show that $X_0 + x \in [0, 1]$ and $f(X_0 + x) = S$.

A close look at that proof shows that if $f$ is not constant on $[a, b]$ and $S$ is between $f(a)$ and $f(b)$ then there are only finitely many points $c$ in $[a, b]$ such that $f(c) = S$. This is crucial for the proof of the extreme value theorem for the analytic functions in [38].

In [38], we complete the study of analytic functions: we state and prove necessary and sufficient conditions for the existence of relative extrema; then we prove the extreme value theorem, the mean value theorem and the inverse function theorem for these functions, thus showing that analytic functions have all the nice properties of real analytic functions.

Theorem 6.13 Let $a < b$ in $\mathcal{R}$ be given, let $f : [a, b] \to \mathcal{R}$ be analytic on $[a, b]$, let $x_0 \in ]a, b[$, and let $m \in \mathbb{N}$ be the order of the first nonvanishing derivative of $f$ at $x_0$. Then $f$ has a relative extremum at $x_0$ if and only if $m$ is even. In that case ($m$ is even), the extremum is a minimum if $f^{(m)}(x_0) > 0$ and a maximum if $f^{(m)}(x_0) < 0$.
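The last two steps of the proof sketch of Theorem 6.12 (a real starting point $X_0$, then infinitely small corrections of increasing order) can be imitated on a toy problem. The Python sketch below solves $f(c) = S$ for $f(x) = x^2$ and $S = 1/4 + d$ with a simplified Newton-type iteration that reuses the real slope $f_R'(X_0)$. This particular iteration, the truncation depth `CUTOFF` and the requirement $f_R'(X_0) \neq 0$ are assumptions of the example; the iteration actually used in [48] is not reproduced here.

```python
from fractions import Fraction

CUTOFF = Fraction(4)  # assumed truncation: terms d^q with q > 4 are dropped

def add(x, y):
    out = dict(x)
    for q, c in y.items():
        out[q] = out.get(q, 0) + c
    return {q: c for q, c in out.items() if c != 0}

def neg(x):
    return {q: -c for q, c in x.items()}

def mul(x, y):
    out = {}
    for q1, c1 in x.items():
        for q2, c2 in y.items():
            if q1 + q2 <= CUTOFF:
                out[q1 + q2] = out.get(q1 + q2, 0) + c1 * c2
    return {q: c for q, c in out.items() if c != 0}

def f(x):
    return mul(x, x)  # toy analytic function f(x) = x^2

S = {Fraction(0): Fraction(1, 4), Fraction(1): Fraction(1)}  # S = 1/4 + d
X0 = {Fraction(0): Fraction(1, 2)}  # real point with f_R(X0) = S[0] = 1/4
slope = Fraction(1)                 # f_R'(X0) = 2 * (1/2)

x = {}             # infinitely small correction, starts at 0
step_orders = []   # lambda of each correction step
for _ in range(4):
    residual = add(f(add(X0, x)), neg(S))
    if not residual:
        break
    step = {q: -c / slope for q, c in residual.items()}
    step_orders.append(min(step))
    x = add(x, step)

print(step_orders)
print(x)
```

In this run each correction has a strictly larger order of magnitude than the previous one, mirroring the condition $\lambda(x_{n+2} - x_{n+1}) > \lambda(x_{n+1} - x_n)$, and $f(X_0 + x) = S$ holds up to the truncation depth.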
Theorem 6.14 (Extreme Value Theorem) Let $a < b$ in $\mathcal{R}$ be given and let $f : [a, b] \to \mathcal{R}$ be analytic on $[a, b]$. Then $f$ assumes a maximum and a minimum on $[a, b]$.

Using the intermediate value theorem and the extreme value theorem, the following results become easy to prove.

Corollary 6.15 Let $a < b$ in $\mathcal{R}$ be given and let $f : [a, b] \to \mathcal{R}$ be analytic on $[a, b]$. Then there exist $m, M \in \mathcal{R}$ such that $f([a, b]) = [m, M]$.

Corollary 6.16 (Mean Value Theorem) Let $a < b$ in $\mathcal{R}$ be given and let $f : [a, b] \to \mathcal{R}$ be analytic on $[a, b]$. Then there exists $c \in ]a, b[$ such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]

Corollary 6.17 Let $a < b$ in $\mathcal{R}$ be given, and let $f : [a, b] \to \mathcal{R}$ be analytic on $[a, b]$. Then the following are true.
(i) If $f'(x) \neq 0$ for all $x \in ]a, b[$ then either $f'(x) > 0$ for all $x \in ]a, b[$ and $f$ is strictly increasing on $[a, b]$, or $f'(x) < 0$ for all $x \in ]a, b[$ and $f$ is strictly decreasing on $[a, b]$.
(ii) If $f'(x) = 0$ for all $x \in ]a, b[$, then $f$ is constant on $[a, b]$.

Corollary 6.18 (Inverse Function Theorem) Let $a < b$ in $\mathcal{R}$ be given, let $f : [a, b] \to \mathcal{R}$ be analytic on $[a, b]$, and let $x_0 \in ]a, b[$ be such that $f'(x_0) > 0$ (resp. $f'(x_0) < 0$). Then there exists $\delta > 0$ in $\mathcal{R}$ such that
(i) $f' > 0$ and $f$ is strictly increasing (resp. $f' < 0$ and $f$ is strictly decreasing) on $[x_0 - \delta, x_0 + \delta]$.
(ii) $f([x_0 - \delta, x_0 + \delta]) = [m, M]$ where $m = f(x_0 - \delta)$ and $M = f(x_0 + \delta)$ (resp. $m = f(x_0 + \delta)$ and $M = f(x_0 - \delta)$).
(iii) There exists $g : [m, M] \to [x_0 - \delta, x_0 + \delta]$, strictly increasing (resp. strictly decreasing) on $[m, M]$, such that
– $g$ is the inverse of $f$ on $[x_0 - \delta, x_0 + \delta]$;
– $g$ is differentiable on $[m, M]$; and for all $y \in [m, M]$,
\[ g'(y) = \frac{1}{f'(g(y))}. \]
Remark 6.19 Since power series over R are analytic on any interval within their domain of convergence, all the results of Sect. 6.3 hold as well for power series on any interval in which the series converges weakly.
7 Measure Theory and Integration

Using the nice smoothness properties of power series and analytic functions, summarized above, we develop a Lebesgue-like measure and integration theory on $\mathcal{R}$ in [40, 46] that uses the analytic functions studied in Sect. 6.3 as the building blocks for measurable functions instead of the step functions used in the real case. This was possible in particular because the family $S(a, b)$ of analytic functions on a given interval $I(a, b) \subset \mathcal{R}$ (where $I(a, b)$ denotes any one of the intervals $[a, b]$, $]a, b]$, $[a, b[$ or $]a, b[$) satisfies the following crucial properties.
(1) $S(a, b)$ is an algebra that contains the identity function;
(2) for all $f \in S(a, b)$, $f$ is Lipschitz on $I(a, b)$ and there exists an anti-derivative $F$ of $f$ in $S(a, b)$, which is unique up to a constant;
(3) for all differentiable $f \in S(a, b)$, if $f' = 0$ on $]a, b[$ then $f$ is constant on $I(a, b)$; moreover, if $f' \ge 0$ on $]a, b[$ then $f$ is nondecreasing on $I(a, b)$.

Notation 7.1 Let $a < b$ in $\mathcal{R}$ be given. Then by $l(I(a, b))$ we will denote the length of the interval $I(a, b)$, that is, $l(I(a, b)) = b - a$.
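7.1 Measurable Sets

Definition 7.2 Let $A \subset \mathcal{R}$ be given. Then we say that $A$ is measurable if for every $\epsilon > 0$ in $\mathcal{R}$, there exist a sequence of mutually disjoint intervals $(I_n)$ and a sequence of mutually disjoint intervals $(J_n)$ such that $\bigcup_{n=1}^{\infty} I_n \subset A \subset \bigcup_{n=1}^{\infty} J_n$, $\sum_{n=1}^{\infty} l(I_n)$ and $\sum_{n=1}^{\infty} l(J_n)$ converge in $\mathcal{R}$, and
\[ \sum_{n=1}^{\infty} l(J_n) - \sum_{n=1}^{\infty} l(I_n) \le \epsilon. \]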
Given a measurable set $A$, then for every $k \in \mathbb{N}$, we can select a sequence of mutually disjoint intervals $(I_n^k)$ and a sequence of mutually disjoint intervals $(J_n^k)$ such that $\sum_{n=1}^{\infty} l(I_n^k)$ and $\sum_{n=1}^{\infty} l(J_n^k)$ converge in $\mathcal{R}$ for all $k$,
\[ \bigcup_{n=1}^{\infty} I_n^k \subset \bigcup_{n=1}^{\infty} I_n^{k+1} \subset A \subset \bigcup_{n=1}^{\infty} J_n^{k+1} \subset \bigcup_{n=1}^{\infty} J_n^k \quad \text{and} \quad \sum_{n=1}^{\infty} l(J_n^k) - \sum_{n=1}^{\infty} l(I_n^k) \le d^k \]
for all $k \in \mathbb{N}$. Since $\mathcal{R}$ is Cauchy-complete in the order topology, it follows that $\lim_{k \to \infty} \sum_{n=1}^{\infty} l(I_n^k)$ and $\lim_{k \to \infty} \sum_{n=1}^{\infty} l(J_n^k)$ both exist and they are equal. We call the common value of the limits the measure of $A$ and we denote it by $m(A)$. Thus,
\[ m(A) = \lim_{k \to \infty} \sum_{n=1}^{\infty} l(I_n^k) = \lim_{k \to \infty} \sum_{n=1}^{\infty} l(J_n^k). \]
Contrary to the real case,
\[ \sup\left\{ \sum_{n=1}^{\infty} l(I_n) : I_n\text{'s are mutually disjoint intervals and } \bigcup_{n=1}^{\infty} I_n \subset A \right\} \]
and
\[ \inf\left\{ \sum_{n=1}^{\infty} l(J_n) : J_n\text{'s are mutually disjoint intervals and } A \subset \bigcup_{n=1}^{\infty} J_n \right\} \]
need not exist for a given set $A \subset \mathcal{R}$. However, as shown in [46], if $A$ is measurable then both the supremum and infimum exist and they are equal to $m(A)$. This shows that the definition of measurable sets in Definition 7.2 is a natural generalization of that of the Lebesgue measurable sets of real analysis that corrects for the lack of suprema and infima in non-Archimedean ordered fields.

It follows directly from the definition that $m(A) \ge 0$ for any measurable set $A \subset \mathcal{R}$ and that any interval $I(a, b)$ is measurable with measure $m(I(a, b)) = l(I(a, b)) = b - a$. It also follows that if $A$ is a countable union of mutually disjoint intervals $(I_n(a_n, b_n))$ such that $\sum_{n=1}^{\infty} (b_n - a_n)$ converges then $A$ is measurable with $m(A) = \sum_{n=1}^{\infty} (b_n - a_n)$. Moreover, if $B \subset A \subset \mathcal{R}$ and if $A$ and $B$ are measurable, then $m(B) \le m(A)$.

In [46] we show that the measure defined on $\mathcal{R}$ above has similar properties to those of the Lebesgue measure on $\mathbb{R}$. For example, we show that any subset of a measurable set of measure $0$ is itself measurable and has measure $0$. We also show that any countable union of measurable sets whose measures form a null sequence is measurable and the measure of the union is less than or equal to the sum of the measures of the original sets; moreover, the measure of the union is equal to the sum of the measures of the original sets if the latter are mutually disjoint. Furthermore, we show that any finite intersection of measurable sets is also measurable and that the sum of the measures of two measurable sets is equal to the sum of the measures of their union and intersection.

It is worth noting that the complement of a measurable set in a measurable set need not be measurable. For example, $[0, 1]$ and $[0, 1] \cap \mathbb{Q}$ are both measurable with measures $1$ and $0$, respectively. However, the complement of $[0, 1] \cap \mathbb{Q}$ in $[0, 1]$ is not measurable. On the other hand, if $B \subset A \subset \mathcal{R}$ and if $A$, $B$ and $A \setminus B$ are all measurable, then $m(A) = m(B) + m(A \setminus B)$.

The example of $[0, 1] \setminus ([0, 1] \cap \mathbb{Q})$ above shows that the axiom of choice is not needed here to construct a nonmeasurable set, as there are many simple examples
K. Shamseddine and A. Barría Comicheo
of nonmeasurable sets. Indeed, any uncountable real subset of R, like [0, 1] ∩ ℝ for example, is not measurable.
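As a small worked illustration of the countable-union property of the measure stated earlier in this subsection, consider the following choice of intervals. The intervals are our own example and are not taken from [46]; d denotes the positive infinitely small number of Lemma 3.16. The intervals are mutually disjoint because $d^n < 1$, and their lengths form a geometric series whose partial sums can be written in closed form.

```latex
I_n = [\,n,\; n + d^n\,], \qquad l(I_n) = d^n \quad (n = 1, 2, \dots),
\qquad
\sum_{n=1}^{N} d^n = \frac{d - d^{N+1}}{1 - d}
\;\longrightarrow\; \frac{d}{1 - d} \quad (N \to \infty),
\qquad\text{so}\qquad
m\Big(\bigcup_{n=1}^{\infty} I_n\Big) = \sum_{n=1}^{\infty} d^n = \frac{d}{1 - d}.
```

Under these assumptions the union is an unbounded set whose measure is nevertheless infinitely small.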
7.2 Measurable Functions and Integration on R
We define in [46] a measurable function on a measurable set A ⊂ R using Definition 7.2 and analytic functions.
Definition 7.3 Let A ⊂ R be a measurable subset of R and let f : A → R be bounded on A. Then we say that f is measurable on A if for all ε > 0 in R, there exists a sequence of mutually disjoint intervals $(I_n)$ such that $I_n \subset A$ for all n, $\sum_{n=1}^{\infty} l(I_n)$ converges in R, $m(A) - \sum_{n=1}^{\infty} l(I_n) \le \epsilon$ and f is analytic on $I_n$ for all n.
In [46], we derive a simple characterization of measurable functions and we show that they form an algebra. Then we show that a measurable function is differentiable almost everywhere and that a function measurable on two measurable subsets of R is also measurable on their union and intersection. We define the integral of an analytic function over an interval I(a, b) and we use that to define the integral of a measurable function f over a measurable set A. Before we do that, we recall the following result whose proof can be found in [36].
Proposition 7.4 Let a < b in R and let f : I(a, b) → R be analytic on I(a, b). Then
• f is Lipschitz on I(a, b);
• $\lim_{x \to a^+} f(x)$ and $\lim_{x \to b^-} f(x)$ exist;
• the function g : [a, b] → R, given by
$$g(x) = \begin{cases} f(x) & \text{if } x \in I(a, b) \\ \lim_{\xi \to a^+} f(\xi) & \text{if } x = a \\ \lim_{\xi \to b^-} f(\xi) & \text{if } x = b, \end{cases}$$
extends f to an analytic function on [a, b] when I(a, b) ⊊ [a, b].
Definition 7.5 Let a < b in R, let f : I(a, b) → R be analytic on I(a, b), and let F be an analytic anti-derivative of f on I(a, b). Then the integral of f over I(a, b) is the R number
$$\int_{I(a,b)} f = \lim_{x \to b^-} F(x) - \lim_{x \to a^+} F(x).$$
The limits in Definition 7.5 account for the case when the interval I (a, b) does not include one or both of the end points; and these limits exist by Proposition 7.4 above.
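A short worked instance of Definition 7.5 may help. The choice of function and interval is ours, and it assumes (as is standard) that polynomials are analytic on intervals of R; d is again the positive infinitely small number of Lemma 3.16.

```latex
f(x) = x^2, \qquad F(x) = \tfrac{1}{3}x^3, \qquad
\int_{]0,d[} f \;=\; \lim_{x \to d^-} \tfrac{1}{3}x^3 \;-\; \lim_{x \to 0^+} \tfrac{1}{3}x^3
\;=\; \tfrac{1}{3}d^3 .
```

The same value is obtained on ]0, d], [0, d[ and [0, d], since only the one-sided limits of F at the end points enter the definition.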
On Non-Archimedean Valued Fields: A Survey of Algebraic, Topological and. . .
Now let A ⊂ R be measurable, let f : A → R be measurable and let M be a bound for $|f|_0$ on A. Then for every k ∈ N, there exists a sequence of mutually disjoint intervals $(I_n^k)_{n=1}^{\infty}$ such that $\bigcup_{n=1}^{\infty} I_n^k \subset A$, $\sum_{n=1}^{\infty} l(I_n^k)$ converges, $m(A) - \sum_{n=1}^{\infty} l(I_n^k) \le d^k$, and f is analytic on $I_n^k$ for all n ∈ N. Without loss of generality, we may assume that $I_n^k \subset I_n^{k+1}$ for all n ∈ N and for all k ∈ N. Since $\lim_{n \to \infty} l(I_n^k) = 0$, and since $\left|\int_{I_n^k} f\right|_0 \le M\, l(I_n^k)$ (proved in [46] for analytic functions), it follows that
$$\lim_{n \to \infty} \int_{I_n^k} f = 0 \quad \text{for all } k \in N.$$
Thus, $\sum_{n=1}^{\infty} \int_{I_n^k} f$ converges in R for all k ∈ N [44]. We show that the sequence $\left(\sum_{n=1}^{\infty} \int_{I_n^k} f\right)_{k=1}^{\infty}$ converges in R; and we define the unique limit as the integral of f over A.
Definition 7.6 Let A ⊂ R be measurable and let f : A → R be measurable. Then the integral of f over A, denoted by $\int_A f$, is given by
$$\int_A f = \lim_{\substack{\sum_{n=1}^{\infty} l(I_n) \to m(A) \\ \bigcup_{n=1}^{\infty} I_n \subset A \\ I_n\text{'s are mutually disjoint} \\ f \text{ is analytic on } I_n \ \forall n}} \ \sum_{n=1}^{\infty} \int_{I_n} f.$$
It turns out that the integral in Definition 7.6 satisfies similar properties to those of the Lebesgue integral on ℝ [46]. In particular, we prove the linearity property of the integral and that if $|f|_0 \le M$ on A then $\left|\int_A f\right|_0 \le M\, m(A)$, where m(A) is the measure of A. We also show that the sum of the integrals of a measurable function over two measurable sets is equal to the sum of its integrals over the union and the intersection of the two sets. In [40], which is a continuation of the work done in [46] and complements it, we show, among other results, that the uniform limit of a sequence of convergent power series on an interval I(a, b) is again a power series that converges on I(a, b). Then we use that to prove the uniform convergence theorem in R.
Theorem 7.7 Let A ⊂ R be measurable, let f : A → R, for each k ∈ N let $f_k$ : A → R be measurable on A, and let the sequence $(f_k)$ converge uniformly to f on A. Then f is measurable on A, $\lim_{k \to \infty} \int_A f_k$ exists, and
$$\lim_{k \to \infty} \int_A f_k = \int_A f.$$
7.3 Integration on R² and R³
In [50] we generalize the results of [40, 46] to two and three dimensions. In particular, we define a Lebesgue-like measure on R² (resp. R³). Then we define measurable functions on measurable sets using analytic functions in two (resp. three) variables and show how to integrate those measurable functions using iterated integration. The resulting double (resp. triple) integral satisfies similar properties to those of the single integral in [40, 46] as well as those properties satisfied by the double and triple integrals of real calculus. In order to have basic regions, like disks for example, measurable, it turns out that the so-called simple regions defined below, rather than rectangles, are the best choice for the building blocks for measurable sets. We recall the following definitions from [50] which will be needed later in this paper.
Definition 7.8 (Simple Region) Let G ⊂ R². Then we say that G is a simple region if there exist a ≤ b in R and analytic functions $h_1, h_2 : I(a, b) \to R$, with $h_1 \le h_2$ on I(a, b) such that $G = \{(x, y) \in R^2 : y \in I(h_1(x), h_2(x)),\ x \in I(a, b)\}$ or $G = \{(x, y) \in R^2 : x \in I(h_1(y), h_2(y)),\ y \in I(a, b)\}$.
Definition 7.9 ($\lambda_x$ and $\lambda_y$ of a simple region) Let A ⊂ R² be a simple region. If $A = \{(x, y) \in R^2 : y \in I(h_1(x), h_2(x)),\ x \in I(a, b)\}$ we define $\lambda_x(A) = \lambda(b - a)$ and $\lambda_y(A) = i(h_2 - h_1)$ on I(a, b), where $i(h_2 - h_1)$ is the index of the analytic function $h_2 - h_1$ on I(a, b): $i(h_2 - h_1) = \min\{\lambda(h_2(x) - h_1(x)) : x \in I(a, b)\}$. On the other hand, if $A = \{(x, y) \in R^2 : x \in I(h_1(y), h_2(y)),\ y \in I(a, b)\}$, we define $\lambda_y(A) = \lambda(b - a)$ and $\lambda_x(A) = i(h_2 - h_1)$ on I(a, b). If $\lambda_x(A) = \lambda_y(A) = 0$ then we say that A is finite.
Definition 7.10 (Analytic Functions on R²) Let A ⊂ R² be a simple region.
Then we say that f : A → R is an analytic function on A if, for every $(x_0, y_0) \in A$, there exist a simple region $A_0$ containing $(x_0, y_0)$ that satisfies $\lambda_x(A_0) = \lambda_x(A)$ and $\lambda_y(A_0) = \lambda_y(A)$, and a regular sequence $(a_{ij})_{i,j=0}^{\infty}$ such that for every s, t ∈ R, if $(x_0 + s, y_0 + t) \in A \cap A_0$ then
$$f(x_0 + s, y_0 + t) = \sum_{i,j=0}^{\infty} a_{ij} s^i t^j = f(x_0, y_0) + \sum_{\substack{i,j=0 \\ i+j \ne 0}}^{\infty} a_{ij} s^i t^j,$$
where the power series converges in the weak topology. Given a simple region S ⊂ R² and an analytic function f : S → R, we define the index of f on S by
$$i(f) = \min\{\lambda(f(x, y)) \mid (x, y) \in S\},$$
which is shown to exist [50]. We note that $\lambda(f(x, y)) = i(f)$ for almost every $(x, y) \in S \cap (d^{\lambda_x(S)}\mathbb{R} \times d^{\lambda_y(S)}\mathbb{R})$ and for any such point $(x, y) \in S \cap (d^{\lambda_x(S)}\mathbb{R} \times d^{\lambda_y(S)}\mathbb{R})$, we have that $\lambda(f(x', y')) = i(f)$ for all $(x', y') \in S$ satisfying $|x' - x|_0 \ll d^{\lambda_x(S)}$ and $|y' - y|_0 \ll d^{\lambda_y(S)}$. With the above definitions, we can proceed to define measurable sets, measurable functions, and integration just as we did in R, replacing intervals by simple regions. We can then extend the measure theory and integration to R³, R⁴, etc. in an inductive way and obtain similar properties for the resulting integrals as those for the single integral defined above. As an application of the integration theory, we develop in [17] a theory of integrable delta functions on the Levi-Civita field R as well as on R² and R³ with similar properties to the one-dimensional, two-dimensional and three-dimensional Dirac delta functions and which reduce to them when restricted to ℝ, ℝ² and ℝ³, respectively; and we show how those delta functions can be used to solve differential equations that arise in Physics and Engineering.
7.4 Integrable Delta Functions
In various branches of physics, one encounters sources which are nearly instantaneous (if time is the independent variable) or almost localized (if the independent variable is a space coordinate). To avoid the cumbersome studies of the detailed functional dependencies of such sources, one would like to replace them with idealized sources that are truly instantaneous or localized. Typical examples of such sources are the concentrated forces and moments in solid mechanics, the point masses in the theory of the gravitational potential, and the point charges in electrostatics. The field of real numbers ℝ does not permit a direct representation of the (improper) delta functions used for the description of impulsive (instantaneous) or concentrated (localized) sources. Of course, within the framework of distributions, these concepts can be accounted for in a rigorous fashion, but at the expense of the intuitive interpretation. The existence of infinitely small numbers and infinitely large numbers in the non-Archimedean Levi-Civita field R allows us to have well-behaved delta functions.
For example, the function δ : R → R, given by
$$\delta(x) = \begin{cases} \frac{3}{4} d^{-3}\left(d^2 - x^2\right) & \text{if } |x|_0 < d \\ 0 & \text{otherwise,} \end{cases}$$
where d is the positive infinitely small number introduced in Lemma 3.16, is a (one-dimensional) continuous (and piece-wise infinitely differentiable) delta function; it assumes an infinitely large value ($\frac{3}{4} d^{-1}$) at 0, it vanishes at all other real points and its integral on any interval containing ]−d, d[ is equal to one. In the following we summarize key properties of δ(x) which remind us of the corresponding properties of the Dirac delta function.
• If I ⊂ R is an interval that contains ]−d, d[ then $\int_{x \in I} \delta(x) = 1$. Moreover, if ]−d, d[ ∩ I = ∅ then $\int_{x \in I} \delta(x) = 0$.
• If α ≠ 0 in R and if I ⊂ R is an interval containing $\left]-\frac{d}{|\alpha|}, \frac{d}{|\alpha|}\right[$ then $\int_{x \in I} \delta(\alpha x) = \frac{1}{|\alpha|}$.
• Let I ⊂ R be an interval containing ]−d, d[. Then the function H : I → R, given by
$$H(x) = \begin{cases} 0 & \text{if } x \le -d \\ \frac{3}{4} d^{-3}\left(d^2 x - \frac{1}{3} x^3\right) + \frac{1}{2} & \text{if } -d < x < d \\ 1 & \text{if } x \ge d, \end{cases}$$
is a measurable anti-derivative of δ(x) on I that is equal to the Heaviside function on I ∩ ℝ.
• If a < b in R is such that λ(b − a) < 1 and if f : I(a, b) → R is analytic on I(a, b) with i(f) = 0 then for any $x_0 \in [a + d, b - d]$, we have that $\int_{x \in I(a,b)} f(x)\,\delta(x - x_0) =_0 f(x_0)$.
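The first three properties above are polynomial identities in d, so they can be sanity-checked with exact rational arithmetic. The sketch below is only an illustration under a loud assumption: a small positive rational number stands in for the infinitely small d, so it does not model the Levi-Civita field itself. The function names (`delta`, `H`, `integral_of_delta`, `integral_of_scaled_delta`) are hypothetical and chosen for this example.

```python
from fractions import Fraction


def delta(x, d):
    """delta(x) = (3/4) d^-3 (d^2 - x^2) if |x| < d, and 0 otherwise."""
    if abs(x) < d:
        return Fraction(3, 4) * (d * d - x * x) / d**3
    return Fraction(0)


def H(x, d):
    """Anti-derivative of delta: 0 left of -d, 1 right of d, cubic in between."""
    if x <= -d:
        return Fraction(0)
    if x >= d:
        return Fraction(1)
    return Fraction(3, 4) * (d * d * x - x**3 / 3) / d**3 + Fraction(1, 2)


def integral_of_delta(a, b, d):
    """Integral of delta over an interval with end points a < b, via H."""
    return H(b, d) - H(a, d)


def integral_of_scaled_delta(alpha, a, b, d):
    """Integral of delta(alpha * x) over [a, b], using the substitution u = alpha * x."""
    lo, hi = sorted((alpha * a, alpha * b))
    return (H(hi, d) - H(lo, d)) / abs(alpha)


# Assumption: a positive rational stand-in for the infinitely small number d.
d = Fraction(1, 1000)
print(integral_of_delta(Fraction(-1), Fraction(1), d))                      # 1
print(integral_of_scaled_delta(Fraction(-5), Fraction(-1), Fraction(1), d))  # 1/5
```

Because every step uses `Fraction`, the value 1 for the integral over an interval containing ]−d, d[ is obtained exactly rather than up to rounding, whatever positive value is used for the stand-in.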
8 Optimization
In [52], we consider unconstrained one-dimensional optimization on R. We study general optimization questions and derive first and second order necessary and sufficient conditions for the existence of local maxima and minima of a function on a convex subset of R. We show that for first order optimization, the results are similar to the corresponding real ones. However, for second and higher order optimization, we show that conventional differentiability is not strong enough to just extend the real-case results (see Examples 5.6 and 5.7); and a stronger concept of differentiability, the so-called derivate differentiability (see Definition 8.4 below), is used to solve that difficulty. We also characterize convex functions on convex sets of R in terms of first and second order derivatives.
8.1 One-Dimensional Optimization
In the following, we review the definitions of derivate continuity and differentiability in one dimension, as well as some related results and we refer the interested reader to [36, 41] for a more detailed study. As before, throughout this section, I(a, b) will denote any one of the intervals ]a, b[, ]a, b], [a, b[ or [a, b].
Definition 8.1 Let a < b be given in R and let f : I(a, b) → R. Then we say that f is derivate continuous on I(a, b) if there exists M ∈ R, called a Lipschitz constant of f on I(a, b), such that
$$\left|\frac{f(y) - f(x)}{y - x}\right|_0 \le M \quad \text{for all } x \ne y \text{ in } I(a, b).$$
It follows immediately from Definition 8.1 that if f : I(a, b) → R is derivate continuous on I(a, b) then f is uniformly continuous and bounded on I(a, b).
Remark 8.2 It is clear that the concept of derivate continuity in Definition 8.1 coincides with that of Lipschitz continuity when restricted to ℝ. We chose to call it derivate continuity here so that, after having defined derivate differentiability in Definition 8.4 and higher order derivate differentiability in Definition 8.6, we can think of derivate continuity as derivate differentiability of "order zero", just as is the case for continuity in R.
Remark 8.3 Definition 8.1 can be generalized in the obvious way to functions on any countable unions of intervals of R.
Definition 8.4 Let a < b be given in R, let f : I(a, b) → R be derivate continuous on I(a, b), and let Id denote the identity function on I(a, b). Then we say that f is derivate differentiable on I(a, b) if for all x ∈ I(a, b), the function $\frac{f - f(x)}{\mathrm{Id} - x} : I(a, b) \setminus \{x\} \to R$ is derivate continuous on I(a, b) \ {x}. In this case the unique continuation of $\frac{f - f(x)}{\mathrm{Id} - x}$ to I(a, b) will be called the first derivate function (or simply the derivate function) of f at x and will be denoted by $F_{1,x}$; moreover, the function value $F_{1,x}(x)$ will be called the derivative of f at x and will be denoted by f′(x).
It follows immediately from Definition 8.4 that if f : I(a, b) → R is derivate differentiable then f is differentiable in the conventional sense; moreover, the two derivatives at any given point of I(a, b) agree. As for derivate continuity, the definition of derivate differentiability can be generalized to functions on countable unions of intervals of R. The following result provides a useful tool for checking the derivate differentiability of functions.
Theorem 8.5 Let a < b be given in R and let f : I(a, b) → R be derivate continuous on I(a, b). Suppose there exists M ∈ R and there exists a function g : I(a, b) → R such that
$$\left|\frac{f(y) - f(x)}{y - x} - g(x)\right|_0 \le M\,|y - x|_0 \quad \text{for all } y \ne x \text{ in } I(a, b).$$
Then f is derivate differentiable on I(a, b), with derivative f′ = g.
Definition 8.6 (n-times Derivate Differentiability) Let a < b be given in R, let f : I(a, b) → R, and let n ≥ 2 be given in N. Then we define n-times derivate differentiability of f on I(a, b) inductively as follows: Having defined (n − 1)-times derivate differentiability, we say that f is n-times derivate differentiable on I(a, b) if f is (n − 1)-times derivate differentiable on I(a, b) and for all x ∈ I(a, b), the (n − 1)st derivate function $F_{n-1,x}$ is derivate differentiable on I(a, b). For all x ∈ I(a, b), the derivate function $F_{n,x}$ of $F_{n-1,x}$ at x will be called the nth derivate function of f at x, and the number $f^{(n)}(x) = n!\,F'_{n-1,x}(x)$ will be called the nth derivative of f at x and denoted by $f^{(n)}(x)$.
One of the most useful consequences of the derivate differentiability concept is that it gives rise to a Taylor theorem with remainder while the conventional (topological) differentiability does not. We only state the result here and refer the reader to [36, 41] for its proof.
We also note that, as an immediate result of Theorem 8.7, we obtain local expandability in Taylor series around $x_0 \in I(a, b)$ of a given function that is infinitely often derivate differentiable on I(a, b) [36, 41].
Theorem 8.7 (Taylor's Theorem with Remainder) Let a < b be given in R and let f : I(a, b) → R be n-times derivate differentiable on I(a, b). Let x ∈ I(a, b) be given, let $F_{n,x}$ be the nth order derivate function of f at x, and let $M_{n,x}$ be a Lipschitz constant of $F_{n,x}$. Then for all y ∈ I(a, b), we have that
$$f(y) = f(x) + \sum_{j=1}^{n} \frac{f^{(j)}(x)}{j!}\,(y - x)^j + r_n(x, y)\,(y - x)^{n+1},$$
with $\lambda(r_n(x, y)) \ge \lambda(M_{n,x})$.
Using Theorem 8.7, we are able to generalize in [52] most of the one-dimensional optimization results of Real Analysis. For example, we obtain the following two results which state necessary and sufficient conditions for the existence of local (relative) extrema.
Theorem 8.8 (Necessary Conditions for Existence of Local Extrema) Let a < b be given in R, let m ≥ 2, and let f : I(a, b) → R be m-times derivate differentiable on I(a, b). Suppose that f has a local extremum at $x_0 \in\, ]a, b[$ and l ≤ m is the order of the first nonvanishing derivative of f at $x_0$. Then l is even. Moreover, $f^{(l)}(x_0)$ is positive if the extremum is a minimum and negative if the extremum is a maximum.
Theorem 8.9 (Sufficient Conditions for Existence of Local Extrema) Let a < b be given in R, let k ∈ N, and let f : I(a, b) → R be 2k-times derivate differentiable on I(a, b). Let $x_0 \in\, ]a, b[$ be such that $f^{(j)}(x_0) = 0$ for all j ∈ {1, . . . , 2k − 1} and $f^{(2k)}(x_0) \ne 0$. Then f has a local minimum at $x_0$ if $f^{(2k)}(x_0) > 0$ and a local maximum if $f^{(2k)}(x_0) < 0$.
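The decision rule contained in Theorems 8.8 and 8.9 can be written as a few lines of code. The sketch below is an illustration only: the function name `classify_critical_point` is hypothetical, the derivative values are assumed to be supplied by the caller, and they may be of any ordered numeric type (here exact rationals), not specifically Levi-Civita numbers.

```python
from fractions import Fraction


def classify_critical_point(derivs):
    """Classify an interior point x0 from derivs[j-1] = f^(j)(x0), j = 1..m.

    Returns (order, verdict), where order is the order of the first
    nonvanishing derivative, or (None, "inconclusive") if all supplied
    derivatives vanish.
    """
    for order, value in enumerate(derivs, start=1):
        if value != 0:
            if order % 2 == 1:
                # Theorem 8.8: an extremum forces the first nonvanishing order to be even.
                return order, "no extremum"
            # Theorem 8.9: the sign of the first nonvanishing even-order derivative decides.
            verdict = "local minimum" if value > 0 else "local maximum"
            return order, verdict
    return None, "inconclusive"


# f(x) = x^4 at x0 = 0: f' = f'' = f''' = 0 and f'''' = 24.
print(classify_critical_point([Fraction(0), Fraction(0), Fraction(0), Fraction(24)]))
```

For the sample input the first nonvanishing derivative has order 4 and is positive, so the rule reports a local minimum; an odd first nonvanishing order, as for f(x) = x³ at 0, rules out an extremum.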
8.2 Multidimensional Constrained Optimization
In [41, 53], we generalize the concepts of derivate continuity and differentiability to higher dimensions; and this yields a Taylor theorem with a bounded remainder term for $C^m$ functions (in the derivate sense) from an open subset of Rⁿ to R.
Definition 8.10 Let D ⊂ Rⁿ be open, let f : D → R, and let $u = \{u_1, u_2, \dots, u_n\}$ be a unit vector (that is, $|u|_0 = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2} = 1$). For each x ∈ D, let $D_{x,u} = \{t \in R : x + tu \in D\}$ and define $\varphi_{x,u} : D_{x,u} \to R$ by $\varphi_{x,u}(t) = f(x + tu)$. Then we say that f is derivate differentiable on D in the direction of u if $\varphi_{x,u}$ is derivate differentiable on $D_{x,u}$ for all x in D. Moreover, the derivative $\varphi'_{x,u}(0)$ will be called the directional derivative of f at x in the u direction.
Definition 8.11 (Partial Derivatives) Let D ⊂ Rⁿ be open, let f : D → R and let $\{e_1, \dots, e_n\}$ denote the standard basis of Rⁿ. Then the partial derivatives of f are defined as the directional derivatives of f in the directions $e_1, \dots, e_n$, if these exist. The gradient of f, denoted by ∇f, is defined to be the row vector whose components are the (first order) partial derivatives of f.
Definition 8.12 Let D ⊂ Rⁿ be open, let f : D → R and let q ∈ N be given. Then we say that f is $C^q$ on D if all the partial derivatives of order smaller than or equal to q exist and are derivate continuous on D.
Theorem 8.13 (Taylor's Theorem with Remainder for Functions of Several Variables) Let D ⊂ Rⁿ be open, let $x_0 \in D$ be given and let f : D → R be $C^q$ on D. Then there exist M, δ > 0 in R such that $B_\delta(x_0) := \{x \in R^n : |x - x_0|_0 < \delta\} \subset D$ and, for all $x \in B_\delta(x_0)$, we have that
$$f(x) = f(x_0) + \sum_{j=1}^{q} \left( \frac{1}{j!} \sum_{l_1, \dots, l_j = 1}^{n} \left( \partial_{l_1} \cdots \partial_{l_j} f(x_0) \prod_{k=1}^{j} \left(x_{l_k} - x_{0,l_k}\right) \right) \right) + R_{q+1}(x_0, x),$$
where $\left|R_{q+1}(x_0, x)\right|_0 \le M\,|x - x_0|_0^{q+1}$.
Then we use Theorem 8.13 to derive necessary and sufficient conditions of second order for the existence of a minimum of an R-valued function on Rⁿ subject to equality and inequality constraints. More specifically, we solve the problem of minimizing a function f : Rⁿ → R, subject to the following set of constraints:
$$\begin{cases} h_1(x) = 0 \\ \quad\vdots \\ h_m(x) = 0 \end{cases} \quad \text{and} \quad \begin{cases} g_1(x) \le 0 \\ \quad\vdots \\ g_p(x) \le 0, \end{cases} \tag{8.1}$$
where all the functions in (8.1) are from Rⁿ to R. A point $x_0 \in R^n$ is said to be a feasible point if it satisfies the constraints in (8.1).
Definition 8.14 Let $x_0$ be a feasible point for the constraints in (8.1) and let $I(x_0) = \{l \in \{1, \dots, p\} : g_l(x_0) = 0\}$. Then we say that $x_0$ is regular for the constraints if $\{\nabla h_j(x_0) : j = 1, \dots, m;\ \nabla g_l(x_0) : l \in I(x_0)\}$ forms a linearly independent subset of vectors in Rⁿ.
The following theorem provides necessary conditions of second order for a local minimizer $x_0$ of a function f subject to the constraints in (8.1). The result is a generalization of the corresponding real result [15, 26] and the proof (see [53]) is similar to that of the latter; but one essential difference is the form of the remainder formula in Taylor's theorem. In the real case, the remainder term is related to the second derivative at some intermediate point, while here that is not the case. However, the concept of derivate differentiability puts a bound on the remainder term; and this proves to be instrumental in the proof of the theorem in the non-Archimedean setting.
Theorem 8.15 Suppose that f, $\{h_j\}_{j=1}^{m}$, $\{g_l\}_{l=1}^{p}$ are $C^2$ on some open set D ⊂ Rⁿ containing the point $x_0$ and that $x_0$ is a regular point for the constraints in (8.1). If $x_0$ is a local minimizer for f under the given constraints, then there exist $\alpha_1, \dots, \alpha_m, \beta_1, \dots, \beta_p \in R$ such that
(i) $\beta_l \ge 0$ for all l ∈ {1, . . . , p},
(ii) $\beta_l\, g_l(x_0) = 0$ for all l ∈ {1, . . . , p},
(iii) $\nabla f(x_0) + \sum_{j=1}^{m} \alpha_j \nabla h_j(x_0) + \sum_{l=1}^{p} \beta_l \nabla g_l(x_0) = 0$, and
(iv) $y^T \left( \nabla^2 f(x_0) + \sum_{j=1}^{m} \alpha_j \nabla^2 h_j(x_0) + \sum_{l=1}^{p} \beta_l \nabla^2 g_l(x_0) \right) y \ge 0$ for all y ∈ Rⁿ satisfying $\nabla h_j(x_0)\,y = 0$ for all j ∈ {1, . . . , m}, $\nabla g_l(x_0)\,y = 0$ for all $l \in L = \{k \in I(x_0) : \beta_k > 0\}$ and $\nabla g_l(x_0)\,y \le 0$ for all $l \in I(x_0) \setminus L$.
In the following theorem, we present second order sufficient conditions for a feasible point $x_0$ to be a local minimum of a function f subject to the constraints in (8.1). It is a generalization of the real result [15] and reduces to it, when restricted to functions from ℝⁿ to ℝ. In fact, since ε in condition (iv) below is allowed to be infinitely small, the condition $|\nabla h_j(x_0)\,y|_0 < \epsilon$ would reduce to $\nabla h_j(x_0)\,y = 0$, when restricted to ℝ. Similarly, one can readily see that the other conditions are mere generalizations of the corresponding real ones. However, the proof (see [53]) is different from that of the real result since the supremum principle does not hold in R.
Theorem 8.16 Suppose that f, $\{h_j\}_{j=1}^{m}$, $\{g_l\}_{l=1}^{p}$ are $C^2$ on some open set D ⊂ Rⁿ containing the point $x_0$ and that $x_0$ is a feasible point for the constraints in (8.1) such that, for some $\alpha_1, \dots, \alpha_m, \beta_1, \dots, \beta_p \in R$ and for some ε, γ > 0 in R, we have that
(i) $\beta_l \ge 0$ for all l ∈ {1, . . . , p},
(ii) $\beta_l\, g_l(x_0) = 0$ for all l ∈ {1, . . . , p},
(iii) $\nabla f(x_0) + \sum_{j=1}^{m} \alpha_j \nabla h_j(x_0) + \sum_{l=1}^{p} \beta_l \nabla g_l(x_0) = 0$, and
(iv) $y^T \left( \nabla^2 f(x_0) + \sum_{j=1}^{m} \alpha_j \nabla^2 h_j(x_0) + \sum_{l=1}^{p} \beta_l \nabla^2 g_l(x_0) \right) y \ge \gamma$ for all y ∈ Rⁿ satisfying $|y|_0 = 1$, $|\nabla h_j(x_0)\,y|_0 < \epsilon$ for all j ∈ {1, . . . , m}, $|\nabla g_l(x_0)\,y|_0 < \epsilon$ for all $l \in L = \{k : \beta_k > 0\}$ and $\nabla g_l(x_0)\,y < \epsilon$ for all $l \in I(x_0) \setminus L$, where $I(x_0) = \{k : g_k(x_0) = 0\}$.
Then $x_0$ is a strict local minimum for f under the constraints of (8.1).
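Conditions (i) to (iii), which appear in both Theorem 8.15 and Theorem 8.16, are easy to check mechanically once the gradients at the candidate point are known. The following sketch does so for a toy problem with exact rational data. It is an assumption-laden illustration: the function name `satisfies_first_order_conditions` and the toy problem are ours, the gradients are supplied by hand, and ordinary rational numbers are used in place of Levi-Civita numbers.

```python
from fractions import Fraction


def satisfies_first_order_conditions(grad_f, grads_h, grads_g, g_values, alphas, betas):
    """Check (i) beta_l >= 0, (ii) beta_l * g_l(x0) = 0 and (iii) stationarity.

    grad_f is the gradient of f at x0; grads_h[j] and grads_g[l] are the
    gradients of h_j and g_l at x0; g_values[l] = g_l(x0).
    """
    if any(b < 0 for b in betas):                                # condition (i)
        return False
    if any(b * gv != 0 for b, gv in zip(betas, g_values)):       # condition (ii)
        return False
    residual = list(grad_f)                                      # condition (iii)
    for a, gh in zip(alphas, grads_h):
        residual = [r + a * c for r, c in zip(residual, gh)]
    for b, gg in zip(betas, grads_g):
        residual = [r + b * c for r, c in zip(residual, gg)]
    return all(r == 0 for r in residual)


# Toy problem (hypothetical): minimize x^2 + y^2 subject to
# h(x, y) = x + y - 1 = 0 and g(x, y) = -x <= 0, at x0 = (1/2, 1/2).
half = Fraction(1, 2)
ok = satisfies_first_order_conditions(
    grad_f=[2 * half, 2 * half],   # gradient of f at x0 is (1, 1)
    grads_h=[[1, 1]],
    grads_g=[[-1, 0]],
    g_values=[-half],              # g is inactive at x0
    alphas=[-1],
    betas=[0],
)
print(ok)  # True
```

In this toy problem the Hessian of the Lagrangian is twice the identity, so the quadratic form in condition (iv) equals 2 on every unit vector and the second order condition holds with γ = 2; the sketch deliberately leaves that check out to stay short.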
9 Computational Applications
The general question of efficient differentiation is at the core of many parts of the work on perturbation and aberration theories relevant in Physics and Engineering. In this case, derivatives of highly complicated functions have to be computed to high orders. However, even when the derivative of the function is known to exist at the given point, numerical methods fail to give an accurate value of the derivative; the error increases with the order, and for orders greater than three, the errors often become too large for the results to be practically useful.
On the other hand, while formula manipulators like Mathematica are successful in finding low-order derivatives of simple functions, they fail for high-order derivatives of very complicated functions. Moreover, they fail to find the derivatives of certain functions at given points even though the functions are differentiable at the respective points. This is generally connected to the occurrence of non-differentiable parts that do not affect the differentiability of the end result as well as the occurrence of branch points in coding as in IF-ELSE structures. Using Calculus on R and the fact that the field has infinitely small numbers represents a new method for computational differentiation that avoids the well-known accuracy problems of numerical differentiation tools. It also avoids the often rather stringent limitations of formula manipulators that restrict the complexity of the function that can be differentiated, and the orders to which differentiation can be performed. By a computer function, we denote any real-valued function that can be typed on a computer. The R numbers as well as the continuations to R of the intrinsic functions (and hence of all computer functions) have all been implemented for use on a computer, using the code COSY INFINITY [8]. Using the calculus on R, we formulate a necessary and sufficient condition for the derivatives of a computer function to exist, and show how to find these derivatives whenever they exist [43, 45]. The new technique of computing the derivatives of computer functions, which we summarize below, achieves results that combine the accuracy of formula manipulators with the speed of classical numerical methods, that is, the best of both worlds. The method is much faster than Mathematica and other formula manipulators since no symbolic differentiation is required before the numerical evaluation of the derivatives.
Moreover, the results obtained are accurate up to machine precision: the error is infinitely small and hence it does not mix with the real derivative. This represents a clear advantage over traditional numerical differentiation methods, in which finite errors result from digit cancelation in the floating point representation and for high orders the errors usually become too large for the results to be of any practical use.
Lemma 9.1 Let f be a computer function. Then f is defined at $x_0$ if and only if $f(x_0)$ can be computed on a computer.
This lemma hinges on a careful implementation of the intrinsic functions and operations, in particular in the sense that they should be executable for any floating point number in the domain of definition that produces a result within the range of allowed floating point numbers.
Lemma 9.2 Let f be a computer function, and let $x_0$ be a real number. Then f is right-continuous at $x_0$ if and only if $f(x_0)$ and $f(x_0 + d)$ are defined, and $f(x_0 + d) =_0 f(x_0)$. f is left-continuous at $x_0$ if and only if $f(x_0)$ and $f(x_0 - d)$ are defined, and $f(x_0 - d) =_0 f(x_0)$. Finally, f is continuous at $x_0$ if and only if it is both right-continuous and left-continuous at $x_0$; that is, if and only if $f(x_0 - d)$, $f(x_0)$, and $f(x_0 + d)$ are all defined, and $f(x_0 - d) =_0 f(x_0) =_0 f(x_0 + d)$.
Theorem 9.3 Let f be a computer function that is continuous at $x_0$. Then f is differentiable at $x_0$ if and only if
$$\frac{f(x_0 + d) - f(x_0)}{d} \quad \text{and} \quad \frac{f(x_0) - f(x_0 - d)}{d}$$
are both at most finite in (ordinary) absolute value, and their real parts agree. In this case,
$$\frac{f(x_0 + d) - f(x_0)}{d} =_0 f'(x_0) =_0 \frac{f(x_0) - f(x_0 - d)}{d}.$$
If f is differentiable at $x_0$, then f is twice differentiable at $x_0$ if and only if
$$\frac{f(x_0 + 2d) - 2f(x_0 + d) + f(x_0)}{d^2} \quad \text{and} \quad \frac{f(x_0) - 2f(x_0 - d) + f(x_0 - 2d)}{d^2}$$
are both at most finite in (ordinary) absolute value, and their real parts agree. In this case
$$\frac{f(x_0 + 2d) - 2f(x_0 + d) + f(x_0)}{d^2} =_0 f''(x_0) =_0 \frac{f(x_0) - 2f(x_0 - d) + f(x_0 - 2d)}{d^2}.$$
In general, if f is (n − 1) times differentiable at $x_0$, then f is n times differentiable at $x_0$ if and only if
$$\frac{\sum_{j=0}^{n} (-1)^{n-j} \binom{n}{j} f(x_0 + jd)}{d^n} \quad \text{and} \quad \frac{\sum_{j=0}^{n} (-1)^{j} \binom{n}{j} f(x_0 - jd)}{d^n}$$
are both at most finite in (ordinary) absolute value, and their real parts agree. In this case,
$$\frac{\sum_{j=0}^{n} (-1)^{n-j} \binom{n}{j} f(x_0 + jd)}{d^n} =_0 f^{(n)}(x_0) =_0 \frac{\sum_{j=0}^{n} (-1)^{j} \binom{n}{j} f(x_0 - jd)}{d^n}.$$
Since knowledge of f (x0 − d) and f (x0 + d) gives us all the information about a computer function f in an interval (x0 − σ, x0 + σ ), with real σ > 0, around x0 , we have the following result which states that, from the mere knowledge of f (x0 − d) and f (x0 + d), we can find at once the order of differentiability of f at x0 and the accurate values of all existing derivatives.
Theorem 9.4 Let f be a computer function that is defined at $x_0$; and let n ∈ N be given. Then f is n times differentiable at $x_0$ if and only if $f(x_0 - d)$ and $f(x_0 + d)$ are both defined and can be written as
$$f(x_0 - d) =_n f(x_0) + \sum_{j=1}^{n} (-1)^j \alpha_j d^j \quad \text{and} \quad f(x_0 + d) =_n f(x_0) + \sum_{j=1}^{n} \alpha_j d^j,$$
where the $\alpha_j$'s are real numbers. Moreover, in this case $f^{(j)}(x_0) = j!\,\alpha_j$ for 1 ≤ j ≤ n.
Now consider, as an example, the function
$$g(x) = \frac{\sin\left(x^3 + 2x + 1\right) + \dfrac{3 + \cos(\sin(\ln|1 + x|))}{\exp\left(\tanh\left(\sinh\left(\cosh\left(\dfrac{\sin(\cos(\tan(\exp(x))))}{\cos(\sin(\exp(\tan(x + 2))))}\right)\right)\right)\right)}}{2 + \sin\left(\sinh\left(\cos\left(\tan^{-1}\left(\ln\left(\exp(x) + x^2 + 3\right)\right)\right)\right)\right)}. \tag{9.1}$$
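The mechanism behind Theorem 9.4 can be illustrated on functions much simpler than g. The sketch below evaluates a function at $x_0 + d$ using numbers of the form $a_0 + a_1 d + \dots + a_n d^n$ truncated after $d^n$, and reads off $f^{(j)}(x_0) = j!\,\alpha_j$ from the coefficients. It is a minimal illustration under stated assumptions and not the COSY INFINITY implementation: the class `TruncLC` and the helper `derivatives` are hypothetical names, only non-negative integer powers of d and rational coefficients are represented, and only the four arithmetic operations are supported.

```python
from fractions import Fraction
from math import factorial


class TruncLC:
    """A number a0 + a1*d + ... + an*d^n, with all higher powers of d dropped."""

    def __init__(self, coeffs, depth):
        c = [Fraction(x) for x in coeffs][: depth + 1]
        self.c = c + [Fraction(0)] * (depth + 1 - len(c))
        self.depth = depth

    def _lift(self, other):
        return other if isinstance(other, TruncLC) else TruncLC([other], self.depth)

    def __add__(self, other):
        o = self._lift(other)
        return TruncLC([a + b for a, b in zip(self.c, o.c)], self.depth)

    __radd__ = __add__

    def __neg__(self):
        return TruncLC([-a for a in self.c], self.depth)

    def __sub__(self, other):
        return self + (-self._lift(other))

    def __rsub__(self, other):
        return self._lift(other) - self

    def __mul__(self, other):
        o = self._lift(other)
        out = [Fraction(0)] * (self.depth + 1)
        for i, a in enumerate(self.c):
            for j, b in enumerate(o.c[: self.depth + 1 - i]):
                out[i + j] += a * b          # terms beyond d^depth are discarded
        return TruncLC(out, self.depth)

    __rmul__ = __mul__

    def reciprocal(self):
        """Inverse of a number with nonzero real part, coefficient by coefficient."""
        inv = [Fraction(1) / self.c[0]]
        for k in range(1, self.depth + 1):
            s = sum(self.c[i] * inv[k - i] for i in range(1, k + 1))
            inv.append(-s / self.c[0])
        return TruncLC(inv, self.depth)

    def __truediv__(self, other):
        return self * self._lift(other).reciprocal()

    def __rtruediv__(self, other):
        return self._lift(other) * self.reciprocal()


def derivatives(f, x0, n):
    """Return [f(x0), f'(x0), ..., f^(n)(x0)] read off from f(x0 + d)."""
    y = f(TruncLC([x0, 1], n))               # evaluate f at x0 + d
    return [factorial(j) * y.c[j] for j in range(n + 1)]


f = lambda x: x * x * x + 2 * x + 1
print([int(v) for v in derivatives(f, 2, 3)])  # [13, 14, 12, 6]
```

All derivatives up to order n come out of a single evaluation of f and no difference quotient is formed, which is why no cancelation error arises in this toy setting.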
Using the R calculus, we find $g^{(n)}(0)$ for 0 ≤ n ≤ 10. These numbers are listed in Table 1; we note that, for 0 ≤ n ≤ 10, we list the CPU time needed to obtain all derivatives of g at 0 up to order n and not just $g^{(n)}(0)$. For comparison purposes, we give in Table 2 the function value and the first six derivatives computed with Mathematica. Note that the respective values listed in Tables 1 and 2 agree. However, Mathematica used much more CPU time to compute the first six derivatives, and it failed to find the seventh derivative as it ran out of memory. We also list in Table 3 the first ten derivatives of g at 0 computed numerically using the numerical differentiation formulas
$$g^{(n)}(0) = (\Delta x)^{-n} \left( \sum_{j=0}^{n} (-1)^{n-j} \binom{n}{j}\, g(j\,\Delta x) \right), \quad \Delta x = 10^{-16/(n+1)},$$
for 1 ≤ n ≤ 10, together with the corresponding relative errors obtained by comparing the numerical values with the respective exact values computed using R calculus.

Table 1 $g^{(n)}(0)$, 0 ≤ n ≤ 10, computed with R calculus

| Order n | $g^{(n)}(0)$ | CPU Time |
|---|---|---|
| 0 | 1.004845319007115 | 1.820 ms |
| 1 | 0.4601438089634254 | 2.070 ms |
| 2 | −5.266097568233224 | 3.180 ms |
| 3 | −52.82163351991485 | 4.830 ms |
| 4 | −108.4682847837855 | 7.700 ms |
| 5 | 16451.44286410806 | 11.640 ms |
| 6 | 541334.9970224757 | 18.050 ms |
| 7 | 7948641.189364974 | 26.590 ms |
| 8 | −144969388.2104904 | 37.860 ms |
| 9 | −15395959663.01733 | 52.470 ms |
| 10 | −618406836695.3634 | 72.330 ms |

Table 2 $g^{(n)}(0)$, 0 ≤ n ≤ 6, computed with Mathematica

| Order n | $g^{(n)}(0)$ | CPU Time |
|---|---|---|
| 0 | 1.004845319007116 | 0.11 s |
| 1 | 0.4601438089634254 | 0.17 s |
| 2 | −5.266097568233221 | 0.47 s |
| 3 | −52.82163351991483 | 2.57 s |
| 4 | −108.4682847837854 | 14.74 s |
| 5 | 16451.44286410805 | 77.50 s |
| 6 | 541334.9970224752 | 693.65 s |

Table 3 $g^{(n)}(0)$, 1 ≤ n ≤ 10, computed numerically

| Order n | $g^{(n)}(0)$ | Relative Error |
|---|---|---|
| 1 | 0.4601437841866840 | 54 × 10⁻⁹ |
| 2 | −5.266346392944456 | 47 × 10⁻⁶ |
| 3 | −52.83767867680922 | 30 × 10⁻⁵ |
| 4 | −87.27214664649106 | 0.20 |
| 5 | 19478.29555909866 | 0.18 |
| 6 | 633008.9156614641 | 0.17 |
| 7 | −12378052.73279768 | 2.6 |
| 8 | −1282816703.632099 | 7.8 |
| 9 | 83617811421.48561 | 6.4 |
| 10 | 91619495958355.24 | 149 |

On the other hand, formula manipulators fail to find the derivatives of certain functions at given points even though the functions are differentiable at the respective points. For example, the functions
$$g_1(x) = |x|^{5/2} \cdot g(x) \quad \text{and} \quad g_2(x) = \begin{cases} \dfrac{1 - \exp\left(-x^2\right)}{x} \cdot g(x) & \text{if } x \ne 0 \\ 0 & \text{if } x = 0, \end{cases}$$
where g(x) is the function given in Eq. (9.1), are both differentiable at 0; but the attempt to compute their derivatives using formula manipulators fails. This is not specific to $g_1$ and $g_2$, and is generally connected to the occurrence of non-differentiable parts that do not affect the differentiability of the end result, of which case $g_1$ is an example, as well as the occurrence of branch points in coding as in IF-ELSE structures, of which case $g_2$ is an example. More recently, in [16, 18], building on the success of computational differentiation above, we succeeded in achieving more computational applications of the Levi-Civita numbers such as numerical integration as well as the computation of numerical sequences that are given by generating functions like the Bernoulli numbers.
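For readers who want to reproduce the flavour of the Table 3 comparison, the forward-difference formula with $\Delta x = 10^{-16/(n+1)}$ can be sketched in a few lines of Python applied to the function g of Eq. (9.1). This is a hedged illustration: the names `g` and `forward_difference` are ours, the nesting of g follows our reading of Eq. (9.1), and results in IEEE double precision need not reproduce the tabulated digits, which were obtained on the authors' system.

```python
import math


def g(x):
    """The test function of Eq. (9.1), evaluated in double precision."""
    inner = (math.sin(math.cos(math.tan(math.exp(x))))
             / math.cos(math.sin(math.exp(math.tan(x + 2)))))
    tower = math.exp(math.tanh(math.sinh(math.cosh(inner))))
    numerator = (math.sin(x**3 + 2 * x + 1)
                 + (3 + math.cos(math.sin(math.log(abs(1 + x))))) / tower)
    denominator = 2 + math.sin(math.sinh(math.cos(
        math.atan(math.log(math.exp(x) + x**2 + 3)))))
    return numerator / denominator


def forward_difference(f, n):
    """n-th order forward difference at 0 with step 10^(-16/(n+1))."""
    h = 10.0 ** (-16.0 / (n + 1))
    total = sum((-1) ** (n - j) * math.comb(n, j) * f(j * h) for j in range(n + 1))
    return total / h**n


print(g(0.0))                      # compare with the n = 0 row of Table 1
for n in range(1, 11):
    print(n, forward_difference(g, n))
```

The low-order estimates track the values of Table 1 closely, while for higher orders cancelation in the alternating sum dominates, which is the behaviour that the relative errors in Table 3 document.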
In the following section we give a brief summary of our work on developing a non-Archimedean operator theory on a Banach space over the complex Levi-Civita field C which is the result of a recent collaboration with José Aguayo (Universidad de Concepción, Chile) and Miguel Nova (Universidad Católica de la Santísima, Concepción, Chile). For lack of space, we will omit all the details here and refer the interested reader to [1–3].
10 Non-Archimedean Operator Theory
Let $c_0$ denote the space of all null sequences of elements in C. The natural inner product on $c_0$ induces the sup-norm of $c_0$. In [1], we show that $c_0$ is not orthomodular; then we characterize those closed subspaces of $c_0$ with an orthonormal complement with respect to the inner product. Such a subspace, together with its orthonormal complement, defines a special kind of projection, the normal projection. We present characterizations of normal projections as well as other kinds of operators, the self-adjoint and compact operators on $c_0$. In [2], we work on some $B^*$-algebras of operators, including those mentioned above; and we define an inner product on such algebras that induces the usual norm of operators. Finally, in [3], we study the properties of positive operators on $c_0$ which are similar to those of positive operators in classical functional analysis; however, the proofs of many of the results are nonclassical. Then we use our study of positive operators to introduce a partial order on the set of compact and self-adjoint operators on $c_0$ and study the properties of that partial order.
References 1. AGUAYO, J., NOVA, M., AND SHAMSEDDINE, K. Characterization of compact and self-adjoint operators on free Banach spaces of countable type over the complex Levi-Civita field. J. Math. Phys. 54, 2 (2013). 2. AGUAYO, J., NOVA, M., AND SHAMSEDDINE, K. Inner product on B∗ -algebras of operators on a free Banach space over the Levi-Civita field. Indag. Math. (N.S.) 26, 1 (2015), 191–205. 3. AGUAYO, J., NOVA, M., AND SHAMSEDDINE, K. Positive operators on a free banach space over the Levi-Civita field. p-Adic Numbers Ultrametric Anal. Appl. 9, 2 (2017), 122–137. 4. ALLING, N. L. Foundations of analysis over surreal number fields, vol. 141 of North-Holland Mathematics Studies. Elsevier, 1987. 5. BACHMAN, G. Introduction to p-adic numbers and valuation theory. Academic paperbacks. Academic Press. Inc, 1964. 6. BELL, J. L., AND MACHOVER, M. A course in mathematical logic. North-Holland, 1977. 7. BERZ, M. Calculus and numerics on Levi-Civita fields. In Computational Differentiation: Techniques, Applications, and Tools (Philadelphia, 1996), M. Berz, C. Bischof, G. Corliss, and A. Griewank, Eds., SIAM, pp. 19–35. 8. BERZ, M., HOFFSTÄTTER, G., WAN, W., SHAMSEDDINE, K., AND MAKINO, K. COSY INFINITY and its applications to nonlinear dynamics. In Computational Differentiation:
On Non-Archimedean Valued Fields: A Survey of Algebraic, Topological and. . .
Techniques, Applications, and Tools (Philadelphia, 1996), M. Berz, C. Bischof, G. Corliss, and A. Griewank, Eds., SIAM, pp. 363–367. 9. BOOKATZ, G., AND SHAMSEDDINE, K. Calculus on a non-archimedean field extension of the real numbers: inverse function theorem, intermediate value theorem and mean value theorem. Contemp. Math. 704 (2018), 49–67. 10. CHANG, C. C., AND KEISLER, H. J. Model theory, vol. 73 of Studies in Logic and the Foundations of Mathematics. Elsevier, 1990. 11. CLIFFORD, A. H. Note on hahn’s theorem on ordered abelian groups. Proceedings of the American Mathematical Society 5, 6 (1954), 860–863. 12. COMICHEO, A. B., AND SHAMSEDDINE, K. Summary on non-archimedean valued fields. Contemp. Math. 704 (2018), 1–36. 13. DALES, H. G., AND WOODIN, W. H. Super-real fields: totally ordered fields with additional structure. No. 14 in London Mathematical Society Monographs. Oxford University Press, 1996. 14. ENGLER, A. J., AND PRESTEL, A. Valued fields. Springer Monographs in Mathematics. Springer Science & Business Media, 2005. 15. FIACCO, A. V., AND MCCORMICK, G. P. Nonlinear Programming; Sequential Unconstrained Minimization Techniques. SIAM, Philadelphia, 1990. 16. FLYNN, D. On the Hahn and Levi-Civita Fields: Topology, Analysis, and Applications. PhD thesis, University of Manitoba, Winnipeg, Manitoba, Canada, 2019. 17. FLYNN, D., AND SHAMSEDDINE, K. On integrable delta functions on the Levi-Civita field. p-Adic Numbers Ultrametric Anal. Appl. 10, 1 (2018), 32–56. 18. FLYNN, D., AND SHAMSEDDINE, K. On computational applications of the Levi-Civita field. J. Comput. Appl. Math. 382 (2021). 19. HAHN, H. Über die nichtarchimedischen größensysteme. In Hans Hahn Gesammelte Abhandlungen Band 1/Hans Hahn Collected Works Volume 1. Springer, 1995, pp. 445–499. 20. HALL, J. F. Completeness of ordered fields. arXiv preprint arXiv:1101.5652 (2011). 21. HAUSNER, M., AND WENDEL, J. G. Ordered vector spaces. 
Proceedings of the American Mathematical Society 3, 6 (1952), 977–982. 22. LANG, S. Algebra revised third edition, vol. 211. Springer Science and Media, 2002. 23. LAUGWITZ, D. Tullio Levi-Civita’s work on nonarchimedean structures (with an Appendix: Properties of Levi-Civita fields). In Atti Dei Convegni Lincei 8: Convegno Internazionale Celebrativo Del Centenario Della Nascita De Tullio Levi-Civita (Academia Nazionale dei Lincei, Roma, 1975). 24. LEVI-CIVITA, T. Sugli infiniti ed infinitesimi attuali quali elementi analitici. Atti Ist. Veneto di Sc., Lett. ed Art. 7a, 4 (1892), 1765. 25. LEVI-CIVITA, T. Sui numeri transfiniti. Rend. Acc. Lincei 5a, 7 (1898), 91,113. 26. LUENBERGER, D. G. Linear and Nonlinear Programming, 2nd ed. Addison-Wesley, Reading, Massachusetts, 1984. 27. NEDER, L. Modell einer Leibnizschen Differentialrechnung mit aktual unendlich kleinen Größen. Mathematische Annalen 118 (1941–1943), 718–732. 28. PEREZ-GARCIA, C., AND SCHIKHOF, W. H. Locally Convex Spaces over Non-Archimedean Valued Fields, vol. 119 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, 2010. 29. RIBENBOIM, P. The new theory of ultrametric spaces. Periodica Mathematica Hungarica 32, 1–2 (1996), 103–111. 30. RIBENBOIM, P. The theory of classical valuations. Springer Monographs in Mathematics. Springer Science & Business Media, 1999. 31. SCHIKHOF, W. H. Isometrical embeddings of ultrametric spaces into non-archimedean valued fields. In Indagationes Mathematicae (Proceedings) (1984), vol. 87, Elsevier, pp. 51–53. 32. SCHIKHOF, W. H. A crash course in p-adic analysis, 2003. 33. SCHIKHOF, W. H. Ultrametric Calculus: an introduction to p-adic analysis, vol. 4 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, 2007.
34. SCHILLING, O. F. G. The Theory of Valuations. Mathematical surveys. American Mathematical Society, 1950. 35. SCHNEIDER, P. p-adic analysis and lie groups, 2008. 36. SHAMSEDDINE, K. New Elements of Analysis on the Levi-Civita Field. PhD thesis, Michigan State University, East Lansing, Michigan, USA, 1999. also Michigan State University report MSUCL-1147. 37. SHAMSEDDINE, K. On the topological structure of the Levi-Civita field. J. Math. Anal. Appl. 368 (2010), 281–292. 38. SHAMSEDDINE, K. Absolute and relative extrema, the mean value theorem and the inverse function theorem for analytic functions on a Levi-Civita field. Contemp. Math. 551 (2011), 257–268. 39. SHAMSEDDINE, K. Nontrivial order preserving automorphisms of non-Archimedean fields. Contemp. Math. 547 (2011), 217–225. 40. SHAMSEDDINE, K. New results on integration on the Levi-Civita field. Indag. Math. (N.S.) 24, 1 (2013), 199–211. 41. SHAMSEDDINE, K. One-variable and multi-variable calculus on a non-Archimedean field extension of the real numbers. p-Adic Numbers Ultrametric Anal. Appl. 5, 2 (2013), 160– 175. 42. SHAMSEDDINE, K. Taylor’s theorem, the inverse function theorem and the implicit function theorem for weakly locally uniformly differentiable functions on non-Archimedean spaces. p-Adic Numbers Ultrametric Anal. Appl. 13, 2 (2021), 148–165. 43. SHAMSEDDINE, K., AND BERZ, M. Exception handling in derivative computation with nonArchimedean calculus. In Computational Differentiation: Techniques, Applications, and Tools (Philadelphia, 1996), SIAM, pp. 37–51. 44. SHAMSEDDINE, K., AND BERZ, M. Convergence on the Levi-Civita field and study of power series. In Proc. Sixth International Conference on p-adic Functional Analysis (New York, NY, 2000), Marcel Dekker, pp. 283–299. 45. SHAMSEDDINE, K., AND BERZ, M. The differential algebraic structure of the Levi-Civita field and applications. Int. J. Appl. Math. 3 (2000), 449–465. 46. SHAMSEDDINE, K., AND BERZ, M. 
Measure theory and integration on the Levi-Civita field. Contemp. Math. 319 (2003), 369–387. 47. SHAMSEDDINE, K., AND BERZ, M. Analytical properties of power series on Levi-Civita fields. Ann. Math. Blaise Pascal 12, 2 (2005), 309–329. 48. SHAMSEDDINE, K., AND BERZ, M. Intermediate value theorem for analytic functions on a Levi-Civita field. Bull. Belg. Math. Soc. Simon Stevin 14 (2007), 1001–1015. 49. SHAMSEDDINE, K., AND BOOKATZ, G. A local mean value theorem for functions on nonArchimedean field extensions of the real numbers. p-Adic Numbers Ultrametric Anal. Appl. 8, 2 (2016), 160–175. 50. SHAMSEDDINE, K., AND FLYNN, D. Measure theory and lebesgue-like integration in two and three dimensions over the Levi-Civita field. Contemp. Math. 665 (2016), 289–325. 51. SHAMSEDDINE, K., AND SIERENS, T. On locally uniformly differentiable functions on a complete non-Archimedean ordered field extension of the real numbers. ISRN Math. Anal. 2012 (2012), 20 pages. 52. SHAMSEDDINE, K., AND ZEIDAN, V. One-dimensional optimization on non-Archimedean fields. J. Nonlinear Convex Anal. 2 (2001), 351–361. 53. SHAMSEDDINE, K., AND ZEIDAN, V. Constrained second order optimization on nonArchimedean fields. Indag. Math. (N.S.) 14 (2003), 81–101. 54. SHIRALI, S., AND VASUDEVA, H. L. Metric spaces. Springer Science & Business Media, 2005. 55. VAN ROOIJ, A. Non-Archimedean functional analysis, vol. 51 of Monographs and Textbooks in Pure and Applied Mathematics. Marcel Dekker, 1978.
Non-Archimedean Models of Morphogenesis W. A. Zúñiga-Galindo
Abstract We study a p-adic reaction-diffusion system and the associated Turing patterns. We establish an instability criterion and show that the Turing patterns are not classical patterns consisting of alternating domains. Instead, a Turing pattern consists of several domains (clusters), each of them supporting a different pattern but with the same parameter values. Patterns of this type are typically produced by reaction-diffusion equations on large networks. Keywords Reaction-diffusion equations · Turing patterns · p-Adic analysis 2020 Mathematics Subject Classification Primary: 35K57, 47S10; Secondary: 92C42
1 Introduction In 1952, A. Turing proposed that under certain conditions chemicals can react and diffuse in such a way as to produce steady state heterogeneous spatial patterns of chemical (or morphogen) concentration. In the case of two morphogens interacting, the model proposed by Turing has the form:
The author was partially supported by Conacyt Grant No. 217367 (Mexico), and by the Debnath Endowed Professorship (UTRGV, USA). W. A. Zúñiga-Galindo, School of Mathematical & Statistical Sciences, University of Texas Rio Grande Valley, Brownsville, TX, USA; Departamento de Matemáticas, Unidad Querétaro, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Santiago de Querétaro, Mexico. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 W. A. Zúñiga-Galindo, B. Toni (eds.), Advances in Non-Archimedean Analysis and Applications, STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health, https://doi.org/10.1007/978-3-030-81976-7_7
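$$
\begin{cases}
\dfrac{\partial u}{\partial t}(x,t) = \gamma f(u,v) + \dfrac{\partial^2 u}{\partial x^2}(x,t) \\[2mm]
\dfrac{\partial v}{\partial t}(x,t) = \gamma g(u,v) + d\,\dfrac{\partial^2 v}{\partial x^2}(x,t),
\end{cases}
\tag{1.1}
$$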
with suitable boundary conditions. In this model, u and v represent the concentrations of the two morphogens, f and g represent the reaction kinetics, d is the ratio of the diffusion coefficients, and γ represents the relative strength of the reaction terms. Typically u(x, t) and v(x, t) are interpreted as the local densities of the activator and inhibitor species. The functions f(u, v) and g(u, v) specify the local dynamics of the activator, which autocatalytically enhances its own production, and of the inhibitor, which suppresses the activator growth. The Turing instability occurs when the parameter d exceeds a threshold $d_c$, see [15, 22]. This drives the spontaneous development of a spatial pattern formed by alternating activator-rich and activator-poor patches. Turing instability in activator-inhibitor systems establishes a paradigm of non-equilibrium self-organization, which has been extensively studied for biological and chemical processes. In the 70s, Othmer and Scriven started the study of the Turing instability in network-organized systems [17, 18]. Since then, reaction-diffusion models on networks have been studied intensively, see e.g. [2], [5], [6], [9], [12], [14], [16], [17], [18], [20], [23], [25], [26], [28] and the references therein. In the discrete case, the continuous medium is replaced by a network (an unoriented graph G, which plays the role of a discrete medium). The analog of the operator $\Delta$ is the Laplacian of the graph G, which is defined as
$$
[L_{JI}]_{J,I\in V(G)} = [A_{JI} - \gamma_I \delta_{JI}]_{J,I\in V(G)},
\tag{1.2}
$$
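where $[A_{JI}]_{J,I\in V(G)}$ is the adjacency matrix of G and $\gamma_I$ is the degree of I. The network analogue of (1.1) is
$$
\begin{cases}
\dfrac{\partial u_J}{\partial t} = f(u_J, v_J) + \varepsilon \sum\limits_{I} L_{JI}\, u_I \\[2mm]
\dfrac{\partial v_J}{\partial t} = g(u_J, v_J) + \varepsilon d \sum\limits_{I} L_{JI}\, v_I .
\end{cases}
\tag{1.3}
$$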
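As a small illustration of definition (1.2), the following sketch builds the graph Laplacian from an adjacency matrix with the sign convention used here (adjacency minus degrees). The function name `graph_laplacian` and the three-vertex path graph are illustrative assumptions, not part of the original text.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Return L = A - diag(degrees), the sign convention of (1.2)."""
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)          # gamma_I: degree of each vertex
    return A - np.diag(degrees)


# Hypothetical example: the path graph 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
L = graph_laplacian(A)
```

With this convention every row of L sums to zero, so a spatially constant state is not affected by the diffusion terms in (1.3).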
In the last 50 years, Turing patterns produced by reaction-diffusion systems on networks have been studied intensively, see e.g. [5], [9], [12], [16], [17], [18], [26], [28] and the references therein. Nowadays, there is a large amount of experimental results about the behavior of these systems, obtained mainly via computer simulations using large random networks. The investigations of the Turing patterns for large random networks have revealed that, whereas the Turing criteria remain essentially the same, as in the classical case, the properties of the emergent patterns are very different. In [16], by using a physical argument, Nakao and Mikhailov establish that Turing patterns with alternating domains cannot exist in the network case, and only several domains (clusters) occur. Multistability, that
is, coexistence of a number of different patterns with the same parameter values, is typically found and hysteresis phenomena are observed. They used a mean-field approximation to understand the Turing patterns when $d > d_c$, and proposed that the mean-field approximation is the natural framework to understand the peculiar behavior of the Turing patterns on networks. This program was carried out in [28] using p-adic analysis. In [28] we established that the p-adic reaction-diffusion system
$$
\begin{cases}
\dfrac{\partial u(x,t)}{\partial t} = f(u(x,t), v(x,t)) + \varepsilon \displaystyle\int_{K_N} \left(u(y,t) - u(x,t)\right) J_N(x,y)\,dy \\[3mm]
\dfrac{\partial v(x,t)}{\partial t} = g(u(x,t), v(x,t)) + \varepsilon d \displaystyle\int_{K_N} \left(v(y,t) - v(x,t)\right) J_N(x,y)\,dy,
\end{cases}
\tag{1.4}
$$
where $K_N$ is an open compact subset and $J_N(x,y)$ depends on $[A_{JI}]_{J,I\in V(G)}$, is a good p-adic continuous approximation of (1.3). The Turing instability criterion for (1.4) is essentially the same as in the classical case, but the qualitative description of the Turing patterns is the one given by Nakao and Mikhailov in terms of clustering and multistability. In this article we study the following p-adic reaction-diffusion system:
$$
\begin{cases}
\dfrac{\partial u}{\partial t}(x,t) = \gamma f(u,v) - D_x^{\alpha} u(x,t); \\[2mm]
\dfrac{\partial v}{\partial t}(x,t) = \gamma g(u,v) - d\,D_x^{\alpha} v(x,t),
\end{cases}
\tag{1.5}
$$
where $x \in \mathbb{Q}_p$, $t \ge 0$, and $D_x^{\alpha}$ is the Vladimirov operator. Since we want to study self-organization patterns, we need a 'zero flux boundary condition', which is $u(x,t), v(x,t) \equiv 0$ for $x \in \mathbb{Q}_p \setminus B_M$, for any $t \ge 0$, where $B_M$ is the ball of radius $p^M$ around the origin. We establish a Turing instability criterion for (1.5) and show that the Turing pattern can be described as a couple of convergent series involving the functions
$$
e^{\lambda t} p^{-\frac{r}{2}} \cos\left(2\pi\left\{p^{-1} j\,(p^r x - n)\right\}_p\right) \Omega\left(|p^r x - n|_p\right),
\qquad
e^{\lambda t} p^{-\frac{r}{2}} \sin\left(2\pi\left\{p^{-1} j\,(p^r x - n)\right\}_p\right) \Omega\left(|p^r x - n|_p\right),
$$
see Theorem 1. The Turing patterns attached to (1.5) are not classical patterns consisting of alternating domains. Instead, a Turing pattern consists of several domains (clusters), each of them supporting a different pattern but with the same parameter values (multistability). It is important to mention that the spectrum of $D_x^{\alpha}$ consists of a sequence of positive eigenvalues with infinite multiplicity, while the spectrum of the operators considered in [28, Theorem 10.1] consists of a finite number of non-negative eigenvalues with finite multiplicities. In Sect. 5 we construct a discretization of system (1.5) of type (1.3). In this case, however, the matrix of the discrete operator is not related to the adjacency matrix of a graph, see (5.1).
2 p-Adic Analysis: Essential Ideas In this section we collect some basic results about p-adic analysis that will be used in the article. For an in-depth review of the p-adic analysis the reader may consult [1], [21], [24].
2.1 The Field of p-adic Numbers
Throughout this article p will denote a prime number. The field of p-adic numbers $\mathbb{Q}_p$ is defined as the completion of the field of rational numbers $\mathbb{Q}$ with respect to the p-adic norm $|\cdot|_p$, which is defined as
$$
|x|_p =
\begin{cases}
0 & \text{if } x = 0 \\
p^{-\gamma} & \text{if } x = p^{\gamma}\dfrac{a}{b},
\end{cases}
$$
where a and b are integers coprime with p. The integer $\gamma := ord(x)$, with $ord(0) := +\infty$, is called the p-adic order of x. Any p-adic number $x \neq 0$ has a unique expansion of the form
$$
x = p^{ord(x)} \sum_{j=0}^{\infty} x_j p^j,
$$
where $x_j \in \{0, \ldots, p-1\}$ and $x_0 \neq 0$. By using this expansion, we define the fractional part of $x \in \mathbb{Q}_p$, denoted $\{x\}_p$, as the rational number
$$
\{x\}_p =
\begin{cases}
0 & \text{if } x = 0 \text{ or } ord(x) \ge 0 \\[1mm]
p^{ord(x)} \sum\limits_{j=0}^{-ord(x)-1} x_j p^j & \text{if } ord(x) < 0.
\end{cases}
$$
In addition, any non-zero p-adic number can be represented uniquely as $x = p^{ord(x)} ac(x)$, where $ac(x) = \sum_{j=0}^{\infty} x_j p^j$, $x_0 \neq 0$, is called the angular component of x. Notice that $|ac(x)|_p = 1$.
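The definitions above can be tried out on rational numbers, which form a dense subset of $\mathbb{Q}_p$. The following is a minimal sketch under that restriction; the helper names `ord_p`, `norm_p` and `frac_p` are illustrative assumptions, and the fractional part is computed through the equivalent characterization that $\{x\}_p$ is the unique rational of the form $m/p^k$ in $[0,1)$ with $x - m/p^k \in \mathbb{Z}_p$.

```python
from fractions import Fraction


def ord_p(x, p):
    """p-adic order of a non-zero rational x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("ord(0) is +infinity")
    num, den, k = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return k


def norm_p(x, p):
    """p-adic norm |x|_p = p^(-ord(x)), with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    return Fraction(p) ** (-ord_p(x, p))


def frac_p(x, p):
    """p-adic fractional part {x}_p of a rational x."""
    x = Fraction(x)
    if x == 0 or ord_p(x, p) >= 0:
        return Fraction(0)
    pk = p ** (-ord_p(x, p))
    a = x.numerator             # coprime with p because ord(x) < 0
    b = x.denominator // pk     # part of the denominator coprime with p
    m = (a * pow(b, -1, pk)) % pk   # m = a / b modulo p^k
    return Fraction(m, pk)
```

For example, `frac_p(Fraction(7, 20), 2)` gives 3/4, since 7/20 - 3/4 = -2/5 has non-negative 2-adic order.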
For $r \in \mathbb{Z}$, denote by $B_r(a) = \{x \in \mathbb{Q}_p;\ |x - a|_p \le p^r\}$ the ball of radius $p^r$ with center at $a \in \mathbb{Q}_p$, and take $B_r(0) := B_r$. The ball $B_0$ equals $\mathbb{Z}_p$, the ring of p-adic integers of $\mathbb{Q}_p$. We also denote by $S_r(a) = \{x \in \mathbb{Q}_p;\ |x - a|_p = p^r\}$ the sphere of radius $p^r$ with center at $a \in \mathbb{Q}_p$, and take $S_r(0) := S_r$. We notice that $S_0 = \mathbb{Z}_p^{\times}$ (the group of units of $\mathbb{Z}_p$). The balls and spheres are both open and closed subsets in $\mathbb{Q}_p$. In addition, two balls in $\mathbb{Q}_p$ are either disjoint or one is contained in the other. The metric space $(\mathbb{Q}_p, |\cdot|_p)$ is a complete ultrametric space. As a topological space $(\mathbb{Q}_p, |\cdot|_p)$ is totally disconnected, i.e. the only connected subsets of $\mathbb{Q}_p$ are the empty set and the points. In addition, $\mathbb{Q}_p$ is homeomorphic to a Cantor-like subset of the real line, see e.g. [1], [24]. A subset of $\mathbb{Q}_p$ is compact if and only if it is closed and bounded in $\mathbb{Q}_p$, see e.g. [24, Section 1.3], or [1, Section 1.8]. The balls and spheres are compact subsets. Thus $(\mathbb{Q}_p, |\cdot|_p)$ is a locally compact topological space.
Notation 1 We use $\Omega(p^{-r}|x - a|_p)$ to denote the characteristic function of the ball $B_r(a) = a + p^{-r}\mathbb{Z}_p$. For more general sets, we use the notation $1_A$ for the characteristic function of a set A.
2.2 Some Function Spaces
A complex-valued function ϕ defined on $\mathbb{Q}_p$ is called locally constant if for any $x \in \mathbb{Q}_p$ there exists an integer $l(x) \in \mathbb{Z}$ such that
$$
\varphi(x + x') = \varphi(x) \quad \text{for } x' \in B_{l(x)}.
\tag{2.1}
$$
A function $\varphi : \mathbb{Q}_p \to \mathbb{C}$ is called a Bruhat-Schwartz function (or a test function) if it is locally constant with compact support. In this case, we can take $l = l(\varphi)$ in (2.1) independent of x; the largest such integer is called the parameter of local constancy of ϕ. The C-vector space of Bruhat-Schwartz functions is denoted by $\mathcal{D} := \mathcal{D}(\mathbb{Q}_p)$. We denote by $\mathcal{D}_{\mathbb{R}} := \mathcal{D}_{\mathbb{R}}(\mathbb{Q}_p)$ the R-vector space of test functions. Since $(\mathbb{Q}_p, +)$ is a locally compact topological group, there exists a Borel measure dx, called the Haar measure of $(\mathbb{Q}_p, +)$, unique up to multiplication by a positive constant, such that $\int_U dx > 0$ for every non-empty open set $U \subset \mathbb{Q}_p$, and $\int_{E+z} dx = \int_E dx$ for every Borel set $E \subset \mathbb{Q}_p$, see e.g. [7, Chapter XI]. If we normalize this measure by the condition $\int_{\mathbb{Z}_p} dx = 1$, then dx is unique. From now on we denote by dx the normalized Haar measure of $(\mathbb{Q}_p, +)$. Given $\rho \in [1, \infty)$, we denote by $L^{\rho} := L^{\rho}(\mathbb{Q}_p) := L^{\rho}(\mathbb{Q}_p, dx)$ the C-vector space of all the complex-valued functions g satisfying $\int_{\mathbb{Q}_p} |g(x)|^{\rho}\,dx < \infty$, and $L^{\infty} := L^{\infty}(\mathbb{Q}_p) = L^{\infty}(\mathbb{Q}_p, dx)$ denotes the C-vector space of all the complex-valued functions g such that the essential supremum of |g| is bounded. The corresponding R-vector spaces are denoted as $L^{\rho}_{\mathbb{R}} := L^{\rho}_{\mathbb{R}}(\mathbb{Q}_p) = L^{\rho}_{\mathbb{R}}(\mathbb{Q}_p, dx)$, $1 \le \rho \le \infty$.
2.3 Fourier Transform
Set $\chi_p(y) = \exp(2\pi i \{y\}_p)$ for $y \in \mathbb{Q}_p$. The map $\chi_p(\cdot)$ is an additive character on $\mathbb{Q}_p$, i.e. a continuous map from $(\mathbb{Q}_p, +)$ into S (the unit circle considered as a multiplicative group) satisfying $\chi_p(x_0 + x_1) = \chi_p(x_0)\chi_p(x_1)$, $x_0, x_1 \in \mathbb{Q}_p$. The additive characters of $\mathbb{Q}_p$ form an Abelian group which is isomorphic to $(\mathbb{Q}_p, +)$. The isomorphism is given by $\xi \mapsto \chi_p(\xi x)$, see e.g. [1, Section 2.3]. If $f \in L^1$, its Fourier transform is defined by
$$
(\mathcal{F}f)(\xi) = \int_{\mathbb{Q}_p} \chi_p(\xi x) f(x)\,dx, \quad \text{for } \xi \in \mathbb{Q}_p.
$$
We will also use the notation $\mathcal{F}_{x\to\xi} f$ and $\widehat{f}$ for the Fourier transform of f. The Fourier transform is a linear isomorphism from $\mathcal{D}$ onto itself satisfying
$$
(\mathcal{F}(\mathcal{F}f))(\xi) = f(-\xi),
\tag{2.2}
$$
for every $f \in \mathcal{D}$, see e.g. [1, Section 4.8]. If $f \in L^2$, its Fourier transform is defined as
$$
(\mathcal{F}f)(\xi) = \lim_{k\to\infty} \int_{|x|_p \le p^k} \chi_p(\xi x) f(x)\,dx, \quad \text{for } \xi \in \mathbb{Q}_p,
$$
where the limit is taken in $L^2$. We recall that the Fourier transform is unitary on $L^2$, i.e. $\|f\|_{L^2} = \|\mathcal{F}f\|_{L^2}$ for $f \in L^2$, and that (2.2) is also valid in $L^2$, see e.g. [21, Chapter III, Section 2].
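The character property $\chi_p(x_0 + x_1) = \chi_p(x_0)\chi_p(x_1)$ can be checked numerically on rational arguments. The sketch below is only an illustration under that restriction; the names `frac_p` and `chi_p` are assumptions introduced here, and `frac_p` computes $\{x\}_p$ as the unique $m/p^k \in [0,1)$ with $x - m/p^k \in \mathbb{Z}_p$.

```python
import cmath
from fractions import Fraction


def frac_p(x, p):
    """p-adic fractional part {x}_p of a rational x."""
    x = Fraction(x)
    b, pk = x.denominator, 1
    while b % p == 0:           # split the denominator as b * p^k
        b //= p
        pk *= p
    if pk == 1:                 # x is a p-adic integer
        return Fraction(0)
    m = (x.numerator * pow(b, -1, pk)) % pk
    return Fraction(m, pk)


def chi_p(x, p):
    """Additive character chi_p(x) = exp(2 pi i {x}_p) on rationals."""
    return cmath.exp(2j * cmath.pi * float(frac_p(x, p)))


# Hypothetical sample points for p = 2.
x0, x1 = Fraction(7, 20), Fraction(5, 8)
lhs = chi_p(x0 + x1, 2)
rhs = chi_p(x0, 2) * chi_p(x1, 2)
```

Here `lhs` and `rhs` agree up to floating-point rounding, and `chi_p` equals 1 on every rational whose denominator is coprime with p.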
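2.4 The Vladimirov Operator
The Vladimirov pseudodifferential operator $D^{\alpha}$, $\alpha > 0$, is defined as
$$
D^{\alpha}\varphi(x) = \frac{1 - p^{\alpha}}{1 - p^{-\alpha-1}} \int_{\mathbb{Q}_p} |y|_p^{-\alpha-1} \left(\varphi(x - y) - \varphi(x)\right) dy, \quad \text{for } \varphi \in \mathcal{D}.
\tag{2.3}
$$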
The right-hand side of (2.3) makes sense for a wider class of functions, for example, for locally constant functions ϕ satisfying
$$
\int_{|x|_p \ge 1} |x|_p^{-\alpha-1} |\varphi(x)|\,dx < \infty.
$$
Consequently, the constant functions are contained in the domain of $D^{\alpha}$, and $D^{\alpha}\varphi = 0$ for any constant function ϕ. On the other hand,
$$
D^{\alpha}\varphi(x) = \mathcal{F}^{-1}_{\xi\to x}\left(|\xi|_p^{\alpha}\, \mathcal{F}_{x\to\xi}\varphi\right), \quad \text{for } \varphi \in \mathcal{D}.
\tag{2.4}
$$
Finally, when the Vladimirov operator acts on functions depending on two variables, $(x, t) \in \mathbb{Q}_p \times \mathbb{R}_+$, we will use the notation $D_x^{\alpha} u(x,t)$ instead of $D^{\alpha} u(x,t)$.
2.4.1 The Spectrum of the Operator $D^{\alpha}$
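The set of functions $\{\Psi_{rnj}\}$ defined as
$$
\Psi_{rnj}(x) = p^{-\frac{r}{2}} \chi_p\left(p^{-1} j\,(p^r x - n)\right) \Omega\left(|p^r x - n|_p\right),
\tag{2.5}
$$
where $r \in \mathbb{Z}$, $j \in \{1, \cdots, p-1\}$, and n runs through a fixed set of representatives of $\mathbb{Q}_p/\mathbb{Z}_p$, is an orthonormal basis of $L^2(\mathbb{Q}_p)$ consisting of eigenvectors of the operator $D^{\alpha}$:
$$
D^{\alpha}\Psi_{rnj} = p^{(1-r)\alpha}\,\Psi_{rnj} \quad \text{for any } r, n, j,
\tag{2.6}
$$
[11, Theorem 3.29], [1, Theorem 9.4.2]. We set $\mathcal{L}(B_M)$ to be the C-vector space generated by the functions $\Psi_{rnj}(x)$ with support in $B_M$, which are exactly those satisfying
$$
r \le M, \quad n \in p^{r-M}\mathbb{Z}_p \cap \mathbb{Q}_p/\mathbb{Z}_p, \quad j \in \{1, \cdots, p-1\}.
\tag{2.7}
$$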
Notice that $\mathcal{L}(B_M)$ is a closed subspace of $L^2(B_M)$. Furthermore, all the functions $\Psi_{rnj}(x)$ satisfying (2.7) are orthogonal to the characteristic function of $B_M$, i.e. $\int_{B_M} \Psi_{rnj}(x)\,dx = 0$. Notice that $L^2(B_M) = \mathbb{C}\,\Omega(p^{-M}|x|_p) \oplus L^2_0(B_M)$, where
$$
L^2_0(B_M) = \left\{ f \in L^2(B_M);\ \int_{B_M} f\,dx = 0 \right\}.
$$
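2.5 Two Spectral Problems
Consider the spectral problem:
$$
\begin{cases}
D^{\alpha}\theta(x) = \kappa\,\theta(x), \quad \kappa \in \mathbb{R} \\
\theta \in L^2_{\mathbb{R}}(\mathbb{Q}_p).
\end{cases}
\tag{2.8}
$$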
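The functions $\Psi_{rnj}(x)$ are complex-valued eigenfunctions of (2.8) with eigenvalues $\kappa \in \{p^{(1-r)\alpha};\ r \in \mathbb{Z}\}$. Notice that each eigenvalue has infinite multiplicity. Therefore
$$
p^{-\frac{r}{2}} \cos\left(2\pi\left\{p^{-1} j\,(p^r x - n)\right\}_p\right) \Omega\left(|p^r x - n|_p\right),
\qquad
p^{-\frac{r}{2}} \sin\left(2\pi\left\{p^{-1} j\,(p^r x - n)\right\}_p\right) \Omega\left(|p^r x - n|_p\right),
\tag{2.9}
$$
with r, j, n as before, are real-valued eigenfunctions of (2.8) with $\kappa = p^{(1-r)\alpha}$. Notice that the eigenfunctions are not completely determined by the 'wavenumber' κ. The functions of the type (2.9) form a basis of $L^2_{\mathbb{R}}(\mathbb{Q}_p)$ (which is not necessarily orthonormal). More precisely, $f(x) = \sum_{rnj} A_{rnj}\Psi_{rnj}(x) \in L^2_{\mathbb{R}}(\mathbb{Q}_p)$ admits an expansion of the form
$$
\sum_{rnj} p^{-\frac{r}{2}}\, \mathrm{Re}(A_{rnj}) \cos\left(2\pi\left\{p^{-1} j\,(p^r x - n)\right\}_p\right) \Omega\left(|p^r x - n|_p\right)
- \sum_{rnj} p^{-\frac{r}{2}}\, \mathrm{Im}(A_{rnj}) \sin\left(2\pi\left\{p^{-1} j\,(p^r x - n)\right\}_p\right) \Omega\left(|p^r x - n|_p\right),
\tag{2.10}
$$
where
$$
\mathrm{Re}(A_{rnj}) = p^{-\frac{r}{2}} \int_{\mathbb{Q}_p} f(x) \cos\left(2\pi\left\{p^{-1} j\,(p^r x - n)\right\}_p\right) \Omega\left(|p^r x - n|_p\right) dx,
$$
$$
\mathrm{Im}(A_{rnj}) = p^{-\frac{r}{2}} \int_{\mathbb{Q}_p} f(x) \sin\left(2\pi\left\{p^{-1} j\,(p^r x - n)\right\}_p\right) \Omega\left(|p^r x - n|_p\right) dx.
$$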
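Now we consider the eigenvalue problem:
$$
\begin{cases}
D_x^{\alpha}\theta(x) = \kappa\,\theta(x), \quad \kappa \in \mathbb{R} \\
\theta \in L^2_{\mathbb{R}}(B_M) \cap \mathcal{L}(B_M), \quad M \in \mathbb{Z}.
\end{cases}
\tag{2.11}
$$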
All the functions $\mathrm{Re}\,\Psi_{rnj}(x)$ satisfying (2.7) are solutions of (2.11), with $\kappa = p^{(1-r)\alpha}$. Now, any function $f \in L^2_{\mathbb{R}}(B_M) \cap \mathcal{L}(B_M)$ admits a Fourier expansion of the form
$$
f(x) = \sum_{rnj} C_{rnj}\Psi_{rnj}(x) = \sum_{rnj} \mathrm{Re}\left(C_{rnj}\Psi_{rnj}(x)\right),
\tag{2.12}
$$
where the $\Psi_{rnj}(x)$ run through all the wavelets supported in the ball $B_M$. Therefore, (2.12) is a solution of (2.11). Finally, we notice that
$$
D^{\alpha}\,\Omega\left(p^{-M}|x|_p\right) =
\begin{cases}
\dfrac{(1 - p^{-1})\,p^{-\alpha M}}{1 - p^{-\alpha-1}} & \text{if } |x|_p \le p^M \\[3mm]
\dfrac{(1 - p^{\alpha})\,p^{M}}{1 - p^{-\alpha-1}}\, \dfrac{1}{|x|_p^{\alpha+1}} & \text{if } |x|_p > p^M
\end{cases}
\tag{2.13}
$$
implies that $\Omega(p^{-M}|x|_p)$ is not a solution of (2.11).
2.6 The p-adic Heat Equation
The evolution equation
$$
\frac{\partial u(x,t)}{\partial t} + (D_x^{\alpha} u)(x,t) = 0, \quad x \in \mathbb{Q}_p,\ t \ge 0,
\tag{2.14}
$$
is the p-adic heat equation. The analogy with the classical heat equation comes from the fact that the solution of the initial value problem attached to (2.14) with initial datum $u(x, 0) = \varphi(x) \in \mathcal{D}_{\mathbb{R}}$ is given by
$$
u(x,t) = \int_{\mathbb{Q}_p} Z(x - y, t)\,\varphi(y)\,dy,
$$
where
$$
Z(x,t) := \int_{\mathbb{Q}_p} \chi_p(-x\xi)\, e^{-t|\xi|_p^{\alpha}}\,d\xi \quad \text{for } t > 0,
$$
is the p-adic heat kernel. $Z(x,t)$ is a transition density of a time and space homogeneous Markov process which is bounded, right continuous and has no discontinuities other than jumps, see e.g. [30, Theorem 16]. We now review the classical case. For $\gamma > 0$, consider the fractional Laplacian:
$$
(-\Delta)^{\gamma}\varphi(x) = \mathcal{F}^{-1}_{\xi\to x}\left(|\xi|_{\mathbb{R}}^{\gamma}\, \mathcal{F}_{x\to\xi}\varphi\right),
$$
where ϕ is a Schwartz function and $\mathcal{F}$ denotes the Fourier transform in the group $(\mathbb{R}, +)$. The heat kernel, as a distribution, is given by
$$
Z_{\mathbb{R}}(x,t) = \mathcal{F}^{-1}_{\xi\to x}\left(\exp\left(-t|\xi|_{\mathbb{R}}^{\gamma}\right)\right), \quad \text{for } x \in \mathbb{R},\ t > 0.
$$
In order to have a probabilistic meaning, this kernel must be a probability measure; then by a theorem of Bochner, cf. [4, Theorem 3.12], $\exp(-t|\xi|_{\mathbb{R}}^{\gamma})$ is a positive definite function, and by a theorem due to Schoenberg, cf. [4, Theorem 7.8], $|\xi|_{\mathbb{R}}^{\gamma}$ is a negative definite function. Now, if $\psi : \mathbb{R} \to \mathbb{C}$ is a negative definite function, then $|\psi(x)| \le C|x|_{\mathbb{R}}^{2}$ for $|x|_{\mathbb{R}} \ge 1$, cf. [4, Corollary 7.16]. This implies that $0 \le \gamma \le 2$.
The family of ‘p-adic Laplacians’ is very large, see e.g. [11, Chapter 12], [10, Chapter 4], [24, Chapter 3, Section XVI] [30, Chapter 2] and the references therein.
3 The Model
We fix $f, g : \mathbb{R}^2 \to \mathbb{R}$ two R-analytic functions, and fix $d, \gamma, \alpha > 0$. In this article we consider the following non-Archimedean Turing system:
$$
\begin{cases}
u(\cdot,t),\ v(\cdot,t) \in L^2_{\mathbb{R}}(B_M) \cap \mathcal{L}(B_M), \text{ for } t \ge 0; \\[1mm]
u(x,0),\ v(x,0) \in L^2_{\mathbb{R}}(B_M) \cap \mathcal{L}(B_M), \quad u(x,0),\ v(x,0) \ge 0; \\[1mm]
\dfrac{\partial u}{\partial t}(x,t) = \gamma f(u,v) - D_x^{\alpha} u(x,t); \\[2mm]
\dfrac{\partial v}{\partial t}(x,t) = \gamma g(u,v) - d\,D_x^{\alpha} v(x,t), \quad x \in B_M,\ t \ge 0.
\end{cases}
\tag{3.1}
$$
Since we want to study self-organization patterns, we need a 'zero flux boundary condition', which is
$$
u(x,t),\ v(x,t) \equiv 0 \quad \text{for } x \in \mathbb{Q}_p \setminus B_M, \text{ for any } t \ge 0.
\tag{3.2}
$$
Condition (3.2) seems very strong in comparison with the classical one, but there are at least two reasons for this choice. First, since $B_M$ is open and closed, its boundary is the empty set. Second, the operator $D_x^{\alpha}$ is non-local; as a consequence, a condition like $D_x^{\alpha} u(x,t) \equiv 0$ for any $x \in B_M$, for any $t \ge 0$, is not sufficient to stop the diffusion outside of the ball $B_M$, see (2.13).
4 Turing Instability Criteria We now consider a homogeneous steady state (u0 , v0 ) of (3.1) which is a positive solution of f (u, v) = g(u, v) = 0.
(4.1)
Since u, v are real-valued functions, to study the linear stability of (u0 , v0 ) we can use the classical results, see e.g. [15, Chapter 2]. Following Turing, in the absence of any spatial variation, the homogeneous state must be linearly stable. With no spatial variation u, v satisfy
$$
\begin{cases}
\dfrac{\partial u}{\partial t}(x,t) = \gamma f(u,v) \\[2mm]
\dfrac{\partial v}{\partial t}(x,t) = \gamma g(u,v).
\end{cases}
\tag{4.2}
$$
Notice that (4.2) is a system of ordinary differential equations in $\mathbb{R}^2$. In order to linearize about the steady state $(u_0, v_0)$, we set
$$
w = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} u - u_0 \\ v - v_0 \end{bmatrix}.
\tag{4.3}
$$
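By using the fact that f and g are R-analytic, and assuming that $\|w\|_{L^{\infty}} := \max\{\|w_1\|_{L^{\infty}}, \|w_2\|_{L^{\infty}}\}$ is small, (4.2) can be approximated as
$$
\frac{\partial w}{\partial t} = \gamma J w,
\tag{4.4}
$$
where
$$
J = \begin{bmatrix} \dfrac{\partial f}{\partial u} & \dfrac{\partial f}{\partial v} \\[2mm] \dfrac{\partial g}{\partial u} & \dfrac{\partial g}{\partial v} \end{bmatrix}(u_0, v_0) =: \begin{bmatrix} f_{u_0} & f_{v_0} \\ g_{u_0} & g_{v_0} \end{bmatrix}.
$$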
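We now look for solutions of (4.4) of the form
$$
w(t; \lambda) = e^{\lambda t} w_0.
\tag{4.5}
$$
By substituting (4.5) in (4.4), the eigenvalues λ are the solutions of $\det(\gamma J - \lambda I) = 0$, i.e.
$$
\lambda^2 - \gamma\,(\mathrm{Tr}\,J)\,\lambda + \gamma^2 \det J = 0.
\tag{4.6}
$$
Consequently
$$
\lambda_{1,2} = \frac{\gamma}{2}\left[\mathrm{Tr}\,J \pm \sqrt{(\mathrm{Tr}\,J)^2 - 4\det J}\right].
$$
The steady state $w = 0$ is linearly stable if $\mathrm{Re}\,\lambda_{1,2} < 0$; this last condition is guaranteed if
$$
\mathrm{Tr}\,J < 0 \quad \text{and} \quad \det J > 0.
$$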
(4.7)
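We linearize the full reaction-ultradiffusion system about the steady state, which is $w = 0 := \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, see (4.3), to get
$$
\frac{\partial w}{\partial t}(x,t) = \gamma J w(x,t) - D\,\boldsymbol{D}_x^{\alpha} w(x,t),
\tag{4.8}
$$
where
$$
D = \begin{bmatrix} 1 & 0 \\ 0 & d \end{bmatrix}, \qquad \boldsymbol{D}_x^{\alpha} w := \begin{bmatrix} D_x^{\alpha} w_1 \\ D_x^{\alpha} w_2 \end{bmatrix}.
\tag{4.9}
$$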
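To solve the system (4.8) subject to the boundary conditions (3.2), we first determine a solution $w_{\kappa}$ of the following eigenvalue problem:
$$
\begin{cases}
\boldsymbol{D}_x^{\alpha} w_{\kappa}(x) = \kappa\, w_{\kappa}(x) \\
w_{\kappa} \in L^2_{\mathbb{R}}(B_M) \cap \mathcal{L}(B_M).
\end{cases}
\tag{4.10}
$$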
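In Sect. 2.5, the existence of solutions for the eigenvalue problem (4.10) was established. Indeed, if $w_{\kappa} = \begin{bmatrix} w_{1,\kappa} \\ w_{2,\kappa} \end{bmatrix}$, then
$$
w_{1,\kappa},\ w_{2,\kappa} \in \bigsqcup_{rnj} \left\{ p^{-\frac{r}{2}} \cos\left(2\pi\left\{p^{-1} j\,(p^r x - n)\right\}_p\right) \Omega\left(|p^r x - n|_p\right) \right\} \sqcup \bigsqcup_{rnj} \left\{ p^{-\frac{r}{2}} \sin\left(2\pi\left\{p^{-1} j\,(p^r x - n)\right\}_p\right) \Omega\left(|p^r x - n|_p\right) \right\} \sqcup \left\{ \Omega\left(p^{-M}|x|_p\right) \right\},
$$
with $r = r(\kappa)$, $j \in \{1, \cdots, p-1\}$, and $n \in \mathbb{Q}_p/\mathbb{Z}_p$ as in (2.7). The value $\frac{(1 - p^{-1})\,p^{-\alpha M}}{1 - p^{-\alpha-1}}$ is the eigenvalue corresponding to $\Omega(p^{-M}|x|_p)$, see (2.13). We now look for a solution $w(x,t)$ of (4.8) of the form $w(x,t) = \sum_{\kappa,\lambda} C_{\kappa,\lambda}\, e^{\lambda t} w_{\kappa}(x)$. The function $e^{\lambda t} w_{\kappa}(x)$ is a non-trivial solution of (4.8) if λ satisfies
$$
\det(\lambda I - \gamma J + \kappa D) = 0,
\tag{4.11}
$$
i.e.
$$
\lambda^2 + \{\kappa(1 + d) - \gamma\,\mathrm{Tr}\,J\}\,\lambda + h(\kappa) = 0,
\tag{4.12}
$$
where
$$
h(\kappa) := d\kappa^2 - \gamma\kappa\left(d f_{u_0} + g_{v_0}\right) + \gamma^2 \det J.
\tag{4.13}
$$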
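The passage from (4.11) to (4.12) and (4.13) can be cross-checked numerically. The sketch below compares the roots of the quadratic (4.12) with the eigenvalues of the matrix $\gamma J - \kappa D$; the function name `dispersion_roots` and the sample Jacobian, which satisfies $\mathrm{Tr}\,J < 0$ and $\det J > 0$, are illustrative assumptions and are not taken from the text.

```python
import numpy as np


def dispersion_roots(kappa, J, gamma, d):
    """Roots lambda of (4.12) for a given wavenumber kappa."""
    J = np.asarray(J, dtype=float)
    fu, gv = J[0, 0], J[1, 1]
    trace, det = np.trace(J), np.linalg.det(J)
    b = kappa * (1.0 + d) - gamma * trace                    # coefficient of lambda
    h = d * kappa**2 - gamma * kappa * (d * fu + gv) + gamma**2 * det  # (4.13)
    return np.roots([1.0, b, h])


# Hypothetical kinetics: Tr J = -3 < 0 and det J = 2 > 0.
J = np.array([[1.0, -2.0],
              [3.0, -4.0]])
gamma, d, kappa = 1.0, 20.0, 0.4

roots = np.sort_complex(dispersion_roots(kappa, J, gamma, d))
# (4.11) says that lambda is an eigenvalue of gamma*J - kappa*D, D = diag(1, d).
eigs = np.sort_complex(np.linalg.eigvals(gamma * J - kappa * np.diag([1.0, d])))
```

For these sample values one root has positive real part, which is the situation analyzed next.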
In the Archimedean case, see e.g. [15, Section 2.3], condition (4.11) becomes condition (4.6) when κ = 0. Since κ = 0 is not an eigenvalue of operator D αx , conditions (4.11) and (4.6) are independent.
The steady state $(u_0, v_0)$ is linearly stable if both solutions of (4.12) have $\mathrm{Re}(\lambda) < 0$. Conditions (4.7) guarantee that the steady state is stable in the absence of spatial effects, i.e. $\mathrm{Re}(\lambda|_{\kappa=0}) < 0$. For the steady state to be unstable to spatial disturbances we require $\mathrm{Re}(\lambda(\kappa)) > 0$ for some $\kappa \neq 0$. This can happen if either the coefficient of λ in (4.12) is negative, or if $h(\kappa) < 0$ for some $\kappa \neq 0$ in (4.13). Since $\mathrm{Tr}\,J < 0$ from conditions (4.7), the coefficient of λ in (4.12), which is $\kappa(1 + d) - \gamma\,\mathrm{Tr}\,J$, is positive; therefore, the only way $\mathrm{Re}(\lambda(\kappa))$ can be positive is if $h(\kappa) < 0$ for some $\kappa \neq 0$. Since $\det J > 0$ from (4.7), in order for $h(\kappa)$ to be negative, it is necessary that $d f_{u_0} + g_{v_0} > 0$. Now, since $f_{u_0} + g_{v_0} = \mathrm{Tr}\,J < 0$, necessarily $d \neq 1$, and $f_{u_0}$ and $g_{v_0}$ must have opposite signs. So an additional requirement to those of (4.7) is
$$
d \neq 1.
\tag{4.14}
$$
This is a necessary condition, but not sufficient, for $\mathrm{Re}(\lambda(\kappa)) > 0$. For $h(\kappa)$ to be negative for some nonzero κ, the minimum $h_{\min}$ of $h(\kappa)$ must be negative. An elementary calculation shows that
$$
h_{\min} = \gamma^2 \left[ \det J - \frac{\left(d f_{u_0} + g_{v_0}\right)^2}{4d} \right],
\tag{4.15}
$$
and the minimum is achieved at
$$
\kappa_{\min} = \gamma\, \frac{d f_{u_0} + g_{v_0}}{2d}.
\tag{4.16}
$$
Thus the condition $h(\kappa) < 0$ for some $\kappa \neq 0$ is
$$
\frac{\left(d f_{u_0} + g_{v_0}\right)^2}{4d} > \det J.
\tag{4.17}
$$
A bifurcation occurs when $h_{\min} = 0$, see (4.15). For fixed kinetics parameters, this condition,
$$
\det J = \frac{\left(d_c f_{u_0} + g_{v_0}\right)^2}{4 d_c},
\tag{4.18}
$$
defines a critical diffusion $d_c$, which is given as an appropriate root of
$$
f_{u_0}^2 d_c^2 + 2\left(2 f_{v_0} g_{u_0} - f_{u_0} g_{v_0}\right) d_c + g_{v_0}^2 = 0.
\tag{4.19}
$$
The model exhibits Turing instability for $d > d_c$, while for $d < d_c$ it does not. Notice that $d_c > 1$. A critical 'wavenumber' $\kappa_c$ is obtained by using (4.16):
$$
\kappa_c = \gamma\, \frac{d_c f_{u_0} + g_{v_0}}{2 d_c} = \gamma \sqrt{\frac{\det J}{d_c}}.
\tag{4.20}
$$
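When $d > d_c$, there exists a range of unstable positive wavenumbers $\kappa_1 < \kappa < \kappa_2$, where $\kappa_1, \kappa_2$ are the zeros of $h(\kappa) = 0$, see (4.13):
$$
\kappa_1 = \frac{\gamma}{2d}\left[ \left(d f_{u_0} + g_{v_0}\right) - \sqrt{\left(d f_{u_0} + g_{v_0}\right)^2 - 4d \det J} \right],
\tag{4.21}
$$
$$
\kappa_2 = \frac{\gamma}{2d}\left[ \left(d f_{u_0} + g_{v_0}\right) + \sqrt{\left(d f_{u_0} + g_{v_0}\right)^2 - 4d \det J} \right].
\tag{4.22}
$$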
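Formulas (4.19) to (4.22) are easy to evaluate for concrete kinetics. The following is a minimal sketch; the function names and the sample Jacobian entries are illustrative assumptions, and the 'appropriate root' of (4.19) is taken here to be the positive root for which $d_c f_{u_0} + g_{v_0} > 0$, in line with the necessary condition derived above.

```python
import math


def critical_values(fu, fv, gu, gv, gamma):
    """Return (d_c, kappa_c) from (4.19) and (4.20)."""
    det = fu * gv - fv * gu
    a, b, c = fu**2, 2.0 * (2.0 * fv * gu - fu * gv), gv**2
    sq = math.sqrt(b * b - 4.0 * a * c)
    roots = [(-b + s * sq) / (2.0 * a) for s in (1.0, -1.0)]
    # Assumption: keep the positive root with d*fu + gv > 0.
    dc = next(r for r in roots if r > 0 and r * fu + gv > 0)
    return dc, gamma * math.sqrt(det / dc)


def unstable_band(fu, fv, gu, gv, gamma, d):
    """Return (kappa_1, kappa_2) from (4.21)-(4.22), or None if h >= 0."""
    det = fu * gv - fv * gu
    B = d * fu + gv
    disc = B * B - 4.0 * d * det
    if B <= 0 or disc <= 0:
        return None
    r = math.sqrt(disc)
    return gamma * (B - r) / (2.0 * d), gamma * (B + r) / (2.0 * d)


# Hypothetical kinetics with Tr J = -3 < 0 and det J = 2 > 0.
fu, fv, gu, gv, gamma = 1.0, -2.0, 3.0, -4.0, 1.0
dc, kc = critical_values(fu, fv, gu, gv, gamma)
band = unstable_band(fu, fv, gu, gv, gamma, d=20.0)
```

For these sample values $d_c = 8 + 4\sqrt{3}$, and the band returned for $d = 20 > d_c$ contains $\kappa_{\min}$ from (4.16); for $d < d_c$ the function returns `None`.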
We call the function $\lambda(\kappa)$ the dispersion relation. Notice that, within the unstable range, $\mathrm{Re}\,\lambda(\kappa) > 0$ has a maximum for the wavenumber $\kappa_{\min}^{(0)}$ obtained from (4.16) with $d > d_c$. Some typical plots for $\lambda(\kappa)$ and $\mathrm{Re}\,\lambda(\kappa)$ are shown in [15, Section 2.3]; see also Figure 2.5 in [15, Section 2.3]. Then, as t increases, the behavior of $w(x,t)$ is controlled by the dominant modes, i.e. those $e^{\lambda(\kappa)t} w_{\kappa}(x)$ with $\mathrm{Re}\,\lambda(\kappa) > 0$, since the other modes tend to zero exponentially. Then
w (x, t) ∼
Arnj eλt p
κ1 0. Notice that
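$$
D_x^{\alpha}\varphi(x) = \left(D_M^{\alpha} - \lambda_M\right)\varphi(x) \quad \text{for } \varphi \in \mathcal{D}_{\mathbb{R}}(p^{-M}\mathbb{Z}_p),
$$
where
$$
D_M^{\alpha}\varphi(x) := \frac{1 - p^{\alpha}}{1 - p^{-\alpha-1}} \int_{p^{-M}\mathbb{Z}_p} \frac{\varphi(x - y) - \varphi(x)}{|y|_p^{\alpha+1}}\,dy.
$$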
The operator $D_M^{\alpha} - \lambda_M$ is a non-negative, symmetric operator on $L^2_{\mathbb{R}}(p^{-M}\mathbb{Z}_p)$. Furthermore, its closure, also denoted by $D_M^{\alpha} - \lambda_M$, is a self-adjoint operator, see [10, Section 3.3.2]. Every wavelet $\Psi_{rnj}(x)$ with support in $p^{-M}\mathbb{Z}_p$, with r, n, j satisfying (2.7), is an eigenfunction of $D_M^{\alpha}$ with eigenvalue $p^{(1-r)\alpha}$. In addition, $p^{\frac{M}{2}}\,\Omega(p^{-M}|x|_p)$ is also an eigenfunction of $D_M^{\alpha}$ with eigenvalue $\lambda_M p^{\frac{M}{2}}$. As a discretization of $D_M^{\alpha} - \lambda_M$, we pick its restriction to $\mathcal{D}_M^{-L}$, which is denoted as $D_{L,M}^{\alpha} - \lambda_M$. Since $\mathcal{D}_M^{-L}$ is a finite-dimensional vector space, $D_{L,M}^{\alpha} - \lambda_M$ is represented by a matrix $A_{L,M}^{\alpha}$.
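5.3 Computation of the Matrix $A^{\alpha}_{L,M}$

In order to compute the matrix $A^{\alpha}_{L,M}$, we first compute
\[ \begin{aligned}
D^{\alpha}_{M}\, p^{\frac{L}{2}}\,\Omega\!\left(p^{L}|x-I|_p\right)
&= p^{\frac{L}{2}}\,\frac{1-p^{\alpha}}{1-p^{-\alpha-1}} \int_{p^{-M}\mathbb{Z}_p} \frac{\Omega\!\left(p^{L}|x-y-I|_p\right) - \Omega\!\left(p^{L}|x-I|_p\right)}{|y|_p^{\alpha+1}}\, dy \\
&= p^{\frac{L}{2}}\,\frac{1-p^{\alpha}}{1-p^{-\alpha-1}} \sum_{J \in G_{L,M}} \int_{|y-J|_p \le p^{-L}} \frac{\Omega\!\left(p^{L}|x-y-I|_p\right) - \Omega\!\left(p^{L}|x-I|_p\right)}{|y|_p^{\alpha+1}}\, dy \\
&=: p^{\frac{L}{2}}\,\frac{1-p^{\alpha}}{1-p^{-\alpha-1}} \sum_{J \in G_{L,M}} I_J(x, I, L).
\end{aligned} \]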
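We now compute the integrals $I_J(x, I, L)$. We consider first the case $J \neq 0$. By using that
\[ \Omega\!\left(p^{L}|x-I|_p\right) * \Omega\!\left(p^{L}|x-J|_p\right) = p^{-L}\,\Omega\!\left(p^{L}|x-(I+J)|_p\right), \]
we have
\[ \begin{aligned}
I_J(x, I, L) &= \frac{1}{|J|_p^{\alpha+1}} \left\{ \Omega\!\left(p^{L}|x-I|_p\right) * \Omega\!\left(p^{L}|x-J|_p\right) - p^{-L}\,\Omega\!\left(p^{L}|x-I|_p\right) \right\} \\
&= \frac{p^{-L}}{|J|_p^{\alpha+1}} \left\{ \Omega\!\left(p^{L}|x-(I+J)|_p\right) - \Omega\!\left(p^{L}|x-I|_p\right) \right\}.
\end{aligned} \]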
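Now in the case $J = 0$, we have $I_J(x, I, L) = 0$. Indeed,
\[ I_0(x, I, L) = \int_{|y|_p \le p^{-L}} \frac{\Omega\!\left(p^{L}|x-y-I|_p\right) - \Omega\!\left(p^{L}|x-I|_p\right)}{|y|_p^{\alpha+1}}\, dy. \]
If $\Omega\!\left(p^{L}|x-I|_p\right) = 1$, then $x \in I + p^{L}\mathbb{Z}_p$, and since $y \in p^{L}\mathbb{Z}_p$ we have $x - y - I \in p^{L}\mathbb{Z}_p$, which implies that $\Omega\!\left(p^{L}|x-y-I|_p\right) = 1$. Now, if $\Omega\!\left(p^{L}|x-I|_p\right) = 0$, i.e. if $x \in K + p^{L}\mathbb{Z}_p$ for some $K \neq I$, then by using that $p^{L}\mathbb{Z}_p \cap \left(K - I + p^{L}\mathbb{Z}_p\right) = \emptyset$, we have
\[ I_0(x, I, L) = \int_{p^{L}\mathbb{Z}_p \cap \left(K - I + p^{L}\mathbb{Z}_p\right)} \frac{1}{|y|_p^{\alpha+1}}\, dy = 0. \]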
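Now, we use the fact that $G_{L,M}$ is an additive group to conclude that
\[ \begin{aligned}
D^{\alpha}_{M}\, p^{\frac{L}{2}}\,\Omega\!\left(p^{L}|x-I|_p\right) ={}& p^{-\frac{L}{2}}\,\frac{1-p^{\alpha}}{1-p^{-\alpha-1}} \sum_{\substack{K \in G_{L,M} \\ K \neq I}} \frac{\Omega\!\left(p^{L}|x-K|_p\right)}{|K-I|_p^{\alpha+1}} \\
& - p^{-\frac{L}{2}}\,\frac{1-p^{\alpha}}{1-p^{-\alpha-1}} \left( \sum_{\substack{K \in G_{L,M} \\ K \neq I}} \frac{1}{|K-I|_p^{\alpha+1}} \right) \Omega\!\left(p^{L}|x-I|_p\right),
\end{aligned} \]
and the entries of the matrix $A^{\alpha}_{L,M} = \left[A^{\alpha}_{K,I}\right]_{K,I \in G_{L,M}}$ are given as
\[ A^{\alpha}_{K,I} = \begin{cases}
p^{-\frac{L}{2}}\,\dfrac{1-p^{\alpha}}{1-p^{-\alpha-1}}\,\dfrac{1}{|K-I|_p^{\alpha+1}} & \text{if } K \neq I, \\[2ex]
-\,p^{-\frac{L}{2}}\,\dfrac{1-p^{\alpha}}{1-p^{-\alpha-1}} \displaystyle\sum_{\substack{J \in G_{L,M} \\ J \neq I}} \dfrac{1}{|J-I|_p^{\alpha+1}} - \lambda_M & \text{if } K = I.
\end{cases} \tag{5.1} \]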
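The entries (5.1) can be tabulated for small parameters. The sketch below assumes that $G_{L,M}$ is the set of representatives $I = k\,p^{-M}$ with integers $0 \le k < p^{L+M}$ (the definition of $G_{L,M}$ is not reproduced in this section), so that $|K - I|_p = p^{M - v_p(k - i)}$. The value of $\lambda_M$ is passed in as a parameter, and the function names are hypothetical.

```python
def padic_norm_scaled(k, p, M):
    """p-adic norm of K = k * p**(-M) for a nonzero integer k: p**(M - v_p(k))."""
    k = abs(k)
    v = 0
    while k % p == 0:
        k //= p
        v += 1
    return float(p) ** (M - v)


def build_matrix(p, alpha, L, M, lambda_M):
    """Matrix with the entries (5.1), indexed by the integers k = p**M * K."""
    n = p ** (L + M)  # assumed number of elements of G_{L,M}
    coef = (1.0 - p ** alpha) / (1.0 - p ** (-alpha - 1.0))
    scale = p ** (-L / 2.0)
    A = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for i in range(n):
            if k != i:
                dist = padic_norm_scaled(k - i, p, M)
                A[k][i] = scale * coef / dist ** (alpha + 1.0)
        # Diagonal entry: minus the off-diagonal sum, minus lambda_M.
        A[k][k] = -sum(A[k][i] for i in range(n) if i != k) - lambda_M
    return A


# Hypothetical parameters, chosen only to keep the matrix small (9 x 9).
A_SMALL = build_matrix(p=3, alpha=1.5, L=1, M=1, lambda_M=0.25)
```

With these conventions the matrix is symmetric and every row sums to $-\lambda_M$, which gives a quick sanity check on an implementation.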
5.4 Discretization of the p-adic Turing System

In the discretization of the Turing system (3.1), we use the following approximation for the functions $u(x,t)$, $v(x,t)$:
\[ u^{(L)}(x,t) = \sum_{I \in G_{L,M}} u^{(L)}(I,t)\,\Omega\!\left(p^{L}|x-I|_p\right) \tag{5.2} \]
and
\[ v^{(L)}(x,t) = \sum_{I \in G_{L,M}} v^{(L)}(I,t)\,\Omega\!\left(p^{L}|x-I|_p\right), \tag{5.3} \]
where $u^{(L)}(I,\cdot),\, v^{(L)}(I,\cdot) \in C^{1}([0,T])$ for some fixed positive $T$. Furthermore, we set
\[ \left[u^{(L)}(x,t)\right] = \left[u^{(L)}(I,t)\right]_{I \in G_{L,M}}, \qquad \left[v^{(L)}(x,t)\right] = \left[v^{(L)}(I,t)\right]_{I \in G_{L,M}}. \]
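We assume that $\operatorname{range} u^{(L)}(x,t) \times \operatorname{range} v^{(L)}(x,t)$ is contained in the domains of convergence of $f$, $g$. Then
\[ f\!\left( \sum_{I \in G_{L,M}} u^{(L)}(I,t)\,\Omega\!\left(p^{L}|x-I|_p\right),\ \sum_{J \in G_{L,M}} v^{(L)}(J,t)\,\Omega\!\left(p^{L}|x-J|_p\right) \right) = \sum_{I \in G_{L,M}} f\!\left(u^{(L)}(I,t),\, v^{(L)}(I,t)\right) \Omega\!\left(p^{L}|x-I|_p\right). \]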
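A similar formula holds for the function $g$. Then the discretization of the p-adic Turing system has the form
\[ \begin{aligned}
\frac{\partial}{\partial t}\left[u^{(L)}(I,t)\right]_{I \in G_{L,M}} &= \gamma \left[f\!\left(u^{(L)}(I,t),\, v^{(L)}(I,t)\right)\right]_{I \in G_{L,M}} - A^{\alpha}_{L,M}\left[u^{(L)}(I,t)\right]_{I \in G_{L,M}}, \\
\frac{\partial}{\partial t}\left[v^{(L)}(I,t)\right]_{I \in G_{L,M}} &= \gamma \left[g\!\left(u^{(L)}(I,t),\, v^{(L)}(I,t)\right)\right]_{I \in G_{L,M}} - d\,A^{\alpha}_{L,M}\left[v^{(L)}(I,t)\right]_{I \in G_{L,M}},
\end{aligned} \]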
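where $I \in G_{L,M}$. Equivalently,
\[ \begin{aligned}
\frac{\partial}{\partial t} u^{(L)}(I,t) ={}& \gamma f\!\left(u^{(L)}(I,t),\, v^{(L)}(I,t)\right) - p^{-\frac{L}{2}}\,\frac{1-p^{\alpha}}{1-p^{-\alpha-1}} \sum_{J \neq I} \frac{u^{(L)}(J,t)}{|J-I|_p^{\alpha+1}} \\
& + p^{-\frac{L}{2}} \left( \frac{\left(1-p^{-1}\right) p^{-\alpha M + \frac{L}{2}}}{1-p^{-\alpha}} + \frac{1-p^{\alpha}}{1-p^{-\alpha-1}} \sum_{J \neq I} \frac{1}{|J-I|_p^{\alpha+1}} \right) u^{(L)}(I,t), \\
\frac{\partial}{\partial t} v^{(L)}(I,t) ={}& \gamma g\!\left(u^{(L)}(I,t),\, v^{(L)}(I,t)\right) - d\,p^{-\frac{L}{2}}\,\frac{1-p^{\alpha}}{1-p^{-\alpha-1}} \sum_{J \neq I} \frac{v^{(L)}(J,t)}{|J-I|_p^{\alpha+1}} \\
& + d\,p^{-\frac{L}{2}} \left( \frac{\left(1-p^{-1}\right) p^{-\alpha M + \frac{L}{2}}}{1-p^{-\alpha}} + \frac{1-p^{\alpha}}{1-p^{-\alpha-1}} \sum_{J \neq I} \frac{1}{|J-I|_p^{\alpha+1}} \right) v^{(L)}(I,t),
\end{aligned} \]
where $I \in G_{L,M}$.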
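Once a matrix representing $A^{\alpha}_{L,M}$ is available, the discretized system is an ordinary system of ODEs that can be stepped forward in time. The following is a minimal sketch using the explicit Euler method, which is a choice made here for illustration and is not prescribed by the text. The matrix `A`, the kinetics `f`, `g` and the step size `dt` are supplied by the caller.

```python
def euler_step(u, v, A, f, g, gamma, d, dt):
    """One explicit Euler step for u' = gamma*f(u, v) - A u, v' = gamma*g(u, v) - d A v.

    u, v are lists of node values, A is a square matrix given as a list of rows.
    """
    n = len(u)
    Au = [sum(A[i][j] * u[j] for j in range(n)) for i in range(n)]
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    u_new = [u[i] + dt * (gamma * f(u[i], v[i]) - Au[i]) for i in range(n)]
    v_new = [v[i] + dt * (gamma * g(u[i], v[i]) - d * Av[i]) for i in range(n)]
    return u_new, v_new


# Toy example with a hypothetical 2 x 2 matrix and vanishing kinetics.
A_TOY = [[1.0, -1.0], [-1.0, 1.0]]
U1, V1 = euler_step([1.0, 0.0], [0.0, 2.0], A_TOY,
                    lambda a, b: 0.0, lambda a, b: 0.0,
                    gamma=1.0, d=2.0, dt=0.1)
```

Explicit Euler is only conditionally stable, so for larger matrices a smaller step or an implicit scheme may be preferable.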
References

1. S. Albeverio, A. Yu. Khrennikov, V. M. Shelkovich, Theory of p-adic distributions: linear and nonlinear models, London Mathematical Society Lecture Note Series, 370 (Cambridge University Press, 2010).
2. B. Ambrosio, M. A. Aziz-Alaoui, V. L. E. Phan, Global attractor of complex networks of reaction-diffusion systems of Fitzhugh-Nagumo type, Discrete Contin. Dyn. Syst., Ser. B 23 (2018), No. 9, 3787–3797.
3. A. Barrat, M. Barthélemy, A. Vespignani, Dynamical Processes on Complex Networks (Cambridge University Press, 2008).
4. Christian Berg, Gunnar Forst, Potential theory on locally compact abelian groups (Springer-Verlag, New York-Heidelberg, 1975).
5. S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, D.-U. Hwang, Complex networks: structure and dynamics, Phys. Rep. 424 (2006), no. 4–5, 175–308.
6. Soon-Yeong Chung, Jae-Hwang Lee, Blow-up for discrete reaction-diffusion equations on networks, Appl. Anal. Discrete Math. 9 (2015), No. 1, 103–119.
7. Paul R. Halmos, Measure Theory (D. Van Nostrand Company, 1950).
8. W. Horsthemke, K. Lam, P. K. Moore, Network topology and Turing instability in small arrays of diffusively coupled reactors, Phys. Lett. A 328 (2004), 444–451.
9. Yusuke Ide, Hirofumi Izuhara, Takuya Machida, Turing instability in reaction-diffusion models on complex networks, Physica A 457 (2016), 331–347.
10. Anatoly N. Kochubei, Pseudo-differential equations and stochastics over non-Archimedean fields (Marcel Dekker, 2001).
11. Andrei Khrennikov, Sergei Kozyrev, W. A. Zúñiga-Galindo, Ultrametric Equations and its Applications, Encyclopedia of Mathematics and its Applications (168) (Cambridge University Press, 2018).
12. M. Mocarlo Zheng, Bin Shao, Qi Ouyang, Identifying network topologies that can generate Turing pattern, J. Theor. Biol. 408 (2016), 88–96.
13. P. K. Moore, W. Horsthemke, Localized patterns in homogeneous networks of diffusively coupled reactors, Physica D 206 (2005), 121–144.
14. Delio Mugnolo, Semigroup methods for evolution equations on networks, Understanding Complex Systems (Springer, Cham, 2014).
15. J. D. Murray, Mathematical biology. II. Spatial models and biomedical applications. Third edition (Springer-Verlag, New York, 2003).
16. Hiroya Nakao, Alexander S. Mikhailov, Turing patterns in network-organized activator-inhibitor systems, Nature Physics 6 (2010), 544–550.
17. H. G. Othmer, L. E. Scriven, Instability and dynamic pattern in cellular networks, J. Theor. Biol. 32 (1971), 507–537.
18. H. G. Othmer, L. E. Scriven, Nonlinear aspects of dynamic pattern in cellular networks, J. Theor. Biol. 43 (1974), 83–112.
19. Benoît Perthame, Parabolic equations in biology. Growth, reaction, movement and diffusion, Lecture Notes on Mathematical Modelling in the Life Sciences (Springer, Cham, 2015).
20. Angela Slavova, Pietro Zecca, Complex behavior of polynomial FitzHugh-Nagumo cellular neural network model, Nonlinear Anal., Real World Appl. 8 (2007), No. 4, 1331–1340.
21. M. H. Taibleson, Fourier analysis on local fields (Princeton University Press, 1975).
22. A. M. Turing, The chemical basis of morphogenesis, Philos. Trans. Roy. Soc. London Ser. B 237 (1952), no. 641, 37–72.
23. Piet Van Mieghem, Graph spectra for complex networks (Cambridge University Press, Cambridge, 2011).
24. V. S. Vladimirov, I. V. Volovich, E. I. Zelenov, p-adic analysis and mathematical physics (World Scientific, 1994).
25. Joachim von Below, José A. Lubary, Instability of stationary solutions of reaction-diffusion-equations on graphs, Result. Math. 68 (2015), No. 1–2, 171–201.
26. Hongyong Zhao, Xuanxuan Huang, Xuebing Zhang, Turing instability and pattern formation of neural networks with reaction-diffusion terms, Nonlinear Dyn. 76 (2014), No. 1, 115–124.
27. Yusuke Ide, Hirofumi Izuhara, Takuya Machida, Turing instability in reaction-diffusion models on complex networks, Phys. A 457 (2016), 331–347.
28. W. A. Zúñiga-Galindo, Reaction-diffusion equations on complex networks and Turing patterns, via p-adic analysis, J. Math. Anal. Appl. 491 (2020), no. 1, 124239, 39 pp.
29. W. A. Zúñiga-Galindo, Non-Archimedean Reaction-Ultradiffusion Equations and Complex Hierarchic Systems, Nonlinearity 31 (2018), no. 6, 2590–2616.
30. W. A. Zúñiga-Galindo, Pseudodifferential equations over non-Archimedean spaces, Lecture Notes in Mathematics 2174 (Springer, Cham, 2016).
p-Adic Wave Equations on Finite Graphs and $T_0$-Spaces

Patrick Erik Bradley

Abstract p-adic wave and diffusion equations are studied on finite $T_0$-spaces through their Hasse diagrams, which are graphs, and on certain continuous maps called aggregation maps. First, a dictionary between graph theory and p-adic analysis is developed. Then the structure of the solutions of homogeneous wave equations on networks with and without damping is studied and compared with the classical case. Finally, the relationship between the Laplacians of the finite $T_0$-spaces occurring in aggregation maps is studied. The latter yields a relationship between solutions of such equations on a finite $T_0$-space and its aggregation to a coarser space.

Keywords p-Adic numbers · Wave equation · Finite graphs · Finite $T_0$-spaces
P. E. Bradley
Institute of Photogrammetry and Remote Sensing, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
W. A. Zúñiga-Galindo, B. Toni (eds.), Advances in Non-Archimedean Analysis and Applications, STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health, https://doi.org/10.1007/978-3-030-81976-7_8

1 Introduction

Graphs are a convenient and efficient way of representing relationships, and the idea of using graphs for modeling physical processes is quite natural when the domain of physical action can be modeled through topological relationships. An example would be a simulation of waves or heat being transferred through a building. The building elements are related to one another through their adjacency, and the physical quantity is assumed to travel only between adjacent elements. This leads to the quite recent idea of studying partial differential equations on graphs. For example, oscillations of networks are the topic of [6], where certain nodes are identified as soft nodes, damping of which leads to unbounded resonance with catastrophic effects.

A seemingly unrelated concept is given by the p-adic numbers [7]. These are quite attractive from a computational point of view, because of their inherent hierarchical structure [5]. This leads to the idea of p-adic approaches to partial differential equations. And indeed there exist p-adic analogues in the form of pseudodifferential equations [8, 12, 16]. These are interesting for many applications. For example, in [10, 14], p-adic analogues of the wave equation are studied. Also, similarly to the classical case, the spectrum of p-adic pseudodifferential operators can, under certain conditions, distinguish the spaces on which they act: this was found in the case of spaces parametrising p-adic Riemann surfaces (also known as Mumford curves) corresponding to certain types of finite graphs [2].

The combination of these two ideas is very recent, i.e. to consider the nodes of a graph as belonging to a p-adic space, which leads to p-adic pseudodifferential equations describing reaction/diffusion processes on any finite graph [15]. This approach generalises the seemingly more natural application of p-adic analysis to evolutionary processes on trees, which are cited in that article.

The intended application of the present work brings us to a third idea, namely the simulation of processes on finite topological spaces coming from partial orderings, so-called finite $T_0$-spaces [1]. In some applications, these topologies represent the boundary relationships of geometric models of e.g. buildings or cities. In order to enable a distributed processing of such simulations, a p-adic Gray-Hilbert index was created in [3], which yields an efficient data-driven method for accessing data with the help of space-filling curves in any finite dimension. In order to become even more efficient, the simulations should be able to run across varying levels of detail. As a partial ordering can be encoded with a minimal graph structure (the Hasse diagram), the door to p-adic methods for simulating processes on finite $T_0$-spaces, and on certain continuous maps between such spaces, is open.
The latter will represent the aggregation maps between different levels of detail of the geometric models in question. The following section sets up a dictionary between graph theory and p-adic analysis, and reviews some needed results from p-adic analysis and its application to graphs. Section 3 is a study of the structure of the solutions to homogeneous p-adic wave equations on undirected graphs, with and without damping. The results are then compared with the classical case. Section 4 is first devoted to a brief introduction to finite T0 -spaces, and then to processes induced by a p-adic Laplacian on a finite T0 -space, together with an aggregation map to another finite T0 -space. This is motivated by the desire to have a simulation run across different levels of detail. The result here is an explicit relationship between the two Laplacians. A corollary is that solutions to homogeneous wave equations break into solutions of corresponding equations on aggregated spaces plus a solution consisting in noise.
2 A Dictionary Between Graph Theory and p-Adic Analysis

Here, we develop a dictionary which brings graph-theoretic notions into p-adic analysis more fully than before, and also review some already known p-adic analytic facts about graphs. Let $G$ be a finite graph with $n$ vertices and vertex set $V(G)$. The edge set of $G$ will be denoted as $E(G)$. Zúñiga-Galindo's method in [15] for enabling the application of p-adic methods to graph theory is to fix an embedding $V(G) \to \mathbb{Q}_p$ and fix $N$ such that each vertex $I \in V(G)$ lies in a distinct p-adic ball of radius $r = p^{-N}$,
\[ B_r[I] = \left\{ x \in \mathbb{Q}_p : |x - I|_p \le r \right\}. \]
Let
\[ K_N = \bigcup_{I \in V(G)} B_{p^{-N}}[I] \subset \mathbb{Q}_p. \]
The function space of interest is the Hilbert space $L^2(K_N, \mathbb{C})$ consisting of complex-valued functions on $K_N$, and we will use the relationship between elements of $\mathbb{C}^n$ and those of this Hilbert space. The indicator function of $B_{p^{-N}}[I]$ will be denoted as $\Omega\!\left(p^{N}|x-I|_p\right)$.

Lemma 2.1 There is an injective linear map
\[ \mathbb{C}^n \to L^2(K_N, \mathbb{C}), \qquad (u_I)_{I \in V(G)} \mapsto \left( x \mapsto u(x) = \sum_{I \in V(G)} u_I\,\Omega\!\left(p^{N}|x-I|_p\right) \right). \]
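Proof This is clear, including well-definedness (as the sum is finite) and injectivity (as the indicator functions are linearly independent).

The standard inner product on $L^2(K_N, \mathbb{C})$ will be denoted as $\langle \cdot, \cdot \rangle_{L^2}$, whereas the standard inner product on $\mathbb{C}^n$ will be written as $\langle \cdot, \cdot \rangle_{\mathbb{C}^n}$. The corresponding norms will be written as $\|\cdot\|_2$ for $L^2(K_N, \mathbb{C})$, and $|\cdot|$ for $\mathbb{C}^n$, respectively. Let $A = (A_{IJ}) \in \mathbb{C}^{n \times n}$. Then the function
\[ K_N \times K_N \to \mathbb{C}, \qquad (x, y) \mapsto A(x, y) \]
with
\[ A(x, y) = p^{N} \sum_{I \in V(G)} \sum_{J \in V(G)} A_{IJ}\,\Omega\!\left(p^{N}|x-I|_p\right) \Omega\!\left(p^{N}|y-J|_p\right) \]
gives rise to a linear operator $\mathcal{A}$ on $L^2(K_N, \mathbb{C})$ associated with $A$. The space of bounded linear operators on a Hilbert space $X$ will be denoted as $B(X)$.

Proposition 2.2 There is an injective linear and multiplicative homomorphism
\[ b : \mathbb{C}^{n \times n} \to B\!\left(L^2(K_N, \mathbb{C})\right), \qquad A \mapsto \mathcal{A} \]
such that
\[ \langle \mathcal{A}u, v \rangle_{L^2} = p^{-N} \langle Au, v \rangle_{\mathbb{C}^n} \tag{1} \]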
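for $u, v \in \mathbb{C}^n$.

Proof Well-definedness. For $u \in C(K_N, \mathbb{C})$, where $C(K_N, \mathbb{C})$ is the space of continuous complex-valued functions on $K_N$, we have
\[ \|\mathcal{A}u\|_2 \le \|u\|_2\, \|A\|_F, \]
where $\|\cdot\|_F$ is the Frobenius norm for matrices. Namely,
\[ \begin{aligned}
\|\mathcal{A}u\|_2^2 &= \int_{K_N} \left| \int_{K_N} A(x, y)\, u(y)\, dy \right|^2 dx \\
&\le p^{2N} \sum_{I,J} |A_{IJ}|^2 \int_{K_N} \int_{K_N} \Omega\!\left(p^{N}|x-I|_p\right) \Omega\!\left(p^{N}|y-J|_p\right) |u(y)|^2\, dy\, dx \\
&\le p^{2N} \sum_{I,J} |A_{IJ}|^2\, p^{-2N}\, \|u\|_2^2 \\
&= \|A\|_F^2\, \|u\|_2^2.
\end{aligned} \]
By density of the space $\mathcal{D}(K_N, \mathbb{C})$ of test functions supported on $K_N$ in $L^2(K_N, \mathbb{C})$, it now follows that the mapping is well-defined.

Linearity. The linearity of $b$ is clear by definition.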
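Multiplicativity. Let $A, B \in \mathbb{C}^{n \times n}$. We have
\[ \begin{aligned}
\mathcal{A}\mathcal{B}u(x) &= \int_{K_N} A(x, y) \int_{K_N} B(y, z)\, u(z)\, dz\, dy \\
&= \int_{K_N} \left( \int_{K_N} A(x, y)\, B(y, z)\, dy \right) u(z)\, dz.
\end{aligned} \]
Now, a calculation shows that
\[ \int_{K_N} A(x, y)\, B(y, z)\, dy = p^{N} \sum_{I,K} \left( \sum_{J} A_{IJ} B_{JK} \right) \Omega\!\left(p^{N}|x-I|_p\right) \Omega\!\left(p^{N}|z-K|_p\right). \]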
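Hence, it follows that
\[ b(A)\, b(B)\, u = b(AB)\, u, \]
from which the multiplicativity of $b$ follows.

Injectivity. Let $b(A) = 0$. Thus, for $u = \Omega\!\left(p^{N}|x-I|_p\right)$, it follows that
\[ \begin{aligned}
0 = \mathcal{A}u(x) &= p^{N} \sum_{J,K} A_{JK}\,\Omega\!\left(p^{N}|x-J|_p\right) \int_{K_N} \Omega\!\left(p^{N}|y-K|_p\right) \Omega\!\left(p^{N}|y-I|_p\right) dy \\
&= \sum_{J} A_{JI}\,\Omega\!\left(p^{N}|x-J|_p\right).
\end{aligned} \]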
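Hence, as the indicator functions are linearly independent, we have $A_{JI} = 0$ for $J \in V(G)$. As $I \in V(G)$ is arbitrary, it follows that $A = 0$. Hence $b$ is injective.

Property (1). We have for $u, v \in \mathbb{C}^n$ that
\[ \begin{aligned}
\langle \mathcal{A}u, v \rangle_{L^2} &= \int_{K_N} \int_{K_N} A(x, y)\, u(y)\, dy\ \overline{v(x)}\, dx \\
&= p^{N} \sum_{I,J} A_{IJ} \int_{K_N} \Omega\!\left(p^{N}|y-J|_p\right) u(y)\, dy \int_{K_N} \Omega\!\left(p^{N}|x-I|_p\right) \overline{v(x)}\, dx \\
&= p^{N} \sum_{I,J} A_{IJ}\, u_J\, \overline{v_I}\, p^{-2N} \\
&= p^{-N} \langle Au, v \rangle_{\mathbb{C}^n},
\end{aligned} \]
as asserted.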
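Property (1) can be checked numerically on the image of Lemma 2.1, where functions are constant on the balls and every ball has Haar measure $p^{-N}$. The sketch below represents such a function by its list of values on the balls. The helper names and the sample data are hypothetical.

```python
def mat_vec(A, u):
    """Ordinary matrix-vector product A u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]


def apply_operator(A, u, p, N):
    """Ball values of the integral operator with kernel p^N * A_IJ on B[I] x B[J].

    Integrating over a ball contributes its Haar measure p^(-N).
    """
    vol = float(p) ** (-N)
    return [sum((p ** N) * a * x * vol for a, x in zip(row, u)) for row in A]


def l2_inner(u, v, p, N):
    """Inner product in L^2(K_N) of two functions given by their ball values."""
    vol = float(p) ** (-N)
    return sum(x * complex(y).conjugate() for x, y in zip(u, v)) * vol


def cn_inner(u, v):
    """Standard inner product on C^n."""
    return sum(x * complex(y).conjugate() for x, y in zip(u, v))


# Hypothetical sample data.
P, N_LEVEL = 3, 2
A_SAMPLE = [[0, 1 + 2j], [3, -1j]]
U_SAMPLE = [1 + 1j, 2]
V_SAMPLE = [0.5, -1j]
LHS = l2_inner(apply_operator(A_SAMPLE, U_SAMPLE, P, N_LEVEL), V_SAMPLE, P, N_LEVEL)
RHS = float(P) ** (-N_LEVEL) * cn_inner(mat_vec(A_SAMPLE, U_SAMPLE), V_SAMPLE)
```

Both sides agree up to rounding, reflecting that the factor $p^{N}$ in the kernel cancels one of the two ball volumes.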
Remark 2.3 Property (1) is consistent with the fact that
\[ \|u\|_2^2 = \langle u, u \rangle_{L^2} = p^{-N} \langle u, u \rangle_{\mathbb{C}^n} = p^{-N} |u|^2, \]
which follows from a simple calculation. The consistency follows by taking $A$ as the identity matrix.

For the following remark, we need a definition:

Definition 2.4 The function
\[ \psi_{-N(p^{-N}I)j}(x) = p^{\frac{N}{2}}\, \chi_p\!\left(p^{-N-1} j x\right) \Omega\!\left(p^{N}|x-I|_p\right) \]
for $I \in V(G)$ and $j = 1, \dots, p-1$ is called a Kozyrev basis function, or p-adic wavelet.

Remark 2.5 It does not hold true that $b(\mathrm{id})$ is the identity on $L^2(K_N, \mathbb{C})$, if $\mathrm{id} \in \mathbb{C}^{n \times n}$ is the identity. For example, if $u = \psi_{-N(p^{-N}I)j}$ is a Kozyrev basis function, then
\[ b(\mathrm{id})\, u(x) = p^{N}\, \Omega\!\left(p^{N}|x-I|_p\right) \int \psi_{-N(p^{-N}I)j}(y)\, dy = 0, \]
as the latter integral is zero: use the well-known fact that
\[ \int_{|x|_p \le 1} \chi_p(ax)\, dx = 0 \]
if $|a|_p > 1$, and make the corresponding substitution.

Recall that the Laplacian of a finite graph $G$ is the matrix
\[ L = A - (\gamma_I \delta_{IJ}), \]
where $A = (A_{IJ})$ is the adjacency matrix of the graph, and
\[ \gamma_I = \sum_{J \in V(G)} A_{IJ} \]
is the degree of vertex $I$ of $G$. The following can be found in [15, p. 11]:

Definition 2.6 The p-adic Laplacian of $G$ is the linear operator $\mathcal{L} : L^2(K_N, \mathbb{C}) \to L^2(K_N, \mathbb{C})$ associated with $L$.
Lemma 2.7 It holds true that
\[ \mathcal{L}u(x) = \int_{K_N} A(x, y)\left(u(y) - u(x)\right) dy, \]
where $u$ is a test function on $K_N$, $\mathcal{L}$ is the p-adic Laplacian, and $A$ the adjacency matrix of $G$.

Proof A simple calculation which proves this can be found in [15, p. 11].
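On node vectors, Lemma 2.7 reduces to the familiar identity $(Lu)_I = \sum_J A_{IJ}(u_J - u_I)$ for the graph Laplacian $L = A - (\gamma_I \delta_{IJ})$. A short sketch that builds $L$ from an adjacency matrix and compares both sides on a path graph with three vertices (the example graph and vector are chosen here for illustration):

```python
def laplacian(A):
    """Graph Laplacian L = A - diag(degrees), with degree_I = sum_J A_IJ."""
    n = len(A)
    return [[A[i][j] - (sum(A[i]) if i == j else 0) for j in range(n)]
            for i in range(n)]


def difference_form(A, u):
    """Right-hand side sum_J A_IJ (u_J - u_I) for every vertex I."""
    n = len(A)
    return [sum(A[i][j] * (u[j] - u[i]) for j in range(n)) for i in range(n)]


# Path graph 0 - 1 - 2 and an arbitrary node vector.
A_PATH = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
U_NODE = [1, 4, 9]
L_PATH = laplacian(A_PATH)
LU = [sum(L_PATH[i][j] * U_NODE[j] for j in range(3)) for i in range(3)]
```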
Table 1 summarises our results so far in a dictionary between graph theory and p-adic analysis.

Definition 2.8 A weighted graph is a pair $(G, w)$ where $w : E(G) \to \mathbb{C}$ is a function which assigns a complex number to each edge of $G$.

We will write $G$ instead of $(G, w)$, as we will not use any particular weighting function $w$. Let $A \in \mathbb{C}^{n \times n}$ be symmetric. We can view this matrix as an adjacency matrix of a weighted graph $G_A$ with $n$ nodes in a natural way. The Laplacian associated with this graph will be denoted as
\[ L_A = A - D_A, \qquad D_A = (\gamma_I \delta_{IJ}), \]
and the corresponding linear operators are written as
\[ \mathcal{L}_A,\ \mathcal{D}_A : L^2(K_N, \mathbb{C}) \to L^2(K_N, \mathbb{C}). \]

Table 1 A dictionary between graph theory and p-adic analysis

Graph theory | p-Adic analysis
Node vector $(u_I)_{I \in V(G)}$ | $u(x) = \sum_{I \in V(G)} u_I\,\Omega\!\left(p^{N}|x-I|_p\right)$
Adjacency matrix $A = (A_{IJ})$ | bounded linear operator $\mathcal{A}$
Entry $A_{IJ}$ | $A(x, y) = p^{N} \sum_{I,J} A_{IJ}\,\Omega\!\left(p^{N}|x-I|_p\right)\Omega\!\left(p^{N}|y-J|_p\right)$
Product $Au$ | $\mathcal{A}u = \int_{K_N} A(\cdot, y)\, u(y)\, dy$
Identity $u_I = \sum_{J} \delta_{IJ} u_J$ | $u(x) = \int_{K_N} u(y)\, \delta(x-y)\, dy$
Degree $\gamma_I = \sum_{J} A_{IJ}$ | $\gamma(x) = \int_{K_N} A(x, y)\, dy$
Laplacian $L = A - (\gamma_I \delta_{IJ})$ | $L(x, y) = A(x, y) - \gamma(x)\,\delta(x-y)$, $\ \mathcal{L}u(x) = \int_{K_N} A(x, y)\left(u(y) - u(x)\right) dy$
Lemma 2.9 The Kozyrev basis function $\psi_{-N(p^{-N}I)j}$ is an eigenfunction of $\mathcal{L}_A$ with eigenvalue $-\gamma_I$. Furthermore, we have
\[ \psi_{-N(p^{-N}I)j} \in \ker \mathcal{D}_A \]
for all $I \in V(G)$, $j = 1, \dots, p-1$.

Proof This is well known, cf. [8, Thm. 3.29], or [15, (10.6)] for the eigenfunction property and [15, (10.4)] for being in the kernel.

Let $\varphi_I \in L^2(K_N, \mathbb{C})$ be the function associated with an eigenvector of $L$ corresponding to $I \in V(G)$, according to the dictionary of Table 1. We next present without proofs some results taken from Zúñiga-Galindo in reference [15].

Proposition 2.10 (Zúñiga-Galindo) Assume that the matrix $L$ is the Laplacian of a (weighted) undirected graph $G$. Then the p-adic Laplacian $\mathcal{L} : L^2(K_N, \mathbb{C}) \to L^2(K_N, \mathbb{C})$ is compact, and its non-zero spectrum is the disjoint union of the non-zero spectrum of $L$ and $\{-\gamma_I : I \in V(G)\}$, whose elements are all negative. The corresponding eigenfunctions are given by the functions $\varphi_I$, together with the $\psi_{-N(p^{-N}I)j}$ supported in $K_N$, where $I \in V(G)$, $j = 1, \dots, p-1$. The space $L^2(K_N, \mathbb{C})$ has an orthonormal basis consisting of functions $\varphi_I$ and normalisations of $\psi_{-N(p^{-N}I)j}$ for each $I \in V(G)$, $j = 1, \dots, p-1$.

Proof Cf. [15, Thm. 10.1].

Corollary 2.11 (Zúñiga-Galindo's Structure Theorem) Let $A \in \mathbb{C}^{n \times n}$ be symmetric. Then the associated operator $\mathcal{A}$ induces an orthogonal decomposition of $L^2(K_N, \mathbb{C})$ into $\mathcal{A}$-invariant parts
\[ L^2(K_N, \mathbb{C}) = L^2(G_A, \mathbb{C}) \oplus L^2_{\mathrm{Koz}}(K_N, \mathbb{C}), \]
where $L^2(G_A, \mathbb{C}) \cong \mathbb{C}^n$ is generated by (the functions corresponding to) the eigenvectors of $L_A$, and $L^2_{\mathrm{Koz}}(K_N, \mathbb{C})$ is generated by the Kozyrev basis functions.

Proof This is an immediate consequence of Proposition 2.10.
Let $\mathcal{L}$ be the p-adic Laplacian of a finite (weighted) graph. According to [15, Thm. 4.2], the Cauchy problem of the heat equation
\[ \begin{aligned}
\frac{\partial}{\partial t} u(t, x) &= \mathcal{L}u(t, x), \\
u(0, x) &= u_0(x) \in L^2(K_N, \mathbb{C}),
\end{aligned} \]
with $t \ge 0$ and > 0 has a solution which is unique. The solution is given by the semigroup $\left(e^{t\mathcal{L}}\right)_{t \ge 0}$ generated by $\mathcal{L}$.

Corollary 2.12 If $\mathcal{L}$ is the p-adic Laplacian of a finite graph, then $L^2(G, \mathbb{C})$ and $L^2_{\mathrm{Koz}}(K_N, \mathbb{C})$ are both invariant under the diffusion operator $e^{t\mathcal{L}}$ for t, > 0.

Proof This is an immediate consequence of Corollary 2.11.
This result can be interpreted as having a ‘signal’ from L2 (G, C) plus additional ‘noise’ from L2Koz (KN , C), and that the evolution of this signal under the diffusion operator always acts in such a way that at all times the output can always be separated into ‘signal’ and ‘noise’ by letting the operator act on the two components independently. This idea will be pursued in more detail in the following sections.
3 Homogeneous Wave Equations on Undirected Graphs

We now assume that the (weighted) graph $G$ is undirected, i.e. the adjacency matrix $A$ is symmetric.

3.1 Wave Equations Without Damping

The Cauchy problem for the homogeneous p-adic wave equation on the (weighted) graph $G$ without damping consists in finding $u(\cdot, t) \in L^2(K_N, \mathbb{C})$ for $t \ge 0$ such that
\[ \begin{aligned}
\frac{\partial^2}{\partial t^2} u(t, x) - \mathcal{L}u(t, x) &= 0, \qquad x \in K_N,\ t \ge 0, \\
u(0, x) &= \phi(x) \in L^2(K_N, \mathbb{C}), \\
\left. \frac{\partial u(t, x)}{\partial t} \right|_{t=0} &= \psi(x) \in L^2(K_N, \mathbb{C}).
\end{aligned} \tag{2} \]
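Theorem 3.1 The solution in $L^2(K_N, \mathbb{C})$ of the Cauchy problem (2) is given by
\[ \begin{aligned}
u(x, t) ={}& \sum_{I \in V(G)} A_I\, e^{i\sqrt{-\mu_I}\, t}\, \varphi_I(x) + \sum_{I \in V(G)} B_I\, e^{-i\sqrt{-\mu_I}\, t}\, \varphi_I(x) \\
& + \sum_{I \in V(G)} \sum_{j=1}^{p-1} C_{I,j}\, e^{i\sqrt{\gamma_I}\, t}\, \psi_{-N(p^{-N}I)j}(x) + \sum_{I \in V(G)} \sum_{j=1}^{p-1} D_{I,j}\, e^{-i\sqrt{\gamma_I}\, t}\, \psi_{-N(p^{-N}I)j}(x),
\end{aligned} \]
where $\mu_I$ is the eigenvalue of $\mathcal{L}$ corresponding to the eigenfunction $\varphi_I$, the $\gamma_I$ are as in Proposition 2.10, and the constants $A_I, B_I, C_{I,j}, D_{I,j}$ are uniquely determined by the initial conditions.

Proof We are looking for a solution of the form
\[ u(x, t) = \sum_{I \in V(G)} c_I(t)\, \varphi_I(x) + \sum_{I \in V(G)} \sum_{j=1}^{p-1} c_{I,j}(t)\, \psi_{-N(p^{-N}I)j}(x). \tag{3} \]
Inserting this into (2), and using
\[ \begin{aligned}
\mathcal{L}\varphi_I(x) &= \mu_I\, \varphi_I(x), \\
\mathcal{L}\psi_{-N(p^{-N}I)j}(x) &= -\gamma_I\, \psi_{-N(p^{-N}I)j}(x),
\end{aligned} \]
cf. [15, Thm. 10.1], we see that each coefficient $c_I(t)$, $c_{I,j}(t)$ satisfies the corresponding equation from the following system:
\[ \begin{aligned}
\frac{d^2}{dt^2} c_I(t) - \mu_I\, c_I(t) &= 0, \\
\frac{d^2}{dt^2} c_{I,j}(t) + \gamma_I\, c_{I,j}(t) &= 0.
\end{aligned} \]
This proves that
\[ \begin{aligned}
c_I(t) &= A_I\, e^{i\sqrt{-\mu_I}\, t} + B_I\, e^{-i\sqrt{-\mu_I}\, t}, \\
c_{I,j}(t) &= C_{I,j}\, e^{i\sqrt{\gamma_I}\, t} + D_{I,j}\, e^{-i\sqrt{\gamma_I}\, t},
\end{aligned} \]
with $A_I, B_I, C_{I,j}, D_{I,j}$ uniquely determined by the initial conditions.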
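For a single mode of the Cauchy problem (2), the coefficient satisfies $\ddot{c} = \nu c$ with $\nu = \mu_I$ for a graph eigenfunction or $\nu = -\gamma_I$ for a Kozyrev function. The following sketch, assuming $\nu < 0$, writes the mode in real form and checks the differential equation by a central finite difference. The function names and the sample values are hypothetical.

```python
import math


def mode(t, nu, c0, c1):
    """Solution of c'' = nu * c with c(0) = c0, c'(0) = c1, for nu < 0."""
    if nu >= 0:
        raise ValueError("this sketch only covers negative eigenvalues")
    omega = math.sqrt(-nu)
    return c0 * math.cos(omega * t) + (c1 / omega) * math.sin(omega * t)


def second_difference(func, t, step=1e-4):
    """Central finite-difference approximation of the second derivative."""
    return (func(t + step) - 2.0 * func(t) + func(t - step)) / step ** 2


# Complete graph on two vertices: Laplacian eigenvalue -2, vertex degree 1.
MU = -2.0
GAMMA_DEG = 1.0
graph_mode = lambda t: mode(t, MU, 1.0, 0.5)
kozyrev_mode = lambda t: mode(t, -GAMMA_DEG, 0.0, 1.0)
```

Both kinds of modes oscillate without growth, with frequencies $\sqrt{-\mu_I}$ and $\sqrt{\gamma_I}$ respectively.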
Remark 3.2 Notice that the solution to the Cauchy problem (2) consists of two parts. The discrete part involves the functions $\varphi_I(x)$, and is the solution of the classical wave equation on the graph $G$. The continuous part involves the Kozyrev functions $\psi_{-N(p^{-N}I)j}(x)$ and comes from the extension of the Laplacian of the graph $G$ into the p-adics. Thus the p-adic approach includes the classical approach of analysis on graphs.
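3.2 Wave Equations with Damping

Here, we also write $\dot{y}$ instead of $\frac{\partial}{\partial t} y$ or $\frac{d}{dt} y$. We now assume that the (weighted) graph consists of springs between nodes with dampers. Thus, the wave equation now becomes
\[ \ddot{u}(t, x) - \mathcal{E}\dot{u}(t, x) - \mathcal{L}u(t, x) = 0, \tag{4} \]
where $\mathcal{E}$ is the linear operator corresponding in the dictionary to a symmetric matrix. The entries can be understood as damping coefficients. Let
\[ A_{JI} = \langle \mathcal{E}\varphi_I, \varphi_J \rangle_{L^2} \]
and
\[ c = (c_I)_{I \in V(G)}, \qquad A = (A_{IJ}). \]
Let
\[ \alpha : \{(I, j) : I \in V(G),\ j = 1, \dots, p-1\} \to \mathbb{N} \]
be a linear ordering. Let
\[ B_{\alpha(J,k),\alpha(I,j)} = \langle \mathcal{E}\psi_{-N(p^{-N}I)j}, \psi_{-N(p^{-N}J)k} \rangle_{L^2}, \qquad B = \left(B_{\alpha(J,k),\alpha(I,j)}\right). \]

Theorem 3.3 Solving the Cauchy problem for (4) in $L^2(K_N, \mathbb{C})$ is equivalent to solving the two Cauchy problems
\[ \ddot{c} - A\dot{c} - \mu \odot c = 0, \qquad c(0),\ \dot{c}(0) \in \mathbb{C}^{N}, \tag{5} \]
\[ \ddot{d} - B\dot{d} + \gamma \odot d = 0, \qquad d(0),\ \dot{d}(0) \in \mathbb{C}^{N(p-1)}, \]
with $\mu = (\mu_I)$, $\gamma = (\gamma_I)$, where $\odot$ denotes the componentwise product.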
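Proof We are looking again for solutions of the form (3). Inserting this into the equation yields
\[ \begin{aligned}
0 ={}& \sum_{I \in V(G)} \ddot{c}_I(t)\, \varphi_I(x) + \sum_{I \in V(G)} \sum_{j=1}^{p-1} \ddot{c}_{I,j}(t)\, \psi_{-N(p^{-N}I)j}(x) \\
& - \sum_{I \in V(G)} \dot{c}_I(t)\, \mathcal{E}\varphi_I(x) - \sum_{I \in V(G)} \sum_{j=1}^{p-1} \dot{c}_{I,j}(t)\, \mathcal{E}\psi_{-N(p^{-N}I)j}(x) \\
& - \sum_{I \in V(G)} \mu_I\, c_I(t)\, \varphi_I(x) + \sum_{I \in V(G)} \sum_{j=1}^{p-1} \gamma_I\, c_{I,j}(t)\, \psi_{-N(p^{-N}I)j}(x).
\end{aligned} \]
Now, taking the inner product with $\varphi_J$ and $\psi_{-N(p^{-N}J)k}$, respectively, yields
\[ \ddot{c}_J - \sum_{I \in V(G)} \dot{c}_I\, \langle \mathcal{E}\varphi_I, \varphi_J \rangle_{L^2} - \mu_J\, c_J = 0, \tag{6} \]
\[ \ddot{c}_{J,k} - \sum_{I \in V(G)} \sum_{j=1}^{p-1} \dot{c}_{I,j}\, \langle \mathcal{E}\psi_{-N(p^{-N}I)j}, \psi_{-N(p^{-N}J)k} \rangle_{L^2} + \gamma_J\, c_{J,k} = 0. \tag{7} \]
Then Eqs. (6), (7) can be written as
\[ \begin{aligned}
\ddot{c} - A\dot{c} - \mu \odot c &= 0, \\
\ddot{d} - B\dot{d} + \gamma \odot d &= 0,
\end{aligned} \]
where $\odot$ denotes the componentwise product. This proves the assertion.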
Remark 3.4 Notice again that, as in the undamped case, the solution of the p-adic equation consists of a discrete and a continuous part, which each can be found through a classical system of second order differential equations. Again, the discrete part is the solution of the classical damped wave equation on the graph G given by (5). Hence, again the p-adic approach includes the classical approach.
4 Application to Aggregation Maps Between Finite $T_0$-Spaces

4.1 Brief Introduction to Finite Topological Spaces

P. Alexandrov studied in [1] topological spaces in which arbitrary intersections of open sets are also open. These are called Alexandrov topological spaces and include the finite topological spaces. Here, we give a brief account of results on special types of finite topological spaces.

A $T_0$-space is a topological space in which, for every two distinct points, at least one of them has a neighbourhood not containing the other. P. Alexandrov observed in [1] that topologies on a finite set $X$ correspond bijectively with reflexive and transitive binary relations on the set $X$. The $T_0$-topologies correspond under this bijection with partial orderings. This correspondence means that two binary relations $R$ and $S$ on a set $X$ can be said to be equivalent, if their reflexive and transitive closures $R^*$ and $S^*$ coincide. Recall that
\[ R^* = \bigcup_{n \in \mathbb{N}} R^n, \]
where $R^0 \subset X \times X$ is the diagonal, and
\[ R^{n+1} = R^n \circ R = \left\{ (x, y) \in X \times X : \exists z \in X : (x, z) \in R^n \text{ and } (z, y) \in R \right\} \]
for $n \in \mathbb{N}$. Hence, to any binary relation $R$, a topology on a set $X$ can be associated through $R^*$, and the topology depends only on $R^*$. This observation leads to applications of topology in relational models [4]. Given a $T_0$-topology, or, equivalently, a partial ordering on a set $X$, there is a unique minimal binary relation to which the same topology on $X$ is associated. This minimal relation is called the Hasse diagram, and is a directed acyclic graph whose nodes are the elements of $X$ [13]. Assume that $X$, $Y$ are finite topological spaces whose topologies are associated with binary relations $R$ and $S$, respectively. A map $X \to Y$ is continuous if and only if it takes pairs from $R$ to pairs in $S^*$ [4, Thm. 5.6].
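The reflexive and transitive closure $R^*$ and the Hasse diagram of a finite partial ordering can be computed directly from the definitions above. The following is a minimal sketch on a small example, the divisibility ordering on $\{1, 2, 3, 6\}$, which is chosen here for illustration. The function names are hypothetical, and the naive fixed-point iteration is not meant to be efficient.

```python
def reflexive_transitive_closure(points, relation):
    """Smallest reflexive and transitive relation on `points` containing `relation`."""
    closure = {(x, x) for x in points} | set(relation)
    changed = True
    while changed:
        changed = False
        for (x, z) in list(closure):
            for (w, y) in list(closure):
                if z == w and (x, y) not in closure:
                    closure.add((x, y))
                    changed = True
    return closure


def hasse_diagram(points, order):
    """Covering pairs of a partial order: x < y with no z strictly in between."""
    strict = {(x, y) for (x, y) in order if x != y}
    return {(x, y) for (x, y) in strict
            if not any((x, z) in strict and (z, y) in strict for z in points)}


POINTS = {1, 2, 3, 6}
R = {(1, 2), (2, 6), (1, 3), (3, 6)}
S = R | {(1, 6)}  # a different relation with the same closure
R_STAR = reflexive_transitive_closure(POINTS, R)
S_STAR = reflexive_transitive_closure(POINTS, S)
HASSE = hasse_diagram(POINTS, R_STAR)
```

Here `R` and `S` are equivalent in the sense above, since their closures coincide, and the Hasse diagram recovers the minimal relation `R`.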
4.2 Processes on Finite T0 -Spaces and Their Aggregations A process on a finite T0 -space X is defined to be a process on its Hasse diagram which is a directed acyclic graph. If X is a finite T0 -space, then by viewing the edges of its Hasse diagram as undirected, we obtain a graph G associated with X. We will furthermore assume that the edges are weighted with positive real numbers. We will call the Laplacian of this weighted graph G also a Laplacian of X. As the weights are assumed fixed, we will also simply speak of “the” Laplacian of X, if there is no cause for confusion. In any case, it is a symmetric n × n-matrix. By associating the points of a finite T0 -space X with Qp , we can now define p-adic processes on X. The interpretation of a function u ∈ L2 (KN , C) as a noisy signal is as follows: We have a decomposition
u = uG + uK ∈ L2 (G, C) ⊕ L2Koz (KN , C) the component uG ∈ L2 (G, C) is a function on the vertices of G, considered as the signal part of u, whereas the component uK ∈ L2Koz (KN , C) can be interpreted as noise part consisting of fluctuations in a small p-adic ball around (the center of) the node I ∈ V (G). The noise part has the property that its mean is zero and that it is orthogonal to the signal part of u. The results of the previous section can now be interpreted as follows: Assume that a process given by the Laplacian L of G is given (e.g. diffusion or wave equation). Then, because the signal and noise parts of a noisy signal are mapped to signal and noise parts of signals, respectively, we can at all times t obtain the evolved signal part from an initial noisy signal by projecting the evolved noisy signal to L2 (G, C). On the other hand, the projection to the individual components with respect to the Kozyrev basis functions in L2Koz (KN , C) can be viewed as a p-adic wavelet transform of the noise part of a (noisy) signal. Definition 4.1 Let f : X → Y be a continuous surjective map between finite T0 -spaces. This map f is monotonic, if the pre-image of every connected set is connected. The map f is an aggregation map, if f is monotonic and no folding occurs, i.e. no path γ in the Hasse diagram of X containing 3 points is mapped to an edge e of the Hasse diagram of Y in such a way that the two extremal points of γ are mapped to the same endpoint of e. Let f be an aggregation map, and let G, H be the corresponding Hasse diagrams, called G and H . We further assume that the vertices of G and H are centers of p-adic discs whose unions are the sets KN +1 , KN ⊂ Qp such that for each I ∈ V (G), the disc of radius p−(N +1) corresponding to I is contained in the disc of radius p−N corresponding to f (I ). The approach described here presupposes, that the p-adic disc corresponding to f (I ) has sufficiently many maximal strictly smaller subdiscs. 
This can be achieved in two ways: either increase p, or resort to a finite (unramified) field extension of Qp (cf. [7]). Both ways are equally possible, only that the second alternative leads to a more general theory which can be found in [9, 11]. For reasons of simplicity, we will resort to the first alternative and assume that the prime p is sufficiently large. We will also denote the inclusion map KN +1 → KN with f . The fibre sum is given as σf : L2 (G, C) → L2 (H, C), u →
\[ \sum_{I \in V(H)} \sum_{J \mapsto I} u_J\, \Omega\!\left(p^{N}|x-I|_p\right). \]
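This fibre summation map can be extended to the noise part as follows:
\[ \sigma_f : L^2_{\mathrm{Koz}}(K_{N+1}, \mathbb{C}) \to L^2_{\mathrm{Koz}}(K_N, \mathbb{C}), \qquad \psi_{-(N+1)(p^{-(N+1)}I)j} \mapsto \psi_{-N(p^{-N}f(I))j}. \]
Putting things together, we obtain a map
\[ \sigma_f : L^2(K_{N+1}, \mathbb{C}) \to L^2(K_N, \mathbb{C}). \]
The push-forward map is defined as
\[ \begin{aligned}
f_* : \mathbb{C}^{n \times n} &\to \mathbb{C}^{n' \times n'}, \\
A = (A_{IJ})_{I,J \in V(G)} &\mapsto \left(f_*(A)_{KL}\right)_{K,L \in V(H)}, \\
f_*(A)_{KL} &= \sum_{I \in f^{-1}(K),\ J \in f^{-1}(L)} A_{IJ},
\end{aligned} \]
where $n$, $n'$ are the numbers of vertices of $G$, $H$, respectively. As a map between operator spaces, this is given as
\[ f_* : B\!\left(L^2(K_{N+1}, \mathbb{C})\right) \to B\!\left(L^2(K_N, \mathbb{C})\right) \]
with
\[ f_*A(x, y) = p^{N} \sum_{K,L \in V(H)} \sum_{\substack{I \mapsto K \\ J \mapsto L}} A_{IJ}\, \Omega\!\left(p^{N}|x-K|_p\right) \Omega\!\left(p^{N}|y-L|_p\right). \]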
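On matrices, the push-forward simply adds up the entries of $A$ over pairs of fibres. The sketch below implements this block sum and applies it to the Laplacian of a path graph on four vertices, with a hypothetical map $f$ that merges the first two and the last two vertices, so that the image is a single edge.

```python
def push_forward(A, f, m):
    """Block sums f_*(A)_KL over I in f^{-1}(K), J in f^{-1}(L).

    `f` is a list with f[I] in range(m) giving the image of vertex I.
    """
    B = [[0] * m for _ in range(m)]
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            B[f[i]][f[j]] += a
    return B


# Laplacian of the path 0 - 1 - 2 - 3 (adjacency minus degrees).
L_X = [[-1, 1, 0, 0],
       [1, -2, 1, 0],
       [0, 1, -2, 1],
       [0, 0, 1, -1]]
F_MAP = [0, 0, 1, 1]  # hypothetical map onto a graph with two vertices
L_PUSHED = push_forward(L_X, F_MAP, 2)
```

In this example the result is the Laplacian of the single edge: the contributions of the edges inside each fibre cancel in the diagonal blocks.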
Theorem 4.2 Let f : X → Y be a monotonic map of T0 -spaces. Then the following holds true: 1. f∗ LX = LY , if and only if f is an aggregation map. 2. There is a commutative diagram
where ι is the inclusion map. Proof 1. Assume first that f is an aggregation map. Now, a diagonal element of f∗ LX indexed by a vertex I ∈ V (H ) is obtained by adding all elements of the square submatrix of LX indexed by the fibre f −1 (I ). This equals the negative sum
of the vertex degrees of all J ∈ f −1 (I ) plus the sum of the numbers of edges in G attached to all J ∈ f −1 (I ). This equals the negative sum of the numbers of edges connected to, but not contained in, the maximal subgraph of G having f −1 (I ) as vertex set. This now equals the vertex degree of I in the image graph f (G) = H . A non-diagonal element αI J of f∗ LX indexed by distinct vertices I, J ∈ V (H ) is obtained as
AKL αI J = K→I L→J
where the sum runs through the pairs $(K, L)$ which form an edge in $G$. So, if $(I, J)$ is an edge of $H$, then all $A_{KL} = 1$ in that sum. Hence, $\alpha_{IJ}$ equals the number of edges mapped to $(I, J)$. As $f$ is an aggregation map, there is precisely one edge of $G$ which maps to $(I, J)$. It follows that $\alpha_{IJ} = 1$. If $(I, J)$ is not an edge, then no edge of $G$ maps to $(I, J)$, and $\alpha_{IJ} = 0$. This proves that $f_*L_X = L_Y$ if $f$ is an aggregation map.

Assume now that $f$ is not an aggregation map. Let a path $a - b - c$ consisting of three vertices in $G$ be folded in $b$ under the map $f$, and denote the image of the path as $b - c$. We have
$$L^X_{ba} = L^X_{bc} = 1,$$
where the superscript $X$ indicates that these are the elements of the Laplacian of $G$. Let $f_*L_X = (B_{IJ})$. Due to the folding, we have
$$B_{bc} \ge L^X_{ba} + L^X_{bc} = 2.$$
But $L^Y_{bc} = 1$. So, it cannot be that $f_*L_X = L_Y$ if $f$ is not an aggregation map.

2. Each arrow in the following diagram is well-defined:
[commutative diagram]
where for the horizontal arrows, this follows from Corollary 2.11. Let $u \in L^2(H,\mathbb{C})$. Then
$$L_X\iota(u)(x) = \sum_{K,L} L^X_{KL}\,u_{f(L)}\,\Omega\left(p^{N+1}|x-K|_p\right),$$
where $L^X_{KL}$ are the matrix entries of the Laplacian $L_X$, and
$$\sigma_f(L_X\iota(u))(x) = \sum_{I\in V(H)}\ \sum_{K\to I}\ \sum_{L\in V(G)} L^X_{KL}\,u_{f(L)}\,\Omega\left(p^N|x-I|_p\right). \qquad (8)$$
Now,
$$(f_*L_X)u(x) = \sum_{K,L\in V(H)}\ \sum_{I\to K,\ J\to L} L^X_{IJ}\,u_L\,\Omega\left(p^N|x-K|_p\right) = \sum_{K\in V(H)}\ \sum_{I\to K}\ \sum_{J\in V(G)} L^X_{IJ}\,u_{f(J)}\,\Omega\left(p^N|x-K|_p\right).$$
Comparing this with (8), we see that the diagram is indeed commutative. By Zúñiga-Galindo's Structure Theorem (Corollary 2.11), we also have a diagram
[commutative diagram]
As $\sigma_f$ is a right inverse of $\iota$ (both restricted to the noise part) and the horizontal maps are diagonal, the commutativity of the above diagram follows, because the multipliers of the lower horizontal map (diagonal elements of $f_*L_X$) equal the sums of the corresponding multipliers of the upper horizontal map (negative vertex degrees in $G$), where the correspondence is given by $f$. This concludes the proof of the second statement.

Remark 4.3 Notice that the vertical arrows in the diagram of Theorem 4.2 also respect the signal/noise structure, i.e. they map the signal parts to signal parts, and the noise parts to noise parts. Theorem 4.2 can be interpreted as follows: if $f\colon X \to Y$ is an aggregation map, then a signal on $X$ can be viewed as a noisy signal $u = u_s + u_n$ on $Y$, whose signal part $u_s = \iota(v)$, for some signal $v$ on $Y$, is given by the averages over the fibres of $f$. Then $L_X$ can be viewed as transforming $u$ to a noisy signal on $Y$ as follows:
$$L_X\colon u_s + u_n \mapsto \underbrace{L_Y u_s}_{\text{signal}} + \underbrace{L_X u_n}_{\text{noise}}.$$
The fibre sum $\sigma_f$ then projects the transformed noisy signal on $Y$ onto its signal part.
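The comparison with (8) can be illustrated numerically at the level of coefficient vectors. The following Python sketch is only an illustration under stated assumptions: the five-vertex graph, the map `f`, and all function names are hypothetical and not taken from the text, and the normalisation of the test functions is ignored. It checks the algebraic identity $\sigma_f(L_X\iota(v)) = (f_*L_X)v$, where $\iota$ copies $v_{f(L)}$ to each vertex $L$, $\sigma_f$ sums coefficients over fibres, and $f_*$ sums matrix entries over pairs of fibres.

```python
# Hypothetical example: vertices of G are 0..4, vertices of H are 0..2.
f = [0, 1, 1, 2, 2]                       # assumed map: f[L] is the image of vertex L
edges = [(0, 1), (1, 2), (1, 3), (3, 4)]  # assumed edge set of G


def laplacian(n, edges):
    """Graph Laplacian with the sign convention of the text (adjacency minus degree)."""
    L = [[0] * n for _ in range(n)]
    for a, b in edges:
        L[a][b] += 1
        L[b][a] += 1
        L[a][a] -= 1
        L[b][b] -= 1
    return L


def push_forward(A, f, m):
    """f_*(A)_{KL}: sum of A_{IJ} over I in the fibre of K and J in the fibre of L."""
    B = [[0] * m for _ in range(m)]
    for I, row in enumerate(A):
        for J, entry in enumerate(row):
            B[f[I]][f[J]] += entry
    return B


def inclusion(v, f):
    """iota(v): the coefficient at vertex L of G is v_{f(L)}."""
    return [v[image] for image in f]


def fibre_sum(u, f, m):
    """sigma_f(u): add up the coefficients of u over each fibre of f."""
    s = [0] * m
    for J, coeff in enumerate(u):
        s[f[J]] += coeff
    return s


def matvec(A, u):
    return [sum(a * x for a, x in zip(row, u)) for row in A]


def both_sides(v):
    """Return (sigma_f(L_X iota(v)), (f_* L_X) v) for a coefficient vector v on H."""
    m = max(f) + 1
    LX = laplacian(len(f), edges)
    lhs = fibre_sum(matvec(LX, inclusion(v, f)), f, m)
    rhs = matvec(push_forward(LX, f, m), v)
    return lhs, rhs


lhs, rhs = both_sides([3, -1, 4])
print(lhs, rhs)
```

The identity holds for any map `f` on the vertex set, since both sides are the same double sum grouped in two different ways; whether $f_*L_X$ is again the Laplacian of the image graph is the separate question answered by the first statement of Theorem 4.2.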
Fig. 1 An aggregation map between finite T0 -spaces
Example 4.4 Figure 1 shows an aggregation map $f\colon X \to Y$ between finite $T_0$-spaces, where we assume that
$$f(a) = a,\quad f(b) = f(c) = f(d) = b.$$
The Laplacians of $X$, $Y$ are
$$L_X = \begin{pmatrix} -1 & 1 & 0 & 0\\ 1 & -3 & 1 & 1\\ 0 & 1 & -2 & 1\\ 0 & 1 & 1 & -2 \end{pmatrix},\qquad L_Y = \begin{pmatrix} -1 & 1\\ 1 & -1 \end{pmatrix}.$$
Observe that indeed
$$f_*L_X = L_Y.$$
Let $u = (0, 2, -1, -1)$ be a signal on $X$. We have
$$u(x) = 2\,\Omega\left(p^{N+1}|x-b|_p\right) - \Omega\left(p^{N+1}|x-c|_p\right) - \Omega\left(p^{N+1}|x-d|_p\right).$$
As
$$\int_{K_N} u(x)\,dx = 0,$$
it follows that $u \in L^2_{\mathrm{Koz}}(K_N,\mathbb{C})$, i.e. $u$ is noise in $Y$. Now,
$$L_X u = \begin{pmatrix} 2\\ -8\\ 3\\ 3 \end{pmatrix}.$$
One calculates that
$$L_Y u(x) = 2\,\Omega\left(p^N|x-a|_p\right) - 2\,\Omega\left(p^N|x-b|_p\right) = \sigma_f(L_X u)(x),$$
as expected: the first entry equals the coefficient of $\Omega(p^N|x-a|_p)$, and the sum of the last three entries in $L_X u$ equals $-2$, which equals the coefficient of $\Omega(p^N|x-b|_p)$.

Remark 4.5 If the requirement that $f\colon X \to Y$ be without folding is lifted, then $f_*L_X$ is still the Laplacian of a matrix, because the row sums are still zero.

This leads to the following result:

Proposition 4.6 If $f\colon X \to Y$ is a continuous map between finite $T_0$-spaces which is monotonic, then $f_*L_X$ is the Laplacian of a graph with multiple edges.

Proof We have seen in Remark 4.5 that $f_*L_X$ is the Laplacian of a matrix $A$. The off-diagonal elements of $A$ are all natural numbers, and the diagonal elements are all non-positive integers by construction. Hence, $A$ is the Laplacian of a graph whose edges have the positive entries of $A$ as their multiplicities.

Example 4.7 We consider a map $f\colon X \to Y$ with $X$ as in Example 4.4, and with $Y$ the path
$$a \longrightarrow b \longrightarrow c$$
where
$$f(c) = f(d) = c.$$
The map $f$ satisfies the conditions of an aggregation map, except that it has folding. We have
$$L_Y = \begin{pmatrix} -1 & 1 & 0\\ 1 & -2 & 1\\ 0 & 1 & -1 \end{pmatrix}$$
and
$$f_*L_X = \begin{pmatrix} -1 & 1 & 0\\ 1 & -3 & 2\\ 0 & 2 & -2 \end{pmatrix}.$$
The corresponding graph with multiple edges is
a - b -(2)- c
where the multiplicity larger than one is written on the edge.
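Examples 4.4 and 4.7 can be re-checked with a few lines of Python. The sketch below is only an illustration: the function names are hypothetical, the vertices $a, b, c, d$ are encoded as indices 0 to 3, and for Example 4.7 the values $f(a) = a$, $f(b) = b$ are an assumption, since only $f(c) = f(d) = c$ is stated. Test functions are represented by their coefficient vectors.

```python
# Laplacian of X from Example 4.4, vertex order (a, b, c, d).
LX = [[-1, 1, 0, 0],
      [1, -3, 1, 1],
      [0, 1, -2, 1],
      [0, 1, 1, -2]]


def push_forward(A, f, m):
    """f_*(A)_{KL}: sum of A_{IJ} over I in the fibre of K and J in the fibre of L."""
    B = [[0] * m for _ in range(m)]
    for I, row in enumerate(A):
        for J, entry in enumerate(row):
            B[f[I]][f[J]] += entry
    return B


def fibre_sum(u, f, m):
    """sigma_f on coefficient vectors: add the coefficients over each fibre."""
    s = [0] * m
    for J, coeff in enumerate(u):
        s[f[J]] += coeff
    return s


def matvec(A, u):
    return [sum(a * x for a, x in zip(row, u)) for row in A]


def edge_multiplicities(B):
    """Positive entries above the diagonal, read as edge multiplicities."""
    n = len(B)
    return {(I, J): B[I][J]
            for I in range(n) for J in range(I + 1, n) if B[I][J] > 0}


# Example 4.4: f(a) = a, f(b) = f(c) = f(d) = b.
f_agg = [0, 1, 1, 1]
LY_44 = push_forward(LX, f_agg, 2)
LXu = matvec(LX, [0, 2, -1, -1])
summed = fibre_sum(LXu, f_agg, 2)

# Example 4.7: f(c) = f(d) = c as stated; f(a) = a, f(b) = b is assumed here.
f_fold = [0, 1, 2, 2]
B_47 = push_forward(LX, f_fold, 3)
row_sums = [sum(row) for row in B_47]
multiplicities = edge_multiplicities(B_47)

print(LY_44, LXu, summed)
print(B_47, row_sums, multiplicities)
```

In the folded case the row sums of the push-forward vanish while one off-diagonal entry exceeds one, which is the situation described in Remark 4.5 and Proposition 4.6.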
Now, consider for an aggregation map $f\colon X \to Y$ the two wave equations
$$\ddot u(t,x) - D\dot u(t,x) - L_X u(t,x) = 0 \qquad (9)$$
$$\ddot s(t,x) - f_*D\,\dot s(t,x) - L_Y s(t,x) = 0 \qquad (10)$$
with damping $D$ as in Sect. 3.2, but with the extra requirement that $D$ is another Laplacian for the same $T_0$-space $X$. This means that the underlying unweighted Hasse diagram of $X$ is also weighted by the off-diagonal elements of $D$. Assume that $u(t,\cdot) \in L^2(G,\mathbb{C})$ is of the form
$$u(t,x) = s(t,x) + \nu(t,x)$$
with $s(t,\cdot) \in L^2(H,\mathbb{C})$ and $\nu(t,\cdot) \in L^2_{\mathrm{Koz}}(K_N,\mathbb{C})$, and that $s(t,x)$ is a solution of (10).

Corollary 4.8 The function $u(t,x)$ is a solution of (9) if and only if $\nu(t,x)$ is a solution of (9).

Proof The function $u(t,x)$ is a solution of (9) if and only if
$$\ddot s + \ddot\nu - D(\dot s + \dot\nu) - L_X(s + \nu) = 0. \qquad (11)$$
Now, by Theorem 4.2, we have
$$L_X(s + \nu) = L_Y s + L_X\nu.$$
Also, as $D$ is a Laplacian of $X$, we have
$$D(\dot s + \dot\nu) = f_*D\,\dot s + D\dot\nu.$$
Hence, (11) is equivalent to
$$0 = \underbrace{\ddot s - f_*D\,\dot s - L_Y s}_{=0} + \ddot\nu - D\dot\nu - L_X\nu = \ddot\nu - D\dot\nu - L_X\nu,$$
where the underbraced part equals zero, by assumption. This proves the assertion.

A consequence is that the Cauchy problem for a homogeneous wave equation on a space $X$ is given by solving the Cauchy problem for a related wave equation on its aggregation to $Y$ plus a solution of the original Cauchy problem for the noise function.
Acknowledgments Wilson Zúñiga-Galindo is thanked for comments and suggestions which helped to substantially improve this article.
References
1. Pavel Alexandrov. Diskrete Räume. Matematičeskij Sbornik, 44(2):501–519, 1937.
2. P.E. Bradley. Generalised diffusion on moduli spaces of p-adic Mumford curves. p-Adic Numbers, Ultrametric Analysis and Applications, 12:73–89, 2020.
3. P.E. Bradley and M.W. Jahn. On the behaviour of p-adic scaled space filling curve indices for high-dimensional data. The Computer Journal, bxaa036, https://doi.org/10.1093/comjnl/bxaa036, 2020.
4. P.E. Bradley and N. Paul. Using the relational model to capture topological information of spaces. The Computer Journal, 53:69–89, 2010.
5. L. Brekke and P.G.O. Freund. p-adic numbers in physics. Physics Reports, 233(1):1–66, 1993.
6. J.-G. Caputo, A. Knippel, and E. Simo. Oscillations of networks: the role of soft nodes. J. Phys. A: Math. Theor., 46:035101, 2013.
7. F.Q. Gouvêa. p-adic Numbers. An Introduction. Universitext. Springer, Berlin, 1993.
8. A. Khrennikov, S. Kozyrev, and W.A. Zúñiga-Galindo. Ultrametric Pseudodifferential Equations and Its Applications. Encyclopedia of Mathematics and Its Applications, vol. 168. Cambridge University Press, 2018.
9. A.N. Kochubei. Pseudo-Differential Equations and Stochastics over Non-Archimedean Fields. Monographs and Textbooks in Pure and Applied Math. 244. Marcel Dekker, Inc., New York, 2001.
10. A.N. Kochubei. A non-Archimedean wave equation. Pacific Journal of Mathematics, 235(2):245–261, 2008.
11. M.H. Taibleson. Fourier Analysis on Local Fields. Princeton Univ. Press, Princeton, NJ, 1975.
12. V.S. Vladimirov, I.V. Volovich, and E.I. Zelenov. p-Adic Analysis and Mathematical Physics, volume 1 of Series on Soviet & East European Mathematics. World Scientific, Singapore, 1994.
13. H.G. Vogt. Leçons sur la résolution algébrique des équations. Cornell University Library, 1895.
14. B. Wu and A. Khrennikov. p-adic analogue of the wave equation. Journal of Fourier Analysis and Applications, 25:2447–2462, 2019.
15. W.A. Zúñiga-Galindo. Reaction-diffusion equations on complex networks and Turing patterns, via p-adic analysis. Journal of Mathematical Analysis and Applications, 491(1):124239, 2020.
16. W.A. Zúñiga-Galindo. Pseudodifferential Equations over Non-Archimedean Spaces. Lecture Notes in Math. 2174. Springer, Berlin, 2016.
A Riemann-Roch Theorem on Infinite Graphs Atsushi Atsuji and Hiroshi Kaneko
Abstract The study of Riemann-Roch theorems on graphs was initiated by M. Baker and S. Norine. In their article, a Riemann-Roch theorem on a finite graph with uniform unit vertex-weight and uniform unit edge-weight was established, and the feasibility of a Riemann-Roch theorem on infinite graphs was suggested. In this article, we take an edge-weighted infinite graph and focus on the importance of the spectral gaps of the Laplace operators defined on its finite subgraphs naturally given by Q-valued positive weights on the edges. We build a potential-theoretic scheme for a proof of a Riemann-Roch theorem on edge-weighted infinite graphs.

Keywords Riemann-Roch theorem · Infinite graphs · Laplace operator on graphs

1 Introduction

The study of Riemann-Roch theorems on connected finite graphs was initiated by M. Baker and S. Norine in [3]. In their work, a unit weight was given to each vertex and also a unit weight was given to each edge of the graph. Originally, in the complex plane, the exponents of lowest degree in the Laurent series around a pole admit an interpretation as integer multiples of the single pole. In accordance with this fact, the Riemann-Roch theorem is established for divisors which determine integer multiples of the unit weight at each vertex. After their work, Riemann-Roch theory on finite graphs and its variants have been investigated over the last decade, yielding several applications in algebraic geometry and combinatorics, and it is also discussed in the context of tropical geometry [8].
A. Atsuji
Department of Mathematics, Keio University, Yokohama, Japan

H. Kaneko
Department of Mathematics, Tokyo University of Science, Shinjuku-ku, Japan

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
W. A. Zúñiga-Galindo, B. Toni (eds.), Advances in Non-Archimedean Analysis and Applications, STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health, https://doi.org/10.1007/978-3-030-81976-7_9
In this article we present our attempt to extend the Baker-Norine theory of the Riemann-Roch theorem to certain infinite graphs from a probabilistic point of view. In other words, we specify edge-weighted infinite graphs to which potential-theoretic methods and probabilistic analysis can be applied. In the second section, we first revisit the method in [3] in order to adapt its procedures to Q-valued edge-weighted finite graphs, taking a canonical divisor different from the one in the existing Riemann-Roch theorem in [3]. Our basic idea towards infinite graphs is to approximate the infinite graph by a suitable sequence of finite subgraphs. On each subgraph we can establish a Riemann-Roch theorem, as will be seen in Sect. 2. We expect that the desired theorem will be achieved by taking the limits of the quantities in the Riemann-Roch theorem along a sequence of finite subgraphs. This limiting procedure is the main part of this research. We use the fact that the weights of the graph give a Laplace operator which is regarded as the generator of a Markov process and is associated uniquely with a Dirichlet form [7]. For the limiting procedure, it is important to establish the $L^2$-boundedness of the 0-resolvent of the Laplacian, which is implied by the existence of a spectral gap of the Laplacian, a well-known analytic feature of the operator. To validate a spectral gap, in Sect. 3 we establish a Poincaré inequality on the basis of the method in [16]. To ensure this inequality, we need some assumptions on the decay of the weights at infinity of the graph. We give a sufficient condition for the spectral gap in Sect. 3 in terms of the transition probability of the Markov process, which imposes a stronger property than ordinary recurrence on the Markov process in our case. This suggests that a graph satisfying the condition closely resembles a finite graph from the viewpoint of the Markov process.
In the final section, we complete the proof of our Riemann-Roch theorem on an infinite graph with Q-valued positive weights on the edges. Our Riemann-Roch theorem is akin to the classical Riemann-Roch theorem. We consider a locally finite connected graph $G$ of finite volume equipped with a weight function $C$ defined on the set of edges, and take the Markov process associated with the weight function $C$. We see that, if its transition probability from a finite subgraph to the outside asymptotically tends to zero, then the formula
$$r(D) - r(K_G - D) = \deg(D) + e_{(G,C)}$$
holds for any divisor $D$ of finite degree, where $r(D)$, $K_G$ and $e_{(G,C)}$ are the counterparts of the dimension of the linear space of meromorphic functions, the canonical divisor and the Euler characteristic of a Riemann surface, respectively. They are all defined by the approximating procedure mentioned above. We give the precise statement in the final section. In particular, this formula is given in Theorem 4.6 and the assumption on the transition probability is given in (7) of the final section.

We remark that there have been several extensions of the classical Riemann-Roch theorem on closed Riemann surfaces to open surfaces in the classical theory of Riemann surfaces ([10, 12, 14]). As an example of a simpler case, let $S$ be a Riemann surface of finite type, namely $S$ is conformally equivalent to a closed Riemann surface $\bar S$ minus a finite number of points. Since $L^2$-meromorphic functions and $L^2$-meromorphic 1-forms on $S$ can be extended to $\bar S$, we have a Riemann-Roch theorem for this class of functions on $S$ from the classical Riemann-Roch theorem on $\bar S$. This suggests that this sort of smallness of the infinity may imply a Riemann-Roch theorem on a non-compact space. Our assumption on the decay of the weight function is stronger than the condition typically imposed for the recurrence of the Markov process; thus it might suggest the importance of such smallness of the infinity of infinite graphs. As for more general Riemann surfaces, R. Nevanlinna [12] obtained a Riemann-Roch theorem for $L^2$-meromorphic 1-forms on parabolic Riemann surfaces of finite genus, i.e. Riemann surfaces of finite genus on which Brownian motions are recurrent. We remark that our assumption seems stronger than parabolicity, though we can treat a wider class of functions.

In this article we survey our approach without proofs, since the proofs require considerably more detailed discussion. We give the details of the proofs of our results in [1].

We give here several comments on preceding research related to our results. M. Baker and F. Shokrieh [4] discussed some potential-theoretic aspects and chip-firing games in the finite graph case. The authors learned several basics on this subject from [3] and [4]. R. James and R. Miranda [11] considered a Riemann-Roch theorem for edge-weighted finite graphs in a different context from ours. S. Backman [2] gave another proof of the Riemann-Roch theorem on finite graphs.
2 Riemann-Roch Theorem on a Weighted Finite Graph

Let $G = (V_G, E_G)$ be a connected graph consisting of a finite set $V_G$ of vertices and a finite set $E_G$ of edges without loops. To be more precise, $E_G$ is given as a subset of $V_G \times V_G \setminus \{\{x,x\} \mid x \in V_G\}$ under the identification $\{x,y\} = \{y,x\}$ for $x, y \in V_G$. We assume that a Q-valued positive weight $C_{x,y}$ is given at every edge $\{x,y\} \in E_G$ with $x \neq y$ and assume that $C_{x,y} = C_{y,x}$. At each vertex $x$ in $V_G$, the set $N(x)$ of neighbors of $x$ is defined by $N(x) = \{y \in V_G \mid \{x,y\} \in E_G\}$. Here and in what follows, $f$ stands for a Z-valued function on $V_G$; the function $\Delta f$ on $V_G$ is defined by
$$\Delta f(x) = \sum_{y\in N(x)} C_{x,y}\,(f(x) - f(y)),$$
and another function $i$ on $V_G$ is defined by
$$i(x) = \min\{|\Delta f(x)| \mid f\colon V_G \to \mathbb{Z} \text{ satisfying } f(x) = 0 \text{ and } \Delta f(x) \neq 0\}.$$
A divisor on the graph $G$ is given by $D = \sum_{x\in V_G} \ell(x)i(x)1_{\{x\}}$ for a Z-valued function $\ell$ on $V_G$, and its degree $\deg(D)$ is defined by $\deg(D) = \sum_{x\in V_G} \ell(x)i(x)$. $\Delta f$ will be identified with the divisor $\sum_{x\in V_G} \Delta f(x)1_{\{x\}}$. The positive real value
$$\min\Big\{\Big|\sum_{x\in V_G} \ell(x)i(x)\Big| \in (0,\infty) \ \Big|\ \ell\colon V_G \to \mathbb{Z}\Big\}$$
is denoted by $i_{(G,C)}$ and the family of total orders on $V_G$ by $\mathcal{O}$. The canonical divisor $K_G$ on the weighted graph $G$ is given by
$$K_G = \sum_{x\in V_G}\Big\{\sum_{y\in N(x)} C_{x,y} - 2i(x)\Big\}1_{\{x\}}.$$
A divisor $D = \sum_{x\in V_G} \ell(x)i(x)1_{\{x\}}$ is said to be effective if $\ell(x) \ge 0$ for all $x \in V_G$. For each $O \in \mathcal{O}$, we introduce the divisor $\nu_O$ given by
$$\nu_O = \sum_{x\in V_G} \nu_O(x)1_{\{x\}},$$
with $\nu_O(x) = \sum_{y\in N(x),\ y\,\ldots}$ […] $\ldots\,n$, since $i_n(x)/i_j(x)$ is a Z-valued function on $V_n$, $i_{(G_n,C_n)} = \min\{|\sum_{x\in V_n} \ell(x)i_n(x)| \mid \ell\colon V_n \to \mathbb{Z} \text{ with } \sum_{x\in V_n} \ell(x)i_n(x) \neq 0\}$ is represented as an integer multiple of $i_{(G_j,C_j)} = \min\{|\sum_{x\in V_j} \ell(x)i_j(x)| \mid \ell\colon V_j \to \mathbb{Z} \text{ with } \sum_{x\in V_j} \ell(x)i_j(x) \neq 0\}$.

A formal linear combination $D = \sum_{x\in V} \ell(x)i(x)1_{\{x\}}$ of the family $\{1_{\{x\}} \mid x \in V\}$ of indicator functions with at most countably many non-zero real coefficients $\{\ell(x)i(x) \mid x \in V\}$ is called a divisor on $G$. Then, the divisor $(D)_n = \sum_{x\in V_{n-1}} \ell(x)i(x)1_{\{x\}}$ is regarded as a divisor on the finite graph $G_n$ for any positive integer $n$. We call $(D)_n$ the restriction of $D$ to $V_n$. Our objective is to extend the Riemann-Roch theorem on finite graphs in [3] to one on an infinite graph for divisors $D = \sum_{x\in V} \ell(x)i(x)1_{\{x\}}$ on $G$ satisfying $\sum_{x\in V} |\ell(x)|i(x) < \infty$. To achieve this, it is necessary to discuss the convergence of $r_n(D)$, defined as in Sect. 2 on each $G_n$, at least in the case that $\mathrm{supp}[D] = \{x \in V \mid \ell(x)i(x) \neq 0\}$ is a finite set.

One of the key properties of the graphs for our aim is the so-called spectral gap of $L$. We assume some conditions on the infinity of $G$ in order to base our approach on spectral gap theory. We denote the finite measure $m_{G_n}$ given in the previous subsection by $m_n$ and introduce the probability measure $\mu_n$ given by $\mu_n(A) = m_n(A)/m_n(V_n)$ for $A \subset V_n$. For $x \in V_n$, $m_n(\{x\})$ is denoted briefly by $m_n(x)$. In terms of a reversible Markov chain $\{X_n\}$ determined by the transition matrix defined in (1), for any vertex $x$ with $d(x, v_0) = n$, we see that
$$P_x(X_1 \in V_n) = \frac{m_n(x)}{m(x)},\qquad P_x(X_1 \in V_{n-1}) = \frac{1}{m(x)} \sum_{y\in N(x)\cap V_{n-1}} C_{x,y},$$
and these two identities imply $P_x(X_1 \in V_n) \ge P_x(X_1 \in V_{n-1})$. From these relationships among the probabilities,
$$P_x(X_1 \in V_{n-1}^c) \ge \frac{m(x) - m_n(x)}{m(x)}.$$
We introduce $\rho_n = \sup_{x\in S_n} P_x(X_1 \in V_{n-1}^c)$ for any positive integer $n$ and control the behavior of $\{X_n\}$ by assuming that
$$\limsup_{n\to\infty} \rho_n < 1. \qquad (2)$$
In fact, by the assumption (2), we can take $\rho < 1$ such that $\limsup_{n\to\infty} \rho_n < \rho$. Then we have
$$m(x) - m_n(x) \le \rho_n m(x) \quad\text{and}\quad m(x) \le \frac{1}{1-\rho}\,m_n(x)$$
for any $x \in S_n$ with sufficiently large $n$. Since $m(x) = m_n(x)$ for any $x \in V_{n-1}$, the first estimate and the finiteness of the measure $m$ imply that $m(V_n) - m_n(V_n) \to 0$ as $n \to \infty$. The following assertion is crucial for establishing the Riemann-Roch theorem in our approach.
Theorem 3.3 (Poincaré Inequality) There exists a positive constant $A$ with $A < 1$ such that if
$$\limsup_{n\to\infty} \rho_n < A, \qquad (3)$$
then there exists a positive constant $C$ such that
$$\|f\|^2_{L^2(\mu)} \le C\,\mathcal{E}(f,f) \qquad (4)$$
holds for any $f \in L^2(\mu)$ with $(f,1)_{L^2(\mu)} = 0$.

Remark 3.4 Let $B(a)$ be the smaller solution of the quadratic equation
$$(e^a - e^{-a})t^2 - 2(e^a - 2e^{-a} + 1)t + 1 - 2e^{-a} = 0.$$
We can take the maximum of $B(a)$ subject to $a > \log 2$ as the positive constant $A$ in the assumption of the theorem. By a numerical calculation, one sees that $\max\{B(a) \mid a > \log 2\} \approx 0.0569$.

For a connected subgraph $U$ of $G$, we introduce
$$\lambda_U = \inf\{\tilde{\mathcal{E}}_U(f,f) \mid f \in L^2(\mu_U),\ (f,1)_{L^2(\mu_U)} = 0 \text{ and } \|f\|_{L^2(\mu_U)} = 1\},$$
called a spectral gap of $L_U$, where $\tilde{\mathcal{E}}_U(u,v) = m_U(V_U)^{-1}\mathcal{E}_U(u,v)$. Similar to the relationship between the Laplace operator $L_U$ on $L^2(m_U)$ and $\mathcal{E}_U$, we note that $(L_U u, v)_{L^2(\mu_U)} = \tilde{\mathcal{E}}_U(u,v)$. It is well known that if $U$ is a finite graph, then $\lambda_U > 0$ (cf. [5]).

Corollary 3.5 Under the assumption of Theorem 3.3, we have
$$\lambda_G \ge \frac{1}{C\,m(V)}.$$
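The numerical value quoted in Remark 3.4 and the size of $\rho_n$ in a concrete case can be explored with a short Python sketch. Everything here is an illustration under stated assumptions: the function names are hypothetical, the maximum of $B(a)$ is only estimated on a finite grid, and the graph is an assumed half-line $0 - 1 - 2 - \cdots$ with base vertex $v_0 = 0$, weights $C_{k,k+1}$, $m(x) = \sum_y C_{x,y}$ and one-step transition probabilities $C_{x,y}/m(x)$ (the transition matrix (1) itself is not reproduced in this part of the text). On this half-line $S_n = \{n\}$, so $\rho_n = C_{n,n+1}/(C_{n-1,n} + C_{n,n+1})$.

```python
import math
from fractions import Fraction


def smaller_root(a):
    """Smaller root B(a) of (e^a - e^-a) t^2 - 2 (e^a - 2 e^-a + 1) t + 1 - 2 e^-a = 0."""
    ea, ema = math.exp(a), math.exp(-a)
    qa = ea - ema
    qb = -2.0 * (ea - 2.0 * ema + 1.0)
    qc = 1.0 - 2.0 * ema
    disc = qb * qb - 4.0 * qa * qc
    # (-qb - sqrt(disc)) / (2 qa), rewritten to avoid cancellation.
    return 2.0 * qc / (-qb + math.sqrt(disc))


def estimate_A(a_max=6.0, steps=20000):
    """Grid estimate of max{B(a) : log 2 < a <= a_max}; a_max and steps are arbitrary choices."""
    a_min = math.log(2.0)
    best = 0.0
    for k in range(1, steps + 1):
        a = a_min + (a_max - a_min) * k / steps
        best = max(best, smaller_root(a))
    return best


def rho_half_line(weight, n):
    """rho_n on the assumed half-line, where weight(k) = C_{k,k+1} and n >= 1."""
    inward = weight(n - 1)   # edge from n back into V_{n-1}
    outward = weight(n)      # edge from n to n + 1, leaving V_{n-1}
    return outward / (inward + outward)


def geometric(k):
    return Fraction(1, 20) ** k                   # C_{k,k+1} = (1/20)^k


def factorial_decay(k):
    return Fraction(1, math.factorial(k + 1))     # C_{k,k+1} = 1/(k+1)!


A_estimate = estimate_A()
print(A_estimate)
print(rho_half_line(geometric, 5), rho_half_line(factorial_decay, 5))
```

With geometric weights $\rho_n$ is constant in $n$, so whether (3) holds depends on the ratio of consecutive weights; with factorially decaying weights $\rho_n$ tends to zero, which is the regime in which the second statement of Lemma 3.6 applies.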
Lemma 3.6 Let $\lambda_n$ be the spectral gap of $L_{G_n}$ and assume $\lambda_G > 0$. If $\limsup_{n\to\infty} \frac{\rho_n}{1-\rho_n} < \lambda_G$, then $\liminf_{n\to\infty} \lambda_n > 0$. Moreover, if $\lim_{n\to\infty} \rho_n = 0$, then
$$\liminf_{n\to\infty} \lambda_n \ge \lambda_G.$$

Lemma 3.7 If $\limsup_{n\to\infty} \frac{\rho_n}{1-\rho_n} < \lambda_G$ as assumed in Lemma 3.6, then there exists some positive constant $K$ such that any sequence $\{g_n\}$ of functions $g_n$ on $V_n$ satisfying $(1, g_n)_{L^2(\mu_n)} = 0$ for sufficiently large $n$ yields
$$\|R^{(n)} g_n\|_{L^2(\mu_n)} \le K\,\|g_n\|_{L^2(\mu_n)}$$
for sufficiently large $n$, where $R^{(n)}$ stands for the 0-order resolvent of $L_{G_n}$ on $L^2(\mu_n)$.

Remark 3.8 If the support of $g$ is contained in $V_{n_0}$ and $(1, g)_{L^2(\mu_{n_0})} = 0$ is satisfied for some $n_0$, then the identity holds for sufficiently large $n$ and accordingly $(1, g)_{L^2(\mu)} = 0$.
4 Proof of the Riemann-Roch Theorem on an Infinite Graph

In this section, we establish a Riemann-Roch theorem on the connected infinite graph $G = (V_G, E_G)$ with local finiteness and finiteness of the total volume $m(V_G)$ as in the last section, by applying the $L^2$-boundedness of the 0-order resolvent derived from spectral gap theory to a sequence of functions in the images of the Laplace operator. For that purpose, we first take a divisor $D = \sum_{x\in V_{n-1}} \ell(x)i(x)1_{\{x\}}$ on some subgraph $G_n = (V_n, E_n)$ of $G = (V_G, E_G)$ as given in the last section and denote $\Delta_{G_n}$ and $L_{G_n}$ by $\Delta_n$ and by $L_n$, respectively, where the weight $C_{n,x,y}$ for the edge $\{x,y\} \in E_n$ coincides with $C_{x,y}$. The equivalence between divisors $D = \sum_{x\in V_n} \ell(x)i(x)1_{\{x\}}$ and $D' = \sum_{x\in V_n} \ell'(x)i(x)1_{\{x\}}$ with Z-valued functions $\ell$ and $\ell'$ is defined by $D' = D + \Delta_n f$ on $V_n$ for some Z-valued function $f$ with $\mathrm{supp}[f] \subset V_n$. This relationship will be denoted by $D' \overset{n}{\sim} D$ and will be called n-equivalence. The family of total orders on $V_n$ is denoted by $\mathcal{O}_n$. For a divisor $D$ on the finite graph $G_n = (V_n, E_n)$, $r_n(D)$ is defined on the finite subgraph $G_n$ by replacing "$\sim$" in Sect. 2 with the n-equivalence "$\overset{n}{\sim}$", more specifically given by
$$r_n(D) = \min_{D' \overset{n}{\sim} D,\ O_n\in\mathcal{O}_n} \deg^+(D' - \nu_{O_n}) - i_{(G_n,C_n)}.$$
When a divisor $D = \sum_{x\in V_G} \ell(x)i(x)1_{\{x\}}$ satisfying $\sum_{x\in V_G} |\ell(x)|i(x) < \infty$ is given, the family $\{(D)_n\}$ of divisors is consistent in the sense that $((D)_j)_n = (D)_n$ whenever $j > n$. Later, by taking control over the sequence $\{O_j\}$ of total orders given by the minimization of
$$r_j((D)_n) = \min_{D' \overset{j}{\sim} (D)_n,\ O_j\in\mathcal{O}_j} \deg^+(D' - \nu_{O_j}) - i_{(G_j,C_j)}$$
on every subgraph $G_j$ satisfying $V_j \supset V_{n-1} \supset \mathrm{supp}[(D)_n]$, we facilitate successive procedures of taking limits as $j \to \infty$ and $n \to \infty$ in $r_j((D)_n)$ so that those limits are taken along an identical subsequence of positive integers.

We focus on a divisor $D$ satisfying $\mathrm{supp}[D] \subset V_{n-1}$ for some $n$. We note that this condition implies that the divisor $(D)_j$ is equal to $D$ for sufficiently large $j$. In what follows, the integer-valued function $f$ on $V_n$ such that $D' = D + \Delta_n f$ attains the minimum
$$r_n(D) = \min_{D' \overset{n}{\sim} D,\ O_n\in\mathcal{O}_n} \deg^+(D' - \nu_{O_n}) - i_{(G_n,C_n)},$$
in the right-hand side with some total order $O_n \in \mathcal{O}_n$ is called a minimizer of $r_n(D)$ and denoted by $f_n$. We denote $\sum_{x\in V_G,\ \ell(x)>0} \ell(x)i(x)$ by $\deg^+(D)$ and $-\sum_{x\in V_G,\ \ell(x)<0} \ell(x)i(x)$ by $\deg^-(D)$. This notation will be used for any divisor […] $\{f_j^{(n)}\}_{j>n}$ of functions such that
(i) $f_n = f_j^{(n)}$ on $V_n$,
(ii) $\max_{x\in V_j} |f_j^{(n)}(x)| \le \sqrt{C_n(D,K)/\min\{m(x) \mid x \in V_n\}} + 1$,
(iii) $\lim_{n\to\infty} \sup_{j>n} \|m_j(\cdot)^{-1}(\Delta_j f_j^{(n)} - D)\|_{L^1(V_j\setminus V_n;\,\mu)} = 0$.

Now we discuss a divisor $D = \sum_{x\in V_G} \ell(x)i(x)1_{\{x\}}$ satisfying $\sum_{x\in V_G} |\ell(x)|i(x) < \infty$. In the next lemma, we start with a divisor $(D)_{n_0(\varepsilon)}$ satisfying $\deg^+(D - (D)_{n_0(\varepsilon)}) + \deg^-(D - (D)_{n_0(\varepsilon)}) < \varepsilon$ for a given $\varepsilon > 0$ and we only focus on the subgraphs $V_n$ with $n \ge n_0(\varepsilon)$. For any pair of positive integers $j, n$ with $j > n \ge n_0(\varepsilon)$ and total order $O_j \in \mathcal{O}_j$, the restriction of the total order $O_j$ to $V_n$ is denoted by $O_j|_{V_n}$.

Lemma 4.2 For any $\varepsilon > 0$, there exists a sequence $\{O_{N(\varepsilon/2^l)}\}$ of total orders satisfying $O_{N(\varepsilon/2^j)} \in \mathcal{O}_{N(\varepsilon/2^j)}$ with $m(V^c_{N(\varepsilon/2^j)}) < \varepsilon/2^j$ for any non-negative integer $j$ and a sequence $\{n_j\}$ satisfying $n_1 < n_2 < \ldots$ and $n_{j+1} \ge N(\varepsilon/2^j)$ for any non-negative integer $j$ such that
$$r_{n_k}((D)_{n_l}) = \min_{D' \overset{n_k}{\sim} (D)_{n_l},\ O_{N(\varepsilon/2^l)} = O_{n_k}|_{V_{N(\varepsilon/2^l)}},\ O_{n_k}\in\mathcal{O}_{n_k}} \deg^+(D' - \nu_{O_{n_k}}) - i_{(G_{n_k},C_{n_k})}, \qquad (5)$$
whenever $k > l$. In particular, $k > l$ implies $O_{N(\varepsilon/2^l)} = O_{n_k}|_{V_{N(\varepsilon/2^l)}}$ and
$$\deg^+(\nu_{O_{n_l}} - \nu_{O_{n_k}}) + \deg^-(\nu_{O_{n_l}} - \nu_{O_{n_k}}) < m(V^c_{N(\varepsilon/2^{\min\{k,l\}})}) < \varepsilon/2^{\min\{k,l\}} \qquad (6)$$
for any positive integers $k$ and $l$.

For the following assertion, we suppose that the divisor $(D)_{n_0(\varepsilon)}$ in the last lemma is taken as $D$. In the following proposition, we impose a tighter condition on the sequence $\{\rho_n\}$ than before for our Riemann-Roch theorem on an infinite graph:
Proposition 4.3 If
$$\rho_n\,m(S_n)/\min_{x\in V_n} m(x) \to 0 \quad\text{as } n \to \infty \qquad (7)$$
and $D$ is a divisor satisfying $\mathrm{supp}[D] \subset V_{n_0(\varepsilon)-1}$, then $r_{n_k}(D)$ converges as $k \to \infty$, where $n_1, n_2, \ldots$ is the subsequence satisfying (5) associated with a sequence $\{O_{N(\varepsilon/2^j)}\}$ of total orders in Lemma 4.2.

The limit in this proposition depends on the choice of the sequence $\{O_{n_k}\}$ of total orders. However, as long as a divisor $D$ is supported by a finite graph, we can define
$$r_{\{O_{n_k}\}}(D) = \lim_{k\to\infty} r_{n_k}(D)$$
by taking a subsequence $\{O_{n_k}\}$ of total orders as in Lemma 4.2.

Remark 4.4 If one takes another base vertex $v_0'$ satisfying the condition (7) in Proposition 4.3 instead of $v_0$, then $r_{n'_k}(D)$ converges as $k \to \infty$ for the same divisor $D$ as shown in the proposition. By a procedure similar to the proof of the proposition, one sees not only $\limsup_{\ell\to\infty} r_{n'_\ell}(D) \le \liminf_{k\to\infty} r_{n_k}(D)$ but also $\limsup_{\ell\to\infty} r_{n_\ell}(D) \le \liminf_{k'\to\infty} r_{n'_{k'}}(D)$. Accordingly, $\lim_{k\to\infty} r_{n_k}(D)$ does not depend on the choice of the base vertex satisfying (7).

For a divisor $D = \sum_{x\in V_G} \ell(x)i(x)1_{\{x\}}$ on $G$, we introduce effective divisors $D^+$ and $D^-$ given respectively by $D^+ = \sum_{x\in V_G,\ \ell(x)>0} \ell(x)i(x)1_{\{x\}}$ and $D^- = -\sum$ over $x \in V_G$, $\ell(x)$