Algorithmic and Symbolic Combinatorics: An Invitation to Analytic Combinatorics in Several Variables 3030670791, 9783030670795

This book uses new mathematical tools to examine broad computability and complexity questions in enumerative combinatori

English Pages 300 [426] Year 2021

Table of contents :
Foreword
Preface
Contents
List of Symbols
Chapter 1 Introduction
1.1 Algorithmic Combinatorics
1.1.1 Analytic Methods for Asymptotics
1.1.2 Lattice Path Enumeration
1.2 Diagonals and Analytic Combinatorics in Several Variables
1.2.1 The Basics of Analytic Combinatorics in Several Variables
1.2.2 A History of Analytic Combinatorics in Several Variables
1.3 Organization
References
Part I Background and Motivation
Chapter 2 Generating Functions and Analytic Combinatorics
2.1 Analytic Combinatorics in One Variable
2.1.1 A Worked Example: Alternating Permutations
2.1.2 The Principles of Analytic Combinatorics
2.1.3 The Practice of Analytic Combinatorics
2.2 Rational Power Series
2.3 Algebraic Power Series
2.4 D-Finite Power Series
2.4.1 An Open Connection Problem
2.5 D-Algebraic Power Series
Appendix on Complex Analysis
Problems
References
Chapter 3 Multivariate Series and Diagonals
3.1 Complex Analysis in Several Variables
3.1.1 Singular Sets of Multivariate Functions
3.1.2 Domains of Convergence for Multivariate Power Series
3.2 Diagonals
3.2.1 Properties of Diagonals
3.2.2 Representing Series Using Diagonals
3.3 Multivariate Laurent Expansions and Other Series Operators
3.3.1 Convergent Laurent Series and Amoebas
3.3.2 Diagonals and Non-Negative Extractions of Laurent Series
3.4 Sources of Rational Diagonals
3.4.1 Binomial Sums
3.4.2 Irrational Tilings
3.4.3 Period Integrals
3.4.4 Kronecker Coefficients
3.4.5 Positivity Results and Special Functions
3.4.6 The Ising Model and Algebraic Diagonals
3.4.7 Other Sources of Examples
Problems
References
Chapter 4 Lattice Path Enumeration, the Kernel Method, and Diagonals
4.1 Walks in Cones and The Kernel Method
4.1.1 Unrestricted Walks
4.1.2 A Deeper Kernel Analysis: One-Dimensional Excursions
4.1.3 Walks in a Half-Space
4.1.4 Walks in the Quarter-plane
4.1.5 Orthant Walks Whose Step Sets Have Symmetries
4.2 Historical Perspective
4.2.1 The Kernel Method
4.2.2 Recent History of Lattice Paths in Orthants
Problems
References
Part II Smooth ACSV and Applications
Chapter 5 The Theory of ACSV for Smooth Points
5.1 Central Binomial Coefficient Asymptotics
5.1.1 Asymptotics in General Directions
5.1.2 Asymptotics of Laurent Coefficients
5.2 The Theory of Smooth ACSV
5.3 The Practice of Smooth ACSV
5.3.1 Existence of Minimal Critical Points
5.3.2 Dealing with Minimal Points that are not Critical
5.3.3 Perturbations of Direction and a Central Limit Theorem
5.3.4 Genericity of Assumptions
Problems
References
Chapter 6 Application: Lattice Walks and Smooth ACSV
6.1 Asymptotics of Highly Symmetric Orthant Walks
6.1.1 Asymptotics for All Walks in an Orthant
6.1.2 Asymptotics for Boundary Returns
6.1.3 Parameterizing the Starting Point
Problems
References
Chapter 7 Automated Analytic Combinatorics
7.1 An Overview of Results and Computations
7.1.1 Surveying the Computations
7.1.2 Minimal Critical Points in the Combinatorial Case
7.1.3 Minimal Critical Points in the General Case
7.2 ACSV Algorithms and Examples
7.2.1 Examples
7.3 Data Structures for Polynomial System Solving
7.3.1 Gröbner Bases and Triangular Systems
7.3.2 Univariate Representations
7.4 Algorithmic ACSV Correctness and Complexity
Appendix on Solving and Bounding Univariate Polynomials
7.4.1 Polynomial Height Bounds
7.4.2 Polynomial Root Bounds
7.4.3 Resultant and GCD Bounds
7.4.4 Algorithms for Polynomial Solving and Evaluation
Problems
References
Part III Non-Smooth ACSV
Chapter 8 Beyond Smooth Points: Poles on a Hyperplane Arrangement
8.1 Setup and Definitions
8.2 Asymptotics in Generic Directions
8.2.1 Step 1: Express the Cauchy Integral as Sum of Imaginary Fibers
8.2.2 Step 2: Determine the Contributing Singularities
8.2.3 Step 3: Express the Cauchy Integral as Sum of Local Contributing Integrals
8.2.4 Step 4: Compute Residues
8.2.5 Step 5: Determine Asymptotics
8.2.6 Dealing with Non-Simple Arrangements
8.3 Asymptotics in Non-Generic Directions
Problems
References
Chapter 9 Multiple Points and Beyond
9.1 Local Geometry of Algebraic and Analytic Varieties
9.2 ACSV for Transverse Points
9.2.1 Critical Points and Stratifications
9.2.2 Asymptotics via Residue Forms
9.3 A Geometric Approach to ACSV
9.3.1 A Gradient Flow Interpretation for Analytic Combinatorics
9.3.2 Attacking the Connection Problem through ACSV and Numeric Analytic Continuation
9.3.3 The State of Analytic Combinatorics in Several Variables
Problems
References
Chapter 10 Application: Lattice Paths, Revisited
10.1 Mostly Symmetric Models in an Orthant
10.1.1 Diagonal Expressions and Contributing Points
10.1.2 Asymptotics for Positive Drift Models
10.1.3 Asymptotics for Negative Drift Models
10.2 Lattice Path Problems to Test Your Skills
Problems
References
Index
Author Index

Texts & Monographs in Symbolic Computation

Stephen Melczer

Algorithmic and Symbolic Combinatorics An Invitation to Analytic Combinatorics in Several Variables

Texts & Monographs in Symbolic Computation A Series of the Research Institute for Symbolic Computation, Johannes Kepler University, Linz, Austria

Founding Editor Bruno Buchberger, Research Institute for Symbolic Computation, Linz, Austria Series Editor Peter Paule, Research Institute for Symbolic Computation (RISC), Johannes Kepler University, Linz, Austria

Mathematics is a key technology in modern society. Symbolic Computation is on its way to become a key technology in mathematics. “Texts and Monographs in Symbolic Computation” provides a platform devoted to reflect this evolution. In addition to reporting on developments in the field, the focus of the series also includes applications of computer algebra and symbolic methods in other subfields of mathematics and computer science, and, in particular, in the natural sciences. To provide a flexible frame, the series is open to texts of various kind, ranging from research compendia to textbooks for courses. Indexed by zbMATH.

More information about this series at http://www.springer.com/series/3073

Stephen Melczer Department of Mathematics University of Pennsylvania Philadelphia, PA, USA

ISSN 0943-853X ISSN 2197-8409 (electronic) Texts & Monographs in Symbolic Computation ISBN 978-3-030-67079-5 ISBN 978-3-030-67080-1 (eBook) https://doi.org/10.1007/978-3-030-67080-1 Mathematics Subject Classification: 05-XX, 68Rxx, 68W30 © Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Dedicated to the memories of Philippe Flajolet and Herb Wilf

Foreword

A little over 100 years ago, Hardy and Ramanujan used complex integrals to estimate the number of partitions of a large integer. This gave an inkling of a deep connection between something elementary (counting) and something deep (complex variable theory). The counting problem yields a generating function; coefficients of this generating function can be computed or approximated via Cauchy’s integral formula. For many decades Hardy and Ramanujan’s work, built on ideas of de Moivre, Bernoulli, Euler, and Dirichlet, stood as a lone tour de force. Enumeration continued to appear elementary and isolated from the rest of mathematics. In the 1950’s Hardy and Ramanujan’s circle method (integrate over a circle approaching the singularities of the generating function) was systematized somewhat by Hayman. A few decades later, Flajolet and Odlyzko distilled the process of singularity analysis to a set of effective transfer theorems, allowing almost automatic derivation of asymptotics. Notably, over this span of seventy years, none of the analyses exceeded the technical difficulty of Hardy and Ramanujan’s. This book introduces a beast of quite a different color: Analytic Combinatorics in Several Variables (ACSV). Generating functions in more than one variable can be quite powerful. They capture joint distributions of combinatorial features, for example the size of the ground set of a permutation as well as its number of cycles, or the location and orientation of a domino in a random tiling, along with the size of the tiling. Transferring the ideas of the Hardy-Ramanujan analysis to the multivariate setting, however, turns out to summon mathematics from the four corners of the known mathematical world. Starting with a Cauchy integral, this time in several variables, one is led in simple cases to saddle point integration. More complicated cases require some harmonic analysis and singularity theory (inverse Fourier transforms of homogeneous rational forms). 
Sometimes there is a nontrivial choice of where to put the contour; this invokes algebraic topology and stratified Morse theory. Making all of the steps effective requires computational algebra and homotopy methods. The story is beautiful but also difficult and eclectic. Melczer finds a way to tell it understandably and simply. Indeed, for Melczer, computation drives understanding. To quote from Knuth, “Science is what we understand well enough to explain to a
computer.” And so, drawing from the author’s expertise in computer algebra, every part of the story is told from the viewpoint of effective computation. The book contains deep theorems, yes, but it embodies much more: a tutorial in computer algebra, expertly conceived illustrations, and a very rich collection of examples. Among the examples one finds Kronecker coefficients, rational period integrals, and models from statistical physics. The author’s background in lattice walk enumeration keeps the book grounded in yet another source of compelling examples. As the title implies, this book succeeds where our own arguably has not, in making this material inviting. Those taking up the invitation to ACSV will be expertly guided into beautiful new territory. February 2020,

Robin Pemantle and Mark Wilson

Preface

If you ask a mathematician what they love most about mathematics, certain answers invariably arise: beauty, abstraction, creativity, logical structure, connection (between disciplines and between people), elegance, applicability, and fun. This book can be viewed as a humble attempt to show that combinatorics is the branch of mathematics best situated to embody and illustrate all of these virtues. Our core subject is the large-scale behaviour of combinatorial objects, with a focus on two goals: the calculation of approximate asymptotic behaviour of sequences arising in combinatorial contexts, and the derivation of limit theorems describing how the parameters of combinatorial objects behave for random objects of large size. We begin with a simple idea, that to study a sequence we should look at its generating function (the formal series whose coefficients are the sequence of interest). Algebraic, differential, and functional equations satisfied by the generating function then represent data structures which encode the sequence, with more complicated sequences requiring more complicated encodings. These encodings can be used to computationally manipulate the sequence and, in many cases, determine its underlying properties. The field of analytic combinatorics studies the asymptotic behaviour of univariate sequences by applying techniques from complex analysis to their generating functions. The universality of many properties of analytic functions often allows for an automatic asymptotic computation, with much of the theory now implemented in computer algebra systems. In this sense the study of analytic combinatorics for univariate sequences is somewhat classical; it is captured in glorious detail by the book Analytic Combinatorics of Philippe Flajolet and Robert Sedgewick. 
More recently, in the early twenty-first century, Robin Pemantle and Mark Wilson (later joined by Yuliy Baryshnikov) combined methods from complex analysis, singularity theory, algebraic and differential geometry, and topology to form a theory of analytic combinatorics in several variables (ACSV). The textbook Analytic Combinatorics in Several Variables, by Pemantle and Wilson, provides a comprehensive overview of the subject, but its use of advanced constructions coming from the various mathematical disciplines conscripted into their work makes it suitable mainly for researchers with strong mathematical backgrounds across several domains.

The aim of this book is to provide a more accessible introduction to this vast and beautiful area of combinatorics. There are several (potentially overlapping) audiences: mathematicians interested in the behaviour of functions satisfying certain algebraic, differential, or functional equations; combinatorialists interested in learning the theory of analytic combinatorics and analytic combinatorics in several variables; computer scientists interested in the computational aspects of these subjects; and researchers from a variety of domains with an interest in the resulting applications. Our presentation is calibrated for an audience of math and computer science graduate students and researchers, although advanced undergraduates and those from adjacent research areas shouldn’t be scared away. Previous knowledge of sequences and series, at the level of an advanced calculus course or first course in analysis, is the main prerequisite. Of all the advanced mathematical topics encountered, a familiarity and comfort with the basics of complex analysis (analytic functions, residues, the Cauchy integral formula) are the most crucial, although the necessary complex analytic background is reviewed in an appendix. Additional knowledge from commutative algebra, singularity theory, algebraic and differential geometry, and topology can help put certain results in context but are not assumed. Applications from numerous mathematical and scientific domains are given, including many illustrations of the various techniques on lattice path enumeration problems. We put a strong focus on computation, and the companion website to this text contains computer algebra code working through the examples contained here. Because it draws from so many different areas of mathematics, analytic combinatorics in several variables has a reputation of being powerful yet impenetrable. 
My deepest wish is that this work illuminates the vast riches of ACSV and opens up the area to new researchers across disciplines.

Acknowledgments

First and foremost, I owe a great debt to the three architects of analytic combinatorics in several variables, Robin Pemantle, Mark Wilson, and Yuliy Baryshnikov, for their support and guidance over the last several years. This book grew out of my doctoral thesis for the University of Waterloo and the École normale supérieure de Lyon, which was made incalculably better by the supervision of Bruno Salvy and George Labahn, and by the thesis reporters and members of the thesis committee: Jason Bell, Sylvie Corteel, Michael Drmota, Ira Gessel, and Éric Schost. Early versions of this manuscript were used as the basis for short lecture series at the University of Illinois Urbana-Champaign and the Research Institute for Symbolic Computation at JKU Linz, and for graduate courses at the University of Pennsylvania and the Université du Québec à Montréal, and I thank all the students from those classes for their feedback. Russell May gave invaluable comments on the text, helping to improve its presentation and fix many typographical errors. Marcus Michelen, Shaoshi Chen, Chaochao Zhu, Manuel Kauers, Alin Bostan, and Marc Mezzarobba gave feedback on various versions of the manuscript, along with my summer students Keith Ritchie and Andrew Martin. I would also like to thank Persi Diaconis, Jessica Khera, Erik Lundberg, and Armin Straub for suggesting problems which appear here. I originally became interested in analytic combinatorics as an undergraduate student, under projects (some of which inspired problems in this text) supervised by Marni Mishna, Alin Bostan, and Manuel Kauers, and I hope this text
conveys a sense of the wonder I experienced under their supervision. In addition to those mentioned above, some of the work presented here originally appeared in papers coauthored by Mireille Bousquet-Mélou, Julien Courtiel, Éric Fusy, and Kilian Raschel, all of whom taught me a great deal about combinatorics. Thanks also to Peter Paule for encouraging me to write this book and agreeing to be its mathematical editor, and Martin Peters and Leonie Kunz at Springer. Finally, this book is dedicated to Philippe Flajolet and Herb Wilf, two inspirational mathematicians–on both personal and professional terms–whom I never had a chance to meet. I would like to thank Ruth Wilf for welcoming me to Philadelphia during my time there and sharing stories of Herb. I also thank my family (my parents, Laura, Celia, Harry, and Su) and friends who have supported me over the years. September 2020,

Stephen Melczer

List of Symbols

$\mathbb{C}$    Set of complex numbers
$\mathbb{C}^*$    Set of non-zero complex numbers
$\mathbb{A}$    Set of algebraic numbers (complex roots of polynomials in $\mathbb{Z}[z]$)
$\mathbb{R}$    Set of real numbers
$\mathbb{R}^*$    Set of non-zero real numbers
$\mathbb{R}_{>0}$    Set of positive real numbers
$\mathbb{Q}$    Set of rational numbers
$\mathbb{Z}$    Set of integers
$\mathbb{N}$    Set of natural numbers, including 0
$\mathbb{N}_{>0}$    Set of positive integers
$\Re(z)$    Real part of $z \in \mathbb{C}$
$(a_n)_{n=0}^{\infty}$    Sequence with terms $a_0, a_1, \dots$
$K$    An arbitrary field
$K[z]$    Ring of polynomials with coefficients in $K$
$K(z)$    Field of rational functions over $K$
$K[[z]]$    Ring of formal power series over $K$
$K((z))$    Field of Laurent series over $K$
$K^{\mathrm{fra}}((z))$    Field of Puiseux series over $K$
$[z^n]F(z)$    Coefficient of $z^n$ in (power, Laurent, or Puiseux) series $F(z)$
$f_n = O(g_n)$    States existence of $M, N > 0$ such that $|f_n| \le M|g_n|$ for all $n \ge N$
$f_n = o(g_n)$    States the limit of $f_n/g_n$ goes to 0 as $n \to \infty$
$f_n \sim g_n$    States the limit of $f_n/g_n$ goes to 1 as $n \to \infty$
$f_n = \tilde{O}(g_n)$    States $f_n = O(g_n \log^k |g_n|)$ for some $k \in \mathbb{N}$
$\Gamma(z)$    The Euler gamma function, defined in the appendix to Chapter 2
$\mathbf{z}$    Vector $(z_1, \dots, z_d)$
$\mathbf{z}_{\hat{k}}$    Vector $(z_1, \dots, z_{k-1}, z_{k+1}, \dots, z_d)$ given by removing the $k$th term of $\mathbf{z}$
$\hat{\mathbf{z}}$    Shorthand for $\mathbf{z}_{\hat{d}} = (z_1, z_2, \dots, z_{d-1})$
$d\mathbf{z}$    Shorthand for $dz_1\, dz_2 \cdots dz_d$ in integrals
$\mathbf{z}^{\mathbf{i}}$    Monomial $z_1^{i_1} \cdots z_d^{i_d}$
$F_{z_j}(\mathbf{z})$    Partial derivative of $F(\mathbf{z})$ with respect to the variable $z_j$
$(\nabla F)(\mathbf{z})$    Gradient $(F_{z_1}, \dots, F_{z_d})$
$(\nabla_{\log} F)(\mathbf{z})$    Logarithmic gradient $(z_1 F_{z_1}, \dots, z_d F_{z_d})$
$V(f_1, \dots, f_r)$    Solutions of the system $f_1(\mathbf{z}) = \cdots = f_r(\mathbf{z}) = 0$ for $\mathbf{z} \in \mathbb{C}^d$
$\partial_j$    Differential operator $\partial/\partial z_j$ or $\partial/\partial \theta_j$, depending on context
$H^s(\mathbf{z})$    Square-free part of polynomial $H(\mathbf{z})$, product of irreducible factors
$V$    Singularities of meromorphic $F(\mathbf{z})$, when $F$ is understood
$V^*$    Elements of $V$ with non-zero coordinates
$V_{\mathbb{R}}$    Elements of $V$ with real coordinates
$V_{>0}$    Elements of $V$ with positive real coordinates
$|\mathbf{w}|$    The point $(|w_1|, \dots, |w_d|) \in \mathbb{R}^d_{\ge 0}$ for any $\mathbf{w} \in \mathbb{C}^d$
$T(\mathbf{w})$    The polytorus $\{\mathbf{z} \in \mathbb{C}^d : |z_j| = |w_j|\}$ for any $\mathbf{w} \in \mathbb{C}^d$
$D(\mathbf{w})$    The polydisk $\{\mathbf{z} \in \mathbb{C}^d : |z_j| \le |w_j|\}$ for any $\mathbf{w} \in \mathbb{C}^d$
$\mathbf{e}^{(j)}$    Elementary basis vector with 1 in position $j$ and all other entries 0
$W$    Singular set $V(H)$ together with coordinate hyperplanes $V(z_1 \cdots z_d)$
$M$    Domain of analyticity $\mathbb{C}^d \setminus W$ of Cauchy integral
$M_{\mathbb{R}}$    Elements of $M$ with real coordinates
$C_{\mathbf{x}}$    Imaginary fiber $\mathbf{x} + i\mathbb{R}^d = \{\mathbf{x} + i\mathbf{y} : \mathbf{y} \in \mathbb{R}^d\}$ defined by $\mathbf{x} \in \mathbb{R}^d$
$\operatorname{sgn}(z)$    The univariate sign function, equal to 0 if $z = 0$ and $z/|z|$ otherwise
$\operatorname{sgn}(\mathbf{z})$    The multivariate sign function, equal to $\operatorname{sgn}(z_1 \cdots z_d)$
$\operatorname{sgn}_{\sigma}(B)$    Sign of component $B$ with respect to $\sigma$
$\sigma_B$    Critical point for bounded component $B$ (see Proposition 8.2)
$B(\sigma)$    Bounded component for critical point $\sigma$ (see Definition 8.7)
$N(\mathbf{w})$    Normal cone corresponding to $\mathbf{w}$
$\tau_{\sigma}$    Linking torus
$\nu_{B,\sigma}$    Linking constant of component $B$ and critical point $\sigma$
$O_{\mathbf{w}}$    The local ring at $\mathbf{w} \in \mathbb{C}^d$
$V_S$    The flat defined by an index set $S$
$S_S$    The stratum defined by an index set $S$

Chapter 1

Introduction

For it is unworthy of excellent men to lose hours like slaves in the labor of calculation which could safely be relegated to anyone else if the machine were used.
- Gottfried Wilhelm Leibniz

To count is a modern practice, the ancient method was to guess.
- Samuel Johnson

A fundamental problem in mathematics is how to efficiently encode mathematical objects and, from such encodings, determine their underlying properties. As an illustrative example, imagine encoding the elements of different algebraic structures on a computer.

• An integer can be encoded in its binary representation together with a bit indicating its sign.
• A rational number can be encoded as a pair of integers representing its numerator and denominator.
• An algebraic number over the rationals, $\alpha \in \mathbb{A}$, can be encoded by its minimal polynomial $m_\alpha(z) \in \mathbb{Z}[z]$ (defined by a finite list of integers) and an isolating region of the complex plane (defined, for instance, as a disk in the complex plane with rational radius and a centre with rational real and imaginary parts).

Although the first two representations here are somewhat explicit, the third is implicit: we can view the integer polynomial $m_\alpha$ as a ‘data structure’ storing $\alpha$, from which information about $\alpha$ can be computed as desired. On the other hand, since the field of complex numbers $\mathbb{C}$ is uncountable its elements cannot even be listed. Thus, there is a nesting of algebraic structures of increasing complexity

$$\mathbb{Z} \subset \mathbb{Q} \subset \mathbb{A} \subset \mathbb{C}$$

going from a ring whose elements can be encoded and manipulated directly to a field where almost nothing can be decided for a general element.

In this text we examine questions of computability (the study of what can be decided) and computational complexity (the study of how fast something can be decided) in enumerative combinatorics, with a focus on the limiting behaviour of sequences and large discrete structures. Examining different encodings of increasingly complicated classes of sequences, we will see how current questions on these topics focus around sequences defined by multivariate rational functions.
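To make the implicit encoding of an algebraic number concrete, the following Python sketch stores $\sqrt{2}$ as its minimal polynomial together with an isolating region, and computes information about it on demand. It is a simplified illustration rather than the encoding used by any particular computer algebra system: it assumes a real algebraic number, so the isolating disk is replaced by a rational interval on which the polynomial changes sign exactly once, and the names `eval_poly` and `refine` are hypothetical.

```python
from fractions import Fraction

def eval_poly(coeffs, x):
    """Evaluate the polynomial c0 + c1*x + c2*x^2 + ... at x using Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

def refine(coeffs, lo, hi, tol):
    """Shrink an isolating interval [lo, hi] by bisection until hi - lo <= tol.

    Assumption: the polynomial changes sign exactly once on [lo, hi].
    """
    f_lo = eval_poly(coeffs, lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = eval_poly(coeffs, mid)
        if f_mid == 0:
            return mid, mid
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # the root lies in the upper half
        else:
            hi = mid                # the root lies in the lower half
    return lo, hi

# sqrt(2) encoded as (minimal polynomial z^2 - 2, isolating interval [1, 2]).
SQRT2 = ([-2, 0, 1], Fraction(1), Fraction(2))
lo, hi = refine(*SQRT2, tol=Fraction(1, 10**12))
print(float(lo), float(hi))
```

Only exact rational arithmetic is used, so the interval returned is guaranteed to contain the encoded number under the stated assumption.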
At the heart of this work is the new field of analytic combinatorics in several variables (abbreviated ACSV), and the majority of this text is dedicated to the development and exposition of this theory. In addition to providing tools for the study of univariate sequences, ACSV also allows for an analysis of multidimensional sequences encoded by series expansions of multivariate functions. This allows for striking results, such as limit theorems for various combinatorial objects. At each stage of our study we focus on effective methods and implementable algorithms, with links to necessary software packages given at the textbook website https://melczer.ca/textbook/, which also contains code working through every computational example in this textbook. The majority of the examples are worked through in the Maple [1] and Sage [51] computer algebra systems, although other software (such as MAGMA [15]) is used occasionally when specific packages are required. Of particular importance for this text are the Maple gfun package of Salvy and Zimmermann [48] and the Sage ore_algebra package of Kauers et al. [33]. Links to the most up-to-date versions of this software are given on the book website.

© Springer Nature Switzerland AG 2021. S. Melczer, Algorithmic and Symbolic Combinatorics, Texts & Monographs in Symbolic Computation, https://doi.org/10.1007/978-3-030-67080-1_1

1.1 Algorithmic Combinatorics

Given a sequence $(f_n) = f_0, f_1, f_2, \dots$, the generating function of $(f_n)$ is the series

$$F(z) = \sum_{n \ge 0} f_n z^n.$$

We typically consider the case when the $f_n$ are complex numbers, although generating functions can be defined formally over more general rings in a manner described in later chapters. Wilf [53] famously described a generating function as a “clothesline on which we hang up a sequence of numbers for display,” aptly summarizing their role as data structures. Assuming the terms of $(f_n)$ are complex numbers which grow at most exponentially, i.e., there exists $K > 0$ such that $|f_n| < K^n$ for all $n \in \mathbb{N}$, the generating function $F(z)$ can be viewed as the power series expansion of a complex-valued function at the origin. A sequence can thus be encoded by equations satisfied by its generating function. As an example, consider the following classes of functions which appear frequently in combinatorial contexts.

• A polynomial $F(z) \in \mathbb{Q}[z]$ can be encoded as a finite sequence of rational numbers.
• A rational function $F(z) \in \mathbb{Q}(z)$ can be encoded as a pair of polynomials in $\mathbb{Z}[z]$, corresponding to its numerator and denominator.
• Say that $F(z)$ is algebraic if there exists a non-zero polynomial $P(z, y) \in \mathbb{Z}[z, y]$ such that $P(z, F(z)) = 0$. The algebraic functions we consider can be encoded by such polynomials $P(z, y)$ together with initial terms of their power series at the origin to uniquely determine them among the roots of $P$ with respect to $y$.


• Say that $F(z)$ is differentially finite (D-finite) if it satisfies a non-zero linear differential equation with coefficients in $\mathbb{Z}[z]$. A D-finite function can be encoded by an annihilating linear differential equation and a finite set of initial conditions.
• Say that $F(z)$ is differentially algebraic (D-algebraic) if there exists $d \in \mathbb{N}$ and a non-zero multivariate polynomial $P(z_0, z_1, \dots, z_d)$ with coefficients in $\mathbb{Z}$ such that $F$ and its derivatives satisfy $P(F, F', \dots, F^{(d)}) = 0$. A D-algebraic function can be encoded by the polynomial $P$ and a finite set of initial conditions.

Sequences coming from closely related combinatorial problems often have generating functions which share similar properties. For example, the generating function of any sequence satisfying a linear recurrence relation with constant coefficients is rational, while the generating function of any sequence satisfying a linear recurrence relation with polynomial coefficients is D-finite. Many amazing methods have been developed to go from a combinatorial enumeration problem to an encoding of the relevant generating function. Given a generating function specified by one of the above encodings, there are many natural questions that can be asked, including

• Can a simple¹ closed form of the term $f_n$ be determined as a function of $n$?
• Can the asymptotic behaviour of $f_n$ be determined as a function of $n$ directly from this encoding? Can it be determined from any encoding?
• How else can this generating function be encoded, and how can one convert to these other encodings? What is the “simplest” encoding possible?

The asymptotic behaviour of a sequence $f_n$ describes how it behaves as $n \to \infty$. Frequently, the asymptotic behaviour of a sequence can be more revealing than cumbersome exact enumeration formulas and is often easier to determine.

Example 1.1 (Exact vs. Asymptotic Enumeration for Unlabelled Graphs) Wilf [52, Ex. 3] conjectured in 1982 that the number $f_n$ of unlabelled graphs on $n$ nodes cannot be computed in polynomial time with respect to $n$. This conjecture is still open, although it has long been known² that

$$\frac{f_n \, n!}{2^{\binom{n}{2}}} \to 1 \quad \text{as } n \to \infty.$$
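The limit in Example 1.1 can be observed numerically. The sketch below is only an illustration: it hard-codes the number of unlabelled graphs on 0 to 12 nodes from OEIS sequence A000088 (data not given in this text) and evaluates the ratio $f_n \, n! / 2^{\binom{n}{2}}$ exactly before converting to a float.

```python
from fractions import Fraction
from math import comb, factorial

# Number of unlabelled graphs on n = 0..12 nodes, hard-coded from OEIS A000088.
UNLABELLED_GRAPHS = [1, 1, 2, 4, 11, 34, 156, 1044, 12346, 274668,
                     12005168, 1018997864, 165091172592]

def ratio(n):
    """Return f_n * n! / 2^binom(n, 2) as a float."""
    return float(Fraction(UNLABELLED_GRAPHS[n] * factorial(n), 2 ** comb(n, 2)))

for n in (4, 8, 10, 12):
    print(n, ratio(n))
```

The printed ratios decrease towards 1, although the convergence is slow for such small $n$.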

¹ Of course, the notion of a “simple” closed form expression is subjective, and thus open to interpretation. We do not touch on this delicate topic here, but refer the interested reader to Wilf [52], Stanley [49, Section 1.1], and Pak [40] for interesting meditations on the subject.
² Asymptotic behaviour of unlabelled graphs follows from the fact that there are $2^{\binom{n}{2}}$ labelled graphs on $n$ nodes, almost all of which have no automorphisms; see Noy [14, Ch. 6] for details.

1.1.1 Analytic Methods for Asymptotics

Our main tools for asymptotic enumeration come from analysis, and the systematic use of analytic techniques to study the asymptotic behaviour of sequences is known as the study of analytic combinatorics. The main results of analytic combinatorics illustrate strong links between the singularities of a generating function (roughly, points where the function ‘behaves badly’) and asymptotics of its coefficients, starting with the Cauchy integral formula. When the generating function of $(f_n)$ represents a convergent power series in a neighbourhood of the origin, the Cauchy integral formula implies that for any $n \in \mathbb{N}$ the term $f_n$ can be represented by a complex contour integral

$$f_n = \frac{1}{2\pi i} \int_\gamma F(z) \, \frac{dz}{z^{n+1}},$$

where $\gamma$ is a counter-clockwise circle in the complex plane sufficiently close to the origin. This equality relates the power series coefficients of $F$ to an analytic object, and allows one to determine asymptotics of $f_n$ by applying classical results on the asymptotics of parametrized complex integrals. Unless the terms $f_n$ decay too fast (faster than any exponential function), a complex-valued function defined by the series $F(z)$ will admit at least one singularity in the complex plane. In Chapter 2 we will see how knowledge of the function $F(z)$ near its singular points closest to the origin, known as dominant singularities, directly translates into explicit asymptotic formulas for $f_n$. The first major result of analytic combinatorics is that the location of the dominant singularities in the complex plane determines the exponential growth $\rho = \limsup_{n \to \infty} |f_n|^{1/n}$ of the coefficient sequence, the coarsest measure of its growth. After finding the dominant singularities of $F$, giving the exponential growth of $f_n$, full asymptotic behaviour can be deduced by studying $F$ near these points. For most examples encountered in applications it is sufficient to determine the type of each dominant singularity (roughly, if it comes from a division by zero, substituting zero into an algebraic root or logarithm, etc.) together with a small amount of additional analytic information which can then be substituted into known formulas.
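The Cauchy integral formula above can be tested numerically. The following sketch approximates the contour integral with the trapezoid rule on a circle $z = r e^{i\theta}$, for which $dz/z^{n+1} = i z^{-n} \, d\theta$. The choice of test function (the Catalan generating function $(1 - \sqrt{1-4z})/(2z)$, whose singularity sits at $z = 1/4$), the radius, the number of sample points, and the function names are all assumptions made for illustration.

```python
import cmath
from math import comb, pi

def catalan_gf(z):
    """The function (1 - sqrt(1 - 4z)) / (2z), evaluated for 0 < |z| < 1/4."""
    return (1 - cmath.sqrt(1 - 4 * z)) / (2 * z)

def cauchy_coefficient(F, n, radius=0.2, samples=256):
    """Approximate [z^n] F(z) by the trapezoid rule on the circle |z| = radius.

    With z = radius * exp(i*theta), the Cauchy integral becomes the average
    of F(z) / z^n over theta in [0, 2*pi).
    """
    total = 0
    for k in range(samples):
        z = radius * cmath.exp(2j * pi * k / samples)
        total += F(z) / z ** n
    return total / samples

for n in range(8):
    approx = cauchy_coefficient(catalan_gf, n)
    exact = comb(2 * n, n) // (n + 1)  # nth Catalan number, for comparison
    print(n, round(approx.real, 6), exact)
```

Because the integrand is analytic in an annulus around the contour, the trapezoid rule converges geometrically in the number of sample points, which is why a few hundred samples suffice here.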
Example 1.2 (The Catalan Generating Function) The power series coefficients $(c_n)$ of the Catalan generating function

$$C(z) = \frac{1 - \sqrt{1 - 4z}}{2z} = 1 + z + 2z^2 + 5z^3 + \cdots$$

count an incredible number of combinatorial sequences, including the number of rooted binary plane trees (trees where each node has at most two children). Because $z = 1/4$ is a singularity of $C(z)$ due to a square-root vanishing, and there are no other³ singularities of $C(z)$, the techniques of analytic combinatorics (in particular, Proposition 2.11 in Chapter 2) immediately imply that the $n$th power series coefficient $c_n$ behaves like $4^n n^{-3/2} \pi^{-1/2}$ as $n \to \infty$. Stanley [50] gives an extensive account of this sequence and 214 (!) different combinatorial objects it counts.

³ Although there may appear to be a problem at $z = 0$, where the denominator of $C(z)$ is zero, this is not the case: the numerator of $C(z)$ also vanishes at $z = 0$ and the limit of $C(z)$ as $z$ approaches zero exists and is finite.
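The asymptotic statement in Example 1.2 can be checked against exact values. The sketch below uses the standard closed form $c_n = \binom{2n}{n}/(n+1)$ (a known formula for the Catalan numbers, not stated above) and compares it with $4^n n^{-3/2} \pi^{-1/2}$. Logarithms are used because $4^n$ overflows floating point for large $n$; the function names are illustrative.

```python
from math import comb, exp, log, pi

def catalan(n):
    """Exact nth Catalan number."""
    return comb(2 * n, n) // (n + 1)

def asymptotic_ratio(n):
    """Return c_n / (4^n * n^(-3/2) * pi^(-1/2)), computed via logarithms."""
    log_exact = log(catalan(n))
    log_approx = n * log(4) - 1.5 * log(n) - 0.5 * log(pi)
    return exp(log_exact - log_approx)

for n in (10, 100, 1000):
    print(n, asymptotic_ratio(n))
```

The ratio approaches 1 as $n$ grows, with an error that shrinks roughly in proportion to $1/n$.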

[Figure 1.1 labels: Finite Automata, Pushdown Automata, Turing Machine; Rational, Algebraic, Rational Diagonal, D-Finite, D-Algebraic; $\mathbb{Q}$, $\mathbb{A}$, $\mathbb{C}$]

Fig. 1.1 The classes of generating functions, models of computation, and numerical constants discussed here, including the generating function class of rational diagonals which is central to analytic combinatorics in several variables.

Methods in analysis are flexible, giving some freedom when setting up calculations, and this flexibility often allows for an automated asymptotic analysis⁴. Automatically determining properties of a generating function from an encoding is the domain of algorithmic combinatorics. As we will see, asymptotic expressions for sequences with rational or algebraic generating functions can be determined automatically from the encodings discussed above. In contrast, it is undecidable to take a general D-algebraic function with rational power series coefficients, encoded as above, and determine whether or not its coefficients grow exponentially. Although much is known about the asymptotics of sequences with D-finite generating functions, the decidability of determining such asymptotics is an open problem; D-finite functions thus lie at the boundary of decidability in enumeration.

When discussing decidability results it is interesting to compare our classes of generating functions to families of objects in theoretical computer science. A finite automaton is the simplest model of computation used for pattern recognition, and counting the strings of patterns recognized by a fixed automaton by the number of symbols they contain always results in a rational generating function. Similarly, algebraic generating functions appear when enumerating patterns recognized by certain ‘pushdown’ automata. Because finite and pushdown automata are relatively simple models of computation, it is not surprising that many properties of rational and algebraic generating functions can be decided. On the other hand, it can be shown [30] that universal Turing machines can be simulated by solving D-algebraic systems! As Turing machines are powerful, it is thus not surprising that D-algebraic functions are hard to get a handle on.

The increasingly complicated classes of generating functions, models of computation, and numerical constants are displayed in Figure 1.1; these connections are further explored in Chapters 2 and 3.

⁴ We will see several examples in this text. For an early example of an automated system to determine asymptotics of sequences, implemented in CAML and Maple, see the Lambda-Upsilon-Omega (ΛΥΩ) system of Flajolet et al. [26, 27].
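The link between finite automata and rational generating functions mentioned above can be illustrated with a small computation. The sketch below counts the strings of each length accepted by a deterministic finite automaton by propagating path counts through its transition table. The example automaton (binary strings with no two consecutive 1s) and all names are hypothetical choices made for illustration.

```python
def count_accepted(transitions, start, accepting, max_len):
    """Count accepted strings of each length 0..max_len.

    transitions[state][symbol] gives the next state; a missing symbol rejects.
    """
    counts = []
    weights = {start: 1}  # state -> number of strings of the current length reaching it
    for _ in range(max_len + 1):
        counts.append(sum(w for s, w in weights.items() if s in accepting))
        new_weights = {}
        for state, w in weights.items():
            for nxt in transitions[state].values():
                new_weights[nxt] = new_weights.get(nxt, 0) + w
        weights = new_weights
    return counts

# State "a": the last symbol was not 1; state "b": the last symbol was 1.
NO_11 = {"a": {"0": "a", "1": "b"}, "b": {"0": "a"}}
counts = count_accepted(NO_11, "a", {"a", "b"}, 10)
print(counts)
```

For this automaton the counts satisfy the constant-coefficient recurrence $c_{n+2} = c_{n+1} + c_n$ with $c_0 = 1$ and $c_1 = 2$, so their generating function is the rational function $(1+z)/(1-z-z^2)$.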


Fig. 1.2 Lattice walks of length 500 restricted to the quadrant $\mathbb{N}^2$ using the step sets $S = \{(\pm 1, 0), (0, \pm 1)\}$ (left) and $S = \{(0, -1), (\pm 1, 1)\}$ (right). Note the walks have self-intersections.

1.1.2 Lattice Path Enumeration

Much of the theory of algorithmic combinatorics appears in the rich field of lattice path enumeration, which forms the main source of combinatorial applications in this textbook⁵. Roughly speaking, a lattice path model is a combinatorial class which encodes the number of ways to “move” on a lattice subject to certain constraints. More precisely, given a dimension $d \in \mathbb{N}$, a finite set of allowable steps $S \subset \mathbb{Z}^d$, and a restricting region $R \subseteq \mathbb{Z}^d$, the integer lattice path model taking steps in $S$ and restricted to $R$ is the combinatorial class consisting of walks: finite sequences of steps beginning at some fixed point of $R$ which must always stay in $R$ (see Figure 1.2). We may also restrict the class further by adding other constraints, such as only admitting walks which end in some terminal set.

In addition to a large number of applications, discussed in Chapter 4, lattice path problems form an interesting ‘combinatorial playground’ to develop new methods for dealing with a wide range of generating function behaviours. For example, the generating functions of two-dimensional models restricted to the quadrant $\mathbb{N}^2$ with step sets $S \subset \{\pm 1, 0\}^2$ already exhibit a wide variety of behaviour: they admit generating functions which are rational, algebraic, transcendentally D-finite, non-D-finite but D-algebraic, and non-D-algebraic. Progress has been made on the study of these quadrant models over the last several decades using tools from the theory of algebraic curves, formal power series approaches to discrete difference equations, probability theory, computer algebra, boundary value problems, potential theory, differential Galois theory, the study of hypergeometric functions, and several branches of complex analysis. Many of the decidability issues discussed above arise naturally in this context.

⁵ Of particular use to us will be the extreme effectiveness of analytic combinatorics in several variables for solving lattice path enumeration problems. It is interesting to note that the development of complex analysis in a single variable was greatly inspired by the study of elliptic functions, while the theory of complex analysis in several variables suffered due to lack of concrete problems. To quote the opening sentence of Blumenthal [13] from 1903, translated from the German, “If, in contrast to the widely and highly developed theory of functions of a single complex variable, the theory of functions of several variables has lagged behind, this is essentially due to the absence of suitable interesting examples which could lead the general theory.” The methods of ACSV rely heavily on complex analysis in several variables.
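The definition of a lattice path model translates directly into a dynamic program. The sketch below counts walks in the quadrant $\mathbb{N}^2$ that start at the origin, for an arbitrary finite step set, by tracking how many walks end at each point. It is a brute-force illustration (the function name and data layout are assumptions), useful for generating initial terms rather than for asymptotics.

```python
def count_walks(steps, n_max):
    """Return [w_0, ..., w_{n_max}], where w_n counts walks with n steps taken
    from `steps` that start at the origin and never leave the quadrant N^2."""
    counts = []
    current = {(0, 0): 1}  # endpoint -> number of walks ending there
    for _ in range(n_max + 1):
        counts.append(sum(current.values()))
        following = {}
        for (x, y), ways in current.items():
            for dx, dy in steps:
                p = (x + dx, y + dy)
                if p[0] >= 0 and p[1] >= 0:  # stay inside the quadrant
                    following[p] = following.get(p, 0) + ways
        current = following
    return counts

# The two step sets pictured in Figure 1.2.
LEFT_STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
RIGHT_STEPS = [(0, -1), (-1, 1), (1, 1)]
print(count_walks(LEFT_STEPS, 7))
print(count_walks(RIGHT_STEPS, 7))
```

The work grows polynomially in `n_max`, since only the number of walks ending at each reachable point is stored rather than the walks themselves.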

Example: D-finite Decidability Problems in Lattice Path Enumeration

Consider the sequence $a_n$ counting the number of walks restricted to $\mathbb{N}^2$ which begin at the origin and take $n$ steps in $S = \{(\pm 1, 0), (0, \pm 1)\}$, illustrated on the left of Figure 1.2. Using results from Chapters 3 and 4, it is possible to show that the generating function $A(z)$ of $(a_n)$ satisfies the linear differential equation

$$z^2(4z-1)(4z+1)A'''(z) + 2z(4z+1)(16z-3)A''(z) + 2(112z^2+14z-3)A'(z) + 4(16z+3)A(z) = 0, \tag{1.1}$$

a result originally conjectured by Bostan and Kauers [17] and proven by Bostan et al. [16]. Using techniques discussed in Chapter 2, it is possible to automatically translate such a differential equation to a linear recurrence relation satisfied by $a_n$,

$$(n+4)(n+3)\,a_{n+2} - 4(2n+5)\,a_{n+1} - 16(n+1)(n+2)\,a_n = 0. \tag{1.2}$$

Because this is a linear recurrence, the set of all sequences which satisfy (1.2) forms a complex vector space, and because such a sequence is uniquely determined by its first two values this vector space has dimension two. Furthermore, the methods of Chapter 2 allow us to describe a basis of this vector space consisting of two elements, $\Psi_1(n)$ and $\Psi_2(n)$, whose asymptotic expansions

$$\Psi_1(n) = 4^n n^{-1}\left(1 - \frac{3}{2n} + \cdots\right) \quad \text{and} \quad \Psi_2(n) = (-4)^n n^{-3}\left(1 - \frac{9}{2n} + \cdots\right)$$

as $n \to \infty$ can be computed to any fixed accuracy. Because $\Psi_1$ and $\Psi_2$ form a basis for all solutions of (1.2), there exist constants $\lambda_1, \lambda_2 \in \mathbb{C}$ such that our lattice path counting sequence, which is one particular solution of (1.2), satisfies $a_n = \lambda_1 \Psi_1(n) + \lambda_2 \Psi_2(n)$. Although it is currently unknown how to compute the constants $\lambda_1$ and $\lambda_2$ directly from (1.1) and initial terms of $A(z)$, we will see how to use numeric analytic continuation to rigorously approximate them to any desired accuracy. Heuristically, one could also exploit this recurrence to efficiently compute the term $a_N$ for a large index $N$, say $N = 1000$, and use this result to approximate $\lambda_1 \approx a_N / \Psi_1(N)$. In this case we can obtain a numerical approximation

$$a_n = (1.273\ldots)\,\Psi_1(n) + (5.092\ldots)\,\Psi_2(n) \sim (1.273\ldots)\, 4^n n^{-1}$$

with rigorous error bounds on the numeric constants, giving asymptotics up to a multiplicative constant which can be determined to high accuracy. Among other things, we will see that this asymptotic growth directly implies that $A(z)$ is not algebraic and thus cannot be encoded by a polynomial equation.

In contrast, let $b_n$ count the number of walks restricted to $\mathbb{N}^2$ which begin at the origin and take $n$ steps in $S = \{(0, -1), (-1, 1), (1, 1)\}$, illustrated on the right of Figure 1.2. Now the generating function $B(z)$ satisfies a linear differential equation of order 5 whose coefficients are polynomials of degree 16, from which it is possible to show that $b_n$ satisfies a linear recurrence relation of order 6 with polynomial coefficients. This linear recurrence has a basis $\Psi_1(n), \dots, \Psi_6(n)$ whose expansions as $n \to \infty$ can be computed to any fixed accuracy, where

$$\Psi_1(n) = \frac{3^n}{\sqrt{n}}\,(1 + \cdots)$$

and all other terms grow exponentially slower than $3^n$ as $n \to \infty$. As the $\Psi_j$ form a basis there are $\lambda_1, \dots, \lambda_6 \in \mathbb{C}$ such that the lattice path counting sequence $b_n$ satisfies

$$b_n = \lambda_1 \Psi_1(n) + \lambda_2 \Psi_2(n) + \cdots + \lambda_6 \Psi_6(n). \tag{1.3}$$
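Returning to the sequence $a_n$, the heuristic estimate $\lambda_1 \approx a_N / \Psi_1(N)$ described above is easy to try. The sketch below unrolls recurrence (1.2) from the initial values $a_0 = 1$ and $a_1 = 2$ (the empty walk, and the two one-step walks that stay in the quadrant), then divides by $\Psi_1(N)$ truncated after its first correction term. The truncation and the function names are assumptions of this sketch, and the result carries no rigorous error bound.

```python
from fractions import Fraction

def quadrant_walk_counts(n_max):
    """Unroll recurrence (1.2) from a_0 = 1, a_1 = 2 using exact integers."""
    a = [1, 2]
    for n in range(n_max - 1):
        numerator = 4 * (2 * n + 5) * a[n + 1] + 16 * (n + 1) * (n + 2) * a[n]
        # The division is exact because the a_n are integers satisfying (1.2).
        a.append(numerator // ((n + 4) * (n + 3)))
    return a

def estimate_lambda1(N):
    """Heuristic estimate a_N / Psi_1(N), with Psi_1 truncated to
    4^N * N^(-1) * (1 - 3/(2N))."""
    a_N = quadrant_walk_counts(N)[N]
    psi1 = Fraction(4 ** N, N) * (1 - Fraction(3, 2 * N))
    return float(Fraction(a_N) / psi1)

print(quadrant_walk_counts(7))
print(estimate_lambda1(1000))
```

Computing $a_{1000}$ this way needs only a thousand integer operations, whereas direct enumeration of the walks would be hopeless.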

Since $\Psi_1(n)$ is the asymptotically dominant basis element it appears that $b_n$ grows like $\lambda_1 3^n / \sqrt{n}$. However, perhaps surprisingly, numerically approximating $\lambda_1$ shows that it equals zero to many decimal places. In fact, we will see that the number of lattice walks on $n$ steps has exponential growth $2\sqrt{2}$, and its dominant asymptotic behaviour depends on the parity of $n$. In general, it is not known whether one can examine decompositions like (1.3) for sequences satisfying linear recurrences with polynomial coefficients and determine which coefficients are identically zero. The issue is that although one can show these constants to be zero to large accuracy, a priori there are no lower bounds on how small they can be when they are non-zero (in fact there is not even a tight characterization of the ring of numbers in which these constants lie). This issue, lying at the heart of effective asymptotic methods for sequences with D-finite generating functions, is discussed in detail in Chapter 2.

Going back to lattice paths, a full and rigorous asymptotic analysis of models restricted to the non-negative quadrant stubbornly resisted several attempts. Bostan et al. [16] gave differential equations satisfied by the generating functions of all quadrant models with step sets $S \subset \{\pm 1, 0\}^2$, and expressed these generating functions in terms of explicit hypergeometric functions. Although these representations give strong information about the sequences involved, it is still difficult to prove the conjectured asymptotics. For instance, those authors show [16, Conjecture 2] that the constant $\lambda_1$ in (1.3) is exactly zero if and only if the integral

$$\int_0^{1/3} \left\{ \frac{(1-3v)^{1/2}}{v^3 (1+v^2)^{1/2}} \left[ 1 + (1 - 10v^3)\, {}_2F_1\!\left( \begin{matrix} 3/4,\ 5/4 \\ 1 \end{matrix} \,\middle|\, 64v^4 \right) + 6v^3(3 - 8v + 14v^2)\, {}_2F_1\!\left( \begin{matrix} 5/4,\ 7/4 \\ 2 \end{matrix} \,\middle|\, 64v^4 \right) \right] - \frac{2}{v^3} + \frac{4}{v^2} \right\} dv$$

equals 1, where ${}_2F_1$ denotes the hypergeometric series

$${}_2F_1\!\left( \begin{matrix} a,\ b \\ c \end{matrix} \,\middle|\, z \right) = \sum_{n \ge 0} \frac{(a)_n (b)_n}{(c)_n} \frac{z^n}{n!}, \qquad (x)_n = x(x+1)\cdots(x+n-1).$$

It is not clear how to prove this identity. Here we will follow work of Melczer and Mishna [37] and Melczer and Wilson [39] to determine asymptotics of these, and many more, lattice path models by encoding their generating functions using a new data structure: multivariate rational diagonals. Although a linear differential equation encodes a vector space of solutions, from which the desired generating function is specified by its initial terms, this representation directly encodes the generating function of interest. Asymptotics can then be determined using the methods of analytic combinatorics in several variables.

1.2 Diagonals and Analytic Combinatorics in Several Variables

Diagonal representations encode sequences using multivariate series expansions. Given a $d$-variate complex function $F(\mathbf{z}) = F(z_1, \dots, z_d)$ with power series

$$F(\mathbf{z}) = \sum_{(i_1, \dots, i_d) \in \mathbb{N}^d} f_{i_1, \dots, i_d}\, z_1^{i_1} \cdots z_d^{i_d}$$

at the origin, we define the main diagonal of $F(\mathbf{z})$ as the univariate function obtained by taking the coefficients of this series where all variable exponents are equal,

$$(\Delta F)(z) = \sum_{n \ge 0} f_{n, \dots, n}\, z^n.$$

More generally, one can define the $\mathbf{r}$-diagonal of $F(\mathbf{z})$ for any $\mathbf{r} \in \mathbb{R}^d$ as

$$(\Delta_{\mathbf{r}} F)(z) = \sum_{n \ge 0} f_{n\mathbf{r}}\, z^n = \sum_{n \ge 0} f_{nr_1, \dots, nr_d}\, z^n,$$

where the coefficient $f_{n\mathbf{r}}$ is considered undefined when $n\mathbf{r} \notin \mathbb{N}^d$. Our methods also apply to more general series expansions whose exponents contain negative integers. Although the theory we develop works for more general classes of functions, we focus mainly on series expansions of rational functions so that tools from computer algebra can be used to automate the necessary computations. Rational diagonals are studied extensively in Chapter 3, where we will see that each $\mathbf{r}$-diagonal is D-finite and every algebraic function can be realized as the main diagonal of a bivariate rational function. Because the ring of multivariate rational diagonals lies between the class of algebraic functions, where coefficient asymptotics can be determined automatically, and the ring of D-finite functions, where this problem is still open, they make a prime subject on which to study effective asymptotics.

Chapter 3 discusses how diagonals arise naturally in combinatorics (lattice path enumeration, irrational tilings of rectangles, Kronecker coefficients), probability theory (random walk models), number theory (binomial sums such as Apéry’s sequence, used in his proof [2] of the irrationality of $\zeta(3)$), physics (statistical mechanics and the Ising model), and many other areas. Just as Bernoulli [11] used linear recurrence relations to approximate roots of univariate polynomials in the early eighteenth century, asymptotic behaviour of the $\mathbf{r}$-diagonals of a rational function $1/H(\mathbf{z})$ encodes deep information about the algebraic set defined by $H(\mathbf{z}) = 0$.
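The definition of the main diagonal can be made concrete for bivariate rational functions. The sketch below expands $G(x, y)/H(x, y)$ as a power series by comparing coefficients in $H \cdot F = G$, then reads off the coefficients $f_{n,n}$. Polynomials are stored as dictionaries from exponent pairs to coefficients; this representation and the function names are assumptions made for illustration. The test case $1/(1 - x - y)$ has coefficients $\binom{i+j}{i}$, so its main diagonal is the sequence of central binomial coefficients.

```python
from fractions import Fraction

def series_coefficients(G, H, N):
    """Power series coefficients f[i][j], 0 <= i, j <= N, of G(x, y) / H(x, y).

    G and H are dicts {(i, j): coefficient}, and H must have a non-zero
    constant term. Comparing coefficients of x^i y^j in H * F = G gives
    sum over (a, b) of H[a, b] * f[i - a][j - b] = G[i, j].
    """
    h00 = Fraction(H[(0, 0)])
    f = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        for j in range(N + 1):
            total = Fraction(G.get((i, j), 0))
            for (a, b), h in H.items():
                if (a, b) != (0, 0) and a <= i and b <= j:
                    total -= h * f[i - a][j - b]  # computed in an earlier iteration
            f[i][j] = total / h00
    return f

def main_diagonal(G, H, N):
    """First N + 1 coefficients of the main diagonal of G / H."""
    f = series_coefficients(G, H, N)
    return [f[n][n] for n in range(N + 1)]

diag = main_diagonal({(0, 0): 1}, {(0, 0): 1, (1, 0): -1, (0, 1): -1}, 6)
print([int(c) for c in diag])
```

This brute-force expansion needs on the order of $N^2$ coefficients to produce $N$ diagonal terms, which is one reason the asymptotic methods developed in this text are valuable.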

1.2.1 The Basics of Analytic Combinatorics in Several Variables

In order to study the asymptotics of rational diagonal coefficient sequences we make use of complex analysis in several variables. Suppose $F(\mathbf{z}) = G(\mathbf{z})/H(\mathbf{z})$ is a multivariate rational function, where $G$ and $H$ are coprime polynomials with integer coefficients. When $H(\mathbf{0})$ is non-zero, $F$ admits a power series expansion

$$F(\mathbf{z}) = \sum_{(i_1, \dots, i_d) \in \mathbb{N}^d} f_{i_1, \dots, i_d}\, z_1^{i_1} \cdots z_d^{i_d} \tag{1.4}$$

in some open domain of convergence $\mathcal{D}$ around the origin. As in the univariate setting, there is a strong link between the singularities of $F(\mathbf{z})$, which are the elements of the singular variety $\mathcal{V} = \{\mathbf{z} \in \mathbb{C}^d : H(\mathbf{z}) = 0\}$, and asymptotics of the $\mathbf{r}$-diagonal sequence $f_{n\mathbf{r}}$ as $n \to \infty$. A singularity is called minimal if no other point in $\mathcal{V}$ has coordinate-wise smaller modulus. The minimal points are the singularities which are closest to the origin in $\mathbb{C}^d$, similar to dominant singularities in the univariate case, and they play an important role in determining asymptotics.

The study of analytic combinatorics is more delicate in several variables. Although many⁶ univariate functions admit a finite number of dominant singularities, in dimension two and greater there will always be an infinite number of minimal points unless $F(\mathbf{z})$ is a polynomial. For a fixed direction $\mathbf{r}$, the goal is still to determine a finite number of singularities of $F(\mathbf{z})$ where the local behaviour of $F(\mathbf{z})$ allows one to determine asymptotics of $f_{n\mathbf{r}}$. The fact that this is not always possible is a reflection of the pathologies which can arise when dealing with the singularities of multivariate rational functions.

⁶ For instance, any rational, algebraic, or D-finite function has a finite number of singularities in the complex plane.


Critical Points

Fix $\mathbf{r} \in \mathbb{N}^d$ with positive coordinates. The starting point of a multivariate asymptotic analysis is a generalization of the Cauchy integral formula to several variables,

$$f_{n\mathbf{r}} = \frac{1}{(2\pi i)^d} \int_{\mathcal{C}} F(\mathbf{z}) \, \frac{dz_1 \cdots dz_d}{z_1^{nr_1+1} \cdots z_d^{nr_d+1}}, \tag{1.5}$$

where $\mathcal{C}$ is a product of circles sufficiently close to the origin. Using standard integral bounds, we will show that every minimal point $\mathbf{w} \in \mathcal{V}$ gives an upper bound

$$\rho \le \left| w_1^{-r_1} \cdots w_d^{-r_d} \right| \tag{1.6}$$

on the exponential growth $\rho = \limsup_{n \to \infty} |f_{n\mathbf{r}}|^{1/n}$ of the diagonal sequence. To find a set of minimal points where a local singularity analysis of $F(\mathbf{z})$ determines asymptotics, it makes sense to look for the minimal points minimizing this upper bound, as those are the only ones around which the integrand of (1.5) could have the same exponential growth as the diagonal sequence.

Suppose first that the denominator $H$ and its partial derivatives never simultaneously vanish. In Chapter 5 we show that frequently the minimal points giving the best bound in (1.6) satisfy the algebraic system of smooth critical point equations

$$H(\mathbf{z}) = 0, \qquad r_1^{-1} z_1 H_{z_1}(\mathbf{z}) = \cdots = r_d^{-1} z_d H_{z_d}(\mathbf{z}),$$

where $H_{z_j}$ represents the partial derivative of $H$ with respect to $z_j$. When $H$ and its partial derivatives never simultaneously vanish, any solution of this system is called a critical point of $F(\mathbf{z})$. The condition that $H$ and its partial derivatives never simultaneously vanish implies that the singular set $\mathcal{V}$ forms a manifold, meaning $\mathcal{V}$ looks like $\mathbb{C}^{d-1}$ around each of its points. When the algebraic set $\mathcal{V}$ is not a manifold, one must partition $\mathcal{V}$ into a collection of manifolds called strata and define critical points on each stratum. In Chapter 9 we discuss how the critical points on any stratum can always be specified by a system of polynomial equations. Thus, in practice it is usually easy to characterize the critical points of $F(\mathbf{z})$, which are encoded by algebraic equations, but much more difficult to decide which (if any) are minimal, as minimality is a semi-algebraic condition (it relies on inequalities between moduli of variables). When there are minimal critical points where the singular variety is locally a manifold such points must minimize the upper bound (1.6) on the exponential growth $\rho$, but this may not hold for non-smooth minimal critical points.
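As a worked instance of the smooth critical point equations, take $H(x, y) = 1 - x - y$ with direction $\mathbf{r} = (r_1, r_2)$. Here $x H_x = -x$ and $y H_y = -y$, so the system reduces to $x/r_1 = y/r_2$ together with $x + y = 1$, giving the single critical point $(x, y) = (r_1, r_2)/(r_1 + r_2)$. The sketch below, with hypothetical function names, evaluates the bound (1.6) at this point and compares it with the observed growth of the coefficients $f_{nr_1, nr_2} = \binom{n(r_1 + r_2)}{nr_1}$ of $1/(1 - x - y)$.

```python
from fractions import Fraction
from math import comb, exp, log

def critical_point(r1, r2):
    """Solve H = 0 and x*H_x / r1 = y*H_y / r2 for H(x, y) = 1 - x - y."""
    total = r1 + r2
    return Fraction(r1, total), Fraction(r2, total)

def growth_bound(r1, r2):
    """The upper bound (1.6), |x^(-r1) * y^(-r2)|, at the critical point."""
    x, y = critical_point(r1, r2)
    return float(x) ** (-r1) * float(y) ** (-r2)

def empirical_growth(r1, r2, n):
    """|f_{nr}|^(1/n) for the coefficients binom(i + j, i) of 1/(1 - x - y)."""
    return exp(log(comb(n * (r1 + r2), n * r1)) / n)

for r in ((1, 1), (1, 2)):
    print(r, growth_bound(*r), empirical_growth(*r, 2000))
```

In this example the critical point is also minimal, and the empirical growth approaches the bound from below as $n$ increases, the gap being due to the sub-exponential factors in the asymptotics.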

Asymptotics

Asymptotic behaviour is determined by deforming the domain of integration in the Cauchy integral (1.5) to be close to the singularities of $F(\mathbf{z})$, without changing the value of the integral, then performing a local singularity analysis. Intuitively, minimal points are those to which the contour $\mathcal{C}$ can be easily deformed, as they are the singularities closest to the origin, while critical points are those where such a singularity analysis can be performed to determine asymptotics. As in the univariate case, the nature of the singular variety at minimal critical points is important to the determination of asymptotics, and now the local geometry of $\mathcal{V}$ plays a large role.

The easiest case is when $\mathcal{V}$ admits a single minimal critical point $\mathbf{w}$, around which $\mathcal{V}$ is a complex manifold. Assuming some minor extra conditions typically satisfied in applications, diagonal asymptotics can be obtained by computing a univariate residue integral followed by a $(d-1)$-dimensional integral whose domain of integration can be made arbitrarily close to $\mathbf{w}$. When $\mathcal{V}$ has a finite number of such minimal critical points, diagonal asymptotics are determined by applying the so-called saddle-point method to each of the $(d-1)$-dimensional integrals. Under conditions which can be automatically verified, there are explicit formulas for diagonal asymptotics depending only on the minimal critical points involved and evaluations of the partial derivatives of $G(\mathbf{z})$ and $H(\mathbf{z})$: see, for instance, Theorem 5.1 in Chapter 5.

The next simplest case occurs when $\mathcal{V}$ admits a minimal critical point $\mathbf{w}$ around which $\mathcal{V}$ is the finite union of manifolds whose tangent planes are linearly independent; such a point $\mathbf{w}$ is called a transverse multiple point. Under certain conditions which often hold, and which are sufficient for our applications, diagonal asymptotics can again be computed through explicit formulas. Although we discuss how to perform a singularity analysis when $\mathcal{V}$ has a more exotic geometry, our results in these cases rely on more complicated techniques and are thus less explicit. Ultimately, the goal of this program is to automate as much as possible, perhaps proving the decidability of asymptotics for multivariate rational diagonal sequences.

Multivariate Generating Functions and Central Limit Theorems Instead of simply viewing a multivariate function F(z) as a data structure for a single r-diagonal sequence, we can also view F(z) as a truly multivariate generating function whose coefficients enumerate different parameters on combinatorial objects. In this case we want to know coefficient behaviour for all r-diagonals. Our results show that for ‘most’ directions (in a manner to be made precise) asymptotics behave in a uniform manner as r varies, and in many situations we are even able to prove limit theorems about the coefficients of interest. For example, if for each n ∈ N we let $[z_d^n]F(z)$ denote the terms in the series expansion of F(z) containing $z_d^n$ then the results of Chapter 5 give automatically verifiable conditions under which the coefficients of $[z_d^n]F(z)$ approach a multivariate normal distribution as n → ∞. Figure 1.3 illustrates two examples, which we return to in Chapter 5.


Fig. 1.3 Left: The coefficients of $z^n$ in $1/(1 - z - xz^2 - yz^3)$ approach a bivariate normal distribution as n → ∞ (n = 200 is pictured). These coefficients enumerate certain permutations of {1, . . . , n} by the number of two and three cycles they contain, part of a family studied by Chung et al. [18] due to a connection to random perfect matchings in bipartite graphs. Right: Let $e_{k,n}$ denote the number of pairs of polynomials (r, q) over the finite field of prime order p such that q is monic, n = deg q > deg r, and Euclid’s gcd algorithm applied to (r, q) takes k steps to terminate. The methods of Chapter 5 imply $e_{k,n}$ approaches a normal distribution as n → ∞. Pictured is a frequency count of divisions performed when running Euclid’s algorithm on 10 000 random pairs of polynomials with deg q = 500 and p = 3, compared to the limiting distribution.
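As a rough illustration of the left panel, the following Python sketch tabulates the coefficient of $z^n$ in $1/(1 - z - xz^2 - yz^3)$ as a polynomial in x and y. It assumes the recurrence $P_m = P_{m-1} + xP_{m-2} + yP_{m-3}$ with $P_0 = 1$, obtained here by multiplying the series by its denominator. The function name `coefficient_of_zn` and the dictionary encoding of polynomials are choices made for this example, not notation from the text.

```python
from collections import defaultdict

def coefficient_of_zn(n):
    """Return {(i, j): c} such that [z^n] 1/(1 - z - x z^2 - y z^3) = sum of c x^i y^j."""
    # Assumed recurrence: P_m = P_{m-1} + x P_{m-2} + y P_{m-3}, with P_0 = 1.
    P = [{(0, 0): 1}]
    for m in range(1, n + 1):
        cur = defaultdict(int)
        for (i, j), c in P[m - 1].items():
            cur[(i, j)] += c          # contribution of z * P_{m-1}
        if m >= 2:
            for (i, j), c in P[m - 2].items():
                cur[(i + 1, j)] += c  # contribution of x z^2 * P_{m-2}
        if m >= 3:
            for (i, j), c in P[m - 3].items():
                cur[(i, j + 1)] += c  # contribution of y z^3 * P_{m-3}
        P.append(dict(cur))
    return P[n]

print(coefficient_of_zn(3))  # the polynomial 1 + 2x + y, as a dictionary
```

Normalizing the values of `coefficient_of_zn(200)` so that they sum to one gives an array of weights of the kind the figure plots, although the exact normalization used there is an assumption.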

1.2.2 A History of Analytic Combinatorics in Several Variables

Using the binomial theorem, coefficient sequences of multivariate rational functions can often be represented as sums of non-negative terms which are amenable to classical asymptotic techniques covered, for instance, in de Bruijn [19]. Early work on truly multivariate analyses of generating functions came from a probabilistic examination of their coefficients. Starting in the 1970s, Bender [7] derived local and central limit theorems for coefficients in the bivariate case, which Bender and Richmond [8] generalized to any number of variables, followed by related work by those authors and collaborators [10, 29, 9]. Additional work in the 1990s further examined limit theorems for coefficients of bivariate generating functions [28, 25, 21, 22, 23, 31, 32]. Near the end of this probabilistic work, a new approach arose which used multivariate complex residues to determine coefficient asymptotics for functions whose denominators are products of linear factors [35, 12] and for bivariate functions [36]. Although the theory developed in these early approaches was applied in a somewhat ad hoc manner, much of the modern approach was present; for example, the queuing theory work of Kogan and Yakovlev [34] derived asymptotics using residue computations to reduce questions of asymptotic behaviour to an analysis of saddle-point integrals. Egorychev’s book [24] on combinatorial identities contains a small number of results using multivariate integrals.

More recently, Pemantle, Wilson, Baryshnikov, and collaborators have brought together results from several different mathematical disciplines in order to develop a large-scale systematic theory of multivariate asymptotics for combinatorial purposes. Perhaps the first work in this program is Pemantle [41], which studies coefficient sequences of multivariate rational functions whose asymptotic expansions terminate (and thus uniquely determine the sequences of interest). In the early 2000s, Pemantle and Wilson [42] gave a method for determining asymptotics when the singular set V is a complex manifold and there are minimal critical points. Soon after, those authors extended their results to cover a larger set of singular behaviour [43]. Raichev and Wilson [46, 47] gave a method to determine higher-order asymptotic terms. These early results follow the ‘surgery’ approach to ACSV, meaning they use simple and explicit deformations of the multivariate Cauchy residue integral. Later work of Baryshnikov and Pemantle [6] used more complicated deformations of the domain of integration in the multivariate Cauchy integral to extend these results. This approach, extended in the textbook of Pemantle and Wilson [44] and recent papers of Baryshnikov, Melczer, and Pemantle [5, 4, 3], shows how the methods of ACSV fit into the very general framework of stratified Morse theory, allowing the use of advanced homological methods.

Our presentation of analytic combinatorics in several variables focuses more on explicit calculations, including the derivation of effective algorithms working in a large variety of situations. DeVries et al. [20] gave a general algorithm for bivariate asymptotics when V forms a smooth manifold and Raichev [45] developed a Sage package which helps with some of the computations necessary in an asymptotic analysis. More recently, Melczer and Salvy [38] gave algorithms and complexity bounds for the analysis of multivariate generating functions in any number of variables, with an accompanying Maple implementation. Links to this software can be found on the website for this textbook.

1.3 Organization

This textbook is split into three parts.

Part I Part I covers the background and motivation for analytic combinatorics in several variables, including a thorough treatment of univariate techniques. We take a computational viewpoint, and our presentation of univariate methods is structured to help guide intuition in the more advanced multivariate setting. Part I begins in Chapter 2, which describes in detail the underpinnings of univariate analytic combinatorics. After some background on generating functions and combinatorial classes, we introduce the main tools of analytic combinatorics through an extended example looking at alternating permutations. This serves as a lead-in to the general theory. The analytic, algebraic, arithmetic, and asymptotic properties of rational, algebraic, D-finite, and D-algebraic power series are also detailed.

Chapter 3 serves as an introduction to the theory of complex analysis in several variables, and the use of multivariate generating functions. After providing the tools necessary for the multivariate singularity analysis which lies at the heart of analytic combinatorics in several variables, we give a full treatment of rational diagonals and their connections to other classes of generating functions. Algebraic tools for the analytic study of multivariate rational functions are described, such as the connection between so-called amoebas and convergent Laurent expansions. Crucially, the singular set of a multivariate rational function encodes not only information about the asymptotic behaviour of the main diagonal sequence of a rational function’s power series expansion, but also information about all r-diagonals of the power series expansion and information about diagonals of other convergent Laurent expansions of the rational function. Thus, a good understanding of these topics helps illustrate why certain behaviours occur when applying the methods of analytic combinatorics in several variables. We show that the different convergent Laurent expansions of a rational function are strongly linked, and study different operators on multivariate expansions which will be useful for our combinatorial applications. Chapter 3 ends by describing several domains of mathematics and the sciences where rational diagonals arise. In addition to showing the importance of rational diagonals, the examples discussed here are used to illustrate the methods of ACSV in later chapters.

Part I ends in Chapter 4, which contains a presentation of the kernel method approach to lattice path enumeration. Beginning with the easy case of unrestricted lattice path models, the mechanics of the kernel method are built up for one-dimensional walks restricted to a half-space and two-dimensional walks restricted to a quadrant. After describing this incredibly effective machinery, which naturally results in rational diagonal expressions for the generating functions involved, we survey results in lattice path enumeration using this approach. We will return to problems in lattice path enumeration multiple times in the text, using the increasingly sophisticated analytic techniques we develop to continually derive more general results.

Part II Part II covers the theory and applications of ACSV when the singular set of the function under consideration forms a manifold, known as the smooth case. It begins in Chapter 5, which describes the basics of analytic combinatorics in several variables and shows how to obtain asymptotics for many rational functions which admit singular varieties that are complex manifolds. After an extended example, which concretely illustrates the methods of ACSV in the smooth case from start to finish, the general theory is developed. Many examples are given and general strategies for applying the tools of analytic combinatorics in several variables are demonstrated. Chapter 6 describes how to use the methods of ACSV to enumerate lattice path models in a general orthant $\mathbb{N}^d$ when the set of allowable steps is symmetric over every axis. This analysis uses the results of Chapters 4 and 5 and provides the first illustration of a principle that will arise multiple times in the text: structures which exhibit nice combinatorial properties, such as a large number of symmetries, can often be encoded by generating functions which have nice analytic properties. The enumerative results derived are quite general, and illustrate the power of analytic combinatorics in several variables.


Finally, Part II ends in Chapter 7 with a study of effective algorithms for analytic combinatorics in the smooth case. After developing some necessary background on symbolic-numeric computation and methods for encoding solutions of multivariate algebraic systems, we develop rigorous algorithms and complexity results for rational diagonal asymptotics in any dimension, under assumptions which are often satisfied in applications. A large number of examples are worked through using a Maple package of Melczer and Salvy, which implements the theory presented here.

Part III Part III generalizes the methods of Part II to allow for the analysis of more general rational functions. First, Chapter 8 gives a thorough treatment of rational functions whose denominators are products of real linear factors. Although the singular varieties under consideration may not be manifolds, they will be unions of hyperplanes, allowing for a deep analysis to be performed. Chapter 9 describes a general theory of analytic combinatorics in several variables when the singular variety is no longer a manifold. We extend the analysis of Chapter 8 to get explicit asymptotics for rational functions whose singular variety looks locally like the union of hyperplanes near minimal critical points. We also detail a new approach inspired by Morse theory which gives powerful structure theorems for asymptotics, and provide algebraic criteria that certify when such results hold. This approach has strong implications for the development of effective algorithms and yields one of the most promising techniques for resolving computability questions about sequences with D-finite generating functions. Chapter 10 uses these new analytic methods to derive deeper and more general results on lattice path enumeration, resulting in enumerative results for lattice paths in the orthant $\mathbb{N}^d$ whose step sets are symmetric over all but one axis. A selection of other lattice path problems are surveyed, giving a large selection of exercises for the reader.

References 1. Maple 2018. Maplesoft, a division of Waterloo Maple Inc., Waterloo, Ontario. 2. Roger Apéry. Irrationalité de ζ (2) et ζ (3). In Journées Arith. de Luminy. Colloque International du Centre National de la Recherche Scientifique (CNRS), Centre Universitaire de Luminy, Jun 20-24. Astérisque, volume 61, pages 11–13, 1979. 3. Y. Baryshnikov, S. Melczer, and R. Pemantle. Asymptotics of multivariate sequences in the presence of a lacuna. arXiv preprint, arXiv:1905.04187, 2019. 4. Y. Baryshnikov, S. Melczer, and R. Pemantle. Asymptotics of multivariate sequences IV: generating functions with poles on a hyperplane arrangement. In preparation, 2020. 5. Y. Baryshnikov, S. Melczer, and R. Pemantle. Stationary points at infinity for analytic combinatorics. arXiv preprint, arXiv:1905.05250, 2020. 6. Yuliy Baryshnikov and Robin Pemantle. Asymptotics of multivariate sequences, part III: Quadratic points. Adv. Math., 228(6):3127–3206, 2011.


7. Edward A. Bender. Central and local limit theorems applied to asymptotic enumeration. J. Combinatorial Theory Ser. A, 15:91–111, 1973. 8. Edward A. Bender and L. Bruce Richmond. Central and local limit theorems applied to asymptotic enumeration. II. Multivariate generating functions. J. Combin. Theory Ser. A, 34(3):255–265, 1983. 9. Edward A. Bender and L. Bruce Richmond. Multivariate asymptotics for products of large powers with applications to Lagrange inversion. Electron. J. Combin., 6:Research Paper 8, 21 pp. (electronic), 1999. 10. Edward A. Bender, L. Bruce Richmond, and S. G. Williamson. Central and local limit theorems applied to asymptotic enumeration. III. Matrix recursions. J. Combin. Theory Ser. A, 35(3):263–278, 1983. 11. Daniel Bernoulli. Observationes de seriebus quae formantur ex additione vel subtractione quacunque terminorum se mutuo consequentium. Commentarii Academiae Scientiarum Imperialis Petropolitanae, 3:85–100, 1728. 12. Andrea Bertozzi and James McKenna. Multidimensional residues, generating functions, and their application to queueing networks. SIAM Rev., 35(2):239–268, 1993. 13. Otto Blumenthal. Über modulfunktionen von mehreren veränderlichen. (erste hälfte). Mathematische Annalen, 56:509–548, 1903. 14. Miklós Bóna, editor. Handbook of enumerative combinatorics. Discrete Mathematics and its Applications (Boca Raton). CRC Press, Boca Raton, FL, 2015. 15. Wieb Bosma, John Cannon, and Catherine Playoust. The Magma algebra system. I. The user language. J. Symbolic Comput., 24(3-4):235–265, 1997. 16. Alin Bostan, Frédéric Chyzak, Mark van Hoeij, Manuel Kauers, and Lucien Pech. Hypergeometric expressions for generating functions of walks with small steps in the quarter plane. European J. Combin., 61:242–275, 2017. 17. Alin Bostan and Manuel Kauers. Automatic classification of restricted lattice walks. In 21st International Conference on Formal Power Series and Algebraic Combinatorics (FPSAC 2009), Discrete Math. Theor. Comput. Sci. 
Proc., AK, pages 201–215. Assoc. Discrete Math. Theor. Comput. Sci., Nancy, 2009. 18. Fan Chung, Persi Diaconis, and Ron Graham. Permanental generating functions and sequential importance sampling. Advances in Applied Mathematics, page 101916, 2019. 19. N. G. de Bruijn. Asymptotic methods in analysis. Bibliotheca Mathematica. Vol. 4. NorthHolland Publishing Co., Amsterdam; P. Noordhoff Ltd., Groningen; Interscience Publishers Inc., New York, 1958. 20. Timothy DeVries, Joris van der Hoeven, and Robin Pemantle. Automatic asymptotics for coefficients of smooth, bivariate rational functions. Online J. Anal. Comb., (6):24, 2011. 21. Michael Drmota. Asymptotic distributions and a multivariate Darboux method in enumeration problems. J. Combin. Theory Ser. A, 67(2):169–184, 1994. 22. Michael Drmota. A bivariate asymptotic expansion of coefficients of powers of generating functions. European J. Combin., 15(2):139–152, 1994. 23. Michael Drmota. Systems of functional equations. Random Structures Algorithms, 10(12):103–124, 1997. 24. G. P. Egorychev. Integral representation and the computation of combinatorial sums, volume 59 of Translations of Mathematical Monographs. American Mathematical Society, Providence, RI, 1984. 25. P. Flajolet and T. Lafforgue. Search costs in quadtrees and singularity perturbation asymptotics. Discrete Comput. Geom., 12(2):151–175, 1994. 26. Philippe Flajolet, Bruno Salvy, and Paul Zimmermann. Lambda-Upsilon-Omega the 1989 cookbook. Research Report RR-1073, INRIA, 1989. 27. Philippe Flajolet, Bruno Salvy, and Paul Zimmermann. Automatic average-case analysis of algorithms. Theoret. Comput. Sci., 79(1, (Part A)):37–109, 1991. 28. Philippe Flajolet and Michèle Soria. General combinatorial schemas: Gaussian limit distributions and exponential tails. Discrete Math., 114(1-3):159–180, 1993.


29. Zhicheng Gao and L. Bruce Richmond. Central and local limit theorems applied to asymptotic enumeration. IV. Multivariate generating functions. J. Comput. Appl. Math., 41(1-2):177–186, 1992. 30. Daniel S. Graça, Manuel L. Campagnolo, and Jorge Buescu. Computability with polynomial differential equations. Adv. in Appl. Math., 40(3):330–349, 2008. 31. Hsien-Kuei Hwang. Large deviations for combinatorial distributions. I. Central limit theorems. Ann. Appl. Probab., 6(1):297–319, 1996. 32. Hsien-Kuei Hwang. Large deviations of combinatorial distributions. II. Local limit theorems. Ann. Appl. Probab., 8(1):163–181, 1998. 33. Manuel Kauers, Maximilian Jaroschek, and Fredrik Johansson. Ore polynomials in Sage. In Computer algebra and polynomials, volume 8942 of Lecture Notes in Comput. Sci., pages 105–125. Springer, Cham, 2015. 34. Yaakov Kogan and Andrei Yakovlev. Asymptotic analysis for closed multichain queueing networks with bottlenecks. Queueing Systems Theory Appl., 23(1–4):235–258, 1996. 35. Ben Lichtin. The asymptotics of a lattice point problem associated to a finite number of polynomials. I. Duke Math. J., 63(1):139–192, 1991. 36. Ben Lichtin. The asymptotics of a lattice point problem associated to a finite number of polynomials. II. Duke Math. J., 77(3):699–751, 1995. 37. Stephen Melczer and Marni Mishna. Asymptotic lattice path enumeration using diagonals. Algorithmica, 75(4):782–811, 2016. 38. Stephen Melczer and Bruno Salvy. Symbolic-numeric tools for analytic combinatorics in several variables. In Proceedings of the ACM on International Symposium on Symbolic and Algebraic Computation, ISSAC ’16, pages 333–340, New York, NY, USA, 2016. ACM. 39. Stephen Melczer and Mark C. Wilson. Higher Dimensional Lattice Walks: Connecting Combinatorial and Analytic Behavior. SIAM J. Discrete Math., 33(4):2140–2174, 2019. 40. Igor Pak. Complexity problems in enumerative combinatorics. In Proc. Int. Cong. of Math., volume 3, pages 3139–3166, Rio de Janeiro, 2018. 41. 
Robin Pemantle. Generating functions with high-order poles are nearly polynomial. In Mathematics and computer science (Versailles, 2000), Trends Math., pages 305–321. Birkhäuser, Basel, 2000. 42. Robin Pemantle and Mark C. Wilson. Asymptotics of multivariate sequences. I. Smooth points of the singular variety. J. Combin. Theory Ser. A, 97(1):129–161, 2002. 43. Robin Pemantle and Mark C. Wilson. Asymptotics of multivariate sequences. II. Multiple points of the singular variety. Combin. Probab. Comput., 13(4-5):735–761, 2004. 44. Robin Pemantle and Mark C. Wilson. Analytic combinatorics in several variables, volume 140 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2013. 45. Alexander Raichev. New software for computing asymptotics of multivariate generating functions. ACM Commun. Comput. Algebra, 45(3-4):183–185, 2011. 46. Alexander Raichev and Mark C. Wilson. Asymptotics of coefficients of multivariate generating functions: improvements for smooth points. Electron. J. Combin., 15(1):Research Paper 89, 17, 2008. 47. Alexander Raichev and Mark C. Wilson. Asymptotics of coefficients of multivariate generating functions: improvements for multiple points. Online J. Anal. Comb., (6):21, 2011. 48. B. Salvy and P. Zimmermann. GFUN: a Maple package for the manipulation of generating and holonomic functions in one variable. ACM Trans. Math. Softw., 20(2):163–177, 1994. 49. Richard P. Stanley. Enumerative combinatorics. Volume 1, volume 49 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, second edition, 2012. 50. Richard P. Stanley. Catalan numbers. Cambridge University Press, New York, 2015. 51. The Sage Developers. SageMath, the Sage Mathematics Software System (Version 8.1), 2019. 52. Herbert S. Wilf. What is an answer? Amer. Math. Monthly, 89(5):289–292, 1982. 53. Herbert S. Wilf. generatingfunctionology. A K Peters, Ltd., Wellesley, MA, third edition, 2006.

Part I

Background and Motivation

Chapter 2

Generating Functions and Analytic Combinatorics

Since there is a great conformity between the Operations in Species, and the same Operations in common Numbers . . . I cannot but wonder that no body has thought of accommodating the lately discover’d Doctrine of Decimal Fractions in like manner to Species . . . especially since it might have open’d a way to more abstruse Discoveries.
— Isaac Newton

This book is concerned with the study of sequences and their generating functions. We work over a field K, typically the field of rational numbers Q or the field of complex numbers C. The basic properties of rings and fields which we use can be found in any standard reference on abstract algebra, such as Dummit and Foote [38], although in most cases one can simply view an abstract ring as the ring of integers Z and an abstract field as Q or C.

Definition 2.1 (sequences and formal power series) A (univariate) sequence $(a_n) = (a_0, a_1, a_2, \dots)$ over a field K is a mapping from the natural numbers to K. A formal power series over K in the indeterminate z is any formal expression
\[ A(z) = \sum_{n=0}^{\infty} a_n z^n = a_0 + a_1 z + a_2 z^2 + \cdots \]
with coefficient sequence $(a_n)$ in K. We write $[z^n]A(z)$ for the term $a_n$ in A(z). If A is an integral domain (a commutative ring with no zero divisors) with field of fractions K, we write A[[z]] to denote the elements of K[[z]] with coefficients in A.

Given two formal power series $A(z) = \sum_{n=0}^{\infty} a_n z^n$ and $B(z) = \sum_{n=0}^{\infty} b_n z^n$ over the same field, we may define addition of A(z) and B(z) term-wise as
\[ A(z) + B(z) = \sum_{n=0}^{\infty} a_n z^n + \sum_{n=0}^{\infty} b_n z^n = \sum_{n=0}^{\infty} (a_n + b_n) z^n, \]
and multiplication of A(z) and B(z) by the Cauchy product
\[ A(z)B(z) = \left( \sum_{n=0}^{\infty} a_n z^n \right) \left( \sum_{n=0}^{\infty} b_n z^n \right) = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k b_{n-k} \right) z^n . \]
The word ‘formal’ in the term formal power series refers to the fact that we do not consider whether a formal series converges, or define what it means to substitute an element of K into z.

© Springer Nature Switzerland AG 2021 S. Melczer, Algorithmic and Symbolic Combinatorics, Texts & Monographs in Symbolic Computation, https://doi.org/10.1007/978-3-030-67080-1_2


Lemma 2.1 The collection of formal power series over a field K in the indeterminate z forms an integral domain under term-wise addition and the Cauchy product, called the ring of formal power series over K and denoted K[[z]].

Problem 2.1 asks you to prove Lemma 2.1, and Problems 2.2 to 2.5 ask you to derive basic properties of formal power series: they form a complete metric space, any power series with non-zero constant term can be inverted, and if F(z) and G(z) are formal power series such that G(z) has no constant term then the composition F(G(z)) is well defined. Because we deal almost exclusively with power series over fields K ⊂ C which define analytic functions, we do not go into much detail on the formal theory. Additional results about K[[z]] can be found in Stanley [112] or Henrici [65]. When a formal power series over K ⊂ C converges for complex z in a disk around the origin, our formal constructions are consistent with the classical definitions for complex functions.

Example 2.1 (Formal vs. Analytic Power Series) Let
\[ F(z) = \sum_{n \geq 0} z^n \qquad \text{and} \qquad G(z) = \sum_{n \geq 0} n!\, z^n \]
be formal power series in Q[[z]]. Then
\[ (1 - z)F(z) = 1 + \sum_{n \geq 1} (1 - 1) z^n = 1, \]
so that $F(z) = (1 - z)^{-1}$ is the multiplicative inverse of 1 − z in Q[[z]]. Note that
\[ \sum_{n \geq 0} z^n = \frac{1}{1 - z} \]
as complex-valued functions when |z| < 1, so we can apply results from analysis to study the series F(z). Because n! grows faster than $|z|^{-n}$ for any fixed z ≠ 0, the terms $n!\, z^n$ are unbounded and the series G(z) does not converge for any z ≠ 0, but we can still work formally with G(z).
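The computations in Example 2.1 can be mirrored with truncated coefficient lists. The following is a minimal sketch, assuming series are stored as Python lists of exact `Fraction` coefficients cut off at a hypothetical order `N`; the helper names `add`, `cauchy_product`, and `inverse` are illustrative and do not come from the text.

```python
from fractions import Fraction
from math import factorial

def add(a, b):
    """Term-wise sum of two truncated series given as coefficient lists."""
    n = min(len(a), len(b))
    return [a[i] + b[i] for i in range(n)]

def cauchy_product(a, b):
    """Cauchy product, truncated to the shorter input length."""
    n = min(len(a), len(b))
    return [sum(a[k] * b[m - k] for k in range(m + 1)) for m in range(n)]

def inverse(a):
    """Multiplicative inverse of a truncated series with non-zero constant term."""
    if a[0] == 0:
        raise ValueError("constant term must be non-zero")
    inv = [Fraction(1) / a[0]]
    for m in range(1, len(a)):
        # The coefficient of z^m in a(z) * inv(z) must vanish for m >= 1.
        s = sum(a[k] * inv[m - k] for k in range(1, m + 1))
        inv.append(-s / a[0])
    return inv

N = 8  # truncation order, chosen arbitrarily for this illustration
one_minus_z = [Fraction(1), Fraction(-1)] + [Fraction(0)] * (N - 2)
F = [Fraction(1)] * N                            # 1 + z + z^2 + ...
G = [Fraction(factorial(n)) for n in range(N)]   # sum of n! z^n

print(cauchy_product(one_minus_z, F))   # coefficients of (1 - z) F(z)
print(inverse(one_minus_z) == F)        # F is the inverse of 1 - z
print(inverse(G)[:4])                   # G is invertible although it diverges for z != 0
```

Because the coefficient of $z^m$ in a product depends only on coefficients of index at most m, the truncated results are exact up to order `N`; no convergence is ever used.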

We use power series because their coefficients can encode interesting information, such as the number of objects in a combinatorial structure.

Definition 2.2 (combinatorial classes and generating functions) A combinatorial class is a countable collection of objects C together with a size function | · | : C → N such that the inverse image of any natural number under | · | is finite (there are a finite number of objects of any fixed size). The counting sequence $(c_n)$ of C is the sequence $c_n = \#\{x \in C : |x| = n\}$ counting the number of objects in C of each size. The generating function of C is the formal power series $C(z) = \sum_{n \geq 0} c_n z^n$, which can be viewed as an element of Q[[z]] with natural number coefficients. More generally, given a sequence $(a_n)$ over any field K the generating function of $(a_n)$ is the formal power series A(z) ∈ K[[z]] whose coefficient sequence is $(a_n)$.


Example 2.2 (Compositions and Partitions) The class of integer compositions consists of the set of all positive integer tuples,
\[ C = \{ (k_1, \dots, k_r) : r \in \mathbb{N}_{>0},\ k_j \in \mathbb{N}_{>0} \}, \]
and the size function that maps a tuple to its sum: $|(k_1, \dots, k_r)| = k_1 + \cdots + k_r$. The number $c_n$ of objects of size n in C is the number of ways of writing n as an ordered sum of positive integers. For instance, there are 4 compositions of size 3 since 3 = 1 + 1 + 1 = 1 + 2 = 2 + 1. To determine the number of integer compositions of size n, consider the expression
\[ 1 \,\square\, 1 \,\square\, \cdots \,\square\, 1 \]
consisting of n ones and n − 1 symbols $\square$. Replacing each $\square$ with a plus ‘+’ or a comma ‘,’ gives a tuple of elements adding up to n, and conversely every tuple of size n is obtained in this manner. Thus, $c_n = 2^{n-1}$ and
\[ C(z) = \sum_{n \geq 1} 2^{n-1} z^n = z \sum_{n \geq 0} 2^n z^n = \frac{z}{1 - 2z}, \]
either formally or as a convergent power series when |z| < 1/2.

Similarly, let
\[ P = \{ (k_1, \dots, k_r) : r \in \mathbb{N}_{>0},\ k_j \in \mathbb{N}_{>0},\ k_1 \geq k_2 \geq \cdots \geq k_r \} \]
be the class of integer partitions, containing all non-increasing positive integer tuples where the size of a tuple is again its sum. Although similar to integer compositions, integer partitions are much harder to analyze. In 1748, Euler [40] gave the generating function expression
\[ P(z) = \sum_{n \geq 0} p_n z^n = \prod_{n \geq 1} \frac{1}{1 - z^n}, \]
which follows from the definition of power series multiplication after expanding each term as a geometric series. Using analytic methods, Hardy and Ramanujan [63] famously¹ determined asymptotic behaviour of $p_n$ from a study of P(z).
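Both counting sequences in Example 2.2 are easy to check by machine for small sizes. The sketch below, with function names invented for this illustration, lists compositions recursively and expands Euler's product one factor at a time, using the observation that multiplying a truncated series by $1/(1 - z^k)$ amounts to the update p[m] += p[m − k].

```python
def compositions(n):
    """Yield every tuple of positive integers summing to n.

    The empty tuple is yielded for n = 0 only, as the base case of the recursion.
    """
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

def partition_counts(N):
    """Coefficients p_0, ..., p_N of the product of 1/(1 - z^k) over k >= 1."""
    p = [1] + [0] * N
    for k in range(1, N + 1):
        # Multiply the truncated series by 1/(1 - z^k) = 1 + z^k + z^(2k) + ...
        for m in range(k, N + 1):
            p[m] += p[m - k]
    return p

print(sorted(compositions(3)))   # the four compositions of 3
print([sum(1 for _ in compositions(n)) for n in range(1, 8)])
print(partition_counts(10))
```

Factors with k > N do not affect coefficients up to $z^N$, which is why the product can be cut off at k = N.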

The theory of generating functions, where a series is studied to understand its coefficients, can be traced back to de Moivre [32], who found correspondences between various properties of a power series and its coefficient sequence; Euler [39], who gave the first explicit definition of formal series and wrestled with their implications; Laplace [75], who created a formal calculus of power series; and Cauchy [20], whose work finding analytic expressions for power series coefficients underlies most of analytic combinatorics. The work of many major figures in analysis in the eighteenth and early nineteenth centuries had an impact; Ferraro [44] gives a detailed account of the theory of series during this time period.

Although there are many large and beautiful frameworks which allow one to go from a combinatorial description of a class to a specification of its generating function, we do not discuss these in detail here. Instead, we typically assume that we already have access to a generating function specified in one of the manners discussed in the rest of this chapter. Those wanting to learn more about how to derive generating function expressions can refer to the Symbolic Method of Flajolet and Sedgewick [50] and the Theory of Species described by Bergeron et al. [6]. A nice introduction to such ‘generatingfunctionology’ can be found in Wilf [125].

¹ In one of the rare instances of pure mathematics breaking through to Hollywood, the 2015 film ‘The Man Who Knew Infinity’ includes a dramatic scene where Ramanujan reveals an approximation of $p_{200}$ to Major MacMahon, who had recently calculated $p_{200}$ exactly by hand after much effort.

2.1 Analytic Combinatorics in One Variable

Unless explicitly stated, we now work only with power series over a field K ⊂ C so that we can apply methods from complex analysis. The appendix to this chapter recalls some basic facts and notation from complex analysis which we make use of. Additional background can be found in any standard reference for complex analysis, such as Henrici [65]. The key idea of analytic combinatorics is to use an analytic function to encode the sequence formed by its power series coefficients.

Definition 2.3 (power series coefficients) A complex-valued function F(z) is analytic at a ∈ C if F(z) is represented by a convergent power series on some open disk centred at a. Given a function F(z) which is analytic at the origin, we call the coefficients $(f_n)$ of a power series representation
\[ F(z) = \sum_{n \geq 0} f_n z^n, \]
valid in a neighbourhood of the origin, the power series coefficients of F.

From now on we slightly abuse notation and write F(z) to refer both to the original function and the power series. There is no harm in making this identification, since the original function and the series both define analytic functions which agree on an open disk containing the origin and the series representation is unique. In this way, we also refer to the complex-valued function F(z) as a generating function of $(f_n)$. We first recall a few basic facts from complex analysis, which follow from the background results discussed in the appendix. Definitions for the complex analytic terms used here can also be found in the appendix.

Deforming Curves of Integration: If $C$ and $C'$ are simple closed curves such that $C$ can be continuously deformed to $C'$ while staying in an open set where f(z) is analytic, then
\[ \int_{C} f(z)\,dz = \int_{C'} f(z)\,dz. \]

Cauchy Residue Theorem: Suppose f(z) is meromorphic in a domain Ω and γ is a positively oriented loop in Ω on which f is analytic. Then
\[ \frac{1}{2\pi i} \int_{\gamma} f(z)\,dz = \sum_{p \in \Psi} \operatorname*{Res}_{z=p} f(z), \]
where Ψ is the (necessarily finite) set of poles of f inside γ.

Maximum Modulus Integral Bound: If f(z) is continuous on a curve γ, then
\[ \left| \int_{\gamma} f(z)\,dz \right| \leq \operatorname{length}(\gamma) \times \max_{z \in \gamma} |f(z)|. \]

The methods of analytic combinatorics work because of deep connections between analytic properties of a function and asymptotic properties of its power series coefficients. The basis for this is the Cauchy integral formula, which gives an analytic expression for power series coefficients.

Theorem 2.1 (Cauchy Integral Formula for Coefficients) Let γ be a positively oriented circle around the origin and F(z) be a complex-valued function analytic inside and on γ. If F(z) is represented in a neighbourhood of the origin by the convergent power series $F(z) = \sum_{n \geq 0} f_n z^n$ then, for any n ∈ N,
\[ f_n = \frac{1}{2\pi i} \int_{\gamma} F(z)\, \frac{dz}{z^{n+1}}. \tag{2.1} \]

In order to talk about asymptotics we introduce some notation.

Definition 2.4 (asymptotic notation) If $f_n$ and $g_n$ are sequences, and $g_n \neq 0$ for n sufficiently large, then we write
$f_n = O(g_n)$ if there exist M, N > 0 such that $|f_n| \leq M |g_n|$ for all n ≥ N,
$f_n = o(g_n)$ if the limit of $f_n / g_n$ goes to 0 as n → ∞,
$f_n \sim g_n$ if the limit of $f_n / g_n$ goes to 1 as n → ∞,
$f_n = \tilde{O}(g_n)$ if $f_n = O(g_n \log^k |g_n|)$ for some k ∈ N.
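Theorem 2.1 also lends itself to a simple numerical experiment. Parametrizing the circle as $z = re^{i\theta}$ turns (2.1) into the average of $F(z)/z^n$ over θ, which the sketch below approximates with equally spaced samples (the trapezoid rule). The function name `cauchy_coefficient`, the sample count, and the choice of radius are assumptions made for this illustration; the radius must be small enough that F is analytic inside and on the circle.

```python
import cmath
import math

def cauchy_coefficient(F, n, radius, samples=2048):
    """Approximate f_n = (1/(2 pi i)) * integral of F(z) / z^(n+1) dz over |z| = radius.

    With z = radius * exp(i * theta) we have dz = i z d(theta), so the
    integral equals the average of F(z) / z^n over theta in [0, 2 pi).
    """
    total = 0
    for j in range(samples):
        z = radius * cmath.exp(2j * math.pi * j / samples)
        total += F(z) / z ** n
    return total / samples

# The generating function z / (1 - 2z) of integer compositions is analytic for |z| < 1/2.
print(cauchy_coefficient(lambda z: z / (1 - 2 * z), 5, 0.25))
```

The printed value should be close to $2^{4} = 16$, the number of compositions of 5, up to floating-point error.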

We now work through an example in detail, showing the ease with which complex analysis allows one to derive powerful results, before discussing the general theory.


Fig. 2.1 The integral In , over the circle |z | = 2, grows exponentially slower than the coefficients of tangent. Introducing this integral allows us to determine asymptotics by computing residues at the singularities z = ±π/2 of tangent.

2.1.1 A Worked Example: Alternating Permutations

An alternating permutation, also known as a zigzag permutation, of size n = 2m + 1 is an ordering $(\pi_1, \dots, \pi_{2m+1})$ of the numbers 1, 2, . . . , 2m + 1 such that consecutive numbers alternate between increasing and decreasing:
$$\pi_1 < \pi_2, \qquad \pi_2 > \pi_3, \qquad \pi_3 < \pi_4, \qquad \text{etc.}$$

Let $a_n$ denote the number of alternating permutations of size n, where $a_n = 0$ when n is even. In 1879, André [1] discovered the marvelous fact that if $(t_n)$ denotes the power series coefficients of tan z then $a_n = n!\,t_n$, meaning
$$\tan z = \sum_{n \ge 0} t_n z^n = \sum_{n \ge 0} \frac{a_n}{n!} z^n$$
for z in a neighbourhood of the origin. See Problem 2.12 for a derivation of the generating function; further historical details can be found in Stanley [111]. To determine asymptotics of $a_n$ it is sufficient to find asymptotics of $t_n$.

Since tan z = (sin z)/(cos z) the tangent function is meromorphic in the complex plane, with simple poles at the roots {±π/2, ±3π/2, . . . } of cos z and no other singularities. In particular, Theorem 2.1 implies
$$t_n = \frac{1}{2\pi i}\int_{|z|=\varepsilon} \frac{\sin z}{\cos z}\,\frac{dz}{z^{n+1}} = \operatorname{Res}_{z=0}\left(\frac{\sin z}{z^{n+1}\cos z}\right) \tag{2.2}$$
for any 0 < ε < π/2. The analysis proceeds in two steps.


Step 1: Introduce an Exponentially Smaller Integral

Let
$$I_n = \frac{1}{2\pi i}\int_{|z|=2} \frac{\sin z}{\cos z}\,\frac{dz}{z^{n+1}},$$
and note that π/2 < 2 < 3π/2. Since tan z is analytic on the compact set $C_2 = \{z \in \mathbb{C} : |z| = 2\}$, there exists M > 0 such that |tan z| ≤ M for |z| = 2. Thus, the maximum modulus integral bound implies
$$|I_n| \le \operatorname{length}(C_2) \times \max_{|z|=2}\left|\frac{\tan z}{(2\pi)z^{n+1}}\right| \le 2(2\pi)\,\frac{M}{(2\pi)2^{n+1}} = O(2^{-n}).$$
We will soon show that $I_n$ is exponentially smaller than $t_n$, and its introduction allows us to simplify the Cauchy integral through residue computations.

Step 2: Compute Residues

Since the integrand $\tan z / z^{n+1}$ is meromorphic, and in the disk {|z| ≤ 2} only has poles at 0 and ±π/2, the Cauchy residue theorem (Proposition 2.24 in the appendix) implies
$$I_n = \operatorname{Res}_{z=0}\left(\frac{\sin z}{z^{n+1}\cos z}\right) + \operatorname{Res}_{z=\pi/2}\left(\frac{\sin z}{z^{n+1}\cos z}\right) + \operatorname{Res}_{z=-\pi/2}\left(\frac{\sin z}{z^{n+1}\cos z}\right).$$
The residue at the origin is the power series coefficient $t_n$, so after some simplification, and using the fact that $|I_n| = O(2^{-n})$, we obtain
$$t_n = -\left(\frac{2}{\pi}\right)^{n+1}\operatorname{Res}_{z=\pi/2}\left(\frac{\sin z}{\cos z}\right) - \left(-\frac{2}{\pi}\right)^{n+1}\operatorname{Res}_{z=-\pi/2}\left(\frac{\sin z}{\cos z}\right) + O(2^{-n}).$$

Lemma 2.4 of the appendix shows that the residue of any fraction p(z)/q(z) at a point z = a where q(a) = 0 and $p(a), q'(a) \neq 0$ equals $p(a)/q'(a)$. Using this fact, or simply computing the Laurent series of tan z at z = ±π/2 in a computer algebra system, shows that the residues of tan z at z = ±π/2 have value −1, and
$$t_n = \left(\frac{2}{\pi}\right)^{n+1} + \left(-\frac{2}{\pi}\right)^{n+1} + O(2^{-n}).$$

Thus, almost magically, we have proven that
$$a_n \sim 2\left(\frac{2}{\pi}\right)^{n+1} n!$$
when n is odd. Recall that $a_n = 0$ when n is even.
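This conclusion can be sanity-checked by brute force for small odd n. The sketch below is an illustration rather than the book's method: the helper names are hypothetical, and the tangent coefficients are computed exactly from the differential equation tan′ = 1 + tan², a standard identity that is not used in the text above.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, pi


def count_alternating(n):
    """Brute-force count of orderings with pi_1 < pi_2 > pi_3 < ... (odd n only)."""
    def is_alternating(p):
        return all(
            (p[i] < p[i + 1]) if i % 2 == 0 else (p[i] > p[i + 1])
            for i in range(len(p) - 1)
        )
    return sum(1 for p in permutations(range(1, n + 1)) if is_alternating(p))


def tangent_coefficients(N):
    """Exact t_0..t_N from tan' = 1 + tan^2:
    (n+1) t_{n+1} = [n == 0] + sum_{k=0}^{n} t_k t_{n-k}."""
    t = [Fraction(0)] * (N + 1)
    for n in range(N):
        conv = sum(t[k] * t[n - k] for k in range(n + 1))
        t[n + 1] = (int(n == 0) + conv) / (n + 1)
    return t


t = tangent_coefficients(15)
brute = {n: count_alternating(n) for n in (1, 3, 5, 7)}
# Ratio of the asymptotic estimate 2 (2/pi)^(n+1) n! to the exact value n! t_n.
ratio = {n: 2 * (2 / pi) ** (n + 1) / float(t[n]) for n in range(1, 16, 2)}
for n in range(1, 16, 2):
    print(n, int(factorial(n) * t[n]), ratio[n])
```

The ratios approach 1 quickly, in line with the exponentially small error term.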

Notes

We can come away from this example with several important observations.

1. The only property of the domain of integration {|z| = 2} used in our argument was that the circle {|z| = 2} separated the poles of tan z closest to the origin from the remaining poles. In particular, the domain of integration of $I_n$ can be deformed to any circle {|z| = τ} where π/2 < τ < 3π/2 without changing the value of $I_n$. Thus, our argument shows the stronger statement that
$$\frac{a_n}{n!} = t_n = 2\left(\frac{2}{\pi}\right)^{n+1} + O\left(\left(\frac{2}{3\pi} + \varepsilon\right)^{n}\right)$$
for all ε > 0 and n odd.

2. In fact, one can integrate over any circle {|z| = τ} with τ ≠ (2k + 1)π/2 for k ∈ N to obtain an expression for $t_n$ as a sum of residues with an error term. As τ grows the residues of tan z at additional poles will be added to the sum while the error gets exponentially smaller. For example, taking 3π/2 < τ < 5π/2 gives
$$\frac{a_n}{n!} = 2\left(\frac{2}{\pi}\right)^{n+1} + 2\left(\frac{2}{3\pi}\right)^{n+1} + O\left(\left(\frac{2}{5\pi}\right)^{n}\right)$$
for n odd. Note that we may use an error with exponential growth 2/5π, instead of 2/5π + ε, because the next asymptotic term is again given by a simple pole. Taking τ → ∞ even gives a convergent series expansion
$$\frac{a_n}{n!} = 2\left(\frac{2}{\pi}\right)^{n+1}\sum_{k \ge 0}\frac{1}{(2k+1)^{n+1}}$$
for odd n. Proving this infinite series representation requires uniformly bounding |tan z| away from its poles, for which it is more convenient to integrate over squares than circles; see Problem 2.13 for a proof strategy.

3. In this way, every singularity of tan z gives an asymptotic contribution to the power series coefficients $t_n$. The most important part of this contribution, its exponential factor, is determined by the location of the singularity. As each singularity in this example is a simple pole, the sub-exponential factor at each point is a constant.

Instead of simply introducing the integral $I_n$, it can be instructive to imagine expanding the domain of integration in the Cauchy integral (2.2) away from the origin while staying where the integrand is analytic, as in Figure 2.2. The domain can be expanded freely until it becomes locally ‘stuck’ in two places near the singularities at z = ±π/2. One can compensate for pushing the domain of integration past these singularities by adding a contribution from each, determined by their residues. The domain of integration can then be expanded again, until getting ‘stuck’ on the next singularities. Since the Cauchy integrand decreases exponentially as the domain of integration moves away from the origin, the contributions from these singularities get successively smaller.


Fig. 2.2 Starting with a small circle at the origin, one can imagine pushing the domain of integration in the Cauchy integral away from the origin: this deformed curve gets ‘stuck’ near the two poles of tan z closest to the origin. Deforming past these singularities is possible if one adds integrals around each, which can be computed using residues.

This mental picture of sliding around domains of integration will be useful in the multivariate setting, when things are much harder to visualize, and helps guide the analysis in the presence of non-polar singularities such as algebraic branch points.
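The picture of successively smaller contributions from poles further from the origin can be observed numerically. The following is a rough sketch under stated assumptions: `pole_sum` is a hypothetical helper that adds the contributions of the poles ±(2k+1)π/2 for k < K (valid for odd n), and the exact coefficients again come from the identity tan′ = 1 + tan², which is not part of the text.

```python
from fractions import Fraction
from math import pi


def tangent_coefficients(N):
    """Exact t_0..t_N from tan' = 1 + tan^2."""
    t = [Fraction(0)] * (N + 1)
    for n in range(N):
        conv = sum(t[k] * t[n - k] for k in range(n + 1))
        t[n + 1] = (int(n == 0) + conv) / (n + 1)
    return t


def pole_sum(n, K):
    """Contribution of the 2K poles closest to the origin, for odd n:
    2 (2/pi)^(n+1) * sum_{k<K} 1/(2k+1)^(n+1)."""
    return 2 * (2 / pi) ** (n + 1) * sum(
        1.0 / (2 * k + 1) ** (n + 1) for k in range(K)
    )


n = 5
exact = float(tangent_coefficients(n)[n])  # t_5 = 2/15
errors = [exact - pole_sum(n, K) for K in (1, 2, 3)]
print(errors)
```

Each additional pair of poles shrinks the remaining error, mirroring the error terms in Notes 1 and 2.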

2.1.2 The Principles of Analytic Combinatorics

We now discuss the general theory of analytic combinatorics, which closely follows what we observed in the preceding example. Because we only consider power series representing analytic functions, the series coefficients of interest can grow at most exponentially. This makes their exponential growth the most significant component of their asymptotic behaviour.

Definition 2.5 (exponential growth) The exponential growth of a sequence $(f_n)$ is the constant
$$\rho = \limsup_{n \to \infty} |f_n|^{1/n}.$$

The reason for the limsup is that different subsequences of $(f_n)$ can have different growth; for instance, in combinatorial contexts there may be periodicities in the underlying counting problem. When $(f_n)$ is the coefficient sequence of an analytic function at the origin then ρ is finite and is the coarsest measure of sequence behaviour (it may be zero, in which case $f_n$ decays faster than any exponential function). The classic root test on series convergence immediately implies the following.

Proposition 2.1 If $\sum_{n \ge 0} f_n z^n$ is a power series with finite radius of convergence R > 0 then the exponential growth of $(f_n)$ is ρ = 1/R.

Furthermore, Cauchy’s integral formula links the radius of convergence of a power series and the singularities of the analytic function it represents at the origin. The proof of the following result is Problem 2.16.


Proposition 2.2 Suppose F(z) is an analytic function represented in a neighbourhood of the origin by a power series with finite radius of convergence R. Then F admits a singularity on the circle |z| = R. Consequently, the exponential growth of F’s power series coefficients is 1/R, where R is the minimum modulus of a singularity of F.

This motivates the first of two Principles of Coefficient Asymptotics².

First Principle of Coefficient Asymptotics: The location of a function’s singularities dictates the exponential growth ρ of its power series coefficients.

Definition 2.6 (dominant singularity) A singularity of minimum modulus is called a dominant singularity of F.

The dominant singularities of a function are those which influence dominant asymptotics of its coefficient sequence. In many applications, F admits only a finite number of dominant singularities; for instance, this occurs whenever F is meromorphic. When the power series coefficients $(f_n)$ of F have exponential growth ρ then $f_n \sim \rho^n \theta(n)$ as n → ∞, where θ(n) is a function which grows at most subexponentially (or decays). As in Section 2.1.1, one can imagine expanding the domain of integration in the Cauchy integral representation of Theorem 2.1 until it gets locally stuck near singularities of F. Each isolated singularity of F will give some asymptotic contribution to the power series coefficients $f_n$, with the type of the singularity (pole of a given order, algebraic branch point, logarithmic branch point, etc.) determining θ(n). This leads to the second principle of coefficient asymptotics.

Second Principle of Coefficient Asymptotics: The nature of a function’s singularities determines the associated sub-exponential growth θ(n).

The simplest case occurs when the dominant singularities of F are poles. The asymptotic contributions of each singularity can then be determined by a residue calculation analogously to the computation in Section 2.1.1, giving the following.
Proposition 2.3 Suppose that F(z) is analytic on the circle |z| = R and is analytic in the disk |z| < R except at a finite number of (distinct) polar singularities $\sigma_1, \dots, \sigma_m$, none of which are zero. Then there exist polynomials $P_1(n), \dots, P_m(n)$ such that
$$f_n = \sum_{k=1}^{m} P_k(n)\,\sigma_k^{-n} + O(R^{-n}),$$
and the degree of $P_k$ is one less than the order of the pole of F at $z = \sigma_k$.

2 As laid out by Flajolet and Sedgewick [50], the analytic combinatorialist’s bible.


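Proof If we define the integral
$$I_n = \frac{1}{2\pi i}\int_{|z|=R} \frac{F(z)}{z^{n+1}}\,dz$$
then the maximum modulus bound implies $I_n = O(R^{-n})$ and the Cauchy residue theorem gives
$$I_n = \operatorname{Res}_{z=0}\left(\frac{F(z)}{z^{n+1}}\right) + \sum_{k=1}^{m}\operatorname{Res}_{z=\sigma_k}\left(\frac{F(z)}{z^{n+1}}\right) = f_n + \sum_{k=1}^{m}\operatorname{Res}_{z=\sigma_k}\left(\frac{F(z)}{z^{n+1}}\right).$$
As shown in Lemma 2.4 of the appendix, if $z = \sigma_k$ is a pole of order r then
$$\operatorname{Res}_{z=\sigma_k}\left(\frac{F(z)}{z^{n+1}}\right) = \frac{1}{(r-1)!}\lim_{z \to \sigma_k}\frac{d^{r-1}}{dz^{r-1}}\left[(z-\sigma_k)^r F(z)\,z^{-n-1}\right],$$
which is $\sigma_k^{-n}$ times a polynomial in n of degree r − 1.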
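As a small illustration of Proposition 2.3, consider the function $F(z) = 1/((1-z)^2(1-z/2))$, an example chosen here and not taken from the text. It has a double pole at z = 1 and a simple pole at z = 2, and a partial fraction computation done by hand gives $f_n = 2n + 2^{-n}$. For 1 < R < 2 this matches the proposition with $P_1(n) = 2n$ of degree one and an error $O(R^{-n})$. The sketch below, with the hypothetical helper `series_coefficients`, confirms the closed form exactly.

```python
from fractions import Fraction


def series_coefficients(num, den, N):
    """First N+1 Taylor coefficients of num(z)/den(z) at the origin.

    Polynomials are coefficient lists, lowest degree first, with den[0] != 0.
    Uses den[0]*f_n = num_n - sum_{j>=1} den[j]*f_{n-j}.
    """
    coeffs = []
    for n in range(N + 1):
        acc = Fraction(num[n]) if n < len(num) else Fraction(0)
        for j in range(1, min(n, len(den) - 1) + 1):
            acc -= den[j] * coeffs[n - j]
        coeffs.append(acc / den[0])
    return coeffs


# 1/((1-z)^2 (1-z/2)) = 2 / (2 - 5z + 4z^2 - z^3)
f = series_coefficients([2], [2, -5, 4, -1], 20)
closed_form = [2 * n + Fraction(1, 2**n) for n in range(21)]
print(f[:5])
```

Here the dominant term 2n comes from the double pole, one degree lower than the pole order, as the proposition predicts.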



When F(z) admits non-polar singularities, asymptotics can be computed using more nuanced methods. In the nineteenth century, Darboux [30] investigated coefficients of functions with algebraic singularities; see Section 2.3 below for more details. In the 1930s, Jungen [67] showed that when F(z) behaves near a singularity z = ρ like a sum of terms of the form
$$C\,(1 - z/\rho)^{\alpha}\,\log^{r}\left(\frac{1}{1 - z/\rho}\right)$$
with r ∈ N, then each term with α ∈ C \ N makes an explicit asymptotic contribution to the power series coefficients of F which begins
$$\frac{C}{\Gamma(-\alpha)}\,\rho^{-n}\,n^{-\alpha-1}\log^{r}(n),$$
where Γ is the Euler gamma function, whose definition is recalled at the end of the appendix. A similar formula is available when α ∈ N and r > 0, the only other situation in which F has a singularity at ρ. We shall see below that this statement encapsulates the singular behaviour of large classes of generating functions.

Flajolet and Odlyzko [48] coined the term singularity analysis for the process of translating local behaviour of a function at its singularities to asymptotic information about its power series coefficients. Those authors gave a modern and uniform treatment of such transfer theorems, considering types of singularities beyond those we encounter. See also the survey of Odlyzko [87], or the extensive treatment in Flajolet and Sedgewick [50], for additional details on these methods. Further information on classical methods for determining asymptotics can be found in de Bruijn [31].
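The leading term in Jungen's formula can be tested on the simplest case C = 1, ρ = 1, r = 0, where the function is $(1-z)^{\alpha}$ itself. The sketch below is an illustration with α = 1/2; the helper name and the use of the term-ratio recurrence for the binomial series are assumptions of this example, not part of the text.

```python
from math import gamma


def binomial_series_coefficients(alpha, N):
    """Coefficients c_0..c_N of (1 - z)^alpha, using c_0 = 1 and
    c_n = c_{n-1} * (n - 1 - alpha) / n."""
    c = [1.0]
    for n in range(1, N + 1):
        c.append(c[-1] * (n - 1 - alpha) / n)
    return c


alpha = 0.5
N = 2000
c = binomial_series_coefficients(alpha, N)
# Predicted leading behaviour: n^(-alpha-1) / Gamma(-alpha).
predicted = N ** (-alpha - 1) / gamma(-alpha)
ratio = c[N] / predicted
print(c[N], predicted, ratio)
```

The ratio tends to 1 as n grows, with a relative discrepancy of order 1/n.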


2.1.3 The Practice of Analytic Combinatorics

Let F(z) be analytic at the origin with coefficient sequence $(f_n)$. If the dominant singularities of F are known, and they fall into one of the large number of types whose asymptotic contributions have been determined, then one can ‘read off’ dominant asymptotics of $f_n$ by looking up precomputed transfer theorems. In practice, most of the work in an asymptotic argument thus goes towards the determination and classification of dominant singularities. Because we deal mainly with power series whose coefficients are non-negative, the following result simplifies the determination of dominant singularities. It has been attributed by various authors to Pringsheim, Borel or Vivanti; see Hadamard [61] and Vivanti [120] for historical context and Titchmarsh [113, Sect. 7.21] for a proof.

Proposition 2.4 (Vivanti-Pringsheim Theorem) Suppose F(z) is represented at the origin by a series expansion with real coefficients which has a finite radius of convergence R > 0. If this series expansion contains only a finite number of negative coefficients then z = R is a singularity of F(z).

Thus, when the series coefficients $f_n$ of F(z) are non-negative, and do not grow or decay super-exponentially, their dominant asymptotics can usually be determined by the following steps:
1. Find the smallest real r > 0 such that z = r is a singularity of F(z);
2. Find all other singularities on the circle |z| = r;
3. If F admits a finite number of singularities on this circle, try to determine the asymptotic contribution of each by examining the local behaviour of F (see Propositions 2.11 and 2.17 below);
4. Add the results to get dominant asymptotics of $f_n$.

Example 2.3 (Asymptotics of Surjections) A surjection of size n is a map from the set [n] = {1, 2, . . . , n} to any set of the form [r] = {1, 2, . . . , r} which covers [r] (all elements of [r] have a preimage). Using the Symbolic Method of Flajolet and Sedgewick [50], it is an easy exercise [50, p. 109] to show that if $s_n$ denotes the number of surjections of size n then the generating function F(z) of $s_n/n!$ satisfies
$$F(z) = \sum_{n \ge 0} \frac{s_n}{n!} z^n = \frac{1}{2 - e^z} = 1 + z + 3\,\frac{z^2}{2!} + 13\,\frac{z^3}{3!} + \cdots.$$
The singularities of F(z) form the set {log 2 + 2πik : k ∈ Z}, with the smallest real positive singularity equal to log 2. There are no other singularities of modulus log 2 and F(z) has a simple pole at z = log 2 with corresponding residue −1/2, so
$$s_n \sim \frac{n!}{2(\log 2)^{n+1}}.$$
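The estimate in Example 2.3 is easy to compare against exact values. A minimal sketch follows, assuming the recurrence $s_n = \sum_{k=1}^{n}\binom{n}{k}s_{n-k}$, which is obtained here by rewriting $F(z) = 1/(2 - e^z)$ as $F = 1 + (e^z - 1)F$ and is not stated in the text; the function name is hypothetical.

```python
from math import comb, factorial, log


def surjection_numbers(N):
    """s_0..s_N via s_n = sum_{k=1}^{n} C(n, k) * s_{n-k}, with s_0 = 1.

    The recurrence is equivalent to F = 1 + (e^z - 1) F for F = 1/(2 - e^z).
    """
    s = [1]
    for n in range(1, N + 1):
        s.append(sum(comb(n, k) * s[n - k] for k in range(1, n + 1)))
    return s


s = surjection_numbers(20)


def estimate(n):
    """Dominant asymptotic term n! / (2 (log 2)^(n+1))."""
    return factorial(n) / (2 * log(2) ** (n + 1))


for n in (5, 10, 20):
    print(n, s[n], estimate(n) / s[n])
```

The agreement is already very close for small n because the next singularities, at log 2 ± 2πi, are much further from the origin.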


We now discuss in detail several important families of generating functions: rational, algebraic, D-finite, and D-algebraic series. Each successive family in this list contains the previous families, and thus represents an increasingly complicated collection of sequences. For instance, although the power series coefficients of a rational generating function can be written as a finite sum of algebraic quantities which can be described explicitly, it is undecidable to determine several basic properties of the coefficients of a general D-algebraic function. We characterize the coefficient sequences corresponding to each class, describe how they can be manipulated, and identify possible asymptotic behaviour.

2.2 Rational Power Series

We begin with the class of rational power series.

Rational Functions and C-finite Sequences

In his pioneering eighteenth-century work, plausibly the first use of generating functions, de Moivre [32, 33] used rational generating functions to study sequences satisfying linear recurrences with constant coefficients.

Definition 2.7 (C-finite sequences and linear recurrences) Given r ∈ N, a sequence $(f_n)$ over a field K (not necessarily a subset of C) satisfies a linear recurrence relation of order r with constant coefficients over K if there are constants $c_0, \dots, c_{r-1} \in K$ with $c_0 \neq 0$ such that
$$f_{n+r} = c_{r-1}f_{n+r-1} + c_{r-2}f_{n+r-2} + \cdots + c_0 f_n \tag{2.3}$$

for all n ≥ 0. Any sequence satisfying (2.3) is uniquely determined by its first r terms $f_0, \dots, f_{r-1}$, called the initial conditions of $(f_n)$. A sequence that satisfies a linear recurrence relation over K is called constant recursive, or C-finite, over K.

Remark 2.1 The set of sequences satisfying (2.3) forms a K-vector space of dimension r: if $f_n$ and $g_n$ satisfy (2.3) then so does $f_n + \lambda g_n$ for any λ ∈ K.

De Moivre showed that a sequence is C-finite if and only if its generating function is rational.

Theorem 2.2 The sequence $(f_n)$ satisfies the recurrence (2.3) if and only if its generating function F(z) satisfies
$$F(z) = \sum_{n \ge 0} f_n z^n = \frac{p(z)}{1 - c_{r-1}z - \cdots - c_0 z^r}$$
for some polynomial p(z) ∈ K[z] of degree at most r − 1.


The polynomial p depends on the initial conditions of $(f_n)$. Given a rational function F whose numerator has degree greater than or equal to its denominator, polynomial division gives F as the sum of a rational function of this form and a polynomial, which only affects a finite number of coefficients. If K is not a subset of C, the rational function can be viewed as a multiplicative inverse in the ring of formal power series.

Proof For any n ≥ 0, the coefficient $[z^{n+r}]\,F(z)\,(1 - c_{r-1}z - \cdots - c_0 z^r)$ equals
$$[z^{n+r}]\left(\sum_{n \ge 0} f_n z^n\right)(1 - c_{r-1}z - \cdots - c_0 z^r) = f_{n+r} - c_{r-1}f_{n+r-1} - \cdots - c_0 f_n.$$
Thus, $F(z)\,(1 - c_{r-1}z - \cdots - c_0 z^r)$ is a polynomial of degree at most r − 1 if and only if $(f_n)$ satisfies (2.3) for all n ≥ 0. □

Linear recurrence relations have applications in an astounding number of scientific and mathematical fields, including combinatorics, probability, number theory, dynamical systems, theoretical computer science, economics, biology, and more. In computer algebra, fast computation with linear recurrence relations is used as a basis for other algorithms [121, Sect. 12.3]. The text of Everest et al. [41] is dedicated to C-finite sequences and their applications. Bostan et al. [14, Ch. 4] study the complexity of generating terms in C-finite sequences.
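The correspondence in Theorem 2.2 is constructive in both directions. As a rough sketch (the helper names `rational_form` and `expand` are hypothetical, and integer coefficients are assumed for simplicity), one can compute the numerator p(z) from the recurrence and initial conditions, then recover the sequence by expanding the rational function.

```python
def rational_form(c, init):
    """Given f_{n+r} = c[r-1] f_{n+r-1} + ... + c[0] f_n and initial terms
    init = [f_0, ..., f_{r-1}], return (p, den) with F(z) = p(z)/den(z).

    den[j] is the coefficient of z^j in 1 - c[r-1] z - ... - c[0] z^r, and
    p consists of the first r coefficients of F(z) * den(z).
    """
    r = len(c)
    den = [1] + [-c[r - j] for j in range(1, r + 1)]
    p = [sum(den[j] * init[i - j] for j in range(i + 1)) for i in range(r)]
    return p, den


def expand(p, den, N):
    """First N+1 series coefficients of p(z)/den(z), assuming den[0] == 1."""
    f = []
    for n in range(N + 1):
        acc = p[n] if n < len(p) else 0
        for j in range(1, min(n, len(den) - 1) + 1):
            acc -= den[j] * f[n - j]
        f.append(acc)
    return f


# Recurrence f_{n+2} = f_{n+1} + f_n with f_0 = f_1 = 1.
p, den = rational_form([1, 1], [1, 1])
terms = expand(p, den, 10)
print(p, den, terms)
```

For this input the sketch produces the rational function 1/(1 − z − z²).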

Examples of C-finite Sequences

Our first, and most famous, example of a C-finite sequence has transcended academia to appear in popular literature.

Example 2.4 (Virahanka-Fibonacci Numbers) The Virahanka-Fibonacci numbers $(v_n)$ satisfy the recurrence $v_{n+2} = v_{n+1} + v_n$ with initial conditions $v_0 = v_1 = 1$, and have generating function
$$V(z) = \frac{1}{1 - z - z^2} = 1 + z + 2z^2 + 3z^3 + 5z^4 + \cdots.$$

C-finite sequences occur naturally as generating functions of combinatorial classes whose elements are built out of a finite set of fixed objects³. Although satisfying a C-finite recurrence is a strong condition, some surprisingly complicated sequences can be C-finite.

3 For instance, traditional Sanskrit prosody is built from short and long units containing one or two syllables, respectively. As described by the Indian prosodist Virahanka around the 7th century AD, the number of long-short syllable sequences containing n syllables thus satisfies the Virahanka-Fibonacci recurrence $v_{n+2} = v_{n+1} + v_n$ (simply consider whether the final unit has 1 or 2 syllables). Singh [107] gives a historical account of the Virahanka-Fibonacci numbers focused on the early contributions of Indian poets and mathematicians.

Example 2.5 (Look-and-Say Digit Sequence) Conway’s look-and-say sequence $(l_n)$, beginning
1, 11, 21, 1211, 111221, 312211, 13112221, . . .
is obtained by starting with 1 and recording the result of reading the previous blocks of digits from left to right (the first term contains ‘one 1’ giving the second term 11, the second term contains ‘two 1s’ giving the third term 21). A term of the form $x_1^{m_1} x_2^{m_2} \cdots x_k^{m_k}$ with distinct consecutive bases is followed by $m_1 x_1 m_2 x_2 \cdots m_k x_k$. The look-and-say digit sequence $d_n$ counts the number of digits appearing in $l_n$, beginning with the terms
1, 2, 2, 4, 6, 6, 8, . . .
Conway [29] proved the miraculous fact that $(d_n)$ is C-finite, and gave the rational generating function
$$D(z) = \frac{G(z)}{H(z)} = \frac{1 + z + \cdots + 18z^{77} - 12z^{78}}{1 - z + \cdots - 9z^{71} + 6z^{72}}.$$
If the starting term ‘1’ is replaced by any natural number other than the fixed point ‘22’ the resulting generating function will still be rational with denominator H.
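The look-and-say construction is simple to simulate. The following minimal sketch (function names are hypothetical) generates the terms as strings and records their lengths; it does not use Conway's generating function, only the reading rule described above.

```python
from itertools import groupby


def look_and_say_next(term):
    """Read off runs of equal digits: 'mmm' becomes '3m', and so on."""
    return "".join(str(len(list(run))) + digit for digit, run in groupby(term))


def digit_counts(N, start="1"):
    """Lengths of the first N look-and-say terms beginning from `start`."""
    term, counts = start, []
    for _ in range(N):
        counts.append(len(term))
        term = look_and_say_next(term)
    return counts


counts = digit_counts(40)
growth = counts[-1] / counts[-2]
print(counts[:7])
print(growth)
```

The ratio of consecutive lengths settles near 1.3, the growth rate forced by the dominant root of the denominator H.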

Rational generating functions appear often in the study of lattice walks, which will be discussed in Chapter 4. Additional examples come from counting integer partitions whose summands come from any finite set, non-negative solutions to many linear Diophantine equations and inequalities, integer points in convex polyhedral cones and scalings of convex rational polytopes, walks on finite digraphs, words in Coxeter groups, and many other combinatorial objects. See Stanley [112], Bousquet-Mélou [18], and Flajolet and Sedgewick [50] for these and other examples. The class of C-finite sequences is also closed under several natural operations.

Proposition 2.5 (Closure Properties) If $(f_n)$ and $(g_n)$ are C-finite sequences then so are the sequences
• $(f_n + g_n)_{n \ge 0}$
• $\left(\sum_{k=0}^{n} f_k g_{n-k}\right)_{n \ge 0}$
• $(f_n g_n)_{n \ge 0}$
• $(f_{nm+k})_{n \ge 0}$ for any m ∈ N and 0 ≤ k < m

Proof If F(z) and G(z) are rational generating functions of the sequences $(f_n)$ and $(g_n)$, then the generating functions of $(f_n + g_n)$ and $\left(\sum_{k=0}^{n} f_k g_{n-k}\right)$ are the rational functions F(z) + G(z) and F(z)G(z), immediately giving the first two cases of Proposition 2.5. The operation taking $F(z) = \sum_{n \ge 0} f_n z^n$ and $G(z) = \sum_{n \ge 0} g_n z^n$ and returning $(F \odot G)(z) = \sum_{n \ge 0} f_n g_n z^n$ is known as the Hadamard product of F and G. C-finiteness of $(f_n g_n)$, and thus rationality of $F \odot G$, follows from Theorem 2.3 below. The Hadamard product of F(z) and the rational function $z^k/(1 - z^m)$ has the form $z^k H(z^m)$, where H(z) is rational and equals the generating function of $(f_{nm+k})_{n \ge 0}$. □
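The closure properties can be observed on the Virahanka-Fibonacci numbers. In the sketch below, the three recurrences being checked were worked out by hand for this illustration (from the roots of 1 − z − z² and from the denominator (1 − z − z²)²) and are not taken from the text; the helper name is hypothetical.

```python
def fibonacci_like(N):
    """First N terms of v_{n+2} = v_{n+1} + v_n with v_0 = v_1 = 1."""
    v = [1, 1]
    while len(v) < N:
        v.append(v[-1] + v[-2])
    return v


v = fibonacci_like(40)

# Hadamard square (v_n^2): assumed recurrence a_{n+3} = 2a_{n+2} + 2a_{n+1} - a_n.
squares = [x * x for x in v]
squares_ok = all(
    squares[n + 3] == 2 * squares[n + 2] + 2 * squares[n + 1] - squares[n]
    for n in range(len(squares) - 3)
)

# Section (v_{2n}): assumed recurrence a_{n+2} = 3a_{n+1} - a_n.
evens = v[0::2]
evens_ok = all(
    evens[n + 2] == 3 * evens[n + 1] - evens[n] for n in range(len(evens) - 2)
)

# Cauchy product of (v_n) with itself has generating function 1/(1 - z - z^2)^2,
# whose denominator 1 - 2z - z^2 + 2z^3 + z^4 gives a recurrence of order 4.
conv = [sum(v[k] * v[n - k] for k in range(n + 1)) for n in range(30)]
conv_ok = all(
    conv[n + 4] == 2 * conv[n + 3] + conv[n + 2] - 2 * conv[n + 1] - conv[n]
    for n in range(len(conv) - 4)
)
print(squares_ok, evens_ok, conv_ok)
```

Each derived sequence satisfies a constant-coefficient recurrence, though its order can exceed that of the original sequence.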

N-Rational Series

The following class of rational series arises frequently in combinatorial applications.

Definition 2.8 (N-rational functions) The collection of functions containing the constant 1 and the variable z which is closed under addition, multiplication, and the pseudo-inverse operation $f \mapsto 1/(1 - zf)$ forms the set of N-rational functions.

Series expansions of N-rational functions have deep importance in computer science as the generating functions of regular languages counted by length, equivalent to the class of languages recognized by deterministic finite automata⁴, a well studied model of computation (see Rozenberg and Salomaa [95, Ch. 2] for details). Although every N-rational function is a rational function with natural number power series coefficients, there are rational power series with natural number coefficients which are not N-rational. Of particular interest for asymptotics is the following necessary condition of N-rationality given by Berstel [9], whose proof is Problem 2.9.

Proposition 2.6 If F(z) is an N-rational function and σ is a dominant singularity of F(z) then σ/|σ| is a root of unity.

Example 2.6 (A Non-N-Rational Series) Consider the rational function
$$F(z) = \frac{2(1 + 3z)^2}{(1 - 9z)(1 + 14z + 81z^2)} = \frac{1/2}{1 - 9ze^{2i\theta}} + \frac{1/2}{1 - 9ze^{-2i\theta}} + \frac{1}{1 - 9z},$$
where θ = arccos(1/3). Then F(z) is not N-rational by Proposition 2.6 as $e^{2i\theta}$ is not a root of unity. Since the denominator of F has a constant term of 1 the coefficients of F(z) are integers. In fact, basic algebraic manipulations imply
$$[z^n]F(z) = 9^n\left(1 + \cos(2n\theta)\right) \ge 0$$
so F(z) has natural number coefficients.
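The claims of Example 2.6 can be checked directly. This is a minimal sketch with a hypothetical helper `series_coefficients`; the expanded denominator 1 + 5z − 45z² − 729z³ was multiplied out by hand from (1 − 9z)(1 + 14z + 81z²).

```python
from math import acos, cos


def series_coefficients(num, den, N):
    """First N+1 series coefficients of num(z)/den(z), assuming den[0] == 1.

    Integer inputs give exact integer coefficients.
    """
    f = []
    for n in range(N + 1):
        acc = num[n] if n < len(num) else 0
        for j in range(1, min(n, len(den) - 1) + 1):
            acc -= den[j] * f[n - j]
        f.append(acc)
    return f


num = [2, 12, 18]            # 2(1 + 3z)^2
den = [1, 5, -45, -729]      # (1 - 9z)(1 + 14z + 81z^2)
coeffs = series_coefficients(num, den, 30)

theta = acos(1 / 3)
closed = [9**n * (1 + cos(2 * n * theta)) for n in range(31)]
# Compare after dividing by 9^n so the tolerance is scale-free.
max_gap = max(abs(c - x) / 9**n for n, (c, x) in enumerate(zip(coeffs, closed)))
print(coeffs[:5], max_gap)
```

The exact coefficients are non-negative integers and match the trigonometric closed form up to floating-point error.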

4 A finite automaton over a finite set A is a directed multigraph (V, E) whose edges are labelled by elements of A, where one vertex v ∈ V is chosen as an initial vertex and a collection Q ⊂ V of vertices are chosen as accepting vertices. A sequence $(a_1, \dots, a_r) \in A^r$ for some r ∈ N is recognized by the automaton if there is a path in the automaton from v to an element of Q whose consecutive edges have labels $a_1, \dots, a_r$.


Soittola [109] gave a characterization of N-rational functions based on singularity analysis. Barcucci et al. [5], building on Soittola’s work, gave an algorithm which takes a rational function F(z), tests for N-rationality, and, if this test is positive, returns a regular language whose counting sequence is the coefficient sequence of F. In the words of those authors, this allows one to “reverse-engineer a combinatorial problem from a rational generating function.” As a meta-principle, generating functions of combinatorial classes seem to be N-rational. In the proceedings for the 2006 International Congress of Mathematicians, Bousquet-Mélou [18] writes that she “never met a counting problem that would yield a rational, but not N-rational, GF.” Koutschan [71] gave a Maple implementation of the algorithm of Barcucci et al. and tested “about 60” sequences of combinatorial significance with rational generating functions, finding all functions to be N-rational. As we will soon see, this means certain deep open problems about the behaviour of rational function coefficients can be avoided for combinatorial enumeration problems. Trying to generalize characterizations of N-rationality to coefficient sequences of multivariate rational functions is an open problem, discussed in Chapter 3.

Asymptotics

In addition to characterizing C-finite sequences in terms of their generating functions, de Moivre also studied their closed-form solutions, as did Bernoulli [8] and Euler [40, V. 1 Ch. XIII] around the same time period⁵. The following result gives an exact closed-form representation for terms of sufficiently large index.

5 For instance, Euler advocated the use of partial fraction decomposition to find the general term of a series with rational generating function.

Theorem 2.3 Suppose F(z) = G(z)/H(z) ∈ Q(z) is a rational function with G and H coprime polynomials and H(0) ≠ 0. Let d denote the degree of H and $\sigma_1, \dots, \sigma_m$ be the distinct roots of H(z) in the complex plane. Then there exist polynomials $P_1(n, x), \dots, P_m(n, x)$ in Q[n, x], whose degrees in x are at most d, such that for all n larger than some fixed natural number the power series coefficients of F(z) satisfy
$$f_n = \sum_{j=1}^{m} P_j(n, \sigma_j)\,\sigma_j^{-n}. \tag{2.4}$$
The degree of $P_j(n, x)$ in n is one less than the order of the pole of F(z) at $z = \sigma_j$. Conversely, if $(f_n)$ has a representation of the form (2.4) then the generating function F(z) is a rational function.

Proof Theorem 2.3 is Proposition 2.3 specialized and strengthened in the case when F(z) is rational, and their proofs are analogous. In particular, when R > 0 is large enough so that all roots of H(z) lie in the disk |z| < R the Cauchy residue theorem and maximum modulus bound imply
$$\left| f_n + \sum_{j=1}^{m}\operatorname{Res}_{z=\sigma_j}\left(\frac{F(z)}{z^{n+1}}\right) \right| = \left| \frac{1}{2\pi i}\int_{|z|=R}\frac{F(z)}{z^{n+1}}\,dz \right| \le \frac{\max_{|z|=R}|F(z)|}{R^n}. \tag{2.5}$$

Because F(z) is rational, there exists k > 0 such that for any R sufficiently large $\max_{|z|=R}|F(z)| \le R^k$. Thus, whenever n > k the right-hand side of (2.5) goes to zero as R → ∞, giving (2.4). The bounds on the degrees of the $P_j$ follow from Lemma 2.4 in the appendix of this chapter and the fact that each $\sigma_j$ is an algebraic number of degree at most d. For any k ∈ N and c, α ∈ C the generating function of the sequence $f_n = c\,n^k\alpha^n$ is the rational function $c\,R_k(\alpha z)$, where $R_k(z)$ is obtained by starting with the geometric series 1/(1 − z) and alternately taking the derivative and multiplying by z, repeated k times. Distributing the sum in (2.4) then proves the final statement. □

Gourdon and Salvy [58] give algorithms, running in polynomial time in the degree d, for computing the polynomials $P_j$ and determining which terms of the sum dictate dominant asymptotics of $f_n$. Calculating the $P_j$ can be done symbolically using resultants and other algebraic tools, while determining which roots of H(z) are closest to the origin and dictate asymptotics requires numeric algorithms to separate the algebraic numbers $\sigma_j$ and their moduli. These algorithms may be viewed as special cases of the machinery we develop in Chapter 7 for the multivariate setting.

Example 2.7 (Virahanka-Fibonacci Asymptotics) The generating function of the Virahanka-Fibonacci numbers can be written
$$V(z) = \frac{1}{1 - z - z^2} = \frac{1}{\sqrt{5}}\left(\frac{1}{\sigma}\,\frac{1}{1 - z/\sigma} - \frac{1}{\tau}\,\frac{1}{1 - z/\tau}\right),$$
where $\sigma = (-1 + \sqrt{5})/2$ and $\tau = (-1 - \sqrt{5})/2$ are the roots of $1 - z - z^2$. Thus, expanding 1/(1 − z/σ) and 1/(1 − z/τ) as geometric series gives
$$v_n = [z^n]V(z) = \frac{1}{\sigma\sqrt{5}}\,\sigma^{-n} - \frac{1}{\tau\sqrt{5}}\,\tau^{-n}.$$

Similarly, the tools developed in Chapter 7 allow one to automatically prove that the look-and-say sequence generating function D(z) admits a single dominant singularity, necessarily real and positive, which is the root λ = .7671198507 . . . of the denominator H(z). Since the derivative of H does not vanish at λ this is a simple pole of D(z), and the look-and-say digit counting sequence has asymptotic behaviour
$$d_n \sim \frac{-G(\lambda)}{\lambda\,(\partial H/\partial z)(\lambda)}\,\lambda^{-n} \approx (2.04216\ldots)\,(1.30357\ldots)^n.$$

An automated analysis of this rational generating function was carried out by Gourdon and Salvy [58]. Koutschan [72] showed that D(z) is N-rational.
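The exact formula of Example 2.7 can be compared with the recurrence in floating point. This is a rough sketch; the names `v_closed` and `dominant` are hypothetical, and rounding is used only because the arithmetic is inexact.

```python
from math import sqrt

sigma = (-1 + sqrt(5)) / 2   # root of 1 - z - z^2 closest to the origin
tau = (-1 - sqrt(5)) / 2     # the other root, of larger modulus


def v_closed(n):
    """Closed form: sigma^(-n)/(sigma*sqrt(5)) - tau^(-n)/(tau*sqrt(5))."""
    return sigma ** (-n) / (sigma * sqrt(5)) - tau ** (-n) / (tau * sqrt(5))


def dominant(n):
    """Contribution of the dominant singularity sigma alone."""
    return sigma ** (-n) / (sigma * sqrt(5))


# Exact values from the recurrence v_{n+2} = v_{n+1} + v_n, v_0 = v_1 = 1.
v = [1, 1]
while len(v) < 40:
    v.append(v[-1] + v[-2])

matches = all(round(v_closed(n)) == v[n] for n in range(40))
print(matches, dominant(30) / v[30])
```

Since |σ| < |τ|, the term involving σ dictates dominant asymptotics, while the term involving τ decays exponentially.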


Equation (2.4) is an exact equality, for n sufficiently large, but not every summand will necessarily contribute to dominant asymptotics. Although this characterization is explicit there are still significant open problems in the area, due to potentially complicated interactions between the algebraic quantities involved.

Open Problem 2.1 (Skolem’s Problem) Is there an algorithm which takes a rational function with integer coefficients, analytic at the origin, and determines whether it has a power series coefficient which is zero? Equivalently, is there an algorithm which takes a C-finite sequence over Q, defined by a linear recurrence and initial conditions, and decides whether any term in the sequence is zero?

In the 1930s, Skolem [108] showed that if a sequence $(f_n)$ of rational numbers⁶ is C-finite then the set $\{n : f_n = 0\}$ of indices of zero terms consists of a finite set together with a finite set of arithmetic progressions, and this problem has essentially been open since then. When the set of zero indices is infinite, meaning it contains an arithmetic progression, this can be detected [10]. Currently, all known proofs of Skolem’s theorem use p-adic analysis in a manner which does not give an upper bound on the potential indices of sporadic zeroes⁷. Mignotte et al. [84] and Vereshchagin [119] show Skolem’s problem is decidable for sequences satisfying C-finite recurrences of orders 3 or 4. As noted in Kenison et al. [70], the crucial open case for recurrences of order 5 consists of sequences $(u_n)$ of the form⁸
$$u_n = a\left(\lambda_1^n + \overline{\lambda_1}^{\,n}\right) + b\left(\lambda_2^n + \overline{\lambda_2}^{\,n}\right) + c\rho^n$$
where $|\lambda_1| = |\lambda_2| > |\rho|$ with a, b, c real algebraic numbers and $|a| \neq |b|$. Those authors give a method [70, Thm. 1.2] to determine when C-finite sequences of any order take the value zero on terms whose indices are prime powers.

If $(f_n)$ is a C-finite sequence, so is the sequence $(f_n^2)$. Since $f_n^2 > 0$ exactly when $f_n \neq 0$, the problem of detecting zero coefficients over Q can be reduced to deciding whether every term of a C-finite sequence is positive. This condition can then be relaxed, motivating the following problem which is more directly related to our asymptotic setting.

Open Problem 2.2 (Ultimate Positivity) Is there an algorithm which takes a rational function with integer coefficients, analytic at the origin, and determines whether its series coefficients are eventually non-negative (i.e., if there are only a finite number of negative coefficients)? Equivalently, is there an algorithm which takes a C-finite sequence over Q, defined by a linear recurrence and initial conditions, and decides whether its terms are eventually non-negative? What if ‘non-negative’ is strengthened to ‘positive’?

6 Mahler [78] and Lech [76] extended this result to sequences over the field of algebraic numbers, and any field of characteristic zero, respectively. The theorem does not hold in finite characteristic: for instance, if $\mathbb{F}_p$ denotes the finite field of prime order p then the sequence $c_n = (1 + x)^n - 1 - x^n$ defined over $K = \mathbb{F}_p(x)$ satisfies a C-finite recurrence of order 3, but $c_n = 0$ if and only if $n = p^k$ for some k ∈ N. This counter-example is due to Lech [76].

7 Additional information on Skolem’s theorem can be found in Chapter 2 of Everest et al. [41].

8 An explicit example of such a sequence, given by Kenison et al. [70], is the sequence defined by $u_{n+5} = -41u_{n+4} + 952u_{n+3} - 178360u_{n+2} - 17673175u_{n+1} + 17850625u_n$ with initial conditions $u_0 = 9$, $u_1 = -281$, $u_2 = 15485$, $u_3 = -1135097$, $u_4 = -30999543$.


Ouaknine and Worrell [89, 88] show that the Ultimate Positivity problem is decidable for a rational function with square-free denominator (i.e., when the denominator and its derivative are coprime polynomials) and for an arbitrary rational function whose denominator has degree at most 5. Those authors also show that proving Ultimate Positivity for rational functions whose denominators have degree 6 would give a method of computing the so-called Lagrange constant for a large collection of transcendental numbers, a significant breakthrough in analytic number theory. Ultimate Positivity is exactly the setting in which the Vivanti-Pringsheim Theorem can be applied.

Of course, when one is dealing with the generating function of a combinatorial class then non-negativity of coefficients is guaranteed. Because of the meta-principle that combinatorial classes always seem to have N-rational generating functions, in practice asymptotics can be effectively decided for combinatorial problems and combinatorialists do not need to worry about these pathological decidability issues. In particular, Proposition 2.6 implies that the dominant singularities of an N-rational function differ by roots of unity, meaning its coefficients have explicit periodic behaviour: there exists a natural number m such that for each k = 0, . . . , m−1 the coefficient sub-sequence $(f_{mn+k})$ has dominant asymptotics of the form $C_k n^{\alpha_k} \rho_k^n$ where $\alpha_k$ is a computable natural number and $C_k$ and $\rho_k$ are algebraic constants with computable minimal polynomials. Flajolet et al. [49] give a similar characterization for dominant singularities of combinatorial classes obtained from a large collection of recursive ‘constructions’.
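The search for zero terms in Open Problem 2.1 can be explored experimentally. The following Python sketch (the function names `cfinite_terms` and `zero_indices` are ours, chosen for illustration) generates terms of a C-finite sequence with exact rational arithmetic and lists the zero terms among the first few indices. Finding a zero in this way is conclusive, but finding none says nothing about later terms, which is precisely the difficulty behind Skolem’s Problem.

```python
from fractions import Fraction


def cfinite_terms(coeffs, initial, count):
    """First `count` terms of u_{n+d} = coeffs[0]*u_{n+d-1} + ... + coeffs[d-1]*u_n."""
    terms = [Fraction(x) for x in initial]
    while len(terms) < count:
        # coeffs[0] multiplies the most recent term, coeffs[-1] the oldest one
        terms.append(sum(Fraction(c) * terms[-1 - i] for i, c in enumerate(coeffs)))
    return terms[:count]


def zero_indices(coeffs, initial, count):
    """Indices n < count with u_n = 0 (a finite search, not a decision procedure)."""
    return [n for n, u in enumerate(cfinite_terms(coeffs, initial, count)) if u == 0]


# u_{n+2} = -u_n with u_0 = 0, u_1 = 1 vanishes on an arithmetic progression.
print(zero_indices([0, -1], [0, 1], 10))

# The order 5 sequence of Kenison et al. quoted in the footnotes above.
kenison = ([-41, 952, -178360, -17673175, 17850625],
           [9, -281, 15485, -1135097, -30999543])
print(zero_indices(*kenison, 200))
```

Exact arithmetic matters here: the terms of such sequences grow exponentially, so floating point evaluation would quickly lose the ability to distinguish a zero term from a small non-zero one.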

2.3 Algebraic Power Series

We next examine series satisfying polynomial equations. Throughout this section we assume that K is an algebraically closed field of characteristic zero, of which the most important cases are K = A, the field of algebraic numbers over Q, and K = C. We will see how bivariate polynomials can be used as data structures for algebraic functions, and how algebraic tools can be leveraged to create algorithms for coefficient asymptotics. In particular, the potential singularities of an algebraic function can be characterized, after which asymptotics can be ‘read off’ from local expansions near the singularities. In order to solve algebraic equations we must consider more general objects than power series.

Laurent and Puiseux Series Expansions

First, we note that to solve equations such as zF(z) = 1 it is necessary to introduce series expansions with negative exponents.

Definition 2.9 (formal Laurent series) The field of formal Laurent series over K, denoted K((z)), consists of formal series with potentially negative integer exponents bounded from below,

$$K((z)) = \left\{ \sum_{n \geq N} a_n z^n : N \in \mathbb{Z},\ a_j \in K \text{ for all } j \right\},$$

with term-wise addition and Cauchy product. Problem 2.5 implies that K((z)) is the field of fractions of the ring of formal power series K[[z]]. In a similar fashion, to solve algebraic equations such as $F(z)^R = z$ using formal series we must introduce fractional exponents.

Definition 2.10 (formal Puiseux series) The field of formal Puiseux series, denoted $K^{\mathrm{fra}}((z))$, consists of the set of formal series

$$K^{\mathrm{fra}}((z)) = \left\{ \sum_{n \geq N} a_n z^{n/R} : N \in \mathbb{Z},\ R \in \mathbb{N}_{>0},\ a_j \in K \text{ for all } j \right\}$$

with term-wise addition and Cauchy product. The variables appearing in a Puiseux series have rational exponents that are bounded from below and have fixed denominators, so expressions such as $\sum_{n \geq 0} z^{-n}$ and $\sum_{n \geq 1} z^{1/n}$ are not Puiseux series. Although $K^{\mathrm{fra}}((z))$ does not include all series with rational exponents, it is adequate to describe the solutions of any algebraic equation. The following result dates back to Newton [85] and Puiseux [94]; a proof can be found in Walker [122, Sect. 3].

Proposition 2.7 (Puiseux’s Theorem) Let K be an algebraically closed field of characteristic zero. Then $K^{\mathrm{fra}}((z))$ is the algebraic closure of K((z)).

Definition 2.11 (algebraic series and minimal polynomials) A formal series $F(z) \in K^{\mathrm{fra}}((z))$ is algebraic over K if there exist polynomials $p_0(z), \dots, p_d(z) \in K[z]$, not all zero, such that

$$p_d(z)F(z)^d + p_{d-1}(z)F(z)^{d-1} + \cdots + p_0(z) = 0.$$

If F(z) is algebraic, a minimal polynomial of F is any non-zero polynomial P(z, y) ∈ K[z][y] of minimal degree in y with coefficients in K[z] such that P(z, F(z)) = 0; this is unique up to a non-zero multiple of K[z]. The Puiseux series roots of a minimal polynomial of F are called the conjugates of F. A series which is not algebraic is called transcendental.

Just as the minimal polynomial $x^2 - 2$ of $\sqrt{2}$ over Q also admits $-\sqrt{2}$ as a root, a general algebraic series F(z) must be encoded by its minimal polynomial together with additional information to distinguish F among its conjugates. Even when F is a power series its conjugates can contain negative and fractional powers.

Example 2.8 (Rooted Binary Trees) A rooted binary plane tree is a rooted tree (including the empty tree) where every vertex has two (possibly empty) ordered binary plane trees as children. From this


recursive definition it can be shown that the generating function C(z) of such trees, counted by their number of vertices, satisfies P(z, C(z)) = 0 where $P(z, y) = zy^2 - y + 1$. By Puiseux’s Theorem we know there are two formal series solutions to P(z, y) = 0. Let $f(z) = a_r z^r + \cdots$ for some r ∈ Q with $a_r \neq 0$ and algebraic coefficients $a_j \in A$ to be determined, where ‘···’ hides terms with strictly larger exponents. Substitution into the equation P(z, f(z)) = 0 gives

$$0 = z\left(a_r z^r + \cdots\right)^2 - \left(a_r z^r + \cdots\right) + 1 = a_r^2 z^{2r+1} + \cdots - \left(a_r z^r + \cdots\right) + 1. \qquad (2.6)$$

In order for this equality to hold, every term on the right hand side of (2.6) must cancel. In particular, there must be at least two terms whose exponent of z is minimal, and these terms must cancel. Thus, either

$2r + 1 = r < 0$ and $a_r^2 - a_r = 0$ ($z^{2r+1}$ and $z^r$ are minimal)

or

$2r + 1 = 0 < r$ and $a_r^2 + 1 = 0$ ($z^{2r+1}$ and $z^0$ are minimal)

or

$r = 0 < 2r + 1$ and $-a_r + 1 = 0$ ($z^r$ and $z^0$ are minimal).

In the first case, r = −1 and $a_r = a_{-1} = 1$ since $a_r \neq 0$ by assumption. The second case cannot occur, as 2r + 1 = 0 forces r = −1/2, contradicting r > 0, but the third can when r = 0 and $a_r = a_0 = 1$. Thus, we have found the start of two series expansions

$$f_1(z) = z^{-1} + \cdots \quad \text{and} \quad f_2(z) = 1 + \cdots$$

such that $P(z, f_1) = P(z, f_2) = 0$. Substituting these starting terms back in (2.6) allows one to repeat the process and obtain further terms in the series expansions. For example, when r = −1 and $a_{-1} = 1$ then repeated back substitution shows the next non-zero term is $a_0 z^0$ with $a_0 = -1$, followed by $a_1 z$ with $a_1 = -1$, etc. Puiseux’s theorem implies that this process of extending a partial expansion must succeed, giving solutions

$$f_1(z) = z^{-1} - 1 - z - 2z^2 - 5z^3 + \cdots \quad \text{and} \quad f_2(z) = 1 + z + 2z^2 + 5z^3 + 14z^4 + \cdots$$

to $P(z, f_1) = P(z, f_2) = 0$.

This computational procedure of extending truncated series solutions can be automated, a process known as the Newton polygon method (see, for example, Walker’s proof [122] of Puiseux’s Theorem). The name comes from the use of a polygon defined by the exponents of monomials in P(z, y) to determine the potential starting indices of solutions. There are several computer algebra packages that can compute Puiseux series solutions of equations, such as the gfun package [101] in Maple.
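The first step of the Newton polygon method, finding the possible leading exponents r, is easy to mechanize. The sketch below is a simplified illustration rather than a full implementation: it assumes every listed monomial of P has a non-zero coefficient, it finds only the candidate exponents (not the coefficients $a_r$ or any later terms), and the function name is our own.

```python
from fractions import Fraction
from itertools import combinations


def leading_exponents(monomials):
    """Candidate leading exponents r of solutions y = a*z^r + ... of P(z, y) = 0.

    `monomials` lists the exponent pairs (i, j) of the monomials z^i * y^j of P.
    Substituting y = a*z^r turns z^i * y^j into a term of order i + j*r, and the
    minimal order must be attained at least twice for the lowest terms to cancel.
    """
    candidates = set()
    for (i1, j1), (i2, j2) in combinations(monomials, 2):
        if j1 == j2:
            continue  # these two terms have orders differing by a constant
        r = Fraction(i2 - i1, j1 - j2)  # solves i1 + j1*r == i2 + j2*r
        orders = [i + j * r for i, j in monomials]
        lowest = min(orders)
        if i1 + j1 * r == lowest and orders.count(lowest) >= 2:
            candidates.add(r)
    return sorted(candidates)


# P(z, y) = z*y^2 - y + 1 has monomials z^1 y^2, z^0 y^1 and z^0 y^0.
print(leading_exponents([(1, 2), (0, 1), (0, 0)]))
```

For $zy^2 - y + 1$ this reproduces the two surviving cases r = −1 and r = 0 of Example 2.8, while the pairing that would give r = −1/2 is rejected because its terms are not of minimal order.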


Example 2.9 (Rooted 3-Ary Trees) A rooted 3-ary plane tree is a rooted tree where every vertex has three ordered 3-ary plane trees as children. The generating function F(z) of 3-ary plane trees satisfies

$$P(z, F(z)) = 0, \qquad P(z, y) = zy^3 - y + 1.$$

There are now three solutions,

$$f_1(z) = \frac{1}{z^{1/2}} - \frac{1}{2} - \frac{3z^{1/2}}{8} + \cdots, \qquad f_2(z) = \frac{-1}{z^{1/2}} - \frac{1}{2} + \frac{3z^{1/2}}{8} + \cdots,$$

and $f_3(z) = 1 + z + 3z^2 + 12z^3 + \cdots$ satisfying $P(z, f_1) = P(z, f_2) = P(z, f_3) = 0$. Even though the generating function $f_3$ is a power series, its conjugates $f_1$ and $f_2$ have fractional exponents.
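Power series roots such as the generating functions in Examples 2.8 and 2.9 can be generated to any order by repeated back substitution. A minimal sketch, assuming the equation has the special form $y = 1 + zy^k$ (which covers $zy^2 - y + 1 = 0$ and $zy^3 - y + 1 = 0$) and using a hypothetical helper name:

```python
def power_series_branch(k, terms):
    """First `terms` coefficients of the power series y(z) solving y = 1 + z*y**k."""
    y = [0] * terms
    for _pass in range(terms):
        # compute y**k, truncated to `terms` coefficients
        power = [1] + [0] * (terms - 1)
        for _factor in range(k):
            power = [sum(power[i] * y[n - i] for i in range(n + 1))
                     for n in range(terms)]
        # substitute back: y <- 1 + z * y**k; each pass fixes one more coefficient
        y = [1] + power[:terms - 1]
    return y


print(power_series_branch(2, 6))  # rooted binary plane trees
print(power_series_branch(3, 6))  # rooted 3-ary plane trees
```

Each pass multiplies by z, so a correct truncation of length m yields a correct truncation of length m + 1; this is why `terms` passes suffice. The conjugates with negative or fractional exponents are not found by this iteration and require the Newton polygon method.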

When working over the complex numbers there is a useful analytic version of Puiseux’s Theorem involving convergent series. When moving from formal to analytic considerations, we account for branch cuts of algebraic functions by considering series in disks with rays removed (the appendix to this chapter contains additional details on branch cuts).

Definition 2.12 (slit disks) A slit disk centred at z = ω is a disk |z − ω| < R centred at ω with a ray {ω + ta : t > 0} removed, for any R > 0 and a ∈ C \ {0}.

See Hille [66, Ch. 12] for a proof of the following.

Proposition 2.8 (Puiseux’s Theorem, Analytic Version) Let P(z, y) ∈ Q[z, y] be an irreducible polynomial of degree R in y, and let ω ∈ C. Then in a sufficiently small slit disk centred at ω there exist R convergent Puiseux series expansions

$$y_k(z) = \sum_{n \geq N} c_n^{(k)} (1 - z/\omega)^{n/R}, \qquad 1 \leq k \leq R,$$

with $P(z, y_k(z)) = 0$ for each 1 ≤ k ≤ R, where the branch of $(1 - z/\omega)^{1/R}$ is fixed in the slit disk and all coefficients $c_n^{(k)}$ are algebraic numbers. The point z = ω is a singularity of one of the analytic functions defined by these expansions if and only if the corresponding expansion is not a power series.

The coefficients in such a convergent Puiseux series can be determined by applying the Newton polygon method to the polynomial Q(t, y) = P(ω(1 − t), y) to obtain Puiseux series in t and then substituting t = 1 − z/ω. We consider these expansions to be centred at z = ω as they converge in a slit disk centred at ω.

Example 2.10 (Puiseux Expansions I)


To find convergent Puiseux series solutions of $P(z, y) = zy^2 - y + 1$ centred at z = 1/4, we can find the Puiseux series solutions of $P((1 - t)/4, y) = (1 - t)y^2/4 - y + 1 = 0$ at t = 0, and then substitute t = 1 − 4z. The Newton polygon method gives solutions

$$2 + 2t^{1/2} + 2t + 2t^{3/2} + \cdots = 2 + 2(1 - 4z)^{1/2} + 2(1 - 4z) + 2(1 - 4z)^{3/2} + \cdots$$

and

$$2 - 2t^{1/2} + 2t - 2t^{3/2} + \cdots = 2 - 2(1 - 4z)^{1/2} + 2(1 - 4z) - 2(1 - 4z)^{3/2} + \cdots$$

Taking the principal branch of the square-root function $t^{1/2}$, these series converge in the slit disk where |1 − 4z| < 1 and z is not a real number larger than 1/4 (so that 1 − 4z is not real and negative).

An algebraic generating function is encoded by a polynomial P(z, y) ∈ Z[z, y], but P also encodes other algebraic series. We now discuss how to separate a specific series of interest from the other roots of P.

Definition 2.13 (singular parts and separation orders) Let P(z, y) ∈ Z[z, y] and suppose $f(z) = \sum_{n \geq N} c_n z^{n/R}$ satisfies P(z, f(z)) = 0. The singular part of f is the initial partial sum $f_I(z) = \sum_{n=N}^{\ell} c_n z^{n/R}$, where $\ell \geq N$ is the minimal integer such that $f_I$ is not equal to an initial partial sum of any other Puiseux series expansion y(z) with P(z, y(z)) = 0. The separation order of f(z) with respect to P is the integer $\ell$.

The following result, found in Walsh [123], not only bounds the separation order but also shows how it can be used to determine the smallest field extension of Q containing all coefficients of a Puiseux series expansion.

Proposition 2.9 Let P(z, y) ∈ Q[z, y] be an irreducible polynomial of degree $d_y$ in y and degree $d_z$ in z, and let $f(z) = \sum_{n \geq N} c_n z^{n/R}$ satisfy P(z, f(z)) = 0. If the separation order of f with respect to P is $\ell$ then $\ell \leq 2 d_z d_y (2d_y - 1)$ and all Puiseux series coefficients $c_N, c_{N+1}, \dots$ of f lie in the finite extension $\mathbb{Q}(c_N, \dots, c_\ell)$.

Knowing that the coefficients of each Puiseux series expansion lie in a finite extension of Q is necessary for computation.
Example 2.11 (Puiseux Expansions II) The Puiseux series solutions of $P(z, y) = (z + 1/4)y^2 - y + 1$ begin

$$2 + 4iz^{1/2} - 8z + \cdots \quad \text{and} \quad 2 - 4iz^{1/2} - 8z + \cdots.$$

Since these series are separated by their truncations at order 2, all coefficients of both Puiseux series expansions lie in Q(2, ±4i) = Q(i).

Later, in our arguments on lattice path enumeration, we will need to determine how many Puiseux series solutions of an algebraic equation are fractional power series (i.e., contain no negative exponents). The following result gives an answer.

Proposition 2.10 Let $P(z, y) = p_0(z) + p_1(z)y + \cdots + p_d(z)y^d \in \mathbb{Q}[z][y]$ be a polynomial of degree d in y such that $p_j(0) \neq 0$ for at least one index j. Then the number of series solutions to P(z, y) = 0 in y at z = 0 which are fractional power series equals the maximum index j such that $p_j(0) \neq 0$.

A proof of Proposition 2.10, which involves algebraic manipulations of Puiseux expansions, can be found in Stanley [110, Prop. 6.1.8].
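Proposition 2.10 translates directly into a short computation. In the sketch below the representation of P as a list of coefficient lists, and the function name, are our own choices for illustration.

```python
def count_fractional_power_series(coefficient_polys):
    """Number of solutions of P(z, y) = 0 at z = 0 without negative exponents.

    coefficient_polys[j] lists the coefficients of p_j(z) in increasing powers
    of z, so its first entry is p_j(0); an empty list stands for p_j = 0.
    """
    nonzero_at_origin = [j for j, p in enumerate(coefficient_polys) if p and p[0] != 0]
    if not nonzero_at_origin:
        raise ValueError("Proposition 2.10 requires p_j(0) != 0 for some j")
    return max(nonzero_at_origin)


# P(z, y) = 1 - y + z*y^2: p_0 = 1, p_1 = -1, p_2 = z.
print(count_fractional_power_series([[1], [-1], [0, 1]]))
```

For the binary tree polynomial the count is 1, matching Example 2.8, where one solution is a power series and the other begins with $z^{-1}$. For $y^2 - z$, encoded as `[[0, -1], [], [1]]`, the count is 2, corresponding to the solutions $\pm z^{1/2}$.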

Singularities of Algebraic Curves

We now determine the points where the solution of an algebraic equation can have a singularity, in order to apply the techniques of analytic combinatorics to determine coefficient asymptotics. This requires some algebraic tools.

Definition 2.14 (resultants and discriminants) Given polynomials

$$p(x) = a_0 + a_1 x + \cdots + a_d x^d \quad \text{and} \quad q(x) = b_0 + b_1 x + \cdots + b_e x^e$$

over an integral domain R, where $a_d, b_e \neq 0$, the resultant of p and q with respect to x is

$$\mathrm{Resultant}_x(p, q) = a_d^e\, b_e^d \prod_{p(\alpha) = q(\beta) = 0} (\alpha - \beta),$$

where α and β run through the roots of p and q in an algebraic closure of R. The discriminant of p(x) with respect to x is the quantity

$$\mathrm{Disc}_x(p) = \frac{(-1)^{d(d-1)/2}}{a_d}\, \mathrm{Resultant}_x(p, p'),$$

where $p'(x)$ is the derivative of p(x) with respect to x.

Problem 2.14 asks you to prove that the resultant lies in R, and gives an algorithm for its calculation. Note that the resultant of p and q is zero if and only if p and q share a root in any algebraic closure of R. Thus, the discriminant $\mathrm{Disc}_x(p)$ is zero if and only if p has a multiple root in any algebraic closure of R. Both the resultant and discriminant can be calculated by any major computer algebra system using efficient algorithms [121, Ch. 6] (the cost is not much greater than polynomial multiplication). Now, let


$$P(z, y) = p_d(z)y^d + p_{d-1}(z)y^{d-1} + \cdots + p_0(z)$$

be an irreducible polynomial in Q[z][y] of degree d in y. For fixed a ∈ C the polynomial P(a, y) has d distinct solutions for y in the complex plane unless $p_d(a) = 0$ or P(a, y) has a multiple root in y, meaning $\mathrm{Disc}_y(P) \in \mathbb{Q}[z]$ vanishes at z = a.

Definition 2.15 (exceptional sets) The exceptional set of P(z, y) ∈ Z[z, y] is the finite set

$$\Xi = \Xi(P) = \{z \in \mathbb{C} : p_d(z) = 0 \text{ or } \mathrm{Disc}_y(P)(z) = 0\}.$$

For any a ∈ C \ Ξ the polynomial P(a, y) has d distinct solutions in y, and the implicit function theorem (described in a more general context in Proposition 3.1 of Chapter 3) implies the existence of analytic functions $y_1(z), \dots, y_d(z)$ defined in a neighbourhood of z = a such that $y_1(a), \dots, y_d(a)$ are the solutions of P(a, y) = 0.

Definition 2.16 (branches and algebraic functions) Each analytic function $y_j(z)$ in the preceding paragraph defines a branch of the curve P(z, y) = 0 in any simply connected region of the complex plane where it is analytic. An algebraic function on a domain Ω is any function on Ω which is a branch of some algebraic curve.

One immediate consequence of the implicit function theorem is the following.

Lemma 2.2 If z = a is a singularity of a branch of P(z, y) then a ∈ Ξ.

Vanishing of $p_d(z)$ corresponds to branches going off to infinity: if $p_d(a) = 0$ then the collection of Puiseux series solutions to P(z, y) = 0 at z = a can contain series with negative exponents. Vanishing of $\mathrm{Disc}_y(P)(z)$ corresponds to two branches colliding: if $\mathrm{Disc}_y(P)(a) = 0$ then the collection of Puiseux series solutions to P(z, y) = 0 at z = a can contain series with non-integer fractional powers coming in sets of conjugates.

Example 2.12 (Singularities for Plane Trees) The generating function for the number of rooted binary plane trees is a branch of $P(z, y) = zy^2 - y + 1$. Then $\mathrm{Disc}_y(P) = 1 - 4z$, so that Ξ = {z : z = 0 or 1 − 4z = 0} = {0, 1/4}.
We have already seen that one Puiseux series solution of P(z, y) = 0 at the origin is a Laurent series with negative exponents, which corresponds to the vanishing of the leading coefficient of y, and that the series solutions centred at z = 1/4 contain non-integer powers due to the vanishing of the discriminant. The generating function for the number of rooted 3-ary plane trees is a branch of $P(z, y) = zy^3 - y + 1$, which has as its Puiseux series solutions at the origin the power series

$$1 + z + 3z^2 + 12z^3 + 55z^4 + \cdots$$

together with the fractional Puiseux series

$$z^{-1/2} - \frac{1}{2} - \frac{3}{8}z^{1/2} - \frac{1}{2}z + \cdots \quad \text{and} \quad -z^{-1/2} - \frac{1}{2} + \frac{3}{8}z^{1/2} - \frac{1}{2}z + \cdots.$$

(The coefficient of z in the two fractional series is −1/2, as required for the three roots of P to sum to zero.) Here $\mathrm{Disc}_y(P) = z(4 - 27z)$, so that both the discriminant and the leading coefficient of P vanish at the origin. This helps explain the existence of two series containing both non-integer and negative exponents.

Fig. 2.3 When F(z) behaves like an algebraic branch point near its unique dominant singularity z = ω, the domain of integration in the Cauchy integral can be deformed around the corresponding branch cut R (dotted) to obtain an open circle and small lip $C_1$ which can be made arbitrarily close to ω (black) together with a collection of points $C_2$ whose moduli are bounded above |ω| (gray). The Puiseux expansion of F(z) is a good approximation when $C_1$ is sufficiently close to ω, which allows for the transfer of coefficient asymptotics, while the Cauchy integrand is exponentially smaller than $\omega^{-n}$ on $C_2$.
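The exceptional sets in Example 2.12 can be double-checked by evaluating discriminants at specific points z = a. The following sketch computes resultants as determinants of Sylvester matrices over exact rationals; this is one standard way to compute the resultant of Definition 2.14, chosen here for brevity rather than efficiency, and the function names are ours.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant by Gaussian elimination over exact rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return det


def resultant(p, q):
    """Resultant of p and q, given as coefficient lists [a_0, ..., a_d] with a_d != 0."""
    d, e = len(p) - 1, len(q) - 1
    size = d + e
    # e shifted copies of p and d shifted copies of q, highest degree first
    rows = [[0] * i + p[::-1] + [0] * (size - d - 1 - i) for i in range(e)]
    rows += [[0] * i + q[::-1] + [0] * (size - e - 1 - i) for i in range(d)]
    return determinant(rows)


def discriminant(p):
    """Disc(p) = (-1)^(d(d-1)/2) * Resultant(p, p') / a_d."""
    d = len(p) - 1
    derivative = [k * p[k] for k in range(1, d + 1)]
    return Fraction((-1) ** (d * (d - 1) // 2)) * resultant(p, derivative) / p[-1]


# P(a, y) = 1 - y + a*y^2 has a double root exactly when 1 - 4a = 0.
print(discriminant([1, -1, Fraction(1, 4)]), discriminant([1, -1, 1]))
# P(a, y) = 1 - y + a*y^3 has discriminant a*(4 - 27a).
print(discriminant([1, -1, 0, Fraction(4, 27)]), discriminant([1, -1, 0, 1]))
```

A computer algebra system would instead compute $\mathrm{Disc}_y(P)$ symbolically as a polynomial in z; evaluating at sample points, as here, only confirms individual elements of Ξ.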

Asymptotic Behaviour

Suppose F(z) is analytic at the origin and has a single dominant singularity z = ω, where it behaves like an algebraic branch point. The key to determining asymptotics is to deform the domain of integration in the Cauchy integral (2.1) around the singularity ω without crossing the corresponding branch cut; see Figure 2.3. Away from the singularity ω, the domain of integration can be deformed so that the Cauchy integrand in (2.1) is exponentially smaller than $\omega^{-n}$. Near the singularity F behaves like its Puiseux expansion, so coefficient asymptotics of F(z) can be determined from the terms in its Puiseux expansion. In particular, coefficient asymptotics of a summand $(1 - z/\omega)^{\alpha}$ follows from the binomial theorem and Stirling’s approximation for the gamma function, and these asymptotic expansions can be summed to give an asymptotic expansion for the coefficients of F(z). A rigorous version of this argument, yielding the following result, can be found in Flajolet and Odlyzko [48].


Proposition 2.11 (Darboux’s Theorem) Suppose $F(z) = \sum_{n \geq 0} f_n z^n$ is analytic in some open disk around the origin except for a single dominant singularity ω ∈ C \ {0} and points in the ray $R_\omega = \{t\omega : t \geq 1\}$, such that

$$F(z) = \sum_{k \geq N} c_k (1 - z/\omega)^{k/R}$$

is a convergent expansion of F in a disk centred at ω with $R_\omega$ removed. Then each summand $c_k (1 - z/\omega)^{k/R}$ with $k/R \notin \mathbb{N}$ adds an asymptotic contribution

$$c_k\, \omega^{-n}\, \frac{\Gamma(n - k/R)}{\Gamma(n + 1)\, \Gamma(-k/R)} \sim c_k\, \omega^{-n}\, \frac{n^{-k/R - 1}}{\Gamma(-k/R)}$$

to $f_n$, where Γ is the Euler gamma function. In particular, for all M ≥ 0

$$f_n = \frac{\omega^{-n}}{\Gamma(n + 1)} \sum_{k=N}^{N+M} c_k \frac{\Gamma(n - k/R)}{\Gamma(-k/R)} + O\left(n^{-(M+1)/R}\right)$$

where $\frac{\Gamma(n - k/R)}{\Gamma(-k/R)} = 0$ if k/R ∈ N, and $f_n \sim c_p\, \omega^{-n}\, \frac{n^{-p/R - 1}}{\Gamma(-p/R)}$ where p is the smallest index such that $c_p \neq 0$ and $p/R \notin \mathbb{N}$. If there are a finite number of dominant singularities, each of this form, then asymptotics of $f_n$ are determined by adding the asymptotic contributions given by each.

Note that Proposition 2.11 only requires the function F(z) to behave locally like an algebraic function near its dominant singularity (in the sense that it has a convergent Puiseux expansion).

Example 2.13 (Asymptotics of 2-Regular Simple Graphs) A graph is 2-regular if every vertex has degree 2, and simple if the graph contains no loops or multiple edges between any pair of vertices. If $g_n$ denotes the number of 2-regular simple graphs on n vertices where each vertex has a distinct label, then generating function arguments (found in Wilf [125], for instance) imply

$$F(z) = \sum_{n \geq 0} \frac{g_n}{n!} z^n = \frac{e^{-z/2 - z^2/4}}{\sqrt{1 - z}}.$$

Because F(z) has a unique dominant singularity at the point z = 1, where its Puiseux expansion begins $F(z) = e^{-3/4}(1 - z)^{-1/2} + \cdots$, Proposition 2.11 immediately implies

$$\frac{g_n}{n!} \sim e^{-3/4}\, \frac{n^{-1/2}}{\Gamma(1/2)} = \frac{e^{-3/4}}{\sqrt{n\pi}}.$$

This example is also studied by Flajolet and Odlyzko [48].
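The estimate of Example 2.13 can be compared with exact values. The sketch below relies on a recurrence that is our own derivation rather than part of the text: since $\log F(z) = -z/2 - z^2/4 - \tfrac{1}{2}\log(1 - z) = \sum_{k \geq 3} z^k/(2k)$, differentiating gives $n f_n = \tfrac{1}{2}(f_0 + \cdots + f_{n-3})$ for the coefficients $f_n = g_n/n!$.

```python
import math
from fractions import Fraction


def two_regular_egf_coefficients(count):
    """Coefficients f_n = g_n / n! of exp(-z/2 - z^2/4) / sqrt(1 - z), for n < count."""
    coeffs = [Fraction(1)]
    partial_sum = Fraction(0)  # holds f_0 + ... + f_{n-3}
    for n in range(1, count):
        if n >= 3:
            partial_sum += coeffs[n - 3]
        coeffs.append(partial_sum / (2 * n))
    return coeffs


f = two_regular_egf_coefficients(201)
counts = [int(f[n] * math.factorial(n)) for n in range(8)]  # g_0, ..., g_7
ratio = float(f[200]) / (math.exp(-0.75) / math.sqrt(math.pi * 200))
print(counts)
print(ratio)
```

The printed ratio compares $g_{200}/200!$ with $e^{-3/4}/\sqrt{200\pi}$; it should be close to 1, with a discrepancy that shrinks as n grows, since the next term of the Puiseux expansion contributes at relative order 1/n.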


Because every algebraic function satisfies the hypotheses of Proposition 2.11, this gives a strong characterization of their asymptotic behaviour.

Corollary 2.1 Let $F(z) = \sum_{n \geq 0} f_n z^n$ be a branch of an algebraic equation which is analytic at the origin and has dominant singularities $\omega_0 \beta, \dots, \omega_{m-1} \beta$ where β > 0 and $|\omega_j| = 1$ for each j. Then

$$f_n = \frac{\beta^n n^s}{\Gamma(s + 1)} \sum_{j=0}^{m-1} C_j\, \omega_j^n + O(\beta^n n^t),$$

where s ∈ Q \ {−1, −2, . . . }, t < s, and β, the $\omega_j$, and the $C_j$ are algebraic.

Checking for incompatible asymptotic behaviour of coefficients, or the existence of an infinite number of singularities, gives a powerful test of function transcendence which is not available for constants.

Example 2.14 (A Transcendental Series) If $F(z) = (1 - 4z)^{-1/2}$ then a straightforward application of the generalized binomial theorem implies $F(z) = \sum_{n \geq 0} \binom{2n}{n} z^n$ and Proposition 2.11 shows $\binom{2n}{n} \sim 4^n / \sqrt{\pi n}$. The function $G(z) = \sum_{n \geq 0} \binom{2n}{n}^2 z^n$ is also analytic at the origin, but

$$\binom{2n}{n}^2 \sim \frac{16^n}{\pi n}$$

so G(z) is not algebraic by Corollary 2.1, which excludes the exponent s = −1 of n appearing here. See Problem 2.10 for a generalization.
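The exponent of n that drives the transcendence argument in Example 2.14 can be estimated numerically. The sketch below (function names are ours) uses exact integers and the observation that if $a_n \sim C n^s$ then $a_{2n}/a_n \to 2^s$. It provides numerical evidence only; the proof is the asymptotic computation in the example.

```python
import math
from fractions import Fraction


def scaled_coefficient(n):
    """binom(2n, n)^2 / 16^n as an exact rational number."""
    return Fraction(math.comb(2 * n, n) ** 2, 16 ** n)


def estimated_exponent(n):
    """Estimate s in a_n ~ C * n^s from the ratio a_{2n} / a_n ~ 2^s."""
    return math.log2(float(scaled_coefficient(2 * n) / scaled_coefficient(n)))


for n in (10, 100, 500):
    print(n, estimated_exponent(n))
```

The estimates approach −1, a negative integer, which is exactly the situation that Corollary 2.1 rules out for algebraic series.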

Automated Asymptotics: A Solvable Connection Problem

Let P(z, y) ∈ Q[z, y] and $F(z) = \sum_{n \geq 0} f_n z^n$ be a branch of P which is analytic at the origin, specified by enough power series coefficients to distinguish it from the other branches of P. In order to determine dominant asymptotics of $f_n$, we must identify the dominant singularities of F and find its local behaviour near each to apply Proposition 2.11. Any singularity of F lies in the (finite) exceptional set Ξ, but we need to be able to determine which elements of Ξ are singularities of F (and not singularities of other branches or spurious points where all branches are analytic). Given a ∈ Ξ we can use the Newton polygon method to compute series solutions to P(z, y) = 0 which are analytic in a neighbourhood of a, potentially with the point a or a ray starting at a removed, and then identify the singular branches at a by finding those with non-integer or negative exponents. Since we know the series expansion of the generating function F at the origin, we are thus trying to solve a connection problem: given series expansions of branches of P(z, y) = 0 around two distinct points, can we pair up the series in such a way as to be consistent among branches? Since series expansions capture local behaviour of a function, being able to solve this connection problem means capturing global behaviour about branches of P from their local behaviour.

Fig. 2.4 Plot of the real solutions to $zy^2 - y + 1 = 0$ with the values of Ξ marked. At the origin one branch of the solution goes to infinity, while at z = 1/4 the two branches collide.

Example 2.15 (Catalan Asymptotics) Recall again the generating function for the number of rooted binary plane trees, which is the unique power series root of $P(z, y) = zy^2 - y + 1$ at the origin. Near the origin the branches of P(z, y) are represented by the expansions

$$f_1(z) = C(z) = 1 + z + 2z^2 + 5z^3 + \cdots, \qquad f_2(z) = z^{-1} - 1 - z - 2z^2 + \cdots,$$

while around the point z = 1/4 the branches are represented by the expansions

$$g_1(z) = 2 + 2(1 - 4z)^{1/2} + \cdots, \qquad g_2(z) = 2 - 2(1 - 4z)^{1/2} + \cdots.$$

Thus C(z) must have a singularity at z = 1/4, but is it locally represented by $g_1$ or $g_2$? If C(z) were represented by $g_1$ near z = 1/4 then Proposition 2.11 would imply

$$c_n \sim (2)\, 4^n\, \frac{n^{-3/2}}{\Gamma(-1/2)} = -\frac{4^n}{n^{3/2} \sqrt{\pi}},$$

which is impossible as the $c_n$ are non-negative. Thus, C(z) corresponds to $g_2$ and

$$c_n = (-2)\, 4^n\, \frac{n^{-3/2}}{\Gamma(-1/2)} \left(1 + O\left(\frac{1}{n}\right)\right) = \frac{4^n}{n^{3/2} \sqrt{\pi}} \left(1 + O\left(\frac{1}{n}\right)\right).$$

In fact, the quadratic formula implies $C(z) = \frac{1 - \sqrt{1 - 4z}}{2z}$, so

$$c_n = (-1/2)\,[z^{n+1}](1 - 4z)^{1/2} = \frac{4^n}{\sqrt{\pi}}\, \frac{\Gamma(n + 1/2)}{\Gamma(n + 2)},$$

which is simply a restatement of the well known formula $c_n = \frac{1}{n+1} \binom{2n}{n}$.

Fig. 2.5 Real parts of branches of the algebraic curve $P(z, y) = y^4 - 2y^3 + (1 + 2z)y^2 - 2yz + 4z^3 = 0$. The discriminant of P vanishes at z = 1/4, 0, and $z = (-1 \pm \sqrt{5})/8$, corresponding to the intersection of at least two of the branches.
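The formulas of Example 2.15 are easy to test numerically. The following sketch, with function names of our choosing, compares the exact Catalan numbers with the dominant asymptotic term and with the gamma function expression; logarithms of gamma values are used only to avoid floating point overflow.

```python
import math


def catalan(n):
    """Number of rooted binary plane trees with n vertices, binom(2n, n) / (n + 1)."""
    return math.comb(2 * n, n) // (n + 1)


def catalan_leading_term(n):
    """Dominant asymptotic term 4^n / (n^(3/2) * sqrt(pi))."""
    return 4.0 ** n / (n ** 1.5 * math.sqrt(math.pi))


def gamma_formula(n):
    """4^n * Gamma(n + 1/2) / (sqrt(pi) * Gamma(n + 2)), evaluated through lgamma."""
    log_value = (n * math.log(4) + math.lgamma(n + 0.5)
                 - math.lgamma(n + 2) - 0.5 * math.log(math.pi))
    return math.exp(log_value)


for n in (10, 50, 200):
    print(n, catalan(n) / catalan_leading_term(n), gamma_formula(n) / catalan(n))
```

The first ratio tends to 1 from below at a rate proportional to 1/n, consistent with the error term above, while the second ratio equals 1 up to rounding since the gamma function expression is exact.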

Such heuristic arguments do not always apply, and more involved techniques are usually required.

Example 2.16 (The Connection Problem for Supertrees) The generating function F(z) of the combinatorial class of bicoloured supertrees is the branch y = F(z) of the algebraic curve

$$P(z, y) = y^4 - 2y^3 + (1 + 2z)y^2 - 2yz + 4z^3$$

whose power series expansion at the origin begins $F(z) = 2z^2 + 2z^3 + \cdots$ (see Flajolet and Sedgewick [50, Ex. VI.10] for details). Figure 2.5 shows the real parts of the branches defined by P(z, y) = 0. The set of potential singularities of F(z),

$$\Xi = \left\{0,\ 1/4,\ \left(-1 \pm \sqrt{5}\right)/8\right\},$$

is defined by the vanishing of the discriminant. Using only local information about F at the origin (its initial power series terms look like a quadratic function) we can identify F with the dark curved branch in Figure 2.5. Tracing this branch from the origin across the positive real numbers shows that F(z) admits a single dominant singularity at z = 1/4.

Tracing along curves can be formally captured by algorithms for real numeric continuation, but we now outline a different approach for combinatorial generating functions which is simpler and more efficient. Because F(z) has non-negative power series coefficients, the Vivanti-Pringsheim theorem implies one of its dominant singularities is real and positive. At the origin, the branches have power series expansions

$$A_1(z) = 1 - 2z^2 + \cdots, \qquad A_2(z) = 1 - 2z + \cdots,$$
$$A_3(z) = 2z + 2z^2 + \cdots, \qquad A_4(z) = F(z) = 2z^2 + 2z^3 + \cdots.$$

Note that all branches are analytic even though the branches intersect: intersection is a necessary condition for singularities of branches not going to infinity, but not a sufficient one. At the first potential positive singularity, $\rho = (\sqrt{5} - 1)/8$, the branches have Puiseux expansions

$$B_1(z) = \frac{1}{2} + \frac{\sqrt{5 - \sqrt{5}}}{4} \sqrt{1 - z/\rho} + \cdots, \qquad B_2(z) = \frac{2 + \sqrt{10} - \sqrt{2}}{4} - \frac{\sqrt{10} - 4\sqrt{2}}{16} (1 - z/\rho) + \cdots,$$
$$B_3(z) = \frac{1}{2} - \frac{\sqrt{5 - \sqrt{5}}}{4} \sqrt{1 - z/\rho} + \cdots, \qquad B_4(z) = \frac{2 - \sqrt{10} + \sqrt{2}}{4} + \frac{\sqrt{10} - 4\sqrt{2}}{16} (1 - z/\rho) + \cdots,$$

while at z = 1/4 they have Puiseux expansions

$$C_1(z) = \frac{1}{2} + \frac{1}{2}(1 - 4z)^{1/4} + \cdots, \qquad C_2(z) = \frac{1}{2} - \frac{1}{2}(1 - 4z)^{1/4} + \cdots,$$
$$C_3(z) = \frac{1}{2} + \frac{i}{2}(1 - 4z)^{1/4} + \cdots, \qquad C_4(z) = \frac{1}{2} - \frac{i}{2}(1 - 4z)^{1/4} + \cdots.$$

Our goal is to determine which of the expansions $B_j(z)$ and $C_k(z)$ correspond to F. In particular, we want to determine whether F corresponds to one of the expansions $B_1$ and $B_3$ which are singular at z = ρ, or if the dominant singularity of F is z = 1/4. The key is to use the fact that the real branches of P(z, y) can be sorted at real points, and their relative orders do not change between elements of Ξ.

To begin, we note that if r > 0 is sufficiently small then looking at the initial terms of the series expansions of the $A_j$ shows $A_1(r) > A_2(r) > A_3(r) > A_4(r)$. Similarly, if s < ρ is sufficiently close to ρ then looking at the initial terms of the series expansions of the $B_j$ shows $B_2(s) > B_1(s) > B_3(s) > B_4(s)$. Because the real branches of P can only cross at elements of Ξ, the relative orders of the branches at z = r and z = s must be equal. Since $A_1(r)$ and $B_2(s)$ are largest in the orderings, $A_1(z)$ and $B_2(z)$ are expansions of the same branch; the rest of the branches pair up as $(A_2, B_1)$, $(A_3, B_3)$, and $(A_4, B_4)$. In particular, as the series expansion of F around z = 0 is $A_4$ its series expansion near z = ρ is the power series $B_4$, and z = ρ is not a singularity of F(z).

Continuing this process, we take t > ρ to be sufficiently close to ρ. Only the branches $B_2(z)$ and $B_4(z)$ are real for z > ρ, and examining their initial series terms shows $B_2(t) > B_4(t)$. If u < 1/4 is sufficiently close to 1/4 then examining the initial series terms of the $C_j(z)$ shows that only $C_1(u)$ and $C_2(u)$ are real, and that $C_1(u) > C_2(u)$. Since F(z) is represented by $B_4(z)$ near z = ρ, and the relative orders of the real branches do not change between elements of Ξ, we have determined that z = 1/4 is the unique dominant singularity of F(z), where it has the convergent expansion

$$F(z) = C_2(z) = \frac{1}{2} - \frac{1}{2}(1 - 4z)^{1/4} + \cdots$$

in a slit disk. Proposition 2.11 then implies the number $f_n$ of bicoloured supertrees of size n has dominant asymptotics

$$f_n = 4^n n^{-5/4}\, \frac{-1/2}{\Gamma(-1/4)} + O\left(4^n n^{-3/2}\right) = \frac{4^n}{8\, \Gamma(3/4)\, n^{5/4}} + O\left(4^n n^{-3/2}\right).$$

An analysis of bicoloured supertrees using this approach was carried out in the thesis of Chabaud [21]. DeVries [36] gave an alternative analysis using the multivariate machinery discussed in Chapters 3 and 5.
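The ordering argument of Example 2.16 can be illustrated numerically by computing the real roots of P(z, y) = 0 in y at sample points between elements of Ξ. The sketch below is a floating point illustration under our own assumptions (a fixed search window [−1, 2], a uniform grid, and bisection), not the symbolic branch sorting procedure itself, and the function names are ours.

```python
def supertree_polynomial(z, y):
    """P(z, y) = y^4 - 2y^3 + (1 + 2z)y^2 - 2yz + 4z^3 from Example 2.16."""
    return y**4 - 2 * y**3 + (1 + 2 * z) * y**2 - 2 * y * z + 4 * z**3


def real_branch_values(z, low=-1.0, high=2.0, steps=3000):
    """Increasing list of real roots y of P(z, y) = 0 in [low, high].

    Roots are located by sign changes on a grid and refined by bisection, so
    z should not lie in the exceptional set, where roots collide.
    """
    roots = []
    width = (high - low) / steps
    for k in range(steps):
        a, b = low + k * width, low + (k + 1) * width
        fa, fb = supertree_polynomial(z, a), supertree_polynomial(z, b)
        if fa * fb < 0:
            for _ in range(60):
                mid = (a + b) / 2
                fmid = supertree_polynomial(z, mid)
                if fa * fmid <= 0:
                    b = mid            # the root lies in [a, mid]
                else:
                    a, fa = mid, fmid  # the root lies in [mid, b]
            roots.append((a + b) / 2)
    return roots


for z in (0.05, 0.1, 0.15, 0.2, 0.24):
    print(z, real_branch_values(z))
```

For sample points below $\rho = (\sqrt{5} - 1)/8 \approx 0.1545$ four real values appear, and the smallest one is F(z), matching the expansion $A_4$. For sample points between ρ and 1/4 only two real values remain, and F(z) is still the smaller of the two, in agreement with the pairing of $A_4$, $B_4$ and $C_2$.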

This branch sorting algorithm can be applied to any algebraic function F(z) with non-negative power series coefficients at the origin. As in the example, one sorts the Puiseux series solutions of an annihilating polynomial P(z, y) = 0 which take real values between positive elements of the exceptional set Ξ, and uses the fact that their relative orders do not change between potential singularities to link local behaviour of expansions at one point to those at another. Proposition 2.9 allows us to determine which branches are real, and sorting can be done through a lexicographical ranking of series by their coefficients since the lower order terms in a Puiseux series dominate local asymptotic behaviour.

Example 2.17 (Kreweras Lattice Walks) Let A(z) be the generating function for the number $a_n$ of lattice walks beginning at the origin, staying in $\mathbb{N}^2$, and taking n steps in S = {(−1, 0), (0, −1), (1, 1)}. This lattice path model was studied by Kreweras [74] in the 1960s, who gave a simple formula for walks ending at the origin and a more complicated expression for walks ending anywhere in $\mathbb{N}^2$. Gessel [55] proved that A(z) is algebraic (in fact, he proved the stronger result that the trivariate generating function counting walks by length and x and y endpoint is algebraic). Bousquet-Mélou [17] used the kernel method, which we study in detail in Chapter 4, to derive algebraic equations and enumerate these walks. In particular, the generating function A(z) satisfies P(z, A(z)) = 0 where

$$\begin{aligned} P(z, y) = {} & z^5 (3z - 1)^3 y^6 + 6z^4 (3z - 1)^3 y^5 + z^3 (3z - 1)(135z^2 - 78z + 14) y^4 \\ & + 4z^2 (3z - 1)(45z^2 - 18z + 4) y^3 + z(3z - 1)(135z^2 - 26z + 9) y^2 \\ & + 2(3z - 1)(27z^2 - 2z + 1) y + 43z^2 + z + 2. \end{aligned}$$

Computing the discriminant shows that the exceptional set of P consists of 0, −1, 1/3, and the two non-real roots of the polynomial $9z^2 + 3z + 1$, so A(z) has a unique dominant singularity at z = 1/3. The Newton polygon method shows that there are two Puiseux series solutions of P(z, y) = 0 at the origin which have real coefficients,

$$-2z^{-1} - 1 - z - 3z^2 + \cdots \quad \text{and} \quad 1 + z + 3z^2 + 7z^3 + \cdots,$$

and two Puiseux series solutions at z = 1/3 which have real coefficients,

$$2\sqrt{2}\,(1 - 3z)^{-1/4} - 3 + \cdots \quad \text{and} \quad -2\sqrt{2}\,(1 - 3z)^{-1/4} - 3 + \cdots.$$

Since A(z) is the real branch of P(z, y) = 0 which is larger just to the right of the origin, it is the larger real branch just to the left of z = 1/3, and the expansion of A(z) in a slit disk at z = 1/3 begins with $2\sqrt{2}\,(1 - 3z)^{-1/4}$. Proposition 2.11 then implies $a_n \sim 3^n n^{-3/4}\, (2\sqrt{2})/\Gamma(1/4)$.

More generally, for series with negative coefficients, rigorous numerical methods can be used. There exist bounds (discussed in the appendix to Chapter 7) on how close the values of the branches to P(z, y) = 0 can be at a point z = a, depending only on P and a. Since the series expansion of F(z) at the origin is convergent whenever z has smaller modulus than its dominant singularities, one can iterate through the elements of Ξ in order of increasing modulus and numerically approximate F and the branches near each potential singularity to a sufficient accuracy to decide which branch corresponds to F. Full details of both approaches can be found in the PhD thesis of Chabaud [21]. The Sage package of Mezzarobba for numeric analytic continuation of D-finite functions, discussed in Section 2.4 below, can be used for the necessary numeric computations. Once the dominant singularities of an algebraic generating function are determined, an asymptotic expression for its coefficients can be computed using Proposition 2.11. The decidability issues related to Skolem’s Problem for dominant coefficient asymptotics of rational generating functions are still in play for algebraic functions, but such pathological considerations do not typically appear in combinatorial applications.

Examples of Algebraic Series and N-Algebraicity

As briefly seen in the above examples, combinatorial structures which admit a recursive decomposition based on collections of smaller objects of the same type are those that tend to have algebraic generating functions. There is a wealth of combinatorial applications, including the enumeration of many types of lattice paths, pattern-avoiding permutations, trees, planar maps on surfaces, polyominoes, dissections of polygons, sequences arising in bioinformatics, and tiling problems. Algebraic functions over a field of characteristic p have connections to p-automatic sequences in theoretical computer science and the combinatorics of words. These examples, and more, can be found in Stanley [110], Bousquet-Mélou [18], and Banderier and Drmota [4]. Algebraic functions also have many closure properties.

Proposition 2.12 The set of algebraic series forms a field: the sum, difference, product, and quotient (when the denominator is non-zero) of two algebraic series a(x) and b(x) is algebraic. Annihilating polynomials for a(x) + b(x), a(x) − b(x), a(x)b(x), and a(x)/b(x) can be computed from annihilating polynomials for a(x) and b(x).

Problem 2.15 guides the reader through a proof of Proposition 2.12. Using resultants to eliminate variables, any component of a solution of a nontrivial algebraic system of equations is also algebraic.

Definition 2.17 (N-algebraic series) A proper N-algebraic system is a polynomial system of the form
\[
\begin{cases}
y_1 = P_1(z, y_1, \dots, y_d) \\
\quad\vdots \\
y_d = P_d(z, y_1, \dots, y_d)
\end{cases}
\]
where each $P_i$ has natural number coefficients, no constant term, and $[y_i]P_i = 0$. A univariate series y(z) is N-algebraic if there exists a constant $c_0 \in \mathbb{N}$ and power series solution $(y_1, \dots, y_d) \in \mathbb{N}[[z]]^d$ of some proper N-algebraic system such that $y(z) = c_0 + y_1(z)$.

Just as N-rational series appear as generating functions when counting words in regular languages, N-algebraic series appear in the study of context-free languages, a superset of regular languages which are recognized by ‘pushdown’ automata.
Chomsky and Schützenberger [23] proved that the generating function counting words in any non-ambiguous context-free language by length is N-algebraic, and conversely any N-algebraic generating function counts the number of words in some non-ambiguous context-free language by length. Flajolet [46] used this enumerative approach to easily prove the ambiguity of several families of context-free languages, a typically hard task. See Chapter 3 of [95] for more details on these topics. Traces of the modern Symbolic Method, which allows one to convert a combinatorial description of a family of objects into a generating function specification, can be found in the Delest-Viennot-Schützenberger (DVS) methodology [34]. The DVS methodology, dating back to Schützenberger’s work on formal language theory, states roughly that to enumerate a combinatorial class one should try to: construct a bijection to a context-free language, determine a context-free grammar generating that language, then deduce an algebraic equation for the number of objects under consideration using the rules of the grammar. This approach and the techniques it has inspired have been extremely influential in enumerative combinatorics.
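To make Definition 2.17 concrete, here is a minimal sketch that computes the power series solution of one proper N-algebraic system by fixed-point iteration on truncated series. The system $y = z + y^2$ is an illustrative choice rather than one taken from the text; its solution has coefficients given by the Catalan numbers shifted by one. The helper names are hypothetical.

```python
def multiply_truncated(a, b, order):
    """Product of two series (coefficient lists), truncated modulo z^order."""
    out = [0] * order
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j >= order:
                break
            out[i + j] += ai * bj
    return out

def iterate_system(order):
    """Coefficients of the series solution of y = z + y^2 modulo z^order.

    Starting from y = 0, each pass substitutes the current approximation
    into the right-hand side; every pass fixes at least one more
    coefficient, so `order` passes suffice.
    """
    z = [0] * order
    if order > 1:
        z[1] = 1
    y = [0] * order
    for _ in range(order):
        y_squared = multiply_truncated(y, y, order)
        y = [zi + yi for zi, yi in zip(z, y_squared)]
    return y

if __name__ == "__main__":
    print(iterate_system(8))
```

Because the right-hand side has natural number coefficients, every intermediate approximation also has natural number coefficients, which is the point of N-algebraicity.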


Banderier and Drmota [4] classify the dominant asymptotic behaviour of N-algebraic series, giving a tool for disproving N-algebraicity of generating functions. Subdominant asymptotic terms can be harder to get a handle on, a consequence of the fact that every algebraic power series with integer coefficients is the difference of two N-algebraic series [4, Prop. 2.6]. Currently, a complete classification of N-algebraicity is not known.

Open Problem 2.3 (Decidability of N-Algebraicity) Is there an algorithm which takes an algebraic series with natural number coefficients, specified by an annihilating polynomial and initial terms, and determines whether it is N-algebraic?

2.4 D-Finite Power Series

We now turn to a class of generating functions encoded by linear differential equations over a field K of characteristic zero. Just as polynomials are data structures for algebraic series, differential equations are data structures for these series.

Definition 2.18 (D-finite series and functions) The derivative of a formal power series $F(z) = \sum_{n \ge 0} f_n z^n$ is the formal series $\frac{d}{dz}F(z) = \sum_{n \ge 1} (n f_n) z^{n-1}$. We call F(z) differentially finite (D-finite) over K if there exist $a_0(z), \dots, a_r(z) \in K[z]$ with $a_r(z) \neq 0$ such that
\[ a_r(z)\frac{d^r}{dz^r}F(z) + a_{r-1}(z)\frac{d^{r-1}}{dz^{r-1}}F(z) + \cdots + a_0(z)F(z) = 0. \tag{2.7} \]

If K ⊂ C then an analytic function F(z) is called D-finite if it satisfies an equation of the form (2.7) when defined. If d is the maximum degree of the coefficient polynomials $a_j(z)$ then we call (2.7) a D-finite equation of order r and degree d. Equivalently, F(z) is D-finite if and only if all derivatives $F, F', F'', F''', \dots$ span a finite dimensional vector space over the field of rational functions K(z). We often write $F^{(r)}$ for the rth derivative $\frac{d^r}{dz^r}F(z)$.

Example 2.18 (D-finite Transcendental Function) The function $F(z) = e^z = \sum_{n \ge 0} z^n/n!$ is not algebraic (for instance, because its asymptotic growth does not satisfy the conditions of Corollary 2.1) but satisfies $F'(z) - F(z) = 0$, so F is D-finite.

Remark 2.2 Suppose F(z) satisfies a non-homogeneous D-finite equation of the form L(F) = c(z), where $L(F) = a_r(z)F^{(r)}(z) + \cdots + a_0(z)F(z)$ and $c(z) \in K[z]$. Then $c(z)\frac{d}{dz}L(F) - c'(z)L(F) = 0$, so any function satisfying a non-homogeneous D-finite equation of order r also satisfies a homogeneous equation of order r + 1.
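As a small worked instance of Remark 2.2, consider the first-order equation $F'(z) - F(z) = z$. This equation is an illustrative choice, not one from the text. Applying the construction of the remark with $c(z) = z$ gives a homogeneous equation of order 2:

```latex
\begin{align*}
L(F) &= F'(z) - F(z) = z, \qquad c(z) = z, \quad c'(z) = 1, \\
c(z)\frac{d}{dz}L(F) - c'(z)L(F)
  &= z\bigl(F''(z) - F'(z)\bigr) - \bigl(F'(z) - F(z)\bigr) \\
  &= zF''(z) - (z+1)F'(z) + F(z) = 0.
\end{align*}
```

One can check directly that the general solution $F(z) = Ce^z - z - 1$ of the original equation satisfies the homogeneous one: here $F' = Ce^z - 1$ and $F'' = Ce^z$, and the three terms cancel.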


D-finite equations are often manipulated in computer algebra systems by encoding them as elements of a non-commutative ring.

Definition 2.19 (Weyl algebra) The Weyl algebra W over K is the K-algebra $K\langle z, \delta\rangle/(\delta z - z\delta - 1)$, consisting of K-linear combinations of monomials containing powers of z and δ such that z and δ don’t commute but satisfy $\delta z = z\delta + 1$.

We let the elements of W act on K[[z]] by defining $z \cdot F = zF$ and $\delta \cdot F = F'$; the commutation rule for z and δ was chosen to make this action well defined, since
\[ (\delta z)\cdot F(z) = \delta\cdot(zF(z)) = \frac{d}{dz}\bigl(zF(z)\bigr) = zF'(z) + F(z) = (1 + z\delta)\cdot F(z). \]
Any element P ∈ W can be uniquely written $P = p_r(z)\delta^r + p_{r-1}(z)\delta^{r-1} + \cdots + p_0(z)$ for polynomials $p_j(z) \in K[z]$, meaning
\[ P \cdot F(z) = p_r(z)\frac{d^r}{dz^r}F(z) + p_{r-1}(z)\frac{d^{r-1}}{dz^{r-1}}F(z) + \cdots + p_0(z)F(z). \]

D-finite equations over K are thus in bijection with the elements of W.

Coefficient Properties

The coefficient sequence of a D-finite function satisfies a linear recurrence relation with polynomial coefficients, generalizing the C-finiteness of rational function coefficients.

Definition 2.20 (P-recursive sequences) A sequence $(f_n)$ is polynomially recursive (P-recursive) if there exist polynomials $c_0(n), \dots, c_r(n) \in K[n]$ such that $c_0(n), c_r(n) \neq 0$ and
\[ c_r(n) f_{n+r} + c_{r-1}(n) f_{n+r-1} + \cdots + c_0(n) f_n = 0 \tag{2.8} \]
for all n ≥ 0. If the coefficient polynomials $c_j(n)$ have highest degree d, then we call (2.8) a P-recursive relation of order r and degree d.

As in the differential case above, if $(f_n)$ satisfies a non-homogeneous P-recursive relation of order r then it also satisfies a homogeneous P-recursive relation of order r + 1.

Definition 2.21 (shift algebra) The shift algebra S over K is the K-algebra $K\langle n, S\rangle/(Sn - (n+1)S)$, consisting of K-linear combinations of monomials containing powers of n and S, such that n and S don’t commute but satisfy $Sn = (n+1)S$.

We let the elements of S act on the set of sequences over K by defining $n \cdot (f_n) = (n f_n)$ and $S \cdot (f_n) = (f_{n+1})$; the commutation rule for n and S was chosen to make this action well defined, since
\[ (Sn)\cdot(f_n) = S\cdot(n f_n) = ((n+1) f_{n+1}) = ((n+1)S)\cdot(f_n). \]


Any element Q ∈ S can be uniquely written $Q = q_r(n)S^r + q_{r-1}(n)S^{r-1} + \cdots + q_0(n)$ for polynomials $q_j(n) \in K[n]$, meaning
\[ Q \cdot (f_n) = \bigl(q_r(n) f_{n+r} + q_{r-1}(n) f_{n+r-1} + \cdots + q_0(n) f_n\bigr). \]
P-recursive relations over K are thus in bijection with the elements of S. Both the Weyl algebra and shift algebra are examples of Ore algebras, and are implemented in the Sage ore_algebra package [69].

Proposition 2.13 Let $F(z) = \sum_{n \ge 0} f_n z^n \in K[[z]]$.
1. If F(z) satisfies a D-finite equation of order r and degree d then $(f_n)$ satisfies a P-recursive relation of order at most r + d and degree at most r.
2. If $(f_n)$ satisfies a P-recursive relation of order r and degree d then F(z) satisfies a D-finite equation of order at most d and degree at most r + d.

Proof Since $[z^n]F^{(r)}(z) = (n+r)(n+r-1)\cdots(n+1)\, f_{n+r}$ for r ∈ N, and
\[ [z^n]\, z^k F(z) = \begin{cases} f_{n-k} & : 0 \le k \le n \\ 0 & : k > n \end{cases} \]
for k ∈ N, if F(z) satisfies a differential equation of the form (2.7) then its coefficients satisfy a linear recurrence with polynomial coefficients, where derivatives of F can increase the coefficient degree of the recurrence and both derivatives and multiplications by z can increase the order of the recurrence.

Conversely, let $\partial_z = z\frac{d}{dz}$ be the operator that differentiates by z and then multiplies the result by z, and for any k ∈ N let $F_{\le k-1}(z)$ be the polynomial $F_{\le k-1}(z) = f_0 + f_1 z + \cdots + f_{k-1} z^{k-1}$ (so that $F_{\le -1} = 0$). Then
\[ \sum_{n \ge 0} f_{n+k} z^n = \frac{F(z) - F_{\le k-1}(z)}{z^k} \]
and repeated differentiation gives
\[ \sum_{n \ge 0} n^j f_{n+k} z^n = \partial_z^j \left( \frac{F(z) - F_{\le k-1}(z)}{z^k} \right). \]

Thus, if $(f_n)$ satisfies a linear recurrence of the form (2.8) then multiplying by $z^n$ and summing over n gives a linear differential equation satisfied by F(z). Problem 2.11 asks you to prove the stated degree and order bounds. □

Note that it is automatic to translate between D-finite equations satisfied by a series and P-recursive relations satisfied by its coefficient sequence. This is implemented, for example, in the Maple gfun package and the Sage ore_algebra package.

Example 2.19 (Central Binomial Coefficients Squared) Let $f_n = \binom{2n}{n}^2$ and recall that the asymptotic growth of $(f_n)$ implies the generating function $F(z) = \sum_{n \ge 0} f_n z^n$ is not algebraic. Using Pascal’s identity, that $\binom{a-1}{b} + \binom{a-1}{b-1} = \binom{a}{b}$ for all positive integers a and b, light algebraic manipulation shows
\[ (n+1)^2 f_{n+1} - 4(2n+1)^2 f_n = 0 \]
for all n ≥ 0. Thus, F(z) is D-finite. Following the proof of Proposition 2.13 translates this to the differential equation
\[ z(1-16z)\frac{d^2}{dz^2}F(z) + (1-32z)\frac{d}{dz}F(z) - 4F(z) = 0. \]

The generating function F(z) can be written in terms of an elliptic integral by solving this differential equation, providing a second path to proving transcendence.
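Both the recurrence and the differential equation of Example 2.19 can be verified coefficient by coefficient. The sketch below uses the coefficient extraction rules from the proof of Proposition 2.13; the function names are illustrative.

```python
from math import comb

def f(n: int) -> int:
    """Squared central binomial coefficient."""
    return comb(2 * n, n) ** 2

def recurrence_residual(n: int) -> int:
    """Left-hand side of (n+1)^2 f_{n+1} - 4(2n+1)^2 f_n = 0."""
    return (n + 1) ** 2 * f(n + 1) - 4 * (2 * n + 1) ** 2 * f(n)

def ode_coefficient(n: int) -> int:
    """[z^n] of z(1-16z)F'' + (1-32z)F' - 4F.

    Uses [z^n] z F'' = (n+1) n f_{n+1},  [z^n] z^2 F'' = n (n-1) f_n,
         [z^n] F'    = (n+1) f_{n+1},    [z^n] z F'    = n f_n.
    """
    return ((n + 1) * n * f(n + 1) - 16 * n * (n - 1) * f(n)
            + (n + 1) * f(n + 1) - 32 * n * f(n)
            - 4 * f(n))

if __name__ == "__main__":
    print(all(recurrence_residual(n) == 0 for n in range(50)))
    print(all(ode_coefficient(n) == 0 for n in range(50)))
```

Collecting terms in `ode_coefficient` gives exactly $(n+1)^2 f_{n+1} - 4(2n+1)^2 f_n$, so the two checks are equivalent, as the translation between D-finite equations and P-recursive relations predicts.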

Recall from Section 2.2 that the solutions of a linear recurrence of order r with constant coefficients form a K-vector space of dimension r (adding two solutions or multiplying by a constant yields another solution). Because (2.8) is still a linear recurrence, its solutions still form a K-vector space, but when the leading coefficient $c_r(n)$ vanishes at non-negative integers the dimension of this vector space may be greater than the order r of the recurrence. To account for this, let B be the set of non-negative integer solutions to $c_r(n) = 0$ and $I = \{0, \dots, r-1\} \cup \{b + r : b \in B\}$. If $(f_n)$ is a solution of the linear recurrence (2.8) then the values of $f_n$ for n ∈ I uniquely determine $(f_n)$. Furthermore, if N = max I and $f_0, \dots, f_N$ satisfy (2.8) for n ≤ N − r then this finite list always extends to a unique infinite sequence satisfying (2.8). Thus, the finite sequences which satisfy (2.8) for n ≤ N − r form a vector space $\Lambda_N$ whose dimension can be computed by linear algebra and equals the dimension of the vector space of all solutions to (2.8).

Definition 2.22 (generalized initial conditions) The values of $f_n$ when n ∈ I are generalized initial conditions for a sequence $(f_n)$ satisfying (2.8).

Remark 2.3 By Proposition 7.7 in Chapter 7, the elements of B are at most one larger than the maximum absolute value of the coefficients of $c_r(n)$.

Example 2.20 (A Large Solution Space for a Linear Recurrence) Consider the linear recurrence relation $n(n-2) f_{n+2} - f_{n+1} - (n-1) f_n = 0$ over K = Q. Here the order r = 2 and B = {0, 2} so I = {0, 1} ∪ {0 + 2, 2 + 2} = {0, 1, 2, 4} with maximum element N = 4. In order for $(f_0, \dots, f_4)$ to be the start of a sequence satisfying this recurrence it must be that
\[
\begin{aligned}
-f_1 + f_0 &= 0 && (n = 0) \\
-f_3 - f_2 &= 0 && (n = 1) \\
-f_3 - f_2 &= 0 && (n = 2 = N - r)
\end{aligned}
\]
meaning
\[
\underbrace{\begin{pmatrix} 1 & -1 & 0 & 0 & 0 \\ 0 & 0 & -1 & -1 & 0 \\ 0 & 0 & -1 & -1 & 0 \end{pmatrix}}_{M}
\begin{pmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \\ f_4 \end{pmatrix} = 0.
\]

Since M has rank 2, the rank-nullity theorem implies the vector space of solutions to this matrix equation has dimension 3. A sequence $(f_n)$ satisfying this recurrence is uniquely determined by $f_0$, $f_2$, and $f_4$, and any assignment of values to these terms can be extended to a full sequence by setting $f_1 = f_0$ and
\[ f_{n+2} = \frac{f_{n+1} + (n-1) f_n}{n(n-2)} \]
for all $n \notin \{0, 2\}$. Thus, the solution space of this recurrence forms a Q-vector space of dimension 3 (even though the recurrence has order 2). Any assignment $f_0 = f_1 = a$, $f_2 = b$, and $f_4 = c$ for $(a, b, c) \in \mathbb{Q}^3$ forms a generalized initial condition.
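The linear algebra in Example 2.20 is small enough to reproduce with exact rational arithmetic. The sketch below, whose helper names are illustrative, computes the rank of M by Gaussian elimination and extends a choice of generalized initial conditions $(f_0, f_2, f_4)$ to a longer solution of the recurrence.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a rational matrix, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

M = [[1, -1, 0, 0, 0],
     [0, 0, -1, -1, 0],
     [0, 0, -1, -1, 0]]

def extend(f0, f2, f4, length):
    """First `length` terms (length >= 5) determined by f_0, f_2 and f_4."""
    f = [Fraction(f0), Fraction(f0), Fraction(f2), -Fraction(f2), Fraction(f4)]
    for n in range(3, length - 2):
        # The leading coefficient n(n-2) is non-zero for n >= 3.
        f.append((f[n + 1] + (n - 1) * f[n]) / (n * (n - 2)))
    return f

def residual(f, n):
    """Left-hand side of n(n-2) f_{n+2} - f_{n+1} - (n-1) f_n = 0."""
    return n * (n - 2) * f[n + 2] - f[n + 1] - (n - 1) * f[n]

if __name__ == "__main__":
    print(rank(M), 5 - rank(M))
    terms = extend(1, 2, 3, 12)
    print(all(residual(terms, n) == 0 for n in range(10)))
```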

Bostan et al. [14, Ch. 15] study the complexity of generating terms in P-recursive sequences.

Examples of D-finite Functions

Many common functions are easily seen to be D-finite, including exponentials, logarithms, (arc-)sine and (arc-)cosine functions, generalized hypergeometric functions (including common special functions like the Bessel and Airy functions), rational period integrals, and more. Furthermore, the class of D-finite functions contains all algebraic functions.

Proposition 2.14 If f(z) is an algebraic function of degree d over a field K of characteristic zero then f(z) is D-finite over K, and is annihilated by a D-finite equation of order at most d.

Proposition 2.14 has been rediscovered by several authors going back to the 19th century, including Abel, Tannery, Cockle, Harley, and Comtet. Chudnovsky and Chudnovsky [25] and Bostan et al. [15] have studied this result from a complexity viewpoint, giving explicit bounds on the orders and coefficient degrees of annihilating D-finite equations associated to algebraic functions. In particular, for efficiency reasons it can be desirable to take an annihilating D-finite equation of non-minimal order so as to greatly reduce the degrees of the polynomial coefficients involved.

Proof Let P(z, y) be the minimal polynomial of f and write $P_z(z, y)$ for $(\partial P/\partial z)(z, y)$ and $P_y(z, y)$ for $(\partial P/\partial y)(z, y)$. Differentiating the equation 0 = P(z, f(z)) with respect to z implies
\[ 0 = P_z(z, f(z)) + f'(z)\, P_y(z, f(z)). \]
Because K has characteristic zero and $P_y$ is a polynomial in y of degree smaller than P, which is the minimal polynomial of f(z), it follows that $P_y(z, f(z)) \neq 0$ and
\[ f'(z) = -\frac{P_z(z, f(z))}{P_y(z, f(z))}. \]
If R = K(z) denotes the field of rational functions over K, this implies $f'$ lies in the algebraic field extension R(f) of degree d over R. By induction the d + 1 elements $f, f', \dots, f^{(d)}$ all lie in R(f), meaning they are linearly dependent over R. □

Conversely, it is an interesting problem to prove or disprove algebraicity of a D-finite function specified by a known annihilating D-finite equation and initial conditions. Singer [105] gives an algorithm for deciding when all solutions of a D-finite equation are algebraic, which can be modified to solve this problem, but a need for factorization in non-commutative rings leads to an impractically large runtime. An alternative approach, arising in the context of lattice path enumeration, can be found in Bostan et al. [13].

The class of D-finite functions is also closed under many natural operations.

Proposition 2.15 Let $F(z) = \sum_{n \ge 0} f_n z^n$ and $G(z) = \sum_{n \ge 0} g_n z^n$ satisfy D-finite equations of orders r and s, and let H(z) be differentiable. Then, when defined,
(1) The product F(z)G(z) satisfies a D-finite equation of order at most rs;
(2) The sum F(z) + G(z) satisfies a D-finite equation of order at most r + s;
(3) $\frac{dF}{dz}(z)$ and $\int F(z)\,dz$ are D-finite;
(4) If G(z) is algebraic then F(G(z)) is D-finite;
(5) The Hadamard product $(F \odot G)(z) = \sum_{n \ge 0} (f_n g_n) z^n$ is D-finite;
(6) H and 1/H are both D-finite if and only if $H'/H$ is algebraic.

Proof Item (1) follows from the fact that the products of derivatives $F^{(i)}(z)G^{(j)}(z)$ span the K(z)-vector space of the derivatives of F(z)G(z), while (2) follows from linearity of the derivative. Item (3) follows directly from the definition of D-finiteness and Item (4) follows from the chain rule and basic field theory [110, Thm. 6.4.10]. Item (5) follows from P-recursiveness of the coefficient sequences and an argument analogous to Item (1). Item (6) was given by Harris and Sibuya [64]. □
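Item (5) of Proposition 2.15 is easiest to see in the special case of first-order recurrences, which is all the following sketch handles; general closure algorithms need linear algebra over K(n) and are not attempted here. If $p_1(n)u_{n+1} = q_1(n)u_n$ and $p_2(n)v_{n+1} = q_2(n)v_n$ then the termwise product satisfies $p_1 p_2\, h_{n+1} = q_1 q_2\, h_n$. Applied to the central binomial coefficients, which satisfy $(n+1)u_{n+1} = 2(2n+1)u_n$, this recovers the recurrence of Example 2.19. The helper names are illustrative.

```python
from fractions import Fraction
from math import comb

def unroll_first_order(p, q, u0, count):
    """First `count` terms of the sequence with p(n) u_{n+1} = q(n) u_n.

    Assumes p(n) is non-zero for every n >= 0.
    """
    terms = [Fraction(u0)]
    for n in range(count - 1):
        terms.append(terms[-1] * q(n) / p(n))
    return terms

def hadamard_first_order(p1, q1, p2, q2):
    """Recurrence coefficients for the termwise product of two
    first-order P-recursive sequences."""
    return (lambda n: p1(n) * p2(n)), (lambda n: q1(n) * q2(n))

def p(n):
    return n + 1

def q(n):
    return 2 * (2 * n + 1)

if __name__ == "__main__":
    central = unroll_first_order(p, q, 1, 10)
    p_sq, q_sq = hadamard_first_order(p, q, p, q)
    squares = unroll_first_order(p_sq, q_sq, 1, 10)
    print([int(x) for x in central])
    print([int(x) for x in squares])
```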
These closure properties are effective: there are algorithms which take annihilating D-finite equations for F and G and return annihilating equations for their sum, product, composition, and Hadamard product, among other operations. In comparison to Item (4), Singer [106] showed that if G(z) is algebraic of genus at least 1 then G(F(z)) is D-finite if and only if F(z) is also algebraic. The Pólya-Carlson theorem [92] implies that a D-finite series with integer coefficients and radius of convergence 1 is in fact rational. P-recursive sequences, and thus D-finite functions, are central to enumerative combinatorics. In addition to combinatorial classes with rational or algebraic generating functions, transcendental D-finite functions appear in the study of most


major combinatorial objects including permutations (see footnote 9), graphs (see footnote 10), tableaux [56], and, as discussed in detail here, lattice path enumeration. Stanley [110] and Flajolet and Sedgewick [50, Sect. VII.9] contain a large number of additional examples. Salvy [102] gives an excellent survey of D-finite functions from a computer algebra perspective.

Footnote 9: Many well known families, such as Baxter permutations and permutations with bounded cycle length, have D-finite generating functions. In the 1990s, Noonan and Zeilberger [86] conjectured that the generating function of permutations avoiding any fixed set of patterns was D-finite (see that paper for the definition of pattern avoidance). This was recently shown to be false by Garrabrant and Pak [54], who proved non-D-finiteness for the generating function of permutations avoiding some set of patterns contained in $S_{80}$. There is some evidence [28] that the generating function of 1324-avoiding permutations may be non-D-finite, which is a large open problem in the area.

Footnote 10: D-finite generating functions appear in the enumeration of rooted planar maps of fixed genus [19], k-regular graphs [56] for fixed k, and similar problems.

Automated Identity Proving and Creative Telescoping

Identities involving P-recursive sequences can be proven by verifying that they hold for a sufficient (finite) number of terms: to prove a P-recursive sequence $(c_n)$ is identically zero it is sufficient to show that $c_k = 0$ when k lies in a finite set of generalized initial conditions. To prove that two P-recursive sequences $(a_n)$ and $(b_n)$ are identical, one can compute a linear recurrence satisfied by $c_n = a_n - b_n$ and verify that $c_n = 0$ for a sufficient number of terms. In fact, given linear recurrence relations for $(a_n)$ and $(b_n)$ one can automatically determine a bound N such that the sequences $(a_n)$ and $(b_n)$ are equal when their terms agree for all 0 ≤ n ≤ N. Similarly, to prove two D-finite functions are equal it is sufficient to prove that a finite number of their derivatives (or power series coefficients) are equal at the origin.

Example 2.21 (Sums of Squares) Suppose we want to prove the identity
\[ \sum_{k=0}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}. \]
If $a_n = \sum_{k=0}^{n} k^2$ then
\[ a_{n+1} - a_n = (n+1)^2 \qquad\text{and}\qquad a_{n+2} - a_{n+1} = (n+2)^2, \]
so subtracting $(n+2)^2$ times the first equation from $(n+1)^2$ times the second equation gives a second order linear homogeneous recurrence
\[ (n+1)^2 a_{n+2} - (2n^2 + 6n + 5)\, a_{n+1} + (n+2)^2 a_n = 0 \tag{2.9} \]
satisfied by $(a_n)$. Direct substitution shows that the sequence $b_n = n(n+1)(2n+1)/6$ satisfies the same recurrence, and it agrees with $a_n$ for its first two terms, so the two sequences are equal for all n.

In general $(b_n)$ will also be specified by a linear recurrence, and one cannot simply substitute. For example, here $b_{n+1}/b_n$ is a rational function so $b_n$ satisfies the first order linear recurrence
\[ n(n+1)(2n+1)\, b_{n+1} - (n+1)(n+2)(2n+3)\, b_n = 0. \tag{2.10} \]
This implies the sequence $c_n = a_n - b_n$ satisfies a recurrence of order at most 2 + 1 = 3, meaning the four sequence shifts $c_{n+3}, c_{n+2}, c_{n+1}$, and $c_n$ lie in a Q(n)-vector space of dimension 3 and thus have a linear dependency. Writing out the $c_j$ sequence in terms of $a_j$ and $b_j$, then using (2.9) and (2.10) to eliminate all terms except for $a_{n+1}$, $a_n$, and $b_n$ gives a system of three homogeneous linear equations in four variables. Computing a non-zero solution of this system shows that $c_n$ satisfies the same second order recurrence,
\[ (n+1)^2 c_{n+2} - (2n^2 + 6n + 5)\, c_{n+1} + (n+2)^2 c_n = 0, \]
as $a_n$. Note that this recurrence is satisfied when $c_n$ is the difference of any two solutions of (2.9) and (2.10) (see footnote 11): working on the level of recurrences allows for computations to be performed in Q(n). Since our specified sequences $a_n$ and $b_n$ satisfy $c_0 = a_0 - b_0 = 0$ and $c_1 = a_1 - b_1 = 0$ then $a_n = b_n$ for all n.
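The finite check at the heart of Example 2.21 can be sketched as follows. The code confirms, for a range of n, that both sequences satisfy (2.9) and that the closed form satisfies (2.10); since the leading coefficient $(n+1)^2$ of (2.9) never vanishes for n ≥ 0, agreement of the first two terms then forces equality. A finite residual check is of course only an illustration of the symbolic argument, not a replacement for it, and the function names are illustrative.

```python
def a(n: int) -> int:
    """Sum of the first n squares, computed directly."""
    return sum(k * k for k in range(n + 1))

def b(n: int) -> int:
    """Proposed closed form n(n+1)(2n+1)/6 (always an integer)."""
    return n * (n + 1) * (2 * n + 1) // 6

def residual_2_9(seq, n: int) -> int:
    """Left-hand side of recurrence (2.9) for the sequence `seq`."""
    return ((n + 1) ** 2 * seq(n + 2)
            - (2 * n * n + 6 * n + 5) * seq(n + 1)
            + (n + 2) ** 2 * seq(n))

def residual_2_10(n: int) -> int:
    """Left-hand side of recurrence (2.10) for the closed form b."""
    return (n * (n + 1) * (2 * n + 1) * b(n + 1)
            - (n + 1) * (n + 2) * (2 * n + 3) * b(n))

if __name__ == "__main__":
    print(all(residual_2_9(a, n) == 0 for n in range(50)))
    print(all(residual_2_9(b, n) == 0 for n in range(50)))
    print(a(0) == b(0) and a(1) == b(1))
```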

Footnote 11: It can be shown that the general solution to (2.9) is $A + B\, b_n$ for constants A and B while the general solution to (2.10) is $C\, b_n$ for a constant C, and indeed $A + B\, b_n - C\, b_n = A + (B - C)\, b_n$ is always a solution of (2.9).

A powerful technique for proving identities is the creative telescoping framework, introduced by Zeilberger [128, 129], along with Wilf [126], in the 1980s and 1990s based on techniques of Fasenmyer [43] and Gosper [57]. Creative telescoping methods can (among other things) compute integrals of D-finite functions and sums of P-recursive sequences with free parameters by deducing differential equations or recurrence relations they satisfy. Proposed identities can then be verified, or discovered, using the arguments discussed above. The name ‘creative telescoping,’ apparently coined by van der Poorten [117] in his account of Apéry’s proof of the irrationality of ζ(3), comes from the fact that these methods can be considered a (vast) generalization of the elementary technique by which certain sums can be manipulated to have summands which telescope.

Example 2.22 (A Parametrized Integral) Consider the parametrized integral
\[ G(t) = \frac{1}{2\pi i} \int_\gamma \frac{1}{x - x^2 - t}\, dx, \]
where γ is the positively oriented circle {|x| = 1/2} and |t| ≤ 1/5, so $x - x^2 - t$ is never zero for x on γ. The methods of creative telescoping produce the identity
\[ (4t - 1)\frac{d}{dt}\left(\frac{1}{x - x^2 - t}\right) + \frac{2}{x - x^2 - t} = \frac{d}{dx}\left(\frac{1 - 2x}{x - x^2 - t}\right), \]
which can easily be verified by differentiation. Integrating this equation with respect to x gives
\[ (4t - 1)\int_\gamma \frac{d}{dt}\left(\frac{1}{x - x^2 - t}\right) dx + 2\int_\gamma \frac{1}{x - x^2 - t}\, dx = \int_\gamma \frac{d}{dx}\left(\frac{1 - 2x}{x - x^2 - t}\right) dx = 0, \]
since γ is a closed curve on which $x - x^2 - t$ is not zero. A minor argument shows that the derivative $\frac{d}{dt}$ can be moved outside of the integral with respect to x, so G(t) is D-finite and satisfies $(4t - 1)G'(t) + 2G(t) = 0$ with initial condition
\[ G(0) = \frac{1}{2\pi i}\int_\gamma \frac{dx}{x - x^2} = 1. \]
In fact, solving this separable differential equation shows $G(t) = (1 - 4t)^{-1/2}$ is the generating function of the central binomial coefficients. This computation is related to the multivariate diagonals discussed in Chapter 3.
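The closed form in Example 2.22 can be sanity-checked numerically. The sketch below approximates the contour integral by the trapezoid rule on the circle |x| = 1/2, parametrized as $x = e^{i\theta}/2$, and compares it with $(1-4t)^{-1/2}$ for a few real values of t with |t| ≤ 1/5. The function name and the sample count are illustrative choices.

```python
import cmath
import math

def G_numeric(t: float, samples: int = 2000) -> complex:
    """Trapezoid-rule approximation of (1/(2 pi i)) * integral over
    |x| = 1/2 of dx / (x - x^2 - t)."""
    total = 0j
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        x = 0.5 * cmath.exp(1j * theta)
        dx_dtheta = 1j * x  # derivative of x(theta) = exp(i theta)/2
        total += dx_dtheta / (x - x * x - t)
    return total * (2 * math.pi / samples) / (2j * math.pi)

if __name__ == "__main__":
    for t in (0.0, 0.1, -0.2):
        print(t, G_numeric(t), (1 - 4 * t) ** -0.5)
```

The integrand is smooth and periodic in θ, so the trapezoid rule converges quickly here; the agreement reflects the residue $1/\sqrt{1-4t}$ at the single root of $x - x^2 - t$ inside γ.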

Over the last three decades there has been a tremendous amount of research in the computer algebra community on creative telescoping, extending the framework to new classes of problems and improving efficiency through several ‘generations’ of algorithms. This work has been some of the most fruitful in modern computer algebra, leading to practical implementations which find application in a vast number of mathematical and scientific fields. A listing of such results is outside the scope of this text, but we briefly return to creative telescoping in Chapter 3 as it allows one to obtain an annihilating D-finite equation satisfied by the diagonal of a multivariate rational function. Detailed accounts of the methods and applications of creative telescoping can be found in recent surveys [91, 73, 27, 22].

Asymptotics of P-Recursive Sequences

Assume now that K = A is the field of algebraic numbers. A classic theorem of Fabry [42] states that any D-finite equation over the field of algebraic numbers admits a basis of formal series solutions of the form
\[ z^{\alpha} \exp\bigl(P(z^{-1/q})\bigr) \sum_{j=0}^{d} f_j\bigl(z^{1/q}\bigr) \log^j z, \]
where the $f_j$ are power series, P is a polynomial, $q \in \mathbb{N}_{>0}$, α is an algebraic number, and log z is a formal term algebraically independent of z with formal derivative $\frac{d}{dz}\log z = 1/z$. Analogously, a P-recursive relation has a basis of formal series solutions of the form
\[ \rho^n\, n^{\alpha}\, \Gamma(n)^{\beta} \exp\bigl(P(n^{1/q})\bigr) \sum_{j=0}^{d} f_j\bigl(n^{-1/q}\bigr) \log^j n, \]
where the $f_j$ are power series, P is a polynomial, $q \in \mathbb{N}_{>0}$, ρ and α are algebraic numbers, and β ∈ Q. Initial terms of these formal series solutions can be computed using the so-called Birkhoff-Trjitzinsky method [11, 12]. A nice summary of the Birkhoff-Trjitzinsky method can be found in Wimp and Zeilberger [127], including a discussion on the difference between formal series solutions and those representing asymptotically meaningful series; an implementation of the Birkhoff-Trjitzinsky method (used for our examples) is contained in the Sage ore_algebra package. We do not delve into the complicated topic of formal versus asymptotic series because, as we will soon see, when dealing with combinatorial generating functions which are analytic at the origin one can always work with D-finite equations and P-recursions whose formal solutions represent meaningful asymptotic expansions.

Example 2.23 (D-Finite and P-Recursive Bases of Solutions) The number $a_n$ of lattice walks beginning at the origin, staying in $\mathbb{N}^2$, and taking n steps in S = {(±1, 0), (0, ±1)} satisfies the P-recursion
\[ (n+4)(n+3)\, a_{n+2} - 4(2n+5)\, a_{n+1} - 16(n+1)(n+2)\, a_n = 0. \]
The solutions of this P-recurrence form a two-dimensional C-vector space, and the generalized_series_solutions command of the Sage ore_algebra package determines a basis for this vector space consisting of two elements whose expansions begin
\[ 4^n n^{-1}\left(1 - \frac{3}{2n} + \cdots\right) \qquad\text{and}\qquad (-4)^n n^{-3}\left(1 - \frac{9}{2n} + \cdots\right). \]
The generating function A(z) of this sequence satisfies the D-finite equation
\[ z^2(4z-1)(4z+1)A'''(z) + 2z(4z+1)(16z-3)A''(z) + 2(112z^2 + 14z - 3)A'(z) + 4(16z+3)A(z) = 0 \]
whose solutions form a three-dimensional C-vector space. Applying the generalized_series_solutions command now gives three series expansions
\[
\begin{aligned}
& 1 + 2z + 6z^2 + 18z^3 + 60z^4 + \cdots \\
& z^{-1}\bigl(1 + 2z + 4z^2 + 12z^3 + 36z^4 + \cdots\bigr) \\
& z^{-2}\bigl(z + 2z^2 + 4z^3 + 12z^4 + \cdots\bigr)\log(z) - \bigl(1/4 + 3z/2 + 2z^2 + \cdots\bigr)
\end{aligned}
\]
corresponding to the elements of a basis for the solution vector space.
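The P-recursion in Example 2.23 can be checked against a direct count of the walks. The sketch below counts quarter-plane walks by dynamic programming over endpoints and evaluates the recurrence on the resulting terms; it does not reproduce the generalized_series_solutions computation, and its function names are illustrative.

```python
def quarter_plane_walk_counts(max_steps: int):
    """Numbers a_0, ..., a_max_steps of walks from the origin that stay in
    the non-negative quadrant and use steps (+-1, 0) and (0, +-1)."""
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    current = {(0, 0): 1}  # endpoint -> number of walks ending there
    counts = [1]
    for _ in range(max_steps):
        following = {}
        for (x, y), ways in current.items():
            for dx, dy in steps:
                nx, ny = x + dx, y + dy
                if nx >= 0 and ny >= 0:
                    following[(nx, ny)] = following.get((nx, ny), 0) + ways
        current = following
        counts.append(sum(current.values()))
    return counts

def recursion_residual(a, n: int) -> int:
    """Left-hand side of
    (n+4)(n+3) a_{n+2} - 4(2n+5) a_{n+1} - 16(n+1)(n+2) a_n = 0."""
    return ((n + 4) * (n + 3) * a[n + 2]
            - 4 * (2 * n + 5) * a[n + 1]
            - 16 * (n + 1) * (n + 2) * a[n])

if __name__ == "__main__":
    a = quarter_plane_walk_counts(22)
    print(a[:6])
    print(all(recursion_residual(a, n) == 0 for n in range(21)))
```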

Our asymptotic results follow from a careful study of how a D-finite equation determines the singularities of its D-finite function solutions.

Definition 2.23 (singularities of D-finite equations) A point ζ ∈ C is an ordinary point of the D-finite equation (2.7) if there exist r linearly-independent solutions of (2.7) which are analytic at z = ζ. A point which is not ordinary is a singularity of the D-finite equation (2.7). The exceptional set of (2.7) is $\Xi = \{z \in \mathbb{C} : a_r(z) = 0\}$.

A classic argument [124, Thm. 2.2] dating back to Cauchy implies any singularity of (2.7) must be a zero of the leading coefficient $a_r(z)$ and thus lie in the finite algebraic set Ξ, giving the following.

Lemma 2.3 If $\zeta \notin \Xi$ then any solution of (2.7) which is analytic in a slit disk centred at ζ is in fact analytic at ζ.

To further classify the (potential) singularities of (2.7), we rewrite the equation as
\[ \frac{d^r}{dz^r}F(z) + b_{r-1}(z)\frac{d^{r-1}}{dz^{r-1}}F(z) + \cdots + b_0(z)F(z) = 0, \tag{2.11} \]
where $b_j(z) = a_j(z)/a_r(z)$.

Definition 2.24 (regular singular points and indicial polynomials) For any rational function R(z), let $\omega_\zeta(R)$ be the order of the pole of R at z = ζ, equal to 0 if R is analytic at ζ. The point ζ ∈ Ξ is a regular singular point of (2.11) and, equivalently, (2.7) if
\[ \omega_\zeta(b_{r-1}) \le 1, \quad \omega_\zeta(b_{r-2}) \le 2, \quad \dots, \quad \omega_\zeta(b_0) \le r. \]
A linear differential equation with only regular singular points is called Fuchsian. The indicial polynomial of (2.7) at a regular singular point ζ is the polynomial
\[ I(\theta) = (\theta)_r + \delta_1 (\theta)_{r-1} + \cdots + \delta_r, \]
where $(\theta)_j = \theta(\theta - 1)\cdots(\theta - j + 1)$ and $\delta_j = \lim_{z \to \zeta} (z - \zeta)^j\, b_{r-j}(z)$.

In the 1860s, Fuchs [52] published a study of linear differential equations, characterizing those which do not admit solutions with essential singularities. Soon after, Frobenius [51] gave a simplified method for computing a basis of solutions at a regular singular point, which consists of functions locally analytic in a slit disk (see footnote 12). The following result can be found in Wasow [124, Ch. 2].

Footnote 12: In fact, much of this theory seems to be contained in an unpublished manuscript of Riemann from several decades earlier. A historical treatment of early methods in differential equations is given by Gray [60].

Proposition 2.16 Let ζ be a regular singular point of the D-finite equation (2.7). Then in a slit disk around ζ, equation (2.7) admits a C-linear basis of analytic solutions of the form
\[ (1 - z/\zeta)^{\gamma} \sum_{j=0}^{m} f_j(1 - z/\zeta)\, \log^j(1 - z/\zeta), \tag{2.12} \]
where γ is a root of the indicial polynomial I(θ) and the $f_j$ are analytic at the origin. If I(θ) has distinct roots, no two of which differ by an integer, then there are no

logarithmic terms and a linear basis of solutions is given by functions of the form $(1 - z/\zeta)^{\gamma} f(1 - z/\zeta)$ where $f$ is analytic at the origin. Given expansion (2.12) valid in a slit disk, we write $F(z) \sim C(1 - z/\zeta)^{\alpha} \log^m(1 - z/\zeta)$ if the ratio $C(1 - z/\zeta)^{\alpha} \log^m(1 - z/\zeta)/F(z) \to 1$ whenever $z \to \zeta$ along any path in the slit disk. As with our work on Puiseux expansions, we can transfer a series expansion of the form (2.12) at the dominant singularity of a generating function to asymptotics of its coefficients. The following result is originally due to Jungen [67]; a modern proof can be found in Flajolet and Odlyzko [48].

Proposition 2.17 Suppose $F(z) = \sum_{n \geq 0} f_n z^n$ is analytic in some open disk around the origin except for a single dominant singularity $\zeta \in \mathbb{C}$ and points in the ray $R_\zeta = \{t\zeta : t \geq 1\}$. If $F(z)$ has a convergent expansion of the form (2.12) in a disk centred at $\zeta$ with $R_\zeta$ removed and $F(z) \sim C(1 - z/\zeta)^{\alpha} \log^m(1 - z/\zeta)$ with $\alpha \notin \mathbb{N}$ then
$$ f_n \sim C \zeta^{-n} \frac{n^{-\alpha-1}}{\Gamma(-\alpha)} \log^m n. $$
If there are a finite number of dominant singularities, each satisfying these hypotheses, then asymptotics of $f_n$ are determined by adding the asymptotic contributions given by each.

Remark 2.4 When $\alpha \in \mathbb{N}$ the identity
$$ \frac{d}{dz} (1 - z)^{\alpha} \log(1 - z)^r = -\alpha (1 - z)^{\alpha-1} \log(1 - z)^r - r (1 - z)^{\alpha-1} \log(1 - z)^{r-1} $$
can be used to recursively determine the coefficients of $(1 - z)^{\alpha} \log(1 - z)^r$. Any term in (2.12) with $\alpha \in \mathbb{N}$ and $r = 0$ is a polynomial and may be subtracted from $F(z)$ without changing asymptotic behaviour.

Putting Propositions 2.16 and 2.17 together gives the following.

Proposition 2.18 Assume that the coefficients $b_j(z)$ of (2.11) are analytic in a disk $|z| < \rho$ except at a unique pole $\zeta \in \Xi$ with $0 < |\zeta| < \rho$. Suppose that $\zeta$ is a regular singular point of (2.11) and that $F(z)$ is a solution to (2.11) which is analytic at the origin. If none of the solutions $\alpha_1, \ldots, \alpha_r$ to the indicial equation $I(\theta) = 0$ at $\zeta$ differ by an integer then there exist $c_1, \ldots, c_r \in \mathbb{C}$ such that for any $\tau$ with $|\zeta| < \tau < \rho$,
$$ [z^n]F(z) = \sum_{j=1}^{r} c_j \Delta_j(n) + O(\tau^{-n}), \qquad (2.13) $$
where $\Delta_j(n) = 0$ when $\alpha_j \in \mathbb{N}$ and $\Delta_j(n)$ is determined by an effective asymptotic series whose terms can be calculated to any desired order when $\alpha_j \notin \mathbb{N}$. If $\alpha_j \notin \mathbb{N}$ then the leading term of $\Delta_j(n)$ is
$$ \Delta_j(n) = \frac{n^{-\alpha_j - 1}}{\Gamma(-\alpha_j)} \zeta^{-n} \left(1 + O\left(\frac{1}{n}\right)\right). $$
When the roots of $I(\theta)$ differ by an integer (including the case of multiple roots) then dominant asymptotics of $[z^n]F(z)$ are given by a $\mathbb{C}$-linear combination of terms of the


form $\zeta^{-n} n^{-\sigma_j} (\log n)^{\ell}$, where $\sigma_j$ is algebraic and $\ell$ is a non-negative integer. When there are a finite number of dominant singularities, each satisfying these hypotheses, then asymptotics are determined by adding the asymptotic contributions of each. The growth of D-finite series coefficients is less restricted than the algebraic case: logarithmic powers may be present and the factor $n^{-1-\alpha_j}$ can be irrational or a negative integer power. Further discussion of Proposition 2.18, its applications, and generalizations to irregular singularities can be found in Flajolet and Sedgewick [50, Sect. VII. 9 and VIII. 7].

Example 2.24 (Central Binomial Coefficients Squared) As discussed above, the transcendental function $F(z) = \sum_{n \geq 0} \binom{2n}{n}^2 z^n$ satisfies the D-finite equation
$$ z(1 - 16z)F''(z) + (1 - 32z)F'(z) - 4F(z) = 0. $$
Since the coefficients of $F(z)$ do not decay super-exponentially (they grow), $F(z)$ must have a singularity, and because it is analytic at the origin this singularity is the regular singular point $z = 1/16$. We can compute convergent expansions
$$ A(z) = 1 + \frac{1}{4}(1 - 16z) + \frac{9}{64}(1 - 16z)^2 + \cdots $$
$$ B(z) = \left(1 + \frac{1}{4}(1 - 16z) + \frac{9}{64}(1 - 16z)^2 + \cdots\right) \log(1 - 16z) + \frac{1}{2}(1 - 16z) + \frac{21}{64}(1 - 16z)^2 + \cdots $$
in a slit disk at $z = 1/16$ of functions $A(z)$ and $B(z)$ which form a basis of solutions for this D-finite equation. The logarithmic factor in $B$ is a reflection of the fact that the indicial polynomial $I(\theta) = \theta^2$ at $z = 1/16$ has a double root. Since $A$ and $B$ form a basis of solutions, we can write $F(z) = \kappa_1 A(z) + \kappa_2 B(z)$ for $\kappa_1, \kappa_2 \in \mathbb{C}$. Furthermore, since $A$ is analytic at $z = 1/16$ while $F$ and $B$ have singularities there, it must be the case that $\kappa_2 \neq 0$ and Proposition 2.17 implies
$$ \binom{2n}{n}^2 = \kappa_2 \frac{16^n}{n} \left(1 + O\left(\frac{1}{n}\right)\right). $$
Finding asymptotics of the sequence $\binom{2n}{n}$ with algebraic generating function $G(z) = (1 - 4z)^{-1/2}$ and squaring the result shows $\kappa_2 = 1/\pi$. We discuss the general problem of determining such coefficients in the next subsection.
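The squaring step can be written out as follows. This is a sketch that applies Proposition 2.17 to $G(z)$ with $\zeta = 1/4$, $\alpha = -1/2$, $m = 0$ and $C = 1$, and uses the standard value $\Gamma(1/2) = \sqrt{\pi}$, which is not stated in the text.

```latex
\binom{2n}{n} = [z^n](1-4z)^{-1/2}
  \sim 4^n \, \frac{n^{-1/2}}{\Gamma(1/2)} = \frac{4^n}{\sqrt{\pi n}},
\qquad\text{so}\qquad
\binom{2n}{n}^2 \sim \frac{16^n}{\pi n}
\quad\text{and}\quad
\kappa_2 = \frac{1}{\pi}.
```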

At an irregular singular point a solution of (2.7) may have an essential singularity, making arguments more difficult and leading to more general asymptotic behaviour. Fortunately, a set of powerful results collectively imply that any D-finite power series with integer coefficients that is analytic at the origin has only regular singularities, and the corresponding indicial polynomial has only rational roots. In fact, the result applies to a larger class of power series.


Definition 2.25 (G-functions) A power series $F(z) = \sum_{n \geq 0} f_n z^n \in \mathbb{Q}[[z]]$ is called a G-function¹³ if $F(z)$ is D-finite and there exists a constant $C > 0$ such that for all $n$ both $|f_n|$ and the least common denominator of $f_0, \ldots, f_n$ are bounded by $C^n$.

Proposition 2.19 (André-Chudnovsky-Katz Theorem) If $F(z)$ is a G-function then a minimal order annihilating D-finite equation for $F$ is Fuchsian, and its indicial polynomial $I(\theta)$ has only rational roots.

Chudnovsky and Chudnovsky [24, Thm. III] showed that if $F(z)$ is a G-function then a minimal order annihilating D-finite equation $L$ is globally nilpotent, meaning that a certain linear operator derived from $L$ is nilpotent modulo $p$ for all but a finite number of primes $p$. A previous result of Katz [68] then restricts the singular behaviour of $L$, giving Proposition 2.19. André [2, Section VI] extended and clarified these arguments. Note that Proposition 2.19 gives information about all solutions of a D-finite equation from properties of a single solution, a powerful result. Translating back to coefficient asymptotics gives the following, first used in combinatorial contexts by Garoufalidis [53] and Bostan et al. [16].

Corollary 2.2 Suppose $F(z)$ is a G-function, for instance a D-finite function with integer coefficients which is analytic at the origin. Then, as $n \to \infty$, the power series coefficients $(f_n)$ of $F(z)$ have an asymptotic expansion given by a sum of terms of the form $C n^{\alpha} \zeta^n (\log n)^{\ell}$ where $C \in \mathbb{C}$, $\alpha \in \mathbb{Q}$, $\ell \in \mathbb{N}$, and $\zeta$ is algebraic.

Just as the asymptotic statement in Corollary 2.1 is useful for proving transcendence of functions, Proposition 2.18 and Corollary 2.2 are powerful tools for proving non-D-finiteness. Another useful technique for proving non-D-finiteness is to show that a function has an infinite number of singularities.
Example 2.25 (Number of Alternating Permutations is Not P-Recursive) In Section 2.1.1 we studied the function $\tan(z) = \sum_{n \geq 0} \frac{a_n}{n!} z^n$, where $a_n$ is the number of alternating permutations of $n$. Because $\tan(z)$ has a singularity at every $z = \pi/2 + \pi k$ with $k \in \mathbb{Z}$, the tangent function is not D-finite and $a_n/n!$ is not P-recursive. Since the product of P-recursive sequences is P-recursive by Item (5) of Proposition 2.15 above, and $1/n!$ is P-recursive, this implies $a_n$ is not P-recursive.

Example 2.26 (Prime Sequence is Non-D-Finite) Flajolet et al. [47] prove non-D-finiteness of several generating functions by examining asymptotics of their power series coefficients, including logarithms, non-integer powers, and the sequence of primes. For instance, if $p_n$ denotes the nth prime number then the prime number theorem allows one to deduce the asymptotic estimate
$$ p_n = n \log n + n \log\log n + O(n), $$
and the existence of the log-log term proves that the generating function $P(z) = \sum_{n \geq 0} p_n z^n$ is non-D-finite.

13 G-functions were introduced by Siegel [104] in his studies on number theory and elliptic integrals.

2.4.1 An Open Connection Problem
The interplay between a G-function $F(z)$ and its P-recursive coefficient sequence $(f_n)$ comes into sharp focus when trying to determine asymptotics. First, studying D-finite equations allowed for the asymptotic characterizations in Proposition 2.18 and Corollary 2.2. On the other hand, the Birkhoff-Trjitzinsky method applied to the P-recursion for $f_n$ gives a basis of solutions $\Psi_1(n), \ldots, \Psi_r(n)$ for the recurrence whose asymptotic expansions can be computed. To represent $f_n$ in this basis it is most convenient to turn back to the D-finite equation for $F(z)$.

Definition 2.26 (connection coefficients) The constants $\lambda_1, \ldots, \lambda_r \in \mathbb{C}$ such that $f_n = \lambda_1 \Psi_1(n) + \cdots + \lambda_r \Psi_r(n)$ are called connection coefficients of $f_n$ with respect to the basis $\Psi_1, \ldots, \Psi_r$. Similarly, if $\Lambda_1(z), \ldots, \Lambda_s(z)$ form a basis of solutions for the D-finite equation satisfied by $F(z)$ then the constants $\lambda_1, \ldots, \lambda_s \in \mathbb{C}$ such that $F(z) = \lambda_1 \Lambda_1(z) + \cdots + \lambda_s \Lambda_s(z)$ are called connection coefficients of $F(z)$ with respect to the basis $\Lambda_1, \ldots, \Lambda_s$.

An exact characterization of connection coefficients for G-functions is currently unknown, although Fischler and Rivoal [45, Thms. 1 and 2] show that any such constant is the evaluation $g(1)$ of a G-function $g \in \mathbb{Q}(i)[[z]]$ whose radius of convergence can be made arbitrarily large. Most importantly, to determine even the scale of dominant asymptotics one must figure out which connection coefficients are nonzero. At the generating function level this is a connection problem analogous to the case of algebraic functions: around each element of $\Xi$ local convergent expansions can be computed for a basis of solutions, and one needs to express $F(z)$ as a $\mathbb{C}$-linear combination of these basis elements.
If the representation of $F(z)$ in such a basis contains an element whose local series expansion at $\zeta \in \Xi$ contains a non-power-series term, and there is no cancellation with other basis elements, then $z = \zeta$ is a singularity of $F$. Numeric approximations of connection coefficients can be rigorously computed to any desired accuracy, allowing one to certify a non-zero connection coefficient and providing heuristic evidence when connection coefficients are zero. Naive techniques for computing such continuations were known in principle perhaps as far back as Frobenius. Efficient algorithms for continuation of D-finite functions, including to a regular singular point, were given by van der Hoeven [115] and outlined in previous work of Chudnovsky and Chudnovsky [26]; the key idea is to use fast evaluation of a D-finite power series inside its radius of convergence¹⁴. Additional aspects of this problem have been studied by Mezzarobba [80], who also developed a Sage package [81] which can compute hundreds (or thousands) of rigorously certified digits of connection coefficients in seconds when run on a modern computer.

14 Only $O(n)$ terms of a D-finite power series are needed to compute $n$ digits of an evaluation inside its radius of convergence, with good bounds on the precise number of necessary terms [83, 82].

Example 2.27 (A Numeric Connection Solution) Consider a solution $F(z)$ of the Fuchsian differential equation
$$ 5z^2(2z - 1)(z - 3)F^{(4)}(z) + 2z(59z^2 - 139z + 36)F^{(3)}(z) + 6(61z^2 - 80z + 6)F''(z) + 12(25z - 11)F'(z) + 36F(z) = 0, \qquad (2.14) $$

whose coefficient sequence $(f_n)$ satisfies the linear recurrence
$$ 3(n + 3)(5n + 4) f_{n+2} - (n + 2)(35n + 33) f_{n+1} + (10n^2 + 28n + 18) f_n = 0. $$
This recurrence has a basis of solutions consisting of two series
$$ \Psi_1(n) = \frac{2^n}{n}\left(1 - n^{-1} + O\left(n^{-2}\right)\right) \quad\text{and}\quad \Psi_2(n) = 3^{-n}\left(1 + O\left(n^{-2}\right)\right), $$

which can be determined automatically to any order by the Birkhoff-Trjitzinsky method. This means there exist constants $\lambda_1$ and $\lambda_2$, depending only on the initial conditions $f_0$ and $f_1$, such that
$$ f_n = \lambda_1 \Psi_1(n) + \lambda_2 \Psi_2(n) \sim \begin{cases} \lambda_1 \dfrac{2^n}{n} & : \lambda_1 \neq 0 \\ \lambda_2 3^{-n} & : \lambda_1 = 0 \end{cases}. $$
To determine even whether $f_n$ increases or decreases, it is necessary to decide whether $\lambda_1$ is zero. To this end, we examine a basis of solutions $\Phi_0(z), \ldots, \Phi_3(z)$ to the differential equation (2.14) defined by their expansions in a neighbourhood of the origin. We can compute a basis of solutions
$$ \Phi_0(z) = 1 - z^2/2 + \cdots \qquad \Phi_1(z) = z + 11z^2/6 + \cdots $$
$$ \Phi_2(z) = z^{-1} + \cdots \qquad \Phi_3(z) = z^{1/5}\left(z + 11z^2/6 + \cdots\right), $$

and matching up initial series terms with $F(z) = f_0 + f_1 z + \cdots$ implies $F(z) = f_0 \Phi_0(z) + f_1 \Phi_1(z)$. Note that although we do not know the basis elements explicitly, we can represent $F$ in the basis using only the first two series terms of each element. The leading coefficient of (2.14) vanishes at $z = 1/2$ and $z = 3$, corresponding to the potential asymptotic behaviour of $f_n$. For example, there is a basis of solutions $\Gamma_0(z), \ldots, \Gamma_3(z)$ to (2.14) whose expansions at $z = 1/2$ are


$$ \Gamma_0(z) = \log(z - 1/2)\,(1 + 2(z - 1/2) + \cdots) $$
$$ \Gamma_1(z) = 1 + (8/25)(1 - z/2)^3 + \cdots $$
$$ \Gamma_2(z) = (z - 1/2) + (8/25)(1 - z/2)^3 + \cdots $$
$$ \Gamma_3(z) = (z - 1/2)^2 - 2(1 - z/2)^3 + \cdots, $$
where $\Gamma_0(z)$ contains a logarithmic singularity at $z = 1/2$ and the other three functions are analytic. To determine the singular behaviour of $F(z)$ at $z = 1/2$ we need to express $F$ in terms of the $\Gamma_j$ basis. Since we have expressed $F$ in the $\Phi_j$ basis by examining its behaviour at the origin, it is sufficient to find the change of basis matrix $M$ such that if $a_0, \ldots, a_3, b_0, \ldots, b_3 \in \mathbb{C}$ and $a_0 \Phi_0(z) + \cdots + a_3 \Phi_3(z) = b_0 \Gamma_0(z) + \cdots + b_3 \Gamma_3(z)$ then
$$ \begin{pmatrix} b_0 & b_1 & b_2 & b_3 \end{pmatrix}^T = M \begin{pmatrix} a_0 & a_1 & a_2 & a_3 \end{pmatrix}^T. $$

The analytic_ore_algebra Sage package of Mezzarobba [81] uses numeric analytic continuation to compute a rigorous numeric approximation
$$ M \approx \begin{pmatrix}
0.50\ldots & -1.50\ldots & 0.0\ldots & -1.37\ldots \\
2.14\ldots - i(1.57\ldots) & -2.83\ldots + i(4.71\ldots) & 2.0\ldots & -2.73\ldots + i(4.33\ldots) \\
0.02\ldots - i(3.14\ldots) & 1.35\ldots - i(9.42\ldots) & -4.0\ldots & 1.05\ldots - i(8.66\ldots) \\
1.67\ldots - i(6.28\ldots) & -4.44 + i(18.84\ldots) & 8.0\ldots & -4.408 + i(17.32\ldots)
\end{pmatrix}, $$

where the approximation is computed to 2500 certified digits in around 10 seconds on a modern laptop. The entries of
$$ \begin{pmatrix} b_0 & b_1 & b_2 & b_3 \end{pmatrix}^T = M \begin{pmatrix} f_0 & f_1 & 0 & 0 \end{pmatrix}^T $$
then determine $F(z)$ in terms of the $\Gamma_j$ basis. In particular, the coefficient $b_0$ of $\Gamma_0$ is $(0.50\ldots) f_0 + (-1.50\ldots) f_1$, where the dots hide 2498 zeroes, strongly suggesting $b_0 = (1/2) f_0 - (3/2) f_1$. Since $\Gamma_0(z)$ has a singular expansion beginning with $\log(z - 1/2)$ at $z = 1/2$, and $\Gamma_1$, $\Gamma_2$, and $\Gamma_3$ are analytic at $z = 1/2$, if true this would imply
$$ f_n = \frac{3f_1 - f_0}{2}\, 2^n \left(n^{-1} + \cdots\right) + O\left(3^{-n}\right). $$
Thus, when $3f_1 - f_0$ is far from zero it seems reasonable that the coefficient sequence will grow exponentially and this will be easily detected. When $3f_1 - f_0$ equals zero it seems that $F(z)$ does not have a singularity at $z = 1/2$, and dominant asymptotics are determined by repeating the process with a basis of solutions whose expansions near $z = 3$ are known. When $3f_1 - f_0$ is very close but not equal to zero it seems that the sequence will grow exponentially, but one could be fooled into thinking this was not the case. In general the connection coefficients can be transcendental, making such an analysis difficult. For this simple example it can be verified directly that

$$ \Psi_1(n) = \frac{2^n}{n} \sum_{k \geq 0} \frac{(-1)^k}{n^k} = \frac{2^n}{n+1} \quad\text{and}\quad \Psi_2(n) = 3^{-n} $$
form a basis of solutions for the recurrence under consideration, so that
$$ f_n = \frac{3f_1 - f_0}{2} \cdot \frac{2^n}{n+1} + \frac{3f_0 - 3f_1}{2} \cdot 3^{-n}. $$
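The closed form can be checked against the recurrence with exact rational arithmetic. The following is a minimal sketch; the function names `by_recurrence` and `closed_form` are illustrative choices and not part of the text.

```python
from fractions import Fraction


def by_recurrence(f0, f1, count):
    """First `count` terms of f_n, unrolled from the recurrence of Example 2.27."""
    f = [Fraction(f0), Fraction(f1)]
    for n in range(count - 2):
        numerator = ((n + 2) * (35 * n + 33) * f[n + 1]
                     - (10 * n * n + 28 * n + 18) * f[n])
        f.append(numerator / (3 * (n + 3) * (5 * n + 4)))
    return f[:count]


def closed_form(f0, f1, n):
    """Closed form: a multiple of 2^n/(n+1) plus a multiple of 3^(-n)."""
    f0, f1 = Fraction(f0), Fraction(f1)
    growing = (3 * f1 - f0) / 2 * Fraction(2 ** n, n + 1)
    decaying = (3 * f0 - 3 * f1) / 2 * Fraction(1, 3 ** n)
    return growing + decaying


if __name__ == "__main__":
    # Generic initial conditions: exponential growth like 2^n / n.
    print([str(x) for x in by_recurrence(1, 1, 6)])
    # Initial conditions with 3*f1 = f0: the growing term vanishes.
    print([str(x) for x in by_recurrence(3, 1, 6)])
```

Choosing initial conditions with $3f_1 = f_0$, such as $f_0 = 3$ and $f_1 = 1$, makes the first connection coefficient vanish and leaves only the decaying $3^{-n}$ term.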

Example 2.28 (A Numeric Connection Solution for Binomial Coefficients) Repeating this process on the differential equation
$$ z(1 - 16z) \frac{d^2}{dz^2} F(z) + (1 - 32z) \frac{d}{dz} F(z) - 4F(z) = 0 $$
with initial conditions $F(0) = 1$ and $F'(0) = 4$, which encodes the generating function of the central binomial coefficients squared, gives
$$ \binom{2n}{n}^2 = (0.3183098861\ldots) \frac{16^n}{n} \left(1 + O\left(\frac{1}{n}\right)\right). $$

The numerical approximation corresponds to the connection coefficient 1/π.
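A direct numerical check of this constant is easy to run. The sketch below does not perform analytic continuation; it only rescales the exact coefficients, and the helper name `scaled_coefficient` is an illustrative choice.

```python
from math import comb, pi


def scaled_coefficient(n):
    """n * binom(2n, n)^2 / 16^n, which should approach the connection coefficient."""
    return n * comb(2 * n, n) ** 2 / 16 ** n


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, scaled_coefficient(n))
    print("1/pi =", 1 / pi)
```

The convergence is slow (the error is of order $1/n$), which is one reason certified continuation is preferred when many digits are needed.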

Example 2.29 (Simple Lattice Walks in $\mathbb{N}^2$) Problem 4.4 of Chapter 4 asks you to prove that the generating function $A(z)$ counting the number $a_n$ of lattice walks beginning at the origin, staying in $\mathbb{N}^2$, and taking $n$ steps in $S = \{(\pm 1, 0), (0, \pm 1)\}$ satisfies the D-finite equation
$$ z^2(4z - 1)(4z + 1)A'''(z) + 2z(4z + 1)(16z - 3)A''(z) + 2(112z^2 + 14z - 3)A'(z) + 4(16z + 3)A(z) = 0, \qquad (2.15) $$
so that $(a_n)$ satisfies the P-recurrence
$$ (n + 4)(n + 3)a_{n+2} - 4(2n + 5)a_{n+1} - 16(n + 1)(n + 2)a_n = 0. $$
This recurrence has a basis of solutions with expansions
$$ \Psi_1(n) = 4^n n^{-1}\left(1 - \frac{3}{2n} + \cdots\right) \quad\text{and}\quad \Psi_2(n) = (-4)^n n^{-3}\left(1 - \frac{9}{2n} + \cdots\right), $$
corresponding to the singular points $z = \pm 1/4$ of the D-finite equation (2.15). Performing the same numeric analytic continuation argument gives
$$ a_n = (1.273\ldots)\Psi_1(n) + (5.092\ldots)\Psi_2(n) \sim (1.273\ldots)\, 4^n n^{-1}. $$
In Section 6.1.1 of Chapter 6 we use the methods of analytic combinatorics in several variables to provide an asymptotic decomposition of $a_n$ which determines these connection coefficients exactly. In particular, we will see that $a_n \sim (4/\pi)\, 4^n n^{-1}$.
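The recurrence and the leading constant can be sanity-checked numerically. In this sketch the initial terms $a_0 = 1$ and $a_1 = 2$ are an assumption not given in the text (the empty walk, and the two admissible first steps); a brute-force count is included to confirm them. Function names are illustrative.

```python
from math import pi


def walks_by_recurrence(count):
    """a_0, ..., a_{count-1} from the P-recurrence of Example 2.29 (exact integers)."""
    a = [1, 2]  # assumed initial terms, confirmed by walks_by_counting below
    for n in range(count - 2):
        numerator = 4 * (2 * n + 5) * a[n + 1] + 16 * (n + 1) * (n + 2) * a[n]
        quotient, remainder = divmod(numerator, (n + 4) * (n + 3))
        assert remainder == 0  # the recurrence must produce integers
        a.append(quotient)
    return a[:count]


def walks_by_counting(count):
    """The same numbers by dynamic programming over endpoints in the quarter plane."""
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    current = {(0, 0): 1}
    totals = []
    for _ in range(count):
        totals.append(sum(current.values()))
        following = {}
        for (x, y), ways in current.items():
            for dx, dy in steps:
                p = (x + dx, y + dy)
                if p[0] >= 0 and p[1] >= 0:
                    following[p] = following.get(p, 0) + ways
        current = following
    return totals


if __name__ == "__main__":
    a = walks_by_recurrence(2001)
    print(a[:8])
    print(2000 * a[2000] / 4 ** 2000, "vs 4/pi =", 4 / pi)
```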

The determination of connection coefficients is a major open problem in enumerative combinatorics. We discuss methods of attacking this connection problem using analytic combinatorics in several variables in the following chapters. Open Problem 2.4 (Decidability of D-finite Asymptotics) Given a G-function F(z), specified by an annihilating Fuchsian D-finite equation L and enough initial conditions to distinguish F(z) as a solution, is it decidable to determine dominant asymptotics of F(z)? In particular, is it decidable to determine which singularities of L are singularities of F(z) and find the corresponding connection coefficients?

2.5 D-Algebraic Power Series
We end with a brief discussion of functions satisfying certain non-linear differential equations, mainly to highlight how delicate the relevant decidability issues are. As we do not consider such functions later in the text, we do not go into great detail; a more robust overview can be found in the surveys of Rubel [97, 98, 99].

Definition 2.27 (D-algebraic functions) An analytic function or formal series $F(z)$ is differentially algebraic (D-algebraic) over a field $K \subset \mathbb{C}$ if there exists a non-zero polynomial $P(x_0, x_1, \ldots, x_d)$ with coefficients in $K$ such that $F$ and its derivatives satisfy the algebraic differential equation (ADE)
$$ P\left(F, F', \ldots, F^{(d)}\right) = 0. $$
For a function or series $F$, satisfying an ADE is equivalent to the apparently weaker condition that there exists a multivariate polynomial $Q$ with coefficients in $K$ such that
$$ Q\left(z, F, F', \ldots, F^{(d)}\right) = 0. $$
For $n$ sufficiently large the coefficients $f_n$ of a D-algebraic function $F(z)$ satisfy a recurrence of the form
$$ p(n) f_n = q_n(f_0, \ldots, f_{n-1}), $$
where $p$ and the $q_n$ are polynomials which can be explicitly computed from an annihilating ADE. In particular, a power series solution of an ADE is still uniquely determined by a finite number of initial terms, and given these initial terms the coefficient $f_n$ can be determined in polynomial time with respect to $n$. A good discussion of the complexity of basic operations with ADEs, such as extending known solutions, can be found in van der Hoeven [116].
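As a small illustration of such a recurrence, consider the tangent function. The sketch below assumes the standard ADE $F' = 1 + F^2$ with $F(0) = 0$ for $\tan(z)$, which is not stated in the text; extracting the coefficient of $z^n$ gives $(n + 1) f_{n+1} = [n = 0] + \sum_{k=0}^{n} f_k f_{n-k}$. The function name is illustrative.

```python
from fractions import Fraction
from math import factorial


def tan_coefficients(count):
    """Taylor coefficients f_0, ..., f_{count-1} of tan(z), from F' = 1 + F^2."""
    f = [Fraction(0)]  # initial condition tan(0) = 0
    for n in range(count - 1):
        # Coefficient of z^n in F^2 is a polynomial in f_0, ..., f_n.
        convolution = sum(f[k] * f[n - k] for k in range(n + 1))
        constant = 1 if n == 0 else 0
        f.append((convolution + constant) / (n + 1))
    return f[:count]


if __name__ == "__main__":
    f = tan_coefficients(10)
    print([str(x) for x in f])
    # n! * f_n for odd n: 1, 2, 16, 272, ...
    print([int(factorial(n) * f[n]) for n in range(1, 10, 2)])
```

Each new coefficient costs $O(n)$ rational operations here, so the first $n$ coefficients take $O(n^2)$ operations, in line with the polynomial-time claim above.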


Examples of D-Algebraic Series
The family of D-algebraic functions includes all generating function classes so far discussed, and is closed under sum, product, difference, quotient, composition, and compositional inverse, when defined; components of a system of ADEs are themselves D-algebraic. Unlike D-finite functions, analytic D-algebraic functions can have an infinite number of singularities. Examples above such as $\tan(z)$ and the generating function of partitions are non-D-finite but D-algebraic. Perhaps the most famous combinatorial example of a D-algebraic function is Tutte's derivation [114] of the equation
$$ 2q^2(1 - q)z + (qz + 10H - 6zH')H'' + q(4 - q)(20H - 18zH' + 9z^2H'') = 0 $$
for the generating function $H(z)$ counting $q$-coloured rooted triangulations by vertices, for all $q \in \mathbb{N}$. Bergeron and Reutenauer [7] show how a large class of ADEs can be interpreted as the generating functions of certain classes of trees.

Solutions of ADEs have been important in theoretical computer science since Shannon's analysis [103] of the Differential Analyzer (an analogue computer designed to solve differential equations) as a model of computation. In fact, initial value problems of (first order) D-algebraic systems can simulate universal Turing machines [59]. That ADEs capture a wide variety of behaviour is not surprising in light of the existence of so-called universal differential equations; for instance, Duffin [37] showed that for any real continuous function $g(z)$ there is a four times continuously differentiable solution of the ADE
$$ 2F^{(4)}(z)F'(z)^2 - 5F'''(z)F''(z)F'(z) + 3F''(z)^3 = 0 $$
which approximates $g(z)$ to arbitrary accuracy over the real numbers. The first example of a universal differential equation was given by Rubel [96].

Non-D-Algebraic Functions
Definition 2.28 (hypertranscendence) A non-D-algebraic function is called hypertranscendental or transcendentally transcendental.

Well known examples of hypertranscendental functions include the Euler gamma function $\Gamma(z)$, the Riemann zeta function, and the 'lacunary' series $\sum_{n \geq 0} z^{2^n}$. Important tools for proving hypertranscendence include theorems related to large gaps between non-zero power series terms [77], new complexity techniques inspired by applications in combinatorics [90], and differential Galois theory [118, 62]. In particular, differential Galois theory gives a powerful collection of tools for proving algebraicity, transcendence, and hypertranscendence of functions by reducing such questions to effectively decidable statements about algebraic groups.


Asymptotics and Decidability
D-algebraic series seem to be on the border between decidability and undecidability, perhaps tipped to the side of undecidability. There are algorithms to determine when an ADE admits a formal power series solution, or a smooth $C^\infty$ solution, but it is undecidable to determine whether there is an analytic solution at the origin; see Denef and Lipshitz [35] for details. Compared to D-finite functions not much is known about D-algebraic asymptotics, and in general not much can be said. Denef and Lipshitz [35] show that it is undecidable to detect whether a D-algebraic series which is analytic at the origin has radius of convergence less than one, or greater than or equal to one. Recast in terms of coefficient asymptotics, this says it is undecidable to determine whether the coefficient sequence of an analytic D-algebraic function exponentially grows. Maillet [79] showed that if $F(z)$ is D-algebraic then its power series coefficients satisfy $|f_n| \leq K(n!)^{\alpha}$ for some constants $\alpha, K > 0$ and all $n$. Popken [93] showed that if $F(z)$ is D-algebraic and its power series coefficients $f_n$ are algebraic numbers then $|f_n| \geq \exp\left(-cn(\log n)^2\right)$ for some constant $c > 0$. These results allow for asymptotic proofs of hypertranscendence for extremely fast growing or decaying sequences.

Appendix on Complex Analysis
This appendix summarizes the results from complex analysis needed for our asymptotic arguments. Because we continually introduce new concepts we do not break out definitions as in the rest of this text. Full proofs of the results discussed here can be found in Henrici [65] and Rudin [100].

An open disk centred at $a \in \mathbb{C}$ is a set of the form $D = \{z \in \mathbb{C} : |z - a| < r\}$ with $r > 0$. A subset $O \subset \mathbb{C}$ is open if for any $a \in O$ there is an open disk centred at $a$ which is contained in $O$, while a domain $\Omega$ is a connected open subset of $\mathbb{C}$. A neighbourhood of $a \in \mathbb{C}$ is an open subset of $\mathbb{C}$ containing $a$, and a complex-valued function $f(z)$ is called analytic at $z = a$ if there exists a neighbourhood $N$ of $a$ where $f(z)$ is represented by a convergent power series,
$$ f(z) = \sum_{n \geq 0} c_n (z - a)^n \qquad (z \in N). $$

We say f is analytic in the domain Ω if it is analytic at each point of Ω; this is equivalent to f being complex-differentiable in Ω, i.e., that the limit of ( f (z) − f (a))/(z − a) as z approaches a in the complex plane exists for every a ∈ Ω. Two analytic functions defined on a domain Ω which agree on any subset S ⊂ Ω containing a limit point (including any non-empty open set) must agree on all of Ω. This uniqueness property, sometimes called the identity theorem, implies that a nonzero analytic function has a finite number of zeroes in any compact set. If Ω1 and Ω2 are non-disjoint domains and f : Ω1 → C and g : Ω2 → C are analytic functions


which agree on $\Omega_1 \cap \Omega_2 \neq \emptyset$, then one can analytically continue $f$ by defining analytic $F : \Omega_1 \cup \Omega_2 \to \mathbb{C}$ with $F(z) = f(z)$ for $z \in \Omega_1$ and $F(z) = g(z)$ for $z \in \Omega_2$. In this case we say $g(z)$ is a direct analytic continuation of $f(z)$ to $\Omega_2$.

Example 2.30 (Analytic Continuation of Series) The function $C(z) = z/(1 - 2z)$ is analytic in the complex plane, except at the point $z = 1/2$. Since $C(z)$ has the convergent power series representation
$$ C(z) = \sum_{n \geq 0} c_n z^n = \sum_{n \geq 1} 2^{n-1} z^n $$
for $|z| < 1/2$, where $(c_n)$ is the counting sequence of integer compositions, we can view this rational function as a direct analytic continuation of the generating function of integer compositions from the disk $\{|z| < 1/2\}$ to $\mathbb{C} \setminus \{1/2\}$.

More generally, we say that g is an analytic continuation of f if there exists a sequence of direct analytic continuations on consecutively overlapping domains Ω1, Ω2, . . . , Ωr beginning with f and ending with g. The analytic continuation g is uniquely determined by the sequence of domains Ω j .

Complex Integration
A curve $\gamma : [a, b] \to \mathbb{C}$ is a piecewise continuously differentiable function from the real interval $[a, b]$ into the complex numbers; when $\gamma(t) = \gamma_1(t) + i\gamma_2(t)$ for real functions $\gamma_1$ and $\gamma_2$ this states that $\gamma_1$ and $\gamma_2$ are piecewise continuously differentiable as functions from $[a, b]$ to $\mathbb{R}$. If $f(z)$ is a complex-valued function on $\gamma$ then the complex path integral of $f$ over $\gamma$ is
$$ \int_\gamma f(z)\,dz = \int_a^b f(\gamma(t))\gamma'(t)\,dt, $$
when this integral exists. If $f(z) = u(z) + iv(z)$ for $u, v : \mathbb{C} \to \mathbb{R}$ then
$$ \int_\gamma f(z)\,dz = \int_\gamma u(z)\,dz + i\int_\gamma v(z)\,dz. $$

The integral is parametrization independent: it depends only on the image $\gamma([a, b])$, not on the actual function $\gamma$.

Example 2.31 (A Circular Integral) The curve defined by $\gamma(t) = e^{it}$ for $t \in [-\pi, \pi]$ describes the unit circle in $\mathbb{C}$. If $n \in \mathbb{Z}$ and $n \neq -1$ then
$$ \int_\gamma z^n\,dz = \int_a^b \gamma(t)^n \gamma'(t)\,dt = \int_{-\pi}^{\pi} ie^{it(1+n)}\,dt = \left.\frac{e^{it(1+n)}}{1 + n}\right|_{t=-\pi}^{\pi} = 0, $$
while
$$ \int_\gamma z^{-1}\,dz = \int_{-\pi}^{\pi} ie^{it(1-1)}\,dt = 2\pi i. $$
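These two integrals can be reproduced numerically straight from the definition of the path integral. The following is a minimal sketch using an equally spaced Riemann sum; the function name `circle_integral` and its parameters are illustrative choices.

```python
import cmath
from math import pi


def circle_integral(f, centre=0.0, radius=1.0, samples=512):
    """Approximate the integral of f over a positively oriented circle.

    Uses gamma(t) = centre + radius * e^{it} for t in [-pi, pi] and an
    equally spaced Riemann sum of f(gamma(t)) * gamma'(t).
    """
    total = 0j
    for k in range(samples):
        t = -pi + 2 * pi * k / samples
        w = radius * cmath.exp(1j * t)
        total += f(centre + w) * (1j * w)  # gamma'(t) = i * radius * e^{it}
    return total * (2 * pi / samples)


if __name__ == "__main__":
    for n in (-3, 0, 2):
        print(n, circle_integral(lambda z, n=n: z ** n))
    print(-1, circle_integral(lambda z: 1 / z), "expected", 2j * pi)
```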

A curve $\gamma$ is closed if $\gamma(a) = \gamma(b)$ and simple if $\gamma$ is injective except for potentially having $\gamma(a) = \gamma(b)$. Two curves $\gamma$ and $\gamma'$ with the same endpoints are homotopic in the domain $\Omega$ if one can be continuously deformed into the other while staying in $\Omega$. A loop in a domain $\Omega$ is a simple closed curve which can be continuously deformed to a single point while staying in $\Omega$. A loop divides $\mathbb{C}$ into two regions, one of which, called the interior of the loop, is finite. A loop is positively oriented if the interior of the loop is on the left when tracing the path of the loop; for example, the loop defined by $\gamma(t) = e^{it}$ as $t$ goes from $-\pi$ to $\pi$ is positively oriented. The domain $\Omega$ is simply connected if every simple closed curve is a loop.

The following results are fundamental to complex analysis and form the bedrock of our asymptotic methods. First, deforming a curve of integration does not affect an integral as long as the curve stays in a domain where the integrand is analytic.

Proposition 2.20 (Path Independence of Integration) Suppose $f$ is analytic in the domain $\Omega$. If $\gamma$ and $\gamma'$ are homotopic curves in $\Omega$ then
$$ \int_\gamma f(z)\,dz = \int_{\gamma'} f(z)\,dz. $$
In particular, if $\gamma$ is a loop in $\Omega$ then
$$ \int_\gamma f(z)\,dz = 0. $$

The relationship between a function's power series coefficients and its analytic behaviour comes from the next result.

Proposition 2.21 (Cauchy Integral Theorem) Suppose $f$ is analytic in the domain $\Omega$ and $D = \{z : |z - a| \leq r\} \subset \Omega$ for some $r > 0$. If $\gamma = \{z : |z - a| = r\}$ is the positively oriented boundary of the disk $D$, $w$ is in the interior of $D$, and $n \in \mathbb{N}$, then
$$ f^{(n)}(w) = \frac{n!}{2\pi i} \int_\gamma \frac{f(z)}{(z - w)^{n+1}}\,dz, $$
where $f^{(n)}$ denotes the nth derivative of $f$, with $f^{(0)}(z) = f(z)$.

The following bound comes from the definition of a complex integral and the triangle inequality.

Proposition 2.22 (Maximum Modulus Integral Bound) If $f(z)$ is continuous on a curve $\gamma$ then
$$ \left|\int_\gamma f(z)\,dz\right| \leq \text{length}(\gamma) \cdot \max_{z \in \gamma} |f(z)|, $$

where length(γ) is the arc length of the curve.
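Taking $w = 0$ in Proposition 2.21 expresses the $n$th power series coefficient of $f$ as $\frac{1}{2\pi i}\int_\gamma f(z) z^{-n-1}\,dz$. The sketch below approximates this integral on a circle inside the disk of convergence and recovers the coefficients $2^{n-1}$ of the compositions generating function from Example 2.30. The function names, the radius $1/4$ and the sample count are illustrative assumptions.

```python
import cmath
from math import pi


def coefficient(f, n, radius, samples=256):
    """Approximate [z^n] f via Cauchy's formula on the circle |z| = radius."""
    total = 0j
    for k in range(samples):
        z = radius * cmath.exp(2j * pi * k / samples)
        # With z = radius * e^{it} we have dz = i z dt, so
        # f(z) / z^(n+1) dz = i * f(z) / z^n dt.
        total += f(z) / z ** n
    # The factors i and 2*pi cancel against the 1 / (2*pi*i) in the formula.
    return total / samples


def compositions_gf(z):
    """C(z) = z / (1 - 2z), analytic for |z| < 1/2."""
    return z / (1 - 2 * z)


if __name__ == "__main__":
    for n in range(6):
        print(n, coefficient(compositions_gf, n, radius=0.25).real)
```

Proposition 2.22 applied to the same integral gives the bound $|c_n| \leq \max_{|z| = r}|f(z)| / r^n$ for any admissible radius $r$.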


Singular Behaviour
Roughly speaking, a singularity of a function $f(z)$ is a point near which $f$ behaves badly. Examining local behaviour of a function around such points will be key to our asymptotic methods. For our purposes we need only consider a few common types of singularities.

1. Isolated Singularities
The open annulus of radii $0 \leq r < R \leq \infty$ centred at $z = a$ is the set $A_{r,R}(a) = \{z \in \mathbb{C} : r < |z - a| < R\}$, and a punctured disk centred at $a$ is an open annulus centred at $a$ with $r = 0$. We say that $z = a$ is an isolated singularity if there exists a punctured disk centred at $a$ where $f$ is analytic, but $f$ is not analytic at $a$ and cannot be made analytic solely by changing (or defining for the first time) the value of $f(a)$.

Proposition 2.23 Let $a \in \mathbb{C}$, let $0 \leq r < R \leq \infty$, and suppose $f$ is analytic in the annulus $A_{r,R}(a)$. Then $f$ can be represented by a unique Laurent expansion
$$ f(z) = \sum_{n=-\infty}^{\infty} c_n (z - a)^n = \sum_{n=1}^{\infty} c_{-n} (z - a)^{-n} + \sum_{n=0}^{\infty} c_n (z - a)^n \qquad (2.16) $$
for all $z \in A_{r,R}(a)$, with uniform convergence on any compact subset of $A_{r,R}(a)$.

Suppose the expansion (2.16) is valid in a punctured disk around $z = a$. When $c_n = 0$ for $n < 0$ in (2.16) then $f$ is analytic at $z = a$ or can be made analytic there by defining $f(a) = c_0$. Conversely, if there exists $n < 0$ with $c_n \neq 0$ then $z = a$ is an isolated singularity of $f$. The coefficient $c_{-1}$ of the expansion (2.16) is called the residue of $f$ at $z = a$ and denoted $\operatorname{Res}_{z=a} f(z)$. Note that unlike formal Laurent series, convergent Laurent expansions can have an arbitrary number of terms with negative exponents. There are thus two types of isolated singularities, characterized by the coefficients $c_n$ in (2.16) which vanish.

• If there exists $M > 0$ such that $c_{-M} \neq 0$ but $c_n = 0$ for all $n < -M$, then we say that $a$ is a pole of order $M$. A pole of order 1 is called a simple pole. When $f$ is analytic or has a pole at $z = a$ we say that $f$ is meromorphic at $z = a$, and there exists a punctured disk $D$ centred at $a$ and analytic functions $g(z)$ and $h(z)$ such that $f(z) = g(z)/h(z)$ for $z \in D$. When $g$ vanishes to order $s$ at $a \in \Omega$ and $h$ vanishes to order $r$ at $a$, i.e., if $g(a) = g'(a) = \cdots = g^{(s)}(a) = h(a) = h'(a) = \cdots = h^{(r)}(a) = 0$ but $g^{(s+1)}(a)$ and $h^{(r+1)}(a)$ are nonzero, then $z = a$ is a singularity of $f(z)$ if and only if $r > s$, in which case $z = a$ is a pole of order $M = r - s$. We say $f$ is meromorphic in a domain $\Omega$ if it is meromorphic at each point of $\Omega$; a ratio of analytic functions in $\Omega$ is always meromorphic in $\Omega$. Near a polar


singularity, $f$ behaves like a rational function whose denominator is going to zero. The following result, useful for calculating residues, follows from (2.16) and repeated term-by-term differentiation.

Lemma 2.4 When $f(z)$ has a pole of order $M$ at $z = a$ then
$$ \operatorname{Res}_{z=a} f(z) = \frac{1}{(M - 1)!} \lim_{z \to a} \frac{d^{M-1}}{dz^{M-1}} (z - a)^M f(z). $$
If $f(z) = g(z)/h(z)$ with $h(a) = 0$ but $h'(a), g(a) \neq 0$ then $z = a$ is a simple pole of $f(z)$ and $\operatorname{Res}_{z=a} f(z) = g(a)/h'(a)$.

• If there exist an infinite number of indices $n < 0$ such that $c_n \neq 0$ in (2.16) then $f$ has an essential singularity at $z = a$. Behaviour near an essential singularity is more complicated than behaviour near a pole: Picard's Great Theorem states that in any neighbourhood of an essential singularity $f(z)$ takes on every complex value, except possibly one, infinitely many times. An example of an essential singularity is the function $f(z) = e^{1/z}$ at $z = 0$, which takes on every value but zero infinitely many times in any neighbourhood of the origin.

Example 2.32 (Residues of Tangent) The function $\tan(z) = \sin(z)/\cos(z)$ is meromorphic in $\Omega = \mathbb{C}$ and its singularities form the set $S = \{(k + 1/2)\pi : k \in \mathbb{Z}\}$ of zeroes of $\cos(z)$. The numerator $\sin(z)$ and the derivative $\cos'(z) = -\sin(z)$ are non-zero when $z \in S$, so every singularity of $\tan(z)$ is a simple pole and Lemma 2.4 implies
$$ \operatorname{Res}_{z=(k+1/2)\pi} \tan(z) = \frac{\sin((k + 1/2)\pi)}{-\sin((k + 1/2)\pi)} = -1. $$

Residue computations can be used to evaluate integrals of meromorphic functions.

Proposition 2.24 (Cauchy Residue Theorem) Let $f(z)$ be a meromorphic function on a domain $\Omega \subset \mathbb{C}$, and let $\gamma$ be a positively oriented loop in $\Omega$ on which $f$ is analytic. If $a_1, \ldots, a_r$ denote the poles of $f(z)$ interior to $\gamma$ then
$$ \frac{1}{2\pi i} \int_\gamma f(z)\,dz = \sum_{j=1}^{r} \operatorname{Res}_{z=a_j} f(z). $$
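The residue theorem can be checked numerically on the tangent function, whose poles at $(k + 1/2)\pi$ each have residue $-1$. The sketch below integrates over circles $|z| = \rho$ centred at the origin; the function name and the radii used are illustrative assumptions.

```python
import cmath
from math import pi


def normalized_contour_integral(f, radius, samples=4096):
    """(1 / (2 pi i)) times the integral of f over the circle |z| = radius,
    traversed once in the positive direction."""
    total = 0j
    for k in range(samples):
        z = radius * cmath.exp(2j * pi * k / samples)
        total += f(z) * z  # f(z) dz / (2 pi i) = f(z) * z * dt / (2 pi)
    return total / samples


if __name__ == "__main__":
    # |z| = 1 encloses no poles, |z| = 2 encloses +-pi/2,
    # and |z| = 5 encloses +-pi/2 and +-3pi/2.
    for radius in (1.0, 2.0, 5.0):
        print(radius, normalized_contour_integral(cmath.tan, radius))
```

The three values should be close to $0$, $-2$ and $-4$, the sums of the residues at the enclosed poles.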

2. Algebraic and Logarithmic Branch Points
A function with an isolated singularity at $z = a$ is not analytic at $a$ but does have a convergent series expansion in some punctured disk around $a$. When dealing with


algebraic and D-finite generating functions we must consider more general singular behaviour, arising from a need to take inverses of non-injective functions. Example 2.33 (Square-Root Branches) Because the function z → z 2 is not injective, one must carefully define an inverse √ function z → z. Given z ∈ C \ R ≤0 , the principal argument Arg(z) is obtained by writing z = reiθ for real r and −π < θ < π, and taking Arg(z) = θ. The principal branch square-root is the analytic function f : C \ R ≤0 → C defined by  √ f (z) = z = |z| ei Arg(z)/2,  where |z| is the usual square-root on R>0 . One cannot extend f (z) to an analytic function on any punctured disk around the origin as the limit of f does not exist at any negative real number:  √    √ √ √ lim f εeiθ = εeiπ/2 = i ε  −i ε = εe−iπ/2 = lim f εeiθ θ→π

θ→−π

for any ε > 0, while εeiπ = −ε = εe−iπ . Removing the set R ≤0 from the domain of f to obtain an analytic function is called making a branch cut. In fact, removing any simple curve from the origin to infinity gives a valid branch cut and any continuous choice of the argument function on the branch cut defines an analytic branch of the square-root. For any r ∈ Q \ Z the branches of zr are defined by an analogous process.
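The jump of the principal square-root across the negative real axis in Example 2.33 can be observed numerically. The sketch below is illustrative only: the name `principal_sqrt` is hypothetical, and it relies on `cmath.phase`, which returns the argument in $[-\pi, \pi]$.

```python
import cmath
import math


def principal_sqrt(z):
    """Principal branch square-root sqrt(|z|) * exp(i * Arg(z) / 2)."""
    return math.sqrt(abs(z)) * cmath.exp(1j * cmath.phase(z) / 2)


eps = 0.25      # modulus of the points approaching the negative real axis
delta = 1e-9    # small angular offset from the branch cut

# Approach the point -eps from above (theta -> pi) and from below (theta -> -pi).
above = principal_sqrt(eps * cmath.exp(1j * (math.pi - delta)))
below = principal_sqrt(eps * cmath.exp(-1j * (math.pi - delta)))

print(above)  # close to  i * sqrt(eps)
print(below)  # close to -i * sqrt(eps)
```

The two limits differ by a sign, which is why no analytic extension across the cut exists.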

Example 2.34 (Logarithm Branches) Similarly, the principal branch logarithm is the function defined by $\operatorname{Log}(z) = \log |z| + i \operatorname{Arg}(z)$ for $z \in \mathbb{C} \setminus \mathbb{R}_{\leq 0}$, where $\log |z|$ is the usual logarithm on $\mathbb{R}_{>0}$ and $\operatorname{Arg}(z)$ is the principal argument of $z$. Different branches of the logarithm can be defined by removing any simple curve from the origin to infinity and selecting any continuous choice of the argument function after taking the branch cut. At any $z \in \mathbb{C}$, all branches of the logarithm which are defined at $z$ differ by some integer multiple of $2\pi i$.

A cut disk around a point $a \in \mathbb{C}$ is a disk around $a$ with a simple curve from $a$ to the boundary of the disk removed. The function $f(z)$ has an algebraic singularity or algebraic branch point at $a \in \mathbb{C}$ if it can be represented by a convergent series expansion of the form

$$f(z) = \sum_{n \geq N} c_n \left((z - a)^{1/R}\right)^n$$

in a cut disk $D$ centred at $a$, where $N \in \mathbb{Z}$, $R \in \mathbb{N}$, the map $t \mapsto t^{1/R}$ is defined by a consistent branch in $D$, and there exists $c_n \neq 0$ where $n/R \notin \mathbb{Z}$. Equivalently, $f(z)$ is not analytic at $z = a$ but $f(z^R + a)$ is meromorphic at the origin for some $R \in \mathbb{N}_{>1}$. The function $f(z)$ has a logarithmic singularity or logarithmic branch point at $a \in \mathbb{C}$ if $f(z)$ can be represented by a convergent series expansion of the form

$$(z - a)^N \sum_{j=0}^{d} C_j(z) \left(\log(z - a)\right)^j$$

in a cut disk $D$ centred at $a$, where $N \in \mathbb{Z}$, $d > 0$, the logarithm is defined by a consistent branch in $D$, and each $C_j(z)$ is analytic at $a$. Finally, an algebro-logarithmic singularity is a combination of algebraic and logarithmic singularities, that is, a point where $f(z)$ can be locally represented in a cut disk by a series of terms containing natural number powers of logarithms and $R$th roots for some positive integer $R > 1$.
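Returning to Example 2.34, the statement that branches of the logarithm differ by integer multiples of $2\pi i$ can be illustrated with a short sketch. The helper `log_branch` is a hypothetical name introduced here; it shifts the principal value returned by `cmath.log`, and every shifted value is still a logarithm of the same point.

```python
import cmath
import math


def log_branch(z, k):
    """Value at z of the logarithm obtained by adding 2*pi*i*k to the principal value."""
    return cmath.log(z) + 2j * math.pi * k


z = complex(-1.0, 2.0)
for k in range(-2, 3):
    value = log_branch(z, k)
    # Exponentiating any of these values recovers z, so each is a logarithm of z.
    print(k, value, cmath.exp(value))
```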

3. General Singularities

More generally, let $\gamma$ be a loop and $f(z)$ be an analytic function on the interior of $\gamma$. We say that $a \in \gamma$ is a singularity of $f(z)$ if $f(z)$ cannot be analytically continued to a neighbourhood containing $a$. When discussing singularities of a generating function, the interior of the loop $\gamma$ should contain the origin. This definition contains the singularities discussed above, and many more.

Example 2.35 (Natural Boundaries) Problem 2.18 asks you to prove that the function $F(z) = \sum_{n \geq 0} z^{2^n}$ defined for $|z| < 1$ cannot be analytically continued outside the unit circle, meaning every point on the unit circle is a singularity of $F$ (we say the unit circle is a natural boundary of $F$).
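The growth behind Example 2.35 can be observed numerically, without replacing the proof requested in Problem 2.18. The sketch below, with the hypothetical helper `lacunary_sum`, evaluates a truncation of $F(z) = \sum_{n \geq 0} z^{2^n}$ by repeated squaring and approaches two points of the unit circle radially; the stopping tolerance is an arbitrary choice.

```python
import cmath
import math


def lacunary_sum(z, tol=1e-17):
    """Sum z**(2**n) over n >= 0 by repeated squaring, stopping once terms are negligible."""
    if abs(z) >= 1:
        raise ValueError("the series only converges for |z| < 1")
    total = 0j
    term = complex(z)
    while abs(term) > tol:
        total += term
        term *= term  # z**(2**n) -> z**(2**(n+1))
    return total


# Approach zeta = 1 and zeta = exp(2*pi*i*3/8), an 8th root of unity, along radii.
for zeta in (1, cmath.exp(2j * math.pi * 3 / 8)):
    for r in (0.9, 0.999, 0.999999):
        print(zeta, r, abs(lacunary_sum(r * zeta)))
```

In both cases the modulus keeps increasing as $r \to 1$, consistent with $F$ being unbounded near these boundary points.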

The Gamma Function

The Euler gamma function $\Gamma(z)$, arising in our asymptotic results on algebraic and D-finite functions, is defined by $\Gamma(z) = \int_0^\infty e^{-t} t^{z-1}\,dt$ when the real part $\Re(z) > 0$ (so the integral converges) and is uniquely extended to the complex plane with the non-positive integers removed by analytic continuation; it has a simple pole at each non-positive integer. Integration by parts shows that $\Gamma(z + 1) = z\Gamma(z)$ and $\Gamma(1) = \int_0^\infty e^{-t}\,dt = 1$, so $\Gamma(n+1) = n!$ for any $n \in \mathbb{N}$. The gamma function can thus be viewed as an analytic extension of the factorial function to non-integers. In fact, the gamma function is the unique analytic extension subject to the additional constraint that $\log(\Gamma(x))$ is convex for $x > 0$. The value $\Gamma(1/2) = \int_0^\infty t^{-1/2} e^{-t}\,dt = \sqrt{\pi}$ and the recurrence satisfied by $\Gamma(z)$ allow one to calculate the gamma function at any half-integer. Stirling’s formula gives an asymptotic expansion

$$\Gamma(z) \sim z^{z-1/2} e^{-z} \sqrt{2\pi} \left(1 + \frac{z^{-1}}{12} + \frac{z^{-2}}{288} + \cdots\right)$$

as $|z| \to \infty$ with the argument of $z$ fixed in $(-\pi, \pi)$. Proofs of these results, and further details, can be found in Andrews et al. [3, Ch. 1].
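The properties of the gamma function listed above can be checked numerically with Python's standard `math.gamma`. This is only a sketch for real arguments: the helpers `gamma_half_integer` and `stirling_gamma` are hypothetical names, and the Stirling expansion is truncated after the three terms displayed above.

```python
import math


def gamma_half_integer(n):
    """Gamma(n + 1/2) for an integer n >= 0, from Gamma(1/2) = sqrt(pi) and Gamma(z+1) = z*Gamma(z)."""
    value = math.sqrt(math.pi)
    for j in range(n):
        value *= j + 0.5
    return value


def stirling_gamma(x):
    """Stirling approximation of Gamma(x) for real x > 0, truncated after the 1/(288 x^2) term."""
    correction = 1 + 1 / (12 * x) + 1 / (288 * x**2)
    return x ** (x - 0.5) * math.exp(-x) * math.sqrt(2 * math.pi) * correction


print(math.gamma(6), math.factorial(5))          # Gamma(n + 1) = n!
print(gamma_half_integer(3), math.gamma(3.5))    # half-integer values via the recurrence
print(stirling_gamma(10.0) / math.gamma(10.0))   # ratio approaches 1 as x grows
```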

Problems

2.1 Prove Lemma 2.1.

2.2 For $F = \sum_{n \geq 0} f_n z^n$ and $G = \sum_{n \geq 0} g_n z^n$ in $K[[z]]$, define $d(F, G) = 2^{-\min\{n : f_n \neq g_n\}}$, where $d(F, G) = 0$ if $F = G$. Prove that $d$ satisfies the strong triangle inequality,

$$d(F, H) \leq \max\{d(F, G), d(G, H)\} \quad \text{for all } F, G, H \in K[[z]].$$

Since $d$ is symmetric, and $d(F, G) \geq 0$ with equality if and only if $F = G$, this implies $d$ defines a non-archimedean metric on $K[[z]]$.

2.3 A sequence of formal power series $(F_n)$ in $K[[z]]$ is called Cauchy if for every $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that $d(F_n, F_m) < \varepsilon$ whenever $n, m \geq N$. Prove that every Cauchy sequence in $K[[z]]$ converges to an element of $K[[z]]$ under the metric $d$. This implies $K[[z]]$ forms a complete non-archimedean metric space.

2.4 Suppose $F(z) = \sum_{n \geq 0} f_n z^n$ and $G(z) = \sum_{n \geq 0} g_n z^n$ are elements of $K[[z]]$ with $g_0 = 0$. Prove that the sequence $(S_N)$ in $K[[z]]$ defined by $S_N = \sum_{k=0}^{N} f_k G(z)^k$ converges in $K[[z]]$. We define the composition $F(G(z))$ as the limit of this sequence, $F(G(z)) = \lim_{N \to \infty} S_N = \sum_{k \geq 0} f_k G(z)^k$.

2.5 Suppose $F(z) = \sum_{n \geq 0} f_n z^n \in K[[z]]$ with $f_0 \neq 0$. Show that $F(z) = f_0 - zG(z)$ for some $G \in K[[z]]$, and prove that the infinite series $\sum_{k \geq 0} z^k G(z)^k / f_0^{k+1}$ converges to $I(z) \in K[[z]]$ satisfying $F(z)I(z) = 1$.

2.6 Prove that the set of formal series in $\mathbb{Q}[[z]]$ which define analytic functions at the origin is dense in $\mathbb{Q}[[z]]$. In other words, show that for any $F \in \mathbb{Q}[[z]]$ there is a sequence of analytic power series whose limit in the metric space $\mathbb{Q}[[z]]$ is $F$.

2.7 Prove that the coefficient of $z^{100}$ in the rational function

$$F(z) = \frac{1}{(1 - z)(1 - z^5)(1 - z^{10})(1 - z^{25})}$$

gives the number of ways to make change for a dollar using pennies (1 cent), nickels (5 cents), dimes (10 cents), and quarters (25 cents), and find that coefficient. Find a recurrence for the number of ways $f_n$ to make change for $n$ cents.

2.8 The Lagrange inversion formula states that if $u, R \in \mathbb{Q}[[z]]$ satisfy $u(0) = 0$ and $z = u(z)/R(u(z))$ then $[z^n]u(z) = \frac{1}{n}[t^{n-1}]R(t)^n$ for $n > 0$. Prove this by justifying the calculation

$$n[z^n]u(z) = [z^{n-1}]u'(z) = \frac{1}{2\pi i} \int \frac{u'(z)}{z^n}\,dz = \frac{1}{2\pi i} \int \frac{R(u)^n}{u^n}\,du = [t^{n-1}]R(t)^n.$$

Start by showing you may assume $R$ and $u$ are polynomials. Use the Lagrange inversion formula to find the coefficients of the power series $y(z)$ satisfying $y = z + zy^2$, which counts rooted trees where each node is a leaf or has two children.

2.9 Let $A \subset \mathbb{Q}[[z]]$ denote the set of series expansions of rational functions with natural number coefficients such that if $\sigma$ is a dominant singularity then $\sigma/|\sigma|$ is a root of unity. Prove that for any $a, b \in A$ the functions $a(z) + b(z)$, $a(z)b(z)$, and $1/(1 - za(z))$ are in $A$, establishing Proposition 2.6. Hint: For the final construction suppose the dominant singularities of $a \in A$ have modulus $p > 0$ and let $c(z) = 1/(1 - za(z))$. Show that the equation $za(z) = 1$ has a unique root $r \in (0, p)$, and that $z = r$ is a dominant singularity of $c(z)$. Conclude that any dominant singularity $\theta$ of $c(z)$ satisfies $\theta a(\theta) = 1 = ra(r)$ so $\theta^k = r^k = |\theta|^k$ for some $k \in \mathbb{N}$.

2.10 Prove that the series $F(z) = \sum_{n \geq 0} \binom{2n}{n}^{\kappa} z^n$ is transcendental over $\mathbb{Q}(z)$ for any natural number $\kappa \geq 2$.

2.11 Prove that if $(f_n)$ satisfies a P-recursive equation of order $r$ and degree $d$ then $F(z) = \sum_{n \geq 0} f_n z^n \in K[[z]]$ satisfies a D-finite equation of order at most $d$ and degree at most $r + d$.

2.12 Let $a_n$ be the number of alternating permutations (satisfying $\pi_1 > \pi_2 < \pi_3 > \pi_4 < \cdots$) when $n$ is odd and zero when $n$ is even. Let

$$T(z) = \sum_{n \geq 0} \frac{a_n}{n!} z^n = \sum_{k \geq 0} \frac{a_{2k+1}}{(2k+1)!} z^{2k+1}.$$

• By considering the location of 1 in a permutation, prove that for all $k \geq 1$

$$a_{2k+1} = \sum_{\substack{1 \leq j \leq 2k \\ j \text{ odd}}} \binom{2k}{j} a_j a_{2k-j}.$$

• Using this recurrence, show that $T'(z) = T(z)^2 + 1$.

• Solve the differential equation to conclude $T(z) = \tan z$.

2.13 For any positive integer $B$ let $S_B$ denote the square in the complex plane with corners $\pm B\pi \pm B\pi i$. Using the hyperbolic trigonometric functions and the fact that

$$|\tan(x + iy)|^2 = \frac{\tan(x)^2 + \tanh(y)^2}{1 + \tan(x)^2 \tanh(y)^2} = \frac{\cosh(y)^2 - \cos(x)^2}{\cosh(y)^2 + \cos(x)^2 - 1}$$

for all $x, y \in \mathbb{R}$, prove that if $z \in S_B$ for $B \in \mathbb{N}$ then $|\tan z| \leq \frac{1}{\tanh(\pi)} < 1.1$. Adapt the argument in Section 2.1.1 to prove that for any $n \in \mathbb{N}$ and positive integer $B$ the number $a_n$ of alternating permutations satisfies

$$\frac{a_n}{n!} = 2\left(\frac{2}{\pi}\right)^{n+1} \sum_{k=0}^{B-1} \frac{1}{(2k + 1)^{n+1}} + O\left((\pi B)^{-n}\right).$$

2.14 Let $A$ be a commutative ring and

$$P(z) = p_n z^n + \cdots + p_1 z + p_0, \qquad Q(z) = q_m z^m + \cdots + q_1 z + q_0$$

be polynomials of degree $n$ and $m$ in $A[z]$. The Sylvester matrix $S(P, Q)$ of $P$ and $Q$ is the $(m + n) \times (m + n)$ matrix obtained by repeating $m$ times the vector $(p_n, \dots, p_0)$ with each copy shifted once over, then repeating $n$ times the vector $(q_m, \dots, q_0)$ with each copy shifted once over. For instance, if $n = 4$ and $m = 3$ then

$$S(P, Q) = \begin{pmatrix}
p_4 & p_3 & p_2 & p_1 & p_0 & 0 & 0 \\
0 & p_4 & p_3 & p_2 & p_1 & p_0 & 0 \\
0 & 0 & p_4 & p_3 & p_2 & p_1 & p_0 \\
q_3 & q_2 & q_1 & q_0 & 0 & 0 & 0 \\
0 & q_3 & q_2 & q_1 & q_0 & 0 & 0 \\
0 & 0 & q_3 & q_2 & q_1 & q_0 & 0 \\
0 & 0 & 0 & q_3 & q_2 & q_1 & q_0
\end{pmatrix}.$$

Define the resultant $\operatorname{Resultant}_z(P, Q)$ of $P$ and $Q$ as the determinant of $S(P, Q)$.

a) Show that $\operatorname{Resultant}_z(P, Q)$ lies in $A$.

b) Prove that if $P$ and $Q$ share a root $\alpha$ then $\operatorname{Resultant}_z(P, Q) = 0$. Hint: It is sufficient to prove the existence of a non-zero vector $x$ such that $S(P, Q)x = 0$.

c) Suppose that

$$P(z) = a(z - \alpha_1) \cdots (z - \alpha_n) \qquad \text{and} \qquad Q(z) = b(z - \beta_1) \cdots (z - \beta_m)$$

for $\alpha_i, \beta_j \in \mathbb{C}$. Prove that $\operatorname{Resultant}_z(P, Q) = R$ where $R = a^m b^n \prod_{i,j} (\alpha_i - \beta_j)$. Hint: Consider $\operatorname{Resultant}_z(P, Q)$ and $R$ to be multivariate polynomials in new variables $\alpha_i$ and $\beta_j$. Show that $R$ divides the resultant and consider their degrees.

2.15 Suppose $a(x)$ and $b(x)$ are algebraic series satisfying $P(x, a(x)) = Q(x, b(x)) = 0$ for non-zero $P(x, y), Q(x, y) \in \mathbb{Q}[x, y]$, and let $d = \deg_y(P)$. Prove that

• $y = a(x) + b(x)$ is a root of $\operatorname{Resultant}_z(P(x, y - z), Q(x, z)) \in \mathbb{Q}[x, y]$

• $y = a(x)b(x)$ is a root of $\operatorname{Resultant}_z(z^d P(x, y/z), Q(x, z)) \in \mathbb{Q}[x, y]$

• when $a(0) \neq 0$ then $y = 1/a(x)$ is a root of $\operatorname{Resultant}_z(P(x, z), 1 - yz) \in \mathbb{Q}[x, y]$

2.16 Use Cauchy’s integral formula and the maximum modulus integral bound to show that if $f(z)$ is analytic in a disk $|z| \leq R + \varepsilon$ then it is represented at the origin by a power series which converges in $|z| \leq R$. Conclude that a power series admits a singularity on the boundary of its domain of convergence.

2.17 What is wrong with the “proof”

$$1 = \sqrt{1} = \sqrt{(-1)(-1)} = \sqrt{-1}\,\sqrt{-1} = \left(\sqrt{-1}\right)^2 = -1$$

2.18 Let $F(z) = \sum_{n \geq 0} z^{2^n}$ and $\zeta \in \mathbb{C}$ with $|\zeta| = 1$. Show that $F(z)$ is unbounded in any neighbourhood of $\zeta$ inside the unit circle, and conclude that $F$ cannot be analytically continued outside the unit circle.
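The Sylvester matrix construction described in Problem 2.14 can be sketched in code, which may help when experimenting with parts b) and c). The function names `sylvester_matrix`, `determinant` and `resultant` are hypothetical, polynomials are assumed to be given as coefficient lists with the leading coefficient first, and the determinant is computed by exact Gaussian elimination over the rationals rather than over a general commutative ring.

```python
from fractions import Fraction


def sylvester_matrix(p, q):
    """Sylvester matrix of P and Q given as lists [p_n, ..., p_0] and [q_m, ..., q_0]."""
    n, m = len(p) - 1, len(q) - 1
    size = n + m
    rows = []
    for shift in range(m):  # m shifted copies of the coefficients of P
        rows.append([0] * shift + list(p) + [0] * (size - n - 1 - shift))
    for shift in range(n):  # n shifted copies of the coefficients of Q
        rows.append([0] * shift + list(q) + [0] * (size - m - 1 - shift))
    return rows


def determinant(matrix):
    """Exact determinant by Gaussian elimination with rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det


def resultant(p, q):
    """Resultant of P and Q as the determinant of their Sylvester matrix."""
    return determinant(sylvester_matrix(p, q))


# P = (z - 1)(z - 2) shares the root 1 with Q = z - 1 but no root with Q = z - 3.
print(resultant([1, -3, 2], [1, -1]))
print(resultant([1, -3, 2], [1, -3]))
```

With $P = 2(z-1)(z+1)$ and $Q = 3(z-2)$ one can compare the output of `resultant([2, 0, -2], [3, -6])` against the product formula in part c).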

References

1. Désiré André. Développements de séc x et de tang x. Comptes rendus de l’Académie des sciences, 88:965–967, 1879.
2. Yves André. G-functions and geometry. Aspects of Mathematics, E13. Friedr. Vieweg & Sohn, Braunschweig, 1989.
3. George E. Andrews, Richard Askey, and Ranjan Roy. Special functions, volume 71 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1999.
4. Cyril Banderier and Michael Drmota. Formulae and asymptotics for coefficients of algebraic functions. Combin. Probab. Comput., 24(1):1–53, 2015.
5. E. Barcucci, A. Del Lungo, A. Frosini, and S. Rinaldi. A technology for reverse-engineering a combinatorial problem from a rational generating function. Adv. in Appl. Math., 26(2):129–153, 2001.
6. F. Bergeron, G. Labelle, and P. Leroux. Combinatorial species and tree-like structures, volume 67 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1998.
7. François Bergeron and Christophe Reutenauer. Combinatorial resolution of systems of differential equations. III. A special class of differentially algebraic series. European J. Combin., 11(6):501–512, 1990.
8. Daniel Bernoulli. Observationes de seriebus quae formantur ex additione vel subtractione quacunque terminorum se mutuo consequentium. Commentarii Academiae Scientiarum Imperialis Petropolitanae, 3:85–100, 1728.
9. Jean Berstel. Sur les pôles et le quotient de Hadamard de séries N-rationnelles. C. R. Acad. Sci. Paris Sér. A-B, 272:A1079–A1081, 1971.
10. Jean Berstel and Maurice Mignotte. Deux propriétés décidables des suites récurrentes linéaires. Bull. Soc. Math. France, 104(2):175–184, 1976.
11. George D. Birkhoff. Formal theory of irregular linear difference equations. Acta Math., 54(1):205–246, 1930.
12. George D. Birkhoff and W. J. Trjitzinsky. Analytic theory of singular difference equations. Acta Math., 60(1):1–89, 1933.
13. Alin Bostan, Mireille Bousquet-Mélou, Manuel Kauers, and Stephen Melczer. On 3-dimensional lattice walks confined to the positive octant. Annals of Combinatorics, 20(4):661–704, 2016.
14. Alin Bostan, Frédéric Chyzak, Marc Giusti, Romain Lebreton, Grégoire Lecerf, Bruno Salvy, and Éric Schost. Algorithmes Efficaces en Calcul Formel. Frédéric Chyzak (auto-édit.), Palaiseau, September 2017.
15. Alin Bostan, Frédéric Chyzak, Bruno Salvy, Grégoire Lecerf, and Éric Schost. Differential equations for algebraic functions. In ISSAC 2007, pages 25–32. ACM, New York, 2007.
16. Alin Bostan, Kilian Raschel, and Bruno Salvy. Non-D-finite excursions in the quarter plane. J. Combin. Theory Ser. A, 121:45–63, 2014.
17. Mireille Bousquet-Mélou. Walks in the quarter plane: Kreweras’ algebraic model. Ann. Appl. Probab., 15(2):1451–1491, 2005.

18. Mireille Bousquet-Mélou. Rational and algebraic series in combinatorial enumeration. In International Congress of Mathematicians. Vol. III, pages 789–826. Eur. Math. Soc., Zürich, 2006.
19. Sean Carrell and Guillaume Chapuy. A simple recurrence formula for the number of rooted maps on surfaces by edges and genus. In 26th International Conference on Formal Power Series and Algebraic Combinatorics (FPSAC 2014), Discrete Math. Theor. Comput. Sci. Proc., AT, pages 573–584. Assoc. Discrete Math. Theor. Comput. Sci., Nancy, 2014.
20. Augustin Louis Cauchy. Cours d’analyse de l’École royale polytechnique. I. Analyse algébrique. Debure frères, Paris, 1821.
21. Cyril Chabaud. Séries génératrices algébriques : asymptotique et applications combinatoires. PhD thesis, Université Pierre et Marie Curie (UPMC), 2002.
22. Shaoshi Chen and Manuel Kauers. Some open problems related to creative telescoping. J. Syst. Sci. Complex., 30(1):154–172, 2017.
23. N. Chomsky and M. P. Schützenberger. The algebraic theory of context-free languages. In Computer programming and formal systems, pages 118–161. North-Holland, Amsterdam, 1963.
24. D. V. Chudnovsky and G. V. Chudnovsky. Applications of Padé approximations to Diophantine inequalities in values of G-functions. In Number theory (New York, 1983–84), volume 1135 of Lecture Notes in Math., pages 9–51. Springer, Berlin, 1985.
25. D. V. Chudnovsky and G. V. Chudnovsky. On expansion of algebraic functions in power and Puiseux series. I. J. Complexity, 2(4):271–294, 1986.
26. David V. Chudnovsky and Gregory V. Chudnovsky. Computer algebra in the service of mathematical physics and number theory. In Computers in mathematics (Stanford, CA, 1986), volume 125 of Lecture Notes in Pure and Appl. Math., pages 109–232. Dekker, New York, 1990.
27. Frédéric Chyzak. The ABC of Creative Telescoping — Algorithms, Bounds, Complexity. Habilitation à diriger des recherches (hdr), University Paris-Sud 11, April 2014.
28. Andrew R. Conway and Anthony J. Guttmann. On 1324-avoiding permutations. Adv. in Appl. Math., 64:50–69, 2015.
29. J. H. Conway. The weird and wonderful chemistry of audioactive decay. In Thomas M. Cover and B. Gopinath, editors, Open Problems in Communication and Computation, pages 173–188. Springer New York, New York, NY, 1987.
30. G. Darboux. Mémoire sur l’approximation des fonctions de très-grands nombres, et sur une classe étendue de développements en série. Journal de mathématiques pures et appliquées 3e série, 4:5–56, 1878.
31. N. G. de Bruijn. Asymptotic methods in analysis. Bibliotheca Mathematica. Vol. 4. North-Holland Publishing Co., Amsterdam; P. Noordhoff Ltd., Groningen; Interscience Publishers Inc., New York, 1958.
32. A. de Moivre. The Doctrine of Chances or a Method of Calculating the Probabilities of Events in Play. London, 1718.
33. A. de Moivre. Miscellanea analytica de seriebus et quadraturis. J. Tomson and J. Watts, London, 1730.
34. M. Delest. Algebraic languages: a bridge between combinatorics and computer science. In Formal power series and algebraic combinatorics (New Brunswick, NJ, 1994), volume 24 of DIMACS Ser. Discrete Math. Theoret. Comput. Sci., pages 71–87. Amer. Math. Soc., Providence, RI, 1996.
35. J. Denef and L. Lipshitz. Decision problems for differential equations. J. Symbolic Logic, 54(3):941–950, 1989.
36. Timothy DeVries. A case study in bivariate singularity analysis. In Algorithmic probability and combinatorics, volume 520 of Contemp. Math., pages 61–81. Amer. Math. Soc., Providence, RI, 2010.
37. R. J. Duffin. Rubel’s universal differential equation. Proc. Nat. Acad. Sci. U.S.A., 78(8, part 1):4661–4662, 1981.
38. David S. Dummit and Richard M. Foote. Abstract algebra. John Wiley & Sons, Inc., Hoboken, NJ, third edition, 2004.

39. L. Euler. Institutiones calculi differentialis cum eius usu in analysi finitorum ac doctrina serierum. Berolini, 1755.
40. Leonhard Euler. Introductio in analysin infinitorum. 2 Volumes. Lausannae: Apud Marcum-Michaelem Bousquet & socios., 1748.
41. Graham Everest, Alf van der Poorten, Igor Shparlinski, and Thomas Ward. Recurrence sequences, volume 104 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI, 2003.
42. E. Fabry. Sur les intégrales des équations différentielles linéaires à coefficients rationnels. PhD thesis, Thèse de Doctorat ès sciences mathématiques, Faculté des Sciences de Paris N. 544, 1885.
43. Mary Celine Fasenmyer. A note on pure recurrence relations. Amer. Math. Monthly, 56:14–17, 1949.
44. Giovanni Ferraro. The rise and development of the theory of series up to the early 1820s. Sources and Studies in the History of Mathematics and Physical Sciences. Springer, New York, 2008.
45. Stéphane Fischler and Tanguy Rivoal. On the values of G-functions. Comment. Math. Helv., 89(2):313–341, 2014.
46. Philippe Flajolet. Analytic models and ambiguity of context-free languages. Theoret. Comput. Sci., 49(2–3):283–309, 1987.
47. Philippe Flajolet, Stefan Gerhold, and Bruno Salvy. On the non-holonomic character of logarithms, powers, and the nth prime function. Electron. J. Combin., 11(2):Article 2, 16, 2004/06.
48. Philippe Flajolet and Andrew Odlyzko. Singularity analysis of generating functions. SIAM J. Discrete Math., 3(2):216–240, 1990.
49. Philippe Flajolet, Bruno Salvy, and Paul Zimmermann. Automatic average-case analysis of algorithms. Theoret. Comput. Sci., 79(1, (Part A)):37–109, 1991.
50. Philippe Flajolet and Robert Sedgewick. Analytic combinatorics. Cambridge University Press, Cambridge, 2009.
51. G. Frobenius. Ueber die Integration der linearen Differentialgleichungen durch Reihen. J. Reine Angew. Math., 76:214–235, 1873.
52. L. Fuchs. Zur Theorie der linearen Differentialgleichungen mit veränderlichen Coefficienten. (Ergänzungen zu der im 66sten Bande dieses Journals enthaltenen Abhandlung). J. Reine Angew. Math., 68:354–385, 1868.
53. Stavros Garoufalidis. G-functions and multisum versus holonomic sequences. Adv. Math., 220(6):1945–1955, 2009.
54. Scott Garrabrant and Igor Pak. Pattern avoidance is not p-recursive. arXiv preprint, arXiv:1505.06508, 2015.
55. Ira M. Gessel. A probabilistic method for lattice path enumeration. J. Statist. Plann. Inference, 14(1):49–58, 1986.
56. Ira M. Gessel. Symmetric functions and P-recursiveness. J. Combin. Theory Ser. A, 53(2):257–285, 1990.
57. R. William Gosper, Jr. Decision procedure for indefinite hypergeometric summation. Proc. Nat. Acad. Sci. U.S.A., 75(1):40–42, 1978.
58. Xavier Gourdon and Bruno Salvy. Effective asymptotics of linear recurrences with rational coefficients. In Proceedings of the 5th Conference on Formal Power Series and Algebraic Combinatorics (Florence, 1993), volume 153, pages 145–163, 1996.
59. Daniel S. Graça, Manuel L. Campagnolo, and Jorge Buescu. Computability with polynomial differential equations. Adv. in Appl. Math., 40(3):330–349, 2008.
60. Jeremy Gray. Linear differential equations and group theory from Riemann to Poincaré. Birkhäuser Boston, Inc., Boston, MA, 1986.
61. J. Hadamard. History of science and psychology of invention. Mathematika, 1:1–3, 1954.
62. Charlotte Hardouin. Galoisian approach to differential transcendence. In Galois theories of linear difference equations: an introduction, volume 211 of Math. Surveys Monogr., pages 43–102. Amer. Math. Soc., Providence, RI, 2016.

63. G. H. Hardy and S. Ramanujan. Asymptotic Formulæ in Combinatory Analysis. Proc. London Math. Soc. (2), 17:75–115, 1918.
64. William A. Harris, Jr. and Yasutaka Sibuya. The reciprocals of solutions of linear ordinary differential equations. Adv. in Math., 58(2):119–132, 1985.
65. Peter Henrici. Applied and computational complex analysis. Vol. 1. Wiley Classics Library. John Wiley & Sons, Inc., New York, 1988.
66. Einar Hille. Analytic function theory. Vol. II. Introductions to Higher Mathematics. Ginn and Co., Boston, Mass.-New York-Toronto, Ont., 1962.
67. R. Jungen. Sur les séries de Taylor n’ayant que des singularités algébrico-logarithmiques sur leur cercle de convergence. Comment. Math. Helv., 3(1):266–306, 1931.
68. Nicholas M. Katz. Nilpotent connections and the monodromy theorem: Applications of a result of Turrittin. Inst. Hautes Études Sci. Publ. Math., (39):175–232, 1970.
69. Manuel Kauers, Maximilian Jaroschek, and Fredrik Johansson. Ore polynomials in Sage. In Computer algebra and polynomials, volume 8942 of Lecture Notes in Comput. Sci., pages 105–125. Springer, Cham, 2015.
70. George Kenison, Richard Lipton, Joël Ouaknine, and James Worrell. On the Skolem problem and prime powers. In Proceedings of the 45th International Symposium on Symbolic and Algebraic Computation, ISSAC ’20, pages 289–296, New York, NY, USA, 2020. Association for Computing Machinery.
71. Christoph Koutschan. Regular languages and their generating functions: the inverse problem. Master thesis (Diplomarbeit), Friedrich-Alexander-Universität, Erlangen-Nürnberg, Germany, 2005.
72. Christoph Koutschan. Regular languages and their generating functions: the inverse problem. Theoret. Comput. Sci., 391(1-2):65–74, 2008.
73. Christoph Koutschan. Creative telescoping for holonomic functions. In Computer algebra in quantum field theory, Texts Monogr. Symbol. Comput., pages 171–194. Springer, Vienna, 2013.
74. G. Kreweras. Sur une classe de problèmes liés au treillis des partitions d’entiers. Cahiers du B.U.R.O., 5–105, 1965.
75. Pierre-Simon Laplace. Mémoire sur les suites. Mémoire de l’Académie royale des sciences de Paris. Mémoires mathématiques et physique (in Œuvre Complètes 10:1–89), pages 207–309, 1779.
76. Christer Lech. A note on recurring series. Ark. Mat., 2:417–421, 1953.
77. Leonard Lipshitz and Lee A. Rubel. A gap theorem for power series solutions of algebraic differential equations. Amer. J. Math., 108(5):1193–1213, 1986.
78. K. Mahler. Eine arithmetische Eigenschaft der Taylor-Koeffizienten rationaler Funktionen. Proc. Akad. Wet. Amsterdam, 38:50–60, 1935.
79. Edmond Maillet. Sur les séries divergentes et les équations différentielles. Ann. Sci. École Norm. Sup. (3), 20:487–518, 1903.
80. Marc Mezzarobba. Autour de l’évaluation numérique des fonctions D-finies. Thèse de doctorat, École polytechnique, November 2011.
81. Marc Mezzarobba. Rigorous multiple-precision evaluation of D-finite functions in SageMath. Technical Report 1607.01967, arXiv, 2016.
82. Marc Mezzarobba. Truncation bounds for differentially finite series. Ann. H. Lebesgue, 2:99–148, 2019.
83. Marc Mezzarobba and Bruno Salvy. Effective bounds for P-recursive sequences. J. Symbolic Comput., 45(10):1075–1096, 2010.
84. M. Mignotte, T. N. Shorey, and R. Tijdeman. The distance between terms of an algebraic recurrence sequence. J. Reine Angew. Math., 349:63–76, 1984.
85. Isaac Newton. The method of fluxions and infinite series: with its application to the geometry of curve-lines. English translation by John Colson from the Latin original. Printed by Henry Woodfall, 1736.
86. John Noonan and Doron Zeilberger. The enumeration of permutations with a prescribed number of “forbidden” patterns. Adv. in Appl. Math., 17(4):381–407, 1996.

87. A. M. Odlyzko. Asymptotic enumeration methods. In Handbook of combinatorics, Vol. 1, 2, pages 1063–1229. Elsevier, Amsterdam, 1995.
88. Joël Ouaknine and James Worrell. Positivity problems for low-order linear recurrence sequences. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 366–379. ACM, New York, 2014.
89. Joël Ouaknine and James Worrell. Ultimate positivity is decidable for simple linear recurrence sequences. In Automata, languages, and programming. Part II, volume 8573 of Lecture Notes in Comput. Sci., pages 330–341. Springer, Heidelberg, 2014.
90. Igor Pak. Complexity problems in enumerative combinatorics. In Proc. Int. Cong. of Math., volume 3, pages 3139–3166, Rio de Janeiro, 2018.
91. Marko Petkovšek, Herbert S. Wilf, and Doron Zeilberger. A = B. A K Peters, Ltd., Wellesley, MA, 1996.
92. G. Pólya. Sur Les Series Entieres a Coefficients Entiers. Proc. London Math. Soc. (2), 21:22–38, 1923.
93. J. Popken. Über arithmetische Eigenschaften analytischer Funktionen. PhD thesis, University of Groningen. North-Holland Publishing Co., Amsterdam, 1935.
94. V. Puiseux. Recherches sur les fonctions algébriques. Journal de mathématiques pures et appliquées 1re série, 15:365–480, 1850.
95. G. Rozenberg and A. Salomaa, editors. Handbook of formal languages. Vol. 1. Springer-Verlag, Berlin, 1997.
96. L. A. Rubel. A universal differential equation. Bull. Amer. Math. Soc., 4(3):345–349, 1981.
97. Lee A. Rubel. Some research problems about algebraic differential equations. Trans. Amer. Math. Soc., 280(1):43–52, 1983.
98. Lee A. Rubel. A survey of transcendentally transcendental functions. Amer. Math. Monthly, 96(9):777–788, 1989.
99. Lee A. Rubel. Some research problems about algebraic differential equations. II. Illinois J. Math., 36(4):659–680, 1992.
100. W. Rudin. Real and complex analysis. McGraw-Hill Book Co., NY, third edition, 1987.
101. B. Salvy and P. Zimmermann. GFUN: a Maple package for the manipulation of generating and holonomic functions in one variable. ACM Trans. Math. Softw., 20(2):163–177, 1994.
102. Bruno Salvy. Linear differential equations as a data structure. Foundations of Computational Mathematics, Jan 2019.
103. Claude E. Shannon. Mathematical theory of the differential analyzer. J. Math. Phys. Mass. Inst. Tech., 20:337–354, 1941.
104. Carl L. Siegel. Über einige Anwendungen diophantischer Approximationen [reprint of Abhandlungen der Preußischen Akademie der Wissenschaften. Physikalisch-mathematische Klasse 1929, Nr. 1]. In On some applications of Diophantine approximations, volume 2 of Quad./Monogr., pages 81–138. Ed. Norm., Pisa, 2014.
105. M. F. Singer. Algebraic solutions of nth order linear differential equations. In Proceedings of the Queen’s Number Theory Conference, 1979 (Kingston, Ont., 1979), volume 54 of Queen’s Papers in Pure and Appl. Math., pages 379–420. Queen’s Univ., Kingston, Ont., 1980.
106. Michael F. Singer. Algebraic relations among solutions of linear differential equations. Trans. Amer. Math. Soc., 295(2):753–763, 1986.
107. Parmanand Singh. The so-called Fibonacci numbers in ancient and medieval India. Historia Math., 12(3):229–244, 1985.
108. T. Skolem. Ein Verfahren zur Behandlung gewisser exponentialer Gleichungen und diophantischer Gleichungen. Skand. Mat. Kongr., Stockholm, 8(163–188), 1934.
109. Matti Soittola. Positive rational sequences. Theoret. Comput. Sci., 2(3):317–322, 1976.
110. Richard P. Stanley. Enumerative combinatorics. Vol. 2, volume 62 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1999.
111. Richard P. Stanley. A survey of alternating permutations. In Combinatorics and Graphs, volume 531 of Contemp. Math., pages 165–196. Amer. Math. Soc., Providence, RI, 2010.
112. Richard P. Stanley. Enumerative combinatorics. Volume 1, volume 49 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, second edition, 2012.

113. E. C. Titchmarsh. The theory of functions. Oxford University Press, Oxford, 2nd edition, 1939.
114. W. T. Tutte. Chromatic solutions. II. Canad. J. Math., 34(4):952–960, 1982.
115. Joris van der Hoeven. Fast evaluation of holonomic functions near and in regular singularities. J. Symbolic Comput., 31(6):717–743, 2001.
116. Joris van der Hoeven. Computing with D-algebraic power series. AAECC, 30(1):17–49, 2019.
117. Alfred van der Poorten. A proof that Euler missed... Apéry’s proof of the irrationality of ζ(3). Math. Intelligencer, 1(4):195–203, 1978/79.
118. Marius van der Put and Michael F. Singer. Galois theory of linear differential equations, volume 328 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 2003.
119. Nikolai Vereshchagin. Occurrence of zero in a linear recursive sequence. Mathematical notes of the Academy of Sciences of the USSR, 38(2), 1985.
120. G. Vivanti. Sulle serie di potenze. Annali di Matematica Pura ed Applicata (1867–1897), 21(1):193–194, Jan 1893.
121. Joachim von zur Gathen and Jürgen Gerhard. Modern computer algebra. Cambridge University Press, Cambridge, third edition, 2013.
122. Robert J. Walker. Algebraic Curves. Princeton Mathematical Series, vol. 13. Princeton University Press, Princeton, N. J., 1950.
123. P. G. Walsh. On the complexity of rational Puiseux expansions. Pacific J. Math., 188(2):369–387, 1999.
124. Wolfgang Wasow. Asymptotic expansions for ordinary differential equations. Pure and Applied Mathematics, Vol. XIV. Interscience Publishers John Wiley & Sons, Inc., New York-London-Sydney, 1965.
125. Herbert S. Wilf. generatingfunctionology. A K Peters, Ltd., Wellesley, MA, third edition, 2006.
126. Herbert S. Wilf and Doron Zeilberger. Rational functions certify combinatorial identities. J. Amer. Math. Soc., 3(1):147–158, 1990.
127. Jet Wimp and Doron Zeilberger. Resurrecting the asymptotics of linear recurrences. J. Math. Anal. Appl., 111(1):162–176, 1985.
128. Doron Zeilberger. Sister Celine’s technique and its generalizations. J. Math. Anal. Appl., 85(1):114–145, 1982.
129. Doron Zeilberger. A holonomic systems approach to special functions identities. J. Comput. Appl. Math., 32(3):321–368, 1990.

Chapter 3

Multivariate Series and Diagonals

The residues of this kind occur naturally in several branches of the algebraic analysis and of the infinitesimal analysis. Their consideration provides simple and easy to use methods which apply to a large number of diverse questions, and new formulas which seem to merit the attention of geometers.
— Augustin-Louis Cauchy

They that are ignorant of Algebra cannot imagine the wonders in this kind are to be done by it: and what further improvements and helps advantageous to other parts of knowledge the sagacious mind of man may yet find out, it is not easy to determine.
— John Locke

In this chapter we study multivariate series, both because they encode multivariate sequences and because they can serve as efficient data structures for univariate sequences. For readability we use bold variables to denote multivariate quantities, typically of dimension $d \in \mathbb{N}_{>0}$, and use multi-index notation throughout, so that

$$\mathbf{z} = (z_1, \dots, z_d) \qquad \text{and} \qquad d\mathbf{z} = dz_1\, dz_2 \cdots dz_d,$$

and for $k \in \{1, \dots, d\}$ and $\mathbf{i} \in \mathbb{R}^d$,

$$\mathbf{z}^{\mathbf{i}} = z_1^{i_1} \cdots z_d^{i_d}, \qquad \mathbf{z}_{\hat{k}} = (z_1, \dots, z_{k-1}, z_{k+1}, \dots, z_d), \qquad \hat{\mathbf{z}} = \mathbf{z}_{\hat{d}}.$$

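Given a (multivariate) differentiable function $F(\mathbf{z})$ we write $F_{z_j}(\mathbf{z})$ to denote the (partial) derivative of $F$ with respect to the variable $z_j$.

Definition 3.1 (multivariate formal series) A multivariate formal power series over a field $\mathbb{K}$ in the variables $\mathbf{z}$ is a formal expression

$$F(\mathbf{z}) = \sum_{\mathbf{i} \in \mathbb{N}^d} f_{\mathbf{i}}\, \mathbf{z}^{\mathbf{i}} = \sum_{(i_1, \dots, i_d) \in \mathbb{N}^d} f_{i_1, \dots, i_d}\, z_1^{i_1} \cdots z_d^{i_d},$$

with coefficients $f_{\mathbf{i}} \in \mathbb{K}$. Analogously to the univariate case, we define the coefficient extraction operator $[\mathbf{z}^{\mathbf{i}}]F(\mathbf{z}) = f_{\mathbf{i}}$ and make the ring of multivariate formal power series $\mathbb{K}[[\mathbf{z}]]$ by defining addition termwise

$$\sum_{\mathbf{i} \in \mathbb{N}^d} f_{\mathbf{i}}\, \mathbf{z}^{\mathbf{i}} + \sum_{\mathbf{i} \in \mathbb{N}^d} g_{\mathbf{i}}\, \mathbf{z}^{\mathbf{i}} = \sum_{\mathbf{i} \in \mathbb{N}^d} (f_{\mathbf{i}} + g_{\mathbf{i}})\, \mathbf{z}^{\mathbf{i}}$$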

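and multiplication by the Cauchy product

$$\left( \sum_{\mathbf{i} \in \mathbb{N}^d} f_{\mathbf{i}}\, \mathbf{z}^{\mathbf{i}} \right) \left( \sum_{\mathbf{i} \in \mathbb{N}^d} g_{\mathbf{i}}\, \mathbf{z}^{\mathbf{i}} \right) = \sum_{\mathbf{k} \in \mathbb{N}^d} \left( \sum_{\mathbf{i} + \mathbf{j} = \mathbf{k}} f_{\mathbf{i}}\, g_{\mathbf{j}} \right) \mathbf{z}^{\mathbf{k}}.$$

© Springer Nature Switzerland AG 2021 S. Melczer, Algorithmic and Symbolic Combinatorics, Texts & Monographs in Symbolic Computation, https://doi.org/10.1007/978-3-030-67080-1_3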

Note that the Cauchy product is well defined since for each $\mathbf{k} \in \mathbb{N}^d$ there are only a finite number of elements $\mathbf{i}, \mathbf{j} \in \mathbb{N}^d$ such that $\mathbf{i} + \mathbf{j} = \mathbf{k}$. One could also define $\mathbb{K}[[\mathbf{z}]]$ iteratively, starting with the univariate formal power series ring $\mathbb{K}[[z_1]]$ and taking $\mathbb{K}[[\mathbf{z}]] = (\mathbb{K}[[z_1, \dots, z_{d-1}]])[[z_d]]$; our definition is isomorphic to the iterated one, meaning the iterated definition does not depend on the ordering of the variables (compare this to iterated Laurent series in Section 3.3).
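The finiteness remark above is what makes the Cauchy product computable term by term. The following minimal Python sketch illustrates this under some assumptions of our own: series are stored as dictionaries from exponent tuples to coefficients, both inputs are truncated to exponents at most `max_degree` in each variable, and the names `cauchy_product`, `H`, `F` and `P` are hypothetical.

```python
from itertools import product
from math import comb

def cauchy_product(f, g, max_degree):
    """Multiply two truncated power series stored as {exponent tuple: coefficient}.

    Only terms z^k with every k_j <= max_degree are kept. If f and g contain
    all of their terms with exponents <= max_degree, each kept coefficient is
    exact, because only finitely many pairs i + j = k contribute to it.
    """
    h = {}
    for (i, fi), (j, gj) in product(f.items(), g.items()):
        k = tuple(a + b for a, b in zip(i, j))
        if max(k) <= max_degree:
            h[k] = h.get(k, 0) + fi * gj
    return h

# Illustration: (1 - x - y) times a truncation of sum binom(i+j, i) x^i y^j.
N = 4
H = {(0, 0): 1, (1, 0): -1, (0, 1): -1}
F = {(i, j): comb(i + j, i) for i in range(N + 1) for j in range(N + 1)}
P = cauchy_product(H, F, N)
```

Up to the truncation order, the product `P` has constant term 1 and all other stored coefficients equal to 0, which is Pascal's rule in disguise.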

3.1 Complex Analysis in Several Variables

As in the univariate case, we deal mainly with multivariate series over $\mathbb{K} \subset \mathbb{C}$ which represent analytic functions. We now recap the basics of complex analysis in several variables, specialized to fit our needs. Standard accounts of complex analysis in several variables can be found in Hörmander [39] and Krantz [44]. Once again, our most basic analytic object is a convergent power series expansion.

Definition 3.2 (polydisks and polytori) Given a point $\mathbf{a} \in \mathbb{C}^d$ and $\mathbf{r} \in \mathbb{R}^d_{>0}$, the open polydisk $D_{\mathbf{a}}(\mathbf{r})$ centred at $\mathbf{z} = \mathbf{a}$ of radius $\mathbf{r}$ is defined as a product of disks

$$D_{\mathbf{a}}(\mathbf{r}) = \{\mathbf{z} \in \mathbb{C}^d : |z_1 - a_1| < r_1, \dots, |z_d - a_d| < r_d\},$$

and the polytorus $T_{\mathbf{a}}(\mathbf{r})$ centred at $\mathbf{z} = \mathbf{a}$ of radius $\mathbf{r}$ is defined as a product of circles

$$T_{\mathbf{a}}(\mathbf{r}) = \{\mathbf{z} \in \mathbb{C}^d : |z_1 - a_1| = r_1, \dots, |z_d - a_d| = r_d\}.$$

Remark 3.1 The polytorus $T_{\mathbf{a}}(\mathbf{r})$ is a subset of the boundary $\partial D_{\mathbf{a}}(\mathbf{r})$.

For convenience we often drop the subscript when $\mathbf{a} = \mathbf{0}$, and for an arbitrary complex vector $\mathbf{w} \in \mathbb{C}^d$ with non-zero coordinates we let

$$D(\mathbf{w}) = \{\mathbf{z} \in \mathbb{C}^d : |z_1| < |w_1|, \dots, |z_d| < |w_d|\} \quad\text{and}\quad T(\mathbf{w}) = \{\mathbf{z} \in \mathbb{C}^d : |z_1| = |w_1|, \dots, |z_d| = |w_d|\}$$

denote the polydisk and polytorus with radii $\mathbf{r} = (|w_1|, \dots, |w_d|)$.

Definition 3.3 (analytic functions and absolute convergence) A series $\sum_{n \geq 0} c_n$ with complex summands is called absolutely convergent if $\sum_{n \geq 0} |c_n|$ converges. A function $F : \mathbb{C}^d \to \mathbb{C}$ is analytic at the point $\mathbf{a} \in \mathbb{C}^d$ if there exists a radius $\mathbf{r} \in \mathbb{R}^d_{>0}$ such that $F(\mathbf{z})$ is represented by an absolutely convergent power series

$$F(\mathbf{z}) = \sum_{\mathbf{i} \in \mathbb{N}^d} f_{\mathbf{i}} (\mathbf{z} - \mathbf{a})^{\mathbf{i}} = \sum_{\mathbf{i} \in \mathbb{N}^d} f_{\mathbf{i}} (z_1 - a_1)^{i_1} \cdots (z_d - a_d)^{i_d}$$


for $\mathbf{z} \in D_{\mathbf{a}}(\mathbf{r})$. Such a series is said to be centred at $\mathbf{a}$.

At first glance it may not be clear how to sum the terms of a multivariate series. Helpfully, the terms of an absolutely convergent series can be rearranged without affecting convergence or changing the value of the series (see Problem 3.1), so the terms of a power series representing an analytic function can be summed in any reasonable order. The following result implies our insistence on absolute convergence is not a large restriction: if a power series converges at $\mathbf{w} \in \mathbb{C}^d$ then the power series converges absolutely at any point coordinate-wise closer to the centre of the series. For convenience we state the result for series centred at the origin.

Lemma 3.1 (Abel's Lemma) If $\mathbf{w} \in \mathbb{C}^d$ is such that $\sup_{\mathbf{i} \in \mathbb{N}^d} |f_{\mathbf{i}} \mathbf{w}^{\mathbf{i}}|$ is finite then $\sum_{\mathbf{i} \in \mathbb{N}^d} f_{\mathbf{i}} \mathbf{z}^{\mathbf{i}}$ converges absolutely for any $\mathbf{z} \in \mathbb{C}^d$ with $|z_j| < |w_j|$ for all $1 \leq j \leq d$.

Proof Let $C = \sup_{\mathbf{i} \in \mathbb{N}^d} |f_{\mathbf{i}} \mathbf{w}^{\mathbf{i}}|$. The result holds vacuously if $\mathbf{w}$ has a zero coordinate. If $\mathbf{z} \in \mathbb{C}^d$ is such that the maximum of $|z_j| / |w_j|$ for $1 \leq j \leq d$ is some $t < 1$ then

$$|f_{\mathbf{i}} \mathbf{z}^{\mathbf{i}}| = |f_{\mathbf{i}} \mathbf{w}^{\mathbf{i}}| \, \frac{|\mathbf{z}^{\mathbf{i}}|}{|\mathbf{w}^{\mathbf{i}}|} \leq C\, t^{i_1 + \cdots + i_d},$$

so

$$\sum_{\mathbf{i} \in \mathbb{N}^d} |f_{\mathbf{i}} \mathbf{z}^{\mathbf{i}}| \leq C \sum_{\mathbf{i} \in \mathbb{N}^d} t^{i_1 + \cdots + i_d} = \frac{C}{(1 - t)^d}$$

is finite. A bounded series with non-negative terms always converges.



Definition 3.4 (domains, neighbourhoods, and analyticity) A subset $O \subset \mathbb{C}^d$ is called open if for all $\mathbf{z} \in O$ there is an open polydisk centred at $\mathbf{z}$ which is contained in $O$. Analogously to the univariate case, a domain in $\mathbb{C}^d$ is an open and connected subset of $\mathbb{C}^d$ and a neighbourhood of $\mathbf{a} \in \mathbb{C}^d$ is an open set containing $\mathbf{a}$. We say $F(\mathbf{z})$ is analytic on a domain $\Omega \subset \mathbb{C}^d$ if it is analytic at every point of $\Omega$, and entire if it is analytic on all of $\mathbb{C}^d$.

The product and sum of two absolutely convergent series are absolutely convergent, and if $f(\mathbf{z})$ and $g(\mathbf{z})$ are represented by absolutely convergent power series on a domain $\Omega$ then the series representations for $f(\mathbf{z}) + g(\mathbf{z})$ and $f(\mathbf{z}) g(\mathbf{z})$ are obtained by term-wise summation and the Cauchy product, just as for formal series.

Example 3.1 (A Simple Binomial Sum) The rational function

$$F(x, y) = \frac{1}{1 - x - y}$$

is analytic for any $(x, y) \in \mathbb{C}^2$ such that $x + y \neq 1$. For instance,

$$F(x, y) = \sum_{n \geq 0} (x + y)^n = \sum_{n \geq 0} \sum_{k = 0}^{n} \binom{n}{k} x^k y^{n-k} = \sum_{(i, j) \in \mathbb{N}^2} \binom{i + j}{i} x^i y^j$$


when $(x, y)$ is sufficiently close to the origin. Note that there is some subtlety in this derivation, hinting at why we discuss absolute convergence: although the first equality holds whenever $|x + y| < 1$, rearrangement is only possible when the sum converges absolutely. For example, the first series converges when $x = 1$ and $y = -1/2$, but many rearrangements of the final series at these values will diverge. Since $\binom{i+j}{i} \leq 2^{i+j}$ for all $i, j \in \mathbb{N}$, the final power series representation is valid, at least, when both $|x|$ and $|y|$ are strictly less than $1/2$.

When $F(\mathbf{z})$ is analytic in a domain $\Omega \subset \mathbb{C}^d$, and $\mathbf{a} \in \Omega$, fixing $\hat{\mathbf{z}} = (z_1, \dots, z_{d-1}) = (a_1, \dots, a_{d-1}) = \hat{\mathbf{a}}$ in an absolutely convergent power series representation of $F(\mathbf{z})$ gives an absolutely convergent power series representation of the univariate function $f(z_d) = F(\hat{\mathbf{a}}, z_d)$ for $z_d$ near $a_d$. An analytic function on a domain is thus analytic in each variable, when the others are fixed¹. This observation allows us to generalize many important results from complex analysis in a single variable to the multivariate setting by induction. First is the identity lemma for analytic functions.

Lemma 3.2 (Identity Lemma for Analytic Functions) If $f(\mathbf{z})$ and $g(\mathbf{z})$ are analytic functions in a domain $\Omega \subset \mathbb{C}^d$ and $f(\mathbf{z}) = g(\mathbf{z})$ on a polydisk contained in $\Omega$ then $f(\mathbf{z}) = g(\mathbf{z})$ on $\Omega$.

Thus, as in the univariate case, we can identify a function $F(\mathbf{z})$ which is analytic at the origin with its power series $F(\mathbf{z}) = \sum_{\mathbf{i} \in \mathbb{N}^d} f_{\mathbf{i}} \mathbf{z}^{\mathbf{i}}$ and call $f_{\mathbf{i}}$ the power series coefficients of $F(\mathbf{z})$. The Cauchy integral formula also generalizes to several variables by induction.

Theorem 3.1 (Multivariate Cauchy Integral Formula for Coefficients) Suppose $F(\mathbf{z})$ is analytic on a domain $\Omega \subset \mathbb{C}^d$ and the closure of a polydisk $D_{\mathbf{a}}(\mathbf{r})$ lies in $\Omega$. If $F(\mathbf{z})$ is represented by the power series

$$F(\mathbf{z}) = \sum_{\mathbf{i} \in \mathbb{N}^d} f_{\mathbf{i}} (\mathbf{z} - \mathbf{a})^{\mathbf{i}}$$

on $D_{\mathbf{a}}(\mathbf{r})$ then, for all $\mathbf{i} \in \mathbb{N}^d$,

$$f_{\mathbf{i}} = \frac{1}{(2\pi i)^d} \int_{T_{\mathbf{a}}(\mathbf{r})} \frac{F(\mathbf{z})}{(\mathbf{z} - \mathbf{a})^{\mathbf{i} + \mathbf{1}}} \, d\mathbf{z}.$$
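As an informal numerical check of Theorem 3.1, the following Python sketch approximates a coefficient of $1/(1 - x - y)$ by discretizing the integral over a polytorus centred at the origin. The trapezoid rule, the radii $(0.2, 0.2)$, the sample count, and the function name `cauchy_coefficient` are illustrative assumptions of ours, not part of the text. Writing $x = r_1 e^{i\theta_1}$ and $y = r_2 e^{i\theta_2}$ turns the formula into an average of $F(x, y) / (x^p y^q)$ over the two angles.

```python
import cmath
from math import pi

def cauchy_coefficient(F, p, q, r=(0.2, 0.2), samples=64):
    """Approximate [x^p y^q] F(x, y) by the trapezoid rule on the polytorus T(r).

    With x = r1*exp(i*t1) and y = r2*exp(i*t2) the Cauchy integral becomes the
    mean value of F(x, y) / (x^p * y^q) over the angles t1 and t2.
    """
    total = 0
    for k in range(samples):
        x = r[0] * cmath.exp(2j * pi * k / samples)
        for l in range(samples):
            y = r[1] * cmath.exp(2j * pi * l / samples)
            total += F(x, y) / (x**p * y**q)
    return total / samples**2

F = lambda x, y: 1 / (1 - x - y)
approx = cauchy_coefficient(F, 2, 1)   # should be close to binom(3, 2) = 3
```

The radii are chosen so that the closed polydisk stays inside the region $|x|, |y| < 1/2$ where Example 3.1 guarantees absolute convergence.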

¹ In fact, the converse holds. In the 1830s, Cauchy defined a multivariate function to be analytic over a domain D if it was analytic as a univariate function of each variable at every point in D, and this definition was also used by Jordan. Weierstrass, on the other hand, called a multivariate function analytic in a domain D if it had a multivariate power series representation in the neighbourhood of any point in the domain (Poincaré also used this definition in his doctoral thesis in 1879). Perhaps illustrating the difficulties of working in several variables, the two definitions were not shown to be equivalent until the work of Hartogs [37] in 1906. See Bottazzini and Gray [18, Chapter 9] for additional historical information on the development of complex analysis in several variables.


This is the most straightforward generalization of the univariate Cauchy integral formula, but not the only natural one in the multivariate setting: there is no multivariate integral formula which generalizes all major properties of the univariate Cauchy integral formula. For example, the domain of integration in Theorem 3.1 is the polytorus $T_{\mathbf{a}}(\mathbf{r})$, not the boundary of the polydisk $D_{\mathbf{a}}(\mathbf{r})$, so this representation cannot be easily modified to work with domains other than polydisks. Aĭzenberg and Yuzhakov [3, Ch. 1] give a detailed survey of different integral representations for multivariate analytic functions, which generalize different properties of the univariate case.

We will also make repeated use of the implicit function theorem, which helps locally parametrize the solutions of an analytic equation. A proof can be found in Hörmander [39, Thm. 2.1.2].

Proposition 3.1 (Implicit Function Theorem) If $f(\hat{\mathbf{z}}, y)$ is an analytic function at $\mathbf{w} \in \mathbb{C}^d$ and the partial derivative $f_y(\mathbf{w}) \neq 0$ then for $\hat{\mathbf{z}}$ in a neighbourhood of $\hat{\mathbf{w}}$ there is a unique analytic function $g(\hat{\mathbf{z}})$ with $g(\hat{\mathbf{w}}) = w_d$ such that $f(\hat{\mathbf{z}}, y) = 0$ if and only if $y = g(\hat{\mathbf{z}})$.

3.1.1 Singular Sets of Multivariate Functions

One of the most challenging difficulties moving from one to many variables is the diverse behaviour of the singular sets which arise. For our purposes it is sufficient to consider multivariate generalizations of meromorphic functions, whose singularities are defined (at least in a neighbourhood of each point) by explicit equations. We will see that increasing dimension allows rational functions to capture more complicated behaviour, such as that of algebraic series.

Definition 3.5 (multivariate singularities) Suppose $g(\mathbf{z})$ and $h(\mathbf{z})$ are analytic functions in a domain $\Omega$, with $h(\mathbf{z})$ not identically zero, and define $f(\mathbf{z}) = g(\mathbf{z}) / h(\mathbf{z})$ when $h(\mathbf{z}) \neq 0$. Since $h$ is analytic and not identically zero, Lemma 3.2 implies $f(\mathbf{z})$ is defined at some point in any polydisk. We say that $f$ has a singularity at $\mathbf{a} \in \Omega$ if $|f(\mathbf{z})|$ is unbounded in any neighbourhood of $\mathbf{a}$ in $\Omega$ (where defined).

The denominator $h(\mathbf{z})$ must vanish at any singularity, but $h(\mathbf{z})$ may vanish at a point which is not a singularity.

Example 3.2 (Singularities of Multivariate Functions) The function $e(x, y) = (x + y)/(x - y)$ has a singularity at every point $(x, y) \in \mathbb{C}^2$ where $x = y$. Note that even though the numerator and denominator both vanish at the origin this is still a singularity, since, for instance,

$$e\left( \frac{1}{n} + \frac{1}{n^2}, \frac{1}{n} \right) = 2n + 1$$


is unbounded as $n \to \infty$. On the other hand, the function $f(x, y) = \sin(x - y)/(x - y)$ has no singularities in $\mathbb{C}^2$. Indeed, taking a power series expansion of $\sin(x - y)$ shows

$$f(x, y) = \frac{\sin(x - y)}{x - y} = \sum_{n \geq 0} (-1)^n \frac{(x - y)^{2n}}{(2n + 1)!}$$

for all $(x, y)$ with $x \neq y$. This series is absolutely convergent for all $(x, y) \in \mathbb{C}^2$, so $|f(x, y)|$ is bounded in any bounded set.

Definition 3.6 (meromorphic functions) A function $f(\mathbf{z})$ is called meromorphic at $\mathbf{a} \in \mathbb{C}^d$ if there exists a polydisk $D$ centred at $\mathbf{a}$ and analytic functions $g(\mathbf{z})$ and $h(\mathbf{z})$ on $D$, with $h$ not identically zero, such that $f(\mathbf{z}) = g(\mathbf{z}) / h(\mathbf{z})$ in $D$ whenever $h(\mathbf{z}) \neq 0$. Note that $f(\mathbf{z})$ need not be defined at $\mathbf{a}$, and we say that $\mathbf{a}$ is a singularity of $f$ if it is a singularity of $g(\mathbf{z}) / h(\mathbf{z})$. A function $f(\mathbf{z})$ is meromorphic on a domain $\Omega \subset \mathbb{C}^d$ if it is meromorphic at each point of $\Omega$.

Thus, a meromorphic function is one that can locally be written as a fraction of analytic functions, and a singularity of a meromorphic function is a point where this ratio behaves badly. By far the most important situation for us is when $F(\mathbf{z}) = G(\mathbf{z}) / H(\mathbf{z}) \in \mathbb{C}(\mathbf{z})$ is a rational function, defined and analytic at least on the complement of the zero set of $H(\mathbf{z})$ in $\mathbb{C}^d$. The following result shows that if $G$ and $H$ are coprime polynomials then the zero set of $H(\mathbf{z})$ is precisely the set of singularities of $F(\mathbf{z})$.

Proposition 3.2 Let $F(\mathbf{z}) = G(\mathbf{z}) / H(\mathbf{z}) \in \mathbb{C}(\mathbf{z})$ be the ratio of coprime polynomials $G$ and $H$. Then the singular set of $F(\mathbf{z})$ is the set $V = \{\mathbf{z} \in \mathbb{C}^d : H(\mathbf{z}) = 0\}$.

Proof We need to show that if $H(\mathbf{w}) = 0$ for some $\mathbf{w} \in \mathbb{C}^d$ then $F(\mathbf{z})$ is unbounded in any neighbourhood of $\mathbf{w}$. Without loss of generality, we may assume that $\mathbf{w} = \mathbf{0}$. If $G(\mathbf{0}) \neq 0$ the result is immediate, since the denominator of $F$ approaches zero as $\mathbf{z} \to \mathbf{0}$ while the numerator of $F$ is bounded away from zero. For the same reason, it is sufficient to prove that any neighbourhood of $\mathbf{0}$ contains a point $\boldsymbol{\zeta}$ such that $G(\boldsymbol{\zeta}) \neq 0$ but $H(\boldsymbol{\zeta}) = 0$. Since $G(\mathbf{z})$ and $H(\mathbf{z})$ are coprime as polynomials, they are coprime as elements of $\mathbb{C}(z_1, \dots, z_{d-1})[z_d]$, so the extended Euclidean algorithm (followed by clearing denominators) implies the existence of polynomials $a \in \mathbb{C}[\hat{\mathbf{z}}]$ and $b, c \in \mathbb{C}[\mathbf{z}]$, with $a$ non-zero, such that

$$a(\hat{\mathbf{z}}) = b(\mathbf{z}) G(\mathbf{z}) + c(\mathbf{z}) H(\mathbf{z}). \qquad (3.1)$$

We claim that for any $\hat{\mathbf{z}}$ in a sufficiently small neighbourhood $N$ of the origin there exists $t_{\hat{\mathbf{z}}} \in \mathbb{C}$ such that $H(\hat{\mathbf{z}}, t_{\hat{\mathbf{z}}}) = 0$. Assuming this, if $G(\hat{\mathbf{z}}, t_{\hat{\mathbf{z}}}) = 0$ for each $\hat{\mathbf{z}} \in N$ then the non-zero polynomial $a(\hat{\mathbf{z}})$ in (3.1) would vanish in a neighbourhood of the origin, a contradiction. Thus, assuming the claim, any sufficiently small neighbourhood of the origin contains a point $\boldsymbol{\zeta}$ with $H(\boldsymbol{\zeta}) = 0$ and $G(\boldsymbol{\zeta}) \neq 0$, as desired. It remains only to prove the existence of $t_{\hat{\mathbf{z}}}$ for $\hat{\mathbf{z}}$ sufficiently close to the origin. Problem 3.2 asks you to show that after an invertible linear change of variables the polynomial $H(0, 0, \dots, 0, z_d)$ is not identically zero. Since $H(\mathbf{z})$ can be considered


a polynomial in $z_d$ whose coefficients are polynomials in $z_1, \dots, z_{d-1}$, this implies $p_{\hat{\mathbf{z}}}(z_d) = H(\hat{\mathbf{z}}, z_d)$ is a non-zero polynomial in $z_d$ whenever $\hat{\mathbf{z}}$ is fixed and sufficiently close to the origin. Any root of $p_{\hat{\mathbf{z}}}(z_d)$ can be taken as $t_{\hat{\mathbf{z}}}$, proving the claim.

Hörmander [39, Thm 6.2.3] shows that Proposition 3.2 holds for ratios of analytic functions, after suitably defining what it means for two analytic functions to be coprime (in fact, other than preliminary results setting up the local algebraic structure of analytic functions, the proof in the meromorphic case is very similar). If $f(\mathbf{z})$ is analytic at a point $\mathbf{w} \in \mathbb{C}^d$ and $f(\mathbf{w}) \neq 0$ then $1 / f(\mathbf{z})$ is also analytic at $\mathbf{w}$. Since a ratio of polynomials (or analytic functions) can be locally reduced to a ratio of coprime polynomials (respectively analytic functions), we thus have the following.

Corollary 3.1 Let $F(\mathbf{z})$ be a meromorphic function. Then $F(\mathbf{z})$ is analytic at $\mathbf{w} \in \mathbb{C}^d$ if and only if $\mathbf{z} = \mathbf{w}$ is not a singularity of $F(\mathbf{z})$.

When $G(\mathbf{z})$ and $H(\mathbf{z})$ are coprime and $F(\mathbf{z}) = G(\mathbf{z}) / H(\mathbf{z})$ is a rational function with singularity $\mathbf{z} = \mathbf{a}$, there are two types of behaviour to consider.

1) If $H(\mathbf{a}) = 0$ but $G(\mathbf{a}) \neq 0$ then $F(\mathbf{z})$ behaves like a univariate function near a pole: the limit of $|F(\mathbf{z})|$ as $\mathbf{z} \to \mathbf{a}$ equals infinity.
2) If $H(\mathbf{a}) = G(\mathbf{a}) = 0$ then $F$ behaves like a univariate function near an essential singularity: for any fixed $C \in \mathbb{C}$ the equation $F(\mathbf{z}) = C$ has a solution² in all sufficiently small neighbourhoods of $\mathbf{a}$.

We will see that finding asymptotics of power series coefficients determined by singularities of the first type leads to more explicit and easier-to-apply formulas, although effective methods still exist for the second case. Since any root of a polynomial in at least two variables is not isolated, any singularity of a rational function in at least two variables is not isolated. Similarly, a zero of an analytic function $f(\mathbf{z})$ in at least two variables is never isolated³, giving the following result.
Proposition 3.3 If $F(\mathbf{z})$ is a meromorphic function in a domain $\Omega \subset \mathbb{C}^d$ where $d \geq 2$ then no singularity of $F(\mathbf{z})$ in $\Omega$ is isolated.

Proposition 3.3 hints at a major computational difficulty when applying the methods of analytic combinatorics in several variables: unlike univariate functions, which typically have a finite number of dominant singularities that can be checked one-by-one, in the multivariate setting there will always be an infinite number of singularities to sort through.

² Note that $I(\mathbf{z}) = G(\mathbf{z}) - C H(\mathbf{z})$ and $H(\mathbf{z})$ are coprime, so our proof of Proposition 3.2 shows every neighbourhood of $\mathbf{a}$ contains a point where $I(\mathbf{z})$ is zero and $H(\mathbf{z})$ is non-zero.

³ This follows from the Weierstrass preparation theorem [39, Thm. 6.1], which describes the zero set of an analytic function $f(\mathbf{z})$ as the vanishing of a polynomial in $z_d$ whose coefficients are analytic functions in $\hat{\mathbf{z}}$.


3.1.2 Domains of Convergence for Multivariate Power Series

In one variable, the domain of convergence of a power series is a disk centred at the point of expansion. In the multivariate setting, things are more complicated.

Definition 3.7 (domains of convergence) The (open) domain of convergence of a power series is the set of points $\mathbf{z} \in \mathbb{C}^d$ such that the power series converges absolutely in some neighbourhood of $\mathbf{z}$.

Note that for us a domain of convergence is always an open set. We will make great use of the fact that domains of convergence are convex subsets of $\mathbb{R}^d$ after taking coordinate-wise moduli and then taking logarithms.

Definition 3.8 (Relog map and logarithmic convexity) Let $\mathbb{C}_*$ be the set of non-zero complex numbers. The real-logarithm (Relog) map from $\mathbb{C}_*^d$ to $\mathbb{R}^d$ is the function

$$\operatorname{Relog}(\mathbf{z}) = (\log |z_1|, \log |z_2|, \dots, \log |z_d|).$$

For notational convenience we occasionally extend the Relog map to all of $\mathbb{C}^d$ by defining $\log 0 = -\infty$. A set $\Omega \subset \mathbb{C}^d$ is called logarithmically convex (log-convex) if its image under the Relog map is convex. Equivalently, $\Omega$ is log-convex if $\mathbf{a}, \mathbf{b} \in \Omega$ implies

$$\left( |a_1|^t |b_1|^{1-t}, \dots, |a_d|^t |b_d|^{1-t} \right) \in \Omega \quad\text{for all } t \in [0, 1].$$
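To make Definition 3.8 concrete, the following short Python sketch implements the Relog map and the log-convex combination of two points. The helper names and the sample points are hypothetical, and the test set $D = \{|x| + |y| < 1\}$ is the domain of convergence of $1/(1 - x - y)$ computed in Example 3.3. This only spot-checks one combination; it is an illustration, not a proof of log-convexity.

```python
from math import log

def relog(z):
    """Relog map: coordinate-wise log-modulus of a point with non-zero coordinates."""
    return tuple(log(abs(zj)) for zj in z)

def log_convex_combination(a, b, t):
    """The point (|a_1|^t |b_1|^(1-t), ..., |a_d|^t |b_d|^(1-t))."""
    return tuple(abs(aj) ** t * abs(bj) ** (1 - t) for aj, bj in zip(a, b))

def in_D(z):
    """Membership in D = {(x, y) : |x| + |y| < 1}."""
    return sum(abs(zj) for zj in z) < 1

a, b = (0.9, 0.05), (0.05j, -0.9)         # two points of D
mid = log_convex_combination(a, b, 0.5)   # image under Relog is the midpoint
```

Under the Relog map, `mid` is sent to the midpoint of `relog(a)` and `relog(b)`, which is the sense in which log-convexity is ordinary convexity in logarithmic coordinates.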

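Proposition 3.4 Suppose $F(\mathbf{z}) = \sum_{\mathbf{i} \in \mathbb{N}^d} f_{\mathbf{i}} \mathbf{z}^{\mathbf{i}}$ is a power series centred at the origin with domain of convergence $D$. Then $D$ is logarithmically convex and a point $\mathbf{z} \in \mathbb{C}^d$ lies in the closure $\overline{D}$ if and only if the open polydisk $D(\mathbf{z})$ is contained in $D$.

Proof Lemma 3.1 implies that $D$ is the interior of the set $B = \{\mathbf{z} : \sup_{\mathbf{i}} |f_{\mathbf{i}} \mathbf{z}^{\mathbf{i}}| \text{ finite}\}$, so it is sufficient to prove $B$ is log-convex. Suppose $\mathbf{x}, \mathbf{y} \in \mathbb{C}^d$ satisfy $\sup_{\mathbf{i}} |f_{\mathbf{i}} \mathbf{x}^{\mathbf{i}}| \leq C$ and $\sup_{\mathbf{i}} |f_{\mathbf{i}} \mathbf{y}^{\mathbf{i}}| \leq C$ for some $C \geq 0$. If $\mathbf{w} = \left( |x_1|^t |y_1|^{1-t}, \dots, |x_d|^t |y_d|^{1-t} \right)$ for some $t \in [0, 1]$ then, for any $\mathbf{i} \in \mathbb{N}^d$,

$$|f_{\mathbf{i}} \mathbf{w}^{\mathbf{i}}| = |f_{\mathbf{i}} \mathbf{x}^{\mathbf{i}}|^t \, |f_{\mathbf{i}} \mathbf{y}^{\mathbf{i}}|^{1-t} \leq C,$$

so $\mathbf{w} \in B$ and $B$ is log-convex. If $\mathbf{z} \in \overline{D}$ and $\mathbf{w} \in D(\mathbf{z})$ then $|f_{\mathbf{i}} \mathbf{w}^{\mathbf{i}}| < |f_{\mathbf{i}} \mathbf{z}^{\mathbf{i}}|$ so $\mathbf{w} \in D$ by Lemma 3.1, and $D(\mathbf{z}) \subset D$.

To apply the methods of analytic combinatorics in several variables we need to link the analytic behaviour of a convergent power series centred at the origin to its domain of convergence. The following result generalizes Proposition 2.2 in Chapter 2 to the multivariate setting; its proof is Problem 3.3.

Proposition 3.5 Suppose $F(\mathbf{z})$ is a meromorphic function, analytic at the origin where it is represented by a power series with domain of convergence $D$. If $\mathbf{w} \in \mathbb{C}^d$ lies on the boundary $\partial D$ then there exists a singularity $\boldsymbol{\zeta} \in \mathbb{C}^d$ of $F(\mathbf{z})$ with the same coordinate-wise modulus as $\mathbf{w}$.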


Definition 3.9 (minimal singularities) A singularity of a meromorphic function $F(\mathbf{z})$ which is on the boundary of the domain of convergence of its power series at the origin is called a minimal singularity or minimal point. Equivalently, a minimal point $\mathbf{w}$ is a singularity of $F$ such that no other singularity $\mathbf{z}$ of $F$ satisfies $|z_j| \leq |w_j|$ for each $1 \leq j \leq d$ with at least one of the inequalities being strict.

Example 3.3 (Minimal Singularities of a Binomial Sum) We have already seen that

$$F(x, y) = \frac{1}{1 - x - y} = \sum_{(i, j) \in \mathbb{N}^2} \binom{i + j}{i} x^i y^j$$

for $(x, y)$ in a neighbourhood of the origin. Since

$$\sum_{(i, j) \in \mathbb{N}^2} \binom{i + j}{i} |x|^i |y|^j = \sum_{n \geq 0} (|x| + |y|)^n = \frac{1}{1 - |x| - |y|},$$

with convergence if and only if $|x| + |y| < 1$, the domain of convergence of the power series for $F(x, y)$ at the origin is $D = \{(x, y) \in \mathbb{C}^2 : |x| + |y| < 1\}$. If $|x| + |y| \leq 1$ and $x + y = 1$ then $x$ and $y$ must be real and non-negative, so the minimal singularities of $F(x, y)$ form the set $S = \{(x, 1 - x) : 0 \leq x \leq 1\}$.
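The characterization in Example 3.3 is easy to test numerically. In the sketch below (the function name and tolerance are our own choices), a singular point $(x, 1 - x)$ of $1/(1 - x - y)$ is declared minimal when it lies on the boundary of $D$, that is, when $|x| + |1 - x| = 1$; the triangle inequality gives $|x| + |1 - x| \geq 1$ for every singular point.

```python
def is_minimal_point(x, tol=1e-12):
    """For F = 1/(1 - x - y): is the singular point (x, 1 - x) minimal?

    Every singular point satisfies |x| + |1 - x| >= 1, and it lies on the
    boundary of D = {|x| + |y| < 1} exactly when equality holds.
    """
    return abs(abs(x) + abs(1 - x) - 1) < tol

checks = {
    "real point in [0, 1]": is_minimal_point(0.3),
    "negative real point": is_minimal_point(-1),
    "non-real point": is_minimal_point(0.5 + 0.5j),
}
```

Only the first candidate is minimal, matching the description of the set $S$.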

The importance of minimal points will become clear in Chapter 5, when the techniques of analytic combinatorics in several variables are discussed. Of particular interest is the following property of minimal points, which helps determine minimal singularities where a local analysis can yield coefficient asymptotics.

Definition 3.10 (logarithmic gradients) Given an analytic function $f(\mathbf{z})$, the logarithmic gradient of $f$ is $(\nabla_{\log} f)(\mathbf{z}) = \left( z_1 f_{z_1}, \dots, z_d f_{z_d} \right)$, where the subscripts indicate partial derivatives.

Proposition 3.6 Let $F(\mathbf{z}) = G(\mathbf{z}) / H(\mathbf{z})$ be a rational function which is analytic at the origin with domain of convergence $D$, where $G$ and $H$ are coprime polynomials over the complex numbers. Then for any minimal point $\mathbf{w}$ there exists $\boldsymbol{\lambda} \in \mathbb{R}^d_{\geq 0}$ and $\tau \in \mathbb{C}$ such that $(\nabla_{\log} H)(\mathbf{w}) = \tau \boldsymbol{\lambda}$.

Proof If the partial derivative $H_{z_j}(\mathbf{w})$ is zero for all $1 \leq j \leq d$ then we can take $\tau = 0$ and the conclusion trivially holds. Thus we may assume, without loss of generality, that the partial derivative $H_{z_d}(\mathbf{w}) \neq 0$. Proposition 3.1, the implicit function theorem, implies the existence of an analytic function $g(\hat{\mathbf{z}}) = g(z_1, \dots, z_{d-1})$ in a neighbourhood $N$ of $\hat{\mathbf{w}}$ in $\mathbb{C}^{d-1}$ such that $(\hat{\mathbf{z}}, g(\hat{\mathbf{z}}))$ is a parameterization of the singular set $V = \{\mathbf{z} \in \mathbb{C}^d : H(\mathbf{z}) = 0\}$ of $F$ near $\mathbf{w}$. Fix any $1 \leq j \leq d - 1$. Then for $\hat{\mathbf{z}} \in N$ one has $H(\hat{\mathbf{z}}, g(\hat{\mathbf{z}})) = 0$, and differentiating with respect to $z_j$ followed by substituting $\hat{\mathbf{z}} = \hat{\mathbf{w}}$ yields


$$H_{z_j}(\mathbf{w}) + g_{z_j}(\hat{\mathbf{w}})\, H_{z_d}(\mathbf{w}) = 0. \qquad (3.2)$$

Furthermore, taking $z_j = w_j e^{i \theta_j}$ with $\theta_j$ real shows that the derivative of $z_d = g(\hat{\mathbf{z}})$ with respect to $\theta_j$ as $z_j$ moves around the circle with modulus $|w_j|$ is $g_{\theta_j} = i w_j g_{z_j}(\hat{\mathbf{w}})$ at $\theta_j = 0$. By minimality of $\mathbf{w}$ the path of $z_d$ on this curve at $\theta_j = 0$ must be tangent to the circle $\{w_d e^{i \theta_d} : -\pi \leq \theta_d \leq \pi\}$, hence there exists $\lambda_j \in \mathbb{R}$ such that $g_{\theta_j} = -\lambda_j i w_d$. Putting this together with (3.2) implies

$$-\lambda_j i w_d = i w_j g_{z_j}(\hat{\mathbf{w}}) = -\frac{i w_j H_{z_j}(\mathbf{w})}{H_{z_d}(\mathbf{w})},$$

and defining $\lambda_d = 1$ gives $\lambda_j w_d H_{z_d}(\mathbf{w}) = \lambda_d w_j H_{z_j}(\mathbf{w})$, as desired. If, instead of taking $z_j = w_j e^{i \theta_j}$, we take $z_j = w_j e^{x_j}$ for $x_j$ real then the derivative of $z_d = g(\hat{\mathbf{z}})$ with respect to $x_j$ at $x_j = 0$ is $w_j g_{z_j}(\hat{\mathbf{w}}) = -\lambda_j w_d$. Since $\mathbf{w}$ is minimal the modulus of $z_d$ cannot decrease when the modulus of $z_j$ decreases and the other variables are held constant, meaning $\lambda_j w_d$ must have the same argument as $w_d$. Thus $\lambda_j \geq 0$, and as $1 \leq j \leq d - 1$ was arbitrary the conclusion holds.

Remark 3.2 Our proof of Proposition 3.6 did not use the hypothesis that $H$ is a polynomial: if $H(\mathbf{z})$ is any analytic function on a neighbourhood $U$ of $\mathbf{w} \in \mathbb{C}^d$, and no $\mathbf{z} \in U$ with $H(\mathbf{z}) = 0$ has coordinate-wise smaller modulus than $\mathbf{w}$, then the conclusion of Proposition 3.6 still holds.
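As a small sanity check of Proposition 3.6, the sketch below evaluates the logarithmic gradient of $H(x, y) = 1 - x - y$ at a minimal point $(x, 1 - x)$ with $0 < x < 1$ and factors it as $\tau \boldsymbol{\lambda}$ with $\lambda_d = 1$, following the normalization used in the proof. The function names are hypothetical and exact rational arithmetic is used only for readability.

```python
from fractions import Fraction

def log_gradient_H(x, y):
    """(x*H_x, y*H_y) for H(x, y) = 1 - x - y, where H_x = H_y = -1."""
    return (-x, -y)

def direction(x):
    """Return (tau, lam) with (grad_log H)(w) = tau * lam and lam[-1] = 1,
    at the minimal point w = (x, 1 - x), assuming 0 < x < 1."""
    gx, gy = log_gradient_H(x, 1 - x)
    tau = gy                      # non-zero because 1 - x != 0
    return tau, (gx / tau, Fraction(1))

tau, lam = direction(Fraction(1, 3))   # minimal point (1/3, 2/3)
```

Here `lam` equals $(1/2, 1)$, which has non-negative real coordinates as the proposition predicts, while `tau` is $-2/3$.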

3.2 Diagonals

Since we use multivariate series as a data structure for enumeration problems, we need ways of extracting univariate sequences from multivariate expansions.

Definition 3.11 (main diagonals) If $F(\mathbf{z}) = \sum_{\mathbf{i} \in \mathbb{N}^d} f_{\mathbf{i}} \mathbf{z}^{\mathbf{i}}$ is a (formal or convergent) multivariate power series, the main diagonal of $F(\mathbf{z})$ is the univariate power series

$$(\Delta F)(t) = \sum_{n \geq 0} f_{n, \dots, n}\, t^n$$

defined by the terms of $F$ where all variables have the same exponent.

If the variable appearing in the diagonal is not specified, then for notational convenience we treat it as the final input variable to $F$. For instance, if $F(x, y)$ is a bivariate function then $\Delta F$ is by default treated as a series in the variable $y$.

Example 3.4 (Central Binomial Coefficients as a Diagonal) The diagonal

$$\Delta \left( \frac{1}{1 - x - y} \right) = \Delta \left( \sum_{(i, j) \in \mathbb{N}^2} \binom{i + j}{i} x^i y^j \right) = \sum_{n \geq 0} \binom{2n}{n} y^n$$

is the generating function of the central binomial coefficients.
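Extracting a main diagonal from explicit coefficients is a one-line operation. The sketch below assumes the series is given by a coefficient function on exponent tuples (here the binomial coefficients of Example 3.1); the names `main_diagonal` and `binomial_sum` are illustrative only.

```python
from math import comb

def main_diagonal(coeff, d, terms):
    """First `terms` coefficients f_{n,...,n} of the main diagonal of a d-variate series."""
    return [coeff((n,) * d) for n in range(terms)]

# [x^i y^j] 1/(1 - x - y) = binom(i + j, i)
binomial_sum = lambda idx: comb(idx[0] + idx[1], idx[0])
diag = main_diagonal(binomial_sum, 2, 6)
```

The list `diag` starts 1, 2, 6, 20, 70, 252, the central binomial coefficients.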

Some of the most interesting applications of analytic combinatorics in several variables, such as combinatorial limit theorems, come from linking the behaviours of different univariate sequences defined by the same multivariate series.

Definition 3.12 (r-diagonals) Given $\mathbf{r} \in \mathbb{Q}^d_{\geq 0}$ we define the $\mathbf{r}$-diagonal of the series $F(\mathbf{z}) = \sum_{\mathbf{i} \in \mathbb{N}^d} f_{\mathbf{i}} \mathbf{z}^{\mathbf{i}}$ to be the univariate power series

$$(\Delta_{\mathbf{r}} F)(t) = \sum_{n \geq 0} f_{n r_1, \dots, n r_d}\, t^n$$

whose coefficients form the sequence $[\mathbf{z}^{n \mathbf{r}}] F(\mathbf{z})$, where $f_{n r_1, \dots, n r_d}$ is (for now) considered to be zero for $n \mathbf{r} \notin \mathbb{N}^d$. When $F(\mathbf{z})$ is a function analytic in a neighbourhood of the origin, the $\mathbf{r}$-diagonal $\Delta_{\mathbf{r}} F$ is defined to be the $\mathbf{r}$-diagonal of $F$'s power series representation at the origin.

We always use the diagonal notation $\Delta$ without a subscript to refer to the main diagonal $\mathbf{r} = \mathbf{1}$, which we give the most attention. By replacing variables by positive integer powers of themselves, one can always reduce⁴ to the case $\mathbf{r} \in \mathbb{N}^d$. Although the coefficient $[\mathbf{z}^{n \mathbf{r}}] F(\mathbf{z})$ is not formally defined for $n \mathbf{r} \notin \mathbb{N}^d$, and never is when $\mathbf{r} \notin \mathbb{Q}^d$, we will see in later chapters that asymptotics of $\mathbf{r}$-diagonals typically exhibit uniform behaviour as $\mathbf{r}$ varies smoothly. This will allow us to make asymptotic statements about $\mathbf{r}$-diagonals for $\mathbf{r}$ having arbitrary real number coordinates. We discuss many examples of rational diagonals in Section 3.4 below.
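The convention that coefficients vanish when $n\mathbf{r} \notin \mathbb{N}^d$, and the reduction to integer directions by substituting powers of variables, can both be illustrated with a few lines of Python. This is only a sketch: the helper `r_diagonal`, the direction $(1/2, 1)$ and the two coefficient functions are our own illustrative choices.

```python
from fractions import Fraction
from math import comb

def r_diagonal(coeff, r, terms):
    """Coefficients [z^(n r)] F for n < terms, taken to be 0 when n*r is not in N^d."""
    out = []
    for n in range(terms):
        idx = [n * Fraction(rj) for rj in r]
        if all(k.denominator == 1 for k in idx):
            out.append(coeff(tuple(int(k) for k in idx)))
        else:
            out.append(0)
    return out

# Coefficients of F(x, y) = 1/(1 - x - y) and of F(x^2, y).
F_coeff = lambda idx: comb(idx[0] + idx[1], idx[0])
F_sq_coeff = lambda idx: comb(idx[0] // 2 + idx[1], idx[1]) if idx[0] % 2 == 0 else 0

half_diag = r_diagonal(F_coeff, (Fraction(1, 2), 1), 5)   # [x^(n/2) y^n] F(x, y)
main_diag = r_diagonal(F_sq_coeff, (1, 1), 5)             # [x^n y^n] F(x^2, y)
```

Both lists agree, reflecting the identity $[x^{n/2} y^n] F(x, y) = [x^n y^n] F(x^2, y)$.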

3.2.1 Properties of Diagonals

One of the key characteristics of the diagonal operator is how it interacts with the various classes of generating functions discussed in Chapter 2. Our first result, connecting algebraic functions and bivariate rational diagonals, was known to Pólya [57] in special cases and later proven by Furstenberg [33] and Hautus and Klarner [38].

Proposition 3.7 Suppose the rational function $F(x, y) = G(x, y) / H(x, y) \in \mathbb{Q}(x, y)$ is analytic at the origin. Then $(\Delta F)(y)$ is an algebraic function, obtained by adding the residues of $\frac{G(x, y/x)}{x H(x, y/x)}$ at its poles $x_1(y), \dots, x_r(y)$ in $x$ which are fractional power series in $y$ with no constant term (i.e., those which approach zero as $y \to 0$).

⁴ For example, $[x^{n/2} y^n] F(x, y) = [x^n y^n] F(x^2, y)$ for any bivariate power series $F(x, y)$.


We give a complex analytic proof of Proposition 3.7; a generalization to series over arbitrary fields is detailed in Stanley [61, Thm. 6.3.3].

Proof Suppose $F(x, y) = \sum_{(i, j) \in \mathbb{N}^2} f_{i,j} x^i y^j$ converges in the polydisk $|x| < R$ and $|y| < S$ for $R, S > 0$. Then for all fixed $y$ with $|y| < RS$ the series

$$x^{-1} F(x, y/x) = \sum_{(i, j) \in \mathbb{N}^2} f_{i,j}\, x^{i - j - 1} y^j$$

converges absolutely and uniformly for all $x$ in the annulus $A_{|y|} = \{|y|/S < |x| < R\}$. If $C_{|y|}$ is a positively oriented circle around the origin staying in $A_{|y|}$ then the Cauchy residue theorem implies

$$\frac{1}{2\pi i} \int_{C_{|y|}} \frac{F(x, y/x)}{x} \, dx = \sum_{(i, j) \in \mathbb{N}^2} f_{i,j}\, y^j \left( \frac{1}{2\pi i} \int_{C_{|y|}} x^{i - j - 1} \, dx \right) = \sum_{j \geq 0} f_{j,j}\, y^j = (\Delta F)(y),$$

where we may commute the integral and summation by uniform convergence, so

$$(\Delta F)(y) = \frac{1}{2\pi i} \int_{C_{|y|}} \frac{G(x, y/x)}{x H(x, y/x)} \, dx = \sum_{i = 1}^{r} \operatorname{Res} \left( \frac{G(x, y/x)}{x H(x, y/x)} ;\ x = \rho_i(y) \right),$$

where $\rho_1(y), \dots, \rho_r(y)$ are the poles of $G(x, y/x) / (x H(x, y/x))$ inside $C_{|y|}$. Because the upper radius of $A_{|y|}$ is fixed while the lower radius goes to zero with $|y|$, when $|y|$ is sufficiently small there is always a circle $C_{|y|}$ such that the only poles of the integrand inside $C_{|y|}$ are those that approach the origin as $y$ goes to zero. Each of the poles is algebraic, and thus so is each residue and their sum.

Bostan et al. [15, Thm. 18] give tight bounds on the algebraic quantities involved. Their results imply that if $F(x, y)$ is a rational function whose numerator and denominator have degrees at most $\delta$ in $x$ and in $y$, then there is a polynomial $P(t, D)$ of degree at most $d^2 4^{\delta + 1}$ in $t$ and degree at most $4^{\delta}$ in $D$ such that $P(t, (\Delta F)(t)) = 0$. See that paper for more precise bounds, and an algorithm to compute such a polynomial $P$ from $F(x, y)$ using $O(8^{\delta})$ arithmetic operations with rational numbers.

Example 3.5 (The Algebraic Diagonal of a Binomial Sum) The generating function $C(y)$ of the central binomial coefficients is the diagonal of the bivariate function $F(x, y) = 1/(1 - x - y)$. The power series expansion of $F(x, y)$ converges in the polydisk $|x| < 1/2$ and $|y| < 1/2$, and

$$x^{-1} F(x, y/x) = \sum_{(i, j) \in \mathbb{N}^2} \binom{i + j}{i} x^{i - 1} \frac{y^j}{x^j}$$

converges absolutely and uniformly whenever $|y| < 1/4$ and $2|y| < |x| < 1/2$. For fixed $y$ with $|y| < 1/6$ all points on the curve $|x| = 3|y|$ satisfy $2|y| < |x| < 1/2$, so

$$C(y) = (\Delta F)(y) = \frac{1}{2\pi i} \int_{|x| = 3|y|} \frac{1}{x(1 - x - y/x)} \, dx = \frac{1}{2\pi i} \int_{|x| = 3|y|} \frac{1}{x - x^2 - y} \, dx.$$

The denominator $H(x, y) = x - x^2 - y$ has roots

$$\rho_{\pm}(y) = \frac{1 \pm \sqrt{1 - 4y}}{2},$$

where $\rho_-(y) \to 0$ and $\rho_+(y) \to 1$ as $y \to 0$. Thus,

$$C(y) = \operatorname{Res} \left( \frac{1}{x - x^2 - y} ;\ x = \rho_-(y) \right) = \frac{1}{1 - 2\rho_-(y)} = \frac{1}{\sqrt{1 - 4y}}$$

for $y$ in a neighbourhood of the origin.
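The contour integral in Example 3.5 can be checked numerically. The following sketch approximates it with the trapezoid rule on the circle $|x| = 3|y|$ and compares against $1/\sqrt{1 - 4y}$; the sample count, the test value $y = 0.1$ and the function name are assumptions made for illustration.

```python
import cmath
from math import pi, sqrt

def diagonal_by_contour(y, samples=200):
    """Trapezoid approximation of (1/(2*pi*i)) * integral of dx / (x - x^2 - y)
    over the positively oriented circle |x| = 3|y| (requires 0 < |y| < 1/6)."""
    radius = 3 * abs(y)
    total = 0
    for k in range(samples):
        x = radius * cmath.exp(2j * pi * k / samples)
        # dx = i*x*dtheta, so the factor 1/(2*pi*i) leaves a mean of x * integrand
        total += x / (x - x**2 - y)
    return total / samples

value = diagonal_by_contour(0.1)
exact = 1 / sqrt(1 - 4 * 0.1)
```

For $y = 0.1$ the circle has radius $0.3$, which separates the root $\rho_-(y) \approx 0.113$ from $\rho_+(y) \approx 0.887$, so only the residue at $\rho_-(y)$ contributes.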

Over the rational numbers, or any field of characteristic zero, the diagonal of a rational function in more than two variables need not be algebraic⁵.

Example 3.6 (Not All Rational Diagonals are Algebraic) In Chapter 2 we saw that the function

$$D(z) = \Delta \left( \frac{1}{(1 - x - y)(1 - w - z)} \right) = \Delta \left[ \sum_{i, j, k, n \geq 0} \binom{i + j}{i} \binom{k + n}{k} x^i y^j w^k z^n \right] = \sum_{n \geq 0} \binom{2n}{n}^2 z^n$$

is transcendental by examining asymptotics of its coefficients. Problem 3.7 asks you to prove that the trivariate function F(x, y, z) = 1/(1 − x − y − z) has transcendental diagonal.

Although the rational diagonal in the last example is not algebraic, it is D-finite. In the 1980s, Christol [22] proved that the diagonal of a rational function is always D-finite. In fact, diagonals satisfy a stronger closure property.

⁵ Over a field of prime characteristic, the diagonals of rational [33] and even algebraic [26] functions in any number of variables must be algebraic. There is a beautifully constructed elementary proof of these results using a connection between diagonals, algebraic functions, and certain finite automata taking as input base $p$ expansions of natural numbers; see Lipshitz and Poorten [48] for a survey. Adamczewski and Bell [2] give explicit bounds on the degree and height of the minimal polynomial for the diagonal $(\Delta F \bmod p)$ in terms of the prime $p$, the degree and height of the minimal polynomial of $F(\mathbf{z})$, and the number of variables.

Definition 3.13 (multivariate D-finite functions) Generalizing the univariate setting, if $\mathbb{K}$ is a field of characteristic zero then a power series or analytic function $F(\mathbf{z})$ is called D-finite over $\mathbb{K}$ if the $\mathbb{K}(\mathbf{z})$-vector space generated by $F(\mathbf{z})$ and its partial derivatives is finite dimensional. Equivalently, $F(\mathbf{z})$ is D-finite if for each variable $z_j$


it satisfies a linear differential equation with coefficients in $\mathbb{K}[\mathbf{z}]$ with respect to the partial derivative $\frac{\partial}{\partial z_j}$.

Even though the class of rational (or even algebraic) functions is not closed under taking diagonals, the class of multivariate D-finite functions is closed under taking diagonals.

Theorem 3.2 If $f(\mathbf{z})$ is D-finite over a field $\mathbb{K}$ then the main diagonal $\Delta f$ is also D-finite over $\mathbb{K}$.

See Lipshitz [47] for a proof of Theorem 3.2 using a clever argument enumerating elements in finite-dimensional vector spaces. An annihilating linear differential equation for the diagonal of a rational (or, more generally, D-finite) function is obtained through the theory of creative telescoping. If $F(\mathbf{z})$ is a rational function whose numerator and denominator have their degree in each variable bounded by $\delta$, Lairez [45] gives an algorithm determining a D-finite equation satisfied by $\Delta F$ which runs in complexity $\delta^{O(d)}$. Bostan et al. [16, Thm. 4.2] give a bound on the order and degree of this D-finite equation, both of which are also in $\delta^{O(d)}$. Currently, the most efficient implementation of this creative telescoping machinery for diagonals is a MAGMA package of Lairez [45]. Koutschan [43] maintains a useful Mathematica package, HolonomicFunctions, which calculates an annihilating D-finite equation for the diagonal of a multivariate D-finite function.

Example 3.7 (Creative Telescoping for Binomial Coefficients) Above we saw that for $y$ sufficiently small the diagonal $G(y)$ of the function $F(x, y) = 1/(1 - x - y)$ is given by the integral

$$G(y) = \frac{1}{2\pi i} \int_{\gamma} \frac{1}{x - x^2 - y} \, dx,$$

where $\gamma$ is a positively oriented circle close to the origin. Example 2.22 of Chapter 2 derived the differential equation $(4y - 1) G'(y) + 2 G(y) = 0$ for this integral using an identity relating the derivatives of $F$ with respect to $x$ and with respect to $y$.
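The differential equation of Example 3.7 translates directly into a recurrence for the diagonal coefficients: writing $G(y) = \sum_{n \geq 0} c_n y^n$ and extracting the coefficient of $y^n$ from $(4y - 1) G'(y) + 2 G(y) = 0$ gives $(4n + 2) c_n - (n + 1) c_{n+1} = 0$. The sketch below (with a hypothetical function name and the initial condition $c_0 = G(0) = 1$) unrolls this recurrence.

```python
from math import comb

def coefficients_from_ode(terms):
    """Series solution of (4y - 1) G'(y) + 2 G(y) = 0 with G(0) = 1.

    Extracting [y^n] gives (4n + 2) c_n - (n + 1) c_{n+1} = 0.
    """
    c = [1]
    for n in range(terms - 1):
        c.append((4 * n + 2) * c[n] // (n + 1))   # the division is exact
    return c

coeffs = coefficients_from_ode(10)
```

The resulting list agrees with the central binomial coefficients $\binom{2n}{n}$, consistent with Example 3.4.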

Theorem 3.2 restricts the possible asymptotic behaviour of rational diagonal coefficients. Recall from Definition 2.25 in Chapter 2 that $F(z) = \sum_{n \geq 0} f_n z^n \in \mathbb{Q}[[z]]$ is called a G-function if $f_n$ and the least common denominator of $f_0, \dots, f_n$ grow at most exponentially with n. In fact, rational diagonals satisfy a stronger property.

Definition 3.14 (global boundedness) A univariate series $F(z) \in \mathbb{Q}[[z]]$ is called globally bounded if F(z) represents an analytic function in a neighbourhood of the origin and there exist non-zero $a, b \in \mathbb{Q}$ such that aF(bz) has integer coefficients.

Problem 3.4 asks you to prove that the diagonal of any rational function $F(z) \in \mathbb{Q}(z)$ that is analytic at the origin is globally bounded, and that every globally bounded function which is analytic at the origin is a G-function. Combining this with Corollary 2.2 in Chapter 2, about the asymptotics of G-functions, then gives the following.


Corollary 3.2 Suppose $F(z) \in \mathbb{Q}(z)$ is analytic at the origin. As n → ∞ its diagonal coefficient sequence $[z_1^n \cdots z_d^n]F(z)$ has an asymptotic expansion given by a sum of terms of the form $C n^{\alpha} \zeta^n (\log n)^{\ell}$, where $C \in \mathbb{C}$, $\alpha \in \mathbb{Q}$, $\ell \in \mathbb{N}$, and ζ is algebraic.

Example 3.8 (A D-Finite Function which is not a Rational Diagonal) The function $f(z) = e^z = \sum_{n \geq 0} z^n/n!$ is D-finite, but not globally bounded, so it is not the diagonal of a rational function in any number of variables.

Although the results of this section were stated for the main diagonal ΔF, their analogues hold for the r-diagonal $\Delta_r F$ whenever r has rational coordinates. As previously mentioned, one can immediately reduce to the case $r \in \mathbb{N}^d$ by taking positive integer powers of each variable. Then, the r-diagonal of a rational (respectively algebraic or D-finite) function F(z) can be written as the main diagonal of a related rational (respectively algebraic or D-finite) function. Problem 3.8 asks you to extend the approach of the following example to the general case.

Example 3.9 (Reduction to Main Diagonal) Consider the r = (2, 3) diagonal of the rational function
$$F(x, y) = \frac{1}{1 - x - y} = \sum_{i, j \geq 0} \binom{i + j}{i} x^i y^j.$$
Our first goal is to determine a series expression involving only even powers of x, since these are the only terms which will appear in the r-diagonal. To find such a series, we note that
$$F(-x, y) = \frac{1}{1 + x - y} = \sum_{i, j \geq 0} \binom{i + j}{i} (-1)^i x^i y^j$$
so that
$$G(x, y) = \frac{F(x, y) + F(-x, y)}{2} = \frac{1 - y}{(1 - x - y)(1 + x - y)} = \sum_{i, j \geq 0} \frac{1 + (-1)^i}{2} \binom{i + j}{i} x^i y^j = \sum_{i, j \geq 0} \binom{2i + j}{2i} x^{2i} y^j.$$

Similarly, our next goal is to determine a series expression involving only the powers of y in G(x, y) which are divisible by 3. If $\omega = e^{2\pi i/3}$ is a primitive third root of unity then $1 + \omega^k + \omega^{2k} = 0$ for k an integer not divisible by 3 and $1 + \omega^k + \omega^{2k} = 3$ if 3 divides k. Thus,
$$H(x, y) = \frac{G(x, y) + G(x, \omega y) + G(x, \omega^2 y)}{3} = \sum_{i, j \geq 0} \frac{1 + \omega^j + \omega^{2j}}{3} \binom{2i + j}{2i} x^{2i} y^j = \sum_{i, j \geq 0} \binom{2i + 3j}{2i} x^{2i} y^{3j},$$
and simplifying H(x, y) gives
$$\sum_{i, j \geq 0} \binom{2i + 3j}{2i} x^{2i} y^{3j} = \frac{1 - x^2 y^3 + x^4 - y^3 - 2x^2}{1 - x^6 + y^6 - 6x^2 y^3 + 3x^4 - 2y^3 - 3x^2}.$$
Now that we have an expression involving the terms which appear in the r-diagonal, our final step is to convert to a rational function where these terms lie on the main diagonal. Since x always appears to an even power, and y always appears to a power divisible by 3, we can replace $x^2$ by s and $y^3$ by t to obtain
$$\sum_{i, j \geq 0} \binom{2i + 3j}{2i} s^i t^j = \frac{1 - st + s^2 - t - 2s}{1 - s^3 + t^2 - 6st + 3s^2 - 2t - 3s}.$$
The main diagonal of this rational function is the r-diagonal of F(x, y).
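The final rational function of Example 3.9 can be checked numerically. The sketch below is an illustration under stated assumptions: `series_coeffs` is a hypothetical helper that expands a rational function whose denominator has constant term 1, and the comparison is only made for exponents up to N. It confirms that the coefficient of $s^i t^j$ is $\binom{2i+3j}{2i}$, so that the main diagonal is $\binom{5n}{2n}$, the coefficient of $x^{2n}y^{3n}$ in F(x, y).

```python
from itertools import product
from math import comb


def series_coeffs(num, den, bound):
    """Power series coefficients of num/den for exponents <= bound entrywise.

    num and den map exponent tuples to integer coefficients, and the
    constant term of den must equal 1.
    """
    zero = (0,) * len(bound)
    assert den[zero] == 1
    coeffs = {}
    # Lexicographic order guarantees that coeffs[e - p] is already known.
    for e in product(*(range(b + 1) for b in bound)):
        total = num.get(e, 0)
        for p, c in den.items():
            q = tuple(a - b for a, b in zip(e, p))
            if p != zero and min(q) >= 0:
                total -= c * coeffs[q]
        coeffs[e] = total
    return coeffs


# Exponents are stored as (power of s, power of t).
num = {(0, 0): 1, (1, 1): -1, (2, 0): 1, (0, 1): -1, (1, 0): -2}
den = {(0, 0): 1, (3, 0): -1, (0, 2): 1, (1, 1): -6,
       (2, 0): 3, (0, 1): -2, (1, 0): -3}

N = 6
coeffs = series_coeffs(num, den, (N, N))
all_match = all(
    coeffs[(i, j)] == comb(2 * i + 3 * j, 2 * i)
    for i in range(N + 1)
    for j in range(N + 1)
)
main_diagonal = [coeffs[(n, n)] for n in range(N + 1)]
r_diagonal = [comb(5 * n, 2 * n) for n in range(N + 1)]
print(all_match, main_diagonal == r_diagonal)
```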

The technique of multiplying power series variables by roots of unity and adding the results to obtain a specific subseries is often referred to as a ‘root of unity filter’. Although this process can always be followed to reduce an r-diagonal to the main diagonal, the resulting function will typically be more complicated and harder to analyze. We will see later in this book that the methods of analytic combinatorics in several variables allow for asymptotic statements about r-diagonals by working directly with r as a parameter, so such manipulations are not necessary.

3.2.2 Representing Series Using Diagonals

Since we develop methods to determine asymptotics of diagonal sequences, it is natural to ask which functions can be represented by diagonals. First, we see that any algebraic function can be written as the diagonal of a bivariate rational function, meaning the classes of bivariate rational diagonals and algebraic functions are equal.

Proposition 3.8 Suppose $P(z, y) \in \mathbb{Q}[z, y]$ with partial derivative $P_y(0, 0) \neq 0$. If $F \in \mathbb{Q}[[z]]$ has no constant term and P(z, F(z)) = 0 then
$$F(z) = \Delta\left( \frac{y^2 P_y(zy, y)}{P(zy, y)} \right);$$
i.e., F is the diagonal of a bivariate rational function.


We follow the simple proof of Furstenberg [33], which works for power series over any integral domain. Note that the restriction that F(z) vanish at the origin is mild: if instead F(0) = a ≠ 0 then the result can be applied to the function G(z) = F(z) − a to obtain a rational function g(z, y) whose diagonal is G(z), and the diagonal of g(z, y) + a is F(z). To work around the restriction that $P_y(0, 0) \neq 0$ one must separate the solutions of P(z, y) = 0 near the origin; see Problem 3.9 for an approach suggested by Adamczewski and Bell [2].

Proof Writing P(z, y) = (y − F(z))u(z, y) for some $u \in \mathbb{Q}[[z]][y]$ we see $P_y(z, y) = u(z, y) + (y - F(z))u_y(z, y)$, so that
$$\frac{y^2 P_y(zy, y)}{P(zy, y)} = \frac{y^2}{y - F(zy)} + \frac{y^2 u_y(zy, y)}{u(zy, y)}. \tag{3.3}$$
Since F(z) has no constant term,
$$\frac{y^2}{y - F(zy)} = \frac{y}{1 - F(zy)/y} = \sum_{k \geq 0} y^{1-k} F(zy)^k$$
is a power series whose diagonal is F(z). Because $u(0, 0) = P_y(0, 0) \neq 0$, the second summand in (3.3) is a power series, with a zero diagonal. Taking the diagonal operator in (3.3) then gives the desired result. □

Example 3.10 (The Catalan Generating Function as a Rational Diagonal) The Catalan generating function C(z) is the solution to $zC(z)^2 - C(z) + 1 = 0$ with C(0) = 1. The function D(z) = C(z) − 1 has D(0) = 0 and satisfies P(z, D(z)) = 0 where $P(z, y) = z(y + 1)^2 - (y + 1) + 1$. Since $P_y(0, 0) = -1 \neq 0$,
$$D(z) = \Delta\left( \frac{y(2y^2 z + 2yz - 1)}{y^2 z + 2yz + z - 1} \right)$$
and
$$C(z) = \Delta\left( 1 + \frac{y(2y^2 z + 2yz - 1)}{y^2 z + 2yz + z - 1} \right) = \Delta\left( \frac{2y^3 z + 3y^2 z + 2yz - y + z - 1}{y^2 z + 2yz + z - 1} \right).$$
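As a quick check of Example 3.10, the following sketch expands the bivariate rational function and reads off its diagonal. It is an illustration only: `series_coeffs` is a hypothetical helper requiring the denominator to have constant term 1, so the numerator and denominator given in the example are both multiplied by −1 before expansion.

```python
from itertools import product


def series_coeffs(num, den, bound):
    """Power series coefficients of num/den for exponents <= bound entrywise.

    num and den map exponent tuples to integer coefficients, and the
    constant term of den must equal 1.
    """
    zero = (0,) * len(bound)
    assert den[zero] == 1
    coeffs = {}
    # Lexicographic order guarantees that coeffs[e - p] is already known.
    for e in product(*(range(b + 1) for b in bound)):
        total = num.get(e, 0)
        for p, c in den.items():
            q = tuple(a - b for a, b in zip(e, p))
            if p != zero and min(q) >= 0:
                total -= c * coeffs[q]
        coeffs[e] = total
    return coeffs


# y(2y^2 z + 2yz - 1)/(y^2 z + 2yz + z - 1), rewritten as
# (y - 2y^2 z - 2y^3 z)/(1 - z - 2yz - y^2 z).
# Exponents are stored as (power of z, power of y).
num = {(0, 1): 1, (1, 2): -2, (1, 3): -2}
den = {(0, 0): 1, (1, 0): -1, (1, 1): -2, (1, 2): -1}

N = 6
coeffs = series_coeffs(num, den, (N, N))
diagonal_D = [coeffs[(n, n)] for n in range(N + 1)]   # D(z) = C(z) - 1
catalan = [1 + diagonal_D[0]] + diagonal_D[1:]        # adding 1 restores C(0)
print(diagonal_D, catalan)
```

Adding the constant 1 to the rational function changes only the constant term of its diagonal, which is why the Catalan numbers are recovered by adjusting the first entry.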

The following generalization of Proposition 3.8 follows from an analogous argument; see Problem 3.10.

Proposition 3.9 Let $P(z, y) \in \mathbb{Q}[z, y]$ and suppose F(z) is an analytic function at the origin satisfying P(z, F(z)) = 0 and $F(\hat{z}, 0) = 0$. If $P(z, y) = (y - F(z))^k u(z, y)$ for $k \in \mathbb{N}$ and u(z, y) analytic at the origin with $u(0, 0) \neq 0$, then
$$[z^i]F(z) = [z^i y^{i_d}] \frac{y^2 P_y(\hat{z}, y z_d, y)}{k P(\hat{z}, y z_d, y)}.$$

Denef and Lipshitz [27, Thm. 6.2] extend both Proposition 3.8 and Proposition 3.9 by proving that for any algebraic function F(z) in d variables there exists a rational function R(z, y) in 2d variables such that $[z^i]F(z) = [z^i y^i]R(z, y)$ for all $i \in \mathbb{N}^d$. Safonov [59] gives an elementary argument that for any algebraic function F(z) in d variables there exists a rational function R(z, y) in d + 1 variables and a unimodular matrix $A \in \mathbb{N}^{d \times d}$ such that if $i \in \mathbb{N}^d$ and j = Ai then $[z^i]F(z) = [z^j y^{j_d}]R(z, y)$.

Example 3.11 (Counting Leaves in Rooted Plane Trees) A rooted plane tree is a (non-empty) rooted tree where every node is a leaf (has no children) or has a finite sequence of children which are themselves non-empty rooted plane trees. Problem 3.11 asks you to prove that the bivariate generating function $T(u, z) = uz + uz^2 + (u + u^2)z^3 + \cdots$ whose coefficients $t_{k,n} = [u^k z^n]T(u, z)$ give the number of rooted plane trees on n nodes with k leaves satisfies P(u, z, T(u, z)) = 0, where $P(u, z, y) = y^2 + (z - zu - 1)y + zu$. The conditions of Proposition 3.9 hold with k = 1, so
$$t_{k,n} = [u^k z^n y^n] \frac{y^2 P_y(u, yz, y)}{P(u, yz, y)} = [u^k z^n y^n] \frac{y(uyz - yz - 2y + 1)}{1 + uyz - uz - yz - y}.$$
In this case one can determine $t_{k,n}$ directly from the equation P(u, z, T) = 0 by a clever application of Lagrange inversion (recall Problem 2.8 from Chapter 2). See Flajolet and Sedgewick [28, Sect. III. 3] for further examples of this nature.
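The coefficient extraction in Example 3.11 can be tested on small cases. The sketch below is an illustration, not part of the original argument: `series_coeffs` and `narayana_trees` are hypothetical helper names, and the comparison uses the classical Narayana formula $\frac{1}{n-1}\binom{n-1}{k}\binom{n-1}{k-1}$ for plane trees with n ≥ 2 nodes and k leaves, which is brought in here as an independent reference count.

```python
from itertools import product
from math import comb


def series_coeffs(num, den, bound):
    """Power series coefficients of num/den for exponents <= bound entrywise.

    num and den map exponent tuples to integer coefficients, and the
    constant term of den must equal 1.
    """
    zero = (0,) * len(bound)
    assert den[zero] == 1
    coeffs = {}
    # Lexicographic order guarantees that coeffs[e - p] is already known.
    for e in product(*(range(b + 1) for b in bound)):
        total = num.get(e, 0)
        for p, c in den.items():
            q = tuple(a - b for a, b in zip(e, p))
            if p != zero and min(q) >= 0:
                total -= c * coeffs[q]
        coeffs[e] = total
    return coeffs


def narayana_trees(k, n):
    """Rooted plane trees on n nodes with k leaves (classical Narayana count)."""
    if n == 1:
        return 1 if k == 1 else 0
    if not 1 <= k <= n - 1:
        return 0
    return comb(n - 1, k) * comb(n - 1, k - 1) // (n - 1)


# y(uyz - yz - 2y + 1)/(1 + uyz - uz - yz - y),
# exponents stored as (power of u, power of z, power of y).
num = {(0, 0, 1): 1, (1, 1, 2): 1, (0, 1, 2): -1, (0, 0, 2): -2}
den = {(0, 0, 0): 1, (1, 1, 1): 1, (1, 1, 0): -1, (0, 1, 1): -1, (0, 0, 1): -1}

N = 6
coeffs = series_coeffs(num, den, (N, N, N))
matches = all(
    coeffs[(k, n, n)] == narayana_trees(k, n)
    for k in range(N + 1)
    for n in range(N + 1)
)
print(matches, [coeffs[(k, 4, 4)] for k in range(5)])
```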

Example 3.12 (Counting Bar Graphs) A bar graph is a finite connected union of 1 × 1 boxes with corners in $\mathbb{Z}^2$ such that the lowest edge in each column lies on the x-axis and each column contains all boxes between its highest and lowest box; see Figure 3.1 for an example. An edge between two vertices of $\mathbb{Z}^2$ is called an external edge of a bar graph if there is a 1 × 1 box in the bar graph containing the edge and a 1 × 1 box in $\mathbb{Z}^2$ not contained in the bar graph which also contains the edge. If $b_{n,k}$ denotes the number of bar graphs with 2n horizontal external edges and 2k vertical external edges (there are always an even number of both) then Bousquet-Mélou and Rechnitzer [19] show that the bivariate generating function $B(x, y) = \sum_{n,k \geq 0} b_{n,k} x^n y^k$ satisfies B(0, 0) = 0 and P(x, y, B) = 0, where $P(x, y, z) = z - xy - (x + y + xy)z - xz^2$. The conditions of Proposition 3.9 are satisfied, so we obtain
$$b_{n,k} = [x^n y^k z^k] \frac{z(1 - xyz - 2xz - yz - x)}{1 - xyz - xy - xz - yz - x}.$$


Fig. 3.1 A bar graph with 10 external vertical edges (coloured in gray) and 12 external horizontal edges (coloured in black).

In Chapter 5 we use the methods of analytic combinatorics in several variables to determine asymptotics of $b_{n,k}$ as n, k → ∞.
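The extraction formula for bar graphs can be compared against a brute-force count. The sketch below is an illustration under an assumption about how external edges are counted: a bar graph with column heights $h_1, \dots, h_n$ is taken to have n horizontal half-edges (one top and one bottom edge per column) and $k = h_1 + \sum_i \max(0, h_{i+1} - h_i)$ vertical half-edges (the upward unit steps along its upper boundary). The names `series_coeffs` and `count_bar_graphs` are hypothetical.

```python
from itertools import product


def series_coeffs(num, den, bound):
    """Power series coefficients of num/den for exponents <= bound entrywise.

    num and den map exponent tuples to integer coefficients, and the
    constant term of den must equal 1.
    """
    zero = (0,) * len(bound)
    assert den[zero] == 1
    coeffs = {}
    # Lexicographic order guarantees that coeffs[e - p] is already known.
    for e in product(*(range(b + 1) for b in bound)):
        total = num.get(e, 0)
        for p, c in den.items():
            q = tuple(a - b for a, b in zip(e, p))
            if p != zero and min(q) >= 0:
                total -= c * coeffs[q]
        coeffs[e] = total
    return coeffs


def count_bar_graphs(n, k):
    """Brute-force count of height sequences with n columns and k upward steps."""
    count = 0
    for heights in product(range(1, k + 1), repeat=n):
        rises = sum(max(0, b - a) for a, b in zip(heights, heights[1:]))
        if heights[0] + rises == k:
            count += 1
    return count


# z(1 - xyz - 2xz - yz - x)/(1 - xyz - xy - xz - yz - x),
# exponents stored as (power of x, power of y, power of z).
num = {(0, 0, 1): 1, (1, 1, 2): -1, (1, 0, 2): -2, (0, 1, 2): -1, (1, 0, 1): -1}
den = {(0, 0, 0): 1, (1, 1, 1): -1, (1, 1, 0): -1,
       (1, 0, 1): -1, (0, 1, 1): -1, (1, 0, 0): -1}

N = 5
coeffs = series_coeffs(num, den, (N, N, N))
agrees = all(
    coeffs[(n, k, k)] == count_bar_graphs(n, k)
    for n in range(1, N + 1)
    for k in range(1, N + 1)
)
print(agrees, coeffs[(3, 2, 2)])
```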

The fact that any algebraic function can be represented as a bivariate diagonal has asymptotic consequences. For instance, it follows from Proposition 2.11 in Chapter 2 that for any $\alpha \in \mathbb{Q} \setminus \{-1, -2, \dots\}$ there exists a bivariate rational function whose diagonal coefficient sequence has dominant asymptotics of the form $C n^{\alpha} \rho^n$ for constants C, ρ > 0. Using the methods developed in Chapter 5, for any negative integer α one can construct a trivariate rational function whose main diagonal has dominant asymptotic growth of the form $C n^{\alpha} \rho^n$ for constants C, ρ > 0. Although every rational diagonal is a D-finite globally bounded function, it is an open question whether every globally bounded D-finite function can be written as a rational diagonal. This question has been studied for decades.

Example 3.13 (Christol’s Open Question) In 1986, Christol [23, p. 15] noted that the series
$$f(z) = \sum_{n \geq 0} \frac{(1/9)_n (4/9)_n (5/9)_n}{(1/3)_n (1)_n} \frac{z^n}{n!}, \qquad (x)_n = x(x + 1) \cdots (x + n - 1),$$
a so-called ${}_3F_2$ hypergeometric function, is globally bounded and D-finite⁶ but is not easily expressed as the diagonal of a rational function. It is still open whether f(z) can be expressed as a rational diagonal. Bostan et al. [14] gave many additional examples of globally bounded hypergeometric functions, some of which were expressed as explicit rational diagonals by Abdelaziz et al. [1] and Bostan and Yurkevich [17].

6 The series f(z) satisfies the D-finite equation $729z^2(z - 1)f'''(z) + 81z(37z - 21)f''(z) + 9(200z - 27)f'(z) + 20f(z) = 0$.


Conjecture 3.1 (Christol [24, Conjecture 4]) Every globally bounded D-finite function can be expressed as the main diagonal of a rational function.

It is also natural to wonder about the relationship between diagonals of rational functions in differing numbers of variables. Given $d \in \mathbb{N}$, let $D_d$ denote the set of functions which can be obtained as (main) diagonals of d-variate rational functions over $\mathbb{Q}$. If F(z) is any function analytic at the origin, the diagonals of F(z) and $G(z, t) = F(z_1, \dots, z_{d-1}, t z_d)$ are equal, giving a sequence of containments $D_1 \subset D_2 \subset D_3 \subset \cdots$. Because $D_1$ contains univariate rational functions and $D_2$ is the class of algebraic functions, we have strict containments $D_1 \subsetneq D_2 \subsetneq D_3$, but there is no natural criterion to separate $D_d$ for d ≥ 3.

Open Problem 3.1 Does there exist d > 3 such that $D_3 \subsetneq D_d$?

Note that, even when the number of variables d is fixed, there are an infinite number of ways to represent $f \in D_d$ as a diagonal. Indeed, if f(z) = (ΔF)(z) for some rational function F(z) then f(z) = (ΔG)(z) for any G(z) = F(z) + H(z) with H(z) a rational function with zero diagonal. One may take, for instance, $H(z) = z_d \, h(z_1, \dots, z_{d-2}, z_{d-1} z_d)$ where $h(w_1, \dots, w_{d-1})$ is any (d − 1)-variate rational function that is analytic at the origin. As such representations can have very different properties, characterizing all functions with the same diagonal is an important problem. This can be reduced to characterizing the rational functions with a zero diagonal.

Open Problem 3.2 Give a useful characterization of the functions $F(z) \in \mathbb{Q}(z)$ which are analytic at the origin and have zero diagonal.

The term ‘useful characterization’ in Open Problem 3.2 is intentionally vague: the characterization should be sufficiently powerful to allow one to select the ‘best’ representative for a particular purpose, such as an asymptotic analysis. Note that it is decidable to check whether two rational functions F(z) and G(z) have the same diagonal. Creative telescoping algorithms determine an annihilating D-finite equation satisfied by Δ(F − G), and this equation gives a finite bound N such that the diagonals of F and G are equal if their first N power series coefficients agree.

Example 3.14 (Apéry’s Sequence) We prove that the trivariate rational functions
$$F_1(x, y, z) = \frac{1}{1 - (1 + z)(x + y - xy)}, \qquad F_2(x, y, z) = \frac{1}{1 - x - y - z(1 - x)(1 - y)}$$
have the same diagonal sequence, whose asymptotic expansion is related to Apéry’s approach to proving irrationality of the Riemann zeta function at integer arguments (see Section 3.4 below for details). The Magma package of Lairez [45] shows that $(\Delta F_1)(t)$ and $(\Delta F_2)(t)$ both satisfy the differential equation
$$t(t^2 + 11t - 1)f''(t) + (3t^2 + 22t - 1)f'(t) + (t + 3)f(t) = 0,$$
meaning both diagonal sequences satisfy the recurrence relation
$$(n + 2)^2 u_{n+2} = (11n^2 + 33n + 25)u_{n+1} + (n + 1)^2 u_n.$$
Since the first two diagonal terms of $F_1$ and $F_2$ are equal, and they satisfy the same recurrence relation of order 2, their diagonal sequences are identical (of course, their non-diagonal power series coefficients are different).
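The claims of Example 3.14 can be checked on initial terms by direct series expansion. The sketch below is an illustration only: it does not reproduce the creative telescoping computation, the helper name `series_coeffs` is hypothetical, and the two denominators are expanded by hand as $1 - x - y + xy - xz - yz + xyz$ and $1 - x - y - z + xz + yz - xyz$.

```python
from itertools import product


def series_coeffs(num, den, bound):
    """Power series coefficients of num/den for exponents <= bound entrywise.

    num and den map exponent tuples to integer coefficients, and the
    constant term of den must equal 1.
    """
    zero = (0,) * len(bound)
    assert den[zero] == 1
    coeffs = {}
    # Lexicographic order guarantees that coeffs[e - p] is already known.
    for e in product(*(range(b + 1) for b in bound)):
        total = num.get(e, 0)
        for p, c in den.items():
            q = tuple(a - b for a, b in zip(e, p))
            if p != zero and min(q) >= 0:
                total -= c * coeffs[q]
        coeffs[e] = total
    return coeffs


one = {(0, 0, 0): 1}
# Exponents are stored as (power of x, power of y, power of z).
den1 = {(0, 0, 0): 1, (1, 0, 0): -1, (0, 1, 0): -1, (1, 1, 0): 1,
        (1, 0, 1): -1, (0, 1, 1): -1, (1, 1, 1): 1}
den2 = {(0, 0, 0): 1, (1, 0, 0): -1, (0, 1, 0): -1, (0, 0, 1): -1,
        (1, 0, 1): 1, (0, 1, 1): 1, (1, 1, 1): -1}

N = 5
c1 = series_coeffs(one, den1, (N, N, N))
c2 = series_coeffs(one, den2, (N, N, N))
diag1 = [c1[(n, n, n)] for n in range(N + 1)]
diag2 = [c2[(n, n, n)] for n in range(N + 1)]

# (n + 2)^2 u_{n+2} = (11 n^2 + 33 n + 25) u_{n+1} + (n + 1)^2 u_n
recurrence_holds = all(
    (n + 2) ** 2 * diag1[n + 2]
    == (11 * n * n + 33 * n + 25) * diag1[n + 1] + (n + 1) ** 2 * diag1[n]
    for n in range(N - 1)
)
off_diagonal_differs = c1[(0, 0, 1)] != c2[(0, 0, 1)]
print(diag1, diag1 == diag2, recurrence_holds, off_diagonal_differs)
```

The last flag illustrates the parenthetical remark in the example: the coefficient of z alone is 0 in $F_1$ and 1 in $F_2$, even though the diagonals agree.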

3.3 Multivariate Laurent Expansions and Other Series Operators

In our investigations it will be necessary to consider expansions (and diagonals) of functions which cannot be represented by power series at the origin. Furthermore, non-power series expansions of a rational function centred at the origin can help to give insight into the computations arising in an asymptotic analysis, even when one is only interested in power series expansions. Just as the analysis of an algebraic function encoded by its minimal polynomial P must take into account the other roots of P, our methods to determine asymptotics of a rational function F(z) will naturally pick up properties of all valid series expansions of F at the origin. Recall from Definition 2.9 in Chapter 2 that the ring of formal Laurent series over the field K consists of all series with integer exponents bounded from below,
$$K((z)) = \left\{ \sum_{i \geq q} f_i z^i : q \in \mathbb{Z}, \ f_i \in K \right\},$$
equipped with term-wise addition and the Cauchy product, while convergent Laurent series can have an infinite number of terms with negative exponents. We make use of a straightforward multivariate generalization.

Definition 3.15 (multivariate formal Laurent series) The field of formal iterated Laurent series in the variables z is defined inductively as
$$K((z)) = K((z_1, \dots, z_d)) = K((z_1, \dots, z_{d-1}))((z_d)).$$

Xin [67, Ch. 2] gives a detailed account of formal iterated Laurent series, and proofs of their basic properties. Unlike multivariate power series, the ordering of the variables matters in this construction.

Example 3.15 (Multivariate Laurent Expansions Depend on Variable Order) In the field $\mathbb{Q}((x, y)) = \mathbb{Q}((x))((y))$, consisting of Laurent series in y whose coefficients are Laurent series in x, one can compute the formal expansion
$$\frac{1}{x + y} = \frac{1/x}{1 + y/x} = \sum_{n \geq 0} (-1)^n x^{-n-1} y^n$$
for the multiplicative inverse of x + y. This does not lie in $\mathbb{Q}((y, x)) = \mathbb{Q}((y))((x))$, as it contains terms with arbitrarily large negative exponents in x (note, however, that the coefficient of any fixed power of y has only one term with a negative exponent in x). Similarly, in $\mathbb{Q}((y, x))$ one can compute the formal expansion
$$\frac{1}{x + y} = \frac{1/y}{1 + x/y} = \sum_{n \geq 0} (-1)^n y^{-n-1} x^n$$
which does not lie in $\mathbb{Q}((x, y))$.

Because K((z)) forms a field, any rational function (or ratio of analytic functions) has an expansion as an iterated Laurent series, making these formal series sufficient for our purposes. Unfortunately, K((z)) is not closed under composition, even in the univariate case: if $f(z) = 1 + z + z^2 + \cdots$ then f(1/z) is not a well defined formal Laurent series. With this in mind, we define an important subset of Laurent series.

Definition 3.16 (Laurent polynomials) For a variable z, we write $\overline{z} = 1/z$ and, for a vector of variables, $\overline{z} = (\overline{z}_1, \dots, \overline{z}_d)$. The ring of Laurent polynomials in the variables z, written $K[z, \overline{z}]$, is the subset of K((z)) consisting of elements with a finite number of non-zero terms.

Since a Laurent polynomial only contains a finite number of terms, if $f(z) \in K[z, \overline{z}]$ and $g \in K((z))$ then $f(g(z)) \in K((z))$. Note also that the ring of Laurent polynomials does not depend on the ordering of the variables used to construct it.

Example 3.16 (A Multivariate Walk Generating Function) The ring $\mathbb{Q}[x, \overline{x}][[t]]$ consists of power series in t whose coefficients are Laurent polynomials in x. For instance, in $\mathbb{Q}[x, \overline{x}][[t]]$ one has the expansion
$$\frac{1}{1 - t(x + \overline{x})} = \sum_{n \geq 0} (x + \overline{x})^n t^n = \sum_{n \geq 0} \left( \sum_{k=0}^{n} \binom{n}{k} x^{2k-n} \right) t^n. \tag{3.4}$$
This series has a natural interpretation as the multivariate generating function of walks on the integer lattice $\mathbb{Z}$ that begin at the origin and take steps in {−1, 1}, where the exponent of x marks the endpoint of a walk and the exponent of t marks the number of steps it contains. That the coefficient of $t^n$ is a Laurent polynomial for each n reflects the fact that a walk taking n steps from {−1, 1} can only end at a finite number of coordinates. Note that for any $f \in \mathbb{Q}((z))$, we can substitute x = f(z) into both sides of (3.4) and still have a valid equality among elements of $\mathbb{Q}((z))[[t]]$.
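The combinatorial reading of (3.4) is easy to confirm by counting walks step by step. In the following sketch, which is only an illustration with the hypothetical helper name `walk_endpoints`, a dictionary keyed by the exponent of x plays the role of the Laurent polynomial coefficient of $t^n$, and each step multiplies by $x + \overline{x}$.

```python
from math import comb


def walk_endpoints(n):
    """Number of n-step walks on Z with steps in {-1, +1}, keyed by endpoint.

    The returned dict is the Laurent polynomial (x + 1/x)^n, stored as
    {exponent of x: coefficient}.
    """
    counts = {0: 1}
    for _ in range(n):
        nxt = {}
        for position, ways in counts.items():
            for step in (-1, 1):
                nxt[position + step] = nxt.get(position + step, 0) + ways
        counts = nxt
    return counts


# Compare with the inner sum of (3.4): the coefficient of x^(2k-n) is C(n, k).
expansion_matches = all(
    walk_endpoints(n) == {2 * k - n: comb(n, k) for k in range(n + 1)}
    for n in range(10)
)
print(walk_endpoints(4), expansion_matches)
```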

Aparicio-Monforte and Kauers [4] discuss constructions of other, more general, rings of formal Laurent series.


3.3.1 Convergent Laurent Series and Amoebas

As usual, we mainly restrict our attention to series representing analytic functions.

Definition 3.17 (multivariate convergent Laurent series) Given a domain $D \subset \mathbb{C}^d$, the set $\mathbb{C}_D\{z\}$ of convergent Laurent series on D (centred at the origin) consists of series $\sum_{i \in \mathbb{Z}^d} f_i z^i$ with $f_i \in \mathbb{C}$ which are absolutely convergent at each point of D and uniformly convergent on compact subsets of D.

We always consider convergent Laurent expansions centred at the origin. The characterization of multivariate power series domains of convergence discussed above, as well as the Cauchy integral formula, can be extended to convergent Laurent series. Recall the Relog map
$$\operatorname{Relog}(z) = (\log |z_1|, \log |z_2|, \dots, \log |z_d|).$$
Given $x \in \mathbb{R}^d$, the inverse image $\operatorname{Relog}^{-1}(x)$ of x under Relog is the polytorus $T(e^x)$ of radius $e^x = (e^{x_1}, \dots, e^{x_d})$. A proof of the following classical result can be found in Pemantle and Wilson [55, Thm. 7.2.2].

Proposition 3.10 The domain of convergence of the series $F(z) = \sum_{i \in \mathbb{Z}^d} f_i z^i$ has the form $D = \operatorname{Relog}^{-1}(B)$ for some (possibly empty) open convex subset $B \subset \mathbb{R}^d$, and F defines an analytic function on D. Conversely, if f(z) is an analytic function on $D = \operatorname{Relog}^{-1}(B)$ with $B \subset \mathbb{R}^d$ open and convex then there exists a unique series $F(z) = \sum_{i \in \mathbb{Z}^d} c_i z^i \in \mathbb{C}_D\{z\}$ converging to f, whose coefficients are given by
$$c_i = \frac{1}{(2\pi i)^d} \int_{\operatorname{Relog}^{-1}(x)} \frac{f(z)}{z^{i+1}} \, dz \tag{3.5}$$
for any $x \in B$.

The set of formal expressions $\sum_{i \in \mathbb{Z}^d} f_i z^i$ does not have a natural ring structure, which is why we restrict ourselves to iterated Laurent series in the formal case. The set $\mathbb{C}_D\{z\}$ is, however, a ring when addition is defined term-wise and the multiplication of elements $F, G \in \mathbb{C}_D\{z\}$ representing analytic functions f(z) and g(z) on D is defined as the unique element of $\mathbb{C}_D\{z\}$ converging to the product f(z)g(z); the coefficients of this product can be determined using (3.5).

Amoebas of Laurent Polynomials

There is a nice characterization of the convergent Laurent expansions of a rational function, using an interesting construction from algebraic geometry.

Definition 3.18 (amoebas and their complements) Given a Laurent polynomial $f(z) \in \mathbb{C}[z, \overline{z}]$, the amoeba of f is the set
$$\operatorname{amoeba}(f) = \left\{ \operatorname{Relog}(z) : z \in \mathbb{C}_*^d, \ f(z) = 0 \right\} \subset \mathbb{R}^d,$$
where $\mathbb{C}_* = \mathbb{C} \setminus \{0\}$. The amoeba’s complement is $\operatorname{amoeba}(f)^c = \mathbb{R}^d \setminus \operatorname{amoeba}(f)$.

Fig. 3.2 Left: The amoeba of H(x, y) = 1 − x − y, whose complement in $\mathbb{R}^2$ contains three convex connected components corresponding to convergent Laurent expansions in three domains $D_1$, $D_2$, and $D_3$, pictured with the recession cones of those components. Right: The Newton polygon of H(x, y), together with the dual cones at each vertex.

Amoebas were introduced to the study of algebraic varieties by Bergman [12], and popularized through the work of Gelfand, Kapranov, and Zelevinsky [35] (who coined the term amoeba as two dimensional amoebas in the plane resemble cellular amoebas with tentacles going off to infinity; see Figure 3.2). The following result, proven in Gelfand, Kapranov, and Zelevinsky [35, Cor. 1.6], shows the connection between amoebas and convergent Laurent expansions.

Proposition 3.11 If f(z) is a Laurent polynomial then all connected components of the complement $\operatorname{amoeba}(f)^c$ are convex subsets of $\mathbb{R}^d$. These real convex subsets are in bijection with the Laurent series expansions of the rational function 1/f(z). When 1/f(z) has a power series expansion, then it corresponds to the component of $\mathbb{R}^d \setminus \operatorname{amoeba}(f)$ containing all points (−N, . . . , −N) for N sufficiently large.

Although the introduction of amoebas can seem artificial at first glance, it is actually quite natural. In order to picture the zero set of f, which lives in the extremely hard to visualize space $\mathbb{C}^d$, one takes the absolute value of each coordinate to work over the real numbers. Then, because domains of convergence of Laurent series are logarithmically convex, coordinate-wise logarithms are taken. Amoebas thus represent the ‘real shadow’ of a complex set, and an understanding of amoebas is crucial to get a good visual picture for power series domains of convergence, singular sets, minimal singularities, and other objects arising in analytic combinatorics in several variables.

Example 3.17 (A Prototypical Amoeba) The amoeba of H(x, y) = 1 − x − y is drawn in Figure 3.2. As $\mathbb{R}^2 \setminus \operatorname{amoeba}(H)$ has three connected components, there are three convergent Laurent expansions


of F(x, y) = 1/(1 − x − y), which converge on three disjoint domains in $\mathbb{C}^2$. In addition to the power series expansion
$$\frac{1}{1 - x - y} = \sum_{i, j \geq 0} \binom{i + j}{i} x^i y^j,$$
the binomial theorem implies
$$\frac{1}{1 - x - y} = \frac{-1/x}{1 - (1 - y)/x} = -\sum_{i \geq 0} (1 - y)^i x^{-i-1} = \sum_{i, j \geq 0} \binom{i}{j} (-1)^{j+1} y^j x^{-i-1}$$
and
$$\frac{1}{1 - x - y} = \frac{-1/y}{1 - (1 - x)/y} = -\sum_{i \geq 0} (1 - x)^i y^{-i-1} = \sum_{i, j \geq 0} \binom{i}{j} (-1)^{j+1} x^j y^{-i-1},$$
when these series absolutely converge. We have already seen that the power series expansion converges on $D_1 = \{(x, y) : |x| + |y| < 1\}$, and to find the domains of convergence of the Laurent expansions we note
$$\sum_{i, j \geq 0} \binom{i}{j} |y|^j |x|^{-i-1} = \frac{1/|x|}{1 - (1 + |y|)/|x|}$$
on $D_2 = \{(x, y) : 1 + |y| < |x|\}$, with divergence outside the closure of $D_2$, and
$$\sum_{i, j \geq 0} \binom{i}{j} |x|^j |y|^{-i-1} = \frac{1/|y|}{1 - (1 + |x|)/|y|}$$
on $D_3 = \{(x, y) : 1 + |x| < |y|\}$, with divergence outside the closure of $D_3$. Thus, we have found the three convergent Laurent series expansions of F(x, y), valid on the domains $D_1$, $D_2$, and $D_3$.

Definition 3.19 (minimal points) As for power series expansions, a singularity of a rational (or, more generally, meromorphic) function F(z) which is on the boundary ∂D of the domain of convergence of one of its convergent Laurent series expansions is called a minimal singularity or minimal point with respect to that expansion.

Properties of Amoebas

When F(z) = G(z)/H(z) is a rational function with G and H coprime, the convergent Laurent expansions of F are determined by the connected components


of $\mathbb{R}^d \setminus \operatorname{amoeba}(H)$. We now summarize some important properties of amoebas, which help visualize zero sets of multivariate polynomials and properties of Laurent expansions.

1. The Newton Polytope and Amoebas. To state our first collection of results we need a few definitions from the study of convex sets.

Definition 3.20 (definitions from convex analysis)
• The support $S(f) \subset \mathbb{Z}^d$ of a Laurent polynomial f(z) is the finite set of vectors $i \in \mathbb{Z}^d$ appearing as exponents to monomials $z^i$ in f;
• the Newton polytope $N(f) \subset \mathbb{R}^d$ of f is the convex hull of its support S(f), the smallest convex set in $\mathbb{R}^d$ containing S(f);
• a vertex of N(f) is a point in N(f) which does not lie strictly between two other points of the set, also known as an extreme point;
• the dual cone to a convex set $S \subset \mathbb{R}^d$ at a point $v \in S$ is the set of vectors $x \in \mathbb{R}^d$ such that x · v ≥ x · s for all $s \in S$;
• the recession cone of a convex set $B \subset \mathbb{R}^d$ is the set of vectors $x \in \mathbb{R}^d$ such that $x + b \in B$ for all $b \in B$.

See Figure 3.2 and our examples below for an illustration of these concepts. The following result links the Laurent expansions of a rational function G(z)/H(z) and integer points in the Newton polytope N(H).

Proposition 3.12 Let $H \in \mathbb{C}[z, \overline{z}]$ be a Laurent polynomial and $\mathcal{B} = \{B_1, \dots, B_r\}$ be the connected components of $\mathbb{R}^d \setminus \operatorname{amoeba}(H)$. Then
1) there exists an injective mapping ν from $\mathcal{B}$ to the integer points of N(H), meaning each connected component of the amoeba complement can be identified with an integer point in the Newton polytope;
2) each vertex of N(H) has a preimage under ν, meaning the number of components in the amoeba complement (and thus Laurent expansions of 1/H) is at least the number of vertices of N(H) and at most the number of integer points in N(H);
3) if ν maps the component $B \in \mathcal{B}$ to the point $v \in N(H)$, then the dual cone to N(H) at v equals the recession cone of B.

See Forsberg et al. [29] for a proof. Proposition 3.12 allows one to easily compute important information about the amoeba of H(z) directly from the polynomial. In particular, each vertex of the Newton polygon corresponds to a component of the amoeba complement which is unbounded. An integer point v of the Newton polygon N that is not a vertex may or may not correspond to a component of the complement; when v is an interior point of N it will correspond to a bounded component of the amoeba complement (if any), and when v is on the boundary of N it will correspond to an unbounded component of the amoeba complement (if any).
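The bounds in Proposition 3.12(2) are simple to compute in two dimensions. The sketch below is an illustration with hypothetical helper names (`hull_vertices`, `integer_points`): it finds the vertices of a Newton polygon with the standard monotone chain convex hull algorithm and counts its integer points, for the supports of 1 − x − y and of the polynomial $1 - x - y - 6xy - x^2y^2$ studied in Example 3.19 below.

```python
def cross(o, a, b):
    """Cross product of the vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_vertices(points):
    """Vertices of the convex hull, in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def integer_points(vertices):
    """Integer points of the convex polygon with the given CCW vertices."""
    xs = [v[0] for v in vertices]
    ys = [v[1] for v in vertices]
    m = len(vertices)
    return [
        (x, y)
        for x in range(min(xs), max(xs) + 1)
        for y in range(min(ys), max(ys) + 1)
        if all(cross(vertices[i], vertices[(i + 1) % m], (x, y)) >= 0
               for i in range(m))
    ]


support_H = [(0, 0), (1, 0), (0, 1)]                  # 1 - x - y
support_Q = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 2)]  # 1 - x - y - 6xy - x^2 y^2

for name, support in (("H", support_H), ("Q", support_Q)):
    vertices = hull_vertices(support)
    lattice = integer_points(vertices)
    # Proposition 3.12(2): len(vertices) <= #components <= len(lattice).
    print(name, len(vertices), len(lattice))
```

For 1 − x − y both bounds equal 3, matching the three expansions of Example 3.17; for the second polynomial they are 4 and 5, so only the interior point (1, 1) leaves any doubt about the number of components.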


2. Amoeba Boundaries and Contours. Suppose D is the domain of convergence of a Laurent expansion of rational F(z) = G(z)/H(z). As in the power series case, if $w \in \partial D$ then there is a singularity of F, necessarily minimal, with the same coordinate-wise modulus as w. Note that minimal singularities are precisely those which map to the boundary of amoeba(H) under the Relog map.

Proposition 3.13 Let F(z) = G(z)/H(z) be a rational function, where G and H are coprime polynomials over $\mathbb{C}$, and let $\mathcal{V}$ denote the singularities of F (which form the roots of H in $\mathbb{C}^d$). Then for any point $w \in \partial D \cap \mathcal{V}$ there exists $\lambda \in \mathbb{R}^d$ and $\tau \in \mathbb{C}$ such that $(\nabla_{\log} H)(w) = \tau\lambda$. Suppose B is a component of $\operatorname{amoeba}(H)^c$ with $\operatorname{Relog}(w) \in \partial B$. If τ ≠ 0 and w has no zero coordinate, then the hyperplane $S_w = \{z \in \mathbb{R}^d : (z - \operatorname{Relog}(w)) \cdot \lambda = 0\}$ is a support hyperplane to B (all points of B lie on one side of $S_w$). See Figure 3.3 for a visualization.

Proof Existence of λ and τ follows from the same proof as Proposition 3.6, which is Proposition 3.13 restricted to power series. For the claim about support hyperplanes, write $w_j = e^{a_j}$ for some $a_j = \log(w_j) \in \mathbb{C}$. If $\tilde{H}(x) = H(e^{x_1}, \dots, e^{x_d})$ then $(\nabla \tilde{H})(a) = (\nabla_{\log} H)(w) = \tau\lambda$ and Taylor’s theorem implies that the zero set of $\tilde{H}(x)$ near a is, up to first order approximation, the hyperplane $\{z \in \mathbb{C}^d : (z - a) \cdot \lambda = 0\}$. The second statement then holds because λ has real coordinates, each connected component of $\operatorname{amoeba}(H)^c$ is convex, and amoeba(H) is the real part of the zero set of $\tilde{H}$. □

Drawing an amoeba, or its boundary, can be a difficult task⁷, and approximation methods are prevalent. Luckily, Proposition 3.13 allows for a natural parameterization of a superset of an amoeba boundary.

Definition 3.21 (amoeba contours) Let C(H) denote the points $z \in \mathbb{C}_*^d$ such that H(z) = 0 and $(\nabla_{\log} H)(z)$ is a scalar multiple of a vector in $\mathbb{R}^d$. The image of C(H) under the Relog map is called the contour of the amoeba of H.

Corollary 3.3 Let $H \in \mathbb{C}[z, \overline{z}]$ be a Laurent polynomial.
Then the amoeba boundary satisfies
$$\partial \operatorname{amoeba}(H) \subset \{\operatorname{Relog}(z) : z \in C(H)\} \subset \operatorname{amoeba}(H).$$

In two dimensions, the contour can be parametrized by a real variable t by solving
$$x H_x(x, y) - t \, y H_y(x, y) = 0, \qquad H(x, y) = 0$$
for all complex values of x and y in terms of t, and then taking (log |x|, log |y|). Corollary 3.3 was originally given by Mikhalkin [51], and the contour was investigated as a tool for computationally drawing amoebas by Theobald [64]. Other approximation methods for amoeba boundaries and contours have been studied in [58, 8, 65, 30].

7 Even determining whether a univariate polynomial with integer coefficients has a root of modulus one, the univariate version of this problem, is NP-hard with respect to the bitsize of the input polynomial coefficients [56].

Fig. 3.3 Left: The contour of H(x, y) = 1 − x − y, parametrized by $\left( \log \left| \frac{1}{1+t} \right|, \log \left| \frac{t}{1+t} \right| \right)$ for $t \in \mathbb{R}$. Right: The logarithmic gradients $(\nabla_{\log} H)(w)$ form the normals to support hyperplanes of B at minimal points $w \in \partial D \cap \mathcal{V}$.

Example 3.18 (A Prototypical Contour) The contour of H(x, y) = 1 − x − y, shown in Figure 3.3, is determined by solving −x + ty = 1 − x − y = 0 for x and y, and then taking the Relog map, giving the real parameterization
$$\left\{ \left( \log \left| \frac{1}{1+t} \right|, \log \left| \frac{t}{1+t} \right| \right) : t \in \mathbb{R} \right\}.$$
In this case, the contour is the boundary of amoeba(H).
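For this particular polynomial the final claim of Example 3.18 can be tested numerically. The sketch below rests on an elementary observation that is not taken from the text: there exist complex x, y with |x| = r, |y| = s and x + y = 1 exactly when the lengths 1, r, s satisfy the (possibly degenerate) triangle inequalities |r − s| ≤ 1 ≤ r + s, so boundary points of the amoeba are those where one of these holds with equality. The function names are hypothetical, and the sample values of t avoid the excluded points t = 0 and t = −1.

```python
import math


def in_amoeba(a, b, tol=1e-12):
    """Membership test for amoeba(1 - x - y) at the point (a, b).

    Uses the triangle inequalities for the side lengths 1, e^a, e^b.
    """
    r, s = math.exp(a), math.exp(b)
    return abs(r - s) <= 1 + tol and r + s >= 1 - tol


def contour_point(t):
    """The contour point (log|1/(1+t)|, log|t/(1+t)|) for real t not in {0, -1}."""
    return math.log(abs(1 / (1 + t))), math.log(abs(t / (1 + t)))


def boundary_defect(a, b):
    """How far (a, b) is from equality in the nearest triangle inequality."""
    r, s = math.exp(a), math.exp(b)
    return min(abs(r + s - 1), abs(abs(r - s) - 1))


samples = [-5.0, -2.0, -1.5, -0.5, -0.1, 0.1, 1.0, 3.0]
points = [contour_point(t) for t in samples]
on_boundary = all(
    in_amoeba(a, b) and boundary_defect(a, b) < 1e-9 for a, b in points
)
print(on_boundary)
```

Positive values of t land on the boundary of the component corresponding to $D_1$, while t < −1 and −1 < t < 0 land on the boundaries of the two components coming from the Laurent expansions.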

Example 3.19 (Sketching a Contour) Figure 3.4 shows points on the contour of the degree four polynomial $Q(x, y) = 1 - x - y - 6xy - x^2 y^2$, together with its Newton polygon N(Q). Four unbounded components of the amoeba complement are visible, corresponding to the four vertices of N(Q). Furthermore, the contour suggests that there might be a bounded component of the amoeba complement containing the origin, corresponding to the interior integer point of N(Q).

Fig. 3.4 Left: Points on the contour of $Q(x, y) = 1 - x - y - 6xy - x^2 y^2$, obtained by numerically solving $xQ_x - tyQ_y = Q = 0$ for t incremented by 0.01 between −5 and 5 and taking the Relog map. Right: The Newton polytope of Q(x, y).

To see that the origin is not contained in the amoeba, we must show that there is no root of Q with |x| = |y| = 1. In order to algebraically test conditions involving the moduli of coordinates, we must work with variables taking real values. If
$$Q(a + ib, c + id) = Q_R(a, b, c, d) + i \, Q_I(a, b, c, d)$$
for polynomials $Q_R$ and $Q_I$ with real coefficients, then the polynomial Q admits a root with |x| = |y| = 1 if and only if the system
$$Q_R(a, b, c, d) = Q_I(a, b, c, d) = a^2 + b^2 - 1 = c^2 + d^2 - 1 = 0$$
has a solution $(a, b, c, d) \in \mathbb{R}^4$. Using a computer algebra package to solve this system shows there is no solution with b or d taking real values. To conclude that there is a bounded component of the amoeba complement, it is now sufficient to prove that the origin is not contained in any of the four unbounded components of the complement. Using convexity, this can be accomplished by exhibiting points in the amoeba on line segments from the origin to each unbounded component. Such points can be seen directly in Figure 3.4: although points on the contour of Q are not always on the amoeba boundary, they always lie in the amoeba. In Chapter 5 we will see how to easily determine points with explicit algebraic coordinates between the origin and the unbounded components using so-called critical points arising in an asymptotic analysis of r-diagonals for different directions r.
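The absence of roots of Q on the unit torus can also be seen numerically. The sketch below is a sanity check rather than a proof, and it replaces the computer algebra computation of the example with a plain grid search; the function names and the grid size are arbitrary choices. A short hand computation supports the result: for $x = e^{i\theta}$ and $y = e^{i\varphi}$ the real part of $Q(x, y)/(xy)$ equals $-6 - \cos\theta - \cos\varphi \leq -4$, so $|Q| \geq 4$ on the torus, with equality at x = y = −1.

```python
import cmath
import math


def Q(x, y):
    """The polynomial of Example 3.19."""
    return 1 - x - y - 6 * x * y - x**2 * y**2


def min_modulus_on_torus(samples=180):
    """Smallest |Q(x, y)| over a uniform grid on the torus |x| = |y| = 1."""
    best = float("inf")
    for i in range(samples):
        x = cmath.exp(2j * math.pi * i / samples)
        for j in range(samples):
            y = cmath.exp(2j * math.pi * j / samples)
            best = min(best, abs(Q(x, y)))
    return best


minimum = min_modulus_on_torus()
print(minimum)  # stays well away from zero
```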

3. Amoeba Limit Directions. Finally, we examine how an amoeba goes to infinity.

Definition 3.22 (limit directions of amoebas) A vector $v \in \mathbb{R}^d$ is called a limit direction of amoeba(H) if there exists $x \in \mathbb{R}^d$ such that $x + tv \in$ amoeba(H) for all t ≥ 0.


Problem 3.14 asks you to prove that if v is a limit direction of amoeba(H) then there exist distinct vectors n and m in N(H) such that $v \cdot n = v \cdot m \ge v \cdot k$ for all $k \in N(H)$. Pictorially, when v is a limit direction of amoeba(H) then there exists a hyperplane in $\mathbb{R}^d$ with normal v such that all points of N(H) lie to one side of the hyperplane and at least two elements of N(H) lie on the hyperplane. Limit directions of amoebas help rule out pathological cases for asymptotic analyses. In particular, determining asymptotics of the r-diagonal of F(z) = G(z)/H(z) can be difficult when amoeba(H) contains a limit direction orthogonal to r.

Example 3.20 (Amoeba Limit Directions) The amoeba of the polynomial $T(x, y) = 1 - x - xy$ has a limit direction (−1, 1) which is orthogonal to the main diagonal direction (1, 1). Expanding 1/T(x, y) as a power series using the binomial formula shows that the diagonal coefficient sequence is the trivial sequence (1, 1, 1, . . . ).
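The trivial diagonal in Example 3.20 is easy to confirm by machine. The sketch below (function name and truncation order are illustrative) computes the power series coefficients of 1/T from the recurrence $f_{i,j} = f_{i-1,j} + f_{i-1,j-1}$, which follows from $(1 - x - xy)F = 1$, and then reads off the diagonal.

```python
def series_coefficients(N):
    """f[i][j] = [x^i y^j] 1/(1 - x - x*y) for 0 <= i, j <= N."""
    f = [[0] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        for j in range(N + 1):
            if i == 0 and j == 0:
                f[i][j] = 1          # constant term of the expansion
                continue
            total = 0
            if i >= 1:
                total += f[i - 1][j]             # contribution of the term x
                if j >= 1:
                    total += f[i - 1][j - 1]     # contribution of the term x*y
            f[i][j] = total
    return f


if __name__ == "__main__":
    f = series_coefficients(8)
    print([f[n][n] for n in range(9)])
```

The coefficients produced are the binomial coefficients $\binom{i}{j}$, so every diagonal entry is $\binom{n}{n} = 1$.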

3.3.2 Diagonals and Non-Negative Extractions of Laurent Series

Let $F(z) = \sum_{i \in \mathbb{Z}^d} f_i z^i$ be a formal or convergent Laurent series, where $f_i$ is zero for enough indices $i \in \mathbb{Z}^d$ to make F well defined in the formal case.

Definition 3.23 (diagonals of Laurent expansions) The r-diagonal of F for $r \in \mathbb{Q}^d$ is the univariate series
\[ (\Delta_r F)(t) = \sum_{n \ge 0} f_{nr}\, t^n, \]
where $f_{nr} = 0$ if $nr \notin \mathbb{Z}^d$. We still reserve ΔF for the main diagonal r = 1.

Given a function f (z) over the complex numbers, the diagonal Δ f can only be defined after specifying which Laurent expansion of f (z) is under consideration; by Proposition 3.10 this can be done by specifying a point in the domain of convergence of the Laurent expansion. Unless explicitly noted, when given a function analytic at the origin we always consider the power series expansion of the function. Most results discussed above for diagonals of power series expansions of rational functions hold for convergent Laurent series expansions of rational functions. In particular, diagonals of convergent Laurent series with integer coefficients are G-functions, with the corresponding restrictions on asymptotic growth. Proposition 3.10 implies that for r ∈ Zd the r-diagonal of a convergent Laurent expansion of the rational function F = G/H corresponding to a component B ⊂ amoeba(H)c can be represented as


Fig. 3.5 Left: Points on the contour of R(x, y) = 1 − x − y − xy 3 . Right: The Newton polygon of R; the non-vertex integer points (shaded gray) do not correspond to convergent Laurent expansions.

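\[
\begin{aligned}
(\Delta_r F)(t) = \sum_{n \ge 0} f_{nr}\, t^n
&= \sum_{n \ge 0} \left( \frac{1}{(2\pi i)^d} \int_{\mathrm{Relog}^{-1}(x)} \frac{F(z)}{z_1^{r_1 n} \cdots z_d^{r_d n}} \, \frac{dz}{z_1 \cdots z_d} \right) t^n \\
&= \frac{1}{(2\pi i)^d} \int_{\mathrm{Relog}^{-1}(x)} F(z) \sum_{n \ge 0} \frac{t^n}{(z_1^{r_1} \cdots z_d^{r_d})^n} \, \frac{dz}{z_1 \cdots z_d} \\
&= \frac{1}{(2\pi i)^d} \int_{\mathrm{Relog}^{-1}(x)} \frac{z^{r-1} F(z)}{z^r - t} \, dz,
\end{aligned}
\]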

for any $x \in B$ and $|t| < e^{r \cdot x}$. Using this common representation, a D-finite equation satisfied by the r-diagonals of all convergent Laurent expansions of G(z)/H(z) can be computed by the creative telescoping algorithm of Lairez [45].

Example 3.21 (Diagonals and Multiple Laurent Expansions) The contour of $R(x, y) = 1 - x - y - xy^3$ is shown in Figure 3.5; the amoeba complement amoeba(R)$^c$ has four components, corresponding to four⁸ convergent Laurent expansions of 1/R. Using the creative telescoping package of Lairez, we can compute the D-finite equation
\[ (9t^2 + 6t - 1)(27t^4 + 18t^2 - 1)\, f''(t) + 6(162t^5 + 135t^4 + 36t^2 - 6t + 1)\, f'(t) + 6(81t^4 + 81t^3 - 27t^2 - 3t - 4)\, f(t) = 0 \]
satisfied by the main diagonal of any convergent Laurent expansion of 1/R,
\[ f(t) = \frac{1}{(2\pi i)^2} \int_{\Gamma} \frac{dx\, dy}{(1 - x - y - xy^3)(xy - t)} \]
with $\Gamma = \mathrm{Relog}^{-1}(a, b)$ for some $(a, b) \in$ amoeba(R)$^c$. This can be converted into an order 6 P-recursive equation satisfied by each diagonal sequence $d_n = [t^n] f(t)$,
\[ (n + 6)(n + 5)\, d_{n+6} + \cdots + 243(n + 1)(n + 2)\, d_n = 0, \]
and, in this instance, even solved in a computer algebra system to get a general solution (as expected with bivariate diagonals, it is a sum of algebraic functions). Taking a power series expansion at the origin shows that the diagonal sequence of this (power series) expansion of F = 1/R begins 1, 2, 6, 23, 90, 357, . . . , and these initial terms and the recurrence they satisfy uniquely specify the power series diagonal.

⁸ Although it appears there may be an additional component containing the origin, this can be ruled out, for instance since R(1, i) = 0 and (1, i) maps to the origin under the Relog map. This illustrates that the amoeba boundary can be a proper subset of the contour.

We now turn to the convergent Laurent expansion corresponding to the vertex (0, 1) in the Newton polygon of R. Note
\[ F(x, y) = \frac{-1/y}{1 - (1 - x - xy^3)/y} = -\frac{1}{y} \sum_{k \ge 0} (1 - x - xy^3)^k\, y^{-k} \tag{3.6} \]
with absolute convergence (at least) whenever $1 + |x| + |xy^3| < |y|$; for example, when |x| = 1/16 and |y| = 2. Expanding $\left((1 - x) - xy^3\right)^k y^{-k-1}$ using the binomial theorem shows it contains only monomials of the form
\[ x^{k-a+b}\, y^{2k-3a-1}, \qquad 0 \le a \le k, \qquad 0 \le b \le a. \]
If $k - a + b = 2k - 3a - 1$ then $0 \le k - 2a - 1 \le a$, so the exponents of terms appearing on the diagonal are at least $2k - 3a - 1 \ge (k + 1)/2$. Thus, the initial 2N − 1 terms of the series in (3.6) may be summed to determine the initial N terms of its diagonal sequence. In particular, the diagonal sequence of this (Laurent) expansion of F begins 0, 1, −3, 10, −39, 156, . . . , and these initial terms and the recurrence they satisfy uniquely specify this diagonal.

Problem 3.15 asks you to prove that the diagonal sequences of the two other convergent Laurent expansions of F are identically zero, which can be accomplished by computing initial terms of each sequence and using the above recurrence. In Chapter 5 we will see an easy asymptotic argument, not relying on the calculation of any recurrence, which shows that these final two diagonals must be eventually zero.
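The initial terms of the two non-trivial diagonals in Example 3.21 can be reproduced with elementary multinomial sums. The sketch below is one way to do this by hand rather than with the creative telescoping package mentioned in the text; the function names are illustrative. For the power series expansion, a monomial $x^n y^n$ arises from choosing a factors x, b factors y and c factors $xy^3$ with a + c = n and b + 3c = n. For the Laurent expansion (3.6), it arises from the k-th summand by choosing i factors 1, b factors −x and c factors $-xy^3$, with b + c = n and 3c − k − 1 = n.

```python
from math import factorial


def power_series_diagonal(n):
    """[x^n y^n] of the power series expansion of 1/(1 - x - y - x*y^3)."""
    total = 0
    for c in range(n // 3 + 1):        # c = number of factors x*y^3
        a, b = n - c, n - 3 * c        # numbers of factors x and y
        total += factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))
    return total


def laurent_diagonal(n):
    """[x^n y^n] of -sum_k (1 - x - x*y^3)^k * y^(-k-1), the expansion in (3.6)."""
    total = 0
    for c in range(n + 1):             # c = number of factors (-x*y^3)
        b = n - c                      # number of factors (-x)
        k = 3 * c - n - 1              # index of the summand, forced by the power of y
        i = k - b - c                  # number of factors 1
        if k < 0 or i < 0:
            continue
        multinomial = factorial(k) // (factorial(i) * factorial(b) * factorial(c))
        total += (-1) ** (b + c) * multinomial
    return -total                      # overall minus sign in (3.6)


if __name__ == "__main__":
    print([power_series_diagonal(n) for n in range(6)])  # 1, 2, 6, 23, 90, 357
    print([laurent_diagonal(n) for n in range(6)])       # 0, 1, -3, 10, -39, 156
```

Both outputs agree with the initial terms quoted in the example.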

We end this section by describing another operation on Laurent series which is used extensively in lattice path enumeration. Definition 3.24 (non-negative series extraction) Given a field K and a power series in t whose coefficients are formal iterated Laurent series in K,

\[ F(z, t) = \sum_{k \ge 0} \left( \sum_{i \in \mathbb{Z}^d} f_{i,k}\, z^i \right) t^k \in K((z))[[t]], \]
the non-negative series extraction operator $[z_1^{\ge 0} \cdots z_d^{\ge 0}]$ takes F(z, t) and returns the power series
\[ [z_1^{\ge 0} \cdots z_d^{\ge 0}]\, F(z, t) = \sum_{k \ge 0} \left( \sum_{i \in \mathbb{N}^d} f_{i,k}\, z^i \right) t^k = \sum_{(k, i) \in \mathbb{N}^{d+1}} f_{i,k}\, z^i t^k \in K[[z, t]] \]
defined by taking only those terms with non-negative exponents in the z variables. The non-negative series extraction of a function F(z, t) having a convergent (instead of formal) Laurent series expansion which is a power series in the t variable is defined analogously.

Recall that the formal Laurent series ring K((z)) implicitly comes with an ordering of the variables, and even though the image of F(z, t) under $[z_1^{\ge 0} \cdots z_d^{\ge 0}]$ is a power series it may depend on the underlying ordering. When $F(z, t) \in K[z, \overline{z}][[t]]$ is a power series in t whose coefficients are Laurent polynomials in z, the image of F(z, t) under $[z_1^{\ge 0} \cdots z_d^{\ge 0}]$ is independent of how the variables are ordered. Certain variants of the kernel method for lattice path enumeration, described in Chapter 4, rely heavily on generating function representations using non-negative series extractions of rational functions. The following result gives a relationship between evaluations of a multivariate function encoded as the non-negative series extraction of a Laurent series and a diagonal extraction.

Proposition 3.14 Let $F(z, t) \in K[z, \overline{z}][[t]]$. Then for $a \in \{0, 1\}^d$,
\[ \left( [z_1^{\ge 0} \cdots z_d^{\ge 0}]\, F(z, t) \right) \Big|_{z = a} = \Delta \left( \frac{F(\overline{z}_1, \dots, \overline{z}_d, z_1 \cdots z_d\, t)}{(1 - z_1)^{a_1} \cdots (1 - z_d)^{a_d}} \right). \tag{3.7} \]

The left hand side of (3.7) refers to first taking the non-negative extraction of F(z, t) and then substituting z = a. When $G(z, t) = [z_1^{\ge 0} \cdots z_d^{\ge 0}]\, F(z, t)$ is a multivariate generating function tracking parameters in some combinatorial class, setting $z_j = 0$ in G corresponds to counting objects where the jth parameter is 0, while setting $z_j = 1$ corresponds to counting objects where the jth parameter can take any value. Note that the specialization on the left hand side of (3.7), and the substitution on its right-hand side, are well defined as each coefficient of F(z, t) with respect to t is a Laurent polynomial. We prove Proposition 3.14 after an example.

Example 3.22 (A Constant Term Extraction using Diagonals) Let
\[ F(x, t) = \frac{1}{1 - t(\overline{x} + x)} = \sum_{n \ge 0} (\overline{x} + x)^n\, t^n \in \mathbb{Q}[x, \overline{x}][[t]]. \]
We have seen above that the coefficient $[x^i t^n] F(x, t)$ for $(i, n) \in \mathbb{Z} \times \mathbb{N}$ counts the number of walks taking n steps in {−1, 1} which begin at the origin and end at the point x = i. Thus, Proposition 3.14 implies
\[ \left( [x^{\ge 0}]\, F(x, t) \right) \Big|_{x = 0} = \Delta F(\overline{x}, xt) = \Delta \left( \frac{1}{1 - t(1 + x^2)} \right) \]
counts the number of walks on n steps which begin and end at the origin (note one cannot just substitute x = 0 directly into the series for F(x, t) as it contains terms with negative exponents in x), while
\[ \left( [x^{\ge 0}]\, F(x, t) \right) \Big|_{x = 1} = \Delta \left( \frac{F(\overline{x}, xt)}{1 - x} \right) = \Delta \left( \frac{1}{(1 - x)(1 - t(1 + x^2))} \right) \]
counts the number of walks on n steps beginning at the origin and ending at any point x ≥ 0. As diagonals of bivariate functions these generating functions are algebraic, and annihilating polynomials can be found using Proposition 3.7. Any walk beginning and ending at the origin must have even length 2n and consist of n steps +1 and n steps −1, showing combinatorially that
\[ \Delta \left( \frac{1}{1 - t(1 + x^2)} \right) = \sum_{n \ge 0} \binom{2n}{n} t^{2n} = (1 - 4t^2)^{-1/2}. \]
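The two diagonal identities of Example 3.22 can be checked against a direct enumeration of walks. In the sketch below (all function names are illustrative), the diagonal coefficients are written in closed form using $[x^n t^n]\, 1/(1 - t(1 + x^2)) = [x^n](1 + x^2)^n$ and the analogous expression with the extra factor 1/(1 − x), and then compared with walk counts obtained by dynamic programming.

```python
from math import comb


def walks_by_endpoint(n):
    """Map endpoint -> number of n-step walks from 0 with steps -1 and +1."""
    counts = {0: 1}
    for _ in range(n):
        nxt = {}
        for pos, ways in counts.items():
            for step in (-1, 1):
                nxt[pos + step] = nxt.get(pos + step, 0) + ways
        counts = nxt
    return counts


def diag_excursions(n):
    """[x^n t^n] 1/(1 - t(1 + x^2)) = [x^n] (1 + x^2)^n."""
    return comb(n, n // 2) if n % 2 == 0 else 0


def diag_nonneg_endpoint(n):
    """[x^n t^n] 1/((1 - x)(1 - t(1 + x^2))) = sum of C(n, j) over 2j <= n."""
    return sum(comb(n, j) for j in range(n // 2 + 1))


if __name__ == "__main__":
    for n in range(8):
        w = walks_by_endpoint(n)
        ending_nonneg = sum(v for p, v in w.items() if p >= 0)
        print(n, w.get(0, 0), diag_excursions(n), ending_nonneg, diag_nonneg_endpoint(n))
```

In each printed row the second and third entries agree, as do the fourth and fifth.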

Proof (of Proposition 3.14) The right hand side of Equation (3.7) is given by
\[
\begin{aligned}
&\Delta \left[ \left( \sum_{k \ge 0} z_1^k \right)^{a_1} \cdots \left( \sum_{k \ge 0} z_d^k \right)^{a_d} \sum_{k \ge 0} \sum_{i \in \mathbb{Z}^d} f_{i,k}\, z_1^{k - i_1} \cdots z_d^{k - i_d}\, t^k \right] \\
&\qquad = \Delta \left[ \sum_{k \ge 0} \left( \sum_{j \in \mathbb{N}^d} \sum_{i \in \mathbb{Z}^d} f_{i,k}\, z_1^{a_1 j_1 - i_1} \cdots z_d^{a_d j_d - i_d} \right) (z_1 \cdots z_d\, t)^k \right],
\end{aligned}
\]
where these manipulations are valid because enough coefficients $f_{i,k}$ are zero to make F(z, t) a power series in t whose coefficients are Laurent polynomials in z. If $a_j = 0$ for all 1 ≤ j ≤ d then each $i_j = 0$ in the inner sum for any term on the diagonal. If, however, $a_j = 1$ then any terms with $i_j$ non-negative in the inner sum lie on the diagonal. Evaluating the non-negative series extraction at $z_j = 0$ removes all terms with positive powers of $z_j$, while evaluating at $z_j = 1$ sums all coefficients with non-negative powers of $z_j$. □


3.4 Sources of Rational Diagonals In order to further motivate the study of rational diagonals, and their asymptotics, we end this Chapter by describing several domains of mathematics where they naturally appear. Examples discussed here will be analyzed throughout Parts II and III of this text. Additionally, the theory of lattice path enumeration, our main domain of application, is described in Chapters 4, 6, and 10.

3.4.1 Binomial Sums

Given a rational function F(z) which is analytic at the origin, the binomial theorem implies that the power series coefficients of F can be represented as a sum of binomial coefficients. In fact, the converse is also true, although we need a bit of notation to state the formal result.

Definition 3.25 (rational binomial sums) An integer indexed sequence of dimension d is a map $u : \mathbb{Z}^d \to \mathbb{Q}$. Any such sequence $u_{i_1, \dots, i_s}$ of dimension s < d can be viewed as a sequence of dimension d by defining $u_{i_1, \dots, i_d} = u_{i_1, \dots, i_s}$, and with this identification one may add and multiply integer indexed sequences of any dimension term by term. The class of rational binomial sums is the smallest $\mathbb{Q}$-algebra of integer indexed sequences (closed under term by term addition and multiplication, and multiplication by rational numbers) which
• contains the univariate Kronecker delta sequence defined by $\delta_n = 1$ if n = 0 and $\delta_n = 0$ otherwise;
• contains all geometric sequences $g_n = C^n$ for non-zero $C \in \mathbb{Q}$;
• contains the bivariate binomial coefficient sequence $b_{n,k} = \binom{n}{k}$, defined to be zero whenever k < 0;
• is closed under affine maps on sequence indices;
• is closed under indefinite summation: whenever $u_{i,k}$ is a binomial sum then so is
\[
s_{i,m} = \begin{cases}
\sum_{k=0}^{m} u_{i,k} & : m \ge 0 \\
0 & : m = -1 \\
-\sum_{k=m+1}^{-1} u_{i,k} & : m < -1
\end{cases}
\]

We are typically interested in sequences when their indices have non-negative values, in which case $s_{i,m}$ is defined by regular summation of $u_{i,k}$.

Example 3.23 (Building a Binomial Sum) Since $b_{n,k} = \binom{n}{k}$ is a binomial sum and the class of binomial sums is closed under affine maps on indices, $b_{n+k,k} = \binom{n+k}{k}$ is a binomial sum. The sequence $\sigma_n$ which is one when n ≥ 0 and zero otherwise is the indefinite summation of the Kronecker delta sequence, and thus is a binomial sum. Since the class of binomial sums is closed under multiplication, this implies $\sigma_n \binom{n}{k}^2 \binom{n+k}{k}^2$ is a binomial sum. Finally, since the class of binomial sums is closed under summation, the sequence $a_n$ which is zero if n < 0 and
\[ a_n = \sum_{k=0}^{n} \binom{n}{k}^2 \binom{n+k}{k}^2 \]
for n ≥ 0 is a binomial sum.

The class of univariate binomial sums supported on the natural numbers is exactly the class of diagonal sequences of rational functions.

Proposition 3.15 A univariate sequence $(u_n)$, zero whenever n < 0, is a rational binomial sum if and only if the generating function $U(z) = \sum_{n \ge 0} u_n z^n$ is the main power series diagonal of a rational function over $\mathbb{Q}$ which is analytic at the origin.

A proof of Proposition 3.15 can be found in Bostan et al. [16, Thm. 3.5], and a Maple package of Lairez⁹ allows one to determine a rational diagonal expression for a given binomial sum. Binomial sums appear in several areas of mathematics, especially number theory.

⁹ The Maple package of Lairez, available at https://github.com/lairez/binomsums, contains the command sumtores which returns a rational function $R(y, z) \in \mathbb{Q}(y, z)$ such that the binomial sum under consideration is the main power series diagonal of $y_1 \cdots y_d\, R(y, y_1 y_2 \cdots y_d z)$.

Example 3.24 (Apéry Numbers as Diagonals) Apéry [5] proved that the constant $\zeta(3) = \sum_{n=1}^{\infty} \frac{1}{n^3}$ was irrational (an open problem for hundreds of years) by constructing two sequences of rational numbers whose ratios converge to ζ(3) at a rate which implies that ζ(3) is irrational. Alfred van der Poorten's canonical report [66] on the proof gives the following exercise: "Be the first on your block to prove by a 2-line argument that ζ(3) is irrational" by determining an algebraic relationship between the two sequences and determining the exponential growth of the binomial sum sequence
\[ a_n = \sum_{k=0}^{n} \binom{n}{k}^2 \binom{n+k}{k}^2. \]
The integers $a_n$ are often referred to as the Apéry numbers (see the Online Encyclopedia of Integer Sequences entry A005259) and the Maple package of Lairez shows that their generating function satisfies
\[ A(z) = \Delta \left( \frac{1}{1 - t(1 + x)(1 + y)(1 + z)(1 + y + z + yz + xyz)} \right). \]


In his seminal paper proving diagonals of rational functions are D-finite, Christol [22] gave two diagonal expressions for the Apéry numbers; Bostan et al. [14, Appendix B] list four additional rational diagonal expressions for the generating function of $a_n$ (two containing 5 variables, one containing 6 variables, and one containing 8 variables, none of which is the representation given here). The singular sets of these rational functions exhibit somewhat different behaviours, with our first set of asymptotic methods in Chapter 5 applying to some, and the others needing the more refined methods of Chapter 9.

Apéry also presented a novel proof of the irrationality of ζ(2) which relies on asymptotics of the binomial sum sequence
\[ c_n = \sum_{k=0}^{n} \binom{n}{k}^2 \binom{n+k}{k}. \]
The $c_n$ are also referred to as Apéry numbers (OEIS entry A005258). Apéry himself noted [6] that the generating function C(z) of $(c_n)$ is the diagonal of two trivariate rational functions
\[ C(z) = \Delta \left( \frac{1}{1 - (1 + z)(x + y - xy)} \right) = \Delta \left( \frac{1}{1 - x - y - z(1 - x)(1 - y)} \right), \]
and the Maple package of Lairez shows
\[ C(z) = \Delta \left( \frac{1}{1 - z(1 + x)(1 + y)(xy + y + 1)} \right). \]
We use the tools of analytic combinatorics in several variables to easily determine asymptotics of these sequences by hand in Chapter 5, and the algorithms of Chapter 7 are able to rigorously and automatically compute their asymptotics.
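The binomial sums above, and the last diagonal expression for C(z), are simple to check numerically. The following sketch does not use the Maple package cited in the text; it is a small hand-rolled check with illustrative names (`apery_zeta3`, `apery_zeta2`, `poly_mul`, `diagonal_coefficient`). It computes both Apéry sequences from their defining sums and compares $c_n$ with $[x^n y^n]\, P(x, y)^n$, where $P = (1 + x)(1 + y)(xy + y + 1)$, which is the coefficient of $x^n y^n z^n$ in $1/(1 - zP)$.

```python
from math import comb


def apery_zeta3(n):
    """a_n = sum_k C(n,k)^2 C(n+k,k)^2 (OEIS A005259)."""
    return sum(comb(n, k) ** 2 * comb(n + k, k) ** 2 for k in range(n + 1))


def apery_zeta2(n):
    """c_n = sum_k C(n,k)^2 C(n+k,k) (OEIS A005258)."""
    return sum(comb(n, k) ** 2 * comb(n + k, k) for k in range(n + 1))


def poly_mul(p, q, cap):
    """Multiply polynomials stored as {(i, j): coeff}, dropping exponents above cap."""
    out = {}
    for (a, b), u in p.items():
        for (c, d), v in q.items():
            if a + c <= cap and b + d <= cap:
                out[(a + c, b + d)] = out.get((a + c, b + d), 0) + u * v
    return out


# P(x, y) = (1 + x)(1 + y)(1 + y + x*y), with keys (power of x, power of y)
P = poly_mul(poly_mul({(0, 0): 1, (1, 0): 1}, {(0, 0): 1, (0, 1): 1}, 2),
             {(0, 0): 1, (0, 1): 1, (1, 1): 1}, 2)


def diagonal_coefficient(n):
    """[x^n y^n z^n] of 1/(1 - z*P(x, y)), i.e. [x^n y^n] P(x, y)^n."""
    power = {(0, 0): 1}
    for _ in range(n):
        power = poly_mul(power, P, n)   # exponents above n cannot contribute
    return power.get((n, n), 0)


if __name__ == "__main__":
    print([apery_zeta3(n) for n in range(4)])            # 1, 5, 73, 1445
    print([apery_zeta2(n) for n in range(5)])            # 1, 3, 19, 147, 1251
    print([diagonal_coefficient(n) for n in range(5)])
```

Truncating at exponent n inside `poly_mul` is safe here because every monomial of P has non-negative exponents.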

3.4.2 Irrational Tilings As seen in Chapter 2, the smallest collection of univariate rational functions containing 0 and z and closed under addition, multiplication, and f → 1/(1 − z f ) form the class of N-rational functions, studied for connections to theoretical computer science and combinatorics. There is a natural multivariate generalization. Definition 3.26 (multivariate N-rational functions) The set of d-variate N-rational functions is the smallest set of rational functions containing 0 and z1, . . . , zd which is closed under addition, multiplication, and pseudo-inverse G → 1/(1 − z j G) for each j = 1, . . . , d. Garrabrant and Pak [34] give a combinatorial characterization of sequences which are diagonals of N-rational functions.


Fig. 3.6 Left: Two tiles, of size 1 × α and 1 × (1 − α) with $\alpha = 1/\sqrt{2}$. Right: Two possible tilings (out of six) of a 1 × 2 rectangle. There are $\binom{2n}{n}$ possible tilings of a 1 × n rectangle with these tiles.

Definition 3.27 (tile-counting functions) A tile is an axis-parallel simply connected closed polygon in the plane of height 1, and a tiling of a rectangle R of height 1 with the set of tiles T is a sequence of tiles in T, overlapping only on their boundaries, which cover R (see Figure 3.6). For a fixed set of tiles T and fixed ε > 0, define $f_{T,\varepsilon}(n)$ to be the number of tilings of a 1 × (n + ε) rectangle using the elements of T for all $n \in \mathbb{N}$. Let $\mathcal{F}$ be the set of all such tile-counting functions $f_{T,\varepsilon} : \mathbb{N} \to \mathbb{N}$ as T and ε vary.

Garrabrant and Pak [34, Thm. 1.2] prove the following result.

Proposition 3.16 The sequence $f(n) \in \mathcal{F}$ if and only if the generating function $\sum_{n \ge 0} f(n) z^n$ is the diagonal of an N-rational function.

This allows for a combinatorial interpretation of many rational diagonal sequences.

Example 3.25 (Central Binomial Coefficients Enumerate Tilings) Let $\alpha = 1/\sqrt{2}$. Because α is irrational, tiling a 1 × n rectangle with tiles of size 1 × α and 1 × (1 − α), as shown in Figure 3.6, requires exactly n copies of each tile, which can be arranged in any order. Thus, there are $\binom{2n}{n}$ such tilings, and the corresponding generating function is the diagonal of the rational function F(x, y) = 1/(1 − x − y).

By representing the diagonal coefficients of N-rational functions in terms of a restricted class of binomial sums, Garrabrant and Pak [34, Thm. 4.2] strengthen Corollary 3.2 on asymptotics of rational diagonals. In particular, they show that if F(z) is N-rational with main diagonal sequence $d_n$ then there exists $m \in \mathbb{N}$ such that for each k = 0, . . . , m − 1 the sub-sequence $(d_{mn+k})_{n \ge 0}$ either grows exponentially or is eventually polynomial (when the sequence grows exponentially then its subexponential terms can be non-polynomial).

Although there is an algorithm to determine when a univariate function is N-rational, it is currently unknown how to characterize N-rationality in higher dimensions. For example, it is an open question [34, Conj. 4.6] whether the generating function for the Catalan numbers $C_n = \frac{1}{n+1}\binom{2n}{n}$ is the diagonal of an N-rational function (in any number of variables), while as an algebraic function it is the diagonal of a bivariate rational function. The univariate characterization of N-rationality relies heavily on a singularity analysis which does not easily translate into the multivariate case. The techniques developed for the methods of analytic combinatorics in several variables may provide a source of tools to study this problem.


Open Problem 3.3 Is there an algorithm to determine when a multivariate rational function whose power series at the origin has coefficients in N is N-rational?

3.4.3 Period Integrals

The ring of period numbers, introduced by Kontsevich and Zagier [42] as "the next most important class in the hierarchy of numbers [after algebraic numbers] according to their arithmetic properties," contains all complex numbers whose real and imaginary parts can be expressed as absolutely convergent integrals of the form
\[ \int_{\Gamma} F(z)\, dz, \]
where $F(z) \in \mathbb{Q}(z)$ and $\Gamma \subset \mathbb{R}^n$ is defined by polynomial inequalities with rational coefficients.

Example 3.26 (Period Numbers) The identities
\[ \pi = \iint_{x^2 + y^2 \le 1} dx\, dy, \qquad \log 2 = \int_1^2 \frac{dx}{x}, \]

∫ ζ(3) = 0 0 and W(z, t) counts the number of weighted walks marked by endpoint and length, as long as the characteristic polynomial is replaced by its weighted version S(z) = i∈S wi zi .

4.1.2 A Deeper Kernel Analysis: One-Dimensional Excursions

When d = 1, the generating function counting walks ending at the origin is
\[ E(t) = \Delta \left( \frac{x^p}{1 - t\, x S(x)} \right) \]
which, as the diagonal of a bivariate function, is algebraic. Assume now that p = 0 and let −m and M with m, M > 0 be the smallest and largest elements of S, so that S contains both a step with negative value and a step with positive value (otherwise there can be no walks ending at the origin). For |t| < 1/S(1) the generating function is given by the integral expression
\[ E(t) = \frac{1}{2\pi i} \int_{|x| = 1} \frac{1}{1 - tS(x)} \, \frac{dx}{x}, \tag{4.4} \]
which follows either from the diagonal expression or straight from the Cauchy integral formula. The poles of this integrand are the roots of K(x, t) = 1 − tS(x), which we now study. Since $x^m K(x, t) = x^m - t x^m S(x)$ is a polynomial of degree m + M in x, Proposition 2.10 in Chapter 2 implies the kernel K(x, t), considered as a polynomial in x, has m roots $r_1(t), \dots, r_m(t)$ which are fractional power series in t and M roots $R_1(t), \dots, R_M(t)$ which approach infinity as t approaches zero.

Definition 4.5 (small and large kernel roots) The roots $r_1(t), \dots, r_m(t)$ are the small roots of K(x, t), while $R_1(t), \dots, R_M(t)$ are the large roots of K(x, t).

Setting t = 0 in the equation $r_j(t)^m = t\, r_j(t)^m S(r_j(t))$ implies $r_j(0) = 0$; i.e., a small root $r_j(t)$ has no constant term. Although not necessary here, we note that since $r_j(t)$ has no constant term one may substitute $x = r_j(t)$ in any element of $\mathbb{R}[[x, t]]$ to obtain a well-defined Puiseux expansion in t. This will be key to enumerating walks in a half-space.


Proposition 4.1 Let $S \subset \mathbb{Z}$ be a step set with smallest element −m for m > 0. Then the generating function E(t) enumerating walks on the steps S which begin and end at the origin has the representation
\[ E(t) = t \sum_{j=1}^{m} \frac{r_j'(t)}{r_j(t)}, \]
where $r_1(t), \dots, r_m(t)$ are the small roots of the kernel K(x, t) = 1 − tS(x) in x.

Remark 4.1 The statement of Proposition 4.1 does not require that S has a positive step. If S contains only negative steps then there are no excursions of positive length and the expression for E(t) correctly simplifies to 1.

Proof For t non-zero but sufficiently close to the origin only the small roots $r_j(t)$ lie inside the curve |x| = 1, so (4.4) and the residue theorem imply that E(t) is a sum of residues at these roots (note x = 0 is not a singularity of the integrand as m > 0). If r(t) satisfies 1 − tS(r(t)) = 0 then taking the derivative with respect to t shows $-S(r(t)) - t\, r'(t)\, S'(r(t)) = 0$, so that $S'(r(t)) \ne 0$ and
\[ S'(r(t)) = -\frac{S(r(t))}{t\, r'(t)} = -\frac{1}{t^2\, r'(t)}. \]
Thus, each root of the kernel is a simple zero and
\[ E(t) = \sum_{j=1}^{m} \operatorname*{Res}_{x = r_j(t)} \left( \frac{1}{x(1 - tS(x))} \right) = t^{-1} \sum_{j=1}^{m} \frac{1}{-r_j(t)\, S'(r_j(t))} = t \sum_{j=1}^{m} \frac{r_j'(t)}{r_j(t)}, \]
as desired.



Example 4.1 (One Dimensional Excursions) Suppose $S = \{-2, -1, 0, 1, 2\} \subset \mathbb{Z}$, so that $S(x) = x^{-2} + x^{-1} + 1 + x + x^2$. As expected, the equation 1 − tS(x) = 0 has two solutions
\[ r_1(t) = t^{1/2} + \frac{t}{2} + \frac{5t^{3/2}}{8} + \cdots, \qquad r_2(t) = -t^{1/2} + \frac{t}{2} - \frac{5t^{3/2}}{8} + \cdots \]
which approach zero as t approaches zero, and two solutions
\[ R_1(t) = t^{-1/2} - \frac{1}{2} - \frac{3t^{1/2}}{8} + \cdots, \qquad R_2(t) = -t^{-1/2} - \frac{1}{2} + \frac{3t^{1/2}}{8} + \cdots \]
which approach infinity as t approaches zero. Thus, the generating function of walks beginning and ending at the origin satisfies
\[ E(t) = \frac{t\, r_1'(t)}{r_1(t)} + \frac{t\, r_2'(t)}{r_2(t)} = 1 + t + 5t^2 + 19t^3 + 85t^4 + \cdots. \]
Note that although $r_1(t)$ and $r_2(t)$ are not analytic at the origin the generating function E(t) is analytic there. Knowing the minimal polynomial of $r_1$ and $r_2$, the minimal polynomial m(t, y) of E(t) can be calculated using resultants, giving
\[ m(t, y) = (4 + 5t)(5t - 1)^2 (t - 1)^2\, y^4 + 2(t - 1)(5t - 2)(5t - 1)\, y^2 + t. \]

Problem 4.1 asks you to generalize this argument to walks with arbitrary starting point p ∈ Z. See also Banderier and Flajolet [5] for similar results in this setting.
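The series and the minimal polynomial in Example 4.1 can be cross-checked without computing any kernel roots. The sketch below (function names and the truncation order are illustrative) counts excursions on the step set {−2, −1, 0, 1, 2} by dynamic programming and then substitutes the truncated series for E(t) into the quoted polynomial m(t, y), working modulo $t^{N+1}$ with exact integer arithmetic. It is a consistency check on the stated formulas, not the resultant computation mentioned in the text.

```python
def excursion_counts(steps, N):
    """e[n] = number of n-step unrestricted walks on `steps` from 0 back to 0."""
    counts = {0: 1}
    e = [1]
    for _ in range(N):
        nxt = {}
        for pos, ways in counts.items():
            for s in steps:
                nxt[pos + s] = nxt.get(pos + s, 0) + ways
        counts = nxt
        e.append(counts.get(0, 0))
    return e


def mul(p, q, N):
    """Product of two power series (coefficient lists), truncated after t^N."""
    out = [0] * (N + 1)
    for i, u in enumerate(p):
        for j, v in enumerate(q):
            if i + j <= N:
                out[i + j] += u * v
    return out


def minimal_polynomial_residual(e, N):
    """Coefficients of m(t, E(t)) mod t^(N+1) for m(t, y) as quoted in Example 4.1 (N >= 1)."""
    E2 = mul(e, e, N)
    E4 = mul(E2, E2, N)
    A = [4, 5]                                            # (4 + 5t)
    for factor in ([-1, 5], [-1, 5], [-1, 1], [-1, 1]):   # (5t - 1)^2 (t - 1)^2
        A = mul(A, factor, N)
    B = [2]
    for factor in ([-1, 1], [-2, 5], [-1, 5]):            # 2 (t - 1)(5t - 2)(5t - 1)
        B = mul(B, factor, N)
    res = [u + v for u, v in zip(mul(A, E4, N), mul(B, E2, N))]
    res[1] += 1                                           # the final "+ t"
    return res


if __name__ == "__main__":
    e = excursion_counts([-2, -1, 0, 1, 2], 8)
    print(e[:5])                                # 1, 1, 5, 19, 85
    print(minimal_polynomial_residual(e, 8))    # all zeros if the formulas agree
```

Because only coefficients up to $t^N$ of E(t) enter the residual modulo $t^{N+1}$, the check is exact to the chosen order.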

4.1.3 Walks in a Half-Space

Consider now a weighted step set $S \subset \mathbb{Z}^d$ restricted to the half-space $R = \mathbb{Z}^{d-1} \times \mathbb{N}$. In order to enumerate walks ending anywhere, or walks ending on the boundary hyperplane $z_d = 0$, it is enough to project each step onto its dth coordinate to obtain a new weighted step set $S \subset \mathbb{Z}$ and count walks restricted to $R = \mathbb{N}$. To simplify our presentation we thus restrict ourselves to the case of one-dimensional walks in $\mathbb{N}$ beginning at the origin, losing only a small amount of generality. Now, define
\[ H(x, t) = \sum_{n \ge 0} \left( \sum_{i \ge 0} h_{i,n}\, x^i \right) t^n \in \mathbb{R}[x][[t]] \]
where $h_{i,n}$ counts the number of weighted half-space walks of length n on the steps in S ending at a point with x-coordinate i. As before, let −m and M be the smallest and largest elements of S, where we assume m, M > 0 so that the valid walks are nontrivial and actually interact with the boundary of $R = \mathbb{N}$. With $H_n(x) = [t^n] H(x, t)$ enumerating the walks of length n by endpoint, our goal is to obtain a recurrence for $H_n$; this recurrence will not be the same as the unrestricted case as we must take into account walks that try to leave $\mathbb{N}$. Fortunately, using the bivariate function H(x, t) tracking the endpoint of a walk allows us to ensure that walks do not leave $\mathbb{N}$. Define again the weighted characteristic polynomial
\[ S(x) = \sum_{i \in S} w_i x^i = w_{-m} x^{-m} + \cdots + w_M x^M, \]
where $w_i > 0$ is the real weight associated to the step $i \in S$. To keep track of walks potentially leaving the half-space, for any integer j with 0 ≤ j < m we let S0 that F(z) = G(z)/H(z) admits a power series expansion $F(z) = \sum_{i \in \mathbb{N}^d} f_i z^i$. Suppose that the system of polynomial equations

© Springer Nature Switzerland AG 2021 S. Melczer, Algorithmic and Symbolic Combinatorics, Texts & Monographs in Symbolic Computation, https://doi.org/10.1007/978-3-030-67080-1_5


5 The Theory of ACSV for Smooth Points

\[ H(z) = r_2 z_1 H_{z_1}(z) - r_1 z_2 H_{z_2}(z) = \cdots = r_d z_1 H_{z_1}(z) - r_1 z_d H_{z_d}(z) = 0 \]
admits a finite number of solutions, exactly one of which, $w \in \mathbb{C}_*^d$, is minimal. Suppose further that $H_{z_d}(w) \ne 0$, that the matrix $\mathcal{H}$ defined by (5.24) and (5.25) below has non-zero determinant, and that $G(w) \ne 0$. Then, as n → ∞,
\[ f_{nr} = w^{-nr}\, n^{(1-d)/2}\, \frac{(2\pi r_d)^{(1-d)/2}}{\sqrt{\det(\mathcal{H})}} \, \frac{-G(w)}{w_d\, H_{z_d}(w)} \left( 1 + O\!\left( \frac{1}{n} \right) \right) \tag{5.1} \]
when $nr \in \mathbb{N}^d$. As r varies in any sufficiently small neighbourhood $\mathcal{N}$ in $\mathbb{R}^d_{>0}$ the solution w = w(r) varies smoothly with r. When the above conditions are satisfied for each w(r) with $r \in \mathcal{N}$, the hidden constant in the big-O error in (5.1) can be chosen independently of r in any compact subset of $\mathcal{N}$. When the zero set of H contains a finite number of points with the same coordinatewise modulus as w, all of which satisfy the same conditions as w, then an asymptotic expansion of $f_{nr}$ is obtained by summing the right hand side of (5.1) at each point. When the power series expansion of F(z) contains only a finite number of negative coefficients, any point $w \in \mathbb{C}_*^d$ with H(w) = 0 is minimal if and only if $H(tw_1, \dots, tw_d)$ is non-zero for all 0 ≤ t < 1.

Theorem 5.1 is a special case of Theorem 5.4 below.

Example 5.1 (Asymptotics of Central Binomial Coefficients) Consider the power series expansion
\[ F(x, y) = \frac{1}{1 - x - y} = \sum_{i, j \ge 0} f_{i,j}\, x^i y^j = \sum_{i, j \ge 0} \binom{i + j}{i} x^i y^j. \]
For r, s > 0 the system of equations $H(x, y) = s x H_x(x, y) - r y H_y(x, y) = 0$ becomes
\[ 1 - x - y = -xs + yr = 0, \]
with solution
\[ (x, y) = \left( \frac{r}{r + s}, \frac{s}{r + s} \right). \]
The minimal singularities of F(x, y) were described in Chapter 3, where it was shown that any point (x, y) with x, y > 0 and x + y = 1 was minimal. Building the matrix $\mathcal{H}$ defined by (5.24) and (5.25), which in the bivariate case is just a constant, shows that the conditions of Theorem 5.1 are satisfied here, and we calculate
\[ f_{rn, sn} \sim \left( \frac{r + s}{r} \right)^{rn} \left( \frac{r + s}{s} \right)^{sn} \frac{\sqrt{r + s}}{\sqrt{2 r s \pi n}}. \]
Section 5.1 goes into great detail on how to derive this expansion.
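The asymptotic formula of Example 5.1 is easy to test numerically, since the exact coefficient is $f_{rn,sn} = \binom{(r+s)n}{rn}$. The following sketch (function names are illustrative) compares the exact value with the stated leading term for integer directions (r, s), working with logarithms so that the large numbers involved do not overflow floating point.

```python
from math import comb, exp, log, pi


def log_leading_term(r, s, n):
    """Logarithm of the leading asymptotic term for f_{rn,sn} in Example 5.1."""
    return (r * n * log((r + s) / r) + s * n * log((r + s) / s)
            + 0.5 * log(r + s) - 0.5 * log(2 * r * s * pi * n))


def ratio(r, s, n):
    """Exact coefficient binom((r+s)n, rn) divided by the leading asymptotic term."""
    exact = comb((r + s) * n, r * n)
    return exp(log(exact) - log_leading_term(r, s, n))


if __name__ == "__main__":
    for r, s in [(1, 1), (1, 2), (3, 2)]:
        print((r, s), [round(ratio(r, s, n), 6) for n in (10, 100, 1000)])
```

The printed ratios should approach 1 at a rate consistent with the O(1/n) error term in (5.1).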


Below we relax many of the conditions of Theorem 5.1, considering more general Laurent expansions and determining an explicit asymptotic expansion of fnr in decreasing powers of n. In addition, we show that most of our assumptions hold generically, meaning they hold for all rational functions with numerator and denominator of fixed degree, except those whose coefficients lie in some fixed algebraic set. Our main results and the necessary surrounding theory are developed in Section 5.2. Section 5.3 illustrates how the theory is put into practice, including (in Section 5.3.3) a characterization of how asymptotics of fnr varies with r and an illustration for how this implies a central limit theorem for series coefficients. Section 5.3.4 concludes by proving that most of our assumptions hold generically. We make frequent use of the results and concepts introduced in Chapter 3 throughout.

Setup

As in the univariate case, the analysis begins with the (multivariate) Cauchy integral formula from Theorem 3.1 (for power series expansions) and Proposition 3.10 (for more general Laurent expansions) in Chapter 3. Given a meromorphic function F(z) with absolutely convergent Laurent expansion $F(z) = \sum_{i \in \mathbb{Z}^d} f_i z^i$ in some domain $D \subset \mathbb{C}^d$, we fix a direction $r \in \mathbb{R}^d$ and study asymptotics of the coefficient sequence
\[ f_{nr} = \frac{1}{(2\pi i)^d} \int_{T(w)} F(z)\, \frac{dz}{z^{nr + 1}} \]
as n → ∞, where T(w) is the polytorus defined by any $w \in D$. In most applications, F(z) is a rational function and we study asymptotics of its power series expansion at the origin; in this case T(w) can be taken as any product of circles sufficiently close to the origin. Although $f_{nr}$ is essentially not defined if $nr \notin \mathbb{Z}^d$, we will show that asymptotics vary smoothly with r, allowing us to make meaningful asymptotic statements about $f_{nr}$ even when r has irrational coordinates.

Our first set of results needs only univariate techniques and the following two facts. The first follows by induction from the univariate complex analysis results discussed in Chapter 2, while the second is a consequence of the triangle inequality.

Deforming Curves of Integration: Let f(z) be a meromorphic function and let $T(a), T(b) \subset \mathbb{C}^d$ be tori in $\mathbb{C}^d$ such that f(z) has no singularities with $|z_j|$ between $|a_j|$ and $|b_j|$ for each 1 ≤ j ≤ d. Then
\[ \int_{T(a)} f(z)\, dz = \int_{T(b)} f(z)\, dz. \]

Maximum Modulus Integral Bound: If f(z) is continuous on a contour $\mathcal{C} \subset \mathbb{C}^d$ of finite area, then
\[ \left| \int_{\mathcal{C}} f(z)\, dz \right| \le \operatorname{area}(\mathcal{C}) \times \max_{z \in \mathcal{C}} |f(z)|. \]

We start with a detailed study of the rational function F(x, y) = 1/(1 − x − y), which lays out the steps we must go through in the general case.

5.1 Central Binomial Coefficient Asymptotics

Let
\[ F(x, y) = \frac{1}{1 - x - y}. \]
In Chapter 3 we saw that F(x, y) admits three distinct convergent Laurent expansions. Here, we begin by determining coefficient asymptotics in the main diagonal direction r = (1, 1) of the power series expansion
\[ F(x, y) = \sum_{i, j \ge 0} \binom{i + j}{i} x^i y^j, \]
with domain of (absolute) convergence $D = \{(x, y) \in \mathbb{C}^2 : |x| + |y| < 1\}$. The Cauchy integral formula implies
\[ \binom{2n}{n} = \frac{1}{(2\pi i)^2} \int_{T(a, b)} \frac{1}{1 - x - y} \, \frac{dx\, dy}{x^{n+1} y^{n+1}}, \tag{5.2} \]
for any $(a, b) \in D$, which forms the basis of our analysis. We break our argument into several steps.

Step 1: Bound Exponential Growth

By Corollary 3.2 of Chapter 3, the dominant asymptotic behaviour of a rational diagonal sequence is given by a finite sum of terms of the form $C n^{\alpha} \zeta^n (\log n)^{\ell}$, where $C \in \mathbb{C}$, $\alpha \in \mathbb{Q}$, $\ell \in \mathbb{N}$, and ζ is algebraic. As in the univariate case, the first step of the analysis is to determine information about the exponential growth

$$\rho = \limsup_{n \to \infty} |f_{n,n}|^{1/n}.$$

In the univariate setting the exponential growth of a sequence is obtained by finding the minimal modulus of its generating function’s singularities and taking the reciprocal. Unfortunately, in dimension two or greater there will always be an infinite number of singularities ‘closest’ to the origin, with each only giving a bound on the exponential growth. In our example |a| + |b| < 1 whenever (a, b) ∈ D, so that

$$\left| \frac{1}{1 - x - y} \right| \leq \frac{1}{1 - |a| - |b|}$$

for all (x, y) ∈ T(a, b). Thus, the maximum modulus integral bound implies

$$\binom{2n}{n} = \left| \frac{1}{(2\pi i)^2} \int_{T(a,b)} \frac{1}{1 - x - y} \, \frac{dx \, dy}{x^{n+1} y^{n+1}} \right| \leq \frac{|ab|^{-n}}{1 - |a| - |b|} \tag{5.3}$$

for all (a, b) ∈ D. Equation (5.3) gives a family of bounds

$$\limsup_{n \to \infty} \binom{2n}{n}^{1/n} \leq |ab|^{-1} \tag{5.4}$$

on the exponential growth of the central binomial coefficients, one for each pair of points (a, b) ∈ D. In fact, allowing (a, b) to approach the boundary ∂D shows that the exponential growth ρ is bounded above by $|ab|^{-1}$ for all $(a, b) \in \overline{D}$. It is natural to wonder which points give the best upper bound, and whether that bound is tight. In fact, answering these two questions is usually the hardest step when studying a multivariate generating function.

As the upper bound $|ab|^{-1}$ decreases as the coordinates a and b get farther from the origin, the minimum of $|ab|^{-1}$ on $\overline{D} = \{(a, b) \in \mathbb{C}^2 : |a| + |b| \leq 1\}$ occurs on the boundary of D, where |a| + |b| = 1. Thus, we want to minimize $|ab|^{-1}$ subject to |a| + |b| = 1. Examining the function $t^{-1}(1 - t)^{-1}$ for 0 ≤ t ≤ 1 shows the minimum is achieved when |a| = |b| = 1/2, where $|ab|^{-1} = 4$. Returning to the bound in (5.4), we have shown that for every ε > 0 there exists a constant $C_\varepsilon > 0$ such that

$$\binom{2n}{n} \leq C_\varepsilon (4 + \varepsilon)^n$$

for all n ∈ N. Using the fact that the central binomial coefficients have an algebraic generating function, we have seen in Chapter 2 that

$$\binom{2n}{n} = \frac{4^n}{\sqrt{\pi n}} \left( 1 + O\left( \frac{1}{n} \right) \right)$$

as n → ∞, so our upper bound of 4 on the exponential growth is tight.
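The family of bounds in (5.3) and (5.4) can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names `cauchy_bound` and `growth_estimate` and the sample points are illustrative assumptions. It compares the central binomial coefficients with the right-hand side of (5.3) at a few points of D and estimates the exponential growth ρ.

```python
from math import comb, exp, log

def cauchy_bound(n, a, b):
    """Right-hand side of (5.3): |ab|^(-n) / (1 - |a| - |b|) for (a, b) in D."""
    assert abs(a) + abs(b) < 1, "(a, b) must lie in the domain of convergence D"
    return (abs(a) * abs(b)) ** (-n) / (1 - abs(a) - abs(b))

def growth_estimate(n):
    """n-th root of binomial(2n, n), computed through logarithms to avoid overflow."""
    return exp(log(comb(2 * n, n)) / n)

if __name__ == "__main__":
    # Hypothetical sample points in D; the closer |a| = |b| is to 1/2,
    # the closer the growth bound 1/|ab| is to 4.
    for a, b in [(0.3, 0.3), (0.45, 0.45), (0.2, 0.7)]:
        holds = all(comb(2 * n, n) <= cauchy_bound(n, a, b) for n in range(1, 60))
        print((a, b), "growth bound:", 1 / (a * b), "(5.3) holds for n < 60:", holds)
    print([growth_estimate(n) for n in (10, 100, 1000)])
```

The printed growth estimates increase slowly towards 4, consistent with the bound being tight only in the limit.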


Step 2: Determine Contributing Singularities

In order to completely determine asymptotics of this diagonal sequence, we will perform a local analysis of F(x, y) near some of its singularities. But which singularities should we study? Because the minimum of $|ab|^{-1}$ on $\overline{D}$ occurs on the boundary ∂D, and depends only on the moduli of the coordinates, there are singularities of F achieving this minimum. Since the denominator under consideration is 1 − x − y, the only singularity of F(x, y) with |x| = |y| = 1/2 is the point σ = (1/2, 1/2). The Cauchy integrand in (5.2) has exponential growth close to $4^n$ when (x, y) is near to σ, and has different exponential growth near any other singularity of F(x, y). The point σ is thus the only singularity of F(x, y) where the growth of the Cauchy integrand matches our predicted diagonal sequence growth, and therefore the only singularity where a local analysis of the Cauchy integral could possibly capture the asymptotic growth of the diagonal coefficients. Following this logic, we attempt to determine dominant asymptotics by manipulating the Cauchy integral in (5.2) into an integral whose domain stays near this singularity, then replace F(x, y) by a local approximation near σ.

Step 3: Localize the Cauchy Integral and Compute a Residue

For any n ∈ N let

$$I = I_n = \frac{1}{(2\pi i)^2} \int_{|x| = 1/2} \left( \int_{|y| = 1/4} \frac{1}{1 - x - y} \, \frac{dy}{y^{n+1}} \right) \frac{dx}{x^{n+1}}.$$

Since σ = (1/2, 1/2) lies on the boundary ∂D of the domain of convergence, the point (1/2, 1/4) lies in D and thus $I = \binom{2n}{n}$. In order to obtain an integral where x is restricted to a neighbourhood of 1/2, we define

$$N = \left\{ |x| = 1/2 : \arg(x) \in \left( -\frac{\pi}{4}, \frac{\pi}{4} \right) \right\} \quad \text{and} \quad N' = \{ |x| = 1/2 \} \setminus N.$$

Figure 5.1 illustrates the sets N and N′, and we note that

$$|1 - x| < \underbrace{\left| 1 - \frac{e^{i\pi/4}}{2} \right|}_{\rho} = 0.7368\ldots$$

for x ∈ N, and |1 − x| ≥ ρ for x ∈ N′. We now compare I to the ‘localized’ integral

$$I_{loc} = \frac{1}{(2\pi i)^2} \int_{N} \left( \int_{|y| = 1/4} \frac{1}{1 - x - y} \, \frac{dy}{y^{n+1}} \right) \frac{dx}{x^{n+1}}.$$

Our arguments are local, in the sense that replacing N with any smaller neighbourhood of 1/2 will not change our results. For fixed x ∈ N′,

Fig. 5.1 The domains of integration N and N′, with |1 − x| < ρ for x ∈ N and |1 − x| ≥ ρ > 1/4 for x ∈ N′.

$$\left| \frac{1}{2\pi i} \int_{|y| = 1/4} \frac{1}{1 - x - y} \, \frac{dy}{y^{n+1}} \right| = \left| \frac{1}{2\pi i} \int_{|y| = 1/4} \frac{1/(1 - x)}{1 - \frac{y}{1 - x}} \, \frac{dy}{y^{n+1}} \right| = \left| [y^n] \sum_{j \geq 0} (1 - x)^{-(j+1)} y^j \right| = |1 - x|^{-(n+1)} \leq \rho^{-(n+1)},$$

where the series expansion is valid as |y| = 1/4 < |1 − x| when x ∈ N′. Thus, the maximum modulus integral bound implies

$$|I - I_{loc}| = \left| \frac{1}{(2\pi i)^2} \int_{N'} \left( \int_{|y| = 1/4} \frac{1}{1 - x - y} \, \frac{dy}{y^{n+1}} \right) \frac{dx}{x^{n+1}} \right| \leq \frac{\mathrm{length}(N')}{2\pi} \, \rho^{-(n+1)} \, 2^{n+1} = \frac{3}{4\rho} \left( \frac{2}{\rho} \right)^n,$$

where length(N′) = 3π/4. Since 2/ρ ≤ 2.72, this implies we can replace the integral I with $I_{loc}$ in our asymptotic arguments and introduce an error that grows at an exponentially smaller rate than our diagonal sequence.

Our next tactic is to introduce the integral

$$I_{out} = \frac{1}{(2\pi i)^2} \int_{N} \left( \int_{|y| = 3/4} \frac{1}{1 - x - y} \, \frac{dy}{y^{n+1}} \right) \frac{dx}{x^{n+1}}$$

whose domain of integration lies outside the domain of convergence D. For x ∈ N, the quantity |1 − x| is bounded away from 3/4 so that 1/|1 − x − y| is bounded when |y| = 3/4. The maximum modulus integral bound then implies the existence of a constant C such that

$$|I_{out}| \leq C \, 2^n \left( \frac{4}{3} \right)^n = O\left( \left( \frac{8}{3} \right)^n \right),$$

which grows exponentially smaller than our diagonal sequence.

Remark 5.1 There are solutions to 1 − x − y = 0 with |x| = 1/2 and |y| = 3/4, but no solution has x ∈ N. This is why we introduce and localize to the neighbourhood N.

Finally, define

$$\chi = I_{loc} - I_{out} = \frac{-1}{2\pi i} \int_{N} \left( \frac{1}{2\pi i} \int_{|y| = 3/4} \frac{1}{1 - x - y} \, \frac{dy}{y^{n+1}} - \frac{1}{2\pi i} \int_{|y| = 1/4} \frac{1}{1 - x - y} \, \frac{dy}{y^{n+1}} \right) \frac{dx}{x^{n+1}}.$$

For any fixed x ∈ N, the function F(x, y) = 1/(1 − x − y) has a unique pole between the curves {|y| = 1/4} and {|y| = 3/4}, at y = 1 − x. Thus, the (univariate) Cauchy residue theorem discussed in Chapter 2 implies

$$\frac{1}{2\pi i} \left( \int_{|y| = 3/4} \frac{1}{1 - x - y} \, \frac{dy}{y^{n+1}} - \int_{|y| = 1/4} \frac{1}{1 - x - y} \, \frac{dy}{y^{n+1}} \right) = -(1 - x)^{-(n+1)}$$

for each x ∈ N, and

$$\chi = \frac{1}{2\pi i} \int_{N} \frac{dx}{x^{n+1} (1 - x)^{n+1}}. \tag{5.5}$$

This gives a good asymptotic approximation of our diagonal sequence, because

$$\left| \binom{2n}{n} - \chi \right| = |I - (I_{loc} - I_{out})| \leq |I - I_{loc}| + |I_{out}| = O\left( \left( \frac{2}{\rho} \right)^n \right).$$

Remark 5.2 The emergence of the integral in (5.5) should not be surprising. In fact, the binomial theorem implies

$$\binom{2n}{n} = [x^n] (1 - x)^{-(n+1)} = \frac{1}{2\pi i} \int_{|x| = 1/2} \frac{dx}{x^{n+1} (1 - x)^{n+1}}, \tag{5.6}$$

which is an exact equality instead of an asymptotic approximation. The reason our argument has an (asymptotically negligible) error is that we restrict to the neighbourhood N of x = 1/2 in order to have the integrand of $I_{out}$ bounded when |y| = 3/4. In this example, we can define the domain of integration for y in $I_{out}$ to be |y| = 3/2 + r for any r > 0 and the restriction of x to the neighbourhood N of 1/2 becomes unnecessary; the rest of our argument then yields the exact equality (5.6). The reason we did not take this approach above is that for general rational (or meromorphic) functions one must stick to a local analysis near the singularities of interest. We thus restrict x to the neighbourhood N and obtain an asymptotically negligible error here as this argument is the one that generalizes. Chapter 8 provides a detailed asymptotic analysis for diagonals of rational functions whose denominators are a product of linear factors, giving exact equalities such as (5.6).
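The relationship between the exact contour integral (5.6) and the localized integral χ in (5.5) can be explored numerically. The sketch below is an illustration under stated assumptions rather than part of the argument: the function name `contour_integral`, the midpoint quadrature rule, and the number of quadrature steps are all choices made here. It parameterizes x = e^{iθ}/2 and integrates over either the full circle or the arc N.

```python
import cmath
from math import comb, pi

def contour_integral(n, theta_lo, theta_hi, steps=20000):
    """Midpoint-rule approximation of (1/(2*pi*i)) * integral of
    dx / (x^(n+1) * (1-x)^(n+1)) over the arc x = exp(i*theta)/2,
    with theta running from theta_lo to theta_hi."""
    h = (theta_hi - theta_lo) / steps
    total = 0j
    for k in range(steps):
        theta = theta_lo + (k + 0.5) * h
        x = 0.5 * cmath.exp(1j * theta)
        # dx = i * x * dtheta, so the i cancels against the 1/(2*pi*i) prefactor
        # and one power of x cancels in the denominator.
        total += h / (x ** n * (1 - x) ** (n + 1))
    return total / (2 * pi)

if __name__ == "__main__":
    for n in (5, 10, 20, 30):
        exact = comb(2 * n, n)
        full = contour_integral(n, -pi, pi)          # the integral in (5.6)
        chi = contour_integral(n, -pi / 4, pi / 4)   # the integral in (5.5)
        print(n, exact, full.real, abs(chi.real - exact) / exact)
```

The relative difference between χ and the binomial coefficient printed in the last column should shrink geometrically with n, in line with the O((2/ρ)^n) error estimate compared with growth of order 4^n.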

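Step 4: Apply the Saddle-Point Method

Parameterizing the domain of integration N in (5.5) as $N = \{ e^{i\theta}/2 : \theta \in (-\pi/4, \pi/4) \}$ converts this integral expression for χ into

$$\chi = \frac{4^n}{2\pi} \int_{-\pi/4}^{\pi/4} A(\theta) e^{-n\phi(\theta)} \, d\theta, \tag{5.7}$$

where

$$A(\theta) = \frac{1}{1 - e^{i\theta}/2} \quad \text{and} \quad \phi(\theta) = \log\left( 2 - e^{i\theta} \right) + i\theta.$$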

This is our first example of a Fourier-Laplace integral, whose asymptotics can be calculated using the saddle-point method and related techniques. To determine asymptotics we will replace the amplitude function A(θ) and phase φ(θ) in (5.7) by the leading terms of their expansions

$$A(\theta) = 2 + 2i\theta - 3\theta^2 + \cdots \quad \text{and} \quad \phi(\theta) = \theta^2 + i\theta^3 + \cdots \tag{5.8}$$

at the origin, after which the resulting integral can be computed explicitly. First, we restrict the domain of integration in (5.7) to a small neighbourhood of the origin: if $B_n = n^{-2/5}$ then (5.8) implies

$$\left| e^{-n\phi(\theta)} \right| = e^{-n \Re(\phi)(\theta)} \leq e^{-n^{1/5} + O(1)}$$

whenever $|\theta| \geq B_n$, so that

$$\int_{-\pi/4}^{\pi/4} A(\theta) e^{-n\phi(\theta)} \, d\theta = \int_{-B_n}^{B_n} A(\theta) e^{-n\phi(\theta)} \, d\theta + \int_{-\pi/4}^{-B_n} A(\theta) e^{-n\phi(\theta)} \, d\theta + \int_{B_n}^{\pi/4} A(\theta) e^{-n\phi(\theta)} \, d\theta = \int_{-B_n}^{B_n} A(\theta) e^{-n\phi(\theta)} \, d\theta + O\left( e^{-n^{1/5}} \right).$$

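For $|\theta| \leq B_n$ the expansions in (5.8) imply

$$A(\theta) = 2 + O\left( n^{-2/5} \right) \quad \text{and} \quad e^{-n\phi(\theta)} = e^{-n\theta^2 + O(n^{-1/5})} = e^{-n\theta^2} \left( 1 + O\left( n^{-1/5} \right) \right),$$

meaning

$$\chi = \frac{4^n}{2\pi} \left( \int_{-B_n}^{B_n} 2 e^{-n\theta^2} \, d\theta \right) \left( 1 + O\left( n^{-1/5} \right) \right). \tag{5.9}$$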

Asymptotics of such an integral over the entire real line is easy to determine: making the change of variable $\theta = t/\sqrt{n}$ gives

$$\int_{-\infty}^{\infty} e^{-n\theta^2} \, d\theta = n^{-1/2} \int_{-\infty}^{\infty} e^{-t^2} \, dt = \sqrt{\frac{\pi}{n}}.$$

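Since

$$\int_{B_n}^{\infty} e^{-n\theta^2} \, d\theta = \int_{0}^{\infty} e^{-n(B_n + \theta)^2} \, d\theta = e^{-nB_n^2} \int_{0}^{\infty} e^{-n\theta^2 - 2nB_n\theta} \, d\theta \leq e^{-nB_n^2} \int_{0}^{\infty} e^{-\theta^2} \, d\theta = O\left( e^{-n^{1/5}} \right),$$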

we can ‘add the tails’ back to the domain of integration in (5.9) while introducing an exponentially small error, ultimately yielding

$$\chi = \frac{4^n}{2\pi} \left( \int_{-\infty}^{\infty} 2 e^{-n\theta^2} \, d\theta \right) \left( 1 + O\left( n^{-1/5} \right) \right) = \frac{4^n}{\sqrt{\pi n}} \left( 1 + O\left( n^{-1/5} \right) \right).$$

Thus, we have obtained an asymptotic estimate

$$\binom{2n}{n} = \frac{4^n}{\sqrt{\pi n}} \left( 1 + O\left( n^{-1/5} \right) \right).$$

A more careful analysis can bring the error term $O(n^{-1/5})$ to $O(n^{-1})$, and even determine an asymptotic expansion in powers of $n^{-1}$. The key properties that allow for such an analysis are that A and φ are analytic at the origin and
• φ(0) = φ′(0) = 0 while φ″(0) ≠ 0 (so that the dominant term in the Taylor series for φ at the origin is quadratic, leading to a Gaussian-type integral);
• the real part of φ is non-negative on the domain of integration (so that the integral decays away from the origin);
• φ′(θ) ≠ 0 on the domain of integration unless θ = 0 (so that the origin is the only point where local behaviour will dictate asymptotics),
although these can be relaxed to differing degrees (especially for univariate integrals). Good expositions of the asymptotics of univariate saddle-point integrals can be found in de Bruijn [13] and Flajolet and Sedgewick [16, Sect. VIII], which illustrate a deep, rich, and well-developed theory, and motivate some of our choices above. Below we use Proposition 5.3, which gives asymptotics in a generalized multivariate setting.

The miracle which underlies analytic combinatorics in several variables is the fact that, by making a natural choice of singularities to study (those giving the best bound on exponential growth), one ends up with a class of integrals which can be asymptotically approximated¹. Under the assumptions of this chapter, when dealing with rational functions of d variables the Fourier-Laplace integral expressions obtained are d − 1 dimensional. They have no analogue in the univariate case, where one is finished after computing the residue in Step 3.

¹ As we will see, this miracle is due to the Cauchy-Riemann equations.
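The quality of the saddle-point estimate can be examined directly. The following minimal sketch (the function name `saddle_point_ratio` is an illustrative assumption) divides the exact central binomial coefficient by 4^n/√(πn), working with logarithms so that large n does not overflow. Scaling the deviation from 1 by n is consistent with the statement that a more careful analysis yields an error term of order 1/n.

```python
from math import comb, exp, log, pi

def saddle_point_ratio(n):
    """Exact binomial(2n, n) divided by the estimate 4^n / sqrt(pi * n)."""
    log_exact = log(comb(2 * n, n))
    log_estimate = n * log(4.0) - 0.5 * log(pi * n)
    return exp(log_exact - log_estimate)

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        ratio = saddle_point_ratio(n)
        # n * (1 - ratio) settling near a constant indicates an O(1/n) relative error.
        print(n, ratio, n * (1 - ratio))
```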

5.1.1 Asymptotics in General Directions

Suppose we wish to determine asymptotics of the coefficients $f_{rn,sn}$ in a more general direction $r = (r, s) \in \mathbb{N}^2$. Because changing the direction r does not affect the underlying rational function F(x, y), its singular set V, or the power series domain of convergence D, we have the modified Cauchy integral representation

$$\binom{rn + sn}{rn} = f_{rn,sn} = \frac{1}{(2\pi i)^2} \int_{T(a,b)} \frac{1}{1 - x - y} \, \frac{dx \, dy}{x^{rn+1} y^{sn+1}} \tag{5.10}$$

for any (a, b) ∈ D. As in our study of the main diagonal, the maximum modulus integral bound applied to (5.10) gives a bound on the exponential growth

$$\rho = \limsup_{n \to \infty} |f_{rn,sn}|^{1/n} \leq |a|^{-r} |b|^{-s}$$

for all (a, b) ∈ D. In order to minimize this upper bound on exponential growth, we introduce the height function

$$h_r(x, y) = h(x, y) = -r \log |x| - s \log |y|$$

and search for the minimum of h(x, y) on ∂D, where |x| + |y| = 1. Writing |x| = t for 0 ≤ t ≤ 1, we thus search for the minimum of g(t) = −r log t − s log(1 − t) when 0 ≤ t ≤ 1. Since r, s > 0 the function g(t) goes to infinity as t approaches 0 or 1, and we may find the minimum of g by solving the equation g′(t) = 0. This implies that the minimum of $h_r$ on ∂D occurs when $(|x|, |y|) = \left( \frac{r}{r+s}, \frac{s}{r+s} \right)$, and there is precisely one singularity with this coordinate-wise modulus, $(x_*, y_*) = \left( \frac{r}{r+s}, \frac{s}{r+s} \right)$.

Remark 5.3 Recall the discussion of polynomial amoebas and the Relog map from Section 3.3.1 in Chapter 3. If p = log |x| and q = log |y| then

$$h(x, y) = -r \log |x| - s \log |y| = -rp - sq = -(r, s) \cdot (p, q).$$

Thus, minimizing h(x, y) on $\overline{D}$ corresponds to minimizing the linear function $\tilde{h}(p, q) = -(r, s) \cdot (p, q)$ on the closure of the convex component B = Relog(D) of the amoeba complement $\mathrm{amoeba}(1 - x - y)^c$. Pictorially, for any direction (r, s) we want to find the point(s) on ∂B where the support hyperplane to B has normal (r, s); see Figure 5.2. Proposition 3.6 from Chapter 3 implies that every point on ∂B is a minimizer of $\tilde{h}$ for some (r, s) when we relax the condition that r and s are natural numbers and allow them to take positive real values.

Following the steps outlined above, we note

Fig. 5.2 The boundary of amoeba(1 − x − y) together with the component B of $\mathrm{amoeba}(1 - x - y)^c$ corresponding to the power series expansion of 1/(1 − x − y). Level sets of the height function $\tilde{h}$ for directions (1, 2) on top, (1, 1) in the middle, and (4, 1) on the bottom are shown, together with the corresponding minimizer of $\tilde{h}$ on ∂B.

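$$f_{rn,sn} = I = \frac{1}{(2\pi i)^2} \int_{|x| = x_*} \left( \int_{|y| = y_* - \varepsilon} \frac{1}{1 - x - y} \, \frac{dy}{y^{sn+1}} \right) \frac{dx}{x^{rn+1}}$$

for any sufficiently small ε > 0, and introduce the integrals

$$I_{loc} = \frac{1}{(2\pi i)^2} \int_{N} \left( \int_{|y| = y_* - \varepsilon} \frac{1}{1 - x - y} \, \frac{dy}{y^{sn+1}} \right) \frac{dx}{x^{rn+1}}$$

$$I_{out} = \frac{1}{(2\pi i)^2} \int_{N} \left( \int_{|y| = y_* + \varepsilon} \frac{1}{1 - x - y} \, \frac{dy}{y^{sn+1}} \right) \frac{dx}{x^{rn+1}}$$

where $N = \{ |x| = x_* : \arg(x) \in (-\delta, \delta) \}$ for some sufficiently small δ > 0. The same argument as the main diagonal case shows that both $I - I_{loc}$ and $I_{out}$ grow exponentially slower than I, and we can approximate $f_{rn,sn}$ with the integral

$$\chi = I_{loc} - I_{out} = \frac{x_*^{-rn} y_*^{-sn}}{2\pi} \int_{-\delta}^{\delta} A(\theta) e^{-n\phi(\theta)} \, d\theta,$$

where

$$A(\theta) = \frac{1}{1 - x_* e^{i\theta}} = \frac{r + s}{s} + O(\theta)$$

and

$$\phi(\theta) = r \log\left( x_* e^{i\theta} \right) + s \log\left( 1 - x_* e^{i\theta} \right) - r \log(x_*) - s \log(y_*) = \frac{r(r + s)}{2s} \theta^2 + O(\theta^3). \tag{5.11}$$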

Fig. 5.3 Left: Values of the factor $M_n = \left( \frac{\pi}{\pi+1} \right)^{n\pi - [n\pi]}$ arising in our determination of asymptotics near the irrational direction r = (π, 1) for 0 ≤ n ≤ 100, bounded between $\left( \frac{\pi}{\pi+1} \right)^{1/2}$ and $\left( \frac{\pi}{\pi+1} \right)^{-1/2}$. Right: The ratio of the coefficient $f_{[n\pi],n}$ and its limit behaviour $M_n A_n$ for 100 ≤ n ≤ 200.

Because χ is a Fourier-Laplace integral satisfying the properties discussed above, the saddle-point method implies

$$\binom{rn + sn}{rn} = f_{rn,sn} \sim \left( \frac{r + s}{r} \right)^{rn} \left( \frac{r + s}{s} \right)^{sn} \left( \frac{r + s}{s} \right) \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-n \frac{r(r+s)}{2s} \theta^2} \, d\theta = \left( \frac{r + s}{r} \right)^{rn} \left( \frac{r + s}{s} \right)^{sn} \frac{\sqrt{r + s}}{\sqrt{2rs\pi n}}. \tag{5.12}$$

A close examination of our argument shows that the error term in this approximation varies smoothly with (r, s) and, in particular, our asymptotic result holds uniformly when (r, s) varies in any compact set. This allows us to interpret asymptotics in a direction (r, s) with non-integer coordinates as the limit of asymptotics in rational directions approaching (r, s), in a manner specified precisely by Proposition 5.9 in Section 5.3.3 below.

Example 5.2 (‘Asymptotics’ in an Irrational Direction) Although no non-zero multiple of r = (π, 1) contains integer coordinates, when n is large then nr gets arbitrarily close to vectors with integer coordinates. Let [nπ] denote the closest integer to nπ and let $A_n$ denote the asymptotic approximation (5.12) with r = π and s = 1. Proposition 5.9 below implies

$$f_{[n\pi],n} \sim \left( \frac{\pi}{\pi + 1} \right)^{n\pi - [n\pi]} A_n$$

as n → ∞, so $A_n$ determines the asymptotic behaviour of the coefficient with integer coordinates closest to nr, up to a bounded factor $M_n$ coming from rounding nπ to an integer. See Figure 5.3 for an illustration.

In fact, writing

$$\binom{rn + sn}{rn} = \frac{(rn + sn)!}{(rn)! \, (sn)!} = \frac{\Gamma(rn + sn + 1)}{\Gamma(rn + 1) \, \Gamma(sn + 1)}$$

extends the coefficients $\binom{rn+sn}{rn}$ to all $(r, s) \in \mathbb{R}^2_{\geq 0}$ using the gamma function Γ(z) discussed in the appendix to Chapter 2. The asymptotic behaviour of the gamma function (which can be derived using its integral definition and the saddle-point method) shows that this ratio has asymptotics matching (5.12).

Remark 5.4 If r < 0 or s < 0 then h(x, y) = −r log |x| − s log |y| is unbounded from below on D, as h(x, y) approaches negative infinity whenever x → 0 or y → 0. Going back to the bound $|f_{rn,sn}| \leq |a|^{-rn} |b|^{-sn}$ for (a, b) ∈ D implies $f_{rn,sn}$ decays faster than any exponential function, meaning the sequence is eventually zero due to the constraints on asymptotic growth for rational diagonals. In particular, our asymptotic argument verifies the trivial fact that the power series expansion of F(x, y) contains no terms whose indices are arbitrarily large with negative sign. Note that the set $\mathbb{R}^2_{\geq 0}$ contains all directions r such that some multiple tr with t > 0 lies in the Newton polygon of 1 − x − y at the origin; we generalize this observation in Proposition 5.7.
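The estimate (5.12) and its gamma-function extension to real directions can be compared numerically. The sketch below is illustrative only: the function names are assumptions introduced here, and the comparison relies on the standard-library log-gamma function `math.lgamma` rather than on any routine from the text.

```python
from math import exp, lgamma, log, pi

def log_coefficient(r, s, n):
    """log of Gamma(rn+sn+1) / (Gamma(rn+1) * Gamma(sn+1)), which equals
    log binomial(rn+sn, rn) when rn and sn are non-negative integers."""
    return lgamma((r + s) * n + 1) - lgamma(r * n + 1) - lgamma(s * n + 1)

def log_estimate(r, s, n):
    """log of the right-hand side of (5.12), for r, s > 0."""
    return (r * n * log((r + s) / r) + s * n * log((r + s) / s)
            + 0.5 * log(r + s) - 0.5 * log(2 * pi * r * s * n))

def coefficient_ratio(r, s, n):
    """Extended coefficient divided by the saddle-point estimate."""
    return exp(log_coefficient(r, s, n) - log_estimate(r, s, n))

if __name__ == "__main__":
    # Directions (1, 2), (1, 1), (4, 1) appear in Figure 5.2; (pi, 1) is irrational.
    for r, s in [(1, 2), (1, 1), (4, 1), (pi, 1)]:
        print((r, s), [coefficient_ratio(r, s, n) for n in (10, 100, 1000)])
```

Each list of ratios should approach 1 as n grows, including in the irrational direction where the coefficient is defined through the gamma function.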

5.1.2 Asymptotics of Laurent Coefficients

Having studied coefficient asymptotics of the power series expansion of F(x, y) = 1/(1 − x − y), we now apply a similar analysis to the other Laurent expansions of F, derived in Chapter 3. Consider, for instance, the convergent Laurent expansion

$$\frac{1}{1 - x - y} = \sum_{i, j \geq 0} \binom{i}{j} (-1)^{j+1} y^j x^{-i-1}$$

in the domain $D_2 = \{ (x, y) \in \mathbb{C}^2 : 1 + |y| < |x| \}$ (the final Laurent expansion is the same with x and y swapped). Although the rational function F, and thus its singular set V, are unchanged, the singularities on the boundary of the domain of convergence now form the set

$$V \cap \partial D_2 = \{ (1 + t, -t) : t > 0 \}.$$

The exponential growth of coefficients along a direction $r = (r, s) \in \mathbb{R}^2$ is still determined by the minimum of the height function $h_r(x, y) = -r \log |x| - s \log |y|$ on $\overline{D_2}$. Parameterizing $V \cap \partial D_2$ by (x, y) = (1 + t, −t) for t > 0, we want to minimize g(t) = −r log(1 + t) − s log(t). From the series expansions

$$g(t) = s \log(t^{-1}) - rt + rt^2/2 + \cdots \qquad (t \to 0^+)$$

$$g(t) = -(r + s) \log(t) - rt^{-1} + rt^{-2}/2 + \cdots \qquad (t \to \infty),$$


we see that $h_r(x, y)$ is unbounded from below, and thus the sequence $f_{nr,ns}$ is eventually zero, whenever s < 0 or r + s > 0.

Remark 5.5 Recalling Figure 3.2 and the discussion in Section 3.3.1 of Chapter 3, this Laurent expansion of F(x, y) corresponds to the vertex (1, 0) of the Newton polytope N(H) under the mapping described by Proposition 3.12. The set of directions (r, s) for which (1, 0) + t(r, s) lies in the Newton polytope of 1 − x − y are exactly those where s > 0 and r + s < 0. Again, we return to this observation in Proposition 5.7 below.

Suppose s > 0 and r + s < 0, so that the minimum of $h_r$ occurs on $\partial D_2$. Solving g′(t) = 0 gives

$$(x, y) = (x_*, y_*) = \left( 1 - \frac{s}{r + s}, \frac{s}{r + s} \right) = \left( \frac{r}{r + s}, \frac{s}{r + s} \right),$$

with the point $(x_*, y_*)$ being the only singularity on $\partial D_2$ with this coordinate-wise modulus. Note this is the same formula as in the power series case, with different restrictions on r and s, and that $x_* > 0$ while $y_* < 0$. Below we will see that the most natural algebraic techniques compute a set of potential minimizers of $h_r$ for all domains of convergence, after which one must filter the points to determine which are relevant to the specific Laurent expansion under consideration. This is one reason why a knowledge of Laurent expansions helps provide an understanding of our asymptotic methods, even when one only cares about power series expansions.

If $(x, y) \in D_2$ then $(x, z) \in D_2$ for any z with |z| < |y|, since 1 + |z| < 1 + |y| < |x|. Thus, for any sufficiently small ε > 0,

$$f_{rn,sn} = I = \frac{1}{(2\pi i)^2} \int_{|x| = x_*} \left( \int_{|y| = |y_*| - \varepsilon} \frac{1}{1 - x - y} \, \frac{dy}{y^{sn+1}} \right) \frac{dx}{x^{rn+1}}.$$

Again following the argument above, we introduce the integrals

$$I_{loc} = \frac{1}{(2\pi i)^2} \int_{N} \left( \int_{|y| = |y_*| - \varepsilon} \frac{1}{1 - x - y} \, \frac{dy}{y^{sn+1}} \right) \frac{dx}{x^{rn+1}}$$

$$I_{out} = \frac{1}{(2\pi i)^2} \int_{N} \left( \int_{|y| = |y_*| + \varepsilon} \frac{1}{1 - x - y} \, \frac{dy}{y^{sn+1}} \right) \frac{dx}{x^{rn+1}},$$

where $N = \{ |x| = x_* : \arg(x) \in (-\delta, \delta) \}$ for some sufficiently small δ > 0.
Defining $N' = \{ |x| = x_* \} \setminus N$, we note that $I - I_{loc}$ and $I_{out}$ grow slowly:
• If a ∈ C with $|a| = |x_*|$ and $a \neq x_*$ then the radius of convergence of the series F(a, y) in y is larger than $|y_*|$. This shows the existence of $\tau > |y_*|$ such that
$$\frac{1}{2\pi i} \int_{|y| = |y_*| - \varepsilon} \frac{1}{1 - x - y} \, \frac{dy}{y^{sn+1}} = [y^{sn}] F(x, y) = O(\tau^{-ns})$$
for all x ∈ N′. The maximum modulus bound therefore implies $|I - I_{loc}| = O(x_*^{-rn} \tau^{-ns})$ will grow exponentially slower than the coefficient sequence under consideration.

• The maximum modulus bound implies $I_{out} = O\left( x_*^{-rn} (|y_*| + \varepsilon)^{-sn} \right)$, and s > 0.

Thus we may replace I by $I_{loc}$, subtract $I_{out}$, and compute a residue to approximate the coefficient $f_{nr,ns}$ by a Fourier-Laplace integral. In fact, we obtain the same Fourier-Laplace integral as in (5.11), the only difference from the power series case being the restrictions on (r, s). Again we may apply a saddle-point argument to obtain

$$\binom{-nr - 1}{ns} (-1)^{ns+1} = f_{rn,sn} \sim \left( \frac{r + s}{r} \right)^{rn} \left( \frac{r + s}{s} \right)^{sn} \left( \frac{r + s}{s} \right) \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-n \frac{r(r+s)}{2s} \theta^2} \, d\theta = \left( \frac{r + s}{r} \right)^{rn} \left( \frac{r + s}{s} \right)^{sn} \left( \frac{r + s}{s} \right) \sqrt{\frac{s}{2r(r + s)\pi n}}$$

for s > 0 and r + s < 0. Note that we cannot simplify, for instance, $\sqrt{s/(r + s)} = \sqrt{s}/\sqrt{r + s}$ because r + s is negative. Making such invalid simplifications can result in an asymptotic approximation with incorrect sign.

Remark 5.6 Up to taking care with the leading sign, our asymptotic estimate respects the binomial identity $(-1)^{ns+1} \binom{-nr-1}{ns} = -\binom{nr+ns}{ns}$.

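The Laurent coefficient estimate for s > 0 and r + s < 0 can be tested against exact values, including its sign. The sketch below makes several assumptions for illustration: the function names are invented here, the directions are restricted to integers so that `math.comb` applies, and the sign and modulus of the estimate are tracked separately because the factor (r + s)/s is negative.

```python
from math import comb, exp, log, pi

def laurent_coefficient(r, s, n):
    """Coefficient of x^(rn) y^(sn) in the expansion of 1/(1-x-y) valid for
    1 + |y| < |x|, namely (-1)^(sn+1) * binomial(-rn-1, sn), for integers
    s > 0 and r + s < 0."""
    i, j = -r * n - 1, s * n
    return (-1) ** (j + 1) * comb(i, j)

def laurent_estimate(r, s, n):
    """Sign and log-modulus of the saddle-point estimate for s > 0, r + s < 0."""
    assert s > 0 and r + s < 0
    base_x = (r + s) / r      # positive, since r and r + s are both negative
    base_y = (r + s) / s      # negative
    # Sign of base_y^(sn) * base_y, that is (-1)^(sn+1).
    sign = -1 if (s * n) % 2 == 0 else 1
    log_abs = (r * n * log(base_x) + s * n * log(-base_y) + log(-base_y)
               + 0.5 * log(s / (2 * r * (r + s) * pi * n)))
    return sign, log_abs

if __name__ == "__main__":
    for r, s in [(-3, 1), (-2, 1), (-5, 2)]:
        for n in (20, 21, 200):
            exact = laurent_coefficient(r, s, n)
            sign, log_abs = laurent_estimate(r, s, n)
            print((r, s), n, sign == (1 if exact > 0 else -1),
                  exp(log(abs(exact)) - log_abs))
```

The sign column should read True throughout and the modulus ratio should approach 1, which is one way to see why the square root in the estimate should not be split across negative factors.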
5.2 The Theory of Smooth ACSV

We now show that our approach for the central binomial coefficients generalizes to an amazing degree. This section lays out the basics of analytic combinatorics in several variables, culminating in Theorems 5.2 and 5.3, while the next section contains generalizations, further discussion, and results which help apply the theory to real applications. The necessary background on complex analysis in several variables, including singularities, Laurent expansions, polynomial amoebas, and diagonals of analytic functions, was covered in Chapter 3.

Fix a rational function

$$F(z) = \frac{G(z)}{H(z)}$$

with G and H coprime polynomials over the complex numbers, and let D denote the domain of convergence of a Laurent expansion

$$F(z) = \sum_{i \in \mathbb{Z}^d} f_i z^i$$

corresponding to a component B = Relog(D) of the amoeba complement $\mathrm{amoeba}(H)^c$. We determine asymptotics of a sequence $f_{nr}$ for indices $nr \in \mathbb{Z}^d$ as n → ∞. By deriving asymptotic results which vary smoothly as the direction r moves in some open set, we will be able to interpret ‘asymptotics’ of $f_{nr}$ for $nr \notin \mathbb{Z}^d$ through a limiting procedure made explicit in Section 5.3.3.


Unless otherwise noted a statement involving $f_{nr}$ is interpreted to hold when n ∈ N and $nr \in \mathbb{Z}^d$, so that the coefficient is well-defined. We often assume that r contains no zero coordinate: if, say, $r_d = 0$ then the r-diagonal of F(z) is the $\hat{r}$-diagonal of $[z_d^0] F(z)$, equal to the $\hat{r}$-diagonal of $F(\hat{z}, 0)$ when the series expansion F contains no terms with a negative exponent of $z_d$. Recalling the notation $\mathbb{R}_* = \mathbb{R} \setminus \{0\}$ and $\mathbb{C}_* = \mathbb{C} \setminus \{0\}$ from past chapters, this means $r \in \mathbb{R}_*^d$. We take F(z) to be a rational function in order to visualize our constructions using the amoeba of H, but our ultimate results will hold for ratios of analytic functions. Our arguments revolve around the singularities of F(z).

Definition 5.1 (singular variety and square-free part) Given complex functions $f_1, \ldots, f_d$, we write $V(f_1, \ldots, f_d)$ to denote their set of common solutions in $\mathbb{C}^d$. The set of singularities of F(z), denoted V, is known as its singular variety; by Proposition 3.2 in Chapter 3 the singular variety V is the zero set V(H). Because V depends only on the irreducible polynomial factors of H, and not on their multiplicity, we let $H^s$ denote the square-free part of H, equal to the product of its distinct irreducible polynomial factors over the complex numbers. The square-free part of H can be computed in any computer algebra system, either by completely factoring H or through more efficient specialized methods. The minimal singularities of F, which are those on the boundary of the domain of convergence, form the set V ∩ ∂D.

Remark 5.7 We note that $V = V(H) = V(H^s)$. The algebraic properties of $H^s$ better capture the geometry of V, which will be useful in our calculations.

This chapter deals with multivariate generating functions where V is a complex manifold near the singularities of F which dictate coefficient asymptotics.

Definition 5.2 (smooth points) A smooth point of V is an element of V where at least one partial derivative of $H^s$ does not vanish.
If every point of V is smooth then we say V is smooth. Although not required for most of our arguments, some differential geometry can help motivate and explain our constructions, and the necessary background material can be found in Griffiths and Harris [18, Ch. 0] if desired. For instance, the implicit function theorem (Proposition 3.1 in Chapter 3) implies V is a complex manifold in some neighbourhood of w ∈ V whenever w is a smooth point.

Step 1: Bound Exponential Growth

As usual, we begin with the Cauchy integral formula

$$f_{nr} = \frac{1}{(2\pi i)^d} \int_{T(w)} F(z) \, \frac{dz}{z_1^{nr_1+1} \cdots z_d^{nr_d+1}}, \tag{5.13}$$

where T(w) is the polytorus defined by any w ∈ D. Applying the maximum modulus bound to the integral in (5.13) gives a bound

$$|f_{nr}| = \left| \frac{1}{(2\pi i)^d} \int_{T(w)} F(z) \, \frac{dz}{z_1^{nr_1+1} \cdots z_d^{nr_d+1}} \right| \leq C_w |w_1|^{-nr_1} \cdots |w_d|^{-nr_d}, \tag{5.14}$$

where $C_w = \max_{z \in T(w)} |F(z)|$ is finite. The part of this upper bound depending on n determines a notion of height for the singular points of F.

Definition 5.3 (height functions) The height function of F in the direction r is the function $h_r : \mathbb{C}_*^d \to \mathbb{R}$ defined by

$$h_r(z) = -\sum_{j=1}^{d} r_j \log |z_j|.$$

Because it is often useful to take logarithms and picture the amoeba of H, we define the logarithmic height function $\tilde{h}_r : \mathbb{R}^d \to \mathbb{R}$ by $\tilde{h}_r(x) = h_r(e^{x_1}, \ldots, e^{x_d}) = -r \cdot x$. When the direction r is understood we drop subscripts and write h and $\tilde{h}$. Equation (5.14) implies an exponential growth bound

$$\limsup_{n \to \infty} |f_{nr}|^{1/n} \leq |w_1|^{-r_1} \cdots |w_d|^{-r_d} = e^{h_r(w)} \tag{5.15}$$

for every point w in the closure $\overline{D}$. When h(z) is unbounded from below on $\overline{D}$ the coefficient sequence $f_{nr}$ decays super-exponentially and is thus eventually zero. Otherwise, either the minimum of h(z) on $\overline{D}$ is achieved and occurs on the boundary ∂D or, because $\tilde{h}$ is linear and the closed set $\overline{B}$ is convex, the minimum is not achieved and there is a limit direction of amoeba(H) which is normal to r. Because we need local behaviour of F(z) near some of its singularities to capture the asymptotic growth of its coefficients, we require that $h_r(z)$ achieves its minimum on $\overline{D}$. In this chapter we work mainly under conditions which allow for easy verification of this assumption.

Recall the logarithmic gradient map $\nabla_{\log} f = (z_1 f_{z_1}, \ldots, z_d f_{z_d})$ from Chapter 3. Proposition 3.13 in that chapter states that for any minimal point w ∈ V ∩ ∂D there exists $\lambda \in \mathbb{R}^d$ and τ ∈ C such that $(\nabla_{\log} H^s)(w) = \tau\lambda$, which helps us find minimizers of the height function.

Proposition 5.1 Let $r \in \mathbb{R}_*^d$. If w ∈ V ∩ ∂D and $(\nabla_{\log} H^s)(w) = \tau r$ with τ ≠ 0 then w is either a minimizer or a maximizer of the map $h_r(z)$ on $\overline{D}$.

Remark 5.8 If $H(z) = Q(z)^2$ for some polynomial Q then $(\nabla_{\log} H)(w) = 0$ for all w ∈ V. This is why we work with the square-free part $H^s$ instead of H.

Proof When τ ≠ 0 then, since r has no zero coordinate, w also has no zero coordinate. Proposition 3.13 in Chapter 3 thus implies that r is the normal vector of a support hyperplane $\mathcal{H}$ of B at Relog(w), meaning all points in B lie on one side of $\mathcal{H}$. Since the height function is represented by the linear function $\tilde{h}(x) = -r \cdot x$

after taking the Relog map, Relog(w) is thus a minimizer or maximizer of $\tilde{h}(x)$ on $\overline{B}$, depending on whether r points away from or towards B at Relog(w). The same relationship then holds between w and h(z) on $\overline{D}$. Figure 5.2 above gives a visualization. □

Unfortunately, there may be a minimizer of $h_r(z)$ on V ∩ ∂D which does not satisfy $(\nabla_{\log} H^s)(w) = \tau r$, even when V is smooth. Problems occur when the amoeba of H, which comes from projecting V to real space through the Relog map, does not accurately reflect the properties of V back in $\mathbb{C}^d$. Section 5.3 discusses these considerations in more detail, and gives criteria under which the sufficient conditions in Proposition 5.1 are necessary.
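For the running example H(x, y) = 1 − x − y, the height function and the logarithmic gradient condition of Proposition 5.1 can be checked concretely. The sketch below is an illustration under stated assumptions: the function names are invented here, the logarithmic gradient is hard-coded for this particular H, and the candidate point is the one found in Section 5.1.1.

```python
from math import log

def height(r, z):
    """h_r(z) = -sum_j r_j * log|z_j| for a point z with non-zero coordinates."""
    return -sum(rj * log(abs(zj)) for rj, zj in zip(r, z))

def log_gradient(z):
    """Logarithmic gradient (x * H_x, y * H_y) of H(x, y) = 1 - x - y."""
    x, y = z
    return (-x, -y)

def minimizer(r):
    """Candidate minimizer (r/(r+s), s/(r+s)) from Section 5.1.1, for r, s > 0."""
    total = r[0] + r[1]
    return (r[0] / total, r[1] / total)

if __name__ == "__main__":
    r = (1, 2)
    w = minimizer(r)
    grad = log_gradient(w)
    tau = grad[0] / r[0]
    print("w =", w, "log-gradient =", grad, "tau =", tau)
    print("gradient parallel to r:", abs(grad[1] - tau * r[1]) < 1e-12)
    # Compare with other points (t, 1 - t) on the positive real part of the boundary.
    samples = [height(r, (t / 100, 1 - t / 100)) for t in range(1, 100)]
    print("h_r(w) =", height(r, w), "minimum over samples =", min(samples))
```

The sampled heights along the boundary should not fall below h_r(w), matching the role of w as a minimizer of the height function.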

Step 2: Determine Contributing Points

Recalling that w ∈ ∂D if and only if the intersection T(w) ∩ V is non-empty, our next goal is to get a better handle on the elements of V which correspond to minimizers of $h_r(z)$. For any singularity w ∈ V, the vector $(\nabla_{\log} H^s)(w)$ is a scalar multiple of r if and only if the matrix

$$M = \begin{pmatrix} (\nabla_{\log} H^s)(w) \\ r \end{pmatrix} = \begin{pmatrix} w_1 H^s_{z_1}(w) & \cdots & w_d H^s_{z_d}(w) \\ r_1 & \cdots & r_d \end{pmatrix}$$

is rank deficient, where as usual subscripted variables refer to partial derivatives. This happens precisely when all 2 × 2 minors of M vanish, giving the system of equations

$$H^s(w) = 0, \qquad r_j w_1 H^s_{z_1}(w) - r_1 w_j H^s_{z_j}(w) = 0 \quad (2 \leq j \leq d). \tag{5.16}$$

Note that any point w ∈ V where $H^s_{z_1}(w) = \cdots = H^s_{z_d}(w) = 0$ satisfies (5.16). Because $H^s$ is square-free, this happens when w is not a smooth point.

Definition 5.4 (smooth critical points) Equations (5.16) are called the smooth critical point equations, and any solution $w \in \mathbb{C}_*^d$ of (5.16) where some $H^s_{z_j}(w) \neq 0$ is a smooth critical point.

Remark 5.9 (an optional perspective from differential geometry) Suppose V is smooth and let $V_* = V \cap \mathbb{C}_*^d$ denote the submanifold of points in V whose coordinates are non-zero. If we define the map $\phi(z) = -\sum_{j=1}^{d} r_j \log z_j$ from $V_*$ to C, then the points where the differential of this analytic mapping of complex manifolds vanishes (known as ‘critical points’ in differential geometry settings) are precisely the points where the projection of ∇φ to the tangent space of $V_*$ is zero. Since the tangent space at w ∈ V is the hyperplane whose normal is $(\nabla H^s)(w)$, the ‘critical points’ of $\phi : V_* \to \mathbb{C}$ in the differential geometry sense form the solutions of (5.16) and thus match our definition of critical points.
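The smooth critical point equations (5.16) can be evaluated exactly for simple denominators. The sketch below is a hedged illustration: the choice H(z) = 1 − z_1 − ... − z_d, the function names, and the candidate point w_j = r_j/(r_1 + ... + r_d) are assumptions made here to extend the bivariate example of Section 5.1.1, and exact rational arithmetic is used so that vanishing can be tested without rounding.

```python
from fractions import Fraction

def critical_point_residuals(H, grad_H, r, w):
    """Left-hand sides of (5.16): H(w), followed by
    r_j * w_1 * H_{z_1}(w) - r_1 * w_j * H_{z_j}(w) for 2 <= j <= d."""
    g = grad_H(w)
    minors = [r[j] * w[0] * g[0] - r[0] * w[j] * g[j] for j in range(1, len(w))]
    return [H(w)] + minors

def H_linear(w):
    """Illustrative denominator H(z) = 1 - z_1 - ... - z_d."""
    return 1 - sum(w)

def grad_linear(w):
    """Gradient of H_linear: every partial derivative equals -1."""
    return [Fraction(-1)] * len(w)

def candidate_point(r):
    """Candidate critical point w_j = r_j / (r_1 + ... + r_d), for positive integers r_j."""
    total = sum(r)
    return [Fraction(rj, total) for rj in r]

if __name__ == "__main__":
    r = (1, 2, 3)
    w = candidate_point(r)
    print("w =", w, "residuals =", critical_point_residuals(H_linear, grad_linear, r, w))
    other = [Fraction(1, 3)] * 3   # on V, but not critical in direction (1, 2, 3)
    print("other residuals =", critical_point_residuals(H_linear, grad_linear, r, other))
```

All residuals vanish at the candidate point, while the second point lies on V but fails the minor equations, so it is not critical in this direction.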


Although we originally derived (5.16) by minimizing the upper bound (5.15) on exponential growth, the fact that these solutions are critical points of φ means we will ultimately reduce to integrals which can be asymptotically approximated by saddle-point methods. This serendipitous property is, in a sense, a result of the Cauchy-Riemann equations. The complex manifold V∗ ⊂ Cd defines an underlying real smooth manifold W∗ ⊂ R2d , obtained by setting z j = x j +iy j for real variables x j and y j . Since hr is the real part of the map φ, the Cauchy-Riemann equations then imply that the critical points of the real-valued smooth map hr : W∗ → R appearing in (5.15) are precisely the critical points of the complex analytic map φ : V∗ → C. See Section 9.3 of Chapter 9 for more details on this interpretation. By Proposition 5.1, a smooth critical point w is either a minimizer or maximizer of hr on D, depending on whether r points away from or towards B at Relog(w), respectively. Only those critical points which are minimizers play a role in our arguments, so we make the following definition. Definition 5.5 (smooth contributing points) Any smooth critical point w ∈ C∗d where r points away from B, i.e., where x · r < Relog(w) · r for any (and thus all) x ∈ B is called a smooth contributing point. Whenever D is the power series domain of convergence and r has positive coordinates, Proposition 3.6 in Chapter 3 implies that any minimal smooth critical point is a smooth contributing point. By definition, a minimal smooth contributing point w is a minimizer of hr on D. Example 5.3 (Smooth Critical Points and the Amoeba Contour) Figure 5.4 shows a sketch of the contour of H(x, y) = 1 − x − y − 6xy − x 2 y 2 together with: the components B1, . . . , B5 of amoeba(H)c , the images of the smooth critical points under the Relog map when r = (1, 1), and the resulting support hyperplanes. 
The component $B_1 = \mathrm{Relog}(D_1)$ corresponds to the power series expansion of 1/H(x, y); there is one critical point on $\partial D_1$.
• Since r points away from $B_1$ at this point, it is a minimizer of $h_r$ on $D_1$ and is contributing.

The component $B_2 = \mathrm{Relog}(D_2)$ corresponds to a Laurent expansion of 1/H(x, y); there are three critical points on $\partial D_2$.
• The single critical point whose image under Relog lies on the bottom left of $B_2$ is a maximizer of $h_r(z)$ on $D_2$, since r points into $B_2$ at this point. Thus, this singularity is not contributing.
• There are two critical points in $\mathbb{C}^2$ with the same image σ under Relog, which lies on the top right of $B_2$. Since r points away from $B_2$ at σ, these critical points are both minimizers of $h_r$ on $D_2$, and thus also contributing points.

The component $B_3 = \mathrm{Relog}(D_3)$ also has σ on its boundary, but r points into $B_3$ at σ so the corresponding critical points are not contributing. Neither of the remaining domains of convergence has critical points on its boundary. The

5.2 The Theory of Smooth ACSV


Fig. 5.4 Points on the contour of $H(x, y) = 1 - x - y - 6xy - x^2y^2$, with critical points and supporting hyperplanes shown. The components of the amoeba complement are labelled $B_1$ to $B_5$.

height function hr is unbounded from below on D3, D4, and D5, meaning the sequence of diagonal coefficients fn,n is eventually zero in the corresponding Laurent expansions of 1/H(x, y). The critical point whose image under Relog lies on the bottom left of B2 is not contributing for any Laurent expansion when r = (1, 1), but it is a contributing singularity for B2 in the direction r = (−1, −1).
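The critical points discussed in Example 5.3 can be located numerically. The following is a minimal sketch, assuming that in direction r = (1, 1) the smooth critical point equations take the form H = 0 together with $xH_x = yH_y$; since $xH_x - yH_y = y - x$ for this H, every critical point then has x = y and it suffices to solve the quartic H(x, x) = 0. The helper names `h_diag` and `bisect` are illustrative only. The sketch does not decide which component of the amoeba complement each point borders; it only exhibits the four critical points and the fact that two of them share a Relog image.

```python
import math
import cmath

def h_diag(x):
    """H(x, x) for H(x, y) = 1 - x - y - 6xy - x^2 y^2."""
    return 1 - 2 * x - 6 * x**2 - x**4

def bisect(f, lo, hi, iters=200):
    """Root of f in [lo, hi] by bisection, assuming f changes sign there."""
    flo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return (lo + hi) / 2

# Assumption: for r = (1, 1) the critical point equations are H = 0 and
# x*H_x = y*H_y, which force x = y for this particular H.
a = bisect(h_diag, 0.0, 1.0)     # positive real root of H(x, x)
b = bisect(h_diag, -1.0, -0.5)   # negative real root of H(x, x)

# Deflate the monic quartic x^4 + 6x^2 + 2x - 1 = (x - a)(x - b)(x^2 + p*x + q).
p = a + b            # the four roots sum to 0
q = -1 / (a * b)     # the four roots multiply to -1
disc = cmath.sqrt(p * p - 4 * q)
c1, c2 = (-p + disc) / 2, (-p - disc) / 2

critical_points = [a, b, c1, c2]   # each critical point is (x, x)
relog_images = [math.log(abs(x)) for x in critical_points]

for x, image in zip(critical_points, relog_images):
    print(x, (image, image))
```

The two complex roots are conjugate, so they have the same coordinate-wise modulus, in line with the two critical points of the example that map to the same point σ.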

Remark 5.10 Although a smooth contributing point w is a minimizer of $h_r$ on D, and a point where all partial derivatives of the function $h_r$ restricted to V vanish, $h_r$ never admits a local minimizer on V itself except for the trivial case when $h_r$ is constant on V. Essentially, the real dimension of V as a manifold is too large to allow $h_r$ to have a local minimum (this can be formalized by a maximum modulus principle for analytic functions in several variables).

Step 3: Localize the Cauchy Integral and Compute a Residue

Minimal points, lying on ∂D, are the singularities that the domain of integration in the Cauchy integral (5.13) can be deformed arbitrarily close to. Critical points, on the other hand, are those around which the Cauchy integral can be asymptotically approximated. Contributing points are a subset of critical points where we will be able to introduce additional integrals required for our residue computations while adding only an asymptotically negligible error. Thus, the existence of minimal contributing points suggests that these points will be the ones around which local behaviour of F


will dictate coefficient asymptotics. The analysis is easiest when there are a finite number of such points.

Definition 5.6 (finite and strict minimality) A minimal point $w \in V \cap \partial D$ is called finitely minimal if T(w) ∩ V is finite, and strictly minimal if T(w) ∩ V = {w}. In other words, w is strictly minimal if it is minimal and no other singularity of F has the same coordinate-wise modulus.

Suppose first that F(z) admits a strictly minimal smooth contributing point w. Because w is a smooth point we may assume, without loss of generality, that $H_{z_d}(w) \neq 0$. Let $\rho = |w_d|$ and $T = T(\hat{w})$, where we recall the notation $\hat{w} = (w_1, \dots, w_{d-1})$. Since w is contributing, points with $\hat{z} \in T$ and $|z_d| = \rho - \varepsilon r_d$ lie in D for ε sufficiently small. In other words, if $r_d > 0$ we need to slightly decrease the modulus of the dth coordinate of w to move inside D, while if $r_d < 0$ we need to increase the modulus of the dth coordinate to move inside D.

The implicit function theorem (Proposition 3.1 in Chapter 3) together with strict minimality of w implies the existence of a neighbourhood N of $\hat{w}$ in T, an analytic function $g : N \to \mathbb{C}$, and δ > 0 sufficiently small such that for $\hat{z} \in N$,
(i) H(z) = 0 if and only if $z_d = g(\hat{z})$;
(ii) $\rho - \delta < |g(\hat{z})| < \rho + \delta$;
(iii) if $\hat{x} \in T$ and $\rho - \delta \le |y| \le \rho + \delta$ then $H(\hat{x}, y) = 0$ only if $y = g(\hat{x})$.

To guide our deformations, let
$$\delta_s = \operatorname{sgn}(r_d)\,\delta = \begin{cases} \delta & \text{if } r_d > 0 \\ -\delta & \text{if } r_d < 0. \end{cases}$$
As above, we define
$$I = \frac{1}{(2\pi i)^d} \int_{T} \left( \int_{|z_d| = \rho - \delta_s} F(z) \frac{dz_d}{z_d^{n r_d + 1}} \right) \frac{d\hat{z}}{\hat{z}^{n\hat{r} + 1}}$$
$$I_{loc} = \frac{1}{(2\pi i)^d} \int_{N} \left( \int_{|z_d| = \rho - \delta_s} F(z) \frac{dz_d}{z_d^{n r_d + 1}} \right) \frac{d\hat{z}}{\hat{z}^{n\hat{r} + 1}}$$
$$I_{out} = \frac{1}{(2\pi i)^d} \int_{N} \left( \int_{|z_d| = \rho + \delta_s} F(z) \frac{dz_d}{z_d^{n r_d + 1}} \right) \frac{d\hat{z}}{\hat{z}^{n\hat{r} + 1}}$$
and
$$\chi = I_{loc} - I_{out} = \frac{-1}{(2\pi i)^d} \int_{N} \left( \int_{|z_d| = \rho + \delta_s} F(z) \frac{dz_d}{z_d^{n r_d + 1}} - \int_{|z_d| = \rho - \delta_s} F(z) \frac{dz_d}{z_d^{n r_d + 1}} \right) \frac{d\hat{z}}{\hat{z}^{n\hat{r} + 1}}. \qquad (5.17)$$

Our first goal is to show that the coefficients of interest are well-approximated by χ.


Lemma 5.1 If F admits a strictly minimal smooth contributing point $w \in \mathbb{C}_*^d$ in the direction $r \in \mathbb{R}_*^d$ then $|f_{nr} - \chi| = O(\tau^n)$ for some $\tau < |w^{-r}|$.

Proof Since w is contributing, $f_{nr} = I$. Furthermore, since w is contributing and strictly minimal there exists ε > 0 such that for $\hat{z} \in T \setminus N$ the univariate series, in t, for $F(\hat{z}, t)$ converges when $|t| = \rho e^{\operatorname{sgn}(r_d)\varepsilon}$. Thus, for any $\hat{z} \in T \setminus N$ the integral
$$\frac{1}{2\pi i} \int_{|z_d| = \rho - \delta_s} F(z) \frac{dz_d}{z_d^{n r_d + 1}} = [t^{n r_d}] F(\hat{z}, t)$$
grows exponentially slower than $\rho^{-n r_d} = |w_d|^{-n r_d}$, and the maximum modulus bound implies that
$$|I - I_{loc}| = \left| \frac{1}{(2\pi i)^d} \int_{T \setminus N} \left( \int_{|z_d| = \rho - \delta_s} F(z) \frac{dz_d}{z_d^{n r_d + 1}} \right) \frac{d\hat{z}}{\hat{z}^{n\hat{r} + 1}} \right|$$
grows exponentially slower than $|w^{-nr}|$. Finally, since w is contributing every point z in the domain of integration of $I_{out}$ has $h_r(z) > h_r(w)$, so the integral $I_{out}$ also grows exponentially slower than $|w^{-nr}|$. The result then follows from the triangle inequality, as
$$|f_{nr} - \chi| = |f_{nr} - I_{loc} + I_{out}| \le |f_{nr} - I_{loc}| + |I_{out}|.$$

The integral χ in Lemma 5.1 can be simplified with a residue computation. Furthermore, we can obtain a similar asymptotic expansion for a finitely minimal point by summing asymptotic contributions of this type.

Corollary 5.1 Suppose F(z) admits a finitely minimal smooth contributing point $w \in \mathbb{C}_*^d$ in the direction $r \in \mathbb{R}_*^d$. Suppose further that $T(w) \cap V = \{w_1, \dots, w_r\}$ where each $w_j$ is a smooth contributing point such that $H^s_{z_d}(w_j) \neq 0$. For each j let $N_j$ denote a sufficiently small neighbourhood of $\hat{w}_j$ in $T(\hat{w})$ such that there exists an analytic parametrization $z_d = g_j(\hat{z})$ of V for $\hat{z} \in N_j$. Then
$$f_{nr} = \sum_{j=1}^{r} \frac{-\operatorname{sgn}(r_d)}{(2\pi i)^{d-1}} \int_{N_j} R_j(\hat{z}) \frac{d\hat{z}}{\hat{z}^{n\hat{r} + 1}} + O(\tau^n) \qquad (5.18)$$
for some $\tau < |w^{-r}|$, where
$$R_j(\hat{z}) = \operatorname*{Res}_{z_d = g_j(\hat{z})} \frac{G(z)}{z_d^{n r_d + 1} H(z)}.$$

Proof Suppose first that $w = w_1$ is strictly minimal, so T(w) ∩ V = {w}. Then for any fixed $\hat{z} \in N$ the inner integrand of χ defined by (5.17) has a unique pole $z_d = g(\hat{z})$ with $\rho - \delta \le |z_d| \le \rho + \delta$. Thus, the (univariate) Cauchy residue theorem implies
$$\frac{-1}{2\pi i} \left( \int_{|z_d| = \rho + \delta_s} F(z) \frac{dz_d}{z_d^{n r_d + 1}} - \int_{|z_d| = \rho - \delta_s} F(z) \frac{dz_d}{z_d^{n r_d + 1}} \right) = -\operatorname{sgn}(r_d) R_1(\hat{z}),$$


and the result holds. If w is finitely minimal then existence of the $N_j$ and $g_j$ follows from the implicit function theorem. Repeating the above argument for the strictly minimal case with N replaced by the union of the $N_j$ shows that χ is a sum of residue integrals, each of which simplifies to the stated form.

Remark 5.11 Corollary 5.1 assumes $H^s_{z_d}(w_j) \neq 0$ for each 1 ≤ j ≤ r, meaning the solutions of H(z) = 0 can be locally parametrized by the same coordinates near each $w_j$. If, more generally, the partial derivative $H^s_{z_k}(w_j) \neq 0$ then all occurrences of $r_d$ and $z_d$ in the jth summand of (5.18) should be replaced by $r_k$ and $z_k$, respectively.

If w is one of the smooth minimal critical points in Corollary 5.1 then we may write $H(z) = (z_d - g(\hat{z}))^p q(z)$ for some integer p ≥ 1 and analytic function q(z) in a neighbourhood of w with $q(w) \neq 0$. The integer p is the order of the pole $z_d = g(\hat{z})$, and may be calculated as the smallest positive integer such that $(\partial^p H/\partial z_d^p)(w) \neq 0$. If H is square-free then p = 1 and
$$R(\hat{z}) = \operatorname*{Res}_{z_d = g(\hat{z})} \frac{G(z)}{z_d^{n r_d + 1} (z_d - g(\hat{z})) q(z)} = \frac{G(\hat{z}, g(\hat{z}))}{g(\hat{z})^{n r_d + 1} q(\hat{z}, g(\hat{z}))} = \frac{G(\hat{z}, g(\hat{z}))}{g(\hat{z})^{n r_d + 1} H_{z_d}(\hat{z}, g(\hat{z}))}.$$

The general formula is messier, but still explicit. To simplify notation, we write $\partial_d^k$ to denote the partial derivative operator which takes a differentiable function f(z) and returns $\partial_d^k(f) = (\partial^k f/\partial z_d^k)(z)$, where $\partial_d^0(f) = f(z)$.

Lemma 5.2 If the order of the pole $z_d = g(\hat{z})$ near w is p then for all $\hat{z} \in N$
$$R(\hat{z}) = (-1)^{p+1} g(\hat{z})^{-n r_d - p} \sum_{k=0}^{p-1} \frac{(-1)^k (n r_d + p - 1 - k)_{(p-k-1)}}{k!\,(p-k-1)!} \tilde{R}_k(\hat{z}), \qquad (5.19)$$
where $(a)_{(b)} = a(a-1)\cdots(a-b+1)$ denotes the falling factorial for $a \in \mathbb{R}$ and $b \in \mathbb{N}$, and
$$\tilde{R}_k(\hat{z}) = g(\hat{z})^k \lim_{z_d \to g(\hat{z})} \partial_d^k \left( (z_d - g(\hat{z}))^p F(z) \right).$$

Proof Lemma 2.4 in the appendix to Chapter 2 expresses the residue $R(\hat{z})$ as the limit of an order p − 1 derivative. The product rule for derivatives then implies
$$R(\hat{z}) = \frac{1}{(p-1)!} \lim_{z_d \to g(\hat{z})} \partial_d^{p-1} \left( z_d^{-n r_d - 1} (z_d - g(\hat{z}))^p F(z) \right)$$
$$= \frac{1}{(p-1)!} \lim_{z_d \to g(\hat{z})} \sum_{k=0}^{p-1} \binom{p-1}{k} \frac{(-1)^{p-1-k} (n r_d + p - 1 - k)_{(p-1-k)}}{z_d^{n r_d + p - k}} \, \partial_d^k \left( (z_d - g(\hat{z}))^p F(z) \right),$$
from which (5.19) is obtained by algebraic manipulation.
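As a sanity check on the residue formula of Lemma 5.2, one can compare it with a numerically computed contour integral in a univariate toy case. The sketch below assumes F(z) = G(z)/(z − g)^p for a polynomial G and a fixed complex number g, so that the kth derivative of (z − g)^p F at the pole is simply the kth derivative of G at g; the integer `m` plays the role of $n r_d$. All function names are illustrative.

```python
import cmath
from math import factorial

def falling(a, b):
    """Falling factorial a(a-1)...(a-b+1); the empty product is 1."""
    out = 1
    for i in range(b):
        out *= a - i
    return out

def poly_derivs(coeffs, z, kmax):
    """Values G(z), G'(z), ..., G^(kmax)(z) for G = sum of coeffs[i] * z^i."""
    vals, c = [], list(coeffs)
    for _ in range(kmax + 1):
        vals.append(sum(ci * z**i for i, ci in enumerate(c)))
        c = [i * ci for i, ci in enumerate(c)][1:]   # coefficients of the derivative
    return vals

def residue_formula(coeffs, g, p, m):
    """Residue of z^(-m-1) G(z)/(z-g)^p at z = g via the sum in Lemma 5.2."""
    d = poly_derivs(coeffs, g, p - 1)
    total = sum(
        (-1)**k * falling(m + p - 1 - k, p - 1 - k)
        / (factorial(k) * factorial(p - 1 - k)) * g**k * d[k]
        for k in range(p)
    )
    return (-1)**(p + 1) * g**(-m - p) * total

def residue_numeric(coeffs, g, p, m, radius=0.1, steps=2000):
    """(1/(2*pi*i)) times the integral over a small circle around g (must exclude 0)."""
    total = 0
    for s in range(steps):
        t = 2 * cmath.pi * s / steps
        z = g + radius * cmath.exp(1j * t)
        G = sum(ci * z**i for i, ci in enumerate(coeffs))
        dz = 1j * radius * cmath.exp(1j * t) * (2 * cmath.pi / steps)
        total += G * z**(-m - 1) / (z - g)**p * dz
    return total / (2j * cmath.pi)

coeffs, g, p, m = [1, 2, 0, 1], 0.5 + 0.25j, 3, 4   # G(z) = 1 + 2z + z^3
by_formula = residue_formula(coeffs, g, p, m)
by_contour = residue_numeric(coeffs, g, p, m)
print(by_formula, by_contour)
```

The agreement of the two values only tests the algebra of the formula in this special case; it is not a substitute for the proof.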




Step 4: Find Asymptotics using the Saddle-Point Method

To convert the expression (5.18) into a form more suitable for asymptotic approximation, we make the change of coordinates $z_j = w_j e^{i\theta_j}$ for j = 1, . . . , d − 1. Let $\mathcal{N} \subset \mathbb{R}^{d-1}$ be the image of N under this change of variables, which will be a neighbourhood of the origin (and can be taken as any sufficiently small neighbourhood of the origin without affecting our asymptotic statements). If $\theta = (\theta_1, \dots, \theta_{d-1})$ we define the coordinate-wise product $\hat{w} e^{i\theta} = (w_1 e^{i\theta_1}, \dots, w_{d-1} e^{i\theta_{d-1}})$ to lighten notation. Our next result follows directly from Corollary 5.1 and this change of variables.

Proposition 5.2 Suppose that $r \in \mathbb{R}_*^d$ and F admits a strictly minimal smooth contributing point w such that $H^s_{z_d}(w) \neq 0$. Let $g(\hat{z})$ be the analytic parameterization of $z_d$ in terms of $\hat{z}$ in any sufficiently small neighbourhood of $\hat{w}$. If the order of the pole $z_d = g(\hat{z})$ is p then for any sufficiently small neighbourhood $\mathcal{N}$ of the origin in $\mathbb{R}^{d-1}$ there exists an ε > 0 such that
$$f_{nr} = \chi + O\left( (|w^r| + \varepsilon)^{-n} \right),$$
where
$$\chi = \frac{w^{-nr}}{(2\pi)^{d-1}} \int_{\mathcal{N}} A_n(\theta)\, e^{-n r_d \phi(\theta)}\, d\theta \qquad (5.20)$$
for
$$A_n(\theta) = \frac{(-1)^p \operatorname{sgn}(r_d)}{g(\hat{w} e^{i\theta})^p}\, \Gamma_n(\hat{w} e^{i\theta}), \qquad \phi(\theta) = \log\left( \frac{g(\hat{w} e^{i\theta})}{g(\hat{w})} \right) + \frac{i(\hat{r} \cdot \theta)}{r_d} \qquad (5.21)$$
and
$$\Gamma_n(z) = \sum_{j=0}^{p-1} \frac{(-1)^j (n r_d + p - 1 - j)_{(p-j-1)}}{j!\,(p-j-1)!}\, g(\hat{z})^j \lim_{z_d \to g(\hat{z})} \partial_d^j \left( (z_d - g(\hat{z}))^p F(z) \right).$$
The function $\Gamma_n(z)$ is a polynomial of degree p − 1 in n, whose coefficients are analytic functions in $\hat{z}$. If $z_d = g(\hat{z})$ is a simple pole then p = 1 and
$$\Gamma_n(z) = \frac{G(\hat{z}, g(\hat{z}))}{H_{z_d}(\hat{z}, g(\hat{z}))}$$
is independent of n. When w is finitely minimal and each point $w_1, \dots, w_r$ of T(w) ∩ V is a smooth contributing point with $H^s_{z_d}(w_j) \neq 0$ then, up to an error which is exponentially smaller than $w^{-nr}$, the coefficient sequence $f_{nr}$ is a sum of integrals of the form (5.20) with w replaced by $w_j$ and $g(\hat{z})$ replaced by a parameterization $g_j(\hat{z})$ of $z_d$ near $w_j$.
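To make the integral (5.20) concrete, here is a minimal numerical sketch for F(x, y) = 1/(1 − x − y) in direction r = (1, 1), where w = (1/2, 1/2), p = 1, g(x) = 1 − x and $\Gamma_n = G/H_y = -1$. Under these assumptions the amplitude is $A_n(\theta) = 1/g(w_1 e^{i\theta})$ and $\phi(\theta) = \log(2 - e^{i\theta}) + i\theta$. Integrating over the full circle recovers the Cauchy integral for the central binomial coefficient, while integrating over a neighbourhood of the origin only changes the value by an exponentially small relative amount. The function names and the midpoint quadrature are illustrative choices.

```python
import cmath
from math import comb, pi

def integrand(theta, n):
    """A_n(theta) * exp(-n * phi(theta)) for F = 1/(1 - x - y), r = (1, 1)."""
    g = 1 - 0.5 * cmath.exp(1j * theta)      # g(w_1 e^{i theta}) with w_1 = 1/2
    phi = cmath.log(g / 0.5) + 1j * theta    # (5.21) with g(w_1) = 1/2, r_1/r_2 = 1
    amplitude = 1 / g                        # (-1)^p sgn(r_d) Gamma_n / g^p, p = 1
    return amplitude * cmath.exp(-n * phi)

def chi(n, half_width, steps=20000):
    """Midpoint-rule value of (5.20) over (-half_width, half_width)."""
    h = 2 * half_width / steps
    total = sum(integrand(-half_width + (k + 0.5) * h, n) for k in range(steps))
    # w^{-n r} = 4^n and (2 pi)^{d-1} = 2 pi when d = 2.
    return 4**n * total * h / (2 * pi)

n = 20
exact = comb(2 * n, n)     # the diagonal coefficient [x^n y^n] F
full = chi(n, pi)          # whole torus: the exact Cauchy integral
local = chi(n, 1.0)        # only a neighbourhood of the critical point

print(exact, full.real, local.real)
```

Shrinking `half_width` further (while keeping n fixed) eventually degrades the approximation, which reflects that the neighbourhood is fixed first and n is then sent to infinity.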


To compute asymptotics of the integral in (5.20) we use the following result, a multivariate generalization of the saddle-point method. Recall that the Hessian of a differentiable function g(θ) from $\mathbb{R}^k$ to $\mathbb{C}$ is the k × k matrix whose (i, j)th entry is the second partial derivative $g_{\theta_i \theta_j} = \partial^2 g/\partial\theta_i \partial\theta_j$.

Proposition 5.3 (Asymptotics of Nondegenerate Multivariate Fourier-Laplace Integrals) Suppose that the functions P(θ) and ψ(θ) from $\mathbb{R}^k$ to $\mathbb{C}$ are analytic in a neighbourhood N of the origin, and let $\mathcal{H}$ be the Hessian of ψ evaluated at the point θ = 0. If $r \in \mathbb{R}$ is nonzero and
• ψ(0) = 0 and (∇ψ)(0) = 0;
• the origin is the only point of N where ∇ψ is 0;
• $\mathcal{H}$ is non-singular (has non-zero determinant);
• the real part of rψ(θ) is non-negative on N,
then for any nonnegative integer M there exist computable constants $K_0, \dots, K_M$ such that
$$\int_N P(\theta)\, e^{-n r \psi(\theta)}\, d\theta = \left( \frac{2\pi}{n} \right)^{k/2} \det(r\mathcal{H})^{-1/2} \left( \sum_{j=0}^{M} K_j (rn)^{-j} + O\left( n^{-M-1} \right) \right), \qquad (5.22)$$
where the square-root of the determinant is the product of the principal branch square-roots of the eigenvalues of $r\mathcal{H}$ (which all have non-negative real part). The constant $K_0 = P(0)$ and if P(θ) vanishes to order L ≥ 1 at the origin then (at least) the constants $K_0, \dots, K_{\lfloor L/2 \rfloor}$ are all zero. More precisely, define the differential operator
$$E = -\sum_{1 \le i, j \le k} \left( \mathcal{H}^{-1} \right)_{ij} \partial_i \partial_j$$
where $\partial_j$ denotes differentiation with respect to the variable $\theta_j$ and $\mathcal{H}^{-1}$ is the inverse matrix of $\mathcal{H}$. Let
$$\tilde{\psi}(\theta) = \psi(\theta) - (1/2)\, \theta \cdot \mathcal{H} \cdot \theta^T,$$
which is a scalar function vanishing to order 3 at the origin. Then
$$K_j = (-1)^j \sum_{0 \le \ell \le 2j} \left. \frac{E^{\ell + j} \left( P(\theta)\, \tilde{\psi}(\theta)^{\ell} \right)}{2^{\ell + j}\, \ell!\, (\ell + j)!} \right|_{\theta = 0}. \qquad (5.23)$$
Due to the order of vanishing of $\tilde{\psi}$, to determine $K_j$ one only needs to calculate evaluations at 0 of the derivatives of P of order at most 2j and the derivatives of ψ of order at most 2j + 2. The hidden constant in the error term $O(n^{-M-1})$ of (5.22) varies continuously with the values of the partial derivatives of P and ψ at the origin.

Proof Proposition 5.3 follows from Theorem 2.3 of Pemantle and Wilson [31]. If the final condition of the proposition is strengthened to require that the real part of rψ(θ) be positive on N \ {0}, which holds in the situations we encounter, then the stated conclusions follow from the more elementary Theorem 4.1 of [31]. Hörmander [20, Theorem 7.7.5] gives the explicit formula for higher order terms when A


has compact support (this restriction can be removed when A and φ satisfy certain growth conditions which hold in most situations we encounter). Section 5 of [31] shows that the expressions of Hörmander are valid under our stated assumptions.

We now go through the criteria required to apply Proposition 5.3 to the integral expression in Proposition 5.2. First, we show criticality is equivalent to the vanishing of the gradient of φ.

Lemma 5.3 Suppose w ∈ V is any point such that $H^s_{z_d}(w) \neq 0$, and let $z_d = g(\hat{z})$ be the parameterization of $z_d$ on V near w. Then the gradient of the function φ(θ) in (5.21) equals 0 at θ = 0 if and only if w is a smooth critical point.

Proof Fix 1 ≤ k ≤ d − 1. Differentiating the equation $H^s(\hat{z}, g(\hat{z})) = 0$ with respect to $z_k$ using the chain rule implies $g_{z_k}(\hat{z}) = -H^s_{z_k}(\hat{z}, g(\hat{z}))/H^s_{z_d}(\hat{z}, g(\hat{z}))$ for $\hat{z}$ in a neighbourhood of $\hat{w}$. Thus, the derivative of φ with respect to $\theta_k$ is
$$\phi_{\theta_k}(\theta) = \frac{g_{z_k}(\hat{w} e^{i\theta})}{g(\hat{w} e^{i\theta})}\, i w_k e^{i\theta_k} + i \frac{r_k}{r_d} = \frac{-H^s_{z_k}\left( \hat{w} e^{i\theta}, g(\hat{w} e^{i\theta}) \right)}{g(\hat{w} e^{i\theta})\, H^s_{z_d}\left( \hat{w} e^{i\theta}, g(\hat{w} e^{i\theta}) \right)}\, i w_k e^{i\theta_k} + i \frac{r_k}{r_d}.$$
Substituting θ = 0 and using $g(\hat{w}) = w_d$ shows that vanishing of all partial derivatives of φ at the origin is equivalent to w satisfying the smooth critical point equations.

Next, we show minimality implies the real part of $r_d \phi$ is non-negative.

Lemma 5.4 Suppose w ∈ V is a minimal contributing point such that $H^s_{z_d}(w) \neq 0$, and let $z_d = g(\hat{z})$ be a parameterization of $z_d$ on V near w. If φ(θ) is defined by (5.21) then the real part $\Re(r_d \phi(\theta)) \ge 0$ for real θ. If w is finitely minimal then $\Re(r_d \phi(\theta)) > 0$ for nonzero θ in any sufficiently small neighbourhood N of the origin.

Proof The real part of $r_d \phi$ can be expressed as
$$\Re(r_d \phi) = r_d \log \left| g(\hat{w} e^{i\theta}) \right| - r_d \log |g(\hat{w})|,$$
which is non-negative if and only if $|g(\hat{w})|^{r_d} \le \left| g(\hat{w} e^{i\theta}) \right|^{r_d}$. Since $g(\hat{w} e^{i\theta})$ gives the dth coordinate of a point on V whose first d − 1 coordinates lie in $T(\hat{w})$, this inequality holds whenever w is minimal and contributing.
If w is finitely minimal then no other points in any sufficiently small neighbourhood of w in V have the same coordinate-wise modulus as w, so the inequality is strict.  The Hessian of φ at the origin may, in general, be singular. Definition 5.7 (nondegenerate critical points) The critical point w is nondegenerate if the Hessian matrix H of φ at the origin is nonsingular. We require that all minimal contributing points are nondegenerate. Applying the chain rule, and using that w is a critical point in the direction r, allows for an explicit determination of the Hessian matrix of φ at the origin in terms of the partial derivatives of H(z), ultimately giving the following result.


Lemma 5.5 Suppose w is a smooth critical point where $H^s_{z_d}(w) \neq 0$ and let $z_d = g(\hat{z})$ be the parameterization of $z_d$ on V near w. If, for 1 ≤ i, j ≤ d, we define
$$U_{i,j} = \frac{w_i w_j H^s_{z_i z_j}(w)}{w_d H^s_{z_d}(w)} \qquad \text{and} \qquad V_i = \frac{w_i H^s_{z_i}(w)}{w_d H^s_{z_d}(w)} = \frac{r_i}{r_d} \qquad (5.24)$$
then the (d − 1) × (d − 1) Hessian matrix $\mathcal{H}$ of the function φ in (5.21) evaluated at the origin has (i, j)th entry
$$\mathcal{H}_{i,j} = \begin{cases} V_i V_j + U_{i,j} - V_j U_{i,d} - V_i U_{j,d} + V_i V_j U_{d,d} & : i \neq j \\ V_i + V_i^2 + U_{i,i} - 2 V_i U_{i,d} + V_i^2 U_{d,d} & : i = j \end{cases} \qquad (5.25)$$
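Formula (5.25) is mechanical to evaluate once the first and second partial derivatives of $H^s$ at w are known. The following sketch, with the hypothetical helper name `hessian_5_25`, takes the point, the gradient, and the matrix of second partials as plain lists and uses exact rational arithmetic; it does not itself verify that w is a critical point.

```python
from fractions import Fraction

def hessian_5_25(w, grad, second):
    """Matrix (5.25) from w, the gradient of H^s at w, and its second partials at w."""
    d = len(w)
    last = d - 1
    den = w[last] * grad[last]                    # w_d * H^s_{z_d}(w), assumed non-zero
    V = [w[i] * grad[i] / den for i in range(d)]  # V_i from (5.24)
    U = [[w[i] * w[j] * second[i][j] / den for j in range(d)] for i in range(d)]
    matrix = []
    for i in range(d - 1):
        row = []
        for j in range(d - 1):
            entry = (V[i] * V[j] + U[i][j] - V[j] * U[i][last]
                     - V[i] * U[j][last] + V[i] * V[j] * U[last][last])
            if i == j:
                entry += V[i]   # the diagonal case of (5.25) has one extra term V_i
            row.append(entry)
        matrix.append(row)
    return matrix

# Illustration: H = 1 - x - y - z in direction (a, b, c) = (1, 2, 3), whose
# critical point is (a, b, c)/(a + b + c); all second partials of H vanish.
w = [Fraction(1, 6), Fraction(2, 6), Fraction(3, 6)]
grad = [Fraction(-1)] * 3
second = [[Fraction(0)] * 3 for _ in range(3)]
result = hessian_5_25(w, grad, second)
print(result)
```

For this linear H the output reduces to the entries $a(a+c)/c^2$, $ab/c^2$ and $b(b+c)/c^2$, since only the terms built from the $V_i$ survive.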

Nondegeneracy of a critical point w relates behaviour of $h_r$ to the geometry of the singular set V near w. This regulates the behaviour of nondegenerate critical points.

Lemma 5.6 There are a finite number of nondegenerate smooth critical points in any direction $r \in \mathbb{R}_*^d$. If $w = w(r) \in \mathbb{C}_*^d$ is a nondegenerate smooth critical point in the direction $r \in \mathbb{R}_*^d$ then w is isolated among the set of smooth critical points in the direction r, and w(r) varies smoothly as r moves in some sufficiently small neighbourhood in $\mathbb{R}_*^d$. If H(z) has real coefficients and $w \in \mathbb{R}_*^d$ then $w(r) \in \mathbb{R}_*^d$ as r moves in a sufficiently small neighbourhood.

Proof The fact that a nondegenerate critical point is isolated follows directly from the complex Morse lemma, which states that after an analytic change of coordinates φ can be written in a $V_*$-neighbourhood of w as a sum of squares $\phi(w) + u_1^2 + \cdots + u_{d-1}^2$ (see Ebeling [15, Prop. 3.15 and Cor. 3.3]). Bézout's theorem [8, Thm. 8.2] bounds the number of isolated solutions of a polynomial system, which form its 'zero-dimensional component'. Since any nondegenerate smooth critical point is an isolated solution of the smooth critical point equations (5.16), such points are finite in number.

Because w is a smooth point we may assume, without loss of generality, that $H^s_{z_d}(w) \neq 0$, and we let $z_d = g(\hat{z})$ be the parameterization of $z_d$ on V near w. By a multivariate version of the implicit function theorem [21, Thm. 2.1.2], to show that w(r) varies smoothly with r it is enough to show that the Jacobian of the smooth critical point system
$$r_k z_1 H_{z_1}(\hat{z}, g(\hat{z})) - r_1 z_k H_{z_k}(\hat{z}, g(\hat{z})) = 0 \qquad (2 \le k \le d)$$
with respect to $\hat{z}$ is non-singular at $\hat{w}$ (to lighten notation we now suppose H is square-free so that $H^s = H$, otherwise replace H with $H^s$ throughout). The Hessian of the map $\phi(\theta) = r_d \log g(\hat{w} e^{i\theta}) - r_d \log g(\hat{w}) + i(\theta \cdot \hat{r})$ at θ = 0 is the Jacobian of the gradient map
$$(\nabla\phi)(\theta) = \frac{i}{L_d(\theta)} \Big( r_d L_1(\theta) - r_1 L_d(\theta), \; \dots, \; r_d L_{d-1}(\theta) - r_{d-1} L_d(\theta) \Big), \qquad (5.26)$$
where
$$L_j(\theta) = w_j e^{i\theta_j} H_{z_j}\left( \hat{w} e^{i\theta}, g(\hat{w} e^{i\theta}) \right) \quad (1 \le j \le d-1), \qquad L_d(\theta) = g(\hat{w} e^{i\theta})\, H_{z_d}\left( \hat{w} e^{i\theta}, g(\hat{w} e^{i\theta}) \right).$$
The vector in (5.26) is, up to a multiplicative factor and change of coordinates $z_j = w_j e^{i\theta_j}$, the smooth critical point system. The chain rule implies this change of variables does not affect singularity of the Jacobian. Furthermore, the entries of (∇φ)(θ) vanish when θ = 0 so the factor of $i/L_d(\theta)$ simply multiplies the Jacobian of the smooth critical point system by the non-zero constant $i/(w_d H_{z_d}(w))$. Thus, the Jacobian determinant of the smooth critical point system at a critical point w is, up to a non-zero factor, the Hessian determinant of φ at θ = 0, which by definition is nonzero when w is nondegenerate. When all quantities are real, the implicit function theorem implies the critical point w(r) is also real.

We are now ready to state the main asymptotic results of this section. Theorems 5.2 and 5.3 follow directly from applying Proposition 5.3 to Proposition 5.2 under our assumptions, taking Lemmas 5.3 to 5.6 into account. The easiest, and most common, situation is when the minimal contributing point is a simple pole. We state this result separately in order to be as explicit as possible.

Theorem 5.2 (Smooth Asymptotics for Simple Poles) Suppose that the rational function F(z) = G(z)/H(z) admits a nondegenerate strictly minimal smooth contributing point $w \in \mathbb{C}_*^d$ in the direction $r \in \mathbb{R}_*^d$, such that $H_{z_d}(w) \neq 0$. Then for any nonnegative integer M there exist computable constants $C_0, \dots, C_M$ such that
$$f_{nr} = w^{-nr}\, n^{(1-d)/2}\, \frac{(2\pi)^{(1-d)/2}}{\sqrt{\det(r_d \mathcal{H})}} \left( \sum_{j=0}^{M} C_j (r_d n)^{-j} + O\left( n^{-M-1} \right) \right), \qquad (5.27)$$
where the matrix $\mathcal{H}$ is defined in (5.25), the square-root of the determinant is given by the product of the principal branch square-roots of the eigenvalues of $r_d \mathcal{H}$, and $C_j$ equals the constant $K_j$ defined by (5.23) when
$$P(\theta) = \frac{-\operatorname{sgn}(r_d)\, G\left( \hat{w} e^{i\theta}, g(\hat{w} e^{i\theta}) \right)}{g(\hat{w} e^{i\theta})\, H_{z_d}\left( \hat{w} e^{i\theta}, g(\hat{w} e^{i\theta}) \right)} \qquad \text{and} \qquad \psi(\theta) = \log\left( \frac{g(\hat{w} e^{i\theta})}{g(\hat{w})} \right) + \frac{i(\hat{r} \cdot \theta)}{r_d}.$$
The leading constant $C_0$ in this series has the value
$$C_0 = \frac{-\operatorname{sgn}(r_d)\, G(w)}{w_d H_{z_d}(w)},$$
which is nonzero whenever $G(w) \neq 0$. The asymptotic expansion (5.27) holds uniformly in neighbourhoods $R \subset \mathbb{R}_*^d$ of r where there is a smoothly varying nondegenerate strictly minimal contributing point such that $H_{z_d}$ does not vanish. In


other words, if $w(s) \in \mathbb{C}_*^d$ is the contributing point in direction s ∈ R then there exists B > 0 such that
$$\left| f_{ns} - w(s)^{-ns}\, n^{(1-d)/2}\, \frac{(2\pi)^{(1-d)/2}}{\sqrt{\det(s_d \mathcal{H}^s)}} \sum_{j=0}^{M} C_j^s (s_d n)^{-j} \right| \le B\, |w(s)^{-ns}|\, n^{(1-d)/2 - M - 1}$$
for all $n \in \mathbb{N}$ and s ∈ R with $ns \in \mathbb{Z}^d$, where $C_j^s$ and $\mathcal{H}^s$ are calculated from the contributing point w(s) as above.

Remark 5.12 Although the constants in (5.27) are defined in terms of partial derivatives of the parameterization $g(\hat{z})$ for $z_d$ on V, implicitly differentiating the equation $H(\hat{z}, g(\hat{z})) = 0$ allows one to determine the partial derivatives of g at $\hat{w}$ from the partial derivatives of H at w. Thus, the only information needed to determine the constants $C_0, \dots, C_M$ appearing in Theorem 5.2 is the evaluations at z = w of the partial derivatives of G(z) up to order 2M and the partial derivatives of H(z) up to order 2M + 2.

The corresponding asymptotic result for higher order poles is more awkward to state, although all constants are still explicitly computable.

Theorem 5.3 (Smooth Asymptotics) Suppose that F(z) = G(z)/H(z) admits a nondegenerate strictly minimal smooth contributing point $w \in \mathbb{C}_*^d$ in the direction $r \in \mathbb{R}_*^d$, such that $H^s_{z_d}(w) \neq 0$, and let p be the smallest integer such that $(\partial_d^p H)(w) \neq 0$. Then for any nonnegative integer M there exist computable constants $C_0, \dots, C_M$ such that
$$f_{nr} = w^{-nr}\, n^{p-1+(1-d)/2}\, \frac{(2\pi)^{(1-d)/2}}{\sqrt{\det(r_d \mathcal{H})}} \left( \sum_{j=0}^{M} C_j (r_d n)^{-j} + O\left( n^{-M-1} \right) \right). \qquad (5.28)$$
The constants $C_j$ are determined by expanding $A_n(\theta)$ in Proposition 5.2 as a polynomial in n and applying Proposition 5.3 to the resulting sum of integrals. The leading constant $C_0$ in this series has the value
$$C_0 = \frac{\operatorname{sgn}(r_d)\, (-1)^p\, G(w)\, p}{w_d^p\, (\partial_d^p H)(w)},$$
which is nonzero whenever $G(w) \neq 0$. The asymptotic expansion (5.28) holds uniformly in neighbourhoods $R \subset \mathbb{R}_*^d$ of r where there is a smoothly varying contributing point satisfying the same conditions as w.

Remark 5.13 Often H is a pure power of a square-free polynomial, i.e., $H(z) = Q(z)^p$ for some integer p > 1 and square-free polynomial Q. In this case the leading constant $C_0$ in Theorem 5.3 becomes
$$C_0 = \frac{\operatorname{sgn}(r_d)\, (-1)^p\, G(w)}{(p-1)!\, \left( w_d Q_{z_d}(w) \right)^p}.$$


This variation is common in the literature. When w is a finitely minimal critical point, and each point of T(w) ∩ V satisfies the conditions of Theorem 5.3, then one can add the asymptotic contributions of the minimal contributing points to determine dominant asymptotics.

Corollary 5.2 Suppose that F admits a finitely minimal smooth contributing point $w \in \mathbb{C}_*^d$ in the direction $r \in \mathbb{R}_*^d$, let E be the set of points in V ∩ T(w), and suppose all elements of E satisfy the conditions of Theorem 5.3. For some nonnegative integer M, let $\Phi_w$ denote the right-hand side of (5.28) calculated at w ∈ E (equal to (5.27) whenever w is a simple pole). Then
$$f_{nr} = \sum_{w \in E} \Phi_w$$
gives an asymptotic expansion of $f_{nr}$ as n → ∞.

Theorem 5.3 gives an asymptotic expansion for the coefficients $f_{nr}$, but when the numerator G vanishes at all minimal contributing points it is possible that the exponential growth of $f_{nr}$ is smaller than what is predicted by these points.

Example 5.4 (Vanishing of Asymptotic Terms) Consider the rational functions
$$A(x, y) = \frac{1}{1 - x - y}, \qquad B(x, y) = \frac{x - 2y^2}{1 - x - y}, \qquad \text{and} \qquad C(x, y) = \frac{x - y}{1 - x - y},$$
all of which admit a single contributing singularity σ = (1/2, 1/2) for the main diagonal direction r = (1, 1). The main power series diagonal of A(x, y) is the generating function of the central binomial coefficients $\binom{2n}{n}$, and Theorem 5.2 with M = 2 gives
$$[x^n y^n] A(x, y) = \binom{2n}{n} = 4^n \left( \frac{1}{\sqrt{\pi}\, n^{1/2}} - \frac{1}{8\sqrt{\pi}\, n^{3/2}} + \frac{1}{128\sqrt{\pi}\, n^{5/2}} + O\left( n^{-7/2} \right) \right).$$
The numerator of B(x, y) vanishes at σ, meaning the constant $C_0$ in (5.27) is zero, but one can calculate the higher order constants to obtain
$$[x^n y^n] B(x, y) = 4^n \left( \frac{1}{4\sqrt{\pi}\, n^{3/2}} + \frac{3}{32\sqrt{\pi}\, n^{5/2}} + O\left( n^{-7/2} \right) \right).$$
Direct inspection shows that the main diagonal of C(x, y) is identically zero. When applied in an automatic manner, Theorem 5.2 allows one to show for any natural number M, in a complexity which is polynomial in M, that
$$[x^n y^n] C(x, y) = O\left( \frac{4^n}{n^M} \right).$$


The fact that the diagonal is identically zero can be recovered from the eventual saddle-point integral by a short argument, see Problem 5.11. Note that C(x, y) admits the minimal contributing point $\left( \frac{r}{r+s}, \frac{s}{r+s} \right)$ in the direction r = (r, s), and that the numerator of C(x, y) vanishes at this point only when r = s. Thus, the exponential growth rate of the coefficient sequence $f_{rn,sn}$ approaches $4^{rn}$ as s → r, but there is a sharp drop down to zero exactly when s = r.
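The three diagonals of Example 5.4 have closed forms in terms of binomial coefficients, since $[x^i y^j]\, 1/(1 - x - y) = \binom{i+j}{i}$, so the stated expansions can be compared with exact values. The sketch below does this with Python's standard library; the function names are illustrative, and the comparison is a numerical plausibility check rather than a derivation of the constants.

```python
from math import comb, pi, sqrt

def diag_A(n):
    """[x^n y^n] 1/(1 - x - y)."""
    return comb(2 * n, n)

def diag_B(n):
    """[x^n y^n] (x - 2y^2)/(1 - x - y), valid for n >= 1."""
    return comb(2 * n - 1, n) - 2 * comb(2 * n - 2, n)

def diag_C(n):
    """[x^n y^n] (x - y)/(1 - x - y), valid for n >= 1."""
    return comb(2 * n - 1, n) - comb(2 * n - 1, n - 1)

def series_A(n):
    """Three-term expansion of the diagonal of A stated in Example 5.4."""
    return 4**n / sqrt(pi) * (n**-0.5 - n**-1.5 / 8 + n**-2.5 / 128)

def series_B(n):
    """Two-term expansion of the diagonal of B stated in Example 5.4."""
    return 4**n / sqrt(pi) * (n**-1.5 / 4 + 3 * n**-2.5 / 32)

for n in (10, 50, 200):
    print(n, series_A(n) / diag_A(n), series_B(n) / diag_B(n), diag_C(n))
```

The printed ratios approach 1 as n grows, with the ratio for A converging faster because one more term of its expansion is included.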

In Section 5.3.4 we show that most of our assumptions, including the condition that the numerator G(z) does not vanish at the smooth critical points of F, hold for all rational functions with numerator and denominator of fixed degree, except those whose coefficients lie in some fixed algebraic set. This helps illustrate why Corollary 5.2 applies to many problems appearing in the mathematical and scientific literature. Remark 5.14 The arguments above also hold for convergent Laurent expansions of a ratio F(z) = P(z)/Q(z) p of analytic functions P and Q with p ∈ N. Indeed, the numerator of F only enters into the determination of asymptotics for the integral in Proposition 5.2, and Proposition 5.3 only requires analyticity of the functions under consideration. The singular variety V is now the zero set of analytic Q(z), and we still say w ∈ V is a smooth point if some partial derivative of Q does not vanish at w. The definition of critical points requires only the logarithmic gradient, and we again call a critical point contributing if it is a minimizer of the height function hr on D. None of the integral manipulations above used that G and H were polynomial, so under these modified definitions our asymptotic results still hold. Corollary 5.3 Suppose the hypotheses of Corollary 5.2 hold for a ratio F(z) = P(z)/Q(z) p where p is a positive integer and for some ε > 0 the functions P and Q are analytic on {z ∈ Cd : hr (z) > hr (w)−ε}. Then the conclusion of Corollary 5.2 holds when asymptotics are calculated with G(z) = P(z), H(z) = Q(z) p, and H s (z) = Q(z). Although we do not discuss what it means for analytic functions P and Q to be coprime, if P and Q are not coprime and one applies Corollary 5.3 then the resulting asymptotic expansion still holds. The only downside is that one may find an apparent minimal critical point by examining V(Q) which is not actually a singularity of P(z)/Q(z) p . 
In this case all coefficients C j in the suspected leading series of the asymptotic expansion will (correctly) be calculated to be zero.

5.3 The Practice of Smooth ACSV

In this section we show how to apply our newly developed asymptotic methods, and give ancillary results which help put the theory into practice. For a rational function F(z) = G(z)/H(z) the smooth critical point equations (5.16) are defined by polynomial equalities involving the square-free part of H, and are easily manipulated


with a computer algebra system. As previously noted, in Section 5.3.4 we show that for 'most' rational functions the singular variety V is everywhere smooth, there are a finite number of critical points, all of which are nondegenerate, and the numerator does not vanish at any of these points. Thus, the most difficult step in an asymptotic analysis is usually determining which, if any, of the algebraically defined critical points are minimal. We take a full computational perspective on these questions in Chapter 7, where we develop complete algorithms implementing our results. When H(z) is simple enough, direct arguments can be used to prove minimality.

Example 5.5 (Asymptotics of a Multinomial Expansion) Consider the power series expansion of the trivariate function
$$F(x, y, z) = \frac{1}{1 - x - y - z} = \sum_{i,j,k \ge 0} \binom{i+j+k}{i, j, k} x^i y^j z^k = \sum_{i,j,k \ge 0} \frac{(i+j+k)!}{i!\, j!\, k!}\, x^i y^j z^k,$$
whose power series domain of convergence consists of $(x, y, z) \in \mathbb{C}^3$ such that |x| + |y| + |z| < 1. Since H(x, y, z) = 1 − x − y − z is square-free, the smooth critical point equations in direction r = (a, b, c) become
$$H(x, y, z) = 1 - x - y - z = 0$$
$$b x H_x(x, y, z) - a y H_y(x, y, z) = -bx + ay = 0$$
$$c x H_x(x, y, z) - a z H_z(x, y, z) = -cx + az = 0,$$
so that there is a single smooth critical point
$$(x, y, z) = (x_*, y_*, z_*) = \left( \frac{a}{a+b+c}, \frac{b}{a+b+c}, \frac{c}{a+b+c} \right).$$
If any of a, b, c is non-positive then F admits no minimal critical point for its power series expansion. If a, b, c > 0 and (x, y, z) satisfy
$$x + y + z = 1, \qquad |x| \le x_*, \qquad |y| \le y_*, \qquad |z| \le z_*,$$
then it must be the case that $(x, y, z) = (x_*, y_*, z_*)$, so we have found a strictly minimal smooth contributing point (since every minimal critical point is contributing for a power series expansion when r has non-negative coordinates). Equation (5.25) implies the matrix $\mathcal{H}$ in (5.27) equals
$$\mathcal{H} = \begin{pmatrix} \dfrac{a(a+c)}{c^2} & \dfrac{ab}{c^2} \\[2mm] \dfrac{ab}{c^2} & \dfrac{b(b+c)}{c^2} \end{pmatrix}$$
and Theorem 5.2 gives an asymptotic expansion for the power series coefficients $[x^{an} y^{bn} z^{cn}] F(x, y, z)$ with $an, bn, cn \in \mathbb{N}$ that begins
$$\left( \frac{(a+b+c)^{a+b+c}}{a^a b^b c^c} \right)^{n} n^{-1} \left( \frac{\sqrt{a+b+c}}{2\pi\sqrt{abc}} - \frac{(a+b)(a+c)(b+c)}{24\pi (abc)^{3/2} \sqrt{a+b+c}}\, n^{-1} + \cdots \right).$$
Note that Corollary 2.1 implies the r-diagonal (ΔF)(z) is non-algebraic for any r with positive rational coordinates. If we consider the power series expansion of
$$F_p(x, y, z) = \frac{1}{(1 - x - y - z)^p}$$
for $p \in \mathbb{N}$ then Theorem 5.3 gives an asymptotic expansion for the power series coefficients $[x^{an} y^{bn} z^{cn}] F_p(x, y, z)$ with $an, bn, cn \in \mathbb{N}$ that begins
$$\left( \frac{(a+b+c)^{a+b+c}}{a^a b^b c^c} \right)^{n} n^{p-2} \left( \frac{(a+b+c)^{p-1/2}}{(p-1)!\, 2\pi\sqrt{abc}} + O\left( \frac{1}{n} \right) \right).$$
The exponential growth of the coefficient sequence is independent of p as the singular variety V, and thus the location of the minimal contributing point $(x_*, y_*, z_*)$, does not depend on p.
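The expansions in Example 5.5 can be compared with exact coefficients, because the coefficient of $x^i y^j z^k$ in $(1 - x - y - z)^{-p}$ equals $\binom{N+p-1}{p-1}$ times the multinomial coefficient, where N = i + j + k. The sketch below works with logarithms (via `lgamma`) to avoid overflow; the function names and the choice (a, b, c) = (1, 2, 3) are illustrative assumptions.

```python
from math import lgamma, log, exp, pi, sqrt, factorial

def log_exact(a, b, c, n, p=1):
    """log of [x^{an} y^{bn} z^{cn}] (1 - x - y - z)^{-p}."""
    N = (a + b + c) * n
    log_multinomial = (lgamma(N + 1) - lgamma(a * n + 1)
                       - lgamma(b * n + 1) - lgamma(c * n + 1))
    # [u^N] (1 - u)^{-p} = binomial(N + p - 1, p - 1), with u = x + y + z.
    log_binomial = lgamma(N + p) - lgamma(N + 1) - lgamma(p)
    return log_multinomial + log_binomial

def log_growth(a, b, c):
    """log of (a+b+c)^(a+b+c) / (a^a b^b c^c)."""
    s = a + b + c
    return s * log(s) - a * log(a) - b * log(b) - c * log(c)

def log_two_term(a, b, c, n):
    """log of the two-term expansion for p = 1 stated in Example 5.5."""
    s = a + b + c
    lead = sqrt(s) / (2 * pi * sqrt(a * b * c))
    corr = (a + b) * (a + c) * (b + c) / (24 * pi * (a * b * c) ** 1.5 * sqrt(s))
    return n * log_growth(a, b, c) + log((lead - corr / n) / n)

def log_leading(a, b, c, n, p):
    """log of the leading term for general p stated in Example 5.5."""
    s = a + b + c
    const = s ** (p - 0.5) / (factorial(p - 1) * 2 * pi * sqrt(a * b * c))
    return n * log_growth(a, b, c) + (p - 2) * log(n) + log(const)

a, b, c, n = 1, 2, 3, 50
ratio_two_term = exp(log_two_term(a, b, c, n) - log_exact(a, b, c, n))
ratio_leading_p1 = exp(log_leading(a, b, c, n, 1) - log_exact(a, b, c, n))
ratio_leading_p3 = exp(log_leading(a, b, c, n, 3) - log_exact(a, b, c, n, 3))
print(ratio_two_term, ratio_leading_p1, ratio_leading_p3)
```

Including the second term brings the ratio noticeably closer to 1 than the leading term alone, as expected from an expansion in powers of 1/n.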

When dealing with Laurent expansions having non-negative coefficients, for instance multivariate generating functions of combinatorial classes with parameters, we can use this non-negativity to help characterize minimal points.

Definition 5.8 (combinatorial series and functions) A convergent Laurent expansion of meromorphic F(z) is called combinatorial if this series expansion contains only a finite number of negative coefficients. If F(z) admits a power series expansion and this power series expansion is combinatorial, the situation we encounter most often, then we call F(z) combinatorial.

Given $z \in \mathbb{C}^d$ we write $|z| = (|z_1|, \dots, |z_d|)$. The following result is a multivariate analogue of the Vivanti-Pringsheim Theorem (Proposition 2.4 in Chapter 2) which greatly aids in determining minimality.

Lemma 5.7 Suppose the Laurent expansion of F(z) with domain of convergence D is combinatorial. If $w \in V_* \cap \partial D$ is a minimal point of F(z) with non-zero coordinates then the point |w| with positive coordinates also lies in V (and is thus also minimal).

Proof Possibly by adding a Laurent polynomial to F(z), which will not change the singularities of F away from the coordinate axes, we may assume that all Laurent coefficients of F are non-negative. Write the convergent Laurent expansion of F on D as
$$F(z) = \sum_{i \in \mathbb{Z}^d} f_i z^i.$$
Since $f_i \ge 0$ for all $i \in \mathbb{Z}^d$, the triangle inequality implies
$$|F(z)| \le \sum_{i \in \mathbb{Z}^d} f_i \left| z^i \right| = F(|z|)$$


for all z ∈ D. Because F(z) is meromorphic its singularities are poles, so if w lies in V ∩ ∂D then there exists a sequence $z^{(n)}$ in D converging to w such that $|F(z^{(n)})| \to \infty$. But then the sequence $|z^{(n)}|$ lies in D, converges to |w|, and satisfies $F(|z^{(n)}|) \to \infty$. Thus |w| is also a polar singularity of F, as desired.

Remark 5.15 If F(z) = G(z)/H(z) is a rational function and B = Relog(D) then Lemma 5.7 implies ∂B, which describes a portion of the amoeba boundary ∂amoeba(H), is contained in the image of $V_{>0} = V \cap \mathbb{R}^d_{>0}$ under the Relog map.

Lemma 5.7 gives a simple criterion to prove that a point is minimal when F(z) is combinatorial.

Proposition 5.4 Fix a convergent Laurent expansion of rational F(z) with domain of convergence D, and pick any x ∈ D. A point w ∈ V is minimal if and only if there does not exist z ∈ V with |z| = t|w| + (1 − t)|x| and t ∈ (0, 1). If this series expansion is combinatorial then w ∈ V is a minimal point if and only if |w| ∈ V and the line segment from |x| to |w| in $\mathbb{R}^d_{>0}$ does not contain an element of V,

$$\{t|w| + (1-t)|x| : 0 < t < 1\} \cap V = \emptyset.$$

If the Laurent expansion under consideration is a power series expansion, one can take x = 0 and thus restrict attention to the line segment from the origin to |w|.

Proof Since x ∈ D its image Relog(x) lies in the (open) component B = Relog(D) of the amoeba complement. If w is minimal then Relog(w) ∈ ∂B and convexity of B implies no element of V can lie in the interior of a line segment from |x| to |w|. If w is not minimal, then Relog(|w|) lies outside of the closed convex set $\overline{B}$, so any continuous path in $\mathbb{R}^d$ from Relog(|w|) to the open set B must pass through ∂B. Since the path

$$\log\left(t|w| + (1-t)|x|\right), \qquad 0 \le t \le 1,$$

where the logarithm is taken coordinate-wise, goes from Relog(x) at t = 0 to Relog(w) at t = 1, there exist z ∈ V and t ∈ (0, 1) such that $\log(t|w| + (1-t)|x|) = \mathrm{Relog}(z)$. Exponentiating coordinate-wise then gives |z| = t|w| + (1 − t)|x|. When F(z) is combinatorial, Lemma 5.7 shows that it is sufficient to consider only the points in $V \cap \mathbb{R}^d_{>0}$ to determine the minimality of w.

Although Lemma 5.7 can be seen as a multivariate generalization of the Vivanti-Pringsheim Theorem, its conditions are more restrictive as it requires all coefficients of the series expansion to be non-negative, not just those in the direction r which may be of (combinatorial) interest. As mentioned in Chapter 2, it is still unknown even in the univariate case how to decide when a rational function is combinatorial. In practice, then, one usually applies these results when F(z) is the multivariate generating function of a combinatorial class with parameters, or when the form of F(z) makes combinatoriality easy to prove (for instance, when considering the power series expansion of $F(z) = \frac{G(z)}{1 - J(z)}$ with J(z) a polynomial vanishing at the origin having non-negative coefficients).
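For a combinatorial power series expansion, the criterion of Proposition 5.4 with x = 0 reduces to a univariate root search: a point w ∈ V with positive coordinates is minimal exactly when p(t) = H(tw) has no root with 0 < t < 1. The sketch below illustrates this under stated assumptions: H is a polynomial stored as a hypothetical dictionary from exponent tuples to coefficients, roots are found numerically with `numpy.roots`, and the tolerance `tol` is an arbitrary choice rather than a certified bound.

```python
import numpy as np

def ray_polynomial(H, w):
    """Coefficients (highest degree first) of p(t) = H(t*w_1, ..., t*w_d)."""
    degree = max(sum(e) for e in H)
    coeffs = np.zeros(degree + 1)
    for e, c in H.items():
        # the monomial c * z^e contributes c * w^e * t^{|e|}
        coeffs[degree - sum(e)] += c * np.prod(np.power(w, e))
    return coeffs

def has_root_in_open_unit_interval(H, w, tol=1e-9):
    """True if p(t) = H(t*w) has a real root with 0 < t < 1 (numerically)."""
    roots = np.roots(ray_polynomial(H, w))
    return any(abs(r.imag) < tol and tol < r.real < 1 - tol for r in roots)

# H = 1 - x - y - xy: the point (sqrt(2)-1, sqrt(2)-1) lies on V.
H1 = {(0, 0): 1, (1, 0): -1, (0, 1): -1, (1, 1): -1}
w1 = (np.sqrt(2) - 1, np.sqrt(2) - 1)

# H = (1 - x - y)(1 - 3x): the point (1/2, 1/2) lies on V, but the
# segment from the origin already meets the factor 1 - 3x at t = 2/3.
H2 = {(0, 0): 1, (1, 0): -4, (0, 1): -1, (2, 0): 3, (1, 1): 3}
w2 = (0.5, 0.5)

if __name__ == "__main__":
    print(has_root_in_open_unit_interval(H1, w1))  # no obstruction on the segment
    print(has_root_in_open_unit_interval(H2, w2))  # obstruction found
```

Floating-point root finding is only a heuristic here; a rigorous check would isolate real roots symbolically, for example with Sturm sequences.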


Example 5.6 (Lattice Path Asymptotics) In Chapter 4 we saw that the main diagonal of the power series expansion of

$$F(x,y,t) = \frac{(1+x)(1+y)}{1 - txy\,(x + y + 1/x + 1/y)}$$

is the generating function for the number of lattice walks starting at the origin, taking the steps (±1, 0), (0, ±1), and staying in the first quadrant. The denominator is square-free, and the singular variety V is smooth as the partial derivative of the denominator with respect to t does not vanish at any points of V. Furthermore, F is combinatorial. The critical point equations imply that there are two critical points,

$$\sigma = (1, 1, 1/4) \qquad\text{and}\qquad \rho = (-1, -1, -1/4).$$

Proposition 5.4 implies σ and ρ are minimal critical points: if (x, y, t) ∈ V with 0 < x, y ≤ 1 and one of the upper inequalities is strict then

$$|t| = \frac{1}{x^2y + xy^2 + y + x} > \frac{1}{4}.$$

Furthermore, if |x| = 1 and |y| = 1 for x, y ∈ C then the triangle inequality implies $|x^2y + xy^2 + x + y| = 4$ only if $xy^2$, $x$, $x^2y$, and $y$ have the same argument. Thus, $x^2$ and $y^2$ must be real with modulus 1 and a quick check shows the only other point of V satisfying this condition is ρ. We have shown these critical points are finitely minimal. Since the numerator G(x, y) = (1 + x)(1 + y) vanishes (to order 2) when (x, y, t) = ρ, Corollary 5.2 implies only σ contributes to the dominant asymptotics of the diagonal sequence. The contributions from each minimal critical point begin

$$\Phi_\sigma = 4^n\left(\frac{4}{\pi n} - \frac{6}{\pi n^2} + \frac{19}{2\pi n^3} - \frac{63}{4\pi n^4} + O\!\left(\frac{1}{n^5}\right)\right),$$

$$\Phi_\rho = (-4)^n\left(\frac{1}{\pi n^3} - \frac{9}{2\pi n^4} + O\!\left(\frac{1}{n^5}\right)\right).$$

Note that the presence of two minimal critical points leads to periodicity in the higher order asymptotic terms,

$$f_{n,n,n} = \frac{4^n}{\pi n}\left(4 - \frac{6}{n} + \frac{19 + 2(-1)^n}{2n^2} - \frac{63 + 18(-1)^n}{4n^3} + O\!\left(\frac{1}{n^4}\right)\right).$$

More examples from lattice path enumeration, and additional details on this example, are discussed in Chapter 6.
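The periodic expansion for $f_{n,n,n}$ can be compared against exact counts. The sketch below is an illustration only: it counts quadrant walks by direct dynamic programming (independently of the generating function) and evaluates the four displayed terms of the expansion; the function names are not from the text.

```python
from math import pi

def quadrant_walks(n):
    """Number of n-step walks from the origin with steps (+-1,0), (0,+-1)
    that stay in the first quadrant, by dynamic programming."""
    counts = {(0, 0): 1}
    for _ in range(n):
        new = {}
        for (i, j), c in counts.items():
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                p = (i + di, j + dj)
                if p[0] >= 0 and p[1] >= 0:
                    new[p] = new.get(p, 0) + c
        counts = new
    return sum(counts.values())

def walk_expansion(n):
    """Truncated asymptotic expansion of f_{n,n,n}, including the (-1)^n terms."""
    sign = (-1) ** n
    series = (4 - 6 / n + (19 + 2 * sign) / (2 * n**2)
              - (63 + 18 * sign) / (4 * n**3))
    return 4**n / (pi * n) * series

if __name__ == "__main__":
    for n in (10, 11, 60, 61):
        exact = quadrant_walks(n)
        print(n, exact, exact / walk_expansion(n))
```

Running the comparison for consecutive even and odd n makes the parity dependence of the lower-order terms visible, which is the effect of the second minimal critical point.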


The algebraic structure of the series terms with non-zero coefficients can also help determine when a minimal point is strictly minimal.

Definition 5.9 (aperiodic series) A power series $P(z) = \sum_{n \in \mathbb{N}^d} p_n z^n$ is called aperiodic if every element of $\mathbb{Z}^d$ can be written as an integer linear combination of the exponents $\{n \in \mathbb{N}^d : p_n \neq 0\}$ appearing in P.

Proposition 5.5 Suppose F(z) = G(z)/H(z) is a ratio of functions G and H which are analytic on $\mathbb{C}^d$. If H(z) = 1 − P(z) for some aperiodic power series P with non-negative coefficients then every minimal point of the power series expansion of F(z) is strictly minimal and has positive real coordinates.

Proposition 5.5 is often applied when F(z) is a rational function (so P is a polynomial with non-negative coefficients).

Proof Suppose w is a minimal point and for each 1 ≤ j ≤ d write $w_j = x_j e^{i\theta_j}$ with $x_j > 0$ and $\theta_j \in \mathbb{R}$. Let $p_n$ denote the coefficient of $z^n$ in P(z). Then

$$1 = |P(w)| = \left|\sum_{n \in \mathbb{N}^d} p_n x^n e^{i(n\cdot\theta)}\right| \le \sum_{n \in \mathbb{N}^d} p_n x^n,$$

where equality holds throughout only if P(z) is a monomial or n · θ = 0 for all $n \in \mathbb{N}^d$ such that $p_n \neq 0$. Thus, either θ = 0 or P(z) is not aperiodic.

Example 5.7 (Lonesum Matrices) A lonesum matrix is a matrix with entries in {0, 1} that is uniquely determined by its row and column sums; the enumeration of lonesum matrices finds application in biology [27] and algebraic statistics [19]. If $B_{n,k}$ denotes the number of n × k lonesum matrices then Brewbaker [7] proved the generating function expression

$$F(x,y) = \sum_{n,k \ge 0} B_{n,k} \frac{x^n y^k}{n!\,k!} = \frac{e^{x+y}}{e^x + e^y - e^{x+y}}.$$

Since the denominator under consideration has the form H(x, y) = 1 − P(x, y) where

$$P(x,y) = 1 - e^x - e^y + e^{x+y} = \sum_{k \ge 1} \sum_{j=1}^{k-1} \frac{x^j y^{k-j}}{j!\,(k-j)!}$$

is aperiodic, Proposition 5.5 implies all minimal points of the power series expansion of F(x, y) are positive and strictly minimal. Furthermore, if H(x, y) = 0 then $y = x - \log(e^x - 1)$ is a decreasing function of x > 0 so every point with positive coordinates is minimal (decreasing x > 0 and staying in the singular variety V means increasing y). Manipulating the smooth critical point equations (5.16) in the direction r = (r, s) shows that (x, y) is a critical point in the direction r if and only if


$$\frac{r}{s} = \frac{x}{(1 - e^x)\log(1 - e^{-x})} \qquad\text{and}\qquad \frac{s}{r} = \frac{y}{(1 - e^y)\log(1 - e^{-y})}.$$

Since the function $f(t) = \frac{t}{(1 - e^t)\log(1 - e^{-t})}$ has positive derivative for t > 0, goes to 0 as t → 0, and goes to infinity as t → ∞, f is a bijection from the positive real line to itself. Thus, in any direction r = (r, s) with positive coordinates there is a unique critical point (a, b) where $a = a(r,s) = f^{-1}(r/s)$ and $b = b(r,s) = f^{-1}(s/r)$. Because F(x, y) is bivariate the Hessian H in (5.25) is a constant, and is non-zero at any point with positive coordinates. Finally, since we consider a power series expansion in directions with positive coordinates the minimal critical point (a, b) is always contributing. Corollary 5.3 thus implies

$$B_{rn,sn} = (rn)!\,(sn)!\; \frac{a(r,s)^{-rn}\, b(r,s)^{-sn}}{\sqrt{n}\,\sqrt{2\pi s a e^{-a}\left(b e^{-b} + a e^{-a} - ab\right)}}\left(1 + O\!\left(n^{-1}\right)\right)$$

for all r, s > 0. Additional details on this example can be found in Khera et al. [26]. The approach presented here is partially modeled on work of Andrade et al. [12], who studied a related generating function arising in permutation statistics. Problem 5.3 asks you to find asymptotics for this related generating function.
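The estimate for $B_{rn,sn}$ can be tested numerically. The following sketch rests on two assumptions not stated in the text: the exact counts are computed from the expansion $F = \sum_m e^{x+y}(e^x-1)^m(e^y-1)^m$ of the generating function above, which gives $B_{n,k} = \sum_m (m!)^2 S(n+1,m+1) S(k+1,m+1)$ in terms of Stirling numbers of the second kind, and $f^{-1}$ is computed by bisection on a bracket that only covers moderate ratios r/s.

```python
from math import exp, expm1, log, log1p, lgamma, pi, factorial

def stirling2_table(N):
    """S[n][m] = Stirling numbers of the second kind for 0 <= m <= n <= N."""
    S = [[0] * (N + 1) for _ in range(N + 1)]
    S[0][0] = 1
    for n in range(1, N + 1):
        for m in range(1, n + 1):
            S[n][m] = m * S[n - 1][m] + S[n - 1][m - 1]
    return S

def lonesum(n, k):
    """Exact number of n x k lonesum matrices (assumed Stirling-number formula)."""
    S = stirling2_table(max(n, k) + 1)
    return sum(factorial(m) ** 2 * S[n + 1][m + 1] * S[k + 1][m + 1]
               for m in range(min(n, k) + 1))

def f(t):
    """f(t) = t / ((1 - e^t) log(1 - e^{-t})), written to avoid cancellation."""
    return t / (-expm1(t) * log1p(-exp(-t)))

def f_inverse(value, lo=1e-9, hi=50.0):
    """Invert the increasing function f by bisection (bracket is an assumption)."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < value:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def log_lonesum_asymptotic(r, s, n):
    """log of the leading asymptotic term for B_{rn,sn}."""
    a, b = f_inverse(r / s), f_inverse(s / r)
    const = 2 * pi * s * a * exp(-a) * (b * exp(-b) + a * exp(-a) - a * b)
    return (lgamma(r * n + 1) + lgamma(s * n + 1)
            - r * n * log(a) - s * n * log(b) - 0.5 * log(n) - 0.5 * log(const))

if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(n, exp(log(lonesum(n, n)) - log_lonesum_asymptotic(1, 1, n)))
```

In the symmetric direction r = s the critical point is a = b = log 2, which gives a quick sanity check on the bisection.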

5.3.1 Existence of Minimal Critical Points

Because minimal contributing points are crucial to our asymptotic results, we now study when they do (or do not) exist. First, they may not exist for a somewhat trivial reason: because a minimal contributing point is a minimizer of the height function $h_r$ on $\overline{D}$, there will be no minimal contributing points when $h_r$ decreases without bound. This can only happen in degenerate cases.

Proposition 5.6 Consider the convergent Laurent expansion of rational function F(z) = G(z)/H(z) on a domain $D \subset \mathbb{C}^d$. If the height function $h_r$ decreases without bound on D then the coefficient sequence $(f_{nr})$ is eventually zero.

Proof If $h_r$ is unbounded from below then the inequality (5.15) implies $|f_{nr}|$ goes to zero faster than any exponential function. Since the asymptotic behaviour of a diagonal sequence is constrained by Corollary 3.2 in Chapter 3, this implies $f_{nr}$ is eventually zero. Alternatively, the conclusion follows from (5.14) and the fact that $C_w$ grows at most polynomially.

When studying a diagonal sequence which is not eventually zero, Proposition 5.6 implies the height function is bounded from below on D. The following result uses the algebraic properties of polynomial amoebas discussed in Section 3.3.1 of Chapter 3 to help characterize when a diagonal sequence is eventually zero.

Proposition 5.7 Let F(z) = G(z)/H(z) be a ratio of coprime polynomials, let N(H) be the Newton polytope of H, and let B = Relog(D) be a connected component of $\mathrm{amoeba}(H)^c$ corresponding to the integer point v ∈ N(H) under the mapping described by Proposition 3.12 in Chapter 3. If $v + tr \notin N(H)$ for all t > 0 then the height function $h_r(z)$ is unbounded from below on D, and $(f_{nr})$ is eventually zero.

Proof Suppose $h_r$ is bounded from below on D, so that $\tilde{h}_r(x) = -r \cdot x$ is bounded from below on B. Suppose also that $y \in \mathbb{R}^d$ lies in the recession cone R of B, meaning that y + b ∈ B for all b ∈ B. Then by induction ny + b ∈ B for all b ∈ B and n ∈ N. If −r · y < 0 then

$$-r \cdot (ny + b) = n(-r \cdot y) - r \cdot b \to -\infty$$

as n → ∞, contradicting the fact that $\tilde{h}_r$ is bounded from below on B. Thus, it must be the case that −r · y ≥ 0 for all y ∈ R. Proposition 3.12 in Chapter 3 characterizes the recession cone as

$$R = \{x \in \mathbb{R}^d : x \cdot n \le 0 \text{ for all } n \in N(H) - v\},$$

where N(H) − v is the set {z − v : z ∈ N(H)}. Because N(H) is a closed convex set, if $tr \notin N(H) - v$ for all t > 0 then there exists $y \in \mathbb{R}^d$ such that r · y > 0 and n · y < 0 for all n ∈ N(H) − v. But this implies y ∈ R and −r · y < 0, a contradiction. Thus, when $h_r$ is bounded from below there exists t > 0 such that tr ∈ N(H) − v.

Unfortunately, even if $h_r$ achieves its minimum on $\overline{D}$ this minimum need not be a critical point. In fact, when the series under consideration is not combinatorial it is possible to construct examples with no critical points although the minimum of $h_r$ on $\overline{D}$ is achieved and the singular variety V is smooth.

Example 5.8 (A Diagonal with No Critical Points) Consider the bivariate function

$$F(x,y) = G(x,y)/H(x,y) = \frac{1}{2 + y - x(1+y)^2}$$

and let D be the power series domain of convergence of F. Basic algebra shows that both polynomial systems

$$H = H_x = H_y = 0 \qquad\text{and}\qquad H = xH_x - yH_y = 0$$

have no solutions, so the singular variety V is smooth but there are no critical points in the main diagonal direction r = (1, 1). Since the dual cone of the Newton polygon N(H) at the origin is the quadrant $R = \{(x,y) \in \mathbb{R}^2 : x, y \le 0\}$, Proposition 3.12 in Chapter 3 implies the recession cone of B = Relog(D) is R and $\overline{B}$ is a closed convex set contained in some translation of the third quadrant of the plane (see Figure 5.5). This implies the linear function $\tilde{h}(p,q) = -p - q$ achieves its minimum for $(p,q) \in \overline{B}$, so the height function h(x, y) achieves its minimum on $\overline{D}$. The set of minimal points can be parametrized as

$$V \cap \partial D = \left\{\left(\frac{2+y}{(1+y)^2},\, y\right) : y \in \left[-2, -\sqrt{3}\right] \cup \left[0, \sqrt{3}\right]\right\},$$

with corresponding logarithmic gradient

$$(\nabla_{\log} H)\left(\frac{2+y}{(1+y)^2},\, y\right) = -\left(2 + y,\; 2 + y - \frac{2}{1+y}\right),$$

which is indeed never colinear with (1, 1). Since the logarithmic gradient approaches a multiple of (1, 1) as y → ∞, one might say there is a (non-minimal) critical point at infinity in this example; we return to this notion in Section 9.3 of Chapter 9.

The problem is that the minimizers $(1/2, \pm\sqrt{3})$ of h(x, y) on $\overline{D}$ come from two different connected components of $V \cap \mathbb{R}^2$ which intersect only after taking coordinate-wise moduli, as shown in Figure 5.5. More generally, one can imagine V wrapping over itself in complex space so that V is smooth but the boundary of amoeba(H) is not, in such a way that r is the normal to a supporting hyperplane at B only for 'unnatural' reasons due to a non-smooth point (such an occurrence has been called a ghost intersection in the literature).

Fig. 5.5 Left: A subset of the real points in the smooth singular variety $V = V(2 + y - x(1+y)^2)$. Center: Taking coordinate-wise moduli creates an intersection, which causes h(x, y) to achieve its minimum on ∂D. Right: The curves in the contour of H corresponding to the real connected components of V shown on the left, together with amoeba(H) and the component B = Relog(D) of $\mathrm{amoeba}(H)^c$. Although the hyperplane with normal (1, 1) is a supporting hyperplane to B at the intersection, neither branch of the contour has a tangent line perpendicular to (1, 1).
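The "basic algebra" in Example 5.8 can be delegated to a computer algebra system. The sketch below uses SymPy (a choice made here, not one the text prescribes) to confirm that both polynomial systems generate the unit ideal, and that the two coordinates of the logarithmic gradient along the parametrized points always differ by $-2/(1+y)$, so the gradient is never a multiple of (1, 1).

```python
import sympy as sp

x, y = sp.symbols("x y")
H = 2 + y - x * (1 + y) ** 2
Hx, Hy = sp.diff(H, x), sp.diff(H, y)

# Non-smooth points of V would solve H = Hx = Hy = 0.
singular_basis = sp.groebner([H, Hx, Hy], x, y, order="lex")

# Critical points in direction (1, 1) would solve H = x*Hx - y*Hy = 0.
critical_basis = sp.groebner([H, x * Hx - y * Hy], x, y, order="lex")

# Logarithmic gradient (x*Hx, y*Hy) along the curve x = (2 + y)/(1 + y)^2.
on_curve = {x: (2 + y) / (1 + y) ** 2}
log_grad = [sp.simplify((v * sp.diff(H, v)).subs(on_curve)) for v in (x, y)]
gap = sp.simplify(log_grad[0] - log_grad[1])

if __name__ == "__main__":
    print(singular_basis.exprs, critical_basis.exprs)
    print(log_grad, gap)
```

A reduced Gröbner basis equal to [1] certifies that a system has no complex solutions at all, which is stronger than checking real or positive points only.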

This difficulty arises when the behaviour of Relog(V) in $\mathbb{R}^d$ does not reflect the behaviour of V in $\mathbb{C}^d$. Luckily, this cannot happen in the combinatorial case, where the points of V with positive real coordinates map onto the points of ∂B by taking coordinate-wise logarithms.

Proposition 5.8 Consider a combinatorial Laurent expansion of F(z) and let x be a smooth minimal point in V with positive real coordinates. Then every point in some neighbourhood of x in $V \cap \mathbb{R}^d$ is a minimal point.

Proof We assume that H is square-free, otherwise replace it with its square-free part in this argument. Proposition 3.13 in Chapter 3 implies that the minimal point x is critical in some non-zero direction $r \in \mathbb{R}^d$ and, perhaps by replacing r with its negation, we may assume that r points away from B at log(x). Consider the polynomial system

$$H(z) = H(t^{r_1} z_1, \dots, t^{r_d} z_d) = 0$$

and assume (without loss of generality) that $H_{z_d}(x) \neq 0$. At the solution (z, t) = (x, 1) the Jacobian of this system with respect to $z_d$ and t is

$$\begin{pmatrix} H_{z_d}(x) & 0 \\ H_{z_d}(x) & r \cdot (\nabla_{\log} H)(x) \end{pmatrix},$$

which has non-zero determinant as $(\nabla_{\log} H)(x)$ is a non-zero multiple of r. Thus, there exist analytic functions $g(\hat{z})$ and $T(\hat{z})$ parameterizing $z_d$ and t near (x, 1). If no neighbourhood of x in $V \cap \mathbb{R}^d$ contains only minimal points then, since x is minimal, there exists a sequence of non-minimal points $x^{(n)} \in V \cap \mathbb{R}^d$ converging to x such that

$$H\left(t_n^{r_1} x_1^{(n)}, \dots, t_n^{r_d} x_d^{(n)}\right) = 0$$

is satisfied with $t_n = 1$ (since $x^{(n)} \in V$) and some $0 < t_n < 1$ (since $x^{(n)}$ is not minimal). But this contradicts the fact that, for n sufficiently large, $t_n \to 1$ is uniquely determined by $T(x_1^{(n)}, \dots, x_{d-1}^{(n)})$.

Corollary 5.4 Consider a combinatorial Laurent expansion of F(z). If a smooth point $x \in \mathbb{R}^d_{>0}$ is a local minimizer (or local maximizer) of $h_r$ on ∂D then it is a critical point in the direction r.

Proof If x is a local minimizer of $h_r$ on ∂D then by Proposition 5.8 it is a local minimizer on $V \cap \mathbb{R}^d$. Thus, the gradient of $z \mapsto r \cdot \log(z)$ must be normal to V, giving the critical point equations. If x is a local maximizer of $h_r$ then it is a local minimizer of $-h_r$, and the same argument holds.

Corollary 5.5 Consider a combinatorial Laurent expansion of rational F(z), and let x be a smooth minimal point in $V \cap \mathbb{R}^d_{>0}$. If a smooth point w ∈ V has the same coordinate-wise modulus as x, then x is a critical point in some direction r if and only if w is a critical point in the direction r.

Proof Since x and w are smooth, minimal, and have no zero coordinates, Proposition 3.13 in Chapter 3 implies the existence of $r, s \in \mathbb{R}^d$ such that x and w are critical in the directions r and s. In other words, there exist $\lambda, \tau \in \mathbb{C}_*$ such that

$$(\nabla_{\log} H)(x) = \lambda r \qquad\text{and}\qquad (\nabla_{\log} H)(w) = \tau s.$$

Since r and s are uniquely determined (up to non-zero multiple) by x and w, it is sufficient to show that x is also critical in the direction s. Proposition 3.13 implies that s is the normal to a support hyperplane of B = Relog(D) at Relog(w) = log(x). This implies x is a minimizer (or maximizer) of $h_s$ on $\overline{D}$, and Corollary 5.4 then implies x is critical in the direction s, as desired.

Example 5.9 (Asymptotics of Apéry Numbers) In Section 3.4 we saw the sequence $(a_n)$ of Apéry numbers, whose generating function can be expressed as the main diagonal of

$$F(x,y,z,t) = \frac{1}{1 - t(1+x)(1+y)(1+z)(1 + y + z + yz + xyz)}.$$

This rational function is combinatorial, and solving the critical point equations gives two smooth critical points, one of which,

$$\sigma = \left(1 + \sqrt{2},\; \frac{\sqrt{2}}{2},\; \frac{\sqrt{2}}{2},\; 58\sqrt{2} - 82\right),$$

has positive coordinates. Proposition 5.4 implies that σ is a minimal critical point, since at any singularity

$$t = \frac{1}{(1+x)(1+y)(1+z)(1 + y + z + yz + xyz)},$$

and when x, y, and z are positive and real then decreasing any of their values causes this expression to increase. As V is smooth and F is combinatorial and admits only one critical point with positive coordinates, Corollary 5.5 immediately implies that σ is strictly minimal. Thus, Theorem 5.2 implies

$$a_n = \frac{\left(17 + 12\sqrt{2}\right)^n}{n^{3/2}} \cdot \frac{\sqrt{48 + 34\sqrt{2}}}{8\pi^{3/2}} \left(1 + O\!\left(\frac{1}{n}\right)\right).$$

Recall that determining such an asymptotic estimate is a key step in Apéry's proof of the irrationality of ζ(3). The generating function of the second sequence of Apéry numbers $(c_n)$ can be written as the main diagonal of

$$F(x,y,z) = \frac{1}{1 - z(1+x)(1+y)(1 + y + xy)}.$$

Again F is combinatorial, and an analogous argument shows

$$c_n = \frac{\left(\frac{11}{2} + \frac{5\sqrt{5}}{2}\right)^n}{n} \cdot \frac{\sqrt{250 + 110\sqrt{5}}}{20\pi} \left(1 + O\!\left(\frac{1}{n}\right)\right).$$

We show how these examples can be completely automated in Chapter 7.
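Both estimates are easy to compare with exact values. The sketch below assumes the standard binomial-sum formulas $a_n = \sum_k \binom{n}{k}^2\binom{n+k}{k}^2$ and $c_n = \sum_k \binom{n}{k}^2\binom{n+k}{k}$ for the two Apéry sequences (these formulas are not restated in this section), and works with logarithms to avoid overflow.

```python
from math import comb, exp, log, pi, sqrt

def apery_a(n):
    """Apery numbers for zeta(3) (assumed binomial-sum definition)."""
    return sum(comb(n, k) ** 2 * comb(n + k, k) ** 2 for k in range(n + 1))

def apery_c(n):
    """Apery numbers for zeta(2) (assumed binomial-sum definition)."""
    return sum(comb(n, k) ** 2 * comb(n + k, k) for k in range(n + 1))

def ratio_a(n):
    """a_n divided by its leading asymptotic term."""
    log_approx = (n * log(17 + 12 * sqrt(2)) - 1.5 * log(n)
                  + 0.5 * log(48 + 34 * sqrt(2)) - log(8) - 1.5 * log(pi))
    return exp(log(apery_a(n)) - log_approx)

def ratio_c(n):
    """c_n divided by its leading asymptotic term."""
    log_approx = (n * log((11 + 5 * sqrt(5)) / 2) - log(n)
                  + 0.5 * log(250 + 110 * sqrt(5)) - log(20 * pi))
    return exp(log(apery_c(n)) - log_approx)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, ratio_a(n), ratio_c(n))
```

Both ratios should approach 1 at rate O(1/n), in line with the error terms displayed above.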

Example 5.10 (Asymptotics of Bar Graphs) Recall from Section 3.2.2 of Chapter 3 the function

$$F(x,y,z) = \frac{z(1 - xyz - 2xz - yz - x)}{1 - xyz - xy - xz - yz - x},$$

whose power series coefficient $b_{n,k} = [x^n y^k z^k]F(x,y,z)$ counts bar graphs with 2n horizontal external edges and 2k vertical external edges. In a direction r = (a, 1, 1) the smooth critical point equations have two solutions, the point

$$z_* = (x_*, y_*, z_*) = \left(\frac{\sqrt{a^2+4} - 2}{a},\; \frac{\sqrt{a^2+4} - a}{2},\; \frac{\sqrt{a^2+4} - a}{2}\right),$$

which has positive coordinates, and another point whose coordinate-wise modulus is larger than that of $z_*$. Although the power series expansion of F contains negative coefficients, if H(x, y, z) is the denominator of F then the power series expansion of 1/H is combinatorial and has the same singular set as F. Thus, $z_*$ is minimal if the univariate polynomial $p(t) = H(t x_*, t y_*, t z_*)$ has no root t ∈ (0, 1). A quick analysis of p(t), which factors as 1 − t times a quadratic polynomial in t, shows that for any a > 0 the only root of p(t) with t > 0 is t = 1. This proves $z_*$ is minimal and, since 1/H is combinatorial and no other critical point has the same coordinate-wise modulus as $z_*$, it is strictly minimal. Theorem 5.2 then implies that for all a > 0

$$b_{an,n} \sim \left(\frac{\sqrt{a^2+4} - a}{2}\right)^{-2n} \left(\frac{\sqrt{a^2+4} - 2}{a}\right)^{-an} \frac{C_a}{4a^2(a^2+4)\pi n^2},$$

where

$$C_a = \left((a^2 - 2a + 4)\sqrt{a^2+4} - (a-2)(a^2+4)\right)\sqrt{a^2 + 4 - 2\sqrt{a^2+4}}.$$

See the computer algebra worksheet corresponding to this example for more details.

Remark 5.16 The conclusion of Corollary 5.5 may fail in the non-combinatorial case. For instance, consider the main diagonal of F(x, y) = 1/((1 + 2x)(1 − x − y)). Although any point (−1/2, y) with |y| = 1/2 is a singularity with the same coordinate-wise modulus as the minimal smooth critical point (1/2, 1/2), none of these points are themselves critical in the main diagonal direction.

Corollary 5.6 Consider a combinatorial Laurent expansion of F(z). If w is a smooth minimal critical point in the direction r then |w| is a smooth minimal critical point in the direction r. Furthermore, if every point in ∂D ∩ V is smooth and F admits two distinct minimal contributing points $a, b \in \mathbb{R}^d_{>0}$ with positive coordinates in the direction r, then

$$c_t = \left(a_1^t b_1^{1-t}, \dots, a_d^t b_d^{1-t}\right)$$

is a minimal contributing point in the direction r for all 0 ≤ t ≤ 1. In particular, F admits an infinite number of critical points in the direction r, none of which are nondegenerate.

Proof The first statement follows from Lemma 5.7 and Corollary 5.5. To prove the second we note that, by logarithmic convexity of D, the point $c_t$ lies in $\overline{D} \cap \mathbb{R}^d_{>0}$ for all 0 ≤ t ≤ 1. Proposition 5.1 implies both a and b are minimizers of $h_r$ on $\overline{D}$, so $h_r(a) = h_r(b)$. Thus, for any 0 ≤ t ≤ 1,

$$h_r(c_t) = -\sum_{j=1}^d r_j \log\left(a_j^t b_j^{1-t}\right) = -t\sum_{j=1}^d r_j \log a_j - (1-t)\sum_{j=1}^d r_j \log b_j = t\,h_r(a) + (1-t)\,h_r(b) = h_r(a)$$

is also a minimizer of $h_r$ on $\overline{D}$. This implies $c_t$ is a minimal point for all 0 ≤ t ≤ 1, since minimizers of $h_r$ occur only on ∂D, and Corollary 5.4 implies each $c_t$ is a critical point. Since these critical points are not isolated, Lemma 5.6 implies they cannot be nondegenerate.

5.3.2 Dealing with Minimal Points that are not Critical

Determining minimality of critical points is typically the hardest task in a multivariate singularity analysis. Even when a critical point w ∈ V is known to be minimal, to apply Theorem 5.3 or Corollary 5.2 one must determine if T(w) ∩ V is finite and, if so, determine its elements. In the combinatorial case, Corollary 5.5 shows that every element of T(w) ∩ V is critical, greatly simplifying matters from a computational viewpoint as critical points are specified by a set of algebraic equations which usually have a finite number of solutions. Unfortunately, as Remark 7.3 illustrates, this property does not hold in general for non-combinatorial series expansions.

Thankfully, even if T(w) ∩ V contains non-critical points it turns out only the critical points matter for asymptotic calculations. The requirement of finite minimality arose above because we made essentially univariate deformations of the Cauchy domain of integration, increasing the modulus of each coordinate independently. Using a genuinely multivariate deformation of the Cauchy domain of integration, it is possible to deform around non-critical points in T(w). In this sense, only critical points present true obstructions for deforming the Cauchy domain of integration to points of lower height.

Theorem 5.4 Consider the convergent Laurent expansion of a rational function F(z) on a domain D. Suppose there exists w ∈ V ∩ ∂D minimizing $h_r$ on $\overline{D}$, and that for some ε > 0 all points of the set $\{z \in V : h_r(z) \ge h_r(w) - \varepsilon\}$ are smooth points of V. Suppose further that the set W of critical points in V ∩ T(w) consists of a finite number of nondegenerate smooth contributing points. Then for every positive integer M > 0 an asymptotic expansion of $f_{nr}$ is obtained by summing the right-hand side of (5.28) in Theorem 5.3 at each w ∈ W.
The multivariate deformations required to prove Theorem 5.4 require some technical overhead, and most of our applications involve combinatorial rational functions, so we simply sketch two approaches. The first, contained in Baryshnikov and Pemantle [3, Sect. 5], uses Proposition 3.13 in Chapter 3 to construct a vector field describing how to deform the Cauchy domain of integration around non-critical points in T(w). Proposition 3.13 states that for any singularity ζ ∈ V ∩ T(w) there exist $\lambda_\zeta \in \mathbb{C}_*$ and non-zero $v_\zeta \in \mathbb{R}^d$ such that $(\nabla_{\log} H)(\zeta) = \lambda_\zeta v_\zeta$, and the point ζ is critical if and only if $v_\zeta$ is parallel to r. Since $\nabla_{\log} H$ is the normal of the tangent space to the set $\tilde{V} = \{x \in \mathbb{C}^d : H(e^{x_1}, \dots, e^{x_d}) = 0\}$, if ζ is not a critical point then the tangent space to $\tilde{V}$ at log(ζ) is a hyperplane whose normal is not parallel to r. Thus, near any non-critical point in V ∩ T(w) one can locally push the Cauchy domain of integration to points of lower height without crossing the singular set. Because the vector $v_\zeta$ varies smoothly with ζ, it is possible to make a single deformation of the domain of integration to obtain a contour whose points have height bounded below $h_r(w)$, except in arbitrarily small neighbourhoods of the critical points in T(w). The stated deformation is constructed² in [3, Cor. 5.5], after which asymptotics can be determined using the saddle-point methods discussed above [3, Sec. 6].

A second approach, using homological arguments and multivariate residues, is given by Pemantle and Wilson [32, Thm. 9.4.2]. Under the assumptions of Theorem 5.4, one can use advanced topological arguments to write the Cauchy integral as a sum of an (asymptotically negligible) Cauchy integral over a domain whose points have height bounded below $h_r(w)$ and a 'multivariate residue integral' over a domain of integration Γ lying in V (this is basically a more general, but less explicit, version of Corollary 5.1 above). Using a 'gradient flow' of the height function h, created essentially by solving a differential equation, it is possible to deform Γ so that all points not lying in arbitrarily small neighbourhoods of critical points have height bounded below $h_r(w)$. Pemantle and Wilson were motivated by the study of stratified Morse theory, and more recent work of Baryshnikov et al. [1] shows how the methods of stratified Morse theory characterize asymptotics under conditions which can be computationally verified. We return to this approach in Chapter 9.

Example 5.11 (Non-Finitely Minimal Critical Point) Consider the power series expansion of the bivariate rational function

$$F(x,y) = \frac{1}{(1+2x)(1-x-y)}$$

in the main diagonal direction r = (1, 1). The singular variety has one non-smooth point (x, y) = (−1/2, 3/2), which is not minimal as it has greater coordinate-wise modulus than σ = (1/2, 1/2). The point σ is still a minimal critical point, but it is no longer finitely minimal as

$$V \cap T(\sigma) = \{(1/2, 1/2)\} \cup \{(-1/2, e^{i\theta}/2) : \theta \in (-\pi, \pi)\}.$$

Because the non-critical points in V ∩ T(σ) have no bearing on the local behaviour of F(z) near σ, they do not play a role in asymptotics. Theorem 5.4 implies

$$f_{n,n} = \frac{4^n}{\sqrt{\pi n}}\left(\frac{1}{2} - \frac{1}{8n} + \frac{1}{256n^2} + \frac{5}{256n^3} - \frac{819}{65536n^4} + O\!\left(\frac{1}{n^5}\right)\right).$$
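The expansion in Example 5.11 can be checked against exact coefficients. The sketch below assumes the closed form $f_{n,n} = \sum_{j=0}^{n} (-2)^j \binom{2n-j}{n}$, obtained here by multiplying the geometric series for $1/(1+2x)$ with the binomial expansion of $1/(1-x-y)$; it is an illustration, not the worksheet accompanying the text.

```python
from math import comb, pi, sqrt

def diagonal_coefficient(n):
    """[x^n y^n] 1/((1 + 2x)(1 - x - y)) as an exact integer."""
    return sum((-2) ** j * comb(2 * n - j, n) for j in range(n + 1))

def diagonal_expansion(n):
    """The five displayed terms of the asymptotic expansion of f_{n,n}."""
    series = (1 / 2 - 1 / (8 * n) + 1 / (256 * n**2)
              + 5 / (256 * n**3) - 819 / (65536 * n**4))
    return 4**n / sqrt(pi * n) * series

if __name__ == "__main__":
    for n in (5, 10, 50, 100):
        exact = diagonal_coefficient(n)
        print(n, exact, exact / diagonal_expansion(n) - 1)
```

For small n the relative error is dominated by an exponentially smaller term of order $(4/3)^n$ coming from the non-critical singularities with x = −1/2, so agreement improves quickly once n is moderately large.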

2 Although the hypotheses of [3, Cor. 5.5] state that all minimizers of hr on D have the same coordinate-wise modulus, this is not needed under our assumptions. Thanks to Yuliy Baryshnikov for clarification on the constructions from this paper.


5.3.3 Perturbations of Direction and a Central Limit Theorem

Because our asymptotic theorems have error terms which remain uniformly bounded as the direction r undergoes small perturbations, the coefficients under consideration grow in a predictable manner near fixed directions. In fact, we prove that in many circumstances the series coefficients scale to a normal distribution when viewed in the right manner. This behaviour is well known, having been derived in some of the first major papers on the asymptotic behaviour of multivariate generating functions [4, 5].

To begin, we examine how the exponential growth in our asymptotic formula changes when the direction under consideration varies slightly. Because scaling the direction of interest does not meaningfully change our results (one can recover asymptotics in the original direction by scaling n in the resulting asymptotic formula) we scale our direction so its final coordinate equals one.

Lemma 5.8 Suppose rational F(z) = G(z)/H(z) admits a nondegenerate strictly minimal smooth contributing point $w \in \mathbb{C}_*^d$ in the direction $m \in \mathbb{R}^d$, scaled so that $m_d = 1$, and suppose $H_{z_d}(w) \neq 0$. As r varies in some sufficiently small neighbourhood of m let z(r) be the smooth critical point in the direction r near w, given by Lemma 5.6. If $r = (\hat{m} + \varepsilon, 1)$ where ε = ε(n) is a sequence of points in $\mathbb{R}^{d-1}$ with each coordinate in $o(n^{-1/3})$ then

$$z(r)^{-nr} \sim w^{-nm} \times \hat{w}^{-n\varepsilon} \exp\left[-n\,\frac{\varepsilon^T \mathcal{H}^{-1} \varepsilon}{2}\right],$$

where $\mathcal{H}$ is the non-singular matrix in Lemma 5.5 corresponding to the direction m.

Proof Since $H_{z_d}(w) \neq 0$, the implicit function theorem yields an analytic parameterization $z_d = g(\hat{z})$ of $z_d$ on V near w. If $p(\hat{x}) = \log(g(e^{x_1}, \dots, e^{x_{d-1}}))$ then $(\hat{x}, p(\hat{x}))$ parametrizes log(V) near the point a = log(w). Letting x(r) = log(z(r)) be the image of the minimal critical point in direction r after taking coordinate-wise logarithms, it follows that

$$z(r)^{-nr} = \exp\left[n\,\tilde{h}_r(\hat{x}(r))\right]$$

where $\tilde{h}_r$ is defined by $\tilde{h}_r(\hat{y}) = -\hat{r} \cdot \hat{y} - p(\hat{y})$ for $\hat{y}$ in a neighbourhood of $\hat{a}$. Note that

$$p(\log w_1 + i\theta_1, \dots, \log w_{d-1} + i\theta_{d-1}) = \log\left(g(w_1 e^{i\theta_1}, \dots, w_{d-1} e^{i\theta_{d-1}})\right)$$

equals the function φ in Lemma 5.5 up to a linear function in the $\theta_j$. If M denotes the Hessian matrix of p with respect to the variables $\hat{x}$ at $\hat{a}$ then, since $\mathcal{H}$ is the Hessian of φ at the origin, the chain rule implies $M = -\mathcal{H}$. Because w is a critical point in direction m, the smooth critical point equations imply the gradient of p with respect to $\hat{x}$ is $(\nabla p)(\hat{a}) = -\hat{m}$. Thus, for any $\hat{y}$ sufficiently close to $\hat{a}$ there is a convergent series expansion

$$\tilde{h}_r(\hat{y}) = \tilde{h}_r(\hat{a}) + (\hat{m} - \hat{r}) \cdot (\hat{y} - \hat{a}) - \frac{(\hat{y} - \hat{a})^T M (\hat{y} - \hat{a})}{2} + \cdots.$$

In fact, for any r sufficiently close to m the logarithmic critical point x(r) is defined as the unique point approaching a = log(w) such that the gradient of $\tilde{h}_r(\hat{x})$ with respect to $\hat{x}$ vanishes. Vanishing of the gradient is equivalent to the system $r_k = -p_{x_k}(\hat{x})$ for 1 ≤ k ≤ d − 1, so the Jacobian of $\hat{r}$ as a function of $\hat{x}$ is −M. This, in turn, implies the Jacobian of $\hat{x}$ as a function of $\hat{r}$ is $-M^{-1}$. In particular, when each coordinate of ε = ε(n) is $o(n^{-1/3})$ it follows that

$$\hat{x}(m + \varepsilon) = \hat{x}(m) - M^{-1}\varepsilon + o(n^{-2/3}) = \hat{a} - M^{-1}\varepsilon + o(n^{-2/3}).$$

Substituting $\hat{y} = \hat{x}(m + \varepsilon)$ and $\hat{r} = \hat{m} + \varepsilon$ into the series expansion of $\tilde{h}_r(\hat{y})$ gives

$$\tilde{h}_{\hat{m}+\varepsilon}(\hat{x}(m+\varepsilon)) = \tilde{h}_{\hat{m}+\varepsilon}(\hat{a}) + \varepsilon^T M^{-1} \varepsilon - \frac{\left(M^{-1}\varepsilon\right)^T M \left(M^{-1}\varepsilon\right)}{2} + o(n^{-1}) = -(m \cdot a) - (\varepsilon \cdot \hat{a}) + \frac{\varepsilon^T M^{-1} \varepsilon}{2} + o(n^{-1}),$$

and the claimed result follows since $M = -\mathcal{H}$ and a = log(w).


From the smoothly varying behaviour of the exponential growth we are able to deduce how dominant asymptotics transition around a fixed direction. Proposition 5.9 (Smooth Variation of Coefficients) Consider a Laurent expansion of a rational function F(z) = G(z)/H(z), and let m ∈ Rd be a direction with md = 1. Suppose that for all directions r = (ˆr, 1) in some neighbourhood of m there is a smoothly varying nondegenerate strictly minimal critical point w(r) ∈ C∗d such that Hz d (w(r)) and G(w(r)) are non-zero. If sˆ = sˆ(n) is a sequence of vectors in Zd−1 such that each coordinate of |ˆs − n m | is o(n2/3 ) then

fsˆ, n ∼ w−nm n(1−d)/2

 

m) (ˆs − n m)T H −1 (ˆs − n −G(w)(2π)(1−d)/2 ) −(ˆs−nm w exp − , √ 2n wd Hz d (w) det H

(5.29)

where H is the non-singular matrix in Lemma 5.5 corresponding to w. Remark 5.17 If the coordinates of sˆ −n m are uniformly bounded for all n (for instance, if sˆ is the closest vector to nm with integer coordinates) then Proposition 5.9 implies the dominant asymptotic behaviour of fsˆ,n is the behaviour predicted on the righthand side of (5.27) by the vector r = (m, 1) with potentially non-integer (even m) −(ˆs−n . irrational) coordinates, up to the bounded multiplicative factor w Proof Theorem 5.2 implies the existence of a constant B > 0 such that (1−d)/2 (2πn) Cw(r) ≤ w(r)−nr n(1−d)/2−1 B, fnr − w(r)−nr  det Hw(r)

232

5 The Theory of ACSV for Smooth Points

Fig. 5.6 Left: The amoeba of H(x, t) = 1 − t(x + 1 + 2/x) with the component B and vector m = (−1/4, 1) from Proposition 5.10. Right: The values of 4−n f−n/4+ε, n with ε ∈ {−20, . . . , 20} compared to the limiting curve νn (−n/4 + ε, n) for n = 80.

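where $C_{\mathbf w(\mathbf r)}$ and $\mathcal H_{\mathbf w(\mathbf r)}$ approach $-G(\mathbf w)/\bigl(w_d H_{z_d}(\mathbf w)\bigr)$ and $\mathcal H$, respectively, as $\mathbf r \to \mathbf m$. If each coordinate of $|\hat{\mathbf s} - n\hat{\mathbf m}|$ is $o(n^{2/3})$ then $\hat{\mathbf r} = \hat{\mathbf s}/n = \hat{\mathbf m} + \boldsymbol\varepsilon(n)$ where each coordinate of $\boldsymbol\varepsilon(n)$ is $o(n^{-1/3})$. Thus, $\hat{\mathbf r} \to \hat{\mathbf m}$ as $n \to \infty$ and the result follows from Lemma 5.8.

Although the term $\hat{\mathbf w}^{-(\hat{\mathbf s} - n\hat{\mathbf m})}$ in (5.29) is bounded, its existence prevents us from knowing exact asymptotic behaviour. When $\hat{\mathbf w} = \mathbf 1$, however, this term disappears and we can be more precise. The following result states that as $n \to \infty$ the coefficients $f_{\hat{\mathbf s},n}$ approach a multivariate normal distribution around the direction $\mathbf m$.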

Proposition 5.10 (Local Central Limit Theorem) Consider a combinatorial Laurent expansion of a rational function $F(\mathbf z) = G(\mathbf z)/H(\mathbf z)$. Suppose that, in some direction $\mathbf m$ with $m_d = 1$, there is a nondegenerate strictly minimal smooth contributing point of the form $\mathbf w = (\mathbf 1, t)$ for some $t > 0$. If $H_{z_d}(\mathbf w)$ and $G(\mathbf w)$ are non-zero then
\[
\sup_{\hat{\mathbf s} \in \mathbb Z^{d-1}} n^{(d-1)/2} \left| t^n f_{\hat{\mathbf s},n} - C\,\nu_n(\hat{\mathbf s}) \right| \to 0 \tag{5.30}
\]
as $n \to \infty$, where $C = -G(\mathbf w)/\bigl(w_d H_{z_d}(\mathbf w)\bigr)$ and
\[
\nu_n(\hat{\mathbf s}) = \frac{(2\pi n)^{(1-d)/2}}{\sqrt{\det \mathcal H}} \exp\left[-\frac{(\hat{\mathbf s} - n\hat{\mathbf m})^T \mathcal H^{-1} (\hat{\mathbf s} - n\hat{\mathbf m})}{2n}\right]
\]
for $\mathcal H$ the non-singular matrix in Lemma 5.5 corresponding to $\mathbf w$.

A result of the form (5.30) is often called a local central limit theorem. Problem 5.7 asks the reader, in essence, to derive ‘the’ classical local central limit theorem for random variables supported on finite subsets of $\mathbb Z^d$ (slightly generalizing Proposition 5.10 from rational to meromorphic functions removes the restriction of finite support); see Durrett [14, Ch. 3] for more traditional derivations of central limit theorems. Before proving Proposition 5.10 we look at a short example, with a more detailed example following the proof.

5.3 The Practice of Smooth ACSV


Example 5.12 (Local Central Limit Theorem for Weighted Walks) Let
\[
F(x, y) = \frac{1}{1 - y(x + 1 + 2/x)} = \sum_{i \in \mathbb Z,\; n \ge 0} w_{i,n}\, x^i y^n
\]
be the bivariate generating function where $w_{i,n}$ counts the $n$-step walks on the integer lattice $\mathbb Z$ starting at the origin, ending at $x = i$, and taking steps in the multiset $\{-1, -1, 0, 1\}$, with two distinct steps in the $-1$ direction (we have two negative steps to break symmetry). This series expansion of $F(x, y)$ converges in a domain $D$ containing all points with $|x| = 1$ and $|y| < 1/4$; the image $B$ of $D$ under the Relog map is shown on the left of Figure 5.6. Because the hypotheses of Proposition 5.10 are satisfied with $t = 1/4$ and $\mathbf m = (-1/4, 1)$, the coefficients of $[y^n]F(x, y) = (x + 1 + 2/x)^n$ behave like a Gaussian curve around their maximum value. The right of Figure 5.6 shows the limit curve compared to actual series coefficients.
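The comparison in Example 5.12 can be reproduced numerically. The following is a minimal sketch, not the author's code: it expands $(x + 1 + 2/x)^n$ exactly and compares $4^{-n} w_{i,n}$ with a Gaussian density. The function names are hypothetical, and the variance $11/16$ is an assumption computed here from the step distribution (probabilities $1/2, 1/4, 1/4$ for steps $-1, 0, 1$, mean $-1/4$) under the classical local central limit theorem; the example itself does not state the value of $\mathcal H$.

```python
from math import exp, pi, sqrt

MEAN = -1 / 4      # mean step, matching m = (-1/4, 1) in Example 5.12
VAR = 11 / 16      # assumed variance of one step: E[X^2] - MEAN^2 = 3/4 - 1/16

def walk_counts(n):
    """Exact coefficients of (x + 1 + 2/x)**n as a dict {exponent: coefficient}."""
    counts = {0: 1}
    for _ in range(n):
        nxt = {}
        for i, c in counts.items():
            # steps +1 and 0 have weight 1, step -1 has weight 2
            for step, weight in ((1, 1), (0, 1), (-1, 2)):
                nxt[i + step] = nxt.get(i + step, 0) + weight * c
        counts = nxt
    return counts

def gaussian(i, n):
    """Gaussian density with mean n*MEAN and variance n*VAR, evaluated at i."""
    return exp(-(i - n * MEAN) ** 2 / (2 * n * VAR)) / sqrt(2 * pi * n * VAR)

def sup_error(n):
    """sqrt(n) * max_i |4^{-n} w_{i,n} - gaussian(i, n)| over the support -n..n."""
    counts = walk_counts(n)
    return sqrt(n) * max(abs(c / 4 ** n - gaussian(i, n)) for i, c in counts.items())

if __name__ == "__main__":
    for n in (20, 80, 320):
        print(n, sup_error(n))
```

The scaled error is restricted to exponents in the support of the polynomial; outside that range the coefficients vanish and the Gaussian tail is negligible for the values of $n$ used here.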

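Proof (Proposition 5.10) Lemma 5.6 implies there is a non-degenerate smooth critical point $\mathbf w(\mathbf s)$ varying smoothly with $\mathbf s$ near $\mathbf m$. Since the series under consideration is combinatorial, the points $\mathbf w(\mathbf s)$ all have positive real coordinates, and Proposition 5.8 implies each is minimal. Since $\mathbf w = \mathbf w(\mathbf m)$ is strictly minimal, so are the $\mathbf w(\mathbf s)$ when $\mathbf s$ is sufficiently close to $\mathbf m$.

We begin by bounding the exponential term in $\nu_n$. Because (5.25) expresses the entries of $\mathcal H$ by evaluations of derivatives of $H(\mathbf z)$ at $\mathbf w \in \mathbb R^d$, the matrix $\mathcal H$ is real. The classical spectral theorem for real symmetric matrices [22, Thms. 4.1.5 and 4.2.2] then implies the eigenvalues of $\mathcal H$ are real and
\[
(\hat{\mathbf z}^T \hat{\mathbf z})\lambda_{\min} \le \hat{\mathbf z}^T \mathcal H \hat{\mathbf z} \le (\hat{\mathbf z}^T \hat{\mathbf z})\lambda_{\max}
\]
for all $\hat{\mathbf z} \in \mathbb R^{d-1}$, where $\lambda_{\min}$ and $\lambda_{\max}$ are the smallest and largest eigenvalues of $\mathcal H$, respectively. Since $\mathbf w$ is strictly minimal, Lemma 5.4 implies the real part of the function $\phi(\boldsymbol\theta)$ in (5.21), and thus also the term $\boldsymbol\theta^T \mathcal H \boldsymbol\theta$ in its series expansion at the origin, is strictly positive when $\boldsymbol\theta \ne \mathbf 0$. This implies all eigenvalues of $\mathcal H$ are positive, so there exist $C_1, C_2 > 0$ such that for any $\hat{\mathbf z} \in \mathbb R^{d-1}$ and sufficiently large $n$
\[
\exp\left(-nC_1\, \hat{\mathbf z}^T \hat{\mathbf z}\right) \le \exp\left(-n\, \frac{\hat{\mathbf z}^T \mathcal H \hat{\mathbf z}}{2}\right) \le \exp\left(-nC_2\, \hat{\mathbf z}^T \hat{\mathbf z}\right) \le 1. \tag{5.31}
\]
Now, fix any $1/2 < p < 2/3$. If every coordinate of $|\hat{\mathbf s} - n\hat{\mathbf m}|$ is at most $n^p$ then (5.30) is (5.29) and holds by Proposition 5.9. We thus assume each coordinate of $|\hat{\mathbf s} - n\hat{\mathbf m}|$ is at least $n^p$ and prove both $t^n f_{\hat{\mathbf s},n}$ and $\nu_n(\hat{\mathbf s})$ approach zero faster than $n^{(1-d)/2}$. Under this assumption, (5.31) implies $\nu_n(\hat{\mathbf s}) \le e^{-C_3 n^{2p-1}}$ for some $C_3 > 0$, so $\nu_n(\hat{\mathbf s})$ approaches zero faster than any fixed power of $n$. Similarly, if every coordinate of $|\hat{\mathbf s} - n\hat{\mathbf m}|$ is $o(n)$ then $\hat{\mathbf s} = n\hat{\mathbf m} + n\boldsymbol\varepsilon$ where every coordinate of $\boldsymbol\varepsilon$ is $o(1)$; repeating the argument in the proof of Lemma 5.8 then implies that when the coordinates of $|\hat{\mathbf s} - n\hat{\mathbf m}|$ are at least $n^p$ then the exponential growth of $t^n f_{\hat{\mathbf s},n}$ is at most $e^{-C_4 n^{2p-1}}$ for some $C_4 > 0$. Finally, if each coordinate of $|\hat{\mathbf s} - n\hat{\mathbf m}|$ grows at least linearly then $\hat{\mathbf r} = \hat{\mathbf s}/n$ is bounded away from $\hat{\mathbf m}$. Because the series under consideration is combinatorial,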


Fig. 5.7 Left: The coefficients of [z n ]P1 (x1, x2, z) approach a one-dimensional normal distribution as n → ∞. Right: The coefficients of [z n ]P2 (1, x2, z3, z) approach a two-dimensional normal distribution, with maximum value at the closest integer to nm, as n → ∞.

the boundary $\partial B$ of the amoeba complement component corresponding to the series is a smooth hypersurface near $(\mathbf 0, \log t)$ with normal $\mathbf m$. Thus, any hyperplane through $(\mathbf 0, \log t)$ with normal $\mathbf r$ not parallel to $\mathbf m$ intersects the interior of $B$. Equation (5.15) then implies the exponential growth of $f_{n\mathbf r}$ is strictly less than $\mathbf 1^{-\hat{\mathbf r}}\, t^{-1} = t^{-1}$.

Example 5.13 (A Family of Permutations with Restricted Cycles) For any $d \in \mathbb N$ let $F_d(n)$ be the set of permutations $\sigma$ on $\{1, \dots, n\}$ such that $i - d \le \sigma(i) \le i + 1$ for all $i$. Let $F_d$ be the union of the sets $F_d(n)$ for all $n \in \mathbb N$, and note that every element of $F_d$, when written in disjoint cycle notation, has cycles of length at most $d + 1$. Chung et al. [9] study the combinatorial classes $F_d$ because of a connection to an algorithm for the generation of random perfect matchings in classes of bipartite graphs. In particular, those authors prove that the number of permutations in $F_d(n)$ with $i_k$ cycles of length $k$ equals the coefficient $[z_1^{i_1} \cdots z_{d+1}^{i_{d+1}} t^n]P(\mathbf z, t)$ in the power series expansion of the rational function
\[
P(\mathbf z, t) = P(z_1, \dots, z_{d+1}, t) = \frac{1}{1 - z_1 t - z_2 t^2 - \cdots - z_{d+1} t^{d+1}}.
\]
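The enumeration result quoted from Chung et al. [9] can be checked by brute force for small parameters. The sketch below is illustrative only (the function names are hypothetical): it lists all permutations satisfying $i - d \le \sigma(i) \le i + 1$, records their cycle types, and compares the counts with the coefficients of $[t^n]P(\mathbf z, t)$ obtained from the recurrence $a_n = z_1 a_{n-1} + \cdots + z_{d+1} a_{n-d-1}$ implied by the denominator.

```python
from collections import Counter
from itertools import permutations

def cycle_type(perm, max_len):
    """Return (i_1, ..., i_max_len), where i_k is the number of k-cycles of perm.

    perm is a tuple on 0..n-1; cycles longer than max_len raise IndexError.
    """
    seen = [False] * len(perm)
    counts = [0] * max_len
    for start in range(len(perm)):
        if not seen[start]:
            length, j = 0, start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            counts[length - 1] += 1
    return tuple(counts)

def brute_force(d, n):
    """Cycle-type counts of permutations with i - d <= sigma(i) <= i + 1."""
    result = Counter()
    for perm in permutations(range(n)):
        if all(i - d <= perm[i] <= i + 1 for i in range(n)):
            result[cycle_type(perm, d + 1)] += 1
    return result

def series_coefficient(d, n):
    """[t^n] of 1/(1 - z_1 t - ... - z_{d+1} t^{d+1}) as {exponent tuple: coefficient}."""
    polys = [Counter({(0,) * (d + 1): 1})]
    for m in range(1, n + 1):
        current = Counter()
        for k in range(1, min(d + 1, m) + 1):
            for exponents, coeff in polys[m - k].items():
                shifted = list(exponents)
                shifted[k - 1] += 1          # multiply by z_k
                current[tuple(shifted)] += coeff
        polys.append(current)
    return polys[n]

if __name__ == "__main__":
    print(brute_force(2, 4) == series_coefficient(2, 4))
```

The brute-force search is factorial in $n$, so it is only meant as a sanity check for small $n$.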

We now show that the joint distribution for the number of cycles satisfies a central limit theorem³. The first thing to notice is that the seemingly $(d + 1)$-dimensional array of coefficients in $[t^n]P(\mathbf z, t)$ is actually $d$-dimensional, since knowing the coefficient $[\mathbf z^{\mathbf i} t^n]P(\mathbf z, t) \ne 0$ and fixing $i_2, \dots, i_{d+1}$ and $n$ uniquely determines $i_1 = n - 2i_2 - \cdots - (d + 1)i_{d+1}$ (see the left side of Figure 5.7). Thus we set $z_1 = 1$ and examine the coefficients of
\[
P(1, z_2, \dots, z_{d+1}, t) = \frac{1}{H(z_2, \dots, z_{d+1}, t)} = \frac{1}{1 - t - z_2 t^2 - \cdots - z_{d+1} t^{d+1}}.
\]

3 Thanks to Persi Diaconis for suggesting this problem.


To prove a central limit theorem, we start by searching for minimal critical points which satisfy the hypotheses of Proposition 5.10. To that end, set $z_2 = \cdots = z_{d+1} = 1$ and consider the roots of $h(t) = H(\mathbf 1, t) = 1 - t - t^2 - \cdots - t^{d+1}$. Because $h(0)$ is positive, $h(1)$ is negative, and $h'(t)$ is strictly negative for $t > 0$, we see that $h(t)$ has a unique positive root $\rho$, which lies between 0 and 1. At $\boldsymbol\sigma = (\mathbf 1, \rho)$ the logarithmic gradient of the denominator $H(z_2, \dots, z_{d+1}, t)$ equals $-(\rho^2, \dots, \rho^{d+1}, -\rho h'(\rho))$, meaning $\boldsymbol\sigma$ is a critical point in the rescaled direction
\[
\mathbf m = \left( \frac{\rho^2}{-\rho h'(\rho)}, \dots, \frac{\rho^{d+1}}{-\rho h'(\rho)}, 1 \right) = \left( -\frac{\rho}{h'(\rho)}, \dots, -\frac{\rho^{d}}{h'(\rho)}, 1 \right).
\]
Since $P$ is combinatorial, it follows from Proposition 5.4 that $\boldsymbol\sigma$ is minimal. Inspection of the smooth critical point equations implies that $\boldsymbol\sigma$ is the only critical point in the direction $\mathbf m$, so $\boldsymbol\sigma$ is strictly minimal by Corollary 5.4 (alternatively, strict minimality follows from Proposition 5.5 and aperiodicity of the denominator). Thus, if the $d \times d$ matrix $\mathcal H$ defined by
\[
\mathcal H_{i,j} =
\begin{cases}
\dfrac{\rho^{i+j+1} h''(\rho) - \rho^{i+j}(1 + i + j)\, h'(\rho)}{h'(\rho)^3} & : i \ne j \\[2ex]
\dfrac{\rho^{i+j+1} h''(\rho) - \rho^{i+j}(1 + i + j)\, h'(\rho) - \rho^{i} h'(\rho)^2}{h'(\rho)^3} & : i = j
\end{cases}
\]
is non-singular then, as $n \to \infty$, Proposition 5.10 implies that the largest coefficients of $[t^n]P(1, z_2, \dots, z_{d+1}, t)$ occur at the monomials whose exponents are close to the vector $n\hat{\mathbf m}$. These maximal coefficients approach
\[
A_n = \rho^{-n}\, \frac{-(2\pi n)^{-d/2}}{\rho\, h'(\rho) \sqrt{\det \mathcal H}},
\]
and
\[
\sup_{s_2, \dots, s_{d+1} \in \mathbb N} \left| \frac{[z_2^{s_2} \cdots z_{d+1}^{s_{d+1}} t^n]P(1, z_2, \dots, z_{d+1}, t)}{A_n} - \nu_n(s_2, \dots, s_{d+1}) \right| \to 0
\]
as $n \to \infty$, where $\nu_n(\mathbf s) = \exp\left[-\frac{(\mathbf s - n\hat{\mathbf m})^T \mathcal H^{-1} (\mathbf s - n\hat{\mathbf m})}{2n}\right]$. See Figure 5.7 for a plot of actual series coefficients compared to the limiting distribution when $d = 2$. Further details can be found in Melczer [28].
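As a numerical sanity check of Example 5.13, the sketch below evaluates the direction and the matrix $\mathcal H$ for small $d$ and, when $d = 1$, compares them with exact moments. Two assumptions are made and should be read as such: the entries are taken to be $\mathcal H_{i,j} = \bigl(\rho^{i+j+1} h''(\rho) - \rho^{i+j}(1+i+j)h'(\rho) - [i = j]\,\rho^{i} h'(\rho)^2\bigr)/h'(\rho)^3$ and the direction is taken to be $m_i = -\rho^{i}/h'(\rho)$, which is one reading of the printed formulas. For $d = 1$ the permutations correspond to compositions of $n$ into parts 1 and 2, and there are $\binom{n-j}{j}$ such compositions with $j$ parts equal to 2. All function names are hypothetical.

```python
from math import comb, sqrt

def positive_root(d):
    """Root in (0, 1) of h(t) = 1 - t - ... - t^(d+1), found by bisection."""
    def h(t):
        return 1 - sum(t ** k for k in range(1, d + 2))
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def direction_and_matrix(d):
    """Return (rho, m, H) under the assumed formulas stated in the lead-in."""
    rho = positive_root(d)
    h1 = -sum(k * rho ** (k - 1) for k in range(1, d + 2))            # h'(rho)
    h2 = -sum(k * (k - 1) * rho ** (k - 2) for k in range(2, d + 2))  # h''(rho)
    m = [-rho ** i / h1 for i in range(1, d + 1)]
    H = [[(rho ** (i + j + 1) * h2
           - rho ** (i + j) * (1 + i + j) * h1
           - (rho ** i * h1 ** 2 if i == j else 0.0)) / h1 ** 3
          for j in range(1, d + 1)]
         for i in range(1, d + 1)]
    return rho, m, H

def two_part_moments(n):
    """Exact mean and variance of the number of 2s over compositions of n into 1s and 2s."""
    weights = [comb(n - j, j) for j in range(n // 2 + 1)]
    total = sum(weights)
    mean = sum(j * w for j, w in enumerate(weights)) / total
    second = sum(j * j * w for j, w in enumerate(weights)) / total
    return mean, second - mean ** 2

if __name__ == "__main__":
    rho, m, H = direction_and_matrix(1)
    mean, var = two_part_moments(400)
    print(m[0], mean / 400)
    print(H[0][0], var / 400)
```

For $d = 1$ the root is $\rho = (\sqrt 5 - 1)/2$, and the assumed formulas give $m_1 = \rho/\sqrt 5$ and $\mathcal H_{1,1} = 5^{-3/2}$, which the exact mean and variance per step approach as $n$ grows.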

Pemantle and Wilson [32, Sect. 9.6] describe these and additional, ‘weak,’ limit laws holding under weaker assumptions on F(z). Flajolet and Sedgewick [16, Ch. IX] give an in-depth treatment of limit laws for bivariate generating functions. Additional work on limit theorems in combinatorics can be found in Hwang [23, 24], Wallner [34], and the references therein.


5.3.4 Genericity of Assumptions

We end this chapter by showing that our assumptions are satisfied by ‘most’ rational functions, explaining why they frequently hold in practice⁴.

Definition 5.10 (degrees and algebraic sets) The (total) degree of a monomial $\mathbf z^{\mathbf n}$ with $\mathbf n \in \mathbb N^d$ is the sum $|\mathbf n|_1 = n_1 + \cdots + n_d$ of its exponents, and the (total) degree of a polynomial is the maximum degree of its monomials. An algebraic set in $\mathbb C^r$ is any set formed by the common zeroes of a collection of $r$-variate polynomials with complex coefficients.

The Hilbert basis theorem [10, Sec. 2.5] implies one may always define an algebraic set by the vanishing of a finite collection of polynomials. Problem 5.12 asks you to show that there are $m_\delta = \binom{\delta + d}{d}$ monomials in $\mathbb C[\mathbf z]$ of degree at most $\delta$.

Definition 5.11 (generic properties of rational functions) A property $P$ of polynomials in $\mathbb C[\mathbf z]$ holds generically if for every $\delta \in \mathbb N$ there exists a proper algebraic subset $C_\delta \subsetneq \mathbb C^{m_\delta}$ such that any polynomial of degree $\delta$ satisfies $P$ unless the vector of its coefficients lies in $C_\delta$. A property $P$ of rational functions holds generically if for every pair $\delta_1, \delta_2 \in \mathbb N$ there exists a proper algebraic subset $C_{\delta_1,\delta_2} \subsetneq \mathbb C^{m_{\delta_1} + m_{\delta_2}}$ such that any rational function with numerator of degree $\delta_1$ and denominator of degree $\delta_2$ satisfies $P$ unless the vector formed by the coefficients of its numerator and denominator lies in $C_{\delta_1,\delta_2}$.

Since the intersection of algebraic sets is algebraic, this definition implies that the conjunction of generic properties is generic. In this section we prove the following.

Proposition 5.11 Fix $\mathbf r \in \mathbb R_*^d$. Generically, a rational function $F(\mathbf z) = G(\mathbf z)/H(\mathbf z)$ has a smooth singular variety, $H$ is square-free, $F$ admits a finite number of smooth critical points in the direction $\mathbf r$, each of these critical points is non-degenerate, and the numerator $G$ does not vanish at any of these critical points.

When at least one of these generically occurring smooth critical points is minimal and contributing, Theorem 5.4 gives an explicit expression for the dominant asymptotic term. We require some algebraic tools before proving Proposition 5.11.
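The count $m_\delta = \binom{\delta+d}{d}$ of monomials of degree at most $\delta$ in $d$ variables, used above to index coefficient vectors, is easy to confirm by enumeration for small cases. The following sketch (with a hypothetical helper name) simply lists exponent vectors; it is a check, not a proof.

```python
from itertools import product
from math import comb

def count_monomials(d, delta):
    """Count exponent vectors in N^d whose entries sum to at most delta."""
    return sum(1 for e in product(range(delta + 1), repeat=d) if sum(e) <= delta)

if __name__ == "__main__":
    for d in range(1, 4):
        for delta in range(0, 5):
            print(d, delta, count_monomials(d, delta), comb(delta + d, d))
```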

⁴ Although a large number of examples that come from combinatorial problems have non-generic behaviour, as we will see in Part III. It is fair to say that some generic properties (like nondegenerate critical points) hold in most applications while others (such as smoothness at all points of the singular variety) merely hold for many applications.

Projective Space, Multivariate Resultants, and Generic Smoothness

The idea behind Proposition 5.11 is that our conditions determine an algebraic system of equations involving both the $\mathbf z$ variables and the coefficients of $G$ and $H$. Consider first the property that the singular variety $\mathcal V = \mathcal V(H)$ is everywhere smooth; if this property does not hold then the polynomial system
\[
H(\mathbf z) = H_{z_1}(\mathbf z) = \cdots = H_{z_d}(\mathbf z) = 0 \tag{5.32}
\]
has a solution. If $H(\mathbf z)$ has degree $\delta$, then we can write
\[
H(\mathbf z) = \sum_{|\mathbf i|_1 \le \delta} c_{\mathbf i}\, \mathbf z^{\mathbf i}, \qquad c_{\mathbf i} \in \mathbb C,
\]
where, as above, $|\mathbf i|_1 = i_1 + \cdots + i_d$ for indices $\mathbf i \in \mathbb N^d$. Instead of taking a fixed polynomial specified by the coefficients $c_{\mathbf i} \in \mathbb C$, we now consider a generic polynomial $H$ with degree $\delta$ by taking the coefficients $c_{\mathbf i}$ as additional variables. In this sense, we consider $H$ to be an element of the polynomial ring $\mathbb C[\mathbf c][\mathbf z]$ with $m_\delta + d$ variables. To show that smoothness is a generic property, one must show that the system (5.32) only has a solution when the coefficient variables $\mathbf c$ of $H$ lie in some algebraic set $C_\delta \subset \mathbb C^{m_\delta}$ depending only on $\delta$. Ideally, we would take a solution $(\mathbf c, \mathbf z)$ of this system and project it onto the $\mathbf c$ variables to obtain an algebraic equation that must be satisfied by the coefficient variables. Unfortunately, the projection of an algebraic set onto some of its components need not be algebraic.

Example 5.14 (Projections of Algebraic Sets Need Not Be Algebraic) Consider the algebraic set $A = \mathcal V(1 - cx) = \{(c, 1/c) : c \in \mathbb C_*\} \subset \mathbb C^2$. Then the projection of $A$ onto its $c$ coordinate is $\mathbb C_*$, which is not an algebraic set as any polynomial vanishing on $\mathbb C_*$ must vanish on all of $\mathbb C$. In particular, the projection is not all of $\mathbb C$ but the projection also does not lie in a proper algebraic subset of $\mathbb C$.

Essentially, this difficulty comes from the fact that $\mathbb C^d$ is not compact so points can ‘escape’ to infinity. The solution is to work over a different, compact, space.

Definition 5.12 (projective space) The complex projective space of dimension $n$, denoted $\mathbb P^n$, consists of all non-zero tuples $(x_0, \dots, x_n) \in \mathbb C^{n+1} \setminus \{\mathbf 0\}$ modulo the equivalence relation $(x_0, \dots, x_n) \sim (y_0, \dots, y_n)$ if $(x_0, \dots, x_n) = \lambda(y_0, \dots, y_n)$ for some $\lambda \in \mathbb C_*$. A point $(x_0, x_1, \dots, x_n) \in \mathbb C^{n+1} \setminus \{\mathbf 0\}$ defines an equivalence class of the relation $\sim$ which is denoted $(x_0 : x_1 : \cdots : x_n) \in \mathbb P^n$.

In order for the zero set of a polynomial $f(x_0, \dots, x_n)$ to be well defined over $\mathbb P^n$, it must depend only on an equivalence class $(x_0 : \cdots : x_n)$. In particular, every monomial in $f$ must have the same degree, so that
\[
f(\lambda x_0, \dots, \lambda x_n) = \lambda^{\deg(f)} f(x_0, \dots, x_n) = 0
\]


whenever $f(x_0, \dots, x_n) = 0$.

Definition 5.13 (homogeneous polynomials and algebraic sets) A polynomial where every monomial has the same degree is said to be homogeneous. We also extend our definition of algebraic sets to different spaces. An algebraic set,
• in $\mathbb P^n$ is any subset of $\mathbb P^n$ formed by the common zeroes of a collection of $(n + 1)$-variate complex homogeneous polynomials;
• in the Cartesian product $\mathbb P^n \times \mathbb C^m$ is any subset of $\mathbb P^n \times \mathbb C^m$ formed by the common zeroes of a collection of complex polynomials of the form $f(x_0, \dots, x_n, y_1, \dots, y_m)$ which are homogeneous in the $\mathbf x$ variables (the degree of each monomial with respect to the $\mathbf x$ variables only is constant);
• in the Cartesian product $\mathbb P^n \times \mathbb P^m$ is any subset of $\mathbb P^n \times \mathbb P^m$ formed by the common zeroes of a collection of complex polynomials of the form $f(x_0, \dots, x_n, y_0, \dots, y_m)$ which are homogeneous in the $\mathbf x$ variables and the $\mathbf y$ variables separately.

Again, the Hilbert basis theorem [10, Sec. 2.5] implies one may always define an algebraic set by the vanishing of a finite collection of polynomials. We introduce projective space for the following property, called completeness or properness.

Proposition 5.12 Let $\mathbb P^n$ denote complex projective space of dimension $n$, and let $A$ denote either complex or projective space of any dimension. If $\pi : \mathbb P^n \times A \to A$ is the projection map onto $A$ and $V$ is any algebraic set in $\mathbb P^n \times A$ then $\pi(V)$ is an algebraic set in $A$.

Our proof is based on the one in Mumford [29, Thm. 2.23]. The proof uses the projective Nullstellensatz, a standard result of algebraic geometry which can be found in Mumford [29, Cor. 2.3] and Cox et al. [10, Ch. 8.3].

Proof Suppose first that $A = \mathbb C^m$. Then $V$ is defined by the vanishing of a finite collection of polynomials $f_1(\mathbf x, \mathbf y), \dots, f_r(\mathbf x, \mathbf y)$ on $\mathbb P^n \times \mathbb C^m$ where each $f_j$ is homogeneous in the $\mathbf x$ variables. For fixed $\mathbf Y \in \mathbb C^m$, the projective Nullstellensatz states that $f_1(\mathbf x, \mathbf Y) = \cdots = f_r(\mathbf x, \mathbf Y) = 0$ has a common solution $\mathbf x \in \mathbb P^n$ if and only if for every $k \in \mathbb N$ there exists a monomial of degree $k$ that cannot be written as a $\mathbb C[\mathbf x]$-linear combination of $f_1(\mathbf x, \mathbf Y), \dots, f_r(\mathbf x, \mathbf Y)$. We claim that for every $k \in \mathbb N$ the points $\mathbf Y \in \mathbb C^m$ for which there exists a monomial of degree $k$ that cannot be written as a $\mathbb C[\mathbf x]$-linear combination of $f_1(\mathbf x, \mathbf Y), \dots, f_r(\mathbf x, \mathbf Y)$ form an algebraic set of $\mathbb C^m$. Since the arbitrary intersection of algebraic sets is algebraic, this proves the proposition for the case $A = \mathbb C^m$. The case $A = \mathbb P^m$ follows from the case $A = \mathbb C^m$ after noting that $\mathbb P^m$ can be written as the union of the sets $U_i = \{\mathbf y \in \mathbb P^m : y_i = 1\}$ for $0 \le i \le m$, each of which is equivalent to $\mathbb C^m$.

It remains only to prove our claim. Fix $k \in \mathbb N$, let $V$ be the vector space of degree $k$ homogeneous polynomials in $\mathbb C[\mathbf x]$, and for $1 \le i \le r$ let $V_i$ be the vector space of homogeneous polynomials of degree $k - \deg_{\mathbf x}(f_i)$ in $\mathbb C[\mathbf x]$, where $V_i = (0)$ if $k < \deg_{\mathbf x}(f_i)$. Then for any $\mathbf Y \in \mathbb C^m$ the map


\[
\phi_k : V_1 \times \cdots \times V_r \to V, \qquad (g_1, \dots, g_r) \mapsto \sum_{i=1}^{r} g_i(\mathbf x)\, f_i(\mathbf x, \mathbf Y)
\]
is a linear map between the vector spaces $V_1 \times \cdots \times V_r$ and $V$. After fixing bases of these vector spaces, the map $\phi_k$ is represented by a matrix $M_{\mathbf Y}$ whose entries are complex polynomials in $\mathbf Y$. There is a monomial of degree $k$ that cannot be written as a $\mathbb C[\mathbf x]$-linear combination of $f_1(\mathbf x, \mathbf Y), \dots, f_r(\mathbf x, \mathbf Y)$ if and only if the map $\phi_k$ is not surjective, which happens if and only if all maximal minors of $M_{\mathbf Y}$ vanish. The condition of vanishing minors defines an algebraic set in $\mathbb C^m$.

Proposition 5.12 yields a test for proving genericity of properties of homogeneous polynomials. An arbitrary polynomial can be converted into a homogeneous polynomial through the following process.

Definition 5.14 (homogenization) If $f(\mathbf x, \mathbf y)$ is a complex polynomial the homogenization of $f$ with respect to $\mathbf x = (x_1, \dots, x_n)$ is the homogeneous polynomial
\[
\tilde f(x_0, \mathbf x, \mathbf y) = x_0^{\deg_{\mathbf x}(f)}\, f(x_1/x_0, \dots, x_n/x_0, \mathbf y),
\]
where $\deg_{\mathbf x}(f)$ denotes the degree of $f$ with respect to the $\mathbf x$ variables only.

Corollary 5.7 Let $S$ be a collection of polynomials in two sets of variables $\mathbf z$ and $\mathbf c$, of lengths $n$ and $m$, and let $\tilde S$ be the result of homogenizing each polynomial with respect to the $\mathbf z$ variables. Suppose there exists $\mathbf a \in \mathbb C^m$ such that the polynomials in $\tilde S$ have no common root in $\mathbb P^n$ after the substitution $\mathbf c = \mathbf a$. Then there exists a proper algebraic subset $C \subsetneq \mathbb C^m$ such that $\mathbf b \in C$ whenever the polynomials in $S$ have a common root of the form $(\mathbf w, \mathbf b) \in \mathbb C^{n+m}$ for some $\mathbf w \in \mathbb C^n$.

Proof The solutions of $\tilde S$ over $\mathbb P^n \times \mathbb C^m$ form an algebraic set, so Proposition 5.12 implies the existence of an algebraic set $C \subset \mathbb C^m$ such that $\mathbf b \in C$ if and only if there exists a solution $(w_0, \mathbf w, \mathbf b)$ of $\tilde S$. If $(\mathbf w, \mathbf b)$ is a solution of $S$ in $\mathbb C^{n+m}$ then $(1, \mathbf w, \mathbf b)$ is a solution of $\tilde S$ in $\mathbb P^n \times \mathbb C^m$, so $\mathbf b \in C$. Since $\tilde S$ has no solution with $\mathbf c = \mathbf a$, it follows that $\mathbf a \notin C$ and $C \subsetneq \mathbb C^m$.

Remark 5.18 The point of Corollary 5.7 is the properness condition on $C$, otherwise one can trivially take $C = \mathbb C^m$. Working over complex space, instead of projective space, the projection onto the $\mathbf c$ variables need not be algebraic, so to prove genericity one must replace the projection by its algebraic closure (the smallest algebraic set it is contained in). As noted in the example above, the algebraic closure can be larger than the projection, and can be all of $\mathbb C^m$ even if the projection is missing points.

We have now developed the algebraic framework necessary to start proving Proposition 5.11. For instance, the next lemma implies that generically $H$ is square-free and $\mathcal V = \mathcal V(H)$ is smooth.

Lemma 5.9 Generically, the system of equations


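\[
H(\mathbf z) = H_{z_1}(\mathbf z) = \cdots = H_{z_d}(\mathbf z) = 0
\]
has no solution.

Proof Fix the degree $\delta \ge 1$ of $H$. Writing
\[
H(\mathbf z) = \sum_{|\mathbf i|_1 \le \delta} c_{\mathbf i}\, \mathbf z^{\mathbf i} \in \mathbb C[\mathbf c][\mathbf z]
\]
as above, we will use Corollary 5.7 to prove the existence of a proper algebraic set $C_\delta \subsetneq \mathbb C^{m_\delta}$ such that the system of equations has a solution only if $\mathbf c \in C_\delta$. To show the existence of such a set it is sufficient to exhibit a polynomial $H_\delta$ of degree $\delta$ such that the homogenization of this system of equations has no solution in $\mathbb P^d$. If $H_\delta(\mathbf z) = 1 - z_1^\delta - \cdots - z_d^\delta$ then the homogenized system of equations becomes
\[
z_0^\delta - z_1^\delta - \cdots - z_d^\delta = -\delta z_1^{\delta-1} = \cdots = -\delta z_d^{\delta-1} = 0,
\]
which forces $z_0 = z_1 = \cdots = z_d = 0$ and thus has no solution in projective space, as $\mathbf 0 \notin \mathbb P^d$ by definition.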
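Homogenization as in Definition 5.14 is mechanical, and a small sketch may make the step in the proof of Lemma 5.9 concrete. The representation below (a polynomial as a dictionary from exponent tuples to coefficients) and the function names are illustrative assumptions, not notation from the text.

```python
def homogenize(poly, num_x):
    """Homogenize poly with respect to its first num_x variables.

    poly maps exponent tuples to coefficients; the result has one extra
    leading exponent for the new variable x_0.
    """
    deg = max(sum(e[:num_x]) for e in poly)
    return {(deg - sum(e[:num_x]),) + tuple(e): c for e, c in poly.items()}

def is_homogeneous(poly, num_x):
    """Check that all monomials have the same degree in the first num_x variables."""
    return len({sum(e[:num_x]) for e in poly}) == 1

def fermat_type(d, delta):
    """The polynomial 1 - z_1^delta - ... - z_d^delta used in Lemma 5.9 (delta >= 1)."""
    poly = {(0,) * d: 1}
    for k in range(d):
        exponents = [0] * d
        exponents[k] = delta
        poly[tuple(exponents)] = -1
    return poly

if __name__ == "__main__":
    # 1 - z1^3 - z2^3 becomes z0^3 - z1^3 - z2^3
    print(homogenize(fermat_type(2, 3), 2))
```

In the homogenized polynomial the constant term 1 becomes $z_0^\delta$, which is why the vanishing of all $z_j$ with $j \ge 1$ forces $z_0 = 0$ as well.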



We remark that advanced arguments allow one to be more precise. Given a collection of polynomials $f_1(\mathbf z), \dots, f_r(\mathbf z)$ of degrees $\delta_1, \dots, \delta_r$, respectively, write
\[
f_j(\mathbf z) = \sum_{|\mathbf i|_1 \le \delta_j} c_{j,\mathbf i}\, \mathbf z^{\mathbf i}
\]
for all $1 \le j \le r$. Given a polynomial $P$ in the variables $\{u_{j,\mathbf i} : |\mathbf i|_1 \le \delta_j,\ 1 \le j \le r\}$ we then let $P(f_1, \dots, f_r)$ denote the evaluation of $P$ obtained by setting the variable $u_{j,\mathbf i}$ equal to the coefficient $c_{j,\mathbf i}$.

Proposition 5.13 For all positive integers $\delta_0, \dots, \delta_d$, there exists an irreducible polynomial $\mathrm{Res} = \mathrm{Res}_{\delta_0,\dots,\delta_d} \in \mathbb Z[u_{j,\mathbf i}]$, called the multivariate resultant, such that $d + 1$ homogeneous polynomials $f_0, \dots, f_d \in \mathbb C[z_0, \dots, z_d]$ of degrees $\delta_0, \dots, \delta_d$ have a common root in $\mathbb P^d$ if and only if $\mathrm{Res}(f_0, \dots, f_d) = 0$. The resultant is uniquely defined by the condition $\mathrm{Res}(z_0^{\delta_0}, \dots, z_d^{\delta_d}) = 1$.

The techniques used to prove Proposition 5.13, yielding algorithms for calculating the multivariate resultant, are beyond the scope of our work; see Gelfand et al. [17, Ch. 13] or Jouanolou [25] for details on Proposition 5.13 and its proof. For an introductory exposition, and a detailed overview of the computational aspects of the multivariate resultant, see the excellent treatment in Chapter 3 of Cox et al. [11]. Note that the multivariate resultant is defined for dense generic polynomial systems: each $f_j$ must contain all monomials of degree at most $\delta_j$, otherwise the resultant may identically vanish or, less pathologically, may become reducible. Refinements of the multivariate resultant to polynomials with different support sets are often studied under the name sparse resultants [11, Ch. 7].
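In the simplest case $d = 1$, the resultant of two binary forms can be computed as the determinant of the classical Sylvester matrix. The sketch below illustrates only this special case and is not an algorithm for the general multivariate resultant; the helper names are hypothetical. A form $f(z_0, z_1) = \sum_i a_i z_0^{p-i} z_1^{i}$ is stored as the coefficient list $[a_0, \dots, a_p]$.

```python
from fractions import Fraction

def determinant(rows):
    """Exact determinant of a square matrix via Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def sylvester_resultant(f, g):
    """Determinant of the Sylvester matrix of binary forms f and g (coefficient lists)."""
    p, q = len(f) - 1, len(g) - 1
    rows = [[0] * i + list(f) + [0] * (q - 1 - i) for i in range(q)]
    rows += [[0] * i + list(g) + [0] * (p - 1 - i) for i in range(p)]
    return determinant(rows)

if __name__ == "__main__":
    # z0^2 - z1^2 and z0^2 - z0*z1 share the projective root (1 : 1)
    print(sylvester_resultant([1, 0, -1], [1, -1, 0]))
    # z0^2 - z1^2 and z0^2 + z1^2 have no common projective root
    print(sylvester_resultant([1, 0, -1], [1, 0, 1]))
```

With this convention the pair $(z_0^2, z_1^3)$ produces an identity Sylvester matrix, in agreement with the normalization $\mathrm{Res}(z_0^{\delta_0}, z_1^{\delta_1}) = 1$ in this instance.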


The Remaining Properties

Lemma 5.9 implies that generically $H$ is square-free and $\mathcal V = \mathcal V(H)$ is smooth. We now prove that the remaining properties listed in Proposition 5.11 hold generically.

Numerator does not vanish at critical points

Corollary 5.7 implies that generically $G$ does not vanish at the critical points of $H$ if for every $\delta_1, \delta_2 \in \mathbb N$ there exist specific polynomials $G$ and $H$ of degrees $\delta_1$ and $\delta_2$, respectively, such that the homogenization of the polynomial system
\[
G(\mathbf z) = H(\mathbf z) = r_2 z_1 H_{z_1}(\mathbf z) - r_1 z_2 H_{z_2}(\mathbf z) = \cdots = r_d z_1 H_{z_1}(\mathbf z) - r_1 z_d H_{z_d}(\mathbf z) = 0
\]
has no solution in $\mathbb P^d$. If
\[
G(\mathbf z) = z_1^{\delta_1} \qquad \text{and} \qquad H(\mathbf z) = 1 - z_1^{\delta_2} - \cdots - z_d^{\delta_2}
\]
then the system of homogeneous polynomial equations
\begin{align*}
\tilde G = z_1^{\delta_1} &= 0 \\
\tilde H = z_0^{\delta_2} - z_1^{\delta_2} - \cdots - z_d^{\delta_2} &= 0 \\
-\delta_2 r_j z_1^{\delta_2} + \delta_2 r_1 z_j^{\delta_2} &= 0, \qquad j = 2, \dots, d
\end{align*}
has no solution in $\mathbb P^d$, as $z_1 = 0$ implies $z_j = 0$ for all $j$.

Critical points are nondegenerate and finite in number

We prove that generically all critical points are nondegenerate; Lemma 5.6 then implies they are finite in number, finishing off the proof of Proposition 5.11. We may assume $H$ is square-free, as we have already shown this property holds generically. Consider the matrix $\mathcal H$ defined by (5.25) with $\mathbf w = \mathbf z$. After multiplying every entry of $\mathcal H$ by $z_d^3 H_{z_d}^3$ we obtain a polynomial matrix $\tilde{\mathcal H}$ whose determinant is an explicit polynomial $D$ in the variables $\mathbf z$ and the coefficients of $H$. Corollary 5.7 implies that to prove all critical points are generically nondegenerate it is sufficient to find for each $\delta \in \mathbb N$ a polynomial $H$ of degree $\delta$ such that the homogenizations of the $d + 1$ equations consisting of $D = 0$ and the smooth critical point equations (5.16) have no common solution in $\mathbb P^d$. Let us take, again, the trusty polynomial $H(\mathbf z) = 1 - z_1^\delta - \cdots - z_d^\delta$. Calculating the quantities in (5.25) and performing some algebraic simplification gives that
\[
\det \tilde{\mathcal H} = (-1)^{d-1} \delta^{4(d-1)} (z_1 \cdots z_{d-1})^{\delta}\, z_d^{\delta(d-1)} \det M,
\]
where $M$ is the $(d - 1) \times (d - 1)$ matrix
\[
M = \begin{pmatrix}
z_1^\delta + z_d^\delta & z_2^\delta & z_3^\delta & \cdots & z_{d-1}^\delta \\
z_1^\delta & z_2^\delta + z_d^\delta & z_3^\delta & \cdots & z_{d-1}^\delta \\
z_1^\delta & z_2^\delta & z_3^\delta + z_d^\delta & \cdots & z_{d-1}^\delta \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
z_1^\delta & z_2^\delta & z_3^\delta & \cdots & z_{d-1}^\delta + z_d^\delta
\end{pmatrix}
\]
whose off-diagonal entries are constant down columns. Problem 5.13 asks you to prove that the determinant of $M$ is $z_d^{\delta(d-2)}(z_1^\delta + \cdots + z_d^\delta)$, so the polynomial
\[
D = (-1)^{d-1} \delta^{4(d-1)} (z_1 \cdots z_{d-1})^{\delta}\, z_d^{\delta(2d-3)} \left( z_1^\delta + \cdots + z_d^\delta \right)
\]
is already homogeneous. The smooth critical point equations become
\begin{align*}
z_0^\delta - z_1^\delta - \cdots - z_d^\delta &= 0 \\
-\delta r_j z_1^\delta + \delta r_1 z_j^\delta &= 0, \qquad j = 2, \dots, d
\end{align*}
and the homogenized system has no solution in $\mathbb P^d$ (the critical point equations imply each $z_j^\delta$ is a non-zero scalar multiple of $z_1^\delta$ so if the $\delta$th powers sum to zero, or any of them is zero, all of them must be zero).
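The determinant identity quoted from Problem 5.13, $\det(A + bI_k) = b^{k-1}(a_1 + \cdots + a_k + b)$ when the entries $A_{i,j} = a_j$ are constant down columns, can be spot-checked with exact arithmetic. This is a numerical check with hypothetical helper names, not a substitute for the proof requested in the problem.

```python
from fractions import Fraction

def determinant(rows):
    """Exact determinant of a square matrix via Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def column_constant_plus_scalar(a, b):
    """Build M = A + b*I where every row of A equals the list a."""
    k = len(a)
    return [[a[j] + (b if i == j else 0) for j in range(k)] for i in range(k)]

def predicted(a, b):
    """Right-hand side b^(k-1) * (a_1 + ... + a_k + b) of the identity."""
    return Fraction(b) ** (len(a) - 1) * (sum(a) + b)

if __name__ == "__main__":
    a, b = [2, 3, 5], 7
    print(determinant(column_constant_plus_scalar(a, b)), predicted(a, b))
```

In the setting of the matrix $M$ above one takes $k = d - 1$, $a_j = z_j^\delta$ and $b = z_d^\delta$.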

Problems

5.1 Using Theorem 5.2, find dominant asymptotics of the $(1, -2)$-diagonal of $F(x, y) = 1/(1 - x - y)$ when it is expanded as a convergent Laurent series in the domain $D_3 = \{(x, y) \in \mathbb C^2 : 1 + |x| < |y|\}$, proving minimality of any critical points and taking care with the signs and square roots involved.

5.2 Figure 3.5 in Section 3.3.2 of Chapter 3 displays $\mathrm{amoeba}(1 - x - y - xy^3)$, whose complement in $\mathbb R^2$ contains four components. Let $F(x, y) = 1/(1 - x - y - xy^3)$ and, for each component of the amoeba complement, determine the directions $\mathbf r = (r, s) \in \mathbb R^2$ where Proposition 5.7 implies the sequence $[x^{nr} y^{ns}]F(x, y)$ is eventually zero. Are these diagonal sequences identically zero in any other directions?

5.3 Andrade et al. [12] studied the bivariate generating function
\[
F(x, y) = \frac{e^{-x-y}}{(1 - e^{-x} - e^{-y})^2},
\]
whose coefficients count certain permutation statistics. Give asymptotics of the power series coefficients $[x^{rn} y^{sn}]F(x, y)$ as $n \to \infty$ for all $r, s > 0$.

5.4 Figure 5.4 in Chapter 3 sketches the contour of $Q(x, y) = 1 - x - y - 6xy - x^2 y^2$ together with the connected components of $\mathrm{amoeba}(Q)^c$. For each of the four directions $\mathbf r \in \{\pm 1\}^2$, determine which of the 5 components contains a minimal critical point, and determine which of the 5 convergent Laurent expansions of $1/Q(x, y)$ has an $\mathbf r$-diagonal which is eventually zero. Find dominant asymptotics for the main diagonal of the convergent Laurent expansion corresponding to the bounded amoeba component $B_2$. You may assume that this expansion admits a minimal critical point (we give computational methods for determining minimality in Chapter 7).

5.5 Pantone [30] gave asymptotics for the number of ‘singular vector tuples of generic tensors’ by analyzing the multivariate generating function
\[
F(\mathbf z) = \frac{z_1 \cdots z_d}{(1 - z_1) \cdots (1 - z_d) \left( 1 - \sum_{i=2}^{d} (i - 1)\, e_i(\mathbf z) \right)},
\]
where $e_i(\mathbf z)$ is the $i$th elementary symmetric function
\[
e_i(\mathbf z) = \sum_{1 \le j_1 < \cdots < j_i \le d} z_{j_1} \cdots z_{j_i}.
\]
[…] $> 0$ where $F(\mathbf z)$ admits a minimal smooth contributing point.

5.6 Baryshnikov et al. [2] study main diagonal asymptotics for the power series expansion of $1/H_{a,b}(x, y, z)$, where $H_{a,b}(x, y, z) = 1 - (x + y + z) + a(xy + xz + yz) + b(xyz)$ for parameters $a, b \in \mathbb R$. Prove that the singular variety of this rational function is smooth unless $4a^3 - 3a^2 + 6ba + b^2 - 4b = 0$. In the smooth cases, determine dominant asymptotics of the main diagonal sequence. You may use without proof the Grace-Walsh-Szegő theorem [6, Thm. 1.1]: if $f(x, y, z)$ is unchanged by permutations of the variables, $f$ is linear in each variable individually, and $f(a, b, c) = 0$ for $|a|, |b|, |c| < r$ then there exists $|d| < r$ with $f(d, d, d) = 0$.

5.7 Let $S(\hat{\mathbf z})$ be an aperiodic Laurent polynomial with non-negative coefficients which add up to 1. Prove that Proposition 5.10 applies to the function $F(\mathbf z) = 1/(1 - z_d S(\hat{\mathbf z}))$ and explicitly determine the limiting function $\nu_n(\hat{\mathbf s})$ in (5.30).


5.8 Flajolet and Sedgewick [16, Ex. IX.15] analyze Euclid’s gcd algorithm over the finite field of prime order $p$ by deriving the bivariate generating function
\[
F(u, z) = \frac{1}{1 - pz - p(p - 1)uz} = \sum_{k, n \ge 0} e_{k,n}\, u^k z^n,
\]
where $e_{k,n}$ denotes the number of pairs of polynomials $(r, q)$ over the finite field of order $p$ such that $q$ is monic, $n = \deg q > \deg r$, and Euclid’s gcd algorithm applied to $(r, q)$ takes $k$ steps to terminate. Determine, for sufficiently large $n$, dominant asymptotics of the largest coefficient of $[z^n]F(u, z)$ and the exponent corresponding to this coefficient, then use Proposition 5.10 to show that the distribution of the number of steps in the Euclidean algorithm for a pair of polynomials of degree $n$ satisfies a local central limit theorem.

5.9 Bender and Richmond [5] prove limit theorems for a range of multivariate series, including the power series expansion of the trivariate function
\[
R(x, y, z) = \frac{(1 - x + (xy - y - 1)z)(1 - y + (xy - x - 1)z) - xyz}{(1 - z)(1 - (x + y + 1)z + xyz^2)}
\]
which arises from a graph theory application. Determine, for sufficiently large $n$, dominant asymptotics of the largest coefficient of $r_n(x, y) = [z^n]R(x, y, z)$ and the exponent corresponding to this coefficient, then use Proposition 5.10 to show that the coefficients of $r_n(x, y)$ satisfy a local central limit theorem.

5.10 Ramgoolam et al. [33] use ACSV techniques to study so-called quiver gauge theories. For instance, those authors show that the ‘generalized clover quiver class’ has multivariate generating function
\[
F(\mathbf z) = \prod_{i=1}^{\infty} \frac{1}{1 - \sum_{j=1}^{d} z_j^{i}}.
\]
Show that for any direction $\mathbf r \in \mathbb R_{>0}^d$ the meromorphic function $F(\mathbf z)$ admits $\boldsymbol\sigma = (r_1/|\mathbf r|_1, \dots, r_d/|\mathbf r|_1)$ as a minimal critical point for its power series expansion and use Corollary 5.3 to determine dominant asymptotics of the $\mathbf r$-diagonal, where $|\mathbf r|_1 = r_1 + \cdots + r_d$. Hint: Write the denominator of $F$ as
\[
\left( 1 - \sum_{j=1}^{d} z_j \right) \times \prod_{i=2}^{\infty} \left( 1 - \sum_{j=1}^{d} z_j^{i} \right),
\]
then show the zeroes of the second factor don’t affect minimality of $\boldsymbol\sigma$.

5.11 Modify the analysis of Section 5.1 to obtain an integral expression
\[
f_{n,n} = \int_{\mathbb R} A(\theta)\, e^{-n\phi(\theta)}\, d\theta
\]
for the power series coefficients of $F(x, y) = (x - y)/(1 - x - y)$ along the main diagonal. What steps in the analysis change with the numerator? Prove that this integral is identically zero.

5.12 Prove that there are $\binom{N+d}{d}$ monomials $\mathbf z^{\mathbf m}$ of degree at most $N$. In other words, show that the number of solutions to $a_1 + \cdots + a_d \le N$ with each $a_i \in \mathbb N$ is $\binom{N+d}{d}$.

5.13 Let $M = A + bI_k$ where $A$ is a $k \times k$ matrix whose entries $A_{i,j} = a_j$ are constant down columns, $b \in \mathbb C$, and $I_k$ is the $k \times k$ identity matrix. Prove that the determinant of $M$ is $b^{k-1}(a_1 + \cdots + a_k + b)$. Hint: What are the eigenvalues of $M$, and what are their multiplicities?

References

1. Y. Baryshnikov, S. Melczer, and R. Pemantle. Stationary points at infinity for analytic combinatorics. arXiv preprint, arXiv:1905.05250, 2020.
2. Yuliy Baryshnikov, Stephen Melczer, Robin Pemantle, and Armin Straub. Diagonal asymptotics for symmetric rational functions via ACSV. In 29th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms, volume 110 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. No. 12, 15. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2018.
3. Yuliy Baryshnikov and Robin Pemantle. Asymptotics of multivariate sequences, part III: Quadratic points. Adv. Math., 228(6):3127–3206, 2011.
4. Edward A. Bender. Central and local limit theorems applied to asymptotic enumeration. J. Combinatorial Theory Ser. A, 15:91–111, 1973.
5. Edward A. Bender and L. Bruce Richmond. Central and local limit theorems applied to asymptotic enumeration. II. Multivariate generating functions. J. Combin. Theory Ser. A, 34(3):255–265, 1983.
6. Julius Borcea and Petter Brändén. The Lee-Yang and Pólya-Schur programs. II. Theory of stable polynomials and applications. Comm. Pure Appl. Math., 62(12):1595–1631, 2009.
7. Chad Brewbaker. A combinatorial interpretation of the poly-Bernoulli numbers and two Fermat analogues. Integers, 8:A02, 9, 2008.
8. Peter Bürgisser, Michael Clausen, and M. Amin Shokrollahi. Algebraic complexity theory, volume 315 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 1997.
9. Fan Chung, Persi Diaconis, and Ron Graham. Permanental generating functions and sequential importance sampling. Advances in Applied Mathematics, page 101916, 2019.
10. David Cox, John Little, and Donal O’Shea. Ideals, varieties, and algorithms. Undergraduate Texts in Mathematics. Springer, New York, third edition, 2007.
11. David A. Cox, John Little, and Donal O’Shea. Using algebraic geometry, volume 185 of Graduate Texts in Mathematics. Springer, New York, second edition, 2005.
12. Rodrigo Ferraz de Andrade, Erik Lundberg, and Brendan Nagle. Asymptotics of the extremal excedance set statistic. European J. Combin., 46:75–88, 2015.
13. N. G. de Bruijn. Asymptotic methods in analysis. Bibliotheca Mathematica. Vol. 4. North-Holland Publishing Co., Amsterdam; P. Noordhoff Ltd., Groningen; Interscience Publishers Inc., New York, 1958.
14. Rick Durrett. Probability: theory and examples, volume 31 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, fourth edition, 2010.
15. Wolfgang Ebeling. Functions of several complex variables and their singularities, volume 83 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2007.

246

5 The Theory of ACSV for Smooth Points

16. Philippe Flajolet and Robert Sedgewick. Analytic combinatorics. Cambridge University Press, Cambridge, 2009. 17. I. M. Gelfand, M. M. Kapranov, and A. V. Zelevinsky. Discriminants, resultants and multidimensional determinants. Modern Birkhäuser Classics. Birkhäuser Boston, Inc., Boston, MA, 2008. 18. Phillip Griffiths and Joseph Harris. Principles of algebraic geometry. Wiley-Interscience [John Wiley & Sons], New York, 1978. 19. Serkan Hoşten and Seth Sullivant. The algebraic complexity of maximum likelihood estimation for bivariate missing data. In Algebraic and geometric methods in statistics, pages 123–133. Cambridge Univ. Press, Cambridge, 2010. 20. Lars Hörmander. The analysis of linear partial differential operators. I, volume 256 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, second edition, 1990. 21. Lars Hörmander. An introduction to complex analysis in several variables, volume 7 of NorthHolland Mathematical Library. North-Holland Publishing Co., Amsterdam, third edition, 1990. 22. Roger A. Horn and Charles R. Johnson. Matrix analysis. Cambridge University Press, Cambridge, 1990. 23. Hsien-Kuei Hwang. Large deviations for combinatorial distributions. I. Central limit theorems. Ann. Appl. Probab., 6(1):297–319, 1996. 24. Hsien-Kuei Hwang. Large deviations of combinatorial distributions. II. Local limit theorems. Ann. Appl. Probab., 8(1):163–181, 1998. 25. J.-P. Jouanolou. Le formalisme du résultant. Adv. Math., 90(2):117–263, 1991. 26. Jessica Khera, Erik Lundberg, and Stephen Melczer. Asymptotic enumeration of lonesum matrices. arXiv preprint, arXiv:1912.08850, 2019. 27. William Letsou and Long Cai. Noncommutative biology: Sequential regulation of complex networks. PLOS Computational Biology, 12(8):e1005089, August 2016. 28. S. Melczer. A limit theorem for permutations with restricted positions. In preparation, 2020. 29. David Mumford. Algebraic geometry. I. 
Springer-Verlag, Berlin-New York, 1976. 30. Jay Pantone. The asymptotic number of simple singular vector tuples of a cubical tensor. Online Journal of Analytic Combinatorics, 12, 2017. 31. Robin Pemantle and Mark C. Wilson. Asymptotic expansions of oscillatory integrals with complex phase. In Algorithmic probability and combinatorics, volume 520 of Contemp. Math., pages 221–240. Amer. Math. Soc., Providence, RI, 2010. 32. Robin Pemantle and Mark C. Wilson. Analytic combinatorics in several variables, volume 140 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2013. 33. S Ramgoolam, Mark C Wilson, and A Zahabi. Quiver asymptotics: N=1 free chiral ring. Journal of Physics A: Mathematical and Theoretical, 53(10):105401, 2020. 34. Michael Wallner. A half-normal distribution scheme for generating functions. European J. Combin., 87:103138, 21, 2020.

Chapter 6

Application: Lattice Walks and Smooth ACSV

The first steps in the path of discovery, and the first approximate measures, are those which add most to the existing knowledge of mankind. – Charles Babbage

We now apply the asymptotic techniques developed in Chapter 5 to lattice path enumeration problems, using the generating function expressions derived in Chapter 4. As seen in Chapter 4, there is a correspondence between lattice path models having nice combinatorial properties and those whose generating functions admit particularly nice representations as diagonals of rational functions. Because Chapter 5 gives asymptotics in the presence of smooth contributing singularities, here we will consider lattice path models whose step sets have lots of symmetries. In addition to the precise results we derive, this chapter serves as an extended illustration of how to apply the powerful machinery of analytic combinatorics in several variables, including the determination of higher-order constants in the resulting asymptotic expansions. Chapter 10 of Part III will return to lattice path enumeration after a theory of analytic combinatorics for non-smooth points is developed in Chapter 9. Several of the asymptotic results in this chapter were originally given by Melczer and Mishna [3] and Melczer [2], on which our presentation is based.

6.1 Asymptotics of Highly Symmetric Orthant Walks

We consider walks in $\mathbb{N}^d$ on a set of steps $S \subset \{\pm 1, 0\}^d$, where each step $\mathbf{i} \in S$ is given some positive real weight $w_{\mathbf{i}} > 0$. As in Chapter 4, we define the weighted characteristic polynomial

\[ S(\mathbf{z}) = \sum_{\mathbf{i} \in S} w_{\mathbf{i}} \mathbf{z}^{\mathbf{i}} \]

associated to S. To make sure we are considering walks in the correct dimension, we always make the following assumption.

© Springer Nature Switzerland AG 2021 S. Melczer, Algorithmic and Symbolic Combinatorics, Texts & Monographs in Symbolic Computation, https://doi.org/10.1007/978-3-030-67080-1_6


Standing Assumption: Assume that for each $1 \le j \le d$ the set $S$ contains a step with a positive entry in its $j$th coordinate (and thus by symmetry also a negative step in its $j$th coordinate).

Recall from Chapter 4 that $S$ is called highly symmetric if it is symmetric over every axis and mostly symmetric if it is symmetric over all but one axis. When $S$ is highly or mostly symmetric we may assume its axis of non-symmetry, if it exists, lies in its $d$th coordinate. Thus, we may write

\[ S(\mathbf{z}) = \bar{z}_d A(\hat{\mathbf{z}}) + Q(\hat{\mathbf{z}}) + z_d B(\hat{\mathbf{z}}) \]

where $A$, $Q$, and $B$ are Laurent polynomials which are symmetric in their variables $\hat{\mathbf{z}} = (z_1, \dots, z_{d-1})$ and we again use the notation $\bar{x} = 1/x$ common in lattice path enumeration. Define the multivariate generating function

\[ W(\mathbf{z}, t) = \sum_{\mathbf{i} \in \mathbb{N}^d,\, n \ge 0} w_{\mathbf{i},n} \mathbf{z}^{\mathbf{i}} t^n, \]

where $w_{\mathbf{i},n}$ counts the number of weighted walks of length $n$ using the steps in $S$ which begin at the origin, end at $\mathbf{i} \in \mathbb{N}^d$, and never leave $\mathbb{N}^d$. Theorem 4.2 of Chapter 4 gives the representation $W(\mathbf{z}, t) = [\mathbf{z}^{\ge 0}] R(\mathbf{z}, t)$ when

\[ R(\mathbf{z}, t) = \frac{(z_1 - \bar{z}_1) \cdots (z_{d-1} - \bar{z}_{d-1}) \left( z_d - \bar{z}_d \frac{A(\hat{\mathbf{z}})}{B(\hat{\mathbf{z}})} \right)}{(z_1 \cdots z_d)(1 - tS(\mathbf{z}))} \]

is expanded in the formal ring $R = \mathbb{Q}((\mathbf{z}))[[t]] = \mathbb{Q}((z_1)) \cdots ((z_d))[[t]]$. Analytically, this formal expansion corresponds to a convergent Laurent expansion in a domain $D$ containing any point $(\mathbf{z}, t)$ where $|t|$ is sufficiently smaller than $|z_d|$, which in turn is sufficiently smaller than $|z_{d-1}|$, and so on until $|z_1|$, which is bounded away from zero. Proposition 4.8 of Chapter 4 gives the generating function expression

\[ W(\mathbf{1}, t) = \Delta \left( \frac{(1 + z_1) \cdots (1 + z_{d-1}) \left( B(\hat{\mathbf{z}}) - z_d^2 A(\hat{\mathbf{z}}) \right)}{(1 - z_d) B(\hat{\mathbf{z}}) \left( 1 - t z_1 \cdots z_d \bar{S}(\mathbf{z}) \right)} \right) \tag{6.1} \]

for the number of walks of length $n$ starting at the origin and ending anywhere in $\mathbb{N}^d$, where $\bar{S}(\mathbf{z}) = S(z_1, \dots, z_{d-1}, \bar{z}_d)$. Proposition 4.9 of that chapter gives an alternative formula, and similar expressions can be derived for walks ending on some collection of the boundary axes $\{\mathbf{z} \in \mathbb{R}^d : z_j = 0\}$.

Notice that the denominator of the rational function in (6.1) contains multiple factors. If $S$ is not highly symmetric then $A(\hat{\mathbf{z}}) \ne B(\hat{\mathbf{z}})$, meaning the denominator factor $1 - z_d$ does not divide the numerator and the singular variety under consideration will contain non-smooth points. Although there are situations when only smooth points dictate asymptotics in the mostly symmetric case, these models are better understood after a discussion on non-smooth ACSV, and we return to them in Chapter 10. We will see that the relative weight of steps moving towards or away from the bounding axes plays a deciding role in whether or not smooth points determine asymptotics.

When $S$ is highly symmetric then $A(\hat{\mathbf{z}}) = B(\hat{\mathbf{z}})$ and $S(\mathbf{z}) = \bar{S}(\mathbf{z})$, so (6.1) becomes

\[ W(\mathbf{1}, t) = \Delta \left( \frac{(1 + z_1) \cdots (1 + z_{d-1})(1 + z_d)}{1 - t z_1 \cdots z_d S(\mathbf{z})} \right). \tag{6.2} \]

The rational function in (6.2) now has a smooth singular variety, because the denominator and its partial derivative with respect to $t$ can never simultaneously vanish. The domain of convergence $D$ is the power series domain of convergence.
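The diagonal expressions above can be sanity-checked against brute-force counts for small lengths. The following is a minimal sketch, not taken from the text: a dynamic program over endpoints that returns the total weight of walks of each length staying in $\mathbb{N}^d$. The function name `count_orthant_walks` and the dictionary format for weighted step sets are assumptions made for illustration.

```python
from collections import defaultdict

def count_orthant_walks(steps, n_max):
    """Return [c_0, ..., c_{n_max}], where c_n is the total weight of walks of
    length n that start at the origin, use the given steps, and stay in N^d.

    `steps` maps each step (a tuple with entries in {-1, 0, 1}) to its weight.
    """
    d = len(next(iter(steps)))
    current = {(0,) * d: 1}              # weight of walks ending at each point
    counts = [1]
    for _ in range(n_max):
        nxt = defaultdict(int)
        for point, weight in current.items():
            for step, w in steps.items():
                new_point = tuple(p + s for p, s in zip(point, step))
                if min(new_point) >= 0:  # discard walks leaving the orthant
                    nxt[new_point] += weight * w
        current = nxt
        counts.append(sum(current.values()))
    return counts

# Simple walk in the quarter plane (unit steps in the four cardinal directions).
simple = {(1, 0): 1, (-1, 0): 1, (0, 1): 1, (0, -1): 1}
print(count_orthant_walks(simple, 6))    # [1, 2, 6, 18, 60, 200, 700]
```

The cost grows like $n^{d+1}$ times the number of steps, so this is only a reference for small cases, not a substitute for the asymptotic analysis.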

6.1.1 Asymptotics for All Walks in an Orthant

We begin by determining asymptotics for the number of walks beginning at the origin and ending anywhere in $\mathbb{N}^d$. Let

\[ G(\mathbf{z}) = (1 + z_1) \cdots (1 + z_d) \quad \text{and} \quad H(\mathbf{z}, t) = 1 - t z_1 \cdots z_d S(\mathbf{z}), \]

so that we are interested in asymptotics of the main diagonal power series coefficients of $F(\mathbf{z}, t) = G(\mathbf{z})/H(\mathbf{z}, t)$. Recall that $\mathbf{z}_{\hat{k}}$ denotes the vector $\mathbf{z}$ with its $k$th entry removed. Since we are considering highly symmetric models, for each $1 \le k \le d$ there exist unique Laurent polynomials $A_k(\mathbf{z}_{\hat{k}})$ and $Q_k(\mathbf{z}_{\hat{k}})$ such that

\[ S(\mathbf{z}) = (\bar{z}_k + z_k) A_k(\mathbf{z}_{\hat{k}}) + Q_k(\mathbf{z}_{\hat{k}}), \tag{6.3} \]

where $A_d(\mathbf{z}_{\hat{d}}) = A(\hat{\mathbf{z}}) = B(\hat{\mathbf{z}})$ as defined above. For any $1 \le k \le d$ the smooth critical point equation $t H_t(\mathbf{z}, t) = z_k H_{z_k}(\mathbf{z}, t)$ from (5.16) of Chapter 5 becomes

\[ t(z_1 \cdots z_d) S(\mathbf{z}) = t(z_1 \cdots z_d) S(\mathbf{z}) + t z_k (z_1 \cdots z_d) S_{z_k}(\mathbf{z}), \]

which implies

\[ 0 = t z_k (z_1 \cdots z_d) S_{z_k}(\mathbf{z}) = t (z_k^2 - 1)(z_1 \cdots z_{k-1} z_{k+1} \cdots z_d) A_k(\mathbf{z}_{\hat{k}}). \]

This gives a characterization of critical points in the main diagonal direction.

Lemma 6.1 The point $(\mathbf{x}, t) \in \mathcal{V}$ is a smooth critical point if and only if for each $1 \le k \le d$ either
• $x_k = \pm 1$, or
• the polynomial $(z_1 \cdots z_{k-1} z_{k+1} \cdots z_d) A_k(\mathbf{z}_{\hat{k}})$ has a root at $\mathbf{x}_{\hat{k}}$.


Note that it is possible to have an infinite set of critical points due to the second condition. This cannot happen for unweighted models in two dimensions (when every weight is one) but does occur for weighted models or when $d \ge 3$.

Example 6.1 (A Curve of Critical Points) Consider the unweighted highly symmetric model in three dimensions restricted to the non-negative octant defined by the step set

S = {(−1, 0, ±1), (1, 0, ±1), (0, 1, ±1), (0, −1, ±1), (±1, 1, 0), (±1, −1, 0)}.

Some algebraic manipulation shows

\[ H(x, y, z, t) = 1 - t(xyz) \sum_{\mathbf{i} \in S} x^{i_1} y^{i_2} z^{i_3} = 1 - t(z^2 + 1)(x + y)(xy + 1) - tz(y^2 + 1)(x^2 + 1), \]

and solving the system of smooth critical point equations gives the two isolated critical points

\[ \left(1, 1, 1, \tfrac{1}{12}\right) \quad \text{and} \quad \left(-1, -1, -1, -\tfrac{1}{12}\right) \]

together with a collection of non-isolated critical points defined by

\[ \left(x, 1, -1, \tfrac{1}{4x}\right), \quad \left(x, -1, 1, \tfrac{1}{4x}\right), \quad \left(1, y, -1, \tfrac{1}{4y}\right), \]
\[ \left(-1, y, 1, \tfrac{1}{4y}\right), \quad \left(1, -1, z, \tfrac{1}{4z}\right), \quad \left(-1, 1, z, \tfrac{1}{4z}\right) \]

for all $x, y, z \in \mathbb{C}_*$. Lemma 5.6 in Chapter 5 immediately implies that the non-isolated critical points are degenerate, which can also be verified by direct calculation (in fact, the Hessian matrix $\mathcal{H}$ in (5.25) has a row of zeroes at each of these points). Thankfully, none of these pathological critical points are minimal: if, for instance, $(x, 1, -1, 1/(4x))$ were minimal then Lemma 5.7 in Chapter 5 would imply

\[ 0 = H\left(|x|, 1, 1, \frac{1}{4|x|}\right) = -\frac{1 + |x|^2}{|x|}, \]

which can never occur.
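To see concretely why one of these families is critical, the smooth critical point equations can be checked by hand on the slice $y = 1$, $z = -1$. The computation below is a worked sketch using only the formula for $H$ in this example; the other five families follow in the same way by symmetry.

```latex
\begin{align*}
H(x,1,-1,t) &= 1 - 2t(x+1)^2 + 2t(x^2+1) = 1 - 4tx, \\
tH_t &= t\left[-2(x+1)^2 + 2(x^2+1)\right] = -4tx, \\
xH_x &= x\left[-2t(2x+2) + 4tx\right] = -4tx, \\
yH_y &= -2t(x^2+2x+1) + 2t(x^2+1) = -4tx, \\
zH_z &= (-1)\left[2t(x+1)^2 - 2t(x^2+1)\right] = -4tx .
\end{align*}
```

All four logarithmic derivatives agree for every $t$, so the critical point equations hold identically on this slice, and $H = 0$ then forces $t = 1/(4x)$.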

Our observation that the minimal critical points are well behaved in the last example holds more generally.

Proposition 6.1 The point

\[ \boldsymbol{\sigma} = \left(1, \dots, 1, \frac{1}{S(\mathbf{1})}\right) \]

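is a finitely minimal smooth contributing point. All minimal critical points lie in $T(\boldsymbol{\sigma}) \cap \mathcal{V}$, there are at most $2^d$ critical points in $T(\boldsymbol{\sigma}) \cap \mathcal{V}$, and if $(\mathbf{z}, t)$ is such a point then $\mathbf{z} \in \{\pm 1\}^d$.

Proof The point $\boldsymbol{\sigma}$ is critical by Lemma 6.1. Suppose $(\mathbf{x}, t_{\mathbf{x}})$ lies in $D(\boldsymbol{\sigma}) \cap \mathcal{V}$, where we note that any choice of $\mathbf{x}$ uniquely determines $t_{\mathbf{x}}$ on $\mathcal{V}$. Then

\[ \left| \sum_{\mathbf{i} \in S} w_{\mathbf{i}} \mathbf{x}^{\mathbf{i}+\mathbf{1}} \right| = \left| (x_1 \cdots x_d) \sum_{\mathbf{i} \in S} w_{\mathbf{i}} \mathbf{x}^{\mathbf{i}} \right| = \left| \frac{1}{t_{\mathbf{x}}} \right| \ge S(\mathbf{1}) = \sum_{\mathbf{i} \in S} w_{\mathbf{i}}. \]

Since $(\mathbf{x}, t_{\mathbf{x}}) \in D(\boldsymbol{\sigma})$ means $|x_k| \le 1$ for each $1 \le k \le d$, and each weight $w_{\mathbf{i}}$ is positive, the only way this can happen is if $|x_k| = 1$ for all $1 \le k \le d$ and the monomials $\mathbf{x}^{\mathbf{i}+\mathbf{1}}$ have the same complex argument for all $\mathbf{i} \in S$. By symmetry, and the assumption that we take a positive step in each direction, the set $\{\mathbf{x}^{\mathbf{i}+\mathbf{1}} : \mathbf{i} \in S\}$ contains two elements of the form

\[ x_2^{i_2+1} \cdots x_d^{i_d+1} \quad \text{and} \quad x_1^2 x_2^{i_2+1} \cdots x_d^{i_d+1}, \]

so $x_1^2$ is real and positive when these monomials have the same argument. Thus, since $|x_1| = 1$, we have $x_1 = \pm 1$, and applying the same reasoning to each coordinate gives the stated restriction on $\mathbf{x}$. Since $F(\mathbf{z})$ is combinatorial, Corollary 5.5 in Chapter 5 implies that $(\mathbf{x}, t_{\mathbf{x}}) \in \mathcal{V}$ is a minimal critical point only if $(|\mathbf{x}|, t_{|\mathbf{x}|})$ is also minimal and critical. Since $|\mathbf{x}| = (|x_1|, \dots, |x_d|)$ has non-negative coordinates, for every $1 \le k \le d$ the polynomial $A_k(|\mathbf{x}_{\hat{k}}|) \ne 0$ and Lemma 6.1 implies $(|\mathbf{x}|, t_{|\mathbf{x}|})$ is minimal and critical only if each $|x_k| = 1$. □

Remark 6.1 Our proof of Proposition 6.1 used the explicit representation of critical points in Lemma 6.1 to show any minimal critical point $(\mathbf{x}, t_{\mathbf{x}})$ lies in $T(\boldsymbol{\sigma})$. Alternatively, the smooth critical point equations imply $S_{z_1}(\mathbf{x}) = \cdots = S_{z_d}(\mathbf{x}) = 0$ and one may conclude directly from Proposition 4.5 in Chapter 4, which states that there is a unique positive solution to this system of equations (in this case the point $\mathbf{x} = \mathbf{1}$).

The collection of minimal critical points of $F(\mathbf{z}, t)$ thus form the finite set

\[ C = \left\{ \left(\mathbf{x}, \frac{1}{S(\mathbf{x})}\right) : \mathbf{x} \in \{\pm 1\}^d, \ |S(\mathbf{x})| = S(\mathbf{1}) \right\}. \]

In order to find asymptotics using Corollary 5.2 from Chapter 5 it remains only to determine the matrix $\mathcal{H}$ whose entries are given in (5.25), noting that for a $d$-dimensional model the function $F(\mathbf{z}, t)$ has $d + 1$ variables. If $(\mathbf{x}, t) \in \mathcal{V}$ is a critical point with $\mathbf{x} \in \{\pm 1\}^d$ then a direct calculation shows

\[ x_i x_j H_{z_i z_j}(\mathbf{x}) = \begin{cases} -1 & : i \ne j \\[4pt] -\dfrac{2 x_j A_j(\mathbf{x})}{S(\mathbf{x})} & : i = j \end{cases} \]

while

\[ x_j t H_{z_j t}(\mathbf{x}) = -1 \quad \text{and} \quad t^2 H_{tt}(\mathbf{x}) = 0 \]

for all $0 \le i \le j \le d$, so $\mathcal{H}$ at $(\mathbf{x}, t)$ is the $d \times d$ diagonal matrix

\[ \mathcal{H}_{\mathbf{x}} = \frac{2}{S(\mathbf{x})} \begin{pmatrix} x_1 A_1(\mathbf{x}) & 0 & \cdots & 0 \\ 0 & x_2 A_2(\mathbf{x}) & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & x_d A_d(\mathbf{x}) \end{pmatrix}. \]

If $(\mathbf{x}, t) \in C$ then $|S(\mathbf{x})| = S(\mathbf{1})$, so

\[ \frac{x_k A_k(\mathbf{x})}{S(\mathbf{x})} = \frac{x_k A_k(\mathbf{x}_{\hat{k}})}{(\bar{x}_k + x_k) A_k(\mathbf{x}_{\hat{k}}) + Q_k(\mathbf{x}_{\hat{k}})} = \frac{A_k(\mathbf{1})}{S(\mathbf{1})} \]

since $x_k A_k(\mathbf{x}_{\hat{k}})$ and $Q_k(\mathbf{x}_{\hat{k}})$ must share the same sign. Thus,

\[ \mathcal{H}_{\mathbf{x}} = \mathcal{H} = \frac{2}{S(\mathbf{1})} \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & a_d \end{pmatrix} \tag{6.4} \]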

is independent of $\mathbf{x}$, where $a_k = A_k(\mathbf{1})$ is the total weight of the steps in $S$ that have $k$th coordinate equal to 1. Since $G(\mathbf{z}) = (1 + z_1) \cdots (1 + z_d)$ does not vanish at $\boldsymbol{\sigma}$, but vanishes at any other minimal critical point, $\boldsymbol{\sigma}$ is the only point whose asymptotic contribution affects dominant asymptotics of the sequence under consideration. The fact that $\mathcal{H}$ is a diagonal matrix and does not depend on $\mathbf{x}$ helps simplify the calculations for determining higher order asymptotic terms, discussed below. Using the quantities we have computed with Corollary 5.2 from Chapter 5 immediately gives the following.

Theorem 6.1 Let $S \subset \{-1, 0, 1\}^d \setminus \{\mathbf{0}\}$ be a set of steps that is symmetric over every axis and moves forwards in each coordinate. Then the number $c_n$ of walks taking $n$ steps in $S$, beginning at the origin, and never leaving the orthant $\mathbb{N}^d$ has asymptotic expansion

\[ c_n = S(\mathbf{1})^n n^{-d/2} \left[ (a_1 \cdots a_d)^{-1/2} \pi^{-d/2} S(\mathbf{1})^{d/2} + O\left(\frac{1}{n}\right) \right]. \tag{6.5} \]

When every step has weight one then $S(\mathbf{1})$ is simply the number of steps in $S$, and $a_k$ is the number of steps moving forwards in the $k$th coordinate.

Example 6.2 (Highly Symmetric Models in Two Dimensions) There are four non-isomorphic unweighted highly symmetric models in the two-dimensional quarter plane, whose asymptotics are listed in Table 6.1; asymptotics for these models were originally conjectured by Bostan and Kauers [1]. Corollary 2.1 of Chapter 2 implies that none of these models admits an algebraic generating function.

Step set S                          Asymptotics
{(±1,0), (0,±1)}                    $4^n n^{-1} \cdot \frac{4}{\pi \sqrt{1 \times 1}} = \frac{4}{\pi} \cdot \frac{4^n}{n}$
{(±1,±1)}                           $4^n n^{-1} \cdot \frac{4}{\pi \sqrt{2 \times 2}} = \frac{2}{\pi} \cdot \frac{4^n}{n}$
{(±1,0), (±1,±1)}                   $6^n n^{-1} \cdot \frac{6}{\pi \sqrt{3 \times 2}} = \frac{\sqrt{6}}{\pi} \cdot \frac{6^n}{n}$
{(±1,0), (0,±1), (±1,±1)}           $8^n n^{-1} \cdot \frac{8}{\pi \sqrt{3 \times 3}} = \frac{8}{3\pi} \cdot \frac{8^n}{n}$

Table 6.1 The four highly symmetric models with unit steps in the quarter plane.
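The leading constant in (6.5) depends only on $S(\mathbf{1})$ and the forward weights $a_k$, so it is easy to evaluate mechanically. The following is a minimal sketch under that reading of Theorem 6.1; the function name `leading_constant` and the dictionary encoding of step sets are illustrative assumptions, and the code does not verify that the model is highly symmetric.

```python
import math

def leading_constant(steps):
    """Return (S(1), C) such that c_n ~ C * S(1)**n * n**(-d/2), following (6.5).

    `steps` maps each step in {-1, 0, 1}^d to a positive weight. The model is
    assumed, not checked, to be symmetric over every axis.
    """
    d = len(next(iter(steps)))
    total = sum(steps.values())                       # S(1)
    # a_k: total weight of the steps whose k-th coordinate equals +1
    a = [sum(w for s, w in steps.items() if s[k] == 1) for k in range(d)]
    const = total ** (d / 2) / (math.pi ** (d / 2) * math.sqrt(math.prod(a)))
    return total, const

cardinal = [(1, 0), (-1, 0), (0, 1), (0, -1)]
diagonal = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
models = {
    "simple":   {s: 1 for s in cardinal},
    "diagonal": {s: 1 for s in diagonal},
    "six":      {s: 1 for s in diagonal + [(1, 0), (-1, 0)]},
    "king":     {s: 1 for s in cardinal + diagonal},
}
for name, steps in models.items():
    print(name, leading_constant(steps))
```

The four constants printed can be compared with the closed forms $4/\pi$, $2/\pi$, $\sqrt{6}/\pi$, and $8/(3\pi)$ from Table 6.1.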

Example 6.3 (A Weighted Model) In two dimensions, a weighted lattice path model on the steps $S = \{\pm 1, 0\}^2 \setminus \{\mathbf{0}\}$ is highly symmetric if and only if each diagonal direction $(\pm 1, \pm 1)$ has the same weight $A > 0$, while the vertical steps $(0, \pm 1)$ both have the same weight $B > 0$, and the horizontal steps $(\pm 1, 0)$ have the same weight $C > 0$:

A  B  A
C     C
A  B  A

Under such a weighting,

\[ c_n = (4A + 2B + 2C)^n \, n^{-1} \, \frac{4A + 2B + 2C}{\pi \sqrt{(2A + B)(2A + C)}} \left( 1 + O\left(\frac{1}{n}\right) \right). \]

Theorem 6.1 immediately gives asymptotics for any explicit model, but is also flexible enough to give asymptotics for families of models with varying dimension.

Example 6.4 (Asymptotics for Maximal Step Sets) Let $S = \{\pm 1, 0\}^d \setminus \{\mathbf{0}\}$ be the step set containing all non-zero steps, which is highly symmetric when unweighted. Then $S(\mathbf{1}) = |S| = 3^d - 1$, and $a_k = 3^{d-1}$ for all $k$, so the total number of walks satisfies

\[ c_n = (3^d - 1)^n \, n^{-d/2} \, \frac{(3^d - 1)^{d/2}}{3^{d(d-1)/2} \pi^{d/2}} \left( 1 + O\left(\frac{1}{n}\right) \right). \]
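The constants in Example 6.4 come from a direct count. The short derivation below is a sketch in the notation of Theorem 6.1; its last line specializes to $d = 2$ and recovers the constant of the eight-step model in Table 6.1.

```latex
\begin{align*}
S(\mathbf{1}) &= \left|\{\pm 1, 0\}^d\right| - 1 = 3^d - 1, \\
a_k &= \left|\{\mathbf{i} \in \{\pm 1, 0\}^d : i_k = 1\}\right| = 3^{d-1}, \\
(a_1 \cdots a_d)^{-1/2} &= \left(3^{d(d-1)}\right)^{-1/2} = 3^{-d(d-1)/2}, \\
d = 2: \quad \frac{(3^2 - 1)^{2/2}}{3^{1} \pi^{1}} &= \frac{8}{3\pi}.
\end{align*}
```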


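Higher Order Asymptotics

The higher order terms in the asymptotic expansion of $c_n$ are determined by Theorem 5.2 in Chapter 5 through an application of Proposition 5.3; unlike the leading term, the contributions from minimal critical points other than $\boldsymbol{\sigma}$ may play a role. In particular, Theorem 5.2 implies that for any $M \in \mathbb{N}$ the contribution of a minimal critical point $(\mathbf{x}, t) \in C$ to the asymptotics of $c_n$ has the form

\[ S(\mathbf{x})^n \, n^{-d/2} (2\pi)^{-d/2} \det(\mathcal{H})^{-1/2} \left( \sum_{j=0}^{M} K_j^{\mathbf{x}} n^{-j} + O\left(n^{-M-1}\right) \right), \]

where $\mathcal{H}$ is determined by (6.4). The constant $K_j^{\mathbf{x}}$ satisfies

\[ K_j^{\mathbf{x}} = (-1)^j \sum_{0 \le \ell \le 2j} \left. \frac{\mathcal{E}^{\ell+j} \left( P_{\mathbf{x}}(\boldsymbol{\theta}) \, \tilde{\psi}(\boldsymbol{\theta})^{\ell} \right)}{2^{\ell+j} \, \ell! \, (\ell+j)!} \right|_{\boldsymbol{\theta} = \mathbf{0}}, \]

where

\[ P_{\mathbf{x}}(\boldsymbol{\theta}) = \left(1 + x_1 e^{i\theta_1}\right) \cdots \left(1 + x_d e^{i\theta_d}\right), \]

$\mathcal{E}$ is the differential operator

\[ \mathcal{E} = -\frac{S(\mathbf{1})}{2} \left( \frac{1}{a_1} \partial_1^2 + \cdots + \frac{1}{a_d} \partial_d^2 \right), \tag{6.6} \]

and

\[ \tilde{\psi}(\boldsymbol{\theta}) = -\log\left( \frac{S(x_1 e^{i\theta_1}, \dots, x_d e^{i\theta_d})}{S(\mathbf{x})} \right) - \boldsymbol{\theta}^T \mathcal{H} \boldsymbol{\theta}/2 = -\log S(e^{i\theta_1}, \dots, e^{i\theta_d}) + \log S(\mathbf{1}) - \boldsymbol{\theta}^T \mathcal{H} \boldsymbol{\theta}/2. \tag{6.7} \]

Because $S$ is highly symmetric $S(e^{i\theta_1}, \dots, e^{i\theta_d})$ is an even function in each variable, which implies its power series expansion at the origin contains only monomials with even exponents. This, in turn, implies the power series expansion of $\tilde{\psi}(\boldsymbol{\theta})$ at the origin also contains only monomials with even exponents. Since, by construction, $\tilde{\psi}$ vanishes to at least order 3 at the origin, it therefore vanishes to order 4 at the origin.

Example 6.5 (Higher Asymptotics Terms of Simple Walks) Consider the set of cardinal directions $S = \{(1, 0), (0, 1), (-1, 0), (0, -1)\}$. Then $S(x, y) = x + \bar{x} + y + \bar{y}$ and $A_1(y) = A_2(x) = 1$ so $\mathcal{H}$ is the $2 \times 2$ identity matrix multiplied by $1/2$. The set $C$ of critical points consists of the two elements $\boldsymbol{\sigma} = (1, 1, 1/4)$ and $(-1, -1, -1/4)$. For $\rho = \pm 1$ let

\[ P_{\rho}(\boldsymbol{\theta}) = \left(1 + \rho e^{i\theta_1}\right)\left(1 + \rho e^{i\theta_2}\right). \]

Here $\mathcal{E} = -2(\partial_1^2 + \partial_2^2)$ and

\begin{align*}
\tilde{\psi}(\boldsymbol{\theta}) &= -\log\left( e^{i\theta_1} + e^{-i\theta_1} + e^{i\theta_2} + e^{-i\theta_2} \right) + \log 4 - (\theta_1^2 + \theta_2^2)/4 \\
&= -\log\left( \cos(\theta_1) + \cos(\theta_2) \right) + \log 2 - (\theta_1^2 + \theta_2^2)/4 \\
&= \frac{\theta_1^4}{96} + \frac{\theta_1^2 \theta_2^2}{16} + \frac{\theta_2^4}{96} + \cdots.
\end{align*}

Thus, for any $M \in \mathbb{N}$ there is an asymptotic expansion

\[ c_n = \frac{4^n}{n\pi} \left( \sum_{j=0}^{M} \left[ K_j^{(+1)} + (-1)^n K_j^{(-1)} \right] n^{-j} + O\left(n^{-M-1}\right) \right), \]

where

\[ K_j^{\rho} = \sum_{\substack{0 \le \ell \le 2j \\ 0 \le k \le \ell+j}} \left. \frac{(-1)^{\ell}}{\ell! \, (\ell+j)!} \binom{\ell+j}{k} \, \partial_1^{2k} \partial_2^{2\ell+2j-2k} \left( P_{\rho}(\boldsymbol{\theta}) \, \tilde{\psi}(\boldsymbol{\theta})^{\ell} \right) \right|_{\boldsymbol{\theta} = \mathbf{0}}. \]

Initial terms can easily be computed in a computer algebra system to obtain

\[ c_n = \frac{4^n}{\pi n} \left( 4 - \frac{6}{n} + \frac{19 + 2(-1)^n}{2n^2} - \frac{63 + 18(-1)^n}{4n^3} + \cdots \right). \]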
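The constants $K_j^{\rho}$ of Example 6.5 can be reproduced with exact rational arithmetic. The following is a minimal sketch, not the worksheet accompanying the text: it works with truncated bivariate power series stored as dictionaries, and all function names are illustrative. It relies on one simplification that is an assumption of this sketch rather than a step in the text: since $\tilde{\psi}$ contains only even powers and the derivatives $\partial_1^{2k} \partial_2^{2m-2k}$ at the origin pick out even-even monomials, each factor $1 + \rho e^{i\theta}$ may be replaced by its even part $1 + \rho \cos\theta$.

```python
from fractions import Fraction
from math import comb, factorial

def mul(p, q, deg):
    """Product of bivariate polynomials {(i, j): coeff}, truncated at total degree deg."""
    out = {}
    for (a, b), c in p.items():
        for (e, f), g in q.items():
            if a + b + e + f <= deg:
                out[(a + e, b + f)] = out.get((a + e, b + f), 0) + c * g
    return out

def add(p, q, scale=1):
    """Return p + scale * q."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + scale * c
    return out

def cos_series(var, deg):
    """Taylor polynomial of cos(theta_var) up to degree deg (var is 0 or 1)."""
    return {((m, 0) if var == 0 else (0, m)): Fraction((-1) ** (m // 2), factorial(m))
            for m in range(0, deg + 1, 2)}

def psi_series(deg):
    """Series of -log((cos t1 + cos t2)/2) - (t1^2 + t2^2)/4 up to degree deg."""
    one = {(0, 0): Fraction(1)}
    # u = 1 - (cos t1 + cos t2)/2 has no constant term, so -log(1 - u) = sum_m u^m / m
    u = add(add(one, cos_series(0, deg), Fraction(-1, 2)), cos_series(1, deg), Fraction(-1, 2))
    psi, power = {}, one
    for m in range(1, deg // 2 + 1):
        power = mul(power, u, deg)
        psi = add(psi, power, Fraction(1, m))
    psi = add(psi, {(2, 0): Fraction(1, 4), (0, 2): Fraction(1, 4)}, -1)
    return {key: c for key, c in psi.items() if c != 0 and sum(key) <= deg}

def K(j, rho):
    """Constant K_j^rho for the simple walk, using the double-sum formula of Example 6.5."""
    deg = 6 * j                      # derivative order is 2(l + j) with l <= 2j
    one = {(0, 0): Fraction(1)}
    psi = psi_series(deg)
    # even part of (1 + rho e^{i t1})(1 + rho e^{i t2})
    P = mul(add(one, cos_series(0, deg), rho), add(one, cos_series(1, deg), rho), deg)
    total, term = Fraction(0), P     # term holds P * psi^l
    for l in range(2 * j + 1):
        m = l + j
        for k in range(m + 1):
            coeff = term.get((2 * k, 2 * (m - k)), 0)
            # a derivative at the origin equals factorials times a series coefficient
            total += (Fraction((-1) ** l, factorial(l) * factorial(m)) * comb(m, k)
                      * factorial(2 * k) * factorial(2 * (m - k)) * coeff)
        term = mul(term, psi, deg)
    return total

print([K(j, 1) for j in range(3)], [K(j, -1) for j in range(3)])
```

For $j \le 2$ the values agree with the displayed expansion: $K_0^{(+1)} = 4$, $K_1^{(+1)} = -6$, $K_2^{(+1)} = 19/2$, while $K_0^{(-1)} = K_1^{(-1)} = 0$ and $K_2^{(-1)} = 1$. Truncating at total degree $6j$ is enough because the highest derivative appearing in $K_j^{\rho}$ has order $2(\ell + j) \le 6j$.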

6.1.2 Asymptotics for Boundary Returns

It is often of combinatorial interest to determine the number of walks which end on some or all of the boundary hyperplanes of $\mathbb{N}^d$. By permuting coordinates, to enumerate walks returning to $r$ of the boundary hyperplanes it is sufficient to determine the counting sequence $(e_n)$ of walks which return to the boundary axes defined by $z_j = 0$ for $1 \le j \le r$. Applying Proposition 3.14 from Chapter 3 to our representation

\[ W(\mathbf{z}, t) = [\mathbf{z}^{\ge 0}] \frac{(z_1 - \bar{z}_1) \cdots (z_{d-1} - \bar{z}_{d-1})(z_d - \bar{z}_d)}{(z_1 \cdots z_d)(1 - tS(\mathbf{z}))} \]

for the multivariate generating function $W(\mathbf{z}, t)$ implies that the generating function for $e_n$, given by $W(\mathbf{z}, t)$ after substituting $z_j = 0$ for $1 \le j \le r$ and $z_j = 1$ otherwise, is the main power series diagonal of

\[ F(\mathbf{z}, t) = \frac{G(\mathbf{z})}{H(\mathbf{z}, t)} = \frac{(1 - z_1^2) \cdots (1 - z_r^2)(1 + z_{r+1}) \cdots (1 + z_d)}{1 - t z_1 \cdots z_d S(\mathbf{z})}. \]

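As the denominator $H$ is unchanged from above, the set of minimal critical points

\[ C = \left\{ \left(\mathbf{x}, \frac{1}{S(\mathbf{x})}\right) : \mathbf{x} \in \{\pm 1\}^d, \ |S(\mathbf{x})| = S(\mathbf{1}) \right\} \]

is also unchanged, and asymptotics are still determined by applying Corollary 5.2 from Chapter 5 with the quantities $\mathcal{H}$, $\mathcal{E}$, and $\tilde{\psi}$ in (6.4), (6.6), and (6.7), respectively. On the other hand, when $r > 0$ the numerator $G(\mathbf{z})$ now vanishes at $\boldsymbol{\sigma} = (\mathbf{1}, 1/S(\mathbf{1}))$ and we must look at higher order terms to determine even the dominant asymptotic growth of $e_n$. Although this requires getting down in the computational muck, the vanishing of the numerator actually helps make the computations more tractable.

For $(\mathbf{x}, t) \in C$ we have $x_j^2 = 1$ for all $j$, so defining $P_{\mathbf{x}}(\boldsymbol{\theta}) = G(x_1 e^{i\theta_1}, \dots, x_d e^{i\theta_d})$ implies

\begin{align*}
P_{\mathbf{x}}(\boldsymbol{\theta}) &= \left(1 - e^{2i\theta_1}\right) \cdots \left(1 - e^{2i\theta_r}\right)\left(1 + x_{r+1} e^{i\theta_{r+1}}\right) \cdots \left(1 + x_d e^{i\theta_d}\right) \\
&= (\theta_1 \cdots \theta_r) \left[ (-2i)^r (1 + x_{r+1}) \cdots (1 + x_d) + \text{higher order terms} \right].
\end{align*}

Our goal is to determine the smallest $j \in \mathbb{N}$ such that the coefficient

\[ K_j^{\mathbf{x}} = (-1)^j \sum_{0 \le \ell \le 2j} \left. \frac{\mathcal{E}^{\ell+j} \left( P_{\mathbf{x}}(\boldsymbol{\theta}) \, \tilde{\psi}(\boldsymbol{\theta})^{\ell} \right)}{2^{\ell+j} \, \ell! \, (\ell+j)!} \right|_{\boldsymbol{\theta} = \mathbf{0}} \]

is non-zero for some $(\mathbf{x}, t) \in C$. Because every term in the power series expansion of $P_{\mathbf{x}}$ at the origin is divisible by $\theta_1 \cdots \theta_r$, and because the differential operator $\mathcal{E}$ is a linear combination of $\partial_1^2, \dots, \partial_d^2$, the smallest power of $\mathcal{E}$ which can give a non-zero term when applied to $P_{\mathbf{x}}$ and evaluated at the origin is $\mathcal{E}^r$ (which contains a non-zero multiple of $\partial_1^2 \cdots \partial_r^2$ as a summand). Furthermore, the function $\tilde{\psi}(\boldsymbol{\theta})$ vanishes to order 4 at the origin, so for any $\ell \in \mathbb{N}$ the smallest power of $\mathcal{E}$ which can give a non-zero term when applied to $\tilde{\psi}(\boldsymbol{\theta})^{\ell}$ and evaluated at the origin is $\mathcal{E}^{2\ell}$.

Since the differential operator $\mathcal{E}^{\ell+j}$ has order $2(\ell + j)$, the term

\[ \left. \mathcal{E}^{\ell+j} \left( P_{\mathbf{x}}(\boldsymbol{\theta}) \, \tilde{\psi}(\boldsymbol{\theta})^{\ell} \right) \right|_{\boldsymbol{\theta} = \mathbf{0}} \]

is thus zero unless $2(\ell + j) \ge 2r + 4\ell$. The smallest value of $j$ for which this inequality holds is $j = r$, and the inequality holds in this case only when $\ell = 0$. Furthermore, the only term in the power $\mathcal{E}^r$ which can be applied to $P_{\mathbf{x}}(\boldsymbol{\theta})$ to give something non-zero when evaluated at the origin is $\partial_1^2 \cdots \partial_r^2$. Since

\[ \mathcal{E}^r = \left(-\frac{S(\mathbf{1})}{2}\right)^r \left( \frac{1}{a_1} \partial_1^2 + \cdots + \frac{1}{a_d} \partial_d^2 \right)^r = \frac{(-1)^r S(\mathbf{1})^r \, r!}{2^r a_1 \cdots a_r} \, \partial_1^2 \cdots \partial_r^2 + \cdots \]

and the coefficient of $\theta_1^2 \cdots \theta_r^2$ in the power series expansion of $P_{\mathbf{x}}$ at the origin is $2^r (1 + x_{r+1}) \cdots (1 + x_d)$, it follows that

\begin{align*}
\left. \mathcal{E}^{r+0} \left( P_{\mathbf{x}}(\boldsymbol{\theta}) \, \tilde{\psi}(\boldsymbol{\theta})^{0} \right) \right|_{\boldsymbol{\theta} = \mathbf{0}} &= \frac{(-1)^r S(\mathbf{1})^r \, r!}{2^r a_1 \cdots a_r} \times 2^r (1 + x_{r+1}) \cdots (1 + x_d) \, \partial_1^2 \cdots \partial_r^2 \left( \theta_1^2 \cdots \theta_r^2 \right) \\
&= \frac{(-1)^r S(\mathbf{1})^r \, 2^r (1 + x_{r+1}) \cdots (1 + x_d) \, r!}{a_1 \cdots a_r}.
\end{align*}

If $x_j = -1$ for any $j \ge r + 1$ then the term $K_r^{\mathbf{x}}$ is zero. If $x_j = 1$ for all $j \ge r + 1$ then

\begin{align*}
K_r^{\mathbf{x}} &= (-1)^r \sum_{0 \le \ell \le 2r} \left. \frac{\mathcal{E}^{\ell+r} \left( P_{\mathbf{x}}(\boldsymbol{\theta}) \, \tilde{\psi}(\boldsymbol{\theta})^{\ell} \right)}{2^{\ell+r} \, \ell! \, (\ell+r)!} \right|_{\boldsymbol{\theta} = \mathbf{0}} \\
&= (-1)^r \left. \frac{\mathcal{E}^{r+0} \left( P_{\mathbf{x}}(\boldsymbol{\theta}) \, \tilde{\psi}(\boldsymbol{\theta})^{0} \right)}{2^r \, r!} \right|_{\boldsymbol{\theta} = \mathbf{0}} \\
&= \frac{S(\mathbf{1})^r \, 2^{d-r}}{a_1 \cdots a_r}.
\end{align*}

After this careful set of computations, we have proven the following theorem.

Theorem 6.2 Let $S \subset \{-1, 0, 1\}^d \setminus \{\mathbf{0}\}$ be a set of steps that is symmetric over every axis and moves forwards in each coordinate. Let $A \subset \{1, \dots, d\}$ be a set of $r$ elements and let

\[ C_A = \left\{ \left(\mathbf{x}, \frac{1}{S(\mathbf{x})}\right) : \mathbf{x} \in \{\pm 1\}^d, \ x_j = 1 \text{ if } j \notin A, \ |S(\mathbf{x})| = S(\mathbf{1}) \right\}. \]

Then the number $e_n^A$ of walks of length $n$ taking steps in $S$, beginning at the origin, never leaving the orthant $\mathbb{N}^d$, and ending on the intersection of hyperplanes $\{\mathbf{x} : x_j = 0 \text{ for } j \in A\}$ has asymptotic expansion

\[ e_n^A = S(\mathbf{1})^n \, n^{-d/2-r} \left[ \frac{S(\mathbf{1})^{r+d/2}}{\pi^{d/2} \, 2^r \sqrt{a_1 \cdots a_d} \, \prod_{j \in A} a_j} \sum_{(\mathbf{x}, t) \in C_A} \operatorname{sgn}(S(\mathbf{x}))^n + O\left(\frac{1}{n}\right) \right]. \tag{6.8} \]

Note that the term $\sum_{(\mathbf{x}, t) \in C_A} \operatorname{sgn}(S(\mathbf{x}))^n$ equals $|C_A| > 0$ when $n$ is even, but could be zero when $n$ is odd to account for periodicities.

Example 6.6 (Boundary Walks in Two Dimensions) Consider the unweighted set of six steps $S = \{(\pm 1, \pm 1), (0, \pm 1)\}$.

[Figure: the step set S]

Then $a_1 = 2$, $a_2 = 3$, and the minimal critical points arising in the ACSV analysis form the set $C = \{(1, 1, 1/6), (1, -1, 1/6)\}$. The number of walks returning to the axis $x = 0$ has dominant asymptotics

\[ e_n^{\{1\}} \sim 6^n n^{-2} \, \frac{6^2}{\pi \, 2 \sqrt{a_1 a_2} \, a_1} = 6^n n^{-2} \, \frac{3\sqrt{6}}{2\pi}, \]

while the number of walks returning to the axis $y = 0$ has dominant asymptotics

\[ e_n^{\{2\}} \sim 6^n n^{-2} \, \frac{6^2}{\pi \, 2 \sqrt{a_1 a_2} \, a_2} \left(1 + (-1)^n\right) = 6^n n^{-2} \, \frac{\sqrt{6}}{\pi} \left(1 + (-1)^n\right), \]

and the number of walks returning to the origin $x = y = 0$ has dominant asymptotics

\[ e_n^{\{1,2\}} \sim 6^n n^{-3} \, \frac{6^3}{\pi \, 2^2 \sqrt{a_1 a_2} \, a_1 a_2} \left(1 + (-1)^n\right) = 6^n n^{-3} \, \frac{3\sqrt{6}}{2\pi} \left(1 + (-1)^n\right). \]

Note that only even length walks can end at a point with $y = 0$.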
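The data entering (6.8) can be assembled mechanically for a given model and boundary set $A$. The following is a minimal sketch under that reading of Theorem 6.2: it returns the leading constant together with the signs $\operatorname{sgn} S(\mathbf{x})$ over $C_A$, found by enumerating $\mathbf{x} \in \{\pm 1\}^d$. The function name `boundary_asymptotics`, its return format, and the dictionary encoding of steps are assumptions made for illustration; high symmetry of the model is assumed, not checked.

```python
import math
from itertools import product

def boundary_asymptotics(steps, A):
    """Leading data of (6.8) for walks ending on {x_j = 0 : j in A}.

    `steps` maps steps in {-1, 0, 1}^d to positive weights and `A` is a set of
    1-indexed coordinates. Returns (S(1), exponent, constant, signs) so that
    e_n ~ constant * S(1)**n * n**(-exponent) * sum(s**n for s in signs).
    """
    d, r = len(next(iter(steps))), len(A)

    def S(x):
        # for x_k = +-1 the power x_k**s_k equals x_k when s_k = +-1, and 1 otherwise
        return sum(w * math.prod(xk if sk else 1 for xk, sk in zip(x, s))
                   for s, w in steps.items())

    total = S((1,) * d)
    a = [sum(w for s, w in steps.items() if s[k] == 1) for k in range(d)]
    const = total ** (r + d / 2) / (
        math.pi ** (d / 2) * 2 ** r * math.sqrt(math.prod(a))
        * math.prod(a[j - 1] for j in A))
    signs = []
    for x in product((1, -1), repeat=d):
        free_ok = all(x[j] == 1 for j in range(d) if j + 1 not in A)
        if free_ok and abs(S(x)) == total:
            signs.append(1 if S(x) > 0 else -1)
    return total, d / 2 + r, const, signs

six = {(1, 1): 1, (1, -1): 1, (-1, 1): 1, (-1, -1): 1, (0, 1): 1, (0, -1): 1}
for A in ({1}, {2}, {1, 2}):
    print(sorted(A), boundary_asymptotics(six, A))
```

For the six-step model of Example 6.6 the sign lists are `[1]`, `[1, -1]`, and `[1, -1]`, matching the absence or presence of the factor $1 + (-1)^n$ in the three displayed formulas.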

6.1.3 Parameterizing the Starting Point

So far we have only considered highly symmetric models whose walks start at the origin in $\mathbb{N}^d$. Problem 6.2 asks you to generalize the methods of Section 4.1.5 to prove that the multivariate generating function $W(\mathbf{z}, t)$ tracking the endpoint and length of highly symmetric walks beginning at $\mathbf{p} \in \mathbb{N}^d$, taking steps in $S$, and staying in $\mathbb{N}^d$ satisfies

\[ W(\mathbf{z}, t) = [\mathbf{z}^{\ge 0}] \frac{\left( z_1^{p_1+1} - z_1^{-(p_1+1)} \right) \cdots \left( z_d^{p_d+1} - z_d^{-(p_d+1)} \right)}{(z_1 \cdots z_d)(1 - tS(\mathbf{z}))}, \]

where the non-negative extraction occurs in $\mathbb{Q}[\mathbf{z}, \bar{\mathbf{z}}][[t]]$. Thus, the univariate generating function counting the number of walks starting at $\mathbf{p}$ and returning to the set $\{\mathbf{x} \in \mathbb{R}^d : x_1 = \cdots = x_r = 0\}$ is the main power series diagonal of

\begin{align*}
F(\mathbf{z}, t) &= \frac{(z_1 \cdots z_d) \left( z_1^{-(p_1+1)} - z_1^{p_1+1} \right) \cdots \left( z_d^{-(p_d+1)} - z_d^{p_d+1} \right)}{(1 - z_{r+1}) \cdots (1 - z_d)(1 - t z_1 \cdots z_d S(\mathbf{z}))} \\
&= \frac{\mathbf{z}^{-\mathbf{p}} \left( 1 - z_1^{2p_1+2} \right) \cdots \left( 1 - z_r^{2p_r+2} \right) [z_{r+1}]_{p_{r+1}} \cdots [z_d]_{p_d}}{1 - t z_1 \cdots z_d S(\mathbf{z})}, \tag{6.9}
\end{align*}

where

\[ [z_j]_{p_j} = 1 + z_j + \cdots + z_j^{2p_j+1}. \]

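Example 6.7 (Walks with Prescribed Start but Free End) The generating function counting the number of walks beginning at $\mathbf{z} = \mathbf{p}$ and ending anywhere in $\mathbb{N}^d$ is given by

\[ \Delta \left( \frac{\mathbf{z}^{-\mathbf{p}} \left( 1 + z_1 + \cdots + z_1^{2p_1+1} \right) \cdots \left( 1 + z_d + \cdots + z_d^{2p_d+1} \right)}{1 - t z_1 \cdots z_d S(\mathbf{z})} \right). \]

For any starting point $\mathbf{p} \in \mathbb{N}^d$, the only minimal critical point where the numerator of this rational function doesn't vanish is $\boldsymbol{\sigma} = (\mathbf{1}, 1/S(\mathbf{1}))$.

To determine asymptotics we again aim to compute the smallest $j \in \mathbb{N}$ such that

\[ K_j^{\mathbf{x}} = (-1)^j \sum_{0 \le \ell \le 2j} \left. \frac{\mathcal{E}^{\ell+j} \left( P_{\mathbf{x}}(\boldsymbol{\theta}) \, \tilde{\psi}(\boldsymbol{\theta})^{\ell} \right)}{2^{\ell+j} \, \ell! \, (\ell+j)!} \right|_{\boldsymbol{\theta} = \mathbf{0}} \]

is non-zero for some $(\mathbf{x}, t) \in C$, where the set of minimal critical points $C$, together with $\tilde{\psi}$, $\mathcal{H}$, and $\mathcal{E}$ are unchanged from above as the denominator of $F$ is again unchanged, and

\[ P_{\mathbf{x}}(\boldsymbol{\theta}) = \mathbf{x}^{-\mathbf{p}} e^{-i \mathbf{p} \cdot \boldsymbol{\theta}} \left( 1 - e^{2(p_1+1) i \theta_1} \right) \cdots \left( 1 - e^{2(p_r+1) i \theta_r} \right) \left[ x_{r+1} e^{i\theta_{r+1}} \right]_{p_{r+1}} \cdots \left[ x_d e^{i\theta_d} \right]_{p_d}. \]

Since $\tilde{\psi}$ and $\mathcal{E}$ are unchanged from our previous computations, the same arguments as above show that the smallest value of $j$ such that $K_j^{\mathbf{x}}$ could be non-zero is $j = r$. A short computation shows that the coefficient of $\theta_1^2 \cdots \theta_r^2$ in the power series expansion of $P_{\mathbf{x}}$ at the origin is

\[ \Gamma_{\mathbf{x}} = \mathbf{x}^{-\mathbf{p}} \, 2^r (1 + p_1) \cdots (1 + p_r) \, [x_{r+1}]_{p_{r+1}} \cdots [x_d]_{p_d}, \]

so that

\[ K_r^{\mathbf{x}} = (-1)^r \left. \frac{\mathcal{E}^{r+0} \left( P_{\mathbf{x}}(\boldsymbol{\theta}) \, \tilde{\psi}(\boldsymbol{\theta})^{0} \right)}{2^r \, r!} \right|_{\boldsymbol{\theta} = \mathbf{0}} = \frac{S(\mathbf{1})^r \, \Gamma_{\mathbf{x}}}{2^r a_1 \cdots a_r}. \]

If there exists a $j > r$ such that $x_j = -1$ then $[x_j]_{p_j} = 0$, and thus $\Gamma_{\mathbf{x}}$ and $K_r^{\mathbf{x}}$ are zero. Otherwise $[x_j]_{p_j} = [1]_{p_j} = 2p_j + 2$ and, noting that $\mathbf{x}^{-\mathbf{p}} = \mathbf{x}^{\mathbf{p}}$ since $\mathbf{x} \in \{\pm 1\}^d$,

\[ K_r^{\mathbf{x}} = \frac{S(\mathbf{1})^r \, 2^{d-r}}{a_1 \cdots a_r} \, \mathbf{x}^{\mathbf{p}} \prod_{j=1}^{d} (1 + p_j). \]

Ultimately, we obtain the following generalization of Theorem 6.2.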


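Theorem 6.3 Let $S \subset \{-1, 0, 1\}^d \setminus \{\mathbf{0}\}$ be a set of steps that is symmetric over every axis and moves forwards in each coordinate. Let $A \subset \{1, \dots, d\}$ be a set of $r$ elements and let

\[ C_A = \left\{ \left(\mathbf{x}, \frac{1}{S(\mathbf{x})}\right) : \mathbf{x} \in \{\pm 1\}^d, \ x_j = 1 \text{ if } j \notin A, \ |S(\mathbf{x})| = S(\mathbf{1}) \right\}. \]

Then the number $e_n^{A,\mathbf{p}}$ of walks of length $n$ taking steps in $S$, beginning at $\mathbf{p} \in \mathbb{N}^d$, never leaving the orthant $\mathbb{N}^d$, and ending on the intersection of hyperplanes $\{\mathbf{x} : x_j = 0 \text{ for } j \in A\}$ has asymptotic expansion

\[ e_n^{A,\mathbf{p}} = S(\mathbf{1})^n \, n^{-d/2-r} \left[ \frac{S(\mathbf{1})^{r+d/2} \prod_{j=1}^{d} (1 + p_j)}{\pi^{d/2} \, 2^r \sqrt{a_1 \cdots a_d} \, \prod_{j \in A} a_j} \sum_{(\mathbf{x}, t) \in C_A} \mathbf{x}^{\mathbf{p}} \operatorname{sgn}(S(\mathbf{x}))^n + O\left(\frac{1}{n}\right) \right]. \]

As above, the term $\sum_{(\mathbf{x}, t) \in C_A} \mathbf{x}^{\mathbf{p}} \operatorname{sgn}(S(\mathbf{x}))^n$ helps account for underlying periodicities in the model when considering walks returning to the boundary axes.

Example 6.8 (Asymptotics of Walks with Prescribed Start but Free End) When counting walks beginning at $\mathbf{z} = \mathbf{p}$ and ending anywhere in $\mathbb{N}^d$, the set $C_A$ contains only the point $\boldsymbol{\sigma} = (\mathbf{1}, 1/S(\mathbf{1}))$, and Theorem 6.3 states

\[ e_n^{A,\mathbf{p}} = S(\mathbf{1})^n \, n^{-d/2} \, \frac{S(\mathbf{1})^{d/2} \prod_{j=1}^{d} (1 + p_j)}{\pi^{d/2} \sqrt{a_1 \cdots a_d}} \left( 1 + O\left(\frac{1}{n}\right) \right). \]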

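For any fixed model one may use a computer algebra package to determine the asymptotic series for $e_n^{A,\mathbf{p}}$ to arbitrary precision. Each term will now depend on the starting point $\mathbf{p}$.

Example 6.9 (Higher Order Constants for Simple Walk in Two Dimensions) Consider the set of cardinal directions $S = \{(1, 0), (0, 1), (-1, 0), (0, -1)\}$. As above the set $C$ of critical points consists of the two elements $\boldsymbol{\sigma} = (1, 1, 1/4)$ and $(-1, -1, -1/4)$, the differential operator $\mathcal{E} = -2(\partial_1^2 + \partial_2^2)$ and

\[ \tilde{\psi}(\boldsymbol{\theta}) = -\log\left( \cos(\theta_1) + \cos(\theta_2) \right) + \log 2 - (\theta_1^2 + \theta_2^2)/4. \]

For $\rho = \pm 1$ and $(a, b) \in \mathbb{N}^2$ we now define

\begin{align*}
P_{\rho}^{a,b}(\boldsymbol{\theta}) &= e^{-i a \theta_1} \left( 1 + \cdots + \rho^{2a+1} e^{i(2a+1)\theta_1} \right) e^{-i b \theta_2} \left( 1 + \cdots + \rho^{2b+1} e^{i(2b+1)\theta_2} \right) \\
&= e^{-i(a\theta_1 + b\theta_2)} \left( \frac{1 - e^{i 2(a+1)\theta_1}}{1 - \rho e^{i\theta_1}} \right) \left( \frac{1 - e^{i 2(b+1)\theta_2}}{1 - \rho e^{i\theta_2}} \right).
\end{align*}

Then the number of walks on the steps $S$ beginning at $(a, b)$ and ending anywhere in $\mathbb{N}^2$ has an asymptotic expansion

\[ \frac{c_n^{a,b}}{(1 + a)(1 + b)} = \frac{4^n}{n\pi} \left( \sum_{j=0}^{M} \left[ K_j^{+1} + (-1)^n K_j^{-1} \right] n^{-j} + O\left(n^{-M-1}\right) \right), \]

where

\[ K_j^{\rho} = \sum_{\substack{0 \le \ell \le 2j \\ 0 \le k \le \ell+j}} \left. \frac{(-1)^{\ell}}{\ell! \, (\ell+j)!} \binom{\ell+j}{k} \, \partial_1^{2k} \partial_2^{2\ell+2j-2k} \left( P_{\rho}^{a,b}(\boldsymbol{\theta}) \, \tilde{\psi}(\boldsymbol{\theta})^{\ell} \right) \right|_{\boldsymbol{\theta} = \mathbf{0}}. \]

Computing the first terms of this expansion gives

\[ c_n^{a,b} = \frac{4^n (a + 1)(b + 1)}{\pi n} \left( 4 - \frac{2(2a^2 + 2b^2 + 4a + 4b + 9)}{3n} + \frac{p(a, b) + 90(-1)^{a+b+n}}{90 n^2} + O\left(\frac{1}{n^3}\right) \right), \]

where $p(a, b)$ is an explicit integer polynomial of degree 4 which can be found in the computer algebra worksheet corresponding to this example.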

Problems

6.1 For what dimensions $d \in \mathbb{N}$ does the number of unweighted walks of length $n$ on the maximal step set $S = \{\pm 1, 0\}^d \setminus \{\mathbf{0}\}$, beginning at the origin and restricted to $\mathbb{N}^d$, have an algebraic generating function?

6.2 Fix $\mathbf{p} \in \mathbb{N}^d$ and a highly symmetric step set $S$. Prove that the multivariate generating function $W(\mathbf{z}, t)$ tracking the endpoint and length of walks beginning at $\mathbf{p}$, taking steps in $S$, and staying in $\mathbb{N}^d$ satisfies the kernel-like functional equation

\[ (z_1 \cdots z_d) W(\mathbf{z}, t) = z_1^{p_1+1} \cdots z_d^{p_d+1} + t (z_1 \cdots z_d) S(\mathbf{z}) W(\mathbf{z}, t) - t \sum_{V \subset [d]} (-1)^{|V|} \left. (z_1 \cdots z_d) S(\mathbf{z}) W(\mathbf{z}, t) \right|_{z_j = 0, \ j \in V}, \]

generalizing (4.12). Use the kernel method for highly symmetric models presented in Section 4.1.5 of Chapter 4 to show

\[ W(\mathbf{z}, t) = [\mathbf{z}^{\ge 0}] \frac{\left( z_1^{p_1+1} - z_1^{-(p_1+1)} \right) \cdots \left( z_d^{p_d+1} - z_d^{-(p_d+1)} \right)}{(z_1 \cdots z_d)(1 - tS(\mathbf{z}))}, \]

where the non-negative extraction occurs in $\mathbb{Q}[\mathbf{z}, \bar{\mathbf{z}}][[t]]$.

6.3 If $\mathbf{e}^{(j)}$ denotes the $j$th elementary basis vector with a 1 in position $j$ and all other entries 0, find asymptotics for the number of walks in $\mathbb{N}^d$ which use the step set $S = \{\pm \mathbf{e}^{(1)}, \dots, \pm \mathbf{e}^{(d)}\}$, begin at the origin, and end anywhere in $\mathbb{N}^d$. What about walks returning to the origin?

6.4 For each of the four highly symmetric step sets $S \subset \{\pm 1, 0\}^2 \setminus \{\mathbf{0}\}$, find dominant asymptotics for the number of unweighted walks which begin at a point $(a, b) \in \mathbb{N}^2$ and end (i) anywhere in $\mathbb{N}^2$, (ii) on the x-axis, (iii) on the y-axis, and (iv) at the origin.

References

1. Alin Bostan and Manuel Kauers. Automatic classification of restricted lattice walks. In 21st International Conference on Formal Power Series and Algebraic Combinatorics (FPSAC 2009), Discrete Math. Theor. Comput. Sci. Proc., AK, pages 201–215. Assoc. Discrete Math. Theor. Comput. Sci., Nancy, 2009.
2. Stephen Melczer. Analytic Combinatorics in Several Variables: Effective Asymptotics and Lattice Path Enumeration. PhD thesis, University of Waterloo and ENS Lyon, 2017.
3. Stephen Melczer and Marni Mishna. Asymptotic lattice path enumeration using diagonals. Algorithmica, 75(4):782–811, 2016.

Chapter 7

Automated Analytic Combinatorics

It seems to us obvious. . . to bring out a double set of results, viz. 1st, the numerical magnitudes which are the results of operations performed on numerical data. . . 2ndly, the symbolical results to be attached to those numerical results, which symbolical results are not less the necessary and logical consequences of operations performed upon symbolical data, than are numerical results when the data are numerical.
Ada Augusta, Countess of Lovelace

After a time many discrepancies occurred, and at one point these discordances were so numerous that I exclaimed, ‘I wish to God these calculations had been executed by steam’.
Charles Babbage

We now turn to the task of automating the asymptotic methods discussed in Chapter 5. A generic rational function has a smooth singular set and admits a finite number of critical points, defined by the polynomial equations (5.16) in Chapter 5. Thus, the bulk of our work involves manipulating the algebraic critical points in order to algorithmically determine which (if any) are minimal. We will make use of symbolic-numeric algorithms, using separation bounds on the roots of univariate integer polynomials to determine an accuracy level so that numeric approximation to such accuracy allows us to perform exact calculations. In order to simplify our presentation we focus on algorithms for asymptotics of power series expansions in positive directions $r \in \mathbb{Z}_{>0}^d$. The algorithms we develop extend easily to convergent Laurent expansions and general directions $r \in \mathbb{R}_*^d$.

© Springer Nature Switzerland AG 2021. S. Melczer, Algorithmic and Symbolic Combinatorics, Texts & Monographs in Symbolic Computation, https://doi.org/10.1007/978-3-030-67080-1_7

Our Complexity Model

Our algorithms will take as input multivariate integer polynomials. We study the bit complexity of the algorithms, meaning we count the number of additions, subtractions, and multiplications computed by the algorithms when representing all integers in binary. Measuring binary operations gives a more realistic estimate of the time to run an algorithm than counting the number of integer operations, since computer processors perform operations on numbers of bounded size. The bit complexities of our algorithms will be expressible in terms of the number of variables, the degrees, and the coefficient sizes of their input polynomials. The coefficient size of a polynomial is captured by the following concept.

Definition 7.1 (polynomial heights) The height of a multivariate polynomial P ∈ Z[z], denoted h(P), is the base 2 logarithm of the maximum absolute value of its non-zero coefficients. For instance, the height of $P(x, y) = 7 - 11x^5 y + 4y^2$ is $h(P) = \log_2 11$. We define the height of the zero polynomial to be zero.
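Definition 7.1 is easy to compute directly. The following minimal sketch assumes a polynomial is stored as a dictionary from exponent tuples to integer coefficients; that representation and the function name `height` are illustrative choices, not notation from the text.

```python
from math import log2

def height(poly):
    """Height h(P): base-2 logarithm of the largest |coefficient|.

    `poly` maps exponent tuples to integer coefficients (an assumed
    representation). The zero polynomial has height 0 by convention.
    """
    nonzero = [abs(c) for c in poly.values() if c != 0]
    return log2(max(nonzero)) if nonzero else 0.0

# P(x, y) = 7 - 11 x^5 y + 4 y^2, keyed by (degree in x, degree in y)
P = {(0, 0): 7, (5, 1): -11, (0, 2): 4}

print(height(P))    # equals log2(11), as in Definition 7.1
print(height({}))   # zero polynomial
```

Every coefficient of P then has absolute value at most 2 raised to `height(P)`, which is how heights enter the bit-size estimates.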


The height of a polynomial gives a bound on the bitsizes of its coefficients and, with the degree of the polynomial, a lower bound on how close its distinct roots can be.

Definition 7.2 (multivariate complexity) Extending from our univariate notation, if $f, g : \mathbb{N}^m \to \mathbb{R}$ are two multivariate sequences then we write
• $f = O(g)$ if there exists a constant $M > 0$ such that $|f(n)| \leq M |g(n)|$ for all $n \in \mathbb{N}^m$;
• $f = \tilde{O}(g)$ if $f = O(g \log^k |g|)$ for some natural number $k$.

When considering the complexity of an algorithm, we often reserve δ ≥ 2 for a degree bound on the input polynomials in Z[z] = Z[z1, . . . , zd] and define $D = \delta^d$. The complexities of most of our algorithms are dominated by a factor of the form $\tilde{O}(D^c)$ for some $c \in \mathbb{N}$, and we try to minimize the exponent $c$. Although the complexity will be singly exponential in the degree and number of variables, this is polynomial in the number of coefficients appearing in a generic polynomial of degree δ in d variables, and polynomial in the generic number of critical points. Additional refinements of our algorithms which better capture sparsity in the input polynomials can be found in the paper of Melczer and Salvy [26]. The presentation of this chapter, including the details of the algorithms, is based on the paper of Melczer and Salvy.

7.1 An Overview of Results and Computations

We begin by sketching our main results and viewing some examples, before giving a high-level overview of our algorithms. Recall from Definition 5.8 in Chapter 5 that a function F(z) is combinatorial if its power series expansion contains only a finite number of negative coefficients. Because Lemma 5.7 in Chapter 5 gives an efficient test for minimality when F(z) is combinatorial, it will be significantly easier to determine asymptotics under this assumption.

After some additional background, our main result is stated in Theorem 7.1 of Section 7.2, which discusses the complexity and correctness of our main algorithms: Algorithm 1 (which computes dominant asymptotics given the minimal critical points), Algorithm 2 (which determines minimal critical points in the combinatorial case), and Algorithm 3 (which determines minimal critical points in the general case). The necessary algebraic machinery and data structures for the algorithms are developed in Section 7.3, and rely on certain algebraic bounds and algorithms for univariate polynomials discussed in the appendix to this chapter.

Our algorithms start with polynomials G, H ∈ Z[z] of degrees at most δ and heights at most h (meaning the coefficients of the polynomials have absolute values at most $2^h$), and a fixed direction $r \in \mathbb{Z}_{>0}^d$. Because the constants appearing in the asymptotic expansion (5.27) given by Theorem 5.2 in Chapter 5 are defined by potentially high degree algebraic numbers, it may not be desirable (or even possible) to represent them exactly using radicals. Thus, under verifiable and mostly generic assumptions, our algorithms encode asymptotics by returning three rational functions A, B, C ∈ Z(u), a square-free polynomial P ∈ Z[u], and a list U of roots of P(u), specified by regions of C containing exactly one root of P, such that

\[ f_{nr} = (2\pi)^{(1-d)/2} \left( \sum_{u \in U} A(u) \sqrt{B(u)} \, C(u)^n \right) (r_d n)^{(1-d)/2} \left( 1 + O\left( \frac{1}{n} \right) \right). \]

In the combinatorial case, Algorithms 1 and 2 return this expansion using a probabilistic method in $\tilde{O}(h\delta^{3+4d})$ bit operations. In the general case, due to the increased difficulty in determining minimal critical points, Algorithms 1 and 3 return this expansion using a probabilistic method in $\tilde{O}(h\delta^{12+12d})$ bit operations. In either case, the values of A(u), B(u), and C(u) can be determined to precision $2^{-\kappa}$ at all elements of U in $\tilde{O}(\kappa\delta^d + h\delta^{3+3d})$ bit operations.

The probabilistic aspects of our algorithms come from methods for computing a certain algebraic representation of the smooth critical points, known as a Kronecker representation. One can compute this representation deterministically using Gröbner bases (see Section 7.3) but doing so loses the complexity estimates we derive. Melczer and Salvy [26] gave a Maple implementation¹ of the algorithms for the combinatorial case. In practice the probabilistic nature of these algorithms does not have a large impact.

Example 7.1 (Automated Apéry Asymptotics) As seen in previous chapters, the sequence of Apéry numbers is the main diagonal of the rational function 1/(1 − w(1 + x)(1 + y)(1 + z)(1 + y + z + yz + xyz)). Running the Maple commands

    > F := 1/(1-w*(1+x)*(1+y)*(1+z)*(1+y+z+y*z+x*y*z)):
    > A, U := DiagonalAsymptotics(numer(F),denom(F),[w,x,y,z],u,n);

after importing the code of Melczer and Salvy verifies that certain necessary assumptions are satisfied (other than combinatoriality, which can be verified by inspection) and returns the expression

\[ f_{n,n,n,n} \sim \left( \frac{u + 81}{17u - 15} \right)^{n} \frac{1}{4\pi^{3/2} n^{3/2}} \sqrt{\frac{u + 81}{28 - 24u}}, \]

where u is the root of

\[ P(u) = u^2 + 162u - 167 \]

which is approximately 1.0243 . . . . The algorithm knows that the root u is real, and determines it to enough precision to uniquely identify it among the roots of P. In this case P is quadratic, so u can be expressed with a square-root to obtain

\[ f_{n,n,n,n} \sim \frac{\left( 17 + 12\sqrt{2} \right)^{n} \sqrt{48 + 34\sqrt{2}}}{8\pi^{3/2} n^{3/2}}, \]

as was derived by hand in Chapter 5.

1 A link to this package is available on the book website.
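The two forms of the Apéry asymptotics in Example 7.1 can be sanity-checked numerically. The sketch below is not the Maple package: it assumes the standard binomial-sum formula for the Apéry numbers, compares exact values with the leading term, and confirms that the encoding through the root u of P(u) = u^2 + 162u − 167 agrees with the radical expression. All function names are illustrative.

```python
from math import comb, exp, log, pi, sqrt

def apery(n):
    """Exact Apery number via the binomial sum sum_k C(n,k)^2 C(n+k,k)^2."""
    return sum(comb(n, k) ** 2 * comb(n + k, k) ** 2 for k in range(n + 1))

def log_leading_term(n):
    """Log of (17 + 12*sqrt(2))^n * sqrt(48 + 34*sqrt(2)) / (8 pi^(3/2) n^(3/2))."""
    growth = 17 + 12 * sqrt(2)
    constant = sqrt(48 + 34 * sqrt(2)) / (8 * pi ** 1.5)
    return n * log(growth) + log(constant) - 1.5 * log(n)

def ratio(n):
    # Compare on a logarithmic scale: the numbers overflow floats for large n.
    return exp(log(apery(n)) - log_leading_term(n))

# Encoded form: u is the positive root of u^2 + 162u - 167.
u = -81 + sqrt(81 ** 2 + 167)
growth_from_u = (u + 81) / (17 * u - 15)
constant_from_u = sqrt((u + 81) / (28 - 24 * u)) / (4 * pi ** 1.5)

print(u, growth_from_u, constant_from_u)
for n in (10, 100, 1000):
    print(n, ratio(n))   # tends to 1 as n grows, consistent with a 1 + O(1/n) error
```

The ratio approaching 1 from a distance proportional to 1/n is what the error term in the expansion predicts.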

Remark 7.1 There is some randomness in our algorithms, so a reader running our examples on their computer may obtain different expressions and polynomials P(u). Of course, all encodings yield the same asymptotic expansions.

Example 7.2 (A Sequence Alignment Problem) Sequence alignment problems arise in molecular biology when trying to determine evolutionary relationships between species by comparing the ‘closeness’ of different sequences (see, for instance, Waterman [17, Ch. 39]). Pemantle and Wilson [33] used multivariate generating functions to enumerate certain families of sequence alignments parametrized by two natural numbers k and b. For fixed k, b ∈ N our algorithm rigorously computes asymptotics of the sequence of interest. When k = b = 2 the sequence of interest is the main power series diagonal of

\[ F(x, y) = \frac{x^2 y^2 - xy + 1}{1 - (x + y + xy - xy^2 - x^2 y + x^2 y^3 + x^3 y^2)}, \]

which is combinatorial. Running our algorithm gives that the main power series diagonal has asymptotic growth

\[ \left( \frac{10u^4 - 40u^3 + 54u^2 - 26u + 4}{4u^4 - 19u^3 + 25u^2 - 4u - 6} \right)^{n} \frac{4u^4 - 14u^3 + 14u^2 - 2u + 2}{\sqrt{n}\,\sqrt{2\pi}\,\left( 10u^4 - 40u^3 + 54u^2 - 26u + 4 \right)} \times \sqrt{\frac{10u^4 - 40u^3 + 54u^2 - 26u + 4}{4u^4 - 16u^3 + 20u^2 - 8u + 4}}, \]

where u is the real degree 5 algebraic number defined by

\[ P(u) = 2u^5 - 10u^4 + 18u^3 - 13u^2 + 4u - 2 = 0, \qquad u \approx 1.4704\ldots. \]

Note that the roots of P cannot be expressed in radicals².

2 The Galois group of P is the symmetric group S5.
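The encoded answer of Example 7.2 can be checked numerically without any computer algebra. The sketch below locates u by bisection and evaluates the exponential growth rate from the displayed rational function. As a cross-check it assumes (this is not stated in the text) that, by the symmetry of the denominator in x and y, the relevant critical point lies on the line x = y, so the growth rate should equal 1/x0^2 with x0 the smallest positive root of H(x, x) = 1 − 2x − x^2 + 2x^3 − 2x^5. Helper names are illustrative.

```python
def horner(coeffs, x):
    """Evaluate a polynomial given by coefficients from highest degree down."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc

def bisect(f, lo, hi, steps=200):
    """Plain bisection; assumes f changes sign on [lo, hi]."""
    flo = f(lo)
    for _ in range(steps):
        mid = (lo + hi) / 2
        fmid = f(mid)
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return (lo + hi) / 2

# P(u) = 2u^5 - 10u^4 + 18u^3 - 13u^2 + 4u - 2 from Example 7.2
P = [2, -10, 18, -13, 4, -2]
u = bisect(lambda v: horner(P, v), 1.0, 2.0)

# Exponential growth rate read off from the encoded asymptotics
growth = horner([10, -40, 54, -26, 4], u) / horner([4, -19, 25, -4, -6], u)

# Cross-check under the x = y symmetry assumption: H(x, x), highest degree first
H_diag = [-2, 0, 2, -1, -2, 1]
x0 = bisect(lambda v: horner(H_diag, v), 0.0, 0.5)

print(u)                       # close to 1.4704
print(growth, 1 / x0 ** 2)     # the two values should agree
```

The agreement of the two growth rates is a consistency check on the encoding, not a proof of minimality; that is what the algorithms of this chapter certify.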


7.1.1 Surveying the Computations

Before introducing the necessary algebra to fully describe our data structures and algorithms, we discuss the computations to be carried out. At the highest level, Theorem 5.4 in Chapter 5 implies that the following steps determine asymptotics.

DiagonalAsymptotics
INPUT: Polynomials G(z), H(z) ∈ Z[z] and direction $r \in \mathbb{Z}_{>0}^d$
OUTPUT: Asymptotics of the r-power series diagonal of F(z) = G(z)/H(z)

1. If G and H are not coprime, divide each by their greatest common divisor.
2. If H(0) = 0 then Fail.
3. Determine the square-free factorization $H^s$ of H.
4. Let $\mathcal{C}$ denote the zeroes of the integer polynomial system

   \[ H^s = z_1 H^s_{z_1} - r_1 \lambda = \cdots = z_d H^s_{z_d} - r_d \lambda = 0 \tag{7.1} \]

   in the variables z, λ. If $\mathcal{C}$ is not finite then Fail.
5. Determine the set $U \subset \mathcal{C}$ of minimal critical points. If $U = \emptyset$ then Fail.
6. If G vanishes at each element of U, if the determinant of the Hessian $\mathcal{H}$ defined in (5.25) of Chapter 5 is zero at an element of U, or if U contains non-smooth points then Fail.
7. Sum the leading asymptotic contributions given by Theorem 5.3 in Chapter 5 for each element of U, and return the result.

DiagonalAsymptotics is fleshed out in Algorithm 1. Determining the critical point solutions of (7.1) requires manipulating algebraic quantities, and can be easily implemented using classical algebraic tools discussed in Section 7.3. Verifying the necessary conditions for our methods is also easy: for instance, when H is square-free, a point (z, λ) ∈ C satisfies λ = 0 if and only if z is a non-smooth point of the singular variety or some coordinate of z is zero. In contrast, the determination of minimal critical points can be quite delicate and requires further investigation.
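To make the critical point system (7.1) concrete, the following sketch evaluates its left-hand sides exactly for a small example chosen here for illustration (it does not come from the text): H = 1 − x − y in the direction r = (1, 1), whose only solution is x = y = 1/2 with λ = −1/2. Polynomials are stored as dictionaries from exponent tuples to coefficients, an assumed representation.

```python
from fractions import Fraction

def evaluate(poly, point):
    """Evaluate a sparse polynomial {exponent tuple: coefficient} at a point."""
    total = Fraction(0)
    for exponents, coeff in poly.items():
        term = Fraction(coeff)
        for value, e in zip(point, exponents):
            term *= Fraction(value) ** e
        total += term
    return total

def partial(poly, j):
    """Partial derivative with respect to the j-th variable."""
    out = {}
    for exponents, coeff in poly.items():
        if exponents[j] > 0:
            lowered = exponents[:j] + (exponents[j] - 1,) + exponents[j + 1:]
            out[lowered] = out.get(lowered, 0) + coeff * exponents[j]
    return out

def critical_residuals(H, r, z, lam):
    """Left-hand sides of (7.1): H(z) and z_j H_{z_j}(z) - r_j * lambda."""
    residuals = [evaluate(H, z)]
    for j in range(len(z)):
        residuals.append(z[j] * evaluate(partial(H, j), z) - r[j] * lam)
    return residuals

# Illustrative input: H(x, y) = 1 - x - y, direction r = (1, 1)
H = {(0, 0): 1, (1, 0): -1, (0, 1): -1}
z = (Fraction(1, 2), Fraction(1, 2))
lam = Fraction(-1, 2)

print(critical_residuals(H, (1, 1), z, lam))   # all zero at a critical point
```

Since λ is non-zero here, the point is smooth and has no zero coordinate, matching the criterion stated above for square-free H.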

7.1.2 Minimal Critical Points in the Combinatorial Case

Assume first that F(z) is combinatorial and admits a finite number of critical points. Corollary 5.5 from Chapter 5 implies that every minimal critical point has the same coordinate-wise modulus as a critical point with positive coordinates, and Corollary 5.6 implies there can be at most one minimal critical point with positive coordinates. Furthermore, Proposition 5.4 in Chapter 5 implies that minimality can be determined by examining line segments from the origin to critical points with positive coordinates. Because critical points may have algebraic coordinates of large degree, testing minimality for such points is done implicitly by introducing the polynomial H(tz1, . . . , tzd) ∈ Z[z, t] into the critical point system (7.1) and searching for real solutions with t ∈ (0, 1).

MinimalCriticalComb
INPUT: Coprime polynomials G(z), H(z) ∈ Z[z] and direction $r \in \mathbb{Z}_{>0}^d$
ASSUMING: F(z) = G(z)/H(z) is combinatorial
OUTPUT: The set U of minimal critical points of F(z) in the direction r

1. If H is not square-free then replace H by its square-free part.
2. Let S denote the zeroes of the integer polynomial system

   \[ H(z) = z_1 H_{z_1}(z) - r_1 \lambda = \cdots = z_d H_{z_d}(z) - r_d \lambda = H(tz_1, \ldots, tz_d) = 0 \tag{7.2} \]

   in the variables z, λ, t. If S is not finite or if S contains a point of the form (z, λ, t) where λ = 0 and t = 1 then Fail.
3. Let $\mathcal{C} = \{z \in \mathbb{C}_*^d : (z, \lambda, 1) \in S \text{ for some } \lambda \in \mathbb{C}\}$ be the set of critical points.
4. Find $w \in \mathbb{R}_{>0}^d \cap \mathcal{C}$ such that $t \notin (0, 1)$ for any $(w, \lambda, t) \in S$. If there is more than one or no such point then Fail.
5. Return the set $U = \{z \in \mathcal{C} : z \in T(w)\}$ of critical points with the same coordinate-wise modulus as w.
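The line-segment test behind step 4 can be illustrated on a small example. The sketch below uses a hypothetical denominator H = 4 − 4x − 4y + 3xy (not from the text), which has two critical points with positive coordinates in the direction (1, 1), namely (2/3, 2/3) and (2, 2). Instead of solving system (7.2) exactly, as the algorithm does, it scans H(tw) on a grid of rational t in (0, 1) for a zero or sign change. This is a simplification that can miss roots of even multiplicity.

```python
from fractions import Fraction

def H(x, y):
    # Hypothetical example denominator, chosen for illustration only.
    return 4 - 4 * x - 4 * y + 3 * x * y

def segment_hits_variety(H, w, steps=1000):
    """Return True if H(t*w) vanishes or changes sign for some grid t in (0, 1)."""
    previous = H(*(Fraction(0) * c for c in w))   # value at the origin, t = 0
    for k in range(1, steps):
        t = Fraction(k, steps)
        value = H(*(t * c for c in w))
        if value == 0 or (value > 0) != (previous > 0):
            return True
        previous = value
    return False

near = (Fraction(2, 3), Fraction(2, 3))
far = (Fraction(2), Fraction(2))

print(H(*near), H(*far))                  # both points lie on the singular variety
print(segment_hits_variety(H, near))      # segment to (2/3, 2/3) misses the variety
print(segment_hits_variety(H, far))       # segment to (2, 2) crosses it at t = 1/3
```

Only (2/3, 2/3) survives the test, so in this example it is the unique candidate w of step 4.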

MinimalCriticalComb is more fully described in Algorithm 2. Using MinimalCriticalComb to determine minimal critical points, DiagonalAsymptotics gives dominant asymptotics when it returns without failing. This happens whenever the following assumptions are met:

(A0) F(z) admits at least one minimal critical point;
(A1) (∇H)(z) does not vanish at a root of H(z);
(A2) G(z) is non-zero for at least one minimal critical point;
(A3) all minimal critical points of F(z) are non-degenerate;
(J1) the Jacobian matrix of the system (7.2) in the variables z, λ, t is non-singular at the solutions of (7.2).

Proposition 5.11 from Chapter 5 implies assumptions (A1) to (A3) hold generically. When F is combinatorial then Corollary 5.4 in Chapter 5 and the results of Section 3.3.1 in Chapter 3 imply that F admits a minimal critical point with positive coordinates whenever the coefficients of the linear terms z1, z2, . . . , zd in H are nonzero, which also holds generically. Problem 7.1 asks you to adapt the arguments in Section 5.3.4 of Chapter 5 to prove that (J1) holds generically.


Assumption (A1) implies that the singular variety is smooth, while (A2) implies we can use the explicit expression in Theorem 5.1 for dominant asymptotics. Furthermore, the so-called Jacobian criterion [11, Theorem 16.19] implies that the polynomial system (7.2) has a finite number of solutions whenever its Jacobian matrix has full rank (i.e., is non-singular) at all of its solutions. Thus, (J1) is slightly stronger than requiring that F(z) admit a finite number of critical points: we use (J1) to compute a Kronecker representation of the solutions of this system using Proposition 7.3 below. All of our conditions, except that F is combinatorial, can be verified algorithmically.

Remark 7.2 Asymptotics can still be determined in many situations where (A1) and (A2) do not hold. Assumption (A1) implies H is square-free and the singular variety is everywhere smooth; both conditions hold generically, but are not necessary to apply Theorem 5.3 from Chapter 5. When (A2) does not hold the leading term in Theorem 5.3 vanishes but one can try calculating higher-order terms. These situations can be handled by small extensions of our algorithms.

7.1.3 Minimal Critical Points in the General Case

In the combinatorial case, to prove w ∈ V is minimal it is sufficient to check that for each t ∈ (0, 1) the point t|w| = (t|w1|, . . . , t|wd|) does not lie in V. Thus, one can think of tracing a line segment from the origin to |w| and stopping if the segment intersects V. In the non-combinatorial case, things are more difficult: to prove that w ∈ V is minimal, it must be determined for each t ∈ (0, 1) whether there exists a point in V with the same coordinate-wise modulus as tw. One can still imagine tracing a line segment from the origin to |w|, but now each point on the line segment defines a product of circles which must be checked for elements of V. This, unsurprisingly, leads to an algorithm for minimality which is more computationally expensive. In order to express the moduli of coordinates algebraically, we convert our d complex variables to 2d real variables.

Definition 7.3 (real and imaginary polynomial decomposition) Given a polynomial f(z) ∈ R[z] we define f(x + iy) = f(x1 + iy1, . . . , xd + iyd) and let $f^R(x, y)$ and $f^I(x, y)$ denote the unique polynomials in R[x, y] such that $f(x + iy) = f^R(x, y) + i f^I(x, y)$.

Because a polynomial is everywhere differentiable, the Cauchy-Riemann equations [19, Ch. 2.1] state that the partial derivatives of $f^R$ and $f^I$ satisfy the equalities $f^R_{x_j}(x, y) = f^I_{y_j}(x, y)$ and $f^R_{y_j}(x, y) = -f^I_{x_j}(x, y)$ for each 1 ≤ j ≤ d, while

\[ f_{z_j}(x + iy) = \frac{1}{2} \left( f^R_{x_j}(x, y) + i f^I_{x_j}(x, y) \right) - \frac{i}{2} \left( f^R_{y_j}(x, y) + i f^I_{y_j}(x, y) \right). \]

Some light algebraic manipulation then shows that the real solutions of the system

\[ H^R(a, b) = H^I(a, b) = 0 \tag{7.3} \]
\[ a_j H^R_{x_j}(a, b) + b_j H^R_{y_j}(a, b) - r_j \lambda_R = 0, \qquad j = 1, \ldots, d \tag{7.4} \]
\[ a_j H^I_{x_j}(a, b) + b_j H^I_{y_j}(a, b) - r_j \lambda_I = 0, \qquad j = 1, \ldots, d \tag{7.5} \]

in the variables a, b, λR, λI correspond exactly to all complex solutions of the smooth critical point equations

\[ H(z) = z_1 H_{z_1}(z) - r_1 \lambda = \cdots = z_d H_{z_d}(z) - r_d \lambda = 0 \]

with z = a + ib and λ = λR + iλI. Furthermore, if we define the equations

\[ H^R(x, y) = H^I(x, y) = 0 \tag{7.6} \]
\[ x_j^2 + y_j^2 - t(a_j^2 + b_j^2) = 0, \qquad j = 1, \ldots, d \tag{7.7} \]

then (7.7) encodes a relationship between the coordinate-wise moduli of x + iy and a + ib when x, y, a, and b are real vectors. In particular, Proposition 5.4 in Chapter 5 implies that a point z = a + ib ∈ V is minimal if and only if there is no solution to equations (7.6) and (7.7) with x, y, t real and 0 ≤ t < 1. The polynomial system defined by (7.3)–(7.7) contains 3d + 4 equations in the 4d + 3 variables a, b, x, y, λR, λI, and t. Let
• $W \subset \mathbb{C}^{4d+3}$ denote the complex solutions of the system of equations (7.3)–(7.7)
• $W_R = W \cap \mathbb{R}^{4d+3}$ be the real part of W
• $\pi_t : W_R \to \mathbb{C}$ be the projection map $\pi_t(a, b, x, y, \lambda_R, \lambda_I, t) = t$
• $\mathcal{C} = \{(a, b) \in \mathbb{R}_*^{2d} : (a, b, x, y, \lambda_R, \lambda_I, t) \in W_R \text{ for some } x, y, \lambda_R, \lambda_I, t\}$

We now have a strategy for determining minimality: each $(a, b) \in \mathcal{C}$ corresponds to a critical point z = a + ib, and that critical point is minimal if and only if there does not exist $(a, b, x, y, \lambda_R, \lambda_I, t) \in W_R$ with 0 ≤ t < 1. Unfortunately, even if the set $W_R$ of real solutions is finite the set W of complex solutions will be infinite (when nonempty) since there are more variables than equations. Although the real solutions of (7.3)–(7.7) are the only ones which have meaning in our original problem, most algebraic techniques for polynomial system solving work, at least implicitly, over an algebraically closed field. To get around this problem, we introduce additional equations satisfied by the real solutions we care about which also eliminate spurious complex solutions. Our techniques are inspired by so-called critical point methods³ for sampling points in real algebraic sets, an influential approach to real polynomial system solving popularized by Grigor’ev and Vorobjov [18] and Renegar [34].
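The geometric content of the minimality criterion, namely that for every t in (0, 1) the torus with squared moduli t(a_j^2 + b_j^2) must avoid the singular variety, can be probed numerically. The sketch below is only a sampling heuristic and is not the exact method of this section: it evaluates |H| on a grid of such tori for a hypothetical example H = 4 − 4x − 4y + 3xy (not from the text) at its two positive critical points in the direction (1, 1). Grid sizes and names are arbitrary choices.

```python
import cmath
import math

def H(x, y):
    # Hypothetical example denominator, chosen for illustration only.
    return 4 - 4 * x - 4 * y + 3 * x * y

def min_modulus_on_tori(H, w, t_steps=90, angle_steps=24):
    """Smallest sampled |H| over tori with |x_j + i y_j|^2 = t |w_j|^2, 0 < t < 1."""
    smallest = math.inf
    angles = [2 * math.pi * k / angle_steps for k in range(angle_steps)]
    for k in range(1, t_steps):
        t = k / t_steps
        # Equation (7.7) scales the squared moduli by t, so radii scale by sqrt(t).
        r1, r2 = (abs(c) * math.sqrt(t) for c in w)
        for theta1 in angles:
            for theta2 in angles:
                value = H(cmath.rect(r1, theta1), cmath.rect(r2, theta2))
                smallest = min(smallest, abs(value))
    return smallest

print(min_modulus_on_tori(H, (2 / 3, 2 / 3)))   # bounded away from zero on the grid
print(min_modulus_on_tori(H, (2.0, 2.0)))       # essentially zero: the torus at t = 1/9 meets V
```

A sampled minimum near zero flags a non-minimal point, but a positive sampled minimum proves nothing on its own. The exact test requires the additional equations introduced next.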
3 Although algebraic results are usually most naturally stated over algebraically closed fields, calculus and geometry typically ‘respect’ real constructions (for instance, there are both real and complex versions of the implicit function theorem). Critical point methods exploit this difference.

Proposition 7.1 Let H ∈ Q[z] be a polynomial which does not vanish at the origin, and suppose that the Jacobian matrix of the polynomials in (7.3)–(7.7) has full rank at any point in W. The point $z = a + ib \in \mathbb{C}_*^d$ with $a, b \in \mathbb{R}_*^d$ is a minimal critical point in the direction r if and only if (a, b) satisfies equations (7.3)–(7.5) and there does not exist $(x, y, \nu, t) \in \mathbb{R}^{2d+2}$ with 0 < t < 1 satisfying equations (7.6), (7.7), and

\[ (y_j - \nu x_j) H^R_{x_j}(x, y) - (x_j + \nu y_j) H^R_{y_j}(x, y) = 0, \qquad j = 1, \ldots, d. \tag{7.8} \]
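Equations (7.3)–(7.8) are all phrased through the real and imaginary parts of Definition 7.3, so it may help to see that decomposition computed. The sketch below handles only the univariate case d = 1 with real coefficients, and stores bivariate polynomials as dictionaries keyed by (degree in x, degree in y). Both are simplifying assumptions made for this illustration. It then checks the Cauchy-Riemann equations exactly.

```python
from math import comb

def real_imag_parts(coeffs):
    """Split f(z) = sum_k coeffs[k] z^k into f^R(x, y) and f^I(x, y), z = x + iy."""
    f_real, f_imag = {}, {}
    for k, c in enumerate(coeffs):
        for j in range(k + 1):
            term = c * comb(k, j)          # coefficient of x^(k-j) (iy)^j
            if term == 0:
                continue
            sign = -1 if j % 4 in (2, 3) else 1    # i^j cycles through 1, i, -1, -i
            target = f_real if j % 2 == 0 else f_imag
            key = (k - j, j)
            target[key] = target.get(key, 0) + sign * term
    return f_real, f_imag

def diff(poly, var):
    """Partial derivative in x (var = 0) or y (var = 1), dropping zero terms."""
    out = {}
    for (i, j), c in poly.items():
        e = (i, j)[var]
        if e > 0:
            key = (i - 1, j) if var == 0 else (i, j - 1)
            out[key] = out.get(key, 0) + c * e
    return {k: c for k, c in out.items() if c != 0}

def negate(poly):
    return {k: -c for k, c in poly.items()}

# f(z) = 1 - 3z + 2z^3, coefficients listed from the constant term up
f_real, f_imag = real_imag_parts([1, -3, 0, 2])
print(f_real)   # 1 - 3x + 2x^3 - 6xy^2
print(f_imag)   # -3y + 6x^2 y - 2y^3

# Cauchy-Riemann: f^R_x = f^I_y and f^R_y = -f^I_x
print(diff(f_real, 0) == diff(f_imag, 1))
print(diff(f_real, 1) == negate(diff(f_imag, 0)))
```

These two identities are what allow the derivatives of $H^I$ to be rewritten in terms of those of $H^R$, which is why (7.8) involves $H^R$ only.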

Remark 7.3 If the Jacobian matrix of the polynomial system (7.3)–(7.7) has full rank then the Jacobian matrix of the sub-system (7.3)–(7.5) also has full rank. Thus, the equations (7.3)–(7.5) admit a finite number of complex solutions under the conditions of Proposition 7.1.

Proof For fixed $(a, b) \in \mathbb{R}_*^{2d}$ a point $\sigma = (a, b, x, y, \lambda_R, \lambda_I, t) \in W_R$ is a local extremum of $\pi_t$ as a map from $W_R$ to R only when the gradient of $\pi_t$ is perpendicular to the tangent plane of $W_R$ at σ. This, in turn, implies the gradient of $\pi_t$ and the gradients of the polynomials $H^R$, $H^I$, and the $x_j^2 + y_j^2 - t(a_j^2 + b_j^2)$, which define $W_R$, are linearly dependent. In other words, at any such extremum the matrix

\[ J = \begin{pmatrix} \nabla H^R(x, y) \\ \nabla H^I(x, y) \\ \nabla\left( x_1^2 + y_1^2 - t(a_1^2 + b_1^2) \right) \\ \vdots \\ \nabla\left( x_d^2 + y_d^2 - t(a_d^2 + b_d^2) \right) \\ \nabla(t) \end{pmatrix} = \begin{pmatrix} H^R_{x_1} & \cdots & H^R_{x_d} & H^R_{y_1} & \cdots & H^R_{y_d} & 0 \\ H^I_{x_1} & \cdots & H^I_{x_d} & H^I_{y_1} & \cdots & H^I_{y_d} & 0 \\ 2x_1 & & 0 & 2y_1 & & 0 & -(a_1^2 + b_1^2) \\ & \ddots & & & \ddots & & \vdots \\ 0 & & 2x_d & 0 & & 2y_d & -(a_d^2 + b_d^2) \\ 0 & \cdots & 0 & 0 & \cdots & 0 & 1 \end{pmatrix} \]

is rank deficient, so a non-trivial linear combination of its rows vanishes. Using the Cauchy-Riemann equations to write $H^I_{x_j} = -H^R_{y_j}$ and $H^I_{y_j} = H^R_{x_j}$ then implies the existence of ν, λ1, . . . , λd such that

\[ H^R_{x_j} - \nu H^R_{y_j} + \lambda_j x_j = 0, \qquad H^R_{y_j} + \nu H^R_{x_j} + \lambda_j y_j = 0 \]

for each j = 1, . . . , d, which simplifies to give (7.8).

For each of the finite number of points $(a, b) \in \mathbb{R}_*^{2d}$ satisfying equations (7.3)–(7.5) under our assumptions, the set

\[ S = \left\{ (x, y, t) \in \mathbb{R}^{2d+1} : t \in [0, 1], \ (a, b, x, y, t) \text{ satisfy (7.6) and (7.7)} \right\} \]

is compact, since t ∈ [0, 1] implies $x_j^2 + y_j^2 \leq a_j^2 + b_j^2$ for each j = 1, . . . , d. Furthermore, S is non-empty because it contains (a, b, 1) and has no point with t = 0 since H(0) ≠ 0. Thus, the continuous function $\pi_t$ achieves its minimum on the compact set S and such a minimizer must satisfy (7.8) for some ν ∈ R and 0 < t ≤ 1. Any solution (x, y, t) ∈ S with t < 1 gives a point x + iy that has smaller coordinate-wise modulus than a + ib, meaning z = a + ib is not minimal. Conversely, if z = a + ib is not minimal then there exists (x, y, t) ∈ S with t minimal and t < 1. This is a local minimum of $\pi_t$ on $W_R$ so there exists ν ∈ R such that (7.8) is satisfied. □


Equations (7.3)–(7.8) now consist of 4d + 4 equations in 4d + 4 unknowns, meaning they can have a finite number of complex solutions whose real solutions encode all critical points and help determine minimality. In addition to (A0)–(A3), we work under the following assumption:

(J2) the Jacobian matrix of the system (7.3)–(7.8) in the variables a, b, x, y, λR, λI, ν, t is non-singular at the solutions of (7.3)–(7.8).

Assumption (J2) is slightly stronger than what we need to apply the results of Chapter 5, but this formulation will help us bound the complexity of the algebraic methods we develop below. As noted in Remark 7.3, (J2) implies that F admits only a finite number of critical points. Ultimately, we have obtained the following algorithm.

MinimalCriticalGen
INPUT: Coprime polynomials G(z), H(z) ∈ Z[z] and direction $r \in \mathbb{Z}_{>0}^d$
OUTPUT: The set U of minimal critical points of F(z) in the direction r

1. Let S be the algebraic set defined by the zeroes of the polynomial system (7.3)–(7.8) in the variables a, b, x, y, λR, λI, ν, t. If S is not finite then Fail.
2. Let U denote the points $a + ib \in \mathbb{C}_*^d$ such that: (1) there exist x, y, λR, λI, ν, t with $(a, b, x, y, \lambda_R, \lambda_I, \nu, t) \in S \cap \mathbb{R}^{4d+4}$, and (2) for any such points $t \notin (0, 1)$.
3. If U is empty, if it has an element where λR = λI = 0, or if the elements of U do not all have the same coordinate-wise modulus then Fail.
4. Return U.

MinimalCriticalGen is more fully described in Algorithm 3. Using MinimalCriticalGen to determine minimal critical points, DiagonalAsymptotics gives dominant asymptotics when it returns without failing. This happens under assumptions (A0)–(A3) and (J2), all of which can be verified algorithmically.

7.2 ACSV Algorithms and Examples

Having described our approach and assumptions, we can now state our main result.

Theorem 7.1 Let G(z) and H(z) be polynomials in Z[z] = Z[z1, . . . , zd] of degrees at most δ and coefficients of absolute value at most $2^h$, and suppose that Assumptions (A0)–(A3) hold. Fix a direction $r \in \mathbb{Z}_{>0}^d$. If F = G/H is combinatorial and Assumption (J1) holds, then Algorithm 2 is a probabilistic algorithm that computes the minimal critical points of F in the direction r in $\tilde{O}(h\delta^3 D^4)$ bit operations, where $D = \delta^d$. Whether or not F = G/H is combinatorial, if Assumption (J2) holds then Algorithm 3 is a probabilistic algorithm that computes the minimal critical points of F in the direction r in $\tilde{O}(h\delta^{12} D^{12})$ bit operations.

In either case, Algorithm 1 is a probabilistic algorithm that uses these results and $\tilde{O}(h\delta^3 D^4)$ bit operations to compute three rational functions A, B, C ∈ Z(u), a square-free polynomial P ∈ Z[u] and a list U of roots of P(u), specified by disks in C containing exactly one root of P, such that

\[ [z^{nr}] F(z) = (2\pi)^{(1-d)/2} \left( \sum_{u \in U} A(u) \sqrt{B(u)} \, C(u)^n \right) (r_d n)^{(1-d)/2} \left( 1 + O\left( \frac{1}{n} \right) \right). \tag{7.9} \]

The values of A(u), B(u) and C(u) can be refined to precision $2^{-\kappa}$ at all elements of U in $\tilde{O}(\kappa D + h\delta^3 D^3)$ bit operations.

Remark 7.4 Using techniques which better take into account the multi-homogeneous structure of the polynomial system (7.3)–(7.8) (namely that (7.3)–(7.5) contain only variables which appear with degree at most 2 in the remaining equations) it is possible to reduce the complexity of determining minimal critical points in the general case from $\tilde{O}(h\delta^{12} D^{12})$ to $\tilde{O}(h\delta^5 2^{3d} D^9)$ bit operations. This requires even more than the already considerable algebraic framework we discuss here, so we refer to Melczer and Salvy [26] for additional details.

Theorem 7.1 is obtained by encoding the smooth critical points with an efficient algebraic data structure.

Definition 7.4 (Kronecker representations) A Kronecker representation or rational univariate representation [P(u), Q] of a finite algebraic set

\[ V(\mathbf{f}) = \{ z : f_1(z) = \cdots = f_d(z) = 0 \} \]

defined by the polynomial system $\mathbf{f} = (f_1, \ldots, f_d) \in \mathbb{Z}[z]^d$ consists of
• a new variable u which is an integer linear combination of the z, $u = \kappa_1 z_1 + \cdots + \kappa_d z_d \in \mathbb{Z}[z]$, such that u takes distinct values for $z \in V(\mathbf{f})$,
• a square-free polynomial P ∈ Z[u],
• polynomials $Q_1, \ldots, Q_d \in \mathbb{Z}[u]$ of degrees less than deg(P)
such that $V(\mathbf{f})$ is determined by the z-coordinates of the solutions of the system

\[ P(u) = 0, \qquad \begin{cases} P'(u) z_1 - Q_1(u) = 0, \\ \qquad \vdots \\ P'(u) z_d - Q_d(u) = 0. \end{cases} \tag{7.10} \]
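A small example may clarify the shape of a Kronecker representation. The sketch below does not solve a polynomial system: it assumes the finite set of points is already known (three hypothetical integer points in the plane) and builds P and the Q_j by Lagrange-style sums, so that Q_j(u)/P'(u) recovers the jth coordinate at each root of P. The function names and the choice u = z1 + z2 are illustrative assumptions.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_eval(p, x):
    return sum(c * x ** k for k, c in enumerate(p))

def derivative(p):
    return [k * c for k, c in enumerate(p)][1:]

def kronecker_from_points(points, kappa):
    """Build P and Q_1..Q_d for a known point set, with u = kappa . z separating."""
    us = [sum(k * z for k, z in zip(kappa, pt)) for pt in points]
    assert len(set(us)) == len(us), "u must take distinct values on the points"
    P = [Fraction(1)]
    for ui in us:
        P = poly_mul(P, [Fraction(-ui), Fraction(1)])      # P(u) = prod (u - u_i)
    Q = []
    for j in range(len(points[0])):
        Qj = [Fraction(0)]
        for i, pt in enumerate(points):
            term = [Fraction(pt[j])]                       # z_j at the i-th point
            for k, uk in enumerate(us):
                if k != i:
                    term = poly_mul(term, [Fraction(-uk), Fraction(1)])
            Qj = poly_add(Qj, term)
        Q.append(Qj)
    return P, Q, us

points = [(1, 2), (-1, 0), (2, -1)]
P, Q, us = kronecker_from_points(points, (1, 1))
print(P)   # u^3 - 3u^2 - u + 3, lowest degree first
print(Q)   # Q_1 = 2u^2 - 10 and Q_2 = u^2 + 2u + 1
for pt, ui in zip(points, us):
    dP = poly_eval(derivative(P), ui)
    print(ui, [poly_eval(Qj, ui) / dP for Qj in Q], pt)    # coordinates recovered
```

With rational points one would clear denominators to land in Z[u]. Computing such a representation directly from the defining equations is the task of the procedure Kronecker discussed in Section 7.3.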

Kronecker representations are useful because they encode the elements of a finite multi-dimensional algebraic set in terms of a univariate polynomial, and because the polynomials P, Q_j typically have much smaller coefficient size than alternative univariate encodings. Since u is an integer linear combination of the other variables, and the polynomials P and Q_j have integer coefficients, a root of P(u) is real if and only if all coordinates z at the corresponding element of V(f) are real. A Kronecker representation of V(f) gives an encoding of all elements of V(f), but to talk about specific elements of V(f) it is necessary to introduce additional information.

Algorithm 1: DiagonalAsymptotics
Input: Polynomials G(z), H(z) ∈ Z[z] and direction $r \in \mathbb{N}^d$
Output: Rational functions A, B, C, a polynomial P, and a set U of roots of P such that the power series coefficients $f_{nr}$ of F = G/H satisfy (7.9)

    /* Verify singularities of F are roots of H */
    if G and H are not coprime then
        G ← G/gcd(G, H) and H ← H/gcd(G, H)
    if H(0) = 0 then return fail
    /* Define the set of critical points */
    $\mathcal{C} \leftarrow [H, \ z_1 H_{z_1} - r_1 \lambda, \ \ldots, \ z_d H_{z_d} - r_d \lambda]$
    [P, Q, u] ← Kronecker($\mathcal{C}$)
    /* Determine the subset of minimal critical points using Algorithm 2 or Algorithm 3 below */
    if F is known to be combinatorial then
        U ← MinimalCriticalComb(H, r, [P, Q, u])
    else
        U ← MinimalCriticalGen(H, r, [P, Q, u])
    /* Find Hessian and numerator at the minimal critical points */
    $\tilde{H}$ ← determinant of the polynomial matrix $(z_d H_{z_d})\mathcal{H}$ defined by (5.25) with w = z
    PolySys ← $\mathcal{C} \cup [h - \tilde{H}(z), \ T - z_1^{r_1} \cdots z_d^{r_d}, \ g + G(z)]$
    $[P, (Q, Q_{\tilde{H}}, Q_T, Q_{-G})]$ ← Kronecker(PolySys) using Proposition 7.4
    /* Verify nondegeneracy of critical points and leading order of asymptotics */
    if $Q_{\tilde{H}}(u) = 0$ at any u ∈ U, or $Q_{-G}(u) = 0$ at all u ∈ U then return fail
    return $(A, B, C, P, U) = \left( Q_{-G}/(r_d Q_\lambda), \ (r_d Q_\lambda)^{d-1} (P')^{2-d} / Q_{\tilde{H}}, \ P'/Q_T, \ P, \ U \right)$

Definition 7.5 (isolating regions and numeric Kronecker representations) Given a univariate polynomial P(u) and root w ∈ C of P, an isolating disk for w is a disk D ⊂ C of rational radius whose centre has rational real and imaginary parts, such that w is the only root of P in D. If w is a real root of P then an isolating interval for w is a finite interval I ⊂ R with rational endpoints such that w is the only root of P in I. A numeric Kronecker representation [P(u), Q, U] of V(f) is a Kronecker representation [P(u), Q] of V(f) together with a sequence U of isolating intervals for the real roots of the polynomial P and isolating disks for the non-real roots of P.

Determining the elements of U to sufficient precision allows one to argue about the elements of the underlying algebraic set, including: approximating the solutions to arbitrary accuracy, selecting which solutions have real coordinates in certain intervals, determining which solutions have the same coordinate-wise modulus, and more. In practice, the elements of U are often stored as floating point approximations whose accuracy is certified to a specific precision.

Algorithm 2: MinimalCriticalComb
Input: Polynomial H(z), direction $r \in \mathbb{N}^d$ and Kronecker representation [P, Q, u] of the smooth critical points for 1/H in the direction r, assuming 1/H is combinatorial
Output: Set U of roots of P corresponding to the minimal critical points

    /* Determine the extended critical point system */
    $S \leftarrow [H, \ z_1 H_{z_1} - r_1 \lambda, \ \ldots, \ z_d H_{z_d} - r_d \lambda, \ H(tz_1, \ldots, tz_d)]$
    $[\tilde{P}, \tilde{Q}]$ ← Kronecker(S)
    /* Find the minimal critical point with positive real coordinates */
    $[\tilde{P}, \tilde{Q}, \tilde{U}]$ ← NumKronecker($\tilde{P}, \tilde{Q}, \kappa$), where $\kappa = \tilde{O}(h\delta^2 D^2)$ is sufficient precision to group the roots of $\tilde{P}$ by the distinct values and signs they give to each $z_j = \tilde{Q}_j / \tilde{P}'$ at the real roots of $\tilde{P}$ using Lemmas 7.4 and 7.5 below
    /* Find elements with positive real coordinates */
    $M \leftarrow \{ \{ u \in \tilde{U} \cap \mathbb{R} : \tilde{Q}_j(u)/\tilde{P}'(u) > 0 \text{ for all } 1 \leq j \leq d \} \}$
    /* Find unique minimal critical point with positive coordinates */
    for each j ∈ {1, . . . , d} do
        Split apart each set in M by the distinct values its elements give $\tilde{Q}_j(v)/\tilde{P}'(v)$
    end
    for each set m ∈ M do
        if one of the values taken by t on the elements of m lies in (0, 1) then M ← M \ {m}
    end
    if M does not have the form $\{\{u_\zeta\}\}$, or if $Q_\lambda(u_\zeta) = 0$ then return fail
    /* Reduce back to original critical point system */
    [P, Q, U] ← NumKronecker(P, Q, κ) where $\kappa = \tilde{O}(hD^2)$ is sufficient precision to group the roots of $\tilde{P}$ by the distinct values they give to each $z_j = \tilde{Q}_j / \tilde{P}'$ using Lemma 7.5
    Determine the root $u_\zeta$ of P corresponding to ζ from the numerical approximations
    /* Return the set of minimal critical points */
    Refine U to isolating regions of size at most $2^{-\kappa}$, where $\kappa = \tilde{O}(h\delta^3 D^3)$ is sufficient to identify elements with same coordinate-wise modulus using Corollary 7.1
    return set of u ∈ U such that $|Q_j(u)/P'(u)| = Q_j(u_\zeta)/P'(u_\zeta)$ for 1 ≤ j ≤ d

Section 7.3 discusses the necessary algebraic background, algorithms for computing Kronecker and numeric Kronecker representations, properties of Kronecker representations, and alternative encodings of finite algebraic sets. An algebraic toolbox of methods using the numeric Kronecker representation to decide the properties we require for our asymptotic arguments is developed. Treating the results of Section 7.3 as a black box yields Algorithms 1–3 implementing Theorem 7.1, whose proof is completed in Section 7.4. The procedures Kronecker(f) and NumKronecker(f, κ) for computing Kronecker and numeric Kronecker representations are discussed in Proposition 7.3 and Proposition 7.5, respectively.
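The isolating intervals of Definition 7.5 can be refined to any requested precision with exact rational arithmetic. The sketch below uses plain bisection on the Apéry polynomial P(u) = u^2 + 162u − 167 from Example 7.1, starting from the interval [1, 2]. This is a simple stand-in, not the root refinement procedure behind NumKronecker, and it assumes the starting interval contains a single simple root where P changes sign. The bracketing of C(u) relies on the observation, made here for illustration, that C is decreasing for u > 15/17.

```python
from fractions import Fraction

def horner(coeffs, x):
    """Evaluate a polynomial (coefficients from highest degree down) exactly."""
    acc = Fraction(0)
    for c in coeffs:
        acc = acc * x + c
    return acc

def refine(coeffs, lo, hi, kappa):
    """Bisect an interval with a sign change until its width is at most 2^-kappa."""
    flo = horner(coeffs, lo)
    assert flo * horner(coeffs, hi) < 0, "need a sign change on [lo, hi]"
    while hi - lo > Fraction(1, 2 ** kappa):
        mid = (lo + hi) / 2
        fmid = horner(coeffs, mid)
        if fmid == 0:
            return mid, mid
        if flo * fmid < 0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return lo, hi

def C(u):
    # Exponential growth factor from Example 7.1, as a rational function of u.
    return (u + 81) / (17 * u - 15)

P = [1, 162, -167]
lo, hi = refine(P, Fraction(1), Fraction(2), 40)

print(float(lo), float(hi))         # both close to 1.0243...
print(float(C(hi)), float(C(lo)))   # brackets the growth rate 17 + 12*sqrt(2)
```

Evaluating A, B and C on such certified intervals is how the values at the elements of U are obtained to precision 2^-κ.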

7.2.1 Examples

We now work through some examples, starting by revisiting the Apéry numbers. As usual, worksheets illustrating these examples can be found on the textbook website.


Algorithm 3: MinimalCriticalGen
Input: Polynomial $H(z)$, direction $r \in \mathbb{N}^d$, and Kronecker representation $[P, Q, u]$ of the set of smooth critical points in some direction $r$
Output: Set $U$ of roots of $P$ corresponding to the minimal critical points

  /* Determine the extended critical point system */
  $\tilde{S} \leftarrow$ Polynomials in (7.3)–(7.8)
  $[\tilde{P}, \tilde{Q}] \leftarrow$ Kronecker($\tilde{S}$)

  /* Construct the set of minimal critical points */
  $[\tilde{P}, \tilde{Q}, \tilde{U}] \leftarrow$ NumKronecker($\tilde{P}, \tilde{Q}, \kappa$), where $\kappa = \tilde{O}(h\delta^8D^8)$ is sufficient precision to group the roots of $\tilde{P}$ by the distinct values they give to each $\tilde{Q}_{a_i}/\tilde{P}'$ and $\tilde{Q}_{b_i}/\tilde{P}'$ using Lemma 7.5 and determine when $t$ lies in $(0,1)$ using Lemma 7.4
  $S \leftarrow \{\tilde{U}\}$
  for each $v \in \{a_1,\dots,a_d,b_1,\dots,b_d\}$ do
      Split apart each set in $S$ by the distinct values its elements give $\tilde{Q}_v(u)/\tilde{P}'(u)$
  end
  for each set $s \in S$ do
      if there exists an index $j$ such that $a_j = b_j = 0$ on the elements of $s$ then return FAIL
      if $t$ has a value in $(0,1)$ at some element of $s$ then $S \leftarrow S \setminus \{s\}$
  end
  $S_{a,b} \leftarrow$ numerical approximations of $(a,b)$ determined by elements of $S$

  /* Return the minimal critical points */
  $[P, Q, U] \leftarrow$ NumKronecker($P, Q, \kappa$), where $\kappa = \tilde{O}(h\delta^2D^2)$ is sufficient precision to identify which elements have $(a,b)$ in $S_{a,b}$ using Lemma 7.5
  return roots of $P$ defining $z = a + ib$ for $(a,b) \in S_{a,b}$

Example 7.3 (Apéry Revisited) Let us trace through our algorithms on the combinatorial function
\[ \Delta\left(\frac{1}{H(w,x,y,z)}\right) = \Delta\left(\frac{1}{1 - w(1+x)(1+y)(1+z)(1+y+z+yz+xyz)}\right). \]
To begin we form the polynomial system
\[ S = \left(H(w,x,y,z),\ wH_w - \lambda,\ xH_x - \lambda,\ yH_y - \lambda,\ zH_z - \lambda,\ H(tw,tx,ty,tz)\right). \]
Using a Gröbner basis method discussed in Section 7.3, we can compute a Kronecker representation $[\tilde{P}, \tilde{Q}]$ of $V(S)$ by taking $u$ as a random integer linear combination of the variables. For instance, taking $u = w + t$ results in a parametrization where $\tilde{P}$ is a polynomial of degree 14 with integer coefficients at most 17 digits long, and $\tilde{Q}_w, \tilde{Q}_x, \tilde{Q}_y, \tilde{Q}_z, \tilde{Q}_\lambda$ and $\tilde{Q}_t$ are polynomials of degree at most 13 with integer coefficients at most 18 digits long. Each element of $V(S)$ is encoded by a root of $\tilde{P}(u)$, and the critical points, determined by the elements of $V(S)$ where $t = 1$, are given by the values of $u$ that are roots of
\[ P(u) = \gcd(\tilde{P}, \tilde{P}' - \tilde{Q}_t) = u^2 + 162u - 167. \]


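Since $P$ is quadratic, we can find its roots $u_1, u_2$ exactly to get the critical points
\[ \left(\frac{\tilde{Q}_w(u_1)}{\tilde{P}'(u_1)}, \frac{\tilde{Q}_x(u_1)}{\tilde{P}'(u_1)}, \frac{\tilde{Q}_y(u_1)}{\tilde{P}'(u_1)}, \frac{\tilde{Q}_z(u_1)}{\tilde{P}'(u_1)}\right) = \left(-82 + 58\sqrt{2},\ 1 + \sqrt{2},\ 1/\sqrt{2},\ 1/\sqrt{2}\right) \approx (0.02\ldots,\ 2.41\ldots,\ 0.707\ldots,\ 0.707\ldots) \]
and
\[ \left(\frac{\tilde{Q}_w(u_2)}{\tilde{P}'(u_2)}, \frac{\tilde{Q}_x(u_2)}{\tilde{P}'(u_2)}, \frac{\tilde{Q}_y(u_2)}{\tilde{P}'(u_2)}, \frac{\tilde{Q}_z(u_2)}{\tilde{P}'(u_2)}\right) = \left(-82 - 58\sqrt{2},\ 1 - \sqrt{2},\ -1/\sqrt{2},\ -1/\sqrt{2}\right) \approx (-164.02\ldots,\ -0.4\ldots,\ -0.7\ldots,\ -0.7\ldots). \]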

The numeric Kronecker representation is obtained by approximating the roots of $\tilde{P}(u) = 0$ to a sufficiently high precision; substituting these values into $\tilde{Q}_t(u)/\tilde{P}'(u)$ shows no element of $V(S)$ has $t$ in $(0,1)$, meaning the critical point with positive coordinates is minimal (as we have shown in previous chapters by hand). Since we now care only about the critical points, where $t = 1$, we may replace $\tilde{P}(u)$ by $P(u)$ and reduce the $\tilde{Q}_v$ polynomials accordingly. In particular, the extended Euclidean algorithm can be used to determine the inverse $\tilde{P}'(u)^{-1}$ of $\tilde{P}'(u)$ modulo the polynomial $P(u)$. If we define
\[ Q_v(u) = \tilde{Q}_v(u)\,P'(u)\,\tilde{P}'(u)^{-1} \bmod P(u) \]
for $v \in \{w,x,y,z\}$ then $[P, Q]$ defines a Kronecker representation
\[ P(u) = u^2 + 162u - 167 = 0, \qquad w = \frac{-164u + 172}{P'(u)}, \quad x = \frac{2u + 394}{P'(u)}, \quad y = z = \frac{116}{P'(u)}, \quad \lambda = -1 \]

of the critical points (alternatively, one could recompute a Kronecker representation from scratch using a Gröbner basis computation). Building the matrix $\mathcal{H}$ in (5.25), multiplying by $zH_z$, and taking the determinant gives $\det\tilde{\mathcal{H}}$ as a polynomial in $x$, $y$, $w$, and $z$. Incorporating this polynomial and $T = wxyz$ into the Kronecker representation parametrizes
\[ \det\tilde{\mathcal{H}} = \frac{96u - 112}{P'(u)} \qquad\text{and}\qquad T = \frac{34u - 30}{P'(u)} \]

at the critical points. Putting everything together, and noting the numerator $G = 1$ in this example, the algorithm returns
\[ f_{n,n,n,n} = \sum_{u \in U} \left(\frac{u + 81}{17u - 15}\right)^{n} \frac{1}{4\pi^{3/2}n^{3/2}} \sqrt{\frac{u + 81}{28 - 24u}} \left(1 + O\left(\frac{1}{n}\right)\right), \]
where $U = \{u_1\}$ consists of the positive real root $u_1 = 1.0243\ldots$ of $P(u) = u^2 + 162u - 167$.
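The numbers in Example 7.3 can be sanity-checked in floating point. The following is a minimal sketch rather than the book's worksheet: it evaluates the Kronecker representation of the critical points at the two roots of P(u) = u^2 + 162u − 167, confirms that the resulting points lie on H = 0, and evaluates the exponential growth and leading constant at the positive root. The helper names (`dP`, `critical_point`, `growth`, `constant`) are chosen here for illustration only.

```python
import math

# Roots of P(u) = u^2 + 162u - 167 via the quadratic formula.
disc = math.sqrt(162 ** 2 + 4 * 167)
u1 = (-162 + disc) / 2   # the positive root, roughly 1.0243
u2 = (-162 - disc) / 2


def dP(u):
    """Derivative P'(u) = 2u + 162."""
    return 2 * u + 162


def critical_point(u):
    """Coordinates (w, x, y, z) encoded by a root u of P."""
    d = dP(u)
    return ((-164 * u + 172) / d, (2 * u + 394) / d, 116 / d, 116 / d)


def H(w, x, y, z):
    """Denominator of the Apery rational function."""
    return 1 - w * (1 + x) * (1 + y) * (1 + z) * (1 + y + z + y * z + x * y * z)


w1, x1, y1, z1 = critical_point(u1)

# Exponential growth 1/T, where T = wxyz = (34u - 30)/P'(u).
growth = dP(u1) / (34 * u1 - 30)

# Leading constant multiplying growth^n * n^(-3/2).
constant = math.sqrt((u1 + 81) / (28 - 24 * u1)) / (4 * math.pi ** 1.5)

print(critical_point(u1))
print(critical_point(u2))
print(growth, constant)
```

The growth rate agrees with 17 + 12√2 = (1 + √2)^4, the classical exponential growth of the Apéry numbers.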

Example 7.4 (Two Critical Points with Positive Coordinates) The bivariate rational function
\[ F(x,y) = \frac{1}{(1-x-y)(20-x-40y) - 1} = \frac{1}{1-x-y} \times \frac{1}{20 - x - 40y - \frac{1}{1-x-y}} \]
is combinatorial and has a smooth singular variety. Computing a Kronecker representation of the algebraic set defined by
\[ H(x,y) = xH_x(x,y) - \lambda = yH_y(x,y) - \lambda = H(tx,ty) = 0 \]
shows that there are four critical points, given by the solutions where $t = 1$. Two of the critical points,
\[ (x_1,y_1) \approx (0.549, 0.309) \qquad\text{and}\qquad (x_2,y_2) \approx (9.997, 0.252), \]
have positive coordinates; since $x_1 < x_2$ while $y_1 > y_2$ it is not clear which of the points (if any) is minimal. Examining the full set of real solutions encoded by the Kronecker representation, not just those where $t = 1$, shows there is a point $t(x_2,y_2) \in V$ where $t \approx 0.092$, meaning $(x_2,y_2)$ is not minimal. If $t(x_1,y_1) \in V$ then $t \ge 1$, so $(x_1,y_1)$ is a smooth minimal critical point⁴. After verifying there is no other critical point with the same coordinate-wise modulus, we obtain asymptotics
\[ C(u_1)^n\, n^{-1/2} \left(\pi^{-1/2} A(u_1)\sqrt{B(u_1)} + O\left(\frac{1}{n}\right)\right) = (5.884\ldots)^n\, n^{-1/2} \left(0.054\ldots + O\left(\frac{1}{n}\right)\right), \]
where $A, B, C \in \mathbb{Z}(u)$ and $u_1$ is a real algebraic number specified by its degree four minimal polynomial.

⁴ In fact there is another point on the line through the origin and $(x_1,y_1)$, but this point has the form $t(x_1,y_1)$ where $t \approx 1.709 > 1$.

Example 7.5 (A Non-Combinatorial Series Expansion) Baryshnikov et al. [1] studied the family of rational functions
\[ F_c(x,y,z) = \frac{1}{1 - (x+y+z) + cxyz} \]
to determine which values of $c \in \mathbb{R}$ result in eventually positive diagonal sequences. Our algorithms automatically derive such asymptotic results for fixed values of $c$. For example⁵, if $c = 81/8$ and we consider the main diagonal direction then there are three critical points $(x,y,z)$, defined by
\[ x = y = z \in \left\{ -2/3,\ 1/3 \pm i/(3\sqrt{3}) \right\}, \]
the three roots of $81x^3 - 24x + 8 = (3x+2)(27x^2 - 18x + 4)$. Computing a Kronecker representation shows that the system (7.3)–(7.8) admits solutions with $0 < t < 1$. In particular, at any root of the system with $0 < t < 1$ either $t = 1/3$ or $t \approx 0.3451$ satisfies $81t^3 + 36t^2 + 4t - 9 = 0$. Algorithm 3 proves that $(-2/3,-2/3,-2/3)$ is the only non-minimal critical point, ultimately showing that the power series diagonal has dominant asymptotics
\[ \left(\frac{i\,81\sqrt{3}}{8}\right)^{n} \frac{3\sqrt{3} + 3i}{8\pi n} + \left(\frac{-i\,81\sqrt{3}}{8}\right)^{n} \frac{3\sqrt{3} - 3i}{8\pi n} = \left(\frac{81\sqrt{3}}{8}\right)^{n} \frac{3\cos\left(\frac{\pi}{6} + \frac{\pi n}{2}\right)}{2n\pi}. \]
Going back to the original motivations of Baryshnikov et al. [1], our algorithm proves that when $c = 81/8$ the diagonal sequence contains an infinite number of negative terms and an infinite number of positive terms.

⁵ Our choice of $c$ simplifies the critical points obtained for this example, but our algorithms work for any value of $c$.
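The sign pattern predicted in Example 7.5 can be compared against exact diagonal coefficients. The sketch below is an illustration, not part of the algorithms above: it expands 1/(1 − (x + y + z − cxyz)) as a geometric series, so the coefficient of (xyz)^n is a finite sum of multinomial coefficients, and compares its sign with that of the leading asymptotic term (81√3/8)^n · 3cos(π/6 + πn/2)/(2πn). The function names are hypothetical.

```python
from fractions import Fraction
from math import cos, factorial, pi, sqrt

C = Fraction(81, 8)


def diagonal_coefficient(n):
    """Exact coefficient of (xyz)^n in 1/(1 - (x + y + z) + C*x*y*z)."""
    total = Fraction(0)
    for k in range(n + 1):
        # Pick k factors (-C*xyz) and n - k factors each of x, y and z;
        # the number of arrangements is a multinomial coefficient.
        ways = factorial(3 * n - 2 * k) // (factorial(k) * factorial(n - k) ** 3)
        total += (-C) ** k * ways
    return total


def leading_term(n):
    """Dominant asymptotic term of the diagonal for c = 81/8 (n >= 1)."""
    rho = 81 * sqrt(3) / 8
    return rho ** n * 3 * cos(pi / 6 + pi * n / 2) / (2 * pi * n)


for n in range(1, 9):
    exact = diagonal_coefficient(n)
    print(n, float(exact), leading_term(n))
```

Because cos(π/6 + πn/2) cycles through negative, negative, positive, positive values for n = 1, 2, 3, 4 and then repeats with period four, both signs occur infinitely often.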

Our approach to proving minimality using Proposition 7.1 is flexible enough to help even when the direction $r$ is taken as a parameter. When the argument goes through, proving minimality is reduced to certifying that a univariate polynomial $p_r(t)$ in $t$ whose coefficients are algebraic quantities in $r$ has no root $t \in (0,1)$ when $r$ lies in some range of interest. An invaluable tool for these arguments is Sturm’s theorem for real root counting.

Definition 7.6 (Sturm sequences and sign changes) If $f(t)$ is a polynomial in $\mathbb{R}[t]$ of degree $\delta$ then the Sturm sequence of $f$ is the finite sequence of polynomials $g_0, \dots, g_\delta$ defined by
\[ g_0(t) = f(t), \qquad g_1(t) = f'(t), \qquad g_n = -\mathrm{rem}(g_{n-2}, g_{n-1}) \quad (2 \le n \le \delta), \]
where $\mathrm{rem}(g_{n-2}, g_{n-1})$ is the polynomial remainder when dividing $g_{n-2}$ by $g_{n-1}$. For $t \in \mathbb{R}$ the sign variation of $f$ at $t$, denoted $V_f(t)$, is the number of sign changes between consecutive elements in the sequence $g_0(t), \dots, g_\delta(t)$, ignoring zeroes.

Proposition 7.2 (Sturm’s Theorem) The number of zeroes of $f(t) \in \mathbb{R}[t]$ in the interval $(a,b]$ for any $a, b \in \mathbb{R}$ with $a < b$ equals $V_f(a) - V_f(b)$. In particular, if the Sturm sequence of $f(t)$ has the same number of sign alternations when $t = a$ and $t = b$ then there are no roots of $f(t)$ in $(a,b]$.

A proof of Proposition 7.2 can be found in [2, Thm 2.50].

Example 7.6 (Distribution of Leaves in Planar Trees) Recall from Chapter 3 that the number of rooted planar trees on $n$ nodes with $k$ leaves is the coefficient of $u^kz^ny^n$ in the power series expansion of
\[ F(u,z,y) = \frac{G(u,z,y)}{H(u,z,y)} = \frac{y(uyz - yz - 2y + 1)}{1 + uyz - uz - yz - y}. \]


Note that the power series of $F$ has negative coefficients away from the terms of combinatorial interest. Forming the system (7.3)–(7.8) with $r = (s,1,1)$ as a parameter and computing a lexicographic Gröbner basis, as described in Section 7.3, shows that there is a single critical point of $F$,
\[ (u,z,y) = \left(\frac{s^2}{(s-1)^2},\ \frac{(s-1)^2}{s},\ s\right), \]
and that the value of $t$ at any solution of (7.3)–(7.8) satisfies
\[ (t-1)\,p_1(t)\,p_2(t)\,p_3(t)\,p_4(t) = 0, \]
where $p_1, p_2, p_3, p_4$ are explicit polynomials in $t$ with coefficients in $\mathbb{Z}[s]$. Minimality of the critical point for $0 < s < 1$ follows from showing that $p_1, p_2, p_3, p_4$ have no root with $t \in (0,1)$ for those values of $s$ (note that $s < 1$ since a tree cannot have more leaves than nodes). We prove this for $p_1(t) = s^4t^2 - (3s^2 - 2s + 1)t + 1$; the arguments for $p_2$, $p_3$, and $p_4$, which are cubic in $t$, are analogous and given in the worksheet for this example. The Sturm sequence for $p_1(t)$ evaluated at $t = 0$ is
\[ v_0 = \left(1,\ -3s^2 + 2s - 1,\ \frac{(5s^2 - 2s + 1)(s-1)^2}{4s^4}\right) \]
while evaluated at $t = 1$ it becomes
\[ v_1 = \left(s(s+2)(s-1)^2,\ (s-1)(2s^3 + 2s^2 - s + 1),\ \frac{(5s^2 - 2s + 1)(s-1)^2}{4s^4}\right). \]
Whenever $0 < s < 1$ the signs of the elements of $v_0$ and $v_1$ both form the sequence $(+,-,+)$, containing 2 sign changes, so $p_1(t)$ has no root with $0 < t \le 1$. Repeating this process with $p_2$, $p_3$, and $p_4$ shows the critical point is minimal, ultimately obtaining asymptotics
\[ [u^{sn}z^ny^n]F(u,z,y) = \left(\frac{1}{s^{2s}(1-s)^{2-2s}}\right)^{n} \frac{1}{2\pi n^2}\left(1 + O\left(\frac{1}{n}\right)\right) \]
for the number of rooted plane trees with $n$ nodes and $k = sn$ leaves.
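The Sturm sequence argument for p_1 can be replayed with exact rational arithmetic once s is fixed. The sketch below is an illustrative implementation, not the worksheet code: polynomials are coefficient lists with the highest degree first, f is assumed to be non-constant, and roots in (a, b] are counted as the drop in sign variations from a to b. All function names are hypothetical.

```python
from fractions import Fraction


def poly_rem(a, b):
    """Remainder of a divided by b (coefficient lists, highest degree first)."""
    a = a[:]
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)  # the leading coefficient has been cancelled
    while a and a[0] == 0:
        a.pop(0)
    return a


def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]


def sturm_sequence(f):
    """g0 = f, g1 = f', g_n = -rem(g_{n-2}, g_{n-1}) until the remainder vanishes."""
    seq = [f, derivative(f)]
    while True:
        r = poly_rem(seq[-2], seq[-1])
        if not r:
            return seq
        seq.append([-c for c in r])


def evaluate(p, t):
    value = Fraction(0)
    for c in p:
        value = value * t + c
    return value


def sign_variations(seq, t):
    values = [v for v in (evaluate(p, t) for p in seq) if v != 0]
    return sum(1 for x, y in zip(values, values[1:]) if x * y < 0)


def count_roots(f, a, b):
    """Number of distinct real roots of f in (a, b], assuming f(a) != 0."""
    seq = sturm_sequence(f)
    return sign_variations(seq, a) - sign_variations(seq, b)


def p1(s):
    """p1(t) = s^4 t^2 - (3 s^2 - 2 s + 1) t + 1 for a fixed rational s."""
    return [s ** 4, -(3 * s ** 2 - 2 * s + 1), Fraction(1)]


half = Fraction(1, 2)
print(sturm_sequence(p1(half)))
print(count_roots(p1(half), 0, 1), count_roots(p1(half), 0, 2))
```

For s = 1/2 the polynomial is t^2/16 − 3t/4 + 1, with roots 6 ± 2√5, so no root lies in (0, 1] while exactly one lies in (0, 2].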

7.3 Data Structures for Polynomial System Solving

In order to implement our algorithms for asymptotics we need techniques for manipulating the solutions of polynomial systems. We thus give a brief overview of this vast area of mathematics which, out of necessity, is highly tailored to our needs. Those wanting a deeper introduction can turn to the lovely book of Cox et al. [10] or, for an exhaustive look at the topic, to the book series of Mora [28, 29, 30, 31].

7.3.1 Gröbner Bases and Triangular Systems

We start by encoding polynomial systems and their solutions.

Definition 7.7 (ideals and algebraic sets) Given a finite collection of polynomials $f_1(z), \dots, f_r(z)$ in $\mathbb{Q}[z]$, the ideal of $\mathbb{Q}[z]$ generated by the $f_j$ is the set
\[ (f_1, \dots, f_r) = \left\{ \sum_{j=1}^{r} g_j(z) f_j(z) : g_j(z) \in \mathbb{Q}[z] \right\}. \]

Equivalently, $(f_1, \dots, f_r)$ is the smallest set containing the $f_j$ which is closed under addition of elements in the set and closed under multiplication by any element of $\mathbb{Q}[z]$. The algebraic set or variety⁶ defined by $f_1(z), \dots, f_r(z) \in \mathbb{Q}[z]$ is the set of their common solutions
\[ V(f_1, \dots, f_r) = \{z \in \mathbb{C}^d : f_1(z) = \cdots = f_r(z) = 0\}. \]
If two sets of polynomials $f = \{f_1, \dots, f_r\}$ and $g = \{g_1, \dots, g_s\}$ generate the same ideal, then each $f_j$ can be written as a $\mathbb{Q}[z]$-linear combination of the $g_k$ polynomials, and vice-versa. In particular, $f$ and $g$ vanish on the same set of points, meaning $V(f_1, \dots, f_r) = V(g_1, \dots, g_s)$. It thus makes sense to talk about the variety defined by an ideal, and symbolically ‘solving’ the polynomial system $f_1 = \cdots = f_r = 0$ can be viewed as finding a particularly nice generating set for the ideal $(f_1, \dots, f_r)$.

⁶ Some authors reserve the name variety for irreducible algebraic sets, but we do not make this distinction. Unlike in Chapter 5, here we only consider algebraic sets in $\mathbb{C}^d$ (we do not need projective space).

Definition 7.8 (zero-dimensional ideals) An ideal $I$ is zero-dimensional if $V(I)$ is finite, and a polynomial system is zero-dimensional if its elements generate a zero-dimensional ideal (in other words, the polynomials have a finite number of common roots in $\mathbb{C}^d$).

If all $f_j$ polynomials are linear then one way to determine $V(f_1, \dots, f_r)$ is taught in a first class on linear algebra: put the coefficients of the polynomials in a matrix and perform Gaussian elimination. At the polynomial level, this corresponds to taking linear combinations of the $f_j$ in order to determine a new generating set $g_k$ where, for instance, the variable $z_1$ appears only in $g_1$, the variable $z_2$ appears only in $g_2$, etc. Trying to mirror this construction for non-linear polynomials immediately runs into a problem: if we want to systematically cancel terms in the $f_j$ polynomials to determine a ‘simpler’ generating set, we need a way of determining which monomials to cancel first. We thus need a way of sorting the monomials which appear, which is facilitated by the following definitions.

Definition 7.9 (total and monomial orderings) A total order on a set $S$ is a binary relation $\succeq$ on $S$ such that for all $a, b, c \in S$: either $a \succeq b$ or $b \succeq a$, the order is antisymmetric (if $a \succeq b$ and $b \succeq a$ then $a = b$), and the order is transitive (if $a \succeq b$ and $b \succeq c$ then $a \succeq c$). A monomial ordering is a total order $\succeq$ on monomials $z^i$ with $i \in \mathbb{N}^d$ such that for all $a, b, c \in \mathbb{N}^d$: (1) if $z^a \succeq z^b$ then $z^{a+c} \succeq z^{b+c}$ and (2) $z^a \succeq 1$. Common monomial orders include the following.

lexicographic: $z^a \succ_{lex} z^b$ if the leftmost non-zero entry of $a - b$ is positive. Equivalently, monomials are first ordered by their powers of $z_1$, with ties broken by their powers of $z_2$, then $z_3$, etc. For instance, $z_1^3z_2^2 \succ_{lex} z_1^2z_2^3 \succ_{lex} z_1^2 \succ_{lex} z_1z_2^{100}$.

graded lexicographic: $z^a \succ_{gr} z^b$ if $z^a$ has larger degree than $z^b$, or if they have the same degree and $z^a \succ_{lex} z^b$. Monomials are sorted by degree, with ties broken by the lexicographic order. For instance, $z_1z_2^{100} \succ_{gr} z_1^3z_2^2 \succ_{gr} z_1^2z_2^3 \succ_{gr} z_1^2$.

reverse graded lexicographic: $z^a \succ_{rv} z^b$ if $z^a$ has larger degree than $z^b$, or if they have the same degree and the rightmost non-zero entry of $a - b$ is negative. Monomials are sorted by degree, with ties broken by the reverse ordering to the lexicographic order when the sequence of the variables is also reversed. For instance, $z_2^6 \succ_{rv} z_1^5 \succ_{rv} z_1^3z_2^2 \succ_{rv} z_1^2z_2^3$.
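The three monomial orders of Definition 7.9 are easy to experiment with when monomials are stored as exponent tuples. The following is a small sketch with hypothetical function names; each comparison function returns True when the first monomial is strictly larger than the second.

```python
from functools import cmp_to_key


def lex_greater(a, b):
    """Leftmost non-zero entry of a - b is positive."""
    for ai, bi in zip(a, b):
        if ai != bi:
            return ai > bi
    return False


def grlex_greater(a, b):
    """Compare total degree first, then break ties lexicographically."""
    if sum(a) != sum(b):
        return sum(a) > sum(b)
    return lex_greater(a, b)


def grevlex_greater(a, b):
    """Compare total degree first; ties: rightmost non-zero entry of a - b is negative."""
    if sum(a) != sum(b):
        return sum(a) > sum(b)
    for ai, bi in zip(reversed(a), reversed(b)):
        if ai != bi:
            return ai < bi
    return False


def sort_descending(monomials, greater):
    """Sort exponent tuples from largest to smallest under the given order."""
    def compare(a, b):
        if a == b:
            return 0
        return -1 if greater(a, b) else 1
    return sorted(monomials, key=cmp_to_key(compare))


# Exponent vectors of z1^3 z2^2, z1^2 z2^3, z1^2 and z1 z2^100.
monomials = [(1, 100), (2, 0), (2, 3), (3, 2)]
print(sort_descending(monomials, lex_greater))
print(sort_descending(monomials, grlex_greater))

# In three variables the two graded orders can disagree: z1*z3 versus z2^2.
print(grlex_greater((1, 0, 1), (0, 2, 0)), grevlex_greater((1, 0, 1), (0, 2, 0)))
```

In two variables the graded lexicographic and reverse graded lexicographic orders coincide, which is why the last line uses three variables to separate them.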

Definition 7.10 (leading terms and Gröbner bases) Fix a monomial order $\succ$. The leading term of a polynomial $f \in \mathbb{Q}[z]$ is the term appearing in $f$ which is largest under $\succ$. Given an ideal $I = (f_1, \dots, f_r)$ the leading terms of all polynomials in $I$ form another ideal $\mathrm{LT}(I)$, called the leading term ideal of $I$, which is not necessarily generated by the leading terms of the generating set $f_j$. A Gröbner basis of the ideal $I$ is a finite generating set $g_1, \dots, g_s$ of $I$ such that the ideal generated by the leading terms of the $g_j$ is $\mathrm{LT}(I)$. A reduced Gröbner basis is a Gröbner basis such that no leading term of a generating polynomial lies in the ideal generated by the leading terms of the other generating polynomials, and whose polynomials are normalized to have leading terms with coefficient 1.

Gröbner bases were introduced in the 1960s by Buchberger [5, 6], who gave an algorithm for their calculation (they exist for any ideal and ordering) and proved several of their remarkable properties. The reduced Gröbner basis of an ideal under a fixed monomial order is unique, and a Gröbner basis of an ideal $I = (f_1, \dots, f_r)$ with respect to any order contains the polynomial 1 if and only if the polynomial system $f_1 = \cdots = f_r = 0$ has no solution in $\mathbb{C}^d$. Buchberger’s algorithm, which reduces to the Euclidean algorithm for univariate polynomials and Gaussian elimination for multivariate linear polynomials, consists of taking two polynomials $f, g$ from the generating set $\mathbf{f}$ of an ideal, taking the smallest degree monomial linear combination that results in a polynomial with smaller leading term, reducing this new polynomial through multivariate polynomial division with $\mathbf{f}$, then adding the result to the generating set and repeating the procedure until no new generators are found. More efficient algorithms to compute Gröbner bases have been developed, notably the F4 and F5 algorithms of Faugère [13, 14], and most computer algebra systems contain built-in procedures to compute Gröbner bases after one specifies a monomial order.

Example 7.7 (Apéry Critical Points) The Apéry sequence is the main diagonal of the rational function $1/H(w,x,y,z)$ where
\[ H(w,x,y,z) = 1 - w(1+x)(1+y)(1+z)(1+y+z+yz+xyz), \]
in which case the polynomials in the smooth critical point system (5.16) form the ideal
\[ I = (H,\ xH_x - yH_y,\ xH_x - zH_z,\ xH_x - wH_w). \]
The Groebner[Basis] command in Maple, when given the ideal $I$ and lexicographic order where $x \succ y \succ z \succ w$, returns the Gröbner basis
\[ G = (w^2 + 164w - 4,\ 116z - w - 82,\ 116y - w - 82,\ 58x - w - 140). \]
Since $I = (G)$ implies $V(I) = V(G)$, the critical points satisfy the system
\[ \begin{aligned} w^2 + 164w - 4 &= 0 \\ 116z - w - 82 &= 0 \\ 116y - w - 82 &= 0 \\ 58x - w - 140 &= 0, \end{aligned} \]
and can be determined by solving the first equation for $w$ then substituting the value of $w$ in the remaining equations and solving for the remaining variables.
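The back-substitution described in Example 7.7 can be sketched numerically as follows. This is an illustration only: it solves the quadratic in w with the quadratic formula, substitutes into the three linear equations, and then checks the smooth critical point equations using hand-computed partial derivatives of H. The function names are hypothetical.

```python
import math


def solve_triangular():
    """Solve w^2 + 164w - 4 = 0, then back-substitute into the linear equations."""
    disc = math.sqrt(164 ** 2 + 4 * 4)
    points = []
    for w in ((-164 + disc) / 2, (-164 - disc) / 2):
        z = (w + 82) / 116    # from 116z - w - 82 = 0
        y = (w + 82) / 116    # from 116y - w - 82 = 0
        x = (w + 140) / 58    # from 58x - w - 140 = 0
        points.append((w, x, y, z))
    return points


def critical_residuals(w, x, y, z):
    """Values of H, xH_x - yH_y, xH_x - zH_z and xH_x - wH_w at a point."""
    A, B, C = 1 + x, 1 + y, 1 + z
    D = 1 + y + z + y * z + x * y * z
    wHw = -w * A * B * C * D
    xHx = -w * x * B * C * (D + A * y * z)
    yHy = -w * y * A * C * (D + B * (1 + z + x * z))
    zHz = -w * z * A * B * (D + C * (1 + y + x * y))
    return (1 + wHw, xHx - yHy, xHx - zHz, xHx - wHw)


for point in solve_triangular():
    print(point, critical_residuals(*point))
```

Both solutions make all four residuals vanish up to rounding error, and the first is the critical point with positive coordinates.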

The fact that a lexicographic Gröbner basis gives a nice set of generators for solving the critical point system in this example is not a coincidence. Given a (finite or infinite) set $S \subset \mathbb{Q}[z]$ and natural number $0 \le \ell \le d-1$, let $S_\ell$ denote the elements of $S$ containing only the variables $z_{\ell+1}, \dots, z_d$.

Theorem 7.2 (Elimination Theorem of Lexicographic Gröbner Bases) Let $I \subset \mathbb{Q}[z]$ be an ideal and $G = \{g_1, \dots, g_r\}$ be a Gröbner basis of $I$ with respect to the lexicographic monomial order where $z_1 \succ \cdots \succ z_d$. Then for any $0 \le \ell \le d-1$ the set $I_\ell$ is an ideal of $\mathbb{Q}[z_{\ell+1}, \dots, z_d]$ with lexicographic Gröbner basis $G_\ell$.

Theorem 7.2 says that the finite set $G_\ell$, obtained immediately by inspection from $G$, generates the ideal of elements in $I$ which contain only the variables $z_{\ell+1}, \dots, z_d$.

Proof That $I_\ell$ is an ideal follows directly from the fact that $I$ is an ideal. Let $f \in I_\ell$, so that $f \in \mathbb{Q}[z_{\ell+1}, \dots, z_d]$. Since $f \in I$ and $G$ is a Gröbner basis of $I$, $\mathrm{LT}(f)$ is in the ideal generated by the leading terms of $G$. Because each leading term is a monomial, it follows that $\mathrm{LT}(f)$ is divisible by $\mathrm{LT}(g)$ for some $g \in G$. But then $\mathrm{LT}(f) \in \mathbb{Q}[z_{\ell+1}, \dots, z_d]$ implies $\mathrm{LT}(g) \in \mathbb{Q}[z_{\ell+1}, \dots, z_d]$. Because $G$ is a Gröbner basis with respect to the lexicographic order where $z_1 \succ \cdots \succ z_d$, $\mathrm{LT}(g) \in \mathbb{Q}[z_{\ell+1}, \dots, z_d]$ implies $g \in \mathbb{Q}[z_{\ell+1}, \dots, z_d]$: if $g$ contained a term with a variable $z_k$ for $1 \le k \le \ell$ then by the definition of the lexicographic order such a variable with smallest index would also appear in its leading term. We have shown $f \in I_\ell$ implies $\mathrm{LT}(f)$ is in the ideal $\mathrm{LT}(G_\ell)$. Each element of $G_\ell$ is in $I_\ell$, so the ideal $(G_\ell)$ generated by $G_\ell$ is contained in $I_\ell$. Towards a contradiction suppose $I_\ell \neq (G_\ell)$ and pick $f \in I_\ell \setminus (G_\ell)$ with minimal leading term under the lexicographic order. Then $\mathrm{LT}(f) = \mathrm{LT}(g)$ for some $g \in (G_\ell)$. The leading term of $f - g \in I_\ell$ is smaller than $\mathrm{LT}(f)$ under the lexicographic order, so by minimality of $f$ we have $f - g \in (G_\ell)$, which contradicts $f \notin (G_\ell)$. In particular, it must be the case that $I_\ell = (G_\ell)$ and $G_\ell$ is a Gröbner basis under the lexicographic order. ∎

This naturally suggests a method to encode an algebraic set in order to easily determine and manipulate its elements.

Definition 7.11 (triangular systems) A (finite) triangular system⁷ is a system of the form
\[ \begin{aligned} P_1(z_1) &= 0 \\ P_2(z_1, z_2) &= 0 \\ &\ \,\vdots \\ P_d(z_1, \dots, z_d) &= 0, \end{aligned} \]
where each $P_k \in \mathbb{Q}[z_1, \dots, z_k]$ is monic in $z_k$.

Theorem 7.2 shows a strong connection between triangular systems and lexicographical Gröbner bases: if the ideal $I$ is zero-dimensional then $V(I)$ can be written as the union of the zeroes of a finite number of triangular systems, and such systems can be easily determined from a lexicographical Gröbner basis of $I$; see Lazard [24] for additional details on this approach. One can then approximate or, when possible, exactly determine the elements of $V(I)$ by iteratively solving the sequence of polynomial equations in each triangular system.
⁷ Triangular systems are also known as regular chains, simple systems, or simple extensions (sometimes with small differences in definition, depending on the author).

Example 7.8 (An Extended Critical Point System) Consider again the rational function $1/H(w,x,y,z)$ where
\[ H(w,x,y,z) = 1 - w(1+x)(1+y)(1+z)(1+y+z+yz+xyz). \]
Since the symmetries of the main diagonal hide some of the complexities which arise in general, we now consider asymptotics of the $r = (2,1,1,1)$-diagonal. The polynomials in the extended critical point system (7.2), which allows us to determine minimality, form the ideal
\[ I = (H,\ wH_w - 2\lambda,\ xH_x - \lambda,\ yH_y - \lambda,\ zH_z - \lambda,\ \tilde{H}) \subset \mathbb{Q}[w,x,y,z,\lambda,t] \]
where $\tilde{H}(t,w,x,y,z) = H(tw,tx,ty,tz)$. The reduced lexicographical Gröbner basis $G$ of $I$ with respect to the ordering of variables $\lambda \succ w \succ x \succ y \succ z \succ t$ contains an element of the form $S(t) = (t-1)R(t)$, where $R(t)$ is a polynomial of degree 18 in $t$. That $S(t)$ factors reflects the fact that we can write $V(I) = V(J) \cup V(K)$ for ideals $J, K \subsetneq \mathbb{Q}[w,x,y,z,\lambda,t]$. The factor $t - 1$ corresponds to the actual critical points of $H$, where $t = 1$, while the real solutions of the factor $R(t)$ encode points on the lines containing the origin and these critical points. This factorization of $S(t)$ also implies $V(I)$ is the union of the zeroes of two triangular systems. Replacing $S(t)$ by $t - 1$ in $G$ and recomputing the lexicographical Gröbner basis gives the set of polynomials
\[ \{t - 1,\ 6z^3 - 7z^2 - 5z + 2,\ y - z,\ 6z^2 + 4x - 3z - 3,\ 4482z^2 + 72w - 8355z + 2114,\ 2\lambda + 1\}, \]
which form a triangular system. Similarly, replacing $S(t)$ by $R(t)$ in $G$ and recomputing the lexicographical Gröbner basis gives another set of polynomials
\[ \{R(t),\ U_1(t,z),\ U_2(t,y),\ U_3(t,x),\ U_4(t,w),\ 2\lambda + 1\}, \]
where $U_1$, $U_2$, $U_3$, and $U_4$ are linear in $z$, $y$, $x$, and $w$, respectively. Unfortunately, the $U_j$ polynomials are extremely large: each is a polynomial of degree 17 whose smallest non-zero coefficient is 51 decimal digits long! Our introduction of the Kronecker representation is motivated in large part by wanting to deal with polynomials whose coefficients have a more manageable size.

The complexity of computing Gröbner bases is well studied, yet tricky to pin down precisely. Doubly-exponential lower bounds are known in the worst case [20], yet implementations in computer algebra systems run reasonably well on many applications. As a practical matter, lexicographic Gröbner bases (often desired for their elimination property) are typically larger and take longer to compute than Gröbner bases with respect to other orders⁸. There are algorithms which convert between Gröbner bases with respect to different orders, allowing one to compute a basis with respect to a computationally well-chosen order then convert to a lexicographic basis for its elimination properties. Especially useful for this conversion on zero-dimensional ideals is the FGLM algorithm of Faugère et al. [12]. Perhaps surprisingly, under one natural measurement the most efficient order for a Gröbner basis is the reverse graded lexicographic order [3].

Although useful for encoding finite algebraic sets, triangular systems have some drawbacks (for instance, iteratively solving polynomial equations for each variable can obscure which solutions have real coordinates). Instead of going deeper into the theory of triangular systems, we move on to different representations of algebraic sets more suited to our needs.

⁸ To quote from the deep analysis of Bayer and Stillman [3], “. . . use of the lexicographic order in computations can unequivocally be discouraged, in favor of more carefully chosen orders.”

7.3.2 Univariate Representations

The key to our algorithms is the parametrization of a finite algebraic set by the roots of a univariate polynomial, with the other coordinates being well-chosen rational functions in that variable. We rely heavily on the results of the appendix to this chapter, which summarize important properties of univariate polynomials. With that in mind, let $I \subset \mathbb{Q}[z]$ be a zero-dimensional ideal and introduce a new variable $u$ satisfying
\[ u = \kappa_1z_1 + \cdots + \kappa_dz_d \qquad (7.11) \]
for some $\kappa_j \in \mathbb{Z}$. Geometrically, introducing $u$ corresponds to parameterizing the elements of $V(I)$ by the level sets of the hyperplane with normal $\kappa$.

Definition 7.12 (separating linear forms) The variable $u$ in (7.11) is called a separating linear form if it takes distinct values for each $z \in V(I)$.

Because we care mainly about $V(I)$, and the zero set of a polynomial is unchanged when taking positive powers, we introduce the following concept.

Definition 7.13 (radical of an ideal) The radical of an ideal $K \subset \mathbb{Q}[x]$ is the set $\sqrt{K}$ consisting of $f \in \mathbb{Q}[x]$ such that $f^n \in K$ for some $n \in \mathbb{N}$.

Remark 7.5 The radical of an ideal $K$ is itself an ideal of $\mathbb{Q}[x]$, and is the largest ideal in $\mathbb{Q}[x]$ such that $V(K) = V(\sqrt{K})$. The radical of an ideal can be computed using Gröbner bases [4, Ch. 8.2].

Let $J \subset \mathbb{Q}[z,u]$ be the radical of the ideal generated by a zero-dimensional ideal $I$ together with the polynomial $u - \kappa \cdot z$. When $u$ is a separating linear form then the shape lemma [15, Prop. 1.6] states that the reduced Gröbner basis of $J$ with respect to the lexicographic ordering where $z_1 \succ \cdots \succ z_d \succ u$ consists of a univariate polynomial $P(u)$ with no repeated roots together with polynomials of the form $z_j - R_j(u)$ for $1 \le j \le d$. Conversely, if $V(J)$ has a triangular system of this form then $u$ is separating. To determine the elements of $V(I)$ one simply needs to solve the univariate polynomial $P(u)$ and substitute the result into each $R_j$.
In fact, if the $\kappa_j$ are picked at random there is a good chance to obtain a separating linear form.

Lemma 7.1 Suppose a zero-dimensional ideal $I$ is generated by $d$ polynomials in the variables $z_1, \dots, z_d$, each of degree at most $\delta$. If each $\kappa_j$ in (7.11) is chosen uniformly at random from a finite set $S \subset \mathbb{Z}$ then the probability that the variable $u$ defined by (7.11) fails to take distinct values at the elements of $V(I)$ satisfies
\[ \mathbb{P}[u \text{ not distinct at elements of } V(I)] \le \frac{\binom{D}{2}}{|S|}, \]
where $D = \delta^d$. In particular, if $S$ has at least $D^2$ elements then some assignment of $\kappa \in S^d$ produces a linear form $u$ which takes distinct values for $z \in V(I)$.

Proof Bézout’s theorem [38] implies that a polynomial system with a finite number of solutions defined by $d$ equations in $d$ variables, each of degree at most $\delta$, has at most $D$ solutions. Thus, we may write $V(I) = \{\mathbf{z}_1, \dots, \mathbf{z}_r\}$ for $r \le D$ with each $\mathbf{z}_j$ distinct. If $u(z,\kappa)$ denotes the right-hand side of (7.11), the linear form $u$ takes distinct values at the elements of $V(I)$ if and only if the polynomial
\[ P(\kappa) = \prod_{i<j} \left(u(\mathbf{z}_i,\kappa) - u(\mathbf{z}_j,\kappa)\right) \]
is non-zero. Problem 7.3 asks you to prove the Schwartz-Zippel lemma, stating that the probability $P(\kappa)$ vanishes when each $\kappa_j$ is chosen randomly from a finite set $S$ is at most $\deg P/|S|$. Since $\deg P = \binom{r}{2} \le \binom{D}{2}$, this gives the desired bound. ∎

The solutions of systems satisfying the shape lemma can be parametrized by a single variable, which is convenient, but the polynomials in these systems can still have extremely large coefficient sizes.

Example 7.9 (Large Coefficient Sizes in a Univariate Representation) Consider again the extended critical point system forming the ideal
\[ I = (H,\ wH_w - 2\lambda,\ xH_x - \lambda,\ yH_y - \lambda,\ zH_z - \lambda,\ \tilde{H}) \]
where $H(w,x,y,z) = 1 - w(1+x)(1+y)(1+z)(1+y+z+yz+xyz)$ and $\tilde{H}(t,w,x,y,z) = H(tw,tx,ty,tz)$. Introducing the polynomial $u - t - x$, the reduced lexicographical Gröbner basis with respect to the ordering of variables $\lambda \succ w \succ x \succ y \succ z \succ t \succ u$ has the expected form
\[ G = \{P(u),\ t - R_1(u),\ w - R_2(u),\ z - R_3(u),\ y - R_4(u),\ x - R_5(u),\ 2\lambda + 1\}, \]
where $P$ has degree 21 and the $R_j$ polynomials have degree 20. Unfortunately, the coefficients of the $R_j$ polynomials are rational numbers whose numerators and denominators have approximately 160 decimal digits!
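Returning to Lemma 7.1, the Schwartz-Zippel bound can be compared with an exhaustive count on a toy example. The sketch below uses a hypothetical set of four points in the plane, all lying on the line x + y = 3, so the form u = x + y fails to separate them while most other integer forms succeed. The function names and the point set are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction
from itertools import product
from math import comb


def linear_form(kappa, point):
    """Value of u = kappa_1 z_1 + ... + kappa_d z_d at a point."""
    return sum(k * z for k, z in zip(kappa, point))


def is_separating(kappa, points):
    """True when u takes pairwise distinct values on the given points."""
    values = [linear_form(kappa, p) for p in points]
    return len(set(values)) == len(values)


def failure_fraction(points, S):
    """Exact fraction of kappa in S^d that fail to separate the points."""
    d = len(points[0])
    bad = sum(1 for kappa in product(S, repeat=d) if not is_separating(kappa, points))
    return Fraction(bad, len(S) ** d)


def schwartz_zippel_bound(num_points, S):
    """Upper bound binom(r, 2)/|S| on the failure probability for r points."""
    return Fraction(comb(num_points, 2), len(S))


points = [(1, 2), (2, 1), (0, 3), (3, 0)]
S = range(-5, 6)
print(is_separating((1, 1), points), is_separating((1, 2), points))
print(failure_fraction(points, S), schwartz_zippel_bound(len(points), S))
```

Here the only failing choices are those with κ_1 = κ_2, so the true failure probability is well below the bound, consistent with the remark that such bounds tend to be pessimistic.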

Effective versions of the arithmetic Nullstellensatz [23, Thm. 1] imply that the maximum height of the polynomials $P, R_j$ appearing in a triangular system when $u$ is a separating linear form is bounded by $\tilde{O}(h\delta^{2d})$, and this upper bound appears in practice (such as the last example). We thus turn to the Kronecker representation, which contains polynomials with heights bounded in $\tilde{O}(h\delta^{d})$. As the height of a polynomial is the bitsize of the coefficients, this will make our computations much more tractable.


7.3.2.1 The Symbolic Kronecker Representation

Recall from Section 7.2 that a Kronecker representation $[P(u), Q]$ of a finite algebraic set $A \subset \mathbb{C}^d$ is defined by a separating linear form $u = \kappa_1z_1 + \cdots + \kappa_dz_d \in \mathbb{Z}[z]$, square-free polynomial $P \in \mathbb{Z}[u]$, and $Q \in \mathbb{Z}[u]^d$ of degrees less than $\deg P$ such that $A$ is determined by the $z$-coordinates of the solutions of the system
\[ P(u) = 0, \qquad \begin{cases} P'(u)z_1 - Q_1(u) = 0, \\ \qquad\vdots \\ P'(u)z_d - Q_d(u) = 0. \end{cases} \]

Definition 7.14 (Kronecker degrees and heights) The degree of a Kronecker representation is the degree of $P$, and the height of a Kronecker representation is the maximum height of its polynomials $P, Q_1, \dots, Q_d$.

The following result, contained in Schost [39], suggests that the coefficients appearing in Kronecker representations will be much smaller than the coefficients appearing in triangular systems.

Lemma 7.2 Let $f \in \mathbb{Z}[z]^d$ be a set of polynomials of degree at most $\delta$ and heights at most $h$, and let $Z(f)$ be the solutions of $f$ where the Jacobian matrix of $f$ is invertible. Then there exists a Kronecker representation of $Z(f)$ of degree at most $D = \delta^d$ and height $\tilde{O}(hD)$.

Remark 7.6 The degree bound in Lemma 7.2 follows directly from Bézout’s theorem bounding the number of elements of $Z(f) \subset V(f)$. A full derivation of the height bound is outside the scope of this text, but we give an intuitive motivation for the (on first glance somewhat contrived) Kronecker representation⁹. Suppose $u$ is a separating linear form for a zero-dimensional ideal $I$ and represent $V(I)$ by the solutions of a system $P(u) = 0$, $z_1 = R_1(u), \dots, z_d = R_d(u)$ for polynomials $P \in \mathbb{Z}[u]$ and $R_j \in \mathbb{Q}[u]$. Since we are interested only in the elements of $V(I)$, not in the multiplicities of the elements, we may assume $P$ is square-free by replacing it with its square-free part if necessary. Thus, up to a constant factor, $P(u) = (u - u_1) \cdots (u - u_r)$ where the $u_j$ are the distinct roots of $P$ in $\mathbb{C}$. If $z_1$ takes the value $a_j \in \mathbb{C}$ at the element of $V(I)$ obtained by setting $u = u_j$ then we can determine the polynomial $R_1(u)$ by interpolation, since knowing $R_1(u_j) = a_j$ implies

\[ R_1(u) = \sum_{j=1}^{r} a_j \prod_{k \neq j} \frac{u - u_k}{u_j - u_k} = \sum_{j=1}^{r} \frac{a_j}{\prod_{k \neq j}(u_j - u_k)} \prod_{k \neq j}(u - u_k) = \sum_{j=1}^{r} \frac{a_j}{P'(u_j)} \prod_{k \neq j}(u - u_k). \]
To express $R_1(u)$ as a polynomial with rational coefficients one must put fractions over a common denominator, multiplying the values $a_j$ by different evaluations $P'(u_j)$, which essentially leads to an extra factor of $D$ in the heights of the $R_j$ polynomials. From this perspective it is natural to encode the values of each $z_jP'(u)$ instead of just $z_j$, leading to the Kronecker representation.

⁹ Thanks to Éric Schost for providing this explanation.

By Lemma 7.1, a separating linear form can be determined with high probability by randomly selecting the coefficients $\kappa$ in a finite set of integers of sufficient size. For a given choice of $\kappa$ one can compute a Kronecker representation of $V(f)$, when $u$ is separating, by using a lexicographical Gröbner basis of the radical of $(f)$ to determine polynomials $P(u)$ and $R_j(u)$ such that $V(f) = V(P, z_1 - R_1, \dots, z_d - R_d)$ and taking $Q_j(u)$ to be the remainder when the polynomial $P'(u)R_j(u)$ is divided by $P(u)$. When $u$ is not separating this can also be detected from a lexicographical Gröbner basis. A more refined analysis can be found in Rouillier [35].

Example 7.10 (A Kronecker Representation) In the last example we took the ideal
\[ I = (H,\ wH_w - 2\lambda,\ xH_x - \lambda,\ yH_y - \lambda,\ zH_z - \lambda,\ \tilde{H}) \]
with $H(w,x,y,z) = 1 - w(1+x)(1+y)(1+z)(1+y+z+yz+xyz)$ and $\tilde{H}(t,w,x,y,z) = H(tw,tx,ty,tz)$, introduced the polynomial $u - t - x$, and obtained a Gröbner basis of the form
\[ G = \{P(u),\ t - R_1(u),\ w - R_2(u),\ z - R_3(u),\ y - R_4(u),\ x - R_5(u),\ 2\lambda + 1\} \]
which implies $u$ is a separating linear form. If we define, for instance, $Q_1(u)$ to be the remainder when $R_1(u)P'(u)$ is divided by the polynomial $P(u)$ then $Q_1(u)$ is a polynomial of degree 20 whose coefficients each have approximately 17 decimal digits. Though somewhat large for manual consideration, 17 decimal digits is an order of magnitude smaller than the 160 digit integers appearing in $R_1(u)$.
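The conversion from a univariate representation to a Kronecker representation, taking Q_j to be the remainder of P'(u)R_j(u) modulo P(u), can be sketched with exact rational arithmetic. As a small stand-in for the large system of Example 7.10, the code below uses the Apéry critical point system with the linear form u = w (an assumption made here for illustration), so that P(w) = w^2 + 164w − 4 and the R_j are read off from the linear equations 58x − w − 140 = 0 and 116y − w − 82 = 116z − w − 82 = 0. Polynomials are coefficient lists with the highest degree first, and the function names are hypothetical.

```python
from fractions import Fraction


def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_rem(a, b):
    """Remainder of a divided by b (coefficient lists, highest degree first)."""
    a = a[:]
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)  # the leading coefficient has been cancelled
    while a and a[0] == 0:
        a.pop(0)
    return a


def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]


def kronecker_numerator(P, R):
    """Q = P' * R mod P, so that the coordinate equals Q(u)/P'(u) at roots of P."""
    return poly_rem(poly_mul(derivative(P), R), P)


P = [Fraction(1), Fraction(164), Fraction(-4)]          # w^2 + 164w - 4
R = {
    "w": [Fraction(1), Fraction(0)],                    # w = u
    "x": [Fraction(1, 58), Fraction(140, 58)],          # x = (u + 140)/58
    "y": [Fraction(1, 116), Fraction(82, 116)],         # y = (u + 82)/116
    "z": [Fraction(1, 116), Fraction(82, 116)],         # z = (u + 82)/116
}
Q = {name: kronecker_numerator(P, r) for name, r in R.items()}
print(Q)
```

The rational coefficients of the R_j turn into integer coefficients in the Q_j, a small-scale illustration of the height reduction discussed in Remark 7.6. Substituting w = u − 1 recovers the numerators −164u + 172, 2u + 394 and 116 appearing in Example 7.3.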

Although this procedure determines a Kronecker representation, it relies on Gröbner basis computations whose complexity is not well understood, and moves through intermediate expressions with polynomials of large coefficient size. There are alternative, Gröbner-free, approaches to computing a Kronecker representation. The first well-known example is the so-called geometric resolution algorithm of Giusti et al. [16], but for our complexity results we rely on an algorithm of Safey El Din and Schost [36] which uses a ‘homotopy-based approach’. To the best of our knowledge neither of these approaches is implemented in a computer algebra system, so our examples (and the ACSV package of Melczer and Salvy [26]) compute Kronecker representations using lexicographic Gröbner bases. The next result follows from Theorem 1 of Safey El Din and Schost [36].

Proposition 7.3 Let f ∈ Z[z]^d be a polynomial system and Z(f) be the solutions of f where the Jacobian matrix of f is invertible (this implies Z(f) is finite). Suppose the polynomials in f have degree at most δ and heights at most h. Then there exists an algorithm Kronecker that takes f as input and produces one of the following:
• a Kronecker representation of Z(f),
• a Kronecker representation of degree less than that of Z(f),
• fail.
The first outcome occurs with probability at least 21/32 and returns a Kronecker representation of degree D = δ^d and height Õ(hD). In any case, the algorithm has bit complexity Õ(hD³).

Repeating the algorithm k times, and taking the output with highest degree, allows one to obtain a Kronecker representation of Z(f) with probability 1 − (11/32)^k. The probabilistic nature of Kronecker comes from randomly sampling a linear form u with coefficients of reasonable size and choosing certain prime numbers at different points in the algorithm. In practice the probabilistic issues are minor and the probability bounds listed are quite pessimistic: see Giusti et al. [16] or Safey El Din and Schost [36] for further discussion. When the Jacobian of f is invertible at all of its solutions, as it will be under our assumptions (J1) and (J2), Proposition 7.3 gives a Kronecker representation of all solutions of f.
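To get a feel for the repetition bound, the following sketch computes how many independent runs are needed before the failure bound (11/32)^k drops below a chosen threshold. The function name `runs_needed` and the thresholds are illustrative assumptions, not part of the algorithm Kronecker itself.

```python
from fractions import Fraction


def runs_needed(target_failure, single_failure=Fraction(11, 32)):
    """Smallest k with single_failure**k <= target_failure (exact arithmetic)."""
    k = 1
    failure = single_failure
    while failure > target_failure:
        failure *= single_failure
        k += 1
    return k


print(runs_needed(Fraction(1, 100)))    # runs for a failure bound of at most 1%
print(runs_needed(Fraction(1, 1000)))   # runs for a failure bound of at most 0.1%
```

Since each repetition has the same bit complexity, the total cost grows only linearly in k while the failure bound decays geometrically.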
Our next proposition describes how to encode the values of additional polynomials at the solutions of a Kronecker representation. Since the constants appearing in the asymptotic expansion (5.27) of Theorem 5.2 in Chapter 5 are polynomials in the critical points of F(z), this will allow us to determine an expansion of the form (7.9).

Proposition 7.4 Let f ∈ Z[z]^d be a zero-dimensional polynomial system containing polynomials of degree at most δ and heights at most h, and suppose [P(u), Q] is a Kronecker representation of V(f) determined by Proposition 7.3. Let q(z) ∈ Z[z] be a polynomial of degree at most ρ and height at most η. Then
1. there exists a parametrization q(u) = Q_q(u)/P′(u) of the values taken by q on V(f), where Q_q is an integer polynomial of degree less than D = δ^d and height Õ((η + hρ)D). The polynomial Q_q can be computed in Õ((η + hρ)D³) bit operations;
2. there exists a polynomial Φ_q(T) ∈ Z[T] of degree at most D = δ^d and height Õ((η + hρ)D²) which vanishes on all values taken by q(z) for z ∈ V(f). The polynomial Φ_q can be computed in Õ((η + hρ)D⁴) bit operations;
3. when q has degree 1 (for instance, if q is one of the variables z_j) then there is a polynomial Φ_q ∈ Z[T] vanishing on the values of q(z) for z ∈ V(f) which has degree at most D = δ^d and height Õ((η + h)D), and Φ_q can be computed in Õ((η + h)D³) bit operations.

Proposition 7.4 (1) is proven by adding the polynomial T − q(z) into f, where T is a new variable, and recomputing the Kronecker representation. In order to get the stated height and complexity bound, a more refined method of Safey El Din and Schost [36] must be used which exploits the structure of the new system, since T appears in only one equation. This refinement requires an additional algebraic framework whose setup would take us away from our main focus, so we refer the reader to Melczer and Salvy [26, Prop. 45]. In practice one does not actually need to recompute the entire Kronecker system.

Recall the resultant from Problem 2.14 in Chapter 2. The minimal polynomial Φ_q in Proposition 7.4 (2) divides the resultant of the polynomials T P′(u) − Q_q(u) and P(u) with respect to u, so the stated bounds on its height and degree follow from Lemmas 7.8 and 7.11 in the appendix. The complexity of determining Φ_q follows from a fast algorithm of Kedlaya and Umans [21] for determining the minimal polynomial of Q_q/P′ modulo the equation P(u) = 0. The improved height bound on Φ_q when q is linear follows from Safey El Din and Schost [36, Lemma 23].

7.3.2.2 The Numeric Kronecker Representation

A Kronecker representation allows us to describe the solutions of a finite algebraic set (for instance, the critical points of a multivariate rational function), but to argue about specific solutions (for instance, the minimal critical points) we need extra information about the roots. Because distinct elements encoded by a Kronecker representation [P(u), Q] are represented by distinct roots of P(u), it is enough to separate the roots of P(u) in the complex plane. A numeric Kronecker representation is thus a triple [P(u), Q, U], where U contains isolating intervals for the real roots of P and isolating disks for the complex roots of P.

Definition 7.15 (sizes of isolating regions) We say the size of an interval is half its length, while the size of a disk is its radius.

Using the algorithms and results discussed in the appendix to this chapter, we build up a toolbox of algorithms for manipulating the solutions of a (numeric) Kronecker representation. Our first result, which follows directly from the algorithm PolyRoots described in Lemma 7.12 of the appendix, discusses how to determine the isolating regions U for the roots of P(u). One can compute the isolating regions to arbitrary accuracy, with a minimal amount of accuracy required to ensure the computed regions separate the (distinct) roots of P.

Proposition 7.5 Suppose the zero-dimensional system f = (f_1, . . . , f_d) ∈ Z[z]^d is given by a Kronecker representation [P(u), Q] of degree D and height η. Then there exists an algorithm NumKronecker which takes [P(u), Q] and κ > 0 and returns a numeric Kronecker representation [P(u), Q, U], with isolating regions in U of size at most 2^{−κ}, in Õ(D³ + D²η + Dκ) bit operations.

Once we have a Kronecker representation, the most basic operation we can perform is to approximate the coordinates of the algebraic points it encodes.

Lemma 7.3 Given a Kronecker representation [P(u), Q] of degree D and height η and κ ∈ N, approximations to the solutions of the Kronecker representation with isolating regions for each coordinate of size 2^{−κ} can be determined in Õ(d(D³ + D²η + Dκ)) bit operations.

Proof Fix a coordinate z_j and root v ∈ C of P(u) = 0. Our aim is to evaluate z_j = Q_j(v)/P′(v) to an accuracy of 2^{−κ}. Assume we have approximations q ≈ Q_j(v) and p ≈ P′(v) such that |Q_j(v) − q|, |P′(v) − p| ≤ 2^{−n} for some n ∈ N. Then
\[
\left| \frac{Q_j(v)}{P'(v)} - \frac{q}{p} \right|
= \left| \frac{Q_j(v)p - qp + qp - qP'(v)}{P'(v)\,p} \right|
\leq 2^{-n} \left( \frac{1}{|P'(v)|} + \left| \frac{q}{P'(v)\,p} \right| \right).
\]
Suppose now that
\[
n \geq 2\eta D + 2(D-1)\log_2 D - 2\eta + (D-1)\log_2 \sqrt{D+1} + 1 = \tilde{O}(\eta D).
\]
Proposition 7.7 (i) in the appendix implies |v| ≤ 2^η + 1, so
\[
|q| \leq |Q_j(v)| + 2^{-n} \leq 2^{\eta}\left(1 + |v| + \cdots + |v|^{D}\right) + 2^{-n} \leq 2^{\eta}(D+1)(2^{\eta}+1)^{D} + 2^{-n} = 2^{\tilde{O}(\eta D)}.
\]

Similarly, Proposition 7.7 (iv) in the appendix implies |P′(v)| ≥ 2^{−n+1} (in fact our bound on n was chosen so that this inequality would hold) and
\[
|p| \geq |P'(v)| - 2^{-n} \geq 2^{-n} = 2^{-\tilde{O}(\eta D)},
\]
which means
\[
\left| \frac{Q_j(v)}{P'(v)} - \frac{q}{p} \right| \leq 2^{-n + \tilde{O}(\eta D)}.
\]
To determine z_j to precision κ it is therefore sufficient to determine Q_j(v) and P′(v) to κ + Õ(ηD) bits. Lemmas 7.12 and 7.13 in the appendix imply that this can be done at all roots of P in Õ(D³ + D²η + Dκ) bit operations, and doing this for each of the d coordinates gives the stated complexity. □

To determine minimality we need to select the solutions of a Kronecker representation which are positive and real, and test which have real coordinates in various ranges.

Lemma 7.4 Given a numeric Kronecker representation [P(u), Q, U] of degree D and height η, it can be determined for every real solution to the underlying system whether each coordinate is positive, negative, or exactly zero, in Õ(d(D³ + D²η)) bit operations.

Proof Fix a coordinate z_j. The roots of P(u) that correspond to solutions with z_j = 0 are exactly those canceling the polynomial G_j(u) = gcd(P, Q_j). By Lemma 7.10 in the appendix, the gcd G_j has height Õ(D + η) and can be computed in Õ(D² + ηD) bit operations. Using Proposition 7.7 and Lemma 7.12 from the appendix, the roots of P(u) which also cancel G_j(u) can be determined in Õ(D³ + D²η) bit operations by computing isolating regions of the roots of size 2^{−Õ(D² + ηD)}. Proposition 7.7 in the appendix shows that knowing approximations of Q_j(u) and P′(u) to an accuracy of 2^{−Õ(η)} allows one to determine their signs when they are real, and Lemma 7.13 in the appendix shows that such approximations can be determined in Õ(ηD²) bit operations knowing only Õ(ηD) bits of the roots of P(u). □

Having obtained high-accuracy approximations of the solutions of a critical point system, we need to determine when specific coordinates of the solutions are exactly equal, and when they share the same modulus. Proposition 7.4 gives a degree and height bound on the minimal polynomial of the values of each coordinate z_j at the solutions of V(f). Using the root separation bounds in Proposition 7.7 of the appendix and the algorithm for approximating V(f) discussed in Lemma 7.3 results in the following.

Lemma 7.5 Given a numeric Kronecker representation [P(u), Q, U] corresponding to a zero-dimensional system of d polynomials of degrees at most δ and heights at most h, the coordinates of its solutions which are exactly equal can be found in Õ(hD³) bit operations, where D = δ^d.

The most complicated, and expensive, operation we must perform is grouping solutions of an algebraic set by coordinate-wise modulus. To see why grouping by modulus is so difficult, let A ∈ Z[u] be a polynomial of degree D and height η, and let α, β ∈ C be roots of A with |α| ≠ |β|. Proposition 7.7 in the appendix gives a useful lower bound |α − β| ≥ 2^{−Õ(ηD)} on the distance between α and β, but it is much harder to get a good bound on the distance between the moduli |α| and |β|. One approach to bounding the difference of |α| and |β| involves the resultant R(u) = Res_T(A(T), T^D A(u/T)), which vanishes at the products of roots of A(u). In particular, it vanishes at |α|² = αᾱ and |β|² = ββ̄ so the polynomial B(T) = R(√T)R(−√T) vanishes at |α| and |β|. Lemmas 7.8 and 7.11 of the appendix imply that B(T) has degree at most D and height in Õ(ηD²), so Proposition 7.7 in the appendix yields ||α| − |β|| ≥ 2^{−Õ(ηD³)}. This is significantly worse than the distance bound between α and β. In fact, a tight separation bound between the moduli of roots of a univariate polynomial doesn’t seem to be known10.

10 A tight bound in the special case when α and β are both real (i.e., when one is positive and the other negative) is given by Bugeaud et al. [8]. In this situation one can take f (δ, h) = O(hδ + δ 2 ). See also Bugeaud et al. [7] for some recent results and experiments with low degree polynomials.


Open Problem 7.1 Under what conditions does there exist a function f(D, η) = o(ηD³) such that ||α| − |β|| ≥ 2^{−f(D,η)} whenever |α| ≠ |β| and α and β are both complex roots of some polynomial of degree D and height η?

Directly using a moduli separation bound of 2^{−Õ(ηD³)} would increase the cost of our ACSV algorithms. Luckily, in the combinatorial case we only need to determine points with the same coordinate-wise moduli as critical points which themselves are real and positive. This allows for a more efficient algorithm11.

Lemma 7.6 Let A ∈ Z[T] be a square-free polynomial of degree D ≥ 2 and height η, and define G(T) = A(√T)A(−√T). If A(α) = 0 and A(±|α|) ≠ 0, then |G(|α|²)| ≥ 2^{−b} where b = Õ(ηD²) is given by
\[
b = (D^2 - 1)\log_2(D+1) + \left(2\eta + \log_2(D+1)\right)(D^2 - 1) + 2\eta D + 2D\log_2(D) + D\log_2\sqrt{D^2+1}.
\]

Proof Again we consider the resultant R(u) = Res_T(A(T), T^D A(u/T)) which vanishes at |α|² for any root α of A(u). Lemma 7.11 of the appendix implies R has degree at most D² and height at most 2ηD + log((2D)!) ≤ 2ηD + 2D log D. The polynomial G(T) has degree D and, by Lemma 7.8 of the appendix, height at most 2η + log(D + 1). The stated bound on G(|α|²) then follows directly from Proposition 7.7 (iii) in the appendix. □

Corollary 7.1 Under the same conditions as Lemma 7.6, isolating regions of size 2^{−Õ(ηD²)} for the real positive roots 0 < r_1 < · · · < r_k of A(T) and all roots of moduli exactly r_1, . . . , r_k can be computed in Õ(ηD³) bit operations.

Proof Suppose α is a root of A and define the polynomial G(T) and constant b as in Lemma 7.6. If a ∈ C is such that |α − a| < 2^{−b−η−2} then |ᾱ − ā| < 2^{−b−η−2} and
\[
\left| |\alpha|^2 - a\bar{a} \right| = \left| \alpha\bar{\alpha} - \alpha\bar{a} + \alpha\bar{a} - a\bar{a} \right| \leq |\alpha|\,2^{-b-\eta-2} + |a|\,2^{-b-\eta-2} \leq 2^{-b}
\]
since |α| ≤ 2^η + 1 by Proposition 7.7 in the appendix. Thus, approximating α to Õ(ηD²) bits allows one to approximate |α|² to Õ(ηD²) bits and, by Lemma 7.13 of the appendix, this is sufficient to compute G(|α|²) to accuracy 2^{−b}. Lemma 7.6 implies at least one of ±|α| is a root of A if and only if |G(|α|²)| < 2^{−b}, so approximating α to accuracy Õ(ηD²) bits allows one to decide whether or not at least one of ±|α| is a root of A; this precision is also sufficient to decide which real roots of A are positive. To determine which of ±|α| are roots of A, Proposition 7.7 in the appendix implies it is sufficient to evaluate A(|α|) and A(−|α|) to precision 2^{−Õ(ηD²)}. All these operations run in the stated complexity. □

Corollary 7.2 Given a numeric Kronecker representation [P(u), Q, U] of degree D = δ^d and height η = Õ(hD) corresponding to a zero-dimensional system of d polynomials in Z[z] of degrees at most δ and heights at most h, determining the real solutions of the system and listing the solutions with the same coordinate-wise moduli in the variables z can be done in Õ(hD⁴) bit operations.

11 Thanks to Bruno Salvy for this construction.
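The polynomial G(T) = A(√T)A(−√T) from Lemma 7.6 can be formed without square roots: writing A(x) = E(x²) + x·O(x²) with E and O the even and odd parts of A gives G(T) = E(T)² − T·O(T)². The sketch below uses this identity with exact integer arithmetic. The function names are hypothetical, and it only illustrates the exact test G(|α|²) = 0; it does not implement the numerical precision analysis of Corollary 7.1.

```python
def poly_mul(p, q):
    """Multiply integer polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out


def modulus_test_poly(A):
    """Coefficients of G(T) = A(sqrt(T)) * A(-sqrt(T)) = E(T)^2 - T * O(T)^2."""
    even, odd = A[0::2], A[1::2]
    e2 = poly_mul(even, even)
    to2 = [0] + poly_mul(odd, odd) if odd else [0]   # multiply O(T)^2 by T
    n = max(len(e2), len(to2))
    e2 = e2 + [0] * (n - len(e2))
    to2 = to2 + [0] * (n - len(to2))
    return [x - y for x, y in zip(e2, to2)]


def poly_eval(p, x):
    """Horner evaluation of a coefficient list at x."""
    acc = 0
    for c in reversed(p):
        acc = acc * x + c
    return acc


# A(x) = (x - 1)(x^2 + 1): the roots i and -i have modulus 1, and 1 is a root of A.
G1 = modulus_test_poly([-1, 1, -1, 1])
print(G1, poly_eval(G1, 1))      # G vanishes at |i|^2 = 1

# A(x) = x^2 + 1: the roots have modulus 1, but neither 1 nor -1 is a root of A.
G2 = modulus_test_poly([1, 0, 1])
print(G2, poly_eval(G2, 1))      # G(1) = A(1) * A(-1) is nonzero
```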

7.4 Algorithmic ACSV Correctness and Complexity

We conclude this chapter by proving Theorem 7.1. Correctness of Algorithms 1–3 under our assumptions follows from Theorem 5.4 in Chapter 5, together with the results in Section 5.3 of Chapter 5 characterizing minimality. Thus, it remains only to prove the complexity bounds stated in Theorem 7.1.

Complexity of Algorithm 1 and Numerical Approximation of A(u), B(u), C(u)

The Kronecker representation of the set C of critical points can be determined in Õ(hδ³D³) bit operations by Proposition 7.3, since C contains d + 1 variables. The entries in the (d − 1) × (d − 1) matrix H̃ have degrees in Õ(δ) and heights in Õ(h + log δ), so a cofactor expansion shows the determinant of H̃ has degree in Õ(dδ) and height in Õ(d(h + log δ)). Proposition 7.4 then implies that the polynomial Q_H̃ has degree less than δD, height in Õ(hδ²D), and can be determined in Õ(hδ⁴D³) bit operations. By assumption the polynomial G(z) has degree at most δ and height at most h, and the polynomial T(z) = z_1^{r_1} · · · z_d^{r_d} has degree O(d) and height 1. Thus, Proposition 7.4 implies that the polynomials Q_T and Q_{−G} fall into the degree and height bounds for Q_H̃, and can be determined in the same complexity. Knowing these bounds, Proposition 7.7 (iii) implies that the roots of P(u) where Q_H̃ and Q_{−G} vanish can be determined by evaluating these polynomials to Õ(hδ²D²) bits, which takes Õ(hδ³D³) bit operations using Lemmas 7.12 and 7.13 in the appendix.

With our degree and height bounds on the polynomials P(u), P′(u), Q_H̃(u), Q_T(u), Q_λ(u), and Q_{−G}(u), and the knowledge that Q_H̃, Q_T(u), and Q_λ(u) are non-zero at the roots of P(u), an argument analogous to the one presented in the proof of Lemma 7.3 shows that to determine
\[
A(u) = \frac{Q_{-G}(u)}{r_d\,Q_\lambda(u)}, \qquad
B(u) = \frac{(r_d\,Q_\lambda(u))^{d-1}\,(P'(u))^{2-d}}{Q_{\tilde{H}}(u)}, \qquad
C(u) = \frac{P'(u)}{Q_T(u)}
\]
at all roots of P(u) = 0 to κ bits of precision requires Õ(δDκ + hδ³D³) bit operations. Note at least κ = Õ(hδ²D²) bits of precision are needed to isolate the roots of P(u).

Complexity of Algorithm 2

By Proposition 7.5, the numeric Kronecker representations can be determined to the required accuracy in Õ(hδ²D²) bit operations. The most expensive operation is the determination of elements of U to Õ(hδ³D³) bits, in order to group critical points of the same modulus. Determining the roots of P(u) to the required accuracy takes Õ(hδ³D⁴) bit operations by Lemma 7.12 in the appendix, dominating the complexity.

Complexity of Algorithm 3

Determining the numeric Kronecker representation to the required Õ(hδ⁸D⁸) bits, needed to check which solutions have t in (0, 1), can be done in Õ(hδ¹²D¹²) bit operations by Proposition 7.5. This is the most expensive step of the algorithm, the others being simple calculations or computations similar to those discussed in the last two algorithms.

We have now proven Theorem 7.1, validating our algorithms for ACSV.

Appendix on Solving and Bounding Univariate Polynomials

This appendix contains some classical results about univariate polynomials which are used in our construction of the numeric Kronecker representation. Our presentation of height bounds and root separation is inspired by Mignotte [27], and also parallels the appendix of Melczer and Salvy [26]. Our bounds involve the following concepts.

Definition 7.16 (Mahler measure and Euclidean norm) Given a polynomial P(z) = c_D z^D + · · · + c_0 = c_D (z − r_1) · · · (z − r_D) ∈ C[z] with roots r_j ∈ C and leading coefficient c_D ≠ 0, the Mahler measure M(P) and Euclidean norm ‖P‖₂ of P are the quantities
\[
M(P) = |c_D| \prod_{j=1}^{D} \max\{1, |r_j|\}, \qquad
\|P\|_2 = \left( \sum_{j=0}^{D} |c_j|^2 \right)^{1/2}.
\]
Note that ‖P‖₂ ≤ 2^{h(P)} √(deg(P) + 1).

Since the set of roots of a product of polynomials is the union of the roots, the Mahler measure satisfies M(PQ) = M(P)M(Q) for any P, Q ∈ C[z]. The Mahler measure will be key to the results we derive in this appendix, but it can be difficult to calculate directly as it relies on the roots of P. We thus begin by bounding the Mahler measure by the Euclidean norm, which is easy to calculate.

Proposition 7.6 (Landau’s bound) For any P ∈ C[z] the Mahler measure satisfies M(P) ≤ ‖P‖₂, with equality if and only if P is a monomial.

Our proof of Proposition 7.6 uses the following technical lemma, which Problem 7.9 asks you to establish.
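As a quick numerical illustration of Definition 7.16 and Landau's bound, the sketch below evaluates both quantities for P(z) = (z − 2)(z − 3) = z² − 5z + 6, whose roots are known. The function names are hypothetical, and the height is taken to be log₂ of the largest absolute coefficient, which is an assumption about the definition of h(P) given earlier in the chapter.

```python
import math


def mahler_measure(leading, roots):
    """M(P) = |c_D| * prod_j max(1, |r_j|), computed from known roots."""
    m = abs(leading)
    for r in roots:
        m *= max(1.0, abs(r))
    return m


def euclidean_norm(coeffs):
    """Square root of the sum of squared absolute values of the coefficients."""
    return math.sqrt(sum(abs(c) ** 2 for c in coeffs))


def height(coeffs):
    """Assumed definition: log2 of the largest absolute value of a coefficient."""
    return math.log2(max(abs(c) for c in coeffs))


coeffs, roots = [6, -5, 1], [2, 3]          # z^2 - 5z + 6 = (z - 2)(z - 3)
M = mahler_measure(coeffs[-1], roots)
norm = euclidean_norm(coeffs)
bound = 2 ** height(coeffs) * math.sqrt(len(coeffs))   # 2^h(P) * sqrt(deg P + 1)
print(M, norm, bound)                        # M <= norm <= bound

# Equality M(P) = ||P||_2 for the monomial 3z^2, whose roots are both 0.
print(mahler_measure(3, [0, 0]), euclidean_norm([0, 0, 3]))
```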

Lemma 7.7 If P(z) ∈ C[z] then ‖(z − r)P(z)‖₂ = ‖(r̄z − 1)P(z)‖₂ for any r ∈ C, where r̄ denotes the complex conjugate of r.
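Lemma 7.7 is easy to test numerically before proving it. The following sketch, with an arbitrarily chosen polynomial P and complex number r (both illustrative assumptions), multiplies P by z − r and by r̄z − 1 and compares the Euclidean norms of the two products.

```python
import math


def poly_mul(a, b):
    """Multiply complex polynomials given as coefficient lists, lowest degree first."""
    out = [0j] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def euclidean_norm(p):
    """Euclidean norm of the coefficient vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in p))


P = [1, -2, 0.5 + 1j, 3]      # arbitrary test polynomial
r = 2 - 1j                    # arbitrary complex number

left = euclidean_norm(poly_mul([-r, 1], P))                # (z - r) P(z)
right = euclidean_norm(poly_mul([-1, r.conjugate()], P))   # (conj(r) z - 1) P(z)
print(left, right)            # the two norms agree up to rounding
```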

Proof (Proposition 7.6) Rearrange the roots of P so that r_1, . . . , r_ℓ are the roots with moduli larger than 1, meaning M(P) = |c_D r_1 · · · r_ℓ|. If
\[
Q(z) = c_D \prod_{j=1}^{\ell} (\bar{r}_j z - 1) \prod_{j=\ell+1}^{D} (z - r_j)
\]
then the leading coefficient of Q is c_D r̄_1 · · · r̄_ℓ and by Lemma 7.7
\[
\|P\|_2 = \|Q\|_2 \geq |c_D \bar{r}_1 \cdots \bar{r}_\ell| = |c_D r_1 \cdots r_\ell| = M(P).
\]
The value ‖Q‖₂ equals the modulus of the leading term of Q only if Q is a monomial. □

7.4.1 Polynomial Height Bounds

First, we use the Mahler measure to prove bounds on the heights of univariate polynomials.

Lemma 7.8 For polynomials P_1, . . . , P_k, P, Q ∈ Z[z],
\[
h(P_1 + \cdots + P_k) \leq \max_i h(P_i) + \log_2 k,
\]
\[
h(P_1 \cdots P_k) \leq \sum_{i=1}^{k} h(P_i) + \sum_{i=1}^{k-1} \log_2(\deg P_i + 1),
\]
\[
h(P) \leq \deg P + \log_2 \|PQ\|_2 .
\]

Proof The first two results follow directly from the definition of polynomial height, so we consider only the last equation. Expanding P(z) = c_D (z − r_1) · · · (z − r_D) gives Vieta’s formulas for the coefficients of P as symmetric functions in its roots,
\[
c_k = c_D (-1)^k \sum_{i_1 < \cdots < i_{D-k}} r_{i_1} \cdots r_{i_{D-k}}, \qquad (1 \leq k \leq D).
\]

[…]

[…] > 0 sufficiently small the set {x_κ : κ ∈ {±1}^s} consists of 2^s real points, one in each of the 2^s components of M_R adjacent to σ. Note that x_1 is contained in the component B(σ) on which σ minimizes h. If we assume that p_{k_1} = · · · = p_{k_s} = 1, so that each linear factor in the denominator of F appears only to first order, then
\[
I_\sigma = \frac{\operatorname{sgn}(\sigma)}{(2\pi i)^d} \sum_{\kappa \in \{\pm 1\}^s} \operatorname{sgn}(\kappa) \int_{C_{x_\kappa}} \frac{\tilde{G}(z)}{\ell_{k_1}(z) \cdots \ell_{k_s}(z)\, z^{nr}} \, dz,
\]

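and making the change of variables w = M(σ − z) yields
\[
I_\sigma = \frac{\operatorname{sgn}(\sigma)}{|\det M|\,(2\pi i)^d} \sum_{\kappa \in \{\pm 1\}^s} \operatorname{sgn}(\kappa) \int_{\varepsilon(\kappa,0) + i\mathbb{R}^d} \frac{\tilde{G}\left(\sigma - M^{-1}w\right)}{w_1 \cdots w_s \left(\sigma - M^{-1}w\right)^{nr}} \, dw.
\]
Replacing the alternating sum of integrals with an integral over a product of circles, and applying the residue theorem s times, gives
\begin{align}
I_\sigma &= \frac{\operatorname{sgn}(\sigma)}{|\det M|\,(2\pi i)^d} \int_{T_\varepsilon^s \times i\mathbb{R}^{d-s}} \frac{\tilde{G}\left(\sigma - M^{-1}w\right)}{w_1 \cdots w_s \left(\sigma - M^{-1}w\right)^{nr}} \, dw \notag \\
&= \frac{\operatorname{sgn}(\sigma)}{|\det M|\,(2\pi i)^{d-s}} \int_{i\mathbb{R}^{d-s}} \left. \frac{\tilde{G}\left(\sigma - M^{-1}w\right)}{\left(\sigma - M^{-1}w\right)^{nr}} \right|_{w_j = 0,\ j \leq s} dw_{s+1} \cdots dw_d \notag \\
&= \frac{\operatorname{sgn}(\sigma)\,\sigma^{-nr}}{|\det M|\,(2\pi)^{d-s}} \int_{\mathbb{R}^{d-s}} A_\sigma(y)\, e^{-n\phi_\sigma(y)} \, dy, \tag{8.26}
\end{align}
where y_j = (−i)w_{s+j} for 1 ≤ j ≤ d − s and
\[
A_\sigma(y) = \tilde{G}\left(\sigma - iM^{-1}\begin{pmatrix} 0 \\ y \end{pmatrix}\right), \qquad
\phi_\sigma(y) = \sum_{j=1}^{d} r_j \log\left( \frac{\sigma_j - i\left(M^{-1}\begin{pmatrix} 0 \\ y \end{pmatrix}\right)_j}{\sigma_j} \right). \tag{8.27}
\]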

Note that the real part of φ_σ is the height function h, and that by construction the gradient of φ_σ vanishes at the origin. When d = s, i.e., in the case of a complete intersection, (8.26) is taken to mean
\[
I_\sigma = \frac{\tilde{G}(\sigma)}{|\det M|}\,\sigma^{-nr} = \frac{G(\sigma)}{\sigma_1 \cdots \sigma_d \prod_{j \notin S} \ell_j(\sigma)^{p_j}\, |\det M|}\,\sigma^{-nr}.
\]
Combining this calculation with Theorem 8.1 gives the following.

Proposition 8.3 Consider the power series expansion of simple F(z) and suppose r is a generic direction with positive coordinates. If p = 1 and χ denotes the set of contributing singularities of F(z) then
\[
[z^{nr}]F(z) = \sum_{\sigma \in \chi} I_\sigma, \tag{8.28}
\]
where I_σ is defined in (8.26) using the quantities in (8.25) and (8.27).

For general p ∈ N^m, when the linear factors in the denominator appear to higher order,
\[
I_\sigma = \frac{1}{|\det M|\,(2\pi i)^d} \int_{T_\varepsilon^s \times i\mathbb{R}^{d-s}} \frac{\tilde{G}\left(\sigma - M^{-1}w\right)}{w_1^{p_{k_1}} \cdots w_s^{p_{k_s}} \left(\sigma - M^{-1}w\right)^{nr}} \, dw, \tag{8.29}
\]
and the iterated residue obtained by integrating over the product of circles in the variables w_1, . . . , w_s is
\[
R_\sigma(w_{s+1}, \ldots, w_d) = \frac{1}{(p_{k_1} - 1)! \cdots (p_{k_s} - 1)!} \times \left. \left( \frac{\partial^{\,p_{k_1} + \cdots + p_{k_s} - s}}{\partial w_1^{p_{k_1}-1} \cdots \partial w_s^{p_{k_s}-1}} \, \frac{\tilde{G}\left(\sigma - M^{-1}w\right)}{\left(\sigma - M^{-1}w\right)^{nr}} \right) \right|_{w_1 = \cdots = w_s = 0}. \tag{8.30}
\]

Repeated differentiation shows
\[
R_\sigma = P_\sigma(n) \times \left. \left(\sigma - M^{-1}w\right)^{-nr} \right|_{w_1 = \cdots = w_s = 0},
\]
where P_σ(n) is a polynomial
\[
P_\sigma(n) = \sum_{j=0}^{p_{k_1} + \cdots + p_{k_s} - s} a_{j,\sigma}(w_{s+1}, \ldots, w_d)\, n^j \tag{8.31}
\]
in n whose coefficients are rational functions in w_{s+1}, . . . , w_d which are analytic at the origin. The leading coefficient of P_σ(n) is
\[
a_{p_{k_1} + \cdots + p_{k_s} - s,\,\sigma} = \left. \tilde{G}\left(\sigma - M^{-1}w\right) \prod_{j=1}^{s} \frac{v_j(w)^{p_{k_j}-1}}{(p_{k_j} - 1)!} \right|_{w_1 = \cdots = w_s = 0},
\]
where
\[
v(w) = \begin{pmatrix} \dfrac{r_1}{\sigma_1 - (M^{-1}w)_1} & \cdots & \dfrac{r_d}{\sigma_d - (M^{-1}w)_d} \end{pmatrix} \cdot M^{-1}.
\]
Thus, we obtain the following.

Proposition 8.4 Consider the power series expansion of simple F(z), and suppose r is a generic direction with positive coordinates. If χ denotes the set of contributing singularities for F(z) then
\[
[z^{nr}]F(z) = \sum_{\sigma \in \chi} I_\sigma, \tag{8.32}
\]
where, for σ on the flat defined by the vanishing of ℓ_{k_1}, . . . , ℓ_{k_s} appearing in the denominator of F(z) to multiplicities p_{k_1}, . . . , p_{k_s},
\[
I_\sigma = \sum_{j=0}^{p_{k_1} + \cdots + p_{k_s} - s} n^j \int_{\mathbb{R}^{d-s}} A_{j,\sigma}(y)\, e^{-n\phi_\sigma(y)} \, dy
\]
for effective rational functions A_{j,σ} = a_{j,σ}(−iy_1, . . . , −iy_{d−s}) determined by (8.30) and (8.31) and φ_σ determined by (8.25) and (8.27).

Remark 8.10 When d = s the residue R_σ depends only on n. Thus, if σ lies on a complete intersection then I_σ = σ^{−nr} P_σ(n), where P_σ(n) is a polynomial in n of degree p_{k_1} + · · · + p_{k_d} − d.

8.2.5 Step 5: Determine Asymptotics

Recall Proposition 5.3 from Chapter 5, which determines an asymptotic expansion for integrals of the form appearing in Propositions 8.3 and 8.4. The following result provides the only missing details we need for asymptotics, and can be viewed as an analogue of Lemma 5.6 in Chapter 5 which holds beyond the smooth case.

Lemma 8.4 Fix an orthant O ⊂ R^d. For any real flat L = V_{k_1,...,k_s} bounded in O, the critical point σ(r) of L in O for the direction r varies smoothly with r. As s varies over a neighbourhood of r in R_*^d the values of σ(s) cover a neighbourhood of σ(r) in L. Furthermore, if A is the d × (d − s) submatrix of M^{−1} consisting of its d − s rightmost columns, and D is the diagonal matrix with entries √r_1/σ_1, . . . , √r_d/σ_d, then the Hessian of φ_σ defined by (8.27) at the origin equals
\[
H_\sigma = (DA)^T (DA), \tag{8.33}
\]
which has strictly positive determinant.

Proof Recall the explicit parametrization (8.9) for the critical points on the flat L, derived in the proof of Lemma 8.1. Differentiating the right-hand side of (8.9) with respect to z_b and substituting z = σ gives
\[
\sum_{j=1}^{d} \frac{r_j}{\sigma_j^2}\, M^{-1}_{j,a} M^{-1}_{j,b},
\]
so the Jacobian of the system (8.9) with respect to z at σ equals H_σ = (DA)^T (DA). Because of the form of H_σ, the Cauchy-Binet formula (see Problem 8.9) implies its determinant is a sum of the squares of the determinants of the (d − s) × (d − s) submatrices of DA. Since the final d − s rows of DA form a diagonal matrix Λ with non-zero entries √r_{s+1}/σ_{s+1}, . . . , √r_d/σ_d, this in turn implies
\[
\det H_\sigma \geq (\det \Lambda)^2 = \frac{r_{s+1} \cdots r_d}{\sigma_{s+1}^2 \cdots \sigma_d^2} > 0,
\]
and H_σ is non-singular. Thus, the critical point σ(r) varies smoothly with r and when s varies over a neighbourhood of some fixed r then σ(s) covers a neighbourhood of σ(r) in L. Since the system (8.9) is formed from the gradient of h(z) in (8.8), the Jacobian of (8.9) equals the Hessian of h(z). The function φ(y) in (8.27) equals −h(z) after a substitution of the form z_j = C_j − iy_j for constants C_j. This implies the Hessian of φ(y) is −(−i)(−i)H_σ = H_σ. □

Theorem 8.2 Consider the power series expansion of simple F(z), suppose r is a generic direction with positive coordinates, and let χ denote the set of contributing singularities for F(z). Then there exist asymptotic series Φ_σ(n) such that
\[
[z^{nr}]F(z) = \sum_{\sigma \in \chi} \Phi_\sigma(n). \tag{8.34}
\]

In particular, if σ lies on the flat V_S with S = {k_1, . . . , k_s} then, for any positive integer K, there exist effective constants C_j^σ such that
\[
\Phi_\sigma(n) = \sigma^{-nr}\, n^{p_{k_1} + \cdots + p_{k_s} - (s+d)/2} \left( \sum_{j=0}^{K} C_j^{\sigma}\, n^{-j} + O\left(n^{-K-1}\right) \right). \tag{8.35}
\]
If G(σ) ≠ 0 then the leading asymptotic term of Φ_σ is
\[
C_0^{\sigma} = \prod_{j=1}^{s} \frac{v_j^{\,p_{k_j}-1}}{(p_{k_j} - 1)!} \times \frac{G(\sigma)\, \det(H_\sigma)^{-1/2}}{\sigma_1 \cdots \sigma_d \prod_{j \notin S} \ell_j(\sigma)^{p_j}\, |\det M|\, (2\pi)^{(d-s)/2}} \neq 0,
\]
where M is defined in (8.25), H_σ is defined in (8.33), and v = (r_1/σ_1 · · · r_d/σ_d) · M^{−1}. The implied constant in the error term of (8.35) can be uniformly bounded as r varies in any connected compact set which does not contain a non-generic direction.

Proof As stated above, this is an application of Proposition 5.3 from Chapter 5 to Proposition 8.4. In particular, if σ ∈ χ and φ_σ(y) is defined by (8.27) then
• φ_σ and its gradient vanish at the origin by direct inspection;
• Lemma 8.4 shows that the Hessian matrix H_σ is non-singular;
• the real part of φ_σ is
\[
\Re(\phi_\sigma) = \sum_{j=1}^{d} r_j \log \left| \frac{\sigma_j - i\left(M^{-1}\begin{pmatrix} 0 \\ y \end{pmatrix}\right)_j}{\sigma_j} \right|,
\]
which is non-negative on R^{d−s}, and equal to 0 precisely when y = 0;
• because φ_σ(y) has strictly positive real part when y is bounded away from the origin, which is an isolated critical point, we can restrict each integral in Proposition 8.4 to a neighbourhood where the origin is the only critical point of φ_σ while introducing only an exponentially negligible error.
This verifies all the necessary conditions to apply Proposition 5.3. □

Figure 8.8 presents our current asymptotic picture for our running Example 8.3.

Remark 8.11 When σ is a contributing singularity there exist a_1, . . . , a_s > 0 with
\[
-(\nabla h)(\sigma) = a_1 b^{(k_1)} + \cdots + a_s b^{(k_s)},
\]
so that (a 0) M = −(∇h)(σ). Thus, for 1 ≤ j ≤ s the constants v_j in Theorem 8.2 are v_j = a_j > 0.

When there is a single contributing singularity σ of maximum height, and G(σ) ≠ 0, then Theorem 8.2 gives dominant asymptotics of the coefficient sequence,
\[
[z^{nr}]F(z) \sim \sigma^{-nr}\, n^{p_{k_1} + \cdots + p_{k_s} - (s+d)/2} \times \prod_{j=1}^{s} \frac{v_j^{\,p_{k_j}-1}}{(p_{k_j} - 1)!} \times \frac{G(\sigma)\, \det(H_\sigma)^{-1/2}}{\sigma_1 \cdots \sigma_d \prod_{j \notin S} \ell_j(\sigma)^{p_j}\, |\det M|\, (2\pi)^{(d-s)/2}}.
\]

In the simple pole case, when each p_j = 1, this simplifies further to
\[
[z^{nr}]F(z) \sim \sigma^{-nr}\, n^{(s-d)/2}\, \frac{G(\sigma)\, \det(H_\sigma)^{-1/2}}{\sigma_1 \cdots \sigma_d \prod_{j \notin S} \ell_j(\sigma)\, |\det M|\, (2\pi)^{(d-s)/2}}.
\]

[Figure 8.8: plot with axes a and b showing the asymptotic regimes; non-generic directions are marked with question marks.]

Fig. 8.8 The three asymptotic regimes for the coefficients [x^{an} y^{bn}]F(x, y) of the rational function F(x, y) = 1 / ((1 − (2x + y)/3)(1 − (3x + y)/4)). Asymptotics in non-generic directions have not yet been derived and are thus denoted by question marks.

Unfortunately, when the numerator of F vanishes at the contributing singularities determining dominant asymptotics can be more difficult.

Example 8.4 (Asymptotics With Vanishing Numerator) Returning to an example from Chapter 5, let
\[
B(x, y) = \frac{x - 2y^2}{1 - x - y} \qquad \text{and} \qquad C(x, y) = \frac{x - y}{1 - x - y}.
\]
In the main diagonal direction r = (1, 1) the functions B and C both admit the single contributing singularity σ = (1/2, 1/2), but while [x^n y^n]B(x, y) has the expected exponential growth of 4^n, even though its numerator vanishes at σ, the sequence [x^n y^n]C(x, y) is identically zero.

Given a contributing singularity σ, the saddle-point integral I_σ in (8.32) can be written as a series in n^{−1}, and we can determine any number of terms in the series using Proposition 5.3 of Chapter 5.

Remark 8.12 When σ lies on a complete intersection then I_σ = σ^{−nr} P_σ(n), where P_σ(n) is a polynomial in n of degree p_{k_1} + · · · + p_{k_d} − d. Thus, to determine whether I_σ is identically zero one simply needs to compute the first p_{k_1} + · · · + p_{k_d} − d terms of its asymptotic expansion using Proposition 5.3.
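The behaviour described in Example 8.4 can be confirmed by computing the diagonal coefficients exactly. The sketch below relies on the standard expansion [x^i y^j] 1/(1 − x − y) = binomial(i + j, i); the function names are illustrative.

```python
from math import comb


def geom_coeff(i, j):
    """[x^i y^j] 1/(1 - x - y) = binomial(i + j, i), and 0 for negative indices."""
    if i < 0 or j < 0:
        return 0
    return comb(i + j, i)


def diag_B(n):
    """[x^n y^n] (x - 2y^2)/(1 - x - y)."""
    return geom_coeff(n - 1, n) - 2 * geom_coeff(n, n - 2)


def diag_C(n):
    """[x^n y^n] (x - y)/(1 - x - y)."""
    return geom_coeff(n - 1, n) - geom_coeff(n, n - 1)


print([diag_B(n) for n in range(1, 8)])    # nonzero coefficients
print([diag_C(n) for n in range(1, 8)])    # all zero
print(diag_B(201) / diag_B(200))           # successive ratio, close to 4
```

The diagonal of C vanishes by the symmetry of binomial coefficients, while the diagonal of B is nonzero and its successive ratios approach 4, consistent with exponential growth 4^n.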


Algorithm 4: GenericHyperplaneAsymptotics

Input: Rational function F(z) = G(z)/∏_{j=1}^{m} ℓ_j(z)^{p_j} with linear ℓ_j(z) = 1 − b^{(j)} · z, and direction r = (r_1, . . . , r_d)
Output: Asymptotic expansion of [z^{nr}]F(z) as n → ∞, uniform in compact set around r

if matrix with rows b^{(j)} not full rank then
    run Algorithm 5 (SimpleDecomp) on F and apply GenericHyperplaneAsymptotics to each summand in the result
end
compute the flats defined by the ℓ_j(z)
find the critical points on each flat by solving (8.2)
sort the critical points using height function h_r and set σ equal to point of max height
while no contributing points have been found do
    compute the cone N(σ) from Definition 8.8
    if (−∇h)(σ) on boundary of N(σ) then
        return fail, “r not a generic direction”
    end
    if (−∇h)(σ) ∉ N(σ) then
        set σ to next lowest height critical point and repeat loop
    end
    if G(σ) ≠ 0 then
        σ is a contributing singularity
        repeat above steps to identify contributing singularities of the same height as σ
    else if G in polynomial ideal generated by the factors ℓ_j with ℓ_j(σ) = 0 then
        replace F by sum of rational functions using (8.36) and analyze each summand
    else
        return fail
    end
end
return sum of contributions (8.35) from each contributing singularity of highest height

There is not currently an effective characterization for vanishing of the integral $I_\sigma$. On the other hand, Lemma 8.4 implies that a critical point $\sigma(s)$ covers a neighbourhood of its flat $L$ when $s$ moves over a neighbourhood in $\mathbb{R}^d$. Thus, if $G(\sigma(s))$ vanishes for all $s$ in some open set of $\mathbb{R}^d$ then $G$ vanishes identically on $L$. In particular, this implies that if $L$ is defined by the vanishing of $\ell_{k_1}, \dots, \ell_{k_s}$ then there exist polynomials $Q_1, \dots, Q_s$ such that $G(z) = Q_1(z)\ell_{k_1}(z) + \cdots + Q_s(z)\ell_{k_s}(z)$. In other words,

$$\frac{G(z)}{\prod_{j=1}^m \ell_j(z)^{p_j}} = \frac{Q_1(z)}{\prod_{j=1}^m \ell_j(z)^{p_j} / \ell_{k_1}(z)} + \cdots + \frac{Q_s(z)}{\prod_{j=1}^m \ell_j(z)^{p_j} / \ell_{k_s}(z)} \tag{8.36}$$

and this can be detected automatically by a Gröbner basis computation (of the kind discussed in Chapter 7). The analysis at any such critical point $\sigma(s)$ can then be applied to the summands, each of which has a denominator which vanishes to lower order at $\sigma(s)$. If each $p_{k_j} = 1$ then the hyperplane arrangement defined by any


Fig. 8.9 Plot of singular hyperplanes and contributing singularities for our three-dimensional example (note only 5 of the 7 contributing singularities are visible from this angle).

summand does not contain the flat $L$ and $I_{\sigma(s)}$ is identically zero. Thus, for most directions either $G(\sigma) \neq 0$ at all contributing singularities $\sigma$ or $I_\sigma$ is identically zero for some contributing singularity in a manner that we are able to detect. Algorithm 4 lists the computations needed to determine asymptotics. Implementations of Algorithm 4 are linked on the book website.

Example 8.5 (A Three-Dimensional Example) Consider the three-dimensional example

$$F(x, y, z) = \frac{x + y + z}{(1 - 2x - 3y - 4z)(1 + 4x + 3y + 2z)(1 + x - y + z)},$$

and label the denominator factors $\ell_1$, $\ell_2$, and $\ell_3$ from left to right. Using an implementation of Algorithm 4, we find that there are seven contributing singularities in the direction $r = (1, 1, 2)$: one each on the hypersurfaces $V_1$, $V_2$, and $V_3$, two points (necessarily in different orthants) on the flat $V_{1,2}$, and one point on each of the flats $V_{1,3}$ and $V_{2,3}$. The unique contributing point of highest height is $\sigma = (1/8, 1/12, 1/8)$ on $V_1$, giving dominant asymptotics

$$[x^n y^n z^{2n}]F(x, y, z) = \frac{6144^n}{n}\left(\frac{\sqrt{2}}{14\pi} + O(1/n)\right).$$
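The constants in Example 8.5 can be sanity-checked by hand. The sketch below is only an illustration, not the book's implementation of Algorithm 4: it assumes the standard smooth-point fact that the leading term is the value at $\sigma$ of the numerator divided by the non-vanishing factors, times the leading term of $[x^n y^n z^{2n}]\,1/\ell_1$, which is a weighted multinomial coefficient. All function names are hypothetical.

```python
import math
from fractions import Fraction

# Direction and the critical point on V_1 quoted in Example 8.5.
r = (1, 1, 2)
sigma = (Fraction(1, 8), Fraction(1, 12), Fraction(1, 8))
x, y, z = sigma

# sigma should lie on the hyperplane 1 - 2x - 3y - 4z = 0.
ell1_at_sigma = 1 - 2 * x - 3 * y - 4 * z

# Exponential growth rate sigma^(-r) = prod_j sigma_j^(-r_j).
growth = Fraction(1)
for s, rj in zip(sigma, r):
    growth /= s ** rj

# Numerator divided by the two factors that do not vanish at sigma.
g_tilde = (x + y + z) / ((1 + 4 * x + 3 * y + 2 * z) * (1 + x - y + z))


def log_coeff_inverse_ell1(n):
    """log of [x^n y^n z^(2n)] 1/(1 - 2x - 3y - 4z), a weighted multinomial."""
    log_multinomial = (math.lgamma(4 * n + 1) - 2 * math.lgamma(n + 1)
                       - math.lgamma(2 * n + 1))
    return log_multinomial + n * math.log(2) + n * math.log(3) + 2 * n * math.log(4)


def ratio_to_leading_term(n):
    """Exact coefficient of 1/ell_1 divided by 6144^n * sqrt(2) / (2 pi n)."""
    log_leading = n * math.log(6144) + math.log(math.sqrt(2) / (2 * math.pi * n))
    return math.exp(log_coeff_inverse_ell1(n) - log_leading)
```

Here `growth` evaluates to 6144 and `g_tilde` to 1/7, so multiplying the multinomial constant $\sqrt{2}/(2\pi)$ by 1/7 recovers the constant $\sqrt{2}/(14\pi)$ in the display; `ratio_to_leading_term(n)` approaches 1 as $n$ grows.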

8.2.6 Dealing with Non-Simple Arrangements

Our main results above require the factors $\ell_j$ in the denominator to be linearly independent. Luckily, when dependencies between the factors do exist such relations can be used to decompose the rational function of interest into a sum of functions where this is no longer a problem. The key observation is that from a linear dependence

$$a_1 \ell_{k_1}(z) + \cdots + a_s \ell_{k_s}(z) = 0 \tag{8.37}$$

with non-zero constants one can divide by any $a_j \ell_{k_j}(z)$ and then divide by any product of linear factors in the denominator to obtain

$$\frac{1}{\ell^{q}} = \frac{1}{\ell_1^{q_1} \cdots \ell_m^{q_m}} = \sum_{\substack{1 \le i \le s \\ i \ne j}} \frac{(-a_i/a_j)}{\ell^{q_{i,j}}} \tag{8.38}$$

for any $q \in \mathbb{N}^m$, where $q_{i,j} = q + e^{(k_j)} - e^{(k_i)}$ is one larger than $q$ in its $k_j$th coordinate and one smaller than $q$ in its $k_i$th coordinate.

Fig. 8.10 Plots of the linearly dependent lines $\ell_1, \ell_2, \ell_3$ together with the critical points for $r = (1, 2)$ on the left, $r = (3, 2)$ in the centre, and $r = (8, 2)$ on the right.

Example 8.6 (A Non-Simple Rational Function) Consider the rational function $F(x, y) = 1/(\ell_1 \ell_2 \ell_3)$, where

$$\ell_1(x, y) = 1 - \frac{2x + y}{3} \quad\text{and}\quad \ell_2(x, y) = 1 - \frac{3x + y}{4}$$

are the same as our running example, and

$$\ell_3(x, y) = 1 - \frac{x + y}{2}.$$

As the three lines defined by $\ell_1$, $\ell_2$, and $\ell_3$ meet at the point $(x, y) = (1, 1)$, these functions are linearly dependent. Figure 8.10 shows the arrangement of critical points for three different directions $r = (p, q)$. When $0 < p/q < 1$ or $3 < p/q$ then $F(x, y)$ admits a smooth minimal critical point and we do not need to worry about the linear dependency to determine dominant asymptotics. However, if $1 < p/q < 3$ then the only minimal critical point is the point $(1, 1)$ where the three lines defined by the $\ell_j$ intersect. Comparing coefficients in the equation $a\ell_1 + b\ell_2 - \ell_3 = 0$, where $a$ and $b$ are unknown constants, gives a linear system whose solution $(a, b) = (3, -2)$ implies $\ell_3 = 3\ell_1 - 2\ell_2$, so

$$1 = \frac{3\ell_1}{\ell_3} - \frac{2\ell_2}{\ell_3},$$

and thus

$$F(x, y) = \frac{\dfrac{3\ell_1}{\ell_3} - \dfrac{2\ell_2}{\ell_3}}{\ell_1 \ell_2 \ell_3} = \frac{3}{\ell_2 \ell_3^2} - \frac{2}{\ell_1 \ell_3^2}.$$
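The decomposition of $F$ in Example 8.6 is an identity of rational functions, so it must also hold coefficient by coefficient in the power series expansions at the origin. A minimal sketch of such a check, using exact rational arithmetic (the helper names are illustrative, and the truncation size is an arbitrary choice):

```python
from fractions import Fraction


def divide_by_linear(coeffs, alpha, beta):
    """Coefficients of S(x, y) / (1 - alpha*x - beta*y) on the same grid as S.

    If T = S / (1 - alpha*x - beta*y) then
    T[i][j] = S[i][j] + alpha*T[i-1][j] + beta*T[i][j-1].
    """
    rows, cols = len(coeffs), len(coeffs[0])
    out = [[Fraction(0)] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            value = coeffs[i][j]
            if i > 0:
                value += alpha * out[i - 1][j]
            if j > 0:
                value += beta * out[i][j - 1]
            out[i][j] = value
    return out


def inverse_product(factors, size):
    """Coefficients [x^i y^j], 0 <= i, j <= size, of 1 / prod (1 - alpha*x - beta*y)."""
    series = [[Fraction(0)] * (size + 1) for _ in range(size + 1)]
    series[0][0] = Fraction(1)
    for alpha, beta in factors:
        series = divide_by_linear(series, alpha, beta)
    return series


# Each factor 1 - alpha*x - beta*y is stored as the pair (alpha, beta).
ELL1 = (Fraction(2, 3), Fraction(1, 3))
ELL2 = (Fraction(3, 4), Fraction(1, 4))
ELL3 = (Fraction(1, 2), Fraction(1, 2))


def decomposition_holds(size=10):
    """Check F = 3/(ell2*ell3^2) - 2/(ell1*ell3^2) on all coefficients up to `size`."""
    lhs = inverse_product([ELL1, ELL2, ELL3], size)
    first = inverse_product([ELL2, ELL3, ELL3], size)
    second = inverse_product([ELL1, ELL3, ELL3], size)
    return all(lhs[i][j] == 3 * first[i][j] - 2 * second[i][j]
               for i in range(size + 1) for j in range(size + 1))
```

The same `inverse_product` helper can be used to tabulate exact coefficients of the two summands along a direction $(pn, qn)$ for comparison with asymptotic predictions.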

Finding coefficient asymptotics of $F(x, y)$ in the direction $r = (p, q)$ is therefore reduced to finding asymptotics of the two summands in this expression, both of which are simple rational functions whose asymptotics are determined by Theorem 8.2 and Algorithm 4. If $1 < p/q < 2$ then

$$[x^{pn} y^{qn}]\frac{1}{\ell_2 \ell_3^2} \sim 4(3q - p)n \quad\text{and}\quad [x^{pn} y^{qn}]\frac{1}{\ell_1 \ell_3^2} \sim 12(2q - p)n,$$

so $[x^{pn} y^{qn}]F(x, y) \sim 3 \cdot 4(3q - p)n - 2 \cdot 12(2q - p)n = 12(p - q)n$. The coefficients of $1/(\ell_2 \ell_3^2)$ have the same behaviour when $2 < p/q < 3$, but the coefficients of $1/(\ell_1 \ell_3^2)$ decay exponentially in this case, meaning $[x^{pn} y^{qn}]F(x, y) \sim 12(3q - p)n$ when $2 < p/q < 3$. Any direction where $p/q = 2$ is a non-generic direction for $1/(\ell_1 \ell_3^2)$, and will require the methods of Section 8.3 below.

To show that any rational function of interest can be decomposed into a sum of simple functions, we need to introduce some notation. The following terminology is borrowed from the theory of matroids.

Definition 8.12 (supports, broken circuits, and χ-independence) The support of a rational function

$$F(z) = \frac{G(z)}{\ell_1(z)^{q_1} \cdots \ell_m(z)^{q_m}}$$

written with coprime numerator $G$ and linear denominator factors $\ell_j$ is $\operatorname{sup}(F) = \{\ell_j(z) : q_j > 0\}$. A minimal linearly dependent set of factors $\{\ell_{k_1}, \dots, \ell_{k_s}\}$ is called a circuit; note that any proper subset of a circuit is linearly independent. A broken circuit is the independent set obtained by removing the element of largest index from a circuit. A collection of the linear factors is χ-independent if it does not contain a broken circuit. Any dependent set contains a broken circuit, so a χ-independent set is also independent.

In our last example the only dependent set is $\{\ell_1, \ell_2, \ell_3\}$, so the unique broken circuit is $\{\ell_1, \ell_2\}$. The χ-independent sets are then $\{\ell_1, \ell_3\}$ and $\{\ell_2, \ell_3\}$ together with the sets containing a single linear factor (but the independent set $\{\ell_1, \ell_2\}$ is not χ-independent). In our example we decomposed $F(x, y)$ into a sum of simple rational functions whose supports were precisely the χ-independent sets of size two. The following result shows that such a decomposition is always possible.


Algorithm 5: SimpleDecomp

Input: Rational function $F(z) = G(z)/\prod_{j=1}^m \ell_j(z)^{q_j}$
Output: Collection $F_1(z), \dots, F_r(z)$ of rational functions with $\operatorname{sup}(F_j) \subset \operatorname{sup}(F)$ such that $F = F_1 + \cdots + F_r$ and each $F_j$ has independent support

    Set $S \leftarrow F(z)$
    while there exists summand $\tilde{F}$ of $S$ with broken circuit $\{\ell_{j_1}, \dots, \ell_{j_{s-1}}\}$ in its support do
        apply (8.38) to $\tilde{F}$ with $j_s > j_{s-1}$ such that $\{\ell_{j_1}, \dots, \ell_{j_s}\}$ is dependent
        replace $\tilde{F}$ by the result of the previous step in the sum $S$
    end

Proposition 8.5 Let $\ell_1(z), \dots, \ell_m(z)$ be any linear functions. Then the set of rational functions

$$\mathcal{L} = \left\{ \frac{1}{\ell_{j_1}(z) \cdots \ell_{j_s}(z)} : \{\ell_{j_1}, \dots, \ell_{j_s}\} \text{ is χ-independent} \right\}$$

is linearly independent over the complex numbers. Furthermore, the span of the rational functions

$$\left\{ \frac{1}{\ell_{j_1}(z)^{p_1} \cdots \ell_{j_s}(z)^{p_s}} : \{\ell_{j_1}, \dots, \ell_{j_s}\} \text{ is χ-independent and } \sum_{i=1}^{s} p_i = M \right\}$$

over the complex numbers contains the inverses of all products of the $\ell_j$ with $M$ (not necessarily distinct) factors.

Proof Linear independence of the set $\mathcal{L}$ follows from well-known results in the study of hyperplane arrangements [5, Thms. 3.43, 3.126, 5.89]. To prove the second conclusion, suppose $1/\ell^{q}$ is a rational function whose support contains a broken circuit $\{\ell_{j_1}, \dots, \ell_{j_{s-1}}\}$, with the factors ordered by increasing index. Then by the definition of a broken circuit there exists $j_s > j_{s-1}$ such that $\{\ell_{j_1}, \dots, \ell_{j_{s-1}}, \ell_{j_s}\}$ is linearly dependent, and there exist constants $a_j$ such that (8.37) holds. Equation (8.38) with $j = s$ expresses $1/\ell^{q}$ as a linear combination of rational functions of the form $1/\ell^{q'}$, where each $q'$ is lexicographically smaller than $q$. This process can be repeated on each summand that still contains a broken circuit, continuing until no summand has a support containing a broken circuit. The iteration must terminate as each step yields rational functions whose denominator exponents are lexicographically smaller than the previous step.

Algorithm 5 implements our proof.
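The notions of circuit, broken circuit, and χ-independence from Definition 8.12 are finite combinatorial conditions, so for small arrangements they can be enumerated by brute force. The following sketch (with hypothetical helper names, and exponential running time in the number of factors) stores each linear function $\ell_j = c_0 + c_1 x + c_2 y$ as its coefficient vector and recovers the sets listed for Example 8.6.

```python
from fractions import Fraction
from itertools import combinations


def rank(vectors):
    """Rank of a list of rational vectors via Gaussian elimination."""
    rows = [list(v) for v in vectors]
    found = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(found, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for i in range(found + 1, len(rows)):
            factor = rows[i][col] / rows[found][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[found])]
        found += 1
    return found


def circuits(factors):
    """Minimal linearly dependent subsets, as sorted tuples of labels."""
    labels = sorted(factors)
    result = []
    for size in range(1, len(labels) + 1):
        for subset in combinations(labels, size):
            dependent = rank([factors[j] for j in subset]) < len(subset)
            if dependent and not any(set(c) <= set(subset) for c in result):
                result.append(subset)
    return result


def chi_independent_sets(factors):
    """Non-empty subsets of labels containing no broken circuit."""
    # A broken circuit is a circuit with its largest label removed.
    broken = [set(c[:-1]) for c in circuits(factors)]
    labels = sorted(factors)
    return [subset
            for size in range(1, len(labels) + 1)
            for subset in combinations(labels, size)
            if not any(b <= set(subset) for b in broken)]


# Coefficient vectors (constant, x, y) of ell_1, ell_2, ell_3 from Example 8.6.
FACTORS = {
    1: (Fraction(1), Fraction(-2, 3), Fraction(-1, 3)),
    2: (Fraction(1), Fraction(-3, 4), Fraction(-1, 4)),
    3: (Fraction(1), Fraction(-1, 2), Fraction(-1, 2)),
}
```

For this arrangement the only circuit is the full set, the only broken circuit is $\{\ell_1, \ell_2\}$, and the χ-independent sets of size two are $\{\ell_1, \ell_3\}$ and $\{\ell_2, \ell_3\}$, matching the supports of the two summands found in the example.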


8.3 Asymptotics in Non-Generic Directions

In this section we discuss asymptotics in non-generic directions, under the assumption that $F$ is simple (if not, $F$ can be decomposed using Algorithm 5). To begin we express the series coefficients of interest in terms of certain 'negative-moment Gaussian integrals'. Unfortunately, asymptotics of such integrals do not follow from Proposition 5.3 in Chapter 5. We give asymptotics when the integrals to be approximated are one-dimensional.

Example 8.3 Continued (Residue Integrals in Non-Generic Directions) Consider again the function

$$F(x, y) = \frac{1}{\left(1 - \frac{2x + y}{3}\right)\left(1 - \frac{3x + y}{4}\right)}.$$

The coefficient sequence

$$[x^{r_1 n} y^{r_2 n}]F(x, y) = \frac{1}{(2\pi i)^2} \int_T \frac{1}{\left(1 - \frac{2x + y}{3}\right)\left(1 - \frac{3x + y}{4}\right)} \frac{dx\,dy}{x^{r_1 n + 1} y^{r_2 n + 1}} \tag{8.39}$$

is non-generic in any direction $r$ with $\hat{r}_1 \in \{2/3, 3/4\}$, where its asymptotic behaviour can differ from what happens when $0 < \hat{r}_1 < 2/3$ (exponential decay), $2/3 < \hat{r}_1 < 3/4$ (the limit is the constant 12), or $3/4 < \hat{r}_1 < 1$ (exponential decay, determined by a different critical point). To be precise, we fix the non-generic direction $r = (2, 1)$. Since Proposition 8.1 above does not require $r$ to be generic, if we again define

$$M = \begin{pmatrix} b^{(1)} \\ b^{(2)} \end{pmatrix} = \begin{pmatrix} 2/3 & 1/3 \\ 3/4 & 1/4 \end{pmatrix}$$

and make the change of variables $\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} - M^{-1} \begin{pmatrix} p \\ q \end{pmatrix}$, the manipulations of Section 8.2.2 still imply that

$$[x^{2n} y^{n}]F(x, y) = \frac{12}{(2\pi i)^2} \int_{(\varepsilon, \varepsilon) + i\mathbb{R}^2} \tilde{\omega} \tag{8.40}$$

for any $\varepsilon > 0$ sufficiently small, where

$$\tilde{\omega} = \frac{1}{pq\,(1 + 3p - 4q)^{2n+1} (1 - 9p + 8q)^{n+1}}\, dp\,dq.$$
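The change of variables in this example can be verified directly: inverting $M$ gives explicit formulas for $x$ and $y$ in terms of $p$ and $q$, and substituting them into the two linear factors should return $p$ and $q$ themselves. A small exact-arithmetic sketch (function names are illustrative):

```python
from fractions import Fraction

# Rows b^(1), b^(2) of the matrix M in the running example.
M = [[Fraction(2, 3), Fraction(1, 3)],
     [Fraction(3, 4), Fraction(1, 4)]]


def inverse_2x2(m):
    """Return (det m, m^{-1}) for a 2 x 2 matrix of Fractions."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    inv = [[m[1][1] / det, -m[0][1] / det],
           [-m[1][0] / det, m[0][0] / det]]
    return det, inv


DET_M, M_INV = inverse_2x2(M)


def to_xy(p, q):
    """Apply (x, y) = (1, 1) - M^{-1} (p, q)."""
    x = 1 - (M_INV[0][0] * p + M_INV[0][1] * q)
    y = 1 - (M_INV[1][0] * p + M_INV[1][1] * q)
    return x, y


def ell1(x, y):
    return 1 - (2 * x + y) / 3


def ell2(x, y):
    return 1 - (3 * x + y) / 4
```

With these definitions $\det M = -1/12$, which accounts for the factor $12 = 1/|\det M|$ in (8.40), and `to_xy(p, q)` returns $(1 + 3p - 4q,\ 1 - 9p + 8q)$, the two expressions appearing in $\tilde{\omega}$.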

That we are in a non-generic direction is reflected in the fact that the critical points $\sigma_1$ and $\sigma_{1,2}$ are equal, and in the fact that the linear combination

$$(-\nabla h_r)(\sigma_{1,2}) = 3\, b^{(1)} + 0\, b^{(2)} \tag{8.41}$$

contains a zero coefficient. Theorem 8.1 no longer applies: because the coefficient of $b^{(2)}$ in (8.41) is zero, adding the necessary Cauchy integrals over imaginary fibers to (8.40) in order to reduce to a residue computation in $p$ and $q$ results in an error which grows too quickly. To see why we get into trouble, note that the exponential growth of $\tilde{\omega}$ with respect to $n$ is captured by the modified height function

$$\tilde{h}(p, q) = -2\log(1 + 3p - 4q) - \log(1 - 9p + 8q),$$

which satisfies $(\nabla \tilde{h})(0, 0) = (3, 0)$. Thus, for $\varepsilon > 0$ sufficiently small $\tilde{h}(-\varepsilon, \varepsilon)$ is less than $\tilde{h}(0, 0) = 0$, meaning we can add the Cauchy integral over the imaginary fiber with basepoint $(-\varepsilon, \varepsilon)$ and introduce only a negligible error. In particular, there exists $\tau \in (0, 1)$ such that

$$[x^{2n} y^{n}]F(x, y) = \frac{12}{(2\pi i)^2} \left( \int_{(\varepsilon, \varepsilon) + i\mathbb{R}^2} \tilde{\omega} - \int_{(-\varepsilon, \varepsilon) + i\mathbb{R}^2} \tilde{\omega} \right) + O(\tau^n).$$

Unfortunately, since the second coordinate of $(\nabla \tilde{h})(0, 0)$ is zero this argument does not allow us to show $\tilde{h}(\varepsilon, -\varepsilon)$ is less than 0 (which is, in fact, not true). We can therefore take a residue in the $p$ variable, but attempting to take a residue in the $q$ variable would introduce an error growing faster than the sequence of interest. Taking a residue in the $p$ variable gives

$$[x^{2n} y^{n}]F(x, y) = \frac{6}{\pi i} \int_{\varepsilon + i\mathbb{R}} \frac{1}{q\,(1 - 4q)^{2n+1} (1 + 8q)^{n+1}}\, dq + O(\tau^n) = \frac{6}{\pi i} \int_{\varepsilon + i\mathbb{R}} \frac{e^{-n[2\log(1 - 4q) + \log(1 + 8q)]}}{q\,(1 - 4q)(1 + 8q)}\, dq + O(\tau^n).$$

Making the substitution $y = iq$ then yields

$$[x^{2n} y^{n}]F(x, y) = \frac{-6}{\pi i} \int_{\mathbb{R} + i\varepsilon} \frac{A(y)}{y} e^{-n\varphi(y)}\, dy + O(\tau^n), \tag{8.42}$$

where

$$A(y) = \frac{1}{(1 - 4iy)(1 + 8iy)}, \qquad \varphi(y) = 2\log(1 - 4iy) + \log(1 + 8iy).$$

The integral in (8.42) would be a standard saddle-point integral, with an asymptotic expansion determined by Proposition 5.3 of Chapter 5, if the factor of $y$ was not present in the denominator of the integrand. In generic directions, when there is a contributing point determining asymptotics which is a root of $k$ of the linear denominator factors we are able to take $k$ successive residues. The existence of the pathological denominator term in (8.42) is a reflection of the fact that $\sigma_1$ lies on the zero set of two linear factors, but we are only able to take one residue.
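The local behaviour of the functions $A$ and $\varphi$ in (8.42) at the origin is what drives the asymptotics: expanding the logarithms, the linear terms $-8iy$ and $+8iy$ cancel and the quadratic terms add to $48y^2$. A quick floating-point sketch of this check, using only the formulas displayed above:

```python
import cmath


def A(y):
    """Amplitude from (8.42)."""
    return 1 / ((1 - 4j * y) * (1 + 8j * y))


def phi(y):
    """Phase from (8.42)."""
    return 2 * cmath.log(1 - 4j * y) + cmath.log(1 + 8j * y)


def quadratic_coefficient(y):
    """phi(y) / y^2, which should approach 48 as y -> 0."""
    return phi(y) / (y * y)
```

Evaluating `quadratic_coefficient` at a small real argument gives a value close to 48 with a small imaginary part coming from the cubic term, and `A(0)` equals 1, so the leading behaviour is $A(y) = 1 + O(y)$ and $\varphi(y) = 48y^2 + O(y^3)$.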


The general argument is similar to this example. First, we must extend our definition of contributing points to account for non-generic directions.

Definition 8.13 (general contributing points) A point $w \in V_{k_1, \dots, k_t}$ is now called contributing when

$$(-\nabla h)(w) = \sum_{j=1}^{t} a_j b^{(k_j)} \tag{8.43}$$

and each $a_j$ is non-negative (in generic directions each $a_j$ will be strictly positive).

Fix a direction $r$ and suppose that $F$ admits a unique contributing singularity $\sigma$ of highest height, which lies in some flat $V_{1, \dots, t}$ and no proper subflat. If $r$ is not generic but each coefficient in (8.43) is positive (i.e., if $r$ is not generic because of a contributing point of lower height) then the arguments of Section 8.2 still provide dominant asymptotic behaviour by analyzing the saddle-point integral $I_\sigma$ corresponding to $\sigma$. Thus, we may assume

$$(-\nabla h)(\sigma) = a_1 b^{(1)} + \cdots + a_s b^{(s)} + 0\, b^{(s+1)} + \cdots + 0\, b^{(t)}$$

for some $s < t$ and $a_j > 0$. Recall the matrix $M$ with rows $b^{(1)}, \dots, b^{(t)}, e^{(t+1)}, \dots, e^{(d)}$, where we assume that the variables and $\ell_j$ have been permuted so that $M$ has full rank, and let

$$\tilde{G}(z) = \frac{G(z)}{z_1 \cdots z_d \prod_{j > t} \ell_j(z)^{p_j}}.$$

For any integer $\kappa > 0$ let $0_\kappa$ denote the zero vector of length $\kappa$. The following result shows that, as in our example, the coefficients of interest are given by an integral of the same form as the generic case, except for an extra monomial in the denominator of the integrand which makes the integrand singular at the origin.

Proposition 8.6 Under the assumptions of the previous paragraph,

$$[z^{nr}]F(z) = \frac{n^{p_1 + \cdots + p_s - s}\, i^{\,p_{s+1} + \cdots + p_t}}{|\det M|\,(2\pi)^{d-s}} \int_{\mathbb{R}^{d-s} + i\varepsilon(1_{t-s}, 0)} \frac{R(y) \left( \sigma + iM^{-1} \binom{0}{y} \right)^{-nr}}{y_1^{p_{s+1}} \cdots y_{t-s}^{p_t}}\, dy \left( 1 + O\!\left(\frac{1}{n}\right) \right),$$

where $y = (y_1, \dots, y_{d-s})$ and

$$R(y) = \left. \tilde{G}\!\left(\sigma - M^{-1} w\right) \prod_{j=1}^{s} \frac{v_j(w)^{p_j - 1}}{(p_j - 1)!} \;\right|_{\substack{w_j = 0,\ 1 \le j \le s \\ w_j = -iy_{j-s},\ s+1 \le j \le d}}$$

for

$$v(w) = \left( \frac{r_1}{\sigma_1 - (M^{-1} w)_1} \ \cdots \ \frac{r_d}{\sigma_d - (M^{-1} w)_d} \right) \cdot M^{-1}.$$

Proof Since $\sigma$ is the unique contributing point of highest height, Proposition 8.1 implies that the coefficients of interest are given, up to an exponentially negligible error term, by

$$I = \frac{1}{(2\pi i)^d} \int_{\sigma - \varepsilon m + i\mathbb{R}^d} \frac{\tilde{G}(z)}{\ell_1(z)^{p_1} \cdots \ell_t(z)^{p_t}} \frac{dz}{z^{nr}}$$

for $m = M^{-1}\left( e^{(1)} + \cdots + e^{(t)} \right)$. Substituting $w = M(\sigma - z)$ gives

$$I = \frac{1}{|\det M|\,(2\pi i)^d} \int_{\varepsilon(1_t, 0) + i\mathbb{R}^d} \frac{\tilde{G}(\sigma - M^{-1} w)}{w_1^{p_1} \cdots w_t^{p_t} \left( \sigma - M^{-1} w \right)^{nr}}\, dw. \tag{8.44}$$

Because $a_1, \dots, a_s > 0$ in (8.43) we can add or subtract the integrals obtained by replacing the domain of integration in (8.44) by any of the $2^s - 1$ imaginary fibers with basepoints $(\pm 1_s, 1_{t-s}, 0) \neq (1_t, 0)$ and introduce only an exponentially negligible error. Taking a signed sum of these integrals corresponds to taking the residue of the integrand of $I$ in $w_1, \dots, w_s$ at the origin. As in the previous sections of this chapter, this residue is a polynomial of degree $p_1 + \cdots + p_s - s$, whose leading term after substituting $y_j = iw_{j+s}$ for $1 \le j \le d - s$ becomes $R(y)$.

Determining asymptotics in a non-generic direction thus reduces to a study of integrals of the form

$$\int_{\mathbb{R}^b + i(1_a, 0)} \frac{A(y)}{y_1^{m_1} \cdots y_a^{m_a}} e^{-n\varphi(y)}\, dy$$

where $m \in \mathbb{N}^a$, $a < b$, and $A$ and $\varphi$ are analytic at the origin. Here we study the case $t = d$ and $s = d - 1$, where Proposition 8.6 reduces to a one-dimensional integral of the form

$$\frac{i^{\,p_d}}{|\det M|\,(2\pi)} \int_{\mathbb{R} + i\varepsilon} \frac{A(y)}{y^{p_d}} e^{-n\varphi(y)}\, dy.$$

The following result states that $A$ and $\varphi$ can be replaced by the leading terms of their power series expansions at the origin while introducing an error term which grows slower than the sequence of interest.

Lemma 8.5 Let $I = \int_{\mathbb{R} + i\varepsilon} y^{-k} A(y) e^{-n\varphi(y)}\, dy$ for some $\varepsilon > 0$ and $r, k \in \mathbb{N}$. If

• $A$ is a rational function with no zeroes in the strip $\mathbb{R} + (-2\varepsilon, 2\varepsilon)i \subset \mathbb{C}$;
• $\varphi(y)$ is a finite sum of terms $r_j \log(p_j + iq_j y)$ for real $r_j > 0$ and $p_j, q_j \neq 0$;
• $A(y) = a + O(y)$ and $\varphi(y) = by^2 + O(y^3)$ at the origin, for some $b > 0$;
• $\varphi'(y) = 0$ only if $y = 0$,

then

$$I = a \int_{\mathbb{R} + i\varepsilon} y^{-k} e^{-nby^2}\, dy + o\!\left( n^{-(k+1)/2} \right). \tag{8.45}$$

Problem 8.4 asks you to prove Lemma 8.5, and Problem 8.5 asks you to determine asymptotics for the integral in (8.45). Combining Proposition 8.6 and Lemma 8.5 then gives the following.

Theorem 8.3 Suppose $F(z)$ is simple and $r$ is a non-generic direction with a unique contributing singularity $\sigma$ of maximal height, lying on the flat $V_{1, \dots, d}$. If $-(\nabla h_r)(\sigma) = a_1 b^{(1)} + \cdots + a_{d-1} b^{(d-1)} + 0\, b^{(d)}$ with each $a_j > 0$ then as $n \to \infty$

$$[z^{nr}]F(z) = \sigma^{-nr}\, n^{p_1 + \cdots + p_{d-1} + (p_d + 1)/2 - d}\, (C + o(1)),$$

where

$$C = \left( \prod_{j=1}^{d-1} \frac{a_j^{p_j - 1}}{(p_j - 1)!} \right) \frac{G(\sigma)\, (q^T q/2)^{(p_d - 1)/2}}{\prod_{j > d} \ell_j(\sigma)^{p_j} \cdot 2(\sigma_1 \cdots \sigma_d)\, |\det M|\, \Gamma\!\left( \frac{p_d + 1}{2} \right)},$$

$M$ is the matrix with rows $b^{(1)}, \dots, b^{(d)}$, and $q$ is the rightmost column of the matrix

$$Q = \begin{pmatrix} \sqrt{r_1}/\sigma_1 & 0 & \cdots & 0 \\ 0 & \sqrt{r_2}/\sigma_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{r_d}/\sigma_d \end{pmatrix} M^{-1}.$$

In the simple pole case, when $p_d = 1$, the leading asymptotic term is half what it would be if $r$ was generic (i.e., if $-(\nabla h_r)(\sigma)$ was a positive linear combination of $b^{(1)}, \dots, b^{(d)}$).

Proof Under these hypotheses Proposition 8.6 implies the coefficients of interest are determined up to a factor of $1 + O(1/n)$ by a univariate integral of the form

$$I = \sigma^{-nr}\, n^{p_1 + \cdots + p_{d-1} - (d-1)} \int_{\mathbb{R} + i\varepsilon} \frac{A(y)}{y^{p_d}} e^{-n\psi(y)}\, dy,$$

where rational $A(y)$ and log-linear $\psi(y)$ are analytic functions at the origin with

$$A(y) = \tilde{G}(\sigma) \prod_{j=1}^{d-1} \frac{1}{(p_j - 1)!} \left[ \left( \frac{r_1}{\sigma_1} \ \cdots \ \frac{r_d}{\sigma_d} \right) \cdot M^{-1} \right]_j^{p_j - 1} + O(y) = \tilde{G}(\sigma) \prod_{j=1}^{d-1} \frac{a_j^{p_j - 1}}{(p_j - 1)!} + O(y)$$

and

$$\psi(y) = r \cdot \log\!\left( \frac{\sigma + iM^{-1} (0, \dots, 0, y)^T}{\sigma} \right) = (q^T q/2)\, y^2 + O(y^3),$$

where the logarithm and the division are taken coordinate-wise.

Lemma 8.5 implies we can replace $A$ and $\psi$ by their leading asymptotic terms, and Problem 8.5 provides asymptotics of the resulting integral.

Example 8.3 Continued (Asymptotics in Non-Generic Directions) Above we reduced coefficient asymptotics of

$$F(x, y) = \frac{1}{\left(1 - \frac{2x + y}{3}\right)\left(1 - \frac{3x + y}{4}\right)}$$

in the direction $r = (2, 1)$ to (8.42). Replacing the analytic functions in this integral by their leading terms gives

$$[x^{2n} y^{n}]F(x, y) = \frac{-6}{\pi i} \int_{\mathbb{R} + i\varepsilon} \frac{e^{-48ny^2}}{y}\, dy \left( 1 + O\!\left(\frac{1}{n}\right) \right) = 6 + O\!\left(\frac{1}{n}\right).$$

Fig. 8.11 The three asymptotic regimes for the coefficient asymptotics $[x^{an} y^{bn}]F(x, y)$ of $F(x, y) = 1 \big/ \left(1 - \frac{2x + y}{3}\right)\left(1 - \frac{3x + y}{4}\right)$, with asymptotics in the non-generic directions filled in.

In the simple pole case, the fact that the leading asymptotic term in Theorem 8.3 is half what it would be in the generic case was noted by Pemantle and Wilson [6]. Additional information on asymptotics in non-generic directions, including results on how asymptotics transition between different asymptotic regimes as $r$ varies around a non-generic direction, can be found in Baryshnikov et al. [1].
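As a cross-check of the value 6 obtained above, the constant $C$ of Theorem 8.3 can be evaluated for the running example. The sketch below is restricted to the two-dimensional case with no additional denominator factors (an assumption made only to keep the code short); it computes the coefficients $a = (r/\sigma) M^{-1}$ of (8.43), the quantity $q^T q/2$, and $C$. The function name and its parameters are hypothetical.

```python
import math
from fractions import Fraction


def theorem_8_3_data(M, r, sigma, g_at_sigma, p):
    """Return (a, q^T q / 2, C) for d = 2 with denominator ell_1^p1 * ell_2^p2.

    Assumes exactly two linear factors, both vanishing at sigma, so that the
    product over j > d in the formula for C is empty.
    """
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    m_inv = [[M[1][1] / det, -M[0][1] / det],
             [-M[1][0] / det, M[0][0] / det]]
    # Row vector (r_1/sigma_1, r_2/sigma_2) = -grad h at sigma.
    grad = [Fraction(r[k]) / sigma[k] for k in range(2)]
    # Coefficients a_j with -grad h = a_1 b^(1) + a_2 b^(2).
    a = tuple(sum(grad[k] * m_inv[k][j] for k in range(2)) for j in range(2))
    # q is the rightmost column of diag(sqrt(r_k)/sigma_k) M^{-1}, so
    # q^T q = sum_k r_k / sigma_k^2 * (M^{-1})_{k,2}^2 is rational.
    half_qtq = sum(Fraction(r[k]) / sigma[k] ** 2 * m_inv[k][1] ** 2
                   for k in range(2)) / 2
    p1, p2 = p
    constant = (float(a[0]) ** (p1 - 1) / math.factorial(p1 - 1)
                * float(g_at_sigma) * float(half_qtq) ** ((p2 - 1) / 2)
                / (2 * float(sigma[0] * sigma[1]) * abs(float(det))
                   * math.gamma((p2 + 1) / 2)))
    return a, half_qtq, constant


M = [[Fraction(2, 3), Fraction(1, 3)],
     [Fraction(3, 4), Fraction(1, 4)]]
a, half_qtq, C = theorem_8_3_data(M, r=(2, 1), sigma=(Fraction(1), Fraction(1)),
                                  g_at_sigma=Fraction(1), p=(1, 1))
```

For $r = (2, 1)$ this gives $a = (3, 0)$, reproducing (8.41), together with $q^T q/2 = 48$, the coefficient in the exponent $e^{-48ny^2}$, and $C = 6$, half of the generic constant $1/|\det M| = 12$.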

Problems

8.1 Which of the plots in Figure 8.12 contains contributing singularities that could actually arise in an analysis of a multivariate rational function whose denominator is the product of real linear factors?

Fig. 8.12 A hyperplane arrangement with three potential arrangements of contributing singularities; not all of these arrangements can actually arise.

8.2 Consider the Laurent expansion of the rational function

$$F(x, y) = \frac{1}{\left(1 - \frac{2x + y}{3}\right)\left(1 - \frac{3x + y}{4}\right)}$$

defined by

$$f_{i,j} = \frac{1}{(2\pi i)^2} \int_{|x| = 1/5,\ |y| = 3} F(x, y) \frac{dx\,dy}{x^{i+1} y^{j+1}}.$$

Show that this Cauchy integral can be expressed as a signed sum of Cauchy integrals over the imaginary fibers with basepoints $(\pm 1/5, \pm 3)$. If $r$ is a direction with positive coordinates, express the coefficients of interest as a sum of Cauchy integrals over linking tori and imaginary fibers in unbounded components of $M_{\mathbb{R}}$; determine asymptotics in the direction $r$. Determine asymptotics in directions $r$ with potentially negative coordinates (note that Cauchy integrals over imaginary fibers with basepoints in unbounded regions of $M_{\mathbb{R}}$ may no longer be zero).

8.3 Let $F(z)$ be a simple rational function and fix both $r \in \mathbb{R}^d_{>0}$ and a convergent Laurent expansion $F(z) = \sum_{i \in \mathbb{Z}^d} f_i z^i$ in an open domain containing a point $x \in \mathbb{R}^d$. Prove that the Cauchy integral expression

$$f_{nr} = \frac{1}{(2\pi i)^d} \int_{T(x)} \frac{F(z)}{z^{nr + 1}}\, dz$$

can be decomposed as a sum of integrals over imaginary fibers. Deduce a generalization of Theorem 8.1 for this situation. What changes if $r$ has negative coordinates?

8.4 For $n > 0$ let $A_n = (-n^{-2/5}, n^{-2/5})$ denote a shrinking real interval centred at the origin and $\varepsilon_n = 1/\sqrt{n}$. Let $A(y)$ and $\varphi(y)$ be as in Lemma 8.5.

1. Prove that for any fixed $\kappa > 0$ the integral $\int_{\mathbb{R} \setminus (-\kappa, \kappa) + i\varepsilon_n} y^{-k} A(y) e^{-n\varphi(y)}\, dy$ decays faster than any fixed power of $n$ as $n \to \infty$.
2. Prove that $\int_{\mathbb{R} \setminus A_n + i\varepsilon_n} y^{-k} A(y) e^{-n\varphi(y)}\, dy$ decays faster than any fixed power of $n$ by bounding $|A(y)|$ and the real part of $\varphi(y)$ for $y$ near the origin using the leading terms of their power series expansions at the origin.
3. Prove that $\int_{A_n + i\varepsilon_n} \left( y^{-k} A(y) e^{-n\varphi(y)} - a y^{-k} e^{-bny^2} \right) dy = O\!\left( n^{-(k+1)/2 - 1/5} \right)$.
4. Prove that $\int_{\mathbb{R} \setminus A_n + i\varepsilon_n} y^{-k} e^{-ny^2}\, dy$ decays faster than any fixed power of $n$.
5. Conclude that Lemma 8.5 holds.

8.5 For any $\kappa \in \mathbb{N}$ and $b > 0$ define the integral

$$I_\kappa(b) = \int_{\mathbb{R} + i\varepsilon} \frac{e^{-bt^2}}{t^\kappa}\, dt.$$

By differentiating under the integral sign, show that $I_\kappa$ satisfies the recursion $I_\kappa(b) = -\int I_{\kappa - 2}(b)\, db$ for all $\kappa \ge 2$. After finding $I_0(b)$ and $I_1(b)$, prove that

$$I_\kappa(b) = \frac{(-i)^\kappa\, b^{(\kappa - 1)/2}\, \pi}{\Gamma\!\left( \frac{\kappa + 1}{2} \right)},$$

where $\Gamma(z)$ is the Euler gamma function.

8.6 Determine asymptotics of the bivariate generating function

$$F(x, y) = \frac{1}{(1 - \rho_{1,1} x - \rho_{2,1} y)(1 - \rho_{1,2} x - \rho_{2,2} y)}$$

for a closed multiclass queuing network with no infinite servers. How does the asymptotic behaviour depend on the weights $\rho_{i,j} > 0$? For what values of the weights are all directions $r \in \mathbb{R}^2_{>0}$ generic?

8.7 Let $A$ be an $m \times n$ matrix and $B$ be an $n \times m$ matrix, and let $I_n$ and $I_m$ denote the $n \times n$ and $m \times m$ identity matrices, respectively. Prove Sylvester's determinant identity, which states that $\det(I_n + BA) = \det(I_m + AB)$, then show $\det(zI_n + BA) = z^{n-m} \det(zI_m + AB)$. Hint: If $0_{n \times m}$ is the $n \times m$ zero matrix, direct calculation of the two block products shows

$$\det\left[ \begin{pmatrix} I_m & -A \\ B & I_n \end{pmatrix} \begin{pmatrix} I_m & A \\ 0_{n \times m} & I_n \end{pmatrix} \right] = \det\left[ \begin{pmatrix} I_m & A \\ 0_{n \times m} & I_n \end{pmatrix} \begin{pmatrix} I_m & -A \\ B & I_n \end{pmatrix} \right].$$

8.8 Let $C$ be an $n \times n$ matrix and for any $1 \le k \le n$ let $S_k$ denote the collection of $k$ element subsets of $\{1, \dots, n\}$. Prove that for any $1 \le k \le n$ the coefficient of $z^{n-k}$ in $\det(zI_n + C)$ equals

$$\sum_{S \in S_k} \det\left( C_{S,S} \right),$$

where $C_{S,S}$ denotes the $k \times k$ submatrix of $C$ consisting of its rows and columns with indices in $S$.

8.9 Let $A$ be an $m \times n$ matrix and $B$ be an $n \times m$ matrix, where $n \ge m$. Use the results of Problems 8.7 and 8.8 to prove the Cauchy-Binet formula:

$$\det(AB) = \sum_{S \in S_m} \det\left( A_{[m],S} \right) \det\left( B_{S,[m]} \right),$$

where $[m] = \{1, \dots, m\}$ and for any matrix $M$ the notation $M_{P,Q}$ refers to the submatrix of $M$ whose rows have indices in the set $P$ and whose columns have indices in the set $Q$.
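Before attempting the proofs in Problems 8.7 and 8.9 it can be reassuring to test the identities on a small numerical instance. The sketch below is a sanity check only, not a proof; the matrices are arbitrary choices and the helper names are hypothetical. It evaluates both sides of Sylvester's identity and of the Cauchy-Binet formula for a $2 \times 3$ matrix $A$ and a $3 \times 2$ matrix $B$.

```python
from fractions import Fraction
from itertools import combinations


def det(matrix):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not matrix:
        return Fraction(1)
    total = Fraction(0)
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def matmul(a, b):
    """Product of matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity_plus(matrix):
    """Return I + matrix for a square matrix."""
    return [[entry + (1 if i == j else 0) for j, entry in enumerate(row)]
            for i, row in enumerate(matrix)]


def cauchy_binet_sum(a, b):
    """Sum over m-element column sets S of det(A[:, S]) * det(B[S, :])."""
    m, n = len(a), len(a[0])
    total = Fraction(0)
    for cols in combinations(range(n), m):
        a_sub = [[a[i][c] for c in cols] for i in range(m)]
        b_sub = [[b[c][j] for j in range(m)] for c in cols]
        total += det(a_sub) * det(b_sub)
    return total


A = [[1, 2, 3], [4, 5, 6]]           # m x n with m = 2, n = 3
B = [[7, 8], [9, 10], [11, 12]]      # n x m
```

For these matrices `det(matmul(A, B))` and `cauchy_binet_sum(A, B)` agree, as do `det(identity_plus(matmul(B, A)))` and `det(identity_plus(matmul(A, B)))`.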

References

1. Y. Baryshnikov, S. Melczer, and R. Pemantle. Asymptotics of multivariate sequences IV: generating functions with poles on a hyperplane arrangement. In preparation, 2020.
2. Andrea Bertozzi and James McKenna. Multidimensional residues, generating functions, and their application to queueing networks. SIAM Rev., 35(2):239–268, 1993.
3. Yaakov Kogan. Asymptotic expansions for large closed and loss queueing networks. Math. Probl. Eng., 8(4-5):323–348, 2002.
4. Yaakov Kogan and Andrei Yakovlev. Asymptotic analysis for closed multichain queueing networks with bottlenecks. Queueing Systems Theory Appl., 23(1–4):235–258, 1996.
5. Peter Orlik and Hiroaki Terao. Arrangements of hyperplanes, volume 300 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 1992.
6. Robin Pemantle and Mark C. Wilson. Asymptotic expansions of oscillatory integrals with complex phase. In Algorithmic probability and combinatorics, volume 520 of Contemp. Math., pages 221–240. Amer. Math. Soc., Providence, RI, 2010.

Chapter 9

Multiple Points and Beyond

Only geometry can give us the thread which will lead us through the labyrinth of the continuum’s composition, the maximum and the minimum, the infinitesimal and the infinite, and no one will arrive at a solid metaphysic except he who has passed through it. — Gottfried Wilhelm von Leibniz

In this chapter we detail more general aspects of ACSV. To begin, we combine techniques from Chapter 5 (ACSV in the smooth case) and Chapter 8 (ACSV when the singular set is a hyperplane arrangement) to study functions whose singular sets are locally hyperplane arrangements; asymptotics are determined through the use of multivariate residues, leading to Theorems 9.1 and 9.2. Following this we survey current aspects of ACSV, including new work making use of tools from topology, leading up to Theorem 9.3 on the asymptotic contributions of (potentially non-minimal) critical points. Combining these topological methods with the computational results for D-finite functions discussed in Chapter 2 gives a powerful approach to the connection problem for asymptotics of sequences satisfying linear recurrence relations with polynomial coefficients.

9.1 Local Geometry of Algebraic and Analytic Varieties

Before studying rational (and meromorphic) generating functions with more complicated singular varieties, we need a better understanding of the 'local' geometry that can arise. In order to talk about local geometry, we introduce the following notion.

Definition 9.1 (local rings) Given $w \in \mathbb{C}^d$ the local ring at $w$, denoted $\mathcal{O}_w$, is the ring of power series which converge in some neighbourhood of $w$ with the usual addition and multiplication. Equivalently, the local ring is given by complex functions $f(z)$ which are analytic at $w$, modulo the equivalence relation $f \sim g$ if $f = g$ on some neighbourhood of $w$. An element $f \in \mathcal{O}_w$ is a unit if it does not vanish at $w$ (so $1/f$ is also in $\mathcal{O}_w$) and irreducible if $f = gh$ implies one of $g$ or $h$ is a unit. Two irreducible elements $f, g \in \mathcal{O}_w$ are coprime if there does not exist a unit $u \in \mathcal{O}_w$ with $f = ug$.

Hörmander [17, Thm. 6.2.2] shows that for any $w \in \mathbb{C}^d$ the local ring $\mathcal{O}_w$ forms a unique factorization domain, meaning any element of $\mathcal{O}_w$ can be written as a product of irreducible elements and this representation is unique up to the ordering of factors and multiplication by units.

© Springer Nature Switzerland AG 2021 S. Melczer, Algorithmic and Symbolic Combinatorics, Texts & Monographs in Symbolic Computation, https://doi.org/10.1007/978-3-030-67080-1_9


Remark 9.1 If $f(z) = g(z)h(z)$ for analytic functions $g$ and $h$ which vanish at $w$ then the product rule implies any partial derivative of $f$ vanishes at $w$. Thus, a sufficient (but not necessary) condition for an element of $\mathcal{O}_w$ represented by the analytic function $f(z)$ to be irreducible is that the gradient $(\nabla f)(w)$ is non-zero.

Suppose we want to determine coefficient asymptotics of a rational function $F(z) = G(z)/H(z)$. The most explicit results in this chapter will hold when the denominator $H(z)$ locally factors into irreducibles whose zero sets have nice geometry.

Definition 9.2 (factorizations and multiple points) Let $w \in \mathbb{C}^d$ and $H(z)$ be an analytic function at $w$. We call an expression of the form

$$H(z) = u(z) H_1(z)^{p_1} \cdots H_s(z)^{p_s} \tag{9.1}$$

a factorization of $H$ in $\mathcal{O}_w$ if $u$ and the $H_j$ are analytic in some polydisk $D$ centred at $w$, each $p_j$ is a positive integer, and (9.1) holds for all $z \in D$. By convention, when writing a factorization of the form (9.1) we always assume that $u(w) \neq 0$ while each $H_j(w) = 0$. If the $H_j(z)$ are irreducible in $\mathcal{O}_w$ and pairwise coprime then (9.1) is called a square-free factorization of $H$ in $\mathcal{O}_w$. When $H$ has a square-free factorization in $\mathcal{O}_w$ with each exponent $p_j = 1$ then $H$ is said to be square-free at $w$. We call $w$ a multiple point of $H$ if $H$ has a factorization in $\mathcal{O}_w$ of the form (9.1) where the gradient vectors $(\nabla H_1)(w), \dots, (\nabla H_s)(w)$ are all non-zero. If, in addition, the gradient vectors $(\nabla H_1)(w), \dots, (\nabla H_s)(w)$ are linearly independent then $w$ is called a transverse (multiple) point.

Remark 9.2 Recall from previous chapters that the zero set of a complex-valued function $f$ is denoted $V(f)$, that the square-free part of a polynomial $H(z) \in \mathbb{C}[z]$, denoted $H^s(z)$, is the product of its distinct irreducible factors, and that the smooth points of $V(H)$ are the points where the gradient $(\nabla H^s)(z)$ is non-zero. Checking definitions shows that any smooth point of $V(H)$ is a transverse point.

Remark 9.3 Given an open set $U \subset \mathbb{C}^d$ and analytic function $f(z)$ on $U$, let $V_U(f)$ denote the elements $z \in U$ such that $f(z) = 0$. If (9.1) gives a factorization of $H(z)$ in $\mathcal{O}_w$ then there exists some neighbourhood $U$ of $w$ in $\mathbb{C}^d$ such that $H, H_1, \dots, H_s$ are analytic on $U$ and $V_U(H) = V_U(H_1) \cup \cdots \cup V_U(H_s)$. Thus, if $w$ is a multiple point of $H(z)$ then the points of $V(H)$ contained in some neighbourhood of $w$ form a finite union of smooth sets, each defined by the vanishing of a single analytic function. If $w$ is a transverse point then the tangent spaces of these smooth sets at $w$ are hyperplanes with linearly independent normal vectors.

Example 9.1 (Multiple Points in an Algebraic Set) Suppose $H(x, y) = H_1(x, y) H_2(x, y)$ where $H_1 = 1 - x^2 + y$ and $H_2 = 1 - x^2 - y^2$, so that $V = V(H)$ is the union of the two irreducible algebraic varieties $V_1 = V(H_1)$ and $V_2 = V(H_2)$; see the left of Figure 9.1. Any element of $V$ in either $V_1$ or $V_2$ but not both is a smooth point, and thus a transverse point. Solving the system $H(x, y) = H_x(x, y) = H_y(x, y) = 0$ gives the three points $V_1 \cap V_2 = \{(-1, 0), (0, -1), (1, 0)\}$ where $V$ is non-smooth. The gradients $\nabla H_1$ and $\nabla H_2$ never vanish on $V_1$ and $V_2$, respectively, so all three points in $V_1 \cap V_2$ are multiple points. The gradients $(\nabla H_1)(x, y)$ and $(\nabla H_2)(x, y)$ are linearly independent when $(x, y) = (-1, 0)$ and $(x, y) = (1, 0)$, which are thus transverse points, but these gradients are linearly dependent at $(x, y) = (0, -1)$, which is not a transverse point. Note that the transverse points vary smoothly with the coefficients of $H$ under small perturbations. On the other hand, by slightly perturbing coefficients it is possible to remove the non-transverse multiple point or have it split into two points of intersection. This gives a geometric interpretation of transverse points: they are the multiple points arising from intersections of smooth sets which are stable under small coefficient changes.

Fig. 9.1 Left: The real points in the zero set $V((1 - x^2 + y)(1 - x^2 - y^2)) = V(1 - x^2 + y) \cup V(1 - x^2 - y^2)$ form the union of a circle and a parabola, which have two transverse points of intersection (on the sides of the circle) and one non-transverse point of intersection (at the bottom of the circle). The transverse points of intersection move smoothly if the parabola or circle are perturbed, but the non-transverse point can disappear or split into two points of intersection when the parabola or circle are perturbed. Right: The real points in the zero set of $H(x, y) = (1/3 - x)^2 (1/2 - y)^2 + (5/6 - x - y)(11/6 - 4x - y)$ form a figure-eight pattern.
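The classification of the three intersection points in Example 9.1 comes down to whether the $2 \times 2$ matrix of gradients is invertible. A minimal sketch of that check, with the gradients entered by hand from $H_1 = 1 - x^2 + y$ and $H_2 = 1 - x^2 - y^2$ (function names are illustrative):

```python
def h1(x, y):
    return 1 - x * x + y


def h2(x, y):
    return 1 - x * x - y * y


def grad_h1(x, y):
    return (-2 * x, 1)


def grad_h2(x, y):
    return (-2 * x, -2 * y)


def classify(point):
    """Classify a common zero of H1 and H2 as transverse or non-transverse."""
    x, y = point
    if h1(x, y) != 0 or h2(x, y) != 0:
        raise ValueError("point is not on both curves")
    g1, g2 = grad_h1(x, y), grad_h2(x, y)
    # The gradients are linearly independent iff this determinant is non-zero.
    determinant = g1[0] * g2[1] - g1[1] * g2[0]
    return "transverse" if determinant != 0 else "non-transverse"


POINTS = [(-1, 0), (0, -1), (1, 0)]
CLASSIFICATION = {point: classify(point) for point in POINTS}
```

The points $(\pm 1, 0)$ are reported as transverse and $(0, -1)$ as non-transverse, in agreement with the example.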

Non-Example 9.2 (Non-Multiple Points) Consider the polynomial $H(x, y, z) = z^2 - x^2 - y^2$. Problem 9.1 asks you to show that $H$ is irreducible in the local ring $\mathcal{O}_0$, but the gradient of $H$ vanishes at the origin. The zero set $V(H)$ is not smooth at the origin; its real points form the union of two cones joined to make an hourglass shape.


In practice, when $H(z)$ is a polynomial it often factors into irreducible polynomials whose zero sets behave nicely.

Remark 9.4 Suppose $H(z) \in \mathbb{C}[z]$ has a factorization $H(z) = H_1(z)^{p_1} \cdots H_m(z)^{p_m}$ in $\mathbb{C}[z]$, where each $p_j$ is a positive integer. Further assume that (1) whenever $H_j(w) = 0$ for some $w \in \mathbb{C}^d$ then $(\nabla H_j)(w)$ is not the zero vector, and (2) if $w \in \mathbb{C}^d$ is a common root of some factors $H_{k_1}(w) = \cdots = H_{k_s}(w) = 0$ then the vectors $(\nabla H_{k_1})(w), \dots, (\nabla H_{k_s})(w)$ are linearly independent. Then every point in the zero set $V = V(H)$ is a transverse point of $H$. If only condition (1) holds then every point in $V$ is a multiple point, but some are not transverse points. If $H_{k_1}, \dots, H_{k_s}$ are the factors of $H$ which vanish at $w \in V$ then the factorization

$$H(z) = \underbrace{\Bigl(\prod_{j \notin \{k_1,\dots,k_s\}} H_j(z)^{p_j}\Bigr)}_{u(z)} \; H_{k_1}(z)^{p_{k_1}} \cdots H_{k_s}(z)^{p_{k_s}}$$

gives a square-free factorization of $H$ in $O_w$.

Definition 9.3 (transverse polynomial factorization) A factorization $H(z) = H_1(z)^{p_1} \cdots H_m(z)^{p_m}$ in $\mathbb{C}[z]$ satisfying properties (1) and (2) of Remark 9.4 is called a transverse polynomial factorization of $H(z)$.

Example 9.3 (A Quadrant Lattice Path Model) If $A = \{(-1, 1), (0, -1), (1, 1)\}$ then Theorem 4.2 in Chapter 4 implies that the generating function for the number of walks in $\mathbb{N}^2$ which begin at the origin and use the steps in $A$ is the main diagonal of the rational function

$$F(x, y, t) = \frac{(1 + x)(1 - xy^2 + x^2)}{(1 - t(1 + x^2 + xy^2))(1 - y)(1 + x^2)}.$$

The denominator $H(x, y, t)$ of $F(x, y, t)$ is the product of the polynomials

$$H_1 = 1 - t(1 + x^2 + xy^2), \qquad H_2 = 1 - y, \qquad H_3 = 1 + x^2.$$

When $H_1(x, y, t) = 0$ the derivative $-\partial H_1/\partial t = 1 + x^2 + xy^2 = 1/t \neq 0$, so the gradients of $H_1$, $H_2$, and $H_3$ do not vanish on $V(H_1)$, $V(H_2)$, and $V(H_3)$, respectively. Since the gradients of these polynomials are linearly independent at common intersections of their zero sets, the decomposition $H = H_1 H_2 H_3$ is a transverse polynomial factorization of $H$.

Not every polynomial has a transverse polynomial factorization. In fact, it is possible for a polynomial $H(z)$ to be irreducible as a complex polynomial but still factor in the local ring at a point. This is one reason why it is important to discuss local rings, even if one only cares about the zero sets of polynomials.

Example 9.4 (A Lemniscate) Historically significant examples of irreducible plane curves, which will allow for a nice illustration of our theory, are figure-eight shapes known as lemniscates¹. For instance, the real zeroes of the polynomial

$$L(x, y) = \left(\tfrac{1}{3} - x\right)^2 \left(\tfrac{1}{2} - y\right)^2 + \left(\tfrac{5}{6} - x - y\right)\left(\tfrac{11}{6} - 4x - y\right)$$

are displayed on the right-hand side of Figure 9.1. Using the quadratic formula it is easy to see that $L(x, y)$ does not factor into linear factors. Solving the system $L = L_x = L_y = 0$ shows that $\rho = (1/3, 1/2)$ is the only non-smooth point of $V(L)$. Near $\rho$ the zero set $V(L)$ is the union of two zero sets $V(y - a_1(x)) \cup V(y - a_2(x))$ where $a_1(x)$ and $a_2(x)$ are analytic at $x = 1/3$ with convergent series expansions

$$a_1(x) = \tfrac{1}{2} - \left(x - \tfrac{1}{3}\right) - \tfrac{1}{3}\left(x - \tfrac{1}{3}\right)^3 - \tfrac{7}{27}\left(x - \tfrac{1}{3}\right)^5 + \cdots$$
$$a_2(x) = \tfrac{1}{2} - 4\left(x - \tfrac{1}{3}\right) + \tfrac{16}{3}\left(x - \tfrac{1}{3}\right)^3 - \tfrac{128}{27}\left(x - \tfrac{1}{3}\right)^5 + \cdots.$$

Since $L$ is quadratic in $y$ we can determine $a_1(x)$ and $a_2(x)$ using the quadratic formula, but finding such expansions can be done for any bivariate function using the Newton polygon method discussed in Section 2.3 of Chapter 2. Because the gradients of $y - a_1(x)$ and $y - a_2(x)$ are linearly independent at $\rho$, this is a transverse point. The algorithms of Chapter 7 automatically prove that $\rho$ is minimal.
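The two branch expansions above can be reproduced with exact rational arithmetic. The following is a minimal sketch, not the Newton polygon method of Chapter 2: it assumes the shifted coordinates $X = x - 1/3$ and $Y = y - 1/2$, in which the lemniscate reads $X^2Y^2 + (X + Y)(4X + Y)$, substitutes $Y = X\,c(X)$, and solves $(1 + X^2)c^2 + 5c + 4 = 0$ one coefficient at a time. The function names are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction


def known_square_term(c, k):
    """Coefficient of X^k in c(X)^2 using only the coefficients already stored in c."""
    return sum(c[i] * c[k - i] for i in range(k + 1) if i < len(c) and k - i < len(c))


def branch_slope_series(c0, order):
    """Coefficients c_0..c_order of the series c(X) with (1 + X^2)*c^2 + 5*c + 4 = 0."""
    c = [Fraction(c0)]
    for n in range(1, order + 1):
        # Part of the X^n coefficient that does not involve the unknown c_n.
        residual = known_square_term(c, n)
        if n >= 2:
            residual += known_square_term(c, n - 2)
        # c_n enters linearly, through 2*c_0*c_n (from c^2) and 5*c_n.
        c.append(-residual / (2 * c[0] + 5))
    return c


# On each branch Y = X*c(X), so a_j(x) = 1/2 + X*c(X) with X = x - 1/3.
a1_slopes = branch_slope_series(-1, 4)  # branch tangent to X + Y = 0
a2_slopes = branch_slope_series(-4, 4)  # branch tangent to 4X + Y = 0
print(a1_slopes)
print(a2_slopes)
```

The starting values $c_0 = -1$ and $c_0 = -4$ are the two roots of $c^2 + 5c + 4 = 0$, which is why exactly two analytic branches pass through the non-smooth point.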

We conclude this introductory section by giving a necessary condition for multiple points. Stating this condition requires the following concept.

Definition 9.4 (leading homogeneous terms) Given $w \in \mathbb{C}^d$ and $H(z)$ analytic at $w$, the leading homogeneous term of $H$ at $w$ is the sum of the lowest-degree terms appearing in the power series expansion of $H(w + z)$ at the origin. If $H(w) \neq 0$ then the leading homogeneous term of $H$ at $w$ is the constant $H(w)$.

Example 9.4 Continued (Lemniscate Leading Homogeneous Term) If $L(x, y)$ is the lemniscate from above then $L(1/3 + x, 1/2 + y) = 4x^2 + 5xy + y^2 + x^2y^2$ so the leading homogeneous term of $L$ at the non-smooth point $(1/3, 1/2)$ is $\ell(x, y) = 4x^2 + 5xy + y^2 = (x + y)(4x + y)$.

¹ The term lemniscate was coined by Jakob Bernoulli from a Greek word λημνίσκος (lemniskos) for ribbon; see Schappacher [29] for additional historical context. Our example $L(x, y)$ is obtained from the simple polynomial $x^2y^2 + (x + y)(4x + y)$ after making the (somewhat arbitrary) substitution $(x, y) \mapsto (1/3 - x, 1/2 - y)$ so that $1/L(x, y)$ has a power series expansion at the origin.

Lemma 9.1 Let $w \in \mathbb{C}^d$ and $H(z)$ be analytic at $w$ with $H(w) = 0$. Let $\ell(z)$ be the leading homogeneous term of $H(z)$ at $w$. If $w$ is a multiple point of $H$ then $\ell$ factors into linear polynomials.

Proof We know that $H(z)$ has a factorization in $O_w$ of the form (9.1), which implies $\ell(z) = \ell_1(z)^{p_1} \cdots \ell_s(z)^{p_s}$ where $\ell_j$ is the leading homogeneous term of $H_j$ at $w$. When $w$ is a multiple point then each $\ell_j$ is linear. □

Example 9.5 (Applying Lemma 9.1) The leading homogeneous term of $H(w, x, y, z) = w^2 - x^2 - y^2 - z^2$ at the origin is $H$ itself, which does not factor into linear polynomials, so Lemma 9.1 implies the origin is not a multiple point of $H$.
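For quadratic leading terms the hypothesis of Lemma 9.1 can be tested mechanically. The sketch below relies on a standard linear algebra fact that is not stated in the text: a quadratic form over the complex numbers is a product of two linear forms exactly when its symmetric coefficient matrix has rank at most 2. The helper names are hypothetical, and the test only covers leading terms of degree two.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a rational matrix via exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def splits_into_linear_forms(gram):
    """True when the quadratic form v^T * gram * v is a product of two linear forms over C."""
    return rank(gram) <= 2


# w^2 - x^2 - y^2 - z^2 from Example 9.5 (variables ordered w, x, y, z).
example_9_5 = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
# z^2 - x^2 - y^2 from Non-Example 9.2 (variables ordered x, y, z).
non_example_9_2 = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]
# 4x^2 + 5xy + y^2, the leading term of the lemniscate at its non-smooth point.
lemniscate_term = [[4, Fraction(5, 2)], [Fraction(5, 2), 1]]

print(splits_into_linear_forms(example_9_5))
print(splits_into_linear_forms(non_example_9_2))
print(splits_into_linear_forms(lemniscate_term))
```

Passing this test is only necessary for a multiple point, as in the lemma: the factors of the leading term must also come from local factors of $H$ with non-vanishing gradients.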

With this knowledge of local geometry in hand, we now turn back to analytic combinatorics in several variables.

9.2 ACSV for Transverse Points

In this section we generalize our previous ACSV methods to cover multivariate functions whose coefficient asymptotics are determined by transverse points. For simplicity we introduce most of our constructions and results for power series expansions of rational functions, noting afterwards how they vary for general Laurent expansions of meromorphic functions. To that end, let $F(z) = G(z)/H(z)$ be the ratio of coprime polynomials $G(z), H(z) \in \mathbb{C}[z]$ with power series expansion $F(z) = \sum_{i \in \mathbb{N}^d} f_i z^i$ on the domain of convergence $D \subset \mathbb{C}^d$. Recall that because $G$ and $H$ are coprime, the set $V = V(H)$ forms the singular variety of $F$ (its set of singularities). As in previous chapters, our analysis in a direction $r \in \mathbb{R}_*^d$ begins with a multivariate Cauchy integral

$$f_{nr} = [z^{nr}]F(z) = \frac{1}{(2\pi i)^d} \int_T F(z) \, \frac{dz}{z^{nr+1}}, \tag{9.2}$$

where $T$ is any product of circles in the domain of convergence $D$. Once again the growth of the Cauchy integrand in (9.2) is captured by the height function

$$h_r(z) = h(z) = -\sum_{j=1}^{d} r_j \log |z_j|,$$

which gives a bound on the exponential growth

$$\limsup_{n \to \infty} |f_{nr}|^{1/n} \leq e^{h_r(w)}$$

for every w ∈ D, and in particular for every minimal point w ∈ V ∩ ∂D (this bound is established through the same process as (5.15) in Chapter 5). Following the approach of Chapter 5, the first step in our asymptotic analysis is to determine the singularities where this upper bound could be tight.
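As a small numerical illustration, the bound $e^{h_r(w)}$ is easy to evaluate once a point is fixed. The sketch below uses the point $(1, 1, 1/3)$, where the factors $H_1$ and $H_2$ of Example 9.3 both vanish, and the main diagonal direction; the choice of point is an assumption made for illustration, and the code does not check that the point is minimal.

```python
import math


def height(r, w):
    """Height function h_r(w) = -sum_j r_j * log|w_j| (all coordinates of w must be non-zero)."""
    return -sum(rj * math.log(abs(wj)) for rj, wj in zip(r, w))


point = (1, 1, 1 / 3)       # assumed sample point on the singular variety of Example 9.3
direction = (1, 1, 1)       # main diagonal direction
growth_bound = math.exp(height(direction, point))
print(growth_bound)
```

The value printed is the upper bound on the exponential growth rate of the diagonal coefficients that this particular point supplies.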

9.2.1 Critical Points and Stratifications

In Chapter 5 we introduced the smooth critical point equations (5.16) to characterize potential minimizers of the height function $h(z)$ on $V \cap \partial D$. Unfortunately, as noted in Chapter 8, any non-smooth point of $V$ trivially satisfies the smooth critical point equations, so these equations are too weak to properly characterize non-smooth points of interest. In Chapter 8, when the singular variety $V$ was assumed to form a hyperplane arrangement, we decomposed $V$ into the union of smooth sets and used the geometry of $V$ to create a new set of 'hyperplane' critical point equations (8.2). Now that we have a better understanding of local geometry, we can generalize the setup of Chapter 8. Recall the logarithmic gradient map $\nabla_{\log} f = (z_1 f_{z_1}, \dots, z_d f_{z_d})$ from previous chapters, where $f_{z_j}$ denotes the partial derivative $\partial f/\partial z_j$.

Definition 9.5 (critical points) Let $w \in \mathbb{C}_*^d$ and $H(z)$ be analytic at $w$. Fix a direction $r \in \mathbb{R}_*^d$ and suppose $w$ is a transverse point where $H(z) = u(z) H_1(z)^{p_1} \cdots H_s(z)^{p_s}$ is a square-free factorization of $H$ in $O_w$. If the number of factors $s < d$ then $w$ is a critical point in the direction $r$ if the $(s + 1) \times d$ matrix

$$N_w(H_1, \dots, H_s) = \begin{pmatrix} -(\nabla_{\log} H_1)(w) \\ \vdots \\ -(\nabla_{\log} H_s)(w) \\ r \end{pmatrix} \tag{9.3}$$

is rank deficient, meaning all $(s + 1) \times (s + 1)$ minors of $N_w$ vanish. If $s = d$ then $w$ is always called a critical point in the direction $r$. Since $w$ is a transverse point, it cannot be the case that $s > d$.

Definition 9.5 generalizes Definition 5.4 in Chapter 5, for critical points in the smooth case, and Definition 8.3 of Chapter 8, for critical points in the hyperplane arrangement setting.

Remark 9.5 Suppose $w$ is a transverse point of $H$ with square-free factorization as in Definition 9.5. Then the leading homogeneous term of $H(z)$ at $w$ is $u(w)\,\ell_1(z)^{p_1} \cdots \ell_s(z)^{p_s}$, where $\ell_j(z)$ is the (linear) leading homogeneous term of $H_j(z)$ at $w$. In particular, the gradients $(\nabla H_j)(w)$ can be determined implicitly from the evaluations of the partial derivatives of $H(z)$ at $z = w$ by computing a power series expansion of $H(z)$ at $z = w$ and factoring the leading homogeneous term. When $H(z) \in \mathbb{Q}[z]$ then all critical points have algebraic coordinates and the evaluations of the partial derivatives of $H$ and factorization of its leading homogeneous term can be done implicitly [19].

Definition 9.5 characterizes when a specific point is critical. In order to determine a procedure to compute all critical points, we decompose the singular variety $V$ into a finite collection of smooth sets such that each smooth set has consistent local geometry. The easiest case occurs when $H(z)$ admits a transverse polynomial factorization. The following generalizes Definition 8.1 in Chapter 8 from the hyperplane arrangement setting to this context.

Definition 9.6 (flats and strata) Suppose $H(z)$ admits a transverse polynomial factorization $H(z) = H_1(z)^{p_1} \cdots H_m(z)^{p_m}$. For any $S = \{k_1, \dots, k_s\} \subset \{1, \dots, m\}$ the flat defined by $H_{k_1}, \dots, H_{k_s}$ is the set $V_S = V_{k_1,\dots,k_s} = V(H_{k_1}, \dots, H_{k_s})$ of their common solutions in $\mathbb{C}^d$. The stratum defined by $S$ is the flat $V_S$ minus any other flats it strictly contains,

$$\mathcal{S}_S = V_S \setminus \bigcup_{V_T \subsetneq V_S} V_T.$$

The dimension of the flat $V_S$, and of the stratum $\mathcal{S}_S$, is the value of $d - |S|$.

Remark 9.6 Because we restrict to transverse polynomial factorizations there are no non-empty flats defined by subsets $S$ of size $|S| > d$, meaning the dimension of any non-empty flat is a natural number. Any zero-dimensional flat, defined by a subset $S$ with $|S| = d$, is a finite set. Further properties of the dimension of algebraic sets, including the general definition of dimension, can be found in Mumford [24, Ch. 1]. The singular variety $V$ is the (finite and disjoint) union of the strata $\mathcal{S}_S$ as $S$ runs over the subsets of $\{1, \dots, m\}$.

The critical points on a stratum can be characterized by one set of algebraic equations.

Definition 9.7 (transverse critical points equations) Suppose $H(z)$ admits a transverse polynomial factorization $H(z) = H_1(z)^{p_1} \cdots H_m(z)^{p_m}$ and fix a direction $r \in \mathbb{R}_*^d$. For any $S = \{k_1, \dots, k_s\} \subset \{1, \dots, m\}$ let $N_z(H_{k_1}, \dots, H_{k_s})$ denote the matrix in (9.3). The system of polynomial equalities and inequalities

$$\begin{aligned}
H_{k_j}(z) &= 0, & & j = 1, \dots, s \\
\det(M) &= 0, & & M \text{ an } (s + 1) \times (s + 1) \text{ submatrix of } N_z(H_{k_1}, \dots, H_{k_s}) \\
z_j &\neq 0, & & j = 1, \dots, d \\
H_i(z) &\neq 0, & & i \in \{1, \dots, m\} \setminus \{k_1, \dots, k_s\}
\end{aligned} \tag{9.4}$$

form the critical point equations for the stratum $\mathcal{S}_S$ in the direction $r$. If $H(z)$ admits a transverse polynomial factorization then the solutions of the critical point equations over all strata of $V$ give all critical points. We discuss notions of critical points beyond transverse points in Section 9.3 below.

Remark 9.7 (an optional perspective from differential geometry) Fix a stratum $\mathcal{S}$ and let $\mathcal{S}_*$ denote the submanifold of points in $\mathcal{S}$ whose coordinates are non-zero. As in Remark 5.9 of Chapter 5, we may view the height function $h$ as a smooth map of manifolds from $\mathcal{S}_*$ to $\mathbb{R}$. Then the 'critical points' in the usual sense of differential geometry (points where the differential of $h$ restricted to $\mathcal{S}_*$ vanishes) match our definition of critical points. See Section 9.3 below for more details.
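Since the critical point equations of Definition 9.7 are solved one stratum at a time, a computation typically starts by listing the index sets $S$ that can define a non-empty stratum. The following sketch only enumerates the candidates with $|S| \leq d$ together with the dimension $d - |S|$; it does not decide whether a given stratum is actually non-empty, and the function name is a hypothetical one chosen for this illustration.

```python
from itertools import combinations


def candidate_strata(m, d):
    """Yield (S, dimension) for every non-empty S in {1..m} with |S| <= d."""
    for size in range(1, min(m, d) + 1):
        for subset in combinations(range(1, m + 1), size):
            yield subset, d - size


# Three factors in three variables, as for the denominator H1*H2*H3 of Example 9.3.
strata = list(candidate_strata(m=3, d=3))
for subset, dimension in strata:
    print(subset, dimension)
```

For three factors in three variables this produces the seven index sets that index the strata of the singular variety in the lattice path example.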

Example 9.3 Continued (Lattice Path Critical Points) Consider again the lattice path model in $\mathbb{N}^2$ defined by the set of steps $A = \{(-1, 1), (0, -1), (1, 1)\}$. As noted above, the generating function of this model is the main diagonal of a rational function whose denominator has a transverse polynomial factorization with factors $H_1 = 1 - t(1 + x^2 + xy^2)$, $H_2 = 1 - y$, and $H_3 = 1 + x^2$. To determine critical points in the $r = (1, 1, 1)$ direction

• on the stratum $\mathcal{S}_1$, we build the matrix

$$N = \begin{pmatrix} -\nabla_{\log} H_1 \\ \mathbf{1} \end{pmatrix} = \begin{pmatrix} tx(y^2 + 2x) & 2txy^2 & t(1 + x^2 + xy^2) \\ 1 & 1 & 1 \end{pmatrix}$$

and determine the points in $\mathcal{S}_1$ where $H_1$ and the maximal minors of $N$ vanish. This recovers the smooth critical point equations from Chapter 5,

$$H_1 = 0, \qquad x(\partial H_1/\partial x) = y(\partial H_1/\partial y) = t(\partial H_1/\partial t),$$

subject to the condition $(1 - y)(1 + x^2) \neq 0$. There are four solutions, given by $(\omega^2, \omega\sqrt{2}, 1/4)$ for $\omega \in \{\pm 1, \pm i\}$, which are all smooth critical points. None of these critical points are minimal, as they have $y$-coordinate of modulus $\sqrt{2}$ and the denominator of $F(x, y, t)$ contains $H_2 = 1 - y$ as a factor.

• on the stratum $\mathcal{S}_{1,2}$, we compute the matrix

$$N = \begin{pmatrix} -\nabla_{\log} H_1 \\ -\nabla_{\log} H_2 \\ \mathbf{1} \end{pmatrix} = \begin{pmatrix} tx(y^2 + 2x) & 2txy^2 & t(1 + x^2 + xy^2) \\ 0 & y & 0 \\ 1 & 1 & 1 \end{pmatrix}$$

and solve $H_1 = H_2 = \det N = 0$ subject to $1 + x^2 \neq 0$. This gives two critical points, $\sigma = (1, 1, 1/3)$ and $(-1, 1, 1)$, of which the second is not minimal since it has larger coordinate-wise modulus than $\sigma$.

• on the stratum $\mathcal{S}_{1,3}$, we compute the matrix

$$N = \begin{pmatrix} -\nabla_{\log} H_1 \\ -\nabla_{\log} H_3 \\ \mathbf{1} \end{pmatrix} = \begin{pmatrix} tx(y^2 + 2x) & 2txy^2 & t(1 + x^2 + xy^2) \\ -2x^2 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}$$

and solve $H_1 = H_3 = \det N = 0$ subject to $1 - y \neq 0$. This system of equations has no solutions, so $\mathcal{S}_{1,3}$ contains no critical points.

• on the stratum $\mathcal{S}_{1,2,3}$, we note that both points $(i, 1, -i)$ and $(-i, 1, i)$ in the stratum are critical points by definition, but neither is minimal.

• on the strata $\mathcal{S}_2$, $\mathcal{S}_3$, and $\mathcal{S}_{2,3}$, we note that $H_2 = 1 - y$ and $H_3 = 1 + x^2$ are independent of the variable $t$, so $r = (1, 1, 1)$ can never be in the span of $\nabla_{\log} H_2$ and $\nabla_{\log} H_3$, and there are no critical points (alternatively one can compute matrix minors as above to obtain systems with no solutions).

Problem 9.2 below asks the reader to prove that $\sigma = (1, 1, 1/3)$ is a minimal point, but $\sigma$ is not strictly minimal since any point $(x, 1, t)$ with $|x| = 1$ and $|t| = 1/3$ lies in $V(H)$ and has the same coordinate-wise modulus as $\sigma$.
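The computation on the stratum defined by $H_1$ and $H_2$ can be checked with exact arithmetic. The sketch below hard-codes the $3 \times 3$ matrix for this stratum in the direction $r = (1, 1, 1)$ and tests the critical point equations (9.4) at given rational points; it verifies candidate points rather than solving the system, and the function names are hypothetical helpers.

```python
from fractions import Fraction


def H1(x, y, t):
    return 1 - t * (1 + x**2 + x * y**2)


def H2(x, y, t):
    return 1 - y


def H3(x, y, t):
    return 1 + x**2


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def N12(x, y, t):
    """Rows: -grad_log H1, -grad_log H2, and the direction r = (1, 1, 1)."""
    return [
        [t * x * (y**2 + 2 * x), 2 * t * x * y**2, t * (1 + x**2 + x * y**2)],
        [0, y, 0],
        [1, 1, 1],
    ]


def is_critical_on_stratum_12(point):
    on_flat = H1(*point) == 0 and H2(*point) == 0
    off_other_factor = H3(*point) != 0
    nonzero_coordinates = all(coordinate != 0 for coordinate in point)
    return on_flat and off_other_factor and nonzero_coordinates and det3(N12(*point)) == 0


sigma = (Fraction(1), Fraction(1), Fraction(1, 3))
second = (Fraction(-1), Fraction(1), Fraction(1))
generic = (Fraction(2), Fraction(1), Fraction(1, 7))  # on the flat, but not critical

print(is_critical_on_stratum_12(sigma))
print(is_critical_on_stratum_12(second))
print(is_critical_on_stratum_12(generic))
```

On this flat the determinant reduces to $yt(x^2 - 1)$, which explains why exactly the two points with $x = \pm 1$ are critical.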

As in previous chapters, not all critical points contribute to dominant asymptotics. To find the critical points that will appear in our analysis we introduce the following notion, generalizing Definition 5.5 in Chapter 5 and Definition 8.8 in Chapter 8.

Definition 9.8 (contributing points) Let $w \in \mathbb{C}_*^d$ be a transverse point of $H(z)$ and let $H(z) = u(z) H_1(z)^{p_1} \cdots H_s(z)^{p_s}$ be a square-free factorization of $H$ in $O_w$. For each $1 \leq j \leq s$ let $k_j \in \{1, \dots, d\}$ be an index such that the partial derivative $(\partial H_j/\partial z_{k_j})(w) \neq 0$. Proposition 3.6 in Chapter 3 implies the vector

$$v_j = \frac{(\nabla_{\log} H_j)(w)}{w_{k_j} (\partial H_j/\partial z_{k_j})(w)} = \left( \frac{w_1 (\partial H_j/\partial z_1)(w)}{w_{k_j} (\partial H_j/\partial z_{k_j})(w)}, \dots, \frac{w_d (\partial H_j/\partial z_d)(w)}{w_{k_j} (\partial H_j/\partial z_{k_j})(w)} \right)$$

has real coordinates. The normal cone of H at w is the set

$$N(w) = \left\{ \sum_{j=1}^{s} a_j v_j : a_j > 0 \right\} \subset \mathbb{R}^d.$$

The point $w$ is called a contributing singularity, or contributing point, for the direction $r \in \mathbb{R}_*^d$ if $r \in N(w)$. Since the gradients of the $H_j(z)$ at $z = w$ are linearly independent the vectors $v_j$ are also linearly independent, thus if $r = a_1 v_1 + \cdots + a_s v_s$ then the $a_j$ are uniquely determined.

Remark 9.8 If $w$ is a contributing point for the direction $r$ then $r$ is a non-trivial linear combination of the logarithmic gradients $(\nabla_{\log} H_j)(w)$, so every contributing point is critical. Furthermore, a check of the definitions shows that for the power series expansion of rational $F(z)$ any smooth minimal critical point is contributing (as should be the case since Definition 9.8 generalizes Definition 5.5 of Chapter 5).

Example 9.3 Continued (Lattice Path Contributing Points) In our running lattice path example the denominator factors $H_1 = 1 - t(1 + x^2 + xy^2)$ and $H_2 = 1 - y$ which vanish at the minimal critical point $\sigma = (1, 1, 1/3)$ have logarithmic gradients $(\nabla_{\log} H_1)(\sigma) = -(1, 2/3, 1)$ and $(\nabla_{\log} H_2)(\sigma) = -(0, 1, 0)$. Thus, the normal cone at $\sigma$ is

$$N(\sigma) = \left\{ a \left(1, \tfrac{2}{3}, 1\right) + b\,(0, 1, 0) : a, b > 0 \right\}.$$

Since $(1, 1, 1) \in N(\sigma)$, we see that $\sigma$ is a contributing point for the direction $r = \mathbf{1}$.
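Because the generators of a normal cone are linearly independent, membership of a direction can be decided by solving one exact linear system and checking the signs of the coefficients. The following is a minimal sketch with hypothetical helper names; it assumes the generators $v_j$ have already been computed and have rational coordinates.

```python
from fractions import Fraction


def cone_coordinates(vectors, r):
    """Solve r = sum_j a_j * vectors[j] exactly; return the a_j, or None if r is not in the span.

    Assumes the given vectors are linearly independent.
    """
    s, d = len(vectors), len(r)
    # One row per coordinate: the s generator entries followed by the entry of r.
    rows = [[Fraction(vectors[j][i]) for j in range(s)] + [Fraction(r[i])] for i in range(d)]
    for col in range(s):
        pivot = next((i for i in range(col, d) if rows[i][col] != 0), None)
        if pivot is None:
            raise ValueError("generators are linearly dependent")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [value / rows[col][col] for value in rows[col]]
        for i in range(d):
            if i != col and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[col])]
    if any(row[-1] != 0 for row in rows[s:]):
        return None
    return [rows[j][-1] for j in range(s)]


def in_normal_cone(vectors, r):
    coefficients = cone_coordinates(vectors, r)
    return coefficients is not None and all(a > 0 for a in coefficients)


generators = [(1, Fraction(2, 3), 1), (0, 1, 0)]   # generators of N(sigma) from the example
print(cone_coordinates(generators, (1, 1, 1)))
print(in_normal_cone(generators, (1, 1, 1)))
```

For the example the solver returns $a = 1$ and $b = 1/3$, both positive, which is the explicit certificate that the main diagonal direction lies in the cone.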

Example 9.6 (Another Quadrant Lattice Path Model) If $B = \{(1, -1), (-1, -1), (0, 1)\}$ then Theorem 4.2 in Chapter 4 implies that the generating function for the number of walks in $\mathbb{N}^2$ which begin at the origin and use the steps in $B$ is the main diagonal of the rational function

$$F(x, y, t) = \frac{(x - x^2y^2 - y^2)(1 + x)}{x\,(1 - t(x + y^2 + x^2y^2))(1 - y)}.$$

Although we develop ACSV for general Laurent expansions, here we can stick to a power series expansion by noting that the $n$th main diagonal coefficient of $F(x, y, t)$ is the $(n + 1)$st main diagonal coefficient of

$$E(x, y, t) = xyt\,F(x, y, t) = \frac{(x - x^2y^2 - y^2)(1 + x)\,yt}{(1 - t(x + y^2 + x^2y^2))(1 - y)}. \tag{9.5}$$

The denominator $H(x, y, t)$ of $E(x, y, t)$ has the transverse polynomial factorization $H = H_1 H_2$ where $H_1(x, y, t) = 1 - t(x + y^2 + x^2y^2)$ and $H_2(x, y, t) = 1 - y$. Solving the critical point equations on each stratum shows that $\mathcal{S}_1 = V(H_1) \setminus V(H_2)$ has four critical points in the main diagonal direction,

$$\sigma_\omega = \left( \omega^2, \frac{\omega}{\sqrt{2}}, \frac{\omega^2}{2} \right), \qquad \omega \in \{1, -1, i, -i\},$$

which are finitely minimal points where the singular variety is smooth. The stratum $\mathcal{S}_2 = V(H_2) \setminus V(H_1)$ contains no critical points, but the stratum $\mathcal{S}_{1,2} = V(H_1, H_2)$ contains two critical points: $\tau = (1, 1, 1/3)$, which is minimal, and $(-1, 1, 1)$, which is not minimal. Note, in particular, that $\tau$ and the $\sigma_\omega$ are minimal critical points but they do not all have the same coordinate-wise modulus. Because we are considering a power series expansion any smooth minimal critical point is contributing, so the $\sigma_\omega$ are all contributing points (the logarithmic gradient of $H_1$ at each point is a non-zero multiple of $\mathbf{1}$). Computing the logarithmic gradients of $H_1$ and $H_2$ at $\tau$ shows

$$N(\tau) = \left\{ a \left(1, \tfrac{4}{3}, 1\right) + b\,(0, 1, 0) : a, b > 0 \right\}$$

so $(1, 1, 1) \notin N(\tau)$ and $\tau$ is not a contributing point for $r = \mathbf{1}$.

Because there are finitely minimal smooth critical points, the non-smooth points of $E(x, y, t)$ in (9.5) are irrelevant. In particular, an asymptotic expansion for the number $w_n$ of walks in $\mathbb{N}^2$ using the steps in $B$ can be computed using Corollary 5.2 of Chapter 5. Adding the asymptotic contributions of each $\sigma_\omega$ to the main diagonal of $E$, then shifting the index of $n$ by one to return to the diagonal of $F$, gives

$$w_n = \frac{(2\sqrt{2})^n}{\pi n^2} \left( A_n + O\!\left(\frac{1}{n}\right) \right), \qquad A_n = \begin{cases} 12\sqrt{2} & : n \text{ even} \\ 32 & : n \text{ odd.} \end{cases}$$

Note that the numerator of $E(x, y, t)$ vanishes at each $\sigma_\omega$, so the second order term in the expansion (5.27) must be calculated for each point, and that the leading constant has a periodicity coming from the different contributions of the $\sigma_\omega$.
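The walk counts in this example are easy to generate directly, which gives an independent check on the exponential growth rate. The sketch below counts walks by dynamic programming over end points; the function name is a hypothetical helper, and the only asymptotic feature examined is that the ratio $w_n/w_{n-2}$, corrected for the factor $n^{-2}$, should be close to $(2\sqrt{2})^2 = 8$ if the expansion above holds. Stepping by two avoids the periodicity of the leading constant.

```python
def count_quadrant_walks(steps, length):
    """Return [w_0, ..., w_length]: walks from (0, 0) with the given steps staying in N^2."""
    counts = {(0, 0): 1}
    totals = [1]
    for _ in range(length):
        following = {}
        for (x, y), ways in counts.items():
            for dx, dy in steps:
                nx, ny = x + dx, y + dy
                if nx >= 0 and ny >= 0:
                    following[(nx, ny)] = following.get((nx, ny), 0) + ways
        counts = following
        totals.append(sum(counts.values()))
    return totals


B = [(1, -1), (-1, -1), (0, 1)]
walks = count_quadrant_walks(B, 100)
print(walks[:8])

# Compare walks of lengths 100 and 98, removing the polynomial factor n^(-2).
corrected_ratio = walks[100] / walks[98] * (100 / 98) ** 2
print(corrected_ratio)
```

The exact counts grow quickly, so Python's arbitrary-precision integers are used throughout and only the final ratio is converted to a float.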

Minimal contributing points are important because, as in the smooth case, they are minimizers of the height function on the closure of the domain of convergence. The fastest way to see this is to recall the discussion of polynomial amoebas and the Relog map from Section 3.3.1 in Chapter 3.

Proposition 9.1 Let $w \in \mathbb{C}_*^d$ be a transverse point of $H(z) \in \mathbb{C}[z]$. If $w$ is a contributing point of $H$ for the direction $r$ and $w$ is minimal (i.e., $w \in \partial D$) then $w$ is a minimizer of the height function $h_r(z)$ on $\overline{D}$.

Proof Let $H(z) = u(z) H_1(z)^{p_1} \cdots H_s(z)^{p_s}$ be a square-free factorization of $H$ in $O_w$ and let $B = \mathrm{Relog}(D)$ be the component of $\mathrm{amoeba}(H)^c$ corresponding to the power series expansion of $F(z)$. Near $\mathrm{Relog}(w)$ the amoeba of $H$ contains (at least) the zeroes of the $H_j(z)$ under the Relog map. Proposition 3.6 from Chapter 3 implies that the vectors $v_j$ in Definition 9.8 are outward normals for support hyperplanes of $B$ at $\mathrm{Relog}(w)$ (where Remark 3.2 covers the case when the $H_j$ are analytic at $w$ but not polynomials). Thus, as illustrated in Figure 9.2, any element of $N(w)$ is the outward normal to a support hyperplane of $B$ at $\mathrm{Relog}(w)$. Since $h_r$ becomes the linear function $\tilde{h}(x) = -r \cdot x$ after taking the Relog map, this implies $\mathrm{Relog}(w)$ is a minimizer of $\tilde{h}$ on the convex set $\overline{B}$, and thus $w$ is a minimizer of $h_r$ on $\overline{D}$. □

Example 9.4 Continued (Lemniscate Contributing Points) Recall the lemniscate $L(x, y) = (1/3 - x)^2 (1/2 - y)^2 + (5/6 - x - y)(11/6 - 4x - y)$ from above, whose zero set $V(L)$ contains a single non-smooth point $\rho = (1/3, 1/2)$. We have seen that the leading homogeneous term of $L$ at $\rho$ factors as $\ell(x, y) = (x + y)(4x + y)$ and there is a factorization $L = L_1 L_2$ in $O_\rho$ where

$$(\nabla_{\log} L_1)(\rho) = (\rho_1, \rho_2) = (1/3, 1/2), \qquad (\nabla_{\log} L_2)(\rho) = (4\rho_1, \rho_2) = (4/3, 1/2).$$

This gives a normal cone

$$N(\rho) = \left\{ a \left(\tfrac{2}{3}, 1\right) + b \left(\tfrac{8}{3}, 1\right) : a, b > 0 \right\};$$

see Figure 9.2. The vector $r \in \mathbb{R}^2_{>0}$ lies in $N(\rho)$ if and only if $r_1/r_2 \in (2/3, 8/3)$, which gives the directions where $\rho$ is a contributing singularity. The rational function $1/L(x, y)$ also has smooth critical points, whose coordinates are degree six algebraic functions in the coordinates of $r \in \mathbb{R}^d_{>0}$. There will be a minimal smooth critical point only when $\rho$ is not contributing (otherwise, since $\rho$ is strictly minimal, there would be two minimal contributing points with different coordinate-wise moduli, which cannot occur since the boundary of $\mathrm{amoeba}(L)$ does not contain a line segment).

Fig. 9.2 Left: The amoeba of the polynomial $L(x, y)$ with the component $B \subset \mathrm{amoeba}(L)^c$ corresponding to the power series expansion of $1/L(x, y)$ labelled. The image of the multiple point $\rho = (1/3, 1/2)$ under the Relog map, where $\partial B$ has a cusp, is circled. Right: A portion of the contour of $L(x, y)$ with the normal cone $N(\rho)$ displayed. The normals to the two smooth curves making up the contour near $\mathrm{Relog}(\rho)$, which form the boundary of $N(\rho)$, are the logarithmic gradients of the factors of $L$ in the local ring $O_\rho$.

Example 9.7 (Lattice Path Amoebas) Visualizations of the amoebas of the lattice path models discussed above, together with two-dimensional projections of the relevant critical points and logarithmic gradients, are given in Figures 9.3 and 9.4.

We end this section by discussing general (non-power series) Laurent expansions. The only concept above which does not immediately apply to general Laurent expansions is Definition 9.8 for contributing points, but Proposition 3.13 from Chapter 3 allows for a natural extension.

Definition 9.9 (contributing points for Laurent expansions) Let $w \in \mathbb{C}_*^d$ be a transverse point of $H(z)$, and let $H(z) = u(z) H_1(z)^{p_1} \cdots H_s(z)^{p_s}$ be a square-free factorization of $H$ in $O_w$. Consider a Laurent expansion of $F(z) = G(z)/H(z)$ with domain of convergence $D$ and for each $1 \leq j \leq s$ let $v_j$ be a real vector which points away from $\mathrm{Relog}(D)$ such that $(\nabla_{\log} H_j)(w) = \tau v_j$ for some $\tau \in \mathbb{C}$ (the existence of such $v_j$ follows from Proposition 3.13). The normal cone $N(w)$ of $H$ at $w$ with respect to this Laurent expansion is the positive span of the vectors $v_1, \dots, v_s$. The point $w$ is called a contributing singularity, or contributing point, for this Laurent expansion in the direction $r \in \mathbb{R}_*^d$ if $r \in N(w)$.

Remark 9.9 If $D$ is the power series domain of convergence of rational $F(z)$ then Proposition 3.12 in Chapter 3 implies the convex set $\mathrm{Relog}(D)$ contains some translate of the negative orthant of $\mathbb{R}^d$. Thus, any normal vector to a supporting hyperplane of the boundary of $\mathrm{Relog}(D)$ which contains a positive coordinate will point away from $\mathrm{Relog}(D)$, verifying that Definition 9.9 is a generalization of Definition 9.8.

Fig. 9.3 Left: A portion of the amoeba contour for the denominator of the rational function corresponding to the quadrant model with step set $A$, with the power series component $B$ of the amoeba complement shaded. Right: The projection of this plot onto the plane $\log(x) = 1$, with the projections of the critical points and logarithmic gradients displayed. The vector $r = \mathbf{1}$ is in the normal cone at the minimal critical point, which is a transverse point. The smooth critical point is not minimal, and does not lie on the boundary of $B$.

Fig. 9.4 Left: A portion of the amoeba contour for the denominator of the rational function corresponding to the quadrant model with step set $B$, with the power series component of the amoeba complement shaded. Right: The projection of this plot onto the plane $\log(x) = 1$, with the projections of the critical points and logarithmic gradients displayed. The logarithmic gradient at the smooth critical point is a multiple of $r = \mathbf{1}$, and this is a contributing singularity. The vector $r = \mathbf{1}$ does not lie in the normal cone at the non-smooth minimal critical point.

We are now ready to determine coefficient asymptotics.

9.2.2 Asymptotics via Residue Forms

Asymptotics for transverse points are determined by extending the approach of Chapter 8, where the singular variety is a hyperplane arrangement, to transverse points, where the singular variety locally looks like a hyperplane arrangement. Unfortunately, the most natural way to generalize our previous arguments uses topological concepts like homology which are outside the scope of this text. We thus sketch proofs in this section and refer to Pemantle and Wilson [26] for full details.

We begin by introducing some constructions necessary to state our results. Suppose $w$ is a transverse point of $H(z)$ such that there is a square-free factorization $H(z) = u(z) H_1(z)^{p_1} \cdots H_s(z)^{p_s}$ in $O_w$, and let $\mathcal{S} = V(H_1) \cap \cdots \cap V(H_s) \cap \mathcal{N}$ for any neighbourhood $\mathcal{N}$ of $w$ in $\mathbb{C}^d$ on which all $H_j$ are defined. Because the gradients of the $H_j$ are linearly independent at $w$, there exists a set $P = \{\pi_1, \dots, \pi_{d-s}\}$ of $d - s$ distinct coordinates which analytically parametrize the remaining $s$ coordinates of $\mathcal{S}$. In other words, for each $j \notin P$ there exists an analytic function $\zeta_j(z_{\pi_1}, \dots, z_{\pi_{d-s}})$ such that $z \in \mathcal{S}$ if and only if $z_j = \zeta_j(z_{\pi_1}, \dots, z_{\pi_{d-s}})$ for all $j \notin P$.

Definition 9.10 (constructions for transverse points) Under the setup of the preceding paragraph, we call $P$ a set of parameterizing coordinates for $V(H)$ at $w$, and the set of functions $\{\zeta_j : j \notin P\}$ a local parameterization of $V(H)$ at $w$. The modified log-normal matrix of $H$ at $w$ is the $d \times d$ non-singular matrix

$$\Gamma_w = \begin{pmatrix} (\nabla_{\log} H_1)(w) \\ \vdots \\ (\nabla_{\log} H_s)(w) \\ w_{\pi_1} e^{(\pi_1)} \\ \vdots \\ w_{\pi_{d-s}} e^{(\pi_{d-s})} \end{pmatrix},$$

where $e^{(j)}$ is the $j$th elementary basis vector. Finally, define the function

$$g(\theta_1, \dots, \theta_{d-s}) = \sum_{j \notin P} r_j \log\Bigl[ \zeta_j\bigl( w_{\pi_1} e^{i\theta_1}, \dots, w_{\pi_{d-s}} e^{i\theta_{d-s}} \bigr) \Bigr].$$

The parameterized Hessian matrix of $H$ at $w$ is the $(d - s) \times (d - s)$ Hessian matrix of $g$ at $\theta = 0$, written $Q_w$. If $Q_w$ has non-zero determinant then we call $w$ nondegenerate.

As in Chapter 8, we can be most precise about asymptotics at a transverse point where the number of factors of $H$ in the local ring equals the number of variables. Recall from previous chapters the notation $T(x)$ for the points of $\mathbb{C}^d$ with the same coordinate-wise modulus as $x$. Given $p \in \mathbb{N}^s$ we write $(p - 1)! = (p_1 - 1)! \cdots (p_s - 1)!$.

Theorem 9.1 Fix a Laurent expansion of rational $F(z) = G(z)/H(z)$ with domain of convergence $D$ and let $r \in \mathbb{R}_*^d$. Suppose $x \in \partial D$ minimizes $|z^{-r}|$ on $\overline{D}$, all minimizers of $|z^{-r}|$ on $\overline{D}$ lie in $T(x)$, and all elements of $V \cap T(x)$ are transverse points. Suppose also that any critical point $z$ in $T(x)$ has $r \notin \partial N(z)$, and that the set

$$E = \{ z \in T(x) : z \text{ is a contributing point for } r \} = \{w\}$$

contains a single point. If $H$ has square-free factorization $H(z) = u(z) H_1(z)^{p_1} \cdots H_d(z)^{p_d}$ in $O_w$ then there exists $0 < \tau < |w^{-r}|$ such that

$$f_{nr} = w^{-nr} \, n^{p_1 + \cdots + p_d - d} \, \frac{G(w) \, \bigl(r\Gamma_w^{-1}\bigr)^{p-1}}{u(w) \, |\det \Gamma_w| \, (p - 1)!} + O(\tau^n), \tag{9.6}$$

where $\Gamma_w$ is the modified log-normal matrix from Definition 9.10.

Proof (sketch) As always, our asymptotic analysis starts with the Cauchy integral representation (9.2). Because all minimizers of $|z^{-r}|$ on $\overline{D}$ have the same coordinate-wise modulus as $w$, and $w$ is the only contributing singularity in $T(w)$, the Cauchy domain of integration can be deformed so that the only points of asymptotic interest are arbitrarily close to $w$. Near $w$ the image of the singular variety $V$ under the coordinate-wise logarithm map looks like a complete intersection of a simple hyperplane arrangement. Because $w$ is a contributing point, it seems reasonable to think our asymptotic arguments from Section 8.3 of Chapter 8 would apply. Indeed, Proposition 10.3.6 and Theorem 10.2.6 of Pemantle and Wilson [26] use the language of relative homology to show that the coefficients $f_{nr}$ can be represented by an integral over a product of circles around $\mathcal{S} = V(H_1) \cap \cdots \cap V(H_d)$ (generalizing Proposition 8.3 of Chapter 8). Theorem 10.3.1 of [26] then shows how to use the theory of multivariate Leray residues to compute the expansion (9.6). Our work in Section 8.3 of Chapter 8 can be seen as an explicit development of this theory for the restricted case when $H$ factors into real linear polynomials, so that linear algebra techniques can replace homological methods. □

Remark 9.10 As was the case for hyperplane arrangements, when the number of local factors equals the number of variables and the numerator $G(w) \neq 0$ then we obtain dominant asymptotics up to an exponentially lower term.

Example 9.4 Continued (Asymptotics of the Lemniscate) Recall again the lemniscate

$$L(x, y) = \left(\tfrac{1}{3} - x\right)^2 \left(\tfrac{1}{2} - y\right)^2 + \left(\tfrac{5}{6} - x - y\right)\left(\tfrac{11}{6} - 4x - y\right)$$

with strictly minimal critical point $\rho = (1/3, 1/2)$ such that $L$ factors in $O_\rho$ as $L = L_1 L_2$ where $(\nabla_{\log} L_1)(\rho) = (1/3, 1/2)$ and $(\nabla_{\log} L_2)(\rho) = (4/3, 1/2)$. The matrix

$$\Gamma_\rho = \begin{pmatrix} 1/3 & 1/2 \\ 4/3 & 1/2 \end{pmatrix}$$

has determinant $-1/2$ and the normal cone at $\rho$ equals

$$N(\rho) = \left\{ a \left(\tfrac{2}{3}, 1\right) + b \left(\tfrac{8}{3}, 1\right) : a, b > 0 \right\}.$$

Because $\rho$ is strictly minimal and the boundary of $\mathrm{amoeba}(L)$ doesn't contain a line segment, for any direction $(a, b) \in N(\rho)$ the point $\rho$ is the unique minimizer of $|x^{-a} y^{-b}|$ on the boundary of the power series domain of convergence of $1/L(x, y)$. Thus, if $(a, b) \in N(\rho)$ then Theorem 9.1 implies

$$[x^{an} y^{bn}] \, L(x, y)^{-1} = 2 \left(3^a 2^b\right)^n + O(\tau^n),$$

for some $0 < \tau < 3^a 2^b$.
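The prediction of Theorem 9.1 can be compared against exact series coefficients. The sketch below expands the lemniscate as a polynomial and computes the coefficients of $1/L$ from the relation $L \cdot (1/L) = 1$, then prints the ratio of the diagonal coefficient to $2 \cdot 6^n$ for the direction $(a, b) = (1, 1)$, which lies in the normal cone since $1 \in (2/3, 8/3)$. The helper names are hypothetical, and no numerical output is asserted here: if the hypotheses of the theorem hold as stated, the printed ratios should approach 1.

```python
from fractions import Fraction as Fr


def poly_mul(p, q):
    """Multiply bivariate polynomials stored as {(deg_x, deg_y): coefficient}."""
    out = {}
    for (a, b), c in p.items():
        for (e, f), g in q.items():
            out[(a + e, b + f)] = out.get((a + e, b + f), 0) + c * g
    return out


def poly_add(p, q):
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + c
    return out


def reciprocal_coefficients(poly, size):
    """Coefficients f[(i, j)], 0 <= i, j <= size, of the power series 1/poly."""
    f = {}
    for i in range(size + 1):
        for j in range(size + 1):
            total = Fr(1) if (i, j) == (0, 0) else Fr(0)
            for (a, b), c in poly.items():
                if (a, b) != (0, 0) and a <= i and b <= j:
                    total -= c * f[(i - a, j - b)]
            f[(i, j)] = total / poly[(0, 0)]
    return f


u = {(0, 0): Fr(1, 3), (1, 0): Fr(-1)}                    # 1/3 - x
v = {(0, 0): Fr(1, 2), (0, 1): Fr(-1)}                    # 1/2 - y
p = {(0, 0): Fr(5, 6), (1, 0): Fr(-1), (0, 1): Fr(-1)}    # 5/6 - x - y
q = {(0, 0): Fr(11, 6), (1, 0): Fr(-4), (0, 1): Fr(-1)}   # 11/6 - 4x - y
L = poly_add(poly_mul(poly_mul(u, u), poly_mul(v, v)), poly_mul(p, q))

coeffs = reciprocal_coefficients(L, 12)
for n in (4, 8, 12):
    # Exact diagonal coefficient divided by the predicted leading term 2 * (3 * 2)^n.
    print(n, float(coeffs[(n, n)] / (2 * 6 ** n)))
```

Exact rational arithmetic is used so that the comparison is not affected by rounding; only the final ratio is converted to a float for display.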

When asymptotics are determined by a transverse point where the number of factors of $H$ in the local ring is less than the number of variables we still determine dominant asymptotics, but the error term obtained is polynomially small instead of exponentially small.

Theorem 9.2 Fix a Laurent expansion of rational $F(z) = G(z)/H(z)$ with domain of convergence $D$ and let $r \in \mathbb{R}_*^d$. Suppose $x \in \partial D$ minimizes $|z^{-r}|$ on $\overline{D}$, all minimizers of $|z^{-r}|$ on $\overline{D}$ lie in $T(x)$, and all elements of $V \cap T(x)$ are transverse points. Suppose also that any critical point $z$ in $T(x)$ has $r \notin \partial N(z)$, and that the set

$$E = \{ z \in T(x) : z \text{ is a contributing point for } r \} = \{w\}$$

contains a single point, which is nondegenerate. If $H$ has a square-free factorization $H(z) = u(z) H_1(z)^{p_1} \cdots H_s(z)^{p_s}$ in $O_w$ then there exists $0 < \tau < |w^{-r}|$ such that

$$f_{nr} = w^{-nr} \, n^{(s-d)/2 + p_1 + \cdots + p_s - s} \left( \frac{(2\pi)^{(s-d)/2} \, G(w) \, \bigl(r\Gamma_w^{-1}\bigr)^{p-1}}{u(w) \sqrt{\det(r_d Q_w)} \, |\det \Gamma_w| \, (p - 1)!} + O\!\left(\frac{1}{n}\right) \right), \tag{9.7}$$

where $\Gamma_w$ and $Q_w$ are defined in Definition 9.10.

Proof (sketch) As in the proof sketch of Theorem 9.1, we can manipulate the Cauchy integral representation (9.2) into an integral arbitrarily close to $w$. Now, Proposition 10.3.6 and Theorem 10.2.6 of [26] show that the coefficients $f_{nr}$ can be represented by an integral over the product of $s$ circles around $\mathcal{S} = V(H_1) \cap \cdots \cap V(H_s)$ with a $(d - s)$-dimensional domain of integration (similar to Proposition 8.3 of Chapter 8). Integrating over the product of $s$ circles leaves a $(d - s)$-dimensional integral over a subset of $\mathcal{S}$ whose integrand is a 'multivariate residue form' in the parametrizing variables $z_{\pi_1}, \dots, z_{\pi_{d-s}}$ of Definition 9.10 (generalizing Proposition 8.4 of Chapter 8). Theorem 10.3.4 of Pemantle and Wilson [26] shows how to massage this integral into something amenable to a saddle-point analysis, which is then asymptotically approximated. □

Remark 9.11 Although Theorem 9.2 applies when the number of factors $s = d$, in this case it is weaker than Theorem 9.1. When asymptotics are determined by minimal critical points, Theorems 9.1 and 9.2 generalize Theorem 8.2 from Chapter 8, which applies when the factors $H_j$ are linear. Unlike Theorem 8.2, however, Theorems 9.1 and 9.2 only apply to minimal points. Dealing with non-minimal critical points is a large open problem in analytic combinatorics in several variables, discussed further in Section 9.3 below.

Example 9.4 Continued (Lattice Path Asymptotics) Consider again the lattice path model in $\mathbb{N}^2$ defined by the step set $A = \{(-1,1), (0,-1), (1,1)\}$, whose generating function is the main diagonal of

$$ F(x,y,t) = \frac{(1+x)(1 - xy^2 + x^2)}{\left(1 - t(1 + x^2 + xy^2)\right)(1-y)(1+x^2)}. $$

Above we have seen that $F$ admits a minimal contributing point $\sigma = (1, 1, 1/3)$ in the direction $\mathbf{r} = \mathbf{1}$, which is thus a minimizer of $|xyt|^{-1}$ on $\overline{\mathcal{D}}$. Problem 9.3 asks you to prove that all minimizers of $|xyt|^{-1}$ have the same coordinate-wise modulus as $\sigma$ (this also follows from our general argument in Chapter 10). We have already calculated the logarithmic gradients

$$ (\nabla_{\log} H_1)(\sigma) = (-1, -2/3, -1) \quad\text{and}\quad (\nabla_{\log} H_2)(\sigma) = (0, -1, 0), $$

and noted that the only critical point of $T(\sigma)$ is $\sigma$ itself. Furthermore, on the flat $\mathcal{V}_{1,2} = \mathcal{V}\left(1 - y,\; 1 - t(1 + x^2 + xy^2)\right)$ containing $\sigma$ we can parametrize $y$ and $t$ by their $x$-coordinates as $y = 1$ and $t = 1/(1 + x + x^2)$, giving

$$ g(\theta) = \log(1) + \log\left(\frac{1}{1 + e^{i\theta} + e^{2i\theta}}\right) = \log\left(\frac{1}{1 + e^{i\theta} + e^{2i\theta}}\right) $$

and $Q = g''(0) = 2/3$. Since

$$ \Gamma_\sigma = \begin{pmatrix} (\nabla_{\log} H_1)(\sigma) \\ (\nabla_{\log} H_2)(\sigma) \\ 1 \quad 0 \quad 0 \end{pmatrix} = \begin{pmatrix} -1 & -2/3 & -1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{pmatrix}, $$

Theorem 9.2 implies the number $w_n$ of walks of length $n$ on the steps $A$ which start at the origin and stay in $\mathbb{N}^2$ satisfies

$$ w_n = 3^n\, n^{-1/2}\, \frac{\sqrt{3}}{2\sqrt{\pi}} \left( 1 + O\!\left(\frac{1}{n}\right) \right). $$
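The constants entering this application of Theorem 9.2 can be recomputed symbolically. The following is a minimal sketch using SymPy, assuming the data of the example as stated above ($G$, $u = 1 + x^2$, $H_1 = 1 - t(1+x^2+xy^2)$, $H_2 = 1 - y$, and the third row $(1, 0, 0)$ of $\Gamma_\sigma$). The helper names `log_gradient` and `count_walks` are hypothetical and introduced only for illustration; `count_walks` enumerates the walks directly by dynamic programming so that exact counts can be compared with the asymptotic formula.

```python
import sympy as sp

x, y, t, theta = sp.symbols('x y t theta')

# Data of the example as stated in the text.
G = (1 + x) * (1 - x*y**2 + x**2)
u = 1 + x**2                       # factor of H that does not vanish at sigma
H1 = 1 - t*(1 + x**2 + x*y**2)
H2 = 1 - y
sigma = {x: 1, y: 1, t: sp.Rational(1, 3)}
d, s = 3, 2                        # three variables, two factors vanishing at sigma


def log_gradient(H):
    """Logarithmic gradient (x*H_x, y*H_y, t*H_t) evaluated at sigma."""
    return [(v * sp.diff(H, v)).subs(sigma) for v in (x, y, t)]


Gamma = sp.Matrix([log_gradient(H1), log_gradient(H2), [1, 0, 0]])

# On the flat y = 1 and t = 1/(1 + x + x^2); set x = exp(i*theta), differentiate twice.
g = sp.log(1 / (1 + sp.exp(sp.I*theta) + sp.exp(2*sp.I*theta)))
Q = sp.simplify(sp.diff(g, theta, 2).subs(theta, 0))

# Leading constant of (9.7) with p = (1, 1) and r_d = 1.
constant = sp.simplify((2*sp.pi)**sp.Rational(s - d, 2) * G.subs(sigma)
                       / (u.subs(sigma) * sp.sqrt(Q) * sp.Abs(Gamma.det())))


def count_walks(n):
    """Number of n-step walks on A from the origin that stay in the quarter plane."""
    steps = [(-1, 1), (0, -1), (1, 1)]
    current = {(0, 0): 1}
    for _ in range(n):
        following = {}
        for (i, j), ways in current.items():
            for dx, dy in steps:
                a, b = i + dx, j + dy
                if a >= 0 and b >= 0:
                    following[(a, b)] = following.get((a, b), 0) + ways
        current = following
    return sum(current.values())


if __name__ == "__main__":
    n = 200
    approximation = float(constant) * 3.0**n / n**0.5
    print(Q, Gamma.det(), constant)
    print(count_walks(n) / approximation)  # ratio of exact count to leading term
```

The ratio printed at the end gives an informal check of the $O(1/n)$ relative error; it is not a substitute for the hypotheses of the theorem.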

As usual, if there are a finite number of points determining asymptotics then we can add the contributions from each of the points.


Corollary 9.1 If the set $E$ in Theorems 9.1 and 9.2 contains a finite set of points, all of which satisfy the conditions of Theorems 9.1 or 9.2, then one can sum the contributions of each element of $E$ given by (9.6) and (9.7) to find asymptotics of $f_{n\mathbf{r}}$.

We apply Theorem 9.2 and Corollary 9.1 to a variety of lattice path enumeration problems in Chapter 10.

9.3 A Geometric Approach to ACSV

We end this chapter with a discussion of the most general aspects of ACSV, using geometric constructions to give a high-level view of the theory and clarifying some earlier remarks by putting our previous results in context. To discuss this approach we first need to introduce some additional concepts from differential geometry. The definitions presented here are tailored to our setting.

Definition 9.11 (smooth and complex manifolds) Let $M \subset \mathbb{C}^k$ for some positive integer $k$. A chart of dimension $s$ on $M$ is a pair $(U, \phi)$ consisting of an open subset $U \subset M$ and a homeomorphism² $\phi$ from $U$ onto an open subset of $\mathbb{R}^s$. An atlas of dimension $s$ for $M$ is a collection of charts $\{(U_\alpha, \phi_\alpha)\}$ of dimension $s$ such that every $p \in M$ is contained in at least one $U_\alpha$. If $M$ admits an atlas of dimension $s$ such that for any charts $(U, \phi)$ and $(V, \psi)$ the maps

$$ \phi \circ \psi^{-1} : \psi(U \cap V) \to \phi(U \cap V) \quad\text{and}\quad \psi \circ \phi^{-1} : \phi(U \cap V) \to \psi(U \cap V) $$

between subsets of $\mathbb{R}^s$ are smooth (all coordinates have derivatives of all orders) then $M$ is a smooth manifold of dimension $s$. By replacing $\mathbb{R}^s$ by $\mathbb{C}^s$ in the above constructions we can analogously define complex charts and complex atlases, and $M$ is a complex manifold of dimension $s$ if it admits a complex atlas of dimension $s$ such that for any charts $(U, \phi)$ and $(V, \psi)$ the maps $\phi \circ \psi^{-1}$ and $\psi \circ \phi^{-1}$ are analytic. Whenever we talk about charts in a manifold we always mean charts with respect to an atlas giving the manifold structure under consideration.

Example 9.8 (Smooth Varieties are Complex Manifolds) Fix a polynomial $H(\mathbf{z})$ and let $\mathcal{V} = \mathcal{V}(H)$. If $\mathbf{w} \in \mathcal{V}$ and $(\partial H/\partial z_d)(\mathbf{w}) \neq 0$ then the implicit function theorem (Proposition 3.1 in Chapter 3) implies the existence of
• a neighbourhood $U$ of $\mathbf{w}$ in $\mathcal{V}$,
• a neighbourhood $N$ of $\hat{\mathbf{w}}$ in $\mathbb{C}^{d-1}$,
• an analytic function $g : N \to \mathbb{C}$,
such that $U = \{(\hat{\mathbf{z}}, g(\hat{\mathbf{z}})) : \hat{\mathbf{z}} \in N\}$. In other words, if $\pi : U \to \mathbb{C}^{d-1}$ denotes the projection $\pi(\mathbf{z}) = \hat{\mathbf{z}}$ of a point in $U$ to its first $d-1$ coordinates then $\pi^{-1} : N \to \mathbb{C}^d$ is the analytic map $\hat{\mathbf{z}} \mapsto (\hat{\mathbf{z}}, g(\hat{\mathbf{z}}))$, so $(U, \pi)$ forms a $(d-1)$-dimensional complex chart of $\mathcal{V}$.

² Recall that a homeomorphism is a continuous bijective map whose inverse is also continuous.


Parameterizing other variables as necessary, if for each $\mathbf{w} \in \mathcal{V}$ some partial derivative of $H$ does not vanish at $\mathbf{w}$ then this procedure constructs a complex atlas of $\mathcal{V}$, showing that $\mathcal{V}$ is a complex manifold of dimension $d - 1$.

Remark 9.12 By writing $z_j = x_j + i y_j$ to identify a complex variable $z_j$ with the real variables $x_j$ and $y_j$, a complex manifold $M$ of dimension $s$ can be viewed as a smooth manifold of dimension $2s$.

Our next definitions help us understand the behaviour of functions on a manifold.

Definition 9.12 (mappings, differentials, and stationary points) If $M$ is a smooth manifold of dimension $s$ then a function $f : M \to \mathbb{R}$ is called smooth at $p \in M$ if there is a chart $(U, \phi)$ with $p \in U$ such that the map $f \circ \phi^{-1} : \mathbb{R}^s \to \mathbb{R}$ is smooth on $\phi(U)$. The differential of $f$ at $p$ is the vector $d_p f \in \mathbb{R}^s$ whose $j$th entry equals $(\partial g/\partial x_j)(p)$ for $g(x_1, \dots, x_s) = (f \circ \phi^{-1})(x_1, \dots, x_s)$. Similarly, if $M$ is a complex manifold of dimension $s$ then $f : M \to \mathbb{C}$ is analytic at $p \in M$ if there is a chart $(U, \phi)$ with $p \in U$ such that $f \circ \phi^{-1} : \mathbb{C}^s \to \mathbb{C}$ is analytic on $\phi(U)$, and the differential of $f$ at $p$ is the vector $d_p f \in \mathbb{C}^s$ whose $j$th entry equals $(\partial g/\partial z_j)(p)$ for $g(z_1, \dots, z_s) = (f \circ \phi^{-1})(z_1, \dots, z_s)$. In either case, $p \in M$ is called a stationary point of $f$ if $d_p f = \mathbf{0}$.

Note that stationary points are usually called critical points, but we do not use this terminology to prevent confusion with the critical points of ACSV. Whether $p$ is a stationary point does not depend on the chart used to calculate the differential of $f$.

Example 9.9 (Stationary Points) Let $H(x, y) = 1 - x - y$ and $\mathcal{V}_* = \mathcal{V}(H) \cap \mathbb{C}_*^d$ be the roots of $H$ with non-zero coordinates. An atlas of $\mathcal{V}_*$ is given by a single chart $(\mathcal{V}_*, \pi)$ where $\pi(x, y) = x$ is the projection map with inverse $\pi^{-1}(x) = (x, 1 - x)$. With respect to this chart the function $\psi(x, y) = \log(x) + \log(y)$ is represented by $g(x) = (\psi \circ \pi^{-1})(x) = \log(x) + \log(1 - x)$. Since $g'(x) = 0$ has the unique solution $x = 1/2$, the only stationary point of $\psi$ on $\mathcal{V}_*$ is $(1/2, 1/2)$.
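The computation in Example 9.9 is small enough to check mechanically. The following is a minimal SymPy sketch, assuming the single chart $\pi^{-1}(x) = (x, 1-x)$ from the example; the variable names are chosen here for illustration.

```python
import sympy as sp

x = sp.symbols('x')

# Representation of psi(x, y) = log(x) + log(y) in the chart pi^{-1}(x) = (x, 1 - x).
g = sp.log(x) + sp.log(1 - x)

# Stationary points in the chart are the zeros of the derivative g'(x).
stationary_x = sp.solve(sp.diff(g, x), x)

# Map the solutions back to the variety with pi^{-1}.
stationary_points = [(sol, 1 - sol) for sol in stationary_x]
print(stationary_points)
```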

9.3.1 A Gradient Flow Interpretation for Analytic Combinatorics

Before returning to ACSV let us reflect on our first analytic argument for asymptotics, back in Section 2.1.1 of Chapter 2, where we studied alternating permutations by analyzing the coefficients of the meromorphic function $f(z) = \tan(z)$. The general outline of that approach, reflected in Figure 2.2 of Chapter 2, can be summarized as


Fig. 9.5 Left: We can push down a cycle of integration at high height to a cycle arbitrarily close to a stationary point, except for points of lower height (the curve C1 gets pushed down to C2 , then C3 , then C4 ). Right: A problem arises when the gradient flow keeps decreasing but an asymptote forces it to stay at bounded height.

1. start with the univariate Cauchy integral formula for coefficients,
2. push the domain of integration away from the origin without crossing singularities until it consists of points where the Cauchy integrand is exponentially small, except for points arbitrarily close to the dominant singularities $z = \pm\pi/2$,
3. compute asymptotic contributions of the dominant singularities using residues.

Our goal is to generalize this thinking (as much as possible) to the multivariate setting. To that end, fix a Laurent expansion of rational $F(\mathbf{z}) = G(\mathbf{z})/H(\mathbf{z})$ with domain of convergence $\mathcal{D}$ and let $\mathbf{r} \in \mathbb{R}_*^d$. Again we have the multivariate Cauchy integral representation

$$ f_{n\mathbf{r}} = \frac{1}{(2\pi i)^d} \int_{\mathcal{T}} \frac{F(\mathbf{z})}{\mathbf{z}^{n\mathbf{r}+\mathbf{1}}}\, d\mathbf{z}, $$

where $\mathcal{T}$ is a polytorus in $\mathcal{D}$. In the univariate case we pushed the domain of integration away from the origin because that made the Cauchy integrand small. Now the exponential growth of the Cauchy integrand is captured by the familiar height function $h(\mathbf{z}) = -\sum_{j=1}^{d} r_j \log |z_j|$, so it makes sense to deform the domain of integration $\mathcal{T}$ (without crossing the singular variety $\mathcal{V}$) so that the maximum of the height function on the domain of integration is minimized.

The immediate problem in the multivariate setting is that there are too many ways to deform the domain of integration: it is not clear how the domain should be deformed in each of its $d$ coordinates, and it is possible to leave the domain of convergence $\mathcal{D}$ without crossing a singularity. Thankfully, we are searching for special points on the singular variety. Starting with a domain of integration at some (usually high³) height, we want to push the domain of integration to points of lower height until it locally gets stuck; see the left side of Figure 9.5. For any $r \in \mathbb{R}$, let $\mathcal{V}_r$ denote the points in $\mathcal{V}$ with height at least $r$. The points where the domain of integration can get stuck arise from a change in the topology of $\mathcal{V}_r$, and a study of such points is a classic geometric problem.
In particular, Morse theory [22] studies changes in the topology of super-level sets for compact real manifolds sorted by a real height function. Stratified Morse theory [16] extends this to sets (like algebraic varieties) which are not manifolds but can be partitioned into a finite collection of manifolds which ‘fit together nicely’. Similar results for complex varieties are also studied in Picard-Lefschetz theory [32]. These high-level theories tell us to partition the singular variety $\mathcal{V}$ into manifolds and study the stationary points of the height function $h$ on each of the manifolds. Unfortunately, because $h$ depends on the moduli of the coordinates it is only analytic in trivial cases. There are two ways to work around this difficulty:

1. If $M$ is a complex manifold of dimension $r$ then, as in Remark 9.12, we can view $M \subset \mathbb{C}^d$ as a smooth manifold $W \subset \mathbb{R}^{2d}$ of real dimension $2r$. Setting $z_j = x_j + i y_j$ and multiplying by two gives a modified height function $\eta(\mathbf{x}, \mathbf{y}) = 2h(\mathbf{x} + i\mathbf{y}) = -\sum_{j=1}^{d} r_j \log(x_j^2 + y_j^2)$ which is a smooth mapping, and we can look for stationary points of $\eta$ on the smooth manifold $W$.
2. Because $h$ is the real part of the analytic function $\psi(\mathbf{z}) = -\sum_{j=1}^{d} r_j \log z_j$, the Cauchy-Riemann equations [17, Sec. 2.3] imply that the stationary points of the smooth map $\eta : W \to \mathbb{R}$ are the stationary points of $\psi : M \to \mathbb{C}$. One can thus determine interesting topological information using the analytic function $\psi$.

Standard algebro-geometric constructions can be used to partition an arbitrary algebraic set $\mathcal{V}$ into a finite collection of manifolds (see, for instance, Mumford [24, Ch. 1A]). The techniques of stratified Morse theory, however, require a Whitney stratification, which is a partition of $\mathcal{V}$ into manifolds with an extra condition ensuring that for any manifold $M$ in the partition the local picture of $\mathcal{V}$ near all points of $M$ is consistent. A full discussion of Whitney stratifications is beyond the scope of this text, but modern treatments can be found in Mather [20] and Goresky and MacPherson [16, Ch. 1.2].

³ For instance, when dealing with power series expansions we can start with any product of circles sufficiently close to the origin, which can thus contain points of arbitrarily high height.
Of particular interest to us is the fact that there exist algorithms [23, 28] which take an algebraic set $\mathcal{V}$ defined by explicit polynomial equations and return a finite sequence of algebraic sets $\emptyset = F_0 \subset F_1 \subset \cdots \subset F_r = \mathcal{V}$, each defined by explicit polynomial equations, such that the connected components of the differences $F_j \setminus F_{j-1}$ form a Whitney stratification of $\mathcal{V}$. This sequence of algebraic sets, known as a canonical Whitney filtration of $\mathcal{V}$, allows us to give a general definition of critical points.

Definition 9.13 (critical points) Let $F(\mathbf{z})$ be a rational function with singular variety $\mathcal{V}$ and fix $\mathbf{r} \in \mathbb{R}_*^d$. Suppose $F_0 \subset F_1 \subset \cdots \subset F_r$ is a canonical Whitney filtration of $\mathcal{V}$. The set of critical points of $F$ in the direction $\mathbf{r}$ is the set of stationary points of the function $\psi(\mathbf{z}) = -\sum_{j=1}^{d} r_j \log z_j$ restricted to the differences $F_j \setminus F_{j-1}$. A critical point $\mathbf{w}$ in the stratum $S$ is non-degenerate if there exists a chart $(U, \phi)$ of $S$ with $\mathbf{w} \in U$ and $\phi(\mathbf{w}) = \mathbf{0}$ such that the Hessian of $\psi \circ \phi^{-1}$ at the origin is non-singular.

Given polynomials generating the elements of a canonical Whitney filtration, vanishing of the differential of $\psi$ can be encoded as the vanishing of minors of a matrix with explicit polynomial entries. Thus, when the denominator of $F$ is a polynomial with rational coefficients, all critical points have algebraic coordinates.
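To illustrate the encoding by minors in the simplest situation, the following SymPy sketch treats a bivariate denominator with smooth singular variety, where the only stratum is the variety itself and the stationarity condition says that the logarithmic gradient $(x H_x, y H_y)$ is parallel to the direction $\mathbf{r}$. The function name `smooth_critical_points` is hypothetical, and the direction $(1, 2)$ is chosen only for illustration; the polynomial $H = 1 - x - y$ is the one from Example 9.9.

```python
import sympy as sp

x, y = sp.symbols('x y')


def smooth_critical_points(H, direction):
    """Solve H = 0 together with the vanishing of the 2x2 minor of the
    matrix whose rows are the direction r and the logarithmic gradient of H."""
    r1, r2 = direction
    M = sp.Matrix([[r1, r2],
                   [x * sp.diff(H, x), y * sp.diff(H, y)]])
    return sp.solve([H, M.det()], [x, y], dict=True)


H = 1 - x - y
print(smooth_critical_points(H, (1, 1)))  # main diagonal direction
print(smooth_critical_points(H, (1, 2)))  # an illustrative non-diagonal direction
```

In the main diagonal direction this recovers the stationary point $(1/2, 1/2)$ of Example 9.9. Because the system is polynomial with rational coefficients, its solutions have algebraic coordinates.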


Remark 9.13 Definition 9.13 extends our previous definitions for critical points (in the smooth, hyperplane, and transverse multiple point settings). As proven in Section 5.3.4 of Chapter 5, generically the singular variety $\mathcal{V}$ is smooth, so a Whitney stratification for $\mathcal{V}$ is just $\mathcal{V}$ itself. When $\mathcal{V}$ is smooth, the critical points of Definition 9.13 are the solutions of the smooth critical point equations

$$ H^s(\mathbf{z}) = 0 \quad\text{and}\quad r_j z_1 H^s_{z_1}(\mathbf{z}) - r_1 z_j H^s_{z_j}(\mathbf{z}) = 0 \quad\text{for } 2 \le j \le d $$

from (5.16) in Chapter 5. If $\mathcal{V}$ is not smooth but has a transverse polynomial factorization, the next most common situation in practice, then critical points can be calculated using (9.4) above.

Roughly, stratified Morse theory tells us that the critical points of $F(\mathbf{z})$ represent the only obstructions to pushing down the Cauchy domain of integration to lower height. Such results are obtained via gradient flows, which use the differential of the height function on the Whitney strata to determine how to deform a domain of integration arbitrarily close to $\mathcal{V}$ to points of lower height. At stationary points the differential is zero, and the flow gets stuck. Aside from stationary points there is one other situation in which the flow does not work as intended: when there is an ‘asymptote’ of $\mathcal{V}$ such that points on $\mathcal{V}$ go to infinity with decreasing height, but the height of these points is bounded from below⁴ (see the right side of Figure 9.5). To work around this issue, Baryshnikov et al. [6] introduce a notion of stationary points at infinity, along with an algorithm to determine when there are no such points. Certifying that no stationary points at infinity exist then allows one to apply the usual techniques of Morse theory to reduce the Cauchy integral for coefficients to a sum of saddle-point integrals which can be asymptotically approximated. A characterization of stationary points at infinity was a long-time sticking point in the use of Morse-like topological constructions in ACSV contexts.
Recall the discussion on polynomial ideals and algebraic varieties from Section 7.3 in Chapter 7. Our discussion of stationary points at infinity uses the following construction.

Definition 9.14 (saturation of ideals) Let $I, J \subset \mathbb{C}[\mathbf{z}]$ be two polynomial ideals. The quotient of $I$ by $J$ is the ideal $(I : J) = \{ f \in \mathbb{C}[\mathbf{z}] : fg \in I \text{ for all } g \in J \}$. If $(I : J^n)$ denotes the result of dividing $I$ by $J$ repeatedly $n$ times then the sequence $(I : J) \subset (I : J^2) \subset (I : J^3) \subset \cdots$ eventually stabilizes⁵ to an ideal $(I : J^\infty)$ known as the saturation of $I$ by $J$.

Remark 9.14 Geometrically, the zero set $\mathcal{V}(I : J^\infty)$ is the smallest algebraic set containing $\mathcal{V}(I) \setminus \mathcal{V}(J)$. There are algorithms for computing ideal saturations using Gröbner bases [10, Table 6.6], and such algorithms have been implemented in several⁶ algebra systems.

⁴ In fact, not all asymptotes of this form are a problem, only those that influence gradient flows. Indeed, sequences of points going out to infinity while staying at finite height appear generically in at least three dimensions, so if the existence of any asymptote was a problem then the methods developed using these arguments would not work on generic examples.

⁵ If $I_1 \subset I_2 \subset \cdots$ is an infinite ascending chain of ideals then the union of all $I_j$ is also an ideal. The Hilbert basis theorem states that this ideal has a finite set of generators, each of which must lie in $I_j$ with $j$ sufficiently large, so the inclusions eventually become identities. The condition that any infinite ascending chain of ideals stabilizes characterizes a Noetherian ring.

Although Baryshnikov et al. give more general results, to be explicit we focus on the generic case when $F(\mathbf{z}) = G(\mathbf{z})/H(\mathbf{z})$ has a smooth singular variety. In order to avoid introducing even more geometric language, we do not fully define stratified points at infinity. Instead, we give the effective test of Baryshnikov et al. [6, Prop. 3.2] which characterizes their absence. Recall from Chapter 5 that the square-free part of a polynomial is the product of its distinct irreducible factors and that the homogenization of a polynomial $f(\mathbf{z}) \in \mathbb{C}[\mathbf{z}]$ of degree $k$ is the polynomial $z_0^k f(z_1/z_0, \dots, z_d/z_0) \in \mathbb{C}[z_0, \mathbf{z}]$.

Definition 9.15 (absence of stratified points at infinity) Let $F(\mathbf{z})$ be the ratio of coprime polynomials $G(\mathbf{z})$ and $H(\mathbf{z})$ with smooth singular variety $\mathcal{V} = \mathcal{V}(H)$, and fix $\mathbf{r} \in \mathbb{R}_*^d$. Let $\tilde{H}(z_0, \mathbf{z})$ be the homogenization of the square-free part of $H$, and let $I$ be the ideal of $\mathbb{C}[z_0, \mathbf{z}, \mathbf{y}]$ generated by the polynomials

$$ \tilde{H}(z_0, \mathbf{z}) \quad\text{and}\quad y_j z_1 \tilde{H}_{z_1}(z_0, \mathbf{z}) - y_1 z_j \tilde{H}_{z_j}(z_0, \mathbf{z}) \quad\text{for } 2 \le j \le d. $$

Let $C$ be the saturation of $I$ by the ideal $(z_0)$ and $C_{\mathbf{r}}$ be the result of substituting $\mathbf{y} = \mathbf{r}$ and $z_0 = 0$ in $C$. We say $F$ has no stratified points at infinity in the direction $\mathbf{r}$ if the only element of $\mathcal{V}(C_{\mathbf{r}})$ is $\mathbf{z} = \mathbf{0}$.

The rough idea behind Definition 9.15 is the following. By homogenizing $H$ we can move from $\mathbb{C}^d$ to complex projective space, so that ‘points at infinity’ can be captured algebraically by points where $z_0 = 0$. The polynomials in the ideal $I$ generate the smooth critical point equations on $\mathcal{V}$ when the direction $\mathbf{y}$ is a parameter, and saturation by $z_0$ removes extra ‘points at infinity’ which are not limit points of critical points in $\mathbb{C}^d$. The elements of the saturation with $\mathbf{y} = \mathbf{r}$ and $z_0 = 0$ are thus limit points of smooth critical points in directions approaching $\mathbf{r}$ which do not lie in $\mathbb{C}^d$. If any solution with $z_0 = 0$ also has $\mathbf{z} = \mathbf{0}$ then there are no solutions in projective space (because $\tilde{H}$ is homogeneous the point $(z_0, \mathbf{z}) = (0, \mathbf{0})$ will always be a solution).

Example 9.10 (A Stratified Point at Infinity) Consider the rational function $F(x, y) = 1/H(x, y)$ with square-free denominator $H(x, y) = 2 + y - x(1 + y)^2$. Since the system $H(x, y) = H_x(x, y) = H_y(x, y) = 0$ has no solutions, the singular variety $\mathcal{V}$ is smooth. Furthermore, the system $H(x, y) = x H_x(x, y) - y H_y(x, y) = 0$ has no solutions, so $F$ has no critical points in the main diagonal direction. Homogenizing $H$ gives $\tilde{H}(z, x, y) = z^3 H(x/z, y/z) = 2z^3 + z^2 y - x(z + y)^2$. Forming the ideal $I$, using the PolynomialIdeals package of Maple to saturate by $(z)$, then substituting in the main diagonal direction and $z = 0$ gives $C_{\mathbf{1}} = (x)$. Since $\mathcal{V}(C_{\mathbf{1}}) \subset \mathbb{C}^2$ contains non-zero solutions, such as $(0, 1)$, there is a stratified point at infinity. Figure 5.5 of Chapter 5 shows the amoeba of $H$. The existence of a stratified point at infinity is reflected by a limit direction of $\operatorname{amoeba}(H)$ which is normal to $\mathbf{r} = \mathbf{1}$. Further details are given in the worksheet corresponding to this example.

⁶ For instance, saturation of ideals is implemented by the Saturate command of the PolynomialIdeals package in Maple.
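The affine claims of Example 9.10 (smoothness, absence of critical points in the main diagonal direction, and the form of the homogenization) can be checked with Gröbner bases. The sketch below uses SymPy rather than the Maple worksheet mentioned in the text, and it does not reproduce the saturation step: a polynomial system has no complex solutions exactly when its reduced Gröbner basis is $\{1\}$, and that is what is tested here.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

H = 2 + y - x*(1 + y)**2
Hx, Hy = sp.diff(H, x), sp.diff(H, y)

# Smoothness: H = H_x = H_y = 0 has no solutions iff the Groebner basis is [1].
singular_system = sp.groebner([H, Hx, Hy], x, y, order='lex')

# Critical points in the main diagonal direction: H = x*H_x - y*H_y = 0.
critical_system = sp.groebner([H, x*Hx - y*Hy], x, y, order='lex')

# Homogenization with respect to a new variable z (H has total degree 3).
H_hom = sp.expand(z**3 * H.subs({x: x/z, y: y/z}, simultaneous=True))

print(singular_system.exprs, critical_system.exprs)
print(H_hom)
```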

Example 9.11 (No Stratified Points at Infinity) Consider the rational function $F(x, y) = 1/H(x, y)$ with $H(x, y) = 1 - x - y$. Then $\tilde{H}(z, x, y) = z - x - y$ and for $\mathbf{r} = (r, s)$ we compute $C_{\mathbf{r}} = (y(r + s),\, x + y)$. There are no stratified points at infinity if $r + s \neq 0$, in which case $C_{\mathbf{r}} = (x, y)$. There are stratified points at infinity in directions which are a multiple of $\mathbf{r} = (1, -1)$.
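The test of Definition 9.15 can be carried out for Example 9.11 with any Gröbner basis engine. The following is a sketch in SymPy, which has no built-in saturation command as far as this sketch assumes, so the saturation by $(z)$ is computed with the standard elimination trick $(I : z^\infty) = (I + (1 - tz)) \cap \mathbb{C}[z, x, y, a, b]$. The names `saturate_by_variable` and `ideal_at_infinity` are hypothetical, and the symbols `a`, `b` play the role of the direction parameters $y_1, y_2$ in the definition.

```python
import sympy as sp

t, z, x, y, a, b = sp.symbols('t z x y a b')


def saturate_by_variable(generators, var, ring_vars):
    """Generators of (I : var^infinity), obtained by eliminating the auxiliary
    variable t from I + (1 - t*var) with a lexicographic Groebner basis."""
    G = sp.groebner(list(generators) + [1 - t*var], t, *ring_vars, order='lex')
    return [g for g in G.exprs if not g.has(t)]


# Homogenized denominator and the generators of the ideal I of Definition 9.15.
H_hom = z - x - y
I_gens = [H_hom, b*x*sp.diff(H_hom, x) - a*y*sp.diff(H_hom, y)]
C = saturate_by_variable(I_gens, z, (z, x, y, a, b))


def ideal_at_infinity(direction):
    """Groebner basis of C_r: substitute the direction and z = 0 into C."""
    r, s = direction
    substituted = [g.subs({a: r, b: s, z: 0}) for g in C]
    substituted = [g for g in substituted if g != 0]
    return sp.groebner(substituted, x, y, order='lex').exprs


print(ideal_at_infinity((1, 1)))    # only the origin: no stratified points at infinity
print(ideal_at_infinity((1, -1)))   # a whole line of solutions at infinity
```

For larger examples this elimination approach can be slow, which is one reason dedicated saturation routines such as the Maple command cited in the text are preferable in practice.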

Absence of stratified points at infinity implies that the gradient flow approach can decompose the Cauchy integral into a sum of saddle-point integrals near stationary points. The integrands of these saddle-point integrals are expressed by residues. If $\mathcal{V}$ is smooth and $\mathbf{w} \in \mathcal{V}$ we write

$$ \operatorname{Res}_{\mathbf{w}, \mathcal{V}} \left( \frac{F(\mathbf{z})}{\mathbf{z}^{n\mathbf{r}+\mathbf{1}}}\, d\mathbf{z} \right) = \operatorname{Res}_{z_k = g(\mathbf{z}_{\hat{k}})} \left( \frac{F(\mathbf{z})}{\mathbf{z}^{n\mathbf{r}+\mathbf{1}}}\, dz_k \right) $$

for a coordinate $z_k$ and analytic function $g(\mathbf{z}_{\hat{k}})$ such that $\mathbf{z}$ lies in some neighbourhood of $\mathbf{w}$ in $\mathcal{V}$ if and only if $z_k = g(\mathbf{z}_{\hat{k}})$. When we use this notation it is implicit that the stated formula is valid for any possible pair of coordinate $z_k$ and analytic function $g$. Lemma 5.2 in Chapter 5 gives an explicit expression for the residue. We are now ready to state the main result of Baryshnikov et al., simplified under our hypothesis of smoothness.

Theorem 9.3 (Baryshnikov et al. [6, Thm. 2.16]) Let $F(\mathbf{z})$ be the ratio of coprime polynomials $G(\mathbf{z})$ and $H(\mathbf{z})$ with smooth singular variety $\mathcal{V} = \mathcal{V}(H)$. Suppose there are no stationary points at infinity in the direction $\mathbf{r} \in \mathbb{R}_*^d$ and that the critical points of $F$ in the direction $\mathbf{r}$ are all nondegenerate and form the finite set $E = \{\mathbf{w}_1, \dots, \mathbf{w}_r\}$. Then each $\mathbf{w}_j$ can be paired with an integer $\kappa_j$ such that

$$ f_{n\mathbf{r}} = \sum_{j \in \chi} \frac{\kappa_j}{(2\pi i)^{d-1}} \int_{N_j} \operatorname{Res}_{\mathbf{w}_j, \mathcal{V}} \left( \frac{F(\mathbf{z})}{\mathbf{z}^{n\mathbf{r}+\mathbf{1}}}\, d\mathbf{z} \right) + O(\tau^n), \tag{9.8} $$

where $\chi$ contains the elements $\mathbf{w}_j \in E$ which have highest height among those with $\kappa_j$ non-zero, each $N_j$ is a domain making the summands of (9.8) saddle-point integrals (meaning after a coordinate change they satisfy the hypotheses of Proposition 5.3 in Chapter 5), and $0 < \tau < |\mathbf{w}^{-\mathbf{r}}|$ for any $\mathbf{w} \in \chi$.


Theorem 9.3 expresses $f_{n\mathbf{r}}$ as a linear combination of saddle-point integrals, which can be asymptotically approximated using Proposition 5.3 of Chapter 5. The integers $\kappa_j$ in the theorem arise from expressing the original Cauchy cycle of integration $\mathcal{T}$ in a basis for ‘relative singular homology of a pair consisting of subsets of $\mathbb{C}_*^d \setminus \mathcal{V}$ and points of lower height’; they can be very difficult to determine directly using geometric arguments, and we do not give a full definition⁷.

Remark 9.15 If one of the critical points $\mathbf{w}_j$ in Theorem 9.3 is minimal, we have seen in Chapter 5 that the corresponding integer coefficient $\kappa_j$ equals one if $\mathbf{w}_j$ is contributing and zero otherwise. The power of Theorem 9.3 is that it tells us something about the asymptotic contributions of non-minimal critical points, an extremely difficult task outside of very specialized circumstances (since it is hard to deform the Cauchy domain of integration outside the domain of convergence). Other than minimal points, DeVries et al. [13] show how to determine the non-zero coefficients of highest height for bivariate functions with a smooth singular variety. There are also extensions of Theorem 9.3 to non-smooth singular varieties, and we have seen in Chapter 8 how to compute the coefficients $\kappa_j$ when $\mathcal{V}$ is a hyperplane arrangement (in this situation each $\kappa_j$ is $\pm 1$ or $0$, depending on the orthant of the critical point $\mathbf{w}_j$ and whether or not it is contributing).

An important corollary of Theorem 9.3 is the following, which bounds the exponential growth of $\mathbf{r}$-diagonals by the maximum of a finite algebraic set.

Corollary 9.2 Under the assumptions of Theorem 9.3, let $\rho$ be the maximum height of a critical point of $F$ in the direction $\mathbf{r}$. Then $\limsup_{n \to \infty} |f_{n\mathbf{r}}|^{1/n} \le e^{\rho}$.

Example 9.12 (Quantum Random Walks) Discrete quantum random walks have been studied as a computational primitive for quantum computation (see, for instance, Ambainis et al. [1]). Bressler and Pemantle [12] analyze the behaviour of one-dimensional quantum walks which either stay stationary or move one step to the right at each time step, and always evolve an underlying quantum state, using bivariate generating functions of the form

$$ F(x, y) = \frac{G(x, y)}{1 - cy + cxy - xy^2} = \sum_{i, n \ge 0} f_{i,n}\, x^i y^n, $$

where $c \in (0, 1)$ is a parameter encoding the transition probabilities of the walk model, $G(x, y)$ is a polynomial encoding the initial fixed state of the model, and $|f_{i,n}|^2$ is the probability that a walk ends at $x = i$ after $n$ steps. Unlike ‘classical’ walk models, where the endpoint probabilities for walks of large length usually approach a normal distribution, quantum walks experience an ‘interference pattern’ which makes the endpoint distribution far from normal (see Figure 9.6). Because the $y$ variable tracks the length of a walk, and a walk takes at most one step at each time period, we are interested in asymptotic behaviour of the coefficients $[x^{\lambda n} y^n] F(x, y)$ for $0 < \lambda < 1$, given by the $\mathbf{r} = (\lambda, 1)$ diagonal. Of particular interest is the fact that the walk probability decays exponentially for $\lambda$ outside $J = [(1 - c)/2, (1 + c)/2]$. A quick computation shows that for any $c \in (0, 1)$ there are no stratified points at infinity. For any fixed values of $c$ and $\lambda \notin J$ a computer algebra system can easily solve the smooth critical point equations $H = x H_x - \lambda y H_y = 0$ and verify that all critical points have negative height. Corollary 9.2 then implies exponential decay of the $\mathbf{r} = (\lambda, 1)$ diagonal by examining this finite set of algebraic points (without the need to determine minimality or any other properties of the singular variety). Detailed studies of quantum walks using analytic combinatorics in several variables can be found in Bressler and Pemantle [12] and Baryshnikov et al. [7].

Fig. 9.6 Series coefficients of the polynomial $f(x) = [y^{300}] F(x, y)$ where $F(x, y) = (1 - cy + cxy - xy^2)^{-1}$ for $c = 1/3$: the coefficients are represented by dots, with consecutive terms connected by a line for visibility. Note the exponential drop outside of the interval $[100, 200]$.

Although it is usually not possible to determine the integers $\kappa_j$ directly using geometric arguments, or even say if they are non-zero, knowing that these coefficients are integers can be very useful. Combined with the techniques on numeric analytic continuation for D-finite functions from Section 2.4 of Chapter 2, which allow one to determine asymptotics up to numeric constants with rigorous error bounds, this can be used to create a powerful tool for attacking the connection problem for sequences satisfying linear recurrence relations. We illustrate the possibilities of this combination on an example.

⁷ In fact, if $\eta = \max\{h(\mathbf{w}_j) : \kappa_j \neq 0\}$ then only the coefficients $\kappa_j$ with $h(\mathbf{w}_j) \ge \eta$ are uniquely defined. These non-zero coefficients of high height are the only ones which appear in (9.8).


9.3.2 Attacking the Connection Problem through ACSV and Numeric Analytic Continuation

While resolving certain conjectures about the eventual positivity of sequences encoded by multivariate rational functions, Baryshnikov et al. [8] consider⁸ the main power series diagonal coefficients of

$$ F(w, x, y, z) = \frac{1}{H(w, x, y, z)} = \frac{1}{1 - (w + x + y + z) + 27wxyz}. $$

The system $H(w, x, y, z) = H_w(w, x, y, z) = H_x(w, x, y, z) = H_y(w, x, y, z) = H_z(w, x, y, z) = 0$ has a unique solution at $(w, x, y, z) = \sigma = (1/3, 1/3, 1/3, 1/3)$, which is the only non-smooth point of the singular variety $\mathcal{V} = \mathcal{V}(H)$. One thus obtains a Whitney stratification of the singular variety $\mathcal{V}$ by taking $S_1 = \{\sigma\}$ and $S_2 = \mathcal{V} \setminus \{\sigma\}$ as strata. Because $\sigma$ lies in a stratum with a single element, it is a critical point (the height function is trivially constant on the stratum, so its differential is zero). The critical points in $S_2$ are obtained by solving the smooth critical point equations $H = wH_w - xH_x = wH_w - yH_y = wH_w - zH_z = 0$, giving two smooth critical points $\tau_+ = (\omega_+, \omega_+, \omega_+, \omega_+)$ and $\tau_- = (\omega_-, \omega_-, \omega_-, \omega_-)$ for $\omega_\pm = (-1 \pm i\sqrt{2})/3$. Because $\sigma$ has smaller coordinate-wise modulus than $\tau_\pm$, the points $\tau_\pm$ are not minimal. Applying either the algorithms of Chapter 7, or the famous Grace-Walsh-Szegő theorem [11, Thm. 1.1] on the distributions of zeroes of symmetric polynomials which are linear in each variable (see Problem 9.7 below), shows that $\sigma$ is minimal. Thus, there is a single minimal critical point $\sigma$, which is not smooth, and two smooth critical points $\tau_\pm$, which are not minimal. Computing the ideal $C_{\mathbf{1}}$ in Definition 9.15 proves there are no stratified points at infinity.

To better understand the minimal critical point $\sigma$ we examine the series expansion

$$ H\left(w + \tfrac{1}{3},\, x + \tfrac{1}{3},\, y + \tfrac{1}{3},\, z + \tfrac{1}{3}\right) = 3(wx + wy + wz + xy + xz + yz) + \text{higher terms} $$

of $H$ at $\sigma$. Because the leading term is an irreducible quadratic, Lemma 9.1 implies $\sigma$ is not a transverse multiple point. In fact, the invertible change of variables⁹

$$ w \mapsto a + \sqrt{6}\,b + \sqrt{2}\,c + d + 1/3, \qquad x \mapsto a - 3d + 1/3, $$
$$ y \mapsto a - 2\sqrt{2}\,c + d + 1/3, \qquad z \mapsto a - \sqrt{6}\,b + \sqrt{2}\,c + d + 1/3 $$

transforms the leading term into $18(a^2 - b^2 - c^2 - d^2)$. A singularity whose leading term can be put into the form $z_1^2 - z_2^2 - \cdots - z_d^2$ is called a cone point, since the real solutions of such a polynomial form two $d$-dimensional cones meeting at a point. Baryshnikov and Pemantle [9] show that minimal critical cone points typically determine asymptotic behaviour of rational diagonals, except when they occur in even dimensions at least four (which is precisely our situation). In even dimensions at least four, a peculiar lacuna phenomenon¹⁰ appears, suggesting that cone points may not contribute to coefficient asymptotics. Indeed, Baryshnikov et al. [5, Thm. 2.3] use an in-depth topological argument to show that for our rational function (and others of a similar form) the Cauchy domain of integration can be pushed past the minimal critical cone point $\sigma$, which thus does not affect asymptotics. In particular, our example is a pathological case where asymptotics are determined by non-minimal critical points.

We find asymptotics by expressing dominant coefficient behaviour in two ways: as a linear combination with unknown integer coefficients using the geometric approach to ACSV from this chapter, and as a linear combination with complex number coefficients that can be rigorously approximated using the computational tools for D-finite functions from Chapter 2.

⁸ Continuing work of Gillis et al. [15] and Straub and Zudilin [30], Baryshnikov et al. study values of the real parameter $C$ such that the main diagonal of $F(\mathbf{z}) = (1 - (z_1 + \cdots + z_d) + C z_1 \cdots z_d)^{-1}$ contains only a finite number of negative values. The most interesting case is $C = C_* = (d - 1)^{d-1}$, which is the threshold for different behaviour (when $C < C_*$ there are only a finite number of negative values, and when $C > C_*$ there are an infinite number of both positive and negative values). When $d$ is even and at least four, and $C = C_*$, the topology of the singular variety is perfectly aligned so that asymptotics are determined by non-minimal critical points; the rational function here is the smallest such example. See also the discussion in Section 3.4.5 of Chapter 3 for some context on these positivity problems.
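The algebraic facts used above about $\sigma$, $\tau_\pm$, and the cone-point normal form can be confirmed with a computer algebra system. The following SymPy sketch checks that $H$ and its gradient vanish at $\sigma$, that $H$ restricted to the diagonal $w = x = y = z = \omega$ factors as $(3\omega - 1)^2(3\omega^2 + 2\omega + 1)$, that the quadratic part of $H$ at $\sigma$ is as stated, and that the linear change of variables diagonalizes it. All variable names are chosen here for illustration, and the restriction to the diagonal relies on the observation that the smooth critical point equations force $w = x = y = z$.

```python
import sympy as sp

w, x, y, z, a, b, c, d, omega = sp.symbols('w x y z a b c d omega')
variables = (w, x, y, z)
H = 1 - (w + x + y + z) + 27*w*x*y*z

# sigma is singular: H and all of its first partial derivatives vanish there.
sigma = {v: sp.Rational(1, 3) for v in variables}
values_at_sigma = [H.subs(sigma)] + [sp.diff(H, v).subs(sigma) for v in variables]

# Restriction of H to the diagonal, where the smooth critical points lie.
on_diagonal = sp.factor(H.subs({v: omega for v in variables}))
omega_plus = (-1 + sp.I*sp.sqrt(2)) / 3
omega_minus = (-1 - sp.I*sp.sqrt(2)) / 3
# |omega_pm|^2 = 1/3, so h(tau_pm) = -4*log|omega_pm| = log 9.
modulus_squared = sp.expand(omega_plus * sp.conjugate(omega_plus))

# Expansion of H in coordinates centred at sigma, split by total degree.
shifted = sp.Poly(sp.expand(H.subs({v: v + sp.Rational(1, 3) for v in variables},
                                   simultaneous=True)), *variables)
low_order = [m for m, _ in shifted.terms() if sum(m) < 2]
quadratic = sum(coeff * sp.prod([v**e for v, e in zip(variables, m)])
                for m, coeff in shifted.terms() if sum(m) == 2)

# Linear part of the change of variables from the text (the shift by 1/3 is
# already accounted for by centring at sigma).
change = {w: a + sp.sqrt(6)*b + sp.sqrt(2)*c + d,
          x: a - 3*d,
          y: a - 2*sp.sqrt(2)*c + d,
          z: a - sp.sqrt(6)*b + sp.sqrt(2)*c + d}
cone_form = sp.expand(quadratic.subs(change, simultaneous=True))

print(on_diagonal)
print(quadratic)
print(cone_form)
```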

Expression #1 from ACSV

Because there are no stationary points at infinity and the Cauchy domain of integration can be pushed past the critical point $\sigma$, which has highest height among the three critical points in our example, asymptotic coefficient behaviour will be determined by the smooth (non-minimal) critical points $\tau_\pm$ of height $h(\tau_\pm) = \log 9$. Theorem 2.16 of Baryshnikov et al. [6], which is an extension of Theorem 9.3 that allows non-smooth points such as $\sigma$, implies the asymptotic contributions of $\tau_\pm$ are integer multiples of what they would be if these points were minimal critical points. In other words, there exist integers $\kappa_+$ and $\kappa_-$ such that the power series coefficients of $F(w, x, y, z)$ satisfy

$$ f_{n,n,n,n} = \kappa_- \Phi_{\tau_-}(n) + \kappa_+ \Phi_{\tau_+}(n) + O(\rho^n) \tag{9.9} $$

for some $0 < \rho < 9$, where $\Phi_{\tau_\pm}(n)$ are the asymptotic expansions

$$ \Phi_{\tau_-}(n) = \frac{\left(-7 + 4i\sqrt{2}\right)^n}{n^{3/2}\, \pi^{3/2}} \left( -\frac{\left(\sqrt{2} - 5i\right) \sqrt{-2i\sqrt{2} - 8}}{24} + O\!\left(\frac{1}{n}\right) \right), $$

$$ \Phi_{\tau_+}(n) = \frac{\left(-7 - 4i\sqrt{2}\right)^n}{n^{3/2}\, \pi^{3/2}} \left( -\frac{\left(\sqrt{2} + 5i\right) \sqrt{2i\sqrt{2} - 8}}{24} + O\!\left(\frac{1}{n}\right) \right), $$

given by the right-hand side of (5.27) in Chapter 5 assuming $\tau_\pm$ are minimal. As mentioned above, it is not immediately clear how to determine the integers $\kappa_\pm$ using geometric arguments.

⁹ This change of variables is found by diagonalizing the matrix $M$ such that the leading term of $H$ at $\sigma$ is $\mathbf{v} M \mathbf{v}^T$ for $\mathbf{v} = (w \;\; x \;\; y \;\; z)$; see the worksheet corresponding to this example for details.

¹⁰ See, for instance, Petrowsky [27] and Atiyah et al. [4].

Expression #2 from Numeric Analytic Continuation

Recall from Chapter 3 that the diagonal of any rational function satisfies a linear differential equation with polynomial coefficients. In fact, the creative telescoping techniques discussed in Section 3.2.1 of Chapter 3 allow us to compute that the diagonal $f(t) = (\Delta F)(t)$ of our rational function satisfies
\[
t^2(81t^2 + 14t + 1)\, f^{(3)}(t) + 3t(162t^2 + 21t + 1)\, f^{(2)}(t) + (21t + 1)(27t + 1)\, f'(t) + 3(27t + 1)\, f(t) = 0. \tag{9.10}
\]
The analysis of D-finite functions in Section 2.4 of Chapter 2 implies that the only singularities of $f(t)$ occur at the roots $t = \omega_\pm^4$ of the factor $81t^2 + 14t + 1$ of the leading coefficient. Following the techniques discussed in Section 2.4.1 of Chapter 2, we can compute a basis of solutions to (9.10) at each point $t \in \{0, \omega_\pm^4\}$ and use numeric analytic continuation to determine singular expansions of $f(t)$ at $t = \omega_\pm^4$. Performing these calculations with the Sage package of Mezzarobba [21] allows us, in a few seconds of computation time, to determine an asymptotic expansion
\[
\begin{aligned}
f_{n,n,n,n} ={}& \frac{\left(-7 + 4i\sqrt{2}\right)^n}{n^{3/2}} \left( (0.3066\ldots) + (0.146\ldots)i + O\!\left(\frac{1}{n}\right) \right) \\
&+ \frac{\left(-7 - 4i\sqrt{2}\right)^n}{n^{3/2}} \left( (0.3066\ldots) - (0.146\ldots)i + O\!\left(\frac{1}{n}\right) \right)
\end{aligned} \tag{9.11}
\]
whose coefficients are rigorously certified to over a thousand digits. Without additional information it is impossible to exactly determine the coefficients in this expansion, even though they can be approximated to arbitrary accuracy.
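The differential equation (9.10) can be tested against explicitly computed diagonal coefficients. The sketch below is not the creative telescoping computation referred to above. It assumes the rational function is F = 1/(1 − (w + x + y + z) + 27wxyz), expands it as a geometric series to obtain a finite sum for the diagonal coefficients, and checks the three-term recurrence obtained by extracting the coefficient of t^N from (9.10). Both the finite sum and the recurrence were derived for this illustration and should be read as assumptions rather than formulas from the text.

```python
import cmath
from math import comb, factorial


def diagonal_coefficient(n):
    # Coefficient of (wxyz)^n in sum_k (s - 27 p)^k, with s = w+x+y+z and p = wxyz.
    total = 0
    for j in range(n + 1):
        m = n - j
        multinomial = factorial(4 * m) // factorial(m) ** 4
        total += comb(4 * m + j, j) * (-27) ** j * multinomial
    return total


def recurrence_residual(a, N):
    # Coefficient of t^N after substituting sum_n a[n] t^n into (9.10).
    previous = a[N - 1] if N >= 1 else 0
    return ((N + 1) ** 3 * a[N + 1]
            + (14 * N ** 3 + 21 * N ** 2 + 13 * N + 3) * a[N]
            + 81 * N ** 3 * previous)


def leading_coefficient_roots():
    # Roots of 81 t^2 + 14 t + 1, the candidate singularities of f(t).
    disc = cmath.sqrt(14 ** 2 - 4 * 81)
    return (-14 + disc) / 162, (-14 - disc) / 162


if __name__ == "__main__":
    coeffs = [diagonal_coefficient(n) for n in range(12)]
    print(coeffs[:5])
    print([recurrence_residual(coeffs, N) for N in range(11)])
    print([abs(t) for t in leading_coefficient_roots()])
```

Both roots of 81t² + 14t + 1 have modulus 1/9, which matches the exponential growth 9^n of the two terms in (9.11).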


Determining Asymptotics Exactly

Equations (9.9) and (9.11), one with unknown integers and the other with approximated coefficients from a larger field, are perfect complements, and combining the expressions allows us to approximate $\kappa_-, \kappa_+ = 2.999\ldots$ to over a thousand decimal digits in a few seconds. Even knowing these integers to a single decimal digit is enough to rigorously conclude that $\kappa_- = \kappa_+ = 3$ and thus exactly determine asymptotics of the diagonal sequence under consideration. One can think of the $\kappa_\pm$ as a sort of 'winding number' of the starting Cauchy domain of integration $T$. Working in four-dimensional complex space (which is extremely difficult to picture) one can deform $T$ to points of height smaller than $h(\sigma)$, eventually moving to neighbourhoods of $\tau_\pm$ and points of even lower height, but this process 'wraps' $T$ three times around the smooth critical points, giving a multiplicity. Our combination of geometric and computer algebra based techniques allows us to determine the multiplicity. A direct geometric argument for this multiplicity is currently unknown.
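The comparison of the two expressions can be imitated at low precision. The sketch below takes the leading constant of the expansion with exponential base −7 + 4i√2 from the ACSV side, and the four-digit truncation (0.3066…) + (0.146…)i from (9.11), and solves for the integer multiplier. Only the digits printed in the text are used, so this illustrates the idea rather than the certified thousand-digit computation; the function names are hypothetical.

```python
import cmath
import math


def acsv_leading_constant():
    # Leading constant multiplying (-7 + 4i*sqrt(2))^n / n^(3/2) on the ACSV side,
    # including the factor 1 / pi^(3/2).
    root2 = math.sqrt(2)
    bracket = -(root2 - 5j) * cmath.sqrt(-8 - 2j * root2) / 24
    return bracket / math.pi ** 1.5


def estimate_kappa(numeric_constant):
    # kappa * (ACSV constant) should equal the numerically computed constant.
    return numeric_constant / acsv_leading_constant()


if __name__ == "__main__":
    estimate = estimate_kappa(0.3066 + 0.146j)  # truncated digits from (9.11)
    print(estimate, round(estimate.real))
```

Because the multiplier is known in advance to be an integer, a crude approximation is already enough to pin it down by rounding.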

9.3.3 The State of Analytic Combinatorics in Several Variables

The methods of this chapter suggest a general strategy for analyzing Laurent coefficients of a function $F(\mathbf{z})$ with singular variety $\mathcal{V}$:

1. Determine a Whitney stratification of $\mathcal{V}$ and, for a fixed direction, compute the critical points of the height function on each stratum using Definition 9.13.
2. Verify there are no stationary points at infinity using Definition 9.15 (or a generalization which takes into account non-smooth points).
3. Express, up to negligible error, the Cauchy integral for coefficients as an integer sum of integrals over neighbourhoods of critical points (as in Theorem 9.3).
4. Asymptotically approximate each of the integrals appearing in this expression.

As detailed in Baryshnikov et al. [6], Steps 1 and 2 are now essentially solved. Step 3 is a topological problem, and is typically the most difficult task of an analysis. Aside from the few special cases mentioned above, such as minimal contributing points, no general algorithm for Step 3 is known. Step 4 depends heavily on the local geometry of $\mathcal{V}$ at each critical point. For a critical point which is a nondegenerate transverse multiple point, the corresponding integrals are saddle-point integrals which can be asymptotically approximated. Asymptotics of integrals corresponding to degenerate transverse multiple points can in theory be computed [31], although such examples are not prominent in the analytic combinatorics literature. Using results of Atiyah et al. [4] on inverse Fourier transforms of homogeneous hyperbolic functions, Baryshnikov and Pemantle [9] and Baryshnikov et al. [5] determine asymptotics for certain types of cone points. Explicit results for singular varieties with more exotic singular behaviour are harder to find, although much of the background needed to attack such problems can be found in [2, 3].


In order to limit the necessary background, our presentation of this advanced approach to analytic combinatorics in several variables did not dig deeply into the underlying theory. A full discussion on such topics can be found in Baryshnikov et al. [6] and the textbook of Pemantle and Wilson [26].

Problems

9.1 Prove that if $A(x, y, z)$ and $B(x, y, z)$ are power series whose product is $z^2 - x^2 - y^2$ then one of $A$ or $B$ is invertible as a power series.

9.2 Prove that $\sigma = (1, 1, 1/3)$ is a minimal point of
\[
F(x, y, t) = \frac{(1 + x)(1 - xy^2 + x^2)}{(1 - t(1 + x^2 + xy^2))(1 - y)(1 + x^2)}.
\]
Hint: Although $F$ is not combinatorial, a short argument allows one to apply Proposition 5.4 from Chapter 5. Alternatively, one can use the algorithms of Chapter 7.

9.3 Using Proposition 4.5 of Chapter 4 and Proposition 5.4 from Chapter 5, prove that all minimizers of $|xyt|^{-1}$ on the power series domain of convergence of
\[
F(x, y, t) = \frac{(1 + x)(1 - xy^2 + x^2)}{(1 - t(1 + x^2 + xy^2))(1 - y)(1 + x^2)}
\]
have the same coordinate-wise modulus as $\sigma = (1, 1, 1/3)$.

9.4 The number of three-dimensional 'singular vector tuples of generic tensors' (as discussed in [25] and Problem 5.5 of Chapter 5) has multivariate generating function
\[
F(x, y, z) = \frac{xyz}{(1 - x)(1 - y)(1 - z)(1 - x - y - z)(1 - xy - xz - yz)}.
\]
Using algebraic relations between the denominator factors, decompose $F$ as a sum of rational functions such that the denominator of each summand has at most three distinct linear factors. Use Theorems 9.1 and 9.2 to determine asymptotics along different directions $\mathbf{r} \in \mathbb{R}^d_{>0}$. For which directions do these theorems not apply?

9.5 Let $F(x, y) = \dfrac{1}{(1 - x^2 - y)(1 - x - y^2)}$. Determine the directions to which Theorems 9.1 and 9.2 apply and determine asymptotics of the power series expansion of $F$ in these directions.

9.6 In statistical signal processing, the design of multivariate autoregressive filters requires estimating series coefficients of certain spectral density functions. In this paradigm, Geronimo et al. [14] study the Laurent series coefficients of
\[
F(x, y) = \frac{xy}{\left(1 - \frac{x + y}{C}\right)\left(xy - \frac{x + y}{C}\right)}
\]
for a real parameter $C > 2$, expanded in the unique domain of convergence containing the point $(1, 1)$. Use Theorems 9.1 and 9.2 to find asymptotics in different directions, as a function of $C$.

9.7 Let $F(\mathbf{z}) = (1 - (z_1 + \cdots + z_d) + C z_1 \cdots z_d)^{-1}$ and let $C_* = (d - 1)^{d-1}$. Prove the singular variety $\mathcal{V}$ is smooth if and only if $C \neq C_*$. If $C < C_*$ show that there is a single smooth minimal critical point for the main diagonal direction, which has positive coordinates, and determine asymptotics of the main diagonal of $F$. You may use without proof the Grace-Walsh-Szegő theorem [11, Thm. 1.1]: if $f(\mathbf{z})$ is unchanged by permutations of the variables, $f$ is linear in each variable individually, and $f(\mathbf{w}) = 0$ where each $|w_j| < r$, then there exists $|\omega| < r$ with $f(\omega, \omega, \ldots, \omega) = 0$.

9.8 In Section 9.3.2 we found asymptotics of a four variable rational function whose coefficient behaviour was determined by two non-minimal critical points. Consider the next smallest example in this family, the six variable rational function
\[
F(u, v, w, x, y, z) = \frac{1}{1 - (u + v + w + x + y + z) + 3125\,uvwxyz}.
\]
Show that there is a single non-smooth point, when all variables equal 1/5, which is a cone point, and show that there are four (non-minimal) smooth critical points for the main diagonal. Assuming the results of Baryshnikov et al. [5], which imply the Cauchy integral can be deformed past the cone point, mirror the arguments of Section 9.3.2 to determine asymptotics of the main diagonal of $F$. You may also assume the Grace-Walsh-Szegő theorem from Problem 9.7. An annihilating differential equation for the diagonal can be computed using the MAGMA package of Lairez [18], or found on the textbook website. Note that, unlike the four-dimensional case, not all smooth critical points determine dominant asymptotics.

References

1. Andris Ambainis, Eric Bach, Ashwin Nayak, Ashvin Vishwanath, and John Watrous. One-dimensional quantum walks. In Proceedings of the Thirty-Third Annual ACM Symposium on Theory of Computing, pages 37–49. ACM, New York, 2001.
2. V. I. Arnold, S. M. Gusein-Zade, and A. N. Varchenko. Singularities of differentiable maps. Volume 1. Modern Birkhäuser Classics. Birkhäuser/Springer, New York, 2012.
3. V. I. Arnold, S. M. Gusein-Zade, and A. N. Varchenko. Singularities of differentiable maps. Volume 2. Modern Birkhäuser Classics. Birkhäuser/Springer, New York, 2012.
4. M. F. Atiyah, R. Bott, and L. Gårding. Lacunas for hyperbolic differential operators with constant coefficients. I. Acta Math., 124:109–189, 1970.
5. Y. Baryshnikov, S. Melczer, and R. Pemantle. Asymptotics of multivariate sequences in the presence of a lacuna. arXiv preprint, arXiv:1905.04187, 2019.
6. Y. Baryshnikov, S. Melczer, and R. Pemantle. Stationary points at infinity for analytic combinatorics. arXiv preprint, arXiv:1905.05250, 2020.
7. Yuliy Baryshnikov, Wil Brady, Andrew Bressler, and Robin Pemantle. Two-dimensional quantum random walk. J. Stat. Phys., 142(1):78–107, 2011.
8. Yuliy Baryshnikov, Stephen Melczer, Robin Pemantle, and Armin Straub. Diagonal asymptotics for symmetric rational functions via ACSV. In 29th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms, volume 110 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. No. 12, 15. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2018.
9. Yuliy Baryshnikov and Robin Pemantle. Asymptotics of multivariate sequences, part III: Quadratic points. Adv. Math., 228(6):3127–3206, 2011.
10. Thomas Becker and Volker Weispfenning. Gröbner bases, volume 141 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1993.
11. Julius Borcea and Petter Brändén. The Lee-Yang and Pólya-Schur programs. II. Theory of stable polynomials and applications. Comm. Pure Appl. Math., 62(12):1595–1631, 2009.
12. A. Bressler and R. Pemantle. Quantum random walks in one dimension via generating functions. In 2007 Conference on Analysis of Algorithms, AofA 07, Discrete Math. Theor. Comput. Sci. Proc., AM, pages 403–414. Assoc. Discrete Math. Theor. Comput. Sci., Nancy, 2007.
13. Timothy DeVries, Joris van der Hoeven, and Robin Pemantle. Automatic asymptotics for coefficients of smooth, bivariate rational functions. Online J. Anal. Comb., (6):24, 2011.
14. Jeffrey S. Geronimo, Hugo J. Woerdeman, and Chung Y. Wong. Spectral density functions of bivariable stable polynomials. Submitted, 2020.
15. J. Gillis, B. Reznick, and D. Zeilberger. On elementary methods in positivity theory. SIAM J. Math. Anal., 14(2):396–398, 1983.
16. Mark Goresky and Robert MacPherson. Stratified Morse theory, volume 14 of Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)]. Springer-Verlag, Berlin, 1988.
17. Lars Hörmander. An introduction to complex analysis in several variables, volume 7 of North-Holland Mathematical Library. North-Holland Publishing Co., Amsterdam, third edition, 1990.
18. Pierre Lairez. Computing periods of rational integrals. Math. Comp., 85(300):1719–1752, 2016.
19. Arjen K. Lenstra. Factoring multivariate polynomials over algebraic number fields. SIAM J. Comput., 16(3):591–598, 1987.
20. John Mather. Notes on topological stability. Bull. Amer. Math. Soc. (N.S.), 49(4):475–506, 2012.
21. Marc Mezzarobba. Rigorous multiple-precision evaluation of D-finite functions in SageMath. Technical Report 1607.01967, arXiv, 2016.
22. J. Milnor. Morse theory. Based on lecture notes by M. Spivak and R. Wells. Annals of Mathematics Studies, No. 51. Princeton University Press, Princeton, N.J., 1963.
23. T. Mostowski and E. Rannou. Complexity of the computation of the canonical Whitney stratification of an algebraic set in $\mathbb{C}^n$. In Applied algebra, algebraic algorithms and error-correcting codes (New Orleans, LA, 1991), volume 539 of Lecture Notes in Comput. Sci., pages 281–291. Springer, Berlin, 1991.
24. David Mumford. Algebraic geometry. I. Springer-Verlag, Berlin-New York, 1976.
25. Jay Pantone. The asymptotic number of simple singular vector tuples of a cubical tensor. Online Journal of Analytic Combinatorics, 12, 2017.
26. Robin Pemantle and Mark C. Wilson. Analytic combinatorics in several variables, volume 140 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2013.
27. I. Petrowsky. On the diffusion of waves and the lacunas for hyperbolic equations. Rec. Math. [Mat. Sbornik] N. S., 17(59):289–370, 1945.
28. E. Rannou. The complexity of stratification computation. Discrete Comput. Geom., 19(1):47–78, 1998.
29. Norbert Schappacher. Some milestones of lemniscatomy. In Algebraic geometry (Ankara, 1995), volume 193 of Lecture Notes in Pure and Appl. Math., pages 257–290. Dekker, New York, 1997.
30. Armin Straub and Wadim Zudilin. Positivity of rational functions and their diagonals. J. Approx. Theory, 195:57–69, 2015.
31. A. N. Varčenko. Newton polyhedra and estimates of oscillatory integrals. Funkcional. Anal. i Priložen., 10(3):13–38, 1976.
32. V. A. Vassiliev. Applied Picard-Lefschetz theory, volume 97 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI, 2002.

Chapter 10

Application: Lattice Paths, Revisited

We have continually to make our choice among different courses of action open to us, and upon the discretion with which we make it, in little matters and in great, depends our prosperity and our happiness.
– William A. Whitworth

Thus all actions have one or more of these seven causes: chance, nature, compulsion, habit, reasoning, passion, desire.
– Plato

In our final chapter we return to lattice path enumeration, combining the generating function expressions from Chapter 4 with the asymptotic results of Chapter 9. Once again, we see that lattice path models with similar combinatorial properties have generating functions encoded by multivariate diagonals with similar analytic properties, and can thus be treated uniformly.

First, Section 10.1 generalizes the results of Chapter 6 from lattice path models whose step sets are symmetric over every axis to models whose step sets are symmetric over all but one axis¹. This work, originally carried out in Melczer and Wilson [9], solved previous conjectures on the asymptotics of two-dimensional lattice path models restricted to a quadrant which had resisted purely univariate techniques. Our presentation of this material follows [9]. Section 10.2 then discusses further work from the lattice path literature where diagonal representations and analytic combinatorics in several variables have appeared. A collection of exercises from this literature is presented for the reader to hone their ACSV skills.

Although we reintroduce terminology as needed, the reader who hasn't yet gone through Chapter 4 is encouraged to do so. In particular, recall that for any dimension $d \in \mathbb{N}$ a lattice path model of dimension $d$ is defined by a non-empty finite set of steps $S \subset \mathbb{Z}^d$, a restricting region $R \subset \mathbb{R}^d$, a starting point $p \in R$, and a terminal set $T \subset R$. The model defined by these parameters consists of all finite tuples $(s_1, \ldots, s_r) \in S^r$, called walks or paths, such that $p + s_1 + \cdots + s_r \in T$ and $p + s_1 + \cdots + s_k \in R$ for all $1 \le k \le r$. We often associate a weight $w_s > 0$ to each step $s \in S$ and enumerate walks by weight, where the weight of a walk $(s_1, \ldots, s_r)$ is the product $w_{s_1} \cdots w_{s_r}$ of the weights of its steps. An unweighted model, where one simply counts walks by the number of steps they contain, is the same as the weighted model where each step has weight one. Unless otherwise stated the terminal set $T$ of a model is taken to be the entire restricting region $R$.
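The definition of a lattice path model translates directly into a small dynamic program. The following Python sketch counts weighted walks of a given length by tracking, for each lattice point of the region, the total weight of walks ending there. The function and parameter names (count_walks, in_region, in_terminal, in_orthant) are hypothetical choices for this illustration, and the region and terminal set are passed as predicates on lattice points.

```python
from collections import defaultdict


def count_walks(steps, n, in_region, start, in_terminal=None):
    """Total weight of n-step walks from `start` that stay in the region
    and end in the terminal set.

    steps: dict mapping step tuples s to positive weights w_s.
    in_region, in_terminal: predicates on lattice points; the terminal set
    defaults to the whole region, as in the convention above.
    """
    if in_terminal is None:
        in_terminal = in_region
    current = {tuple(start): 1}
    for _ in range(n):
        following = defaultdict(int)
        for point, weight in current.items():
            for step, w in steps.items():
                q = tuple(p + s for p, s in zip(point, step))
                if in_region(q):  # every partial sum must stay in R
                    following[q] += weight * w
        current = following
    return sum(weight for point, weight in current.items() if in_terminal(point))


def in_orthant(point):
    # Membership in the orthant N^d.
    return all(coordinate >= 0 for coordinate in point)


if __name__ == "__main__":
    simple_steps = {(1, 0): 1, (-1, 0): 1, (0, 1): 1, (0, -1): 1}
    print([count_walks(simple_steps, n, in_orthant, (0, 0)) for n in range(6)])
```

Storing only the points reachable at the current length keeps the sketch short; it is meant for small experiments and makes no attempt at the efficiency needed for large n.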

¹ Recall from Proposition 4.10 in Chapter 4 that in any dimension $d \ge 2$ there exists a lattice path model restricted to $\mathbb{N}^d$ whose set of steps is symmetric over all but two axes which cannot be written as the diagonal of a rational function.

© Springer Nature Switzerland AG 2021 S. Melczer, Algorithmic and Symbolic Combinatorics, Texts & Monographs in Symbolic Computation, https://doi.org/10.1007/978-3-030-67080-1_10


Fig. 10.1 The mostly symmetric step sets of dimension two, up to isomorphism. The step sets in the first row have positive drift, while those in the second row have negative drift.

10.1 Mostly Symmetric Models in an Orthant

We begin by studying walks defined by a weighted step set $S \subset \{\pm 1, 0\}^d$, where each $s \in S$ is given a weight $w_s > 0$, such that

• Walks on the step set can move forwards and backwards in each direction:
    for all $j = 1, \ldots, d$ there exists a step $s \in S$ with $s_j = 1$,
    for all $j = 1, \ldots, d$ there exists a step $s \in S$ with $s_j = -1$;
• Walks are restricted to $R = \mathbb{N}^d$;
• The step set and weighting are symmetric over every axis except for one.

Recall from Chapter 4 that a lattice path model satisfying these conditions is called mostly symmetric, while a lattice path model defined by a step set that is symmetric over every axis is called highly symmetric. In Chapter 6 we gave explicit asymptotic results for highly symmetric models using the analytic methods of Chapter 5 and the fact that the generating function of a highly symmetric model can be represented by the diagonal of a rational function with a smooth singular variety. Now that we have developed analytic combinatorics in several variables for non-smooth points in Chapter 9, we will be able to derive asymptotics for mostly symmetric models.

Example 10.1 (Mostly Symmetric Models in the Quadrant) Recall from Section 4.1.4 of Chapter 4 that many two-dimensional lattice path models in the quadrant $\mathbb{N}^2$ are isomorphic to each other (if their defining step sets differ by a reflection over the line $y = x$) or are isomorphic to half-space models (if one of the bounding axes is never interacted with). Figure 10.1 displays all mostly symmetric step sets of dimension two, up to reflection over the line $y = x$, after step sets corresponding to half-space models are removed.

As in previous chapters, for a variable $z$ we adopt the notation $\bar{z} = 1/z$, which is common in the lattice path literature (we never use an overline to denote complex conjugation). Without loss of generality we may assume the axis of non-symmetry for a mostly symmetric model corresponds to its final coordinate, meaning the (weighted) characteristic polynomial
\[
S(\mathbf{z}) = \sum_{s \in S} w_s \mathbf{z}^{s}
\]
satisfies $S(z_1, \ldots, z_{j-1}, \bar{z}_j, z_{j+1}, \ldots, z_d) = S(\mathbf{z})$ for all $1 \le j \le d - 1$. Because the step sets under consideration are subsets of $\{\pm 1, 0\}^d$, this implies we can write
\[
S(\mathbf{z}) = \bar{z}_d\, A(\hat{\mathbf{z}}) + Q(\hat{\mathbf{z}}) + z_d\, B(\hat{\mathbf{z}}) \tag{10.1}
\]
for Laurent polynomials $A(\hat{\mathbf{z}})$, $Q(\hat{\mathbf{z}})$, and $B(\hat{\mathbf{z}})$ symmetric in $\hat{\mathbf{z}} = (z_1, \ldots, z_{d-1})$. A mostly symmetric step set only has one unrestricted coordinate, so the properties of the step set in this coordinate are crucial to its asymptotic behaviour. In particular, we will see that models share a similar asymptotic template depending on whether more steps move towards or away from the boundary axes. This concept is captured by the following definition.

Definition 10.1 (drift for mostly symmetric models) Given a mostly symmetric model with characteristic polynomial $S(\mathbf{z})$ satisfying (10.1), the drift of the model is the real number $\delta = B(\mathbf{1}) - A(\mathbf{1})$, which is the total weight of steps with positive $d$th coordinate minus the total weight of steps with negative $d$th coordinate. The model has negative, zero, or positive drift depending on whether $\delta < 0$, $\delta = 0$, or $\delta > 0$.

Although we will make use of a uniform diagonal expression for all mostly symmetric models, the local geometry of the singular variety at the contributing singularities (in particular, whether the contributing singularities are smooth or non-smooth transverse multiple points) will depend on the drift of the model. Table 10.1 outlines the different cases that we cover. As can be expected, when the total weight of a step set is fixed then negative drift models, where walks tend towards the boundary axes, grow the slowest (in fact, exponentially slower than models on the same set of steps that are not restricted to an orthant) while positive drift models grow the fastest. Our analytic calculations will explicitly show how the drift comes into play when determining asymptotics.

Class            | Exponential Growth | Order           | Geometry   | Covered By
highly symmetric | $S(\mathbf{1})$    | $n^{-d/2}$      | smooth     | Theorem 6.1 in Ch. 6
positive drift   | $S(\mathbf{1})$    | $n^{1/2 - d/2}$ | non-smooth | Theorem 10.1
negative drift   | $< S(\mathbf{1})$  | $n^{-1 - d/2}$  | smooth     | Theorem 10.2

Table 10.1 Summary of our asymptotic results, including whether or not the exponential growth is the same as the number of unrestricted walks, and whether the contributing singularities in our asymptotic analysis are smooth or non-smooth points.

We now describe our various results in more detail, before proving them.
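The decomposition (10.1) and Definition 10.1 are easy to compute for a concrete weighted step set. The Python sketch below evaluates A(1), Q(1) and B(1) by grouping steps according to their last coordinate, checks the symmetry conditions, and reports the drift class. The two example step sets are hypothetical weighted models chosen for illustration, not models taken from Figure 10.1, and the sketch does not check the separate requirement that walks can move forwards and backwards in every coordinate.

```python
def decompose_at_one(steps):
    # Values A(1), Q(1), B(1) in S(z) = (1/z_d) A + Q + z_d B, read off the last coordinate.
    A = sum(w for s, w in steps.items() if s[-1] == -1)
    Q = sum(w for s, w in steps.items() if s[-1] == 0)
    B = sum(w for s, w in steps.items() if s[-1] == 1)
    return A, Q, B


def is_symmetric_over_axis(steps, j):
    # Negating coordinate j must map each step to a step of the same weight.
    for s, w in steps.items():
        reflected = s[:j] + (-s[j],) + s[j + 1:]
        if steps.get(reflected) != w:
            return False
    return True


def drift(steps):
    A, _, B = decompose_at_one(steps)
    return B - A


def classify(steps):
    d = len(next(iter(steps)))
    if not all(is_symmetric_over_axis(steps, j) for j in range(d - 1)):
        raise ValueError("step set is not symmetric over the first d-1 axes")
    if is_symmetric_over_axis(steps, d - 1):
        return "highly symmetric"
    delta = drift(steps)
    if delta > 0:
        return "positive drift"
    if delta < 0:
        return "negative drift"
    return "zero drift"


if __name__ == "__main__":
    up_heavy = {(1, 0): 1, (-1, 0): 1, (0, 1): 2, (0, -1): 1}
    down_heavy = {(1, 0): 1, (-1, 0): 1, (0, 1): 1, (0, -1): 2}
    for steps in (up_heavy, down_heavy):
        print(decompose_at_one(steps), drift(steps), classify(steps))
```

Representing the weighted step set as a dictionary from step tuples to weights keeps the symmetry check a simple lookup.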


Asymptotics of Positive Drift Models

For each $1 \le k \le d - 1$ let $b_k = \sum_{s \in S,\, s_k = 1} w_s$ be the total weight of the steps moving forwards in the $k$th coordinate (which by symmetry is also the total weight of steps moving backwards in the $k$th coordinate).

Theorem 10.1 (Positive Drift Asymptotics) Let $S \subset \{-1, 0, 1\}^d$ be a weighted set of steps that is symmetric over all but one axis and moves forwards and backwards in each coordinate. If the model defined by $S$ has positive drift then the number $c_n$ of walks taking $n$ steps in $S$, beginning at the origin, and never leaving $\mathbb{N}^d$ satisfies
\[
c_n = S(\mathbf{1})^n\, n^{1/2 - d/2} \left[ \left(1 - \frac{A(\mathbf{1})}{B(\mathbf{1})}\right) \left(\frac{S(\mathbf{1})}{\pi}\right)^{\frac{d-1}{2}} \frac{1}{\sqrt{b_1 \cdots b_{d-1}}} \right] \left(1 + O\!\left(n^{-1}\right)\right). \tag{10.2}
\]

As with our results in Chapter 6 on highly symmetric models, Theorem 10.1 can be instantly applied to any given model, and to families of models in varying dimension.

Example 10.2 (A Family of Positive Drift Models) For any $d \in \mathbb{N}$ consider the (unweighted) model with characteristic polynomial
\[
S(\mathbf{z}) = \bar{z}_d + z_d\, [\ldots]\, (z_j + \bar{z}_j),
\]
…

\[
\frac{S(\mathbf{1}, \rho)^d}{\rho\; b_1(\mathbf{1}, \rho) \cdots b_{d-1}(\mathbf{1}, \rho)\, B(\mathbf{1})}
\]

and let $C_{-\rho}$ be the constant obtained by replacing $\rho$ by $-\rho$ in $C_\rho$. The term under the square-root in $C_\rho$ and $C_{-\rho}$ will always be positive when it appears in our formulas.

Theorem 10.2 (Negative Drift Asymptotics) Let $S \subset \{-1, 0, 1\}^d$ be a weighted set of steps that is symmetric over all but one axis and moves forwards and backwards in each coordinate. If the model defined by $S$ has negative drift and $Q(\hat{\mathbf{z}}) \neq 0$ (i.e., if there are steps in $S$ whose $d$th coordinate is zero) then the number $c_n$ of walks taking $n$ steps in $S$, beginning at the origin, and never leaving the orthant $\mathbb{N}^d$ satisfies
\[
c_n = S(\mathbf{1}, \rho)^n\, n^{-1 - d/2}\, C_\rho \left(1 + O\!\left(n^{-1}\right)\right).
\]
If the model defined by $S$ has negative drift and $Q(\hat{\mathbf{z}})$ is identically zero then
\[
c_n = \left[ S(\mathbf{1}, \rho)^n\, n^{-1 - d/2}\, C_\rho + S(\mathbf{1}, -\rho)^n\, n^{-1 - d/2}\, C_{-\rho} \right] \left(1 + O\!\left(n^{-1}\right)\right).
\]

Example 10.3 (A Family of Negative Drift Models) For any $d \in \mathbb{N}$ consider the (unweighted) model with characteristic polynomial
\[
S(\mathbf{z}) = \bar{z}_d\, [\ldots]\, (z_j + \bar{z}_j) + z_d,
\]