Frédérique Bassino, Ilya Kapovich, Markus Lohrey, Alexei Miasnikov, Cyril Nicaud, Andrey Nikolaev, Igor Rivin, Vladimir Shpilrain, Alexander Ushakov, Pascal Weil
Complexity and Randomness in Group Theory |
GAGTA BOOK 1
Mathematics Subject Classification 2010 35-02, 65-02, 65C30, 65C05, 65N35, 65N75, 65N80
Authors
Prof. Dr. Frédérique Bassino Université Paris 13 LIPN – Institut Galilée Av. J.-B. Clément, 93430 Villetaneuse, France [email protected]
Prof. Dr. Andrey Nikolaev Stevens Institute of Technology Castle Point Terrace Hoboken NJ 07030, USA [email protected]
Prof. Dr. Ilya Kapovich Hunter College Department of Mathematics and Statistics 695 Park Ave, New York 10065, USA [email protected]
Prof. Dr. Igor Rivin Temple University Mathematics Department 1805 Broad st Philadelphia PA 19122, USA
Prof. Dr. Markus Lohrey University of Siegen Department of Electrical Engineering and Computer Science Hölderlinstr. 3, 57076 Siegen, Germany [email protected]
Prof. Dr. Vladimir Shpilrain The City College of New York Department of Mathematics, NAC 8/133 Convent Ave at 138th Street New York NY 10031, USA [email protected]
Prof. Dr. Alexei Miasnikov Stevens Institute of Technology Castle Point Terrace Hoboken NJ 07030, USA [email protected]
Prof. Dr. Alexander Ushakov Stevens Institute of Technology Castle Point Terrace Hoboken NJ 07030, USA [email protected]
Prof. Dr. Cyril Nicaud LIGM Université Paris-Est Marne-la-Vallée 5 boulevard Descartes 77454 Champs-sur-Marne, France [email protected]
Prof. Dr. Pascal Weil LaBRI – CNRS 351 cours de la Liberation 33405 Talence Cedex, France [email protected]
ISBN 978-3-11-066491-1 e-ISBN (PDF) 978-3-11-066702-8 e-ISBN (EPUB) 978-3-11-066752-3 Library of Congress Control Number: 2020934711 Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de. © 2020 Walter de Gruyter GmbH, Berlin/Boston Cover image: Firn / iStock / Getty Images Plus Typesetting: VTeX UAB, Lithuania Printing and binding: CPI books GmbH, Leck www.degruyter.com
Introduction
The goal of this book is to showcase new directions in group theory motivated by computer science. The subtitle (GAGTA book) reflects the fact that the book follows the course of the Geometric and Asymptotic Group Theory with Applications (GAGTA) conference series in transitioning from geometric group theory, which dominated group theory in the late twentieth century, to group theory of the twenty-first century, which has strong connections to computer science. Since its inception in the 1980s, geometric group theory has produced a great deal of important results that were not limited to metric properties of groups, but included also results (such as a solution of Tarski's problems, for example) that can be formulated without mentioning any metric. Now that geometric group theory is drifting further and further away from group theory to geometry, it is natural to look for new tools and new directions in group theory, and this is what we are trying to do in the present book.
In his "Millennium problems", Smale points out that he considers the "P versus NP" problem a gift to mathematics from computer science. The same actually goes for the whole theory of computational complexity, and a notable focus of this book is on the (time and/or space) complexity of various algorithmic problems in group theory, including "traditional" problems such as the word, conjugacy, subgroup membership and isomorphism problems, as well as problems recently influenced by theoretical computer science, including the knapsack problem, the Post correspondence problem and others. Along with the worst-case complexity of algorithms we address the generic-case complexity, or complexity on random inputs (see Chapter 1), which is more relevant to real-life applications, particularly in information security.
The very concept of randomness is important and nontrivial for infinite groups. We discuss various approaches to defining random elements, random subgroups and even random groups in Chapters 1 and 2. Of particular interest is the concept of a random matrix, since it admits several different approaches due to the versatile nature of matrices (see Chapter 3).
Randomness is closely related to how one defines the complexity of a given element of a given set, which is also important for understanding the computational complexity of algorithms operating on a given set, since the latter is defined as a function of the complexity of individual elements. For example, for group elements, the idea of using Kolmogorov complexity as an alternative to the more traditional geodesic length leads to spectacular applications of data compression techniques from computer science to algorithmic problems in group theory, specifically the word problem. This is described in Chapter 4.
We also address, in Chapter 5, an emerging area named discrete optimization in groups, which deals with adaptations of several well-known problems in computer
science (e. g., the subset sum problem, the knapsack problem and the Post correspondence problem) in various groups typically studied in combinatorial group theory.
Finally, in the concluding Chapter 6 we describe several algorithmic problems in group theory motivated by (public-key) cryptography. This includes not only a shift of paradigm from decision to search problems, but also "brand new" problems, notably the hidden subgroup problem (HSP). The importance of the latter problem is due to the fact that Shor's polynomial-time quantum algorithm (for the factoring and discrete logarithm problems), as well as several of its extensions, relies on the ability of quantum computers to solve the HSP for finite abelian groups. It is speculated that for some non-abelian groups, the HSP may be resilient to quantum algorithms and therefore those groups might serve as platforms for so-called post-quantum cryptographic schemes. We note that the HSP was originally defined for finite groups, but some authors recently offered ways to generalize it to infinite groups.
Acknowledgments
We are grateful to Moses Ganardi, Dima Grigoriev, Daniel König, Philipp Reh, and Paul Schupp for many valuable comments and discussions. The work of Ilya Kapovich was partially supported by the NSF grants DMS-1710868 and DMS-1905641. The work of Markus Lohrey was partially supported by the DFG grant Lo748/12-1. The work of Vladimir Shpilrain was partially supported by the ONR grant N000141512164. The work of Pascal Weil was partially supported by Project DeLTA, ANR-16-CE40-0007.
Frédérique Bassino, Ilya Kapovich, Markus Lohrey, Alexei Myasnikov, Cyril Nicaud, Andrey Nikolaev, Igor Rivin, Vladimir Shpilrain, Alexander Ushakov, Pascal Weil
Contents
Introduction | V
1 Generic-case complexity in group theory | 1
1.1 Introduction | 1
1.2 Definition(s) of generic-case complexity | 4
1.2.1 The original definition of generic-case complexity | 4
1.2.2 Weaknesses of the asymptotic density approach to genericity | 7
1.3 Decision problems in group theory: general set-up | 10
1.4 Quotient test methods | 11
1.4.1 Quotient tests and the word problem | 11
1.4.2 Quotient tests and the conjugacy problem | 14
1.4.3 Morseness of generic subgroups and the membership problem | 17
1.4.4 Finer generic conjugacy problem methods | 21
1.5 Generic-case complexity of "search" group-theoretic problems | 23
1.5.1 The word search problem and random van Kampen diagrams | 23
1.5.2 The conjugacy search problem | 25
1.5.3 The membership search problem | 27
1.6 Algorithmically finite groups | 28
1.7 Whitehead algorithm and related problems | 30
1.7.1 Whitehead algorithm and the automorphism problem | 30
1.7.2 Generic-case behavior of Whitehead's algorithm | 32
1.7.3 Whitehead algorithm for subgroups | 33
1.7.4 Garside algorithm for the conjugacy problem in braid groups | 35
1.8 Generic-case complexity of the isomorphism problem | 38
1.8.1 Generic one-relator groups | 38
1.8.2 Generic quotients of the modular group | 40
1.8.3 Other consequences of isomorphism rigidity | 41
2 Random presentations and random subgroups | 45
2.1 Introduction | 45
2.1.1 Discrete representations | 45
2.1.2 Models of randomness | 47
2.2 Random finite presentations | 48
2.2.1 The density model | 48
2.2.2 The few-relators model | 53
2.2.3 One-relator groups | 55
2.2.4 Rigidity properties | 57
2.2.5 Nilpotent groups | 58
2.3 Random subgroups | 61
2.3.1 Stallings graph of a subgroup | 62
2.3.2 The central tree property and its consequences | 63
2.3.3 Random Stallings graphs | 65
2.3.4 Whitehead minimality | 69
2.3.5 Random subgroups of nonfree groups | 70
2.4 Nonuniform distributions | 71
2.4.1 Prefix-heavy distributions | 72
2.4.2 Markovian automata | 73
3 Randomness and computation in linear groups | 77
3.1 What is a random element of an infinite matrix group? | 77
3.1.1 Random walks | 79
3.1.2 Random subgroups | 80
3.2 Properties of generic elements | 80
3.2.1 The easy case: SL(2, ℤ) | 80
3.2.2 Random products of matrices in the symplectic and special linear groups | 83
3.2.3 Stronger irreducibility | 84
3.3 Random walks on groups and graphs | 84
3.4 Fourier transform on finite groups | 85
3.5 Fourier estimates via linear algebra | 87
3.5.1 Proof of Theorem 3.5.3 | 88
3.6 Some remarks on matrix norms | 89
3.7 Properties of random subgroups | 90
3.7.1 A guide to the rest of the section | 90
3.8 Subgroups of SL2 (ℤ) | 92
3.9 Subgroups of SLn (ℤ) for n > 2 | 97
3.9.1 Ping-pong | 97
3.9.2 ϵ-contraction in SLn (ℤ) where n > 2 | 99
3.9.3 Proof of Theorem 3.7.1 | 102
3.10 Well-roundedness | 109
3.11 Lyapunov exponent estimates | 112
3.12 How to pick a random element? | 114
3.12.1 How to produce random numbers with a given density? | 115
3.13 Geometric preliminaries | 115
3.13.1 Uniform random points in balls | 115
3.13.2 Computing a random integer matrix | 116
3.14 Action of SL(2, ℝ) and SL(2, ℤ) on the upper half-plane | 117
3.14.1 Translation distance | 118
3.14.2 The fundamental domain and orbits of the SL(2, ℤ) action | 118
3.15 Selecting a random element of SL(2, ℤ) almost uniformly | 120
3.15.1 Complexity estimates and implementation | 121
3.16 Extensions to other Fuchsian and Kleinian groups | 122
3.16.1 Constructing the fundamental domain | 122
3.17 Higher rank | 123
3.17.1 SL(n, ℤ) | 123
3.17.2 Sp(2n, ℤ) | 124
3.18 Miscellaneous other groups | 125
3.18.1 The orthogonal group | 125
3.18.2 Finite linear groups | 127
3.18.3 SL(n, ℝ) | 127
3.18.4 Other groups? | 128
3.19 Checking Zariski density | 129
3.20 Algorithms for large Galois groups | 131
3.21 Probabilistic algorithms | 133
3.22 Probabilistic algorithm to check if p(x) of degree n has Galois group Sn | 134
3.22.1 Some remarks on the running time of detecting Galois group Sn | 136
3.22.2 Deciding whether the Galois group of a reciprocal polynomial is the hyperoctahedral group | 137
3.23 Back to Zariski density | 138
3.23.1 Testing irreducibility | 138
3.24 A short history of Galois group algorithms | 139
3.24.1 Kronecker's algorithm | 139
3.24.2 Stauduhar's algorithm | 140
3.24.3 Polynomial time (sometimes) | 141
3.25 Some lemmas on permutations | 143
3.25.1 Jordan's theorem | 145
3.26 A bit about polynomials | 146
3.27 The Frobenius density theorem | 146
3.28 Another Zariski density algorithm | 148
3.29 The base case: rank 1 | 149
3.30 Higher rank | 150
3.31 Thin or not? | 150
3.31.1 Computing the fundamental polyhedron | 151
3.31.2 Eigenvalues | 151
3.31.3 Asymptotic eigenvalue distribution | 152
3.31.4 Finite quotients | 153
4 Compression techniques in group theory | 155
4.1 Introduction | 155
4.2 General notations | 158
4.3 Background from complexity theory | 159
4.4 Rewrite systems | 162
4.5 Groups and the word problem | 163
4.5.1 Presentations for groups | 163
4.5.2 The word problem | 164
4.5.3 HNN-extensions | 166
4.5.4 The Dehn function | 167
4.6 Exponential compression | 168
4.6.1 Motivation: the word problem for Aut(F3 ) | 169
4.6.2 Straight-line programs | 170
4.6.3 Jeż's algorithm for equality checking | 171
4.6.4 Cutting out factors from SLPs | 182
4.6.5 The compressed word problem | 184
4.6.6 Complexity of compressed word problems | 186
4.6.7 Power word problems | 192
4.6.8 Further applications of straight-line programs in group theory | 194
4.7 Tower compression and beyond | 199
4.7.1 Motivation: the word problem for the Baumslag group | 199
4.7.2 Power circuits | 201
4.7.3 Solving word problems using power circuits | 214
4.7.4 Ackermannian compression | 218
4.8 Open problems | 220
5 Discrete optimization in groups | 223
5.1 Introduction | 223
5.1.1 Motivation, general set-up, notable results | 223
5.1.2 Brief overview of the problems | 224
5.1.3 Preliminaries: algorithmic set-up | 231
5.1.4 Preliminaries: complexity classes | 233
5.1.5 Preliminaries: nilpotent groups | 235
5.1.6 Preliminaries: hyperbolic groups | 237
5.1.7 Preliminaries: graph groups and virtually special groups | 239
5.1.8 Preliminaries: automaton groups | 240
5.1.9 Preliminaries: wreath products | 240
5.1.10 Preliminaries: polycyclic groups, metabelian groups, Fox derivatives | 241
5.2 Subset sum problem and related problems | 246
5.2.1 Definition | 246
5.2.2 Examples and basic properties | 248
5.2.3 Easy SSP | 250
5.2.4 Distortion as a source of hardness of SSP | 254
5.2.5 Large abelian subgroups as a source of hardness | 260
5.2.6 Mikhailova's construction as a source of hardness | 262
5.2.7 Transfer results for SSP and related problems | 265
5.3 Knapsack problem | 271
5.3.1 Definition | 271
5.3.2 Groups with semilinear solution to KP | 273
5.3.3 Right-angled Artin groups | 281
5.3.4 Hardness from Diophantine equations | 283
5.3.5 Transfer results | 286
5.4 Post correspondence problem | 291
5.4.1 Connections of PCP to group theory | 291
5.4.2 Hereditary word problem and GPCP | 293
5.4.3 PCP in nilpotent groups | 297
5.4.4 General lemmas and remarks | 302
5.4.5 Post correspondence and equalizer problems in metabelian groups | 306
5.4.6 Post correspondence and equalizer problems for polycyclic groups | 312
6 Problems in group theory motivated by cryptography | 317
6.1 Introduction | 317
6.2 The Diffie–Hellman key exchange protocol | 318
6.2.1 The ElGamal cryptosystem | 320
6.3 The conjugacy problem | 320
6.3.1 The Anshel–Anshel–Goldfeld key exchange protocol | 322
6.3.2 The twisted conjugacy problem | 324
6.4 The decomposition problem | 324
6.4.1 "Twisted" protocol | 325
6.4.2 Finding intersection of given subgroups | 326
6.4.3 Commutative subgroups | 327
6.4.4 The factorization problem | 327
6.5 The word problem | 328
6.5.1 Encryption emulation attack | 329
6.5.2 Encryption | 331
6.6 The subgroup membership problem | 332
6.6.1 Security assumption | 334
6.6.2 Trapdoor | 334
6.7 Using the subgroup membership decision problem | 334
6.8 The isomorphism inversion problem | 335
6.9 Semidirect product of groups and more peculiar computational assumptions | 339
6.10 The subset sum problem and the knapsack problem | 342
6.11 The hidden subgroup problem | 344
6.12 Relations between some of the problems | 345
Bibliography | 349
Index | 371
1 Generic-case complexity in group theory
1.1 Introduction
The notion of generic-case complexity was introduced in the 2003 paper [240] with the idea of capturing "practical" behavior of group-theoretic algorithms. Since then generic-case complexity has moved far outside group theory and found applications in logic, computer science, engineering and other areas. However, the most convincing and well-developed applications of generic-case complexity to date have been to group-theoretic algorithmic problems. The reasons for this phenomenon are not entirely clear, but include the fact that the theory of random walks works particularly well in the context of groups. This chapter aims to survey the progress in the study of generic-case complexity of group-theoretic problems since the mid-2000s.
Probabilistic methods and ideas have played an increasingly important role in geometric group theory since the 2000s. A seminal result of Gromov [189] used a probabilistic method to prove the existence of a finitely presented group that is not uniformly embeddable into a Hilbert space. "Random groups," in Gromov's density model, have been used to produce new families of examples of groups with Kazhdan's property (T) [503]. Recent advances in the study of random walks on groups produced a rather detailed understanding of algebraic and geometric properties of "random" elements of mapping class groups, of Out(Fn ), of right-angled Artin groups, of groups acting on CAT(0) spaces, etc. [422, 424, 469, 324, 325]. Probabilistic group-theoretic methods have also been applied in geometric topology to understand the properties of "random" 3-manifolds [116, 115], and so on. Examples of these kinds, where one tries to understand algebraic and geometric properties of "random" group-theoretic objects (groups, subgroups, group elements, subgroup distortion, etc.) now abound in the literature and represent an important line of research in geometric group theory. This survey concentrates more specifically on results concerning generic-case complexity for group-theoretic algorithmic problems. Although many of these results are informed by the other probabilistic developments mentioned above, we will only cover those developments to the extent they have direct bearing specifically on generic-case complexity for group-theoretic problems.
The notion of generic-case complexity differs from the various previously considered probabilistic complexity classes in that generic-case complexity uses a random process in generating inputs for an algorithmic problem rather than in defining how various computational devices (such as probabilistic Turing machines) perform computations. Generic-case complexity also differs from the older notion of average-case complexity in that the latter averages the behavior of the algorithm over all inputs, while the former completely disregards the behavior of an algorithm on a "negligible" collection of inputs. For this reason a problem which is algorithmically undecidable may have low generic-case complexity, while that is not possible for average-case complexity. Indeed, average-case complexity always involves a trade-off between the
behavior of an algorithm on "typical" inputs and the worst-case behavior of the algorithm on "hard" inputs.
We believe that generic-case complexity better captures the "practical" behavior of various algorithms. A prototypical example here is provided by the simplex method algorithm in linear programming. Klee and Minty [265] provided an exponential time lower bound for the worst-case complexity of this algorithm. Yet the simplex method algorithm runs thousands of times a day in numerous applications and (almost) always terminates very quickly, in low-degree polynomial time. Thus, in practical terms, we actually observe the generic-case complexity of the simplex method rather than its average-case complexity.
For the definition below we first fix a computational model where algorithms (or partially computable functions) take inputs from a set U. The set U may be a countable set (usually with a specific enumeration), such as the set of all words over a given finite alphabet, or an uncountable set, such as, say, ℝᵏ for some integer k ≥ 1.
Definition 1.1.1 (Generic-case complexity). Let U be a set, equipped with a computational model of algorithms taking inputs from U. We also fix a mode of computation (e. g., deterministic, nondeterministic, etc.). Let
𝒲 = W1 , W2 , . . . , Wn , . . .
be a discrete-time random process with values in U, such that for every integer n ≥ 1 Wn is a U-valued random variable with an atomic probability distribution on U.
(1) Let Ω be an algorithm with inputs from U, defining a partially computable function fΩ from U to some other set 𝒥, operating in our chosen mode of computation. Let 𝒞 be a complexity class for our computational model and mode. We say that Ω has generic-case complexity 𝒞 with respect to 𝒲 if
lim_{n→∞} P(Ω terminates on input Wn within the complexity bound 𝒞 in terms of n) = 1.
If, in addition, the convergence in the above limit is exponentially fast, we say that Ω has strong generic-case complexity 𝒞 with respect to 𝒲.
(2) Let h : U → 𝒥 be a function. Let 𝒞 be a complexity class for our computational model and mode. Let Ω be an algorithm with inputs from U, defining a partial function fΩ from U to 𝒥 (so that whenever fΩ terminates on an input w ∈ U, the output is an element of 𝒥). We say that Ω computes h with (strong) generic-case complexity 𝒞 with respect to 𝒲 if the following conditions hold:
(a) The algorithm Ω is correct for h, that is, whenever fΩ (w) is defined for w ∈ U, then fΩ (w) = h(w).
(b) The algorithm Ω has (strong) generic-case complexity 𝒞 with respect to 𝒲.
(3) Let h : U → 𝒥 be a function. We say that h is computable with (strong) generic-case complexity 𝒞 with respect to 𝒲 if there exists an algorithm Ω as in (2) such that Ω computes h with (strong) generic-case complexity 𝒞 with respect to 𝒲.
For example, if 𝒞 is "deterministic linear time," then in the above definition saying that Ω has generic-case complexity 𝒞 with respect to 𝒲 means that Ω is a deterministic algorithm with inputs from U and that there exists a constant C > 0 such that
lim_{n→∞} P(Ω terminates on input Wn in time ≤ Cn) = 1.
If the chosen mode of computation is nondeterministic, and 𝒞 is "nondeterministic polynomial time," then saying that Ω has generic-case complexity 𝒞 with respect to 𝒲 means that Ω is a nondeterministic algorithm with inputs from U and that there exist an integer k ≥ 1 and a constant C > 0 such that
lim_{n→∞} P(Ω terminates on some computational path starting with input Wn in time ≤ Cnᵏ) = 1.
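For readers who like to experiment, the following is a minimal Monte Carlo sketch (in Python, not part of the original text) of what Definition 1.1.1 measures. The names sample_input and algorithm are hypothetical stand-ins for the random process 𝒲 and a partial algorithm Ω; the toy instance uses the abelianization test that reappears later in this chapter.

    import random

    def estimate_termination_probability(sample_input, algorithm, n, C=1, trials=1000):
        """Estimate P(the partial algorithm answers on input W_n within C*n steps).

        `sample_input(n)` plays the role of the random variable W_n and
        `algorithm(w, budget)` returns an answer, or None if it gives up
        within `budget` elementary steps.  Both are stand-ins for the
        random process and the partial algorithm of Definition 1.1.1.
        """
        successes = 0
        for _ in range(trials):
            w = sample_input(n)
            if algorithm(w, budget=C * n) is not None:
                successes += 1
        return successes / trials

    # Toy instance: W_n is a length-n word over {a, A, b, B} (A = a^{-1}, B = b^{-1});
    # the "algorithm" answers only when the exponent sum of a is nonzero.
    def sample_input(n):
        return [random.choice("aAbB") for _ in range(n)]

    def algorithm(w, budget):
        s = w[:budget]                      # never read more than `budget` letters
        e = s.count("a") - s.count("A")     # image in ℤ under a -> 1, b -> 0
        return "nontrivial" if e != 0 else None

    if __name__ == "__main__":
        for n in (10, 100, 1000):
            print(n, estimate_termination_probability(sample_input, algorithm, n))

As n grows, the estimated probability approaches 1, illustrating generic-case (though here not strongly generic-case) linear-time behavior.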
In Definition 1.1.1 we think of computing the function h : U → 𝒥 as an algorithmic problem, with an algorithm Ω trying to solve this problem. An important class of algorithmic problems for Definition 1.1.1 corresponds to the case where h = χ𝒟 : U → {0, 1} is the characteristic function of a subset 𝒟 ⊆ U. We refer to such h = χ𝒟 (and sometimes to the corresponding subset 𝒟 ⊆ U) as a decision problem for membership in 𝒟 ⊆ U. A key feature of the notion of generic-case complexity is that this notion does depend, in a crucial way, on the random process used to generate the inputs of an algorithm. We view this feature as a strength of the notion since practical behavior of an algorithm does depend on the random process used to generate the inputs. In Section 1.2 below we discuss the notion of generic-case complexity in more detail, and present several earlier versions of this notion (starting with the definition given in [240]). Those older definitions are based on the concept of “asymptotic density” for subsets of U, and have significant limitations. The asymptotic density approach (particularly when based on using uniform probability distributions on balls of increasing radius) leads to too naive a notion of generic-case complexity since most practical and natural approaches of generating inputs probabilistically produce sequences of distributions that are far from being uniform. For most natural settings it is not practical to try to pick uniformly at random an element from a ball (or a sphere) in some data structure. In addition, making the definition of generic-case complexity depend on a specific “size” function on the inputs is, to some extent, artificial. It is more natural to regard the time n, needed to generate an input Wn by a random process 𝒲 , as the “size” of that input. It is usually the case that there is then a (nonunique)
way to define a "size" function adapted to 𝒲, such that for some c > 0 every input produced by Wn has size ≤ cn (or perhaps ≤ some polynomial in n) for all n ≥ 1. This "size" function could be the length of a word, the number of vertices in a graph, the number of vertices plus the number of edges in a graph, the number of cells in a cell-complex (or a diagram), etc. However, it is not really necessary to make the "size" function an explicit part of the definition of generic-case complexity. Definition 1.1.1, given above, dispenses with the dependence on the size function, and produces a notion that is considerably more general and robust than earlier definitions of generic-case complexity.
This definition also more clearly indicates the way in which generic-case complexity uses probability theory. In particular, Definition 1.1.1 elucidates the point that, when dealing with generic-case complexity, rather than talking about "generic" subsets of the set of inputs U it is more appropriate to think about "generic" sets of sequences of elements of U (that is, "generic" subsets of Uℕ) and about "generic" sequences of events (that is, "generic" sequences of subsets of Uℕ).
A useful feature of the notion of generic-case complexity is that in many situations it allows for genuine average-case complexity conclusions (which are usually difficult to come by). The idea here is the following. Suppose that a certain problem is decidable strongly generically in time f1 (n) (where f1 (n) is, say, quadratic time), and that this problem is decidable with worst-case complexity in time f (n) where the gap f (n)/f1 (n) is subexponential (e. g., if f (n) is quintic time). Then the problem is decidable in time bounded by f1 (n) on average. This simple averaging idea is formalized and explained in more detail in [241] (see also [366, Chapter 10.2] for more details) and has a surprising number of group-theoretic applications.
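One way to make this averaging step explicit (our sketch, under the assumption that the complement of the strongly generic set of inputs has probability at most cⁿ for some 0 < c < 1) is the one-line estimate
\[
\mathbb{E}\bigl[T(W_n)\bigr] \;\le\; f_1(n)\cdot \mathbb{P}(W_n \text{ is easy}) \;+\; f(n)\cdot \mathbb{P}(W_n \text{ is hard})
\;\le\; f_1(n) \;+\; f(n)\,c^{\,n} \;=\; f_1(n)\,\bigl(1+o(1)\bigr),
\]
since f(n)/f1(n) grows subexponentially while cⁿ decays exponentially.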
1.2 Definition(s) of generic-case complexity
In this section, unless specified otherwise, we work in the classic Turing machine model of computability.
1.2.1 The original definition of generic-case complexity
As noted above, the original notion of generic-case complexity, as defined in [240], was based on the idea of "asymptotic density." Let U be a countable set (which we think of as the set of inputs for a particular algorithmic problem) and let s : U → ℤ≥0 be a function, called the size function, such that for every integer n ≥ 0 the set {w ∈ U | s(w) ≤ n} is finite. For a subset S ⊆ U and an integer n ≥ 0 denote γn (S) := #{w ∈ S : s(w) = n} and βn (S) := #{w ∈ S : s(w) ≤ n}.
For a bounded sequence bn ∈ ℝ such that limn→∞ bn = b for some b ∈ ℝ, we say that the convergence in this limit is exponentially fast if there exists 0 < c < 1 such that
|bn − b| = o(cⁿ) as n → ∞. We say that the convergence in this limit is superpolynomially fast if for every integer k ≥ 1 we have |bn − b| = o(n⁻ᵏ) as n → ∞.
Definition 1.2.1. Let U and s : U → ℤ≥0 be as above. Assume that there is n0 ≥ 0 such that for every n ≥ n0 we have βn (U) > 0; we call such s admissible.
1. For a subset S ⊆ U define the upper density ρ̄U (S) of S in U as
ρ̄U (S) := lim sup_{n→∞} βn (S)/βn (U)
and the lower density ρ̲U (S) of S in U as
ρ̲U (S) := lim inf_{n→∞} βn (S)/βn (U).
If ρ̄U (S) = ρ̲U (S) (that is, if the limit lim_{n→∞} βn (S)/βn (U) exists), then this common value is called the asymptotic density of S in U and is denoted by ρU (S).
2. A subset S ⊆ U is called generic in U if ρ̲U (S) = 1, that is, if ρU (S) exists and ρU (S) = 1. If, in addition, the convergence in the limit lim_{n→∞} βn (S)/βn (U) = 1 is exponentially fast, we say that S ⊆ U is strongly generic in U.
3. A subset S ⊆ U is called negligible in U if ρ̄U (S) = 0, that is, if ρU (S) exists and ρU (S) = 0. If, in addition, the convergence in the limit lim_{n→∞} βn (S)/βn (U) = 0 is exponentially fast, we say that S ⊆ U is strongly negligible in U.
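To see Definition 1.2.1 at work on a concrete example (ours, not from the text), the following fragment computes the ball quotients βn(S)/βn(U) for U = {a, b}∗ and S the set of words that use the two letters equally often; the quotients tend to 0, so S is negligible (though not strongly negligible, since the decay is only of order 1/√n).

    from math import comb

    def gamma_S(m):
        """Number of words of length m over {a, b} with equally many a's and b's."""
        return comb(m, m // 2) if m % 2 == 0 else 0

    def ball_density(n):
        """beta_n(S) / beta_n(U) for U = {a,b}^* and the 'balanced words' set S."""
        beta_S = sum(gamma_S(m) for m in range(n + 1))
        beta_U = sum(2 ** m for m in range(n + 1))
        return beta_S / beta_U

    if __name__ == "__main__":
        for n in (5, 10, 20, 40, 80):
            print(n, round(ball_density(n), 6))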
Sometimes, strongly generic/negligible subsets are defined by requiring the convergence in the limit lim_{n→∞} βn (S)/βn (U) to 1 or 0 to be superpolynomially fast, rather than exponentially fast.
Example 1.2.2. There are several important special cases for the above definition.
1. Let Σ be a finite alphabet with k ≥ 2 elements. Put U = Σ∗ to be the set of all words over Σ. Define s : Σ∗ → ℤ≥0 to be s(w) = |w|, the length of w, for w ∈ Σ∗. Note that in this case γn (U) = kⁿ and βn (U) = (kⁿ⁺¹ − 1)/(k − 1).
2. Let k ≥ 1 be an integer and U = ℤᵏ or U = ℕᵏ. Define s : U → ℤ≥0 as s(n1 , . . . , nk ) = max{|n1 |, . . . , |nk |}. Here |n| is the absolute value of n. This set-up leads to the classic notion of asymptotic density for subsets of ℤᵏ and ℕᵏ used in number theory and also in modern coarse computability theory.
3. Let r ≥ 2 be an integer, and let Fr = F(a1 , . . . , ar ) be the free group of rank r (where we think of Fr as the set of all freely reduced words over the alphabet Ar = {a1 , . . . , ar }±1 ). Let U = Fr or let U be the set of all cyclically reduced words in Fr. Define s : U → ℤ≥0 as s(w) = |w|, the length of w, for a word w ∈ U.
4. Let Σ be a finite alphabet with k ≥ 2 elements and let U0 = Σ∗. Let d ≥ 2 be an integer and let U = U0ᵈ. Define s : U → ℤ≥0 as s(w1 , . . . , wd ) = max{|w1 |, . . . , |wd |}.
5. Similarly, let r ≥ 2, Fr = F(a1 , . . . , ar ) and Ar = {a1 , . . . , ar }±1 be as above. Let d ≥ 2 be an integer. Let U0 = Fr or let U0 be the set of all cyclically reduced words in Fr. Let U = U0ᵈ. Define s : U → ℤ≥0 as s(w1 , . . . , wd ) = max{|w1 |, . . . , |wd |}.
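For parts (1) and (3) of Example 1.2.2 the sphere and ball sizes are completely explicit; the short check below (added here for convenience) tabulates them, using the standard counts kⁿ for words of length n over Σ and 2r(2r − 1)ⁿ⁻¹ for freely reduced words of length n ≥ 1 in Fr.

    def sphere_sizes_words(k, n):
        """gamma_m(U) for U = Sigma^* with |Sigma| = k, for m = 0..n."""
        return [k ** m for m in range(n + 1)]

    def sphere_sizes_free_group(r, n):
        """Number of freely reduced words of length m in F_r, for m = 0..n."""
        return [1] + [2 * r * (2 * r - 1) ** (m - 1) for m in range(1, n + 1)]

    def ball_size(spheres):
        return sum(spheres)

    if __name__ == "__main__":
        k, r, n = 4, 2, 5
        print(ball_size(sphere_sizes_words(k, n)))       # equals (k**(n+1) - 1)//(k - 1)
        print(ball_size(sphere_sizes_free_group(r, n)))  # ball of radius n in F_2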
Remark 1.2.3 (Spherical versus ball asymptotic density). Suppose that U = Σ∗, as in part (1) of Example 1.2.2 above. It was observed in [240], as a consequence of Stolz's theorem, that if U ⊆ Σ∗ is such that γn (U) > 0 for all sufficiently large n and if S ⊆ U is such that the limit lim_{n→∞} γn (S)/γn (U) exists, then the asymptotic density ρU (S) also exists and one has ρU (S) = lim_{n→∞} γn (S)/γn (U). Moreover, in this situation if lim_{n→∞} γn (S)/γn (U) = 1 with exponentially fast convergence, then S is strongly generic in U, and if lim_{n→∞} γn (S)/γn (U) = 0 with exponentially fast convergence, then S is strongly negligible in U. The same argument works in the setting of part (3) of Example 1.2.2, where U = Fr or U is the set of all cyclically reduced words in Fr.
Most proofs involving asymptotic density computations for subsets of Σ∗ involve using the spherical quotients γn (S)/γn (U), and then applying Stolz's theorem as above. Various random walk and Markov chain arguments typically involve working with probability distributions supported on n-spheres in Σ∗ and produce information about γn (S)/γn (U). In terms of genericity reflecting the practical idea of randomness, working with the spherical quotients γn (S)/γn (U) rather than with the ball quotients βn (S)/βn (U) is also more natural, since it is usually hard to practically emulate the uniform probability distribution on an n-ball in U while it is often straightforward to do that for n-spheres. Therefore in some sources asymptotic density is defined in terms of γn (S)/γn (U). Because of the Stolz theorem trick mentioned above, this spherical quotient approach results in notions very similar to the above definition.
Remark 1.2.4. Let U, U0 , d ≥ 2 and s : U → ℤ≥0 be as in part (4) or part (5) of Example 1.2.2. In this situation, if S0 ⊆ U0 is a (strongly) generic subset of U0 , then S0ᵈ ⊆ U = U0ᵈ is a (strongly) generic subset of U.
Definition 1.2.5. Let U be a countably infinite set given with a specific enumeration and let s : U → ℤ≥0 be an admissible size function. We fix a specific mode of computation (e. g., deterministic). Let 𝒥 be another countable set and let h : U → 𝒥 be a function (where we think of computing h as an algorithmic problem). Let Ω be a correct partial algorithm for computing h with inputs from U in our chosen mode of computation. This means that Ω computes a partial function from U to 𝒥 and that whenever Ω with an input k-tuple τ = (w1 , . . . , wk ) ∈ (Σ∗ )ᵏ terminates and produces a value y ∈ 𝒥, then h(τ) = y. Let 𝒞 be a complexity class for our chosen mode of computation.
(a) We say that Ω computes h with generic-case (respectively, strongly generic-case) complexity 𝒞 with respect to s if there exists a generic (respectively, strongly generic) subset S ⊆ (Σ∗ )ᵏ such that for every k-tuple τ ∈ S the algorithm Ω terminates on the input τ within the complexity bound 𝒞.
(b) We say that h is computable with generic-case (respectively, strongly generic-case) complexity 𝒞 with respect to s if there exists a correct partial algorithm Ω for h computing with generic-case (respectively, strongly generic-case) complexity 𝒞 with respect to s.
As noted in Section 1.1, if 𝒟 ⊆ U and h = χ𝒟 : U → {0, 1} is the characteristic function of 𝒟, we think of h as a decision problem for deciding whether or not an element of U belongs to 𝒟. For that reason we often do not distinguish between 𝒟 and χ𝒟 and refer to 𝒟 ⊆ U as a decision problem. For group-theoretic decision problems, the most frequently used choice of U is U = Σ∗X or U = F(X), when working with a group G given by a presentation G = ⟨X | R⟩, where X is finite. Here ΣX = X ⊔ X−1. For problems where inputs are d-tuples of words, one uses U = (Σ∗X)ᵈ or U = F(X)ᵈ or, sometimes, U being the set of all d-tuples of cyclically reduced words in F(X).
1.2.2 Weaknesses of the asymptotic density approach to genericity
The original definition of generic-case complexity, based on the asymptotic density approach, has several significant conceptual and mathematical weaknesses.
– The notion of generic-case complexity is supposed to capture practical behavior of an algorithm on "random" inputs. The asymptotic density approach takes the naive view that a "random" element of U means a uniformly at random chosen element from the n-ball {w ∈ U | s(w) ≤ n}, where n ≫ 1 is large (or sometimes from the n-sphere {w ∈ U | s(w) = n}). However, in practice it is almost never the case that we are working with a "natural" random process generating a uniform probability distribution on such an n-ball or an n-sphere in U. For example, for most finitely generated or finitely presented groups, it is extremely difficult or impossible in practice to generate the uniform probability distribution on the n-ball or the n-sphere in the Cayley graph of such a group. In practice we are usually given some "natural" random process 𝒲 (such as a random walk on a group, some growth process building van Kampen diagrams, some probabilistic process generating Stallings subgroup graphs, etc.) generating inputs of some problem, but the n-step distribution of this process is far from being uniform on its support.
– The asymptotic density definition, Definition 1.2.5, of generic-case complexity masks the crucial role played by the random process 𝒲 generating the inputs from U. In reality, the notion of generic-case complexity fundamentally depends on the choice of 𝒲, and this fact needs to be highlighted rather than hidden.
– The asymptotic density approach leads to a common and still persistent misconception that the set U of inputs for a particular problem is usually divided into two disjoint subsets U = Ueasy ⊔ Uhard. In fact, it is more appropriate to talk about easy/hard sequences of inputs from U. That is, when talking about generic/negligible sets on which the behavior of an algorithm is being analyzed, we should be looking at subsets of the infinite product space Uℕ, or, more generally, at sequences of subsets of Uℕ, rather than at subsets of U itself. This crucial point remains almost entirely ignored in the generic-case complexity literature, and constitutes the most important weakness of all the previous versions of generic-case complexity, including the "ensemble" definition discussed below.
– The presence of the size function s in Definition 1.2.5 is, to some extent, artificial. It is true that typically, when a random process 𝒲 = W1 , W2 , . . . , Wn , . . . generates inputs from U, there is some natural size/complexity function s : U → ℤ≥0 (such as the length of the word, the total number of cells in a complex, the number of vertices in a graph, etc.) such that we always have s(Wn ) ≤ Cn for some constant C ≥ 1. However, defining s explicitly is often not necessary for proofs and actual computations, and frequently there is more than one choice of s that works. Conceptually, it is preferable to think about the time n needed to generate the input Wn as the "size" of that input.
– Compared with the random process definition, Definition 1.1.1, of generic-case complexity, Definition 1.2.5 makes it less clear where and how probabilistic arguments, proofs and tools enter the theory.
Definition 1.1.1 of generic-case complexity rectifies all of these weaknesses. All of the existing results on generic-case complexity fit the framework of Definition 1.1.1, for appropriate choices of 𝒲.
In the literature there is an existing version of the notion of generic-case complexity which is mathematically fairly close to Definition 1.1.1, namely, the "ensemble" definition of generic-case complexity used in [349]. There one works with a countably infinite set U, an admissible size function s : U → ℤ≥0 and a sequence μ = (μn)n≥1 of atomic probability distributions on U such that for some constant C > 0 the support Supp(μn ) of μn is contained in the ball {w ∈ U | s(w) ≤ Cn} for all n ≥ 1. Then (U, μ) is called a distribution space. A subset S ⊆ U is called μ-generic if limn→∞ μn (S) = 1; if, in addition, the convergence in this limit is exponentially fast, we say that S is strongly μ-generic. Let 𝒥 be another countable set and let h : U → 𝒥 be a function. We fix a mode of computation, and let 𝒞 be a complexity class for that mode. Let Ω be a correct partial algorithm, in our chosen mode, for computing h with inputs from U. We say that Ω computes h with generic-case (respectively, strongly generic-case) complexity 𝒞 with respect to μ if there exists a μ-generic (respectively, strongly μ-generic) subset S ⊆ U such that for every w ∈ S the algorithm Ω terminates on the input w within the complexity bound 𝒞.
This definition is inferior to Definition 1.1.1 for several reasons. First, although the n-th step Wn of 𝒲 does produce a probability distribution μn on U, in probability theory and random walks almost all arguments and proofs are phrased in terms of probabilities of various events rather than in terms of the sequence of the n-step distributions μn. Second, and most importantly, the "ensemble" definition still
thinks about generic subsets S of U rather than about generic sets of sequences of elements of U and generic sequences of subsets of Uℕ. Third, the "ensemble" definition still relies on the use of the size function s, which, as we noted above, is usually unnecessary in practice.
In the context of Definition 1.1.1, let μn be the probability distribution on U given by the n-th step Wn of 𝒲. For the product space Uℕ we get a corresponding product probability measure μ∗ = ⨉n≥1 μn. For each n ≥ 1 define
En = {(wi)i≥1 ∈ Uℕ | Ω terminates on input wn within the complexity bound 𝒞 in terms of n}.
Part (1) of Definition 1.1.1 says that lim_{n→∞} μ∗(En) = 1. Thus we have a "generic" sequence En of subsets of Uℕ here, although it is more beneficial to think in terms of Definition 1.1.1 itself.
Note that in the special cases discussed in Example 1.2.2 above, the original asymptotic density-based definition of generic-case complexity (Definition 1.2.5) still works reasonably well, although the proofs even in those contexts are often better suited for the more general Definition 1.1.1.
A key feature of all versions of the definition of generic-case complexity is that the partial algorithm Ω for a particular function h : U → 𝒥 is required to be correct for h, that is, whenever Ω terminates on an input w ∈ U and produces a value in 𝒥, that value is the correct value of h(w). Suppose in part (2) of Definition 1.1.1 we relax this requirement and just require that Ω has generic-case complexity 𝒞 with respect to 𝒲, and that
lim_{n→∞} P(Ω terminates on input Wn and outputs the correct value of h(Wn) ∈ 𝒥) = 1.
We would then say that Ω computes h with coarse complexity 𝒞. This approach leads to the notion of "coarse computability" and of "coarse complexity classes" studied in detail in [212, 211]. The two theories are related but fundamentally different. We illustrate this point by the following simple example from group theory.
Let G = ⟨X | R⟩ be an arbitrary infinite finitely generated group, where X is finite (and where there are no restrictions on R). Put ΣX = X ∪ X−1 and U = Σ∗X. Let 𝒲 = W1 , W2 , . . . , Wn , . . . be the simple random walk on G with respect to X. Let 𝒟 : Σ∗X → {0, 1} be the word problem for G, that is, 𝒟(w) = 1 if w =G e and 𝒟(w) = 0 if w ≠G e. Consider the deterministic algorithm Ω with inputs from U such that for every w ∈ Σ∗X the algorithm Ω terminates in a single step and outputs the value 0 (that is, Ω immediately claims that w ≠G e). The assumption that G is infinite implies, by basic results about random walks on graphs, that Ω computes 𝒟 coarsely in constant time. However, Ω is not a correct algorithm for 𝒟 (since there are some w ∈ Σ∗X such that w =G e), and therefore Ω tells us nothing about the generic-case complexity of the word problem 𝒟 for G.
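The effect can also be seen numerically. The sketch below (an illustration added here, not from the text) takes G = ℤ² with its standard generators and estimates the fraction of random length-n words that represent the identity; this is exactly the probability that the "always answer no" algorithm errs on Wn, and it tends to 0.

    import random

    def simple_random_walk_z2(n):
        """Image in Z^2 of a length-n word in the generators a, a^{-1}, b, b^{-1}."""
        steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
        x = y = 0
        for _ in range(n):
            dx, dy = random.choice(steps)
            x, y = x + dx, y + dy
        return x, y

    def trivial_word_fraction(n, trials=20000):
        """Fraction of sampled length-n words equal to the identity in Z^2.

        This is the probability that the 'always answer no' algorithm is wrong
        on W_n; it tends to 0, which is why that algorithm computes the word
        problem coarsely but says nothing about its generic-case complexity.
        """
        hits = sum(1 for _ in range(trials) if simple_random_walk_z2(n) == (0, 0))
        return hits / trials

    if __name__ == "__main__":
        for n in (4, 16, 64, 256):
            print(n, trivial_word_fraction(n))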
1.3 Decision problems in group theory: general set-up
The classic decision problems in group theory, formulated by Max Dehn in 1912, are the word problem, the conjugacy problem and the isomorphism problem. A closely related question is the subgroup membership problem. These problems were originally posed as decision problems, where the answer is either "yes" or "no."
To be more precise, let G be a finitely generated group. A marked generating set for G is a set A together with a map π : A → G such that π(A) generates G. In this case we denote by ΣA := A ⊔ A−1 the corresponding group alphabet; the map π then canonically extends to a surjective monoid homomorphism, also denoted by π, π : Σ∗A → G.
If G is a finitely generated group with a finite marked generating set A, we denote by WP(G, A) the set of all w ∈ Σ∗A such that π(w) = e in G and we denote by WPred (G, A) the set of all freely reduced words in WP(G, A). The word problem for G with respect to A asks, given w ∈ Σ∗A, whether or not π(w) = e in G, that is, whether or not w ∈ WP(G, A). The reduced word problem for G with respect to A asks, given a freely reduced w ∈ Σ∗A, whether or not π(w) = e in G, that is, whether or not w ∈ WPred (G, A). It is well known and easy to see that the worst-case complexity of the word problem in a finitely generated group G does not depend on the choice of a finite marked generating set A, and that the word problem and the reduced word problem for G with respect to A have the same worst-case complexity.
In this case, if H ≤ G is a subgroup, we denote by WP(G, H, A) the set of all w ∈ Σ∗A such that π(w) ∈ H and we denote by WPred (G, H, A) the set of all freely reduced words in WP(G, H, A). The membership problem for H in G with respect to A asks, given w ∈ Σ∗A, whether π(w) ∈ H. Similarly, the reduced membership problem for H in G with respect to A asks, given a freely reduced word w ∈ Σ∗A, whether π(w) ∈ H.
The most natural way (but by no means the only way) of "randomly" generating inputs for decision problems in finitely generated groups is by doing a simple random walk or a simple nonbacktracking random walk on that group with respect to some finite generating set.
Convention 1.3.1. Let G be a finitely generated group with a finite marked generating set A with #(A) = r ≥ 2 elements. Let k ≥ 1 be an integer (in most cases k = 1 or k = 2). Denote by 𝒲A,k = W1 , . . . , Wn , . . . a random process given by k independent simple random walks on G with respect to A. That means that we have k independent sequences Xn(1) , . . . , Xn(k) of i. i. d. random variables with values in ΣA = A ⊔ A−1, where each Xn(j) has the uniform probability distribution on ΣA, and that Wn = (Wn(1) , . . . , Wn(k) ), where Wn(j) = X1(j) X2(j) . . . Xn(j) ∈ Σ∗A.
We will also use the random process 𝒲A,k,red = W1 , . . . , Wn , . . . given by k independent nonbacktracking random walks on G with respect to A. Then Wn = (Wn(1) , . . . , Wn(k) ) is a k-tuple of independent ΣnA-valued random variables, each with the uniform probability distribution on the set of all freely reduced words of length n over ΣA. The freely reduced words Wn(j) are generated by performing n steps of a certain
finite-state irreducible Markov chain, although this fact is not directly relevant here. We can assume that the freely reduced words Wn(1) , . . . , Wn(k) over A±1 are generated by k independent simple nonbacktracking random walks with respect to A. In the case k = 1 we will usually omit 1 as the subscript and denote 𝒲A := 𝒲A,1 and 𝒲A,red := 𝒲A,1,red . We will also assume here that we operate in the standard Turing machine model of computability.
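In code, the two sampling schemes of Convention 1.3.1 look as follows for r = 2 (a hedged illustration; the alphabet and the inverse table are our own encoding of ΣA, with capital letters standing for inverse generators).

    import random

    ALPHABET = ["a", "A", "b", "B"]          # Sigma_A = A ⊔ A^{-1} for A = {a, b}
    INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}

    def simple_random_walk_word(n):
        """A length-n word over Sigma_A: each letter uniform and independent."""
        return "".join(random.choice(ALPHABET) for _ in range(n))

    def nonbacktracking_word(n):
        """A uniformly random freely reduced word of length n over Sigma_A.

        At each step the next letter is chosen uniformly among the letters
        that do not cancel the previous one, which yields the uniform
        distribution on the n-sphere of freely reduced words.
        """
        word = []
        for _ in range(n):
            choices = [x for x in ALPHABET if not word or x != INVERSE[word[-1]]]
            word.append(random.choice(choices))
        return "".join(word)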
1.4 Quotient test methods
Many generic-case complexity results for group-theoretic decision problems are based on using the so-called "quotient test" methods. The idea of these methods is the following. Let G be a finitely generated group in which we are trying to solve some decision problem such as the word problem, the conjugacy problem, the subgroup membership problem, etc. Let ϕ : G → Ḡ be an epimorphism, where Ḡ is a group for which the decision problem in question has low worst-case complexity and belongs to some (low) complexity class 𝒞. Let 𝒲 = W1 , W2 , . . . , Wn , . . . be a random process producing instances of inputs for our decision problem in G (where these inputs can be elements of G, pairs or tuples of elements of G, etc.), and let 𝒲̄ = ϕ(W1 ), ϕ(W2 ), . . . , ϕ(Wn ), . . . be the projection of 𝒲 to Ḡ via ϕ. Suppose we know that with probability tending to 1 as n tends to infinity, 𝒲̄ produces an input ϕ(Wn ) for which our decision problem in Ḡ has negative answer. The fact that ϕ is a homomorphism then implies that this decision problem for G also has negative answer on the input Wn.
Let Ω̄ be the class 𝒞 algorithm for solving the problem in Ḡ. We then consider a partial algorithm Ω for this problem in G which proceeds as follows. Given an instance w of the problem for G, compute the image ϕ(w) of w in Ḡ and apply the algorithm Ω̄ to ϕ(w). If Ω̄ terminates on the input ϕ(w) in finite time with the answer "no," then Ω produces the answer "no" for the input w. Otherwise Ω returns no answer for the input w. Then Ω solves the problem under consideration generically with complexity 𝒞 with respect to the random process 𝒲. Sometimes this approach also works if ϕ : H → Ḡ is an epimorphism, where H ≤ G is a subgroup of finite index in G. This naive "fast check" approach turns out to work well in numerous situations, as elucidated first in [240].
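As a concrete toy instance of the quotient test (our sketch, taking as the quotient Ḡ the infinite cyclic group given by the exponent sum of one generator, as in the abelianization arguments below), the partial algorithm Ω can be written in a few lines.

    def quotient_test_word_problem(word, exponent):
        """A minimal sketch of the quotient-test partial algorithm for the word problem.

        `exponent(word)` is assumed to compute the image of the word in a
        quotient where the word problem is easy (here, the exponent sum of
        the generator 'a', i.e. the image in ℤ).  If the image is nontrivial
        we can safely answer "no"; otherwise the algorithm stays silent.
        """
        return "no" if exponent(word) != 0 else None

    def exponent_sum_a(word):
        """Image in ℤ of a word over {a, A, b, B} under a -> 1, b -> 0."""
        return word.count("a") - word.count("A")

    print(quotient_test_word_problem("aab", exponent_sum_a))   # -> no
    print(quotient_test_word_problem("abA", exponent_sum_a))   # -> None (no answer)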
1.4.1 Quotient tests and the word problem
The main general result in [240] regarding generic-case complexity of the word problem can be summarized as follows.
Theorem 1.4.1. Let G be a finitely generated group with a presentation G = ⟨A | R⟩, where A is finite with #(A) ≥ 2. Let ϕ : H → Ḡ be a surjective group homomorphism, where H ≤ G is a finite-index subgroup. Let 𝒞 be a complexity class (for a particular mode of computation) and suppose that the word problem for Ḡ is solvable with (worst-case) complexity in 𝒞.
1. If Ḡ is an infinite group, then the word problem WP(G, A) is solvable with generic-case complexity 𝒞 with respect to the random process 𝒲A and also with respect to the random process 𝒲A,red.
2. If Ḡ is a nonamenable group, then the word problem WP(G, A) is solvable with strong generic-case complexity 𝒞 with respect to the random process 𝒲A and also with respect to the random process 𝒲A,red.
In particular, since ℤ is an infinite group with the word problem solvable in (deterministic) linear time, it follows that for any finitely generated group G with infinite abelianization, for every marked finite generating set A of G the word problem WP(G, A) is solvable with generic linear-time complexity with respect to the random process 𝒲A and also with respect to the random process 𝒲A,red.
We postpone listing other applications of Theorem 1.4.1 for the moment and instead present another example of using the quotient test method for generically solving the word problem.
Theorem 1.4.2. Let G be a finitely generated group with a presentation G = ⟨A | R⟩, where A is finite. Let ϕ : G → Ḡ be a homomorphism. Let ν be an atomic probability measure on (A ⊔ A−1 )∗. Let X1 , X2 , . . . , Xn , . . . be a sequence of (A ⊔ A−1 )∗ -valued i. i. d. random variables, each with distribution ν. For n = 1, 2, . . . let Wn = X1 X2 . . . Xn and consider the (A ⊔ A−1 )∗ -valued random process 𝒲 [ν] = W1 , W2 , . . . , Wn , . . . .
1. Assume that Ḡ is relatively hyperbolic with respect to the collection of finitely generated subgroups P1 , . . . , Pm ≤ Ḡ and that the semigroup generated by the set (ϕ ∘ π)(Supp(ν)) ⊆ Ḡ is a nonelementary subgroup of Ḡ. Then:
(a) If for i = 1, . . . , m the word problem in Pi is solvable in polynomial time, then the word problem WP(G, A) for G is solvable strongly generically in polynomial time with respect to the random process 𝒲 [ν].
(b) If P1 , . . . , Pm are finitely generated abelian groups, then the word problem WP(G, A) for G is solvable strongly generically in real time with respect to the random process 𝒲 [ν].
2. Let Ḡ = Mod(S), the mapping class group of a closed oriented hyperbolic surface S, and assume that the semigroup generated by the set (ϕ ∘ π)(Supp(ν)) ⊆ Ḡ is a subgroup of Mod(S) containing two independent pseudo-Anosov elements. Then the word problem WP(G, A) for G is solvable strongly generically in polynomial time with respect to the random process 𝒲 [ν].
Proof. (1) Since Ḡ is a quotient of G, the group Ḡ admits a presentation Ḡ = ⟨A | S⟩, where R ⊆ S ⊆ F(A). Thus we can also view A as a marked generating set of Ḡ. Let Y be the coned-off Cayley graph of Ḡ for the generating set A with respect to the parabolic subgroups P1 , . . . , Pm. The space Y is Gromov-hyperbolic and is equipped with a nonelementary isometric action of Ḡ, and hence, via ϕ, also of the group G. By Theorem 1.2 of Maher and Tiozzo [325], the orbit-map projection of the random walk Wn to Y has positive drift, so that there is C > 0 such that for any y ∈ Y
lim_{n→∞} P(dY((ϕ ∘ π)(Wn)y, y) ≥ Cn) = 1
and, moreover, the convergence in this limit is exponentially fast. This implies that
lim_{n→∞} P(π(Wn) ≠ 1 in G) = 1,
with exponentially fast convergence. Note that for w ∈ Σ∗A, if ϕ(π(w)) ≠ 1 in Ḡ, then π(w) ≠ 1 in G, since ϕ : G → Ḡ is a group homomorphism.
By a result of Farb [128, Theorem 3.7], the word problem for Ḡ is solvable in polynomial time. Thus there exists a polynomial-time algorithm Ω̄ with inputs from (A ⊔ A−1 )∗ solving the word problem WP(Ḡ, A). Let Ω be an algorithm with inputs from (A ⊔ A−1 )∗ defined as follows. For an input w ∈ (A ⊔ A−1 )∗ we run the algorithm Ω̄ on w and check whether or not w =Ḡ 1. If not, the algorithm Ω outputs the answer "no." If so, then Ω does not output any answer. Then Ω solves the word problem WP(G, A) generically in polynomial time with respect to the random process 𝒲 [ν].
The proof of part (1)(b) is similar, where we use the result of Holt that for a relatively hyperbolic group with respect to a finite collection of finitely generated abelian subgroups the word problem is solvable by a real-time Turing machine.
(2) Again, [325, Theorem 1.2], applied to the action of Mod(S) on its curve complex, implies that with probability tending to 1 exponentially fast as n → ∞, the element ϕ(π(Wn)) ∈ Ḡ = Mod(S) is pseudo-Anosov and, in particular, nontrivial. By a result of Mosher, Mod(S) is automatic and hence admits a quadratic-time deterministic algorithm Ω̄ for solving the word problem. Let Ω be an algorithm with inputs from (A ⊔ A−1 )∗ defined as follows. For an input w ∈ (A ⊔ A−1 )∗ we run the algorithm Ω̄ on w and check whether or not w =Ḡ 1. If not, the algorithm Ω outputs the answer "no." If so, then Ω does not output any answer. Then Ω solves the word problem WP(G, A) generically in polynomial time with respect to the random process 𝒲 [ν].
Note that in the above proof we did not have to require ϕ to be an epimorphism, and we did not have to assume that 𝒲 comes from a simple random walk or a simple nonbacktracking random walk corresponding to some finite generating set of G.
We refer the reader to [269, 400] for the basic definitions related to acylindrical actions and acylindrically hyperbolic groups.
1.4.2 Quotient tests and the conjugacy problem
Remark 1.4.3. In general, one should be particularly careful with the literature on generic-case complexity of the conjugacy problem because several different definitions of the genericity notions are being used and these notions are often not clearly defined and need to be reconstructed from the context. We will try to be as precise here as reasonably possible. One of the issues of ambiguity is how the length/size of the pair (w1 , w2 ) of input words is being measured. Sometimes the length of such a pair is defined as |w1 | + |w2 |, which is the approach taken in [240]. In other settings the length of the pair is defined to be max{|w1 |, |w2 |}, which is the approach we take in this chapter, to the extent that we need to deal with the asymptotic density approach to genericity as in Definition 1.2.1 and Definition 1.2.5 above. Moreover, yet in other settings, when people say that the conjugacy problem has specific (low) generic-case complexity 𝒞, they may mean that there exists a generic set S of words w in the generators of the group G such that for every w ∈ S and for an arbitrary other word v in the generators of the group (or sometimes, for every w, v ∈ S), the algorithm in question decides whether or not w and v are conjugate in G within the complexity bound 𝒞.
The main result of [240] regarding the generic-case complexity of the conjugacy problem is the following.
Theorem 1.4.4. Let G be a finitely generated group with infinite abelianization. Let A be any finite generating set for G. Then G has conjugacy problem solvable generically in linear time with respect to A in the following sense. There exists a subset S ⊆ (A ∪ A−1 )∗ × (A ∪ A−1 )∗ such that S has asymptotic density 1, where |(w1 , w2 )| := |w1 | + |w2 | for w1 , w2 ∈ (A ∪ A−1 )∗, and there exists a correct partial algorithm Ω for the conjugacy problem in G with inputs from (A ∪ A−1 )∗ × (A ∪ A−1 )∗, such that for every (w1 , w2 ) ∈ S the algorithm Ω terminates in linear time on (w1 , w2 ).
Theorem 1.4.4 is proved roughly as follows. Since the abelianization of G is infinite, there exists an epimorphism ϕ : G → ℤ. We perform two long independent simple random walks w1 , w2 (of possibly different lengths) on G with respect to A and project them to ℤ via ϕ. The fact that ℤ is infinite implies that with probability tending to 1 as |(w1 , w2 )| → ∞ we have ϕ(w1 ) ≠ ϕ(w2 ) in ℤ. Since ℤ is abelian, this means that ϕ(w1 ) is not conjugate to ϕ(w2 ) in ℤ and hence we can conclude that w1 is not conjugate to w2 in G.
This argument actually works in a much more general setting than the somewhat awkward set-up considered in Theorem 1.4.4 above. We give a sample statement of this kind here.
Theorem 1.4.5. Let G be a finitely generated group with infinite abelianization, so that there exists an epimorphism ϕ : G → ℤ. Let A be any finite generating set for G. Let ν be a symmetric atomic probability measure on (A ⊔ A−1 )∗ with finite support such that for some element w ∈ Supp(ν) we have ϕ(π(w)) ≠ 0.
For i = 1, 2 let X1(i), X2(i), . . . , Xn(i), . . . be two independent sequences of (A ⊔ A−1)∗-valued i. i. d. random variables, each with distribution ν. For i = 1, 2 and n = 1, 2, . . . put Wn(1) = X1(1) X2(1) . . . Xn(1), put Wn(2) = X1(2) X2(2) . . . Xn(2) and consider the (A ⊔ A−1)∗-valued random process

𝒲[ν] = (W1(1), W1(2)), (W2(1), W2(2)), . . . , (Wn(1), Wn(2)), . . . .
Then the conjugacy problem CP(G, A) for G is solvable generically in linear time with respect to the random process 𝒲[ν].

Proof. Note that by assumption on ν, there exists a constant C ≥ 1 such that for every n ≥ 1 and w ∈ (A ⊔ A−1)∗ with ν(n)(w) > 0 we have |w| ≤ Cn and ν(n)(w) = ν(n)(w−1). Here ν(n) is the n-step convolution of ν. We project ν(n) to ℤ via ϕ ∘ π : (A ⊔ A−1)∗ → ℤ. After n steps of the process 𝒲[ν] we obtain two words Wn(1), Wn(2) ∈ (A ⊔ A−1)∗, i. i. d. with the distribution ν(n). Since ν is symmetric, this means that (Wn(1))−1 Wn(2) is distributed according to ν(2n). By looking at the projected random walk on ℤ and applying the central limit theorem, we conclude that the probability that (ϕ∘π)((Wn(1))−1 Wn(2)) = 0 tends to 0 as n → ∞. Hence, with probability tending to 1 as n → ∞, we have (ϕ ∘ π)(Wn(1)) ≠ (ϕ ∘ π)(Wn(2)) in ℤ and therefore (ϕ ∘ π)(Wn(1)) is not conjugate to (ϕ∘π)(Wn(2)) in ℤ and π(Wn(1)) is not conjugate to π(Wn(2)) in G.

Now consider the partial algorithm Ω for CP(G, A) that works as follows. Given w1, w2 ∈ (A ⊔ A−1)∗, the algorithm computes (ϕ ∘ π)(w1) and (ϕ ∘ π)(w2) in ℤ. If (ϕ ∘ π)(w1) ≠ (ϕ ∘ π)(w2), then Ω terminates and declares that π(w1) is not conjugate to π(w2) in G. If (ϕ∘π)(w1) = (ϕ∘π)(w2), then Ω does not return any answer. Note that for the pair (Wn(1), Wn(2)) produced by the random process 𝒲[ν] we have |Wn(1)|, |Wn(2)| ≤ Cn. Thus the values (ϕ∘π)(Wn(1)), (ϕ∘π)(Wn(2)) ∈ ℤ can be computed in linear time in n, and we can decide in linear time whether or not (ϕ ∘ π)(Wn(1)) = (ϕ ∘ π)(Wn(2)). Thus we see that Ω solves the conjugacy problem CP(G, A) for G generically in linear time with respect to the random process 𝒲[ν].

Although Theorem 1.4.4 and Theorem 1.4.5 are based on rather simple considerations, their conclusions apply to a wide variety of finitely generated groups, including the braid group Bn, where n ≥ 3, infinite one-relator groups, Artin groups, knot groups, groups given by presentations of positive deficiency, etc.

Theorem 1.4.6. Let G be a group with a finite marked generating set A. Let ϕ : G → Ḡ be an epimorphism, where Ḡ is a torsion-free acylindrically hyperbolic group with a nonelementary acylindrical isometric action on a Gromov-hyperbolic geodesic metric space Y. Let ν be an atomic probability measure on (A ⊔ A−1)∗ with finite support such that the semigroup generated by the set (ϕ ∘ π)(Supp(ν)) is equal to Ḡ. Suppose that Ḡ has conjugacy problem solvable in complexity class 𝒞.
For i = 1, 2 let X1(i), X2(i), . . . , Xn(i), . . . be two independent sequences of (A ⊔ A−1)∗-valued i. i. d. random variables, each with distribution ν. For i = 1, 2 and n = 1, 2, . . . put Wn(1) = X1(1) X2(1) . . . Xn(1), put Wn(2) = X1(2) X2(2) . . . Xn(2) and consider the (A ⊔ A−1)∗-valued
random process

𝒲[ν] = (W1(1), W1(2)), (W2(1), W2(2)), . . . , (Wn(1), Wn(2)), . . . .
Then the conjugacy problem CP(G, A) for G is solvable strongly generically in class 𝒞 with respect to the random process 𝒲[ν].

Proof. For the map ϕ ∘ π : A → Ḡ, we can view A as a marked finite generating set for Ḡ. Hence there exists an algorithm Ω̄ that solves the conjugacy problem CP(Ḡ, A) for Ḡ in complexity class 𝒞. Consider the projected random walks ϕ(π(Wn(1))), ϕ(π(Wn(2))) on Ḡ. A result of Maher and Sisto [324] implies that, with probability tending to 1 exponentially fast as n → ∞, the elements xn = ϕ(π(Wn(1))), yn = ϕ(π(Wn(2))) freely generate a subgroup H of Ḡ that is hyperbolically embedded in Ḡ, and hence ϕ(π(Wn(1))) and ϕ(π(Wn(2))) are not conjugate in Ḡ (since H = F(xn, yn) is malnormal in Ḡ, and xn is not conjugate to yn in H = F(xn, yn)). Since ϕ is a homomorphism, it follows that π(Wn(1)) and π(Wn(2)) are not conjugate in G.

Consider the following partial algorithm Ω for solving the conjugacy problem CP(G, A) for G. Given two words w, w′ ∈ Σ∗A, we apply the algorithm Ω̄ to decide whether or not their images ϕ(π(w)), ϕ(π(w′)) are conjugate in Ḡ. If they are not conjugate in Ḡ, the algorithm Ω declares that π(w) and π(w′) are not conjugate in G. Otherwise Ω terminates without giving any answer. Then Ω solves CP(G, A) for G strongly generically in class 𝒞 with respect to the random process 𝒲[ν].

Recall that if Γ is a finite simple graph, the corresponding right-angled Artin group A(Γ) is given by the presentation A(Γ) = ⟨VΓ | [v, u] = 1 whenever v, u are adjacent vertices of Γ⟩. Suppose Γ is a finite connected graph. There is an intrinsic notion of being loxodromic for an element of A(Γ) that provides a counterpart to being pseudo-Anosov in the mapping class group. A nontrivial element g ∈ A(Γ) is called loxodromic if the centralizer CA(Γ)(g) is infinite cyclic. In [263] Kim and Koberda defined the extension graph Γe associated with A(Γ). If Γ is finite and connected, then, as proved in [263], Γe is a connected Gromov-hyperbolic graph, endowed with a natural isometric action of A(Γ), such that for g ∈ A(Γ) the element g acts as a loxodromic isometry of Γe if and only if g is loxodromic in the sense of the above definition. Moreover, Kim and Koberda also proved [264] that in this case the action of A(Γ) on Γe is acylindrical.

Corollary 1.4.7. Let G be a finitely generated group with a finite marked generating set A. Let ϕ : G → Ḡ be an epimorphism. Let ν be an atomic probability measure on (A ⊔ A−1)∗ with finite support such that the semigroup generated by the set (ϕ∘π)(Supp(ν)) is equal to Ḡ.
Consider the (A ⊔ A−1)∗-valued random process

𝒲[ν] = (W1(1), W1(2)), (W2(1), W2(2)), . . . , (Wn(1), Wn(2)), . . .
defined as in Theorem 1.4.6 above.
1. Assume that Ḡ = A(Γ) is a right-angled Artin group, where Γ is a finite connected graph with at least two vertices which is not a join. Then the conjugacy problem CP(G, A) for G is solvable strongly generically in linear time in the RAM Turing machine computation model, with respect to the random process 𝒲[ν].
2. Assume that Ḡ is a nonelementary torsion-free relatively hyperbolic group with respect to the collection of finitely generated parabolic subgroups P1, . . . , Pm. Suppose that each of P1, . . . , Pm has conjugacy problem solvable in polynomial time. Then the conjugacy problem CP(G, A) for G is solvable strongly generically in polynomial time with respect to the random process 𝒲[ν].

Proof. (1) The assumptions on Γ imply that the action of A(Γ) on the extension graph Γe is acylindrical and nonelementary [264]. By a result of Crisp, Godelle and Wiest [90], the conjugacy problem in A(Γ) is solvable in linear time in a RAM Turing machine computation model. The result now follows directly from Theorem 1.4.6.
(2) Let Y be the coned-off Cayley graph of Ḡ with respect to the marked generating set ϕ ∘ π : A → Ḡ, with the parabolic subgroups P1, . . . , Pm coned off. Then Y is Gromov-hyperbolic, and the action of Ḡ on Y is nonelementary and acylindrical. By a result of Bumagin [73], the assumptions on Ḡ imply that the conjugacy problem for Ḡ is solvable in deterministic polynomial time. The conclusion of part (2) now follows from Theorem 1.4.6.

Note that solvability in RAM linear time implies solvability in O(n log n) deterministic time by an ordinary Turing machine.

1.4.3 Morseness of generic subgroups and the membership problem

In recent years the notion of a quasiconvex subgroup of a word-hyperbolic group has found several useful generalizations in the context of arbitrary finitely generated groups. Durham and Taylor [117] introduced the notion of a “stable” subgroup: If G is a finitely generated group and H ≤ G is a finitely generated subgroup, the subgroup H is called stable in G if H is undistorted (i. e., quasiisometrically embedded) in G, and if for any finite generating set A of G and any λ ≥ 1, ϵ ≥ 0 there exists C ≥ 0 such that for any h1, h2 ∈ H and any (λ, ϵ)-quasigeodesics γ1, γ2 from h1 to h2 in the Cayley graph Γ(G, A), we have γ1 ⊆ NC(γ2) and γ2 ⊆ NC(γ1) in Γ(G, A). It is known that if H is stable in G, then H is word-hyperbolic. A more general version of quasiconvexity, not requiring the subgroup H to be word-hyperbolic, was proposed by Tran [480]
and Genevois [163]: A subgroup H of a finitely generated group G is called Morse or strongly quasiconvex in G if for any finite generating set A of G and any λ ≥ 1, ϵ ≥ 0 there exists C ≥ 0 such that for any h1, h2 ∈ H and any (λ, ϵ)-quasigeodesic γ1 from h1 to h2 in the Cayley graph Γ(G, A), we have γ1 ⊆ NC(H) in Γ(G, A). It follows from the definition that if H ≤ G is Morse in G, then H is finitely generated and undistorted in G. It is also known that for a finitely generated undistorted subgroup H of a finitely generated group G, the subgroup H is stable in G if and only if H is word-hyperbolic and Morse in G. If G is word-hyperbolic and H ≤ G, then H is Morse in G if and only if H is stable in G if and only if H is finitely generated and undistorted in G. The notions of stable and Morse subgroup are invariant under the choices of finite generating sets for the ambient group G.

Another related generalization of quasiconvexity is given by the notion of a hyperbolically embedded subgroup H of a group G, denoted H →h G. We omit the precise definition here but note that a recent result of Sisto [468, Theorem 2] shows that if G is a finitely generated group and H ≤ G is a finitely generated hyperbolically embedded subgroup, then H is Morse in G.

Remark 1.4.8. Let G be a finitely generated group with solvable word problem and let H ≤ G be an undistorted finitely generated subgroup. Then the membership problem for H in G is decidable. Moreover, in this case if the word problem for G is decidable in polynomial time, then the membership problem for H in G is solvable in exponential time. Indeed, let A be a finite generating set for G and let Y be a finite generating set for H. Since H is undistorted in G, there is K ≥ 1 such that for every h ∈ H we have |h|Y ≤ K|h|A. Also, for each y ∈ Y we fix a word zy over A±1 such that zy =G y in G. Put M = maxy∈Y |zy|. Given a word w of length n ≥ 1 in the generators A±1 of G, we list the finite set U of all words u in Y±1 of length |u| ≤ Kn. Note that #U ≤ b^{Kn+1}, where b = 2#Y. For each such u ∈ U let ũ be the word in A±1 obtained from u by replacing each letter y±1 by the corresponding word zy±1. Note that for each u ∈ U we have |ũ| ≤ KMn. Then w ∈ H if and only if there exists u ∈ U such that w =G ũ. The latter condition can be verified using the solution for the word problem in G. For each specific u ∈ U, since |ũ| ≤ KMn and |w| = n, checking whether or not w =G ũ can be done within the same time complexity bound as the time complexity of the algorithm for the word problem in G. Hence, if the word problem in G is solvable in polynomial time, then this algorithm for solving the membership problem for H in G runs in at most exponential time in n, since there are at most exponentially many (in n) words u ∈ U.

Fortunately, in a fairly wide context, assuming that H is Morse in G and not just undistorted, one can do significantly better in terms of the complexity of the membership problem for H in G.

Theorem 1.4.9. Let G be an automatic group and let H ≤ G be a Morse subgroup. Let A be a finite generating set for G and let L ⊆ Σ∗A be an automatic language for G over the alphabet ΣA = A ⊔ A−1.
Then H is L-rational (that is, the full preimage LH of H in L is a regular language), and the membership problem for H in G is solvable in quadratic time in terms of the length |w| of an input word w ∈ Σ∗A.

Proof. Since L is an automatic language for G, there exist λ ≥ 1, ϵ ≥ 0 such that every u ∈ L is a (λ, ϵ)-quasigeodesic in the Cayley graph Γ(G, A). Since H is Morse in G, there is C ≥ 1 such that for every u ∈ L representing an element h of H we have pu ⊆ NC(H) in Γ(G, A), where pu is the path in Γ(G, A) from 1 to h with label u. This means that H is C-quasiconvex with respect to the automatic language L. Therefore, by [167, Theorem 2.2], the full preimage LH of H in L is a regular language, accepted by some finite-state automaton AH. We can then decide the membership problem for H in G as follows. Given a word w ∈ Σ∗A with |w| = n, we first find u ∈ L such that u =G w in G. By [123, Theorem 2.3.10], finding such u can be done in quadratic time in n. We then use the automaton AH to check whether or not u ∈ LH. This check can be done in linear time in |u|, and therefore in linear time in n. We conclude that w represents an element of H if and only if u ∈ LH. The overall running time of this algorithm is quadratic in n.

Recall that a group G is acylindrically hyperbolic if G is finitely generated and admits a nonelementary acylindrical isometric action on a Gromov-hyperbolic geodesic metric space X (here “nonelementary” means that G contains some two independent loxodromic isometries of X).

Theorem 1.4.10. Let G be a finitely generated acylindrically hyperbolic group such that G is also automatic. Let A be a finite generating set of G and ΣA = A ⊔ A−1. Let ν be an atomic probability measure on Σ∗A with finite support such that the semigroup generated by the set π(Supp(ν)) is equal to G. Let k ≥ 1 be an integer. For i = 1, . . . , k let X1(i), X2(i), . . . , Xn(i), . . . be k independent sequences of Σ∗A-valued i. i. d. random variables, each with distribution ν. For i = 1, . . . , k and n = 1, 2, . . . put Wn(i) = X1(i) X2(i) . . . Xn(i), and consider the (Σ∗A)k-valued random process

𝒲[ν] = (W1(1), . . . , W1(k)), (W2(1), . . . , W2(k)), . . . , (Wn(1), . . . , Wn(k)), . . . .
Let Hn = ⟨π(Wn(1)), . . . , π(Wn(k))⟩ ≤ G. Then with probability tending to 1 exponentially fast as n → ∞, the subgroup Hn ≤ G has the following properties:
1. The subgroup Hn ≤ G is free of rank k, stable and Morse in G.
2. The subgroup Hn ≤ G is rational with respect to any automatic language L on G.
3. The membership problem for Hn in G is solvable in quadratic time, in terms of the length |w| of an input word w ∈ Σ∗A.

Proof. Consider the random walks π(Wn(1)), . . . , π(Wn(k)) on G. Since G is acylindrically hyperbolic, there exists a unique maximal finite normal subgroup
E(G) ≤ G. Then a result of Maher and Sisto [324, Theorem 1] implies that, with probability tending to 1 exponentially fast as n → ∞, the subgroup Hn ≤ G is free of rank k, the subgroup H̆n = ⟨Hn, E(G)⟩ splits as a semidirect product H̆n = Hn ⋊ E(G) and this subgroup H̆n is hyperbolically embedded in G. Since H̆n is finitely generated and hyperbolically embedded in G, a result of Sisto [468, Theorem 2] implies that H̆n is Morse in G. Since Hn has finite index in H̆n, it follows that Hn ≤ G is also Morse in G. Since Hn is free of rank k, Hn is word-hyperbolic, which, together with the Morseness of Hn, implies that Hn is stable in G. Now Theorem 1.4.9 implies that Hn is rational for any automatic language on G, and that the membership problem for Hn in G is solvable in quadratic time.

Remark 1.4.11. There are many classes of examples of automatic acylindrically hyperbolic groups G, and Theorem 1.4.10 applies to them. These classes of examples include:
1. Nonelementary word-hyperbolic groups.
2. Mapping class groups of finite-type hyperbolic surfaces (including braid groups on n ≥ 3 strands). Mapping class groups are automatic by a result of Mosher [359] and they are acylindrically hyperbolic by acylindricity of their action on the curve complex (see [400, Section 8, Example (a)]).
3. Finitely generated nonelementary toral relatively hyperbolic groups (including limit groups and fundamental groups of connected complete finite volume hyperbolic manifolds of arbitrary dimension n ≥ 2). Nonelementary relatively hyperbolic groups are acylindrically hyperbolic (see [400]). Toral relatively hyperbolic groups are biautomatic [8, Theorem 1.1].
4. Nonvirtually cyclic groups G admitting a properly discontinuous cocompact essential isometric action on a proper finite-dimensional CAT(0) cubical complex X that does not nontrivially split as a product of two cube subcomplexes. All CAT(0) cubical groups are biautomatic by a general result of Niblo and Reeves [378]. Under the above assumption on G, by a result of Caprace and Sageev [77, Theorem 6.3], the action of G on X contains a rank-1 element and therefore, since G is not virtually cyclic, the group G is acylindrically hyperbolic. See [400, Section 8, Example (d)] for a precise explanation of acylindrical hyperbolicity of G in this context.
5. Right-angled Artin groups G = A(Γ) where the graph Γ has more than one vertex and the group G does not split as a nontrivial direct product. Again, see [400, Section 8, Example (d)] for the explanation of why such groups G are acylindrically hyperbolic. Alternatively, one can more directly deduce acylindrical hyperbolicity of A(Γ) in this setting from the results of Kim and Koberda [264] about acylindricity of the action of A(Γ) on the “extension graph” Γe for A(Γ).
6. Many (again, in some sense, “most”) groups given by finite C(4) − T(4) small cancelation presentations such that every piece has length 1. It is proved in [237] that such groups G admit a properly discontinuous cocompact action on
2-dimensional square complexes, and so item (4) above applies to them. Prior to the general biautomaticity Niblo–Reeves result mentioned above, automaticity of such C(4) − T(4) groups was proved by Gersten and Short [166].
7. “Most” 3-manifold groups (see Minasyan and Osin [354, Corollary 2.9] for assumptions implying acylindrical hyperbolicity of 3-manifold groups, and [123, Theorems 12.4.6 and 12.4.7] for conditions implying automaticity of 3-manifold groups).
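To make the normal-form-plus-automaton scheme of Theorem 1.4.9 concrete, here is a minimal sketch on a toy example rather than an actual automatic structure: we take G = ℤ2 with the (assumed) normal forms a^m b^n and the subgroup H = ⟨a⟩, whose set of normal forms is a regular language. The names, the normal form function and the automaton below are hypothetical illustrations of the two-step scheme (compute the normal form of the input, then run a finite automaton on it).

def normal_form_z2(word):
    """Toy automatic normal form for G = Z^2 = <a, b | ab = ba>: collect exponents
    and return the representative a^m b^n (inverse letters written as capitals)."""
    m = word.count("a") - word.count("A")
    n = word.count("b") - word.count("B")
    return ("a" * m if m >= 0 else "A" * -m) + ("b" * n if n >= 0 else "B" * -n)

def dfa_accepts(transitions, start, accepting, word):
    """Run a deterministic automaton; reject whenever a transition is missing."""
    state = start
    for ch in word:
        state = transitions.get((state, ch))
        if state is None:
            return False
    return state in accepting

# DFA for the normal forms of elements of H = <a>: any number of 'a's or any
# number of 'A's, with no occurrence of 'b' or 'B'.
H_DFA = {
    ("start", "a"): "pos", ("pos", "a"): "pos",
    ("start", "A"): "neg", ("neg", "A"): "neg",
}

def membership_in_H(word):
    """Theorem 1.4.9 scheme: compute the normal form, then run the automaton."""
    return dfa_accepts(H_DFA, "start", {"start", "pos", "neg"}, normal_form_z2(word))

if __name__ == "__main__":
    print(membership_in_H("baB"))  # True: b a b^-1 equals a in Z^2, so it lies in H
    print(membership_in_H("ab"))   # False: a b is not a power of a in Z^2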
1.4.4 Finer generic conjugacy problem methods

A different approach to generic-case complexity of the conjugacy problem, based on using normal forms in graphs of groups, is explored in a series of papers [56, 57, 364, 105]. The main idea there is that if G splits as a finite graph of groups (e. g., as an amalgamated product or an HNN-extension), then, under reasonable assumptions, “generic” elements in G are hyperbolic with respect to this splitting, and if the splitting is sufficiently well behaved, the conjugacy problem has a fast solution on hyperbolic inputs.

In [56, 57] Borovik, Myasnikov and Remeslennikov explore this approach primarily in the context of finite-rank free vertex and edge groups. They provide a more precise notion of “bad” elliptic elements and of “bad” pairs, which turn out to correspond to conjugates of edge group elements and conjugacies between them. They also show that, under a certain set of general assumptions, one can algorithmically compute a cyclically reduced normal form of an element and also decide conjugacy (including even finding the conjugating element) except for the “bad pairs.” These assumptions, roughly, include the following:
– the membership search problem for the edge groups should be solvable in the corresponding vertex groups;
– the coset representative search problem for the edge groups should be solvable in the corresponding vertex groups;
– the conjugacy membership search problem for the edge groups should be solvable in the corresponding vertex groups;
– the conjugacy search problem should be solvable for the vertex groups;
– the subgroup membership search problem for the edge groups should be solvable in the corresponding vertex groups;
– several more technical algorithmic conditions regarding embeddings of the edge groups in vertex groups should hold.
The authors do show that all of these conditions are satisfied in the case where all vertex and edge groups are free of finite rank, but they do not provide any time estimates and do not obtain any general generic-case complexity results in this setting. However, they do apply their results to the famous construction of Miller of a finitely presented group G with an unsolvable conjugacy problem. This group can be constructed as a multi-letter HNN-extension of the free group. The relevant results of [57] (namely, Theorem 4.9 and Theorem 5.1 there) can be summarized as follows.
Theorem 1.4.12. There exists a finitely presented group G with unsolvable conjugacy problem such that for some finite marked generating set A of G there is an exponentially generic subset U ⊆ F(A) such that there is a cubic-time algorithm that, given any (u, v) with u ∈ U and v ∈ F(A), solves the conjugacy search problem in G.

The complement F(A) \ U (or its image in G) can be regarded as the “black hole,” or the set of “hard inputs,” for the conjugacy problem in G.

More recent work of Diekert, Myasnikov and Weiss [105] pushes these ideas in a slightly different direction. They consider the case where G is a finitely generated group with a finite marked generating set A, where G is an amalgamated product G = H ∗C K with [H : C] ≥ [K : C] ≥ 2, or G is an HNN-extension G = ⟨H, t | t−1 ct = ϕ(c) for all c ∈ C⟩. They prove that in this situation for P ∈ {H, K} the Schreier graph Γ(G, P, A) is nonamenable if and only if [H : C] ≥ 3 in the amalgamated product case, and if and only if [H : C] ≥ 2, [H : ϕ(C)] ≥ 2 in the HNN-extension case. They also show in this case (where the Schreier graph Γ(G, P, A) is nonamenable) that the set ℋA ⊂ A∗ of all words in A∗ representing hyperbolic (in the Bass–Serre sense) elements of G is exponentially generic in A∗. Being “hyperbolic” here means that the cyclically reduced normal form in the amalgamated product sense involves at least two syllables and in the HNN-sense involves at least one stable letter, or, equivalently, that the element acts as a hyperbolic (fixed-point-free) isometry of the Bass–Serre tree corresponding to the amalgamated product or the HNN-extension in question. Using the above general result, Diekert, Myasnikov and Weiss are able to obtain the following application.

Theorem 1.4.13 ([105, Corollary 4]). Let G be the fundamental group of a finite graph of groups with finitely generated free abelian vertex groups. Let A be a finite marked generating set of G. Then there exists a strongly generic subset S ⊆ Σ∗A and a correct partial algorithm Ω for the conjugacy problem in G such that, given any u, v ∈ S, the algorithm Ω terminates on the input (u, v) in at most linear time in terms of max{|u|, |v|}.

Since S × S is strongly generic in Σ∗A × Σ∗A, this implies that the conjugacy problem for G is solvable strongly generically in linear time with respect to the random process 𝒲A,2 (see Convention 1.3.1).

In the same paper [105] the authors apply their results to obtain a strongly generic subset of the Baumslag–Gersten group G1,2 on which the conjugacy problem is solvable in polynomial time. The worst-case complexity of the conjugacy problem in G1,2 is unknown.

Note that in the result discussed above the assumptions implying that the Schreier graph Γ(G, P, A) is nonamenable are exactly the same as those that guarantee that the group G admits two “independent” hyperbolic elements for the action on the Bass–Serre tree T corresponding to the amalgamated product/HNN-extension splitting of G. Now let ν be a finitely supported atomic probability measure on Σ∗A such that π(Supp(ν)) generates G. Then by [325, Theorem 1.4], for the word Wn ∈ Σ∗A generated by the
ν-random walk of length n, it follows that Wn acts hyperbolically on T with probability converging to 1 exponentially fast as n → ∞. If the vertex groups are free abelian of finite rank, [105] provides a polynomial-time solution for the conjugacy problem for hyperbolic elements, and also a polynomial-time solution for elliptic elements in the exceptional cases (where the Bass–Serre tree does not admit two independent hyperbolic isometries in G). Therefore one can also derive the conclusion of Theorem 1.4.13, that is, a strongly generically linear-time solution of the conjugacy problem for G, for the random process where the inputs are generated by two independent ν-random walks, as in Theorem 1.4.5.
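The simplest of the quotient tests for conjugacy used in this and the preceding subsections (the abelianization test behind Theorems 1.4.4 and 1.4.5) can be sketched in a few lines. The epimorphism to ℤ below is a hypothetical one, given by the exponent sum of a designated generator; unequal images certify non-conjugacy, and equal images give no answer.

def exponent_sum(word, letter):
    """A hypothetical epimorphism G -> Z: total exponent of one designated
    generator (lowercase = generator, uppercase = its inverse)."""
    return word.count(letter) - word.count(letter.upper())

def partial_conjugacy_test(w1, w2, letter="a"):
    """Quotient test for conjugacy (cf. Theorems 1.4.4 and 1.4.5): conjugate
    elements of G have equal images in the abelian quotient Z, so unequal
    images certify non-conjugacy; equal images give no answer."""
    if exponent_sum(w1, letter) != exponent_sum(w2, letter):
        return "not conjugate"
    return None  # inconclusive: fall back to a worst-case algorithm, if one exists

if __name__ == "__main__":
    print(partial_conjugacy_test("abab", "bAbb"))   # 'not conjugate' (images 2 and -1)
    print(partial_conjugacy_test("ab", "ba"))       # None: the test cannot decide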
1.5 Generic-case complexity of “search” group-theoretic problems

All of the classic group-theoretic decision problems admit “search” versions, which concentrate on the “yes” part of the corresponding decision problem and ask to produce an explicit “witness” justifying that the answer to the decision problem is indeed “yes.” There are several ways to formalize the search problems. Here, in order to differentiate them more sharply from the decision problems, we will always assume that the inputs for the search problems come from the “yes” parts of the corresponding decision problems (e. g., words representing the identity element in the group, pairs of words representing conjugate elements in the group, words representing elements belonging to a specific subgroup, etc.).
1.5.1 The word search problem and random van Kampen diagrams

The word search problem WSP(G, A, R) is defined for a finitely presented group G given by a finite symmetrized presentation G = ⟨A | R⟩. This problem asks, given a freely reduced word w ∈ nclF(A)(R), how to express w as a product of conjugates of elements of R in F(A). As noted above, the word search problem concentrates entirely on the “yes” part of the decision word problem and only deals with words in F(A) (or sometimes in Σ∗A) which represent the identity element in G.

Generic-case analysis of the word search problem is made difficult by the fact that it is not clear how to “randomly” choose an element of nclF(A)(R) in a reasonable way. Note that if G is infinite (which is the main case of interest), then [F(A) : nclF(A)(R)] = ∞. Therefore, for most types of random walks on F(A), for an element wn ∈ F(A) obtained after n steps of the walk the probability that wn ∈ nclF(A)(R) tends to 0 (usually exponentially fast) as n → ∞. The problem of choosing a “random” element of nclF(A)(R) is quite nontrivial and also manifests itself when trying to define the notion of an “average” Dehn function (see [50, 501]).
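For concreteness, the naive way to produce an element of nclF(A)(R) together with a word-search witness is to multiply out a random product of conjugates of relators; the sketch below does exactly that (the relator set and the parameters are hypothetical). This is only the naive baseline that the diagram-based generators discussed next are designed to refine.

import random

def free_reduce(word):
    """Free reduction: repeatedly delete adjacent inverse pairs such as 'aA' or 'Aa'."""
    out = []
    for ch in word:
        if out and out[-1] == ch.swapcase():
            out.pop()
        else:
            out.append(ch)
    return "".join(out)

def invert(word):
    """Formal inverse of a word: reverse it and invert each letter."""
    return word[::-1].swapcase()

def random_ncl_element(relators, letters, num_factors, conj_len, seed=None):
    """Return (w, witness) where w lies in ncl_{F(A)}(R) and witness is the list
    of pairs (u_i, r_i) with w = prod u_i r_i u_i^{-1} in F(A)."""
    rng = random.Random(seed)
    alphabet = letters + tuple(x.upper() for x in letters)
    witness, product = [], ""
    for _ in range(num_factors):
        r = rng.choice(relators)
        u = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, conj_len)))
        witness.append((u, r))
        product += u + r + invert(u)
    return free_reduce(product), witness

if __name__ == "__main__":
    # Hypothetical presentation of Z^2: a single commutator relator.
    w, witness = random_ncl_element(("abAB",), ("a", "b"), num_factors=3, conj_len=4, seed=1)
    print(w, len(witness))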
A useful approach was put forward in the work of Myasnikov and Ushakov [368], via the construction of “random van Kampen diagrams.” They assume that G = ⟨A | R⟩ is a finite symmetrized presentation which is also reduced, meaning that no proper subword of an element of R represents the identity element of G (which is a fairly mild restriction in practice). Then, after choosing several parameters, they define a basic extension random generator ℛ𝒢, which is a discrete-time random process such that, after n steps, it outputs a van Kampen diagram Dn over the presentation G = ⟨A | R⟩ with Area(Dn) ≤ n. A “random” element of nclF(A)(R) is then generated as the label wn = μ(𝜕Dn) of the boundary cycle of Dn. Space does not allow us to describe the process ℛ𝒢 here precisely, but its flavor is the following. At each step of the process, when the diagram Dn is already constructed, it is “randomly” modified in one of the following ways:
– Attach a free edge (labeled by a letter from A±1) to some boundary vertex of Dn.
– Attach a new 2-cell, labeled by an element of R, at some boundary vertex of Dn.
– Perform some boundary folds in Dn.
– Perform some “housekeeping” moves.
The starting point D1 of the process may be an arbitrary van Kampen diagram, for example the trivial one consisting of a single vertex.

In [368] the authors define a filling invariant of a diagram called the depth, which is closely related to the hyperbolic or negatively curved nature of the diagram. Intuitively, the depth of a van Kampen diagram is the number of times one needs to iteratively “peel off” the boundary layers of the diagram to exhaust the diagram. More formally, for a vertex v in a van Kampen diagram D, the depth δ(D, v) of v in D is the smallest k ≥ 1 such that there exists a sequence of vertices v = v1, . . . , vk in D such that each pair vi, vi+1 belongs to the boundary cycle of some 2-cell in D. Then the depth δ(D) of D is defined as the maximum of δ(D, v) over all vertices v of D. For a van Kampen diagram D we denote by Area+(D) the total number of 2-cells and of free boundary edges in D. Note that Area(D) ≤ Area+(D), where Area(D) is the number of 2-cells in D.

We summarize the results of Myasnikov and Ushakov [368] in the following simplified statement.

Theorem 1.5.1. Let G = ⟨A | R⟩ be a finite symmetrized reduced presentation and let ℛ𝒢 be the basic extension random generator process outputting a sequence of van Kampen diagrams D1, D2, . . . , Dn, . . . with labels of boundary cycles wn = μ(𝜕Dn). Then, with probability tending to 1 as n → ∞, the following hold:
1. We have Area+(Dn) ≤ 4|wn|.
2. We have δ(Dn) ≤ log Area+(Dn).
3. The word search problem on wn is solvable in time O(|wn|^{4+4 log L(R)}).
Here L(R) = ∑r∈R |r|. The construction of the random process ℛ𝒢 is such that for every n ≥ 1 we have Area+(Dn) ≤ n and |wn| ≤ nL(R). Note that part (3) of Theorem 1.5.1 means that the word search problem WSP(G, A, R) is solvable generically in polynomial time with respect to the random process ℛ𝒢. The above results are also explained in detail in Section 5 of the subsequent monograph of Myasnikov, Shpilrain and Ushakov [366]. Similar but somewhat different results for the word search problem were later obtained by Morar and Ushakov [358] (we discuss their work in more detail below).
1.5.2 The conjugacy search problem

For a group G with a finite marked generating set A, the conjugacy search problem CSP(G, A) asks, given two words w1, w2 ∈ Σ∗A (or in F(A)) representing conjugate elements of G, to find a word u ∈ Σ∗A such that w2 =G u−1 w1 u. As above, generic-case analysis of the conjugacy search problem is complicated by finding reasonable ways of generating “random” pairs (w1, w2) representing conjugate elements of G.

In [366] Myasnikov, Shpilrain and Ushakov push further the random van Kampen diagram methods developed in [368] to obtain some results in this direction. For a reduced symmetrized finite presentation G = ⟨A | R⟩ and a cyclically reduced word w ∈ F(A) representing a sufficiently long conjugacy class in G, they use the basic extension random generator discussed above to construct a random process ℛ𝒢[w] outputting a sequence of annular van Kampen diagrams D1, D2, D3, . . . , Dn, . . . which starts with a circle labeled by w. At step n the annular diagram Dn has its exterior and interior boundary cycles labeled by words w1,n, w2,n ∈ Σ∗A that are both conjugate to w in G. They then show that the diagram Dn generically has “hyperbolic” geometry. Using this fact and a general Todd–Coxeter-type method they develop for (abstractly) solving the conjugacy search problem in the general context, they conclude that CSP(G, A) is generically solvable in polynomial time with respect to the process ℛ𝒢[w]. We summarize some of their results here.

Theorem 1.5.2. Let G = ⟨A | R⟩ be a finite symmetrized reduced presentation and let w ∈ F(A) be a cyclically reduced word such that w is not conjugate to an element of length ≤ 2 in G. Let the random process ℛ𝒢[w] output a sequence of annular van Kampen diagrams D1, D2, D3, . . . , Dn, . . . , where Dn has its exterior and interior boundary cycles labeled w1,n, w2,n ∈ Σ∗A. Then, with probability tending to 1 as n → ∞, the following hold:
1. We have δ(Dn) ≤ 2 log n.
2. For any fixed in advance c ∈ (0, 1) we have |w1,n| + |w2,n| ≥ cn.
3. We have (1/2) Area+(Dn) ≤ |w1,n| + |w2,n|.
4. The conjugacy search problem CSP(G, A) is solvable on the input (w1,n, w2,n) in time O((|w1,n| + |w2,n|)^{2+4 log L(R)}).
Note also that ℛ𝒢[w] is defined such that for every n ≥ 1 we have Area+(Dn) ≤ n + |w| and |wi,n| ≤ |w| + nL(R).

In [358] Morar and Ushakov propose a somewhat different approach to the generic-case complexity of the conjugacy search problem (and several other search problems), based on the ideas from Ushakov’s PhD thesis [483]. Instead of the “evolutionary” or “growth” process for generating random van Kampen diagrams discussed above, they use a combination of a simplified version of this process (based on what they call ℐ-transformations) with additional algebraic techniques. This allows them, for example, to produce pairs of “random” words representing conjugate elements of G where these words are already freely and cyclically reduced (which is not the case for Theorem 1.5.2 above).

For a finite symmetrized presentation G = ⟨A | R⟩, an ℐ-transformation on a word w ∈ Σ∗A consists in inserting, at a uniformly random position in w, a word u from R̂ = R ∪ {aa−1 | a ∈ ΣA}, chosen according to a probability distribution ν on R̂ fixed in advance. A sequence of ℐ-transformations, applied to w, produces a word that is equal to w in G, and it can be interpreted as a simplified version of the “evolutionary” process for producing random van Kampen diagrams discussed above.

In [358] Morar and Ushakov propose the following random process for generating random pairs of conjugate elements in G. Let w ∈ F(A) be a freely reduced word and let w ≡ w′a, where a ∈ ΣA is the last letter of w. For n = 1, 2, . . . they apply a sequence of random ℐ-transformations to w′ to obtain the words w′ = v1, v2, v3, . . . , vn, . . . in Σ∗A. Note that for each n ≥ 1 we have vn =G w′ and vn a =G w. For each n ≥ 1 let wn be a uniformly randomly chosen cyclic permutation of the freely and cyclically reduced form of vn a. Thus wn is conjugate to w in G. Denote this random process by 𝒲𝒞[ν]. Note that, by construction, we have |wn| ≤ |w| + (2 + L(R))n. The authors then obtain a polynomial-time generic-case complexity result for the conjugacy search problem with inputs produced by such a random process.

Theorem 1.5.3 ([358, Theorem C]). Let G = ⟨A | R⟩ be a finite symmetrized presentation. Let w ∈ F(A) be a freely reduced word. Fix a probability distribution ν on R̂. Let 𝒲𝒞[ν] be the random process described above, outputting a sequence of cyclically reduced words w1, w2, . . . , wn, . . . in F(A) representing elements conjugate to w in G. Then, with probability tending to 1 as n → ∞, the conjugacy search problem CSP(G, A) is solvable on the input (wn, w) in polynomial time in n.

The proof of Theorem 1.5.3 uses similar tools to that of Theorem 1.5.2, namely, estimating the “depth” of conjugacy van Kampen diagrams that arise in the process and using the abstract conjugacy search algorithm developed in [366]. Morar and Ushakov also provide an explicit generic polynomial-time bound in Theorem 1.5.3, in terms of |w| and L(R), which we omit here.
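A minimal sketch of the 𝒲𝒞[ν] process described above might look as follows, assuming for simplicity that ν is uniform on R̂ and that words are strings with uppercase letters denoting inverses; the presentation in the example is hypothetical.

import random

def free_reduce(word):
    out = []
    for ch in word:
        if out and out[-1] == ch.swapcase():
            out.pop()
        else:
            out.append(ch)
    return "".join(out)

def cyclically_reduce(word):
    """Cyclic reduction: free reduction plus stripping inverse first/last letters."""
    w = free_reduce(word)
    while len(w) > 1 and w[0] == w[-1].swapcase():
        w = free_reduce(w[1:-1])
    return w

def i_transformation(word, r_hat, rng):
    """Insert a word from R-hat at a uniformly random position (an I-transformation)."""
    pos = rng.randint(0, len(word))
    return word[:pos] + rng.choice(r_hat) + word[pos:]

def random_conjugate(w, relators, letters, steps, seed=None):
    """Sketch of the W_C[nu] process: returns a word conjugate to w in G = <A | R>."""
    rng = random.Random(seed)
    r_hat = list(relators) + [x + x.upper() for x in letters] + [x.upper() + x for x in letters]
    v = w[:-1]                       # write w = w' a, with a the last letter of w
    for _ in range(steps):
        v = i_transformation(v, r_hat, rng)
    u = cyclically_reduce(v + w[-1])
    k = rng.randrange(len(u)) if u else 0
    return u[k:] + u[:k]             # a uniformly random cyclic permutation

if __name__ == "__main__":
    print(random_conjugate("abab", ("abAB",), ("a", "b"), steps=5, seed=7))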
1.5.3 The membership search problem

In the same paper [358] Morar and Ushakov study the generic-case complexity of the membership search problem with respect to a given finitely generated subgroup. Let G = ⟨A | R⟩ be a finite symmetrized group presentation. Let H ≤ G be a finitely generated subgroup and let U = {u1, . . . , uk} ⊆ F(A) be a generating set for H in G. The membership search problem MSP(G, H, A, U) asks, given a word w ∈ Σ∗A known to represent an element of H, how to express w as a word over ΣU = U ∪ U−1.

In [366] the authors propose two related, but somewhat different, random processes for generating random words in F(A) representing elements of H. We will concentrate on one of these ways here. Fix a parameter q ∈ (0, 1), a probability distribution ν on R̂ and a probability distribution μ on ΣU. Start with the empty word w0 = ε. Iteratively construct a random sequence w0, w1, . . . , wn, . . . of freely reduced words in F(A) as follows. If wn is already constructed, then with probability q we put wn+1 equal to the freely reduced form of wn u, where u ∈ ΣU is a μ-random element, and with probability 1 − q we perform a random (according to ν) ℐ-transformation on wn and then freely reduce, to get wn+1. Call this random process 𝒲[ν, q, μ] (note that the case q = 1, although technically disallowed by the assumption q ∈ (0, 1), would correspond to performing the n-step μ-random walk on H). Thus for each n ≥ 1 the word wn ∈ F(A) is freely reduced and represents an element of H. Moreover, |wn| ≤ cn for some constant c depending only on R and U. Morar and Ushakov prove the following.

Theorem 1.5.4 ([358, Theorem E]). Let G = ⟨A | R⟩ be a finite symmetrized presentation. Let U = {u1, . . . , uk} ⊆ F(A) be a generating set for H ≤ G. Let q ∈ (0, 1). Let ν be a probability distribution on R̂ and let μ be a probability distribution on ΣU. Let the random process 𝒲[ν, q, μ] generate a sequence of freely reduced words w1, w2, . . . , wn, . . . in F(A) (which, as noted above, represent elements of H). Then, with probability tending to 1 as n → ∞, the membership search problem MSP(G, H, A, U) is solvable on the input wn in polynomial time in n.

The proof of Theorem 1.5.4 relies on generalizing to the subgroup case the abstract approximation methods developed in [366, 358] for analyzing the word search problem, and on modifying the diagrammatic notion of “depth” to the subgroup context. For G, H, A, U as above, a U-diagram D over G = ⟨A | R⟩ is defined in the same way as an ordinary van Kampen diagram, except that D is also allowed to contain U-cells, which are 2-cells c whose boundary cycle contains the base-vertex v0 ∈ 𝜕D and such that reading the boundary label of c from v0 produces a word in U ∪ U−1. Thus, if w is the boundary label of D read from v0, then w ∈ H in G. For a word w ∈ F(A) representing an element of H, the depth δU(w) of w with respect to U is defined as the minimum of δ(D) over all U-diagrams D with the label of their boundary cycle equal to w. In the course of the proof of Theorem 1.5.4 one shows that, probabilistically, for the words wn produced by the process 𝒲[ν, q, μ] one has good control
over δU(wn) (namely, δU(wn) ≤ const log n), which ultimately translates into requisite estimates about the complexity of the membership search problem on input wn.
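A sketch of the process 𝒲[ν, q, μ] described above, again with uniform ν and μ and a hypothetical presentation and subgroup, might look as follows; every word it emits is freely reduced and represents an element of H.

import random

def free_reduce(word):
    out = []
    for ch in word:
        if out and out[-1] == ch.swapcase():
            out.pop()
        else:
            out.append(ch)
    return "".join(out)

def random_subgroup_words(subgroup_gens, relators, letters, q, steps, seed=None):
    """Sketch of the process W[nu, q, mu]: yields freely reduced words over A
    that represent elements of H = <subgroup_gens> in G = <A | R>."""
    rng = random.Random(seed)
    sigma_u = list(subgroup_gens) + [g[::-1].swapcase() for g in subgroup_gens]
    r_hat = list(relators) + [x + x.upper() for x in letters] + [x.upper() + x for x in letters]
    w = ""
    for _ in range(steps):
        if rng.random() < q:
            w = free_reduce(w + rng.choice(sigma_u))     # multiply by some u in Sigma_U
        else:
            pos = rng.randint(0, len(w))                 # an I-transformation
            w = free_reduce(w[:pos] + rng.choice(r_hat) + w[pos:])
        yield w

if __name__ == "__main__":
    # Hypothetical example: G = Z^2 and H generated by the single word 'ab'.
    for w in random_subgroup_words(("ab",), ("abAB",), ("a", "b"), q=0.7, steps=5, seed=3):
        print(w)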
1.6 Algorithmically finite groups

All of the results discussed in the previous sections had the flavor of claiming that various algorithmic problems in various classes of groups are “generically easy.” However, it turns out that there do exist natural examples of “generically hard” group-theoretic decision problems.

The following notion was introduced by Miasnikov and Osin in [339]. A finitely generated group G is called algorithmically finite if for some (equivalently, every) finite marked generating set A of G, for the natural projection π : F(A) → G, there does not exist an infinite computably enumerable subset U ⊆ F(A) such that the restriction π|U is injective. That is to say, there does not exist an algorithm that produces an infinite collection of pairwise distinct elements of G. For algorithmically finite groups, a version of the word problem called the “equality problem” turns out to be “generically hard.” For a group G with a finite marked generating set A, the equality problem for G with respect to A asks, given two words u, v ∈ F(A) (or, more generally, u, v ∈ Σ∗A), whether π(u) = π(v) in G. For a subset U ⊆ F(A), we say that the equality problem for G is decidable in U if there exists an algorithm with inputs (u, v) ∈ F(A) × F(A) that, given u, v ∈ U, decides whether or not π(u) = π(v) in G.

Miasnikov and Osin called infinite, recursively presented, algorithmically finite groups Dehn Monsters. The main result of [339] establishes the existence of Dehn Monsters and can be summarized as follows.

Theorem 1.6.1 ([339]). The following hold:
1. There exists a finitely generated algorithmically finite nonamenable group G such that G is recursively presentable.
2. Let G be as in (1) and let A be a finite marked generating set of G. Let U ⊆ F(A) be a computably enumerable subset such that the equality problem for G with respect to A is decidable in U. Then U is exponentially negligible in F(A).

The proof of the above theorem is based on cleverly exploiting the Golod–Shafarevich construction [174]; we review the basic structure of the argument below. It remains unclear what the generic-case complexity properties of the word problems WP(G, A) and WPred(G, A) are in the examples produced by Theorem 1.6.1.

Miasnikov and Schupp [345] show that the existence of Dehn Monsters has implications for the conjugacy problem as well. They define a finitely generated recursively presented group G to have algorithmically finite conjugation if G has infinitely many conjugacy classes and for some (equivalently, any) finite marked generating set A of G, for any
infinite computably enumerable subset S ⊆ F(A) there exist two distinct elements of S which represent conjugate elements of G. They show [345, Theorem 8.2] that every finitely generated Dehn Monster (for example, the group G provided by Theorem 1.6.1) has algorithmically finite conjugation.

We review here the basic construction of Dehn Monsters from [339]. Let r ≥ 2 and let A = {a1, . . . , ar} be a free basis of the free group Fr = F(A). Also choose a prime p ≥ 2. Denote by Λp = ℤp⟨⟨x1, . . . , xr⟩⟩ the ring of formal power series with coefficients in ℤp in noncommuting variables x1, . . . , xr. The elements 1 + xi are multiplicatively invertible in Λp (and so belong to the group (Λp)× of units in Λp), and the map A → (Λp)×, ai ↦ 1 + xi extends to an injective homomorphism m : F(A) → (Λp)× called the Magnus embedding. The ideal ℐ = (x1, . . . , xr) ⊲ Λp defines the Zassenhaus filtration Fr = D1 > D2 > ⋅ ⋅ ⋅ > Dn > ⋅ ⋅ ⋅ of Fr, where Di = {w ∈ Fr | m(w) ≡ 1 mod ℐ i}. Each Di is a normal subgroup of finite index in Fr, with Fr/Di being a finite p-group, and, moreover, ⋂i≥1 Di = {1}. For an element 1 ≠ w ∈ Fr there is a uniquely defined degree deg(w), which is the integer n ≥ 1 such that w ∈ Dn \ Dn+1.

Consider a group presentation 𝒫 = ⟨A | R⟩ where R ⊆ F(A). Denote by ni(R) the number of w ∈ R with deg(w) = i. Assume, for simplicity, that n1(R) = 0, that is, that R ⊆ D2. Now consider the power series

𝒮R,p(t) = 1 − rt + ∑i≥2 ni(R) t^i.
We will say that a presentation 𝒫 = ⟨A | R⟩ (with R ⊆ D2) is a Golod–Shafarevich presentation if there exists 0 < t0 < 1 such that 𝒮R,p(t0) < 0. A key result of Golod and Shafarevich [174], proved by considering the pro-p-completion, implies that if 𝒫 = ⟨A | R⟩ is a Golod–Shafarevich presentation, then the group G = G(𝒫) = ⟨A | R⟩ defined by 𝒫 is infinite. The easiest case where this assumption on 𝒫 is satisfied is R = ⌀ (in which case G = F(A) = Fr).

Following the terminology of [339] (which was inspired by the notion of a “simple” set of natural numbers from recursion theory), we will say that a subset S ⊆ F(A) is simple if:
(a) The subset S ⊆ F(A) is computably enumerable.
(b) The group G = ⟨A | S⟩ is infinite.
(c) For every infinite computably enumerable subset U ⊆ F(A), there exist distinct u, v ∈ U such that uv−1 ∈ S.
Note that any simple subset S ⊆ F(A) is necessarily infinite. Note also that if S ⊆ F(A) is a simple subset, then the group G = ⟨A | S⟩ is a Dehn Monster (and the converse is also true). Thus, if S ⊆ F(A) is simple, then, by definition, G = ⟨A | S⟩ is recursively presentable and infinite. The group G is also algorithmically finite. Indeed, let
U ⊆ F(A) be an infinite computably enumerable subset. Since S is simple, there exist distinct u, v ∈ U such that uv−1 ∈ S. Then π(uv−1) = 1 in G and therefore π(u) = π(v) in G. Thus π|U is not injective. Hence G is algorithmically finite and thus a Dehn Monster.

The following key statement is proved in [339, Theorem 2.7].

Proposition 1.6.2. Let G = ⟨A | R⟩ be a Golod–Shafarevich presentation. Then there exists a simple subset S ⊆ F(A) such that G = ⟨A | R ∪ S⟩ is again a Golod–Shafarevich presentation.

The proof of Proposition 1.6.2 produces a specific infinite computably enumerable set S satisfying condition (c) via a forcing-like argument from logic; the proof uses a Golod–Shafarevich argument for R ∪ S (that is, checking that 𝒮R∪S,p(t0) < 0 for some 0 < t0 < 1) to ensure that G = ⟨A | R ∪ S⟩ is infinite, so that (b) also holds for R ∪ S and therefore for S as well. The result is a simple set S satisfying the conclusions of Proposition 1.6.2. Note that if R, S are as in Proposition 1.6.2 above and if R is computably enumerable, then the set R ∪ S is simple. In particular, if we take R = ⌀, then ⟨A | ⌀⟩ is a Golod–Shafarevich presentation, and then Proposition 1.6.2 implies the existence of a Dehn Monster G = ⟨A | S⟩. Moreover, by a result of [125], all Golod–Shafarevich groups are nonamenable, and so G = ⟨A | S⟩ is nonamenable. Thus part (1) of Theorem 1.6.1 follows. Part (2) of Theorem 1.6.1 is established in [339] after some additional general algebraic and algorithmic results about Dehn Monsters are proved there.

In particular, in [339] Miasnikov and Osin also prove that every algorithmically finite group G is an infinite torsion group, and that every quotient group of a finitely generated subgroup of G is again algorithmically finite. They also construct an infinite residually finite algorithmically finite group in [339].
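Returning to the Golod–Shafarevich condition used above: given the counts ni(R) of relators of each degree, it is straightforward to search numerically for a witness 0 < t0 < 1 with 𝒮R,p(t0) < 0. The sketch below performs a simple grid search; the degree counts in the examples are hypothetical.

def gs_series(r, degree_counts, t):
    """Evaluate S_{R,p}(t) = 1 - r*t + sum_{i>=2} n_i(R) t^i, where degree_counts
    maps a degree i >= 2 to n_i(R) (only finitely many nonzero terms here)."""
    return 1 - r * t + sum(n * t ** i for i, n in degree_counts.items())

def is_golod_shafarevich(r, degree_counts, grid=1000):
    """Search for some 0 < t0 < 1 with S_{R,p}(t0) < 0 (a sufficient certificate)."""
    for k in range(1, grid):
        t0 = k / grid
        if gs_series(r, degree_counts, t0) < 0:
            return True, t0
    return False, None

if __name__ == "__main__":
    # Hypothetical data: r = 3 generators.
    print(is_golod_shafarevich(3, {2: 10}))  # no witness: 1 - 3t + 10t^2 > 0 for all t
    print(is_golod_shafarevich(3, {5: 4}))   # a witness exists, e.g. t0 = 0.5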
1.7 Whitehead algorithm and related problems

1.7.1 Whitehead algorithm and the automorphism problem

The automorphism problem for a free group Fr = F(A), where A = {a1, . . . , ar}, plays a particularly important role in algorithmic group theory. This problem asks, given two elements u, v ∈ Fr, whether or not there exists an automorphism ϕ ∈ Aut(Fr) such that ϕ(u) = v. The classic algorithm of Whitehead [497] provides a complete solution for the automorphism problem in Fr. There is an a priori exponential-time (in terms of max{|u|, |v|}) upper bound on the running time of Whitehead’s algorithm. While there has been a significant amount of work on Whitehead’s algorithm over the years, little progress has been made on understanding its worst case since Whitehead’s original 1936 paper. The true complexity of Whitehead’s algorithm is still not known, and neither is the worst-case complexity of the automorphism problem in Fr. The only exception is r = 2, where it is known by the work of Myasnikov and Shpilrain that a
refinement of Whitehead’s algorithm works in polynomial time (the best-known estimate, due to Khan [255], is quadratic time in terms of max{|u|, |v|}).

Whitehead’s algorithm uses a finite collection 𝒲 of so-called Whitehead moves or Whitehead automorphisms, which are special elements τ ∈ Aut(Fr) generalizing Nielsen automorphisms. We refer the reader to [249] for the precise definition of Whitehead moves, and recall some of their key properties here. Elements of 𝒲 are divided into two types: a Whitehead move τ of the first kind is induced by a permutation of A followed by a possible inversion of some generators. In particular, for every element u ∈ Fr = F(A) we have |τ(u)|A = |u|A and ‖τ(u)‖A = ‖u‖A, where |u|A is the freely reduced length of u and ‖u‖A is the cyclically reduced length of u in F(A). Whitehead moves of the second kind may change the cyclically reduced length of elements of F(A). It is also known that ⟨𝒲⟩ = Aut(Fr).

A conjugacy class [u] in F(A) is called automorphically minimal with respect to A if for every ϕ ∈ Aut(Fr) we have ‖u‖A ≤ ‖ϕ(u)‖A. Whitehead’s result [497] can be summarized as follows.

Theorem 1.7.1. Let r ≥ 2 and let u ∈ Fr = F(A) be a nontrivial freely reduced element.
1. If [u] is not automorphically minimal, then there exists τ ∈ 𝒲 such that ‖τ(u)‖A < ‖u‖A.
2. If [u′], [v′] are automorphically minimal, then Aut(Fr)u′ = Aut(Fr)v′ if and only if ‖u′‖A = ‖v′‖A and there exists a finite sequence τ1, . . . , τn ∈ 𝒲 such that v′ = τn . . . τ1 u′ and such that for i = 1, . . . , n we have ‖τi . . . τ1 u′‖A = ‖u′‖A.

Theorem 1.7.1 provides a complete algorithm for solving the automorphism problem in Fr = F(A). Given u ∈ F(A), we can iteratively apply Whitehead automorphisms, decreasing the cyclically reduced length at each step until this is no longer possible, and we have arrived at an element u′ ∈ Fr such that for every τ ∈ 𝒲 we have ‖u′‖A ≤ ‖τ(u′)‖A. Part (1) of Theorem 1.7.1 then says that [u′] is automorphically minimal with respect to A. This portion of the algorithm works in at most quadratic time in terms of |u|A. We apply the same procedure to an element v ∈ F(A) to find an automorphically minimal [v′] such that Aut(Fr)v′ = Aut(Fr)v. If ‖u′‖A ≠ ‖v′‖A, then Aut(Fr)u ≠ Aut(Fr)v. If ‖u′‖A = ‖v′‖A = m, then we check if there exists a finite chain of Whitehead moves as in part (2) of Theorem 1.7.1. If so, then Aut(Fr)u = Aut(Fr)v, and if not, then Aut(Fr)u ≠ Aut(Fr)v. This is the “hard” part of Whitehead’s algorithm, which in general takes a priori exponential time in terms of max{|u|A, |v|A}. The reason is that we need to construct the graph Ωm, whose vertices are the automorphically minimal classes [w] with ‖w‖A = m, with two distinct vertices [w], [z] being adjacent if there exists τ ∈ 𝒲 such that [τ(w)] = [z]. Then we need to check if [u′], [v′] are in the same connected component of Ωm. In general, the only upper bound available on the number of vertices in the component C(Ωm, [u′]) is exponential in m. The polynomial-time improvements in [365, 255] for Whitehead’s algorithm for r = 2 are obtained by proving specific polynomial upper bounds for the size of C(Ωm, [u′]) in the rank-2 case.
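The length-reduction (“easy”) part of Whitehead’s algorithm is simple to sketch: compute cyclically reduced lengths, apply automorphisms given by their images on the generators, and greedily apply any move that strictly shortens the cyclic length. The sketch below does this over a user-supplied list of moves; for brevity it uses a couple of hypothetical Nielsen-type moves rather than enumerating the full set 𝒲 of Whitehead automorphisms.

def free_reduce(word):
    out = []
    for ch in word:
        if out and out[-1] == ch.swapcase():
            out.pop()
        else:
            out.append(ch)
    return "".join(out)

def cyclic_length(word):
    """The cyclically reduced length ||w||_A of a word (capitals denote inverses)."""
    w = free_reduce(word)
    while len(w) > 1 and w[0] == w[-1].swapcase():
        w = free_reduce(w[1:-1])
    return len(w)

def apply_automorphism(images, word):
    """Apply an automorphism of F(A) given by the images of the generators;
    an inverse letter is sent to the inverse of the corresponding image."""
    out = []
    for ch in word:
        img = images[ch.lower()]
        out.append(img if ch.islower() else img[::-1].swapcase())
    return free_reduce("".join(out))

def greedy_minimize(word, moves):
    """Greedily apply any move that strictly decreases ||.||_A. Part (1) of
    Theorem 1.7.1 guarantees that this reaches a minimal class when `moves`
    is the full set of Whitehead automorphisms, which we do not list here."""
    current, improved = word, True
    while improved:
        improved = False
        for images in moves:
            candidate = apply_automorphism(images, current)
            if cyclic_length(candidate) < cyclic_length(current):
                current, improved = candidate, True
                break
    return current

if __name__ == "__main__":
    # Two hypothetical Nielsen-type moves in F(a, b): a -> a b^-1 and a -> a b.
    moves = [{"a": "aB", "b": "b"}, {"a": "ab", "b": "b"}]
    result = greedy_minimize("abbb", moves)
    print(result, cyclic_length(result))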
1.7.2 Generic-case behavior of Whitehead’s algorithm

However, it turns out that generically Whitehead’s algorithm works much faster than the worst-case exponential-time upper bound, for arbitrary rank r ≥ 3. We will say that a conjugacy class [u] in Fr = F(A) is strictly minimal with respect to A if for every noninner Whitehead move τ ∈ 𝒲 of the second kind we have ‖u‖A < ‖τ(u)‖A. We summarize the results of Kapovich, Schupp and Shpilrain [249] regarding generic-case behavior of Whitehead’s algorithm in the following statement.

Theorem 1.7.2. Let r ≥ 2 be an integer and let Fr = F(A), where A = {a1, . . . , ar}. Let Ur be the set of all cyclically reduced words in Fr = F(A).
1. If [u] is strictly minimal with respect to A with ‖u‖A = m, then [u] is automorphically minimal and |C(Ωm, [u])| ≤ c(r) for some constant c(r) ≥ 1 depending only on the rank r of Fr.
2. There exists a subset S ⊆ F(A) exponentially generic in F(A) and there exists a subset S′ ⊆ Ur exponentially generic in Ur such that for every u ∈ S and every u′ ∈ S′ the conjugacy classes [u] and [u′] are strictly minimal with respect to A. Moreover, there exists a linear-time algorithm deciding whether or not an element u ∈ Fr belongs to S, and the same is true for S′.
3. For any u ∈ S (similarly, for any u ∈ S′) and any v ∈ Fr, Whitehead’s algorithm decides in at most quadratic time in terms of max{|u|A, |v|A} whether or not Aut(Fr)u = Aut(Fr)v.
4. For any u, v ∈ S (similarly, for any u, v ∈ S′), Whitehead’s algorithm decides in at most linear time in terms of max{|u|A, |v|A} whether or not Aut(Fr)u = Aut(Fr)v.
5. Let the random process 𝒲A,2,red be given by two independent nonbacktracking random walks on Fr = F(A) with respect to A, generating a pair Wn = (Wn(1), Wn(2)) of freely reduced words of length n in F(A) after n steps. Then Whitehead’s algorithm solves the automorphism problem for Fr strongly generically in linear time with respect to 𝒲A,2,red.

The proof of Theorem 1.7.2 in [249] utilizes the analysis of “Whitehead graphs” associated with cyclically reduced words in Fr = F(A). The Whitehead graph records the information about the numbers of occurrences of one-letter and two-letter subwords in a cyclically reduced word. It is important for the proof that for “random” cyclically reduced words in the sense of Theorem 1.7.2 the frequencies of one-letter and two-letter subwords in that word are close to uniform. This nearly uniform distribution of frequencies usually no longer holds for “generic” elements of Fr generated by other types of random processes.

Nevertheless, Kapovich [239] recently generalized, using the theory of geodesic currents, the genericity results from Theorem 1.7.2 to several types of other random processes generating “random” elements of Fr = F(A). These random processes include group random walks defined by “nice” finitely supported discrete probability
measures on Fr, as well as nonbacktracking graph random walks, given by “nice” irreducible Markov chains on finite graphs with fundamental group Fr. In that more general context it turns out that “random” elements wn ∈ F(A) are no longer necessarily strictly minimal. However, there exist M ≥ 1, λ > 1, ϵ > 0 and a universal (determined by the defining measure/Markov chain of the walk and independent of n and of the trajectory of the random walk) “shortening” automorphism ϕ ∈ Aut(Fr) such that generically the conjugacy class [ϕ(wn)] is (M, λ, ϵ)-minimal, in the sense defined in [239]. This property implies, in particular, that the number of automorphically minimal conjugacy classes [u] with u ∈ Aut(Fr)wn is bounded above by M. For that reason analogs of parts (3) and (5) of Theorem 1.7.2 also hold in this context.
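The combinatorial data behind the Whitehead graph mentioned above is just the collection of counts of two-letter subwords of the cyclic word; a minimal sketch of computing these cyclic pair frequencies (on a hypothetical short input) is given below.

from collections import Counter

def cyclic_pair_counts(word):
    """Count the two-letter subwords of a cyclically reduced word read cyclically;
    this is the information from which the Whitehead graph of [w] is built."""
    counts = Counter()
    n = len(word)
    for i in range(n):
        counts[word[i] + word[(i + 1) % n]] += 1
    return counts

if __name__ == "__main__":
    # For a long "random" cyclically reduced word these counts are nearly uniform
    # over the allowed pairs; for this short hypothetical word they are not.
    print(cyclic_pair_counts("abAbab"))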
1.7.3 Whitehead algorithm for subgroups

There is a version of the automorphism problem for finitely generated subgroups of Fr: Given two finitely generated subgroups H, K ≤ Fr, decide whether or not there exists an automorphism ϕ ∈ Aut(Fr) such that ϕ(H) = K. In a 1984 paper [164] Gersten gave a solution for the automorphism problem for finitely generated subgroups of Fr = F(A) that works similarly to Whitehead’s algorithm. Every finitely generated subgroup H ≤ Fr is uniquely represented by its Stallings subgroup graph Γ(H), which is a finite connected labeled “folded” graph with a distinguished base-vertex 1H. The graph Γ(H) arises as the base-pointed core of the covering space corresponding to H of the r-rose Rr with petals labeled by a1, . . . , ar. If H ≠ 1, then every vertex of Γ(H) other than possibly 1H has degree ≥ 2. For H ≠ 1, the graph Γ(H) has a unique smallest subgraph Core(Γ(H)) whose inclusion in Γ(H) is a homotopy equivalence with Γ(H). The graph Core(Γ(H)) is “cyclically reduced” in the sense that every vertex of this graph has degree ≥ 2 in Core(Γ(H)). Moreover, Core(Γ(H)) is an invariant of the conjugacy class [H] of H in Fr. For a nontrivial finitely generated subgroup H ≤ Fr, we denote by |H|A the number of vertices of Γ(H) and denote by ‖H‖A the number of vertices of Core(Γ(H)). See [470, 242] for the background on Stallings subgroup graphs.

Gersten [164] established an exact analog of Theorem 1.7.1 for nontrivial finitely generated subgroups of Fr, where the cyclically reduced length of a word is replaced by ‖H‖A. In particular, one gets similar notions and results regarding automorphic minimality for subgroups. For a nontrivial finitely generated subgroup H ≤ Fr = F(A) (where r ≥ 2) we say that [H] is automorphically minimal if ‖H‖A ≤ ‖ϕ(H)‖A for every ϕ ∈ Aut(Fr). Then a version of part (1) of Theorem 1.7.1 says that if [H] is not automorphically minimal, then there exists a Whitehead automorphism τ ∈ 𝒲 such that ‖τ(H)‖A < ‖H‖A. Similarly to the case of words, one can compute ‖τ(H)‖A − ‖H‖A from the “Whitehead hypergraph” associated with [H]. There is also a precise analog of part (2) of Theorem 1.7.1. Altogether, by a similar argument as in the classic setting of elements of Fr, Gersten obtains a total algorithm for deciding, given two finitely generated subgroups H, K ≤ Fr, if there exists ϕ ∈ Aut(Fr) such that ϕ(H) = K. By analogy
with the case of words, for a nontrivial finitely generated subgroup H ≤ Fr = F(A) we say that [H] is strictly minimal with respect to A if for every noninner Whitehead automorphism of the second kind τ ∈ 𝒲 we have ‖H‖A < ‖τ(H)‖A. Then an analog of part (2) of Theorem 1.7.1 for subgroup graphs implies that if [H] is strictly minimal, then [H] is automorphically minimal. Moreover, Whitehead’s algorithm for solving the automorphism problem for two finitely generated subgroups of Fr works in low-degree polynomial time if the conjugacy class of at least one of these subgroups is strictly minimal.

Bassino, Nicaud and Weil [28] obtained the following analog of the results of [249] in the subgroup graph setting.

Theorem 1.7.3. Let r ≥ 2 and let Fr = F(A), where A = {a1, . . . , ar}.
1. For n ≥ 1 let Vn be the set of cyclically reduced Stallings subgroup graphs in F(A) with n vertices, equipped with the uniform probability measure. Then the set Vn′ ⊆ Vn of strictly minimal elements of Vn is superpolynomially generic in Vn as n → ∞, that is, limn→∞ #Vn′/#Vn = 1, and the convergence is superpolynomially fast.
2. For n ≥ 1 let Zn be the set of Stallings subgroup graphs in F(A) with n vertices, equipped with the uniform probability measure. Then the set Zn′ ⊆ Zn of strictly minimal elements of Zn is generic but not superpolynomially generic in Zn as n → ∞, that is, limn→∞ #Zn′/#Zn = 1, but the convergence is not superpolynomially fast.
Note that in [28] the above result and the definition of strict minimality in that paper are stated slightly incorrectly, because there they use |H|A rather than ‖H‖A . However, once the proof of [28, Proposition 2.2] is correctly reworded in terms of the effect of Whitehead moves on ‖H‖A , the proof of [28, Theorem 3.1, Corollary 3.5] produces Theorem 1.7.3 above. An interesting feature of Theorem 1.7.3 is its part (2) which shows that a “generic” Stallings subgroup graph on n vertices is already cyclically reduced, that is, its base-vertex has degree ≥ 2. By contrast, a “random” freely reduced word in F(A) of length n is not cyclically reduced with some asymptotically positive probability. It seems likely that one can strengthen part (2) of Theorem 1.7.3 to show that for a superpolynomially generic element of Zn its cyclically reduced core is strictly minimal. Although the paper [28] does not discuss the implications of the above statements for the generic-case complexity of Whitehead’s algorithm for subgroups, we state a sample application of this kind below. Corollary 1.7.4. Let r ≥ 2, Fr = F(A), let Zn , Zn be as in part (2) of Theorem 1.7.3 above and let μn be the uniform probability distribution on Zn . Consider the random process 𝒲 such that for every n ≥ 1 this process generates a pair Wn = (Wn(1) , Wn(2) ) of μn -random independently chosen elements Wn(1) , Wn(2) ∈ Zn (so that Wn(1) , Wn(2) determine two “random” finitely generated subgroups of Fr ). Then Gersten’s version of the Whitehead algo-
rithm solves the automorphism problem for finitely generated subgroups of Fr generically in cubic time in n. The proof is essentially the same as in [249] for the case of group elements. Note that for two base-pointed Stallings graphs on n vertices, to decide if they are equal (base-point label preserving isomorphic) it takes a priori quadratic time in n, and it takes at most cubic time to decide if there exists a label preserving isomorphism between them that does not necessarily respect the base-points. In contrast, for freely reduced words in Fr one can decide in linear time if they are equal to each other, and one can also decide in linear time if they are cyclic permutations of each other. That is why there is a linear time estimate in part (5) of Theorem 1.7.2 and a cubic time estimate in Corollary 1.7.4. The uniform probability distribution μn on the set Zn of Stallings subgroup graphs of a given size (that is, with n vertices) turns out to be a natural setting to consider for generic-case complexity purposes since it is possible to produce such a distribution experimentally by a reasonably easy to implement and efficient process. Such a process is described in detail in [26]. In [28] the authors also consider “random” finitely generated subgroups of Fr = F(A) (where r ≥ 2) in the basic genericity model, where the number k of the generators of the subgroup is fixed. They prove [28, Theorem 4.1] that for any fixed k ≥ 2 the set of k-tuples (w1 , . . . , wk ) of cyclically reduced words such that H = ⟨w1 , . . . , ak , ⟩ ≤ Fr is free of rank k and such that Core(Γ(H)) is strictly minimal is exponentially generic in the set of all k-tuples of cyclically reduced words. In fact, a careful analysis of the proof shows that the same result holds for k-tuples of freely reduced words. In [238] Kapovich gives a more conceptual treatment for the phenomena underlying Theorem 1.7.2 using the machinery of “geodesic currents” on free groups. It seems likely that one can recast the results of [28] regarding Whitehead’s algorithm for subgroups in the language of “subset currents” introduced by Kapovich and Nagnibeda in [243].
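For readers who want to experiment, the graph Γ(H) used throughout this section can be computed by Stallings' folding procedure: wedge together loops spelling the given generators at the base-vertex and identify equally labeled edges sharing an endpoint until no such pair remains. The sketch below is our own minimal Python rendering (words encoded as tuples of nonzero integers, with i standing for ai and −i for its inverse; naive quadratic folding rather than the nearly linear-time implementations used in practice, and not taken from [164] or [470, 242]); trimming valence-one vertices then yields the core graph whose number of vertices is |H|A.

```python
def stallings_graph(generators, base=0):
    """Fold the wedge of loops spelling the generators into the Stallings
    graph of H = <generators> <= F(A).  Words are tuples of nonzero ints
    (i = a_i, -i = a_i^{-1}).  Edges are labeled arrows (u, x, v), stored
    together with their inverses (v, -x, u)."""
    edges, fresh = set(), base + 1
    for w in generators:
        prev = base
        for i, x in enumerate(w):
            if i == len(w) - 1:
                nxt = base
            else:
                nxt = fresh
                fresh += 1
            edges |= {(prev, x, nxt), (nxt, -x, prev)}
            prev = nxt
    while True:  # fold: merge targets of equally labeled arrows at a vertex
        seen, clash = {}, None
        for (u, x, v) in edges:
            if (u, x) in seen and seen[(u, x)] != v:
                clash = (seen[(u, x)], v)
                break
            seen[(u, x)] = v
        if clash is None:
            return edges, base
        keep, drop = min(clash), max(clash)  # the base-vertex 0 is never dropped
        edges = {(keep if a == drop else a, x, keep if b == drop else b)
                 for (a, x, b) in edges}

def trim_to_core(edges, base):
    """Repeatedly remove valence-one vertices other than the base-vertex;
    the number of vertices of the result is |H|_A."""
    while True:
        degree = {}
        for (u, x, v) in edges:
            degree[u] = degree.get(u, 0) + 1
        leaves = {v for v, d in degree.items() if d == 1 and v != base}
        if not leaves:
            return edges
        edges = {(u, x, v) for (u, x, v) in edges
                 if u not in leaves and v not in leaves}
```

For example, stallings_graph([(1, 2, -1), (1, 1)]) builds the graph of ⟨a1 a2 a1^(−1) , a1^2 ⟩ ≤ F(a1 , a2 ); trimming valence-one vertices including, when applicable, the base-vertex produces the cyclically reduced core used for ‖H‖A .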
1.7.4 Garside algorithm for the conjugacy problem in braid groups The conjugacy problem for the n-strand braid group Bn is one of the most well-studied algorithmic problems in geometric group theory, but the precise worst-case complexity of the conjugacy problem in Bn is still not known. There are several solutions that have a priori exponential-time worst-case complexity, and it is an important open problem whether a polynomial-time solution to the conjugacy problem in Bn exists. The main approach to the conjugacy problem in Bn is given by the so-called Garside algorithm and its various modifications. The literature on this topic is rather vast, and is far beyond the scope of this chapter. We refer the reader to [143] and a foundational paper of Birman, Ko and Lee [46] for the relevant background info. The
36 | 1 Generic-case complexity in group theory basic structure of Garside’s algorithm has significant similarities to that of Whitehead’s algorithm. The braid group Bn contains a special element Δ, topologically corresponding to “half-twist” around the boundary of the disk, and such that the center of Bn is generated by Δ2 . There is also a finite set S ⊆ Bn of positive braids consisting of “divisors” of Δ called simple braids. Every element g ∈ Bn can be uniquely represented as g = Δk x1 . . . , xr , where k ∈ ℤ, x1 , . . . , xk ∈ S and where the expression x1 . . . xk is in a “left normal form” with respect to the divisibility properties of Δ. The number r = |g| is called the syllable length of g. Given such an expression for g, a “cycling” or a “decycling” move consists in conjugating g by x1 or xk−1 and rewriting the result in the normal form again. The key points of Garside’s algorithm are the following: Given g ∈ G, one can first minimize its syllable length in the conjugacy class of g by repeated cycling/decycling moves, decreasing the syllable length at each step (similar to part (1) of Theorem 1.7.1). Then, given two elements of the same minimal syllable length m, these two elements are conjugate in Bn if and only if there exists a sequence of conjugations by elements of S taking the first element to the second and keeping the syllable length constant at m (which is similar to part (2) of Theorem 1.7.1). This is the “hard” part of Gardside’s algorithm, and the difficulty consists in controlling the size of the analog of the component C(Ωm , [u ]) in the Bn setting. These analogs, for various improvements and variations of Gardside’s algorithm, are called the “summit set,” “supersummit set,” “ultrasummit set,” etc. However, at the moment it is not yet known whether any of these summittype sets have at most polynomial size in the length of g even for pseudo-Anosov g ∈ Bn . In terms of generic-case complexity for the conjugacy problem for braid groups, we are currently aware of three types of results. First, because for n ≥ 3 the braid group Bn has infinite abelianization, Theorem 1.4.4 and Theorem 1.4.5 apply to Bn . Thus for n ≥ 3 the conjugacy problem for Bn is solvable generically in linear time in the sense described in Theorem 1.4.4 and Theorem 1.4.5 (but note that strongly generically linear time cannot be concluded from these theorems). Second, for n ≥ 3, the pure braid group Pn ≤ Bn admits an epimorphism onto the free group F(a, b). Therefore, by Theorem 1.4.6, for n ≥ 3, the pure braid group Pn has the conjugacy problem solvable strongly generically in linear time in the sense described in Theorem 1.4.6. Third, there are also some generic-case complexity results regarding versions of Garside’s algorithm that yield exponential-time speed of convergence estimate, that is, in our language, strong generic-case complexity results. Thus, Caruso and Wiest [78] obtain a certain generic-case complexity result for a version of Garside’s algorithm that is similar in flavor to Theorem 1.7.2. Theorem 1.7.5 ([78, Theorem 6.2]). Let n ≥ 3. For L ≥ 1 consider the uniform probability distribution μL on the ball B(L) in the Cayley graph of Bn with respect to the generating set S consisting of simple braids. Then, for as L → ∞ for a μL -random g ∈ B(L) and an
arbitrary h ∈ Bn , with probability tending to 1 exponentially fast as L → ∞, a version of Garside’s algorithm decides whether or not g and h are conjugate in at most quadratic time in the maximum of the lengths of g, h. The proof of this result uses a “cyclic sliding” version of Garside’s algorithm due to Gebhardt and González-Meneses [162] and proceeds by showing that a μL -random element g ∈ BL is pseudo-Anosov and that g has “sliding circuit set” SC(g) of linear size in terms of |g|. As of now, there are no results of this kind about generic-case behavior of Garside’s algorithm for the random walk-based versions of generic-case complexity. We discuss here a possible alternative approach to the generic-case complexity of the conjugacy problem in braid groups, and, more generally, in mapping class groups. Let Σ be a finite-type oriented surface of negative Euler characteristic and let G = Mod(Σ) be the mapping class group of Σ. Consider the action of G on the curve complex 𝒞 (Σ). The curve complex is well known to be Gromov-hyperbolic and this action is known to be acylindrical. Let μ be a probability measure on G whose support is finite and generates G as a semigroup. Let Wn = (Wn(1) , Wn(2) ) be a pair of independent μ-random walks on G of length n. It is well known by now that, with probability tending to 1 as n → ∞, each of Wn(1) , Wn(2) ∈ G acts loxodromically on 𝒞 (Σ) and hence is pseudo-Anosov. Moreover, a recent result of Maher and Sisto [324, Theorem 1] shows that, with probability tending to 1 as n → ∞, the subgroup H = ⟨Wn(1) , Wn(2) ⟩ ≤ G is free of rank 2, quasiisometrically embedded in G, that Wn(1) , Wn(2) have arbitrarily large translation lengths in 𝒞 (Σ) and that the group HE(G) = H ⋉ E(G) is hyperbolically embedded in G (here E(G) is the maximal finite normal subgroup of G). This implies, in particular, that Wn(1) , Wn(2) are not conjugate in G. The fact that Wn(1) and Wn(2) are not conjugate in G implies that the periods of their (quasi)axes in 𝒞 (Σ) have “small overlap” (relative to their translation lengths), even up to translation by elements of G. More precisely, for geodesic segments [a, b] and [c, d] in 𝒞 (Σ) we say that these segments have “small overlap” if there does not exist g ∈ G such that g[a, b] and [c, d] have subsegments of length 41 min{d(a, b), d(c, d)} that are 2δ-Hausdorff close in 𝒞 (Σ), where δ is the hyperbolicity constant of 𝒞 (Σ). As noted above, by a result of Maher and Sisto, with probability tending to 1 as n → ∞, the elements Wn(1) , Wn(2) are not conjugate in G and the periods of their (quasi)axes have small overlap in 𝒞 (Σ). A recent result of Bell and Webb provides a polynomial-time algorithm for computing distances in 𝒞 (Σ) and for computing tight geodesics in 𝒞 (Σ). We believe that it may be possible to promote their result to a polynomial-time algorithm that, given vertices a, b, c, d ∈ 𝒞 (Σ), decides whether or not geodesics [a, b], [c, d] have small overlap. Now, a generic algorithm for solving the conjugacy problem in G can proceed as follows. Given Wn(1) , Wn(2) ∈ G generated by the walk, the algorithm first uses the Bell–Webb result to determine if Wn(1) , Wn(2) acts with sufficiently large (with respect to δ) translation length on 𝒞 (Σ). If so, then they are both pseudo-Anosov and the Bell–Webb algorithm
38 | 1 Generic-case complexity in group theory can compute tight geodesics [a, b], [c, d] in 𝒞 (Σ) giving periods of (quasi)axes of Wn(1) , Wn(2) . Then check if [a, b], [c, d] have a small overlap. If so, conclude that Wn(1) , Wn(2) are not conjugate in G; otherwise, the algorithm does not return an answer. By the above result of Maher and Sisto, this algorithm will in fact terminate and conclude that Wn(1) , Wn(2) are not conjugate in G with probability tending to 1 as n tends to infinity. To make this approach work, one needs to obtain a polynomial-time “small overlap” extension of the Bell–Webb result. It is also desirable to obtain speed of convergence (presumably exponentially fast convergence) of probabilities estimates in the Maher–Sisto result, which for now does not have any speed of convergence estimates.
1.8 Generic-case complexity of the isomorphism problem The isomorphism problem remains perhaps the least understood among group-theoretic problems, both in terms of the worst-case analysis and in terms of generic-case analysis. Standard fast check ideas such as various quotient methods (including, say, trying to use the order of the torsion part of the abelianization of the group) do not work very well in the setting of the isomorphism problem and do not produce generic-case algorithms for the "no" part of this problem (see Remark 1.8.3 below). Therefore new approaches are required.
1.8.1 Generic one-relator groups The first, and still one of the few, results in this direction was obtained by Kapovich, Schupp and Shpilrain in the setting of one-relator groups [249]. That paper obtained a strong Mostow-type isomorphism rigidity result for “random” one-relator groups. If A = {a1 , . . . , ak }, F(A) is the free group on A and w ∈ F(A), we denote Gw := ⟨a1 , . . . ak |w⟩ = F(A)/nclF(A) (w). We say that η ∈ Aut(F(A)) is a relabeling auϵi tomorphism if there exists a permutation σ ∈ Sym({1, . . . , k}) such that η(ai ) = aσ(i) , where ϵi ∈ {−1, 1}, for i = 1, . . . , k. Theorem 1.8.1 ([249]). Let k ≥ 2, let A = {a1 , . . . , ak } and let U ⊆ F(A) be the set of all freely and cyclically reduced words in F(A). There exists a subset S ⊂ U with the following properties: 1. The set S is exponentially generic in U (in the sense of Definition 1.2.1 and part (3) of Example 1.2.2). 2. There exists a polynomial-time algorithm to decide, for a word w ∈ U, whether or not w ∈ S. 3. For w, v ∈ S we have Gw ≅ Gv if and only if there exists a relabeling automorphism η of F(A) such that η(v) is a cyclic permutation of w±1 .
4. For w ∈ S and v ∈ F(A), we have Gw ≅ Gv if and only if there exists α ∈ Aut(F(A)) such that α(w) is a cyclic permutation of w±1 . Part (3) of Theorem 1.8.1 means that for w, v ∈ S, Gw ≅ Gv if and only if the Cayley graphs Γ(Gw , A), Γ(Gv , A) are isomorphic as labeled graphs, after a possible label renaming. For two elements w, v ∈ S it is easy to check in linear time if there exists a relabeling automorphism η such that η(v) is a cyclic permutation of w±1 . Moreover, using Whitehead’s algorithm, with a bit of extra work (which requires the specifics of the description of S), one can show that, given w ∈ S, v ∈ F(A), one can decide in quadratic time in terms of max{|w|, |v|} if there exists α ∈ Aut(F(A)) such that α(w) is a cyclic permutation of w±1 . Therefore Theorem 1.8.1 implies the following. Corollary 1.8.2 ([249]). Let k, A, F(A), U, S be as in Theorem 1.8.1. 1. There exists a linear-time (in terms of max{|w|, |v|}) algorithm such that for any w, v ∈ S it decides whether or not Gw ≅ Gv . 2. There exists a quadratic-time (again in terms of max{|w|, |v|}) algorithm such that for any w ∈ S, v ∈ F(A) it decides whether or not Gw ≅ Gv . In particular, Corollary 1.8.2 means that the isomorphism problem for k-generated one-relator groups is solvable strongly generically in linear time. Remark 1.8.3. In [244], Kapovich, Rivin, Schupp and Shpilrain proved that with respect to the natural notion of “annular density” on the free group F(A) of rank k ≥ 2 the set V ⊆ F(A) of “visible” elements has annular density 1/ζ (k). Here w ∈ F(A) is visible if w is primitive in the integral homology of F(A), that is, if w projects to the element of ℤk = F(A)ab which is nontrivial and not a proper power (i. e., is a part of a free basis of ℤk ). Note that for 1 ≠ w ∈ F(A) the element w is visible if and only if |Torsion((Gw )ab )| = 1. Thus if v, w ∈ F(A) are both chosen randomly and independently, then with asymptotically positive probability they will both be visible and we will have Torsion((Gw )ab ) = Torsion((Gv )ab ) = 1. This example demonstrates why naive quotient tests methods, which work very well for the other group-theoretic decision problems, are not so well suited for distinguishing isomorphism types of finitely presented groups given by generic presentations. Algebraically, the proof of parts (3) and (4) in Theorem 1.8.1 involves three, rather distinct, components. One component is the generic-case analysis of Whitehead’s algorithm discussed in Section 1.7.1 above, and also obtained in [249]. The second is a classic result of Magnus [318] about one-relator groups that for r, s ∈ F(A) we have nclF(A) (r) = nclF(A) (s) if and only if s is conjugate to r ±1 in F(A). The third, crucial, component involves establishing the following result obtained by Kapovich and Schupp in
40 | 1 Generic-case complexity in group theory their earlier paper [246], for groups given by generic presentations with a fixed number of generators and defining relators. Theorem 1.8.4 ([246]). Let k ≥ 2 and m ≥ 1. Let G = ⟨a1 , . . . , ak | w1 , . . . , wm ⟩.
(♠)
Let Um be the set of m-tuples of cyclically reduced elements of F(A) where A = {a1 , . . . , ak }. Then there exists a subset S ⊆ Um with the following properties: 1. The set S is exponentially generic in Um (in the sense of Definition 1.2.1 and part (5) of Example 1.2.2). 2. There exists a polynomial-time algorithm to decide, for a word w ∈ U, whether or not w ∈ S. 3. For every W = (w1 , . . . , wm ) ∈ S the corresponding group GW defined by (♠) is a oneended torsion-free word-hyperbolic group and presentation (♠) satisfies the C (1/12) small cancelation condition. 4. For every W ∈ S and GW defined by (♠), the group GW has only one Nielsen equivalence class of k-tuples generating nonfree subgroups of GW , namely, the class of the k-tuple (a1 , . . . , ak ). Property (4) from Theorem 1.8.4 is called the Nielsen uniqueness property for GW . The proof of Theorem 1.8.4 in turn relies on the work of Arzhanteva and Ol’shanskii [16, 13, 14], who developed a certain entropy reducing generic “graph nonreadability condition” and an algebraic iterative surgery trick for reducing the volume of a graph representing a subgroup.
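For illustration, the elementary check underlying part (3) of Theorem 1.8.1 and Corollary 1.8.2, namely whether some relabeling automorphism η sends v to a cyclic permutation of w±1 , can be coded directly. The sketch below uses our own encoding of words as tuples of nonzero integers and our own function names; it simply enumerates the 2^k k! relabelings and all rotations, so for fixed k it runs in polynomial time in |w| rather than the optimized linear time mentioned above.

```python
from itertools import permutations, product

def relabeling_images(word, k):
    # yield the images of word under all relabeling automorphisms a_i -> a_{sigma(i)}^{eps_i}
    for sigma in permutations(range(1, k + 1)):
        for signs in product((1, -1), repeat=k):
            yield tuple(
                (sigma[abs(x) - 1] * signs[abs(x) - 1]) * (1 if x > 0 else -1)
                for x in word
            )

def inverse(w):
    return tuple(-x for x in reversed(w))

def is_cyclic_permutation(u, w):
    # u and w are cyclically reduced words of the same length
    return len(u) == len(w) and any(w[i:] + w[:i] == u for i in range(len(w)))

def relabel_cyclic_equivalent(v, w, k):
    """Does some relabeling automorphism send v to a cyclic permutation of w or w^{-1}?"""
    targets = (w, inverse(w))
    return any(is_cyclic_permutation(img, t)
               for img in relabeling_images(v, k) for t in targets)
```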
1.8.2 Generic quotients of the modular group The main obstacle to generalizing Theorem 1.8.1 to the case of generic finitely presented groups with m ≥ 1 defining relations is the absence of the abovementioned result of Magnus in the setting of m ≥ 1. Nevertheless, in a subsequent paper [248] Kapovich and Schupp were able to generalize Theorem 1.8.1 to the case of m-relator (where m ≥ 1 is an arbitrary fixed number) quotients of the modular group M ≅ PSL(2, ℤ) M = ⟨a, b | a2 = 1, b3 = 1⟩ = ℤ2 ∗ ℤ3 . In that setting one can still obtain the Nielsen uniqueness property for generic quotients of M, but the fact that M is two-generated and that both generators have finite order can be used as a substitute for Magnus’ result to ultimately achieve similar conclusions. We will say that a word w in the alphabet {a, b, b−1 } is cyclically reduced in
M if w is a strictly alternating product of a and elements of {b, b−1 }. For m ≥ 1 denote by Um (M) the set of m-tuples (w1 , . . . , wm ) of cyclically reduced words M such that |w1 | = ⋅ ⋅ ⋅ = |wm |. As before, for W = (w1 , . . . , wm ) ∈ Um denote GW = M/⟨⟨W⟩⟩ = ⟨a, b | a2 , b3 , w1 , . . . , wm ⟩. In the context of the modular group M, there is only one relabeling automorphism η of M defined by η(a) = a, η(b) = b−1 . We state here, in simplified form, a summary of the main results from [248]. Theorem 1.8.5. Let m ≥ 1 be an integer. There exists a subset S ⊆ Um (M) with the following properties: 1. The set S is exponentially generic in Um (M). 2. There exists a quadratic-time algorithm to decide, for a tuple W ∈ Um (M), whether or not W ∈ S. 3. For every W ∈ S the group GW is a one-ended torsion-free word-hyperbolic group with trivial center and Out(GW ) = {1}. 4. For W = (w1 , . . . , wm ), V = (v1 , . . . , vm ) ∈ S we have GW ≅ GV if and only if there exists a reordering (v1 , . . . , vm ) of (v1 , . . . , vm ) and there exists ϵ ∈ {0, 1} such that each vi is a cyclic permutation of ηϵ (wi ) or ηϵ (wi−1 ) for i = 1, . . . , m. Note that condition (4) of Theorem 1.8.5 can be easily verified in linear time in terms of max{|W|, |V|}. Theorem 1.8.5 can be interpreted as saying that the isomorphism problem for m-relator quotients of M the isomorphism problem is solvable strongly generically in linear time.
1.8.3 Other consequences of isomorphism rigidity Isomorphism rigidity (Theorem 1.8.1, part (3) and Theorem 1.8.5, part (4)) has several other important consequences. In particular, it can be used for counting the number of isomorphism types of groups. Let k ≥ 2. For n ≥ 2 denote by Ik (n) the number of isomorphism types of groups given by presentations G = ⟨a1 , . . . , ak | w = 1⟩, where w ∈ F(a1 , . . . , ak ) is a cyclically reduced word of length n. Similarly, for the modular group M, let m ≥ 1 be a fixed integer. Let Jm (n) be the number of isomorphism types of groups given by presentations G = ⟨a, b | a2 , b3 , w1 , . . . , wm ⟩, where w1 , . . . , wm ∈ M are cyclically reduced words of length n (i. e., strictly alternating products of length n of a and elements of {b, b−1 }). For two functions f (n), g(n) we write f (n) ∼ g(n) if lim_{n→∞} f (n)/g(n) = 1.
Theorem 1.8.6. The following hold:
1. [245] Let k ≥ 2. Then Ik (n) ∼ (2k − 1)^n / (n k! 2^(k+1) ).
2. [248] Let m ≥ 1. Then Jm (n) ∼ (2^(n/2+1) )^m / (2 m! (2n)^m ).
These counting results were obtained in [245, 248] as a consequence of isomorphism rigidity by showing that groups defined by generic presentations dominate in the count. Moreover, the statements of the isomorphism rigidity results are so precise that they allow to exactly compute the multiplicity constants coming from various unavoidable symmetries, and consequently to find the asymptotics of Ik (n) and Jm (n) precisely, up to ∼. Given how poorly the isomorphism problem remains understood in geometric group theory, in terms of solvability or worst-case complexity, Theorem 1.8.6 is a rather remarkable statement and, for the moment, still essentially represents the frontier of what is known regarding counting group isomorphism types for groups given by generators and defining relations. In [247] the isomorphism rigidity results for random quotients for random quotients of M are extended further, from the fixed number of relator m, to the low-density case in Gromov’s density model of random groups. One of the applications obtained in [247] is a double exponential, in terms of n, lower bound, on the number of isomorphism types of groups given by presentations of the form ⟨a1 , a2 | R⟩, where R is contained in the n-ball in F(a1 , a2 ). Another consequence of isomorphism rigidity concerns “algebraic incompressibility.” For a finite-group presentation Π = ⟨a1 , . . . , ak | w1 , . . . wt ⟩, as before, define L(Π) := ∑ii=1 |wi |. For a finitely presentable group G, put T1 (G) to be the minimum of T1 (Π) taken over all finite presentations Π of G. This definition is closely related to, but differs slightly from, the definition of Delzant’s T-invariant T(G). The only difference there is that for the T-invariant the length of the presentation Π is defined as ∑ii=1 max{0, |wi | − 2}. The quantity T(G) is somewhat better behaved with respect to some topological arguments and, in particular, satisfies T(G1 ∗ G2 ) = T(G1 ) + T(G2 ). However, in qualitative and descriptive terms the two notions are very similar and the differences between them can be disregarded. Isomorphism rigidity can be used to show that for generic groups (for which isomorphism rigidity is known to hold) their given presentations are close to minimal, in the sense of the T-invariant or of the T1 -invariant.
1.8 Generic-case complexity of the isomorphism problem | 43
Theorem 1.8.7. For any integers k ≥ 2 and m ≥ 1 the following hold: 1. [245] For any 0 < δ < 1 there exists C > 0 such that as n → ∞, when w is chosen uniformly at random from the set of cyclically reduced words of length n in F(a1 , . . . , ak ), then, with probability ≥ 1 − δ, the group Gw = ⟨a1 , . . . , ak | w = 1⟩ satisfies T(Gw ) log2 T(Gw ) ≥ C|w|. 2.
[248] For any 0 < δ < 1 there exists a generic subset S ⊆ Um (M) such that for every W = (w1 , . . . , wm ) ∈ S we have T1 (GW ) ≥ L(W)1−δ .
These results are derived from the corresponding isomorphism rigidity properties using the techniques of Kolmogorov complexity from information theory (see [288] for the relevant background).
2 Random presentations and random subgroups 2.1 Introduction In infinite group theory, it is a classical and natural question to ask what most groups look like, i. e., what a random group looks like. The question can and must be made more precise: it is actually a question about random finitely presented groups, and in most of the literature, in fact a question about random finite group presentations on a fixed set of generators. The specific questions may be whether a random finite group presentation satisfies a small cancelation property or whether the group it presents is hyperbolic, residually finite, etc. Early on, Gromov gave an answer to this question: almost all groups are hyperbolic (see [188], and [396, 79, 390] for precise statements and complete proofs). When a group G is fixed (e. g., the free group F(A) over a given finite set A of generators, a hyperbolic group, a braid group, the modular group), one may also ask what a random finitely generated subgroup looks like. Is it free? Is it malnormal? Does it have finite index? In the case where G = F(A), is the subgroup Whitehead minimal? These questions have been abundantly studied in the literature. This chapter is a partial survey and as such, it contains no new results, but it offers a synthetic view of a part of this very active field of research. We also include a small number of proofs, in full or only sketched. We refer the reader to the survey by Ollivier [391] for more details on some of the topics discussed here, and to the survey by Dixon [110] for a discussion of probabilistic methods in finite group theory. A specific aspect of the present survey is that we discuss both random presentations and random subgroups, unlike Ollivier [391]. Random presentations were considered first in the literature, and we will start with them as well (Section 2.2). We then proceed to a discussion of results on random subgroups (Section 2.3). Finally, Section 2.4 discusses recent results on nonuniform distributions.
2.1.1 Discrete representations The very notion of randomness relies on a notion of probability, and in many cases, on a notion of counting discrete representations of finitely presented groups, or finitely generated subgroups, of a certain size: How many subgroups of F(A) are there with a tuple of f (n) generators of length at most n for a given function f ? How many are there whose Stallings graph (see Section 2.3.1) has at most n vertices? How many isomorphism classes of one-relator groups are there whose relator has length at most n? And so forth. So we must first discuss the discrete representations we will use to describe subgroups and presentations. https://doi.org/10.1515/9783110667028-002
46 | 2 Random presentations and random subgroups Let A be a finite nonempty set and let F(A) be the free group on A. The symmetrized alphabet à is the union of A and a copy of A, Ā = {ā | a ∈ A}, disjoint from A. We denote by à ∗ the set of all words on the alphabet A.̃ The operation x → x̄ is extended to à ∗ by letting ā̄ = a and ua = ā ū for all a ∈ à and u ∈ à ∗ . Recall that a word is reduced if it ̃ The (free group) reduction of a word does not contain a factor of the form aā (a ∈ A). ∗ u ∈ à is the word ρ(u) obtained from u by iteratively deleting factors of the form aā ̃ We can then think of F(A) as the set of reduced words on A:̃ the product in (a ∈ A). F(A) is given by u ⋅ v = ρ(uv), and the inverse of u is u.̄ In the sequel, we fix a finite set A, with cardinality r > 1. If n ∈ ℕ, we denote by ℛn (respectively, ℛ≤n ) the set of reduced words of length n (respectively, at most n). A reduced word u is called cyclically reduced if u2 = uu is reduced, and we let 𝒞ℛn (respectively, 𝒞ℛ≤n ) be the set of cyclically reduced words of length n (respectively, at most n). If u is a reduced word, there exist uniquely defined words v, w such that w is cyclically reduced and u = v−1 wv. Then w is called the cyclic reduction of u, written κ(u). It is easily verified that |ℛn | = 2r(2r − 1)n−1
|ℛ≤n | = Θ((2r − 1)^n ), and 2r(2r − 1)^(n−2) (2r − 2) ≤ |𝒞ℛn | ≤ 2r(2r − 1)^(n−1) ,
and |𝒞ℛ≤n | = Θ((2r − 1)n ).
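The reduction maps ρ and κ introduced above, as well as these counts, are easy to experiment with. A small sketch follows, using our own encoding of words as tuples of nonzero integers (i standing for ai and −i for its inverse); the enumeration is of course only meant for small r and n.

```python
from itertools import product

def reduce_word(u):
    """Free reduction rho(u): delete factors x x^{-1} until none remain."""
    out = []
    for x in u:
        if out and out[-1] == -x:
            out.pop()
        else:
            out.append(x)
    return tuple(out)

def cyclic_reduction(u):
    """Cyclic reduction kappa(u)."""
    u = list(reduce_word(u))
    while len(u) > 1 and u[0] == -u[-1]:
        u = u[1:-1]
    return tuple(u)

def count_reduced(r, n):
    """Count reduced words of length n in F_r by brute force;
    should equal 2r(2r-1)^(n-1)."""
    letters = [i for i in range(-r, r + 1) if i != 0]
    return sum(1 for w in product(letters, repeat=n)
               if len(reduce_word(w)) == n)
```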
If h⃗ = (h1 , . . . , hk ) is a tuple of elements of F(A), we denote by ⟨h⟩⃗ the subgroup of F(A) generated by h:⃗ it is the set of all products of the elements of h⃗ and their inverses. ⃗ the normal closure of ⟨h⟩, ⃗ namely, the set of all products of And we denote by ⟨⟨h⟩⟩ g −1 conjugates hi = g hi g of the elements of h⃗ (1 ≤ i ≤ k, g ∈ F(A)) and their inverses. ⃗ is the quotient F(A)/⟨⟨h⟩⟩. ⃗ The group presented by the relators h,⃗ written ⟨A | h⟩, ⃗ ⃗ ⃗ ⃗ If h = (h1 , . . . , hk ) and κ(h) = (κ(h1 ), . . . , κ(hk )), then h and κ(h) present the same ⃗ It is therefore customary, when considering group group, that is, ⟨A | h⟩⃗ = ⟨A | κ(h)⟩. presentations, to assume that the relators are all cyclically reduced. In general, if there exists a surjective morphism μ: F(A) → G, we say that G is A-generated. Then a word u ∈ F(A) is called geodesic if it has minimum length in μ−1 (μ(u)). Properties of interest for subgroups of G are, for instance, whether they are free or quasiconvex. Recall that a subgroup H of G is quasiconvex if there exists a constant k > 0 such that for every geodesic word u = a1 ⋅ ⋅ ⋅ an such that μ(u) ∈ H, and for every 1 ≤ i ≤ n there exists a word vi of length at most k such that μ(a1 ⋅ ⋅ ⋅ ai vi ) ∈ H. We observe that while being geodesic is a word property which depends on the chosen set of generators for the group, being quasiconvex is an intrinsic property of the subgroup, which is preserved when we consider a different finite set of generators for G. We are also interested in malnormality and purity: a subgroup H is almost malnormal (respectively, malnormal) if H g ∩ H is finite (respectively, trivial) for every g ∈ ̸ H. Moreover, H is almost pure (respectively, pure, also known as isolated or closed under radical) if xn ∈ H implies x ∈ H for any n ≠ 0 and any element x ∈ G of infinite
order (respectively, any x ∈ G). Note that malnormality and almost malnormality (respectively, purity and almost purity) are equivalent in torsion-free groups. It is easily verified that an almost malnormal (respectively, malnormal) subgroup is almost pure (respectively, pure). It is a classical result that every finitely generated subgroup of a free group is free (see Nielsen [379]) and quasiconvex (see Gromov [187]). In addition, it is decidable whether a finitely generated subgroup of a free group is malnormal [37] and whether it is pure [44] (see Section 2.3.1). In contrast, these properties are not decidable in a general finitely presented group, even if hyperbolic [69]. Quasiconvexity is also not decidable in general, even in hyperbolic or small cancelation groups [420]. Almost malnormality is however decidable for quasiconvex subgroups of hyperbolic groups [258, Corollary 6.8]. Finally, let us mention the property of Whitehead minimality for finitely generated subgroups of free groups. We say that H is Whitehead minimal if it has minimum size in its automorphic orbit, where the size of a subgroup is defined in terms of its Stallings graph (see Section 2.3.1 below). In the case of a cyclic subgroup H = ⟨u⟩, if u = v−1 κ(u)v, then the size of H is |v| + |κ(u)|. Whitehead minimality plays an important role in the solution of the automorphic orbit problem, to decide whether two subgroups are in the same orbit under the automorphism group of F(A) (see [164, 233]). For group presentations, the emphasis can be on combinatorial properties of the presentation, such as small cancelation properties, or on the geometric properties of the given presented group, typically hyperbolicity. One of the main small cancelation properties is property C (λ) (for some 0 < λ < 1), which is defined as follows. A piece in a tuple h⃗ of cyclically reduced words is a word u which occurs as a prefix of two distinct elements of the set of cyclic conjugates of the elements of h⃗ and their inverses. ̄ is a piece of h⃗ = (ā abb, ̄ ̄ ̄ ab). ̄ A finite presentation ⟨A | h⟩⃗ For instance, aba babab, ab satisfies the small cancelation property C (λ) if a piece u in h⃗ = (h1 , . . . , hk ) satisfies |u| < λ|hi | for every i such that u is a prefix of a cyclic conjugate of hi . This is an important property since it is well known that if h⃗ has property C ( 61 ), then the group ⟨A | h⟩⃗ is hyperbolic [187, 0.2.A]. An elegant generalization is due to Ollivier [392]. Other small cancelation properties are discussed in Section 2.2.1.
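The definition of pieces and of the condition C′(λ) recalled above translates directly into a brute-force test: collect all cyclic conjugates of the relators and of their inverses and bound the longest common prefix of any two distinct ones. The sketch below (our own encoding of words as tuples of nonzero integers; quadratic or worse in the total relator length, with no claim of efficiency) simply mirrors the definition.

```python
def cyclic_conjugates(w):
    return {w[i:] + w[:i] for i in range(len(w))}

def satisfies_small_cancelation(relators, lam):
    """Check C'(lam) for a tuple of cyclically reduced relators
    (words encoded as tuples of nonzero ints)."""
    conjugates = set()
    for h in relators:
        conjugates |= cyclic_conjugates(h)
        conjugates |= cyclic_conjugates(tuple(-x for x in reversed(h)))
    conjugates = list(conjugates)
    for i, s in enumerate(conjugates):
        for t in conjugates[i + 1:]:
            lcp = 0
            while lcp < min(len(s), len(t)) and s[lcp] == t[lcp]:
                lcp += 1
            if lcp >= lam * min(len(s), len(t)):
                return False  # this common prefix is a piece that is too long
    return True
```

For instance, satisfies_small_cancelation(relators, 1/6) tests the hypothesis under which hyperbolicity of the presented group is guaranteed.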
2.1.2 Models of randomness In this chapter, the general model of randomness on a set S which we will consider consists in the choice of a sequence (ℙn )n of probability laws on S. For instance, the set S could be the set of all k-relator presentations (for a fixed value of k), that is, the set of all k-tuples of cyclically reduced words, and the law ℙn could be the uniform probability law with support the presentations where every relator has length at most n. This general approach covers the classical models considered in the literature, such as the Arzhantseva–Ol’shanskiĭ model [16] or Gromov’s density model [188]. It
48 | 2 Random presentations and random subgroups also allows us to consider probability laws that do not give equal weight to words of equal length (see Section 2.4 below). A subset X of S is negligible if the probability for an element of S to be in X tends to 0 when n tends to infinity, that is, if limn ℙn (X) = 0. If this sequence converges exponentially fast (that is, ℙn (X) is 𝒪(e−cn ) for some c > 0), we say that X is exponentially negligible. The set X is generic (respectively, exponentially generic) if its complement is negligible (respectively, exponentially negligible).
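For experiments with the laws ℙn considered in the sequel, uniform random elements of ℛn and 𝒞ℛn are easy to generate: one convenient way is to pick the first letter uniformly among the 2r letters and every subsequent letter uniformly among the 2r − 1 letters that do not cancel the previous one, and to use rejection sampling for the cyclically reduced case. A sketch, with our usual encoding of words as tuples of nonzero integers:

```python
import random

def random_reduced_word(r, n):
    """Uniform random element of R_n (reduced words of length n in F_r)."""
    letters = [i for i in range(-r, r + 1) if i != 0]
    word = []
    for _ in range(n):
        choices = [x for x in letters if not word or x != -word[-1]]
        word.append(random.choice(choices))
    return tuple(word)

def random_cyclically_reduced_word(r, n):
    """Uniform random element of CR_n, by rejection from R_n."""
    while True:
        w = random_reduced_word(r, n)
        if n <= 1 or w[0] != -w[-1]:
            return w
```

Every reduced word of length n is produced with the same probability (1/2r)(1/(2r − 1))^(n−1), so conditioning on cyclic reducedness by rejection indeed yields the uniform law on 𝒞ℛn.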
2.2 Random finite presentations 2.2.1 The density model The density model was introduced by Gromov [188]. Let 0 < d < 1 be a real number. In the density d model, the set S (with reference to the notation in Section 2.1.2) is the set of all finite tuples of cyclically reduced words and ℙn is the uniform probability law with support the set of |𝒞ℛn |d -tuples of elements of 𝒞ℛn . We say that a property is generic (respectively, negligible) at density d if it is generic (respectively, negligible) in the density d model. In this model, small cancelation properties are generic at low enough density. For property C (λ) (0 < λ < 1), we have an interesting so-called phase transition statement ([188, 9.B], see also [391, Section I.2.a]). Theorem 2.2.1. Let 0 < d < 1 and 0 < λ < 21 . If d < λ2 , then at density d, a random finite presentation exponentially generically satisfies property C (λ). If instead d > λ2 , then at density d, a random finite presentation exponentially generically does not satisfy C (λ). ρ−1
Proof. To lighten notation, we let ρ = 2r − 1, α = (ρ − 1)/ρ and β = (ρ + 1)/ρ. We saw in Section 2.1.1 that |𝒞ℛn | = cn ρ^n , with αβ ≤ cn ≤ β. Let ℓ = λn. A reduced word u of length ℓ is a prefix of a cyclic conjugate of w ∈ 𝒞ℛn if either w = u2 w1 u1 with u1 u2 = u, or w = w1 uw2 with |w1 |, |w2 | > 0. Let a and b be the first and last letters of u. For fixed values of u1 , u2 , the number of words w1 such that u2 w1 u1 ∈ 𝒞ℛn (that is, w1 is reduced, does not start with b̄ and does not end with ā) is of the form cn,ℓ (a, b)ρ^(n−ℓ) , with α ≤ cn,ℓ (a, b) ≤ 1. Similarly, for 0 < ℓ1 < n − ℓ, the number of pairs (w1 , w2 ) such that |w1 | = ℓ1 and w1 uw2 ∈ 𝒞ℛn is of the form cn,ℓ1 ,ℓ (a, b)ρ^(n−ℓ) , with α ≤ cn,ℓ1 ,ℓ (a, b) ≤ 1. Thus the probability pn (u) that a word of 𝒞ℛn contains u as a piece is bounded above by pn (u)
≤ (2/|𝒞ℛn |) ( ∑_{u1 u2 =u} cn,ℓ (a, b)ρ^(n−ℓ) + ∑_{ℓ1 =1}^{n−ℓ−1} cn,ℓ1 ,ℓ (a, b)ρ^(n−ℓ) ) ≤ 2nρ^(n−ℓ) /(αβρ^n ) = (2n/(αβ)) ρ^(−ℓ) .
It follows that the probability that a word of length ℓ is a piece of at least two distinct words in a |𝒞ℛn |^d -tuple of cyclically reduced words of length n is at most
((|𝒞ℛn |^d )^2 /2) ∑_{u∈ℛℓ} pn (u)^2 ≤ (β^2 ρ^(2dn) /2) βρ^ℓ ((2n/(αβ)) ρ^(−ℓ) )^2 = (2β/α^2 ) n^2 ρ^((2d−λ)n) ,
which vanishes exponentially fast if 2d < λ. Bounding the probability that u occurs twice as a piece in the same component as a tuple is technically more complicated, and we refer to [27, Theorem 3.20] for the details, discussed there in a more general situation. A brief summary is as follows. This double occurrence can arise because there are two nonoverlapping occurrences of u, or one of u and one of u−1 , or because there are overlapping occurrences of u (u and u−1 cannot overlap). The nonoverlapping situations lead to a probability with an upper bound of the form |𝒞ℛn | ∑ κn2 ρ−2ℓ ≤ κ n2 ρ(d−λ)n , u∈ℛℓ
where κ, κ are appropriate constants; see the proof of [27, Theorem 3.20] for a more general statement. The overlapping situation is more delicate to analyze, and it leads to an upper bound of the form |𝒞ℛn |κ nρℓ ≤ κ nρ(d−λ)n for appropriate constants κ , κ . At density d < λ2 , we have d − λ < 0, so both these probabilities vanish exponentially fast. Thus, at density less than λ2 , property C (λ) holds exponentially generically. Now let us assume that d > λ2 . In a variant of the birthday paradox, we show that, exponentially generically, two words in a random |𝒞ℛn |d -tuple of elements of 𝒞ℛn have the same length ℓ prefix. Indeed, if u has length ℓ and first and last letters a and b, the number of words in 𝒞ℛn which start with u is cn,ℓ (a, b)ρn−ℓ ≥ αρn−ℓ . If u1 , . . . , uN are pairwise distinct reduced words of length ℓ, the number of elements w ∈ 𝒞ℛn that start with none of these words is greater than or equal to Nαρn−ℓ . It follows that the number of N-tuples of words in 𝒞ℛn with pairwise distinct length ℓ prefixes is at most equal to |𝒞ℛn |(|𝒞ℛn | − αρn−ℓ )(|𝒞ℛn | − 2αρn−ℓ ) ⋅ ⋅ ⋅ (|𝒞ℛn | − (N − 1)αρn−ℓ ). Thus the probability pN that an N-tuple of elements of 𝒞ℛn all have distinct length ℓ prefixes satisfies pN ≤ (1 − β−1 ρ−ℓ )(1 − 2β−1 ρ−ℓ ) ⋅ ⋅ ⋅ (1 − (N − 1)β−1 ρ−ℓ ) ≤ exp(−β−1 ρ−ℓ − 2β−1 ρ−ℓ − ⋅ ⋅ ⋅ − (N − 1)β−1 ρ−ℓ )
≤ exp(−β^(−1) (N(N − 1)/2) ρ^(−ℓ) ).
For N = |𝒞ℛn |^d , we find that N(N − 1) ≥ (αβ)^(2d) ρ^(2dn) − β^d ρ^(dn) , which is greater than ((αβ)^(2d) /2) ρ^(2dn)
for n large enough. It follows that pN ≤ exp(−((αβ)^(2d) /(4β)) ρ^((2d−λ)n) ),
which vanishes exponentially fast if 2d > λ. As noted earlier, if h⃗ satisfies property C ( 61 ), then the group ⟨A | h⟩⃗ is hyperbolic but the condition is not necessary. Theorem 2.2.1 shows that in the density model and up to density 121 , a finitely presented group is exponentially generically hyperbolic. Yet the property holds for higher densities, and we have another phase transition theorem. ⃗ where h⃗ consists of cyclically Let us say that a finitely presented group G = ⟨A | h⟩, reduced words of equal length n, is degenerate if G is trivial, or if G is the two-element group and n is even. Then we have the following result, again a phase transition theorem, due to Ollivier [391]. Theorem 2.2.2. Let 0 < d < 1. If d < 21 , then at density d, a random finite presentation exponentially generically presents an infinite hyperbolic group. If instead d > 21 , then at density d, a random finite presentation exponentially generically presents a degenerate group. The proof of the statement in Theorem 2.2.2 about density greater than 21 reduces to counting arguments on words in the spirit of the proof of Theorem 2.2.1 (see Section 2.4 for a generalization). The proof that hyperbolicity is generic at densities between 121 and 21 – that is, greater than the critical value for property C ( 61 ) – is more complex and involves the combinatorics of van Kampen diagrams. An example of such a diagram is given in Figure 2.1; for a formal definition, the reader is referred to [313]. Remark 2.2.3. A natural variant of the density model considers tuples of words of length at most n, instead of words of length exactly n. More precisely, ℙn is chosen
Figure 2.1: Informally, a van Kampen diagram is a planar finite cell complex with a specific embedding in the plane. Its edges (1-cells) are directed and labeled by letters in A, and the boundary of each face (2-cell) is cyclically labeled by a relator. There is one distinguished vertex (0-cell). On the left, we see an example of such a diagram for ̄ = 1 ̄ a)̄ of area 3, showing that abacbaā bc h⃗ = (b2 ac2 a, babc,̄ b2 abā cb in the group presented by h.⃗ The two gray faces share the segment ab.
2.2 Random finite presentations | 51
to be the uniform probability law with support the set of |𝒞ℛ≤n |d -tuples of words in 𝒞ℛ≤n . Ollivier shows in [391] that Theorems 2.2.1 and 2.2.2 also hold for this model. Remark 2.2.4. The statement on hyperbolicity in Theorem 2.2.2 has an important predecessor. For a fixed number k of relators and a fixed k-tuple of lengths (ℓ1 , . . . , ℓk ), consider the finite presentations with k relators of length ℓ1 , . . . , ℓk , respectively. The probability that such a presentation presents an infinite hyperbolic group tends exponentially fast to 1 when min(ℓi )1≤i≤k tends to infinity (while k remains fixed). This was originally stated by Gromov [187], and proved by Champetier [79] and Ol’shanskiĭ [396]. The small cancelation property C (λ) for a tuple of cyclically reduced words h⃗ can be interpreted geometrically as follows: in any reduced van Kampen diagram ⃗ a segment of consecutive edges in the (with respect to the presentation ⟨A | h⟩), boundary between two adjacent faces f and f (namely, in 𝜕f ∩ 𝜕f ) has length at most λ min(|𝜕f |, |𝜕f |). Greendlinger’s property (as interpreted by Ollivier [393]) is of the same nature: it states that in any reduced van Kampen diagram D with more than one face, there exist two faces f and f for which there are segments of consecutive edges of 𝜕f ∩ 𝜕D (respectively, 𝜕f ∩ 𝜕D) of length at least 21 |𝜕f | (respectively, 1 |𝜕f |). 2 A closely related property of a tuple of relators h⃗ is whether Dehn’s algorithm
works for the corresponding presentation. More precisely, Dehn’s algorithm is the following (nondeterministic) process applied to a reduced word w. If w is of the form w = w1 uw2 for some word u such that uv is a cyclic permutation of a relator and |v| < |u|, then replace w by the reduction of w1 v−1 w2 (which is a shorter word), and repeat. It is clear that this process always terminates, and that if it terminates with the ⃗ The converse does not hold empty word, then w is equal to 1 in the group G = ⟨A | h⟩. ⃗ in general, but we say that h is a Dehn presentation if it does, that is, if every reduced word w that is trivial in G contains a factor which is a prefix of some cyclic conjugate h of a relator, of length greater than 21 |h |. It is clear that the word problem (given a word u, is it equal to 1 in G) admits an efficient decision algorithm in groups given by a Dehn presentation. Note that a tuple h⃗ with the Greendlinger property provides a Dehn presentation. Moreover, every hyperbolic group has a computable Dehn presentation (see [4, Theorem 2.12] and [314]). Greendlinger [180] shows that a tuple h⃗ with property C ( 61 ) also has (a stronger version of) the Greendlinger property defined above and yields a Dehn presentation. Theorem 2.2.1 shows that this situation is exponentially generic at density d < 121 . Ollivier proves a phase transition result regarding this property, with critical density higher than 121 [393]. Theorem 2.2.5. Let 0 < d < 1. If d < 51 , then at density d, a random finite presentation generically is Dehn and has the Greendlinger property. If instead d > 51 , then at density d, a random finite presentation generically fails both properties.
52 | 2 Random presentations and random subgroups Ollivier [390] also considered finite presentations based on a given, fixed hyperbolic A-generated group G, that is, quotients of G by the normal subgroup generated by a tuple h⃗ of elements of G, that can be taken randomly. There are actually two ways of generating h.⃗ Let π: F(A) → G be the canonical onto morphism. Then one can draw uniformly at random a tuple h⃗ of cyclically reduced words (that is, of elements of F(A)) ⃗ and consider the quotient G/⟨⟨π(h)⟩⟩, or one can draw uniformly at random a tuple of
cyclically reduced words that are geodesic for G. Here it is useful to remember that a hyperbolic group is geodesically automatic [123] – in particular its language of geodesics L is regular – and there is a linear-time algorithm to randomly generate elements of L of a given length [41]. To state Ollivier’s result, let us recall the definition of the cogrowth of G, relative to the morphism π, under the hypothesis that π is not an isomorphism, that is, G is not free over A. Let r = |A|. Then cogrowth(G) = lim n1 log2r−1 (|Hn |), where Hn is the set of reduced words of length n in ker π (and the limit is taken over all even values of n, to account for the situation where no odd-length reduced word is in ker π). This invariant of G (and π) was introduced by Grigorchuk [183], who proved that it is always greater than 21 and less than or equal to 1 and, using a result of Kesten [254], that it is equal to 1 if and only if G is amenable (amenability is an important property which, in the case of discrete groups, is equivalent to the existence of a left-invariant, finitely additive probability measure on G). The above definition does not apply if G is free over A, but it is convenient to let cogrowth(F(A)) = 21 (see [390, Section 1.2] for a discussion). In particular, the following elegant phase transition statement generalizes Theorem 2.2.2. Theorem 2.2.6. Let G be a hyperbolic and torsion-free A-generated group and let π: F(A) → G be the canonical mapping. Let 0 < d < 1. If d < 1 − cogrowth(G), ⃗ is exponentially generically hyperbolic. then at density d, a random quotient G/⟨⟨π(h)⟩⟩ ⃗ If d > 1 − cogrowth(G), then G/⟨⟨π(h)⟩⟩ is exponentially generically degenerate. If instead we take a tuple of cyclically reduced words of length n that are geodesic for G, then the phase transition between hyperbolicity and degeneracy is at density 21 . Remark 2.2.7. Theorem 2.2.6 above is for torsion-free hyperbolic groups G. It actually holds as well if G is hyperbolic and has harmless torsion, that is, if every torsion element either sits in the virtual center of G, or has a finite or virtually ℤ centralizer (see [390]). Finally we note another phase transition theorem, due to Żuk, about Kazhdan’s property (T) – a property of the unitary representations of a group – for discrete groups [503]. Theorem 2.2.8. Let 0 < d < 21 . If d < 31 , then at density d, a random finite presentation generically does not present a group with Kazhdan’s property (T). If instead d > 31 , then at density d, a random finite presentation generically presents a group satisfying this property.
[Figure 2.2 shows a density axis from 0 to 1 with the critical values 1/12, 1/5, 1/3 and 1/2 separating the regimes C′(1/6) / not C′(1/6), Dehn and Greendlinger / neither Dehn nor Greendlinger, not Kazhdan's (T) / Kazhdan's (T), and infinite hyperbolic / degenerate.]
Figure 2.2: Phase transitions for properties of random presentations in the density model.
The critical densities in Theorems 2.2.1, 2.2.2, 2.2.5 and 2.2.8 are shown in Figure 2.2.
2.2.2 The few-relators model The few-relators model was introduced by Arzhantseva and Ol’shanskiĭ [16]. In this model, the number of relators is fixed, say, k ≥ 1. Then the set S on which we define a model of randomness (see Section 2.1.2) is the set 𝒞ℛk of all k-tuples of cyclically reduced words in F(A) and ℙn is the uniform probability law with support (𝒞ℛ≤n )k . Observe that if a tuple h⃗ of cyclically reduced words satisfies the small cancelation ⃗ then property C (λ) and if g⃗ is a subtuple of h⃗ (that is, the words in g⃗ are also in h), g⃗ satisfies property C (λ) as well. From this observation and Theorem 2.2.1 (actually its variant in Remark 2.2.3) we deduce the following result, due to Arzhantseva and Ol’shanskii [16] (see also [245, Theorem B]). Corollary 2.2.9. In the few-relators model, a random tuple exponentially generically satisfies property C ( 61 ) and presents an infinite hyperbolic group. Arzhantseva and Ol’shanskiĭ further show that, in the few-relators model, the finitely generated subgroups of a random k-relator subgroup are usually free or have finite index [16] (see statements (1) and (2) of Theorem 2.2.10 below). Statement (3) is due to Kapovich and Schupp [245, Theorem B]. Recall that a Nielsen move on a tuple (x1 , . . . , xk ) of elements of a group G consists in replacing xi by xi−1 , exchanging xi and xj or replacing xi by xi xj for some i ≠ j. We say that two k-tuples are Nielsen-equivalent if one can go from one to the other by a sequence of Nielsen moves. Theorem 2.2.10. Let k, ℓ ≥ 1 be integers. In the few-relators model with k relators, exponentially generically, (1) every ℓ-generated subgroup of an A-generated group G has finite index or is free; (2) if ℓ < |A|, every ℓ-generated subgroup of G is free and quasiconvex in G;
54 | 2 Random presentations and random subgroups (3) an |A|-tuple which generates a nonfree subgroup of G is Nielsen-equivalent to A in G. In particular, an |A|-tuple which generates a nonfree subgroup generates G itself, and every automorphism of G is induced by an automorphism of F(A). Sketch of proof. The core of the proof lies in the identification of a class 𝒫 of k-relator presentations (over a fixed alphabet of size r), defined below, which is exponentially generic in the few-relators model, and which is smooth enough for properties (1), (2) and (3) to always hold. The class 𝒫 was originally introduced by Arzhantseva and Ol’shanskiĭ [16], and revisited by Kapovich and Schupp [245]. This class, parametrized by positive real numbers λ and μ, consists in the tuples (u1 , . . . , uk ) in 𝒞ℛ≤n which satisfy property C (λ) and do not contain a proper power, such that every prefix w of a cyclic conjugate of ui of length at least 21 |ui | satisfies the following negative property. There does not exist a subgroup H of Fr whose Stallings graph (see Section 2.3.1) has at most μ|w| edges such that there exists a reduced word in H containing w as a factor and such that rank(H) ≤ r − 1 or such that rank(H) ≤ r and H has infinite index. μ If λ ≤ 15r+3μ < 61 , the class 𝒫 is exponentially generic [16] and properties (1), (2) and (3) hold for all tuples of relators in 𝒫 [16, 245]. Arzhantseva also established the following related result [15], which refines in a sense Theorem 2.2.10 (1). Here a set of generators for the ℓ-generated subgroup of G is fixed in advance (as a tuple of words in F(A)), and it is assumed that it generates an infinite-index subgroup of F(A). Theorem 2.2.11. Let H be a finitely generated, infinite-index subgroup of F(A). In the few-relators model with k relators, exponentially generically, a finite presentation G = ⟨A | h⟩⃗ (h⃗ ∈ (𝒞ℛ≤n )k ) is such that the canonical morphism φ: F(A) → G is injective on H (so φ(H) is free) and φ(H) has infinite index in G. We also note that Kapovich and Schupp extended Theorem 2.2.10 to the density model [247], with density bounds that depend on both parameters k and ℓ. Theorem 2.2.12. Let A be a fixed alphabet. For every k, ℓ ≥ 1, there exists 0 < d(k, ℓ) < 1 such that at every density d < d(k, ℓ), generically, an ℓ-generated subgroup of an A-generated group presented by a random k-tuple of relators has finite index or is free. Also, for every k ≥ 1, there exists 0 < d(k) < 1 such that at every density d < d(k), every (k−1)-generated subgroup of an A-generated group presented by a random k-tuple of relators is free. But there is no single value of d such that this holds independently of k (that is, limk→∞ d(k) = 0).
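Over the free group F(A), the Nielsen moves recalled before Theorem 2.2.10 act on tuples of reduced words and can be realized in a few lines of code; the sketch below uses our own encoding of words as tuples of nonzero integers, with multiplication given by concatenation followed by free reduction.

```python
def free_reduce(u):
    out = []
    for x in u:
        if out and out[-1] == -x:
            out.pop()
        else:
            out.append(x)
    return tuple(out)

def nielsen_invert(t, i):
    """Replace t_i by its inverse."""
    t = list(t)
    t[i] = tuple(-x for x in reversed(t[i]))
    return tuple(t)

def nielsen_swap(t, i, j):
    """Exchange t_i and t_j."""
    t = list(t)
    t[i], t[j] = t[j], t[i]
    return tuple(t)

def nielsen_multiply(t, i, j):
    """Replace t_i by t_i * t_j, for i != j."""
    assert i != j
    t = list(t)
    t[i] = free_reduce(t[i] + t[j])
    return tuple(t)
```

Two tuples are Nielsen-equivalent precisely when one can be carried to the other by a finite sequence of such moves.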
2.2.3 One-relator groups If u is a cyclically reduced word, let Gu = ⟨A | u⟩. One-relator groups are of course covered by the few-relators model, and the results of Section 2.2.2 apply to them. But more specific results are known for random one-relator presentations. Magnus showed that if u, v ∈ F(A), then the normal closures of the subgroups generated by u and v, written ⟨⟨u⟩⟩ and ⟨⟨v⟩⟩, are equal if and only if u is a conjugate of v or v−1 (see [313, Prop. II.5.8]). Kapovich and Schupp combine this with Theorem 2.2.10 (3) to show the following [245, TheoremA]. Theorem 2.2.13. There exists an exponentially generic (and decidable) class P of cyclically reduced words such that if u, v ∈ P, then Gu and Gv are isomorphic if and only if there exists an automorphism φ of F(A) such that φ(u) ∈ {v, v−1 }. In particular, the isomorphism problem for one-relator groups with presentation in P is decidable. We now explain how this result gives access to generic properties of (isomorphism classes of) one-relator groups and not just of one-relator presentations. This is a more explicit rendering of arguments which can be found in particular in Ollivier [391, Section II.3], Kapovich, Schupp and Shpilrain [249] and Sapir and Špakulová [438]. For this discussion we consider probability laws ℙn for one-relator presentations and probability laws ℚn for isomorphism classes of one-relator groups. More specifically, ℙn is the uniform probability law with support 𝒞ℛ≤n (that is, the probability law for the few-relators model, with k = 1 relator), and ℚn is the uniform probability law with support the set Tn of isomorphism classes of groups Gu with |u| ≤ n. We let T = ⋃n≥1 Tn , that is, T is the set of isomorphism classes of one-relator groups. Let H be the group of length preserving automorphisms of F(A), that is, the automorphisms which permute A.̃ Note that |H| = 2r r!, where r = |A|. Let also W be the set of strictly Whitehead minimal words, that is, cyclically reduced words u such that |φ(u)| > |u| for every automorphism φ ∈ Aut(F(A)) \ H. Kapovich et al. [249] show that W is exponentially generic (see [28] and Section 2.3.4 for a more general result). Fix an arbitrary order on A.̃ For each word u ∈ 𝒞ℛ, we let τ(u) be the lexicographically least element of the set of all cyclic permutations of images of u and u−1 by an automorphism in H. A set P as in Theorem 2.2.13 can be assumed to be closed under taking inverses and cyclic conjugation (see for instance the description of P in [245, Section 4]). In that case, a word u is in P if and only if τ(u) ∈ P. The same clearly holds for W, and we have 2r ≤ |τ−1 (τ(u))| ≤ 2r+1 |u|r! – where the lower bound corresponds to a word of the form u = a|u| . It is immediate that Gu = Gτ(u) . Moreover, in view of Theorem 2.2.13, if u, v ∈ P ∩ W, then Gu and Gv are isomorphic if and only if τ(u) = τ(v). Proposition 2.2.14. Let X be a property of isomorphism classes of one-relator groups, that is, X is a subset of T. Let Y = {u ∈ 𝒞ℛ | Gu ∈ X}. If ℙn (Y) = o(n−1 ) (respectively, Y is exponentially negligible), then X is negligible (respectively, exponentially negligible). The same statement holds for genericity instead of negligibility.
56 | 2 Random presentations and random subgroups Proof. Let Z be the set of one-relator groups Gu such that u ∈ W ∩ P, where P is a set as in Theorem 2.2.13 and W is the set of strictly Whitehead minimal cyclically reduced words. Since W ∩ P is exponentially generic in 𝒞ℛ, there exist constants C, c > 0 such that ℙn (W ∩ P) ≥ 1 − Ce−cn . We have ℚn (X) = ℚn (X ∩ Z) + ℚn (X \ Z) ≤ ℚn (X ∩ Z) + ℚn (T \ Z). We first deal with ℚn (T \ Z). Let αn = |Tn ∩ Z| and βn = |Tn \ Z|. Then ℚn (T \ Z) =
βn /(αn + βn ).
Note that Tn \ Z ⊆ {Gu | u ∈ 𝒞ℛ≤n \ (W ∩ P)}. So βn ≤ |𝒞ℛ≤n \ (W ∩ P)| ≤ Ce−cn |𝒞ℛ≤n |. On the other hand, Tn ∩ Z is in bijection with {τ(u) | u ∈ 𝒞ℛ≤n ∩ (W ∩ P)}, and it follows that αn ≥
(1/(2^(r+1) n r!)) |𝒞ℛ≤n ∩ (W ∩ P)| ≥ ((1 − Ce^(−cn) )/(2^(r+1) n r!)) |𝒞ℛ≤n |.
Therefore
ℚn (T \ Z) = βn /(αn + βn ) ≤ βn /αn ≤ 2^(r+1) n r! ⋅ Ce^(−cn) /(1 − Ce^(−cn) ),
which vanishes exponentially fast. Let us now consider ℚn (X ∩ Z). We have
ℚn (X ∩ Z) = |X ∩ Z ∩ Tn |/(αn + βn ) ≤ |{Gu | u ∈ 𝒞ℛ≤n ∩ W ∩ P, u ∈ Y}|/αn ≤ (2^(r+1) n r! ℙn (Y ∩ W ∩ P) |𝒞ℛ≤n |)/(2r(1 − Ce^(−cn) ) |𝒞ℛ≤n |) ≤ 2^r (r − 1)! ⋅ n ℙn (Y)/(1 − Ce^(−cn) ),
and this concludes the proof. Then the results of Section 2.2.2 (Corollary 2.2.9, Theorem 2.2.10), together with Proposition 2.2.14 yield the following. Corollary 2.2.15. Exponentially generically, a one-relator group G is infinite hyperbolic, every automorphism of G is induced by an automorphism of F(A), and every ℓ-generated subgroup is free and quasiconvex if ℓ < |A|. Kapovich, Schupp and Shpilrain use the ideas behind Proposition 2.2.14 to compute an asymptotic equivalent of the number of (isomorphism classes of) one-relator groups in Tn [249]. Theorem 2.2.16. Let In (A) be the number of isomorphism classes of one-relator groups of the form ⟨A | u⟩ with |u| ≤ n. If |A| = r, then In (A) is asymptotically equivalent to 1 (2r−1)n . n 2r+1 r!
2.2 Random finite presentations | 57
Finally we note the following result of Sapir and Špakulová [438]. Recall that a group G is residually 𝒫 (for some property 𝒫 ) if for all distinct elements x, y ∈ G, there exists a morphism φ from G to a group having property 𝒫 , such that φ(x) ≠ φ(y). We let finite-p be the property of being a finite p-group. A group G is coherent if every finitely generated subgroup is finitely presented. Theorem 2.2.17. Suppose that |A| ≥ 3. Then an A-generated one-relator group is generically residually finite, residually finite-p and coherent.
2.2.4 Rigidity properties Theorem 2.2.13 above gives a generic rigidity property: at least on a large (exponentially generic) set of words, the isomorphism class of the one-relator group Gu = ⟨A | u⟩ is uniquely determined by u, up to inversion and an automorphism of F(A). That is, the only words v such that Gv is isomorphic to Gu are those that come immediately to mind. As indicated, this result follows from a theorem of Magnus which states that the normal closure ⟨⟨u⟩⟩ has essentially only one generator as a normal subgroup, i. e., ⟨⟨u⟩⟩ = ⟨⟨v⟩⟩ if and only if u is a conjugate of v or v−1 . There is no such general statement for normal subgroups generated by a k-tuple with k ≥ 2. A closely related result due to Greendlinger generalizes Magnus’s statement, but only for tuples that satisfy the small cancelation property C ( 61 ) [181]: if g⃗ ⃗ then ⃗ = ⟨⟨h⟩⟩, and h⃗ are such tuples, respectively, a k-tuple and an ℓ-tuple, and if ⟨⟨g⟩⟩
k = ℓ and there is a reordering g⃗ of g⃗ such that for each i, hi is a cyclic permutation −1 of gi or gi . The restriction to tuples satisfying C ( 61 ) prevents us from proceeding as in Section 2.2.3 to prove a more general analogue of Theorem 2.2.13. Whether Theorem 2.2.13 can be extended to m-tuples of cyclically reduced words is essentially the stability conjecture formulated by Kapovich and Schupp [245, Conjecture 1.2]. Nevertheless, Kapovich and Schupp show that one can circumvent this obstacle when considering the quotients of the modular group M = PSL(2, ℤ) = ⟨a, b | a2 , b3 ⟩. If h⃗ is a tuple of cyclically reduced words in F(a, b), we denote by Mh⃗ the quotient of ⃗ Let η be the M by the images of the elements of h⃗ in M, that is, M ⃗ = ⟨a, b | a2 , b3 , h⟩. h
automorphism of M which fixes a and maps b to b−1 = b2 . Then the following holds [248, Theorem A and Corollary 2.5].
Theorem 2.2.18. For each k ≥ 1, there exists an exponentially generic (in the k-relator model), decidable subset Qk of 𝒞ℛk such that the following holds. – If h⃗ ∈ Qk , then the group Mh⃗ is hyperbolic and one-ended, the generators a and b have order 2 and 3, respectively, in Mh⃗ and all the automorphisms of Mh⃗ are inner. – If g,⃗ h⃗ ∈ Qk and Mg⃗ and Mh⃗ are isomorphic, then there is a reordering g⃗ of g⃗ and a value ϵ ∈ {0, 1} such that for each 1 ≤ i ≤ k, hi is a cyclic permutation of ηϵ (gi ) or ηϵ (gi ). −1
58 | 2 Random presentations and random subgroups –
If g⃗ ∈ Qk , h⃗ ∈ Qℓ are such that gi and hj all have the same length, and if Mg⃗ and Mh⃗ are isomorphic, then k = ℓ.
In the k-relator model, the isomorphism problem for quotients of M is exponentially generically solvable in time 𝒪(n4 ). The last statement of this theorem is all the more interesting as the isomorphism problem, and even the triviality problem, for quotients of M is undecidable in general (Schupp [443]). As in Section 2.2.3, Theorem 2.2.18 can be used to discuss asymptotic properties of k-relator quotients of the modular group, rather than of k-tuples of relators. The few-relators model for the quotients of M considers the set T of isomorphism classes of k-relator quotients of M, and the probability laws ℚn which are uniform on the set Tn of isomorphism classes of groups Mh⃗ with h⃗ ∈ (𝒞ℛ≤n )k . We can reason as for Proposition 2.2.14, modifying the map τ in such a way that τ(h) is the lexicographically least element of h, h−1 , η(h) and η(h−1 ). Then, with essentially the same proof as Proposition 2.2.14, we get the following result. Proposition 2.2.19. Let k ≥ 1 and let X be a property of isomorphism classes of k-relator quotients of the modular group, that is, X is a subset of T. Let Y = {h⃗ ∈ (𝒞ℛ)k | Mh⃗ ∈ X}. If ℙn (Y) = o(n−k ) (respectively, Y is exponentially negligible). Then X is negligible (respectively, exponentially negligible). The same statement holds for genericity instead of negligibility. As in Section 2.2.3 again, one can derive from Theorem 2.2.18 an asymptotic equivalent of the number of isomorphism classes of k-relator quotients of the modular group [248, Theorem C]. Corollary 2.2.20. Let k ≥ 1. The number of isomorphism classes of quotients of M by k relators which are cyclically reduced words of length n is asymptotically equivalent to n
(2 2 +1 )k . 2k!(2n)k Kapovich and Schupp go on to give further generic rigidity properties of homomorphisms between quotients of M, which are proved to be generically Hopfian and co-Hopfian (that is, every surjective [respectively, injective] endomorphism is an isomorphism), and on the generic incompressibility of the presentations by k relators [248, Theorems B and D]. 2.2.5 Nilpotent groups We conclude this section with recent results on random groups in a particular class, i. e., that of nilpotent groups. If G is a group, the lower central series of G is defined by
2.2 Random finite presentations | 59
letting G1 = G and, for n ≥ 1, Gn+1 = [Gn , G]. That is, Gn+1 is the subgroup generated by the commutators [g, h] = g −1 h−1 gh, with g ∈ Gn and h ∈ G. Then each Gn is normal in G and Gn+1 is contained in Gn . The group G is said to be nilpotent of class s if Gs+1 = 1. In particular, G2 is the derived subgroup of G and the class 1 nilpotent groups are exactly the abelian groups. Nilpotent groups of class 2 are those in which the derived subgroup lies in the center of the group. Let us extend the commutator notation by letting, for s ≥ 2, [x1 , . . . , xs+1 ] = [[x1 , . . . , xs ], xs+1 ]. One can show that the class of nilpotent groups of class s is defined by the identity [x1 , . . . , xs+1 ] = 1. As a result, this class constitutes a variety (in the sense of universal algebra) and we denote by Ns (A) its free object over the finite alphabet A: Ns (A) = F(A)/F(A)s+1 . Note that a torsion-free noncyclic nilpotent group contains a free abelian group of rank 2, a standard obstacle for hyperbolicity, so torsion-free noncyclic nilpotent groups are not hyperbolic. In particular, they form a negligible set in the few-relators and density models discussed in the previous sections, and we cannot use earlier results to discuss random nilpotent groups. This difficulty was circumvented in several different ways in the literature. Cordes et al. view finitely presented nilpotent groups as quotients of free nilpotent groups (of a fixed class and rank) by a random tuple of relators whose length tends to infinity [89]. In this model, relators are words over the symmetrized alphabet A.̃ Depending on the number of relators, this extends the few-relators and density models. Garreta et al. extend in [156, 157] the study initiated in [89]. The following result is a summary of [89, Theorem 29, Proposition 30 and Corollaries 32 and 35] and of [157, Theorems 3.7 and 4.1]. Theorem 2.2.21. Let s ≥ 1, r ≥ 2, let A be an alphabet of cardinality r, let Ns,r = Ns (A) be the free nilpotent group of class s over A and let π be the canonical morphism from à ∗ onto Ns,r . ⃗ is generIn the density model, at any density d > 0, a random quotient Ns,r /⟨⟨π(h)⟩⟩ ically trivial. In fact, this holds in any model where the size of the tuple of relators is not bounded. In the few-relators model with k relators with k ≤ r − 2, a random quotient ⃗ is generically non-abelian and regular (that is, every element of the center Ns,r /⟨⟨π(h)⟩⟩ of G has a nontrivial power in the derived subgroup). If k = r − 1, then such a quotient is generically virtually abelian (it has an abelian finite-index subgroup), and if k = r, then it is generically finite. In either case, it is abelian if and only if it is cyclic. Finally, if k ≥ r + 1, then it is generically finite and abelian. In the particular case where r = 2, k = 1 and s ≥ 2, the probability that a random one-relator quotient of Ns,2 is cyclic (and, hence, abelian) tends to π62 . Cordes et al. also give a full classification of the one-relator quotients of N2,2 (the Heisenberg group) [89, Section 3]. Moreover, they deduce from Theorem 2.2.21 the following result on random finitely presented groups [89, Corollary 36].
60 | 2 Random presentations and random subgroups Corollary 2.2.22. In the density model, at any density d > 0, and in any model where the size of the tuple of relators is not bounded, a random tuple h⃗ generically presents a perfect group (that is, a group G such that [G, G] = G, or equivalently, a group whose abelian quotient is trivial). Delp et al. use a different view of nilpotent groups [97]. It is well known that every torsion-free nilpotent group embeds in Un (ℤ) for some n ≥ 2, where Un (ℤ) is the group of upper-triangular matrices with entries in ℤ and diagonal elements equal to 1. If 1 ≤ i < n, let ai,n be the matrix in Un (ℤ) with coefficients 1 on the diagonal and on row i and column i + 1, and all other coefficients 0. Then An = {a1,n , . . . , an−1,n } generates Un (ℤ). Let ℓ: ℕ → ℕ be a function such that lim ℓ(n) = ∞ when n tends to infinity. We let Gℓ,n be the subgroup of Un (ℤ) generated by a random pair of words of length ℓ(n) on alphabet à n : in the language of Section 2.1.2, S is the set of pairs of words on an alphabet of the form à n for some n ≥ 2, Sn is the set of all pairs of length ℓ(n) words on alphabet à n and ℙn is the uniform probability law with support Sn . Then we have the following result, a combination of Theorems 1 and 2 in [97]. Note that Un (ℤ) is nilpotent of class n − 1: we say that a subgroup of Un (ℤ) has full class if it is nilpotent of class n − 1. Theorem 2.2.23. Let ℓ: ℕ → ℕ be a function such that lim ℓ(n) = ∞. – If ℓ = o(√n), then Gℓ,n is generically abelian (that is, of class 1). If √n = o(ℓ(n)), then Gℓ,n is generically non-abelian. If ℓ(n) = c√n, then the probability that Gℓ,n is 2
–
abelian tends to e−2c . If ℓ = o(n2 ), then Gℓ,n generically does not have full class; if n3 = o(ℓ(n)), then Gℓ,n generically has full class.
Garreta et al. use yet another representation of nilpotent groups [156, 158], the polycyclic presentation. A group G is polycyclic if it admits a sequence of subgroups 1 = Hn ≤ Hn−1 ≤ ⋅ ⋅ ⋅ ≤ H1 = G such that for every 1 < i ≤ n, Hi is normal in Hi−1 and Hi−1 /Hi is cyclic. It is elementary to verify that every finitely generated nilpotent group is polycyclic. Polycyclic groups admit presentations of a particular form, the so-called polycyclic presentations (see [216] for a precise description), which can be characterized by a k-tuple of integers, where k is a function of the number of generators in the presentation. In the case of torsion-free nilpotent groups, polycyclic presentations with generators x1 , . . . , xr have relators of the following form, called a torsion-free nilpotent presentation: b
b
c
c
i,j,j+1 [xj , xi ] = xj+1 ⋅ ⋅ ⋅ xr i,j,r , i,j,j+1 [xj , xi−1 ] = xj+1 ⋅ ⋅ ⋅ xr i,j,r ,
for all 1 ≤ i < j ≤ r, where bi,j,h and ci,j,h (1 ≤ i < j < h ≤ r) are integers. Garreta et al. introduce a notion of random torsion-free nilpotent presentations as follows [156, 158].
2.3 Random subgroups | 61
With the number r of generators fixed, they let S be the set of tuples (bi,j,h , ci,j,h )1≤i , then at density d, h exponentially generically does not have the 4
central tree property. If d < 161 , then at density d, a tuple of reduced words h⃗ exponentially generically generates a malnormal and pure subgroup.
It is immediate that if every element of g⃗ is also an element of h⃗ and h⃗ has the central tree property, then so does g.⃗ In that case, it is not hard to show also that ⟨g⟩⃗ is malnormal if ⟨h⟩⃗ is (see for instance [27, Proposition 1.5]). Then Theorem 2.3.2 yields the following corollary, which was already observed by Arzhantseva and Ol’shanskiĭ [16] for the free generation statement and by Jitsukawa [228] for the malnormality statement. Corollary 2.3.3. In the few-generators model, a tuple of reduced words exponentially generically has the central tree property, it is a basis of the subgroup it generates, and this subgroup is malnormal and pure. We now see how to use the rigidity property in Proposition 2.3.1 (3) to discuss asymptotic properties of subgroups themselves, and not of tuples of generators, at least in the few-generators model. This is in the same spirit as in Propositions 2.2.14 and 2.2.19 above. Fix k ≥ 1. In the k-generator model for tuples, the set S (in the terminology of Section 2.1.2) is ℛk and ℙn is the uniform probability law with support Sn = ℛk≤n . Now consider the set T of all k-generated subgroups of F(A), the set Tn of subgroups of the form ⟨h⟩⃗ for some h⃗ ∈ Sn and the probability law ℚn which is uniform on Tn . We call this the k-generator model for subgroups. Proposition 2.3.4. Let X be a property of k-generator subgroups of F(A), that is, X is a subset of T. Let Y = {h⃗ ∈ ℛk | ⟨h⟩⃗ ∈ X}. If Y is negligible (respectively, exponentially negligible) in the k-generator model for tuples, then so is X, in the k-generator model for subgroups. The same statement holds for genericity instead of negligibility.
2.3 Random subgroups | 65
Proof. Let P be the set of tuples with the central tree property. By Corollary 2.3.3, there exist C, d > 0 such that ℙn (P) > 1 − Ce−dn . Moreover, by Proposition 2.3.1 (3), if h⃗ ∈ P, ⃗ there are at most 2k k! elements of P which generate the subgroup ⟨h⟩. ⃗ ⃗ If Z is the set of subgroups of F(A) of the form ⟨h⟩ such that h ∈ P, one shows as in the proof of Proposition 2.2.14 that ℚn (X) ≤ ℚn (X ∩ Z) + ℚn (T \ Z), and that both terms of this sum vanish exponentially fast. The following corollary immediately follows from Corollary 2.3.3. Corollary 2.3.5. Let k ≥ 1. In the k-generator model for subgroups, malnormality and purity are exponentially generic. Remark 2.3.6. The proof of Proposition 2.3.4 does not extend to the density model: if the number of elements of a tuple h⃗ is a function k(n) that tends to infinity, the multiplying fact 2k k! is not a constant anymore, and negligibility for X is obtained only if ℙn (Y) vanishes very fast (namely, if ℙn (Y) = o(2k(n) k(n)!)). We conclude this section with a discussion of the height of the central tree of the ⃗ for a random choice of h.⃗ ArzhantStallings graph of ⟨h⟩⃗ (that is, the parameter lcp(h)) seva and Ol’shanskiĭ [16] show that in the few-generators model, the height of the cen⃗ is exponentially generically at most αn, for any tral tree (namely, the parameter lcp(h)) α > 0. It is in fact generically much smaller (see [27, Proposition 3.24]). Proposition 2.3.7. Let f be an unbounded nondecreasing integer function and let k ≥ 1. The following inequality holds generically for a tuple h⃗ chosen randomly in the k-generator model: lcp(h)⃗ ≤ f (n). This implies that generically in the few-generators model, for tuples as well as ⃗ that lie in the central tree (at most for subgroups, the proportion of vertices of Γ(⟨h⟩) ⃗ 2r(2r − 1)lcp(h)−1 ) tends to 0 (apply Proposition 2.3.7 with, say, f (n) = log log n). 2.3.3 Random Stallings graphs Another point of view on random subgroups of F(A) relies on the observation that each finitely generated subgroup corresponds to a unique Stallings graph, and that these graphs admit an intrinsic combinatorial characterization, as reduced rooted A-graphs (see Section 2.3.1). The problem of drawing a random subgroup can therefore be reduced to the problem of drawing a random reduced rooted A-graph. When considering such graphs, it is natural to measure their size by their number of vertices (the number of edges of such a graph of size n lies between n − 1 and 2|A|n). By extension, we say that the size of a subgroup H, written |H|, is the size of its Stallings graph Γ(H). Then we consider the set S of all Stallings graphs over alphabet A (that is, of all the reduced rooted A-graphs), and the uniform probability law ℙn with support
66 | 2 Random presentations and random subgroups the Stallings graphs with n vertices. This is called the graph-based model for subgroups of F(A). Implementation of the graph-based model The problem of drawing a tuple of reduced words uniformly at random is easily solved: one draws each word independently, one letter at a time, with 2r = |A|̃ choices for the first letter, and 2r − 1 choices for each of the following letters. Drawing (a tuple of) cyclically reduced words uniformly at random is also done in a simple way. Indeed, the probability that a random reduced word of length n is cyclically reduced tends to 2r−1 when n tends to infinity, and we can use a rejection 2r algorithm: repeatedly draw a reduced word until that word is cyclically reduced. The 2r 1 expected number of draws tends to 2r−1 = 1 + 2r−1 . Drawing a Stallings graph with n vertices is a less immediate task. Bassino et al. [26] use a recursive method and the tools of analytic combinatorics to solve it in an efficient manner: they give a rejection algorithm with expected number of draws 1 + o(1), which requires a linear-time precomputation, and takes linear time for each draw. These linear-time bounds are evaluated in the RAM model; in the bit complexity model, the precomputation is done in time 𝒪(n2 log n) and each draw is done in time 𝒪(n2 log2 n) (see [26, Section 3]). Let us sketch a more precise description of this random generation algorithm and its justification. The central idea is the observation that a size n Stallings graph Γ defines an A-tuple (fa )a∈A of partial injections (partial, one-to-one maps) from {1, . . . , n} into itself: fa (i) = j if and only if there is an a-labeled edge in Γ from vertex i to vertex j. Conversely, such a tuple of partial injections defines an A-labeled graph with vertex set {1, . . . , n}, which is a Stallings graph (rooted at vertex 1) if and only if it is connected and every vertex i > 2 is adjacent to at least two edges. Drawing an n-vertex Stallings graph uniformly at random can therefore be done by drawing independently |A| partial injections uniformly at random, checking whether the resulting graph is a Stallings graph and, if it is not, rejecting this draw and drawing a fresh one, repeating the operation until a Stallings graph has been drawn. The justification of the efficiency of such a rejection algorithm relies on the proof of the following statement. With probability tending to 1 when n tends to infinity, the graph defined by an A-tuple of randomly chosen partial injections on {1, . . . , n} is a Stallings graph. Once this is established, it is elementary that the expected number of draws in the rejection algorithm is 1 + o(1). We refer the reader to Theorems 2.4 and 2.6 and Corollary 2.7 in [26] for a proof of this assertion. This proof relies on a combinatorial understanding of partial injections which we discuss below. We first note that drawing uniformly at random a size n partial 2 injection can be done by the following elementary method. There are PIn,k = (nk) k! partial injections with domain size k (choose a size k domain, a size k codomain and a bijection between them). For n fixed, if the PIn,k are precomputed, one can draw a size
2.3 Random subgroups | 67
n partial injection as follows. First draw the domain size k according to the distribution given by the PIn,k , draw two size k subsets to be the domain and codomain and draw a permutation of {1, . . . , k}. This method has shortcomings: it does not give us a handle to prove the genericity of connectedness, which is essential to justifying the rejection algorithm, or to easily estimate such parameters as, say, the expected value of the domain size of a partial injection, which is essential in the proof of statements (2) to (6) of Theorem 2.3.8 below. Note also that some care needs to be exercised to obtain the linear complexity bounds mentioned above: the binomial coefficients (nk) (or the ratios PIn,k , where PIn PIn
is the number of size n partial injections) must be computed by a linear recurrence (based on Pascal’s triangle) and random permutations must be generated in linear time. The algorithm used in [26] to efficiently draw a partial injection uniformly at random is an instance of the recursive method (see [135]). A size n partial injection f (or rather its functional graph) is analyzed as follows. It is a disjoint union of its maximal orbits which are either cycles (as in permutations) or linear graphs (or sequences), that is, subsets {i1 , . . . , iℓ } of {1, . . . , n} (ℓ ≥ 1) such that f (ij ) = ij+1 for 1 ≤ j < ℓ, i1 has no preimage and iℓ has no image. The exponential generating sequences (EGSs) of these simple combinatorial structures (cycles and sequences) are easily computed and the calculus of EGSs inherent to the recursive method yields an explicit formula for the EGS of partial injections. This formula, together with a healthy dose of complex analysis, allows us to justify our rejection algorithm and to establish a number of asymptotic properties of Stallings graphs (see Theorem 2.3.8 below). The resulting efficient random generator uses the explicit computation of the coefficients of the EGS for partial injections, and the fact that this EGS is the result of specific algebraic operations applied to the EGSs of cycles and sequences. This reduces the random generation of a size n partial injection to a two-step algorithm: first we draw the profile of a random permutation, that is, the sequence of sizes and types (cycle or sequence) of its maximal orbit, and second we draw a random size n permutation to label the objects in the profile we just drew. Drawing the profile uniformly at random consists in determining the size k of a maximal orbit (according to the distribution of the sizes of these orbits, which is obtained along the way), determining whether this orbit is a cycle or a sequence (the distribution of these two types of size k orbits was also obtained along the way) and completing the profile by randomly generating the profile of a size n − k partial injection (this is the recursion in the recursive method); see [26] for more details. Asymptotic properties of subgroups in the graph-based model The following is a combination of [26, Section 2.4, Corollary 4.1] and [25, Corollary 4.8 and Theorems 5.1 and 6.1]. We say that a property X is super-polynomially negligible (respectively, generic) if ℙn (X) is 𝒪(n−k ) (respectively, 1 − 𝒪(n−k )) for every positive integer k.
68 | 2 Random presentations and random subgroups Theorem 2.3.8. Let r = |A|. (1) The number of subgroups of F(A) of size n is asymptotically equivalent to (2e)−r/2 −(r−1)n+2r√n (r−1)n+ r+2 4 . e n √2π (2) The expected rank of a size n subgroup of F(A) is (r − 1)n − r√n + 1, with standard deviation o(√n). (3) In the graph-based model, a random subgroup of F(A) of size n is generically neither malnormal nor pure; it is malnormal (respectively, pure) with vanishing probability r 𝒪(n− 2 ). (4) The probability that a subgroup of F(A) of size n avoids all the conjugates of the elements of A tends to e−r . (5) The probability that a subgroup of F(A) of size n has finite index admits an r 𝒪(n 4 e−2r√n ) upper bound. In particular, this class of subgroups is super-polynomially negligible. (6) In the graph-based model, the quotient of F(A) by the normal closure of a random subgroup is generically trivial. Theorem 2.3.8 (1) is a direct consequence of the previous discussion: generically, an r-tuple of partial injections drawn independently defines a Stallings graph, so an asymptotic estimate of the number of size n subgroups can be derived from an asymptotic estimate of the number of size n partial injections. Theorem 2.3.8 (2) uses the fact that the rank of a subgroup H is equal to e − v + 1, where e and v are the numbers of edges and vertices, respectively, of Γ(H) (see Section 2.3.1). For a size n subgroup, v = n. As for the number of a-labeled edges, it is the difference n − sa , where sa is the number of sequences among the maximal orbits of the partial injection fa determined by a. Thus the proof of Theorem 2.3.8 (2) reduces to the study of the asymptotic behavior of the random variable which counts the number of sequences in a random partial injection of size n. This relies on the saddle point analysis of the bivariate EGS which counts partial injections by size and by the number of their sequences (see [26, Section 2.3]). The counting of partial injections by the seemingly indirect recursive method is crucial for this purpose. It is interesting to contrast Theorem 2.3.8 (2) with the results reported in Section 2.3.2. As discussed at the very end of that section, in the Stallings graph of a subgroup taken at random in the few-generators model, the immense majority of vertices are on the outer loops, adjacent to exactly two edges. In fact, since the rank of a subgroup is the difference between the number of edges and the number of vertices plus 1, the ratio between the number of edges and vertices tends to 1 in the fewgenerators model (Proposition 2.3.1 (2) and Corollary 2.3.3), and it tends to |A| − 1 in the graph-based model (Theorem 2.3.8 (2)). Observe that the minimum and maximum possible values for this ratio are 1 and |A|. In intuitive terms, the Stallings graph of
2.3 Random subgroups | 69
a random group is sparse in the few-generators model, and rather full in the graphbased model. In other words, there are many more loops, including short loops, in the latter model, whereas in the k-relator model, there are only k loops, and they are all very long: using close to a k1 proportion of the edges. This is the feature that is exploited in [25] to show that the property in Theorem 2.3.8 (4) is exponentially negligible in the few-generators model, and indeed in the density model at densities d < 41 . Similarly, generically in the graph-based model, a Stallings graph has a cycle labeled by a power of a letter, and hence the corresponding subgroup is neither malnormal nor pure (Theorem 2.3.8 (3)). This is a very rough sufficient reason for a subgroup to fail being malnormal or pure, and the probability of this property may well vanish faster than stated above. A refinement of this result (namely, the fact that for each letter a, the lengths of the cycles labeled by a power of a are relatively prime) leads to Theorem 2.3.8 (6). In this respect, we see that drawing uniformly at random the Stallings graph of the subgroup generated by a tuple of relators is not a fruitful avenue to discuss “typical” properties of finite presentations. Finally, we note that the estimates in Theorem 2.3.8 (1) and (5) can be seen as an extension of the study of subgroup growth; see in particular Lubotzky and Segal [311].
2.3.4 Whitehead minimality The following property of a subgroup H of F(A) has already been mentioned. We say that H is Whitehead minimal (respectively, strictly Whitehead minimal) if |φ(H)| ≥ |H| (respectively, |φ(H)| > |H|) for every nonlength preserving automorphism φ of F(A), where |H| is the number of vertices of its Stallings graph Γ(H). This property plays an important role in the solution of the automorphic orbit problem, to decide whether two subgroups are in the same orbit under the automorphism group of F(A), as shown by Gersten [164, Corollary 2], in an extension of the famous Whitehead peak reduction theorem [497] (see also [313, Section 1.4]) from elements of F(A) to finitely generated subgroups. Note that a cyclic subgroup H = ⟨u⟩ is (strictly) Whitehead minimal if and only if the word u is (strictly) Whitehead minimal in the sense discussed in Section 2.2.3. As mentioned there, Kapovich et al. prove that strictly Whitehead minimal cyclically reduced words are exponentially generic in F(A) [249, Theorem A]. This can be generalized to all finitely generated subgroups. Since the Stallings graph of a Whitehead minimal subgroup must be cyclically reduced, the graph-based model must be restricted (in the natural way) to these graphs. If we consider instead the few-generators model, we note that being cyclically reduced is not a generic property (see [28, Proposition 4.6]); here too, the few-generators model must be restricted to tuples of cyclically reduced words, that is, to the few-relators model of Section 2.2. Under these restrictions, Bassino et al. prove that strict Whitehead minimality is
70 | 2 Random presentations and random subgroups generic both in the graph-based and in the few-generators model [28, Theorems 3.1 and 4.1]. Theorem 2.3.9. Strict Whitehead minimality is super-polynomially generic for the uniform distribution of cyclically reduced Stallings graphs. The same property is exponentially generic in the few-relators model, restricted to tuples of cyclically reduced words. Remark 2.3.10. The reasons for genericity are different for the two models, due to the very different expected geometry of a random Stallings graph; in the few-generators model, it is very sparse and most of its vertices are on very long loops, whereas the graph is fuller and has many short loops in the graph-based model; see [28] for more details.
2.3.5 Random subgroups of nonfree groups Let us first return to the few-generators model, but for subgroups of some fixed, nonfree A-generated group G. Here, the probability laws ℙn we consider are the uniform probability laws with support (à ≤n )k for some fixed k ≥ 1; that is, we draw uniformly at random k-tuples of words of length at most n, that are not necessarily reduced. Gilman et al. show the following proposition [169, Theorem 2.1]. Recall that a group is nonelementary hyperbolic if it is hyperbolic and does not have a cyclic, finiteindex subgroup. Proposition 2.3.11. Let G be a nonelementary hyperbolic group and let k ≥ 1. Then for any choice of generators A of G and any onto morphism π: F(A) → G, exponentially generically in the k-generator model, a tuple h⃗ of elements of F(A) is such that π(h)⃗ freely generates a free, quasiconvex subgroup of G. Note that a free group F(A) is nonelementary hyperbolic if |A| ≥ 2; thus Proposition 2.3.11 generalizes part of Corollary 2.3.3, since the latter is only relative to the standard set of generators of F(A). We say that a group G has the (exponentially) generic free basis property if for every choice of generators A of G and every onto morphism π: F(A) → G, for every integer k ≥ 1, the π-image of a k-tuple h⃗ of elements of à ∗ (exponentially) generically freely generates a free subgroup of G (in the k-generator model). Proposition 2.3.11 states that nonelementary hyperbolic groups have the exponentially generic free basis property. Gilman et al. [169] and Myasnikov and Ushakov [367] note that this property is preserved as follows. If φ: G1 → G2 is an onto morphism and G2 has the (exponentially) generic free basis property, then so does G1 . For instance, non-abelian right-angled Artin groups and pure braid groups PBn (n ≥ 3) have the exponentially generic free basis property, since they admit morphisms onto a rank-2 free group (see, e. g., [45] for PBn ).
2.4 Nonuniform distributions | 71
Proposition 2.3.11 can be used also to show the following result [169, Theorem 2.2] on the membership problem in subgroups – a problem which is, in general, undecidable in hyperbolic groups [420]. Corollary 2.3.12. Let G be a nonelementary hyperbolic A-generated group, let π be a surjective morphism from A∗ onto G and let k ≥ 1. There exists an exponentially generic set X of k-tuples of words in à ∗ and a cubic-time algorithm which, on input a k-tuple h⃗ and an element x ∈ à ∗ , decides whether h⃗ ∈ X, and if so, solves the membership problem ⃗ that is, decides whether π(x) ∈ ⟨π(h)⟩. ⃗ for π(x) and π(h), There is no study as yet of asymptotic properties of subgroups of nonfree groups using a graph-based model, in the spirit of Section 2.3.3. Let us however mention that recent results may open the way towards such a study. Kharlampovich et al. [258] effectively construct Stallings graphs which are uniquely associated with each quasiconvex subgroup of a geodesically automatic group, e. g., hyperbolic groups and right-angled Artin groups. Like in the free group case, this has a large number of algorithmic consequences. It may be difficult to combinatorially characterize these graphs in general, and to design random generation algorithms or to explore their asymptotic properties. But it may be possible to tackle this task for specific groups or classes of groups. In fact, somewhat earlier results already gave more efficient and more combinatorially luminous constructions, for amalgams of finite groups (see Markus-Epstein [331]) and for virtually free groups (see Silva et al. [463]). Note that both classes of groups are locally quasiconvex, and these constructions apply to all their finitely generated subgroups.
2.4 Nonuniform distributions In this final section, we introduce nonuniform distributions, both for relators and generators, as explored by Bassino et al. [27]. We keep the idea of randomly drawing tuples of words by independently drawing the elements of the tuple, but we relax the distribution on the lengths of the tuples and on the lengths of the words, and we use nonuniform probability laws of probability on each ℛn (respectively, 𝒞ℛn ). More precisely, the model of randomness is the following [27]. For each n ≥ 0, let ℝn be a law of probability on ℛn (or 𝒞ℛn if we are dealing with presentations) and let 𝕋n be a law of probability on the set of tuples of positive integers. If h⃗ = (h1 , . . . , hk ) is a tuple of words, let |h|⃗ = (|h1 |, . . . , |hk |). Together, (ℝn )n and (𝕋n )n define a sequence of probability laws ℙn on the set of tuples of (cyclically) reduced words as follows: ⃗ ∏ ℝ (h ). ℙn (h)⃗ = 𝕋n (|h|) |hi | i i
Note that this includes the density and few-generators (relators) models discussed in Sections 2.2 and 2.3; for instance, in the k-generator model, ℝn is the uniform distri-
72 | 2 Random presentations and random subgroups bution on ℛn and 𝕋n is the distribution with support the k-tuples of integers between 0 and n, each with probability k
𝕋n (ℓ1 , . . . , ℓk ) = ∏ i=1
|ℛℓi | |ℛ≤n |
.
2.4.1 Prefix-heavy distributions For each word u ∈ ℛ, denote by 𝒫 (u) the set of reduced words starting with u, that is, 𝒫 (u) = uà ∗ ∩ ℛ. For C ≥ 1 and 0 < α < 1, we say that the sequence of probability laws (ℝn )n (each with support in ℛn ) is prefix-heavy with parameters (C, α) if, for all u, v ∈ ℛ, we have ℝn (𝒫 (uv) | 𝒫 (u)) ≤ Cα|v| . This definition captures the idea that the probability of a prefix-defined set (a set of the form 𝒫 (u)) decreases exponentially fast with the length of u. It is satisfied by the sequence of uniform probability laws on ℛn (n ≥ 0). If (ℙn )n is a sequence of laws of probability on tuples of reduced words, defined as above by sequences (ℝn )n and (𝕋n )n of probability laws on words and on tuples of integers, and if (ℝn )n is prefix-heavy with parameters (C, α), then we say that (ℙn )n is prefix-heavy as well, with the same parameters. Under this hypothesis, Bassino et al. obtain a series of general results [27, Theorems 3.18, 3.19 and 3.20], summarized as follows. If h⃗ = (h1 , . . . , hk ) is a tuple of reduced words, we let size(h)⃗ = k, min(h)⃗ = min{|hi | | 1 ≤ i ≤ k} and max(h)⃗ = max{|hi | | 1 ≤ i ≤ k}. Let us say, also, that (ℝn )n and (ℙn )n do not ignore cyclically reduced words if lim inf ℝn (𝒞ℛn ) > 0. Theorem 2.4.1. Let (ℙn )n be a sequence of probability laws on tuples of reduced words, which is prefix-heavy with parameters (C, α), with C ≥ 1 and 0 < α < 1. Let 0 < λ < 21 .
–
–
–
min
If the random variable size2 α 2 is increasingly small – more precisely, if there exists min a sequence (ηn )n tending to 0, such that ℙn (size2 α 2 > ηn ) tends to 0 – then a random tuple of reduced words generically satisfies the central tree property, and freely generates a subgroup of F(A). min If there exists a sequence (ηn )n tending to 0 such that ℙn (size2 max2 α 8 > ηn ) tends to 0, then a random tuple of reduced words generically generates a malnormal subgroup of F(A). Let 0 < λ < 21 . If the sequence (ℙn )n does not ignore cyclically reduced words and if there exists a sequence (ηn )n tending to 0 such that ℙn (size2 max2 αλ min > ηn ) tends to 0, then a random tuple of cyclically reduced words generically satisfies the small cancelation property C ( 61 ).
2.4 Nonuniform distributions | 73
In all three statements, exponential genericity is guaranteed if the vanishing sequences converge exponentially fast to 0. The technical aspect of these statements is due to the very general nature of the random model considered. In the next section, we discuss a more specific model, where the ℝn are generated by a Markovian scheme. 2.4.2 Markovian automata When it comes to drawing words at random, an automaton-theoretic model comes naturally to mind. Bassino et al. introduce the following notion. A Markovian automaton 𝒜 over a finite alphabet X consists in a finite deterministic transition system (Q, ⋅) (that is, an action of the free monoid X ∗ on the finite set Q, or seen otherwise, a deterministic finite-state automaton over alphabet X without initial or terminal states), an initial probability vector γ0 ∈ [0, 1]Q (that is, ∑p∈Q γ0 (p) = 1) and a stochastic matrix M ∈ [0, 1]Q×X (that is, a matrix where each column is a probability vector) such that M(p, x) > 0 if and only if p ⋅ x is defined. Such a scheme defines a sequence (ℝn )n of laws of probability, over each set X n (n ≥ 0), as follows: ℝn (x1 ⋅ ⋅ ⋅ xn ) = ∑ γ0 (p)M(p, x1 )M(p ⋅ x1 , x2 ) ⋅ ⋅ ⋅ M(p ⋅ (x1 ⋅ ⋅ ⋅ xn−1 ), xn ). p∈Q
Note that the union over n of the support sets of ℝn is always a prefix-closed rational language: that accepted by the transition system (Q, ⋅), with initial states the support of γ0 and all states final. Example 2.4.2. For instance, if Q = A,̃ if for each a, b ∈ A,̃ a ⋅ b is defined whenever b ≠ a−1 and equal to b when defined, if the entries of γ0 are all equal to 2r1 and if the 1 , then ℝn is the uniform probability law on ℛn . nonzero entries of M are all equal to 2r−1 The Markovian automata in Figure 2.4 also yield the uniform probability law (at fixed length) on two languages which both provide unique representatives for the elements of the modular group (see Section 2.2.4): the support of 𝒜 is the set of words over alphabet {a, b, b−1 } without occurrences of the factors a2 , b2 , (b−1 )2 , bb−1 and b−1 b a|1
𝒜:
1 3
b| b−1 |
2 3
1 2 1 2
𝒜:
1 3
b|1 a|
1 2
1 3
b|
1 2
1 3
a|1
Figure 2.4: Markovian automata 𝒜 and 𝒜 . Transitions are labeled by a letter and a probability, and each state is decorated with the corresponding initial probability.
74 | 2 Random presentations and random subgroups (the shortlex geodesics of the modular group), and the support of 𝒜 consists of the words on alphabet {a, b}, without occurrences of a2 or b3 . A first set of results is obtained by specializing Theorem 2.4.1 to the case where the sequence (ℙn )n is induced by a Markovian automaton 𝒜. If 0 < α < 1, we introduce the α-density model with respect to 𝒜, in analogy with Sections 2.2.1 and 2.3.2: at density d < 1, the sequence (ℙn )n is induced by the sequences (ℝn )n , induced by 𝒜, and (𝕋n )n , where the support of 𝕋n is reduced to the αdn -tuple (n, . . . , n). The usual density model 1 corresponds to α = 2r−1 . The following is a generalization of the results in Section 2.3.2 [27, Proposition 4.3 and Corollary 4.5]. Theorem 2.4.3. Let 𝒜 be a Markovian automaton. If 𝒜 does not have a cycle with probability 1, then the induced sequence of probability laws on ℛ is prefix-heavy, with computable parameters (C, α). If that is the case, then in the density model with respect to 𝒜, at α-density d < 41 , a tuple of reduced words exponentially generically has the central tree property. And at α-density d < 161 , a tuple of reduced words exponentially generically generates a malnormal subgroup. Sketch of proof. Let Q, γ0 and M be the state set, the initial probability vector and the stochastic matrix of 𝒜, respectively. If p ∈ Q and x1 ⋅ ⋅ ⋅ xn ∈ A∗ , let γ(p, x1 ⋅ ⋅ ⋅ xn ) = M(p, x1 )M(p ⋅ x1 , x2 ) ⋅ ⋅ ⋅ M(p ⋅ (x1 ⋅ ⋅ ⋅ xn−1 ), xn ), so that ℝn (u) = ∑p∈Q γ0 (p)γ(p, u) for every word u. Let ℓ be the maximum length of an elementary cycle in 𝒜 and let δ be the maximal value of γ(q, κ) when κ is an elementary cycle at state q. By assumption, δ < 1. Then, for every cycle u (elementary or not) at a |u| vertex q, we have γ(q, u) ≤ δ ℓ . Since a path starting from q can be seen as a sequence |w|−|Q| of cycles interspersed with at most |Q| transitions, we find that γ(q, w) ≤ δ ℓ for |w|−|Q| −|Q| 1 every word w, and hence ℝn (w) ≤ δ ℓ . Letting C = δ ℓ and α = δ ℓ , we find that if n ≥ |uv|, ℝn (𝒫 (uv)) = ℝ|uv| (uv) = ∑ γ0 (p)γ(p, u)γ(p ⋅ u, v) p∈Q
≤ ( ∑ γ0 (p)γ(p, u))Cα|v| p∈Q
= ℝ|u| (u)Cα|v| = ℝn (𝒫 (u))Cα|v| . We now consider the probability P that an αdn -tuple h⃗ of reduced words in ℛn fails to satisfy the central tree property (see Section 2.3.2), that is, some word of length −1 t = 21 n occurs as a prefix of hi or h−1 i , and of hj or hj , for some i < j (it is not possible for a word of that length to occur as a prefix of both hi and h−1 i ). It is easily seen that P ≤
2.4 Nonuniform distributions | 75
4 Σi λ2 , it exponentially generically does not satisfy property C (λ). Finally, at α[2] -density d > 21 , a tuple h⃗ of cyclically reduced words exponentially generically presents a degenerate group, in the following sense. Let B ⊆ Ã be the set of letters which label a transition in 𝒜 and let D = A \ (B ∪ B−1 ). Then ⟨A | h⟩⃗ is equal to the free group of rank |D| + 1 if B ∩ B−1 = 0, and otherwise to F(D) ∗ ℤ/2ℤ if n is even and to F(D) if n is odd.
3 Randomness and computation in linear groups Introduction In this chapter we survey recent results due to the author and others on generic phenomena in infinite (primarily linear) groups, and also questions of determining whether a given element (or subgroup) has particular properties. Here are some of the questions we address: – What is a random element of an infinite group? What is a random subgroup? – How does one generate a random element or a random subgroup? – What is a random subgroup of an infinite group? – What can you say about a random element? – What can you say about a random subgroup? – How does one tell if a subgroup (given by generators) of a linear group is Zariski dense? – How does one tell if a subgroup (given by generators) of a linear group is arithmetic? Many of the questions above are open-ended, and have many possible answers. To simplify the discussion, we will assume throughout that our “universal set” is an arithmetic lattice in an almost simple algebraic group, and to simplify matters even further, our universal set will be one of SL(n, ℤ) or Sp(2n, ℤ).
3.1 What is a random element of an infinite matrix group? There are many ways to think about this question, and so we can give a number of different definitions. Below, we will describe three or four most common approaches. Let S be our group (or, really, any infinite set, hence the name). First, define a measure of size v on the elements of S. This should satisfy some simple axioms, such as: 1. v(x) ≥ 0 for all x ∈ S; 2. the set Sk = {x ∈ S | v(x) ≤ k} is finite for every k. Let now P be a predicate on the elements of S – think of a predicate as just a function from S to {0, 1}. Let 𝒫 ⊂ S be defined as 𝒫 = {x ∈ S | P(x) = 1}, and define Pk = {x ∈ 𝒫 | v(x) ≤ k}. We say that the property P is generic for S with respect to the valuation v if lim
k→∞
|Pk | = 1. |Sk |
We say that P is negligible with respect to v if https://doi.org/10.1515/9783110667028-003
(3.1)
78 | 3 Randomness and computation in linear groups
lim
k→∞
|Pk | = 0. |Sk |
(3.2)
Sometimes the above two definitions are not enough, and we say that P has asymptotic density p with respect to v if lim
k→∞
|Pk | = p. |Sk |
(3.3)
These definitions work well when they work. Here are some examples. Example 3.1.1. The set S is the set ℕ of natural numbers, and the predicate P is P(x) = x is prime. The valuation v is just the usual “Archimedean” valuation on ℕ, and, as is well known, the set of primes is negligible. One can make a more precise statement (which is the content of the prime number theorem, see [92, 374]). With definitions as above, Pk 1 = Θ( ). Sk log k Example 3.1.2. Let S be the set of integer lattice points (x, y) ∈ ℤ2 , let Ω ⊂ ℝ2 be a Jordan domain and define the valuation on S as follows: v(x) = inf{t | x ∈ tΩ}. Further, define the predicate P by P(x, y) = x is relatively prime to y – such points are called visible, since one can see them from the origin (0, 0). Then, the asymptotic den1 sity of P is ζ (2) = π62 . The proof of this for Ω being the unit square is classical, and can be found (for example) in Hardy and Wright [205] or in the less classical reference [421]. To get the general statement, we first note that the special linear group SL(2, ℤ) acts ergodically on the plane ℝ2 (see [502]). Now, define a measure μt by μt (Ω) =
1 the number of points such that P(x, y) = 1 in tΩ. t2
Each μt is clearly a measure, dominated by the Lebesgue measure, and invariant under the SL(2, ℤ) action on ℝ2 . By Helly’s theorem [281, Section 10.3] it follows that the set {mut } has a convergent subsequence σ, and by SL(2, ℤ) invariance, the limit μσ is a constant multiple of the Lebesgue measure, and the constant can be evaluated for some specific Ω, such as the square (more details of the argument can be found in [244]). Note that the constant does not depend on σ, so all the convergent subsequences of the set {μt } have the same limit, which must, therefore, be the unique limit point of the set.
3.1 What is a random element of an infinite matrix group?
| 79
Example 3.1.3. Consider the free group on two generators F2 = ⟨a, b⟩. We define the valuation v(x) to be the reduced word length of x. Let P be the predicate P(x) = the abelianization a(x) ∈ ℤ2 is a visible point. Then P does not have an asymptotic density. It does, however, have an asymptotic annular density, defined as follows. Let X ⊂ S, where S, as usual, has a valuation satisfying our axioms. We define Sk = {x ∈ S | v(x) = k}, and similarly for Xk . Then, the k-th annular density of T is defined by X 1 X ρk (X) = ( k−1 + k ). 2 Sk−1 Sk
(3.4)
We define the strict annular density of X to be ρA (X) = limk→∞ ρk (X), if the limit exists. The general result (shown in [244]) is the following. Theorem 3.1.4. Let S be an SL(n, ℤ)-invariant subset of ℤk , and let S̃ = a−1 S, where a, as before, is the abelianization map from the free group on k generators Fk to ℤk . Then S̃ has a strict annular density whenever S has an asymptotic density. Moreover, the two densities are equal. The proof of Theorem 3.1.4 uses the ergodicity of the SL(n, ℤ) action on ℝn , as described in Example 3.1.2, and the central and local limit theorems of [427] (see also [425]) and of [451].
3.1.1 Random walks An approach that allows us to bring a number of tools from dynamics to the problem is the following. First, we think of our group as a finitely generated discrete group. We pick a collection of generators (usually, but not necessarily, symmetric, meaning it is closed under taking inverses), and then we look at the set Wn of all words in the generators of length n (or length ≤ n), and say that a property 𝒫 is generic if the proportion of elements possessing it Wn goes to 1 as n goes to infinity. A somewhat more sophisticated model (introduced in the author’s 1999 preprint, published as [425]) is to introduce an automaton: a graph whose vertices are labeled by the elements of our generating set, and whose edges give the legal transitions. For example, in Figure 3.1 we give a graph such that random walks on it correspond exactly to reduced words in the free group F2 . The example in Figure 3.1 is noteworthy because the length of the walk is equal to the distance of the corresponding product from the origin in the Cayley graph of F2 . This is quite unusual, and in most cases results which are tractable in the random walk model are very hard (and usually open) in the Cayley graph distance model.
80 | 3 Randomness and computation in linear groups
Figure 3.1: Accepting automaton for a free group on two generators.
3.1.2 Random subgroups If we know how to generate a random element, then a random k-generator subgroup is just a subgroup generated by k such elements. As always, there are some variations. We can define, for example, a 1.5 random generator subgroup as one where one generator is picked deterministically (usually in such a way that it does not give rise to obviously nongeneric phenomena – for example, a central element would not be particularly satisfying), and the other is random.
3.2 Properties of generic elements 3.2.1 The easy case: SL(2, ℤ) Consider the modular group SL(2, ℤ). Our first set of results will use ordering by Frobenius norm of the matrix. Definition 3.2.1. The Frobenius norm of the matrix x = ( ac db ) is ‖x‖ = √a2 + b2 + c2 + d2 . The first question is the following. Question 3.2.2. How many elements x ∈ SL(2, ℤ) have ‖x‖ ≤ N? It is surprising that this question was first answered by Newman in 1988(!) [375]. Theorem 3.2.3. The number 𝒩k of elements x ∈ SL(2, ℤ) with ‖x‖ ≤ k is asymptotic to 6k 2 . Newman’s proof of Theorem 3.2.3 begins by reparametrizing SL(2) as follows. First define the following variables: A = a + d, B = b + c,
3.2 Properties of generic elements | 81
C = b − c,
D = a − d. We see that A2 + B2 + C 2 + D2 = a2 + b2 + c2 + d2 . Further note that 4 = 4(ad − bc) = A2 + C 2 − B2 − D2 ,
(3.5)
while a c
A = tr (
b ). d
(3.6)
Since the difference between A and D is 2d, we know that A≡D
mod 2,
(3.7)
B≡C
mod 2.
(3.8)
and for the same reason
Then Newman writes down a generating function for the number of matrices in SL(2, ℤ) with prescribed Frobenius norm in terms of the theta function 2
θ(x) = ∑ ∞xn , n=−∞
and uses classical estimates on the coefficients of products of theta functions to obtain the asymptotic result of Theorem 3.2.3. Since an exposition of this method would take us too far afield, let us use the parametrization above to count those elements with trace equal to 2 (which is to say, the parabolic elements). Equations (3.5) and (3.6) tell us that the number of such matrices with Frobenius norm bounded by k is exactly equal to the number of Pythagorean triples of norm bounded by k. Now, as is well known, Pythagorean triples (A, B, C) with A2 = B2 +C 2 are rationally parametrized by A = u2 + v2 ,
(3.9)
B=u −v ,
(3.10)
2
C = 2uv.
2
(3.11)
With this parametrization, the 2-norm of (A, B, C) equals √2(u2 + v2 ), so the number of Pythagorean triples with L2 -norm bounded above by X equals the number of pairs (u, v) with L2 -norm bounded above by 21/4 √X, which, in turn, is asymptotic to √2πX. Note that the congruences (3.7) and (3.8) tell us that 2uv = a − d. This overcounts by a factor of two (since (u, v) and (−u, −v) give the same Pythagorean triple), but on the other hand, parabolic matrices are allowed to have trace equal to ±2, so when the smoke clears, we have the following result.
82 | 3 Randomness and computation in linear groups Theorem 3.2.4. The number of parabolic matrices in SL(2, ℤ) and Frobenius norm bounded above by k is asymptotic to √2πk. Now, we make the following observation. Observation 3.2.5. The characteristic polynomial of a matrix in SL(2, ℤ) factors over ℤ if and only if the matrix is parabolic (so has trace ±2.) Proof. Indeed, if M ∈ SL(2, ℤ), the roots of the characteristic polynomial χ(M) are tr M ± √tr2 M − 4 . 2 For χ(M) to factor, tr2 M − 4 must be a perfect square, which obviously happens only when | tr M| = 2. We thus have the following. Theorem 3.2.6. The probability of a matrix in SL(2, ℤ) of Frobenius norm bounded √ above by x to have reducible characteristic polynomial is asymptotic to 3πx2 . The first result in this field seems to be due to the author, and was published in [422]. This was the following. Theorem 3.2.7. A generic element (in either the Euclidean norm sense or the random walk sense) of SL(n, ℤ) or Sp(2n, ℤ) has irreducible characteristic polynomial. The proof of Theorem 3.2.7 (with either meaning of generic) proceeds by sieving. What we show is that “big” elements are equidistributed modulo any fixed prime – for the random walk case, this is elementary, and uses the methods developed in [425], and for the “Archimedean” case, it relies on the very deep results of Sarnak and Nevo [372]. A general framework for the sieve methods involved was developed later by Kowalski [271] and Lubotzky and Meiri [309]. Irreducibility was sufficient for the goals of the paper in the case of Sp(2n, ℤ) (it led to a proof that a generic element of the mapping class group of a closed surface was pseudo-Anosov), but it turned out that in the case of SL(n, ℤ), which came up in the study of free group automorphisms, it was necessary to show that the generic element had characteristic polynomial with Galois group Sn . This was also done. Theorem 3.2.8 ([422]). A generic element of SL(n, ℤ) has Galois group Sn . Below we will describe the arguments used to prove Theorems 3.2.7 and 3.2.8 in greater detail.
3.2 Properties of generic elements | 83
3.2.2 Random products of matrices in the symplectic and special linear groups If we have a generating set γ1 , . . . , γl of our lattice Γ (which might be SL(n, ℤ) or Sp(2n, ℤ)) we might want to measure the size of an element by the length of the (shortest) word in γi equal to that element – this is the combinatorial measure of size. We will be using Theorems 3.3.2 and 3.3.3. Remark 3.2.9. We will be applying Theorem 3.3.2 to groups SL(n, ℤ/pℤ) and Sp(2n, ℤ/pℤ). Since those groups have no nontrivial 1-dimensional representations, the assumption on ρ in the statement of the theorem is vacuous. We will also need the following results of Chavdarov and Borel. Theorem 3.2.10 (Chavdarov and Borel [81]). Let q > 4, and let Rq (n) be the set of 2n×2n symplectic matrices over the field Fq with reducible characteristic polynomials. Then |Rq (n)|
| Sp(2n, Fp )|
4, and let Gq (n) be the set of n × n matrices with determinant γ ≠ 0 over the field Fq with reducible characteristic polynomials. Then |Gq (n)|
| SL(n, Fq )|
1/|Γ|. Then there is a γ2 , such that g(γ2 ) < 1/|Γ|. Thus, g(γ) − 1/|Γ| < g(γ) − g(γ2 ) < 2ϵ. The estimate (3.12) follows immediately by summing over Ω.
3.5 Fourier estimates via linear algebra In order to prove Theorem 3.3.3, we would like to use Theorem 3.4.2, and to show the equidistribution result, we would need to show that for every nontrivial irreducible representation ρ, lim
N→∞
1 tr ∑ ρ(γw ) = 0. |WN,i,j | w∈W
(3.13)
N,i,j
To demonstrate equation (3.13), suppose that ρ is k-dimensional, and hence acts on a k-dimensional Hilbert space Hρ = H. Let Z = L2 (G) – the space of complex-valued functions from V(G) to ℂ – let e1 , . . . , en be the standard basis of Z and let Pi be the orthogonal projection on the i-th coordinate space. We introduce the matrix ρ(t1 ) 0 0 ρ(t2 ) Uρ = ∑ Pi ⊗ ρ(ti ) = ( . . . . . . .... i=1 0 0 n
... 0 ... 0 ), ........ . . . ρ(tn )
and also the matrix Aρ = A(G) ⊗ IH , where IH is the identity operator on H. Both Uρ and Aρ act on Z ⊗ H. The following is immediate. Lemma 3.5.1. Consider the matrix (Uρ Aρ )l , and think of it as an n × n matrix of k × k blocks. Then the ij-th block equals the sum over all paths w of length l beginning at vi and ending of vj at ρ(γw ).
88 | 3 Randomness and computation in linear groups Now, let Tji be the operator on Z which maps ek to δkj ei . Lemma 3.5.2. We have tr [((Tjit Pj ) ⊗ IH )(Uρ ⊗ Aρ )N (Pi ⊗ IH )] = tr ∑ ρ(γw ). w∈WN,i,j
Proof. The argument of trace on the left-hand side simply extracts the ij-th k × k block from (Uρ ⊗ Aρ )N . By submultiplicativity of operator norm, we see that t N N (Tji Pj ) ⊗ IH (Uρ ⊗ Aρ ) Pi ⊗ IH op ≤ (Uρ ⊗ Aρ ) op , and so proving Theorem 3.3.3 reduces (thanks to Theorem 3.4.2) to showing the following. Theorem 3.5.3. We have lim
|‖(Uρ ⊗ Aρ )N ‖|op
N→∞
|WN,i,j |
= 0,
for any nontrivial ρ. Notation 3.5.4. We will denote the spectral radius of an operator A by ℛ(A). Since |WN,I,j | ≍ ℛN (A(G)), and by Gelfand’s theorem (Theorem 3.6.3), 1/N lim BN = ℛ(B),
N→∞
for any matrix B and any matrix norm ‖ ∙ ‖, and so Theorem 3.5.3 is equivalent to the statement that the spectral radius of Uρ ⊗ Aρ is smaller than that of A(G). Theorem 3.5.3 is proved in Section 3.5.1. 3.5.1 Proof of Theorem 3.5.3 Lemma 3.5.5. Let A be a bounded Hermitian operator A : H → H, and let U : H → H be a unitary operator on the same Hilbert space H. Then the spectral radius of UA is smaller than the spectral radius of A, and the inequality is strict unless an eigenvector of A with maximal eigenvalue is also an eigenvector of U. Proof. The spectral radius of UA does not exceed the operator norm of UA, which is equal to the spectral radius of A. Suppose that the two are equal, so that there is a v such that ‖UAv‖ = ℛ(A)v, and v is an eigenvector of UA. Since U is unitary, v must be an eigenvector of A, and since it is also an eigenvector of UA, it must also be an eigenvector of U.
3.6 Some remarks on matrix norms | 89
In the case of interest to us, ρ is a k-dimensional irreducible representation of Γ, U = Diag(ρ(t1 ), . . . , ρ(tn )), while A = A(G) ⊗ Ik . We assume that A(G) is an irreducible matrix, so that there is a unique eigenvalue of modulus ℛ(A(G)), that eigenvalue λmax (the Perron–Frobenius eigenvalue) is positive, and it has a strictly positive eigenvector vmax . We know that the spectral radius of A equals the spectral radius of A(G), and the eigenspace of λmax is the set of vectors of the form vmax ⊗ w, where w is an arbitrary vector in ℂk . If vmax = (x1 , . . . , xn ), we can write vmax ⊗ w = (x1 w, . . . , xn w), and so U(vmax ⊗ w) = (x1 ρ(t1 )w, . . . , xn ρ(tn )w). Since all of the xi are nonzero, in order for the inequality in Lemma 3.5.5 to be nonstrict, we must have some w for which ρ(ti )w = cw (where the constant c does not depend on i.) Since the elements ti generate Γ, the existence of such a w contradicts the irreducibility of ρ, unless ρ is 1-dimensional. This proves Theorem 3.5.3.
3.6 Some remarks on matrix norms In this note we use a number of matrix norms, and it is useful to summarize what they are, and some basic relationships and inequalities satisfied by them. For an extensive discussion the reader is referred to the classic [221]. All matrices are assumed square, and n × n. A basic tool in the inequalities below is the singular value decomposition of a matrix A. Definition 3.6.1. The singular values of A are the nonnegative square roots of the eigenvalues of AA∗ , where A∗ is the conjugate transpose of A. Since AA∗ is a positive semidefinite Hermitian matrix for any A, the singular valdef
ues σ1 = σmax ≥ σ2 ≥ ⋅ ⋅ ⋅ are nonnegative real numbers. For a Hermitian A, the singular values are simply the absolute values of the eigenvalues of A. The first matrix norm is the Frobenius norm, denoted by ‖∙‖. This is defined as ‖A‖ = √tr AA∗ = √∑ σi2 . i
This is also the sum of the square moduli of the elements of A. The next matrix norm is the operator norm, |‖∙‖|op , defined as |‖A‖|op = max‖Av‖ = σmax . ‖v‖=1
Both the norms ‖∙‖ and |‖∙‖|op are submultiplicative (submultiplicativity is part of the definition of matrix norm; saying that the norm |‖∙‖| is submultiplicative means that |‖AB‖| ≤ |‖A‖| |‖B‖|.)
90 | 3 Randomness and computation in linear groups From the singular value interpretation1 of the two matrix norms and the Cauchy– Schwartz inequality we see immediately that ‖A‖/√n ≤ |‖A‖|op ≤ ‖A‖.
(3.14)
We will also need the following simple inequalities. Lemma 3.6.2. Let U be a unitary matrix, |tr AU| ≤ ‖A‖√n ≤ n|‖A‖|op .
(3.15)
Proof. Since U is unitary, ‖U‖ = ‖U t ‖ = √n. So, by the Cauchy–Schwartz inequality, tr AU ≤ ‖A‖‖U‖ = √n‖U‖. The second inequality follows from inequality (3.14). The final (and deepest) result we will have the opportunity to use is the following. Theorem 3.6.3 (Gelfand). For any operator M, the spectral radius ℛ(M) and any matrix norm |‖∙‖|, k 1/k ℛ(M) = lim M . k→∞
3.7 Properties of random subgroups The first property that was shown for random subgroups of semisimple groups [424] was the following. Suppose we have one fixed (noncentral) element of a lattice Γ in a semisimple group G, and k ≥ 1 “long” random products of some generating set. Then, such a subgroup is generically Zariski dense in G. It was further shown by Aoun [9] that such a subgroup is also generically free.
3.7.1 A guide to the rest of the section Given a subgroup Γ of GLn (ℤ) with Zariski closure Γ in GLn (ℂ), Γ is called a thin group if it is of infinite index in Γ ∩ GLn (ℤ). In this section, we investigate the question of whether a generic finitely generated subgroup of SLn (ℤ) is thin. Our notion of genericity is described via a Euclidean model, as we discuss below. This question is motivated in part by recent developments in number theory (see [59], [175], [436], etc.) which have made approachable previously unsolved arithmetic problems involving thin groups (see, for example, [147] and 1 A celebrated result of von Neumann states that any unitarily invariant matrix norm is a symmetric gauge on the space of singular values [491].
3.7 Properties of random subgroups | 91
[270] for an overview). Given these new ways to handle such groups in arithmetic settings, it has become of great interest to develop a better understanding of thin groups in their own right. For example, [148], [466] and [64] have answered the question of telling whether a given finitely generated group is thin (given in terms of its generators) in various settings. Our question of asking whether such a group is generically thin of a similar flavor, but one has more freedom in avoiding the more difficult cases (as we show, the generic two-generator subgroup of SLn (ℤ) is free when generic is defined appropriately, and yet in most concrete examples investigated thus far, one does not have freeness). Aoun’s result in the context of SLn (ℤ) that any two independent random walks on SLn (ℤ) generate a free group (we give a short proof of this here as Theorem 3.9.3) implies that with this combinatorial definition of genericity, a finitely generated subgroup of SLn (ℤ) is generically of infinite index in SLn (ℤ) if n ≥ 3 (Aoun shows this by proving that any two such random walks will yield a ping-pong pair; see Section 3.9.1 for a definition of ping-pong, and Theorem 3.9.3 for a proof of a version of Aoun’s result). Combining this with Rivin’s result in [422] that in the combinatorial model a generic finitely generated subgroup of SLn (ℤ) is Zariski dense in SLn (ℂ), thinness is generic in the combinatorial set-up. It is hence perhaps reasonable to expect that thinness should be generic in the following Euclidean model. Let G = SLn (ℤ), and let BX denote the set of all elements in G of norm at most X, where norm is defined as ‖γ‖2 := λmax (γ t γ),
(3.16)
where λmax denotes the largest eigenvalue. Our task in this chapter is to choose two elements g1 , g2 uniformly at random from BX and to consider lim μX ({g = (g1 , g2 ) ∈ G2 | Γ(g) is of infinite index in G}),
X→∞
where Γ(g) = ⟨g1 , g1−1 , g2 , g2−1 ⟩ and μX is the measure on G×G induced by the normalized counting measure on B2X . If the above limit is 1, we say that the generic subgroup of G generated by two elements is infinite index in G. In general, we say that the generic subgroup of G generated by two elements has some property P if lim μX ({g = (g1 , g2 ) ∈ G2 | Γ(g) has property P}) = 1.
X→∞
It is a result of the author [424] that in this model, two randomly chosen elements do generate a Zariski dense subgroup, and we might expect that, just as in the combinatorial setting, two randomly chosen elements of G in the Euclidean model will also form a ping-pong pair with probability tending to 1 (i. e., that the generic twogenerator subgroup of G is generated by a ping-pong pair in particular). Surprisingly, we use Breuillard–Gelander’s [65] characterization of ping-pong for SLn (ℝ) over projective space to show that while this is the case for n = 2, it is not true if n > 2 (see
92 | 3 Randomness and computation in linear groups Theorem 3.9.4), and so further work must be done to prove that thinness is generic in this model, if it is in fact the case. However, if one “symmetrizes” the ball of radius X in a natural way, by imposing a norm bound on both the matrix and its inverse, we show that two elements chosen at random in such a modified model will, in fact, be a ping-pong pair over a suitable space with probability tending to 1, and this enables us to show that, in this modified set-up, the generic subgroup of SLn (ℤ) generated by two elements is thin (our methods extend to any arbitrary finite number of generators in a straightforward way, as well). This modified Euclidean model is identical to the one described above, but BX will be replaced by BX (G) := {g ∈ G | g, g −1 ∈ BX },
(3.17)
and the measure μX is replaced by μX , the normalized counting measure on (BX )2 . With this notation, we show the following. Theorem 3.7.1. Let G = SLn (ℤ), where n ≥ 2, and let BX (G) and μX be as above. Then we have 2
lim μX ({(g1 , g2 ) ∈ (BX (G)) | ⟨g1 , g2 ⟩ is thin}) = 1.
X→∞
3.8 Subgroups of SL2 (ℤ) In this section, we prove the following. Theorem 3.8.1. Let G = SL2 (ℤ), and let Γ(g) and μX be as above. Then we have lim μX ({g = (g1 , g2 ) ∈ G2 | Γ(g) is thin}) = 1.
X→∞
In this case, we are able to prove generic thinness even in the usual Euclidean ball model, which we do next. We separate the 2-dimensional case from the other cases for several reasons. One is that in this case we have the very natural action of SL2 on the upper half-plane to work with, and another is that the general strategy is the same as in the higher-dimensional cases yet more straightforward. The idea is to show generic freeness using a ping-pong argument which follows essentially from studying the generators’ singular values (i. e., eigenvalues of git gi ) and using equidistribution results. Specifically, Theorem 3.8.1 will follow from three lemmas which we prove below. Lemma 3.8.2 will imply Lemma 3.8.3, that the generic Γ(g) is free. In fact, it is Schottky – namely, one expects that there exist four disjoint and mutually external circles C1 , C2 , C3 , C4 in ℍ such that for 1 ≤ i ≤ 2 the generator gi of Γ maps the exterior of Ci onto the interior of Ci+2 , and the generator gi−1 of Γ maps the exterior of Ci+2 onto the interior of Ci . We then use Lemma 3.8.3 to show that generically the limit set of Γ acting on ℍ
3.8 Subgroups of SL2 (ℤ)
| 93
has arbitrarily small Hausdorff dimension in Lemma 3.8.4, which immediately implies Theorem 3.8.1, and in fact shows that the generic subgroup generated by two elements in SL2 (ℤ) is “arbitrarily thin” (Hausdorff dimension is a natural measure of thinness; if it is not 1, then it is thin, and the smaller the Hausdorff dimension, the thinner the group). Let BX := {γ ∈ SL2 (ℤ) | ‖γ‖ ≤ X} and note that |BX | ∼ c ⋅ X 2 for some constant c (see [114] and [98]). Also, for a fixed T > 0 we have 1+ϵ {γ ∈ BX s. t. tr(γ) < T} ≪ϵ T ⋅ X so that for any T lim μX ({γ ∈ G | tr(γ) > T}) = 1.
X→∞
(3.18)
In particular, we have limX→∞ μX ({γ ∈ G | tr(γ) > 2}) = 1. So, with probability tending to 1, each of the generators gi of Γ(g) will have two fixed points on the boundary S of ℍ. Write ai ci
gi = (
bi ). di
(3.19)
Then the fixed points of gi are (di − ai ) ± √(ai + di )2 − 4 2ci
,
which, since the trace of gi is large, approaches (di − ai ± (ai + di ))/(2ci ) = di /ci , −ai /ci as X goes to infinity. One of these points is attracting – call it αi – and the other is repelling – call it βi . The distance between these points is |(ai + di )/ci |, which is large with high probability. Furthermore, there exists a circle Ci containing αi and a circle Ci+2 containing βi , both of radius 1/|ci | (the isometric circles), such that gi maps the exterior of Ci onto the interior of Ci+2 and gi−1 maps the exterior of Ci+2 onto the interior of Ci . Note that as X tends to infinity, the probability that the radii of these circles are small is large, and that generically Ci and Ci+2 are disjoint since gi is hyperbolic with high probability. So far we have selected g1 uniformly at random out of a ball of radius X. As discussed above, as X → ∞, the probability that g1 is hyperbolic of large trace tends to 1. Now we select a second element g2 uniformly at random out of a ball of radius X, also of large trace with probability tending to 1. Let C1 , C3 , C2 , C4 be defined as above. We would like to show that as X → ∞, the probability that the circles C1 , C2 , C3 , C4 are
94 | 3 Randomness and computation in linear groups mutually external and disjoint (i. e., that g1 and g2 form a Schottky pair) tends to 1. Let r(Ci ) denote the radius of Ci . Note that for any ϵ > 0, we have lim μX ({(g1 , g2 ) ∈ G2 | max(r(Ci )) < ϵ}) = 1,
X→∞
i
so our desired statement about the disjointness of the circles Ci will follow from the following lemma. Lemma 3.8.2. For any pair (g1 , g2 ) ∈ G2 such that | tr(gi )| > 2 for i = 1, 2, let αi and βi denote the fixed points of gi . Let dmin (g1 , g2 ) := min(d(α1 , β2 ), d(α1 , α2 ), d(β1 , β2 ), d(β1 , α2 )), where d(⋅, ⋅) denotes hyperbolic distance. Then for any r > 0 we have lim μX ({(g1 , g2 ) ∈ G2 | tr(gi ) > 2, dmin (g1 , g2 ) > r}) = 1.
X→∞
(3.20)
Proof. From (3.18) the probability that the traces of gi are different and greater than 2 tends to 1. We may therefore restrict to nonconjugate hyperbolic pairs (g1 , g2 ) in proving Lemma 3.8.2. Specifically, it suffices to show that lim μX ({(g1 , g2 ) ∈ G2 | tr(g1 ) ≠ tr(g2 ), dmin (g1 , g2 ) > r}) = 1.
X→∞
(3.21)
Note now that to every pair of nonconjugate elements g1 , g2 ∈ G one can associate a unique pair of distinct closed geodesics L1 , L2 on G\ℍ fixed by g1 and g2 , respectively. Furthermore, the length ℓ(Li ) is 2 ℓ(Li ) = ((tr(gi ) + √tr2 (gi ) − 4)/2)
for i = 1, 2. Therefore our measure μX on G also induces a measure on the set S of closed geodesics on G\ℍ, and for a fixed T > 0, lim μX ({L ∈ S | ℓ(L) > T}) = 1
X→∞
from (3.18). With this in mind, (3.21) follows from the equidistribution of long closed geodesics2 which we summarize below from [439]. To each closed geodesic L on G\ℍ one associates the measure νL on G\ℍ which is arc length supported on L. Let ν = π3 ⋅ dxy2dy . For a finite set S of closed geodesics, let ℓ(S) = ∑L∈S ℓ(L), where ℓ(L) is the length of L, and define the measure νS on G\ℍ by νS =
1 ⋅∑ν . ℓ(S) L∈S L
2 The equidistribution of long closed geodesics was first proved spectrally by Duke, Rudnick and Sarnak in [114] and ergodic-theoretically by Eskin and McMullen in [127] shortly after.
3.8 Subgroups of SL2 (ℤ)
| 95
Let S be the set of all closed geodesics on G\ℍ, and let S (t) = {L ∈ S | ℓ(L) < t}. Then we have (see [114] or [127]) that as t → ∞ the measures νS (t) → ν. Therefore, given the association of pairs (g1 , g2 ) in (3.21) with pairs of closed geodesics, the pairs of fixed points (αi , βi ) of gi chosen uniformly at random from a ball of radius X also equidistribute as X → ∞, as desired. Lemma 3.8.2 implies that generically the isometric circles associated with g1 and g2 are disjoint. This is precisely what is needed for Γ to be Schottky, and so the following lemma is immediate. Lemma 3.8.3. Let G = SL2 (ℤ), and let Γ(g) and μX be as before. Then we have lim μX ({g ∈ Gk s. t. Γ(g) is Schottky}) = 1.
X→∞
In other words, as X → ∞, the picture of the fixed points and isometric circles of g1 and g2 is generically as in Figure 3.2, namely, the isometric circles are disjoint. Since the generic group Γ generated by g1 and g2 is Schottky, it is in fact free.
Figure 3.2: The generic picture of a two-generator subgroup of SL2 (ℤ) acting on ℍ.
To show that Γ will be infinite index in G with high probability, we will use the fact that it is Schottky with high probability to obtain upper bounds on the Hausdorff dimension of its limit set, i. e., the critical exponent of ∑ e−δ⋅d(x,γy) ,
γ∈Γ
showing that it is arbitrarily small with high probability. Lemma 3.8.4. Let gi , Γ = Γ(g) and μX be as above, and let δ(Γ) denote the Hausdorff dimension of the limit set of Γ. Then for every ϵ > 0 we have lim μX ({g = (g1 , g2 ) s. t. 0 < δ(Γ) < ϵ}) = 1.
X→∞
96 | 3 Randomness and computation in linear groups Proof. Let C1 , C2 , C3 , C4 be as above, and let Ii denote the i-th coordinate of I = (g1 , g2 , g1−1 , g2−1 ). By Lemma 3.8.3, the circles Ci are mutually external and disjoint, and the generators gi of Γ map the exterior of Ci onto the interior of Ci+2 while their inverses do the opposite. We recall the following set-up from [38]. Let K = 3, let a(j) denote the center of Cj and let r(j) denote the radius of Cj . Define Σ(m) to be the set of all sequences (i1 , . . . , im ), where i1 , . . . , im ∈ {1, 2, 3, 4}
and
i1 ≠ i2 ± 2, . . . , im−1 ≠ im ± 2.
Define I(i1 , . . . , im ) = Ii1 ⋅ ⋅ ⋅ Iim−1 (Cim ), where (i1 , . . . , im ) ∈ Σ(m), m ≥ 2 and Is It (x) = Is (It (x)). It is shown in [38] that the limit set Λ(Γ) of Γ can then be written as ∞
Λ(Γ) = ⋂
⋃
m=1 (i1 ,...,im )∈Σ(m)
I(i1 , . . . , im ),
(3.22)
and its Hausdorff dimension δ(Γ) has the upper bound δ(Γ) ≤ where λ=
max
1≤i=j≤K+1 ̸
− log K , 2 ⋅ log λ
r(i) . |a(i) − a(j)| − r(j)
(3.23)
(3.24)
Since in our set-up all radii r(i) tend to zero and the distances between the centers a(i) are large with high probability as X tends to infinity, (3.23) and (3.24) imply that for any ϵ > 0 lim μX ({g = (g1 , g2 ) s. t. δ(Γ) < ϵ}) = 1,
X→∞
as desired. To see that δ(Γ) > 0 with probability tending to 1, note that by theorems of Rivin in [424] and Kantor and Lubotzky in [236], finitely generated subgroups of SLn (ℤ) are Zariski dense in SLn (ℂ) with probability tending to 1 (in particular with our Euclidean measure). Therefore δ(Γ) > 0 with probability tending to 1.
3.9 Subgroups of SLn (ℤ) for n > 2
| 97
3.9 Subgroups of SLn (ℤ) for n > 2 In the previous section, we considered the action of SL2 on the upper half-plane ℍ, and showed that with probability tending to 1 the group Γ(g) is a Schottky group via a pingpong method on the boundary of ℍ. In this section, we use results of Breuillard and Gelander in [65] to prove an analogous statement for SLn (ℤ) for n > 2 after somewhat changing the notion of a ball of radius T. We also discuss what happens if we consider the usual Euclidean ball model in SLn where n > 2, for which this strategy will not prove that finitely generated subgroups of SLn (ℤ) are generically thin. 3.9.1 Ping-pong A natural analogue in higher rank of the methods used in Section 3.8 to prove generic thinness is the ping-pong argument for subgroups of SLn (ℝ), which is examined in [65] (note that [65] also considers SLn over non-Archimedean fields). We recall the relevant results here. We no longer have that SLn (ℝ) acts on ℍn in the nice way that SL2 (ℝ) acts on ℍ, and so we consider the action of SLn (ℝ) on real projective space ℙn−1 (ℝ), viewed as an n-dimensional vector space. We define the distance in ℙn−1 (ℝ) by d([v], [w]) =
‖v ∧ w‖ , ‖v‖ ⋅ ‖w‖
(3.25)
where [v] denotes the line spanned by v, and ‖v ∧ w‖ is defined as follows. Writing v∧w =
∑ (vi wj − vj wi )ei ∧ ej ,
1≤i 2ϵ > 0, and if the attracting points of gi and gi−1 are at least distance r apart from the repulsive hyperplanes of gj and gj−1 in ℙn−1 (ℝ), where i ≠ j. In the above definition, an element γ ∈ SLn (ℝ) is said to be (r, ϵ)-very proximal if both γ and γ −1 are (r, ϵ)-proximal, i. e., both γ and γ −1 are ϵ-contracting with respect to some attracting point vγ ∈ ℙn−1 (ℝ) and some repulsive hyperplane Hγ , such that d(vγ , Hγ ) ≥ r. Finally, γ is called ϵ-contracting if there exist a point vγ ∈ ℙn−1 (ℝ) and a projective hyperplane Hγ such that γ maps the complement of the ϵ-neighborhood of Hγ into the ϵ-ball around vγ . One can hope to then prove generic thinness in SLn (ℤ) by proving that two elements chosen uniformly at random out of a ball in SLn (ℤ) will be (r, ϵ)-very proximal, and that their corresponding attracting points and repulsive hyperplanes are at least r apart with probability tending to 1 as the radius of the ball grows to infinity. By Proposition 3.1 in [65], the necessary and sufficient condition for γ to be ϵ-contracting can be stated simply in terms of the top two singular values of γ. Theorem 3.9.2 (Proposition 3.1 [65]). Let ϵ < 1/4 and let γ ∈ SLn (ℝ). Let a1 (γ) and a2 (γ) be the largest and second-largest singular values of γ, respectively (i. e., largest (γ) ≤ ϵ2 , then γ is ϵ-contracting. More preand second-largest eigenvalues of γ t γ). If aa2 (γ) 1
cisely, writing γ = kγ aγ kγ , one can take Hγ to be the projective hyperplane spanned by {kγ −1 (ei )}ni=2 , and vγ = kγ (e1 ), where (e1 , . . . , en ) is the canonical basis of ℝn . Conversely, suppose γ is ϵ-contracting. Then
a2 (γ) a1 (γ)
≤ 4ϵ2 .
3.9 Subgroups of SLn (ℤ) for n > 2
| 99
Hence, if one could show that an element chosen uniformly at random out of a ball in SLn (ℤ), as well as its inverse, is expected to have a “large” ratio between the secondlargest and largest singular value, then (r, ϵ)-proximality would follow by appealing to equidistribution in sectors in SLn (ℝ). One can also use the work of [65] to prove that two given elements form a ping-pong pair when viewed as acting on ℙ(⋀k (ℝn )), where ⋀k denotes the exterior power, for a suitable k. To do this, one would in particular need each of the two elements and their inverses to have a large ratio between the k-th and (k + 1)-st singular value. Interestingly, as we show in the next section, none of these properties of singular values are generic in the usual Euclidean ball model if n > 2. We are, however, able to show that the middle two singular values have large ratio with probability tending to 1 in a modified Euclidean model, and with this we are able to prove statements on generic thinness in Section 3.9.3. Note, however, that at this point we can already show a version of Aoun’s theorem. Theorem 3.9.3. Given two long random products w1 , w2 of generators of a Zariski dense subgroup Γ of SLn (ℤ), w1 and w2 generate a free subgroup. Proof. By the results of Guivarc’h and Raugi [193] (see also Goldsheid and Margulis [172]), the assumption of Zariski density implies that the Lyapunov exponents of Γ with the given generating set are distinct, and so for the ratio of the top singular value to the second-biggest grows exponentially fast as a function of n. Since the words are also known to be equidistributed in sectors (see, e. g., [58]), the result follows.
3.9.2 ϵ-contraction in SLn (ℤ) where n > 2 In this section we prove the following theorem, which from the discussion in the previous section implies in particular that if n > 2, it is not true that two elements chosen uniformly at random out of a Euclidean ball of radius X will generically form a pingpong pair in ℙn−1 (ℝ) (and hence one cannot conclude that the group generated by two such elements is generically thin via this route). Furthermore, a similar result will hold when one considers ℙ(⋀k (ℝn )) for various k, so even with this strategy, the best that one can prove via this method is that a randomly chosen finitely generated subgroup of SLn (ℤ) will be thin with some positive probability (unfortunately, the probability given by our argument decreases as a function of n.) Instead of comparing the number of elements in SLn (ℤ) in a ball which are ϵ-contracting to the total number of elements in the ball, we compare the measures of the analogous sets in SLn (ℝ). This is essentially identical to the comparison over ℤ by Theorem 1.4 of [127] after one proves that the sets CX,η := {diag(α1 , . . . , αn ) | αi ∈ ℝ, X ≥ α1 ≥ ηα2 ; α2 ≥ α3 ≥ ⋅ ⋅ ⋅ ≥ αn , ∏ αi = 1}, i
100 | 3 Randomness and computation in linear groups where η > 16 is fixed, make up a well-rounded sequence of sets in the sense of Definition 3.10.2. Theorem 3.9.4. For n ≥ 3 fixed, let G = SLn (ℝ), and let μ be a Haar measure on SLn (ℝ). For g ∈ G, denote by a1 (g), a2 (g), . . . , an (g) the nonzero entries of the diagonal matrix ag in the KAK decomposition of g in (3.27) with a1 (g) ≥ a2 (g) ≥ ⋅ ⋅ ⋅ ≥ an (g) > 0. Fix η > 4. Then 0
2
n(n−1)
|
101
where C = 1/2 2 . Substituting jn = −j1 − j2 − ⋅ ⋅ ⋅ − jn−1 , it is clear that the maximum on R of any given exponential in the sum above is at most (n2 − n) log X, reached always at the point (log X, log X, . . . , log X, −(n − 1) log X). In fact, from [114] the integrals above 2 are asymptotic to cX n −n for some nonzero constant c. The second integral, over the intersection of R with the subset of R where j1 −j2 ≥ T, is the same as above, except that the upper limit of j2 is replaced by j1 − T. Since these two integrals (one over R and the other over a subregion of R) differ only in the range of j2 , and noting that asymptotically only the terms es1 j1 +⋅⋅⋅sn−1 jn−1 where ∑ si = n2 − n contribute, we write ∫( ∏ sinh(λ(H)))dH R
λ∈Σ+
log X j1
∼C⋅ ∫ ∫ 0
∼
∑
i+k=n2 −n 2≤i≤2n−2 i∈2ℤ
−j1 n−1
∑
2
i+k=n −n 2≤i≤2n−2 i∈2ℤ
ai (n)eij1 +kj2 dj2 dj1
ai (n) (n2 −n) log X e k(i + k)
2
= αX n −n for some α > 0, and ∫
( ∏ sinh(λ(H)))dH
+ R∩{j1 −j2 ≥T} λ∈Σ
log X j1 −T
∼C⋅ ∫ ∫ 0
∼
∑
i+k=n2 −n 2≤i≤2n−2 i∈2ℤ
−j1 n−1
∑
i+k=n2 −n 2≤i≤2n−2 i∈2ℤ
ai (n)eij1 +kj2 dj2 dj1
ai (n) (n2 −n) log X −kT e ⋅e . k(i + k)
Therefore, since k ≤ n2 − n − 2n + 2 = n2 − 3n + 2, we have ∫
2
( ∏ sinh(λ(H)))dH ∼ βX n −n ,
+ R∩{j1 −j2 ≥T} λ∈Σ
where 2
0 < β < αe−(n −3n+2)T , which proves the claim.
102 | 3 Randomness and computation in linear groups 3.9.3 Proof of Theorem 3.7.1 Although we have shown in the previous section that two elements chosen uniformly at random from a Euclidean ball in SLn (ℤ) are not expected to form a ping-pong pair, we show in this section that two elements chosen uniformly at random from a rather natural modification of the notion of Euclidean ball in G := SLn (ℤ) will form a pingpong pair. Let BX (G) := {γ ∈ G | ‖γ‖ < X and ‖γ −1 ‖ < X}, and let μX denote the normalized counting measure on (BX )2 with respect to this region. We then have the following. Proposition 3.9.5. Let G = SLn (ℤ), and let Γ(g) and μX be as above. Then we have lim μX ({g = (g1 , g2 ) ∈ G2 | Γ(g) is free}) = 1.
X→∞
As we note below, this is almost Theorem 3.7.1. To prove this proposition, we show that generically our generators will form a ping-pong pair in some suitable space ℙ(⋀k (ℝn )). According to [65], the first step is to show that they are generically ϵ-very contracting. Lemma 3.9.6. Let G = SLn (ℤ), and let μX be as above. For g ∈ G, denote by a1 (g), . . . , an (g) the nonzero entries of the diagonal matrix ag in the KAK decomposition of g in (3.27) with a1 (g) ≥ a2 (g) ≥ ⋅ ⋅ ⋅ ≥ an (g) > 0. Fix η > 4. Then, if n is even, we have a (g) a (g −1 ) lim μX ({g ∈ G k ≥ η2 and k −1 ≥ η2 }) = 1, ak+1 (g) X→∞ ak+1 (g ) where k = n/2. If n is odd, we have a (g) ak+1 (g) ak (g −1 ) ak+1 (g −1 ) , ) ≥ η2 }) = 1, , , lim μX ({g ∈ G min( k X→∞ ak+1 (g) ak+2 (g) ak+1 (g −1 ) ak+2 (g −1 ) where k = (n − 1)/2. Note that given the definition of k in the two cases in Lemma 3.9.6, whenever the mentioned ratios between singular values of g are large enough, so are the ratios between the relevant pairs of the singular values of g −1 . Hence we need only to prove the statements above for singular values of g. Proof. Let T = 2 log η and consider, as X → ∞, the ratio |{g ∈ BX (G) | jk (g) − jk+1 (g) ≥ T}| |{g ∈ BX (G)}|
3.9 Subgroups of SLn (ℤ) for n > 2
|
103
if n = 2k is even, and |{g ∈ BX (G) | jk (g) − jk+1 (g) ≥ T, jk+1 (g) − jk+2 (g) ≥ T}| |{g ∈ BX (G)}| if n = 2k + 1 is odd, where ji (g) is defined as in the previous section. We replace now G by SLn (ℝ), and the norm above with the Haar measure, noting that, as in the proof of Theorem 3.9.4, the ratio we get in this way will be asymptotic to the ratio above by Theorem 1.4 in [127] (see Theorem 3.10.1) along with Lemmas 3.10.3 and 3.10.5. We consider all elements γ ∈ SLn (ℝ) such that ‖γ‖2 = λmax (γ t γ) ≤ X 2 and ‖γ −1 ‖ ≤ 2 X ; in other words, in the above notation, consider the region R as the convex polygon defined by e|j1 | ≤ X,
e
|j2 |
≤ X, .. .
j1 ≥ j2 ≥ ⋅ ⋅ ⋅ ≥ jn , j1 + ⋅ ⋅ ⋅ + jn = 0,
(3.32)
e|jn | ≤ X. The first integral in question is ∫( ∏ sinh(λ(H)))dH. R
λ∈Σ+
Again we expand this as a sum of integrals of exponentials. If n is even, we obtain n/2
C ⋅ ∫ ∑ sgn(σ)e∑m=1 (n−2m+1)(jσ(m) −jσ(n/2+m) ) djn−1 ⋅ ⋅ ⋅ dj1 , R σ∈Sn
(3.33)
and if n is odd, (n−1)/2
C ⋅ ∫ ∑ − sgn(σ)e∑m=1
(n−2m+1)(jσ(m) −jσ((n−1)/2+m) )
R σ∈Sn
djn−1 ⋅ ⋅ ⋅ dj1 ,
(3.34)
n(n−1)
where C = 1/2 2 . Note that on R, the maximum of any one of the exponentials in the sums above is, depending on σ, X 2n−2+2n−4+⋅⋅⋅+2
(3.35)
or of smaller order both for n even and odd. When this maximum is achieved (i. e., when one considers an appropriate σ in the sum above), it is achieved at the point P = (log X, log X, . . . , log X, − log X, − log X, . . . , − log X) in the even case, and at Q = (log X, log X, . . . , log X, 0, − log X, − log X, . . . , − log X) in the odd case. Since the exponentials are all exponentials of linear functions, and since P and Q are contained in R
104 | 3 Randomness and computation in linear groups (in the even and odd cases, respectively), the maximum obtained on R for any one of the exponentials in the sums above is precisely the expression in (3.35). We now separate the two cases of n being even and odd. Case 1: We have n = 2k > 2 is even. Let f (j1 , . . . , jn−1 ) be the constant multiple of the sum of exponentials in (3.33). We will show that 2
∫ f (j1 , . . . , jn−1 )djn−1 . . . dj1 ∼ αX 2(n−1)+2(n−3)+⋅⋅⋅+2 = αX n /2
(3.36)
R
for some constant α. First, note that f (j1 , . . . , jn−1 ) is nonnegative on R, and hence an integral of f over a subregion of R will give a lower bound on this integral. Consider the subregion R obtained by taking the intersection of R with the region ji−1 ≥ ji + M
for all 2 ≤ i ≤ n,
where M = log(n! + 1). Now, any exponential in the sum of exponentials defining f has 2 a maximum of order less than cX n /2 , where c > 0, unless it is of the form e(n−1)jσ(1) +(n−3)jσ(2) +⋅⋅⋅+jσ(n/2) −jτ(n/2+1) −3jτ(n/2+2) −⋅⋅⋅−(n−1)jτ(n) , where σ, τ ∈ Sn/2 , and hence we may replace f in our computation by g(j1 , . . . , jn−1 ) :=
∑ (sgn(στ)e(n−1)jσ(1) +(n−3)jσ(2) +⋅⋅⋅+jσ(n/2) −jτ(n/2+1) −3jτ(n/2+2) −⋅⋅⋅−(n−1)jτ(n) ).
σ,τ∈Sn/2
The region R was defined in such a way that g is at least c ⋅ e(n−1)j1 +(n−3)j2 +⋅⋅⋅+jn/2 −jn/2+1 −3jn/2+2 −⋅⋅⋅−(n−1)jn on R , for some positive constant c. We hence show that the integral 2
∫ c ⋅ e(n−1)j1 +(n−3)j2 +⋅⋅⋅+jn/2 −jn/2+1 −3jn/2+2 −⋅⋅⋅−(n−1)jn djn−1 ⋅ ⋅ ⋅ dj1 ≫ X n /2 ,
(3.37)
R
thus proving the asymptotic in (3.36), since the maximum of f over R is of order at most 2 X n /2 , as discussed above. To prove (3.37), note that the exponential in the integral in (3.37) is positive everywhere, and so it is bounded below by 2
∫ c ⋅ e(n−1)j1 +(n−3)j2 +⋅⋅⋅+jn/2 −jn/2+1 −3jn/2+2 −⋅⋅⋅−(n−1)jn djn−1 ⋅ ⋅ ⋅ dj1 ≫ X n /2 , R ∩Bϵ
where ϵ > 0 and n/2
Bϵ := [log X − ϵ , log X]
n/2
× [− log X, − log X + ϵ ]
.
(3.38)
3.9 Subgroups of SLn (ℤ) for n > 2
|
105 2
Since the minimum of the exponential above over R ∩ Bϵ is easily seen to be c X n /2 for a constant c > 0 depending on ϵ and M, the integral in (3.38) is bounded below 2 by V(R ∩ Bϵ ) ⋅ c X n /2 , where V(R ∩ Bϵ ) denotes the area of R ∩ Bϵ which is at least a constant depending on ϵ and M. Thus (3.37) holds and so the asymptotic in (3.36) is correct. Let T > 0 be fixed, and let RT denote the region defined by jn/2 ≤ jn/2+1 + T. We now show that ∫ f (j1 , . . . , jn−1 )djn−1 ⋅ ⋅ ⋅ dj1 R∩RT 2
is of lower order than X n /2 , proving Lemma 3.9.6 for even n. To do this, consider any one of the exponentials in the sum defining f . It is of the form e(n−1)jσ(1) +(n−3)jσ(2) +⋅⋅⋅+jσ(n/2) −jσ(n/2+1) −3jσ(n/2+2) −⋅⋅⋅−(n−1)jσ(n) , where σ ∈ Sn . In the case that the coefficients of the jn/2 and jn/2+1 terms have opposite sign, the maximum value over R ∩ BT of ∑i=n/2,n/2+1 ai ji in the exponent is bounded ̸ 2
2
above by ( n2 − an/2 + an/2+1 ) log X if an/2+1 is negative, or ( n2 + an/2 − an/2+1 ) log(X) if an/2 is negative. Now consider the remaining part of the exponent, an/2 jn/2 + an/2+1 jn/2+1
≤ (an/2 + an/2+1 )jn/2 − (an/2 + an/2+1 )T ≤ (an/2 + an/2+1 )jn/2 .
(3.39)
Adding this to the upper bound on the maximum value of the rest of the terms in the exponent, we get an upper bound of (
n2 − 1) log X 2 2
for the exponent, and hence an upper bound of X n /2−1 for the corresponding exponential in the sum defining f . If the coefficients of the jn/2 and jn/2+1 terms have the same sign, then the maximum value over R ∩ BT of ∑i=n/2,n/2+1 ai ji in the exponent is ̸ 2
bounded above by ( n2 − an/2 − an/2+1 − 2) log X, and adding the remaining part of the exponent in (3.39) we get an upper bound of (
n2 − 2) log X 2 2
for the exponent, and hence an upper bound of X n /2−1 for the corresponding exponential in the sum defining f . Hence, 2
∫ f (j1 , . . . , jn−1 )djn−1 ⋅ ⋅ ⋅ dj1 ≪ X n /2−1 , R∩RT
(3.40)
106 | 3 Randomness and computation in linear groups which is of lower order than the asymptotic for the integral over R in (3.36). This concludes the proof of the lemma for even n. Case 2: We have n = 2k + 1 ≥ 3 is odd. This case is extremely similar to the even case above, so we omit all of the details, but note the key differences. In this case, the maximum over R of any given exponential in the sum defining f is X n(n−1)/2 , obtained at the point (log X, log X, . . . , log X, 0, − log X, − log X, . . . , − log X). The problem of finding the asymptotic for the integral of f over R reduces to obtaining a suitable lower bound on the asymptotic for ∫ c ⋅ e(n−1)j1 +(n−3)j2 +⋅⋅⋅+2jk −2jk+2 −4jn/2+2 −⋅⋅⋅−(n−1)jn djn−1 ⋅ ⋅ ⋅ dj1
(3.41)
R
over a region R defined very similarly as the one in the even case. This lower bound is obtained by considering the integral over R ∩ Bϵ , where Bϵ is defined to be the region n/2
Bϵ := [log X − ϵ , log X] jk+1
n/2
× [−ϵ , ϵ ] × [− log X, − log X + ϵ ]
.
Next, for T1 , T2 > 0 we define RT1 ,T2 to be the region defined by jk ≥ jk+1 + T1 , ≥ jk+2 + T2 , which is the analogue of RT in case 1. The proof that 2
f (j1 , . . . , jn−1 )djn−1 ⋅ ⋅ ⋅ dj1 ≪ X n −2 log X
∫ R∩RT1 ,T2
is almost identical to the proof of (3.40) above, and hence the statement in the lemma holds for odd n as well. The spectral gap from the previous lemma gives us that our generators will be ϵ-contracting with probability tending to 1 in a suitable space. Specifically, consider v, w ∈ ℙ(⋀k ℝn ). We define a metric d([v], [w]) =
‖v ∧ w‖ , ‖v‖ ⋅ ‖w‖
(3.42)
where [v] denotes the line spanned by v, and ‖v‖, ‖w‖ and ‖v ∧ w‖ are defined in a canonical way after fixing a basis B := {ei1 ∧ ⋅ ⋅ ⋅ ∧ eik | 1 ≤ i1 < i2 < ⋅ ⋅ ⋅ < ik ≤ n} for ⋀k (ℝn ). For example, writing v=
∑
1≤i1 2. Note that for our purposes in Theorem 3.10.1 to obtain an asymptotic count of the number of points in |Γv ∩ Bn |, we need only show that {Bn }n > N is a well-rounded sequence for some N ∈ ℕ. For T ∈ ℝ, let CT := {diag(α1 , . . . , αn ) | αi ∈ ℝ, T ≥ α1 ≥ ⋅ ⋅ ⋅ ≥ αn ≥ 1/T, ∏ αi = 1}, i
and, fixing β > 16, let CT,β := {diag(α1 , . . . , αn ) | αi ∈ ℝ, T ≥ α1 ≥ ⋅ ⋅ ⋅ αn/2 ≥ βαn/2+1 ≥ ⋅ ⋅ ⋅ ≥ αn ≥ 1/T, ∏ αi = 1}. i
if n is even, and let CT,β := {diag(α1 , . . . , αn ) | αi ∈ ℝ,
T ≥ α1 ≥ ⋅ ⋅ ⋅ α(n−1)/2 ≥ βα(n+1)/2 ≥ β2 α(n+3)/2 > α(n+3)/2 ≥ ⋅ ⋅ ⋅ ≥ αn ≥ 1/T, ∏ αi = 1}. i
if n is odd. In [127] it is shown that the sequence {KCT K} of regions whose volume tends to infinity as T → ∞ is well rounded. We show that this is true for the other sets above as well. We begin with the sequence {KCT K}. Lemma 3.10.3. Let CT be defined as above. Then the sequence {KCT K} of regions (whose volume tends to infinity as T → ∞) is well rounded. Proof. We first show that for any ϵ > 0 there is a neighborhood 𝒩1,ϵ of identity in SLn (ℝ) such that K. ⋃ gKCT K ⊂ KC(1+ϵ)T
g∈𝒩1,ϵ
√1 + ϵ. Then 𝒪 n−1√1+ϵ contains Let 𝒪 n−1√1+ϵ be the neighborhood of identity of radius n−1 U1 V for some neighborhood U1 of identity in K and some neighborhood V of identity in A. By the strong wavefront lemma (Theorem 2.1 in [179]) there exists a neighborhood 𝒩1,ϵ of identity in SLn (ℝ) such that 𝒩1,ϵ γ ⊂ k1 U1 aVk2 = k1 U1 Vak2
3.10 Well-roundedness | 111
for all γ = k1 ak2 in KCT K. Hence we have 𝒩1,ϵ KCT K ⊂ KU1 VCT K ⊂ K 𝒪 n−1√1+ϵ CT K ⊂ KC(1+ϵ)T K.
The last containment follows from the submultiplicativity of the spectral norm. √1 + ϵ < Namely, for any g ∈ 𝒪 n−1√1+ϵ and any h ∈ CT , we have ‖gh‖ ≤ ‖g‖ ⋅ ‖h‖ < T n−1 −1 −1 −1 (1+ϵ)T, and ‖(gh) ‖ ≤ ‖h ‖⋅‖g ‖ < T(1+ϵ) as well, and so K 𝒪 n−1√1+ϵ CT K ⊂ KC(1+ϵ)T K. Also, for any 0 < ϵ < 1 there exists a neighborhood 𝒩2,ϵ of identity in SLn (ℝ) such that K ⊂ ⋂ gKCT K. KC(1−ϵ)T g∈𝒩2,ϵ
In other words, we have g −1 KC(1−ϵ)T K ⊂ KCT K
for every g ∈ 𝒩2,ϵ . To see this, take 𝒩2,ϵ to be the neighborhood of radius
g
−1
is of norm less than
1 1−ϵ
and, for any h ∈ 1 1− n−1
KC(1−ϵ)T K, we have ‖g −1 h‖
1 . √1−ϵ −1 n−1
Then
≤ ‖g ‖⋅‖h‖ < T,
and ‖(g h) ‖ ≤ ‖h ‖ ⋅ ‖g‖ < T(1 − ϵ) < T for all ϵ > 0. Hence we have g −1 h ∈ KCT K for all g ∈ 𝒩2,ϵ and h ∈ KC(1−ϵ)T K. Recall from Section 3.9.3 that m(KCT K) ∼ p(T), where p is a polynomial of degree 2 n /2 if n is even and n(n − 1)/2 if n is odd. For any 0 < ϵ < 1 one can find ϵ1 > 0 such 1 1 that (1−ϵ ) = p( 1+ϵ ) and for any ϵ > 0 there is an 0 < ϵ2 < 1 such that (1+ϵ ) = p( 1−ϵ ). 1 2 Let U = 𝒩1,ϵ1 ∩ 𝒩2,ϵ2 , and note that (3.44) is indeed satisfied for this choice of U. −1
−1
−1
Now, given that {KCT K}T≥1 is a well-rounded sequence of sets, we are able to prove the following lemma, using a slightly different definition of well-roundedness, which is equivalent to Definition 3.10.2 by [127]. Definition 3.10.4. The sequence {Bn } of sets is well rounded if for any ϵ > 0 there exists an open neighborhood U of the identity in SLn (ℝ) such that m(U𝜕Bn ) N is well rounded.
To prove this, we essentially use Lemma 3.10.3 together with the fact that there is some R > 0 such that section.
m(KCT,β K) m(KCT K)
> R and an analysis similar to that in the previous
112 | 3 Randomness and computation in linear groups Proof. First, note that KCT K and KCT,β K can be viewed as convex polytopes in ℝn ; for example, KCT K is described in this way in (3.32). In fact, the polygon corresponding to KCT,β K is obtained from KCT K by cutting the polygon corresponding to KCT K by the hyperplane j1 = j2 + β, i. e., it is the intersection of the polygon corresponding to KCT K with j1 > j2 + β. Now, by the proof of Lemma 3.9.6, for any η > 0, there is some N ∈ ℕ such that if T > N, m(KCT,β K)
m(KCT K)
> 1 − η = R.
(3.45)
Let ϵ > 0, and let U be the neighborhood of identity in SLn (ℝ) such that m(U𝜕(KCT K)) < Rϵ/2 m(KCT K) for all T. From the above, the boundary of KCT,β K is the union of part of the boundary of KCT K and a polygon PT sitting inside the hyperplane j1 = j2 + β. By the same argument as in the proof of Lemma 3.9.6, there exists N ∈ ℕ and a neighborhood U of identity in SLn (ℝ) so that m(U PT ) < Rϵ/2 for all T > N (here it is key that PT comes nowhere near the vertex of the polygon corresponding to KCT K where the functions in the integrals (3.33) and (3.34) obtain their maxima). Let U = U ∩ U . Then m(U 𝜕(KCT,β K)) K) m(KCT,β
0} by a c
(
b az + b )z = . d cz + d
Recall also that we can define a metric on H by setting d(z, w) = arccosh(1 +
|z − w|2 ), 2ℑzℑw
and, equipped with this metric, H is isometric to the hyperbolic plane ℍ2 . Infinitesimally, the area form of this metric is given by dxdy , where y = ℑz. In addition, the y2 action of SL(2, ℝ) by linear fractional transformations described above is isometric, and, indeed, every orientation preserving isometry of ℍ2 is obtained this way, so Isom ℍ2 ≃ P SL(2, ℝ) = SL(2, ℝ)/{±I}, where the quotient by plus and minus identity is needed because (−I)z = −z = z, for all −1 z ∈ H. We will also need the singular value decomposition. Recall that every matrix A in M m×n can be written as A = PDQ, where P ∈ O(m), Q ∈ O(n), and D is a diagonal m × n matrix with nonnegative diagonal elements (see, e. g., [221]). The diagonal elements of D are known as the singular values of A. It is well known (and easy to verify) that
118 | 3 Randomness and computation in linear groups the Frobenius norm of A equals the Euclidean (L2 ) norm of the vector of its singular values. In the special case where n = m = 2 and det A = 1, it is easy to see that the above implies that A can be written as cos ϕ − sin ϕ
sin ϕ x )( cos ϕ 0
A=(
0
cos θ
1)( − sin θ x
sin θ ), cos θ
for some x > 1. Further, as noted above, ‖A‖2 = x 2 + 1/x2 . 3.14.1 Translation distance A big part of the reason for introducing the singular value decomposition above is to give a palatable answer to the following question. Question 3.14.1. How far (in hyperbolic metric) does the matrix A = ( ac db ) ∈ SL(2, ℝ) move the point i? The main reason why the singular value decomposition helps is that cos θ − sin θ
(
sin θ ) i = i, cos θ
so with A as above, we have d(i, A(i)) = d(i, ix 2 ) = 2 log x. Since ‖A‖2 = x 2 +
1 , x2
it follows that ‖A‖2 = 2 cosh d(i, A(i)).
(3.49)
As a minor bonus, we can now modify our procedure PickHyperbolic to return a point in the upper half-plane in procedure PickHalfplane (see Algorithm 3.14.1).
3.14.2 The fundamental domain and orbits of the SL(2, ℤ) action The action of SL(2, ℤ) on H is discrete, and its fundamental domain Λ is one of the best-known images in all of mathematics (the reader can see it again in Figure 3.3, which shows the modular tessellation – the tiling of the upper half-plane by the images
3.14 Action of SL(2, ℝ) and SL(2, ℤ) on the upper half-plane
| 119
Algorithm 3.14.1 Require: R a positive real number. 1: function PickHalfplane(R) (r, θ) ← PickHyperbolic(R) ier cos θ + sin θ 3: return −ier sin θ + cos θ 4: end function 2:
of Λ by elements of SL(2, ℤ)). Geometrically, Λ is a triangle with angles π3 , π2 , 0. The vertex corresponding to the last angle is an ideal vertex (the point ∞ in Figure 3.3), also known as a cusp. A cusp neighborhood at height w is the region Hw = {z ∈ H | 0 < ℜz < 1; ℑz > w}. A simple integration shows that area(Hw ) =
1 . w
This immediately implies the following observation.
Figure 3.3: The modular group.
Fact 3.14.2. The area of the complement in Λ of a circle of radius R around i for large R has area approximately exp(−R). The points in the fundamental domain index the orbits of the SL(2, ℤ) action, giving rise to the following natural question. Question 3.14.3. Given a point z ∈ H, which orbit is it in? In other words, which point of Λ gets mapped to z? This question is so natural it was asked and answered in the eighteenth century by Legendre and Gauss. Of course, for them, the question was a little different: they were given two linearly independent vectors in the plane. These vectors generate a lattice, and the question is: what is the canonical form for that lattice? In other words, Gauss and Legendre posed (and solved) the 2-dimensional lattice reduction problem (a very
120 | 3 Randomness and computation in linear groups Algorithm 3.14.2 Require: A complex number z with ℑz ≥ 0. 1: function Reduce(z) 2: while |z| ≤ 1 do 3: z ← −1/z 4: q ← round ℜz 5: z ←z−q 6: end while 7: return z 8: end function
nice reference is the paper [485]). Gauss’s algorithm (which is basically the continued fraction algorithm) proceeds as follows (Algorithm 3.14.2). In fact, the algorithm Reduce can be made to do more: give the point z ∈ H, we can return not just the point z0 ∈ Λ such that z is in the orbit of z0 , but also the matrix A ∈ SL(2, ℤ) such that z0 = Az, as done in Algorithm 3.14.3. Algorithm 3.14.3 Require: A complex number z with ℑz ≥ 0. function Reduce2(z) A←I while |z| ≤ 1 do z ← −1/z −1 0 A←( )A 0 1 q ← round ℜz z ←z−q 1 q A←( )A 0 1 end while return (A, z) end function
3.15 Selecting a random element of SL(2, ℤ) almost uniformly We are now ready to describe the algorithm for selecting a random matrix M from the set of matrices in SL(2, ℤ) with Frobenius norm bounded above by X. Aside from the observations above, the key remark is that since the hyperbolic plane is the homoge-
3.15 Selecting a random element of SL(2, ℤ) almost uniformly | 121
Algorithm 3.15.1 Require: A pair of positive real numbers X, ϵ 1: function PickFancy(X,ϵ) 2: R ← f (X, ϵ) 3: loop 4: z ← PickHalfplane(R) 5: (A, z0 ) ← Reduce2(z) 6: if ‖A‖ ≤ X then 7: return A 8: end if 9: end loop 10: end function
⊳ f is a function to be named later.
neous space of SL(2, ℝ) (that is, the quotient of SL(2, ℝ) by its maximal compact subgroup SO(2, ℝ)) the Haar measure on SL(2, ℝ) projects to the hyperbolic metric (see the discussion in [127, 114] for more on the subject, and Knapp’s book [267] for everything you ever wanted to know). This suggests the Algorithm 3.15.1. What should f (X, ϵ) be? Firstly, it is obviously necessary that the disk of radius f (X, ϵ) intersect all of the images of i by matrices A with ‖A‖ < X. As we have seen (equation (3.49)), in order for this to be true, we must have f (X, ϵ) > arccosh X 2 /2. On the other hand, the fundamental domain Λ of SL(2, ℤ) has a cusp, which is bad, since no disk can contain Λ, but not so bad, since the part of Λ which lies outside the disk of radius R around i is asymptotic to exp(−R). This means that if f (X) > t + arccosh X 2 /2, the ratio of the areas of the intersections of fundamental domains we are interested in is of order 1 + e−t . On the other hand, the area of the set of points that lie in fundamental domains we do not want is proportional to et . This is so, since the total area of “good” points is bounded (by πX 2 ) independently of t. Therefore, as claimed above, the amount of excess computation is proportional to the error.
3.15.1 Complexity estimates and implementation Picking the random complex number in the half-plane in function PickHalfplane has been made unnecessarily expensive. Unwinding what we are doing, we see that in the first step we pick a random number x between 0 and gC (X) = cosh(C + arccosh 2X 2 ) − 1. Note that gC (X) = cosh(C + arccosh X 2 /2) − 1
= cosh C cosh arccosh X 2 /2 − sinh C sinh arccosh X 2 /2 − 1 = cosh CX 2 /2 − sinh C √x4 /4 − 1 − 1 = OC (X 2 ).
122 | 3 Randomness and computation in linear groups In the next step we compute arccosh(x + 1) = log(x + 1 + √x 2 + 2x). Since the number of fundamental domains is exponential in the radius, we need roughly log X bits of precision, and the final step (Reduce2) then takes a logarithmic number of steps (see [276, 484]), each of which is of logarithmic complexity, so the running time is of the order of O(t log2 X).
3.16 Extensions to other Fuchsian and Kleinian groups Suppose that instead of SL(2, ℤ) we want to generate random elements of bounded norm from other subgroups of SL(2, ℝ) or, even more ambitiously, SL(2, ℂ). The general approach described above works. Suppose H is our (discrete) subgroup. To pick a random element, we pick a random point x in ℍ2 or ℍ3 (our radius computation goes through unchanged), and then find the matrix A ∈ H which moves x to the “canonical” fundamental domain of H. This last part, however, is not so obvious, because both questions (constructing the fundamental domain and “reducing” the point x to that fundamental domain) are nontrivial.
3.16.1 Constructing the fundamental domain The first observation is that if the group H is not geometrically finite, it does not have a finite-sided fundamental domain at all, so constructing one may be too much. It is, however, conceivable that deciding whether x is reduced (that is, lies in the canonical fundamental domain) is still decidable. Since no algorithm leaps to mind, we shall state this as a question. Question 3.16.1. Is there a decision procedure to determine whether x ∈ ℍn lies in the canonical fundamental domain for a not-necessarily-geometrically finite group H? Until Question 3.16.1 is resolved, we will assume that H is geometrically finite. Now, we can construct the fundamental domain by generating a chunk of the orbit of the base point, and then computing the Voronoi diagram of that point set – the resulting domains are the so-called Dirichlet fundamental domains. Computing the Voronoi diagram can be reduced to a Euclidean computation (see the elegant exposition in [381] and Edelsbrunner’s recent classic [118] for background on the various diagrams). However, a much harder problem is that of figuring out how much of an orbit needs to be computed. For Fuchsian groups, this was addressed by Gilman in her monograph [168] (at least for two-generator Fuchsian groups). For Kleinian groups the question is much harder, but has been studied at least for arithmetic Kleinian groups in [403]. All we can say in general is that the computation is finite (since at every step we check the conditions for the Poincaré polyhedron theorem), so after waiting for a finite (though possibly long) time, we are good to go. Now, the question is: lacking the
3.17 Higher rank | 123
number theory underlying the continued fraction algorithm, how do we reduce our random point to the canonical fundamental domain? There are a number of ways to try to emulate the continued fraction algorithm. Here is one. Algorithm 3.16.1
Require: x, b ∈ ℍn , side-pairing transformation of the Dirichlet domain Γ = {γ0 = I(n), γ1 , . . . , γk } function GreedyReduce(x, b, Γ) ⊳ b is the basepoint. M ← I(n) loop Loop over Γ to find the i ∈ [0, k] for which d(γi (x), b) is minimal. if i = 0 then return M end if M ← γi M b ← γi b end loop end function Algorithm 3.16.1 will terminate in at most exponential time (that is, exponential in d(b, x)), and it seems very plausible (for reasons of hyperbolicity) that it will actually terminate in time linear in d(b, x), but this seems difficult to show.
3.17 Higher rank 3.17.1 SL(n, ℤ) The algorithms for SL(2, ℤ) use, in essence, the KAK decomposition of the group (which is in this case the singular value decomposition). This exists, and is easy to describe geometrically, in the higher-rank case as well (this construction is due to Minkowski). We first introduce the positive definite cone PSD(n) = {M | M = M t , vt Mv ≥ 0, ∀v ∈ ℝn .}. The general linear group GL(n, ℝ) acts on PSD(n) by g(M) = gMg t . It is not immediate that the subset PSD1 (n) = {M ∈ PSD(n) | det M = 1} is invariant under SL(n, ℝ). We can define a family of (Finsler) metrics on PSD(n) by n
p dp (A, B) = (∑log σi (B−1 A) ) i=1
1/p
,
124 | 3 Randomness and computation in linear groups where σi (M) denotes the i-th singular value of M. When p = 2 this defines a Riemannian metric, which makes PSD1 (n) into the symmetric space for SL(n, ℤ). In particular, when n = 2 it is easy to check that PSD1 (2) is the hyperbolic plane ℍ2 with the usual metric. With this in place, the algorithm we described for SL(2, ℤ) goes through mutatis mutandis. The hard part is the reduction algorithm. In the setting of SL(n, ℤ) we have the lattice reduction problem, which has been heavily studied starting with Lovász’s foundational Lenstra–Lenstra–Lovász (LLL) algorithm in [286]. The LLL algorithm is generally used as an approximation algorithm; it reduces a point not into the fundamental domain but into a point near the fundamental domain, which begs the following question. Question 3.17.1. Are the matrices obtained in the LLL algorithm uniformly distributed? In any case, one can also perform exact lattice reduction, but in that case the running time is exponential in dimension (see [376]); for dimensions up to four there is an extension of the Legendre–Gauss algorithm, described above, which is exact and quadratic in terms of the bit-complexity of the input (see [377]).
3.17.2 Sp(2n, ℤ) For Sp(2n, ℝ) the symmetric space is the Siegel half-space, where the metric is defined the same way as for SL(n, ℝ), while the underlying space is not the positive semidefinite cone, but instead the set S(2n) of all complex symmetric matrices with positive definite imaginary part. A symplectic matrix X ∈ Sp(2n, ℝ) has the form X = ( AC DB ), where A, B, C, D are n × n matrices satisfying the conditions that At C = C t A, Bt = Dt B.
The action of Sp(2n, ℝ) on S(2n) is then given by X(Z) = (AZ + B)(CZ + D)−1 . For more details on this, see [461, 144]. In any case, the action of Sp(2n, ℤ) on the Siegel half-space is fairly well understood, and the algorithm we gave for SL(2, ℤ) (which is also known as Sp(2, ℤ)) goes through, with the usual question of lattice reduction, which has not been studied very extensively; the only reference we have found was [151], which is, however, quite thorough.
3.18 Miscellaneous other groups | 125
3.18 Miscellaneous other groups 3.18.1 The orthogonal group Even without integrality assumptions, it is not immediately obvious how to sample a uniformly random matrix from the orthogonal group. This question got a very elegant one-line answer from Stewart in his paper [473]. Stewart’s basic method is as follows. Firstly, we remark that it is well known that every matrix M possesses a QR decomposition, where Q is orthogonal, while M is upper-triangular, and this decomposition is unique up to post-multiplying Q by a diagonal matrix whose elements are ±1. This indeterminacy can be normalized away by requiring the diagonal elements of R to be positive. The algorithm is now the following (Algorithm 3.18.1). This algorithm works because the distribution of KX is the same as the distribution of X for a matrix X with i. i. d. normal entries, and so the distribution of KQ is the same as the distribution of Q, which is exactly what we seek (note that this method is essentially a slight extension of the method described in Section 3.13.1), and is also related to our algorithms for SL(n, ℤ). Algorithm 3.18.1 Require: n is a positive integer. function RandomOrthogonal(n) X ← an n × n matrix whose entries are independent N(0, 1) random variables (Q, R) ← the QR decomposition of X return Q end function Now generating random integral matrices in O(n) is easy – they are just the signed permutation matrices, and generating a random permutation is easy (in a quest for self-containment we give the algorithm below as Algorithm 3.18.2), as is assigning random signs. However, to the best of our knowledge there is no known way to generate uniformly random rational orthogonal matrices. We ask this as a question. Question 3.18.1. How do we generate a random element of O(n) whose elements have greatest common denominator bounded above by N? There is a natural companion question. Question 3.18.2. Let Oq (n) be the set of those elements of O(n) with rational entries, such that the size of the greatest common denominator is bounded above by q. Is there any exact or asymptotic formula for the order of |Qq (n)|? And there is another natural question.
126 | 3 Randomness and computation in linear groups Question 3.18.3. Let μq be the normalized counting measure on Oq (n) (as above). Do the measures μq converge weakly to the Haar measure on the orthogonal group? Questions related to Questions 3.18.2 and 3.18.3 are considered in the paper [177], and it is quite plausible that the methods extend, but it is not completely obvious as of this writing. The only thing we know with certainty is how to address the case of a b ), with a2 + b2 = 1. Thus, if a and b have SO(2). Here, the elements have the form ( −b a denominator q, we are counting the representations of q as a sum of two squares. For this there is the explicit formula of Dirichlet. 2a
2a
b
b
If q = p1 1 ⋅ ⋅ ⋅ pk k q1 1 ⋅ ⋅ ⋅ ql l , where pi = 4ki + 3, with qj = 4kj + 1, then the number of ways to
write q as a sum of two squares is ∏lj=1 (bj + 1).
To get an asymptotic result, it is necessary to consider all q ≤ Q, when we see that the number of elements in SO(2) with the greatest common divisor of coefficients equals the number of visible lattice points in the disk ‖x‖ ≤ Q (a visible point (a, b) is a lattice point with relatively prime a, b). Since the probability of a lattice point being relatively prime for Q ≫ 1 approaches 6/π 2 , and the number of lattice points in the disk is asymptotic to πQ2 , we see that the cardinality of SOQ (2) is asymptotic to π6 Q2 , so we have a rather satisfactory answer to Question 3.18.2 in this setting. Question 3.18.3 is also easy (but already deep) in this setting. It is equivalent to the equidistribution of rational numbers with bounded denominator in the interval, and that, it turn, is not hard to show to be equivalent to the prime number theorem (both statements are equivalent to the statement that ∑xk=1 μ(x) = o(x), where μ is the Möbius function). Finally, in view of the answer to Question 3.18.2, Question 3.18.1 is equivalent to the question of generating a lattice point in a ball, which we have already discussed in Section 3.13.1
Algorithm 3.18.2
Require: n > 0
1: function GenPerm(n)
2:    a ← [1, 2, 3, . . . , n]
3:    for i = 1 → n − 1 do
4:        j ← a uniformly random integer in {1, . . . , n − i + 1}
5:        swap a[j] and a[n − i + 1]
6:    end for
7:    return a
8: end function
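A minimal numerical sketch of Algorithm 3.18.1 and of the signed-permutation sampler just described, assuming numpy is available (our own illustration; the sign correction implements the normalization making the diagonal of R positive):

```python
import numpy as np

rng = np.random.default_rng()

def random_orthogonal(n):
    """Algorithm 3.18.1: the Q factor of the QR decomposition of a Gaussian matrix,
    after the sign normalization below, is Haar-distributed on O(n)."""
    X = rng.standard_normal((n, n))
    Q, R = np.linalg.qr(X)
    # Normalize so that the diagonal of R is positive; this removes the sign indeterminacy.
    Q = Q * np.sign(np.diag(R))
    return Q

def random_signed_permutation(n):
    """A uniformly random integral element of O(n): a signed permutation matrix."""
    perm = rng.permutation(n)              # random permutation
    signs = rng.choice([-1, 1], size=n)    # independent random signs
    M = np.zeros((n, n), dtype=int)
    M[np.arange(n), perm] = signs
    return M

if __name__ == "__main__":
    Q = random_orthogonal(4)
    print(np.allclose(Q @ Q.T, np.eye(4)))   # True: Q is orthogonal
    P = random_signed_permutation(4)
    print((P @ P.T == np.eye(4)).all())      # True: P is integral and orthogonal
```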
3.18.2 Finite linear groups Our final remarks are on finite linear groups. The simplest class of groups to deal with is SL(n, p). How do we get a random element? This is quite easy (see Algorithm 3.18.3). We pick every element independently at random from Fp . If the resulting matrix M is singular, we try again; if not, we let the determinant be d. We then divide the first column of M by d. It is easy to see that the resulting matrix M will be uniformly distributed in SL(n, p). It is easy to see that the complexity of this method is O(nω log p), where ω is the optimal matrix multiplication exponent. Unfortunately, this simple method only works for SL(n, q). For Sp(2n, q) there is Algorithm 3.18.4, which is due to Hall. It is not hard to see that Hall’s algorithm has time complexity O(n3 log p). Algorithm 3.18.3 Require: n > 0 1: function GenRandSL(n) 2: loop 3: a ← a uniformly random element of M n×n (p). 4: if det(a) ≠ 0 then 5: return a with the first column divided by det(a). 6: end if 7: end loop 8: end function In general, there is a completely different polynomial-time algorithm based on the fact that the Cayley graphs of simple groups of Lie type are expanders – uniform expansion bounds have been obtained by a number of people (see [289, 252, 251, 308]). The main significance of the expansion for our purposes is that the random walk on the Cayley graph is very rapidly mixing (see [220, Section 3]), and so a random walk of poly-logaritmic length will be equidistributed over the group. Of course, this will be slower than Algorithm 3.18.3, and will only generate approximately uniform random elements. To be precise, the diameter of the Cayley graph of (for example) SL(n, p) will be O(n2 log p), so the expander-based algorithm will have time complexity O(log2 pnω+2 ). 3.18.3 SL(n, ℝ) The method described in Section 3.15 can be easily adopted to uniformly select a matrix from SL(n, ℝ) of bounded Frobenius norm (indeed, this is much easier than if the coefficients are constraints to be integers). The basic observation (see [266, p. 142]) is to
128 | 3 Randomness and computation in linear groups Algorithm 3.18.4 Require: n > 0 1: V ← symplectic vector space of dimension 2n. 2: function GenRandSp(n) 3: W ← {0} 4: for i = 1 → n; i ← i + 1 do 5: repeat 6: x, y ← random vectors in V. 7: x , y ← projections of x, y onto W. 8: x ← x − x 9: c ← ⟨x , y ⟩ 10: until c ≠ 0 11: xi ← x 12: yi ← y /c 13: W ← span of W and xi , yi . 14: end for 15: return x1 , x2 , . . . , xn , y1 , y2 , . . . , yn 16: end function use the KAK (singular value) decomposition. The two orthogonal factors are equidistributed, while the measure on the A (diagonal) factor is given by the product of sinh factors corresponding to the roots (this is an immediate generalization of the fact that the perimeter of a circle of radius R in the hyperbolic plane is sinh R).
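Returning to Algorithm 3.18.3 above, here is a minimal sketch of uniform sampling from SL(n, p), assuming sympy for the exact determinant (the helper names are ours, not from the text):

```python
import random
from sympy import Matrix

def random_sl_mod_p(n, p):
    """Algorithm 3.18.3: rejection-sample an invertible matrix mod p, then rescale one column."""
    while True:
        a = Matrix(n, n, lambda i, j: random.randrange(p))
        d = int(a.det()) % p
        if d != 0:
            # Divide the first column by d, i.e. multiply it by d^{-1} mod p.
            d_inv = pow(d, -1, p)
            for i in range(n):
                a[i, 0] = (a[i, 0] * d_inv) % p
            return a

if __name__ == "__main__":
    m = random_sl_mod_p(3, 101)
    print(int(m.det()) % 101)   # 1
```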
3.18.4 Other groups? In the work by the author [422, 423] and Maher [323] the model of a random element is the random walk model, since this seemed to be the only natural model for the mapping class group. However, in view of the discussion above it makes sense to define the norm of an element γ of a mapping class group as the Teichmüller distance from some fixed base surface S to γ(S) (one can also use the Weil–Petersson distance), or the distance from a fixed curve to its image in the curve complex, and then pick a random element by analogy with the construction in this note. In fact, this has been done by Maher in [322].
3.19 Checking Zariski density The idea is based on the following result shown in [422] for G = SL(n, ℤ) and G = Sp(2n, ℤ).4 Theorem 3.19.1. Let M be a generic element of G as above. Then, the Galois group of the characteristic polynomial of M equals the Weyl group of G. Furthermore, if “generic” is taken to be with respect to the uniform measure on random words of length N in a symmetric generating set of G, then the probability that M fails to have the requisite Galois group of characteristic polynomial goes to zero exponentially fast with N. It should be remarked that the Weyl group of SL(n, ℤ) is the symmetric group Sn , while the Weyl group of Sp(2n, ℤ) is the signed permutation group (also known as the hyperoctahedral group) C2 ≀ Sn . For general semisimple groups Theorem 3.19.1 is still true, and the key number-theoretic argument can be found in the foundational paper [415] by Prasad and Rapinchuk. The exponential convergence argument then works by the argument of [422, 423]. This result has been rediscovered (using exactly the same method) by Jouve, Kowalski and Zywina [230]. Theorem 3.19.1 was extended to the situation where M is a generic element of a Zariski dense subgroup Γ of G (as above) in [424] (see also [310]). So, if a subgroup Γ is Zariski dense, then long words in generators of Γ have the full Galois group of the Zariski closure with probability exponentially close to 1. This leads us to use the Galois group as the tool to check Zariski density – if we check that the Galois group of a random long element is not the Weyl group of the putative Zariski closure, then Γ is not Zariski dense, with probability exponentially close to 1 as a function of N. The situation is completed by the following nice result of Prasad and Rapinchuk. Theorem 3.19.2 ([416, Theorem 9.10]). If G is SL(n, ℤ) and Γ < G is such that it contains one element γ1 with Galois group of characteristic polynomial equal to Sn and another element γ2 such that γ2 is of infinite order and does not commute with γ1 , then Γ is Zariski dense in G. If G is Sp(n, ℤ) and Γ < G contains elements γ1 , γ2 as above, then either Γ is Zariski dense in G or the Zariski closure of Γ (over ℂ) is the product of n copies of SL(2, ℂ). We then have the following simple (at least to write down) Algorithm 3.19.1 to confirm or deny whether a given subgroup Γ = ⟨γ1 , . . . , γk ⟩ of SL(n, ℤ) or of Sp(2n, ℤ) is Zariski dense. The least clear part is how to check that the Galois group of the characteristic polynomials in question is the Weyl group of the ambient group. Now, every monic polynomial of degree n with integer coefficients arises as the characteristic polynomial 4 The result is stated in [422] for SL(n, ℤ), while the only result stated for Sp(2n, ℤ) is that the characteristic polynomial of a generic matrix is irreducible. However, the argument goes through immediately for the symplectic group (indeed, it is easier than for SL(n, ℤ)). The result was stated for Sp(2n, ℤ) in [271, Theorem 7.12].
Algorithm 3.19.1
Require: ϵ > 0
1: function ZariskiDense(G, ϵ)                      ⊳ G is one of SL(n, ℤ), Sp(2n, ℤ).
2:    W ← Weyl group of G.                          ⊳ W is Sn for SL(n, ℤ), and C2 ≀ Sn for Sp(2n, ℤ).
3:    N ← c log ϵ                                   ⊳ c is a computable constant.
4:    w1 ← a random product of N generators of Γ.
5:    w2 ← a random product of N generators of Γ.
6:    if w1 commutes with w2 then
7:        return FALSE
8:    end if
9:    𝒢 ← the Galois group of the characteristic polynomial of w1.
10:   if 𝒢 ≠ W then
11:       return FALSE
12:   end if
13:   𝒢 ← the Galois group of the characteristic polynomial of w2.
14:   if 𝒢 ≠ W then
15:       return FALSE
16:   end if
17:   if G = SL(n, ℤ) then
18:       return TRUE
19:   end if
20:   if Γ acts irreducibly on ℂ^{2n} then
21:       return TRUE
22:   end if
23:   return FALSE
24: end function
of some matrix in M n×n (ℤ), while characteristic polynomials of symplectic matrices are reciprocal (the coefficient sequence reads the same forwards and backwards), or, equivalently, x2n f (1/x) = f (x). It follows that the roots of such a polynomial come in inverse pairs, and so it is not hard to see that the Galois group of such a polynomial is the signed permutation or the hyperoctahedral group Hn = C2 ≀Sn (this group acts on the set {1, 2, . . . , 2n} by permuting the blocks of the form {2i, 2i+1} and possibly transposing the elements of a block).
3.20 Algorithms for large Galois groups A method to check if the Galois group of a monic polynomial of degree d in ℤ[x] is large (either the symmetric group Sd or the alternating group Ad ) was discovered by the author. The main ingredient is the Livingstone–Wagner theorem. First we give a definition. Definition 3.20.1. We say that a permutation group Gn ≤ Sn is k-homogeneous if G acts transitively on the set of unordered k-tuples of elements of {1, . . . , n}. Theorem 3.20.2 (Livingstone–Wagner [292]). If (with notation as in Definition 3.20.1) the group Gn is k-homogeneous, with k ≥ 5 and 2 ≤ k ≤ 21 n, then k is k-transitive. Remark 3.20.3. Obviously the hypotheses of Theorem 3.20.2 can only be met for n ≥ 10. Now, let M be a linear transformation of a complex n-dimensional vector space V n . We define ⋀k M to be the induced transformation on ⋀k V n . The following is standard (and easy). Fact 3.20.4. The eigenvalues of ⋀k M are sums of k-tuples of eigenvalues of M. To avoid excessive typing we will denote the characteristic polynomial of ⋀k M by χk (M). Lemma 3.20.5. If the Galois group of χ(M) is An or Sn and k < n, then the Galois group of χk (M) is An or Sn . In particular, χk (M) is irreducible. Proof. The Galois group of χk (M) is a normal subgroup of the Galois group of χ(M), and so is either An , Sn or {1}. In the first two cases we are done. In the last case, the fact that the sums of k-tuples of roots of χ(M) are rational tells us that the Galois group of χ(M) is a subgroup of Sk , contradicting our assumption. Lemma 3.20.6. Suppose that χk (M) is irreducible. Then the Galois group of χ(M) is k-homogeneous. Proof. The Galois group of χk (M) is a subgroup of the Galois group of χ(M). Since the roots of χk (M) are distinct, it follows that the Galois group of χ(M) acts transitively on unordered k-tuples of roots of χ(M), so is k-homogeneous. Therefore, so is the Galois group of χ(M). Finally, we can state our result. Theorem 3.20.7. Suppose n > 24. Then the Galois group of χ(M) is Sn if and only if χ5 (M) is irreducible, and the discriminant of χ(M) is not a perfect square. If χ5 (M) is irreducible and the discriminant of χ(M) is a perfect square, then the Galois group of χ(M) is An . For 24 ≥ n ≥ 12, the same statements hold with 5 replaced by 6 throughout.
Proof. It follows immediately from Lemmas 3.20.5 and 3.20.6 combined with the Livingstone–Wagner theorem (Theorem 3.20.2).
Theorem 3.20.7 immediately leads to the following algorithm for checking if the Galois group of a polynomial p(x) ∈ ℤ[x] is large (one of An or Sn).
Algorithm 3.20.1
Require: n > 10.
Require: f ∈ ℤ[x] a polynomial of degree n.
1: function Wedge(f)
2:    if n > 24 then
3:        k ← 5.
4:    else
5:        k ← 6.
6:    end if
7:    M ← companion matrix of f.
8:    g ← χk(M).
9:    if g is not irreducible then
10:       return false
11:   end if
12:   D ← discriminant of f.
13:   if D is a square of an integer then
14:       return An.
15:   end if
16:   return Sn.
17: end function
The complexity of Algorithm 3.20.1 can be bounded above using van Hoeij’s algorithm for factoring univariate polynomials over ℚ. The complexity of factoring a polynomial of degree n and height h (where h is the log of the maximal coefficient) (see van Hoeij and Novocin [486]) of factoring over ℚ is given by O(n6 + h2 n4 ), which gives us the following. Theorem 3.20.8. The complexity of Algorithm 3.20.1 is at most O(n30 + h2 n20 ). Remark 3.20.9. The van Hoeij algorithm is far more efficient in practice than the running time bounds would indicate, and therefore so is Algorithm 3.20.1. Nonetheless, it is extremely unlikely that it would be competitive with the probabilistic algorithms described in the next section. It does, however, have the advantage of being fully deterministic (and very simple, to describe and to implement – the hardest part of the implementation is computing χk (M) without writing out the k-th exterior power of the companion matrix.)
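The polynomial χk(M) — whose roots, by Fact 3.20.4, are the sums of k-element subsets of the roots of f — can be approximated quickly from numerical roots. The following Python sketch is our own illustration (it uses floating-point roots and rounding, so it is only reliable for modest degrees, and it is not the exact-arithmetic implementation alluded to above):

```python
from itertools import combinations
import numpy as np

def chi_k(coeffs, k):
    """Monic polynomial (as a coefficient list) whose roots are the sums of
    k-element subsets of the roots of the monic polynomial with coefficients `coeffs`.
    Floating-point sketch: coefficients are rounded to the nearest integer at the end."""
    roots = np.roots(coeffs)                       # numerical roots of f
    sums = [sum(c) for c in combinations(roots, k)]
    poly = np.poly(sums)                           # expand prod (x - s) over all subset sums
    return [int(round(c.real)) for c in poly]

if __name__ == "__main__":
    # f(x) = x^5 - x - 1, a standard example with Galois group S5.
    f = [1, 0, 0, 0, -1, -1]
    print(chi_k(f, 2))   # a degree C(5,2) = 10 polynomial with integer coefficients
```

Checking the irreducibility of the resulting polynomial (e.g., with a computer algebra system) then completes one pass of Algorithm 3.20.1.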
3.21 Probabilistic algorithms Given the rather unsatisfactory state of affairs described above, it is natural to look for other, more practical methods, and, as often in life, one needs to give up something to get something. The algorithms we will describe below are probabilistic in nature – they involve random choices, and so they are Monte Carlo algorithms. They are also one-sided, that is, if we ask our computer “Is the Galois group of f the full symmetric group?” and the computer responds “yes,” we can be sure that the answer is correct. If the computer responds “no,” we know that the answer is correct with some probability 0 < p < 1. If we are unsatisfied with this probability of getting the correct answer, we can ask the computer the same question k times. At the end of this process, the probability of getting the wrong answer is (at most) (1 − p)k , so this is what is known as a Monte Carlo algorithm. Not surprisingly, the probabilistic algorithms for Galois group computation are based on the Chebotarev (really Frobenius) density theorem, in the (supposedly) effective form of equation (3.50). The idea is that once we know something about the statistics of the permutations in our putative Galois group, we can conduct some probabilistic experiments and predict the statistical properties of the outcome. Since the Frobenius density theorem talks about the conjugacy classes in the symmetric group Sn (recall that the conjugacy class of an element is given by its cycle structure), while the Chebotarev density theorem talks about the conjugacy classes in the Galois group itself, it is natural to ask what we can learn about the group just by looking at the conjugacy classes. To this end, we make the following definition (which we believe was first made by Dixon). Definition 3.21.1. A collection of conjugacy classes C1 , . . . , Ck of a group G invariably generates G if any collection of elements c1 , . . . , ck with ci ∈ Ci generates G. We make another definition (of our own, though the concept was also used by Dixon). Definition 3.21.2. A collection of conjugacy classes C1 , . . . , Ck of a group G acting on a set X is invariably transitive if any collection of elements c1 , . . . , ck with ci ∈ Ci generates a subgroup H of G acting transitively on X. Dixon [109] shows that in the case that G is a symmetric group Sn acting in the usual ways on {1, . . . , n}, for a fixed ϵ, Oϵ (log1/2 n) elements will be invariably transitive (we are abusing our own notation; a collection of elements invariably generate if their conjugacy classes do) with probability 1 − ϵ, and a similar estimate holds for the number of elements which generate Sn . Dixon’s proof is quite complicated. The result was improved a couple of years later by Luczak and Pyber in [312] – they removed the dependence on n, but their constant is quite horrifying. The truth is much more pleasing.
134 | 3 Randomness and computation in linear groups Fact 3.21.3 ([408]). Four uniformly random elements in Sn are invariably transitive with probability at least 0.95. Therefore, a collection of 4k elements is invariably transitive with probability at least 1 − 201 k . Heuristic argument. First define the sumset of a partition of n to be the collection of all sums of subsets of the partition, save 0 and n. Every permutation σ ∈ Sn defines a partition of n (corresponding to the cycle decomposition of σ), and defining s(σ) to be the sumset of that partition, it is clear that σ1 , . . . , σk are invariably transitive if ⋂ s(σi ) = 0. Now, it is well known that the expected number of cycles of a random permutation is log n (the variance is also equal to log n) – for a nice survey of results of this sort, see the paper by Diaconis, Fulman and Guralnick [99]. Their collection of sumsets has cardinality 2log n = nlog 2 . One can think of log 2 as the “discrete dimension” of the sumset, and so, taking k elements, the dimension of the intersection of the pairs of sumsets with the (big) diagonal is k(log 2) − k + 1. This is easily seen to turn negative when k = 4. The above point of view is also useful for a very fast algorithm to compute if a collection of elements are invariably transitive: computing the sumset of a partition of n can be done in time O(n2 ) by the dynamic programming algorithm, Algorithm 3.21.1. Algorithm 3.21.1 Require: X = (x1 , . . . , xk ) a collection of positive integers 1: function Sumset(f )
2:    X ← {0}
3:    for i from 1 to k do
4:        X ← X ∪ (X + xi)
5:    end for
6:    return X
7: end function
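A direct Python transcription of Algorithm 3.21.1 (our own sketch), in which the sumset is accumulated as a set of integers:

```python
def sumset(parts):
    """All sums of subsets of the partition `parts` (Algorithm 3.21.1).
    The empty sum 0 is included here; callers can discard 0 and the full sum if desired."""
    sums = {0}
    for x in parts:
        sums |= {s + x for s in sums}
    return sums

if __name__ == "__main__":
    print(sorted(sumset([3, 5, 7])))   # [0, 3, 5, 7, 8, 10, 12, 15]
```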
3.22 Probabilistic algorithm to check if p(x) of degree n has Galois group Sn To check that the Galois group is the full symmetric group, we first check transitivity, and then use one of two methods to determine whether the group is the full Sn . Both methods are based on Jordan’s theorem, but one (which has worse complexity in terms of the degree) is used for n < 13, and the other is used for n ≥ 13. It should be underscored that the algorithm is testing the hypothesis that the group is Sn – if
Algorithm 3.22.1
Require: p ∈ ℤ[x] of degree n.
Require: ϵ > 0.                                    ⊳ ϵ is the probability that we are wrong.
1: function IsTransitive(p, ϵ)
2:    s ← {1, . . . , n}.
3:    for i ≤ −c log ϵ do
4:        q ← random prime with q ∤ 𝒟(p).
5:        d ← set of degrees of irreducible factors of the factorization of p(x) mod q.
6:        s′ ← Sumset(d).
7:        if s ∩ s′ = ∅ then
8:            return IRREDUCIBLE
9:        end if
10:       s ← s ∩ s′
11:   end for
12:   return NOT Sn
13: end function
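A runnable sketch of this transitivity test, assuming sympy (our own illustration; the number of trials and the prime range are arbitrary choices, not constants from the text):

```python
from sympy import Poly, discriminant, randprime
from sympy.abc import x

def sumset(parts):
    sums = {0}
    for v in parts:
        sums |= {s + v for s in sums}
    return sums - {0, sum(parts)}

def looks_invariably_transitive(f, trials=40):
    """Factor f mod random primes; if the running intersection of the sumsets of the
    factorization degrees becomes empty, the Galois group is invariably transitive
    (so f is irreducible).  One-sided Monte Carlo, as in Algorithm 3.22.1."""
    n = Poly(f, x).degree()
    disc = discriminant(Poly(f, x))
    s = set(range(1, n))
    for _ in range(trials):
        q = randprime(3, 10**6)
        if disc % q == 0:
            continue
        factors = Poly(f, x, modulus=q).factor_list()[1]
        degrees = [g.degree() for g, mult in factors for _ in range(mult)]
        s &= sumset(degrees)
        if not s:
            return True
    return False

if __name__ == "__main__":
    print(looks_invariably_transitive(x**5 - x - 1))   # expected: True
```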
Algorithm 3.22.2
Require: p ∈ ℤ[x] irreducible of degree n.
Require: ϵ > 0.                                    ⊳ ϵ is the probability that we are wrong.
1: function IsPrimitive(p, ϵ)
2:    for i ≤ −c log ϵ log n do
3:        q ← random prime with q ∤ 𝒟(p).
4:        d ← set of degrees of irreducible factors of the factorization of p(x) mod q.
5:        if d contains a prime bigger than n/2 then
6:            return PRIMITIVE
7:        end if
8:    end for
9:    return NOT Sn.
10: end function
one has an unknown Galois group, it may be much harder to test whether it acts transitively, for example. In any case, we will need to first write some helper functions (Algorithm 3.22.1). If the polynomial is irreducible, we can proceed to the next step. At this point we can assume that the degree of the polynomial is greater than 3 (Algorithm 3.22.2).
Algorithm 3.22.3
Require: p ∈ ℤ[x] irreducible of degree n with primitive Galois group
Require: ϵ > 0.                                    ⊳ ϵ is the probability that we are wrong.
1: function IsSn(p, ϵ)
2:    if n < 13 then
3:        for i ≤ −c log ϵ do
4:            q ← random prime with q ∤ 𝒟(p).
5:            d ← set of degrees of irreducible factors of the factorization of p(x) mod q.
6:            if d contains one 2 and the rest of the elements are odd then
7:                return YES
8:            end if
9:        end for
10:       return NO
11:   else
12:       if 𝒟(p) is a perfect square then
13:           return NO
14:       end if
15:       for i ≤ −c log ϵ log n do
16:           q ← random prime with q ∤ 𝒟(p).
17:           d ← set of degrees of irreducible factors of the factorization of p(x) mod q.
18:           if d contains a prime bigger than n/2 and smaller than n − 5 then
19:               return YES
20:           end if
21:       end for
22:       return NO
23:   end if
24: end function
Finally, we have either decided that our Galois group is not the full symmetric group or that it is transitive and primitive. We have now arrived at the last step (Algorithm 3.22.3).
3.22.1 Some remarks on the running time of detecting Galois group Sn The complexity of (probabilistically) factoring a polynomial of degree n modulo a prime q is bounded by O(n1.816 log0.43 q) (see [492] for a discussion), while the primes
we use are (at worst) of the order of the size of the discriminant of the polynomial,5 for a complexity bound of O(n^{1.816}(log^{0.43} n + log^{0.43} |f|₁)). The complexity of computing the discriminant is bounded by O(n³(log n + log |f|₁)); this follows from the complexity results of Emiris and Pan [122]. Checking a number r for primality can be done in time Õ(log² r) (where Õ means that we ignore terms of order polynomial in log log r). Since the probability of a number r being prime is of the order of 1/log r by the prime number theorem, generating a random prime requires only a few primality tests. We factor mod q at most O(√n) times, and the running time is dominated by computing the discriminant and generating the requisite random primes, and so is of order O(n³(log n + log |f|₁)).
3.22.2 Deciding whether the Galois group of a reciprocal polynomial is the hyperoctahedral group
Deciding whether the Galois group of a reciprocal polynomial is the hyperoctahedral group is (given our preliminaries) an easy extension of the algorithm which checks that the Galois group of a (not necessarily reciprocal) polynomial is the full symmetric group, and is given in Algorithm 3.22.4.
Algorithm 3.22.4
Require: p(x) a reciprocal polynomial with integral coefficients of degree 2n.
Require: ϵ > 0.
1: function IsHyperoctahedral(p, ϵ)
2:    r ← trace polynomial of p.
3:    if the Galois group of r is not the symmetric group Sn then
4:        return NO
5:    end if
6:    for i from 1 to c√n do
7:        q ← random prime with q ∤ 𝒟(p).
8:        d ← set of degrees of irreducible factors of the factorization of p(x) mod q.
9:        if d contains one 2 and the rest of the elements are odd then
10:           return YES
11:       end if
12:   end for
13:   return NO
14: end function
5 In practice we would use much smaller primes.
The complexity of this algorithm is, again, dominated by the complexity of computing the discriminant and generating primes, and is again of the order of O(n³(log n + log |f|₁)).
3.23 Back to Zariski density We now return to Algorithm 3.19.1. We know how to do every step except for checking that the action on ℂ is irreducible (which is addressed in Section 3.23.1). We should now check what the complexity of the algorithm is. On lines 3 and 4 we are computing a (random) product of N matrices. By the Furstenberg–Kesten theorem (or any one of its refinements, see, e. g., [58]) we know that the sizes of the coefficients of the matrices w1 , w2 are of order of λN , for some λ depending on the generating set (but note that λ is at most ‖ log 𝒢 ‖, where 𝒢 is the maximum of the Frobenius norms of the generators), so in our case the coefficients are of the order of (1/ϵ)c , for some constant c, and therefore the coefficients of the characteristic polynomial of w1 , w2 are of order (1/ϵ)nc . This tells us (using the results in Section 3.22.1) that we can check that the Galois group of w1 , w2 is Sn in time O(n4 log ϵ log ‖𝒢 ‖), and likewise in the symplectic case we can check that the Galois group is C2 ≀ Sn in the same time. 3.23.1 Testing irreducibility One of the steps in Algorithm 3.19.1 involves checking that our group acts irreducibly on V 2n . This seems hard a priori, but there are two ways to deal with this. The first way involves computing in the splitting fields of the characteristic polynomials of the generators, so is not practical. The second, luckily, is polynomial time, and uses Burnside’s irreducibility criterion. Theorem 3.23.1 (Burnside’s irreducibility theorem). The only irreducible algebra of linear transformations on a vector space of finite dimension greater than 1 over an algebraically closed field is the algebra of all linear transformations on the vector space. Burnside’s theorem is (obviously) classical, and proofs can be found in many places, but the most recent (and simplest) proof by Lomonosov and Rosenthal is highly recommended (see [307]). Burnside’s theorem (Theorem 3.23.1) tells us that in order to check irreducibility, we need only check that the set of all elements in our group spans the whole matrix algebra M 2n×2n (thought of as a vector space). Now, our group is infinite, but luckily, Algorithm 3.23.1 (suggested by Cornulier) gets around that problem. Note that the inner loop of Algorithm 3.23.1 runs at most n2 times, and each iteration computes the rank of an at most n2 × n2 integer matrix. Theorem 3.23.2. The irreducibility of an n × n matrix group can be decided by using at most O(n8 log n log ‖𝒢 ‖) arithmetic operations (this uses the algorithm of Storjohann [476]).
Algorithm 3.23.1
Require: X = (g1, . . . , gk) a collection of generators.
1: function IsIrreducible(X)
2:    V0 ← ⟨I⟩
3:    b0 ← basis of V0.
4:    loop
5:        if |b0| = n² then
6:            return TRUE
7:        end if
8:        V1 ← ⟨b0 ∪ g1 b0 ∪ ⋅ ⋅ ⋅ ∪ gk b0⟩.
9:        b1 ← basis of V1.
10:       if |b1| = |b0| then
11:           return FALSE
12:       end if
13:       V0 ← V1 ; b0 ← b1.
14:   end loop
15: end function
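A numerical sketch of this Burnside-type test, assuming numpy (our own illustration; it uses floating-point rank computations with a tolerance, whereas an exact implementation would work over ℚ):

```python
import numpy as np

def is_irreducible(gens, tol=1e-9):
    """Burnside test: the group acts irreducibly iff the algebra spanned by its
    elements is the full matrix algebra, i.e. the span below reaches dimension n^2."""
    n = gens[0].shape[0]
    basis = [np.eye(n).flatten()]            # start with the span of the identity
    while True:
        dim = np.linalg.matrix_rank(np.array(basis), tol=tol)
        if dim == n * n:
            return True
        # Enlarge the spanning set by left-multiplying by each generator.
        new = list(basis)
        for g in gens:
            for b in basis:
                new.append((g @ b.reshape(n, n)).flatten())
        if np.linalg.matrix_rank(np.array(new), tol=tol) == dim:
            return False                     # the span stabilized below dimension n^2
        basis = new

if __name__ == "__main__":
    # Standard generators of SL(2, Z): expected to act irreducibly on C^2.
    S = np.array([[0, -1], [1, 0]], dtype=float)
    T = np.array([[1, 1], [0, 1]], dtype=float)
    print(is_irreducible([S, T]))   # True
```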
Another (simple but less efficient) algorithm to determine Zariski density is given in Section 3.28.
3.24 A short history of Galois group algorithms Galois groups of polynomials are believed to be difficult to compute in general, though there is no current agreement as to where the computation of Galois group falls in the complexity hierarchy (which is, of course, itself conjectural). 3.24.1 Kronecker’s algorithm The most obvious algorithm (which, in essence, goes back to Galois, but was first published by Kronecker in his book [272]) is the following (Algorithm 3.24.1). Note that the Algorithm 3.24.1 Require: f ∈ ℤ[x] a polynomial of degree n, with roots r1 , . . . , rn . 1: function Kronecker(f )
2:    R(x) ← ∏_{σ∈Sn} (x − ∑_i Xi rσ(i)).        ⊳ R(x) is the resolvent of f.
3:    (R1, . . . , Rk) ← the irreducible factors of R over ℚ[X1, . . . , Xn].
4:    return Stabilizer(R1) ⊂ Sn.
5: end function
140 | 3 Randomness and computation in linear groups coefficients of the resolvent are symmetric functions of the roots of f , and are thus rational. Nevertheless, since the algorithm involves factoring a polynomial in n variables and of degree n!, this method is not likely to be practical for any but the smallest values of n.
3.24.2 Stauduhar's algorithm
Algorithm 3.24.2 (to the best of our knowledge) was designed by Stauduhar in [471]. The idea of Stauduhar's algorithm (which we describe for irreducible polynomials, for simplicity) is that instead of computing the resolvent of f with respect to the full candidate Galois group (Sn in Kronecker's algorithm) we compute it with respect to a set of
Algorithm 3.24.2
Require: f ∈ ℤ[x] an irreducible polynomial of degree n, with roots r1, . . . , rn.
1: function findMax(G, f, ℱ, ℳ)
2:    for M ∈ ℳ do
3:        FM ← ∑_{σ∈M} σ(ℱ).                     ⊳ FM is invariant under all permutations in M, and only those.
4:        QG:M(y; x1, . . . , xn) ← ∏_{σ ∈ coset representatives of M in G} (y − σ(FM)(r1, . . . , rn)).
              ⊳ To compute QG:M we use the approximate roots r̃, and then round the coefficients to nearest integers.
5:        if QG:M has an integer root then
6:            return M
7:        end if
8:    end for
9:    return none
10: end function
11: function Stauduhar(f)
12:   r̃f ← (r̃1, r̃2, . . . , r̃n), where r̃i are high precision numerical approximations to the roots of f.
13:   ℱ ← x1 x2² ⋅ ⋅ ⋅ xn^n.
14:   G ← Sn.
15:   while |G| > 1 do
16:       ℳ ← the list of conjugacy classes of maximal transitive subgroups in G.
17:       H ← findMax(G, f, ℱ, ℳ).
18:       if H is none then
19:           return G
20:       end if
21:       G ← H.
22:   end while
23: end function
coset representatives of the largest possible Galois group with respect to a subgroup, and then iterate. To be precise, a casual examination will show that Stauduhar’s algorithm is designed very much as a technical tool, suitable for the state of computers in the late 1960s and early 1970s, and has really no fundamental computational complexity advantage over Kronecker’s algorithm. Note for example that if the subgroup M is large, then the invariant polynomial FM will have a lot of terms, while if M is small in comparison with the ambient group G, the degree of the resolvent Q will be huge. In many case we suffer from both problems at once. For example, the wreath products Sk ≀ Sn are maximal subgroups of Skn – in this case both the order and the index are enormous. Curiously, the most glaring problem with Stauduhar’s algorithm (the need to know all the maximal transitive subgroup structures of permutation groups) is, in a way, the least significant, since the number of such is quite small (low-degree polynomial in n). That said, Stauduhar’s algorithm certainly provides a considerable speed-up for the computation of Galois groups of low-degree polynomials. 3.24.3 Polynomial time (sometimes) The first theoretical advance in computing Galois groups came as a consequence of the celebrated LLL [286] result (at the time viewed as of purely theoretical interest) which showed that one could factor polynomials over the rational numbers in time polynomial in the input size – the algorithm was based on the Lovász lattice reduction algorithm (nowadays invariably referred to as the LLL algorithm, but attributed to Lovász in the source). The reason that the algorithm was not viewed as practical at the time was that running time was of the order of O(n9+ϵ + n7+ϵ log2+ϵ (∑ a2i )), for any ϵ, but with constants depending on ϵ. In any case, it was then showed by Landau in [278] that one could also factor in polynomial time over extensions of ℚ – Landau’s algorithm is basically a combination of the LLL algorithm with a very simple idea going back to Kronecker by way of Trager. Namely, to factor a polynomial over ℚ[α] it is enough to factor its norm (which is a polynomial over ℚ), as long as the norm is squarefree. However, the observation is that there is at most a quadratic (in the degree of the extension) shift of the variable which gives nonsquare-free norms. One does, however, need to compute the norm, and then compute a bunch of polynomial greatest common divisors, so after the smoke clears, the complexity of factoring a polynomial f (x) of degree n over ℚ(α) where α has minimal polynomial g(y) of degree m is n
O(m^{9+ϵ} n^{7+ϵ} log² |g| log^{2+ϵ}(|f| (m|g|)^n (mn)^n)), so charitably dropping the ϵ's and terms logarithmic in degrees we get the complexity to be worse than O((mn)⁹ log² |f| log² |g|). For example, in the generic case if we adjoin k roots of f and attempt to factor f over the resulting extension, complexity will be worse than O(n^{9(k+1)} log⁴ |f|). In any event, Landau notes the following corollary.
142 | 3 Randomness and computation in linear groups Corollary 3.24.1 (Landau, [278]). The Galois group of a polynomial f can be computed in time polynomial in the degree of the splitting field f and log |f |. The corollary essentially follows from the polynomial bound on factoring, since, as she shows, one can construct the splitting field by adjoining one root at a time. Note that Corollary 3.24.1 is useless for recognizing large Galois groups (since, for example, in the generic case when the Galois group is the full symmetric group, the degree of the splitting field is n!), but it can be used for polynomial-time algorithms to determine whether the Galois group is solvable (this was done by Landau and Miller in [279]), thanks in no small part to Palfy’s theorem [404], which states that the order of 1 a transitive primitive solvable subgroup of Sn is at most 24 3 n3.24399... . Solvability has considerable symbolic value, since solvability of the Galois group is equivalent to f being solvable by radicals, and the question of whether all equations were thus solvable was what led to the discovery of Galois theory. More recently, some of these ideas were applied to the design of a polynomial-time algorithm to check if the Galois group of a polynomial is nilpotent (the Landau–Miller algorithm does not go through for this case, though Palfy’s bound obviously still holds) by Arvind and Kurur in [12]. The first polynomial-time algorithm to check that the Galois group of a polynomial is the symmetric group Sn or the alternating group An is due to Landau, and it uses Fact 3.25.9 and Lemma 3.25.7. The algorithm proceeds as follows (Algorithm 3.24.3). The algorithm as written looks worse than it is, since once we know that the Galois group is transitive, we only need to check transitivity of f over ℚ[α] for one root of f , and so on. Still, the complexity lies in the last step, where we need to factor f (which is still of degree n − 5) over an extension of degree O(n5 ), which gives us complexity of order O(n45 ), which is not practical. Since sporadic groups do not occur for Algorithm 3.24.3 Require: f ∈ ℤ[x] a polynomial of degree n. 1: function Landau(f )
2:    K ← ℚ.
3:    R ← roots of f.
4:    for all S ⊂ R; |S| ≤ 5, in increasing order of size do
5:        if f is not irreducible over K[S] then
6:            return false
7:        end if
8:    end for
9:    D ← discriminant of f.
10:   if D is a square of an integer then
11:       return An
12:   end if
13:   return Sn
14: end function
large n, we can do better (by checking just 4-transitivity), which gives us an algorithm of complexity O(n²⁷), which is still not realistic in practice – note that if the implied constant is 1, then computing the Galois group of a quartic will take some 2⁵⁴ ∼ 2 ⋅ 10¹⁶ operations, or several years on a modern computer.
3.25 Some lemmas on permutations Consider a subgroup G of S2n which is known to be a subgroup of Hn = C2 ≀ Sn . We have the following proposition. Proposition 3.25.1. The G is all of Hn if and only if it surjects under Sn under the natural projection and it contains a transposition. Proof. If G = Hn , then clearly it surjects and it does contain a transposition, so we need to show the other direction. The first observation is that if τ ∈ G is a transposition, then it must be supported on a block. Indeed, if (without loss of generality) τ(1) = 3, then τ(2) = 4, contradicting the fact that τ only moves two elements. Since G surjects onto Sn , it acts transitively on the blocks, and so it contains all the transpositions supported on the blocks, and all of their products, so (by counting) it contains the entire kernel of the natural projection H2n → Sn , whence, by counting elements, G must be all of Hn . In order to make Proposition 3.25.1 useful for our algorithmic purposes, we first state an obvious corollary. Corollary 3.25.2. In the statement of Proposition 3.25.1 it is enough to require that G surjects onto Sn and contains an element σ whose cycle structure is (2, n1 , . . . , nk ) where all nk are odd. Proof. The element σ ∏ nk is a transposition. We finally ask how many elements σ of the type described in Corollary 3.25.2 Hn contains. As shown in the proof of Proposition 3.25.1, the 2-cycle must be supported on a block. Since there are n blocks, this can be chosen in n ways. On the other hand, all elements in an odd-length cycle ρ must be contained in distinct blocks (if not, then, without loss of generality, ρk (1) = 2 for some k < |ρ|, but then ρ2k = 1, contradicting the requirement that ρ has odd order). That means that each such cycle must have a “double”: if ρ = (a1 , . . . , ak ), then ρ = (b1 , . . . , bk ) where bi is in the same block as ai . To summarize, the element σ defines, and is uniquely defined by, the following data: a block, an element of Sn−1 of odd order and a preimage of that element under the natural map Hn−1 → Sn−1 . There are n possible blocks, and each element of Sn−1 has 2n−1 possible preimages, so the number of “good” elements σ equals n2n−1 times the number of elements of odd order in Sn−1 . We now need the following fact.
Fact 3.25.3. The number o_n of elements of odd order in Sn is asymptotic to cn!/√n.
Proof. As discussed above, an element of odd order is a product of disjoint odd cycles, so by the standard Flajolet–Sedgewick theory, the exponential generating function of the sequence o_n is given by
$$\exp\Bigl(\sum_{i=0}^{\infty}\frac{z^{2i+1}}{2i+1}\Bigr)=\sqrt{\frac{1+z}{1-z}},$$
so the statement of Fact 3.25.3 is equivalent to the statement that the coefficients of the power series defining $\sqrt{(1+z)/(1-z)}$ are asymptotic to c/√n. This can be done using the machinery of asymptotics (à la Flajolet–Odlyzko [133], see Flajolet and Sedgewick [134] or Pemantle and Wilson [409] for details), but in this case it is much simpler:
$$\sqrt{\frac{1+z}{1-z}}=(1+z)(1-z^2)^{-1/2}.$$
Thinking of $(1-z^2)^{-1/2}$ as a function of z², the binomial theorem tells us that the coefficients are positive and are asymptotic to 1/√n. Multiplying that power series p(z) by z + 1 produces a power series with the same even coefficients, and with the coefficient of z^{2k−1} equal to the coefficient of z^{2k} in p(z).
Remark 3.25.4. The coefficient c is approximately equal to 0.8.
Corollary 3.25.5. The number of elements in Sn which have one transposition and the rest of cycles odd is asymptotic to $\frac{c\,n!}{2\sqrt{n-2}}$.
Proof. A transposition is determined by its support. For each choice of a pair of elements a, b, we know that there are $c(n-2)!/\sqrt{n-2}$ elements with the transposition (a b) and the rest of cycles odd. Since there are n(n − 1)/2 choices of the pair (a, b) we get $\frac{c\,n!}{2\sqrt{n-2}}$, as advertised.
Corollary 3.25.6. The number of elements in Hn of the type described in Corollary 3.25.2 is asymptotic to $\frac{c\,n!\,2^n}{2\sqrt{n-1}}$.
Proof. This follows immediately from Fact 3.25.3 and the discussion immediately above it.
The next fact is universally useful.
Lemma 3.25.7. Suppose G < Sn is a k-transitive group such that the stabilizer (in G) of every k-tuple of points is a transitive subgroup of Sn−k. Then G is (k + 1)-transitive.
Proof. Given points a1, a2, . . . , ak+1, b1, b2, . . . , bk+1 we would like to send ai to bi for i = 1, . . . , k + 1. Since G is k-transitive, there is a g ∈ G such that g(ai) = bi, for i ≤ k. Since
the stabilizer of b1 , . . . , bk is transitive by hypothesis, there is h ∈ G such that h(bi ) = bi , for i = 1, . . . , k, and h(g(ak+1 )) = bk+1 . We will also need another fact. First we give a definition. Definition 3.25.8. A permutation group G acting on a set X is called k-transitive if the induced action on X k is transitive. A 1-transitive group is simply called transitive. Now the fact follows. Fact 3.25.9. Every finite 6-transitive group is either An or Sn . The only other 4-transitive groups are the Mathieu groups M11 , M12 , M23 , M24 . The statement with some indication of the proof strategy can be found in [75, p. 110], though a much more detailed explanation can be found in Cameron’s 1981 survey paper [74].
3.25.1 Jordan’s theorem A very useful result in the detection of large Galois groups is Jordan’s theorem. Theorem 3.25.10 ([223, Theorem 8.18]). Let Ω be a finite set, and let G act primitively on Ω. Further, let Λ ⊆ Ω with |Λ| ≤ |Ω| − 2. Suppose that GΛ (the subgroup of G stabilizing of Λ) acts primitively on Ω\Λ. Then, the action of G on Ω is (|Λ| + 1)-transitive. We will be using some corollaries of Theorem 3.25.10. Corollary 3.25.11 ([223, Theorem 8.17]). Let G be a permutation group acting primitively on Ω and containing a transposition. Then, G is all of SΩ . Corollary 3.25.12. Let G be a permutation group acting primitively on Ω and containing a cycle of prime length l, with |Ω|/2 < l < |Ω| − 4. Then G is either AΩ or SΩ . Proof. If we knew that G acted primitively on Ω, it would follow that the action is 6-transitive, whereupon the result would follow from Fact 3.25.9. However, the existence of a long prime cycle tells us that the action of G is primitive. Corollary 3.25.13 (Corollary of Corollary 3.25.12). It is enough to assume that G has an element g whose cycle decomposition contains a cycle of length l satisfying the hypotheses of Corollary 3.25.12. Proof. Obviously, all the short cycles in the cycle decomposition of g are relatively prime to l. So, raising g to the least common multiple of the lengths of the short cycle will produce an l-cycle.
146 | 3 Randomness and computation in linear groups Remark 3.25.14. It is an easy consequence of the prime number theorem that the density of elements satisfying the hypothesis of Corollary 3.25.12 is asymptotic to log 2/ log n, for n large.
3.26 A bit about polynomials In this section we will discuss some useful facts about polynomials. First, we discuss reciprocal polynomials. Recall that a reciprocal polynomial is one of the form f (z) = ∑i=0n ai xi , where an−i = ai for all i. In the sequel we will always assume that the polynomials are monic, have integer coefficients (unless otherwise specified) and (when reciprocal) have even degree. A reciprocal polynomial f (x) of degree n satisfies the equation xn f (1/x) = f (x), and therefore the roots of f (x) (in the splitting field of f ) come in pairs r, 1/r. To a reciprocal polynomial f we can associate the trace polynomial F of half the degree, by writing f in terms of the variable z = x + 1/x (constructing the trace polynomial is a simple matter of linear algebra, which we leave to the reader). While the Galois group G(f ) of a reciprocal polynomial f (of degree 2n now) is a subgroup of the hyperoctahedral group C2 ≀ Sn , the Galois group of the associated trace polynomial F is the image of G(f ) under the natural projection to Sn (see [489] for a very accessible introduction to all of the above). Another piece of polynomial information we will need is a bound on the discriminant of the polynomial. The best such bound is due to Mahler [326]. His result is the following. Theorem 3.26.1 (Mahler, [326]). Let f (x) = ∑ni=0 ai x i ∈ ℂ[x], and let |f |1 = ∑ni=0 |ai |. Then the discriminant D(f ) of f has the following bound: n n−2 D(f ) ≤ n |f |1 .
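Constructing the trace polynomial mentioned above is indeed a small linear-algebra (or recursive) computation. Here is a sympy sketch (our own illustration) that finds F with f(x) = xⁿ F(x + 1/x) for a reciprocal f of degree 2n:

```python
from sympy import Poly, symbols, expand

x, y = symbols("x y")

def trace_polynomial(f):
    """Trace polynomial F of a reciprocal polynomial f of even degree 2n,
    so that f(x) = x**n * F(x + 1/x).  Uses x**k + x**(-k) = p_k(y) with
    p_0 = 2, p_1 = y and p_k = y*p_{k-1} - p_{k-2}."""
    coeffs = Poly(f, x).all_coeffs()            # leading coefficient first
    deg = len(coeffs) - 1
    assert deg % 2 == 0 and coeffs == coeffs[::-1], "f must be reciprocal of even degree"
    n = deg // 2
    a = coeffs[::-1]                            # a[k] = coefficient of x^k
    p = [2, y]                                  # p_0, p_1
    for k in range(2, n + 1):
        p.append(expand(y * p[k - 1] - p[k - 2]))
    F = a[n] + sum(a[n + k] * p[k] for k in range(1, n + 1))
    return expand(F)

if __name__ == "__main__":
    f = x**4 - 3*x**3 + 5*x**2 - 3*x + 1        # reciprocal of degree 4
    print(trace_polynomial(f))                   # y**2 - 3*y + 3
```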
3.27 The Frobenius density theorem Let f (x) ∈ ℤ[x] be a polynomial of degree n. Its Galois group acts by permutations on roots of f (x) and can thus be viewed as a subgroup of the symmetric group Sn . Reducing f modulo a prime p produces a polynomial fp with coefficients in the finite field 𝔽p , which will factor (over 𝔽p ) into irreducible factors of degrees d1 , . . . , dk , with d1 +⋅ ⋅ ⋅+dk = n. The Galois group of fp is a cyclic permutation group with cycle structure (d1 , d2 , . . . , dk ) and it is a fundamental fact of Galois theory that the Galois group of f over ℚ contains an element with the same cycle type. In fact, a stronger statement is true. Theorem 3.27.1 (The Frobenius density theorem). The density (analytic or natural) of the primes p for which the splitting type of fp is the given type d1 , . . . , dk is equal to the proportion of elements of the Galois group of f with that cycle type.
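The theorem can be observed experimentally: factoring a polynomial modulo many primes and tallying the multisets of factor degrees approximates the distribution of cycle types in the Galois group. A small sympy sketch (our own illustration; the polynomial x⁵ − x − 1 is a standard example with Galois group S₅):

```python
from collections import Counter
from sympy import Poly, discriminant, nextprime
from sympy.abc import x

def cycle_type_statistics(f, num_primes=500):
    """Tally the splitting types of f modulo the first few primes not dividing disc(f)."""
    disc = discriminant(Poly(f, x))
    stats, p, count = Counter(), 2, 0
    while count < num_primes:
        if disc % p != 0:
            factors = Poly(f, x, modulus=p).factor_list()[1]
            degrees = sorted(g.degree() for g, m in factors for _ in range(m))
            stats[tuple(degrees)] += 1
            count += 1
        p = nextprime(p)
    return stats

if __name__ == "__main__":
    stats = cycle_type_statistics(x**5 - x - 1)
    total = sum(stats.values())
    for cycle_type, freq in sorted(stats.items()):
        print(cycle_type, round(freq / total, 3))
    # For Galois group S5, the splitting type (5,) should appear with frequency close to 1/5.
```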
Since the cycle structure of a permutation gives its conjugacy class in Sn , the Frobenius density theorem talks about conjugacy in the symmetric group Sn . The stronger Chebotarev density theorem addresses the finer problem of conjugacy in the Galois group G of f . Even defining the terms needed to state the Chebotarev theorem will take us much too far afield – the reader is, instead, referred to the beautiful survey paper [472]. The fact of the matter is that for the purpose of computing Galois groups it is the Frobenius density theorem which is used (though this is hard to tell from the literature, which invariably refers to the Chebotarev theorem). One of the basic methods of computing Galois groups (or at least showing that they are large permutation groups, such as An or Sn , as is usually the case) consists of factoring the polynomial modulo a number of primes and seeing which cycles one gets. If one gets enough “interesting” cycle types, one knows that the Galois group is An or Sn . If, on the other hand, one keeps not getting the cycle types one expects for too many primes, the probability that the Galois group actually contains them becomes exponentially smaller with each prime one checks, so, after a while, one is reasonably sure that the Galois group is “small.” Since the Frobenius theorem is an asymptotic statement, it is useless without error bounds. Luckily, such were provided in the foundational paper of Lagarias and Odlyzko [277] (their paper concerns the Chebotarev theorem, but we will state it for the Frobenius theorem, since this is all we will ever use). Theorem 3.27.2 (Lagarias–Odlyzko). Let f ∈ ℚ[x] be of degree n while its splitting field L is of degree nL . Suppose the generalized Riemann hypothesis holds for the Dedekind zeta function of L. Let D(f ) be the absolute value of the discriminant of f . Then, if C is a cycle type in Sn , then the number of primes p smaller than x for which the splitting type of fp corresponds to C (which shall be denoted by πC (x, L)) satisfies |C| |C| 1 Li(x) ≤ cLO { x 2 log(D(f )xnL ) + log D(f )}, πC (x, L) − |G| |G|
(3.50)
with an effectively computable constant c.
Remark 3.27.3. In his note [389], Oesterlé announced the following strengthening of the estimate of equation (3.50):
$$\Bigl|\pi_C(x, L) - \frac{|C|}{|G|}\operatorname{Li}(x)\Bigr| \le \frac{|C|}{|G|}\sqrt{x}\Bigl[\log D(f)\Bigl(\frac{1}{\pi} + \frac{5.3}{\log x}\Bigr) + n_L\Bigl(\frac{\log x}{2\pi} + 2\Bigr)\Bigr],$$
(3.51)
which would make the constant c of equation (3.50) equal to 1/(2π). Unfortunately, Oesterlé never published the proofs of the announced result. In the sequel, we will use the constant cLO, which the reader can think of as 1/(2π) if he or she prefers.
Remark 3.27.4. In his paper [448], Serre notes that one can remove the “parasitic” log D(f ) summand from the right-hand side of equation (3.50), to get an estimate of
the following form:
$$\Bigl|\pi_C(x, L) - \frac{|C|}{|G|}\operatorname{Li}(x)\Bigr| \le c_{LO}\,\frac{|C|}{|G|}\bigl\{ x^{1/2}\log(D(f)\,x^{n_L})\bigr\}.$$
(3.52)
Of course, in order for the bounds in terms of the discriminant to be useful, we need to know how big the discriminant is, but luckily we do, thanks to Mahler's bound (Theorem 3.26.1). We also know that the degree of the splitting field of f(x) is at most n! (where n is the degree of f). Substituting this into equation (3.52), we get the following.
Corollary 3.27.5. We have the following estimate:
$$\Bigl|\pi_C(x, L) - \frac{|C|}{|G|}\operatorname{Li}(x)\Bigr| \le c_{LO}\,\frac{|C|}{|G|}\sqrt{x}\,\bigl(|C|\log x + (n\log n + n\log |f|_1)\bigr).$$
(3.53)
3.28 Another Zariski density algorithm Another method to test Zariski density rests on the following fact. Fact 3.28.1. Let H be a subgroup of a semisimple algebraic group G over a field of characteristic 0. Then, H is Zariski dense if and only if the following two conditions hold: 1. the adjoint representation of H on the Lie algebra of G is irreducible; 2. H is infinite. We already know (see Section 3.23.1) how to determine irreducibility. The additional observation is that an element of finite order has a cyclotomic characteristic polynomial, so the Galois group is cyclic. This leads to Algorithm 3.28.1. The problem Algorithm 3.28.1 Require: ϵ > 0 1: function GeneralZariskiDense(G, ϵ)
2:    N ← c log ϵ                                 ⊳ c is a computable constant.
3:    w ← a random product of N generators of Γ.
4:    if the characteristic polynomial of w is cyclotomic then
5:        return FALSE
6:    end if
7:    if the adjoint action of Γ is irreducible on the Lie algebra of G then
8:        return TRUE
9:    end if
10:   return FALSE
11: end function
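The finite-order test on line 4 can be realized numerically: by Kronecker's theorem, a monic integer polynomial all of whose roots lie on the unit circle is a product of cyclotomic polynomials. A sketch (our own, using floating-point eigenvalues with a tolerance rather than exact arithmetic):

```python
import numpy as np

def could_have_finite_order(M, tol=1e-8):
    """Return True if all eigenvalues of the integer matrix M lie on the unit circle,
    i.e. its characteristic polynomial could be a product of cyclotomic polynomials
    (a necessary condition for M to have finite order)."""
    eigenvalues = np.linalg.eigvals(np.asarray(M, dtype=float))
    return bool(np.all(np.abs(np.abs(eigenvalues) - 1.0) < tol))

if __name__ == "__main__":
    finite = [[0, -1], [1, 0]]     # order 4 in SL(2, Z)
    infinite = [[2, 1], [1, 1]]    # hyperbolic, infinite order
    print(could_have_finite_order(finite), could_have_finite_order(infinite))   # True False
```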
with Algorithm 3.28.1 is that the adjoint representation acts on a vector space of dimension dim G, for the usual classical groups, so the running time is going to be of order of O(n14 log ‖𝒢 ‖), which is much worse than the complexity of Algorithm 3.19.1 in the cases where they are both applicable. On the other hand, the beginning of the discussion in the paragraph above begs the following question. Question 3.28.2. Given a collection of matrices in SL(n, ℤ), do they generate SL(n, ℤ)? There appears to be only one practical approach, i. e., to compute the fundamental domain of the span of the matrices on the homogeneous space of SL(n, ℝ). If we are lucky, and that domain is finite-volume, we can answer the question (the author has conducted a number of experiments along these lines). No general attack seems to be available, and it is not even clear whether the question is decidable! Similar sounding questions (like the membership problem) are undecidable in SL(n, ℤ) for n ≥ 4, but the techniques seem unapplicable here. In special cases (which are central to the study of mirror symmetry, see [466, 64]) the question can be decided by a rather diverse set of approaches. Our general set-up is the following. Let Γ be a subgroup of, say, G = SL(n, ℤ), or G = Sp(2n, ℤ) given by explicit generators. What can we say about Γ?
3.29 The base case: rank 1 Consider the modular group G = SL(2, ℤ), and consider Γ = ⟨M1 , M2 , . . . , Mk ⟩. We can ask the following questions. Question 3.29.1. Is the group Γ Zariski dense in SL (2, ℝ)? Question 3.29.2. Is the group Γ finite index in G? Question 3.29.3. Can we give a presentation of Γ? Question 3.29.4. If Γ is not of finite index, can we estimate the Hausdorff dimension of Γ? In this case, the answer to Question 3.29.1 is easy. The group is Zariski dense if and only if it is nonelementary (so has more than 2 limit points on the circle at infinity of the hyperbolic plane) – checking that is a relatively easy computation. Questions 3.29.2 and 3.29.3 are, at least in principle, easily answered by using the Poincaré polygon theorem – see the beautiful paper by Epstein and Petronio [124]. To be more precise, we compute the orbit of our favorite basepoint (for SL (2, ℤ) everyone’s favorite basepoint, at least in the upper half-space model, is the point i = √−1.) Then, we compute the Voronoi triangulation of a (sufficiently large) piece of the orbit – the Voronoi cell of the site √−1 is the Dirichlet domain of the subgroup. If the Dirichlet domain has infinite area, then we know that Γ is of infinite index. Otherwise, it is of finite index,
150 | 3 Randomness and computation in linear groups and the presentation can be deduced from the Poincaré fundamental polygon theorem. An apparently efficient algorithm (using continued fractions ideas) is given by Voight [490]. Even here, however, the computational complexity of the algorithm is a complete mystery (how large is “sufficiently large,” above, for example?). Since this is obviously a fundamental computational question, we state it here. Question 3.29.5. What is the computational complexity of Questions 3.29.2 and 3.29.3? The last question we mention – Question 3.29.4 – has also been studied in this context, most notably by McMullen [334] and Jenkinson and Pollicott [225]. The Pollicott– Jenkinson algorithm is superior for cocompact Kleinian (and Fuchsian) groups, McMullen’s algorithm is superior for groups of finite coarea (such as SL(2, ℤ)). In particular, McMullen computed the dimension of the Apollonian gasket to very high precision. McMullen’s algorithm takes exponential time in the number of digits of the answer, while (in the cusped case) the Jenkinson–Pollicott algorithm is also exponential (though produces rigorous bounds.) Neither has clear running time dependence on the size of the input, so we ask the following question. Question 3.29.6. Given a collection ℳ = {M1 , . . . , Mk } of matrices in SL(2, ℤ), what is the complexity of approximating the Hausdorff dimension of the group generated by ℳ to k decimal places? In any case, we see that in this simplest case, the problems are solvable (both in principle and in practice). The tricky (and very interesting) aspect is situating them in the complexity hierarchy.
3.30 Higher rank In higher rank the questions which are at least tractable for SL(2, ℤ) become much deeper. We have, again, a collection ℳ = {M1 , . . . , Mk } ⊂ SL(n, ℤ), for n > 2, and we consider a subgroup Γ = ⟨ℳ⟩.
3.31 Thin or not? Suppose now that Γ is known to be Zariski dense. Is it arithmetic (of finite index in SL(n, ℤ)) or thin (of infinite index)? This subject has been the subject of much work lately – the favorite sorts of groups for which this question was asked were monodromy groups of algebraic families (or hypergeometric differential equations, if one prefers). A particularly spectacular example were the 14 monodromy groups of Calabi– Yau three-folds (these are subgroups of the symplectic group Sp(4, ℤ)) described by Doran and Morgan in [111]. It turns out (apparently miraculously) that these fall into two groups of seven. The first were shown by Brav and Thomas [64] to be thin – the
authors managed to find a “ping-pong” action of the groups, which showed that they split as amalgamated free products, and thus had to be thin for homological reasons. The second group of seven were shown to be arithmetic by Venkataramana and Signh in [466] by using Venkataramana’s classic theorem that a group containing enough opposing unipotents was arithmetic. In rank 1 (for groups O(n − 1, 1)) some impressive thinness results were obtained by Fuchs, Meiri and Sarnak [148] using again rather sophisticated methods which are not likely to generalize. We are looking for approaches that will usually work, while keeping in mind our conjecture. Conjecture 3.31.1. It is undecidable in general whether a Zariski dense group given by a generating set in SL(n, ℤ) (or Sp(2n, ℤ)) is thin. 3.31.1 Computing the fundamental polyhedron The Riemannian symmetric space for SL(n, ℤ) does not have totally geodesic hyperplanes, so at first it seems that the Dirichlet domain idea does not carry over. However, there is a way to projectivize the model (analogous to constructing the Beltrami–Klein model for hyperbolic space) which makes the computation feasible in principle. The sticking point is geometrical finiteness (or lack thereof): it is quite likely (though not known to the author) that there are (finitely generated) thin groups whose “Dirichlet domains” have infinitely many sides. Related questions have been studied by Kapovich, Leeb and Porti.
3.31.2 Eigenvalues It is an observation of Gelander, Meiri and David [93] that if Γ has an element M ∈ Γ with an eigenvalue λ of M such that no power of λ is real (Meiri and David call such numbers “genuinely complex” and matrices with a genuinely complex eigenvalue “complex”), then the action of M on the projectivized set of pairs of elements in projective space is minimal (that is, every orbit is dense), and it appears that whenever such an element exists, Γ is arithmetic. This is borne out by the Calabi–Yau examples – the Galander–Meiri–David criterion correctly distinguishes the seven thin groups from the seven arithmetic groups – we thus state this as a conjecture. Conjecture 3.31.2 (David, Gelander and Meiri). A group Γ ⊆ SL(n, ℤ) is arithmetic if and only if it contains a complex element. There are follow-on questions: Question 3.31.3. How do we tell if an element M is complex? Question 3.31.4. How easy are complex elements to find, if they exist?
Question 3.31.3 is tractable. It can be shown that in SL(n, ℤ), if an eigenvalue λ is not genuinely complex, then one of λ, λ², . . . , λ^{n²} is real. From this it follows that if a matrix M has no genuinely complex eigenvalues, then one of M, M², . . . , M^{n²} has all real eigenvalues (and that can be checked by a variant of Sturm's algorithm [3]). This is not so bad for small n, but the dependence on n is very unsatisfying, so we would like to find a polynomial (in n) algorithm. Question 3.31.4 is more subtle. First, we present a theorem.
Theorem 3.31.5 ([149, 9]). The random product of length N of the generators of a Zariski dense (thin or not) group has all eigenvalues real with probability 1 − exp(−cN).
Theorem 3.31.5 implies that we will not find a complex element amongst the short products of generators; we are looking for a needle in a haystack.
3.31.3 Asymptotic eigenvalue distribution Consider our Zariski dense group Γ, and consider a long word in the generators. Theorem 3.31.5 tells us that the eigenvalues of this long word will almost surely be all real, so we can look at how they might be distributed. It is then possible that the distribution will be different for thin groups and arithmetic groups. Experiment seems to bear this out: below is a pair of figures (see Figure 3.4) showing the distribution of the natural logarithms of eigenvalues of random products of length 1000 of the generators of SL(4, ℤ) and products of length 1000 of a thin subgroup. We can see that there are some notable differences – we do not know how to prove them at this point, but these are fascinating questions!
Figure 3.4: Arithmetic versus thin.
3.31.4 Finite quotients The final way to try to tell apart the thin and the arithmetic is to look at the congruence quotients. We know (by strong approximation) that these will usually be surjective onto Gp = SL(n, ℤ/pℤ), so the generators ℳ will project to generators of Gp. The Cayley graph of Gp with respect to these generators will be an expander by the work of [66, 417]. But what is its spectral gap? Experiments show that the Cayley graphs obtained are very close to Ramanujan graphs (for which the spectral gap is d − 2√(d − 1), where d is the degree). However, it seems that arithmetic subgroups have a slightly bigger spectral gap. Figure 3.5 gives the experimental results for SL(2, ℤ) – the quantity shown is the difference between the actual second eigenvalue and 2√3. The results show that the generators of Γ(2) give Ramanujan graphs (actually, slightly better), while the thin subgroup gives a spectral gap some 0.05 smaller on average.
Figure 3.5: Arithmetic versus thin congruence projections.
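The experiment behind Figure 3.5 is easy to reproduce in spirit. The following sketch (assuming numpy) enumerates the subgroup of SL(2, ℤ/pℤ) generated by the reductions of two chosen matrices, builds the Cayley graph with respect to these generators and their inverses, and reports its second-largest adjacency eigenvalue. The prime and the generators are illustrative choices of ours, not the ones used for the figure.

import numpy as np

def sl2_cayley_lambda2(p, gens):
    # second-largest adjacency eigenvalue of the Cayley graph of the subgroup of
    # SL(2, Z/pZ) generated by 'gens' (2x2 matrices given as 4-tuples (a, b, c, d))
    def inv(m):                               # inverse of [[a,b],[c,d]] with det = 1
        a, b, c, d = m
        return (d % p, -b % p, -c % p, a % p)
    def mul(m, n):
        a, b, c, d = m
        e, f, g, h = n
        return ((a*e + b*g) % p, (a*f + b*h) % p, (c*e + d*g) % p, (c*f + d*h) % p)
    symm = list(gens) + [inv(g) for g in gens]
    start = (1, 0, 0, 1)                      # enumerate the subgroup by breadth-first search
    index, frontier = {start: 0}, [start]
    while frontier:
        nxt = []
        for x in frontier:
            for s in symm:
                y = mul(x, s)
                if y not in index:
                    index[y] = len(index)
                    nxt.append(y)
        frontier = nxt
    A = np.zeros((len(index), len(index)))    # adjacency matrix of the Cayley graph
    for x, i in index.items():
        for s in symm:
            A[i, index[mul(x, s)]] += 1
    return np.sort(np.linalg.eigvalsh(A))[-2]

# illustrative generators: the parabolic matrices [[1,2],[0,1]] and [[1,0],[2,1]]
lam2 = sl2_cayley_lambda2(7, [(1, 2, 0, 1), (1, 0, 2, 1)])
print(lam2, 2 * np.sqrt(3))                   # compare with the Ramanujan bound 2*sqrt(d - 1) for degree d = 4

Raising the prime p makes the quotient larger and the comparison with the Ramanujan bound more meaningful, at the cost of a bigger eigenvalue computation.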
4 Compression techniques in group theory
4.1 Introduction
Algorithms on compressed data
Data compression is a core technique in computer science that is concerned with the space-efficient representation of data; see [441] for an introduction. Usually the goal is to store data in a compact way so that it takes less space on a disk or in main memory or makes the transfer of the data faster. Another appearance of data compression in the area of algorithm theory might be described by the term algorithmics on compressed data. This area is concerned with efficient algorithms that work on compressed data, e. g., compressed words, compressed trees, compressed graphs, etc. The goal of such algorithms is to check properties of compressed data and thereby beat a straightforward “decompress-and-check” strategy. There are three main applications for algorithms of this kind (see [294] for references).
– In many areas, large data have to be not only stored in compressed form, but the original (uncompressed) data have to be processed and analyzed as well. In such cases, it makes sense to design algorithms that directly operate on the compressed data in order to save the time and space for (de)compression. A typical example is the search for specific patterns in genomes that are stored in a compressed way.
– In some situations it makes sense to compute in a first phase a compressed representation of the input data, which makes regularities explicit. These regularities may be exploited in a second phase for speeding up an algorithm. This principle is known as acceleration by compression.
– Large and often highly compressible data may appear as intermediate data structures in algorithms. In such a situation, one may try to store a compressed representation of these intermediate data structures and to process this representation. This may lead to more efficient algorithms. Examples for this strategy can be found for instance in computational topology or group theory.
In this chapter, we deal with the last point. Our focus will be on applications in group theory, more specifically the solution of word problems. Before we go into the details, let us say a few words about compression in general. When talking about algorithms on compressed objects, one first has to address the following questions:
– What are the objects that are compressed? We will be concerned with the compression of finite words and integers. In principle one can restrict oneself to the case that the objects to be compressed are binary words (words over the alphabet {0, 1}). Integers, for instance, can be encoded in binary representation.
– What is the compressed representation (also known as the compression scheme) used for these objects? After applying a suitable binary coding, the compressed
representation can be written as a binary word as well. Thus, in principle a compression scheme can be viewed as a surjective mapping D : A → {0, 1}∗ . The set A ⊆ {0, 1}∗ is the set of all (binary encodings of) compressed representations of binary words, and D (the decompression function) maps such a binary word w to the (uncompressed) word it represents. We do not assume that D is injective, i. e., a word w ∈ {0, 1}∗ may have several compressed representations (and the set of all such representations is the preimage D−1 (w) ≠ 0). For our purpose, it is not necessary to deal with a concrete binary encoding of the compressed representations. Our compressed representations will be certain directed acyclic graphs. Moreover, the mapping D will always be computable. What kind of tests and operations are performed on the compressed objects? In our applications we have to check equivalence: given compressed objects y and z, check whether D(y) = D(z). Moreover, we have to compute certain operations efficiently on compressed representations. Given such an operation ∘ : {0, 1}∗ × {0, 1}∗ → {0, 1}∗ , this means that from y, z ∈ A we have to compute efficiently an x ∈ A such that D(x) = D(y) ∘ D(z).
Let us remark that compression always has principal limitations: if we have a compression scheme D : A → {0, 1}∗ , then surjectivity of D implies that for every n there has to exist a word w of length n such that D−1 (w) contains no binary word of length smaller than n. Thus, w has no compressed representation that is shorter than w.
Exponential compression
The first compression scheme we deal with allows to represent some (but, by the above remark, not all) words of length n by objects of size 𝒪(log n). We call this phenomenon exponential compression. We achieve exponential compression by avoiding long repetitions in a word. If a word has the form xuyuz, where u, x, y, z are shorter words, then a straightforward idea in data compression is to replace the two occurrences of u by an abbreviation A, which yields the word xAyAz. Moreover, we have to remember that A abbreviates u, which can be written down as a definition A := u. By repeating this step with xAyAz, one obtains a so-called straight-line program (SLP). Here is a simple SLP: A0 := A1 A1 , A1 := A2 A2 , A2 := A3 A3 , A3 := A4 A4 , A4 := ab (in the main text, we will use a slightly different notation). This SLP defines the word (ab)^16 . The length of the word that is defined by an SLP can be exponential in the size of the SLP, where the latter is usually defined as the total number of symbols in all right-hand sides of definitions A := u. Thus, SLPs achieve exponential compression on some words.1 There exist several grammar-based compressors that compute from a given input word a small SLP for that word [80].
1 If one encodes an SLP of size n by a binary word, then the latter has length 𝒪(n ⋅ log n). For our purpose we can neglect this additional logarithmic factor. See [477] for details on the succinct binary coding of SLPs.
For the reader familiar with formal language theory, let us remark that an SLP can also be viewed as a context-free grammar that produces a single word. In many computer science texts, SLPs are introduced in this way. Formally, the underlying compression scheme is the function D that maps (the binary encoding of) an SLP to the word it produces. There are numerous papers on SLPs in computer science. One line of research investigates SLPs as a formalism for word compression (see, e. g., [80]). This is not the focus of this chapter. Another line of research uses SLPs for the efficient solution of algorithmic problems. An overview of the rich literature for this topic can be found in the survey [294]. Here, we are interested in applications of SLP compression in group theory. The basic underlying idea is that if a sequence of automorphisms (or, more generally, homomorphisms) of a group is applied to a group element, then the length of the resulting group element (measured in the shortest word over the generators that represents the group element) may be very large – exponentially in the length of the sequence of automorphisms. But a small (polynomial-size) SLP allows to represent the resulting group element. The second crucial fact is that many important operations can be efficiently carried out on words that are not given explicitly but compressed by SLPs; important examples are testing equality of words, concatenation of words and cutting out factors of a word. This allows us to compute efficiently with the outcomes of sequences of group automorphisms. The most direct application of these ideas concerns the word problem for automorphism groups, and this will be our guiding example in Section 4.6. More precisely, we will show that the word problem for the automorphism group of a free group can be solved in polynomial time. This result goes back to the work of Schleimer [442]. Sections 4.6.6 and 4.6.8 give a survey on further applications of SLPs in group theory.
Tower compression and beyond
SLPs offer exponential compression, but for some applications in group theory this is not enough. An example that we will consider in Section 4.7 is the Baumslag group G1,2 = ⟨a, b, t | t−1 at = a2 , b−1 ab = t⟩ that was introduced by Baumslag in [30]. It is an HNN-extension of the so-called Baumslag–Solitar group B1,2 = ⟨a, t | t−1 at = a2 ⟩, which in turn is an HNN-extension of ℤ. This allows to use the standard algorithm based on Britton reduction in order to solve the word problem for G1,2 . The problem with this algorithm is that it leads to words over the generators of enormous length. On the other hand, what makes these words so long are powers of the form a^n and t^n for huge integers n. These integers are obtained from 1 and −1 by repeated applications of the two operations (x, y) → x + y and (x, y) → x ⋅ 2^y for y ≥ 0. After n applications of the latter operation one can obtain numbers of size tower(n), where the tower function is defined by tower(0) = 1 and
tower(n + 1) = 2^tower(n) . SLPs are therefore not useful for compressing the words obtained during Britton reduction for G1,2 . In order to compress the huge exponents (of size tower(n)) that appear in these words, Myasnikov, Ushakov and Won introduce in [370] a particular kind of arithmetic circuits, called power circuits. In contrast to SLPs, power circuits allow for tower compression. Some numbers of size tower(n) (namely, those that are obtained from the two operations (x, y) → x + y and (x, y) → x ⋅ 2^y ) are compressed down to size 𝒪(n). In a second paper [369] Myasnikov, Ushakov and Won use power circuits to give a polynomial-time algorithm for the word problem for the Baumslag group G1,2 . We will present their algorithm (in a slightly modified way inspired by [103]) in Section 4.7. For some natural examples of groups, even power circuits are not enough to solve the word problem in polynomial time. In [108], Dison and Riley introduce so-called Hydra groups and show that their Dehn functions grow like the Ackermann functions. These form a family of extremely fast growing functions. The third Ackermann function already grows as fast as the abovementioned tower function. Nevertheless, Dison, Einstein and Riley [107] prove that the word problem for a Hydra group can be solved in polynomial time. For this, they use an extremely succinct representation of huge integers that allows so-called Ackermannian compression. This will be explained briefly in Section 4.7.4.
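As a quick illustration (ours, not taken from [370]), the following lines show how a handful of applications of the operation (x, y) → x ⋅ 2^y already produce tower-sized integers; the binary representation of tower(n) has roughly tower(n − 1) bits, whereas a power circuit stores it with 𝒪(n) vertices.

def tower(n):
    return 1 if n == 0 else 2 ** tower(n - 1)

y = 1                          # start from the integer 1
for _ in range(4):
    y = 1 * 2 ** y             # one application of (x, y) |-> x * 2**y, here with x = 1
print(y == tower(4), y)        # True 65536: four applications already reach tower(4)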
4.2 General notations
We will use a few basic notions regarding finite words. For a set of symbols Σ (also called an alphabet), Σ∗ denotes the set of all finite words over Σ, i. e., the set of all finite sequences w = a1 a2 ⋅ ⋅ ⋅ an , where n ≥ 0 and a1 , a2 , . . . , an ∈ Σ. For n = 0 we obtain the empty word, which is denoted by ε. The length of w, |w| for short, is n. For 1 ≤ i ≤ n let w[i] = ai . For numbers i, j ≥ 1 we define the word w[i : j] as follows:
w[i : j] = ε if i > j,
w[i : j] = ai ai+1 ⋅ ⋅ ⋅ aj if 1 ≤ i ≤ j ≤ n,
w[i : j] = w[i : min{j, n}] in all other cases.
In particular, ε[i : j] = ε for all i, j ≥ 1. We also use the short-hand notations w[: i] = w[1 : i] and w[i :] = w[i : n]. A word u is a factor of the word w if there exist words x and y such that w = xuy. A language over the alphabet Σ is a subset of Σ∗ . The set Σ∗ together with the operation of concatenating words is the free monoid generated by Σ; its identity element is the empty word. In many cases, the alphabet Σ will be finite. With Σ−1 = {a−1 | a ∈ Σ} we denote a disjoint copy of Σ. For a word w = a1 a2 ⋅ ⋅ ⋅ an with ai ∈ Σ ∪ Σ−1 we define the word w^{-1} = a_n^{-1} ⋅ ⋅ ⋅ a_2^{-1} a_1^{-1} , where we set (a−1 )−1 = a. The mapping w → w−1 is an involution on (Σ ∪ Σ−1 )∗ , i. e., it satisfies (w−1 )−1 = w for all w ∈ (Σ ∪ Σ−1 )∗ .
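For readers who like to experiment, here is a small Python sketch of the conventions just introduced; the encoding (words as Python strings or lists, with a trailing '-' marking an inverse letter) is ours and only meant to make the definitions concrete.

def subword(w, i, j):
    # w[i : j] with the 1-based convention from the text (w is a Python list or string)
    n = len(w)
    if i > j:
        return w[:0]                     # the empty word
    if 1 <= i <= j <= n:
        return w[i - 1:j]
    return subword(w, i, min(j, n))      # all other cases

def inverse(w):
    # w^{-1}: reverse the word and invert each letter; 'a-' stands for the inverse of 'a'
    flip = lambda x: x[:-1] if x.endswith('-') else x + '-'
    return [flip(x) for x in reversed(w)]

w = list("abcde")
print(subword(w, 2, 4))                  # ['b', 'c', 'd']
print(inverse(['a', 'b-', 'c']))         # ['c-', 'b', 'a-']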
4.3 Background from complexity theory In this section, we review the complexity classes that appear in this chapter. For a detailed introduction into the wide area of complexity theory the reader is advised to consult [10]. Computational complexity theory deals with the difficulty of solving (algorithmic) problems. Problems are usually specified in the following form. input: An object x question: Does object x have a certain property A(x)? The object x should have a finite description; it can be for instance a word over a finite alphabet, a tuple of words, a finite graph, etc. One can assume in fact that x is a word over the binary alphabet {0, 1} (a binary word). More complex objects can be encoded suitably by binary words. A word over an alphabet of more than two symbols can be encoded by a binary word by choosing suitable binary codes for the symbols. A tuple of words can be encoded by a single word by concatenating the entries in the tuple separated with a special symbol #. Similarly, a finite graph can be encoded by concatenating the rows of the adjacency matrix separated with #’s. Hence, we can identify a problem with a language of binary words, namely, the set of all words x that have the specified property (A(x) above). The concrete binary encoding is not important as long as it is natural in the sense that it does not blow up the description artificially. We will always assume such a natural encoding and ignore the specific coding details. Complexity classes are classes of languages over a finite (without loss of generality, binary) alphabet. Usually the languages in a complexity class are defined by restricting the resources (computation time or space) needed by an algorithm to check whether the input word x belongs to the language. A precise definition requires to fix a concrete machine model. Usually, this model is the (multi-tape) Turing machine, but any other reasonable machine model works as well. A model that is much closer to real programming languages than Turing machines is the random access machine (RAM). Basically, the reader can think about a simplified form of C programs, where the only data type are the natural numbers. To avoid the manipulation of very large integers in a single computation step, one usually assumes that the registers store numbers with only log2 (n) bits, where n is the length of the input word. The complexity classes P, NP and RP The most important complexity class for us is P. It is the class of all languages L for which there exists a polynomial-time algorithm that outputs yes (respectively, no) if the input word x belongs to L (respectively, does not belong to L). An algorithm works in polynomial time if there exists a polynomial p(n) such that the algorithm outputs for every input word x after at most p(|x|) computation steps an answer. Usually, the class P is identified with the class of all problems that can be solved efficiently. In
160 | 4 Compression techniques in group theory most cases, we do not state the concrete polynomial that bounds the running time for a problem in P. But in some cases we make statements of the form “problem A can be solved in time 𝒪(nk )” (where k is a constant). Such a concrete running time 𝒪(nk ) refers to the above RAM model with register length log2 (n). Another famous complexity class is NP. A convenient definition is the following. A language L ⊆ {0, 1}∗ belongs to NP if there exists a subset K ⊆ {0, 1}∗ × {0, 1}∗ and a polynomial q(n) such that the following hold: – K belongs to P. – If x ∈ L, then there exists a word z ∈ {0, 1}∗ of length q(|x|) such that (x, z) ∈ K. – If x ∈ ̸ L, then (x, z) ∈ ̸ K for all words z ∈ {0, 1}∗ of length q(|x|). A word z of length q(|x|) such that (x, z) ∈ K is also called a witness for the membership of x in L. A witness can be seen as a solution to a problem, and the definition of NP requires the existence of an algorithm that checks in polynomial time whether a proposed solution (which has to be short in the sense that the length of the solution is polynomially bounded in the input length) is indeed a solution. A typical example for a problem in NP is 3-colorability: Is a given finite undirected graph G 3-colorable? A potential solution is a mapping from the vertices of G to three colors (red, blue and green). Such a solution can be encoded by a word over the colors of length n if the vertices of G are 1, 2, . . . , n. Moreover, one can check in polynomial time whether such a color assignment is indeed a proper coloring; simply check whether for each edge its two end points have different colors. By definition, P ⊆ NP holds. The most famous open problem in theoretical computer science is whether P = NP holds. The general belief is that P is a proper subset of NP. Note that in the above definition of NP, the existence of a single witness z ∈ {0, 1}∗ suffices to show that x ∈ L. If we replace the existence of a witness by requiring that the large majority of all words z ∈ {0, 1}∗ of length q(|x|) are witnesses, then we obtain the class RP (randomized polynomial time): a language L ⊆ {0, 1}∗ belongs to RP if there exists a subset K ⊆ {0, 1}∗ × {0, 1}∗ and a polynomial q(n) such that the following hold: – K belongs to P. – If x ∈ L, then there exist at least 2/3 ⋅ 2q(|x|) words z ∈ {0, 1}∗ of length q(|x|) such that (x, z) ∈ K. – If x ∈ ̸ L, then (x, z) ∈ ̸ K for all words z ∈ {0, 1}∗ of length q(|x|). In other words, if the potential witness z is uniformly chosen from the set of all binary words of length q(|x|), then the probability for the event (x, z) ∈ K is zero (respectively, at least 2/3) if x ∈ ̸ L (respectively, x ∈ L). The choice of 2/3 is arbitrary in the sense that every probability strictly larger than 1/2 would yield the same complexity class. For a complexity class C we denote with coC the class of all complements of languages from C: coC = {{0, 1}∗ \ L | L ∈ C}. It is easy to see that P = coP, but whether
4.3 Background from complexity theory | 161
NP = coNP is open. It is also open whether RP = coRP, but there is some evidence in complexity theory that RP = coRP = P holds. The latter follows from plausible (but unproved) lower bounds from the area of circuit complexity theory [222]. Space bounded complexity classes Let us now come to space bounded computations. Here, the model of Turing machines is the most appropriate one. The following definition assumes knowledge of Turing machines. Fix a monotone function s : ℕ → ℕ. An s(n)-space bounded transducer is a deterministic Turing machine with three tapes: – An input tape. On this tape, the input word is written. The transducer can only read symbols from the input tape; input symbols cannot be overwritten. The input head can move in both directions. – A work tape of length s(n), where n is the length of the input word. The transducer can read and write symbols on the work tape, and the work tape head can move in both directions. – An output tape, which is initially empty. In each step, the transducer either leaves the current output tape cell blank and does not move the output tape head, or writes an output symbol into the current output tape cell and moves the output tape head one position to the right. A function f : {0, 1}∗ → {0, 1}∗ is s(n)-space computable if there exists an s(n)-space bounded transducer that eventually stops with f (x) written on the output tape, if initially x is written on the input tape. A language L ⊆ {0, 1}∗ can be accepted in space s(n) if the characteristic function χL : {0, 1}∗ → {0, 1} (which maps words in L to 1 and words in {0, 1}∗ \L to 0) is s(n)-space computable. The class L (deterministic logspace or logspace for short) is the class of all languages that can be accepted in space c ⋅⌈log2 n⌉ for some constant c ≥ 1. It is known that L ⊆ P, and most experts believe that this is a proper inclusion. The classes PSPACE (polynomial space) and EXPSPACE (exponential space) are the classes of all languages that can be accepted in space p(n) and 2p(n) , respectively, for some polynomial p(n). We have NP ∪ coNP ⊆ PSPACE ⊆ EXPSPACE. Reductions and complete problems Important concepts in complexity theory are reductions and completeness. Reductions allow to compare problems with respect to their algorithmic difficulty, and completeness allows to identify the most difficult problems within a complexity class. A language K ⊆ {0, 1}∗ is logspace reducible to the language L ⊆ {0, 1}∗ if there exists a (c ⋅ ⌈log2 n⌉)-space computable function f : {0, 1}∗ → {0, 1}∗ (for some constant c) such that for all x ∈ {0, 1}∗ : x ∈ K if and only if f (x) ∈ L. If K is logspace reducible to L and L ∈ P, then also K ∈ P. The same implication holds if we replace P by any other of the complexity classes introduced above. A language L is hard for the complexity class C (C-hard for short) if every K ∈ C is logspace reducible to L. This means that L is at
162 | 4 Compression techniques in group theory least as difficult as all languages in C. A language L is complete for the complexity class C (C-complete for short) if L is C-hard and belongs to C. One can think of C-complete languages as the most difficult problems in the class C. If P ≠ NP, then NP-complete languages as well as coNP-complete languages cannot belong to P. An example of an NP-complete problem is 3-colorability (Is a given finite graph 3-colorable?). Parallel complexity classes Another important subclass of P is the class NC, which stands for Nick’s class (named after Nick Pippenger). Intuitively, it is the class of problems in P that can be parallelized efficiently. To formalize this, one can use the model of parallel RAM (PRAM). A PRAM consists of several RAMs (the processors) that compute in parallel. To communicate with each other, these processors work on a shared set of registers. Then NC is the class of all problems that can be solved on a PRAM with p(n) many processors (where p(n) is a polynomial) in time 𝒪((log n)k ) for some constant k. Here, n is the length of the input word. One also says that the problem can be solved in polylogarithmic time using polynomially many processors. Quite often, one finds an alternative definition of NC using families of Boolean circuits. It is known that L ⊆ NC ⊆ P, but it is not known whether these inclusions are proper. If NC ≠ P, then P-complete languages cannot belong to NC. A P-complete problem in group theory is the generalized word problem for the free group F2 of rank 2 (see Section 4.5.1): given elements g1 , . . . , gn , g ∈ F2 , does g belong to the subgroup of F2 that is generated by g1 , . . . , gn [17]? Analogously to RP, one can also define a randomized version of NC. A language L ⊆ {0, 1}∗ belongs to randomized NC (RNC) if there exists a subset K ⊆ {0, 1}∗ × {0, 1}∗ and a polynomial q(n) such that the following hold: – K belongs to NC. – If x ∈ L, then there exist at least 2/3 ⋅ 2q(|x|) words z ∈ {0, 1}∗ of length q(|x|) such that (x, z) ∈ K. – If x ∈ ̸ L, then (x, z) ∈ ̸ K for all words z ∈ {0, 1}∗ of length q(|x|). A famous problem that belongs to RNC ∩ P, but which is not known to be in NC, is the existence of a perfect matching in a graph. More details on the classes NC and RNC can be found in [182].
4.4 Rewrite systems Some results in this chapter can be shown conveniently using rewrite systems (more details can be found in [52]). Consider a binary relation R ⊆ A × A on the set A and let R∗ be its reflexive and transitive closure. The relation R is terminating if there does not exist an infinite sequence a1 , a2 , a3 , . . . such that (ai , ai+1 ) ∈ R for all i ≥ 1. The relation
R is confluent if for all (a, b), (a, c) ∈ R∗ there exists a d ∈ A such that (b, d), (c, d) ∈ R∗ . A terminating relation R is confluent if and only if it is locally confluent, which means that for all (a, b), (a, c) ∈ R there exists a d ∈ A such that (b, d), (c, d) ∈ R∗ . The set of normal forms of R is NF(R) = A \ {a ∈ A | ∃b ∈ A : (a, b) ∈ R}. For a terminating and confluent relation R, the following can be shown: – For every element a ∈ A there exists a unique normal form NFR (a) such that (a, NFR (a)) ∈ R∗ . – If R̃ = (R∪R−1 )∗ is the smallest equivalence relation that contains R, then (a, b) ∈ R̃ if and only if NFR (a) = NFR (b).
A (word) rewrite system is a subset ℛ ⊆ Γ∗ × Γ∗ for an alphabet Γ. A pair (u, v) ∈ ℛ is written as u → v. We extend ℛ to the binary relation →ℛ that is defined by w1 →ℛ w2 if and only if there exist (u → v) ∈ ℛ and x, y ∈ Γ∗ such that w1 = xuy and w2 = ∗ xvy. The smallest equivalence relation containing →ℛ is denoted by ↔ℛ . We say that ℛ is terminating (respectively, confluent, locally confluent) if →ℛ has this property. We write NF(ℛ) for NF(→ℛ ) and NFℛ (w) for NF→ℛ (w) (the latter is used only if ℛ is terminating and confluent). Note that NF(ℛ) is the set of all finite words that do not contain a left-hand side u of a rule u → v as a factor. If ℛ is terminating, then, as remarked above, ℛ is confluent if and only if it is locally confluent. The latter can be checked by considering all triples (u, u1 , u2 ), where u is obtained by overlapping two left-hand sides of rules and u1 and u2 are obtained from u by replacing these left-hand sides by the corresponding right-hand sides (see [52] for a precise definition). The pair (u1 , u2 ) is also called a critical pair of ℛ. Then, a terminating ℛ is confluent if and only if for every critical pair (u1 , u2 ) there exists a word v such that u1 →∗ℛ v and u2 →∗ℛ v.
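A terminating rewrite system can be run naively by replacing left-hand sides until none occurs as a factor. The following sketch (ours) does exactly that and, anticipating Section 4.5.1, uses the free-group system ℛΓ as an example, with capital letters standing for inverse generators.

def normal_form(word, rules):
    # naively rewrite 'word' with the terminating rewrite system 'rules'
    # (a list of pairs lhs -> rhs) until no left-hand side occurs as a factor
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            i = word.find(lhs)
            if i != -1:
                word = word[:i] + rhs + word[i + len(lhs):]
                changed = True
    return word

# the system R_Gamma for the free group F({a, b}), writing A, B for the inverses of a, b
free_reduction = [("aA", ""), ("Aa", ""), ("bB", ""), ("Bb", "")]
print(normal_form("abBAab", free_reduction))   # 'ab' (the system is confluent, so the rewriting order does not matter)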
4.5 Groups and the word problem Our main application of compression techniques in group theory will be the efficient solution of word problems. This section gives a short overview of known results in this area. More details can be found in [219, 313]. For a general introduction into the area of group theory, see [435].
4.5.1 Presentations for groups Given a group G and a subset Γ ⊆ G, we denote with ⟨Γ⟩ the subgroup of G generated by Γ. It is the set of all products of elements from Γ ∪ Γ−1 . We only consider finitely generated groups. Thus, for every group G there is a finite set Γ ⊆ G such that G = ⟨Γ⟩; such a set Γ is called a finite generating set for G. That means that there exists a surjective monoid homomorphism π : (Γ ∪ Γ−1 )∗ → G that preserves the involution:
164 | 4 Compression techniques in group theory π(w−1 ) = π(w)−1 . We also say that the word w represents the group element π(w). For words u, v ∈ (Γ ∪ Γ−1 )∗ we say that u = v in G if π(u) = π(v). Quite often we will describe groups by presentations. In general, if H is a group and R ⊆ H is a set of so-called relators, then we denote with ⟨H | R⟩ the quotient group H/NR , where NR is the smallest normal subgroup of H with R ⊆ NR . Formally, we have NR = ⟨{hrh−1 | h ∈ H, r ∈ R}⟩. For group elements gi , hi ∈ H (i ∈ I) we also write ⟨H | gi = hi (i ∈ I)⟩ for the group ⟨H | {gi h−1 i | i ∈ I}⟩. For a set Γ, let F(Γ) be the free group generated by Γ. One can identify this group with the set of normal forms NF(ℛΓ ) ⊆ (Γ ∪ Γ−1 )∗ of the rewrite system ℛΓ with the following rules, where a ∈ Γ: aa−1 → ε, a−1 a → ε. Note that NF(ℛΓ ) is the set of all finite words that do not contain a factor of the form aa−1 or a−1 a for a ∈ Γ. Such words are also called reduced. It is easily seen that ℛΓ is terminating and confluent. Multiplication in the free group F(Γ) is defined by the following law, where u, v ∈ NF(ℛΓ ): u ⋅ v = NFℛΓ (uv). If Γ is finite and of cardinality n, we also write Fn for F(Γ) (the free group of rank n). For a set Γ and a set R ⊆ NF(ℛΓ ) of relators we define the group ⟨Γ | R⟩ = ⟨F(Γ) | R⟩. Every group G that is generated by Γ can be written as ⟨Γ | R⟩ for some R ⊆ NF(ℛΓ ). A group ⟨Γ | R⟩ with Γ and R finite is called finitely presented, and the pair (Γ, R) is a presentation for the group ⟨Γ | R⟩. Given two groups G1 = ⟨Γ1 | R1 ⟩ and G2 = ⟨Γ2 | R2 ⟩, where without loss of generality Γ1 ∩ Γ2 = 0, we define their free product G1 ∗ G2 = ⟨Γ1 ∪ Γ2 | R1 ∪ R2 ⟩. 4.5.2 The word problem In his seminal paper from 1911 [95], Dehn introduced the word problem (called Identitätsproblem by Dehn). Definition 4.5.1 (Word problem). Let G be a finitely generated group with a finite generating set Γ; hence G is isomorphic to ⟨Γ | R⟩ for a set of relators R ⊆ NF(ℛΓ ). The word problem for G with respect to Γ, briefly WP(G, Γ), is the following problem: input: A word w ∈ (Γ ∪ Γ−1 )∗ question: Does w = 1 hold in G, i. e., does NFℛΓ (w) belong to the normal subgroup of F(Γ) generated by R?
The way we defined it, the word problem depends on the chosen generating set for G. But if we are only interested in the decidability/complexity of word problems, the actual generating set is not relevant; if Γ and Σ are two finite generating sets for G, then WP(G, Γ) is logspace reducible to WP(G, Σ). This justifies the notation WP(G) for the word problem of G. Novikov [387] and independently Boone [53] proved in the 1950s the following seminal result. Theorem 4.5.2 ([53, 387]). There is a finitely presented group with an undecidable word problem. A modern treatment of this result can be found in [475]. Fortunately, many classes of groups with decidable word problems are known. Here is a nonexhaustive list of such classes: – Finitely generated linear groups, i. e., finitely generated groups that can be embedded into the general linear group GLn (F) for some field F and n ≥ 1: for every finitely generated linear group, the word problem can be solved in logspace. This was shown by Lipton and Zalcstein [291] for fields of characteristic zero and by Simon for fields of prime characteristic [464]. Important subclasses of finitely generated linear groups are finitely generated nilpotent groups, polycyclic groups, Coxeter groups, braid groups, graph groups (also known as right-angled Artin groups) and hence finitely generated free groups. – Finitely generated metabelian groups: A group G is metabelian if it is solvable of derived length at most two, or equivalently, the commutator subgroup of G is abelian. Finitely generated metabelian groups are in general not linear, but they can be embedded into finite direct products of linear groups [495]. This implies that also for finitely generated metabelian groups the word problem can be solved in logspace. – Word-hyperbolic groups: these groups were introduced by Gromov [187]. The usual definition says that a finitely generated group is word-hyperbolic if there exists a constant δ such that every geodesic triangle in the Cayley graph of G is δ-thin (which means that every point on one of the three sides of the triangle belongs to the δ-neighborhood of the opposite two sides). Free groups are for instance word-hyperbolic; one can choose δ = 0. It is known that a finitely generated group is word-hyperbolic if and only if it has a linear Dehn function [187] (see Section 4.5.4 below). For every word-hyperbolic group, the word problem can be solved in linear time. – Automatic groups [123]: these are groups where the right-multiplication with a generator can be recognized (in a certain way) by a finite two-tape automaton. Automatic groups cover many important classes like braid groups [11], Coxeter groups, graph groups and hyperbolic groups. The word problem for an automatic group can be solved in quadratic time. – Automaton groups [371]: these are certain finitely generated subgroups of the automorphism groups of regular rooted trees, where the automorphisms are de-
scribed by invertible Mealy automata. Famous members of this class are the Grigorchuk group and the Gupta–Sidki groups (see [219, 371] for references), which are finitely generated infinite torsion groups. In addition the Grigorchuk group was the first example of a finitely generated group with intermediate growth as well as the first example of a group that is amenable but not elementary amenable. The word problem for every automaton group belongs to PSPACE, and recently an automaton group with a PSPACE-complete word problem has been constructed [493]. The Grigorchuk group and the Gupta–Sidki groups are so-called contracting automaton groups for which the word problem can be solved in logspace [23] (for the Grigorchuk group a proof can be also found in [159]). One-relator groups: these are groups of the form ⟨Γ | {r}⟩ (⟨Γ | r⟩ for short) for a single relator r ∈ NF(ℛΓ ). Magnus’s breakdown procedure [319] solves the word problem for every one-relator group. The complexity of the word problem for onerelator groups remains a mystery. Magnus’s breakdown procedure is nonelementary (i. e., its running time is not bounded by a fixed tower of exponents), and no better procedure that works for all one-relator groups is known. On the other hand, no example of a one-relator group, for which the word problem is provably not in P, is known [369].
4.5.3 HNN-extensions HNN-extension is an extremely important operation for constructing groups that arises in all parts of combinatorial group theory. It is also the key operation in all modern proofs of Theorem 4.5.2. Take a group H and a fresh generator t ∈ ̸ H, from which we obtain the free product H ∗ ⟨t⟩ ≅ H ∗ ℤ. Assume now that A, B ≤ H are two isomorphic subgroups of H and let φ : A → B be an isomorphism. Then, the group ⟨H ∗ ⟨t⟩ | t −1 at = φ(a) (a ∈ A)⟩ is called the HNN-extension of A with associated subgroups A and B (usually, the isomorphism φ is not mentioned explicitly). The above HNN-extension is usually written as ⟨H, t | t −1 at = φ(a) (a ∈ A)⟩. Britton [70] proved the following fundamental result on HNN-extensions. Theorem 4.5.3 (Britton’s lemma [70]). Let the group H be generated by Γ and let G = ⟨H, t | t −1 at = φ(a) (a ∈ A)⟩ be an HNN-extension. If a word w ∈ (Γ ∪ Γ−1 ∪ {t, t −1 })∗ represents the identity of G, then w contains a factor of the form t −1 ut (respectively, tut −1 ), where u ∈ (Γ ∪ Γ−1 )∗ represents an element of A (respectively, B). A subword of the form t −1 ut (respectively, tut −1 ), where u ∈ (Γ ∪ Γ−1 )∗ represents an element of A (respectively, B), is also called a pin. A simple corollary of Britton’s lemma is that H is a subgroup of the HNN-extension ⟨H, t | t −1 at = φ(a) (a ∈ A)⟩. Britton’s lemma can also be used to solve the word
problem for an HNN-extension ⟨H, t | t −1 at = φ(a) (a ∈ A)⟩. For this we need several assumptions: – The word problem for H is decidable. – There is an algorithm that decides whether a given word u ∈ (Γ ∪ Γ−1 )∗ (where Γ generates H) represents an element of A (respectively, B). – Given a word u ∈ (Γ ∪ Γ−1 )∗ that represents an element a ∈ A (respectively, b ∈ B), one can compute a word v ∈ (Γ ∪ Γ−1 )∗ that represents the element φ(a) (respectively, φ−1 (b)). Let us denote this word v with φ(u) (respectively, φ−1 (u)). Then, given a word w ∈ (Γ ∪ Γ−1 ∪ {t, t −1 })∗ one replaces pins t −1 ut (respectively, tut −1 ) by φ(u) (respectively, φ−1 (u)) in any order, until no more pins occur. If the final word does not belong to (Γ ∪ Γ−1 )∗ , then we have w ≠ 1 in the HNN-extension. If the final word belongs to (Γ ∪ Γ−1 )∗ , then one uses the algorithm for the word problem of H to check whether it represents the group identity. This algorithm is known as Britton reduction.
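To make the procedure concrete, here is a small sketch (ours) of Britton reduction in the special case of the Baumslag–Solitar group BS1,2 = ⟨a, t | t−1 at = a2 ⟩ that we will meet again in Sections 4.5.4 and 4.7.1; here A = ⟨a⟩, B = ⟨a2 ⟩ and φ(a) = a2 . A word is stored as a list of t-letters and a-powers. The sketch favors clarity over efficiency: the exponents it produces can grow exponentially, which is exactly the blowup that power circuits address in Section 4.7.

def britton_reduce(word):
    # Britton reduction for BS(1,2): ('a', k) stands for a^k, ('t', e) with e = +1/-1 for t^e
    def merge(w):                        # merge adjacent a-powers and drop a^0, t^0
        out = []
        for letter, e in w:
            if e == 0:
                continue
            if out and out[-1][0] == letter == 'a':
                out[-1] = ('a', out[-1][1] + e)
                if out[-1][1] == 0:
                    out.pop()
            else:
                out.append((letter, e))
        return out

    word = merge(word)
    while True:
        for i in range(len(word)):
            # pins: t^-1 a^k t (any k, including k = 0) and t a^k t^-1 (k even)
            if word[i] == ('t', -1):
                if i + 1 < len(word) and word[i + 1] == ('t', 1):
                    word = merge(word[:i] + word[i + 2:]); break
                if (i + 2 < len(word) and word[i + 1][0] == 'a'
                        and word[i + 2] == ('t', 1)):
                    word = merge(word[:i] + [('a', 2 * word[i + 1][1])] + word[i + 3:]); break
            if word[i] == ('t', 1):
                if i + 1 < len(word) and word[i + 1] == ('t', -1):
                    word = merge(word[:i] + word[i + 2:]); break
                if (i + 2 < len(word) and word[i + 1][0] == 'a'
                        and word[i + 1][1] % 2 == 0 and word[i + 2] == ('t', -1)):
                    word = merge(word[:i] + [('a', word[i + 1][1] // 2)] + word[i + 3:]); break
        else:
            return word                  # no pin left: the word is Britton-reduced

# t^-1 a t a^-2 = 1 in BS(1,2); an empty reduced word means the input represents the identity
print(britton_reduce([('t', -1), ('a', 1), ('t', 1), ('a', -2)]))   # []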
4.5.4 The Dehn function Let us introduce an important geometric complexity measure for finitely presented groups, which is related to the computational complexity of the word problem. Let G = ⟨Γ | R⟩ be a finitely presented group. Thus, G = F(Γ)/NR , where NR is the smallest normal subgroup of F(Γ) that contains R. If a reduced word w ∈ NF(ℛΓ ) represents the identity of G, then w can be obtained by computing the normal form (with respect to ℛΓ ) of a word of the form c1−1 r1 c1 c2−1 r2 c2 ⋅ ⋅ ⋅ cn−1 rn cn , where n ≥ 0, ci ∈ NF(ℛΓ ) and ri ∈ R ∪ R−1 . The smallest such number n is called the area of w, or area(w) for short. In other words, the area of w is the smallest number n such that in the free group F(Γ), w is a product of n conjugates of elements from R∪R−1 . The term “area” comes from the fact that area(w) has a nice geometric interpretation in terms of van Kampen diagrams (see [313] for more details). A van Kampen diagram is basically a planar graph with edges labeled by reduced words. The labels around a face yield a cyclic rotation of a relator or the inverse of a relator. Moreover, the edge labels along the boundary of the graph yield a cyclic rotation of the word w. The Dehn function for the presentation (Γ, R) is the function DΓ,R : ℕ → ℕ with DΓ,R (m) = max{area(w) | w ∈ NF(ℛΓ ), w = 1 in G, |w| ≤ m}. In other words, DΓ,R (m) is the maximal area of a reduced word of length at most m that represents the identity of G. Clearly, different finite presentations for G may yield
168 | 4 Compression techniques in group theory different Dehn functions. On the other hand, if (Γ1 , R1 ) and (Γ2 , R2 ) are two finite presentations for the group G, then there exists a constant c > 0 such that DΓ1 ,R1 (n) ≤ c ⋅ DΓ2 ,R2 (cn + c) + cn + c [165, Proposition 2.4]. Hence, whether the Dehn function is linear, quadratic, cubic, exponential, etc., does not depend on the chosen presentation for a group. Therefore, one also speaks of the Dehn function D(n) of the group G. For every infinite finitely presented group the Dehn function grows at least linear, and by [187], a group has a linear Dehn function if and only if it is word-hyperbolic. For more details on Dehn functions the reader is advised to consult [63]. The following relationship between the Dehn function of a group and the complexity of the word problem is well known (see [296, Proposition 2.22] for a proof). Theorem 4.5.4. Let G be a finitely presented group with a computable Dehn function D(n). Then the word problem for the group G can be solved in time 𝒪(D(n)) on a nondeterministic Turing machine. In particular, if G has a polynomial Dehn function, then the word problem for G belongs to NP. The basic idea for the proof of Theorem 4.5.4 is to guess a potential van Kampen diagram for w of area at most D(n) (this is the nondeterministic part of the algorithm). In a second phase, one has to verify whether the guessed planar graph is indeed a van Kampen diagram for w. The algorithm underlying Theorem 4.5.4 is very generic. To carry it out for a group, one needs besides the presentation only a computable bound on the Dehn function. But in many cases, more efficient algorithms exist. Note that the best complexity upper bound that can be obtained from Theorem 4.5.4 is NP (at least for infinite groups), but there exist many groups for which the word problem belongs to P (see Section 4.5.2). For instance, for every finitely generated linear group, the word problem can be solved in logspace. The so-called Baumslag–Solitar group BS1,2 = ⟨a, t | t −1 at = a2 ⟩ (which we will meet again in Section 4.7.1) is finitely generated linear and has a Dehn function of growth 2n [165]. Hence, Theorem 4.5.4 only yields a nondeterministic exponential-time algorithm for the word problem for BS1,2 . In Section 4.7 we will see even more extreme examples. In fact, the gap between the complexity of the word problem and the Dehn function cannot be bounded by any recursive function [87]. The strongest statement of this form was shown by Kharlampovich, Myasnikov and Sapir: for any recursive function f (n) there exists a finitely presented residually finite 3-solvable group G such that (i) WP(G) belongs to P and (ii) the Dehn function of G is not bounded by f (n) [261].
4.6 Exponential compression In this section, we consider SLPs, which allow to represent some words of length n by a description of length 𝒪(log n). We call this exponential compression. SLPs achieve good compression for words that contain many repetitive factors. We will use SLPs in
order to develop polynomial-time algorithms for the word problems of various automorphism groups and group extensions. Our guiding example will be the word problem for Aut(F3 ): the automorphism group for a free group of rank 3.
4.6.1 Motivation: the word problem for Aut(F3 ) Consider F3 , the free group of rank 3, and fix the group generators a, b and c for F3 . In 1924, Nielsen [380] proved that Aut(Fn ) is finitely generated for all n. A concrete generating set for n = 3 consists of the following four automorphisms (called elementary Nielsen transformations): – φ1 (a) = b, φ1 (b) = a, φ1 (c) = c, – φ2 (a) = b, φ2 (b) = c, φ2 (c) = a, – φ3 (a) = a−1 , φ3 (b) = b, φ3 (c) = c, – φ4 (a) = ab, φ4 (b) = b, φ4 (c) = c. Moreover, Aut(Fn ) is finitely presented, but this is not important for us. Our goal is to solve the word problem for Aut(F3 ) efficiently. Thus, we are given a word ψ1 ψ2 ⋅ ⋅ ⋅ ψn with ψi ∈ {φi , φ−1 i | 1 ≤ i ≤ 4}, and we want to know whether the composition of these automorphisms is the identity mapping on F3 . We assume that we compose from left to right, i. e., we first apply ψ1 , then ψ2 , and so on. Here is a straightforward solution for the word problem. Clearly, ψ ∈ Aut(F3 ) is the identity mapping on F3 if and only if ψ(x) = x for x ∈ {a, b, c}. These three identities have to hold in F3 . From the sequence ψ1 ψ2 ⋅ ⋅ ⋅ ψn , we can clearly compute the words ψn (ψn−1 (⋅ ⋅ ⋅ ψ1 (x) ⋅ ⋅ ⋅)) for x ∈ {a, b, c}. It therefore remains to check whether ψn (ψn−1 (⋅ ⋅ ⋅ ψ1 (x) ⋅ ⋅ ⋅))x−1 = 1
for x ∈ {a, b, c}   (4.1)
holds. These are three instances of the word problem for F3 . The word problem for a free group can be solved very efficiently in linear time by computing normal forms with respect to the rewrite system from Section 4.5.1. But the problem is that the words in (4.1) can be very long. Consider for instance (φ4 φ4 φ1 )n . If we apply φ4 φ4 φ1 to a we obtain baa. Hence, after each further application of φ4 φ4 φ1 we at least double the number of a’s. This implies that the length of the resulting word is exponential in n. Can we do better? First of all, the Dehn function of Aut(F3 ) is known to be exponential [68]. Hence, we do not gain anything from Theorem 4.5.4. Moreover, Aut(F3 ) is not linear [138].2 Hence, one cannot apply the results of Lipton and Zalcstein [291] and Simon [464] mentioned in Section 4.5.2. 2 Aut(F2 ) is linear; therefore we consider Aut(F3 ) instead of Aut(F2 ) in this section.
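The blowup just described is easy to observe experimentally. The following sketch (ours) applies the elementary Nielsen transformations as substitutions on words over {a, b, c} (capital letters would stand for inverse generators) and prints the length of the image of a under ten rounds of φ4 φ4 φ1 .

def apply_endo(phi, word):
    # apply an endomorphism of F(a, b, c), given as a dict on the generators, to a word
    def image(letter):
        if letter.islower():
            return phi[letter]
        # image of an inverse letter: invert and reverse the image of the generator
        return ''.join(c.upper() if c.islower() else c.lower()
                       for c in reversed(phi[letter.lower()]))
    return ''.join(image(x) for x in word)

phi1 = {'a': 'b', 'b': 'a', 'c': 'c'}      # swap a and b
phi4 = {'a': 'ab', 'b': 'b', 'c': 'c'}     # a -> ab

w = 'a'
for _ in range(10):                        # ten rounds of phi4 phi4 phi1, applied from left to right
    for phi in (phi4, phi4, phi1):
        w = apply_endo(phi, w)
print(len(w))                              # 8119: the length grows exponentially in the number of rounds

Since the images of the generators under φ1 and φ4 contain no inverse letters, no free reduction is needed in this particular experiment.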
The question whether the word problem for Aut(Fn ) belongs to P for n ≥ 3 appeared on the list of open problems in group theory [240]. An affirmative answer was finally given by Schleimer [442] using compression by SLPs, which we define next.
4.6.2 Straight-line programs
Definition 4.6.1 (Straight-line program). A straight-line program, briefly SLP, over the terminal alphabet Γ is a tuple 𝒢 = (V, Γ, S, ρ), where V is a finite set of variables, Γ is the set of terminal symbols, S ∈ V is the start variable and ρ is a mapping of type ρ : V → (Γ ∪ V)∗ such that the following holds. There exists a linear enumeration A1 , A2 , . . . , An of the variables of 𝒢 such that An = S and for all 1 ≤ i ≤ n,
ρ(Ai ) ∈ (Γ ∪ {A1 , . . . , Ai−1 })∗   (4.2)
(in particular, ρ(A1 ) ∈ Γ∗ ). The word ρ(A) is also called the right-hand side of A. For a given SLP 𝒢 = (V, Γ, S, ρ) we homomorphically extend ρ to a mapping ρ : (Γ ∪ V)∗ → (Γ ∪ V)∗ by setting ρ(a) = a for a ∈ Γ. Then, property (4.2) ensures that for every word w ∈ (Γ ∪ V)∗ we have ρn (w) ∈ Γ∗ . We denote the mapping ρn with val𝒢 and we omit the subscript 𝒢 if 𝒢 is clear from the context. Finally, the word defined by 𝒢 is val(𝒢 ) = val𝒢 (S). The reader who is familiar with context-free grammars will notice that an SLP is nothing else than a context-free grammar (with productions A → ρ(A)) that generates the singleton language {val(𝒢 )}. Another view point is to see an SLP as a multiplicative circuit over a free monoid Σ∗ , where the variables are gates that compute the concatenation of its inputs. In algebraic complexity theory, the term “straight-line program” is also used for algebraic circuits that compute (multivariate) polynomials. In such a circuit, every internal gate either computes the sum or the product of its inputs, and the input gates of the circuit are labeled with constants or variables. The size of the SLP 𝒢 = (V, Γ, S, ρ) is |𝒢 | = ∑A∈V |ρ(A)|, which is the total number of symbols in all right-hand sides. Up to a factor of 𝒪(log2 (|V| + |Σ|)) this is the number of bits in a natural binary encoding of 𝒢 (which would be the true input length). One can assume that V = {1, . . . , m} and Σ = {m + 1, . . . , n} for some n > m and that m is the start variable of 𝒢 . For 1 ≤ i ≤ n + 1 let β(i) be the i-th binary word of length ⌈log2 (n + 1)⌉ in lexicographic order and extend β to a homomorphism from {1, . . . , n + 1}∗ to {0, 1}∗ . Then, we can encode 𝒢 by the following binary word: 1n 0 β(ρ(1)) β(n + 1) β(ρ(2)) β(n + 1) ⋅ ⋅ ⋅ β(ρ(m − 1)) β(n + 1) β(ρ(m)). An SLP is in Chomsky normal form if ρ(A) ∈ Γ ∪ VV for all A ∈ V. Every SLP that produces a nonempty word can be transformed in linear time into an equivalent SLP in Chomsky normal form [296, Proposition 3.8].
Example 4.6.2. Consider the SLP 𝒢 = ({A1 , A2 , . . . , A7 }, {a, b}, A7 , ρ), where ρ(A1 ) = b, ρ(A2 ) = a, and ρ(Ai ) = Ai−1 Ai−2 for 3 ≤ i ≤ 7. Then val(𝒢 ) = abaababaabaab, which is the seventh Fibonacci word. The SLP 𝒢 is in Chomsky normal form and |𝒢 | = 12. A simple induction shows that |val(𝒢 )| ≤ 𝒪(3^(m/3) ) for every SLP 𝒢 of size m [80, proof of Lemma 1]. On the other hand, it is straightforward to define an SLP 𝒢n in Chomsky normal form of size 2n such that |val(𝒢n )| ≥ 2^n . Hence, an SLP can be seen as a compressed representation of the word it generates, and exponential compression rates can be achieved in this way. Let us also mention that for every word w ∈ Γ∗ there exists an SLP of size 𝒪(n/ log_k n), where n = |w| and k = |Γ| [42]; in that paper SLPs are called word chains.
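The following small sketch (ours) encodes an SLP as a Python dictionary from variables to right-hand sides and expands it recursively; it reproduces Example 4.6.2. Full decompression is of course exponential in the worst case and is shown only to make Definition 4.6.1 concrete.

from functools import lru_cache

def make_val(rho):
    # val_G for an SLP given as a dict: variable -> right-hand side (a list of
    # variables and terminal characters); memoization expands each variable once,
    # but the resulting word can still be exponentially long in the size of the SLP
    @lru_cache(maxsize=None)
    def val(symbol):
        if symbol not in rho:           # terminal symbol
            return symbol
        return ''.join(val(x) for x in rho[symbol])
    return val

# the SLP from Example 4.6.2 (A7 is the start variable)
rho = {'A1': ['b'], 'A2': ['a']}
for i in range(3, 8):
    rho[f'A{i}'] = [f'A{i-1}', f'A{i-2}']
val = make_val(rho)
print(val('A7'))                        # abaababaabaab, the seventh Fibonacci word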
4.6.3 Jeż’s algorithm for equality checking One of the most basic tasks for SLP-compressed words is compressed equality checking: input: Two SLPs 𝒢 and ℋ. question: Does val(𝒢 ) = val(ℋ) hold? Clearly, a simple decompress-and-compare strategy is very inefficient. It takes exponential time to compute val(𝒢 ) and val(ℋ). Nevertheless a polynomial-time algorithm exists. This was independently discovered by Hirshfeld, Jerrum and Moller [213, 214], Mehlhorn, Sundar and Uhrig [335, 336] and Plandowski [410]. Theorem 4.6.3 ([213, 214, 335, 336, 410]). Compressed equality checking can be solved in polynomial time. In this section we prove Theorem 4.6.3. The polynomial-time compressed equality checking algorithms of Hirshfeld et al. [213, 214] and Plandowski [410] use combinatorial properties of words, in particular the periodicity lemma of Fine and Wilf [132]. This lemma states that if p and q are periods of a word w (i. e., w[i] = w[i + p] and w[j] = w[j + q] for all positions 1 ≤ i ≤ |w| − p and 1 ≤ j ≤ |w| − q) and p + q ≤ |w|, then also the greatest common divisor of p and q is a period of w. The algorithms from [214, 410] achieve a running time of 𝒪(n4 ), where n = |𝒢 | + |ℋ|. An improvement to 𝒪(n3 ) (for the more general problem of compressed pattern matching), still using the periodicity lemma, was achieved by Lifshits [290]. In contrast to [214, 290, 410], the algorithm of Mehlhorn et al. [335, 336] does not use the periodicity lemma of Fine and Wilf. Actually, in [336], Theorem 4.6.3 is not explicitly stated but follows immediately from the main result. Mehlhorn et al. provide an efficient data structure for a finite set of words that supports the following operations:
– Set variable x to the symbol a.
– Set variable x to the concatenation of the values of variables y and z.
– Split the value of variable x into its length-k prefix and remaining part and store these words in variables y and z.
– Check whether the values of variables x and y are identical.
The idea is to compute for each variable a signature, which is a small number that allows the equality test to be done in constant time. The signature of a word is computed by iteratively breaking up the sequence into small blocks, which are encoded by integers using a pairing function. This leads to a cubic-time algorithm for compressed equality checking. An improvement of the data structure from [336] can be found in [5]. The idea from [5, 336] of recursively dividing a word into smaller pieces and replacing them by new symbols (integers in [5, 336]) was taken up by Jeż [226], who came up with an extremely powerful technique for dealing with SLP-compressed words (and the related problem of solving word equations [227], see Section 4.6.8.3). It also yields probably the simplest proof of Theorem 4.6.3. In the rest of Section 4.6.3 we present this algorithm. We ignore some implementation details that are important for getting the best time bound. Moreover, Jeż [226] obtained his result for the more general problem of fully compressed pattern matching, where it is asked whether an SLP-compressed word is a factor of another SLP-compressed word. Jeż’s algorithm is based on two operations on words that reduce the word length: block compression and pair compression. These operations are introduced next.
4.6.3.1 Block compression and pair compression
Let s ∈ Σ∗ be a word over a finite alphabet Σ. We define the word block(s) as follows. Assume that s = a1^n1 a2^n2 ⋅ ⋅ ⋅ ak^nk with k ≥ 0, a1 , . . . , ak ∈ Σ, ai ≠ ai+1 for all 1 ≤ i < k and ni > 0 for all 1 ≤ i ≤ k. Then block(s) = a1(n1) a2(n2) ⋅ ⋅ ⋅ ak(nk) , where a1(n1) , a2(n2) , . . . , ak(nk) are new symbols. Note that block(ε) = ε. The factors ai^ni (1 ≤ i ≤ k) are also called the blocks of s and replacing s by block(s) is called block compression. For instance, for s = aabbbaccb we have block(s) = a(2) b(3) a(1) c(2) b(1) . For the symbol a(1) we will simply write a. Pair compression replaces pairs of different letters by fresh symbols. For this, we have to fix a partition Σ = Σℓ ⊎ Σr . Let s[Σℓ , Σr ] be the word that is obtained from s by replacing every occurrence of a factor ab in s with a ∈ Σℓ and b ∈ Σr by the new symbol ⟨ab⟩. For instance, for s = abcbabcad and Σℓ = {a, c} and Σr = {b, d} we have s[Σℓ , Σr ] = ⟨ab⟩⟨cb⟩⟨ab⟩c⟨ad⟩. Since two different occurrences of factors from Σℓ Σr must occupy disjoint sets of positions in s, the word s[Σℓ , Σr ] is well-defined. Obviously, for all words s, t ∈ Σ∗ we have
s = t ⇐⇒ block(s) = block(t)   (4.3)
and
s = t ⇐⇒ s[Σℓ , Σr ] = t[Σℓ , Σr ].   (4.4)
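Both operations are easy to implement on explicitly given words; in the following sketch (ours), Python tuples play the role of the fresh symbols a(n) and ⟨ab⟩, and the two examples above are reproduced.

from itertools import groupby

def block(s):
    # block compression: each maximal block a^n becomes the fresh symbol ('a', n);
    # compressed words are represented as lists of symbols
    return [(a, len(list(run))) for a, run in groupby(s)]

def pair_compress(s, left, right):
    # s[left, right]: replace every factor ab with a in 'left' and b in 'right'
    # by the fresh symbol (a, b)
    out, i = [], 0
    while i < len(s):
        if i + 1 < len(s) and s[i] in left and s[i + 1] in right:
            out.append((s[i], s[i + 1]))
            i += 2
        else:
            out.append(s[i])
            i += 1
    return out

print(block("aabbbaccb"))
# [('a', 2), ('b', 3), ('a', 1), ('c', 2), ('b', 1)]   corresponds to a(2) b(3) a c(2) b
print(pair_compress("abcbabcad", {'a', 'c'}, {'b', 'd'}))
# [('a','b'), ('c','b'), ('a','b'), 'c', ('a','d')]    corresponds to <ab><cb><ab>c<ad>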
For pair compression we will need a partition Σ = Σℓ ⊎ Σr such that the length of s[Σℓ , Σr ] is smaller than the length of s by a constant factor. To find such a partition we need the condition that s does not contain a factor aa for a ∈ Σ. Note that after block compression this condition holds.
Lemma 4.6.4. Given a word s ∈ Σ∗ such that s does not contain a factor of the form aa with a ∈ Σ, one can compute in polynomial time a partition Σ = Σℓ ⊎ Σr such that |s[Σℓ , Σr ]| ≤ 1/4 + (3/4)|s|.
Proof. For all symbols a, b ∈ Σ with a ≠ b we first count the number of occurrences of the factors ab and ba in s; let us denote this number with w(a, b). Note that w(a, b) = w(b, a). For instance, for s = abcdababc we have w(a, b) = 4. In the second step we compute a partition Σ = Σ1 ⊎ Σ2 such that s contains at least (|s| − 1)/2 many occurrences of factors from Σ1 Σ2 ∪ Σ2 Σ1 . This is a so-called MaxCut problem in an edge weighted graph. Take Σ as the node set and draw an undirected edge with weight w(a, b) between two nodes a, b ∈ Σ with a ≠ b. For two disjoint sets Σ1 , Σ2 ⊆ Σ let
w(Σ1 , Σ2 ) = ∑_{(a,b) ∈ Σ1 × Σ2} w(a, b),
which is the total weight of all edges between Σ1 and Σ2 . Moreover, for Γ ⊆ Σ let
w(Γ) = (1/2) ⋅ ∑_{a,b ∈ Γ, a ≠ b} w(a, b).
Thus w(Γ) is the sum of all weights of edges between nodes in Γ. Since the word s has |s| − 1 many factors of the form ab (a ≠ b), we have w(Σ) = |s| − 1. We now compute disjoint sets Σ1 and Σ2 as follows. We start with Σ1 = Σ2 = ∅. As long as Σ ≠ Σ1 ⊎ Σ2 , we choose an arbitrary symbol a ∈ Σ \ (Σ1 ∪ Σ2 ). We add a to either Σ1 or Σ2 according to the following rule:
– If w({a}, Σ1 ) ≤ w({a}, Σ2 ), then set Σ1 := Σ1 ∪ {a} (Σ2 does not change).
– If w({a}, Σ1 ) > w({a}, Σ2 ), then set Σ2 := Σ2 ∪ {a} (Σ1 does not change).
Thus, in every step we greedily put a into either Σ1 or Σ2 with the goal of maximally increasing w(Σ1 , Σ2 ). Then the following invariant is preserved by the algorithm:
w(Σ1 , Σ2 ) ≥ (1/2) ⋅ w(Σ1 ∪ Σ2 ).
If for instance w({a}, Σ1 ) ≤ w({a}, Σ2 ), we get
w(Σ1 ∪ {a}, Σ2 ) = w(Σ1 , Σ2 ) + w({a}, Σ2 )
  ≥ (1/2) ⋅ (w(Σ1 ∪ Σ2 ) + w({a}, Σ1 ) + w({a}, Σ2 ))
  = (1/2) ⋅ w(Σ1 ∪ {a} ∪ Σ2 ).
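As an aside, the greedy step just described is easy to run on an explicit word; the following sketch (ours) computes Σ1 and Σ2 and checks the guaranteed number of cut factors. (Lemma 4.6.7 below performs the corresponding computation on an SLP-compressed word.)

from collections import defaultdict

def greedy_partition(s):
    # the greedy MaxCut step from the proof of Lemma 4.6.4: returns disjoint sets
    # Sigma1, Sigma2 with w(Sigma1, Sigma2) >= (|s| - 1)/2, assuming s has no factor aa
    w = defaultdict(int)
    for a, b in zip(s, s[1:]):               # every length-2 factor contributes to w
        w[a, b] += 1
        w[b, a] += 1
    sigma1, sigma2 = set(), set()
    for a in set(s):
        to1 = sum(w[a, b] for b in sigma1)
        to2 = sum(w[a, b] for b in sigma2)
        (sigma1 if to1 <= to2 else sigma2).add(a)
    return sigma1, sigma2

s = "abcbabcad"
S1, S2 = greedy_partition(s)
cut = sum(1 for a, b in zip(s, s[1:]) if (a in S1) == (b in S2))
print(S1, S2, cut, (len(s) - 1) / 2)         # the cut contains at least (|s| - 1)/2 many factors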
At the end, we get a partition Σ = Σ1 ⊎ Σ2 such that w(Σ1 , Σ2 ) ≥ (1/2) ⋅ (|s| − 1), i. e., s contains at least (1/2) ⋅ (|s| − 1) many occurrences of factors from Σ1 Σ2 ∪ Σ2 Σ1 . Since Σ1 Σ2 and Σ2 Σ1 are disjoint, s contains at least (1/4) ⋅ (|s| − 1) many factors from Σ1 Σ2 or at least (1/4) ⋅ (|s| − 1) many factors from Σ2 Σ1 . Assume without loss of generality that the former holds and set Σℓ = Σ1 and Σr = Σ2 . This means that |s[Σℓ , Σr ]| ≤ |s| − (1/4) ⋅ (|s| − 1) = 1/4 + (3/4)|s|.
4.6.3.2 Block and pair compression on SLP-compressed words
In this section, we show that block compression and pair compression can be done in polynomial time on SLP-compressed words. Moreover, we will control the size increase of the SLP in a very specific way. In the rest of this section, we assume that every SLP 𝒢 = (V, Σ, S, ρ) is in a kind of generalized Chomsky normal form; we require that for every variable A ∈ V, ρ(A) is either of the form u ∈ Σ∗ , uBv with u, v ∈ Σ∗ and B ∈ V, or uBvCw with u, v, w ∈ Σ∗ and B, C ∈ V. In other words, every right-hand side contains at most two occurrences of variables. This specific form of right-hand sides will be preserved by our algorithms. Let us fix an SLP 𝒢 = (V, Σ, S, ρ) of the above form. Note that Definition 4.6.1 implies that S does not occur in a right-hand side ρ(A) (A ∈ V). We denote with |𝒢 |0 (respectively, |𝒢 |1 ) the total number of occurrences of terminal symbols (respectively, variables) in all right-hand sides of 𝒢 . Thus, |𝒢 | = |𝒢 |0 + |𝒢 |1 . In the following, we will develop polynomial-time algorithms for three computational problems:
– Compute from 𝒢 an SLP ℋ such that val(ℋ) = block(val(𝒢 )).
– Compute from 𝒢 a partition Σ = Σℓ ⊎ Σr such that |val(𝒢 )[Σℓ , Σr ]| ≤ 1/4 + (3/4)|val(𝒢 )|.
– Compute from 𝒢 and a partition Σ = Σℓ ⊎ Σr an SLP ℋ such that val(ℋ) = val(𝒢 )[Σℓ , Σr ].
Let us start with block compression. It is not difficult to show that the word block(val(𝒢 )) contains at most |𝒢 | different symbols. For every border between two symbols (terminal symbols or variables) in a right-hand side one new symbol in block(val(𝒢 )) can be created. Moreover, we can efficiently construct an SLP for block(val(𝒢 )).
Lemma 4.6.5. There is an algorithm CompressBlocks that gets as input an SLP 𝒢 = (V, Σ, S, ρ) and computes in polynomial time an SLP ℋ such that the following properties hold:
– all variables of ℋ belong to V and S is a variable of ℋ,
– val(ℋ) = block(val(𝒢 )),
– |ℋ|1 ≤ |𝒢 |1 ≤ 2|V| and |ℋ|0 ≤ |𝒢 |0 + 4|V|.
Proof. Example 4.6.6 below shows the following construction for a concrete SLP. In a first step, we replace in every right-hand side of 𝒢 every maximal factor u that only contains terminal symbols by block(u). Unfortunately, this does not suffice in order to
produce block(val(𝒢 )). The simplest example is the SLP 𝒢 with the variables S and A, where ρ(A) = a and ρ(S) = AA. We would obtain an SLP for a(1) a(1) , whereas block(val(𝒢 )) = a(2) . We therefore have to process the SLP obtained from the first step (which we call 𝒢 again) further. This processing is done bottom-up, i. e., if B occurs in ρ(A), then B has to be processed before A. During the processing of variable A, we will (i) modify the right-hand side of A in a certain way and (ii) replace every occurrence of A in a righthand side of the current SLP by either ε, a single symbol a(i) or a word of the form a(i) Ab(j) . This ensures the following invariant. When A is going to be processed, the current right-hand side ρ(A) neither starts nor ends with a nonterminal. It therefore suffices to consider the following three cases for the processing of A. Case 1. ρ(A) = ε. Then we replace all occurrences of A in right-hand sides by ε and remove A from the SLP. Case 2. ρ(A) = a(i) for some a ∈ Σ and i ≥ 1. Then we replace all occurrences of A in right-hand sides by a(i) and remove A from the SLP. Case 3. ρ(A) = a(i) αb(j) for some a, b ∈ Σ, i, j ≥ 1, and some word α. In this case we redefine the right-hand side of A by ρ(A) = α and replace all occurrences of A in right-hand sides by a(i) Ab(j) . Intuitively, we pop off from ρ(A) the first and last block. This is necessary since the first (respectively, last) block of ρ(A) might be merged with another block on the left (respectively, right), as in the above example with ρ(A) = a and ρ(S) = AA. After the above replacements, right-hand sides might contain factors of the form (i) (j) a a . We replace such a factor by a(i+j) . We iterate this replacement until no factors of the form a(i) a(j) occur in the right-hand sides of the SLP. This concludes the processing of the variable A and we can process the next variable, for which we can take any variable A for which all variables in ρ(A ) have been processed before. The invariant that ρ(A ) neither starts nor ends with a variable is clearly preserved. We process all variables except for the start variable S (which has no occurrences in the right-hand sides of the SLP). Let us denote the final SLP with ℋ. Since S is not processed, it is still present in ℋ. A simple inductive argument shows the following for every variable A: – If val𝒢 (A) = ε, then while processing A, all occurrences of A in right-hand sides of the SLP are replaced by ε. – If |block(val𝒢 (A))| = 1, i. e., val𝒢 (A) is of the form ai for some a ∈ Σ and i ≥ 1, then while processing A, all occurrences of A in right-hand sides of the SLP are replaced by a(i) . – If |block(val𝒢 (A))| ≥ 2, i. e., block(val𝒢 (A)) = a(i) ub(j) for some u, then while processing A, all occurrences of A in right-hand sides of the SLP are replaced by a(i) Ab(j) . Moreover, in the final SLP ℋ we have valℋ (A) = u. This implies that val(ℋ) = block(val(𝒢 )).
The number of occurrences of variables in a right-hand side does not increase, which implies that |ℋ|1 ≤ |𝒢|1 ≤ 2|V|. Moreover, if a right-hand side of 𝒢 contains one (respectively, two) variables, then the length of that right-hand side can increase by at most 2 (respectively, 4). The worst case occurs if every occurrence of a variable A in a right-hand side is replaced by a word of the form a(i) Ab(j). It follows that |ℋ|0 ≤ |𝒢|0 + 4|V|.

Example 4.6.6. Consider the SLP ({A1, . . . , A7}, {a, b}, A7, ρ) with the following ρ-mapping:
ρ(A1) = a^7,  ρ(A2) = a^5 b^3 a^6,  ρ(A3) = b^3 a^4,  ρ(A4) = A2 a^2 A2,
ρ(A5) = a^2 A3 A1,  ρ(A6) = a^3 A4 b^2 A3 a^4,  ρ(A7) = A5 A6.
In a first step, we replace all blocks in right-hand sides by the corresponding fresh symbols: ρ(A1 ) = a(7) ,
ρ(A2 ) = a(5) b(3) a(6) ,
ρ(A5 ) = a(2) A3 A1 ,
ρ(A3 ) = b(3) a(4) ,
ρ(A6 ) = a(3) A4 b(2) A3 a(4) ,
ρ(A4 ) = A2 a(2) A2 , ρ(A7 ) = A5 A6 .
Next, we process the nonterminals in the order A1 , A2 , A3 , A4 , A5 , A6 . Processing of A1 : We replace all occurrences of A1 in right-hand sides by a(7) and remove A1 from the SLP: ρ(A2 ) = a(5) b(3) a(6) , ρ(A5 ) = a(2) A3 a(7) ,
ρ(A3 ) = b(3) a(4) ,
ρ(A4 ) = A2 a(2) A2 ,
ρ(A6 ) = a(3) A4 b(2) A3 a(4) ,
ρ(A7 ) = A5 A6 .
Processing of A2 : We replace all occurrences of A2 in right-hand sides by a(5) A2 a(6) and change ρ(A2 ) to b(3) : ρ(A2 ) = b(3) ,
ρ(A3 ) = b(3) a(4) ,
ρ(A5 ) = a(2) A3 a(7) ,
ρ(A4 ) = a(5) A2 a(6) a(2) a(5) A2 a(6) ,
ρ(A6 ) = a(3) A4 b(2) A3 a(4) ,
ρ(A7 ) = A5 A6 .
In ρ(A4 ) we can merge the factor a(6) a(2) a(5) : ρ(A2 ) = b(3) ,
ρ(A3 ) = b(3) a(4) ,
ρ(A5 ) = a(2) A3 a(7) ,
ρ(A4 ) = a(5) A2 a(13) A2 a(6) ,
ρ(A6 ) = a(3) A4 b(2) A3 a(4) ,
ρ(A7 ) = A5 A6 .
Processing of A3 : We replace all occurrences of A3 in right-hand sides by b(3) A3 a(4) and change ρ(A3 ) to ε. Of course we can simplify this step by replacing all occurrences of A3 by b(3) a(4) and then eliminate A3 . We did not do this simplification in the proof of Lemma 4.6.5 (it is not required for correctness), but let us do it in this example: ρ(A2 ) = b(3) , ρ(A5 ) = a(2) b(3) a(4) a(7) ,
ρ(A4 ) = a(5) A2 a(13) A2 a(6) ,
ρ(A6 ) = a(3) A4 b(2) b(3) a(4) a(4) ,
ρ(A7 ) = A5 A6 .
Merging the factors a(4) a(7) , b(2) b(3) and a(4) a(4) yields ρ(A2 ) = b(3) , ρ(A5 ) = a(2) b(3) a(11) ,
ρ(A4 ) = a(5) A2 a(13) A2 a(6) ,
ρ(A6 ) = a(3) A4 b(5) a(8) ,
ρ(A7 ) = A5 A6 .
Processing of A4 : We replace all occurrences of A4 in right-hand sides by a(5) A4 a(6) and change ρ(A4 ) to A2 a(13) A2 : ρ(A2 ) = b(3) , ρ(A5 ) = a(2) b(3) a(11) ,
ρ(A4 ) = A2 a(13) A2 ,
ρ(A6 ) = a(3) a(5) A4 a(6) b(5) a(8) ,
ρ(A7 ) = A5 A6 .
Merging symbols yields ρ(A2 ) = b(3) , ρ(A5 ) = a(2) b(3) a(11) ,
ρ(A4 ) = A2 a(13) A2 ,
ρ(A6 ) = a(8) A4 a(6) b(5) a(8) ,
ρ(A7 ) = A5 A6 .
Processing of A5 : We replace all occurrences of A5 in right-hand sides by a(2) A5 a(11) and change ρ(A5 ) to b(3) : ρ(A2 ) = b(3) , ρ(A5 ) = b(3) ,
ρ(A4 ) = A2 a(13) A2 ,
ρ(A6 ) = a(8) A4 a(6) b(5) a(8) ,
ρ(A7 ) = a(2) A5 a(11) A6 .
Processing of A6 : We replace all occurrences of A6 in right-hand sides by a(8) A6 a(8) and change ρ(A6 ) to A4 a(6) b(5) : ρ(A2 ) = b(3) , ρ(A5 ) = b(3) ,
ρ(A4 ) = A2 a(13) A2 ,
ρ(A6 ) = A4 a(6) b(5) ,
ρ(A7 ) = a(2) A5 a(11) a(8) A6 a(8) .
A final merging step yields ρ(A2 ) = b(3) , ρ(A5 ) = b(3) ,
ρ(A4 ) = A2 a(13) A2 ,
ρ(A6 ) = A4 a(6) b(5) ,
ρ(A7 ) = a(2) A5 a(19) A6 a(8) .
This concludes the example. After block compression, the word produced by the SLP 𝒢 does not contain a factor of the form aa. We are therefore in the situation from Lemma 4.6.4. Let us next show a variant of Lemma 4.6.4 for SLP-compressed words.

Lemma 4.6.7. There is an algorithm Partition that gets as input an SLP 𝒢 = (V, Σ, S, ρ) such that val(𝒢) does not contain a factor aa (a ∈ Σ) and computes in polynomial time a partition Σ = Σℓ ⊎ Σr such that |val(𝒢)[Σℓ, Σr]| ≤ 1/4 + 3/4 ⋅ |val(𝒢)|.
178 | 4 Compression techniques in group theory Proof. We use the algorithm from the proof of Lemma 4.6.4. We have to argue that this algorithm can be carried out in polynomial time also if the input word s is given by an SLP. Note that the only knowledge about s needed for the algorithm is the number of occurrences of factors ab in s for all a, b ∈ Σ with a ≠ b. To compute these numbers for s = val(𝒢 ), we compute the number of occurrences of ab in every word val𝒢 (A) (A ∈ V) in a bottom-up way. First, we compute the first and last symbol of val𝒢 (A) (if it exists) for every A ∈ V bottom-up. This is straightforward. Now assume that we have a variable A, and for every variable B in ρ(A) we have already computed the number of occurrences of ab in val𝒢 (B) (for all a, b ∈ Σ with a ≠ b). Together with the knowledge of the first and last letters of the variable, we can easily compute the number of occurrences of ab in val𝒢 (A). Let us consider an example. Assume that ρ(A) = ababaBbcaCbcab. Moreover, assume that val𝒢 (B) ∈ bΣ∗ a contains five occurrences of ab and val𝒢 (C) ∈ bΣ∗ c contains six occurrences of ab. We have three explicit occurrences of ab in ababaBbcaCbcab. Together with the five (respectively, six) occurrences that come from B (respectively, C), this yields 14 occurrences. But there are three more occurrences, coming from aB, Bb and aC. Hence, in total we have 17 occurrences of ab in val𝒢 (A). Finally, we present a polynomial-time algorithm for pair compression. Lemma 4.6.8. There is an algorithm CompressPairs that gets as input (i) an SLP 𝒢 = (V, Σ, S, ρ) such that val(𝒢 ) does not contain a factor aa (a ∈ Σ) and (ii) a partition Σ = Σℓ ⊎ Σr and computes in polynomial time an SLP ℋ such that the following properties hold: – all variables of ℋ belong to V and S is a variable of ℋ, – val(ℋ) = val(𝒢 )[Σℓ , Σr ], – |ℋ|1 ≤ |𝒢 |1 ≤ 2|V| and |ℋ|0 ≤ |𝒢 |0 + 4|V|. Proof. Example 4.6.9 below shows the following construction for a concrete SLP. The proof is similar to the proof of Lemma 4.6.5. We process the variables of 𝒢 in a bottomup way. If variable A is going to be processed, we have enforced (by the previous processing steps) the following two properties: – If the first symbol a of val𝒢 (A) belongs to Σr , then ρ(A) starts with a. – If the last symbol a of val𝒢 (A) belongs to Σℓ , then ρ(A) ends with a. For the processing of A, we make a case distinction on ρ(A). Case 1. ρ(A) = ε. Then we replace all occurrences of A in right-hand sides by ε and remove A from the SLP. Case 2. ρ(A) = a ∈ Σ for some a ∈ Σ. Then we replace all occurrences of A in right-hand sides by a and remove A from the SLP. Case 3. ρ(A) = aα, where a ∈ Σr and α ≠ ε does not end with a symbol from Σℓ . Then we redefine the right-hand side of A by ρ(A) = α and replace every occurrence of A in a right-hand side of the current SLP by aA.
Case 4. ρ(A) = αa, where a ∈ Σℓ and α ≠ ε does not begin with a symbol from Σr . Then we redefine the right-hand side of A by ρ(A) = α and replace every occurrence of A in a right-hand side of the current SLP by Aa. Case 5. ρ(A) = aαb, where a ∈ Σr and b ∈ Σℓ . Then we redefine the right-hand side of A by ρ(A) = α and replace every occurrence of A in a right-hand side of the current SLP by aAb. In all other cases we do nothing. The above processing can be also described as follows. If ρ(A) starts (respectively, ends) with a symbol from Σr (respectively, Σℓ ), then this letter is popped off from the right-hand side ρ(A). Intuitively, this is necessary since this symbol might be paired with a symbol from Σℓ on the left (respectively, a symbol from Σr on the right). We process all variables except for S. Then we replace all occurrences of factors ab ∈ Σℓ Σr in right-hand sides by ⟨ab⟩. Using induction, one can easily show the following statements: – If val𝒢 (A) = ε, then while processing A, all occurrences of A in right-hand sides of the SLP are replaced by ε. – If val𝒢 (A) = a ∈ Σ, then while processing A, all occurrences of A in right-hand sides of the SLP are replaced by a. – If val𝒢 (A) = aub, where a, b ∈ Σr , then while processing A, all occurrences of A in right-hand sides of the SLP are replaced by aA. Moreover, in the final SLP ℋ we have valℋ (A) = (ub)[Σℓ , Σr ]. – If val𝒢 (A) = aub, where a, b ∈ Σℓ , then while processing A, all occurrences of A in right-hand sides of the SLP are replaced by Ab. Moreover, in the final SLP ℋ we have valℋ (A) = (au)[Σℓ , Σr ]. – If val𝒢 (A) = aub, where a ∈ Σr and b ∈ Σℓ , then while processing A, all occurrences of A in right-hand sides of the SLP are replaced by aAb. Moreover, in the final SLP ℋ we have valℋ (A) = u[Σℓ , Σr ]. – If val𝒢 (A) = aub, where a ∈ Σℓ and b ∈ Σr , then while processing A, occurrences of A in right-hand sides of the SLP are not replaced. Moreover, in the final SLP ℋ we have valℋ (A) = (aub)[Σℓ , Σr ]. The size bounds for ℋ can be shown as in Lemma 4.6.5. Example 4.6.9. Consider the SLP ({A1 , . . . , A7 }, {a, b, c, d}, A7 , ρ) with the following ρ-mapping: ρ(A1 ) = aba,
ρ(A2 ) = bc,
ρ(A5 ) = A4 bcA4 ,
ρ(A3 ) = cd,
ρ(A6 ) = cA5 bA3 ,
ρ(A4 ) = A1 cA2 a,
ρ(A7 ) = A6 A6 .
Note that the word produced by this SLP does not contain a factor of the form xx for x ∈ {a, b, c, d}. Assume that the partition of the terminal symbols is given by Σℓ = {a, c} and Σr = {b, d}. We process the variables in the order A1 , A2 , A3 , A4 , A5 , A6 .
180 | 4 Compression techniques in group theory Processing of A1 : We replace all occurrences of A1 in right-hand sides by A1 a and change ρ(A1 ) to ab: ρ(A1 ) = ab,
ρ(A2 ) = bc,
ρ(A5 ) = A4 bcA4 ,
ρ(A3 ) = cd,
ρ(A6 ) = cA5 bA3 ,
ρ(A4 ) = A1 acA2 a,
ρ(A7 ) = A6 A6 .
Processing of A2 : We replace all occurrences of A2 in right-hand sides by bA2 c and change ρ(A2 ) to ε. As in Example 4.6.6 (processing of A3 ) we can simplify the SLP by replacing all occurrences of A2 in right-hand sides by bc: ρ(A1 ) = ab,
ρ(A5 ) = A4 bcA4 ,
ρ(A3 ) = cd,
ρ(A4 ) = A1 acbca,
ρ(A6 ) = cA5 bA3 ,
ρ(A7 ) = A6 A6 .
Processing of A3 : Nothing is done in this step. Processing of A4 : We replace all occurrences of A4 in right-hand sides by A4 a and change ρ(A4 ) to A1 acbc: ρ(A1 ) = ab,
ρ(A5 ) = A4 abcA4 a,
ρ(A3 ) = cd,
ρ(A4 ) = A1 acbc,
ρ(A6 ) = cA5 bA3 ,
ρ(A7 ) = A6 A6 .
Processing of A5 : We replace all occurrences of A5 in right-hand sides by A5 a and change ρ(A5 ) to A4 abcA4 : ρ(A1 ) = ab,
ρ(A5 ) = A4 abcA4 ,
ρ(A3 ) = cd,
ρ(A4 ) = A1 acbc,
ρ(A6 ) = cA5 abA3 ,
ρ(A7 ) = A6 A6 .
Processing of A6 : Nothing is done in this step. We finally obtain the following SLP by forming pairs ⟨xy⟩ with x ∈ {a, c} and y ∈ {b, d}: ρ(A1 ) = ⟨ab⟩,
ρ(A5 ) = A4 ⟨ab⟩cA4 ,
ρ(A3 ) = ⟨cd⟩,
ρ(A4 ) = A1 a⟨cb⟩c,
ρ(A6 ) = cA5 ⟨ab⟩A3 ,
ρ(A7 ) = A6 A6 .
This concludes the example. It is not hard to show that CompressBlocks and CompressPairs can be both implemented so that they work in time 𝒪(|𝒢 |). With a little bit more work, one can also show that the partition Σ = Σℓ ⊎ Σr from Lemma 4.6.7 can be computed in linear time (see, e. g., [226, 296]). 4.6.3.3 The final algorithm Using Lemmas 4.6.5, 4.6.7 and 4.6.8 it is now easy to prove Theorem 4.6.3. The goal is to check val(𝒢1 ) = val(𝒢2 ) for two SLPs 𝒢1 and 𝒢2 of the form described in the beginning of
Section 4.6.3.2 (every right-hand side contains at most two occurrences of variables). Jeż's strategy [226] for checking this equality is to compute from 𝒢i (i ∈ {1, 2}) an SLP ℋi such that val(ℋi) = (block(val(𝒢i)))[Σℓ, Σr] for i ∈ {1, 2} and |val(ℋ1)| ≤ 3/4 ⋅ |val(𝒢1)| + 1/4. This process is iterated. After at most log |val(𝒢1)| ≤ 𝒪(|𝒢1|) iterations it must terminate with an SLP such that val(𝒢1) has length one. At this stage, it is easy to check whether the current SLPs produce the same word. The size bounds in Lemmas 4.6.5 and 4.6.8 allow to control the sizes of the SLPs during this process. Here are the details.

Proof of Theorem 4.6.3. Let 𝒢i = (Vi, Σ, Si, ρi) for i ∈ {1, 2} and let n = |𝒢1| + |𝒢2| be the input size. The following algorithm checks whether val(𝒢1) = val(𝒢2):
1: while |val(𝒢1)| > 1 do
2:    𝒢1 := CompressBlocks(𝒢1)
3:    𝒢2 := CompressBlocks(𝒢2)
4:    (Σℓ, Σr) := Partition(𝒢1)
5:    𝒢1 := CompressPairs(𝒢1, Σℓ, Σr)
6:    𝒢2 := CompressPairs(𝒢2, Σℓ, Σr)
7: end while
8: check whether val(𝒢1 ) = val(𝒢2 )
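For illustration, the following self-contained Python sketch runs the same three steps on an uncompressed word until it has length one. All names are ours, and the partition is found by brute force, whereas Partition from Lemma 4.6.7 computes a comparably good partition in polynomial time; the point of Lemmas 4.6.5–4.6.8 is, of course, that these steps can be carried out directly on the SLPs 𝒢1 and 𝒢2 without ever expanding them.

```python
from itertools import groupby, combinations

def compress_blocks(word):
    # Block compression: replace every maximal block a^i by the single symbol ('a', i).
    return [(a, len(list(g))) for a, g in groupby(word)]

def compress_pairs(word, left, right):
    # Pair compression: replace every factor ab with a in left and b in right by ('a', 'b').
    out, i = [], 0
    while i < len(word):
        if i + 1 < len(word) and word[i] in left and word[i + 1] in right:
            out.append((word[i], word[i + 1]))
            i += 2
        else:
            out.append(word[i])
            i += 1
    return out

def best_partition(word):
    # Brute force over all partitions of the current alphabet (fine for tiny alphabets);
    # Partition from Lemma 4.6.7 finds a comparably good partition in polynomial time.
    sigma = sorted(set(word), key=repr)
    best, best_cut = (set(), set(sigma)), -1
    for r in range(len(sigma) + 1):
        for left in combinations(sigma, r):
            left = set(left)
            right = set(sigma) - left
            cut = sum(1 for a, b in zip(word, word[1:]) if a in left and b in right)
            if cut > best_cut:
                best, best_cut = (left, right), cut
    return best

word = list("abbaabbbaab")
while len(word) > 1:
    word = compress_blocks(word)                 # afterwards no factor aa remains
    left, right = best_partition(word)
    word = compress_pairs(word, left, right)     # strictly shrinks the word
print(word)                                      # a single nested symbol encoding the input
```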
Correctness of the algorithm follows from observations (4.3) and (4.4). It remains to analyze the running time of the algorithm. By Lemma 4.6.7, the number of iterations of the while loop is bounded by 𝒪(log |val(𝒢1 )|) ≤ 𝒪(|𝒢1 |). Let 𝒢1,k and 𝒢2,k be the SLPs after k iterations of the while loop. The number of variables in 𝒢i,k is at most |Vi |. Hence, by Lemma 4.6.5 and 4.6.8 we have |𝒢1,k | + |𝒢2,k | ≤ |𝒢1 | + |𝒢2 | + 8k(|V1 | + |V2 |) ≤ 𝒪(n2 ). Since the k-th iteration takes time polynomial in |𝒢1,k |+|𝒢2,k |, the whole algorithm runs in polynomial time. The reader might object that each application of CompressBlocks and CompressPairs increases the number of terminal symbols and that we did not analyze this number. This is true, but since |𝒢1,k | + |𝒢2,k | ≤ 𝒪(n2 ), also the number of terminal symbols is bounded by 𝒪(n2 ). Moreover, after each application of CompressBlocks and CompressPairs one can rename the terminal symbols into numbers 1, . . . , s for some s ≤ 𝒪(n2 ) (the “internal structure” of the new terminal symbols arising from CompressBlocks and CompressPairs is no longer important). One can improve the running time of the algorithm to 𝒪(n2 ). For this, one has to use the fact that CompressBlocks and CompressPairs can be implemented in linear time. Plugging this into the above estimate leads to a cubic-time algorithm. To improve the running time further to 𝒪(n2 ) one has to guarantee that the intermediate SLPs 𝒢i,k are of size 𝒪(|𝒢i |). For this, one uses in every second application of CompressPairs a
182 | 4 Compression techniques in group theory partition Σ = Σℓ ⊎ Σr such that CompressPairs reduces the total length of the maximal Σ-factors in the right-hand sides of the current SLP by a constant factor [226, 298]. It is neither known whether compressed equality checking belongs to NC nor whether it is P-complete. In [274] it is shown that compressed equality checking belongs to coRNC. This can be seen as an indication that compressed equality checking is not P-complete. 4.6.4 Cutting out factors from SLPs In our applications of SLPs in group theory, we need a second algorithmic technique. We have to cut out factors from the word produced by an SLP. Intuitively this is needed in order to simulate cancelation on SLP-compressed words. Formally, we define a generalization of SLPs, so-called composition systems [160],3 which are also known as collage systems [262] and interval grammars [199]. Definition 4.6.10 (Composition system). A composition system over the terminal alphabet Γ is a tuple 𝒢 = (V, Γ, S, ρ), where V is a finite set of variables, Γ is the set of terminal symbols, S ∈ V is the start variable and ρ is a mapping that assigns to every A ∈ V either a word ρ(A) ∈ (Γ∪V)∗ or an expression of the form B[k : ℓ] with B ∈ V and k, ℓ ≥ 1 such that the following holds: there exists a linear enumeration A1 , A2 , . . . , An of the variables of 𝒢 such that An = S and for all 1 ≤ i ≤ n, if Aj appears in ρ(Ai ), then j < i. – – –
To define the words val𝒢(A) ∈ Γ∗ for A ∈ V, we use the following inductive rules:
– val𝒢(a) = a for a ∈ Γ,
– if ρ(A) = α1 α2 ⋅ ⋅ ⋅ αk with αi ∈ V ∪ Γ, then we define val𝒢(A) = val𝒢(α1)val𝒢(α2) ⋅ ⋅ ⋅ val𝒢(αk),
– if ρ(A) = B[k : ℓ], then let val𝒢(A) = val𝒢(B)[k : ℓ] (recall the definition of w[k : ℓ] from Section 4.2).
Thus, if ρ(A) = B[k : ℓ], then val𝒢 (A) is obtained from val𝒢 (B) by cutting out the factor from position k to position ℓ. We finally define val(𝒢 ) = val𝒢 (S). Example 4.6.11. Consider the composition system 𝒢 = ({A1 , A2 , . . . , A10 }, {a, b}, A10 , ρ),
where ρ(A1 ) = b, ρ(A2 ) = a, ρ(Ai ) = Ai−1 Ai−2 for 3 ≤ i ≤ 7, ρ(A8 ) = A7 [3 : 10], ρ(A9 ) = A7 [4 : 11] and ρ(A10 ) = A8 A9 . We have val𝒢 (A7 ) = abaababaabaab (this is exactly Example 4.6.2), val𝒢 (A8 ) = aababaab, val𝒢 (A9 ) = ababaaba and finally val(𝒢 ) = val𝒢 (A10 ) = aababaab ababaaba. 3 The formalism in [160] differs in some minor details from our definition.
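To make the semantics concrete, the following small Python sketch (the dictionary encoding of ρ is ours) naively expands a composition system, with w[k : ℓ] taken to be the factor of w from position k to position ℓ, as in Example 4.6.11; the asserts reproduce the values computed above. The naive expansion can be exponentially long, and avoiding it is exactly the point of Theorem 4.6.12 below.

```python
def evaluate(rules, sym):
    # Naive expansion of a composition system. A right-hand side is a list of symbols
    # (variables or terminal letters) or a tuple (B, k, l) encoding the cut operator B[k : l].
    if sym not in rules:
        return sym                               # terminal letter
    rhs = rules[sym]
    if isinstance(rhs, tuple):
        b, k, l = rhs
        return evaluate(rules, b)[k - 1:l]       # positions are 1-based and inclusive
    return "".join(evaluate(rules, s) for s in rhs)

# The composition system of Example 4.6.11.
rules = {"A1": ["b"], "A2": ["a"]}
for i in range(3, 8):
    rules[f"A{i}"] = [f"A{i-1}", f"A{i-2}"]      # rho(A_i) = A_{i-1} A_{i-2}
rules["A8"] = ("A7", 3, 10)
rules["A9"] = ("A7", 4, 11)
rules["A10"] = ["A8", "A9"]

assert evaluate(rules, "A7") == "abaababaabaab"
assert evaluate(rules, "A10") == "aababaabababaaba"   # = aababaab · ababaaba
```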
We call [k : ℓ] also a cut operator. We use B[: k] (respectively, B[k :]) as an abbreviation for B[1 : k] (respectively, B[k : ℓ], where ℓ = |val𝒢 (B)|). The size of the composition system 𝒢 is |𝒢 | = ∑A∈V |ρ(A)|, where we set |B[k : ℓ]| = 1 + ⌈log2 (k)⌉ + ⌈log2 (ℓ)⌉. The following result was shown by Hagenah in his PhD thesis [199]. Theorem 4.6.12 ([199]). From a given composition system 𝒢 one can compute in polynomial time an SLP ℋ such that val(𝒢 ) = val(ℋ). Proof. Let 𝒢 = (V, Γ, S, ρ). We can assume that for every A ∈ V, ρ(A) is either a word from {ε} ∪ Γ ∪ VV or of the form B[i : j] for B ∈ V and i, j ≥ 1, which is a kind of Chomsky normal form. Variables A with ρ(A) = ε could be eliminated, but it is convenient to keep them. Define the edge relation ℰ on the variables by (A, B) ∈ ℰ if and only if B occurs in ρ(A). Let h be the height of 𝒢 , which is the length of a longest path in the directed graph (V, ℰ ). The idea is to push cut-operators downwards, i. e., from a variable in direction to the edges from ℰ . First we compute the lengths of all words val𝒢 (A) for A ∈ V. This can be done in polynomial time using at most |V| additions and subtractions on binary-encoded integers. Choose a variable A ∈ V such that ρ(A) = B[i : j] with i, j ≥ 1, but every variable C that can be reached from B in (V, ℰ ) has no cut-operator in its right-hand side. We show that we can eliminate the cut-operator in ρ(A) and thereby obtain a composition system 𝒢 = (V , Γ, S, ρ ) (which is again in the normal form described at the beginning of the proof) such that V ⊆ V , |V \ V| ≤ 2h, val𝒢 (A) = val𝒢 (A) for all A ∈ V, and the height of 𝒢 is bounded by the height of 𝒢 . By iterating this transformation, we can eliminate all cut-operators and transform 𝒢 into an equivalent SLP of size |𝒢 |+ 𝒪(h|V|) (the elimination of each cut-operator increases the size by 𝒪(h)). If i > |val𝒢 (B)| or i > j, then we have val𝒢 (A) = ε. In this case, we set ρ(A) = ε (this covers the case ρ(A) = ε). For the rest of the proof we assume that i ≤ |val𝒢 (B)| and i ≤ j. If ρ(B) = a for a ∈ Γ, then we set ρ(A) = a. Now assume that ρ(B) = CD (recall that we assume that 𝒢 is in normal form). Let m = |val𝒢 (C)|. There are three cases: Case 1. j ≤ m. We redefine ρ(A) = C[i : j]. Case 2. i > m. We redefine ρ(A) = D[i − m : j − m]. Case 3. i ≤ m and j > m. In this case we introduce two new variables C and D , redefine ρ(A) = C D and set ρ(C ) = C[i :] and ρ(D ) = D[: j − m]. By iterating this process, we arrive at one of the following two situations: (i) After several applications of cases 1 and 2 we have ρ(A) = X[k : l] and ρ(X) = a ∈ Γ for some variable X ∈ V. (ii) We arrive at case 3 for the first time. In situation (i) we finally set ρ(A) = a and then stop. Note that we have not introduced any new variables. Now assume that we arrive at situation (ii). We have introduced two new variables C and D . Let us deal with C (with D we deal analogously). We have set ρ(C ) = C[i :].
184 | 4 Compression techniques in group theory If ρ(C) = a for a ∈ Γ, then we set ρ(C ) = a. Now assume that ρ(C) = EF for E, F ∈ V. Let m = |val𝒢 (E)|. We distinguish two cases: Case 1. i ≤ m. We introduce a new variable E , redefine ρ(C ) = E F, set ρ(E ) = E[i :] and continue with E . Case 2. i > m. We redefine ρ(C ) = F[i − m :] and continue with C . By iterating this process we finally eliminate the cut-operator. Note that in each step at most one new variable is introduced (in case 1). Therefore, at most h variables are introduced. Since we have to do an analogous procedure for D , we introduce at most 2h new variables in total. Clearly, our process does not increase the height of the composition system. Moreover, the resulting composition system is again in Chomsky normal form. This proves the theorem.
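The construction in this proof is easy to prototype. The following Python sketch (names and encoding are ours) packages the same idea in a slightly different way: instead of rewriting a composition system, it directly adds O(h) fresh rules to an SLP in Chomsky normal form that generate a prescribed factor val(B)[i : j], following the case distinction above.

```python
from itertools import count

def extract(rules, lens, sym, i, j, fresh=count(1)):
    """Return a symbol generating val(sym)[i : j] (1-based, inclusive), adding at most
    O(height) fresh rules. Every right-hand side is a list: a single terminal letter
    or a pair of symbols; lens[A] = |val(A)| is assumed to be precomputed."""
    size = lens.get(sym, 1)                      # terminal letters have length 1
    if i <= 1 and j >= size:                     # nothing is cut off: reuse sym
        return sym
    b, c = rules[sym]                            # sym is a variable with rho(sym) = bc
    m = lens.get(b, 1)
    if j <= m:                                   # the factor lies entirely in val(b)
        return extract(rules, lens, b, i, j, fresh)
    if i > m:                                    # ... or entirely in val(c)
        return extract(rules, lens, c, i - m, j - m, fresh)
    new = f"X{next(fresh)}"                      # the factor straddles the border
    rules[new] = [extract(rules, lens, b, i, m, fresh),
                  extract(rules, lens, c, 1, j - m, fresh)]
    lens[new] = j - i + 1
    return new

# The Fibonacci SLP of Example 4.6.2 with val(A7) = abaababaabaab.
rules = {"A1": ["b"], "A2": ["a"]}
for k in range(3, 8):
    rules[f"A{k}"] = [f"A{k-1}", f"A{k-2}"]      # rho(A_k) = A_{k-1} A_{k-2}
lens = {}
for k in range(1, 8):
    rhs = rules[f"A{k}"]
    lens[f"A{k}"] = 1 if len(rhs) == 1 else lens[rhs[0]] + lens[rhs[1]]

start = extract(rules, lens, "A7", 3, 10)        # the variable A8 of Example 4.6.11

def expand(sym):
    return sym if sym not in rules else "".join(expand(s) for s in rules[sym])

assert expand(start) == "aababaab"               # = val(A7)[3 : 10]
```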
4.6.5 The compressed word problem Before we come back to our guiding example – the word problem for Aut(F3 ) – we first introduce a compressed variant of the word problem where the input word is given by an SLP over the group generators. Definition 4.6.13 (Compressed word problem). Let G be a finitely generated group and fix a finite generating set Γ for G. The compressed word problem for G with respect to Γ, briefly CWP(G, Γ), is the following decision problem: input: An SLP 𝒢 over the terminal alphabet Γ ∪ Γ−1 . question: Does val(𝒢 ) = 1 hold in G? In CWP(G, Γ), the input size is of course the size |𝒢 | of the SLP 𝒢 . One may view the compressed word problem as the variant of the standard word problem where the input word is given succinctly by an SLP. Recall that there are words of length n which can be represented by an SLP of size 𝒪(log n). Therefore, one might expect that the compressed word problem is computationally harder than the ordinary word problem. Later, we will see groups where such a complexity jump indeed occurs (assuming P ≠ NP). For other groups, we will see only a moderate complexity jump when going from the uncompressed to the compressed variant of the word problem. As for the (uncompressed) word problem, the complexity of the compressed word problem is easily seen to be independent on the chosen generating set (up to logspace reductions), which allows to use the notation CWP(G) for the compressed word problem of G [296, Lemma 4.2]. What makes the compressed word problem interesting for the (standard) word problem are a couple of transfer theorems that allow to reduce the word problem for a certain complicated group to the compressed word problem for a simpler group. The first result of this kind was shown by Schleimer in [442]. Recall that for a group G,
Aut(G) denotes the automorphism group of G, which consists of all automorphisms of G with composition of functions as the group operation. There are examples of finitely generated (even finitely presented) groups G, where Aut(G) is not finitely generated (see, e. g., [287]). Therefore, we restrict ourselves in the following result to a finitely generated subgroup of Aut(G).

Theorem 4.6.14 ([442]). Let G be a finitely generated group and let H be a finitely generated subgroup of Aut(G). If CWP(G) can be solved in polynomial time, then also WP(H) can be solved in polynomial time.

Proof. Let Σ be a finite generating set for G, where without loss of generality a ∈ Σ implies a−1 ∈ Σ. Let H be generated by the finite set A ⊆ Aut(G), where again φ ∈ A implies φ−1 ∈ A. For a given input word φ1 φ2 ⋅ ⋅ ⋅ φn (with φi ∈ A for 1 ≤ i ≤ n) we have to check whether the composition of φ1, φ2, . . . , φn (in that order) is the identity automorphism in order to solve the word problem for H. But this is equivalent to φn(φn−1(⋅ ⋅ ⋅ φ1(a) ⋅ ⋅ ⋅)) = a in G for all a ∈ Σ. Since Σ is closed under inverses, there is a canonical surjective homomorphism h : Σ∗ → G. Hence, every φi ∈ A can be obtained from a homomorphism ψi : Σ∗ → Σ∗ in the sense that φi(a) = h(ψi(a)) for all a ∈ Σ. It suffices to construct in polynomial time an SLP 𝒢 for the word ψn(ψn−1(⋅ ⋅ ⋅ ψ1(a) ⋅ ⋅ ⋅)). Let us take variables Ai,b, where 1 ≤ i ≤ n + 1 and b ∈ Σ, and define

ρ(Ai,b) = b                                if i = n + 1,
ρ(Ai,b) = Ai+1,a1 ⋅ ⋅ ⋅ Ai+1,am            if 1 ≤ i ≤ n and ψi(b) = a1 ⋅ ⋅ ⋅ am.
By induction on i it follows that val𝒢 (Ai,b ) = ψn (ψn−1 (⋅ ⋅ ⋅ ψi (b) ⋅ ⋅ ⋅)). Let us state without proof two further results of a similar flavor. Recall the definition of the semidirect product: let K and Q be groups and let φ : Q → Aut(K) be a group homomorphism. Then the semidirect product K ⋊φ Q is the group with the domain K × Q and the following multiplication: (k, q)(ℓ, p) = (k ⋅ (φ(q)(ℓ)), qp), where ⋅ denotes the multiplication in K (note that φ(q) ∈ Aut(K) and hence φ(q)(ℓ) ∈ K). Theorem 4.6.15 ([296, Theorem 4.8]). Let K and Q be finitely generated groups, and let φ : Q → Aut(K) be a homomorphism. If WP(Q) and CWP(K) can be solved in polynomial time, then also WP(K ⋊φ Q) can be solved in polynomial time. Let G be a finitely presented group and fix a finite presentation (Γ, R) for G = ⟨Γ | R⟩. The word search problem for G, briefly WSP(G), is the following computational problem with output: input: A word w ∈ NF(ℛΓ ). output: 0 if w ≠ 1 in G, otherwise words c1 , . . . , cn ∈ NF(ℛΓ ), r1 , . . . , rn ∈ R ∪ R−1 such that w = ∏ni=1 ci ri ci−1 in F(Γ).
186 | 4 Compression techniques in group theory Hence, instead of just answering the question whether a given word w represents the identity of G, one also computes a proof showing that w indeed represents the identity in the positive case. Alternatively to returning words c1 , . . . , cn ∈ NF(ℛΓ ) and r1 , . . . , rn ∈ R ∪ R−1 such that w = ∏ni=1 ci ri ci−1 in F(Γ), one might also return a van Kampen diagram with boundary w. Speaking of the word search problem of a finitely presented group G and thereby suppressing the concrete presentation (Γ, R) is justified by the fact that the complexity of the word search problem does not depend (up to logspace reductions) on the concrete presentation [296, Lemma 2.25]. Of course, the word search problem for a group G can only be solved in polynomial time if the group has a polynomial Dehn function. Examples of groups for which the word search problem can be solved in polynomial time are automatic groups and finitely generated nilpotent groups. Theorem 4.6.16 ([296, Theorem 4.9]). Let K be a finitely generated normal subgroup of G such that the quotient Q = G/K is finitely presented and has a polynomial Dehn function, and WSP(Q) and CWP(K) can be solved in polynomial time. Then also WP(G) can be solved in polynomial time.
4.6.6 Complexity of compressed word problems Recall our initial question: Is the word problem for Aut(F3 ) solvable in polynomial time? To give a positive answer, it suffices by Theorem 4.6.14 to solve the compressed word problem for F3 in polynomial time. This is our next step. Theorem 4.6.17 ([293]). For every finite n ≥ 1, CWP(Fn ) can be solved in polynomial time. Proof. It would suffice to show the theorem for n = 2 (since Fn is a subgroup of F2 for every n ≥ 1), but this does not simplify the proof. Fix a generating set Γ for Fn of size n and let Σ = Γ ∪ Γ−1 . Let us write NF(w) for the normal form NFℛΓ (w) of the word w. Consider an SLP 𝒢 = (V, Σ, S, ρ). We can assume that 𝒢 is in Chomsky normal form. The idea for solving the compressed word problem is to compute in polynomial time a composition system for the word NF(val(𝒢 )). For this, we construct a composition 𝒢 = (V, Γ, S, ρ ) such that val𝒢 (A) = NF(val𝒢 (A)) for all A ∈ V. To do this, we process the variables of 𝒢 bottom-up. If ρ(A) = a ∈ Γ, we do not have to change anything, i. e., we can set ρ (A) = a. Now consider a variable A ∈ V such that ρ(A) = BC, and we have achieved already val𝒢 (B) = NF(val𝒢 (B)) and val𝒢 (C) = NF(val𝒢 (C)). Let u = val𝒢 (B), v = val𝒢 (C), n = |u| and m = |v|. We have to extend 𝒢 in such a way that val𝒢 (A) = NF(uv). Since u and v are already reduced, cancelation can happen in uv only at the border between the prefix
u and the suffix v. In other words, there exists a number k ≥ 0 such that NF(uv) = u[: n − k] v[k + 1 :]. Since we want to construct a composition system, it suffices to compute the binary encodings of the numbers k and n. Then we can set ρ′(A) = B′C′ for fresh variables B′ and C′ with ρ′(B′) = B[: n − k] and ρ′(C′) = C[k + 1 :]. We first use Theorem 4.6.12 to compute SLPs 𝒢B and 𝒢C for the words u−1 and v, respectively, from the composition system 𝒢′ we have computed so far. These two SLPs are only used temporarily. When A is processed, 𝒢B and 𝒢C are no longer needed. In order to get 𝒢B we first compute an SLP for u, and then we reverse all right-hand sides and replace every occurrence of a symbol a by a−1. From 𝒢B and 𝒢C we can easily compute the lengths n and m of u−1 and v, respectively (by doing |𝒢B| + |𝒢C| additions of binary numbers). It remains to compute the binary encoding of the number k. Note that k is the length of the longest common prefix of u−1 and v. We cannot search in a brute-force way for k, since there are min{n, m} (an exponentially large number) possible values for k. Instead, we compute the number k in polynomial time using a well-known algorithmic technique: binary search. We start with i := 0 and j := min{n, m}. The invariant is that k belongs to the interval [i, j]. In each phase we check whether u−1[: p] = v[: p] for p = ⌈(i + j)/2⌉. If this is true, we set i := p (and do not change j), otherwise we set j := p − 1 (and do not change i). The condition u−1[: p] = v[: p] can be checked in polynomial time using Theorems 4.6.3 and 4.6.12. Since the interval [i, j] is halved in every step, after 𝒪(log min{n, m}) steps we have i = j and then k = i. This concludes the construction of the composition system 𝒢′ for the word NF(val(𝒢)). By Theorem 4.6.12 we can transform 𝒢′ in polynomial time into an SLP ℋ such that val(ℋ) = val(𝒢′) = NF(val(𝒢)). Clearly, val(𝒢) = 1 in Fn if and only if val(ℋ) = ε. But checking whether an SLP produces the empty word is very easy. One only has to check whether no terminal symbol can be reached from the start variable.

In the rest of the section, we survey further results on the complexity of compressed word problems without going into proof details (many proofs can be found in [296]). Theorem 4.6.17 has been extended to many other classes of groups. Before we state the results, let us first mention a relatively simple result that is quite often useful to transfer complexity results for the compressed word problem (the second statement requires a bit more work).

Proposition 4.6.18 ([296, Proposition 4.3 and Theorem 4.4]). Let H and G be finitely generated groups. In each of the following two cases, CWP(H) is logspace reducible to CWP(G):
– H is a subgroup of G.
– H is a finite extension of G.
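Returning briefly to the proof of Theorem 4.6.17: the binary search used there to compute k fits in a few lines. In the sketch below (our own; ordinary Python strings stand in for the two words), equal_prefixes(p) plays the role of the test u−1[: p] = v[: p], which in the compressed setting is performed via Theorems 4.6.3 and 4.6.12.

```python
def longest_common_prefix_length(n, m, equal_prefixes):
    """Binary search for k, the length of the longest common prefix of two words of
    lengths n and m; equal_prefixes(p) must decide whether their length-p prefixes
    coincide. The loop runs for O(log min(n, m)) iterations."""
    lo, hi = 0, min(n, m)
    while lo < hi:
        p = (lo + hi + 1) // 2            # = ceil((lo + hi) / 2)
        if equal_prefixes(p):
            lo = p                        # invariant: k lies in [lo, hi]
        else:
            hi = p - 1
    return lo

u, v = "abacab", "abacba"
k = longest_common_prefix_length(len(u), len(v), lambda p: u[:p] == v[:p])
assert k == 4                             # the longest common prefix is abac
```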
188 | 4 Compression techniques in group theory A nontrivial generalization of Theorem 4.6.17 concerns graph groups and the larger class of virtually special groups. A graph group4 is specified by a finite simple graph (Γ, E), where Γ is the set of nodes and E is a symmetric and irreflexive edge relation. The associated graph group G(Γ, E) is the finitely presented group G(Γ, E) = ⟨Γ | ab = ba for all (a, b) ∈ E⟩. A group is called virtually special if it is a finite extension of a subgroup of a graph group. Theorem 4.6.19 ([296, Section 5.1]). For every virtually special group G, CWP(G) can be solved in polynomial time. The class of virtually special groups turned out to be very important for the solution of difficult problems in three-dimensional topology. In the course of this work, it turned out that the class of virtually special groups is surprisingly large. It contains the following classes of groups: – graph groups (by definition), – Coxeter groups [200], – one-relator groups with torsion [498], – fully residually free groups [498] (for fully residually free groups, Macdonald [315] independently obtains a polynomial-time solution for the compressed word problem), – fundamental groups of hyperbolic 3-manifolds [2]. In order to prove Theorem 4.6.19, it suffices by Proposition 4.6.18 to show that for every graph group the compressed word problem can be solved in polynomial time. This can be shown by a generalization of the proof technique for Theorem 4.6.17. In fact a more general result is shown in [207] using the operation of graph products. A graph product is specified by a tuple (Γ, E, (Ha )a∈Γ ), where (Γ, E) is a finite simple graph as above, and for every a ∈ Γ, Ha is a group, which is assumed to be finitely generated in the following. The associated graph product is the group G(Γ, E, (Ha )a∈Γ ) = ⟨∏ Ha gh = hg for all g ∈ Ha , h ∈ Hb , (a, b) ∈ E⟩. a∈Γ
Here, ∏a∈Γ Ha is the free product of the groups Ha . Intuitively, graph products interpolate between free products and direct products. If the edge relation E is empty, then G(Γ, E, (Ha )a∈Γ ) is the free product of the groups Ha . On the other hand, if (Γ, E) is a complete graph, then G(Γ, E, (Ha )a∈Γ ) is the direct product of the groups Ha . If all groups Ha are copies of ℤ, one obtains a graph group. In [207] the following transfer result is shown. 4 Graph groups are also known as right-angled Artin groups or free partially commutative groups.
Theorem 4.6.20 ([207]). If CWP(Ha ) can be solved in polynomial time for all a ∈ Γ, then CWP(G(Γ, E, (Ha )a∈Γ )) can be solved in polynomial time as well. Similar transfer theorems have been shown for HNN-extensions over finite associated subgroups and amalgamated products over finite amalgamated subgroups in [206]. We only state the result for HNN-extensions. Theorem 4.6.21 ([206]). Let H be a finitely generated group with finite isomorphic subgroups A and B, and let φ : A → B be an isomorphism. If CWP(H) can be solved in polynomial time, then CWP(⟨H, t | t −1 at = φ(a) (a ∈ A)⟩) can be solved in polynomial time as well. Another far reaching generalization of Theorem 4.6.17 concerns word-hyperbolic groups. Theorem 4.6.22 ([217]). For every word-hyperbolic group G, CWP(G) can be solved in polynomial time. For the proof of Theorem 4.6.22 one transforms an SLP 𝒢 over the generators of G into an SLP for the shortlex normal form of 𝒢 , which is the length lexicographically smallest word that is equivalent in G to val(𝒢 ). Clearly, for every finite group, the compressed word problem can be solved in polynomial time by evaluating the input SLP directly in the finite group (formally, this is a consequence of Theorem 4.6.22). Recall the complexity class NC ⊆ P (Section 4.3), which is the class of problems that can be solved in polylogarithmic time with polynomially many processors. The precise borderline between P-completeness and containment in NC for the compressed word problem of a finite group has been studied by Beaudry, McKenzie, Péladeau and Thérien [39].5 Theorem 4.6.23 ([39]). Let G be a finite group. – If G is not solvable, then CWP(G) is P-complete. – If G is solvable, then CWP(G) belongs to NC. So, for instance, the compressed word problem for the symmetric group on five elements (a nonsolvable group) is P-complete. For the case that G is solvable, Beaudry et al. prove a result that is actually stronger than the second statement in Theorem 4.6.23; they show that CWP(G) belongs to the class DET ⊆ NC, which is the class of all problems that are AC0 -Turing reducible to the computation of the determinant of an integer matrix. Infinite groups, for which the compressed word problem belongs to NC, seem to be rare. The only examples known to the author are finitely generated nilpotent groups. 5 Beaudry et al. speak about the equivalent circuit evaluation problem. Moreover, they consider not only finite groups but also finite monoids.
190 | 4 Compression techniques in group theory Theorem 4.6.24 ([273]). For every finitely generated nilpotent group G, CWP(G) belongs to DET ⊆ NC. In [316, 317] it is shown that several other algorithmic problems (simultaneous conjugacy, subgroup membership, computing subgroup presentations, subgroup conjugacy, computing the normalizer and isolator of a subgroup, coset intersection and computing the torsion subgroup) can be solved in polynomial time for finitely generated nilpotent groups when group elements are represented by SLPs. Moreover, the polynomial-time algorithms even work if the finitely generated nilpotent group is part of the input, provided the number of generators and the nilpotency class of the input group are fixed. Let us next present some results on randomized algorithms for compressed word problems. As mentioned above, finitely generated nilpotent groups are the only known examples of groups with a compressed word problem in NC. If we allow randomization we obtain further examples. The definition of the (restricted) wreath product A ≀ B of two groups A and B can be found for instance in [435]. Theorem 4.6.25 ([274]). Let G be a finite direct product of copies of ℤ and ℤp for primes p. Then, for every n ≥ 1, CWP(G ≀ ℤn ) belongs to coRNC. Theorem 4.6.25 can be applied to free metabelian groups. Recall that a group is metabelian if its commutator subgroup is abelian. By the Magnus embedding theorem [320] the free metabelian group of rank r can be embedded into the wreath product ℤr ≀ ℤr . Hence, Theorem 4.6.25 yields the following. Theorem 4.6.26 ([274]). For every finitely generated free metabelian group the compressed word problem belongs to coRNC. As mentioned in Section 4.5.2, the word problem for every finitely generated linear group can be solved in logspace (and hence in polynomial time). One might therefore wonder whether also the compressed word problem for a finitely generated linear group can be solved in polynomial time. We do not know the answer. The best we can show is the following result. Theorem 4.6.27 ([296, Theorem 4.15]). For every finitely generated linear group G, CWP(G) belongs to coRP. For the proof of this result one reduces (in logspace) the compressed word problem of a finitely generated linear group to the so-called polynomial identity testing problem (PIT): given an algebraic circuit (or SLP) for a multivariate polynomial p(x1 , . . . , xn ) ∈ ℤ[x1 , . . . , xn ] (or p(x1 , . . . , xn ) ∈ 𝔽q [x1 , . . . , xn ] for a prime q), check whether p(x1 , . . . , xn ) is the zero polynomial. PIT is one of the most famous problems in the area of algebraic complexity theory. PIT is known to be in coRP (and this yields Theorem 4.6.27), and it is a famous open problem whether PIT belongs to P (the problem is open for ℤ as well as 𝔽q , see, e. g., [440] for a survey). The reduction from the compressed word problem
to PIT uses the ideas from [291, 464], where it is shown that the word problem for a finitely generated linear group can be solved in logspace. As mentioned in Section 4.5.2, every finitely generated metabelian group can be embedded into a finite direct product of finitely generated linear groups [495]. With Theorem 4.6.27 it follows that the compressed word problem of a finitely generated metabelian group belongs to coRP. Let us now come to hardness results for compressed word problems. By the following theorem, the wreath product construction does not preserve the complexity of the compressed word problem. Theorem 4.6.28 ([296, Theorem 4.21]). If G is finitely generated non-abelian and ℤ ≤ H, then CWP(G ≀ H) is coNP-hard. The proof goes via a reduction from the (complement of the) subset sum problem to CWP(G ≀ H). The subset sum problem is the question whether given a list of binaryencoded natural numbers t, a1 , . . . , an there exists a subset A ⊆ {a1 , . . . , an } such that the sum of all numbers in A is t. Theorem 4.6.28 can be applied to the famous Thompson’s group F = ⟨x0 , x1 , x2 , . . . | xn xk = xk xn+1 for all k < n⟩. This group is actually finitely presented: F = ⟨a, b | [ab−1 , a−1 ba], [ab−1 , a−2 ba2 ]⟩. The group F has several nice representations, e. g., by piecewise linear homeomorphisms of the unit interval [0, 1] or by certain tree diagrams (see [76] for more details). A famous open problem is whether F is amenable. Lehnert and Schweitzer proved that the complement of the word problem of F (i. e., the set of all words over {a, b, a−1 , b−1 } that do not represent the group identity in F) is a context-free language [284], which implies that the word problem for F belongs to the complexity class LogCFL ⊆ NC (the set of all languages that are logspace reducible to a context-free language). From the latter one can deduce that CWP(F) is in PSPACE. It is known that the wreath product F ≀ ℤ is isomorphic to a subgroup of F [192]. Since F is non-abelian, Theorem 4.6.28 implies that CWP(F) is coNP-hard. Using the concept of leaf languages from complexity theory, a sharpening of the lower bound in Theorem 4.6.28 has been shown in [23] (the precise statement is a bit technical). An application of this is the following result from [23] (that also settles the complexity of the compressed word problem for Thompson’s group F). Theorem 4.6.29 ([23, Corollary B]). For the following groups the compressed word problem is PSPACE-complete: – wreath products G ≀ ℤ where G is either a finite nonsolvable group or a finitely generated free group of rank at least 2,
192 | 4 Compression techniques in group theory – – –
Thompson’s group F, the Grigorchuk group, the Gupta–Sidki groups.
The Grigorchuk group as well as the Gupta–Sidki groups are weakly branched groups [24]. In fact, the PSPACE-hardness proof from [23] is carried out for a large class of weakly branched groups. Recall from Section 4.5.2 that the word problem for an automaton group belongs to PSPACE and that there exist automaton groups with a PSPACE-complete word problem [493]. The construction for the latter result was also used in order to construct automaton groups with an EXPSPACE-complete compressed word problem. Theorem 4.6.30 ([493, Theorem 14]). There exists an automaton group with an EXPSPACE-complete compressed word problem. Recall Theorem 4.6.27, which is shown in [296] by a logspace reduction from the compressed word problem of a finitely generated linear group to PIT. The following result from [296] shows that there is also a logspace reduction in the opposite direction for the specific finitely generated linear group SL3 (ℤ). Theorem 4.6.31 ([296, Theorem 4.21]). PIT for multivariate polynomials over ℤ is logspace reducible to CWP(SL3 (ℤ)). Hence, solving the compressed word problem for SL3 (ℤ) is, up to logspace reductions, equivalent to polynomial identity testing, which has resisted so far all attacks for finding a polynomial-time algorithm. Let us finally come back to the compressed word problem for a free group. We proved the existence of a polynomial-time algorithm. By the following result, there is little hope to improve the upper bound to NC. Theorem 4.6.32 ([296, Theorem 4.16]). For every n ≥ 2, CWP(Fn ) is P-complete. The proofs of the P-hardness statements in Theorems 4.6.32 and 4.6.23 are very similar; they proceed by a reduction from the Boolean circuit value problem using the fact that Boolean conjunction can be simulated by commutators in nonsolvable finite groups and free groups. This approach is based on the work of Barrington [22] for finite nonsolvable groups and that of Robinson [429] for free groups, showing that the word problems for these groups are hard for the circuit complexity class NC1 . The same commutator technique is also used in the proof of Theorem 4.6.29.
4.6.7 Power word problems

A typical application of SLPs is to compress words of the form u1^n1 u2^n2 ⋅ ⋅ ⋅ uk^nk for words u1, . . . , uk ∈ Σ∗ and numbers n1, . . . , nk ∈ ℕ. One can easily write down an SLP for
this word of size 𝒪(∑ki=1 |ui | + log ni ). Note that the binary expansion of ni is of length 𝒪(log ni ). Let us define a power word over Σ as a tuple (u1 , n1 , . . . , u2 , n2 ) with u1 , . . . , uk ∈ Σ∗ and n1 , . . . , nk ∈ ℕ. Definition 4.6.33 (Power word problem). Let G be a finitely generated group with a finite generating set Γ. The power word problem for G with respect to Γ is the following problem: input: A power word (u1 , n1 , . . . , u2 , n2 ) over Γ ∪ Γ−1 . n n question: Does u1 1 ⋅ ⋅ ⋅ uk k = 1 hold in G? The input size in this problem is ∑ki=1 |ui | + log2 ni . As usual, the choice of the generating set has no influence on the computational complexity of the power word problem. It is an easy observation that the power word problem for G is logspace reducible to the compressed word problem for G. The power word problem has been studied in [303], where the following results were shown. Theorem 4.6.34 ([303]). For the following groups the power word problem belongs to the circuit complexity class TC0 (and hence to logspace): – finitely generated nilpotent groups, – wreath products G ≀ ℤ where G is finitely generated abelian. Theorem 4.6.35 ([303]). For the following groups the power word problem can be solved in logspace: – finitely generated free groups,6 – the Grigorchuk group. Recall that the compressed word problem for the Grigorchuk group is PSPACEcomplete (Theorem 4.6.29) and therefore more difficult than the power word problem for the Grigorchuk group (since L is a proper subset of PSPACE). Theorem 4.6.36 ([303]). Let G be either a finite nonsolvable group or a finitely generated free group of rank at least 2. Then the power word problem for G ≀ ℤ is coNP-complete. The proof of the coNP-hardness statement in Theorem 4.6.36 uses again the commutator technique of [22, 429] to simulate Boolean formulas (see the above remark after Theorem 4.6.32). It is also interesting to compare Theorem 4.6.36 with Theorem 4.6.29, according to which the compressed word problem for every wreath prod6 In fact, a stronger result is shown in [303]. The power word problem for a finitely generated free group is AC0 -Turing reducible to the word problem for the free group F2 . Recall that the word problem for F2 can be solved in logspace [291].
194 | 4 Compression techniques in group theory uct G ≀ ℤ with G either a finite nonsolvable group or a finitely generated free group of rank at least 2 is PSPACE-complete. For algebraic number fields, the power word problem has been studied earlier. n n n Ge [161] showed that one can verify in polynomial time an identity α1 1 α2 2 ⋅ ⋅ ⋅ αnn = 1, where αi are elements of an algebraic number field and ni are binary-encoded integers. A variant of the power word problem for free groups was studied by Gurevich and Schupp [196]. In this paper, the authors present a polynomial-time algorithm for a compressed form of the subgroup membership problem for a free group F, where n n n group elements are represented in the form a1 1 a2 2 ⋅ ⋅ ⋅ ann with binary-encoded integers ni . The ai must be standard generators of the free group F. Note that this input representation is slightly more restrictive compared to the power word problem. The latter allows powers of the form wx for w an arbitrary word over the group generators.
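To close this subsection, the size bound 𝒪(∑(|ui| + log ni)) mentioned at its beginning can be made concrete. The following Python sketch (encoding and names are ours; it assumes the letters of u are lower case, so that they do not clash with the variable names) builds an SLP for a single power u^n by iterated squaring; for a product of several powers one simply concatenates the corresponding start variables.

```python
def slp_for_power(u, n):
    """Build an SLP (a dict of right-hand sides) for the word u^n with
    O(|u| + log n) rules, by iterated squaring of the exponent."""
    rules = {"U": list(u)}                        # one rule for u itself
    doubles = ["U"]                               # doubles[i] produces u^(2^i)
    for i in range(1, n.bit_length()):
        rules[f"D{i}"] = [doubles[-1], doubles[-1]]
        doubles.append(f"D{i}")
    # concatenate the powers u^(2^i) selected by the binary expansion of n
    rules["S"] = [doubles[i] for i in range(n.bit_length()) if (n >> i) & 1]
    return rules, "S"

def expand(rules, sym):
    return sym if sym not in rules else "".join(expand(rules, s) for s in rules[sym])

rules, start = slp_for_power("ab", 11)            # 11 = 1011 in binary
assert expand(rules, start) == "ab" * 11
```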
4.6.8 Further applications of straight-line programs in group theory In this section we present further applications of SLPs in group theory. 4.6.8.1 Word problems for groups of outer automorphisms One of our main applications of the compressed word problem was the (uncompressed) word problem for automorphism groups (see Theorem 4.6.14). Quite often in group theory and topology, the outer automorphism group is more important than the full automorphism group. Let G be a group. An automorphism h : G → G is called inner if there exists a group element g ∈ G such that for all x ∈ G, h(x) = g −1 xg. The set of all inner automorphisms is denoted by Inn(G). It is a normal subgroup of Aut(G). The quotient Out(G) = Aut(G)/Inn(G) is called the outer automorphism group of G. In [442], Schleimer shows that the word problem for Out(Fn ) (where as usual, Fn is the free group of rank n) can be solved in polynomial time. The word problem for Out(Fn ) reduces to the membership problem for Inn(Fn ) in Aut(Fn ). This problem reduces to the so-called simultaneous compressed conjugacy problem for Fn ; given SLPs 𝒢1 , ℋ1 . . . , 𝒢k , ℋk , does there exist an x ∈ Fn such that val(𝒢i ) = x val(ℋi ) x −1 in Fn for all 1 ≤ i ≤ k? Schleimer solves this problem as follows. Recall that in the proof of Theorem 4.6.17 we proved that from a given SLP 𝒢 one can compute in polynomial time a composition system (and hence also an SLP) for the normal form NFℛΓ (val(𝒢 )). For the simultaneous compressed conjugacy problem one has to extend this result to cyclic reductions. For a word w ∈ (Γ ∪ Γ−1 )∗ one defines the cyclically reduced normal form as follows. Let v = NFℛΓ (w) be the unique reduced normal form for w. Let x be the longest prefix of v such that v = xux−1 for some word u ∈ (Γ ∪ Γ−1 )∗ . Then u is the cyclically reduced normal form of w. One can show that given an SLP for v, one can compute in polynomial time SLPs for the factors x and u.
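On uncompressed words, both reduction steps are straightforward; the following Python sketch (our own convention: an upper-case letter denotes the inverse of the corresponding lower-case generator) computes the reduced and the cyclically reduced normal form. The nontrivial content of the results cited above is that both steps can also be performed on SLP-compressed words in polynomial time.

```python
def free_reduce(word):
    """Reduced normal form in a free group: cancel all factors x x^-1. A letter is a
    one-character string; 'A' denotes the inverse of the generator 'a', and so on."""
    out = []
    for x in word:
        if out and out[-1] == x.swapcase():
            out.pop()                     # x cancels against the previous letter
        else:
            out.append(x)
    return out

def cyclically_reduce(word):
    """Return (x, u) with free_reduce(word) = x u x^-1 and u cyclically reduced."""
    v = free_reduce(word)
    i, j = 0, len(v)
    while j - i >= 2 and v[i] == v[j - 1].swapcase():
        i, j = i + 1, j - 1               # peel off an inverse pair at both ends
    return v[:i], v[i:j]

x, u = cyclically_reduce(list("abaBA"))   # abaBA = (ab) a (ab)^-1
assert (x, u) == (list("ab"), list("a"))
```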
As mentioned in Section 4.6.6, the simultaneous compressed conjugacy problem can be solved in polynomial time for every finitely generated nilpotent group [316], and in [217] the corresponding result is shown for every word-hyperbolic group. As a consequence, the word problem for every finitely generated subgroup of Out(G) for G word-hyperbolic or finitely generated nilpotent can be solved in polynomial time. The latter result can be generalized to the case where G is a graph product of word-hyperbolic groups and finitely generated nilpotent groups by using a transfer theorem for graph products from [207].

4.6.8.2 The reachability theorem

To the best of our knowledge, the first use of SLPs in the context of group theory was made by Babai and Szemerédi in their seminal paper [20], where the following so-called reachability theorem was shown.

Theorem 4.6.37 (Reachability theorem from [20]). Let G be a finite group of order n and let S ⊆ G be a generating set of G such that S = S−1. Then for every g ∈ G there exists an SLP 𝒢 over the terminal alphabet S such that val(𝒢) = g in G and |𝒢| ≤ 𝒪((log n)^2).

Proof. Let us fix G and S as in the theorem. We consider for convenience SLPs over the alphabet S such that for every variable A ∈ V the right-hand side ρ(A) has one of the following forms: ε (the empty word), xy for x, y ∈ V ∪ S, or x−1 for x ∈ V ∪ S. If ρ(A) = x−1, we set val𝒢(A) = val𝒢(x)−1. At the end, one can eliminate all right-hand sides of the form x−1 by introducing for every variable A a copy A−1 that evaluates to val𝒢(A)−1. This only doubles the size of the SLP. An SLP for a subset H ⊆ G is an SLP 𝒢 (of the above form) that contains for every element h ∈ H a variable A such that val𝒢(A) evaluates to h. Let c(H) (called the straight-line cost of H in [20]) be the minimal number of variables in an SLP for H. We define inductively a sequence of group elements g1, g2, . . . , gs as follows. Let g1 = 1. Now assume that g1, g2, . . . , gi have been constructed. Let

Ki = {g1^ε1 g2^ε2 ⋅ ⋅ ⋅ gi^εi | ε1, ε2, . . . , εi ∈ {0, 1}}.

Let ci = c({g1, g2, . . . , gi}) be the straight-line cost of {g1, g2, . . . , gi}. Note that c1 = c({g1}) = 1. If Ki^{−1}Ki = G, then we stop and set i = s. If Ki^{−1}Ki ≠ G, the set Ki^{−1}Ki is not closed under right-multiplication with generators from S (recall that S generates the group G). Thus, there exists an element g ∈ Ki^{−1}Ki S \ Ki^{−1}Ki, and we set gi+1 = g. Note that Ki+1 = Ki ∪ Ki gi+1. Since gi+1 ∉ Ki^{−1}Ki, i. e., Ki gi+1 ∩ Ki = ∅, we have |Ki+1| = 2 ⋅ |Ki|. Since |K1| = 1 = 2^0, we have |Ki| = 2^{i−1}, which implies s ≤ log2(n) + 1. Moreover, we claim that ci+1 ≤ ci + 3i for all 1 ≤ i < s. Let 𝒢i be an SLP for {g1, g2, . . . , gi} with ci variables. Since gi+1 ∈ Ki^{−1}Ki S, we can obtain gi+1 as a product of at most i elements from {g1, g2, . . . , gi}, i elements from {g1^{−1}, g2^{−1}, . . . , gi^{−1}}, and one generator from S. This means that an SLP for {g1, g2, . . . , gi+1} can be obtained by adding
196 | 4 Compression techniques in group theory at most 3i variables to 𝒢i (i variables for the inverses of g1 , . . . , gi and 2i variables for a product of 2i + 1 elements). From c1 = 1 and ci+1 ≤ ci + 3i we get cs ≤ 𝒪(s2 ). Since s ≤ log2 (n) + 1, we get an SLP 𝒢s with 𝒪((log n)2 ) variables for {g1 , . . . , gs }. Since Ks−1 Ks = G, every element of G is a product of at most log2 (n) elements from {g1 , . . . , gs } and at most log2 (n) elements from {g1−1 , . . . , gs−1 }. Hence, every element of G can be obtained by an SLP with 𝒪((log n)2 ) variables, which (due to the special form of the SLPs we consider) implies that also the size of the SLP is 𝒪((log n)2 ). The typical application of Theorem 4.6.37 is to show that the membership problem for large finite groups is in NP. Here is a concrete example. Theorem 4.6.38. The following problem belongs to NP: input: Numbers d, n ≥ 1 (in unary notation), a prime number p (in binary representation) and a list of matrices M1 , M2 , . . . , Mm , M ∈ GLd (𝔽pn ). question: Does the matrix M belong to the subgroup of GLd (𝔽pn ) generated by the matrices M1 , M2 , . . . , Mm ? 2
Proof. Note that the size of ⟨M1 , M2 , . . . , Mm ⟩ is bounded by pnd . Hence, if M belongs to ⟨M1 , M2 , . . . , Mm ⟩, then by Theorem 4.6.37 there exists an SLP over the alphabet −1 {M1 , M1−1 . . . , Mm , Mm } of size 𝒪(n2 d4 log2 p) that produces the matrix M. Such an SLP can be encoded by a binary word of length polynomial in the input size. Given an SLP −1 𝒢 of polynomial size over the alphabet {M1 , M1−1 . . . , Mm , Mm } one can check in polynomial time whether 𝒢 evaluates to M. This involves polynomially many multiplications of matrices from GLd (𝔽pn ). A single matrix multiplication in GLd (𝔽pn ) can be done in polynomial time. The problem in Theorem 4.6.38 is also known as the membership problem for matrix groups over finite fields. No deterministic polynomial-time algorithm for this problem is known, but the problem is known to be in P for the case that the matrices M1 , M2 , . . . , Mm commute [18]. Another important special case is the membership problem for permutation groups: given permutations g1 , . . . , gm , g on some set {1, . . . , n}, does g belong to the permutation group generated by g1 , . . . , gm ? This problem belongs to NC [19]. The proof of Theorem 4.6.38 uses only two facts of the group GLd (𝔽pn ): (i) the size of GLd (𝔽pn ) is exponentially bounded in the input length (which implies that elements of the group can be encoded by binary words of polynomial length) and (ii) given the description of two elements, the product can be computed in polynomial time. This observation allows to generalize Theorem 4.6.38 to the setting of so-called black box groups [20]. For finite groups that are given by their multiplication tables, Fleischer [136] recently proved the following result, where qAC0 is the class of all problems that can be accepted by a family of unbounded fan-in Boolean circuits of constant depth and
quasipolynomial size (i. e., size 2^{𝒪(log^c n)} for a constant c). The proof makes again use of the reachability theorem.
Theorem 4.6.39. The following problem belongs to qAC0 : input: The multiplication table of a finite group G, a subset S ⊆ G and an element g ∈ G. question: Does g belong to the subsemigroup of G generated by S? One important consequence of this result is that the problem from Theorem 4.6.39 cannot be hard for the class TC0 . Nies and Tent [382] give another application of Theorem 4.6.37 in group theory. They use the theorem in order to construct small first-order sentences that describe a certain group. 4.6.8.3 Equations over groups and monoids SLPs have been also applied in the context of solving equations over groups and monoids. Fix a countably infinite set X of variables and let X −1 = {x −1 | x ∈ X} be a disjoint copy of X. A word equation over a finitely generated group G with generating set Γ is a pair (U, V), where U and V are words over the alphabet Γ ∪ Γ−1 ∪ X ∪ X −1 . A solution for (U, V) in G is a monoid homomorphism σ : (Γ ∪ Γ−1 ∪ X ∪ X −1 )∗ → G such that σ(a) = a for all a ∈ Γ, σ(U) = σ(V), and σ preserves the involution: σ(w−1 ) = σ(w)−1 for all w ∈ (Γ ∪ Γ−1 ∪ X ∪ X −1 )∗ . The problem of solvability of word equations over G is defined as follows: input: A word equation (U, V) over G. question: Is there a solution for (U, V) in G? For a monoid M (instead of the group G) one defines this problem analogously; the inverse variables x −1 are not present in this case and σ is just a morphism σ : (Γ ∪ X)∗ → M. Makanin proves in his seminal paper [327] that solvability of word equations over a free monoid Γ∗ is decidable. In [328] he extends this result to free groups. In [412], Plandowski and Rytter prove that minimal solutions for word equations over free monoids are highly compressible. A solution σ : (Γ ∪ X)∗ → Γ∗ for the word equation (U, V) is minimal if for every solution σ of (U, V) we have |σ(U)| ≤ |σ (U)|. We say that |σ(U)| is the length of a minimal solution of (U, V). Theorem 4.6.40 ([412]). Let (U, V) be a word equation over the free monoid Γ∗ and let n = |UV|. Assume that (U, V) has a solution in Γ∗ and let N be the length of a minimal solution of (U, V). Then, for every minimal solution σ of (U, V), the word σ(U) can be generated by an SLP of size 𝒪(n2 log2 (N)(log n + log log N)). In combination with other ingredients, Plandowski uses Theorem 4.6.40 to show that solvability of word equations over a free monoid belongs to PSPACE [411]. This is the currently best-known upper bound. Solvability of word equations in a free monoid is easily seen to be NP-hard, and it has been repeatedly conjectured that the
precise complexity is NP too. In [227], Jeż applies his recompression technique (see Section 4.6.3) to word equations and obtains an alternative PSPACE-algorithm for solving word equations over a free monoid. Gutiérrez [197] proves that solvability of word equations over free groups belongs to PSPACE as well. In [84, 101] this result was shown using recompression. Moreover, a representation of the set of all solutions of a word equation (U, V) over F(Γ) in form of a graph of size exponential in |UV| is exhibited. In terms of formal language theory, it is shown that the set of all solutions in reduced words forms an effectively constructible EDT0L language [84]. In [100], the recompression technique is extended to so-called twisted word equations and it is shown that solvability of word equations over a virtually free group is in PSPACE and that the set of all solutions in reduced words is an EDT0L language. Further results regarding the compressibility of solutions of word equations can be found in [102].
Another type of equations where SLPs found applications arises in the knapsack problem. The knapsack problem for the finitely generated group G with generating set Γ is the following decision problem:
input: Words u, w1 , . . . , wk ∈ (Γ ∪ Γ^{-1})^*.
question: Do there exist natural numbers n1 , . . . , nk ∈ ℕ such that u = w1^{n1} ⋅ ⋅ ⋅ wk^{nk} holds in G?
This problem was first studied in [363]; it generalizes the classical knapsack problem for integers. In [306] the following result is shown.
Theorem 4.6.41. Let G(Γ, E) be a graph group. If the graph (Γ, E) contains a cycle on four nodes (C4) or a path on four nodes (P4) as an induced subgraph, then knapsack for G(Γ, E) is NP-complete. If (Γ, E) contains neither P4 nor C4 as an induced subgraph, then knapsack for G(Γ, E) belongs to P.
For the proof of the NP upper bound in Theorem 4.6.41, the following result is shown first for every graph group G(Γ, E), where u, w1 , . . . , wk ∈ (Γ ∪ Γ^{-1})^*. If there exist n1 , . . . , nk ∈ ℕ such that u = w1^{n1} ⋅ ⋅ ⋅ wk^{nk} in G(Γ, E), then there exist such n1 , . . . , nk ∈ ℕ that are exponentially bounded in the total input length |u| + |w1 | + ⋅ ⋅ ⋅ + |wk |. Therefore, the binary encodings of the numbers n1 , . . . , nk have only polynomially many bits. These binary encodings are a witness for the solvability of the knapsack equation. For the verification that u = w1^{n1} ⋅ ⋅ ⋅ wk^{nk} holds in G(Γ, E) one builds an SLP for the word u^{-1} w1^{n1} ⋅ ⋅ ⋅ wk^{nk} from the binary encodings of n1 , . . . , nk and solves the corresponding instance of the compressed word problem for G(Γ, E) in polynomial time using Theorem 4.6.19. Note that the power word problem for the graph group G(Γ, E) (defined in Section 4.6.7) would suffice here.
The NP upper bound in Theorem 4.6.41 can be generalized in two aspects: (i) the result holds for all virtually special groups and (ii) the input words u, w1 , . . . , wk for knapsack can be represented by SLPs; the resulting problem is called compressed knapsack in [306]. In fact only compressed knapsack is a generalization of the classical knapsack problem for the integers, because in the latter problem, integers are assumed to
be encoded in binary notation. In this setting, SLPs can be seen as a generalization of binary encodings from numbers to words. In [217] it is shown that compressed knapsack also belongs to NP for every word-hyperbolic group G (and is NP-complete if G is infinite).
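To make the role of binary exponent encodings concrete, here is a small Python sketch. It is an illustration only, not the SLP-based construction from [306]: in a group of 2×2 integer matrices (an assumption chosen for the example), a candidate exponent tuple with polynomially many bits can be checked by square-and-multiply, which plays the role here that the compressed (or power) word problem plays for graph groups above.

```python
# Minimal sketch (illustrative; not the verification procedure from the text):
# checking a knapsack witness u = w_1^{n_1} ... w_k^{n_k} in a group of 2x2
# integer matrices, with the exponents n_i given in binary.

def mat_mul(A, B):
    """Product of two 2x2 integer matrices."""
    return tuple(tuple(sum(A[i][l] * B[l][j] for l in range(2)) for j in range(2))
                 for i in range(2))

def mat_pow(A, n):
    """A**n by repeated squaring: O(log n) matrix multiplications."""
    result = ((1, 0), (0, 1))
    while n > 0:
        if n & 1:
            result = mat_mul(result, A)
        A = mat_mul(A, A)
        n >>= 1
    return result

def verify_knapsack_witness(u, ws, ns):
    """Does u equal ws[0]**ns[0] * ... * ws[-1]**ns[-1]?"""
    prod = ((1, 0), (0, 1))
    for w, n in zip(ws, ns):
        prod = mat_mul(prod, mat_pow(w, n))
    return prod == u

w = ((1, 1), (0, 1))                      # w**n has top-right entry n
u = ((1, 2 ** 100), (0, 1))
print(verify_knapsack_witness(u, [w], [2 ** 100]))   # True, despite the huge exponent
```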
4.7 Tower compression and beyond
SLPs allow one to compress some words of exponential length down to polynomial length. We have seen examples of word problems where this suffices to obtain polynomial-time algorithms. Our guiding example was the word problem of the automorphism group of a free group. A naive solution leads to words of exponential length. Fortunately, these words can be compressed, using SLPs, down to polynomial size. In this section, we will see examples of word problems where the exponential compression offered by SLPs is not sufficient. In these examples, naive solutions for the word problem produce powers of the form a^N, where a is a generator, and N is a huge integer, whose size may be super-exponential in the input length. Hence, we need a compression scheme that allows one to compress such huge integers and to do the necessary arithmetic manipulations efficiently. This leads to so-called power circuits. Our guiding example in this section is the word problem of the Baumslag group. The material in this section is taken from [103, 369, 370].
4.7.1 Motivation: the word problem for the Baumslag group Baumslag introduces in [30] a truly remarkable group that is nowadays known as the Baumslag group (sometimes also called the Baumslag–Gersten group) G1,2 . It is defined as the one-relator group G1,2 = ⟨a, b | b−1 a−1 bab−1 ab = a2 ⟩. Baumslag proved that all finite quotients of G1,2 are cyclic and that G1,2 is not residually finite. Gersten showed that the Dehn function of G1,2 has a lower bound of tower(⌈log2 n⌉), where the tower function is defined by tower(0) = 1 and tower(n + 1) = 2tower(n) . Later, Platonov [413] showed that tower(⌈log2 n⌉) is the exact Dehn function of G1,2 . Hence, Theorem 4.5.4 only yields a nonelementary algorithm for WP(G1,2 ). This led experts in the area to the conjecture that WP(G1,2 ) might be intrinsically difficult. It came therefore as a big surprise when Miasnikov, Ushakov and Won proved in 2011 that the word problem for G1,2 can be solved in polynomial time [369]. Their proof used a new compressed representation of huge integers, which they called power circuits. Before we have a closer look at power circuits, let us first explain how huge numbers arise when trying to solve the word problem for G1,2 .
200 | 4 Compression techniques in group theory Let us first write G1,2 = ⟨a, b | b−1 a−1 bab−1 ab = a2 ⟩ as an HNN-extension. By adding a new generator t and defining t = b−1 ab, we can write the relation b−1 a−1 bab−1 ab = a2 as t −1 at = a2 . Hence we have G1,2 = ⟨a, b | b−1 a−1 bab−1 ab = a2 ⟩ = ⟨a, b, t | t −1 at = a2 , b−1 ab = t⟩. A formal proof of this equivalence can be given using Tietze transformations. By taking the group BS1,2 = ⟨a, t | t −1 at = a2 ⟩, we finally get a decomposition of G1,2 as an HNN-extension of BS1,2 : G1,2 = ⟨BS1,2 , b | b−1 ab = t⟩.
(4.5)
The group BS1,2 is known as one of the so-called Baumslag–Solitar groups. It is an HNN-extension of ℤ = ⟨a⟩ and its Dehn function is known to be of exponential growth. From Britton's lemma (Theorem 4.5.3) it follows that the generators a and t generate copies of ℤ. Hence, (4.5) is indeed a proper HNN-extension. The decomposition for G1,2 in (4.5) allows to perform Britton reduction (see Section 4.5.3) in order to solve the word problem for G1,2 . For this, we have to be able to decide whether a given element of BS1,2 belongs to ⟨a⟩ or ⟨t⟩, respectively. The crucial observation is that every element g ∈ BS1,2 can be uniquely written in the form t^i a^y t^{-j} with i, j ∈ ℕ, y ∈ ℤ and the constraint that if i > 0 and j > 0, then y must be odd. Let us call this word the normal form of g. First of all, every word over {a, a^{-1}, t, t^{-1}} can be brought into the above normal form by the following infinite rewrite system ℬ, where x ∈ {a, a^{-1}, t, t^{-1}}, e ∈ {−1, 1} and m ∈ ℤ \ {0} are arbitrary:
x x^{-1} → ε,
a^e t → t a^{2e},
t^{-1} a^e → a^{2e} t^{-1},
t a^{2m} t^{-1} → a^m.
This system is terminating and one can show that it is also confluent, but we do not need this fact. From the defining relation t −1 at = a2 it follows that u = v in BS1,2 for every rule (u → v) ∈ ℬ. Hence, application of ℬ-rules preserves the value in the group BS1,2 . The first three rules allow to rewrite any word into the form t i ay t −j . To obtain the normal form, it suffices to apply the rules ta2m t −1 → am as long as possible. Uniqueness of the normal form follows from Britton’s lemma (Theorem 4.5.3). Assume that t i ay t −j = t k az t −l in BS1,2 , where we have i, j, k, l ∈ ℕ, y, z ∈ ℤ and the constraint that if i > 0 and j > 0 (respectively, k > 0 and l > 0), then y (respectively, z) must be odd. We get t i ay t l−j a−z t −k = 1. If l = j, then we get t i ay−z t −k = 1, which implies
i = k and y = z (if y ≠ z, then t^i a^{y−z} t^{-k} cannot be reduced to the empty word using Britton reduction). On the other hand, if l ≠ j one has to distinguish the cases l > j and l < j. The word t^i a^y t^{l−j} a^{−z} t^{-k} has to contain a pin. If l > j, then one must have l > 0, k > 0 and z even, which contradicts the fact that t^k a^z t^{-l} is in normal form. If l < j, then one must have i > 0, j > 0 and y even, which contradicts the fact that t^i a^y t^{-j} is in normal form. The following lemma summarizes the previous discussion.
Lemma 4.7.1. Every element of BS1,2 can be uniquely written as t^i a^z t^{-j} with i, j ∈ ℕ, z ∈ ℤ and the constraint that if i > 0 and j > 0, then z must be odd. This normal form can be computed by using the rewrite system ℬ.
As a corollary, we can in particular decide whether a given element of BS1,2 belongs to ⟨a⟩ or ⟨t⟩, respectively. This in turn allows to use Britton reduction in order to solve the word problem for G1,2 .
Note that the normal form of a word w ∈ {a, a^{-1}, t, t^{-1}}^* with respect to the rewrite system ℬ can be of length exponential in |w|. For instance, we have a t^n →^* t^n a^{2^n}. We could solve this problem by storing powers of the form a^n using the binary representation of n. But for the group G1,2 this does not solve the problem. Consider for instance the sequence of words wn , where w0 = a and w_{n+1} = (b^{-1} wn b)^{-1} a (b^{-1} wn b). The length of the word wn is exponential in n. But the length of the word that one obtains using Britton reduction is much bigger: we claim that Britton reduction applied to wn yields a^{tower(n)}. This can be shown by induction on n. We have w0 = a^1 = a^{tower(0)}. Now assume that Britton reduction for wn yields a^{tower(n)}. We then get the following for w_{n+1}:
w_{n+1} = (b^{-1} wn b)^{-1} a (b^{-1} wn b)
        = (b^{-1} a^{tower(n)} b)^{-1} a (b^{-1} a^{tower(n)} b)
        = t^{-tower(n)} a t^{tower(n)}
        = a^{2^{tower(n)}}
        = a^{tower(n+1)}.
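The normal form of Lemma 4.7.1 can be computed directly on exponents. The following Python sketch (illustrative code with hypothetical names, not taken from the literature) keeps an element of BS1,2 as a triple (i, y, j) standing for t^i a^y t^{-j} and applies the rules of ℬ when multiplying by a generator; running it on a t^n makes the blow-up a t^n →^* t^n a^{2^n} visible, with exact big-integer arithmetic already doing considerable work.

```python
# Sketch: elements of BS(1,2) = <a,t | t^-1 a t = a^2> in the normal form
# t^i a^y t^-j (i, j >= 0, and y odd whenever i > 0 and j > 0), following the
# rewrite system B.

class BS12:
    def __init__(self):
        self.i, self.y, self.j = 0, 0, 0          # the identity t^0 a^0 t^0

    def _normalize(self):
        # rule  t a^{2m} t^-1 -> a^m,  applied as long as possible
        while self.i > 0 and self.j > 0 and self.y % 2 == 0:
            self.i, self.y, self.j = self.i - 1, self.y // 2, self.j - 1

    def mul_a(self, e):
        # t^i a^y t^-j * a^e = t^i a^{y + e*2^j} t^-j   (rule t^-1 a^e -> a^{2e} t^-1)
        self.y += e * (1 << self.j)
        self._normalize()

    def mul_t(self, e):
        if e == 1 and self.j > 0:
            self.j -= 1                            # t^-j * t = t^-(j-1)
        elif e == 1:
            self.i, self.y = self.i + 1, 2 * self.y   # a^y t = t a^{2y}
        else:
            self.j += 1                            # append a letter t^-1
        self._normalize()

    def is_identity(self):
        return (self.i, self.y, self.j) == (0, 0, 0)

def normal_form(word):
    """word over a, A, t, T; capital letters denote inverses."""
    g = BS12()
    for x in word:
        if x in "aA":
            g.mul_a(1 if x == "a" else -1)
        else:
            g.mul_t(1 if x == "t" else -1)
    return g.i, g.y, g.j

print(normal_form("a" + "t" * 10))   # (10, 1024, 0): a t^10 = t^10 a^1024
print(normal_form("TTTattt"))        # (0, 8, 0):     t^-3 a t^3 = a^8
```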
This shows that any compression technique that yields exponential compression must fail for solving WP(G1,2 ). What is needed is a compression scheme that allows to represent numbers of size tower(n) with polynomially many bits. Of course this is not possible for all numbers in the range 1, . . . , tower(n). But in the solution of the word problem for G1,2 only very particular numbers arise; so there is still hope. Moreover, our compressed representation of numbers must enable efficient implementations of the arithmetic operations that are needed in the Britton reduction process for G1,2 . All this is achieved by power circuits.
4.7.2 Power circuits
Definition 4.7.2 (Power circuit [370]). A power circuit is a pair 𝒫 = (V, δ), where:
– V is a nonempty finite set whose elements are called nodes, and
– δ : V × V → ℤ such that the directed graph G(𝒫 ) = (V, E(𝒫 )) with E(𝒫 ) = {(u, v) | δ(u, v) ≠ 0} is acyclic.
Fix a power circuit 𝒫 = (V, δ). A marking in 𝒫 is a mapping M : V → ℤ. For a node u ∈ V we define the marking Mu by setting Mu (v) = δ(u, v) for all v ∈ V. Since G(𝒫 ) is acyclic we can assign to a marking M and a node u ∈ V values val(u), val(M) ∈ ℝ inductively as follows:
val(u) = ∑_{v∈V} δ(u, v) ⋅ 2^{val(v)},
val(M) = ∑_{v∈V} M(v) ⋅ 2^{val(v)}.
Thus, val(u) = val(Mu ). Note that val(u) = 0 if u has no outgoing edges in G(𝒫 ). If we want to make clear that we are dealing with the power circuit 𝒫 we write val𝒫 for val. A power circuit is called correct if val(u) ∈ ℤ for every node u. For a marking M : V → ℤ and a node u ∈ V we set |M|_1 = ∑_{v∈V} |M(v)| (which is the L1-norm of M) and |u|_1 = |Mu |_1 . Finally, we define the size of 𝒫 as |𝒫 | = ∑_{u∈V} |u|_1 . This is the total length of all unary encodings of the numbers δ(u, v). In the following we identify a power circuit 𝒫 = (V, δ) with the directed acyclic graph G(𝒫 ) = (V, E(𝒫 )) where every edge (u, v) ∈ E(𝒫 ) is labeled with the number δ(u, v) ≠ 0. Figure 4.1 shows a correct power circuit. Every node is labeled with its value. Note that there are two nodes with the same value (24). This seems to be redundant. On the other hand, it is not clear how to detect such redundant nodes efficiently. This will be done in the next section.
Figure 4.1: A correct power circuit.
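To make the definition concrete, the following Python sketch evaluates nodes of a power circuit by the formula val(u) = ∑_v δ(u, v)·2^{val(v)}. The circuit below is one way to realize the node values of Figure 4.1; its particular edge labels are an assumption chosen for illustration, not read off the original picture.

```python
from functools import lru_cache

# Sketch of power circuit evaluation (Definition 4.7.2).  delta[u] maps each
# successor v of u to the nonzero label delta(u, v).

delta = {
    "n0":  {},
    "n1":  {"n0": 1},                         # 2^0 = 1
    "n3":  {"n1": 1, "n0": 1},                # 2^1 + 2^0 = 3
    "n5":  {"n3": 1, "n1": -1, "n0": -1},     # 2^3 - 2^1 - 2^0 = 5
    "n24": {"n5": 1, "n3": -1},               # 2^5 - 2^3 = 24
    "big": {"n24": 2, "n1": -7},              # 2*2^24 - 7*2^1 = 33554418
}

@lru_cache(maxsize=None)                      # memoization keeps this linear-ish
def val(u):
    """val(u) = sum over successors v of delta(u, v) * 2**val(v)."""
    return sum(c * 2 ** val(v) for v, c in delta[u].items())

def val_marking(M):
    """val(M) = sum over nodes v of M(v) * 2**val(v)."""
    return sum(M[v] * 2 ** val(v) for v in M)

print({u: val(u) for u in delta})
# {'n0': 0, 'n1': 1, 'n3': 3, 'n5': 5, 'n24': 24, 'big': 33554418}
```

Note that a simple chain of n + 1 nodes with all edge labels equal to 1 already evaluates to 0, 1, 2, 4, 16, 65536, . . . at its successive nodes, which is how circuits of polynomial size reach numbers of tower size.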
Recall that a good compressed representation should have two properties: (i) it can be checked efficiently whether two compressed representations define the same objects and (ii) the relevant operations should have efficient algorithms on compressed
representations. For power circuits we deal with point (i) in Section 4.7.2.2 whereas Section 4.7.2.3 deals with (ii). 4.7.2.1 Extended binary representations Our algorithm for checking equivalence of power circuits transforms a power circuit into a kind of canonical power circuit. This process will be called reduction. It is best explained as a manipulation of extended binary representations of integers. Every natural number n can be uniquely written as a sum of pairwise different powers of two: n = ∑ki=0 ai ⋅ 2i , where ai ∈ {0, 1}. Things get a bit more tricky if we allow arbitrary integers for ai . That is what we mean by an extended binary representation. For a mapping f : ℤ → ℤ let Supp(f ) = {i ∈ ℤ | f (i) ≠ 0} be the support of f . Let ℱ (ℤ) be the set of all mappings f : ℤ → ℤ with finite support. Such a mapping f encodes the rational number ν(f ) = ∑ f (i) ⋅ 2i . i∈ℤ
Sometimes, it is useful to view a mapping f ∈ ℱ (ℤ) as an infinite ℤ-word ⋅ ⋅ ⋅ f (−2)f (−1)f (0)f (1)f (2) ⋅ ⋅ ⋅. The mapping f ∈ ℱ (ℤ) is called compact if for all i, (i) f (i) ∈ {−1, 0, 1} and (ii) f (i) ∈ {−1, 1} implies f (i+1) = 0. Hence, in the ℤ-word corresponding to a compact mapping f there do not exist consecutive nonzero entries. Compact mappings have been studied under various names by several authors [194, 224, 418, 449], and found applications in data structures and computer arithmetic. It is easy to check that for compact mappings f , g ∈ ℱ (ℤ) with f ≠ g one has ν(f ) ≠ ν(g). This is in fact a corollary of Lemma 4.7.5 below. We want to transform an arbitrary mapping f ∈ ℱ (ℤ) into a compact mapping with the same ν-value. For this, we use a rewrite system on ℱ (ℤ). For two mapping f , g ∈ ℱ (ℤ) we write f → g if there exists i ∈ ℤ such that f (j) = g(j) for all j ∈ ℤ \ {i, i + 1, i + 2} and one of the following cases holds: – f (i) ≥ 2, g(i) = f (i) − 2, g(i + 1) = f (i + 1) + 1, f (i + 2) = g(i + 2), – f (i) ≤ −2, g(i) = f (i) + 2, g(i + 1) = f (i + 1) − 1, f (i + 2) = g(i + 2), – f (i) = −1, f (i + 1) = 1, g(i) = 1, g(i + 1) = 0, f (i + 2) = g(i + 2), – f (i) = 1, f (i + 1) = −1, g(i) = −1, g(i + 1) = 0, f (i + 2) = g(i + 2), – f (i) = f (i + 1) = 1, f (i + 2) = 0, g(i) = −1, g(i + 1) = 0, g(i + 2) = 1, – f (i) = f (i + 1) = −1, f (i + 2) = 0, g(i) = 1, g(i + 1) = 0, g(i + 2) = −1. Note that f → g implies that ν(f ) = ν(g) and that f is compact if and only if there is no g with f → g. One may view these rules also as local replacements on infinite ℤ-words analogously to the rules of a rewrite system on finite words (see Section 4.4): a b → a−2 b+1
for a ≥ 2 or (a = 1, b = −1),   (4.6)
a b → a+2 b−1   for a ≤ −2 or (a = −1, b = 1),   (4.7)
a a 0 → −a 0 a   for a ∈ {−1, 1}.   (4.8)
For a mapping f ∈ ℱ (ℤ) we define |f |_1 = ∑_{i∈ℤ} |f (i)|.
Lemma 4.7.3. The binary relation → on ℱ (ℤ) is terminating and confluent. It can be used to compute for a given f ∈ ℱ (ℤ) a compact mapping g such that ν(f ) = ν(g) in at most |f |_1 rewrite steps.
Proof. We first show that → is terminating. The rules (4.6) and (4.7) can be applied only finitely many times since each of these rules reduces |f |_1 , whereas rule (4.8) does not change |f |_1 . Hence, if there exists an infinite chain f1 → f2 → f3 → ⋅ ⋅ ⋅, then there exists such a chain in which only rule (4.8) is applied. But this is not possible since this rule moves a 0-entry to the left. By considering all overlappings between left-hand sides of the rules (4.6)–(4.8), the reader can easily check that → is also locally confluent and hence confluent. Hence, applying the rules in any order yields a unique normal form, which must be compact (since one can always apply a rule to any noncompact mapping).
We finally show how to derive the normal form of f ∈ ℱ (ℤ) in at most |f |_1 steps. We first apply the rules (4.6) and (4.7) as long as possible. Let f ′ be the resulting mapping. We make at most |f |_1 − |f ′|_1 rewrite steps in this first phase. Moreover, |f ′|_1 ≤ |f |_1 . Consider the ℤ-word corresponding to f ′. It only contains the entries −1, 0 and 1 and it does not contain a factor −a a with a ∈ {−1, 1}. Hence, we can split this ℤ-word into maximal blocks of the form 1^d and (−1)^e that are separated by 0's. We rewrite these blocks into their normal forms from left to right. Consider the left-most block 1^d (if the left-most block is of the form (−1)^e we can argue analogously). If d = 1 nothing has to be done. So, assume that d ≥ 2. The normal form of 1^d 0 is −1 0^{d−1} 1 and it is obtained in d − 1 steps. If the next block is separated from the block 1^d by more than one 0, then we directly continue with the reduction of the next block. If the next block is separated from 1^d by a single 0, then there are two cases. If the next block is also a 1-block, then we merge the right-most 1 from −1 0^{d−1} 1 (the normal form of 1^d 0) with the next 1-block (recall that d ≥ 2). On the other hand, if the next block is of the form (−1)^e , then we obtain the factor 0 1 (−1)^e 0, which rewrites in one step to 0 −1 0 (−1)^{e−1} 0. We then continue with the block (−1)^{e−1} (in case e ≥ 2). In total, we make at most |f ′|_1 rewrite steps in this second phase. Hence, altogether, we make at most |f |_1 rewrite steps.
Example 4.7.4. Consider the mapping f with f (0) = 15 and f (i) = 0 for all i ≠ 0. The following is a reduction to the compact normal form of f :
15 0 0 0 0  →  13 1 0 0 0  →  11 2 0 0 0  →
 9 3 0 0 0  →   7 4 0 0 0  →   5 5 0 0 0  →
 3 6 0 0 0  →   1 7 0 0 0  →   1 5 1 0 0  →
 1 3 2 0 0  →   1 1 3 0 0  →   1 1 1 1 0  →
 1 1 −1 0 1  →  1 −1 0 0 1  →  −1 0 0 0 1.
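The reduction is easy to implement. The following Python sketch (illustrative only; it makes no attempt to meet the |f|_1 step bound of Lemma 4.7.3, and it applies the rules in a different order than the table above, which by confluence does not matter) reproduces the compact normal form −1 0 0 0 1 of Example 4.7.4. Mappings with support in ℕ are stored as plain lists indexed by position.

```python
# Sketch: reduction to compact form via the rules (4.6)-(4.8), assuming
# Supp(f) is contained in the natural numbers.

def nu(f):
    """nu(f) = sum of f[i] * 2**i."""
    return sum(c * 2 ** i for i, c in enumerate(f))

def is_compact(f):
    return all(abs(c) <= 1 for c in f) and \
           all(f[i] == 0 or f[i + 1] == 0 for i in range(len(f) - 1))

def reduce_to_compact(f):
    f = list(f) + [0, 0, 0]
    changed = True
    while changed:
        changed = False
        if any(f[-3:]):                    # keep three spare zero entries on top
            f += [0, 0, 0]
        for i in range(len(f) - 2):
            a, b = f[i], f[i + 1]
            if a >= 2 or (a == 1 and b == -1):            # rule (4.6)
                f[i], f[i + 1] = a - 2, b + 1
            elif a <= -2 or (a == -1 and b == 1):         # rule (4.7)
                f[i], f[i + 1] = a + 2, b - 1
            elif a != 0 and b == a and f[i + 2] == 0:     # rule (4.8), a in {-1, 1}
                f[i], f[i + 1], f[i + 2] = -a, 0, a
            else:
                continue
            changed = True
    while len(f) > 1 and f[-1] == 0:
        f.pop()
    return f

f = [15]                                   # the mapping of Example 4.7.4
g = reduce_to_compact(f)
print(g, is_compact(g), nu(g) == nu(f))    # [-1, 0, 0, 0, 1] True True
```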
Another nice aspect of compact mappings is that it is easy to compare their ν-values. For a mapping f ∈ ℱ (ℤ) with Supp(f ) ≠ 0 and i = max(Supp(f )) let f ̂ ∈ ℱ (ℤ) be defined by f ̂(i) = 0 and f ̂(j) = f (j) for all j ∈ ℤ \ {i}. We also define max(0) = −∞ for the following lemma. Lemma 4.7.5. Let f , g ∈ ℱ (ℤ) be compact such that Supp(f ) ∪ Supp(g) ⊆ ℕ (hence, ν(f ), ν(g) ∈ ℤ) and Supp(f ) ≠ 0. Let n = max(Supp(f )) ∈ ℕ, m = max(Supp(g)) ∈ {−∞} ∪ ℕ and assume that m ≤ n and f (n) = 1. Then one of the following cases holds (the cases are not disjoint): 1. m = −∞, n = 0 and thus ν(g) = 0 and ν(f ) = 1, 2. m = −∞, n > 0 and thus ν(g) = 0 and ν(f ) ≥ 2, 3. n ≥ m ≥ 0, g(m) = −1 and ν(f ) − ν(g) ≥ 2, 4. n ≥ 2, 0 ≤ m ≤ n − 2 and ν(f ) − ν(g) ≥ 2, 5. n ≥ 1, m = n − 1, g(m) = 1 and ν(f ) − ν(g) = ν(f ̂) − ν(g)̂ + 2m , ̂ 6. n = m ≥ 0, g(m) = 1 and ν(f ) − ν(g) = ν(f ̂) − ν(g). Proof. First assume that n = 0 and hence ν(f ) = 1. We must have m = −∞ or m = 0. If m = −∞, then case 1 holds. If m = 0, then either g(0) = −1 and ν(g) = −1 (thus, case 3 holds) or g(0) = 1 (thus, case 6 holds). In the rest of the proof, we assume that n > 0. Since f is compact, we have f (n−1) = 0 and thus ν(f ) ≥ 2n − 2n−1 + 1 = 2n−1 + 1 ≥ 2.
(4.9)
If m = −∞ we have ν(g) = 0 and thus case 2 holds. Now assume that 0 ≤ m ≤ n and n > 0 holds. If g(m) = −1, then ν(g) ≤ −2m + 2m−1 − 1 = −2m−1 − 1 (if m ≥ 1) or ν(g) = −1 (if m = 0). In both cases we have ν(g) ≤ −1, and thus ν(f ) − ν(g) ≥ 2 by (4.9). We therefore obtain case 3. Let us now assume that g(m) = 1. We can distinguish three cases according to the difference n − m ≥ 0: Case a: m ≤ n − 2. We get ν(g) ≤ 2n−1 − 1, and hence ν(f ) − ν(g) ≥ 2 by (4.9). We obtain case 4. ̂ and hence ν(f )−ν(g) = Case b: m = n−1. We have ν(f ) = 2n +ν(f ̂) and ν(g) = 2n−1 +ν(g), 2n−1 + ν(f ̂) − ν(g)̂ = ν(f ̂) − ν(g)̂ + 2m . We get case 5. Case c: m = n. Then we clearly get case 6. For the case that m ≤ n and f (n) = −1, statements analogous to those in Lemma 4.7.5 hold (one can simply negate all values f (i) and g(i), which negates the values ν(f ) and ν(g), and then apply Lemma 4.7.5). Together, they cover all cases (up to symmetry). A corollary of Lemma 4.7.5 is the following. Lemma 4.7.6. For all compact f , g ∈ ℱ (ℤ), ν(f ) = ν(g) implies f = g.
Proof. Assume that f ≠ g. By doing an appropriate right shift (which corresponds to multiplying the ν-values by a power of two), we can assume that Supp(f ) and Supp(g) are finite subsets of ℕ. Up to the symmetric cases for f (n) = 1, Lemma 4.7.5 covers all cases for f ≠ g. In all six cases we have ν(f ) ≠ ν(g). For cases 1–4 this is clear, and case 6 can be handled by induction on | Supp(f )| + | Supp(g)|. In case 5, define f ′ ∈ ℱ (ℤ) by f ′(n − 1) = 1 and f ′(i) = f̂(i) for all i ≠ n − 1. Since ĝ(n − 1) = 0, we have f ′ ≠ ĝ. Using induction on | Supp(f )| + | Supp(g)|, we get ν(f̂) + 2^{n−1} = ν(f ′) ≠ ν(ĝ), which yields ν(f ) ≠ ν(g) in case 5.
For a compact mapping f ∈ ℱ (ℤ) such that ν(f ) ∈ ℤ we must have Supp(f ) ⊆ ℕ. To see this, let z = ν(f ). Using the ordinary binary expansion of z, we can construct g ∈ ℱ (ℤ) such that ν(g) = ν(f ) and Supp(g) ⊆ ℕ. Let g ′ ∈ ℱ (ℤ) be the unique normal form of g obtained from the rewrite rules (4.6)–(4.8). Then g ′ is compact, ν(g ′) = ν(f ) and we still have Supp(g ′) ⊆ ℕ (the rewrite rules do not shift nonzero entries to the left). By Lemma 4.7.6 we have f = g ′. By the above remark, we can identify a compact mapping f ∈ ℱ (ℤ) such that ν(f ) ∈ ℤ with a finite word f (0)f (1) ⋅ ⋅ ⋅ f (k), where Supp(f ) ⊆ {0, 1, . . . , k}.
4.7.2.2 Reducing power circuits
In this section, we present a reduction operation for correct power circuits that will be crucial for the equivalence check on power circuits. We start with some definitions. Consider a correct power circuit 𝒫 = (V, δ) such that val(u) ≠ val(v) for all u, v ∈ V with u ≠ v. For a marking M : V → ℤ we define the mapping fM ∈ ℱ (ℤ) by setting
fM (i) = M(v) if val(v) = i, and fM (i) = 0 if there is no v with val(v) = i.   (4.10)
By our assumption there exists at most one node v ∈ V with val(v) = i. For a node u ∈ V we set fu = fMu . Note that ν(fu ) = val(u). A correct power circuit is called reduced if 𝒫 = (V, δ) has the following properties: – val(u) ≠ val(v) for all u, v ∈ V with u ≠ v, – for every u ∈ V, the mapping fu is compact. Figure 4.2 shows a correct reduced power circuit. The remark after the proof of Lemma 4.7.6 implies that Supp(fu ) ⊆ ℕ for every node u of a correct and reduced power circuit 𝒫 . In other words, if (u, v) ∈ E(𝒫 ), then val(v) ∈ ℕ. The goal of this section is to convert a correct power circuit into a reduced one in polynomial time. We start with the following lemma, which allows to compare values in correct reduced power circuits efficiently.
Figure 4.2: A correct reduced power circuit.
Lemma 4.7.7. Given a correct reduced power circuit 𝒫 = (V, δ) one can compute in polynomial time a mapping σ𝒫 : V ×V → {−2, −1, 0, 1, −2} such that for all nodes u, v ∈ V we have −2 if val(u) − val(v) ≤ −2, { { { σ𝒫 (u, v) = {val(u) − val(v) if |val(u) − val(v)| ≤ 1, { { if val(u) − val(v) ≥ 2. {2 Proof. We compute the σ𝒫 -values using dynamic programming. More precisely, let v1 , v2 , . . . , vn be a topological sorting of the directed acyclic graph G(𝒫 ), in the sense that j < i for every edge (vi , vj ) ∈ E(𝒫 ). We process the nodes in that order. When processing node vk we have already computed all values σ𝒫 (vi , vj ) with 1 ≤ i, j < k. Then we compute the values σ𝒫 (v1 , vk ), σ𝒫 (vk , v1 ), . . ., σ𝒫 (vk−1 , vk ), σ𝒫 (vk , vk−1 ), σ𝒫 (vk , vk ) in any order. Let us consider the computation of a certain value σ𝒫 (u, v). By our evaluation order, we have already computed all values σ𝒫 (s, t), where s is a successor of u or v, and similarly, t is a successor of u or v. Since 𝒫 is reduced, we have val(s) ∈ ℕ for every successor s of u or v. We distinguish the following cases: Case 1. Neither u nor v has successors. Then we have u = v (since 𝒫 is reduced) and val(u) − val(v) = 0. Case 2. Only u has successor nodes, and v has no successor node (or vice versa). Then we have val(v) = 0. Using the σ𝒫 -values that we have computed already, we can determine among the successors of u the node s with the largest value. Since 𝒫 is reduced, v is the unique node without an outgoing edge. If s = v, then val(u) = δ(u, v) ∈ {−1, 1} and we can compute val(u) − val(v). If s ≠ v, then val(s) > 0. By point 2 from Lemma 4.7.5, we have val(u) − val(v) ≥ 2 if δ(u, s) = 1 and val(u) − val(v) ≤ −2 if δ(u, s) = −1.
208 | 4 Compression techniques in group theory Case 3. Both u and v have successor nodes. This is the main case. Using the σ𝒫 -values that we have computed already, we can determine among the successors of u and v the node s with the largest value. Assume without loss of generality that s is a successor of u and that δ(u, s) = 1 (the case that δ(u, s) = −1 is symmetric). Moreover, let t be the successor of v with the largest value (thus, val(t) ≤ val(s)). We now modify the labels of the outgoing edges of u and v in several steps. Thereby, the value val(u) − val(v) will not be changed. Moreover, in every step we reduce |u|1 + |v|1 . At the end, we set back the edge labels to their initial values. If δ(v, t) = −1, then val(u) − val(v) ≥ 2 by point 3 of Lemma 4.7.5. Now assume that δ(v, t) = 1. If s = t, then we set δ(u, s) and δ(v, s) both to zero and continue. Clearly, this does not change the difference val(u) − val(v) but reduces |u|1 + |v|1 . We can now assume that t ≠ s and thus val(t) < val(s). This case is covered by points 4 and 5 of Lemma 4.7.5. More precisely, if σ𝒫 (s, t) = 2, then val(u) − val(v) ≥ 2 by point 4. On the other hand, if σ𝒫 (s, t) = 1 (and thus val(s) = val(t) + 1), then we can apply point 5. For this, we modify the outgoing edge labels of u and v as follows. We set δ(u, s) = 0, δ(v, t) = 0 and δ(u, t) = 1 (note that due to compactness of 𝒫 , we had δ(u, t) = 0 before) and continue with the resulting power circuit (now t is the successor with the largest value). Point 5 of Lemma 4.7.5 implies that val(u) − val(v) does not change. After |u|1 + |v|1 steps we arrive at one of the cases, where we can directly compute σ𝒫 (u, v). Once this is done, we reset all values δ(u, x) and δ(v, x) back to their initial values. By the previous lemma we can always assume that a correct reduced power circuit
𝒫 is equipped with the mapping σ𝒫 .
Consider now again a correct reduced power circuit 𝒫 = (V, δ). Let N(𝒫 ) = {val(u) | u ∈ V}. We next want to construct from 𝒫 = (V, δ) an extended power circuit 𝒫 ′ = (V ′, δ ′) with the following properties:
– 𝒫 ′ is correct and reduced,
– V ⊆ V ′ and val𝒫′ (u) = val𝒫 (u) for all u ∈ V,
– N(𝒫 ′) = N(𝒫 ) ∪ {n + 1 | n ∈ N(𝒫 )}.
We say that 𝒫 ′ is an extension of 𝒫 . Let us define g(𝒫 ) = |{n ∈ N(𝒫 ) | n + 1 ∉ N(𝒫 )}|. Note that g(𝒫 ′) ≤ g(𝒫 ) if 𝒫 ′ is an extension of 𝒫 .
Lemma 4.7.8. Given a correct and reduced power circuit 𝒫 = (V, δ) one can compute in polynomial time an extension 𝒫 ′ = (V ′, δ ′) such that |V ′| = |V| + g(𝒫 ).
Proof. We cannot compute the (binary representations of the) numbers in N(𝒫 ). But each number n ∈ N(𝒫 ) is given by a unique node u (such that val(u) = n). By
Lemma 4.7.7 we know whether the numbers n, m given by two nodes of the power circuit satisfy n < m and n + 1 = m, respectively. For each node u ∈ V such that val(u)+1 ∈ ̸ N(𝒫 ) we have to add a node u satisfying val(u ) = val(u) + 1. Let us consider such a node u and assume that for all successor nodes v ∈ V of u such that val(v) + 1 ∈ ̸ N(𝒫 ) we have already added a node v with val(v ) = val(v) + 1. Consider now the mapping fu ∈ ℱ (ℤ) defined in (4.10), which encodes the marking Mu . Note that this mapping is compact, since 𝒫 is compact. Hence, Supp(fu ) ⊆ ℕ. Below, we identify fu with a finite word fu (0)fu (1) ⋅ ⋅ ⋅ fu (k) ∈ {−1, 0, 1}∗ , where Supp(fu ) ⊆ {0, 1, . . . , k}. Let g ∈ ℱ (ℤ) be the unique compact mapping g with ν(g) = ν(fu ) + 1 = val(u) + 1. Also g is written as a finite word below. We first show the following. Claim. For every i ∈ Supp(g) we have (i) i = 0, (ii) i ∈ Supp(fu ) or (iii) i ≥ 1 and i − 1 ∈ Supp(fu ). To prove the claim, we distinguish three cases: Case 1. fu (0) = 0. Since fu is compact, we can write fu uniquely as (0 1)d 0 a w with a ∈ {0, −1}, d ≥ 0 and w ∈ {−1, 0, −1}∗ . By using the rewrite rules (4.6)–(4.8) from the proof of Lemma 4.7.3 we see that g = (−1 0)d 1 0 w if a = 0 and g = (−1 0)d+1 w if a = −1. Hence, the statement of the claim holds. Case 2. fu (0) = −1, and hence fu = −1 w for some w ∈ {−1, 0, −1}∗ . Then, g = 0 w and the statement of the claim clearly holds. Case 3. fu (0) = 1. Since fu is compact, we can write fu uniquely as (1 0)d a w with a ∈ {−1, 0}, d ≥ 1 and w ∈ {−1, 0, −1}∗ . By using the rewrite rules (4.6)–(4.8) from the proof of Lemma 4.7.3 we get g = (0 −1)d−1 0 1 0 w if a = 0 and g = (0 −1)d 0 w if a = −1. Again, the statement of the claim holds. By the above claim, for every i ∈ Supp(g) one of the following two cases holds (note that 𝒫 contains a node with value 0): – There is a node v ∈ V such that val𝒫 (v) = i. – There is no node v ∈ V with val𝒫 (v) = i, but there is a node v ∈ V with val𝒫 (v) = i − 1. In the second case, by construction, we have already added a node v to 𝒫 that evaluates to i. Therefore, all the necessary successors of the new node that will evaluate to val(u) + 1 are already present and we can add the corresponding outgoing edges for u . The edge labels are the nonzero entries of the compact mapping g constructed above. Note that g(𝒫 ) new nodes are added to the power circuit. We can now show the main result of this section. Theorem 4.7.9. There is a polynomial-time algorithm for the following problem: input: A power circuit 𝒫 = (V, δ), which is not necessarily correct.
210 | 4 Compression techniques in group theory output: In case 𝒫 is not correct, 0; otherwise, a correct and reduced power circuit 𝒫 = (V , δ ) together with a mapping f : V → V such that val𝒫 (v) = val𝒫 (f (v)) for all v ∈ V. Proof. We first describe the algorithm and then analyze its running time. Let 𝒫 = (V, δ) be the input power circuit and let v1 , v2 , . . . , vn be a topological sorting of the directed acyclic graph G(𝒫 ) in the sense that j < i for every edge (vi , vj ) ∈ E(𝒫 ). Thus v1 is a node without outgoing edges. We process the nodes in this order and add in each phase new nodes to the power circuit. At each time instant, we denote with 𝒫 the current power circuit. We start with the power circuit 𝒫 that only contains v1 . Assume that we have processed v1 , . . . , vk−1 for some k > 1 and let 𝒫 = (V , δ ) and f : {v1 , . . . , vk−1 } → V be the power circuit and the mapping that we have constructed so far, respectively. We also assume that val𝒫 (vi ) ∈ ℤ for all 1 ≤ i ≤ k − 1; otherwise the algorithm would have stopped with output zero before. The induction invariant that will be maintained by the algorithm is the following: – 𝒫 is correct and reduced, – for every 1 ≤ i ≤ k − 1, f (vi ) is the unique node of 𝒫 such that val𝒫 (f (vi )) = val𝒫 (vi ), – g(𝒫 ) ≤ k − 1. Note that all successor nodes of vk in G(𝒫 ) are among v1 , . . . , vk−1 . Since 𝒫 is reduced, we have f (vi ) = f (vj ) if and only if val𝒫 (vi ) = val𝒫 (vj ). We now define a marking M in 𝒫 . For every 1 ≤ i ≤ k − 1, we set M(f (vi )) to the sum of all values δ(vk , v) ≠ 0, where f (v) = f (vi ). All other values of M are set to zero. Note that val𝒫 (M) = val𝒫 (vk ) by construction. It remains to modify the marking M such that it becomes compact. Basically, we reduce the mapping fM ∈ ℱ (ℤ) defined in (4.10) by applying the rewrite rules (4.6)–(4.8) to fM and simulating every reduction step on the marking M. Note that |fM |1 = |M|1 ≤ |vk |1 . By Lemma 4.7.3, the reduction of fM can be carried out in at most |vk |1 rewrite steps. Consider a single rewrite step. Assume for instance that the rule a b → a−2 b+1 is applied, where a ≥ 2. This rule can be applied for every node v of 𝒫 such that M(v ) ≥ 2. In order to apply the rule, we need in 𝒫 a node v such that val𝒫 (v ) = val𝒫 (v ) + 1. Whether this node exists can be detected using the mapping σ𝒫 from Lemma 4.7.7. If this node v does not exist in the current 𝒫 , we create it using the extension algorithm from Lemma 4.7.8. If v exists, then we simply modify the marking M by setting M(v ) := M(v ) − 2 and M(v ) := M(v ) + 1. All other rules from (4.6)–(4.8) can be handled analogously. After making the marking M compact, it is easy to detect whether val𝒫 (vk ) = val𝒫 (M) ∈ ℤ. This is the case if Supp(fM ) ⊆ ℕ, which can be checked by comparing every node v such that M(v ) ≠ 0 with the unique node in 𝒫 that has no outgoing edges. For this, we use the mapping σ𝒫 . If it turns out that val𝒫 (M) ∈ ̸ ℤ, then we stop with the output 0. Otherwise, we check whether 𝒫 already contains a node v with the
same value as val𝒫 (M). For this we have to check whether there exists a node v in 𝒫 such that M = Mv . If such a v exists, we set f (vk ) = v . Otherwise, we add a node v to 𝒫 , set δ(v , v ) = M(v ) for all nodes v ∈ 𝒫 and finally define f (vk ) = v . The extension steps do not increase g(𝒫 ) (they may reduce this number). Hence, after making M compact, we still have g(𝒫 ) ≤ k − 1. Finally, we possibly add a new node to 𝒫 (the node v with f (vk ) = v ). Hence, after processing vk we have g(𝒫 ) ≤ k. This shows that the invariants are preserved. We claim that the above algorithm works in polynomial time. Note that in total we make at most |𝒫 | = ∑u∈V |u|1 extension steps (for every rewrite step we make at most one extension step), and, by Lemma 4.7.8, each extension step adds g(𝒫 ) nodes to the power circuit, where 𝒫 is the current compact power circuit. By the third invariant, we have g(𝒫 ) ≤ |V| at every stage. Thus, in total we add at most |V| ⋅ |𝒫 | ≤ |𝒫 |2 nodes to the power circuit. Example 4.7.10. Let us apply the reduction algorithm from the above proof to the power circuit from Figure 4.1. We only consider the last step, where the node with value 33554418 is added. After processing the nodes with values 0, 1, 3, 5 and the two nodes with value 24 from Figure 4.1, we obtain the power circuit 𝒫 from Figure 4.3 on the left. The node with value 4 was added during an execution of the extension algorithm from Lemma 4.7.8. In the following we identify the nodes of 𝒫 with their values. 25
Figure 4.3: Two intermediate power circuits 𝒫 (on the left) and 𝒫 (on the right) for the reduction of the power circuit from Figure 4.1.
For the processing of 33554418 we start with the marking M defined by M(1) = −7, M(24) = 2 (these numbers correspond to the labels of the outgoing edges of 33554418 in Figure 4.1) and M(x) = 0 for x ∈ {0, 2, 3, 4, 5}. The corresponding mapping fM is represented by the ℤ-word 0 −7 0^{22} 2 (a −7 at position 1, a 2 at position 24, and 0's elsewhere). Here is a reduction of this word using the rewrite rules (4.6)–(4.8). The first row specifies the position in the word.
0   1   2   3   4   5   6  ⋅⋅⋅  22  23  24  25
0  −7   0   0   0   0   0  ⋅⋅⋅   0   0   2   0
0  −5  −1   0   0   0   0  ⋅⋅⋅   0   0   2   0
0  −3  −2   0   0   0   0  ⋅⋅⋅   0   0   2   0
0  −1  −3   0   0   0   0  ⋅⋅⋅   0   0   2   0
0  −1  −1  −1   0   0   0  ⋅⋅⋅   0   0   2   0
0  −1   1   0  −1   0   0  ⋅⋅⋅   0   0   2   0
0   1   0   0  −1   0   0  ⋅⋅⋅   0   0   2   0
0   1   0   0  −1   0   0  ⋅⋅⋅   0   0   0   1
Only in the last rewrite step the extension algorithm from Lemma 4.7.8 has to be executed. In all previous rewrite steps, only numbers in the columns at positions 1, 2, 3, 4 are modified in the above table, and 𝒫 already contains nodes with the values 1, 2, 3, 4. Hence, these rewrite step can be directly simulated on the marking M. The last rewrite step adds a 1 at column 25, but 25 is not represented in 𝒫 . Hence, we have to execute the extension algorithm from Lemma 4.7.8. Since N(𝒫 ) = {0, 1, 2, 3, 4, 5, 24} the extension algorithm adds new nodes with the values 6 and 25, respectively. The resulting power circuit 𝒫 is shown in Figure 4.3 on the right. Then, we can finally add the new node with value 33554418 and three outgoing edges to 1, 4 and 25 (the columns with the nonzero entries of the last row in the above table). We obtain the reduced power circuit from Figure 4.2. Since the node with value 6 is not needed for representing the values of the original power circuit 𝒫 from Figure 4.1 it is omitted in the figure. Corollary 4.7.11. For a given correct power circuit 𝒫 = (V, δ) and two nodes u and v one can check in polynomial time whether val𝒫 (u) = val𝒫 (v) or val𝒫 (u) < val𝒫 (v) holds. Proof. Using Theorem 4.7.9, we compute in polynomial time a correct and reduced power circuit 𝒫 = (V , δ ) together with a mapping f : V → V such that val𝒫 (v) = val𝒫 (f (v)) for all v ∈ V. We can assume that 𝒫 is equipped with the mapping σ𝒫 as described in Lemma 4.7.7. This allows to check whether val𝒫 (f (u)) = val𝒫 (f (v)) or val𝒫 (f (u)) < val𝒫 (f (v)) holds. Weiß has shown in his thesis [496] that checking val𝒫 (u) = val𝒫 (v) for a given correct power circuit 𝒫 and two nodes u and v is a P-complete problem. 4.7.2.3 Arithmetic operations on power circuits In this section, we show that the operations of addition and multiplication (respectively, division) by a power of two have very efficient implementations on power circuits. We add some technical conditions to the results that will be needed for our application to the word problem for G1,2 in Section 4.7.3. For a power circuit 𝒫 = (V, δ) let roots(𝒫 ) be the set of all nodes v ∈ V that have no incoming edge in the graph G(𝒫 ). For a node u ∈ V let deg𝒫 (u) be the number of
outgoing edges of u in the graph G(𝒫 ): deg𝒫 (u) = |{v ∈ V | δ(u, v) ≠ 0}|. Moreover, let deg(𝒫 ) = ∑_{v∈roots(𝒫)} deg𝒫 (v).
Theorem 4.7.12. Given a power circuit 𝒫 = (V, δ) and two nodes u, v ∈ roots(𝒫 ), u ≠ v, one can compute in polynomial time a power circuit 𝒫 ′ = (V \ {v}, δ ′) with the following properties:
– val𝒫′ (u) = val𝒫 (u) + val𝒫 (v),
– val𝒫′ (x) = val𝒫 (x) for all x ∈ V \ {u, v},
– roots(𝒫 ′) = roots(𝒫 ) \ {v}, and
– deg(𝒫 ′) ≤ deg(𝒫 ).
Proof. We set 𝒫 ′ = (V \ {v}, δ ′), where δ ′ is defined by
δ ′(u, x) = δ(u, x) + δ(v, x) for all x ∈ V \ {v},
δ ′(x, y) = δ(x, y) for all x, y ∈ V \ {v} with x ≠ u.
The properties from the theorem can be easily verified for 𝒫 ′. Figure 4.4 shows an example for the above construction, where we add the values of the two root nodes from the power circuit in Figure 4.1 (33554418 and 24).
Figure 4.4: Computing 33554418+24.
Theorem 4.7.13. Given a power circuit 𝒫 = (V, δ), two nodes u, v ∈ roots(𝒫 ) with u ≠ v and e ∈ {−1, 1}, one can compute in polynomial time a power circuit 𝒫 ′ = (V ′, δ ′) with the following properties:
– V ⊆ V ′ and |V ′| ≤ |V| + deg𝒫 (u),
– roots(𝒫 ′) = roots(𝒫 ),
– deg(𝒫 ′) = deg(𝒫 ),
– val𝒫′ (u) = val𝒫 (u) ⋅ 2^{e⋅val𝒫 (v)}, and
– val𝒫′ (x) = val𝒫 (x) for all x ∈ roots(𝒫 ) \ {u}.
Proof. Assume that (u, u1 ), . . . , (u, uk ) are the outgoing edges of u and (v, v1 ), . . . , (v, vl ) are the outgoing edges of v. We add a copy ui′ to the power circuit in case (u, ui ) is not the only incoming edge for ui . If (u, ui ) is the only incoming edge for ui , then we do not have to copy ui . In order to simplify the notation we set ui′ = ui in the latter case. Note that
val𝒫 (u) = ∑_{i=1}^{k} δ(u, ui ) ⋅ 2^{val𝒫 (ui )}.
Hence, we have
val𝒫 (u) ⋅ 2^{e⋅val𝒫 (v)} = ∑_{i=1}^{k} δ(u, ui ) ⋅ 2^{val𝒫 (ui ) + e⋅val𝒫 (v)}.
We therefore construct 𝒫 ′ = (V ′, δ ′) with V ′ = V ∪ {u1′, . . . , uk′} as follows. The edge (u, ui ) of 𝒫 is replaced by (u, ui′) and we set δ ′(u, ui′) = δ(u, ui ). We want to have val𝒫′ (ui′) = val𝒫 (ui ) + e ⋅ val𝒫 (v). Therefore for every edge (ui , w) we add the edge (ui′, w) with weight δ ′(ui′, w) = δ(ui , w). Moreover, we add all edges (ui′, vj ) for 1 ≤ j ≤ l with weight δ ′(ui′, vj ) = e ⋅ δ(v, vj ). Of course, if vj is also a successor of ui , we add only a single edge with weight δ(ui , vj ) + e ⋅ δ(v, vj ). All edges (x, y) of 𝒫 with x ≠ u are not changed (with the only exception of the outgoing edges of ui in case ui′ = ui ). Note that the roots of 𝒫 ′ are exactly the roots of 𝒫 (for this it is important that we do not copy ui in case (u, ui ) is the only incoming edge for ui , because otherwise ui would become a new root) and that deg(𝒫 ′) = deg(𝒫 ). The properties for val𝒫′ follow directly from the construction of 𝒫 ′.
Figure 4.5 shows an example for the above construction. The node u (respectively, v) is the root node in the power circuit from Figure 4.1 with value 24 (respectively, 33554418). Note that u has two successor nodes with values 3 and 5. The successor with value 3 has to be copied since it has another incoming edge; the value of the copy is 33554421. The successor of u with value 5 is not copied, since u is its only predecessor; the new value of the node is 33554423.
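On an adjacency-map representation both operations are only a few lines. The following Python sketch is an illustration of the two constructions, not the implementation from [370]; in particular it always copies the successors of u, whereas the proof above avoids copies where possible, and the example circuit is an assumption chosen so that the values stay small enough to check numerically.

```python
import itertools

# Power circuit as an adjacency map: delta[u] = {v: nonzero label delta(u, v)}.
# Sketch of Theorem 4.7.12 (adding two root values) and Theorem 4.7.13
# (multiplying a root value by 2**(e*val(v))).

def val(delta, u, memo=None):
    memo = {} if memo is None else memo
    if u not in memo:
        memo[u] = sum(c * 2 ** val(delta, v, memo) for v, c in delta[u].items())
    return memo[u]

def add_roots(delta, u, v):
    """Theorem 4.7.12: afterwards val(u) is the old val(u) + val(v); v is removed."""
    for x, c in delta.pop(v).items():
        c_new = delta[u].get(x, 0) + c
        if c_new:
            delta[u][x] = c_new
        else:
            delta[u].pop(x, None)

_fresh = itertools.count()

def multiply_by_power(delta, u, v, e):
    """Theorem 4.7.13: afterwards val(u) is the old val(u) * 2**(e*val(v)).
    Each successor u_i of u is replaced by a fresh copy of value val(u_i) + e*val(v)."""
    new_edges = {}
    for ui, c in delta[u].items():
        copy = ("copy", next(_fresh))
        delta[copy] = dict(delta[ui])
        for vj, d in delta[v].items():           # shift the copy by e*val(v)
            d_new = delta[copy].get(vj, 0) + e * d
            if d_new:
                delta[copy][vj] = d_new
            else:
                delta[copy].pop(vj, None)
        new_edges[copy] = c
    delta[u] = new_edges

delta = {
    "zero": {}, "one": {"zero": 1}, "two": {"one": 1},
    "u": {"two": 1, "zero": -1},     # val(u) = 2^2 - 2^0 = 3
    "v": {"two": 1},                 # val(v) = 2^2       = 4
    "r": {"two": 1, "zero": 1},      # val(r) = 5
    "s": {"one": 1, "zero": 1},      # val(s) = 3
}
multiply_by_power(delta, "u", "v", 1)
print(val(delta, "u"))               # 48 = 3 * 2^4
add_roots(delta, "r", "s")
print(val(delta, "r"))               # 8 = 5 + 3
```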
4.7.3 Solving word problems using power circuits In this section we prove the following result using power circuits. Theorem 4.7.14. WP(G1,2 ) can be solved in polynomial time.
Figure 4.5: Computing 24 ⋅ 2^33554418.
Proof. As explained in Section 4.7.1, Britton reduction for G1,2 leads to huge powers of the form t^n and a^n. The idea is to store the exponent n of such powers using a correct power circuit. Formally, we manipulate a single power circuit 𝒫 . A power t^n is represented by a pair (t, v), where v is a node of 𝒫 such that val𝒫 (v) = n, and analogously for a^n. To make the notation more readable, we just write t^n for this pair (t, v) and always assume implicitly that n is represented by a node of the power circuit 𝒫 . Moreover, we identify the powers a^0 and t^0 with the empty word ε. The next idea is to use instead of the rules of the rewrite system ℬ for the Baumslag–Solitar group BS1,2 (see Section 4.7.1) the following “large-scale” rewrite system ℬl , where m > 0, n, p, q ∈ ℤ \ {0}:
a^n t^m → t^m a^{n⋅2^m},   (4.11)
t^{-m} a^n → a^{n⋅2^m} t^{-m},   (4.12)
a^p a^q → a^{p+q},   (4.13)
t^p t^q → t^{p+q}.   (4.14)
Note that ℬl allows to transform any sequence of blocks a^i , t^j (i, j ∈ ℤ \ {0}) into a sequence of at most three blocks: t^p a^q t^{-r} with p, r ∈ ℕ, q ∈ ℤ and the constraint that if q = 0, then p ⋅ r = 0. Also note that ℬl has no large-scale counterparts for the ℬ-rules t a^{2m} t^{-1} → a^m. In some sense, we delay applications of these rules until we make a Britton reduction step for G1,2 . For the latter, we use the following rewrite system 𝒢 , where m ∈ ℕ and n ∈ ℤ are arbitrary:
b t^n b^{-1} → a^n,   (4.15)
b^{-1} t^m a^n t^{-m} b → t^{n⋅2^{-m}}   if 2^m divides n.   (4.16)
216 | 4 Compression techniques in group theory To see that the combined rewrite system ℬl ∪ 𝒢 indeed allows to perform Britton reduction for G1,2 , note that a word t p aq t −r with p, r ∈ ℕ and q ∈ ℤ \ {0} cannot represent an element of ⟨t⟩ ⊆ BS1,2 since after maximally reducing t p aq t −r with the ℬ-rules ta2m t −1 → am we reach the normal form of t p aq t −r , but this normal form still contains at least one a or a−1 . Hence, we only need the rules (4.15) in order to perform Britton reduction for pins of the form bgb−1 . Moreover, the word t p aq t −r represents an element −p of ⟨a⟩ ⊆ BS1,2 if and only if p = r and 2p divides q, in which case we have t p aq t −r = aq2 in BS1,2 . Hence, we only need the rules (4.16) to perform Britton reduction for pins of the form b−1 gb. Now we can explain the algorithm for the word problem for G1,2 . Let w ∈ {a, a−1 , b, b−1 , t, t −1 }∗ be the input word. We can write w as w = w0 bϵ1 w1 bϵ2 w2 ⋅ ⋅ ⋅ bϵk wk , where ϵ1 , . . . , ϵm ∈ {−1, 1} and every wi belongs to {a, a−1 , t, t −1 }∗ . Moreover, we write every wi as a sequence of maximal blocks ap and t p (p ∈ ℤ \ {0}). We then construct a single correct power circuit 𝒫 that contains for each maximal a-block ap (respectively, t-block t p ) a node v ∈ roots(𝒫 ) such that val𝒫 (v) = p. Note that if different blocks have the same exponents, then we introduce different root nodes with the same value. During the algorithm, the nodes in roots(𝒫 ) will evaluate to the exponents of the a-blocks and t-blocks that are currently present. A crucial property of the algorithm is that deg(𝒫 ) will never increase during the algorithm. The main part of the algorithm applies the rules from ℬl ∪ 𝒢 as long as possible. For this, we can apply the following reduction strategy. 1. We first rewrite every maximal sequence of a-blocks and t-blocks into the form t p aq t −r with p, r ∈ ℕ, q ∈ ℤ and the constraint that if q = 0, then p ⋅ r = 0. For this we apply the rules from ℬl as long as possible. 2. Then we check whether a Britton reduction step can be made, i. e., whether a rule of 𝒢 can be applied. If not, then we obtain a Britton-reduced word, otherwise we apply a 𝒢 -rule and proceed with step 1. Before we talk about the implementation of a single rewrite step, let us first argue that the number of rewrite steps is bounded polynomially. The number of applications of 𝒢 -rules is clearly bounded by the input length |w|. Moreover, if we have m maximal a-blocks and t-blocks, then 𝒪(m) applications of ℬl -rules suffice in order to obtain the form t p aq t −r . Since no rule from ℬl ∪ 𝒢 increases the total number of a-blocks and t-blocks (some rules reduce this number), we do at most 𝒪(|w|) applications of ℬl -rules between two applications of 𝒢 -rules. In total, 𝒪(|w|2 ) rewrite steps are done. A single rewrite step involves some of the following arithmetical operations:
(i) An addition of two integers p and q (for rules (4.13) and (4.14)), where p and q are represented by two root nodes u and v, respectively. The value of u (which is p) is changed to p + q.7 (ii) The operation (n, m) → n ⋅ 2m (for rules (4.11) and (4.12)), where n ∈ ℤ and m ∈ ℕ are represented by two root nodes u and v, respectively. The value of u (which is n) is changed to n ⋅ 2m , whereas the value of v (which is m) is not changed. (iii) The operation (n, m) → n ⋅ 2−m in case 2m divides n (for rule (4.16)), where n ∈ ℤ and m ∈ ℕ are represented by two root nodes u and v, respectively. The value of u (which is n) is changed to n ⋅ 2−m . (iv) Checking equality of two integers (for rule (4.16)). (v) Checking for two integers n, m with m ≥ 0 whether 2m divides n (for rule (4.16)). Each of these operations can be carried out in polynomial time on a correct power circuit as explained in Sections 4.7.2.2 and 4.7.2.3. For (v) we first add a node representing n ⋅ 2−m to the power circuit and then check whether the power circuit is still correct, using Theorem 4.7.9. It remains to show that the correct power circuit stays of polynomial size during the reduction. This would fail, if, for instance, the size of the power circuit would double in each rewrite step. All numbers that are involved in the above operations (i)– (v) are represented by root nodes of the current power circuit. The checks in (iv) and (v) do not lead to any size increase (we can continue after the checks with the same power circuit). For the arithmetic operations in (i), (ii) and (iii) it is important that root nodes do not have incoming edges. Hence, for the addition in (i) we can apply Theorem 4.7.12, which tells us that we do not have to copy any nodes and that deg(𝒫 ) does not increase. For the multiplications in (ii) and (iii) let u be the node representing n. By Theorem 4.7.13 we have to copy at most deg𝒫 (u) ≤ deg(𝒫 ) nodes. Moreover, deg(𝒫 ) does not change. This means that the degree of 𝒫 stays bounded by the input length |w|. Hence, in any rewrite step we have to copy at most |w| nodes. Since we do at most 𝒪(|w|2 ) rewrite steps, the size of the power circuit stays bounded by 𝒪(|w|3 ). This concludes the proof of Theorem 4.7.14. Using more efficient power circuit operations, Diekert, Laun and Ushakov prove in [103] that the word problem for G1,2 can be solved in time 𝒪(n3 ). Laun [280] extends this result to all Baumslag groups G1,q = ⟨a, b | b−1 a−1 bab−1 ab = aq ⟩ for q ≥ 2. 7 There might be other blocks whose exponent is p. In this case, the number p is also represented by another root node.
Theorem 4.7.15 ([280]). The word problem for every Baumslag group G1,q can be solved in time 𝒪(n^3).
To show this result, Laun introduced a variant of power circuits for base q instead of base 2. In [103], Diekert, Laun and Ushakov use power circuits to solve also another difficult word problem in polynomial time. The Higman group Hp for p ≥ 2 is defined as
Hp = ⟨a0 , . . . , a_{p−1} | a_{i+1 mod p} a_i a_{i+1 mod p}^{-1} = a_i^2 (0 ≤ i ≤ p − 1)⟩.
It is known that H2 and H3 are trivial, whereas all groups Hp with p ≥ 4 are infinite. The group H4 was introduced by Higman [209]. It was the first example of an infinite finitely presented group with no nontrivial finite quotient. As for G1,2 , the Dehn function of H4 can be shown to be lower bounded by the tower function. On the other hand, we have the following theorem.
Theorem 4.7.16 ([103]). The word problem for the Higman group H4 can be solved in time 𝒪(n^6).
Laun [280] extends this result to all groups
Hp (1, q) = ⟨a0 , . . . , a_{p−1} | a_{i+1 mod p} a_i a_{i+1 mod p}^{-1} = a_i^q (0 ≤ i ≤ p − 1)⟩.
4.7.4 Ackermannian compression For some natural examples of groups, even power circuits are not enough to solve the word problem in polynomial time. In the following we will present a class of such groups. The material in this section is based on [107, 108]. Fix a k ≥ 1. Let us start with the group Gk = ⟨a1 , . . . , ak , t | t −1 a1 t = a1 , t −1 ai t = ai ai−1 (2 ≤ i ≤ k)⟩. These groups were introduced in [108] and are called Hydra groups in [107]. Note that Gk is an HNN-extension of the free group F(a1 , . . . , ak ) with respect to the automorphism θ : F(a1 , . . . , ak ) → F(a1 , . . . , ak ) with θ(a1 ) = a1 and θ(ai ) = ai ai−1 for 2 ≤ i ≤ k. It is shown in [108] that the elements a1 t, . . . , ak t ∈ Gk generate the free subgroup Hk = ⟨a1 t, . . . , ak t⟩ ≤ Gk . Finally, consider the HNN-extension Γk = ⟨Gk , p | p−1 hp = h (h ∈ Hk )⟩
= ⟨a1 , . . . , ak , t, p | t −1 ai t = θ(ai ), pai t = ai tp (1 ≤ i ≤ k)⟩.
It is shown in [108] that the Dehn function of Γk is extremely fast growing. To make a formal statement, we introduce the following family of functions A0 : ℤ → ℤ, A1 : ℤ → ℤ and Ak : ℕ → ℕ for k ≥ 2: A0 (n) = n + 1
for n ∈ ℤ,
A1 (n) = 2n for n ∈ ℤ,
Ak+1 (0) = 1
for k ≥ 1,
Ak+1 (n + 1) = Ak (Ak+1 (n)) for n ∈ ℕ, k ≥ 1. By induction, one can show that Ak is indeed totally defined on ℕ. Already for small k, the function Ak grows extremely fast: A3 grows like the tower function tower(n), but A4 grows much faster. The diagonal function n → An (n) is a minor variation of the socalled Ackermann function, which is important in theoretical computer science. The Ackermann function grows faster than any primitive recursive function. It is shown in [108] that for k ≥ 2 the Dehn function of Γk has the same asymptotic growth as Ak . Hence, applying Theorem 4.5.4 would yield an extremely inefficient algorithm for the word problem of Ak (for, say, k ≥ 3) and in the uniform setting, where the index k of Ak is part of the input, the algorithm would be no longer primitive recursive. Nevertheless, Dison, Einstein and Riley proved the following result. Theorem 4.7.17 ([107]). For every k ≥ 1, the word problem for Γk can be solved in polynomial time. A complete treatment of the proof would go beyond the scope of this chapter. Let us just explain the high-level idea. The word problem for the HNN-extension Γk = ⟨Gk , p | p−1 hp = h (h ∈ Hk )⟩ is solved using Britton reduction. For this, it suffices to solve the membership problem for the finitely generated subgroup Hk of Gk , i. e., one has to decide whether a given word over the generators of Gk represents an element of Hk . This is the hard part. Huge integer exponents arise during the membership test. These numbers have to be compressed, but power circuits would not be powerful enough for this (if k ≥ 4). Let us explain the compression scheme used instead. For every k ≥ 0, the function Ak is injective and hence has a partially defined in−1 −1 verse function A−1 k . Clearly, A0 is totally defined on ℤ: A0 (n) = n − 1 for all n ∈ ℤ. The −1 −1 −1 function A1 is defined on 2ℤ: A1 (2n) = n. Finally, Ak for k ≥ 2 is a partially defined −1 −1 mapping from ℕ to ℕ. A word w = a1 a2 ⋅ ⋅ ⋅ an with ai ∈ {A0 , A−1 0 , A1 , A1 , . . . , Ak , Ak } is identified with the partial mapping defined by w(x) = a1 (a2 (⋅ ⋅ ⋅ an (x) ⋅ ⋅ ⋅)). The integers that arise while testing membership in Hk ≤ Gk can be represented as w(0) for a −1 −1 ∗ 8 word w ∈ {A0 , A−1 0 , A1 , A1 , . . . , Ak , Ak } of length polynomial in the input length. This yields a kind of Ackermannian compression of integers. The following result from [107] is the main algorithmic result for Ackermannian compressed integers. 8 Actually, a slight variant of the functions Ai is needed (see [107, Section 3]).
Theorem 4.7.18 ([107]). For every k ≥ 0, there is a polynomial-time algorithm that takes as input a word w ∈ {A0 , A0^{-1}, A1 , A1^{-1}, . . . , Ak , Ak^{-1}}^*, and declares which of the following four cases holds: (i) w(0) is not defined, (ii) w(0) = 0, (iii) w(0) > 0, (iv) w(0) < 0.
A slight variant of this result is the key for the proof of Theorem 4.7.17.
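For very small values, the functions A_k and words over {A_0^{±1}, . . . , A_k^{±1}} can be evaluated naively, which helps to get a feeling for the setting. The following Python sketch is purely illustrative: this naive evaluation is exactly what the algorithm of Theorem 4.7.18 must avoid, since the values explode already for tiny inputs.

```python
# Naive evaluation of the functions A_k and of words over
# {A_0, A_0^{-1}, ..., A_k, A_k^{-1}} applied to 0.

def A(k, n):
    if k == 0:
        return n + 1
    if k == 1:
        return 2 * n
    return 1 if n == 0 else A(k - 1, A(k, n - 1))

def A_inv(k, n):
    """Partial inverse of A_k; returns None where it is undefined."""
    if k == 0:
        return n - 1
    if k == 1:
        return n // 2 if n % 2 == 0 else None
    m = 0                                  # brute-force search, tiny n only
    while A(k, m) < n:
        m += 1
    return m if A(k, m) == n else None

def evaluate(word, x=0):
    """word = [(k, sign), ...] read as A_k^{sign}(...); the rightmost letter acts first."""
    for k, sign in reversed(word):
        x = A(k, x) if sign == 1 else A_inv(k, x)
        if x is None:
            return None                    # case (i) of Theorem 4.7.18
    return x

print([A(3, n) for n in range(5)])               # [1, 2, 4, 16, 65536]: tower-like growth
print(evaluate([(1, -1), (3, 1), (0, 1), (0, 1)]))  # A_1^{-1}(A_3(A_0(A_0(0)))) = 2
```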
4.8 Open problems Let us conclude this chapter with a list of open problems. Compressed word problems and straight-line programs – Does the compressed word problem for a finitely generated linear group belong to P? Recall that this is equivalent to PIT ∈ P, which is a major open problem in algebraic complexity theory. One might therefore consider interesting subclasses of linear groups. Braid groups are linear and the complexity of the compressed word problem is open. Another interesting class are polycyclic group. In [273] it is shown that PIT restricted to so-called skew arithmetic circuits reduces to the compressed word problem for a particular polycyclic group. Whether PIT for skew arithmetic circuits can be solved in polynomial time is open too. Also the Baumslag–Solitar group BS1,2 is linear, and the complexity of the compressed word problem is open. – One might also consider other group-theoretical decision problems for SLPcompressed input words. In Section 4.6.8.1 we briefly mentioned conjugacy. Another important decision problem in group theory is the generalized word problem for a finitely generated group G: given elements g, g1 , . . . , gn , does g ∈ ⟨g1 , . . . , gn ⟩ hold? The generalized word problem for a free group can be solved in polynomial time using Stalling’s folding procedure. Is it possible to compute this folding in polynomial time for SLP-compressed input words? Baumslag’s group and power circuits – What is the complexity of the conjugacy problem for the Baumslag group G1,2 ? It is shown in [104] that this problem can be solved in polynomial time in a strongly generic setting. This means that essentially for all inputs, conjugacy in G1,2 can be decided efficiently. Moreover, it is shown in [104] that the divisibility problem in power circuits reduces to the conjugacy problem for G1,2 , and it is conjectured that the divisibility problem in power circuits cannot be solved by an algorithm whose running time is bounded by an exponent tower of fixed height.
– Is the word problem for the Baumslag group G1,2 in NC? Weiß conjectured in his thesis [496] that this is indeed the case, despite the fact that the equivalence problem for power circuits is P-complete [496].
– What is the complexity of the compressed word problem for the Baumslag group G1,2 ? Before tackling this question, one should first clarify the complexity status of the compressed word problem for the Baumslag–Solitar group BS1,2 .
5 Discrete optimization in groups

5.1 Introduction

5.1.1 Motivation, general set-up, notable results
Many classic discrete optimization problems have been generalized and studied in a general form – in noncommutative groups. For example, DO problems concerning integers (subset sum problem, knapsack problem, etc.) make perfect sense when the group of additive integers is replaced by an arbitrary (noncommutative) group G. The classical lattice problems are about subgroups (integer lattices) of the additive groups ℤ^n or ℚ^n, and their noncommutative versions deal with arbitrary finitely generated subgroups of a group G. The traveling salesman problem and the Steiner tree problem make sense for arbitrary finite subsets of vertices in a given Cayley graph of a noncommutative infinite group (with the natural graph metric). The Post correspondence problem carries over in a straightforward fashion from a free monoid to an arbitrary group. This list of examples can easily be extended, but the point here is that many classical DO problems have natural and interesting noncommutative versions.
The purpose of this research is threefold. Firstly, it extends the area of DO to a new and mostly unknown territory, shedding some light on the nature of the problems and facilitating a deeper understanding of them. In particular, we want to clarify the “algebraic meaning” of these problems in the noncommutative situation. Secondly, these are algorithmic problems which are very interesting from the computational algebra viewpoint. They unify various techniques in group theory which currently seem to be far apart. On the practical level, noncommutative DO problems occur in many everyday computations in algebra, so it is crucial to study their computational complexity and improve the algorithms. Thirdly, we aim to develop a robust collection of basic algebraic problems which would serve as building blocks for complexity theory in noncommutative algebra. Recall that the success of classical complexity theory in the area of NP computation is mostly due to a vast collection of discrete optimization problems which are known to be in P or NP-complete. It took many years, starting from the pioneering works of Cook, Levin and Karp in the 1970s, to gradually accumulate this very concrete knowledge. Nowadays, it is usually a matter of technique to reduce a new algorithmic problem to some known discrete optimization problem. This makes the theory of NP-complete DO problems, indeed, very robust. In computational noncommutative algebra the database of known NP-complete problems is rather small, and the complexity of some very basic problems is unknown. Our goal is to start building such a collection in noncommutative algebra.
The research revealed a remarkable connection between the algebraic and geometric structure of groups and their combinatorial properties. For example, in polycyclic groups the complexity of the subset sum problem turns out to be tightly connected to subgroup distortion. In a wide range of groups, the properties of the knapsack problem are described in terms of semilinear sets, which offers a systematic approach to big power conditions in combinatorial terms.
5.1.2 Brief overview of the problems

5.1.2.1 Stating the knapsack-type problems
In Sections 5.2 and 5.3 we focus mostly on subset sum, knapsack, and submonoid membership problems and their variations (described below) in a given group G generated by a finite or countably infinite set X ⊆ G. We refer to all such problems as knapsack-type problems in groups. Elements in G are given as words over the alphabet X ∪ X^{−1}. We begin with three principal decision problems.
The subset sum problem SSP(G, X): Given g1, . . . , gk, g ∈ G, decide if

g = g1^{ε1} ⋅ ⋅ ⋅ gk^{εk}  (5.1)
for some ε1, . . . , εk ∈ {0, 1}.
Remark 5.1.1. As we show in Section 5.2.2 (Lemma 5.2.4), if X and Y are two finite generating sets for a group G, SSP(G, X) is in P if and only if SSP(G, Y) is in P. However, if at least one of the sets X and Y is infinite, the same is false in general (see Example 5.2.1). With that in mind, we often write SSP(G) if a finite generating set is implied, or if the generating set is fixed explicitly. We also often write SSP instead of SSP(G) when we talk about the problem in general, or when the group G is clear from the context. The same applies to all problems we examine, unless specifically mentioned otherwise.
The knapsack problem KP(G, X): Given g1, . . . , gk, g ∈ G, decide if

g =G g1^{ε1} ⋅ ⋅ ⋅ gk^{εk}  (5.2)
for some nonnegative integers ε1, . . . , εk. There is also a variation of this problem, termed the integer knapsack problem (IKP), when the coefficients εi are arbitrary integers. However, it is easy to see that IKP is P-time reducible to KP for any group G (see Section 5.3.1).
The third problem is equivalent to KP in the classical (abelian) case, but in general it is a completely different problem that is of prime interest in algebra.
The submonoid membership problem SMP(G, X): Given elements g1, . . . , gk, g ∈ G, decide if g belongs to the submonoid generated by g1, . . . , gk in G, i. e., if the following equality holds for some gi1, . . . , gis ∈ {g1, . . . , gk}, s ∈ ℕ:

g = gi1 ⋅ ⋅ ⋅ gis.  (5.3)
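To see the difference between the three problems already over ℤ, consider a toy instance: take X = {1}, g1 = 2, g2 = 3 and g = 4. As an instance of SSP it is negative, since 4 ≠ 2ε1 + 3ε2 for any ε1, ε2 ∈ {0, 1}; as an instance of KP it is positive, with ε1 = 2 and ε2 = 0; and as an instance of SMP it is positive as well, since 4 = 2 + 2 lies in the submonoid generated by 2 and 3. Note also that in a noncommutative group the order of the factors in (5.1) and (5.2) is fixed, an additional constraint that is invisible in the classical commutative setting.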
The restriction of SMP to the case when the set of generators {g1 , . . . , gn } is closed under inversion (so the submonoid is actually a subgroup of G) is a well-known problem in group theory, called the generalized word problem (GWP) or the uniform subgroup membership problem in G. There is a huge bibliography on this subject; we mention some related results in Section 5.1.2.3. As usual in complexity theory, it makes sense to consider the bounded versions of KP and SMP, at least they are always decidable in groups where the word problem is decidable. In this case the problem is to verify if the corresponding equalities (5.2) and (5.3) hold for a given g provided that the number of factors in these equalities is bounded by a natural number m which is given in unary form, i. e., as the word 1m . In particular, the bounded knapsack problem (BKP) for a group G asks to decide, when given g1 , . . . , gk , g ∈ G and 1m ∈ ℕ, if equality (5.2) holds for some εi ∈ {0, 1, . . . , m}. This problem is P-time equivalent to SSP in G (see Section 5.2.2), so it suffices for our purposes to consider only SSP in groups. On the other hand, the bounded SMP in G is very interesting in its own right. The bounded submonoid membership problem BSMP(G, X): Given g1 , . . . , gk , g ∈ G and 1m ∈ ℕ (in unary form), decide if g is equal in G to a product of the form g = gi1 ⋅ ⋅ ⋅ gis , where gi1 , . . . , gis ∈ {g1 , . . . , gk } and s ≤ m. There are also interesting and important search variations of the decision problems above, when the task is to find an actual solution to equation (5.1), (5.2) or (5.3). There are two ways to state search variations of the problems. In the first, the weaker one, one considers only feasible instances of the problem, i. e., we assume that a solution to the instance exists; the second one is stronger, as it requires to solve the decision problem and simultaneously find a solution (if it exists) for a given instance. The former requires only a partial algorithm, while the latter asks for a total one. The weaker version of SSP(G), KP(G) and SMP(G) is always decidable in groups G with decidable word problem, while the stronger one might be undecidable (for instance, SMP in hyperbolic groups). In most cases we solve both the decision and the search variations of the problems above simultaneously, while establishing the time complexity upper bounds for the algorithms. There are also different optimization versions of these problems. However, they are not a primary consideration in the present survey. For details about them, we refer the reader to [363]. The typical groups we are interested in here are free, hyperbolic, abelian, nilpotent, or metabelian. In all these groups, and this is important, the word problem is decidable in P-time. We might also be interested in constructing some exotic examples of groups where the problems mentioned above have unexpected complexity.
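To make the decidability observation above concrete (the weak versions of these problems are decidable whenever the word problem is), here is a minimal sketch of the exhaustive-search procedure behind it, assuming only an oracle word_problem that decides whether a word over X ∪ X^{−1} represents 1 in G; the encoding of words as lists of signed integers and the function names are ours, purely for illustration.

from itertools import product

def invert(word):
    # Formal inverse of a word given as a list of nonzero integers,
    # where i stands for x_i and -i for x_i^{-1}.
    return [-a for a in reversed(word)]

def ssp_brute_force(word_problem, gs, g):
    # Decide SSP(G, X) by exhaustive search over all epsilon in {0,1}^k:
    # word_problem(w) is assumed to return True iff w = 1 in G.  This only
    # shows decidability; the running time is 2^k times the cost of the oracle.
    g_inv = invert(g)
    for eps in product((0, 1), repeat=len(gs)):
        w = [a for gi, e in zip(gs, eps) if e == 1 for a in gi] + g_inv
        if word_problem(w):          # g_1^{e_1} ... g_k^{e_k} g^{-1} = 1 in G
            return eps
    return None

# Example with G = Z and X = {x_1}: a word represents 0 iff its exponent sum is 0.
is_trivial = lambda w: sum(1 if a > 0 else -1 for a in w) == 0
print(ssp_brute_force(is_trivial, [[1, 1], [1, 1, 1]], [1, 1, 1, 1, 1]))  # (1, 1)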
5.1.2.2 What is new?
The general group-theoretic view on subset sum and knapsack problems provides several insights. It is well known that the classical SSP is pseudo-polynomial, i. e., it is in P when the integers are given in unary form, and it is NP-complete if the integers are given in binary form. In the group-theoretic framework the classical case occurs when the group G is the additive group of integers ℤ. In this case the complexity of SSP(ℤ) depends on whether the set X of generators of ℤ is finite or infinite. Indeed, if X = {1}, then we get SSP in ℤ in the unary form, so in this case it is in P (likewise for any other finite generating set). However, if X = {2^n | n ∈ ℕ}, then SSP(ℤ) is P-time equivalent to the classical SSP in binary form, so SSP(ℤ) relative to this X is NP-complete (see Example 5.2.1 for details).
To our surprise the situation is quite different (and much more complex) in noncommutative groups. In the noncommutative setting inputs are usually given as words in a fixed generating set of the group G, i. e., in the unary form (so the size of the word x^{2^{10}} is 2^{10}). It turns out that in the unary form SSP(G) is NP-complete even in some very simple non-abelian groups, such as the metabelian Baumslag–Solitar groups B(1, p), p ≥ 2, or the wreath product ℤ ≀ ℤ. Furthermore, the reasons why SSP(G) is hard for such groups G are completely different. Indeed, SSP is hard for G = B(1, p) because B(1, p) contains exponentially distorted infinite cyclic subgroups; while SSP is hard for ℤ ≀ ℤ since this group (also being finitely generated) contains an infinite direct sum ℤ^ω. On the other hand, SSP(G) and KP(G) in the decision, search, or optimization variations are in P for hyperbolic groups G (relative to arbitrary finite generating sets). Observe that hyperbolic groups may contain highly (say, exponentially) distorted finitely generated subgroups, though such subgroups are not abelian. In this case the main reason why SSP(G) and KP(G) are easy lies in the geometry of hyperbolic groups, which is asymptotically “tree-like.” Another unexpected result which comes from the polynomial-time solution of KP in hyperbolic groups is that there is a hyperbolic group G with a finitely generated subgroup H such that the bounded subgroup membership problem for H is in P, but the standard subgroup membership problem for H is undecidable. This is the first result of this sort in groups. Further yet, there are P-time algorithms solving SSP and SMP (and all their variations) in finitely generated nilpotent groups, though in this case the algorithms exploit the polynomial growth of such groups, not their geometry. It remains to be seen if there is a unifying viewpoint on why SSP, KP or SMP could be hard in a finitely generated group with polynomial-time decidable word problem. However, it is already clear that the nature of the complexity of these problems is much deeper than it reveals itself in the commutative case.

5.1.2.3 Results
The subset sum problem is one of the few very basic NP-complete problems, so it has been studied intensely (see [253]). Beyond the general interest, SSP attracted a lot of attention when Merkle and Hellman designed a new public key cryptosystem [338]
based on the difficulty of some variation of SSP. The system was broken by Shamir in [450], but the interest persists and the ideas survive in numerous new cryptosystems and their variations (see [388]). Generalizations of knapsack-type cryptosystems to noncommutative groups seem quite promising from the viewpoint of post-quantum cryptography, but even the basic facts on the complexity of SSP and KP in groups are lacking.
In Section 5.2.5 we show that SSP(G) is NP-complete in many well-known groups which otherwise are usually viewed as computationally tame, e. g., free metabelian groups of finite rank r ≥ 2, the wreath product ℤ ≀ ℤ and, more generally, wreath products of any two finitely generated infinite abelian groups. These groups are finitely generated, but not finitely presented. Even more surprisingly, SSP(G) is NP-complete in each of the Baumslag–Solitar metabelian groups B(1, p), p ≥ 2, as well as in the metabelian group GB = ⟨a, s, t | [a, a^t] = 1, [s, t] = 1, a^s = a a^t⟩, introduced by Baumslag in [31]. Note that these groups are finitely presented and have very simple algebraic structures. Furthermore, it is not hard to see that SSP(G) is NP-hard if it is NP-hard in some finitely generated subgroup of G. In particular, every group containing subgroups isomorphic to any of the groups mentioned above has NP-hard SSP. Baumslag [32] and Remeslennikov [419] show that every finitely generated metabelian group embeds as a subgroup into a finitely presented metabelian group. This gives a method to construct various finitely presented groups with NP-complete SSP. On the other hand, Theorem 5.2.13 shows that SSP(G) is in P for every finitely generated nilpotent group G. The proof is short, but it is based on the rather deep fact that such groups have polynomial growth. One of the main results of Section 5.2 is Theorem 5.2.19, which states that SSP(G), as well as its search variation, is in P for any hyperbolic group G. As we mentioned above, this also gives a P-time solution to the bounded knapsack problem in hyperbolic groups.
The knapsack problems in groups, especially in their search variations, are related to the algorithmic aspects of the big powers method, which appeared long before any complexity considerations (see, for example, [29]). Recently, the method shaped up as a basic tool in the study of equations in free or hyperbolic groups [72, 259, 260, 394], algebraic geometry over groups [35] and completions and group actions [340, 36, 341], and it became routine in the theory of hyperbolic groups (in the form of various lemmas on quasigeodesics). We prove (Theorem 5.3.3) that KP(G) and its search variation are in P for any hyperbolic group G. To show this we reduce KP(G) in P-time to BKP(G) in a hyperbolic group G. More precisely, we obtain the following result (Theorem 5.3.4), which is of independent interest. For any hyperbolic group G there is a polynomial p(x) such that if an equation g = g1^{ε1} ⋅ ⋅ ⋅ gk^{εk} has a solution ε1, . . . , εk ∈ ℕ, then this equation has a solution with εi bounded by p(n), where n = |g1| + ⋅ ⋅ ⋅ + |gk| + |g| (and it can be found in P-time). On the other hand, decidability of quadratic equations in free groups
228 | 5 Discrete optimization in groups is NP-complete [257]. To solve knapsack problems in hyperbolic groups we developed a new graph technique, which we believe is of independent interest. Namely, given an instance of a problem we construct a finite labeled graph (whose size is polynomial in the size of the instance), such that one can see, just by looking at the graph, whether a solution to the given instance exists in the group, and if so then find it. We would like to mention one more result (Corollary 5.2.20) here which came as a surprise to us: BSMP(G) is P-time decidable for every hyperbolic group G. There are hyperbolic groups where the subgroup membership problem is undecidable even for a fixed finitely generated subgroup (see [420]). It seems this is the first natural example of an undecidable algorithmic problem in groups, whose bounded version is in P. It would be interesting to exploit this direction a bit further. The famous Mikhailova construction [350] shows that GWP is undecidable in the direct product F × F of a free non-abelian group F with itself. We prove in Section 5.2.6 (Theorem 5.2.39) that there is a finitely generated subgroup H in F2 × F2 such that BSMP for this fixed subgroup H in F2 × F2 is NP-complete. It follows that BSMP(G) is NP-hard for any group G containing F2 × F2 as a subgroup. Note that Venkatesan and Rajagopalan prove in [487] that in the multiplicative monoid Mat(n, ℤ) of all n × n integer matrices with n ≥ 20 the BSMP is average-case NP-complete. One of the reasons of this is that Mat(20, ℤ) contains a subgroup F2 × F2 . In another direction observe that fully residually free (or limit) groups, as well as finitely generated groups acting freely on ℤn -trees, have decidable GWP [342, 383, 384], though the time complexity of the decision algorithms is unknown. It would be remarkable if BSMP for such groups was in P. Note that Schupp gave a remarkable construction to solve GWP in P-time in orientable surface groups, as well as in some Coxeter groups [444]. We note in passing that the subgroup and submonoid membership problems in a given group could be quite different. For example, Romanovskii proves in [434] that GWP is decidable in every finitely generated metabelian group, but recent examples by Lohrey and Steinberg show that in a free metabelian non-abelian group there is a finitely generated submonoid with undecidable membership problem [301]. It would be very interesting to see what is the time complexity of BSMP in free metabelian or free solvable groups. Note that Umirbaev shows in [482] that GWP in free solvable groups of class ≥ 3 is undecidable. 5.1.2.4 Classical Post correspondence problem Let A be a finite alphabet with |A| ≥ 2. Denote by A∗ the free monoid with basis A viewed as the set of all words in A with concatenation as the multiplication. Let X be an infinite countable set of variables and X ∗ the corresponding free monoid. The classical Post correspondence problem (PCP) in A∗ : Given a finite set of pairs (g1 , h1 ), . . . , (gn , hn ) of elements of A∗ , determine if there is a nonempty word w(x1 , . . . , xn ) ∈ X ∗ such that w(g1 , . . . , gn ) = w(h1 , . . . , hn ) in A∗ .
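For readers who want to experiment with the classical definition, the following small sketch (ours, using a standard textbook instance) enumerates index sequences of bounded length; since the problem is undecidable in general, a bounded search of this kind is all a naive algorithm can offer.

from itertools import product

def pcp_bounded_search(pairs, max_len):
    # Exhaustive search for a solution of a PCP instance over a free monoid:
    # pairs is a list of (g_i, h_i) with g_i, h_i strings; we look for a
    # nonempty index sequence i_1 ... i_s (s <= max_len) with
    # g_{i_1}...g_{i_s} = h_{i_1}...h_{i_s}.
    n = len(pairs)
    for s in range(1, max_len + 1):
        for idx in product(range(n), repeat=s):
            top = "".join(pairs[i][0] for i in idx)
            bottom = "".join(pairs[i][1] for i in idx)
            if top == bottom:
                return idx
    return None

# The instance (a, baa), (ab, aa), (bba, bb) has the 1-based solution 3 2 3 1.
print(pcp_bounded_search([("a", "baa"), ("ab", "aa"), ("bba", "bb")], 6))
# -> (2, 1, 2, 0), the same solution written 0-based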
Post shows in [414] that this problem is undecidable (see [467] for a simpler proof). Nowadays there are several variations of the Post correspondence problem (PCP) ∗ in A ; the following restricted version is the most typical. PCPn in A∗ : Let n be a fixed positive integer. Given a finite sequence of pairs (g1 , h1 ), . . . , (gm , hm ) in A∗ , where m ≤ n, determine if there is a nonempty word w(x1 , . . . , xm ) ∈ X ∗ such that w(g1 , . . . , gm ) = w(h1 , . . . , hm ) in A∗ . Breaking PCP into the collection of restricted problems PCPn makes the boundary between decidable and undecidable more clear: PCPn in A∗ is decidable for n ≤ 3 and undecidable for n ≥ 7 (see [119, 202, 333]). Another version of interest is the nonhomogeneous GPCP in the free monoid A∗ , in which case an input to GPCP contains a sequence of pairs (g1 , h1 ), . . ., (gn , hn ) as above and also four elements a, b, c, d ∈ A∗ , while the task is to find a word w(x1 , . . . , xn ) ∈ X ∗ such that aw(g1 , . . . , gn )b = cw(h1 , . . . , hn )d in A∗ . This problem is also undecidable in A∗ . There are marked variations of PCPn in A∗ , in which case for each pair (gi , hi ) in the instance the initial letters in gi and hi are not equal. These problems are known to be decidable [203]. We refer to the paper [201] for some recent developments on the Post correspondence problem in free semigroups. Finishing our short survey of known results we would like to mention that PCP is undecidable in a free non-abelian semigroup as well (the same argument as for free monoids). Hence, the semigroup version of PCP is also undecidable in semigroups containing free non-abelian subsemigroups, in particular, in groups containing free non-abelian subgroups, or solvable not virtually nilpotent groups (they contain free non-abelian subsemigroups). In what follows we focus only on the group-theoretic versions of the Post corresponding problems PCP and GPCP in groups, which is different from the original semigroup version since one has to take inversion of elements into account. 5.1.2.5 Post correspondence problem in groups Throughout Section 5.4 we use the following notation: G is an arbitrary fixed group generated by a finite set A, and F(X) is the free group with basis X = {x1 , . . . , xn }. We view elements of F(X) as reduced words in X ∪ X −1 . Sometimes we denote F(X) as F(x1 , . . . , xn ), or simply as Fn . As we mentioned earlier, the group-theoretic version of the Post correspondence problem involves terms (words) with inversion. The Post correspondence problem (PCP) in a group G: Given a finite set of pairs (g1 , h1 ), . . . , (gn , hn ) of elements of G, determine if there is a word w(x1 , . . . , xn ) ∈ F(x1 , . . . , xn ), which is not an identity of G, such that w(g1 , . . . , gn ) = w(h1 , . . . , hn ) in G.
230 | 5 Discrete optimization in groups Several comments are in order here. Recall that an identity on G is a word w(x1 , . . . , xn ) such that w(g1 , . . . , gn ) = 1 in G for any g1 , . . . , gn ∈ G. If the group G does not have nontrivial identities, then the requirement that w is not an identity becomes the same as in the original Post formulation that w is nonempty. Meanwhile, any nontrivial identity w(x1 , . . . , xn ) in G gives a solution to any instance of PCP in G, which is not very interesting. Sometimes we refer to words w which are identities in G as trivial solutions of PCP in G, while the solutions which are not identities in G are termed nontrivial. In this regard PCP(G) asks to find a nontrivial solution to PCP in G. In the sequel by PCP for a group G we always, if not said otherwise, understand the group-theoretic (not the semigroup one) version of PCP stated above. By definition PCP(G) depends on the given generating set of G; however, it is easy to see that PCP(G) for different finite generating sets are polynomial-time equivalent to each other, i. e., each one reduces to the other in polynomial time. Since in all our considerations the generating sets are finite we omit them from the notation and write PCP(G). Similar to the classical case one can define the restricted version PCPn for a group G, in which case the number of pairs in each instance of PCPn is bounded by n, and the nonhomogeneous one, GPCP (or GPCPn ), where there are some constants involved. Since the nonhomogeneous version is of crucial interest for us we state it precisely. The nonhomogeneous Post correspondence problem (GPCP) in a group G: given a finite sequence of pairs (g1 , h1 ), . . . , (gn , hn ) and two pairs (a1 , b1 ) and (a2 , b2 ) of elements of G (called the constants of the instance), determine if there is a word w(x1 , . . . , xn ) ∈ F(x1 , . . . , xn ) such that a1 w(g1 , . . . , gn )b1 = a2 w(h1 , . . . , hn )b2 in G. Note that the requirement on the word w is different for PCP and GPCP; in the former case w must not be an identity of the group G, while in the latter case w can be arbitrary. Given this distinction, it is not immediately clear whether solvability of GPCP in a given group implies solvability of PCP. Hence we prefer to use the term nonhomogeneous PCP over general PCP in groups. Two lemmas are due here. Lemma 5.1.2. For any group G, GPCP(G) is linear-time equivalent to the restriction of GPCP(G) where the constants b1 , b2 , a2 are all equal to 1. Proof. Indeed, in the notation above note that a1 w(g1 , . . . , gn )b1 = a2 w(h1 , . . . , hn )b2 in G if and only if −1 a−1 2 a1 w(g1 , . . . , gn )b1 b2 = w(h1 , . . . , hn ),
so GPCP in G is equivalent to GPCP with a2 = 1, b2 = 1. Moreover, aw(g1, . . . , gn)b = w(h1, . . . , hn) in G if and only if ab ⋅ b^{−1}w(g1, . . . , gn)b = w(h1, . . . , hn),
i. e., abw(g1b , . . . , gnb ) = w(h1 , . . . , hn ). Hence GPCP(G) is linear-time equivalent to GPCP(G) with b1 = a2 = b2 = 1, as claimed. From now on we often assume that in GPCP each instance has the constants b1 , b2 , a2 all equal to 1, in which case we denote a1 by a and term it the constant of the instance. Lemma 5.1.3. For any group G and for any instance (g1 , h1 ), . . . , (gn , hn ), a of GPCP(G) all solutions w to this instance can be described as w = w0 u, where w0 is a particular fixed solution to this instance and u is an arbitrary (perhaps, trivial) solution to PCP(G) for the instance (g1 , h1 ), . . . , (gn , hn ). Proof. Suppose w0 is a particular fixed solution to GPCP(G) for the instance (g1 , h1 ), . . . , (gn , hn ), a, so aw0 (g1 , . . . , gn ) = w0 (h1 , . . . , hn ). If w is an arbitrary solution to the same instance in G, then aw(g1 , . . . , gn ) = w(h1 , . . . , hn ), so w0−1 (g1 , . . . , gn )w(g1 , . . . , gn ) = w0−1 (h1 , . . . , hn )w(h1 , . . . , hn ), and hence u = w0−1 w solves PCP(G) for the instance (g1 , h1 ), . . . , (gn , hn ). Therefore, w = w0 u, as claimed. Lemma 5.1.3 shows that to get all solutions of GPCP in G for a given instance one needs only to find a particular solution of GPCP(G) and all solutions of PCP(G) for the same instance. As usual in discrete optimization there are several other standard variations of PCP problems: bounded, search and optimal. We mention them briefly now and refer to [363] for a thorough discussion of these types of problems in groups. The bounded version of PCP (or GPCP) requires that the word w in question should be of length bounded from above by a given number M. We denote these versions by BPCP(G) or BGPCP(G). The search variation of PCP (or GPCP) asks to find a word w that gives a nontrivial solution to a given instance of the problem (if such a solution exists). The optimization version of PCP (or GPCP) is a variation of the search problem, when one is asked to find a solution that satisfies some “optimal” conditions. In our case, if not said otherwise, the optimal condition is to find a shortest possible word w which is a solution to the given instance of the problem. 5.1.3 Preliminaries: algorithmic set-up To keep the exposition self-contained, we say a few words on how we present the data, models of computations, size functions, etc. (we refer to the book [366] for more details). Our model of computation is the random access machine (RAM).
232 | 5 Discrete optimization in groups To make the statements of the problems (from Section 5.1.2) more precise consider the following. If a generating set X = {x1 , . . . , xn } of a group G is finite, then the size of the word g = x1 . . . xk is its length |g| = k and the size of a tuple like g1 , . . . , gk , g from G is the total sum of the lengths |g1 | + ⋅ ⋅ ⋅ + |gk | + |g|. If the generating set X of G is infinite, then the size of a letter x ∈ X is not necessarily equal to 1; it depends on how we represent elements of X. In what follows we always assume that there is an efficient injective function ν : X → {0, 1}∗ which encodes the elements in X such that for every u ∈ {0, 1}∗ one can algorithmically recognize if u ∈ ν(X). In this case for x ∈ X we define size(x) = ν(x) and for a word w = x1 . . . xn with xi ∈ X we define size(w) = size(x1 ) + ⋅ ⋅ ⋅ + size(xn ). Similar to the above the size of a tuple (g1 , . . . , gk , g) is size(g1 , . . . , gk , g) = size(g1 ) + ⋅ ⋅ ⋅ + size(gk ) + size(g). One can go a bit further and identify elements x ∈ X with their images ν(x) ∈ {0, 1}∗ , and words w = x1 . . . xn ∈ X ∗ with the words ν(x1 ) . . . ν(xn ) ∈ {0, 1}∗ . This gives a homomorphism of monoids ν∗ : X ∗ → {0, 1}∗ . If in addition ν is such that for any x, y ∈ X the word ν(x) is not a prefix of ν(y) (this is easy to arrange), then: – ν∗ is injective, – ν∗ (X ∗ ) and ν∗ (X) are algorithmically recognizable in {0, 1}∗ , and – for every word v ∈ ν∗ (X ∗ ) one can find the word w ∈ X ∗ such that ν∗ (w) = v. From now on we always assume that a generating set comes equipped with a function ν, termed encoding, satisfying all the properties mentioned above. In fact, almost always all our generating sets X are finite, and in those rare occasions when X is infinite we describe ν precisely. In general, we view decision problems as pairs (I, D), where I is the space of instances of the problem equipped with a size function size : I → ℕ and D ⊆ I is a set of affirmative (positive) instances of the problem. Of course, the set I should be constructible and the size function should be computable. In all our examples the set I consists of either tuples (like (g1 , . . . , gk , g) in the case of KP) of words in the alphabet ΣX for some (perhaps, infinite) set of generators X of a group G, or, e. g., in the case of BKP or BSMP, tuples of the type (g1 , . . . , gk , g, 1m ) where 1m is a natural number m given in unary form. The problem (I, D) is decidable if there is an algorithm 𝒜 that for any x ∈ I decides whether x is in D (𝒜 answers “yes” or “no”). The problem is in class P if there is a decision algorithm 𝒜 with polynomial-time function with respect to the size of the instances in I, i. e., there is a polynomial p(n) such that for any x ∈ I the algorithm 𝒜 starts on x, halts in at most p(size(x)) steps and gives a correct answer “yes”
or “no.” Similarly, we define problems decidable in linear or quadratic time, nondeterministic polynomial-time NP and other complexity classes. A brief recounting on the latter is given in Section 5.1.4. Recall that a decision problem (I1 , D1 ) is P-time reducible to a problem (I2 , D2 ) if there is a P-time computable function f : I1 → I2 such that for any u ∈ I1 one has u ∈ D1 ⇐⇒ f (u) ∈ D2 . Such reductions are usually called either many-to-one P-time reductions or Karp reductions. We say that two problems are Karp-equivalent if each of them P-time Karp reduces to the other. Aside from many-to-one reductions, we use the so-called Cook reductions. That is, we say that a decision problem (I1 , D1 ) is P-time Cook reducible to a problem (I2 , D2 ) if there is an algorithm that solves problem (I1 , D1 ) using a polynomial number of calls to a subroutine for problem (I2 , D2 ), and polynomial time outside of those subroutine calls. Correspondingly, we say that two problems are P-time Cook-equivalent if each of them P-time Cook reduces to the other. In this survey, whenever we are primarily interested in establishing whether certain problems are P-time decidable or NP-complete, we make no special effort to use one of the two types of reduction over the other, and often simply say “P-time reduction” for either Karp or Cook P-time reductions. Now we define an optimization problem as a tuple (I, J, F, μ, extr), where I is the set of instances, J is a set of solutions, F : I → P(J) is a function that for each instance u ∈ I associates a subset F(u) ⊆ J of all feasible solutions for an instance u, μ(u, v) is a nonnegative real function which for u ∈ I, v ∈ F(u) measures the cost of solution v for an instance u and extr is either min or max. This optimization problem, given u ∈ I, asks to find v ∈ F(u) such that μ(u, v) = extr{μ(u, v ) | v ∈ F(u)}. Given two optimization problems Pi = (Ii , Ji , Fi , μi , extri ), i = 1, 2, we say that P1 is P-time reducible to P2 if there are P-time computable functions f : I1 → I2 , fu : F1 (u) → F2 (f (u)), u ∈ I1 , such that v ∈ F1 (u) ⇐⇒ fu (v) ∈ F2 (f (u)) and μ1 (u, v) = extr1 {μ1 (u, v ) | v ∈ F1 (u)} ⇐⇒ μ2 (f (u), fu (v)) = extr2 {μ2 (f (u), v ) | v ∈ F2 (f (u))}. In our consequent considerations, the functions fu are apparent from the set-up and we do not mention them in our arguments. We say that two optimization problem are P-time equivalent if each of them P-time reduces to the other.
5.1.4 Preliminaries: complexity classes Most results in this survey deal with complexity classes P and NP. We say that a decision problem belongs to the class P (polynomial time) if there is a deterministic Turing machine that solves it in time polynomial in size of the input. Respectively, a decision problem belongs to the class NP if there is a nondeterministic Turing machine that solves it in time polynomial in size of the input, or equivalently, if there is a polynomial-time verification algorithm (certificate) for an answer. Similarly related are complexity classes Ł and NL. A decision problem belongs to the class Ł (logarithmic space, logspace) (respectively, NL) if it can be solved by
a deterministic (respectively, nondeterministic) Turing machine with read-only input tape, write-only output tape and work tape of size logarithmic in the size of the input.

5.1.4.1 Semilinearity
Definition 5.1.4. A set of vectors A ⊆ ℕ^k is linear if there exist vectors v0, . . . , vn ∈ ℕ^k such that A = {v0 + λ1 v1 + ⋅ ⋅ ⋅ + λn vn | λ1, . . . , λn ∈ ℕ}. The tuple of vectors (v0, . . . , vn) is a linear representation of A. A set A ⊆ ℕ^k is semilinear if it is a finite union of linear sets A1, . . . , Am. A semilinear representation of A is a list of linear representations for the linear sets A1, . . . , Am.
It is well known that the semilinear subsets of ℕ^k are exactly the sets definable in Presburger arithmetic. These are those sets that can be defined with a first-order formula φ(x1, . . . , xk) over the structure (ℕ, 0, +, ≤) (see [170]). Moreover, the transformations between such a first-order formula and an equivalent semilinear representation are effective. In particular, the semilinear sets are effectively closed under Boolean operations.

5.1.4.2 Zero-one equation problem
Recall that a vector v ∈ ℤ^n is called a zero-one vector if each entry in v is either 0 or 1. Similarly, a square matrix A ∈ Mat(n, ℤ) is called a zero-one matrix if each entry in A is either 0 or 1. Denote by 1_n the vector (1, . . . , 1) ∈ ℤ^n. The following problem is NP-complete (see [91]).
Zero-one equation problem (ZOE): Given a zero-one matrix A ∈ Mat(n, ℤ), decide if there exists a zero-one vector x ∈ ℤ^n satisfying A ⋅ x = 1_n.

5.1.4.3 LogCFL
LogCFL is the complexity class that contains all decision problems that can be reduced in logarithmic space to a context-free language. This class is situated between NL and AC^1, in the sense that it contains the former and is contained in the latter. It has several alternative characterizations:
– logspace bounded alternating Turing machines with polynomial proof tree size;
– semiunbounded Boolean circuits of polynomial size and logarithmic depth;
– logspace bounded auxiliary pushdown automata with polynomial running time.
Here we use the last characterization. An auxiliary pushdown automaton (AuxPDA) is a nondeterministic pushdown automaton with a two-way input tape and an additional work tape.
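Returning to the zero-one equation problem above, the following minimal sketch (ours) shows the brute-force search that witnesses membership of ZOE in NP; the exponential loop over all zero-one vectors is exactly what NP-completeness suggests cannot be avoided in general.

from itertools import product

def zoe_brute_force(A):
    # Given a zero-one matrix A (a list of rows), search for a zero-one vector x
    # with A·x = (1, ..., 1); return such an x or None.
    rows, cols = len(A), len(A[0])
    for x in product((0, 1), repeat=cols):
        if all(sum(A[i][j] * x[j] for j in range(cols)) == 1 for i in range(rows)):
            return x
    return None

A = [[1, 0, 0],
     [0, 1, 0],
     [1, 0, 1]]
print(zoe_brute_force(A))   # -> (1, 1, 0)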
5.1.5 Preliminaries: nilpotent groups In this section we give the necessary background on nilpotent groups. Recall that a group G is called nilpotent if it possesses a central series, i. e., a normal series G = H1 ▷ H2 ▷ ⋅ ⋅ ⋅ ▷ Hs ▷ Hs+1 = 1,
(5.4)
such that [G, Hi] ≤ Hi+1 for all i = 1, . . . , s. If s is the lowest possible for a given group G, we say that G is nilpotent of class c = s. A simple example of a nilpotent group is the Heisenberg group H3(ℤ) = ⟨x, y, z | [x, y] = z, [x, z] = [y, z] = 1⟩, isomorphic to the group UT3(ℤ) of unitriangular matrices with integer entries. If G is finitely generated, then so are the abelian quotients Hi/Hi+1, 1 ≤ i ≤ s. Let ai1, . . . , aimi be a standard basis of Hi/Hi+1, i. e., a generating set in which Hi/Hi+1 has presentation ⟨ai1, . . . , aimi | aij^{eij}, j ∈ 𝒯i⟩ in the class of abelian groups, where 𝒯i ⊆ {1, . . . , mi} and eij ∈ ℤ>0. Formally put eij = ∞ for j ∉ 𝒯i. Note that A = {a11, a12, . . . , asms}
is a polycyclic generating set for G, and we call A a Malcev basis associated with the central series (5.4). For convenience, we will also use a simplified notation, in which the generators aij and exponents eij are renumbered by replacing each subscript ij with j + ∑_{ℓ<i} mℓ.
... > 2|X| guarantees that words ui di repeat, yielding that in G one has

g1^{k1} = ui di f1^{k2} (ui di)^{−1},  (5.7)
i. e., that g1 , f1 are commensurable (for example, see Figure 5.3). 5.1.6.2 Logarithmic depth of van Kampen diagrams in hyperbolic groups Let G = ⟨X | R⟩ be a group presentation. For a van Kampen diagram D over ⟨X | R⟩ one can define a dual graph Dual(D) = (V, E), where the vertex set V is the set of all cells of D (including the outer cell) and the edge set E is the set of all pairs of cells (c1 , c2 ) in D sharing at least one vertex. The maximal distance in Dual(D) from the outer cell to other cells is called the depth of D, denoted by δ(D). By the depth δ(w) of a trivial in G word w we understand the minimal depth of a van Kampen diagram with the boundary label w (see [112, 368, 366]). Proposition 5.1.13 ([112]). Let G be a hyperbolic group given by a finite presentation G = ⟨X | R⟩. Then for any word w = w(X) with w =G 1 one has δ(w) = O(log2 |w|). 5.1.7 Preliminaries: graph groups and virtually special groups Let (A, I) be a finite simple graph, i. e., the edge relation (also called the independence relation) I ⊆ A × A which is irreflexive and symmetric. We also refer to (A, I) as an independence alphabet. With (A, I) we can associate a group G(A, I) = ⟨A | ab = ba for every (a, b) ∈ I⟩, called the graph group of (A, I).
A group G is called virtually special if it is a finite extension of a subgroup of a graph group. The following are examples of virtually special groups:
– Coxeter groups [200],
– one-relator groups with torsion [498],
– fully residually free groups [498],
– fundamental groups of hyperbolic 3-manifolds [2].
5.1.8 Preliminaries: automaton groups
Consider a finite Mealy automaton 𝒜 = (Q, Σ), where:
– Q is a finite set of states;
– Σ is a finite input/output alphabet;
– φ : Q × Σ → Q is a transition function;
– ψ : Q × Σ → Σ is an output function.
If ψ(q, ⋅) : Σ → Σ is bijective for every q ∈ Q, then 𝒜 is invertible. The automaton 𝒜 defines a transformation of Σ∗, which extends to a transformation of Σ^ω, as follows. Given w = a1 a2 . . . ∈ Σ∗ ∪ Σ^ω, there is a unique path in 𝒜 starting at the provided initial state and with input labels w. The image of w under the transformation is the output label along that same path.
Definition 5.1.14. A map f : Σ∗ → Σ∗ is automatic if f is defined by an automaton as above.
One may forget the initial state of 𝒜, and consider the set of all transformations corresponding to all choices of the initial state of 𝒜; the semigroup of the automaton M(𝒜) is the semigroup generated by all these transformations. If 𝒜 is invertible, then all the transformations are invertible and we get a group called the group of the automaton 𝒜, denoted by G(𝒜).
Proposition 5.1.15 ([171, Proposition 3]). Consider Mealy automata 𝒜1, 𝒜2. There exists a Mealy automaton, denoted by 𝒜1 × 𝒜2, satisfying G(𝒜1 × 𝒜2) ≃ G(𝒜1) × G(𝒜2).

5.1.9 Preliminaries: wreath products
Let G and H be groups. Consider the direct sum

K = ⨁_{h∈H} Gh,
where Gh is a copy of G. We view K as the set G(H) of all functions f : H → G such that Supp(f ) = {h ∈ H | f (h) ≠ 1} is finite, together with pointwise multiplication as
the group operation. The set Supp(f ) ⊆ H is called the support of f . The group H has a natural left action on G(H) given by hf (a) = f (h−1 a), where f ∈ G(H) and h, a ∈ H. The semidirect product G(H) ⋊ H is called the wreath product of G and H and is denoted by G ≀ H. Thus, as a set G ≀ H = {(f , h) | h ∈ H, f ∈ G(H) }, and the product of (f1 , h1 ), (f2 , h2 ) ∈ G ≀ H is defined by (f1 , h1 ) ⋅ (f2 , h2 ) = (f , h1 h2 ), where f (a) = f1 (a)f2 (h−1 1 a) for every a ∈ H. 5.1.10 Preliminaries: polycyclic groups, metabelian groups, Fox derivatives Let Fn be a free group with basis {f1 , . . . , fn }, n ≥ 2, viewed as a set of reduced words in the alphabet {f1 , . . . , fn }±1 , where the multiplication is given by concatenation and free reduction. Denote by ℤFn the integral group ring over Fn . The following notation is used throughout the whole chapter. For elements a and b in a group G we put ba = aba−1 and [a, b] = aba−1 b−1 . For a subset A ⊆ G we denote by ⟨A⟩, as well as gp(A), the subgroup generated by A in G, and by ⟨A⟩G , as well as gp(A)G , the normal subgroup in G generated by A. For two subsets A, B ⊆ G we set [A, B] = ⟨[a, b] : a ∈ A, b ∈ B⟩. Then G = [G, G] is the derived subgroup of G, and G = [G , G ] is the second derived subgroup of G. By ζn (G) we denote the n-th term of the upper central series of G, in particular, ζ1 (G) is the center of G. Let A be a normal abelian subgroup of a group G. Then A can be viewed as a module over the group ring ℤG where the action of an element α = ∑ti=1 mi gi ∈ ℤG (mi ∈ ℤ, gi ∈ G) on an element a ∈ A, written as aα , is defined by aα = ∏ti=1 (ami )gi = ∏ti=1 (agi )mi . Since a ∈ A acts on A identically the action of ℤG on A induces an action of ℤG/A on A, so A can be viewed as a module over ℤG/A as well. 5.1.10.1 Free Fox derivatives In [139, 140, 141, 82, 142] Fox gave a thorough account of the differential calculus in a free group ring ℤFn . We refer to [195, 139, 140, 141, 82, 142] for details, but mention briefly here some definitions and facts that we use in Section 5.1.10.2 below and in Section 5.4. The partial Fox derivatives are mappings d/dfj : ℤFn → ℤFn ,
1 ≤ j ≤ n,
(5.8)
satisfying the following conditions whenever s, t ∈ ℤ and u, v ∈ Fn: dfi/dfj = δij
(Kronecker delta);
d(su + tv)/dfj = s ⋅ du/dfj + t ⋅ dv/dfj ;
(5.9)
d(uv)/dfj = du/dfj + u ⋅ dv/dfj .
These derivatives are completely determined by the following equality for an element β ∈ ℤFn : β = βε + dβ/df1 ⋅ (f1 − 1) + ⋅ ⋅ ⋅ + dβ/dfn ⋅ (fn − 1),
(5.10)
where ε : ℤFn → ℤ is the specialization homomorphism defined by fj ε = 1, j = 1, . . . , n. It follows from the definitions that dm/dfj = 0 for any m ∈ ℤ, and that dg^{−1}/dfj = −g^{−1} ⋅ dg/dfj for any g ∈ Fn, j = 1, . . . , n. The following formula for any element u ∈ Fn is called the main identity for the Fox derivatives:

∑_{i=1}^{n} du/dfi ⋅ (fi − 1) = u − 1.  (5.11)
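The rules (5.9)–(5.11) are easy to implement mechanically. The following short sketch (ours, with illustrative function names) computes dw/dfj for a word w in Fn, storing elements of ℤFn as dictionaries from freely reduced words to integer coefficients, and then verifies the main identity (5.11).

def reduce_word(w):
    # Free reduction of a word given as a tuple of nonzero ints (i = f_i, -i = f_i^{-1}).
    out = []
    for a in w:
        if out and out[-1] == -a:
            out.pop()
        else:
            out.append(a)
    return tuple(out)

def fox_derivative(w, j):
    # dw/df_j as an element of ZF_n, computed letter by letter via
    # d(uv)/df_j = du/df_j + u * dv/df_j and d(f_j^{-1})/df_j = -f_j^{-1}.
    deriv, prefix = {}, ()
    for a in w:
        if a == j:
            deriv[prefix] = deriv.get(prefix, 0) + 1
        elif a == -j:
            p = reduce_word(prefix + (-j,))
            deriv[p] = deriv.get(p, 0) - 1
        prefix = reduce_word(prefix + (a,))
    return {g: c for g, c in deriv.items() if c != 0}

def check_main_identity(w, n):
    # Verify sum_i dw/df_i * (f_i - 1) = w - 1, i.e., formula (5.11).
    total = {}
    for i in range(1, n + 1):
        for g, c in fox_derivative(w, i).items():
            gi = reduce_word(g + (i,))
            total[gi] = total.get(gi, 0) + c
            total[g] = total.get(g, 0) - c
    rhs = {}
    rhs[reduce_word(w)] = rhs.get(reduce_word(w), 0) + 1
    rhs[()] = rhs.get((), 0) - 1
    clean = lambda d: {g: c for g, c in d.items() if c != 0}
    return clean(total) == clean(rhs)

w = (1, 2, -1, -2)                 # the commutator [f_1, f_2]
print(fox_derivative(w, 1))        # {(): 1, (1, 2, -1): -1}, i.e., 1 - f_1 f_2 f_1^{-1}
print(check_main_identity(w, 2))   # True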
Let Ḡ = Fn /R be the quotient of Fn by a normal subgroup R. Denote by fī the image of fi in Ḡ under the canonical epimorphism. Then Ḡ is generated by {f1̄ , . . . , fn̄ }. For any word w = w(f1̄ , . . . , fn̄ ) in these generators one may define formal derivatives dw/dfī , i = 1, . . . , n as the images of the derivatives dw(f1 , . . . , fn )/dfi under the induced group ring epimorphism ℤFn → ℤG.̄ Sometimes, slightly abusing notation we use the notation fi for fī as well, which is always clear from the context. In any case the property (5.11) remains valid in ℤG.̄ 5.1.10.2 Metabelian groups Consider a free metabelian group Mn = Fn /Fn and a free abelian group An = Mn /Mn with bases {f1 , . . . , fn } and {a1 , . . . , an }, respectively. Here we assume that these bases are the images of the basis {f1 , . . . , fn } of Fn under the natural epimorphisms Fn → Mn and Fn → An , so again with an abuse of notation we denote the elements fi Fn by fi . The natural epimorphisms π : Mn → An , π : Fn → An and π : Fn → Mn extend to ring epimorphisms π : ℤMn → ℤAn , π : ℤFn → ℤAn and π : ℤFn → ℤMn . The kernels of π and π are the ideals of ℤFn generated by the elements u − 1 with u ∈ Fn and u ∈ Fn , respectively. We say that a group M is presented in the variety 𝒜2 of metabelian groups by generators f1 , . . . , fn and defining relations r1 = 1, . . . , rm = 1, and write M = ⟨f1 , . . . , fn | r1 , . . . , rm ; 𝒜2 ⟩,
(5.12)
where each ri is a word in the generators if and only if M is the quotient of a free metabelian group Mn with free generators f1 , . . . , fn by the normal subgroup R generated by r1 , . . . , rm . In this event we identify the elements fi with their images fi R in M. By Hall’s theorem [204] every finitely generated metabelian group M satisfies the ascending chain condition maxn on normal subgroups, so every finitely generated metabelian group has a finite presentation of the form (5.12). Let M be an arbitrary finitely generated metabelian group. The derived subgroup M of M is a module over a finitely generated commutative ring ℤA, where A = M/M , generated by finitely many elements [fi , fj ], i > j, for i, j = 1, . . . , n. Since the ring ℤA is Noetherian, M is finitely presented as ℤA-module. This gives a finite description of M , even in the case when M is not finitely generated as a group. Note that by [33, Theorem 3.1] there is an algorithm which, given a finite presentation (in 𝒜2 ) of a metabelian group M, finds a finite presentation of the ℤA-module M . Similarly, every abelian normal subgroup N ≥ M can also be viewed as a module over a finitely generated commutative ring ℤB, where B = M/N. This module is finitely presented and such a presentation can be found effectively [33]. The partial Fox derivatives d/dxi : Fn → ℤFn , i = 1, . . . , n, induce via the natural projections free derivatives d/dfi : Mn → ℤAn , i = 1, . . . , n. Equality (5.11) gives rise to the following one for every element g ∈ Mn : n
∑_{i=1}^{n} dg/dfi ⋅ (ai − 1) = gπ − 1.
(5.13)
Then for any u ∈ Mn and β ∈ ℤAn one has duβ /dfi = β ⋅ du/dfi ,
i = 1, . . . , n.
(5.14)
Let g = g(f1, . . . , fn) be a word in the generators fi, i = 1, . . . , n, of Mn. For any tuple u = (u1, . . . , un) ∈ (Mn′)^n we define an endomorphism ξ of Mn by its images on the free generators

fi ξ = ui fi,  i = 1, . . . , n.  (5.15)

Then

gξ = g(u1 f1, . . . , un fn) = u1^{dg/df1} ⋅ ⋅ ⋅ un^{dg/dfn} ⋅ g.  (5.16)
Now we slightly generalize the situation above. Let M = gp(f1 , . . . , fn ) = Mn /R be a finitely generated metabelian group given as a quotient of Mn . Let N ≥ M be an abelian normal subgroup of M containing the derived subgroup M . Denote by α : An → B = M/N the standard epimorphism. Let d/dfi = (d/dfi )α, i = 1, . . . , n, be the induced partial derivatives with values in ℤB. These derivatives can be applied to every
word g = g(f1, . . . , fn) viewed as an element of Mn so that the following analog of (5.13) holds:

∑_{i=1}^{n} dg/dfi ⋅ (ai − 1) = g̃ − 1,  (5.17)
where g̃ = (gπ)α. Thus one can view N as a module over ℤM, as well as over ℤB. For every u ∈ N and β ∈ ℤM one has duβ /dfi = β̃ ⋅ du/dfi ,
i = 1, . . . , n,
(5.18)
where β̃ ∈ ℤB is the natural projection of β. Let ξ ∈ End(M) be an endomorphism which is the identity mod N. Then ξ is uniquely defined by the images fi ξ = ui fi ,
i = 1, . . . , n,
(5.19)
where ui ∈ N, i = 1, . . . , n. Then for every element g ∈ M presented by a word g = g(f1, . . . , fn) in the generators its image under ξ can be written as

gξ = u1^{dg/df1} ⋅ ⋅ ⋅ un^{dg/dfn} ⋅ g.  (5.20)
Moreover, in general, if G = ⟨f1, . . . , fn⟩ is an arbitrary finitely generated group and N is an abelian normal subgroup of G, then for an arbitrary word g = g(f1, . . . , fn) in the generators of G and u1, . . . , un ∈ N, the following equality holds (by induction on the length of the word g(f1, . . . , fn)):

g(u1 f1, . . . , un fn) = u1^{dg/df1} ⋅ ⋅ ⋅ un^{dg/dfn} ⋅ g.  (5.21)
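As a quick sanity check of (5.21), take g = f1 f2. The Fox rules give dg/df1 = 1 and dg/df2 = f1, so the right-hand side of (5.21) reads u1^{1} ⋅ u2^{f1} ⋅ f1 f2 = u1 (f1 u2 f1^{−1}) f1 f2 = u1 f1 u2 f2, which is exactly g(u1 f1, u2 f2).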
In Sections 5.4.5 and 5.4.6 we use some known results on algorithmic problems in abelian and finitely generated metabelian groups; for the former we refer to [146, 446] and for the latter to [33, 285]. However, for one particular result that we need we could not find any reference, so we just give a sketch of a proof below. Lemma 5.1.16. There is an algorithm that for a given homomorphism ψ : M̄ → M of two finitely generated metabelian groups (given by its images on a finite set of generators of ̄ finds ker(ψ), i. e., it finds a finite set of generators of ker(ψ) as a normal subgroup M) of M.̄ ̄ then the subgroup ψ(M) ̄ is generated in M Proof. If X is a finite set of generators of M, 2 by a finite set ψ(X). By [33, Theorem 3.3] one can find a finite 𝒜 -presentation of the ̄ = ⟨ψ(X)⟩. Replacing M by ψ(M), ̄ if necessary, we may assume that ψ subgroup ψ(M) is onto. Note that for finitely generated abelian groups the problem of finding the kernel of an epimorphism is decidable (a linear algebra argument). Similarly, this problem
5.1 Introduction | 245
is decidable for finitely generated modules over finitely generated commutative rings. This allows one to find the kernels of the following homomorphisms, which are induced by ψ: ̄ M̄ → M/M ψab : M/
and ψ : M̄ → M ,
̄ M̄ ). Let where M̄ and M are viewed as finitely generated modules over ℤ(M/ ̄ ̄ ̄ ker(ψab ) = ⟨g1 M , . . . , gk M ⟩ for some g1 , . . . , gk ∈ M. Since gi ψ ∈ M and ψ, and hence ψ , is onto, there exist elements ui ∈ M̄ such that gi ψ = ui ψ for i = 1, . . . , k. Hence g̃i = u−1 i gi ∈ ker(ψ) for every such i. Obviously, ker(ψ) ≤ ⟨g1 , . . . , gk , M̄ ⟩ = ⟨g1̃ , . . . , gk̃ , M̄ ⟩. Hence, ker(ψ) = ⟨g1̃ , . . . , gk̃ , ker(ψ )⟩ = ⟨g1̃ , . . . , gk̃ ⟩ ker(ψ ) and the result follows. Now we slightly generalize the result above. Lemma 5.1.17. There is an algorithm that for a given epimorphism ψ : M̄ → M of two finitely generated metabelian groups (given by its images on a finite set of gener̄ and a subgroup K ≤ M which is a product of a finitely generated subgroup ators of M) ⟨a1 , . . . , am ⟩ and a normal subgroup ⟨b1 , . . . , bs ⟩M finds the full preimage K̄ = ψ−1 (K) ≤ M̄ of K in M̄ in the form ̄ K̄ = ⟨ā 1 , . . . , ā m ⟩ ⋅ ⟨b̄ 1 , . . . , b̄ s ⟩M ⋅ ker ψ,
where ā i ψ = ai and b̄ j ψ = bj for i = 1, . . . , m, j = 1, . . . , s. Proof. Let M̄ and M be given by some finite 𝒜2 -presentations, say, M̄ = ⟨X̄ | R̄ : 𝒜2 ⟩ and M = ⟨X | R : 𝒜2 ⟩. ̄ In the notation above K̄ = ⟨ā 1 , . . . , ā m ⟩ ⋅ ⟨b̄ 1 , . . . , b̄ s ⟩M ⋅ ker ψ for any ā , b̄ j such that ā i ψ = ai and b̄ j ψ = bj for i = 1, . . . , m, j = 1, . . . , s. Hence the task is to find some ̄ To do this we first find an 𝒜2 -presentation ψ-preimages of the elements ai , bj in M. ̄ of M and also an isomorphism θ : M → ⟨ψ(X)⟩ ̄ [33, Theoof the subgroup ⟨ψ(X)⟩ rem 3.3]. Applying θ to the elements ai , bj one can find their representations as words ̄ Now, the same words in the generators X give some in the “new” generators ψ(X). ̄ as required. ψ-preimages of the elements ai , bj in M, 5.1.10.3 Polycyclic groups Recall that a group G is polycyclic if it can be obtained from the trivial group by a finite series of cyclic extensions.
246 | 5 Discrete optimization in groups Below we mention several principal facts on polycyclic groups that we use throughout Section 5.4.6. Every polycyclic group G is finitely presented [204], residually finite [210], conjugacy separable, i. e., if two elements of G are not conjugate in G, then they are not conjugate in some finite quotient of G [137, 419], and virtually nilpotent-by-abelian, i. e., G contains a finite-index subgroup H such that the derived subgroup H is nilpotent (see [250, 285, 445]). Furthermore, every subgroup H of G is polycyclic (and hence finitely generated) and closed in the profinite topology [330, 428] (and hence separable), i. e., if g ∈ G does not belong to H, then in some finite quotient of G the image of g does not belong to the image of H. It follows from the facts above that the word conjugacy and subgroup membership problems are decidable in G. We refer to [445, 285] for general results on polycyclic groups, and to the paper [34] for a thorough account of the algorithmic theory of polycyclic (even polycyclic-by-finite) groups. More recent expositions of algorithmic results on polycyclic groups can be found in books [285, 216]. It is worthwhile to mention several algorithms (see the references above) on polycyclic or polycyclic-by-finite groups that we use in Section 5.4.6. In particular, there are algorithms that given finite subsets X and Y of a polycyclic group G find a finite presentation of the subgroup ⟨X⟩ of G generated by X, the intersection of the subgroups ⟨X⟩ ∪ ⟨Y⟩ and the centralizer C⟨Y⟩ (X). The following result is well known, and since we could not locate a proper reference we formulate it as a lemma below. Lemma 5.1.18. There is an algorithm which, given polycyclic groups G and H, a homomorphism ψ : G → H and a finitely generated subgroup K ≤ H, finds the full preimage ψ−1 (K) of K in G. Proof. Similar to Lemma 5.1.17. As we have mentioned above the twisted conjugacy problem in groups is closely related to the general Post correspondence problem, so it is of particular interest for us here. It follows from [131] that the twisted conjugacy problem is decidable in a polycyclic group G for any automorphism ξ ∈ Aut(G). Indeed, one can construct an extension Gξ of G by ⟨ξ ⟩ which is a polycyclic group in such a way that the twisted conjugacy problem for ξ in G reduces to the standard conjugacy problem in Gξ , so it is decidable. Moreover, the twisted conjugacy problem for an arbitrary endomorphism of G is decidable in G [430].
5.2 Subset sum problem and related problems 5.2.1 Definition Let G be a group generated by a finite set X = {x1 , . . . , xn } ⊆ G. Elements in G can be expressed as products of the generators in X and their inverses. We consider the following combinatorial problems.
5.2 Subset sum problem and related problems | 247
The subset sum problem SSP(G, X): Given g1 , . . . , gk , g ∈ G, decide if ε
ε
g = g1 1 ⋅ ⋅ ⋅ gkk
(5.22)
for some ε1 , . . . , εk ∈ {0, 1}. The bounded submonoid membership problem BSMP(G, X): Given g1 , . . . , gk , g ∈ G and 1m ∈ ℕ (in unary form), decide if g is equal in G to a product of the form g = gi1 ⋅ ⋅ ⋅ gis , where gi1 , . . . , gis ∈ {g1 , . . . , gk } and s ≤ m. The bounded knapsack problem BKP(G): Given g1 , . . . , gk , g ∈ G and 1m ∈ ℕ (in unary form), decide if ε
ε
g = g1 1 ⋅ ⋅ ⋅ gkk
(5.23)
for some integers ε1 , . . . , εk such that 0 ≤ εj ≤ m for all j = 1, 2, . . . , k. One may note that BKP(G, X) is P-time equivalent to SSP(G, X). As we see below in Proposition 5.2.5 [363, Proposition 2.5], the computational properties of the above problems do not depend on the choice of a finite generating set X, so we omit X from the notation and simply write SSP(G), etc. As we will see in Section 5.2.2, the above three problems are particular cases of the following problem. The acyclic graph word problem AGWP(G, X): Given an acyclic directed graph Γ labeled by letters in X ∪ X −1 ∪ {ε} with two marked vertices, α and ω, decide whether there is an oriented path in Γ from α to ω labeled by a word w such that w = 1 in G. The same problem is denoted as ARatMP(G) in [275]. An immediate observation is that this problem, like all the problems introduced in this section, is a special case of the uniform membership problem in a rational subset of G. Let graph Γ have n vertices and m edges. Define size(Γ) to be m + n. Let the total word length of labels of edges of Γ be l. For a given instance of AGWP(G, X), its size is the value m + n + l. With a slight abuse of terminology, we will also sometimes use labels that are words rather than letters in X ∪ X −1 ∪ {ε}. Such an abuse of terminology results in distorting the size of an instance of AGWP(G, X) by a factor of at most 3. Note that by a standard argument (see Proposition 5.2.5), if X1 and X2 are two finite generating sets for a group G, the problems AGWP(G, X1 ) and AGWP(G, X2 ) are P-time (in fact, linear-time) equivalent. In this sense, the complexity of AGWP in a group G does not depend on the choice of a finite generating set. In the sequel we write AGWP(G) instead of AGWP(G, X), implying an arbitrary finite generating set.
248 | 5 Discrete optimization in groups 5.2.2 Examples and basic properties The classical SSP (also called the zero-one knapsack problem) is the following algorithmic question. Given a1 , . . . , ak ∈ ℤ and M ∈ ℤ, decide if M = ε1 a1 + ⋅ ⋅ ⋅ + εk ak for some ε1 , . . . , εk ∈ {0, 1}. It is well known (see [155, 405, 406]) that if the numbers in SSP are given in binary form, then the problem is NP-complete, but if they are given in unary form, then the problem is in P. The examples below show how these two variations of SSP appear naturally in the group theory context. Example 5.2.1. Three variations of the subset sum problem for ℤ: – SSP(ℤ, {1}) is linear-time equivalent to the classical SSP in which numbers are given in unary form. In particular, SSP(ℤ, {1}) is in P. – For n ∈ ℕ ∪ {0} put xn = 2n . The set X = {xn | n ∈ ℕ ∪ {0}} obviously generates ℤ. Fix an encoding ν : X ±1 → {0, 1}∗ for X ±1 defined by
–
ν
xi
→
−xi
→
{
ν
0101(00)i 11,
0100(00)i 11.
Then SSP(ℤ, X) is P-time equivalent to its classical version where the numbers are given in binary form. In particular, SSP(ℤ, X) is NP-complete. n Let X = {2n | n ∈ ℕ ∪ {0}} and the number 2n is represented by the word 01(00)2 11 (unary representation). Then SSP(ℤ, X) is in P.
The first example is of no surprise, of course, since, by definition, we treat words representing elements of the group as in unary form. The second one shows that there might be a huge difference in complexity of SSP(G, X) for finite and infinite generating sets X. The third one indicates that if X is infinite, then it really matters how we represent the elements of X. Definition 5.2.2. Let G and H be groups generated by countable sets X and Y with encodings ν and μ, respectively. A homomorphism φ : G → H is called P-time computable relative to (X, ν), (Y, μ) if there exists an algorithm that given a word ν(u) ∈ ν(Σ∗X ) computes in polynomial time (in the size of the word ν(u)) a word μ(v) ∈ μ(Σ∗Y ) representing the element v = φ(u) ∈ H. Example 5.2.3. Let Gi be a group generated by a set Xi with encoding νi , i = 1, 2. If X1 is finite, then any homomorphism φ : G1 → G2 is P-time computable relative to (X1 , ν1 ), (X2 , ν2 ). To formulate the following results, we put 𝒫 = {SSP, BKP, BSMP, AGWP}.
5.2 Subset sum problem and related problems | 249
Lemma 5.2.4. Let Gi be a group generated by a set Xi with an encoding νi , i = 1, 2. If ϕ : G1 → G2 is a P-time computable embedding relative to (X1 , ν1 ), (X2 , ν2 ), then Π(G1 , X1 ) is P-time reducible to Π(G2 , X2 ) for any problem Π ∈ 𝒫 . Proof. The proof is straightforward. In view of Example 5.2.3 we have the following result. Proposition 5.2.5. If X and Y are finite generating sets for a group G, then Π(G, X) is P-time equivalent to Π(G, Y) for any problem Π ∈ 𝒫 . Proposition 5.2.6. Let G be a group and X a generating set for G. Then the word problem (WP) for G is P-time reducible to Π(G, X) for any problem Π ∈ 𝒫 . Proof. Let w = w(X). Then w = 1 in G if and only if 1ε = w in G for some ε ∈ {0, 1}, i. e., if and only if the instance 1, w of SSP(G) is positive (and similarly for other problems from 𝒫 ). Corollary 5.2.7. Let G be a group with a generating set X. Let Π ∈ 𝒫 . Then: (1) Π(G, X) is decidable if and only if the word problem for G is decidable, (2) if the word problem for G is NP-hard, then Π(G, X) is NP-hard too for any Π ∈ 𝒫 . This corollary shows that from the SSP viewpoint groups with polynomial-time decidable word problem are the most interesting. The following result shows how the decision version of SSP(G) gives a search algorithm to find an actual sequence of εi ’s that is a particular solution for a given instance of SSP(G). Similar result holds for all other problems in 𝒫 . Proposition 5.2.8. For any group G the search SSP(G) is P-time Turing reducible to the decision SSP(G). In particular, if SSP(G) is in P, then the search SSP(G) is also in P. Proof. The argument is rather known, so we just give a quick outline to show that it works in the noncommutative case too. Let w1 , . . . , wk , w be a given instance of SSP(G) that has a solution in G. To find a solution ε1 , . . . , εk ∈ {0, 1} for this instance consider the following algorithm. – Solve the decision problem for (w2 , . . . , wk ), w in G. If the answer is positive, then put ε1 = 0. Otherwise, put ε1 = 1 and replace w with w1−1 w. – Continue inductively and find the whole sequence ε1 , . . . , εk . SSP, BKP and BSMP in a group G reduce easily to AGWP. Proposition 5.2.9. Let G be a finitely generated group. SSP(G), BKP(G) and BSMP(G) are P-time reducible to AGWP(G). Proof. Let w1 , w2 , . . . , wk , w be an input of SSP(G). Consider the graph Γ = Γ(w1 , . . . , wk , w) shown in Figure 5.4. We see immediately that (w1 , . . . , wk , w) is a positive instance of SSP(G) if and only if Γ is a positive instance of AGWP(G). Since BKP(G)
Figure 5.4: Graph Γ(w1 , w2 , . . . , wk , w), Proposition 5.2.9.
Figure 5.5: Graph Δ(w1 , w2 , . . . , wk , w, 1n ), Proposition 5.2.9. There are n + 2 vertices in the graph.
P-time reduces to SSP(G) (see [363]), it only remains to prove that BSMP(G) reduces to AGWP(G). Indeed, let (w1 , w2 , . . . , wk , w, 1n ) be an input of BSMP(G). Consider the graph Δ = Δ(w1 , w2 , . . . , wk , w, 1n ) shown in Figure 5.5. It is easy to see that (w1 , w2 , . . . , wk , w, 1n ) is a positive instance of BSMP(G) if and only if Δ is a positive instance of AGWP(G). We make a note of the following obvious property of AGWP. Proposition 5.2.10. Let G be a finitely generated monoid and H ≤ G a finitely generated submonoid. Then AGWP(H) is P-time reducible to AGWP(G). In particular: 1. if AGWP(H) is NP-hard, then AGWP(G) is NP-hard, 2. if AGWP(G) ∈ P, then AGWP(H) ∈ P. It is possible to state the above problems for monoids. We do not go into detail here and refer the reader to [145]. Before we move on to specific complexity results, we record two immediate observations regarding the complexity of SSP and related problems. 1. As we observed in Proposition 5.2.6, none of the problems SSP, BSMP, AGWP in a given group G is easier than WP. 2. Establishing whether there is a path from one vertex to another in a given acyclic graph (call this problem AcycPath) obviously reduces to AGWP in any group. In this sense, AGWP(G) is not easier than AcycPath.
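The decision-to-search procedure of Proposition 5.2.8 is easy to mechanize. The following minimal sketch is our own illustration, not part of the text: ssp_decide is a hypothetical decision oracle for SSP(G), and mul, inv are placeholders for multiplication and inversion of group elements.

def ssp_search(ws, w, ssp_decide, mul, inv):
    """Recover a witness for a positive SSP(G) instance from a decision oracle.
    Returns bits eps with w_1^eps_1 * ... * w_k^eps_k = w, or None."""
    if not ssp_decide(ws, w):
        return None
    eps, target = [], w
    for i in range(len(ws)):
        if ssp_decide(ws[i + 1:], target):    # can the remaining elements reach the target?
            eps.append(0)
        else:
            eps.append(1)
            target = mul(inv(ws[i]), target)  # replace w by w_i^{-1} w, as in the proof
    return eps

if __name__ == "__main__":
    # Toy check in G = Z (elements are integers, the oracle is brute force).
    from itertools import product
    dec = lambda ws, w: any(sum(b * x for b, x in zip(bits, ws)) == w
                            for bits in product((0, 1), repeat=len(ws)))
    print(ssp_search([3, 5, 7], 12, dec, lambda a, b: a + b, lambda a: -a))  # [0, 1, 1]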
5.2.3 Easy SSP
In this section we survey groups for which SSP and related problems are, in some sense, easy.
5.2.3.1 Virtually nilpotent groups
Let G be a group generated by a finite set X. We assume that X is closed under inversion in G, so X −1 = X. For n ∈ ℕ we denote by Bn (X) the ball of radius n in the Cayley graph Cay(G, X) of G relative to X. We view Bn (X) as a finite directed X-labeled graph, which is the subgraph of Cay(G, X) induced by all vertices at distance at most n from the base vertex 1. The following result is well known. Proposition 5.2.11. Let G be a virtually nilpotent group generated by a finite set X. Then there is a P-time algorithm that for a given n ∈ ℕ outputs the graph Bn (X). Proof. Denote by Vn the set of vertices of Bn (X). Clearly, V0 = {1}, and
V_n = V_{n−1} ∪ ⋃_{y∈X} V_{n−1} y.   (5.24)
By a theorem of Wolf [499] the growth of G is polynomial, i. e., |Vi | ≤ p(i) for some polynomial p. It follows from (5.24) that Bn (X) can be constructed from Bn−1 (X) in at most |X| rounds (one for each y ∈ X): in each round we take every vertex v ∈ Bn−1 (X) − Bn−2 (X) (given by some word in X), multiply it by the given y ∈ X and check whether the new word vy is equal to any of the previously constructed vertices. Recall that finitely generated virtually nilpotent groups are linear, and therefore their word problems are decidable in polynomial time (in fact, in real time [218] or logspace [317]). This shows that Bn (X) can be constructed in time polynomial in n for a given fixed G and X. Remark 5.2.12. The argument above and the following Theorem 5.2.13 are based on the fact that finitely generated virtually nilpotent groups have polynomial growth. By Gromov's theorem [186] the converse is also true, i. e., polynomial growth implies virtual nilpotence, so the argument cannot be immediately applied to other classes of groups. Theorem 5.2.13. Let G be a finitely generated virtually nilpotent group. Then SSP(G), BSMP(G) and AGWP(G) are in P. Proof. We give the proof for SSP. The argument for AGWP (and therefore BSMP) is similar. Consider an arbitrary instance g1 , . . . , gk , g of SSP(G). For every i = 0, . . . , k define the set
P_i = {g_1^{ε_1} ⋅ ⋅ ⋅ g_i^{ε_i} | ε_1 , . . . , ε_i ∈ {0, 1}}.
Clearly, the given instance is positive if and only if g ∈ Pk . The set Pi can be constructed recursively using the formula
P_i = P_{i−1} ∪ P_{i−1} ⋅ g_i .
(5.25)
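In concrete terms, the recursion (5.25) can be run as follows. This is a minimal sketch of our own, not part of the proof: mul, normal_form and identity are hypothetical placeholders for multiplication, a canonical (normal) form and the identity of G; the point of the theorem is that for virtually nilpotent G the set of normal forms stays polynomially small.

def ssp_by_ball(gs, g, mul, normal_form, identity):
    """Decide an SSP(G) instance g_1, ..., g_k, g via P_i = P_{i-1} u P_{i-1}.g_i."""
    P = {normal_form(identity)}
    for gi in gs:
        P = P | {normal_form(mul(p, gi)) for p in P}   # formula (5.25)
    return normal_form(g) in P

if __name__ == "__main__":
    # Toy check in G = Z^2 (normal forms are the vectors themselves).
    add = lambda a, b: (a[0] + b[0], a[1] + b[1])
    print(ssp_by_ball([(1, 0), (0, 2), (1, 1)], (2, 1), add, lambda v: v, (0, 0)))  # True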
252 | 5 Discrete optimization in groups Observe that all elements of Pk lie in the ball Bm (X), where m = |g1 | + ⋅ ⋅ ⋅ + |gk |. Using formula (5.25) one can identify all vertices in Bm (X) that belong to Pk (an argument similar to the one in Proposition 5.2.11 works here as well) in polynomial time. In fact, an even stronger statement can be made about AGWP in a nilpotent group. Theorem 5.2.14 (Theorem 4.3 of [275]). Let G be a finitely generated virtually nilpotent group. Then AGWP(G) is NL-complete. Proof. First, we observe that AGWP(UTd (ℤ)) ∈ NL by recording matrix entries in binary form (the entries grow polynomially, for example, by [296, Proposition 4.18]). Next, we note that every finitely generated virtually nilpotent group G has a finiteindex torsion-free nilpotent subgroup H, which, in turn, embeds in some UTd (ℤ). Since AGWP(G) logspace reduces to AGWP(H) whenever H ≤f.i. G (see Theorem 5.2.56), the membership in NL follows. Finally, to establish NL-hardness it suffices to recall that AcycPath is NL-complete (see, for example, [60, Theorem 4.4.]), and therefore AGWP(G) is as well. It follows that SSP in a virtually nilpotent group belongs to NL. It remains to be seen whether SSP is NL-complete for a virtually nilpotent G. It is worth noting that, as we discuss in Section 5.3.3, SSP(ℤn ) is TC0 -complete. 5.2.3.2 Hyperbolic groups In this section we prove that the subset sum problem is P-time decidable for every hyperbolic group. We refer to Section 5.1.6 for an introduction to hyperbolic groups, and to [187, 4] for further details. The proofs in this section are based on some results from [112, 368] (see also the book [366]). The polynomial-time solution for the acyclic graph word problem for hyperbolic groups uses finite-state automata and two operations, called R-completion and folding, described below. Notation. For a finite automaton Γ over the alphabet X we denote by L(Γ) the set of all words accepted by Γ. By |Γ| we denote the number of states in Γ. In general, for a set S ⊂ X ∗ by S we denote the image of S in G = ⟨X | R⟩ under the standard epimorphism X ∗ → G. R-completion Recall that a group presentation ⟨X | R⟩ is called symmetrized if R = R−1 and R is closed under taking cyclic permutations of its elements. Given a symmetrized presentation ⟨X | R⟩ and an automaton Γ over ΣX = X ∪X −1 ∪{ε}, one can construct a new automaton 𝒞 (Γ) obtained from Γ by adding a loop labeled by r for every r ∈ R at every state v ∈ Γ. By R-completion of Γ we understand the graph 𝒞 k (Γ) for some k ∈ ℕ. We want to point out that unlike in [368], we do not perform Stallings foldings after adding relator
loops. Instead, we perform a special transformation of the automaton described in Section 5.2.3.2. Proposition 5.2.15 (Properties of 𝒞 (Γ)). For every ⟨X | R⟩ and Γ the following holds: (a) Γ is a subgraph of 𝒞 (Γ), (b) L(Γ) = L(𝒞 (Γ)), (c) |𝒞 (Γ)| ≤ |Γ| ⋅ ‖R‖, where ‖R‖ = ∑r∈R |r|. Proof. It follows from the construction of 𝒞 (Γ). Non-Stallings folding Given an automaton Γ over a group alphabet ΣX , one can construct a new automaton ℱ (Γ) obtained from Γ by a sequence of steps, at each step adding new edges as described below. For every pair of consecutive edges of the form shown in the left column of the table below we add the edge from the right column of the table (in the same row), provided this edge is not yet in the graph. x
s1 →[x] s2 →[x^{−1}] s3      |      s1 →[ε] s3
s1 →[x] s2 →[ε] s3           |      s1 →[x] s3
s1 →[ε] s2 →[x] s3           |      s1 →[x] s3
s1 →[ε] s2 →[ε] s3           |      s1 →[ε] s3
Clearly, the procedure eventually stops, because the number of vertices does not increase and the alphabet X is finite. Proposition 5.2.16. We have L(Γ) = L(ℱ (Γ)) for any finite automaton Γ over the alphabet ΣX . Proof. The language as a set of reduced words does not change. Lemma 5.2.17. Let ⟨X | R⟩ be a finite presentation of a hyperbolic group. Let Γ be an acyclic automaton over ΣX with at most l nontrivially labeled edges. Then 1 ∈ L(Γ) if and only if there exists u ∈ L(𝒞 O(log l) (Γ)) satisfying u =F(X) ε. Proof. If 1 ∈ L(Γ), then there exists v ∈ L(Γ) such that v =G 1. The length of v is bounded by l. By Proposition 5.1.13 the depth of v is bounded by O(log |v|). Let D be a diagram with perimeter label v of depth O(log |v|). Next, we mimic the proof of [366, Proposition 16.3.14]. Cut D to obtain a new “forest” diagram E of height l with a perimeter label vu, where u =F(X) ε (see [366, Figure 16.2]). The diagram E embeds into 𝒞 O(log l) (Γ) and the initial segment v of the perimeter label of E is mapped onto the corresponding word in Γ. This way we obtain a path from α to ω labeled with u, as claimed. The other direction of the statement follows from Proposition 5.2.15.
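The folding operation ℱ is a plain fixpoint computation over pairs of consecutive edges. The sketch below is our own illustration (not code from the text); an automaton is represented as a set of labeled edges, 'eps' stands for ε, and the inverse of a letter x is written x^-1.

def fold(edges):
    """Close a set of edges (source, label, target) under the four folding rules."""
    def inv(lbl):
        return lbl[:-3] if lbl.endswith("^-1") else lbl + "^-1"
    edges = set(edges)
    while True:
        new = set()
        for (a, l1, b) in edges:
            for (b2, l2, c) in edges:
                if b2 != b:
                    continue
                if l1 != "eps" and l2 == inv(l1):
                    new.add((a, "eps", c))       # x followed by x^-1
                elif l1 != "eps" and l2 == "eps":
                    new.add((a, l1, c))          # x followed by eps
                elif l1 == "eps" and l2 != "eps":
                    new.add((a, l2, c))          # eps followed by x
                elif l1 == "eps" and l2 == "eps":
                    new.add((a, "eps", c))       # eps followed by eps
        new -= edges
        if not new:
            return edges
        edges |= new

if __name__ == "__main__":
    print((0, "eps", 2) in fold({(0, "x", 1), (1, "x^-1", 2)}))   # True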
254 | 5 Discrete optimization in groups Proposition 5.2.18. Let ⟨X | R⟩ be a finite presentation of a hyperbolic group. Let Γ be an acyclic automaton over ΣX with at most l nontrivially labeled edges. Then 1 ∈ L(Γ) if ε and only if ℱ (𝒞 O(log l) (Γ)) contains an edge α → ω. Proof. It follows from Lemma 5.2.17, the definition of ℱ and Proposition 5.2.16. As an immediate corollary we get the following principal result. Theorem 5.2.19. AGWP(G) ∈ P for any hyperbolic group G. Corollary 5.2.20. SSP(G), BSMP(G), BKP(G) ∈ P for any hyperbolic group G. 5.2.3.3 Baumslag–Solitar groups BS(n, ±n) Recall that the Baumslag–Solitar group BS(n, m) is given by the presentation BS(n, m) = ⟨a, t | t −1 an t = am ⟩. It is easy to see (see Section 5.2.4) that SSP(BS(n, m)) is NP-complete if m ≠ ±n. It is less obvious that SSP(BS(n, ±n)) is in P, and so is AGWP(BS(n, ±n)). We briefly outline the algorithm here. Let Γ be an input graph for AGWP. We allow edges to be labeled with arbitrary powers of a. As we observed before, this results in a bounded change to the size of input. Starting with the graph Γ, we repeatedly apply Britton’s lemma to the graph: –
for any path s1 →[t^{±1}] s2 →[a^{cn}] s3 →[t^{∓1}] s4 add the edge s1 →[a^{cn}] s4 in the case of BS(n, n), or the edge s1 →[a^{−cn}] s4 in the case of BS(n, −n) (where c ∈ ℤ), and
– for any path s1 →[a^{s}] s2 →[a^{t}] s3 add the edge s1 →[a^{s+t}] s3 .
The procedure terminates in polynomial time because the exponents of a that appear are bounded by the length of the input. The answer is "yes" if there exists an ε-edge from α to ω.
5.2.3.4 Some right-angled Artin groups
Here we mention that in [305] it is shown that SSP(ℤn ) is TC0 -complete and SSP(RAAG(transitive forest)) is LogCFL-complete. These results are covered in more detail in Section 5.3.3.
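Returning to the saturation procedure for BS(n, ±n) described above, the following is a minimal sketch of our own (not code from the text). Edges are labeled either ('t', e) with e = ±1 or ('a', c) with c ∈ ℤ; sign = +1 corresponds to BS(n, n) and sign = −1 to BS(n, −n).

def agwp_bs(edges, alpha, omega, n, sign=+1):
    """Saturate the graph under the two rules above; answer the AGWP instance."""
    edges = set(edges)
    while True:
        new = set()
        for (u, l1, v) in edges:
            for (v2, l2, w) in edges:
                if v2 != v:
                    continue
                if l1[0] == 'a' and l2[0] == 'a':            # a^s . a^t -> a^(s+t)
                    new.add((u, ('a', l1[1] + l2[1]), w))
                if l1[0] == 't' and l2[0] == 'a' and l2[1] % n == 0:
                    for (w2, l3, z) in edges:                # t^e . a^(cn) . t^-e
                        if w2 == w and l3 == ('t', -l1[1]):
                            new.add((u, ('a', sign * l2[1]), z))
        new -= edges
        if not new:
            break
        edges |= new
    return (alpha, ('a', 0), omega) in edges

if __name__ == "__main__":
    # t^-1 a^2 t a^-2 = 1 in BS(2, 2), read along a path from "s" to "f".
    E = {("s", ('t', -1), 1), (1, ('a', 2), 2), (2, ('t', 1), 3), (3, ('a', -2), "f")}
    print(agwp_bs(E, "s", "f", n=2, sign=+1))   # True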
5.2.4 Distortion as a source of hardness of SSP
Distortion in a group can be a source of hardness for SSP (and therefore AGWP). To begin, consider the well-known Baumslag–Solitar metabelian group BS(m, n) = ⟨a, t | t^{−1} a^m t = a^n ⟩.
Theorem 5.2.21. SSP(BS(1, 2)) is NP-complete. Proof. We showed in Example 5.2.1 that SSP(ℤ, X) is NP-complete for the generating set X = {x_n = 2^n | n ∈ ℕ ∪ {0}}. The map x_n → t^{−n} a t^n is obviously P-time computable and defines an embedding ϕ : ℤ → BS(1, 2) because t^{−n} a t^n = a^{2^n} . Hence, SSP(ℤ, X) P-time reduces to SSP(BS(1, 2)). Thus, SSP(BS(1, 2)) is NP-complete. In fact, it is easy to adjust the above proof to show that SSP(BS(m, n)) is NP-complete whenever |m| ≠ |n| and m, n ≠ 0. A similar exponential distortion can be observed in any nonvirtually nilpotent polycyclic group, which we can exploit to establish NP-completeness of SSP. In what follows we set up a reduction of ZOE (see Section 5.1.4.2).
5.2.4.1 Distortion in polycyclic groups
The following two statements are well known. Recall that ‖ ⋅ ‖ denotes the Euclidean norm. Also note that below we follow the convention that for m, l ∈ ℕ ∪ {0}, the binomial coefficient (m choose l) is 0 whenever m < l. Lemma 5.2.22. Let M be an n×n matrix with complex entries, and let α be the maximum of the absolute values of its eigenvalues. There is a positive constant CM such that for any v ∈ ℂ^n and any k ∈ ℕ
‖M^k v‖ ≤ C_M (α^k + (k choose 1) α^{k−1} + ⋅ ⋅ ⋅ + (k choose n) α^{k−n}) ‖v‖.
Proposition 5.2.23. Let H = ⟨x⟩ ⋉ K, where K = ℤ^n and x acts on K by conjugation via a matrix X ∈ SLn (ℤ). Suppose H is not virtually nilpotent. Then X has a complex eigenvalue of absolute value greater than 1. Recall that every polycyclic group G has a unique maximal normal nilpotent subgroup, called the Fitting subgroup of G and denoted Fitt G (see, for example, [445, Chapter 1]). Note that Fitt G is a characteristic subgroup of G. Let
1 = G0 ◁ G1 ◁ G2 ◁ ⋅ ⋅ ⋅ ◁ Gm = G
(5.26)
be a subnormal series for G with cyclic quotients. Denote Fitt Gi = Hi . For each i = 0, . . . , m − 1, we have Gi ◁ Gi+1 and Hi is a characteristic subgroup of Gi , and therefore, Hi ◁ Gi+1 . It follows that Hi ≤ Hi+1 . Suppose that H = Fitt G is a term of the polycyclic series (5.26), H = Gj with j < m. Then H = Hj ≤ Hj+1 ≤ Hm = H, so Hj = Hj+1 . Observe that in this case, Gj+1 is not virtually nilpotent if Gj+1 /Gj is infinite cyclic. Indeed, if Gj+1 is virtually nilpotent, then it has a finite-index normal nilpotent subgroup and therefore Hj+1 > Hj .
256 | 5 Discrete optimization in groups Proposition 5.2.24. Let G be a polycyclic group that is not virtually nilpotent. There exists an element x ∈ G and normal nilpotent subgroups K ≤ H of G such that H/K is infinite abelian and ⟨x, H⟩/K is not virtually nilpotent. Proof. Since G/ Fitt G is infinite, it has a polycyclic series (5.26) such that Fitt G = Gj , j < m, and Gj+1 /Gj is infinite cyclic (see, for example, [445, Chapter 1, Proposition 2]). Let gj+1 ∈ Gj+1 be such that Gj+1 = ⟨gj+1 , Gj ⟩. We claim that we can take x = gj+1 , H = Gj , and we can take K to be the commutator subgroup H = [H, H] of H. Indeed, the subgroup K = H is characteristic in H = Gj+1 = Fitt G, and therefore normal in G. Furthermore, if the abelianization H/H is finite, then H is finite (for example, by [445, Chapter 1, Corollary 9]) and therefore so is G, by [445, Chapter 1, Lemma 6]. It was observed above that Gj+1 = ⟨x, H⟩ is not virtually nilpotent. Finally, if ⟨x, H⟩/H is virtually nilpotent, it follows by [445, Chapter 1, Corollary 12] that ⟨x, H⟩ is virtually nilpotent, which is not the case. For a polycyclic group G let x ∈ G and K ≤ H ≤ G be as provided by Proposition 5.2.24. In Section 5.2.4.2, we show that SSP in the abelian-by-cyclic group ⟨x, H⟩/K is NP-hard. In Section 5.2.4.3, we show that the instances involved in this reduction, in turn, polynomially reduce to SSP(⟨x, H⟩) and therefore to SSP(G). Together with the observation that the word problem in G is solvable in polynomial time, this will imply that SSP(G) is NP-complete. 5.2.4.2 SSP in abelian-by-cyclic groups Fix a group F = ℤ⋉ℤn with exponentially distorted ℤn by a matrix X ∈ SLn (ℤ). Also, fix a generating set {x, e1 , . . . , en }, where x is the generator of ℤ and e1 , . . . , en are standard generators for ℤn . Let φ : F(x, e1 , . . . , en ) → F be the canonical epimorphism. Below we reduce a problem known to be NP-complete, namely, the zero-one equation problem, to SSP(F). As before, let α be the greatest absolute value of an eigenvalue for X ∈ SLn (ℤ). Define a polynomial p(k) ∈ ℝ[k] as k 1 k 1 p(k) = CX ⋅ (1 + ( ) + ⋅ ⋅ ⋅ + ( ) n ), n α 1 α where CX is a constant provided by Lemma 5.2.22. The following statement follows by a standard argument. Proposition 5.2.25. In the above notation, for every k ∈ ℕ, there is j ∈ {1, . . . , n} satisfying 1 k k α ≤ X ej ≤ p(k)αk . √n By Proposition 5.2.23, α > 1. Observe that given k ∈ ℕ, one can find a basis vector e (denoted by ek∗ ) provided by Proposition 5.2.25 in polynomial time by computing
X k e1 , . . . , X k en . Now, for λ ≥ 1 and k ∈ ℕ define a constant cλ,k = ⌈logα (p(k))⌉ + ⌈logα λ⌉ + ⌈logα √n⌉ + 1 and note that ∗ λX k ek∗ < X k+cλ,k ek+c . λ,k For a sequence of numbers n1 = 1, ni+1 = ni + cλ,ni we have λk−1 X n1 en∗1 < λk−2 X n2 en∗2 < ⋅ ⋅ ⋅ < λX nk−1 en∗k−1 < X nk en∗k .
(5.27)
Denote the words corresponding to X^{n_1} e^∗_{n_1} , . . . , X^{n_k} e^∗_{n_k} by wλ,1 , . . . , wλ,k , i. e., define wλ,i = x^{−n_i} e^∗_{n_i} x^{n_i} . Clearly, |wλ,i | ≤ 1 + 2n_i . Now we find an upper bound for n_i . Note that cλ,k ≤ A + B logα (λk), where the constants A, B > 0 depend only on α, CX and n. Then
n_i ≤ n_1 + iA + B(logα (λn_1 ) + ⋅ ⋅ ⋅ + logα (λn_i )) ≤ iA + iB logα (λn_i ),
or
n_i / (A + B logα (λn_i )) ≤ i,   that is,   λn_i / (A + B logα (λn_i )) ≤ λi.
Since there is a constant C ≥ 0 (that depends only on A, B and α) such that t / (A + B logα (t)) ≥ √t − C for all t ≥ 1, we have √(λn_i) − C ≤ λi, so n_i ≤ λ^{−1} (λi + C)^2 . Therefore,
|wλ,i | ≤ 1 + 2n_i ≤ 1 + 2λ^{−1} (λi + C)^2 ,
(5.28)
where C ultimately depends only on X and n (of course, better estimates for the growth of n_i are possible but immaterial for our purposes). Proposition 5.2.26. SSP(F) is NP-hard. Proof. For an instance of ZOE (see Section 5.1.4.2) given by a zero-one k × k matrix A = (a_{ij} ), choose λ = k and consider the instance (g1 , . . . , gk , g) of SSP(F), where
g_i = w_{λ,1}^{a_{1i}} ⋅ ⋅ ⋅ w_{λ,k}^{a_{ki}}   and   g = w_{λ,1} ⋅ ⋅ ⋅ w_{λ,k} .
We claim that the instance of ZOE is positive if and only if the corresponding instance of SSP is positive. Indeed, let k ≥ 2 (the case k = 1 is immediate). The instance of SSP is positive if and only if the linear combination
(−1 + ∑_{i=1}^{k} a_{1i} ε_i ) X^{n_1} e^∗_{n_1} + ⋅ ⋅ ⋅ + (−1 + ∑_{i=1}^{k} a_{ki} ε_i ) X^{n_k} e^∗_{n_k}   (5.29)
is equal to 0 for some ε_i ∈ {0, 1}. Since for every coefficient in (5.29) we have
−1 ≤ −1 + ∑_{i=1}^{k} a_{ji} ε_i ≤ k − 1,
it follows from (5.27) that (5.29) is trivial if and only if all coefficients are 0, i. e., there are ε_i ∈ {0, 1} such that
∑_{i=1}^{k} a_{ji} ε_i = 1   for every 1 ≤ j ≤ k.
The latter is precisely the condition for the corresponding instance of ZOE to be positive. Furthermore, since it is straightforward to write wλ,k , the time to generate the instance of SSP is proportional to |g1 | + ⋅ ⋅ ⋅ + |gk | + |g| ≤ (k + 1)|g| ≤ (k + 1)(k|wk,k |). Taking (5.28) into account, we see that the above is clearly polynomial in k. Therefore, ZOE can be reduced to SSP(F) in polynomial time. Thus, SSP(F) is NP-hard. Since the word problem in F is decidable in polynomial time, the following holds. Corollary 5.2.27. SSP(F) is NP-complete. Remark 5.2.28. Note that the elements wλ,k involved in the above reduction belong to the “bottom” subgroup ℤn of F = ℤ ⋉ ℤn . 5.2.4.3 SSP in polycyclic groups Now we turn to the case of an arbitrary polycyclic group G. Theorem 5.2.29. Let G be a nonvirtually nilpotent polycyclic group. Then SSP(G) is NP-complete.
Proof. Since the word problem in G is polynomial-time decidable, it suffices to show that SSP(G) is NP-hard. For that, we show that the reduction of ZOE to the subset sum problem in an abelian-by-cyclic group described in Section 5.2.4.2 can be refined to deliver a reduction to SSP(G). Let x ∈ G and let normal subgroups K ≤ H of G be as provided by Proposition 5.2.24. Passing to a subgroup of H/K, we may assume that the group F = ⟨x, H⟩/K is (free abelian)-by-cyclic, as specified in Section 5.2.4.2. Let a k × k matrix Z = (aji ) be given as an input of ZOE, let g1 , . . . , gk , g ∈ F be the equivalent input of SSP(F) and let wλ,1 , . . . , wλ,k , with λ = k, be the corresponding elements of F involved in the construction, as chosen in the proof of Proposition 5.2.26: gi = wλ,1
a1i
. . . wλ,k
aki
and g = wλ,1 ⋅ ⋅ ⋅ wλ,k .
Fix representatives wλ,k ∈ ⟨x, H⟩ of wλ,k : wλ,1 = wλ,1 K, . . . , wλ,k = wλ,k K. Consequently, fix representatives of gi , g as follows: a
a
gi = wλ,11i . . . wλ,kki
and g = wλ,1 ⋅ ⋅ ⋅ wλ,k .
Note that we may assume that elements in ⟨x, H⟩/K are encoded by words in generators of ⟨x, H⟩, so we can neglect the time required to choose elements wλ,i , gi , g. By construction, if the equality
g_1^{ε_1} ⋅ ⋅ ⋅ g_k^{ε_k} = g
holds in F, then
∑_{i=1}^{k} a_{ji} ε_i = 1   for every 1 ≤ j ≤ k,
or, in other words, each factor wλ,i occurs in the product g_1^{ε_1} ⋅ ⋅ ⋅ g_k^{ε_k} exactly once. By the choice of wλ,i and gi , the same is true for the factors wλ,i in the product g_1^{ε_1} ⋅ ⋅ ⋅ g_k^{ε_k} , that is, the latter and the element g are products of the same factors wλ,1 , . . . , wλ,k in, perhaps, different orders. Recall that wλ,i ∈ H by Remark 5.2.28, and therefore these elements generate a nilpotent subgroup H0 of H. Let c0 ≤ c be the nilpotency classes of H0 ≤ H, respectively. Since H0 is nilpotent, the wλ,i factors in the product g_1^{ε_1} ⋅ ⋅ ⋅ g_k^{ε_k} can be rearranged as
g_1^{ε_1} ⋅ ⋅ ⋅ g_k^{ε_k} ⋅ h_1^{α_1} ⋅ ⋅ ⋅ h_m^{α_m} = w_{λ,1} ⋅ ⋅ ⋅ w_{λ,k} = g,
where h1 , . . . , hm are iterated commutators of wλ,1 , . . . , wλ,k up to weight c0 ≤ c. Since there are k factors wλ,i in the product g_1^{ε_1} ⋅ ⋅ ⋅ g_k^{ε_k} , by Proposition 5.1.7, there is a polynomial P that only depends on c such that each |α1 |, . . . , |αm | can be taken to not exceed P(k).
Therefore, if the instance g1 , . . . , gk , g of SSP(F) is positive, then the instance
g_1 , . . . , g_k , h_1 , . . . , h_1 , h_1^{−1} , . . . , h_1^{−1} , . . . , h_m , . . . , h_m , h_m^{−1} , . . . , h_m^{−1} , g,   (5.30)
in which each of the elements h_1 , h_1^{−1} , . . . , h_m , h_m^{−1} is repeated P(k) times,
of SSP(G) is positive. The converse is immediate since H/K is abelian and therefore hi ∈ K. It follows that the instance Z of ZOE is positive if and only if (5.30) is a positive instance of SSP(G). It is only left to observe that there are at most (2k)c + (2k)c−1 + ⋅ ⋅ ⋅ + (2k)2 iterated ±1 commutators of 2k elements wλ,i , each of length at most 2c |wλ,k |, so the tuple (5.30) can be constructed in time which is polynomial in the size of the original input Z. By Section 5.2.4.2, this gives a reduction of ZOE to SSP(G). Corollary 5.2.30. Let G be a polycyclic group. If G is virtually nilpotent, then SSP(G), AGWP(G) ∈ P. Otherwise SSP(G) and AGWP(G) are NP-complete. Proof. If G is virtually nilpotent, the subset sum problem has polynomial-time solution by Theorem 5.2.13. If G is not virtually nilpotent, the statement immediately follows by Theorem 5.2.29. Therefore, SSP(G) (consequently, AGWP(G)) is NP-hard for any group G that contains a not virtually nilpotent polycyclic subgroup and NP-complete if, additionally, the word problem for G is polynomial-time decidable. In particular, the subset sum problem is NP-complete for any virtually polycyclic group that is not virtually nilpotent.
5.2.5 Large abelian subgroups as a source of hardness In this section we exploit another source of NP-hardness of SSP, i. e., the existence of subgroup ℤω . We start with an infinitely generated group ℤω , a direct sum of countably many copies of the infinite cyclic group ℤ. We view elements of ℤω as sequences ℕ → ℤ with finite support. For i ∈ ℕ we denote by ei a sequence such that ei (j) = δi,j , where δi,j is the Kronecker delta function. The set E = {ei }i∈ℕ is a basis for ℤω . We fix an encoding ν : E ±1 → {0, 1}∗ for the generating set E defined by ν
(e_i ) = 0101(00)^i 11,   ν(−e_i ) = 0100(00)^i 11.
Proposition 5.2.31. SSP(ℤω , E) is NP-complete. Proof. Below we reduce a problem known to be NP-complete, namely, the zero-one equation problem (see Section 5.1.4.2), to SSP(ℤω , E). The reduction is organized as follows.
Given a zero-one n × n matrix A = (a_{ij} ), compute the elements
g_i = ∑_{j=1}^{n} a_{ij} e_j ∈ ℤ^ω   (for i = 1, . . . , n)
and put g = e1 + ⋅ ⋅ ⋅ + en . Clearly, the tuple g1 , . . . , gn , g is P-time computable and A is a positive instance of ZOE if and only if g1 , . . . , gn , g is a positive instance of SSP(ℤω , E). This establishes a P-time reduction of ZOE to SSP(ℤω , {ei }), as claimed. The next proposition is obvious. Proposition 5.2.32. Let G be a group generated by a set X. If φ : ℤω → G is a P-time computable embedding relative to the generating sets E and X, then SSP(G) is NP-hard. If, in addition, the word problem for G is decidable in polynomial time, then SSP(G) is NP-complete. This result gives a wide class of groups G with NP-hard or NP-complete SSP(G). Proposition 5.2.33. The following groups have NP-complete SSP: (a) free metabelian non-abelian groups of finite rank, (b) the wreath product ℤ ≀ ℤ, and (c) the wreath product of two finitely generated infinite abelian groups. Proof. Let Mn be a free metabelian group with basis X = {x1 , . . . , xn }, where n ≥ 2. It is not hard to see that the elements ei = x1−i [x2 , x1 ]x1i
(for i ∈ ℕ)
freely generate a free abelian group ℤω (see, for example, the description of normal forms of elements of Mn in [71]). This gives a P-time computable embedding of ℤω into Mn relative to the generating sets E and X. It is known that the word problem in finitely generated metabelian groups is in P (see, for example, [344]). Hence, by Proposition 5.2.32, SSP(Mn ) is NP-complete and (a) holds. The wreath product of two infinite cyclic groups generated by a and t, respectively, is a finitely generated infinitely presented group G = ⟨a, t | [a, t −i at i ] = 1,
(for i ∈ ℕ)⟩.
The set {t −i at i | i ∈ ℕ} freely generates a subgroup isomorphic to ℤω . In fact, the map ei → t −i at i defines a P-time computable embedding of ℤω into G relative to the generating sets E and {a, t}. Proposition 5.2.32 finishes the proof of (b). Finally, consider arbitrary infinite finitely generated abelian groups A and B. Then A ≃ A1 × ℤ, B = B1 × ℤ and ℤ ≀ ℤ can be P-time embedded into A ≀ B. The result now follows from (b).
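For concreteness, the reduction of ZOE to SSP(ℤ^ω , E) from Proposition 5.2.31 can be generated mechanically. The sketch below is our own illustration; the brute-force checker is only there to sanity-check tiny instances and is of course exponential.

from itertools import product

def zoe_to_ssp(A):
    """Rows of the 0-1 matrix A become the elements g_i; the target g is (1, ..., 1)."""
    n = len(A)
    return [tuple(row) for row in A], tuple(1 for _ in range(n))

def ssp_bruteforce(gs, g):
    for bits in product((0, 1), repeat=len(gs)):
        if tuple(sum(b * v[j] for b, v in zip(bits, gs)) for j in range(len(g))) == g:
            return bits
    return None

if __name__ == "__main__":
    A = [[1, 1, 0],
         [0, 0, 1],
         [0, 1, 1]]
    gs, g = zoe_to_ssp(A)
    print(ssp_bruteforce(gs, g))   # (1, 1, 0): the first two rows cover every column once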
262 | 5 Discrete optimization in groups The Thompson group F has a finite presentation ⟨a, b | [ab−1 , a−1 ba] = 1, [ab−1 , a−2 ba2 ] = 1⟩. It is a remarkable group due to a collection of very unusual properties that made it a counterexample to many general conjectures in group theory (see [76]). Proposition 5.2.34. The subset sum problem for the Thompson group F is NP-complete. Proof. According to [86] the wreath product ℤ ≀ ℤ can be embedded into F with no distortion. The word problem for F is decidable in polynomial time [76, 455]. Now the result follows from Propositions 5.2.32 and 5.2.33. In [31] Baumslag gives an example of a finitely presented metabelian group GB = ⟨a, s, t | [a, at ] = 1, [s, t] = 1, as = aat ⟩. Proposition 5.2.35. SSP(GB) is NP-complete. Proof. As shown in [31] the subgroup ⟨a, t⟩ of the group GB is isomorphic to ℤ ≀ ℤ. Hence, ℤ ≀ ℤ embeds into GB and since ℤ ≀ ℤ is finitely generated this embedding is P-time computable. The word problem for GB is in P because GB is a finitely presented metabelian group. Thus, by Propositions 5.2.33 and 5.2.32, SSP(GB) is NP-complete. There are many examples of finitely presented metabelian groups with NP-complete subset sum problem. Indeed, Baumslag [32] and Remeslennikov [419] proved that every finitely generated metabelian group G embeds into a finitely presented metabelian group G∗ . Since G is finitely generated this embedding is P-time computable with respect to the given finite generating sets. Therefore, if G contains a P-time computably embedded subgroup ℤω , so does G∗ . 5.2.5.1 Lamplighter groups Another example in the same vein is the lamplighter group considered in [356]. There the authors organize reduction of the exact set cover problem (which is easily equivalent to ZOE) to SSP in the lamplighter group. 5.2.6 Mikhailova’s construction as a source of hardness The famous Mikhailova construction can also serve as a source of hardness (undecidability or NP-completeness) for the problems we consider. As a basic result, we make an observation regarding BSMP(F2 × F2 ). We have seen in Section 5.2.3.2 that the bounded submonoid problem is decidable in any hyperbolic group G in polynomial time. In this section we show that taking
a direct product does not preserve P-time decidability of BSMP unless P = NP. In fact, we prove a stronger result. We show that there exists a (fixed!) subgroup H = ⟨h1 , . . . , hk ⟩ in F2 × F2 with NP-complete bounded membership problem. The bounded membership problem (GWP) for a fixed subgroup H = ⟨h1 , . . . , hk ⟩ ≤ G: Given g ∈ G and unary 1n ∈ ℕ, decide if g can be expressed as a product of the ±1 ±1 form g = h±1 i1 hi2 ⋅ ⋅ ⋅ hil , where l ≤ n and 1 ≤ i1 , . . . , il ≤ k. Similar to Proposition 5.2.5, one can show that the complexity of the bounded membership problem does not depend on a finite generating set for G, and, hence, we can denote this problem as BGWP(G; h1 , . . . , hk ). Proposition 5.2.36. BGWP(G; h1 , . . . , hk ) is P-time reducible to BSMP(G). Therefore, if BGWP(G; h1 , . . . , hk ) is NP-complete and the word problem in G is in P, then BSMP(G) is NP-complete. −1 n Proof. If and only if (h1 , . . . , hk , h−1 1 , . . . , hk , g, 1 ) is a positive instance of BSMP(G), (g, 1n ) is a positive instance of BGWP(G; h1 , . . . , hk ).
Below we prove that there exists a subgroup H = ⟨h1 , . . . , hk ⟩ in F2 × F2 with NP-complete BGWP(F2 × F2 ; h1 , . . . , hk ). In our argument we employ the idea used by Olshanskii and Sapir in [398, Theorems 2 and 7] to investigate subgroup distortions in F2 × F2 . The argument follows Mikhailova’s construction of a subgroup of F2 × F2 with undecidable membership problem. We briefly outline that construction as described in [350]. Let G = ⟨X | R⟩ be a finitely presented group. We may assume that both sets X and R are symmetric, i. e., X = X −1 and R = R−1 . Define a set DG = {(r, 1) | r ∈ R} ∪ {(x, x−1 ) | x ∈ X} ⊂ F(X) × F(X).
(5.31)
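Assembling the set DG of (5.31) from a finite presentation is completely mechanical; the following is a small sketch of our own, with words written as tuples of letters and the inverse of a letter x written x^-1.

def mikhailova_generators(X, R):
    """Return D_G = {(r, 1) : r in R} u {(x, x^-1) : x in X} as pairs of words."""
    def inv(letter):
        return letter[:-3] if letter.endswith("^-1") else letter + "^-1"
    return [(tuple(r), ()) for r in R] + [((x,), (inv(x),)) for x in X]

if __name__ == "__main__":
    X = ["a", "a^-1", "b", "b^-1"]
    R = [["a", "b", "a^-1", "b^-1"]]          # presentation of Z x Z
    for pair in mikhailova_generators(X, R):
        print(pair)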
Let H be a subgroup of F(X) × F(X) generated by DG . Then for any w ∈ F(X) (w, 1) ∈ H ⇔ w = 1 in G. In more detail, the following lemma is true. Lemma 5.2.37 ([350]). Let w = w(X). If (w, 1) = (u1 , v1 )(u2 , v2 ) ⋅ ⋅ ⋅ (un , vn )
for some (ui , vi ) ∈ DG ,
then the word u1 . . . un is of the form w0 r1 w1 r2 w2 ⋅ ⋅ ⋅ wm−1 rm wm
for some wi ∈ F(X), ri ∈ R,
satisfying w0 w1 ⋅ ⋅ ⋅ wm =F(X) 1 and, hence,
w =_{F(X)} ∏_{i=1}^{m} (w_0 ⋅ ⋅ ⋅ w_{i−1} ) r_i (w_0 ⋅ ⋅ ⋅ w_{i−1} )^{−1} .
(5.32)
264 | 5 Discrete optimization in groups Moreover, by [398, Lemma 1] we may assume that |w0 | + ⋅ ⋅ ⋅ + |wm | ≤ 4E, where E is the number of edges in the minimal van Kampen diagram for w over ⟨X | R⟩. Denoting C = max{|r| : r ∈ R}, we get |w0 | + ⋅ ⋅ ⋅ + |wm | ≤ 4(Cm + |w|), so n ≤ m + 2(|w0 | + |w1 | + ⋅ ⋅ ⋅ + |wm |) ≤ m + 8(Cm + |w|),
(5.33)
i. e., the number of elements of DG necessary to represent (w, 1) is bounded by a polynomial (in fact, linear) function of |w| and m. Lemma 5.2.38. Let ⟨X | R⟩ be a finite presentation of a group G and let DG ⊂ F(X) × F(X) be the set given by (5.31). If the isoperimetric function for ⟨X | R⟩ is bounded by a polynomial p, then the word problem in G is P-time reducible to BGWP(F(X) × F(X); DG ). Proof. As above, let C = max{|r| : r ∈ R}. For an arbitrary w ∈ F(X) compute n = p(|w|) + 8(Cp(|w|) + |w|). Now it easily follows from Lemma 5.2.37 and inequality (5.33) that w = 1 in G if and only if ((w, 1), 1n ) is a positive instance of BGWP(F(X) × F(X); DG ). Theorem 5.2.39. There is a finitely generated subgroup H = ⟨h1 , . . . , hk ⟩ in F2 × F2 such that BGWP(F2 × F2 ; h1 , . . . , hk ) is NP-complete. Proof. It is shown in [437] that there exists a finitely presented group G with NP-complete word problem and polynomial Dehn function. Let DG = {h1 , . . . , hk } be a subset of F(X) × F(X) defined by (5.31). By Lemma 5.2.38, BGWP(F(X) × F(X); DG ) is NP-hard. Since F2 × F2 contains a subgroup isomorphic to F(X) × F(X), BGWP(F2 × F2 ; DG ) is also NP-hard. It is only left to note that the word problem in F2 × F2 is P-time decidable, so BGWP(F2 × F2 ; DG ) is NP-complete. Corollary 5.2.40. If G contains F2 × F2 as a subgroup, then there exists {h1 , . . . , hk } ⊆ G such that BGWP(G; h1 , . . . , hk ) and BSMP(G) are NP-hard. If, in addition, the word problem in G is P-time decidable, then BGWP(G; h1 , . . . , hk ) and BSMP(G) are NP-complete. Corollary 5.2.41. Linear groups GL(≥ 4, ℤ) and SL(≥ 4, ℤ), braid groups and graph groups whose graph contains an induced square C4 have NP-complete BGWP and BSMP. The same hardness results hold for AGWP, since BSMP P-time reduces to AGWP. To address SSP, we may note that one can observe that BSMP(G) P-time reduces to SSP(G × Z). It follows that SSP(F2 × F2 × ℤ) is NP-complete. However, Lohrey and Zetzsche note in [304] that the construction used to prove NP-completeness of BGWP(F2 × F2 ) and BSMP(F2 × F2 ) can also be used to prove NP-completeness of SSP(F2 × F2 ). Proposition 5.2.42 ([304, Theorem 25]). SSP(F2 ×F2 ) and KP(F2 ×F2 ) are NP-complete.
Proof. Consider the group G = ⟨X | R⟩ from the proof of Theorem 5.2.39 with NP-complete word problem and polynomial isoperimetric function p(n). Let DG = {g1 , . . . , gm } be the corresponding set defined by (5.31). It follows from Lemma 5.2.38 that
w =G 1 ⇔ ∃n ≤ q(|w|) such that (w, 1) ∈ D_G^n in F(X) × F(X),
where q(n) = p(n) + 8(Cp(n) + n) is a polynomial function in n. Now it immediately follows that the following three statements are equivalent for each w ∈ F(X):
– w =G 1;
– (w, 1) = ∏_{i=1}^{q(|w|)} g_1^{ε_{1i}} ⋅ ⋅ ⋅ g_m^{ε_{mi}} in F(X) × F(X) for some ε_{ij} ∈ {0, 1};
– (w, 1) = ∏_{i=1}^{q(|w|)} g_1^{ε_{1i}} ⋅ ⋅ ⋅ g_m^{ε_{mi}} in F(X) × F(X) for some ε_{ij} ∈ ℤ.
Hence, SSP(F2 × F2 ) and KP(F2 × F2 ) are NP-hard. Since WP(F2 × F2 ) ∈ P we have SSP(F2 × F2 ) ∈ NP. See Section 5.3.3 for the fact that KP(F2 × F2 ) ∈ NP. 5.2.7 Transfer results for SSP and related problems In this section we consider what we collectively call transfer results, i. e., what computational properties of SSP and related problems carry through under various grouptheoretic constructions. 5.2.7.1 AGWP and related problems in free products with finite amalgamation Our primary concern in this section is the complexity of the acyclic graph word problem (AGWP) in free products of groups. However, our approach easily generalizes to free products with amalgamation over a finite subgroup, and to a wider class of problems. Theorem 5.2.43. Let G, H be finitely generated groups, and let C be a finite group that embeds in G, H. Then AGWP(G ∗C H) is P-time Cook reducible to AGWP(G), AGWP(H). Proof. Let G be given by a generating set X, and H by Y. Let Γ = Γ0 be the given acyclic graph labeled by Σ = X ∪ X −1 ∪ Y ∪ Y −1 ∪ C (we assume that alphabets X ±1 , Y ±1 are disjoint from C). Given the graph Γk , k ∈ ℤ, construct graph Γk+1 by adding edges to Γk as follows. Consider Γk , the maximal subgraph of Γk labeled by X ∪ X −1 ∪ C (i. e., the graph obtained by removing all edges labeled by Y ∪ Y −1 ). For each c ∈ C and each pair of vertices v1 , v2 ∈ V(Γk ) from the same connected component of Γk , decide whether a word equal to c in G is readable as a label of an oriented path in Γk , using the solution to AGWP(G). For c ∈ C, let Ec be the set of pairs (v1 , v2 ) ∈ V(Γk ) × V(Γk ) such c
that the answer to the above question is positive, and there is no edge v1 → v2 in c Γk . Construct the graph Γk by adding edges v1 → v2 , (v1 , v2 ) ∈ Ec to the graph Γk ,
266 | 5 Discrete optimization in groups −1 ∪C for all c ∈ C. Now consider the maximal subgraph Γ k of Γk labeled by Y ∪ Y and perform a similar operation using the solution to AGWP(H), obtaining the graph Γk = Γk+1 . Since there are at most 2|C| ⋅ |V(Γ)|2 possible c-edges to be drawn, it follows that size(Γk+1 ) < 2|C|size(Γ)2 and that Γk = Γk+1 = ⋅ ⋅ ⋅ for some k = n, where n ≤ 2|C|size(Γ)2 . We claim that a word w equal to 1 in G ∗C H is readable from α to ω in Γ if ε
and only if there is an edge α → ω in the graph Γn . Indeed, suppose there is a path in Γ and, therefore, in Γn , from α to ω labeled by a word w = w1 w2 ⋅ ⋅ ⋅ wm , with wj ∈ Σ and at least one non-C letter among w1 , . . . , wm , such that w = 1 in G ∗C H. The normal form theorem for free products with amalgamated subgroup guarantees that w has a subword w = wi wi+1 ⋅ ⋅ ⋅ wj of letters in X ∪ X −1 ∪ C or Y ∪ Y −1 ∪ C with w = c ∈ C in G or H, respectively, with at least one non-C letter among wi , . . . , wj . Since Γn = Γn+1 , the word w1 ⋅ ⋅ ⋅ wi−1 cwj+1 ⋅ ⋅ ⋅ wm is readable as a label of a path in Γn from α to ω. By induction, a word c1 ⋅ ⋅ ⋅ cℓ , ℓ ≤ m, c1 , . . . , cℓ ∈ C, is readable as a label of an oriented path in Γn from α to ω. By the construction, ε
Γn+1 = Γn contains an edge α → ω. The converse direction of the claim is evident.
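The saturation in the proof of Theorem 5.2.43 can be phrased as a short fixpoint loop. The sketch below is our own; readable is a hypothetical oracle backed by solvers for AGWP(G) and AGWP(H), and edge labels are tagged by the factor ('G', 'H' or 'C') they come from.

def agwp_amalgam(edges, alpha, omega, C, readable):
    """Add c-edges prescribed by the factor oracles until a fixpoint is reached;
    report whether an eps-edge from alpha to omega appears."""
    edges = set(edges)
    vertices = {x for (u, _, v) in edges for x in (u, v)} | {alpha, omega}
    changed = True
    while changed:
        changed = False
        for tag in ('G', 'H'):
            sub = {e for e in edges if e[1][1] in (tag, 'C')}
            for u in vertices:
                for v in vertices:
                    for c in C:                 # C is finite, so this stays polynomial
                        e = (u, (c, 'C'), v)
                        if e not in edges and readable(sub, u, v, c, tag):
                            edges.add(e)
                            changed = True
    return (alpha, ('eps', 'C'), omega) in edges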
Corollary 5.2.44. Let G, H be finitely generated groups, and let C be a finite group that embeds in G, H. If AGWP(G), AGWP(H) ∈ P, then AGWP(G ∗C H) ∈ P. Corollary 5.2.45. SSP, BKP, BSMP and AGWP are polynomial-time decidable in any finite free product with finite amalgamations of finitely generated virtually nilpotent and hyperbolic groups. Theorem 5.2.43 can be generalized to apply in the following setting. We say that a family ℱ of finite directed graphs is progressive if it is closed under the following operations: 1. taking subgraphs, 2. adding shortcuts (i. e., drawing an edge from the origin of an oriented path to its terminus), and 3. appending a hanging oriented path by its origin. Two proper examples of such families are acyclic graphs and graphs stratifiable in a sense that vertices can be split into subsets V1 , . . . , Vn so that edges only lead from Vk to V≥k . We say that a subset of a group G is ℱ -rational if it can be given by a finite (nondeterministic) automaton in ℱ . Finally, by the ℱ -uniform rational subset membership problem for G we mean the problem of establishing, given an ℱ -rational subset of a group G and a word as an input, whether the subset contains an element equal to the given word in G. In this terminology, AGWP(G) is precisely the ℱacyc -uniform rational subset membership problem for G, where ℱacyc is the family of acyclic graphs.
Theorem 5.2.46. Let ℱ be a progressive family of directed graphs. Let G, H be finitely generated groups, and let C be a finite group that embeds in G, H. Then the ℱ -uniform rational subset membership problem for G ∗C H is P-time Cook reducible to that for G and H. Proof. Given an automaton Γ and a word w as an input, we start by forming an automaton Γw by appending a hanging path labeled by w−1 at every accepting state of Γ. The procedure then repeats that in the proof of Theorem 5.2.43, with obvious minor adjustments. In particular, the uniform rational subset membership problem for G ∗C H, with C finite, is P-time Cook reducible to that in the factors G and H (see decidability results in the case of graphs of groups with finite edge groups in [234]). Given that the uniform rational subset membership problem is P-time decidable for free abelian groups [295] and that P-time decidability of the same problem carries from a finite-index subgroup [191], the above theorem gives the following corollary. Corollary 5.2.47. Let G be a finite free product with finite amalgamations of finitely generated abelian groups. The rational subset membership problem in G is decidable in polynomial time. It remains to be seen whether SSP(G ∗ H) can be harder than SSP(G) and SSP(H). If that happens, then AGWP is harder than SSP in G or H. 5.2.7.2 SSP and related problems in direct products Results of Section 5.2.6 show that direct products may change the difficulty of SSP and AGWP, in contrast with results of Section 5.2.7.1, where we have seen that free products preserve the complexity of AGWP. Proposition 5.2.48. There exist groups G, H such that AGWP(G), AGWP(H) ∈ P, but AGWP(G × H) is NP-complete. Proof. By Theorem 5.2.39 [363, Theorem 7.4], BSMP(F2 × F2 ) is NP-complete. By Proposition 5.2.9 it follows that AGWP(F2 × F2 ) is NP-complete, while by Proposition 5.2.19 AGWP(F2 ) ∈ P. However, a direct product with a virtually nilpotent group does not increase the complexity of AGWP. Theorem 5.2.49. Let G be a finitely generated monoid and N a finitely generated virtually nilpotent group. Then AGWP(G) and AGWP(G × N) are P-time equivalent. Proof. We only have to show the reduction of AGWP(G × N) to AGWP(G), since the other direction is immediate by Proposition 5.2.10.
268 | 5 Discrete optimization in groups Since the complexity of AGWP in a given monoid does not depend on a finite generating set, we assume that the group G × N is generated by finitely many elements (g1 , 1N ), (g2 , 1N ), . . . and (1G , h1 ), (1G , h2 ), . . .. Consider an arbitrary instance ̂ of AGWP(G × N), where Γ = (V, E). Consider a graph Γ∗ = (V ∗ , E ∗ ), (Γ, α, ω, (g,̂ h)) ∗ where V = V × N and (g,h ) g E ∗ = { (v, h) → (v , hh ) for every (v, h) ∈ V ∗ and v → v ∈ E } , where (g, h ) denotes an element of G × N, with g ∈ G, h ∈ {1N , hi , h−1 i }. Let Γ = (V , E ) ∗ ∗ be the connected component of Γ containing (α, 1N ) ∈ V (or the subgraph of Γ∗ induced by all vertices in Γ∗ that can be reached from (α, 1N )). It is easy to see that
V ≤ |V| ⋅ B|E| , where B|E| is the ball of radius |E| in the Cayley graph of N relative to generators {hi }. Since the group N has polynomial growth [499] and a polynomial-time decidable word problem (in fact, real time by [218] or logspace by [317]), the graph Γ can be constructed in a straightforward way in polynomial time. Finally, it follows from the con̂ is a positive instance of AGWP(G × N) if and only if struction of Γ that (Γ, α, ω, (g,̂ h)) ̂ (Γ , (α, 1N ), (ω, h), g)̂ is a positive instance of AGWP(G). The above statement and Proposition 5.2.19 provide the following corollary. Corollary 5.2.50. AGWP(F2 × N) ∈ P for every finitely generated virtually nilpotent group N. A statement similar to Proposition 5.2.48 can be made about SSP with the help of the following claim. Proposition 5.2.51. Let G be a finitely generated monoid. 1. AGWP(G) is P-time reducible to SSP(G ∗ F2 ). 2. AGWP(G) is P-time reducible to SSP(G × F2 ). Proof. Let Γ be a given directed acyclic graph on n vertices with edges labeled by group words in a generating set X of the group G. We start by organizing a topological sorting on Γ, that is, enumerating vertices of Γ by symbols V1 through Vn so that if there is a path in Γ from Vi to Vj , then i ≤ j. This can be done in a time linear in size(Γ) by [231]. We assume α = V1 and ω = Vn , otherwise discarding unnecessary vertices. We perform a similar ordering of edges, i. e., we enumerate them by symbols E1 , . . . , Em so that if there is a path in Γ whose first edge is Ei and the last edge is Ej , then i ≤ j (considering the derivative graph of Γ we see that this can be done in time quadratic in size(Γ)). For each edge Ei , 1 ≤ i ≤ m, denote its label by ui , its origin by Vo(i) and its terminus by Vt(i) . We similarly assume that o(1) = 1 and t(m) = n.
Figure 5.6: Reduction of AGWP(G) to SSP(G ∗ F2 ) and SSP(G × F2 ). Dotted arrows illustrate the −1 correspondence between edges labeled by ui and elements gi = vo(i) ui vt(i) .
Next, we produce in polynomial time n freely independent elements v1 , . . . , vn of the free group F2 = ⟨x, y⟩, of which we think as labels of the corresponding vertices V1 , . . . , Vn . For example, vj = xj yxj , j = 1, . . . , n, suffice. We claim that −1 −1 −1 , g2 = vo(2) u2 vt(2) , . . . , gm = vo(m) um vt(m) ; g = v1 vn−1 g1 = vo(1) u1 vt(1)
is a positive instance of SSP(G∗F2 ) (or SSP(G×F2 )) if and only if Γ is a positive instance of AGWP(G) (see Figure 5.6 for an example). Indeed, suppose there is an edge path Ei1 , . . . , Eik from α = V1 to ω = Vn in Γ with ui1 ⋅ ⋅ ⋅ uik = 1 in G. Note that since the above sequence of edges is a path, for each 1 ≤ μ ≤ k − 1, we have Vt(μ) = Vo(μ+1) , so vt(μ) = vo(μ+1) . Then the same choice of elements gi1 , . . . , gik gives −1 −1 −1 ) gi1 ⋅ ⋅ ⋅ gik = (vo(i1 ) ui1 vt(i )(vo(i2 ) ui2 vt(i ) ⋅ ⋅ ⋅ (vo(ik ) uik vt(i 2) k) 1)
−1 −1 −1 = vo(i1 ) ui1 (vt(i v )ui2 (vt(i v ) ⋅ ⋅ ⋅ vo(ik ) uik vt(i 1 ) o(i2 ) 2 ) o(i3 ) k) −1 = vo(i1 ) ui1 ui2 ⋅ ⋅ ⋅ uim vt(i k)
−1 = vo(i1 ) vt(i = v1 vn−1 = g. k)
In the opposite direction, suppose in G ∗ F2 (or G × F2 ) the equality gi1 ⋅ ⋅ ⋅ gik = g,
i1 < ⋅ ⋅ ⋅ < ik ,
(5.34)
holds. Consider the F2 -component of this equality: −1 −1 −1 vo(i1 ) vt(i ⋅ vo(i2 ) vt(i ⋅ ⋅ ⋅ vo(ik ) vt(i = v1 vn−1 . 1) 2) k)
Since v1 , . . . , vn are freely independent, it is easy to see by induction on k that the latter equality is only possible if vo(i1 ) = v1 ,
vt(i1 ) = vo(i2 ) ,
vt(i2 ) = vo(i3 ) , . . . , vt(ik−1 ) = vo(ik ) ,
vt(ik ) = vn ,
i. e., edges Ei1 , Ei2 , . . . , Eik form a path from V1 to Vn in Γ. Furthermore, inspecting the G-component of equality (5.34), we get ui1 ui2 ⋅ ⋅ ⋅ uik = 1, as required in AGWP(G).
270 | 5 Discrete optimization in groups However, in the next proposition we organize reduction of BSMP(G) to SSP(G×ℤ), thus simplifying the “augmenting” group, which allows to make a slightly stronger statement about the complexity of SSP in direct products. Proposition 5.2.52. Let G be a finitely generated monoid. Then BSMP(G) is P-time Cook reducible to SSP(G × ℤ). Proof. The proof below is for the case of a group G. The case of a monoid G is treated in the same way with obvious adjustments.
Figure 5.7: Reduction of BSMP(G) to SSP(G × ℤ). Graph Γm .
Let w1 , w2 , . . . , wk , w, 1n be the input of BSMP(G). We construct graphs Γm , m = 1, . . . , n, with edges labeled by elements of G × ℤ as shown in Figure 5.7. Note that a path from α to ω is labeled by a word trivial in G × ℤ if and only if it passes through exactly m edges labeled by (wi1 , 1), . . . , (wim , 1) and wi1 ⋅ ⋅ ⋅ wim = w in G. Therefore, the tuple w1 , . . . , wk , 1n is a positive instance of BSMP(G) if and only if at least one of graphs Γ1 , . . . , Γn is a positive instance of SSP(G × ℤ). We put G = F2 × F2 in the above Proposition 5.2.52 to obtain the following result. Proposition 5.2.53. SSP(F2 × F2 × ℤ) is NP-complete. Proof. As we mentioned in the proof of Proposition 5.2.48, AGWP(F2 × F2 ) is NP-complete. By Proposition 5.2.52, the latter P-time Cook reduces to SSP((F2 × F2 ) × ℤ). Therefore, SSP(F2 × F2 × ℤ) is NP-complete. The latter proposition answers the question whether the direct product preserves polynomial-time SSP. Corollary 5.2.54. There exist finitely generated groups G, H such that SSP(G) ∈ P, SSP(H) ∈ P but SSP(G × H) is NP-complete. Proof. By Corollary 5.2.50, AGWP(F2 × ℤ) is in P. Therefore, SSP(F2 ) and SSP(F2 × ℤ) are in P, while SSP(F2 × (F2 × ℤ)) is NP-complete by the above result. Recall that Lohrey and Zetzsche [304] observed (see Proposition 5.2.42) that, in fact, F2 × F2 has NP-complete SSP.
5.3 Knapsack problem
| 271
Corollary 5.2.55. SSP is NP-complete in braid groups Bn , n ≥ 5, special linear groups SL(n, ℤ), n ≥ 4 and graph groups whose graph contains ◻ (also called C4 ) as an induced subgraph. Proof. Note that F2 × F2 embeds in a braid group Bn with n ≥ 5 by a result of Makanina [329]. The statement now follows from Propositions 5.2.42 and 5.2.10 since all of the listed groups contain F2 × F2 as a subgroup. 5.2.7.3 AGWP in finite-index subgroups Theorem 5.2.56 (Theorem 4.2 of [275]). Let H be a finite-index subgroup of the finitely generated group G (hence, H is finitely generated). Then AGWP(G) is logspace reducible to AGWP(H). Proof. The proof proceeds by a standard coset bruteforcing argument. Let {g0 = 1, g1 , . . . , gk } be a fixed set of coset representatives of H in G. For each monoid generator a of G and each index 0 ≤ i ≤ k, precompute 0 ≤ j ≤ k and w ∈ H such that gi a = wgj . For an acyclic graph Γ with a set of vertices V and edges labeled by monoid generators of G, we construct an acyclic graph Δ with the set of vertices V ×{g0 , g1 , . . . , gk } and a
edges labeled by elements of H as follows. Let v → u be an edge of Γ, where u, v ∈ V and a is a monoid generator of G. Let gi a = wgj with w ∈ H. In this event, we introduce an edge from (v, gi ) to (u, gj ) labeled by w. It is straightforward to see that 1 is readable in Γ from the vertex α to ω if and only if 1 is readable in Δ from (α, 1) to (ω, 1). Note that since, in the notation of the above theorem, H ≤ G, it immediately follows that AGWP(G) and AGWP(H) are logspace equivalent.
5.3 Knapsack problem 5.3.1 Definition Let G be a group generated by a finite set X = {x1 , . . . , xn } ⊆ G. Consider the following decision problem. The knapsack problem KP(G, X): Given g1 , . . . , gk , g ∈ G, decide if ε
ε
g =G g1 1 ⋅ ⋅ ⋅ gkk
(5.35)
for some nonnegative integers ε1 , . . . , εk . We may consider a generalized knapsack problem, where in addition to the elements g1 , . . . , gk , the input contains “coefficient” terms h0 , h1 , . . . , hk ∈ G. The problem asks to
272 | 5 Discrete optimization in groups decide if ε
ε
ε
h0 g1 1 h1 g2 2 ⋅ ⋅ ⋅ hk−1 gkk hk = 1 in G for some nonnegative integers ε1 , . . . , εk . It is easy to see by a standard conjugation trick that this problem is P-time equivalent to KP(G, X) (see Proposition 5.3.1 for details). We may also consider a variation of this problem, termed integer knapsack problem (IKP), when the coefficients εi are arbitrary integers. It is easy to see that IKP is P-time reducible to KP for any group G (see below in this section). Observe that the classic knapsack problem is, in our terminology, KP(ℤ, 1) or KP(ℤ, 2n ), depending on whether unary or binary notation is considered (see Example 5.2.1 for details about encoding a generating set 2n for ℤ). Note that by a standard argument, if X1 and X2 are two finite generating sets for a group G, the problems KP(G, X1 ) and KP(G, X2 ) are P-time (in fact, linear-time) equivalent. In this sense, the complexity of KP in a group G does not depend on the choice of a finite generating set. In the sequel we write KP(G) instead of KP(G, X), implying an arbitrary finite generating set. The same applies to other versions of the knapsack problem, like IKP. Proposition 5.3.1. Let G be a finitely generated group. 1. IKP(G) P-time Cook reduces to KP(G). 2. KP(G) and the generalized knapsack problem for G are P-time Cook-equivalent. Proof. To show the first statement, given an input g1 , . . . , gk , g ∈ G of IKP(G), it suffices to consider the input g1 , g1−1 , . . . , gk , gk−1 , g ∈ G, of KP(G). To show the second statement, it suffices to show the reduction of the generalized knapsack problem to KP(G). Let g1 , . . . , gk , h0 , h1 , . . . , hk ∈ G be an input of the generalized knapsack problem. Note that the tuple h−1
h−1 h−1 0
g1 0 , g2 1
h−1 ⋅⋅⋅h−1 0
, . . . , gk k−1
−1 −1 , h−1 k hk−1 ⋅ ⋅ ⋅ h1
is an equivalent input of KP(G). Analogues of Lemma 5.2.4 and Proposition 5.2.5 hold for KP and IKP as well. In particular, it follows that a solution to KP is inherited by a finitely generated subgroup. Proposition 5.3.2. Let H be a finitely generated subgroup of a finitely generated group G. Then KP(H) (respectively, IKP(H)) P-time Karp reduces to KP(G) (respectively, IKP(G)). We also mention that the knapsack problem and its versions can be stated in a monoid. Note that in this case there is no immediate reduction of the generalized versions to the versions “without coefficients.”
5.3 Knapsack problem
| 273
5.3.1.1 General notes on complexity of the problems Similar to SSP and related problems, the word problem in a group G reduces to KP(G). Indeed, for an element g it suffices to consider the input g1 = 1, g for KP(G). However, the converse is not true. There are groups where the word problem is decidable and the knapsack problem is not. In general, both SMP and KP can be undecidable in a group with decidable word problem; for instance, a group with decidable word problem, but undecidable membership in cyclic subgroups is constructed in [397]. Groups with undecidable KP are not necessarily quite so hand-crafted; for example, recent works show that the knapsack problem is undecidable in integer unitriangular groups of sufficiently large size [297], in sufficiently large direct powers of the Heisenberg group H3 (ℤ) [275] and, more broadly, in a wide family of nilpotent groups of class at least 2 [355]. The bounded version of KP is at least always decidable in groups where the word problem is as well. Recall that BKP is P-time Cook reduces to SSP(G), so we consider it in Section 5.2. Observe that KP(G) is a particular case of the rational subset membership problem in G. The inclusion is in general strict, since there are groups where KP(G) is decidable (in fact, P-time), but the subgroup membership problem is undecidable. For example, for a hyperbolic group G we have KP(G) ∈ P by Theorem 5.3.3, but there exist hyperbolic groups with undecidable subgroup membership problem [420].
5.3.2 Groups with semilinear solution to KP In this case we inspect the following phenomenon. If a group enjoys in a certain sense controlled cancelation (for example, hyperbolic groups or right-angled Artin groups), then for a given input of KP, the set of tuples (ε1 , . . . , εk ) that deliver equality (5.35) is semilinear. Depending on a particular group or a class of groups, this delivers an upper bound on the complexity of KP (such as membership in P or NP, or solvability). With this in mind, we say that a finitely generated group G is (effectively) knapsack semilinear if for every knapsack problem input, the set of all tuples (ε1 , . . . , εk ) that deliver equality (5.35) is (effectively) semilinear. Observe that this condition is equivalent to the semilinearity of solution sets of knapsack-type exponential equations with linear constraints: ε
ε
v0 u11 v1 ⋅ ⋅ ⋅ vk−1 ukk vk = 1,
L(ε1 , . . . , εk ) = 0,
where L is a given system of linear equations on ε1 , . . . , εk . As an example of such a constraint, we can, for example, impose εi = εj for some i, j. Below we show for which classes of groups such behavior has been established. Besides the results that follow in this section, semilinearity plays a crucial role in transfer results for wreath products (see Section 5.3.5.4).
274 | 5 Discrete optimization in groups 5.3.2.1 Hyperbolic groups Here we study the knapsack problem KP(G) in hyperbolic groups G relative to finite generating sets. We show the following theorem. Theorem 5.3.3. Let G be a hyperbolic group generated by finite set X. Then KP(G, X) ∈ P. Moreover, there exists a P-time algorithm which for any positive instance g1 , . . . , gk , g ∈ G of KP(G) computes a sequence of nonnegative integers ε1 , . . . , εk ε ε such that g1 1 ⋅ ⋅ ⋅ gkk = g in G. Let G be a hyperbolic group. The following result, which is of independent interest, P-time reduces KP(G) to BKP(G). This proves Theorem 5.3.3 because BKP(G) is P-time decidable by Corollary 5.2.20. Theorem 5.3.4. Let G be a hyperbolic group. There is a polynomial p(x) such that if for g1 , . . . , gk , g ∈ G, there exist integers ε1 , . . . , εk ∈ ℤ such that ε
ε
g = g1 1 ⋅ ⋅ ⋅ g k k , then there exist such integers ε1 , . . . , εk ∈ ℤ with max{|ε1 |, . . . , |εk |} ≤ p(|g1 | + ⋅ ⋅ ⋅ + |gk | + |g|). Proof. Let E be the maximum order of torsion elements in G (it is well-defined since a hyperbolic group has a finite number of conjugacy classes of finite subgroups, see [47] or [62]), or E = 1 if G is torsion-free. For every torsion element gi , 1 ≤ i ≤ k, we may assume that |εi | < E. Suppose now that among g1 , . . . , gk there is at least one element of infinite order. Fix a presentation ⟨X | R⟩ of G and denote |g1 |X + ⋅ ⋅ ⋅ + |gk |X + |g|X = n (here | ⋅ |X denotes the geodesic length with respect to X). Let gi1 , . . . , gim be the entirety of elements of infinite order among g1 , . . . , gk . For each infinite order gij , 1 ≤ j ≤ m, let hj , cj be such that gij = cj−1 hj cj and let hj be ε
ε
cyclically reduced. Note that |hj |X , |cj |X ≤ |gij |X ≤ n. Given a product g1 1 ⋅ ⋅ ⋅ gkk , denote blocks of powers of finite order elements as follows: εi +1
εij+1 −1 −1 c j+1 −1 j+1
j cj gi +1 ⋅ ⋅ ⋅ gi j
ε
εi
= bj+1
for 1 ≤ j ≤ m − 1, ε
1 b1 = g1 1 ⋅ ⋅ ⋅ gi −1 c1−1 ,
ε
bm+1 = cm gi im+1+1 ⋅ ⋅ ⋅ gkk .
−1
m
1
For convenience put εij = αj so that ε
ε
α
g1 1 ⋅ ⋅ ⋅ gkk = b1 h1 1 b2 ⋅ ⋅ ⋅ bm hαmm bm+1 . Note that |bi | ≤ n ⋅ nE + 2n ≤ 3nE+1 . Consider a (2m + 2)-gon with sides q1 p1 q2 ⋅ ⋅ ⋅ pm qm+1 r where: – qi , 1 ≤ i ≤ m + 1, is labeled by a geodesic word representing bi ,
5.3 Knapsack problem
– –
| 275
α
pi , 1 ≤ i ≤ m, is labeled by a (λ, ε)-quasigeodesic word representing hi i (according to Lemma 5.1.11), and r is labeled by a geodesic word representing g.
We will show that given a sufficiently large polynomial bound on M, if at least one |αj | > M, then some powers |αi | > M can be reduced while preserving the equality α α g = b1 h1 1 b2 ⋅ ⋅ ⋅ bm hmm bm+1 . Assume some |αj | ≥ M, with M to be chosen later. By Lemma 5.1.10, the side pj of the polygon belongs to a closed (H + H ln(2m + 2))-neighborhood of the union of the other sides, where H only depends on X, R, λ and ε. By Lemma 5.1.11, λ and ε, in turn, only depend on X, R. If two points pj (t1 ), pj (t2 ), t1 < t2 , are (H + H ln(2m + 2))-close to a side q (where q is one of sides pi , qi , r), then by Lemma 5.1.9 the subpath pj (t), t1 ≤ t ≤ t2 , asynchronously K2 = K1 (H + H ln(2m + 2))-fellow travels with a subpath of q. Therefore we may assume that pj is split into at most (2m+1) segments, so that each segment asynchronously K2 = K1 (H + H ln(2m + 2))-fellow travels with a segment of another side. By the pigeonhole principle, at least one segment of pj contains at least (M − 2m)/(2m + 1) ≥
M M −2≥ − 2 = M1 2m + 1 3n
(5.36)
copies of the word representing hj . Denote this segment of pj by p and its fellow traveler by s. Note that since pj is (λ, ε)-quasigeodesic, the geodesic length of s is at least λ−1 (M1 |h1 |X − 2K2 ) − ε.
(5.37)
We show below that given a sufficiently large lower bound on M, p can fellow travel neither with qi , nor with r. Choosing M > 3n(λ(ε + 3nE+1 ) + 2K2 + 2) = Q1 (n)
(5.38)
guarantees M1 > λ(ε + 3nE+1 ) + 2K2 , so by (5.37) the geodesic length of |s|X > 3nE+1 , which eliminates the possibility that s is a segment of qi , 1 ≤ i ≤ m + 1. Note that Q1 (n) in (5.38) is of degree E + 2 in n since K2 = K1 (H + H ln(2m + 2)) ≤ K1 (H + 3nH). The same bound (5.38) also prohibits fellow travel with r since geodesic length of r is at most n < 3nE+1 . From (5.38) we conclude that with M > Q1 (n) + E,
(5.39)
the only possibility is that p fellow travels with a segment of some pl , l ≠ j. By Lemma 5.1.12, there exists L (depending on X) such that if p K2 -fellow travels with a segment of pl and M1 > nLK2 , then hj and hl are commensurable and form a
276 | 5 Discrete optimization in groups
k
k
Figure 5.8: Removing rectangle hj 1 = d −1 hl 2 d. k
k
rectangle hj 1 = d−1 hl 2 d (see (5.7)) with k1 between 0 and αj , and k2 between 0 and αl . In that case, αj and αl can be replaced by (αj − k1 ) and (αl − k2 ), respectively, preserving α α the equality g = b1 h1 1 . . . hmm bm+1 (see Figure 5.8). Note that nLK2 = nLK1 (H+H ln(2m+2))
= LK1 H n(2m + 2)K1 H ln L
≤ LK1 H n(4n)K1 H ln L . Hence, M1 ≥ nLK2 is guaranteed by
M > 3n(LK1 H n(4n)K1 H ln L + 2) = Q2 (n),
(5.40)
which is of degree ≤ (2 + K1 H ln L) in n. Consider M = Q1 (n) + Q2 (n) + E
(5.41)
that satisfies inequalities (5.38)–(5.40). By the argument above, if some |εi | > M and gi is a torsion element, then εi can be replaced with εi , where |εi | < E < M. If some |εi | > M and gi is an infinite-order element, then εi and some εj can be replaced by εi and some εj , respectively, where |εi | < |εi | and |εj | < |εj |. Repeating this procedure, we eventually obtain that for every 1 ≤ i ≤ k, |εi | < M. It is only left to note that M in (5.41) is of degree max{E + 2, 2 + K1 H ln L} in n, where E, K1 , H, L depend only on the presentation ⟨X | R⟩. Remark 5.3.5. One can see from the above proof that a hyperbolic group G is knapsack semilinear. To establish that, we have to recall that αi and αj above were replaced by αi − k1 and αj − k2 , where k1 , k2 are bounded by a constant that depends only on the input and the presentation. This means there are finitely many possible values for k1 and k2 for each of the finitely many pairs (i, j). From that and geometric considerations it follows that the set of solutions is semilinear. This was done in detail in [299]. Moreover, it is shown there that the resulting semilinear set has a polynomially bounded description. That is, the following statement holds.
5.3 Knapsack problem
| 277
Theorem 5.3.6 ([299]). Let G be a hyperbolic group. There is a polynomial p(x) that depends only on a presentation of G with the following property. If the group elements g, g1 , . . . , gk are represented by words over the generators of G and the total length of n n these words is N, then the set {(n1 , . . . , nk ) | g = g1 1 ⋅ ⋅ ⋅ gk k } has a semilinear representation, where all vectors only contain integers of size at most p(N). In other words, this theorem states that hyperbolic groups are knapsack tame (see Definition 5.3.21 below). 5.3.2.2 Right-angled Artin groups The main result of [304] claims that the knapsack problem can be solved in NP for graph groups. Below, A denotes an alphabet, and I denotes the set of commuting pairs of letters. The pair (A, I) is called an independence alphabet (see Section 5.1.7 for details on graph groups). Recall that the complexity bound for the knapsack problem in hyperbolic groups (Theorem 5.3.4) is proved by showing that the set of powers of input elements that deliver the required equality is semilinear. The authors of [304] use semilinearity systematically to prove the following. Theorem 5.3.7 ([304, Theorem 14]). Let G = G(A, I), u1 , . . . , un ∈ G \ {1}, v0 , . . . , vn ∈ G and x1 , . . . , xn be variables (we may have xi = xj ) ranging over ℕ. Then the set of solutions of the exponent equation x
v0 u1 1 v1 . . . vn−1 uxnn vn = 1
(5.42)
is semilinear (see Definition 5.1.4). Moreover, if there is a solution, then there is a solution 2 with xi ∈ O((αn)! ⋅ 22α n(n+3) ⋅ μ8α(n+1) ⋅ ν8α|A|(n+1) ), where: – α ≤ |A| is the size of a largest clique of the complementary graph (A, I)c = (A, (A × A) \ I), 2 – μ ∈ O(|A|α ⋅ 22α n λα ), – ν ∈ O(λα ), – λ = max{|u1 |, . . . , |un |, |v0 |, . . . , |vn |}. Outline of Proof. The key observation in the proof is that the set of reduced partially commutative words representing {pux s | x ∈ ℕ}, where u does not split in a proper product of commuting factors, can be read in a nondeterministic finite-state automaton of size polynomial in maximal length of p, u, s [304, Lemma 10]. This allows to approach the statement in the case n = 2 (under the above assumption on u1 and u2 ) as follows [304, Lemma 11]. We view equality (5.42) as x
x
2 v0 u1 1 v1 = v2−1 (u−1 2 ) ,
278 | 5 Discrete optimization in groups so the language of partially commutative words that represent both left- and righthand sides can be obtained by standard intersection construction for finite-state automata. At this point, semilinearity and the required bounds follow by inspecting the resulting automaton. For higher values of n or for ui that split in a proper product of commuting factors, the statement is reduced to the above case n = 2 using the observation that the product in (5.42) can be split in at most exponentially (in n) many mutually canceling factors [304, Lemma 13] of the form pux s [304, Lemma 6]. The required bounds follow by careful inspection of the arising linear conditions [304, Lemma 12]. As a consequence, if the exponent equation (5.42) has a solution x1 , . . . , xn ∈ ℕ, then it has a solution that can be binary-encoded by numbers of polynomial length. Such a solution can be verified by a polynomial-time algorithm. The latter problem is an instance of the so-called compressed word problem for a graph group. This is the classical word problem, where the input group element is given succinctly by a socalled straight-line program (SLP), which is a context-free grammar that produces a single word (here, a word over the group generators and their inverses). An SLP with n productions in Chomsky normal form can produce a string of length not more than 2n . It has been shown in [300] that the compressed word problem for a graph group can be solved in polynomial time (see also [296] for more details). Corollary 5.3.8. KP(G(A, I)) ∈ NP. Furthermore, note that the bounds on the size of solution in Theorem 5.3.7 depend on the maximum length of input elements u1 , . . . , un , v0 , . . . , vn polynomially. If those elements are succinctly given by SLPs (instead of group words in the alphabet A), the dependence on the maximal length of the programs will be at most exponential. Therefore, the above results still apply to compressed exponent equations, as the following theorem states. Theorem 5.3.9 ([304, Theorem 15]). Let (A, I) be a fixed independence alphabet. Solvability of compressed exponent equations over the graph group G(A, I) is in NP. 5.3.2.3 Cocontext-free groups As another instance of systematic use of semilinearity, the authors of [275] show solvability of the knapsack problem in groups with context-free coword problem (such groups are called cocontext-free). Theorem 5.3.10 ([275, Theorem 8.1]). Every cocontext-free group G is effectively knapsack semilinear and has a decidable knapsack problem. Outline of proof. Let g1 , g2 , . . . , gk , g be the input elements of the knapsack problem. By standard language-theoretic considerations, since the coword problem is context-free,
| 279
5.3 Knapsack problem
the language x
x
x
x
M = {a1 1 ⋅ ⋅ ⋅ akk | g1 1 ⋅ ⋅ ⋅ gk k ≠ g} is cocontext-free (here, {a1 , . . . , ak } is a new alphabet separate from generators of G). By Parikh’s theorem [407], the set of tuples (x1 , . . . , xk ) that correspond to M is effectively semilinear, and therefore so is its complement [170], which is precisely the set of solutions to the given input. 5.3.2.4 Baumslag–Solitar groups Observe that the key ingredient in the proof of decidability (even P membership) of the knapsack problem in hyperbolic groups was Theorem 5.3.4. It essentially stated that if the powers ε are “too big,” then the resulting product is also “big,” unless there is commensurability involved (see [363, Theorem 6.8]), a big power type condition. In Baumslag–Solitar groups BS(p, q) = ⟨a, t | t −1 ap t = aq ⟩, ε
ε
p, q > 1, there is a similar phenomenon: two powers of elements g1 1 and g2 2 can have unbounded (in a certain sense) cancelation between them only if g1 , g2 are “nearly” powers of the same element. To elaborate, say for definiteness p > q and observe that by Britton’s lemma every element of BS(p, q) can be written in the following normal form: g = a0 t ε1 a1 ⋅ ⋅ ⋅ an−1 t εn an ,
(5.43)
where a0 ∈ ⟨a⟩. If εi = −1, then ai = ak with k < q, and if εi = 1, then ai = ak with k < p. The number n is called the t-length of g, denoted n = |g|t . Lemma 5.3.11 ([113]). Let the coprime p, q > 1 be fixed. There is an algorithm that, given g1 , g2 , . . . , gn ∈ BS(p, q) with none of g1 , . . . , gn conjugating into ⟨a⟩, computes constants M, C with the following property. If for some 1 ≤ i ≤ n−1 the length of t reduction between xi+1 x gi i and gi+1 , with xi , xi+i > 0, is not bounded by M, then x
x
i+1 gi i gi+1 = ray wz u,
where y, z ∈ ℤ, r, w, u ∈ BS(p, q) and |r|t , |w|t , |u|t ≤ C. This allowed Dudkin and Treier [113] to prove the following theorem. Theorem 5.3.12 ([113]). If m, n > 1 and they are coprime, then KP(BS(m, n)) is decidable.
280 | 5 Discrete optimization in groups 5.3.2.5 Nonexample: Heisenberg group In this section we show that Heisenberg groups have decidable KP but do not enjoy semilinear behavior of solution tuples. In general, in the case of a nilpotent group, the condition to satisfy in the knapsack problem leads to a system of polynomial equations over ℤ (see Proposition 5.3.13 below and more in Section 5.3.4). The undecidability of the latter in general has been exploited by König, Lohrey and Zetzsche [275] and Mishchenko and Treier [355] to show that the knapsack problem is undecidable in nilpotent groups (see Section 5.3.4 for details). However, in the case of Heisenberg group H3 (ℤ), the resulting system of equations consists of linear equations and one quadratic equation. Such systems are known to be decidable, which was observed by both of the above groups of authors. To observe the reduction to Diophantine equations, let X = {x1 , x2 , . . . , xn }, and let G = ⟨X⟩ be a free nilpotent group of class 2. The following identity holds for the group G: ∀x, y, z ∈ G
[x, [y, z]] = 1.
(5.44)
Using identity (5.44), the collection process in the group G is organized via the transformation yx = xy[x, y]−1 ,
(5.45)
where x, y are any elements of G. Using equality (5.45), we can reduce any word g in the alphabet X ∪ X −1 to the normal form for elements of the group G: α
g = x1 1 ⋅ ⋅ ⋅ xnαn ∏[xi , xj ]βij , i N 2 +
5.3 Knapsack problem
| 287
4N. We inspect the path traversed by the word w1 w2 . . . wm w0 in the Cayley graph of G ∗ H with respect to generators G ∪ H. Since this word corresponds to the trivial group element, the path must be a loop and thus the word wi splits in at most m pieces, each piece mutually canceling with a subword of wj , j ≠ i. Let wi = abni −2 c as in Lemma 5.3.32. Then by the pigeonhole principle at least one piece contains at least (ni − 2−m)/m ≥ N +1 copies of the word b. Observe that fj is nonsimple and nj ≥ 3; otherwise n ‖fj j ‖ ≤ 2N, so it cannot mutually cancel with bN+1 since ‖bN+1 ‖ ≥ 2N + 2. Let wj = n −2
a b j c . We note that ‖b‖ ≤ ‖fi ‖ ≤ N, ‖b ‖ ≤ ‖fj ‖ ≤ N, so ‖b ‖ ≤ N copies of b mutually cancel with ‖b ‖ ⋅ ‖b‖/‖b ‖ = ‖b‖ copies of b (up to a cyclic shift). Therefore replacing ni , nj with ni − ‖b ‖ and nj − ‖b‖, respectively, preserves equality (5.49). Iterating this process, we may find numbers n1 , . . . , nm that deliver equality (5.49) such that ni ≤ p(N) = N 2 + 4N whenever fi is nonsimple.
Similarly to Theorem 5.3.4, for free products it is possible to establish polynomial bounds on the exponents involved in a solution to an instance of KP, if there are such polynomial bounds for the factors. Specifically, we say that G is a polynomially bounded knapsack group if there is a polynomial q such that any instance (g1 , . . . , gk , g) of KP(G) is positive if and only if the instance (g1 , . . . , gk , g, q(N)) of BKP(G) is positive, where N is the total length of g1 , . . . , gk , g. It is easy to see that this notion is independent of the choice of a finite generating set for G. Proposition 5.3.34. If G, H are polynomially bounded knapsack groups, then G ∗ H is a polynomially bounded knapsack group. Proof. Let f1 , f2 , . . . , fm , f ∈ G ∗ H be an input of KP(G ∗ H). For each fi , the normal form n of fi i , ni ≥ 3, is given by Lemma 5.3.32 or by simplicity of fi . Suppose some n1 , n2 , . . . , nm n n n provide a solution to KP, i. e., f1 1 ⋅ ⋅ ⋅ fmm f −1 = 1 in G ∗ H. Representing f −1 and each fi i by their normal forms and combining like terms we obtain (without loss of generality) a product g1 h1 g2 h2 ⋅ ⋅ ⋅ gℓ hℓ = 1,
(5.50)
where each gi ∈ G and each hi ∈ H, and we may assume ℓ ≤ ‖f ‖ + m ⋅ (max{‖fi ‖}) ⋅ p(max{‖fi ‖}) by Proposition 5.3.33, so ℓ ≤ N 2 p(N), where N is the total length of the input (i. e., the sum of word lengths of f1 , . . . , fm , f ). Furthermore, since the product in (5.50) represents the trivial element, it can be reduced to 1 by a series of eliminations of trivial syllables and combining like terms: (1) (1) g1 h1 g2 h2 ⋅ ⋅ ⋅ gℓ hℓ = g1(1) h(1) 1 ⋅ ⋅ ⋅ gℓ hℓ
(2) (2) = g1(2) h(2) 1 ⋅ ⋅ ⋅ gℓ−1 hℓ−1
= ⋅ ⋅ ⋅ = g1(ℓ) h(ℓ) 1 = 1,
288 | 5 Discrete optimization in groups where the product labeled by (j) is obtained from the one labeled by (j − 1) by a single elimination of a trivial term and combining the two (cyclically) neighboring terms. Ob(j) serve that each gi is (up to a cyclic shift) a product of the form gα gα+1 ⋅ ⋅ ⋅ gβ ; similarly for hi . Therefore, each gi is of the form (j)
(j)
δ
δ
δk
gi = d1j1j d2j2j ⋅ ⋅ ⋅ dk ij,j , (j)
(5.51)
,j
ij
where each dμj , 1 ≤ μ ≤ kij , is either (NS) one of the syllables involved in Lemma 5.3.32, or the normal form of some fi or fi2 , or the “u” part of (5.48), in which case δμj = 1, or (S) the syllable f in (5.48) for some fν , in which case δμj = nν . On the one hand, the total amount of syllables dμj involved in (5.51) for a fixed j is k1j + k2j + ⋅ ⋅ ⋅ + kℓ−j+1,j . On the other hand, it cannot exceed k11 + k21 + ⋅ ⋅ ⋅ + kℓ1 since
g1 , g2 , . . . , gℓ−j+1 are obtained by eliminating and combining elements g1 , g2 , . . . , gl . Taking into account that each ki1 ≤ m + 1 ≤ N, we obtain k1j + k2j + ⋅ ⋅ ⋅ + kℓ−j+1,j ≤ ℓN ≤ N 3 p(N). (j) Now, given the equality gi = 1 for some i, j, if the option (S) holds for any 1 ≤ μ ≤ kij , then the right-hand side of the corresponding equality (5.51) can be represented (via a standard conjugation procedure) as a positive instance of KP(G) with input of length bounded by (Nkij )2 ≤ N 8 p2 (N). Since G is a polynomially bounded knapsack (j)
(j)
(j)
group, we may assume that every nν that occurs as some δμj in some gi is bounded (j)
by a polynomial pG (N). A similar argument holds for the H-syllables hi , resulting in a polynomial bound pH (N). It is only left to note that since for every 1 ≤ ν ≤ m, either fν is nonsimple and then nν is bounded by p(N), or it is simple and then nν is bounded by pG (N) or pH (N), so every nν is bounded by p(N) + pG (N) + pH (N). (j)
Theorem 5.3.35. If G, H are polynomially bounded knapsack groups such that AGWP(G), AGWP(H) ∈ P, then KP(G ∗ H) ∈ P. Proof. By Proposition 5.3.34 KP(G ∗ H) is P-time reducible to BKP(G ∗ H). In turn, the latter is P-time reducible to AGWP(G ∗H) by Proposition 5.2.9. Finally, AGWP(G ∗H) ∈ P by Corollary 5.2.44. Corollary 5.3.36. KP is polynomial-time decidable in free products of finitely generated abelian and hyperbolic groups in any finite number. Proof. By [363, Theorem 6.7], hyperbolic groups are polynomially bounded knapsack groups. By [55], so are finitely generated abelian groups. The statement follows by Theorem 5.3.35. With minimal adjustments the above also works for free products with finite amalgamation.
5.3 Knapsack problem
| 289
Lohrey and Zetzsche in [304] investigate the same behavior for the more general case of HNN-extensions with finite associated subgroups, with regard to membership of the knapsack problem in NP. They show the following result. Theorem 5.3.37 ([304], Theorem 24). Let H be an HNN-extension of the finitely generated group G with finite associated subgroups. If KP(G) ∈ NP, then KP(H) ∈ NP. Outline of proof. The proof amounts to organizing an effective version of Britton’s lemma. Let G be generated by a finite set X. Let A, B ≤ G be finite subgroups of G, with an isomorphism φ : A → B. Note that we may assume that A, B ⊆ X. Let H = ⟨X, t | t −1 at = φ(a), a ∈ A⟩. We will pass to the polynomially equivalent generalized knapsack problem, that is, given u0 , w1 , u1 , w2 , . . . , wk , uk , we will decide if x
x
u0 w1 1 u1 ⋅ ⋅ ⋅ uk−1 wkk uk = 1 in H for some x1 , . . . , xk ≥ 0. By a linear size change of input, we may assume that w1 , . . . , wk are cyclically Britton-reduced. Now we represent the input u0 , w1 , u1 , w2 , . . . , wk , uk of the generalized knapsack problem by an obvious automaton Γ labeled by X ±1 ∪ t ±1 (with empty labels allowed) accepting u0 w1∗ u1 ⋅ ⋅ ⋅ wk∗ uk . We then saturate it: wherever there is a subpath labeled t −1 at (a ∈ A) or tbt −1 , we add a parallel edge labeled φ(a) and φ−1 (b), respectively. Since there are only a fixed number of possible labels, this results in a polynomial inflation of size of the automaton. By Britton’s lemma, the accepted subset of H does not change. Furthermore, one can show that if the resulting automaton Γ accepts a word w, then it also accepts its Britton-reduced form. Therefore, to decide whether the trivial group element is accepted by Γ , it suffices to consider the automaton Γ obtained from Γ by removing all edges labeled by t ±1 . Finally, by inspecting Γ , one can see that any of its runs can be obtained from a finite list of inputs to the generalized knapsack problem of size bounded by the size of Γ . Since a free product with amalgamation embeds in an HNN-extension (see [313]), it follows that a similar result holds for free products with amalgamation over a finite subgroup. Theorem 5.3.38 ([304], Theorem 22). Let G0 and G1 be two finitely generated groups with a common finite subgroup F. If KP(G0 ), KP(G1 ) ∈ NP, then KP(G0 ∗F G1 ) ∈ NP. Outline of proof. Since G0 ∗F G1 embeds in an HNN-extension of G0 ∗G1 with associated subgroups isomorphic to F, it follows that it suffices to prove the theorem in the case of G0 ∗ G1 . The latter can be done by a modification of the argument presented in Theorem 5.3.34 or outlined in Theorem 5.3.37.
290 | 5 Discrete optimization in groups 5.3.5.2 Direct products As we have already seen in Section 5.3.2.5, the knapsack problem is decidable in the Heisenberg group H3 (ℤ). On the other hand, it is not decidable in a sufficiently long direct product H3 (ℤ)d by Theorem 5.3.27. It follows that a direct product does not preserve decidability of the knapsack problem. Theorem 5.3.39 ([275]). There exist groups G, H with decidable knapsack problem such that KP(G × H) is not decidable. Note that, in fact, the above theorem is delivered by G = H = H3 (ℤ)d0 with some 1 ≤ d0 ≤ 322/2 = 161. 5.3.5.3 Finite-index subgroups It is not hard to see that both decidability of KP and its membership of NP carry from a finite-index subgroup to the group. The former was observed in [275] and the latter in [304]. Both statements are proved by bruteforcing the finitely many cosets in which the input elements fall. Theorem 5.3.40 ([275, 304]). Let H be a finite-index subgroup of a finitely generated group G. Then 1. KP(H) is decidable if and only if KP(G) is decidable. 2. KP(H) ∈ NP if and only if KP(G) ∈ NP. 5.3.5.4 Wreath products Consider a wreath product G ≀ H of G and H (see Section 5.1.9). The main question of this section is decidability of KP(G ≀ H). If H is finite, then G ≀ H is a finite extension of G|H| (see [302, Proposition 1]). By Theorem 5.3.40 it follows that KP(G ≀ H) is decidable if and only if KP(G|H| ) is decidable. Hence, we assume that H is infinite. Proposition 5.3.41 ([152, Proposition 5.1]). Suppose H is infinite. If KP(G ≀ H) is decidable, then KP(H) and KP(G∗ ) are decidable. Proof. H and Gm (for every m ∈ ℕ) are subgroups of G ≀ H and, hence, inherit decidability of the knapsack problem. By Theorem 5.3.27, the knapsack problem is undecidable for the direct product of sufficiently many copies of the Heisenberg group H3 (ℤ). As a corollary of Proposition 5.3.41, we get that KP(H3 (ℤ) ≀ ℤ) is undecidable. Furthermore, the converse of Proposition 5.3.41 is not true: – By [275, Theorem 6.8], for every l ∈ ℕ, KP(H3 (ℤ) × ℤl ) is decidable. – By [152, Theorem 5.3], there exists l ∈ ℕ such that for every group G ≠ 1, KP(G ≀ (H3 (ℤ) × ℤl )) is undecidable.
5.4 Post correspondence problem
| 291
Therefore, for G = ℤ and H = H3 (ℤ) × ℤl (for an appropriate value l) KP(G∗ ), KP(H) are decidable and G ≀ H is undecidable. In order to show decidability of KP(G ≀ H) one can strengthen the assumptions on H. Recall that we say that a finitely generated group H is (effectively) knapsack semilinear if for every knapsack equation, the set of all solutions (a solution can be seen as a vector of natural numbers) is (effectively) semilinear. By adding the assumption of knapsack semilinearity for H, we obtain a partial converse to [152, Proposition 5.1]. Theorem 5.3.42 ([152, Theorem 5.4]). Let H be effectively knapsack semilinear. Then KP(G ≀ H) is decidable if and only if KP(G∗ ) is decidable. In fact, if both G and H are knapsack semilinear, then using semilinear representations for KP(G) and KP(H) one can construct a semilinear representation of the solution set for KP(G ≀ H). Therefore, we get the following theorem. Theorem 5.3.43 ([152, Theorem 5.5]). The group G≀H is effectively knapsack semilinear if and only if both G and H are effectively knapsack semilinear. Since every free abelian group is clearly knapsack semilinear, it follows that the iterated wreath products G1,r = ℤr and Gd+1,r = ℤr ≀Gd,r are knapsack semilinear. By the well-known Magnus embedding, the free solvable group Sd,r embeds into Gd,r . Hence, we get the following corollary. Corollary 5.3.44 ([152, Corollary 5.6]). Every free solvable group is knapsack semilinear. Hence, solvability of exponent equations is decidable for free solvable groups. Theorem 5.3.45 ([152, Theorem 5.7]). For every nontrivial abelian group G, KP(G ≀ ℤ) is NP-complete.
5.4 Post correspondence problem 5.4.1 Connections of PCP to group theory 5.4.1.1 PCPn and the equalizer problem Let as above G be a fixed arbitrary group with a finite generating set A, and let Fn = F(x1 , . . . , xn ) be the free group with basis X = {x1 , . . . , xn }. An n-tuple of elements g = (g1 , . . . , gn ) ∈ Gn gives a homomorphism ϕg : Fn → G, where ϕg (x1 ) = g1 , . . . , ϕg (xn ) = gn , and vice versa, every homomorphism Fn → G gives a tuple as above. In this sense each instance (u1 , v1 ), . . . , (un , vn ) of PCP(G) can be uniquely described by a pair of homomorphisms ϕu , ϕv : Fn → G, where u = (u1 , . . . , un ), v = (v1 , . . . , vn ). In this case we refer to such a pair of homomorphisms as an instance of PCP in G.
292 | 5 Discrete optimization in groups Now given groups H, G and two homomorphisms ϕ, ψ ∈ Hom(H, G) one can define the equalizer E(ϕ, ψ) of ϕ, ψ as E(ϕ, ψ) = {w ∈ H | wϕ = wψ },
(5.52)
which is obviously a subgroup of H. If G does not have nontrivial identities, then all nontrivial words from E(ϕ, ψ) give all solutions to PCP in G for a given instance ϕ, ψ ∈ Hom(Fn , G). However, if G has nontrivial identities, then some words from E(ϕ, ψ) are identities which are not solutions to PCP(G). To accommodate all the cases at once we suggest to replace the free group Fn above by the free group FG,n in the variety Var(G) of rank n with basis {x1 , . . . , xn }. Then similar to the above every tuple u ∈ Gn gives rise to a homomorphism ϕu : FG,n → G, where ϕ(x1 ) = u1 , . . . , ϕ(xn ) = un , and nontrivial elements of the equalizer E(ϕu , ϕv ) describe all solutions of PCP(G) for the instance u, v ∈ Gn . This connects PCPn in G with the equalizers of homomorphisms from Hom(FG,n , G). There are two general algorithmic problems in groups concerning equalizers. The triviality of the equalizer problem (TEP(H, G)) for groups H, G: Given two homomorphisms ϕ, ψ ∈ Hom(H, G), decide if the subgroup E(ϕ, ψ) in H is trivial. The equalizer problem (EP(H, G)) for groups H, G: Given two homomorphisms ϕ, ψ ∈ Hom(H, G), find the equalizer EP(H, G). In particular, if EP(H, G) is finitely generated, then find a finite generating set of E(ϕ, ψ). The formulation above needs some explanation on how we mean “to find” a subgroup in a group. If the subgroup is finitely generated, then “to find” usually means to list a finite set of generators. It might happen that the subgroup is not finitely generated, but allows a finite set of generators as a normal subgroup, or as a module under some action. In this case to solve EP(H, G) one has to list a finite set of these generators of EP(H, G). In this section we consider equalizers of homomorphisms of finitely generated nilpotent groups, so in this event they are finitely generated and the problem of describing equalizers becomes well stated. Equalizers E(ϕ, ψ) were studied before, but mostly in the case when H = G and ϕ, ψ are automorphisms of G. There are few results on equalizers of endomorphisms in groups. Goldstein and Turner have proved in [173] that the equalizer of two endomorphisms of Fn is a finitely generated subgroup in the case one of the two maps is injective. However, it is not known whether there is an algorithm to decide if the equalizer of two endomorphisms in a free group Fn is trivial. Ciobanu, Martino and Ventura showed that generically equalizers of endomorphisms in free groups are trivial [85], so on most inputs in a free non-abelian group F PCP(F) does not have a solution, and in this sense PCP(F) is generically decidable. We summarize the discussion above in the following easy lemma.
5.4 Post correspondence problem
| 293
Lemma 5.4.1. Let G be a group. Then the following holds for any natural n > 0: (1) PCPn (G) is equivalent (being just a reformulation) to TEP for homomorphisms from Hom(FG,n , G). (2) Finding all solutions for a given instance of PCPn (G) is equivalent (being just a reformulation) to EP(FG,n , G) for the same instance. 5.4.1.2 GPCP and the double twisted conjugacy Let ϕ, ψ be two fixed automorphisms of a group G. Two elements u, v ∈ G are termed (ϕ, ψ)-double twisted conjugate if there is an element w ∈ G such that uwϕ = wψ v. In particular, when ψ = 1, then u and v are called ϕ-twisted conjugate, while in the case ϕ = ψ = 1, u and v are just usual conjugates of each other. The twisted (or double twisted) conjugacy problem in G is to decide whether two given elements u, v ∈ G are twisted (double twisted) conjugate in G for a fixed pair of automorphisms ϕ, ψ ∈ Aut(G). Observe that, since ψ has an inverse, the (ϕ, ψ)-double twisted conjugacy problem reduces to the ϕψ−1 -twisted conjugacy problem, so in the case of automorphisms it is sufficient to consider only the twisted conjugacy problem. This problem is much studied in groups; we refer to [48, 49, 488, 430, 431, 131, 129, 130] for some recent results. Much stronger versions of the above problems appear when one replaces automorphisms by arbitrary endomorphisms ϕ, ψ ∈ End(G). Not much is known about double twisted conjugacy problem in groups with respect to endomorphisms. The next statement (which follows from the discussion above) relates the double twisted conjugacy problem for endomorphisms to the nonhomogeneous Post correspondence problem. Proposition 5.4.2. Let G be a group generated by a finite set A = {a1 , . . . , an }. Then the following holds: (1) The double twisted conjugacy problem for endomorphisms in G is linear-time reducible to GPCPn (G). (2) If G is relatively free with basis A, then the double twisted conjugacy problem for endomorphisms in G is linear-time equivalent to GPCPn (G).
5.4.2 Hereditary word problem and GPCP It is easy to see that decidability of PCPn or GPCPn in a group G has some implications for the word problem in G. Indeed, an element g is equal to 1 in G if and only if GPCP1 is solvable in G for the instance consisting of a single pair (1, 1) and the constant g. Similarly, if G is torsion-free, then g = 1 in G if and only if PCP is solvable in G for the instance pair (g, 1). In this section we show that the whole lot of word problems in the quotients of G is reducible to GPCP in G.
294 | 5 Discrete optimization in groups Let G be a group generated by a finite set A. For a subset R ⊆ G by ⟨R⟩G we denote the normal closure of R in G. The hereditary word problem (HWP(G)) in G: Given a finite set R of words in the alphabet A ∪ A−1 , decide whether w is trivial in the quotient G/⟨R⟩G . Note that this problem can also be stated as the uniform membership problem to normal finitely generated subgroups of G. Observe also that HWP(G) requires a uniform algorithm for the word problems in the quotients G/⟨R⟩G . It seems that groups with decidable HWP are rare. Note that the hereditary word problem is decidable in finitely generated abelian or nilpotent groups. Proposition 5.4.3. Let G be a finitely generated group. Then the hereditary word problem in G P-time reduces to GPCP(G). Proof. Let A be a finite generating set of G. Suppose R is a finite set of elements of G, represented by words in A ∪ A−1 . Denote H = G/⟨R⟩G . Put DR = {(a, a−1 ) | a ∈ A} ∪ {(a−1 , a) | a ∈ A} ∪ {(r, 1) | r ∈ R} ∪ {(r −1 , 1) | r ∈ R}. Claim 1. Let w be a word w ∈ (A ∪ A−1 )∗ . Then w =H 1 if and only if there is a finite sequence of pairs (u1 , v1 ), . . . , (un , vn ) ∈ DR such that vn (⋅ ⋅ ⋅ (v2 (v1 wu1 )u2 ) ⋅ ⋅ ⋅)un =G 1.
(5.53)
Indeed, if (5.53) holds, then −1 −1 −1 w =G v1−1 . . . vn−1 (vn−1 u−1 n )un−1 . . . u1 =H 1,
since for every pair (u, v) ∈ DR one has uv = 1 in H. To show the converse, suppose w =H 1, i. e., w ∈ ⟨R⟩G . In this case w =G w1 r1 w2 . . . wm rm wm+1 ,
(5.54)
with ri ∈ R±1 , wi ∈ A∗ and w1 w2 . . . wm+1 =G 1. Rewriting (5.54) one gets r1−1 ⋅ w1−1 ⋅ w ⋅ w1 ⋅ 1 =G w2 r2 w3 . . . wm rm wm+1 w1 .
(5.55)
Note that the product on the left is in the form required in (5.53), and the product on the right is in the form required in (5.54). Now the result follows by induction on m. This proves the claim. Claim 2. Let R ⊆ (A ∪ A−1 )∗ be a finite set and w ∈ (A ∪ A−1 )∗ . Then GPCP(G) has a solution for the instance D̂ R = {(u, v−1 ) | (u, v) ∈ DR } with constant w if and only if w = 1 in H.
5.4 Post correspondence problem
| 295
Indeed, a sequence −1 (u1 , v1−1 ), . . . , (uM , vM ) ∈ D̂ R
(5.56)
gives a solution to GPCP(G) for the instance D̂ R with constant w if and only if −1 wu1 u2 ⋅ ⋅ ⋅ uM =G v1−1 v2−1 ⋅ ⋅ ⋅ vM ⇐⇒ vM (⋅ ⋅ ⋅ (v2 (v1 wu1 )u2 ) ⋅ ⋅ ⋅)uM =G 1,
which, by the claim above, is equivalent to w =H 1. This proves Claim 2 together with the proposition. Corollary 5.4.4. Let F be a free non-abelian group of finite rank. Then GPCP(F) is undecidable. Proof. It is known [353] that for any natural number n ≥ 2 there are finitely presented groups with n generators and undecidable word problem. Therefore, HWP(F) is undecidable. By Proposition 5.4.3 GPCP(F) is also undecidable. For a finite group presentation P = ⟨a1 , . . . , ak | r1 , . . . , rℓ ⟩ denote by N(P) = k + ℓ the total sum of the number of generators and relators in P. Let N be the least number N(P) among all finite presentations P with undecidable word problem. In [54] Borisov constructed a finitely presented group with four generators and 12 relations which has undecidable word problem. Furthermore, by Miller’s refinement of the Higman– Neumann–Neumann theorem (see [353, Corollary 3.7],) this group embeds in a twogenerated group with 12 relations, so N ≤ 2 + 12 = 14. Corollary 5.4.5. Let Fn be a free group of rank n ≥ 28. Then the endomorphism double twisted conjugacy problem in Fn (as well as GPCPn (Fn )) is undecidable. Proof. Let P0 = ⟨a1 , a2 | r1 , . . . , r12 ⟩ be the above presentation with undecidable word problem and Fn = ⟨a1 , . . . , an ⟩ a free group of rank n ≥ 28. Claim 2 in the proof of Proposition 5.4.3 shows that the word problem in the group H defined by the presentation P0 is polynomial-time reducible to GPCPn (Fn ), and hence the latter one is undecidable. Now part 2 in Proposition 5.4.2 shows that the endomorphism double twisted conjugacy problem in Fn is also undecidable, as claimed. Note that the automorphism twisted conjugacy problem is decidable in free groups [48] (see also [399, 67]). Together with Corollary 5.4.5, this gives the following result. Corollary 5.4.6. Free groups of rank at least 28 have decidable twisted conjugacy problem but undecidable endomorphism double twisted conjugacy problem. Remark 5.4.7. Note that for a given group, decidability of the endomorphism double twisted conjugacy problem implies decidability of the twisted conjugacy problem, which in turn implies decidability of the conjugacy problem. It was shown in [49] that the converse to the latter implication is in general false. The above Corollary 5.4.6 answers Ventura’s question whether the converse to the former implication is true.
296 | 5 Discrete optimization in groups Similar results hold for free solvable groups. Let Nsol be the least number N(P) among all finite presentations P which define a solvable group with undecidable word problem. In [256] Kharlampovich constructs a finitely presented solvable group with undecidable word problem, so such number Nsol exists. Corollary 5.4.8. Let Sm,n be a free solvable non-abelian group of class m ≥ 3 and rank n ≥ Nsol . Then the endomorphism double twisted conjugacy problem in Sm,n (as well as GPCPn (Sm,n )) is undecidable. Proof. Similar to the argument in Corollary 5.4.5. Observe that it immediately follows from definitions that decidability of PCP or GPCP in a finitely generated group is inherited by all finitely generated subgroup of G. Therefore, the above results give a host of groups with undecidable GPCP (as well as GPCPn ). Corollary 5.4.9. If a group G contains a free non-abelian subgroup F2 , then GPCP(G) is undecidable. Therefore, GPCP is undecidable, for example, in nonelementary hyperbolic groups, non-abelian right-angled Artin groups, groups with nontrivial splittings into free products with amalgamation or HNN extensions, braid groups Bn , nonvirtually solvable linear groups, etc. Another corollary of the results above is concerned with the complexity of the bounded GPCP in groups. Corollary 5.4.10. Let F be a non-abelian free group of finite rank. Then the bounded GPCP(F) is NP-complete. Proof. Let F = F(A) be a free non-abelian group with a finite basis A. It is shown in [437, Corollary 1.1] that there exists a finitely presented group H = ⟨B | R⟩ with NP-complete word problem and polynomial Dehn function δH (n). Passing to a subgroup of F(A), we may assume that A = B. One can see that in the case of a free group G = F(A), M in (5.56) is bounded by a polynomial (in fact, linear) function of |w| and the number m of relators in (5.54) (see [398, Lemma 1] for details). Note that there exists m as above bounded by δH (|w|), so M is bounded by some polynomial q(|w|). Therefore, the map w → (w, DR , M = q(|w|)) is a P-time reduction of the word problem in H to the bounded GPCP(F(A)). It follows that the latter is NP-hard and therefore NP-complete (since the word problem in F(A) is P-time decidable). Corollary 5.4.11. If a group G contains a free non-abelian subgroup F2 , then the bounded GPCP(G) is NP-hard.
5.4 Post correspondence problem
| 297
5.4.3 PCP in nilpotent groups In this section we study the complexity of Post correspondence problems in nilpotent groups. Proposition 5.4.12. There is a polynomial-time algorithm that given finite presentations of groups A, B in the class of abelian groups and a homomorphism ϕ : A → B computes a finite set of generators of the kernel of ϕ. Proof. Results of [235] provide a polynomial-time algorithm to bring an integer matrix to its canonical diagonal (Smith) normal form. Since computing the canonical presentation of a finitely presented abelian group reduces by a standard argument to finding Smith form of an integer matrix (determined by relators in a given presentation), we may find in polynomial time the canonical presentation of B, i. e., a direct decomposition B = ℤl × K, where K is a finite abelian group. Once B is in its canonical form, computing kernel of ϕ reduces to solving a system of linear equations in Z l and K, which can be done in polynomial time by the same results [235]. Corollary 5.4.13. There is a polynomial-time algorithm that given finite presentations of groups A, B in the class of abelian groups and homomorphisms ϕ, ψ ∈ Hom(A, B) computes a finite set of generators of the equalizer E(ϕ, ψ). Proof. Observe that the map ξ : A → B defined by ξ (g) = ϕ(g)ψ(g)−1 is a homomorphism from A to B and E(ϕ, ψ) = ker ξ . Now the result follows from Proposition 5.4.12. One can slightly strengthen the corollaries above. Corollary 5.4.14. Let c be a fixed positive integer. (1) There is a polynomial-time algorithm that given a finite presentation of a group A, a finite presentation of a group B in the class of abelian groups and a homomorphism ϕ ∈ Hom(A, B) computes a finite set of generators of the kernel ker ϕ modulo the commutator [A, A]. (2) There is a polynomial-time algorithm that given a finite presentation of a group A, a finite presentation of a group B in the class of abelian groups and homomorphisms ϕ, ψ ∈ Hom(A, B) computes a finite set of generators of the equalizer E(ϕ, ψ) modulo the commutator [A, A]. Proof. It follows immediately from Proposition 5.4.12 and Corollary 5.4.13. By γc (G) we denote the c-th term of the lower central series of G. Recall that the iterated commutator of elements g1 , . . . , gc is [g1 , g2 , . . . , gc ] = [. . . [[g1 , g2 ], g3 ], . . .]. The following lemma is well known (for example, see [250, Lemma 17.2.1]). Lemma 5.4.15. Let G be a group generated by elements x1 , . . . , xn ∈ G. Then γc (G) is generated as a subgroup by γc+1 (G) and iterated commutators [xi1 , . . . , xic ].
298 | 5 Discrete optimization in groups Lemma 5.4.16. Let c0 be a fixed positive integer. There is a polynomial-time algorithm that given a finite group presentation of a group G in the class of nilpotent groups of class ≤ c0 finds subgroup generators of [G, G]. Proof. It follows from Lemma 5.4.15 by an inductive construction since there are at most nc0 +1 iterated commutators [xi1 , . . . , xic ], c ≤ c0 , in a group generated by n ≥ 2 elements x1 , . . . , xn (the case n = 1 is obvious). Theorem 5.4.17. Let c0 be a fixed positive integer. Then there is a polynomial-time algorithm that given positive integers cH , cG ≤ c0 , finite presentations of groups H, G in the classes of nilpotent groups of class cH and cG , respectively, and homomorphisms ϕ, ψ ∈ Hom(H, G) computes a generating set of the equalizer E(H, ϕ, ψ) as a subgroup of H. Proof. Let Y and Z be finite generating sets of H and G, respectively. We use induction on the nilpotency class c = cG of G. If c = 1, then G is abelian and the result follows from Corollary 5.4.14, item 2. Suppose now that c > 1 and we are given ϕ, ψ ∈ Hom(H, G). Consider the quotient group Ḡ = G/γc (G), which is a nilpotent group of class c − 1. The homomorphisms ̄ Observe that the size of ϕ , ϕ, ψ induce some homomorphisms ϕ , ψ ∈ Hom(H, G). ψ (the total length of the images ϕ (y), ψ (y), y ∈ Y as words in Z) is the same as that of ϕ, ψ. Also observe that Ḡ is described in the class of nilpotent groups of class c − 1 by the same presentation that describes G in the class of nilpotent groups of class c. By induction we can compute in polynomial time a finite generating set, say, L = {h1 , . . . , hk }, of E = E(H, ϕ , ψ ) as a subgroup of H. By construction, for g ∈ E one has ϕ(g) = ψ(g) mod γc (G), and hence the map ξ (g) = ϕ(g)ψ(g)−1 defines a homomorphism ξ : E → γc (G). Obviously, E(ϕ, ψ) = ker ξ . Furthermore, note that the size of L is polynomial in terms of the size of the input, and the size of a generating set for γc (G) is polynomial (of degree that depends on c) in terms of the size of a generating set for G by Lemma 5.4.15. Now the result follows from Corollary 5.4.14, item 1, since γc (G) is abelian, and Lemma 5.4.16. Theorem 5.4.18. Let c be a fixed positive integer. (1) Let G be a finitely generated nilpotent group of class c. Then for any ϕ, ψ ∈ Hom(Fn , G) the subgroup E(ϕ, ψ) ≤ Fn contains γc+1 (Fn ) and is finitely generated modulo γc+1 (Fn ). (2) There is a polynomial-time algorithm that given a positive integer n, a presentation of a group G in the class of nilpotent groups of class c and homomorphisms ϕ, ψ ∈ Hom(Fn , G) computes a finite set of generators of E(ϕ, ψ) in Fn modulo the subgroup γc+1 (Fn ). Proof. Let Fn = Fn (X), where X = {x1 , . . . , xn }. Fix two homomorphisms ϕ, ψ ∈ Hom(Fn , G). Since G is nilpotent of class c one has γc+1 (G) = 1, so E(ϕ, ψ) ≥ γc+1 (Fn ). The quotient Nn,c = Fn /γc+1 (Fn ) is the finitely generated free nilpotent group of rank n
5.4 Post correspondence problem
| 299
and class c, and hence every subgroup, in particular the image Ē of E(ϕ, ψ), is finitely generated. It follows that the group E(ϕ, ψ) is finitely generated modulo γc+1 (Fn ). This proves (1). Note that the above argument allows one to reduce everything to the case of nilpotent groups, i. e., to consider the induced homomorphisms ϕ,̄ ψ̄ ∈ Hom(Nn,c , G), instead of ϕ, ψ and the subgroup Ē instead of E(ϕ, ψ). Now the result follows from Theorem 5.4.17. Theorem 5.4.19. Let G be a finitely generated nilpotent group. Then PCPn (G) ∈ P for every n ∈ ℕ. Proof. Indeed, by Theorem 5.4.18 one can compute in P-time a finite set of elements h1 , . . . , hm ∈ Fn such that E(ϕ, ψ) = ⟨h1 , . . . , hm , γc+1 (Fn )⟩. Now the instance of PCPn defined by (ϕ, ψ) has a nontrivial solution in G if and only if there is i such that ϕ(hi ) ≠ 1 in G. Indeed, in this case ϕ(hi ) = ψ(hi ) ≠ 1 in G. Otherwise, ϕ(E(ϕ, ψ)) = 1 in G so there is no nontrivial solution in G to the instance of PCPn determined by ϕ and ψ. This proves the theorem. 5.4.3.1 Post correspondence problems and equalizers Following [362] we briefly describe here two main variations of PCP and equalizer problems in groups. The general (nonhomogeneous) Post correspondence problem (GPCP) in a group G requires, for a given finite sequence of pairs (g1 , h1 ), . . . , (gn , hn ) of elements of G and an element v ∈ G, to determine if there is a group word w(x1 , . . . , xn ) in variables x1 , . . . , xn such that w(g1 , . . . , gn ) = vw(h1 , . . . , hn ) in G. In the particular case when v = 1 one gets the standard (homogeneous) PCP for the group G. Here, of course, any word w(x1 , . . . , xn ) which is an identity in G gives a solution to PCP in G for any input consisting of n pairs, so in the standard PCP it is required to determine if there exists a solution w which is not an identity in G (termed a nontrivial solution). Note that if v ≠ 1, then any solution to GPCP is a nontrivial one, so in the case of GPCP one is also looking only for nontrivial solutions, so PCP indeed can be viewed as a special case of GPCP. Usually the number of pairs in the input to GPCP or PCP is bounded by a fixed number n ∈ ℕ. The corresponding problems are denoted by GPCPn or PCPn . In all known cases, to our best knowledge, GPCP (or PCP) is decidable in a group G if and only if GPCPn (PCPn ) is decidable in G for each n. Whether this is always the case is an open problem. There are many interesting connections of PCP or GPCP with other algorithmic problems in groups (see [362]). Let G,̄ G be groups, let Hom(G,̄ G) be the set of homomorphisms from Ḡ to G and φ, ψ ∈ Hom(G,̄ G). In this section we denote the image of a map ξ on an element b by bξ . The expression of the form xφ = xψ
(5.57)
300 | 5 Discrete optimization in groups is called the equalizing equation for φ, ψ. An element g ∈ Ḡ is a solution to (5.57) if gφ = gψ. The set of all solutions Eq(φ, ψ) = {g ∈ Ḡ : gφ = gψ} is termed the equalizer of φ and ψ. Obviously, Eq(φ, ψ) is a subgroup of G.̄ Similarly to GPCP one can consider a general equalizing equation (here v is a fixed element of G) xφ = v ⋅ xψ
(5.58)
and the set Eq(φ, ψ, v) of all solutions of (5.58). Note that in this case the equalizer Eq(φ, ψ, v) is a coset of the equalizer Eq(φ, ψ), so to find all solutions of (5.58) it suffices to find a particular solution and all solutions of the corresponding “homogeneous” equation (5.57). It was shown in [362] that a more general equation of the type u⋅gφ⋅s = v ⋅ gψ ⋅ t, where u, v, s, t ∈ G, immediately reduces to (5.58). There are two classical algorithmic problems related to equations (5.57) and (5.58). The first one is to check if there exists a nontrivial solution (as in the case of PCP) to the equation, and if it does then to find one; while the second, more general problem is to describe the set of all solutions to the equation, i. e., to find the equalizers Eq(φ, ψ) and Eq(φ, ψ, v). The latter two problems are termed the equalizer problem (EP) and the generalized equalizer problem (GEP) for φ and ψ [362]. Of course, EP is a particular case of GEP (when v = 1). Observe that PCPn and GPCPn in a group G reduce to EP and GEP for groups G and G,̄ where Ḡ = Fn (var(G)) is a relatively free group of rank n in the variety generated by G. Finally, we say that GEP is decidable in a class of groups 𝒱 if for any pair of groups G,̄ G ∈ 𝒱 , every pair of homomorphisms φ and ψ in Hom(G,̄ G) and an element v ∈ G one can effectively find the equalizer Eq(φ, ψ, v). Decidability of EP in 𝒱 is defined similarly. An important remark is due here – one needs to define what it means “to find” the equalizer. If 𝒱 is a class of polycyclic groups, then the subgroup Eq(φ, ψ) of Ḡ is finitely generated, so in this case the algorithm is required to find a finite generating set of the subgroup. If 𝒱 is a class of finitely generated metabelian groups, then a priori the subgroup Eq(φ, ψ) ≤ Ḡ might turn out to be infinitely generated, so in this case the algorithm is asked to find another finite description of the subgroup, say, as a near normal subgroup of Ḡ (see definitions in the next section). 5.4.3.2 Overview of PCP and the generalized equalizer problem in polycyclic groups One of the main results of this section is that the generalized equalizer and general Post corresponding problems are decidable in an arbitrary polycyclic group. This implies decidability of a host of related problems, such as EP, PCP, the twisted and double twisted conjugacy problems, the pullback problem, etc. It was shown in [433, 362]
5.4 Post correspondence problem |
301
that all these results hold in arbitrary finitely generated nilpotent groups, so here we generalize it to arbitrary polycyclic groups. However, there is a difference; all these problems in nilpotent groups have solution in polynomial time [362], which we do not claim here. Now we explain some results mentioned above in more detail. Let φ, ψ be arbitrary fixed endomorphisms of a group G. Elements u, v ∈ G are termed (φ, ψ)-double twisted conjugate if there is an element w ∈ G such that uwφ = wψ v. In particular, when ψ = 1, then u and v are called φ-twisted conjugate, while in the case φ = ψ = 1, u and v are just usual conjugates of each other. The twisted (or double twisted) conjugacy problem in G is to decide whether two given elements u, v ∈ G are twisted (double twisted) conjugate in G with respect to given endomorphisms φ, ψ ∈ End(G). This problem, in the case when φ, ψ are automorphisms of G, is well studied in groups; we refer to [48, 430, 433, 488, 129, 131] for some recent results. However, not much is known about the double twisted conjugacy problem in groups with respect to arbitrary endomorphisms. Observe that the double twisted conjugacy problem immediately reduces to solving the equation of the form uxφ = xψ v, which as mentioned above reduces to GEP for φ and ψ. Hence decidability of GEP in polycyclic groups gives decidability of the double twisted conjugacy problem relative to arbitrary endomorphisms. Recall that the pullback of two homomorphisms φ : G1 → G and ψ : G2 → G is a subgroup P(φ, ψ) of the direct product G1 × G2 defined by P(φ, ψ) = {(a, b) ∈ G1 × G2 | aφ = bψ}. The pullback problem for given such φ and ψ is to find the subgroup P(φ, ψ) of the group G1 × G2 . To reduce the pullback problem to EP consider homomorphisms φ̄ and ψ̄ from G1 × G2 to G defined as compositions of the projections G1 × G2 → Gi , ̄ so decidability i = 1, 2, and the homomorphisms φ, ψ. In this set-up P(φ, ψ) = Eq(φ,̄ ψ), of EP in polycyclic groups implies decidability of the pullback in these groups. Similarly, for an endomorphism φ ∈ End(G) of a group G the subgroup Fix(φ) of fixed points of φ is defined as Fix(φ) = {g ∈ G | φ(g) = g}. It is easy to see that Fix(φ) = Eq(φ, idG ), where idG is the identity map on G. Hence to compute Fix(φ) it suffices to compute Eq(φ, idG ). Observe that the free group Fn (var(G)) in the variety generated by a polycyclic group G may not be polycyclic, so GPCP does not immediately reduce as described above to GEP in polycyclic groups, and hence the proof of decidability of GPCP in this case requires extra work. It is known that Fn (var(G)) is polycyclic if and only if G is nilpotent-by-finite [21]. One can say a bit more here: if G is nilpotent-by-finite, then one can not only solve PCP and GPCP, but also find all solutions to these problems for a given instance; meanwhile, if G is polycyclic but not nilpotent-by-finite, then one can still solve PCP and GPCP in G, but in this case we do not have an algorithm for finding all solutions for a given instance. Many algorithmic problems discussed above are also decidable in the class 𝒜2 of all finitely generated metabelian groups, provided minor restrictions on the endomorphisms φ and ψ. To explain, let M, M̄ ∈ 𝒜2 , φ, ψ ∈ Hom(M,̄ M) and v ∈ M. To find the generalized equalizer Eq(φ, ψ, v) we check if the general equalizing equation
302 | 5 Discrete optimization in groups ̄ and if it does find a particular solution, say, a ∈ M. ̄ xφ = v ⋅ xψ has a solution in M, Then we find the equalizer Eq(φ, ψ) and get the generalized equalizer Eq(φ, ψ, v) as the coset aEq(φ, ψ). Note that not all subgroups (in particular, not all equalizers) in finitely generated metabelian groups are finitely generated, so it is important to explain how we compute the equalizers Eq(φ, ψ). Recall that a subgroup H of a finitely generated metabelian group M̄ is called nearly normal (see [33]) if the intersection of ̄ In this case H ∩ M̄ is generated as H with the derived subgroup M̄ is normal in M. ̄ a normal subgroup of M by finitely many elements, say, u1 , . . . , um . Let g1 , . . . , gk be elements in H that generate H modulo M̄ . Then H is generated by these elements g1 , . . . , gk together with the normal subgroup H ∩ M̄ . The algorithm that finds the equalizer Eq(φ, ψ) in fact finds some finite sets {g1 , . . . , gk } and = {u1 , . . . , um } such that Eq(φ, ψ) = ⟨g1 , . . . , gk ⟩⟨u1 , . . . , um ⟩M . Now we can precisely state the results on metabelian groups and mention some corollaries. Let M and M̄ be finitely generated metabelian groups and φ, ψ ∈ Hom(M,̄ M) homomorphisms such that φ = ψ modulo the derived subgroup M . Then the generalized equalizer Eq(φ, ψ, v) can be computed for any v ∈ M ; in particular, one can find the equalizer Eq(φ, ψ). A more general and more technical result is as follows. For arbitrary φ, ψ ∈ Hom(M,̄ M) define the differentiator D(φ, ψ) as a subgroup of M generated ̄ and the normalized differentiator D∗ (φ, ψ) by all elements gφ(gψ)−1 , where g ∈ M, as the normal subgroup generated by D(φ, ψ) and M . We proved (Theorem 5.4.32 in Section 5.4.5) that if φ, ψ ∈ Hom(M,̄ M) are such that the normalized differentiator D∗ (φ, ψ) is abelian, then one can find the generalized equalizer Eq(φ, ψ, v) for every v ∈ D∗ (φ, ψ); in particular, one can find the equalizer Eq(φ, ψ). In this case the equal̄ The restriction that D∗ (φ, ψ) is abelian izer Eq(φ, ψ) is a nearly normal subgroup of M. is essential for our argument.
5.4.4 General lemmas and remarks Let Ḡ and G be arbitrary groups and φ, ψ ∈ Hom(G,̄ G). For g ∈ Ḡ denote a(g) = gφ ⋅ (gψ)−1 . Recall from Section 5.4.3.2 that the differentiator D(φ, ψ) of φ and ψ is defined ̄ of G, and the normalized differentiator D∗ (φ, ψ) as the as the subgroup ⟨a(g) | g ∈ G⟩ normal subgroup generated by D(φ, ψ) and M . Lemma 5.4.20. Let Ḡ and G be finitely generated groups and φ, ψ ∈ Hom(G,̄ G) such ̄ ψ(G)⟩. ̄ Suppose that the following assumptions are true: that G = ⟨φ(G), ∗ (1) D (φ, ψ) is abelian, (2) Ḡ ≤ Eq(φ, ψ), i. e., for every g ∈ Ḡ one has gφ = gψ. Then for any finite generating set {f1 , . . . , fn } of Ḡ the following conditions hold: (a) Eq(φ, ψ) ≤ ψ−1 (CG (a1 , . . . , an )), where ai = a(fi ), i = 1, . . . , n, and CG (a1 , . . . , an ) is the centralizer of the elements a1 , . . . , an in G,
5.4 Post correspondence problem
| 303
(b) the restriction ρ of the map a onto ψ−1 (CG (a1 , . . . , an )) is a homomorphism ρ : ψ−1 (CG (a1 , . . . , an )) → ζ1 G, (c) Eq(φ, ψ) = ker(ρ); in particular, if ζ1 G = 1, then Eq(φ, ψ) = ψ−1 (CG (a1 , . . . , an )).
(5.59)
Proof. Denote A = D∗ (φ, ψ). By the definition of the differentiator for every g = g(f1 , . . . , fn ) ∈ Ḡ one has gφ = g(f1 φ, . . . , fn φ) = g(a1 ⋅ f1 ψ, . . . , an ⋅ fn ψ). Since by assumption (1) A is normal abelian, it follows from (5.21) that g(a1 ⋅ f1 ψ, . . . , an ⋅ fn ψ) = a1
(dg/df1 )ψ
n )ψ ⋅ ⋅ ⋅ a(dg/df ⋅ gψ n
(5.60)
(here ψ is naturally extended by linearity to ℤḠ → ℤG). In particular, gφ = gψ ⇐⇒ a1
(dg/df1 )ψ
⋅ ⋅ ⋅ an(dg/dfn )ψ = 1.
(5.61)
For every commutator [fi , fj ], i, j = 1, . . . , n, one has 1−fj ψ fi ψ−1 aj
[fi , fj ]φ = [ai ⋅ fi ψ, aj ⋅ fj ψ] = ai
⋅ [fi , fj ]ψ.
By assumption (2) [fi , fj ]φ = [fi , fj ]ψ, and hence from the equality above f ψ−1
ai j
f ψ−1
= aj i
,
i, j = 1, . . . , n.
(5.62)
Viewing A as an G/A-module, observe from assumption (1) that G acts on A trivially, so dg/dfj (fi ψ−1)
= aj i
(f ψ−1)dg/dfj
(f ψ−1)dg/dfj
= ai j
aj
.
Now applying (5.62) we get dg/dfj (fi ψ−1)
aj
(f ψ−1)dg/dfj
= aj i
dg/dfj (fj ψ−1)
= ai
.
Hence, for every i = 1, . . . , n (dg/df1 )ψ
(a1
n )ψ ⋅ ⋅ ⋅ a(dg/df ) n
(fi ψ−1)
dg/dfj (fi ψ−1)
= Πj aj
Σ dg/dfj (fj ψ−1)
= ai j where the last equality comes from (5.11).
dg/dfj (fj ψ−1)
= Πj ai
gψ−1
= ai
,
304 | 5 Discrete optimization in groups Assume now that g ∈ Eq(φ, ψ). Then from the above equality and (5.61) we conclude that gψ−1
gφ = gψ ⇒ ai
= 1 ⇐⇒ [ai , gψ] = 1,
(i = 1, . . . , n).
This proves (a). To prove (b), take an arbitrary g ∈ ψ−1 (CG (a1 , . . . , an )) and put v = a1
(dg/df1 )ψ
n )ψ ⋅ ⋅ ⋅ a(dg/f . n
gψ−1 ̄ = 1. Since = 1, i = 1, . . . , n. Thus [v, ψ(G)] Then by the equality above vfi ψ−1 = ai A ̄ ̄ ̄ v ∈ A one has [v, A] = 1, so [v, φ(G)] ≤ [v, Aψ(G)] = [v, A][v, ψ(G)] = 1. Since G = ̄ ψ(G)⟩ ̄ it follows that v ∈ ζ G. This proves that ⟨φ(G), 1
ρ(ψ−1(CG(a1, . . . , an))) ≤ ζ1 G. To finish (b) one needs to show that ρ is a group homomorphism. To this end, if g, h ∈ ψ−1(CG(a1, . . . , an)), then ρ(gh) = a(gh) = gφhφ(hψ)−1(gψ)−1 = gφ a(h) (gψ)−1. Since a(h) ∈ ζ1 G, we can further rewrite gφ a(h) (gψ)−1 = gφ(gψ)−1 a(h) = a(g)a(h). This proves (b), and (c) follows from (b).

The following lemma gives a tool to circumvent assumption (2) in Lemma 5.4.20.

Lemma 5.4.21. Let Ḡ and G be finitely generated groups and φ, ψ ∈ Hom(Ḡ, G) such that G = ⟨φ(Ḡ), ψ(Ḡ)⟩. Suppose that D∗(φ, ψ) is abelian. Then:
(a) The restriction of the map a : Ḡ → G onto Ḡ′ is a homomorphism; furthermore, this homomorphism preserves conjugation, in particular a(g^f) = a(g)^{fψ} = a(g)^{fφ} for every f ∈ Ḡ.
(b) The set a(Ḡ′) is a normal subgroup of G.
(c) If Ḡ′ is generated as a normal subgroup by a set of elements {ui : i ∈ I}, then a(Ḡ′) is generated as a normal subgroup by the set {a(ui) : i ∈ I}.
Proof. Put A = D∗(φ, ψ). Let g1, g2, g ∈ Ḡ′ and f ∈ Ḡ. Since A is abelian it follows that [A, φ(Ḡ′)] = 1 and [A, ψ(Ḡ′)] = 1. Hence (g1g2)φ = a(g1) ⋅ g1ψ ⋅ a(g2) ⋅ g2ψ = a(g1)a(g2) ⋅ (g1g2)ψ, so a(g1)a(g2) = a(g1g2). Similarly,

g1^{−1}φ = a(g1^{−1}) ⋅ g1^{−1}ψ = (g1φ)−1 = g1^{−1}ψ ⋅ a(g1)−1 = a(g1)−1 ⋅ g1^{−1}ψ,
so a(g1)−1 = a(g1^{−1}). Therefore, the restriction of the map a onto Ḡ′ is a homomorphism. Finally,

g^f φ = (gφ)^{fφ} = a(g)^{a(f) ⋅ fψ} ⋅ (gψ)^{a(f) ⋅ fψ} = a(g)^{fψ} ⋅ g^f ψ,

so a(g)^{fψ} = a(g^f), i. e., a(g)^{fψ} ∈ a(Ḡ′). Observe that a(g)^{fφ} = a(g)^{a(f)fψ} = a(g)^{fψ} = a(g^f) ∈ a(Ḡ′). Since G = ⟨φ(Ḡ), ψ(Ḡ)⟩ it follows that the restriction of a onto Ḡ′ preserves the conjugation. This proves (a). Now (b) and (c) follow from (a).

Corollary 5.4.22. Let Ḡ, G, φ, ψ be as in Lemma 5.4.21. In the notation above put G1 = G/a(Ḡ′) and denote by φ1 and ψ1 the compositions of φ and ψ with the standard projection G → G1. Then the groups Ḡ, G1 and the homomorphisms φ1, ψ1 ∈ Hom(Ḡ, G1) satisfy all the assumptions of Lemma 5.4.20.

The following lemma is a tool to get rid of the center of the group G (in the notation above).

Lemma 5.4.23. Let Ḡ, G be groups and φ, ψ ∈ Hom(Ḡ, G). Denote by φ1 and ψ1 the compositions of φ, ψ with the standard projection G → G/ζ1 G, respectively. Then:
(a) the restriction of the map a : Ḡ → G onto Eq(φ1, ψ1) is a homomorphism ρ : Eq(φ1, ψ1) → ζ1 G,
(b) Eq(φ, ψ) = ker ρ.

Lemma 5.4.24. Let Ḡ, G be a pair of groups, N be a normal subgroup of G and φ, ψ be two homomorphisms of Ḡ to G. Let φ1, ψ1 be the compositions of φ, ψ with the standard homomorphism G → G/N, respectively. Suppose that (5.58) has a solution modulo N, i. e., there is an element g1 ∈ Ḡ such that

g1φ = v ⋅ g1ψ ⋅ u−1,   (5.63)

where u ∈ N. Then a solution of (5.58) exists if and only if there is a solution of the equation

g2φ = u ⋅ g2ψ.   (5.64)

Proof. Let g be a solution of (5.58). Then g2 = g1^{−1}g is a solution of (5.64). Conversely, if g2 is a solution of (5.64), then g = g1g2 is a solution of (5.58).

Remark 5.4.25. Let Ḡ, G be a pair of groups and φ, ψ be two homomorphisms of Ḡ to G. Let C ≤ ζ1 G be a central subgroup of G. Suppose that for every element g ∈ Ḡ one has c(g) = gφ(gψ)−1 ∈ C. Define a homomorphism ρ : Ḡ → C by the map g → c(g). Then an equation of the form gφ = c ⋅ gψ
has a solution in Ḡ if and only if c ∈ ρ(Ḡ). In other words, we need to solve the membership problem in ζ1 G with respect to ρ(Ḡ).

5.4.5 Post correspondence and equalizer problems in metabelian groups

5.4.5.1 EP and PCP
Let M be a finitely generated metabelian group, let M̄ be a finitely generated metabelian group with generators f1, . . . , fn, and φ, ψ ∈ Hom(M̄, M). For g ∈ M̄ denote as above a(g) = gφ(gψ)−1 and put ai = a(fi), i = 1, . . . , n. We use this notation throughout the whole section.

When dealing with equalizer problems in metabelian groups we can always assume that M = ⟨φ(M̄), ψ(M̄)⟩. Indeed, by definition Eq(φ, ψ) does not change if we swap M to the subgroup ⟨φ(M̄), ψ(M̄)⟩. Furthermore, given such φ and ψ, one can compute the images of a given finite generating set, say, U, of M̄ under φ and ψ and then find a finite 𝒜2-presentation of the subgroup ⟨φ(U) ∪ ψ(U)⟩ in M (see, for example, Theorem 9.5.3 in [285]), thus replacing the homomorphisms φ, ψ by the induced homomorphisms onto ⟨φ(M̄), ψ(M̄)⟩. When we change M to a factor group M/R we usually keep the notation φ and ψ for the compositions of these homomorphisms with the standard homomorphism M → M/R.

Lemma 5.4.26. There is an algorithm that for any two homomorphisms φ, ψ ∈ Hom(M̄, M) of finitely generated metabelian groups M̄, M such that the normalized differentiator D∗(φ, ψ) is abelian finds the equalizer Eq(φ, ψ) as a nearly normal subgroup of M̄.

Proof. As we have mentioned above, we can assume in this case that M = ⟨φ(M̄), ψ(M̄)⟩. By Lemma 5.4.21 the set a(M̄′) is a normal subgroup of M generated as a normal subgroup by the finitely many elements a([fi, fj]), i, j = 1, . . . , n.

Put M1 = M/a(M̄′) and denote by φ1, ψ1 the compositions of φ, ψ with the canonical projection M → M1. By Corollary 5.4.22 the groups M̄, M1 together with the homomorphisms φ1, ψ1 satisfy all the assumptions of Lemma 5.4.20. Hence by Lemma 5.4.20 Eq(φ1, ψ1) = ker ρ, where ρ is the restriction of the map a : g → a(g) = gφ1(gψ1)−1 onto the centralizer ψ1^{−1}(CM1(a1, . . . , an)) ≤ M̄ (here ai = a(fi), i = 1, . . . , n). We claim that there is an algorithm to find the equalizer Eq(φ1, ψ1). Indeed, given φ, ψ and the elements f1, . . . , fn ∈ M̄, one can find effectively the elements a1, . . . , an ∈ M1, and hence by Theorem 6.1 from [33] one can find the centralizer C = CM1(a1, . . . , an) as a nearly normal subgroup of M1. In fact, since by our assumptions D(φ, ψ) commutes with M′ (hence D(φ1, ψ1) commutes with M1′) the subgroup C contains M1′, so it is normal in M1. By Lemma 5.1.17 one can compute the full preimage ψ1^{−1}(C) in M̄ as a normal subgroup of M̄ given by a finite set of normal generators. Moreover, ψ1^{−1}(C) contains M̄′. Hence one can find the subgroup ker ρ, which is equal to Eq(φ1, ψ1).

Observe that M̄′ ≤ Eq(φ1, ψ1). Let Eq(φ1, ψ1) = ⟨g1, . . . , gk, M̄′⟩ for some g1, . . . , gk ∈ M̄. Then giφ = giψ (mod a(M̄′)), and hence giφ = a(ui)giψ for some ui ∈ M̄′, i =
1, . . . , k. One can check that in this case ḡi = ui^{−1}gi ∈ Eq(φ, ψ). Indeed, ḡiφ = ui^{−1}φ ⋅ giφ = ui^{−1}φ ⋅ a(ui) ⋅ giψ = ui^{−1}φ ⋅ uiφ ⋅ (uiψ)−1 ⋅ giψ = (ui^{−1}gi)ψ, as claimed. It follows that Eq(φ1, ψ1) = ⟨ḡ1, . . . , ḡk, M̄′⟩, and hence
Eq(φ, ψ) = ⟨ḡ1, . . . , ḡk⟩(Eq(φ, ψ) ∩ M̄′),

thus Eq(φ, ψ) is nearly normal. Note that H = Eq(φ, ψ) ∩ M̄′ is a normal subgroup of M̄. Indeed, if g ∈ Eq(φ, ψ) ∩ M̄′ and f ∈ M̄, then since (gψ)^{a(f)} = gψ (because gψ ∈ M′) one has g^f φ = (gφ)^{fφ} = (gψ)^{a(f)fψ} = (gψ)^{fψ} = g^f ψ.

The restriction of the map a : M̄ → M onto M̄′ is a homomorphism λ : M̄′ → a(M̄′) of abelian groups which maps u → a(u). Note that H = ker λ. One can view M̄′ as a module over ℤM̄/M̄′, and a(M̄′) as a module over ℤM/D∗(φ, ψ). By Lemma 5.1.17, one can compute a finite set {v1, . . . , vt} of generators of H as a normal subgroup of M̄. This gives a description of Eq(φ, ψ) as a nearly normal subgroup of M̄:

Eq(φ, ψ) = ⟨ḡ1, . . . , ḡk⟩⟨v1, . . . , vt⟩^{M̄},   (5.65)
as required.

Corollary 5.4.27. The equalizer problem is decidable for any homomorphisms φ, ψ ∈ Hom(M̄, M) of finitely generated metabelian groups M̄, M such that D∗(φ, ψ) is abelian. Moreover, for such φ, ψ one can decide whether the equalizer equation xφ = xψ has a nontrivial solution in M̄, and if it does one can find a particular nontrivial solution.

Proof. By Lemma 5.4.26, given two homomorphisms φ, ψ as above, we can find a presentation (5.65) of the equalizer Eq(φ, ψ). Furthermore, since the word problem in M̄ is decidable, one can check whether all the elements ḡ1, . . . , ḡk, v1, . . . , vt in this presentation are trivial, thus deciding if the equation xφ = xψ has a nontrivial solution in M̄, and if it does find a particular nontrivial solution.

Recall that an endomorphism φ of a group G is termed an IA-endomorphism if it is the identity modulo the derived subgroup G′, i. e., for any g ∈ G one has gφ ⋅ g−1 ∈ G′.

Corollary 5.4.28. The fixed-points problem is decidable for IA-endomorphisms in the class of all finitely generated metabelian groups.

5.4.5.2 GEP and GPCP for metabelian groups
In this subsection M is a finitely generated metabelian group, and M̄ is a finitely generated metabelian group with generating set {f1, . . . , fn}.

Lemma 5.4.29. Let N be an abelian normal subgroup of M, which contains M′. Let φ, ψ ∈ Hom(M̄, M) be a pair of homomorphisms. Suppose the following conditions are satisfied:
(1) for every g ∈ M̄ we have gφ ⋅ (gψ)−1 = a(g) ∈ N,
(2) M̄′ ≤ Eq(φ, ψ), i. e., for every g ∈ M̄′ we have gφ = gψ,
(3) ζ1(M) = 1.
Then there is an algorithm which decides whether an equation of the form

gφ = a ⋅ gψ,   a ∈ N,   (5.66)
has a solution.

Proof. Note that all the assumptions of Lemma 5.4.20 are satisfied. Denote as in this lemma ai = a(fi) for i = 1, . . . , n. Thus, g = g(f1, . . . , fn) is a solution of (5.66) if and only if

∏_{i=1}^{n} ai^{(dg/dfi)ψ} = a.   (5.67)

Both sides of (5.67) are considered as elements of the module N. Multiplying both sides by fiψ − 1, we derive by (5.62) and (5.17) that

ai^{Σ_{j=1}^{n} (dg/dfj (fj−1))ψ} = ai^{gψ−1} = a^{fiψ−1},   i = 1, . . . , n.   (5.68)

Hence

ai^{gψ} = a^{fiψ−1} ai,   i = 1, . . . , n.   (5.69)
By [33, Theorem 7.2] there is an algorithm in any finitely generated metabelian group which decides whether a given tuple of elements is conjugate to another one. It is a generalization of Noskov’s algorithm from [386] dealing with a pair of elements and deciding their conjugacy. So, we can check whether the set of equations ai^z = a^{fiψ−1} ai, i = 1, . . . , n, corresponding to (5.69) has a solution z ∈ M. If the answer is “no,” equation (5.67) has no solutions.

Suppose that the answer is “yes” for some element z = h ∈ M. We take M1 = M/N. Then M1 = ψ(M̄). Then we take any preimage g ∈ M̄ of the image of h in M1. Then gψ is a conjugating element, as h was before. Then we compute the element gφ. After our computation with formula (5.20) we derive that

a1^{(dg/df1)ψ} ⋯ an^{(dg/dfn)ψ} = r.   (5.70)

Then we multiply both sides of (5.70) by fiψ − 1 and obtain as before

ai^{gψ−1} = r^{fiψ−1},   i = 1, . . . , n.

Hence

a^{fiψ−1} = r^{fiψ−1},   i = 1, . . . , n.

By our assumption ζ1M = 1 we get a = r. And so g is a solution of (5.66).
Lemma 5.4.30. Let N be an abelian normal subgroup of M, which contains M′. Suppose the following conditions are satisfied:
(1) for every g ∈ M̄ we have gφ ⋅ (gψ)−1 = a(g) ∈ N,
(2) M̄′ ≤ Eq(φ, ψ), i. e., for every g ∈ M̄′ we have gφ = gψ.
If there is an algorithm deciding whether an equation

gφ = a ⋅ gψ (mod ζ1M),   a ∈ N,   (5.71)

has a solution, then there is an algorithm deciding whether equation (5.66) has a solution in M.

Proof. If (5.71) has no solutions, then (5.66) has no solutions either. Suppose that g1 ∈ M̄ gives a solution of (5.71):

g1φ = ca ⋅ g1ψ,   c ∈ ζ1M.   (5.72)

If g ∈ M̄ is a solution of (5.66), then z = g−1g1 ∈ Eq_{M/ζ1M}(φ, ψ):

zφ = c ⋅ zψ.   (5.73)

Then we can derive a presentation

Eq_{M/ζ1M}(φ, ψ) = gp(g1, . . . , gk, M̄′).   (5.74)

For each g ∈ Eq_{M/ζ1M}(φ, ψ) one has gφ = c(g) ⋅ gψ, where c(g) ∈ ζ1M. We have a homomorphism

μ : Eq_{M/ζ1M}(φ, ψ) → ζ1M,   μ : g → c(g).   (5.75)

Clearly, (5.73) and therefore also (5.66) has a solution if and only if c ∈ gp(c(gi) | i = 1, . . . , k). This can be checked by a standard procedure.
310 | 5 Discrete optimization in groups terms of upper central series of any finitely generated metabelian group. Also, it finds their finite presentations. By Lemma 5.4.29 there is an algorithm that decides if the image of (5.66) has a solution in M/ζt M. If such solution exists (in the other case (5.66) has no solutions) we can reduce by Lemma 5.4.30 the question to group M/ζt−1 M, and so on. The last step will give the answer for group M. Theorem 5.4.32. Let M be a finitely generated metabelian group, and let M̄ be a finitely generated metabelian group generated by f1 , . . . , fn , n ≥ 2. Let N be an abelian normal subgroup of M, which contains M . Then GEP is decidable for every pair of homomorphisms φ, ψ ∈ Hom(M,̄ M) and every element a ∈ N such that for every g ∈ M̄ one has gφ(gψ)−1 = a(g) ∈ N. In other words, then there is an algorithm deciding if an equation of the form (5.66) has a solution. ̄ and ψ(M). ̄ As a first step we Proof. As usual we assume that M is generated by φ(M) reduce the problem to the case when φ and ψ are homomorphisms of M̄ to M1 = M/R, where R is a normal subgroup of M such that M̄ ≤ EqM1 (φ, ψ). Define a normal subgroup R = a(M̄ ) of M as above. Then M̄ ≤ EqM1 (φ, ψ). By Corollary 5.4.31 we can decide if the natural image of equation (5.66) for M1 has a solution. If the answer is “no,” equation (5.66) has no solutions. Suppose that the answer is “yes.” Then there is an element z1 ∈ M̄ for which z1 φ = haz1 , h ∈ a(M̄ ). Then there is an element y ∈ M̄ for which yφ = hy, and we have z = y−1 z1 as a solution of (5.66) in M. Corollary 5.4.33. Let M be a finitely generated metabelian group, and let N be a normal abelian subgroup of M that contains M . Let φ, ψ be a pair of endomorphisms of M such that for every g ∈ M one has gφ(gψ)−1 ∈ N. Then the bi-twisted conjugacy problem is solvable for φ, ψ. In particular, the bi-twisted conjugacy problem is solvable for any pair of endomorphisms φ, ψ ∈ End(M), each of which induces the identical map on M/M . This generalizes the main result of [488], where φ induces the identical map on M/M and ψ = id. Corollary 5.4.34. Let M be a finitely generated metabelian group, and let M̄ = F(var(M)) be a relatively free metabelian group in the variety var(M) generated by M with basis {f1 , . . . , fn }, n ≥ 2. Let N be an abelian normal subgroup of M, which contains M . Then GPCPn is decidable for every pair of instances c̄ = (c1 , . . . , cn ), d̄ = (d1 , . . . , dn ) ∈ n M such that for the homomorphisms φ, ψ ∈ Hom(M,̄ M) corresponding to these instances and every element g ∈ M̄ we have gφ(gψ)−1 = a(g) ∈ N. In other words, then there is an algorithm deciding if an equation of the form (5.66) has a solution. Theorem 5.4.35. Let M be a metabelian polycyclic group. Let N be an abelian normal subgroup of M, which contains M . Let M̄ be a metabelian polycyclic group generated
by f1, . . . , fn, n ≥ 2. Then GEP is decidable for every pair of homomorphisms φ and ψ of M̄ to M.

Proof. For every g ∈ M̄ denote by b(g) the element gφ ⋅ (gψ)−1. Consider Eq_{N,M}(φ, ψ) = {g ∈ M̄ | b(g) ∈ N}. Obviously, we can effectively obtain a finite set {g1, . . . , gk} of generating elements of M̄1 = Eq_{N,M}(φ, ψ). We have to decide if an equation

gφ = b ⋅ gψ,   b ∈ M,   (5.76)

has a solution g ∈ M̄. Consider the image of (5.76)

gφ = b ⋅ gψ (mod N).   (5.77)

If (5.77) has no solutions, then (5.76) has no solutions. So, suppose that there is a solution g1 of (5.77). We can decide effectively whether such a solution exists. Then

g1φ = ab ⋅ g1ψ,   (5.78)
where a ∈ N. Then consider equation (5.66). Every solution of this equation, if such solution exists, belongs to M̄ 1 . So, we reduce the problem to homomorphisms φ, ψ : M̄ 1 → M. By Theorem 5.4.32 we can effectively decide if equation (5.66) has a solution in M̄ 1 . If such solution does not exist, equation (5.76) has no solutions. If g is a solution of (5.66), then g = g2−1 g1 is a solution of (5.76). Corollary 5.4.36. Let M be a polycyclic metabelian group, and let N be a normal abelian subgroup of M that contains M . Then the bi-twisted conjugacy problem is solvable for every pair of endomorphisms φ, ψ of M. Thus, the bi-twisted conjugacy problem is solvable for M. This generalizes a result of [488], where φ is an arbitrary endomorphism and ψ = id. Theorem 5.4.37. Let M be a metabelian polycyclic group, and let M̄ = F(var(M)) be a relatively free group in the variety var(M) generated by M with basis {f1 , . . . , fn }, n ≥ 2. Then GPCPn is decidable for any pair of instances c̄ = (c1 , . . . , cn ), d̄ = (d1 , . . . , dn ) ∈ M̄ n . Proof. Let φ, ψ be a pair of homomorphisms of M̄ to M corresponding to the instances c̄ and d,̄ respectively. Denote as above K(φ, ψ) = ker(φ) ∩ ker(ψ). Clearly, if b ≠ 1, then ̄ equation (5.76) has a solution in M̄ if and only if it has a solution in M̄ 1 = M/K(φ, ψ). ̄ ker(φ) × M/ ̄ ker(ψ). Hence, But M̄ 1 is the natural image of M̄ in the polycyclic group M/ M̄1 is a metabelian polycyclic group. The statement of the theorem follows from Theorem 5.4.35.
312 | 5 Discrete optimization in groups 5.4.6 Post correspondence and equalizer problems for polycyclic groups 5.4.6.1 Post correspondence and equalizer problems for polycyclic groups with nilpotent derived subgroups Now we are to consider the class 𝒫 ∩ 𝒩 𝒜 of all polycyclic groups with nilpotent derived subgroups (i. e., the class of nilpotent-by-abelian polycyclic groups). Lemma 5.4.38. Let G be a polycyclic group such that the derived subgroup G is nilpotent. Let Ḡ be a polycyclic group and let φ, ψ ∈ Hom(G,̄ G) be a pair of homomorphisms ̄ ψ(G)). ̄ Then there is an algorithm which finds a finite set of gensuch that G = gp(φ(G), erators of the subgroup EqG (φ, ψ), and hence by [34, Theorem 3.4] there is an algorithm that derives an explicit description of EqG (φ, ψ). Proof. By [34, Lemma 3.8 and Theorem 6.3] we can effectively compute ker(φ), ker(ψ) ̄ and R = ker(φ) ∩ ker(ψ). Then Ḡ 1 = G/R is naturally embedded into a product ̄ ̄ G/ker(φ) × G/ker(ψ) of two groups. each of them being isomorphic to a subgroup of G, thus the derived subgroup Ḡ 1 is nilpotent. Obviously, EqG (φ, ψ) is the full preimage of EqG (φ1 , ψ1 ) in G,̄ where φ1 , ψ1 are homomorphisms of Ḡ 1 to G, induced by φ, ψ, respectively. If we effectively compute EqG (φ1 , ψ1 ), then we can effectively describe EqG (φ, ψ) by [34, Theorem 3.4]. Now we can assume that Ḡ is nilpotent of class not greater than the class of G . If the derived subgroup G is abelian, i. e., G is metabelian, then Ḡ is abelian and Ḡ is metabelian too. Then the statement of the lemma is true by Lemma 5.1.18 and Corollary 5.4.27. So, let 1 < ζ1 G < ζ2 G < ⋅ ⋅ ⋅ < ζt G = G
(5.79)
be the upper central series of G and t ≥ 2. All its members can be derived explicitly by [34, Corollary 5.3]. All its factors can be described effectively by [34, Theorem 3.4]. Denote A = ζ1 G . By induction on t we can find a finite set of generators of EqG/A (φ1 , ψ1 ), where φ1 , ψ1 are compositions of φ, ψ with the standard homomorphism G → G/A. Let EqA,G (φ, ψ) be the full preimage of EqG/A (φ1 , ψ1 ) in G.̄ It can be constructed effectively by Lemma 5.1.18. Obviously, EqG (φ, ψ) ≤ EqA,G (φ, ψ). ̄ ψ(G)) ̄ and A by A ∩ G. It can be done We swap Ḡ by EqA,G (φ, ψ), G by gp(φ(G), effectively by [34, Theorems 3.4 and 6.3]. For simplicity we keep all notation. Now for every g ∈ Ḡ one has a(g) = gφ(gψ)−1 ∈ A and [G , A] = 1. Then by Lemma 5.4.21 the set a(Ḡ ) = {a(g) : g ∈ Ḡ } is a normal subgroup of G. Denote by G1 the quotient G/a(Ḡ ) and by φ1 , ψ1 the compositions of φ, ψ with the standard homomorphism G → G1 . Now Ḡ ≤ EqG1 (φ1 , ψ1 ). Suppose we find EqG1 (φ1 , ψ1 ) = gp(g1 , . . . , gk ). Then there are elements u1 , . . . , uk ∈ Ḡ such that a(gi ) = a(ui ), i = 1, . . . , k. Denote g̃i = u−1 i gi , i = 1, . . . , k. These elements belong to EqG (φ, ψ). Consider a homomorphism ρ : Ḡ → G defined by the map g →
a(g). Compute ker(ρ) = gp(h1 , . . . , hl ). Then EqG (φ, ψ) = gp(g̃1 , . . . , g̃k , h1 , . . . , hl ). Indeed, let w ∈ EqG (φ, ψ). As EqG (φ, ψ) ≤ EqG1 (φ1 , ψ1 ), then there is a presentation of the form w = w(g1 , . . . , gk ). Then w(g̃1 , . . . , g̃k ) = u ⋅ w(g1 , . . . , gk ), where u ∈ Ḡ ∩ EqG (φ, ψ), and hence u ∈ gp(h1 , . . . , hl ). It follows that w ∈ gp(g̃1 , . . . , g̃k , h1 , . . . , hl ). It remains to find generators of EqG1 (φ1 , ψ1 ). Let ζs G1 be the maximal member (hypercenter) of the upper central series of G1 which exists as G1 is Noetherian. It can be effectively found by [34, Corollary 5.3]. Change G1 to G(s) = G1 /ζs G1 . Then ζ1 G(s) = 1. Now all the assumptions of Lemma 5.4.20 for two groups G,̄ G(s) and two homomorphisms φ(s) , ψ(s) ∈ Hom(G,̄ G(s) ) that are compositions of φ1 , ψ1 with the standard homomorphism G1 → G(s) , respectively, are satisfied. Hence we can effectively compute EqG(s) (φ(s) , ψ(s) ). Then we apply s times Remark 5.4.23 to effectively obtain EqG1 (φ1 , ψ1 ). Corollary 5.4.39. EPP and EP are decidable in the class 𝒫 ∩ 𝒩 𝒜. Proof. The decidability of EPP was established by Lemma 5.4.38. The decidability of EP follows by the decidability of the word problem in G.̄ Lemma 5.4.40. Let G be a polycyclic group such that the derived subgroup G is nilpotent. Let Ḡ = gp(f1 , . . . , fn ) be a polycyclic group and let φ, ψ ∈ Hom(G,̄ G) be a pair of ̄ ψ(G)). ̄ Then there is an algorithm which solves homomorphisms such that G = gp(φ(G), GEP. Proof. By [34, Lemma 3.8 and Theorem 6.3] we can effectively compute ker(φ), ker(ψ) ̄ and R = ker(φ) ∩ ker(ψ). Then Ḡ 1 = G/R is naturally embedded into a product ̄ ̄ G/ker(φ) × G/ker(ψ) of two groups, each of them being isomorphic to a subgroup of G, thus the derived subgroup Ḡ 1 is nilpotent. Obviously, an equation of the form (5.58) is solvable in Ḡ if and only if it is solvable in Ḡ 1 . So, we can assume that Ḡ is nilpotent of the nilpotency class not greater than the class of G . If G is abelian, then the statement follows from Theorem 5.4.35. So, let us have (5.79) and A = ζ1 G . By induction on t we can assume that (5.58) has a solution corresponding to modulo A = ζ1 G . By Lemma 5.4.24 we reduce our question to an equation of the form gφ = a ⋅ gψ,
(5.80)
where a ∈ A. By Lemma 5.4.38 we can effectively swap Ḡ to Eq_{A,G}(φ, ψ) and so assume that for every g ∈ Ḡ one has a(g) = gφ(gψ)−1 ∈ A. Let Ḡ = gp(f1, . . . , fn). In particular, denote ai = a(fi), i = 1, . . . , n. Also, we can swap G to gp(φ(Ḡ), ψ(Ḡ)) and A to A ∩ G. Then we swap G to G/a(Ḡ′), where the normal subgroup a(Ḡ′) is defined as in Lemma 5.4.21. If g is a solution of (5.58) with respect to G/a(Ḡ′), then

gφ = bv ⋅ gψ,   b ∈ a(Ḡ′).
314 | 5 Discrete optimization in groups Then there is u ∈ Ḡ such that uφ = b ⋅ uψ. Then u−1 g is a solution of (5.58). Now we can also assume that Ḡ ≤ EqG (φ, ψ). Let ζt G be the maximal member of the upper central series of G which exists as G is Noetherian. It can be effectively found by [34, Corollary 5.3]. Change G to G(t) = G1 /ζt G. Then ζ1 G(t) = 1. Now all the assumptions of Lemma 5.4.21 for two groups G,̄ G(t) , the normal subgroup A(t) that is the natural image of A in G(t) and two homomorphisms φ(t) , ψ(t) ∈ Hom(G,̄ G(t) ) that are compositions of φ, ψ with the standard homomorphism G → G(t) , respectively, are satisfied. We keep notation for natural images of the elements a, ai , i = 1, . . . , n. If g is a solution of (5.80), then a1
^{(dg/df1)ψ(t)} ⋯ an^{(dg/dfn)ψ(t)} = a.   (5.81)
Multiplying both sides of (5.81), as elements of the module A(t), by fiψ(t) − 1, i = 1, . . . , n, and using (5.62) and (5.11), we obtain similarly to the proof of Lemma 5.4.20 a set of equations

ai^{gψ(t)} = a^{fiψ(t)−1} ai,   i = 1, . . . , n.   (5.82)
We have two tuples of elements (a1, . . . , an) and (a^{f1ψ(t)−1} a1, . . . , a^{fnψ(t)−1} an). By [34, Theorem 6.9] we can effectively decide the generalized conjugacy problem for these tuples. If they are not conjugate, equations (5.81) and (5.80) have no solutions. Then equation (5.58) has no solutions.

Conversely, suppose that a conjugator r ∈ G(t) of the considered tuples exists. Since G(t) is generated by φ(t)(Ḡ) and ψ(t)(Ḡ), G(t) is generated by A(t) and ψ(t)(Ḡ). Since A(t) centralizes all components of the tuples, we can assume that r = hψ(t) for some h ∈ Ḡ. Let

hφ(hψ)−1 = a1^{(dh/df1)ψ} ⋯ an^{(dh/dfn)ψ} = w.   (5.83)

Multiplying both sides by fiψ(t) − 1, i = 1, . . . , n, and applying the same trick as above, we obtain

(aw−1)^{fiψ(t)−1} = 1,   i = 1, . . . , n.   (5.84)
Since a, w ∈ A(t), and thus [aw−1, A(t)] = 1 and [aw−1, ψ(t)(Ḡ)] = 1, and since ζ1(G(t)) = 1, we derive a = w. Hence equations (5.81), (5.80) and (5.58) have solutions.

Now change G to G(t−1) = G/ζt−1G. Then A(t−1) = ζ1G(t−1) = ζtG/ζt−1G. By Lemma 5.4.24 we reduce the problem of decidability of the equation induced by (5.80) with respect to G(t−1) to an equation of the form

gφ(t−1) = at ⋅ gψ(t−1),   at ∈ A(t−1).
Here φ(t−1) , ψ(t−1) : Ḡ → G(t−1) are induced by φ and ψ, respectively. We apply Remark 5.4.25 to solve this equation. Continuing in such a way we solve (5.80) and so (5.58), or prove that there is no solution.
5.4.6.2 Equalizer problems for polycyclic groups
Theorem 5.4.41. EPP, EP and GEP are decidable in the class 𝒫 of all polycyclic groups.

Proof. Now Ḡ and G are two polycyclic groups, and φ, ψ are two homomorphisms of Ḡ to G. By [34, Theorem 3.4] we can assume that G = gp(φ(Ḡ), ψ(Ḡ)).

Let N be a nilpotent subgroup of finite index in G. By Lemma 5.1.18 we can find the subgroups φ−1(N) and ψ−1(N) of finite index in Ḡ. By [34, Theorem 6.3] we can effectively find N̄ = φ−1(N) ∩ ψ−1(N), which also has finite index in Ḡ. By [34, Theorem 3.4] we effectively obtain a finite description of N̄. As the generalized word problem is decidable in Ḡ, we can present Ḡ by the standard procedure as a disjoint union of cosets,

Ḡ = N̄g1 ∪ ⋯ ∪ N̄gl.   (5.85)
Let giφ = ai ⋅ giψ, i = 1, . . . , l. If ai ∉ N, then N̄gi does not intersect Eq_G(φ, ψ). Let ai ∈ N. If an equation

uiφ = ai ⋅ uiψ   (5.86)

has a solution ui ∈ N̄, which can be found effectively by Lemma 5.4.40, then ui^{−1}gi lies in Eq_G(φ, ψ). Hence

N̄gi ∩ Eq_G(φ, ψ) = Eq_{N̄}(φ_{N̄}, ψ_{N̄}) ⋅ ui^{−1}gi,   (5.87)
where φ_{N̄}, ψ_{N̄} denote the restrictions of φ, ψ to N̄, respectively. This intersection can be found effectively by Corollary 5.4.39. If (5.86) has no solution ui ∈ N̄, then N̄gi does not intersect Eq_G(φ, ψ).

It follows that we can solve EPP. Since the word problem is solvable in Ḡ, we can solve EP. In a similar way we can solve GEP. Equation (5.58) has a solution in Ḡ if and only if at least one of the equations

uiφ = ai v−1 ⋅ uiψ,   i = 1, . . . , l,   (5.88)

has a solution ui ∈ N̄. This can be checked effectively by Lemma 5.4.40.

5.4.6.3 Post correspondence problems for polycyclic groups
Theorem 5.4.42. Let G be a polycyclic group, and let Ḡ = F(var(G)) be a relatively free group of rank n ≥ 2 in the variety var(G) generated by G. Then PCP and GPCP are decidable for every pair of instances c̄ = (c1, . . . , cn), d̄ = (d1, . . . , dn) ∈ Gn, n ≥ 2.
Proof. Let φ, ψ be a pair of homomorphisms of Ḡ to G corresponding to the instances c̄ and d̄, respectively. Let R = ker(φ) ∩ ker(ψ). Then Ḡ/R is naturally embedded into a product Ḡ/ker(φ) × Ḡ/ker(ψ) of two groups, each of them being isomorphic to a subgroup of G; thus Ḡ/R is polycyclic.

It is known [21] that Ḡ is polycyclic if and only if G is nilpotent-by-finite. Suppose G is not nilpotent-by-finite. Then Ḡ is not polycyclic. It follows that R ≠ 1. Hence every nontrivial element g ∈ R is a solution of PCP.

Suppose that G is nilpotent-by-finite. Then Ḡ is polycyclic, and so PCP and GPCP are solvable by Theorem 5.4.41.

In general, GPCP is solvable in Ḡ if and only if it is solvable in Ḡ/R. Since Ḡ/R is polycyclic, GPCP is solvable by Theorem 5.4.41.

Corollary 5.4.43. Let G be a polycyclic group that is not nilpotent-by-finite. Then PCP has a solution for all instances c̄ = (c1, . . . , cn), d̄ = (d1, . . . , dn) ∈ Gn for n ≥ 2.
6 Problems in group theory motivated by cryptography 6.1 Introduction The object of this chapter is to showcase algorithmic problems in group theory motivated by (public-key) cryptography. In the core of most public-key cryptographic primitives there is an alleged practical irreversibility of some process, usually referred to as a one-way function with trapdoor, which is a function that is easy to compute in one direction, yet believed to be difficult to compute the inverse function on “most” inputs without special information, called the “trapdoor.” For example, the RSA cryptosystem uses the fact that while it is not hard to compute the product of two large primes, to factor a very large integer into its prime factors appears to be computationally hard. Another, perhaps even more intuitively obvious, example is that of the function f (x) = x 2 . It is rather easy to compute in many reasonable (semi)groups, but the inverse function √x is much less friendly. This fact is exploited in Rabin’s cryptosystem, with the multiplicative semigroup of ℤn (n composite) as the platform. In both cases though, it is not immediately clear what the trapdoor is. This is typically the most nontrivial part of a cryptographic scheme. For a rigorous definition of a one-way function we refer the reader to [478]; here we just say that there should be an efficient (which usually means polynomial-time with respect to the complexity of an input) way to compute this function, but no visible (probabilistic) polynomial-time algorithm for computing the inverse function on “most” inputs. The meaning of “most” is made more precise in Chapter 1 of this book. Before we get to the main subject of this chapter, namely, problems in combinatorial and computational group theory motivated by cryptography, we recall historically the first public-key cryptographic scheme, the Diffie–Hellman key exchange protocol, to put things in perspective. This is done in Section 6.2. We note that the platform group for the original Diffie–Hellman protocol was finite cyclic. In Section 6.2.1, we show how to convert the Diffie–Hellman key exchange protocol to an encryption scheme, known as the ElGamal cryptosystem. In the subsequent sections, we showcase various problems about infinite nonabelian groups. Complexity of these problems in particular groups has been used in various cryptographic primitives proposed over the last 20 years or so. We mention up front that a significant shift in paradigm motivated by research in cryptography was moving to search versions of decision problems that had been traditionally considered in combinatorial group theory (see, e. g., [366, 454]). In some cases, decision problems were used in cryptographic primitives (see, e. g., [401]) but these occasions are quite rare. The idea of using the complexity of infinite non-abelian groups in cryptography goes back to Wagner and Magyarik [321], who in 1985 devised a public-key protocol https://doi.org/10.1515/9783110667028-006
318 | 6 Problems in group theory motivated by cryptography based on the unsolvability of the word problem for finitely presented groups (or so they thought). Their protocol now looks somewhat naive, but it was pioneering. More recently, there has been an increased interest in applications of non-abelian group theory to cryptography initially prompted by the papers [6, 268, 460]. We note that a separate question of interest that is outside of the scope of this chapter is what groups can be used as platforms for cryptographic protocols. We refer the reader to the monographs [348, 366, 176] for relevant discussions and examples; here we just mention that finding a suitable platform (semi)group for one or another cryptographic primitive is a challenging problem. This is currently an active area of research; here we can mention that groups that have been considered in this context include braid groups (more generally, Artin groups), Thompson’s group, Grigorchuk’s group, small cancelation groups, polycyclic groups, (free) metabelian groups, various groups of matrices, semidirect products, etc. Here is the list of algorithmic problems that we discuss in this chapter. In most cases, we consider search versions of the problems as more relevant to cryptography, but there are notable exceptions. – The word (decision) problem: Section 6.5. – The conjugacy problem: Section 6.3. – The twisted conjugacy problem: Section 6.3.2. – The decomposition problem: Section 6.4. – The subgroup intersection problem: Section 6.4.2. – The factorization problem: Section 6.4.4. – The subgroup membership problem: Sections 6.6 and 6.7. – The isomorphism inversion problem: Section 6.8. – The subset sum and the knapsack problems: Section 6.10. – The hidden subgroup problem: Section 6.11. Also, in Section 6.9 we show that using semidirect products of (semi)groups as platforms for a Diffie–Hellman-like key exchange protocol yields various peculiar computational assumptions and, accordingly, peculiar search problems. In the concluding Section 6.12, we describe relations between some of the problems discussed in this chapter.
6.2 The Diffie–Hellman key exchange protocol The whole area of public-key cryptography started with the seminal paper by Diffie and Hellman [106]. We quote from Wikipedia: “Diffie–Hellman key agreement was invented in 1976 . . . and was the first practical method for establishing a shared secret over an unprotected communications channel.” In 2002 [208], Martin Hellman gave credit to Merkle as well: “The system . . . has since become known as Diffie-Hellman key exchange. While that system was first described in a paper by Diffie and me, it is
a public-key distribution system, a concept developed by Merkle, and hence should be called ‘Diffie-Hellman-Merkle key exchange’ if names are to be associated with it. I hope this small pulpit might help in that endeavor to recognize Merkle’s equal contribution to the invention of public-key cryptography.” U.S. Patent 4,200,770, now expired, describes the algorithm, and credits Diffie, Hellman and Merkle as inventors. The simplest, and most original, implementation of the protocol uses the multiplicative group ℤ∗p of integers modulo p, where p is prime and g is primitive mod p. A more general description of the protocol uses an arbitrary finite cyclic group. 1. Alice and Bob agree on a finite cyclic group G and a generating element g in G. We will write the group G multiplicatively. 2. Alice picks a random natural number a and sends g a to Bob. 3. Bob picks a random natural number b and sends g b to Alice. 4. Alice computes KA = (g b )a = g ba . 5. Bob computes KB = (g a )b = g ab . Since ab = ba (because ℤ is commutative), both Alice and Bob are now in possession of the same group element K = KA = KB which can serve as the shared secret key. The protocol is considered secure against eavesdroppers if G and g are chosen properly. The eavesdropper, Eve, must solve the Diffie–Hellman problem (recover g ab from g, g a , and g b ) to obtain the shared secret key. This is currently considered difficult for a “good” choice of parameters (see, e. g., [337] for details). An efficient algorithm to solve the discrete logarithm problem (i. e., recovering a from g and g a ) would obviously solve the Diffie–Hellman problem, making this and many other public-key cryptosystems insecure. However, it is not known whether the discrete logarithm problem is equivalent to the Diffie–Hellman problem. We note that there is a “brute-force” method for solving the discrete logarithm problem: the eavesdropper Eve can just go over natural numbers n from 1 up one at a time, compute g n and see whether she has a match with the transmitted element. This will require O(|g|) multiplications, where |g| is the order of g. Since in practical implementations |g| is typically at least 10300 , this method is considered computationally infeasible. This raises a question of computational efficiency for legitimate parties: on the surface, it looks like legitimate parties, too, have to perform O(|g|) multiplications to compute g a or g b . However, there is a faster way to compute g a for a particular a by using the “square-and-multiply” algorithm, based on the binary form of a. For example, g 22 = (((g 2 )2 )2 )2 ⋅ (g 2 )2 ⋅ g 2 . Thus, to compute g a , one actually needs O(log2 a) multiplications, which is feasible given the recommended magnitude of a.
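To make the efficiency remark concrete, here is a minimal Python sketch of the protocol over ℤ∗p with an explicit square-and-multiply routine. The parameters are toy values chosen purely for illustration and are not a recommendation.

```python
# Minimal sketch of Diffie-Hellman over Z_p^* with explicit square-and-multiply.
# Toy parameters only; real deployments use much larger, carefully chosen groups.
import secrets

def square_and_multiply(g, a, p):
    """Compute g^a mod p from the binary expansion of a (O(log a) multiplications)."""
    result, base = 1, g % p
    while a > 0:
        if a & 1:                      # current binary digit of a is 1
            result = (result * base) % p
        base = (base * base) % p       # repeated squaring
        a >>= 1
    return result

# Public data: a prime p and an element g of Z_p^* (toy choices).
p, g = 2**127 - 1, 3

a = secrets.randbelow(p - 2) + 1       # Alice's secret exponent
b = secrets.randbelow(p - 2) + 1       # Bob's secret exponent
A = square_and_multiply(g, a, p)       # Alice -> Bob
B = square_and_multiply(g, b, p)       # Bob -> Alice

K_alice = square_and_multiply(B, a, p) # (g^b)^a
K_bob = square_and_multiply(A, b, p)   # (g^a)^b
assert K_alice == K_bob == pow(g, a * b, p)
```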
6.2.1 The ElGamal cryptosystem
The ElGamal cryptosystem [121] is a public-key cryptosystem which is based on the Diffie–Hellman key exchange. The ElGamal protocol is used in the free GNU Privacy Guard software, recent versions of PGP and other cryptosystems. The Digital Signature Algorithm (DSA) is a variant of the ElGamal signature scheme, which should not be confused with the ElGamal encryption protocol that we describe below.
1. Alice and Bob agree on a finite cyclic group G and a generating element g of G.
2. Alice (the receiver) picks a random natural number a and publishes c = g^a.
3. Bob (the sender), who wants to send a message m ∈ G (called a “plaintext” in cryptographic lingo) to Alice, picks a random natural number b and sends two elements, m ⋅ c^b and g^b, to Alice. Note that c^b = g^{ab}.
4. Alice recovers m = (m ⋅ c^b) ⋅ ((g^b)^a)−1.
A notable feature of the ElGamal encryption is that it is probabilistic, meaning that a single plaintext can be encrypted to many possible ciphertexts. We also point out that the ElGamal encryption has an average expansion factor of 2, meaning that the ciphertext is about twice as large as the corresponding plaintext.
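A minimal Python sketch of ElGamal over ℤ∗p, again with toy parameters; the modular inverse in the decryption step is computed via Fermat's little theorem.

```python
# Toy ElGamal encryption/decryption over Z_p^* (illustration only).
import secrets

p, g = 2**127 - 1, 3

# Key generation (Alice, the receiver)
a = secrets.randbelow(p - 2) + 1
c = pow(g, a, p)                                      # public key c = g^a

# Encryption (Bob, the sender) of a plaintext m encoded in Z_p^*
m = 424242
b = secrets.randbelow(p - 2) + 1
ciphertext = ((m * pow(c, b, p)) % p, pow(g, b, p))   # (m * c^b, g^b)

# Decryption (Alice): m = (m * c^b) * ((g^b)^a)^(-1)
c1, c2 = ciphertext
recovered = (c1 * pow(pow(c2, a, p), p - 2, p)) % p   # inverse via x^(p-2) mod p
assert recovered == m
```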
6.3 The conjugacy problem
Let G be a group with solvable word problem. For w, a ∈ G, the notation w^a stands for a−1wa. Recall that the conjugacy problem (or conjugacy decision problem) for G is the following. Given two elements u, v ∈ G, find out whether there is x ∈ G such that u^x = v. On the other hand, the conjugacy search problem (sometimes also called the conjugacy witness problem) is the following: Given two elements u, v ∈ G and the information that u^x = v for some x ∈ G, find at least one particular element x like that.

The conjugacy decision problem is of great interest in group theory. In contrast, the conjugacy search problem is of interest in complexity theory, but of little interest in group theory. Indeed, if you know that u is conjugate to v, you can just go over words of the form u^x and compare them to v one at a time, until you get a match (we implicitly use here the obvious fact that a group with solvable conjugacy problem also has solvable word problem). This straightforward algorithm is at least exponential-time in the length of v, and therefore is considered infeasible for practical purposes.

Thus, if no other algorithm is known for the conjugacy search problem in a group G, it is not unreasonable to claim that x → u^x is a one-way function and try to build a (public-key) cryptographic protocol on that. In other words, the assumption here would be that in some groups G, the following problem is computationally hard. Given two elements a, b of G and the information that a^x = b for some x ∈ G, find at least one particular element x like that. The (alleged) computational hardness of this
problem in some particular groups (namely, in braid groups) has been used in several group-based cryptosystems, most notably in [6] and [268]. However, after some initial excitement (which has even resulted in naming a new area of “braid group cryptography”) (see, e. g., [96]), it seems now that the conjugacy search problem in a braid group may not provide a sufficient level of security (see, e. g., [215, 346, 347] for various attacks). We start with a simple key exchange protocol, due to Ko et al. [268], which is modeled on the Diffie–Hellman key exchange protocol (see Section 6.2). 1. An element w ∈ G is published. 2. Alice picks a private a ∈ G and sends wa to Bob. 3. Bob picks a private b ∈ G and sends wb to Alice. 4. Alice computes KA = (wb )a = wba , and Bob computes KB = (wa )b = wab . If a and b are chosen from a pool of commuting elements of the group G, then ab = ba, and therefore, Alice and Bob get a common private key KB = wab = wba = KA . Typically, there are two public subgroups A and B of the group G, given by their (finite) generating sets, such that ab = ba for any a ∈ A, b ∈ B. In the paper [268], the platform group G was the braid group Bn which has some natural commuting subgroups. Selecting a suitable platform group for the above protocol is a very nontrivial matter; some requirements on such a group were put forward in [452]: (P0) The conjugacy (search) problem in the platform group either has to be well studied or can be reduced to a well-known problem (perhaps, in some other area of mathematics). (P1) The word problem in G should have a fast (at most quadratic-time) solution by a deterministic algorithm. Better yet, there should be an efficiently computable “normal form” for elements of G. This is required for an efficient common key extraction by legitimate parties in a key establishment protocol, or for the verification step in an authentication protocol, etc. (P2) The conjugacy search problem should not have an efficient solution by a deterministic algorithm. We point out here that proving a group to have (P2) should be extremely difficult, if not impossible. Property (P2) should therefore be considered in conjunction with (P0), i. e., the only realistic evidence of a group G having property (P2) can be the fact that sufficiently many people have been studying the conjugacy (search) problem in G over a sufficiently long time. The next property is somewhat informal, but it is of great importance for practical implementations. (P3) There should be a way to disguise elements of G so that it would be impossible to recover x from x−1 wx just by inspection.
One way to achieve this is to have a normal form for elements of G, which usually means that there is an algorithm that transforms any input uin , which is a word in the generators of G, to an output uout , which is another word in the generators of G, such that uin = uout in the group G, but this is hard to detect by inspection. In the absence of a normal form, say, if G is just given by means of generators and relators without any additional information about properties of G, then at least some of these relators should be very short to be used in a disguising procedure. To this one can add that the platform group should not have a linear representation of a small dimension, since otherwise, a linear algebra attack might be feasible.
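The following toy Python sketch runs the key exchange described above with the symmetric group S8 as a stand-in platform and with A, B the subgroups of permutations supported on disjoint sets of points (so that any a ∈ A and b ∈ B commute). It only illustrates the algebra of the exchange; a finite symmetric group is of course not a serious platform, and no disguising of the transmitted conjugates is attempted.

```python
# Toy conjugacy-based key exchange (Ko et al. style) with a permutation platform.
import random

N = 8
def compose(p, q):                      # product p*q, acting as p(q(i))
    return tuple(p[q[i]] for i in range(N))

def inverse(p):
    inv = [0] * N
    for i, x in enumerate(p): inv[x] = i
    return tuple(inv)

def conjugate(w, x):                    # w^x = x^{-1} w x
    return compose(inverse(x), compose(w, x))

def random_elem(points):                # random permutation supported on `points`
    img = list(points); random.shuffle(img)
    perm = list(range(N))
    for i, j in zip(points, img): perm[i] = j
    return tuple(perm)

w = tuple(random.sample(range(N), N))   # public element of G = S_8
a = random_elem(range(0, 4))            # Alice's secret, from A
b = random_elem(range(4, 8))            # Bob's secret, from B

wa = conjugate(w, a)                    # Alice -> Bob
wb = conjugate(w, b)                    # Bob -> Alice

K_alice = conjugate(wb, a)              # w^{ba}
K_bob = conjugate(wa, b)                # w^{ab}
assert K_alice == K_bob                 # ab = ba since a, b move disjoint points
```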
6.3.1 The Anshel–Anshel–Goldfeld key exchange protocol In this section, we are going to describe a key establishment protocol from [6] that really stands out because, unlike other protocols based on the (alleged) hardness of the conjugacy search problem, it does not employ any commuting or commutative subgroups of a given platform group and can, in fact, use any non-abelian group with efficiently solvable word problem as the platform. This really makes a difference and gives a big advantage to the protocol of [6] over most protocols in this chapter. The choice of the platform group G for this protocol is a delicate matter though. In the original paper [6], a braid group was suggested as the platform, but with this platform the protocol was subsequently attacked in several different ways (see, e. g., [40, 153, 154, 215, 283, 282, 346, 347, 481]). The search for a good platform group for this protocol still continues. Now we give a description of the Anshel–Anshel–Goldfeld (AAG) protocol. A group G and elements a1 , . . . , ak , b1 , . . . , bm ∈ G are public. (1) Alice picks a private x ∈ G as a word in a1 , . . . , ak (i. e., x = x(a1 , . . . , ak )) and sends bx1 , . . . , bxm to Bob. y y (2) Bob picks a private y ∈ G as a word in b1 , . . . , bm and sends a1 , . . . , ak to Alice. y y (3) Alice computes x(a1 , . . . , ak ) = xy = y−1 xy, and Bob computes y(bx1 , . . . , bxm ) = yx = −1 x yx. Alice and Bob then come up with a common private key K = x −1 y−1 xy (called the commutator of x and y) as follows. Alice multiplies y−1 xy by x −1 on the left, while Bob multiplies x−1 yx by y−1 on the left, and then takes the inverse of the whole thing: (y−1 x−1 yx)−1 = x−1 y−1 xy. It may seem that solving the (simultaneous) conjugacy search problem for bx1 , . . . , bxm ; y y a1 , . . . , ak in the group G would allow an adversary to get the secret key K. However, if we look at step (3) of the protocol, we see that the adversary would have to know either x or y not simply as a word in the generators of the group G, but as a word in a1 , . . . , ak (respectively, as a word in b1 , . . . , bm ); otherwise, he or she would not be able
6.3 The conjugacy problem
y
| 323
y
to compose, say, xy out of a1 , . . . , ak . That means the adversary would also have to solve the membership search problem: Given elements x, a1 , . . . , ak of a group G, find an expression (if it exists) of x as a word in a1 , . . . , ak .
We note that the membership decision problem is to determine whether a given x ∈ G belongs to the subgroup of G generated by given a1 , . . . , ak . This problem turns out to be quite hard in many groups. For instance, the membership decision problem in a braid group Bn is algorithmically unsolvable if n ≥ 6 because such a braid group contains subgroups isomorphic to F2 × F2 (that would be, for example, the subgroup generated by σ12 , σ22 , σ42 , and σ52 , see [88]), where F2 is the free group of rank 2. In the group F2 × F2 , the membership decision problem is algorithmically unsolvable by an old result of Mihailova [350]. We also note that if the adversary finds, say, some x ∈ G such that bx1 = x b1 , . . . , bxm = bxm , there is no guarantee that x = x in G. Indeed, if x = cb x, where cb bi = bi cb for all i (in which case we say that cb centralizes bi ), then bxi = bxi for all i,
and therefore bx = bx for any element b from the subgroup generated by b1 , . . . , bm ; in particular, yx = yx . Now the problem is that if x (and, similarly, y ) does not belong to the subgroup A generated by a1 , . . . , ak (respectively, to the subgroup B generated by b1 , . . . , bm ), then the adversary may not obtain the correct common secret key K. On the other hand, if x (and, similarly, y ) does belong to the subgroup A (respectively, to the subgroup B), then the adversary will be able to get the correct K even though his or her x and y may be different from x and y, respectively. Indeed, if x = cb x, y = ca y, where cb centralizes B and ca centralizes A (elementwise), then
−1
(x′)−1(y′)−1 x′y′ = (cb x)−1(ca y)−1 cb x ca y = x−1 cb−1 y−1 ca−1 cb x ca y = x−1y−1xy = K,

because cb commutes with y and with ca (note that ca belongs to the subgroup B, which follows from the assumption y′ = ca y ∈ B, and, similarly, cb belongs to A), and ca commutes with x.

We emphasize that the adversary ends up with the correct key K (i. e., K = (x′)−1(y′)−1 x′y′ = x−1y−1xy) if and only if cb commutes with ca. The only visible way to ensure this is to have x′ ∈ A and y′ ∈ B. Without verifying at least one of these inclusions, there seems to be no way for the adversary to make sure that he or she got the correct key. Therefore, it appears that if the adversary chooses to solve the conjugacy search problem in the group G to recover x and y, he or she will then have to face either the membership search problem or the membership decision problem; the latter may very well be algorithmically unsolvable in a given group. The bottom line is that the adversary should actually be solving a (probably) more difficult (“subgroup-restricted”) version of the conjugacy search problem:
Given a group G, a subgroup A ≤ G and two elements g, h ∈ G, find x ∈ A such that h = x −1 gx, given that at least one such x exists.
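To make the mechanics of the AAG exchange concrete, here is a toy Python sketch, again with a symmetric group as a stand-in platform; private elements are stored as words in the public generators, which is exactly what step (3) requires. In a real instantiation the transmitted conjugates would additionally be put in normal form to disguise the conjugators; none of that is attempted here.

```python
# Toy sketch of the AAG commutator key exchange over S_16 (illustration only).
import random

N = 16
def compose(p, q): return tuple(p[q[i]] for i in range(N))
def inverse(p):
    inv = [0] * N
    for i, x in enumerate(p): inv[x] = i
    return tuple(inv)
def conjugate(w, x): return compose(inverse(x), compose(w, x))
def rand_perm(): return tuple(random.sample(range(N), N))

def evaluate(word, gens):
    """Evaluate a word (list of (index, +-1) letters) at the given generators."""
    result = tuple(range(N))
    for idx, e in word:
        g = gens[idx] if e == 1 else inverse(gens[idx])
        result = compose(result, g)
    return result

k, m, L = 3, 3, 10
a_gens = [rand_perm() for _ in range(k)]           # public a_1, ..., a_k
b_gens = [rand_perm() for _ in range(m)]           # public b_1, ..., b_m

x_word = [(random.randrange(k), random.choice((1, -1))) for _ in range(L)]  # Alice
y_word = [(random.randrange(m), random.choice((1, -1))) for _ in range(L)]  # Bob
x, y = evaluate(x_word, a_gens), evaluate(y_word, b_gens)

alice_sends = [conjugate(bi, x) for bi in b_gens]  # b_i^x
bob_sends = [conjugate(ai, y) for ai in a_gens]    # a_i^y

x_conj = evaluate(x_word, bob_sends)               # x(a_1^y, ..., a_k^y) = x^y
y_conj = evaluate(y_word, alice_sends)             # y(b_1^x, ..., b_m^x) = y^x

K_alice = compose(inverse(x), x_conj)              # x^{-1} (y^{-1} x y)
K_bob = inverse(compose(inverse(y), y_conj))       # (y^{-1} x^{-1} y x)^{-1}
assert K_alice == K_bob                            # both equal x^{-1} y^{-1} x y
```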
6.3.2 The twisted conjugacy problem Let ϕ, ψ be two fixed automorphisms (more generally, endomorphisms) of a group G. Two elements u, v ∈ G are called (ϕ, ψ)-double twisted conjugate if there is an element w ∈ G such that uwϕ = wψ v. When ψ = id, then u and v are called ϕ-twisted conjugate, while in the case ϕ = ψ = id, u and v are just usual conjugates of each other. The twisted (or double twisted) conjugacy problem in G is: Decide whether two given elements u, v ∈ G are twisted (double twisted) conjugate in G for a fixed pair of endomorphisms ϕ, ψ of the group G.
Note that if ψ is an automorphism, then the (ϕ, ψ)-double twisted conjugacy problem reduces to the ϕψ−1 -twisted conjugacy problem, so in this case it is sufficient to consider just the twisted conjugacy problem. This problem was studied from a grouptheoretic perspective (see, e. g., [488, 431, 130]), and in [457] it was used in an authentication protocol. It is interesting that the research in [488, 431] was probably motivated by cryptographic applications, while the authors of [130] arrived at the twisted conjugacy problem motivated by problems in topology.
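For a finite group the (double) twisted conjugacy relation can of course be decided by brute force; the following small Python sketch does this for S4 with an inner automorphism ϕ and ψ = id, purely to make the definition concrete. All concrete choices (the group, the element defining ϕ, the pair u, v) are ad hoc.

```python
# Brute-force check of (phi, psi)-double twisted conjugacy in S_4:
# u ~ v  iff  u * phi(w) = psi(w) * v for some w in G.
from itertools import permutations

def compose(p, q): return tuple(p[q[i]] for i in range(len(p)))
def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p): inv[x] = i
    return tuple(inv)

G = list(permutations(range(4)))                              # all 24 elements of S_4
t = (1, 0, 2, 3)                                              # fixed element defining phi
phi = {g: compose(inverse(t), compose(g, t)) for g in G}      # phi = conjugation by t
psi = {g: g for g in G}                                       # psi = id

def double_twisted_conjugate(u, v):
    return any(compose(u, phi[w]) == compose(psi[w], v) for w in G)

u, v = (1, 2, 0, 3), (2, 0, 1, 3)
print(double_twisted_conjugate(u, v))
```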
6.4 The decomposition problem
Another ramification of the conjugacy search problem is the following decomposition search problem:

Given two elements w and w′ of a group G, find two elements x ∈ A and y ∈ B that would belong to given subsets (usually subgroups) A, B ⊆ G and satisfy x ⋅ w ⋅ y = w′, provided at least one such pair of elements exists.
We note that if in the above problem A = B is a subgroup, then this problem is also known as the double coset problem. We also note that some x and y satisfying the equality x ⋅ w ⋅ y = w′ always exist (e. g., x = 1, y = w−1w′), so the point is to have them satisfy the conditions x ∈ A and y ∈ B. We therefore will not usually refer to this problem as a subgroup-restricted decomposition search problem because it is always going to be subgroup-restricted; otherwise it does not make much sense. We also note that the most commonly considered special case of the decomposition search problem so far is where A = B.

We are going to show in Section 6.12 that solving the conjugacy search problem is unnecessary for an adversary to get the common secret key in the Ko–Lee (or any
similar) protocol (see Section 6.3); it is sufficient to solve a seemingly easier decomposition search problem. This was mentioned, in passing, in the paper [268], but the significance of this observation was downplayed there. We note that the membership condition x, y ∈ A may not be easy to verify for some subsets A. The authors of [268] do not address this problem; instead they mention, in justice, that if one uses a “brute-force” attack by simply going over elements of A one at a time, the above condition will be satisfied automatically. This however may not be the case with other, more practical, attacks.

We also note that the conjugacy search problem is a special case of the decomposition problem where w′ is conjugate to w and x = y−1. The claim that the decomposition problem should be easier than the conjugacy search problem is intuitively clear since it is generally easier to solve an equation with two unknowns than a special case of the same equation with just one unknown. We admit however that there might be exceptions to this general rule.

Now we give a formal description of a typical protocol based on the decomposition problem. There are a public group G, a public element w ∈ G and two public subgroups A, B ⊆ G commuting elementwise, i. e., ab = ba for any a ∈ A, b ∈ B.
1. Alice randomly selects private elements a1, a2 ∈ A. Then she sends the element a1wa2 to Bob.
2. Bob randomly selects private elements b1, b2 ∈ B. Then he sends the element b1wb2 to Alice.
3. Alice computes KA = a1b1wb2a2, and Bob computes KB = b1a1wa2b2.
Since aibi = biai in G, one has KA = KB = K (as an element of G), which is now Alice’s and Bob’s common secret key. We now discuss several modifications of the above protocol, after a brief toy illustration of the basic version below.
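Here is the promised toy Python sketch of the basic decomposition-based protocol, with the same kind of stand-in platform as before (S8, with A and B supported on disjoint sets of points); it only demonstrates that the two parties arrive at the same key.

```python
# Toy run of the decomposition-problem protocol with commuting subgroups of S_8.
import random

N = 8
def compose(p, q): return tuple(p[q[i]] for i in range(N))
def rand_on(points):
    img = list(points); random.shuffle(img)
    perm = list(range(N))
    for i, j in zip(points, img): perm[i] = j
    return tuple(perm)

w = tuple(random.sample(range(N), N))                    # public element
a1, a2 = rand_on(range(0, 4)), rand_on(range(0, 4))      # Alice's secrets in A
b1, b2 = rand_on(range(4, 8)), rand_on(range(4, 8))      # Bob's secrets in B

to_bob = compose(a1, compose(w, a2))                     # a1 * w * a2
to_alice = compose(b1, compose(w, b2))                   # b1 * w * b2

K_alice = compose(a1, compose(to_alice, a2))             # a1 b1 w b2 a2
K_bob = compose(b1, compose(to_bob, b2))                 # b1 a1 w a2 b2
assert K_alice == K_bob                                  # equal since ab = ba
```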
6.4.1 “Twisted” protocol This idea is due to Shpilrain and Ushakov [456]; the following modification of the above protocol appears to be more secure (at least for some choices of the platform group) against so-called “length-based” attacks (see, e. g., [153, 154, 215]), according to computer experiments. Again, there are a public group G and two public subgroups A, B ≤ G commuting elementwise. 1. Alice randomly selects private elements a1 ∈ A and b1 ∈ B. Then she sends the element a1 wb1 to Bob. 2. Bob randomly selects private elements b2 ∈ B and a2 ∈ A. Then he sends the element b2 wa2 to Alice. 3. Alice computes KA = a1 b2 wa2 b1 = b2 a1 wb1 a2 , and Bob computes KB = b2 a1 wb1 a2 . Since ai bi = bi ai in G, one has KA = KB = K (as an element of G), which is now Alice’s and Bob’s common secret key.
326 | 6 Problems in group theory motivated by cryptography 6.4.2 Finding intersection of given subgroups Another modification of the protocol in Section 6.4 is also due to Shpilrain and Ushakov [456]. First we give a sketch of the idea. Let G be a group and g ∈ G. Denote by CG (g) the centralizer of g in G, i. e., the set of elements h ∈ G such that hg = gh. For S = {g1 , . . . , gk } ⊆ G, CG (g1 , . . . , gk ) denotes the centralizer of S in G, which is the intersection of the centralizers CG (gi ), i = 1, . . . , k. Now, given a public w ∈ G, Alice privately selects a1 ∈ G and publishes a subgroup B ⊆ CG (a1 ) (we tacitly assume here that B can be computed efficiently). Similarly, Bob privately selects b2 ∈ G and publishes a subgroup A ⊆ CG (b2 ). Alice then selects a2 ∈ A and sends w1 = a1 wa2 to Bob, while Bob selects b1 ∈ B and sends w2 = b1 wb2 to Alice. Thus, in the first transmission, say, the adversary faces the problem of finding a1 , a2 such that w1 = a1 wa2 , where a2 ∈ A, but there is no explicit indication of where to choose a1 from. Therefore, before arranging something like a length-based attack in this case, the adversary would have to compute generators of the centralizer CG (B) first (because a1 ∈ CG (B)), which is usually a hard problem by itself since it basically amounts to finding the intersection of the centralizers of individual elements, and finding (the generators of) the intersection of subgroups is a notoriously difficult problem for most groups considered in combinatorial group theory. Now we give a formal description of the protocol from [456]. As usual, there is a public group G, and let w ∈ G be public too. 1. Alice chooses an element a1 ∈ G, chooses a subgroup of CG (a1 ) and publishes its generators A = {α1 , . . . , αk }. 2. Bob chooses an element b2 ∈ G, chooses a subgroup of CG (b2 ) and publishes its generators B = {β1 , . . . , βm }. 3. Alice chooses a random element a2 from the group generated by β1 , . . . , βm and sends PA = a1 wa2 to Bob. 4. Bob chooses a random element b1 from the group generated by α1 , . . . , αk and sends PB = b1 wb2 to Alice. 5. Alice computes KA = a1 PB a2 . 6. Bob computes KB = b1 PA b2 . Since a1 b1 = b1 a1 and a2 b2 = b2 a2 , we have K = KA = KB , the shared secret key. We note that in [481], an attack on this protocol was offered (in the case where a braid group is used as the platform), using what the author calls the linear centralizer method. Then, in [40], another method of cryptanalysis (called the algebraic span cryptanalysis) was offered, applicable to platform groups that admit an efficient linear representation. This method yields attacks on various protocols, including the one in this section, if a braid group is used as the platform.
6.4.3 Commutative subgroups Instead of using commuting subgroups A, B ≤ G, one can use commutative subgroups. Suppose A, B ≤ G are two public commutative subgroups (or subsemigroups) of a group G, and let w ∈ G be a public element. 1. Alice randomly selects private elements a1 ∈ A, b1 ∈ B. Then she sends the element a1 wb1 to Bob. 2. Bob randomly selects private elements a2 ∈ A, b2 ∈ B. Then he sends the element a2 wb2 to Alice. 3. Alice computes KA = a1 a2 wb2 b1 , and Bob computes KB = a2 a1 wb1 b2 . Since a1 a2 = a2 a1 and b1 b2 = b2 b1 in G, one has KA = KB = K (as an element of G), which is now Alice’s and Bob’s common secret key.
6.4.4 The factorization problem The factorization search problem is a special case of the decomposition search problem: Given an element w of a group G and two subgroups A, B ≤ G, find any two elements a ∈ A and b ∈ B that would satisfy a ⋅ b = w, provided at least one such pair of elements exists.
The following protocol relies in its security on the computational hardness of the factorization search problem. As before, there are a public group G and two public subgroups A, B ≤ G commuting elementwise, i. e., ab = ba for any a ∈ A, b ∈ B. 1. Alice randomly selects private elements a1 ∈ A, b1 ∈ B. Then she sends the element a1 b1 to Bob. 2. Bob randomly selects private elements a2 ∈ A, b2 ∈ B. Then he sends the element a2 b2 to Alice. 3. Alice computes KA = b1 (a2 b2 )a1 = a2 b1 a1 b2 = a2 a1 b1 b2 , and Bob computes KB = a2 (a1 b1 )b2 = a2 a1 b1 b2 . Thus, KA = KB = K is now Alice’s and Bob’s common secret key. We note that the adversary, Eve, who knows the elements a1 b1 and a2 b2 , can compute (a1 b1 )(a2 b2 ) = a1 b1 a2 b2 = a1 a2 b1 b2 and (a2 b2 )(a1 b1 ) = a2 a1 b2 b1 , but neither of these products is equal to K if a1 a2 ≠ a2 a1 and b1 b2 ≠ b2 b1 .
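A toy Python sketch of this protocol, with the same stand-in platform as in the earlier sketches; it also shows the two products available to Eve, which in general differ from K.

```python
# Toy run of the factorization-based protocol with commuting subgroups of S_8.
import random

N = 8
def compose(p, q): return tuple(p[q[i]] for i in range(N))
def rand_on(points):
    img = list(points); random.shuffle(img)
    perm = list(range(N))
    for i, j in zip(points, img): perm[i] = j
    return tuple(perm)

a1, b1 = rand_on(range(0, 4)), rand_on(range(4, 8))     # Alice's secrets (a1 in A, b1 in B)
a2, b2 = rand_on(range(0, 4)), rand_on(range(4, 8))     # Bob's secrets (a2 in A, b2 in B)

alice_sends = compose(a1, b1)                           # a1 * b1
bob_sends = compose(a2, b2)                             # a2 * b2

K_alice = compose(b1, compose(bob_sends, a1))           # b1 (a2 b2) a1 = a2 a1 b1 b2
K_bob = compose(a2, compose(alice_sends, b2))           # a2 (a1 b1) b2 = a2 a1 b1 b2
assert K_alice == K_bob

eve_1 = compose(alice_sends, bob_sends)                 # (a1 b1)(a2 b2) = a1 a2 b1 b2
eve_2 = compose(bob_sends, alice_sends)                 # (a2 b2)(a1 b1) = a2 a1 b2 b1
# Unless a1 a2 = a2 a1 and b1 b2 = b2 b1, neither of Eve's products equals K.
```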
Finally, we point out a decision factorization problem: Given an element w of a group G and two subgroups A, B ≤ G, find out whether there are two elements a ∈ A and b ∈ B such that w = a ⋅ b.
This seems to be a new and nontrivial algorithmic problem in group theory, motivated by cryptography.
6.5 The word problem The word problem “needs no introduction,” but it probably makes sense to spell out the word search problem: Suppose H is a group given by a finite presentation ⟨X; R⟩ and let F(X) be the free group with the set X of free generators. Given a group word w in the alphabet X, find a sequence of conjugates of elements from R whose product is equal to w in the free group F(X).
A long time ago, there was an attempt to use the undecidability of the decision word problem (in some groups) in public-key cryptography [321]. This was, in fact, historically the first attempt to employ a hard algorithmic problem from combinatorial group theory in public-key cryptography. However, as was pointed out in [43], the problem that is actually used in [321] is not the word problem, but the word choice problem: Given g, w1 , w2 ∈ G, find out whether g = w1 or g = w2 in G, provided one of the two equalities holds. In this problem, both parts are recursively solvable for any recursively presented platform group G because they both are the “yes” parts of the word problem. Therefore, undecidability of the actual word problem in the platform group has no bearing on the security of the encryption scheme in [321]. On the other hand, employing decision problems (as opposed to search problems) in public-key cryptography would allow one to depart from the canonical paradigm and construct cryptographic protocols with new properties, impossible in the canonical model. In particular, such protocols can be secure against some “brute-force” attacks by a computationally unbounded adversary. There is a price to pay for that, but the price is reasonable: a legitimate receiver decrypts correctly with probability that can be made very close to 1, but not equal to 1. This idea was implemented in [401], so the exposition below follows that paper. We assume that the sender (Bob) is given a presentation Γ (published by the receiver Alice) of a group G by generators and defining relators, Γ = ⟨x1 , x2 , . . . , xn | r1 , r2 , . . . ⟩. No further information about the group G is available to Bob.
Bob is instructed to transmit his private bit to Alice by transmitting a word u = u(x1 , . . . , xn ) equal to 1 in G in place of “1” and a word v = v(x1 , . . . , xn ) not equal to 1 in G in place of “0.” Now we have to specify the algorithms that Bob should use to select his words. Algorithm “0” (for selecting a word v = v(x1 , . . . , xn ) not equal to 1 in G) is quite simple: Bob just selects a random word by building it letter-by-letter, selecting each letter uniformly from the set X = {x1 , . . . , xn , x1−1 , . . . , xn−1 }. The length of such a word should be a random integer from an interval that Bob selects up front, based on his computational abilities. Algorithm “1” (for selecting a word u = u(x1 , . . . , xn ) equal to 1 in G) is slightly more complex. It amounts to applying a random sequence of operations of the following two kinds, starting with the empty word: 1. inserting into a random place in the current word a pair hh−1 for a random (short) word h, 2. inserting into a random place in the current word a random conjugate g −1 ri g of a random defining relator ri . The length of the resulting word should be in the same range as the length of the output of Algorithm “0”, for indistinguishability.
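A minimal sketch of the two sampling algorithms follows, with words stored as lists of letters. The presentation GENS/RELATORS is a hypothetical toy example standing in for Alice's Γ, and the lengths and probabilities are arbitrary illustrative choices.

```python
import random

# Generators are strings 'x1', 'x2', ...; their inverses are written 'X1', 'X2', ...
GENS = ['x1', 'x2', 'x3']
RELATORS = [['x1', 'x1', 'x2', 'x2', 'x2'], ['x1', 'x2', 'x2', 'X1', 'x3']]

def inv(letter):
    return letter.lower() if letter.isupper() else letter.upper()

def free_reduce(word):
    out = []
    for letter in word:
        if out and out[-1] == inv(letter):
            out.pop()
        else:
            out.append(letter)
    return out

def algorithm_0(length):
    """Algorithm "0": a uniformly random word (not equal to 1 with high probability)."""
    alphabet = GENS + [inv(x) for x in GENS]
    return [random.choice(alphabet) for _ in range(length)]

def algorithm_1(steps):
    """Algorithm "1": starting from the empty word, insert either h h^-1 for a short
    random word h, or a random conjugate g^-1 r g of a random defining relator r."""
    word = []
    for _ in range(steps):
        pos = random.randrange(len(word) + 1)
        if random.random() < 0.5:
            h = algorithm_0(random.randrange(1, 4))
            piece = h + [inv(x) for x in reversed(h)]
        else:
            r = random.choice(RELATORS)
            g = algorithm_0(random.randrange(0, 4))
            piece = [inv(x) for x in reversed(g)] + r + g
        word = word[:pos] + piece + word[pos:]
    return free_reduce(word)     # equal to 1 in the group by construction

print(''.join(algorithm_0(25)))
print(''.join(algorithm_1(8)))
```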
6.5.1 Encryption emulation attack Now let us see what happens if a computationally unbounded adversary uses what is called encryption emulation attack on Bob’s encryption. This kind of attack always succeeds against “traditional” encryption protocols where the receiver decrypts correctly with probability exactly 1. The encryption emulation attack is: For either bit, generate its encryption over and over again, each time with fresh randomness, until the ciphertext to be attacked is obtained. Then the corresponding plaintext is the bit that was encrypted.
Thus, the (computationally unbounded) adversary is building up two lists, corresponding to two algorithms above. Our first observation is that the list that corresponds to Algorithm “0” is useless to the adversary because it is eventually going to contain all words in the alphabet X = {x1 , . . . , xn , x1−1 , . . . , xn−1 }. Therefore, the adversary may just as well forget about this list and focus on the other one, which corresponds to Algorithm “1”. Now the situation boils down to the following. If a word w transmitted by Bob appears on the list, then it is equal to 1 in G. If not, then it is not. The only problem is: How can one conclude that w does not appear on the list if the list is infinite? Of course, there is no infinity in real life, so the list is actually finite because of Bob’s com-
330 | 6 Problems in group theory motivated by cryptography putational limitations. Still, at least in theory, the adversary does not know a bound on the size of the list if she does not know Bob’s computational limits. Can the adversary then perhaps stop at some point and conclude that w ≠ 1 with overwhelming probability, just like Alice does? The point however is that this probability may not at all be as “overwhelming” as the probability of the correct decryption by Alice. Compare the following cases: 1. For Alice to decrypt correctly “with overwhelming probability,” the probability P1 (N) for a random word w of length N not to be equal to 1 should converge to 1 (reasonably fast) as N goes to infinity. 2. For the adversary to decrypt correctly “with overwhelming probability,” the probability P2 (N, f (N)) for a random word w of length N produced by Algorithm “1” to have a proof of length ≤ f (N) verifying that w = 1 should converge to 1 as N goes to infinity. Here f (N) represents the adversary’s computational capabilities; this function can be arbitrary, but fixed. We see that the functions P1 (N) and P2 (N) are of very different nature, and any correlation between them is unlikely. We note that the function P1 (N) is generally well understood, and in particular, it is known that in any infinite group G, P1 (N) indeed converges to 1 as N goes to infinity. On the other hand, functions P2 (N, f (N)) are more complex; note also that they may depend on a particular algorithm used by Bob to produce words equal to 1. Algorithm “1” described in this section is very straightforward; there are more delicate algorithms discussed in [366]. Functions P2 (N, f (N)) are currently subject of active research, and in particular, it appears likely that there are groups in which P2 (N, f (N)) does not converge to 1 at all if an algorithm used to produce words equal to 1 is chosen wisely. We also note in passing that if in a group G the word problem is recursively unsolvable, then the length of a proof verifying that w = 1 in G is not bounded by any recursive function of the length of w. Of course, in real life, the adversary may know a bound on the size of the list based on a general idea of what kind of hardware may be available to Bob; but then again, in real life the adversary would be computationally bounded, too. Here we note (again, in passing) that there are groups G with efficiently solvable word problem and words w of length n equal to 1 in G, such that the length of a proof verifying that w = 1 in G is not bounded by any tower of exponents in n (see [413]). Thus, the bottom line is the following. In theory, the adversary cannot positively identify the bit that Bob has encrypted by a word w if she just uses the encryption emulation attack. In fact, such an identification would be equivalent to solving the word problem in G, which would contradict the well-known fact that there are (finitely presented) groups with recursively unsolvable word problem.
It would be nice, of course, if the adversary was unable to positively decrypt using encryption emulation attacks even if she did know Bob’s computational limitations. This, too, can be arranged (see the following subsection).
6.5.2 Encryption Building on the ideas from the previous subsection and combining them with a simple yet subtle trick, we describe here an encryption protocol from [401] that has the following features: (F1) Bob (the sender) encrypts his private bit sequence by a word in a public alphabet X. (F2) Alice (the receiver) decrypts Bob’s transmission correctly with probability that can be made arbitrarily close to 1, but not equal to 1. (F3) The adversary, Eve, is assumed to have no bound on the speed of computation or on the storage space. (F4) Eve is assumed to have complete information on the algorithm(s) and hardware that Bob uses for encryption. However, Eve cannot predict outputs of Bob’s random number generator. (F5) Eve cannot decrypt Bob’s bit correctly with probability > 3/4 by emulating Bob’s encryption algorithm. This leaves Eve with the only possibility: to attack Alice’s decryption algorithm or her algorithm for obtaining public keys, but this is a different story. Here we only discuss the encryption emulation attack, to make the point that this attack can be unsuccessful if the probability of the legitimate decryption is close to 1, but not exactly 1. Here is the relevant protocol (for encrypting a single bit). (P0) Alice publishes two presentations: Γ1 = ⟨x1 , x2 , . . . , xn | r1 , r2 , . . . ⟩,
Γ2 = ⟨x1 , x2 , . . . , xn | s1 , s2 , . . . ⟩. One of them defines the trivial group, whereas the other one defines an infinite group, but only Alice knows which one is which. Bob is instructed to transmit his private bit to Alice as follows. (P1) In place of “1,” Bob transmits a pair of words (w1 , w2 ) in the alphabet X = {x1 , x2 , . . . , xn , x1−1 , . . . , xn−1 }, where w1 is selected randomly, while w2 is selected to be equal to 1 in the group G2 defined by Γ2 (see, e. g., Algorithm “1” in the previous section). (P2) In place of “0,” Bob transmits a pair of words (w1 , w2 ), where w2 is selected randomly, while w1 is selected to be equal to 1 in the group G1 defined by Γ1 .
Under our assumptions (F3), (F4) Eve can identify the word(s) in the transmitted pair which is/are equal to 1 in the corresponding presentation(s), as well as the word, if any, which is not equal to 1. There are the following possibilities: 1. w1 = 1 in G1 , w2 = 1 in G2 ; 2. w1 = 1 in G1 , w2 ≠ 1 in G2 ; 3. w1 ≠ 1 in G1 , w2 = 1 in G2 . It is easy to see that possibility (1) occurs with probability 1/2 (when Bob wants to transmit “1” and G1 is trivial, or when Bob wants to transmit “0” and G2 is trivial). If this possibility occurs, Eve cannot decrypt Bob’s bit correctly with probability > 1/2. Indeed, the only way for Eve to decrypt in this case would be to find out which presentation Γi defines the trivial group, i. e., she would have to attack Alice’s algorithm for obtaining a public key, which would not be part of the encryption emulation attack anymore. Here we just note, in passing, that there are many different ways to construct presentations of the trivial group, some of them involving a lot of random choices (see, e. g., [360] for a survey on the subject). In any case, our claim (F5) was that Eve cannot decrypt Bob’s bit correctly with probability > 3/4 by emulating Bob’s encryption algorithm, which is obviously true in this scheme since the probability for Eve to decrypt correctly is, in fact, precisely 1/2 ⋅ 1/2 + 1/2 ⋅ 1 = 3/4. (Note that Eve decrypts correctly with probability 1 if either of the possibilities (2) or (3) above occurs.)
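The 3/4 bound can also be checked by a small Monte Carlo simulation of the emulation attack, under the simplifying assumption that a random word is never equal to 1 in the infinite group.

```python
import random

def one_round():
    trivial = random.randrange(2)        # Alice's secret: index of the trivial presentation
    bit = random.randrange(2)            # Bob's plaintext bit
    # bit 1: w2 = 1 in G2 (w1 random); bit 0: w1 = 1 in G1 (w2 random).
    w1_trivial = (bit == 0) or (trivial == 0)
    w2_trivial = (bit == 1) or (trivial == 1)
    if w1_trivial and w2_trivial:        # possibility (1): Eve can only guess
        guess = random.randrange(2)
    elif w2_trivial:                     # only w2 = 1: the bit must be "1"
        guess = 1
    else:                                # only w1 = 1: the bit must be "0"
        guess = 0
    return guess == bit

trials = 100_000
print(sum(one_round() for _ in range(trials)) / trials)   # close to 0.75
```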
6.6 The subgroup membership problem In [459], a public-key encryption scheme was offered, where security was based on the alleged computational hardness of the subgroup membership search problem combined with the automorphism inversion problem. A free metabelian group was used as the platform. In this section, we describe a different scheme where the (notoriously hard) subgroup membership problem in a matrix group over ℚ is employed. In the following Section 6.7, we offer another public-key encryption scheme whose security is based on the hardness of the subgroup membership decision problem. The latter problem is known to be algorithmically unsolvable in groups GLn (ℚ) for n ≥ 4, and is not known to be algorithmically solvable for n = 2, 3. The protocol in this section is for encrypting elements of a free group F(x, y) by matrices from GL2 (ℚ). This encryption is homomorphic with respect to group multiplication. Denote
A(k) = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}, \quad B(k) = \begin{pmatrix} 1 & 0 \\ k & 1 \end{pmatrix}.
It is well known that A(k) and B(k) generate a free group if k ≥ 2.
Private key A private key consists of: – an integer k ≥ 2, – a matrix H ∈ GL2 (ℚ), – a pair (u1 = u1 (x, y), u2 = u2 (x, y)) of elements of a free group F(x, y) generated by x and y. The pair (u1 , u2 ) should freely generate a subgroup of F(x, y). Public key Let M1 = H −1 A(k)H, M2 = H −1 B(k)H. A public key is a pair of matrices (u1 (M1 , M2 ), u2 (M1 , M2 )). Denote R = u1 (M1 , M2 ), S = u2 (M1 , M2 ). Plaintext Plaintexts are elements w(x, y) of the free group F(x, y).
Encryption Encryption of w(x, y) is the matrix w(R, S) ∈ GL2 (ℚ). Decryption The matrix w(R, S) is conjugated by H −1 to get Hw(R, S)H −1 = w(HRH −1 , HSH −1 ) = w(u1 (A(k), B(k)), u2 (A(k), B(k))). Now the latter matrix w(u1 (A(k), B(k)), u2 (A(k), B(k))) is in the subgroup of SL2 (ℤ) generated by u1 (A(k), B(k)) and u2 (A(k), B(k)), and in particular, in the subgroup generated by A(k) and B(k). There are efficient algorithms [83, 126] for representing a given element of this subgroup as a group word in A(k) and B(k). However, the plaintext is a group word in u1 (A(k), B(k)) and u2 (A(k), B(k)). Thus, one last step in decryption is rewriting an element of a free group generated by A(k) and B(k) as a group word in u1 (A(k), B(k)) and u2 (A(k), B(k)). This can be done efficiently by using Nielsen’s method (see, e. g., [313]). The decryption is unique because the group generated by u1 and u2 is free.
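Here is a minimal sketch of the key generation, encryption and the first (conjugation) step of decryption over exact rationals. The values of k, H, the words u1, u2 and the plaintext are illustrative choices only, and the remaining rewriting steps (the algorithms of [83, 126] and Nielsen's method) are not implemented.

```python
from fractions import Fraction as F

def mul(X, Y):
    return [[sum(X[i][t] * Y[t][j] for t in range(2)) for j in range(2)] for i in range(2)]

def inv(X):
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det], [-X[1][0] / det, X[0][0] / det]]

def A(k): return [[F(1), F(k)], [F(0), F(1)]]
def B(k): return [[F(1), F(0)], [F(k), F(1)]]

def evaluate(word, x, y):
    """Evaluate a word in the letters x, X (= x^-1), y, Y (= y^-1) at two matrices."""
    table = {'x': x, 'X': inv(x), 'y': y, 'Y': inv(y)}
    out = [[F(1), F(0)], [F(0), F(1)]]
    for letter in word:
        out = mul(out, table[letter])
    return out

k = 2                                        # private
H = [[F(2), F(1)], [F(1), F(1)]]             # private matrix in GL_2(Q)
M1 = mul(mul(inv(H), A(k)), H)
M2 = mul(mul(inv(H), B(k)), H)

u1, u2 = 'xxy', 'xyy'                        # private pair freely generating a subgroup
R, S = evaluate(u1, M1, M2), evaluate(u2, M1, M2)   # public key

plaintext = 'xYx'                            # the word w(x, y)
cipher = evaluate(plaintext, R, S)           # encryption: w(R, S)

# First decryption step: conjugation by H^-1 lands in the subgroup generated by A(k), B(k).
step1 = mul(mul(H, cipher), inv(H))
assert step1 == evaluate(plaintext, evaluate(u1, A(k), B(k)), evaluate(u2, A(k), B(k)))
```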
6.6.1 Security assumption The subgroup membership (search) problem in a subgroup of SL2 (ℚ) generated by two given elements is computationally hard. We note that the general subgroup membership (decision) problem in SL2 (ℚ) is open, and therefore there is no known time bound for solving the subgroup membership (search) problem in a random subgroup of SL2 (ℚ). 6.6.2 Trapdoor Reducing the general subgroup membership (search) problem in SL2 (ℚ) to the subgroup membership (search) problem in the subgroup generated by A(k) and B(k), where this problem can be solved efficiently due to [83] or [126].
6.7 Using the subgroup membership decision problem The complexity of decision problems, as opposed to that of search problems, is rarely used in public-key cryptography. A notable exception is [401] (see also Section 6.5) where the complexity of the word (decision) problem was used to defeat the encryption emulation attack. Here we use the complexity of the subgroup membership decision problem in groups GLn (ℚ) to encrypt a single bit. It is well known that if n ≥ 4, then the subgroup membership problem in groups GLn (ℚ) is unsolvable since these groups contain a subgroup isomorphic to F2 × F2 , a direct product of two free groups of rank 2. In the latter group, the subgroup membership problem is known to be unsolvable [350]. The initial set-up is similar to that in Section 6.6, except that we will be working in the group GL4 (ℚ), so in particular, here
A(k) = \begin{pmatrix} 1 & k & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad B(k) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ k & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.
We require here that k ≥ 3, and the conjugating matrix H is now taken from GL4 (ℚ). Otherwise, the set-up is the same. Encryption To encrypt the “1” bit, the sender builds a random group word in the published elements and transmits it to the receiver.
To encrypt the “0” bit, the sender just selects a random matrix from GL4 (ℚ) and transmits it to the receiver. Decryption We are going to use the fact that for a random matrix from SL2 (ℤ), the probability to belong to the subgroup generated by A(k) and B(k) (in the notation of Section 6.6) is negligible if k ≥ 3; this follows from the fact that the latter subgroup has infinite index in SL2 (ℤ) (see [83]). The same is therefore also true for a random matrix from GL4 (ℚ) and for the subgroup generated by A(k) and B(k) in the notation of this section. Thus, the receiver applies the procedure from Section 6.6 to see whether the transmitted matrix belongs to the subgroup generated by public elements. If it does, the secret bit must be “1,” otherwise it must be “0.”
6.8 The isomorphism inversion problem The isomorphism (decision) problem for groups is very well known: Given two groups by their finite presentations in terms of generators and defining relators, find out whether the groups are isomorphic. The search version of this problem is well known, too: Given two finite presentations defining isomorphic groups, find a particular isomorphism between the groups. Now the following problem, of interest in cryptography, does not appear to have been previously considered in combinatorial group theory: Given two finite presentations defining isomorphic groups G and H and an isomorphism φ : G → H, find φ−1 .
Now we describe an encryption scheme whose security is based on the alleged computational hardness of the isomorphism inversion problem. Our idea itself is quite simple: encrypt with a public isomorphism φ that is computationally infeasible for the adversary to invert. A legitimate receiver, on the other hand, can efficiently compute φ−1 because she knows a factorization of φ in a product of “elementary”, easily invertible isomorphisms. What is interesting to note is that this encryption is homomorphic because φ(g1 g2 ) = φ(g1 )φ(g2 ) for any g1 , g2 ∈ G. The significance of this observation is due to a result of [401]: If the group G is a non-abelian finite simple group, then any homomorphic encryption on G can be converted to a fully homomorphic encryption scheme, i. e., encryption that respects not just one but two operations: either Boolean AND and OR or arithmetic addition and multiplication. In summary, a relevant scheme can be built as follows. Given a public presentation of a group G by generators and defining relations, the receiver (Alice) uses a chain of
336 | 6 Problems in group theory motivated by cryptography private “elementary” isomorphisms G → H1 → ⋅ ⋅ ⋅ → Hk → H, each of which is easily invertible, but the (public) composite isomorphism φ : G → H is hard to invert without the knowledge of a factorization in a product of “elementary” ones (note that φ is published as a map taking the generators of G to words in the generators of H). Having obtained in this way a (private) presentation H, Alice discards some of the defining relations to obtain a public presentation H.̂ Thus, the group H, as well as the group G (which is isomorphic to H), is a homomorphic image of the group Ĥ (note that Ĥ has the same set of generators as H does but has fewer defining relations). Now the sender (Bob), who wants to encrypt his plaintext g ∈ G, selects an arbitrary word wg (in the generators of G) representing the element g and applies the public isomorphism φ to wg to get φ(wg ), which is a word in the generators of H (or H,̂ since H and Ĥ have the same set of generators). He then selects an arbitrary word hg in the generators of Ĥ representing the same element of Ĥ as φ(wg ) does, and this is now his ciphertext: hg = E(g). To decrypt, Alice applies her private map φ−1 (which is a map taking the generators of Ĥ to words in the generators of G) to hg to get a word wg = φ−1 (hg ). This word wg represents the same element of G as wg does because φ−1 (hg ) = φ−1 (φ(wg )) = wg in the group G since both φ and φ−1 are homomorphisms, and the composition of φ and φ−1 is the identity map on the group G, i. e., it takes every word in the generators of G to a word representing the same element of G. Thus, Alice decrypts correctly. We emphasize here that a plaintext is a group element g ∈ G, not a word in the generators of G. This implies, in particular, that there should be some kind of canonical way (a “normal form”) of representing elements of G. For example, for elements of an alternating group Am (these groups are finite non-abelian simple groups if m ≥ 5), such a canonical representation can be the standard representation by a permutation of the set {1, . . . , m}. Now we are going to give more details on how one can construct a sequence of “elementary” isomorphisms starting with a given presentation of a group G = ⟨x1 , x2 , . . . | r1 , r2 , . . . ⟩ (here x1 , x2 , . . . are generators and r1 , r2 , . . . are defining relators). These “elementary” isomorphisms are called Tietze transformations. They are universal in the sense that they can be applied to any (semi)group presentation. Tietze transformations are of the following types: (T1) Introducing a new generator: Replace ⟨x1 , x2 , . . . | r1 , r2 , . . . ⟩ by ⟨y, x1 , x2 , . . . | ys−1 , r1 , r2 , . . . ⟩, where s = s(x1 , x2 , . . . ) is an arbitrary element in the generators x1 , x2 , . . . . (T2) Canceling a generator (this is the converse of (T1)): If we have a presentation of the form ⟨y, x1 , x2 , . . . | q, r1 , r2 , . . . ⟩, where q is of the form ys−1 , and s, r1 , r2 , . . . are in the group generated by x1 , x2 , . . . , replace this presentation by ⟨x1 , x2 , . . . | r1 , r2 , . . . ⟩. (T3) Applying an automorphism: Apply an automorphism of the free group generated by x1 , x2 , . . . to all the relators r1 , r2 , . . . .
(T4) Changing defining relators: Replace the set r1 , r2 , . . . of defining relators by another set r1′ , r2′ , . . . with the same normal closure. That means, each of r1′ , r2′ , . . . should belong to the normal subgroup generated by r1 , r2 , . . . , and vice versa. Tietze proved (see, e. g., [313]) that two groups given by presentations ⟨x1 , x2 , . . . | r1 , r2 , . . . ⟩ and ⟨y1 , y2 , . . . | s1 , s2 , . . . ⟩ are isomorphic if and only if one can get from one of the presentations to the other by a sequence of transformations (T1)–(T4). For each Tietze transformation of the types (T1)–(T3), it is easy to obtain an explicit isomorphism (as a map on generators) and its inverse. For a Tietze transformation of type (T4), the isomorphism is just the identity map. We would like here to make Tietze transformations of the type (T4) recursive, because a priori it is not clear how Alice can actually implement these transformations. Thus, Alice can use the following recursive version of (T4). (T4′) In the set r1 , r2 , . . . , replace some ri by one of the following: ri−1 , ri rj , ri rj−1 , rj ri , rj ri−1 or xk−1 ri xk , xk ri xk−1 , where j ≠ i and k is arbitrary. One particularly useful feature of Tietze transformations is that they can break long defining relators into short pieces (of length, say, 3 or 4) at the expense of introducing more generators, as illustrated by the following simple example. In this example, we start with a presentation having two relators of length 5 in three generators, and end up with a presentation having four relators of length 3 and one relator of length 4, in six generators. The ≅ symbol below means “is isomorphic to.” Example 6.8.1. We have G = ⟨x1 , x2 , x3 | x12 x23 , x1 x22 x1−1 x3 ⟩
≅ ⟨x1 , x2 , x3 , x4 | x4 = x12 , x4 x23 , x1 x22 x1−1 x3 ⟩
≅ ⟨x1 , x2 , x3 , x4 , x5 | x5 = x1 x22 , x4 = x12 , x4 x23 , x5 x1−1 x3 ⟩ ≅ (now switching x1 and x5 – this is (T3))
≅ ⟨x1 , x2 , x3 , x4 , x5 | x1 = x5 x22 , x4 = x52 , x4 x23 , x1 x5−1 x3 ⟩
≅ ⟨x1 , x2 , x3 , x4 , x5 , x6 | x1−1 x5 x22 , x4−1 x52 , x6−1 x4 x2 , x6 x22 , x1 x5−1 x3 ⟩ = H. We note that this procedure of breaking relators into pieces of length 3 increases the total relator length (measured as the sum of the lengths of all relators) by at most a factor of 2. Since we need our “elementary” isomorphisms to be also given in the form xi → yi , we note that the isomorphism between the first two presentations above is given by xi → xi , i = 1, 2, 3, and the inverse isomorphism is given by xi → xi , i = 1, 2, 3; x4 → x12 . By composing elementary isomorphisms, we compute the isomorphism φ between the first and the last presentation, i. e., φ : x1 → x5 , x2 → x2 , x3 → x3 . By composing the inverses of elementary isomorphisms, we compute φ−1 : x1 → x1 x22 , x2 → x2 ,
338 | 6 Problems in group theory motivated by cryptography x3 → x3 , x4 → x12 , x5 → x1 , x6 → x12 x2 . We see that even in this toy example, recovering φ−1 from the public φ is not quite trivial without knowing a sequence of intermediate Tietze transformations. Furthermore, if Alice discards, say, two of the relators from the last presentation to get a public Ĥ = ⟨x1 , x2 , x3 , x4 , x5 , x6 | x1−1 x5 x22 , x6 x22 , x1 x5−1 x3 ⟩, then there is no isomorphism between Ĥ and G whatsoever, and the problem for the adversary is now even less trivial: find relators completing the public presentation Ĥ to a presentation H isomorphic to G by way of the public isomorphism φ, and then find φ−1 . Moreover, φ as a map on the generators of G may not induce an onto homomorphism from G to H,̂ and this will deprive the adversary even from the “brute-force” attack by looking for a map ψ on the generators of Ĥ such that ψ : Ĥ → G is a homomorphism, and ψ(φ) is identical on G. If, say, in the example above we discard the relator x1 x5−1 x3 from the final presentation H, then x4 will not be in the subgroup of Ĥ generated by φ(xi ), and therefore there cannot possibly be a ψ : Ĥ → G such that ψ(φ)
is identical on G. We now describe a homomorphic public-key encryption scheme a little more formally. Key generation: Let φ : G → H be an isomorphism. Alice’s public key then consists of φ as well as presentations G and H,̂ where Ĥ is obtained from H by keeping all of the generators but discarding some of the relators. Alice’s private key consists of φ−1 and H. Encrypt: Bob’s plaintext is g ∈ G. To encrypt, he selects an arbitrary word wg in the generators of G representing the element g and applies the public isomorphism φ to wg to get φ(wg ), which is a word in the generators of H (or H,̂ since H and Ĥ have the same set of generators). He then selects an arbitrary word hg in the generators of Ĥ representing the same element of Ĥ as φ(wg ) does, and this is now his ciphertext: hg = E(g). Decrypt: To decrypt, Alice applies her private map φ−1 to hg to get a word wg = φ−1 (hg ). This word wg represents the same element of G as wg does because φ−1 (hg ) = φ−1 (φ(wg )) = wg in the group G since both φ and φ−1 are homomorphisms, and the composition of φ and φ−1 is the identity map on the group G.
In the following example, we use the presentations G = ⟨x1 , x2 , x3 | x12 x23 , x1 x22 x1−1 x3 ⟩,
Ĥ = ⟨x1 , x2 , x3 , x4 , x5 , x6 | x1 = x5 x22 , x4 = x52 , x6 = x4 x2 , x6 x22 ⟩ and the isomorphism φ : x1 → x5 , x2 → x2 , x3 → x3 from Example 6.8.1 to illustrate how encryption works. Example 6.8.2. Let the plaintext be the element g ∈ G represented by the word x1 x2 . Then φ(x1 x2 ) = x5 x2 . Then the word x5 x2 is randomized in Ĥ by using relators of Ĥ as
well as “trivial” relators xi xi−1 = 1 and xi−1 xi = 1. For example, multiply x5 x2 by x4 x4−1 to get x4 x4−1 x5 x2 . Then replace x4 by x52 , according to one of the relators of H,̂ and get x52 x4−1 x5 x2 . Now insert x6 x6−1 between x5 and x2 to get x52 x4−1 x5 x6 x6−1 x2 , and then replace x6 by x4 x2 to get x52 x4−1 x5 x4 x2 x6−1 x2 , which can be used as the encryption E(g). Finally, we note that automorphisms, instead of general isomorphisms, were used in [184] and [357] to build public-key cryptographic primitives employing the same general idea of building an automorphism as a composition of elementary ones. In [357], those were automorphisms of a polynomial algebra, while in [184] automorphisms of a tropical algebra were used along the same lines. We also note that “elementary isomorphisms” (i. e., Tietze transformations) are universal in nature and can be adapted to most any algebraic structure (see, e. g., [352, 351, 458]).
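As a complement to Example 6.8.2, here is a small sketch of the substitution step underlying encryption and decryption: a map given on generators is applied to a word by substituting the image of each generator and freely reducing. The single-letter generator names a, ..., f standing for x1, ..., x6 are an assumption made only for readability; the maps are those of Example 6.8.1.

```python
def inv_letter(c):
    return c.lower() if c.isupper() else c.upper()

def free_reduce(word):
    out = []
    for c in word:
        if out and out[-1] == inv_letter(c):
            out.pop()
        else:
            out.append(c)
    return ''.join(out)

def apply_map(word, images):
    """Apply a map given on generators (lowercase = generator, uppercase = its inverse)."""
    pieces = []
    for c in word:
        if c.islower():
            pieces.append(images[c])
        else:
            img = images[c.lower()]
            pieces.append(''.join(inv_letter(t) for t in reversed(img)))
    return free_reduce(''.join(pieces))

# phi : x1 -> x5, x2 -> x2, x3 -> x3 and its inverse from Example 6.8.1.
phi     = {'a': 'e', 'b': 'b', 'c': 'c'}
phi_inv = {'a': 'abb', 'b': 'b', 'c': 'c', 'd': 'aa', 'e': 'a', 'f': 'aab'}

w = 'ab'                                      # the plaintext x1 x2 of Example 6.8.2
print(apply_map(w, phi))                      # 'eb', i.e., x5 x2
print(apply_map(apply_map(w, phi), phi_inv))  # 'ab' again
```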
6.9 Semidirect product of groups and more peculiar computational assumptions Using a semidirect product of (semi)groups as the platform for a very simple key exchange protocol (inspired by the Diffie–Hellman protocol) yields new and sometimes rather peculiar computational assumptions. The exposition in this section follows [198] (see also [232]). First we recall the definition of a semidirect product. Definition 6.9.1. Let G, H be two groups, let Aut(G) be the group of automorphisms of G and let ρ : H → Aut(G) be a homomorphism. Then the semidirect product of G and H is the set Γ = G ⋊ρ H = {(g, h) : g ∈ G, h ∈ H} with the group operation given by (g, h)(g′, h′) = (g^{ρ(h′)} ⋅ g′, h ⋅ h′).
Here g^{ρ(h′)} denotes the image of g under the automorphism ρ(h′), and when we write a product h ⋅ h′ of two morphisms, this means that h is applied first.
In this section, we focus on a special case of this construction, where the group H is just a subgroup of the group Aut(G). If H = Aut(G), then the corresponding semidirect product is called the holomorph of the group G. Thus, the holomorph of G, usually denoted by Hol(G), is the set of all pairs (g, ϕ), where g ∈ G, ϕ ∈ Aut(G), with the group operation given by (g, ϕ) ⋅ (g′, ϕ′) = (ϕ′(g) ⋅ g′, ϕ ⋅ ϕ′). It is often more practical to use a subgroup of Aut(G) in this construction, and this is exactly what we do below, where we describe a key exchange protocol that uses (as the platform) an extension of a group G by a cyclic subgroup of Aut(G).
340 | 6 Problems in group theory motivated by cryptography One can also use this construction if G is not necessarily a group, but just a semigroup, and/or consider endomorphisms of G, not necessarily automorphisms. Then the result will be a semigroup. Thus, let G be a (semi)group. An element g ∈ G is chosen and made public as well as an arbitrary automorphism ϕ ∈ Aut(G) (or an arbitrary endomorphism ϕ ∈ End(G)). Bob chooses a private n ∈ ℕ, while Alice chooses a private m ∈ ℕ. Both Alice and Bob are going to work with elements of the form (f , ϕr ), where f ∈ G, r ∈ ℕ. Note that two elements of this form are multiplied as follows: (f , ϕr ) ⋅ (h, ϕs ) = (ϕs (f ) ⋅ h, ϕr+s ). The following is a public-key exchange protocol between Alice and Bob. 1. Alice computes (g, ϕ)m = (ϕm−1 (g) ⋅ ⋅ ⋅ ϕ2 (g) ⋅ ϕ(g) ⋅ g, ϕm ) and sends only the first component of this pair to Bob. Thus, she sends to Bob only the element a = ϕm−1 (g) ⋅ ⋅ ⋅ ϕ2 (g) ⋅ ϕ(g) ⋅ g of the (semi)group G. 2. Bob computes (g, ϕ)n = (ϕn−1 (g) ⋅ ⋅ ⋅ ϕ2 (g) ⋅ ϕ(g) ⋅ g, ϕn ) and sends only the first component of this pair to Alice. Thus, he sends to Alice only the element b = ϕn−1 (g) ⋅ ⋅ ⋅ ϕ2 (g) ⋅ ϕ(g) ⋅ g of the (semi)group G. 3. Alice computes (b, x) ⋅ (a, ϕm ) = (ϕm (b) ⋅ a, x ⋅ ϕm ). Her key is now KA = ϕm (b) ⋅ a. Note that she does not actually “compute” x ⋅ ϕm because she does not know the automorphism x = ϕn ; recall that it was not transmitted to her. But she does not need it to compute KA . 4. Bob computes (a, y) ⋅ (b, ϕn ) = (ϕn (a) ⋅ b, y ⋅ ϕn ). His key is now KB = ϕn (a) ⋅ b. Again, Bob does not actually “compute” y ⋅ ϕn because he does not know the automorphism y = ϕm . 5. Since (b, x) ⋅ (a, ϕm ) = (a, y) ⋅ (b, ϕn ) = (g, ϕ)m+n , we should have KA = KB = K, the shared secret key. Remark 6.9.2. Note that, in contrast to the original Diffie–Hellman key exchange, correctness here is based on the equality hm ⋅hn = hn ⋅hm = hm+n rather than on the equality (hm )n = (hn )m = hmn . In the original Diffie–Hellman set up, our trick would not work because if the shared key K was just the product of two openly transmitted elements, then anybody, including the eavesdropper, could compute K. We note that the general protocol above can be used with any noncommutative group G if ϕ is selected to be a nontrivial inner automorphism, i. e., conjugation by an element which is not in the center of G. Furthermore, it can be used with any noncommutative semigroup G as well, as long as G has some invertible elements; these can be used to produce inner automorphisms. A typical example of such a semigroup would be a semigroup of matrices over some ring. Now let G be a noncommutative (semi)group and let h ∈ G be an invertible noncentral element. Then conjugation by h is a nonidentical inner automorphism of G that we denote by φh . We use an extension of the semigroup G by the inner automorphism φh , as described in the beginning of this section. For any element g ∈ G and for any
integer k ≥ 1, we have φh (g) = h−1 gh; φh^k (g) = h−k ghk . Now our general protocol is specialized in this case as follows. 1. Alice and Bob agree on a (semi)group G and on public elements g, h ∈ G, where h is an invertible noncentral element. 2. Alice selects a private positive integer m, and Bob selects a private positive integer n. 3. Alice computes (g, φh )m = (h−m+1 ghm−1 ⋅ ⋅ ⋅ h−2 gh2 ⋅ h−1 gh ⋅ g, φh^m ) and sends only the first component of this pair to Bob. Thus, she sends to Bob only the element A = h−m+1 ghm−1 ⋅ ⋅ ⋅ h−2 gh2 ⋅ h−1 gh ⋅ g = h−m (hg)m . 4. Bob computes (g, φh )n = (h−n+1 ghn−1 ⋅ ⋅ ⋅ h−2 gh2 ⋅ h−1 gh ⋅ g, φh^n ) and sends only the first component of this pair to Alice. Thus, he sends to Alice only the element B = h−n+1 ghn−1 ⋅ ⋅ ⋅ h−2 gh2 ⋅ h−1 gh ⋅ g = h−n (hg)n . 5.
6.
7.
m m Alice computes (B, x) ⋅ (A, φm h ) = (φh (B) ⋅ A, x ⋅ φh ). Her key is now KAlice = m −(m+n) m+n φh (B) ⋅ A = h (hg) . Note that she does not actually “compute” x ⋅ φm h because she does not know the automorphism x = φnh ; recall that it was not transmitted to her. But she does not need it to compute KAlice . Bob computes (A, y) ⋅ (B, φnh ) = (φnh (A) ⋅ B, y ⋅ φnh ). His key is now KBob = φnh (A) ⋅ B. Again, Bob does not actually “compute” y ⋅ φnh because he does not know the automorphism y = φm h. m Since (B, x)⋅(A, φh ) = (A, y)⋅(B, φnh ) = (M, φh )m+n , we should have KAlice = KBob = K, the shared secret key.
Thus, the shared secret key in this protocol is n −(m+n) K = φm (hg)m+n . h (B) ⋅ A = φh (A) ⋅ B = h
Therefore, our security assumption here is that it is computationally hard to retrieve the key K = h−(m+n) (hg)m+n from the quadruple (h, g, h−m (hg)m , h−n (hg)n ). In particular, we have to take care that the elements h and hg do not commute because otherwise, K is just a product of h−m (hg)m and h−n (hg)n . Once again, the problem is: Given a (semi)group G and elements g, h, h−m (hg)m and h−n (hg)n of G, find h−(m+n) (hg)m+n .
Compare this to the Diffie–Hellman problem from Section 6.2: Given a (semi)group G and elements g, g n and g m of G, find g mn .
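To make the mechanics of the exchange concrete, here is a toy run of the protocol with φ an inner automorphism of a matrix semigroup; the modulus, the matrices and the exponent ranges are illustrative choices only.

```python
import random

p = 2003

def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(3)) % p for j in range(3))
                 for i in range(3))

def phi_iter(phi, x, k):
    for _ in range(k):
        x = phi(x)
    return x

def semidirect_power_first(g, phi, m):
    """First component of (g, phi)^m = (phi^(m-1)(g) ... phi(g) g, phi^m)."""
    total, current = g, g
    for _ in range(m - 1):
        current = phi(current)
        total = mul(current, total)
    return total

g = ((1, 2, 3), (4, 5, 6), (7, 8, 10))              # public
h = ((1, 1, 0), (0, 1, 1), (0, 0, 1))               # public, invertible and noncentral
h_inv = ((1, p - 1, 1), (0, 1, p - 1), (0, 0, 1))
phi = lambda x: mul(h_inv, mul(x, h))               # the inner automorphism phi_h

m = random.randrange(10, 50)                        # Alice's private exponent
n = random.randrange(10, 50)                        # Bob's private exponent

a = semidirect_power_first(g, phi, m)               # Alice -> Bob
b = semidirect_power_first(g, phi, n)               # Bob -> Alice

K_alice = mul(phi_iter(phi, b, m), a)               # phi_h^m(B) * A
K_bob   = mul(phi_iter(phi, a, n), b)               # phi_h^n(A) * B
assert K_alice == K_bob
print("shared key:", K_alice)
```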
342 | 6 Problems in group theory motivated by cryptography A weaker security assumption arises if an eavesdropper tries to recover a private exponent from a transmission, i. e., to recover, say, m from h−m (hg)m . A special case of this problem, where h = 1, is the “discrete log” problem, namely, recover m from g and g m . However, the discrete log problem is a problem on cyclic, in particular abelian, groups, whereas in the former problem it is essential that g and h do not commute. By varying the automorphism (or endomorphism) used for an extension of G, one can get many other security assumptions. However, many (semi)groups G just do not have outer (i. e., noninner) automorphisms, so there is no guarantee that a selected platform (semi)group will have any outer automorphisms. On the other hand, it will have inner automorphisms as long as it has invertible noncentral elements. In conclusion, we note that there is always a concern (as well as in the standard Diffie–Hellman protocol) about the orders of public elements (in our case, about the orders of h and hg): if one of the orders is too small, then a brute-force attack may be feasible. If a group of matrices of small size is chosen as the platform, then the above protocol turns out to be vulnerable to a linear algebra attack, similar to an attack on Stickel’s protocol [474] offered in [453], albeit more sophisticated (see [343, 432]). Selecting a good platform (semi)group for the protocol in this section still remains an open problem. We also mention another, rather different, proposal [402] of a cryptosystem based on the semidirect product of two groups and yet another, more complex, proposal of a key agreement based on the semidirect product of two monoids [7]. Finally, we note that the semidirect product construction is rather universal and can be used (in a cryptographic context), in particular, with tropical algebras (see [185]) that enjoy unparalleled efficiency of computation.
6.10 The subset sum problem and the knapsack problem Most of the material in this brief section can be found in Chapter 5 of this book, but we included it here for smoother reading and to express our hope that the subset sum problem and/or the knapsack problem have potential to be employed in cryptographic primitives. As usual, elements of a group G are given as words in the alphabet X ∪ X −1 . We begin with three decision problems. The subset sum problem (SSP): Given g1 , . . . , gk , g ∈ G, decide if
g = g1^ε1 ⋅ ⋅ ⋅ gk^εk for some ε1 , . . . , εk ∈ {0, 1}.   (6.1)
The knapsack problem (KP): Given g1 , . . . , gk , g ∈ G, decide if
g = g1^ε1 ⋅ ⋅ ⋅ gk^εk   (6.2)
for some nonnegative integers ε1 , . . . , εk . The third problem is equivalent to KP in the abelian case, but in general this is a completely different problem. The submonoid membership problem (SMP): Given elements g1 , . . . , gk , g ∈ G, decide if g belongs to the submonoid generated by g1 , . . . , gk in G, i. e., if the following equality holds for some gi1 , . . . , gis ∈ {g1 , . . . , gk }, s ∈ ℕ:
g = gi1 ⋅ ⋅ ⋅ gis .   (6.3)
The restriction of SMP to the case where the set of generators {g1 , . . . , gn } is closed under inversion (so that the submonoid is actually a subgroup of G) is a well-known subgroup membership problem, one of the most basic algorithmic problems in group theory. There are also natural search versions of the decision problems above, where the goal is to find a particular solution to equation (6.1), (6.2) or (6.3), provided that solutions do exist. We also mention, in passing, an interesting research avenue explored in [363]: Many search problems can be converted to optimization problems asking for an “optimal” (usually meaning “minimal”) solution of the corresponding search problem. A well-known example of an optimization problem is the geodesic problem: Given a word in the generators of a group G, find a word of minimum length representing the same element of G. The classical (i. e., not group-theoretical) subset sum problem is one of the very basic NP-complete problems, so there is extensive related bibliography (see [253]). The SSP problem attracted a lot of extra attention when Merkle and Hellmann designed a public-key cryptosystem [338] based on a variation of SSP. That cryptosystem was broken by Shamir in [450], but the interest persists and the ideas survive in numerous new cryptosystems and their variations (see, e. g., [388]). Generalizations of knapsack-type cryptosystems to noncommutative groups seem quite promising from the viewpoint of post-quantum cryptography, although relevant cryptographic schemes are yet to be built. In [363], the authors showed, in particular, that SSP is NP-complete in (1) the direct sum of countably many copies of the infinite cyclic group ℤ; (2) free metabelian nonabelian groups of finite rank; (3) wreath products of two finitely generated infinite abelian groups; (4) Thompson’s group F; (5) the Baumslag–Solitar group BS(m, n) for |m| ≠ |n|; and many other groups.
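For concreteness, here is a brute-force solver for the group-theoretic SSP; it is exponential in k, so it only makes the definition concrete and is not a practical algorithm. The group is passed in as a Python multiplication function, and the worked example is in the free abelian group ℤ², written additively.

```python
from itertools import product

def subset_sum(gs, g, op, identity):
    """Return (eps_1, ..., eps_k) in {0,1}^k with g = g_1^eps_1 ... g_k^eps_k, or None."""
    for eps in product((0, 1), repeat=len(gs)):
        acc = identity
        for e, gi in zip(eps, gs):
            if e:
                acc = op(acc, gi)
        if acc == g:
            return eps
    return None

add = lambda u, v: (u[0] + v[0], u[1] + v[1])
gens = [(1, 0), (0, 1), (2, 3), (5, 1)]
print(subset_sum(gens, (3, 4), add, (0, 0)))   # (1, 1, 1, 0)
```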
344 | 6 Problems in group theory motivated by cryptography In [275], the authors show that the subset sum problem is polynomial-time decidable in every finitely generated virtually nilpotent group but there exists a polycyclic group where this problem is NP-complete. Later in [385], Nikolaev and Ushakov showed that, in fact, every polycyclic nonvirtually-nilpotent group has NP-complete subset sum problem. Also in [275], it was shown that the knapsack problem is undecidable in a direct product of sufficiently many copies of the discrete Heisenberg group (which is nilpotent of class 2). However, for the discrete Heisenberg group itself, the knapsack problem is decidable. Thus, decidability of the knapsack problem is not preserved under direct products. In [145], the effect of free and direct products on the time complexity of the knapsack and related problems was studied further. This concludes our very brief survey on the subset sum and the knapsack problems in groups and their potential use in cryptographic primitives. A much more detailed survey on these problems can be found in Sections 5.1.2, 5.2 and 5.3.
6.11 The hidden subgroup problem Given a group G, a subgroup H ≤ G and a set X, we say that a function f : G → X hides the subgroup H if for all g1 , g2 ∈ G, one has f (g1 ) = f (g2 ) if and only if g1 H = g2 H for the cosets of H. Equivalently, the function f is constant on the cosets of H, while it is different between the different cosets of H. The hidden subgroup problem (HSP) is: Let G be a finite group, X a finite set and f : G → X a function that hides a subgroup H ≤ G. The function f is given via an oracle, which uses O(log |G| + log |X|) bits. Using information gained from evaluations of f via its oracle, determine a generating set for H.
A special case is where X is a group and f is a group homomorphism, in which case H corresponds to the kernel of f . The importance of the HSP is due to the facts that: – Shor’s polynomial-time quantum algorithm for the factoring and discrete logarithm problems (as well as several of its extensions) relies on the ability of quantum computers to solve the HSP for finite abelian groups. Both the factoring and the discrete logarithm problem are of paramount importance for modern commercial cryptography. – The existence of efficient quantum algorithms for the HSP for certain non-abelian groups would imply efficient quantum algorithms for two major problems: the graph isomorphism problem and certain shortest vector problems in lattices. More specifically, an efficient quantum algorithm for the HSP for the symmetric group would give a quantum algorithm for the graph isomorphism, whereas an efficient quantum algorithm for the HSP for the dihedral group would give a quantum algorithm for the shortest vector problem.
We refer to [494] for a brief discussion on how the HSP can be generalized to infinite groups.
6.12 Relations between some of the problems In this section, we discuss relations between some of the problems described earlier in this chapter. In Section 6.10 we have already pointed out some of the relations; now here are some other relations, through the prism of cryptographic applications. We start with the conjugacy search problem, which was the subject of Section 6.3, and one of its ramifications, the subgroup-restricted conjugacy search problem: Given two elements w, h of a group G, a subgroup A ≤ G and the information that w^a = h for some a ∈ A, find at least one particular element a like that.
In reference to the Ko–Lee protocol described in Section 6.3, one of the parties (Alice) transmits wa for some private a ∈ A, and the other party (Bob) transmits wb for some private b ∈ B, where the subgroups A and B commute elementwise, i. e., ab = ba for any a ∈ A, b ∈ B. Now suppose the adversary finds a1 , a2 ∈ A such that a1 wa2 = a−1 wa and b1 , b2 ∈ B such that b1 wb2 = b−1 wb. Then the adversary gets a1 b1 wb2 a2 = a1 b−1 wba2 = b−1 a1 wa2 b = b−1 a−1 wab = K, the shared secret key. We emphasize that these a1 , a2 and b1 , b2 do not have anything to do with the private elements originally selected by Alice or Bob, which simplifies the search substantially. We also point out that, in fact, it is sufficient for the adversary to find just one pair, say, a1 , a2 ∈ A, to get the shared secret key: a1 (b−1 wb)a2 = b−1 a1 wa2 b = b−1 a−1 wab = K. In summary, to get the secret key K, the adversary does not have to solve the (subgroup-restricted) conjugacy search problem, but instead, it is sufficient to solve an apparently easier (subgroup-restricted) decomposition search problem (see Section 6.4). Then, one more trick reduces the decomposition search problem to a special case where w = 1, i. e., to the factorization problem (see Section 6.4.4). Namely, given w = a ⋅ w ⋅ b, multiply it on the left by the element w−1 (which is the inverse of the public element w) to get w = w−1 a ⋅ w ⋅ b = (w−1 a ⋅ w) ⋅ b. Thus, if we denote by Aw the subgroup conjugate to A by the (public) element w, the problem for the adversary is now the following factorization search problem:
Given an element w of a group G and two subgroups Aw , B ≤ G, find any two elements a ∈ Aw and b ∈ B that would satisfy a ⋅ b = w , provided at least one such pair of elements exists.
Since in the original Ko–Lee protocol one has A = B, this yields the following interesting observation. If in that protocol A is a normal subgroup of G, then Aw = A, and the above problem becomes: Given w ∈ A, find any two elements a1 , a2 ∈ A such that w = a1 a2 . This problem is trivial; a1 here could be any element from A, and then a2 = a1^{−1} w . Therefore, in choosing the platform group G and two commuting subgroups for a protocol described in Section 6.3 or Section 6.4, one has to avoid normal subgroups. This means, in particular, that “artificially” introducing commuting subgroups as, say, direct factors is unacceptable from the security point of view. At the other extreme, there are malnormal subgroups. A subgroup A ≤ G is called malnormal in G if for any g ∈ G ∖ A, Ag ∩ A = {1}. We observe that if in the original Ko–Lee protocol A is a malnormal subgroup of G, then the decomposition search problem corresponding to that protocol has a unique solution if w ∉ A. Indeed, suppose w′ = a1^{−1} ⋅ w ⋅ a1 = a2^{−1} ⋅ w ⋅ a2 , where a1 ≠ a2 . Then a2^{−1} a1 w′ = w′ a2^{−1} a1 , and hence (w′ )^{−1} a2^{−1} a1 w′ = a2^{−1} a1 . Since A is malnormal, the element on the left does not belong to A, whereas the one on the right does, a contradiction. This argument shows that, in fact, already if Aw ∩ A = {1} for this particular w, then the corresponding decomposition search problem has a unique solution. Finally, we describe one more trick that reduces, to some extent, the decomposition search problem to the (subgroup-restricted) conjugacy search problem. Suppose we are given w′ = awb, and we need to recover a ∈ A and b ∈ B, where A and B are two elementwise commuting subgroups of a group G. Pick any b1 ∈ B and compute:
w b
−1 −1 −1 −1 −1 [awb, b1 ] = b−1 w−1 a−1 b−1 1 awbb1 = b w b1 wbb1 = (b1 ) b1 = ((b1 ) ) b1 . Since we know b1 , we can multiply the result by b−1 = 1 on the right to get w −1 w b −1 w b ((b1 ) ) . Now the problem becomes: Recover b ∈ B from the known w = ((b1 ) ) w and (b−1 1 ) . This is the subgroup-restricted conjugacy search problem. By solving it,
one can recover a b ∈ B. Similarly, to recover an a ∈ A, one picks any a1 ∈ A and computes: [(awb)−1 , (a1 )−1 ] = awba1 b−1 w−1 a−1 a−1 1 w = awa1 w−1 a−1 a−1 1 = (a1 )
a
−1 −1
a−1 −1 a1 .
w a−1 ) 1 = ((a1 ) −1
Multiply the result by a1 on the right to get w = ((a1 )w )a , so that the problem −1 −1 −1 becomes: Recover a ∈ A from the known w = ((a1 )w )a and (a1 )w . We have to note that since a solution of the subgroup-restricted conjugacy search problem is not always unique, solving the above two instances of this problem may not −1
−1
necessarily give the right solution of the original decomposition problem. However, any two solutions, call them b and b′, of the first conjugacy search problem differ by an element of the centralizer of (b1^{−1})^w , and this centralizer is unlikely to have a nontrivial intersection with B. A similar computation shows that the same trick reduces the factorization search problem, too, to the subgroup-restricted conjugacy search problem. Suppose we are given w′ = ab, and we need to recover a ∈ A and b ∈ B, where A and B are two elementwise commuting subgroups of a group G. Pick any b1 ∈ B and compute
[ab, b1 ] = b^{−1} a^{−1} b1^{−1} abb1 = (b1^{−1})^b b1 . Since we know b1 , we can multiply the result by b1^{−1} on the right to get (b1^{−1})^b . This is the subgroup-restricted conjugacy search problem. By solving it, one can recover a b ∈ B. This same trick can, in fact, be used to attack the subgroup-restricted conjugacy search problem itself. Suppose we are given w′ = a^{−1} wa, and we need to recover a ∈ A. Pick any b from the centralizer of A; typically, there is a public subgroup B that commutes with A elementwise; then just pick any b ∈ B. Then compute
[w′ , b] = [a^{−1} wa, b] = a^{−1} w^{−1} ab^{−1} a^{−1} wab = a^{−1} w^{−1} b^{−1} wab = (b^{−w} )^a b. Multiply the result by b^{−1} on the right to get (b^{−w} )^a , so the problem now is to recover a ∈ A from (b^{−w} )^a and b^{−w} . This problem might be easier than the original problem because there is flexibility in choosing b ∈ B. In particular, a feasible attack might be to choose several different b ∈ B and try to solve the above conjugacy search problem for each in parallel by using some general method (e. g., a length-based attack, see [154, 153]). Chances are that the attack will be successful for at least one of the b’s.
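The commutator identities used above are easy to check numerically. In the sketch below, the platform is (as an assumption, purely for testing) a direct product of two matrix groups modulo a prime, with A acting in the first factor and B in the second, so that A and B commute elementwise by construction.

```python
import random

p = 997
I2 = (1, 0, 0, 1)

def m_mul(X, Y):
    a, b, c, d = X; e, f, g, h = Y
    return ((a*e + b*g) % p, (a*f + b*h) % p, (c*e + d*g) % p, (c*f + d*h) % p)

def m_inv(X):
    a, b, c, d = X
    di = pow((a*d - b*c) % p, p - 2, p)        # inverse of the determinant mod p
    return ((d*di) % p, (-b*di) % p, (-c*di) % p, (a*di) % p)

def mul(*us):                                  # multiplication in G = GL_2(Z_p) x GL_2(Z_p)
    out = (I2, I2)
    for u in us:
        out = (m_mul(out[0], u[0]), m_mul(out[1], u[1]))
    return out

def inv(u):
    return (m_inv(u[0]), m_inv(u[1]))

def conj(x, y):                                # x^y = y^-1 x y
    return mul(inv(y), x, y)

def rand_gl2():
    while True:
        M = tuple(random.randrange(p) for _ in range(4))
        if (M[0]*M[3] - M[1]*M[2]) % p:
            return M

a  = (rand_gl2(), I2)                          # a in A (first factor)
b  = (I2, rand_gl2())                          # b in B (second factor)
b1 = (I2, rand_gl2())                          # the auxiliary element b1 in B
w  = (rand_gl2(), rand_gl2())                  # the public element

t = mul(a, w, b)                               # the transmitted element a w b
commutator = mul(inv(t), inv(b1), t, b1)       # [awb, b1]
left  = mul(commutator, inv(b1))
right = conj(conj(inv(b1), w), b)              # ((b1^-1)^w)^b
assert left == right
```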
Bibliography [1] [2] [3]
[4]
[5]
[6] [7]
[8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18]
[19] [20]
[21]
I. Aalbersberg and H. Hoogeboom. Characterizations of the decidability of some problems for regular trace languages. Math. Syst. Theory, 22:1–19, 1989. I. Agol. The virtual Haken conjecture. Doc. Math., 18:1045–1087, 2013. With an appendix by Ian Agol, Daniel Groves, and Jason Manning. A. G. Akritas, A. W. Strzeboński, and P. S. Vigklas. Improving the performance of the continued fractions method using new bounds of positive roots. Nonlinear Anal. Model. Control, 13(3):265–279, 2008. J. M. Alonso, T. Brady, D. Cooper, V. Ferlini, M. Lustig, M. Mihalik, M. Shapiro, and H. Short. Notes on word hyperbolic groups. In Group Theory from a Geometrical Viewpoint (Trieste, 1990), pages 3–63. World Sci. Publ., River Edge, NJ, 1991. Edited by H.Short. S. Alstrup, G. Stølting Brodal, and T. Rauhe. Pattern matching in dynamic texts. In Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2000, pages 819–828. ACM/SIAM, 2000. I. Anshel, M. Anshel, and D. Goldfeld. An algebraic method for public-key cryptography. Math. Res. Lett., 6(3–4):287–291, 1999. I. Anshel, M. Anshel, D. Goldfeld, and S. Lemieux. Key agreement, the algebraic eraserTM , and lightweight cryptography. In Algebraic Methods in Cryptography, volume 418 of Contemporary Mathematics, pages 1–34. American Mathematical Society, 2006. Y. Antolín and L. Ciobanu. Finite generating sets of relatively hyperbolic groups and applications to geodesic languages. Trans. Amer. Math. Soc., 368(11):7965–8010, 2016. R. Aoun. Random subgroups of linear groups are free. Duke Math. J., 160(1):117–173, 2011. S. Arora and B. Barak. Computational Complexity – A Modern Approach. Cambridge University Press, 2009. E. Artin. Theorie der Zöpfe. Abh. Math. Semin. Univ. Hamb., 4(1):47–72, 1925. V. Arvind and P. P. Kurur. Testing nilpotence of Galois groups in polynomial time. ACM Trans. Algorithms, 8(3):Art. 32, 22, 2012. G. N. Arzhantseva. On groups in which subgroups with a fixed number of generators are free. Fundam. Prikl. Mat., 3(3):675–683, 1997. G. N. Arzhantseva. Generic properties of finitely presented groups and Howson’s theorem. Comm. Algebra, 26(11):3783–3792, 1998. G. N. Arzhantseva. A property of subgroups of infinite index in a free group. Proc. Amer. Math. Soc., 128(11):3205–3210, 2000. G. N. Arzhantseva and A. Yu. Ol’shanskiĭ. Generality of the class of groups in which subgroups with a lesser number of generators are free. Mat. Zametki, 59(4):489–496, 638, 1996. J. Avenhaus and K. Madlener. The Nielsen reduction and P-complete problems in free groups. Theoret. Comput. Sci., 32(1–2):61–76, 1984. L. Babai, R. Beals, J.-Y. Cai, G. Ivanyos, and E. M. Luks. Multiplicative equations over commuting matrices. In Proceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996, pages 498–507. ACM/SIAM, 1996. L. Babai, E. M. Luks, and Á. Seress. Permutation groups in NC. In Proceedings of the 19th Annual ACM Symposium on Theory of Computing, STOC 1987, pages 409–420. ACM, 1987. L. Babai and E. Szemerédi. On the complexity of matrix group problems I. In Proceedings of the 25th Annual Symposium on Foundations of Computer Science, FOCS 1984, pages 229–240. IEEE Computer Society, 1984. B. Bajorska, O. Macedonska, and W. Tomaszewski. A defining property of virtually nilpotent groups. Publ. Math. Debrecen, 81:415–420, 2012.
[22] [23] [24] [25] [26] [27]
[28] [29] [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] [40]
[41] [42] [43] [44] [45]
D. A. M. Barrington. Bounded-width polynomial-size branching programs recognize exactly those languages in NC1 . J. Comput. System Sci., 38:150–164, 1989. L. Bartholdi, M. Figelius, M. Lohrey, and A. Weiß. Groups with ALOGTIME-hard word problems and PSPACE-complete compressed word problems. CoRR, abs/1909.13781, 2019. L. Bartholdi, R. I. Grigorchuk, and Z. Šuniḱ. Branch groups. In Handbook of Algebra, volume 3, pages 989–1112. Elsevier/North-Holland, Amsterdam, 2003. F. Bassino, A. Martino, C. Nicaud, E. Ventura, and P. Weil. Statistical properties of subgroups of free groups. Random Structures Algorithms, 42(3):349–373, 2013. F. Bassino, C. Nicaud, and P. Weil. Random generation of finitely generated subgroups of a free group. Internat. J. Algebra Comput., 18(2):375–405, 2008. F. Bassino, C. Nicaud, and P. Weil. Generic properties of subgroups of free groups and finite presentations. In Algebra and Computer Science, volume 677 of Contemp. Math., pages 1–43. Amer. Math. Soc., Providence, RI, 2016. F. Bassino, C. Nicaud, and P. Weil. On the genericity of Whitehead minimality. J. Group Theory, 19(1):137–159, 2016. G. Baumslag. On generalized free products. Math. Z., 78:423–438, 1962. G. Baumslag. A non-cyclic one-relator group all of whose finite quotients are cyclic. J. Aust. Math. Soc. A, 10:497–498, 1969. G. Baumslag. A finitely presented metabelian group with a free abelian derived group of infinite rank. Proc. Amer. Math. Soc., 35:61–62, 1972. G. Baumslag. Subgroups of finitely presented metabelian groups. J. Aust. Math. Soc., 16:98–110, 1973. G. Baumslag, F. B. Cannonito, and D. J. S. Robinson. The algorithmic theory of finitely generated metabelian groups. Trans. Amer. Math. Soc., 344(2):629–648, 1994. G. Baumslag, F. B. Cannonito, D. J. S. Robinson, and D. Segal. The algorithmic theory of polycyclic-by-finite groups. J. Algebra, 142(1):118–149, 1991. G. Baumslag, A. G. Miasnikov, and V. Remeslennikov. Algebraic geometry over groups I. Algebraic sets and ideal theory. J. Algebra, 219:16–79, 1999. G. Baumslag, A. G. Miasnikov, and V. Remeslennikov. Discriminating completions of hyperbolic groups. Geom. Dedicata, 92:115–143, 2002. G. Baumslag, A. Myasnikov, and V. Remeslennikov. Malnormality is decidable in free groups. Internat. J. Algebra Comput., 9(6):687–692, 1999. A. F. Beardon. The Hausdorff dimension of singular sets of properly discontinuous groups. Amer. J. Math., pages 722–736, 1966. M. Beaudry, P. McKenzie, P. Péladeau, and D. Thérien. Finite monoids: From word to circuit evaluation. SIAM J. Comput., 26(1):138–152, 1997. A. Ben-Zvi, A. Kalka, and B. Tsaban. Cryptanalysis via algebraic spans. In Advances in Cryptology – CRYPTO 2018, volume 10991 of Lecture Notes in Computer Science, pages 255–274. Springer, Berlin, 2018. O. Bernardi and O. Giménez. A linear algorithm for the random sampling from regular languages. Algorithmica, 62(1–2):130–145, 2012. J. Berstel and S. Brlek. On the length of word chains. Inform. Process. Lett., 26(1):23–28, 1987. J.-C. Birget, S. Magliveras, and M. Sramka. On public-key cryptosystems based on combinatorial group theory. Tatra Mt. Math. Publ., 33:137–148, 2006. J.-C. Birget, S. Margolis, J. Meakin, and P. Weil. PSPACE-complete problems for subgroups of free groups and inverse finite automata. Theoret. Comput. Sci., 242(1–2):247–281, 2000. J. S. Birman. Braids, Links, and Mapping Class Groups, volume 82 Annals of Mathematics Studies. Princeton University Press/University of Tokyo Press, Princeton, NJ/Tokyo, 1974.
[46] J. Birman, K. Hyoung Ko, and S. J. Lee. A new approach to the word and conjugacy problems in the braid groups. Adv. Math., 139(2):322–353, 1998.
[47] O. Bogopolski and V. Gerasimov. Finite subgroups of hyperbolic groups. Algebra Logika, 34:343–345, 1995. English translation: Algebra and Logic 34 (1995), no. 6, 343–345 (1996).
[48] O. Bogopolski, A. Martino, O. Maslakova, and E. Ventura. The conjugacy problem is solvable in free-by-cyclic groups. Bull. Lond. Math. Soc., 38(5):787–794, 2006.
[49] O. Bogopolski, A. Martino, and E. Ventura. Orbit decidability and the conjugacy problem for some extensions of groups. Trans. Amer. Math. Soc., 362:2010–2036, 2010.
[50] O. Bogopolski and E. Ventura. The mean Dehn functions of abelian groups. J. Group Theory, 11(4):569–586, 2008.
[51] I. Bondarenko and R. Kravchenko. Finite-state self-similar actions of nilpotent groups. Geom. Dedicata, 163:339–348, 2013.
[52] R. V. Book and F. Otto. String-Rewriting Systems. Springer, 1993.
[53] W. W. Boone. The word problem. Ann. of Math. (2), 70:207–265, 1959.
[54] V. Borisov. Simple examples of groups with unsolvable word problem. Math. Notes, 6:768–775, 1969.
[55] I. Borosh and L. B. Treybig. Bounds on positive integral solutions of linear Diophantine equations. Proc. Amer. Math. Soc., 55(2):299–304, 1976.
[56] A. V. Borovik, A. G. Myasnikov, and V. N. Remeslennikov. The conjugacy problem in amalgamated products. I. Regular elements and black holes. Internat. J. Algebra Comput., 17(7):1299–1333, 2007.
[57] A. V. Borovik, A. G. Myasnikov, and V. N. Remeslennikov. Generic complexity of the conjugacy problem in HNN-extensions and algorithmic stratification of Miller’s groups. Internat. J. Algebra Comput., 17(5–6):963–997, 2007.
[58] P. Bougerol and J. Lacroix. Products of Random Matrices with Applications to Schrödinger Operators. Birkhäuser Boston, 1985.
[59] J. Bourgain, A. Gamburd, and P. Sarnak. Affine linear sieve, expanders, and sum-product. Invent. Math., 179(3):559–644, 2010.
[60] C. Bourke, R. Tewari, and N. V. Vinodchandran. Directed planar reachability is in unambiguous log-space. ACM Trans. Comput. Theory, 1(1):4:1–4:17, February 2009.
[61] G. E. P. Box and M. E. Muller. A note on the generation of random normal deviates. Ann. Math. Stat., 29(2):610–611, 1958.
[62] N. Brady. Finite subgroups of hyperbolic groups. Internat. J. Algebra Comput., 10:399–405, 2000.
[63] N. Brady, T. R. Riley, and H. Short. The Geometry of the Word Problem for Finitely Generated Groups. Birkhäuser, 2007.
[64] C. Brav and H. Thomas. Thin monodromy in Sp(4). Compos. Math., 150(3):333–343, 2014.
[65] E. Breuillard and T. Gelander. On dense free subgroups of Lie groups. J. Algebra, 261(2):448–467, 2003.
[66] E. Breuillard, B. Green, and T. Tao. Approximate subgroups of linear groups. Geom. Funct. Anal., 21(4):774–819, 2011.
[67] M. Bridson and D. Groves. The Quadratic Isoperimetric Inequality for Mapping Tori of Free Group Automorphisms. Memoirs of the American Mathematical Society. American Mathematical Society, 2010.
[68] M. R. Bridson and K. Vogtmann. On the geometry of the automorphism group of a free group. Bull. Lond. Math. Soc., 27(6):544–552, 1995.
[69] M. R. Bridson and D. T. Wise. Malnormality is undecidable in hyperbolic groups. Israel J. Math., 124:313–316, 2001.
[70] J. L. Britton. The word problem. Ann. of Math., 77(1):16–32, 1963.
[71] R. M. Bryant and V. A. Roman’kov. Automorphism groups of relatively free groups. Math. Proc. Cambridge Philos. Soc., 127:411–424, 1999.
[72] V. K. Bulitko. Equations and inequalities in a free group and a free semigroup. Tul. Gos. Ped. Inst. Uchenye. Zap. Mat. Kaf., 2:242–252, 1970.
[73] I. Bumagin. Time complexity of the conjugacy problem in relatively hyperbolic groups. Internat. J. Algebra Comput., 25(5):689–723, 2015.
[74] P. J. Cameron. Finite permutation groups and finite simple groups. Bull. Lond. Math. Soc., 13(1):1–22, 1981.
[75] P. J. Cameron. Permutation Groups, volume 45. Cambridge University Press, 1999.
[76] J. W. Cannon, W. J. Floyd, and W. R. Parry. Introductory notes on Richard Thompson’s groups. Enseign. Math., 42(2):215–256, 1996.
[77] P.-E. Caprace and M. Sageev. Rank rigidity for CAT(0) cube complexes. Geom. Funct. Anal., 21(4):851–891, 2011.
[78] S. Caruso and B. Wiest. On the genericity of pseudo-Anosov braids II: conjugations to rigid braids. Groups Geom. Dyn., 11(2):549–565, 2017.
[79] C. Champetier. Propriétés statistiques des groupes de présentation finie. J. Adv. Math., 116(2):197–262, 1995.
[80] M. Charikar, E. Lehman, A. Lehman, D. Liu, R. Panigrahy, M. Prabhakaran, A. Sahai, and A. Shelat. The smallest grammar problem. IEEE Trans. Inf. Theory, 51(7):2554–2576, 2005.
[81] N. Chavdarov et al. The generic irreducibility of the numerator of the zeta function in a family of curves with large monodromy. Duke Math. J., 87(1):151–180, 1997.
[82] K. T. Chen, R. H. Fox, and R. C. Lyndon. Free differential calculus, IV. The quotient groups of the lower central series. Ann. of Math., 68(1):81–95, 1958.
[83] A. Chorna, K. Geller, and V. Shpilrain. On two-generator subgroups of SL2(ℤ), SL2(ℚ), and SL2(ℝ). J. Algebra, 478:367–381, 2017.
[84] L. Ciobanu, V. Diekert, and M. Elder. Solution sets for equations over free groups are EDT0L languages. Internat. J. Algebra Comput., 26(5):843–886, 2016.
[85] L. Ciobanu, A. Martino, and E. Ventura. The generic Hanna Neumann Conjecture and Post Correspondence Problem. Preprint. Available at http://www.epsem.upc.edu/~ventura/ventura/engl/abs-e.htm#(31), 2008.
[86] S. Cleary. Distortion of wreath products in some finitely presented groups. Pacific J. Math., 228(1):53–61, 2006.
[87] D. E. Cohen, K. Madlener, and F. Otto. Separating the intrinsic complexity and the derivational complexity of the word problem for finitely presented groups. MLQ Math. Log. Q., 39(2):143–157, 1993.
[88] D. J. Collins. Relations among the squares of the generators of the braid group. Invent. Math., 117:525–529, 1994.
[89] M. Cordes, M. Duchin, Y. Duong, M.-C. Ho, and A. P. Sánchez. Random nilpotent groups I. Int. Math. Res. Not., 7:1921–1953, 2018.
[90] J. Crisp, E. Godelle, and B. Wiest. The conjugacy problem in subgroups of right-angled Artin groups. J. Topol., 2(3):442–460, 2009.
[91] S. Dasgupta, C. Papadimitriou, and U. Vazirani. Algorithms. McGraw-Hill Science, 2006.
[92] H. Davenport. Multiplicative Number Theory, volume 74 of Graduate Texts in Mathematics. Springer-Verlag, New York, third edition, 2000. Revised and with a preface by Hugh L. Montgomery.
[93] O. David, T. Gelander, and C. Meiri. Personal communication.
[94] M. Davis, H. Putnam, and J. Robinson. The decision problem for exponential Diophantine equations. Ann. of Math., pages 425–436, 1961.
[95] M. Dehn. Über unendliche diskontinuierliche Gruppen. Math. Ann., 71:116–144, 1911.
[96] P. Dehornoy. Braid-based cryptography. In Group Theory, Statistics, and Cryptography, volume 360 of Contemporary Mathematics, pages 5–33. American Mathematical Society, 2004.
[97] K. Delp, T. Dymarz, and A. Schaffer-Cohen. A matrix model for random nilpotent groups. Int. Math. Res. Not., 1:201–230, 2019.
[98] J. Delsarte. Sur le gitter Fuchsien. C. R. Acad. Sci. Paris, 214(147–179):1, 1942.
[99] P. Diaconis, J. Fulman, and R. Guralnick. On fixed points of permutations. J. Algebraic Combin., 28(1):189–218, 2008.
[100] V. Diekert and M. Elder. Solutions of twisted word equations, EDT0L languages, and context-free groups. In Proceedings of the 44th International Colloquium on Automata, Languages, and Programming, ICALP 2017, volume 80 of LIPIcs, pages 96:1–96:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2017.
[101] V. Diekert, A. Jeż, and W. Plandowski. Finding all solutions of equations in free groups and monoids with involution. Inf. Comput., 251:263–286, 2016.
[102] V. Diekert, O. Kharlampovich, and A. Mohajeri Moghaddam. SLP compression for solutions of equations with constraints in free and hyperbolic groups. Internat. J. Algebra Comput., 25(1&2):81–112, 2015.
[103] V. Diekert, J. Laun, and A. Ushakov. Efficient algorithms for highly compressed data: the word problem in Higman’s group is in P. Internat. J. Algebra Comput., 22(8), 2012.
[104] V. Diekert, A. G. Myasnikov, and A. Weiß. Conjugacy in Baumslag’s group, generic case complexity, and division in power circuits. Algorithmica, 76(4):961–988, 2016.
[105] V. Diekert, A. G. Myasnikov, and A. Weiß. Amenability of Schreier graphs and strongly generic algorithms for the conjugacy problem. J. Symbolic Comput., 83:147–165, 2017.
[106] W. Diffie and M. E. Hellman. New directions in cryptography. IEEE Trans. Inf. Theory, IT-22:644–654, 1976.
[107] W. Dison, E. Einstein, and T. R. Riley. Taming the hydra: The word problem and extreme integer compression. Internat. J. Algebra Comput., 28(7):1299–1381, 2018.
[108] W. Dison and T. R. Riley. Hydra groups. Comment. Math. Helv., 88(3):507–540, 2013.
[109] J. D. Dixon. Random sets which invariably generate the symmetric group. Discrete Math., 105(1–3):25–39, 1992.
[110] J. D. Dixon. Probabilistic group theory. C. R. Math. Acad. Sci. Soc. R. Can., 24(1):1–15, 2002.
[111] C. F. Doran and J. W. Morgan. Mirror symmetry and integral variations of Hodge structure underlying one-parameter families of Calabi-Yau threefolds. In Mirror Symmetry. V, volume 38 of AMS/IP Stud. Adv. Math., pages 517–537. Amer. Math. Soc., Providence, RI, 2006.
[112] C. Druţu. Asymptotic cones and quasi-isometry invariants for hyperbolic metric spaces. Ann. Inst. Fourier (Grenoble), 51:81–97, 2001.
[113] F. Dudkin and A. Treier. Knapsack problem for Baumslag–Solitar groups. Preprint.
[114] W. Duke, Z. Rudnick, and P. Sarnak. Density of integer points on affine homogeneous varieties. Duke Math. J., 71(1):143–179, 1993.
[115] N. M. Dunfield and D. P. Thurston. A random tunnel number one 3-manifold does not fiber over the circle. Geom. Topol., 10:2431–2499, 2006.
[116] N. M. Dunfield and W. P. Thurston. Finite covers of random 3-manifolds. Invent. Math., 166(3):457–521, 2006.
[117] M. G. Durham and S. J. Taylor. Convex cocompactness and stability in mapping class groups. Algebr. Geom. Topol., 15(5):2839–2859, 2015.
[118] H. Edelsbrunner. Geometry and Topology for Mesh Generation. Cambridge University Press, 2001.
[119] A. Ehrenfeucht, J. Karhumaki, and G. Rozenberg. The (generalized) Post correspondence problem with lists consisting of two words is decidable. Theoret. Comput. Sci., 21(2):119–144, 1982. [120] M. Elder, G. Elston, and G. Ostheimer. On groups that have normal forms computable in logspace. J. Algebra, 381(0):260–281, 2013. [121] T. ElGamal. A public-key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans. Inf. Theory, IT-31:469–473, 1985. [122] I. Z. Emiris and V. Y. Pan. Improved algorithms for computing determinants and resultants. J. Complexity, 21(1):43–71, 2005. [123] D. B. A. Epstein, J. W. Cannon, D. F. Holt, S. V. F. Levy, M. S. Paterson, and W. P. Thurston. Word Processing in Groups. Jones and Bartlett, Boston, 1992. [124] D. B. A. Epstein and C. Petronio. An exposition of Poincaré’s polyhedron theorem. Enseign. Math.(2), 40(1–2):113–170, 1994. [125] M. Ershov. Kazhdan quotients of Golod–Shafarevich groups. Proc. Lond. Math. Soc. (3), 102(4):599–636, 2011. [126] H.-A. Esbelin and M. Gutan. On the membership problem for some subgroups of SL2 (Z). Ann. Math. Québec, 43:233–247, 2019. [127] A. Eskin and C. McMullen. Mixing, counting, and equidistribution in lie groups. Duke Math. J., 71(1):181–209, 1993. [128] B. Farb. Relatively hyperbolic groups. Geom. Funct. Anal., 8(5):810–840, 1998. [129] A. L. Fel’shtyn. The Reidemeister number of any automorphism of a Gromov hyperbolic group is infinite. J. Math. Sci., 119(1):117–123, 2004. [130] A. Fel’shtyn, Yu. Leonov, and E. Troitsky. Twisted conjugacy classes in saturated weakly branch groups. Geom. Dedicata, 134(1):61–73, 2008. [131] A. Fel’shtyn and E. Troitsky. Twisted conjugacy separable groups. Preprint. Available at http://arxiv.org/abs/math/0606764, 2012. [132] N. J. Fine and H. S. Wilf. Uniqueness theorems for periodic functions. Proc. Amer. Math. Soc., 16:109–114, 1965. [133] P. Flajolet and A. Odlyzko. Singularity analysis of generating functions. SIAM J. Discrete Math., 3(2):216–240, 1990. [134] P. Flajolet and R. Sedgewick. Analytic Combinatorics. Cambridge University Press, 2009. [135] P. Flajolet, P. Zimmerman, and B. Van Cutsem. A calculus for the random generation of labelled combinatorial structures. Theoret. Comput. Sci., 132(1–2):1–35, 1994. [136] L. Fleischer. On the complexity of the Cayley semigroup membership problem. In Proceedings of the 33rd Computational Complexity Conference, CCC 2018, volume 102 of LIPIcs, pages 25:1–25:12. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2018. [137] E. Formanek. Conjugate separability in polycyclic groups. J. Algebra, 42(1):1–10, 1976. [138] E. Formanek and C. Procesi. The automorphism group of a free group is not linear. J. Algebra, 149:494–499, 1992. [139] R. H. Fox. Free differential calculus. I: Derivation in the free group ring. Ann. of Math., 57(3):547–560, 1953. [140] R. H. Fox. Free differential calculus. II: The isomorphism problem of groups. Ann. of Math., 59(2):196–210, 1954. [141] R. H. Fox. Free differential calculus III. Subgroups. Ann. of Math., 64(3):407–419, 1956. [142] R. H. Fox. Free differential calculus, V. The Alexander matrices re-examined. Ann. of Math., 71(3):408–422, 1960. [143] N. Franco and J. González-Meneses. Conjugacy problem for braid groups and Garside groups. J. Algebra, 266(1):112–132, 2003.
[144] P. J. Freitas. On the Action of the Symplectic Group on the Siegel upper Half Plane. PhD thesis, University of Illinois, 1999. [145] L. Frenkel, A. Nikolaev, and A. Ushakov. Knapsack problems in products of groups. J. Symbolic Comput., 74:96–108, 2016. [146] L. Fuchs. Infinite Abelian Groups, Volumes 1, 2. Academic Press, New York, 1970 and 1972. [147] E. Fuchs. Counting problems in Apollonian packings. Bull. Amer. Math. Soc. (N.S.), 50(2):229–266, 2013. [148] E. Fuchs, C. Meiri, and P. Sarnak. Hyperbolic monodromy groups for the hypergeometric equation and Cartan involutions. J. Eur. Math. Soc. (JEMS), 16(8):1617–1671, 2014. [149] E. Fuchs and I. Rivin. Generic thinness in finitely generated subgroups of SLn (ℤ). Int. Math. Res. Not. IMRN, (17):5385–5414, 2017. [150] H. Furstenberg and H. Kesten. Products of random matrices. Ann. Math. Stat., pages 457–469, 1960. [151] N. Gama, N. Howgrave-Graham, and P. Nguyen. Symplectic lattice reduction and NTRU. Advances in Cryptology-EUROCRYPT 2006, pages 233–253, 2006. [152] M. Ganardi, D. König, M. Lohrey, and G. Zetzsche. Knapsack Problems for Wreath Products. ArXiv e-prints, Sept 2017. [153] D. Garber, S. Kaplan, M. Teicher, B. Tsaban, and U. Vishne. Probabilistic solutions of equations in the braid group. Adv. Appl. Math., 35:323–334, 2005. [154] D. Garber, S. Kaplan, M. Teicher, B. Tsaban, and U. Vishne. Length-based conjugacy search in the braid group. In Algebraic Methods in Cryptography, volume 418 of Contemporary Mathematics, pages 75–88. American Mathematical Society, 2006. [155] M. Garey and J. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979. [156] A. Garreta. The Diophantine Problem over Random Diophantine Groups. PhD thesis, Stevens Institute of Technology, 2016. [157] A. Garreta, A. Miasnikov, and D. Ovchinnikov. Properties of random nilpotent groups. Groups Complex. Cryptol., 9:99–115, 2017. [158] A. Garreta, A. Miasnikov, and D. Ovchinnikov. Random nilpotent groups, polycyclic presentations, and Diophantine problems. Groups Complex. Cryptol., 9(2):99–115, 2017. [159] M. H. Garzon and Y. Zalcstein. The complexity of Grigorchuk groups with application to cryptography. Theoret. Comput. Sci., 88(1):83–98, 1991. [160] L. Gasieniec, M. Karpinski, W. Plandowski, and W. Rytter. Randomized efficient algorithms for compressed strings: The finger-print approach (extended abstract). In Proceedings of the 7th Annual Symposium on Combinatorial Pattern Matching, CPM’96, volume 1075 of Lecture Notes in Computer Science, pages 39–49. Springer, 1996. [161] G. Ge. Testing equalities of multiplicative representations in polynomial time (extended abstract). In Proceedings of the 34th Annual Symposium on Foundations of Computer Science, FOCS 1993, pages 422–426, 1993. [162] V. Gebhardt and J. González-Meneses. The cyclic sliding operation in Garside groups. Math. Z., 265(1):85–114, 2010. [163] A. Genevois. Hyperbolicities in CAT(0) cube complexes. arXiv preprint arXiv:1709.08843, 2017. [164] S. M. Gersten. On Whitehead’s algorithm. Bull. Amer. Math. Soc. (N.S.), 10(2):281–284, 1984. [165] S. M. Gersten. Dehn functions and ℓ1 -norms of finite presentations. In Gilbert Baumslag and Charles F. Miller III, editors, Algorithms and Classification in Combinatorial Group Theory, pages 195–220. Springer, 1991. [166] S. M. Gersten and H. B. Short. Small cancellation theory and automatic groups. Invent. Math., 102(2):305–334, 1990.
[167] S. M. Gersten and H. B. Short. Rational subgroups of biautomatic groups. Ann. of Math. (2), 134(1):125–158, 1991. [168] J. Gilman. Two-generator Discrete Subgroups of PSL(2, R), volume 561. Amer. Mathematical Society, 1995. [169] R. Gilman, A. Miasnikov, and D. Osin. Exponentially generic subsets of groups. Illinois J. Math., 54(1):371–388, 2010. [170] S. Ginsburg and E. Spanier. Semigroups, Presburger formulas, and languages. Pacific J. Math., 16(2):285–296, 1966. [171] T. Godin. Knapsack problem for automaton groups. Preprint arXiv:1609.09274. Available at https://arxiv.org/abs/1609.09274. [172] I. Goldsheid and G. A. Margulis. Lyapunov exponents of a product of random matrices. Uspekhi Mat. Nauk, 269(5):11–71, 1989. [173] R. Z. Goldstein and E. C. Turner. Fixed subgroups of homomorphisms of free groups. Bull. Lond. Math. Soc., 18:468–470, 1986. [174] E. S. Golod and I. R. Shafarevich. On the class field tower. Izv. Akad. Nauk SSSR Ser. Mat., 28:261–272, 1964. [175] A. S. Golsefidy and P. Sarnak. The affine sieve. J. Amer. Math. Soc., 26(4):1085–1105, 2013. [176] M. I. Gonzalez-Vasco and R. Steinwandt. Group Theoretic Cryptography. Chapman & Hall/CRC, 2015. [177] A. Gorodnik, F. Maucourant, and H. Oh. Manin’s and Peyre’s conjectures on rational points and adelic mixing. Ann. Sci. Éc. Norm. Supér. (4), 41(3):383–435, 2008. [178] A. Gorodnik and A. Nevo. Splitting fields of elements in arithmetic groups. arXiv preprint arXiv:1105.0858, 2011. [179] A. Gorodnik and H. Oh. Orbits of discrete subgroups on a symmetric space and the Furstenberg boundary. Duke Math. J., 139(3):483–525, 2007. [180] M. Greendlinger. Dehn’s algorithm for the word problem. Comm. Pure Appl. Math., 13:67–83, 1960. [181] M. Greendlinger. An analogue of a theorem of Magnus. Arch. Math., 12:94–96, 1961. [182] R. Greenlaw, H. James Hoover, and W. L. Ruzzo. Limits to Parallel Computation: P-Completeness Theory. Oxford University Press, 1995. [183] R. I. Grigorchuk. Symmetrical random walks on discrete groups. In Multicomponent Random Systems, volume 6 of Adv. Probab. Related Topics, pages 285–325. Dekker, New York, 1980. [184] D. Grigoriev and V. Shpilrain. Tropical cryptography. Comm. Algebra, 42:2624–2632, 2014. [185] D. Grigoriev and V. Shpilrain. Tropical cryptography II: extensions by homomorphisms. Comm. Algebra, 47:4224–4229, 2019. [186] M. Gromov. Groups of polynomial growth and expanding maps. Publ. Math. IHES, 53:53–73, 1981. [187] M. Gromov. Hyperbolic groups. In Essays in Group Theory, volume 8 of MSRI Publications, pages 75–263. Springer, 1985. [188] M. Gromov. Asymptotic invariants of infinite groups. In Geometric group theory, vol. 2 (Sussex, 1991), volume 182 of London Math. Soc. Lecture Note Ser., pages 1–295. Cambridge Univ. Press, Cambridge, 1993. [189] M. Gromov. Random walk in random groups. Geom. Funct. Anal., 13(1):73–146, 2003. [190] F. Grunewald and D. Segal. On the integer solutions of quadratic equations. J. Reine Angew. Math., pages 13–45, 2004. [191] Z. Grunschlag. Algorithms in Geometric Group Theory. PhD thesis, University of California at Berkeley, 1999. [192] V. S. Guba and M. V. Sapir. On subgroups of R. Thompson’s group F and other diagram groups. Mat. Sb., 190(8):3–60, 1999.
[193] Y. Guivarc’h and A. Raugi. Propriétés de contraction d’un semi-groupe de matrices inversibles. Coefficients de Liapunoff d’un produit de matrices aléatoires indépendantes. Israel J. Math., 65(2):165–196, 1989. [194] U. Güntzer and M. Paul. Jump interpolation search trees and symmetric binary numbers. Inform. Process. Lett., 26(4):193–204, 1987. [195] N. Gupta. Free Group Rings, volume 66. American Mathematical Society, Providence, RI, 1987. [196] Y. Gurevich and P. Schupp. Membership problem for the modular group. SIAM J. Comput., 37:425–459, 2007. [197] C. Gutiérrez. Satisfiability of equations in free groups is in PSPACE. In Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, STOC 2000, pages 21–27. ACM, 2000. [198] M. Habeeb, D. Kahrobaei, C. Koupparis, V. Shpilrain. Public key exchange using semidirect product of (semi)groups. In Applied Cryptography and Network Security – ACNS 2013, volume 7954 of Lecture Notes in Computer Science, pages 475–486. Springer, 2013. [199] C. Hagenah. Gleichungen mit regulären Randbedingungen über freien Gruppen. PhD thesis, University of Stuttgart, 2000. [200] F. Haglund and D. Wise. Coxeter groups are virtually special. Adv. Math., 224:1890–1903, 2010. [201] V. Halava and T. Harju. Some new results on Post correspondence problem and its modifications. Bull. Eur. Assoc. Theor. Comput. Sci. EATCS, 73:131–141, 2001. [202] V. Halava, T. Harju, and M. Hirvensalo. Binary (generalized) Post correspondence problem. Theoret. Computer Science, 276(1–2):183–204, 2002. [203] V. Halava, M. Hirvensalo, and R. de Wolf. Decidability and undecidability of marked PCP. In Proceedings of the 16th Annual Conference on Theoretical Aspects of Computer Science, STACS’99, pages 207–216. Springer-Verlag, Berlin, Heidelberg, 1999. [204] P. Hall. Finiteness conditions for soluble groups. Proc. Lond. Math. Soc., 3(1):419–436, 1954. [205] G. H. Hardy and E. M. Wright. An Introduction to the Theory of Numbers. Oxford University Press, Oxford, sixth edition, 2008. Revised by D. R. Heath-Brown and J. H. Silverman. With a foreword by Andrew Wiles. [206] N. Haubold and M. Lohrey. Compressed word problems in HNN-extensions and amalgamated products. Theory Comput. Syst., 49(2):283–305, 2011. [207] N. Haubold, M. Lohrey, and C. Mathissen. Compressed decision problems for graph products of groups and applications to (outer) automorphism groups. Internat. J. Algebra Comput., 22(8):1240007, 2012. [208] M. E. Hellman. An overview of public key cryptography. IEEE Commun. Mag., pages 42–49, May 2002. [209] G. Higman. A finitely generated infinite simple group. J. Lond. Math. Soc., 26:61–64, 1951. [210] K. A. Hirsch. On infinite soluble groups (IV). J. Lond. Math. Soc., 1(1):81–85, 1952. [211] D. R. Hirschfeldt, C. G. Jockusch, Jr., R. Kuyper, and P. E. Schupp. Coarse reducibility and algorithmic randomness. J. Symbolic Logic, 81(3):1028–1046, 2016. [212] D. R. Hirschfeldt, C. G. Jockusch, Jr., T. H. McNicholl, and P. E. Schupp. Asymptotic density and the coarse computability bound. Computability, 5(1):13–27, 2016. [213] Y. Hirshfeld, M. Jerrum, and F. Moller. A polynomial-time algorithm for deciding equivalence of normed context-free processes. In Proceedings of the 35th Annual Symposium on Foundations of Computer Science, FOCS 1994, pages 623–631. IEEE Computer Society, 1994. [214] Y. Hirshfeld, M. Jerrum, and F. Moller. A polynomial algorithm for deciding bisimilarity of normed context-free processes. Theoret. Comput. Sci., 158(1&2):143–159, 1996. [215] D. Hofheinz and R. 
Steinwandt. A practical attack on some braid group based cryptographic primitives. In Advances in Cryptology – PKC 2003, volume 2567 of Lecture Notes in Computer Science, pages 187–198. Springer, Berlin, 2003.
[216] D. F. Holt, B. Eick, and E. A. O’Brien. Handbook of Computational Group Theory. Discrete Mathematics and Its Applications. Chapman & Hall/CRC, Boca Raton, FL, 2005. [217] D. F. Holt, M. Lohrey, and S. Schleimer. Compressed decision problems in hyperbolic groups. In Proceedings of the 36th International Symposium on Theoretical Aspects of Computer Science, STACS 2019, volume 126 of LIPIcs, pages 37:1–37:16. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. [218] D. F. Holt and S. Rees. Solving the word problem in real time. J. Lond. Math. Soc., 63(2):623–639, 2001. [219] D. F. Holt, S. Rees, and C. E. Röver. Groups, Languages and Automata, volume 88 of London Mathematical Society Student Texts. Cambridge University Press, 2017. [220] S. Hoory, N. Linial, and A. Wigderson. Expander graphs and their applications. Bull. Amer. Math. Soc., 43(4):439–562, 2006. [221] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, 1990. [222] R. Impagliazzo and A. Wigderson. P = BPP if E requires exponential circuits: Derandomizing the XOR lemma. In Proceedings of the 29th Annual ACM Symposium on the Theory of Computing, STOC 1997, pages 220–229. ACM, 1997. [223] I. M. Isaacs. Finite Group Theory, volume 92. American Mathematical Soc., 2008. [224] J. Jedwab and C. J. Mitchell. Minimum weight modified signed-digit representations and fast exponentiation. Electron. Lett., 25:1171–1172, 1989. [225] O. Jenkinson and M. Pollicott. Calculating Hausdorff dimension of Julia sets and Kleinian limit sets. Amer. J. Math., pages 495–545, 2002. [226] A. Jeż. Faster fully compressed pattern matching by recompression. ACM Trans. Algorithms, 11(3):20:1–20:43, 2015. [227] A. Jeż. Recompression: A simple and powerful technique for word equations. J. ACM, 63(1):4:1–4:51, 2016. [228] T. Jitsukawa. Malnormal subgroups of free groups. In Computational and Statistical Group Theory (Las Vegas, NV/Hoboken, NJ, 2001), volume 298 of Contemp. Math., pages 83–95. Amer. Math. Soc., Providence, RI, 2002. [229] J. P. Jones. Universal Diophantine equation. J. Symbolic Logic, 47(3):549–571, 1982. [230] F. Jouve, E. Kowalski, and D. Zywina. Splitting fields of characteristic polynomials of random elements in arithmetic groups. Israel J. Math., pages 1–44, 2010. [231] A. B. Kahn. Topological sorting of large networks. Commun. ACM, 5(11):558–562, 1962. [232] D. Kahrobaei and V. Shpilrain. Using semidirect product of (semi)groups in public key cryptography. In Computability in Europe – CiE 2016, volume 9709 of Lecture Notes in Computer Science, pages 132–141. Springer, Berlin, 2016. [233] S. Kalajdžievski. Automorphism group of a free group: centralizers and stabilizers. J. Algebra, 150(2):435–502, 1992. [234] M. Kambites, P. V. Silva, and B. Steinberg. On the rational subset problem for groups. J. Algebra, 309(2):622–639, 2007. [235] R. Kannan and R. Bachem. Polynomial time algorithms for computing Smith and Hermite normal forms of an integer matrix. SIAM J. Comput., 8:499–507, 1979. [236] W. M. Kantor and A. Lubotzky. The probability of generating a finite classical group. Geom. Dedicata, 36(1):67–87, 1990. [237] I. Kapovich. Small cancellation groups and translation numbers. Trans. Amer. Math. Soc., 349(5):1851–1875, 1997. [238] I. Kapovich. Clusters, currents, and Whitehead’s algorithm. Exp. Math., 16(1):67–76, 2007. [239] I. Kapovich. Generic-case complexity of Whitehead’s algorithm, revisited. arXiv preprint arXiv:1903.07040, 2019. [240] I. Kapovich, A. Miasnikov, P. Schupp, and V. Shpilrain. 
Generic-case complexity, decision
problems in group theory, and random walks. J. Algebra, 264(2):665–694, 2003. [241] I. Kapovich, A. Miasnikov, P. Schupp, and V. Shpilrain. Average-case complexity and decision problems in group theory. Adv. Math., 190(2):343–359, 2005. [242] I. Kapovich and A. Myasnikov. Stallings foldings and subgroups of free groups. J. Algebra, 248(2):608–668, 2002. [243] I. Kapovich and T. Nagnibeda. Subset currents on free groups. Geom. Dedicata, 166:307–348, 2013. [244] I. Kapovich, I. Rivin, P. Schupp, and V. Shpilrain. Densities in free groups and ℤk , visible points and test elements. Math. Res. Lett., 14(2):263–284, 2007. [245] I. Kapovich and P. Schupp. Delzant’s T -invariant, Kolmogorov complexity and one-relator groups. Comment. Math. Helv., 80(4):911–933, 2005. [246] I. Kapovich and P. Schupp. Genericity, the Arzhantseva–Ol’shanskii method and the isomorphism problem for one-relator groups. Math. Ann., 331(1):1–19, 2005. [247] I. Kapovich and P. Schupp. On group-theoretic models of randomness and genericity. Groups Geom. Dyn., 2(3):383–404, 2008. [248] I. Kapovich and P. E. Schupp. Random quotients of the modular group are rigid and essentially incompressible. J. Reine Angew. Math., 628:91–119, 2009. [249] I. Kapovich, P. Schupp, and V. Shpilrain. Generic properties of Whitehead’s algorithm and isomorphism rigidity of random one-relator groups. Pacific J. Math., 223(1):113–140, 2006. [250] M. I. Kargapolov and Ju. I. Merzlyakov. Fundamentals of Theory of Groups. Springer Verlag, 1979. [251] M. Kassabov. Symmetric groups and expander graphs. Invent. Math., 170(2):327–354, 2007. [252] M. Kassabov. Universal lattices and unbounded rank expanders. Invent. Math., 170(2):297–326, 2007. [253] H. Kellerer, U. Pferschy, and D. Pisinger. Knapsack Problems. Springer, 2004. [254] H. Kesten. Symmetric random walks on groups. Trans. Amer. Math. Soc., 92:336–354, 1959. [255] B. Khan. The structure of automorphic conjugacy in the free group of rank two. In Computational and Experimental Group Theory, volume 349 of Contemp. Math., pages 115–196. Amer. Math. Soc., Providence, RI, 2004. [256] O. Kharlampovich. A finitely presented solvable group with unsolvable word problem. Izv. Akad. Nauk SSSR Ser. Mat., 45(4):852–873, 1981. [257] O. Kharlampovich, I. Lysenok, A. G. Myasnikov, and N. Touikan. The solvability problem for quadratic equations over free groups is NP-complete. Theory Comput. Syst., 47:250–258, 2010. [258] O. Kharlampovich, A. Miasnikov, and P. Weil. Stallings graphs for quasi-convex subgroups. J. Algebra, 488:442–483, 2017. [259] O. Kharlampovich and A. Myasnikov. Irreducible affine varieties over a free group. I: Irreducibility of quadratic equations and Nullstellensatz. J. Algebra, 200(2):472–516, 1998. [260] O. Kharlampovich, A. Myasnikov, and E. Lyutikova. Equations over Q-completions of hyperbolic groups. Trans. Amer. Math. Soc., 351(7):2961–2978, 1999. [261] O. Kharlampovich, A. G. Myasnikov, and M. V. Sapir. Algorithmically complex residually finite groups. Bull. Math. Sci., 7(2):309–352, 2017. [262] T. Kida, T. Matsumoto, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. Collage system: a unifying framework for compressed pattern matching. Theoret. Comput. Sci., 298(1):253–272, 2003. [263] S.-H. Kim and T. Koberda. Embedability between right-angled Artin groups. Geom. Topol., 17(1):493–530, 2013. [264] S.-H. Kim and T. Koberda. The geometry of the curve graph of a right-angled Artin group. Internat. J. Algebra Comput., 24(2):121–169, 2014.
[265] V. Klee and G. J. Minty. How good is the simplex algorithm? In Inequalities, III (Proc. Third Sympos., Univ. California, Los Angeles, Calif., 1969; dedicated to the memory of Theodore S. Motzkin), pages 159–175. Academic Press, New York, 1972. [266] A. Knapp. Representation Theory of Semisimple Groups: An Overview Based on Examples, volume 36. Princeton University Press, Princeton, NJ, 2001. [267] A. W. Knapp. Lie Groups Beyond an Introduction, volume 140. Springer Science & Business Media, 2013. [268] K. H. Ko, S. J. Lee, J. H. Cheon, J. W. Han, J. Kang, and C. Park. New public-key cryptosystem using braid groups. In Advances in Cryptology – CRYPTO 2000, volume 1880 of Lecture Notes in Computer Science, pages 166–183. Springer, Berlin, 2000. [269] T. Koberda. What is...an acylindrical group action? Notices Amer. Math. Soc., 65(1):31–34, 2018. [270] A. Kontorovich. From Apollonius to Zaremba: local-global phenomena in thin orbits. Bull. Amer. Math. Soc. (N.S.), 50(2):187–228, 2013. [271] E. Kowalski. The Large Sieve and Its Applications: Arithmetic Geometry, Random Walks and Discrete Groups, volume 175. Cambridge University Press, 2008. [272] L. Kronecker. Grundzüge einer arithmetischen Theorie der algebraischen Grössen. Angefügt ist eine neue Ausg. der Inaugural-Dissertation: De unitatibus complexis. G. Reimer, 1882. [273] D. König and M. Lohrey. Evaluation of circuits over nilpotent and polycyclic groups. Algorithmica, 80(5):1459–1492, 2018. [274] D. König and M. Lohrey. Parallel identity testing for skew circuits with big powers and applications. Internat. J. Algebra Comput., 28(6):979–1004, 2018. [275] D. König, M. Lohrey, and G. Zetzsche. Knapsack and subset sum problems in nilpotent, polycyclic, and co-context-free groups. Algebra Comput. Sci., 677:138–153, 2016. [276] J. C. Lagarias. Worst-case complexity bounds for algorithms in the theory of integral quadratic forms. J. Algorithms, 1(2):142–186, 1980. [277] J. C. Lagarias and A. M. Odlyzko. Effective versions of the Chebotarev density theorem. Algebraic Number Fields, pages 409–464, 1977. [278] S. Landau. Factoring polynomials over algebraic number fields. SIAM J. Comput., 14(1):184–195, 1985. [279] S. Landau and G. L. Miller. Solvability by radicals is in polynomial time. J. Comput. System Sci., 30(2):179–208, 1985. [280] J. Laun. Efficient algorithms for highly compressed data: the word problem in generalized Higman groups is in P. Theory Comput. Syst., 55(4):742–770, 2014. [281] P. D. Lax. Functional Analysis. Pure and Applied Mathematics. Wiley-Interscience [John Wiley & Sons], New York, 2002. [282] S. J. Lee and E. Lee. Potential weaknesses of the commutator key agreement protocol based on braid groups. In Advances in Cryptology – EUROCRYPT 2002, volume 2332 of Lecture Notes in Computer Science, pages 14–28. Springer, Berlin, 2002. [283] E. Lee and J. H. Park. Cryptanalysis of the public key encryption based on braid groups. In Advances in Cryptology – EUROCRYPT 2003, volume 2656 of Lecture Notes in Computer Science, pages 477–490. Springer, Berlin, 2003. [284] J. Lehnert and P. Schweitzer. The co-word problem for the Higman–Thompson group is context-free. Bull. Lond. Math. Soc., 39(2):235–241, 02 2007. [285] J. C. Lennox and D. J. S. Robinson. The Theory of Infinite Soluble Groups. Clarendon Press, 2004. [286] A. K. Lenstra, H. W. Lenstra, and L. Lovász. Factoring polynomials with rational coefficients. Math. Ann., 261(4):515–534, 1982. [287] J. Lewin. Residual Properties of Loops and Rings. 
PhD thesis, New York University, 1964.
[288] M. Li and P. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications. Graduate Texts in Computer Science. Springer-Verlag, New York, second edition, 1997. [289] M. W. Liebeck, N. Nikolov, and A. Shalev. Groups of Lie type as products of SL2 subgroups. J. Algebra, 326:201–207, 2011. [290] Y. Lifshits. Processing compressed texts: A tractability border. In Proceedings of the 18th Annual Symposium on Combinatorial Pattern Matching, CPM 2007, volume 4580 of Lecture Notes in Computer Science, pages 228–240. Springer, 2007. [291] R. J. Lipton and Y. Zalcstein. Word problems solvable in logspace. J. ACM, 24(3):522–526, 1977. [292] D. Livingstone and A. Wagner. Transitivity of finite permutation groups on unordered sets. Math. Z., 90:393–403, 1965. [293] M. Lohrey. Word problems and membership problems on compressed words. SIAM J. Comput., 35(5):1210–1240, 2006. [294] M. Lohrey. Algorithmics on SLP-compressed strings: A survey. Groups Complex. Cryptol., 4(2):241–299, 2012. [295] M. Lohrey. The Rational Subset Membership Problem for Groups: A Survey. 2013. [296] M. Lohrey. The Compressed Word Problem for Groups, Springer Briefs in Mathematics. 2014. [297] M. Lohrey. Rational subsets of unitriangular groups. Internat. J. Algebra Comput., 25(01–02):113–121, 2015. [298] M. Lohrey. Equality testing of compressed strings. In Proceedings of the 10th International Conference on Combinatorics on Words, WORDS 2015, volume 9304 of Lecture Notes in Computer Science, pages 14–26. Springer, 2015. [299] M. Lohrey. Knapsack in hyperbolic groups. J. Algebra, 545:390–415, 2020. [300] M. Lohrey and S. Schleimer. Efficient computation in groups via compression. In Proceedings of Computer Science in Russia, CSR-2007, pages 249–258. Springer, 2007. [301] M. Lohrey and B. Steinberg. Tilings and submonoids of metabelian groups. Theory Comput. Syst., 48:411–427, 2011. [302] M. Lohrey, B. Steinberg, and G. Zetzsche. Tilings and submonoids of metabelian groups. Inform. Sci., 243:191–204, 2015. [303] M. Lohrey and A. Weiß. The power word problem. CoRR, abs/1904.08343, 2019. [304] M. Lohrey and G. Zetzsche. Knapsack in graph groups, HNN-extensions and amalgamated products. In 33rd Symposium on Theoretical Aspects of Computer Science, 2016. [305] M. Lohrey and G. Zetzsche. The Complexity of Knapsack in Graph Groups. In 34th Symposium on Theoretical Aspects of Computer Science, STACS 2017, March 8–11, 2017, Hannover, Germany, pages 52:1–52:14, 2017. [306] M. Lohrey and G. Zetzsche. Knapsack in graph groups. Theory Comput. Syst., 62(1):192–246, 2018. [307] V. Lomonosov and P. Rosenthal. The simplest proof of Burnside’s theorem on matrix algebras. Linear Algebra Appl., 383:45–47, 2004. [308] A. Lubotzky. Finite simple groups of Lie type as expanders. J. Eur. Math. Soc. (JEMS), 13(5):1331–1341, 2011. [309] A. Lubotzky and C. Meiri. Sieve methods in group theory I: Powers in linear groups. J. Amer. Math. Soc., 25:1119–1148, 2012. [310] A. Lubotzky and L. Rosenzweig. The Galois group of random elements of linear groups. arXiv preprint arXiv:1205.5290, 2012. [311] A. Lubotzky and D. Segal. Subgroup Growth, volume 212 of Progress in Mathematics. Birkhäuser Verlag, Basel, 2003. [312] T. Łuczak and P. Pyber. On random generation of the symmetric group. Combin. Probab. Comput., 2(4):505–512, 1993.
[313] R. Lyndon and P. Schupp. Combinatorial Group Theory. Classics in Mathematics. Springer, 2001. [314] I. G. Lysënok. Some algorithmic properties of hyperbolic groups. Izv. Akad. Nauk SSSR Ser. Mat., 53(4):814–832, 912, 1989. [315] J. Macdonald. Compressed words and automorphisms in fully residually free groups. Internat. J. Algebra Comput., 20(3):343–355, 2010. [316] J. Macdonald, A. G. Miasnikov, and D. Ovchinnikov. Low-complexity computations for nilpotent subgroup problems. Internat. J. Algebra Comput., 29(4):639–661, 2019. [317] J. Macdonald, A. G. Myasnikov, A. Nikolaev, and S. Vassileva. Logspace and compressed-word computations in nilpotent groups. CoRR, abs/1503.03888, 2015. [318] W. Magnus. Über diskontinuierliche Gruppen mit einer definierenden Relation (Der Freiheitssatz). J. Reine Angew. Math., 163:141–165, 1930. [319] W. Magnus. Das Identitätsproblem für Gruppen mit einer definierenden Relation. Math. Ann., 106(1):295–307, 1932. [320] W. Magnus. On a theorem of Marshall Hall. Ann. of Math. (2), 40:764–768, 1939. [321] M. R. Magyarik and N. R. Wagner. A Public Key Cryptosystem Based on the Word Problem. In Advances in Cryptology – CRYPTO 1984, volume 196 of Lecture Notes in Computer Science, pages 19–36. Springer, Berlin, 1985. [322] J. Maher. Asymptotics for pseudo-Anosov elements in Teichmüller lattices. Geom. Funct. Anal., 20(2):527–544, 2010. [323] J. Maher. Random walks on the mapping class group. Duke Math. J., 156(3):429–468, 2011. [324] J. Maher and A. Sisto. Random subgroups of acylindrically hyperbolic groups and hyperbolic embeddings. Int. Math. Res. Not. IMRN, 2019(13):3941–3980, 2019. [325] J. Maher and G. Tiozzo. Random walks on weakly hyperbolic groups. J. Reine Angew. Math., 742:187–239, 2018. [326] K. Mahler. An inequality for the discriminant of a polynomial. Michigan Math. J., 11:257–262, 1964. [327] G. S. Makanin. The problem of solvability of equations in a free semigroup. Mat. Sb., 103:147–236, 1977. In Russian; English translation in Math. USSR-Sb. 32(2), 1977. [328] G. S. Makanin. Equations in a free group. Izv. Akad. Nauk SSSR Ser. Math., 46:1199–1273, 1983. In Russian; English translation in Math. USSR Izv. 21, 1983. [329] T. A. Makanina. Occurrence problem for braid groups Bn+1 with n + 1 ≥ 5. Math. Notes Acad. Sci. USSR, 29(1), 1981. [330] A. I. Maltsev. On certain classes of infinite soluble groups. Amer. Math. Soc. Transl. Ser. 2, 2:1–21, 1956. [331] L. Markus-Epstein. Stallings foldings and subgroups of amalgams of finite groups. Internat. J. Algebra Comput., 17(8):1493–1535, 2007. [332] Ju. V. Matijasevic. Enumerable sets are Diophantine. Sov. Math., Dokl., 11:354–358, 1970. Translated from Russian, Dokl. Akad. Nauk SSSR 191:279–282, 1970. [333] Yu. Matiyasevich and G. Sénizergues. Decision problems for semi-Thue systems with a few rules. In Proceedings of the 11th Annual IEEE Symposium on Logic in Computer Science, LICS’96, pages 523–534, 1996. [334] C. T. McMullen. Hausdorff dimension and conformal dynamics, III: Computation of dimension. Amer. J. Math., pages 691–721, 1998. [335] K. Mehlhorn, R. Sundar, and C. Uhrig. Maintaining dynamic sequences under equality-tests in polylogarithmic time. In Proceedings of the 5th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1994, pages 213–222. ACM/SIAM, 1994. [336] K. Mehlhorn, R. Sundar, and C. Uhrig. Maintaining dynamic sequences under equality tests in polylogarithmic time. Algorithmica, 17(2):183–198, 1997.
[337] A. J. Menezes, P. C. van Oorschot, and S. A. Vanstone. Handbook of Applied Cryptography. CRC Press, 1996. [338] R. Merkle and M. Hellman. Hiding information and signatures in trapdoor knapsacks. IEEE Trans. Inf. Theory, 24:525–530, 1978. [339] A. Miasnikov and D. Osin. Algorithmically finite groups. J. Pure Appl. Algebra, 215(11):2789–2796, 2011. [340] A. G. Miasnikov and V. Remeslennikov. Exponential groups II: extensions of centralizers and tensor completion of CSA-groups. Internat. J. Algebra Comput., 6(6):687–711, 1996. [341] A. G. Miasnikov, V. Remeslennikov, and D. Serbin. Regular free length functions on Lyndon’s free ℤ[t]-group F ℤ[t] . In Algorithms, Languages, Logic, volume 378 of Contemporary Mathematics, pages 37–77. American Mathematical Society, 2005. [342] A. G. Miasnikov, V. Remeslennikov, and D. Serbin. Fully residually free groups and graphs labeled by infinite words. Internat. J. Algebra Comput., 66(4):689–737, 2006. [343] A. G. Miasnikov and V. Roman’kov. A linear decomposition attack. Groups Complex. Cryptol., 7:81–94, 2015. [344] A. G. Miasnikov, V. Romankov, A. Ushakov, and A. Vershik. The word and geodesic problems in free solvable groups. Trans. Amer. Math. Soc., 362:4655–4682, 2010. [345] A. Miasnikov and P. Schupp. Computational complexity and the conjugacy problem. Computability, 6(4):307–318, 2017. [346] A. G. Miasnikov, V. Shpilrain, and A. Ushakov. A practical attack on some braid group based cryptographic protocols. In Advances in Cryptology – CRYPTO 2005, volume 3621 of Lecture Notes in Computer Science, pages 86–96. Springer, Berlin, 2005. [347] A. G. Miasnikov, V. Shpilrain, and A. Ushakov. Random subgroups of braid groups: An approach to cryptanalysis of a braid group based cryptographic protocol. In Advances in Cryptology – PKC 2006, volume 3958 of Lecture Notes in Computer Science, pages 302–314. Springer, Berlin, 2006. [348] A. G. Miasnikov, V. Shpilrain, and A. Ushakov. Group-based Cryptography. Birkhäuser Verlag, 2008. [349] A. Miasnikov and A. Ushakov. Generic case completeness. J. Comput. System Sci., 82(8):1268–1282, 2016. [350] K. A. Mihailova. The occurrence problem for direct products of groups. Dokl. Akad. Nauk SSSR, 119:1103–1105, 1958. [351] A. A. Mikhalev, V. Shpilrain, and U. U. Umirbaev. On isomorphism of Lie algebras with one defining relation. Internat. J. Algebra Comput., 14:389–393, 2004. [352] A. A. Mikhalev, V. Shpilrain, and J.-T. Yu. Combinatorial Methods: Free Groups, Polynomials, and Free Algebras. Springer-Verlag, New York, 2003. [353] C. F. Miller III. Decision problems for groups – survey and reflections. In Algorithms and Classification in Combinatorial Group Theory, pages 1–60. Springer, 1992. [354] A. Minasyan and D. Osin. Acylindrical hyperbolicity of groups acting on trees. Math. Ann., 362(3–4):1055–1105, 2015. [355] A. Mishchenko and A. Treier. Knapsack problem for nilpotent groups. Groups Complex. Cryptol., 9:87–98, 2017. [356] A. Mishchenko and A. Treier. Subset sum problem in the lamplighter group. Preprint, 2016. [357] T. T. Moh. A public key system with signature and master key functions. Comm. Algebra, 27:2207–2222, 1999. [358] P. Morar and A. Ushakov. Search problems in groups and branching processes. Internat. J. Algebra Comput., 25(3):445–480, 2015. [359] L. Mosher. Mapping class groups are automatic. Ann. of Math. (2), 142(2):303–384, 1995.
[360] A. D. Myasnikov, A. G. Myasnikov, and V. Shpilrain. On the Andrews–Curtis equivalence. In Combinatorial and Geometric Group Theory (New York, 2000/Hoboken, NJ, 2001), volume 296 of Contemporary Mathematics, pages 183–198. American Mathematical Society, 2002. [361] A. G. Myasnikov and A. Nikolaev. Verbal subgroups of hyperbolic groups have infinite width. J. Lond. Math. Soc., 90(2):573–591, 2014. [362] A. G. Myasnikov, A. Nikolaev, and A. Ushakov. The Post correspondence problem in groups. J. Group Theory, 17:991–1008, 2014. [363] A. G. Myasnikov, A. Nikolaev, and A. Ushakov. Knapsack problems in groups. Math. Comp., 84:987–1016, 2015. [364] A. G. Myasnikov, V. N. Remeslennikov, and E. V. Frenkel. Amalgamated free product of groups: normal forms and measures. Math. Notes, 91(3–4):592–596, 2012. Translation of Mat. Z., 91(4):633–637, 2012. [365] A. G. Myasnikov and V. Shpilrain. Automorphic orbits in free groups. J. Algebra, 269(1):18–27, 2003. [366] A. G. Myasnikov, V. Shpilrain, and A. Ushakov. Non-commutative Cryptography and Complexity of Group-Theoretic Problems, volume 177 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI, 2011. With an appendix by Natalia Mosina. [367] A. G. Myasnikov and A. Ushakov. Random subgroups and analysis of the length-based and quotient attacks. J. Math. Cryptol., 2(1):29–61, 2008. [368] A. G. Myasnikov and A. Ushakov. Random van Kampen diagrams and algorithmic problems in groups. Groups Complex. Cryptol., 3(1):121–185, 2011. [369] A. G. Myasnikov, A. Ushakov, and D. W. Won. The word problem in the Baumslag group with a non-elementary Dehn function is polynomial time decidable. J. Algebra, 345(1):324–342, 2011. [370] A. G. Myasnikov, A. Ushakov, and D. W. Won. Power circuits, exponential algebra, and time complexity. Internat. J. Algebra Comput., 22(6):1250047, 2012. [371] V. Nekrashevych. Self-similar Groups. American Mathematical Society, 2005. [372] A. Nevo and P. Sarnak. Prime and almost prime integral points on principal homogeneous spaces. Acta Math., 205(2):361–402, 2010. [373] M. Newman. Integral Matrices, volume 45. Academic Press, 1972. [374] D. J. Newman. Simple analytic proof of the prime number theorem. Amer. Math. Monthly, 87(9):693–696, 1980. [375] M. Newman. Counting modular matrices with specified Euclidean norm. J. Combin. Theory Ser. A, 47(1):145–149, 1988. [376] P. Nguyen. Lattice reduction algorithms: Theory and practice. In Advances in Cryptology–EUROCRYPT 2011, pages 2–6, 2011. [377] P. Q. Nguyen and D. Stehlé. Low-dimensional lattice basis reduction revisited. ACM Trans. Algorithms, 5(4):46, 2009. [378] G. A. Niblo and L. D. Reeves. The geometry of cube complexes and the complexity of their fundamental groups. Topology, 37(3):621–633, 1998. [379] J. Nielsen. Die Isomorphismen der allgemeinen, unendlichen Gruppe mit zwei Erzeugenden. Mathematische Annalen, 78, 1918. [380] J. Nielsen. Die Isomorphismengruppe der freien Gruppen. Math. Ann., 91(3–4):169–209, 1924. [381] F. Nielsen and R. Nock. Hyperbolic Voronoi diagrams made easy. In Computational Science and Its Applications (ICCSA), 2010 International Conference on, pages 74–80. IEEE, 2010. [382] A. Nies and K. Tent. Describing finite groups by short first-order sentences. Israel J. Math., 221(1):85–115, 2017.
[383] A. Nikolaev. Membership Problem in Groups Acting Freely on Non-Archimedean Trees. PhD thesis, McGill University, 2010. [384] A. Nikolaev and D. Serbin. Membership problem in groups acting freely on ℤn -trees. Submitted. Available at http://arxiv.org/abs/1107.0943, 2011. [385] A. Nikolaev and A. Ushakov. Subset sum problem in polycyclic groups. J. Symbolic Comput., 84:84–94, 2018. [386] G. A. Noskov. Conjugacy problem in metabelian groups. Math. Notes Acad. Sci. USSR, 31(4):252–258, 1982. [387] P. S. Novikov. On the algorithmic unsolvability of the word problem in group theory. Amer. Math. Soc. Transl. Ser. 2, 9:1–122, 1958. [388] A. Odlyzko. The rise and fall of knapsack cryptosystems. In In Cryptology and Computational Number Theory, pages 75–88. AMS, 1990. [389] J. Oesterlé. Versions effectives du théoreme de Chebotarev sous l’hypothese de Riemann généralisée. Astérisque, 61:165–167, 1979. [390] Y. Ollivier. Sharp phase transition theorems for hyperbolicity of random groups. Geom. Funct. Anal., 14(3):595–679, 2004. [391] Y. Ollivier. A January 2005 Invitation to Random Groups, volume 10 of Ensaios Matemáticos [Mathematical Surveys]. Sociedade Brasileira de Matemática, 2005. [392] Y. Ollivier. On a small cancellation theorem of Gromov. Bull. Belg. Math. Soc. Simon Stevin, 13(1):75–89, 2006. [393] Y. Ollivier. Some small cancellation properties of random groups. Internat. J. Algebra Comput., 17(1):37–51, 2007. [394] A. Yu. Ol’shanskii. Diagrams of homomorphisms of surface groups. Sibirsk. Mat. Zh., 30:150–171, 1989. [395] A. Yu. Ol’shanskii. Periodic quotients of hyperbolic groups. Mat. Zametki, 182:543–567, 1991. [396] A. Yu. Olshanskiĭ. Almost every group is hyperbolic. Internat. J. Algebra Comput., 2(1):1–17, 1992. [397] A. Yu. Olshanskii and M. V. Sapir. Length functions on subgroups in finitely presented groups. In Y. Paek, D. L. Johnson, and A. C. Kim, editors, Groups–Korea’98: Proceedings of the International Conference Held at Pusan National University, Pusan, Korea, August 10–16, 1998. De Gruyter Proceedings in Mathematics Series, 2000. [398] A. Ol’shanskii and M. Sapir. Length and area functions on groups and quasi-isometric Higman embeddings. Internat. J. Algebra Comput., 11(2):137–170, 2001. [399] A. Ol’shanskii and M. Sapir. Groups with small Dehn functions and bipartite chord diagrams. Geom. Funct. Anal., 16(6):1324–1376, 2006. [400] D. Osin. Acylindrically hyperbolic groups. Trans. Amer. Math. Soc., 368(2):851–888, 2016. [401] D. Osin and V. Shpilrain. Public key encryption and encryption emulation attacks. In Computer Science in Russia 2008, volume 5010 of Lecture Notes in Computer Science, pages 252–260. Springer, Berlin, 2008. [402] S.-H. Paeng, K.-C. Ha, J. H. Kim, S. Chee, and C. Park. New public key cryptosystem using finite non-abelian groups. In Advances in Cryptology – CRYPTO 2001, volume 2139 of Lecture Notes in Computer Science, pages 470–485. Springer, Berlin, 2001. [403] A. Page. Computing arithmetic Kleinian groups. arXiv preprint arXiv:1206.0087, 2012. [404] P. P. Pálfy. A polynomial bound for the orders of primitive solvable groups. J. Algebra, 77(1):127–137, 1982. [405] C. Papadimitriou. Computational Complexity. Addison-Wesley, 1994. [406] C. Papadimitriou and K. Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Dover Publications, 1998. [407] R. J. Parikh. On context-free languages. J. ACM, 13(4):570–581, 1966.
[408] R. Pemantle and I. Rivin. On random generation of transitive subgroups of the symmetric group. In preparation. [409] R. Pemantle and M. C. Wilson. Analytic Combinatorics in Several Variables, volume 140. Cambridge University Press, 2013. [410] W. Plandowski. Testing equivalence of morphisms on context-free languages. In Proceedings of the 2nd Annual European Symposium on Algorithms, ESA 1994, volume 855 of Lecture Notes in Computer Science, pages 460–470. Springer, 1994. [411] W. Plandowski. Satisfiability of word equations with constants is in PSPACE. J. ACM, 51(3):483–496, 2004. [412] W. Plandowski and W. Rytter. Application of Lempel–Ziv encodings to the solution of word equations. In Proceedings of the 25th International Colloquium on Automata, Languages and Programming, ICALP 1998, volume 1443 of Lecture Notes in Computer Science, pages 731–742. Springer, 1998. [413] A. N. Platonov. An isoparametric function of the Baumslag–Gersten group. Vestnik Moskov. Univ. Ser. I Mat. Mekh., 3:12–17, 70, 2004 (Russian). [414] E. L. Post. A variant of a recursively unsolvable problem. Bull. Amer. Math. Soc., 52(4):264–268, 1946. [415] G. Prasad and A. S. Rapinchuk. Existence of irreducible r-regular elements in Zariski-dense subgroups. Math. Res. Lett., 10:21–32, 2003. [416] G. Prasad and A. Rapinchuk. Generic elements in Zariski-dense subgroups and isospectral locally symmetric spaces. In Hee Oh and Emmanuel Breuillard, editors, Thin Groups and Superstrong Approximation, volume 61 in Math. Sci. Res. Inst. Publ., pages 211–252. Cambridge, New York, 2014. [417] L. Pyber and E. Szabó. Growth in finite simple groups of Lie type. J. Amer. Math. Soc., 29(1):95–146, 2016. [418] G. W. Reitwiesner. Binary arithmetic. Adv. Comput., 1:231–308, 1960. [419] V. Remeslennikov. On finitely presented groups. In Fourth All-Union Symposium on the Theory of Groups, 1973, pages 164–169. Novosibirsk, USSR, 1973. [420] E. Rips. Subgroups of small cancellation groups. Bull. Lond. Math. Soc., 14(1):45–47, 1982. [421] I. Rivin. Simple curves on surfaces. Geom. Dedicata, 87(1–3):345–360, 2001. [422] I. Rivin. Walks on groups, counting reducible matrices, polynomials, and surface and free group automorphisms. Duke Math. J., 142(2):353–379, 2008. [423] I. Rivin. Walks on graphs and lattices-effective bounds and applications. In Forum Mathematicum, volume 21, pages 673–685, 2009. [424] I. Rivin. Zariski density and genericity. Int. Math. Res. Not. IMRN, (19):3649–3657, 2010. [425] I. Rivin. Growth in free groups (and other stories)—twelve years later. Illinois J. Math., 54(1):327–370, 2010. [426] I. Rivin. Generic phenomena in groups–some answers and many questions. arXiv preprint arXiv:1211.6509, 2012. [427] I. Rivin. Growth in free groups (and other stories). arXiv preprint math/9911076, 1999. [428] D. J. S. Robinson. A Course in the Theory of Groups. Springer-Verlag, New York, 1982. [429] D. Robinson. Parallel Algorithms for Group Word Problems. PhD thesis, University of California, San Diego, 1993. [430] V. Romankov. The twisted conjugacy problem for endomorphisms of polycyclic groups. J. Group Theory, 13(3):355–364, 2010. [431] V. Romankov. Twisted conjugacy classes in nilpotent groups. J. Pure Appl. Algebra, 215(4):664–671, 2011. [432] V. Romankov. A nonlinear decomposition attack. Groups Complex. Cryptol., 8:197–207, 2016. [433] V. Romankov and E. Ventura. The twisted conjugacy problem for pairs of endomorphisms in
nilpotent groups. ArXiv preprint arXiv:0910.3463, 2009. [434] N. S. Romanovskii. The occurrence problem for extensions of abelian by nilpotent groups. Sib. Math. J., 21:170–174, 1980. [435] J. J. Rotman. An Introduction to the Theory of Groups. Springer, fourth edition, 1995. [436] A. Salehi Golsefidy and P. P. Varjú. Expansion in perfect groups. Geom. Funct. Anal., 22(6):1832–1891, 2012. [437] M. V. Sapir, J.-C. Birget, and E. Rips. Isoperimetric and isodiametric functions of groups. Ann. Math., 156(2):345–466, 2002. [438] M. Sapir and I. Špakulová. Almost all one-relator groups with at least three generators are residually finite. J. Eur. Math. Soc. (JEMS), 13(2):331–343, 2011. [439] P. Sarnak. Letter to J. Davis about reciprocal geodesics. http://web.math.princeton.edu/sarnak/SarnakDavisLtr05.pdf, 2005. [440] N. Saxena. Progress on polynomial identity testing – II. Electron. Colloq. Comput. Complex., 20:186, 2013. [441] K. Sayood. Introduction to Data Compression. Morgan Kaufmann, fourth edition, 2012. [442] S. Schleimer. Polynomial-time word problems. Comment. Math. Helv., 83(4):741–765, 2008. [443] P. E. Schupp. Embeddings into simple groups. J. Lond. Math. Soc. (2), 13(1):90–94, 1976. [444] P. Schupp. Coxeter groups, 2-completion, perimeter reduction and subgroup separability. Geom. Dedicata, 96(1):179–198, 2003. [445] D. Segal. Polycyclic Groups. Cambridge Tracts in Mathematics. Cambridge University Press, 2005. [446] A. Seidenberg. Constructions in a polynomial ring over the ring of integers. Amer. J. Math., 100(4):685–703, 1978. [447] J.-P. Serre. Représentations linéaires des groupes finis, Hermann, Paris, 1967. MR, 38:1190, 1977. [448] J.-P. Serre. Quelques applications du théorème de densité de Chebotarev. Inst. Hautes Études Sci. Publ. Math., 54:323–401, 1981. [449] J. Shallit. A primer on balanced binary representations, 1993. Unpublished technical report; available from https://cs.uwaterloo.ca/shallit/Papers/bbr.pdf. [450] A. Shamir. A polynomial-time algorithm for breaking the basic Merkle–Hellman cryptosystem. IEEE Trans. Inf. Theory, 30(5):699–704, 1984. [451] R. Sharp. Local limit theorems for free groups. Math. Ann., 321(4):889–904, 2001. [452] V. Shpilrain. Assessing security of some group based cryptosystems. In Group Theory, Statistics, and Cryptography, volume 360 of Contemporary Mathematics, pages 167–177. American Mathematical Society, 2004. [453] V. Shpilrain. Cryptanalysis of Stickel’s key exchange scheme. In Computer Science in Russia 2008, volume 5010 of Lecture Notes in Computer Science, pages 283–288. Springer, Berlin, 2008. [454] V. Shpilrain. Search and witness problems in group theory. Groups Complex. Cryptol., 2:231–246, 2010. [455] V. Shpilrain and A. Ushakov. Thompson’s group and public key cryptography. In Applied Cryptography and Network Security – ACNS 2005, volume 3531 of Lecture Notes in Computer Science, pages 151–164. Springer, 2005. [456] V. Shpilrain and A. Ushakov. A new key exchange protocol based on the decomposition problem. In Algebraic Methods in Cryptography, volume 418 of Contemporary Mathematics, pages 161–167. American Mathematical Society, 2006. [457] V. Shpilrain and A. Ushakov. An authentication scheme based on the twisted conjugacy problem. In Applied Cryptography and Network Security – ACNS 2008, volume 5037 of Lecture Notes in Computer Science, pages 366–372. Springer, 2008.
[458] V. Shpilrain and J.-T. Yu. Factor algebras of free algebras: on a problem of G. Bergman. Bull. Lond. Math. Soc., 35:706–710, 2003.
[459] V. Shpilrain and G. Zapata. Using the subgroup membership search problem in public key cryptography. In Algebraic Methods in Cryptography, volume 418 of Contemporary Mathematics, pages 169–179. American Mathematical Society, 2006.
[460] V. M. Sidelnikov, M. A. Cherepnev, and V. Y. Yashcenko. Systems of open distribution of keys on the basis of noncommutative semigroups. Russian Acad. Sci. Dokl. Math., 48:384–386, 1994.
[461] C. L. Siegel. Symplectic geometry. Amer. J. Math., 65(1):1–86, 1943.
[462] C. L. Siegel. Zur Theorie der quadratischen Formen. Nachr. Akad. Wiss. Göttingen Math.-Phys. Kl. II, pages 21–46, 1972.
[463] P. Silva, X. Soler-Escriva, and E. Ventura. Finite automata for Schreier graphs of virtually free groups. J. Group Theory, 19(1):25–54, 2016.
[464] H.-U. Simon. Word problems for groups and contextfree recognition. In Proceedings of Fundamentals of Computation Theory, FCT 1979, pages 417–422. Akademie-Verlag, 1979.
[465] B. Simon. Representations of Finite and Compact Groups, volume 10. American Mathematical Society, 1996.
[466] S. Singh and T. N. Venkataramana. Arithmeticity of certain symplectic hypergeometric groups. Duke Math. J., 163(3):591–617, 2014.
[467] M. Sipser. Introduction to the Theory of Computation. Course Technology, 2005.
[468] A. Sisto. Quasi-convexity of hyperbolically embedded subgroups. Math. Z., 283(3–4):649–658, 2016.
[469] A. Sisto. Contracting elements and random walks. J. Reine Angew. Math., 742:79–114, 2018.
[470] J. R. Stallings. Topology of finite graphs. Invent. Math., 71(3):551–565, 1983.
[471] R. P. Stauduhar. The determination of Galois groups. Math. Comp., 27:981–996, 1973.
[472] P. Stevenhagen and H. W. Lenstra. Chebotarëv and his density theorem. Math. Intelligencer, 18(2):26–37, 1996.
[473] G. W. Stewart. The efficient generation of random orthogonal matrices with an application to condition estimators. SIAM J. Numer. Anal., 17(3):403–409, 1980.
[474] E. Stickel. A new method for exchanging secret keys. In Proceedings of the Third International Conference on Information Technology and Applications (ICITA’05), volume 2, pages 426–430. IEEE Computer Society, 2005.
[475] J. Stillwell. Classical Topology and Combinatorial Group Theory. Springer, second edition, 1995.
[476] A. Storjohann. Integer matrix rank certification. In Proceedings of the 2009 International Symposium on Symbolic and Algebraic Computation, pages 333–340. ACM, 2009.
[477] Y. Tabei, Y. Takabatake, and H. Sakamoto. A succinct grammar compression. In Proceedings of the 24th Annual Symposium on Combinatorial Pattern Matching, CPM 2013, volume 7922 of Lecture Notes in Computer Science, pages 235–246. Springer, 2013.
[478] J. Talbot and D. Welsh. Complexity and Cryptography: An Introduction. Cambridge University Press, 2006.
[479] N. W. M. Touikan. A fast algorithm for Stallings’ folding process. Internat. J. Algebra Comput., 16(6):1031–1045, 2006.
[480] H. C. Tran. On strongly quasiconvex subgroups. Geom. Topol., 23(3):1173–1235, 2019.
[481] B. Tsaban. Polynomial-time solutions of computational problems in noncommutative-algebraic cryptography. J. Cryptology, 28:601–622, 2015.
[482] U. Umirbaev. Occurrence problem for free solvable groups. Algebra Logic, 34:112–124, 1995.
[483] A. Ushakov. Fundamental Search Problems in Group Theory. ProQuest LLC, Ann Arbor, MI, 2005. Thesis (PhD) – City University of New York.
[484] B. Vallée. Gauss’ algorithm revisited. J. Algorithms, 12(4):556–572, 1991.
[485] B. Vallée, A. Vera, et al. Lattice reduction in two dimensions: analyses under realistic probabilistic models. In Proceedings of the 13th Conference on Analysis of Algorithms, AofA, volume 7, 2007.
[486] M. van Hoeij and A. Novocin. Gradual sub-lattice reduction and a new complexity for factoring polynomials. Algorithmica, 63(3):616–633, 2012.
[487] R. Venkatesan and S. Rajagopalan. Average case intractability of matrix and Diophantine problems (extended abstract). In S. R. Kosaraju, M. Fellows, A. Wigderson, and J. A. Ellis, editors, STOC, pages 632–642. ACM, 1992.
[488] E. Ventura and V. Romankov. The twisted conjugacy problem for endomorphisms of metabelian groups. Algebra Logic, 48:89–98, 2009.
[489] P. Viana and P. Murgel Veloso. Galois theory of reciprocal polynomials. Amer. Math. Monthly, 109(5):466–471, 2002.
[490] J. Voight. Computing fundamental domains for Fuchsian groups. J. Théor. Nombres Bordeaux, 21(2):469–491, 2009.
[491] J. von Neumann. Some matrix inequalities and metrization of metric space. Mitt. Forschungsinst. Math. Mech. Kujbyschew-Univ. Tomsk, 1:286–300, 1937.
[492] J. von zur Gathen and D. Panario. Factoring polynomials over finite fields: a survey. J. Symbolic Comput., 31(1):3–17, 2001.
[493] J. P. Wächter and A. Weiß. An automaton group with PSPACE-complete word problem. CoRR, abs/1906.03424, 2019.
[494] L. Wang, L. H. Wang, Z. Cao, Y. Yang, and X. X. Niu. Conjugate adjoining problem in braid groups and new design of braid-based signatures. Sci. China Inf. Sci., 53:524–536, 2010.
[495] B. A. F. Wehrfritz. On finitely generated soluble linear groups. Math. Z., 170:155–167, 1980.
[496] A. Weiß. On the Complexity of Conjugacy in Amalgamated Products and HNN Extensions. PhD thesis, University of Stuttgart, 2015.
[497] J. H. C. Whitehead. On equivalent sets of elements in a free group. Ann. of Math. (2), 37(4):782–800, 1936.
[498] D. Wise. Research announcement: the structure of groups with a quasiconvex hierarchy. Electron. Res. Announc. Math. Sci., 16:44–55, 2009.
[499] J. Wolf. Growth of finitely generated solvable groups and curvature of Riemannian manifolds. J. Differential Geom., 2:421–446, 1968.
[500] E. Wolk. A note on “The comparability graph of a tree”. Proc. Amer. Math. Soc., 16:17–20, 1965.
[501] R. Young. Averaged Dehn functions for nilpotent groups. Topology, 47(5):351–367, 2008.
[502] R. J. Zimmer. Ergodic Theory and Semisimple Groups, volume 81 of Monographs in Mathematics. Birkhäuser Verlag, Basel, 1984.
[503] A. Żuk. Property (T) and Kazhdan constants for discrete groups. Geom. Funct. Anal., 13(3):643–670, 2003.
Index

acylindrically hyperbolic group 15, 20
AGWP(G, X), acyclic graph word problem 247
algebraic span cryptanalysis 326
algorithm
– continued fraction 119
– LLL 124
– Stauduhar 140
algorithmic problems 159
algorithmically finite group 28, 29
Anshel–Anshel–Goldfeld key exchange protocol 322
Aoun, Richard 91
asymptotic density 4
automatic group 18, 19
automaton
– accepting 79
automorphism problem 30, 32
Baumslag–Gersten group 22
BKP(G, X), bounded knapsack problem 225, 247
block compression 172
Borel, Armand 83
braid group 15, 35
Breuillard, Emmanuel 91
Britton reduction 167
Britton’s lemma 166
BSMP(G, X), bounded submonoid membership problem 225, 247
central tree property 63
Chavdarov, Nikolai 83
Chomsky normal form 170
cogrowth 52
coherent 57
coincidence probability 75
commutator 322
compact mapping 203
complete language 162
complexity class 2
– EXPSPACE 161
– L 161
– NC 162
– NP 160
– P 159
– PSPACE 161
– RNC 162
– RP 160
composition system 182
compressed equality checking 171
compressed knapsack 198
compressed word problem 184
confluent relation 163
conjugacy problem 10, 14, 16, 17, 21, 22, 320
conjugacy search problem 25, 320
critical pair 163
curve complex 37
cut operator 183
cylindrically hyperbolic group 19
decision problem 3, 10
decomposition
– Cartan 97
decomposition problem 324
decomposition search problem 324
Dehn
– algorithm 51
– presentation 51
Dehn function 167
Dehn Monster 28, 30
density 48
– annular – strict 79
– asymptotic 78
– model 48, 64, 74
Diffie–Hellman key agreement 318
Diffie–Hellman problem 319, 341
dimension
– Hausdorff 96, 150
distance
– projective 97
double coset problem 324
(double) twisted conjugacy problem 293
ElGamal cryptosystem 320
encryption emulation attack 329
EP, equalizer problem 292, 300
factorization problem 327
factorization search problem 327, 345
few-generators model 64
few-relators model 53
Fourier transform
– on finite groups 85
free group 5, 30
free monoid 158
free product 164
Fuchs, Elena 90
fully homomorphic encryption 335
Garside algorithm 35, 37
Gauss, Carl Friedrich 119
Gelander, Tsachik 91
generic 48
– exponentially 48
– super-polynomially 67
generic free basis property 70
generic set 5
generic-case complexity 1, 2, 6, 8, 14, 21, 36
GEP, generalized equalizer problem 300
GPCP(G), nonhomogeneous Post correspondence problem 230
graph
– cyclically reduced 62
– reduced 62
– rooted 62
graph of groups 21
graph product 188
graph-based model 66
Greendlinger 51
Gromov-hyperbolic space 13, 15, 19
group
– automatic 165
– automaton 165
– automorphism group 184
– automorphism group of free group 169
– Baumslag group 199
– finitely generated 163
– finitely presented 164
– free 46, 79, 164
– graph group 188
– Grigorchuk group 192
– Gupta–Sidki groups 192
– Higman group 218
– Hydra group 218
– linear 165
– metabelian 165
– modular 57
– nilpotent 189
– one-relator 166
– outer automorphism group 194
– special linear – ergodic action 78
– Thompson’s group 191
– virtually special 188
– word-hyperbolic 165
hard language 161
hidden subgroup problem 344
HNN-extension 166
holomorph 339
hyperbolically embedded subgroup 18
isolated see pure
isomorphism inversion problem 335
isomorphism problem 10, 38, 39
isomorphism rigidity 38, 42
Kazhdan’s property (T) 52
knapsack problem 198
– IKP(G, X), integer 224, 272
– in groups 224
– KP(G, X) 224, 271
knapsack semilinear group 273, 291
knapsack tame group 282
Kolmogorov complexity 43
Kontorovich, Alex 90
Kowalski, Emmanuel 82
Landau, Susan 142
language 158
Legendre, Adrien-Marie 119
linear time 12
locally confluent relation 163
logspace reducible 161
Lubotzky, Alex 82
malnormal 46
mapping class group 20, 37
Markov chain 11, 33
Markovian automaton 73
– ergodic 75
matrix
– Perron-Frobenius 85
Meiri, Chen 82
membership problem 19
membership search problem 27, 323
modular group 40, 41
Morse subgroup 18
negligible 48
– exponentially 48
– super-polynomially 67
negligible set 5
Newman, Morris 80
Nielsen
– equivalent 40, 53
– move 53
nilpotent 59
norm
– submultiplicative 89
normal form 163, 322
one-relator group 38
one-way function 317
pair
– ping-pong 98
pair compression 172
parallel RAM 162
PCP(G), Post correspondence problem 229
phase transition 48
pin 166
point
– visible 78
polycyclic 60
polynomial identity testing problem 190
polynomial time 12
power circuit 201
– correct 202
– marking 202
– reduced 206
power word 193
power word problem 193
Prasad, Gopal 84
prefix-heavy 72
presentation 164
property
– generic 77
– negligible 77
pure 46
quadratic time 19
quasiconvex 46
quasiconvex subgroup 17
quotient test 11
random
– element 77, 114
– subgroup 77, 90
random access machine 159
random process 2, 10, 12, 15, 27
random van Kampen diagrams 24
random walk 10, 13, 16, 23
Rapinchuk, Andrei 84
reachability theorem 195
recursive method 67
reduced word 164
rejection algorithm 66
relatively hyperbolic group 12, 17
relatively hyperbolic groups 20
relator 164
residually 57
rewrite system 163
right-angled Artin group 16, 20
search group-theoretic problems 23
semidirect product 185, 339
sequence
– well-rounded 109
series
– lower central 58
size function 4
small cancellation 47
SMP(G, X), submonoid membership problem 224
solvability of word equations 197
space bounded transducer 161
SSP(G, X), subset sum problem 224, 247
stable subgroup 17
straight-line program 170
strictly minimal element 32
subgroup
– arithmetic 77
– Zariski dense 77
subgroup membership problem 10, 332
subgroup-restricted conjugacy search problem 345
support of a mapping 203
TEP, triviality of equalizer problem 292
terminating relation 162
theorem
– Frobenius density 146
– Jordan 145
Tietze transformations 336
Turing machine 4
twisted conjugacy problem 324
uniform probability distribution 7
Whitehead minimal 47, 69
– strictly 69
Whitehead move 31
Whitehead’s algorithm 30–32, 35
word
– cyclic reduction 46
– geodesic 46
– reduction 46
word choice problem 328
word equation 197
word problem 10–12, 164, 328
word search problem 23, 24, 185, 328
word-hyperbolic group 17, 20
wreath product 190