
Volker Diekert, Manfred Kufleitner, Gerhard Rosenberger, Ulrich Hertrampf

Elements of Discrete Mathematics

Also of Interest

Discrete Algebraic Methods. Arithmetic, Cryptography, Automata and Groups
Volker Diekert, Manfred Kufleitner, Gerhard Rosenberger, Ulrich Hertrampf, 2016
ISBN 978-3-11-041332-8, e-ISBN (PDF) 978-3-11-041333-5, e-ISBN (EPUB) 978-3-11-041632-9

Geometry and Discrete Mathematics. A Selection of Highlights
Benjamin Fine, Anja Moldenhauer, Gerhard Rosenberger, Annika Schürenberg, Leonard Wienke, 2022
ISBN 978-3-11-074077-6, e-ISBN (PDF) 978-3-11-074078-3, e-ISBN (EPUB) 978-3-11-074093-6

A Primer in Combinatorics
Alexander Kheyfits, 2021
ISBN 978-3-11-075117-8, e-ISBN (PDF) 978-3-11-075118-5, e-ISBN (EPUB) 978-3-11-075124-6

Algebra and Number Theory. A Selection of Highlights
Benjamin Fine, Anja Moldenhauer, Gerhard Rosenberger, Annika Schürenberg, Leonard Wienke, 2023
ISBN 978-3-11-078998-0, e-ISBN (PDF) 978-3-11-079028-3, e-ISBN (EPUB) 978-3-11-079039-9

Algebraic Graph Theory. Morphisms, Monoids and Matrices
Ulrich Knauer, Kolja Knauer, 2019
ISBN 978-3-11-061612-5, e-ISBN (PDF) 978-3-11-061736-8, e-ISBN (EPUB) 978-3-11-061628-6

Volker Diekert, Manfred Kufleitner, Gerhard Rosenberger, Ulrich Hertrampf

Elements of Discrete Mathematics

Numbers and Counting, Groups, Graphs, Orders and Lattices

Mathematics Subject Classification 2020
05-01, 05A15, 05A19, 05C10, 05C21, 05C45, 06-01, 11-01, 60-01, 68R10, 94-01

Authors

Prof. Dr. Volker Diekert
University of Stuttgart
Department of Computer Science
Universitätsstr. 38
70569 Stuttgart
Germany
[email protected]

Prof. Dr. Gerhard Rosenberger
University of Hamburg
Department of Mathematics
Bundesstr. 55
20146 Hamburg
Germany
[email protected]

Dr. Manfred Kufleitner
University of Stuttgart
Department of Computer Science
Universitätsstr. 38
70569 Stuttgart
Germany
[email protected]

Prof. Dr. Ulrich Hertrampf
University of Stuttgart
Institute of Formal Methods of Computer Science
Universitätsstr. 38
70569 Stuttgart
Germany
[email protected]

ISBN 978-3-11-106069-9
e-ISBN (PDF) 978-3-11-106255-6
e-ISBN (EPUB) 978-3-11-106288-4

Library of Congress Control Number: 2023944678

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2024 Walter de Gruyter GmbH, Berlin/Boston
Cover image: using the magic square from the engraving "Melencolia I" by Albrecht Dürer, public domain
Typesetting: VTeX UAB, Lithuania
Printing and binding: CPI books GmbH, Leck
www.degruyter.com

Preface

Preludium

Indeed what I have here written makes no claim to novelty in points of detail; and therefore I give no sources, because it is indifferent to me whether what I have thought has already been thought before me by another.1

Wittgenstein tells us more in his foreword to his Tractatus logico-philosophicus: “What can be said at all can be said clearly; and about that of which one cannot speak, one must stay silent.”

About the content

This book is based on the courses Discrete Mathematics and Concrete Mathematics for master's students in Computer Science and Mathematics at the University of Stuttgart, and in parts on the course Linear Algebra and Algebraic Structures for Computer Scientists at the University of Dortmund. These courses were held successfully over many years, and the experience gained from them had an essential influence on the selection of the material as well as on its presentation. Over the years the amount of material kept increasing, so we decided to focus on just a few Elements of Discrete Mathematics, which became the current title.

The title of the present textbook expresses the limitation to elementary contents, but more importantly, it expresses our admiration for unattainable mathematical models, beginning with the Elements of Euclid (around 300 BC) and the Éléments de géométrie algébrique by Grothendieck (1928–2014), which he wrote in the 1960s with the help of Dieudonné (1906–1992). The idea of this book is to convey essential elements of discrete mathematics needed to work with modern developments of the information era in a mathematically competent manner.

The English edition of this book covers the material of its German original, but there is also new material. For example, we introduced the first chapter about algebraic structures, which are used throughout our book and in many other areas of Computer Science, such as the design and analysis of efficient algorithms or hardware and software verification. Rings and fields are ubiquitous, and when computing modulo n for large n, which may have thousands of digits, we need to understand fast exponentiation. This includes calculations using large numbers and computing modulo n. An increasing number of people use online banking or cashless payment. It is therefore crucial to develop an understanding of why these transactions do not merely pretend to be safe but, under realistic assumptions, actually provide a safety guarantee as long as the rules of the game are respected.

1 This is from the 1922 English translation of "Logisch-Philosophische Abhandlung" by Ludwig Wittgenstein (1889–1951), also known as "Tractatus logico-philosophicus". Wittgenstein was born in Austria, but he became a British citizen in 1939.

One should also know what might happen if these rules are violated. For this, we need to investigate prime numbers and consider important facts about their density. This is part of our presentation of elementary number theory. In particular, encryption via the RSA method will be explained. Then we deal with estimates which are essential when it comes to counting objects or understanding the running time of certain algorithms. Various algorithms that are completely reliable in practice use randomness as a tool to obtain a result at all. A chapter on discrete probability provides what we need here. Then we go to the heart of discrete mathematics, including combinatorics and generating functions.

Later chapters are rather independent of each other and can be read in any order. For example, there is a new and more advanced chapter on group actions. We briefly discuss p-groups and nilpotent groups, which are important in many applications. As a byproduct of earlier results, we also show the simplicity of alternating groups. Special attention is paid to graph theory. The corresponding chapter contains various algorithms, ranging from bipartite matching to the planar separator theorem. Finally, we deal with order structures and lattices, as well as Boolean functions and circuits.

Exercises in this book have a high priority. We also provide solutions to all exercises, but, of course, in order to keep the learning effect as high as possible, it is not recommended to fall back on the solutions too quickly. We provide complete proofs for all important statements, and we refrain from excuses like "This would go far beyond the scope of this book". The prior-knowledge requirements are quite low; this means that parts of the book are to some extent also suitable for high school students.

The basic content discussed is not just a series of definitions and elementary interrelations. Instead of presenting recipes to be followed rigidly, the book attempts to convey a deeper understanding of the concepts in their mathematical context. Our aim is to present knowledge, techniques, and ways of thinking that enable readers to solve mathematical problems independently. We present numerous bijective proofs using combinatorial interpretations. In contrast, inductive proofs make it relatively easy for the reader to check the correctness of known results; however, essential ideas as well as the origin of these results often remain hidden. A combinatorial interpretation, on the other hand, is able to shed light on the result. Moreover, we provide numerous pictures to illustrate the mathematical derivations.

Discrete mathematics is a very modern and exciting area with a wide range of applications. We tried to emphasize the value of mathematical aesthetics, even if in some cases this was at the expense of the strongest possible results. Mathematical entertainment had a decisive influence on the selection of the advanced topics. This book supplements and deepens the basics and demonstrates possible applications. Some topics that go beyond the standard material are also covered. We hope that the reader will find at least one highlight in each chapter. We prefer fluent reasoning to lengthy explanations; thus, there should be space for the reader's own considerations.


At the end of each chapter, we inserted short chapter summaries as a learning aid and mnemonic. In writing the text, we took inspiration from, and adapted, proofs by other mathematicians, which we found, for example, in the textbooks [2, 13, 32] and in the book [22] of Graham (1935–2020), Knuth (born 1938), and Patashnik (born 1954). Moreover, in many places we used original literature that has not yet appeared in textbooks. Sometimes something new has emerged; in other places our text remained close to the sources. In these cases we often provide explicit hints. If mathematicians are mentioned by name, we added biographical information whenever it looked reasonable to us and the dates were publicly available or we received permission to state the year of birth. We hope that a chronological classification helps to understand mathematical developments in context. In the case of living mathematicians, we have resorted to their own transliteration whenever we were aware of it. Finally, we typically omit punctuation marks at the end of displayed formulas.

About the authors

Volker Diekert (born 1955) studied in Hamburg and Montpellier, completing a Diplôme des Études Supérieures with Alexander Grothendieck. He graduated in mathematics in Hamburg, where he regularly attended seminars offered by Ernst Witt, and he earned his PhD in algebraic number theory under the direction of Jürgen Neukirch in Regensburg. These three outstanding late mathematicians were also truly impressive personalities with an enduring influence on his life. He habilitated in informatics at the Technical University of Munich. After that he held the chair for Theoretical Computer Science at the University of Stuttgart from 1991 until his retirement in 2023.

Manfred Kufleitner (born 1977) studied computer science with a minor in mathematics at the University of Stuttgart. He earned his doctorate in computer science at the University of Stuttgart under the supervision of Volker Diekert. After one year as a postdoc at LaBRI in Bordeaux, Manfred returned to Stuttgart, where he did his habilitation. Manfred had the honor of being invited as a guest professor at the Technical University of Munich and as an adjunct professor at the University of Hamburg. He was a lecturer in Computer Science at Loughborough University. Since 2020, he has been a member of staff at the University of Stuttgart.

Gerhard Rosenberger (born 1944) earned his PhD in Hamburg with a thesis in analytic number theory and habilitated in combinatorial group theory. He has the greatest life experience among the authors and the longest background in teaching mathematics. He also has profound experience in writing textbooks. His research work has been enriched by extended stays in Finland, Scotland, Switzerland, Russia, and the United States. In his scientific collaborations he has coauthors from more than 25 different countries. At present he continues to teach at the University of Hamburg.

Ulrich Hertrampf (born 1956) started his career in pure mathematics. After his diploma thesis in the area of group representations, he earned his PhD in the area of

practical computer science, but returned to theoretical research when he completed his habilitation in the structural complexity theory group of Klaus Wagner in Würzburg. He is a professor at the University of Stuttgart, where he worked in the group of the first author from 1996 to 2023.

About anagrams

The working title for the German edition of this book was simply Diskrete Mathematik, based on a course with the same title, which students renamed by swapping letters to Diekerts Mathematik (i. e., Diekert's Mathematics). This anagram is no longer visible in the title, but our cover shows an excerpt from the famous copperplate engraving by Albrecht Dürer with the title Melencolia § I, an anagram of Cameleon § LI I. Our cover shows the versatile elements of a magic square running away. A print on laid paper of Dürer's Melencolia § I belongs, for example, to the National Gallery of Art in Washington (DC).

Acknowledgment

A book like this cannot become reality without the support and help of various people. In particular, we would like to list the following names here (in alphabetical order): Murray Elder, Jonathan Kausch, Jürn Laun, Alexander Lauser, Caroline Mattes, Heike Photien, Horst Prote, Aila Rosenberger, Florian Stober, Tobias Walter, and Armin Weiß. All of them contributed to the German version and/or the current edition by creating or solving exercises, reading and correcting our texts, and helping us to write in LaTeX and draw pictures with Till Tantau's TikZ. All mistakes that remain in the text are the responsibility of the authors. Thanks are also due to De Gruyter for accepting this manuscript for their textbook series.

Stuttgart and Hamburg, September 2023


VD, MK, GR, UH

Notation

Sets, relations, and functions

All mathematical objects studied in the book are sets. A natural or complex number is a set, a graph is a set, etc. So, for example, if we speak about the set of all finite graphs, we might use an encoding such that the vertex and edge sets of a graph are finite subsets of natural numbers. Hence, the set of all finite graphs becomes a subset inside the set of sets over the natural numbers. Dealing with mathematical objects, we have to say when they are equal. In graph theory, we distinguish between concrete graphs, which are equal if and only if the underlying vertex and edge sets are equal as sets. In contrast, two abstract graphs are equal as soon as they are isomorphic. Thus, the encoding does not matter.

The empty set is denoted by ∅. The notation was introduced by André Weil² (1906–1998). Let A and B be sets. Their union (resp. intersection) is denoted by A ∪ B (resp. A ∩ B). We write A ⊆ B for set inclusion, A ⊊ B for strict set inclusion, and A \ B (or, occasionally, A − B) for the set difference {a ∈ A | a ∉ B}.

A relation between A and B is a subset of A × B. The Cartesian product A × B of two sets A and B is the set of all ordered pairs (a, b) with a ∈ A and b ∈ B. In particular, if (a, b) = (c, d), then a = c and b = d. In the following, a pair always means an ordered pair.³ The construction is named after the French philosopher, scientist, and mathematician René Descartes (1596–1650).

A mapping (or function) f : A → B from A to B is a triple (A, B, R), where R ⊆ A × B is a relation such that for every a ∈ A there is exactly one b ∈ B with (a, b) ∈ R. We also write f(a) = b or a ↦ b in that case. In particular, the set of mappings from A to B is itself a set; it is denoted by B^A. More generally, if I is any set such that for each i ∈ I there is some set A_i, then the Cartesian product ∏_{i∈I} A_i is defined as the set of mappings f from I to ⋃_{i∈I} A_i such that f(i) ∈ A_i for all i ∈ I. We follow the axiom of choice, which asserts that ∏_{i∈I} A_i ≠ ∅ if A_i ≠ ∅ for all i ∈ I.

The power set of a set A is the set of subsets of A. According to Section 1.1, we let 2 = {0, 1}; then the power set can be identified with 2^A via characteristic functions, since every function χ ∈ 2^A defines a subset χ⁻¹(1) = {a ∈ A | χ(a) = 1}, and vice versa, a subset B ⊆ A defines a function χ_B ∈ 2^A by χ_B(a) = 1 ⇐⇒ a ∈ B.

If A is any set, then the identity mapping is denoted by id_A. The identity mapping id_∅ of the empty set is the (nonempty) triple (∅, ∅, ∅). The composition of mappings (or functions) f : A → B and g : B → C is defined by g ∘ f : A → C where (g ∘ f)(a) = g(f(a)). Identities behave as neutral elements: we have (g ∘ id_B) = (id_C ∘ g) = g.

2 Weil created the symbol ∅ modeling the letter Ø in the Danish and Norwegian alphabet.
3 Following Kuratowski (1896–1980), a pair (a, b) ∈ A × B is the set (a, b) = {{a}, {a, b}}.
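The identification of subsets of A with functions in 2^A is easy to make concrete. The following Python sketch is our own illustration (the function names are ours, not the book's):

def char_fun(B, A):
    # characteristic function chi_B in 2^A, as a dict from A to {0, 1}
    return {a: 1 if a in B else 0 for a in A}

def subset_of(chi):
    # recover the subset chi^{-1}(1) from a characteristic function
    return {a for a, v in chi.items() if v == 1}

A = {1, 2, 3}
B = {1, 3}
chi = char_fun(B, A)
assert subset_of(chi) == B   # chi^{-1}(1) = B, so both views are interchangeable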

A mapping f : A → B is injective (or an embedding) if for every b ∈ B there is at most one a ∈ A with f(a) = b. It is surjective if for all b ∈ B there is at least one a ∈ A with f(a) = b. It is bijective if it is both injective and surjective. We also speak of injections, surjections, and bijections, respectively. A bijection from a (finite) set to itself is also called a permutation. A set A is countable if there is an injection f : A → ℕ.

[Figure: four diagrams of a mapping f : A → B, illustrating an arbitrary, an injective, a surjective, and a bijective mapping.]
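For finite sets, the three properties can be tested directly from the definitions. A minimal Python sketch (our own illustration):

def is_injective(f, A, B):
    # every b in B has at most one preimage
    images = [f[a] for a in A]
    return len(set(images)) == len(images)

def is_surjective(f, A, B):
    # every b in B has at least one preimage
    return {f[a] for a in A} == set(B)

A, B = {0, 1, 2}, {"x", "y", "z"}
f = {0: "x", 1: "y", 2: "z"}   # bijective
g = {0: "x", 1: "x", 2: "y"}   # neither injective nor surjective
assert is_injective(f, A, B) and is_surjective(f, A, B)
assert not is_injective(g, A, B) and not is_surjective(g, A, B)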

Landau symbols

In many cases, the growth behavior of a given function is more relevant than specific values. To take care of this, Landau symbols (named after Edmund Georg Hermann Landau, 1877–1938) turned out to be very useful. This notation has been designed to describe function classes. In the sequel, we will consider functions f, g, ... from ℕ to ℝ or, more generally, to ℂ. The function classes 𝒪(g), Ω(g), and Θ(g) (read "Oh of g", "Omega of g", and "Theta of g") are defined as follows:

𝒪(g) = { f : ℕ → ℂ | ∃c > 0 ∃n₀ ≥ 0 ∀n ≥ n₀ : |f(n)| ≤ c ⋅ |g(n)| }
Ω(g) = { f : ℕ → ℂ | g ∈ 𝒪(f) }
Θ(g) = 𝒪(g) ∩ Ω(g)

Thus, f ∈ 𝒪(g) if, up to a constant, f eventually (i. e., for inputs greater than some constant n₀) does not grow faster than g. Moreover, by convention, classes defined using 𝒪 are always considered to be closed downwards. For example, the class 2^𝒪(n) should certainly contain all polynomials, too. Phrased differently, if f(n) ≤ g(n) for almost all n ∈ ℕ, then the convention says f(𝒪(n)) ⊆ g(𝒪(n)). Analogously, we define f ∈ Ω(g) if up to a constant eventually the growth rate of f is not less than that of g, and f ∈ Θ(g) if up to constants eventually the growth rates of f and g are the same. An example is given by (2n choose 3) ∈ Θ(n³) ⊊ 𝒪(2ⁿ). We define o(g) and ω(g) (read "little-oh" and "little-omega") by:

o(g) = { f : ℕ → ℂ | ∀c > 0 ∃n₀ > 0 ∀n ≥ n₀ : |f(n)| < c ⋅ |g(n)| }
ω(g) = { f : ℕ → ℂ | g ∈ o(f) }


Thus, o(g) contains those functions which eventually grow strictly slower than g. The functions in ω(g) eventually grow strictly faster than g. As a consequence, we have o(g) ⊊ 𝒪(g) and o(g) ∩ Θ(g) = ∅. The relations defined by 𝒪, Ω, and Θ are reflexive relations: f ∈ 𝒪(f), f ∈ Ω(f), and f ∈ Θ(f). All of these relations are transitive: if f ∈ 𝒪(g) and g ∈ 𝒪(h), then f ∈ 𝒪(h), and accordingly for Ω, Θ, o, ω. The only symmetric relation among these is Θ: if f ∈ Θ(g), then also g ∈ Θ(f). For any two bases a, b > 1, we have log_a(n) ∈ Θ(log_b(n)). Thus, classes like 𝒪(log n) can be defined without specifying the logarithm's base. Further, the following relationship can easily be seen:

log^k(n) ∈ o(log^{k+1}(n)) ⊊ o(n)

We say that a function f : ℕ → ℂ is polynomially bounded if f ∈ 𝒪(n^k) for some k ≥ 0. Accordingly, we speak of a polynomial-time algorithm if its running time is polynomially bounded. Sometimes the term efficient is used as a synonym for polynomially bounded. The class of polynomially bounded algorithms is robust in the sense that the definition is, to a large extent, independent of the machine model or implementation details.

The notation f ∼ g expresses that two functions f, g : ℕ → ℂ show the same asymptotic behavior. It refines the Θ-relation, and it is defined by:

f ∼ g ⇐⇒ lim_{n→∞} f(n)/g(n) = 1

It is a refinement because, for example, (n choose 2) = n(n−1)/2 ∈ Θ(n²), but (n choose 2) ≁ n². More generally, for a fixed k ≥ 0, we have (n choose k) = n!/(k!(n−k)!) ∼ n^k/k!. We will apply such asymptotic considerations only to functions where almost all values are nonzero, thus avoiding the discussion whether 0 ∼ 0 is true.
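The difference between Θ and ∼ can be observed numerically. A short Python sketch (our own illustration, using math.comb from the standard library):

from math import comb

# comb(n, 2) = n(n-1)/2 lies in Theta(n^2), but comb(n, 2)/n^2 tends to 1/2,
# so comb(n, 2) is not asymptotically equivalent (~) to n^2 ...
for n in (10, 100, 1000, 10000):
    print(n, comb(n, 2) / n**2)          # 0.45, 0.4950, 0.4995, 0.49995

# ... whereas comb(n, 2) ~ n^2/2: this ratio tends to 1
for n in (10, 100, 1000, 10000):
    print(n, comb(n, 2) / (n**2 / 2))    # 0.9, 0.99, 0.999, 0.9999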

Contents

Preface · V
Acknowledgment · IX
Notation · XI
Sets, relations, and functions · XI
Landau symbols · XII

1 Algebraic structures · 1
1.1 Numbers · 1
1.2 From semigroups to vector spaces · 2
1.3 Basic group theory · 6
1.4 A glimpse into rings · 10
1.4.1 Matrix rings · 10
1.4.2 Polynomials · 11
1.4.3 Ideals and quotient rings · 12
1.4.4 Complex numbers · 13

2 Elementary number theory · 15
2.1 Modular arithmetic · 15
2.2 Euclidean algorithm · 17
2.3 Fundamental theorem of arithmetic · 18
2.4 Bits and bytes · 19
2.5 Error detection for article numbers · 20
2.6 Chinese remaindering · 21
2.7 Fermat's primality test · 24
2.8 Fast exponentiation · 25
2.9 RSA encryption · 27
2.10 Euler's totient function · 29
2.11 Finite multiplicative subgroups of fields · 31
2.12 Fibonacci numbers · 33
2.13 Recursion depth of the Euclidean algorithm · 37
Exercises · 38
Summary · 41

3 Some useful growth estimates · 42
3.1 Factorials · 42
3.2 Binomial coefficients · 43
3.3 Least common multiple · 46
3.4 Prime number density · 49
3.5 Bertrand's postulate · 51
Exercises · 53
Summary · 55

4 Discrete probability · 56
4.1 Probabilities and expected values · 56
4.2 Jensen's inequality · 60
4.3 Birthday paradox · 61
Exercises · 62
Summary · 64

5 Combinatorics · 65
5.1 Enumerative combinatorics · 65
5.2 Binomial coefficients · 67
5.3 Average case analysis of bubble sort · 78
5.4 Inclusion and exclusion · 79
5.5 Rencontres numbers · 82
5.6 Stirling numbers · 83
5.6.1 Stirling numbers of the second kind · 84
5.6.2 Stirling numbers of the first kind · 88
5.7 Bell numbers · 92
5.8 Partition numbers · 93
5.9 Catalan numbers · 96
5.10 Dyck words · 96
5.11 Binary trees · 98
5.12 Expected height of binary search trees · 100
Exercises · 102
Summary · 106

6 Generating functions · 108
6.1 Ordinary generating functions · 108
6.2 Fibonacci numbers · 109
6.3 Catalan numbers · 110
6.4 Stirling numbers of the second kind · 111
6.5 Partition numbers · 111
6.6 Growth of the partition numbers · 115
6.7 Pentagonal number theorem · 117
6.8 Exponential generating functions · 120
6.9 Stirling numbers of the first kind · 121
6.10 Bell numbers · 122
Exercises · 123
Summary · 125

7 Group actions and special families of groups · 126
7.1 Group actions · 126
7.2 Orbit-stabilizer theorem · 127
7.3 p-groups · 128
7.4 Cyclic groups · 129
7.5 Dihedral groups · 130
7.6 Symmetric groups · 131
7.7 Alternating groups · 134
7.8 Nilpotent groups · 138
7.9 Unit triangular groups · 139
Exercises · 142
Summary · 143

8 Graph theory · 144
8.1 Basic terms · 144
8.2 Eulerian and Hamiltonian cycles · 150
8.3 Trees · 152
8.4 Cayley's formula · 154
8.5 Hall's marriage theorem · 156
8.6 Stable marriage · 158
8.7 Menger's theorem · 160
8.8 Maximum flows · 161
8.9 Max-flow min-cut theorem · 162
8.10 Dinic's algorithm · 167
8.11 Planar graphs · 170
8.12 Euler's formula · 172
8.13 Colorings of planar graphs · 174
8.14 Planar separators · 174
8.15 Ramsey's theorem · 177
Exercises · 181
Summary · 184

9 Order structures and lattices · 185
9.1 Partial orders · 185
9.2 Complete partial orders · 189
9.3 Denotational semantics · 190
9.4 Least fixed points of monotone mappings · 193
9.5 Lattices · 195
9.6 Complete lattices · 197
9.7 Modular and distributive lattices · 198
9.8 Boolean lattices · 203
9.9 Boolean rings · 204
9.10 Stone's general representation theorem · 207
Exercises · 210
Summary · 211

10 Boolean functions and circuits · 212
10.1 Shannon's upper bound · 214
10.2 Shannon's lower bound · 215
10.3 Lupanov's asymptotically optimal upper bound · 217

Solutions · 221
Chapter 2 · 221
Chapter 3 · 225
Chapter 4 · 228
Chapter 5 · 231
Chapter 6 · 237
Chapter 7 · 240
Chapter 8 · 242
Chapter 9 · 248

Bibliography · 251
Index · 253

1 Algebraic structures

Our first chapter deals with numbers and a few algebraic results which are typically covered in standard courses in analysis, algebra, and linear algebra. Such a short chapter has to be highly selective. We restrict ourselves to those algebraic results which pop up later in the book, or which are, according to our experience, frequently used in the literature (or in other courses) about Discrete Mathematics. The reader will find various examples, and following the spirit of the book, (essentially) all mathematical statements have concise proofs. Experienced readers might skip Chapter 1 or read only those parts on demand when needed.

1.1 Numbers

According to Giuseppe Peano (1858–1932), the natural numbers ℕ can be defined by the following axioms:
– There is a natural number zero.
– The successor of a natural number is a natural number again.
– Zero is not the successor of any natural number.
– The successors of two different natural numbers are different.
– Any subset of natural numbers, containing zero and with every number also its successor, is the set of all natural numbers.

This last requirement is also called the axiom of (mathematical) induction. We denote zero by 0 and the successor of a number n by s(n). A standard realization defines a natural number as the set of all smaller natural numbers. The number 0 corresponds to the empty set ∅, the number 1 = s(0) to the one-element set {∅}, and the number 2 = s(1) is the set {∅, {∅}} with two elements. Starting from 0 = ∅, in this notation we have s(n) = {0, ..., n} for all natural numbers n. Thus, the number n is a set of n elements. The order relation m ≤ n in this setting corresponds to the subset relation m ⊆ n.
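This standard realization can be made executable. The Python sketch below (our own illustration) encodes 0 = ∅ and s(n) = n ∪ {n} with frozensets, so that m ≤ n corresponds to m ⊆ n:

def succ(n):
    # s(n) = n ∪ {n} = {0, ..., n}
    return n | {n}

zero = frozenset()          # 0 is the empty set
nums = [zero]
for _ in range(3):          # build 1, 2, 3
    nums.append(succ(nums[-1]))

assert len(nums[3]) == 3    # the number n is a set of n elements
assert nums[2] <= nums[3]   # m ≤ n corresponds to m ⊆ n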

The usual construction of number systems starts with ℕ, from which the integers ℤ and then the rational numbers ℚ can be obtained. Then, from ℚ the set of real numbers ℝ is constructed, for example, by Dedekind completion, named after Julius Wilhelm Richard Dedekind (1831–1916). We assume the reader to be familiar with the real numbers. Since the polynomial X² + 1 has no real zero, an imaginary number i = √−1 is introduced, which has the property that its square is the negative number −1. Hence, i² + 1 = 0. Thereby, we obtain the complex numbers ℂ. It turns out that every nonconstant polynomial over ℂ has a zero. The corresponding property says that ℂ is algebraically closed. But this fundamental result is outside the scope of this book and never used here. However, since complex numbers appear several times in the book, we give a formal definition in Section 1.4.4. Complex numbers are not needed to understand the results in the context of real numbers or finite fields, which is our main focus.

1.2 From semigroups to vector spaces

Since ancient times, algebraic terminology and techniques have been used in most areas of mathematics. Algebraic methods have their origins in elementary number theory and the study of equations. Modern algebra is characterized by numerous notions. In fact, this is one of the contributions of these theories, as important properties and criteria can be described and investigated in a uniform way. In our book we mainly meet algebraic structures in the range from semigroups (like the positive natural numbers with addition) to rings (like ℤ) and fields (like 𝔽_p = ℤ/pℤ for a prime p, or ℚ, or ℂ). A short overview of the various algebraic structures is given by the following overview.

One operation:
  semigroups: associative law
  monoids: additionally, a neutral element
  groups: additionally, existence of inverses

Two operations, + and ⋅:
  (semi-)rings: distributive laws
  domains: additionally, x ⋅ y ≠ 0 for x, y ≠ 0
  fields: additionally, commutative multiplication and multiplicative inverses

For a binary mapping ∘ : M × M → M, we often use the infix notation x ∘ y instead of ∘(x, y). Frequently, we use a multiplicative notation; then we also write x ⋅ y or xy for short. In an additive notation, we write x + y instead of x ∘ y. An operation on M is associative if for all x, y, z ∈ M we have (xy)z = x(yz), and it is commutative or Abelian, named after Niels Henrik Abel (1802–1829), if xy = yx holds for all x, y ∈ M. For a commutative operation ∘, we mostly write x + y. For instance, addition forms a commutative and associative operation on the natural numbers ℕ = {0, 1, 2, ...}. The set of integers ℤ is divided into three parts: the negative integers {−1, −2, ...}, the set {0}, and the positive natural numbers (resp. positive integers) ℕ \ {0}.

An element e ∈ M is called idempotent if e² = e. It is called neutral if xe = ex = x for all x ∈ M. A neutral element is idempotent, but the converse does not hold: consider the set {0, 1} with multiplication; then both elements are idempotent, 1 is neutral, but 0 is not. In a multiplicative notation, we frequently denote the neutral element by 1, whereas for additive notation we may use 0. The neutral element is often called the identity. An element z ∈ M is a zero element if xz = zx = z for all x ∈ M. In a multiplicative notation,


the zero element is frequently denoted as 0. There will be, however, no risk of confusion with the neutral element in commutative structures.

Definition 1.1. Let (M, ⋅) be a set with a binary operation (x, y) ↦ x ⋅ y.
(a) The set M is called a semigroup if the operation is associative.
(b) It is called a monoid if M is a semigroup with a neutral element 1. That is, 1 ⋅ x = x ⋅ 1 = x for all x ∈ M.
(c) It is called a group if M is a monoid with a neutral element 1 such that for all x ∈ M there is some y ∈ M with yx = 1 = xy. We also write y = x⁻¹ in this case.
(d) If M is a monoid, then the group of units U(M) is the largest subgroup in M which contains the neutral element 1 of M: it is the set of all x ∈ M such that x⁻¹ ∈ M with xx⁻¹ = x⁻¹x = 1 exists.

Example 1.2. The empty set ∅ is a semigroup, but there is no neutral element. The one-point set {⋆} is a group with its unique binary multiplication. The set M = {0, ..., n} with the operation x ⊕ y = min{x + y, n} forms a finite commutative monoid. The element n is a zero element, and the element 0 is neutral. If n ≥ 1, then M is not a group, because 0 ≠ n = n ⊕ x for all x ∈ M. ⬦

In algebra, objects of main interest include rings, fields, polynomial rings, and the ring of matrices over a commutative ring or field. All these structures have two operations: a commutative addition and a multiplication which is not necessarily commutative.

Definition 1.3. A semiring (with 1) is given by a tuple (R, +, ⋅), where (R, +) is a commutative monoid with neutral element 0 and (R, ⋅) is a monoid with neutral element 1 such that multiplication distributes over addition:

(x + y) ⋅ z = x ⋅ z + y ⋅ z
z ⋅ (x + y) = z ⋅ x + z ⋅ y

The latter is called the distributivity property. Moreover, multiplication by 0 annihilates R. That is, 0 ⋅ x = x ⋅ 0 = 0. A semiring (R, +, ⋅) is called a ring if (R, +) is a group. In this case the minus sign − denotes subtraction. The (semi)ring R is commutative if (R, ⋅) is commutative.

Example 1.4. The natural numbers ℕ (resp. ℤ) form a commutative semiring (resp. ring) with the usual addition and multiplication. If (R, +, ⋅) is a semiring and n ∈ ℕ, then we have a natural interpretation of n in R as the n-fold addition of the neutral element for the multiplication in R. In particular, we have an interpretation of n ⋅ r ∈ R for all n ∈ ℕ and r ∈ R. If (R, +, ⋅) is a ring and n ∈ ℤ, then we have already defined n ⋅ r for all n ∈ ℕ and r ∈ R. For n < 0, we define n ⋅ r by n ⋅ r = −((−n) ⋅ r). ⬦
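The finite monoid of Example 1.2 can be checked by brute force. A small Python sketch (our own code, writing the truncated addition as op):

n = 4
M = range(n + 1)

def op(x, y):
    # truncated addition x ⊕ y = min{x + y, n} on {0, ..., n}
    return min(x + y, n)

assert all(op(op(x, y), z) == op(x, op(y, z)) for x in M for y in M for z in M)
assert all(op(x, y) == op(y, x) for x in M for y in M)   # commutative
assert all(op(0, x) == x for x in M)                     # 0 is neutral
assert all(op(n, x) == n for x in M)                     # n is a zero element
assert not any(op(n, x) == 0 for x in M)                 # n has no inverse: not a group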

We usually write R instead of (R, +, ⋅) or (R, +, ⋅, 0, 1). The zero ring R = {0} is the only semiring with 0 = 1. An element having a multiplicative inverse is a unit, and R* = {r ∈ R | ∃s : rs = sr = 1} is called the group of units or the multiplicative group of R. Note that 1 ∈ R*. An element r of a ring R is a zero divisor if there exists a nonzero element s ≠ 0 in R with rs = 0 or sr = 0. Note that 0 itself is a zero divisor unless 0 = 1. For example, in the ring ℤ/6ℤ the numbers 2 and 3 are the nonzero zero divisors. Invertible elements cannot be zero divisors. If R and S are rings, then the direct product R × S is a ring with component-wise addition and multiplication. If |S| > 1, then every element (r, 0) is a zero divisor.

Definition 1.5. A ring is a domain if 0 is the only zero divisor. An integral domain is a commutative domain. A ring R is a skew field if R* = R \ {0}. A commutative skew field is called a field.

A skew field is a ring where (R \ {0}, ⋅) is a group. The positive integers {n ∈ ℕ | n ≥ 1} form a commutative semiring, and ℤ is a commutative ring. We have ℤ* = {1, −1}, i. e., ℤ is not a field. The rationals ℚ, the reals ℝ, and the complex numbers ℂ are fields. Skew fields like the quaternions ℍ typically appear in matrix rings over a field. Later, in Section 2.1, we will study in detail the rings ℤ/nℤ of integers modulo n. In particular, we will see that ℤ/nℤ is a field if and only if n is a prime.

If R is a semiring and X a set, then R^X = {f : X → R | f is a mapping} forms a semiring with pointwise addition and multiplication. Formally, for f, g ∈ R^X the mappings f + g ∈ R^X and f ⋅ g ∈ R^X are defined by

(f + g)(x) = f(x) + g(x)  and  (f ⋅ g)(x) = f(x) ⋅ g(x)    (1.1)

If X ≠ ∅, then the semiring R^X is commutative if and only if R is commutative. The multiplication in R^X, according to (1.1), must not be confused with the multiplication in R^{n×n}, the ring of n-by-n matrices over a commutative ring R, which can be found in Section 1.4.1, Equation (1.3). Actually, if R is a field F, then F^{n×n} is also an F-vector space of dimension n² according to the next definition.

Definition 1.6. Let F = (F, +, ⋅) be a field and V = (V, +) an Abelian group. Then V is called an F-vector space if F and V are linked by a scalar multiplication F × V → V, (α, u) ↦ α ⋅ u, satisfying the following axioms for all α, β ∈ F and u, v ∈ V:
– (αβ) ⋅ u = α ⋅ (β ⋅ u) and 1 ⋅ u = u.
– (α + β) ⋅ (u + v) = α ⋅ u + α ⋅ v + β ⋅ u + β ⋅ v.

A subset L ⊆ V is called linearly independent if for all finite subsets I ⊆ L we have ∑_{u∈I} α_u ⋅ u = 0 ⇐⇒ ∀u ∈ I : α_u = 0. A linearly independent set B is called a basis if the set F ⋅ B = {f ⋅ b | f ∈ F, b ∈ B} generates the Abelian group V. The cardinality of a basis B is called the dimension of V, denoted dim(V).


The axioms imply the rules (−1) ⋅ u = −u and α ⋅ u = 0 ⇐⇒ α = 0 ∨ u = 0. Often αu is written instead of α ⋅ u. It can be shown that dim(V) is well defined and does not depend on the chosen basis. For finite-dimensional vector spaces, this is a standard exercise in linear algebra. For arbitrary cardinalities, it follows using the axiom of choice.

Example 1.7. Let X be a set and F a field. Then the set F^X with component-wise addition is an Abelian group. It is an F-vector space by letting (α ⋅ f)(x) = α ⋅ f(x) for all f ∈ F^X, α ∈ F, and x ∈ X. If |X| = n is finite, then we have dim(F^X) = n, because the functions χ_x for x ∈ X, where χ_x maps y ∈ X to 1 if y = x and to 0 otherwise, form a basis. ⬦

A subset Y of an algebraic structure Z forms a substructure if Y itself has the same structural properties as those claimed for Z. For example, consider the semigroup M = {1, 0} with multiplication as operation; in this case {1} and {0} are subsemigroups of M. The semigroup M forms a monoid, but only {1} is a submonoid. Even though {0} forms a monoid, it is not a submonoid, because it does not have the same identity as M. A substructure Y of Z is generated by X if it is the smallest substructure containing X. We denote this substructure by ⟨X⟩.

A mapping between two algebraic structures which is compatible with the respective operations (such as + and ⋅) and which maps neutral elements to neutral elements is called a homomorphism. A bijective homomorphism between algebraic structures is called an isomorphism, because the inverse mapping is a homomorphism, too. An automorphism is an isomorphism from a structure onto itself. Thus, a homomorphism φ : M → N between monoids M and N has the two properties φ(xy) = φ(x)φ(y) and φ(1) = 1. Here, 1 denotes the neutral element 1_M of M and 1_N of N, respectively. A homomorphism φ : G → H between groups automatically maps the neutral element in G to the neutral element in H. Thus, the additional requirement φ(1) = 1 is redundant for groups.

Definition 1.8. If M is a monoid, we define the set End(M) of endomorphisms as the set of homomorphisms h : M → M from M into itself. Its subset of automorphisms is denoted by Aut(M). With the operation given by composition of mappings, End(M) is a monoid; Aut(M) is the subgroup of units in End(M).

Example 1.9. Let Σ be a set; then Σ* (resp. Σ⁺) denotes the set of finite sequences (a₁, ..., aₙ) with n ≥ 0 (resp. n ≥ 1). An element (a₁, ..., aₙ) is also called a word, and we define its length to be n. The empty sequence has length zero; it is also called the empty word. In the context of words, we also say that Σ is an alphabet, and an element of Σ is called a letter. The set Σ* is a monoid with the following operation, where the empty word is the neutral element:

(a₁, ..., aₘ) ⋅ (b₁, ..., bₙ) = (a₁, ..., aₘ, b₁, ..., bₙ)

The set of nonempty words Σ⁺ is the so-called free semigroup over Σ. It is free because the natural inclusion ι : Σ ↪ Σ⁺ induces, for all semigroups S, a natural bijection

φ ↦ φι between the set of homomorphisms from Σ⁺ to S and the set of mappings from Σ to S. Similarly, the set of all words Σ* is the free monoid over Σ. The monoid of endomorphisms End(Σ*) can be identified with the set of mappings from Σ to Σ*. It is infinite as soon as Σ ≠ ∅. Its subgroup Aut(Σ*) is the set of permutations of Σ. It is finite as long as Σ is finite. ⬦

1.3 Basic group theory

Symmetries (or, more generally, groups) reflect the fundamental idea of classifying concrete objects into classes of objects up to symmetry, which has been a major step in mathematical reasoning. Indeed, a tree in nature is a concrete object, but in mathematics a tree is a tree up to isomorphism. That is, a tree is an abstraction of a concrete tree. The term "group" goes back to Galois (1811–1832), who examined the solutions of polynomial equations over the rational numbers and tried to classify them into groups of solvable equations.

Let us recall that, by Definition 1.1, a group G is a monoid with a neutral element 1 ∈ G such that for all x ∈ G we have 1 ⋅ x = x ⋅ 1 = x and there is an inverse y ∈ G with x ⋅ y = y ⋅ x = 1. In fact, it is sufficient to require the existence of left inverses. To see this, let yz = 1. Then we also have xy = 1 for some x. Thus, x = x(yz) = (xy)z = z, and therefore zy = xy = 1. Hence, y is also the right inverse of z. Moreover, the inverses are uniquely determined, and we write x⁻¹ for the inverse y of x. If the operation is written additively (i. e., using + instead of ⋅), then the inverse is denoted by −x and the neutral element is 0. For example, the integers ℤ, with addition, form a group.

A subset H of a group G is a subgroup if H contains the neutral element of G and if H with the operation of G forms a group. We also write H ≤ G in order to indicate the subgroup relation. For any subset X ⊆ G, the subgroup of G generated by X is ⟨X⟩. Thus, ⟨X⟩ contains exactly those group elements which can be written as a product of elements x and x⁻¹ with x ∈ X. The empty product yields the neutral element. For X = {x₁, ..., xₙ}, we also write ⟨x₁, ..., xₙ⟩ instead of ⟨{x₁, ..., xₙ}⟩.

In the following, let G be a group, let g, g₁, g₂ ∈ G be arbitrary group elements, and let H be a subgroup of G. We call the set gH = {gh | h ∈ H} the (left) coset of H with respect to g. Analogously, Hg is the right coset with respect to g. The set of left cosets is denoted by G/H, i. e.,

G/H = {gH | g ∈ G}

1.3 Basic group theory

� 7

If G is commutative, then obviously all subgroups are normal. If a subgroup H is normal in G, then we also write H ⊴ G. It is stronger than the subgroup notation H ≤ G. The center of group G is defined by Z(G) = {g ∈ G | ∀h ∈ G : gh = hg}. The center is a normal subgroup in every group. Thus, we always have Z(G) ⊴ G. Lemma 1.10. For cosets and right cosets, the following properties hold: (a) |H| = |gH| = |Hg|; (b) g1 H ∩ g2 H ≠ 0 if and only if g1 ∈ g2 H if and only if g1 H ⊆ g2 H if and only if g1 H = g2 H; (c) |G/H| = |H\G|. Proof. (a) The mapping g⋅ : H → gH, x 󳨃→ gx is a bijection with inverse map g −1 ⋅ : gH → H, x 󳨃→ g −1 x. Thus |H| = |gH|, and by symmetry also |H| = |Hg|. (b) Let g1 h1 = g2 h2 with h1 , h2 ∈ H. Then g1 = g2 h2 h1−1 ∈ g2 H. From g1 ∈ g2 H, we obtain g1 H ⊆ g2 H ⋅ H = g2 H. Now, g1 H ⊆ g2 H immediately implies g1 H ∩ g2 H ≠ 0, and by symmetry g2 H ⊆ g1 H. Thus, all four statements in (b) are equivalent. (c) Note that g1 ∈ g2 H is true if and only if g1−1 ∈ Hg2−1 holds. Thus, using (b) we obtain a bijection gH 󳨃→ Hg −1 from the set G/H to H\G. Lemma 1.10 (b) shows that different cosets of H are disjoint. Each element g ∈ G belongs to the coset gH. These two facts together show that the cosets induce a partition of G, where by (a) all classes are of the same size. A set R ⊆ G is called a set of (left)-coset representatives if the canonical mapping G → G/H induces a bijection between R and G/H. In other words, for each left-coset gH there is exactly one r ∈ R with gH = rH. Right-coset representatives are defined symmetrically. The set of left-coset representatives has the same cardinality as any set of right-coset representatives. The index of H in G is the cardinality [G : H] = |G/H|. Lemma 1.10 (c) yields [G : H] = |H\G|. The order of G is its size |G|. The order of an element g ∈ G is the order of ⟨g⟩. If ⟨g⟩ is finite, then the order of g is the smallest positive integer n such that g n = 1. The following theorem is named after Joseph Louis Lagrange, 1736–1813. Its main application is for finite groups and stated in Corollary 1.12. We prove it first in the general form which holds for finite and infinite groups. Theorem 1.11. Let H be a subgroup of a group G and R ⊆ G be a set of left-coset representatives. Then the canonical mapping ρ : R × H → G,

(r, h) 󳨃→ rh

is bijective. In particular, there is a bijection between the sets G/H × H and G. Proof. Let g ∈ G, then there is some r ∈ R such that g ∈ rH. This implies that ρ is surjective: we can write g = rh. Assume that g = rh = r ′ h′ for r, r ′ ∈ R and h, h′ ∈ H. By Lemma 1.10, the group G is the disjoint union of cosets of H, and, by definition of R, we conclude that r = r ′ . Hence, r −1 g = h = h′ . Therefore, ρ is injective, too. The bijection

8 � 1 Algebraic structures between G/H × H and G follows, because by the definition of coset representatives there is a bijection between R and G/H. Corollary 1.12 (Lagrange). Let G be a finite group. Then |G| = [G : H] ⋅ |H|. Proof. This is a special case of Theorem 1.11. For finite groups G, Lagrange’s theorem has the following consequences: the order of a subgroup H divides the order of G. Taking H = ⟨g⟩ for an arbitrary g ∈ G, we conclude that the order of any group element g divides the order of G. If K is a subgroup of H and H is a subgroup of G, then we obtain [G : K] = [G : H][H : K]. Theorem 1.13. Let g be a group element of order d. Then g n = 1 ⇐⇒ d | n. Proof. If n = kd, then g n = (g d )k = 1k = 1. For the converse, let g n = 1 and n = kd + r with 0 ≤ r < d. Then 1 = g n = (g d )k g r = 1⋅g r = g r and therefore r = 0. So n = kd +0 = kd is a multiple of d. Corollary 1.14. Let G be a finite group with n elements. Then we have g |G| = 1 for all g ∈ G. Proof. Let d be the order of g ∈ G. Then d divides n = |G| by Lagrange’s theorem, and thus g n = 1 by Theorem 1.13. Corollary 1.15. Let G be a group with subgroups H and K such that hk = kh for all (h, k) ∈ H × K, and let φ : H × K → G be the canonical homomorphism which is induced by the inclusions H ≤ G and K ≤ G. That is, it maps (h, k) to hk ∈ G. Then φ is injective if and only if H ∩ K = {1}. In particular, if φ is surjective and H ∩ K = {1}, then φ is an isomorphism. Moreover, if H and K are finite with gcd(|H|, |K|) = 1, then H ∩ K = {1}. Proof. If 1 ≠ g ∈ H ∩ K, then φ(g, g −1 ) = 1 = φ(1, 1) and φ is not injective. For the other direction, let H ∩ K = {1}. Since hk = kh for h ∈ H, k ∈ K, the homomorphism φ : H × K → G is well defined. Assume h1 k1 = h2 k2 for h1 , h2 ∈ H and k1 , k2 ∈ K, then h2−1 h1 = k2 k1−1 ∈ H ∩ K = {1}. Hence h1 = h2 and k1 = k2 . Now, let H and K be finite with |H| = m and |K| = n and gcd(m, n) = 1. Suppose (h, k) ∈ H × K with hk ∈ H ∩ K. Then h ∈ Kk −1 = K. Hence, by Theorem 1.13, the order of h divides m and n. Since gcd(m, n) = 1, the order of h must be 1, and therefore h = 1. By symmetry, we obtain k = 1, too. This implies hk = 1 and therefore H ∩ K = {1}. Theorem 1.16. Subgroups of index 2 are normal. Proof. Let [G : H] = 2. Then there are exactly two cosets, H and G \ H. If g ∈ H, then we have H = gH = Hg. If g ∉ H, we have gH ≠ H and Hg ≠ H, and therefore G \ H = gH = Hg. Hence, H is normal. Let φ : G → K be a group homomorphism. The kernel ker(φ) and image im(φ) are defined as follows: ker(φ) = {g ∈ G | φ(g) = 1}

1.3 Basic group theory

� 9

im(φ) = φ(G) = {φ(g) ∈ K | g ∈ G} Note that ker(φ) is a normal subgroup of G, and im(φ) is a (not necessarily normal) subgroup of K. Theorem 1.17. Let G be a group, and let H be a subset of G. Then the following properties are equivalent: (a) H is a normal subgroup of G. (b) H is a subgroup such that G/H forms a group with respect to the operation g1 H ⋅g2 H = g1 g2 H, the neutral element being H. (c) H is the kernel of a group homomorphism φ : G → K. (d) H is a subgroup and for all g ∈ G we have gHg −1 ⊆ H. Proof. (a) ⇒ (b) We have g1 H g2 H = g1 (Hg2 )H = g1 (g2 H)H = g1 g2 H. This shows that the operation on G/H is well defined and associative, and that H is the neutral element. Being well defined means that the operation is independent of the chosen representatives g1 and g2 . The inverse of gH is g −1 H ∈ G/H. (b) ⇒ (c) Consider the mapping φ : G → G/H, g 󳨃→ gH. Then, φ(g1 g2 ) = g1 g2 H = g1 H ⋅g2 H = φ(g1 ) φ(g2 ). This shows that φ is a homomorphism. The kernel of φ is ker(φ) = {g ∈ G | gH = H} = H. (c) ⇒ (d) Let φ : G → K be a group homomorphism with ker(φ) = H. Then φ(1) = 1 yields 1 ∈ H. For all g1 , g2 ∈ H, we have φ(g1 g2−1 ) = φ(g1 )φ(g2 )−1 = 1 ⋅ 1 = 1, and thus g1 g2−1 ∈ ker(φ) = H. Therefore, H is a subgroup because 1 ∈ ker(φ) and g ∈ ker(φ) ⇐⇒ g −1 ∈ ker(φ). For all g ∈ G and all h ∈ H, we have φ(ghg −1 ) = φ(g)φ(h)φ(g)−1 = φ(g)φ(g)−1 = 1, and thus gHg −1 ⊆ H = ker(φ). (d) ⇒ (a) From gHg −1 ⊆ H, we obtain gH ⊆ Hg and Hg −1 ⊆ g −1 H for all g ∈ G. Since all group elements in G can be represented as inverses, the latter implies Hg ⊆ gH for all g ∈ G. Both relations together yield gH = Hg for all g ∈ G. If H is a normal subgroup, then the group of Theorem 1.17 (b) is called the quotient group of G modulo H. The mapping G → G/H, g 󳨃→ gH is a homomorphism with kernel H. Theorem 1.18 (Homomorphism theorem for groups). Let φ : G → K be a group homomorphism. Then G/ker(φ) is a group by g ker(φ) ⋅ h ker(φ) = gh ker(φ) and φ induces an isomorphism: φ : G/ker(φ) → im(φ) g ker(φ) 󳨃→ φ(g)

Proof. Let H = ker(φ). The mapping φ is well defined: if g1 H = g2 H, then g1 = g2 h for a suitable h ∈ H. This yields φ(g1 H) = φ(g1 ) = φ(g2 h) = φ(g2 )φ(h) = φ(g2 ) = φ(g2 H) since φ(h) = 1. But if φ is well defined, it is obviously a homomorphism and by construction it is surjective. It remains to show that φ is injective. Let φ(g1 H) = φ(g2 H). Then φ(g1 ) = φ(g2 )

10 � 1 Algebraic structures and φ(g1−1 g2 ) = 1. Finally, this yields g1−1 g2 ∈ H and therefore g1 H = g1 (g1−1 g2 H) = g2 H.

1.4 A glimpse into rings Rings were introduced in Definition 1.3. Throughout this section, R = (R, +, ⋅) denotes a ring with 0 as the neutral element for the addition and 1 as the neutral element for the multiplication. An element x ∈ R is called a unit if there is some y ∈ R with xy = yx = 1. Thus, the units form a group which is a submonoid of (R, ⋅, 1). It is called the group of units R∗ . A subset S ⊆ R is called a subring, if 0, 1 ∈ S and the restrictions of + and ⋅ turn S = (S, +, ⋅) into a ring. For example, ℤ is a subring of ℚ. A subring S of a field R is called a subfield if S is a field. The field ℚ of rational numbers is a subfield of the reals ℝ; and ℝ is a subfield of the complex numbers. Section 1.4.4 provides a formal definition to obtain the field ℂ of complex numbers as a quotient of the polynomial ring ℝ[X] by the ideal (X 2 + 1)ℝ[X]. Quotient rings are introduced in Section 1.4.3 together with some other basic concepts about rings.

1.4.1 Matrix rings Noncommutative rings appear most naturally when considering (n × n)-matrix rings over a commutative ring R. Prominent examples are the matrix rings Rn×n , where R is ℤ or a finite field or an infinite field like ℚ, ℝ, or ℂ. Calculation in matrix rings over a field is a cornerstone of linear algebra. Let us review some basics. Throughout Section 1.4.1, we assume that R is a (semi)ring with 0 ≠ 1 ∈ R. First, let [n] = {1, . . . , n}, then the set of matrices Rn×n is the set of mappings from [n] × [n] to R. A typical and convenient notation is to write (ai,j ) for the matrix in Rn×n , which maps the pair (i, j) to the element ai,j ∈ R. This defines a pointwise addition as done in Equation (1.1) (ai,j ) + (bi,j ) = (ai,j + bi,j )

(1.2)

However, we deviate from pointwise multiplication. The matrix multiplication is n

(ai,j ) ⋅ (bi,j ) = ( ∑ ai,k ⋅ ak,j )

(1.3)

(ai,j ) ⋅ (bi,j ) ⋅ (ci,j ) = (∑ ai,k ⋅ bk,ℓ ⋅ cℓ,j )

(1.4)

k=1

The formula

k,ℓ

1.4 A glimpse into rings



11

shows that the multiplication is associative, because possible brackets on the left disappear on the right thanks to associativity in R. Equations (1.2)–(1.4) show that Rn×n is a (semi)ring. The zero matrix is the matrix where all entries are 0. It is also simply denoted by 0, as it is neutral for the addition. For i, j ∈ [n], we denote by Ei,j the matrix such that the entry (k, ℓ) is equal to 1 for i = k and ℓ = j, and all other entries are 0. For example, if n = 2, then E1,1 = ( 01 00 ) and E1,2 = ( 00 01 ). We see that the identity matrix In = ∑ni=1 Ei,i is neutral for the multiplication, hence In is also denoted by 1. More generally, we identify r ∈ R inside Rn×n with the matrix r ⋅ In . If R is a field, then Rn×n becomes an R-vector space of dimension n2 with basis {Ei,j | 1 ≤ i, j ≤ n}. We have E1,1 ⋅ E1,2 = E1,2 but E1,2 ⋅ E1,1 = 0. Thus, for all n ≥ 2 the (semi)ring Rn×n is not commutative. We will come back to calculation in matrix rings in Section 7.8 when we give some results about nilpotent groups.

1.4.2 Polynomials Of particular interest is the ring of polynomials over a commutative ring R. It is denoted by R[X]. Formally, a polynomial is a mapping f ∈ Rℕ where almost all values f (i) ∈ R are zero. That is, there is some d ∈ ℕ such that f (i) = 0 for all i > d. The polynomial f with f (i) = 0 for all i is called the zero polynomial, and we write f = 0. The degree of a polynomial f is denoted by deg(f ). We define the degree of the zero polynomial to be −∞. That is, deg(0) = −∞. For f ≠ 0, we let deg(f ) = d if f (d) ≠ 0, but f (i) = 0 for all i > d. If deg(f ) = d and ai = f (i) for all i ∈ ℕ, then we also write f = (a0 , . . . , ad ). A polynomial f = (a0 , . . . , ad ) is typically written as a sum d

f (X) = ∑ ai X i or f (X) = ∑ ai X i i=0

i≥0

where X is a formal symbol. Polynomials of degree at most 0 are called constants; they are identified with the elements of R. Polynomials form a ring: ∑ ai X i + ∑ bi X i = ∑(ai + bi )X i

i≥0

i≥0

i≥0

∑ ai X i ⋅ ∑ bj X j = ∑ ai X i ⋅ bj X j = ∑ ( ∑ ai bj )X k

i≥0

j≥0

i, j≥0

k≥0 i+j=k

A polynomial f ∈ R[X] induces a polynomial mapping f ̃ : R → R by substituting values for the variable X, f ̃ : R → R, r 󳨃→ ∑ ai r i i≥0

12 � 1 Algebraic structures The mapping f 󳨃→ f ̃ is called the evaluation homomorphism. In rings like ℝ[X], a polynomial is usually identified with the corresponding polynomial mapping. That is, f 󳨃→ f ̃ is injective in this case. On the other hand, if R is a finite ring with 0 ≠ 1, then f 󳨃→ f ̃ cannot be injective because R[X] is infinite and R is finite. Example 1.19. Let R = ℤ/2ℤ and f (X) = X 2 + X ∈ R[X]. Then f (X) has degree 2 and it is not the zero polynomial. But if f (X) is interpreted as the mapping f ̃ : ℤ/2ℤ → ℤ/2ℤ, r 󳨃→ r 2 + r, we find that f ̃ is the zero function, because f ̃(0) = f ̃(1) = 0 for 0, 1 ∈ ℤ/2ℤ. ⬦

1.4.3 Ideals and quotient rings An additive subgroup I of (R, +) is called a two-sided ideal (or ideal for short) if R⋅I ⋅R ⊆ I. Every subring contains 1, but if an ideal I contains 1, then we have I = R. For example, ℤ is a subring of ℚ, but it is not an ideal. Actually, a field R has only two ideals, {0} and R itself. Let I be an ideal, then the set of additive cosets of I is R/I = {r + I | r ∈ R}. Since addition in R is commutative, R/I forms an Abelian group with addition: (r + I) + (s + I) = r + s + I Let us define multiplication on R/I by (r + I) ⋅ (s + I) = rs + I The following computation shows that multiplication is well defined: (r + I)(s + I) = rs + Is + rI + II ⊆ rs + I The associativity of multiplication and the distributive property for R/I now follow from the corresponding properties in R. We call R/I the quotient ring of R modulo I. The elements of R/I are called residue classes. We say r is congruent to s modulo I if r + I = s + I. Any element s ∈ r+I is called a representative of the class r+I. In computations modulo I, we can switch back and forth between representatives and classes. Let φ : R → S be a ring homomorphism. Its kernel is the ideal ker(φ) = {r ∈ R | φ(r) = 0}. The image of φ is the subring im(φ) = {φ(r) | r ∈ R} of S. We have 1 ∈ ker(φ) ⇐⇒ im(φ) = {0}. Hence, in general the ideal ker(φ) is not a subring. In analogy to Theorem 1.17, one can easily show that I is an ideal if and only if I is the kernel of a ring homomorphism φ : R → S. Example 1.20. Consider the derivative operator D : {f : ℝ → ℝ | f is differentiable} → ℝℝ , f 󳨃→ f ′

1.4 A glimpse into rings

� 13

We have (f + g)′ = f ′ + g ′ , i. e., D is a group homomorphism with respect to addition. On the other hand, ker(D) = {f | f is a constant mapping} is not an ideal. Therefore, D is not a ring homomorphism. ⬦ The following theorem is analogous to the homomorphism theorem for groups, Theorem 1.18. Its proof is straightforward from that theorem. Theorem 1.21 (Homomorphism theorem for rings). Let φ : R → S be a ring homomorphism. Then φ induces the ring isomorphism R/ker(φ) → im(φ)

r + ker(φ) 󳨃→ φ(r)

Proof. The set φ(r + ker(φ)) is equal to the singleton {φ(r)}. Therefore, the mapping r + ker(φ) 󳨃→ φ(r) is a well-defined ring homomorphism. Applying Theorem 1.18 (to the homomorphism φ from the additive group (R, +) to (S, +)) yields that φ is a bijection, and bijective ring homomorphisms are isomorphisms. Example 1.22 yields instances of Theorem 1.21. Example 1.22. Let R be a commutative ring and I ⊆ R be an ideal. In the following “equal” means “canonically isomorphic”. – For polynomials f = ∑i≥0 ai X i and g = ∑i≥0 bi X i in R[X], we define f ≡I g if ai −bi ∈ I for all i ≥ 0. The ring R[X] modulo ≡I is equal to the ring of polynomials (R/I)[X]. – For matrices (ai,j ) and (bi,j ) in Rn×n , we let (ai,j ) ≡I (bi,j ) if ai,j −bi,j ∈ I for all 1 ≤ i, j ≤ n. Then the quotient ring Rn×n modulo ≡I is equal to the matrix ring (R/I)n×n . ⬦ A consequence of the homomorphism theorem for rings is that homomorphisms between fields are injective. In particular, homomorphisms between finite fields are bijective. We show a more general statement. Corollary 1.23. Ring homomorphisms from a field into a ring with 0 ≠ 1 are injective. In particular, all homomorphisms between fields are injective. Proof. Let φ be a homomorphism and suppose that φ(z) = 0 for z ≠ 0. Then 1 = φ(1) = φ(z−1 z) = φ(z−1 ) ⋅ φ(z) = φ(z−1 ) ⋅ 0 = 0, contradicting the assumption. Thus, ker(φ) = {0}, and φ is injective by Theorem 1.21.

1.4.4 Complex numbers First definition. A standard way to define the field ℂ of complex numbers begins with the 2-dimensional ℝ-vector space ℝ×ℝ with the component-wise addition and the scalar multiplication r ⋅ (x, y) = (rx, ry) for reals r, x, y ∈ ℝ. Identifying ℝ with ℝ ⋅ (1, 0) and letting i = (0, 1), we can write every complex number uniquely as a + bi with a, b ∈ ℝ.

14 � 1 Algebraic structures Using this notation the multiplication in ℂ is defined by (a+bi)(c+di) = ac−bd+(ad+bc)i. For z = (a + bi), we let N(z) = (a + bi)(a − bi) = a2 + b2 . Then N(z) is never negative and positive whenever z ≠ 0. Therefore, ℂ is a field since (a + bi)−1 = (a − bi)/(a2 + b2 ). The nonnegative real N(z) is also called the norm of z; and its square root √N(z) is the absolute value, denoted by |z|. Taking the square root has the advantage that the meaning of |r| remains the same when a real r ∈ ℝ is viewed as the complex number r = r + 0 ⋅ i ∈ ℂ. A direct verification shows what we expect in case of absolute values: |z + z′ | ≤ |z| + |z′ | and |z ⋅ z′ | = |z| ⋅ |z′ |. Moreover, i2 = −1, and hence, i4 = 1. Since the order of i is 4, it is a so-called primitive fourth root of unity. Second definition. We consider the quotient ring of polynomials ℝ[X] modulo the ideal generated by X 2 + 1. To make the standard notation z = a + bi explicitly visible, we replace the formal symbol for the indeterminate X by another formal symbol i. Thus, we denote the ring of polynomials over ℝ as ℝ[i] and the ideal is generated by i2 + 1. Then we define ℂ as quotient ring modulo (i2 + 1)ℝ[i]: ℂ = ℝ[i]/(i2 + 1)ℝ[i]

(1.5)

A straightforward calculation shows that the quotient ring ℂ is a field. As said in Section 1.1, the polynomial i2 +1 has a zero: it is the class of i, denoted simply as the imaginary number i. As expected, the two different definitions are equivalent. The construction of the field ℂ in Equation (1.5) is a special case of a quadratic field extension. Example 1.24 (Quadratic field extensions). Let k be any field and a ∈ k an element which is not a square in k, i. e., there is no b ∈ k with a = b2 . Then the polynomial X 2 − a has no zero, and we define k[√a] = k[X]/(X 2 − a)k[X]

(1.6)

In the special case of k = ℚ and a = 2, we can interpret the result in two different ways. We can view k[√2] as a quotient ring of ℚ[X] or as the smallest subfield of ℝ containing √2. Likewise for a = −2, we can view ℚ[√−2] as a quotient ring of ℚ[X], or as the smallest subfield of ℂ containing √−2. Note that ℚ[√−2] does not contain i. The smallest subfield of ℂ containing i and √±2 is a quadratic extension of ℚ[i]: it can be realized either as (ℚ[i])[√2] or (ℚ[√2])[i]. ⬦

2 Elementary number theory A basic fact in number theory is Theorem 2.2 stating that ℤ/nℤ is a field if and only if n is a prime number. It is the first theorem in this chapter. Our proof differs slightly from the usual approach; its structure is a nested induction, which is using an outer induction on primes and inside its proof an induction on natural numbers. Our proof also shows that for two different primes p and q there are a, b ∈ ℤ with 1 = ap + bq, which in turn is a special case of Bézout’s identity, stated in Theorem 2.4 and further analyzed in Section 2.2. The uniqueness of the prime factorization is shown in Section 2.3.

2.1 Modular arithmetic The integers ℤ form an additive group with addition. Together with the multiplication ℤ becomes a commutative ring. If n ∈ ℤ, then nℤ is an ideal because nℤ is additive subgroup of ℤ and a ⋅ nℤ ⊆ nℤ for every a in the ambient ring ℤ. We also note that nℤ = (−n)ℤ. Section 1.4.3 developed the general facts how to compute in quotient ring R/I if R is a (not necessarily commutative) ring and I ⊆ R is a (two-sided) ideal. For readers who have skipped that part, we repeat the basics in the specific setting of modular arithmetic over integers. The residue class of k ∈ ℤ (with respect to n) is the subset k + nℤ ⊆ ℤ. We have ℓ ∈ k + nℤ if and only if k + nℤ = ℓ + nℤ. Therefore, residue classes are either identical or disjoint. Classes can be added and multiplied like integers: (k + nℤ) + (ℓ + nℤ) = k + ℓ + nℤ (k + nℤ) ⋅ (ℓ + nℤ) = kℓ + nℤ

The results of the operations are well defined: they do not depend on the representatives, as we can replace k by k ′ ∈ k + nℤ and ℓ by ℓ′ ∈ ℓ + nℤ; then k + ℓ + nℤ = k ′ + ℓ′ + nℤ and kℓ + nℤ = k ′ ℓ′ + nℤ. Not surprisingly, the direct verification that these operations are well defined mimics the corresponding calculations for quotient rings with respect to two-sided ideals. Example 2.1. We want to compute a561 mod 12 for a = 5, 7, and 11. To compute the natural number a561 first would be rather cumbersome. However, we may note that 52 = 25 ≡ 1 ≡ 49 = 72 mod 12, and 11 ≡ −1 mod 12. Hence, we see that an ≡ a mod 12 for a = 5, 7, 11 and every odd number n. ⬦ A specific property of ℤ is that it is a principal ideal domain which refers to a ring R where every ideal is of the form rR for some r ∈ R. To see this property, let I ⊆ ℤ be an ideal such that {0} ≠ I ≠ ℤ. Then I contains a least positive integer n with n ≥ 2 because I ≠ ℤ. We claim that I = nℤ. In order to prove the claim, consider any k ∈ I. This implies https://doi.org/10.1515/9783111062556-002

16 � 2 Elementary number theory −k ∈ I, too. Hence, we may assume that we have k ∈ ℕ. There is some maximal m ∈ ℕ such that mn ≤ k. This implies k = mn + r with 0 ≤ r < n. Therefore r ∈ I. However, by the choice of n, we must have r = 0. This yields k ∈ nℤ, and hence the claim. As a consequence, all quotient rings of ℤ are of the form ℤ/nℤ with n ∈ ℕ. For k ∈ ℓ + nℤ, we write k ≡ ℓ mod n, and we say that k and ℓ are congruent modulo n. The least r ∈ ℕ such that k = mn + r is the remainder of the division of k by n. The class k + nℤ is uniquely determined by the remainder k mod n. Hence, it is possible to identify the set ℤ/nℤ with the set of representatives {0, . . . , n − 1}. Sometimes, other choices like {1 − ⌈n/2⌉, . . . , ⌊n/2⌋} are also convenient. The nontrivial implication in the following theorem tells us that ℤ/pℤ is a field if p is prime. This is a consequence of Bézout’s lemma (Theorem 2.4). Therefore, it is usually proven by showing Bézout’s lemma first. Our proof is less standard, but in some sense minimalistic and more direct. Unlike other proofs, we do not rely on the (extended) Euclidean algorithm. Theorem 2.2. Let n ∈ ℕ, then ℤ/nℤ is a field if and only if n is prime. Proof. For n = 0 or n = 1, the ring ℤ/nℤ is not a field. If n = rs is a product with 1 < r, s < n, then r and s are zero-divisors and ℤ/nℤ is not a field. Thus, we let p = n be a prime number and denote R = ℤ/pℤ. We will show that the ring R is a field. This is true for p = 2. Thus, for induction, we may assume p ≥ 3 and that ℤ/qℤ is a field whenever q is a prime with q < p. The main step in proving that R is field is to show that the multiplication z mod p 󳨃→ rz mod p is injective in R for all 1 ≤ r < p. We prove this claim by induction on r. The claim is true for r = 1. Thus, we let r ≥ 2, and we write r = r ′ q where q is a prime with 2 ≤ q < p. Suppose rz ≡ rz′ mod p. We have to show z ≡ z′ mod p. Since r = r ′ q, we have r ′ qz ≡ r ′ qz′ mod p with r ′ < r. Thus, by induction on r, we have qz ≡ qz′ mod p. Since q is a prime less than p, we know by the induction on the prime p that ℤ/qℤ is a field. Since p ∉ qℤ, the congruence class p mod q has a multiplicative inverse in ℤ/qℤ because ℤ/qℤ is a field. Thus, we find some 0 < a < q such that ap ≡ 1 mod q. This means ap − 1 ∈ qℤ; and therefore there is some b ∈ ℤ such that 1 = ap + bq ∈ ℤ. We conclude 1 ≡ bq mod p. Hence, qz = qz′ ∈ R implies z ≡ bqz ≡ bqz′ ≡ z′ mod p. This proves the claim: Multiplication with r is injective in the ring R = ℤ/pℤ. Since z 󳨃→ rz is injective in R and R is finite, multiplication by r is surjective. Therefore, there is some s ∈ R with rs = 1. This shows that every r ∈ R \ {0} has an inverse. Thus, R = ℤ/pℤ is a field. In computations modulo a given number n, one can arbitrarily switch between working with integers and working with residue classes. The result will always be correct “modulo n”. However, we have to be careful if we try to divide by divisors of n. As in any ring, (ℤ/nℤ)∗ denotes the group of units of the ring ℤ/nℤ, i. e., the residue classes having a multiplicative inverse. Its cardinality |(ℤ/nℤ)∗ | is denoted by φ(n), By Theorem 2.2, we have φ(n) = n − 1 if and only if n is prime.

2.2 Euclidean algorithm



17

2.2 Euclidean algorithm An integer k divides ℓ, denoted by k | ℓ, if there is an m ∈ ℤ with km = ℓ. The greatest common divisor of two integer numbers k and ℓ is denoted gcd(k, ℓ). The gcd is the largest natural number dividing both k and ℓ. We define the greatest common divisor of k and 0 as |k|. Two numbers are called coprime (or relatively prime) if their greatest common divisor is 1. Similarly, the least common multiple is denoted by lcm(k, ℓ) and we have |k ⋅ ℓ| = gcd(k, ℓ) ⋅ lcm(k, ℓ). The Euclidean algorithm (Euclid from Alexandria, working around 300 BC) is an efficient procedure for the computation of the greatest common divisor. As we have gcd(k, ℓ) = gcd(−k, ℓ) = gcd(ℓ, k), it is sufficient to consider k, ℓ ∈ ℕ. We let 0 < k ≤ ℓ and write ℓ in the form ℓ = qk + r, where 0 ≤ r < k is the residue. The residue r is also denoted “ℓ mod k”. We will take a closer look at this in the next section. Each number that divides k and the residue r also divides the sum ℓ = qk + r. Each number that divides k and ℓ also divides the difference r = ℓ − qk. This leads to the following recursive version of the Euclidean algorithm as shown in Example 2.3. Example 2.3. We want to compute gcd(21, 59). Below, on the left-hand side we give a way to compute the gcd using natural numbers only. The right-hand side shows a shorter computation using negative numbers, too: 59 = 2 ⋅ 21 + 17 21 = 1 ⋅ 17 + 4

17 = 4 ⋅ 4 + 1

59 = 3 ⋅ 21 − 4 21 = 5 ⋅ 4 + 1

The result is gcd(21, 59) = 1.



The following theorem is frequently accredited to Étienne Bézout (1730–1783), who has shown a corresponding statement for polynomials. However, Theorem 2.4 was found earlier by Claude Gaspard Bachet de Méziriac (1581–1638). Theorem 2.4 (Bézout’s identity). Let k, ℓ ∈ ℤ. Then there are a, b ∈ ℤ satisfying gcd(k, ℓ) = ak + bℓ Proof. We may assume ℓ > k > 0, the other cases are obvious or can be reduced to this case. Let r0 = ℓ and r1 = k. The Euclidian algorithm successively computes residues r0 > r1 > r2 > ⋅ ⋅ ⋅ > rn ≥ rn+1 = 0, satisfying ri−1 = qi ri + ri+1 for suitable qi ∈ ℕ. Thus, we obtain gcd(k, ℓ) = gcd(ri−1 , ri ) = gcd(rn , 0) = rn . Now we show that for all i ∈ {0, . . . , n} there exist integers ai and bi such that ai ri + bi ri−1 = rn . For i = n, we have an = 1 and bn = 0. Let now i < n and let ai+1 and bi+1 be already defined,

18 � 2 Elementary number theory i. e., ai+1 ri+1 + bi+1 ri = rn . Using ri+1 = ri−1 − qi ri , we obtain (bi+1 − ai+1 qi )ri + ai+1 ri−1 = rn . Therefore ai = bi+1 − ai+1 qi and bi = ai+1 have the desired property. The above proof leads to the following procedure. The extended Euclidian algorithm in addition to gcd(k, ℓ) also computes numbers a and b such that ak + bℓ = gcd(k, ℓ). /∗ Preconditions: k ≥ 0, ℓ ≥ 0 ∗/ /∗ Compute (a, b, t) with ak + bℓ = t = gcd(k, ℓ) ∗/ function ext-gcd(k, ℓ) begin if k = 0 then return (0, 1, ℓ) else (a, b, t) := ext-gcd(ℓ mod k, k); return (b − a ⋅ ⌊ kℓ ⌋, a, t) fi end Example 2.5. We continue Example 2.3. Inserting the successive equations in reverse order yields: 1 = 17 − 4 ⋅ 4 = 17 − 4 ⋅ (21 − 1 ⋅ 17) = −4 ⋅ 21 + 5 ⋅ 17 = −4 ⋅ 21 + 5 ⋅ (59 − 2 ⋅ 21) = −14 ⋅ 21 + 5 ⋅ 59

The computed representation of 1 as the linear combination −14 ⋅ 21 + 5 ⋅ 59 is not the only possibility. For example, we obtain another representation as follows: 1 = −14 ⋅ 21 + 59 ⋅ 21 − 21 ⋅ 59 + 5 ⋅ 59 = 45 ⋅ 21 − 16 ⋅ 59. ⬦ Corollary 2.6. Let n ∈ ℤ and let k, ℓ ∈ ℤ be coprime to n, i. e., gcd(n, k) = gcd(n, ℓ) = 1. Then also the product kℓ is coprime to n. Proof. Write 1 = an + bk = cn + dℓ for suitable a, b, c, d ∈ ℤ. Multiplication yields 1 = (an + bk)(cn + dℓ) = (anc + bkc + adℓ)n + bd(kℓ). Hence, any common divisor of n and kℓ also divides 1. From this, we obtain gcd(n, kℓ) = 1. Corollary 2.7. Let n ∈ ℕ. Then (ℤ/nℤ)∗ = {k + nℤ | gcd(k, n) = 1}. Proof. We have k ∈ (ℤ/nℤ)∗ if and only if there is a representation of 1 of the form 1 = kℓ + mn. By Theorem 2.4, this is the case if and only if gcd(k, n) = 1.

2.3 Fundamental theorem of arithmetic The fundamental theorem of arithmetic states that every positive natural number has a unique prime factorization.

2.4 Bits and bytes

� 19

Theorem 2.8. Let ℙ = {2, 3, 5, 7, 11, . . .} be the set of prime numbers and n ∈ ℕ with n ≥ 1. Then n has a unique product representation n = ∏ pnp p∈ℙ

Here, for every prime number p the exponent np is different from zero if and only if p divides n. Proof. The number 1 allows the prime factorization with np = 0 for all p ∈ ℙ, and this is also the only possible representation. If n is greater than 1, then a prime number p divides n. Thus, we get a representation of the desired form by an inductive argument, using the prime factorization of n/p. Moreover, for every prime p dividing n there is a prime factorization with np ≥ 1. If n = ∏p∈ℙ pnp is any prime factorization, then we have np ≠ 0 if and only if p divides n. This follows from the fact that a product of natural numbers which are not divisible by p cannot be divisible by p, which in turn is a direct consequence of Theorem 2.2, because the ring ℤ/pℤ is a field, and a product of nonzero elements in a field cannot evaluate to zero. (Not surprisingly, we can also apply Corollary 2.6 to see this.) It remains to show that the prime factorization of n is unique. ′ Consider n = ∏p∈ℙ pnp = ∏p∈ℙ pnp . We have to show that np = np′ for all primes p. Note that n = 1 if and only if np = np′ = 0 for all p. Thus, by symmetry, we may assume nq ≥ 1 for some prime q. This implies that q divides n, and therefore, as observed above, we have nq′ ≥ 1 as well. We are done by induction on n since we can write n/q = qnq −1 ⋅ ∏ pnp = qnq −1 ⋅ ∏ pnp ′

p∈ℙ\{q}



p∈ℙ\{q}

2.4 Bits and bytes One of the many applications of modular arithmetic (or residue class arithmetic) can be found in the internal representation of integer numbers within computers. We assume that k bits are available for representing an integer number, and that multiplication by minus one should mostly be possible. Since one bit has to be used as a sign bit, and observing that −0 = +0, obviously we cannot represent exactly the same number of positive and negative integers. Therefore, one possible maximum range of representable integers is −2k−1 , . . . , −1, ⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟ 0, 1, . . . , 2k−1 − 1 ⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟ 2k−1 numbers

2k−1 numbers

In many cases it turns out to be beneficial to use residue class arithmetic modulo 2k instead of explicitly using a sign bit. If we perform computations in the ring ℤ/2k ℤ, the above given number range constitutes a system of representatives. In a place value

20 � 2 Elementary number theory system to the basis 2, the first (most significant) bit carries the information about the sign of the number. This bit has the value 1 if and only if the represented number is negative. Of course, there must be a special handling for the case that a performed operation leads to leaving the representable number range (a so-called “Overflow”). Suppose 8 bits are available. Then we can represent the range from −128 to +127. For k = 8, we obtain: 01111111 10000000 ⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟ 11111111 k bits

= = =

2k−1 − 1 2k−1 2k − 1

= = =

127 128 255

≡ ≡ ≡

127 −128 −1

(mod 28 ) (mod 28 ) (mod 28 )

One advantage of this arithmetic is that subtraction is practically as easy as addition, because for multiplication by −1 it is essentially sufficient to take the two’s complement. Let x = x1 ⋅ ⋅ ⋅ xk with xi ∈ {0, 1} and let x = x1 ⋅ ⋅ ⋅ xk where we define 0 = 1 and 1 = 0. Then we obtain 1 ⋅ ⋅ ⋅ 1 = 2k − 1 x + x = ⏟⏟⏟⏟⏟⏟⏟⏟⏟ k bits

All computations then are only valid modulo 2k , however, if the final result lies within the valid number range, then it is always correct. In particular, we can conclude −x = x + 1. Let, e. g., k = 8 and x = 01101011 = +107. We compute −x = −107 by: x x+1

= =

10010100 10010101 = 149 ≡ −107 mod 256

Starting with a negative number, e. g., x = 10010000 = −112 ≡ 144 mod 256, we can compute −x = +112 as follows: x x+1

= =

01101111 01110000 = 112

2.5 Error detection for article numbers Further applications of modular arithmetic can be found in the EAN (European Article Number) and the ISBN (International Standard Book Number). The function of the EAN is the identification of commercial items, whereas the ISBN serves the same purpose for books. In both systems, the last digit is a check digit, computed from the weighted sum of the other digits. A correct EAN is a 13-digit decimal number x13 x12 ⋅ ⋅ ⋅ x1 with check digit x1 and the property x13 + 3 ⋅ x12 + x11 + 3 ⋅ x10 + ⋅ ⋅ ⋅ + x1 ≡ 0

mod 10

2.6 Chinese remaindering

� 21

Here, 1 and 3 appear alternating as weights of the digits. If the weighted checksum results in a value different from 0 mod 10, this corresponds to an error message. What kinds of errors can we detect with this procedure? Deviation of one digit by a leads to a difference of a or 3a in the checksum, depending on the weight of the concerned digit. Since we have gcd(3, 10) = 1 we will get an error message in this case, if a ≢ 0 mod 10. These so-called one-bit errors could already have been detected, if all digits had the weight 1. But, having different weights for adjacent digits makes it possible to recognize if two digits xi+1 and xi in an EAN have been permuted, provided that xi+1 ≢ xi mod 5. However, if, e. g., two adjacent digits of values 7 and 2 are permuted, this will not be recognized. In case of the 10-digit ISBN, the check digit is computed modulo 11, which requires 10 possible weights 10, 9, 8, . . . , 1 to be available, all of which are coprime to 11. A correct ISBN x10 x9 ⋅ ⋅ ⋅ x1 with check digit x1 satisfies 10 ⋅ x10 + 9 ⋅ x9 + 8 ⋅ x8 + ⋅ ⋅ ⋅ + 1 ⋅ x1 ≡ 0

mod 11

Here, the difference between two adjacent weights is always 1, so that in addition to one-bit errors, any permutation of adjacent digits will be recognized. However, when computing the ISBN check digit a symbol X for the value 10 is needed, because the 10 decimal digits cannot represent the 11 different possible values of computation modulo 11.

2.6 Chinese remaindering Recall that the direct product R1 × R2 of rings R1 and R2 bears the structure of a ring by componentwise addition and multiplication. Moreover, there is a canonical ring homomorphism π : ℤ → R1 × R2 defined by π(1) = (1, 1). By Theorem 1.21, we know that π induces an isomorphism between ℤ/nℤ and π(ℤ) where nℤ is the ideal π −1 (0, 0). Chinese remaindering yields a natural condition when the induced homomorphism is an isomorphism in the setting of modular arithmetic. Theorem 2.9 (Chinese remainder theorem). Let k, ℓ ∈ ℤ be coprime. Then we obtain a canonical ring isomorphism: π : ℤ/kℓℤ → ℤ/kℤ × ℤ/ℓℤ x + kℓℤ 󳨃→ (x + kℤ, x + ℓℤ) Proof. Consider the canonical ring homomorphism π : ℤ → ℤ/kℤ × ℤ/ℓℤ and (x + kℤ, y + ℓℤ). Since gcd(k, ℓ) = 1, there are integers a, b ∈ ℤ with ak + bℓ = 1. It follows that bℓ ≡ 1 mod k and ak ≡ 1 mod ℓ. For any x, y ∈ ℤ, the sum yak+xbℓ has the following properties:

22 � 2 Elementary number theory yak + xbℓ ≡ 0 + x ⋅ 1 ≡ x yak + xbℓ ≡ y ⋅ 1 + 0 ≡ y

mod k mod ℓ

It follows that π(yak + xbℓ) = (x + kℤ, y + ℓℤ). Thus, we showed that π is surjective. Moreover, we have π(x ′ ) = π(x) for all x ′ ∈ x + kℓℤ, so (x mod kℓ) 󳨃→ (x mod k, x mod ℓ) is well defined. Hence π induces a surjection from ℤ/kℓℤ to ℤ/kℤ × ℤ/ℓℤ. Finally, we realize that there are exactly kℓ elements both in ℤ/kℓℤ and in ℤ/kℤ × ℤ/ℓℤ. So the induced homomorphism is bijective. In fact, the converse of Theorem 2.9 applies, too. If ℤ/kℓℤ and ℤ/kℤ × ℤ/ℓℤ are isomorphic, then gcd(k, ℓ) = 1. To see this, observe (lcm(k, ℓ)x, lcm(k, ℓ)y) = (0, 0) for all (x, y) ∈ ℤ/kℤ × ℤ/ℓℤ. So if ℤ/kℓℤ embeds into ℤ/kℤ × ℤ/ℓℤ, then lcm(k, ℓ) ≥ kℓ; and thus, gcd(k, ℓ) = 1. The extended Euclidean algorithm from Section 2.2 provides an effective procedure to compute, for a given pair (y, z) ∈ ℤ/kℤ × ℤ/ℓℤ, a number x ∈ ℤ with π(x) = (y, z). Example 2.10. Let k = 5 and ℓ = 7. We seek a number z with π(z) = (3 + 5ℤ, 4 + 7ℤ), i. e., we are looking for a z satisfying z ≡ 3 mod 5 and z ≡ 4 mod 7. Using the extended Euclidean algorithm, we obtain −4 × 5 + 3 × 7 = 1. As in the proof of Theorem 2.9, we let z = 4 ⋅ (−4) ⋅ 5 + 3 ⋅ 3 ⋅ 7 = −17 and, indeed, we obtain −17 ≡ 3 mod 5 and −17 ≡ 4 mod 7. Another solution is found as −17 + 5 ⋅ 7 = 18. Observe that 18 ≡ 3 mod 5 and 18 ≡ 4 mod 7 are also valid. So, 18 is the (uniquely defined) solution of π(z) = (3 + 5ℤ, 4 + 7ℤ) in the range {0, . . . , 5 ⋅ 7 − 1}. ⬦ Example 2.11. The rings ℤ/35ℤ and ℤ/5ℤ×ℤ/7ℤ are isomorphic. In the following table, for each element of ℤ/35ℤ, we give the pair in ℤ/5ℤ × ℤ/7ℤ to which it is mapped. 0 (0,0)

1 (1,1)

2 (2,2)

3 (3,3)

4 (4,4)

5 (0,5)

6 (1,6)

7 (2,0)

8 (3,1)

9 (4,2)

10 (0,3)

11 (1,4)

12 (2,5)

13 (3,6)

14 (4,0)

15 (0,1)

16 (1,2)

17 (2,3)

18 (3,4)

19 (4,5)

20 (0,6)

21 (1,0)

22 (2,1)

23 (3,2)

24 (4,3)

25 (0,4)

26 (1,5)

27 (2,6)

28 (3,0)

29 (4,1)

30 (0,2)

31 (1,3)

32 (2,4)

33 (3,5)

34 (4,6)

⬦ Chinese remaindering, Theorem 2.9, has many important consequences since invertible elements of ℤ/kℤ × ℤ/ℓℤ are those where both components are invertible.

2.6 Chinese remaindering

� 23

Corollary 2.12. Let k, ℓ ∈ ℤ be coprime. Then (ℤ/kℓℤ)∗ → (ℤ/kℤ)∗ × (ℤ/ℓℤ)∗ x + kℓℤ 󳨃→ (x + kℤ, x + ℓℤ)

is a group isomorphism with respect to multiplication of the invertible elements. A typical application of the Chinese remainder theorem is the fact that for coprime numbers k and ℓ the congruence x ≡ y mod kℓ holds if and only if the two congruences x ≡ y mod k and x ≡ y mod ℓ are satisfied. By Corollary 2.12, we can further see that x modulo kℓ is invertible if and only if x is invertible modulo k and modulo ℓ. If a product of more than two coprime numbers is given, the Chinese remaindering, Theorem 2.9, can also be applied several times. Corollary 2.13. Let m1 , . . . , mn ∈ ℤ be pairwise coprime and let m = m1 ⋅ ⋅ ⋅ mn . The following mapping is a ring isomorphism: ℤ/mℤ → ℤ/m1 ℤ × ⋅ ⋅ ⋅ × ℤ/mn ℤ

x + mℤ 󳨃→ (x + m1 ℤ, . . . , x + mn ℤ) An interpretation of Corollary 2.13 as solving multiple congruences yields the following version of the Chinese remaindering which was found by Sun Zi in the third century. However, his result was published only much later in 1247 by Qin Jiushao. Corollary 2.14. Let m1 , . . . , mn ∈ ℤ be pairwise coprime and let m = m1 ⋅ ⋅ ⋅ mn . For all x1 , . . . , xn ∈ ℤ, there is exactly one x ∈ {0, . . . , m − 1} which simultaneously satisfies the following n congruences: x ≡ x1 .. .

x ≡ xn

mod m1 mod mn

The set of solutions for this system of congruences is x + mℤ. There are several simple proofs for the fact that there are infinitely many prime numbers. One of them can be derived from Corollary 2.14. Corollary 2.15. There are infinitely many prime numbers. Proof. Suppose that there were only finitely many primes. Then we could find a number n satisfying the congruence n ≡ p−1 mod p for every prime p. This number is greater than 1 and it is not divisible by any prime number. By Theorem 2.8, we know that this is not possible. The proof of Corollary 2.15 has some quantitative consequence: let n ≥ 3 and m be the product of the first n − 1 primes, then the nth prime is smaller than m.

24 � 2 Elementary number theory

2.7 Fermat’s primality test A frequently recurring topic is Fermat’s little theorem, named after Pierre de Fermat (1607–1665). This fundamental theorem in elementary number theory is important for many applications of elementary number theory, including public-key cryptography. Fermat worked as a lawyer and later on as judge in Toulouse. He gained mathematical influence by correspondence with important mathematicians of his time. Legendary is his note that he has a “truly wonderful proof” of unsolvability of the Diophantine equations an + bn = cn with integers a, b, c ≠ 0 and n > 2. But the margin was “too narrow to contain” the proof. This claim went down in mathematical history as Fermat’s last theorem and up to the early 1990s it was one of the most famous number-theoretic conjectures. Due to the simple formulation of the problem, various hobby mathematicians tried to work on the solution and did not decline from presenting wrong solutions. Only in the year 1993, Fermat’s last theorem was proven by Andrew Wiles (born 1953). However, his first proof contained a gap, which was filled in 1995 by a joint work with Richard Taylor (born 1962). As the reader may expect (or suspect?), we treat Fermat’s little theorem, only. There is no original proof available, which can be credited to Fermat. But the reasoning for this theorem is sufficiently simple that there is hardly any doubt that Fermat knew a correct proof. Theorem 2.16 (Fermat’s little theorem). Let p be a prime and a ∈ ℤ. Then:

a

ap ≡ a

mod p

≡1

mod p

p−1

if gcd(a, p) = 1

Actually, we give three different and easy proofs of Fermat’s little theorem. A fourth proof is given later in Example 7.11. All proofs presented here are simple, but they use different concepts. Proof. As a preamble, we observe that if gcd(a, p) = 1, then both statements of the theorem are equivalent. This follows because gcd(a, p) = 1 implies that a has a multiplicative inverse b in (ℤ/pℤ)∗ . The remaining case is that p divides a. Then both a mod p and ap mod p are equal to zero. Group-theoretical proof. The set (ℤ/pℤ)∗ is a group with p − 1 elements. Hence, ap−1 = 1 for all a ∈ (ℤ/pℤ)∗ by Corollary 1.14. Proof using a bijection. Let gcd(a, p) = 1. By Corollary 2.7, we have a ∈ (ℤ/pℤ)∗ , and (ℤ/pℤ)∗ is a group. The multiplication by a group element is a bijection in every group (ℤ/pℤ)∗ . Therefore, (p − 1)! =

∏i ≡

i∈{1,...,p−1}

∏ i ≡ (p − 1)! ⋅ ap−1 mod p

i∈{a,...,a(p−1)}

We are done because gcd((p − 1)!, p) = 1, and hence we can divide by (p − 1)!.

2.8 Fast exponentiation

� 25

Inductive proof. Without restriction, let a ∈ ℕ. The first statement holds for a = 0. Now, let a + 1 ≥ 1. By induction, we have ap ≡ a mod p. The binomial theorem (see Theorem 5.3) yields p

p (a + 1)p ≡ ∑ ( )ai ≡ ap + 1 ≡ a + 1 mod p i i=0

(2.1)

In (2.1) we use that (pi) ≡ 0 mod p for 1 ≤ i ≤ p − 1 because p divides the nominator, but not the denominator of (pi) =

p! . i!(p−i)!

An idea for a simple primality test derived from Theorem 2.16 leads to a procedure called the Fermat test. This procedure works as follows: (a) Choose a ∈ {1, . . . , n − 1} at random. (b) Compute an−1 mod n. (c) If an−1 ≢ 1 mod n, then n certainly is not a prime number, otherwise it possibly is. This test, explicitly or implicitly, is the basis of almost all primality tests used in practice. In technical applications, a and n usually are binary numbers with several hundred or thousand digits. It is impossible to perform 21000 arithmetic operations in this universe’s time span. Consequently, for realistic applications of the Fermat test, we need a fast algorithm for exponentiation. Example 2.17. We want to determine the value z = 142222 mod 77 without using a computer. From the prime decomposition 77 = 7 ⋅ 11, we know that it is sufficient to compute z mod 7 and z mod 11. Clearly, z mod 7 = 0. By Fermat’s little theorem, Theorem 2.16, we know that 1410 ≡ 310 ≡ 1 mod 11. This implies z ≡ 142 ≡ 32 mod 11. Now, using the Chinese remainder theorem, we have to find the uniquely determined number in the range {0, . . . , 76}, which is congruent to 9 modulo 11 and at the same time divisible by 7. This is easily seen to be the number 42. ⬦

2.8 Fast exponentiation The first conveyed proof of Fermat’s little theorem is due to Leonhard Euler (1707–1783). Euler was extremely proliferous and throughout his life has enriched mathematics with 5 a lot of fundamental insights. In the year 1732 Euler found out that 22 + 1 = 4294967297 n is not a prime, thus refuting Fermat’s conjecture that all numbers of the form 22 + 1 are prime numbers. The first five numbers in this sequence are the primes 3, 5, 17, 257, n and 65537. Primes of the form 22 + 1 are called Fermat prime numbers. However, apart from the five mentioned numbers, no further Fermat prime number is known; and from today’s point of view it seems rather plausible that there are no more such primes. More5 over, Euler found the number 641 as a factor of 22 + 1. In fact, we have 34294967296 mod 4294967297 = 3029026160

26 � 2 Elementary number theory meaning that the sixth Fermat number already fails the Fermat test for a = 3. Would Euler in his time (without pocket calculator or computer) have been able to compute the value 34294967296 mod 4294967297 at all? Certainly, it was impossible to multiply 3 with itself for 4294967296 times and divide the resulting number by 4294967297. There are at least two reasons. On the one hand, the result of these multiplications has far more digits than a person could ever be able to write down, and, on the other hand, Euler simply could not have performed 4294967296 computation steps in his life time: 4294967296 seconds are more than 136 years. Fortunately, fast exponentiation solves both problems. First, too large numbers as intermediate results are prevented by computing the results of all multiplications modulo 4294967297, thus always dealing with numbers less than 4294967297 (or products of two such numbers). The second problem is solved as follows: To determine 34294967296 , it essentially suffices to successively compute squares 25 = 32 times, starting with the value 3. The computational effort needed could already have been possible in 1732 for Euler. But we do not know, if he actually proceeded this way. This remains a speculation. Now, let us examine the problem more generally. We want to compute ab mod n with a, b, n ∈ ℕ, assuming that these numbers have hundreds of digits. Consider the following program: /∗ Preconditions: a, b, n ∈ ℕ ∗/ /∗ Compute ab mod n ∗/ function modexp(a, b, n) begin e := 1; while b > 0 do if b odd then e := e ⋅ a mod n fi; a := a2 mod n; b := ⌊ b2 ⌋ od; return e end Since b is halved in each run of the while loop, this loop will be executed at most as many times as b has binary digits, i. e., ⌊log2 b⌋ + 1 times. Furthermore, in each run of the loop, at most two (modular) multiplications are performed. One extreme case here occurs, if we have a power of two in the exponent. Then, except for the last run of the loop, only one multiplication per run is executed. The other extreme case is given by numbers of the form 2q − 1. The binary representation of such numbers consists of all ones. Thus, in every run two multiplications have to be performed. In general, for b > 0 with k ones and m zeros in the binary representation of b, we have to perform exactly 2k+m multiplications. To be precise, the procedure performs (k+m) squaring operations and k multiplications. Squaring is a special form of multiplication. On the other hand, since

2.9 RSA encryption

ab =

� 27

(a + b)2 − (a − b)2 4

squaring cannot be significantly easier than multiplication. We note that if a, b, n, k ∈ ℕ are natural numbers such that a, b, n ≤ 2k , the value ab mod n can be determined in time polynomial in k.

2.9 RSA encryption Probably the most famous encryption method using public keys is RSA by Ronald Linn Rivest (born 1947), Adi Shamir (born 1952), and Leonard Adleman (born 1945). With a little knowledge of modular arithmetic, the method is easy to describe, and it is easy to prove its correctness. We only need Fermat’s little theorem and the Chinese remainder theorem. For the algorithmic realization, we need reliable primality tests, fast exponentiation, and the extended Euclidean algorithm. In the following protocol, a person A, named “Alice”, wants to receive some information from a person B, named “Bob”. The information shall be sent over a public channel and yet must remain occult. The method is asymmetric in the sense that only messages from Bob to Alice are encrypted. Moreover, it also is asymmetric in the sense that Bob’s resources might be more restricted than Alice’s. This is quite realistic if messages are to be sent from a mobile station, and even the energy resources may be limited. The RSA method: (a) Alice chooses prime numbers p, q satisfying 3 < p < q. (b) She computes n = pq and φ(n) = (p − 1)(q − 1). (c) She chooses an exponent e > 1 such that gcd(e, φ(n)) = 1. (d) She determines s such that es ≡ 1 mod φ(n). (e) She publishes (n, e). All other parameters remain her secret, in particular, she must not pass on the “secret” s. (f) Bob encrypts a message 0 ≤ x ≤ n − 1 by computing y = x e mod n and sends y to Alice. (g) Alice decodes y by computing ys mod n. Let us consider the following example. Alice chooses p = 5 and q = 11. Thus, n = 55 and φ(n) = 4 ⋅ 10 = 40. For the exponent e, we choose the value 3. Then gcd(3, 40) = 1. To determine s, Alice applies the extended Euclidean algorithm on e = 3 and φ(n) = 40, the result is s = 27, and with this value of s, we indeed have 3 ⋅ s ≡ 1 mod 40. In this example, Alice publishes the pair (55, 3). If Bob wants to send the message x = 23 to Alice, then he computes y = 233 mod 55 = 12 and sends y. Alice receives the message y = 12 and, in order to decrypt the message, she computes 1227 mod 55 = 23. Using the Chinese remainder theorem and Fermat’s little theorem, this can even be accomplished by hand. We have

28 � 2 Elementary number theory 1227 ≡ 23 = 8 ≡ 3 27

27

12 ≡ 1 ≡ 1

mod 5

mod 11

From the representation of 1 as 1 = −2 ⋅ 5 + 11, we can determine the value 1227 mod 55 as follows: (1 ⋅ (−2) ⋅ 5 + 3 ⋅ 11) mod 55 = 23 The next theorem says that it is no coincidence that after decrypting an encrypted message you end up with the original text again. Theorem 2.18. The RSA method is correct: If Bob encrypts a number x in the range 0 ≤ x ≤ n − 1 as y = x e mod n, then x = ys mod n. Proof. By the Chinese remainder theorem, it suffices to show x ≡ ys mod r for r = p and r = q. By symmetry, it clearly is sufficient to prove the congruence for r = p. Since y = x e mod p, we have to show that x ≡ x es mod p for all x ∈ ℤ/pℤ. This is obvious for x ≡ 0 mod p. So, let now gcd(x, p) = 1. We have es = 1 + k(p − 1)(q − 1) for a suitable value of k ∈ ℕ, and x p−1 ≡ 1 mod p by Fermat’s little theorem. Therefore x es = x 1+k(p−1)(q−1) = x ⋅ (x (p−1) )

k(q−1)

≡x

mod p

RSA security is based on the fact that no efficient method is known which, on input n = pq for two randomly chosen prime numbers p and q, computes the factors p and q. According to the current state of research in 2023, 2000 bits for n are still considered safe, and there are no realistic ideas to factor numbers with 4000 bits or more. We cannot prove that in order to break RSA you have to be able to factor n, because it would also suffice, e. g., to find φ(n) or the secret exponent s. However, the complexity of the following three problems is roughly the same: (1) Factorize n; (2) Compute φ(n); and (3) Compute s such that es ≡ 1 mod φ(n). We will not go into detail about this fact here. The statement can be found in the literature. It is also dealt with in detail in our volume Discrete Algebraic Methods [12]. In many implementations of RSA, small exponents like e = 3 or e = 17 are used to make encryption as fast as possible. However, Bob must be very careful with small public exponents, to keep possible attackers unable to decrypt the messages; see, e. g., Exercise 2.16 for the simplest form of Håstad’s broadcast attack [24] or the review article by Boneh [7] from 1999. The choice of the secret decryption exponent s is much more critical. If s is small, say less than 230 , then one can determine it by an exhaustive search, because e is known. According to Boneh and Durfee [8], every private key s satisfying s ≤ n0.292 is insecure. Thus, the situation for e and s is asymmetric. In the project The RSA Challenge Numbers, RSA-ℓ numbers (being products of two different large primes) of increasing bit-length ℓ were published with the challenge to

2.10 Euler’s totient function

� 29

factor them. We give a small overview that reflects, to the best of our knowledge, the status at the begin of 2023. – RSA-640 was factored on 11/2/2005. – RSA-768 was factored on 12/12/2009. – RSA-704 was factored on 7/2/2012. – The factors of RSA-1024 are not publicly known (as of May 2023). RSA-1024 = 1350664108659952233496032162788059699388814756056670 2752448514385152651060485953383394028715057190944179 8207282164471551373680419703964191743046496589274256 2393410208643832021103729587257623585096431105640735 0150818751067659462920556368552947521350085287941637 7328533906109750544334999811150056977236890927563

2.10 Euler’s totient function Recall that units are the invertible elements of a ring R with respect to multiplication. In this section we will further investigate the group of units (ℤ/nℤ)∗ of the ring ℤ/nℤ. We will focus on two questions, here. First, we want to know how to tell whether an arbitrary element of ℤ/nℤ is a unit. And second, we are interested in the number of units in ℤ/nℤ. We answered the first question already in Corollary 2.7. An element k ∈ ℤ/nℤ is invertible if and only if gcd(k, n) = 1, i. e., if k and n are coprime. In this case, using the extended Euclidean algorithm, we can find an ℓ satisfying kℓ ≡ 1 mod n. We now turn to the second question and consider Euler’s totient function, which is also called Euler’s φ function: 󵄨 󵄨 φ(n) = 󵄨󵄨󵄨(ℤ/nℤ)∗ 󵄨󵄨󵄨 From this definition, we obviously obtain φ(1) = 1 and the estimate 1 ≤ φ(n) ≤ n − 1. Since n is prime if and only if all numbers between 1 and n − 1 are coprime to n, we obtain φ(n) = n − 1 ⇐⇒ n is a prime number This also matches our observation that ℤ/nℤ is a field if and only if n is prime. Using Corollary 2.12 we obtain gcd(m, n) = 1 󳨐⇒ φ(mn) = φ(m)φ(n) Thus, in order to determine the value of the φ function for arbitrary numbers, we only have to clarify how to compute the value of this function on prime powers. So let p be a prime number and k ≥ 1. Which of the numbers 0, 1, . . . , pk − 1 are not coprime to

30 � 2 Elementary number theory pk ? Clearly, these are exactly the pk−1 numbers 0, p, 2p, . . . , (pk−1 − 1)p, because they are divisible by p. Thus, all of the remaining pk −pk−1 numbers are coprime to pk . This shows p is a prime number 󳨐⇒ φ(pk ) = (p − 1)pk−1 Now, we are able to compute the value φ(n) for any number n if we know its prime e factorization. Let n = ∏i pi i be the prime factorization of n. Then, from the above fore

e −1

mula for prime powers, we obtain φ(n) = ∏i φ(pi i ) = ∏i (pi − 1)pi i . This can easily be transformed to obtain the following, known as Euler’s formula: 1 φ(n) = n ⋅ ∏ (1 − ) p p prime

(2.2)

p|n

An important property of Euler’s totient function can be deduced from the generalization of Fermat’s little theorem, stated below. The proof of the following theorem of Euler can easily be given in a completely analogous and elementary way. Theorem 2.19 (Euler). Let gcd(a, n) = 1, then we have aφ(n) ≡ 1 mod n. Proof. (ℤ/nℤ)∗ can be represented in the form {g1 mod n, . . . , gφ(n) mod n} for certain gi ∈ ℤ. Multiplication by a is a bijective mapping on (ℤ/nℤ)∗ , because a and n are φ(n) coprime. Thus, (ℤ/nℤ)∗ = {ag1 mod n, . . . , agφ(n) mod n}. Let g = ∏i=1 gi . Then φ(n)

g ≡ ∏ agi ≡ aφ(n) g mod n i=1

Now g, being an element of (ℤ/nℤ)∗ itself, has an inverse g −1 ∈ (ℤ/nℤ)∗ . This implies 1 ≡ gg −1 ≡ aφ(n) gg −1 ≡ aφ(n) mod n Example 2.20. We want to determine the last two decimals of 34444 . For this, we compute 34444 mod 100. We have φ(100) = φ(22 )φ(52 ) = (2 − 1) ⋅ 2 ⋅ (5 − 1) ⋅ 5 = 40. Since 3 and 100 are coprime, it follows from Euler’s Theorem 2.19 that 340 ≡ 1 mod 100. Thus we obtain 34444 = (340 )111 ⋅ 34 ≡ 1 ⋅ 34 ≡ 81 mod 100. Therefore, the decimal representation of 34444 ends in 81. ⬦ In the following Theorem 2.21, we will introduce another property of Euler’s φ function. Here, the notation ∑t|n φ(t) means that we sum up the values φ(t) for all positive divisors t > 0 of n. Theorem 2.21. ∑ φ(t) = n t|n

2.11 Finite multiplicative subgroups of fields

� 31

Proof. Consider the following set: N ={

n−1 0 1 ,..., } n,n n

of n distinct fractions. All these fractions mn can be reduced to kt for coprime numbers k and t. Note that after the reduction, t is still a divisor of n. Therefore we have N ={

k | t | n, 0 ≤ k < t, gcd(k, t) = 1} t

Arranging the fractions according to their different denominators t yields a partition of n into disjoint subsets: N = ⋃{ t|n

k | 0 ≤ k < t, gcd(k, t) = 1} t

Now, the claim directly follows from 󵄨󵄨 󵄨󵄨 k 󵄨 󵄨 󵄨󵄨 󵄨 󵄨󵄨{ | 0 ≤ k < t, gcd(k, t) = 1}󵄨󵄨󵄨 = 󵄨󵄨󵄨{k | 0 ≤ k < t, gcd(k, t) = 1}󵄨󵄨󵄨 = φ(t) 󵄨󵄨 󵄨󵄨 t

2.11 Finite multiplicative subgroups of fields The aim is show Theorem 2.23. It states that every finite subgroup in a the multiplicative group F \ {0} of a field F is cyclic. We begin the section with the statement about Abelian groups which is of independent interest. Lemma 2.22. Let a, b ∈ A be elements in an Abelian group A such that the order of a, b are finite with ord(a) = m and ord(b) = n. Then there is some c ∈ A of order lcm(m, n). Proof. The result is trivial for m = 1 or n = 1. Thus, we can assume m, n ≥ 2. We write A = (A, +, 0) in additive notation. For g ∈ A, we let ⟨g⟩ be the generated subgroup of g in A. Since A is Abelian, there is a homomorphism φ : ℤ/mℤ × ℤ/nℤ → ⟨a, b⟩ ⊆ A such that φ(1, 0) = a and φ(0, 1) = b. Let gcd(m, n) = 1. Then Corollary 1.15 tells us, first, that ⟨a⟩∩⟨b⟩ = {0} and, second, this implies that φ is injective. By Corollary 2.13, ℤ/mℤ×ℤ/nℤ is isomorphic to ℤ/mnℤ if gcd(m, n) = 1. Thus, we are done when gcd(m, n) = 1 because ℤ/mnℤ contains an element of order mn. Actually, ab = φ(1, 1) has order mn. It remains the second case, gcd(m, n) ≥ 2. Hence, there exists a prime number p and exponents 1 ≤ r, s ∈ ℕ such that r and s are both maximal with the properties pr | m and ps | n. Without restriction, we have r ≤ s. We can write ℤ/mℤ as a direct product ℤ/pr ℤ × qℤ such that gcd(p, q) = 1. The element p ∈ ℤ/pr ℤ has order pr−1 (even if r = 1). Thus ℤ/pr ℤ × qℤ contains the elements (p, 0) of order pr−1 and (0, 1) of order q. Hence, by the first case, the element a′ = φ(p, 1) has order m′ = m/p in ⟨a⟩ ≅ ℤ/pr ℤ × qℤ. Since r ≤ s,

32 � 2 Elementary number theory we have lcm(m′ , n) = lcm(m, n). Since m′ n =

mn p

< mn, we use induction to conclude

that ⟨a , b⟩ ⊆ A contains an element c of order lcm(m, n). ′

Theorem 2.23. Let F be a field and U ⊆ F \ {0} any finite subgroup in a the multiplicative group F \{0}. Then U is cyclic. In particular, there is some α ∈ U with U = {αi | 0 ≤ i < |U|}. Proof. Let |U| = n and let d1 , . . . , dn be the orders of the elements of U. Define d = lcm(d1 , . . . , dn ). This implies d ≤ n. The polynomial X d − 1 has degree d and n pairwise different roots β ∈ U. Using polynomial division for polynomials over a field, we conclude X d − 1 = ∏β∈U (X − β), and therefore d ≥ n. Thus, d = |U|. By Lemma 2.22, the group U contains an element of order d. Thus, U is cyclic. The implication in Corollary 2.24 from right to left is a slightly improved version of a theorem by Édouard Lucas (1842–1891) [30]. Corollary 2.24. Let 2 ≤ n ∈ ℕ. Then the following properties are equivalent: (a) Number n is prime. (b) There exists a ∈ ℤ such that an−1 ≡ 1 mod n and a(n−1)/p ≢ 1 mod n for every prime divisor p of n − 1. (c) For all prime divisors p of n − 1, there exists ap ∈ ℤ such that apn−1 ≡ 1 mod n and ap

(n−1)/p

≢ 1 mod n.

Proof. (a) ⇒ (b) If n is prime, then (ℤ/nℤ)∗ is cyclic by Theorem 2.23. A generator a of (ℤ/nℤ)∗ satisfies (b). (For n = 2, the second condition holds because 1 ∈ ℕ has no prime divisors.) (b) ⇒ (c) Trivial with ap = a. (c) ⇒ (a) We show that (ℤ/nℤ)∗ contains an element of order n − 1. It then follows that n is prime by Theorem 2.2. Consider a prime divisor p of n−1 and suppose that ap ∈ ℤ satisfies the condition in (c). Let ep be maximal such that pep | n−1 and let mp be the order of ap in (ℤ/nℤ)∗ . Since apn−1 ≡ 1 mod n, the order mp must divide n − 1 by Corollary 1.14. On the other hand, mp is not a factor of (n − 1)/p because ap ≢ 1 mod n. This implies ep ∗ that p | mp . Lemma 2.22 tells us that (ℤ/nℤ) contains an element a where the order is the least common multiple over all mp with p | n − 1. Since n − 1 = ∏p|n−1 pep , the order of a is a multiple of n − 1. Since the order of a is at most |(ℤ/nℤ)∗ | ≤ n − 1, we conclude that the order of a is n − 1, as desired. (n−1)/p

By successively applying Corollary 2.24 to n and all prime divisors p of n − 1, one can certify the primality of n using polynomial-size certificates. This fact was first observed by Vaughan Pratt [35] in 1975. More precisely, [35] shows the following. If n is prime (resp. not prime), then we can guess a short proof checkable in polynomial time in log n which shows that n is prime (resp. not prime). It became known only in 2002 that guessing short proofs is not necessary since primality of binary numbers can be decided in polynomial time [1]: “Primes is in P”. The authors Agrawal, Kayal, and Saxena

2.12 Fibonacci numbers



33

received the distinguished Gödel prize1 in 2006 for this achievement. A textbook proof of this result is given, for example, in [12].

2.12 Fibonacci numbers The Fibonacci numbers Fn (Leonardo Pisano Bigollo, called Leonardo da Pisa, called Fibonacci “Filius Bonacci”, ca. 1175–1240) are defined recursively as follows: F0 = 0,

F1 = 1,

Fn+1 = Fn + Fn−1

(2.3)

The first values of this sequence are: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, . . . This sequence belongs to the world’s most popular number sequences. You certainly cannot expect to be the only one to win in a lottery, where numbers have to be chosen, if you choose numbers like 3, 5, 8, 13, 21, 34. If you do it anyway, at least you should not be too excited looking forward to a high payout too early. In Fibonacci’s book “Liber Abaci” from 1202 the famous rabbit problem is mentioned rather casually. This problem is the following: How many pairs of rabbits exist after one year, if you start with an adult pair, pairs of rabbits become adult after one month, then they reproduce monthly, and all rabbits live more than a year? The solution can be found in the following table, where A represents an adult couple and B a child couple. Couples 1. Jan. 1. Feb. 1. March 1. April 1. May 1. June 1. July 1. Aug. 1. Sept.

A

A

A B ABAAB ABAABABA ABAABABAABAAB

1. Oct. 1. Nov. 1. Dec. 1. Jan. 1 https://sigact.org/prizes/gödel/2006.html

Number of A’s B

A

1 1 2 3 5 8 13 21 34 55 89 144 233

B’s

Total

0 1 1 2 3 5 8 13 21 34 55 89 144

1 2 3 5 8 13 21 34 55 89 144 233 377

34 � 2 Elementary number theory The words in the “Couples” column are formed row by row replacing each A of the former row by AB, and each B by A. A grown couple lives on and has a couple of children each month. A couple of children grows up and becomes an adult couple. Note that every word is the beginning of the word in the following line, therefore in this way we obtain an infinite word, the Fibonacci word. (We are aware that the infinite word assumes infinite life for all the rabbits...) Besides the growth behavior of rabbits, there are many other possible combinatorial interpretations of Fibonacci numbers. We give some more examples. Example 2.25. If we have an arbitrary number of dominoes of lengths 1 and 2, then Fn+1 is the number of ways to chain dominoes in a row, such that the length of the chain is exactly n. Secondly, we claim that Fn is the number of words over the two letter alphabet {a, b}, having length n and starting with an a, but not containing aa as a subword. Moreover, Fn+2 is the number of words over {a, b}, having length n and not containing aa as a subword. The reader is invited to check the first claim. We explain the second statement concerning words of length n starting with an a and not having two adjacent a’s. The first values F0 = 0 and F1 = 1 should be clear: no word of length 0 starts with a, and only one word of length 1 has that property. Now, we consider words of length n ≥ 2 starting with a. Any such word has one of two possible forms: xb for a word x of length n − 1, starting with an a and without two adjacent a’s. Or xba for a word x of length n − 2, starting with an a and without two adjacent a’s. In the first case, there are Fn−1 possibilities for x; in the second case, we have Fn−2 possibilities. Summing up, there are Fn−1 + Fn−2 = Fn such words. For the third statement, we have a similar argument. Only now, without the requirement that words start with an a, we have to start with values 1 (the empty word has the desired property for length 0) and 2 (a and b are two words of length 1 with the property). Therefore we start with F2 = 1 for n = 0 and F3 = 2 for n = 1 and generally obtain Fn+2 . ⬦ Fibonacci numbers are growing fast. Due to the inductive definition, we obtain the following estimates for n ≥ 3: Fn ≤ 2n ≤ F2n Thus, the growth is somehow exponential. This can be stated much more precisely with the following approach. Assume, Fn = x n for an x ∈ ℝ, then for all n ≥ 1 we obtain x n+1 = x n + x n−1 Since Fn ≥ 1 for n ≥ 1 we have x ≠ 0. Therefore we can divide by x n−1 , and the above equation equivalently becomes x 2 = x + 1. This quadratic equation has two solutions: Φ=

1 + √5 2

and

√ ̂ = 1− 5 Φ 2

2.12 Fibonacci numbers



35

Here Φ = 1+2 5 is the golden ratio. This is the aspect ratio b/a of a rectangle with side lengths a and b satisfying the condition a/b = b/(a + b). An approximation to this ratio can be found in a famous picture of the Vitruvian Man by Leonardo da Vinci (1452–1519), which appears on Italian 1 Euro coins. A better approximation is √

Φ = 1.61803 39887 49894 84820 45868 34365 63811 77203 09179 80576 28621 . . . ̂ The equality x 2 = x + 1 can be transformed to x(x − 1) = 1. Therefore Φ−1 = Φ − 1 = −Φ. n+1 n n−1 n+1 n ̂n−1 ̂ ̂ ̂ Both numbers Φ and Φ obey the rules Φ = Φ +Φ and Φ = Φ + Φ , because that ̂n i’s how they were determined. Thus, also every linear combination Fn (a, b) = aΦn + bΦ satisfies Fn+1 (a, b) = Fn (a, b) + Fn−1 (a, b) In order to find Fn (a, b) = Fn , it is sufficient to solve the following system of equations on two unknowns: ̂ 0 = F0 = 0 aΦ0 + bΦ

̂ 1 = F1 = 1 aΦ1 + bΦ

Thus, we obtain b = −a and a = Fn =

1 . √5

Putting things together, we have

n n ̂n 1 Φn − Φ 1 + √5 1 − √5 = (( ) −( ) ) ̂ √5 2 2 Φ−Φ

(2.4)

Observe that −0.7 < 1−2 5 < −0.6. The sequence ( 1−2 5 )n approaches zero (in an alternating way) exponentially fast. Therefore, √



Fn = [

n

1 1 + √5 ( ) ] √5 2

where [x] denotes the integer next to x ∈ ℝ (rounded up or down). This approximation is getting better and better with growing values of n. So, we can efficiently compute Fibonacci numbers Fn even for quite large values of n, like F256 = 141 693 817 714 056 513 234 709 965 875 411 919 657 707 794 958 199 867 if the arithmetic system used provides the according precision. What if we want to use exact arithmetic in our computations? Even this can be accomplished. We use the following 2 × 2 matrices 0 1

M1 = (

1 F )=( 0 1 F1

F1 ) F2

Fn−1 Fn

and Mn = (

Fn ) Fn+1

36 � 2 Elementary number theory An elementary matrix multiplication yields Mn+1 = Mn ⋅ M1 = M1 ⋅ Mn . and therefore Mn = (M1 )n for all n ∈ ℤ. So Fn−1 Fn

(

Fn 0 )=( Fn+1 1

n

1 ) 1

(2.5)

Using the technique of fast exponentiation, Fn can be computed using 𝒪(log |n|) multiplications of 2 × 2 matrices over the integers. The maximum bit length of entries in this computation is linear in n. The connection between the golden ratio Φ and the Fibonacci numbers is well understood: Φ is a nonrational number that is best approximated in rational numbers as quotient of two consecutive Fibonacci numbers. This in turn has the consequence that the golden ratio is hard to approximate by a quotient of two small integers, which can also be seen from the fact that in the continued fraction expansion of Φ in (2.6) all coefficients have value 1. Indeed, an iterated application of the identity Φ = 1 + Φ1 yields Φ=1+

1 1 1 =1+ =1+ Φ 1 + Φ1 1+

1 1+ Φ1

=1+

1+

1

1

(2.6)

1 1+ 1+⋅⋅⋅

Surprisingly, it seems to be exactly this irrationality of Φ that emphasizes its importance in art and nature. In art, picture frames are made according to the golden ratio, in nature many plants have spirals in their construction plan, containing a number of leaves determined by Fibonacci numbers. For example, the spirals of sunflowers consist of 34 and 55 leaves. There are a lot of identities on Fibonacci numbers. A particularly nice one refers to the greatest common divisor of Fm and Fn . We state it as a theorem: Theorem 2.26. gcd(Fm , Fn ) = Fgcd(m,n) Proof. We split the claim of the theorem into two parts, Fgcd(m,n) | gcd(Fm , Fn ) and gcd(Fm , Fn ) | Fgcd(m,n) . Let n = kp. We want to show Fk | Fn . Without loss of generality, we can assume n ≥ 1. We consider relations between the matrices Mn and Mk : Fn−1 Fn

(

Fn F kp ) = M1n = M1 = ( k−1 Fn+1 Fk

p

Fk ) Fk+1

Taking both sides modulo Fk (that is, all entries are written modFk as in Example 1.22), we obtain Fn−1 Fn

(

p

Fn F ) ≡ ( k−1 Fn+1 0

0 p ) Fk+1

mod Fk

2.13 Recursion depth of the Euclidean algorithm

� 37

In particular, Fn ≡ 0 mod Fk and thus Fk | Fn . This implies that Fgcd(m,n) divides both Fm and Fn , and therefore Fgcd(m,n) | gcd(Fm , Fn ). For the other direction, first note that Fn and Fn+1 are coprime, which immediately follows by induction from the recursive equation Fn+1 = Fn + Fn−1 , using gcd(F1 , F0 ) = 1. Let m > n. We have to show gcd(Fm , Fn ) | Fgcd(m,n) . This is trivial for n = 0 or n = 1. Now, let g = gcd(Fm , Fn ) and write m = np + r for some 0 ≤ r < n. Then we have Mm = Mnp Mr . Computing modulo g yields Fm−1 0

Mm ≡ (

p

0

Fm+1

p

Fn−1 0

)≡(

0 Fr−1 p )( Fn+1 Fr

Fr ) Fr+1

mod g

p

We conclude 0 ≡ Fn+1 Fr mod g, and so g divides Fn+1 Fr . Using gcd(Fn , Fn+1 ) = 1 and p g | Fn , we see that the numbers g and Fn+1 are coprime. Thus, g is a divisor of Fr . So g also divides gcd(Fn , Fr ) and by induction we get g | Fgcd(n,r) , because r < n. But gcd(m, n) = gcd(n, r), and so we finally have g | Fgcd(m,n) . A further connection between Fibonacci numbers and the greatest common divisor once again emphasizes the algorithmic importance of the Fibonacci numbers. We will investigate that in the following section.

2.13 Recursion depth of the Euclidean algorithm In this section we want to address the question of what is the maximum number of recursive calls that can occur in a run of the Euclidean algorithm. This algorithm computes the greatest common divisor gcd(k, ℓ). Let us briefly repeat how the algorithm works. Since gcd(k, ℓ) = gcd(ℓ, k) = gcd(|k|, |ℓ|), we can assume that 0 ≤ k ≤ ℓ. For k = 0, gcd(0, ℓ) = ℓ and we are done. Now let 0 < k ≤ ℓ. We first compute numbers q and r such that ℓ = qk + r with 0 ≤ r < k. Any number that divides k and ℓ also divides r; but, moreover, any number dividing k and r also divides ℓ. Hence gcd(k, ℓ) = gcd(r, k), and we can proceed recursively with 0 ≤ r < k. Let us look at an example. Suppose we use the Euclidean algorithm to compute the gcd of two consecutive Fibonacci numbers Fn−1 and Fn . Since Fn = Fn−1 + Fn−2 , we will obtain the following recursive calls: gcd(Fn−1 , Fn ) = gcd(Fn−2 , Fn−1 ) = ⋅ ⋅ ⋅ = gcd(F0 , F1 ) = 1 From this case, we learn that the number of recursive calls in the computation of gcd(k, ℓ) can be logarithmic in k. But consecutive Fibonacci numbers indeed already constitute the worst case. Let 0 ≤ k ≤ ℓ be natural numbers and gcd(k, ℓ) = g. Further assume that computing g requires a total number of n recursive calls in the Euclidean algorithm. Then there is a sequence of numbers

38 � 2 Elementary number theory f0 = 0,

f1 = g,

...,

fn−1 = k,

fn = ℓ

such that fi+1 = qi fi + fi−1 , where 0 ≤ fi−1 < fi and qi ≥ 1. This yields fi ≥ Fi for all 0 ≤ i ≤ n. Particularly, k ≥ Fn−1 . We state this in the next theorem. Theorem 2.27. Let Φ = 1+2 5 be the golden ratio and let k, ℓ ∈ ℕ \ {0}. Then computing the greatest common divisor gcd(k, ℓ) by an execution of the Euclidean algorithm requires at most ⌈logΦ k⌉ recursive calls. √

Note that the recursion depth of the Euclidean algorithm satisfies logΦ k < 32 log2 k for every k > 1. Thus, in order to compute the gcd of two, say, 100-digit binary numbers, at most 150 recursive calls have to be performed.

Exercises 2.1. Let p be a prime number. Show that log10 (p) is not rational.

2.2. An application of the Euclidean algorithm: (a) Determine two numbers x, y ∈ ℤ with x ⋅ 35 − y ⋅ 56 = gcd(35, 56). (b) Determine x, y ∈ ℕ with the above property. 2.3. Find all solutions of the linear congruence 3x − 7y ≡ 11

mod 13

2.4. Let a, b be two natural numbers, and suppose that gcd(a, b) = 1. Show that gcd(a + b, a − b) ∈ {1, 2}. 2.5. Divisibility rules: (a) Show that a number in the decimal system is divisible by 3 if and only if its cross sum is divisible by 3. (b) Find an analogous rule for divisibility by 11 in the decimal system. 2.6. Show that for any n ≥ 2, the number n4 + 4n is not a prime number. Hint: Consider the following polynomial: (x 2 + 2y2 )2 − 4x 2 y2 .

2.7. Let n ∈ ℕ. Prove: (a) If 2n − 1 is prime, then n is prime, too. Prime numbers of the form 2n − 1 are called Mersenne prime numbers (after Marin Mersenne, 1588–1648). (b) If 2n + 1 is a prime number, then n is a power of two. n (c) Let fn = 22 + 1 be the nth Fermat number. As of 2023, the only known primes in this series are f0 , , . . . , , f4 . Show: If m < n, then gcd(fm , fn ) = 1. From this, conclude that there are infinitely many prime numbers. Hint: Consider (fn − 2)/fm .

Exercises

� 39

2.8. Find the smallest x ∈ ℕ satisfying x ≡ 1 mod 2,

x ≡ 0 mod 3,

x ≡ 1 mod 5,

x ≡ 6 mod 7

2.9. Show that the congruence system x ≡ a mod n, x ≡ b mod m has a solution if and only if gcd (n, m) divides a − b. Confirm that in case the system has a solution, it is unique modulo lcm(n, m). 2.10. (a) (b) (c)

Show for all n ∈ ℕ: n5 ≡ n mod 30 4 2 3n +n +2n+4 ≡ 21 mod 60 7n+2 + 82n+1 ≡ 0 mod 57

2.11. Let p be an odd prime, and let a ∈ ℕ be odd and not divisible by p. Show that ap−1 ≡ 1

mod 4p

2.12. The RSA method: (a) What is the number of elements in the multiplicative group (ℤ/51ℤ)∗ ? (b) Determine the secret decryption exponent, which belongs to the publicly available RSA key (n, e) = (51, 11). (c) The ciphertext 7 was encrypted using the RSA method with the public key (n, e) = (51, 11), i. e., 7 = x 11 mod 51. Determine the plaintext x. (d) How many elements of order 10 does the group (ℤ/51ℤ)∗ contain? (e) Is the multiplicative group (ℤ/51ℤ)∗ a cyclic group?

2.13. Let p, q be prime numbers, n = pq and e ∈ ℕ such that gcd(e, φ(n)) = 1. To decrypt messages, that have been encrypted using the RSA method with the public key (n, e), your budget committee recommends using a private key s according to the cheaper rule es ≡ 1

mod lcm(p − 1, q − 1)

(instead of es ≡ 1 mod φ(n)). This is supposed to be a saving because lcm(p − 1, q − 1) is smaller than the product φ(n) = (p − 1)(q − 1). Is this version of the algorithm still correct? 2.14. We extend the RSA method to three prime numbers. Let p, q, r be three different primes, let n = pqr and s ⋅ e ≡ 1 mod φ(n). Messages x, y ∈ {0, . . . , n − 1} are encrypted using the rule c(x) = x e mod n and decrypted as d(y) = ys mod n. (a) Show that the procedure is correct, i. e., that d(c(x)) = x for all x ∈ {0, . . . , n − 1}. (b) The ciphertext y = 14 has been created according to the public key (n, e) = (66, 27). Determine ys mod k for k = 2, 3, 11 and the plaintext x = ys mod 66.

2.15. Suppose the two users A1 and A2 of RSA have published their respective keys (n, e1 ) and (n, e2 ). Now Bob encrypts the same text m for A1 and for A2 . Show that Oscar can determine the plaintext m from the two cipher texts, provided e1 and e2 are coprime.

40 � 2 Elementary number theory 2.16. Suppose a bank sends the same message m to three different customers. The message m is encrypted with the RSA method using the public keys (n1 , 3), (n2 , 3), and (n3 , 3), with three different values for ni . Show that the attacker Oscar is able to decrypt the message m under these conditions. 2.17. Prove: ∀a, b ∈ ℤ : gcd(a, b) = 1 ⇒ aφ(b) + bφ(a) ≡ 1 mod ab.

2.18. (a) (b) (c) (d)

Let Fn with n ∈ ℕ be the sequence of Fibonacci numbers. Prove: F1 + ⋅ ⋅ ⋅ + Fn = Fn+2 − 1 ∑nk=0 Fk2 = Fn Fn+1 ∀n ≥ 0 ∀k ≥ 1 : Fn+k = Fk Fn+1 + Fk−1 Fn ∀n ≥ 1 : Fn+1 Fn−1 − Fn2 = (−1)n

2.19. Let M be a set, p a prime number, and f : M → M a mapping satisfying f p (m) = m for all m ∈ M. Here, f p denotes the p-fold successive execution of the mapping f . (a) Let m ∈ M. Show that the values f (m), f 2 (m), . . . , f p (m) are either all equal or pairwise different. (b) Now let M be finite and F = {m ∈ M | f (m) = m} the set of fixed points of f . Prove that |M| ≡ |F| mod p.

2.20. We want to show that Fp+1 + Fp−1 ≡ 1 mod p holds for every prime number p. We define L1 = 1, L2 = 3, and Ln+2 = Ln+1 + Ln . Prove the following: (a) Ln = Fn+1 + Fn−1 . (b) Ln is the cardinality of the set ℒn , where ℒn denotes the set of those subsets M ⊆ {1, . . . , n} that do not contain two consecutive numbers modulo n (so, 1 is a successor of n). (c) If p is prime then Lp ≡ 1 mod p. 2.21. For p, q ∈ ℤ with p > 1 let q rem p (for remainder) denote the integer r satisfying − p2 ≤ r < p2 and q ≡ r mod p. The gcd may be computed by the Euclidean algorithm, but with rem instead of mod . Show that the recursion depth of this algorithm is at most ⌈logΨ k⌉, where Ψ = √2 + 1.

Hint: Consider the recursion Gn+1 = Gn + 2Gn−1 .

Summary �

41

Summary Notions – – – – – – – –

prime number numbers: ℕ, ℤ, ℚ, ℝ, ℂ associative operation distributive operation neutral element invertible element commutative/Abelian semigroup, monoid, group

– – – – – – – –

ring, field unit substructure homomorphism isomorphism gcd(k, ℓ) coprime numbers prime factorization

– – – – – – – –

residue class congruent modulo n Euler’s totient function order of a group order of an element cyclic group Fibonacci numbers Fn golden ratio

Methods and results – – – – – – – – – – – – – – – – – – – – – – –

(Extended) Euclidean algorithm Bézout’s lemma: For all k, ℓ ∈ ℤ, there are a, b ∈ ℤ such that gcd(k, ℓ) = ak + bℓ. Fundamental theorem of arithmetic: Each n ∈ ℕ has a unique prime factorization. k ∈ (ℤ/nℤ)∗ ⇐⇒ gcd(k, n) = 1 ⇐⇒ the mapping x 󳨃→ kx on ℤ/nℤ is bijective Computing inverses modulo n ℤ/nℤ is a field ⇐⇒ n is a prime number Chinese remainder theorem: If gcd(k, ℓ) = 1, then the map ℤ/kℓℤ → ℤ/kℤ × ℤ/ℓℤ with x mod kℓ 󳨃→ (x mod k, x mod ℓ) is a ring isomorphism. Solving simultaneous congruences There are infinitely many prime numbers. Fermat’s little theorem: p prime 󳨐⇒ ap ≡ a mod p. Fermat test Fast (modular) exponentiation RSA method Computing Euler’s φ function Euler theorem: gcd(a, n) = 1 󳨐⇒ aφ(n) ≡ 1 mod n ∑t|n φ(t) = n In finite commutative groups G, we have a|G| = 1 for all a ∈ G. In finite groups, ak = 1 ⇐⇒ the order of a divides k Finite multiplicative subgroups of fields are cyclic Combinatorial interpretation of Fibonacci numbers Let t | n. A cyclic group of order n has φ(t) elements of order t. √ √ Fn = √15 (( 1+2 5 )n − ( 1−2 5 )n ). Fast computation of Fn by matrices. gcd(Fm , Fn ) = Fgcd(m,n) Recursion depth of the Euclidean algorithm for computing gcd(m, n) is in 𝒪(log(m + n))

3 Some useful growth estimates We already have seen that the nth Fibonacci number Fn is the rounded value of Φn /√5. Thus, these numbers show an exponential growth in the base of the golden ratio Φ. In the current chapter, we will examine the growth behavior of the following functions: 2n n!, ( ), lcm(n), and π(x) n Here again, we assume n ∈ ℕ, and n! = n(n − 1) ⋅ ⋅ ⋅ 1 is the factorial of n. We consider (2n , the middle binomial coefficient, and lcm(n) = lcm{1, . . . , n} denotes the least ) = (2n)! n!n! n common multiple of the first n positive integers. The function π(x), for a positive real x, means the number of primes less than or equal to x. The goal we pursue is not to provide best estimates for the above functions in all cases, instead we are looking for estimates that can be easily derived and, therefore, well memorized. We decided to include the π(x) function, too, because there are some fascinating statements on prime number density that can be obtained as easy consequences using the rest of the material.

3.1 Factorials The sequences n! and 2n can be defined inductively: 0! = 20 = 1,

(n + 1)! = (n + 1)n!,

and

2n+1 = 2 ⋅ 2n

Some initial values can be found in the following table: n 2n n!

0 1 1

1 2 1

2 4 2

3 8 6

4 16 24

5 32 120

⋅⋅⋅ ⋅⋅⋅ ⋅⋅⋅

10 1024 3628800

⋅⋅⋅ ⋅⋅⋅ ⋅⋅⋅

20 1048576 2432902008176640000

The estimates 1000 for 210 and 1 million for 220 are rough but often useful. The error at 1 million is approximately 5 %. The values of n! for n ≤ 5 are easy to remember. For n ≥ 4, we always have n! > 2n . But how much faster is the growth of n! compared 2 to that of 2n ? Does n! grow faster than 2n ? The answer is no, and this can easily be seen: Obviously, n! ≤ nn = 2n log2 n for all n, and n log2 n is less than n2 for each n ≥ 1. An immediate lower bound for n! can be obtained using the observation that in the product n n! half of the factors are at least as large as n2 . This yields ( n2 ) 2 ≤ n!; together with the upper bound nn , we obtain the important rule log n! ∈ Θ(n log n). https://doi.org/10.1515/9783111062556-003

3.2 Binomial coefficients

� 43

Now we want to derive tighter bounds for n!. The following is obvious: ln n! = ln 2 + ln 3 + ⋅ ⋅ ⋅ + ln n From this we can conclude for any n ≥ 2 that n

ln(n − 1)! < ∫ ln x dx < ln n! 1

The antiderivative of ln x is x ln x − x + C. We thus obtain ln(n − 1)! < n ln n − n + 1 < ln n! and from this n

n (n − 1)! < e ⋅ ( ) < n! e We have reached the goal of our consideration; for n ≥ 1, we have the following estimates (with equality only if n = 1): n

n

n n e ⋅ ( ) ≤ n! ≤ ne ⋅ ( ) e e

(3.1)

In fact, using a more detailed inspection, better results can be achieved. In particular, Stirling’s formula has been proven: n

n n! ∼ √2πn( ) e

(3.2)

For n = 20, Stirling’s formula yields the approximation 2.42 ⋅ 1018 . Compared to the entry for 20! in the table above, which is a little more than 2.43 ⋅ 1018 , this appears to be pretty good. The estimate according to equation (3.1) yields 0.58 ⋅ 1018 ≤ 20! ≤ 11.75 ⋅ 1018 .

3.2 Binomial coefficients Many people have heard about binomial coefficients in their high schools. There, the binomial coefficients (kn) are usually defined for natural numbers k, n with k ≤ n via the following equation: n n! ( )= k k! (n − k)!

44 � 3 Some useful growth estimates Thus, (kn) is exactly the number of k-element subsets in {1, . . . , n}. This is the combinatorial interpretation of (kn). We will deal with binomial coefficients in detail later in Section 5.2, but this simple fact shall be established right here already: A sequence (i1 , . . . , ik ) of k pairwise different numbers between 1 and n defines the k-element subset {i1 , . . . , ik }. The number of such sequences is n(n − 1) ⋅ ⋅ ⋅ (n − k + 1). Since the ordering is irrelevant in the representation of the set, we can permute the ij arbitrarily. Thus, there are k! sequences defining the same subset. Now, the assertion follows from the equality (n − 1) ⋅ ⋅ ⋅ (n − k + 1) n! = k! k! (n − k)! The fact that there are altogether 2n subsets of {1, . . . , n} further shows n 2n = (1 + 1)n = ∑ ( ). k k This representation of 2n as a sum can also be obtained directly from the binomial theorem, Theorem 5.3. Here the immediate consequence (kn) ≤ 2n for all 0 ≤ k ≤ n suffices for our purposes. Directly from the definition of the binomial coefficients, also the following equation can be derived: n n k( ) = (n − k + 1)( ) k k−1 This yields the following consequence: n n n n n n ) > ( ) = 1. 1 = ( ) < ( ) < ⋅⋅⋅ < ( n ) = ( n ) > ⋅⋅⋅ > ( ⌈2⌉ n−1 n 0 1 ⌊2⌋ The sequence is increasing up to the middle and then it starts decreasing again (see Figure 3.1). So for n ≥ 2, the value (⌊ nn ⌋) is the maximum of the following n values: 2, (n1 ), 2

n ). Therefore, ( n ) must be at least as large as the average of these values. For . . ., (n−1 ⌊n⌋

n ≥ 2, we have

2

2n n n ≤( n )=( n ) n ⌊2⌋ ⌈2⌉

(3.3)

We note the following standard rule for n ≥ 1: 4n 2n 2n + 1 ≤( ) √2n, we see that ep ((2n n Erdős observed: If

2n 2 n < p ≤ n and n ≥ 3 then ep (( )) = 0. 3 n

Any prime p in the given range obviously divides n! exactly once. Therefore, in the de, the prime p appears exactly twice. The numerator (2n)! connominator of (2n ) = (2n)! n n!n! tains exactly two multiples of p as factors, namely p and 2p, which completes the proof of the claim. Putting things together, this yields 4n 2n ≤ ( ) ≤ ( ∏ 2n)( ∏ 2n n √ √ p≤ 2n

2n −1.

3.2. Let a1 ≤ ⋅ ⋅ ⋅ ≤ an and b1 ≤ ⋅ ⋅ ⋅ ≤ bn be two sequences of real numbers, and let π : {1, . . . , n} → {1, . . . , n} be a permutation. Show that the sum S(π) = ∑ni=1 ai bπ(i) assumes its maximum value if π is the identity. The sum S(π) is minimal if π(i) = n + 1 − i for all i (i. e., if π reverses the order). 3.3. Let a1 , . . . , an be positive real numbers and let





H = n/( a1 + ⋅ ⋅ ⋅ + 1

1 ) an

be the harmonic mean,

G = √a1 ⋅ ⋅ ⋅ an be the geometric mean, n

54 � 3 Some useful growth estimates –

A = (a1 + ⋅ ⋅ ⋅ + an )/n be the arithmetic mean, and



Q = √(a12 + ⋅ ⋅ ⋅ + an2 )/n be the quadratic mean.

Show: min(a1 , . . . , an ) ≤ H ≤ G ≤ A ≤ Q ≤ max(a1 , . . . , an ).

3.4. Let s be a real number. Show that the series ∑i≥1

1 is

converges if and only if s > 1.

3.5. For every n ≥ 1, let t(n) be the number of positive divisors of n. We define the average number of divisors by t(n) = n1 ∑ni=1 t(i). Show that |t(n) − ln n| ≤ 1.

3.6. Show that there are arbitrarily large gaps between two consecutive prime numbers, that is, for each n ∈ ℕ there is an index i such that pi+1 − pi ≥ n. Here p1 , p2 , . . . is the ascending sequence of prime numbers. 3.7. Let p1 , p2 , . . . be the sequence of prime numbers in ascending order. Show: (a) For every sufficiently large number n, we have pn ≥ 31 n log n. (b) The series ∑i≥1 p1 diverges. i

Summary

Summary Notions – lcm(n) = lcm({1, . . . , n}) – prime-counting function π(x) – k divides ℓ, k | ℓ

– factorial n! – binomial coefficient (kn) – least common multiple

Methods and results – e ⋅ ( ne )n ≤ n! ≤ ne ⋅ ( ne )n for n ≥ 1

n ) is the largest binomial coefficient among (n), . . . , (n) – (⌊n/2⌋ 1 n



– –



4n ≤ (2n ) < (2n+1 ) < 4n for n ≥ 1 2n n n ( kn )k ≤ (kn) < ( en )k for 0 < k ≤ n k lcm(n) = ∏p≤n p⌊logp n⌋ for n ≥ 1 (product m(mn ) | lcm(n) for 1 ≤ m ≤ n n n−1

– 2 < lcm(n) ≤ 4 –

n log2 n

taken over primes p, only)

for n ≥ 7

≤ π(n) for n ≥ 4

– ∏p≤n p ≤ 4n−1 for n ≥ 1 (product taken over primes p, only)

– For every ε > 0 there is n0 ∈ ℕ such that π(n) ≤

(2+ε)n log n

for all n ≥ n0

– Bertrand’s postulate: ∀n ≥ 1 there is a prime number p such that n < p ≤ 2n – For all n ≥ 212 , there are at least

n 3 log2 n

prime numbers p with n < p ≤ 2n

� 55

4 Discrete probability Complexity frequently speaks about the behavior in the worst case. However, we are often interested in “typical cases”. In the worst case, when playing roulette, you would loose every single game. But in reality, at least every now and then you will win a considerable amount. This happens rarely, and the casino stays rich. To be able to describe such a behavior more precisely, we will develop some elementary concepts from discrete probability theory here, in order to use them for later applications.

4.1 Probabilities and expected values A discrete probability space is a finite or countable set Ω together with a mapping Pr : Ω → [0, 1] into the real 0–1 interval, satisfying the following condition: ∑ Pr[ω] = 1

ω∈Ω

1 If Ω is a finite set and Pr[ω] a constant value, i. e., Pr[ω] = |Ω| for all ω ∈ Ω, then we speak of a uniform distribution. An event is a subset A ⊆ Ω. The probability of A is

Pr[A] = ∑ Pr[ω] ω∈A

If Ω is finite, then for uniform distributions we obtain Pr[A] =

|A| “Number of positive cases” = |Ω| “Total number of cases”

This is one of the motivations for the next chapter, where we will present techniques for determining the respective numbers. In a round of roulette, the probability space is given by the set {0, . . . , 36}, and the events red and black both have the same probability 18/37. Essentially, it is this difference 1 − 36/37 = 1/37, which causes the disadvantage of the players against the casino. A random variable X in the following always means a real-valued function X :Ω→ℝ The expected value of X is defined as follows: E[X] = ∑ X(ω) Pr[ω] ω∈Ω

If Ω is an infinite set, we have to make sure that the series is absolutely convergent because otherwise the expected value is undefined. In most cases considered here, the https://doi.org/10.1515/9783111062556-004

4.1 Probabilities and expected values

� 57

probability space will be finite, therefore we do not need to care about convergence. In the other cases, we will make implicit convergence assumptions, often without even mentioning them. In cases of uniform distribution, the expected value is the average over all values of the random variable. Then we have E[X] =

1 ∑ X(ω) |Ω| ω∈Ω

e. g., the expected value occurring on a rolled dice is 3.5. It is notable that this value does not correspond to any number of points that might actually appear when a dice is rolled. Each event A ⊆ Ω can directly be interpreted as a random variable via the characteristic function χA : Ω → {0, 1} (with χA (a) = 1 for a ∈ A and χA (a) = 0 otherwise). Then, the probability of the event A is the expected value of the characteristic function, Pr[A] = E[χA ]. If x ∈ ℝ, then Pr[X = x] denotes the probability of the event {ω ∈ Ω | X(ω) = x} Thus we have Pr[X = x] = Pr[X −1 (x)]. Further, directly from the definition, we obtain the following statement: E[X] = ∑ X(ω) Pr[ω] = ∑ x Pr[X = x] x∈ℝ

ω∈Ω

If X does not take any negative values and X(ω) > 0 for at least one ω with Pr[ω] > 0, then obviously E[X] > 0. Moreover, the following connection between probability and expected value, named after Andrei Andrejewitsch Markov (1856–1922), can be proven. Theorem 4.1 (Markov inequality). Let X be a random variable with X(ω) ≥ 0 for all ω and E[X] > 0. Then for all λ > 0, Pr[X ≥ λE[X]] ≤

1 λ

Proof. We have E[X] = ∑ X(ω) Pr[ω] ≥ ω∈Ω



ω∈Ω X(ω)≥λE[X]

X(ω) Pr[ω] ≥ λ E[X] Pr[X ≥ λE[X]]

This yields the claim of the theorem. An important property is the linearity of the expected value: E[aX + bY ] = aE[X] + bE[Y ] where a, b ∈ ℝ and X, Y : Ω → ℝ are random variables. The random variable aX + bY : Ω → ℝ is defined by (aX + bY )(ω) = aX(ω) + bY (ω). If X : Ω → ℝ is a random variable, then we associate with X its discrete density function fX : ℝ → [0, 1] and its distribution

58 � 4 Discrete probability FX : ℝ → [0, 1]. These are defined as follows: fX : ℝ → [0, 1],

FX : ℝ → [0, 1],

fX (x) = Pr[X = x]

FX (x) = Pr[X ≤ x]

The distribution can be computed from the density, and vice versa. It can happen that completely different random variables lead to the same distribution and density. Many interesting properties already can be obtained from the distribution (or density) without knowing the actual random variable. Hence, the actual probability space often is not relevant at all. In particular, we have E[X] = ∑ x fX (x) x∈ℝ

To stay close to a concrete imagination, we will further use discrete random variables. However, note that it is this approach that enables us to change to continuous random variables where necessary. Essentially, sums are replaced by integrals, and fX (x) becomes dx. But, one must ensure that expressions remain meaningful and well defined, which would require a significant amount of theoretical foundation. We say that two random variables X and Y are independent if the following holds for all x, y ∈ ℝ: Pr[X = x ∧ Y = y] = Pr[X = x] ⋅ Pr[Y = y] Here, X = x ∧ Y = y means the intersection of the events X = x and Y = y. The intuition behind this is that independent random variables do not affect each other. For example, the probability for throwing two sixes with two dice in one attempt is 1/36 because each of the dice shows a six with probability 1/6, independent of the other. Similarly, the probability for a double is 1/6, and the probability for a Yahtzee (5 times the same number) in a single roll of five dice is 1/64 = 1/1296. If X and Y are independent, then E[XY ] = E[X]E[Y ] This can be seen as follows: E[XY ] = ∑ z Pr[XY = z] z

= ∑ ∑ xy Pr[X = x ∧ Y = y] z xy=z

= (∑ x Pr[X = x]) ⋅ (∑ y Pr[Y = y]) x

y

= E[X]E[Y ] The expected value of the random variable X − E[X] obviously is 0. It is much more interesting to consider the square (X − E[X])2 of this random variable. Its expected value

4.1 Probabilities and expected values

� 59

can certainly not be negative. It is positive, whenever it is defined and satisfies the inequality Pr[X ≠ E[X]] > 0. We call the expected value of (X − E[X])2 the variance Var[X] of X. The variance is a measure of how much X differs from E[X]. We obtain 2

Var[X] = E[(X − E[X]) ] = E[X 2 − 2E[X]X + E[X]2 ] = E[X 2 ] − 2E[X]E[X] + E[X]2 = E[X 2 ] − E[X]2 The first equation is the definition of the variance. The third follows from the linearity of the expected value. Thus, the expected value of the random variable X 2 is at least E[X]2 , and the difference is the variance of X. Example 4.2. In a Bernoulli experiment (Jacob Bernoulli, 1654–1705), the success or failure is given by a result 1 or 0. Typically, two values p and q are defined such that Pr[X = 1] = p and Pr[X = 0] = q = 1 − p. Then, E[X] = p and Var[X] = p − p2 = pq. ⬦ By σX we denote the standard deviation of X, it is defined as σX = √Var[X]. The name is motivated by the following relation. Theorem 4.3 (Chebyshev inequality). Let λ > 0. Then 1 󵄨 󵄨 Pr[󵄨󵄨󵄨X − E[X]󵄨󵄨󵄨 ≥ λ σX ] ≤ 2 λ Proof. By Markov’s inequality and the definitions of σX and Var[X], we obtain 1 2 󵄨 󵄨 Pr[󵄨󵄨󵄨X − E[X]󵄨󵄨󵄨 ≥ λ σX ] = Pr[(X − E[X]) ≥ λ2 Var[X]] ≤ 2 λ The estimate given in Theorem 4.3 yields meaningful results only for deviations larger than the standard deviation (i. e., for λ > 1). Theorem 4.4. Let X and Y be independent random variables. Then Var[X + Y ] = Var[X] + Var[Y ] Proof. Using E[XY ] = E[X]E[Y ], we conclude Var[X + Y ] = E[(X + Y )2 ] − E[X + Y ]2 = E[X 2 ] + 2E[XY ] + E[Y 2 ] − E[X]2 − 2E[X]E[Y ] − E[Y ]2 = E[X 2 ] − E[X]2 + E[Y 2 ] − E[Y ]2 = Var[X] + Var[Y ]

60 � 4 Discrete probability

4.2 Jensen’s inequality A function f : ℝ → ℝ is called convex if the following inequality is valid for all λ ∈ [0, 1] and x, y ∈ ℝ: f ((1 − λ)x + λy) ≤ (1 − λ)f (x) + λf (y) Graphically, in the two-dimensional Euclidean space ℝ2 , convexity is indicated as follows: the line segment between the points (x, f (x)) and (y, f (y)) lies above the graph of f . Thus, the graph of a convex function looks something like this:

λf (x) + (1 − λ)f (y)

x

λx + (1 − λ)y

y

If a function f can be differentiated twice, it is convex if and only if its second derivative f ′′ is nonnegative everywhere. Both functions f (x) = x 2 and g(x) = 2x are convex. The second derivatives are f ′′ (x) = 2 and g ′′ (x) = (ln 2)2 ⋅ 2x , and thus always nonnegative in both cases. The following inequality is named after Johan Ludwig William Valdemar Jensen (1859–1925). Theorem 4.5 (Jensen’s inequality). Let f : ℝ → ℝ be a convex function and k ≥ 1. If λ1 , . . . , λk ∈ [0, 1] ⊆ ℝ are such that ∑ki=1 λi = 1, then k

k

i=1

i=1

f (∑ λi xi ) ≤ ∑ λi f (xi ) Proof. Without restriction, we may assume λi > 0 for all 1 ≤ i ≤ k. We use induction on k. If k = 1, then λ1 = 1, and the statement is obvious. So let k > 1 and λ1 < 1, now. This yields k

k

i=1

i=2

f (∑ λi xi ) = f (λ1 x1 + (1 − λ1 ) ∑

λi x) 1 − λ1 i

4.3 Birthday paradox

k

≤ λ1 f (x1 ) + (1 − λ1 )f (∑ i=2

k

≤ λ1 f (x1 ) + (1 − λ1 ) ∑ i=2

λi x) 1 − λ1 i

λi f (x ) 1 − λ1 i

� 61

(since f is convex) (by induction)

k

= ∑ λi f (xi ) i=1

For a random variable X : Ω → ℝ and a function f : ℝ → ℝ, we define another random variable f (X) : Ω → ℝ by f (X)(ω) = f (X(ω)). Then E[f (X)] = ∑ y Pr[f (X) = y] y∈ℝ

= ∑ y ∑ Pr[X = x] x∈ℝ

y=f (x)

= ∑ f (x) Pr[X = x] x∈ℝ

This enables us to compute the expected value of f (X), without explicitly determining the density of f (X). In Section 5.12 we will apply Corollary 4.6 to the convex function 2x to compute the average height of binary search trees. Corollary 4.6. Let f : ℝ → ℝ be a convex function and X a random variable on a finite probability space. Then f (E[X]) ≤ E[f (X)] Proof. Let X : Ω → ℝ be the random variable. We may assume X(Ω) = {x1 , . . . , xk } with Pr[X = xi ] = λi . By Jensen’s inequality, we obtain k

k

i=1

i=1

f (E[X]) = f (∑ λi xi ) ≤ ∑ λi f (xi ) = E[f (X)] Remark 4.7. Experience teaches that it is easy to recall that for convex functions f there is an inequality between the values f (E[X]) and E[f (X)], but it is much harder to tell in which direction the inequality holds. Is it f (E[X]) ≤ E[f (X)] or f (E[X]) ≥ E[f (X)]? Here, it is helpful to remember the variance, which is defined by E[X 2 ] − E[X]2 . The variance is positive and x 󳨃→ x 2 is a convex function. Therefore, f (E[X]) ≤ E[f (X)] is valid. ⬦

4.3 Birthday paradox By a curve sketching argument on the function (1+x)−ex one can easily see that (1+x) ≤ ex for all x, where equality holds only at x = 0 (see Exercise 3.1b). If x is close to zero,

62 � 4 Discrete probability we get a quite useful estimate. This important technique gives an explanation for the following fact, known as the birthday paradox: If there are more than 23 people in a room, the probability that two of them have the same birthday is greater than 1/2.

The word paradox is used here because the number 23 at first glance appears much too small, compared to the 366 possible birthdays. But let us have a closer look. Suppose we have n possible birthdays and m people in the room. If we put all the people in a line and everyone tells his or her birthday, we have a (more or less) random sequence of length m with dates out of the n possible ones. The probability that the first i + 1 dates in the sequence are all different is n n−1 n−i 1 i ⋅ ⋅⋅⋅ = 1 ⋅ (1 − ) ⋅ ⋅ ⋅ (1 − ) n n n n n Thus, the probability that all m birthdays are different is m−1

∏ (1 − i=0

i ) n

Have we made a mistake so far? Well, to view the considered sequence to be random assumes a uniform distribution, which might not be totally realistic. However, it is intuitively clear that we are on the safe side (if on certain days the probability is higher than average, then the probability for a match increases). Moreover, we are going to make the value even higher. Next, we use the above mentioned inequality (1 + x) ≤ ex . For the probability that all birthdays are different, this yields the following estimate: m−1

∏ (1 − i=0

m−1 m−1 i i ) ≤ ∏ e− n = e− ∑i=0 n i=0

i n

= e−

m(m−1) 2n

Thus, we will fall below the desired limit on average in the range of m = √2n ln 2. For n = 365 (or 366), this value is approximately 22.5, so for m = 23 the probability for all different birthdays is already smaller than 1/2. Experiments at birthday parties and in classes usually affirm this value.

Exercises 4.1. A hunter’s probability to hit his target is 1/2. What is the probability that he will have at least 3 hits out of 10 shots? 4.2. A family has four children. Assume that for every baby born the probability of being a girl is 0.5 and determine the probability that (a) the family has exactly one girl,

Exercises

� 63

(b) the first and second child both are boys, (c) at least two children are boys, (d) all children are female. 4.3. Let m, n ∈ ℕ and n < m. Alice and Bob independently choose a number from the set M = {1, 2, . . . , m}. What is the probability that the two numbers differ by at most n? To solve this question, determine the cardinality of the set {(a, b) | a, b ∈ M and |a − b| ≤ n} 4.4. We wish to sort a sequence of pairwise different numbers a = (a1 , . . . , an ) using quicksort. To this end, we choose a random pivot element ai and determine subsequences a′ = (ai1 , . . . , aik ) and a′′ = (aj1 , . . . , ajℓ ) such that –



ais < ai < ajt for all 1 ≤ s ≤ k and all 1 ≤ t ≤ ℓ, i1 < ⋅ ⋅ ⋅ < ik and j1 < ⋅ ⋅ ⋅ < jℓ and k + ℓ + 1 = n.

This can be accomplished using n − 1 comparisons. Then the sequences a′ and a′′ are sorted recursively, the results being b′ and b′′ , respectively. Now, (b′ , ai , b′′ ) is a sorting of a. The recursion terminates if n = 0. How many comparisons does quicksort need on average? 4.5. Let again a = (a1 , . . . , an ) be a sequence of pairwise different numbers. We want to find the kth largest element without first sorting the sequence. We proceed in a similar way to quicksort from Exercise 4.4. We randomly choose a pivot element p and again construct two subsequences of elements that are smaller (resp. larger) than p. If we count the number of elements in the first subsequence, we can decide whether p is the element we are looking for, or in which of the two lists we will have to search further for the desired element. This procedure is called quickselect. Show that the average number of comparisons Q(n) of quickselect is less than or equal to 2(1 + ln 2)n. Hint: Assume that the sequence a consists of the numbers 1, . . . , n and that we want to determine the position of element k. If π denotes the sequence of pivot elements, use the 0–1-valued random variables Xij (π) defined as “i will be compared with j”. Distinguish three cases depending on how k relates to i and j. 4.6. Let n ≥ 1 and Hn = ∑nk=1 k1 . Let a random variable X be defined by X : Ω → {1, . . . , n} with the Zipf distribution Pr[X = k] = (Hn ⋅ k)−1 , named after George Kingsley Zipf (1902–1950), who empirically found out that in natural language texts the kth most frequent word occurs with a probability proportional to 1/k. Compute the asymptotics of the expected value and the standard deviation of X.

64 � 4 Discrete probability

Summary Notions – – – – – –

(discrete) probability space uniform distribution probability Pr[A] random variable X expected value E[X] discrete density fX

– – – – – –

distribution FX independent random variables variance Var[X] Bernoulli experiment standard deviation σX convex function

Methods and results – Ω finite, uniformly distributed 󳨐⇒ Pr[A] =

– E[X] = ∑ω∈Ω X(ω) Pr[ω]

|A| |Ω|

– Ω finite, uniformly distributed 󳨐⇒ E[X] = (∑ω X(ω))/|Ω|

– Markov inequality: X ≥ 0, E[X] > 0, λ > 0 󳨐⇒ Pr[X ≥ λE[X]] ≤

– Linearity of the expected value: E[aX + bY ] = aE[X] + bE[Y ]

1 λ

– E[X] = ∑x x Pr[X = x] = ∑x x fX (x)

– X, Y independent 󳨐⇒ E[XY ] = E[X]E[Y ]

– Var[X] = E[(X − E[X])2 ] = E[X 2 ] − E[X]2 ≥ 0 – σX = √Var[X]

– Chebyshev’s inequality: For λ > 0, Pr[|X − E[X]| ≥ λ σX ] ≤

– X, Y independent 󳨐⇒ Var[X + Y ] = Var[X] + Var[Y ]

1 λ2

– Jensen’s inequality: f : ℝ → ℝ convex, λi ∈ [0, 1], ∑ki=1 λi = 1 󳨐⇒ f (∑ki=1 λi xi ) ≤ ∑ki=1 λi f (xi )

– Ω finite, f convex 󳨐⇒ f (E[X]) ≤ E[f (X)]

– Birthday paradox: m randomly chosen events from Ω with m ≥ √2|Ω| ln 2 Pr[two events equal] > 1/2.

󳨐⇒

5 Combinatorics The section begins with a brief introduction to enumerative combinatorics. Then, using the example of binomial coefficients, we will introduce the concept of a bijective proof : One can prove an identity of the form f (n) = g(n) by defining two sets F and G such that |F| = f (n) and |G| = g(n), and providing a bijection between F and G. The step from a function f (n) to a set F with |F| = f (n) is called combinatorial interpretation. Two sets are disjoint if their intersection is empty. The following connection between sets F and G is typical. Decompose the set F into classes, so F is the union of pairwise disjoint subsets Gk with |Gk | = gk (n). A bijection between F and ⋃k Gk is trivially given by the identity. Thus, F = ⋃k Gk yields f (n) = ∑k gk (n). A popular combinatorial interpretation is the urn model by Pólya (George Pólya, 1887–1985). In this model, balls are drawn from a bin (called the urn) and the question is in how many ways this can happen. There are several different modes to be considered here, resulting in different counting functions. One advantage of this model lies in the fact that it allows for different counting functions to be interpreted in a uniform way. However, not all functions that are interesting for us can be represented in this model, and, moreover, there are often better and more obvious interpretations. Therefore, we take a more general approach.

5.1 Enumerative combinatorics We write |A| = |B| if there is a bijection between A and B, and in this case we say that A and B have the same cardinality. If A is a finite, then |A| = |{0, . . . , n − 1}| for some n ∈ ℕ, and we simply write |A| = n. Thus, the cardinality of a finite set is a natural number. The set of all mappings from A to B is denoted by BA . This makes sense because a map f : A → B can be identified with an A-tuple (ba )a∈A , where ba = f (a) for all a ∈ A. If A and B are finite sets with |A| = n and |B| = m, then there are exactly mn mappings from A to B; for each of the n elements a ∈ A, there are m possible images f (a) ∈ B. We state 󵄨󵄨 󵄨 󵄨 A󵄨 |A| n 󵄨󵄨{ f : A → B | f is a mapping}󵄨󵄨󵄨 = 󵄨󵄨󵄨B 󵄨󵄨󵄨 = |B| = m . If both A and B are empty, then on the left-hand side we have the number 1, since for A = B = 0 the identity is the only mapping from A to B, and on the right hand side we have the expression 00 . So, in a natural way we obtain 00 = 1. Thus, x 0 = 1 is true for any number x. This is one of a whole range of useful conventions, e. g., an empty product is always 1, an empty sum is always 0, i. e., ∏k∈0 ak = 1 and ∑k∈0 ak = 0. The analogous fact from predicate logic is that a universally quantified statement ∀x ∈ 0: φ(x) over the empty set is always true, while ∃x ∈ 0: φ(x) is always false. https://doi.org/10.1515/9783111062556-005

66 � 5 Combinatorics Suppose |A| = |B| = n. How many bijections between A and B are there? The answer is n! (“n-factorial”), where the factorial is defined by n! = n ⋅ (n − 1) ⋅ ⋅ ⋅ 1. For n = 0, our convention that empty products are 1 yields 0! = 1. Thus, we claim 󵄨 󵄨󵄨 A 󵄨󵄨{ f ∈ B | f is bijective}󵄨󵄨󵄨 = n! This is true for n = 0. In general, for any n we have to count all tuples (bi )1≤i≤n with pairwise distinct bi ∈ B. For b1 , there are n possibilities; for b2 , there are n−1 possibilities left; and so on. Clearly, this results in n! possibilities. This approach can be generalized to arbitrary finite sets A and B with |A| = k and |B| = n: 󵄨󵄨 󵄨 A 󵄨󵄨{ f ∈ B | f is injective}󵄨󵄨󵄨 = n(n − 1) ⋅ ⋅ ⋅ (n − k + 1) n! for k ≤ n = (n − k)!

(5.1)

This is easy to see, too. Again we have to count the tuples (bi )1≤i≤n with pairwise distinct bi ∈ B. For b1 , there are n possibilities; for b2 , there are n − 1 possibilities left; and so on; finally, for bk , there are (n − k + 1) remaining possibilities. The percentage of bijections among all mappings from {1, . . . , n} to {1, . . . , n} decreases exponentially as n grows: Equation (3.1) yields n! n 1 ≤ ≤ en−1 nn en−1 and using Stirling’s formula (3.2) we get nn!n ∼ e−n √2πn. Here and in many other places, π is the mathematical constant defined by the ratio of a circle’s circumference and its diameter, and e is Euler’s number: π = 3.14159 26535 89793 23846 26433 83279 50288 41971 69399 37510 . . . e = 2.71828 18284 59045 23536 02874 71352 66249 77572 47093 69995 . . . Recall that power set of A is the set of all subsets of A and denoted by 2A . We can easily see 󵄨󵄨 A 󵄨󵄨 |A| 󵄨󵄨2 󵄨󵄨 = 2 If a set A has n elements, then there are 2n subsets of A. For all n ∈ ℕ, we have n < 2n . This observation is a special case of the set-theoretic fact that there can be no surjection of a set on its power set (see Exercise 5.1c), i. e., the power set is always “larger”. Therefore the set of all sets cannot exist, because, speaking a little sloppy, this would be the largest of all sets, but its power set would still be larger.

5.2 Binomial coefficients

� 67

5.2 Binomial coefficients n! repreIn Section 3.2 we already mentioned that the binomial coefficient (kn) = k!(n−k)! sents the number of k-element subsets of a set of n elements and this is its combinatorial interpretation. Next we extend this combinatorial interpretation to k ∈ ℤ and arbitrary sets A. By (Ak) we denote the set of k-element subsets of A,

A ( ) = {B ⊆ A | |B| = k} k Obviously, (Ak) = 0 if k < 0 or k > |A|. On the other hand, (A0 ) = {0} is true for every set A. Now let A be a finite set, and let |A| = n. We have |(A1 )| = n because we can identify the one-element subsets of A with the elements of A. Thus, there is a bijection between the A ), too: We only sets A and (A1 ). Moreover, there is a bijection between the sets (Ak) and (n−k A A have to map every subset B ∈ (k ) to its complement set A \ B ∈ (n−k ). A

A ) A \ B ∈ (n−k

B ∈ (Ak)

A )| for all k ∈ ℤ. The power set of A consists of 2n elements, This yields |(Ak)| = |(n−k and at the same time it is the disjoint union of all (Ak). Thus, we obtain without further reasoning

󵄨󵄨 A 󵄨󵄨 󵄨 󵄨 2n = ∑󵄨󵄨󵄨( )󵄨󵄨󵄨 󵄨󵄨 k 󵄨󵄨 k Since (Ak) and (kx ) both are standard notations, the following theorem simply has to be valid. We will formally prove it now: Theorem 5.1. Let A be a set of n elements. Then for all k ∈ ℤ, we have 󵄨󵄨 A 󵄨󵄨 n 󵄨󵄨 󵄨 󵄨󵄨( )󵄨󵄨󵄨 = ( ) 󵄨󵄨 k 󵄨󵄨 k Proof. The theorem is trivially true for k < 0 or k > n because then both terms are equal to 0. For 0 ≤ k ≤ n, there are n(n − 1) ⋅ ⋅ ⋅ (n − k + 1) sequences (a1 , . . . , ak ) of pairwise distinct ai . Note that, by convention, n(n − 1) ⋅ ⋅ ⋅ (n − k + 1) = 1 for k = 0. Two such sequences represent the same set, if they coincide up to a permutation of the indices. There are k! such permutations, which completes the proof.

68 � 5 Combinatorics Now, we further extend the domain of binomial coefficients (kx ) to complex numbers x and integers k. We begin with nonnegative integers in the denominator: x x(x − 1) ⋅ ⋅ ⋅ (x − k + 1) ( )= k k!

for x ∈ ℂ and k ∈ ℕ

Here, the product in the numerator is called the falling factorial, denoted by x k . Thus we obtain x xk ( )= k k!

where x k = x(x − 1) ⋅ ⋅ ⋅ (x − k + 1)

Note that for x = n ∈ ℕ this coincides with the usual definition (kn) =

n! . k!(n−k)!

For k > 0,

both the numerator and denominator of consist of k factors. For k = 0, x k is just the empty product, which by our convention equals 1, i. e., x 0 = (x0) = 1. If k and x are natural numbers and x < k, then the product x(x − 1) ⋅ ⋅ ⋅ (x − k + 1) contains the factor zero. Therefore, x k = (kx ) = 0 for x ∈ ℕ with 0 ≤ x < k. For all other x and k ≥ 0, there is no zero among the factors of the numerator, and consequently, x k ≠ 0 ≠ (kx ) for x ∉ ℕ and 0 ≤ k. For example, (1/10 ) < 0 and (1/10 ) > 0. In particular, both values are nonzero. 4 5 Finally, we extend the domain to all k ∈ ℤ and x ∈ ℂ by the definition (kx ) = 0, if k < 0 is a negative integer. In particular, (kx )

n n ( )=( ) k n−k

for n ∈ ℕ and k ∈ ℤ

For k ≥ 0, the following nice relation is obvious: −1 (−1) ⋅ (−2) ⋅ ⋅ ⋅ (−k) = (−1)k ( )= k! k The binomial coefficient (kx ) can be interpreted as a polynomial in x, having degree k and, if k ≥ 1, the zeros 0, . . . , k − 1. If k = 0, then (x0) is the constant polynomial with value 1. This directly leads us to the polynomial method. If we want to prove an identity for binomial coefficients, where for all occurring (mx ), we have m ≤ k, then this is an identity of polynomials in x (with coefficients in ℂ) of degree less than or equal to k. A theorem from algebra says that two different polynomials (with coefficients from ℂ) of degree less than or equal to k are equal if they coincide at k + 1 different input values x. Thus, If one can show the identity for at least k + 1 natural numbers x, then it is automatically valid for all x ∈ ℂ. This is our guiding theme for this chapter: Prove (if possible) identities using a combinatorial interpretation first, and then try to apply the polynomial method.

5.2 Binomial coefficients

� 69

This approach is extremely helpful because it leads to an understanding for identities. It also helps to avoid some induction proofs. These are often well suited for understanding identities, but they do not help finding identities or remembering them. Since binomial coefficients are defined for all k ∈ ℤ, we can often omit summation limits, thus making the formulas clearer and simplifying induction proofs. For the rest of this section, x and y are always complex numbers (or unknowns) and k, ℓ, m, n are always integers. Binomial coefficients (kn) are natural numbers for n, k ∈ ℕ, but this is not obvious n! . However, by Theorem 5.1, this is a trivial fact. from their representation as (kn) = k!(n−k)! The following identity is the basis for Pascal’s triangle (Blaise Pascal, 1623–1662) and perhaps the most important property of binomial coefficients (see Figure 5.1). Moreover, as an immediate consequence, we can see that all binomial coefficients (kn) for n, k ∈ ℤ are integer-valued.

x=8

1

8

7

28

21

15 56

=

20 70

=

1

k

35

35

=

2

15 56

=

6 28

5 =

1 7

1 8

6 7

1

5 21

4

k

=

3

4

=

10

1

8

6

6

=

1

10

=

1

4

5

1

k

x=7

1

3

1

k

x=6

1

1

k

x=5

3

2

k

x=4

1

1

k

x=3

1

k

x=1

x=2

1

k

x=0

0

x ( ) k

1

Figure 5.1: Pascal’s triangle.

Theorem 5.2 (Addition theorem). x x−1 x−1 ( )=( )+( ) k k k−1 Proof. The combinatorial interpretation (Theorem 5.1) directly yields the identity if x = n is a natural number: we can assume A = {1, . . . , n}. The set of k-element subsets of A can be split up into two classes. We have those sets containing n in the first class, and the others in the second. We apply Theorem 5.1: In the first class, there are as many subsets n−1). And in the second class, of A as there are (k − 1)-element subsets of A \ {n}, i. e., (k−1 n−1 there are exactly the k-element subsets of A \ {n}, i. e., ( k ). Since the sum must be (kn), this shows the identity for all x ∈ ℕ. The claim thus holds for infinitely many values. But

70 � 5 Combinatorics both sides of the equation consist of polynomials of degree k, so the polynomial method yields the claim for all x ∈ ℂ. As mentioned above, as an immediate consequence of the addition theorem we see that the binomial coefficients (kn) for n ∈ ℤ always are integers, which is not obvious directly from the fraction n⋅⋅⋅(n−k+1) . Another easy consequence of Theorem 5.2 is the k! binomial theorem. Here, ∑k denotes summing up over all k ∈ ℤ. The sum is well defined because in the considered sums almost all terms are 0. Theorem 5.3 (Binomial theorem). n (x + y)n = ∑ ( )x k yn−k k k Proof. We consider the product in the two indeterminates x and y, (x + y)n = ⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟ (x + y)(x + y) ⋅ ⋅ ⋅ (x + y) n factors

The term x k yn−k is obtained, whenever in k of the n factors we choose the summand x, and in the remaining n−k factors we choose y. Thus, any k-element subset of the n factors yields the term x k yn−k . By Theorem 5.1, there are exactly (kn) such subsets. Therefore, (kn) is the coefficient of x k yn−k . Another important identity is the trinomial revision. The name is chosen according to [22]. This identity is based on the fact that for x = k + ℓ + n and m = k + ℓ the product (mx )(mk) = (k+ℓ+n . The trinomial revision )(k+ℓ ) becomes the trinomial coefficient (k+ℓ+n)! k! ℓ! n! k+ℓ k enables us to simplify products of binomial coefficients: Theorem 5.4 (Trinomial revision). x m x x−k ( )( ) = ( )( ) m k k m−k

for all x ∈ ℂ, m, k ∈ ℤ

Proof. If m < 0 or m < k, both sides are 0, so the equation is obvious. Using the polynomial method, it is therefore sufficient to prove it for all cases with 0 ≤ k ≤ m ≤ n = x ∈ ℕ. Imagine we have n balls, of which we want to choose k and paint them red, m − k shall be chosen to be painted green, and the remaining n − m balls shall become blue. How many ways are there to choose, which of the balls will be painted in which color? We might first look for the subset of size m, which shall be colored red or green, and then decide which ones shall be red. Or we first choose k balls to be painted red, and then choose the green ones from the remaining n − k balls. If you want to transform an identity of the form fn = ∑k (kn)gk which defines fn ’s in terms of gk ’s into an identity which defines the gn ’s in terms of the fk ’s, then you can use the following trick. Consider the (n+1)×(n+1) matrices P and Q with entries Pij = (ji) and

5.2 Binomial coefficients �

71

Qij = (−1)i−j (ji), with indices from {0, . . . , n}. Both P and Q are lower triangular matrices. The matrix P appears as part of Pascal’s triangle. For the product R = PQ, we have i k Rij = ∑ Pik Qkj = ∑ ( )(−1)k−j ( ) k j k k In particular, Rij = 0 for i < j. For i ≥ j, the trinomial revision yields i i−j i i−j Rij = ( ) ∑ ( )(−1)k−j = ( ) ∑ ( )(−1)k j k k−j j k k 1 i = ( )(−1 + 1)i−j = { j 0

for i = j for i > j

Thus, R is the identity matrix. This yields Theorem 5.5. Theorem 5.5 (Binomial inversion). Let f0 , . . . , fn and g0 , . . . , gn be numbers such that fi = ∑k (ki )gk for all 0 ≤ i ≤ n. Then gn = ∑k (−1)n−k (kn)fk . Proof. For a matrix A, let AT be the transposed matrix of A, having the entry Aij at position (j, i). Using the above defined matrices, (f0 , . . . , fn ) = (g0 , . . . , gn ) ⋅ PT . Then, (g0 , . . . , gn ) = (f0 , . . . , fn ) ⋅ QT , which yields the assertion. The next example summarizes some important combinatorial interpretations. Example 5.6. Consider an urn containing n balls numbered from 1 to n. We draw k balls from the urn. In particular, if balls are not replaced, then k ≤ n. Then there are n! (a) nk = (n−k)! draws without replacement and with order, n (b) (k ) draws without replacement and without order, (c) nk draws with replacement and with order, and (d) (n+k−1 ) draws with replacement and without order. k Here, replacement means that each ball is noted after being drawn and then is put back into the urn again. With order means that we not only count which balls are drawn, but also the order in which they appear. For example, from the first formula we learn that there are 336 possible distributions of the top 3 places in a race with 8 participants, because 8 ⋅ 7 ⋅ 6 = 336. We have already dealt with the first three formulas under the keywords injections, k-element subsets and arbitrary mappings. The formula (d) can be seen as follows: A draw of balls with replacement and without order can be represented by values b1 , . . . , bn ∈ ℕ. The number bi indicates how often ball i was drawn. If k balls are drawn, then ∑i bi = k. Let a0 = 0 and ai+1 = ai + bi+1 + 1 for 0 ≤ i < n. Then we have 1 ≤ a1 < ⋅ ⋅ ⋅ < an−1 < an = n + k. In particular, {a1 , . . . , an−1 } is a selection of n − 1 elements from {1, . . . , n + k − 1}. But from the values ai , the values bi can be reconstructed. Thus, (n+k−1 ) = (n+k−1 ) yields the claim. ⬦ k n−1

72 � 5 Combinatorics The derivation in Example 5.6 (d) contains a frequently used method: The ai ’s are arranged according to a given order (in this case the usual order on natural numbers) although they were drawn in an arbitrary order. We will further refine this technique in the next part. According to an anecdote, the students in the class of Gauss should be busy and quiet for a while. Thus, the teacher told them to sum up the numbers from 1 up to 100. Gauss solved this problem immediately by writing down the following: (1 + 100) + (2 + 99) + ⋅ ⋅ ⋅ + (49 + 52) + (50 + 51) = 50 ⋅ 101 = 5050 Why do we mention this here? The reason is that the according well-known formula, which due to the above anecdote is often named after Gauss, is the following: ∑ k=

0≤k≤n

n+1 n(n + 1) =( ) 2 2

A simple proof by induction is possible, but boring. Let us try a combinatorial interpretation and consider the set A = {1, . . . , n + 1}. Then 󵄨󵄨 A 󵄨󵄨 󵄨󵄨 󵄨 󵄨 󵄨 󵄨󵄨( )󵄨󵄨󵄨 = ∑ 󵄨󵄨󵄨{{ j, k} | 1 ≤ j < k}󵄨󵄨󵄨 = ∑ (k − 1) = ∑ k 󵄨󵄨 2 󵄨󵄨 1≤k≤n+1 1≤k≤n+1 0≤k≤n This is Gauss in its purest form! What about the sum of squares? As soon as the result is known, it is again an easy exercise to prove it by induction. But what can be done, if you have forgotten this formula? It is a good idea to learn how to deduce it. As before, we show that the following identity is valid: n+1 󵄨 󵄨 ( ) = ∑ 󵄨󵄨󵄨{{ℓ, j, k} | 1 ≤ ℓ < j < k}󵄨󵄨󵄨 3 1≤k≤n+1 󵄨 󵄨 = ∑ 󵄨󵄨󵄨{{ℓ, j} | 1 ≤ ℓ < j < k}󵄨󵄨󵄨 1≤k≤n+1

=



1≤k≤n+1

k−1 k )= ∑ ( ) 2 2 0≤k≤n

(

Since 2 ⋅ (k2 ) = k 2 − k, this yields n+1 ) = ( ∑ k 2 ) − ( ∑ k) 3 1≤k≤n 1≤k≤n

2⋅(

Together with our knowledge of the Gaussian sum, we obtain n+1 n+1 2n3 + 3n2 + n )+( )= 6 3 2

∑ k2 = 2 ⋅ (

1≤k≤n

5.2 Binomial coefficients

� 73

This idea can be generalized. The subsets of A = {1, . . . , n + 1} of size m + 1 can be divided into classes according to their maximal element k. This yields 󵄨󵄨 {1, . . . , k − 1} 󵄨󵄨 n+1 󵄨 󵄨 )󵄨󵄨󵄨 ) = ∑ 󵄨󵄨󵄨( 󵄨󵄨 󵄨󵄨 m m+1 1≤k≤n+1

(

=



1≤k≤n+1

k−1 k )= ∑ ( ) m m 0≤k≤n

(

Thus we obtain the Theorem 5.7. Theorem 5.7 (Upper summation). For all m, n ∈ ℕ we have n+1 k )= ∑ ( ) m+1 m 0≤k≤n

(

Very similarly, we can prove the following identity, which holds for n ∈ ℤ and arbitrary x. Theorem 5.8 (Parallel summation). x+n+1 x+k )= ∑( ) n k k≤n

(

Proof. First, let x ∈ ℕ and n ∈ ℕ. The n-element subsets of A = {1, . . . , x + n + 1} can be arranged into classes according to the largest element from A, which does not (!) belong to the subset. Let B ∈ (An) and x+k+1 the largest element of A\B. Then we have x+k+1 ∈ ̸ B and x + k + 2, . . . , x + n + 1 ∈ B. This implies 0 ≤ k ≤ n and |B ∩ {1, . . . , x + k}| = k. Thus, the set B is uniquely determined by the value k and a k-element subset of {1, . . . , x + k}. This yields 󵄨󵄨 A 󵄨󵄨 󵄨󵄨 {1, . . . , x + k} 󵄨󵄨 x+k 󵄨󵄨 󵄨 󵄨 󵄨 )󵄨󵄨󵄨 = ∑ ( ) 󵄨󵄨( )󵄨󵄨󵄨 = ∑ 󵄨󵄨󵄨( 󵄨󵄨 k k 󵄨󵄨 n 󵄨󵄨 k≤n󵄨󵄨 k≤n By the polynomial method, the identity is also valid for all x ∈ ℂ and n ∈ ℕ. The extension to all n ∈ ℤ is trivial, because for n < 0 all terms become 0. The following equation is named after the French mathematician, chemist, and musician Alexandre-Théophile Vandermonde (1735–1796). Theorem 5.9 (Vandermonde’s identity). x+y x y ( ) = ∑ ( )( ) n k n − k k Proof. We first prove the identity for x, y ∈ ℕ by combinatorial interpretation and then apply the polynomial method again. Let X and Y be disjoint sets with |X| = x and |Y | = y.

74 � 5 Combinatorics On the left-hand side of the identity, we have the number of ways to select n elements from X ∪ Y , whereas on the right-hand side we count the possibilities to first choose k elements from X and then n − k from Y . Thus, again we choose n elements from X ∪ Y . Summation over k means that all ways to partition the n chosen elements into according subsets of X and Y are counted accordingly. X ∪Y

X n

Y k

n−k

If n is negative and x, y ∈ ℂ, then the terms on both sides are 0. Now we proceed to the case n ∈ ℕ. We will apply the polynomial method for two variables. For a fixed y ∈ ℕ, both sides are polynomials in x, having degree n. The combinatorial interpretation has shown that these two polynomials coincide for all x ∈ ℕ, especifically there are at least n + 1 places, where these polynomials of degree n coincide. Thus, the polynomials are equal and the above identity is true for all x ∈ ℂ and all y ∈ ℕ. Now let x ∈ ℂ be fixed, then on both sides there are polynomials in y of degree n, and again these polynomials coincide at n + 1 places at least. This proves the Vandermonde identity for all n ∈ ℤ and all x, y ∈ ℂ. A special case is the following identity for all m, n ∈ ℕ: m+n m n ) = ∑ ( )( ) m k m−k k

(

To provide a direct combinatorial interpretation, we consider the rectangular grid with corners at (0, 0) and (n, m) and the set M of all shortest paths in the grid leading from (0, 0) to (n, m). Each of these paths consists of m vertical steps of unit length and n horizontal steps of the same length, arranged in any order. If we denote the vertical steps by “0” and the horizontal ones by “1”, then each shortest path is uniquely represented by a 0–1-sequence of length m + n with exactly n ones. The number of such sequences is given by the number of ways to place n ones in m + n places, thus m+n ) n

|M| = (

Moreover, every path in the set M passes exactly through one of the points xk = (m−k, k) with 0 ≤ k ≤ m.

5.2 Binomial coefficients

xm

� 75

(n, m) xk

(0, 0)

x0

There are exactly (m−k+k ) = (mk) shortest paths from (0, 0) to xk and exactly k n ) shortest paths from x to (n, m). This yields the claim. (n−m+k+m−k ) = (m−k k m−k For x, y ∈ ℕ, we want to give another alternative proof for the Vandermonde identity using a very instructive technique. For this, we interpret the binomial theorem, Theorem 5.3, as an equality between two polynomials. Let Z be an indeterminate. Then x+y n )Z = (Z + 1)x+y = (Z + 1)x ⋅ (Z + 1)y n

∑( n

x y = (∑ ( )Z k ) ⋅ (∑ ( )Z ℓ ) k ℓ ℓ k x y = ∑(∑ ( )( ))Z n k n − k n k

Here, in the last step all coefficients of Z n have been combined. Now, the above equality of polynomials yields the Vandermonde identity by comparing the coefficients at Z n . Suppose we want to distribute up to t identical objects into ℓ bins. In how many ways can this be done? The answer is given by the next theorem, which was already presented in a different form in Example 5.6 (d). Theorem 5.10.

󵄨󵄨 󵄨󵄨 t+ℓ 󵄨󵄨 󵄨 ℓ ) 󵄨󵄨{(e1 , . . . , eℓ ) ∈ ℕ | ∑ ek ≤ t}󵄨󵄨󵄨 = ( 󵄨󵄨 󵄨󵄨 ℓ 1≤k≤ℓ

Proof. We imagine t + ℓ dots arranged in a horizontal row. We select ℓ of these dots and replace them by dashes. There are (t+ℓ ) possible selections. Each of these selections ℓ corresponds to exactly one ℓ-tuple (e1 , . . . , eℓ ) ∈ ℕℓ with ∑ℓk=1 ek ≤ t. ⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟ ∙ ∙ ⋅ ⋅ ⋅ ∙ ∙ | ⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟ ∙ ∙ ⋅ ⋅ ⋅ ∙ ∙ | ⋅ ⋅ ⋅ | ⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟ ∙ ∙ ⋅ ⋅ ⋅ ∙ ∙ | ⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟ ∙ ∙ ⋅⋅⋅ ∙ ∙ e1 dots e2 dots eℓ dots overhang ⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟ t dots and ℓ dashes

We first find e1 points in the line, before the first dash appears. Then, e2 points follow up to the second dash, and so on. After the ℓth dash, there can be an overhang of dots left to reach the desired total of t dots. Thus, the solutions of the inequality and the selections of dots and dashes can be mapped to each other bijectively.

76 � 5 Combinatorics But now, back to the binomial theorem, Theorem 5.3. We give a slightly modified combinatorial proof and state the theorem in full generality. Theorem 5.11 (General binomial theorem). Let r, x, y ∈ ℂ. Then r (x + y)r = ∑ ( )x k yr−k k k

for |x| < |y|

or r ∈ ℕ

Proof. First, let r, x, y ∈ ℕ. Let R, X, Y be sets such that |R| = r, |X| = x, |Y | = y, and X ∩ Y = 0. Then (x + y)r is the number of mappings from R into the disjoint union of X and Y . Each subset K of R defines a class of mappings, FK = { f : R → X ∪ Y | f −1 (X) = K} Each mapping f ∈ FK can be decomposed into a mapping fX : K → X and a mapping fY : R \ K → Y . Let |K| = k. Then we have x k possibilities for fX and yr−k possibilities for fY . Thus, |FK | = x k yr−k . Since the number of subsets K with |K| = k is (kr ), this yields the claim for r, x, y ∈ ℕ.

R

f:

fX

X

fY

Y

K R\K

Now, once more we apply the polynomial method. As in Theorem 5.9, we have two variables, so we have to take two steps again. In a first step, we use the polynomial method to extend from r, x, y ∈ ℕ to r, x ∈ ℕ, y ∈ ℂ. The same argument then leads us from r, x ∈ ℕ, y ∈ ℂ to r ∈ ℕ, x, y ∈ ℂ. At this point, our standard approach does not help anymore. For r ∈ ℂ \ ℕ, the sum ∑k (kr )x k yr−k is an infinite series. Thus, we need a separate consideration of the case |x| < |y| for r ∈ ℂ. We let z = xy and f (z) = (1 + z)r . Then we have |z| < 1 and it is sufficient to show r f (z) = ∑ ( )zk k k The kth derivative of f is

5.2 Binomial coefficients



77

f (k) (z) = r k (1 + z)r−k Thus, f (k) (0) = r k , and the Taylor series of f is the following: f (k) (0) k r z = ∑ ( )zk k! k k≥0 k ∑

The series is absolutely convergent for |z| < 1 because |(kr )| as a function of k is bounded by a polynomial in k of degree ⌊|r|⌋. This can be seen directly from the definition of the binomial coefficients by canceling all factors that are smaller than 1. The absolute convergence yields f (z) = ∑k (kr )zk . In the special case r = −1, x = −z and y = 1, Theorem 5.11 yields the limit of the geometric series, ∑ zk =

k≥0

1 1−z

for |z| < 1

An integral estimate shows that (k+z )k −z for k → ∞ converges to a constant (by the k 1 way defining z! for all complex numbers z). This implies (kr ) in 𝒪(k −1−r ). Thus, using the Leibniz criterion (Gottfried Wilhelm Leibniz, 1646–1716) on alternating series, the series 1 ). ∑k (kr ) converges for every real number r > −1. In particular, (1 + 1) 2 = √2 = ∑k (1/2 k If we encounter the nth power of more than two summands, we may use the next theorem. Theorem 5.12 (Multinomial theorem). Let d ≥ 1. Then (x1 + ⋅ ⋅ ⋅ + xd )n =



ki ≥0, k1 +⋅⋅⋅+kd

n! k k x 1 ⋅ ⋅ ⋅ xd d k ! ⋅ ⋅ ⋅ kd ! 1 =n 1

Proof. We can argue via a combinatorial interpretation as we did in the case of the binomial theorem, Theorem 5.11; however, we have to use nk1 (n − k1 )k2 ⋅ ⋅ ⋅ kd ! n! n n − k1 k = = ( )( ) ⋅ ⋅ ⋅ ( d) k1 ! ⋅ ⋅ ⋅ kd ! k1 ! ⋅ ⋅ ⋅ kd ! k1 k2 kd A proof by induction on n is also possible, but we already know that from the proof of Theorem 5.3. This time we will perform an induction on the value of d. For d = 1, the claim is obvious. For d > 1, we define y = x2 + ⋅ ⋅ ⋅ + xd and use the binomial theorem, Theorem 5.11, to obtain n k (x1 + ⋅ ⋅ ⋅ + xd )n = (x1 + y)n = ∑ ( )x1 1 yn−k1 k1 k 1

Now, induction on d yields

78 � 5 Combinatorics n k (x1 + ⋅ ⋅ ⋅ + xd )n = ∑ ( )x1 1 ( ∑ k 1 k k ≥0, k +⋅⋅⋅+k 1

i

2

d =n−k1

(n − k1 )! k2 k x ⋅ ⋅ ⋅ xd d ) k2 ! ⋅ ⋅ ⋅ kd ! 2

n! k x1 1 ( ∑ k !(n − k )! 1 1 k ≥0, k +⋅⋅⋅+k 0≤k ≤n

= ∑

i

1

=



ki ≥0, k1 +⋅⋅⋅+kd

2

n! k k x1 1 ⋅ ⋅ ⋅ xd d k ! ⋅ ⋅ ⋅ k ! d =n 1

d =n−k1

(n − k1 )! k2 k x ⋅ ⋅ ⋅ xd d ) k2 ! ⋅ ⋅ ⋅ kd ! 2

For d, ki , n ∈ ℕ with k1 +⋅ ⋅ ⋅+kd = n, the multinomial coefficient is defined as follows: n! n )= k1 , . . . , kd k1 ! ⋅ ⋅ ⋅ kd !

(

This is the number of possibilities to partition an n-element set into d disjoint classes such that the ith class Ci contains exactly ki elements. For a proof, let s be the number of such decompositions. We denote each decomposition by the according sequence (C1 , . . . , Cd ). For each i, there are ki ! different orders of the elements in Ci . If all the classes are ordered, what we get is an arbitrary permutation of the n elements. On the other hand, from a permutation π we can reconstruct the decomposition by successively forming d blocks in (π(1), . . . , π(n)) from left to right, such that the ith block contains exactly ki elements. Now ignoring the order of the elements in block i, this yields the class Ci . n ). Thus, we showed that s ⋅ k1 ! ⋅ ⋅ ⋅ kd ! = n!, yielding s = (k ,...,k 1

d

Example 5.13. Let n = 4, d = 3, and (k1 , k2 , k3 ) = (1, 1, 2). If we interpret the n-element set {1, 2, 3, 4} as positions of a word and give positions of class 1 the label a, positions of 4 ) = 12 counts the following words: class 2 label b, and positions of class 3 label c, then (1,1,2 abcc acbc accb

bacc bcac bcca

cabc cacb cbac

cbca ccab ccba

which provides another interpretation of multinomial coefficients.



5.3 Average case analysis of bubble sort The term bubble sort usually appears in connection with a simple sorting method, based on local transpositions. Let π = (π1 , . . . , πn ) be a sequence of numbers. These can be sorted by repeatedly passing through the sequence from left to right and swapping elements πi and πi+1 , whenever the condition πi > πi+1 is satisfied. Of course, each such swap results in a change of π. The procedure terminates as soon as a run without any swap operation is done. This is an advantage of bubble sort: The final run yields a verification that the sequence is actually sorted. It is rather clear that bubble sort is very well suited for sorting sequences that are likely to be nearly sorted before.

5.4 Inclusion and exclusion

� 79

We want to consider the time needed to perform this sorting algorithm. The first run always brings the largest number to the final position, after the second run the last two positions carry the largest two numbers correctly, and so on. After n runs we are definitely done, and the cost in every iteration is at most n swappings. Therefore, bubble sort is a quadratic procedure, and you would have to try hard to implement it with a runtime not in 𝒪(n2 ). There are numerous suggestions for optimizing bubble sort, but do they manage to have a runtime in o(n2 )? The answer is a very clear No. Even arbitrarily optimized bubble sort variants on average are still quadratic algorithms: As a measure we will consider the number of inversions of a permutation π = (π1 , . . . , πn ). This is defined by the number of pairs (i, j) with i < j and πj < πi . Theorem 5.14. The number of inversions of a permutation π = (π1 , . . . , πn ) on average is 1 n ⋅( ) 2 2 Proof. If n ≤ 1, the claim is trivially true. So now, let n ≥ 2. For π = (π1 , . . . , πn ), we define a permutation π by π = (πn , . . . , π1 ). The mapping π 󳨃→ π defines an involution (i. e., π = π) without fixed points. This partitions the set of permutations into classes of the form {π, π}, each class consisting of exactly two elements. Summing up the number of inversions within one such class {π, π} always yields exactly (n2 ). Whenever we speak of bubble sort, we may always assume that in every step at most one inversion is eliminated. The average number of inversions therefore underestimates the runtime of any realistic bubble sort implementation. Corollary 5.15. Bubble sort on average and in the worst case requires Θ(n2 ) comparisons to sort a sequence of n items. The claim of Corollary 5.15 holds for any distribution of the inputs, provided the sequences π and π appear with the same probability, or it is decided in each case at random whether to sort the sequence from left to right or in the opposite direction.

5.4 Inclusion and exclusion If A and B are disjoint sets, then |A ∪ B| = |A| + |B|. More generally, for arbitrary not necessarily disjoint sets A and B we have the formula |A ∪ B| = |A| + |B| − |A ∩ B| because in an enumeration of the elements from A and from B the elements in the intersection of A and B each appeared twice. Thus, one occurrence of these has to be subtracted again. For three finite sets A, B, C, the following formula is valid: |A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |B ∩ C| − |A ∩ C| + |A ∩ B ∩ C|

80 � 5 Combinatorics This can be seen directly from the following Venn diagram (John Venn, 1834–1923) of three sets A, B, C:

A B

C

Example 5.16. How many numbers between 1 and 1000 are divisible by 3, 5, or 8? Let A, B, and C, resp., be the sets of numbers between 1 and 1000 that are divisible by 3, 5, and 8, resp. Then 1000 ⌋ = 66 15 1000 |B ∩ C| = ⌊ ⌋ = 25 40 1000 |A ∩ C| = ⌊ ⌋ = 41 24

1000 ⌋ = 333 3 1000 |B| = ⌊ ⌋ = 200 5 1000 |C| = ⌊ ⌋ = 125 8

|A ∩ B| = ⌊

|A| = ⌊

|A ∩ B ∩ C| = ⌊

1000 ⌋=8 120

This yields |A ∪ B ∪ C| = 333 + 200 + 125 − 66 − 25 − 41 + 8 = 534.



The principle of inclusion and exclusion generalizes this formula and allows exact counting of the elements in a union of sets. Theorem 5.17 (Inclusion–exclusion principle). Let A1 , . . . , An be finite sets. Then |A1 ∪ ⋅ ⋅ ⋅ ∪ An | = ∑ (−1)k+1 k≥1



|Ar1 ∩ ⋅ ⋅ ⋅ ∩ Ark |

1≤r1 0 and pi pairwise distinct primes. Then φ(n), which counts the numbers between 1 and n that are coprime to n, safisfies φ(n) = n(1 −

1 1 1 )(1 − ) ⋅ ⋅ ⋅ (1 − ) p1 p2 pr

For n ≤ 1, this is clear. Now let n ≥ 2. For every divisor d ∈ ℕ of n, among the numbers between 1 and n there are dn which are divisible by d. Let A = {1, . . . , n}, and for i = 1, . . . , r let Ai = {x ∈ A | pi divides x}. Then, for φ(n) = |A \ (A1 ∪ ⋅ ⋅ ⋅ ∪ Ar )|, we obtain r

󵄨󵄨 󵄨 k 󵄨󵄨A \ (A1 ∪ ⋅ ⋅ ⋅ ∪ Ar )󵄨󵄨󵄨 = |A| + ∑ (−1) k=1

r

= n + ∑ (−1)k k=1

|Ar1 ∩ ⋅ ⋅ ⋅ ∩ Ark |



1≤r1 n. Furthermore, we have P(n) = p(n, k) for k ≤ 1

96 � 5 Combinatorics Theorem 5.38 (Recursion formula for lower partition numbers). Let k ≥ 1. Then p(n, k) = p(n, n) + ∑ p(n − j, j) j≥k

Proof. The equation is obvious for n ≤ 0. So let n, k ≥ 1. There is one partition with exactly one summand. Consider now a partition with more than one summand, then its smallest summand has a value j satisfying k ≤ j ≤ ⌊n/2⌋. If we omit j in such decompositions, this yields the contribution p(n − j, j). Theorem 5.38 applied to n ∈ ℕ and k ≥ 1 yields the equation p(n, k) = 1+∑⌊n/2⌋ p(n− j=k j, j) because p(n, n) = 1 and p(n − j, j) = 0 for j > ⌊n/2⌋. In the later chapter on generating functions, we will show log P(n) ∈ Θ(√n). Therefore, the growth of the partition numbers P(n) cannot be limited by any polynomial, however, they grow significantly slower than, e. g., 2n .

5.9 Catalan numbers Catalan numbers (Eugène Charles Catalan, 1814–1894) occur in connection with many combinatorial problems. In this section, we will deal with balanced parentheses, Dyck words, and binary trees. The nth Catalan number Cn is defined as follows: Cn = This can be rearranged to Cn =

1 2n ( ) n+1 n

1 2n+1 ( ). 2n+1 n

We investigated the growth of the binomial

coefficient (2n ) in Section 3.2. By Equation (3.4), we have n Stirling’s formula, obtain Cn ∼

4n 2n(n+1)

≤ Cn ≤ 4n and, using

4n n ⋅ √πn

The following table shows some exemplary values of Catalan numbers: n Cn

0 1

1 1

2 2

3 5

4 14

5 42

6 132

... ...

10 16796

... ...

20 6564120420

5.10 Dyck words Dyck words1 describe well-balanced brackets. For example, ()((())()) is well-balanced, while ())()((()) is not. For a better readability, we will use the letter a for “opening 1 Ritter Walther Franz Anton von Dyck (1856–1934) was the first Rektor of today’s Technical University of Munich, when in 1903 TUM received a rectorate constitution.

5.10 Dyck words

� 97

bracket” and b for “closing bracket”. Thus, abaaabbabb is a Dyck word, and abbabaaabb is not. Let the set of strings or words over a, b be denoted by {a, b}∗ . For w ∈ {a, b}∗ , let |w| be its length, |w|a the number of occurrences of the letter a in w, and |w|b the number of occurrences of b. In particular, we always have |w| = |w|a + |w|b . We write u ≤ w, if u is a prefix of w, i. e., if uv = w for a word v ∈ {a, b}∗ . Here, uv is the resulting word if the words u and v are concatenated. The Dyck words w ∈ {a, b}∗ can be defined by the following two conditions: (a) |w|a = |w|b , (b) |u|a ≥ |u|b for all prefixes u of w. In this section, Dn denotes the set of all Dyck words of length 2n. Thus, Dn = {w ∈ {a, b}2n | |w|a = n, ∀ u ≤ w: |u|a ≥ |u|b } Theorem 5.39. The number of Dyck words of length n equals Cn , i. e., |Dn | =

1 2n ( ) n+1 n

Proof. Let En = Dn b be the set of words wb, where w is a Dyck word of length 2n. Clearly, |En | = |Dn |. Moreover, let Wn be the set of words of length 2n + 1 with exactly n + 1 occurrences of the letter b. Then 2n + 1 2n + 1 2n ( ) )= n+1 n n+1

|Wn | = (

It remains to show that (2n+1)|En | = |Wn |. A word from Wn can be visualized in a kind of up and down walk: the letter a results in one step up and b is visualized by one step down. If we start on the ground level, i. e., level 0, we end up on level −1. Words from En are characterized by the fact that before the final step the line always runs on or above the zero line. In general, this is not the case. For example, the word w = abbabaaabbb ∈ W5 corresponds to the following graph: Level 2 1 w

0 −1

a

b u

b

a

b

a

a

a v

b

b

b

98 � 5 Combinatorics Now, we consider a word w from Wn and look at its shortest prefix u that reaches the minimum level. If we write the word w as uv, then the cyclic permutation vu is a word from En . In our example, u = abb and vu = abaaabbbabb ∈ E5 . It is easy to see that abb is the only prefix leading to a word from E5 in this way. This observation is applicable for any word in Wn . First note that, clearly, u is nonempty and ends on a b. Suppose there are two prefixes u1 and u1 u2 of w = u1 u2 v ∈ Wn such that 0 < |u1 | < |u1 u2 | ≤ |w| and u2 vu1 as well as vu1 u2 are elements of En . Consider level ℓ reached after the prefix u2 , if we start at level 0. Since u2 vu1 ∈ En , we conclude ℓ ≥ 0 because words from En never go below level 0 before their final character. But from vu1 u2 ∈ En we get ℓ < 0 because the level of vu1 u2 becomes negative after u2 and has never been negative before. This is a contradiction, so each word in En corresponds exactly to 2n + 1 cyclic permutations in Wn , and each word from Wn can be uniquely assigned to one word from En by a cyclic permutation. Therefore, (2n + 1)|En | = |Wn | as desired.

5.11 Binary trees If you look for a name in a phone book, the typical procedure will be as follows. You open the book somewhere in the middle, and depending on the letter given on the top of the page decide whether to continue your search in the part before this page or in the part behind this page. This is the first step of a general procedure called binary search. The idea behind this is to design for a given linearly ordered set a data structure, which supports fast lookup, insertion, and deletion. Binary trees are very well suited as basic structure here. Let us first define saturated binary trees inductively. A single vertex v defines a saturated binary tree with the vertex set {v}. Here, the vertex v is at the same time root and leaf . The height of the tree is 0, and the set of interior vertices is empty. Now let B1 and B2 be saturated binary trees with vertex sets V1 and V2 , and assume V1 ∩ V2 = 0. Let v be a new vertex, v ∉ V1 ∪ V2 . Then we define a saturated binary tree B with vertex set {v} ∪ V1 ∪ V2 as follows. The root of B is v, which has the root of B1 as its left child, and the root of B2 as its right child. The set of leaves of B is the union of the according sets of leaves of B1 and B2 . Especially, v is not a leaf. The set of interior vertices of B consists of v and the interior vertices of B1 and B2 . If the height of Bi is hi for i = 1, 2, then B’s height is max{h1 , h2 } + 1. Thus, in a saturated binary tree there is exactly one root, each interior vertex has exactly two children, but the leave vertices are childless. If there are n leaves, then there are exactly n − 1 interior vertices. This follows by induction using the observation that 1 + (n − 1) + (m − 1) = (n + m) − 1. Therefore, the number of vertices of a saturated binary tree with n interior vertices is 2n + 1 and in particular odd. We obtain a general binary tree by deleting all leaves from a saturated binary tree. Thus, the height is decreased by 1, and each vertex now can have 0, 1, or 2 children.

5.11 Binary trees

� 99

Therefore, the differences are, first, that a tree can be empty and, second, that there may be vertices with just one child. We take the original root (if it survived) and again differentiate between interior vertices (having one or two children) and leaves (vertices without children).

saturated binary tree

general binary tree

From a given binary tree, the saturated binary tree can be reconstructed by adding leaves accordingly. Thus, there is a natural bijection between the set of binary trees with n vertices and the set of saturated binary trees with 2n + 1 vertices. Their number, just like the number of Dyck words, is described by the Catalan numbers. Theorem 5.40. The number of binary trees with n vertices (or the number of saturated 1 2n binary trees with n interior vertices) is Cn = n+1 ( n ). Proof. By our preliminary remarks, it suffices to count the number of saturated binary trees with 2n + 1 vertices. We code these trees by words of length 2n + 1 over an alphabet of two letters. Informally, we perform a depth-first search from left to right: If an interior vertex is visited for the first time in this depth-first search, then we write the letter a, on leaves we write the letter b. Thus, overall we write n times an a and n + 1 times b. Before visiting the last leaf, at any time we visited at least as many interior vertices as leaves. Therefore, using the notation from the last section, the written word is a string from En = Dn b. The saturated binary tree shown in the drawing above corresponds to the word abaaabbbabb. Thus, by Theorem 5.39, it is sufficient to find a bijection between the set ℬn of saturated binary trees with n interior vertices and the set En . For this, we formalize the depth-first search by a mapping code : ℬn → {a, b}2n+1 . We define the mapping inductively as follows: For n = 0, let code(B) = b, where B is the only tree in ℬ0 . Then, let B ∈ ℬn with n > 0 and v the root of B. Let L be the left subtree of v and R the right one. Then we define code(B) = a ⋅ code(L) ⋅ code(R) By induction, code(L) and code(R) are Dyck words followed by a b, so code(B) is such a word as well. Therefore, code(B) is an element of En . Also code(B) has an a as its first letter, so a ⋅ code(L) is the shortest nonempty prefix which is a Dyck word. Thus, B can be reconstructed, if code(B) is known. This yields injectivity of the mapping code.

100 � 5 Combinatorics Now, let us consider an arbitrary word w ∈ En , then w has exactly one decomposition w = aubvb such that u and v are Dyck words. By induction, ub = code(L) and vb = code(R) for saturated binary trees L and R, and consequently w = code(B) for a tree B ∈ ℬn . Thus, the mapping is also surjective, and therefore it is a bijection. From the construction rules for binary trees we obtain the following corollary. Corollary 5.41. The Catalan numbers satisfy the following rule: C0 = 1,

Cn+1 = ∑ Ck Cn−k k

for n ∈ ℕ

5.12 Expected height of binary search trees In this section we will continue our investigation of the binary search technique. We consider the following question: How long do we have to search in a randomly generated binary tree to find a specific entry? Here, “randomly generated” means that the elements have been inserted into the search tree in an arbitrary order and that every possible order has the same probability. The time needed for a search corresponds to the length of the search path. It will turn out that on average we will only need logarithmic time (in the number of elements). There are various approaches in which insertion of the elements is accomplished in a not too naive way, thus achieving this complexity even in the worst case. However, the following estimate will show that any extra efforts at insertion time are unnecessary, if the data elements appear in random order. If π is a permutation of the elements {1, . . . , n}, we also use the notation π = (π(1), π(2), . . . , π(n)). This should not be confused with the cycle notation [. . . ] in Section 5.6.2. For I ⊆ {1, . . . , n}, let BI (π) be the binary search tree that arises if the elements i ∈ I are inserted into the initially empty tree, if their order is given by π (from left to right). In case I = {1, . . . , n}, we also write B(π) instead of B{1,...,n} (π). Let us consider an example: For n = 7, π = (3, 2, 6, 1, 5, 7, 4), and I = {2, 3, 4, 5, 7}, the tree B{2,3,4,5,7} (π) will be the result if the elements 3, 2, 5, 7, and 4 in this order are inserted into a previously empty search tree. Note that π −1 (3) < π −1 (2) < π −1 (5) < π −1 (7) < π −1 (4). 3

3 5

2 4

6

2 7

5

1 4

B{2,3,4,5,7} (π)

B(π)

7

5.12 Expected height of binary search trees

� 101

We define the following random variables: Ri (π) = “the root of B(π) is i”

XI (π) = “the height of BI (π)” YI (π) = 2XI (π)

Our goal now is to estimate the expected value E[Xn ], where we use Xn to denote X{1,...,n} . We assume the uniform distribution on the set of all permutations. It turns out that it is smarter and easier to begin the consideration with E[Yn ]. First, we show that E[Yn ] = E[2Xn ] can be estimated by a polynomial of degree 3. Obviously, E[Y1 ] = 1. Now, let n ≥ 2. From n

Yn (π) = 2 ∑ Ri (π) ⋅ max{Y{1,...,i−1} (π), Y{i+1,...,n} (π)} i=1

we obtain n

E[Yn ] = 2 ∑ E[Ri ⋅ max{Y{1,...,i−1} , Y{i+1,...,n} }] i=1

Note that Ri (π) = 1 if π(1) = i, and Ri (π) = 0, otherwise. For i ∈ ̸ I, the random variables Ri and YI are independent. Thus, n

E[Yn ] = 2 ∑ E[Ri ] ⋅ E[max{Y{1,...,i−1} , Y{i+1,...,n} }] i=1

We have E[Ri ] =

1 n

and max{YI , YJ } ≤ YI + YJ and therefore E[Yn ] ≤

2 n ∑(E[Y{1,...,i−1} ] + E[Y{i+1,...,n} ]) n i=1

The linearity of expected values, together with the property E[YI ] = E[Y{1,...,|I|} ] = E[Y|I| ] yield E[Yn ] ≤

4 n 4 n−1 ∑ E[Yi−1 ] = ∑ E[Yi ], n i=1 n i=0

because each term is counted twice. Next, we will show that E[Yn ] ≤ 41 (n+3 ). For n = 1, 3 we have E[Y1 ] = 1 = 41 (43). So, from now on, let n ≥ 2. With E[Y0 ] = 0, we obtain

102 � 5 Combinatorics

E[Yn ] ≤

=

4 n−1 4 n−1 1 i + 3 ) ∑ E[Yi ] ≤ ∑ ( n i=0 n i=0 4 3

(by induction)

1 n−1 i + 3 1 n+3 1 n+3 )= ⋅( )= ⋅( ) ∑( n i=0 3 n 4 4 3

The second to last equality follows using Theorem 5.7 on the upper summation. So, now we have an estimate for E[Yn ], and we will apply Jensen’s inequality, Corollary 4.6, to the convex function f : x 󳨃→ 2x in order to obtain E[Xn ]: 2E[Xn ] ≤ E[2Xn ] = E[Yn ] ≤

1 n+3 ( ) ≤ cn3 + c 4 3

for an appropriate constant c ∈ ℝ. This yields Theorem 5.42 about the average height Xn of binary search trees with n vertices. Theorem 5.42. E[Xn ] ≤ 3 log2 n + 𝒪(1) ∈ 𝒪(log n)

Exercises 5.1. (a) (b) (c)

Let A, B and C be arbitrary sets and AB the set of mappings from B to A. Show that there is a bijective mapping between the sets C (A×B) and (C B )A . Show that there is a bijective mapping between C A∪B and C A × C B if A ∩ B = 0. Show that there is no bijective mapping between the sets A and 2A because there are no surjections from A to 2A .

5.2. How many 9-digit numbers are there, where each digit from 0 to 9 occurs at most once, and 0 occurs at least once? 5.3. A committee of 8 persons shall be chosen from a set of 15 women and 12 men. (a) How many possibilities are there if the commission must have an equal number of men and women? (b) How many possibilities are there if the commission shall have at least two men? (c) How many possibilities are there if the commission is supposed to have more men than women? 5.4. The game Carcassone uses quadratic cards. Each of the four sides corresponds to a road (r), a walled town (w), or a meadow (m). How many possible patterns are there? Here, a pattern means, for example, that two opposite sides show roads and the other two meadows. 5.5. Show that any nonempty finite set has as many subsets with an even number of elements as it has subsets of odd size. 5.6. Prove the following equalities:

Exercises

(a)

� 103

n k n n ∑ ( )( ) = ( ) ⋅ 2n−m m k m k=m

n k (b) ∑ ( )( )ℓ = n ⋅ 3n−1 k ℓ k,ℓ (c)

n n+i ) = 6n ∑ ∑ ( )( i j i j

m−k m+k 2m + 1 (d) ∑ ( )( )=( ) n n 2n + 1 k (e)

m m+1 n k ) ∑ i ) = (n + 1)m+1 − (n + 1) ∑ (( k i=1 k=1

5.7. The Fibonacci numbers are defined by F0 = 0, F1 = 1, and Fn = Fn−1 + Fn−2 for n ≥ 2. Prove for all n ∈ ℕ: n−k (a) Fn+1 = ∑ ( ) k k≤n n (b) F2n = ∑ ( )Fi i i (c)

n F3n = ∑ ( )2i Fi i i

n (d) 0 = ∑ ( )(−1)i Fn+i i i 5.8. Let n ≥ 3. Suppose G(3) (n) denotes the number of all subsets A ⊆ {1, . . . , n} such that |A| = 3 and ∑a∈A a is even. Develop a formula for G(3) (n). 5.9. Prove the following equalities: n k n+1 } (a) ∑ ( ){ } = { k m m+1 k

n n+1 (b) ∑ [ ]k = [ ] k 2 k (c)

n k n+1 ] ∑ [ ]( ) = [ k m m +1 k

5.10. One hundred Smurfs have been captured by the king, because he was jealous of their wisdom. The king has a cabinet with 100 drawers numbered 1 through 100, in which he puts the Smurfs’ ID cards, exactly one in each drawer. The Smurfs get one last chance to be released from captivity. The king explains his game: In a random order, the Smurfs, one after another, will be brought into the room with the drawer cabinet. None of them will know how many of them had their turn before. Once in the room they are allowed to open 50 drawers one at a time and look inside, but they are not allowed to touch the ID cards. Then the drawers are closed again, and everything looks completely unchanged.

104 � 5 Combinatorics If a Smurf has seen his ID, he is sent back to his cell and has to wait there. Otherwise the game is over, and the king explains that even if only one of the Smurfs fails, all of them will be locked in their cells forever. However, the king promises to release all the Smurfs if each of them saw his ID card in the cabinet during his visit to the room. His calculation says that the chance of a Smurf to find his ID is only 1/2, no matter what his strategy is. Here, the king is right. However, his conclusion that the probability for them to be set free is only 2−100 is wrong. A little careless, he allows the Smurfs to briefly discuss the situation one last time. After that, each Smurf is placed in solitary confinement and all communication between them is prevented. The game begins. (a) Show that the clever Smurfs have a strategy to be set free with a probability that is greater than 31 %. (b) Show that there is no strategy for the Smurfs to be set free with a probability of 32 % or more. Hint: Consider the following game, which obviously is not more difficult for the Smurfs than that described in the exercise. We may assume the Smurfs to be numbered from 1 to 100. In the beginning, all the Smurfs are in the room and Smurf 1 starts opening drawers, which he does until he finds his ID card. The opened drawers are not closed again. The next one is the Smurf with the smallest number, whose ID card has not yet appeared in an open drawer. He also opens drawers (without closing them again) until he finds his ID card. The Smurfs proceed in this fashion, until all drawers are opened. Any of them can look into all the open drawers at any time. Finally, the Smurfs are released if none of them has opened more than 50 drawers. 5.11. We consider the following guessing game for two players Alice and Bob: The two players agree on two numbers n, r ∈ ℕ. Then Alice chooses an arbitrary set R ⊆ {1, . . . , n}. Bob’s task is to determine R, for which he may ask up to r questions of the form “Is R ∩ M = 0?” for an arbitrary set M ⊆ {1, . . . , n}. Alice is supposed to give the true answer on each question. After at most r questions, Bob specifies a set, which he claims to be R, and he wins, if his claim is correct. Example: Let (n, r) = (5, 4). The game may proceed as follows: Bob:

Is R ∩ {1, 2, 3} = 0?

Alice:

No.

Bob:

Is R ∩ {4, 5} = 0?

Alice:

Yes.

Bob:

Is R ∩ {1, 2} = 0?

Alice:

Yes.

Bob:

R equals {3}!

Thus, in this example Bob has won after only three questions. (a) Give a winning strategy for Bob using n questions.

Exercises

(b) (c)

� 105

Show that this is optimal, i. e., for every strategy there is a set R such that the strategy needs more than n − 1 questions if Alice has chosen R. Is the guessing game fair, if r = n − 1?

5.12. Prove the following equalities: 2n 2n (a) Cn = ( ) − ( ) n n+1 (b)

Cn =

n 2 1 ∑( ) n+1 k k

2(2n + 1) C n+2 n 5.13. Show that there are exactly Cn−2 possibilities to divide a regular polygon with n ≥ 3 vertices into triangles (i. e., there are Cn−2 different triangulations). The C4 = 14 ways to triangulate a hexagon are shown here: (c)

Cn+1 =

5.14. A family 𝒜 ⊆ 2{1,...,n} is an antichain if its elements are pairwise incomparable, i. e., if M ⊆ N then M = N. Prove the following facts: n ) subsets. (a) (Sperner’s theorem) Every antichain contains at most (⌊n/2⌋ Hint: Each element of an antichain appears in a maximal chain (compare Section 9.1), and maximal chains do not contain two different members of the same antichain. (b) The given bound is tight.

106 � 5 Combinatorics

Summary Notions – – – – – – – – – – – – – –

equinumerous mappings BA factorial n! falling factorial nk power set 2A characteristic mapping binomial coefficient (kn) k-element subsets (Ak) n ) multinomial coefficients (k ,...,k 1 d bubble sort inversion rencontres numbers Rn , Rn,m fixed point Stirling number of the first kind [kn]

– – – – – – – – – – – – – –

Stirling number of the second kind {kn} partition, classes cycle, cycle notation rising factorial nk harmonic number Hn Bell number Bn arithmetic partition number P(n, k) accumulated partition number P(n) Ferrers diagram lower partition number p(n, k) Catalan number Cn Dyck word up and down walk (saturated) binary tree

Methods and results – Combinatorial interpretation, bijective proof, polynomial method – |BA | = |B||A| , |2A | = 2|A| , |(Ak)| = (|A| ) k

– There are n! permutations on {1, . . . , n}.

– There are nk injective mappings from {1, . . . , k} to {1, . . . , n}. –



1 n ≤ nn!n ≤ en−1 en−1 n ) for n (kn) = (n−k

∈ ℕ and k ∈ ℤ

x−1) – Addition theorem: (kx ) = (x−1 ) + (k−1 k

– Binomial theorem: (x + y)r = ∑k (kr )x k yr−k for |x| < |y| or r ∈ ℕ x−k ) – Trinomial revision: (mx )(mk) = (kx )(m−k

– Binomial inversion: fi = ∑k (ki )gk for 0 ≤ i ≤ n 󳨐⇒ gn = ∑k (−1)n−k (kn)fk – Urn model: draw with/without replacement and with/without order – Gauss formula: ∑nk=0 k = (n+1 ) 2

n+1 ) = ∑ k – Upper summation: (m+1 0≤k≤n (m)

– Parallel summation: (x+n+1 ) = ∑k≤n (x+k ) n k

y – Vandermonde identity: (x+y ) = ∑k (kx )(n−k ) n

– |{(e1 , . . . , eℓ ) ∈ ℕℓ | ∑1≤k≤ℓ ek ≤ t}| = (t+ℓ ) ℓ

k

k

n )x 1 ⋅ ⋅ ⋅ x d – Multinomial theorem: (x1 + ⋅ ⋅ ⋅ + xd )n = ∑ki ≥0, k1 +⋅⋅⋅+kd =n (k ,...,k 1 d 1

d

Summary �

107

– Bubble sort requires Θ(n2 ) comparisons on average.

– Inclusion–exclusion principle: |A1 ∪ ⋅ ⋅ ⋅ ∪ An | = ∑k≥1 (−1)k+1 ∑1≤r1 0 and an ordinary generating function cannot help. An obvious case where an ∈ 2ω(n) is an = n!, which is exactly the number of permutations of {1, . . . , n}, leading us to the notion of an exponential generating function. For a sequence of real or complex numbers (an )n∈ℕ , it is defined as the formal power series an n z n! n≥0

ã(z) = ∑

The rules for addition, multiplication, and differentiation are as follows:

6.9 Stirling numbers of the first kind

� 121

an n b a + bn n z + ∑ n zn = ∑ n z n! n! n! n≥0 n≥0 n≥0 ∑

n a ⋅b an n b z ⋅ ∑ n zn = ∑ ∑ k n−k zn n! n! k! (n − k)! n≥0 n≥0 n≥0 k=0



1 n n ( ∑ ( )ak ⋅ bn−k )zn n! k n≥0 k=0

=∑

a an n z ) = ∑ n+1 zn n! n! n≥0 n≥0 ′

(∑

If the absolute values of the elements in a sequence (an )n∈ℕ satisfy the upper bound |an | ∈ 2n log2 n+𝒪(n) , then there is an r > 0 such that |an | ≤ (rn)n for almost all n. Then, n a the series ∑n≥0 n!n zn is absolutely convergent for all z < 1/re (since nn! ≤ nen ). In particular, the series has a positive radius of convergence, and looking at the derivatives again, we can see that the corresponding analytic function uniquely determines all the coefficients an . We consider some simple examples. For an = 1, ã(z) = exp(z) = ez is the exponential n function in its series representation exp(z) = ∑n≥0 zn! . For an = n!, the exponential gen1 is the geometric series ∑n≥0 zn . For n, m ∈ ℕ, let I(n, m) be the erating function ã(z) = 1−z number of injections of an n-element set into an m-element set, i. e., I(n, m) = mn = n!(mn). If we fix a value for m and consider an = I(n, m) as a sequence in n, then its exponential generating function is the polynomial m I(m, n) n z = ∑ ( )zn = (1 + z)m n! n n n≥0

ã(z) = ∑

6.9 Stirling numbers of the first kind As our first nontrivial example for an exponential generating function, we now consider the sequence of cycle numbers [n1 ] = (n − 1)!. The ordinary generating function does not converge in this case, but the exponential generating function has the form n ∑n≥0 n!1 [n1 ]zn = ∑n≥0 zn = − ln(1 − z). This, in turn, is a special case of the exponential generating function of Stirling numbers of the first kind. Since ∑k [kn] = n!, the associated accumulated exponential generating function is the geometric series considered before. However, the exponential generating function of the series [kn] of Stirling numbers of the first kind for fixed k is interesting. First, we observe that this sequence grows slower than n!, but by (5.6) it grows at least as fast as (n−k)!. The corresponding exponential generating function thus has a positive radius of convergence, while the ordinary generating function is not convergent for z > 0. Now, for k ∈ ℕ we investigate the series ∑n n!1 [kn]zn . For this, we use the already known general binomial theorem, Theorem 5.11. Let r, z ∈ ℂ, and let |z| < 1. Then

122 � 6 Generating functions r rn (1 + z)r = ∑ ( )zn = ∑ zn n n n n! By Corollary 5.29, we know that r n = r(r − 1) ⋅ ⋅ ⋅ (r − n + 1) is a polynomial in r of degree n with coefficients s(n, k) = (−1)n−k [kn]. So r n = ∑k s(n, k)r k , and thus er ln(1+z) = (1 + z)r = ∑(∑ k

n

s(n, k) n k z )r n!

Comparison of coefficients for each fixed z with |z| < 1 yields ∑ n

s(n, k) n (ln(1 + z))k z = n! k!

Thus, this formula for each fixed k yields the exponential generating function of the Stirling numbers of the first kind with sign s(n, k). Going back to [kn] = (−1)n−k s(n, k), we replace z with −z and obtain the exponential generating function for the Stirling numbers of the first kind [kn], 1 n n (− ln(1 − z))k [ ]z = n! k k! n≥0 ∑

The radius of convergence for each of these series is 1.

6.10 Bell numbers In this subsection, we will treat another nice example of an exponential generating function, namely that for the Bell numbers Bn . We already know from the inequalities in (5.8) that, on the one hand, Bn ∈ 2ω(n) , but, on the other hand, Bn ≤ n! holds as well. So, unlike the ordinary generating function, the exponential generating function certainly has a positive radius of convergence. Bn n ̃ =∑ Let b(z) n≥0 n! z be the exponential generating function of the Bell numbers Bn . We saw before that Bn+1 = ∑k (kn)Bk for all n ∈ ℕ. By ez we denote the usual exponential n

function to base e, which, e. g., has the series representation ez = ∑n≥0 zn! . Using the computing rules for exponential generating functions given above, we then obtain the following differential equation: B 1 n b̃′ (z) = ∑ n+1 zn = ∑ ( ∑ ( )Bk )zn n! n! k k n≥0 n≥0 = (∑ n

B zn ̃ ⋅ ez ) ⋅ (∑ n zn ) = b(z) n! n! n

Exercises

� 123

z

We want to solve this differential equation. The function ee satisfies the differential equation. Now let b1 (z) and b2 (z) be two solutions of this differential equation with b1 (z) > 0 and b2 (z) > 0 for all 0 < z ∈ ℝ. Then we obtain for the logarithmic derivatives b′1 (z) b1 (z)

b′2 (z) b2 (z)

= ez . Therefore, b1 (z) and b2 (z) only differ by a constant factor, ̃ = ceez for a c > 0. Then, b(0) ̃ because the derivative of b (z)/b (z) is zero. Thus, b(z) =1 (ln ∘ b1 )′ (z) =

=

1

yields c = e1 , and we obtain

2

z

̃ = ee −1 b(z) Bn n ̃ =∑ In particular, we see that the series b(z) n≥0 n! z converges everywhere. This yields a new proof for the Dobiński formula already known from Theorem 5.34. By series expañ = eez −1 = 1 ∑ ( 1 ∑ zn k n ) = ∑ ( 1 ∑ k n ) zn . Comparing the coefficients sion, we get b(z) n e k k! n! e k k! n n! yields the Dobiński formula

Bn =

1 kn ∑ e k≥0 k!

Exercises 6.1. Let Fn be the nth Fibonacci number. Show that the infinite sum ∑n≥0 10−n Fn converges to a rational number. 6.2. Let c1 , c2 ∈ ℝ be such that c1 ≠ 0 ≠ c2 and c12 + 4c2 > 0. We define λ1 =

2

c1 √ c1 + ( ) + c2 2 2

and

λ2 =

2

c1 √ c1 − ( ) + c2 2 2

Let the sequence (an )n≥0 be recursively defined by a0 = 0, a1 = 1, and an = c1 an−1 +c2 an−2 for n ≥ 2. Show: z (a) The generating function of the sequence (an )n≥0 is a(z) = 1−c z−c . z2

(b) an =

1

c 2⋅√( 21 )2 +c2

(λn1 − λn2 ).

1

2

6.3. Let a sequence of numbers an be defined inductively by a0 = 2, a1 = 5, and an+2 = 5an+1 − 6an . Determine the generating function of the an and prove that an = 2n + 3n holds for all n. 6.4. We define the sequence of numbers (an )n≥0 by a0 = 0, a1 = 1 and an = 3an−1 −2an−2 + 2n−1 for n ≥ 2. Determine the generating function of (an )n≥0 , and show an = 1 + (n − 1)2n . 6.5. Let Hn = ∑nk=1 1/k be the harmonic sequence. Determine its generating function h(z). Hint: You may use the equality − ln(1 − z) = ∑n≥1 zn /n.

124 � 6 Generating functions 6.6. Determine the generating function of (F2n )n≥0 . Here, Fn is the nth Fibonacci number. 6.7. Let a0 = 1 and an = ∑n−1 i=0 (n − i)ai . Determine the generating function of (an )n≥0 . 6.8. Determine the exponential generating function of the rencontres numbers Rn .

6.9. An automaton over the finite alphabet Σ is a 4-tuple 𝒜 = (Q, δ, q0 , F) with a finite set of states Q, an initial state q0 ∈ Q, a set of final states F ⊆ Q, and a transition function δ : Q × Σ → Q. The transition function δ can be extended to sequences of elements from Σ by letting δ(q, ε) = q and δ(q, wa) = δ(δ(q, w), a). Here, ε is the empty sequence (the empty word), a ∈ Σ, and w is any sequence over Σ. We denote the set of all finite sequences over Σ by Σ∗ . The language accepted by 𝒜 is L(𝒜) = {w ∈ Σ∗ | δ(q0 , w) ∈ F}. Our goal is to find how many words of length n are accepted by 𝒜. (a) For a state q, let Lq = {w ∈ Σ∗ | δ(q0 , w) = q} be the set of words leading to q. Show that Lq0 = {ε} ∪ ⋃δ(p,a)=q0 Lp ⋅ a and Lq = ⋃δ(p,a)=q Lp ⋅ a for q ≠ q0 . q (b) Let an be the number of words of length n in Lq and let aq (z) be the generating q function of (an )n≥0 . Then aq0 (z) = 1 + ∑δ(p,a)=q0 zap (z) and aq (z) = ∑δ(p,a)=q zap (z) for q ≠ q0 . (c) Let bn be the number of words of length n in L(𝒜). Then ∑q∈F aq (z) is the generating function of (bn )n≥0 . (d) Let Σ = {a, b}, Q = {q0 , q1 , q2 }, F = {q0 , q1 }, and δ be given by q

c

δ(q, c)

q0 q0 q1 q1 q2 q2

a b a b a b

q1 q0 q2 q0 q2 q2

a, b

b a q0

q1

a

q2

b

Determine the generating function of the number of words of length n in the language accepted by this automaton. How many words of length n does the automaton accept?

Summary

� 125

Summary Notions – – – – – –

ordinary generating function analytic function radius of convergence formal power series multiset n in summands from M, ZM (n)

– – – – – –

partitions 𝒫o (n) partitions 𝒫d (n) pentagonal number partition numbers Ed (n) partition numbers Od (n) exponential generating function

Methods and results – Ordinary generating functions: interrrelation between asymptotic growth and radius of convergence – Calculating with formal power series – Inverting formal power series

– Generating function of the Fibonacci numbers: f (z) =

z 1−z−z2

– How to solve simple recursion equations with the help of generating functions – Generating function of the Catalan numbers: c(z) =

1−√1−4z 2z

– Generating function of Stirling numbers of second kind: Sk (z) = ∏1≤i≤k

z 1−iz

– Generating function of number of multisets over {1, . . . , k} satisfying conditions Nj ⊆ ℕ: ∏kj=1 (∑i∈Nj zi )

– Generating function of the number of multisets over {1, . . . , k}: 1 1−zm P(n): ∏m≥1 1−z1 m P(n, k): ∏km=1 1−zz m

– Generating function of ZM (n): ∏m∈M – Generating function of

– Generating function of

1 (1−z)k

– Generating function of Pd (n) and of Po (n): ∏m≥1 (1 + zm )

– log2 Pd (n) ≥ √n for all n ≥ 32 – log P(n) ∈ Θ(√n)

– Pentagonal number theorem: ∏m≥1 (1 − zm ) = ∑j∈ℤ (−1)j zf (j)

{ 1 for n = f (j) and j ∈ ℤ even – Ed (n) − Od (n) = {−1 for n = f (j) and j ∈ ℤ odd { 0 else, i. e., if n ≠ f (j) for all j ∈ ℤ – Calculation rules for exponential generating functions – ez = ∑n≥0

zn n!

– Exponential generating function of [kn]: ∑n≥0

(− ln(1−z))k k! z B ∑n≥0 n!n zn = ee −1

1 n n [ ]z n! k

– Exponential generating function of Bell numbers:

=

7 Group actions and special families of groups This chapter is independent of the other parts of the book; and it is not part of the German edition of the present book. The focus of the additional material is on finite groups and the interplay between group theory and combinatorics. For that we introduce the concept of a group (or a monoid) acting on a set in Section 7.1, and we will prove the orbitstabilizer theorem, Theorem 7.5. The theorem leads to an important theorem of Cauchy, Theorem 7.12, stating that every group, whose order is divisible by a prime p, has an element of order p. Later, in Section 7.8, we use Theorem 7.5 to show that finite p-groups are nilpotent. During this excursion to a more advanced group theory, we meet several other prominent families of groups, like the dihedral groups Dn and alternating groups An . For example, we will show that the groups Dn are never simple for n ≥ 3 and the groups An are always simple for n ≠ 4 because here trivial groups are defined to be simple. Indeed, a standard definition of a simple group is restated in Section 7.7. It says that there is no normal subgroup other than {1} and the group itself. Our proof that the alternating groups An are simple for n ≠ 4 uses Theorem 7.12.

7.1 Group actions Groups are monoids; therefore we begin with actions of monoids on a set. Definition 7.1. A left action of a monoid M on a set X is given by a mapping M × X → X, denoted as (g, x) 󳨃→ g ⋅ x, such that for all x ∈ X and g, h ∈ M we have (a) 1 ⋅ x = x, (b) (gh) ⋅ x = g ⋅ (h ⋅ x). We say that M acts on X (from the left) and we call X an M-set, respectively a G-set if the monoid is a group G. A right action is defined analogously by symmetry. If we speak about an action without specifying left or right action, then, by default, we mean a left action. A morphism between M-sets X and Y is a mapping φ : X → Y which is compatible with the action. That is, φ(g ⋅ x) = g ⋅ φ(x) for all (g, x) ∈ M × X. A bijective morphism is called an isomorphism. Example 7.2. Let H be a subgroup of a group G, then G/H is a G-set by letting g ⋅fH = gfH. If K is a subgroup of H, then the canonical mapping G/K → G/H is a morphism of G-sets. Let A = (Q, Σ, δ, q0 , F) be a deterministic automaton as defined in formal language theory, where δ : Q × Σ → Q is the transition function. See, for example, [26]. Then δ induces a right action of Σ∗ on the state set Q. If X1 and X2 are M-sets, then the disjoint union X = X1 ∪ X2 inherits the structure of the M-sets by letting M act on x ∈ X as M acts on Xi if x ∈ Xi for i = 1, 2. It follows that https://doi.org/10.1515/9783111062556-007

7.2 Orbit-stabilizer theorem

� 127

a morphism from X to any M-set Y induces a pair (φ1 , φ2 ) of morphisms φi : Xi → Y for i = 1, 2. And vice versa, a pair (φ1 , φ2 ) of morphisms φi : Xi → Y for i = 1, 2 yields a morphism from X to Y . ⬦

7.2 Orbit-stabilizer theorem In the following we concentrate on G-sets where G is always a group. Definition 7.3. Let X be a G-set and x ∈ X. The orbit of x is the G-set Gx = { g ⋅ x | g ∈ G}. The stabilizer of x is the subgroup Gx = { g ∈ G | g ⋅ x = x}. An element x ∈ X is called a fixed point if Gx = G. Lemma 7.4. Let X be a G-set and x, y ∈ X. Then the mapping G/Gx → Gx defined by gGx 󳨃→ gx is an isomorphism1 of G-sets. If y is in the orbit Gx of x, then Gx = Gy and the stabilizers Gx and Gy are conjugate. Proof. The mapping gGx 󳨃→ gx is well defined because Gx ⋅ x = {x}. Hence, it is a morphism of G-sets. By construction, it is surjective. If g ⋅ x = h ⋅ x, then we have g −1 h ∈ Gx . Hence it is injective. Finally, let y ∈ Gx, then y = g ⋅ x for some g ∈ G. This implies x = g −1 ⋅ y and, therefore, Gx ⊆ Gy ⊆ Gx. Thus we obtain Gx = Gy. Moreover, h ∈ Gy ⇐⇒ ghg −1 ∈ Gx . Theorem 7.5 (Orbit-stabilizer theorem). Let X be a G-set and R ⊆ X a subset such that for each orbit Gy there is exactly one “base” point x ∈ Gy ∩ R. Then the isomorphism G/Gx → Gy, g 󳨃→ g ⋅ x between orbits induces an isomorphism of the G-set X and the disjoint union of the left-cosets G/Gx with x ∈ R. Thus, X ≅ ⋃{G/Gx | x ∈ R}

(7.1)

Proof. By Lemma 7.4, orbits are either disjoint or equal. It follows that the G-set X is a disjoint union of orbits. For each orbit Gy, there is exactly one x ∈ R such that Gy = Gx. Having fixed x ∈ R, we obtain the isomorphism G/Gx → Gy, g 󳨃→ g ⋅ x of G-sets which extends to a disjoint union. Corollary 7.6. Let X be a finite G-set. Then we have |X| = ∑x∈R [G : Gx ]. Proof. Since X is finite, the set R is finite and the index [G : Gx ] is finite for all x ∈ X. The corollary is therefore immediate by Theorem 7.5. Theorem 7.8 is named after John Wilson (1741–1793). It is a theorem in number theory which describes a characterization of prime numbers, but the proof (for the nontrivial direction) fits well into this section. 1 According to Definition 7.1.

128 � 7 Group actions and special families of groups Lemma 7.7. Let A be a finite Abelian group and G = {1, −1} be a cyclic group of order two which acts on A by mapping (m, a) ∈ G × A to am ∈ A. (Here, both A and G are presented as multiplicative groups.) Then ∏{a | a ∈ A} = ∏{a | a2 = 1}. Proof. Let a ∈ A, then there are two cases. In the first case, G ⋅ a = {a, a−1 } contains two elements. Then the factor a a−1 appears in ∏{a | a ∈ A}, but evaluates to 1, so we can cancel it. Thus, the remaining factors in ∏{a | a ∈ A} are those group elements a with a = a−1 . Here, we are in the second case, where a ∈ A and G ⋅ a = {a} because a2 = 1 (which is the same as a = a−1 ). Theorem 7.8 (Wilson’s theorem). For n ≥ 2, we have (n − 1)! ≡ −1 mod n if and only if n is a prime number. Proof. Let n be composite, n = pq with 1 < p, q < n. Then p divides (n − 1)!, but p has no inverse in (ℤ/nℤ)∗ . Hence, no product pr with r ∈ ℤ can evaluate to −1 mod n. For the other direction, F = ℤ/nℤ is a field and A = F \ {0} is a finite Abelian group. By Lemma 7.7, we have ∏{a | a ∈ A} = (n − 1)! mod n and therefore (n − 1)! ≡ 1 ⋅ (−1) mod n because the equation x 2 − 1 = (x + 1)(x − 1) = 0 holds in A if and only if x = a and a ∈ {±1 mod n}. Thus we are done.2

7.3 p-groups Let p be a prime. A group G is called a p-group if the order of every element is a power of p. The orbit-stabilizer theorem implies the following result. Lemma 7.9. Let G be finite p-group and X be a finite G-set. Then we have |X| ≡ |{x ∈ X | Gx = G}| mod p. Proof. This follows from Corollary 7.6 because [G : Gx ] ≡ 0 mod p if Gx ≠ G and Gx = {x} if G = Gx . Lemma 7.9 yields the following structural property for finite p-groups. Theorem 7.10. Let G be a finite p-group and Z(G) = { g ∈ G | ∀h ∈ G : gh = hg} its center. Then either G = {1} or |Z(G)| ≥ p. Proof. Every group acts on itself by conjugation x 󳨃→ gxg −1 . We have x ∈ Z(G) if and only if x is a fixed point. For G ≠ {1}, Lemma 7.9 yields 󵄨 󵄨 0 ≡ |G| ≡ 󵄨󵄨󵄨Z(G)󵄨󵄨󵄨 mod p Since 1 ∈ Z(G), we have |Z(G)| ≥ p, and the center Z(G) is not trivial.

2 For n = 2, the set {±1 mod n} is a singleton and we have +1 ≡ −1 mod 2.

7.4 Cyclic groups

� 129

In the following we use the fact that if the p-group P = ℤ/pℤ acts on a set X, then for all x ∈ X there are only two cases: either |P ⋅ x| = 1 or |P ⋅ x| = p. Example 7.11. Let us show that Lemma 7.9 yields another proof of Fermat’s little theorem (Theorem 2.16) with a little help of combinatorics on words. Let A be a finite alphabet of size m ≥ 1 and p a prime number. Shifting the first letter to the end defines a shift operator σ on Ap , which is given by σ(aw) = wa for all a ∈ A and w ∈ Ap−1 . The shift σ induces an action of ℤ/pℤ on Ap . The action is trivial if and only if w = ap for some a ∈ A. There are m words of the form ap and we have |Ap | = mp . Hence, Lemma 7.9 implies mp ≡ m mod p for all m ≥ 1. This shows Theorem 2.16. ⬦ The following theorem is named after Augustin Louis Cauchy (1789–1857). The proof presented here was given by James H. McKay (1923–2012) in [33]. The proof uses Lemma 7.9 and has some similarity to the reasoning in Example 7.11. Theorem 7.12 (Cauchy). Let G be a finite group and let p be a prime dividing |G|. Then, G contains an element of order p. Proof. Let n = |G| and S = {(g1 , . . . , gp ) ∈ Gp | g1 ⋅ ⋅ ⋅ gp = 1 in G}. In each tuple (g1 , . . . , gp ) ∈ S, the elements g1 , . . . , gp−1 ∈ G can be chosen arbitrarily and gp is then uniquely determined by gp = (g1 ⋅ ⋅ ⋅ gp−1 )−1 . Hence, |S| = np−1 , and therefore |S| ≡ 0 mod p since p | n and p ≥ 2. The cyclic p-group Z = ℤ/pℤ acts on S by shifting the indices. The action Z × S → S is defined by m ⋅ (g1 , . . . , gp ) = (g1+m , . . . , gp , g1 , . . . , gm )

(7.2)

for 0 ≤ m < p. An element (g1 , . . . , gp ) ∈ S is a fixed point if and only if (g1 , . . . , gp ) = (g, . . . , g) for some g ∈ G if and only if g p = 1. Using Lemma 7.9, we conclude 󵄨 󵄨 0 ≡ 󵄨󵄨󵄨{ g ∈ G | g p = 1}󵄨󵄨󵄨 mod p Since 1p = 1, there is also some 1 ≠ g ∈ G which has order p.

7.4 Cyclic groups A group G is called cyclic if G is generated by a single element x, that is, G = ⟨x⟩. In this case, x is called a generator of G. All cyclic groups G are isomorphic to ℤ/nℤ with addition modulo n for some n ∈ ℕ. If G is infinite, then we have n = 0, and otherwise we have n = |G|. To see this, consider the surjective homomorphism φ : ℤ → G which maps the group generator 1 ∈ ℤ to the generator x ∈ G. If G is infinite, then φ is an isomorphism, and we have k ≡ m mod 0 ⇐⇒ k = m. If G is finite of order n, then the kernel of φ is nℤ. Hence, ℤ/nℤ ≅ G by Theorem 1.18.

130 � 7 Group actions and special families of groups Theorem 7.13. Subgroups of cyclic groups are cyclic. Proof. Let U be a subgroup of G = ⟨g⟩. There is a minimal 1 ≤ n ∈ ℕ with g n ∈ U. Consider an arbitrary element g k ∈ U, then g k mod n ∈ U. Since n was chosen to be minimal, we conclude k ≡ 0 mod n. Therefore, U is generated by g n .

7.5 Dihedral groups For n ∈ ℕ, we turn the set of pairs ℤ/nℤ × ℤ/2ℤ into a group Dn by: (k, i) ⋅ (ℓ, j) = (k + (−1)i ℓ, i + j)

(7.3)

The operation in (7.3) is well defined since (−1)i = (−1)i if i ≡ i′ mod 2. It is associative and the neutral element is (0, 0). The inverse of (k, i) is the element ((−1)i+1 k, i). The groups Dn are called dihedral groups. The construction of dihedral groups is a special case of a semidirect product. The notion of a semidirect product for monoids is defined in Exercise 7.1. For n = 0, the group is infinite because its carrier set is ℤ/0ℤ × ℤ/2ℤ ≅ ℤ × ℤ/2ℤ. We have D1 ≅ ℤ/2ℤ, and D2 = ℤ/2ℤ × ℤ/2ℤ. The latter group is known as the Klein four-group named after Felix Christian Klein (1849–1925). It got a special name since it pops up as a subgroup in many (even noncommutative) groups where the order is divisible by 4. For 1 ≠ n ≠ 2, none of the groups Dn is commutative because (1, 0) ⋅ (0, 1) = (1, 1) but (0, 1) ⋅ (1, 0) = (−1, 1) and 1 ≠ −1 mod n for n ∉ {1, 2}. In the following we interpret Dn geometrically. A regular polygon is an undirected graph Gn = (Vn , En ) with vertex set Vn = ℤ/nℤ and edge set En = {{i, i +1} | i ∈ ℤ/nℤ} for 2 ≠ n ∈ ℕ. For n = 2, we let G2 be the “bigon” which has two vertices and two different edges between them. The 1-gon is a single vertex with a self-loop. Our definition includes the case n = 0 which results in the “infinite line” where the set of vertices is ℤ and where each vertex i has outgoing edges to i ± 1. If n ≥ 1, we call Gn a regular n-gon. An n-gon has n vertices and n edges. The automorphism group of the bi-gon is the Klein fourgroup D2 = ℤ/2ℤ × ℤ/2ℤ. For n ∉ {1, 2}, the graphs Gn have no self-loops and no multiple edges. Therefore, whenever convenient we assume n ∉ {1, 2}. This implies E ⊆ (V2 ) where (Vk ) = {K ⊆ V | |K| = k} for k ∈ ℤ. ′

3 2 0

1

Triangle

3

2

0

1

Quadrangle

4

2 0

1

Pentagon

7.6 Symmetric groups �

131

An automorphism of a graph G = (V , E) with E ⊆ (V2 ) is a bijective mapping φ : V → V satisfying ∀x, y ∈ V : {x, y} ∈ E ⇔ {φ(x), φ(y)} ∈ E. The set of automorphisms of a graph G = (V , E) is denoted by Aut(G). It is a subset of V V . It forms a group with composition of mappings as operation. The neutral element of this group is idV . Remark 7.14. The general definition Aut(G) for a graph G with self-loops and/or multiple edges is slightly more involved. The definition is used in the next theorem for n = 1 and n = 2, only. It is not used elsewhere. Therefore, we leave the cases n = 1 and n = 2 to the interested reader. ⬦ Theorem 7.15. The family of dihedral groups Dn is equal to the family of automorphism groups Aut(Gn ) of the regular polygons Gn for all n ∈ ℕ. Proof. In the spirit of Aut(G1 ), we suppose n ∉ {1, 2}. The group Aut(Gn ) contains a rotation δ and a reflection σ by: δ(j) = j + 1 mod n σ(j) = −j mod n A direct inspection shows δn = 1 ∈ Dn and σ 2 = 1. Moreover, σδσ = δ−1 , or equivalently σδ = δn−1 σ. Note that these equations are valid for all n. Define a homomorphism φ : Dn → Aut(Gn ) by φ(k, i) = δk σ i . Due to the multiplication in Dn , it is a homomorphism because i

φ((k, i) ⋅ (ℓ, j)) = δk+(−1) ℓ σ i+j = δk σ i δℓ σ j = φ(k, i) ⋅ φ(ℓ, j) Thanks to the equation σδ = δn−1 σ in Aut(Gn ), the homomorphism is surjective. It is injective because a direct verification shows that φ(k, i) = idV implies (k, i) = (0, 0) where (0, 0) is the neutral element in Dn .

7.6 Symmetric groups For n ∈ ℕ, we let [n] = {1, . . . , n} and (n2 ) = {{i, j} ⊆ [n] | i ≠ j}. In the context of . There is no risk of counting, we read (n2 ) ∈ ℕ as the binomial coefficient (n2 ) = n(n−1) 2 |n| n confusion since |( 2 )| = ( 2 ). Let Sn be the set of all permutations on {1, . . . , n}, called the symmetric group on n elements. More generally, let X be any set, then the set of all permutations of X forms a group Perm(X) using the composition of mappings defined by (πσ)(x) = π(σ(x)) for x ∈ X. If G is any group, then x 󳨃→ gx for x, g ∈ G defines a permutation on G. Therefore, every group of any cardinality appears as a subgroup of Perm(G), and every finite group of order at most n is isomorphic to a subgroup of Sn .

132 � 7 Group actions and special families of groups Let π be a permutation on {1, . . . , n}. Then an element {i, j} ∈ (n2 ) is called an inversion of π if (π(i) − π(j))(i − j) < 0. If we think of (π(1), . . . , π(n)) as a sequence, then {i, j} with i < j is an inversion if the elements at positions i and j are in the wrong (or “false”) order, i. e., π(i) > π(j). By ℱ (π) we denote the set of inversions3 of π. Example 7.16. Consider the permutation π ∈ S9 given by (π(1), . . . , π(9)) = (3, 1, 7, 2, 4, 6, 9, 5, 8). Then the inversions of π are (1, 2), (1, 4), (3, 4), (3, 5), (3, 6), (3, 8), (6, 8), (7, 8), and (7, 9). One way of illustrating the inversions is by writing (1, . . . , 9) twice, one above the other. Then we draw a line from each number i in the first line to π(i) in the second line: 1

2

3

4

5

6

7

8

9

1

2

3

4

5

6

7

8

9

Every crossing of two lines corresponds to an inversion.



Every permutation in Sn has at most (n2 ) inversions and the permutation π = (n, . . . , 1) with π(i) = n − i + 1 matches this upper bound. The identity mapping is the only permutation without inversions. The sign of π is sign(π) = (−1)|ℱ (π)| . It is an element of the multiplicative group {1, −1}. Lemma 7.17. Let π : {1, . . . , n} → {1, . . . , n} be a permutation. Then we have: π(j) − π(i) π(j) − π(i) = ∏ j − i j−i 1≤i 1, a direct verification shows τ{i−1,i} (k) = π −1 τ{i,i+1} π(k) for all k, and therefore τ{i−1,i} = π −1 τ{i,i+1} π

(7.5)

Hence, by induction on i we are done.

7.7 Alternating groups A group G is called simple if it has no normal subgroups other than {1} and G itself. The classification of finite simple groups started with the work of Galois, who, presumably, knew that the alternating group A5 is simple, but A4 is not. The full classification of all finite simple groups was finished only in 2008. It is a great mathematical achievement with hundreds of authors contributing several thousands of pages of proof. The classification says that a finite simple group is either cyclic, or alternating, or it belongs to an infinite class called “the groups of Lie type”, or else it is one of the explicitly known 26 so-called sporadic groups. The largest of these sporadic groups is called the monster group of order 246 ⋅ 320 ⋅ 59 ⋅ 76 ⋅ 112 ⋅ 133 ⋅ 17 ⋅ 19 ⋅ 23 ⋅ 29 ⋅ 31 ⋅ 41 ⋅ 47 ⋅ 59 ⋅ 71. The aim of this section is far more modest. We will only show the very first (baby) step of the classification: all alternating groups An are simple except for n = 4. The alternating group An is defined for n ≥ 1 as

7.7 Alternating groups

An = {π ∈ Sn | sign(π) = 1}

� 135

(7.6)

Hence, An is normal in the group Sn of permutations on n elements, since it is the kernel of the homomorphism sign. The elements of An are also called even permutations, while permutations of sign equal to −1 are called odd permutations. Corollary 7.19 shows that the even permutations are those generated by an even number of transpositions. The simplicity of A5 led to the insight that polynomial equations of degree five or higher generally cannot be resolved by radicals. The first complete proof of this result is due to Abel and its discovery was the beginning of Galois theory. The general result about the simplicity of An has been proven in full generality with group theoretical methods by Camille Jordan (1838–1922) in [27, p. 66]. Theorem 7.23 (Jordan, 1870). Let n ≥ 1. Then the alternating group An is simple if and only if n ≠ 4. The groups A1 and A2 are trivial, therefore simple. The group A3 is the simple cyclic group ℤ/3ℤ since S3 is the dihedral group D3 with six elements. The group A4 is not simple because the Klein four-group ℤ/2ℤ × ℤ/2ℤ appears as normal subgroup of index 3, see Exercise 7.3. Thus, it only remains to show Theorem 7.23 under the restriction n ≥ 5. This will take the rest of Section 7.7. We follow the proof in Suzuki’s classical textbook [36, p. 295]. We found the reference in the preprint Simplicity of An by Keith Conrad (University of Connecticut). His paper presents five proofs for Theorem 7.23. Suzuki’s proof is the fifth one.

Proof of Theorem 7.23 We devise the proof into two parts. Part I. Every permutation from Sm is also viewed as a permutation in Sn for m ≤ n. If π ∈ Sn is a permutation, then we let supp(π) = {i ∈ [n] | π(i) ≠ i} be its support. If supp(σ) ∩ supp(π) = 0, then σπ = πσ. A permutation π ∈ Sn is called an ℓ-cycle if there is a sequence (a1 , . . . , aℓ ) of length ℓ such that π(ai ) = ai+1 for 1 ≤ i < ℓ, π(aℓ ) = a1 and π(aj ) = aj for ℓ < j ≤ n. Every

permutation π can be written as a product π = ∏ki=1 Ci of cycles C1 , . . . , Ck with pairwise disjoint supports. The factorization into cycles having pairwise disjoint supports is called the cycle arrangement (or disjoint cycle arrangement) of π. It is unique up to permutation of the indices. A permutation π having a cycle arrangement into two 2-cycles is called a double-transposition. Lemma 7.24. Let n ≥ 1. Then any two ℓ-cycles are conjugate and any two doubletranspositions are conjugate.

Lemma 7.24. Let n ≥ 1. Then any two ℓ-cycles are conjugate and any two double-transpositions are conjugate.

Proof. Let C be an ℓ-cycle defined by a sequence (a1, . . . , aℓ) and C′ be an ℓ-cycle defined by a sequence (b1, . . . , bℓ). Then let σ ∈ Sn such that σ(ai) = bi. This yields C′ = σCσ^{−1}. More generally, if π = ∏_{i=1}^{k} Ci and π′ = ∏_{i=1}^{k} Ci′ are cycle arrangements with |Ci| = |Ci′|, then we find σ ∈ Sn with π′ = σπσ^{−1} by induction on k. The case of double-transpositions is the special case of k = 2 and |Ci| = |Cj′| = 2 for i, j ∈ {1, 2}.

If an ℓ-cycle C is defined by a sequence (a1, . . . , aℓ), then we use the "bracket notation" C = [a1, . . . , aℓ] as in [22], and we call ℓ the length of the cycle, written as |C|. Clearly, [a1, . . . , aℓ] = [a2, . . . , aℓ, a1] and [a1, . . . , aℓ]^{−1} = [aℓ, . . . , a1]. The notation is used to distinguish C from another standard notation where (a1, . . . , aℓ) means the permutation π defined by (π(1), . . . , π(ℓ)) = (a1, . . . , aℓ). In particular, a transposition τ{i,j} is also written as τ{i,j} = [i, j]. Other papers write a cycle in the form (a1 ⋅⋅⋅ aℓ), omitting the commas. The bracket notation refers to the Stirling numbers {n \brack k} of the first kind, which are used in Section 5.6. The numbers {n \brack k} count the number of permutations on n elements which can be written as a product of k cycles. Another advantage is that the notation distinguishes [a, b, c](d) = d from (abc)(d), where the latter could also mean a cycle arrangement of two cycles. Still, the bracket notation has some ambiguity, since [n] could mean a 1-cycle or the set {1, . . . , n}. However, the context will always make clear what is meant.

Lemma 7.25. Let n ≥ 3. Then An is generated by 3-cycles.

Proof. Every element π ∈ An can be written as a product of transpositions by Corollary 7.22. Since π is an even permutation, we can rewrite the product of transpositions as a product of pairs [a, b][c, d] with either |{a, b, c, d}| = 3 and b = c, or |{a, b, c, d}| = 4. If |{a, b, c, d}| = 3 and b = c, then [a, b][b, d] = [a, b, d], which is a 3-cycle. If |{a, b, c, d}| = 4, then we can write [a, b][c, d] = [a, b][a, c] ⋅ [a, c][c, d], which is a product of two 3-cycles, and we are done.

The following lemma fails for n ∈ {3, 4}. For n ∈ {1, 2}, it trivially holds because A1 and A2 are trivial and therefore generated by the empty set.

Lemma 7.26. Let n ≥ 5. Then An is generated by double-transpositions.

Proof. All double-transpositions are even. Hence, they belong to An. Consider the 3-cycle [1, 2, 3]. Since n ≥ 5, it can be written as a product of double-transpositions [1, 2, 3] = [1, 3][4, 5] ⋅ [4, 5][2, 3]. Now, we are done by Lemma 7.25.

Proposition 7.27. Let n ≥ 5 and N ⊴ Sn be a nontrivial normal subgroup. Then N ∩ An = An.

Proof. Let 1 ≠ π = ∏_{i=1}^{k} Ci ∈ N be written as a cycle arrangement. Assume first that there is a cycle Ci which has length m ≥ 3. Without restriction we assume C1 = [1, 2, 3, . . . , m]. Since m ≥ 3, we have the following commutator equation:

π^{−1}([2, 1]π[1, 2]) = [m, . . . , 3, 2, 1][2, 1][1, 2, 3, . . . , m][1, 2] = [1, 2, m]


This implies [1, 2, m] ∈ N ∩ An because N is normal, so N contains a 3-cycle. All 3-cycles are conjugate by Lemma 7.24, and the 3-cycles generate An. Hence, we are done.

In the second case, every Ci has length 2. If k = 1, then N = Sn by Corollary 7.21. Thus, we may assume k ≥ 2. Now, we can write π without restriction as

π = [1, 2][3, 4] ∏_{i=3}^{k} Ci

Since N is normal, we obtain π^{−1}([2, 3]π[3, 2]) = [1, 4][3, 2] ∈ N. For n = 3, there are no double-transpositions in An. Since we exclude n = 4, we may assume n ≥ 5. We are done because by Lemma 7.24 all double-transpositions are conjugate, and they generate An for n ≥ 5 by Lemma 7.26.

Corollary 7.28. Let {1} ≠ N ⊲ An be a nontrivial normal subgroup in An and τ ∈ Sn a transposition. Then, either N ∩ τNτ = {1} or N ∩ τNτ = An.

Proof. The intersection N ∩ τNτ ≤ An is a subgroup of Sn because τ = τ^{−1}. To see that it is normal, let first σ ∈ An. Then σ(N ∩ τNτ)σ^{−1} = N ∩ στNτσ^{−1}, and we can write στ = τρ for some ρ ∈ An. Therefore, στNτσ^{−1} = (στ)N(στ)^{−1} = τρNρ^{−1}τ = τNτ. Second, if σ ∉ An, then στ ∈ An and σ = τρ for some ρ ∈ An. Hence, στNτσ^{−1} = N and σNσ^{−1} = τNτ. Consequently, N ∩ τNτ ≤ An is normal in Sn. If N ∩ τNτ = {1}, the assertion of the corollary holds. If N ∩ τNτ ≠ {1}, then we apply Proposition 7.27, and we are done, too.

Here is another remarkable consequence of Proposition 7.27:

Corollary 7.29. Let {1} ≠ N ⊲ An be a nontrivial normal subgroup in An. Then either N = An or |An| = |N|^2.

Proof. Let τ be a transposition. Since N is normal in An, the subgroup τNτ = τNτ^{−1} is normal in An = τAnτ. Exercise 7.2 shows the following two assertions: first, HK is normal in a group G if both H and K are normal in G, and second, if HK = KH and H ∩ K = {1}, then the mapping H × K → HK, (h, k) ↦ hk is a bijection. We apply these facts to H = N and K = τNτ. By Corollary 7.28, either N ∩ τNτ = An, in which case N = An and we are done, or N ∩ τNτ = {1}. In the latter case, NτNτ is normal in An. However, τ(NτNτ)τ = (τNτ)N = NτNτ. Thus, NτNτ is not trivial (as N ≠ {1}) and normal in Sn. By Proposition 7.27, we obtain NτNτ = An. Since |τNτ| = |N| and N ∩ τNτ = {1}, we conclude |An| = |NτNτ| = |N|^2.

Part II. The crux of the proof of Theorem 7.23 is the beautiful end. Following Suzuki, it is either an application of Bertrand's postulate or a consequence of a theorem by Cauchy. Bertrand's postulate was conjectured by Bertrand and proved by Chebyshev (1821–1894) in 1850 (see Theorem 3.5). It tells us that for every real number n ≥ 1 there is a prime p with n < p ≤ 2n.

As just seen in Corollary 7.29: either An is simple, or |An| = |N|^2 is a square for every normal subgroup N sitting strictly between {1} and An. Thus, by contradiction, we assume |An| = |N|^2 for some such normal subgroup. By Bertrand's postulate, there is a prime p such that n/2 < p ≤ n. Since n ≥ 5, we have p ≥ 3. It follows

that p divides |An| = n!/2, but p^2 does not divide n!. This is a contradiction. Hence, An is simple; witnesses are all primes p satisfying n/2 < p ≤ n. For n = 5, the primes 3 and 5 divide |A5| = 3 ⋅ 4 ⋅ 5, and 5/2 < 3 ≤ 5. Even without knowing the proof of Bertrand's postulate, this argument immediately shows that A5 is simple: the witnesses 3 and 5 show that |A5| is not a square, without even calculating 3 ⋅ 4 ⋅ 5 = 60.

To finish the proof of Theorem 7.23 by Cauchy's theorem, Theorem 7.12, without using Bertrand's postulate, we observe that 2 divides |An| for n ≥ 3. By contradiction, let N be a normal subgroup sitting strictly between {1} and An. Again, by Corollary 7.29, the number 2 divides |N|. Hence, N contains an element π ≠ 1 of order 2 by Theorem 7.12. Therefore, the cycle arrangement of π can be written as a product π = τ τ2 ⋅⋅⋅ τm of transpositions with pairwise disjoint supports. Since τ is a transposition and N ≠ An, Corollary 7.28 yields N ∩ τNτ = {1}. We observe that π = τπτ, hence 1 ≠ π ∈ N ∩ τNτ = {1}, a contradiction. Theorem 7.23 is shown.

7.8 Nilpotent groups

Nilpotent groups form a class of groups that strictly contains all Abelian groups, but nilpotent groups are still very close to Abelian groups. For example, in a nilpotent group, two elements having relatively prime orders commute, see Exercise 7.6 for a more precise statement. Therefore, the smallest non-Abelian group S3 is not nilpotent. The concept of nilpotent groups is credited to Sergei Nikolaevich Chernikov (1912–1987). Recall that Z(G) = {g ∈ G | ∀h ∈ G : gh = hg} denotes the center of a group G. It is a normal subgroup.

Definition 7.30. A group G is called nilpotent if there is a finite sequence G = G0, . . . , Gm = {1} such that Gi+1 = Gi/Z(Gi) for 0 ≤ i < m. The minimal m for such a sequence is called the nilpotency class of G. If no such m exists, then we define the nilpotency class of G to be ∞.

We have m = 0 if and only if G is trivial, and m ≤ 1 if and only if G is Abelian, because Z(G) = G for Abelian groups.

Theorem 7.31. Finite p-groups are nilpotent.

Proof. The result is obvious if G is Abelian. Otherwise, Z(G) is not trivial, and G/Z(G) is a smaller p-group. The result follows by induction using Theorem 7.10.

Example 7.32. The discrete Heisenberg group (named after Werner Karl Heisenberg, 1901–1976, one of the main pioneers in quantum mechanics) is

UT(3, ℤ) = { \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} \mid a, b, c ∈ ℤ }

It is neither Abelian nor a p-group for any p because the matrix \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} is of infinite order. However, it is nilpotent of nilpotency class two. We encourage the reader to verify these properties of UT(3, ℤ) without any further reading, but using Section 1.4.1.

This might help to understand more general results about unit triangular groups in Section 7.9, which are proved later without referring to the special case of the Heisenberg group.

The permutation group of three elements has six elements, and it is the smallest group which is not Abelian. It is not nilpotent because its center is trivial. Actually, no permutation group of a set with at least three elements is nilpotent. This follows directly from Theorem 7.33, as Sn is a subgroup of Sn+1 for all n ∈ ℕ. ⬦

Theorem 7.33. The family of nilpotent groups is closed under subgroups, homomorphic images, and finite direct products. More precisely, if G = ∏_{i=1}^{ℓ} Hi is a direct product of groups Hi with nilpotency class mi, then G is of nilpotency class max{m1, . . . , mℓ}.

Proof. Direct products. The trivial group is nilpotent. Let G and H be nilpotent of nilpotency classes k and m. Let G = G0, . . . , Gk = {1} such that Gi+1 = Gi/Z(Gi) for 0 ≤ i < k, and H = H0, . . . , Hm = {1} such that Hi+1 = Hi/Z(Hi) for 0 ≤ i < m. Without restriction, we may assume 1 ≤ k ≤ m. By adding trivial groups Gk+1, . . . , Gm = {1}, we may arrange both sequences to have the same length. The definition of direct products directly shows Z(Gi × Hi) = Z(Gi) × Z(Hi). Hence, G × H is nilpotent of nilpotency class max{k, m}. The result follows by induction.

Homomorphic images. Let G = G0, . . . , Gm = {1} such that Gi+1 = Gi/Z(Gi) for 0 ≤ i < m, and let h : G → H be a surjective homomorphism. Then h induces a surjective homomorphism from G/Z(G) onto H/Z(H). By induction on m, the group H/Z(H) is nilpotent. Hence, H is nilpotent.

Subgroups. Let H be a subgroup of a nilpotent group G and G = G0, . . . , Gm = {1} such that Gi+1 = Gi/Z(Gi) for 0 ≤ i < m. Then the homomorphism theorem for groups implies that H/(Z(G) ∩ Z(H)) is a subgroup of G/Z(G). By induction, H/(Z(G) ∩ Z(H)) is nilpotent. The result follows with the canonical surjective homomorphism H/(Z(G) ∩ Z(H)) → H/Z(H).
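Part of the verification suggested in Example 7.32 can be carried out experimentally. The following Python sketch (our illustration; the helper names and the inversion formula are ours) checks for a small sample of group elements that every commutator lies in the center of UT(3, ℤ), which is the defining property of nilpotency class at most two:

```python
import numpy as np

def H(a, b, c):
    # an element of the discrete Heisenberg group UT(3, Z)
    return np.array([[1, a, b], [0, 1, c], [0, 0, 1]])

def inv(m):
    # the inverse of H(a, b, c) is H(-a, ac - b, -c)
    a, b, c = m[0, 1], m[0, 2], m[1, 2]
    return H(-a, a * c - b, -c)

for a in range(-2, 3):
    for c in range(-2, 3):
        g, h = H(a, 1, c), H(c + 1, -1, a - 2)   # a small sample of pairs
        assert np.array_equal(g @ inv(g), H(0, 0, 0))
        k = g @ h @ inv(g) @ inv(h)              # the commutator of g and h
        # only the corner entry of k can be nonzero, so k is central
        assert k[0, 1] == 0 and k[1, 2] == 0
print("all sampled commutators are central, consistent with class two")
```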

7.9 Unit triangular groups

The aim of this section is twofold. First, we show that there are (finite) nilpotent groups of arbitrarily high nilpotency class. Second, we show that this property is realized by unit triangular groups over an arbitrary commutative ring. We assume that the reader is familiar with matrix calculation (and notation) as it was introduced in Section 1.4.1.

Throughout, we let R be a commutative ring with 0 ≠ 1, and we consider the ring R^{n×n} of n × n matrices over R for n ≥ 1. A matrix A = (ai,j) is upper triangular if ai,j = 0 for all i > j, and an upper triangular matrix is unit triangular if (in addition) ai,i = 1 for all i. Recall that, according to Section 1.4.1, we denote by Ei,j the matrix where the entry at the coordinate (i, j) is 1, all other entries being 0. Let i < j and k < ℓ. Then Ei,j Ek,ℓ = 0 unless j = k. If j = k, then Ei,k Ek,ℓ = Ei,ℓ.
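These multiplication rules are easy to confirm numerically. A small sketch (ours, with n = 4 and 0-based indices):

```python
import numpy as np

def E(i, j, n=4):
    # the matrix unit with entry 1 at position (i, j), all others 0
    m = np.zeros((n, n), dtype=int)
    m[i, j] = 1
    return m

zero = np.zeros((4, 4), dtype=int)
assert np.array_equal(E(0, 1) @ E(1, 2), E(0, 2))  # j = k: product is E_{i,l}
assert np.array_equal(E(0, 1) @ E(2, 3), zero)     # j != k: product vanishes
assert np.array_equal(E(2, 3) @ E(0, 2), zero)     # so E_{0,2} and E_{2,3} do not commute
```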

Thus, for i < j ≤ k < ℓ the matrices Ei,k and Ek,ℓ never commute. Moreover, considering Ei,k Ek,ℓ = Ei,ℓ, we see that ℓ − i = (ℓ − k) + (k − i). This implies ℓ − i > max{ℓ − k, k − i} whenever ℓ > k and k > i. This trivial observation will become crucial. By UT(n, R) we denote the set of unit triangular n × n matrices over R. We also use UTn as an abbreviation of UT(n, R). This means we can write

UTn = 1 + ∑_{i<j} R ⋅ Ei,j

Cycles Cn with n > 1 are bipartite if and only if n is even (then we may choose A to contain all vertices xi for even values of i, and B those for odd values of i). Let Am = {a1, . . . , am} and Bn = {b1, . . . , bn} be disjoint sets. The complete bipartite graph over Am and Bn is Km,n = (Am ∪ Bn, {{a, b} | a ∈ Am, b ∈ Bn}). The number of edges of the graph Km,n is mn.

The complete bipartite graphs K1,n, K2,n, K3,3, and K4,5

A most obvious measure for a vertex x in a graph is the number of incident edges. We call this the vertex degree (or, shorter, degree) dx of x, i.e., dx = |{e ∈ E | x ∈ e}| for any graph (V, E) and any vertex x ∈ V. In the complete graph Kn, all vertices have degree n − 1, and in the cycle Cn with n ≥ 3, dx = 2 for all vertices. For paths Pn with n ≥ 2, exactly two vertices (the endpoints) have degree 1, and all other vertices have degree 2. The following theorem provides a first observation about the vertex degrees in graphs G = (V, E).

Theorem 8.2 (Handshaking lemma).

∑_{x∈V} dx = 2|E|

Proof. We count the number of incidence pairs in two different ways. Here, an incidence pair is a pair (v, e) with v ∈ V, e ∈ E, and v ∈ e. Each edge connects two vertices and therefore appears in two incidence pairs. This corresponds to the right-hand side. On the other hand, each vertex x appears in dx incidence pairs, which corresponds to the way the pairs are counted on the left.

The theorem's name is motivated by thinking of an edge xy as the handshake of x and y. Then, in order to verify the equation, one counts how many hands are shaken (i.e., the number of incidence pairs) in two different ways. Sometimes this theorem is called the degree sum formula, and then the following corollary is named the handshaking lemma:

Corollary 8.3. The number of odd-degree vertices is even.
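Both statements are easy to confirm on a concrete graph. A small Python check (our illustration):

```python
from collections import Counter

edges = [(1, 2), (1, 3), (2, 3), (3, 4)]   # a small example graph
degree = Counter()
for x, y in edges:
    degree[x] += 1
    degree[y] += 1

assert sum(degree.values()) == 2 * len(edges)                  # Theorem 8.2
assert sum(1 for d in degree.values() if d % 2 == 1) % 2 == 0  # Corollary 8.3
```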


8.2 Eulerian and Hamiltonian cycles

When Euler was in Königsberg, he was asked the following question: Is it possible to walk through the city, cross each of the seven bridges over the Pregel River exactly once, and finally arrive exactly where you started; see Figure 8.1. In the language of graph theory, this can be formulated as follows: Do the following graphs each have a cycle which uses every edge exactly once?

In the right graph, we avoid the multiple edges of the left one by adding vertices and edges. Euler showed in 1736 that such cycles do not exist. This event is often regarded as the birth of graph theory. A cycle in a graph is called an Eulerian cycle if each edge is used exactly once. Theorem 8.4 provides an easy-to-check characterization of those graphs that have an Eulerian cycle.

Figure 8.1: Königsberg (detail of an engraving by Joachim Bering, 1613).

Theorem 8.4 (Euler, 1736). A connected graph has an Eulerian cycle if and only if all vertices have an even degree.

Proof. If a graph has an Eulerian cycle, then the degree of a vertex x that occurs k times on the cycle is dx = 2k.


Conversely, let G = (V, E) be a graph in which each vertex has an even degree. Let p = v0 . . . vn be a path of maximum length on which every edge is used at most once. As vn has an even degree, we must have v0 = vn (otherwise the path p could be extended). If there is any edge unused in p, then there is also an edge vi x unused in p (because G is connected). But now vi . . . vn v1 . . . vi x is a longer path than p which uses each edge only once. This is a contradiction; therefore p is an Eulerian cycle.

Remark 8.5. An Eulerian cycle in a directed graph G = (V, E) with E ⊆ V × V goes through all directed edges xy ∈ E from x to y. Also, loops xx are no problem here. The above proof can easily be adapted to the case of directed graphs, see Exercise 8.9 for details. In this case, the statement to be shown is that a connected directed graph has an Eulerian cycle if and only if for each vertex x, the indegree (i.e., the number of edges of the form yx) is equal to the outdegree (i.e., the number of edges of the form xy). ⬦

An Eulerian path is a path that uses each edge of a graph exactly once. Every Eulerian cycle is also an Eulerian path, but an Eulerian path can end at a different vertex than it begins. From Theorem 8.4 it is easy to derive that a graph has an Eulerian path if and only if at most two of its vertices have an odd degree: One draws a path of length 2 with a new vertex between the two vertices of odd degree (by Corollary 8.3, we know that it is impossible that exactly one vertex has odd degree). In the resulting graph, each vertex has an even degree and Theorem 8.4 can be applied. We also see that each Eulerian path in this case starts at one vertex of odd degree and ends at the other one. This immediately leads to a solution for the following children's riddle: Can the following picture be drawn without lifting the pencil and without drawing a line more than once?

House of Santa Claus

A quite similar concept to the Eulerian cycle is the Hamiltonian cycle (after Sir William Rowan Hamilton, 1805–1865). A Hamiltonian cycle is a cycle v0 . . . vn that visits each vertex exactly once (where, of course, the start and end point v0 = vn is counted only once). For example, the complete graph Kn for n ≥ 3 always has a Hamiltonian cycle. The Petersen graph, however, has no Hamiltonian cycle. A Hamiltonian cycle of the dodecahedron is indicated by thick edges in the following picture:


Hamiltonian cycle in a dodecahedron

In contrast to Eulerian cycles, no simple criterion is known that can be used to check whether a graph has a Hamiltonian cycle (such a criterion would solve the so-called P-NP problem). Theorem 8.6, known as Ore's theorem (Øystein Ore, 1899–1968), provides a sufficient condition.

Theorem 8.6 (Ore, 1960). If in a graph G = (V, E) with |V| ≥ 3 every pair of nonadjacent vertices x, y satisfies the condition dx + dy ≥ |V|, then G has a Hamiltonian cycle.

Proof. Suppose there is a graph with n ≥ 3 vertices that satisfies the assumptions of the theorem but has no Hamiltonian cycle. Let G = (V, E) be such a graph with |V| = n and with a maximum number of edges, and let xy ∉ E. Since E is maximal, the graph (V, E ∪ {xy}) has a Hamiltonian cycle, and this cycle uses the edge xy. If we omit the edge xy, then the result is a path v1 . . . vn in G from x = v1 to y = vn that visits each vertex exactly once. Let X = {vi | vi+1 x ∈ E} and Y = {vi | vi y ∈ E}. This yields y ∉ X and y ∉ Y, as well as |X| = dx and |Y| = dy. But dx + dy ≥ n, and consequently X ∩ Y ≠ ∅. Let vi ∈ X ∩ Y. The sequence of vertices v1 v2 . . . vi vn vn−1 . . . vi+1 v1 defines a Hamiltonian cycle because vi vn = vi y ∈ E and vi+1 v1 = vi+1 x ∈ E. This is a contradiction. So there is no graph G that satisfies the assumptions of the theorem but does not have a Hamiltonian cycle.
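Both Euler's criterion (Theorem 8.4) and Ore's condition (Theorem 8.6) are straightforward to test mechanically. The following Python sketch (our illustration; the function names are ours, and edges are given as 2-element sets) checks them on a cycle C4 and a path P4:

```python
from itertools import combinations

def degrees(V, E):
    return {x: sum(1 for e in E if x in e) for x in V}

def connected(V, E):
    seen, stack = set(), [next(iter(V))]
    while stack:
        x = stack.pop()
        if x not in seen:
            seen.add(x)
            stack.extend(y for e in E if x in e for y in e if y != x)
    return seen == set(V)

def has_eulerian_cycle(V, E):
    # Theorem 8.4: connected and all vertex degrees even
    return connected(V, E) and all(d % 2 == 0 for d in degrees(V, E).values())

def ore_condition(V, E):
    # Theorem 8.6: d_x + d_y >= |V| for every nonadjacent pair x, y
    d = degrees(V, E)
    return all(d[x] + d[y] >= len(V)
               for x, y in combinations(V, 2) if {x, y} not in E)

C4 = ({1, 2, 3, 4}, [{1, 2}, {2, 3}, {3, 4}, {4, 1}])
P4 = ({1, 2, 3, 4}, [{1, 2}, {2, 3}, {3, 4}])
print(has_eulerian_cycle(*C4), has_eulerian_cycle(*P4))  # True False
print(ore_condition(*C4), ore_condition(*P4))            # True False
```

Note that Ore's condition is only sufficient: a graph may have a Hamiltonian cycle without satisfying it.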

8.3 Trees

A tree is a nonempty connected graph without simple cycles. A vertex of a tree is a leaf if its degree is at most 1. The vertices that are not leaves are called inner vertices. The following figure shows all trees with 5 vertices.


The three trees with 5 vertices

Sometimes, in a tree one vertex is marked and called the root of the tree. Then, one speaks of a rooted tree. The idea is that the root in some sense corresponds to the beginning of the tree. Theorem 8.7 summarizes some properties of (unrooted) trees:

Theorem 8.7. Let G = (V, E) be a nonempty graph. The following properties are equivalent:
(a) G is a tree.
(b) Each pair of vertices from V is connected by exactly one simple path in G.
(c) G is connected and |E| = |V| − 1.
(d) G is connected, but removing any edge from E makes the graph disconnected.

Proof. (a) ⇒ (b) Suppose two vertices x, y ∈ V are connected by two different paths xv1 . . . vm−1 y and xw1 . . . wn−1 y. We choose the vertices x and y such that m + n is minimal. Then {v1, . . . , vm−1} ∩ {w1, . . . , wn−1} = ∅. Hence, xv1 . . . vm−1 ywn−1 . . . w1 x is a simple cycle. This is a contradiction because G is a tree. Thus, there is at most one simple path between any two vertices and, since G is connected, at least one such path exists.

(b) ⇒ (c) The idea is the following: if we fix a root, then each vertex other than the root has exactly one outgoing edge that leads toward the root, and every edge arises this way for some vertex. There are |V| − 1 vertices which are not the root, so this equals the number of edges. This idea is described in more detail below. We choose any vertex r ∈ V to be the root and assign to each vertex x ∈ V \ {r} the first edge ex = xv1 ∈ E of the unique path xv1 . . . vm−1 r connecting x with the root. Further, we define the height h(x) of x as the length of the path from x to r, i.e., h(x) = m. If a vertex y ≠ x lies on the path from x to the root, then h(y) < h(x). Suppose ex = ey for two vertices x ≠ y. Let p = xv1 . . . vm−1 r and q = yw1 . . . wn−1 r be the two simple paths in G. Then v1 = y and w1 = x. Hence, h(x) < h(y) and h(y) < h(x). This is a contradiction, and therefore ex ≠ ey whenever x ≠ y. Consequently, for EV = {ex ∈ E | x ∈ V \ {r}} we obtain |EV| = |V \ {r}| = |V| − 1. Suppose there is an edge xy ∈ E \ EV. Without restriction, we may assume h(y) ≤ h(x). If y is lying on the path from x to the root, then from ex ≠ xy we get h(y) ≤ h(x) − 2. In any case, using the edge xy, we get a new simple path from x to the root via the vertex y, which is a contradiction to (b). So E = EV and therefore |E| = |V| − 1.

(c) ⇒ (d) Let e ∈ E. The graph G′ = (V, E \ {e}) has only |V| − 2 edges, but |V| − 2 edges can connect at most |V| − 1 vertices. So G′ is not connected.

(d) ⇒ (a) If G were to contain a simple cycle, then we could remove any edge of this cycle and the resulting graph would still be connected. This is a contradiction to (d). So G cannot contain any simple cycle.

Starting with any connected graph, we can by Theorem 8.7 (d) remove edges until we end up with a tree. Therefore, each connected graph contains (at least) one so-called spanning tree. Below, we state this formally in Corollary 8.8.

Corollary 8.8. Every nonempty connected graph G = (V, E) contains a tree with vertex set V as a subgraph.

In particular, Corollary 8.8 shows that any connected graph with n vertices has at least n − 1 edges. Spanning trees are a simple but versatile tool in graph theory. We consider the following example: Let G be a connected graph with n vertices. Using a depth-first search technique on a spanning tree of G yields a (not simple) cycle of length 2(n − 1) which visits each vertex of G at least once. Of course, the concept of spanning trees can also be generalized to not necessarily connected graphs by considering the connected components separately. Another simple corollary of Theorem 8.7 justifies the chosen terminology.

Corollary 8.9. Every tree has leaves.

Proof. Suppose each vertex of the tree G = (V, E) has degree at least 2. By Theorem 8.2, this implies |E| ≥ |V|, a contradiction to Theorem 8.7 (c). Thus, G has to have at least one leaf.

In fact, any tree with at least two vertices even has at least two leaves. The proof technique is typical: one has to pick leaves. For a tree with two vertices, the assertion is obvious. If a tree has at least three vertices, we can remove ("pick") a leaf x; this leaf exists according to Corollary 8.9. The resulting tree has two leaves y and z by the induction hypothesis. Since x was connected to only one vertex, at least one of y and z is also a leaf in the original tree. Together with x, we have two leaves. For all n ≥ 2, the path Pn is an example of a tree with n vertices and exactly two leaves.

8.4 Cayley’s formula Cayley’s formula (Arthur Cayley, 1821–1895) determines the number of spanning trees in a complete graph with n vertices. This is not the same as to determine the number of trees with n vertices up to isomorphism. Note, e. g., that there is only one tree with three vertices, namely the graph P3 , but there are three spanning trees with the vertex set {1, 2, 3}. 3 1

3 2

1

3 2

1

2

We consider the complete graph Kn with the set V = {1, . . . , n} of vertices. The set E is \binom{V}{2} and contains \binom{n}{2} edges. A spanning tree consists of n − 1 edges, so there are \binom{m}{n−1} potential candidates for spanning trees, where m = \binom{n}{2}. Of course, the number of spanning trees is much less. The exact number is given in the following theorem. There are several proofs for this classical theorem of enumerative combinatorics. We use the method to represent trees by so-called Prüfer codes. These codes are named after Ernst Paul Heinz Prüfer (1896–1934), who used them to find a very elegant and simple proof for Theorem 8.10 in 1918. We present his proof below.

Theorem 8.10 (Cayley's formula). Let n ≥ 2. The number of spanning trees in a complete graph with n vertices is n^{n−2}.

Proof. Let Kn = (V, E) be the complete graph with vertex set V and |V| = n. We assume the vertices in V to be linearly ordered. Further, let (V, T) with T ⊆ E be a spanning tree of Kn. We encode T by a sequence in V^{n−2}. For n = 2, this is the empty sequence, which corresponds to the only spanning tree with T = E in this special case. Now let n ≥ 3. Let b1 ∈ V be the smallest leaf of (V, T). The point now is to list the neighbor p1 of b1 and not the leaf b1 itself. Thus, b1p1 ∈ T. We define V′ = V \ {b1} and T′ = T \ {b1p1}. Then (V′, T′) is a spanning tree for a complete graph with n − 1 vertices. By induction, there is a sequence (p2, . . . , pn−2) which encodes (V′, T′). We define the encoding of (V, T) by the sequence (p1, p2, . . . , pn−2) and call this sequence the Prüfer code for (V, T). By induction, one can see that {p1, . . . , pn−2} is exactly the set of inner vertices of T. In particular, some of the pi's may coincide. Therefore, from the sequence (p1, . . . , pn−2) we can recover the leaf b1 as the smallest element in the vertex set V \ {p1, . . . , pn−2}. Hence, we can conclude b1p1 ∈ T. Now, inductively, we reconstruct from (p2, . . . , pn−2) the spanning tree T′ with vertex set V′ = V \ {b1}. We obtain T = T′ ∪ {b1p1}. The mapping which assigns to each spanning tree T its Prüfer code, therefore, is an injective mapping from the set of all spanning trees for (V, E) to the set V^{n−2}. So it only remains to show that every sequence (p1, . . . , pn−2) is the Prüfer code of a spanning tree. Let b1 be the smallest element in V \ {p1, . . . , pn−2}. By induction, the partial sequence (p2, . . . , pn−2) is a Prüfer code of a spanning tree T′ of V′ = V \ {b1}. Thus, (V′, T′) is connected and T′ has n − 2 edges. If we define T = T′ ∪ {b1p1}, then (V, T) is connected and T has n − 1 edges. So (V, T) is a spanning tree with Prüfer code (p1, . . . , pn−2).

The procedure for the generation of the Prüfer code for a spanning tree (V, T) can be described as follows. One starts with the ordered vertex set V = {1, . . . , n}. Then the smallest leaf is picked and its neighbor is listed. Now continue with the smaller tree, and go on this way until only two vertices remain. Thus, the procedure terminates exactly when there are no inner vertices in the tree anymore. Only inner vertices of T are listed. Conversely, the tree for a given Prüfer code (p1, . . . , pn−2) that was formed in this way can be reconstructed by first setting V = {1, . . . , n} (where n is the length of the sequence increased by two). Now it is possible to find out which leaf was picked first, namely the smallest vertex in V \ {p1, . . . , pn−2}. Clearly, the remaining tree has been formed over the vertex set V′ = V \ {b1}, and its Prüfer code is (p2, . . . , pn−2).

The corresponding tree (V′, T′) can be determined inductively from this code and, together with the vertex b1 and the edge b1p1, the resulting graph is exactly the tree we are looking for; its Prüfer code is (p1, . . . , pn−2). The recursion terminates as soon as n = 2 and the Prüfer code is empty. Then the tree to be formed is uniquely determined by the two vertices from the vertex set. The following diagram shows an example.

The tree for the Prüfer code (2, 7, 7, 1, 7, 1)

Trees are the most commonly encountered graph class, in many cases as binary trees (all vertices have maximum degree 3, and the root has maximum degree 2), but also as probability trees, or as search trees for ordered sets, or, like in Section 5.11, representing bracketed expressions, such as those found in the structure of XML (Extensible Markup Language) documents.
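The encoding and decoding procedures translate directly into code. The following Python sketch (ours; the edge set T below is the tree reconstructed from the code of the diagram, since decoding is unique) implements both directions:

```python
def pruefer_code(n, T):
    """Prüfer code of a spanning tree T (edges as 2-element sets) on the
    vertex set {1, ..., n}: repeatedly pick the smallest leaf and list
    its neighbor, exactly as in the proof of Theorem 8.10."""
    T = set(map(frozenset, T))
    code = []
    for _ in range(n - 2):
        leaves = {x for x in range(1, n + 1)
                  if sum(1 for e in T if x in e) == 1}
        b = min(leaves)
        e = next(e for e in T if b in e)
        (p,) = e - {b}                 # the neighbor of the leaf is listed
        code.append(p)
        T.remove(e)
    return code

def pruefer_tree(code):
    """Reconstruct the tree on {1, ..., len(code) + 2} from its Prüfer code."""
    n = len(code) + 2
    vertices, T = set(range(1, n + 1)), set()
    for i, p in enumerate(code):
        b = min(vertices - set(code[i:]))   # the leaf picked at this step
        T.add(frozenset({b, p}))
        vertices.remove(b)
    T.add(frozenset(vertices))              # the two remaining vertices
    return T

T = {(2, 3), (2, 7), (4, 7), (6, 7), (1, 7), (1, 5), (1, 8)}
assert pruefer_code(8, T) == [2, 7, 7, 1, 7, 1]
assert pruefer_tree([2, 7, 7, 1, 7, 1]) == set(map(frozenset, T))
```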

8.5 Hall’s marriage theorem Let A and B be disjoint sets, and let G = (A ∪ B, E) be a bipartite graph, i. e., every edge in E connects a vertex from A with a vertex from B. A subset M ⊆ E is a matching if no two edges in M have a common vertex. We call M a perfect matching for A if every vertex of A is contained in an edge of M. a

1

a

1

a

1

b

2

b

2

b

2

c

3

c

3

c

3

d

4

d

4

d

4

Bipartite graph

Matching

Perfect matching

Hall’s marriage theorem provides a necessary and sufficient condition for a perfect matching for A to exist. For this, let NG (a) = {b ∈ B | ab ∈ E}

8.5 Hall’s marriage theorem

� 157

be the set of neighbors of a ∈ A in the graph G. This notation can be extended to subsets X ⊆ A by the definition NG(X) = ⋃_{a∈X} NG(a). The marriage condition is the following: |NG(X)| ≥ |X| for all subsets X ⊆ A. Obviously, if |B| < |A|, then no perfect matching can exist. Similarly, the marriage condition has to be valid if a perfect matching exists. The main statement of Hall's marriage theorem is that the converse is also true: if the marriage condition holds, then there is a perfect matching for A. The theorem in this form was shown in 1935 by Philip Hall (1904–1982). In a slightly different formulation, it had already been shown in 1931 in two independent papers by Kőnig (Dénes Kőnig, 1884–1944) and Egerváry (Jenő Egerváry, 1891–1958). Even earlier, already in 1929, Menger's theorem was proven, and Hall's marriage theorem can easily be derived from Menger's theorem. Nevertheless, today the marriage theorem is usually attributed to Hall.

The marriage theorem got its name from the following situation: Think of A (like Alice) as a set of women, and B (like Bob) as a set of men. An edge between a woman and a man exists if marriage is possible. Then, a perfect matching means that it is possible to marry all persons from group A (in this case the women) to persons from group B (the men), without marrying a man to more than one woman.

Theorem 8.11 (Hall's marriage theorem). Let G = (A ∪ B, E) be a bipartite graph. There is a perfect matching for A if and only if |NG(X)| ≥ |X| holds for all X ⊆ A.

Proof. If M is a perfect matching, then the graph (A ∪ B, M) satisfies the marriage condition. But then, G clearly satisfies the marriage condition as well.

For the converse direction, let G be a graph that satisfies the marriage condition. If for each proper subset ∅ ≠ X ⊊ A the strict inequality |NG(X)| > |X| is valid, then we can remove an arbitrary edge e from G, and the remaining graph still satisfies the marriage condition and therefore has a perfect matching by induction on the number of edges. But then, this matching is also a perfect matching for G. Now we may assume that the above is not the case, i.e., there exists a nonempty set X ⊊ A with |NG(X)| = |X|. Let G1 be the subgraph of G induced by X ∪ NG(X), and let G2 be the subgraph induced by the remaining vertices (outside X ∪ NG(X)). The graph G1 satisfies the marriage condition because NG1(X′) = NG(X′) is true for all X′ ⊆ X. It remains to show that G2 satisfies the marriage condition because then, by induction, both G1 and G2 have perfect matchings, and their union is a perfect matching for A. Let G2 = (A′ ∪ B′, E′) with A′ ⊆ A and B′ ⊆ B. For X′ ⊆ A′, we have

|NG2(X′)| + |NG(X)| ≥ |NG(X′ ∪ X)| ≥ |X′ ∪ X| = |X′| + |X|

and thus |NG2(X′)| ≥ |X′|. Therefore, G2 satisfies the marriage condition, which completes the proof.

Frequently, in applications of Hall's marriage theorem, Theorem 8.11, we have |A| = |B|. Then every perfect matching for A in the bipartite graph G = (A ∪ B, E) is also a perfect matching for B.
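The marriage condition quantifies over all subsets of A, so testing it literally takes time exponential in |A|; for small instances this is still a useful experiment. A brute-force Python sketch (our illustration; the function name is ours):

```python
from itertools import combinations

def marriage_condition(A, B, E):
    """Brute-force test of |N_G(X)| >= |X| for all nonempty X subset of A.
    E is a set of pairs (a, b) with a in A and b in B."""
    N = {a: {b for (x, b) in E if x == a} for a in A}
    return all(len(set().union(*(N[a] for a in X))) >= len(X)
               for k in range(1, len(A) + 1)
               for X in combinations(A, k))

A, B = {"a", "b", "c"}, {1, 2, 3}
E = {("a", 1), ("a", 2), ("b", 1), ("c", 3)}
print(marriage_condition(A, B, E))  # True, so a perfect matching for A exists
```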


8.6 Stable marriage

The Nobel Memorial Prize in Economic Sciences, more precisely the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel, was awarded in 2012 to Alvin Elliot Roth (born 1951) and Lloyd Stowell Shapley (1923–2016). This was to honor their contribution to the theory of stable assignments and the design of certain markets. A central part of their work was the Gale–Shapley algorithm, which had already been developed 50 years before by David Gale (1921–2008) and Shapley. The empirical work carried out later by Roth then demonstrated the importance of stability under real-world conditions. In the current section, we present the fundamental algorithm of Gale and Shapley.

Let A and B be sets of n people each. Without restriction, let A be the women and B the men. Each person has a preference list about the people of the opposite gender. For a ∈ A, we can think of the preference list Pa as a linear order ba(1) > ⋅⋅⋅ > ba(n) with B = {ba(1), . . . , ba(n)}. If ba(i) precedes ba(j) in the order, then a prefers the man ba(i) to ba(j). Analogously, every man b ∈ B has a preference list Pb. A marriage (or the matching M ⊆ A × B) is stable if all the women are married and there are no divorces. A divorce occurs when there are two pairs (a, b′), (a′, b) such that Pa(b) > Pa(b′) and Pb(a) > Pb(a′) because in this case a and b would get divorced from their partners and form a new couple (a, b). If a′ and b′ then get together, they will all be married again. So a's and b's happiness increased, while a′'s and b′'s may have decreased.

The situation of two couples is easy to analyze. There are two women a, a′ and two men b, b′. Let us first consider the case when one pair mutually give the highest preference to each other. Without restriction, this is the pair (a, b). Thus, Pa(b) > Pa(b′) and Pb(a) > Pb(a′). In this case, (a, b), (a′, b′) is stable, and it is the only possible stable marriage. If no such pair exists, the preferences cross each other. Let us say a has b as her favorite, but b favors a′, who in turn prefers b′, while b′ gives woman a the higher preference. We get a circular arrangement of the highest preferences:

a → b → a′ → b′ → a

In this case, both possible solutions are stable; they differ in that either the two men or the two women marry their favorites. There are also more complicated situations where a stable marriage is possible, but only if one of the two groups is privileged.

In general, the result will not be a stable marriage if couples meet in random order and, according to their preferences, divorce from their current partner and remarry. In the previous situation, (a, b) could be married initially, then change to (a′, b), then (a′, b′) and (a, b′), and finally, (a, b) again. The other two people here in each of these cases are unmarried. In our view of a stable marriage, there are no unmarried people, so we add a woman a0 and a man b0 who have the lowest preference in all the lists of the original women and men, respectively. People who in the above situation are unmarried now get married to a0 or b0. Then, this example yields an infinite sequence of unstable marriages where all people are married. We illustrate one pass in the following picture. The dashed edge in each case indicates where the stability is broken. After only four reorientations of the pairs, we are back in the initial position.

Five successive marriages of the people a, a′, a0 and b, b′, b0

For the calculation of a stable marriage, one usually separates engagement and marriage to avoid divorces. To get engaged means to choose a partner tentatively, while marrying is irrevocable. The Gale–Shapley algorithm constructs a marriage as follows:
(a) Initially, nobody is engaged or married.
(b) As long as there is an unengaged man b ∈ B, he makes a proposal of marriage to the woman a ∈ A whom he has not asked before and who has the highest preference in his list among these women. The woman a accepts the proposal and becomes engaged to b if she does not have a partner yet, or if she prefers b to her current fiancé. If necessary, an engagement is broken in order to enter into another.
(c) If all men are engaged, then everyone marries his fiancée.

In the following, each pass of step (b) is called a round.

Theorem 8.12 (Gale, Shapley 1962). The Gale–Shapley algorithm constructs a stable marriage in at most n^2 rounds.

Proof. Engaged women remain engaged in each round. A woman only reengages if this is an improvement for her. In particular, none of the women gets engaged more than n times. After n^2 rounds at the latest, every woman has received a proposal, and all women (and therefore also all men) are engaged.

Suppose the marriage constructed by the algorithm were unstable. This means that there are two married couples (a, b′) and (a′, b) with Pa(b) > Pa(b′) and Pb(a) > Pb(a′). But then, prior to the engagement of b to a′, a either turned down a proposal from b or left him. The reason for this was an engagement to a man b′′ with Pa(b′′) > Pa(b). Since women only improve during the process, we can conclude Pa(b′) ≥ Pa(b′′). This is a contradiction to Pa(b) > Pa(b′). In particular, there is always a stable marriage.

The marriage constructed by the Gale–Shapley algorithm can be described even more precisely.
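The algorithm translates almost literally into code. The following Python sketch (our illustration; preferences are given as dicts mapping each person to a list with the most preferred partner first) computes the engagements of step (b) until all men are engaged:

```python
def gale_shapley(men_pref, women_pref):
    """Gale-Shapley with proposing men; returns the engagements
    as a dict woman -> man."""
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_pref.items()}
    next_choice = {m: 0 for m in men_pref}   # next woman on each man's list
    fiance = {}                              # current engagements
    free_men = list(men_pref)
    while free_men:
        m = free_men.pop()
        w = men_pref[m][next_choice[m]]
        next_choice[m] += 1
        if w not in fiance:
            fiance[w] = m                    # w accepts her first proposal
        elif rank[w][m] < rank[w][fiance[w]]:
            free_men.append(fiance[w])       # w improves and breaks her engagement
            fiance[w] = m
        else:
            free_men.append(m)               # w rejects the proposal
    return fiance

men = {"b": ["a", "a2"], "b2": ["a", "a2"]}
women = {"a": ["b2", "b"], "a2": ["b2", "b"]}
print(gale_shapley(men, women))  # {'a': 'b2', 'a2': 'b'}
```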

Remark 8.13. The Gale–Shapley algorithm is optimal for the men: each man is eventually engaged to the woman with the highest preference among all women with whom a stable solution is possible for him.

To prove this fact, it is sufficient to show that a pair (a, b′) ∈ A × B is not possible in any stable marriage if the woman a denied a proposal of b′ or left b′ during the Gale–Shapley procedure. Suppose, in contradiction, there is a first time t when a woman a denies or leaves a man b′, although a stable marriage M with (a, b′) ∈ M is possible. In both cases, the reason must be a man b with Pa(b) > Pa(b′). We have (a′, b) ∈ M for a woman a′ ≠ a. If Pb(a) < Pb(a′), then before time t, a′ would have denied b or left him. This is impossible by the minimality of t. Thus, Pb(a) > Pb(a′). Now, if (a, b′) and (a′, b) meet, a and b will leave their partners and form a new pair (a, b). Therefore, M is not stable, which is a contradiction. ⬦

According to Remark 8.13, the order of the proposals is irrelevant in the Gale–Shapley procedure. Men can even take their time. This changes if women use an only partially ordered list of preferences. Then it is important whose proposal comes first. Women are highly underprivileged in the algorithm: every woman gets the man with the lowest preference among all choices that allow a stable marriage, see also Exercise 8.16.

8.7 Menger’s theorem Menger’s theorem (Karl Menger, 1902–1985) establishes a connection between one parameter to be maximized and another to be minimized. Statements of this kind are typical in the field of graph theory. More specifically, one parameter is the minimum number of vertices needed to separate two (not necessarily disjoint) vertex sets A and B in a given graph G. The other parameter is the maximum number of disjoint paths in G leading from A to B. Menger’s theorem says that these two parameters coincide. Let G = (V , E) be a directed or undirected graph and let A, B ⊆ V . An AB-path is a path x0 . . . xn with x0 ∈ A and xn ∈ B, and the vertices in between, i. e., x1 , . . . , xn−1 satisfy {x1 , . . . , xn−1 } ∩ (A ∪ B) = 0. Note that, for an AB-path, the case x0 = xn ∈ A ∩ B is possible. An AB-separator is a subset C ⊆ V such that every AB-path uses at least one vertex from C. Two paths are disjoint if they have no vertices in common. Theorem 8.14 (Menger 1929; directed graphs, disjoint paths). Let G = (V , E) be a directed graph and A, B ⊆ V . Then the size k of a smallest AB-separator is equal to the maximum number of pairwise disjoint AB-paths. Proof. If C is an AB-separator, then there are at most |C| disjoint AB-paths. It remains to show that k disjoint AB-paths exist. If E = 0 then |A ∩ B| = k and the vertices in A ∩ B form k disjoint AB-paths of length 0. Now let e = xy be a directed edge in E. Removing the edge e yields the graph G′ = (V , E \ {e}). If the smallest AB-separator in G′ has size k, then, by induction, there are k disjoint AB-paths in G′ . These are also disjoint in G.


Now, let C be an AB-separator in G′ with |C| < k. Both S = C ∪ {x} and T = C ∪ {y} are AB-separators in G. Therefore, |S| = k = |T|. Every AS-separator in G′, but also every TB-separator in G′, is an AB-separator in G; this is due to the orientation of the edge e from x to y. By induction, there are both k disjoint AS-paths 𝒫 in G′ and k disjoint TB-paths 𝒬 in G′. The paths from 𝒫 and the paths from 𝒬 only intersect in C because otherwise there would be an AB-path which is disjoint from C. In each vertex from S, an AS-path from 𝒫 terminates, and in each vertex from T, a TB-path from 𝒬 begins. Thus, we can concatenate the paths from 𝒫 and from 𝒬 and obtain k disjoint AB-paths in G. Note that the path terminating in x is continued with the path beginning in y, which is possible because xy ∈ E.

This proof of Theorem 8.14 appeared in the year 2000 in a paper by Frank Göring [21].

Theorem 8.15 (Menger 1929; undirected graphs, disjoint paths). Let G = (V, E) be a graph and A, B ⊆ V. Then the size of a smallest AB-separator is equal to the maximum number of pairwise disjoint AB-paths.

Proof. This follows from the directed version of Menger's theorem, Theorem 8.14, if instead of an undirected edge {x, y} we use the two orientations (x, y) and (y, x).

Hall's marriage theorem, Theorem 8.11, can be obtained as a corollary to Menger's theorem: The marriage condition says that A is a minimal AB-separator, and then Menger's theorem returns the perfect matching as disjoint AB-paths of length 1.

8.8 Maximum flows

Before introducing the notion of flows, we want to consider a variant of Theorem 8.14 which characterizes the maximum number of edge-disjoint paths between two vertices. For this, we need a bit more notation. In the following variant of Menger's theorem, multiple edges will be allowed. Therefore, we consider edge sequences instead of paths. Let G = (V, E, σ, τ) be a graph, where σ(e) is the source vertex of edge e ∈ E and τ(e) is its target vertex. An edge sequence is a sequence of edges π = e1 . . . en such that τ(ei) = σ(ei+1) for all 1 ≤ i < n. We call π an st-sequence if σ(e1) = s and τ(en) = t. The edge sequence π defines the path σ(e1) . . . σ(en)τ(en). Two edge sequences π1 = e1 . . . em and π2 = f1 . . . fn are disjoint if ei ≠ fj for all i, j.

Let s, t ∈ V be two different vertices of G = (V, E, σ, τ). An st-cut is a pair (A, B) such that s ∈ A ⊆ V and t ∈ B = V \ A. Thus, an st-cut (A, B) divides V into two nonempty disjoint parts, and each edge sequence from s to t has to leave A and enter B. The weight of (A, B) is the number of edges e ∈ E such that σ(e) ∈ A and τ(e) ∈ B. These are exactly the edges that allow passing over from A to B.

Theorem 8.16 (Menger 1929; directed graphs, disjoint edges). Let G = (V, E, σ, τ) be a directed graph, where multiple edges are allowed, and let s, t ∈ V with s ≠ t. Then the minimum weight of an st-cut is equal to the maximum number of disjoint st-sequences.

Proof. We use a construction called edge graph (or line graph). The edge graph L(G) of G is defined such that E is its vertex set and F is its edge set, where

F = {(e, f) ∈ E × E | τ(e) = σ(f)}

i.e., the edges of the original graph G are the vertices of the edge graph, and an edge is drawn from e to f if the target vertex of e is equal to the source vertex of f. Note that the edge graph L(G) never has multiple edges. Therefore, any path e1 . . . em in L(G) corresponds to an edge sequence e1 . . . em in G, and vice versa.

A directed graph G with edges e1, . . . , e7, and its associated edge graph L(G)

Now, the claim follows if we apply the directed version of Menger’s theorem (Theorem 8.14) to the edge graph L(G) with A = {e | σ(e) = s} and B = { f | τ(f ) = t}.
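The edge graph construction is a one-liner if edges carry explicit names (so that multiple edges stay distinguishable). A minimal Python sketch (ours; edges are triples consisting of a name, a source, and a target):

```python
def edge_graph(edges):
    """Edge graph L(G) of a directed multigraph: there is an edge (e, f)
    whenever the target of e equals the source of f."""
    return {(e, f)
            for (e, _, y) in edges
            for (f, x, _) in edges
            if y == x}

G = [("e1", "s", "u"), ("e2", "u", "t"), ("e3", "s", "t")]
print(edge_graph(G))  # {('e1', 'e2')}, since tau(e1) = sigma(e2) = u
```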

8.9 Max-flow min-cut theorem

Now we will consider a quantitative version of the previous theorem, the so-called max-flow min-cut theorem. A proof was first published in a technical report by Ford and Fulkerson (Lester Randolph Ford, Jr., 1927–2017, and Delbert Ray Fulkerson, 1924–1976) [19]. Simultaneously with the journal publication [20] in 1956, Peter Elias (1923–2001), Amiel Feinstein (born 1930), and Claude Elwood Shannon (1916–2001) found another proof [17].

In order to formulate the max-flow min-cut theorem, we have to enter the world of networks and flows and provide some preparation. A flow network consists of a finite set of vertices V with two distinguished vertices, a source vertex s and a target vertex t, and a capacity function c : V × V → ℝ≥0 into the nonnegative real numbers. The capacity function defines a weighted directed graph, where we will only draw the edges with positive capacity. We can think of an edge (x, y) as a piece of pipe or tube with capacity c(x, y). Accordingly, a flow network (V, c, s, t) is a system of tubes or pipes with distinguished source s and target t. A typical


problem in combinatorial optimization is to compute a maximum flow from the source to the target which does not exceed any of the capacities. Formally, a flow is a mapping f : V × V → ℝ satisfying the following three conditions:
– (skew symmetry) f(x, y) = −f(y, x) for all x, y ∈ V.
– (conservation of flow) If u ∈ V and s ≠ u ≠ t, then ∑_{v∈V} f(u, v) = 0.
– (capacity condition) f(x, y) ≤ c(x, y) for all x, y ∈ V.

We can interpret the skew symmetry such that a positive flow from x to y is the same as a negative flow from y to x. Consequently, f(x, x) = 0 for all vertices x, so it is not a constraint to require c(x, x) = 0 for all x. Thus, flow networks have no loops. Conservation of flow states that in every inner vertex the incoming flow equals the outgoing flow. This is not demanded for the source and target vertices. We measure the value ‖f‖ of a flow f : V × V → ℝ at the source by

‖f‖ = ∑_{y∈V} f(s, y)

As with flow networks, for flows we only draw edges xy where the value f(x, y) is positive.

A flow network (left), and a flow with value 11 (right)

The definition of the value ‖f‖ regards the source, but not the target. The next lemma shows that the same value could be measured at the target as well. More generally, we consider arbitrary st-cuts. As with graphs, an st-cut (A, B) of a flow network (V, c, s, t) is characterized by the conditions s ∈ A ⊆ V and t ∈ B = V \ A.

Lemma 8.17. Let f : V × V → ℝ be a flow and (A, B) an st-cut. Then

‖f‖ = ∑_{x∈A, y∈B} f(x, y)

In particular, ‖f‖ = ∑_{y∈V} f(s, y) = ∑_{x∈V} f(x, t).

Proof. We show the statement by induction on |A|. For A = {s}, the statement is trivial. Now let A = A′ ∪ {u} with s ∈ A′ and u ∉ A′. By induction, ‖f‖ = ∑_{x∈A′, y∈B′} f(x, y) for B′ = B ∪ {u}. Now,

∑_{x∈A, y∈B} f(x, y) = ∑_{x∈A′, y∈B′} f(x, y) + ∑_{y∈B} f(u, y) − ∑_{x∈A} f(x, u)

But s ≠ u ≠ t, and therefore, using the skew symmetry and flow conservation conditions, we obtain the claim:

∑_{y∈B} f(u, y) − ∑_{x∈A} f(x, u) = ∑_{y∈B} f(u, y) + ∑_{x∈A} f(u, x) = ∑_{v∈V} f(u, v) = 0

Since f(t, t) = 0, the statement ‖f‖ = ∑_{x∈V} f(x, t) appears as the special case where B = {t}.

The capacity c(A, B) of an st-cut (A, B) is defined by

c(A, B) = ∑_{x∈A, y∈B} c(x, y)

By the capacity condition, Lemma 8.17 says that the capacity of any st-cut is an upper bound for the value of a flow. For all st-cuts (A, B) and all flows f, we have the following estimate:

‖f‖ ≤ c(A, B)    (8.1)

The following Theorem 8.18 by Ford and Fulkerson states that this estimate is tight. In Section 8.10 below, we will present a more general and, moreover, algorithmic version of the max-flow min-cut theorem. First, we want to show that this theorem can be seen as a corollary to Menger's theorem, Theorem 8.16.

Theorem 8.18 (Max-flow min-cut theorem). Let N = (V, c, s, t) be a flow network. The maximum value ‖f‖ of an st-flow f is equal to the minimum capacity c(A, B) of an st-cut (A, B). Furthermore, if all capacities are natural numbers, then the maximum flow f can be chosen to be integer valued on every edge.

Proof. By Equation (8.1), it suffices to show the existence of a flow whose value is equal to the capacity of an st-cut. First, let all capacities be integers. Then we replace each edge (x, y) by c(x, y) copies, which results in an unweighted directed graph G with multiple edges. Now, from x to y there are c(x, y) edges. Let (A, B) be a minimal st-cut in G. The weight k of (A, B) is equal to c(A, B). According to Theorem 8.16, we can find k disjoint st-sequences π1, . . . , πk in G. For two vertices x, y, let w(x, y) be the number of edges from x to y occurring in one of the edge sequences πi. We define a flow f by f(x, y) = w(x, y) − w(y, x). Then, ‖f‖ = k = c(A, B). Thus, in particular, the second part of the theorem is proven. Extending the proof to rational capacities is easy: we multiply by the least common denominator of the capacities. This reduces the case of rational capacities to that of integer capacities. If irrational capacities occur, this case can be reduced to rational capacities by continuity arguments. For real number capacities, for now we are content with this proof sketch because we will study a different proof in the next section.

Residual graphs and augmenting paths

We consider another approach to Theorem 8.18 because we want to work toward the algorithmic solution. For an arbitrary flow f (e.g., the zero flow), the residual graph (V, Rf) is defined by the edge set

Rf = {(x, y) ∈ V × V | c(x, y) > f(x, y)}

Let A ⊆ V be the set of all vertices which are reachable from s in the residual graph. If we let B = V \ A, then

∑_{x∈A, y∈B} (c(x, y) − f(x, y)) = 0

Now, if t ∉ A, then (A, B) defines an st-cut satisfying ‖f‖ = c(A, B). For t ∈ A, we can increase the value of the flow along a path from s to t in the residual graph.

Next, we shall investigate more precisely the increase of the value of f along a path in the residual graph. It is crucial to find a clever way to implement an efficient algorithm for solving the flow problem. For this purpose, we consider a central concept in flow theory, called an augmenting path. This is a directed path π from s to t in the residual graph, where, in particular, all edges xy on the path π satisfy the inequality c(x, y) > f(x, y). If an augmenting path exists, then the capacity is not completely exploited, and we can increase the value of the flow f along the augmenting path to obtain a new flow with value

‖f‖ + min{c(x, y) − f(x, y) | xy is an edge on the path π}

Essentially, this argument is already the original proof by Ford and Fulkerson, which immediately leads to an algorithm, too: Start with the zero flow f(x, y) = 0 for all (x, y). Compute the residual graph and an augmenting path π, and increase the flow along the path by the positive value necessary to use the full capacity on one edge of π. This yields a proper increase of the flow's value, and the increase is always a positive integer if all capacities are natural numbers. Even if all capacities are in ℚ, the algorithm will terminate, but on inputs given in binary, with an unfavorable choice of augmenting paths, an exponential running time may occur. And even worse, if some capacities are irrational, the Ford–Fulkerson algorithm might lead to smaller and smaller increments, so that not even convergence to a maximum flow is guaranteed.
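The following Python sketch (our illustration, for integer capacities only) implements this basic Ford–Fulkerson scheme. It keeps the flow skew symmetric, as in the formal definition, and searches the residual graph for augmenting paths by depth-first search; Example 8.19 below shows why a naive choice of augmenting paths can fail to terminate for irrational capacities.

```python
def ford_fulkerson(V, c, s, t):
    """Basic Ford-Fulkerson for integer capacities; c and the returned
    flow f are dicts indexed by pairs of vertices."""
    f = {(x, y): 0 for x in V for y in V}

    def augmenting_path():
        # depth-first search in the residual graph R_f
        stack, pred = [s], {s: None}
        while stack:
            x = stack.pop()
            for y in V:
                if y not in pred and c.get((x, y), 0) > f[(x, y)]:
                    pred[y] = x
                    stack.append(y)
        if t not in pred:
            return None
        path, y = [], t
        while pred[y] is not None:
            path.append((pred[y], y))
            y = pred[y]
        return path

    while (path := augmenting_path()) is not None:
        rho = min(c.get(e, 0) - f[e] for e in path)
        for (x, y) in path:
            f[(x, y)] += rho     # increase along the augmenting path ...
            f[(y, x)] -= rho     # ... and keep the skew symmetry
    return f

V = {"s", "u", "v", "t"}
c = {("s", "u"): 2, ("s", "v"): 1, ("u", "v"): 1, ("u", "t"): 1, ("v", "t"): 2}
f = ford_fulkerson(V, c, "s", "t")
print(sum(f[("s", y)] for y in V))  # value of the maximum flow: 3
```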

Example 8.19. Let φ = (−1 + √5)/2 ≈ 0.6180339887 . . . be a solution of the quadratic equation x^2 + x − 1 = 0. In particular, φ + φ^2 = 1. Thus, 1 + φ + φ^2 = 2 is the value of the maximum flow in the following flow network.

A flow network from s to t with edge capacities 3, 3, 3, φ, 1, 2, and φ

There are three paths from s to t. We will start with the top one as an augmenting path. After that, the other two paths have the free capacities φ and φ^2. For this case, we will construct an infinite sequence of augmenting paths. To this end, we assume the following situation after n − 1 further augmenting paths: One path has no free capacity, another has a free capacity of φ^n, and the third has a free capacity of φ^{n+1}. Then, consider the following augmenting path: we go from s to t via the path of free capacity φ^{n+1}, then we go back to s via the path of no free capacity (note that on this path the edges in the direction from t to s are actually present in the residual graph), and finally we go back to t via the path of free capacity φ^n. This augmenting path yields an increase of φ^{n+1}. The free capacities change as follows: The path from s to t that had no free capacity before now has φ^{n+1} of free capacity; the path that had a free capacity of φ^n now has φ^n − φ^{n+1} = φ^{n+2} of free capacity; and the path that had a free capacity of φ^{n+1} now has no free capacity left. So the situation is the same as before the augmenting path, except that n has been replaced by n + 1. If we proceed in this manner, then we never reach the flow of value 2 because for all n ∈ ℕ we have φ^n > 0. If we assume an additional direct edge from s to t, which has capacity c > 0, then the maximum flow will be 2 + c, but the above sequence of augmenting paths converges to the limit 2. ⬦

Nearly 20 years passed before Jack Edmonds (born 1934) and Richard Karp (born 1935) presented their heuristic to always increase the value along a shortest path in the residual graph. By n = |V| we denote the number of vertices and by m the number of edges (x, y) with c(x, y) + c(y, x) > 0. Then, m ≤ n^2, and m is at most twice as large as the number of edges having a positive capacity. Edmonds and Karp's heuristic leads to a polynomial running time in 𝒪(m^2 n), which in particular is independent of the capacities [16]. Roughly at the same time, but independently of Edmonds and Karp, who were doing research in the USA, Yefim Dinic (born 1949) in the former USSR found a similar procedure for computing a maximum flow in the better running time 𝒪(mn^2), see [14]. We now present Dinic's algorithm. Knowledge of Theorem 8.18 is not required.

8.10 Dinic’s algorithm

� 167

8.10 Dinic’s algorithm Dinic’s algorithm starts with the zero flow and works in phases. A flow f has already been determined before each phase. If f is not a maximum flow yet, then in a phase various augmenting paths are found, until the distance from s to t in the residual graph has increased. Within each phase, one of at most m edges is deleted in every nth step at the latest. Thus, a phase terminates after 𝒪(mn) steps. At the end of a phase, the flow value has properly increased. The crucial point, however, is that no more than n phases are executed because after each phase the distance from s to t in the residual graph will have increased by at least one. This argument finally yields the runtime bound 𝒪(mn2 ). We assume to have a given flow f , and now we will describe the work of a phase. Let (V , Rf ) be the residual graph of f . Since (x, y) ∈ Rf , the skew symmetry yields c(x, y)+ c(y, x) > 0. So Rf uses at most m edges for any possible flow. We denote the length of a shortest path from x to y in the current residual graph by df (x, y). Define df (x, y) = ∞ if there is no path connecting x and y. The value df (x, y) is the current distance between x and y. It depends on f . In the beginning of a phase, we construct the level graph. Certain vertices and edges from (V , Rf ) are included in the level graph. For d ≥ 0, we let Ld = {x ∈ V | df (s, x) = d}, and for d ≥ 1 define Ed = {(x, y) ∈ Ld−1 × Ld | (x, y) ∈ Rf } Now, the level graph (L, E) is given by L = ⋃d≥0 Ld and E = ⋃d≥1 Ed . Using breadth-first search, we can construct (L, E) in 𝒪(m) steps. Within a phase, the vertex sets Ld do not change. Particularly, no vertices are removed from L, only edges from E are deleted. For the rest of the phase, the residual graph will never again be constructed explicitly. If t is not in L then (L, V \ L) is an st-cut, and its value is ‖f ‖ = c(L, V \ L). By Equation (8.1), ‖f ‖ is the maximum value, so we are done. Thus, we assume t ∈ L and k = df (s, t) in the beginning of each phase, i. e., t ∈ Lk . Now, we remove edges from the level graph as long as there are edges whose source is s. Thus, we will change (L, E) dynamically, at the same time augmenting flows. For this, we define three invariants, which will be preserved for all graphs (L, E) and residual graphs (V , Rf ) considered in this phase: (1) We have E ⊆ Rf . (2) If p ∈ La and q ∈ Lb , then df (p, q) ≥ b − a. (3) Each st-path of length k in (V , Rf ) is also present in (L, E). When a new phase is beginning, the three invariants are satisfied. First, we remove the edges Ek+1 because they do not appear on any length-k path from s to t. Especially, the outdegree of t then is zero. The invariants were not violated. We now start a depth-first search from s. If we find a vertex of outdegree zero, we stop the search. Thus, the depthfirst search yields a path π = (p0 , . . . , pℓ ) with p0 = s, (pd−1 , pd ) ∈ Ed for 1 ≤ d ≤ ℓ, where

pℓ has no outgoing edge in (L, E). The time complexity of this search is in 𝒪(n). We will distinguish the cases pℓ ≠ t and pℓ = t.

So, first let pℓ ≠ t. Then in (L, E) there is no path from s to t using pℓ because pℓ has outdegree zero. By the third invariant, there is no length-k path from s to t in (V, Rf) using pℓ either. We delete the edge (pℓ−1, pℓ) from E. This does not change Rf, and the invariants remain valid. Then we start a new depth-first search at s (or continue the former one at pℓ−1).

The other case is a little more subtle. Now we have pℓ = t. By the first invariant, π = (p0, . . . , pℓ) is an augmenting path. We use Eπ to denote the set of edges in π,

Eπ = {(pd−1, pd) | 1 ≤ d ≤ ℓ}

We let ρ = min{c(e) − f(e) | e ∈ Eπ}; then ρ > 0. We define a new flow fπ by

fπ(x, y) = f(x, y) + ρ   if (x, y) ∈ Eπ
fπ(x, y) = f(x, y) − ρ   if (y, x) ∈ Eπ
fπ(x, y) = f(x, y)       otherwise

Thus, the value of the flow along the path π increases by ρ, and at least one of the edges e ∈ Eπ is saturated, meaning that the changed flow fπ satisfies fπ(e) = c(e) for at least one edge e ∈ Eπ. Finally, on another run through the path π, we delete all saturated edges from E. Then we start a new depth-first search at s.

As the flow has changed, we have to check whether the invariants are still valid for Rfπ. For that, we investigate the changes in the residual graph. The saturated edges e from the path π are no longer present in Rfπ. Therefore, deleting them causes no problems. All unsaturated edges (pd−1, pd) from π are still present, both in Rfπ and in (L, E). But we have to be careful because if f(x, y) is increased to f(x, y) + ρ, then at the same time f(y, x) is decreased to f(y, x) − ρ. Such edges (y, x) may have been newly added to Rfπ. However, they have not been added to (L, E). The first invariant is obviously still valid, but we have to prove the second and third invariants. For all of the new edges (y, x), we can see that y ∈ Ld+1 and x ∈ Ld for some d ≥ 0.

Now we define a set R, which we will transform step by step into Rfπ, such that R satisfies the three invariants if we replace Rf by R in the assertions. Distance in R will be denoted dR. Initially, R consists of the edges of Rf without the saturated edges from Eπ. Then, the invariants are satisfied. Now let R′ = R ∪ {(y, x)} for an arbitrary pair (y, x) ∈ Ld+1 × Ld. We will show the invariants for R′ with the corresponding distance dR′. The first invariant is valid because we only added an edge to R′. Now consider p ∈ La and q ∈ Lb. If a shortest path from p to q in R′ does not use the edge (y, x), then b − a ≤ dR(p, q) = dR′(p, q). If that edge is used on the path, then we can determine dR′(p, q) as follows:

dR′(p, q) = dR(p, y) + 1 + dR(x, q) ≥ (d + 1 − a) + 1 + (b − d) = b − a + 2


So the second invariant is valid. Finally, in order to prove the third invariant, we consider a length-k path from s to t in R′. If it does not use the edge (y, x), then it is a path in R and therefore also in (L, E) because the invariants are valid for R. But if the edge is used on the path, we come to a contradiction as follows:

k = dR′(s, t) = dR(s, y) + 1 + dR(x, t) ≥ (d + 1) + 1 + (k − d) = k + 2

This proves the third invariant. The set Rfπ can be obtained from R by a sequence of steps of this kind. Therefore, the invariants are also valid for Rfπ. After each 𝒪(n) steps, E loses an edge, so after 𝒪(mn) steps there is no outgoing edge at s anymore. This is the end of the phase. The distance from s to t in the current residual graph is at least k by the second invariant. On the other hand, in (L, E) there is no path from s to t. Thus, the third invariant shows that there is no length-k path in Rf either. Therefore, in the current residual graph we have df(s, t) ≥ k + 1. After at most n phases we obtain df(s, t) = ∞. Now we have determined a maximum flow, and the overall running time is bounded by 𝒪(mn²).

Example 8.20. On the flow network from page 163, Dinic’s algorithm computes the flow after three phases. Here, we consider another flow network:

[Figure: a flow network on the vertices s, x, y, t; the edge capacities are 3, 5, and 9.]

We start the first phase of Dinic’s algorithm with the zero flow. The flow, its associated residual graph, and the resulting level graph at the beginning of this phase are:

[Figure: the zero flow, its residual graph, and the level graph at the start of the first phase.]

After adding the augmenting path (s, x, y, t) of value 3, the following picture appears when the second phase begins:

[Figure: the flow of value 3, its residual graph, and the level graph at the start of the second phase.]

The flow on the edge (y, x) is −3; therefore it still has a free capacity of 8. So the augmenting path (which, in this case, is unique) has value 8. Adding this path to the flow yields:

[Figure: the flow of value 11, its residual graph, and the level graph at the start of the third phase.]

Now, at the beginning of the third phase, t is not included in the level graph. Thus, the computed flow with value 11 is a maximum flow. The associated minimal cut is given by the three edges (s, x), (y, x), and (y, t). ⬦

The runtime for calculating maximum flows has been further improved in recent years and is still a subject of current research. Dinic’s algorithm is robust and easy to implement; therefore it also plays an important role in practice. Dinic himself gave an interesting view on the historical development of flow algorithms in [15].
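The phase structure described above translates almost literally into code. The following compact Python sketch is ours, not the book’s; the class name, the list-based edge records, and the per-vertex pointers are implementation choices. The pointers realize the deletion of useless edges, so each phase computes a blocking flow in 𝒪(mn) steps. The capacities in the usage comment are a reconstruction of the worked example above, partly guessed from the text:

from collections import deque

class Dinic:
    def __init__(self, n):
        self.adj = [[] for _ in range(n)]   # adjacency lists of residual edges
    def add_edge(self, x, y, cap):
        # each edge stores [target, residual capacity, index of its twin]
        self.adj[x].append([y, cap, len(self.adj[y])])
        self.adj[y].append([x, 0, len(self.adj[x]) - 1])
    def _level_graph(self, s, t):
        level = [-1] * len(self.adj)
        level[s] = 0
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y, cap, _ in self.adj[x]:
                if cap > 0 and level[y] < 0:
                    level[y] = level[x] + 1
                    queue.append(y)
        return level if level[t] >= 0 else None
    def _augment(self, x, t, limit, level, it):
        if x == t:
            return limit
        while it[x] < len(self.adj[x]):
            edge = self.adj[x][it[x]]
            y, cap, rev = edge
            if cap > 0 and level[y] == level[x] + 1:
                pushed = self._augment(y, t, min(limit, cap), level, it)
                if pushed > 0:
                    edge[1] -= pushed                # may saturate this edge
                    self.adj[y][rev][1] += pushed    # the reverse edge grows
                    return pushed
            it[x] += 1    # edge is useless for this phase: "delete" it
        return 0
    def max_flow(self, s, t):
        flow = 0
        while True:
            level = self._level_graph(s, t)          # start of a phase
            if level is None:
                return flow                          # t unreachable: maximum
            it = [0] * len(self.adj)
            pushed = self._augment(s, t, float('inf'), level, it)
            while pushed > 0:
                flow += pushed
                pushed = self._augment(s, t, float('inf'), level, it)

# Possible usage on a reconstruction of the network of Example 8.20:
#   g = Dinic(4); s, x, y, t = 0, 1, 2, 3
#   for e in [(s, x, 3), (s, y, 9), (x, y, 9), (y, x, 5), (x, t, 9), (y, t, 3)]:
#       g.add_edge(*e)
#   g.max_flow(s, t)   # 11, matching the cut (s, x), (y, x), (y, t)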

8.11 Planar graphs

The following puzzle is well known: There are three supply stations for electricity, gas, and water, and there are three houses A, B, and C. Is it possible to connect each of the supply stations to each of the houses without an intersection of two supply lines?

[Figure: the three supply stations Electricity, Gas, and Water, and the three houses A, B, and C.]

A graph is called planar if it can be drawn in the plane in such a way that the edges never intersect. Note that this is equivalent to demanding that the graph can be drawn on the surface of a sphere without intersections. The above riddle can now be formulated in graph-theoretic terms as follows: Is the graph K3,3 planar? The following example shows that the graph K4 is planar.

[Figure: two drawings of K4, one with an edge intersection and one without.]

A face of a planar graph is a maximal connected area in the plane containing neither edges nor vertices. Often, faces are also referred to as regions or areas. For connected graphs with more than two vertices, each face is bordered by a (not necessarily simple) cycle. In particular, the outside border of a planar graph bounds an unbounded face, the outer face. The complete graphs K1 and K2 both have only one face. The following graph has 4 faces:

[Figure: a plane graph with its faces labeled 1, 2, 3, and 4.]

Faces 1 and 3 are both surrounded by a single cycle, while faces 2 and 4 are not surrounded by single cycles. According to our definitions, a planar graph is a graph that can be drawn in the plane without any intersections of edges. If such a drawing is already given along with the graph, then we call it a plane graph or a planar embedding of the graph. Note that graphs can be isomorphic but have different intersection-free plane embeddings. The following two graphs are isomorphic, but the first drawing has two faces surrounded by a cycle of length 5, while the second drawing has no such face at all.

[Figure: two isomorphic plane graphs; the first has two faces bounded by a cycle of length 5, the second has none.]

8.12 Euler’s formula

To keep the presentation in this section short, we define some fixed identifiers: n is always the number of vertices of a graph G, m denotes the number of edges, and if G is planar, then f is the number of faces (including the outer face). Euler’s formula (also called Euler’s polyhedron formula) provides a relationship between the numbers n, m, and f. In particular, Euler’s formula yields the fact that all intersection-free embeddings of a given planar graph in the plane have the same number of faces.

Theorem 8.21 (Euler’s formula; Euler 1758). For any nonempty connected planar graph, we have

n − m + f = 2

Proof. Let G be a connected planar graph with at least one vertex. If G contains no simple cycles, then G is a tree, and Euler’s formula is true because m = n − 1 and f = 1. So now, let G be a graph that is not a tree and let e be any edge on a simple cycle in G. We may assume a given intersection-free embedding of G in the plane. If we remove the edge e, then two faces are merged into one because the faces on the two sides of e are different. Thus, the difference between the number of edges and the number of faces remains unchanged. The claim follows by induction on m.

Several other interesting properties of planar graphs can be derived from Euler’s formula.

Corollary 8.22. Let G be a nonempty planar graph.
(a) If n ≥ 3, then m ≤ 3n − 6.
(b) The graphs K5 and K3,3 are not planar.
(c) There is a vertex x with degree dx ≤ 5.

Proof. (a) By adding edges, we can assume that G is connected. Since n ≥ 3, each face is bounded by a (not necessarily simple) cycle of length at least 3. Moreover, each edge borders at most


two faces, one on each side. This shows 3f ≤ 2m. With Euler’s formula, it follows that 6 = 3n − 3m + 3f ≤ 3n − 3m + 2m, and hence the assertion.

(b) The K5 has 10 edges. This contradicts the estimate from (a), so the K5 is not planar. To show that the graph K3,3 is not planar, we give a stronger bound on the number of edges in bipartite planar graphs. Let n ≥ 4 and let G be a connected bipartite planar graph with n vertices. Each face is surrounded by a cycle of length at least 4 (length 3 is not possible since G is bipartite). This yields 4f ≤ 2m and 4 = 2n − 2m + 2f ≤ 2n − 2m + m. So m ≤ 2n − 4. The K3,3 has 9 edges and thus contradicts this estimate. So the graph K3,3 is not planar.

(c) By restricting ourselves to a connected component, we can assume that G is connected. Also, G has at least 7 vertices, for otherwise there is nothing to show. Let d̄ = (∑x∈V dx)/n be the average vertex degree. Due to ∑x dx = 2m, it follows by Euler’s formula that

d̄ = 2m/n ≤ (6n − 12)/n < 6.

Since the average degree is less than 6, a vertex with degree at most 5 must exist. The graphs K5 and K3,3 are not planar, and Kuratowski’s theorem says that every nonplanar graph contains a subdivision of K5 or of K3,3; see, e. g., [13]. Therefore, K5 and K3,3 are the only archetypes of nonplanar graphs. In the icosahedron, each vertex has degree 5. This shows that the estimate in Corollary 8.22 (c) is tight.

Icosahedron as planar graph
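As a quick numerical check of Theorem 8.21 and Corollary 8.22 (a sketch of ours, with the well-known counts of the icosahedron assumed):

# The icosahedron has n = 12 vertices, m = 30 edges, and f = 20 faces.
n, m, f = 12, 30, 20
assert n - m + f == 2        # Euler's formula
assert m == 3 * n - 6        # the bound of Corollary 8.22 (a) is attained
assert 2 * m / n == 5        # average degree 5, so part (c) is tight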


8.13 Colorings of planar graphs

A C-coloring of a graph G = (V, E) is a mapping f : V → C such that f(x) ≠ f(y) for all xy ∈ E. We call C the set of colors. Also, G is k-colorable if there is a C-coloring with |C| = k. The famous four-color theorem of Kenneth Appel (1928–2013) and Wolfgang Haken (1928–2022) states that every planar graph is 4-colorable [4, 5]. In this section we will prove a weaker statement:

Theorem 8.23 (Five-color theorem). Every planar graph is 5-colorable.

Proof. Let C be a set of five colors and G = (V, E) be a planar graph. According to Corollary 8.22 (c), there is a vertex v of degree dv ≤ 5. If dv ≤ 4, then we consider the subgraph of G induced by V \ {v}. By induction, the induced subgraph has a C-coloring f. Putting v back into the graph and letting f(v) be one of the colors that were not used by v’s at most 4 neighbors, we obtain a C-coloring f of G. Now let dv = 5 and let a, b, c, d, e be the neighbors of v, where the names are chosen such that the vertices in the drawing are arranged clockwise around v.

[Figure: the vertex v with its five neighbors a, b, c, d, e arranged clockwise.]

It is not possible that both edges ac and bd exist because otherwise they would intersect. Without restriction, ac ∉ E. Now, we remove vertex v and all its incident edges from G. Then we replace the vertices a and c by a single new vertex zac ∉ V. This yields a planar graph that has two vertices fewer than G. By induction, there is a 5-coloring f′ for the resulting graph. We construct a coloring f : V → C of G, which assigns the color f′(zac) to both vertices a and c, while all other vertices in G keep their color. It only remains to assign a color to v: since two of v’s five neighbors have the same color, there is one color left to be used for v.
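The induction in this proof is constructive. The following Python sketch (ours, not the book’s) implements only its easy half: repeatedly remove a vertex of degree at most 5, which exists by Corollary 8.22 (c), and color greedily on re-insertion. Since it omits the contraction trick for the degree-5 case, it guarantees 6 colors rather than 5:

def six_color_planar(neighbors):
    # neighbors: dictionary mapping each vertex to the set of its neighbors
    remaining = {v: set(ns) for v, ns in neighbors.items()}
    order = []
    while remaining:
        # a planar graph always has a vertex of degree at most 5
        v = min(remaining, key=lambda u: len(remaining[u]))
        order.append(v)
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
    color = {}
    for v in reversed(order):
        # at most 5 neighbors of v are colored already, so a color is free
        used = {color[u] for u in neighbors[v] if u in color}
        color[v] = next(k for k in range(6) if k not in used)
    return color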

8.14 Planar separators

A separator of a graph G = (V, E) is a set of vertices C such that V can be divided into subsets A, B, C with the property that there are no edges between A and B. The idea is that an efficient construction of small separators may enable a so-called divide-and-conquer approach to algorithmic problems on graphs: compute a separator C such that the subsets A and B are roughly the same size (divide); then recursively find solutions for


the subgraphs induced by A and B, and finally, using C as a kind of “adapter”, put them together to obtain a solution for G (conquer). Unfortunately, there is not necessarily a suitable separator C in every graph. However, with planar graphs the situation is much more comfortable. For planar graphs, there is always a separator C with |C| ∈ O(√n) such that roughly at least one-third of all vertices are in A and in B, respectively. The planar separator theorem states this a little more precisely:

Theorem 8.24 (Lipton, Tarjan 1979). Let G = (V, E) be a planar graph. Then there are disjoint sets of vertices A, B, C such that A ∪ B ∪ C = V and
– |A| < 2n/3 and |B| < 2n/3,
– |C| ≤ √8n, and
– there are no edges between vertices from A and vertices from B.

Proof. Without restriction, let n ≥ 3. By adding edges, if necessary, we may assume that in a plane representation of G all faces are bordered by triangles (including the outer face). Let k = ⌊√2n⌋. Every simple cycle which is a subgraph of G also defines a cycle in the planar embedding of G. For a simple cycle C of G, let V(C) be the set of vertices on the cycle C, A(C) the set of vertices on the outside of C, and B(C) the set of vertices in the area surrounded by C. Let C be a cycle satisfying the following three conditions:
(a) C has at most 2k vertices,
(b) |A(C)| < 2n/3, and
(c) under conditions (a) and (b), the value of |B(C)| − |A(C)| is minimal.
Such a cycle C exists because the triangle around the outer face satisfies conditions (a) and (b). We assume that |B(C)| ≥ 2n/3, and in the rest of the proof we will show that this leads to a contradiction.

Let D be the subgraph of G induced by B(C) ∪ V(C). For x, y ∈ V(C), let c(x, y) be the minimum number of edges on a path from x to y in the graph C, and d(x, y) be the minimum number of edges on a path from x to y in D.

Claim 1. For all x, y ∈ V(C), we have c(x, y) = d(x, y).

Proof of Claim 1: Since C is a subgraph of D, clearly d(x, y) ≤ c(x, y). Suppose there are vertices x, y ∈ V(C) with d(x, y) < c(x, y). Then, let x, y be a pair of such vertices for which d(x, y) is minimal. Let p be a path of length d(x, y) in D from x to y. Since d(x, y) is minimal, x and y are the only vertices of p on the cycle C. The graph C ∪ p consisting of the cycle C together with the path p contains three simple cycles: the cycle C itself, and two cycles C1 and C2 which include p as a part. Without restriction, let |B(C1)| ≥ |B(C2)|. Then |A(C1)| < 2n/3 because

n − |A(C1)| = |B(C1)| + |V(C1)| > (1/2)(|B(C1)| + |B(C2)| + |V(p)| − 2) = (1/2)|B(C)| ≥ n/3

Therefore, C1 satisfies condition (b), and from d(x, y) < c(x, y) we see that C1 satisfies condition (a). But condition (c), together with B(C1) ⊆ B(C), yields B(C) = B(C1). So p has no interior vertices and c(x, y) ≤ 1, which contradicts d(x, y) < c(x, y). This shows Claim 1.

Claim 2. The subgraph C has exactly 2k vertices.

Proof of Claim 2: Suppose |V(C)| < 2k. Let e = xy be an arbitrary edge on C. Since G is triangulated, e is adjacent to a triangle in D. Let z be the third vertex of this triangle. In particular, since |B(C)| ≥ 2n/3, we know that B(C) ≠ ∅. Claim 1 yields z ∈ B(C) because otherwise the cycle C would have a chord in D. Now, consider the cycle C′ obtained from C by deleting the edge e and then adding the vertex z and the edges xz and zy. The cycle C′ satisfies conditions (a) and (b), but B(C′) ⊊ B(C) and A(C′) = A(C), in contradiction to condition (c) for C. This shows Claim 2.

Let x0, . . . , x2k−1 be the vertices of C in the order they appear on this cycle. To simplify notation, we let x2k = x0. Let S = {x0, . . . , xk} and T = {xk, . . . , x2k}.

Claim 3. There are k + 1 disjoint paths from S to T in D.

Proof of Claim 3: According to Menger’s theorem, Theorem 8.15, there are either k + 1 disjoint ST-paths or there is an ST-separator of size at most k. Let P be an ST-separator with |P| ≤ k. Since S ∩ T = {x0, xk}, we have x0, xk ∈ P. In the subgraph of D induced by P, let Q be the connected component containing x0. Then Q cannot contain xk because otherwise the distance d(x0, xk) between x0 and xk would be less than |P| and consequently less than k, which contradicts d(x0, xk) = c(x0, xk) = k. Let R be the set of vertices outside of Q that have a neighbor in the component Q and have a path to xk which does not use vertices from Q (thus, R contains the neighbors of the outer border of Q viewed from x0). By definition of Q, the vertex sets R and P are disjoint. Since G is triangulated, the vertices of R induce a path from S to T in D because this path starts on C on one side between x0 and xk and ends on C on the other side between those two vertices. So P is not an ST-separator, a contradiction. This shows Claim 3.

Graph G is planar, so the k + 1 disjoint paths π0, . . . , πk of Claim 3 cannot intersect. Therefore, we may assume path πi to start at vertex xi and to end at x2k−i. Claim 1 shows that πi contains at least c(xi, x2k−i) + 1 ≥ min{2i, 2(k − i)} + 1 vertices. Since ∑_{i=0}^{k} min{i, k − i} = ⌊k/2⌋ ⋅ ⌈k/2⌉ ≥ (k² − 1)/4, this yields

n ≥ ∑_{i=0}^{k} |V(πi)| ≥ ∑_{i=0}^{k} min{2i + 1, 2(k − i) + 1} ≥ (k + 1)²/2

This contradicts the definition of k = ⌊√2n⌋, and thus |B(C)| < 2n/3. Therefore, the vertex sets A(C), B(C), V(C) satisfy the claim of the theorem.


Lipton and Tarjan (Richard Jay Lipton, born 1946, and Robert Endre Tarjan, born 1948) published Theorem 8.24 in 1979 [29]. There, they also showed that the separator C can be constructed in linear time. Thus, separators in planar graphs are a valuable tool for many computational problems, leading to efficient algorithms. The proof given above followed the lines of Alon, Seymour, and Thomas [3] (Noga Alon, born 1956, Paul D. Seymour, born 1950, and Robin Thomas).

8.15 Ramsey’s theorem

If a population of lemmings reaches a certain size and density, a large number of them disperse looking for food and shelter, in many cases a journey with uncertain outcome. Among the problems that might occur are wide rivers, where sometimes only a small portion of the lemmings reach the other shore. Suppose after crossing k rivers there should be at least n surviving lemmings. How can that be accomplished? One reasonable strategy is to start the journey with a large enough number of individuals. This is exactly what lemmings do, and the construction of Ramsey numbers (Frank Plumpton Ramsey, 1903–1930) pursues a similar approach.

Figure 8.2: Norway lemmings (from Brehms Tierleben, 1927).

A coloring of the edges of a graph G = (V , E) with colors C is a mapping f : E → C. In many cases we consider edge colorings of a complete graph. In this sense an arbitrary graph can be colored by assigning a special color to those edges that are not in E. For example, the characteristic function χE : (V2 ) → {0, 1} of E, where χE (e) = 1 ⇐⇒ e ∈ E

constitutes a commonly used coloring. A natural generalization considered in this section defines a coloring not only for 2-element subsets, but for the edges of complete k-hypergraphs. A k-hypergraph is a pair (V, E) of vertices V and k-hyperedges E ⊆ (Vk ). We call k the dimension of the hypergraph. In this section, a coloring always means a mapping of the form

f : (Vk ) → C

for k ≥ 1. A subset X ⊆ V is called monochromatic if there is a color b ∈ C such that f(e) = b for all e ∈ (Xk ), i. e., all hyperedges which completely belong to X have the same color b. Thus, for the characteristic function of a graph G, a subset X of vertices is monochromatic if and only if X is a clique (i. e., all edges between vertices in X are present in G) or if X is an independent set (i. e., in G there are no edges between vertices from X). The essential statement of Ramsey theory is that large monochromatic subsets X are guaranteed to exist if the vertex set V is extremely large. In the special case of ordinary graphs, this can be stated as follows. For every n ∈ ℕ, there is a smallest number R(n) having the following property: If (V, E) is a graph with at least R(n) vertices, then there is a subset X ⊆ V such that |X| ≥ n and X is either a clique or an independent set.

Example 8.25. If there are six people in a restaurant, then there are either three of them who know each other, or three of them who do not know each other. In the underlying graph, the six people constitute six vertices, and an edge between the two vertices x and y is drawn if persons x and y know each other. Let us first assume that person A knows three of the others. If these three do not know each other, we are done; otherwise at least two of them know each other and together with A we have three people who know each other. But what do we do if A does not know three of the others? Well, then there are at least three of them that A does not know. Swapping the properties know each other and do not know each other, the situation is symmetric to the previous case. This proves the claim. For five people, we cannot always force this situation. For a proof, it is enough to consider a round table where five people are sitting, and each of them knows both his table neighbors, but none of the others. Then there is neither a clique of size 3 nor an independent set of size 3. Summing up, we have shown that R(3) = 6. ⬦

More generally, the Ramsey numbers Rk,c (n) guarantee, for any set of vertices V with |V| ≥ Rk,c (n) and any coloring f : (Vk ) → C with |C| = c, a monochromatic subset of size n in V. Existence and formal definition of these numbers will be given by the following Theorem 8.26. Note that the Ramsey numbers R(n) for ordinary graphs (where the coloring of the edges is given by the characteristic function) are included because R(n) = R2,2 (n).
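For small parameters, such Ramsey statements can be verified by exhaustive search. The following Python sketch is ours; the encoding of colorings as bit patterns is an arbitrary choice. It confirms the result of Example 8.25:

from itertools import combinations

def forces_mono_triangle(n):
    # True iff every 2-coloring of the edges of K_n contains a
    # monochromatic triangle; brute force over all 2^(n choose 2) colorings
    edges = list(combinations(range(n), 2))
    for bits in range(2 ** len(edges)):
        color = {e: (bits >> i) & 1 for i, e in enumerate(edges)}
        if not any(color[(a, b)] == color[(a, c)] == color[(b, c)]
                   for a, b, c in combinations(range(n), 3)):
            return False     # found a coloring without such a triangle
    return True

# forces_mono_triangle(5) is False (the round-table coloring escapes),
# forces_mono_triangle(6) is True, so R(3) = 6.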


Theorem 8.26 (Ramsey 1930; finite version). For all k, c, n ∈ ℕ, there is a smallest number Rk,c (n) ∈ ℕ with the following property: If V is a set with |V| ≥ Rk,c (n) and f : (Vk ) → C is a coloring with |C| = c, then there is a monochromatic subset X ⊆ V such that |X| = n.

Proof. We use induction on the dimension k. The cases k = 0, c ≤ 1, or n = 0 are trivial and not interesting. So let k ≥ 1, c ≥ 2, and n ≥ 1. For k = 1, by the pigeonhole principle R1,c (n) = c(n − 1) + 1 because if |V| is at least this number, one of the c colors must be assigned to at least n vertices.

Now let k ≥ 1 and assume that rk = Rk,c (n) is already defined. We want to derive an upper bound for rk+1 = Rk+1,c (n). To this end, let us consider a finite set V and a coloring f : (V k+1) → C. We assume the set V to be very large; it should contain far more than rk elements. Later we will see which size is sufficient here. The idea is to thin out V extremely, and then use a coloring g of the k-element subsets on the remaining part of V. If the remaining set still has size at least rk, then there is a monochromatic subset with respect to g. This will turn out to be monochromatic with respect to f, too.

To start the thinning process, we first introduce a linear order < on V. Then every nonempty subset of V has uniquely determined smallest and largest elements. By taking out maximum elements, we now may write hyperedges in (V k+1) as pairs (K, b) such that K ∈ (Vk ), b ∈ V and max(K) < b. Our aim is to thin out V in such a way that g(K) = f(K, b) is true, independently of b. For m ∈ ℕ with m ≤ rk, we inductively define subsets Am, Bm ⊆ V and a coloring g : (Am k) → C, satisfying the following properties:
(a) Am contains m elements and Bm contains “enough” elements.
(b) For all a ∈ Am and b ∈ Bm, we have a < b.
(c) For all K ∈ (Am k) and all b ∈ Bm, we have f(K, b) = g(K).
For m = 0, we define A0 = ∅ and B0 = V. Then, g is the empty mapping; no colors with respect to g are assigned yet. The coloring g will be extended to larger and larger sets; therefore no index is provided for g.

Now let m ≥ 0 and Bm ≠ ∅. We define am+1 = min(Bm) and Am+1 = Am ∪ {am+1}. Then |Am+1| = m + 1. Next we extend the coloring g to a coloring g : (Am+1 k) → C. All elements K ∈ (Am+1 k) which are not colored yet contain the element am+1. Thus, the number of such hyperedges is (m k−1). We are looking for a subset Bm+1 ⊆ Bm \ {am+1} of maximum size with the property that for all K ∈ (Am+1 k) and all b ∈ Bm+1 the value f(K, b) is the same. Such a Bm+1 does exist, but it might be empty. Having found such a set Bm+1, we extend the coloring g by choosing the color f(K, b) ∈ C for all K ∈ (Am+1 k) with am+1 ∈ K. Note that this color is well defined because f(K, b) is the same for all K and b. However, for the case that Bm+1 is empty, we have to choose an arbitrary color from C. By construction of Bm+1, the three conditions given above are still satisfied.

Let us first assume that the sets Bm always were large enough and we managed to construct the set Ark. By the definition of rk, in the set Ark with coloring g there is a monochromatic subset X of size n. Thus, there is a color γ ∈ C such that g(K) = γ for all K ∈ (Xk ). Now let us consider an element of (X k+1). We may write this hyperedge as a pair (K, b) with K ∈ (Xk ) and max(K) < b. Then conditions (b) and (c) yield

f(K, b) = g(K) = γ

This shows that X is monochromatic for our original coloring f, too.

One question remains: How large does Bm have to be in order to guarantee a sufficiently large size of Bm+1? Suppose we just want to color a single K ∈ (Am+1 k) and then guarantee r elements in Bm+1. By the pigeonhole principle, c(r − 1) + 1 elements in Bm \ {am+1} are sufficient. Since c ≥ 2, any set Bm with |Bm| ≥ cr is sufficiently large. Now, there are not only one, but (m k−1) elements to color. Thus, if Bm+1 is supposed to finally contain r elements, it is sufficient if the size of Bm is at least c^(m k−1) ⋅ r. Moreover, to ensure condition (a) for m = rk, the set V must be large enough to guarantee Brk−1 ≠ ∅. According to the above argument, the following size suffices:

|V| ≥ ∏_{m=0}^{rk−1} c^(m k−1) = c^(rk k)

where the last equality uses ∑_{m=0}^{rk−1} (m k−1) = (rk k).

Consider the program w given by

while X > 0 do Y := Y ⋅ X; X := X − 1 od

The semantic function of w describes its effect on two parameters called X and Y. Therefore, we understand Σ as the set ℕ², where the pair (m, n) means that the value of X is m and that of Y is n. For the above program, we claim w(m, n) = (0, n ⋅ m!). Thus, the factorial m! is computed as w(m, 1) if the final value of Y is returned as the result. First note that the semantics of w is defined on all inputs because the loop always terminates. Now consider the function f : Σ → Σ with f(m, n) = (0, n ⋅ m!). Since w is defined on all inputs, the claim follows if we can show Γb,c (f) = f. Here, the Boolean condition b is the


evaluation of X > 0, and c(m, n) is the pair (m − 1, nm). For b(m, n) = 0, we have m = 0, and thus (Γb,c (f))(0, n) = (0, n) = (0, n ⋅ 0!) = f(0, n). In all other cases, m > 0 yields (Γb,c (f))(m, n) = f(c(m, n)) = f(m − 1, nm) = (0, nm ⋅ (m − 1)!) = (0, n ⋅ m!). This shows that w in fact computes the factorial function.

Note that program fragments of the form while b do c od are skipped if the condition b always evaluates to false. However, the problem arises that we cannot even algorithmically check this property of arithmetic expressions. This results from the famous incompleteness theorem of Gödel. Kurt Friedrich Gödel (1906–1978) published this result in his maybe most important work, entitled On formally undecidable propositions of Principia Mathematica and related systems.
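The fixed point property can be tested directly. The following Python sketch is ours; Γ is specialized to this particular b and c. It simulates the while program and checks that f(m, n) = (0, n ⋅ m!) is invariant under Γb,c:

from math import factorial

def w(m, n):
    # direct simulation of  while X > 0 do Y := Y * X; X := X - 1 od
    while m > 0:
        m, n = m - 1, n * m
    return (m, n)

def gamma(f):
    # Gamma_{b,c}(f): the identity where b fails, otherwise f after one
    # loop iteration c(m, n) = (m - 1, n * m)
    return lambda m, n: (m, n) if m == 0 else f(m - 1, n * m)

f = lambda m, n: (0, n * factorial(m))
assert w(5, 1) == f(5, 1) == (0, 120)
assert all(gamma(f)(m, n) == f(m, n) for m in range(10) for n in range(10))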

9.4 Least fixed points of monotone mappings

Kleene’s fixed point theorem yields the existence of least fixed points for continuous mappings in complete partial orders. But even more, it also provides an approximation of the least fixed point because this fixed point is the supremum of the set {f i(⊥) | i ∈ ℕ}. If (M, ≤) is a complete partial order and the mapping f : M → M is monotone, but not continuous, then the above statement may be wrong. For example, if we extend the natural order (ℕ, ≤) to (M, ≤) = (ℕ ∪ {ω1, ω2}, ≤), where n < ω1 < ω2 for all n ∈ ℕ, then (M, ≤) is a CPO, and f(n) = n + 1, f(ω1) = f(ω2) = ω2 defines a monotone mapping. In this case, we have ⊥ = 0 and ω1 = sup{f i(0) | i ∈ ℕ}, but ω1 is not a fixed point – the only fixed point is ω2.

Now let (M, ≤) be any complete partial order with minimum element ⊥, and let f : M → M be a monotone mapping. Define f 0 = ⊥ and f i+1 = f(f i) for i ∈ ℕ. Then f i ≤ f i+1, and therefore the supremum f ω = sup{f i | i ≥ 0} exists. If f(f ω) = f ω, then f ω is a least fixed point. Otherwise, f ω < f(f ω), and we can start a new fixed point iteration at f ω because f ω ≤ x for all fixed points x of f. Also the second iteration may possibly not lead to a fixed point; then we start a third, and so on. By transfinite induction, this procedure can be iterated an arbitrary number of times to obtain a least fixed point. We state this in Theorem 9.5. The proof of Theorem 9.5 presented here uses the well-ordering theorem. This makes the proof a standard routine for well-order relations, in analogy to the proof of Kleene’s fixed point theorem.

Theorem 9.5. Let (M, ≤) be a complete partial order and f : M → M a monotone mapping. Then there is a uniquely determined least fixed point.

Proof. First, let Ω be an arbitrary well-ordered set. So Ω is linearly ordered and every nonempty subset of Ω has a uniquely determined minimum element. Then we define a mapping Ω → M, α ↦ f α. Later we will choose Ω very large, such that this mapping cannot be injective. But this does not matter now. For α ∈ Ω, we define

f α = f(sup{f β | β < α})

We have to show that f α is well defined because it is not clear that the bound sup{f β | β < α} exists at all. We show more by proving that the following four statements are valid for all α, β ∈ Ω:
(a) f α is defined.
(b) β < α implies f β ≤ f α.
(c) f α ≤ f(f α).
(d) If x ∈ M and f(x) = x, then f α ≤ x.
We prove these assertions by contradiction. Assume one of the above statements to be false for an α ∈ Ω. Then there is a minimal α having this property, and all four assertions are true for all γ with γ < α. The contradiction arises because we can show that all four statements also hold for α.

We start with (a): If β < α, then f β is defined. Thus, for β ≤ γ < α we have f β ≤ f γ. So {f β | β < α} is linearly ordered and, therefore, it is a directed subset. Then f α = f(sup{f β | β < α}) exists because M is a CPO. In particular, f ⊥ = f(⊥).

For (b), note that f α = f(sup{f β | β < α}) and f β ≤ f(f β) are true for all β < α. Then by the monotonicity of f, we have for all β < α that

f β ≤ f(f β) ≤ f(sup{f β | β < α}) = f α

For (c), consider the following computation:

f α = f(sup{f β | β < α})          (by definition)
    ≤ f(sup{f(f β) | β < α})      (because f β ≤ f(f β))
    ≤ f(f(sup{f β | β < α}))      (monotonicity of f)
    = f(f α)                       (by definition)

Finally, for (d) let x ∈ M with f(x) = x. We know that f β ≤ x for all β < α. Therefore, sup{f β | β < α} ≤ x is also valid. This yields

f α = f(sup{f β | β < α}) ≤ f(x) = x

which completes the proof that all four assertions are valid for all α ∈ Ω.

Now, we consider the power set Ω = 2M of M and define a well-ordering on it. For that, we use the well-ordering theorem, which, as mentioned before, is equivalent to the axiom of choice. The cardinality of a power set is always greater than the cardinality of the original set. Therefore, there is no injective mapping from Ω to M. If we apply the procedure given above to the set Ω, then there must be α, β ∈ Ω such that f β = f α and β < α. But then f β ≤ sup{f β | β < α}, and thus also f(f β) ≤ f α, so altogether we obtain

f β ≤ f(f β) ≤ f α = f β

So f β is a fixed point of f, and by the fourth assertion, f β is the least fixed point.


9.5 Lattices

A nonempty partial order (V, ≤) is called a lattice if for all x and y from V there are both a least upper bound x ∨ y and a greatest lower bound x ∧ y. The operations ∨ and ∧ in lattices are usually called join and meet, respectively. Every nonempty linear order is a lattice. In lattices, minimal and maximal elements are uniquely determined, if they exist. In particular, each finite lattice has a minimum element ⊥ and a maximum element ⊤, and in such lattices all subsets have a supremum and an infimum.

Next, we will consider some examples. Let (ℕ, |) be the set of natural numbers with the partial order defined by the divisor relation. For m, n ∈ ℕ, the least upper bound is the least common multiple lcm(m, n) and the greatest lower bound is the greatest common divisor gcd(m, n). The power set with the subset relation (2M, ⊆) is a lattice, where the supremum of two sets is given by their union, and the infimum is the intersection. If {(Vi, ≤) | i ∈ I} is a family of lattices, then the Cartesian product (∏i∈I Vi, ≤) with componentwise order is a lattice, too. Up to renaming of the elements, there are ten lattices with no more than 5 elements. The Hasse diagrams of these ten lattices are the following:

[Figure: the Hasse diagrams of the ten lattices with at most five elements.]

In all lattices (V, ≤) we have the following four rules:

(V1) x ∧ x = x and x ∨ x = x (idempotence)
(V2) x ∧ y = y ∧ x and x ∨ y = y ∨ x (commutativity)
(V3) x ∧ (y ∧ z) = (x ∧ y) ∧ z and x ∨ (y ∨ z) = (x ∨ y) ∨ z (associativity)
(V4) x ∧ (x ∨ y) = x ∨ (x ∧ y) = x (absorption)

The requirement of idempotence is redundant because (V1) can be obtained from the two absorption laws (V4). This can be seen as follows. Let y = x ∨ x; then

x = x ∨ (x ∧ y) = x ∨ (x ∧ (x ∨ x)) = x ∨ x

The derivation for x = x ∧ x is completely analogous. Lattices are special cases of partial orders, but at the same time lattices are algebraic structures with two binary operations

∧ and ∨. Binary operations satisfying (V1) to (V4) (or equivalently (V2) to (V4)) characterize lattices, as shown by the following result.

Theorem 9.6. Let V be a nonempty set with two binary operations ∨ and ∧ such that the three properties (V2) to (V4) are satisfied. If for x, y ∈ V we define the relation x ≤ y by the condition x = x ∧ y, then (V, ≤) is a lattice. Here, the operations ∧ and ∨ serve as infimum and supremum, respectively.

Proof. We already showed that due to (V4) we may assume the rule of idempotence (V1), too. First, we show that ≤ defines a partial order. The idempotence (V1) directly yields x ≤ x, so ≤ is reflexive. But ≤ is also antisymmetric because, by commutativity (V2), for x ≤ y and y ≤ x we have x = x ∧ y = y ∧ x = y. Finally, transitivity follows from associativity (V3) because for x ≤ y and y ≤ z we have x = x ∧ y = x ∧ (y ∧ z) = (x ∧ y) ∧ z = x ∧ z.

Next, we show that x ∧ y is the greatest lower bound. From (x ∧ y) ∧ x = x ∧ y, we can see that x ∧ y is a lower bound for x, and by symmetry it is a lower bound for y, too. Now, let z ≤ x and z ≤ y, i. e., z = x ∧ z and z = y ∧ z. It remains to show z = (x ∧ y) ∧ z. But this follows from (x ∧ y) ∧ z = x ∧ (y ∧ z) = x ∧ z = z.

So far, the proof needed only the axioms (V1) to (V3); the absorption laws (V4) were only used to derive (V1). Next, with the help of (V4) we want to show the duality

x = x ∧ y ⇐⇒ y = x ∨ y   (9.1)

In fact, (V4) states y = y ∨ (y ∧ x). Thus, from x = x ∧ y = y ∧ x we obtain y = y ∨ x = x ∨ y. Conversely, if y = x ∨ y, then again (V4) yields x = x ∧ (x ∨ y) = x ∧ y. By the duality in Equation (9.1), we see that x ∨ y is the least upper bound of x and y if and only if x ∧ y is the greatest lower bound. But we already know that x ∧ y is the greatest lower bound of x and y.

Let V be a lattice and V′ ⊆ V. Then V′ is called a sublattice of V if V′ is closed with respect to the binary operations ∧ and ∨ of V. Nonempty chains in lattices are sublattices. Note that a subset V′ ⊆ V may be a lattice with respect to the partial order induced by V, without being a sublattice of V. The difference between subsets of a lattice which are lattices themselves, and sublattices, can be seen in the example below.

Example 9.7. Let the lattice (V, ≤) with V = {⊥, a, b, c, ⊤} be given by the following diagram:

[Figure: the Hasse diagram of V; here a and b lie above ⊥, both are below c, and c is below ⊤.]


Here {⊥, a, b, c} is a sublattice, while {⊥, a, b, ⊤} is only a subset which is a lattice, but not a sublattice because a ∨ b = c ∉ {⊥, a, b, ⊤}. ⬦

9.6 Complete lattices

A complete lattice is a partial order (V, ≤) in which every subset has a supremum. In particular, sup ∅ = ⊥ exists, and a complete lattice is always nonempty and has a minimum element ⊥. Thus, every complete lattice is a complete partial order. Many CPOs are not complete lattices because in complete partial orders only directed subsets need to have a supremum. For example, the set of finite and infinite words Σ∞ with the prefix order is a CPO, but not a complete lattice, as soon as Σ contains at least two letters. In a complete lattice V, every subset also has an infimum because

inf D = sup{x ∈ V | ∀y ∈ D : x ≤ y}

In particular, a complete lattice is a lattice with a minimum element and a maximum element. Every finite lattice is complete, and for every set M the power set lattice (2M, ⊆) is complete. Next, we want to prove the important fixed point theorem of Knaster and Tarski (after Bronisław Knaster, 1893–1980, and Alfred Tarski, 1901–1983). For the proof, we will use Theorem 9.5, which guarantees least fixed points in complete partial orders.

Theorem 9.8 (Knaster–Tarski fixed point theorem). Let V be a complete lattice and f : V → V a monotone mapping. Then the subset P(f) = {y ∈ V | f(y) = y} of fixed points in V is again a complete lattice. In particular, there are uniquely determined least and greatest fixed points.

Proof. Let Y ⊆ P(f). We have to prove that Y has a supremum in P(f). Thus, we have to show that there is a uniquely determined least fixed point in P(f) which is at least as large as every y ∈ Y. To this end, consider VY = {x ∈ V | sup Y ≤ x}. Then VY is a complete lattice with least element ⊥Y = sup Y. For y ∈ Y and x ∈ VY, we have y ≤ sup Y ≤ x, and since f is monotone, this yields y = f(y) ≤ f(sup Y) ≤ f(x). Thus sup Y ≤ f(sup Y) ≤ f(x), and f induces a monotone mapping from VY to VY. The lattice VY, being a complete lattice, is a complete partial order, too. Therefore, by Theorem 9.5, there is a least fixed point xf,Y ∈ P(f) of f in VY. Then, xf,Y is the least upper bound of Y with respect to P(f).

Let V be a complete lattice and f : V → V a monotone mapping. Then the lattice P(f) = {y ∈ V | f(y) = y} in general is not a sublattice of V. Consider, for example, the complete lattice {⊥, a, b, c, ⊤} from Example 9.7. If f(c) = ⊤ and f leaves all other elements unchanged, then f is monotone, and the set of fixed points P(f) is the lattice {⊥, a, b, ⊤}, which is not a sublattice.
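On a finite lattice, the transfinite iteration of Theorem 9.5 collapses to a plain loop, so least fixed points can be computed directly. A small Python sketch of ours; the one-step reachability map is just one convenient monotone example on a power set lattice:

def least_fixed_point(f, bottom):
    # iterate f from the bottom element until the value stabilizes;
    # on a finite lattice with monotone f this is the least fixed point
    x = bottom
    while f(x) != x:
        x = f(x)
    return x

# a monotone map on the power set lattice of {0,...,4}: one-step reachability
edges = {(0, 1), (1, 2), (2, 0), (3, 4)}
step = lambda s: frozenset({0}) | frozenset(y for (x, y) in edges if x in s)
assert least_fixed_point(step, frozenset()) == frozenset({0, 1, 2})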


9.7 Modular and distributive lattices

Let V be an arbitrary lattice. Then, the following distributivity inequalities are valid:

x ∨ (y ∧ z) ≤ (x ∨ y) ∧ (x ∨ z)
x ∧ (y ∨ z) ≥ (x ∧ y) ∨ (x ∧ z)

If, moreover, x ≤ z is true, this yields the following modular inequality:

x ∨ (y ∧ z) ≤ (x ∨ y) ∧ z

Modular and distributive lattices are defined by the fact that the respective inequalities may be replaced by equalities. A lattice V is distributive if it satisfies one of the following two equivalent conditions for all x, y, z ∈ V:

(D)  x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z)
(D′) x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z)

The equivalence of the conditions (D) and (D′) follows from the following consideration. It suffices to show (D) ⟹ (D′) because the converse then follows from the duality principle. Thus, consider a, b, c ∈ V and let x = a ∧ b, y = a and z = c. We have to show a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c). This can be seen as follows:

(a ∧ b) ∨ (a ∧ c) = ((a ∧ b) ∨ a) ∧ ((a ∧ b) ∨ c)   (by (D))
                  = a ∧ ((a ∧ b) ∨ c)               (absorption)
                  = a ∧ (a ∨ c) ∧ (b ∨ c)           (by (D))
                  = a ∧ (b ∨ c)                     (absorption)

A lattice V is called modular if for all x, y, z ∈ V the following implication is true:

(M)  x ≤ z ⟹ x ∨ (y ∧ z) = (x ∨ y) ∧ z

Every distributive lattice is modular. Of the ten lattices with at most five elements shown in Section 9.5, exactly the first eight are distributive lattices. The ninth lattice is not modular while the tenth lattice is modular, but not distributive. These two lattices play a central role in the rest of this section, therefore we will refer to them as N5 and M5 . Note that all lattices with at most four elements are distributive.


[Figure: the Hasse diagrams of the lattice N5, with ⊥ < a < c < ⊤ and ⊥ < b < ⊤ (nonmodular), and the lattice M5, with a, b, c pairwise incomparable between ⊥ and ⊤ (modular, but not distributive).]
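Both laws are easy to test mechanically on a finite lattice. In the following Python sketch (ours; the lattice is encoded by its order relation, with 'o' and 'i' standing for ⊥ and ⊤, and meets and joins are computed by brute force), N5 fails the modular law exactly as Theorem 9.9 below predicts:

from itertools import product

def glb(elems, leq, x, y):
    lower = [z for z in elems if leq(z, x) and leq(z, y)]
    return next(z for z in lower if all(leq(w, z) for w in lower))

def lub(elems, leq, x, y):
    upper = [z for z in elems if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, w) for w in upper))

def is_modular(elems, leq):
    # brute-force check of the implication (M)
    return all(lub(elems, leq, x, glb(elems, leq, y, z))
               == glb(elems, leq, lub(elems, leq, x, y), z)
               for x, y, z in product(elems, repeat=3) if leq(x, z))

# N5 with bottom 'o', top 'i', chain o < a < c < i, and o < b < i:
pairs = {(p, p) for p in 'oabci'} | {('o', q) for q in 'abci'} | \
        {(p, 'i') for p in 'oabc'} | {('a', 'c')}
leq = lambda p, q: (p, q) in pairs
assert not is_modular('oabci', leq)   # x = a, y = b, z = c is a witness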

The following Theorem 9.9 by Dedekind states that the lattice N5 is the archetype of a nonmodular lattice.

Theorem 9.9 (Dedekind). Let V be any lattice. Then the following three statements are equivalent:
(a) V is modular.
(b) The modular reduction rule is valid for all x, y, z ∈ V: if x ≤ y and z ∧ x = z ∧ y and z ∨ x = z ∨ y, then x = y.
(c) V does not contain a sublattice isomorphic to N5.

Proof. If V is modular, then the reduction rule is valid because

x = x ∨ (x ∧ z) = x ∨ (z ∧ y) = (x ∨ z) ∧ y = (z ∨ y) ∧ y = y

where the third equality uses (M). Moreover, the reduction rule obviously is not valid in N5. To see this, consider the picture of N5 and let x = a, y = c and z = b. Now let V be nonmodular. Then there are a, b, c ∈ V such that a ≤ c and

a ∨ (b ∧ c) < (a ∨ b) ∧ c   (9.2)

We will show that the elements of V defined by x = a ∨ (b ∧ c), y = b, z = (a ∨ b) ∧ c, n = b ∧ c, and e = a ∨ b form a sublattice which is isomorphic to N5:

[Figure: the Hasse diagram of this sublattice, with n = b ∧ c at the bottom, e = a ∨ b at the top, the chain n ≤ x < z ≤ e on one side, and y = b on the other.]

The relations n ≤ x < z ≤ e and n ≤ y ≤ e, shown as edges in the above diagram, are obvious. Assume that x ≤ y. Then a ≤ b and, by assumption, we have a ≤ c, so in Equation (9.2) both sides would evaluate to b ∧ c, which is impossible. But also y ≤ z cannot be true because then b ≤ c and, together with a ≤ c, in Equation (9.2) both sides would evaluate to a ∨ b, which is impossible as well. Consequently, y ∉ {n, x, z, e}, y and x are incomparable, and y and z are incomparable, too. Thus, the five elements n, x, y, z, e are pairwise different. Further, we see

x ∨ y = a ∨ (b ∧ c) ∨ b = a ∨ b = e
z ∧ y = (a ∨ b) ∧ c ∧ b = b ∧ c = n

This again proves that y is not comparable to either x or z, but, on the other hand, it shows that {n, x, y, z, e} is closed with respect to ∧ and ∨ and is isomorphic to N5, which completes the proof of the theorem.

Recall that the lattice M5 is modular, but it is not a distributive lattice. If neither M5 nor N5 appears as a sublattice in a given lattice, then this lattice is distributive. This yields the characterization of distributive lattices presented in the following theorem by Garrett Birkhoff (1911–1996), who is regarded as the founder of universal algebra. He was the son of George David Birkhoff (1884–1944), who, among other things, gained considerable attention even beyond the area of mathematics by his mathematical theory of aesthetics [6], which he designed in 1933.

Theorem 9.10 (Birkhoff). Let V be a lattice. Then the following statements are equivalent:
(a) V is distributive.
(b) For all x, y, z ∈ V, the reduction rule is valid: If x ∧ z = y ∧ z and x ∨ z = y ∨ z, then x = y.
(c) V does not contain any sublattice that is isomorphic to M5 or N5.

Proof. Let V be a distributive lattice. We first show the validity of the reduction rule. Assume that x ∧ z = y ∧ z and x ∨ z = y ∨ z. Then x = y, which can be seen as follows:

x = x ∨ (x ∧ z)            (absorption)
  = x ∨ (y ∧ z)            (since x ∧ z = y ∧ z)
  = (x ∨ y) ∧ (x ∨ z)      (distributivity)
  = (x ∨ y) ∧ (y ∨ z)      (since x ∨ z = y ∨ z)
  = y ∨ (x ∧ z)            (distributivity)
  = y ∨ (y ∧ z)            (since x ∧ z = y ∧ z)
  = y                      (absorption)

Neither M5 nor N5 satisfy the (distributive) reduction rule, which follows from the above pictures of M5 and N5 . In both cases we have a ∧ b = ⊥ = c ∧ b and a ∨ b = ⊤ = c ∨ b, but a ≠ c. Now, let V be a nondistributive lattice, and assume that V does not contain a sublattice that is isomorphic to N5 . Then by Theorem 9.9, we know that V is modular. Since V


is nondistributive, there are a, b, c ∈ V with the property that (a ∧ b) ∨ (a ∧ c) < a ∧ (b ∨ c). We define

n = (a ∧ b) ∨ (a ∧ c) ∨ (b ∧ c)
e = (a ∨ b) ∧ (a ∨ c) ∧ (b ∨ c)
x = (a ∧ e) ∨ n
y = (b ∧ e) ∨ n
z = (c ∧ e) ∨ n

This obviously yields n ≤ x, y, z. Using (M), we obtain a ∧ n = a ∧ ((a ∧ b) ∨ (a ∧ c) ∨ (b ∧ c))

= ((a ∧ b) ∨ (a ∧ c)) ∨ (a ∧ (b ∧ c)) = (a ∧ b) ∨ (a ∧ c)

This, together with a ∧ e = a ∧ (b ∨ c), yields n < e. Now we can conclude x, y, z ≤ e. Indeed, for x ≤ e note that a ∧ e ≤ e and n < e, and the same works completely analogously for y and z. In order to show that {n, e, x, y, z} is a sublattice of V that is isomorphic to M5, it is now sufficient to prove that x ∧ y = x ∧ z = y ∧ z = n and x ∨ y = x ∨ z = y ∨ z = e. We will prove the identity x ∧ y = n as an example; the other cases can be shown analogously. We have

x ∧ y = ((a ∧ e) ∨ n) ∧ ((b ∧ e) ∨ n)            (by definition)
      = ((a ∧ e) ∧ ((b ∧ e) ∨ n)) ∨ n            (modularity)
      = ((a ∧ e) ∧ ((b ∨ n) ∧ e)) ∨ n            (modularity)
      = ((a ∧ e) ∧ e ∧ (b ∨ n)) ∨ n              (commutativity)
      = ((a ∧ e) ∧ (b ∨ n)) ∨ n                  (idempotence)
      = (a ∧ (b ∨ c) ∧ (b ∨ (a ∧ c))) ∨ n        (absorption)
      = (a ∧ (b ∨ ((b ∨ c) ∧ (a ∧ c)))) ∨ n      (modularity)
      = (a ∧ (b ∨ (a ∧ c))) ∨ n                  (since a ∧ c ≤ c ≤ b ∨ c)
      = (a ∧ b) ∨ (a ∧ c) ∨ n                    (modularity)
      = n                                        (idempotence)

This completes the proof of the theorem.

Let V be a lattice. An element a ∈ V is called irreducible (more precisely, ∨-irreducible or join irreducible) if there are smaller elements, but a is not the supremum of two strictly smaller elements. In other words, a is not a minimal element, and for all b, c ∈ V such that a = b ∨ c we must have a = b or a = c. The set of irreducible elements of V will be denoted by 𝒥(V).

If M is an arbitrary set, then (2M, ⊆) is a complete and distributive lattice, and the set of irreducible elements is equal to the set of one-element subsets. Thus, 𝒥(2M) can be identified with the original set M. We say that V is a lattice of sets if V is isomorphic to a sublattice of a power set lattice (2M, ⊆). Every lattice of sets is distributive because the distributivity of sublattices is generally inherited from a greater lattice. Any finite lattice that has more than one element contains irreducible elements. However, there are infinite lattices without irreducible elements; take, for example, ℤ × ℤ with componentwise ordering. In chains, all elements except ⊥ are irreducible. Here is the Hasse diagram of a lattice with four elements, in which all elements except ⊥ are irreducible.

[Figure: the Hasse diagram of a four-element lattice in which all elements except ⊥ are irreducible.]

In the next example, the irreducible elements are the circled ones.

[Figure: a lattice whose irreducible elements are circled.]

Theorem 9.11. Let V be a finite distributive lattice. Then V is a lattice of sets by virtue of the mapping ρ : V → 2𝒥(V), which for all a ∈ V is defined by

ρ(a) = {x ∈ 𝒥(V) | x ≤ a}

In particular, ρ is an injective mapping, and it satisfies the equations ρ(x ∨ y) = ρ(x) ∪ ρ(y) and ρ(x ∧ y) = ρ(x) ∩ ρ(y). Moreover, we have ρ(⊥) = ∅ and ρ(⊤) = 𝒥(V).

Proof. The mapping ρ satisfies both ρ(⊥) = ∅ and ρ(⊤) = 𝒥(V), and it transforms the order relation ≤ into the subset relation ⊆. Each a ∈ V can be represented as a supremum because a = sup J for J = {x ∈ V | ⊥ < x ≤ a}. If there is an element b ∈ J \ 𝒥(V), then b can be written in the form b = c ∨ d for strictly smaller elements c < b and d < b. In particular, ⊥ ∉ {c, d} ⊆ J. If we delete the element b from J, thus forming J′ = J \ {b}, then a = sup J′ is still true. Iteration of this process has to terminate because V is finite. Then we end up with

a = sup{x ∈ 𝒥(V) | x ≤ a} = sup ρ(a)

In particular, ρ is an injective mapping. It remains to show ρ(a ∨ b) = ρ(a) ∪ ρ(b) and ρ(a ∧ b) = ρ(a) ∩ ρ(b) for all a, b ∈ V. We have


ρ(a) ∩ ρ(b) = {x ∈ 𝒥(V) | x ≤ a and x ≤ b} = ρ(a ∧ b)

As to the union, we first have

ρ(a) ∪ ρ(b) = {x ∈ 𝒥(V) | x ≤ a or x ≤ b} ⊆ ρ(a ∨ b)

Conversely, let x ∈ ρ(a ∨ b). Then x = x ∧ (a ∨ b) = (x ∧ a) ∨ (x ∧ b) because V is distributive. But x ∈ 𝒥(V) is irreducible, and thus x = x ∧ a or x = x ∧ b. This yields x ≤ a or x ≤ b, and consequently x ∈ ρ(a) ∪ ρ(b). We have shown ρ(a ∨ b) = ρ(a) ∪ ρ(b), thus completing the proof of the theorem.
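Theorem 9.11 can be watched at work on a small example. In the following Python sketch (ours), the lattice is the divisor lattice of 12 with meet = gcd and join = lcm, which is distributive but not Boolean; the join-irreducible elements turn out to be 2, 3, and 4:

from math import gcd

V = [d for d in range(1, 13) if 12 % d == 0]   # divisors of 12
lcm = lambda x, y: x * y // gcd(x, y)

# join-irreducible: not the minimum, and not the join (lcm) of two
# strictly smaller elements
J = [a for a in V if a > 1 and
     not any(lcm(b, c) == a
             for b in V for c in V
             if b != a and c != a and a % b == 0 and a % c == 0)]
rho = {a: frozenset(x for x in J if a % x == 0) for a in V}

assert J == [2, 3, 4]
assert all(rho[lcm(a, b)] == rho[a] | rho[b] for a in V for b in V)
assert all(rho[gcd(a, b)] == rho[a] & rho[b] for a in V for b in V)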

9.8 Boolean lattices

Let V be a lattice with a minimum element ⊥ and a maximum element ⊤. If V is finite, then ⊥ and ⊤ exist anyway. Two elements x, y ∈ V are called complements if x ∧ y = ⊥ and x ∨ y = ⊤. The elements ⊥ and ⊤ are always complements. If V is a chain, then ⊥ and ⊤ are the only elements having complements. A lattice V is called complemented if every element has a complement in V. A power set lattice (2M, ⊆) is complete, distributive, and complemented because for all A ⊆ M, clearly, A and M \ A are complements. A lattice V with a minimum element ⊥ and a maximum element ⊤ is called a Boolean lattice if it is distributive and complemented. The Boolean lattices are named after George Boole (1815–1864) because they go back to his logic calculus from 1847. The name was coined by Henry Maurice Sheffer (1882–1964) in the year 1913.

In distributive lattices, complements are uniquely determined if they exist at all. This follows from the reduction rule in Theorem 9.10, but, of course, it can also be verified directly. To do so, let x, y1, y2 ∈ V with x ∧ yi = ⊥ and x ∨ yi = ⊤ for i = 1, 2. Then, by distributivity, we obtain

y1 = y1 ∧ (x ∨ y2) = (y1 ∧ x) ∨ (y1 ∧ y2) = y1 ∧ y2

which proves y1 = y1 ∧ y2. But then, by symmetry, we also have y2 = y1 ∧ y2, and thus y1 = y2. In particular, in Boolean lattices every element a has a uniquely determined complement. In the following, we will denote this element by ā. Due to the uniqueness of complements, the complement of ā is again a, and this, together with the duality principle, yields the de Morgan laws: the complement of a ∧ b is ā ∨ b̄, and the complement of a ∨ b is ā ∧ b̄.

These equalities are named after Augustus de Morgan (1806–1871), but they were already known since the Middle Ages from the handbook of logic Summa Logicae by Wilhelm von Occam (ca. 1288–1347). His name is often associated with the principle of parsimony in scholasticism, Occam’s razor.

Next, we want to determine the finite Boolean lattices and show that they are isomorphic to power set lattices of the form (2M, ⊆). As usual, we will call elements of dimension 1 atoms. Thus, in a partial order V with minimum element ⊥, an atom is an element a ≠ ⊥ such that there is no b ∈ V satisfying ⊥ < b < a.

Lemma 9.12. Let (V, ≤) be a Boolean lattice and a ∈ V. Then a is irreducible if and only if a is an atom.

Proof. If a is an atom, then a is irreducible. Now, assume that ⊥ ≠ b ∈ V is not an atom. Then there is an a ∈ V with ⊥ < a < b, and therefore

a ∨ (b ∧ ā) = (a ∨ b) ∧ (a ∨ ā) = b ∧ ⊤ = b

Thus, it suffices to show b ∧ ā < b because then b is not irreducible. Assume to the contrary that b ∧ ā = b. Then, de Morgan’s law yields b̄ ∨ a = b̄, hence also a ≤ b̄. Thus a ≤ b ∧ b̄ = ⊥, which is a contradiction to a ≠ ⊥. So b ∧ ā < b, and the lemma is proved.

Theorem 9.13 (Stone). Every finite Boolean lattice V can be represented as a power set lattice. More precisely, there is a canonical isomorphism between (V, ≤) and (2A, ⊆), where A is the set of atoms of V.

Proof. By Lemma 9.12, the set of irreducible elements 𝒥(V) is equal to the set A of all atoms of V. Being a Boolean lattice, V is distributive and thus, by Theorem 9.11, isomorphic to the lattice of sets (ρ(V), ⊆) with ρ(V) = {ρ(a) | a ∈ V} and ρ(a) = {x ∈ 𝒥(V) | x ≤ a}. For an atom a, clearly ρ(a) = {a}; thus all one-element sets belong to ρ(V). Moreover, we have ρ(a1 ∨ ⋅ ⋅ ⋅ ∨ an) = {a1, . . . , an}. Therefore, the inclusion A ⊆ V induces a canonical isomorphism between the lattices (2A, ⊆) and (V, ≤).

This particularly shows that there is a finite Boolean lattice with k elements if and only if k = 2ⁿ is a power of two. Then, this lattice is uniquely determined up to isomorphism. Stone’s theorem, also known as the representation theorem for Boolean algebras, was discovered in 1936 by the American mathematician Marshall Harvey Stone (1903–1989). Making use of the axiom of choice, this theorem is valid much more generally. We will come back to that after the next section. The proof there will use the technical concept of ultrafilters.
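For a concrete instance of Theorem 9.13, consider the divisor lattice of 30 with meet = gcd and join = lcm; since 30 is squarefree, this lattice is Boolean, its atoms are the primes 2, 3, 5, and mapping each divisor to its set of prime factors is exactly the canonical isomorphism. A Python sketch of ours:

from math import gcd

V = [d for d in range(1, 31) if 30 % d == 0]   # 8 divisors: a Boolean lattice
atoms = [2, 3, 5]
iso = {d: frozenset(p for p in atoms if d % p == 0) for d in V}

assert len(set(iso.values())) == len(V) == 2 ** len(atoms)  # bijection onto 2^atoms
assert all(iso[gcd(a, b)] == iso[a] & iso[b] for a in V for b in V)
assert all(iso[a * b // gcd(a, b)] == iso[a] | iso[b] for a in V for b in V)
assert all(iso[30 // d] == iso[30] - iso[d] for d in V)     # complements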

9.9 Boolean rings

The power set 2M of a set M not only is a Boolean lattice; it can as well be thought of as the set of mappings from M to {0, 1}. In this sense, we want to identify a subset A ⊆ M with its characteristic function χA : M → {0, 1}, which maps elements from A to 1 and elements from M \ A to 0. If we interpret {0, 1} as a Boolean lattice 𝔹 with 0 < 1, then the lattice 2M is the Cartesian product 𝔹^M. But we could as well take {0, 1} to be the field 𝔽2 = ℤ/2ℤ. Then 2M is the 𝔽2-algebra 𝔽2^M, with componentwise addition and multiplication. In particular,


2M = 𝔽2^M is a commutative ring satisfying 2x = 0 and x² = x for all elements x. If we go back and interpret the elements of this ring as subsets of M, then multiplication corresponds to intersection,

A ⋅ B = A ∩ B

and addition to the symmetric set difference,

A + B = A △ B = (A ∪ B) \ (A ∩ B)

Subrings of 𝔽2^M, more precisely of (2M, △, ∩, ∅, M), are called rings of sets. A ring R = (R, +, ⋅, 0, 1) is a Boolean ring if for all x ∈ R we have x² = x. In this section we are going to show that the terms “Boolean lattice” and “Boolean ring” are equivalent. In the next section we will prove (using the concept of ultrafilters) the general representation theorem of Stone, which states that the Boolean rings are exactly the rings of sets. Of course, all rings of sets are Boolean; this direction is trivial.

Proposition 9.14. For every Boolean ring R, multiplication is commutative and 2x = 0 is true for all x ∈ R. In other words, Boolean rings are exactly the commutative 𝔽2-algebras with idempotent multiplication.

Proof. Let us first show that 2x = 0 for every x ∈ R. The Boolean property yields 2x = (2x)² = 4x² = 4x, and therefore 0 = 4x − 2x = 2x, as stated. As to commutativity, consider (x + y)² and the following computation:

x + y = (x + y)² = x² + xy + yx + y² = x + xy + yx + y

This implies xy + yx = 0 and, adding yx on both sides, we obtain xy = yx.

If R is a Boolean ring, we define x ≤ y by the condition x = xy. The relation ≤ is a partial order. Reflexivity follows because x² = x for all x. Antisymmetry is obtained from the fact that R is commutative. Finally, transitivity follows from xz = xyz = xy = x for x = xy and y = yz.

Proposition 9.15. Let R = (R, +, ⋅, 0, 1) be a Boolean ring and let x ≤ y be defined by x = xy. Then (R, ≤) is a Boolean lattice, and the following statements are true:
(a) ⊥ = 0 and ⊤ = 1,
(b) x ∧ y = xy,
(c) x ∨ y = x + y + xy,
(d) x̄ = 1 + x.

Proof. We already saw that (R, ≤) is a partial order, and obviously, ⊥ = 0 and ⊤ = 1. For (b), we note that xyx = xy = xyy, so xy is a lower bound for {x, y}. If z is another lower bound for {x, y}, then z = zx and z = zy. But then z = zy = zxy, which yields z ≤ xy, and thus x ∧ y = xy is the greatest lower bound. As to (c), we have x(x + y + xy) = x² + xy + x²y = x + 2xy = x and, analogously, y(x + y + xy) = y, so x + y + xy is an upper bound for {x, y}. If z is another upper bound, then x = xz and y = yz. Thus (x + y + xy)z = xz + yz + xyz = x + y + xy, which yields x + y + xy ≤ z and therefore x ∨ y = x + y + xy. Finally, for (d), from x + (1 + x) = 1 and x(1 + x) = 0, we conclude that x and 1 + x are complements. Thus, we have shown that (R, ≤) is a complemented lattice. The distributive law is satisfied, too:

(x ∨ y) ∧ z = (x + y + xy)z = xz + yz + xyz = (x ∧ z) ∨ (y ∧ z)

Hence, (R, ≤) is a Boolean lattice.

Proposition 9.16. Let (V, ≤) be a Boolean lattice. Let 0 = ⊥ and 1 = ⊤ and define addition and multiplication as follows:
(a) x + y = (x ∧ ȳ) ∨ (x̄ ∧ y),
(b) x ⋅ y = x ∧ y.
Then (V, +, ⋅, 0, 1) is a Boolean ring, and x ≤ y is equivalent to x = xy.

Proof. The equalities x + x = 0, x + 0 = x, x ⋅ x = x, and x ⋅ 1 = x are obvious. Moreover, x + y = y + x and xy = yx, and also (xy)z = x(yz). Next we will show that addition is associative, which can be seen as follows:

(x + y) + z = (x ∧ ȳ ∧ z̄) ∨ (x̄ ∧ y ∧ z̄) ∨ (x̄ ∧ ȳ ∧ z) ∨ (x ∧ y ∧ z) = x + (y + z)

This computation can be illustrated using a Venn diagram by drawing x, y, and z as sets, and visualizing ∨ as union and + as symmetric difference. It remains to show distributivity, so we have to prove (x + y)z = xz + yz. First, (x + y)z = ((x ∧ ȳ) ∨ (x̄ ∧ y)) ∧ z = (x ∧ ȳ ∧ z) ∨ (x̄ ∧ y ∧ z). But another computation yields

xz + yz = (x ∧ z ∧ (y ∧ z)‾) ∨ (y ∧ z ∧ (x ∧ z)‾)
        = (x ∧ z ∧ (ȳ ∨ z̄)) ∨ (y ∧ z ∧ (x̄ ∨ z̄))
        = (x ∧ ȳ ∧ z) ∨ (x̄ ∧ y ∧ z)

This shows (x + y)z = xz + yz, which completes the proof of the proposition.

Thus, the concepts of "Boolean lattice", "Boolean algebra", and "Boolean ring" are all equivalent. We speak of a lattice if we want to emphasize the partial order property. Having defined the operations ∧ and ∨ from this, we speak of a Boolean algebra. Conversely, from a Boolean algebra with operations ∧ and ∨, we can switch to a Boolean lattice if we define x ≤ y by x = x ∧ y. If in the Boolean algebra we switch from the "OR function" ∨ to the "exclusive OR" +, then with x + y = (x ∧ ȳ) ∨ (x̄ ∧ y) and xy = x ∧ y we get a Boolean ring. Finally, by the definition x ≤ y ⇐⇒ x = xy, we can switch back to the lattice, or the algebra, from which we started. These interrelations lead us to a second version of Stone's theorem, Theorem 9.11.

Theorem 9.17. Every finite Boolean ring R is isomorphic to a power set ring 2^M.

Omitting the finiteness condition, we obtain the general representation theorem of Stone, which will be further investigated in the next section.
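As a concrete cross-check of Propositions 9.15 and 9.16, the following Python lines (our illustration; helper names like ONE are ours) verify, for the power set ring of a three-element set, that the lattice operations recovered from + and · really are union, intersection, and complement:

    from itertools import combinations

    M = {0, 1, 2}
    R = [frozenset(c) for r in range(len(M) + 1) for c in combinations(M, r)]
    ONE = frozenset(M)                     # ring unit, i.e., the top element

    def add(a, b): return a ^ b            # ring addition: symmetric difference
    def mul(a, b): return a & b            # ring multiplication: intersection

    for x in R:
        assert add(ONE, x) == ONE - x      # complement: x̄ = 1 + x
        for y in R:
            assert add(add(x, y), mul(x, y)) == x | y   # x ∨ y = x + y + xy
            assert mul(x, y) == x & y                   # x ∧ y = xy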

9.10 Stone's general representation theorem

An algebra of sets is a sublattice V of a power set lattice (2^M, ⊆) such that ∅, M ∈ V and for all A, B ∈ V not only A ∪ B and A ∩ B, but also the complement M \ A belongs to V.

Example 9.18. If X is a topological space, then the family of all sets which are both open and closed forms an algebra of sets. Here, a topological space is a set X equipped with a topology, that is, a family of subsets of X closed under arbitrary union and finite intersection. Subsets of X belonging to the topology are called open; their complements are called closed. The sets ∅ and X are both open and closed. For any finite alphabet Σ, the family of regular languages (also known as type 3 languages) over Σ, which plays an important role in computer science, forms an algebra of sets. ⬦

In this section we will prove Stone's general representation theorem, which states that every Boolean ring is isomorphic to a ring of sets. This directly yields the fact that every Boolean lattice is isomorphic to an algebra of sets. The proof relies on the axiom of choice and is based on the existence of ultrafilters. With the concept of ultrafilters, the proof of the representation theorem is quite simple. The main difficulty is to become familiar with ultrafilters and to accept that ultrafilters do exist.

The term filter was coined by Henri Paul Cartan (1904–2008), who was also a founding member of the group of mathematicians who worked under the pseudonym Nicolas Bourbaki. In 1934, this group began to publish basic mathematics textbooks, Éléments de mathématique, and thus played a most important role in the modern development of this science.

Filters in a partial order (R, ≤) have the following vivid interpretation. A filter is a nonempty subset F ⊆ R which "filters out" sufficiently large, but not all, elements, so in a certain sense these elements get "stuck" in F. In particular, we have F ≠ R because the filter should let some elements pass. Moreover, if x ∈ F and y is greater than x, then y is in F, too. If both x and y belong to F, then there is a common reason for this, namely a z ∈ F such that z ≤ x and z ≤ y. If the partial order has a minimum element ⊥, then ⊥ ∉ F because F ≠ R. Note that some authors omit the requirement F ≠ R and speak of proper filters if F ≠ R.

In the following, we will use filters only in Boolean lattices; thus we translate the above vivid definition for this special case into the equivalent language of Boolean rings. This makes the formulas a bit more compact. Instead of a lattice R, we assume a Boolean ring R. So let R = (R, +, ·, 0, 1) be a Boolean ring and, as usual, x ≤ y if and only if x = xy. Now, a subset F ⊆ R is called a filter if the following four conditions are satisfied:
(a) F ≠ ∅,
(b) 0 ∉ F,
(c) ∀x ∈ F ∀y ∈ R: x = xy ⟹ y ∈ F,
(d) ∀x, y ∈ F: xy ∈ F.

In particular, the third condition ensures 1 ∈ F. If 0 ≠ x ∈ R is any element, then the set F_x = {y ∈ R | x ≤ y} forms a filter, which is called a principal filter. If {F_i | i ∈ I} is a family of filters on a linearly ordered index set I and F_i ⊆ F_j for all i ≤ j, then obviously their union ⋃{F_i | i ∈ I} is a filter, again. Thus, within the set of filters of R, chains {F_i | i ∈ I} have a supremum ⋃{F_i | i ∈ I}, and by Zorn's lemma (which, like the well-ordering theorem, is equivalent to the axiom of choice) it follows that every filter is contained in a maximal filter. By definition, a filter U ⊆ R is maximal if any filter F satisfying U ⊆ F ⊆ R is equal to U. A maximal filter is called an ultrafilter.

A filter F ⊆ R cannot contain both x and x̄ = 1 + x for any x, because then 0 = x(1 + x) would also be in F, which is not possible. This observation suffices for a characterization of ultrafilters.

Lemma 9.19. A filter F ⊆ R is an ultrafilter if and only if for each x ∈ R either x or 1 + x is contained in F.

Proof. Let x ∈ R and let F ⊆ R be a filter. The filter F cannot contain both x and 1 + x. Suppose F contains neither x nor 1 + x. We have to show that F is not maximal. To this end, we define the set F′ = {z ∈ R | ∃y ∈ F: xy = xyz}. We have F ⊆ F′, and x ∈ F′ because F is not empty. Thus F ≠ F′, and it only remains to show that F′ is a filter. If F′ were to contain 0, then xy = xy0 = 0 for some y ∈ F and therefore y(1 + x) = y. Consequently, y ≤ 1 + x ∈ F, which contradicts our assumptions. So we have shown 0 ∉ F′. Next, for z ∈ F′ and z = zz′, there is a y ∈ F satisfying xy = xyz = xy(zz′) = (xyz)z′ = xyz′, i.e., z′ ∈ F′. Finally, F′ is closed under products: if z₁, z₂ ∈ F′ with xy₁ = xy₁z₁ and xy₂ = xy₂z₂ for suitable y₁, y₂ ∈ F, then y₁y₂ ∈ F and x(y₁y₂) = (xy₁z₁)y₂ = (xy₂z₂)y₁z₁ = x(y₁y₂)(z₁z₂), so z₁z₂ ∈ F′. Thus, F′ is indeed a filter.

So the set 𝒰 of ultrafilters is exactly the set of those filters that, for each x ∈ R, contain either x or 1 + x. Our next step is to assign a set ρ(a) ⊆ 𝒰 of ultrafilters to each a ∈ R. We define

ρ(a) = {U ∈ 𝒰 | a ∈ U}

This leads to the general representation theorem.

Theorem 9.20 (Stone). Let R be a Boolean ring and 𝒰 the set of ultrafilters of R. Then the assignment a ↦ ρ(a) = {U ∈ 𝒰 | a ∈ U} embeds the ring R as a subring into the Cartesian product 2^𝒰 = 𝔹^𝒰. In particular, R is a ring of sets. So every Boolean lattice is isomorphic to an algebra of sets.

Proof. Obviously, we have ρ(0) = ∅ and ρ(1) = 𝒰. Next, we prove the injectivity of ρ. Consider a, b ∈ R with a ≠ b. By symmetry, we may assume that a ≠ ab because either a ≠ ab or b ≠ ba. Therefore, a(1 + ab) = a + ab ≠ 0. Consequently, there is an ultrafilter U_c containing the principal filter generated by c = a(1 + ab). The ultrafilter U_c contains a and 1 + ab, but it cannot contain b, because otherwise ab ∈ U_c, which contradicts 1 + ab ∈ U_c. Hence U_c ∈ ρ(a) \ ρ(b), and ρ is injective. Thus, it remains to show ρ(ab) = ρ(a) ∩ ρ(b) and ρ(a + b) = ρ(a) △ ρ(b).

Ultrafilters containing ab are exactly those containing both a and b. Thus, the equation ρ(ab) = ρ(a) ∩ ρ(b) is satisfied. Now, consider an ultrafilter U such that a + b ∈ U. If U were to contain neither a nor b, we would clearly have 1 + a, 1 + b ∈ U. But then also (a + b)(1 + a)(1 + b) ∈ U, which is impossible because (a + b)(1 + a + b + ab) = a + a + ab + ab + b + ab + b + ab = 0. This proves the inclusion ρ(a + b) ⊆ ρ(a) ∪ ρ(b). But if U were to contain both a and b, then ab ∈ U would follow. This, again, is impossible because (a + b)ab = ab + ab = 0. So U is in the symmetric difference of ρ(a) and ρ(b), which yields ρ(a + b) ⊆ ρ(a) △ ρ(b). For the reverse direction, we start with an ultrafilter U satisfying a, 1 + b ∈ U. Now, we just have to show that a + b ∈ U. But if a + b ∉ U, then we would have a, 1 + b, 1 + a + b ∈ U, which is impossible due to a(1 + b)(1 + a + b) = (a + ab)(1 + a + b) = a + a + ab + ab + ab + ab = 0. Altogether, we have shown ρ(a + b) = ρ(a) △ ρ(b), and hence the assertion.

Note that Theorem 9.17 for finite Boolean rings (or equivalently, for finite Boolean lattices) is indeed a special case of Theorem 9.20. For this, it suffices to observe that the ultrafilters in finite Boolean rings are exactly the principal filters generated by atoms.

The connection to topology from Example 9.18 can be made more precise by Theorem 9.20. For interested readers familiar with the basic topological notions, we explain the essential idea. Any ultrafilter over R can be viewed as a mapping in {0, 1}^R. The Cartesian product {0, 1}^R is equipped with the product topology over the discrete sets {0, 1}. For finite S ⊆ R and f_S ∈ 2^S, let N(f_S, S) = {f ∈ 2^R | ∀s ∈ S: f(s) = f_S(s)}. Then, open sets are arbitrary unions of sets of the form N(f_S, S). The sets N(f_S, S) are both open and closed. The space 2^R = {0, 1}^R then forms a totally disconnected compact Hausdorff space. Using compactness, one can see that sets that are both open and closed can be written as a finite union of sets N(f_S, S), where S can be fixed. We know that the set of ultrafilters 𝒰 is contained in 2^R. The property of not being an ultrafilter can be recognized by at most three components, for example, f(x) = f(y) = 1 but f(xy) = 0, or f(0) = 1, and so on. Therefore, the complement of 𝒰 is open and 𝒰 is closed in 2^R. So 𝒰 itself is a totally disconnected compact Hausdorff space. In this interpretation, Example 9.18 and Theorem 9.20 tell us that the Boolean algebras are exactly the algebras of sets defined by the sets that are both open and closed in totally disconnected compact Hausdorff spaces. The mentioned terms and theorems from topology can be found in textbooks like [9] and [28].
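In the finite case, ultrafilters can be enumerated by brute force. The following Python sketch (ours; it encodes Lemma 9.19 and the remark above, with helper names of our choosing) lists all filters of the power-set ring of {0, 1, 2} and confirms that its ultrafilters are exactly the principal filters generated by the atoms:

    from itertools import combinations

    M = {0, 1, 2}
    R = [frozenset(c) for r in range(len(M) + 1) for c in combinations(M, r)]
    ZERO, ONE = frozenset(), frozenset(M)

    def is_filter(F):
        return (len(F) > 0 and ZERO not in F
                and all(y in F for x in F for y in R if x <= y)   # x ≤ y means x ⊆ y
                and all(x & y in F for x in F for y in F))        # closed under products

    filters = [set(F) for r in range(1, len(R) + 1)
               for F in combinations(R, r) if is_filter(set(F))]
    ultra = [F for F in filters if not any(F < G for G in filters)]

    # Lemma 9.19: an ultrafilter contains exactly one of x and 1 + x = M \ x
    assert all((x in F) != (ONE - x in F) for F in ultra for x in R)

    # the ultrafilters are the principal filters of the atoms {0}, {1}, {2}
    principal = [{y for y in R if frozenset({m}) <= y} for m in M]
    assert {frozenset(F) for F in ultra} == {frozenset(F) for F in principal}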


Exercises

9.1. The chain condition dealt with in this exercise often appears in algebra. It is named after Marie Ennemond Camille Jordan (1838–1922) and Otto Ludwig Hölder (1859–1937). Let (M, ≤) be a partial order with a minimum element, where each element has finite dimension. Show that the following statements are equivalent:
(i) Any two maximal chains with the same endpoints have the same length (Jordan–Hölder chain condition).
(ii) For all a, b ∈ M, we have that if b is an upper neighbor of a, then the dimensions satisfy dim(b) = dim(a) + 1.

9.2. With the notation of Section 9.3 and Theorem 9.4, let c ∈ ℱ be a partially defined function from Σ to Σ and let w = while b do c od be a while loop. We know that the partially defined function w ∈ ℱ is the least fixed point of the operator Γ_{b,c}.
(a) Let ⊥ be the everywhere undefined function. Is Γ_{b,c}(⊥) = ⊥ true?
(b) Choose b and c such that all f ∈ ℱ are fixed points of the operator Γ_{b,c}. What does the domain of Γ_{b,c}(⊥) look like in this case?
(c) Characterize both w ⊑ id and w ⊑ while b′ do c od.
(d) Let c ∈ ℱ be a total function. Show that Γ_{b,c} has more than one fixed point if and only if w does not always terminate.
(e) Show that Γ_{b,c} might have only one fixed point although w does not always terminate. Hint: By Exercise 9.2d, c ∈ ℱ is not everywhere defined.

9.3. Show the existence of (infinite) Boolean lattices which are not isomorphic to any power set lattice.

9.4. Show the existence of infinite lattices (with minimum element) which have no irreducible elements.

9.5. Show the existence of infinite lattices of sets which have no irreducible elements.

9.6. Let M be a nonempty set and 2^M the power set of M. Show that one does not get a ring if addition and multiplication are defined by A + B = A ∪ B and A · B = A ∩ B.

9.7. Determine the Hasse diagrams of all lattices with 5 elements.


Summary

Notions
– partial order
– linear order/total order
– well founded
– minimal, maximal
– well-order
– neighbor
– Hasse diagram
– chain
– dimension dim(x)
– refinement
– topological sorting/order
– directed subset
– complete partial order
– minimum element ⊥
– maximum element ⊤
– monotone mapping
– continuous mapping
– fixed point
– denotational semantics
– lattice
– sublattice
– complete lattice
– modular lattice
– distributive lattice
– lattices M5 and N5
– irreducible, 𝒥(V)
– lattice of sets
– Boolean lattice
– atom
– ring of sets
– Boolean ring
– algebra of sets
– power set ring
– filter
– principal filter
– ultrafilter

Methods and results
– Each countable partial order has an injective and monotone embedding into ℚ.
– Kleene's fixed point theorem: For each continuous mapping f on a complete partial order, sup{f^i(⊥) | i ≥ 0} is the least fixed point.
– The semantics of a while loop as the least fixed point of the operator Γ_{b,c}
– Every monotone mapping on a complete partial order has a least fixed point.
– The computation rules (V2) to (V4) characterize lattices.
– Knaster–Tarski fixed point theorem: Let f be a monotone mapping on a complete lattice. Then the fixed points of f form a complete lattice again.
– Dedekind: V modular lattice ⇐⇒ V has no sublattice N5
– Birkhoff: V distributive lattice ⇐⇒ V has no sublattices M5, N5
– V is a finite distributive lattice ⇐⇒ V is a finite lattice of sets
– In every Boolean lattice: irreducible ⇐⇒ atom
– Boolean ring = Boolean lattice
– A filter F ⊆ R is an ultrafilter ⇐⇒ for each x ∈ R, either x ∈ F or 1 + x ∈ F
– Stone's theorem: Every Boolean lattice is isomorphic to an algebra of sets.

10 Boolean functions and circuits

Now, let us take an excursion into the theory of circuits. First, we prove a fundamental result by Shannon from 1949, which states that n-ary Boolean functions can be realized by circuits whose size grows as Θ(2^n/n). After that, we will prove the sharper result of Oleg Borisovich Lupanov (1932–2006), by which 2^n/n + o(2^n/n) gates are sufficient to compute arbitrary n-ary Boolean functions. This bound is asymptotically optimal.

In what follows, 𝔹 = {0, 1} denotes the Boolean lattice with 0 < 1 and the operations ∨ for logical OR, ∧ for logical AND, and x̄ for taking the complement of x. The complement x̄ (negation) of x can be written as 1 − x, too. The elements in 𝔹 may also be interpreted as truth values. The Cartesian product 𝔹^n is a Boolean lattice, in which the operations ∨ and ∧ are defined componentwise. Stone's representation theorem already tells us that the lattices 𝔹^n for n ∈ ℕ uniquely represent all finite Boolean lattices (up to isomorphism). On the other hand, we can view any z ∈ 𝔹^n as the bit sequence (z_1, …, z_n) with z_i ∈ {0, 1}, which in turn can be interpreted as a binary representation of the natural number ∑_{i=1}^n z_i 2^{i−1} ∈ {0, …, 2^n − 1}. Thus, the Hasse diagrams corresponding to 𝔹^1, 𝔹^2, and 𝔹^3 represent the number ranges {0, 1}, {0, 1, 2, 3}, and {0, …, 7}, respectively.

[Figure: the Hasse diagrams of 𝔹^1, 𝔹^2, and 𝔹^3, with vertices labeled by the bit strings 0, 1; 00, 01, 10, 11; and 000, …, 111.]

An n-ary Boolean function is a mapping f : 𝔹^n → 𝔹. Any function f : 𝔹^n → 𝔹 can also be interpreted as a property of the numbers between 0 and 2^n − 1. For n = 0, these properties are trivial and correspond to the two constants ⊥ and ⊤. In the following, we assume n ≥ 1. Already for moderate values of n, there are enormously many such properties. Their number is exactly 2^{2^n}. For example, if n = 8, then we have 2^256 properties, which is a decimal number having 78 digits. Circuits take a bit sequence (z_1, …, z_n) ∈ 𝔹^n as their input. On any specific such input, the circuit's evaluation yields a truth value from 𝔹. Thus, circuits are a suitable tool to check properties of natural numbers in the range from 0 to 2^n − 1. Hence, the algorithmic complexity of a property can be measured in terms of the size of a corresponding circuit. We will see that any n-ary Boolean function can be represented by circuits of size 𝒪(2^n/n), and that "almost all" of them actually require this exponential size.


Unfortunately, this very sharp "generic" bound is not of much use for specific functions. Although almost all Boolean functions have extremely high complexity, there is no known specific family of Boolean functions f_n for which the growth of the circuit sizes is at least quadratic in n.

In the following, we will more generally call all mappings of the form f : 𝔹^n → 𝔹^m Boolean functions. The formal definition of circuits over a set of Boolean variables {b_1, …, b_n} is a bit technical. A circuit S is a directed acyclic graph whose vertices are called gates. The indegree of the gates is always either 0 or 2, and to each gate of indegree 2 a type is assigned. Gates of indegree 0 are called input gates. These gates are taken from the set {g_1, …, g_n, ḡ_1, …, ḡ_n}. Gates of indegree 2 are called internal gates, and their type is either ∨ (logical OR) or ∧ (logical AND). Internal gates with outdegree 0 are called output gates. They are numbered as o_1, …, o_m. The size of a circuit S is the number of internal gates.

Every such circuit defines a Boolean function f : 𝔹^n → 𝔹^m as follows. For all z ∈ 𝔹^n, we assign a value g(z) ∈ 𝔹 to each gate g, and then we define f(z) to be (o_1(z), …, o_m(z)). The definition of g(z) ∈ 𝔹 proceeds from the input gates down through the circuit. First, for every input gate we let g_i(z) = z_i and ḡ_i(z) = 1 − z_i. If g is an internal gate with directed edges from gates h_1 and h_2, then we can inductively assume that h_1(z) and h_2(z) are already defined. If the type of g is ∨, then we write g = h_1 ∨ h_2 and define g(z) = h_1(z) ∨ h_2(z). Analogously, if g is of type ∧, then we write g = h_1 ∧ h_2 and define g(z) = h_1(z) ∧ h_2(z).

We observe the following important difference between Boolean formulas and circuits: In circuits, each gate can be the input of several other gates, whereas in formulas identical subformulas have to be included several times if necessary because there is no possibility to reuse them.

A circuit S is a realization of a Boolean function f : 𝔹^n → 𝔹 if o_1(z) = f(z) for all z ∈ 𝔹^n. If in a given circuit all input gates are from the set {g_1, …, g_c, ḡ_1, …, ḡ_c} and the output gates are o_1, …, o_d, then this circuit realizes Boolean functions f : 𝔹^n → 𝔹^m for all n ≥ c and m ≤ d.

[Figure: a circuit S with input gates x, y, ȳ, z and four internal gates of types ∧ and ∨.]

x y z | f(x, y, z)
0 0 0 | 0
0 0 1 | 1
0 1 0 | 0
0 1 1 | 0
1 0 0 | 0
1 0 1 | 1
1 1 0 | 1
1 1 1 | 1

Function f realized by S
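The gate-by-gate evaluation just described is easy to put into code. The following Python sketch (our illustration; the encoding and the example circuit are ours, and the example happens to realize the truth table above, though the wiring of the pictured circuit S may differ) evaluates a circuit given as a list of gates in topological order:

    from typing import List, Tuple, Union

    # A gate is an input literal ("x", i) / ("not", i), or an internal gate
    # ("and", l, r) / ("or", l, r) referring to earlier list positions.
    Gate = Union[Tuple[str, int], Tuple[str, int, int]]

    def evaluate(circuit: List[Gate], z: List[int]) -> int:
        val = []
        for gate in circuit:
            if gate[0] == "x":          # input gate g_i
                val.append(z[gate[1]])
            elif gate[0] == "not":      # negated input gate
                val.append(1 - z[gate[1]])
            elif gate[0] == "and":
                val.append(val[gate[1]] & val[gate[2]])
            else:                       # "or"
                val.append(val[gate[1]] | val[gate[2]])
        return val[-1]                  # last gate is the output gate

    # example: (x ∧ y) ∨ (ȳ ∧ z)
    S = [("x", 0), ("x", 1), ("not", 1), ("x", 2),
         ("and", 0, 1), ("and", 2, 3), ("or", 4, 5)]
    for x in (0, 1):
        for y in (0, 1):
            for zz in (0, 1):
                print(x, y, zz, evaluate(S, [x, y, zz]))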


10.1 Shannon's upper bound

Let f : 𝔹^n → 𝔹 be an arbitrary Boolean function and n ≥ 1. To warm up, we will show two simple upper bounds on the number of internal gates needed to realize f. First, we prove that 2^{n+1} − 3 gates are sufficient. For n = 1, there are exactly four Boolean functions, each of which can be represented with one internal gate. It remains to prove the assertion for n ≥ 2. So now, we let n ≥ 2 and assume the assertion to hold for n − 1. The function f induces two functions f_0 : 𝔹^{n−1} → 𝔹 and f_1 : 𝔹^{n−1} → 𝔹 by defining f_0(z_1, …, z_{n−1}) = f(z_1, …, z_{n−1}, 0) and f_1(z_1, …, z_{n−1}) = f(z_1, …, z_{n−1}, 1). Inductively, we can realize f_0 and f_1 by circuits S_0 and S_1 with at most 2^n − 3 internal gates each. Then we obtain the Shannon decomposition of f as (S_0 ∧ ḡ_n) ∨ (S_1 ∧ g_n).

[Figure: circuit for f, combining the subcircuits S_0 and S_1 as (S_0 ∧ ḡ_n) ∨ (S_1 ∧ g_n).]

For this circuit, at most 2(2^n − 3) + 3 = 2^{n+1} − 3 internal gates are needed. Thus, the claim is proven for all n ≥ 1.

Note that in this construction, the outdegree of each gate is at most 1. By multiple use of intermediate results that have already been computed (thus allowing the outdegree of some gates to be greater than 1), we will obtain better bounds. A simple approach to this is the following. First, determine an upper bound on the number of gates in a circuit that computes all Boolean functions 𝔹^k → 𝔹. The circuit we are looking for has one output gate for each of these functions. Let C_k be the size of this circuit. We claim that

C_k ≤ 4 · 2^{2^k}

There are 4 Boolean functions 𝔹 → 𝔹, each of which requires at most 1 internal gate. So C_1 ≤ 4, and the assertion is shown for k = 1. Now let k ≥ 2. As before, we can split f : 𝔹^k → 𝔹 into two functions f_0, f_1 : 𝔹^{k−1} → 𝔹, and then combine circuits for f_0 and f_1 with 3 additional gates to obtain a circuit for f. Applying this technique to each of the 2^{2^k} functions 𝔹^k → 𝔹, we obtain

C_k ≤ C_{k−1} + 3 · 2^{2^k}

Now, by induction, we get C_k ≤ 4 · 2^{2^{k−1}} + 3 · 2^{2^k} and, since for k ≥ 2 we have 4 · 2^{2^{k−1}} ≤ 2^{2^k}, the assertion follows.

Now, we will use the bound on C_k to slightly refine the bounds for single functions f : 𝔹^n → 𝔹. Suppose we already have a specific circuit for each of the functions 𝔹^k → 𝔹. We want to estimate how many additional gates are needed to realize f if n ≥ k. We claim that 3 · 2^{n−k} − 3 additional gates are sufficient. For n = k, we need no further gates at all, and since 3 · 2^0 − 3 = 0, the claim is satisfied. Now let n > k. Again, we split f into two functions f_0, f_1 : 𝔹^{n−1} → 𝔹 and combine the circuits for f_0 and f_1 with 3 more gates to get a circuit for f. Hence, for f at most 2 · (3 · 2^{n−1−k} − 3) + 3 = 3 · 2^{n−k} − 3 additional gates are needed. Together with the gates needed to realize the functions 𝔹^k → 𝔹, we need less than 3 · 2^{n−k} + 4 · 2^{2^k} gates for f. Now, let k = ⌊log_2 n⌋ − 1 and obtain

3 · 2^{n−k} + 4 · 2^{2^k} = 3 · 2^{n−⌊log_2 n⌋+1} + 4 · 2^{2^{⌊log_2 n⌋−1}} ≤ 3 · 2^{n−log_2 n+2} + 4 · 2^{2^{log_2 n−1}} = 12 · (2^n/n) + 4 · 2^{n/2} ∈ 𝒪(2^n/n)

Thus, we can summarize this section by the following statement.

Theorem 10.1 (Shannon 1949). Every n-ary Boolean function f : 𝔹^n → 𝔹 can be realized by a Boolean circuit where the number of gates is bounded by 𝒪(2^n/n).
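As a quick sanity check (our own snippet, not from the text), one can evaluate the expression 3 · 2^{n−k} + 4 · 2^{2^k} with k = ⌊log₂ n⌋ − 1 and compare it with 2^n/n; the ratio stays bounded by the constant 12 from the computation above:

    def shannon_bound(n: int) -> int:
        k = n.bit_length() - 2          # k = ⌊log₂ n⌋ − 1 for n ≥ 2
        return 3 * 2 ** (n - k) + 4 * 2 ** (2 ** k)

    for n in (8, 16, 32, 64):
        print(n, shannon_bound(n), round(shannon_bound(n) / (2 ** n / n), 2))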

10.2 Shannon's lower bound

Now, we will show the counterpart of the upper bound in Theorem 10.1: almost all n-ary Boolean functions require circuits of size 2^n/n to be realized. The reason for that is quite simple: there are not enough circuits of size 2^n/n to realize 2^{2^n} functions. We may even extend the computational power of the gates without changing this result. Throughout this section, we will allow gates for each binary Boolean function. Thus, instead of two different types, we will have 16 = 2^{2^2} of them. Then we can even do without negated input gates; we will only assume the n input gates g_1, …, g_n. One internal gate will be specified as the circuit's output gate, and thus each circuit with these 16 possible gate types computes an n-ary Boolean function. The gates can be specified as triples (T, ℓ, r), where T is the gate's type and ℓ and r denote the left and right input gates, respectively. Our goal is to show that almost all functions require at least 2^n/n gates. For this, we need some preparation.

A circuit is called reduced if any two different internal gates define different Boolean functions. Note, however, that in reduced circuits internal gates might behave exactly like input gates.

Lemma 10.2. Let S be a circuit with s internal gates and n input gates. Then for every t with s ≤ t ≤ 2^{2^n} there is a reduced circuit with t internal gates computing the same Boolean function as S.

Proof. We order the gates of S from left to right, such that for each gate g the respective left and right input gates can be found to the left of g. Thus, the input gates appear as the first n gates in this order. (We assume a topological order of the gates.) Then, we extend the order to the right with additional internal gates, until finally each of the 2^{2^n} Boolean functions is represented by an internal gate. During this process, we keep the order in such a way that the respective input gates always are further to the left. Now, this very large new circuit is reduced from left to right. If two internal gates represent the same function, we can do without the gate which is further to the right because all connections leaving this gate can be redirected without destroying the order. This way the large circuit is reduced more and more. We stop the process as soon as we have a reduced circuit with t internal gates. The gates on the right that have not yet been considered are deleted. Because of s ≤ t, there is an internal gate in the reduced circuit which computes the same function as S.

Let S be a circuit with s internal gates. In the next step, we will associate the gates with the numbers 1, …, s, s + 1, …, s + n, where the order is given as follows: The output gate is called s, and the input gates g_1, …, g_n are associated with the numbers s + 1, …, s + n. Thus, we obtain a description of the circuit as a sequence of triples

S = ((T_1, ℓ_1, r_1), …, (T_s, ℓ_s, r_s))

where T_i is one of the 16 possible gate types and ℓ_i and r_i are the left and right predecessors, respectively, of the internal gate i. If π is a permutation of {1, …, s + n} which fixes the numbers s, s + 1, …, s + n, then π changes the names of internal gates, but not the functionality of the circuit. This leads to a new circuit description π(S) of the same circuit. The gate π(i) in the circuit π(S) evaluates to the same value as the gate i in S. Since π leaves the output gate fixed, gate s computes the same Boolean function as in S. If the permutation π is applied, the resulting circuit description has the form π(S) = ((T′_1, ℓ′_1, r′_1), …, (T′_s, ℓ′_s, r′_s)), with (T′_j, ℓ′_j, r′_j) = (T_i, π(ℓ_i), π(r_i)) for π(i) = j. If the circuit is reduced and π ≠ id, then for the descriptions we obtain S ≠ π(S), which can be shown as follows. Since π ≠ id, there are two gates i and j such that π(i) = j ≠ i. If the descriptions S and π(S) were identical, then in particular (T_j, ℓ_j, r_j) = (T′_j, ℓ′_j, r′_j) = (T_i, π(ℓ_i), π(r_i)) would hold. But then T_i = T_j, π(ℓ_i) = ℓ_j, and π(r_i) = r_j, and therefore the original gates i and j would both have the same evaluation, which is impossible in reduced circuits.

Lemma 10.3. Let s ≥ n. Then the number of n-ary Boolean functions that can be computed by circuits with at most s internal gates is bounded by a function in 2^{s log s + 𝒪(s)}.

Proof. If f : 𝔹^n → 𝔹 is one of these functions, then, by Lemma 10.2, f is computed by a reduced circuit S with exactly s internal gates. The description of S is given by a list of triples

((T_1, ℓ_1, r_1), …, (T_s, ℓ_s, r_s))


and the number of such sequences of triples is less than (16(s + n)(s + n))^s ≤ k^s · s^{2s} for a constant k ≤ 64^2. However, each circuit S computing the function f has (s − 1)! descriptions π(S), which are pairwise distinct. Thus, in the desired estimate we may divide the number k^s · s^{2s} by (s − 1)!. According to Equation (3.1), we have s! ≥ (s/e)^s, and thus we obtain

k^s · s^{2s}/(s − 1)! ≤ s · (ek)^s · s^{2s}/s^s ∈ 2^{s log s + 𝒪(s)}

This completes the proof of the claim.

Theorem 10.4 (Shannon 1949). The fraction of n-ary Boolean functions computable by circuits having at most 2^n/n internal gates approaches 0 as n → ∞.

Proof. According to Lemma 10.3, there are at most 2^{s log s + 𝒪(s)} functions computable up to size s. If we let s = 2^n/n and take the logarithm, we obtain 2^n − (2^n/n)·log n + 𝒪(2^n/n). In order to recognize the relevant functions computable in size s among the 2^{2^n} functions, we need to subtract log(2^{2^n}) = 2^n and observe that the difference tends to −∞. So the fraction, as claimed, approaches 0.

For a Boolean function f : 𝔹^n → 𝔹, let s_f be the minimum number of gates required by any circuit realizing f. Moreover, let s(n) be the maximum value of s_f over all functions f : 𝔹^n → 𝔹. In other words, s(n) is the minimum number of gates needed to be able to realize every arbitrary n-ary Boolean function. The function s(n) satisfies s(n) ∈ Θ(2^n/n) because, by Theorem 10.4, we have s(n) ∈ Ω(2^n/n), while the upper bound s(n) ∈ 𝒪(2^n/n) was already proven in Section 10.1.
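The notion of a reduced circuit is easy to experiment with. The following Python sketch (our illustration with made-up helper names, restricted to three gate types instead of all 16) computes the truth table of every gate as a bitmask and merges gates that define the same function, mimicking the left-to-right reduction in the proof of Lemma 10.2:

    def reduce_circuit(n, gates):
        """gates: list of ('and'|'or'|'xor', l, r) in topological order,
        where indices 0..n-1 denote the input gates and n+i denotes the
        i-th internal gate.  Returns an equivalent gate list in which no
        two gates compute the same function."""
        masks = []                       # truth table of each gate as bitmask
        for i in range(n):
            masks.append(sum(((z >> i) & 1) << z for z in range(2 ** n)))
        op = {'and': lambda a, b: a & b,
              'or':  lambda a, b: a | b,
              'xor': lambda a, b: a ^ b}
        seen, rename, out = {}, {}, []
        for j, (t, l, r) in enumerate(gates, start=n):
            l, r = rename.get(l, l), rename.get(r, r)
            m = op[t](masks[l], masks[r]) & ((1 << 2 ** n) - 1)
            if m in seen:                # same function computed before:
                rename[j] = seen[m]      # redirect all later references
            else:
                seen[m] = rename[j] = n + len(out)
                out.append((t, l, r))
                masks.append(m)
        return out

    # the duplicate AND and the OR of two equal gates all collapse:
    print(reduce_circuit(2, [('and', 0, 1), ('and', 0, 1), ('or', 2, 3)]))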

10.3 Lupanov's asymptotically optimal upper bound

The upper bound 𝒪(2^n/n) in Theorem 10.1 for the circuit complexity of n-ary Boolean functions was improved by Lupanov (1932–2006) in [31]. He showed that every n-ary Boolean function can be realized by a circuit with 2^n/n + o(2^n/n) gates. The goal of this section is to prove this bound, which is asymptotically optimal by Theorem 10.4.

For the proof, we first consider Boolean matrices as a very important tool. Let R and S be finite index sets and let F be a Boolean matrix from 𝔹^{R×S}. We view R and S as disjoint sets of Boolean variables. An element ρ ∈ 𝔹^R can be interpreted as a subset of the rows R. Hence, for all x ∈ R, the Boolean value ρ(x) evaluates to 1 if and only if x belongs to the set represented by ρ. An analogous view is possible for σ ∈ 𝔹^S, too. Now, the matrix F defines a Boolean function H : 𝔹^R × 𝔹^S → 𝔹 by

H(ρ, σ) = 1 if there exist x ∈ R and y ∈ S with 1 = ρ(x) = σ(y) = F(x, y), and H(ρ, σ) = 0 otherwise.

If we read ρ(x) = 1 as "ρ activates row x", and σ(y) = 1 analogously for columns, then H(ρ, σ) = 1 means that F(x, y) = 1 for some activated row x and some activated column y. We say that a circuit with input gates h_x and h_y for x ∈ R and y ∈ S realizes the matrix F if it realizes the Boolean function H. Note that negated input gates h̄_x and h̄_y are not assumed.

Lemma 10.5. Let R, S be finite index sets and F a Boolean matrix from 𝔹^{R×S}. If there are at most p nonzero rows in F, then F can be realized by a circuit of size 3 · 2^p + |S|.

Proof. Let P ⊆ R be a subset of the rows containing all rows with nonzero entries, and let |P| = p. Then F has at most 2^p different columns. We define H : 𝔹^P × 𝔹^S → 𝔹 by

H(ρ, σ) = 1 if there exist x ∈ P and y ∈ S with 1 = ρ(x) = σ(y) = F(x, y), and H(ρ, σ) = 0 otherwise,

for ρ ∈ 𝔹^P and σ ∈ 𝔹^S. It suffices to realize H by a circuit with input gates h_x and h_y for x ∈ P and y ∈ S. The negated gates h̄_x and h̄_y shall not occur in the construction. For a column s ∈ S, let J(s) ⊆ S be the set of columns of F that coincide with s, formally

J(s) = {y ∈ S | ∀x ∈ P: F(x, y) = F(x, s)}

Moreover, for any s ∈ S we define two Boolean functions:

E_s(ρ) = ⋁{ρ(x) | F(x, s) = 1}
G_s(σ) = ⋁{σ(y) | y ∈ J(s)}

The functions E_s : 𝔹^P → 𝔹 and G_s : 𝔹^S → 𝔹 can be realized by circuits of the form ⋁{h_x | F(x, s) = 1} and ⋁{h_y | y ∈ J(s)}, respectively. Using another AND gate, we can realize the following function:

H_s(ρ, σ) = E_s(ρ) ∧ G_s(σ)

Hence, H_s(ρ, σ) = 1 is equivalent to the existence of a row x ∈ P and a column y ∈ S satisfying the following three properties:
(1) We have 1 = ρ(x) = σ(y).
(2) The columns y and s are equal.
(3) F(x, y) is equal to 1.

Our next step is to construct a circuit for H as a disjunction of circuits for H_s for a suitable set of columns s. For that, we choose a minimal set Q ⊆ S satisfying S = ⋃{J(s) | s ∈ Q}. This means that each column of the matrix F appears exactly once as a column of the same form in Q. Now we have

H(ρ, σ) = ⋁{H_s(ρ, σ) | s ∈ Q}

The number of gates necessary to implement all functions G_s is bounded by |S| because the sets J(s) define a partition of S. For the functions E_s, overall 2^p gates are sufficient. Now, |Q| ≤ 2^p yields the following upper bound for the size of the circuit for H:

2^p (for the E_s) + |S| (for the G_s) + 2^p (one AND gate for each H_s) + 2^p (OR gates for H) = 3 · 2^p + |S|

This completes the proof of the lemma.

Theorem 10.6 (Lupanov 1958). Every n-ary Boolean function can be realized by a circuit of size 2^n/n + o(2^n/n).

Proof. Let f : 𝔹^n → 𝔹 be an n-ary Boolean function and let 1 ≤ k < n. Considering the arguments of f, we view the first k parameters as one unit and the remaining n − k parameters as the other. Let R = 𝔹^k and S = 𝔹^{n−k}, and define a matrix F ∈ 𝔹^{R×S} by F(x, y) = f(x, y) for a row index x ∈ R and a column index y ∈ S. Moreover, in F we combine each block of p rows into a new matrix. For this, let R = A_1 ∪ ⋯ ∪ A_ℓ be a partition of R with |A_i| ≤ p and ℓ = ⌈2^k/p⌉. Now, the matrices F_i ∈ 𝔹^{R×S} are defined by

F_i(x, y) = F(x, y) for x ∈ A_i, and F_i(x, y) = 0 otherwise.

Thus, we have F = F_1 ∨ ⋯ ∨ F_ℓ, and there are at most p nonzero rows in each F_i. By Lemma 10.5, every matrix F_i can be realized with 3 · 2^p + |S| gates. Hence, F has a circuit of size

ℓ · (3 · 2^p + 2^{n−k}) + ℓ

where the last summand ℓ accounts for the OR gates combining the F_i. Now, to obtain a realization of the function f, it remains to provide the input gates h_x and h_y for F. Let g_1, …, g_n, ḡ_1, …, ḡ_n be the input gates of the circuit for f. Then h_x results as a conjunction of k gates from {g_1, …, g_k, ḡ_1, …, ḡ_k}, and h_y as a conjunction of n − k gates from {g_{k+1}, …, g_n, ḡ_{k+1}, …, ḡ_n}. Thus, computing all the values h_x and h_y for x ∈ R and y ∈ S requires no more than k · 2^k + (n − k) · 2^{n−k} gates. Overall, we get the following upper bound for the size of the circuit for f:

⌈2^k/p⌉ · (3 · 2^p + 2^{n−k}) + ⌈2^k/p⌉ + k · 2^k + (n − k) · 2^{n−k}

We now let k = 3⌈log_2 n⌉ and p = n − 5⌈log_2 n⌉. Then, the rightmost part of the above bound satisfies

3 · 2^p + 2^{n−k} + ⌈2^k/p⌉ + k · 2^k + (n − k) · 2^{n−k} ∈ o(2^n/n)

For the other part, we have

(⌈2^k/p⌉ − 1) · (3 · 2^p + 2^{n−k}) ≤ (2^k/p) · (3 · 2^p + 2^{n−k}) = 3 · 2^{k+p}/p + 2^n/p ∈ 2^n/n + o(2^n/n)

The last estimate is due to the fact that, for sufficiently large n, we have the inequality 1/(n − log n) ≤ 1/n + 1/(n log n). Altogether, we have shown that the circuit for f needs at most 2^n/n + o(2^n/n) internal gates.

Corollary 10.7. Let s(n) be the smallest natural number such that every n-ary Boolean function can be realized by a circuit with s(n) gates. Then, asymptotically, we have s(n) ∼ 2^n/n.

Proof. By Theorem 10.4, we have s(n) ≥ 2^n/n. Lupanov's bound shows s(n) ∈ 2^n/n + o(2^n/n). Together, these two results yield the claim of the corollary.

The circuit we constructed in Theorem 10.6 works in just four levels, so there are at most three alternations between AND and OR gates. The top level (an AND level) is dedicated to the construction of the values h_x and h_y. Then, the construction in Lemma 10.5 yields an OR level for the E_s and G_s. The third level is an AND level again; it is needed to construct the circuits for the H_s. Finally, there is another OR level for the construction of H and the disjunction of the matrices F_i. Note that common normal forms for Boolean formulas, such as the disjunctive normal form or the conjunctive normal form, only require two logical levels. For further reading, we refer to the textbooks [38, 39].
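It is instructive to plug Lupanov's choices k = 3⌈log₂ n⌉ and p = n − 5⌈log₂ n⌉ into the full gate-count expression from the proof of Theorem 10.6. The following snippet (our own experiment, not part of the text) shows the ratio to 2^n/n slowly approaching 1:

    import math

    def lupanov_bound(n: int) -> int:
        k = 3 * math.ceil(math.log2(n))
        p = n - 5 * math.ceil(math.log2(n))
        l = -(-2 ** k // p)                       # ℓ = ⌈2^k/p⌉
        return (l * (3 * 2 ** p + 2 ** (n - k)) + l
                + k * 2 ** k + (n - k) * 2 ** (n - k))

    for n in (32, 64, 128, 256):
        print(n, round(lupanov_bound(n) / (2 ** n / n), 2))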

Solutions

Chapter 2

2.1. The fact that p is prime is not important here. It suffices that p is not a power of 10 because log_10(p) = r/s implies log_10(p^s) = r and thus p^s = 10^r.

2.2(a) The Euclidean algorithm yields:

56 = 2 · 35 − 14,  i.e., 14 = 2 · 35 − 56,
35 = 2 · 14 + 7,   i.e., 7 = 35 − 2 · 14,
14 = 2 · 7,        i.e., 7 = gcd(35, 56).

Substituting the first equation into the second yields 7 = 35 − 2·(2·35 − 56) = −3·35 − (−2)·56, and so x = −3, y = −2.

2.2(b)

7 = −3·35 + (56/7)·35 − (35/7)·56 + 2·56
  = (56/7 − 3)·35 − (35/7 − 2)·56
  = 5·35 − 3·56, and thus x = 5, y = 3.
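For readers who want to replay such computations, here is a standard extended Euclidean algorithm in Python (a generic sketch, not taken from the book):

    def extended_gcd(a: int, b: int):
        """Return (g, x, y) with g = gcd(a, b) = a*x + b*y."""
        if b == 0:
            return a, 1, 0
        g, x, y = extended_gcd(b, a % b)
        return g, y, x - (a // b) * y

    print(extended_gcd(35, 56))   # (7, -3, 2), i.e., 7 = -3*35 + 2*56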

2.3. Note that 3x − 7y ≡ 11 mod 13 is equivalent to 3x + 6y ≡ 11 mod 13. We multiply this by 11 to obtain −6x + y ≡ 4 mod 13, or y ≡ 6x + 4 mod 13. Thus, to obtain an arbitrary solution of the congruence, we may choose any integer x and take y from the set 6x + 4 + 13ℤ. Therefore, the general solution is

{(x, y) | x ∈ ℤ and y ∈ 6x + 4 + 13ℤ}

2.4. Every common divisor of a + b and a − b also divides 2a and 2b. Thus, the assertion follows from gcd(2a, 2b) = 2.

2.5(a) Let n = ∑_{k=0}^ℓ a_k 10^k. Then the cross-sum is q(n) = ∑_{k=0}^ℓ a_k. Now, 10 ≡ 1 mod 3 yields n ≡ q(n) mod 3. This proves the divisibility rule of three.

2.5(b) Let n = ∑_{k=0}^ℓ a_k 10^k again. Then 10^k ≡ (−1)^k mod 11. The rule for divisibility by 11 follows: n ≡ ∑_{k=0}^ℓ (−1)^k a_k mod 11. A number is divisible by 11 if and only if its alternating digit sum is divisible by 11.

2.6. Observe that n^4 + 4^n for even n is an even number, and it is greater than 2. Hence, we can restrict ourselves to the cases n = 2k + 1 for k ≥ 1. Let x = n and y = 2^k; then n^4 + 4^n = x^4 + 4y^4 because 4y^4 = 4 · 2^{4k} = 4^{2k+1}. Using binomial formulas, we obtain x^4 + 4y^4 = (x^2 + 2y^2)^2 − 4x^2y^2 = (x^2 + 2y^2 + 2xy)(x^2 + 2y^2 − 2xy). Since x, y ∈ ℕ and y > 1,

we have x^2 + 2y^2 + 2xy ≥ x^2 + 2y^2 − 2xy ≥ (x − y)^2 + y^2 ≥ 4. In particular, x^4 + 4y^4 is not a prime number.

2.7(a) Suppose n = pq with p, q > 1. Consider the identity

x^q − y^q = (x − y) ∑_{i=0}^{q−1} x^i y^{q−1−i}    (1)

for x = 2^p and y = 1.

2.7(b) Suppose n = r·2^m for an odd factor r > 1. Then 2^{r·2^m} + 1 = (2^{2^m})^r − (−1)^r. Equation (1) with x = 2^{2^m}, y = −1, and q = r yields a nontrivial factorization again.

2.7(c) Let 1 ≤ d ∈ ℕ with d | f_m and d | f_n. Then

(f_n − 2)/f_m = (2^{2^n} − 1)/(2^{2^m} + 1) = (2^{2^m})^{2^{n−m}−1} − (2^{2^m})^{2^{n−m}−2} + ⋯ − 1

Therefore, f_m | (f_n − 2), and hence also d | (f_n − 2). This, together with d | f_n, yields d | 2. But we know d ≠ 2 because f_n and f_m are both odd. Since the numbers f_n are pairwise coprime and each f_n has at least one prime factor, the sequence of prime numbers must be infinite.

2.8. The solution is 111: The two requirements x ≡ 1 mod 2 and x ≡ 0 mod 3 are equivalent to x ≡ 3 mod 6. Adding the third equation, we get 3 + 6k ≡ 1 mod 5, which yields k ≡ 3 mod 5, so k = 3 + 5ℓ and x = 3 + 6(3 + 5ℓ) = 21 + 30ℓ for an ℓ ∈ ℕ. Then, by the fourth equation, we have 21 + 30ℓ ≡ 6 mod 7, i.e., 2ℓ ≡ 6 mod 7, or equivalently, ℓ ≡ 3 mod 7. We get the unique solution x = 21 + 30 · 3 = 111, which is the only positive solution in the range up to lcm{2, 3, 5, 7} = 210.

2.9. Let x_0 be a solution of the given system. Then x_0 − a = rn and x_0 − b = sm for suitable integers r and s, and consequently a − b = (x_0 − b) − (x_0 − a) = sm − rn. Define d = gcd(n, m). The linear combination sm − rn is a multiple of d, so d | (a − b). Conversely, let d be a divisor of a − b. Then a − b = kd. Moreover, there is a representation d = nx + my. Hence, a − knx = b + kmy = x_0 is a solution of the system. Now let t = lcm(n, m); then for an arbitrary solution x_0, the numbers x_0 + kt are solutions for any k ∈ ℤ, too. We have to show uniqueness modulo t. Suppose x_1 is another solution. Then both m and n divide x_1 − x_0, and therefore t = lcm(n, m) also divides x_1 − x_0. Hence, x_1 is of the form x_1 = x_0 + kt for a suitable value of k.

2.10(a) By Fermat's little theorem, we have n^5 ≡ n mod 2, n^5 ≡ n mod 3, and n^5 ≡ n mod 5. Now, the Chinese remainder theorem yields n^5 ≡ n mod 30.

2.10(b) We have

3^{n^4+n^2+2n+4} ≡ 0 ≡ 21 mod 3    (2)
3^{n^4+n^2+2n+4} ≡ 1 ≡ 21 mod 4    (3)
3^{n^4+n^2+2n+4} ≡ 1 ≡ 21 mod 5    (4)

Equation (2) is obvious; the left-hand side clearly is divisible by 3. Equation (3) is valid because the base 3 is congruent to −1 modulo 4 and the exponent is even for every n ∈ ℕ. For Equation (4), note that the exponent n^4 + n^2 + 2n + 4 is divisible by 4, which can easily be checked, and 3^4 = 81 ≡ 1 mod 5. The claim now follows using the Chinese remainder theorem.

2.10(c) Using 64 ≡ 7 mod 57, we get 7^{n+2} + 8^{2n+1} = 49 · 7^n + 8 · (8^2)^n = 49 · 7^n + 8 · 64^n ≡ 49 · 7^n + 8 · 7^n = 57 · 7^n ≡ 0 mod 57.

2.11. By Fermat's little theorem, we have a^{p−1} ≡ 1 mod p because a and p are coprime. Moreover, p − 1 is even, so we can write p − 1 = 2k. Then, from gcd(a, 4) = 1, using Euler's theorem we obtain a^{p−1} = a^{2k} = (a^{φ(4)})^k ≡ 1^k ≡ 1 mod 4. Now the claim follows from gcd(4, p) = 1 with the Chinese remainder theorem.

2.12(a) |(ℤ/51ℤ)^*| = φ(51) = φ(3)·φ(17) = 2 · 16 = 32.

2.12(b) From φ(51) = 32 = 3 · 11 − 1, we get 3 · 11 ≡ 1 mod 32. Hence, the secret key is s = 3.

2.12(c) 7^3 ≡ 49 · 7 ≡ −2 · 7 ≡ −14 ≡ 37 mod 51, i.e., x = 37.

2.12(d) According to Corollary 1.14, the order of an element divides the group order. Here, the group order is 32, and 10 does not divide 32. Consequently, there are no elements of order 10 in (ℤ/51ℤ)^*.

2.12(e) For all elements of (ℤ/17ℤ)^*, but also for all elements of (ℤ/3ℤ)^*, we have a^{16} = 1. So the Chinese remainder theorem shows that a^{16} = 1 is true for all elements in (ℤ/51ℤ)^* as well. In particular, there are no elements of order 32, and thus (ℤ/51ℤ)^* is not cyclic.

2.13. Surprisingly, the budget committee is right in this case. According to the Chinese remainder theorem and up to symmetry in p and q, the required property is just that x^{es} ≡ x mod p holds for all x. This is guaranteed if es ≡ 1 mod k for a multiple k ∈ (p − 1)ℤ. Taking the symmetry into account, k ∈ lcm(p − 1, q − 1)ℤ is sufficient. For p = 7 and q = 19 with e = 5, the budget committee would have recommended s = 11 instead of s = 65.

2.14(a) We have d(c(x)) = (x^e mod n)^s mod n ≡ x^{es} ≡ x^{1+k(p−1)} ≡ x mod p for some k ∈ ℕ. Analogously, we obtain d(c(x)) ≡ x mod q and d(c(x)) ≡ x mod r. The Chinese remainder theorem then yields d(c(x)) ≡ x mod n. Finally, using x ∈ {0, …, n − 1}, we obtain d(c(x)) = x.

2.14(b) Note that φ(66) = 20 and 1 ≡ 21 = 3 · 7 ≡ 3 · 27 mod 20. Thus, the decryption exponent is s = 3. Then 14^3 ≡ 0^3 ≡ 0 mod 2, 14^3 ≡ (−1)^3 ≡ −1 ≡ 2 mod 3, and 14^3 ≡ 3^3 ≡ 27 ≡ 5 mod 11. The Chinese remainder theorem yields x = 38.
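Several of these solutions combine residues via the Chinese remainder theorem. A small Python helper (our generic sketch, not from the book) performs this combination mechanically:

    from math import gcd

    def crt(residues, moduli):
        """Solve x ≡ r_i (mod m_i) for pairwise coprime moduli."""
        x, m = 0, 1
        for r, mi in zip(residues, moduli):
            assert gcd(m, mi) == 1
            # pow(m, -1, mi) is the inverse of m modulo mi (Python >= 3.8)
            x += m * ((r - x) * pow(m, -1, mi) % mi)
            m *= mi
        return x % m

    print(crt([1, 0, 1, 6], [2, 3, 5, 7]))   # 111, as in Exercise 2.8
    print(crt([0, 2, 5], [2, 3, 11]))        # 38, as in Exercise 2.14(b)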

2.15. For the coprime numbers e_1 and e_2, the extended Euclidean algorithm yields two numbers a, b such that ae_1 + be_2 = 1. Using these together with the encrypted messages m^{e_1} mod n and m^{e_2} mod n, Oscar can determine the plaintext as follows: (m^{e_1})^a · (m^{e_2})^b = m^{ae_1+be_2} ≡ m mod n.

2.16. If two of the numbers n_1, n_2, n_3 are not coprime, then Oscar can compute their gcd and thus obtain a factorization. Therefore, we now assume that n_1, n_2, and n_3 are pairwise coprime. Using the Chinese remainder theorem, Oscar can determine a number x ∈ {1, …, n_1·n_2·n_3} satisfying x ≡ m^3 mod n_1·n_2·n_3. Since m < n_i, we know m^3 < n_1·n_2·n_3 and thus x = m^3. So the message m = ∛x can be computed by evaluating the root.

2.17. By Theorem 2.19, we have a^{φ(b)} + b^{φ(a)} ≡ a^{φ(b)} ≡ 1 mod b and, analogously, a^{φ(b)} + b^{φ(a)} ≡ 1 mod a. Since a and b are coprime, the Chinese remainder theorem yields a^{φ(b)} + b^{φ(a)} ≡ 1 mod ab.

2.18(a) We have F_1 = F_3 − F_2, …, F_n = F_{n+2} − F_{n+1}. Summing up yields F_1 + ⋯ + F_n = F_{n+2} − F_2 = F_{n+2} − 1.

2.18(b) For n = 0, we obtain ∑_{k=0}^0 F_k^2 = F_0^2 = 0 = F_0 F_1. Now let n > 0. Then ∑_{k=0}^n F_k^2 = ∑_{k=0}^{n−1} F_k^2 + F_n^2 = F_{n−1}F_n + F_n^2 = F_n(F_{n−1} + F_n) = F_n F_{n+1}.

2.18(c) The claim is obvious for k = 1. Now let k > 1. Then

F_k F_{n+1} + F_{k−1} F_n = (F_{k−1} + F_{k−2})F_{n+1} + F_{k−1}F_n
                          = F_{k−1}F_{n+1} + F_{k−2}F_{n+1} + F_{k−1}F_n
                          = F_{k−1}(F_{n+1} + F_n) + F_{k−2}F_{n+1}
                          = F_{k−1}F_{n+2} + F_{k−2}F_{n+1}
                          = F_{(n+1)+(k−1)} = F_{n+k}

2.18(d) 1. Proof by induction. Obviously, F_2 F_0 − F_1^2 = −1, and for n > 1 we obtain

F_{n+1}F_{n−1} − F_n^2 = (F_n + F_{n−1})F_{n−1} − F_n^2
                       = F_n F_{n−1} + F_{n−1}^2 − F_n^2
                       = F_n(F_{n−1} − F_n) + F_{n−1}^2
                       = −F_n F_{n−2} + F_{n−1}^2
                       = −(−1)^{n−1}    (by induction)
                       = (−1)^n

2. Proof using matrices. According to Equation (2.5), we have the matrix equality

( F_{n−1}  F_n     )     ( 0  1 )^n
( F_n      F_{n+1} )  =  ( 1  1 )

The determinant of (0 1; 1 1) is −1; therefore, the determinant of the right-hand side is (−1)^n. The determinant of the matrix on the left-hand side is F_{n+1}F_{n−1} − F_n^2. Thus, the claim is proven.
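The matrix identity also gives a fast way to compute Fibonacci numbers. The following Python sketch (ours) raises the matrix (0 1; 1 1) to the n-th power by repeated squaring and checks Cassini's identity F_{n+1}F_{n−1} − F_n^2 = (−1)^n along the way:

    def mat_mul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    def mat_pow(A, n):
        R = [[1, 0], [0, 1]]            # identity matrix
        while n:
            if n & 1:
                R = mat_mul(R, A)
            A = mat_mul(A, A)
            n >>= 1
        return R

    Q = [[0, 1], [1, 1]]
    for n in range(1, 10):
        M = mat_pow(Q, n)               # M = [[F_{n-1}, F_n], [F_n, F_{n+1}]]
        f_prev, f_n, f_next = M[0][0], M[0][1], M[1][1]
        assert f_next * f_prev - f_n ** 2 == (-1) ** n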

Chapter 3

� 225

2.19(a) Since f p is the identical map, f is a bijection. If f i (m) = f j (m) for some 1 ≤ i < j ≤ p, then f q (m) = m for q = j − i < p. This implies f gcd(p,q) (m) = m and so f (m) = m because gcd(p, q) = 1. But if f (m) = m, then also f k (m) = m for all k ∈ ℕ. 2.19(b) Define the relation m ∼ n by f i (m) = f j (n) for some i, j ∈ ℕ. This relation clearly is an equivalence relation. According to Exercise 2.19a, the class of an m ∈ M \ F has exactly p elements. Hence, the number of nonfixed points is divisible by p. 2.20(a) We prove this by induction on n. For n ∈ {1, 2}, the equation can be easily verified. Moreover, we get Ln+2 = Ln+1 + Ln = (Fn+2 + Fn ) + (Fn+1 + Fn−1 )

= (Fn+2 + Fn+1 ) + (Fn + Fn−1 ) = Fn+3 + Fn+1

2.20(b) We have ℒ_1 = {∅} because 1 is the successor of 1 modulo 1. The set ℒ_2 consists of the three subsets ∅, {1}, and {2}. Now let n ≥ 3. If we do not compute modulo n, then the number of corresponding subsets of {1, …, n} is just the number of words over the letters a, b in which no two successive a's appear. By Example 2.25, there are F_{n+2} such words. Now, if we compute modulo n, then the number of subsets M ∈ ℒ_n with 1 ∉ M is F_{n+1} because in this case the fact that 1 is the successor of n has no relevance. Finally, let us consider the subsets M ∈ ℒ_n with 1 ∈ M. Then positions 2 and n have to be filled with b's. Moreover, we have 2 < n, so the number of such sets M (again by Example 2.25) is F_{n−1}. Now Exercise 2.20a yields the claim.

Chapter 3 3.1(a) For n = 0, we have (1 + x)0 = 1 = 1 + 0x. Now let n > 0. Inductively we obtain (1 + x)n = (1 + x)(1 + x)n−1 ≥ (1 + x)(1 + (n − 1)x) = 1 + nx + (n − 1)x 2 ≥ 1 + nx. 3.1(b) If ex ≥ 1 + x is valid for all x, then for x > 0 we obtain x = eln x ≥ 1 + ln x, and thus ln x ≤ x − 1. Hence, it suffices to show ex ≥ 1 + x for all x. (1) Proof using power series representation. By definition, ex = ∑n≥0 x n /n!, so we have to show ∑n≥2 x n /n! ≥ 0. Note that for even n, x n /n! ≥ −x n+1 /(n + 1)! is equivalent

226 � Solutions to n + 1 ≥ −x. In particular, for −1 ≤ x the claim is obvious, if we always combine two summands. Thus ex/m > 0 for all x, if m is chosen sufficiently large. But then we also get ex = (ex/m )m > 0 for all x. Observing that 1 + x < 0 for x < −1, we see that the claim holds for all x ∈ ℝ. (2) Proof using curve sketching. The function f (x) = ex − x − 1 has a minimum at x = 0 (the derivative ex − 1 has its only zero there). Moreover, f (x) tends to infinity as x → ±∞. From f (0) = 0, it follows that x = 0 is the only root of f . Hence, we obtain f (x) > 0 for all x ≠ 0. This yields the desired result. 3.1(c) For −x ≤ n ≠ 0, both sides of the inequality are positive. Therefore, the claim follows by taking the nth root on both sides, and then using Exercise 3.1b. x . The function has a zero at x = 0. 3.1(d) Consider the function f (x) = ln(x + 1) − x+1 1 1 Moreover, f also has its minimum there because the derivative f ′ (x) = x+1 − (x+1) 2 = x is positive for x > 0 and negative for x < 0. (x+1)2

3.2. We can successively rearrange π by considering situations with bπ(i) > bπ(i+1) and swapping π(i) and π(i + 1). This will finally yield the identity permutation. Therefore, it suffices to show that S(π) never decreases by an exchange like this. The according situation only concerns two positions, hence we may assume n = 2. So let a1 ≤ a2 and b1 ≤ b2 . We have to compare the sums S = a1 b1 +a2 b2 and S ′ = a1 b2 +a2 b1 . The difference S − S ′ is nonnegative because a1 b1 + a2 b2 − a1 b2 − a2 b1 = a1 (b1 − b2 ) + a2 (b2 − b1 ) = (a2 − a1 )(b2 − b1 ). 3.3. We have H = n/(∑i ai−1 ) ≥ n/(∑ni=1 (minj aj )−1 ) = minj aj and Q = √n−1 ∑ni=1 ai2 ≤

√n−1 ∑ni=1 (maxj aj )2 = maxj aj . Now we want to show G ≤ A by induction. For n = 1, the inequality is obvious. So let n > 1. If all ai are equal, then G = A. Otherwise, without restriction we may assume a1 > A and a2 < A. Now we define y = a1 + a2 − A. Then (n − 1)A = y + a3 + ⋅ ⋅ ⋅ + an , hence A is also the arithmetic mean of y, a3 , . . . , an . Moreover, we have yA − a1 a2 = a1 A + a2 A − A2 − a1 a2 = (a1 − A)(A − a2 ) > 0. Thus, by induction we obtain An = A ⋅ An−1 ≥ A ⋅ y ⋅ a3 ⋅ ⋅ ⋅ an ≥ a1 ⋅ ⋅ ⋅ an . As to H ≤ G, we have H = n/ ∑ni=1 ai−1 = ∏nj=1 aj /(n−1 ∑ni=1 ∏j=i̸ aj ). The denominator here is an arithmetic mean, which, according to what we have already shown, is greater than or equal to the geometric mean: H≤

∏nj=1 aj

n n √∏ i=1

∏j=i̸ aj

=

∏nj=1 aj

n n n−1 √(∏ j=1 aj )

=G

Finally, we show A ≤ Q. Again we use G ≤ A and obtain ∑ni=1 (√n)−1



ai /√∑nj=1 aj2

=

∑ni=1 √(√n)−2

1 ≤ ∑ni=1 ( 2n +



2 n 2 (ai /√∑j=1 aj )

ai2 ) 2 ∑nj=1 aj2

=1

Chapter 3

� 227

Multiplication of both sides by √∑ni=1 ai2 /√n yields the desired inequality. 1 is monotonically decreasing for x xs ∞ 1 1 1 ≤ and ∑i≥1 is ≤ 1 + ∫1 x s dx < ∞. Now let s = 1 and k ≥ 1, i. e., is k k the harmonic series. Then ∑2i=2k−1 +1 i−1 ≥ ∑2i=2k−1 +1 2−k = 1/2. Thus, for all n n ∑2i=1 1i > n/2.

3.4. Let s > 1. The function x 󳨃→ i ∫i−1 x1s dx

> 0. Hence, we consider ≥ 1 we have

3.5. We will show n ln n−n ≤ n t(n) ≤ n ln n+n. In the sum ∑ni=1 t(i), each number k counts as a divisor exactly once at the numbers k, 2k, . . . , ⌊ kn ⌋k. This yields n t(n) = ∑nk=1 ⌊ kn ⌋, which finally leads to the estimates n

n n n 1 n 1 n ∑ ⌊ ⌋ ≤ ∑ ≤ n ∑ ≤ n(1 + ∫ dx) ≤ n + n ln n k k k x k=1 k=1 k=1 n

n

n

1

n n 1 ∑ ⌊ ⌋ ≥ ∑ ( − 1) ≥ −n + n ∫ dx ≥ −n + n ln n k k x k=1 k=1 1

3.6. According to Equation (3.9), we have π(n) ≤ log3n n for almost all n. Hence, even the 2 average distance between prime numbers has to increase. An elementary solution for this exercise was already known to Euclid. The n − 1 numbers n! + 2, n! + 3, . . . , n! + n are all composite because n! + i for 2 ≤ i ≤ n is divisible by i. If pi is the largest prime number satisfying pi < n! + 2, then pi+1 > n! + n follows, and thus pi+1 − pi ≥ n. 3.7(a) For every sufficiently large number m, we have π(m) ≤ 3m/ log m by Equation (3.9). We define m = pn and obtain pn ≥ 31 n log pn . Now the claim follows from pn ≥ n. 3.7(b) (1) Elementary solution. Suppose the series is convergent. Then for a sufficiently large index k, we obtain ∑i≥k p1 ≤ 21 . For n ∈ ℕ, let Mn be the set of numbers from i {1, . . . , n}, such that all prime factors are smaller than pk . Each number x ∈ Mn has a unique representation as x = rs2 with a square-free factor r. There are only a constant number of possibilities for r, and s satisfies s ≤ √n. Therefore, |Mn | ∈ 𝒪(√n). For each i ≥ 1, there are at most n/pi numbers in {1, . . . , n} that are divisible by pi , because only each pi th number is divisible by pi . This yields n≤∑ i≥k

n + 𝒪(√n) ≤ n/2 + 𝒪(√n) pi

which is a contradiction. (2) Solution using prime number density. Let p denote a prime number, here. According to Theorem 3.6, between n and 2n there are already Θ(n/ log n) prime numbers. Therefore ∑2k