Conversational Problem Solving 2020000542, 9781470456351, 9781470456948

This book features mathematical problems and results that would be of interest to all mathematicians, but especially und

213 15 2MB

English Pages 189 [190] Year 2020

Report DMCA / Copyright


Table of contents :
Chapter 1. The First Day
Chapter 2. Polynomials
Chapter 3. Base Mathematics
Chapter 4. A Mysterious Visitor
Chapter 5. Set Theory
Chapter 6. Two Triangles
Chapter 7. Independence Day
Chapter 8. Independence Aftermath
Chapter 9. Amanda
Chapter 10. An Aesthetical Error
Chapter 11. Miraculous Cancellation
Chapter 12. Probability Theory
Chapter 13. Geometry
Chapter 14. Hodgepodge
Chapter 15. Self-referential Mathematics
Chapter 16. All Good Things Must Come to an End
1. First Day
2. Polynomials
3. Base Mathematics
4. A Mysterious Visitor
5. Set Theory
6. Two Triangles
7. Independence Day
8. Independence Aftermath
9. Amanda
10. An Aesthetical Error
11. Miraculous Cancellation
12. Probability Theory
13. Geometry
14. Hodgepodge
15. Self-Referential Mathematics
16. All Good Things Must Come to an End
Recommend Papers

Conversational Problem Solving
 2020000542, 9781470456351, 9781470456948

  • 0 0 0
  • Like this paper and download? You can publish your own PDF file online for free in a few minutes! Sign Up
File loading please wait...
Citation preview

Conversational Problem Solving

Richard P. Stanley

Conversational Problem Solving

Conversational Problem Solving

Richard P. Stanley

2010 Mathematics Subject Classification. Primary 00A07, 00A08, 00A09.

For additional information and updates on this book, visit

Library of Congress Cataloging-in-Publication Data Names: Stanley, Richard P., 1944- author. Title: Problem solving camp / Richard P. Stanley. Description: Providence, Rhode Island : American Mathematical Society, [2020] | Includes bibliographical references and index. Identifiers: LCCN 2020000542 | ISBN 9781470456351 (paperback) | ISBN 9781470456948 (ebook) Subjects: LCSH: Problem solving. | Mathematics–Problems, exercises, etc. | AMS: General – General and miscellaneous specific topics – Problem books. | General – General and miscellaneous specific topics – Recreational mathematics [See also 97A20]. | General – General and miscellaneous specific topics – Popularization of mathematics. Classification: LCC QA63 .S73 2020 | DDC 510.76–dc23 LC record available at

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for permission to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For more information, please visit Send requests for translation rights and licensed reprints to [email protected]. c 2020 by the author. All rights reserved.  Printed in the United States of America. ∞ The paper used in this book is acid-free and falls within the guidelines 

established to ensure permanence and durability. Visit the AMS home page at 10 9 8 7 6 5 4 3 2 1

25 24 23 22 21 20

to Atsuko and my former students in 18.S34 and 18.A34

Contents Preface


Chapter 1. The First Day


Chapter 2. Polynomials


Chapter 3. Base Mathematics


Chapter 4. A Mysterious Visitor


Chapter 5. Set Theory


Chapter 6. Two Triangles


Chapter 7. Independence Day


Chapter 8. Independence Aftermath


Chapter 9. Amanda


Chapter 10. An Aesthetical Error


Chapter 11. Miraculous Cancellation


Chapter 12. Probability Theory


Chapter 13. Geometry


Chapter 14. Hodgepodge


Chapter 15. Self-referential Mathematics


Chapter 16. All Good Things Must Come to an End


Notes 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.

159 159 159 160 160 161 161 162 162 162 163

First Day Polynomials Base Mathematics A Mysterious Visitor Set Theory Two Triangles Independence Day Independence Aftermath Amanda An Aesthetical Error vii



11. 12. 13. 14. 15. 16.

Miraculous Cancellation Probability Theory Geometry Hodgepodge Self-Referential Mathematics All Good Things Must Come to an End

163 164 164 165 165 166





Preface The original inspiration for this book was a series of bridge (the card game) books by David Bird. These books feature a mixture of amusing characters (such as a pompous abbot, a clever parrot, and Robin Hood) and interesting bridge hands. Could something similar be done for mathematics? If so, what should be the subject matter? My answer to this question is the present book. For many years at M.I.T. I taught or co-taught the freshman seminar 18.S34, later called 18.A34, on Mathematical Problem Solving, focused on the Putnam Competition. During the course of teaching this seminar I accumulated lots of interesting problems and mathematical facts, as well as some dumb jokes. This provided most of the material herein. The intended audience for this book is mathematicians at all levels beginning with undergraduates, and even high school students, who are adept at solving challenging problems such as those appearing on the IMO or Putnam Competition. The primary purpose of the problems in this book is not didactic, but rather to entertain. Nevertheless I hope that at least some readers will learn some interesting and useful mathematics. For readers primarily interested in the mathematical content of this book, I have included a List of Problems preceding each chapter. I am grateful to Kevin Carde for his careful reading of an earlier manuscript of this book. I also wish to thank Federico Ardila, an anonymous reviewer, and my editor Sergei Gelfand for their helpful suggestions. Note to reader. I have taken some liberties with the use of direct quotations, i.e., material in quotation marks, for the sake of readability. For example, when Professor Blakeley says “I will denote it by S,” it should be understood that he has written the symbol S (or some blackboard equivalent) on the blackboard. Similarly, if for instance someone refers to a numbered equation in the text, one should understand that they have done something like point to this equation on the board.



The First Day PROBLEM LIST Cutting a 3 × 3 × 3 cube into unit cubes Cutting an n1 × ⋯ × nd brick into unit cubes Two players cutting an m × n chocolate bar impartially (impartial cutcake) Alice cutting vertically and Bob cutting horizontally (cutcake) Two persons dividing a cake fairly Any finite number of persons dividing a cake fairly


3 3 4 4 7 7



Professor Mortimer Ignatius Blakeley walked into Room C-122 on the first day of Problem Solving Camp. Eight eager but somewhat apprehensive aspiring mathematicians were seated at their desks. Professor Blakeley already knew quite a bit about these students from having read their files during the selection process and from the reception the previous night. They had been chosen from a highly talented group of undergraduate math majors throughout the USA, and now the time had finally come to see how they would do! The Problem Solving Camp students were the following: ● Emiliano Alvarez, whose parents moved to the Boston area from Ecuador shortly before Emiliano was born. ● Daniel Arkin, from the Milwaukee area. ● Sandra Billingsley, a resident of Cincinnati. ● Sam Dalton, from Columbia, Missouri. ● Clayton Martin, a black student from Tucson. ● Patrick O’Connell, a resident of Ireland who came to the USA for his undergraduate education. ● Jung Wook Park, whose home was in Seoul, Korea. He also came to the USA for the first time to begin his undergraduate studies. ● Fumei Yang, a student who came with her family from Guizhou Province in China to a suburb of Denver when she was 12. “Welcome to Problem Solving Camp!” said Professor Blakeley. “I am sure you will find this a great opportunity to learn new mathematics and to enjoy working on some challenging problems. You are already familiar with the course mechanics. To quickly review, in the mornings we will discuss mathematics and work on some new problems. In the afternoon session I’ll pass out some problems related to the morning’s discussion, and we can work on them with the help of a couple of course assistants. We can also discuss progress on earlier problems that have not yet been solved. Now let us get down to serious business! “First some basic notation that will be used throughout the course. You are all familiar with the notation Z, Q, R, C for the integers, rationals, real numbers, and complex numbers. By the way, why is the letter Z used for the integers? Why not I?” “Zahl,” said Emiliano quickly. “Exactly!” said Professor Blakeley. “Zahl is the German word for ‘number’ and Zahlen for ‘numbers.’ And who first used the letter Z, in whatever font, to denote the integers?” No one replied. “Sad . . . , such a basic question. I believe it was first used by Nicolas Bourbaki in the 1930’s. I suppose you all know that Nicolas Bourbaki is a pseudonym for a group of mostly French mathematicians that began around 1935 trying to reformulate much of math in an extremely abstract and formal manner. “Even if you don’t know the history of the notation Z, at least you should know how to write it. It should be written in two strokes, like this:



“Whatever you do, don’t make the mistake of beginning . You will then get the ugly, lopsided .” The students were beginning to get the idea that Professor Blakeley’s lecturing style was a little eccentric. “Well, let us move on to a little more notation. I use N to denote the set of nonnegative integers and P for the positive integers. Sometimes N is used to denote the natural numbers, but there doesn’t seem to be universal agreement on whether 0 is a natural number. Therefore you should think N for ‘nonnegative’ and P for ‘positive.’ Finally, when n ∈ N I will write [n] = {1, 2, . . . , n}, so in particular [0] = ∅. Where did the symbol ∅ come from?” No one volunteered, so Professor Blakeley continued, “It is a letter from the Norwegian alphabet that was suggested by Andr´e Weil. It has nothing to do with the Greek letter φ. “Finally the time has arrived to do some mathematics! We should start out with a problem with important real-world applications. And what could be more important than the question of how to cut cheese? Therefore suppose that we want to cut a cube of cheese that is three inches on each side into 27 1 × 1 × 1 inch cubes. You are allowed to rearrange the pieces after each cut and to cut more than one piece at once, but each cut must be a single planar slice. What is the least number of cuts necessary? Please keep quiet if you have seen this problem before.” “The number of pieces can at most double with each cut. Since 24 < 27 it will take at least five cuts. It shouldn’t be hard to rearrange the pieces to achieve this,” said Daniel. Professor Blakeley knew from Daniel’s application that he was very clever but somewhat impulsive. “Wait a second,” said Fumei. “The first cut is really lopsided. No matter how you cut, you’ll have one piece with nine squares and one with 18 . . . .” Several students were about to speak, but Fumei continued. “In fact, since 24 < 18 you need at least five more cuts, and obviously you can do it in six cuts.” Professor Blakeley was impressed. Although Fumei’s solution was not the most elegant, it was a good argument that had a lot of potential for generalization. “Very good!” he said. “Does anyone who has seen the problem before know the most elegant solution?” Jung Wook Park spoke up. “The center cube has six sides. Each needs a separate cut.” “Exactly,” said Professor Blakeley. “Very clever, but perhaps we can get further with Daniel’s and Fumei’s approach. What about the general case: an n1 ×n2 ×⋯×nd ‘brick’ in d-dimensional space. The math may be interesting, though we lose the applicability to cheese cutting.” Clayton said, “Not necessarily. I’ve been wondering how to cut a fourdimensional block of cheese that popped out of one of my Klein bottles.” “I stand corrected.” Just from talking with Clayton for a few minutes at the reception, Professor Blakeley knew that he had a well-developed sense of humor. “But we can still think about how to cut an n1 × n2 × ⋯ × nd brick. Let’s mull over this, and please raise your hand if you think you have a solution or just a significant insight.” Professor Blakeley patiently waited while the students worked furiously. After only a couple of minutes the hand of Sandra went up, soon followed by Jung Wook and Daniel.



Amazing, thought Professor Blakeley. “Well, Sandra, since you raised your hand first, let’s hear your thoughts. Please come to the board and explain.” Sandra walked up to the front of the classroom and explained with the help of the chalkboard, “It requires ⌈log2 (ni )⌉ cuts to break up a line of ni bricks in the ith coordinate direction. Since we can cut up only one direction at a time for each piece, a lower bound on the number of cuts is d


∑⌈log2 (ni )⌉. i=1

It seems pretty clear this should be possible to obtain just by doing the strategy for d = 1 one direction at a time.” “Excellent!” said Professor Blakeley. “Indeed, it’s not much work to make your argument completely rigorous. This is really the way to look at this problem, and not by some special trick that works only in a handful of cases. “There are some different kinds of problems related to cake cutting. Let’s start with a certain mathematical game played between two players, Alice and Bob (the traditional names for the players of two-player math games). This time we are cutting chocolate bars rather than cheese. Suppose we have such a bar divided by indentations into an m × n array of squares. Alice begins and (unless m = n = 1) breaks the bar into two pieces by cutting along one of the grid lines. The players take turns choosing one of the pieces and cutting it into two along one of the grid lines, if this piece is not just a 1 × 1 square. The first player unable to move loses. This situation will occur when we reach mn 1 × 1 squares. For each value of m and n, who will win under optimal play? What is the correct strategy for the winner? Again raise your hand if you see the answer.” This time hands went up very quickly. Professor Blakeley pointed to Daniel, implicitly asking him to speak. Daniel said, “The number of pieces increases by one after each turn. Therefore the game will end after mn − 1 turns. Thus Alice wins if and only if mn is even. It doesn’t matter how she plays.” “Good, good! Probably many of you would get bored very quickly playing this game. Not only is the winner independent of how either player plays, but also the number of moves. The game is called impartial cutcake, ‘impartial’ because at any stage of the game, the moves available to each player are the same. “Let’s try to make things more interesting. Rather than have the same moves available for each player, let each cut be available to just one of the players. To be specific, regard our m × n chocolate bar to be an array with m rows and n columns. Alice can only make vertical cuts separating two adjacent columns (so on her first turn she has n − 1 choices), and similarly Bob for rows. Alice goes first as before, and the first player unable to move loses. This game is called simply cutcake. Who wins with optimal play? Raise your hand if you have an idea.” After a short while Emiliano raised his hand. After Professor Blakeley nodded he said, “If n > m, say, then Alice has more available moves. It seems plausible that Alice will always win.” Fumei, who had been furiously computing some small cases, said, “That doesn’t seem correct. For the 3×2 case, Alice will lose whether or not she has the horizontal or vertical cuts.”



Emiliano replied, “I see . . . . But it must be the case that if n is sufficiently larger than m, then Alice will win if she has the vertical cuts.” Meanwhile Patrick had been writing some code on his PC. Professor Blakeley knew that he was a computer whiz. Patrick said, “I’ve checked that Alice wins on a 1 × n board for any n ≥ 2 (trivial), on a 2 × n board for n ≥ 4, and on a 3 × n board for n ≥ 8. I’m using the obvious fact that if Alice wins on m × n, then she also wins on m × (n + 1).” “Great!” said Professor Blakeley. “That does suggest an obvious conjecture. While you’re at it, consider the generalization where we begin with finitely many chocolate bars of various sizes. Let’s take a short break before getting back to these questions.” Professor Blakeley was pleased to see that the students spent most of their break time thinking about chocolate bars. A couple of students were even eating them. What better way to spend a break from mathematics than doing mathematics? Also during the break Clayton approached Professor Blakeley with scissors and a piece of paper divided into a 3 × 5 grid of squares. Clayton challenged the Professor to a game of impartial cutcake and said that the Professor could move first. “That’s very generous of you, Clayton,” said Professor Blakeley, “but I don’t have much of a talent for mathematical games, so you should find a stronger opponent.” After the break Professor Blakeley said, “Let me first say some general words concerning games like cutcake. I will speak somewhat informally and not give precise definitions. A game is called partizan if it is always disadvantageous to move. It is not hard to see that cutcake is a partizan game. Any cut a player makes to a rectangle R simply removes that option for him- or herself while doubling the number of options for the other player on the two pieces into which R has been cut. It turns out that for any partizan game G we can assign a real number ν(G) called the value of the game. We should think that ν(G) represents the number of moves that Alice is ahead of Bob. For instance, a 1 × n chocolate bar R has value ν(R) = n − 1. If R is part of a larger collection of chocolate bars of various sizes, then Alice has n − 1 possible moves which she can take on R at her leisure, and Bob can do nothing to disrupt these n − 1 moves. Similarly, an m × 1 rectangle has value −(m − 1), since now Alice is m − 1 moves behind. A 1 × 1 rectangle has value 0, meaning that mover loses. If we have a set S = {R1 , . . . , Rk } of disjoint chocolate bars, then a move on S consists of cutting some Ri in an allowed way. We can denote this by writing S = R1 + ⋯ + Rk . It seems very plausible and can be made completely rigorous that ν(S) = ν(R1 ) + ⋯ + ν(Rk ). In particular, ν(S) = 0 if and only if mover loses (with optimal play). This is quite clear when all the Ri ’s are of sizes mj × 1 and 1 × nj . The player’s moves are completely independent, so it is just a question of counting how many moves each player has. One way to show that a complicated position S has a certain value v is to take a rectangle R that is known to have value −v and then to show that mover loses in the game S + R. Thus to understand cutcake completely, it suffices to determine the value ν(R) of any rectangle. “Let us write for instance 2Ra,b + 3Rc,d to be the game with two a × b rectangles and three c × d rectangles. A tedious analysis of all possibilities shows that if we begin with R3,8 + R4,1 , then mover loses. Since ν(R4,1 ) = −3 it follows that R3,8 = 3. Now one can also show that R5,9 = 1. Note that in general ν(Ra,b ) = −ν(Rb,a ), since



Figure 1.1. Cutcake values on Ra,b + Rb,a the second player can always win by an obvious mimicking strategy. Thus we see that R3,8 + 3R9,5 = 0, i.e., mover loses on R3,8 + 3R9,5 . I hope this simple example illustrates sufficiently the power of the value function ν. “So the problem we need to solve is to give a simple formula for ν(Ri,j ), or at least a simple method to compute this number. Any ideas?” Fumei said, “It would help to see some data so I can get some feeling for what is going on.” “Exactly!” said Professor Blakeley. “Unless you have unusual insight and see what to do immediately, naturally you want to see some data. In fact, for this problem a sufficient amount of data leads to an obvious conjecture. One way of saying the rule is that if m ≤ n and m has k + 1 binary digits, then ν(Rm,n ) = v if and only if (v + 1)2k ≤ n ≤ (v + 2)2n − 1. In particular, ν(Rm,n ) = 0 if and only if m and n have the same number of binary digits.” “Here is a table of ν(Rm,n ) for m, n ≤ 15,” continued Professor Blakeley as he displayed Figure 1.1 on the screen. “Once the value of ν(Rm,n ) is guessed, its validity is straightforward to prove by induction. I’ll let you figure this out by yourself. “Cutcake happens to be an especially simple partizan game because the value of every position is an integer. In more complicated partizan games we can get fractional values. For nonpartizan games the situation becomes much more complicated. I don’t need to discuss this with you, because all this and much more is



thoroughly covered in the fantastic book Winning Ways by Berlekamp, Conway, and Guy.1 If you like this kind of mathematical game theory, then be sure to take a look at this book. “I would be remiss if I didn’t at least mention the theory of ‘fair cake-cutting,’ although we don’t have time for proofs. I’m sure you’re familiar with the procedure of dividing some object like a piece of cake among two persons. The two persons may have different ‘valuation functions’ which tell them how much they like each piece of the cake. The goal is to share the cake so each person thinks they got at least their fair share (called proportionality) and neither person thinks the other got more than their fair share (called envy-freeness). It is easy to see that envy-freeness implies proportionality. The well-known method is for one person to divide the cake into two pieces which he or she regards of equal value, and for the second person to choose one of the pieces. We have to put some reasonable assumptions on the valuation functions in order for this procedure to work. We can regard a valuation as a finitely additive nonatomic probability distribution V . Thus the value of the entire cake is 1, each individual point has value 0, and if A, B are disjoint pieces of cake, then V (A ∪ B) = V (A) + V (B). “For three persons, an envy-free algorithm was devised in 1960 by John Selfridge, later independently found by Conway. It requires as many as five slices. In 1995 Steven Brams and Alan Taylor found a finite procedure that works for any number n of persons, but the number of slices for sufficiently large fixed n is an unbounded function of the valuations. Finally in 2018 Haris Aziz and Alan Mackenzie described an envy-free algorithm that has a bounded number of slices for fixed n. If f (n) is the least number of slices which suffices for any n valuations, then it is known that f (n) ≥ cn2 , while the upper bound is an exponential tower of six n’s. Quite a difference! “Let’s see if we can figure out an algorithm for eight persons.” Professor Blakeley then reached into his backpack and pulled out a cupcake wrapped in cling wrap and a plastic knife. He unwrapped the cupcake and said, “I am happy to have all of you share this delicious cupcake, but let’s first divide it fairly into eight pieces. Who would like to begin?” None of the students were able to think of an algorithm that could conceivably take an exponential tower of six 8’s slices. They had their doubts about the prac8 ticality of such an algorithm, even if they could discover it. Even 88 slices would far exceed the number of atoms in the cupcake. “Not to worry,” said Professor Blakeley, “there is a simple solution!” He reached into his backpack and pulled out seven more cupcakes. He then gave a cupcake to each student. “But Professor Blakeley,” said Sandra, “it doesn’t seem fair that you are getting nothing.” “A very good point!” replied the Professor. Once more he reached into his backpack, pulled out a cupcake about 50% larger than any of the others, unwrapped it, and took a big bite. The other students also began eating their cupcakes, and thus ended the morning class.

1 According to [101], Richard Guy was for several years the world’s oldest living mathematician, until his death on March 9, 2020. He was born on September 30, 1916.


Polynomials PROBLEM LIST Condition for ax + b to have a zero in K Condition for ax2 + bx + c to have a zero in K Condition for ax3 + bx2 + cx + d to have only real zeros Necessary condition for a polynomial to have only real zeros Sufficiency of the previous condition Number of terms in the previous condition Number of terms in a discriminant Discriminant of xn + ax + b Discriminant of xn + axk + b n Discriminant of ∑ni=0 xn! i )x Discriminant of ∑ni=0 (n+α n−i i! Real zeros and Toeplitz matrices Newton’s necessary condition for real zeros Product of log-concave polynomials is log-concave k k ∑j=0 (−1)k−j ( j )(x + j)n has positive coefficients Zeros of f (x + 1) + uf (x), where ∣u∣ = 1


10 10 10 12 12 13 13 13 14 14 15 15 16 16 17 18



Professor Blakeley arrived the next morning in a highly frenetic mood. “I did it! I did it! I out-Eulered Euler!” He paused a short time to increase the suspense and then continued, “I have just discovered a fantastic variant of Euler’s famous formula eπi + 1 = 0, which involves the five fundamental constants 0, 1, i, e, and π. My formula also includes Euler’s constant n

1 − log n) = 0.5772156649 . . . . i=1 i

γ = lim (∑ n→∞

“You have the privilege of being present at the unveiling of the greatest formula in all of mathematics: e0πγi = 1.” The students were speechless in the presence of this √unexpected result. Finally Clayton said, “I think I see a way to incorporate also 2 into your formula.” “Amazing!” said Professor Blakeley. “We can discuss this after class. Meanwhile, let’s get going with today’s topic, namely, polynomials. There’s lots of interesting stuff to say about these creatures. In fact, the entire subject of algebraic geometry is about the zeros of polynomials. Regrettably I don’t have time today to cover everything that’s known about algebraic geometry, so I will sample from a smaller set. We will look at some properties of polynomials that make interesting problems or are entertaining stand-alone facts. “We will consider only polynomials in one variable over a field K. The set of such polynomials is denoted K[x], as you all know. By a zero of a polynomial f (x) ∈ K[x], I mean an element α of some extension field of K for which f (α) = 0. Often the word ‘root’ is used as a synonym for ‘zero,’ but if you want to be precise, then polynomials f (x) have zeros, while polynomial equations f (x) = 0 have roots. “We can start with polynomials of degree one, say f (x) = ax + b, where a ≠ 0. Does anyone know a necessary and sufficient condition for such a polynomial to have a zero in K?” No one bothered to answer this rhetorical question, so Professor Blakeley continued, “Okay, let’s go to quadratics. Let f (x) = ax2 + bx + c ∈ K[x] with a ≠ 0. Is there a nice condition for f (x) to have two zeros (always counted with multiplicity) in K?” “b2 − 4ac should be a square in K,” suggested Emiliano. “Wrong!” said Professor Blakeley. Wrong? Everyone knew the quadratic formula. What could be wrong? “Remember, K can be any field,” said Professor Blakeley. Emiliano saw the light. “Oops, I should have said that the characteristic of K isn’t 2.” “Now you’re cooking! So over R this becomes the familiar condition b2 −4ac ≥ 0. What about cubic polynomials (2.1)

f (x) = ax3 + bx2 + cx + d ∈ R[x],

where a ≠ 0? Is there an analogue of the discriminant condition b2 − 4ac ≥ 0 for all the zeros to be real?” The students had not heard of such a condition for cubics. Most knew that the discriminant Δ(f ) of a monic cubic polynomial (2.1) with zeros α1 , α2 , α3 in some



extension field is defined as Δ(f ) =

2 ∏ (αi − αj ) . 1≤i with ≥ in the two conditions. I should mention that you have to be a little careful here, since it is false that a real symmetric matrix is positive semidefinite (i.e., all eigenvalues are nonnegative) if all the leading principal minors are nonnegative. An 0 0 example is [ ]. 0 −1 “Exactly analogous reasoning applies to real polynomials f (x) of degree n. We obtain n − 1 polynomial inequalities among the coefficients that are necessary for having only real zeros. Namely, suppose the zeros are α1 , . . . , αn and let n

pj = ∑ αij . i=1

Suppose that the zeros are all distinct. We get that if f (x) has only real zeros, then the n × n matrix ⎡ n p2 ⋯ pn−1 ⎤ p1 ⎥ ⎢ ⎢ p p p ⋯ pn ⎥ 2 3 ⎥ ⎢ 1 ⎥ ⎢ p3 p4 ⋯ pn+1 ⎥ (2.4) V t V = ⎢ p2 ⎥ ⎢ ⎥ ⎢ ⋮ ⎥ ⎢ ⎢ pn−1 pn pn+1 ⋯ p2n−2 ⎥ ⎦ ⎣ is positive definite. Equivalently, its principal leading minors are positive. Since the first leading principal minor is just n, we get n − 1 inequalities in all. “It turns out that this necessary condition for real zeros is also sufficient. Let me give the elegant proof. It requires a knowledge of the basic result about real quadratic forms, which I now state. Let A = [aij ]n−1 i,j=0 be a real nonsingular symmetric n × n matrix. The quadratic form associated with A is the polynomial n−1

SA (x) = ∑ aij xi xj . i,j=0

We can then write SA (x) in the form (2.5)





SA (x) = ∑ Zk2 − ∑ Zk2 ,

where each Zk is a (nonzero) real linear form, i.e., n

Zk = ∑ bhk xh , bhk ∈ R. h=1

This representation is not unique, but the integer q is unique and is equal to the number of positive eigenvalues of A.



“Now let A be the matrix V t V of equation (2.4), coming from the real polynomial f (x) = c(x − α1 )⋯(x − αn ), where α1 , . . . , αn can be any distinct complex numbers. We compute that SA (x) = =


i+j i+j ∑ (α1 + ⋯ + αn ) xi xj

i,j=0 n


2 n−1 ∑ (x0 + αk x1 + αk x2 + ⋯ + αk xn−1 ) .


x0 + αk x1 + αk2 x2 + ⋯ + αkn−1 xn−1 .

If αk ∈ R, then Zk2 contributes a positive Let Zk = term to the sum in equation (2.5). Otherwise let an overhead bar denote complex ¯ k yields conjugation. The pair αk , α Zk = Pk + iQk , Z¯k = Pk − iQk , where Pk and Qk are real linear forms. Note that Zk2 + Z¯k2 = 2Pk2 − 2Q2k , thus contributing one positive and one negative term to (2.5). Hence the number of positive eigenvalues of A is q + 12 (n − q) = 12 (n + q). In particular, A is positive definite (equivalently, all eigenvalues are positive) if and only if q = n, and the proof follows for distinct zeros. By a continuity argument or a slightly more complicated version of the previous proof, we get the generalization to any zeros.” Even the students who were not familiar with real quadratic forms admired this application of them. They promised themselves to learn about this subject as quickly as possible. Professor Blakeley continued, “Here are the first two inequalities for a monic polynomial xn + axn−1 + bxn−2 + cxn−3 + dxn−4 + ⋯ of deg n: (n − 1)a2 − 2nb ≥ 0, −2(n − 1)a3 c + (n − 2)a2b2 − 4(n − 1)a2 d + 2(5n − 6)abc − 4(n − 2)b3 + 8nbf − 9nc2 ≥ 0. The next one has 34 terms, and the next after that 204 terms, and then 1409 terms. I wonder if there is a nice way to determine the number of terms? So far as I know, no one has looked at this problem, but it could be difficult. It is similar to finding the number of terms in the discriminant of a generic polynomial of degree n, which is an open problem. This sequence begins (starting at n = 1) 1, 2, 5, 16, 59, 246, 1103, 5247. “Going back to the matrix V , note that since V is a Vandermonde matrix, we have that det A = det V t V = Δ(f ), the discriminant of f . Let’s see if you can compute some discriminants, other than when f is quadratic. If a and b are complex numbers and n ≥ 2, then who knows how to compute Δ(xn + ax + b)?” Professor Blakeley was slightly surprised when no one volunteered. “Here is a hint,” he said, “note that if f (x) = ∏ni=1 (x − αi ), then n

1 .” i=1 x − αi

f ′ (x) = f (x) ∑

I wonder if that will be enough of a hint, he thought, but within a few minutes Daniel, Patrick, and Sandra had their hands in the air. The Professor selected Sandra.



“I’ll have to do the details of the computation on the fly,” she said, “but the basic idea should work. Using Professor Blakeley’s hint we have f ′ (αj ) = ∏(αj − αi ). i≠j

Since Δ(f ) = ∏i≠j (αi − αj ) , we have 2


Δ(f ) = (−1)( 2 ) f ′ (α1 )⋯f ′ (αn ).


Now if f (x) = xn + ax + b, then f ′ (x) = nxn−1 + a. Hence αj f ′ (αj ) = = =

nαjn + aαj n(−aαj − b) + aαj −(n − 1)aαj − bn.

Plugging into equation (2.6) and using α1 α2 ⋯αn = (−1)n b gives n

Δ(f ) =

(−1)( 2 ) n ∏(−bn − (n − 1)aαj ) α1 ⋯αn j=1 n


n (−1)( 2 )+n bn (n − 1)n an ∏ (− − αj ) b (n − 1)a j=1


(−1)( 2 )+n (n − 1)n an [(



−bn bn ) − + b] (n − 1)a n−1 n

(n )+n 2

−bn b ) − ] (n − 1) a [( (n − 1)a n−1




(−1)( 2 ) nn bn−1 + (−1)(

n n


n−1 ) 2

(n − 1)n−1 an .”

Sandra received some help from the other students as she performed this computation, especially since it was easy to make a sign error, but she felt very pleased at figuring this out. Professor Blakeley agreed. “Fantastic! There is actually a more general formula for Δ(xn + axk + b) for any 0 < k < n, namely d


Δ(xn + axk + b) = (−1)( 2 ) bk−1 [nN bN −k − (−1)N (n − k)N −K kK aN ] , where d = gcd(n, k), N = n/d, and K = k/d. The proof is similar, though more complicated. “Let’s try one more. It shouldn’t be so difficult if you followed Sandra’s proof. For n ≥ 1 let n xn . fn (x) = ∑ i=0 n! What is Δ(fn )? Here the polynomial is not monic. By definition, if f (x) = c ∏ni=1 (x − αi ), then Δ(f ) ∶= c2n−2 ∏(αj − αi )2 . i n. The matrix A(f ) is an example of a Toeplitz matrix, i.e., a matrix with constant diagonals (parallel to the main diagonal). The remarkable result of Michael Aissen, Isaac Schoenberg, and Anne Whitney is that all the zeros of f (x) are real if and only if every minor of A(f ) is nonnegative. I said that this result is useless for specific examples because even for quadratic polynomials ax2 + bx + c it is necessary to check infinitely many minors; there is no single minor equal to a positive multiple of b2 − 4ac. Things really get interesting when we look at power series rather than polynomials, but this topic is beyond the scope of this camp. It is part of the fascinating theory of total positivity. “A useful necessary condition for only real zeros is due to Isaac Newton, though this is not his best-known result. Write a real polynomial of degree n in the form f (x) = ∑ni=0 (ni)ai xi . Newton’s theorem says that if all the zeros of f (x) are real, then a2i ≥ ai−1 ai+1 for 1 ≤ i ≤ n − 1. In other words, the coefficients a0 , a1 , . . . , an are logarithmically concave, or log concave (sometimes written log-concave) for short.



For instance, equation (2.3) is a special case. How hard can it be to prove a result from the 1600’s? Does anyone know how to do this?” A few students had seen the result before. One needs to use the fact that if all the zeros of a real polynomial f (x) are real, then the same is true for its derivative f ′ (x). This is an immediate consequence of Rolle’s theorem, which implies that there is a zero of f ′ (x) in between two zeros of f (x), together with the limiting case which states that if f (x) has a zero of order k at x = α, then f ′ (x) has a zero of order k − 1 at x = α. Professor Blakeley let Daniel continue the proof. He said, “By the Fundamental di−1 Theorem of Algebra, f (x) has n zeros, including multiplicity. Let Q(x) = dx i−1 f (x). Thus Q(x) is a polynomial of degree n − i + 1 with only real zeros. Let R(x) = xn−i+1 Q(1/x), a polynomial of degree at most n − i + 1. The zeros of R(x) are just reciprocals of those zeros of Q(x) not equal to 0, with possible new zeros at 0. At dn−i−1 any rate, all zeros of R(x) are real. Now let S(x) = dx n−i−1 R(x), a polynomial of degree at most two. Then every zero of S(x) is real. An explicit computation yields S(x) = cn (ai−1 x2 + 2ai x + ai+1 ) for some nonzero constant cn which I don’t remember offhand. If ai−1 = 0, then trivially a2i ≥ ai−1 ai+1 . Otherwise S(x) is a quadratic polynomial. Since it has real zeros, its discriminant Δ is nonnegative. But Δ = (2ai )2 − 4ai−1 ai+1 = 4(a2i − ai−1 ai+1 ) ≥ 0, so the sequence a0 , a1 , . . . , an is log-concave as claimed.” “So elegant!” said Professor Blakeley. “By the way cn = n!/2, though this is irrelevant. “This reminds me of a cute argument involving log-concavity. Suppose that A(x) = ∑i ai xi and B(x) = ∑i bi xi are polynomials with positive log-concave coefficients, so a2i ≥ ai−1 ai+1 and similarly for bi . How can we show that the product A(x)B(x) also has log-concave coefficients?” The students began computing furiously. It looked like it was going to be messy, but not hopeless, to give a direct proof. But what was the “cute argument” that Professor Blakeley had in mind? “Here is a little hint,” said Professor Blakeley. “What is the determinant of the following matrix: ai+1 a ]? ” [ i ai−1 ai So the cute argument involved linear algebra, but it still wasn’t clear how to bring it into the picture. “Okay, another hint. Given a polynomial f (x) = ∑ ai xi , consider the infinite matrix from equation (2.9) (or we could take a sufficiently large finite matrix) ⎡ a0 a1 a2 a3 ⋯ ⎤ ⎢ ⎥ ⎢ 0 a a a ⋯ ⎥ 0 1 2 ⎢ ⎥ ⎢ ⎥ A(f ) = ⎢ 0 0 a0 a1 ⋯ ⎥ . ⎢ ⎥ ⎢ 0 0 0 a0 ⋯ ⎥ ⎢ ⎥ ⎢ ⎥ ⋮ ⎣ ⎦ What can you say about A(f )A(g)?” With this very strong hint, it was soon clear that A(f )A(g) = A(f g). In fact, the map f (x) ↦ A(f ) is an isomorphism from the polynomial ring R[x] to the



set of upper triangular real Toeplitz matrices [aj−i ]i,j≥0 with finitely many nonzero elements in each row. But what did this have to do with log-concavity? It didn’t take long for the penny to drop. Fumei was given the task of explaining. “The key is the Cauchy-Binet formula,” she said. “We need to show that every 2 × 2 consecutive submatrix B (that is, the rows and columns of B come from consecutive rows i, i+1 and columns j, j+1 of A(f g)) has a nonnegative determinant. Let A(f )[i, i+1] be the submatrix of A(f ) consisting of rows i and i+1, and similarly A(g)⟨j, j +1⟩ for the columns of A(g). Then B = A(f )[i, i+1]⋅A(g)⟨j, j +1⟩ (matrix multiplication). By the Cauchy-Binet formula, det B = ∑ det C(r, s) ⋅ det D(r, s), r 0 for all i there follows easily that for all i ≤ j and s ≥ 0 we have ai aj ≥ ai−s aj+s . Hence det C(r, s) ≥ 0 and det D(r, s) ≥ 0, and the proof follows.” “Ooh la la,” exclaimed Professor Blakeley, “such a nice argument! Let me remind you of the statement of the Cauchy-Binet formula. Let C be an m × n matrix and D an n × m matrix over a field K. We want to compute det CD. If m > n, then it should be clear by a rank argument that det CD = 0, so assume m ≤ n. If S ⊆ [n], then let C[S] be the submatrix of C consisting of the columns of C indexed by elements of S, and similarly D⟨S⟩ for the rows of D. Then the Cauchy-Binet formula asserts that det CD = ∑ det C[S] ⋅ det D⟨S⟩, S

where S ranges over all m-element subsets of [n]. Thus in the present situation we have C = A(f )[i, i + 1] and D = A(g)⟨j, j + 1⟩, in the notation of Fumei. “By the way, who knows another name for the Cauchy-Binet formula?” There were no volunteers, so Professor Blakeley continued, “It is also known as the Binet-Cauchy formula. “Time for a break! Here is a little problem to think about so you won’t get too bored while I answer some text messages. For positive integers n and k, define (2.10)

k k Pk,n (x) = ∑ (−1)k−j ( )(x + j)n . j j=0

Show that Pk,n (x) has positive coefficients.” The students worked hard during Professor Blakeley’s absence. It wasn’t hard to see that Pn,n (x) = n! and Pk,n (x) = 0 for k > n. Thus it sufficed to consider k < n, though this assumption didn’t seem to help. They fooled around with binomial coefficients for a while. It looked like some combinatorial interpretation of the coefficients might be possible using the Principle of Inclusion-Exclusion, but the computation became messy. Then Daniel got the idea that maybe all the zeros of Pk,n (x) were real and negative. In that case, we would have Pk,n (x) = ∏dj=1 (x + αi ) for αi > 0, so clearly Pk,n (x) would have positive coefficients. However, when they computed numerically the zeros of Pk,n (x) for some small examples, they saw that every zero had real part − k2 < 0. This also sufficed to show that Pk,n (x) had positive



coefficients, because the nonreal zeros came in pairs − k2 ± bi, and k k2 k + b2 , (x − (− + bi)) (x − (− − bi)) = x2 + kx + 2 2 4 which has positive coefficients. At this point Professor Blakeley returned to the classroom. “Any progress?” he asked. The students explained what they had done, and the Professor said, “Great job! You really got to the essence of the problem quickly! So the key question remains: how to show that all the zeros have real part −k/2? “Since time is short, let me give you a little hint. Let f (x) be a complex polynomial such that every zero has the same real part γ. Let u ∈ C with ∣u∣ = 1. What can you say about the zeros of f (x + 1) + uf (x)?” Some quick experimentation by Patrick produced the surprising conjecture that all the zeros of f (x + 1) + uf (x) have real part γ − 12 . How to prove it? Professor Blakeley noticed that Sam was quite agitated. The Professor was pleased that he finally had an excuse to call on Sam, who he knew was quite brilliant but extremely shy. Sam said in a surprisingly assertive voice, “We may assume that f (x) is monic. Then we can write f (x) = ∏ (x − γ − δj i) , δj ∈ R, j

where i2 = −1. If f (w + 1) + uf (w) = 0, then ∣f (w)∣ = ∣f (w + 1)∣. Suppose that w = γ − 12 + η + θi, where η, θ ∈ R. Thus ----- --1 ----- = --- ( 1 + η + (θ − δ )i)----- . (2.11) + η + (θ − δ (− )i) ∏ ∏ j j --- -------- j 2 -- -- j 2 -Note that 2 1 ∣− + η + (θ − δj )i∣ 2 2 1 ∣ + η + (θ − δj )i∣ 2 Hence if η > 0, then

= =

1 − η + η 2 + (θ − δj )2 , 4 1 + η + η 2 + (θ − δj )2 . 4

1 1 ∣− + η + (θ − δj )i∣ < ∣ + η + (θ − δj )i∣ 2 2 for all j, while if η < 0, then the inequality goes the other way. Therefore equation (2.11) holds only if η = 0, and the proof follows.” Although an ordinary conversation was very stressful for Sam, the same could not be said for discussing mathematics. “Awesome,” said Professor Blakeley. “But what does this have to do with the problem of showing that every zero of the polynomial Pk,n (x) of equation (2.10) has real part − k2 ?” There were quite a few volunteers, and Patrick was chosen. “Let E be the operator on polynomials P (x) defined by EP (x) = P (x + 1). Then k k (E − 1)k = ∑ (−1)k−j ( )E j , j j=0



so Pk,n (x) = (E − 1)k xn . Every zero of xn has real part 0. By what we have just shown, each time we apply E − 1 we lower the real part of every zero by 12 . Thus every zero of Pk,n (x) has real part − k2 .” “Yep, that about wraps it up! Nice job, everyone! Well, it looks like time is up for today’s class. I hope you have learned some interesting facts about polynomials, and see you soon at lunch!”


Base Mathematics PROBLEM LIST 10, 11, 12, 13, 14, 15, 16, 17, 20, 22, 24, 31, 100, ?, 10000 1, 1, 1, ?, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, . . . Lucas’s theorem on binomial coefficients mod p Kummer’s theorem on largest power of p dividing (nk) de Polignac’s formula on largest power of p dividing n! 100/9899 and √ Fibonacci numbers 5000 − 700 51 and Catalan numbers ∑n≥0 1/Cn Maclaurin series for (sin−1 x)2 Maclaurin series for cos t (sin−1 x) ∑n≥0 (4 − 3n)/Cn Greedy sequence with no three terms in arithmetic progression Growth rate of “irregular” sequences with no three terms in arithmetic progression Smallest growth rate of sequence with no three terms in arithmetic progression Infinitely many primes without the digit k ∑p 1/p diverges (p prime) Computing the nth binary digit of π Morse-Hedlund sequence is cubefree Infinite squarefree binary sequence Infinite squarefree ternary sequence Fair allocation of gifts to children ai bi Largest power of x − 1 dividing ∑m i=1 (x − x ) Infinite binary sequence with almost every prefix ending in a square Fibonacci word is not eventually periodic Fibonacci word prefixes end in square of length at most five Fibonacci word invariant under 0 → 01 and 1 → 0 Average length of shortest square suffix of a Fibonacci word prefix


22 22 22 23 24 24 24 25 25 25 25 26 26 27 27 28 28 29 29 29 29 30 31 31 31 31 31 31



“As a segue into today’s topic,” said Professor Blakeley as soon as the students were seated, “here is a little puzzle. I’m sure some of you have seen it already. What is the missing number?” 10, 11, 12, 13, 14, 15, 16, 17, 20, 22, 24, 31, 100, ?, 10000 Most of the students had seen this puzzle, but a few had not. After giving them a little time to think about it, Professor Blakeley said, “Here is a hint—the sequence is constant.” This hint seemed only to deepen the mystery for the students not already familiar with the problem, but then Fumei saw the light. “I got it!” she said. “Great!” said Professor Blakeley. “Please explain.” “All the terms are equal to 16,” said Fumei. They are just written in different bases, beginning with base 16, then 15, 14, down to base 2. So the missing number is 16 in base 3, which is 121.” “Okay! So today we will look some more at base mathematics. Not to worry, I mean number bases, like writing integers in base 10. But before we begin, there are a couple of other ‘find the missing term’ puzzles that everyone should know. The first is the following.”

Half the students had seen this puzzle before. A couple of the others got it quickly. For the remaining holdouts Professor Blakeley gave a hint. “Note that each term is symmetrical about a vertical line.” Now the solution became clear to those still in the dark. “The second missing number puzzle is actually quite a bit more mathematical. Here it is: 1, 1, 1, ?, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, . . . . Given that the missing number f (4) is not 1, what is it?” Now the students were completely mystified. Was this some kind of obscure joke? It was not hard to come up with a highly contrived answer like the sequence 1 + δ4n , where δij is the Kronecker delta, but not even Professor Blakeley could be that lame. Professor Blakeley continued, “There really is a nice answer to this question! Namely, let f (n) be the number of inequivalent differentiable structures that can be put on Rn . By a result of Stallings, f (n) = 1 if n ≠ 4. Freedman showed that f (4) > 1, and Taubes showed that in fact f (4) = c, the cardinality of the continuum (or of the real numbers). Thus the missing number is c.” The students were quite amazed at this dichotomy between n = 4 and n ≠ 4, though not all of them knew exactly what was meant by inequivalent differentiable structures. When asked about this by Clayton, Professor Blakeley replied, “You need to understand the definition of a differentiable manifold. Very roughly, it is a space M on which we can do calculus. More precisely, we want to cover M with open sets Uα such that we are given maps ϕα ∶ Uα → Rn that are homeomorphisms onto an open subset of Rn , where the transition maps between the induced coordinate systems on an intersection Uα ∩ Uβ are infinitely differentiable. You can look



up exactly what this means. You can imagine what it means for two differentiable manifolds M and N to have equivalent (or diffeomorphic) differentiable structures: there should be a bijection f ∶ M → N such that f and f −1 are infinitely differentiable. Again you can look up precisely what this means, but the upshot is that there is only one way to do calculus on Rn for n ≠ 4, but lots of ways for n = 4. Thus there is a good opportunity to write quite a few textbooks on four-dimensional calculus.” After a short break it was time to get back to business. “There doesn’t seem to be much connection between base b representations of real numbers and ‘serious’ mathematics,” said Professor Blakeley. “For instance, √ no one has any idea how to prove or disprove that the decimal expansion of 2 has infinitely many zeros or that infinitely many primes are palindromes (read the same backwards as forwards) in base √ 10. On the other hand, I hope all of you can show that the binary expansion of 2 indeed does have infinitely many 0’s and infinitely many 1’s. Can anyone give me some examples of where the base b expansion does appear in a noncontrived way, at least for certain b?” “Lucas’s theorem,” replied Daniel immediately. This suggestion was soon followed by various other contributions, including Kummer’s theorem. “Not bad!” said Professor Blakeley. “We don’t have time to go into all your suggestions, so let’s discuss two of the most striking examples, Lucas’s and Kummer’s theorem. I will state these results for the benefit of those who haven’t seen them or whose memory is hazy. “Lucas’s theorem asserts that if p is prime and if n = ∑i≥0 ai pi and k = ∑i≥0 bi pi are the base p expansions of the positive integers n and k (so 0 ≤ ai < p and 0 ≤ bi < p), then n ai ( ) ≡ ∏ ( ) (mod p). k i≥0 bi If f (x) and g(x) are polynomials with integer coefficients, I will write f (x) ≡ g(x) (mod p), or simply f (x) ≡ g(x), to mean that all coefficients of f (x) − g(x) are divisible by p. Since (1 + x)p ≡ 1 + xp (mod p) (the so-called ‘sophomore’s dream’), we have n k n ∑ ( )x = (1 + x) k k≥0 i


(1 + x)∑ ai p

p ∏ (1 + x )




⎛ ai b pi ⎞ . ∏ ∑ ( )x i ⎝ ⎠ i≥0 bi ≥0 bi

Taking the coefficient of xk on both sides and using the uniqueness of the base p expansion k = ∑ bi pi gives the result. “As an application, how many numbers in the nth row of Pascal’s triangle, i.e., the numbers (nk) for k ≥ 0, are not divisible by p?” Several hands went up, and Professor Blakeley pointed to Jung Wook. “If n = ∑ ai pi as before, then the number is ∏i (1 + ai ).” “And why is that?”



“We need each bi to satisfy bi ≤ ai , so there are ai + 1 choices for bi .” “Super! Now someone mentioned Kummer’s theorem. Who can explain that?” Only Daniel and Sandra volunteered, and Sandra was chosen. “Kummer’s theorem says that the largest power of p dividing (nk) is equal to the number of carries in adding k and p − k in base p using the usual addition algorithm.” “Proof?” asked Professor Blakeley. “I don’t remember exactly, but you just have to fool around with de Polignac’s formula.” Wow, thought Professor Blakeley, she knows the name de Polignac. I bet I can stump her with the next question. “And what is de Polignac’s full name?” Silence. “Ah, no one knows. Don’t they teach history anymore? It is Alphonse Armand Charles Georges Marie de Polignac. The formula is due originally to Legendre and is also called Legendre’s formula. Thus we have a partial instantiation of Stigler’s law of eponymy: no scientific discovery is named after its original discoverer. This is named after Stephen Stigler, who wrote about it in 1980. Note, however, that Stigler’s law of eponymy implies that Stigler’s law of eponymy was not originally discovered by Stigler. “Let me remind you of what the Legendre/de Polignac formula says, namely, if p is prime, then the exponent of the largest power of p dividing n! is ⌊ np ⌋ + ⌊ pn2 ⌋ + ⌊ pn3 ⌋ + . . . . I won’t insult you by asking for a proof. Anyway, Kummer’s theorem follows by keeping careful track of the powers of p dividing n!, k!, and (n − k)!. “Now let’s look at a rather frivolous instance of base mathematics: getting the decimal expansion (or any base b) to behave nicely. For instance, 100 = 0.010102030508133455 . . . . 9899 Taking the digits after the decimal point two at a time generates the Fibonacci numbers 1, 1, 2, 3, 5, 8, 13, 34, 55, . . . . What’s going on?” Hands quickly shot up, and Professor Blakeley pointed to Clayton. “We can write 1/100 100 = 1 1 . 9899 1 − 100 − 10000 If Fn is the nth Fibonacci number, then n ∑ Fn+1 x =



Put x =

1 100

1 . 1 − x − x2

and divide by 100 to get Fn 100 . = n 9899 n≥1 100 ∑

Fn When Fn has at most two digits, then 100 n has the decimal expansion, starting after the decimal point, consisting of 2n − 2 0’s followed by the digits of Fn (with a leading 0 if Fn has just one digit), so adding all of these up will give what you want, until there is some spillover from the first three-digit Fibonacci number.” “Good explanation!” said Professor Blakeley. “It should also be pointed out √ that the left-hand side of equation (3.1) converges for ∣x∣ < −1+2 5 = 0.618 . . . , so 1 there is no problem with convergence when x = 100 .



“So how about

√ 5000 − 700 51 = 1.00010002000500140042013204291430 . . . ?

Looking at the digits in groups of four gives the sequence 1, 1, 2, 5, 14, 42, 132, 429, 1430, . . . . Does anyone recognize it?” Emiliano was chosen among the volunteers. “These are the Catalan numbers 1 (2n ). I don’t remember exactly the formula for the generating function Cn = n+1 n C(x) = ∑n≥0 Cn xn = 1 + x + 2x2 + . . . , but it should be the case that √ F (1/10000) = 5000 − 700 51. Then we can argue the same way Clayton did.” “Exactly right!” said Professor Blakeley. “In fact, we have √ 1 C(x) = (1 − 1 − 4x). 2x 4 It is a happy accident that the numerator of 1 − 10000 , namely 2499, is divisible √ √ 2 by 7 , so F (1/10000) can be written using 51 rather than 2499. I assume that everyone here is familiar with Catalan numbers, one of the most uniquitous sequences appearing in mathematics. In fact, the OEIS1 entry on Catalan numbers (A000108) states that ‘(t)his is probably the longest entry in the OEIS, and rightly so.’ “By the way, does anyone know the value of the sum 1 ? S ∶= ∑ n≥0 Cn As a hint, the first four terms are 1+1+

1 1 + = 2.7. ” 2 5

None of the students bothered to make the obvious conjecture. “I see you have figured it out,” continued Professor Blakeley. “After all, the only sensible guess that fits the data is √ 3 3π (3.2) S =2+ = 2.806133 . . . . 27 You may be wondering how to prove such a formula. The key is the well-known Maclaurin series for (sin−1 x)2 , namely, (3.3)

22n−1 2n x . 2 2n n≥1 n ( n )

(sin−1 x)2 = ∑

One clever way to prove equation (3.3) is to first show that (3.4)

cos t (sin−1 x) = ∑ (−1)n t2 (t2 − 22 )(t2 − 42 )⋯(t2 − (2n − 2)2 ) n≥0

x2n . (2n)!

Any thoughts on how to do this without a lot of computation?” Soon there were some volunteers, and the Professor pointed to Patrick. “We have 2n 2n t cos t (sin−1 x) = ∑ (−1)n (sin−1 x) . (2n)! n≥0 1

The Online Encyclopedia of Integer Sequences,



Since sin−1 x = x + . . . , we see that the coefficient of x2n /(2n)! in cos(t sin−1 x) is an even polynomial Pn (t) of degree 2n and leading coefficient (−1)n . If k ∈ Z, then cos 2kθ is an even polynomial in cos θ of degree 2k. Moreover, cos2 (sin−1 x) = 1−x2 . Hence cos 2k(sin−1 x) is an even polynomial in x of degree 2k. For instance, I happen to remember that cos 4(sin−1 x) = 8x4 − 8x2 + 1. It follows that Pn (±2k) = 0 for ∣k∣ < n. We now know all the zeros and the leading coefficient to Pn (t), so we get Pn (t) = (−1)n t2 (t2 − 22 )(t2 − 42 )⋯(t2 − (2n − 2)2 ). ” “Super! And how does this help us prove equation (3.3), i.e., the expansion of (sin−1 x)2 ?” It seemed to Professor Blakeley that the students were taking an inordinate amount of time to find the simple trick. Finally Daniel said, “Take the coefficient of t2 !” “Exactly! And by taking the coefficient of t2k we get the expansion of −1 (sin x)2k , though these expressions become more and more complicated as k increases. So finally, we need to use equation (3.3) to evaluate ∑n≥0 1/Cn .” √ Fumei volunteered. “Substitute 12 x for x and apply the operator 2

d 2 d d x x dx dx dx n

x to equation (3.3). The left-hand side becomes ∑n≥0 C . It should be routine to n apply this operator to the right-hand side and then set x = 1.” “You are right about the right. After we apply your procedure and simplify, we obtain √ −1 1 √ xn 2(x + 8) 24 x sin ( 2 x) = + . ∑ (4 − x)2 (4 − x)5/2 n≥0 Cn

If we set x = 1, then we get the claimed equation (3.2). “To test your understanding, today’s problem set includes the evaluation of the sum ∑n≥0 (4 − 3n)/Cn . “Let me give one further example of base mathematics which is a little more interesting mathematically. While this result can be formulated without any mention of number bases, it is most elegantly stated in terms of them. Define a sequence a0 , a1 , . . . of nonnegative integers recursively as follows: an is the least nonnegative integer greater than an−1 such that no three of the numbers a0 , a1 , . . . , an form an arithmetic progression. Thus a0 = 0 and a1 = 1. Since 0,1,2 form an arithmetic progression, we have a2 = 3. Similarly a3 = 4. Since we have arithmetic progressions 1,3,5; 0,3,6; 1,4,7; 0,4,8, we have a4 = 9. What is an , e.g., a1000000 ?” Patrick quickly computed that the sequence begins 0, 1, 3, 4, 9, 10, 12, 13, 27, 28, 30, 31, 36, 37, 39, 40, 81, 82, . . . . Everyone saw the striking gaps from 4 to 9, 13 to 27, and 40 to 81, i.e., from 1 (3n − 1) to 3n . And between these gaps the differences were the same as starting 2 from the beginning. For instance, the differences between consecutive terms from 27 to 40 are 1,2,1,5,1,2,1, which are the same as the gaps between consecutive terms from 0 to 13. This made it easy to compute recursively the value a1000000 , and in fact it didn’t take Patrick long to announce that a1000000 = 1726672221.



“So what is an elegant nonrecursive description of an in general?” asked Professor Blakeley. After politely waiting a while he continued, “I know eventually you can work this out, but I will save some time and just tell you the answer. Namely, write n in binary and read it in ternary! Thus 1000000 = 219 + 218 + 217 + 216 + 214 + 29 + 26 , so a1000000 = 319 + 318 + 317 + 316 + 314 + 39 + 36 . Of course it’s easy to prove this result by induction once it’s guessed, but it’s still very cute! “There are lots of known generalizations. For instance, instead of starting with a0 = 0, we could instead let k > 0 and start with a0 = 0 and a1 = k. For n ≥ 2 we define an to be the least positive integer greater than an−1 so that no three of a0 , a1 , . . . , an are in arithmetic progression. You can play around with this for a while and see that a curious dichotomy arises. When k is of the form 3j or 2 ⋅ 3j , which we call the regular values of k, there is a nice explicit description of an analogous to the case k = 1 (the case we just considered). For regular k the rate of growth of an is given by 1 an an (3.5) = lim inf α < lim sup α = 1, n→∞ n 2 n n→∞ 3 = 1.58496 . . . . where α = log log 2 “For other (irregular) values of k, the sequence appears to be very unpredictable. Let me recall the notation f (n) ∼ g(n), read ‘f (n) is asymptotic to g(n),’ if f (n) = 1. lim n→∞ g(n) There is a simple heuristic argument about the rate of growth of an if the sequence behaves sufficiently randomly. Think of pn as the ‘probability’ that n appears in the sequence. Now n will appear in the sequence if and only if n − i and n − 2i do not appear for 1 ≤ i ≤ n/2. If these events are independent, then we obtain ⌊n/2⌋


pn = ∏ (1 − pn−i pn−2i ), n ≥ 2. i=1

For the recurrence (3.6) with the initial conditions p0 = p1 = .5, which corresponds to a ‘random’ start, one can show that √ p1 + p2 + ⋅ ⋅ ⋅ + pn ∼ c n log n, where √ c is a positive constant. Hence when k is irregular we would expect that n ∼ an log an , or c′ n 2 . log n This estimate agrees quite well with the numerical evidence, but nothing has been proved. Note that equation (3.7) yields a faster rate of growth than (3.5). “I could mention the original motivation for the first sequence (corresponding to k = 1). We can ask for the size r3 (n) of the largest subset of [n] containing no three terms in an arithmetic progression. A natural conjecture is that we can


an ∼



construct this subset using a greedy algorithm, which leads to our first sequence. However, this conjecture is false. The best current bounds are (3.8)

√ 8 log n


< r3 (n) < c

(log log n)4 n log n

for some constant c > 0. As a bonus problem you can try to improve either the upper or lower bound, or even both. “We can also ask for the least rate of growth among all infinite sequences 0 = e0 < e1 < e2 < ⋯ that don’t contain three terms in arithmetic progression. In this direction, it is known that there is such a sequence satisfying √ (3.9) #{i ∶ ei < n} > n exp(−c log n) for all n ≥ 1. If you are having trouble understanding what this bound means, note that for > 0 we have n1− = n exp(− log n). √ Since log n grows more slowly than log n, this means that #{i ∶ ei < n} grows faster that n1− for any > 0.” It was time for a short break. Professor Blakeley and some students slipped out for some coffee. When they returned the Professor didn’t waste any time getting back to business. “Although base mathematics tends to be somewhat frivolous and not related to serious mathematics (whatever that means), there are some exceptions. For instance, James Maynard has shown that for any 0 ≤ k ≤ 9 there are infinitely many primes that don’t have the digit k in their decimal expansion. The proof is a very intricate argument using many tools from analytic number theory. Maynard’s result is probably quite far from the strongest result along these lines, since for instance it is plausible on probabilistic grounds, related to the fact that the sum of the reciprocal of the primes diverges, that there are infinitely many primes of the form (10n −1)/9, i.e., with only the digit 1 in their decimal expansion. “By the way, does anyone know a simple proof that ∑p 1/p diverges, where p ranges over all primes?” “It’s immediate from the prime number theorem,” suggested Fumei. “That is true, but I wouldn’t call this proof ‘simple.’ The prime number theorem is not an easy result. How about a proof using only elementary calculus, for instance?” After a little while Jung Wook volunteered. “Consider for x > 0 the product f (x) = =

1 1 1 − p n/2. The number of permutations whose largest cycle has length k is (nk)(k − 1)!(n − k)! = n!/k, since there will be a unique cycle C of length k, and the number of ways of choosing the elements of C is (nk). Then we have (k − 1)! ways to pick C and (n − k)! ways to permute the remaining elements. Thus L(n) ≥

n 1 n! ∑ k⋅ n! k=⌊n/2⌋+1 k

1 n. ” 2 “Marvelous!” said Professor Blakeley with all sincerity. “In fact, one can show that lim L(n)/n indeed exists. Let us call it c. Can anyone√guess what it is?” Various guesses were given, such as log 2 and 12 (−1 + 5). Professor Blakeley was disappointed. “No one in my class last summer could guess it either. It looks like I will have to give you a clue, namely, numerically we have ∼

c = 0.62432998854355087099293638310 . . . .” Alas, no student was able to recognize this number, so Professor Blakeley said, “Analysis of data is a basic tool for solving problems. Try to develop this important skill! It looks like I will have to tell you the answer, namely, (4.3)


∞ 0

exp (−x − ∫


e−y dy) dx.” y

“Just what I was about to say!” said Clayton. “How could I have missed it?” added Emiliano. Professor Blakeley said, “This result was proved in 1966 by Stuart Lloyd and Larry Shepp. In fact, they generalized it to . . . .” At this moment an unexpected visitor entered the room. It was a stately male cat, rather large with short, mostly black hair and a black nose, but with a number of prominent white patches. The cat walked to the front of the desks, turned around so it was facing the board, and sat down.



“It looks like he wants to learn some mathematics,” said Professor Blakeley. “Let’s test his knowledge.” He knelt down near the cat and said to it, “If 7 + 5 = 12, then stay where you are. If 7 + 5 = 13, then walk over to me.” The cat stayed put. “Wow!” said Professor Blakeley. “Let’s try some calculus. If the derivative of x3 is 3x2 , then stay where you are. If it’s 14 x4 , then walk over to me.” Again the cat did not budge. Sandra walked up to the front of the room and said, “Let’s try a real challenge. If there are two groups of order six, then walk over to me. If there are more than two, then stay where you are.” The cat stayed where he was. The students were disappointed. “I guess abstract algebra is too much to expect,” said Sandra. “Now just a minute!” said Professor Blakeley. “I believe our visitor is correct. I am surprised that none of you is familiar with algebra. There are actually infinitely many groups of order six. Sandra did not say ‘up to isomorphism.’ ” Was the cat really as literal-minded as Professor Blakeley? Sandra decided to try again. She turned to the cat and said, “If there are exactly two isomorphism classes of groups of order six, then walk over to me. If there are more than two, then stay where you are.” To everyone’s amazement, the cat stood up, walked over to Sandra, rubbed his cheeks against her leg, and began purring. Sandra gave it a few pats. Professor Blakeley said, “Holy cow, this cat seems to be smarter than Cayley! Arthur Cayley was the founder of abstract group theory. In one of his papers he wrote that there were three 6-element groups (up to isomorphism) because he counted Z/6Z and (Z/2Z) × (Z/3Z) as different groups.” By now everyone was in front of the room admiring the mysterious visitor. Finally Patrick said, “Okay, time for a real test! If there are infinitely many pairs of prime numbers that differ by two, then stay where you are. Otherwise walk over to me.” The cat stayed where it was for a short while and then stood up and began walking in circles while meowing piteously. His tail was swishing back and forth, and his ears were turned back. Finally he ran out of the room and quickly disappeared from sight. “Gee,” said Patrick, “I guess he couldn’t solve the Twin Prime Conjecture.” “What do you expect?” said Professor Blakeley. “After all he’s only a cat.”


Set Theory PROBLEM LIST Uncountably many subsets of Z forming a chain Uncountably many subsets of Z whose pairwise intersections are finite Goodstein’s total b-ary expansion procedure Sylver coinage Can R3 be partitioned into unit circles? Chromatic number of a graph on R Infinitely many prisoners with red or blue hats Point mass escaping to infinity Billiard ball escaping the table


40 40 40 43 43 44 45 46 46



“All right!” said Professor Blakeley at the start of the next class. “Today we are going to have some fun with sets! Let’s start with the existence or nonexistence of certain sets. The first problem is quite well known, so only raise your hand if you solve it and haven’t seen it before. “Does there exist an uncountable collection S of distinct subsets Sα of Z (where α belongs to some uncountable index set I) which form a chain, that is, if Sα , Sβ ∈ S, then either Sα ⊆ Sβ or Sβ ⊆ Sα ?” Half the students had not seen this question before. At first glance the answer should be negative, since we would need uncountably many different elements to account for the (infinitesimal) “steps” in the chain of subsets. Since a positive answer seemed very counterintuitive, the students unfamiliar with the problem deduced that on psychological grounds such a subset S does exist. Professor Blakeley decided to help out with a hint. “Note that the integers and rationals have the same cardinality.” Soon Sandra saw the light and raised her hand. “The integers and rationals are both countable, so it is equivalent to ask the question for subsets of Q. For every real number α let Sα = {x ∈ Q ∶ x < α}.” “Terrific!” said Professor Blakeley. “Once one thinks about rational numbers rather than integers, the problem becomes much easier. “Okay, here is another one. Does there exist an uncountable collection S of distinct subsets Sα of Z such that if α ≠ β, then Sα ∩ Sβ is finite?” Now only Daniel admitted to having seen the problem. Again it seemed very counterintuitive that such a collection S could exist. Guided by the previous problem, however, it wasn’t long before Jung Wook raised his hand. “We can replace Z by Q as before. For every α ∈ R, let Sα = {x1 , x2 , . . . }, where x1 , x2 , . . . is a sequence of rational numbers converging to α.” “Again terrific!” said Professor Blakeley. “Now for a real change of pace,” Professor Blakeley continued. “Given positive integers n and b, define the total b-ary expansion Tb (n) (also called the hereditary base-b notation) as follows. Write n as a sum of powers of b, with no power occurring more than b − 1 times. (This is just the usual base b expansion of n.) For instance, if n = 357948 and b = 3, then we get 311 + 311 + 37 + 36 + 36 + 32 . Now do the same for each exponent, giving 2




+ 33


+ 33+3+1 + 33+3 + 33+3 + 31+1 .

Continue doing the same for every exponent not already a b or 1, until finally only b’s and 1’s appear. In the present case we get that T3 (357948) is the array 1+1




+ 33


+ 33+3+1 + 33+3 + 33+3 + 31+1 .

Now define a sequence a0 , a1 , . . . as follows. Choose a0 to be any positive integer, and choose a base b0 > 1. To get a1 , write the total b0 -ary expansion Tb0 (a0 − 1) of a0 − 1, choose a base b1 ≥ b0 , and replace every appearance of b0 in Tb0 (a0 − 1) by b1 . This gives the total b1 -ary expansion of the next term a1 . To get a2 , write the total b1 -ary expansion Tb1 (a1 − 1) of a1 − 1, choose any base b2 ≥ b1 , and replace every appearance of b1 in Tb1 (a1 − 1) by b2 . This gives the total b2 -ary expansion of the next term a2 . Continue in this way to obtain a3 , a4 , . . . . In other words, given an and the previously chosen base bn , to get an+1 , write the total bn -ary expansion



Tbn (an − 1) of an − 1, choose a base bn+1 ≥ bn , and replace every appearance of bn in Tbn (an − 1) by bn+1 . This gives the total bn+1 -ary expansion of the next term an+1 . Example. Choose a0 = 357948 and b0 = 3 as above. Then 1+1

a0 − 1 = 357947 = 33



+ 33


+ 33+3+1 + 33+3 + 33+3 + 3 + 3 + 1 + 1.

Choose b1 = 10. Then a1

= 1010 +1+1 + 1010 +1+1 + 1010+10+1 + 1010+10 + 1010+10 + 10 + 10 + 1 + 1 = 10102 + 10102 + 1021 + 1020 + 1020 + 10 + 10 + 1 + 1. 1+1


Then 1+1

a1 − 1 = 1010



+ 1010


+ 1010+10+1 + 1010+10 + 1010+10 + 10 + 10 + 1.

Choose b2 = 766. Then a2




766766 +1+1 + 766766 +1+1 + 766766+766+1 + 766766+766 +766766+766 + 766 + 766 + 1,

etc. Prove that for some n we have an = an+1 = ⋯ = 0.” This seemed completely counterintuitive to the students. How could we not force an → ∞ by choosing the bn ’s sufficiently large? “It would be unfair to have you work on this yourselves,” said Professor Blakeley, “since you aren’t really expected to have the necessary background, namely, the theory of ordinal numbers. Therefore I will explain the solution. “If you are not already familiar with ordinal numbers, you can think that the first ordinal number is 0. Once we have constructed an ordinal number x, then there is always a next ordinal number. We can represent an ordinal number by a line of dots. Once we have represented x, put a dot to the right of all the other dots to get the next ordinal. At first you might think that we are simply constructing the positive integers, but the point is that even after an infinite sequence of steps, we can still continue. Thus we start with no dots, then one dot (●), then two dots (●●), etc. After a while we will have a string of infinitely many dots: ●●●●⋯ This ordinal number is denoted ω. It is the smallest limit ordinal, i.e., it doesn’t have a last element. But we can still continue the process, placing a dot after this infinite sequence: ● ● ● ● ⋯● This ordinal is denoted ω + 1, and continuing to place dots gives a second limit ordinal, denoted 2ω: ● ● ● ● ⋯ ● ● ● ●⋯ You can easily imagine continuing this way to get 3ω, 4ω, etc. Continuing, we will reach an infinite sequence of infinite sequences, each looking like ω. This ordinal number is denoted ω 2 . Then add a next dot to get ω 2 + 1, etc. We will eventually get (though of course not consecutively) ω 2 + ω, ω 2 + 2ω, . . . , 2ω 2 , 3ω 2 , . . . , ω 3 , ω 4 , . . . . Finally we reach the limit ordinal that is the limit of the sequence ω, ω 2 , ω 3 , ω 4 , . . . . This ordinal is denoted ω ω . You can imagine now the sequence ω

ω, ω ω , ω ω , . . . .



This limit ordinal is denoted 0 . We are on the verge of leaving the realm of intuitive comprehensibility, but fortunately we won’t be needing any ordinal beyond 0 . “Every ordinal number preceding 0 can be uniquely written as a finite expression in terms of 1 and ω using addition and exponentiation, with the terms in each sum appearing in decreasing order, for example, ωω


+ω 3 +2ω 2 +1

+ ω 3ω+5 + 2ω + 8.

I should point out that in standard (Zermelo-Fraenkel) set theory, there exist much larger ordinals. In particular, there is a first uncountable ordinal, but it is impossible to define by an explicit construction. Anyway, a fundamental property of ordinal numbers is that any strictly decreasing sequence of ordinal numbers is finite. “We are now ready to look at total b-ary expansions. Given b and n, construct an ordinal number f (b, n) by taking the total b-ary expansion T (b, n) of n, replacing each b by ω, and writing each sum in decreasing order of its terms. When we replace each appearance of b in T (b, n) with any other integer c ≥ 2, we get the total c-ary expansion of some (usually enormous) integer N , but we still have f (b, n) = f (c, N ). It is easy to see that f (c, N − 1) < f (c, N ) (as ordinal numbers). Thus the ordinal numbers Tb1 (a1 − 1), Tb2 (a2 − 1), Tb3 (a3 − 1), . . . are strictly decreasing. Since every strictly decreasing sequence of ordinal numbers is finite, the procedure must terminate! This amazing result was proved by Reuben Goodstein in 1944.” After waiting a little for the proof to sink in, Professor Blakeley continued, “Is everyone familiar with the Peano postulates? They are a simple set of axioms for the positive integers. Logicians are interested in the precise power of these axioms, i.e., what theorems can and cannot be proved with them. Goodstein’s theorem is the first result historically that can be stated within the Peano axioms, cannot be proved with these axioms, but can be proved using the Zermelo-Fraenkel axioms. “The question arises as to how long it takes to obtain 0 using our procedure.” “What do you mean, how long? You can’t give any bound in advance,” said Daniel. “Good question! Given n and b, we cannot give in advance an upper bound on the number of steps to termination. The idea is to compare the procedure with the following one: given an ordinal number x, we successively choose smaller ordinals until reaching 0. How long can this procedure Px last? If x = n, a finite ordinal, then we can say that the procedure lasts at most n steps. What if we choose ω? We cannot give a bound on the number of steps, but we can say that after one step, then we can give a bound on the number of remaining steps. Similarly if x = ω + n, then after n + 1 steps we can give a bound on the number of steps to terminate. “What about x = 2ω + n? Now we can say that after n + 1 steps, we can give a bound k on the number of steps until we can give a bound j on the number of steps to terminate, etc. You get the idea, although this kind of description becomes very cumbersome, that it does give a description of how long the entire procedure takes to terminate. “Now we can describe the length of the original procedure T by asking for what ordinal number x is T ‘comparable’ to Px , that is, has the same stopping time description as Px . All this can be made completely rigorous. It will come as no surprise that T is comparable to P0 , so the procedure can last a really long time.



“It can be interesting to determine the ordinal stopping time of various games and procedures that must terminate in finitely many moves or steps. For a game like chess (which we can make of finite length by requiring that no position may be repeated twice), the stopping time is just a finite ordinal. There is a simple game invented by John Conway that has a more interesting stopping time. It is called Sylver Coinage for J. J. Sylvester, who proved a theorem equivalent to saying that the game always ends in finitely many moves. There are two players Alice and Bob. They each take turns choosing positive integers. No player may choose an integer that is a sum, allowing repetitions, of previously chosen integers. Whoever chooses 1 loses. For instance, if Alice goes first and chooses 2, then Bob can choose 3, and Alice loses, since every integer n ≥ 2 can be written in the form n = 2a + 3b, where a, b are nonnegative integers. Sylver Coinage is a quite subtle game and not well understood. For instance, it is unknown who wins if the first move is 16. But the point I want to make here is that it can last a long time, since its length is known to be comparable to Pω2 . As a hint for proving this, consider for each state of the game the pair (d, m), where d is the greatest common divisor of all numbers that can no longer be chosen, and m is the number of multiples of d that have not yet been chosen. First note that m is finite by some elementary number theory. “Let’s take a five-minute break. You can have some fun playing Sylver Coinage.” Five minutes was hardly enough time to develop even the slightest feeling for how to play the game, but it was adequate for quenching thirst and the inverse process. Professor Blakeley noticed Emiliano poring over a book. The Professor walked up to Emiliano and saw that the book was entitled Enumerative Combinatorics, vol. 1, and Emiliano was looking at a chapter on rational generating functions. “That book has some interesting material,” Professor Blakeley said, “but the author doesn’t know how to write. You shouldn’t read anything written by him.” Emiliano nodded politely, but he did not desist from reading the book. After the break Professor Blakeley was as enthusiastic as ever. “I hope all of you are now expert Sylver Coinage players. Let’s look at a geometric problem where set theory is unexpectedly involved. Later1 we will look at some geometry problems that are more traditional. Okay, the question is: can R3 be partitioned into unit circles? In other words, is R3 a disjoint union of circles of radius one?” Given that set theory is unexpectedly involved, the students were not sanguine about finding an explicit partitioning. However, they did not see how to use set theory to show that such a partitioning existed. Professor Blakeley interrupted these musings by saying, “I will explain how it’s done. First let me review the concept of well-ordering. A well-ordering of a set S is a total ordering < of S (so that if x, y ∈ S and x ≠ y, then either x < y or y < x, but not both), so that every subset of S has a least element in the ordering. Thus the usual ordering of N is a well-ordering, but the usual ordering of the nonnegative real numbers isn’t, since for example the positive real numbers don’t have a least element. It is a simple consequence of the Zermelo-Fraenkel axioms plus the Axiom of Choice, which I will assume, that every set can be well-ordered. Any ordinal number (considered informally as a sequence of dots I defined earlier, where one dot is less than another if it is to the left) is well-ordered, and conversely every well-ordered set is isomorphic, as a totally ordered set, to a unique ordinal. 1

Chapter 13



“Let φ be the least ordinal whose cardinality is that of the continuum (the cardinality of R or R3 ). Such an ordinal exists since every set can be well-ordered and every (nonempty) set of ordinals has a least element. Let α ↦ xα be any bijection between ordinals α < φ and elements xα of R3 . Thus R3 = {xα ∶ α < φ}. For each xα I will define a unit circle Cα containing xα such that the distinct circles Cα form a partition of R3 . “The definition of Cα is by transfinite induction. That is, I will define Cα after defining Cβ for all β < α. Namely, if xα lies on a circle Cβ with β < α, then let Cα = Cβ . Otherwise let Cα be any unit circle containing xα and disjoint from all Cβ for β < α. We need only show that such a circle Cα exists, since once we complete the transfinite induction every xα ∈ R3 will be contained in one of the circles Cα . “Each circle Cβ with β < α lies on a plane Pβ . The number of these planes has cardinality less than the continuum since α < φ. Because there are a continuum number of planes passing through xα , we can choose such a plane P that contains no Cβ for β < α. There are a continuum number of unit circles containing xα and lying in the plane P . Since each Cβ intersects P in at most two points, there are less than a continuum number of points of P intersecting some Cβ , where β < α. Hence some unit circle through xα lies in P and is disjoint from all Cβ , completing the proof.” A neat proof !, thought the students. Somehow it seemed like cheating to invoke set theory in such a way. “Let’s move on to another aspect of set theory,” said the Professor, “one that gives a counterintuitive result. Are you all familiar with the chromatic number of a graph?” No one indicated otherwise, but as a reminder Professor Blakeley explained, “Let G be a graph with vertex set V and edge set E. We can assume G is simple, that is, each edge is incident to two distinct vertices, and no two edges are incident to the same two vertices. The chromatic number χ(G) is the smallest number n (which may be an infinite cardinal number if V is infinite) for which there is a ‘coloring’ f ∶ V → S, where S is a set of cardinality n, such that adjacent vertices get different values (colors). For instance, the famous four-color theorem says that if G is a finite planar graph, then χ(G) ≤ 4. “Now consider the following graph G. √Its vertex set is R. Two vertices u and √ v are adjacent if u − v + 2 ∈ Q or u − v − 2 ∈ Q. What is the chromatic number χ(G)?” After getting no quick response, Professor Blakeley gave a hint: “What can you say about the length of cycles in G?” He noticed that Sam was looking at him expectantly, so Professor Blakeley asked Sam if he had an idea. Sam√replied, “If we walk k steps from a vertex u, then we’ll be at a vertex u + r + a 2, where r ∈ Q and a is an integer with the same parity as k. Hence all cycle lengths are even, so the graph is bipartite, i.e., χ(G) = 2.” “Very good, but how did you conclude that G is bipartite, i.e., the vertex set is a disjoint union V = V1 ⊍ V2 , and every edge goes between V1 and V2 ?” “It’s a standard theorem in graph theory that a graph is bipartite if and only if all its cycles have even length.” “How do you prove the ‘if’ statement, which is what we need here.”



Sam thought for a few seconds and said, “Choose a vertex v in each connected component and color it red. If there is a path of even length from v to another vertex u, then color u red. Otherwise color u blue. The even length cycles guarantee that this procedure gives a well-defined 2-coloring.” “Hmm. . . . Is there a snag in that argument?” At first it sounded completely ironclad to the students, but Clayton was the first to wonder what it had to do with set theory, after which the light quickly dawned. “Perhaps we need the Axiom of Choice in order to pick a vertex in each component to color red.” “Yes,” said Professor Blakeley, “we certainly seemed to use the Axiom of Choice. Is there any way around it?” Surely there must be a way to avoid the Axiom of Choice, thought many of the students. One could try to define a canonical vertex in each component, such as the least positive vertex, though clearly that particular idea would not work. In fact, nothing seemed to work. Perhaps the Axiom of Choice was essential, though they did not know how to prove such a result. “I am not so surprised that you don’t see how to avoid the Axiom of Choice,” said Professor Blakeley. “In fact, let us assume only the Zermelo-Fraenkel (ZF) axioms, together with the Axiom of Choice for countable sets, and the axiom that all subsets of R are measurable. It is known that these axioms are consistent with ZF alone. Alexander Soifer then showed that χ(G) is neither finite nor countably infinite! However, it is unknown whether χ(G) has the cardinality c of the continuum, i.e., of the real numbers, or equivalently for those who know a little set theory, cardinality 2ℵ0 . Of course if the Continuum Hypothesis is true, then c = 2ℵ0 , so then we would have χ(G) = c (since the number of vertices of G is c, so c is an upper bound). You probably know the famous result of Paul Cohen that both the truth and falsity of the Continuum Hypothesis are consistent with ZF.” This was a lot to absorb for students with no background in set theory, but Professor Blakeley continued relentlessly. “Let’s look at another situation where the Axiom of Choice plays a decisive role. This problem belongs to the genre ‘beleaguered prisoners.’ We have a rather large prison with infinitely many prisoners numbered 1, 2, . . . . Each prisoner knows the numbers of all the prisoners. Each prisoner receives a red hat or blue hat, but they are unable to see the color of their hat. However, all the prisoners get to see the colors of everyone else’s hat. The evil prison staff then asks each prisoner individually and privately what is the color of their hat. All the prisoners are then executed unless only finitely many guess incorrectly. The prisoners are allowed to communicate before the procedure begins but not after they see the hats of their fellow prisoners. We assume that the prisoners can instantaneously process all the information they receive. What is their best strategy?” It was a big clue that the Axiom of Choice was involved, and it wasn’t long before Daniel and Patrick had their hands in the air. It was also clear that Sam had (or at least thought that he had) a solution. Professor Blakeley pointed to Patrick. “Put an equivalence relation ∼ on all infinite sequences α = a1 a2 ⋯ of R (red) and B (blue) by saying that α ∼ β if α and β differ in only finitely many terms. For each equivalence class E, the prisoners choose a representative αE ∈ E. This is where the Axiom of Choice comes in. If prisoner i sees color aj on prisoner j, then prisoner i chooses the representative αE = b1 b2 ⋯ of the class E containing the



sequence α = a1 a2 ⋯. The term ai , which prisoner i does not know, is irrelevant. Prisoner i then guesses the color bi for his or her own hat. Clearly this works.” “Well done!” said Professor Blakeley. “Yes, the Axiom of Choice certainly comes in handy. Also the ability to remember and process an infinite amount of information. Later2 we will consider some more problems involving beleagured prisoners. “While we are on the topic of infinitely many people, let’s consider the Hilbert Hotel, which I’m sure most of you have seen. This hotel has infinitely many rooms numbered 1, 2, 3, . . . . At first infinitely many guests G1 , G2 , . . . arrive and line up to check in. “The clerk spends one minute to put guest G1 in room 1, then 12 minute to put G2 in room 2, then 14 minute to put G3 in room 3, etc., so after two minutes all the guests have been accommodated, and the hotel is full. Now another infinite line H1 , H2 , . . . of guests arrive. The clever clerk moves each Gi to room 2i and then puts each Hi in room 2i − 1. Again all the guests are accommodated. More complicated scenarios are possible, which are useful for explaining the paradoxical nature of infinity to neophytes. “But suppose that the clerk adopts a different strategy. As the first guests G1 , G2 , . . . arrive, the clerk first places G1 in room 1. When G2 arrives, he moves G1 to room 2 and places G2 in room 1. Continuing in this manner, when a new guest arrives he moves each guest already accommodated to the next higher room, and then places the new guest in the vacant room 1. Soon all the new guests have checked in, and the clerk is very pleased at his accomplishment. Then he notices to his horror that every room is empty! All the guests have mysteriously vanished! “It is also interesting to note that according to classical Newtonian physics, the above scenario is actually physically possible! To strip the argument down to its essentials, suppose that an object moves in a straight line in such a way that at time t its distance (or the distance of its center of mass) from some fixed point is 1/(1 − t). Where will it be at time t = 1? Although an infinite amount of energy is needed, the universe is infinite in extent, so this energy can be obtained without introducing any singularities. In fact, if you allow point masses, then there is an initial configuration of five point masses in R3 such that if they follow the rules of Newtonian physics, one of the masses will escape to infinity in a finite amount of time!” After a short discussion of this surprising fact Professor Blakeley continued, “This talk about Newtonian physics reminds me of a startling result concerning mathematical billiards. Suppose X is a compact convex subset of R2 whose boundary ∂X is continuously differentiable. Thus X has a tangent line at every point of ∂X, and the slope of this line varies continuously as we move along ∂X. In particular, we can play billiards on X, where our billiard ball B is just a point. We start with B at some point inside X and let B move in a straight line in some direction. When it reaches a point p on the boundary, it bounces back so that the angle of incidence equals the angle of reflection (with respect to the tangent line at p). “The startling result is that there exists such a set X and initial motion of B such that B eventually escapes from the set X!” 2

Chapter 14



Naturally the students wondered what Professor Blakeley was talking about. It clearly made no sense that the ball could leave the set X. “Perhaps I am not being entirely accurate about the ball escaping from X,” said the Professor, “but that is a somewhat whimsical interpretation by Benjamin Halpern of a curious situation. He gives an example where the ball bounces closer and closer to a point p on ∂X, and the direction in which B moves approaches the direction of the tangent line at p. After infinitely many steps but in finite time (assuming that the speed of the ball is constant) the ball reaches p, moving in the tangent direction. “What happens next? Strictly speaking, the motion of the ball becomes undefined. However, Halpern suggests that after reaching p it escapes X along the tangent line at p. Another possible interpretation is that the ball will stay on the boundary, circling forever around it. Very strange!” With these deep philosophical considerations the morning session came to an end.


Two Triangles PROBLEM LIST Binomial coefficient identities Sum of entries ⟨ nk ⟩ in row n of Stern’s triangle Largest entry in row n of Stern’s triangle Generating function Un (x) for entries in row n of Stern’s triangle Alternating sum of entries in row n of Stern’s triangle n ∑k ω k ⟨ k ⟩, where ω = e2πi/3 n n ⟨ 0 ⟩ + ⟨ 1 ⟩ − ⟨ n2 ⟩ + ⟨ n3 ⟩ + ⟨ n4 ⟩ − ⟨ n5 ⟩ + . . . 2

n ∑k ⟨ k ⟩ n 3 ∑k ⟨ k ⟩ 3 The generating function for ∑i (ni) is not algebraic Ratios bi /bi+1 of terms in Stern’s diatomic sequence Sum of elements in row n of the Calkin-Wilf tree 2m −1 r The sum ∑i=1 bi Rationality of generating function for congruence condition on row n of Stern’s triangle Number of pairs (a, b) generating Z/mZ Fraction of consecutive entries in row n of Stern’s triangle satisfying a congruence condition Fraction of entries in row n of Stern’s triangle satisfying a congruence condition Fraction of terms in Stern’s diatomic sequence satisfying a congruence condition An identity involving φ(m) and ψ(m) The equation ∣f (n)∣ = ∑d∣n f (d) Some mysterious observations on Gm,a (x)


50 52 52 53 53 54 54 54 56 56 57 58 58 59 60 61 61 62 62 62 62



When Professor Blakeley walked into class the next day, he seemed very excited. “You all know about Pascal’s arithmetic triangle P , or just Pascal’s triangle or the arithmetic triangle or less commonly Pingal’s Meruprastar, after the Indian mathematician Archarya Pingala from the 3rd/2nd century BC,” he said as he wrote part of this well-known array on the board.

1 1 1 1 1 1

3 4


1 2

1 3

6 10

1 4


1 5


⋮ “The kth entry in row n, beginning with row 0 and calling the leftmost entry in each row the 0th entry, is denoted (nk) and of course is called a binomial coefficient. If k > n, then (nk) = 0. The defining recurrence of the arithmetic triangle is n n−1 n−1 ( )=( )+( ). k k−1 k By the way, the number of digits needed to write the nth row in decimal notation is about (n2 log n)/(2 log 10), so if each digit is written the same size, then the array is not really a triangle. We get a triangle if each entry (nk) takes up the same amount of space, but then the entries are going to be very hard to read for large n. Anyway, the arithmetic triangle has a lot of interesting properties, such as the row ) being equal sums ∑nk=0 (nk) being equal to 2n and certain diagonal sums ∑nk=0 (n−k k to the Fibonacci number Fn+1 . There are a myriad of further identities like k a b a+b )=( ), ∑ ( )( i k − i k i=0

n ), which includes as a special case, using the symmetry (nk) = (n−k

n n 2 2n ∑ ( ) = ( ). n i=0 i

We can extend the definition to (αk) for any complex number (or indeterminate) α and k ∈ N by


α α(α − 1)⋯(α − k + 1) . ( )= k! k



Some other identities are n n i ∑ ( )x i=0 i 3 n i n ∑(−1) ( ) i i=0 n

∑( i=0

= (1 + x)n , = (−1)n

(3n)! , n!3

α+k α+n+1 ) = ( ), n k

2i 2(n − i) ) = 4n , ∑ ( )( n−i i=0 i n

(6.2) min(a,b)

∑ i=0


α+β+i β α α+a β +b )( )( ) = ( )( ). i a−i b−i b a

“Binomial coefficients also have a lot of congruence and divisibility properties, as we saw when we discussed Lucas’s theorem and Kummer’s theorem.1 “I am pleased to announce that I have found an amazing analogue of Pascal’s triangle, with even more identities and congruences! I call it the multiplicative triangle or Blakeley’s triangle, denoted B. It is the multiplicative analogue of Pascal’s arithmetic triangle. The initial conditions are that every row begins and ends with a one, just like Pascal’s triangle. Using the same indexing conventions as before, denote the kth entry in row n of B by GGnkGG. Then we have the defining recurrence GnG Gn − 1G Gn − 1G G G=G G G G Gk G Gk − 1G ⋅ G k G, 0 < k < n. G G G G G G Behold the beautiful multiplicative triangle!” 1 1 1 1 1 1

1 1

1 1


1 1

1 1

1 1


1 1


⋮ The students were temporarily too stunned to reply, so Professor Blakeley continued, “I have just begun to unravel the mysteries of this triangle. For instance, it is straightforward to show that n G G n ∑ GG GG = G k=0 k G n G G2 n ∑ GG GG = k=0 G k G

n + 1, n + 1.

Amazingly enough, I can show that n G G3 n ∑ GG GG = n + 1, G k=0 k G 1

Chapter 3



which has no analogue for Pascal’s triangle. There is also the multiplicative Lucas’s theorem and Kummer’s theorem, namely, for any prime p and 0 ≤ k ≤ n we have GnG ≡ 1 (mod p), and the exponent of the largest power of p dividing GnG is 0.” Gk G Gk G 4 Clayton’s hand shot up. “I think I can handle the sum ∑nk=0 GGnkGG ,” he said. “Terrific!” responded Professor Blakeley. “The multiplicative triangle should be a fecund area for further research. Maybe some of you can write your Ph.D. thesis on this topic. I’ll let you mull over this for a bit, while we consider another interesting ‘Pascal-like’ triangle. “Let’s consider the following analogue. We have the same initial conditions as before, that is, row 0 consists of one 1, and in each subsequent row we place a 1 at the beginning and end. We add two adjacent numbers just as in Pascal’s triangle, but we also take each number in row n and place a copy directly below it in row n + 1. It would take me too long to write down the entire array, but here are the first few rows.”

1 1 1

1 1 1 1 1 2 1 2 1 1 2 1 3 2 3 1 3 2 3 1 2 1 1 ⋮

“So, how many entries in row n?” A number of students cried out, “2n+1 − 1.” “Great!,” said Professor Blakeley. “This array is a lot fatter than Pascal’s triangle. I will call it Stern’s triangle and denote it by S. Can anyone see a problem with this terminology?” “Who is Stern?” ventured a couple of students. “I will get to that later,” replied Professor Blakeley, “but that is not the problem with the terminology.” After seeing only blank faces Professor Blakeley continued, “Even if we adjust the font size of each entry so all entries take up the same amount of space, the two boundary edges are not straight lines as in Pascal’s triangle, but rather exponential curves. Nevertheless, I hope I can be forgiven for referring to S as a ‘triangle.’ Of course we could scale the entry sizes so we do get a triangle, but it would be hard to read the microscopic size numbers after a few rows. “Let’s get back to real mathematics. What’s the sum of the elements in row n?” “3n ,” a few students replied. “Why is that?” The question was so easy that Jung Wook responded without first raising his hand, “Each entry contributes three times to the next row, once to the left, once to the right, and once directly below.” “A very clear and succinct explanation!” said Professor Blakeley. “And what is the largest entry in row n? If you start computing these numbers you get 1, 1, 2, 3, 5, 8, 13, . . . , so there is an obvious conjecture.” Emiliano was chosen to answer. “It’s easy to prove by induction that you get Fibonacci numbers. Let g(n) be the maximum entry in row n. Any two consecutive entries include one number brought down from the previous row, so g(n + 1) ≤ g(n) + g(n − 1). If g(n − 1) and g(n) do appear consecutively in row n, then g(n − 1) + g(n) and g(n) appear consecutively in row n + 1. Since g(0) and



g(1) are consecutive in row 1, we have by induction that g(n + 1) = g(n) + g(n − 1), and it’s easy to check initial conditions.” “Well done!” said Professor Blakeley. “Let’s try something a little more serious. What is the ‘Stern analogue’ of the binomial theorem? To be precise, let ⟨ nk ⟩ be n ⟩ = 1. What the kth entry in row n, beginning with the 0th entry, so ⟨ n0 ⟩ = ⟨ 2n+1 −2 can we say about the polynomial n+1

2 −2 n Un (x) = ∑ ⟨ ⟩xk ?” k k=0

Patrick was furiously computing and quickly announced that he had a conjecture. Professor Blakeley suggested trying to find a formula for Un (x) directly, without first making a conjecture. Soon the students saw the way to do this. The numbers in row n + 1 brought straight down from row n contribute xUn (x2 ) to Un+1 (x). The numbers brought down to the left contribute Un (x2 ), and to the right contribute x2 Un (x2 ). Hence Un+1 (x) = (1 + x + x2 )Un (x2 ). Using the initial condition U0 (x) = 1, we obtain n−1



Un (x) = ∏ (1 + x2 + x2⋅2 ) , n ≥ 1. i=0

Professor Blakeley was quite pleased. “This is a good analogue of the binomial theorem, because it is a nice product formula. I should point out that it is more traditional to use a similar array that begins with two 1’s in the first row. It looks as follows,” he said while writing 1 1 1 1 4 1 5 4 7

2 3 2 3 3 5 2 5 3 4 3 8 5 7 2 7 5 8 3 7 4 5 ⋮

1 1 1 . 1 1

“This array is called Stern’s diatomic array. It is essentially equivalent to Stern’s triangle S I defined previously, but I personally prefer S because of its more elegant properties such as the analogue of the binomial theorem. “Who can tell me the alternating sum ∑k (−1)k ⟨ nk ⟩ of the entries in row n of S?” Hands shot up quickly, and Daniel yelled, “3n−1 !” “And why is that? Daniel, why don’t you explain?” Daniel said, “Just let x = −1 in the product formula for Un (x).” “Yes, this is completely analogous to showing that ∑k (−1)k (nk) = 0 by setting x = −1 in (x + 1)n = ∑k (nk)xk . However, I am sad to say that 3n−1 is not the correct answer.” The students were mystified. They were not yet sufficiently familiar with Professor Blakeley’s inclination toward literal-mindedness.



“You don’t see it? I am shocked! The formula 3n−1 fails when n = 0,” he said. “Let’s see if we can do a better job with k n ∑ ω ⟨ ⟩, k k where ω = e2πi/3 , a primitive cube root of unity.” Again many hands were quickly raised, and Professor Blakeley asked Sandra to explain. She had learned her lesson. “It’s 0 if n ≥ 1 and 1 if n = 0, by letting x = ω in Un (x) for n ≥ 1 and using 1 + ω + ω 2 = 0. We also need that 2i and 2 ⋅ 2i are incongruent and nonzero modulo 3.” “Excellent, excellent!” said Professor Blakeley. “Note the same is true if we k replace ω with e2πi/(3⋅2 ) , provided n > k. What about the sum n n n n n n ⟨ ⟩ + ⟨ ⟩ − ⟨ ⟩ + ⟨ ⟩ + ⟨ ⟩ − ⟨ ⟩ + .... 0 1 2 3 4 5 The signs go + + − + + − + + − + + − ⋯.” Fumei was the first to raise her hand. “Any sequence a0 , a1 , . . . that is periodic of period three is a linear combination of the sequences bn = 1, bn = ω n , and bn = ω 2n . We just have to figure this out for 1, 1, −1, 1, 1, −1, . . . .” “That’s the basic idea,” said Professor Blakeley. “How about some details?” Patrick was chosen to come to the board. “We need to solve the linear equations a+b+c a + ωb + ω 2 c a + ω 2 b + ωc

= = =

1, 1, −1.

Then the answer will be aUn (1) + bUn (ω) + cUn (ω 2 ). Wait a second, we have Un (1) = 3n and Un (ω) = Un (ω 2 ) = 0 (for n > 0 of course). So the answer is a3n .” “And what’s a?” asked Professor Blakeley. “Just add the equations,” said Fumei. “We get a = 1/3.” “Exactly! Curiously we get the answer 3n−1 , the same as for the alternating sum ⟨ n0 ⟩ − ⟨ n1 ⟩ + ⟨ n2 ⟩ − ⟨ n3 ⟩ + . . . . I’m sure you can think of some generalizations, so let’s switch gears—can anything interesting be said about n 2 u2 (n) ∶= ∑ ⟨ ⟩ ?” i i 2

) It was clear that the elegant proof by generating functions that ∑i (ni) = (2n n was not going to extend, since Um (x)Un (x) ≠ Um+n (x). The best that the students could come up with is that u2 (n) is the middle coefficient of Un (x)2 , but they did not see what to do with this. Meanwhile several students computed the values 1, 3, 13, 59, 269, 1227, 5597, 25531, . . . and looked them up in OEIS. Amazingly this sequence appeared with a simple rule, but with no reference to Un (x) or to the name Stern, so they now had the conjecture (6.3)

u2 (n + 1) = 5u2 (n) − 2u2 (n − 1),

with the initial conditions u2 (0) = 1 and u2 (1) = 3.



“A very elegant conjecture!” said Professor Blakeley. “Let’s take a little break and see if you can come up with a proof. Feel free to collaborate.” After only five or so minutes, Daniel and Jung Wook said that they think they had it. Jung Wook explained, “We need to look at an auxiliary function n n ⟩. u1,1 (n) = ∑ ⟨ ⟩⟨ i i+1 i Now each element in row n is either equal to the element above it or to the sum of the two neighboring elements above it, so 2 n 2 n n ⟩) u2 (n + 1) = ∑ ⟨ ⟩ + ∑ (⟨ ⟩ + ⟨ i i i+1 i i

= 3u2 (n) + 2u1,1 (n).

(6.4) Similarly,

n n n n n n ⟩) + ∑ (⟨ ⟩ + ⟨ ⟩) ⟨ ⟩ ∑ ⟨ ⟩ (⟨ ⟩ + ⟨ i i i+1 i i+1 i+1 i i

u1,1 (n) = (6.5)

2u2 (n) + 2u1,1 (n).


We can restate the recurrences (6.4) and (6.5) in the matrix form [

u2 (n + 1) 3 2 u (n) ]=[ ][ 2 ], u1,1 (n + 1) u1,1 (n) 2 2


u2 (n + 1) 3 2 u (1) ]=[ ] [ 2 ]. 2 2 u1,1 (n + 1) u1,1 (1)

so n

3 2 ] is x2 −5x+2, so A2 −5A+2I = 0. 2 2 Therefore An−1 (A2 − 5A + 2I) = 0. Apply this to the vector

The minimum polynomial of the matrix A = [


u2 (1) 3 ] = [ ], 2 u1,1 (1)

and we get the desired recurrence.” So quick!, thought Professor Blakeley, I took two days to find this argument. “Fantastic! Note that this result is simpler than the corresponding one for Pascal’s 2 ), and triangle. For Pascal’s triangle we have ∑i (ni) = (2n n 2n n 1 , ∑ ( )x = √ 1 − 4x n≥0 n an algebraic function of x, i.e., it satisfies a nonzero polynomial equation whose coefficients are polynomials in x. Here the equation is (1 − 4x)y 2 − 1 = 0. On the other hand, since u2 (n) satisfies a linear recurrence with constant coefficients, the generating function will be rational (a quotient of two polynomials), a special case of algebraic functions which is much simpler and more tractable. More specifically, we have 1 − 2x n . ∑ u2 (n)x = 1 − 5x + 2x2 n≥0



I am assuming you know the theory of such recurrences and how to deduce, for instance, that √ √ √ √ n n 5 + 17 5 − 17 17 17 1 1 )( ) +( − )( ) . u2 (n) = ( + 2 34 2 2 34 2 √


Thus u2 (n) grows like ( 5+ 2 17 ) = (4.56155 . . . )n , while u1 (n) = ∑k ⟨ nk ⟩ = 3n . “Now that we understand u2 (n), let’s see if we can do something with n 3 u3 (n) = ∑ ⟨ ⟩ .” k k Now that the students understood the technique, it didn’t take long for them to come up with a solution. This time one needs the auxiliary functions n 2 n ⟩ u2,1 (n) = ∑ ⟨ ⟩ ⟨ k k+1 k and similarly for u1,2 (n). However, because of the symmetry n n ⟩ ⟨ ⟩ = ⟨ n+1 −2−k k 2 we have u2,1 (n) = u1,2 (n). It is then easy to check that for n ≥ 1 we have u3 (n) = u2,1 (n) =

3u3 (n − 1) + 6u2,1 (n − 1), 2u3 (n − 1) + 4u2,1 (n − 1).

At this point Professor Blakeley asked, “Can someone find a formula in their heads for u3 (n) when n ≥ 1?” Professor Blakeley called on Emiliano after several hands went up. “The coefficient matrix has eigenvalues 7 and 0 since its determinant is 0 and its trace is 7, so u3 (n) = c ⋅ 7n for some constant c. Putting n = 1 gives u3 (n) = 3 ⋅ 7n−1 .” 3 “Great! A remarkably simple formula. There is no counterpart for ∑i (ni) . In 3

fact, one can show that the generating function for ∑i (ni) is not even algebraic (a 3 6 good challenge). For some reason, the coefficient matrix [ ] has determinant 2 4 0, so we drop one degree from the ‘expected.’ That is, the denominator of the generating function ∑n≥0 u3 (n)xn has degree one, not two. I should also point out that the formula u3 (n) = 3 ⋅ 7n−1 fails for n = 0, because the minimal polynomial of 3 6 [ ] is x(x − 7), not x − 7. 2 4 r “I hope it is clear that this method can be applied to ur (n) ∶= ∑k ⟨ nk ⟩ for any positive integer r. One has to check that we end up with only finitely many linear recurrences, but this isn’t so difficult. Here are the recurrences we get for r = 4, 5, 6: u4 (n) = u5 (n) = u6 (n) =

10u4 (n − 1) + 9u4 (n − 2) − 2u4 (n − 3), 14u5 (n − 1) + 47u4 (n − 2), 20u6 (n − 1) + 161u6 (n − 2) + 40u6 (n − 3) − 4u6 (n − 4).

It is an interesting project to look at further properties of these recurrences, such as their length, but I will not discuss this further.



“But I will say that this phenomenon is much more general and not connected with the triangle S. As just one random example, let n−1




Fn (x) = ∏ (1 + 2x3 − x2⋅3 + x3⋅3 ) ,



and let vr (n) be the sum of the rth powers of the coefficients of Fn (x). Then we have v2 (n) = 8v2 (n − 1) − 9v2 (n − 2), v3 (n) = 15v3 (n − 1) − 59v3 (n − 2) + 183v3 (n − 3), v4 (n) = 32v4 (n − 1) + 278v4 (n − 2) − 1600v4 (n − 3) − 867v4 (n − 4).

“This is something I will leave you to mull over at your leisure. But now I would like to discuss a completely unexpected property of Stern’s triangle. For convenience set ⟨ nk ⟩ = 0 if k > 2n − 2. First note that the rows of the Stern triangle are stable, that is, for any k ≥ 0 the sequence ⟨ k0 ⟩, ⟨ k1 ⟩, ⟨ k2 ⟩, . . . eventually becomes constant, say with value bk+1 (rather than bk in order to agree with established notation). In fact, the (n + 1)st row begins with the first half of the nth row. Obviously we have i


k 2 2⋅2 ∑ bk+1 x = lim Um (x) = ∏ (1 + x + x ) . k≥0



We are taking the limit coefficientwise. Since for fixed k the coefficient of xk becomes constant as m → ∞, there is no problem with convergence in taking the limit. Set b0 = 0. The sequence b0 , b1 , . . . is called Stern’s diatomic series or Stern’s diatomic sequence (or the Stern-Brocot sequence), after an 1858 paper of Moritz Abraham Stern. It begins as follows: 0, 1, 1, 2, 1, 3, 2, 3, 1, 4, 3, 5, 2, 5, 3, 4, 1, 5, 4, 7, 3, 8, 5, 7, 2, 7, 5, 8, . . . . Its most amazing property is the following: every nonnegative rational number occurs exactly once as a ratio bi /bi+1 of two consecutive terms, and this fraction is in lowest terms!” Was this another of Professor Blakeley’s jokes? There didn’t seem to be any connection between the hypothesis and the conclusion. “I know what some of you must be thinking,” said Professor Blakeley, “but this is really a true theorem. It is certainly one of the most startling results in mathematics. It can be directly proved by induction using the recurrence b2n = bn and b2n+1 = bn + bn+1 (essentially equivalent to the definition of Stern’s triangle), but it is more easily understood by introducing an intermediate object known as the Calkin-Wilf tree T . This is an infinite binary tree with root labeled by the fraction 1/1. We then label the other vertices recursively using the following rule: if a vertex is labeled a/b, then its left child is labeled a/(a + b) and its right child (a + b)/b. Here are the first few levels.”



1/1 1/2


1/3 1/4

3/2 4/3



2/3 2/5

3/1 5/3



“It’s not difficult to prove the following three facts, which immediately imply the desired result. ● If we read the numerators of the labels of T in the usual reading order, then we obtain the sequence b1 , b2 , . . . , i.e., the Stern diatomic sequence except for the first term 0. ● If we read the denominators of the labels of T in the same reading order, then we obtain the sequence b2 , b3 , . . . . ● Every positive rational number occurs exactly once as a label. “The Calkin-Wilf tree is named after Neil Calkin and Herbert Wilf, who published their paper on this subject in 2000. This is a good example of Stigler’s law of eponymy which I discussed earlier. The earliest explicit appearance of the CalkinWilf tree seems to be a 1997 paper of Jean Berstel and Aldo de Luca, who called it the Raney tree. However, similar trees go back to Stern and even to Kepler in 1619. “Here is a little problem on the Calkin-Wilf tree. What is the sum f (n) of the elements at height n? Let the top height be 1, so f (1) = 1, f (2) = 12 + 2 = 52 , etc.” A couple of minutes later Fumei and Clayton raised their hands at about the same time. Professor Blakeley asked Fumei to explain. She said, “In all rows but the first, the fractions come in pairs ab and ab . Their children sum to a+b b a+b a b a + + + =3+ + . a+b b a+b a b a There are 2n−2 such pairs at height n, so we get f (n + 1) = f (n) + 3 ⋅ 2n−2 . This recurrence is easy to solve, but I didn’t do it yet.” “Good job!” said Professor Blakeley. “Indeed, the solution is f (n) = 1 (3 ⋅ 2n−1 − 1). 2 “There is a simple connection between Stern’s triangle and Stern’s diatomic sequence. Namely, you can check that row n of Stern’s triangle has the form u, 1, ur , where u is the sequence consisting of the first 2n −1 term of Stern’s diatomic sequence, and ur is its reverse. From this observation we obtain 2m −1

1 r ∑ bi = (ur (m) − 1). 2 i=1

“Let’s look at a further aspect of Stern’s triangle, this time number-theoretical. We discussed earlier Lucas’s theorem and Kummer’s theorem, which gave some



divisibility properties of (nk). We can ask similar questions for Stern’s triangle. The behavior seems to be quite different. In particular, for 0 ≤ a < m let n gm,a (n) = # {k ∶ 0 ≤ k ≤ 2n+1 − 2, ⟨ ⟩ ≡ a (mod m)} . k Set Gm,a (x) = ∑n≥0 gm,a (n)xn . I did some experiments that suggest that Gm,a (x) is a rational function. For instance, it seems that G2,0 (x) = G2,1 (x) = G3,0 (x) = G3,1 (x) = G3,2 (x) = G4,0 (x) = G4,1 (x) = G4,2 (x) = G4,3 (x) = G5,0 (x) = G5,1 (x) = G5,2 (x) = G5,3 (x) = G5,4 (x) =

2x2 , (1 − x)(1 + x)(1 − 2x) 1 + 2x , (1 + x)(1 − 2x) 4x3 , (1 − x)(1 − 2x)(1 + x + 2x2 ) 1 + x − 4x3 − 4x4 , (1 − x)(1 − 2x)(1 + x + 2x2 ) 2x2 + 4x4 , (1 − x)(1 − 2x)(1 + x + 2x2 ) 4x4 , (1 − x)(1 + x)(1 − 2x)(1 − x + 2x2 ) 1 + x − 2x2 − 4x3 , (1 − x)(1 + x)(1 − 2x) 2x2 , (1 + x)(1 − 2x)(1 − x + 2x2 ) 4x3 , (1 − x)(1 + x)(1 − 2x) 4x4 , (1 − x)(1 + x)(1 − 2x)(1 − x + 2x2 ) 1 − x2 − x4 − 8x5 + 5x6 − 4x7 − 16x8 + 8x9 − 32x10 − 32x11 , (1 − x)(1 + x)(1 − 2x)(1 + x2 )(1 − x + 2x2 )(1 − x2 + 4x4 ) 2x2 + 8x5 + 2x6 − 4x7 + 12x8 − 16x10 , (1 + x)(1 − 2x)(1 + x2 )(1 − x + 2x2 )(1 − x2 + 4x4 ) 4x3 + 4x5 + 4x6 + 12x7 − 4x8 + 16x10 , (1 + x)(1 − 2x)(1 + x2 )(1 − x + 2x2 )(1 − x2 + 4x4 ) 4x4 − 4x5 + 8x6 + 8x7 + 8x8 + 16x10 + 32x11 . (1 + x)(1 − 2x)(1 + x2 )(1 − x + 2x2 )(1 − x2 + 4x4 )

The cases G2,0 (x) and G2,1 (x) are easy exercises, but the others don’t seem so easy. Does anyone have an idea about proving rationality?” Some of the students thought about whether there was some analogue of the proof that ∑n u2 (n)xn is rational. There one needed to introduce a new function u1,1 (n). After some thought Daniel ventured, “It seems that we need to look at the function gm,a,b (n) that counts the number of k for which ⟨ nk ⟩ ≡ a (mod m) and n ⟨ k+1 ⟩ ≡ b (mod m). I’m not sure whether this will be sufficient.”



“You are right on the button! Here is what happens. In Daniel’s definition of gm,a,b (n), we need to allow the values k = −1 and k = 2n − 2, giving the pairs n ⟩, ⟨ n0 ⟩) = (0, 1) and (⟨ 2nn−2 ⟩, ⟨ 2nn−1 ⟩) = (1, 0), contributing to gm,0,1 (n) and (⟨ −1 n ⟩ are easily seen to be relatively prime, gm,1,0 (n), respectively. Since ⟨ nk ⟩ and ⟨ k+1 we have gm,a,b (n) = 0 unless a, b generate the additive group Z/mZ. In that case we obtain the recurrences (6.7)

gm,a,b (n + 1) = gm,a,b−a (n) + gm,a−b,b (n),

where a, b generate Z/mZ and the subscripts b − a and a − b are taken modulo m. Note that if a, b generate Z/mZ, then so do the pairs a, b − a and a − b, b. Thus we get a system of linear recurrences of the same type as we got for u2 (n) and u3 (n), so the generating functions Gm,a,b (x) = ∑ gm,a,b (n)xn n≥0

are rational. It is then clear that Gm,a (x) is also rational, since Gm,a (x) = ∑b Gm,a,b (x). “Let’s see if we can glean some further information about gm,a,b . First, how big is the matrix recurrence we need to solve to compute Gm,a,b ? In other words, for how many pairs (a, b) ∈ Z/mZ × Z/mZ is it the case that a.b generate Z/mZ (as an additive group)?” Several hands went up, and Sandra was selected. “If m is a prime power pk , then we want to count pairs (a, b) for which not both a and b are multiples of p. There are pk−1 multiples of p in Z/pk Z, so the number of generating pairs is (pk )2 − (pk−1 )2 = p2(k−1) (p2 − 1).” “Great! And what if m is not a prime power?” “The Chinese remainder theorem should be applicable, though I would have to think a bit more.” “Exactly! It is a simple application of the Chinese remainder theorem which I leave as an exercise that if f (m) is the number of pairs (a, b) ∈ Z/mZ × Z/mZ for which a, b generate Z/mZ, then f (m) is multiplicative, i.e., f (m)f (r) = f (mr) if m and r are relatively prime. It follows that p2 − 1 . 2 p∣m p

f (m) = m2 ∏

We can write this another way as follows. The Dedekind psi function is defined by p+1 . p∣m p

ψ(m) = m ∏

Recalling that the Euler phi function satisfies p−1 , p∣m p

φ(m) = m ∏

we see that f (m) = ψ(m)φ(m). “Who can explain the factor 1 − 2x in the denominator of the examples that I showed you?” Emiliano was chosen. “The matrix A corresponding to the recurrence (6.7) has two 1’s in every row, while all other entries are 0. Hence the all 1’s vector



is an eigenvector with eigenvalue 2, yielding the factor x − 2 in the characteristic polynomial. This corresponds to a factor 1 − 2x in the denominator of Gm,a,b (x).” “Good explanation! Conceivably there could be a factor 1−2x in the numerator which cancels the denominator factor. It seems plausible, however, that the number n ⟩ ≡ b (mod m) should be, in the gm,a,b (n) of k for which ⟨ nk ⟩ ≡ a (mod m) and ⟨ k+1 limit as n → ∞, a positive fraction of all the entries of the nth row of Stern’s triangle. Since there are 2n+1 − 1 entries, this means that (6.8)

gm,a,b (n) ∼ cm,a,b 2n

for some constant cm,a,b > 0. Hence the factor 1 − 2x should appear in the denominator but not in the numerator. “Can we make this reasoning more precise?” Jung Wook was the first to volunteer. “If we divide the matrix A by 2, then we get the matrix of a finite-state Markov chain. I think we want to compute the stationary distribution.” “Fantastic idea!” said Professor Blakeley, once more impressed by the knowledge of the students and how quickly they could access it. “I will explain further for those of you like Jung Wook who are familiar with Markov chains. It’s easy to see that every column sum of the matrix A/2 is equal to 1. (We already noticed this for row sums.) Hence the all 1’s row vector v is a left eigenvector of A/2 with eigenvalue 1. Now by the theory of Markov chains v/Z gives the stationary distribution, where Z is the normalizing factor φ(m)ψ(m) so that the entries sum to 1. Since v is the vector of all 1’s, the stationary distribution is just the uniform distribution on pairs (a, b) which generate Z/mZ. In other words, the constants cm,a,b of equation (6.8) are given by cm,a,b = 1/φ(m)ψ(m). “We can also ask for the proportion cm,a of numbers ⟨ nk ⟩, 0 ≤ k ≤ 2n+1 − 2, that are congruent to a modulo m. Equivalently, gm,a (n) ∼ cm,a 2n . It is pretty straightforward to obtain the value of cm,a from the fact that cm,a,b = 1/φ(m)ψ(m), assuming as usual that a, b generate Z/mZ. As an example, the pairs (a, b) that generate Z/4Z are (1,1), (1,2), (1,3), (1,4), (2,1), (2,3), (3,1), (3,2), (3,3), (3,4), (4,1), (4,3). Of the 24 numbers in these twelve pairs, the number 2 appears 4 = 16 . In general, the answer turns out to be four times. Hence c6,2 = 24 (6.9)

cm,a =

mφ(d) , dφ(m)ψ(m)

where d = gcd(m, a). For instance, for m = 12 we get (c12,0 , . . . , c12,11 ) =

1 (2, 6, 3, 4, 3, 6, 2, 6, 3, 4, 3, 6). 48

I have put the formula (6.9) for cm,a as a problem on the afternoon’s problem set. “These results on cm,a,b and cm,a can be carried over directly to Stern’s diatomic sequence, because as noted before, row n of Stern’s triangle has the form u, 1, ur , where u is the sequence consisting of the first 2n − 1 term of Stern’s diatomic sequence, and ur is its reverse. We obtain the following. “Let a, b generate Z/mZ. Let hm,a,b (n) = #{bi ∶ 1 ≤ i ≤ n, bi ≡ a (mod m), bi+1 ≡ b (mod m)}.



Similarly, for any a ∈ Z/mZ let hm,a (n) = #{bi ∶ 1 ≤ i ≤ n, bi ≡ a (mod m)}. Then (6.10) (6.11)

hm,a,b (n) n→∞ n


1 , φ(m)ψ(m)

hm,a (n) n


mφ(d) , where gcd(m, a) = d. dφ(m)ψ(m)




“Note that given d∣m, the number of integers a ∈ [m] satisfying gcd(m, a) = d is equal to φ(m/d). Hence from equation (6.11) we obtain the identity 1 = ∑ φ(m/d) d∣m

mφ(d) , dφ(m)ψ(m)

or equivalently (after interchanging d and m/d), (6.12)

φ(m)ψ(m) = ∑ dφ(d)φ(m/d). d∣m

Naturally such an elementary identity cannot be too hard to prove directly. The most straightforward proof is to use the fact that the functions φ(m), ψ(m), and m are multiplicative. Moreover, if f (m) and g(m) are multiplicative, then so are f (m)g(m) and ∑d∣m f (d)g(m/d). It follows that both sides of equation (6.12) are multiplicative. Hence we need to verify this equation only when m is a prime power pk , which amounts to showing that k−1

p2(k−1) (p − 1)(p + 1) = pk−1 (p − 1) + ∑ pi pi−1 (p − 1)pk−i−1 (p − 1) + pk (p − 1)pk−1 , i=1

which is straightforward. “To check your understanding of this proof technique, I have put on today’s problem set the question of finding the unique function f ∶ P → Q satisfying f (1) = 1 and ∣f (n)∣ = ∑ f (d). d∣n

Express your answer in terms of the functions ω(m) (the number of distinct prime divisors of m) and Ω(m) (the total number of prime divisors of m, i.e., Ω(∏ pai i ) = ∑ ai ). For instance, ω(12) = 2 and Ω(12) = 3. “There are some other observations about the generating functions Gm,a (x) that are mysterious to me. For instance, why are there factors 1 − x and 1 + x in lots of denominators? More generally, why do the denominators have so many factors? Why do some numerators have only one term? In addition to the data I showed you earlier, the numerators have just one term for G6,0 (x), G6,3 (x), G7,0 (x), G8,4 (x) (but not G8,0 (x)), G9,0 (x), G10,5 (x) (but not G10,0 (x)), G11,0 (x) (but not G13,0 (x)), etc. There are also some strange ‘coincidences’, such as G4,0 (x) = G5,0 (x) and G6,0 (x) = G9,0 (x). Moreover, there is a strong tendency for the highest few numerator coefficients to be powers of 2, up to sign. In fact, the leading coefficient in all examples I checked is always a power of 2, up to sign. Some especially striking examples are the numerators 4x4 + 4x6 + 16x8 of G6,5 (x), 4x6 − 8x7 + 16x8 of G10,0 (x), and 4x6 + 8x7 + 16x8 of G13,0 .



“Numerous generalizations suggest themselves. What about congruence conditions on k consecutive terms of row n of Stern’s triangle or of Stern’s diatomic sequence? More general products such as (6.6)? “Okay, I hope I have convinced you that Stern’s triangle and the Stern diatomic series are fantastic mathematical creatures! It’s true that they lack the richness of Blakeley’s triangle, but nevertheless they are not unworthy of mathematical consideration.”


Independence Day PROBLEM LIST Probability that five numbers are chosen in increasing order k-tuples of subsets of [n] with empty intersection k-element subsets of [p] whose elements sum to 0 (mod p) k-element subsets of [n] whose elements sum to 0 (mod n) Probability that the cycle containing 1 has length k Probability that 1 and 2 are in the same cycle of a random permutation Expected minimum value of n numbers from [0, 1] 1 m n ∫0 x (1 − x) dx 1 9 8 4 ∫ ∫ ∫R x y z w dx dy dz 1 1 2k ∫0 ⋯ ∫0 ∏1≤i ai for all i < j) are the first elements in each cycle of w written in standard form. “Now we can see why the probability that the length of the cycle containing 1 equals k of a random permutation w ∈ Sn does not depend on k. Instead of the cycle containing 1, we can equivalently look at the cycle containing n. If w ˆ = a1 a2 ⋯an , then the length of the cycle of w containing n is just n + 1 − w ˆ −1 (n), the position of n in w ˆ starting from the right. Since w ˆ −1 (n) is equally likely to have any value 1, 2, . . . , n, the proof follows.” 1

Mentioned in Chapter 4



After a short pause to allow the students to digest this elegant argument, Professor Blakeley continued, “Try the following: given a random permutation w ∈ Sn , what is the probability that 1 and 2 are in the same cycle?” Hands quickly shot up, and Professor Blakeley chose Clayton. “Rather than 1 and 2, we can equivalently ask for the probability that n − 1 and n are in the same cycle. This is just the probability that n − 1 follows n in w, ˆ so the probability is 1/2.” “Great!” said Professor Blakeley. “You all now seem to really understand this nice bijection w → w, ˆ which is sometimes called the fundamental bijection. So let’s move on. Choose n real numbers uniformly and independently from the interval [0, 1]. What is the expected value of mini xi , the minimum value of x1 , . . . , xn ?” “When n = 1 the expected minimum is 12 ,” volunteered Clayton. “Brilliant!” said Professor Blakeley. “What if n = 2?” “It should be 13 ,” replied Clayton. “In fact, it’s easy to see by computing 1 1 1 x dy dx, but obviously this isn’t the solution you are looking for. In general 1/2 ∫0 ∫x the answer should be 1/(n + 1).” “Your intuition is good, and it’s not so hard to come up with a computational proof. Can anyone think of a really elegant proof?” After not receiving an immediate response, Professor Blakeley continued, “Well, here is an argument. Choose points x1 , . . . , xn uniformly and independently on a circle of circumference 1. Then choose an additional point y on the circle. Cut the circle at y and then ‘straighten out’ into a unit interval [0, 1] where for definiteness say that moving from 0 to 1 corresponds to moving clockwise on the circle. What we have are n points x1 , . . . , xn uniformly and independently chosen from [0, 1]. Now by symmetry on the circle the expected distance between any two consecutive 1 . Hence the expected distance between y and the points among x1 , . . . , xn , y is n+1 1 first xi clockwise from y is n+1 , and this distance is just min xi . Thus the ‘hidden independence’ is among n + 1 points, not the original n points. “Great stuff, but we don’t have time to dawdle! Now that we’re warmed up, let’s try something a little different. Given integers m, n ≥ 0, evaluate the integral 1 m n ∫0 x (1 − x) dx.” Jung Wook was the first to raise his hand. “I think this is some evaluation of the beta function. You can prove it by induction and integration by parts.” “Ugh!” said Professor Blakeley. “You are correct, but induction should be a last resort. We want some nice conceptual argument.” Although the students knew that the solution somehow involved independence or uniformity, they were unable to figure out what Professor Blakeley had in mind. “Try to interpret the integral probabilistically,” suggested Professor Blakeley. “Let’s see,” said Jung Wook, “we can think of x as a number chosen uniformly in [0, 1]. Then xm is the probability that m other numbers u1 , . . . , um in [0, 1] are all less than x, and (1 − x)n is the probability that n other numbers v1 , . . . , vn in [0, 1] are all greater than x.” “Good start!” “I got it!” said Patrick. “It’s similar to the previous problem. We choose all m + n + 1 numbers ui , vj , x at once. There are (m + n + 1)! ways to order them. The number of orderings for which x is preceded by u1 , . . . , um and followed by



v1 , . . . , vn is m! n!. Hence 1


xm (1 − x)n dx =

m! n! .” (m + n + 1)!

“Exactly!” said Professor Blakeley. “You must admit it’s a neat way to evaluate an integral. To test your understanding, I have included on today’s problem set the triple integral 1 9 8 4 ∫ ∫ ∫ x y z w dx dy dz,



where R is the region of R defined by x, y, z ≥ 0 and x + y + z ≤ 1, and where w = 1 − x − y − z. This was a problem on the 1984 Putnam.2 “Here’s an interesting integral of a similar flavor, namely, for positive integers n and k, we have 3







2k ∏ (xi − xj ) dx1 ⋯dxn = 1≤i 1), since x1 is the first of the n points to be visited.” It was immediately clear that when n = 2 we have p2 = 1, and when n = 3 by symmetry p2 = p3 = 1/2. Jung Wook quickly did a “brute force” computation for n = 4 and obtained the surprising answer p2 = p3 = p4 = 1/3. Could it be that for general n we have pi = 1/(n − 1) for all 2 ≤ i ≤ n − 1? After waiting a while without further progress, Professor Blakeley decided to give a hint. “Consider when Rebecca is adjacent to the point xi , 2 ≤ i ≤ n, for the first time.” It wasn’t long before Daniel’s hand went up. “Note that when Rebecca is adjacent to xi , 2 ≤ i ≤ n, for the first time then Rebecca has not yet visited xi . Assume Rebecca is adjacent to xi for the first time. Then xi will be the last unvisited point if and only if Rebecca visits the other point adjacent to xi before reaching xi . By symmetry the probability for this is independent of i, so pi = 1/(n − 1) for 2 ≤ i ≤ n.” “Excellent!” said Professor Blakeley, and the other students had to agree. It was a quite amazing trick. “Are there other graphs with this property?” asked Fumei. “Clearly by symmetry the complete graph Kn will work, but is that all?” “A natural question! I assume that the property you have in mind is the following. Start at any vertex v of an n-vertex graph with no multiple edges and no loops from a vertex to itself. Do a random walk on the vertices by moving with equal probability from a vertex to one of its neighbors. Then for any vertex u ≠ v, the probability that u is the last vertex to be visited for the first time is 1/(n − 1). Fumei’s question was first raised by Ron Graham, and Lov´ asz and Winkler showed that the only such graphs are cycles and complete graphs. However, if you want the equal probability property to hold only for a particular start vertex v, then there are further examples. “You can think about this, but now let’s look at another clever problem. This one involves the concept of parking functions, which have recently shown up in some surprising contexts. Suppose that a one-way street has n parking spaces on one side, numbered 1, 2, . . . , n in the order they are encountered when driving down the street. There are n cars C1 , . . . , Cn . For each 1 ≤ i ≤ n, car Ci (or more accurately, its driver) has a preferred space 1 ≤ ai ≤ n. The cars park by the following algorithm. The cars enter the street one at a time in the order C1 , . . . , Cn . The car Ci goes to its preferred space ai and parks there if it is unoccupied. Otherwise it continues down the street and parks in the first available space.



“Sadly this street has a design flaw; there is a tall cliff at the end so any car unable to park will fall off the cliff and cannot try again.

“We say that the sequence a1 , . . . , an is a parking function if all the cars can park. For instance, if n = 5, then 4,2,5,4,3 is not a parking function since three cars will first go to 4 or 5, and there is only room for two of them. “This example makes it clear that a necessary condition to be a parking function is that for any 1 ≤ i ≤ n at most i of the terms aj satisfy aj ≥ n − i + 1. In fact, it is not hard to show that this condition is also sufficient, but we don’t need that here. Note, however, that this condition implies that every permutation of a parking function is a parking function. What we are interested in is the number fn of parking functions of length n. For instance, f1 = 1 (the only parking function being 1), f2 = 3 (from 11, 12, 21), and f3 = 16: 111 112 121 211 122 212 131 311 123 132 213 231

221 113 312 321.

We could have saved some time by noting that we have one permutation of the sequence 111, three of 112, three of 122, three of 113, and six of 123, so 16 in all. So what is fn ?” “These numbers look like the number of labelled trees on n + 1 vertices,” suggested Clayton. “If this is correct, then fn = (n + 1)n−1 by Cayley’s formula for trees.” “Great observation!” said Professor Blakeley. “For those unfamiliar with Cayley’s formula, it says that the number of trees on the vertex set [n] is nn−2 . I should point out that despite the appellation ‘Cayley’s formula,’ the first proof was due to Borchardt in 1860, and in fact it was stated without proof in 1857 by Sylvester. “Clayton’s conjecture suggests that there is some connection between parking functions and trees, and indeed several ingenious bijections are known. Can anyone think of a ‘direct’ argument that does not involve trees?” After receiving no response, Professor Blakeley continued, “Yes, this is quite tricky. Here is the clever proof due to Henry Pollak. Imagine that we have n + 1 parking spaces numbered 1, 2, . . . , n + 1 arranged counterclockwise in a circle.



“Each car Ci has a preferred space ai as before, but now 1 ≤ ai ≤ n + 1 since there are n + 1 spaces. The cars follow the same procedure as before, namely, they enter the circular street one at a time. Each car goes to their preferred space. If it is unoccupied, then they park there. Otherwise they drive counterclockwise and park in the first available space. Since the street is circular, all the cars will be able to park. Thus at the end there will be one unoccupied space u. It’s easy to see that the sequence (a1 , . . . , an ) is a parking function if and only if u = n + 1. “Now let 0 ≤ i ≤ n. We change the preferences to (a1 + i, . . . , an + i), where the terms are taken modulo n + 1 so they lie in the interval [1, n + 1]. Clearly the parking procedure simply gets rotated i spaces counterclockwise. Thus if u was unoccupied after following the parking procedure for (a1 , . . . , an ), then u + i will be unoccupied for (a1 + i, . . . , an + i). “Can anyone complete the proof?” Several hands shot up. Emiliano was chosen. “By what you said, for each (a1 , . . . , an ) there is exactly one value 0 ≤ i ≤ n for which (a1 + i, . . . , an + i) is a parking function. For fixed (a1 , . . . , an ) let us call these n + 1 sequences (a1 + i, . . . , an + i) equivalent, clearly an equivalence relation. Each car has n + 1 choices of a preferred space, so there are (n + 1)n preference sequences (a1 , . . . , an ). Since each equivalence class has n + 1 elements, there are (n + 1)n−1 of them. Since each equivalence class contains exactly one parking function, there are (n+1)n−1 parking functions of length n.” “Good explanation!” said Professor Blakeley. “A lot more can be done with parking functions, but we must move on. I’ll just leave a couple of problems for you to mull over. How many parking functions of length n contain just one 1? How many parking functions of length n remain a parking function when one 1 is removed from it?” Naturally some of the students could not resist working on these problems immediately, even when Professor Blakeley continued with the next problem. “It’s time to take a look at some herpetological mathematics. There are many fascinating problems dealing with tiling rectangles with various sets of tiles. Here we will be looking at tiles called snakes. We’ll look at some other rectangle tiling problems later.1 1

Chapter 13



“A snake on an m × n chessboard is a nonempty subset S of the squares of the board obtained as follows. Start at one of the squares and continue walking one step up or to the right, stopping at any time. The squares visited are the squares of the snake. Here is an example of the 8 × 8 chessboard covered with disjoint snakes:

“If we consider two snakes equivalent if they are translations of each other, then here are the eight inequivalent snakes of size four (that is, with four squares):

“As a warmup, how many inequivalent snakes have k squares?” There were soon many volunteers, from whom Fumei was selected. “Let ai be the number of squares in the ith row of the snake. If the snake has r rows, then the sequence (a1 , . . . , ar ) determines the snake. There are 2n−1 ways to write n as an ordered sum a1 + ⋅ ⋅ ⋅ + ar of positive integers (where r can vary), so there are 2n−1 snakes with n squares.” “Super! Hopefully everyone here knows that a sequence (a1 , . . . , ar ) of positive integers summing to n is called a composition of n. There are many ways to see that the number of compositions of n ≥ 1 is 2n−1 . One especially elegant argument is to write down a row of n dots. There are n − 1 spaces between the dots. Choose



any subset S of them, and place a vertical bar at the spaces in S. This divides the dots into ‘compartments.’ Reading the number of dots in each compartment from left-to-right gives a bijection between the subsets S and compositions of n. Since there are 2n−1 subsets of an (n − 1)-element set, it follows that there are 2n−1 compositions of n. “By the way, if we don’t care about the order of the summands, so we might as well write the summands in weakly decreasing order, then we obtain a partition of n. For instance, the five partitions of 4 are 4, 3+1, 2+2, 2+1+1, and 1+1+1+1. Who can tell me a simple explicit formula for the number p(n) of partitions of n?” Some of the students chuckled. They already had some familiarity with the function p(n). Professor Blakeley continued, “My question has an easy answer: no one can tell me such a formula. No such formula is known, and its existence is extremely unlikely in view of what is known about partitions. However, there are some amazing formulas for p(n) that I would not call simple. For example, let Ak (n) =

eπis(m,k)−2nm/k ,

0≤m a2 , a1 > an , and a1 + a3 + ⋅ ⋅ ⋅ + a2n−1 = a2 + a4 + ⋅ ⋅ ⋅ + a2n . If Alice plays Amanda’s strategy she will end up with a score of a2 + ⋅ ⋅ ⋅ + a2n . However, if she chooses a1 first and then after Bob’s move plays Amanda’s strategy, she will end up with more, since after Bob’s first move Alice’s score will be strictly larger than Bob’s.” “Nice!” said Professor Blakeley. “I don’t think a simple strategy is known for maximizing one’s score. The best known way to determine the strategy is by using dynamic programming techniques. “Since I mentioned high school algebra, let’s jazz things up by looking at noncommutative high school algebra. How difficult can it be to prove an identity from this subject? “Here is an example. Suppose that in a noncommutative (but of course associative) ring R with identity we have two elements x, y such that 1−xy is invertible. Show that 1 − yx is also invertible, and then (9.1)

(1 + x)(1 − yx)−1 (1 + y) = (1 + y)(1 − xy)−1 (1 + x). ”

Despite the simple statement, the students didn’t find this problem so easy. Eventually Jung Wook came up with the idea that if we can formally write (1 − yx)−1

= =

1 + yx + yxyx + yxyxyx + . . . 1 + y(1 + xy + xyxy + . . . )x


1 + y(1 − xy)−1 x,

then it would follow that (1 − yx)−1 = 1 + y(1 − xy)−1 x. “Hold on,” said Fumei. “How do you know this kind of power series expansion is valid?” “Just pass to some kind of noncommutative power series ring containing R,” replied Jung Wook. “I’m not sure about the details, but it should be straightforward.” “The theory of completion has it subtleties, even in the commutative case,” said Professor Blakeley. “In particular, a ring is not always contained in its completion (with respect to some ideal). We would like an elementary computational proof.” “Okay,” said Jung Wook, “but my argument does suggest that this is a correct formula for (1 − yx)−1 .” “An excellent insight! Can someone prove this formula?” It didn’t take long to come up with the straightforward verification (1 − yx)(1 + y(1 − xy)−1 x) = 1 − yx + y(1 − xy)−1 x − yxy(1 − xy)−1 x = 1 − yx + y(1 − xy)(1 − xy)−1 x = 1. Some of the students realized that one also needed to show (by a similar argument) that (1 + y(1 − xy)−1 x)(1 − yx) = 1, since in a ring (unlike a group) it is possible for ab = 1 but not ba = 1. “Now that we know that 1 − yx is invertible, how do we show that (1 + x)(1 − yx)−1 (1 + y) = (1 + y)(1 − xy)−1 (1 + x) ?”



asked Professor Blakeley. “If we could expand (1 − xy)−1 and (1 − yx)−1 in a power series, then there would be no problem since both sides become the sum of all alternating products of x’s and y’s, but a priori such an expansion is not justified.” This simple noncommutative identity proved surprisingly tricky. Professor Blakeley could have assigned it as a homework problem, but he decided instead to show the solution right away when no one seemed to be getting anywhere. Even Sam, when asked, could only give an embarrassed negative shake of his head. “Let u = (1 − xy)−1 and v = (1 − yx)−1 . Note that (1 − yx)y = y(1 − xy), and therefore yu = vy. Thus (1 + x)v(1 + y) = = =

v + xv + yu + xyu v + xv + yu + u − (1 − xy)u u + v + xv + yu − 1.

This last expression is symmetric with respect to the permutation (written in disjoint cycle form) (x, y)(u, v), so an affirmative answer follows. “In fact, the power series method is actually valid. A vast generalization is due to Daniel Krob. An informal way to state this generalization is that any identity in noncommutative variables which holds after formally writing (1−a)−1 = 1+a+a2 +. . . (provided this makes sense formally) continues to hold in any ring for which the identity is defined.” After a brief pause, Professor Blakeley continued, “This ‘noncommutative high school algebra’ reminds me of an amusing result of Alex Wilkie concerning ordinary high school algebra. Namely, if one uses the axioms: (1) x + y = y + x, (2) (x + y) + z = x + (y + z), (3) x ⋅ 1 = x, (4) x ⋅ y = y ⋅ x, (5) (x ⋅ y) ⋅ z = x ⋅ (y ⋅ z), (6) x ⋅ (y + z) = x ⋅ y + x ⋅ z, (7) 1x = 1, (8) x1 = x, (9) xy+z = xy ⋅ xz , (10) (x ⋅ y)z = xz ⋅ y z , (11) (xy )z = xy⋅z , then one cannot prove the (true) identity ((1 + x)y + (1 + x + x2 )y )x ⋅ ((1 + x3 )x + (1 + x2 + x4 )x )y = ((1 + x)x + (1 + x + x2 )x )y ⋅ ((1 + x3 )y + (1 + x2 + x4 )y )x . At first you might think that this shows that there are identities that can be stated but not proved within high school algebra, but note that none of the axioms involve subtraction. What is really being shown is that there are identities not involving subtraction that cannot be proved without using subtraction. Still, it is an interesting and rather counterintuitive result. “That reminds me of a somewhat similar though much simpler fact. A semiring satisfies all the axioms of a ring except the existence of additive inverses. For instance, the nonnegative integers N form a semiring. We can define primes, irreducibles, and unique factorization in semirings just as we do for rings. Does the semiring N have unique factorization?”



“Sure,” said Daniel. “Factorization is the same as in Z. Negative factors are irrelevant since −1 is a unit in Z.” “Very good! Now what about R≥0 [x], the semiring of all polynomials in x with nonnegative real coefficients?” “Same reasoning,” replied Daniel. “Polynomials with negative coefficients. . . .” He became silent. “Yes?” “Maybe I spoke too quickly, since polynomials with negative coefficients cannot be multiplied by a unit in R[x] (that is, by a nonzero real number) and become a polynomial with nonnegative coefficients. In fact, probably R≥0 [x] does not have unique factorization.” “Can anyone think of an example showing this?” asked Professor Blakeley. Sam was acting quite excited, so Professor Blakeley asked him if he had an answer. Sam walked up to the board and wrote: (9.2)

(1 + x3 )(1 + x + x2 ) = (1 + x)(1 + x2 + x4 ).

“Nice!” said Professor Blakeley. “I believe that you have found the simplest example. All the four factors are irreducible in R≥0 [x], but none is a unit times another. Of course over R[x] there is no problem because we have the complete factorization (1 + x)(1 − x + x2 )(1 + x + x2 ). “In fact, there is a lot of interest these days in the complexity of ‘subtractionfree’ computations. Only addition, multiplication, and division are allowed. For instance, the polynomial 1 − x + x2 can be computed in a subtraction-free way as (x3 + 1)/(x + 1). What polynomials (or rational functions), say in finitely many variables over the integers, can be computed without subtraction, and how many steps are necessary to compute them? Time is up for this class, but we can discuss it further during the afternoon sessions.”


An Aesthetical Error PROBLEM LIST A proof that π = 0 The operation x ∗ y = xlog y Fields F for which F + ≅ F ∗ ⊕ Z/2Z Maximum dimension of a space of n × n real nilpotent matrices Maximum dimension of a space of n × n real diagonalizable matrices Maximum dimension of a space of n × n real matrices whose nonzero matrices have a nonzero real eigenvalue Maximum dimension of a space of n × n real matrices for which 0 is the only real eigenvalue Maximum dimension of a space of n × n real matrices such that every nonzero matrix is nonsingular Maximum number of linear independent vector fields on an (n − 1)-sphere Dimensions of finite-dimensional division algebras over R A 9-dimensional space of real 16 × 16 matrices whose nonzero matrices are nonsingular Maximum dimension of a space of n × n real matrices such that every matrix has a real eigenvalue Maximum dimension of an affine space of real n × n nonsingular matrices


88 88 90 90 90 90 91 91 91 92 93 93 94



When Professor Blakeley entered the classroom the next morning, he turned on the overhead projector. “Let’s see if you can find the flaws, if any, in a couple of proofs,” he said. He pressed some keys on the laptop that was connected to the projector. On the screen appeared the following text. Define


dx . −1 1 + x2 π We know from elementary calculus that I = 2 . Set y = 1/x. When x = −1, then y = −1. Similarly y = 1 when x = 1. Moreover, dx = d(1/y) = −dy/y 2 . Hence 1 −dy I =∫ 2 −1 y (1 + y −2 ) 1 dy = −∫ −1 1 + y 2 = −I, I =∫

so π = 0. “Any comments?” asked Professor Blakeley. Clayton said, “We have found an inconsistency in mathematics. This is sad.” “On the contrary,” said Professor Blakeley, “it is a time for great rejoicing. You all know that the Clay Mathematics Institute has announced seven Millennium Prize Problems, each of which has a one million dollar prize for the solution. To receive the prize the solution ‘must be published in a refereed mathematics publication of worldwide repute’ (among other conditions). Only one of the problems (the Poincar´e conjecture) has been solved up to now, but if mathematics is inconsistent, then we can prove any statement by contradiction. Hence we should immediately write up the six proofs of the remaining problems and send them off to Annals of Mathematics. Once news of the inconsistency gets out there will be a huge stampede of mathematicians solving long-open problems.” Alas, it was clear to the students that there was no inconsistency, but rather a flaw, in Professor Blakeley’s argument. They considered it too simple to bother volunteering, so Professor Blakeley continued with another screen display. Define a binary operation ∗ on the set R+ of positive reals by x ∗ y ∶= xlog y . Since x ∗ y = e(log x)(log y) , it follows that ∗ is associative, commutative, and distributive over multiplication. Moreover, the number e = 2.718 . . . is an identity element for this operation, while 1 ∗ x = 1 for all x ∈ R+ . The inverse to x ≠ 1 under the operation ∗ is e1/(log x) . Thus R+ with the operations × (multiplication) and ∗ is a field. “Does anyone see the error?” asked Professor Blakeley. The students looked intensively at this argument, but they could not see anything wrong. The mysterious operation ∗ was simply the transport by the function et on R of the operation of multiplication. Since et also carried addition to multiplication, all the properties were clear. Professor Blakeley was becoming more and more agitated as the students searched for an error. Finally he could contain himself no longer. “Don’t students nowadays have any aesthetic sense? Take a look at the symbol ∶=. The colon



and the equals sign are too far apart! Here it is blown up,” he said as he showed another slide:


“Ugly, ugly, ugly! And on some versions of LATEX it’s even worse—the colon and equal sign are not lined up correctly! How can this problem be rectified?” The students were relieved that nothing was wrong with their mathematical reasoning, though they were concerned about their aesthetic judgment. The colon and equals sign looked okay to them. The two symbols could be moved closer using /hspace, but this seemed an inelegant solution. Emiliano suggested that there might already be a command for the correct symbol. “Exactly!” said Professor Blakeley. The mathtools package has the command /coloneqq. Compare the ugly colon followed by equals sign with the beautiful /coloneqq:



“Unfortunately there is no time for further discussion of mathematical typography, as we should get back to actual mathematics. However, this evening I suggest that you all read Kant’s Critique of the Aesthetic Power of Judgment.” Clayton raised his hand and asked, “Do you have any other examples of bad notation?” “An excellent question!” replied Professor Blakeley. “Perhaps the worst notation of all time is due to Barry Mazur. He was giving a lecture and wanted to annoy Serge Lang, a stickler for good notation, who was in the audience. When the letter Ξ (upper case Greek xi) is drawn by hand, it consists of three horizontal line segments:

“Mazur let Ξ be a complex number and considered the complex conjugate of Ξ divided by Ξ:”

After waiting for the laughter to die down, Professor Blakeley continued, “Some similar awful notation occurs in Chinese and Japanese arithmetic. For instance, the formula 11 + 2 − 1 = 12 becomes:

” The students who were not familiar with Chinese characters did not get it, while the others realized that Chinese and Japanese people would never combine (one), Western math symbols with Chinese characters for numbers, such as (two), and (ten). “By the way,” said Professor Blakeley, “the problem I showed you today about the operation x ∗ y is related to the fact that R∗ ≅ R+ ⊕ Z/2Z, where R∗ denotes the multiplicative group of R, and R+ the additive group. Which fields F have the ‘opposite’ property, namely, (10.1)

F + ≅ F ∗ ⊕ Z/2Z ? ”



This was not such a difficult question. Almost everyone saw that the presence of an element of order two in F + meant that F had characteristic 2, i.e., x + x = 0 for all x ∈ F . Since any finite subgroup of the multiplicative group of a field is cyclic, F + could have only one element of order 2. Hence the unique field (up to isomorphism) satisfying (10.1) is the finite field F2 . Professor Blakeley continued, “Now let’s take a look at some linear algebra, a little off the beaten path. The set Mn (R) of n × n real matrices is a real vector space of dimension n2 . Let’s consider some subspaces of Mn (R) of maximum dimension satisfying various properties. For instance, what’s the maximum dimension of a subspace V of Mn (R) all of whose elements are nilpotent, i.e., have only the eigenvalue 0?” “There is an obvious conjecture,” said Daniel. “Namely?” “Namely, (n2 ), achieved by the strictly upper triangular matrices.” “A good guess, but how to prove it? Note that it isn’t enough to show that the strictly upper triangular matrices form a maximal subspace of nilpotent matrices, as there might be a completely different such subspace of larger dimension.” To the amazement of everyone else, Sam soon raised his hand. Naturally Professor Blakeley was excited to have him answer. ), and Sam said, “The space V of real symmetric matrices has dimension (n+1 2 only the zero matrix is nilpotent (since symmetric matrices are diagonalizable). ) = n2 . If A and B are subspaces of an m-dimensional vector space Now (n2 ) + (n+1 2 and dim A + dim B > m, then A ∩ B ≠ {0}. Hence any subspace of Mn (R) of dimension greater than (n2 ) must contain a nonzero symmetric matrix, and the proof follows.” “Super!” said Professor Blakeley. “Great argument! Okay, who besides Sam can tell me the largest dimension of a subspace of Mn (R) for which every element is diagonalizable?” Several hands quickly shot up. Professor Blakeley pointed to Fumei. “Let Sn be the space of symmetric matrices as in Sam’s argument, and let Tn be the space of strictly upper triangular matrices. Sam explained why Mn (R) = Sn ⊕ Tn . Since every matrix in Sn is diagonalizable and no nonzero matrix in Tn is diagonalizable, ).” the space Sn must have the highest dimension, namely, (n+1 2 “Right!” said Professor Blakeley. “The decomposition Mn (R) = Sn ⊕ Tn gives us two results for the price of one! “What about the maximum dimension of a subspace X of Mn (R) such that every nonzero matrix in X has a nonzero real eigenvalue?” The students quickly realized that every nonzero symmetric matrix had all its eigenvalues real, with at least one of them nonzero, so it seemed plausible that ). No one could think of such a space X, however, and dim X > dim Sn = (n+1 2 finally the light dawned for Jung Wook. He said, “Let Skn denote the subspace of Mn (R) of skew-symmetric matrices A, i.e., A = −At . It has dimension (n2 ), and every eigenvalue of A is purely imaginary, i.e., has real part 0. Moreover, skewsymmetric matrices are diagonalizable since they are normal (i.e., AAt = At A), so every nonzero skew-symmetric matrix has at least one (actually, at least two for n ≥ 2) nonzero eigenvalues. Since dim Skn = (n2 ) = dim Mn (R) − dim Sn , it follows ), achieved by Sn .” that the maximum possible dimension is (n+1 2



“Right you are! It is curious that the maximum dimension of a subspace such that every nonzero matrix in it has at least one nonzero real eigenvalue is the same as the maximum dimension of a subspace such that every matrix has all its eigenvalues real. “I’m sure it will be easy for you to get some more mileage out of the decomposition Mn (R) = Sn ⊕ Skn . For instance, the maximum dimension of a subspace of Mn (R) for which 0 is the only real eigenvalue is (n2 ). “Now let’s turn to a much more challenging variant. Given 1 ≤ r ≤ n, what is the maximum dimension ρr (n) of a subspace Y of Mn (R) such that every nonzero element of Y has at least r nonzero eigenvalues? One value of r is easy. Who can tell me what this is?” Among the volunteers Professor Blakeley chose Patrick. “We just did the case ).” r = 1. We got ρ1 (n) = (n+1 2 “Looks good to me!” said Professor Blakeley. “Now let’s try the other extreme, namely, ρn (n). In other words, what is the maximum dimension of a subspace of Mn (R) for which every nonzero matrix is nonsingular? I don’t expect anyone to figure this out, but here is a little table.” n 1 2 3 4 5 6 7 8 9 10 11 12 ρn (n) 1 2 1 4 1 2 1 8 1 2 1 4 n ρn (n)

13 14 15 16 1 2 1 9

17 18 19 20 21 22 23 24 1 2 1 4 1 2 1 8

n 25 26 27 28 29 30 31 32 33 34 35 36 ρn (n) 1 2 1 4 1 2 1 10 1 2 1 4 “Does anyone see any patterns?” “ρn (n) = 1 if n is odd,” several students said. “Yes, that stares you in the face. Perhaps we can prove this special case.” It wasn’t long before hands went into the air. Emiliano was chosen. “Let A and B be two linearly independent nonsingular matrices in Mn (R), where n is odd. Let λ be a real eigenvalue (necessarily nonzero) of B −1 A with eigenvector x, which exists since the characteristic polynomial of B −1 A has odd degree. Then Ax = λBx, so A − λB is singular and nonzero.” “Not bad! The first step in solving this problem should be to ask how we can use the hypothesis that n is odd. In Emiliano’s solution, it boils down to the fact that a real polynomial of odd degree has a real zero. So now we are halfway there! What about even n?” First it was obvious to conjecture that ρn (n) = 2 when n ≡ 2 (mod 4). Soon after that the students conjectured that in fact ρn (n) depended only on the largest power 2u of 2 dividing n. The data gave the values 1, 2, 4, 8, 9, 10 for u = 0, 1, 2, 3, 4, 5. “An excellent observation!” said Professor Blakeley. “In fact, this conjecture is true, and here is a table for u ≤ 12.” u ρ2u (2u )

0 1 2 3 4 5 6 1 2 4 8 9 10 12

7 8 9 10 11 12 16 17 18 20 24 25

“Can someone tell me why ρn (n) ≤ n?,” continued Professor Blakeley. Several students quickly saw that if A1 , . . . , An+1 were linearly independent matrices in Mn (R) and 0 ≠ x ∈ Rn , then we must have α1 A1 (x) + ⋅ ⋅ ⋅ + αn+1 An+1 (x) = 0



for some α1 , . . . , αn+1 ∈ R, not all zero. But then α1 A1 + ⋅ ⋅ ⋅ + αn+1 An+1 is singular and nonzero. “After this things start getting tough,” said Professor Blakeley, “so I will just tell you some interesting facts. “Let n = (2m + 1)24a+b , where 0 ≤ b < 4. Then ρn (n) = 8a + 2b . This is a famous result of J. Frank Adams in 1962. Hurwitz, Radon, and Eckmann showed that ρ(n) ≥ 8a + 2b (you might try to do this yourself, using a hint I will give soon), and Adams showed equality. The proof of Adams is based on homotopy theory. By a nonvanishing vector field F on the sphere Sn−1 = {x ∈ Rn ∶ ∣x∣ = 1}, we mean a continuous assignment of a nonzero tangent vector Fx to each point x of Sn−1 . A set S = {F, G, . . . } of nonvanishing vector fields is linearly independent if the set of vectors Fx , Gx , . . . is linearly independent for all x ∈ Sn−1 . Perhaps you have heard of the ‘hairy ball theorem’ which states that there is no nonvanishing vector field when n − 1 is even. For the case n − 1 = 2, the hairy ball theorem is sometimes stated in more colorful terms like ‘you can’t comb the hair on a coconut.’ “It’s not so hard to see that the maximum number of linearly independent vector fields on Sn−1 is just ρn (n) − 1, and it is the vector field problem which Adams solved by topological methods. “Let us consider how we might find some subspaces of Mn (R) for which every nonzero matrix is nonsingular. Let A be an n-dimensional real division algebra, i.e., an n-dimensional real vector space with an operation of multiplication such that (a) R is contained in the center of A (that is, αx = xα for all α ∈ R and x ∈ A), (b) multiplication is left and right-distributive over addition, and (c) for any elements a ≠ 0 and b, the equations ax = b and ya = b have exactly one solution. We do not assume that multiplication is associative. “For a ∈ A define a map Ta ∶ A → A by Ta (x) = ax. Then Ta is a linear transformation on A which is nonsingular by (c) above. Moreover, it is clear that for α, β ∈ R we have Tαa+βb = αTa + βTb . It follows that the set {Ta ∶ a ∈ A} is an ndimensional subspace of the set of real linear transformations A → A such that every nonzero Ta is nonsingular. Thus we have achieved equality in the bound ρn (n) ≤ n which we proved earlier. Let me note that the space T = {Ta ∶ a ∈ A} need not be closed under multiplication; this will be the case precisely when multiplication in A is associative, in which case A ≅ T as R-algebras. “Now over the real numbers we have division algebras D of the following dimensions d: ● d = 1: of course here D = R. ● d = 2: now we can take D = C. An R-basis for D is given by [

1 0 ] and 0 1

0 1 ]. −1 0 ● d = 4: we can take D to be the famous algebra H (for Hamilton) of quaternions. Naturally this algebra isn’t commutative since C is algebraically closed. ● d = 8: now we can take D to be the algebra O of octonions, discovered independently by John T. Graves in 1843 and Arthur Cayley soon after. [



“It is easy to see from the formula ρn (n) = 8a + 2b that ρn (n) = n if and only if n = 1, 2, 4, 8. Thus we have shown (assuming the formula for ρn (n)) that every finite-dimensional division algebra over R has dimension 1, 2, 4, or 8. In fact, this fundamental result was not known until Adams proved the formula for ρn (n).” After a short break Professor Blakeley resumed. “Let me now show you that ρ16 (16) ≥ 9. Given that ρ8 (8) = 8 we clearly have ρ16 (16) ≥ 8, since we can replace A 0 the 8 × 8 matrix A with the 16 × 16 matrix [ ]. Once you understand how to 0 A add one more dimension, you can think about how to get ρn (n) ≥ 8a + b in general. “Let V be an 8-dimensional subspace of M8 (R) (constructed from the octonions) whose nonzero matrices are nonsingular. For each A ∈ V and α ∈ R, define a linear transformation τA,α ∶ R8 ⊕ R8 → R8 ⊕ R8 by τA,α (u, v) = (Av + αu, At u − αv). The set of all τA,α ’s form a 9-dimensional subspace W of M16 (R), so it remains to show that every nonzero element of W is nonsingular. If not, there would be a nonzero vector (u, v) in the kernel of some τA,α with (A, α) ≠ (0, 0), so Av + αu = 0, At u − αv = 0. Multiplying the first equation on the left by ut gives ut Av = −αut u. Multiplying the second equation on the left by v t and transposing gives ut Av = αv t v. Thus α(ut u + v t v) = 0 ⇒ α = 0. Then Av = 0 and At u = 0. Since at least one of u, v is nonzero, we see that A is singular and hence A = 0, a contradiction. “Now let D be the set of all proved theorems, and let Q be the theorem that all nonzero elements of W are nonsingular. Then Q∈D.” The students were impressed by these beautiful connections between linear algebra (spaces of nonsingular matrices), topology (vector fields on spheres), and algebra (division algebras over R). They could see that finding ρr (n) for any r was likely to be very difficult. They also realized that Professor Blakeley must really love this topic since he went over thirty minutes without telling a joke until the very end. “Given the above results, it is natural to ask for the maximum dimension of a subspace V of Mn (R) such that every matrix in V has at least one real eigenvalue (including 0). Any thoughts?” Some hands went up almost instantaneously. Sandra was selected. “If n is odd, then the maximum dimension is n2 . Every matrix in Mn (R) has a real eigenvalue since a real polynomial of odd degree has a real zero.” “Exactly! And what about even n?” This was less clear to Sandra. “A lower bound is n2 − n + 1,” she said, “achieved by matrices for which all the entries in the first row except possibly the first entry are 0. But I don’t immediately see whether one can do better. Clearly an upper bound is n2 − 1.” “Here is a fact that might be useful. Using the theory of Clifford algebras, one can construct a subspace W of Mn (R) of dimension ρn (n) such that no nonzero



matrix in W has a real eigenvalue. Note that this is best possible by the definition of ρn (n).” “I see,” said Sandra. “The spaces V and W have only 0 in common, so we get the improved upper bound (when n is even) of n2 − ρn (n). Is this best possible?” “I don’t know. Something to think about.” Professor Blakeley wasn’t quite finished. “I’ll mention one problem on today’s problem set. Show that for any field F, the maximum dimension of an affine subspace of Mn (F) for which every element is nonsingular is (n2 ), achieved for instance by the upper triangular matrices with 1’s on the main diagonal. Have fun!”


Miraculous Cancellation PROBLEM LIST Three states with three points in common to all three USA state closest to Africa Factorization of x2 − y 2 over C Factorization of f (x) − g(y) over C Prime numbers with n consecutive 7’s in their decimal expansion n × n magic square of distinct primes Can f (x)2 have fewer nonzero coefficients than f (x)? Is a certain subgroup H of the group G of all infinite sequences of integers a direct summand? With G as above, every homomorphism f ∶ G → Z is determined by its value on unit vectors G is not a free abelian group Every homomorphism ψ∶ G → Z vanishes on all but finitely many unit vectors


96 97 97 98 99 99 99 100 101 101 101



Professor Blakeley began the next day of class by saying, “Let’s warm up with an application of graph theory to geography. Do there exist three states of the United States with three points in common to all three of them?” None of the students had a very accurate map of the USA in their heads, so they pulled out their laptops, tablets, and phones to take a look. They could not find three states with the desired property. “Suppose we put a point in the interior of each of the three states,” said Professor Blakeley, “and connect it by three nonintersecting curves to each of the three points in common to all three of the states. Does anyone see what we have? Think graph-theoretically.” Jung Wook was the first to volunteer. “Let the three interior points and three intersection points be the vertices of a graph. Since each interior point is connected to each intersection point, the graph is isomorphic to the complete bipartite graph K3,3 .” Professor Blakeley drew a diagram of this graph on the board:

Then he said, “So what can we conclude?” A number of hands shot up, and Clayton was chosen. “We have an embedding of K3,3 without crossing edges on the surface of a sphere. This contradicts the fact that K3,3 is nonplanar. Therefore three such states do not exist.” “I am glad to see that you have some familiarity with planar graphs,” said the Professor, “but I think there must be a flaw in your reasoning because in fact three such states do exist.” What could be the flaw? It was unlikely that there was a contradiction in mathematics. After a while Sandra ventured, “I think we are assuming that the states are connected. If a state wasn’t connected we could use an interior point in each of its components, maybe giving a planar graph with seven vertices.” “Super! You are on the right track! Now which state isn’t connected?” “Michigan!” said several students. “The land area of Michigan isn’t connected,” said Professor Blakeley, “but the state as a whole is. There is water between the two land areas that still belongs to Michigan. I am afraid you have to look elsewhere.” The students stared at their screens but could not find a disconnected state. The Professor continued, “You need quite an accurate map to see this. If you look at the southwest corner of Kentucky, you will see that there is a tiny area separated from the rest of the state. It is then clear from the map that the three states are Kentucky, Missouri, and Tennessee.” He then drew a sketch on the board, where M = Missouri, K = Kentucky, and T = Tennessee.



Mississippi River



M T “Cool!” said Clayton, “but why does Kentucky have this extra component?” “Excellent question! At one time the western part of the southern boundary of Kentucky was defined to be the land north of a certain latitude and south of the Mississippi River. Then the river changed its course, creating the present situation. Subsequently the definition of Kentucky was changed so that this kind of phenomenon cannot happen again. “By the way, does anyone know of a geologically significant property of this tiny piece of land that got cut off from the rest of the state?” Unsurprisingly there were no volunteers, so Professor Blakeley answered his own question. “It lies in the center of the most hazardous area for earthquakes in the USA east of the Rocky Mountains. It was at or very near the epicenter of one of the three New Madrid earthquakes of 1811–12, the largest known earthquakes to occur in the USA east of the Rockies. “There are quite a few other ‘paradoxical’ geography facts, mainly based on the earth’s surface being a sphere rather than a plane, or more psychologically, on common misperceptions about the relative position of various geographical entities. As a nice example, which state of the USA is closest to Africa?” The natural guess seemed to be Florida, so the students realized that this could not be the answer. As you travel south down the east coast of the USA the coastline moves to the west away from Africa, so conceivably any east coast state could be the answer. Given the question, the least likely state was probably the correct answer, so after a brief discussion, but without consulting a map, the students guessed Maine. “Exactly! Okay, one final geography question, though not at all mathematical. If you move due east from Vol Hall at the University of Tennessee, you will eventually leave the USA. When you next reach land, in what country will you be?” Why such a specific starting point?, thought the students. Perhaps you will end up on the exact boundary of two countries and are unable to decide which is the correct answer. Eventually they were forced to look at some accurate maps. Fumei was the first to say, “Algeria!” “Right you are!” said Professor Blakeley. “Amusing because Algeria does not border on the Atlantic Ocean.” After dispensing with this geographical nonsense, Professor Blakeley returned to real mathematics. “Earlier1 we looked at a high school algebra identity. Let’s continue with another high school algebra topic, factoring polynomials. Who can factor x2 − y 2 over the complex numbers?” 1 Chapter




Clearly this was a rhetorical question, so no one volunteered. The Professor continued, “Note that this polynomial has the form f (x) − g(y). Suppose that f (x) and g(y) are any complex polynomials. If f (x) = g(x), then f (x) − g(y) will have a factor x − y. More generally, if f (x) − g(y) is reducible over C and if a(x), b(y) are nonconstant polynomials, then f (a(x)) − g(b(y)) will also be reducible. Let’s call a polynomial f (x) ∈ C[x] indecomposable if it cannot be written in the form f (a(x)), where a(x) is a polynomial of degree at least two. “The big question is whether there are any other cases for which f (x) − g(y) is reducible over C, where f and g are indecomposable, and there are no c, d ∈ C for which g(cx + d) = f (x). Any thoughts?” Some students thought of the factorization x4 + 4y 4 = (x2 − 2xy + 2y 2 )(x2 + 2xy + 2y 2 ). However, this factorization was basically irrelevant to the problem. It was an interesting example of a factorization over Q (or Z), but over C any polynomial xn + ty n (t ∈ C) factors into linear factors over C. Moreover, the polynomials f (x) = xn and g(y) = −ty n satisfy g(ax) = f (x), where a = (−t)−1/n . After several minutes of concentrated effort no examples were found. The consensus was that there were no other examples, though no one saw how to prove this. “Didn’t you learn factorization in high school?” said Professor Blakeley. “I might need to have your high school diplomas rescinded.” He then wrote down f (x) =



√ √ √ 7t t (1 + −7)x5 + (7 − −7)x4 − 7t2 (4 − −7)x3 2 2 √ √ t2 −2t2 (7 + 2 −7)x2 + (5 + 49t + (21t − 1) −7) x − 7t3 , 2 √ √ √ 7t t 7 5 −y + (1 − −7)y + (7 + −7)y 4 + 7t2 (4 + −7)y 3 2 2 2 √ √ t −2t2 (7 − 2 −7)y 2 − (5 + 49t − (21t − 1) −7) y − 7t3 , 2

x7 −

where t is any nonzero complex number. “I leave it to you to find a factorization of f (x) − g(y). It is obvious that there are no examples of smaller degree.” Emiliano then interrupted, “Why is this obvious?” “I am using the word ‘obvious’ to mean ‘I know that it is true’,” replied Professor Blakeley. “That reminds me of a little joke. A professor is lecturing to his students, writes a formula on the board, and states that it is obvious. A student interrupts and asks why it is obvious. The professor stares at the formula for several minutes and then walks out of the room. Half an hour later he returns and says, ‘I was right! It is obvious!’ “Assuming that indeed there are no examples f (x), g(y) to the original problem of degree less than seven, the question then arises, what other degrees are possible? Try to write up a complete, self-contained solution during the break.” Although the students were suspicious of Professor Blakeley’s request, some of them worked on the problem to no avail. After the break Professor Blakeley said, “No solutions? That is disappointing. It turns out that there are exactly five remaining degrees for f (x) and g(y), namely, 11, 13, 15, 21, and 31. (There are no examples where deg f ≠ deg g.) The proof uses the classification of simple groups.”



Although Professor Blakeley claimed to be disappointed about the lack of a solution by the students, secretly he was quite pleased. The original proof of the classification of finite simple groups involved about 15000 journal pages. Even the streamlined second-generation proof (not yet complete) was estimated to be 5000 pages. The Professor did not relish reading through thousands of pages of a purported solution, with the classification of finite simple groups as a lemma. “While the result about factoring f (x) − g(y) is certainly interesting,” said Professor Blakeley, “it is not nearly as important or useful as the tool used to solve it, namely, the classification of finite simple groups. Can someone give another good example of a large disparity between a result and the method used to prove it? I want an example where no simpler proof is known.” The question was somewhat vague, but after a short while Patrick said, “For any positive integer n, there exists a prime number p with at least n consecutive 7’s in its decimal expansion. This follows immediately from Dirichlet’s theorem on primes in arithmetic progression.” “Could you explain further?” “Dirichlet’s theorem says that for any relatively prime integers a, b, there are infinitely many primes of the form ax + b, where x ∈ P. Simply take a = 10n and b = 79 (10n − 1).” “Not such a bad example. My favorite is the theorem that for any n ≥ 1 there exists an n × n magic square of distinct prime numbers. By a magic square, I mean that all rows, columns, and the two long diagonals have the same sum. Does anyone see the easy proof, modulo a recent high-powered result?” Jung Wook’s hand went up quickly. “Green-Tao!” he said. “By the Green-Tao theorem there exists an arithmetic progression p1 , p2 , . . . , pn2 of (distinct) prime numbers. It is well known that there exist n × n magic squares M using the distinct integers 1, 2, . . . , n2 . Simply replace k by pk in M .” “Good job!” said Professor Blakeley. “A good example of using a sledgehammer to crack a walnut, though in this situation it might be the only tool available. So who can write down such a magic square of size n = 5?” No one volunteered, and the students suspected this was another of Professor Blakeley’s impossible questions. Indeed, the Professor continued, “No volunteers? The smallest example based on Jung Wook’s proof uses the arithmetic progression (11.1)

6171054912832631 + 366384 ⋅ 223092870k, 0 ≤ k ≤ 24.

The factor 223092870 is a primorial number, i.e., the product of the first j primes for some j. Here 223092870 = 2 ⋅ 3 ⋅ 5 ⋅ 7 ⋅ 11 ⋅ 13 ⋅ 17 ⋅ 19 ⋅ 23. “However, there is a 5 × 5 prime magic square with line sum 313, namely, ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣

59 107 71 23 53 ⎤ ⎥ 13 37 113 61 89 ⎥ ⎥ ⎥ 43 41 83 79 67 ⎥ . ” ⎥ 101 19 17 103 73 ⎥ ⎥ 97 109 29 47 31 ⎥ ⎦

After the break Professor Blakeley changed the subject slightly. “The miraculous cancellation necessary for the factorization of f (x)−g(y) reminds me of another problem. Let f (x) ∈ C[x]. Can f (x)2 have fewer nonzero coefficients than f (x)?”



Since the Professor mentioned miraculous cancellation, most likely the answer was affirmative. Moreover, this was the unexpected answer mathematically, hence the expected answer psychologically. However, the students had no idea how to find an example. Experimenting with some small degree polynomials got them nowhere. “Your intuition is accurate,” said Professor Blakeley. “There are no such polynomials of degree less than twelve. An example of degree 12 is given by (11.2)

pα (x) = (125x6 + 50x5 − 10x4 + 4x3 − 2x2 + 2x + 1)(αx6 + 1).

This polynomial has 13 nonzero coefficients for α ≠ 0, −125. For eight values of α ≠ 0, −125 (of which six are rational), including α = −110, pα (x)2 has precisely 12 nonzero coefficients. As a hint concerning where this polynomial came from, note that √ 1 + 4x = 1 + 2x − 2x2 + 4x3 − 10x4 + . . . . It is also known that there is a constant c > 0 such that for any n ≥ 1, there is a real polynomial of degree n − 1 whose square has fewer than cn0.81071... terms.” Professor Blakeley realized that he had yet to present a problem that day that the students had a chance of solving on their own, not counting the factorization of x2 − y 2 . “Let’s switch gears and try a problem about adding integers. What could be simpler? Let G be the abelian group of all infinite sequences (a1 , a2 , . . . ) of integers, with the operation of componentwise addition. It is surprising that there are a lot of interesting properties of this innocent-looking group. “For a starter, let H be the subgroup of G generated by all unit vectors en = (0, 0, . . . , 0, 1, 0, 0, . . . ), where the 1 is in the nth coordinate. Equivalently, H consists of all elements of G with only finitely many nonzero terms. Is H a direct summand of G, i.e., is there a subgroup K of G for which G = H ⊕ K? Any thoughts?” Clayton said, “Well, K would have to have the property that every element of G differs in only finitely many terms from a unique element of K. Is there even a subset of G with this property?” Hands quickly shot up, and Sandra was selected. “If we assume the Axiom of Choice, then there is no problem. In fact, for any subgroup B of any group A, there is a subset S of A such that every element of A can be uniquely written as b + s, where b ∈ B and s ∈ S. Simply choose an element from each left coset of B.” “Good way to put it!” said Professor Blakeley. “You can assume the Axiom of Choice for this problem, though actually it isn’t needed. Note that if we were looking instead at infinite sequences of rational numbers, then the answer would be affirmative, since any set of linearly independent elements of a vector space can be extended to a basis, assuming the Axiom of Choice. So does working over Z rather than Q make any difference?” Now the answer didn’t seem so obvious. Sam seemed to have his hand slightly raised, so Professor Blakeley asked him if he had an answer. “Consider the element g = (2, 4, 8, . . . , 2n , . . . ) ∈ G. In G/H we have for any n ≥ 1 that g

= =

(0, 0, . . . , 0, 2n , 2n+1 , 2n+2 , . . . ) 2n (0, 0, . . . , 0, 1, 2, 4, . . . ).



Thus in G/H, g is divisible by 2n for all n ≥ 1. Moreover, g ≠ (0, 0, . . . ) in G/K since g differs from (0, 0, . . . ) in infinitely many elements. But the only element of G that is divisible by 2n for all n ≥ 1 is the identity (0, 0, . . . ), so H cannot be a direct summand.” “Super! Note how Sam had to use some property of Z that could not be carried over to Q. With what we know so far, the following problem shouldn’t be too difficult. Show that every homomorphism f ∶ G → Z is determined uniquely by its value on the en ’s. (Of course the operation on Z is addition.) Equivalently, if f (en ) = 0 for all n, then f (w) = 0 for all w ∈ G.” No one volunteered. Professor Blakeley was disappointed. He said, “Consider also (3, 32 , 33 , . . . ).” Soon some hands were raised, including Fumei’s. She said, “Let w =(a1 , a2 , . . . ). Since 2n and 3n are relatively prime, for each n there are integers cn and dn for which an = 2n cn + 3n dn . Let u = (2c1 , 22 c2 , 23 c3 , . . . ) and b = (3d1 , 32 d2 , 33 d3 , . . . ), so w = u + v. Assume that f (en ) = 0 for all n. Then f (u) = 2n (0, 0, . . . , 0, an+1 2n+1 , an+2 2n+2 , . . . ). Thus f (u) is divisible by 2n for all n, so f (u) = 0. Similarly f (v) is divisible by 3n for all n, so f (v) = 0. Thus f (w) = 0.” “Very clever to use the fact that 2n and 3n are relatively prime! Can anyone see why G is not a free abelian group, i.e., does not contain a subset S such that every element of G can be written uniquely as a linear combination of elements of S with finitely many nonzero coefficients? “There are a number of ways to show this. One way is based on the result just proved by Fumei.” After a while Daniel was the first to raise his hand. “The group G is uncountable, while H is countable. Thus if G is free, it is a direct sum of uncountably many groups Zα ≅ Z, where α ranges over some uncountable index set I. If g ∈ G, then define the support supp(g) of g to be the set of α ∈ I such that the α-coordinate of g (that is, the projection of g to Zα ) is nonzero. Since H is countable, the set X = ⋃h∈H supp(h) is countable. Thus we can find a homomorphism ψ∶ G → Z that vanishes on H but is nonzero on Zβ for β ∈/ X. This contradicts the result proved by Fumei.” “A clever argument! I have one further problem about the group G that I have put on today’s problem set: prove that every homomorphism ψ∶ G → Z vanishes on all but finitely many ei . “Oops, I seem to have gone ten minutes overtime. You all must be starving, so that wraps up this morning’s class.”


Probability Theory PROBLEM LIST Strategy for guessing which of two numbers is larger Expected number of increasing subsequences of length k of a random permutation w ∈ Sn Expected least value of k for which x1 + ⋅ ⋅ ⋅ + xk > 1 Probability that x1 + ⋅ ⋅ ⋅ + xk ≤ 1 Limiting probability that when tossing a generalized die, at some point the sum of the numbers will be n Probability that xi + xi+1 ≤ 1 for 1 ≤ i ≤ n − 1 Asymptotic formula for the Euler number En Alternating permutations are counted by En Permutations a1 a2 ⋯adn ∈ Sdn such that ai > ai+1 if and only if d∣i Direct connection between alternating permutations and probability that xi + xi+1 ≤ 1 Probability that x1 + x2 + x3 ≤ 1, x3 + x4 + x5 ≤ 1, etc. Probability that xi + xi+1 + xi+2 ≤ 1 for 1 ≤ i ≤ n − 2 Alternating fixed-point free involutions Combinatorial proof that sec2 x = 1 + tan2 x Combinatorial proof that tan(x + y) = (tan x + tan y)/(1 − (tan x)(tan y))


104 104 105 105 106 107 109 109 110 110 111 111 112 112 112



After a weekend of rest, relaxation, and working on math problems, the class reconvened on Monday. Professor Blakeley began, “Today we’ll concentrate mostly on probability theory. This is a vast subject, but most of it is unsuitable for competitive problem solving. I’ve picked a few problems that illustrate some interesting aspects of elementary probability theory. We already saw some examples1 involving hidden independence and uniformity. “To begin, Bob writes an integer on a piece of paper and a different integer on another piece, and puts the two pieces into a hat. Alice reaches into the hat, pulls out one of the pieces of paper, seeing an integer n. She then guesses whether n is larger or smaller than the number on the other piece of paper. Does Alice have a strategy that gives her a probability of success of greater than 50%? We assume that Alice knows nothing of Bob’s psychology for choosing numbers.” “No way!” said Daniel, and the other students nodded in agreement. “The number on the other piece is just as likely to be smaller than larger.” “Your reasoning about the number on the other side is impeccable,” said Professor Blakeley, “but why does that imply that Alice cannot improve on a random guessing strategy?” At first no one volunteered. Could Alice really do better than 50%? Then Jung Wook said, “I think I once saw a problem about a game where one player’s strategy involves simulating a move of the other player. Maybe the present problem is of this nature.” “Not a bad idea! Here’s how it actually works. Alice chooses any probability distribution on Z for which every integer has a positive probability. She then chooses a ‘threshold’ t by choosing an integer from her distribution and adding 1/2 to it. If n > t, she then guesses that n is larger than the other number; otherwise smaller. “Who can tell me why this works?” Among the volunteers, Emiliano was given the honor to explain. “If t is larger than both of Bob’s numbers or smaller than both of Bob’s numbers, then Alice will be correct 50% of the time. But if t is between the two numbers, then Alice will always be right. There is a positive probability p for this to happen, so altogether Alice will be right with probability 12 + p.” “Right you are! Very counterintuitive! On the other hand, one can show that for any ε > 0, there is no strategy for Alice that guarantees a probability of success exceeding 12 + ε. “Okay, let’s switch gears and look a little at the concept of expectation. I’ll consider a couple of probability distributions on the positive integers. Just to review the definition, suppose that the probability of n ∈ P is pn . Let f be a function on P with values in Z, say. Then the expectation of f is the sum E(f ) ∶= ∑ f (n)pn . n≥1

The nicest property of expectation is that it is additive: E(f + g) = E(f ) + E(g). For a warm up, recall that an increasing subsequence of a permutation a1 a2 ⋯an of [n] (or more generally, of any sequence of integers) is a sequence ai1 < ai2 < ⋯ < aik such that 1 ≤ i1 < i2 < ⋯ < ik ≤ n. What is the expected number of increasing subsequences of length k (where 1 ≤ k ≤ n) of a random permutation (uniform distribution) w ∈ Sn ?” 1

Chapters 7 and 8



This was quite an easy problem, and hands went up quickly. Professor Blakeley pointed to Sandra. “There are (nk) subsequences of w of length k. The probability that a given one of them is increasing is 1/k!, so the expected number is (nk)/k!.” “Exactly! By the additivity of expectation, it doesn’t matter that the probability that some given subsequence of length k is increasing is not independent of the probability that some other given subsequence of length k is increasing. For instance, the probabilty that a1 < a2 < a3 and a3 < a4 < a5 is not 1/3!2 , which is what we would get if they were independent, but rather 1/5!. “Okay, let’s try something a little more difficult, but still rather simple. Suppose we choose a sequence x1 , x2 , . . . of real numbers independently and uniformly from the interval [0, 1]. What is the expected least value of k for which x1 +x2 +⋅ ⋅ ⋅+xk > 1? As usual, please don’t volunteer if you have seen this classic problem before.” No one volunteered. Hmm, thought Professor Blakeley, this must be a very popular problem in problem-solving circles. “In that case,” he said, “would someone who already knows this problem like to explain?” Fumei volunteered. “The probability that x1 + x2 + ⋅ ⋅ ⋅ + xk < 1 is 1/k!. Let pi be the probability that i is the least integer for which x1 + ⋅ ⋅ ⋅ + xi > 1. Thus 1 pk = (1 − ) − p1 − p2 − ⋯ − pk−1 . k! The solution is pk = (k − 1)/k!. By definition of expectation, the expected value of k is 1 k−1 =∑ = e. ” ∑k⋅ k! k! k≥1 k≥0 “Right you are, but perhaps there’s a more elegant way to get this answer.” Patrick was the first to raise his hand. “For any probability distribution r on P let r≥k = rk + rk+1 + . . . . Then the expected value of r is E(r) = ∑ iri = ∑ r≥i . i≥1


For the current problem, p≥i is the probability that x1 + x2 + ⋅ ⋅ ⋅ + xi−1 ≤ 1, which is 1/(i − 1)!. Thus 1 = e.” E(p) = ∑ i≥1 (i − 1)! “Very good explanation! And why is the probability that x1 + ⋅ ⋅ ⋅ + xk ≤ 1 equal to 1/k!?” Among the many volunteers, Clayton was selected. “The volume of a simplex (the convex hull of affinely independent points) in Rk with one vertex at the origin and other vertices v1 , . . . , vk is the absolute value of the determinant of the matrix whose rows are v1 , . . . , vk , divided by k!. The region in Rk defined by xi ≥ 0 and x1 + ⋅ ⋅ ⋅ + xk ≤ 1 is a simplex whose vertices are the origin and the k unit coordinate vectors. Thus the volume is 1/k! times the determinant of the identity matrix.” “Correct, though there is a more elegant way to see this. Let yi = x1 + ⋅ ⋅ ⋅ + xi . The condition xi ≥ 0 and x1 + ⋅ ⋅ ⋅ + xk ≤ 1 is equivalent to (12.1)

0 ≤ y1 ≤ y2 ≤ ⋯ ≤ yk ≤ 1.

If we choose any k points 0 ≤ yi ≤ 1, then the probability that 0 ≤ y1 ≤ y2 ≤ ⋯ ≤ yk ≤ 1 is 1/k! (since all orderings of the yi ’s are equally likely). The linear transformation between the x’s and y’s has determinant 1, so it preserves volume, and the proof follows.



“Time to leave the subject of expectation. Here is a problem that appeared on the 1960 Putnam and is somewhat similar to the one we just considered. Let n ≥ 1. Toss a die (which has six equally likely faces labelled 1,2,3,4,5,6) repeatedly. Let pn be the probability that after some number of tosses, the sum of the numbers we 1 . The problem is to have seen is equal to n. For instance, p1 = 1/6 and p2 = 16 + 36 find limn→∞ pn . We might as well generalize the problem as follows: let a be any probability distribution on P with finite support, that is, ai = 0 for all but finitely many i. Successively choose independent integers n1 , n2 , . . . from the distribution a, that is, we choose i with probability ai . Let pn be the probability that for some k we have n1 + ⋅ ⋅ ⋅ + nk = n. Find limn→∞ pn . You can assume that there does not exist d > 1 such that if ai ≠ 0, then d∣i, since otherwise it is easy to see that lim pn doesn’t exist. You can work together for a while.” It seemed clear to the students that in order to solve this problem, they needed to find some kind of manageable formula for pn itself, or at least a recurrence or generating function for pn . They soon obtained the recurrence n


pn = ∑ aj pn−j , n > 0, j=0

where a0 = 0 and p0 = 1, since we can obtain the sum n by first obtaining n − j with probability pn−j , and then n in one further trial with probability aj . This recurrence begged to be transformed into a generating function identity. Setting P (x) = ∑i≥0 pi xi and A(x) = ∑i≥0 ai xi , the coefficient of xn in A(x)P (x) is just the right-hand side of equation (12.2). Therefore multiplying both sides of (12.2) by xn and summing on n ≥ 0 gave P (x) = A(x)P (x), so A(x) = 1 or P (x) = 0. Either alternative seemed unlikely. The students quickly realized that they forgot to take into account that the recurrence (12.2) fails for n = 0. Adjusting for this value gave P (x) = A(x)P (x) + 1, so P (x) =

1 . 1 − A(x)

How to obtain limn→∞ pn from this formula? It didn’t take long for Sam to see the light, as was evident from his excited appearance. The other students asked him to explain. “Clearly the polynomial B(x) = 1 − A(x) satisfies B(1) = 0. I claim that every other complex zero α of B(x) satisfies ∣α∣ > 1. Every coefficient of P (x) is at most 1, so P (x) has radius of convergence at least 1. Therefore B(x) cannot have a zero of absolute value less than 1. “Suppose that B(α) = 0 and ∣α∣ = 1. We are assuming that for any d > 1 there is some i for which ai ≠ 0 and d ∤ i. Hence either α = 1 or there is some aj ≠ 0 for which αj ≠ 1. Since ∑ ai = 1 and ai ≥ 0, the triangle inequality (including the condition for equality) implies that the only way for ∑ aj αj = 1 is for α = 1. Therefore B(α) ≠ 0 for ∣α∣ = 1 except when α = 1, so every complex zero α ≠ 1 of B(x) must satisfy ∣α∣ > 1. Moreover, the zero α = 1 of B(x) has multiplicity one, since otherwise the coefficients of P (x) would be unbounded.



“Let L = 1/ ∑ jaj . Since L ≠ 0, by l’Hˆopital’s rule we can compute that L + B(x), 1−x where B(x) has radius of convergence greater than 1. Hence if B(x) = ∑ bn xn , then limn→∞ bn = 0, so 1 . lim pn = L = n→∞ ∑ jaj For a1 = ⋯ = a6 = 16 we get L = 2/7.” Wow!, thought all the other students together with Professor Blakeley, who said, “Very impressive! The same result is actually true for any probability distribution a on P, but it is quite a bit harder to prove. Especially troublesome is the case when ∑ jaj = ∞, an example being aj = 6/(π 2 j 2 ). “Another problem for today, which is related to all sorts of goodies. Choose x1 , . . . , xn independently and uniformly from the interval [0, 1]. What is the probability Pn that xi + xi+1 ≤ 1 for 1 ≤ i ≤ n − 1? “I don’t expect an instantaneous solution, but any ideas?” Patrick said, “It’s easy to express the answer as a horrendous multiple integral, but I don’t see what that gets you.” “What is the integral?” Patrick went to the board and wrote P (x) =


Pn = ∫


x1 =0


x2 =0

x3 =0



xn =0

dxn dxn−1 ⋯dx1 .

“Yes, this formula is immediate from the definition of Pn . Is it really hopeless to get anywhere with it?” asked Professor Blakeley. “We could try to evaluate it one integral at a time,” suggested Daniel. “Thus Pn

= ∫



x1 =0 1

x2 =0 1−x1


x3 =0 1−x2



xn−1 =0 1−xn−3

(1 − xn−1 )dxn−1 dxn−2 ⋯dx1

1 (1 − xn−2 − (1 − xn−2 )2 ) dxn−2 dxn−3 ⋯dx1 , 2 x3 =0 xn−2 =0 x1 =0 x2 =0 etc. Maybe we can find a formula for each step, which we can prove by induction. It looks really messy.” “Right you are about the messiness. I don’t know whether this can be made to work, but there is a simpler approach.” After getting no response other than blank stares, Professor Blakeley gave a hint. “Let’s generalize the problem to = ∫

Pn (t) = ∫



x1 =0


x2 =0


x3 =0



xn =0

dxn dxn−1 ⋯dx1 ,

where t is any real number. What to do?” Soon Jung Wook said, “Differentiate!” “Differentiate? Maybe this isn’t a bad idea. What do we get?” Jung Wook wrote on the board: Pn′ (t) = =


x2 =0


x3 =0

Pn−1 (1 − t).



xn =0

dxn dxn−1 ⋯dx2



“A simple but somewhat nonstandard recurrence,” said Professor Blakeley. “How to make further progress?” Sandra suggested, “If we differentiate again, we get rid of the 1 − t. We get Pn′′ (t) = −Pn−2 (t). ”

(12.3) ′′

“This looks similar to y = −y, so maybe trig functions are involved,” said Fumei. “Not a bad insight,” said the Professor. “But how do we actually solve the recurrence (12.3)? We need some way to handle all the Pn (t)’s at once.” Numerous hands shot up including that of Clayton, who blurted, “Generating functions!” “Exactly! Patrick, since you wrote down the original integral in the first place, why don’t you come up and try to get the generating function approach to work?” Patrick was happy for this opportunity. He said, “Let F (x, t) = ∑ Pn (t)xn . n≥0

Differentiating twice with respect to t gives ∂2 F (x, t) = ∂t2 = =


∑ Pn (t)x



− ∑ Pn−2 (t)xn n≥0 2

−x F (x, t).

The variable x is never differentiated, so we can treat it like a constant. Thus the general solution to this equation is F (x, t) = A(x) sin xt + B(x) cos xt. ” Taking into account the initial condition F (x, 0) = 1 and realizing that one needed to set t = 1 at the end, the students with some assistance from Professor Blakeley expressed the solution in the optimal form F (x, t) = (sec x)(cos(t − 1)x + sin tx). Setting t = 1 gives the generating function n ∑ Pn x = sec x + tan x. n≥0

“A very elegant trigonometric generating function!” said Professor Blakeley. “Probably some of you have encountered it before, but let me say a few words for the benefit of the others. If we write xn sec x + tan x = ∑ En , n! n≥0 then the numbers En are called Euler numbers because Euler was interested in the coefficients of sec x in connection with his work on summing series like ∑n≥1 n−2d for d ∈ P. There is no nice formula for En , but in 1879 D´esir´e Andr´e gave them a very elegant combinatorial interpretation. Namely, En is the number of permutations w = a1 ⋯an ∈ Sn satisfying a1 > a2 < a3 > a4 < a5 > ⋯.



The inequality signs alternate between > and a3 < a4 > a5 < ⋯. The transformation ai → an+1−i shows that the number of reverse alternating permutations in Sn is also equal to An . Now let w = a1 a2 ⋯an+1 be an alternating or reverse alternating permutation in Sn+1 . Suppose that aj+1 = n + 1. Then aj aj−1 ⋯a1 is a reverse alternating permutation u of some j-element subset S of [n], while aj+2 aj+3 ⋯an+1 is a reverse alternating permutation v of the complementary subset [n]−S. Every alternating or reverse alternating permutation can be obtained uniquely in this way. There are (nj) choices for S, Aj choices for u, and An−j choices for v. When n ≥ 1, the number of permutations in Sn+1 that are alternating or reverse alternating is 2An+1 . Hence we obtain the recurrence n n 2An+1 = ∑ ( )Aj An−j , n ≥ 1. j=0 j

(12.4) n


Let y = ∑n≥0 An xn! . Multiply equation (12.4) by xn! and sum on n ≥ 0. In general, n n if z = ∑ Cn xn! , then z ′ = ∑n≥0 Cn+1 xn! , where z ′ denotes the formal derivative (differentiate term-by-term). Taking into account the initial conditions and the failure of the recurrence for n = 0, we get the differential equation 2y ′ = y 2 + 1, y(0) = 1. The unique solution is y = sec x + tan x, and the proof follows. “Good job!” said Professor Blakeley. “This is a standard ‘Combinatorics 101’ example of the use of generating functions to solve recurrences. While fairly straightforward, it does have the defect of not explaining why we got such a simple



answer. There is a more illuminating proof based on an inclusion-exclusion argument. I will give the argument for sec x, that is, alternating permutations in S2n . The argument for tan x is similar but a bit more tricky. “Let Sk be the set of permutations w = a1 a2 ⋯a2n ∈ S2n satisfying a1 > a2 < a3 > a4 < ⋯ > a2n−2k , a2n−2k+1 > a2n−2k+2 > ⋯ > a2n , and let Tk be those permutations in Sk that also satisfy a2n−2k > a2n−2k+1 . Hence S1 − T1 consists of all alternating permutations in Sn . Moreover, Ti = Si+1 − Ti+1 . Hence An = #(S1 − T1 ) = #S1 − #(S2 − T2 ) = ⋯ = #S1 − #S2 + #S3 − ⋯. ) ways A permutation in Sk is obtained by choosing a2n−2k+1 , a2n−2k+2 , . . . , a2n in (2n 2k )A and then a1 , a2 , . . . , a2n−2k in A2(n−k) ways. Hence #Sk = (2n . Therefore 2(n−k) 2k we get the recurrence 2n 2n 2n A2n = ( )A2n−2 − ( )A2n−4 + ( )A2n−6 − ⋯. 2 4 6 This is precisely the recurrence we get for E2n (rather than A2n ) by equating coefficients of x2n /(2n)! on both sides of 1 =

( ∑ E2n n≥0


( ∑ E2n n≥0

so ∑ A2n n≥0

x2n ) ⋅ cos x (2n)! x2 x4 x2n ) (1 − + − ⋯) , (2n)! 2! 4!

1 x2n = = sec x, (2n)! cos x

as desired. “To see further that this is a better proof, for d ≥ 1 let fd (n) be the number of permutations a1 a2 ⋯an ∈ Sn such that ai < ai+1 if and only if d∣i. The case d = 2 gives alternating permutations. Then it is easy to show in complete analogy with the previous inclusion-exclusion argument that ∑ fd (dn) n≥0

xdn = (dn)!

1 xdn ∑ (−1) (dn)! n≥0



This generalization makes it much more clear why we obtained sec x as the generating function for alternating permutations in S2n . I will leave you to figure out xdn+r . for 1 ≤ r ≤ d − 1 the formula for ∑n≥0 fd (dn + r) (dn+r)! “I am not quite done yet! We found a connection between the probabilities Pn and the Euler numbers En , as well as a connection between the Euler numbers and alternating permutations. Let’s try to complete the circle by finding a direct connection between Pn and alternating permutations. As a hint, recall the linear change of variables yi = x1 + ⋅ ⋅ ⋅ + xi that we used to find the probability that x1 + ⋅ ⋅ ⋅ + xk ≤ 1.” The first idea to come to mind was to use the same change of variables on the equations xi ≥ 0 and xi + xi+1 ≤ 1, but that did not seem to work. Emiliano was the first to see the light and was asked to explain. “Make the substitution x2i+1 = 1 − y2i+1 and x2i = y2i . This is an affine transformation whose determinant



is ±1, so the volume is preserved. The conditions xi ≥ 0 and xi + xi+1 ≤ 1 become 0 ≤ yi ≤ 1 and (12.5)

y1 ≥ y2 ≤ y3 ≥ y4 ≤ y5 ≥ y6 ≤ ⋯.

Since all n! orderings of the yi ’s are equally likely, the probability that equation (12.5) holds is just An /n!.” “So elegant! Much more can be said about alternating permutations. Indeed, if I weren’t so lazy I could write a whole book about them. But let me mention a nice generalization of finding the probability that xi + xi+1 ≤ 1 that I have put on today’s problem set. I’ll just state a special case for the sake of simplicity, but you should be able to figure out a generalization. Let n ≥ 1. Choose x1 , . . . , xn independently and uniformly from the interval [0, 1]. Let Pn be the probability that x1 + x2 + x3 x3 + x4 + x5 x5 + x6 + x7


≤ ≤ ≤ ⋮

1 1. 1

The final inequality has two or three terms on the left, depending on whether n is even or odd. Show that n!Pn is equal to the number of permutations w = w1 ⋯wn ∈ Sn such that w1 > w2 > w3 w3 < w4 < w5 w5 > w6 > w7 . w7 < w8 < w9 ⋮ You’ll see that Emiliano’s elegant change of variables no longer works. You’ll have to be a little more clever. “By the way, consider the probability that if we choose x1 , . . . , xn independently and uniformly from the interval [0, 1] as before, then xi + xi+1 + xi+2 ≤ 1 for all 1 ≤ i ≤ n − 2. Though superficially similar to the previous problem, no nice answer is known. “Let me finish off with one very challenging problem which I have put on today’s problem set. Let E be an indeterminate. Consider the power series 1 + x (E ) G(x) = ( 1−x




We can write this as 2x (E +1)/4 ) (1 + 1−x 2x n (E 2 + 1)/4 ) , )( ∑( 1−x n n≥0 2

G(x) = =



from which it is clear that G(x) = ∑n≥0 pn (E)xn , where pn (E) is a polynomial in E. For instance, 1 2 (E + 1), 2 1 4 (E + 2E 2 + 1), p2 (E) = 8 1 (E 6 + 3E 4 + 11E 2 + 9), p3 (E) = 48 1 (E 8 + 4E 6 + 38E 4 + 68E 2 + 33). p4 (E) = 384 Now in pn (E) do the curious operation of turning each exponent into a subscript, so E k becomes the Euler number Ek . Call the resulting number f (n). For instance, p1 (E) =

1 (E6 + 3E4 + 11E2 + 9) 48 1 (61 + 3 ⋅ 5 + 11 ⋅ 1 + 9) = 48 = 2.

f (3) =

The problem is to show that f (n) is the number of alternating permutations in S2n that are fixed-point-free involutions, that is, all their cycles (in their decomposition into a product of disjoint cycles) have length two. For instance, when n = 3 we have the two permutations 214365 and 645231. “Who can see how to do this?” The students of course did not take this question seriously. Professor Blakeley as usual feigned disappointment and then said, “Just one more item before we can go to lunch. Now that we have a combinatorial interpretation of sec x and tan x in terms of alternating permutations, we can simply define sec x


∑ A2n n≥0

tan x


x2n , (2n)!

∑ A2n+1 n≥0

x2n+1 . (2n + 1)!

All the other trig functions can be defined in terms of these, e.g., sin x = (tan x)/(sec x). We can develop the theory of trigonometry from these definitions. This is the subject of combinatorial trigonometry. For instance, today’s problem set asks for combinatorial proofs that sec2 x = 1 + tan2 x and tan(x + y) =

tan x + tan y . 1 − (tan x)(tan y)

“I have searched through numerous textbooks on trigonometry, but none take this elegant combinatorial approach. It does have the slight defect that the trig functions are no longer functions, but rather formal power series. Thus you lose all connections with angles, triangles, periodicity, harmonic motion, etc., but you



can’t have everything. In return you have a chance to develop a lot of properties of alternating permutations.” The students hoped that Professor Blakeley wouldn’t get the idea of writing a trigonometry textbook based on alternating permutations, since they had some doubts about its marketability.


Geometry PROBLEM LIST Do there exist uncountably many disjoint subsets of R2 homeomorphic to two tangent circles? Do there exist uncountably many disjoint subsets of R2 homeomorphic to the letter Y? Can a convex n-gon be tiled with nonconvex quadrilaterals? For what n do there exist nonregular equiangular n-gons with integer sides? Polynomials P (x) ∈ R[x] such that some P (x)Q(x) has nonnegative coefficients Is there an equilateral triangle in R2 with integer vertices? When is there a regular n-simplex in Rn with integer vertices? When are there n + 1 vertices of an n-cube whose convex hull is a regular simplex? Well-centered regular simplex in a cube Line dividing 2n points in the plane in half Line through two points dividing the remaining 2(n − 1) points in half Maximum number of halving lines Circle through three points with half the remaining points inside Number of halving circles Rectangle tiled with rectangles with at least one side of integer length Can a 10 × 15 rectangle be tiled with 1 × 6 rectangles? Tiling an a × b rectangle (a, b ∈ R>0 ) with finitely many squares Tiling a triangular region of hexagons with tribones Tiling a square with rectangles similar to a 1 × x rectangle Tiling a rectangle with copies of a certain polyomino Smallest rectangle that can be tiled with a given polyomino


116 116 116 116 118 118 118 119 120 120 120 121 121 121 121 122 123 123 125 127 127



When class began the next day, Professor Blakeley was his usual excited self. “Today let’s see what you know about geometry! The first problem is pretty easy. Can we find uncountably many pairwise disjoint subsets of R2 , each homeomorphic to two circles with one point in common? We may or may not have one of the two circles contained in the other. If you’re not familiar with homeomorphisms, you can think of two simple closed (continuous) curves with one point in common. ‘Simple’ means that the curves are not self-intersecting.” A few students were familiar with this problem and generously did not volunteer. Among the remaining students, the first to volunteer was Clayton. “Let me call the allowed subsets 2-circles. Given a 2-circle T , choose a pair {uT , vT } of points with rational coordinates, such that uT lies inside both circles and vT lies inside just one of them (when one closed curve is contained inside the other), or uT lies inside one circle and vT inside the other (when the two closed curves have disjoint interiors). Clearly if {uS , vS } = {uT , vT }, then S = T . Since there are only countably many vectors (u, v) ∈ Q2 , it follows that we can’t have uncountably many pairwise disjoint 2-circles.” “Very nice! I’ll let you get away with assuming that every simple closed curve in R2 has an interior (the Jordan curve theorem). For a somewhat more onerous variation which is on the problem set, can we find uncountably many pairwise disjoint subsets of R2 , each homeomorphic to the letter Y, that is, three closed line segments with an endpoint in common? “Time for another, just a little more challenging! Can a convex n-gon C in the plane be tiled with finitely many nonconvex quadrilaterals? Thus the quadrilaterals have disjoint interiors, and their union (including their interiors) is C.” After a few minutes of trying to find a tiling without success, the students began to feel that the answer was negative. “Could we have a hint?” asked Patrick. “A hint? So sad. Okay then, consider angles,” said the Professor. Not much of a hint, but Daniel was first to understand its significance. “The sum of the interior angles of each quadrilateral is 2π. Each quadrilateral has exactly one interior vertex, that is, a vertex in the convex hull of the other three vertices. The sum of the angles around each interior vertex is also 2π. If there are q quadrilaterals, then the sum of the angles around all their interior vertices is 2πq. This doesn’t leave any angles available for the boundary of the n-gon, so a tiling is impossible.” “Good proof! Note that we can tile a triangle with one nonconvex quadrilateral and one nonconvex pentagon, so we really need to use the fact that the tiles are all (nonconvex) quadrilaterals. “Okay, let’s try another geometry problem! For what integers n ≥ 3 do there exist equiangular n-gons in the plane which are not regular (that is, not all edges have the same length, given equiangularity) and whose edges all have integer lengths? Any first thoughts?” A few hands went up, and the Professor selected Fumei. “It’s easy to do n = 3 and n = 4. Any equiangular triangle is regular, and of course there are rectangles which are not squares and which have integer edge lengths.” “A good first step, since it shows that there is a true dichotomy. Now what? What about n = 5?” After a short while Professor Blakeley gave a hint. “Think fifth roots of unity.”



It didn’t take long before there were some volunteers, the first being Sam, so he was chosen. The other students were no longer surprised when Sam volunteered. “Let ζ = e2πi/5 , a primitive fifth root of unity. If a0 , . . . , a4 > 0, then the condition for them being the side lengths, in that order, of an equiangular pentagon is 4

k ∑ ak ζ = 0. k=0

But the polynomial 1 + x + x + x + x4 is irreducible over Q and its zeros are ζ k , 0 ≤ k ≤ 4. Thus the only rational linear combinations of the ζ k ’s that equal 0 are of the form a(1 + ζ + ζ 2 + ζ 3 + ζ 4 ) = 0, a ∈ Q. 2


Hence all the ak ’s are equal. The same reasoning works when n is any odd prime, since 1 + x + ⋅ ⋅ ⋅ + xp−1 is irreducible over Q.” “Exactly!” said Professor Blakeley. “A little simple algebra goes a long way. Note that we need not assume that our polygon is convex or even non-selfintersecting. I hope you all remember why the polynomial Φp (x) = 1 + x + ⋅ ⋅ ⋅ + xp−1 is irreducible over Q when p is prime. The simplest proof is to apply the Eisenstein irreducibility criterion to Φp (1 + x). “Now what about composite n > 4?” Jung Wook said, “Sam’s reasoning shows that if the side lengths are a0 , . . . , an−1 k in that order, then the polynomial P (x) = ∑n−1 k=0 ak x must be divisible by the cyclotomic polynomial Φn (x).” “Yes, and I remind you that Φn (x) is defined to be the monic polynomial over Q of least degree that has e2πi/n as a zero. The degree of Φn (x) is the Euler phi function value φ(n), and the zeros of Φn (x) are the primitive nth roots of unity e2πdi/n , where 1 ≤ d ≤ n, and d and n are relatively prime. “If we allow nonpositive edge lengths, i.e., after drawing an edge and turning an angle 2π/n, we proceed in the opposite direction to which we are facing, then there is no problem. We can let P (x) be any polynomial of degree n − 1 except a(1 + x + ⋅ ⋅ ⋅ + xn−1 ) that is divisible by Φn (x). We can also easily arrange for all coefficients of P (x) to be nonzero, in case you don’t like sides of length 0. But what if you want all side lengths to be positive?” Patrick was chosen. “Define Γn (x) by 1 − xn = Φn (x)Γn (x). 1−x ˆ Then perturb the coefficients of Γn (x) slightly, keeping them rational, getting Γ(x). ˆ We can then multiply Φn (x)Γ(x) by a common denominator of its coefficients to get a polynomial with positive integer coefficients that are the side lengths of our polygon.” “Not bad, but can we think more geometrically?” Soon Jung Wook had his hand in the air. “If n = ab, where a, b > 1, then just take a regular n-gon with sides of length one, say, and stretch every ath side (so b sides are stretched in all) to the same longer integer length.” “Okeydokey! Just like stretching a square to a rectangle. That seems to settle our polygon problem. Before going on to the next problem, here is a somewhat related problem you can mull over. Let P (x) ∈ R[x]. Show that there exists 0 ≠ Q(x) ∈ R[x] such that every coefficient of P (x)Q(x) is nonnegative if and



only if P (x) has no positive real zeros. By the way, the corresponding question for polynomials in several variables is much more subtle. I think that it’s open whether there exists an algorithm for deciding whether Q exists. “Let’s continue with some more geometry. By a lattice point in Rn , I mean a point with integer coordinates. As a warmup, does there exist an equilateral triangle in R2 whose vertices are lattice points? How many have seen this wellknown problem before?” Five hands went up, and Professor Blakeley asked Sandra to explain. “Suppose such a triangle T exists with side length s. We can translate the triangle so one vertex is at (0, 0) and the others at (a, b) and (c, d), say. Thus s2 = a2 + b2 ∈ Z. The a b area of the triangle is the absolute value of 12 det [ ], which is rational. On c d √

the other hand, the area is also 43 s2 , which is irrational, a contradiction. Therefore T does not exist.” “Excellent argument!” said Professor Blakeley. “Remember1 that if we have an n-dimensional simplex σ in Rn whose vertices are the origin and v1 , . . . , vn , then the n-dimensional volume of σ is 1/n! times the absolute value of the determinant whose rows are v1 , . . . , vn . “Now let’s try a challenging generalization. For what positive integers n do there exist n + 1 lattice points in Rn such that any two of them are the same distance apart, that is, the lattice points form the vertices of a regular simplex? We just showed that n = 2 is not possible.” “I can do n = 1,” said Clayton. “Incredible! I won’t ask you to present the proof, but what about n = 3?” Soon some hands went up. Professor Blakeley did not even bother to ask someone to speak, but rather said, “Yes, this is quite easy—we can take the vertices to be 000, 110, 101, 011, for instance. So let’s get more serious. Suppose that n is even. Who can come up with a significant necessary condition for the simplex to exist?” Sam was the first to raise his hand. “We can generalize the argument for n = 2. By the determinantal formula just mentioned by Professor Blakeley, the volume of the simplex √ is rational. If the side length is s, then one can compute that the volume is n + 1sn /n!2n/2 . When n is even, then sn ∈ Z and 2n/2 ∈ Z. Thus we must have that n + 1 is a perfect square in order for the volume to be rational.” Everyone here should be able to check that the volume is indeed √ “Awesome! n + 1sn /n!2n/2 . Is this necessary condition also sufficient for n even?” After a short while Professor Blakeley continued, “We can try to find as simple an example as possible. Perhaps we can take the n + 1 points to consist of the n unit coordinate vectors together with a vector of the form (a, a, . . . , a).” With√this huge hint it wasn’t difficult for the students to compute that taking a = (1 − 1 + n)/n gave a regular simplex for any n. Clearly the last vertex is rational when 1 + n is a square, and one can scale by a factor of n to get integer vertices. Thus the proof was complete that when n is even, the integer simplex σ exists if and only if n + 1 is a square. “There remains the case where n is odd,” said Professor Blakeley. “This turns out to be more of a challenge, involving some knowledge of the Hasse-Minkowski 1

Chapter 12



theory of rational quadratic forms, so let me just tell you the answer. If n ≡ 3 (mod 4), then it is always possible, while if n ≡ 1 (mod 4) the necessary and sufficient condition is that n + 1 is the sum of two squares (of nonnegative integers). I’m sure you must know that the condition for this is that if n + 1 = pa1 1 ⋯par r is the factorization of n + 1 into prime powers, then ai is even whenever pi ≡ 3 (mod 4). This goes back to Fermat and Euler.” “Cool!” said Clayton. The other students agreed. “Let’s look at one more problem of a similar nature before the break,” said the Professor. “For what n ≥ 1 can we find n + 1 vertices of an n-dimensional cube that form the vertices of a regular simplex σ? That is, every pair of distinct vertices are the same distance apart. Any first thoughts?” Fumei was the first to speak. “Clearly n = 1 is possible (and trivial), while n = 2 is impossible. We already saw for n = 3 that we can take the vertices 000, 110, 101, 011. In fact, by what we just saw for regular simplices with any integer vertices, n + 1 must be a perfect square when n is even, and must be a sum of two squares when n ≡ 1 (mod 4).” “Good start! Can anyone rule out the cases when n is even or n ≡ 1 (mod 4) entirely, except for the trivial exception n = 1?” Daniel was the first to volunteer. “We can assume that the vertices of the cube are the 2n binary vectors of length n, and that one of the vertices is the origin. Since all the other vertices v1 , . . . , vn have the same distance to the origin, they √ have the same number of 1’s, say d. If i ≠ j, then vi and vj are at distance d apart, so they have d/2 1’s in common. In particular, d is even. “I claim that d = (n + 1)/2. Since σ is an n-dimensional simplex, the vectors v1 , . . . , vn form a basis for Rn . Thus there are unique real numbers ai for which a1 v1 + ⋅ ⋅ ⋅ + an vn = (1, 1, . . . , 1). Take the dot product with vi to get d d (a1 + ⋅ ⋅ ⋅ + an ) + ai = d, 1 ≤ i ≤ n. 2 2 = ⋯ = a = 2/(n + 1). Thus The unique solution is a1 n n+1 v1 + ⋅ ⋅ ⋅ + vn = (1, 1, . . . , 1). 2 Take the dot product with (1, 1, . . . , 1) to get dn = n(n+1) , so d = (n + 1)/2 as 2 claimed. Since d is even, we must have n ≡ 3 (mod 4).” “Excellent! So now we know that either n = 1 or n ≡ 3 (mod 4). Who can settle the case n ≡ 3 (mod 4)?” After a few minutes of futile attempts by the students, Professor Blakeley continued, “Again I gave you a chance for fame and glory by solving a famous open problem.” By now the students had become used to being asked to solve quickly problems which turn out to be open, so they did not react strongly to this latest pronouncement. “Let me explain further,” continued the Professor. “Let σ be a regular simplex in Rm−1 (we’ll soon see why I work in m − 1 dimensions rather than n) whose vertices v0 , v1 , . . . , vm−1 are binary vectors. As Daniel mentioned, we can assume that v0 = (0, 0, . . . , 0). Let A be the (m − 1) × (m − 1) matrix whose rows are v1 , . . . , vm−1 . Change the 1’s to −1’s and the 0’s to 1’s to get a new matrix B.



Then let C be the m × m matrix whose first row and column are all 1’s, and the remaining submatrix of C is B. The condition that σ is regular becomes (using Daniel’s result d = (n + 1)/2) the condition that any two distinct rows of C are orthogonal. Equivalently, CC t = mI. An m × m matrix of 1’s and −1’s for which any two distinct rows are orthogonal is called a Hadamard matrix. By multiplying each row and each column of a Hadamard matrix by 1 or −1 as needed, we can assume that the first row and column is all 1’s, called a bordered Hadamard matrix. Conversely, given a bordered Hadamard matrix we can reverse the steps to obtain σ. Thus Daniel’s argument shows that σ exists in Rm−1 if and only if m is the order of a Hadamard matrix, and that a necessary condition is that m = 1 (not mentioned by Daniel since we assumed n ≥ 1), m = 2, or m ≡ 0 (mod 4). It is a famous open problem that these conditions are sufficient. The smallest m divisible by four for which it is not known whether an m × m Hadamard matrix exists is m = 668. (1, 1, . . . , 1) can be “One final remark. Daniel’s formula v1 + ⋅ ⋅ ⋅ + vn = n+1 2 rewritten (0, 0, . . . , 0) + v1 + ⋅ ⋅ ⋅ + vn 1 = (1, 1, . . . , 1). n+1 2 In other words, the center of the simplex σ is the same as the center of the cube in which it is inscribed. We can ask for an even stronger ‘well-centered’ property, namely, every Euclidean automorphism of σ (of which there are (n + 1)!) extends to an automorphism of the cube. On today’s problem set I ask you to find all n for which this is possible. This is not such a difficult problem. “Okay, break time! We’ll look at some more geometry problems after the break.” The break passed very quickly, and Professor Blakeley presented the next problem. “As a trivial warmup, given 2n points in the plane, show that there is a line L that divides the points in half, that is, n of the points lie on one side of L and n on the other side. “No need to volunteer, since I’m sure you all can do this. Simply choose a line M not parallel to any line through two of the points, and slide M in from infinity always keeping its slope the same. Since M can cross at most one of the 2n points at a time, at some time it will have crossed exactly n of the points. “Now suppose that no three of the 2n points are collinear. Can we find a line L through two of the points that divides the remaining 2(n − 1) points in half?” Several hands quickly went up, and Patrick was selected. “Let L be a line through two of the points, say p and q. If L is not already a halving line, then there are fewer points a on one side of L than the b = 2(n − 1) − a points on the other side. Now rotate L around the point p. Each time L passes through a point, the number of points on each side changes by one, and after rotating 180○ the number of points on the side originally containing a points now has b points. Thus sometime during the rotation process L becomes a halving line.” “Nice!” said Professor Blakeley. “Note that Patrick’s argument shows that there are at least n halving lines, since there is a halving line through any of the 2n points. If the points are in convex position (i.e., the vertices of a convex 2n-gon), then equality is achieved—there are exactly n halving lines. Note that for four points not in convex position (always assuming no three are collinear) we can do better—there are three halving lines.



“In general, the problem of finding the maximum number f (n) of halving lines of a configuration of 2n points is open. The best results to date are √ log n


< f (n) < c′ n4/3 ,

where c, c′ are positive real constants. “Let’s consider an interesting variant—circles. Suppose we have a set S of 2n+1 points in the plane such that no three lie on a line and no four on a circle. We will say that such a set is in general position. A halving circle is a circle through three of the points, with n − 1 of the remaining points in its interior, so also n − 1 in its exterior. The first question: do halving circles always exist?” The students quickly saw that an argument similar to the halving line existence also worked. Clayton explained, “Pick any two points p, q ∈ S. Consider all circles through p, q that don’t intersect another point in S. One of these circles contains all the points on one side of the line pq, and another circle contains all the points on the other side. Thus one circle contains at least n points of S, and the other at most n − 1 points of S. Continuously move the center O of the first circle in a straight line toward the center O ′ of the second circle. By continuity, one of these points between O and O ′ will be the center of a circle passing through p, q and containing exactly n − 1 points of S in its interior.” “Sounds good to me!” said Professor Blakeley. “So what can be said about the number of halving circles, other than being positive?” Since there was a halving circle passing through any two points p, q and by necessity intersecting just one other point of S, it was clear to the students that there were at least 1 2n(2n + 1) 1 ⋅ = n(2n + 1) 3 2 3 halving circles. It was difficult to make further progress, one reason being that checking examples seemed much more difficult than for halving lines. “You are now in a position to appreciate an amazing result of Federico Ardila,” said Professor Blakeley, “namely, there are always exactly n2 halving circles!” This did seem to be quite a startling result. Why were circles so much more nicely behaved than lines? “I won’t give the details of the proof here, but the idea is as follows. First show that the number of circles is invariant by continuously moving one set S of 2n + 1 points in general position into another, and analyzing what happens when the set of three-element subsets of S whose elements lie on a halving circle changes. Once we know that the number of halving circles is invariant, we can choose S suitably so we can show that it has n2 halving circles. Ardila chooses S to consist of a slight perturbation of the vertices of a regular (2n − 1)-gon with center O, together with O and a point Q far away from the (2n − 1)-gon, chosen of course so that all 2n + 1 of the points are in general position. With these hints and sufficient time I think you should be able to fill in the details. “Let’s move on to our final topic for today—tiling rectangles with rectangles. It is surprising that there are a number of cool results about such a simple tiling situation. Let’s start with a classic: show that if a rectangle is tiled by finitely many rectangles each of which has at least one side of integer length, then the tiled rectangle has at least one side of integer length.” Some of the students had seen this problem before, and Patrick was allowed to explain. “We can assume for simplicity that the rectangle has vertices (0, 0), (a, 0),



(0, b), and (a, b). Let = ∫


a 0


e2πi(x+y) dx dy


1 (e2πia − 1)(e2πib − 1). (2πi)2


The integral over each of the tiles is 0 since we get a factor e2πc − 1, where c is an integer. Therefore I = 0, so at least one of a or b is an integer.” “Good argument! This is probably the quickest and simplest proof for those who have the background to understand it. Stan Wagon has assembled fourteen different proofs, including Patrick’s. Note also that there is no problem in extending this result to higher dimensions. “A rather similar problem is the following: can a 10 × 15 rectangle be tiled by 1 × 6 rectangles (in any orientation)? Please volunteer only if you aren’t already familiar with the theory of tiling a × b rectangles with c × d rectangles.” The statement that the problem was rather similar to the previous one was a big hint, and Emiliano was the first to raise a hand. “It is no more difficult to do the general case of tiling an a × b rectangle R with c × d rectangles. Assume the lower-left corner of R is (0, 0). For any rectangle S let Γ(S) = ∑ xm y n , (m,n)

where (m, n) ranges over all integer points in S except those on the top edge and right edge. Thus for any rectangle S in the tiling of R, say with bottom left vertex (r, s), we have either Γ(S) = xr y s (1 + x + ⋅ ⋅ ⋅ + xc−1 )(1 + y + ⋅ ⋅ ⋅ + y d−1 ) or Γ(S) = xr y s (1 + x + ⋅ ⋅ ⋅ + xd−1 )(1 + y + ⋅ ⋅ ⋅ + y c−1 ). Thus, setting Pab (x) = (1 + x + ⋅ ⋅ ⋅ + xa−1 )(1 + y + ⋅ ⋅ ⋅ + y b−1 ), we have Pab (x) = =

Γ(R) ∑ Γ(S) S

(13.1) (13.2)


r s c−1 d−1 ∑ x y (1 + x + ⋅ ⋅ ⋅ + x )(1 + y + ⋅ ⋅ ⋅ + y )

+ ∑ xr y s (1 + x + ⋅ ⋅ ⋅ + xd−1 )(1 + y + ⋅ ⋅ ⋅ + y c−1 ),

where S ranges over all rectangles in the tiling of R. Now put x = y = e2πi/c . By summing a geometric series it is clear that p−1 2πih/q

∑e h=0


= 0 if q∣p, ≠ 0 otherwise.

Thus the right-hand side of equation (13.2) vanishes. The left-hand side will vanish if and only if c∣a or c∣b. Similarly we get that either d∣a or d∣b if a tiling exists. It is easy to see that these necessary conditions are also sufficient, though necessity is all we need to conclude that we cannot tile a 10 × 15 rectangle with 1 × 6 rectangles.” “Any comment?” asked Professor Blakeley. “I get the proof of the necessity that either c∣a or c∣b, and similarly either d∣a or d∣b, but why is it sufficient?” said Sandra. “Sufficiency would mean we could tile a 1 × 2 rectangle with 2 × 2 rectangles, which seems quite challenging.”



“Good point!” said Professor Blakeley. “Clearly in order to cover just the first row and the first column we need that both a and b are sums of c’s and d’s, i.e., we can write a = uc + vd for u, v ∈ N, and similarly for b. It now really is easy to check that with this added requirement we do indeed have a necessary and sufficient condition. Note again that it is easy to generalize our result, originally due to Nicolaas de Bruijn, to higher dimensions. “Now I’d like to tell you about a problem rather similar to the previous ones, but surprisingly more difficult. For any real numbers a, b > 0, show that an a × b rectangle can be tiled by finitely many squares if and only if a/b ∈ Q. Here is an especially elegant proof using some simple linear algebra. “Consider an a × b rectangle R. It is quite easy to see that if a/b ∈ Q, then R can be tiled by finitely many squares, so assume a/b ∈/ Q. Suppose that R is tiled with finitely many rectangles R1 , . . . , Rn , where Ri is of size ai × bi . Let V ⊂ R be the rational vector space spanned by the numbers a1 , . . . , an , b1 , . . . , bn , that is, V consists of all rational linear combinations of the ai ’s and bi ’s. In particular, a, b ∈ V . Let f be any linear transformation f ∶ V → R satisfying f (a) = 1 and f (b) = −1. “Now for any u × v rectangle S with u, v ∈ V define the Hamel area (with respect to f ) of S to be H(S) = f (u)f (v). If a rectangle S is tiled by finitely many rectangles S1 , . . . , Sn , then it is easy to check the additivity property (13.3)

H(S) = H(S1 ) + ⋅ ⋅ ⋅ + H(Sn ).

Namely, first check it for two disjoint rectangles with a common edge, as in the figure below, which amounts to f (a)f (b + c) = f (a)f (b) + f (a)f (c).

“To complete the proof that H is additive (equation (13.3)), simply note that the tiling S1 , . . . , Sn can be refined into a grid tiling. “Now comes the punch line. Let R be our original a × b rectangle tiled into finitely many squares Ri , say of size ri ×ri . Then H(R) = −1 while H(Ri ) = f (ri )2 ≥ 0, contradicting the additivity of H.” “Awesome proof!” said Emiliano, with the other students in approval. “Let’s take another little break so you can gather up some energy for a couple of more tiling problems.” During the break Sandra asked Professor Blakeley if there were some nice tiling problems analogous to tiling an a × b rectangle with a c × d rectangle, but which couldn’t be solved so easily. “I’m glad you asked me that, Sandra,” said Professor Blakeley. “There is some really interesting work in this area. Here is one example. Consider the region T (n)



consisting of a triangular array of n(n + 1)/2 unit regular hexagons. Here is the case n = 5:

“Call T (2) a tribone. For what values of n can T (n) be tiled by tribones, in either of the two possible orientations?” Professor Blakeley rummaged through his backpack and produced the following figure, showing that T (9) could be tiled by tribones:

“Clearly a necessary condition is that n(n + 1)/2 must be a multiple of 3, so n ≡ 0, 2 (mod 3). However, this condition is not sufficient. The necessary and sufficient condition is that (13.4)

n ≡ 0, 2, 9, 11 (mod 12),

but it can be shown that this cannot be proved by an argument analogous to the one we use for tiling an a × b rectangle with c × d rectangles. I can give you some references later if you’re interested.” After the break Professor Blakeley returned to the classroom looking quite pleased with himself, as he had managed to pass another level of Candy Crush Friends. “Let’s look at an even more challenging problem about tiling rectangles, one issue being that we don’t know in advance how large the tiling is. Let x be a positive real number. I say that a rectangle R is similar to a 1 × x rectangle if R is an a × ax rectangle for some a > 0. For what positive real numbers x can a square



be tiled with finitely many copies of rectangles all similar to a 1 × x rectangle? The rectangles can be in any orientation. Any thoughts?” Jung Wook was the first to volunteer. “It should be easy whenever x is rational. I wonder if this condition is necessary.” “You are right about rationality,” Professor Blakeley said. “For instance, here is the solution for x = 3/5, from which it should be clear how to do any positive rational number. In fact, we can take all the rectangles to be congruent, i.e., all the same size.

“So what about Jung Wook’s suggestion that rationality is also necessary?” Sam had his hand partially raised, so Professor Blakeley asked him to explain. “I don’t think that rationality is necessary. We can draw a figure of a tiling and try to figure out what side lengths will produce this figure. This should involve solving polynomial equations. For instance, suppose the figure is the following:

“If the square has side 1 and we let the big rectangle at the top be x × 1, then the rectangle at the lower right is (1 − x) × (1 − x)/x, and the one at the lower left is (1 − x)/x × x(1 − x). Hence x(1 − x) +

1−x = 1, x

so x3 − x2 + 2x − 1 = 0. This polynomial is irreducible over Q since it has degree three and ±1 are not zeros, so there is a nonrational value of x that works.” Patrick announced that Sam’s polynomial had exactly one real zero, which was about 0.569840290998053 . . . . “Nice!” said Professor Blakeley. “It’s clear that we can get lots of algebraic numbers in this way. Can we get a transcendental value of x?” “I don’t think so,” volunteered Fumei. “Whatever the configuration, we should be able to do an argument like Sam’s to get a polynomial equation satisfied by x.”



Figure 13.1. A polyomino tile with 18 squares “Your insight is correct, though the details take some work. So the big question is, what algebraic numbers can we get? This is not so easy.” The students tended to agree with this assessment, since they could think of no way to exclude any given positive algebraic number. Perhaps all such numbers were possible. Anticipating this line√ of reasoning, Professor Blakeley said, “Here is√an inter√ is esting tidbit. The value 2 is not possible, nor is 2 + 43 . However, 2 + 17 12 possible.” √ At first this seemed extremely mysterious. Then Sandra noticed that 43 < 2, √ while 17 > 2. Could this observation be somehow relevant? 12 “You’re on the right track!” said Professor Blakeley. “Let me state the remarkable result of Freiling-Rinne and Laczkovich-Szekeres (independently). The real number x > 0 is possible if and only if x is algebraic, and all the conjugates of x (the zeros of the polynomial over Q of least degree satisfied by x) have positive real part. “How is such a result proved? First let’s consider the sufficiency of the condition on x. It is based on a nice result of H. S. Wall. Given a polynomial P (x) = xn + an−1 xn−1 + ⋅ ⋅ ⋅ + a0 , define its alternant Q(x) by Q(x) = an−1 xn−1 + an−3 xn−3 + an−5 xn−5 + . . . . Then all zeros of P (x) have positive real part if and only if we can write (13.5)

Q(x) = P (x) − Q(x)




cn x + cn−1 x +

1 ⋱ 1 +c x 1

where each ci > 0. “Given equation (13.5) with each ci > 0, the ci ’s give a recipe for constructing a tiling of a square by rectangles similar to an x × 1 rectangle. I won’t go into the details, but they are not so difficult. The case c1 = ⋯ = c6 = 1, for instance, corresponding to the polynomial P (x) = x6 − x5 + 5x4 − 4x3 + 6x2 − 3x + 1, might make a diabolical puzzle for a ten-year-old. The rectangles would be given, and the task would be to assemble them into a square. “Speaking of diabolical puzzles, consider the tile of Figure 13.1. We give our hapless victim 500 copies of this tile and ask him or her to assemble some subset of them into a rectangle, not allowing flipping a tile to the other side. It turns



out that the least number of tiles needed to accomplish this task is 468. Even if this were known in advance, it would still be quite a challenging puzzle. It would continue to remain quite difficult with the additional information that the size of the tiled rectangle is 78 × 108. “Shapes such as this one made of unit squares attached along their edges are known as polyominoes. Given a polyomino, the question of whether some number of them tile a rectangle is probably undecidable, that is, there is no algorithm that can decide this. I say ‘probably,’ because there are some similar problems that are known to be undecidable. For instance, it is known that the problem of tiling some rectangle (of unspecified size) whose tiles belong to a finite set of polyominoes is undecidable. One consequence of the undecidability of tiling by a single polyomino is the following: let r(P ) be the minimum number of copies of the polyomino P that are needed to tile a rectangle. Set r(P ) = ∞ if no such tiling exists. Let f (n) be the maximum finite value of r(P ) as P varies over all polyominoes with n squares. Then f (n) grows faster than any recursive function (essentially, any function that can be computed on a computer, allowing infinite memory but only a finite input), since otherwise we could just try all possibilities, say with a backtrack algorithm, for tiling a rectangle with i copies of an n-square polyomino P for all i ≤ f (n). I wouldn’t be surprised if there is a polyomino P with under 100 squares, or certainly under 1000 squares, such that r(P ) > 10100 . Now that would be a real puzzle! “Let’s get back to tiling a square with rectangles similar to a 1 × x rectangle. How does one prove the necessity of the condition that x is an algebraic number all of whose conjugates have positive real part? I won’t go into the details, but the idea is similar to the proof that an a × b rectangle can be tiled by finitely many squares if and only if a/b ∈ Q. Recall that to prove the necessity of the condition a/b ∈ Q we defined an ‘area function’ (the Hamel area). In our present situation, if x is algebraic and has a conjugate with a nonpositive real part, then one can define an additive area function so that all rectangles similar to a 1 × x rectangle have nonnegative area, while a square has negative area. This immediately contradicts the existence of a tiling.” It was time for lunch. On the way to the cafeteria Clayton asked Professor Blakeley if for every positive integer n there exists a polyomino P with r(P ) = n. “A good question!” said the Professor. “There has been some research on this problem.” After briefly consulting his laptop, he continued, “The known values r(P ) are all numbers 4m with m ≥ 1, together with 1, 2, 10, 18, 50, 138, 246, 270. It is unknown whether we can have r(P ) = 6 or r(P ) equaling an odd number greater than 1. I believe that the only value of r(P ) that has been ruled out is r(P ) = 3.”


Hodgepodge PROBLEM LIST 2n ⌈ 21/n2 −1 ⌉ = ⌊ log ⌋ 2 n−1 2 1 an = n−1 ∑i=0 ai an an−4 = an−1 an−3 + a2n−2 , etc. First four digits of n1000000 are distinct 100 prisoners who open 50 boxes each 100 prisoners who make their choices in advance 100 prisoners who can leave one bit of information each Number of boxes opened by each prisoner for success rate of 50% 100 prisoners open 99 boxes each and don’t want to see their name A fly flies between a man and the point 0 A fly flies between two approaching trains


130 131 131 132 132 134 135 135 135 135 136



“No particular topic for today,” said Professor Blakeley as he arrived in class the next morning, “just a hodgepodge of miscellaneous goodies. “First some light relief. Is it true that for every positive integer n, we have ⌈


2 21/n



2n ⌋ ?” log 2

Daniel pointed out that 21/n − 1 = −1 + e

log 2 n


log 2 log2 2 log3 2 + + + ..., n 2n2 6n3

from which it is easy to compute that 2 log 2 2n −1+ + .... = 6n 21/n − 1 log 2 Thus one needed to know whether 2n/(log 2) could be just a tiny bit (in a suitable precise sense) less than an integer. “Very nice observation! So who can do the computation?” No one could find a counterexample to equation (14.1), although Patrick managed to do the computation up to n = 100000 or so. Professor Blakeley was disappointed. “Can’t anyone do arithmetic computations in their heads anymore? You should read the story ‘The Feeling of Power’ by Isaac Asimov, where someone in the future figures out that one can do computations, including just counting, without a computer. I am afraid I will have to just tell you the answer: the least value of n for which (14.1) fails is n = 777451915729368. In fact, for this value of n we have 2 21/n − 1 2n log 2





“Well, with this big hint, I’m sure you all can see the next value of n for which (14.1) fails.” Naturally none of the students had the slightest chance of answering this implied question, so once again Professor Blakeley had to reveal the solution. “It is 140894092055857794, and then 1526223088619171207, etc. To get a little more serious, perhaps you are wondering how these huge numbers were found. Daniel’s observation that 2n/(log 2) is close to an integer is very germane. In fact, if you have some familiarity with the theory of continued fractions (which I remind you can be used to find the ‘best’ rational approximations of irrational numbers), then it’s not hard to show that if m = ⌈2/(21/n − 1)⌉ and the equality (14.1) fails, then m/n is a convergent of 2/(log 2). This gives a huge reduction in the possible values of n that we need to look at. If Cn denotes the nth convergent of 2/(log 2), then we have C1 = 1, C2 = 3, C3 = 23 , C4 = 26 , etc., until we reach 8 9 C36 =

2243252046704767 , 777451915729368



so only 36 cases to check rather than 777451915729368. Similarly, 406534415799078269 , C40 = 140894092055857794 4403748962482230453 . 1526223088619171207 To my knowledge, only eight values of n are known for which equation (14.1) fails, and it is not known whether there are infinitely many such values. “You didn’t do so well on this question, so here is a chance to redeem yourselves. Define a0 = a1 = 1 and C42


an =


1 n−1 2 ∑ a , n > 1. n − 1 i=0 i

Is an an integer for all n ≥ 0?” The students quickly went to work with their computers. Patrick managed to check that an was an integer up to a36 , a number of 697208172 decimal digits. It seemed like the result should be true, but given that the previous problem broke down at n = 777451915729368, they weren’t so certain. Attempts to find a proof were fruitless. “Here’s a little hint,” said Professor Blakeley. “Work modulo 43.” This hint was a lot more than ‘little.’ It took less than 30 seconds for the students to discover that a44 was a rational number with denominator divisible by 43. “You got it!,” said the Professor. “Indeed 44 is the least integer n for which an is not an integer, and we have the approximation a44 = 5.4093 × 10178485291567 . The sequence a0 , a1 , . . . is sometimes known as G¨ obel’s sequence. I don’t know if there is a good reason why an is an integer until the surprisingly large n = 44. “On the other hand,” said the Professor, “there are sequences which have no right to be integer valued, so to speak, but nevertheless their terms are all integers. Four famous examples are the recurrences an an−4 an an−5 an an−6 an an−7

= an−1 an−3 + a2n−2 , n ≥ 4, a0 = a1 = a2 = a3 = 1, = an−1 an−4 + an−2 an−3 , n ≥ 5, a0 = a1 = a2 = a3 = a4 = 1, = an−1 an−5 + an−2 an−4 + a2n−3 , n ≥ 6, a0 = a1 = a2 = a3 = a4 = a5 = 1, = an−1 an−6 + an−2 an−5 + an−3 an−4 , n ≥ 7, a0 = a1 = a2 = a3 = a4 = a5 = a6 = 1.

The integrality of the first two of these sequences is a nice exercise, but the last two are quite challenging. Moreover, the recurrence an an−8 = an−1 an−7 + an−2 an−6 + an−3 an−5 + a2n−4 , n ≥ 7, with the initial conditions a0 = ⋯ = a7 = 1, satisfies an ∈ Z for n < 17, but 420514 . a17 = 7



“These sequences are examples of Somos sequences, which are best understood in the context of cluster algebras and the related Laurent phenomenon. Sadly I don’t have the time to discuss this wonderful topic. “Let me give you one last chance to do a computation in your head. This one is actually quite easy once you see the trick. Find a positive integer n such that the first four digits in the decimal expansion of n1000000 are all distinct.” After a surprisingly short time Jung Wook said, “I got it! We have 10000011000000 = 2718 . . . . ” “Right you are! There are other more contrived solutions, but you have found the best one.” After the customary ten-minute break Professor Blakeley switched to a different topic. “Earlier1 we looked at a situation where infinitely many prisoners had to guess some colors. This scenario is not so realistic. There are many ‘beleaguered prisoners’ problems with only finitely many prisoners. Let me give one of the best known. Perhaps some day you will find it useful if you become a prisoner. “An evil warden is in charge of 100 prisoners (all with different names). He puts a row of 100 boxes in a room. Inside each box is the name of a different prisoner. The prisoners enter the room one at a time. Each prisoner must open 50 of the boxes, one at a time. If any of the prisoners does not see his or her own name, then they are all killed. The prisoners may have a discussion before the first prisoner enters the room with the boxes, but after that there is no further communication. A prisoner may not leave a message of any kind for another prisoner. In particular, all the boxes are shut once a prisoner leaves the room. If all the prisoners choose 50 boxes at random, then each has a success probability of 1/2, so the probability that they are not killed is 2−100 , not such good odds. Is there a strategy that will increase the chances of success? What is the best strategy? Please don’t volunteer if you’ve seen this problem before.” “We can certainly do better than 2−100 ,” said Jung Wook. “Oh?” “The first prisoner chosen by the warden chooses the first 50 boxes, and all the other prisoners choose the last 50 boxes. If the first prisoner is successful, then the other prisoners have a probability of 50/99 of success each, so the total success 99 ) .” probability is 12 ⋅ ( 50 99 “This couldn’t be right,” said Sandra. “You have 99 prisoners opening the same 50 boxes. The probability that all are successful is 0.” “Hmm, good point,” said Jung Wook. “Well, at least the argument is correct ⋅ 2−99 .” for the first two prisoners, so the probability goes down to 50 99 “A rather minuscule improvement,” said Professor Blakeley. “We can apply this argument to 50 pairs of prisoners,” said Sandra. “That is, prisoners 1, 3, 5, . . . choose boxes 1–50, while prisoners 2, 4, 6, . . . choose 51–100. Since a prisoner chooses only when the previous prisoners have been successful, this gets the probability down to 1 1 1 50 1 49 p = ( ⋅ )( ⋅ )⋯( ⋅ ).” 2 99 2 97 2 1 “Numerically this is about 9.9116 × 10−30 , an improvement by a factor of 12.5645 . . . ,” said Professor Blakeley after doing a quick computation on his laptop. 1

Chapter 5



“If we apply Sandra’s argument to 2n prisoners and simplify, then we get a ). This beats random guessing by a factor success probability of 1/(2n n 22n √ ∼ πn, (2n ) n using Stirling’s asymptotic formula for m!. This improvement is still pretty abysmal. Can we do better?” After getting no response, Professor Blakeley decided to give a hint. “The key to understanding this problem is the realization that the prisoners do not have to decide in advance on which boxes they will open. A prisoner can decide which box to open next based on what he or she has seen in the boxes previously opened.” Fumei said, “If a prisoner has to decide which box to open based on what he or she has seen previously, then the most natural way to do this is for the prisoners to arbitrarily order themselves P1 , . . . , P100 before the box selection begins. The prisoner Pi will first go to box i (that is, the ith box in the order they are lined up). After opening box i, the prisoner will see the name of some prisoner Pj . Then prisoner Pi next opens box j (unless i = j, in which case Pi is already successful) and continues this strategy. But I would have to compute whether such a strategy is an improvement over what Sandra suggested.” “Excellent insight!” said Professor Blakeley. “How would you compute the success probability?” Fumei went into a trance of intense concentration. After a short while, she said, “Eventually the prisoner Pi must return to box i. The box prior to this will have the name of Pi inside. Thus this cycle must have length at most 50. In other words, if Pw(i) is the prisoner whose name is in box i, then all the prisoners will be successful if and only if the permutation w ∈ S100 has no cycle of length greater than 50. It shouldn’t be so hard to compute how many such permutations there are.” “Fantastic! Fumei’s argument shows that if we have 2n prisoners, then her strategy has a success probability of f (n)/(2n)!, where f (n) is the number of permutations in S2n with no cycle of length greater than n. Who can tell me something about f (n)?” Soon several hands went up, and Emiliano was selected. “A permutation w ∈ S2n can contain at most one cycle of length greater than n. If this length is k, ) ways, and then choose the rest of w then we can choose the k-cycle in (k − 1)!(2n k in (2n − k)! ways. Hence f (n) = =

2n 2n (2n)! − ∑ (k − 1)!( )(2n − k)! k k=n+1 2n

1 , k k=n+1

(2n)! − (2n)! ∑

so the success probability p2n is 2n

1 .” k k=n+1

p2n = 1 − ∑

“Fine argument! So about how big is p2n ?”



“Well, 2n

n 1 2n 1 1 =∑ −∑ . k=n+1 k k=1 k k=1 k

1 Now ∑m i=1 i is close to log m − γ, where γ is Euler’s constant, so p2n will be close to 1−log 2n+log n = 1−log 2. Maybe I made a mistake since the answer is independent of n, while it seems intuitive that p2n → 0 as n → ∞.” “Actually, your computation is impeccable while your intuition is faulty. It is not hard to see that p2n actually decreases to 1 − log 2 as n → ∞, so for any n, p2n > 1 − log 2 = 0.30685 . . . . In particular, p100 = 0.31182 . . . . No matter how large n is, we can achieve more than a 30% chance of success!” The students who had not seen this problem before were astounded by this unlikely result. Professor Blakeley continued, “In fact, it can be shown, but this takes more work, that Fumei’s strategy is optimal. No strategy can give a higher probability of success.” “What if the prisoners have to make all their choices in advance, before opening a single box?” asked Patrick. “What is the optimal strategy then? We already saw ).” we can get a success probability of 1/(2n n “Not a bad question!” said the Professor. “Any thoughts?” Sam had his hand raised slightly, so Professor Blakeley called on him. “Let’s do it for 2n prisoners. Suppose prisoner Pi chooses to open the boxes belonging to the n-element subset Si of [2n]. The prisoners will be successful if Fumei’s permutation w satisfies w−1 (i) ∈ Si for all i. Let A = (aij ) be the 2n × 2n matrix defined by

aij = {

1 if i ∈ Sj , 0 otherwise.

Then the number of successful permutations w−1 is just per A, the permanent of A. Thus we want to find the largest permanent of a 2n × 2n (0, 1)-matrix with n 1’s in every row. I know there are some inequalities involving maximum and minimum permanents of (0, 1)-matrices, but I can’t remember them exactly and don’t know if they are relevant.” “Right on the button! I remind you that the permanent of an m × m matrix B = (bij ) is defined by per B = ∑ b1,w(1) b2,w(2) ⋯bm,w(m) , w∈Sm

just like the expansion of the determinant except all the signs are plus. The permanent is not nearly as well-behaved or important as the determinant, but it does have its uses. I hope everyone understands Sam’s contention that the number of successful permutations is per A.” No one said otherwise, so Professor Blakeley continued, “Sam is also right about a permanental inequality being relevant. It’s not so easy to figure out quickly, so I will just tell you the story. “Henryk Minc conjectured in 1963 that if B is an m × m (0, 1)-matrix with ri 1’s in row i, then m


per B ≤ ∏(ri )!1/ri . i=1



Minc’s conjecture was proved by Lev Br´egman (sometimes spelled without the accent mark) in 1973. I won’t go into Br´egman’s proof, but who sees the relevance to our prisoner problem?” It wasn’t long before hands flew into the air. Clayton was chosen. “The matrix A we are looking at is a 2n × 2n (0, 1)-matrix with n 1’s in every row. By equation (14.3) we have 2n

per A ≤ (n!1/n )

= n!2 .

Hence the largest possible success probability q2n satisfies q2n ≤

1 n!2 = 2n . (2n)! ( n )

This is exactly the probability achieved by Sandra’s strategy, so it is best possible.” “Righto! For 2n = 100, say, quite a difference between the probability 0.31182 . . . when a prisoner can choose which box to open based on what (s)he has already ) = 9.9116 . . . × 10−30 when all choices must be made in advance! seen, and 1/(100 50 “Let’s look at a slight variant of the 100 prisoners problem. Each prisoner can leave one bit of information (that is, either 0 or 1) for the next prisoner. How does this affect the probability of success? “There’s an obvious way to get it up to 1/2,” said Jung Wook. “How’s that?” “The first prisoner opens boxes 1–50, say. He or she leaves a 0 if the next prisoner is in boxes 1–50 and a 1 if not (so boxes 51–100). Each prisoner repeats the same strategy. It doesn’t matter whether or not the choices must be made in advance.” “Super! Can we do better?” “No way! The first prisoner has a 50% chance of success.” “Right! 50% is optimal and quite easy to prove. Obviously there are many further variations. Perhaps you can come up with something especially interesting. On today’s problem set are the following two problems, neither very difficult: ● We now have 2n prisoners and the same evil warden. Let 0 < α < 1. The prisoners now open 2αn boxes (or more precisely, the nearest integer to 2αn). As before, all are killed if some prisoner does not see his or her name. Let pn (α) be the probability of success using the strategy suggested by Fumei. For what α do we have limn→∞ pn (α) = 1/2? ● Suppose now the 100 prisoners each open 99 boxes, and if any prisoner sees their name, then all are killed. Note that this time it is bad to see your own name. What is the optimal strategy and the corresponding probability of success? Here it turns out that optimality is much easier to prove than in the original problem.” After the break, Professor Blakeley said that it was time to do a little mathematical biology since this was currently a very hot subject. The class would discuss the motion of flies. “Suppose that a man and fly start at the point 0 on the x-axis. The man walks at a pace of 4 feet/second in the positive x-direction. The fly flies at a speed of 10 feet/second, moving back and forth between the man and the point 0. We assume that the fly can change its direction instantaneously. Where will the fly be after one minute?”



Daniel’s hand went up immediately. “This is just the problem von Neumann solved by summing the infinite series. The fly flies for one minute at 10 feet/second, so it has flown a total distance of 10 × 60 = 600 feet.” “But where does the fly end up?” “Well, you just . . . .” Daniel was stuck. “Wait a minute!” said Emiliano. “Something is fishy about this problem. How does the fly actually start moving?” “You hit the nail on the head!” said Professor Blakeley. “The fly’s motion is undefined. You can see this by placing the man at the point x = 240 (where he will be after one minute) and the fly anywhere between the man and the point 0. Now run the scenario backwards, and after one minute you will end up with the same initial condition as our problem: the man and fly are both at 0. The original problem is an instance of what is called ‘summing an infinite series from the wrong end.’ ” “What is the problem that Daniel referred to?” asked Patrick. Professor Blakeley nodded at Daniel, who said, “I get it now! In the problem solved by von Neumann, two trains were approaching each other, and the fly was flying between the two trains, starting at one of the trains. Now there is no problem getting started. The infinite number of oscillations come at the end, so we have a normal infinite series. The problem is to compute the distance the fly has flown, which is just its speed times the time it takes for the trains to meet. When von Neumann was asked this problem, he instantaneouly gave the correct answer. When the questioner said, ‘I see that you saw the trick,’ von Neumann replied, ‘What trick? I summed the infinite series.’ ”


Self-referential Mathematics PROBLEM LIST Distribution of terms of the Kolakoski sequence 1, 2, 2, 1, 1, 2, 1, 2, 2, . . . Nonperiodicity of the Kolakoski sequence Definition of the sequence 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, . . . Growth rate of the previous sequence √ Transcendentality of φ2−φ , where φ = 12 (1 + 5) Transcendentality of eπ Transcendentality of the positive real root of xx = x + 1 Growth rate of an+1 = an + a1n The recurrence an+1 = an − a1n The sequence 2, 3, 3, 2, 3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 2, . . . Wythoff’s Nim Analogue of Beatty’s theorem for triples Definition of the look and say sequence 1, 11, 21, 1211, 111221, . . . Only 1, 2, and 3 appear in the look and say sequence Length of the nth term of the look and say sequence 13 ⋅ 532 ⋅ 3853 ⋅ 96179 An autogram


138 138 139 139 139 140 140 140 140 141 142 144 144 144 144 146 147



Professor Blakeley started out the next class, after the customary greetings, by saying “Let’s take a look at some mathematical self-references. Who knows some examples?” “Liar paradox!” said Clayton. “Yes, the sentence ‘This sentence is false.’ is about as self-referential as you can get. It is also meaningless, at least at the level of rigorous mathematics. There are many variations, but meaningless statements are not so interesting. Can someone give a more significant example?” It wasn’t exactly clear what Professor Blakeley was looking for, but Sandra ventured, “G¨ odel’s incompleteness theorem? It is proved by constructing a statement that says ‘This statement is unprovable.’ If the system in which the statement lies is consistent, then the statement must indeed be unprovable.” “Excellent example!” said Professor Blakeley, only a little surprised that at least one of the students had some understanding of G¨odel’s famous theorem. “What I want to do for the remainder of the class is much more mundane. First we’ll consider some sequences which can be encoded by other sequences, but the encoding sequence is just the original one! “If this seems a little vague, then let’s look at the archetypal example, the sequence 1, 2, 2, 1, 1, 2, 1, 2, 2, 1, 2, 2, 1, 1, 2, 1, 1, 2, 2, 1, 2, 1, 1, 2, 1, 2, 2, 1, 1, . . . .” “This looks like the Kokakola sequence or some such name,” suggested Jung Wook. “Right! Well, almost right. It is actually called the Kolakoski sequence after the recreational mathematician William Kolakoski, though it was first discussed by Rufus Oldenburger in 1939, another confirmation of Stigler’s law of eponymy. Historically, the Kolakoski sequence was the second sequence to be included in the OEIS! (The first was the number of groups of order n, up to isomorphism.) Jung Wook, can you explain how the Kolakoski sequence is defined?” Jung Wook replied, “Its terms are all 1’s and 2’s, beginning with 1. If you read the sequence by the length of its runs (maximal subsequences of equal consecutive letters), then you obtain the same sequence!” “Exactly. We start with one 1, so we record 1. Then two 2’s, so we record 2. Then two 1’s, then we record 2, etc. Clearly it is uniquely defined by the properties given by Jung Wook. Let f (n) be the number of 1’s among the first n terms of the sequence. Rather surprisingly, it is not known whether limn→∞ f (n)/n exists. However, the limit is conjectured to exist and to equal 1/2 (no surprise). “Can someone tell me why the Kolakoski sequence isn’t eventually periodic?” Daniel’s hand went up immediately. “If the sequence were eventually periodic, then its run sequence would also be eventually periodic, giving a smaller period.” “Could you give some more details? By the way, I am happy to see that you correctly used the subjunctive mode.” Daniel smiled at this compliment and continued, “Let the Kolakoski sequence be a1 , a2 , . . . . Suppose it is eventually periodic of minimum period p. Let i ≥ 1 be any index after the periodicity begins for which ai ≠ ai+1 . Let b1 , b2 , . . . , bk be the run sequence of the sequence α = ai+1 , ai+2 , . . . , ai+p , that is, bj is the length of the jth run of α. Clearly k < p and b1 , b2 , . . . , bk is an eventual period, contradicting the minimality of p.”



“A good explanation, though a bit terse. Here is an interesting variant of the Kolakowski sequence: 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 9, 9, . . . . Who can see the definition?” After the previous problem, it was easy to figure out the definition of the current sequence g1 , g2 , . . . . It also was identical to its sequence of run lengths. The difference with the Kolakoski sequence is that here the sequence begins 1 and every difference gi+1 − gi is equal to 0 or 1. The number of i’s in the sequence is gi . “I don’t think there’s a nice formula for gn ,” said Professor Blakeley, “but we can ask for its rate of growth. Any ideas?” “Clearly gn ≤ n,” said Daniel. “Moreover, it is termwise larger than the weakly increasing sequence 1,2,2,3,3,3,4,4,4,4,. . . , where i occurs i times. For this sequence h1 , h2 , . . . we have h(n+1) = n. This suggests that gn will grow roughly like nα for 2

some 12 < α < 1.” “Good argument! If we’re lucky, we might even have gn ∼ cnα for some constant c > 0. Does anyone see a good way to guess the values of α and c?” After a few minutes some students had an answer, and the Professor chose Fumei. “By the definition of gn we have immediately that gg1 +g2 +⋅⋅⋅+gn = n + 1.


Approximating gn by cn , then g1 + ⋅ ⋅ ⋅ + gn is about cnα+1 /(α + 1) since ∫ xα dx = xα+1 /(α + 1). We can approximate n + 1 on the right-hand side of equation (15.1) by n, so we get α


cnα+1 ) = n. (15.2) c( α+1 √ 2 Thus nα +α = n, so α = 12 (−1 + 5) = φ − 1, where φ is the golden mean. Also looking at the coefficient of n on both sides of equation (15.2) gives cα+1 = 1, (α + 1)α which gives c = φ2−φ .” “Great! With some careful analysis it can in fact be shown that (15.3)

gn ∼ φ2−φ nφ−1 .

This result would seem quite mysterious to someone who did not know Fumei’s elegant heuristic argument, and this argument points the way to a rigorous proof. I should mention that the sequence g1 , g2 , . . . is known as Golomb’s sequence or sometimes as Silverman’s sequence. “By the way, is the coefficient φ2−φ appearing in equation (15.3) a transcendental number? Presumably you know that a transcendental number is a complex number that is not the zero of a nonzero polynomial with rational (or equivalently, integer) coefficients.” “Gelfond-Schneider!” said Patrick. “Can you explain?” “The Gelfond-Schneider theorem says that if a and b are complex algebraic numbers and b is irrational, then ab is transcendental. There are some obvious exceptions like a = 0 or a = 1.”



“Precisely! In fact, a = 0, 1 are the only exceptions. Thus for instance 2 2 is transcendental, as is φ2−φ . Can anyone tell me why eπ is transcendental?” Daniel and Patrick quickly raised their hands, followed by Jung Wook, to whom √ the Professor pointed. “The complex number i = −1 is algebraic and irrational, yet eπi = −1, which is not transcendental (or even irrational). By Gelfond-Schneider, eπ cannot be algebraic.” “Good! There are also some ‘nonstandard’ applications of the GelfondSchneider theorem. For instance, is the unique positive real root of the equation xx = x + 1 transcendental?” The students did not find this problem to be very challenging. Sandra was asked to explain. “If x is algebraic and irrational, then by Gelfond-Schneider xx is transcendental. But xx = x + 1, which is algebraic, a contradiction. There remains p/q

the case that x is rational, say p/q in lowest terms. Then ( pq ) = 1+ pq , so q (as well as p) is a qth power of an integer. Thus q = 1, but no integer satisfies pp = p + 1.” “You got it! Let’s go back to the Golomb-Silverman sequence, where Fumei was able to guess the rate of growth. There are numerous other tricks for guessing the growth rate of a sequence. Suppose for instance that a1 = 1 and 1 , n ≥ 1. (15.4) an+1 = an + an What do you think is the growth rate of an ?” There were no immediate volunteers, though Professor Blakeley realized that most of the students would be able to solve this problem within 30 minutes or so. He said, “The idea is that an is growing sufficiently slowly so that an+1 − an is close to a derivative. That is, let f (x) be a smooth increasing interpolation of an , so f (n) = an . Then f (x + 1) − f (x) will be close to (f (x + h) − f (x))/h for h ′ small, so our (x) = 1/f (x). The solutions √ recurrence (15.4) is approximated by f √ are f (x) = 2x + c, and indeed it turns out that an ∼ 2n. Of course this needs to be proved rigorously, but this is only a ‘technical difficulty.’ “Note that this method of guessing the answer breaks down completely for a recurrence like an+1 = an + a2n . Now an is growing so quickly that f (x + 1) − f (x) is not a good approximation to f ′ (x). “A superficially similar recurrence is 1 , an+1 = an − an with the initial condition a1 = 2. Can anyone see what is happening?” After spending less than a minute with his laptop computing several hundred approximate values, Patrick said, “It seems to be kind of random. If an is large, then we are subtracting a small number to get the next term. Eventually we will get close to 1. After that an+1 = an − a1n will be close to 0, so ∣an+2 ∣ again will be large, and an+2 could be positive or negative. I don’t see any way of being more precise.” “Good insight! Indeed, it seems hopeless to determine exactly how close an will be to 1 in this scenario. In fact, it is an open question whether the sequence a1 , a2 , . . . is bounded. Note that replacing √ the recurrence with the differential equation f ′ (x) = −1/f (x) leads to f (x) = c − 2x, which is nonsense. However, if an is very large, we can estimate how many steps are necessary before we get close to 1. Does anyone see this?”



Some hands √ went up, and the honor went to Emiliano. “If c is large compared to x and f (x) = c − 2x, then f (x + 1) −√f (x) is a good approximation to f ′ (x) = √ −1/f (x). Thus if ∣an ∣ is large and an ≈ ± c − 2x, then an−1 ≈ ± c − 2(x − 1). This suggests that it will take about 12 a2n steps to get near 1.” “Great! Well, it’s time to get back to self-referential sequences. Let’s look at one more, superficially similar to the Kolakoski sequence, namely, 2, 3, 3, 2, 3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 2, 3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 3, 2, . . . .” The students immediately computed that the run sequence is 1, 2, 1, 3, 1, 3, 1, 2, 1, 3, 1, 3, 1, 3, 1, . . . , obviously not the original sequence. However, it didn’t take long for them to see that if the 1’s are deleted from the run sequence, then we did get back the original sequence. Clearly there was a unique such sequence of 2’s and 3’s beginning with 2. Patrick said that it reminded him of some old Putnam problem, but he couldn’t remember the details. “You are right about the Putnam,” said Professor Blakeley. “It was Problem A-6 on the fifty-fourth (1993) exam. The problem asked to show that there is a real number r such that the numbers n = 1 + ⌊rm⌋, m ≥ 0, index the terms of the sequence equal to 2. That is, if the sequence is a1 , a2 , . . . , then the mth term equal to 2 is a1+⌊r(m−1)⌋ . Out of a total of ten points, one participant received nine points, and no one else received more than four points. Can we do better? The first step is to figure out the value of r, similarly to how Fumei found heuristically the asymptotic behavior of Golomb’s sequence.” Hands quickly were raised, and Emiliano was allowed to explain. “Let n = rm (approximately). Then among the first n terms about m of them are equal to 2, so n − m = (r − 1)m of them are equal to 3. These terms describe n runs of 3’s, where the total number of 3’s is 2m + 3(r − 1)m. Thus among the first n + 2m + 3(r − 1)m = (4r − 1)m terms, n = rm of them are equal to 2. Hence we should have 1 rm = . (4r − 1)m r √ √ Solving for r gives r = 2 ± 3. Since clearly r > 1, we must have r = 2 + 3.” “Yep, that’s half the battle! Now comes the unpleasant technical part: √ how to prove that the terms equal to 2 are indexed by the numbers 1 + ⌊(2 + 3)m⌋ for m ≥ 0.” Professor Blakeley thought that it would be unrealistic for the students to come up with a quick proof, so he deferred it to the problem set. “We just found a nice formula for the indices (or positions in the sequence) of the terms equal to 2. Now what about the terms equal to 3?” “Beatty’s theorem!” exclaimed Daniel and Clayton almost simultaneously. “Right you are,” said Professor Blakeley. “Daniel, how about explaining what Beatty’s theorem is and its applicability here?” “Let x, y be two positive irrational numbers related by 1 1 + = 1. x y



Then Beatty’s theorem says that every positive integer occurs exactly once among the numbers ⌊mx⌋ and ⌊my⌋ for m ≥ 1. √ √ “If we put x = 2 + 3, then we compute that y = 12 (1 + 3). Therefore the √ (1 + 3)⌋, m ≥ 1.” terms equal to 3 are indexed by the numbers 1 + ⌊ m 2 “Exactly! Beatty’s theorem is a handy tool for problem-solving competitions. Here is another problem where it is used, but it is not at all a priori obvious that it is relevant. “I’m sure you’re familiar with the game of Nim, where we begin with a finite number of nonempty piles of counters (or coins or stones) with various (finite) numbers of counters in each pile. There are two players, Alice and Bob. (Alice and Bob seem to play a huge number of different games. I wonder how they find time to do anything else.) They move alternately, with Alice moving first. A move consists of taking any positive number of counters from a pile. Whoever is first unable to move, i.e., who is faced with no remaining counters, is the loser. There is an elegant strategy for playing Nim based on the binary expansion of the terms of the sequence. You can look this up later if you haven’t seen it already. Note that Nim with just two piles of sizes a and b is trivial to analyze. If a = b, then Bob wins by keeping the two sizes equal. Otherwise Alice wins by the same strategy. “Now, however, we will look at a variant called Wythoff ’s Nim or Wythoff ’s Game. We begin with just two piles, but in addition to removing a positive number of counters from one pile, a player can remove the same positive number of counters from both piles. Let us call a state of the game a position. For what positions does Bob win when Alice moves first? We can denote a position by the pair (c, d), where there are c counters in the first pile and d in the second. By symmetry, we might as well assume that c ≤ d.” The students started off by computing some small winning positions for Bob, so they could get some feeling for the game. Professor Blakeley suggested that a winning position for Bob should be called safe. The key properties of safe positions are that any move from a safe position gives an unsafe position, and from any unsafe position there is some move giving a safe position. Clearly (0, 0) was safe, since Alice cannot move on her first turn. The positions (c, d) with c ≤ d from which we can reach (0, 0) in one move are (0, a) and (a, a) for any a ≥ 1, so these are winning for Alice. The unique position (c, d) with c ≤ d from which every move leads to a position already known to be a win for Alice is (1, 2), so (1, 2) is safe. Similarly, the next safe position is (3, 5). Continuing in this way, we get the following safe positions (a, b), with a ≤ b (computed very quickly by Patrick, of course):


(0, 0), (1, 2), (3, 5), (4, 7), (6, 10), (8, 13), (9, 15), (11, 18), . . . .

A good way to visualize these positions is by drawing a diagram of the points (a, b) ∈ N2 . When a position (a, b) is winning for Bob, we can circle it and cross out all points directly to the right (east), above (north), or northeast of (a, b), and then circling the points closest to the origin that are not circled or crossed out.



The students quickly guessed from the data in equation (15.5) that (a) the differences b − a increased by 1 at each step, and (b) every positive integer appeared exactly once. These rules determined the positions (a, b) uniquely. Equivalently, we can order the safe positions as (a0 , b0 ), (a1 , b1 ), . . . such that bn − an = n and for n ≥ 1, an is the least positive integer not equal to any ai or bi with i < n. Thus they needed to show that these indeed were the set of safe positions, and then they needed to figure out what this had to do with Beatty’s theorem. Soon they had it solved. First Jung Wook gave the proof that the safe positions were determined by the above rules. “The proof is by induction on n, the base case n = 0 being clear. Assume for n−1. For the next safe position (an , bn ) (that is, with the least possible value of an ), an cannot equal a previous ai or bi , since otherwise we could move from (an , bn ) to (ai , bi ). Similarly bn − an ≥ n. It is easy to check that if we let an be the least positive integer not among the previous ai ’s and bi ’s, and set bn = an + n, then every move is to an unsafe position. Thus (an , bn ) is safe, so the proof follows.” Daniel explained the connection with Beatty sequences. “Jung Wook showed that the first coordinates a0 < a1 < ⋯ of safe positions (an , bn ) with an ≤ bn are obtained recursively by starting at step 0 with a0 = 0, with no positive integer ‘crossed out.’ At step n let an be the least positive integer not crossed out, and cross out an +n. Thus a1 = 1. We cross out a1 +1 = 2, so a2 = 3. Cross out a2 +2 = 5, bn = an + n. I claim that an = ⌊nφ⌋ and bn = ⌊n(φ + 1)⌋, so a3 = 4, etc. Moreover, √ where φ = 12 (1 + 5), the golden mean. Clearly then bn = an + n. Note that 1 1 + = 1. φ φ+1



Hence by Beatty’s theorem every positive integer is equal to exactly one of an or bn , n ≥ 1. This proves the claim.” “All right!” said Professor Blakeley. “There is a natural generalization of the ‘Beatty pair’ (x, y) to triples. Let us call a triple (x, y, z) of positive real numbers a Blakeley triple if every positive integer occurs exactly once among the numbers ⌊mx⌋, ⌊my⌋, ⌊mz⌋ for m ≥ 1. I have proved an amazing theorem that if (x, y, z) is a Blakeley triple, then x, y, and z are all imaginary parts of a zero of the Riemann zeta function on the critical line!” This fact seemed truly astonishing, almost unbelievable, to the students until Fumei realized what was going on. “I take it that Blakeley triples don’t exist?” she asked the Professor. “Sadly, that is a result of Uspensky from 1927. Admittedly it takes some of the luster off my theorem. I have put Uspensky’s result on today’s problem set. “Let’s get back to reading a sequence by the lengths of its runs. This subject reminds me of the famous sequence 1, 11, 21, 1211, 111221, 312211, 13112221, 1113213211, . . . . Who can see how it is defined?” Professor Blakeley was a little surprised that none of the students seemed to have run across this sequence before, as no one raised their hands or spoke immediately. With the big hint that the sequence was related to lengths of runs, soon however some students figured out the rule. Patrick was chosen to explain. “Given the nth term Cn of the sequence, for each run aa⋯a record the length of the run followed by the number a. The sequence of such pairs, one for each run of Cn , gives the next term Cn+1 . Thus each term is a ‘description’ of the previous term. The number of terms of Cn+1 is twice the number of runs of Cn .” “Exactly! Another way to say this is that given the nth term Cn , to obtain the next term read Cn from left-to-right by saying ‘k j’s’ when we encounter a run of k consecutive j’s. Thus we read 1 as ‘one 1’ or 11. Next we read 11 as ‘two 1’s’ or 21. Next we read 21 as ‘one 2, then one 1’ or 1211. Next we read 1211 as ‘one 1, then one 2, then two 1’s,’ etc. For this reason the sequence is known as the look and say sequence. It is due to John Conway. “One simple property which I leave as an exercise is that only the digits 1,2,3 can appear. The big question is to determine the length Ln of the nth term in the sequence. Thus L1 = 1, L2 = 2, L3 = 2, L4 = 4, etc. There is no simple formula for Ln , but there is an explicit formula for its generating function F (x) = ∑ Ln xn . n≥1

Can anyone see what this is?” By the time Professor Blakeley asked this question, Patrick had already computed a few dozen values of Ln , but he could not make sense of the data. Professor Blakeley waited a couple of more minutes and continued, “I mentioned before1 that you need to develop a facility for analyzing data. This kind of problem arises all the time in research. I am surprised that no one noticed that F (x) is just a rational 1

Chapter 4



function! When written in lowest terms it has the form P (x)/(1 − x)Q(x), where P (x) = x + x2 − x3 − x4 − x5 − x6 − x8 + 4x9 + 6x10 − 6x12 − 8x13 +x14 + 5x15 + 4x16 + x17 + 4x18 + x19 − 3x20 − 6x21 − 8x22 −12x23 + 6x24 + 36x25 + 20x26 + x27 − 58x28 − 34x29 + 30x30 +20x31 + 23x32 − 35x33 + 9x34 + 26x35 − 8x36 − 12x37 −42x38 − 7x39 + 79x40 + 13x41 − 16x42 − 14x43 − 107x44 +65x45 + 33x46 + 32x47 + 39x48 − 126x49 − 38x50 + 25x51 +66x52 + 64x53 − 89x54 + 8x55 − 45x56 + 15x57 + 27x58 −15x59 + 44x60 + 56x61 − 54x62 − 41x63 + 11x64 + 21x65 +50x66 − 62x67 + 19x68 + 4x69 + 4x70 − 15x71 − 31x72 +22x73 + 20x74 − 18x75 + 18x76 − 18x77 + 18x78 − 12x79 and Q(x) = 1 − x2 − 2x3 − x4 + 2x5 + 2x6 + x7 − x8 − x9 − x10 − x11 =

−x12 + 2x13 + 5x14 + 3x15 − 2x16 − 10x17 − 3x18 − 2x19 +6x20 + 6x21 + x22 + 9x23 − 3x24 − 7x25 − 8x26 −8x27 + 10x28 + 6x29 + 8x30 − 5x31 − 12x32 + 7x33 −7x34 + 7x35 + x36 − 3x37 + 10x38 + x39 − 6x40 −2x41 − 10x42 − 3x43 + 2x44 + 9x45 − 3x46 + 14x47 −8x48 − 7x50 + 9x51 + 3x52 − 4x53 − 10x54 − 7x55 +12x56 + 7x57 + 2x58 − 12x59 − 4x60 − 2x61 + 5x62 +x64 − 7x65 + 7x66 − 5x67 + 12x68 − 6x69 + 3x70 − 6x71 .”

The students were not surprised at being chastised by Professor Blakeley for not immediately seeing this answer by inspection or intuition, but they were surprised at the amazing answer itself. Why did Ln satisfy a linear recurrence with constant coefficients of order 72 and no smaller order? How did anyone ever find such a result? Anticipating these obvious questions, Professor Blakeley continued, “The key to understanding the sequence is the fact that after the eighth term (explaining why deg P (x) − deg(1 − x)Q(x) = 8), every term Cn is a word in an alphabet consisting of 92 basic sequences, called atoms. Each atom evolves independently from the others. The simplest atom is 22, which evolves into itself (two 2’s). Other examples of atoms are a = 3112221, b = 132, and c = 13211. Then a evolves into bc, etc. Of course there are a lot of things to check, but it is just a finite computation. Conway and collaborators who helped with this process must have had a lot of perserverance, since a priori it is not clear there are only finitely many atoms. Maybe they would have become discouraged after finding 90 atoms and would have called it a day. “Let fi (n) be the number of occurrences of the ith atom (after we order them in some way) in Cn . You can see from what I have said that we will get a system of 92 simultaneous linear recurrences of order one (that is, fi (n+1) is a linear combination of fj (n)’s). By standard techniques just like we did for computing ur (n) when we



discussed Stern’s triangle,2 we see that Ln will satisfy a linear recurrence with constant coefficients of order 92. One can show that the characteristic polynomial for this recurrence is x18 (x − 1)2 (x + 1)x71 Q(1/x). From the structure of the 92 recurrences it can be seen that the factor (x − 1)(x + 1) will cancel out of this characteristic polynomial, so the smallest order of a linear recurrence is actually 72, not 74. The factor 1 − x in the denominator arises from the atom 22, which evolves to itself. “From the theory of linear recurrences with constant coefficients, we know that if (1 − x1 )Q(1/x) has a unique zero λ of maximum absolute value greater than 1, where λ is a positive real number, then √ Ln+1 = λ. lim n Ln = lim n→∞ n→∞ Ln Such a zero does exist here, and λ = 1.303577269034 . . . . So you see, the look and say problem is really quite simple. “More challenging is what happens when we start with an arbitrary first term like 41122222. Here the next term will be 142152. However we start, except 22, we will always have this same rate of growth. This is, if we start with a nonempty word (a nonempty sequence of positive integers of finite length) w ≠ 22 and let Ln (w) be the length of the nth term, then limn→∞ Ln (w)1/n = λ. This takes quite a bit of additional work to prove. There will be additional atoms that don’t increase the overall growth rate. “Now for a change let’s do something really simple,” said Professor Blakeley, “namely, multiplying integers, which you learn in elementary school. Here is the prime factorization of an integer: 13 ⋅ 532 ⋅ 3853 ⋅ 96179. What is the integer?” This did not seem to be such a challenging problem. A few students did the computation on their laptops or phones. “Amazing!” they agreed. What could be amazing about the decimal expansion of some integer?, thought the remaining students. Even if it were all 9’s, for instance, it wasn’t so surprising that some relatively small number of the form 10n − 1 would have a complicatedlooking factorization. Patrick went to the board where Professor Blakeley had written the product and erased the dots separating the different factors. “That’s the answer,” he said, “though the fifth digit, a 2, is written rather sloppily.” By now the other students had checked that the product was indeed 13532385396179. “What could be simpler?” said Professor Blakeley. “I wonder whether this elegant rule applies to all products of prime powers. If it does, then we have made a big breakthrough in the fields of computational complexity and cryptography.” Unfortunately, Clayton was quick to point out some counterexamples, such as 22 ≠ 22 and 2 ⋅ 3 ≠ 23. A near miss is 25 ⋅ 92 = 2592. 2

Chapter 6



“That’s a shame,” said Professor Blakeley. “The case where it does work was found by the amateur mathematician James Davis on June 4, 2017, thereby receiving an award of $1000 from John Conway. I’m not sure if it’s known whether or not there are such examples that are squarefree, so we don’t have to bring down an exponent. “My final example of self-reference is not as mathematical as the ones we have just looked at. Does anyone know what an autogram is?” “A telegram you send to yourself?” suggested Fumei. “Nice try, but that doesn’t have much of a self-referential character. No, an autogram is a sentence which provides an inventory of its own characters. There are many variants, but a good example is the following sentence: This sentence employs two a’s, two c’s, two d’s, twenty-eight e’s, five f’s, three g’s, eight h’s, eleven i’s, three l’s, two m’s, thirteen n’s, nine o’s, two p’s, five r’s, twenty-five s’s, twenty-three t’s, six v’s, ten w’s, two x’s, five y’s, and one z. “You really need a computer to find such sentences.”


All Good Things Must Come to an End PROBLEM LIST √ 2n ∑n≥0 ( n )xn = 1/ 1 − 4x n 2k 2(n−k) ∑k=0 ( k )( n−k ) = 4n Partitions of n into odd parts and into distinct parts Combinatorial proof that po (n) = q(n) Partitions with no part appearing once vs. parts congruent to ±1 (mod 6) Partitions with no part appearing exactly 1, 2, 3, 5, 7, or 11 times Monic polynomials of degree n over Fq with no irreducible factor of multiplicity one Monic squarefree polynomials of degree n over Fq , etc. Power series f (x) such that the coefficient of xn in f (x)n+1 is equal to 1 The Lagrange inversion formula Power series g(x) such that the coefficient of xn in g(x)2n+1 is 1 Coefficients of the power series y satisfying y = 1 + xy k Power series ι(x) such that the coefficient of xn in ι(x)n is 0, n > 1 Cutting a birthday cake with frosting on top The three-gap theorem


150 150 151 151 151 152 152 152 153 153 155 155 156 157 158



The last day of class finally arrived. Professor Blakeley dressed appropriately with a black suit, smartly matching his black shoes, black shirt, and black tie. “A sad day has arisen in the history of mathematics, the final class of this summer’s Problem Solving Camp,” he said, as he pulled a black handkerchief from his pocket and dabbed his eyes. “One last chance for some mathematics! One of my favorite topics is generating functions, so let’s look at some nice problems in this area. You are probably already familiar with some basic generating function techniques. For instance, we have 2n n 1 , (16.1) ∑ ( )x = √ n 1 − 4x n≥0 as may be seen by applying the binomial theorem α (1 + t)α = ∑ ( )tk , k≥0 k where we define (αk) by equation (6.1), to the case t = −4x, α = − 21 , and simplifying. If we square both sides of equation (16.1) and take the coefficient of xn , then we obtain the identity n 2k 2(n − k) ) = 4n . ∑ ( )( n−k k=0 k This simple identity, which I mentioned in an earlier class,1 is surprisingly difficult to prove combinatorially. “You probably also know about using generating functions to prove partition identities. Recall2 that a partition of n ≥ 0 is a weakly decreasing sequence λ = (λ1 , λ2 , . . . ) of nonnegative integers λi satisfying ∑ λi = n. Each term λi > 0 is called a part of λ. If p(n) is the number of partitions of n, then Euler observed that 1 n . ∑ p(n)x = ∏ i n≥0 i≥1 1 − x This identity can be seen by inspection by expanding each factor 1 = ∑ ximi . 1 − xi mi ≥0 Picking the term ximi from this factor corresponds to requiring the partition λ to have mi parts equal to i. “In the same way, if q(n) is the number of partitions of n into distinct parts, and po (n) is the number of partitions of n into odd parts, then ∑ q(n)x




i ∏(1 + x ) i≥1

= = =

1 − x2i i i≥1 1 − x 1 ∏ 2i−1 1 − x i≥1 ∏

n ∑ po (n)x , n≥0

1 2

Equation (6.2). Chapter 8



whence we obtain Euler’s famous result q(n) = po (n). “As an aside, it is natural to ask for a combinatorial proof that po (n) = q(n), i.e., an explicit bijection from the set of partitions of n with odd parts to those with distinct parts. Has anyone seen such a bijection before?” Some hands went up, and Clayton was selected. “Let λ be a partition of n into odd parts with m2i−1 parts equal to 2i − 1. Thus n = ∑(2i − 1)m2i−1 . i

Write each m2i−1 as a sum of distinct powers of two (binary expansion) and distribute 2i − 1 over the sum. We obtain a partition of n into distinct parts, since every positive integer can be written uniquely as the product of an odd integer and a power of two. The process can be reversed, so we have a bijection.” “Good! As an example, let λ have five parts equal to 9, twelve parts equal to 5, two parts equal to 3, three parts equal to 1, and no other parts. Then we have 112

= 9 ⋅ 5 + 5 ⋅ 12 + 3 ⋅ 2 + 1 ⋅ 3 = 9(1 + 4) + 5(4 + 8) + 3 ⋅ 2 + 1(1 + 2) = 9 + 36 + 20 + 40 + 6 + 1 + 2.

Some other bijections are known, but this is the most straightforward. “Let’s try a variant that is a little more subtle. Show that the number of partitions for which no part appears exactly once is equal to the number of partitions into parts not congruent to ±1 (mod 6).” After a while some hands appeared, and Professor Blakeley pointed to Fumei. “Let f (n) be the number of partitions for which no part appears exactly once. Then ∑ f (n)x




2i 3i ∏(1 + x + x + . . . ) i≥1


∏ (1 + i≥1

x2i ) 1 − xi

1 − xi + x2i 1 − xi i≥1



1 − x6i . 2i 3i i≥1 (1 − x )(1 − x )

Since every multiple of 6 is of the form 2i and 3j, we have 1 1 =∏ , ∏ 2i )(1 − x3i ) 6i )2 (1 − x6i−4 )(1 − x6i−3 )(1 − x6i−2 ) (1 − x (1 − x i≥1 i≥1 just by rearranging the denominator factors. Therefore 1 n , ∑ f (n)x = ∏ 6i )(1 − x6i−4 )(1 − x6i−3 )(1 − x6i−2 ) (1 − x n≥0 i≥1 and the proof follows.” “Very nice!” said Professor Blakeley, thinking that Fumei and the other volunteers must have had a lot of experience with this type of problem to solve it so quickly. “This problem is just the first of a large class of similar ones,” he continued. “For example, the number of partitions of n with no part appearing exactly 1, 2, 3,



5, 7, or 11 times is equal to the number of partitions of n whose parts are congruent to 0, 4, 6, 8, 9, 12, 16, 18, 20, 24, 27, 28, 30, or 32 (mod 36). The proof follows from the identities (1 − x12 )(1 − x18 ) 1 − x − x2 − x3 − x5 − x7 − x11 = (16.2) 1−x (1 − x4 )(1 − x6 )(1 − x9 ) and (16.3) (1 − x12i )(1 − x18i ) 1 = . ∏ ∏ 4i )(1 − x6i )(1 − x9i ) (1 − x 1 − xi i≥1 i≡0,4,6,8,9,12,16,18,20,24,27,28,30,32 (mod 36) One source of such identities is the theory of ‘cyclotomic numerical semigroups.’ ” “There is another rather similar, but more subtle, application of the identity (16.4)

1 + x2 + x3 + x4 + ⋅ ⋅ ⋅ =

1 − x6 (1 − x2 )(1 − x3 )

used by Fumei,” continued the Professor. “Namely, let P (n) be the number of monic polynomials f (t) of degree n over Fq such that no irreducible factor of f (t) has multiplicity exactly equal to one. Then (16.5)

P (n) = q ⌊n/2⌋ + q ⌊n/2⌋−1 − q ⌊(n−1)/3⌋ , n ≥ 2.

“To prove this formula, let β(d) denote the number of monic irreducible polynomials of degree d over Fq . Since every monic polynomial is uniquely a product of monic irreducible polynomials, we get the well-known formula 1 = ∏ (1 − xd )−β(d) . (16.6) 1 − qx d≥1 Let F (x) = ∑n≥0 P (n)xn . Analogously to equation (16.6) we have β(d)

F (x) =

∏ (1 + d≥1

x2d ) 1 − xd



∏( d≥1

1 − x6d ) (1 − x2d )(1 − x3d )


It then follows from (16.6) that F (x) = =

1 − qx6 (1 − qx2 )(1 − qx3 ) x (1 + q)(1 + x) 1 + qx + qx2 − , − + q q(1 − qx2 ) q(1 − qx3 )

from which it is easy to obtain equation (16.5). There is also a solution that does not use equation (16.4), but it is not as conceptual. “To test your understanding, try to show that the number of squarefree (that is, no irreducible factor has multiplicity greater than one) monic polynomials of degree n over Fq is equal to q n−1 (q − 1) when n ≥ 2. Similarly there is an analogous result arising from equations (16.2) and (16.3), and more generally, from the theory of cyclotomic semigroups. “So far we’ve been looking at rather standard generating function techniques for problem-solving purposes. Let’s switch gears and try something more challenging.



Find the unique power series f (x) = ∑n≥0 an xn with rational coefficients an such that for all n ≥ 0 the coefficient of xn in f (x)n+1 is equal to 1.” None of the students had any idea how to do this problem, so they started calculating. To begin, the case n = 0 gave a0 = 1. Then f (x)2 = 1 + 2a1 x + . . . , so a1 = 12 . Patrick quickly wrote some code which revealed that (16.7)

f (x) =

1 1 1 4 1 1 1 + x + x2 − x + x6 − x8 2 12 720 30240 1209600

691 1 x10 − x12 + . . . . 47900160 1307674368000 Both Daniel and Emiliano blurted, “Bernoulli numbers!” The Bernoulli numbers Bn are most succinctly defined by the generating function x xn = x . ∑ Bn n! e − 1 n≥0 (16.8)


Daniel and Emiliano had seen these numbers enough for the equation (16.8) to ring a bell, and with a little additional computation it didn’t take long to conjecture that xn x n . = (−1) B f (x) = ∑ n 1 − e−x n≥0 n! But how to prove it? None of the students could think of a way to get a handle on the n coefficients of ( 1−ex−x ) , until Sam raised his hand. “Maybe Lagrange inversion?” he said. “Righto!” said Professor Blakeley. “The Lagrange inversion formula is precisely the tool that is needed. Let me review this terrific result. “Let F (x) = ∑n≥0 an xn be a formal power series, say with complex coefficients. ‘Formal’ means that we do not regard F (x) as a function of x; F (x) is simply a way to record the sequence a0 , a1 , . . . . If G(x) = ∑n≥0 bn xn is another power series and b0 = 0 (denoted G(0) = 0), then we can compute F (G(x)) as a formal series, namely, F (G(x)) = a0 + a1 G(x) + a2 G(x)2 + . . . . Since b0 = 0, the smallest nonvanishing term of G(x)n has degree at least n, so there are only finitely many terms ai G(x)i that involve xn . Thus no infinite sums are involved in computing F (G(x)), so F (G(x)) exists as a formal series. Note that the series G(x) = x is the identity element for composition of series, i.e., F (G(x)) = G(F (x)) = F (x) for all F (x). “Suppose now that a0 = 0 and a1 ≠ 0. An easy induction argument shows that there is a unique formal series G(x) satisfying F (G(x)) = x. I leave for you the simple proof that if F (G(x)) = x, then also G(F (x)) = x. Thus the set of formal power series with a0 = 0 and a1 ≠ 0 form a group under composition. The series G(x) satisfying F (G(x)) = G(F (x)) = x is called the compositional inverse of F (x) and is denoted F ⟨−1⟩ (x). For instance, for 0 ≠ a ∈ C we have (

⟨−1⟩ x x ) , = 1 − ax 1 + ax

obtained by solving y = x/(1 − ax) for x. Similarly, (ex − 1)⟨−1⟩ = log(1 + x).



In general, however, there is no simple expression for F ⟨−1⟩ (x). However, the Lagrange inversion formula gives a formula for the coefficients of F ⟨−1⟩ (x) (or more generally, F ⟨−1⟩ (x)k ) which is sometimes useful. “I will use the notation [xn ]F (x) for the coefficient of xn in the power series F (x). Let F (x) satisfy F (0) = 0 and [x]F (x) ≠ 0, so F ⟨−1⟩ (x) is defined. Let k, n ∈ Z. The Lagrange inversion formula asserts that n


n[x ]F n


(x) = k[x k


x ) . ]( F (x)

It is surprising that the coefficients of powers of F ⟨−1⟩ (x) are related to coefficients of powers of x/F (x). “For the proof, we need to work with power series with finitely many terms (possibly none) that have negative exponents. Clearly the set C((x)) of such series forms a C-algebra (a vector space over C and a ring with obvious compatibility conditions) with the obvious definitions of addition and multiplication. In fact, C((x)) is even a field, for if F (x) ∈ C((x)), then we can write (uniquely) F (x) = xm F1 (x), where F1 (0) ≠ 0, and then set x−m 1 = , F (x) F1 (x) where 1/F1 (x) is the usual division of power series when F1 (0) ≠ 0. Moreover (though not relevant here), the field C((x)) is the quotient field of the ring C[[x]] of all formal power series over C. It is obtained from C[[x]] by inverting the single element x, that is, C((x)) = C[[x]][1/x]. “Okay, now we are all set up for the proof! Given any F (x) = ∑n an xn ∈ C((x)), we can define the formal derivative F ′ (x) in the obvious way: F ′ (x) = ∑ nan xn−1 = ∑(n + 1)an+1 xn . n


The key ingredient of the proof is the trivial observation that [x−1 ]F ′ (x) = 0.

(16.10) Now set

F ⟨−1⟩ (x)k = ∑ pi xi , i≥k

so xk = ∑ pi F (x)i . i≥k

Differentiate both sides to obtain kxk−1


∑ ipi F (x)


F ′ (x)



kxk−1 F (x)n


∑ ipi F (x)


F ′ (x).


Here we are expanding both sides of (16.11) as elements of C((x)). Thus, kxk−1 F (x)n


kxk−1 (a1 x + a2 x2 + ⋯)n

= kxk−n−1 (a1 + a2 x + ⋯)−n .



“We wish to take the coefficient of x−1 on both sides of (16.11). Since 1 d F (x)i−n , i ≠ n, i − n dx

F (x)i−n−1 F ′ (x) =

it follows from (16.10) that the coefficient of x−1 on the right-hand side of (16.11) is a1 + 2a2 x + ⋯ ) [x−1 ]npn F (x)−1 F ′ (x) = [x−1 ]npn ( a 1 x + a 2 x2 + ⋯ = [x−1 ]npn (

1 + ⋯) x

= npn . Hence kxk−1 = npn = n[xn ]F ⟨−1⟩ (x)k , F (x)n which is equivalent to (16.9). “Does everyone understand the proof?” No one said otherwise. The argument actually was quite elegant and easy to follow. Professor Blakeley continued, “Can someone explain how we can use Lagrange inversion to see that for all n ≥ 0, [x−1 ]

n+1 x ) = 1? 1 − e−x Perhaps Sam would like to explain, since he is already familiar with Lagrange inversion.” Sam said, “Define F (x) by f (x) = x/F (x), where f (x) is the power series we want, that is, [xn ]f (x)n+1 = 1. Putting k = 1 in equation (16.9) gives

[xn ] (

n[xn ]F ⟨−1⟩ (x) = 1, n ≥ 1, so

xn = − log(1 − x). n≥1 n

F ⟨−1⟩ (x) = ∑

Hence F (x) = 1 − e−x and f (x) = x/(1 − e−x ).” “Precisely! It is rather miraculous how it works out so elegantly. To test your understanding, I have included on today’s problem set the three problems below. I’m afraid you won’t have much time to work on them since this afternoon’s problem session is the last one.” 1 2 x + . . . (over the rationals) ● Find the unique power series g(x) = 1 + 13 x − 45 such that for all n ∈ N we have [xn ]g(x)2n+1 = 1. ● For any integer k ≥ 2 find the coefficients of the power series y = 1 + x + kx2 + . . . satisfying y = 1 + xy k . In fact, we can let k be an indeterminate. Note that y does not have a formal compositional inverse! ● Find the coefficients of the unique power series ι(x) = 1 + x − 12 x2 + . . . satisfying ι(0) = 1, [x]ι(x) = 1, [xn ]ι(x)n = 0 for n > 1.



To do this you will have to find the ‘correct’ form of equation (16.9) for k = 0. Plugging k = 0 into the equation as it stands gives no information. “Any questions?” said Professor Blakeley. “Why did you use the letter iota? Why not f , for instance?” asked Clayton. “I was hoping someone would ask that!” said Professor Blakeley. “It gives me an excuse to check your vocabulary. What is the meaning of the word iotacism?” A very strange word, thought the students, but none had seen it before. Daniel decided to take a wild guess. “Excessive attention to details,” he ventured. “Not a bad try,” said Professor Blakeley, “but unfortunately incorrect.” He paused to see if there were any further guesses, and then continued,“It means the excessive use of the letter iota.” Could this be a joke? How could a word with such an unlikely meaning come into existence? “I know what you must be thinking, but there really is such a word,” said Professor Blakeley. “It arose in the subject of Greek phonology. A lot of vowels and diphthongs in ancient Greek converged in pronunciation so they now sound like iota in modern Greek. The more specific meaning of iotacism refers to this vowel change, but more generally it means the excessive use of the letter iota. “Now that you understand iotacism, let’s try another one. What is the meaning of ‘lambdacism’ ?” “Well,” said Sandra, “it probably doesn’t mean the excessive use of the letter lambda. That would be too easy.” “Right you are! Note that the sound to which λ refers is not a vowel, so it wouldn’t be involved in the vowel shift from ancient to modern Greek. It actually refers to a medical condition or speech impediment preventing the proper pronunciation of the sound of the letter l (denoted /l/ in the International Phonetic Alphabet). “Let’s try one more,” said Professor Blakeley. “What is the meaning of ‘epsiloncism’ ?” “Either excessive use of the letter epsilon or its sound, or faulty pronunciation of the sound corresponding to epsilon,” suggested Emiliano. “Don’t they teach vocabulary anymore in schools?” replied Professor Blakeley. After a short pause to increase the tension he continued, “What gave you the idea that epsiloncism is a word?” Emiliano did not bother to answer this rhetorical question, but he did ask, “What are all the Greek letters x for which xcism is a word?” “A natural and important question! It might not be so easy to give a definitive answer, but in addition to iotacism and lambdacism, I did find gammacism, deltacism, zetacism, etacism (also known as itacism), thetacism, kappacism, mucism (but not related to the letter μ), rhotacism, sigmacism, and omegacism online. “Can you guess what word describes the excessive use of XOR gates in electrical circuits?” Several students groaned when they realized that the word could only be exorcism. “Okay, enough vocabulary lessons,” continued Professor Blakeley. “I don’t want to get fired for teaching the wrong subject, so let’s get back to math. “We began this camp with a discussion of cutting cheese, chocolate bars, and cakes, so it is appropriate to end on the same topic. We have a cylindrical birthday



cake with frosting on the top. Let 0 < θ < 2π. Cut out a wedge that makes an angle θ, remove it from the cake, turn it upside down, and insert it back into the cake. Now cut another wedge of angle θ adjacent (say counterclockwise) to the first one and do the same procedure. Continue in this way, always removing a wedge adjacent in the counterclockwise direction to the wedge just removed and inserted, and then turning it upside down and inserting it back into the cake. For what angles θ will we have all the frosting back on top after a finite positive number of moves?” “There’s an obvious conjecture,” volunteered Daniel. “Oh?” “The angle θ must be a rational multiple of 2π. The condition is clearly necessary, since otherwise each cut will be in the interior of some previous intact piece P , so after we turn over the new wedge the piece P will have one part with frosting and one without.” “Hmm, that seems plausible, but perhaps you are overlooking something.” At first Daniel’s argument seemed ironclad, but then Jung Wook said, “When we turn over a piece of cake prior to reinserting it, it reverses the pattern of the frosted and unfrosted regions. Perhaps this will make a difference.” “Exactly! It makes quite a big difference. It takes some time to figure out what is going on, so let me give you a hint. Let n = ⌈2π/θ⌉. Clearly if 2π/θ = n, then after n flips all the frosting will be on the bottom, and after n more it will be back on top. So let’s assume 2π/θ ∈/ Z. When we make the kth cut, consider the set Sk of clockwise angles between that cut and all the radii that separate a frosted region from an unfrosted region. “For example, suppose that θ = 2. After one cut the region between the angles 0 and θ will be unfrosted, so S1 = {0, θ}. After the second cut, the unfrosted region is between 0 and 2θ, so S2 = {0, 2θ}. Similarly S3 = {0, 3θ}. The next cut at the angle 4θ will be inside the first cut. When we flip, the unfrosted region from 0 and 4θ (counterclockwise) becomes a frosted region from 3θ to 7θ, while the frosted region from 3θ to 0 becomes an unfrosted region from 7θ to 4θ. (We reduce all angles ψ modulo 2π so 0 ≤ ψ < 2π.) Thus after the flip the region from 3θ to 7θ is frosted, while the remaining cake from 7θ to 3θ is unfrosted. The radii between frosted and unfrosted regions are at 3θ and 7θ. Measuring these angles clockwise from the position 4θ gives S4 = {−3θ, θ}. Here is the picture.” 0

7θ 4θ θ 2θ



“What can you tell me about the sets Sk for any θ ≠ 2π/n? You can work on this together for a few minutes.” After a little while Sandra was chosen to speak. “I think we can show by induction that for any k, Sk is a subset of (16.12)

{−(n − 1)θ, −(n − 2)θ, . . . , −θ, 0, θ, 2θ, . . . , nθ}. ”

“Exactly! Very nice to get this so quickly! The proof by induction is straightforward. The difficulty is figuring out what to prove.” “The hint didn’t hurt,” said Sandra. “Undoubtedly, but I’m sure you could have done this without the hint given some more time. Now what can you conclude about getting all the frosting back on top?” “Since there are only finitely many possible Sk , and since given Sk we can reverse the procedure to get back to Sk−1 , it follows immediately that after finitely many steps we will get back to the frosting being on top. A crude upper bound for the number of steps is 22n−1 , the number of subsets of the set (16.12).” Sam then raised his hand. “I think I can show that if θ ≠ 2π/n, then the least number of steps to get the frosting back on top is 2n(n − 1). Moreover, we will never get all the frosting on the bottom as in the θ = 2π/n case.” “Fantastic! We can discuss this some more during the afternoon session. Sam’s result raises the following question. For any subset T of the set (16.12) it makes sense to apply the flipping operation, getting a new subset w(θ). As Sandra mentioned, flipping is reversible, so w is a permutation of the 22n−1 subsets of (16.12). What are the cycle lengths of this permutation? Sam’s result is that the cycle containing the empty set ∅ has length 2n(n − 1) when θ ≠ 2π/n. I am not aware of any work on this more general problem. “Here’s a cute problem on today’s problem set that has some similarity to the previous one. It is known as the three-gap theorem. Choose a positive integer n and angle 0 < θ < 2π. Choose a starting point on a circle, and place n − 1 further points at angles θ, 2θ, . . . , nθ from the starting point, say in the counterclockwise direction. Show that there are at most three distinct distances between pairs of adjacent points around the circle. “There are quite a few . . . ” began Professor Blakeley, who then glanced at the clock. Time was up. “Never mind, it looks like that does it for my lectures this summer. If only I had a few hundred more hours . . . . I hope you all had a good time and learned a lot of interesting mathematics. You were an excellent audience, and some day I expect you to become famous mathematicians! Maybe some of you will be giving this same math camp!” The students stood up and applauded. Professor Blakeley gave a wistful smile and followed the students out of the classroom.

Notes 1. First Day The cheese cutting problem for a 4×4×4 cube, where Jung Wook’s lower bound of six cuts is still achievable, appears in Gardner [64, pp. 51–52]. The general problem is discussed again by Gardner [65, p. 34]. In particular, equation (1.1) is due to Eugene Putzer and R. Lowen [131] in 1958. The remark by Clayton about a piece of cheese popping out of a Klein bottle is reminiscent of the end of a story by Martin Gardner [63]. Regarding the trivial cutting problem where each player cuts a piece along a grid line, Winkler [189, p. 93] says, “This ridiculously easy puzzle has been known to stump some very high-powered mathematicians for as much as a full day, until the light finally dawns amid groans and beatings of heads against walls.” The chocolate bar cutting games where one player can cut horizontally and the other vertically appears in volume 1 [18, pp. 24–26] of the monumental opus (four volumes in the A K Peters edition) by Berlekamp, Conway, and Guy. For an introduction to the theory of fair cake-cutting, see [49] and [93].

2. Polynomials Professor Blakeley’s amazing variant of Euler’s formula is based on a cartoon by A. Pun and R. Stanley [130]. The n − 1 inequalities that are equivalent to a real polynomial having only real zeros are explained by Gantmacher [62, Chap. XV, §9.1]. The origins of the formula for Δ(xn + axm + b) are obscure. Proofs were given by Swan [173] and by Greenfield and Drucker [78], based on the fact that the discriminant of a polynomial f (x) is essentially the resultant of f (x) and f ′ (x). See n also [69, pp. 406–407]. The discriminant of the polynomial ∑ni=0 xn! goes back at least to a more general result of Hilbert [85]. The polynomial of equation (2.8) is a generalized Laguerre polynomial. Its discriminant was found by Schur [146]. Sometimes a different but equivalent formula is given because of different conventions in defining the discriminant. For information on Newton’s log-concavity theorem, see [83, p. 52]. For a discussion of the terms “Cauchy-Binet formula” vs. “Binet-Cauchy formula,” see [21]. For some information on Cauchy’s contribution to the theory of determinants, see Muir [113, pp. 92–131]. The result that all zeros of f (x + 1) − uf (x) have real part γ − 12 has a short historical discussion and proof in [127, p. 577]. 159



3. Base Mathematics The sequence 10, 11, 12, 13, 14, 16, 17, 20, . . . appears in [38, puzzle 51]. For the sequence whose terms have vertical symmetry, see [64, p. 161], [66, pp. 205–206]. Michael Freedman [58, p. 122] was the first to find an “exotic” differentiable structure on R4 , while Clifford Taubes [178] showed that there are a continuum number of them. For Rn when n ≠ 4, see Stallings [153], in particular Corollary 5.2. For further information on Catalan numbers, including 214 combinatorial interpretations, see Stanley [165]. In particular, the sum ∑n≥0 C1n is discussed in Problem A65. For the greedy construction of sequences containing no three terms in arithmetic progression, see Odlyzko and Stanley [117]. For some generalizations and further work, see [48], [100], [111], [112], and the references therein. For further information related to the bounds on r3 (n) (equation (3.8)), see [175] and the references therein. The result (3.9) on an infinite sequence of positive integers with no three terms in arithmetic progression is due to Moser [110]. Maynard’s result about primes with a missing digit appears in [105]. See [16] for the algorithm of Bailey, Borwein, and Plouffe on the nth binary digit of π. For a nice survey of the computation of π see [17]. A good survey on the Morse-Hedlund sequence is given by Jean-Paul Allouche and Jeffrey Shallit [8]. For the minimum density of a letter in an infinite ternary squarefree word, see Khalyavin [90]. For the maximum density of a letter in an infinite ternary squarefree word, see Ochem [115]. The result about obtaining a squarefree word after replacing each ∗ with 0, 1, or 2 is due to Gasnikov and Shur [61]. For the fairest way to choose gifts and the largest power of x − 1 dividing a polynomial with ±1 coefficients, see [159] and the references given there. For the Fibonacci word and related sequences, see [20] and [50]. Regarding Professor Blakeley’s weight control plan, originally I planned to use this technique to help control the rate of inflation. I am grateful to Anna Ying Pun for suggesting its applicability to weight control.

4. A Mysterious Visitor Lagarias’s inequality (4.1) which is equivalent to the Riemann hypothesis appears in [99]. One reference for the elementary combinatorics of permutations, including Stirling numbers of the first kind, is the text of Stanley [161, §§1.3–1.7]. For the Lloyd-Shepp result (4.3) on the expected length of the longest cycle of a permutation w ∈ Sn , see [102]. The mysterious visitor is based on a former pet cat named Whitey (despite being mostly black). I used to amuse my children by demonstrating that Whitey was a mathematical genius. I would ask Whitey questions similar to the first question Professor Blakeley asked the cat, and Whitey invariably answered every question correctly by staying put. The error of Cayley appears in [33, p. 51]. An interesting related paper is that of Miller [106].



5. Set Theory The problem about uncountably many subsets of Z, any two of which have a finite intersection, appeared on the Fiftieth William Lowell Putnam Mathematical Competition (1989) [87, Problem B4, pp. 12, 111]. For the procedure related to the total b-ary expansion of n, see Wikipedia [74]. If we let bi = i + 2 for all i, then the fact that the procedure terminates is known as Goodstein’s theorem [73]. Of course the proof of termination when bi = i + 2 is exactly the same for the more general version given here with arbitrary bases bi . There are some interesting computations in the case bi = i + 2. For instance, if we start with a0 = 4 = 22 , then the largest term is 3 ⋅ 2402653210 . For some further examples of procedures that unexpectedly terminate, see [67]. Sylver Coinage is discussed in Berlekamp, Conway, and Guy [19, Chap. 18], summarized in [174]. ath The problem of partitioning R3 into (disjoint) unit circles appears in Komj´ and Totik [94, Chap. 13, Problem 13]. I am grateful to Maryanthe Malliaris for calling this result to my attention. The fact that the measurability of all subsets of R is consistent with ZF is a famous result of Solovay [151]. The result of Soifer about a chromatic number depending on the Axiom of Choice may be found in his paper [150], based on earlier similar results by Shelah and him. The problem of the infinitely many prisoners is discussed by Winkler [190, pp. 91–92, 99–101], including information on the history of this problem. The two books [189], [190] contain a number of other similar problems, though not involving set theory in their solution. For the result about one of five point masses escaping to infinity, see Saari and Xia [143]. Halpern’s result on a ball escaping a billiard table may be found at [82]. Mathematical billiards is actually a vast subject. For an introduction see Rozikov [141] and Tabachnikov [176]. 6. Two Triangles Stern’s diatomic array and diatomic series, as well as the Calkin-Wilf tree, had its genesis in a paper [44, p. 356] of Eisenstein, though some aspects can be traced all the way back to Kepler [89, Book III, chapter II]. An explicit treatment first appeared in a paper of Moritz Abraham Stern [171] in 1858. For the recurrences satisfied by ur (n) see Stanley [170]. The eponymous paper on the Calkin-Wilf tree, necessarily written by Calkin and Wilf, is [30]. For Profes3 sor Blakeley’s statement that the generating function for ∑i (ni) is not algebraic, see [162, Exer. 6.3]. The formula 12 (3 ⋅ 2n−1 − 1) for the sum of the elements of the Calkin-Wilf tree at height n is due to Stanley and was generalized by Franco [57]. The result (equation (6.10) in text) that the distribution modulo m of pairs (bn , bn+1 ) of consecutive terms in Stern’s diatomic sequence is the uniform distribution on pairs a, b generating Z/mZ, and its consequences such as equation (6.9), is due to Reznick [139]. The equation ∣f (n)∣ = ∑d∣n f (d), along with some similar equations, was proposed as a problem by Stanley [154], with published solution by Demos [42].



7. Independence Day The problem about choosing five numbers from a hat in increasing order is of ancient provenance and appears in numerous books and websites. Some problems analogous to that of counting subsets satisfying S1 ∩ ⋯ ∩ Sk = ∅ appear in Stanley [161, Exer. 1.32]. For information on equation (7.1) (subsets of Z/nZ that sum to 0), see Stanley [161, Exer. 1.105]. The fundamental bijection w ↦ w ˆ was first used by Alfr´ed R´enyi [137] and first systematically developed by Dominique Foata and Marcel-Paul Sch¨ utzenberger [51]. There is also a short discussion in [161, §1.3]. The integral appearing in equation (7.2) is discussed in [6, p. 133], while that appearing in equation (7.3) is a special case of Selberg’s integral [10, Ch. 8], [56], [91], first evaluated by Atle Selberg in 1944. The problem on choosing four points on the surface of a sphere appeared as Problem A6 on the 53rd William Lowell Putnam Mathematical Competition (1992) [87, pp. 27, 159–160]. Valtr’s formula (7.4) appears in [183]. The airplane boarding problem can be found in Winkler, [189, pp. 35–37]. The original source is unknown. For some further developments see Henze and Last [84]. 8. Independence Aftermath Lov´ asz and Winkler [103, p. 129], [104, Thm. 2] state that the problem about xi being the last unvisited point on an n-cycle is “folklore.” The result of Lov´asz and Winkler that only cycles and complete graphs have this independence property appears in [104]. Parking functions (though not described in terms of parking cars) were first enumerated by Ronald Pyke [133] in 1959. The interpretation in terms of parking cars is due to Alan Konheim and Benjamin Weiss [95]. The elegant proof of Pollak is cited by Riordan [140]. Parking functions have subsequently arisen in many additional contexts. A good survey is due to Catherine Yan [192]. The enumeration of tilings of rectangles by snakes, in a more general context, is due to R. Stanley [157] (published solution by William Chen [35]) and [161, Exer. 3.56]. For the problems about n and 2n snakes of size n, see Alexandersson and Jordan [7]. Igor Pak [124] has developed an interesting theory of tiling by snakes (which he calls ribbon tiles and are also known as ribbons or border strips). The remarkable formula (8.1) for p(n) is a famous result of Rademacher, building on previous work of Hardy and Ramanujan. For an exposition see for instance Andrews [9, Chap. 5]. Later Bruinier and Ono [29] found a complicated formula for p(n) as a finite sum of algebraic numbers. For the computation of p(1020 ) see Johansson [86]. 9. Amanda The “Next Card Red” game is discussed, with additional references, in Winkler [189, pp. 67, 71–72]. The precise provenance of the game where Alice and Bob choose either the first or last number is unknown. It is also discussed by Winkler [189, pp. 1–2].



The elegant proof of the noncommutative identity equation (9.1) is due to Sergey Fomin (personal communication), motivated by his paper [53, proof of Thm. 1.2]. The generalization due to Krob appears in [97]. A short discussion also is given by Reutenauer [138, §8]. For the result of Wilkie concerning a high school algebra identity, see [188] and the Wikipedia article [177]. The nonunique factorization (9.2) appears, in the context of partially ordered sets, in Birkhoff [22, p. 69], repeated in Stanley [161, Exer. 3.11(c)]. For information on subtraction-free complexity, see the paper [54] by Sergey Fomin, Dima Grigoriev, and Gleb Koshevoy. 10. An Aesthetical Error For further information on Mazur’s horrendous notation, see [163]. The uniqueness of F2 as a solution to equation (10.1) was noticed by Stanley [164]. The problem of finding the largest subspace of Mn (F) of nilpotent matrices for any field F goes back to Gerstenhaber. For some references and similar problems, see de Seguins Pazzis [147]. Additional information can be found in the slides of Quinlan [134]. The result of Adams on vector fields on spheres appears in [3]. Readers with a background in topology might be interested in the course notes of Miller [107]. In Lecture 2 of this reference appears the construction of ρn (n) linearly independent vector fields on an n-sphere using Clifford algebras. The argument that ρ16 (16) ≥ 9 is taken from Quinlan [134]. This reference also contains information on the largest affine subspace of Mn (F) consisting only of nonsingular matrices. The question of evaluating ρr (n) was raised in [167]. The largest subspace V of Mn (F) such that every matrix in V has at least one real eigenvalue is discussed in [169]. For basic information on octonions, see Wikipedia [116]. 11. Miraculous Cancellation For some information on the Kentucky Bend, the tiny enclave of Kentucky surrounded by Missouri and Tennessee, see [88]. For some further interesting geographical facts, including the closest state to Africa, see the Geography chapter of Winkler [189]. The result about factoring f (x)−g(y) is reported by Walter Feit [52, Thm. 1.1]. For information on the classification of finite simple groups, see [37]. The example of degree seven was taken from [125]. It is probably hopeless to track the provenance of Professor Blakeley’s joke about a formula being obvious. A variant attributed to Kakutani appears in Krantz [96, p. 3]. The Green-Tao theorem that there are arbitrarily long arithmetic progressions of (distinct) prime numbers was proved in 2004 by Ben Green and Terry Tao [76]. For examples of long arithmetic progressions of primes, see [77]. For examples of prime magic squares, see [128]. Erd˝os [46] showed in 1949 that there exist polynomials whose square has fewer terms. The first explicit example was given by R´enyi in 1947. It has 29 terms, while its square has 28 terms. The example given by equation (11.2) was found by Coppersmith and Davenport [40] in 1991. Abbott [2] showed in 2002, using the theory of Gr¨obner bases, that 12 is the minimum degree of a polynomial whose



square has fewer terms. The result about a polynomial of degree n − 1 whose square has fewer than cn0.81071... terms is due to Verdenius [184]. Further references are at [187]. The group G of infinite integer sequences (a1 , a2 , . . . ) is known as the BaerSpecker group or Specker group, after Reinhold Baer and Ernst Specker. For an introduction to its properties, see [15]. A somewhat different proof of nonfreeness is given by Blass [23]. 12. Probability Theory The problem about Alice choosing the larger or smaller of two numbers written by Bob on two sides of a piece of paper appears in Winkler [189, pp. 65, 68–69]. He attributes it to Tom Cover, 1986. The problem about the least k for which x1 + ⋅ ⋅ ⋅ + xk > 1 appeared as Problem A3 of the Eighteenth (1958) Putnam Competition [79, pp. 457–458]. The problem about the probability an that a partial sum of positive integers is equal to n, in the case of any probability distribution p on P (not just those of finite support), is due to Feller and Pollard [47]. They give an ingenious elementary proof. (They also give a nonelementary but somewhat more informative proof when 1/E ≠ 0.) A number of other proofs have been given which use Fourier analysis and other analytic tools. Stanley [160] gave a survey of alternating permutations which includes the results discussed by Professor Blakeley. The result about equation (12.6), i.e., x1 +x2 +x3 ≤ 1, x3 +x4 +x5 ≤ 1, etc., appears (in the more general context of partially ordered sets) in Stanley [155, Cor. 4.2]. The probability that xi + xi+1 + xi+2 ≤ 1, 1 ≤ i ≤ n − 2, is considered in [161, Exer. 4.56(d)]. For a combinatorial proof that tan x+tan y sec2 x = 1 + tan2 x and tan(x + y) = 1−(tan , see [162, Exer. 5.7]. x)(tan y) 13. Geometry The problem on embedding 2-circles in the plane is a slight generalization, but with the same proof, of a problem which Winkler [189, p. 56] states has been attributed to R. L. Moore. An elegant solution to the “more onerous variation” with Y’s rather than 2-circles, due to Randy Dougherty, also appears in Winkler [189, pp. 137–138]. The problem of tiling a convex polygon with nonconvex quadrilaterals appears in [132]. The characterization of the dimensions n for which there exists a regular nsimplex in Rn with integer vertices is due to Schoenberg [145]. The result about n-dimensional cubes having n+1 vertices that are the vertices of a regular simplex is due to Grigor´ev (or Grigorev or Grigoriev or Grigor’ev) [80]. The lower bound on the maximum number f (n) of halving lines is due to T´ oth [181], while the upper bound is given by Dey [43]. For more information on halving lines, see [123]. The result by Ardila on halving circles appears in [12]. The fourteen proofs on tiling a rectangle with rectangles of at least one integer side appears in Wagon [185]. The theorem of de Bruijn on tiling an a × b rectangle with c × d rectangles may be found in [28]. The result that an a × b rectangle can be tiled by finitely many squares if and only if a/b ∈ Q is due to Max Dehn [41]. The proof given here is due independently to Hadwiger [81] and Pokrovski˘i [126], with a simplification by Boltianski˘i [25, 26].



It is also presented in [4, p. 209]. Hadwiger and Pokrovski˘i use a Hamel basis, i.e., a basis for R as a vector space over Q, but Boltianski˘i pointed out that this was unnecessary. The tiling of the region T (n) by tribones is due to Conway (see Conway and Lagarias [39]) and further explained by Thurston [180]. A more elementary version of the proof is due to Propp [129]. The result of H. S. Wall on continued fractions appears in [186]. For tilings of a square by similar rectangles, see Freiling and Rinne [59] and Laczkovich and Szekeres [98]. The concept of the order r(P ) of a polyomino is due to Klarner [92]. For information on tiling a rectangle with copies of the same polyomino, see Bodini [24] and the references therein. A figure of a tiling of a 78 × 108 rectangle with 468 copies of the polyomino of Figure 13.1 appears at [45]. The undecidability of tiling a rectangle whose tiles belong to a finite set of polyominoes is due to Yang [193]. See Ardila and Stanley [13] for a general survey paper on plane tilings that should be accessible to undergraduates. 14. Hodgepodge Equation (14.1) was first considered by Golomb and Hales [72], but an error in multiprecision arithmetic gave an incorrect value for the first counterexample. The correct value is due to Buhler in 2004 and is reported by Golomb [71]. For further information on G¨ obel’s sequence (14.2), see [120]. A good introduction to Somos sequences and the Laurent phenomenon is the survey by Fomin and Zelevinsky [55]. Two further papers of interest are by Speyer [152] and by Carroll and Speyer [31]. The 100 prisoners problem was first considered by Peter Miltersen [60] in 2003. It has been reproduced in numerous places, including [166, §13.1] and [190, pp. 12, 18–20]. Some further beleaguered prisoner problems may be found in [32], [189], [190]. Minc’s conjecture on permanents of (0, 1)-matrices appears in [108]. In violation of Stigler’s law of eponymy, Br´egman’s theorem was first proved by Lev Br´egman [27]. Subsequently there have appeared a number of other proofs of Minc’s conjecture. Perhaps the nicest proof is due to Jaikumar Radhakrishnan [135], [4, Chap. 37], based on the theory of entropy. 15. Self-Referential Mathematics For information on the Kolakoski sequence see [118]. For Golomb’s sequence see [119]. The Gelfond-Schneider theorem was proved independently in 1934 by Aleksandr Gelfond [70] and Theodor Schneider [144]. It was the seventh of Hilbert’s famous list of 23 unsolved problems. Hilbert predicted that it would be solved long after the Riemann hypothesis was proved, but it was actually one of the first of Hilbert’s problems to be solved! The transcendentality of the root of xx = x + 1 is due to R. Stanley [156]. The recurrence an+1 = an − a1n was mentioned by Ron Graham in a joint address to the American Mathematical Society and Mathematical Association of America on August 6, 2000. For a discussion of why the problem of boundedness is so difficult, see Chamberland and Martelli [34].



√ The sequence 2, 3, 3, 2, 3, 3, 3, . . . related to 2 + 3 appears in [87, pp. 19–20, 178–180]. Abraham Wythoff analyzed Wythoff’s Nim in 1907 [191]. According to Martin Gardner [68, p. 106] the game was played in China under the name ji˘ an sh´ız˘ı or more accurately 捡石子 (“picking stones” or “choosing stones”). Uspensky’s result [182] on “Blakeley triples” was also proved by Skolem [148] and Graham [75]. It also appeared as Problem B6 on the Fifty-Sixth (1995) William Lowell Putnam Competition [87, pp. 214–216]. For information on Conway’s “look and say” sequence, see [121]. For an explanation that the minimum degree of the linear recurrence satisfied by Ln is equal to 72, see the answer by “Guntram” at [168]. The integer 13532385396179 discovered by James Davis has received a lot of publicity, for instance, [109]. Further information is given by Sloane [149, pp. 1069– 1070]. For a video discussing this number, see [1]. Autograms were invented and named by Lee Sallows. For further information see [14]. 16. All Good Things Must Come to an End )(2(n−k) ) = 4n , see [11] and For a combinatorial proof of the identity ∑nk=0 (2k k n−k the references therein. For further information on partitions of integers, see Andrews [9] and Stanley [161, §1.8]. Some references for the problem on partitions for which no part appears exactly once may be found at [122]. For the theory of cyclotomic numerical semigroups, see Ciolan, Garc´ıa-S´anchez, and Moree [36], and the references therein. The formula (16.5) for counting monic polynomials of degree n over Fq with no irreducible factor of multiplicity one is due to Stanley [158], with a published solution by Richard Stong [172]. The solution given in the text, as well as Stong’s solution, is sketched in [161, Exer. 1.186(c)]. We have followed the exposition of Lagrange inversion in Stanley [162, §5.4]. For the applications discussed by Professor Blakeley see Exercises 5.55–5.56 and Example 6.2.6 of this reference. The origin of the cake with frosting problem is uncertain. For an exposition, see Winkler [190, pp. 111, 115–118]. The three-gap theorem, also called the three-distance theorem or the Steinhaus conjecture, was conjectured by Steinhaus in the 1950’s and proved by S´os, Sur´ anyi, ´ and Swierczkowski (independently) in 1958–59. For references see the Wikipedia article [179].

Bibliography [1] 13532385396179-Numberphile, [2] John Abbott, Sparse squares of polynomials, Math. Comp. 71 (2002), no. 237, 407–413, DOI 10.1090/S0025-5718-00-01294-1. MR1863010 [3] J. F. Adams, Vector fields on spheres, Ann. of Math. (2) 75 (1962), 603–632, DOI 10.2307/1970213. MR139178 [4] Martin Aigner and G¨ unter M. Ziegler, Proofs from The Book, 6th ed., Springer, Berlin, 2018. See corrected reprint of the 1998 original [ MR1723092]; Including illustrations by Karl H. Hofmann. MR3823190 [5] Michael Aissen, I. J. Schoenberg, and A. M. Whitney, On the generating functions of totally positive sequences. I (English, with Hebrew summary), J. Analyse Math. 2 (1952), 93–103, DOI 10.1007/BF02786970. MR53174 [6] G. L. Alexanderson, L. F. Klosinski, and L. C. Larson, The William Lowell Putnam Mathematical Competition, Problems and Solutions: 1965–1984, Mathematical Association of America, Washington, DC, 1985. [7] Per Alexandersson and Linus Jordan, Enumeration of border-strip decompositions and WeilPetersson volumes, J. Integer Seq. 22 (2019), no. 4, Art. 19.4.5, 20. MR4001005 [8] Jean-Paul Allouche and Jeffrey Shallit, The ubiquitous Prouhet-Thue-Morse sequence, Sequences and their applications (Singapore, 1998), Springer Ser. Discrete Math. Theor. Comput. Sci., Springer, London, 1999, pp. 1–16. MR1843077 [9] G. E. Andrews, The Theory of Partitions Addison-Wesley, Reading, 1976; reissued by Cambridge University Press, Cambridge, 1985 and 1998. MR1634067 [10] George E. Andrews, Richard Askey, and Ranjan Roy, Special functions, Encyclopedia of Mathematics and its Applications, vol. 71, Cambridge University Press, Cambridge, 1999. MR1688958 [11] Valerio De Angelis, Pairings and signed permutations, Amer. Math. Monthly 113 (2006), no. 7, 642–644, DOI 10.2307/27642007. MR2252934 [12] Federico Ardila, The number of halving circles, Amer. Math. Monthly 111 (2004), no. 7, 586–591, DOI 10.2307/4145159. MR2080040 [13] Federico Ardila and Richard P. Stanley, Tilings, Math. Intelligencer 32 (2010), no. 4, 32–43, DOI 10.1007/s00283-010-9160-9. MR2747701 [14] Autogram. Retrieved March 16, 2019, from [15] Baer-Specker group. Retrieved March 30, 2019, from Baer-Specker_group. [16] David Bailey, Peter Borwein, and Simon Plouffe, On the rapid computation of various polylogarithmic constants, Math. Comp. 66 (1997), no. 218, 903–913, DOI 10.1090/S00255718-97-00856-9. MR1415794 [17] D. H. Bailey, J. M. Borwein, P. B. Borwein, and S. Plouffe, The quest for pi, Math. Intelligencer 19 (1997), no. 1, 50–57, DOI 10.1007/BF03024340. MR1439159 [18] Elwyn R. Berlekamp, John H. Conway, and Richard K. Guy, Winning ways for your mathematical plays. Vol. 1, 2nd ed., A K Peters, Ltd., Natick, MA, 2001. MR1808891 [19] Elwyn R. Berlekamp, John H. Conway, and Richard K. Guy, Winning ways for your mathematical plays. Vol. 3, 2nd ed., A K Peters, Ltd., Natick, MA, 2003. MR2006327 [20] J. Berstel, Fibonacci words—a survey, in The Book of L (G. Rozenberg and A. Salomaa, eds.) (electronic), Springer, Berlin/Heidelberg, 1986, pp. 13–27. [21] Binet-Cauchy or Cauchy-Binet, URL (version 2019/3/24): questions/5763.




[22] Garrett Birkhoff, Lattice theory, Third edition. American Mathematical Society Colloquium Publications, Vol. XXV, American Mathematical Society, Providence, R.I., 1967. MR0227053 [23] A. Blass, Why isn’t an infinite direct product of copies of Z a free module? answer, URL (version 2019/03/30): [24] Olivier Bodini, Tiling a rectangle with polyominoes, Discrete models for complex systems, DMCS ’03 (Lyon), Discrete Math. Theor. Comput. Sci. Proc., AB, Assoc. Discrete Math. Theor. Comput. Sci., Nancy, 2003, pp. 81–88. MR2011022 [25] V. G. Boltianskii, Equivalent and Equidecomposable Figures (A. K. Henn and C. E. Watts, trans.), D. C. Heath and Co., Boston, 1963. [26] Vladimir G. Boltianski˘ı, Hilbert’s third problem, Scripta Series in Mathematics, V. H. Winston & Sons, Washington, D.C.; Halsted Press [John Wiley & Sons], New York-TorontoLondon, 1978. Translated from the Russian by Richard A. Silverman; With a foreword by Albert B. J. Novikoff. MR0500434 [27] L. Br´ egman, Some properties of nonnegative matrices and their permanents, Math. Dokl. 14 (1973), 945–949. [28] N. G. de Bruijn, Filling boxes with bricks, Amer. Math. Monthly 76 (1969), 37–40, DOI 10.2307/2316785. MR234841 [29] Jan Hendrik Bruinier and Ken Ono, Algebraic formulas for the coefficients of halfintegral weight harmonic weak Maass forms, Adv. Math. 246 (2013), 198–219, DOI 10.1016/j.aim.2013.05.028. MR3091805 [30] Neil Calkin and Herbert S. Wilf, Recounting the rationals, Amer. Math. Monthly 107 (2000), no. 4, 360–363, DOI 10.2307/2589182. MR1763062 [31] Gabriel D. Carroll and David Speyer, The cube recurrence, Electron. J. Combin. 11 (2004), no. 1, Research Paper 73, 31. MR2097339 [32] Larry Carter and Stan Wagon, The Mensa Correctional Institute, Amer. Math. Monthly 125 (2018), no. 4, 306–319, DOI 10.1080/00029890.2018.1418100. MR3779218 [33] A. Cayley, Desiderata and Suggestions: No. 1. The Theory of Groups, Amer. J. Math. 1 (1878), no. 1, 50–52, DOI 10.2307/2369433. MR1505151 [34] Marc Chamberland and Mario Martelli, Unbounded orbits and binary digits, J. Difference Equ. Appl. 9 (2003), no. 7, 687–691, DOI 10.1080/1023619021000042199. MR1976143 [35] Richard Stanley and William Y. C. Chen, Problems and Solutions: Solutions: 10199, Amer. Math. Monthly 101 (1994), no. 3, 278–279, DOI 10.2307/2975614. MR1542509 [36] Emil-Alexandru Ciolan, Pedro A. Garc´ıa-S´ anchez, and Pieter Moree, Cyclotomic numerical semigroups, SIAM J. Discrete Math. 30 (2016), no. 2, 650–668, DOI 10.1137/140989479. MR3483156 [37] Classification of finite simple groups. Retrieved November 19, 2018, from https://en. [38] J. J. Clessa, Micropuzzles, Pan Books, London, 1983; reprinted as Math and Logic Puzzles for PC Enthusiasts, Dover, Mineola, NY, 1996. [39] J. H. Conway and J. C. Lagarias, Tiling with polyominoes and combinatorial group theory, J. Combin. Theory Ser. A 53 (1990), no. 2, 183–208, DOI 10.1016/0097-3165(90)90057-4. MR1041445 [40] Don Coppersmith and James Davenport, Polynomials whose powers are sparse, Acta Arith. 58 (1991), no. 1, 79–87, DOI 10.4064/aa-58-1-79-87. MR1111092 ¨ [41] M. Dehn, Uber Zerlegung von Rechtecken in Rechtecke (German), Math. Ann. 57 (1903), no. 3, 314–332, DOI 10.1007/BF01444289. MR1511212 [42] Richard Stanley and M. S. Demos, Problems and Solutions: Solutions of Advanced Problems: 5653, Amer. Math. Monthly 77 (1970), no. 1, 85–86, DOI 10.2307/2316875. MR1535769 [43] T. K. Dey, Improved bounds for planar k-sets and related problems, Discrete Comput. Geom. 19 (1998), no. 3, Special Issue, 373–382, DOI 10.1007/PL00009354. Dedicated to the memory of Paul Erd˝ os. MR1608878 ¨ [44] G. Eisenstein, Uber ein einfaches Mittel zur Auffindung der h¨ oheren Reciprocit¨ atsgesetze und der mit ihnen zu verbindenden Erg¨ anzungss¨ atze (German), J. Reine Angew. Math. 39 (1850), 351–364, DOI 10.1515/crll.1850.39.351. MR1578671 [45] K. Eklhad, 18-omino, order 468. Retrieved August 4, 2019, from polyomino/tile468.html.



[46] P. Erd˝ os, On the number of terms of the square of a polynomial, Nieuw Arch. Wiskunde (2) 23 (1949), 63–65. MR0027779 [47] P. Erd¨ os, W. Feller, and H. Pollard, A property of power series with positive coefficients, Bull. Amer. Math. Soc. 55 (1949), 201–204, DOI 10.1090/S0002-9904-1949-09203-0. MR27867 [48] P. Erd˝ os, V. Lev, G. Rauzy, C. S´ andor, and A. S´ ark¨ ozy, Greedy algorithm, arithmetic progressions, subset sums and divisibility, Discrete Math. 200 (1999), no. 1-3, 119–135, DOI 10.1016/S0012-365X(98)00385-9. Paul Erd˝ os memorial collection. MR1692285 [49] Fair cake-cutting. Retrieved March 23, 2019, from cake-cutting. [50] Fibonacci word. Retrieved December 22, 2018, from Fibonacci_word. [51] Dominique Foata and Marcel-P. Sch¨ utzenberger, Th´ eorie g´ eom´ etrique des polynˆ omes eul´ eriens (French), Lecture Notes in Mathematics, Vol. 138, Springer-Verlag, Berlin-New York, 1970. MR0272642 [52] Walter Feit, Some consequences of the classification of finite simple groups, The Santa Cruz Conference on Finite Groups (Univ. California, Santa Cruz, Calif., 1979), Proc. Sympos. Pure Math., vol. 37, Amer. Math. Soc., Providence, R.I., 1980, pp. 175–181. MR604576 [53] Sergey Fomin, Schur operators and Knuth correspondences, J. Combin. Theory Ser. A 72 (1995), no. 2, 277–292, DOI 10.1016/0097-3165(95)90065-9. MR1357774 [54] Sergey Fomin, Dima Grigoriev, and Gleb Koshevoy, Subtraction-free complexity, cluster transformations, and spanning trees, Found. Comput. Math. 16 (2016), no. 1, 1–31, DOI 10.1007/s10208-014-9231-y. MR3451422 [55] Sergey Fomin and Andrei Zelevinsky, The Laurent phenomenon, Adv. in Appl. Math. 28 (2002), no. 2, 119–144, DOI 10.1006/aama.2001.0770. MR1888840 [56] Peter J. Forrester and S. Ole Warnaar, The importance of the Selberg integral, Bull. Amer. Math. Soc. (N.S.) 45 (2008), no. 4, 489–534, DOI 10.1090/S0273-0979-08-01221-4. MR2434345 [57] Z. Franco and R. P. Stanley, Problem 12097, Amer. Math. Monthly 126 (2019), 280. [58] Michael H. Freedman and Frank Quinn, Topology of 4-manifolds, Princeton Mathematical Series, vol. 39, Princeton University Press, Princeton, NJ, 1990. MR1201584 [59] C. Freiling and D. Rinne, Tiling a square with similar rectangles, Math. Res. Lett. 1 (1994), no. 5, 547–558, DOI 10.4310/MRL.1994.v1.n5.a3. MR1295549 [60] Anna G´ al and Peter Bro Miltersen, The cell probe complexity of succinct data structures, Automata, languages and programming, Lecture Notes in Comput. Sci., vol. 2719, Springer, Berlin, 2003, pp. 332–344, DOI 10.1007/3-540-45061-0 28. MR2080711 [61] Daniil Gasnikov and Arseny M. Shur, Square-free partial words with many wildcards, Internat. J. Found. Comput. Sci. 29 (2018), no. 5, 845–860, DOI 10.1142/S0129054118420078. MR3845387 [62] F. R. Gantmacher, Matrix Theory, vol. 2, American Mathematical Society, Providence, RI, 2000; reprinted from 1953 Russian edition. [63] M. Gardner, The Island of Five Colors, Future Tense (K. Crossen, ed.), Greenberg, New York, 1952; reprinted in C. Fadiman, Fantasia Mathematica, Springer, New York, 1958. [64] M. Gardner, aha! Insight, W. H. Freeman, New York, 1978. [65] M. Gardner, Nine problems, in Hexaflexagons and Other Mathematical Diversions, University of Chicago Press, 1988. [66] M. Gardner, The sixth symbol and other problems, in Time Travel and Other Mathematical Bewilderments, W. H. Freeman, New York, 1988. [67] M. Gardner, Bulgarian solitaire and other seemingly endless tasks, in The Last Recreations, Springer, New York, 1997. [68] Martin Gardner, Penrose tiles to trapdoor ciphers, MAA Spectrum, Mathematical Association of America, Washington, DC, 1997. . . .and the return of Dr. Matrix; Revised reprint of the 1989 original. MR1443205 [69] I. M. Gelfand, M. M. Kapranov, and A. V. Zelevinsky, Discriminants, resultants, and multidimensional determinants, Mathematics: Theory & Applications, Birkh¨ auser Boston, Inc., Boston, MA, 1994. MR1264417 [70] A. O. Gelfond, Sur le septi` eme Probl` eme de Hilbert, Bull. de l’Acad´emie des Science de l’URSS. Classe des sciences math´ ematiques et naturelles sciences math´ ematiques, VII(4), 623–634.



[71] S. W. Golomb, Martin Gardner and tictacktoe, in A Lifetime of Puzzles (E. D. Demaine, M. L. Demaine, and T. Rodgers, eds.), CRC Press, Boca Raton, 2008, pp. 293–301. [72] Solomon W. Golomb and Alfred W. Hales, Hypercube tic-tac-toe, More games of no chance (Berkeley, CA, 2000), Math. Sci. Res. Inst. Publ., vol. 42, Cambridge Univ. Press, Cambridge, 2002, pp. 167–182. MR1973012 [73] R. L. Goodstein, On the restricted ordinal theorem, J. Symbolic Logic 9 (1944), 33–41, DOI 10.2307/2268019. MR10515 [74] Goodstein’s theorem. In Wikipedia. Retrieved May 18, 2018, from https://en.wikipedia. org/wiki/Goodstein’s_theorem. [75] R. L. Graham, On a theorem of Uspensky, Amer. Math. Monthly 70 (1963), 407–409, DOI 10.2307/2311859. MR148555 [76] Ben Green and Terence Tao, The primes contain arbitrarily long arithmetic progressions, Ann. of Math. (2) 167 (2008), no. 2, 481–547, DOI 10.4007/annals.2008.167.481. MR2415379 [77] Green-Tao theorem, In Wikipedia. Retrieved March 2, 2019, from https://en.wikipedia. org/wiki/Green-Tao_theorem. [78] Gary R. Greenfield and Daniel Drucker, On the discriminant of a trinomial, Linear Algebra Appl. 62 (1984), 105–112, DOI 10.1016/0024-3795(84)90089-2. MR761061 [79] A. M. Gleason, R. E. Greenwood, and L. M. Kelly, The William Lowell Putnam Mathematical Competition, Mathematical Association of America, Washington, D.C., 1980. MR588757 [80] N. A. Grigor´ ev, Regular simplices inscribed in a cube and Hadamard matrices, Proc. Steklov Inst. Math. 152 (1982), 97–98. [81] H. Hadwiger, Vorlesungen u ¨ber Inhalt, Oberfl¨ ache und Isoperimetrie (German), SpringerVerlag, Berlin-G¨ ottingen-Heidelberg, 1957. MR0102775 [82] Benjamin Halpern, Strange billiard tables, Trans. Amer. Math. Soc. 232 (1977), 297–305, DOI 10.2307/1998942. MR451308 [83] G. H. Hardy, J. E. Littlewood, and G. P´ olya, Inequalities, Cambridge, at the University Press, 1952. 2d ed. MR0046395 [84] Norbert Henze and G¨ unter Last, Absent-minded passengers, Amer. Math. Monthly 126 (2019), no. 10, 867–875, DOI 10.1080/00029890.2019.1656024. MR4033222 [85] David Hilbert, Ueber die Discriminante der im Endlichen abbrechenden hypergeometrischen Reihe (German), J. Reine Angew. Math. 103 (1888), 337–345, DOI 10.1515/crll.1888.103.337. MR1580169 [86] F. Johansson, Fast arbitrary-precision evaluation of special functions in the Arb library. Retrieved March 29, 2019, from [87] Kiran S. Kedlaya, Bjorn Poonen, and Ravi Vakil, The William Lowell Putnam Mathematical Competition, 1985–2000: Problems, solutions, and commentary, MAA Problem Books Series, Mathematical Association of America, Washington, DC, 2002. MR1933844 [88] Kentucky Bend. In Wikipedia. Retrieved November 17, 2018, from https://en.wikipedia. org/wiki/Kentucky_Bend. [89] J. Kepler, Harmonices Mvndi, 1621; English translation as an e-book: Google Books ID eEkLAAAAIAAJ. [90] Andrey Khalyavin, The minimal density of a letter in an infinite ternary square-free word 883 , J. Integer Seq. 10 (2007), no. 6, Article 07.6.5, 4. MR2312251 is 3215 [91] J. S. Kim and S. Oh, The Selberg integral and Young books, DMTCS Proc. AT (FPSAC 2014), 381–392. [92] David A. Klarner, Packing a rectangle with congruent n-ominoes, J. Combinatorial Theory 7 (1969), 107–115. MR248643 [93] E. Klarreich, How to cut cake fairly and finally eat it too, Quanta magazine, www. [94] P´ eter Komj´ ath and Vilmos Totik, Problems and theorems in classical set theory, Problem Books in Mathematics, Springer, New York, 2006. MR2220838 [95] A. G. Konheim and B. Weiss, An occupancy discipline and applications, SIAM J. Applied Math. 14 (1966), 1266–1274. [96] Steven G. Krantz, Mathematical apocrypha: Stories and anecdotes of mathematicians and the mathematical, MAA Spectrum, Mathematical Association of America, Washington, DC, 2002. MR1918107



[97] Daniel Krob, Expressions rationnelles sur un anneau (French), Topics in invariant theory (Paris, 1989/1990), Lecture Notes in Math., vol. 1478, Springer, Berlin, 1991, pp. 215–243, DOI 10.1007/BFb0083505. MR1180991 [98] M. Laczkovich and G. Szekeres, Tilings of the square with similar rectangles, Discrete Comput. Geom. 13 (1995), no. 3-4, 569–572, DOI 10.1007/BF02574063. MR1318796 [99] Jeffrey C. Lagarias, An elementary problem equivalent to the Riemann hypothesis, Amer. Math. Monthly 109 (2002), no. 6, 534–543, DOI 10.2307/2695443. MR1908008 [100] S. Lindhurst, An investigation of several interesting sets of numbers generated by the greedy algorithm, Senior thesis, Princeton University, 1990. [101] List of centenarians (scientists and mathematicians). In Wikipedia. Retrieved March 23, 2019, from mathematicians). [102] L. A. Shepp and S. P. Lloyd, Ordered cycle lengths in a random permutation, Trans. Amer. Math. Soc. 121 (1966), 340–357, DOI 10.2307/1994483. MR195117 [103] L´ aszl´ o Lov´ asz and Peter Winkler, Mixing of random walks and other diffusions on a graph, Surveys in combinatorics, 1995 (Stirling), London Math. Soc. Lecture Note Ser., vol. 218, Cambridge Univ. Press, Cambridge, 1995, pp. 119–154, DOI 10.1017/CBO9780511662096.007. MR1358634 [104] L´ aszl´ o Lov´ asz and Peter Winkler, A note on the last new vertex visited by a random walk, J. Graph Theory 17 (1993), no. 5, 593–596, DOI 10.1002/jgt.3190170505. MR1242177 [105] James Maynard, Primes with restricted digits, Invent. Math. 217 (2019), no. 1, 127–218, DOI 10.1007/s00222-019-00865-6. MR3958793 [106] G. A. Miller, Contradictions in the Literature of Group Theory, Amer. Math. Monthly 29 (1922), no. 9, 319–328, DOI 10.2307/2298721. MR1520085 [107] H. Miller, Vector fields on spheres, etc.,∼mdono/haynes-notes.pdf. [108] Henryk Minc, Upper bounds for permanents of (0, 1)-matrices, Bull. Amer. Math. Soc. 69 (1963), 789–791, DOI 10.1090/S0002-9904-1963-11031-9. MR155843 [109] A. Moseman, Why 13532385396179 is a magic number, Popular Mechanics, www., June 8, 2017. [110] Leo Moser, On non-averaging sets of integers, Canad. J. Math. 5 (1953), 245–252, DOI 10.4153/cjm-1953-027-0. MR53140 [111] Richard A. Moy, On the growth of the counting function of Stanley sequences, Discrete Math. 311 (2011), no. 7, 560–562, DOI 10.1016/j.disc.2010.12.019. MR2765623 [112] Richard A. Moy, Mehtaab Sawhney, and David Stoner, Characters of independent Stanley sequences, European J. Combin. 70 (2018), 354–363, DOI 10.1016/j.ejc.2018.01.008. MR3779623 [113] T. Muir, The Theory of Determinants in the Historical Order of Development, vol. 1, second ed., MacMillan, London, 1906. [114] Sam Northshield, Stern’s diatomic sequence 0, 1, 1, 2, 1, 3, 2, 3, 1, 4, . . . , Amer. Math. Monthly 117 (2010), no. 7, 581–598, DOI 10.4169/000298910X496714. MR2681519 [115] Pascal Ochem, Letter frequency in infinite repetition-free words, Theoret. Comput. Sci. 380 (2007), no. 3, 388–392, DOI 10.1016/j.tcs.2007.03.027. MR2331007 [116] Octonion, In Wikipedia. Retrieved March 20, 2019, from [117] A. M. Odlyzko and R. Stanley, Some curious sequences constructed with the greedy algorithm, unpublished notes dated January, 1978; available at∼rstan/papers/ od.pdf. [118] OEIS Foundation Inc. (2018), The On-Line Encyclopedia of Integer Sequences, A000002. [119] OEIS Foundation Inc. (2018), The On-Line Encyclopedia of Integer Sequences, A001462. [120] OEIS Foundation Inc. (2018), The On-Line Encyclopedia of Integer Sequences, A003504. [121] OEIS Foundation Inc. (2018), The On-Line Encyclopedia of Integer Sequences, A005150. [122] OEIS Foundation Inc. (2018), The On-Line Encyclopedia of Integer Sequences, A007690.



[123] OEIS Foundation Inc. (2018), The On-Line Encyclopedia of Integer Sequences, A076523. [124] Igor Pak, Ribbon tile invariants, Trans. Amer. Math. Soc. 352 (2000), no. 12, 5525–5561, DOI 10.1090/S0002-9947-00-02666-0. MR1781275 [125] D. Pasechnik, Factorization of polynomials in two variables, answer, URL (version 2015/02/14): [126] V. G. Pokrovski˘ı, Decompositions of n-dimensional parallelepipeds (Russian), Mat. Zametki 33 (1983), no. 2, 273–280, 319. MR693437 [127] Alexander Postnikov and Richard P. Stanley, Deformations of Coxeter hyperplane arrangements, J. Combin. Theory Ser. A 91 (2000), no. 1-2, 544–597, DOI 10.1006/jcta.2000.3106. In memory of Gian-Carlo Rota. MR1780038 [128] Prime magic squares, retrieved July 24, 2019, from prime/squares13.html#5. [129] James Propp, A pedestrian approach to a method of Conway, or, A tale of two cities, Math. Mag. 70 (1997), no. 5, 327–340, DOI 10.2307/2691169. MR1488869 [130] A. Pun and R. Stanley, Notices Amer. Math. Society 64 (2017), 536. [131] E. J. Putzer and R. W. Lowen, On the optimum method of cutting a rectangular box into unit cubes, Convair Scientific Research Laboratory, San Diego, 1958. [132] Puzzles in geometry and combinatorial geometry, Problem 5, options=543&lmm=0. [133] Ronald Pyke, The supremum and infimum of the Poisson process, Ann. Math. Statist. 30 (1959), 568–576, DOI 10.1214/aoms/1177706269. MR107315 [134] R. Quinlan, Special spaces of matrices, URL (version 2013/08/27): www.maths.nuigalway. ie/∼rquinlan/talks/quinlan.pdf. [135] Jaikumar Radhakrishnan, An entropy proof of Bregman’s theorem, J. Combin. Theory Ser. A 77 (1997), no. 1, 161–164, DOI 10.1006/jcta.1996.2727. MR1426744 [136] A. R´ enyi, On the minimal number of terms of the square of a polynomial, Hungarica Acta Math. 1 (1947), 30–34. MR0022268 [137] A. R´ enyi, Th´ eorie des ´ el´ ements saillants d’une suite d’observations, in Colloq. Combinatorial Methods in Probability Theory, August 1–10, 1962, Matematisk Institut, Aarhus Universitet, 1962, pp. 104–115. [138] Christophe Reutenauer, A survey of noncommutative rational series, Formal power series and algebraic combinatorics (New Brunswick, NJ, 1994), DIMACS Ser. Discrete Math. Theoret. Comput. Sci., vol. 24, Amer. Math. Soc., Providence, RI, 1996, pp. 159–169. MR1363511 [139] Bruce Reznick, Regularity properties of the Stern enumeration of the rationals, J. Integer Seq. 11 (2008), no. 4, Article 08.4.1, 17. MR2447843 [140] John Riordan, Ballots and trees, J. Combinatorial Theory 6 (1969), 408–411. MR234843 [141] Utkir A. Rozikov, An introduction to mathematical billiards, World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2019. MR3887584 [142] H. J. Ryser, Matrices of zeros and ones, Bull. Amer. Math. Soc. 66 (1960), 442–464, DOI 10.1090/S0002-9904-1960-10494-6. MR146192 [143] Donald G. Saari and Zhihong Xia, Off to infinity in finite time, Notices Amer. Math. Soc. 42 (1995), no. 5, 538–546. MR1324734 [144] Theodor Schneider, Transzendenzuntersuchungen periodischer Funktionen I, II, J. Reine Angew. Math. 172 (1935), 65-69, 70–74, DOI 10.1515/crll.1935.172.70. MR1581438 [145] I. J. Schoenberg, Regular simplices and quadratic forms, J. London Math. Soc. 12 (1937), 48–55. [146] J. Schur, Affektlose Gleichungen in der Theorie der Laguerreschen und Hermiteschen Polynome (German), J. Reine Angew. Math. 165 (1931), 52–58, DOI 10.1515/crll.1931.165.52. MR1581272 [147] Cl´ ement de Seguins Pazzis, Spaces of matrices with a sole eigenvalue, Linear Multilinear Algebra 60 (2012), no. 10, 1165–1190, DOI 10.1080/03081087.2011.654118. MR2983758 [148] Th. Skolem, On certain distributions of integers in pairs with given differences, Math. Scand. 5 (1957), 57–68, DOI 10.7146/math.scand.a-10490. MR92797 [149] Neil J. A. Sloane, The on-line encyclopedia of integer sequences, Notices Amer. Math. Soc. 65 (2018), no. 9, 1062–1074. MR3822822



[150] Alexander Soifer, Axiom of choice and chromatic number of Rn , J. Combin. Theory Ser. A 110 (2005), no. 1, 169–173, DOI 10.1016/j.jcta.2004.10.004. MR2128972 [151] Robert M. Solovay, A model of set-theory in which every set of reals is Lebesgue measurable, Ann. of Math. (2) 92 (1970), 1–56, DOI 10.2307/1970696. MR265151 [152] David E. Speyer, Perfect matchings and the octahedron recurrence, J. Algebraic Combin. 25 (2007), no. 3, 309–348, DOI 10.1007/s10801-006-0039-y. MR2317336 [153] John Stallings, The piecewise-linear structure of Euclidean space, Proc. Cambridge Philos. Soc. 58 (1962), 481–488. MR149457 [154] R. Stanley, Problem 5653, Amer. Math. Monthly 76 (1969), 200. [155] Richard P. Stanley, Two poset polytopes, Discrete Comput. Geom. 1 (1986), no. 1, 9–23, DOI 10.1007/BF02187680. MR824105 [156] R. Stanley, A transcendental number?, Quickie 88-10, Mathematical Entertainments column (S. Weintraub, ed.), Math. Intelligencer 11 (winter 1989), 55. [157] R. Stanley, Problem 10199, Amer. Math. Monthly 99 (1992), 163. [158] R. Stanley, Problem 11348, Amer. Math. Monthly 115 (2008), 262. [159] R. Stanley, Fairest way to choose gifts, URL (version 2010/09/01): 37276. [160] Richard P. Stanley, A survey of alternating permutations, Combinatorics and graphs, Contemp. Math., vol. 531, Amer. Math. Soc., Providence, RI, 2010, pp. 165–196, DOI 10.1090/conm/531/10466. MR2757798 [161] Richard P. Stanley, Enumerative combinatorics. Volume 1, 2nd ed., Cambridge Studies in Advanced Mathematics, vol. 49, Cambridge University Press, Cambridge, 2012. MR2868112 [162] Richard P. Stanley, Enumerative combinatorics. Vol. 2, Cambridge Studies in Advanced Mathematics, vol. 62, Cambridge University Press, Cambridge, 1999. With a foreword by Gian-Carlo Rota and appendix 1 by Sergey Fomin. MR1676282 [163] R. Stanley, What are the worst notations, in your opinion?, answer, URL (version 2012/11/19): [164] R. Stanley, Fantastic properties of Z/2Z, answer, URL (version 2013/11/15): mathoverflow. net/q/148925. [165] Richard P. Stanley, Catalan numbers, Cambridge University Press, New York, 2015. MR3467982 [166] Richard P. Stanley, Algebraic combinatorics: Walks, trees, tableaux, and more; Second edition of [ MR3097651], Undergraduate Texts in Mathematics, Springer, Cham, 2018. MR3823204 [167] R. Stanley, Maximum dimension of a space of n × n real matrices with at least k nonzero eigenvalues, URL (version 2018/08/27): [168] R. Stanley, Order of Conway’s “look and say” sequence, URL (version 2019/1/9): [169] R. Stanley, Maximum dimension of space of matrices with a real eigenvalue, URL (version 2019/3/30): [170] Richard P. Stanley, Some Linear Recurrences Motivated by Stern’s Diatomic Array, Amer. Math. Monthly 127 (2020), no. 2, 99–111, DOI 10.1080/00029890.2020.1677104. MR4048653 [171] M. Stern, Ueber eine zahlentheoretische Funktion (German), J. Reine Angew. Math. 55 (1858), 193–220, DOI 10.1515/crll.1858.55.193. MR1579066 [172] R. Stong, Solution to 11348, Amer. Math. Monthly 117 (2010), 87–88. [173] Richard G. Swan, Factorization of polynomials over finite fields, Pacific J. Math. 12 (1962), 1099–1106. MR144891 [174] Sylver Coinage. In Wikipedia. Retrieved March 17, 2019, from wiki/Sylver_coinage. [175] Szemer´ edi’s theorem. In Wikipedia. Retrieved February 8, 2019, from https://en.’s_theorem. [176] Serge Tabachnikov, Geometry and billiards, Student Mathematical Library, vol. 30, American Mathematical Society, Providence, RI; Mathematics Advanced Study Semesters, University Park, PA, 2005. MR2168892 [177] Tarski’s high school algebra problem. In Wikipedia. Retrieved April 22, 2018, from https://’s_high_school_algebra_problem.



[178] Clifford Henry Taubes, Gauge theory on asymptotically periodic 4-manifolds, J. Differential Geom. 25 (1987), no. 3, 363–430. MR882829 [179] Three-gap theorem. In Wikipedia. Retrieved December 21, 2018, from https://en. [180] William P. Thurston, Conway’s tiling groups, Amer. Math. Monthly 97 (1990), no. 8, 757– 773, DOI 10.2307/2324578. MR1072815 [181] G´ eza T´ oth, Point sets with many k-sets, Proceedings of the Sixteenth Annual Symposium on Computational Geometry (Hong Kong, 2000), ACM, New York, 2000, pp. 37–42, DOI 10.1145/336154.336171. MR1802255 [182] J. V. Uspensky, On a Problem Arising Out of the Theory of a Certain Game, Amer. Math. Monthly 34 (1927), no. 10, 516–521, DOI 10.2307/2299838. MR1521314 [183] P. Valtr, Probability that n random points are in convex position, Discrete Comput. Geom. 13 (1995), no. 3-4, 637–643, DOI 10.1007/BF02574070. MR1318803 [184] W. Verdenius, On the number of terms of the square and the cube of polynomials, Nederl. Akad. Wetensch., Proc. 52 (1949), 1220–1226 = Indagationes Math. 11, 459–465 (1949). MR33267 [185] Stan Wagon, Fourteen proofs of a result about tiling a rectangle, Amer. Math. Monthly 94 (1987), no. 7, 601–617, DOI 10.2307/2322213. MR935845 [186] H. S. Wall, Polynomials whose zeros have negative real parts, Amer. Math. Monthly 52 (1945), 308–322, DOI 10.2307/2305291. MR12709 [187] E. W. Weisstein, Sparse polynomial square, in MathWorld. Retrieved August 4, 2019, from [188] Alex J. Wilkie, On exponentiation—a solution to Tarski’s high school algebra problem, Connections between model theory and algebraic and analytic geometry, Quad. Mat., vol. 6, Dept. Math., Seconda Univ. Napoli, Caserta, 2000, pp. 107–129. MR1930684 [189] Peter Winkler, Mathematical puzzles: a connoisseur’s collection, A K Peters, Ltd., Natick, MA, 2004. MR2034896 [190] Peter Winkler, Mathematical mind-benders, A K Peters, Ltd., Wellesley, MA, 2007. MR2334790 [191] W. A. Wythoff, A modification of the game of Nim, Nieuw Arch. Wisk. 8 (1907/1909), 199–202. [192] Catherine H. Yan, Parking functions, Handbook of enumerative combinatorics, Discrete Math. Appl. (Boca Raton), CRC Press, Boca Raton, FL, 2015, pp. 835–893. MR3409354 [193] Jed Yang, Rectangular tileability and complementary tileability are undecidable, European J. Combin. 41 (2014), 20–34, DOI 10.1016/j.ejc.2014.03.008. MR3219249


13532385396179, 146, 166

Buhler, Joe Peter, 165

Abbott, John Anthony, 163 Adams, John Frank, 92, 163 Aissen, Michael, 15 Alexandersson, Per Wilhelm, 162 algebraic (generating function), 55 Allouche, Jean-Paul Simon, 160 alternant (of a polynomial), 126 alternating permutation, 109 Andr´ e, D´ esir´e, 108 Andrews, George W. Eyre, 162, 166 Ardila, Federico, 121, 164, 165 arithmetic triangle, 50 Asimov, Isaac, 130 asymptotic, 27 atom (of the look-and-say sequence), 145 autogram, 147, 166 Aziz, Haris, 7

Calkin, Neil James, 58, 161 Calkin-Wilf tree, 57, 161 Candy Crush Friends, 124 Carroll, Gabriel Drew, 165 Catalan number, 25, 160 Cauchy-Binet formula, 17 Cauchy-Schwarz inequality, 35 Cayley’s formula, 75 Cayley, Arthur, 37, 92, 160 Chamberland, Marc Andr´ e, 165 Chen, William Yongchuan, 162 Chinese remainder theorem, 60 choosing stones, 166 chromatic number, 44 Ciolan, Emil-Alexandru, 166 Clifford algebra, 163 cluster algebra, 132 Cohen, Paul Joseph, 45 combinatorial trigonometry, 112 composition (of a positive integer), 77 compositional inverse, 153 congruent rectangles, 125 continued fraction, 130 convex hull, 70 Conway, John Horton, 7, 43, 144, 159, 161, 165, 166 Coppersmith, Don, 163 Cover, Tom, 164 cubefree (word), 29 cutcake, 4 impartial, 4 cyclotomic numerical semigroup, 152, 166 cyclotomic polynomial, 117 cyclotomic semigroup, 152

Baer, Reinhold, 164 Baer-Specker group, 164 Bailey, David Harold, 29 Beatty pair, 144 Berlekamp, Elwyn Ralph, 7, 159, 161 Bernoulli numbers, 153 Berstel, Jean, 58 billiards, 46 Binet-Cauchy formula, 17 binomial coefficient, 50 Birkhoff, Garrett, 163 Blakeley triple, 144, 166 Blakeley’s triangle, 51 Blass, Andreas Raphael, 164 Bodini, Olivier, 165 Boltianskii, Vladimir Grigorevich, 164 Borchard, Clayton Wilhelm, 75 border strip, 162 Borwein, Jonathan Michael, 29 Bourbaki, Nicolas, 2 Brams, Steven J., 7 Br´ egman, Lev M., 135, 165 de Bruijn, Nicolaas Govert, 123, 164 Bruinier, Jan Hendrik, 162

Davenport, James Harold, 163 Davis, James, 147, 166 Dedekind psi function, 60 Dedekind sum, 78 Dehn, Max Wilhelm, 164 Demos, Miltiades Stavros, 161 Dey, Tamal Krishna, 164 175


diffeomorphic, 23 differentiable manifold, 22 discriminant, 10, 13 division algebra, 92 Dougherty, Randall Lee, 164 Drucker, Daniel Stephen, 159 Eckmann, Beno, 92 Eisenstein irreducibility criterion, 117 Eisenstein, Ferdinand Gotthold Max, 161 entropy, 165 envy-freeness, 7 epsiloncism, 156 Erd˝ os, Paul (P´ al), 163, 164 Euler number, 108 Euler, Leonhard, 108, 119, 150, 151 exorcism, 156 expectation, 104 factor (of a word or sequence), 29 Feit, Walter, 163 Feller, William (Vilibald Sre´ cko), 164 de Fermat, Pierre, 119 Fibonacci word, 31 fly, motion of, 135 Foata, Dominique, 162 Fomin, Sergey Vladimirovich, 163, 165 Fraktur alphabet, 34 Franco, Zachary Martin, 161 free abelian group, 101 Freedman, Michael Hartley, 160 Freiling, Christopher Francis, 126, 165 Frobenius-Burnside lemma, 35, 67 fundamental bijection, 68 Gantmacher, Felix Ruvimovich, 159 Garc´ıa-S´ anchez, Pedro Abelardo, 166 Gardner, Martin, 159, 166 Gasnikov, Daniil, 160 Gelfond, Aleksandr Osipovich, 165 Gelfond-Schneider theorem, 139, 165 Gerstenhaber, Murray, 163 G¨ obel’s sequence, 131, 165 G¨ odel incompleteness theorem, 138 golden mean, 139, 143 Golomb’s sequence, 139, 165 Golomb, Solomon Wolf, 165 Goodstein, Reuben Louis, 42 Graham, Ronald Lewis, 74, 165, 166 Graves, John Thomas, 92 Green, Ben Joseph, 163 Green-Tao theorem, 99, 163 Greenfield, Gary Robert, 159 Grigor´ ev, N. A., 164 Gr¨ obner basis, 163 Guy, Richard Kenneth, 7, 159, 161 Hadamard matrix, 120 bordered, 120


Hadwiger, Hugo, 164 hairy ball theorem, 92 Hales, Alfred Washington, 165 Halpern, Benjamin Rigler, 47, 161 halving circle, 121 halving line, 120 Hamel area, 123 Hamel basis, 165 Hamilton, William Rowan, 92 Hardy, Godfrey Harold, 162 harmonic number, 34 Hasse-Minkowski theory, 119 Henze, Norbert, 162 hereditary base-b notation, 40 herpetological mathematics, 76 Hilbert Hotel, 46 Hilbert, David, 159, 165 Hurwitz, Adolf, 92 increasing subsequence, 104 indecomposable (polynomial), 98 iotacism, 156 ji˘ an sh´ız˘ı, 166 Johansson, Fredrik, 162 Jordan, Linus, 162 Kakutani, Shizuo, 163 Kant, Immanuel, 89 Kentucky Bend, 163 Kepler, Johannes, 58, 161 Khalyavin, Andrey, 160 Klarner, David Anthony, 165 Kolakoski sequence, 138, 165 Kolakoski, William George, 138 Komj´ ath, P´ eter, 161 Konheim, Alan Gustave, 162 Krantz, Steven George, 163 Krob, Daniel, 85, 163 Kummer’s theorem, 24 Laczkovich, Mikl´ os, 126, 165 Lagarias, Jeffrey Clark, 34, 160, 165 Lagrange inversion formula, 154, 166 Laguerre polynomial, generalized, 159 lambdacism, 156 Lang, Serge, 89 Last, G¨ unter, 162 lattice point, 118 Laurent phenomenon, 132, 165 left-to-right maximum (of a permutation), 67 Legendre’s formula, 24 Legendre, Adrien-Marie, 24 limit ordinal, 41 Lloyd, Stuart P., 36, 160 longest cycle, 35 look and say sequence, 144, 166 Lov´ asz, L´ aszl´ o, 74, 162


Lowen, Robert W., 159 de Luca, Aldo, 58 Lucas’s theorem, 23 Mackenzie, Simon, 7 Malliaris, Maryanthe, 161 Martelli, Mario, 165 Maynard, James, 28 Mazur, Barry Charles, 89 Millennium Prize Problems, 88 Miller, George Abram, 160 Miller, Haynes Robert, 163 Miltersen, Peter Bro, 165 Minc, Henryk, 134, 165 Moore, Robert Lee, 164 Moree, Pieter, 166 Morse-Hedlund sequence, 29 Moser, Leo, 160 Muir, Sir Thomas, 159 multiplicative (function), 60 multiplicative triangle, 51 von Neumann, John (J´ anos), 136 New Madrid earthquakes, 97 Newton, Isaac, 15 Nim, 142 nonvanishing vector field, 92 Ochem, Pascal, 160 octonions, 92, 163 Odlyzko, Andrew Michael, 160 OEIS, 25 Oldenburger, Rufus, 138 Ono, Ken, 162 ophidiophobia, 78 order (of a polyomino), 165 ordinal number, 41 Pak, Igor, 162 parking function, 75, 162 partition (of an integer), 78, 150, 166 partizan game, 5 Pascal’s triangle, 50 Peano postulates, 42 permanent, 134 picking stones, 166 Pingal’s Meruprastar, 50 Pingala, Archarya, 50 Plouffe, Simon, 29 Pokrovskii, V. G., 164 de Polignac’s formula, 24 de Polignac, Alphonse Armand Charles Georges Marie, 24 Pollak, Henry Otto, 75, 162 Pollard, Harry, 164 polyomino, 127 power sum symmetric function, 11 primorial number, 99 principal leading minor, 11


prisoners, beleaguered, 45, 132 proportionality, 7 Propp, James Gary, 165 Prouhet-Thue-Morse sequence, 29 Pun, Anna Ying, 159, 160 Putzer, Eugene J., 159 Pyke, Ronald, 162 quadratic form, 12, 119 quaternions, 92 Quinlan, Rachel, 163 Rademacher, Hans Adolph, 162 Radhakrishnan, Jaikumar, 165 Radon, Johann Karl August, 92 Ramanujan, Srinivasa Aiyangar, 162 Raney tree, 58 rational (generating function), 55 regular simplex, 118 R´ enyi, Alfr´ ed, 162, 163 Reutenauer, Christophe, 163 reverse alternating permutation, 109 Reznick, Bruce Arie, 161 ribbon, 162 ribbon tile, 162 Riemann hypothesis, 34, 160 Rinne, Dan C., 126, 165 Riordan, John Francis, 162 Rozikov, Utkir Abdulloevich, 161 Saari, Donald Gene, 161 safe (position), 142 Sallows, Lee Cecil Fletcher, 166 Schneider, Theodor, 165 Schoenberg, Isaac Jacob, 15, 164 Schur, Issai, 159 Sch¨ utzenberger, Marcel-Paul, 162 secant number, 109 Selberg’s integral, 162 Selberg, Atle, 162 Selfridge, John Lewis, 7 semidefinite (matrix), 12 semiring, 85 de Seguins Pazzis, Cl´ ement, 163 Shallit, Jeffrey Outlaw, 160 Shelah, Saharon, 161 Shepp, Lawrence Alan, 36, 160 Shur, Arseny M., 160 Silverman’s sequence, 139 similar rectangles, 124 Skolem, Thoralf Albert, 166 snake, 76, 162 Soifer, Alexander, 45, 161 Solovay, Robert Martin, 161 Somos sequence, 132, 165 sophomore’s dream, 23 S´ os, Vera T., 166 Specker group, 164 Specker, Ernst Paul, 164


Speyer, David E,3 , 165 spigot algorithm, 29 squarefree (word), 29 Stallings, Jr., John Robert, 160 standard representation (of a permutation), 67 Stanley, Richard Peter, 159–166 Steinhaus conjecture, 166 Steinhaus, Hugo Dyonizy, 166 Stern’s diatomic array, 53, 161 Stern’s diatomic sequence, 57 Stern’s diatomic series, 57 Stern’s triangle, 52 Stern, Moritz Abraham, 57, 58, 161 Stern-Brocot sequence, 57 Stigler’s law of eponymy, 24, 58, 138, 165 Stigler, Stephen Mack, 24 Stirling number, signless of first kind, 35, 160 Stirling’s formula, 133 Stong, Richard Andrew, 166 subtraction-free computation, 86 Sur´ anyi, J´ anos, 166 Swan, Richard Gordon, 159 ´ Swierczkowski, Stanislaw, 166 Sylver Coinage, 43, 161 Sylvester, James Joseph, 43, 75 Szekeres, George, 126, 165 Tabachnikov, Serge(i), 161 tangent number, 109 Tao, Terence Chi-Shen, 163 Taubes, Clifford Henry, 160 Taylor, Alan Dana, 7 three-distance theorem, 166 three-gap theorem, 158, 166 Thurston, William Paul, 165 Toeplitz matrix, 15 total positivity, 15 total b-ary expansion, 40 Toth, Geza, 164 Totik, Vilmos, 161 transcendental number, 139 tribone, 124, 165 Tutte, William Thomas, 36 Uspensky, James Victor, 144, 166 Valtr, Pavel, 70, 162 valuation (of a cake), 7 value (of a partizan game), 5 Vandermonde determinant, 11 vector field, 92 Verdenius, Wibbe, 164 Wagon, Stanley, 122, 164 3 David Speyer’s middle name is “E” (no period)


Wall, Hubert Stanley, 126, 165 Weil, Andr´e, 3 Weiss, Benjamin, 162 well-ordering (of a set), 43 Whitney, Anne Marie, 15 wildcard (in a ternary squarefree word), 30 Wilf, Herbert Saul, 58, 161 Wilkie, Alex James, 85, 163 Winkler, Peter Mann, 74, 159, 162, 164, 166 Wythoff’s Game, 142 Wythoff’s Nim, 142, 166 Wythoff, Willem Abraham, 166 Xia, Zhihong (Jeff), 161 Yan, Catherine Huafei, 162 Yang, Jed, 165 Zelevinsky, Andrey Vladlenovich, 165 Zermelo-Fraenkel set theory, 42

This book features mathematical problems and results that would be of interest to all mathematicians, but especially undergraduates (and even high school students) who participate in mathematical competitions such as the International Math Olympiads and Putnam Competition. The format is a dialogue between a professor and eight students in a summer problem solving camp and allows for a conversational approach to the problems as well as some mathematical humor and a few nonmathematical digressions. The problems have been selected for their entertainment value, elegance, trickiness, and unexpectedness, and have a wide range of difficulty, from trivial to horrendous. They range over a wide variety of topics including combinatorics, algebra, probability, geometry, and set theory. Most of the problems have not appeared before in a problem or expository format. A Notes section at the end of the book gives historical information and references.

For additional information and updates on this book, visit