Yueh-Gin Gung and Dr. Charles Y. Hu Award for 2021 to Deanna Haunsperger for Distinguished Service
Nancy Ann Neudauer, Tevian Dray, George Berzsenyi, John Harris, Diane Lussier, Amy Shell-Gellasch, James Álvarez, and Ruth Charney
The 2021 Gung and Hu Award is enthusiastically bestowed upon Deanna Haunsperger for her prolific service to mathematics including with the Mathematical Association of America, for her influential leadership of women in mathematics, for her long focus on inclusion and on building inclusive mathematical communities, and for a laudable career that has been rich in mathematical research, mathematical education, and mathematical exposition. There are many ways that one might influence the course of mathematics that honors the values and legacy of Gung and Hu. Throughout her career, one can see Deanna’s commitment to welcome students with diverse backgrounds into the mathematics community. Deanna has cultivated hospitality and inclusivity by working to make learning and doing mathematics, from elementary to advanced levels, interesting and viable for many people. Both the goal of building a mathematical community and Deanna’s efforts to this end are aligned with the MAA’s core values. Deanna Haunsperger was MAA President (2017–2018) and is Professor of Mathematics at Carleton College, where she has been teaching for over twenty-five years. She earned her B.A. in mathematics and computer science from Simpson College and her Ph.D. in mathematics from Northwestern University, focusing on voting theory applications to decision making. doi.org/10.1080/00029890.2021.1868181
March 2021]
YUEH-GIN GUNG AND DR. CHARLES Y. HU AWARD
As a faculty member at Carleton College, Co-Editor of Math Horizons, Co-Founder and Co-Director of the NSF-funded Summer Mathematics Program for Women Undergraduates [SMP], the Second Vice President and President of the MAA, Chair of the Strategic Planning Group on Students, Chair of the Council on Outreach Programs, Co-Chair of the Centennial Planning Committee, a member of many more MAA committees, and a member of the mathematics community as a whole, Deanna has done a tremendous job of encouraging, mentoring, and envisioning programs to help undergraduates pursue graduate study and careers in the mathematical sciences. Carleton’s Summer Math Program for women (SMP) was recognized by the AMS in 2014 as a Program That Makes a Difference. As the Co-Founders and Co-Directors from 1995 through 2014, Deanna and fellow mathematician and husband Stephen Kennedy created a community of several hundred female mathematicians who support, encourage, and inspire one another and who mentor younger women who are thinking of going into mathematics. The impact of this group of women mathematicians can be felt throughout the country. The community of women built by this program, whose members started as undergraduates, now boasts over 110 Ph.D.s in mathematics or a mathematical science, with over 30 members currently in graduate school in mathematics. These women will invariably tell you how grateful they were to SMP and to Deanna and Stephen for helping them get where they are now in mathematics. In addition to the leadership and mentoring that Deanna and Stephen provided to students during SMP, they have continued to foster the community of former SMPers long after the NSF stopped funding summer programs for women. Every year, they organize an SMP reunion at the Joint Mathematics Meetings, which now brings together approximately 40–50 alumnae at various stages in their mathematical careers.
While SMP still had some NSF funding, they would help organize a JMM workshop, the Graduate Education Mentoring Workshop (GEM), to offer continued mentoring and networking for former SMPers who were pursuing graduate studies. This workshop was run by former SMPers who had tenure-track jobs themselves, but Deanna’s and Stephen’s ideas, enthusiasm, and encouragement would be felt every step of the way. Deanna was recognized in 2012 by the AWM with the M. Gweneth Humphreys Award for Mentorship of Undergraduate Women in Mathematics. She won, with Stephen Kennedy, the MAA Meritorious Service Award in 2016. She also is on the Board of Directors of Pro Mathematica Arte, which oversees the Budapest Semesters in Mathematics, and she was Co-Chair of the Human Resources Advisory Committee of the Mathematical Sciences Research Institute. Deanna co-edited the books The Edge of the Universe and A Century of Advancing Mathematics and is working on a new book on mathematical communities. Deanna’s co-editorship of the fourth edition of the MAA’s popular 101 Careers in Mathematics is yet another example of Deanna’s MAA-related community effort. This one nicely combines Deanna’s editorial skills with an explicitly broad outreach mission: to encourage students and other young mathematicians to see mathematical careers as possible and viable for themselves. The 125 people featured in this edition are notably diverse in every sense of the word. As MAA President and past-President, Deanna helped launch a new MAA Award and then served as the first Chair of the Committee on the Inclusivity Prize. And if that were not enough, she and Stephen financially support the MAA and are members of the MAA Icosahedron Society. As an Association, the MAA is stronger and a model for others because of Deanna Haunsperger’s insistence that we be fair, inclusive, and welcoming, which has
© THE MATHEMATICAL ASSOCIATION OF AMERICA
[Monthly 128
expanded our community with mathematicians who respect and include all. This is distinguished service from which the MAA and the profession will long benefit. Biographical sketch. Deanna, a first-generation college student from central Iowa, earned her B.A. (1986) in mathematics and computer science from Simpson College and her Ph.D. (1991) from Northwestern University in mathematics under Donald Saari in voting theory applications to decision making. She married fellow mathematician Stephen Kennedy in 1990. From 1991 to 1994, they worked at St. Olaf College alongside past MAA president Lynn Steen and future MAA president Paul Zorn. Since 1994, they have worked at Carleton College, where Deanna chaired the Department of Mathematics, 2011–2014. Together they edited Math Horizons, 1999–2004, and have nurtured the award-winning NSF-funded Carleton Summer Mathematics Program since 1995. Deanna is a former President of the MAA, an inaugural AWM Fellow, and a winner of the AWM’s M. Gweneth Humphreys Award for Mentorship of Undergraduate Women. Deanna and Stephen have two adult children, Sam and Maggie. About the Gung and Hu Award. The Gung and Hu Award for Distinguished Service to Mathematics, first presented in 1990, is the endowed successor to the Association’s Award for Distinguished Service to Mathematics, first presented in 1962. This award is intended to be the most prestigious award for service offered by the Association. It honors distinguished contributions to mathematics and mathematical education, in one particular aspect or many, and in a short period or over a career. The initial endowment was contributed by husband and wife Dr. Charles Y. Hu and Yueh-Gin Gung. It is worth noting that Dr. Hu and Yueh-Gin Gung were not mathematicians, but rather a professor of geography at the University of Maryland and a librarian at the University of Chicago, respectively. 
They contributed generously to our discipline, writing, “we always have high regard and great respect for the intellectual agility and high quality of mind of mathematicians and consider mathematics as the most vital field of study in the technological age we are living in.” Response. The work that I do for the Association and for mathematics is very fulfilling; it brings me great happiness. To be recognized by my friends and by our Association for that work is an incredible honor. When I was in graduate school, one of my graduate-student colleagues asked what I wanted to do with my Ph.D. when I finished, and I explained that my goal was to teach at a liberal arts college. He told me that I should belong to the MAA—the organization that supports professors and their students, embraces research-based pedagogy, promotes research by faculty and students, publishes exceptional exposition, and creates a community where all are welcomed and encouraged to contribute. He was absolutely right, so I joined the MAA, and I married him. Many of the accomplishments listed above would not have been possible, or at least not nearly as fun, without Stephen Kennedy by my side. We have been on many adventures together, mathematical and otherwise, and I thank him for his support. I would also like to thank my colleagues at Carleton and my kids Sam and Maggie for always supporting me when I say yes to new responsibilities. Finally, I would like to thank all the MAA friends I have made along the way for being my mathematical family. Together, I hope we will continue to welcome all people and voices into our community and give them opportunities to contribute. We will be richer for it. ORCID Tevian Dray
http://orcid.org/0000-0001-8692-7963
Analysis of Series and Products. Part 2: The Trapezoidal Rule
Ian Thompson, Morris Davies, and Miren Karmele Urbikain
Abstract. Following on from our recent investigation of series and products using the Euler–Maclaurin formula, we show how the trapezoidal rule can be used to obtain the same asymptotic expansions and can also produce exact transformations into equivalent series with different convergence properties.
1. INTRODUCTION. In a recent article [8], henceforth referred to as “Part 1,” we investigated applications of the Euler–Maclaurin formula in analyzing the behavior of infinite series and products. Specifically, if $S(x)$ is a function expressed as a series, we are concerned with the behavior as $x \to \infty$, where this cannot be obtained by elementary means. As an example, we looked at Poisson’s series
$$S(x) = \sum_{j=1}^{\infty} e^{-j^2/x^2}. \tag{1}$$
Using the Euler–Maclaurin formula, we were able to show that
$$S(x) - \frac{x\sqrt{\pi}}{2} + \frac{1}{2} \to 0 \quad \text{as } x \to \infty,$$
and also that the magnitude of the left-hand side decreases exponentially as $x$ increases. However, we did not retrieve the full form of Poisson’s 1823 result [6, p. 420]
$$S(x) = \frac{x\sqrt{\pi}}{2} - \frac{1}{2} + x\sqrt{\pi} \sum_{j=1}^{\infty} e^{-(j\pi x)^2}, \qquad x > 0. \tag{2}$$
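Poisson’s result (2) is easy to test numerically. The following is a minimal sketch in plain Python; the function names `S_direct`, `S_poisson`, and `S_two_term` are illustrative choices, not notation from the article, and the truncation thresholds are assumptions chosen for double precision.

```python
import math

def S_direct(x, tol=1e-17):
    """Sum the defining series (1) term by term until the terms drop below tol."""
    total, j = 0.0, 1
    while True:
        term = math.exp(-(j / x) ** 2)
        total += term
        if term < tol:
            return total
        j += 1

def S_poisson(x, terms=20):
    """Right-hand side of (2), with the exponential sum truncated after `terms` terms."""
    tail = sum(math.exp(-(j * math.pi * x) ** 2) for j in range(1, terms + 1))
    return x * math.sqrt(math.pi) / 2 - 0.5 + x * math.sqrt(math.pi) * tail

def S_two_term(x):
    """Only the first two terms of (2): the part recovered in Part 1."""
    return x * math.sqrt(math.pi) / 2 - 0.5

if __name__ == "__main__":
    for x in (0.3, 1.0, 3.0):
        print(x, S_direct(x), S_poisson(x), S_direct(x) - S_two_term(x))
```

The last printed column is the part of $S(x)$ missed by the two-term form; it shrinks very quickly as $x$ grows, in line with the exponential decay described above.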
As further examples, we also considered
$$T(x) = \sum_{j=1}^{\infty} e^{-j^3/x^3}, \quad x > 0, \qquad \text{and} \qquad F(x) = \sum_{j=1}^{\infty} \ln\bigl(1 - e^{-j^2/x^2}\bigr). \tag{3}$$
The first of these is motivated by the fact that even summands tend to make matters easier (so we decided to investigate a similar-looking series with an odd function in the summand), and the second arises from the infinite product
$$G(x) = \exp F(x) = \prod_{j=1}^{\infty} \bigl(1 - e^{-j^2/x^2}\bigr),$$
doi.org/10.1080/00029890.2021.1859323 MSC: Primary 40A25, Secondary 30-02 Supplemental data for this article can be accessed on the publisher’s website.
which originates from a method for modelling heat flow through walls (see the supplemental material for Part 1). In this sequel article, we will investigate an alternative method that is slightly more complicated than the Euler–Maclaurin formula, but significantly more powerful: the trapezoidal rule. Using this, we will see that Poisson’s result (2) can be retrieved in full, and (somewhat remarkably) an exact transformation with similar properties can be obtained for $F(x)$.

2. THE TRAPEZOIDAL RULE. The composite trapezoidal rule for the real line may be obtained by starting with a finite interval $[-a, a]$ and dividing this into $2n$ subintervals each of width $\Delta s = a/n$. Given a function $h$, we then approximate the area under the graph of $h$ on each subinterval using a trapezoid. The area of one such trapezoid is
$$A_j = \frac{\Delta s}{2}\Bigl[h\bigl((j-1)\Delta s\bigr) + h(j\,\Delta s)\Bigr],$$
and summing over $j$ leads to the result
$$\int_{-a}^{a} h(s)\,ds = \frac{\Delta s}{2} \sum_{j=1-n}^{n} \Bigl[h\bigl((j-1)\Delta s\bigr) + h(j\,\Delta s)\Bigr] + E = \Delta s \left[\frac{h(-a) + h(a)}{2} + \sum_{j=1-n}^{n-1} h(j\,\Delta s)\right] + E,$$
where $E$ is the error, which generally disappears as $\Delta s \to 0$. Taking the limit $n \to \infty$ while keeping $\Delta s$ fixed (assuming that $h(\pm a) \to 0$ as $a \to \infty$, and that the integral exists in this limit), we find that the rule for the whole real line is
$$\int_{-\infty}^{\infty} h(s)\,ds = \Delta s \sum_{j=-\infty}^{\infty} h(j\,\Delta s) + E. \tag{4}$$
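As a numerical illustration of (4), the sketch below applies the whole-line rule to the Gaussian $h(s) = e^{-s^2}$, whose integral is $\sqrt{\pi}$, with step width $1/x$. The helper name `trapezoid_real_line` and the truncation `cutoff` are assumptions made for this example only.

```python
import math

def trapezoid_real_line(h, step, cutoff=12.0):
    """Evaluate step * sum_j h(j * step), truncating the sum where |j * step| > cutoff."""
    n = int(cutoff / step)
    return step * sum(h(j * step) for j in range(-n, n + 1))

def gauss(s):
    return math.exp(-s * s)

if __name__ == "__main__":
    exact = math.sqrt(math.pi)
    for x in (0.5, 1.0, 2.0):
        approx = trapezoid_real_line(gauss, 1.0 / x)
        # approx - exact is minus the error E of (4) for this step width
        print(x, approx - exact)
```

For this integrand the discrepancy falls off much faster than any power of the step width, which is the behavior that the contour-integral analysis of the error is designed to capture.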
So, much like the Euler–Maclaurin formula, the trapezoidal rule relates a sum to an integral involving the same function. As a simple example we can set $h(s) = e^{-s^2}$ and $\Delta s = 1/x$ to immediately retrieve the first two terms in (2). The issue now is to determine the error $E$. One approach is to return to the Euler–Maclaurin formula (see [7, §3.3] for example), but this simply reproduces results we already have. We will employ an alternative method based on contour integration, which can yield more information, especially in cases where $E$ decreases exponentially with $\Delta s$ (see [9] for a comprehensive survey of these). The idea dates back to a paper from the late nineteenth century by Georg Landsberg [2], where it was used to derive (2). Later it was used “in reverse” by Alan Turing [10], as a means of proving that a certain integral representation of the Riemann zeta function can be computed very precisely using a series. We begin by setting up an integral that will evaluate to the series in question in an appropriate limit. Let $\Gamma$ be the anticlockwise oriented rectangular contour with vertices at $\pm(Q + \tfrac{1}{2}) + iu$ and $\pm(Q + \tfrac{1}{2}) - iv$ for $u, v > 0$ (see Figure 1). Then, provided $f$ is analytic inside $\Gamma$, the residue theorem shows that
$$\oint_{\Gamma} \frac{f(s; x)}{e^{2\pi i s} - 1}\,ds = 2\pi i \sum_{j=-Q}^{Q} f(j; x) \lim_{s \to j} \frac{s - j}{e^{2\pi i s} - 1} = \sum_{j=-Q}^{Q} f(j; x). \tag{5}$$

Figure 1. The closed contour $\Gamma$, consisting of the straight sections $\Gamma_1, \dots, \Gamma_6$. The black bars denote the divisions between $\Gamma_1$ and $\Gamma_6$ and between $\Gamma_3$ and $\Gamma_4$ at $s = \pm(Q + \tfrac{1}{2})$.
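The residue identity (5) can be checked by brute force. The sketch below integrates $f(s;x)/(e^{2\pi i s}-1)$ around the rectangle with Simpson’s rule on each edge, taking the Gaussian summand $f(s;x) = e^{-s^2/x^2}$ as a test case. The helper names and the number of Simpson panels are assumptions made for this example, not part of the article.

```python
import cmath
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for complex g along the straight segment from a to b (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def contour_integral(f, Q, u, v):
    """Integrate f(s) / (exp(2*pi*i*s) - 1) anticlockwise around the rectangle
    with vertices +-(Q + 1/2) + i*u and +-(Q + 1/2) - i*v."""
    half = Q + 0.5
    corners = [complex(-half, -v), complex(half, -v),
               complex(half, u), complex(-half, u)]

    def integrand(s):
        return f(s) / (cmath.exp(2j * math.pi * s) - 1)

    return sum(simpson(integrand, corners[k], corners[(k + 1) % 4])
               for k in range(4))

if __name__ == "__main__":
    x, Q = 1.0, 2
    value = contour_integral(lambda s: cmath.exp(-(s / x) ** 2), Q, 0.7, 0.3)
    series = sum(math.exp(-(j / x) ** 2) for j in range(-Q, Q + 1))
    print(value, series)
```

Taking different heights above and below the real axis, as here, does not change the result, because the integrand has no singularities other than the poles at the integers.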
Now consider the individual sections of the contour $\Gamma$. On the lower edge, the factor $e^{2\pi v}$ appears in the denominator of the integrand, so if $v$ is chosen to be sufficiently large then the contribution from $\Gamma_5$ will be exponentially small. To achieve the same effect on the upper edge, we observe that
$$\frac{f(s; x)}{e^{2\pi i s} - 1} = f(s; x) \left[\frac{e^{2\pi i s}}{e^{2\pi i s} - 1} - 1\right], \tag{6}$$
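and hence
$$\lim_{Q \to \infty} \int_{\Gamma_{1,2,3}} \frac{f(s; x)}{e^{2\pi i s} - 1}\,ds = \lim_{Q \to \infty} \int_{\Gamma_{1,2,3}} \frac{f(s; x)}{1 - e^{-2\pi i s}}\,ds + \int_{-\infty}^{\infty} f(s; x)\,ds.$$
Here we have introduced the shorthand notation $\Gamma_{1,2,3}$ to mean the union of the three sections $\Gamma_1$, $\Gamma_2$, and $\Gamma_3$. Combining this with (5), we then have
$$\sum_{j=-\infty}^{\infty} f(j; x) = \int_{-\infty}^{\infty} f(s; x)\,ds + H + V, \tag{7}$$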
where $H$ and $V$ represent the contributions from the horizontal and vertical components of $\Gamma$, respectively. That is,
$$H = \lim_{Q \to \infty} \left[\int_{\Gamma_2} \frac{f(s; x)}{1 - e^{-2\pi i s}}\,ds + \int_{\Gamma_5} \frac{f(s; x)}{e^{2\pi i s} - 1}\,ds\right] \tag{8}$$
and
$$V = \lim_{Q \to \infty} \left[\int_{\Gamma_{1,3}} \frac{f(s; x)}{1 - e^{-2\pi i s}}\,ds + \int_{\Gamma_{4,6}} \frac{f(s; x)}{e^{2\pi i s} - 1}\,ds\right]. \tag{9}$$
Next, we simplify our expression for $H$. Parametrizing $\Gamma_2$ and $\Gamma_5$ by writing $s = w + iu$ and $s = w - iv$ (and noting that $\Gamma_2$ is traversed from right to left), we obtain
$$H = \int_{-\infty}^{\infty} \frac{f(w + iu; x)}{e^{2\pi(u - iw)} - 1}\,dw + \int_{-\infty}^{\infty} \frac{f(w - iv; x)}{e^{2\pi(v + iw)} - 1}\,dw. \tag{10}$$
Up to this point, we have allowed for the possibility that $u$ and $v$ might take different values. This may be useful for complex series, but if the summand $f(s; x)$ is real for real $s$ then the Schwarz reflection principle [5, Theorem 10.4] applies, i.e.,
$$f(\bar{s}; x) = \overline{f(s; x)}, \tag{11}$$
where the overbar denotes a complex conjugate. Crucially, this means the singularity structure of $f$ is symmetric about the real axis. In this case, setting $v = u$ makes the integrals in (10) into mutual complex conjugates, meaning that
$$H = 2 \operatorname{Re} \int_{-\infty}^{\infty} \frac{f(w + iu; x)}{e^{2\pi(u - iw)} - 1}\,dw. \tag{12}$$
Even in this reduced form, it is usually difficult to evaluate $H$ directly. However, it may be possible to proceed using the fact that $|e^{2\pi(u - iw)}| > 1$, and so
$$H = 2 \sum_{j=1}^{\infty} e^{-2\pi j u} \operatorname{Re} \int_{-\infty}^{\infty} f(w + iu; x)\, e^{2\pi i j w}\,dw. \tag{13}$$
Generally, the integral in (13) is easier to evaluate than the integral in (12). Failing this, a useful bound can be obtained by noting that
$$|H| < \frac{2}{e^{2\pi u} - 1} \int_{-\infty}^{\infty} \bigl|f(w + iu; x)\bigr|\,dw. \tag{14}$$
The factor $e^{2\pi u}$ appearing in the denominator will often facilitate a proof that $H$ is exponentially small. Finally, we must consider $V$. Again assuming that the Schwarz reflection principle (11) holds and setting $v = u$, we parametrize the contours in (9) by writing $s = \pm(Q + \tfrac{1}{2}) \pm iw$. In this way, we find that
$$V = -2 \lim_{Q \to \infty} \int_{0}^{u} \operatorname{Im}\Bigl[f\bigl(Q + \tfrac{1}{2} + iw; x\bigr) - f\bigl(-Q - \tfrac{1}{2} + iw; x\bigr)\Bigr] \frac{dw}{1 + e^{2\pi w}}. \tag{15}$$
For most series of practical interest, $f(s; x) \to 0$ as $s \to \infty$, at least in the vicinity of the real line, and we aim to use this to show that $V = 0$. A simple strategy is to find an upper bound for the modulus of the term in square brackets. That is, we write
$$M(Q) = \max_{0 \le w \le u} \Bigl|\operatorname{Im}\Bigl[f\bigl(Q + \tfrac{1}{2} + iw; x\bigr) - f\bigl(-Q - \tfrac{1}{2} + iw; x\bigr)\Bigr]\Bigr|,$$
and observe that
$$|V| \le 2 \lim_{Q \to \infty} M(Q) \int_{0}^{u} \frac{dw}{1 + e^{2\pi w}} \le \frac{1}{\pi} \lim_{Q \to \infty} M(Q),$$
having replaced the denominator with $e^{2\pi w}$ to reach the last line. It then remains to show that $M(Q) \to 0$ as $Q \to \infty$; the method for achieving this will depend upon the particular form of the function $f$.

Figure 2. The closed contour used for sums over the natural numbers, consisting of the straight sections $\Gamma_1, \dots, \Gamma_6$ and the semicircle $\Gamma_c$. The black bar denotes the division between $\Gamma_6$ and $\Gamma_1$ at $s = Q + \tfrac{1}{2}$.

A variation of the above analysis allows us to deal with sums in which the index ranges over the natural numbers. The analogue of (4) for this case is
$$\int_{0}^{\infty} h(s)\,ds = \Delta s \left[\frac{h(0)}{2} + \sum_{j=1}^{\infty} h(j\,\Delta s)\right] + E,$$
where once again, $E$ represents an error that will disappear as $\Delta s \to 0$. To obtain additional terms and a bound for the error, we use a contour integral similar to the left-hand side of (5) but with the path now as shown in Figure 2. Assuming $f(s; x)$ is analytic inside the contour, the residue theorem yields
$$\oint \frac{f(s; x)}{e^{2\pi i s} - 1}\,ds = \sum_{j=1}^{Q} f(j; x). \tag{16}$$
As before, the exponential in the denominator will prove useful in showing that the integral along the lower edge is small for large $v$, and we use (6) to rewrite the contributions from the upper half-plane in the form
$$\int_{\Gamma_{1,2,3}} \frac{f(s; x)}{e^{2\pi i s} - 1}\,ds = \int_{\Gamma_{1,2,3}} \frac{f(s; x)}{1 - e^{-2\pi i s}}\,ds - \int_{\Gamma_{1,2,3}} f(s; x)\,ds. \tag{17}$$
Some care must now be taken with limits. Since there is a pole at the origin (unless it should happen that $f(0; x) = 0$), we cannot let $\varepsilon \to 0$ in (17). However, the last integral does not possess this singularity, so
$$\lim_{Q \to \infty} \lim_{\varepsilon \to 0} \int_{\Gamma_{1,2,3}} f(s; x)\,ds = -\int_{0}^{\infty} f(s; x)\,ds. \tag{18}$$
Therefore, letting $Q \to \infty$ in (16) yields
$$\sum_{j=1}^{\infty} f(j; x) = \int_{0}^{\infty} f(s; x)\,ds + H + R + L, \tag{19}$$
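where $H$, $R$, and $L$ denote contributions from the horizontal, right, and left edges of the contour in Figure 2, respectively. That is,
$$H = \lim_{Q \to \infty} \left[\int_{\Gamma_2} \frac{f(s; x)}{1 - e^{-2\pi i s}}\,ds + \int_{\Gamma_5} \frac{f(s; x)}{e^{2\pi i s} - 1}\,ds\right], \tag{20}$$
$$R = \lim_{Q \to \infty} \left[\int_{\Gamma_1} \frac{f(s; x)}{1 - e^{-2\pi i s}}\,ds + \int_{\Gamma_6} \frac{f(s; x)}{e^{2\pi i s} - 1}\,ds\right], \tag{21}$$
and
$$L = \lim_{\varepsilon \to 0} \left[\int_{\Gamma_3} \frac{f(s; x)}{1 - e^{-2\pi i s}}\,ds + \int_{\Gamma_{4,c}} \frac{f(s; x)}{e^{2\pi i s} - 1}\,ds\right]. \tag{22}$$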
The first of these can be treated following the procedure used in the previous case; one need only change the lower limit of integration to zero in (10) and (12)–(14) to obtain the relevant equations. Similarly, the contributions from the right edges of the two contours (Figures 1 and 2) are the same. Therefore, if $f(s; x)$ is real for positive real $s$, then a simplified form for $R$ is given by (15) with the second term inside the square brackets omitted. However, the location of the left edge is now fixed, so we must deal with $L$ in some other way. Parametrizing $\Gamma_3$, $\Gamma_4$, and $\Gamma_c$ by writing $s = iw$, $s = -iw$, and $s = \varepsilon e^{i\theta}$, we find that
$$L = i \lim_{\varepsilon \to 0} \left[\int_{\varepsilon}^{u} \frac{f(iw; x)}{e^{2\pi w} - 1}\,dw - \int_{\varepsilon}^{v} \frac{f(-iw; x)}{e^{2\pi w} - 1}\,dw - \int_{-\pi/2}^{\pi/2} \frac{\varepsilon e^{i\theta} f\bigl(\varepsilon e^{i\theta}; x\bigr)}{\exp(2\pi i \varepsilon e^{i\theta}) - 1}\,d\theta\right]. \tag{23}$$
If $f(s; x)$ is analytic at $s = 0$, then the limit $\varepsilon \to 0$ commutes with the last integral. The first two terms can also be simplified if $f(s; x)$ is real for real positive $s$. Setting $v = u$ then leads to the result
$$L = -2 \int_{0}^{u} \frac{\operatorname{Im}[f(iw; x)]}{e^{2\pi w} - 1}\,dw - \frac{f(0; x)}{2}. \tag{24}$$
The remaining integral in (24) must be evaluated exactly, or at least approximated, in order to gain a useful expansion for the series. One possibility is to use a Maclaurin series. Thus, if
$$f(s; x) = \sum_{j=0}^{\infty} a_j(x)\, s^j, \tag{25}$$
then
$$L = -\frac{a_0(x)}{2} + 2 \sum_{j=1}^{\infty} a_{2j-1}(x)\,(-1)^j \int_{0}^{u} \frac{w^{2j-1}}{e^{2\pi w} - 1}\,dw. \tag{26}$$
In principle, further progress can be made using repeated integration by parts, because it follows from [3, equations (7.1) and (7.2)] that
$$\operatorname{Li}_0\bigl(e^{-2\pi w}\bigr) = \frac{1}{e^{2\pi w} - 1} \qquad \text{and} \qquad \frac{d}{dw} \operatorname{Li}_n\bigl(e^{-2\pi w}\bigr) = -2\pi \operatorname{Li}_{n-1}\bigl(e^{-2\pi w}\bigr),$$
where $\operatorname{Li}_n(\cdot)$ represents the polylogarithm of order $n$. However, the resulting expressions are complicated and unlikely to be useful. Alternatively, if it is possible to let $u \to \infty$, we may use the much simpler result [1, equation 3.411(2)]
$$\int_{0}^{\infty} \frac{w^{2j-1}}{e^{2\pi w} - 1}\,dw = (-1)^{j+1} \frac{B_{2j}}{4j}, \qquad j = 1, 2, \ldots, \tag{27}$$
where $B_j$ represents a Bernoulli number. However, a word of caution is in order here. To reach (26) from (25), we interchanged an infinite series with an integration, and this procedure may not be valid if the integral is improper. The effect of this can be observed from the fact that replacing $u$ with infinity in (26) and using (27) yields
$$L = -\frac{a_0(x)}{2} - \sum_{j=1}^{\infty} a_{2j-1}(x) \frac{B_{2j}}{2j}. \tag{28}$$
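The integral (27) can be confirmed numerically for the first few values of $j$. In the sketch below the Bernoulli numbers $B_2, B_4, B_6, B_8$ are hard-coded, and the infinite range is truncated at an assumed upper limit where the integrand is negligible; the names `lhs_27` and `rhs_27` are illustrative only.

```python
import math
from fractions import Fraction

# Bernoulli numbers B_2, B_4, B_6, B_8, indexed by j = 1, ..., 4
B_EVEN = {1: Fraction(1, 6), 2: Fraction(-1, 30),
          3: Fraction(1, 42), 4: Fraction(-1, 30)}

def lhs_27(j, upper=10.0, n=20000):
    """Simpson approximation to the integral in (27), truncated at `upper` (n even)."""
    def g(w):
        if w == 0.0:
            # limiting value of w^(2j-1) / (e^(2 pi w) - 1) as w -> 0
            return 1 / (2 * math.pi) if j == 1 else 0.0
        return w ** (2 * j - 1) / math.expm1(2 * math.pi * w)

    h = upper / n
    total = g(0.0) + g(upper)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

def rhs_27(j):
    """Closed form (-1)^(j+1) * B_{2j} / (4j)."""
    return (-1) ** (j + 1) * float(B_EVEN[j]) / (4 * j)

if __name__ == "__main__":
    for j in range(1, 5):
        print(j, lhs_27(j), rhs_27(j))
```

The closed-form values here are $1/24$, $1/240$, $1/504$, and $1/480$ for $j = 1, \dots, 4$.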
Since the coefficient function $a_{2j-1}(x)$ is arbitrary at this point, we cannot say for certain whether the series on the right-hand side is convergent, but in most cases it will be divergent, because the magnitude $|B_{2j}|$ grows very rapidly with $j$. Nevertheless, using a finite number of terms may produce a useful asymptotic representation for $L$, as we will see in our second example.

3. EXAMPLES. We now apply the trapezoidal rule to the three example series from Part 1; that is $S(x)$ from (1) and then $T(x)$ and $F(x)$ from (3). In each case, $x$ is a positive parameter, and the initial form of the series converges rapidly for small values. We seek alternative forms of the series that provide accurate values for large $x$.

Even summand. For the simple case of Poisson’s series $S(x)$, we begin by writing
$$f(s; x) = e^{-s^2/x^2}. \tag{29}$$
As in Part 1, we take advantage of the fact that the summand is an even function, writing
$$S(x) = \frac{1}{2} \left[-1 + \sum_{j=-\infty}^{\infty} e^{-j^2/x^2}\right]. \tag{30}$$
We then use (29) in (7) to obtain
$$2S(x) + 1 = x\sqrt{\pi} + H + V, \tag{31}$$
where now
$$H = 2 \operatorname{Re} \int_{-\infty}^{\infty} \frac{e^{-(w + iu)^2/x^2}}{e^{2\pi(u - iw)} - 1}\,dw \tag{32}$$
and
$$V = 4 \lim_{Q \to \infty} e^{-(Q + 1/2)^2/x^2} \int_{0}^{u} \frac{e^{w^2/x^2} \sin\bigl((2Q + 1)w/x^2\bigr)}{1 + e^{2\pi w}}\,dw. \tag{33}$$
The integral in (32) can be evaluated using (13). We find that
$$H = 2 \sum_{j=1}^{\infty} e^{-2\pi j u} \operatorname{Re} \int_{-\infty}^{\infty} e^{-(w + iu)^2/x^2}\, e^{2\pi i j w}\,dw = 2x\sqrt{\pi} \sum_{j=1}^{\infty} e^{-(j\pi x)^2},$$
having used the substitution $w = t + i(\pi j x^2 - u)$. In view of this, we can choose any finite value for $u$. For example, with $u = 1$ we have
$$|V| \le 4 \lim_{Q \to \infty} e^{-(Q + 1/2)^2/x^2} \int_{0}^{1} \frac{e^{w^2/x^2}}{1 + e^{2\pi w}}\,dw,$$
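The evaluation of $H$ rests on the fact that each term $e^{-2\pi j u} \operatorname{Re} \int_{-\infty}^{\infty} e^{-(w+iu)^2/x^2} e^{2\pi i j w}\,dw$ equals $x\sqrt{\pi}\,e^{-(j\pi x)^2}$ whatever the value of $u$. The sketch below checks this by direct quadrature; the function name `gaussian_h_term`, the truncation of the range at $|w| = 12x$, and the number of panels are assumptions made for illustration.

```python
import cmath
import math

def gaussian_h_term(j, x, u, n=20000):
    """exp(-2 pi j u) * Re of the integral in (13) for f(s; x) = exp(-s^2 / x^2),
    approximated by the trapezoidal rule on the truncated range [-12x, 12x]."""
    half_width = 12.0 * x
    step = 2 * half_width / n
    total = 0j
    for k in range(n + 1):
        w = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * cmath.exp(-((w + 1j * u) / x) ** 2
                                    + 2j * math.pi * j * w)
    return math.exp(-2 * math.pi * j * u) * (total * step).real

if __name__ == "__main__":
    x, j = 0.5, 1
    closed_form = x * math.sqrt(math.pi) * math.exp(-(j * math.pi * x) ** 2)
    for u in (0.1, 0.5, 1.0):
        print(u, gaussian_h_term(j, x, u), closed_form)
```

That the computed value does not depend on $u$ is what allows any finite $u$ to be chosen when bounding $V$.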
which clearly shows that $V = 0$. Rearranging (31) now yields (2).

A summand without symmetry. We now consider the series $T(x)$ from (3). To apply the trapezoidal rule here we must use (19), because the index clearly cannot be extended to $-\infty$ in a manner similar to (30). The necessary integral is given in Part 1; we have
$$\int_{0}^{\infty} e^{-s^3/x^3}\,ds = x\,\Gamma\bigl(\tfrac{4}{3}\bigr), \qquad x > 0,$$
where $\Gamma(\cdot)$ represents the Gamma function [4, Chapter 5]. Consequently (19) yields
$$T(x) = x\,\Gamma\bigl(\tfrac{4}{3}\bigr) + H + R + L, \tag{34}$$
with
$$H = 2 \operatorname{Re} \int_{0}^{\infty} \frac{e^{-(w + iu)^3/x^3}}{e^{2\pi(u - iw)} - 1}\,dw, \tag{35}$$
$$R = 2 \lim_{Q \to \infty} e^{-(Q + 1/2)^3/x^3} \int_{0}^{u} e^{3w^2(Q + 1/2)/x^3} \sin\left(\frac{3w(Q + \tfrac{1}{2})^2 - w^3}{x^3}\right) \frac{dw}{1 + e^{2\pi w}}, \tag{36}$$
and
$$L = -\frac{1}{2} - 2 \int_{0}^{u} \frac{\sin(w^3/x^3)}{e^{2\pi w} - 1}\,dw. \tag{37}$$
Now
$$|H| < \frac{2}{e^{2\pi u} - 1} \int_{0}^{\infty} e^{-(w^3 - 3wu^2)/x^3}\,dw,$$
and this can be bounded by setting $u = x$ and then writing $w = xt$. In this way, we find that
$$|H| < \frac{2x}{e^{2\pi x} - 1} \int_{0}^{\infty} e^{3t - t^3}\,dt.$$
The remaining integral does not depend on $x$. It can be bounded by splitting the integration path at $t = 2$. To the left of this point the integrand is bounded above by $e^2$, and to the right we have $3t - t^3 < -t$. Therefore
$$|H| < \frac{2x}{e^{2\pi x} - 1}\,\bigl(2e^2 + e^{-2}\bigr).$$
0, and initially assume that neither $u$ nor $v$ exceeds $x\sqrt{\pi}$ so that all points $s_j^{\pm}$ lie outside the contour and therefore do not interfere with the calculation. Since the configuration of singularities is symmetric about the real axis, it is natural to set $u = v$, after which the Schwarz reflection principle (11) applies for $s$ on the contour. The next step is to substitute (39) into (19). It should be noted that (19) depends on (18), which remains valid because the singularity of $f(s; x)$ at the origin is integrable. In this way, we find that
$$F(x) = -\frac{x\sqrt{\pi}}{2}\,\zeta\bigl(\tfrac{3}{2}\bigr) + H + R + L, \tag{42}$$
where $\zeta(\cdot)$ represents the Riemann zeta function [4, Chapter 25]. In (42), $H$, $R$, and $L$ are the contributions from the horizontal, right, and left edges of the contour, given by (20)–(22), respectively, and we have used the fact that
$$\int_{0}^{\infty} \ln\bigl(1 - e^{-s^2/x^2}\bigr)\,ds = -\frac{x\sqrt{\pi}}{2}\,\zeta\bigl(\tfrac{3}{2}\bigr),$$
which was derived in Part 1. Our strategy for evaluating the remaining terms in (42) is to allow the contour to expand vertically as well as horizontally in the limit $Q \to \infty$. It is then possible to evaluate $H$ exactly, by summing the contributions from the branch cuts emanating from the singularities at $s = s_j^{+}$. We can also evaluate $L$ exactly, though it is necessary to return to (23) in order to achieve this; (24) is not valid here because $f(0; x)$ does not exist. However, we must first find an appropriate value for $u$, by considering the contribution from the right edge of the contour. A simplified expression for this
is given by (15) with the second term in the square bracket omitted (see Section 2); thus
$$R = -2 \lim_{Q \to \infty} \int_{0}^{u} \operatorname{Im}\Bigl[f\bigl(Q + \tfrac{1}{2} + iw; x\bigr)\Bigr] \frac{dw}{1 + e^{2\pi w}}.$$
On the path of integration, the imaginary part of $f$ is the argument of the complex quantity
$$1 - e^{-(Q + 1/2 + iw)^2/x^2} = 1 - e^{(w^2 - (Q + 1/2)^2 - iw(2Q + 1))/x^2},$$
which clearly lies in the right half-plane if $w < Q + \tfrac{1}{2}$. Moreover, since $f(s; x)$ is real for positive real $s$ (so that the argument is zero), a continuous branch can only be maintained by using the principal value (with imaginary part in the interval $(-\pi, \pi]$) whenever $w$ satisfies this inequality. Since the upper bound for $w$ is $u$, a natural choice is to set $u = Q$. The argument can then be determined using the inverse sine function; thus
$$R = -2 \lim_{Q \to \infty} \int_{0}^{Q} \arcsin\left(\frac{e^{(w^2 - (Q + 1/2)^2)/x^2} \sin\bigl(w(2Q + 1)/x^2\bigr)}{\bigl|1 - e^{(w^2 - (Q + 1/2)^2 - iw(2Q + 1))/x^2}\bigr|}\right) \frac{dw}{1 + e^{2\pi w}}.$$
Then, by maximizing the modulus of the argument to the inverse sine function, we find that
$$\begin{aligned} |R| &\le 2 \lim_{Q \to \infty} \arcsin\left(\frac{e^{(Q^2 - (Q + 1/2)^2)/x^2}}{1 - e^{(Q^2 - (Q + 1/2)^2)/x^2}}\right) \int_{0}^{\infty} \frac{dw}{e^{2\pi w}} \\ &\le \frac{1}{\pi} \lim_{Q \to \infty} \arcsin\left(\frac{e^{-(Q + 1/4)/x^2}}{1 - e^{-(Q + 1/4)/x^2}}\right) = 0. \end{aligned} \tag{43}$$
Next consider $L$. Unlike the corresponding integral for $T(x)$ (which is (37)), this contribution depends on $Q$, because we are expanding the contour vertically as well as horizontally. Therefore (23) becomes
$$L = \lim_{Q \to \infty} \lim_{\varepsilon \to 0} \left[-2 \int_{\varepsilon}^{Q} \frac{\operatorname{Im}[f(iw; x)]}{e^{2\pi w} - 1}\,dw - \int_{-\pi/2}^{\pi/2} \frac{i\varepsilon e^{i\theta} f\bigl(\varepsilon e^{i\theta}; x\bigr)}{\exp(2\pi i \varepsilon e^{i\theta}) - 1}\,d\theta\right].$$
To determine $f(iw; x)$, we begin by observing that the argument of $1 - e^{-(iw)^2/x^2}$ is fixed for $w > 0$. Then, if we allow $s$ to traverse a small circular arc from $s = \varepsilon$ to $s = \varepsilon e^{i\pi/2}$, (40) shows that the argument varies continuously from zero to $\pi$. Therefore $\operatorname{Im}[f(iw; x)] = \pi$ in the above expression for $L$. For the second integral, we expand the integrand as a series in $\varepsilon$ using (40), retaining only those terms that do not vanish as $\varepsilon \to 0$. The result is that
$$L = -\lim_{Q \to \infty} \lim_{\varepsilon \to 0} \left[2\pi \int_{\varepsilon}^{Q} \frac{dw}{e^{2\pi w} - 1} + \frac{1}{2\pi} \int_{-\pi/2}^{\pi/2} \log\bigl(\varepsilon^2 e^{2i\theta}/x^2\bigr)\,d\theta\right]. \tag{44}$$
After evaluating the integrals and taking the two limits, we arrive at the result
$$L = \ln(2\pi x). \tag{45}$$
Figure 3. The deformed contour $\Gamma_2$, with $x\sqrt{3\pi} > Q > x\sqrt{2\pi}$. The dashed lines represent branch cuts.
Finally, consider $H$. Applying the Schwarz reflection principle (11), we find that
$$H = 2 \operatorname{Re} \int_{\Gamma_2} \frac{f(s; x)}{1 - e^{-2\pi i s}}\,ds. \tag{46}$$
As $Q$ is increased, the contour $\Gamma_2$ “wraps” around the branch cuts emanating from $s = s_j^{+}$ (see (41)). This is illustrated in Figure 3. The lowest point on $\Gamma_2$ is then fixed by the branch point $s_1^{+}$; its imaginary part is $x\sqrt{\pi}$. Since there is a factor $e^{-2\pi i s}$ in the denominator of the integrand in (46), we may conclude that $|H|$ is proportional to $e^{-2\pi^{3/2} x}$ and so decreases exponentially as $x \to \infty$. Using the values we have obtained for $R$ and $L$, (42) now reproduces the result calculated using the Euler–Maclaurin formula in Part 1. However, we can go further and determine $H$ exactly by reasoning as follows. Suppose that $Q$ exceeds $x\sqrt{j\pi}$ for some positive integer $j$. Denoting the diversion around the branch cut emanating from $s_j^{+}$ by $\Gamma_j^{+}$, we find that
$$\int_{\Gamma_j^{+}} \frac{f(s; x)}{1 - e^{-2\pi i s}}\,ds = \int_{(1+i)x\sqrt{j\pi}}^{x\sqrt{j\pi} + iQ} \frac{f_{\ell}(s; x)}{1 - e^{-2\pi i s}}\,ds - \int_{(1+i)x\sqrt{j\pi}}^{x\sqrt{j\pi} + iQ} \frac{f_{r}(s; x)}{1 - e^{-2\pi i s}}\,ds, \tag{47}$$
where the subscripts “$\ell$” and “$r$” denote evaluation on the left and right faces of the branch cut, respectively. Now the only difference between $f_{\ell}$ and $f_{r}$ is due to the change in argument that occurs as the contour encircles the branch point. This circle is traversed clockwise, and $1 - e^{-s^2/x^2}$ has a simple zero at $s = s_j^{+}$, so it follows that
$$f_{\ell}(s; x) = f_{r}(s; x) - 2\pi i.$$
Consequently, (47) reduces to
$$\int_{\Gamma_j^{+}} \frac{f(s; x)}{1 - e^{-2\pi i s}}\,ds = 2\pi \int_{x\sqrt{j\pi}}^{Q} \frac{dw}{1 - e^{2\pi w} e^{-i k_j}}, \tag{48}$$
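where
$$k_j = 2\sqrt{j}\,\pi^{3/2} x,$$
and the substitution $s = x\sqrt{j\pi} + iw$ has been used. In view of (46), only the real part of (48) contributes to $H$. This avoids any further technicalities involving complex logarithms; indeed
$$\begin{aligned} \operatorname{Re} \int_{\Gamma_j^{+}} \frac{f(s; x)}{1 - e^{-2\pi i s}}\,ds &= 2\pi \operatorname{Re} \int_{x\sqrt{j\pi}}^{Q} \frac{dw}{1 - e^{2\pi w} e^{-i k_j}} \\ &= -\operatorname{Re} \Bigl[\log\bigl(e^{-2\pi w} - e^{-i k_j}\bigr)\Bigr]_{x\sqrt{j\pi}}^{Q} \\ &= \frac{1}{2} \ln\left(\frac{1 - 2e^{-k_j} \cos k_j + e^{-2k_j}}{1 - 2e^{-2\pi Q} \cos k_j + e^{-4\pi Q}}\right). \end{aligned} \tag{49}$$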
The exact value for $H$ is then obtained by taking the limit $Q \to \infty$ and summing over $j$ (noting the factor 2 in (46)). That is,
$$H = \sum_{j=1}^{\infty} \ln\bigl(1 - 2e^{-k_j} \cos k_j + e^{-2k_j}\bigr). \tag{50}$$
In this last step, we have implicitly assumed that contributions from the remaining horizontal parts of the contour (i.e., those between the branch cuts, and the sections joining $\Gamma_2$ to the vertical edges $\Gamma_1$ and $\Gamma_3$) disappear as $Q \to \infty$. In some sense this is obviously true, because the integrand decays exponentially as the contour is moved upwards. This might just be enough to satisfy an applied mathematician, physicist or engineer. However, the distance between adjacent branch points is
$$\bigl|s_{j+1}^{\pm} - s_j^{\pm}\bigr| = x\sqrt{2\pi}\,\bigl(\sqrt{j + 1} - \sqrt{j}\bigr),$$
and this tends to zero as $j$ is increased. This means we cannot avoid the appearance of a singularity either on or close to the path of integration as the contour is moved upwards, which might alarm readers of a “pure” disposition. A rigorous proof that the horizontal sections can indeed be disregarded is provided in the supplemental material. Finally, we piece together all of our results by substituting (43), (45), and (50) into (42). Recalling the definition of $F$ from (3), we arrive at the exact result
$$F(x) = \sum_{j=1}^{\infty} \ln\bigl(1 - e^{-j^2/x^2}\bigr) = \ln(2\pi x) - \frac{x\sqrt{\pi}}{2}\,\zeta\bigl(\tfrac{3}{2}\bigr) + \sum_{j=1}^{\infty} \ln\Bigl(1 - 2e^{-2\sqrt{j}\,\pi^{3/2} x} \cos\bigl(2\sqrt{j}\,\pi^{3/2} x\bigr) + e^{-4\sqrt{j}\,\pi^{3/2} x}\Bigr). \tag{51}$$
Figure 4. The function F (x), two-term approximation (obtained by discarding the sum from the right-hand side of (51)), and three-term approximation (52).
All of the terms appearing here are elementary, except the Riemann zeta function, for which we need only the single value
$$\zeta\bigl(\tfrac{3}{2}\bigr) = \sum_{j=1}^{\infty} \frac{1}{j^{3/2}} = 2.612375348685488\ldots.$$
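The exact transformation (51) can be tested directly, since both sides are rapidly convergent in suitable ranges of $x$. The sketch below compares the defining series for $F(x)$ with the transformed expression, and with the expression obtained by keeping only the leading ($j = 1$) part of the transformed series. The function names and truncation choices are assumptions for illustration; the value of $\zeta(3/2)$ is the one quoted in the text.

```python
import math

ZETA_3_2 = 2.612375348685488  # zeta(3/2), as quoted in the text

def F_direct(x, tol=1e-18):
    """Defining series from (3): sum of ln(1 - exp(-j^2 / x^2))."""
    total, j = 0.0, 1
    while True:
        q = math.exp(-(j / x) ** 2)
        if q < tol:
            return total
        total += math.log1p(-q)
        j += 1

def F_transformed(x, terms=50):
    """Right-hand side of (51), with the series truncated after `terms` terms."""
    total = math.log(2 * math.pi * x) - x * math.sqrt(math.pi) / 2 * ZETA_3_2
    for j in range(1, terms + 1):
        k = 2 * math.sqrt(j) * math.pi ** 1.5 * x
        total += math.log1p(-2 * math.exp(-k) * math.cos(k) + math.exp(-2 * k))
    return total

def F_leading(x):
    """Two elementary terms plus the leading exponential correction from j = 1."""
    k = 2 * math.pi ** 1.5 * x
    return (math.log(2 * math.pi * x) - x * math.sqrt(math.pi) / 2 * ZETA_3_2
            - 2 * math.cos(k) * math.exp(-k))

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):
        print(x, F_direct(x), F_transformed(x), F_direct(x) - F_leading(x))
```

The two forms agree to rounding error, while the final column shows how small the neglected part of the transformed series already is at moderate $x$.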
The series on the right-hand side of (51) converges very rapidly for large $x$ and its value is negligible relative to the other two terms unless $x$ is small. Retaining only the leading contribution to the series results in the three-term approximation
$$\sum_{j=1}^{\infty} \ln\bigl(1 - e^{-j^2/x^2}\bigr) = \ln(2\pi x) - \frac{x\sqrt{\pi}}{2}\,\zeta\bigl(\tfrac{3}{2}\bigr) - 2\cos\bigl(2\pi^{3/2} x\bigr)\,e^{-2\pi^{3/2} x} + O\bigl(e^{-(2\pi)^{3/2} x}\bigr), \tag{52}$$
verifying our earlier prediction that $|H|$ is proportional to $e^{-2\pi^{3/2} x}$, and also confirming that the error in the formula we found in Part 1 is exponentially small. Plots of the function $F(x)$ and the two- and three-term approximations are shown in Figure 4, for $0.35 \le x \le 0.65$. The three-term approximation is visually indistinguishable from the exact curve for $x > 0.48$. All three curves are visually indistinguishable for $x > 0.65$. Transformations for some other, similar, series can be obtained directly from (51). For example if
$$F_2(x) = \sum_{j=1}^{\infty} \ln\bigl(1 - e^{-(2j-1)^2/x^2}\bigr) \qquad \text{and} \qquad F_3(x) = \sum_{j=1}^{\infty} \ln\bigl(1 + e^{-j^2/x^2}\bigr),$$
then F (x) − F (x/2) =
∞ 2 2 2 2 ln 1 − e−j /x − ln 1 − e−(2j ) /x = F2 (x), j =1
and

$$F\bigl(\sqrt{2}\,x\bigr) + F_3\bigl(\sqrt{2}\,x\bigr) = F(x).$$
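Both identities relating F, F2, and F3 can be confirmed by summing the defining series directly. This is a sketch under stated assumptions: the function names and the 400-term truncation are choices made here for illustration, not something taken from the article.

```python
import math


def F(x, terms=400):
    """F(x) = sum over j >= 1 of ln(1 - exp(-j^2/x^2)), truncated."""
    return sum(math.log1p(-math.exp(-(j / x) ** 2)) for j in range(1, terms + 1))


def F2(x, terms=400):
    """F2(x): the same summand restricted to odd indices 2j - 1."""
    return sum(
        math.log1p(-math.exp(-((2 * j - 1) / x) ** 2)) for j in range(1, terms + 1)
    )


def F3(x, terms=400):
    """F3(x) = sum over j >= 1 of ln(1 + exp(-j^2/x^2)), truncated."""
    return sum(math.log1p(math.exp(-(j / x) ** 2)) for j in range(1, terms + 1))


if __name__ == "__main__":
    for x in (0.5, 1.0, 3.0):
        y = math.sqrt(2) * x
        print(x, F(x) - F(x / 2) - F2(x), F(y) + F3(y) - F(x))
```

Both printed residuals should be zero up to rounding: the first because F(x/2) collects exactly the even-index terms of F(x), the second because (1 − u)(1 + u) = 1 − u² with u = e^{−j²/(2x²)}.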
Both F2(x) and F3(x) are approximately linear for large x. This follows directly from (52); indeed

$$F_2(x) \approx \ln(2) - \frac{x\sqrt{\pi}}{4}\,\zeta\!\left(\tfrac{3}{2}\right) \quad\text{and}\quad F_3(x) \approx -\frac{\ln 2}{2} + \frac{x\sqrt{\pi}}{4}\bigl(2 - \sqrt{2}\bigr)\,\zeta\!\left(\tfrac{3}{2}\right),$$

with exponentially small errors in both cases.

4. CONCLUDING REMARKS. We have considered two methods for finding asymptotic expansions of series and products that contain a large (or small) parameter. Of the two, Euler–Maclaurin summation is perhaps slightly simpler, in that it requires fewer exact integrations. On the other hand, the trapezoidal rule also enables us to derive exact relationships between certain series that have opposing convergence properties, in the sense that one series converges rapidly for small x (say) whereas the other converges rapidly for large x. A final remark concerns series obtained by taking logarithms of infinite products, as in our third example. It should not be thought that the elementary exact integrations that occurred when calculating the contribution from the left edge of the contour (see (44)–(45)) were due to chance, or a contrived (or inspired) choice of example. Since real summands satisfy the Schwarz reflection principle (11), only the imaginary part of f contributes to the integrals in (23). In other words, the logarithm itself disappears, leaving only the argument, which is constant. The same phenomenon facilitated the exact determination of the branch cut contributions at the end of Section 3. Here the difference between the logarithms on opposite faces of the cut is constant, and this leads to the simple integration in (49). Thus the trapezoidal rule offers a very promising general technique for analyzing infinite products. We leave this subject (for now at least) with a challenge. Is it possible to obtain exact transformations of series whose summand is not an even function, such as T(x) in (3)? The best we could do, using either the Euler–Maclaurin formula or the trapezoid rule, is the asymptotic formula (38).
Might there be something even better?

ORCID
Ian Thompson http://orcid.org/0000-0001-5537-450X
REFERENCES
[1] Gradshteyn, I. S., Ryzhik, I. M. (2007). Tables of Integrals, Series and Products, 7th ed. London: Elsevier Academic Press.
[2] Landsberg, G. (1893). Zur Theorie der Gauss’schen Summen und der linearen Transformation der Thetafunctionen. J. Reine Angew. Math. 111: 234–253. doi.org/10.1515/crll.1893.111.234
[3] Lewin, L. (1981). Polylogarithms and Associated Functions. New York: North Holland.
[4] Olver, F. W. J., Lozier, D. W., Boisvert, R. F., Clark, C. W. (2010). NIST Handbook of Mathematical Functions. Cambridge, UK: Cambridge Univ. Press.
[5] Osborne, A. D. (1999). Complex Variables and Their Applications. Harlow, UK: Addison-Wesley.
[6] Poisson, S. (1823). Suite du mémoire sur les intégrales définies et sur la sommation des séries. Sommation des séries de quantités périodiques. J. Éc. Polytech. Math. 12(XIX): 404–509.
[7] Stoer, J., Bulirsch, R. (1992). Introduction to Numerical Analysis, 2nd ed. New York: Springer-Verlag.
[8] Thompson, I., Davies, M. G., Urbikain, M. K. (2021). Analysis of series and products. Part 1: Euler–Maclaurin summation. Amer. Math. Monthly. 128(2): 115–124. doi.org/10.1080/00029890.2021.1845542
[9] Trefethen, L. N., Weideman, J. A. C. (2014). The exponentially convergent trapezoidal rule. SIAM Rev. 56(3): 385–458. doi.org/10.1137/130932132
[10] Turing, A. M. (1945). A method for the calculation of the zeta-function. Proc. London Math. Soc. 48(3): 180–197. doi.org/10.1112/plms/s2-48.1.180

IAN THOMPSON is a senior lecturer in mathematics at the University of Liverpool in the UK. He was awarded his undergraduate degree by the University of Newcastle Upon Tyne in 2000, and completed his Ph.D. in 2003 at the University of Manchester, under the supervision of Prof. I. D. Abrahams. His research interests include complex and Fourier analysis, modeling techniques for wave phenomena, and computational methods. Department of Mathematical Sciences, University of Liverpool, Liverpool L69 7ZL, UK [email protected]
MORRIS DAVIES retired from the University of Liverpool in 1995, but has continued his research in building heat transfer to the present day. During the 1960s he conducted an investigation into the thermal behavior of perhaps the first passive solar-heated building of modern times, designed in 1957 and built near Liverpool (53.4 degrees latitude). Formerly School of Architecture, University of Liverpool, Liverpool L69 7ZN, UK [email protected]
KARMELE URBIKAIN is a lecturer in heat transfer at the School of Engineering of Bilbao, University of the Basque Country (UPV/EHU). She graduated in industrial engineering and completed her Ph.D. in thermal engineering at the UPV/EHU. Her research interests include heat transfer through opaque and semitransparent elements and energy use in buildings. Department of Thermal Engineering, The University of the Basque Country, Alameda Urquijo, Bilbao, Spain [email protected]
100 Years Ago This Month in The American Mathematical Monthly Edited by Vadim Ponomarenko We welcome the first number of Mathematics Teacher, January, 1921, issued as the official organ of the National Council of Teachers of Mathematics, and devoted to the interests of mathematics in junior and senior high schools of the United States. There is a special call for copies of the Monthly for May, 1915. By an error in printing, the cover for that month read April, 1915, although the inside front page was marked correctly. Slips to be pasted over the incorrect date were sent out the next month, but a number of subscribers did not attend to this matter. As a result many libraries lack this one issue to complete their files. If anyone is willing to part with copies of the Monthly for May, 1915, such copies will be paid for by the Secretary of the Association at the rate of one dollar per copy, and will then be available for those desiring to complete their files. The same price will be paid for a limited number of copies for June and November, 1895, in volume 2. —Excerpted from “Notes and News” (1921). 28(3): 147–152.
Morikawa’s Unsolved Problem Jan E. Holly and David Krumm
Abstract. By combining theoretical and computational techniques from geometry, calculus, group theory, and Galois theory, we prove the nonexistence of a closed-form algebraic solution to a Japanese geometry problem first stated in the early nineteenth century. This resolves an outstanding problem from the sangaku tablets which were at one time displayed in temples and shrines throughout Japan.
1. INTRODUCTION. During the Edo Period of Japanese history (1603–1867) there developed a curious practice of hanging wooden tablets with mathematical content from the eaves of Buddhist temples and Shinto shrines. Many of these tablets, known as sangaku, have been lost to history, but close to 900 of them have been preserved [8]. The problems inscribed on the surviving sangaku are mostly of a geometric nature, and they range in difficulty from trivial to unsolved. Solutions to many of these problems can be found in the books by Fukagawa and Pedoe [4] and Fukagawa and Rothman [5]; the latter reference also discusses various historical aspects surrounding the mathematics of the Edo Period. Unsolved sangaku problems seem to be rare; in fact, we are aware of only two such problems listed in the literature, both in Fukagawa and Rothman [5, Chapter 7]. One of the problems mentioned in that text was originally proposed in 1821 and has recently been solved [6]. The present article concerns the other unsolved problem, which was proposed by Jihei Morikawa during the same time period. The main objects involved in Morikawa’s problem are illustrated in Figure 1. Given a line L and circles C1 and Cr of radii 1 and r ≥ 1, respectively, such that C1 and
Figure 1. The circles C1 and Cr , the line L, and the minimal inscribed square with side length μ(r). The original statement of the problem involves circles of radii a and b with b ≥ a, but a scaling of the plane reduces this general case to the case a = 1. doi.org/10.1080/00029890.2021.1859919 MSC: Primary 12F10, Secondary 51N20, 01A27
Cr are tangent to each other and to L, the problem asks us to express, in terms of r, the minimum side length μ(r) of a square that can be inscribed in the region between C1 , Cr , and L. Here, “inscribed” means touching all three of C1 , Cr , and L. A surviving travel diary of mathematician Kanzan Yamaguchi, a contemporary of Morikawa, includes an entry with some additional information about Morikawa’s problem and the sangaku containing it. Based on that entry, Fukagawa and Rothman report the following. The tablet contained no solution, but Morikawa had written, “I will be very happy if someone can solve this problem.” And so, says Yamaguchi, “I went to Morikawa’s home with my friend Takeda and asked him what the answer is. He said that he could not solve the problem yet.” Neither does Yamaguchi’s diary contain a solution and, like Morikawa, we would be very happy if someone solves this problem. [5, p. 265] It seems surprising that Morikawa’s problem would have frustrated all attempts at a solution, considering that the mathematicians of the Edo Period had a strong understanding of geometry, were adept in the use of algebra, and even had some knowledge of basic calculus. One begins to wonder whether the problem can in fact be solved. The purpose of this article is to address the question of the existence of a closedform expression for μ(r). From our analysis in Section 4 it follows that μ(r) is a root of a polynomial whose coefficients are polynomials in r; in light of this fact, a natural question is whether μ(r) is expressible by radicals in terms of r. Precise terminology is defined below, but the question can be stated intuitively as follows: Is there a radical expression, such as √ √ √ 4 11 3r − π · 3 r 5 − r + 2i + 1 + i + 5 r 6 − 7 , √ 3 − 2r 3 + 9r − 5 that for every real number r ≥ 1 can be evaluated to yield μ(r)? We provide here a negative answer to this question, thus showing that a closedform algebraic solution does not exist in the classical sense. 
In order to state our results we introduce the following terminology. Recall that if J ⊆ C is a nonempty set and f : J → C is a function, we say that f is an algebraic function if there exists a nonzero polynomial q ∈ C[k, x] such that q(c, f (c)) = 0 for every c ∈ J . If, moreover, this condition is satisfied by a polynomial q whose Galois group over the field C(k) is solvable—or equivalently, whose splitting field is contained in a radical extension of C(k)—then we say that f is a radical function. We can now state our main result. Theorem (see Theorem 5.4). The function μ : [1, ∞) → R is not radical. In fact, there is no infinite subset J ⊆ [1, ∞) such that μ : J → R is radical. The proof of this theorem makes critical use of computational tools in Galois theory that have only recently become available due to work of N. Sutherland [9]. In particular, our argument relies on Sutherland’s implementation in the system Magma [1] of an algorithm for computing geometric Galois groups. Besides this algorithm, the proof uses elementary geometry as well as calculus and Galois theory. This article is organized as follows. In Section 2, we provide notation and basic results about inscribed squares. In Section 3, we show that any minimal inscribed square must be positioned as in Figure 1, with a corner on each of C1 , Cr , and L, and March 2021]
MORIKAWA’S UNSOLVED PROBLEM
215
with no side of the square tangent to these objects. In Section 4, we derive an explicit formula for a function whose minimum value is μ(r); as a byproduct we obtain a numerical method for approximating μ(r) given the radius r. In addition, we show that μ(r) can be expressed in terms of a root of a certain polynomial of degree 10. Finally, in Section 5, we use Galois theory to study this polynomial and thus prove the main theorem. 2. CONFIGURATIONS FOR INSCRIBED SQUARES. Figure 2 shows the notation that will be used throughout the article, regardless of which inscribed square is under discussion. Denoted are the circles (C1 , Cr ), centers of the circles (O1 , Or ), line (L), vertices of the square (V1 , Vr , Vup , Vdn ), and angle between L and the lower right side of the square (θ ∈ [0, π/2)). If θ = 0, then Vdn is the lower left vertex. To facilitate phrasing, the line is considered horizontal as shown, with C1 on the left. Most of the discussion in Sections 2 and 3 assumes an arbitrary but fixed value of r ≥ 1. Basic geometric facts [2] are used throughout.
Figure 2. Notation for the line, two circles, and square.
Lemma 2.1. For inscribed squares (for fixed r ≥ 1), let θ ∈ [0, π/2) be as in Figure 2.
(i) For every θ ∈ [0, π/2), there is a unique inscribed square.
(ii) There exists a minimum side length over the set of inscribed squares. (Thus, Morikawa’s problem is well-defined.)

Note: Throughout the rest of the article, the side length of the inscribed square at angle θ ∈ [0, π/2) will be denoted s(θ).

Proof. To prove (i), fix θ ∈ [0, π/2). We find the inscribed square at angle θ as follows, stated somewhat informally to avoid excessive technicalities. Consider all squares, inscribed or not, that make angle θ with L as in Figure 2. Let s be a side length under consideration for being that of an inscribed square. Imagine sliding such a square, with angle θ and side length s, along L until the square is to the right of C1 but is just touching C1. If s is too small, then the square will not reach Cr. If s is too large, then the square will overlap Cr. By a continuity and monotonicity argument, there is a unique s such that the C1-touching square will exactly reach Cr. The existence and uniqueness of an inscribed square follows from that of s above, along with the fact that the inscribed square clearly cannot be moved left or right and still be inscribed. Subsequently, (ii) follows from the facts that s(θ) is continuous on [0, π/2) and $\lim_{\theta\to\pi/2^-} s(\theta) = s(0)$.

Generally in this article, “the square” will mean the inscribed square as given by the context, unless stated otherwise. Figure 3 shows the types of intersections between the
Figure 3. Possible configurations, in terms of the square’s types of intersections with the line and circles. In Section 3, we prove that a minimal square can occur only in Con6, or in Con19 with r = 1, by eliminating all of the others: Con1,2,3 (Lemma 3.2), Con4,5 (Lemma 3.3), Con7,10,11 (Lemma 3.4), Con12,13,14 (Lemma 3.5), Con15 (Lemma 3.6), Con16,18 (Lemma 3.7), Con19 (Lemma 3.8), Con8,9,17 (Lemma 3.16).
square and circles as θ increases from 0 to π/2, i.e., as the square rotates (and changes size as necessary). Each combination of such intersections, as shown in Figure 3, will henceforth be referred to as a configuration. The configurations as illustrated in Figure 3 will be denoted Con1, Con2, Con3, etc. Lemma 2.2. Figure 3 shows all possible steps through the configurations as θ increases from 0 to π/2. Proof. The steps through the configurations obviously depend upon r. For example, from Con6, the value of r determines which circle first becomes tangent to a side of the square as θ increases. Small r leads to Con7, large r leads to Con12, and a certain intermediate value of r leads to Con11. Also note that only r = 1 gives Con9. The fact that these are the steps through the configurations is generally clear, with two exceptions: from Con8 to Con9 and Con10, and from Con15 to Con16. For Con8 to Con9 and Con10, the question is whether the upper right side of the square could instead rotate past tangency with Cr before the upper left side of the square becomes tangent to C1 . This can happen only if the lower left side of the square has steeper slope than the line through Vdn and O1 —i.e., the lower left side “points above” O1 —and the lower right side points above Or . However, this is impossible because the circle of radius (r + 1)/2 through O1 and Or , as shown in Figure 4, is tangent to L. Every angle inscribed in a semicircle is a right angle, so since L is below the new circle except at the point of tangency, Q, the lower sides of the square cannot both point above their respective circles’ centers. At best, the lower sides can point exactly at the centers, but only if r = 1 and Vdn = Q.
Figure 4. Circle with diameter O1 -to-Or is tangent to L.
For Con15 to Con16, the question is whether Vup can instead touch C1 before the lower right side of the square touches Cr. This can happen only if the square with upper right side tangent at Vr to Cr has Vup touching C1. However, this is impossible. Any square with upper right side tangent at Vr to Cr must have Vdn at or to the left of Q in Figure 4 in order for the square to reach C1, but then the square is angled such that Vup cannot touch C1.
3. CONFIGURATIONS FOR MINIMAL SQUARES. In this section, we prove that a minimal square—i.e., an inscribed square with side length that is minimal over all orientations—exists only in Con6, and has V1 lower than O1 . An exception occurs if r = 1, where Con19 is the reflection of Con6 and thus also has a minimal square. The proof consists of a sequence of lemmas showing that a minimal square cannot be in any other configuration. The final result is given by Proposition 3.17. Additional notation is used throughout this section, for the lines perpendicular to each of C1 , Cr , and L at the points of contact with the square under consideration. As illustrated in Figure 5, these (dashed) lines are denoted T1 , Tr , and TL , respectively. As before, unless stated otherwise we assume an arbitrary but fixed value of r ≥ 1. Lemma 3.1. For a given inscribed square, if T1 intersects TL above Tr (respectively, below Tr ), then s is a strictly decreasing (respectively, increasing) function of θ at that square’s angle. Proof. If T1 intersects TL above Tr , then the lines form a triangle to the right of TL , such as in Figure 5.
Figure 5. Notation: Lines T1 , Tr , and TL through points of intersection.
Consider fixing the size of the square, and rotating it counterclockwise about a point inside the triangle. As the square begins to rotate, it starts to overlap with each of C1 , Cr , and L. Formally, the rate of change of the following are positive: (1) the radius of C1 minus the distance from O1 to the square, (2) the radius of Cr minus the distance from Or to the square, and (3) the distance below L to the lowest point on the square. This means that the difference between the side lengths of an inscribed square and a fixed-size square is a strictly decreasing function of θ; thus s is a strictly decreasing function of θ. Similarly, if T1 intersects TL below Tr , then s is a strictly increasing function of θ.
Lemma 3.2. A minimal square cannot be in Con1, Con2, or Con3.

Proof. Each of these configurations corresponds to θ = 0. For small enough ε > 0, it is easy to see that for all θ ∈ (0, ε), T1 intersects TL above Tr. Therefore by Lemma 3.1, s(θ) is strictly decreasing for θ ∈ (0, ε) and thus by continuity s(0) is not minimal.

Lemma 3.3. A minimal square cannot be in Con4 or Con5, and a minimal square cannot be in Con6 with V1 as high as or higher than O1.

Proof. Figure 6 illustrates Con4, and the same reasoning applies to the other two configurations. First, note that V1 is higher than O1 in Con4 and Con5 because some part of the left side of the square is tangent to C1, and that left side has negative slope. Consider the line L′ tangent to Cr at Vup, and the line L″ bisecting the angle between L and L′. Line L′ must intersect C1 (in order to “escape” the region between C1, Cr, and L), so L″ is below O1.

Figure 6. Illustration for Lemma 3.3. The line L″ bisecting the angle between L and L′ is below O1, so V1 is higher than L″, leading to an application of Lemma 3.1.

Because V1 is higher than O1 and therefore higher than L″, the square is tilted in such a way that Vup is closer than Vdn to the intersection of L′ and L. Therefore, Tr intersects L″ to the right of where TL intersects L″; thus the intersection of TL and Tr is below the intersection of TL and L″. Meanwhile, T1 has nonnegative slope and therefore intersects TL above the intersection of TL and L″, thus above the intersection of TL and Tr. Hence by Lemma 3.1, s(θ) is a strictly decreasing function and is therefore not minimal for these values of θ.

Lemma 3.4. A minimal square cannot be in Con7, Con10, or Con11.

Proof. In each of these configurations, T1 intersects TL below the intersection between TL and Tr. Therefore by Lemma 3.1, s(θ) is strictly increasing and therefore not minimal at the corresponding θ.

The proof that a minimal square cannot be in Con8, Con9, or Con17 is substantially more complicated than the other proofs, and is saved for the end of the section.

Lemma 3.5. A minimal square cannot be in Con12, Con13, or Con14.

Proof. In each of these configurations, some part of the upper left side of the square is tangent to C1. Therefore, in order for Vup to reach Cr, θ cannot be particularly small,
Figure 7. Illustration for Lemma 3.5. In Con12, Con13, and Con14, θ > π/4, so TL is to the right of Vup , leading to an application of Lemma 3.1.
and certainly θ > π/4: As seen in Con9 (Figure 3) which has θ = π/4 and r = 1, a square with θ = π/4 and upper left side tangent to C1 can only just barely reach—and not even with Vup —the smallest possible Cr , that with r = 1. Then decreasing θ with the square’s upper left side still tangent to C1 would shrink and move the square away from Cr , so θ ≤ π/4 is not possible here when r = 1. It is also not possible when r > 1 because any larger Cr would further prevent the square from reaching it. Therefore, TL is to the right of Vup , as illustrated in Figure 7 for the case of Con12. Thus by Lemma 3.1, s(θ) is a strictly increasing function of θ and is therefore not minimal for these values of θ. Lemma 3.6. A minimal square cannot be in Con15. Proof. We prove this by showing that as the square rotates counterclockwise through the range of Con15, the intersection between T1 and TL moves upward strictly monotonically, and the intersection between Tr and TL moves downward strictly monotonically. Because it is clear in Con15 that TL ’s intersection with T1 is below that with Tr for the smaller values of θ, and above that with Tr for the larger values of θ, Lemma 3.1 implies that s(θ) strictly increases then strictly decreases as a function of θ, so the result follows. (This seems to imply that there is a maximal square in Con15. However, this article does not address maximal squares.) We give the proof that the intersection between T1 and TL moves upward strictly monotonically; an analogous proof shows that the intersection between Tr and TL moves downward strictly monotonically. Consider not just inscribed squares, but inscribed rectangles in general, determined by the pair (p, q) as shown in Figure 8, where p is the horizontal distance between O1 and the bottom vertex, and q is the height above L of the intersection between T1 and TL . Let f (p, q) = length of lower left side of rectangle, g(p, q) = length of lower right side of rectangle.
Figure 8. For Lemma 3.6, the rectangle determined by the point (p, q).
For each p there is a unique q, say q = h(p), such that the rectangle is a square, i.e., f(p, q) = g(p, q). Our goal is to show that h(p) is a strictly increasing function of p. We will do this by showing that
(1) f(p, q) is a strictly increasing function of p,
(2) g(p, q) is a strictly decreasing function of p,
(3) f(p, q) is a strictly decreasing function of q,
(4) g(p, q) is a strictly increasing function of q.

This will complete the proof because if point (p, q) gives a square, then (1) and (2) imply that increasing p will cause the lower left side to become larger than the lower right side, so by (3) and (4), q must be increased in order to restore equality of the side lengths.
Figure 9. For Lemma 3.6, notation for the proof that f (p, q) is a strictly increasing function of p.
For (1), fix q and note that f(p, q) is ℓ + m − 1 in Figure 9. By similar triangles, m/q = (1 − q)/ℓ. Define a function

$$j(\ell) = \ell + m - 1 = \ell + \frac{q(1-q)}{\ell} - 1,$$

which equals f(p, q). An easy calculation shows that j′(ℓ) > 0 because ℓ > 1/2, and since p and ℓ increase together, this gives ∂f/∂p > 0, thus proving (1).
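The “easy calculation” for (1) can be written out in one line. In this sketch, ℓ denotes the argument of j; the symbol name is an assumption, since the original symbol did not survive extraction. The only fact used beyond the definition of j is that q(1 − q) ≤ 1/4 for every real q.

```latex
j'(\ell) \;=\; 1 - \frac{q(1-q)}{\ell^{2}}
\;\ge\; 1 - \frac{1}{4\ell^{2}} \;>\; 0
\qquad \text{whenever } \ell > \tfrac{1}{2}.
```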
Figure 10. For Lemma 3.6, geometry for the proof that g(p, q) is a strictly decreasing function of p.
For (2), note that g(p, q) is exactly r less than the distance from Or to the lower left side of the rectangle. That side is on a line through a point Q below O1 . If p increases while q stays fixed, then as shown in Figure 10, the lower left side of the rectangle is still aligned with Q because that side is always parallel to T1 . Therefore, as p increases, the distance from Or to the lower left side of the rectangle decreases, thus proving (2).
Figure 11. For Lemma 3.6, geometry for proofs that f (p, q) is strictly decreasing, and g(p, q) is strictly increasing, as functions of q.
For (3) and (4), note that f(p, q) is exactly 1 less than the distance from O1 to the lower right side of the rectangle, and g(p, q) is exactly r less than the distance from Or to the lower left side of the rectangle. If p is fixed and q increases, then as shown in Figure 11, the lower left side of the square (which is parallel to T1) rotates away from Or, and the lower right side (which is perpendicular to T1) rotates toward O1. This proves (3) and (4), thus completing the proof.

Lemma 3.7. A minimal square cannot be in Con16 or Con18.

Proof. In each of these configurations, T1 intersects TL above the intersection between TL and Tr. Therefore by Lemma 3.1, s(θ) is strictly decreasing and therefore not minimal at the corresponding θ.
If r = 1, then Con19 is simply the reflection of Con6, so for Con19 we focus on the case r ≠ 1.

Lemma 3.8. If r ≠ 1, then a minimal square cannot be in Con19.

Proof. Given a square at angle θ in Con19, we claim that the inscribed square in the reflected orientation, i.e., at angle π/2 − θ, is smaller. This fact can be seen by mapping the original square to a horizontal reflection that is positioned so that the image P′ of vertex P (= V1) is on Cr, as shown in Figure 12. In the process, vertex Q (= Vr) is mapped to a point Q′ inside the disk bounded by C1 for the following reason. The acute angles φ made with the horizontal are the same for the line through P and Q′ and the line through P′ and Q, while circle C1 is steeper than Cr at every height above L between 0 and 1.

Figure 12. Illustration for Lemma 3.8 showing the horizontal reflection of the square that sends point P on C1 to a point P′ on Cr. (Such a reflection is not necessarily about the midline of the square.)
Therefore, the inscribed square at angle π/2 − θ must be smaller than the reflected square and thus the original square.

To prove that a minimal square cannot be in Con8, Con9, or Con17, preliminary results and definitions are helpful.

Definition 3.9. Con10+ will refer to Con10 but will also allow r ≤ 1. Con8+ will refer to the union of Con7, Con8, Con9, and Con10, and will also allow r ≤ 1.

Remark 3.10. A proof that a minimal square cannot be in Con8+ constitutes a proof for Con8, Con9, and Con17 (as well as Con7, Con10, Con16, and Con18, for which proofs have already been given) because Con17 with r = r0 ≥ 1 is equivalent to a version of Con8 with r = 1/r0 ≤ 1.

Lemma 3.11. The square in Con10+ has side length less than or equal to M, where

$$M = \frac{2r}{r + \sqrt{8}\,\sqrt{r} + 1},$$

with equality if and only if r = 1.
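The closed form for M can be sanity-checked numerically before reading the proof. The sketch below assumes the setup used in that proof: P is the point on L equidistant from C1 and Cr, and p is its distance from the point where C1 touches L, with the closed form for p taken from the proof. The function names are hypothetical.

```python
import math


def p_closed(r):
    """Closed form for p from the proof of Lemma 3.11."""
    s = math.sqrt(r)
    return (math.sqrt(8) * r + 2 * s) / (r + math.sqrt(8) * s + 1)


def M_closed(r):
    """Closed form for the bound M in the statement of Lemma 3.11."""
    return 2 * r / (r + math.sqrt(8) * math.sqrt(r) + 1)


def dist_to_C1(p):
    """Distance from the point on L at offset p to the nearest point of C1."""
    return math.hypot(1.0, p) - 1.0


def dist_to_Cr(p, r):
    """Distance from the same point to the nearest point of Cr.

    The tangency points of C1 and Cr with L are 2*sqrt(r) apart.
    """
    return math.hypot(r, 2 * math.sqrt(r) - p) - r


if __name__ == "__main__":
    for r in (0.25, 1.0, 2.0, 3.0, 10.0):
        p = p_closed(r)
        print(r, p, dist_to_C1(p), dist_to_Cr(p, r), M_closed(r))
```

For each r the last three printed values should coincide, and at r = 1 the common value is √2 − 1, the side-length bound in the equality case.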
Proof. Let P be the point on line L between C1 and Cr such that the distance from P to the closest point on C1 (along the line through P and O1) equals the distance from P to the closest point on Cr. The line segment from P to O1 can be viewed as the hypotenuse of a right triangle, as can the line segment from P to Or, so the equality of distance can be written using the Pythagorean theorem as follows. Let p be the distance from P to the intersection of L and C1. Because 2√r is the distance from the intersection of L and C1 to the intersection of L and Cr (see Figure 15), we have

$$\sqrt{1 + p^2} - 1 = \sqrt{r^2 + \bigl(2\sqrt{r} - p\bigr)^2} - r.$$

By basic geometry, the square in Con10+ must have Vdn to the left of P, unless r = 1, in which case Vdn = P. Therefore the square has side length less than or equal to $\sqrt{1 + p^2} - 1$, with equality if and only if r = 1. It is straightforward to confirm, by substituting the following into the equation above, and by noting that this is the desired solution because it lies in (0, 2√r) for r > 0, that

$$p = \frac{\sqrt{8}\,r + 2\sqrt{r}}{r + \sqrt{8}\,\sqrt{r} + 1},$$

and then that

$$\sqrt{1 + p^2} - 1 = \frac{2r}{r + \sqrt{8}\,\sqrt{r} + 1},$$
which is M in the statement of the lemma.

Lemma 3.12. For a square in Con17, if the distance from L to the midpoint of the upper right side of the square is less than or equal to r − r/√2, then the square is not minimal.

Proof. Given such a square in Con17 as shown in Figure 13, we claim that reflecting the square about its midline (the vertical line through its center point) will cause it to overlap both C1 and Cr. This will prove the lemma because an inscribed square in the same orientation as the reflected square is thus smaller than the original square. We prove the claim by showing that reflecting the square takes vertex P (= Vup) as shown in Figure 13 to vertex P′ in the interior of the disk with boundary Cr, and takes Q (= Vr) to vertex Q′ such that either Q′ or part of the left side of the reflected square is in the interior of the disk with boundary C1.

Figure 13. For Lemma 3.12, the horizontal reflection of the square about its midline.

First, the line through P and Q′ has slope 1, and the line through Q and P′ has slope −1, by the following reasoning: In Figure 13, the fact that α = α′ is clear because of the right angle shown, and α′ = α″ because of the symmetry of the reflection. The line segment between P and Q′ is seen to be the base of an isosceles triangle since P and Q′ are the same distance from the center of the square. Because α = α″, the isosceles triangle is symmetric about a line of slope −1 through the center of the square. Therefore, the base of the triangle, and thus the line through P and Q′, has slope 1. Similar reasoning shows that the line through Q and P′ has slope −1.

Next, we claim that the secant line segment intersecting Cr at Q and the point nearest horizontally to P′ has slope shallower than −1. (This secant would lie almost exactly along Cr in Figure 13, and would not be distinguishable in the figure.) To prove the claim, we note that the midpoint of that secant is at the same height as the midpoint of the upper right side of the square, no higher than r − r/√2 by hypothesis, because the secant’s endpoints are at the heights of vertices P and Q. A secant has as a perpendicular bisector a radial line segment of Cr; such a radial line segment for this secant must reach below a height of r − r/√2 since it passes through the secant’s midpoint. Thus, this radial line segment must have slope steeper than 1, because at slope 1 it would reach down only to height r − r/√2 with its length r and top endpoint at height r. The secant, which is perpendicular to it, must therefore have slope shallower than −1. Therefore, P′ is in the interior of the disk with boundary Cr.

To show the overlap with C1, we first note that Q′ is at height greater than 1 − 1/√2. This is because Vr of a square in Con17 must be higher than that in Con9 or Con10 (whichever applies, depending on whether r = 1) by the basic geometry of the counterclockwise rotation toward Con17. Among these three configurations, Con9 with its r = 1 has the lowest Vr, at height 1 − 1/√2 since θ = π/4. Now, Q′ lies on the line of slope 1 through P, a line whose lower intersection with C1 is at height less than 1 − 1/√2 because 1 − 1/√2 is where the tangent to C1 has slope 1. Therefore, Q′ is on the line of slope 1 through P on either the segment from P to C1 or the segment that lies in the interior of the disk with boundary C1. In the latter case, the proof of the lemma is complete. The former case can only occur if the original square is so large that Q′ is above a portion of the top half of C1, because Q′ is straight above V1. In that case, the left side of the reflected square must intersect the interior of the disk with boundary C1, completing the proof of the lemma, because the reflected square is also a reflection of the original square about a horizontal line through its center point.
Since the left side of the original square is tangent to C1, and the horizontal line of reflection is below the center of C1, the reflected left side must overlap C1 and thus intersect the interior of the disk with boundary C1.

Corollary 3.13. If r ≥ 3, then a minimal square cannot be in Con17.

Proof. Suppose for contradiction that r ≥ 3 and a minimal square is in Con17. Then the side length, s, must be less than or equal to the side lengths of all squares in all other configurations, including Con10 which is addressed in Lemma 3.11. In particular, we must have s ≤ M from Lemma 3.11. Let h be the distance from L to the midpoint of the upper right side of the square. We know that h can be no greater than (√5/2)s, because this is the highest that a midpoint of a side can possibly be, occurring if the midpoint is straight above Vdn. In summary, h ≤ (√5/2)s ≤ (√5/2)M. Therefore, h ≤ r − r/√2 if

$$\frac{\sqrt{5}}{2}\,M \le r - \frac{r}{\sqrt{2}},$$

which, with M as given in Lemma 3.11, is true if r ≥ 3. However, Lemma 3.12 then implies that the square is not minimal. This gives a contradiction.

Definition 3.14. For a family {J(t)}t∈I of lines or line segments, where I is an interval in R, the pivot at a given t ∈ I is the point about which the line or line segment is pivoting at t, if such a point exists. In other words, for a given t ∈ I, if there exists ε > 0 such that u ∈ I ∩ (t − ε, t + ε) implies that J(u) ∩ J(t) consists of a single point, then the pivot, P, at t is defined by

$$P = \lim_{u \to t} J(u) \cap J(t)$$

if the limit exists. (The limit is one-sided if t is an endpoint of I.)

Lemma 3.15. Let Lφ be the line through the origin having angle φ ∈ (0, π/2) with the positive x-axis as in Figure 14, define an interval I = (π/2 − φ, π/2), and fix ℓ > 0. Let {J(β)}β∈I be the family of line segments of length ℓ with endpoints on the x-axis and Lφ, and for which J(β) intersects the x-axis at angle β as in Figure 14. Then the following hold.
Figure 14. Notation for Lemma 3.15.
(i) For {J(β)}β∈I, the position of the pivot P at β is determined by b (cot β) = c (cot γ), where b and c are the distances along J(β) from P to the x-axis and Lφ, respectively, and γ = π − φ − β is the angle between J(β) and Lφ.
(ii) Let {K(β)}β∈I be the family of lines such that K(β) is the line perpendicular to J(β) through J(β)'s pivot. Then for {K(β)}β∈I, the pivot at β has y-coordinate

$$\ell\,\frac{\cos\gamma + \cos(\gamma - \beta)\cos\beta}{\sin(\gamma + \beta)}.$$
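The two parts of Lemma 3.15 can be checked numerically. The following is a minimal sketch, not part of the original article: it assumes the segment length is written ℓ (`ell` in the code), places the endpoint of J(β) on the positive x-axis, and approximates each pivot of Definition 3.14 by intersecting two nearby members of the family. All function names are hypothetical.

```python
import math

def intersect(p1, d1, p2, d2):
    """Intersection of the lines p1 + t*d1 and p2 + u*d2 (assumed non-parallel)."""
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    t = ((p2[0] - p1[0]) * d2[1] - (p2[1] - p1[1]) * d2[0]) / cross
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

def segment_J(phi, ell, beta):
    """Endpoint of J(beta) on the x-axis, unit direction toward L_phi, and gamma."""
    gamma = math.pi - phi - beta
    foot = (ell * math.sin(gamma) / math.sin(phi), 0.0)  # law of sines
    direction = (-math.cos(beta), math.sin(beta))
    return foot, direction, gamma

def pivot_J(phi, ell, beta, delta=1e-6):
    """Approximate pivot of {J(beta)}: intersect two nearby segments' lines."""
    f1, d1, _ = segment_J(phi, ell, beta - delta)
    f2, d2, _ = segment_J(phi, ell, beta + delta)
    return intersect(f1, d1, f2, d2)

def line_K(phi, ell, beta):
    """K(beta): the line perpendicular to J(beta) through J(beta)'s pivot."""
    foot, d, gamma = segment_J(phi, ell, beta)
    # Part (i) with c = ell - b gives b = ell*cos(gamma)*sin(beta)/sin(phi).
    b = ell * math.cos(gamma) * math.sin(beta) / math.sin(phi)
    pivot = (foot[0] + b * d[0], foot[1] + b * d[1])
    normal = (d[1], -d[0])  # direction perpendicular to J(beta)
    return pivot, normal

def pivot_K_y(phi, ell, beta, delta=1e-5):
    """Approximate y-coordinate of the pivot of {K(beta)}."""
    p1, n1 = line_K(phi, ell, beta - delta)
    p2, n2 = line_K(phi, ell, beta + delta)
    return intersect(p1, n1, p2, n2)[1]

def formula_ii(phi, ell, beta):
    """Closed form from Lemma 3.15(ii)."""
    gamma = math.pi - phi - beta
    return ell * (math.cos(gamma) + math.cos(gamma - beta) * math.cos(beta)) / math.sin(gamma + beta)

# Sample values (arbitrary): beta must lie in (pi/2 - phi, pi/2).
phi, ell, beta = 1.0, 2.0, 1.0
foot, _, gamma = segment_J(phi, ell, beta)
P = pivot_J(phi, ell, beta)
b = math.hypot(P[0] - foot[0], P[1] - foot[1])
c = ell - b
```

With these sample values, `b / tan(beta)` and `c / tan(gamma)` should agree as in (i), and `pivot_K_y(phi, ell, beta)` should agree with `formula_ii(phi, ell, beta)` up to the finite-difference error.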
Proof. Given such Lφ, ℓ, {J(β)}β∈I, and {K(β)}β∈I, fix a value β0 ∈ I. To prove (i), let P be the pivot at β0 for {J(β)}β∈I, and let b and c be the distances along J(β0) from P to the x-axis and Lφ, respectively. Define a family {H(β)}β∈(0,π/2) of variable-length line segments as those for which H(β) is the line segment through P with endpoints on the x-axis and Lφ, and that intersects the x-axis at angle β; e.g., H(β0) = J(β0). It follows that d(length of H(β))/dβ = 0 at β = β0 because H(β) and J(β) have the same pivot at β0, so the instantaneous rates of change of location of their respective endpoints (on the x-axis and Lφ) are the same. Let B(β) and C(β) be the distances along H(β) from P to the x-axis and Lφ, respectively, so B′(β0) + C′(β0) = 0. Basic trigonometry shows that B(β) = hB / sin β and C(β) = hC / sin γ, where γ = π − φ − β, and hB and hC are the (shortest) distances from P to the x-axis and Lφ, respectively. Putting all of this together, (i) follows because b = B(β0) and c = C(β0).

For (ii), we note that K(β) is the line

$$y = \cot\beta\left(x - \left(\frac{\ell\sin\gamma}{\sin\varphi} - b\cos\beta\right)\right) + b\sin\beta, \quad\text{i.e.,}\quad y = (\cot\beta)\,x - \frac{\ell\sin(\varphi + 2\beta)}{\sin\varphi\,\sin\beta},$$

where the second form uses γ = π − φ − β, as well as a substitution for b stemming from (i) which gives b(cot β) = (ℓ − b)(cot γ). For all β, the slope of K(β) is greater than zero, so the pivot for {K(β)}β∈I at β0 can be found as follows. For any given y, define the function f_y(β) giving the value of x such that (x, y) is on the line K(β):

$$f_y(\beta) = (\tan\beta)\,y + \frac{\ell\sin(\varphi + 2\beta)}{\sin\varphi\,\cos\beta}.$$

Then the pivot at β0 must have y-coordinate such that f_y′(β0) = 0. Solving this equation for y gives (ii).

Lemma 3.16. A minimal square cannot be in Con8, Con9, or Con17.

Proof. Suppose for contradiction that a minimal square exists in Con8+. Noting that the extended version of Con8 with r = r0 ≤ 1 is equivalent to Con17 with r = 1/r0, we may apply Corollary 3.13, concluding that r > 1/3.
Let I = [β1, β2] be the set of angles between L and the lower left sides of the squares in Con8+, and let β0 ∈ I be the maximum such angle of a minimal square in Con8+. Then we know that β0 ≠ β2, and that β0 ≠ β1 unless r = 1, because a minimal square cannot be in Con7, Con10, Con16, or Con18. Let ℓ be the side length of the minimal square, and let {G(β)}β∈I be the family of line segments of length ℓ with endpoints on L and C1, for which G(β) intersects L at angle β. Let m(β) be the distance from Or to G(β). Then m′(β0) = 0 because either m(β0) is a minimal value on an open interval, or the square for β0 is in Con9 so the shortest distance from Or to G(β0) is the distance from Or to G(β0)'s endpoint at L, which is the pivot for {G(β)}β∈I at β0. Let {H(β)}β∈I be the family of lines such that H(β) is perpendicular to G(β) through G(β)'s pivot. Let h(β) be the vertical distance from Or to H(β), but considered negative if H(β) is below Or. The sign of m′(β) is the same as the sign of h(β), because h(β) > 0 means that G(β) is moving away from Or as β increases, and vice versa. Therefore, h(β0) = 0.
Let L0 be the line tangent to C1 at G(β0)'s endpoint on C1. It follows from this tangency of L0 and C1 that the pivot at β0 for {G(β)}β∈I is the same as the pivot at β0 for the family {J(β)}β∈I of line segments of length ℓ with endpoints on L and L0 with J(β) intersecting L at angle β. Intuitively, this match of the pivot is because J(β) acts like G(β) near β0, and can be seen formally by standard ε-δ reasoning. It also follows that the pivot P at β0 for {H(β)}β∈I is the same as the pivot at β0 for the family {K(β)}β∈I of lines such that K(β) is perpendicular to J(β) through J(β)'s pivot.

We now apply Lemma 3.15(ii) to {K(β)}β∈I in order to show that the distance from L to P is less than r. This distance, the “y-coordinate” of P, is thus given by

$$y_P = \ell\,\frac{\cos\gamma + \cos(\gamma - \beta)\cos\beta}{\sin(\gamma + \beta)},$$

where γ is the angle that J(β) makes with L0. We claim that yP ≤ √2 ℓ, and begin the proof by noting that for angle φ between L and L0 we have φ + γ + β = π, along with φ, β ∈ (0, π/2), γ ∈ (0, π/2], and γ + β ∈ (π/2, π). Let v = (γ + β)/2 and w = (γ − β)/2, and let

$$f(v, w) = \frac{\cos(v + w) + \cos(2w)\cos(v - w)}{\sin(2v)}$$

so that yP ≤ √2 ℓ can be proved by showing that f(v, w) ≤ √2 on the domain defined by v + w ≤ π/2, v − w < π/2, and v ≥ π/4. Although v cannot take the value π/4 in the geometric interpretation, this extension of the domain of f to include v = π/4 facilitates phrasing in the proof of a bound. Straightforward computations show that − cos³ β − 3 cos γ cos β ∂f … 1/3, the fact that yP < r easily follows.

Now, because H(β0) has positive slope, and pivot P on H(β0) is at a distance less than r from L, P is below and to the left of Or. Recalling that h(β) is the (signed) vertical distance from Or to H(β), the fact that pivot P is to the left of Or implies
that h′(β0) < 0. We already know that h(β0) = 0, so there exists ε > 0 such that β ∈ (β0, β0 + ε) implies h(β) < 0. However, since h(β) and m′(β) have the same sign, this means m′(β) < 0 on (β0, β0 + ε), contradicting m(β0) being minimal.

Proposition 3.17. A minimal square occurs only in Con6, and possibly in Con19, and must have V1 lower than O1. Con19 has a minimal square if and only if r = 1.

Proof. This result follows from the lemmas ruling out all other configurations, and the fact that Con19 is the reflection of Con6.

4. EQUATIONS FOR MINIMUM SIDE LENGTH. In this section, we derive equations toward the pursuit of the minimum side length of an inscribed square, knowing from Proposition 3.17 that this minimum occurs in Con6 and has V1 lower than O1. We continue using here the notation introduced at the beginnings of Sections 2 and 3.

Lemma 4.1. In Con6 with V1 lower than O1, s(θ) decreases then increases, both strictly monotonically, as a function of θ. Thus, there is a unique minimal square in Con6.

Proof. As θ increases in Con6 with V1 lower than O1, TL moves to the right and V1 moves down, so the intersection between T1 (which has negative slope) and TL moves down. Simultaneously, Vr moves up, so the intersection between Tr and TL moves up. Therefore, as θ increases, the triangle formed by TL, T1, and Tr switches from the right side of TL to the left side of TL at a unique value, when T1 and Tr intersect TL at the same point. By Lemma 3.1, s(θ) decreases then increases, both strictly monotonically. The uniqueness of the minimal square in Con6 follows from this and Proposition 3.17.

For the equations used in seeking the minimum side length of an inscribed square, the parameter representing orientation will be the distance, x, from the line L to the upper left vertex V1 of the square, as shown in Figure 15. It is not difficult to see that x is related strictly monotonically to θ.

Proposition 4.2.
The minimum side length, μ(r), of an inscribed square is the minimum value of the function

$$z(x) = \left[\,x^2 + \left(r - x - \sqrt{r^2 - \left(2\sqrt{r} - x - \sqrt{2x - x^2}\right)^2}\,\right)^2\,\right]^{1/2} \tag{1}$$

on the interval (1 − 1/√2, 1). The minimum occurs at a unique point, denoted xm. Moreover, the function z is strictly decreasing for x < xm and strictly increasing for x > xm.

Proof. Let (x1, x2) be the interval of values of the distance x from L to V1 such that the associated square is in Con6 with V1 lower than O1. We know from Proposition 3.17 that the minimum side length corresponds to some x ∈ (x1, x2). If x ∈ (x1, x2), then as illustrated in Figure 15, congruent triangles using x and z show that Vup is at a horizontal distance x from V1. Therefore, for x ∈ (x1, x2),

$$z = \left(x^2 + h^2\right)^{1/2},$$
Figure 15. Equations used to compute the side length of a square in Con6, given distance x from the line L to the upper left vertex V1 of the square. Here the side length is denoted z.
where h is the vertical distance from V1 to Vup. Thus by the equations in Figure 15, the side length is given as a function of x ∈ (x1, x2) by z(x) in equation (1). In addition, it follows from Lemma 4.1 that on (x1, x2) the minimum occurs at a unique point, xm, and that z is strictly decreasing for x < xm and strictly increasing for x > xm.

However, the stated domain (1 − 1/√2, 1) extends beyond (x1, x2); certainly, x2 < 1 by definition. Toward the claim that x1 > 1 − 1/√2, we suppose for contradiction that x = 1 − 1/√2 is possible. Because V1 is distance x from L, x = 1 − 1/√2 occurs exactly when the line through O1 and V1 has slope −1. In that case, 1 − 1/√2 is also the horizontal distance from V1 to a vertical line tangent to C1 on the right side. Because Vup is horizontal distance x (= 1 − 1/√2) from V1, Vup is on that vertical line. This contradicts the fact that Vup is on Cr, which requires Vup to be to the right of that vertical line. Noting that x < 1 − 1/√2 would result in a similar contradiction, we conclude that x1 > 1 − 1/√2.

Now considering z in equation (1) as a function of x ∈ (1 − 1/√2, 1), we know on (x1, x2) that z represents the side length of an inscribed square, but outside (x1, x2) we only have equation (1). It is helpful to set up geometric interpretations for the cases x ∈ (1 − 1/√2, x1) and x ∈ (x2, 1) in order to complete the proof by showing that z strictly decreases when x ∈ (1 − 1/√2, x1) and strictly increases when x ∈ (x2, 1). Figures 16 and 17 illustrate geometric interpretations of z as given by equation (1) when x belongs to the intervals (1 − 1/√2, x1) and (x2, 1), respectively. Here, the three consecutive vertices still lie on the line and the two circles, although the squares are no longer inscribed. In both cases, the same equations as in Figure 15 still apply, and lead to (1).
The fact that z strictly decreases when x ∈ (1 − 1/√2, x1) and strictly increases when x ∈ (x2, 1) can be seen by the same reasoning as in Lemma 4.1, this time using T1, Tr, and TL defined as before except that the “point of contact” is specifically that with the relevant vertex, as shown in Figures 16 and 17. The reasoning then uses a modified form of Lemma 3.1 that applies to these squares with consecutive vertices on the line and circles, instead of to inscribed squares, and whose proof is analogous to that of Lemma 3.1.
Figure 16. Geometry showing that z given by (1) is not a minimum when x < x1 .
Figure 17. Geometry showing that z given by (1) is not a minimum when x > x2 .
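Because Proposition 4.2 states that z decreases and then increases on (1 − 1/√2, 1), its minimum can be approximated by any bracketing method for unimodal functions. The sketch below is an illustration rather than the authors' procedure: it uses a golden-section search on z itself instead of a root-finder on the derivative, assumes r ≥ 1, and the names `z` and `approx_mu` are hypothetical.

```python
import math

def z(x, r):
    """Equation (1): the side length as a function of x, for r >= 1."""
    s = math.sqrt(2.0 * x - x * x)
    d = 2.0 * math.sqrt(r) - x - s
    inner = max(r * r - d * d, 0.0)  # guard against tiny negative round-off
    h = r - x - math.sqrt(inner)
    return math.sqrt(x * x + h * h)

def approx_mu(r, tol=1e-10):
    """Golden-section search for the minimizer of z on (1 - 1/sqrt(2), 1).

    Relies only on z being strictly decreasing, then strictly increasing.
    Returns the pair (x_m, z(x_m)).
    """
    a, b = 1.0 - 1.0 / math.sqrt(2.0), 1.0
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if z(c, r) < z(d, r):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    x_m = 0.5 * (a + b)
    return x_m, z(x_m, r)

x_m, mu = approx_mu(1.0)
```

Calling `approx_mu(r)` for other values of r ≥ 1 gives the corresponding approximations; the accuracy of `x_m` is limited by the flatness of z near its minimum, so the tolerance is only nominal.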
Remark 4.3 (Approximating μ(r)). From Proposition 4.2, one can deduce an algorithm for computing an approximation of μ(r) given the radius r. Indeed, it suffices for this purpose to minimize the function z, which can be achieved by applying root-finding methods to z′.

Proposition 4.4. In the result of Proposition 4.2, i.e., that the minimum side length is given by μ(r) = z(xm), the number xm is a root of the 10th degree polynomial

$$f_4\,(f_1 f_7 + f_2 f_6 - 2 f_3 f_5)^2 - (f_3^2 f_4 + f_5^2 - f_1 f_4 f_6 - f_2 f_7)^2, \tag{2}$$

where, letting k = √r,

$$\begin{aligned}
f_1(x) &= -2x + 4k,\\
f_2(x) &= (4k - 2)x + k^4 - 4k^2,\\
f_3(x) &= (6k - 3)x + k^4 - 2k^3 - 3k^2,\\
f_4(x) &= -x^2 + 2x,\\
f_5(x) &= 4x^3 - (2k^2 + 6k + 7)x^2 + (2k^3 + 3k^2 + 10k)x - 2k^3,\\
f_6(x) &= 8x^3 + (-4k^2 - 16)x^2 + (4k^3 - 2k^2 + 6)x - 4k^3 + 8k^2 - 4k,\\
f_7(x) &= (4k^2 - 16k)x^3 + (-k^4 + 4k^3 - 10k^2 + 40k)x^2 + (2k^4 - 8k^3 + 4k^2 - 20k + 2)x + 4k^2.
\end{aligned}$$

Proof. By Proposition 4.2, μ(r) = z(xm), where z(x) is given by equation (1), and xm is the unique point in the interval (1 − 1/√2, 1) such that dz/dx = 0, or equivalently dA/dx = 0 where A(x) = (z(x))². It remains to show that xm is a root of (2). Let

$$g(x) = (-2x + 4k)\sqrt{f_4(x)} + (4k - 2)x + k^4 - 4k^2,$$

so that expansion in (1) gives

$$A(x) = 2\left[(x - k^2)\sqrt{g(x)} + (-x + 2k)\sqrt{f_4(x)} + x^2 + (-k^2 + 2k - 1)x + k^4 - 2k^2\right].$$

Note that

$$\frac{dg}{dx} = \frac{(4k - 2)\sqrt{f_4(x)} + 4x^2 - (4k + 6)x + 4k}{\sqrt{f_4(x)}},$$

which leads to

$$\frac{1}{2}\frac{dA}{dx} = \frac{(x - k^2)\left[(2k - 1)\sqrt{f_4(x)} + 2x^2 - (2k + 3)x + 2k\right]}{\sqrt{g(x)}\,\sqrt{f_4(x)}} + \sqrt{g(x)} + \frac{x^2 - (2k + 1)x + 2k}{\sqrt{f_4(x)}} - \sqrt{f_4(x)} + 2x - k^2 + 2k - 1.$$

In view of the fact that dA/dx = 0 at x = xm, set (1/2) dA/dx = 0, then multiply by √g(x) √f4(x) and separate terms with √g(x) from the rest to obtain

$$\sqrt{g(x)}\left[(2x - k^2 + 2k - 1)\sqrt{f_4(x)} + 2x^2 - (2k + 3)x + 2k\right] = -f_3(x)\sqrt{f_4(x)} - f_5(x).$$
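As a numerical sanity check on Proposition 4.4, one can assemble polynomial (2) from f1, …, f7 for a specific rational k and confirm that it has degree 10 and that it nearly vanishes at a numerically located minimizer of z. The sketch below is illustrative only: polynomials are plain coefficient lists (lowest degree first), the minimizer is found by a golden-section search rather than by any method from the article, and all function names are hypothetical.

```python
import math
from fractions import Fraction

def pmul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def padd(p, q, sign=1):
    """p + sign*q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a + sign * b for a, b in zip(p, q)]

def degree(p):
    return max((i for i, c in enumerate(p) if c != 0), default=-1)

def peval(p, x):
    acc = 0.0
    for c in reversed(p):  # Horner's rule
        acc = acc * x + float(c)
    return acc

def poly2(k):
    """Coefficients of polynomial (2) for a rational value of k = sqrt(r)."""
    k = Fraction(k)
    f1 = [4*k, -2]
    f2 = [k**4 - 4*k**2, 4*k - 2]
    f3 = [k**4 - 2*k**3 - 3*k**2, 6*k - 3]
    f4 = [0, 2, -1]
    f5 = [-2*k**3, 2*k**3 + 3*k**2 + 10*k, -(2*k**2 + 6*k + 7), 4]
    f6 = [-4*k**3 + 8*k**2 - 4*k, 4*k**3 - 2*k**2 + 6, -4*k**2 - 16, 8]
    f7 = [4*k**2, 2*k**4 - 8*k**3 + 4*k**2 - 20*k + 2,
          -k**4 + 4*k**3 - 10*k**2 + 40*k, 4*k**2 - 16*k]
    u = padd(padd(pmul(f1, f7), pmul(f2, f6)), pmul([2], pmul(f3, f5)), -1)
    v = padd(padd(pmul(pmul(f3, f3), f4), pmul(f5, f5)),
             padd(pmul(pmul(f1, f4), f6), pmul(f2, f7)), -1)
    return padd(pmul(f4, pmul(u, u)), pmul(v, v), -1)

def z(x, r):
    """Equation (1), for r >= 1."""
    s = math.sqrt(2.0 * x - x * x)
    d = 2.0 * math.sqrt(r) - x - s
    h = r - x - math.sqrt(max(r * r - d * d, 0.0))
    return math.sqrt(x * x + h * h)

def argmin_z(r, tol=1e-10):
    """Golden-section search for the minimizer of z on (1 - 1/sqrt(2), 1)."""
    a, b = 1.0 - 1.0 / math.sqrt(2.0), 1.0
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while b - a > tol:
        c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
        if z(c, r) < z(d, r):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

p = poly2(1)         # k = 1, i.e., r = 1
x_m = argmin_z(1.0)  # numerical minimizer of z for r = 1
residual = peval(p, x_m)
```

Exact rational arithmetic is used for the coefficients so that the cancellation of the two 16x⁶ terms is exact; the residual is evaluated in floating point and should be small compared with the size of the coefficients.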
Note that g(x) = f1(x)√f4(x) + f2(x), so that squaring both sides and omitting “(x)” for readability results in

$$\left(f_6\sqrt{f_4} + f_7\right)\left(f_1\sqrt{f_4} + f_2\right) = f_3^2 f_4 + 2 f_3 f_5\sqrt{f_4} + f_5^2.$$

Separating terms with √f4 from the rest gives

$$(f_1 f_7 + f_2 f_6 - 2 f_3 f_5)\sqrt{f_4} = f_3^2 f_4 + f_5^2 - f_1 f_4 f_6 - f_2 f_7.$$

Squaring both sides and rearranging leads to the final polynomial equation,

$$f_4\,(f_1 f_7 + f_2 f_6 - 2 f_3 f_5)^2 - (f_3^2 f_4 + f_5^2 - f_1 f_4 f_6 - f_2 f_7)^2 = 0.$$

This polynomial has degree 10. Terms of higher degree could arise only from f5² and f1 f4 f6, each of which has degree 6, but both f5² and f1 f4 f6 have 16x⁶ as their 6th degree term, so f5² − f1 f4 f6 has degree 5. In summary, xm is a root of the polynomial in (2).

5. NONEXISTENCE OF A SOLUTION BY RADICALS. In this section, we prove that μ is not a radical function as defined in Section 1. For convenience we work instead with the function λ defined by λ(c) = (μ(c²))². It is intuitively clear that if μ is a radical function then λ is also radical; a formal proof of this fact is given in a more general context in Lemma 5.1. The proof of our main result, Theorem 5.4, shows that λ is not radical on any infinite subset of [1, ∞). As a consequence we obtain the fact that μ cannot be radical on any such set. We refer the reader to [3, Chapters 13 and 14] for the algebraic background assumed in this section.

Lemma 5.1. Let J be a nonempty subset of [1, ∞), let n ∈ Z+, and let I = ⁿ√J be the set of positive nth roots of elements of J. Suppose that a function f : J → R is radical, and define g : I → R and h : J → R by g(c) = f(cⁿ) and h(c) = (f(c))ⁿ. Then g and h are radical.

Proof. We begin by showing that g is radical. Let q ∈ C[k, x] be a nonzero polynomial satisfying q(c, f(c)) = 0 for every c ∈ J, and such that there is a radical extension R/C(k) containing a splitting field S of q. Let ϕ : C(k) → C(k) be the embedding induced by the map k ↦ kⁿ, and let Ω be an algebraic closure of C(k). By basic field theory (see [7, Chapter V, §2, Theorem 2.8]), we may extend the map ϕ to an embedding ϕ : R → Ω. Defining Q(k, x) = q(kⁿ, x) we have Q(c, g(c)) = 0 for every c ∈ I.
Moreover, since Q is the polynomial obtained by applying ϕ to the coefficients of q, the above observations imply that Q splits in the field ϕ(S), which is contained in the radical extension ϕ(R)/C(k). Thus g is a radical function.

Next we show that h is radical. The argument will be given assuming that q is monic and has no repeated root; the general case can be proved similarly. Let α1, …, αd be the roots of q in S, and let Q(k, x) = (x − α1ⁿ) ⋯ (x − αdⁿ). Note that since q has coefficients in C[k], the same holds for Q. (The elementary symmetric functions of α1ⁿ, …, αdⁿ are polynomials in the elementary symmetric functions of α1, …, αd.) Moreover, as explained below, one can show that Q(c, h(c)) = 0 for every c ∈ J. Since Q splits in the field C(k, α1ⁿ, …, αdⁿ) ⊆ S ⊆ R, this implies that h is radical.
Fixing c ∈ J, the fact that Q(c, h(c)) = 0 can be seen heuristically first: Since q(k, x) = (x − α1) ⋯ (x − αd) and q(c, f(c)) = 0, we must have f(c) = αi(c) for some i, and thus Q(c, h(c)) = 0. This argument can be made rigorous by extending the map k ↦ c to a ring homomorphism C[k, α1, …, αd] → C, so that αi(c) is well-defined. We refer the interested reader to Section 3 in [7, Chapter VII] for the necessary tools.

Next we prove two preliminary results needed to show that λ is not radical.

Lemma 5.2. Let p(k, x) be the polynomial defined by (2) considering k and x as indeterminates. As an element of the ring C(k)[x], the polynomial p is irreducible and has Galois group isomorphic to the symmetric group S10.

Proof. We rely on a computation carried out using the computer algebra system Magma; the code for our computation is available in the supplemental online material. Constructing p(k, x) as an element of the ring Q(k)[x], we use Sutherland's algorithm [9] to compute a permutation representation of the Galois group of p over C(k), and we obtain the group S10. It follows that the Galois group acts transitively on the roots of p, so p is irreducible over C(k).

By Proposition 4.2 we may regard xm as a function of r defined on the interval [1, ∞), and moreover, we have

$$\mu(r) = z(x_m(r)) \quad \text{for all } r \ge 1. \tag{3}$$

For convenience we will make the change of variable k = √r and work instead with the function ξ : [1, ∞) → R defined by ξ(k) = xm(k²). From Proposition 4.4 we deduce that

$$p(k, \xi(k)) = 0 \quad \text{for all } k \in [1, \infty). \tag{4}$$

Furthermore, (3) implies that λ(k) = (z(ξ(k)))². Hence, writing ξ for ξ(k), we have

$$\lambda(k) = \xi^2 + \left(k^2 - \xi - \sqrt{k^4 - \left(2k - \xi - \sqrt{2\xi - \xi^2}\right)^2}\,\right)^2.$$

By manipulating the equation above we obtain a polynomial h ∈ Q[k, x, y] with the property that

$$h(k, \xi(k), \lambda(k)) = 0 \quad \text{for all } k \in [1, \infty). \tag{5}$$

Explicitly, h is given by the formula

$$h(k, x, y) = \left[(y - x^2 - c_1^2 - c_2 + c_3^2 + c_4)^2 + 4c_3^2 c_4 - 4c_1^2 c_2 + 4c_1^2 c_3^2 + 4c_1^2 c_4\right]^2 - c_4\left[8c_1^2 c_3 + 4c_3\,(y - x^2 - c_1^2 - c_2 + c_3^2 + c_4)\right]^2,$$

where c1 = k² − x, c2 = k⁴, c3 = 2k − x, and c4 = 2x − x².

Lemma 5.3. Let Ω be an algebraic closure of the field C(k). Suppose that α, β ∈ Ω satisfy p(k, α) = h(k, α, β) = 0. Then C(k, α) ⊆ C(k, β).
Proof. We rely on a number of computations in Magma; the code used for all computations is available in the supplemental online material. Let F = Q(k, α). To prove the lemma it suffices to show that F ⊆ Q(k, β). Regarding p as an element of the ring Q(k)[x], note that p is irreducible by Lemma 5.2, and α is a root of p by hypothesis, so we may identify F with the field Q(k)[x]/(p(k, x)). Constructing F in Magma and factoring¹ the polynomial h(k, α, y) over F, we find that this polynomial has two roots in F and two roots that are quadratic over F. Note that β must be one of these four roots since h(k, α, β) = 0.

Suppose that β ∈ F. Computing the minimal polynomial of β over Q(k) we obtain a polynomial of degree 10; thus [Q(k, β) : Q(k)] = 10. Since β ∈ F and [F : Q(k)] = deg(p) = 10, this implies that F = Q(k, β). In particular, F ⊆ Q(k, β) as desired.

Now suppose that β is quadratic over F. Then a minimal polynomial computation shows that [Q(k, β) : Q(k)] = 20. Since [F(β) : F] = 2 and [F : Q(k)] = 10, this implies that F(β) = Q(k, β), so again F ⊆ Q(k, β).

We can now prove the main theorem of this article.

Theorem 5.4. There is no infinite subset J ⊆ [1, ∞) such that μ : J → R is radical.

Proof. As above, let Ω denote an algebraic closure of the field C(k). By Lemma 5.1, in order to prove the theorem it suffices to show that λ is not radical on any infinite subset of [1, ∞). Suppose for contradiction that I ⊆ [1, ∞) is an infinite set such that λ : I → R is radical. Then there is a nonzero polynomial q(k, y) ∈ C[k, y] whose Galois group over C(k) is solvable, and such that

$$q(k, \lambda(k)) = 0 \quad \text{for all } k \in I. \tag{6}$$

Regarding q and h as elements of the ring C[k, x, y], let

$$f(k, x) = \operatorname{Res}_y\bigl(h(k, x, y),\, q(k, x, y)\bigr),$$

where Res_y denotes the resultant as polynomials in y. By (5) and (6), for every k ∈ I the polynomials h(k, ξ(k), y) and q(k, ξ(k), y) have a common root, namely λ(k); hence

$$f(k, \xi(k)) = 0 \quad \text{for all } k \in I. \tag{7}$$

Similarly, letting g(k) = Res_x(f(k, x), p(k, x)) ∈ C[k], equations (4) and (7) imply that g(k) = 0 for every k ∈ I. Since I is an infinite set, we must have g = 0. Therefore, f and p have a common root α ∈ Ω. Given that f(k, α) = 0, the definition of f implies that q(k, y) and h(k, α, y) have a common root β ∈ Ω. Note that the assumptions in Lemma 5.3 are satisfied.

Let N ⊂ Ω be the splitting field of q(k, y) over C(k) and let F = C(k, α). By Lemma 5.3 we have F ⊆ C(k, β) and therefore F ⊆ N. Letting L ⊂ Ω be the splitting field of p(k, x) over C(k), we have L ⊆ N since F ⊆ N and L is the Galois closure of the extension F/C(k). Since the extension L/C(k) is Galois, the group Gal(L/C(k)) is a quotient of Gal(N/C(k)). The definition of q(k, y) implies that the latter group is solvable, so the former is, too. This contradicts Lemma 5.2 (since the group S10 is not solvable), and thus completes the proof of the theorem.

¹ The algorithm used by Magma to factor polynomials over algebraic function fields is discussed in [10].

ACKNOWLEDGMENTS. JEH received support from the Colby College Research Grant Program. The authors thank Gerardo Lafferriere of the Fariborz Maseeh Department of Mathematics and Statistics at Portland State University for welcoming JEH as a Visiting Scholar during the final stages of this project.
ORCID
Jan E. Holly http://orcid.org/0000-0002-6419-3038
David Krumm http://orcid.org/0000-0001-9999-2166

REFERENCES
[1] Bosma, W., Cannon, J., Playoust, C. (1997). The Magma algebra system. I. The user language. J. Symbolic Comput. 24(3–4): 235–265.
[2] Coxeter, H. S. M. (1989). Introduction to Geometry, 2nd ed. New York: Wiley.
[3] Dummit, D. S., Foote, R. M. (2004). Abstract Algebra, 3rd ed. Hoboken, NJ: Wiley.
[4] Fukagawa, H., Pedoe, D. (1989). Japanese Temple Geometry Problems. Winnipeg, Canada: The Charles Babbage Research Centre.
[5] Fukagawa, H., Rothman, T. (2008). Sacred Mathematics: Japanese Temple Geometry. Princeton, NJ: Princeton Univ. Press.
[6] Kinoshita, H. (2018). An unsolved problem in the Yamaguchi’s travell diary. Sangaku J. Math. 2: 43–53. sangaku-journal.eu
[7] Lang, S. (2002). Algebra, 3rd ed. New York, NY: Springer-Verlag.
[8] Rothman, T. (1998). Japanese temple geometry. Sci. Amer. 278(5): 84–91.
[9] Sutherland, N. (2018). Computations with Galois groups in Magma. Computeralgebra Rundbrief. 62: 16–21. fachgruppe-computeralgebra.de/data/CA-Rundbrief/car62.pdf
[10] Trager, B. M. (1976). Algebraic factoring and rational function integration. In: Jenks, R. D., ed. SYMSAC ’76: Proceedings of the Third ACM Symposium on Symbolic and Algebraic Computation. New York: ACM, pp. 219–226.

JAN E. HOLLY received a Ph.D. in mathematics from the University of Illinois, and is a mathematical neuroscientist or a pure mathematician depending on the day.
Department of Mathematics and Statistics, Colby College, Waterville, ME 04901

DAVID KRUMM received a Ph.D. in mathematics from the University of Georgia, and is currently in a visiting position at Reed College. He first became aware of the existence of sangaku while visiting the Todai-ji temple in Nara, Japan.
Mathematics Department, Reed College, Portland, OR 97202
Composites in the Ulam Spiral

In 1963, Stanisław Ulam plotted the natural numbers in a rectilinear spiral array and observed that prime values in the resulting Ulam spiral tend to cluster on certain diagonal lines [1]. The Ulam spiral also contains arbitrarily large square patches consisting entirely of composite numbers.

    197 196 195 194 193 192 191 190 189 188 187 186 185 184 183
    198 145 144 143 142 141 140 139 138 137 136 135 134 133 182
    199 146 101 100  99  98  97  96  95  94  93  92  91 132 181
    200 147 102  65  64  63  62  61  60  59  58  57  90 131 180
    201 148 103  66  37  36  35  34  33  32  31  56  89 130 179
    202 149 104  67  38  17  16  15  14  13  30  55  88 129 178
    203 150 105  68  39  18   5   4   3  12  29  54  87 128 177
    204 151 106  69  40  19   6   1   2  11  28  53  86 127 176
    205 152 107  70  41  20   7   8   9  10  27  52  85 126 175
    206 153 108  71  42  21  22  23  24  25  26  51  84 125 174
    207 154 109  72  43  44  45  46  47  48  49  50  83 124 173
    208 155 110  73  74  75  76  77  78  79  80  81  82 123 172
    209 156 111 112 113 114 115 116 117 118 119 120 121 122 171
    210 157 158 159 160 161 162 163 164 165 166 167 168 169 170
    211 212 213 214 215 216 217 218 219 220 221 222 223 224 225
The nth element on the diagonal ray 1, 9, 25, 49, … in the Ulam spiral is (2n − 1)². The d × d block with (2n + 3)² as its upper-left corner is a d × d matrix whose entries are quadratic polynomials in n with positive integer coefficients and constant terms greater than 1. For example, d = 3 yields

$$A(n) = \begin{bmatrix} (2n+3)^2 & (2n+3)^2 + 1 & (2n+5)^2 + 2 \\ (2n+5)^2 - 1 & (2n+5)^2 & (2n+5)^2 + 1 \\ (2n+7)^2 - 2 & (2n+7)^2 - 1 & (2n+7)^2 \end{bmatrix} \equiv \begin{bmatrix} 9 & 10 & 27 \\ 24 & 25 & 26 \\ 47 & 48 & 49 \end{bmatrix} \pmod{n}.$$

Let ℓ denote the least common multiple of the constant terms of these d² polynomials. For example, ℓ = lcm{9, 10, 24, 25, 26, 27, 47, 48, 49} = 323,341,200 when d = 3. Since (2ℓ + 3)² > ℓ, each entry of A(ℓ) is larger than ℓ and divisible by a prime factor of ℓ. Thus, A(ℓ) contains only composite numbers.

REFERENCES
[1] Gardner, M. (1964). Mathematical games: The remarkable lore of the prime numbers. Sci. Amer. 210(3): 120–128.
Submitted by Stephan Ramon Garcia¹ (Claremont, CA) and Matthew A. Myers (Spruce Pine, NC)
doi.org/10.1080/00029890.2021.1858692
MSC: Primary 11A07, Secondary 11A41
¹ Partially supported by NSF Grant DMS-1800123
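The d = 3 case of this argument is small enough to verify directly. The sketch below (function names are hypothetical, and ℓ is written `ell`) builds the block A(n) from the displayed formula, recomputes ℓ from the constant terms A(0), and checks that every entry of A(ℓ) is a proper multiple of its own constant term, which is what makes it composite.

```python
from functools import reduce
from math import gcd

def block(n):
    """The 3 x 3 block A(n) whose upper-left corner is (2n + 3)^2."""
    a, b, c = (2 * n + 3) ** 2, (2 * n + 5) ** 2, (2 * n + 7) ** 2
    return [[a, a + 1, b + 2],
            [b - 1, b, b + 1],
            [c - 2, c - 1, c]]

def lcm_all(values):
    """Least common multiple of a collection of positive integers."""
    return reduce(lambda x, y: x * y // gcd(x, y), values, 1)

constants = block(0)  # constant terms: 9, 10, 27 / 24, 25, 26 / 47, 48, 49
ell = lcm_all(v for row in constants for v in row)
patch = block(ell)

# Each entry of A(ell) is congruent mod ell to its constant term, which
# divides ell, so the constant term divides the entry itself.
all_composite = all(
    patch[i][j] % constants[i][j] == 0 and patch[i][j] > constants[i][j]
    for i in range(3) for j in range(3)
)
```

The same check extends to larger d once the corresponding d × d block of polynomials is written down; only the 3 × 3 block given in the note is used here.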
A Uniform Proof of the Finiteness of the Class Group of a Global Field

Alexander Stasinski

Abstract. We give a definition of a class of Dedekind domains which includes the rings of integers of global fields and give a proof that all rings in this class have finite ideal class group. We also prove that this class coincides with the class of rings of integers of global fields.
1. INTRODUCTION.

Background. The starting point of algebraic number theory is to define an (algebraic) number field K as a finite field extension of Q and the ring Z_K of algebraic integers in K as all the elements in K that satisfy a monic polynomial equation with coefficients in Z. (This is called the integral closure of Z in K.) Unlike the situation in Z, unique factorization of elements in Z_K into irreducibles can fail. Nevertheless, Z_K is an example of a Dedekind domain, that is, an integral domain in which every nonzero proper ideal is, uniquely, a product of prime ideals.

One can develop a parallel theory of finite extensions K, called (algebraic) function fields, of the field F_q(t), which is the field of fractions of the ring of polynomials F_q[t] over a finite field F_q. The analogue of Z_K is then the ring of elements in K that satisfy a monic polynomial equation with coefficients in F_q[t]. These rings are also Dedekind domains and their theory can to a large extent be developed in parallel with that of the rings Z_K. For this reason, it is sometimes convenient to use the term global field for either a number field or a function field.

One of the most fundamental problems about rings of integers in global fields is the study of the (failure of) unique factorization. This is encoded in the (ideal) class group of the ring, which is trivial if and only if unique factorization holds. An important result in algebraic number theory is that the class group of Z_K is finite. Similarly, it is known that the class group of a ring of integers in a function field is finite. One can define the ideal class group Cl(R) of any Dedekind domain R as the equivalence classes of nonzero ideals where two ideals I, J are said to be equivalent if aI = bJ for some nonzero a, b ∈ R, and the group operation is induced by multiplication of ideals.

The main results. In [6], P. L. Clark asked whether there exists a “purely algebraic” proof of the finiteness of the class group of global fields and whether there exist any “structural” conditions on a Dedekind domain that imply the finiteness of its class group. In this article, we answer these questions in the affirmative. More precisely, we introduce a class (G) of Dedekind domains (see Definition 2) that contains the rings of integers of global fields. We then give a uniform (i.e., not case by case) proof that any ring of class (G) has finite ideal class group. We also give a known argument showing how to deduce the finiteness of the class group of any overring of a ring of class (G).

doi.org/10.1080/00029890.2021.1855036
MSC: Primary 13A15, Secondary 11R29; 13F10; 13F05
March 2021]
FINITENESS OF THE CLASS GROUP
239
The proof of the finiteness of the class group that we give is essentially that of R. Swan [17, Theorem 3.9] and I. Reiner [16, (26.3)] (modulo some exercises) for rings of integers in global fields. The contribution here is that we axiomatize properties of a Dedekind domain sufficient for the proof to go through and that we show that these properties in fact characterize rings of integers in global fields. This shows in particular that the finiteness of the class groups of global fields can be proved uniformly without adeles and without methods from the geometry of numbers. This fact does not seem to have been widely known. Indeed, Clark writes in [6] that “[. . . ] it is generally held that the finiteness of the class number is one of the first results of algebraic number theory which is truly number-theoretic in nature and not part of the general study of commutative rings” and in [15, B.1, p. 334] the authors write: “Note well that for a general Dedekind domain, ClK need not be finite. This shows that one essentially needs some analysis to supplement the abstract algebra in Chapter 5.” A key idea of the proof is to estimate the norm of an element from above algebraically using the fact that a determinant is a homogeneous polynomial in the entries of a matrix (see the proof of Lemma 2). This idea is present in [7, (20.10)], [16, (26.3)], and [17, p. 53] but can be traced back to Zassenhaus [18] in the number field case and Higman–McLaughlin [12] in the function field case. By contrast, the standard nonadelic and nongeometric proof of the finiteness of the class group in the number field case (see, e.g., [14, V, Section 4]) expresses the field norm in terms of the complex absolute values of Galois conjugates, and in the function field case this needs a modification involving absolute values. In the final section, we show that the class (G) coincides with the class of global fields. 
This uses the Artin–Whaples axiomatization of global fields and shows that the quasi-triangle inequality condition in Definition 2, despite its simplicity and elementary nature, implies the product formula for absolute values.

2. BASIC PIDS AND RINGS OF CLASS (G). All rings are commutative with identity. Let $\mathbb{N}$ denote the set of positive integers. We use the standard acronym PID for “principal ideal domain.” A ring $R$ is called a finite quotient domain or is said to have finite quotients if for every nonzero ideal $I$ of $R$, the quotient $R/I$ is a finite ring. If $R$ is a finite quotient domain, $I \subseteq R$ a nonzero ideal, and $x \in R$ is nonzero, we write $N_R(I) = |R/I|$ and $N_R(x) = |R/xR|$. We also define $N_R(0) = 0$. The function $N_R : R \to \mathbb{N} \cup \{0\}$ is called the ideal norm on $R$. It is known that if $R$ is a finite quotient Dedekind domain, then $N_R$ is multiplicative (see [14, Lemma V.3.5]).

Definition 1. We call a PID $A$ a basic PID if it is not a field and if the following conditions are satisfied:
1. $A$ is a finite quotient domain;
2. there exists a constant $c \in \mathbb{N}$ such that for each $m \in \mathbb{N}$, $\#\{x \in A \mid N_A(x) \le c \cdot m\} \ge m$ (i.e., $A$ has “enough elements of small norm”);
3. there exists a constant $C \in \mathbb{N}$ such that for all $x, y \in A$, $N_A(x + y) \le C \cdot (N_A(x) + N_A(y))$ (i.e., $N_A$ satisfies the “quasi-triangle inequality”).

There exist PIDs for which the first and second conditions in Definition 1 hold but for which the third condition fails. Take, for instance, the PID $A = \mathbb{Z}[\sqrt{2}]$. Then $u = 1 + \sqrt{2}$ is a unit in $A$ and $\bar{u} = 1 - \sqrt{2}$. For any $r \in \mathbb{N}$ write $u^r = a_r + b_r\sqrt{2}$, for $a_r, b_r \in \mathbb{Z}$. Then $a_r$ grows with $r$ and $N_A(u^r + \bar{u}^r) = N_A(2a_r) = |N_{\mathbb{Q}(\sqrt{2})/\mathbb{Q}}(2a_r)| = 4a_r^2$, while $N_A(u^r) + N_A(\bar{u}^r) = 2$, since $u^r$ and $\bar{u}^r$ are units. Thus the third condition in Definition 1 fails for $A$ even though $A$ is a finite quotient domain that satisfies the second condition since it has infinitely many units.

Another example of a finite quotient PID $A$ where the second condition holds but the third condition fails is the localization $\mathbb{Z}_{(p)}$ of $\mathbb{Z}$ at a prime $p$, that is, the subring of $\mathbb{Q}$ consisting of fractions $a/b$, $a, b \in \mathbb{Z}$, where $p \nmid b$. Here $\pm 1 + p^n$ is a unit for every $n \in \mathbb{N}$, so $N_A(1 + p^n) + N_A(-1 + p^n) = 1 + 1 = 2$, while $N_A(1 + p^n + (-1 + p^n)) = N_A(2)N_A(p)^n$, which grows with $n$.

Definition 2. Let $A$ be a basic PID. We call a Dedekind domain $B$ a ring of class (G) (over $A$) if $B$ is an $A$-algebra that is finitely generated and free as a module over $A$.

Since free modules over a PID are torsion-free, we may and will consider $A$ as a subring of $B$ via the embedding $a \mapsto a \cdot 1$. Our goal in the next section is to prove that any ring $B$ of class (G) has finite ideal class group. The terminology “(G)” is provisional (“G” for global), because it will turn out that the class (G) is equal to the class of rings of integers in global fields (see Corollary 1).

Definition 3. By global field we mean either a finite extension of $\mathbb{Q}$ or a finite separable extension of some $\mathbb{F}_q(t)$, where $t$ is transcendental over $\mathbb{F}_q$. By a ring of integers of a global field $K$ we mean either the integral closure in $K$ of $\mathbb{Z}$ (in the number field case) or the integral closure in $K$ of $\mathbb{F}_q[t]$, for some $t \in K$ transcendental over $\mathbb{F}_q$ (in the function field case).

Note that in the function field case, there is no unique ring of integers, as, for instance, one can also take the integral closure of $\mathbb{F}_q[t^{-1}]$.

Proposition 1. Let $B$ be a ring of integers of a global field.
Then $B$ is a ring of class (G) over $\mathbb{Z}$ or $\mathbb{F}_q[t]$, respectively.

Proof. First, it is straightforward to check that $\mathbb{Z}$ is a basic PID. Indeed, all its proper quotients $\mathbb{Z}/n$ are finite and $N_{\mathbb{Z}}(n) = |n|$ (the absolute value of $n$), so $\#\{x \in \mathbb{Z} \mid N_{\mathbb{Z}}(x) \le m\} = 2m + 1 \ge m$ and $N_{\mathbb{Z}}(x + y) = |x + y| \le |x| + |y| = N_{\mathbb{Z}}(x) + N_{\mathbb{Z}}(y)$, thanks to the usual triangle inequality. Next, let $A = \mathbb{F}_q[t]$. For any nonzero $f(t) \in A$ we have $N_A(f(t)) = q^{\deg(f)}$, so $A$ is a finite quotient domain and
$\#\{x \in A \mid N_A(x) \le m\} = \#\{x \in A \mid \deg(x) \le \lfloor \log_q(m) \rfloor\} = q^{\lfloor \log_q(m) \rfloor + 1} > q^{\log_q(m)} = m$,
so the second property in Definition 1 is satisfied for $A$. Furthermore, for $f(t), g(t) \in A$ we have
$N_A(f(t) + g(t)) = q^{\deg(f+g)} \le q^{\max\{\deg f, \deg g\}} \le q^{\deg f} + q^{\deg g} = N_A(f(t)) + N_A(g(t))$,
so $A$ is a basic PID. Thus both $\mathbb{Z}$ and $\mathbb{F}_q[t]$ satisfy Definition 1 with $c = C = 1$.
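The contrast between $\mathbb{Z}$ (where the quasi-triangle inequality holds with $C = 1$) and $\mathbb{Z}[\sqrt{2}]$ (where it fails, as discussed after Definition 1) can be explored numerically. The following is a minimal sketch, not part of the original argument; it assumes the standard description of the ideal norm on $\mathbb{Z}[\sqrt{2}]$ as $|a^2 - 2b^2|$, and the function names are hypothetical.

```python
def norm(a, b):
    """Ideal norm of a + b*sqrt(2) in Z[sqrt(2)], assumed to be |a^2 - 2*b^2|."""
    return abs(a * a - 2 * b * b)


def power_of_unit(r):
    """Return (a_r, b_r) with (1 + sqrt(2))**r = a_r + b_r*sqrt(2)."""
    a, b = 1, 0
    for _ in range(r):
        # (a + b*sqrt(2)) * (1 + sqrt(2)) = (a + 2b) + (a + b)*sqrt(2)
        a, b = a + 2 * b, a + b
    return a, b


def quasi_triangle_sides(r):
    """Return (N(u^r + conj(u)^r), N(u^r) + N(conj(u)^r)) for u = 1 + sqrt(2)."""
    a, b = power_of_unit(r)
    lhs = norm(2 * a, 0)             # u^r + conj(u)^r = 2*a_r
    rhs = norm(a, b) + norm(a, -b)   # both summands are units, so this is 2
    return lhs, rhs


if __name__ == "__main__":
    for r in range(1, 8):
        print(r, quasi_triangle_sides(r))
```

The right-hand side stays equal to 2 while the left-hand side is $4a_r^2$, so no constant $C$ can work for all $r$.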
It is well known that $B$ is a Dedekind domain and is free of finite rank over $\mathbb{Z}$ or $\mathbb{F}_q[t]$, respectively (see [14, Theorem I.4.7] for the number field case and [14, Theorem X.1.7] for the function field case; note that if the extension of fraction fields is finite and separable, which is always the case for number fields, this follows with a classical proof, but for function fields, where separability may fail, it requires a separate proof). Thus, in either case, $B$ is a ring of class (G).

3. NORM ESTIMATES AND FINITENESS. Throughout this section, let $B$ be a ring of class (G) over the basic PID $A$ and let $K$ and $L$ be the field of fractions of $A$ and $B$, respectively. For $\alpha \in L$ we let $T_\alpha$ denote the endomorphism $L \to L$, $x \mapsto \alpha x$, and define the norm $N_{L/K}(\alpha) = \det(T_\alpha)$. As is well known, the fact that $B$ is the integral closure of $A$ in $L$ implies that $N_{L/K}(B) \subseteq A$ (see, e.g., [14, Corollary IV.2.4]). The following lemma is a consequence of [14, Propositions IV.6.9 and V.3.6] (which is valid when $A$ is a Dedekind domain, not necessarily a PID). We give a simple proof in our setting (where $A$ is a PID), exploiting the Smith normal form (see, e.g., [1, Section 5.3]).

Lemma 1. For any nonzero $\alpha \in B$, we have $N_B(\alpha) = N_A(N_{L/K}(\alpha))$.

Proof. We have $N_B(\alpha) = |B/\alpha B|$ and $B/\alpha B$ is the cokernel of the map $T_\alpha : B \to B$. By the Smith normal form, we have
$B/\alpha B \cong A/p_1A \oplus \cdots \oplus A/p_nA$,
where $n$ is the rank of $B$ over $A$, and $p_i \in A$ are some nonzero elements such that $\det(T_\alpha) = u^{-1} p_1 \cdots p_n$, for some unit $u \in A$ ($u = \det(PQ)$ where $P T_\alpha Q$ is the Smith normal form, with $T_\alpha$ identified with its matrix with respect to some chosen basis). Now observe that for any nonzero $m_1, \dots, m_k \in A$, we have
$|A/m_1 \cdots m_k A| = |A/m_1A| \cdots |A/m_kA|$,
which follows from the Chinese remainder theorem (see, e.g., [1, Corollary 2.25]), combined with the fact that for any irreducible element $m \in A$ and $i \in \mathbb{N}$, we have $|m^iA/m^{i+1}A| = |A/mA|$ (the map $A \to m^iA$ given by $1 \mapsto m^i$ induces an isomorphism $A/mA \to m^iA/m^{i+1}A$). Thus
$N_A(N_{L/K}(\alpha)) = |A/N_{L/K}(\alpha)A| = |A/\det(T_\alpha)A| = |A/p_1A| \cdots |A/p_nA| = |B/\alpha B| = N_B(\alpha)$.
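Lemma 1 can be sanity-checked by brute force in a small case. The sketch below is an illustration under stated assumptions, not the author's method: it takes $A = \mathbb{Z}$, $B = \mathbb{Z}[\sqrt{2}]$ with basis $(1, \sqrt{2})$, and counts $|B/\alpha B|$ as the index of the lattice spanned by the rows of the matrix of $T_\alpha$. The function names are hypothetical.

```python
def multiplication_matrix(a, b):
    """Matrix of T_alpha on Z[sqrt(2)] for alpha = a + b*sqrt(2), basis (1, sqrt(2)).

    Row i holds the coordinates of alpha * x_i:
    alpha * 1 = a + b*sqrt(2), alpha * sqrt(2) = 2b + a*sqrt(2).
    """
    return [(a, b), (2 * b, a)]


def lattice_index(rows):
    """Return |Z^2 / Lambda|, where Lambda is spanned by the two rows."""
    (p, q), (r, s) = rows
    d = abs(p * s - q * r)
    if d == 0:
        raise ValueError("rows must be linearly independent")
    # d*Z^2 lies inside Lambda (adjugate identity), so Lambda/(d*Z^2) is the
    # subgroup of (Z/d)^2 generated by the rows; enumerate it by closure.
    seen = {(0, 0)}
    frontier = [(0, 0)]
    while frontier:
        x, y = frontier.pop()
        for gx, gy in rows:
            nxt = ((x + gx) % d, (y + gy) % d)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return d * d // len(seen)


def check_lemma_1(a, b):
    """Compare |B/alpha B| with N_Z(det T_alpha) = |a^2 - 2*b^2|."""
    return lattice_index(multiplication_matrix(a, b)) == abs(a * a - 2 * b * b)


if __name__ == "__main__":
    print(all(check_lemma_1(a, b)
              for a in range(-5, 6) for b in range(-5, 6) if (a, b) != (0, 0)))
```

The determinant is used only to bound the search; the index itself is obtained by counting the subgroup, which gives a mildly independent check of the equality in Lemma 1.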
Lemma 2. Let $x_1, \dots, x_n$ be a basis for $B$ over $A$. Let $\alpha \in B$ and write $\alpha = c_1x_1 + \cdots + c_nx_n$, with $c_i \in A$. Then there exists a homogeneous polynomial $f(T_1, \dots, T_n)$ over $A$ of degree $n$ such that $N_{L/K}(\alpha) = f(c_1, \dots, c_n)$. Moreover, there exists a constant $C \in \mathbb{N}$ such that
$N_B(\alpha) \le C \cdot \max_i\{N_A(c_i)\}^n$.
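As a small worked instance of the statement of Lemma 2 (an illustration added here, with the choice of ring being an assumption), take $A = \mathbb{Z}$ and $B = \mathbb{Z}[\sqrt{2}]$ with basis $x_1 = 1$, $x_2 = \sqrt{2}$, and $\alpha = c_1 + c_2\sqrt{2}$. Then $f$ is a quadratic form and the constant $C = 3$ suffices:

```latex
\[
T_\alpha \;\longleftrightarrow\;
\begin{pmatrix} c_1 & c_2 \\ 2c_2 & c_1 \end{pmatrix},
\qquad
N_{L/K}(\alpha) = f(c_1, c_2) = c_1^2 - 2c_2^2 ,
\]
\[
N_B(\alpha) = N_A\bigl(f(c_1, c_2)\bigr) = \lvert c_1^2 - 2c_2^2 \rvert
\le \lvert c_1 \rvert^2 + 2\lvert c_2 \rvert^2
\le 3 \cdot \max\{N_A(c_1), N_A(c_2)\}^2 .
\]
```

Here the first equality in the second display is Lemma 1, and $N_A(c) = |c|$ for $A = \mathbb{Z}$.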
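Proof. For $1 \le i, j, k \le n$, let $r_{ij}^{(k)} \in A$ be such that
$x_i x_j = \sum_{k=1}^{n} r_{ij}^{(k)} x_k$.
Then
$\alpha x_i = c_1 x_i x_1 + \cdots + c_n x_i x_n = c_1 \sum_{k=1}^{n} r_{i1}^{(k)} x_k + \cdots + c_n \sum_{k=1}^{n} r_{in}^{(k)} x_k = \sum_{k=1}^{n} \Bigl( \sum_{j=1}^{n} c_j r_{ij}^{(k)} \Bigr) x_k$,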
so the matrix of $T_\alpha$ with respect to the basis $x_1, \dots, x_n$ has $(i, k)$-entry equal to $\sum_{j=1}^{n} c_j r_{ij}^{(k)}$, for $1 \le i, k \le n$. Hence each entry of the matrix of $T_\alpha$ is a linear form in $c_1, \dots, c_n$, and therefore $\det(T_\alpha) = f(c_1, \dots, c_n)$ for some homogeneous polynomial $f$ of degree $n$.

Moreover, write
$f(c_1, \dots, c_n) = a_1 c_1^{n_{1,1}} \cdots c_n^{n_{1,n}} + \cdots + a_k c_1^{n_{k,1}} \cdots c_n^{n_{k,n}}$,
where $a_i \in A$, $k \in \mathbb{N}$, $n_{i,j} \in \mathbb{N} \cup \{0\}$, and $\sum_{j=1}^{n} n_{i,j} = n$ for every $i$. By Lemma 1, the multiplicativity of $N_A$, and the quasi-triangle inequality for $N_A$, there exists a constant $C_0 \in \mathbb{N}$ such that
$N_B(\alpha) = N_A(N_{L/K}(\alpha)) = N_A(f(c_1, \dots, c_n))$
$\le C_0 \bigl( N_A(a_1) N_A(c_1)^{n_{1,1}} \cdots N_A(c_n)^{n_{1,n}} + \cdots + N_A(a_k) N_A(c_1)^{n_{k,1}} \cdots N_A(c_n)^{n_{k,n}} \bigr)$
$\le C_0 k \cdot \max_i\{N_A(a_i)\} \bigl(\max_i\{N_A(c_i)\}\bigr)^n$.
Hence the result follows by letting $C = C_0 k \cdot \max_i\{N_A(a_i)\}$.

Theorem 1. Suppose that $B$ is a ring of class (G) over $A$. Then there exists a constant $C \in \mathbb{N}$ such that for any nonzero ideal $I$ in $B$, there exists a nonzero element $\alpha \in I$ such that $N_B(\alpha) \le C \cdot N_B(I)$. Hence the ideal class group of $B$ is finite.

Proof. Let $x_1, \dots, x_n$ be a basis for $B$ over $A$. Let $m$ be the unique positive integer such that
$m^n \le N_B(I) < (m + 1)^n$.
The fact that $A$ is a basic PID (the second property) says that there exists a $c \in \mathbb{N}$ such that for every $m$, $\#\{x \in A \mid N_A(x) \le cm\} \ge m$. Thus, for every $m$, the set $S_m := \{x \in A \mid N_A(x) \le 2cm\}$ has at least $2m \ge m + 1$ elements. Hence the set $S_m x_1 + \cdots + S_m x_n$ has at least $(m + 1)^n$ distinct elements. Since $(m + 1)^n > |B/I|$, there exist two distinct elements $s$ and $t$ in the set $S_m x_1 + \cdots + S_m x_n$ that are congruent mod $I$. Write
$s = \sum_{i=1}^{n} a_i x_i$ and $t = \sum_{i=1}^{n} b_i x_i$, with $a_i, b_i \in S_m$. Then
$s - t = \sum_{i=1}^{n} (a_i - b_i) x_i$
is a nonzero element of $I$ and by the third property of Definition 1, there is a $C_0 \in \mathbb{N}$ such that
$N_A(a_i - b_i) \le C_0 (N_A(a_i) + N_A(b_i)) \le C_0 \cdot 2 \cdot 2cm$.
Thus Lemma 2 implies that there is a $C_1 \in \mathbb{N}$ such that
$N_B(s - t) \le C_1 \cdot \max_i\{N_A(a_i - b_i)\}^n \le C_1 (C_0 \cdot 4cm)^n$,
and thus
$\dfrac{N_B(s - t)}{N_B(I)} \le \dfrac{C_1 (C_0 \cdot 4cm)^n}{m^n} = C_1 (C_0 \cdot 4c)^n$.
Taking $\alpha = s - t$ and $C = C_1 (C_0 \cdot 4c)^n$ thus proves the first assertion of the theorem.

A well-known argument now implies the finiteness of the class group of $B$ (see, e.g., [14, Lemmas V.3.8 and V.3.9]). We give the argument here for the convenience of the reader. If $I$ is an ideal of $B$, we write $[I]$ for the corresponding ideal class in the class group $\mathrm{Cl}(B)$, and we write $N$ for $N_B$. We will first show that any ideal class $c \in \mathrm{Cl}(B)$ contains an ideal $I$ such that $N(I) \le C$. Let $J$ be an ideal of $B$ such that $c = [J]$. By the first assertion of the theorem, there exists a nonzero $\alpha \in J$ and a $C \in \mathbb{N}$ such that $N(\alpha) \le C \cdot N(J)$. Since $\alpha B \subseteq J$, the unique factorization of ideals in $B$ implies that we have $\alpha B = IJ$ for some ideal $I$. Since $[\alpha B]$ is the trivial ideal class, $[I] = [J]^{-1} = c^{-1}$, and by the multiplicativity of $N$,
$N(J)N(I) = N(\alpha) \le C \cdot N(J)$,
so $N(I) \le C$. We have thus shown what we wanted for $c^{-1}$. But $c \in \mathrm{Cl}(B)$ was arbitrary, so it holds for all $c$. Now, since there are only finitely many ideals of norm below a given bound (see, e.g., [14, Lemma V.3.7]) we conclude that there can only be finitely many classes $c \in \mathrm{Cl}(B)$.

The theorem above together with Proposition 1 implies that rings of integers of global fields have finite ideal class group.

Let $D$ be an integral domain with field of fractions $K$. A ring $R$ such that $D \subseteq R \subseteq K$ is called an overring of $D$. The following is a known result.

Lemma 3. Let $D$ be a Dedekind domain with finite class group. Then any overring $R$ of $D$ is a Dedekind domain with finite class group.

Proof. It is well known that $R$ is a Dedekind domain (see, e.g., [5, Lemma 1-1]). Since the class group of $D$ is finite, hence torsion, a result independently due to Davis [8, Theorem 2], Gilmer and Ohm [9, Cor. 2.6], and Goldman [10, §1, Corollary (1)] implies that $R$ is the localization of $D$ at a multiplicative subset of $D$.
Then, by a straightforward argument (see [5, Propositions 1 and 2, Corollaries 1–3]), the class group of R is a quotient of the class group of D, hence is finite.
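Returning to Theorem 1, the pigeonhole step of its proof can be sketched in a concrete case. The example below is an illustration under assumptions not taken from the article: $A = \mathbb{Z}$ (so $c = C_0 = 1$), $B = \mathbb{Z}[\sqrt{-5}]$ with basis $(1, \sqrt{-5})$, and the ideal $I = (2, 1 + \sqrt{-5})$ of norm 2, described as the set of $a + b\sqrt{-5}$ with $a \equiv b \pmod 2$. Function names are hypothetical.

```python
from itertools import product


def norm(a, b):
    """N_B(a + b*sqrt(-5)) = a^2 + 5*b^2 (absolute value of the field norm)."""
    return a * a + 5 * b * b


def residue_mod_I(a, b):
    """Class of a + b*sqrt(-5) in B/I for I = (2, 1 + sqrt(-5)).

    Assumes the description I = {a + b*sqrt(-5) : a ≡ b (mod 2)}.
    """
    return (a - b) % 2


def small_element(m, c=1):
    """Find two distinct elements of S_m*x_1 + S_m*x_2 congruent mod I.

    Returns the coordinates of their difference, a nonzero element of I.
    """
    S = range(-2 * c * m, 2 * c * m + 1)   # S_m = {x in Z : |x| <= 2cm}
    first_seen = {}
    for a, b in product(S, repeat=2):
        key = residue_mod_I(a, b)
        if key in first_seen:
            a0, b0 = first_seen[key]
            return (a - a0, b - b0)
        first_seen[key] = (a, b)
    return None


if __name__ == "__main__":
    # N_B(I) = 2 and n = 2, so m = 1 satisfies m^n <= N_B(I) < (m + 1)^n.
    alpha = small_element(m=1)
    print(alpha, norm(*alpha))
```

In this setting one may take $C_1 = 6$ in Lemma 2 (since $a^2 + 5b^2 \le 6\max\{|a|, |b|\}^2$), so the proof's constant is $C = C_1(C_0 \cdot 4c)^n = 96$. The search returns the first collision it meets, which need not be the element of smallest norm in $I$.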
The class of overrings of rings of class (G) includes all $S$-integer rings, for any finite set $S$ of places containing the Archimedean ones. On the other hand, by Theorem 3 it will follow that a ring of $S$-integers is not of class (G) unless it is a ring of integers of a global field.

4. RINGS OF CLASS (G) AND GLOBAL FIELDS. For the reader’s convenience, we state a few definitions and results from Artin’s book [2, Chapter 1].

Definition 4. An absolute value (called “valuation” in [2, Chapter 1]) of a field $K$ is a function $|\cdot| : K \to \mathbb{R}$, $x \mapsto |x|$, satisfying the following conditions:
1. $|x| \ge 0$ and $|x| = 0$ if and only if $x = 0$;
2. $|xy| = |x| \cdot |y|$;
3. there exists a constant $c \in \mathbb{R}$, $c \ge 1$, such that if $|x| \le 1$, then $|1 + x| \le c$.

Note that the third condition is equivalent to $|\cdot|$ satisfying the quasi-triangle inequality (cf. the third condition in Definition 1). Indeed, let $c$ be as in the third condition above and let $x, y \in K$. If either $x = 0$ or $y = 0$, the quasi-triangle inequality is trivially satisfied, so we may assume that $x \ne 0$, $y \ne 0$, and without loss of generality $|x/y| \le 1$. Then $|1 + x/y| \le c \le 2c(1 + |x/y|)$, so multiplying by $|y|$ gives $|x + y| \le C(|x| + |y|)$, with $C = 2c$. Conversely, if $C \in \mathbb{R}$ is a positive number such that $|x + y| \le C(|x| + |y|)$ holds for all $x, y \in K$, then in particular $|1 + x| \le C(1 + |x|)$, and by making $C$ larger if necessary, we can take $C \ge 1$. Thus, if $|x| \le 1$, we obtain $|1 + x| \le 2C$, so the third condition in Definition 4 holds with $c = 2C$.

The trivial absolute value is the one for which $|x| = 1$ for all nonzero $x \in K$. One defines two absolute values $|\cdot|_1$ and $|\cdot|_2$ to be equivalent if for any $x \in K$, $|x|_1 < 1$ if and only if $|x|_2 < 1$. It turns out that every absolute value is equivalent to one for which the usual triangle inequality holds. Let $K$ be a field and $|\cdot|_v$ an absolute value of $K$. The absolute value $|\cdot|_v$ is said to be non-Archimedean if for all $x, y \in K$, $|x + y|_v \le \max\{|x|_v, |y|_v\}$; otherwise $|\cdot|_v$ is said to be Archimedean.
We call $|\cdot|_v$ discrete if $|K|_v$ is a discrete subset of $\mathbb{R}$. If $|\cdot|_v$ is non-Archimedean, then $\mathcal{O}_v = \{x \in K \mid |x|_v \le 1\}$ is a ring (called the valuation ring at $v$), $\mathfrak{p}_v = \{x \in K \mid |x|_v < 1\}$ is a maximal ideal of $\mathcal{O}_v$, and the field $k_v = \mathcal{O}_v/\mathfrak{p}_v$ is called the residue class field at $v$. Let $\Sigma$ be a set of nonequivalent and nontrivial absolute values of $K$. Consider the set
$k_0 = \{x \in K \mid |x|_v \le 1 \text{ for all } v \in \Sigma\}$.
It is not hard to show that $k_0$ is a field if and only if $\Sigma$ contains no Archimedean prime (see [2, Chapter 12, Section 1]). In this case we may consider $k_0$ as a subfield of each $k_v$. We will use the following fundamental result, due to Artin and Whaples [3].

Theorem 2. Suppose that $K$ is a field with a set $\Sigma$ of mutually nonequivalent and nontrivial absolute values such that the following two conditions hold:
1. For every $x \in K^\times$, $|x|_v = 1$ for all but a finite number of $v \in \Sigma$ and $\prod_{v \in \Sigma} |x|_v = 1$;
2. there is at least one $v \in \Sigma$ such that either $v$ is Archimedean or $v$ is discrete and $k_v$ is finite.
Then $K$ is a global field.

A comment on the proof of this theorem. The proof of [2, Chapter 12, Theorem 3] shows that under the conditions of Theorem 2, if $\Sigma$ has at least one Archimedean absolute value, then $K$ is a number field and otherwise $K$ is a finite extension of $k_0(t)$, for some $t$ transcendental over $k_0$. In the latter case, $k_0$ is a subfield of the finite field $k_v$, so $k_0$ itself is finite and thus $K$ is a global function field.

We now come to the main result of the present section.

Theorem 3. Let $A$ be a finite quotient PID such that its ideal norm $N_A$ satisfies the quasi-triangle inequality. Then the field of fractions $K$ of $A$ is a global field and $A$ is a ring of integers of $K$.

Proof. The ideal norm $N_A$ extends to $K$ via $N_A(a/b) = N_A(a)/N_A(b)$, for $a, b \in A$, $b \ne 0$, and it is immediately checked that $N_A$ on $K$ satisfies the quasi-triangle inequality. Thus $N_A$ on $K$ is an absolute value. We also have $\mathfrak{p}$-adic absolute values for every nonzero prime ideal $\mathfrak{p}$ of $A$. Indeed, for nonzero $a \in A$, let $v_\mathfrak{p}(a)$ denote the largest integer $n$ such that $\mathfrak{p}^n$ divides the ideal $aA$, and define
$|a|_\mathfrak{p} = |A/\mathfrak{p}|^{-v_\mathfrak{p}(a)}$.
Just like $N_A$, the function $|\cdot|_\mathfrak{p} : a \mapsto |a|_\mathfrak{p}$ extends to $K$ via $|a/b|_\mathfrak{p} = |a|_\mathfrak{p}/|b|_\mathfrak{p}$, and this defines an absolute value on $K$. Note that $N_A$ is not equivalent to any of the absolute values $|\cdot|_\mathfrak{p}$ because if $p \in A$ is a generator of a prime ideal $\mathfrak{p}$, we have $|p|_\mathfrak{p} = |A/\mathfrak{p}|^{-1} < 1$, while $N_A(p) = |A/\mathfrak{p}| > 1$.

We will now verify that $K$ together with the absolute values $N_A$ and $|\cdot|_\mathfrak{p}$, where $\mathfrak{p}$ runs through the prime ideals of $A$, satisfies the conditions of Theorem 2.

Condition 1: Since $A$ is not a field (by the definition of basic PID), it has a nonzero proper ideal, so the ideal norm $N_A$ is not the trivial absolute value on $K$. For any nonzero $a, b \in A$, there are only finitely many prime elements of $A$ that divide $a$ or $b$, so $|a/b|_\mathfrak{p} = 1$ for all but finitely many $\mathfrak{p}$. Set $|x|_\infty := N_A(x)$ for $x \in K$ and let
$\Sigma_f = \{\mathfrak{p} \mid \mathfrak{p} \ne (0) \text{ prime ideal of } A\}$ and $\Sigma = \Sigma_f \cup \{\infty\}$.
Note that $\Sigma_f$ is nonempty since $A$ is not a field. For a nonzero $a \in A$, let $aA = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_r^{e_r}$ be the prime ideal factorization, where $e_i = v_{\mathfrak{p}_i}(a)$. Then
$\prod_{i \in \Sigma} |a|_i = |\mathfrak{p}_1^{e_1}|_{\mathfrak{p}_1} \cdots |\mathfrak{p}_r^{e_r}|_{\mathfrak{p}_r} \cdot |a|_\infty = |A/\mathfrak{p}_1|^{-e_1} \cdots |A/\mathfrak{p}_r|^{-e_r} \cdot |A/aA|$
$= |A/\mathfrak{p}_1|^{-e_1} \cdots |A/\mathfrak{p}_r|^{-e_r} \cdot |A/\mathfrak{p}_1|^{e_1} \cdots |A/\mathfrak{p}_r|^{e_r} = 1$,
where for the penultimate equality we have used the Chinese remainder theorem and the fact that $|A/\mathfrak{p}^n| = |A/\mathfrak{p}|^n$, for any prime $\mathfrak{p}$ and $n \in \mathbb{N}$ (see [14, Lemma V.3.4]). Thus also $\prod_i |a/b|_i = 1$ for any nonzero $a, b \in A$.

Condition 2: Let $\mathfrak{p} \in \Sigma_f$. Then $|\cdot|_\mathfrak{p}$ is discrete since its values are of the form $|A/\mathfrak{p}|^n$, $n \in \mathbb{Z}$. Moreover, the valuation ring $\mathcal{O}_\mathfrak{p}$ contains $A$, so by [13, Theorem 2.3] $\mathcal{O}_\mathfrak{p}$ is a finite quotient domain. In particular, $k_\mathfrak{p} = \mathcal{O}_\mathfrak{p}/\mathfrak{p}$ is finite.

Thus Theorem 2 implies that $K$ is a global field. By [2, Chapter 12, Corollary 1 and Theorem 4] the set $\{|\cdot|_v \mid v \in \Sigma\}$ consists of all the nontrivial absolute values on $K$ (up to equivalence). Since $x \in K$ lies in $A$ if and
only if $v_\mathfrak{p}(x) \ge 0$ for all $\mathfrak{p} \in \Sigma_f$, we have
$A = \{x \in K \mid |x|_\mathfrak{p} \le 1 \text{ for all } \mathfrak{p} \in \Sigma_f\} = \bigcap_{\mathfrak{p} \in \Sigma_f} \mathcal{O}_\mathfrak{p}$. (1)

If $K$ is a number field, let $\mathcal{O}_K$ be its ring of integers. If $K$ is a function field, we define $\mathcal{O}_K$ as follows. As noted just before Theorem 2, $k_0$ is a subfield of any $k_\mathfrak{p}$, so $k_0$ is finite. Moreover, (1) implies that $k_0 \subset A$. Let $t \in A$ be an element such that $t \notin k_0$ (such an element exists since $A$ is not a field); then $N_A(t) > 1$ (otherwise $|t|_v \le 1$ for all $v \in \Sigma$, hence $t \in k_0$). By the proof of [2, Chapter 12, Theorem 3], $K$ is a finite extension of the field of fractions $k_0(t)$ of $k_0[t]$. In this case, let $\mathcal{O}_K$ denote the integral closure of $k_0[t]$ in $K$.

It remains to show that in either case we have $A = \mathcal{O}_K$. Let $A_0 = \mathbb{Z}$ in case $K$ is a number field and let $A_0 = k_0[t]$ otherwise. By [4, Corollary 5.22], $\mathcal{O}_K$ is the intersection of all the valuation rings of $K$ containing $A_0$, where a valuation ring $R$ of $K$ is an integral domain with field of fractions $K$ such that $x \in K$ implies $x \in R$ or $x^{-1} \in R$. It is clear that every $\mathcal{O}_\mathfrak{p}$ is a valuation ring of $K$ containing $A_0$, so that by (1) we have $\mathcal{O}_K \subseteq A$. Conversely, we claim that every valuation ring $R$ of $K$ containing $A_0$ equals some $\mathcal{O}_\mathfrak{p}$. Indeed, let $R$ be a valuation ring of $K$ containing $A_0$. Then $R$ is integrally closed [4, Proposition 5.18], so $\mathcal{O}_K \subseteq R$. If $\mathfrak{m}$ is the maximal ideal of $R$, then $\mathfrak{q} := \mathcal{O}_K \cap \mathfrak{m}$ is a prime ideal of $\mathcal{O}_K$, and as $R$ is a local ring [4, Proposition 5.18], we have $\mathcal{O}_{K,\mathfrak{q}} \subseteq R$, where $\mathcal{O}_{K,\mathfrak{q}}$ is the localization of $\mathcal{O}_K$ at the prime ideal $\mathfrak{q}$. Since $\mathcal{O}_K$ is a Dedekind domain, $\mathcal{O}_{K,\mathfrak{q}}$ is a discrete valuation ring, hence a valuation ring, so by [4, Theorem 5.21] we must have $\mathcal{O}_{K,\mathfrak{q}} = R$. Now, since $A$ is a PID it is integrally closed (and $A$ contains $A_0$), so we must have $\mathcal{O}_K \subseteq A$. Let $\mathfrak{p}$ be a prime ideal of $A$ such that $\mathfrak{q} = \mathcal{O}_K \cap \mathfrak{p}$ (i.e., $\mathfrak{p}$ can be any prime ideal dividing the ideal $\mathfrak{q}A$). Then $\mathcal{O}_K \subseteq A \subseteq \mathcal{O}_\mathfrak{p}$, so $\mathcal{O}_{K,\mathfrak{q}} \subseteq \mathcal{O}_\mathfrak{p}$ and by [4, Theorem 5.21] $\mathcal{O}_{K,\mathfrak{q}} = \mathcal{O}_\mathfrak{p}$ and thus $R = \mathcal{O}_\mathfrak{p}$. It thus follows from (1) and [4, Corollary 5.22] that $A = \mathcal{O}_K$.
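The product formula verified under Condition 1 in the proof above can be checked numerically in the model case $A = \mathbb{Z}$, $K = \mathbb{Q}$, where $N_A$ extends to the usual absolute value and $|A/\mathfrak{p}| = p$. The following is a minimal sketch under that assumption; the helper names are hypothetical.

```python
from fractions import Fraction


def prime_factors(n):
    """Set of primes dividing the positive integer n (trial division)."""
    primes, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            primes.add(p)
            n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes


def valuation(n, p):
    """Largest e such that p**e divides the nonzero integer n."""
    n, e = abs(n), 0
    while n % p == 0:
        n //= p
        e += 1
    return e


def p_adic_abs(x, p):
    """|x|_p = p**(-v_p(x)) for a nonzero rational x."""
    v = valuation(x.numerator, p) - valuation(x.denominator, p)
    return Fraction(1, p) ** v


def product_of_absolute_values(x):
    """Product of |x|_infinity and all |x|_p over primes p dividing x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("x must be nonzero")
    primes = prime_factors(abs(x.numerator)) | prime_factors(x.denominator)
    result = abs(x)                      # |x|_infinity = N_Z(a) / N_Z(b)
    for p in primes:
        result *= p_adic_abs(x, p)       # all other primes contribute 1
    return result


if __name__ == "__main__":
    print(product_of_absolute_values(Fraction(-360, 77)))
```

Exact rational arithmetic is used so that the product is compared with 1 without rounding.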
Let $B$ be a ring of class (G) over the basic PID $A$, let $K$ be the fraction field of $A$, and let $L$ be the fraction field of $B$. With this notation, we have the following result.

Lemma 4. The field extension $L/K$ is of finite degree and $B$ is the integral closure of $A$ in $L$. On the other hand, let $L'/K$ be a finite separable extension and let $B'$ be the integral closure of $A$ in $L'$. Then $B'$ is a ring of class (G) over $A$.

Proof. Let $S = A \setminus \{0\}$. Then (by a simple argument) $S^{-1}B$ is finitely generated as a vector space over $S^{-1}A = K$. Thus $S^{-1}B$ is an integral domain that is a finite-dimensional vector space over $K$, so $S^{-1}B$ is a field, that is, $S^{-1}B = L$. Hence $L/K$ is finite. Furthermore, since $B$ is finitely generated over $A$, $B$ is integral over $A$ (see, e.g., [14, Proposition I.2.10]). Thus $B$ lies inside the integral closure $C$ of $A$ in $L$. Since $B \subseteq C \subseteq L$, the fraction field of $C$ is $L$. Any $x \in C$ is integral over $A$, hence integral over $B$. Since $B$ is a Dedekind domain it is integrally closed, so $x \in B$. Thus $C = B$, that is, $B$ is the integral closure of $A$ in $L$.

Moreover, it is well known that $B'$ is a Dedekind domain that is finitely generated over $A$ (see, e.g., [14, Theorems I.4.7 and I.6.2]). Since $B'$ is torsion-free, it is free over $A$ and thus $B'$ is a ring of class (G) over $A$.

Corollary 1. Let $A$ be a finite quotient PID such that its ideal norm $N_A$ satisfies the quasi-triangle inequality. Let $B$ be a Dedekind domain that is a finitely generated and free $A$-module. Then $B$ is a ring of integers of a global field. In particular, if $A$ is a basic PID and $B$ is of class (G) over $A$, then $B$ is a ring of integers of a global field.
Proof. Let K and L be the fraction field of A and B, respectively. By Lemma 4, L is a global field and B is the integral closure of A in L. By Theorem 3, A is the integral closure in K of A0 , where A0 is either Z or Fq [t], for some t ∈ K. Let C be the integral closure of A0 in L. Since A0 ⊆ A we trivially have C ⊆ B. By the transitivity of integrality [14, Proposition I.2.18] applied to A0 ⊆ A ⊆ B, we have that B is integral over A0 , hence B ⊆ C and so B = C. We have proved that B is the integral closure of Z or Fq [t] in the global field L and thus B is a ring of integers in L. One may ask whether there exists a Dedekind domain B that is finitely generated and free over a PID A with finite quotients and such that B has infinite class group. Theorem 1 and Corollary 1 show that if such an example exists, then the quasi-triangle inequality must fail for ideal norm NA . We note that Goldman [10] and Heitmann [11] have given examples of Dedekind domains with finite quotients and infinite class groups, but we do not know whether these examples are finitely generated and free over some PID. It is a trivial fact that there exist Dedekind domains (even PIDs) with finite class groups that are not overrings of any ring of integers of a global field. Indeed, the polynomial ring C[X] is a PID but is not a finite quotient domain, so cannot be an overring of any finite quotient domain (finite quotient domains are stable under localization). However, we do not know whether there exists a finite quotient Dedekind domain with finite class group that is not the overring of any ring of integers of a global field. ACKNOWLEDGMENTS. I wish to thank Pete L. Clark and D. Lorenzini for pointing out inaccuracies in a previous version and for comments that helped to improve the exposition.
ORCID Alexander Stasinski
http://orcid.org/0000-0003-0415-5918
REFERENCES [1] Adkins, W. A., Weintraub, S. H. (1992). Algebra: An Approach via Module Theory. Graduate Texts in Mathematics, Vol. 136. New York: Springer-Verlag. [2] Artin, E. (1967). Algebraic Numbers and Algebraic Functions. New York-London-Paris: Gordon and Breach Science Publishers. [3] Artin, E., Whaples, G. (1945). Axiomatic characterization of fields by the product formula for valuations. Bull. Amer. Math. Soc. 51(7): 469–492. [4] Atiyah, M. F., Macdonald, I. G. (1969). Introduction to Commutative Algebra. Reading, MA-LondonDon Mills, Ont.: Addison-Wesley Publishing Co. [5] Claborn, L. (1965). Dedekind domains and rings of quotients. Pacific J. Math. 15: 59–64. [6] Clark, P. L. (2010). Is there a “purely algebraic” proof of the finiteness of the class number? MathOverflow. mathoverflow.net/q/50557 [7] Curtis, C. W., Reiner, I. (1962). Representation Theory of Finite Groups and Associative Algebras. Pure and Applied Mathematics, Vol. XI. New York-London: Interscience Publishers. [8] Davis, E. D. (1964). Overrings of commutative rings. II. Integrally closed overrings. Trans. Amer. Math. Soc. 110: 196–212. [9] Gilmer, R., Ohm, J. (1964). Integral domains with quotient overrings. Math. Ann. 153: 97–103. [10] Goldman, O. (1964). On a special class of Dedekind domains. Topology. 3(suppl. 1): 113–118. [11] Heitmann, R. C. (1974). PID’s with specified residue fields. Duke Math. J. 41(3): 565–582. [12] Higman, D. G., McLaughlin, J. E. (1959). Finiteness of class numbers of representations of algebras over function fields. Mich. Math. J. 6: 401–404. [13] Levitz, K. B., Mott, J. L. (1972). Rings with finite norm property. Canadian J. Math. 24: 557–565. [14] Lorenzini, D. (1996). An Invitation to Arithmetic Geometry. Graduate Studies in Mathematics, Vol. 9. Providence, RI: American Mathematical Society. [15] Ramakrishnan, D., Valenza, R. J. (1999). Fourier Analysis on Number Fields. Graduate Texts in Mathematics, Vol. 186. New York: Springer-Verlag.
[16] Reiner, I. (1975). Maximal Orders. London Mathematical Society Monographs, No. 5. London-New York: Academic Press. [17] Swan, R. G. (1970). K-theory of Finite Groups and Orders. Lecture Notes in Mathematics, Vol. 149. Berlin-New York: Springer-Verlag. [18] Zassenhaus, H. (1938). Neuer beweis der endlichkeit der klassenzahl bei unimodularer aequivalenz endlicher ganzzahliger substitutionsgruppen. Abh. Math. Semin. Univ. Hamb. 12: 276–288. ALEXANDER STASINSKI is a professor at Durham University in the UK. He works on the representation theory of finite and compact p-adic groups and related areas of algebra and zeta functions. Department of Mathematical Sciences, Durham University, Durham DH1 3LE, UK [email protected]
A Pi in the Face of Tau

Once they served me pi in school,
But now you offer tau.
I think you take me for a fool,
And that I won’t allow.

Pi gives me from diameter
Circumference’s bound.
Tau’d force the radius on me, Sir,
As though that’s more profound.

But I’m too old to change my way.
So I hope you won’t feel hurt
If from your spread I push away,
’Cause pi’s my just dessert!

Submitted by Howard L. Ritter, Jr., M.D., Perrysburg, OH
doi.org/10.1080/00029890.2021.1855038 MSC: Primary 00A99
Fermat’s Last Theorem Implies Euclid’s Infinitude of Primes Christian Elsholtz Abstract. We show that Fermat’s last theorem and a combinatorial theorem of Schur on monochromatic solutions of a + b = c implies that there exist infinitely many primes. In particular, for small exponents such as n = 3 or 4 this gives a new proof of Euclid’s theorem, as in this case Fermat’s last theorem has a proof that does not use the infinitude of primes. Similarly, we discuss implications of Roth’s theorem on arithmetic progressions, Hindman’s theorem, and infinite Ramsey theory toward Euclid’s theorem. As a consequence we see that Euclid’s theorem is a necessary condition for many interesting (seemingly unrelated) results in mathematics.
1. INTRODUCTION. Imagine that the set of positive integers has only finitely many primes. We will investigate consequences, and to become more creative with this, we imagine we live in an entirely different world, namely in a “world with only finitely many primes.” If you are a number theorist, then you will realize that a major part of analytic number theory just vanishes. One of the implications of this article is that algebraic number theorists and combinatorialists would live in a very different world, too. The reason is that “Fermat’s last theorem” even in the first interesting case with exponent 3 would be wrong, that major parts of the modern subject of additive combinatorics would disappear, and that even basic results of infinite Ramsey theory would not exist. If you wonder why this is the case, we invite you to a journey of unexpected discoveries in the fictional “world with only finitely many primes”!

There are many proofs of Euclid’s theorem stating that there exist infinitely many primes. There is a very thorough bibliographic collection of 70 pages on a multitude of proofs of Euclid’s theorem, due to Meštrović [21]. Other collections are given by Ribenboim [25] and a very recent one by Granville [17]. For some recent proofs, see [27, 34]. Many of these proofs make use of an infinite sequence with mutually coprime integers, such as $F_n = 2^{2^n} + 1$ (Goldbach, in a letter to Euler 1730), or primitive divisors of certain recursive sequences (see, e.g., [27]). Furstenberg [14] made use of a suitably defined topology to prove Euclid’s theorem. A number of proofs have used the exponents of a prime factorization; see, for example, [10, 11, 22]. Even more recently, two proofs [1, 18] made use of van der Waerden’s theorem applied to the patterns of exponents. Alpoge [1] introduced van der Waerden’s theorem to this subject, and Granville [18] combined Alpoge’s idea with a theorem of Fermat, namely that there are no four squares in arithmetic progression.
Inspired by this new type of proof, we investigate which type of purely combinatorial results can be combined with some kind of arithmetic result to give new proofs that there exist infinitely many primes. In this way, we link Euclid’s theorem to some very beautiful and significant results of modern mathematics. doi.org/10.1080/00029890.2021.1856544 MSC: Primary 11A41, Secondary 05D10; 11B75 c 2021 The Author(s). Published with license by Taylor & Francis Group, LLC. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Here is a brief outline of the article. In Section 2, we link Euclid’s theorem to Fermat’s last theorem, eventually proved by Wiles [33], and to a theorem of Schur (1916), which is often considered to be the starting point of combinatorial number theory. In Section 3, the link is to a theorem of Roth (1953) on the density of integers without arithmetic progressions. An independent elementary proof of Euclid’s theorem is a by-product, in Section 4. Section 5 has some discussion about varying the number-theoretic or combinatorial input. Section 6 uses a theorem of Hindman (1974) on an infinite extension of Schur’s theorem, and Section 7 gives two proofs using infinite Ramsey theory. Roth’s theorem and its extension by Szemerédi [31], and quantitative versions thereof (e.g., due to Bourgain [3], Gowers [15], Green and Tao [19]), have inspired many excellent mathematicians and have had tremendous impact on the relatively young field of additive combinatorics.

2. FERMAT’S LAST THEOREM IMPLIES EUCLID’S THEOREM. We first state Schur’s theorem and then the main result of this article.

Lemma 1 (Schur’s theorem [28], 1916). For every positive integer $t$, there exists an integer $s_t$ such that if one colors each integer $m \in [1, s_t]$ using one of $t$ distinct colors, then there is a monochromatic solution of $a + b = c$, where $a, b, c \in [1, s_t]$.

Theorem 1. For $n \ge 3$ let FLT($n$) denote the statement “There are no solutions of the equation $x^n + y^n = z^n$ in positive integers $x, y, z$.” Then “FLT($n$) is true” and Schur’s theorem imply that there exist infinitely many primes.

Theorem 1 gives a new proof of Euclid’s theorem for those exponents for which a proof of FLT($n$) independent of the infinitude of primes exists. This is certainly the case for $n = 3, 4, 5$, where elementary proofs exist (see [9, 24]). It then trivially follows for infinitely many exponents, for example for all multiples of 3.
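Schur’s theorem (Lemma 1) can be confirmed by exhaustive search in the smallest nontrivial case $t = 2$. The sketch below is an illustration only; it assumes that $a = b$ is allowed in a solution of $a + b = c$, and the function names are hypothetical. Under that reading it finds that $[1, 4]$ admits a 2-coloring with no monochromatic solution while $[1, 5]$ does not, so $s_2 = 5$ works.

```python
from itertools import product


def has_monochromatic_sum(coloring):
    """True if some a + b = c is monochromatic; coloring[i - 1] is the color of i.

    Solutions with a = b are allowed (an assumption of this sketch).
    """
    n = len(coloring)
    for a in range(1, n + 1):
        for b in range(a, n - a + 1):      # ensures c = a + b <= n
            if coloring[a - 1] == coloring[b - 1] == coloring[a + b - 1]:
                return True
    return False


def schur_property_holds(n, t):
    """True if every t-coloring of [1, n] has a monochromatic a + b = c."""
    return all(has_monochromatic_sum(c) for c in product(range(t), repeat=n))


if __name__ == "__main__":
    print(schur_property_holds(4, 2), schur_property_holds(5, 2))
```

The search space has $t^n$ colorings, so this approach is only practical for very small $t$ and $n$.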
The application of Fermat’s last theorem with general $n$ to Euclid’s theorem might possibly compete for the most indirect proof, but at present the proof with general $n$ is not actually a proof at all, as Wiles’s proof makes use of the fact that there exist infinitely many primes.

We briefly show that Schur’s theorem nowadays can be seen as a direct consequence of Ramsey’s theorem [23] (1929). Ramsey’s theorem (see [4, Theorem 10.3.1]) states that, for any number $t$ of colors (let us call them $1, \dots, t$) and positive integers $n_1, \dots, n_t$ there exists an integer $R(n_1, \dots, n_t)$ such that if the edges of the complete graph on $R(n_1, \dots, n_t)$ vertices are colored, there exists an index $i$ and a monochromatic clique of size $n_i$ all of whose edges are of color $i$. In our application we only need the case $n_1 = \cdots = n_t = 3$. Let $\chi : \{1, \dots, N\} \to \{1, \dots, t\}$ be the coloring of the first $N = R(3, \dots, 3)$ integers. Let us define a coloring of the edges of the complete graph with vertices $\{1, 2, \dots, N\}$ as follows: The edge $(i, j)$ is given the color $\chi(|i - j|)$. Ramsey’s theorem guarantees that there is a monochromatic triangle. Let us denote the vertices of this triangle by $(i, j, k)$, where $i < j < k$. Let $a = j - i$, $b = k - j$, and $c = k - i$. Then $a, b, c$ all have the same color and $a + b = c$ holds. This gives the required monochromatic solution.

Proof of Theorem 1. Suppose there exist only finitely many primes $p_1, \dots, p_k$ (say). Every positive integer can be written as $m = \prod_{i=1}^{k} p_i^{e_i}$. We write integers as an $n$th power times an $n$th power-free number. Hence, writing $e_i = nq_i + r_i$ with $0 \le r_i \le n - 1$ gives $m = \bigl(\prod_{i=1}^{k} p_i^{q_i}\bigr)^n \prod_{i=1}^{k} p_i^{r_i} = N(m) \times R(m)$ (say). We use $n^k$ distinct colors, denoted by $(t_1, \dots, t_k)$, $0 \le t_i \le n - 1$, and we color the integer
FERMAT’S LAST THEOREM GUARANTEES PRIMES
$m = \prod_{i=1}^{k} p_i^{nq_i + r_i}$ by $(r_1, \ldots, r_k)$. By Schur’s theorem there exists a monochromatic triple $(a, b, c)$ such that $c = a + b$ and with a fixed color $(r_1, \ldots, r_k)$, corresponding to $R = \prod_{i=1}^{k} p_i^{r_i}$. Here $a, b, c$ all contain the same factor $R$ and we can write $a, b, c$ as $a = N(a)R$, $b = N(b)R$, $c = N(c)R$, with positive integers $N(a), N(b), N(c)$. Dividing by $R$ gives $N(a) + N(b) = N(c)$ with $n$th powers, which is a contradiction to FLT($n$).

It might seem that we require unique factorization, as for an integer with distinct prime factorizations the coloring is not well-defined. However, for an application of Schur’s theorem it is perfectly fine if an integer $m$ with hypothetical distinct prime factorizations is assigned only one of the colors. (Assigning all corresponding colors to $m$ would be an alternative, but then $\chi$ would not actually be a function.)

It is of historic interest to note that Schur’s motivation was to study Fermat’s equation modulo primes. Dickson had proved that there is no congruence obstruction to the Fermat equation, and Schur [28] gave a simple proof of this.

3. ROTH’S THEOREM IMPLIES EUCLID’S THEOREM. The Fermat equation has also been studied with coefficients. The case $x^n + y^n = 2z^n$ in positive integers has attracted special attention, as a solution in distinct positive integers would mean that there exist $n$th powers $x^n < z^n < y^n$ in arithmetic progression. It was conjectured by Dénes that for $n \ge 3$ there exist only trivial solutions with $x = y = z$. This was proved by Darmon and Merel [7] based on the methods of Wiles. Sierpiński [30] gives elementary proofs of the cases $n = 4$ (Chapter 2, §8) and $n = 3$ (Chapter 2, §14); see also [5]. We also give new proofs of Euclid’s theorem in these cases. The following result gives a matching combinatorial tool.

Lemma 2 (Roth [26]). Let $\delta > 0$ and $N \ge N(\delta)$. Every subset $S \subset [1, N]$ of at least $\delta N$ elements contains three distinct elements $s_1, s_2, s_3 \in S$ in arithmetic progression, i.e., $s_1 + s_3 = 2s_2$.
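As a small illustration of what Lemma 2 controls, the sketch below computes, by brute force, the size of a largest subset of $[1, N]$ without three distinct elements in arithmetic progression. This is not part of the article's argument. The helper names are hypothetical and the exhaustive search is only feasible for small $N$.

```python
from itertools import combinations


def has_three_term_ap(subset):
    """True if the subset contains distinct s1 < s2 < s3 with s1 + s3 = 2 * s2."""
    s = set(subset)
    return any((a + c) % 2 == 0 and (a + c) // 2 in s
               for a, c in combinations(sorted(s), 2))


def largest_ap_free_size(N):
    """Size of a largest subset of {1, ..., N} with no 3-term arithmetic progression."""
    for size in range(N, 0, -1):
        if any(not has_three_term_ap(sub)
               for sub in combinations(range(1, N + 1), size)):
            return size
    return 0


if __name__ == "__main__":
    for N in (4, 5, 9):
        print(N, largest_ap_free_size(N))
```

Roth’s theorem says that the ratio of this quantity to $N$ tends to 0; the toy computation only shows the first few values.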
It should be noted that there is a purely combinatorial proof of Roth’s theorem, e.g., in [16, pp. 46–49]. In contrast to van der Waerden’s and Schur’s theorem the above statement is a so-called “density version”: this result not only guarantees monochromatic solutions in some unspecified color, but even in all those colors that occur with a positive density.

Theorem 2. For $n \ge 3$ let DM($n$) denote the statement “There are no three positive $n$th powers in arithmetic progression” or equivalently “There are no solutions of the equation $x^n + y^n = 2z^n$ in positive integers $x < z < y$.” Then “DM($n$) is true” and Roth’s theorem imply that there exist infinitely many primes.

Proof. We first prove the following (possibly surprising) lemma.

Lemma 3. Suppose there exist only finitely many primes $p_1 < \cdots < p_k$. The set of $n$th powers has positive density in the set of all integers, i.e., there exists some $\delta = \delta(n, k) > 0$ such that for all $N$ the number of $n$th powers in $[1, N]$ is at least $\delta N$.

Proof of lemma. We prove this by dividing a lower bound approximation of the number of $n$th powers in $[1, N]$ by an upper bound approximation of all integers in $[1, N]$, both counted by means of exponent patterns. The upper bound on the number of possible exponent patterns $(e_1, e_2, \ldots, e_k)$ follows from $p_1^{e_1} \cdots p_k^{e_k} \le N$, which gives $e_i \le \frac{\log N}{\log p_i}$. Hence $\left(1 + \frac{\log N}{\log p_1}\right) \cdots \left(1 + \frac{\log N}{\log p_k}\right)$ is an upper bound. For the lower bound on the number of $n$th powers, we count those $e_i$ divisible by $n$ and with $p_i^{e_i} \le N^{1/k}$ for all $i = 1, \ldots, k$. We see that at least $\left\lfloor 1 + \frac{\log N}{nk \log p_1} \right\rfloor \cdots \left\lfloor 1 + \frac{\log N}{nk \log p_k} \right\rfloor$ of all integers at most $N$ are $n$th powers, which gives (for large $N$) a positive proportion of at least
$$\delta \ge \frac{\text{lower bound}}{\text{upper bound}} \ge \frac{C}{(nk)^k}, \quad \text{for some } C > 0.$$
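The exponent-pattern count in this proof can be tried out on a real, finite list of primes. The sketch below is only an illustration under that assumption: in the actual world the integers built from a fixed finite set of primes are sparse, which is exactly what the hypothetical "finitely many primes" world forbids. The function names are hypothetical.

```python
def max_exponent(p, N):
    """Largest e with p**e <= N, using integer arithmetic only."""
    e = 0
    while p ** (e + 1) <= N:
        e += 1
    return e


def count_products(N, primes, step=1):
    """Count m in [1, N] of the form prod p_i**e_i with every e_i divisible by `step`."""
    def rec(idx, value):
        if idx == len(primes):
            return 1
        total, q = 0, primes[idx] ** step
        while value <= N:
            total += rec(idx + 1, value)
            value *= q
        return total
    return rec(0, 1)


def pattern_upper_bound(N, primes):
    """Number of exponent patterns with each p_i**e_i <= N (the upper bound in the proof)."""
    bound = 1
    for p in primes:
        bound *= 1 + max_exponent(p, N)
    return bound


if __name__ == "__main__":
    primes = [2, 3]
    for N in (100, 10**4, 10**6):
        print(N, count_products(N, primes), pattern_upper_bound(N, primes),
              count_products(N, primes, step=2))
```

With `step=n` the same routine counts the $n$th powers among these integers, so the ratio of the two counts can be compared with the constant $C/(nk)^k$ from the lemma.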
With this lemma we can replace Schur’s theorem by Roth’s theorem. Roth’s theorem directly guarantees that there exists a nontrivial arithmetic progression of $n$th powers, which is in contradiction to DM($n$). (Note that in this case there is no need to divide by the factor $R$ of the first proof.)

Remark. The results by van der Waerden (used by Alpoge and Granville) and Schur or Roth (used here) are early results of Ramsey theory. The numerical bounds on $s_t$ implied by Schur’s theorem are moderate, compared to the very quickly increasing bounds in van der Waerden’s theorem. Let $s_t$ denote the least number such that for any $t$-coloring, which is a map $\chi : \{1, \ldots, s_t\} \to \{1, \ldots, t\}$, there exist $a, b, c$ with $a + b = c$ and $\chi(a) = \chi(b) = \chi(c)$. It follows from Schur’s proof that $s_t \le t!\,e$.

Note added in proof. The author would like to thank Shin-ichiro Seki for drawing attention to the paper [29]. In fact, in that paper Shin-ichiro Seki shows that Roth’s theorem, together with DM(3) and a method by Erdős, gives Euler’s theorem, namely that the sum of reciprocals of primes diverges.

4. POSITIVE DENSITY GIVES A NEW ELEMENTARY PROOF. The observation about “positive density” in Lemma 3 also leads to a short and new proof of Euclid’s theorem:

Proof. Lemma 3 says the set of $n$th powers (for any fixed $n \ge 2$) has positive density in the set of positive integers. But it is also clear that there are at most $N^{1/n}$ positive $n$th powers $x^n \le N$, contradicting the lower bound of $\delta N$ (for some fixed $\delta > 0$) for sufficiently large $N$.

Comparing with the bibliography [21], the proof closest in spirit appears to be Chaitin’s proof [6]. We note that the main focus of this article is not about short proofs but how seemingly remote results can be applied.

5. DISCUSSION ON VARIANTS OF THE PROOFS ABOVE: THE FRANKL–GRAHAM–RÖDL THEOREM AND FOLKMAN’S THEOREM.

1.
We now discuss that knowing something more on the combinatorial side, namely knowing about the number of monochromatic solutions, helps in reducing the number-theoretic input considerably. On the combinatorial side, Frankl, Graham, and Rödl [13] proved that with $t$ colors the number of monochromatic solutions $(a, b, c)$ of the equation $a + b = c$ with $a, b, c \in [1, N]$ increases quadratically, i.e., there is a positive constant $c_t$ such that the number $S(t, N)$ of solutions is at least $c_t N^2$. (In fact, [13] gives a direct proof for the Schur equation, but also covers much more general cases.) As in the proof of Theorem 1, the monochromatic solutions of $a + b = c$ correspond to solutions of $x^n + y^n = z^n$ in positive integers. On the number-theoretic side there are several reasons why the number of solutions is smaller, giving a contradiction to the assumption “there are finitely many primes only.” A result of Faltings [12] would give that there are at most $O(N)$ solutions of $x^n + y^n = z^n$ with $x^n, y^n, z^n \in [1, N]$ coprime in pairs. A much more
elementary approach is as follows: For odd $n$ the left-hand side of $x^n + y^n = z^n$ can be factored as $(x + y) \sum_{i=0}^{n-1} (-1)^i x^{n-1-i} y^i$. In particular, when $n = 3$ this is $x^3 + y^3 = (x + y)(x^2 - xy + y^2)$. The number of divisors of any integer $z^n \le N$ is clearly at most $\sqrt{N}$. (Actually, as we assume there are at most $k$ prime factors, this can be improved to $C_k (\log N)^k$.) Hence the number-theoretic upper bound of at most $N$ values of $z^n$ with at most $\sqrt{N}$ factorizations each and the combinatorial lower bound of at least $c_t N^2$ solutions contradict each other. This remark also applies in the situation of $x^n + y^n = 2z^n$, as using a result of Varnavides [32] one can also prove that in this situation there would be at least $c_t N^2$ many solutions, with $x, y, z \le N$, contradicting as before the number-theoretic upper bound.

2. For the combinatorial lemma there are other alternatives. For example, a theorem of Folkman [16, p. 81] guarantees much larger monochromatic structures than Schur’s theorem does: For every number $t$ of colors, every coloring $\chi : \mathbb{N} \to \{1, \ldots, t\}$, and every $s \in \mathbb{N}$, there exist $N_{s,t}$ and $a_1, \ldots, a_s \in [1, N_{s,t}]$ with the property that all nontrivial subset sums $\sum_{i \in I} a_i$, where $I \subseteq \{1, \ldots, s\}$ is nonempty, are monochromatic. In analogy with the proof of Theorem 1, this would mean, in the special case $s = 3$, applied with the same coloring and after dividing by the common factor $R$, that all of $a_1, a_2, a_3, a_1 + a_2, a_1 + a_3, a_2 + a_3, a_1 + a_2 + a_3$ are $n$th powers. Proving that this is impossible could be easier than proving FLT($n$), as FLT($n$) corresponds to $s = 2$ with fewer conditions. But we are not aware of any literature on this.

6. HINDMAN’S THEOREM IMPLIES EUCLID’S THEOREM. Let us explicitly write down an extreme form of the above remark on Folkman’s theorem. An extension of Folkman’s theorem is Hindman’s theorem [20]; see also [2] and [16, p. 85].

Lemma 4. For any integer $t \ge 2$ and any $t$-coloring $\chi : \mathbb{N} \to \{1, \ldots, t\}$, there exists an infinite sequence $A = \{a_1, a_2, \ldots\}$ such that all subset sums $\sum_{i \in I} a_i$ over nonempty finite index sets $I \subset \mathbb{N}$ are monochromatic.

Theorem 3. Hindman’s theorem implies Euclid’s theorem.

Proof. We start as in the proof of Theorem 1. Suppose there exist only finitely many primes $p_1, \ldots, p_k$ (say). Every integer can be written as $m = \prod_{i=1}^{k} p_i^{e_i}$, $e_i = nq_i + r_i$, with $0 \le r_i \le n - 1$. That is, $m = \left(\prod_{i=1}^{k} p_i^{q_i}\right)^{n} \prod_{i=1}^{k} p_i^{r_i} = N(m) \times R(m)$ (say). We color the integer $m = \prod_{i=1}^{k} p_i^{nq_i + r_i}$ by $(r_1, \ldots, r_k)$. By Hindman’s theorem there exists an infinite set such that all nonempty finite subset sums are monochromatic with a fixed color $(r_1, \ldots, r_k)$, corresponding to $R = \prod_{i=1}^{k} p_i^{r_i}$. Dividing by $R$ gives an infinite set such that all finite subset sums are $n$th powers. This would in particular correspond to some fixed $x^n$ and infinitely many pairs $(y_i^n, z_i^n)$ of $n$th powers such that $x^n + y_i^n = z_i^n$ holds. This is clearly impossible, as the difference between consecutive $n$th powers $z^n - (z - 1)^n \ge z^{n-1}$ increases when $n \ge 2$ is fixed and $z$ increases.

Remark. The proof of Hindman’s theorem is not trivial, but it is certainly much more accessible than FLT($n$) for general $n$. Moreover, the proof of Hindman’s theorem does not make use of Euclid’s theorem, in contrast to Wiles’s proof of FLT.
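The coloring used in this proof (and in the proof of Theorem 1) can be sketched in code. The example below is an illustration only: it uses ordinary trial-division factorization, whereas the proof works in the hypothetical world where $p_1, \ldots, p_k$ are all the primes. The names `factorize` and `split_power` are hypothetical.

```python
def factorize(m):
    """Prime factorization of m >= 1 as a dict {prime: exponent} (trial division)."""
    factors, p = {}, 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors


def split_power(m, n):
    """Return (N, R, color) with m == N * R, N an n-th power and R n-th-power-free.

    `color` maps each prime p_i dividing m to the residue r_i = e_i mod n.
    """
    N, R, color = 1, 1, {}
    for p, e in factorize(m).items():
        q, r = divmod(e, n)       # e = n*q + r with 0 <= r <= n - 1
        N *= p ** (n * q)
        R *= p ** r
        color[p] = r
    return N, R, color


if __name__ == "__main__":
    print(split_power(2**7 * 3**4, 3))   # N = 1728 = 12**3, R = 6
    print(split_power(2**4 * 3, 3))      # same color, hence the same R = 6
```

Two integers with the same color share the factor $R$, so dividing a monochromatic relation by $R$ leaves a relation between $n$th powers, as in the proof.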
7. INFINITE RAMSEY THEORY IMPLIES EUCLID’S THEOREM. The above proof does not need the full strength of Hindman’s theorem, as it essentially only uses sums of two elements. Hence it is possible to reduce the combinatorial input accordingly, which we discuss below.

Lemma 5 (The infinite Ramsey theorem IRT, see e.g., [8, Theorem 9.1.2]). Let $X$ be some infinite set and color all subsets of $X$ of size $w$ with $t$ different colors. Then there exists some infinite subset $M \subset X$ such that the subsets of $M$ of size $w$ all have the same color.

In plain words, the case $w = 2$ of Lemma 5 says that a finite coloring of the complete graph $K_\infty$ guarantees a complete monochromatic $K_\infty$ as a subgraph.

Theorem 4. The infinite Ramsey theorem IRT implies Euclid’s theorem.

We leave the proof of Theorem 4 as an exercise to the reader, and only remark it is a variant of Theorem 3 and our final Theorem 5. It turns out that one does not actually need an infinite complete monochromatic graph, but only a monochromatic complete bipartite graph $K_{2,\infty}$, where one set of the vertices consists of two elements and the other one is infinite (say countable). We give a complete proof of this and the application to Euclid’s theorem below. To prove the existence of this infinite substructure is quite simple.

Lemma 6 (The $K_{2,\infty}$ lemma). Let $X$ be some infinite set and color all pairs of two distinct elements of $X$ with $t$ different colors. Then there exist a set $V = \{v_1, v_2\} \subset X$ and an infinite set $W = \{w_1, w_2, \ldots\} \subset X \setminus V$ such that all edges $(v_i, w_j)$, with $i \in \{1, 2\}$ and $j \in \mathbb{N}$, have the same color. For ease of notation we assume that $X$ is countable.

Proof. One can construct the required sets step by step. Choose any set $A = \{a_1, a_2, \ldots, a_{t+1}\} \subset X$ of $t + 1$ distinct elements as vertices. Let $v_1 = a_1$. There are infinitely many adjacent edges $(v_1, x_j)$. Hence one of the $t$ colors, say color $c_1$, occurs infinitely often.
Let $X_1 = \{x_{1,j} : j \in \mathbb{N}\} \subset X$ be the set of those elements such that $(v_1, x_{1,j})$ are these infinitely many edges of color $c_1$. Now study the color of all $(a_i, x_{1,j})$ as follows. There exists one color $c_2$ (say) that occurs infinitely often among the infinitely many edges $(a_2, x_{1,j})$. Let $X_2 = \{x_{2,j} : j \in \mathbb{N}\} \subset X_1$ be those elements such that $(a_2, x_{2,j})$ are of color $c_2$. If $c_1 = c_2$ we have found the required substructure with $V = \{a_1, a_2\}$ and $W = X_2$. We therefore assume that $c_1 \ne c_2$. We iterate the step above and come to infinite subsets $X_{t+1} \subset X_t \subset \cdots \subset X_3 \subset X_2 \subset X_1 \subset X$ such that for fixed $i$ all edges $(a_i, x_{i,j})$, $j \in \mathbb{N}$, are of color $c_i$ (say). As there are $t$ distinct colors only, there must be two distinct indices $i_1, i_2 \in \{1, \ldots, t + 1\}$ such that $c_{i_1} = c_{i_2}$. Taking $i_1 < i_2$ without loss of generality, $V = \{a_{i_1}, a_{i_2}\}$, and $W = X_{i_2}$, the lemma is proved.

An alternative is to color the elements $x \in X \setminus A$ with the vector color $(c_1, \ldots, c_{t+1})$ if the color of the edge $(a_i, x)$ is $c_i$, $i = 1, \ldots, t + 1$. As there is only a finite number of vector colors, namely $t^{t+1}$, there is an infinite number of $x \in X \setminus A$ with the same vector color, which defines the set $W$. As before, there are two indices $i_1 \ne i_2$ such that $c_{i_1} = c_{i_2}$. Hence $V = \{a_{i_1}, a_{i_2}\}$ and $W$ are the sets required.

Theorem 5. The $K_{2,\infty}$ lemma implies Euclid’s theorem.

Proof. Let $n \ge 2$, and assume that $p_1, \ldots, p_k$ is the list of all primes. We color the integers by the same rule as before: $m = \prod_{i=1}^{k} p_i^{nq_i + r_i}$ is colored by $\chi(m) =$
$(r_1, \ldots, r_k)$. Based on this coloring, we define an infinite graph on the positive integers. The edges $(m_i, m_j)$ receive the color $\chi(m_i + m_j)$. We apply the $K_{2,\infty}$ lemma to this graph: there exists a complete bipartite graph with parts $V = \{v_1, v_2\}$ and an infinite set $W$ such that all edges $(v_i, w_j)$, with $i \in \{1, 2\}$ and $j \in \mathbb{N}$, have the same color $(r_1, \ldots, r_k)$. We multiply all integers in $\mathbb{N}$ by the constant $P = \prod_{i=1}^{k} p_i^{n - r_i}$. All pairwise sums $Pv_i + Pw_j = P(v_i + w_j)$ are an $n$th power $z_{i,j}^n$ (say). Note that $z_{2,j}^n - z_{1,j}^n = P(v_2 - v_1)$ is a constant, and is also the distance between infinitely many distinct pairs of $n$th powers, for the infinitely many values $j$. This is impossible, as the gap between consecutive $n$th powers increases (see above).

With Hindman’s theorem we made use of a quite advanced combinatorial result, and the number-theoretic part became correspondingly quite simple. We then reduced the depth of the combinatorial lemma until we reached the $K_{2,\infty}$ lemma. On the number-theoretic side, we eventually used the elementary fact that the gaps between consecutive $n$th powers increase and simple arithmetic such as $P(m_i + m_j) = Pm_i + Pm_j$.

8. CONCLUSION. As our journey through a fictional world comes to an end, let us briefly reflect: a common theme in all variants discussed is that the existence of only finitely many primes would guarantee patterns for the set of $n$th powers that cannot actually exist, sometimes for deep reasons, sometimes for obvious ones, depending on the strength of the pattern. Summarizing the results we find:

Corollary 1. In the “world with only finitely many primes” the following hold:
1. If Schur’s theorem holds, then FLT($n$) is wrong for all $n \ge 3$. If FLT($n$) holds for some $n \ge 3$, then Schur’s theorem does not hold.
2. If Roth’s theorem holds, then DM($n$) is wrong for all $n \ge 3$. If DM($n$) holds for some $n \ge 3$, then Roth’s theorem does not hold.
3. The set of $n$th powers has positive density (giving an immediate contradiction).
4. Hindman’s theorem does not hold.
5. The infinite Ramsey theorem (IRT) does not hold.
6. The $K_{2,\infty}$ lemma does not hold.

In other words, Euclid’s theorem is logically connected with many interesting and seemingly unrelated results in mathematics. Having seen all these variants and extensions, the original version, i.e., the combination of Schur’s theorem and the Fermat–Wiles theorem, is the one that looks most intriguing to this author. And Fermat’s last theorem may be the one that many of us would miss most in the fictional “world with only finitely many primes”!

ACKNOWLEDGMENTS. The author would like to thank the referees, the editor, R. Dietmann, J. Erde, I. Leader, R. Meštrović, J.-C. Schlage-Puchta, and A. Wiles for useful comments on the manuscript. The author was partially supported by the Austrian Science Fund (FWF): W1230 and I 4945-N.
ORCID Christian Elsholtz
http://orcid.org/0000-0002-2960-4030
REFERENCES
[1] Alpoge, L. (2015). van der Waerden and the primes. Amer. Math. Monthly. 122(8): 784–785.
[2] Baumgartner, J. E. (1974). A short proof of Hindman’s theorem. J. Combin. Theory Ser. A. 17: 384–386.
[3] Bourgain, J. (2008). Roth’s theorem on progressions revisited. J. Anal. Math. 104: 155–192.
[4] Cameron, P. J. (1995). Combinatorics: Topics, Techniques, Algorithms. Cambridge: Cambridge Univ. Press.
[5] Carmichael, R. D. (1915). Diophantine Analysis. New York: Wiley.
[6] Chaitin, G. J. (2005). Meta Math! The Quest for Omega. New York: Pantheon Books.
[7] Darmon, H., Merel, L. (1997). Winding quotients and some variants of Fermat’s last theorem. J. Reine Angew. Math. 490: 81–100.
[8] Diestel, R. (2017). Graph Theory. Graduate Texts in Mathematics, Vol. 173, 5th ed. Berlin: Springer-Verlag.
[9] Edwards, H. M. (1996). Fermat’s Last Theorem. New York: Springer-Verlag.
[10] Elsholtz, C. (2012). Prime divisors of thin sequences. Amer. Math. Monthly. 119(4): 331–333.
[11] Erdős, P. (1938). Über die Reihe $\sum \frac{1}{p}$. Mathematica, Zutphen B. 7: 1–2.
[12] Faltings, G. (1983). Endlichkeitssätze für abelsche Varietäten über Zahlkörpern. Invent. Math. 73(3): 349–366.
[13] Frankl, P., Graham, R. L., Rödl, V. (1988). Quantitative theorems for regular systems of equations. J. Combin. Theory Ser. A. 47(2): 246–261.
[14] Furstenberg, H. (1955). On the infinitude of primes. Amer. Math. Monthly. 62(5): 353.
[15] Gowers, W. T. (2001). A new proof of Szemerédi’s theorem. Geom. Funct. Anal. 11(3): 465–588.
[16] Graham, R. L., Rothschild, B. L., Spencer, J. H. (1990). Ramsey Theory, 2nd ed. New York: Wiley.
[17] Granville, A. (2017). A panoply of proofs that there are infinitely many primes. Lond. Math. Soc. Newsl. 472: 23–27.
[18] Granville, A. (2017). Squares in arithmetic progressions and infinitely many primes. Amer. Math. Monthly. 124(10): 951–954.
[19] Green, B., Tao, T. (2017). New bounds for Szemerédi’s theorem, III: a polylogarithmic bound for $r_4(N)$. Mathematika. 63(3): 944–1040.
[20] Hindman, N. (1974). Finite sums from sequences within cells of a partition of N. J. Combin. Theory Ser. A. 17: 1–11.
[21] Meštrović, R. (2018). Euclid’s theorem on the infinitude of primes: a historical survey of its proofs (300 B.C.–2017), version 3. arxiv.org/abs/1202.3670
[22] Pólya, G. (1918). Zur arithmetischen Untersuchung der Polynome. Math. Z. 1: 143–148.
[23] Ramsey, F. P. (1929). On a problem of formal logic. Proc. Lond. Math. Soc. Ser. (2). 30(4): 264–286.
[24] Ribenboim, P. (1999). Fermat’s Last Theorem for Amateurs. New York: Springer-Verlag.
[25] Ribenboim, P. (2004). The Little Book of Bigger Primes, 2nd ed. New York: Springer-Verlag.
[26] Roth, K. (1953). On certain sets of integers. J. Lond. Math. Soc. 28: 104–109.
[27] Saidak, F. (2006). A new proof of Euclid’s theorem. Amer. Math. Monthly. 113(10): 937–938.
[28] Schur, I. (1916). Über die Kongruenz $x^m + y^m \equiv z^m \pmod{p}$. Jahresber. Dtsch. Math.-Ver. 25: 114–117.
[29] Seki, S. (2018). Valuations, arithmetic progressions, and prime numbers. Notes Number Theory Discrete Math. 24(4): 128–132. nntdm.net/volume-24-2018/number-4/128-132/
[30] Sierpiński, W. (1964). Elementary Theory of Numbers. Monografie Matematyczne, Vol. 42. Warsaw: Państwowe Wydawnictwo Naukowe.
[31] Szemerédi, E. (1975). On sets of integers containing no k elements in arithmetic progression. Acta Arith. 27: 199–245.
[32] Varnavides, P. (1959). On certain sets of positive density. J. Lond. Math. Soc. 34: 358–360.
[33] Wiles, A. (1995). Modular elliptic curves and Fermat’s last theorem. Ann. Math. (2). 141(3): 443–551.
[34] Wooley, T. D. (2017). A superpowered Euclidean prime generator. Amer. Math. Monthly. 124(4): 351–352.

CHRISTIAN ELSHOLTZ obtained his Master’s and Ph.D. degrees in mathematics from Darmstadt University of Technology. He was a visiting student at the University of Oxford, and taught at the University of Stuttgart, TU Clausthal, and Royal Holloway University of London, before moving to Graz University of Technology. His mathematical interests include combinatorial number theory, primes, and extremal combinatorics. In combinatorics, he enjoys finding large sets in high-dimensional spaces, for example, avoiding arithmetic progressions.
Institut für Analysis und Zahlentheorie, Technische Universität Graz, Kopernikusgasse 24, A-8010 Graz, Austria. [email protected]
Bar Bets and Generating Functions: The Distribution of the Separation of Two Distinct Card Ranks Arnold Saunders and Hosam Mahmoud
Abstract. What is the probability two distinct ranks (say, a king and a five) will appear within a card of one another in a shuffled deck? An old bar bet is predicated on it being nearly guaranteed. By modeling the outcome using a finite automaton, we construct the probability generating function for card rank separation. We then derive the corresponding probability mass function, mean, and variance. We conclude with an unimpressive modification to make the scam a virtual lock.
1. INTRODUCTION. The rank of a playing card is its numeric or face value. The five of hearts has rank “five,” the king of diamonds has rank “king,” and so on. An old bar bet works as follows. You hand a deck of cards to a mark and ask him or her to shuffle it and then to name any two distinct ranks. Next, you claim you can magically force at least one occurrence of the two ranks to appear within one card of each other. If successful, the mark buys you a drink. Lounge lore claims this works “99 percent of the time [2].”

Several authors in the literature of recreational mathematics [4, 8, 9] and in online forums [3, 7, 10] have worked out the probability that two distinct ranks are adjacent. A few others have even considered the probability that the two ranks are separated by at most one card [5, 6]. We aim to take this a step further and derive the probability mass function for separation by $s$ cards, as well as compute the mean separation and its variance.

For clarity of discussion, we say two cards in a deck are at separation $s$ if there are $s$ other cards between them. For example, if the subsequence . . . , K♥, 9♣, J♠, . . . appears in the deck, we say that the king of hearts and the jack of spades are separated by one card, or at separation 1. We say two distinct ranks are at separation $s$ if that is the smallest separation between all instances of the two ranks in the deck. For example, to find the separation of the king and jack ranks, we note the separation of all 16 distinct king-jack pairs in the deck and take the smallest. If we let $S$ be the separation of two designated ranks, our goal is to work out the distribution of $S$.

2. AN ELEMENTARY APPROACH. Without loss of generality, let us continue to use the king and jack as the designated ranks. Consider the event $\{S \ge s\}$. In a well-shuffled deck, all 52! permutations of the cards are equally likely, which provides a denominator for the desired probability.
doi.org/10.1080/00029890.2021.1856581
MSC: Primary 60C05, Secondary 05A15; 97A20

For the event in question to occur, we can have a favorable configuration by the following construction:

• Choose four locations out of 52 for the jacks in $\binom{52}{4}$ ways. Call the four chosen locations $i, j, k, \ell$, in increasing order. Note that the highest values for $i, j, k, \ell$ are respectively 49, 50, 51, 52.
• Permute the four jacks of the deck over these locations in 4! ways.
• There are $i - 1$ locations to the left of position $i$. If $i$ is large enough (namely larger than $s + 1$), then we have positions to the left of $i$ to place kings at separation at least $s$. Precisely, there are $i - 1 - s$ positions to the left of $i$ to place kings at separation at least $s$, if $i$ is large enough, and none if $i$ is not large enough. We can say there are $\max(i - s - 1, 0)$ positions where we can place kings at separation at least $s$ from the jack at position $i$.
• Likewise, we can place kings at separation at least $s$ at any of $\max(j - i - 1 - 2s, 0)$ positions between $i$ and $j$.
• Likewise, we can place kings at separation at least $s$ at any of $\max(k - j - 1 - 2s, 0)$ positions between $j$ and $k$.
• Likewise, we can place kings at separation at least $s$ at any of $\max(\ell - k - 1 - 2s, 0)$ positions between $k$ and $\ell$.
• Likewise, we can place kings at separation at least $s$ at any of $\max(52 - \ell - s, 0)$ positions above $\ell$.
• From all the candidate positions for the kings at separation at least $s$, choose four positions and permute the four kings over them in 4! ways.
• Now that the jacks and kings are placed at locations at least $s$ apart, we can permute the remaining 44 cards in 44! ways over the remaining places.

Let $y(i, j, k, \ell; s)$ be the number of all candidate positions for the four kings. So, we have
$$y(i, j, k, \ell; s) = \max(i - s - 1, 0) + \max(j - i - 1 - 2s, 0) + \max(k - j - 1 - 2s, 0) + \max(\ell - k - 1 - 2s, 0) + \max(52 - \ell - s, 0).$$
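The construction above can be turned into a short program. The sketch below is one possible coding, not the authors' own: for every choice of jack locations it counts the $\binom{y}{4}$ ways to choose king positions among the $y(i, j, k, \ell; s)$ candidates and divides the total by the number of deck orderings. The function names and the `deck` parameter are hypothetical; exact arithmetic is done with `fractions.Fraction`.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial


def candidate_positions(i, j, k, l, s, deck=52):
    """y(i, j, k, l; s): positions at separation >= s from the jacks at i < j < k < l."""
    return (max(i - s - 1, 0)
            + max(j - i - 1 - 2 * s, 0)
            + max(k - j - 1 - 2 * s, 0)
            + max(l - k - 1 - 2 * s, 0)
            + max(deck - l - s, 0))


def prob_separation_at_least(s, deck=52):
    """P(S >= s) for two ranks of four cards each in a shuffled deck of `deck` cards."""
    total = sum(comb(candidate_positions(i, j, k, l, s, deck), 4)
                for i, j, k, l in combinations(range(1, deck + 1), 4))
    # 4! orders of the jacks, 4! orders of the kings, (deck - 8)! orders of the rest
    return Fraction(factorial(4) ** 2 * factorial(deck - 8) * total, factorial(deck))


if __name__ == "__main__":
    p_within_one = 1 - prob_separation_at_least(2)
    print(p_within_one, float(p_within_one))
```

The probability that the two ranks are within one card of each other is then `1 - prob_separation_at_least(2)`, and `prob_separation_at_least(s) - prob_separation_at_least(s + 1)` gives $P(S = s)$.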
Now we have the probability
$$P(S \ge s) = \frac{4! \times 4! \times 44!}{52!} \sum_{i=1}^{49} \sum_{j=i+1}^{50} \sum_{k=j+1}^{51} \sum_{\ell=k+1}^{52} \binom{y(i, j, k, \ell; s)}{4}.$$
From this expression we can find the distribution of $S$, namely we have $P(S = s) = P(S \ge s) - P(S \ge s + 1)$. While this leads to a solution that can easily be coded into a program, we had to very carefully work through multiple cases to arrive at the correct construction. Moreover, to obtain a closed-form expression for $P(S = s)$, we will have to delicately break apart the nested summations according to the behavior of $y(i, j, k, \ell; s)$. We propose an approach that frees us from this tedious bookkeeping.

3. A GENERATING FUNCTION APPROACH. We begin by noting that once we have selected two ranks, the ranks of the remaining cards are unimportant. Thus, we can recast this problem as working with a deck of 52 colored cards: 4 red, 4 green, and the remaining 44 blue. Next, let us consider the alphabet $C = \{r, g, b\}$ (the letters standing in for the three colors in the obvious way) and $Q_0$, the subset of words over $C$ containing at least one
CARD RANK SEPARATION
occurrence of the letters r and g separated by at most s occurrences of the letter b. The finite automaton (FA) in Figure 1 accepts precisely those words that are members of Q0 . Although FAs are typically depicted with one token per arc, for clarity and economy of size, we occasionally connect two states in our diagram with an arc labeled by the regular expression corresponding to the substring recognized by the (now hidden) intermediary states and arcs. For example, our FA transitions from state q1 to q0 whenever it encounters s + 1 consecutive b symbols.
Figure 1. Subset Q0 of words over alphabet C .
We seek the generating function $Q_0(z, u, v)$ where the coefficient of the term $z^n u^j v^k$, denoted by $[z^n u^j v^k]\, Q_0(z, u, v)$, is the number of words in $Q_0$ of length $n$ containing $j$ occurrences of letter $r$ and $k$ occurrences of letter $g$. Let $Q_i$, $i \in \{1, 2, 3\}$, denote the subset of words accepted when starting from state $q_i$ and $Q_i(z, u, v)$ be the associated counting generating function so that coefficient $[z^n u^j v^k]\, Q_i(z, u, v)$ is the number of words in $Q_i$ of length $n$ containing $j$ occurrences of letter $r$ and $k$ occurrences of letter $g$. Then from Figure 1, we obtain the following system of generating functions:
$$Q_0(z, u, v) = z\,Q_0(z, u, v) + uz\,Q_1(z, u, v) + vz\,Q_2(z, u, v),$$
$$Q_1(z, u, v) = z^{s+1} Q_0(z, u, v) + uz\,\frac{1 - z^{s+1}}{1 - z}\,Q_1(z, u, v) + vz\,\frac{1 - z^{s+1}}{1 - z}\,Q_3(z, u, v),$$
$$Q_2(z, u, v) = z^{s+1} Q_0(z, u, v) + vz\,\frac{1 - z^{s+1}}{1 - z}\,Q_2(z, u, v) + uz\,\frac{1 - z^{s+1}}{1 - z}\,Q_3(z, u, v),$$
$$Q_3(z, u, v) = 1 + (1 + u + v)\,z\,Q_3(z, u, v).$$
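As a sanity check on the language $Q_0$, the following sketch simulates the acceptance rule directly and counts accepted words by brute force for a small word length. It is an illustration only and does not solve the system above. The state is tracked as "last non-$b$ letter seen" plus "number of $b$'s since then", which is an assumed but equivalent way to encode the states $q_0, \ldots, q_3$. The function names are hypothetical.

```python
from itertools import combinations


def accepts(word, s):
    """True if some r and g in `word` are separated by at most s letters b."""
    last = None  # last non-b letter seen so far
    gap = 0      # number of b's read since `last`
    for ch in word:
        if ch == 'b':
            gap += 1
        else:
            if last is not None and ch != last and gap <= s:
                return True  # corresponds to reaching the accepting state
            last, gap = ch, 0
    return False


def count_accepted(n, s, reds=4, greens=4):
    """Number of accepted words of length n with the given numbers of r's and g's."""
    count = 0
    for red_pos in combinations(range(n), reds):
        rest = [p for p in range(n) if p not in red_pos]
        for green_pos in combinations(rest, greens):
            word = ['b'] * n
            for p in red_pos:
                word[p] = 'r'
            for p in green_pos:
                word[p] = 'g'
            if accepts(word, s):
                count += 1
    return count


if __name__ == "__main__":
    print(count_accepted(10, 0), count_accepted(10, 1))
```

For length 10 there are 3150 words with four $r$'s and four $g$'s; 3126 of them are accepted for $s = 0$ and 3148 for $s = 1$, which can be compared with the coefficients $[z^{10}]\,F^s(z)$ obtained from the closed form (1) below.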
Backward substitution gives us the desired expression for $Q_0(z, u, v)$ in terms of $z$, $u$, and $v$ alone [1, pp. 56–58]. We are only interested in those members of $Q_0$ with four $r$’s and four $g$’s. Hence we define the generating function $F^s(z) \equiv [u^4 v^4]\, Q_0(z, u, v)$, so that $[z^n]\, F^s(z)$ is the number of words over alphabet $C$ of length $n$ containing four $r$’s and four $g$’s and at least one occurrence of the letters $r$ and $g$ separated by at most $s$ occurrences of letter $b$. Using a symbolic manipulation system, we find
$$F^s(z) = \frac{2z^8 \left(35 - z^{s+1} - 3z^{2s+2} - 9z^{3s+3} - 9z^{4s+4} - 9z^{5s+5} - 3z^{6s+6} - z^{7s+7}\right)}{(1 - z)^9}. \tag{1}$$
Let us next define the generating function $G(z, u)$ so that $[z^n u^s]\, G(z, u)$ denotes the number of words over alphabet $C$ of length $n$ containing four $r$’s and four $g$’s and the minimum number of $b$’s between any pair of letters $r$ and $g$ is $s$. This generating function is defined implicitly in terms of $F^s(z)$ by
$$\frac{G(z, u)}{1 - u} = \sum_{s \ge 0} F^s(z)\, u^s.$$
Substitution of (1) into this relation and solving for $G(z, u)$ gives us
$$G(z, u) = \frac{2(1 - u) z^8}{(1 - z)^9} \left( \frac{35}{1 - u} - \frac{z}{1 - uz} - \frac{3z^2}{1 - uz^2} - \frac{9z^3}{1 - uz^3} - \frac{9z^4}{1 - uz^4} - \frac{9z^5}{1 - uz^5} - \frac{3z^6}{1 - uz^6} - \frac{z^7}{1 - uz^7} \right). \tag{2}$$
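The substitution step from (1) to (2) only needs a geometric series in $u z^j$. A short worked version is sketched below; the abbreviation $c_j$ for the coefficients appearing in (1) is introduced here purely for convenience and is not the article's notation.

```latex
\begin{align*}
\sum_{s \ge 0} F^{s}(z)\, u^{s}
  &= \frac{2z^{8}}{(1-z)^{9}} \sum_{s \ge 0}
     \Bigl( 35 - \sum_{j=1}^{7} c_j\, z^{j(s+1)} \Bigr) u^{s},
     \qquad (c_1, \dots, c_7) = (1, 3, 9, 9, 9, 3, 1), \\
  &= \frac{2z^{8}}{(1-z)^{9}}
     \Bigl( \frac{35}{1-u} - \sum_{j=1}^{7} c_j\, z^{j} \sum_{s \ge 0} (u z^{j})^{s} \Bigr) \\
  &= \frac{2z^{8}}{(1-z)^{9}}
     \Bigl( \frac{35}{1-u} - \sum_{j=1}^{7} \frac{c_j\, z^{j}}{1 - u z^{j}} \Bigr).
\end{align*}
```

Multiplying both sides by $1 - u$ yields the expression for $G(z, u)$.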
With generating functions (1) and (2) in hand, we have the means to address several questions about the distribution of $S$. For example, to find the expected value of $S$, let $S_n$ denote the quantity $S$ for a deck with four red, four green, and $n - 8$ blue cards. Then we have
$$E[S_n] = \frac{[z^n]\, \frac{\partial}{\partial u} G(z, u) \big|_{u=1}}{[z^n]\, G(z, 1)},$$
and so in our case where $n = 52$ we obtain
$$E[S] = \frac{[z^{52}]\, \frac{\partial}{\partial u} G(z, u) \big|_{u=1}}{[z^{52}]\, G(z, 1)} = \frac{59005603980}{\binom{52}{4} \binom{48}{4}} \approx 1.1201.$$
For the variance of $S$, we first find the second factorial moment of $S_n$, namely
$$E[S_n(S_n - 1)] = \frac{[z^n]\, \frac{\partial^2}{\partial u^2} G(z, u) \big|_{u=1}}{[z^n]\, G(z, 1)},$$
yielding
$$E[S(S - 1)] = \frac{[z^{52}]\, \frac{\partial^2}{\partial u^2} G(z, u) \big|_{u=1}}{[z^{52}]\, G(z, 1)} = \frac{168872169804}{\binom{52}{4} \binom{48}{4}} \approx 3.2058.$$
Thus, the variance is given by
$$\operatorname{Var}(S) = E[S(S - 1)] + E[S] - (E[S])^2 = \frac{789111942650231327}{256938608269126875} \approx 3.0712.$$
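These moments can be cross-checked numerically without differentiating $G(z, u)$. The sketch below is an alternative route, not the article's computation: it reads $P(S \le s) = [z^{52}]\,F^s(z) / \bigl(\binom{52}{4}\binom{48}{4}\bigr)$ off formula (1), using $[z^m]\,(1 - z)^{-9} = \binom{m + 8}{8}$, and then sums over the distribution. The names `cdf`, `pmf`, and `binom8` are hypothetical.

```python
from fractions import Fraction
from math import comb

TOTAL = comb(52, 4) * comb(48, 4)                    # number of r/g/b words of length 52
ALPHA = {1: 1, 2: 3, 3: 9, 4: 9, 5: 9, 6: 3, 7: 1}   # coefficients of z^{j(s+1)} in (1)


def binom8(n):
    """binom(n, 8), taken to be 0 when n < 8 (math.comb rejects negative n)."""
    return comb(n, 8) if n >= 0 else 0


def cdf(s):
    """P(S <= s) = [z^52] F^s(z) / TOTAL, read off from (1)."""
    coeff = 2 * (35 * comb(52, 8)
                 - sum(a * binom8(52 - j * (s + 1)) for j, a in ALPHA.items()))
    return Fraction(coeff, TOTAL)


def pmf(s):
    """P(S = s) as a difference of consecutive values of the distribution function."""
    return cdf(s) - (cdf(s - 1) if s > 0 else 0)


# The largest possible separation in a 52-card deck is 44.
mean = sum(s * pmf(s) for s in range(45))
second_factorial = sum(s * (s - 1) * pmf(s) for s in range(45))
variance = second_factorial + mean - mean ** 2

if __name__ == "__main__":
    print(float(mean), float(second_factorial), float(variance))
```

The printed values should agree with the approximations 1.1201, 3.2058, and 3.0712 quoted above.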
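To derive a closed-form expression for $P(S = s)$, rewrite $G(z, u)$ in (2) as
$$G(z, u) = \frac{2(1 - u) z^8}{(1 - z)^9} \sum_{j=0}^{7} \frac{\alpha_j z^j}{1 - uz^j},$$
where, comparing with (2), $(\alpha_0, \ldots, \alpha_7) = (35, -1, -3, -9, -9, -9, -3, -1)$, so that the probability generating function of $S$ is given by
$$\frac{[z^{52}]\, G(z, u)}{\binom{52}{4} \binom{48}{4}} = \left[ \binom{52}{4} \binom{48}{4} \right]^{-1} 2(1 - u) \sum_{j=0}^{7} \alpha_j\, [z^{44-j}]\, \frac{1}{(1 - z)^9 (1 - uz^j)}. \tag{3}$$
Let us consider the term
$$[z^{44-j}]\, \frac{1}{(1 - z)^9 (1 - uz^j)}$$
in (3) for each $j$. When $j = 0$, we obtain
$$[z^{44}]\, \frac{1}{1 - u} \cdot \frac{1}{(1 - z)^9} = \frac{1}{1 - u} \binom{52}{8}.$$
On the other hand, when $j \ge 1$, we find
$$[z^{44-j}]\, \frac{1}{(1 - z)^9 (1 - uz^j)} = [z^{44-j}] \left( \sum_{n \ge 0} \binom{n + 8}{8} z^n \right) \left( \sum_{n \ge 0} [[n \equiv 0 \pmod{j}]]\, u^{n/j} z^n \right)$$
$$= [z^{44-j}] \sum_{n \ge 0} \sum_{i=0}^{n} \binom{i + 8}{8} [[i \equiv n \pmod{j}]]\, u^{(n-i)/j} z^n$$
$$= \sum_{i=0}^{44-j} \binom{i + 8}{8} [[i \equiv 44 \pmod{j}]]\, u^{(44-i-j)/j}$$
$$= \sum_{i=0}^{\lfloor (44-j)/j \rfloor} \binom{52 - j - ij}{8} u^i,$$
where $[[\cdot]]$ is Iverson bracket notation for an indicator function. Thus, (3) becomes
$$2 \left[ \binom{52}{4} \binom{48}{4} \right]^{-1} \left[ 35 \binom{52}{8} - (1 - u) \left( \sum_{i=0}^{43} \binom{51 - i}{8} u^i + 3 \sum_{i=0}^{21} \binom{50 - 2i}{8} u^i + 9 \sum_{i=0}^{13} \binom{49 - 3i}{8} u^i + 9 \sum_{i=0}^{10} \binom{48 - 4i}{8} u^i + 9 \sum_{i=0}^{7} \binom{47 - 5i}{8} u^i + 3 \sum_{i=0}^{6} \binom{46 - 6i}{8} u^i + \sum_{i=0}^{5} \binom{45 - 7i}{8} u^i \right) \right]. \tag{4}$$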
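Extracting the coefficient of the $u^s$ term of (4) yields
$$\begin{aligned}
\left[\binom{52}{4}\binom{48}{4}\right]^{-1}\Bigg\{ & 70\,[[s = 0]]\binom{52}{8}
- 2\left(\binom{51-s}{8} - [[s > 0]]\binom{52-s}{8}\right)
- 6\left(\binom{50-2s}{8} - [[s > 0]]\binom{52-2s}{8}\right) \\
& - 18\left(\binom{49-3s}{8} - [[s > 0]]\binom{52-3s}{8}\right)
- 18\left(\binom{48-4s}{8} - [[s > 0]]\binom{52-4s}{8}\right)
- 18\left(\binom{47-5s}{8} - [[s > 0]]\binom{52-5s}{8}\right) \\
& - 6\left(\binom{46-6s}{8} - [[s > 0]]\binom{52-6s}{8}\right)
- 2\left(\binom{45-7s}{8} - [[s > 0]]\binom{52-7s}{8}\right)\Bigg\},
\end{aligned} \tag{5}$$
where we set $\binom{n}{k} \equiv 0$ when $k > n$. Since $70 = 2 + 6 + 18 + 18 + 18 + 6 + 2$, we can consolidate the indicators in (5), further simplify, and obtain the form
$$\begin{aligned}
P(S = s) = \left[\binom{52}{4}\binom{48}{4}\right]^{-1}\Bigg\{ & 2\left[\binom{52-s}{8} + \binom{52-7s}{8} - \binom{51-s}{8} - \binom{45-7s}{8}\right] \\
& + 6\left[\binom{52-2s}{8} + \binom{52-6s}{8} - \binom{50-2s}{8} - \binom{46-6s}{8}\right] \\
& + 18\left[\binom{52-3s}{8} + \binom{52-4s}{8} + \binom{52-5s}{8} - \binom{49-3s}{8} - \binom{48-4s}{8} - \binom{47-5s}{8}\right]\Bigg\}.
\end{aligned}$$
Let us conclude by returning to the original bar bet. Will it work “99 percent of the time?” To answer this, we need to find $P(S \le 1)$. From (1), this is simply
$$\frac{[z^{52}]\,F_1(z)}{\binom{52}{4}\binom{48}{4}} = \frac{38769062856}{\binom{52}{4}\binom{48}{4}} \approx 0.7360.$$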
So no, far from it. What is the smallest $s$ such that $P(S \le s) \ge 0.99$? Noting that
$$P(S \le 7) = \frac{[z^{52}]\,F_7(z)}{\binom{52}{4}\binom{48}{4}} = \frac{52083420946}{\binom{52}{4}\binom{48}{4}} \approx 0.9887$$
and
$$P(S \le 8) = \frac{[z^{52}]\,F_8(z)}{\binom{52}{4}\binom{48}{4}} = \frac{52259016240}{\binom{52}{4}\binom{48}{4}} \approx 0.9921,$$
we see that $s = 8$ is the desired quantity. Good luck getting someone to take that bet.
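The closed form for $P(S = s)$ can be checked against the quoted values with exact integer arithmetic. The following is a minimal sketch, not the authors' code: the helper names `binom8`, `pmf_numerator`, and `cdf` are hypothetical, and the only convention assumed is the one stated in the text, namely that a binomial coefficient is zero when its lower index exceeds its upper index.

```python
from fractions import Fraction
from math import comb


def binom8(n):
    """C(n, 8), taken to be 0 when 8 > n (including negative n), as in the text."""
    return comb(n, 8) if n >= 8 else 0


# Common denominator C(52,4) * C(48,4) of every probability in the closed form.
DENOMINATOR = comb(52, 4) * comb(48, 4)


def pmf_numerator(s):
    """Integer numerator of P(S = s) over DENOMINATOR, from the closed form."""
    return (
        2 * (binom8(52 - s) + binom8(52 - 7 * s)
             - binom8(51 - s) - binom8(45 - 7 * s))
        + 6 * (binom8(52 - 2 * s) + binom8(52 - 6 * s)
               - binom8(50 - 2 * s) - binom8(46 - 6 * s))
        + 18 * (binom8(52 - 3 * s) + binom8(52 - 4 * s) + binom8(52 - 5 * s)
                - binom8(49 - 3 * s) - binom8(48 - 4 * s) - binom8(47 - 5 * s))
    )


def cdf(s):
    """P(S <= s) as an exact fraction."""
    return Fraction(sum(pmf_numerator(k) for k in range(s + 1)), DENOMINATOR)


if __name__ == "__main__":
    for s in (1, 7, 8):
        print(s, float(cdf(s)))
    # Smallest s whose cumulative probability reaches 99 percent.
    print(next(s for s in range(45) if cdf(s) >= Fraction(99, 100)))
```

Working with integer numerators avoids any rounding, so the comparison with the fractions quoted above is exact rather than approximate.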
ORCID Arnold Saunders
http://orcid.org/0000-0001-8376-3211
ARNOLD SAUNDERS is a Ph.D. student (graduated in 2020) in statistics at the George Washington University. He holds graduate degrees in computer science and applied mathematics from Creighton University and the Air Force Institute of Technology, respectively. His research interests include random discrete structures and analytic probability. Department of Statistics, The George Washington University, Washington, DC 20052 arnold [email protected]
HOSAM MAHMOUD is a Professor in the Department of Statistics at the George Washington University, where he used to be the chair. He holds a Ph.D. in computer science from the Ohio State University. He is on the editorial board of five academic journals and has authored four books. Department of Statistics, The George Washington University, Washington, DC 20052 [email protected]
NOTES Edited by Vadim Ponomarenko
Countable Metric Spaces Without Isolated Points
Frederick K. Dashiell Jr.
Abstract. This note provides a short, self-contained proof of the famous fact that any countable metric space without isolated points is homeomorphic to the space of rational numbers. The discussion is carried out entirely in the language of metric spaces.
1. INTRODUCTION. The theorem of the abstract was first proved for subspaces of Euclidean space $\mathbb{R}^n$ in a 1920 paper of Sierpiński [8], which devotes 6 pages to the proof. The general metric space case follows immediately from the fact that any countable metric space is homeomorphic to a subspace of the real line, which Sierpiński published one year later [9, p. 89].¹ The general case is now widely known as Sierpiński's theorem. Strangely, both of Sierpiński's books [10, p. 107] and [11, p. 142] state the general case but the proofs are incomplete (even after reducing to the real line)! And Kuratowski [7, p. 287] states the general case and provides the reduction to the real line, but is then content to outsource the proof by referencing Sierpiński [8]. In the third generation of great Polish topology texts, Engelking [5, p. 370] outlines two complete proofs of the general case but they are not self-contained, since they rely on topological properties, provided in exercises, of either the irrational numbers or of the Cantor set. Other proofs have appeared in [1,4] and the book of Dasgupta [3, p. 319], but they also depend on either one of the same spaces just mentioned or the countable dense linear order of Cantor [3, p. 160]. The recent, shorter proof of Ciesielski [2] is self-contained but uses the general topological notion of a basis for a topology on a certain subset of $\mathbb{N}^{\mathbb{N}}$, so it goes beyond metric spaces, and it is longer and more complicated than the proof presented here. Both proofs share concepts originally revealed by Sierpiński [8] (and also by Fréchet [6, pp. 150–152]). In particular, the method of a descending sequence of open partitions of the space originates in [8]. Thus no complete self-contained metric space proof, which can be given in a course in real analysis, seems to exist in the abundant literature covering metric spaces.

Terminology. In this note, a ball in a metric space $(X, d_X)$ is a set of the form $B_r(x) = \{w : d_X(w, x) < r\}$ for some $x \in X$ and $r > 0$; an open set is the union of a family of balls. The point $x$ is isolated if $B_r(x) = \{x\}$ for some $r > 0$. A metric space is without isolated points if and only if no point is isolated; hence every open set is without isolated points. A continuous function between metric spaces satisfies the standard $\varepsilon, \delta$ condition. Two metric spaces are homeomorphic if there is a bijection between them that is continuous in both directions.

¹ The 1921 article is also restricted to $\mathbb{R}^n$, using a type of subspace which soon became known as a zero-dimensional subspace of $\mathbb{R}^n$. However, unlike the 1920 article, which utilizes the coordinates of $\mathbb{R}^n$, the argument is obviously valid in a general metric space (as acknowledged by Urysohn [12, p. 76, note 6]).

doi.org/10.1080/00029890.2021.1856586 MSC: Primary 54E35
2. THE THEOREM. Since the space of rational numbers, with the usual notion of distance $|p - q|$ between two elements, is a countable metric space without isolated points, the theorem can be stated in the following form.

Theorem. Any two countable metric spaces without isolated points are homeomorphic.

Proof. Let $(X, d_X)$ and $(Y, d_Y)$ be two countable metric spaces without isolated points. Enumerate $X = \{x_1, x_2, \dots\}$ and $Y = \{y_1, y_2, \dots\}$, where $x_m \ne x_n$ and $y_m \ne y_n$ for $m \ne n$. Put $D = \{d_X(x_m, x_n) : m \ne n\} \cup \{d_Y(y_m, y_n) : m \ne n\}$, so that $D$ is a countable set of positive numbers. For each nonempty $S \subset X$ define $\mathrm{next}(S) = x_k$ where $k = \min\{i : x_i \in S\}$, and similarly for $S \subset Y$. For $r > 0$, put
$$B(S, r) = B_r(\mathrm{next}(S)) \quad\text{and}\quad G(S, r) = S \setminus B(S, r). \tag{1}$$
Now we introduce “the splitter”: For any nonempty open $S \subset X$ or $S \subset Y$, there is an arbitrarily small $r > 0$ such that $B(S, r)$ and $G(S, r)$ divide $S$ into two disjoint nonempty open subsets. To see this, set $x = \mathrm{next}(S)$. Since $x$ is not isolated, there is $y \in S$, $y \ne x$. Choose $r \in (0, d_X(x, y)) \setminus D$ small enough so that $B(S, r) = B_r(x) \subset S$. Then $y \in S \setminus B_r(x) = G(S, r)$, and since $r \notin D$, $G(S, r) = \{u \in S : d_X(x, u) > r\}$. Thus $G(S, r)$ is nonempty and open.

The splitter is used recursively to generate a sequence of partitions of $X$ and $Y$ into open sets, using as indices the set $W_G$ of words consisting of the letters B and G (suggesting blue or ball and green or general-open), whose first letter is always G. Thus $W_G = \{G, GB, GG, GBB, GBG, GGB, GGG, \dots\}$. For $t \in W_G$, the length $|t|$ is the number of letters in $t$, and $t$ is called blue (respectively, green) if the last letter is B (respectively, G); $t^{\wedge}B$ and $t^{\wedge}G$ extend $t$ by adding one letter. Set $U_G = X$ and $V_G = Y$. Using induction on the length of $t \in W_G$, suppose that open sets $U_t \subset X$ and $V_t \subset Y$ are defined for all $t \in W_G$ with $|t| = n \ge 1$. Applying the splitter to these $U_t$ and $V_t$ (there are $2^{n-1}$ of each), set $U_{t^{\wedge}B} = B(U_t, r_{n+1})$ and $U_{t^{\wedge}G} = G(U_t, r_{n+1})$, and similarly for the $V_t$, where the same $r_{n+1} < \frac{1}{n+1}$ is used for all the sets at stage $n + 1$. Hence $U_t$ and $V_t$ are defined for all $t \in W_G$. The desired homeomorphism $h : X \to Y$ is given by
$$h(\mathrm{next}(U_t)) = \mathrm{next}(V_t) \quad\text{for green } t \in W_G. \tag{2}$$
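The bookkeeping of the index set can be made concrete with a few lines of code. This is only a sketch of the combinatorics of the words: the function names `words`, `is_green`, and `children` are hypothetical, and nothing here models the metric spaces or the choice of the radii.

```python
from itertools import product


def words(length):
    """All words of the given length over {B, G} whose first letter is G."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return ["G" + "".join(tail) for tail in product("BG", repeat=length - 1)]


def is_green(word):
    """A word is green when its last letter is G, blue when it is B."""
    return word.endswith("G")


def children(word):
    """The two one-letter extensions of a word: append B, or append G."""
    return word + "B", word + "G"


if __name__ == "__main__":
    for n in range(1, 5):
        stage = words(n)
        # Stage n has 2**(n-1) index words, one for each open set at that stage.
        print(n, len(stage), stage)
```

Each word at stage $n$ has exactly two children at stage $n + 1$, which mirrors how the splitter divides every open set of a stage into a blue ball and a green remainder.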
To finish the proof, observe that when a member $x \in X$ is $x = \mathrm{next}(U_t)$ for some green $t$, it becomes the center of the balls $U_s$ for $s \in \{t^{\wedge}B, t^{\wedge}BB, \dots\}$, and these form a decreasing sequence of balls with center $x$ and radii $\to 0$. This ensures that every $x \in X$ is associated with a unique green $t \in W_G$, and similarly for every $y \in Y$, so that $h$ is a bijection from $X$ onto $Y$. Finally, $h$ is bicontinuous because $h(U_t) = V_t$ for every $t \in W_G$; hence the $\varepsilon, \delta$ criterion for $h(x) = y$ is satisfied by the blue balls $V_t$, $U_t$ (for blue $t \in W_G$) containing $y$ and $x$.

As an immediate consequence, the theorem yields the following classical characterization of countable metric spaces [7, p. 287]. Unlike the classical proofs, the present proof does not depend on the complete metrizability of any metric space, such as the Cantor set, the irrational numbers, or the real line.

Corollary. Every countable metric space is homeomorphic to a subspace of the space of rational numbers.
Proof. Let (X, d) be a countable metric space and Q be the space of rational numbers. Define a metric ρ on the countable set X × Q by ρ((x, p), (y, q)) = d(x, y) + |p − q|. Then X is homeomorphic to the subspace X × {0}, and X × Q has no isolated points. By the theorem, X × Q is homeomorphic to Q; hence X is homeomorphic to a subspace of Q. ACKNOWLEDGMENTS. I thank two referees for helpful suggestions. I am grateful for the hospitality of the Department of Mathematics at UCLA.
ORCID Frederick K. Dashiell Jr.
http://orcid.org/0000-0001-5392-4352
REFERENCES
[1] Błaszczyk, A. (2019). A simple proof of Sierpiński's theorem. Amer. Math. Monthly. 126(5): 464–466.
[2] Ciesielski, K. (2020). Sierpiński's topological characterization of Q. Math. Mag. 93(2): 136–138.
[3] Dasgupta, A. (2014). Set Theory. New York: Springer.
[4] Eberhart, C. (1977). Some remarks on the irrational and rational numbers. Amer. Math. Monthly. 84(1): 32–35.
[5] Engelking, R. (1989). General Topology. Berlin: Heldermann Verlag.
[6] Fréchet, M. (1910). Les dimensions d'un ensemble abstrait. Math. Ann. 68(2): 145–168.
[7] Kuratowski, K. (1966). Topology, Volume I. New York and London: Academic Press.
[8] Sierpiński, W. (1920). Sur une propriété topologique des ensembles dénombrables denses en soi. Fund. Math. 1: 11–16.
[9] Sierpiński, W. (1921). Sur les ensembles connexes et non connexes. Fund. Math. 2: 81–95.
[10] Sierpiński, W. (1934). Introduction to General Topology. Toronto: Univ. of Toronto Press.
[11] Sierpiński, W. (1956). General Topology. Toronto: Univ. of Toronto Press. Reprint (2000). Mineola, NY: Dover Publications.
[12] Urysohn, P. (1925). Mémoire sur les multiplicités Cantoriennes. Fund. Math. 7: 30–137.
Center of Excellence for Computation, Algebra, and Topology (CECAT), Chapman University, Orange, CA [email protected]
On Arithmetic Progressions of Powers in Cyclotomic Polynomials
Hùng Việt Chu
Abstract. We determine necessary conditions for when powers corresponding to positive/negative coefficients of $\Phi_n$ are in arithmetic progression. When $n = pq$ for any primes $q > p > 2$, our conditions are also sufficient. Finally, we generalize the result when $n = pq$ to the so-called inclusion-exclusion polynomials first introduced by Bachman.
1. INTRODUCTION AND MAIN RESULTS. For integers $n \ge 1$, the $n$th cyclotomic polynomial is defined as
$$\Phi_n(X) = \prod_{\substack{m=1 \\ (m,n)=1}}^{n} \left(X - e^{2\pi m i/n}\right).$$
It is well known that $\Phi_n$ is in $\mathbb{Z}[X]$ with degree $\phi(n)$, where $\phi$ is the Euler totient function. In the study of cyclotomic polynomials, we can reduce our enquiry to the case when $n$ is odd, square-free, and composite by [10, Remark 2.2]. Much work has been done to characterize $\Phi_n$ (see [1, 3, 6, 10]), and many nice results are achieved when $n$ has a small number of prime divisors (see [4, 5, 7]). In particular, we know an explicit formula for $\Phi_{pq}$:
$$\Phi_{pq}(X) = \left(\sum_{i=0}^{r} X^{ip}\right)\left(\sum_{j=0}^{s} X^{jq}\right) - \left(\sum_{i=r+1}^{q-1} X^{ip}\right)\left(\sum_{j=s+1}^{p-1} X^{jq}\right) X^{-pq}, \tag{1}$$
where $r, s$ are nonnegative and $pr + qs = (p-1)(q-1)$. For its derivation, see [7]. Clearly, $r$ and $s$, when $0 < r < q$, are uniquely determined as follows:
$$pr \equiv (p-1)(q-1) \bmod q, \tag{2}$$
$$s = ((p-1)(q-1) - pr)/q. \tag{3}$$
If we expand the products in (1), the resulting monomial terms are all different [7]. Our first main result shows necessary conditions when powers of $X$ are in arithmetic progression. Two examples are
$$\Phi_{21}(X) = X^{12} - X^{11} + X^{9} - X^{8} + X^{6} - X^{4} + X^{3} - X + 1,$$
$$\Phi_{33}(X) = X^{20} - X^{19} + X^{17} - X^{16} + X^{14} - X^{13} + \cdots - X^{4} + X^{3} - X + 1.$$
Observe that powers corresponding to positive coefficients of $\Phi_{21}(X)$ are in arithmetic progression, and powers corresponding to negative coefficients of $\Phi_{33}(X)$ are in arithmetic progression. Our theorems provide necessary conditions for when these arithmetic progressions appear. Let $c_{n,k}$ be the coefficient of $X^k$ in $\Phi_n(X)$ and define $S_n^+ := \{k : c_{n,k} > 0\}$ and $S_n^- := \{k : c_{n,k} < 0\}$.

doi.org/10.1080/00029890.2021.1856582 MSC: Primary 11B83
Theorem 1. Let $n$ be an odd, square-free, composite number. Write $n = p_1 p_2 \cdots p_t$, where $p_1 < p_2 < \cdots < p_t$. Then the following hold.
(i) If $t$ is odd, then $S_n^+$ is not in arithmetic progression.
(ii) If $t$ is even and $S_n^+$ is in arithmetic progression, then $p_2 \equiv 1 \bmod p_1$.

Theorem 2. Let $n$ be an odd, square-free, composite number. Write $n = p_1 p_2 \cdots p_t$, where $p_1 < p_2 < \cdots < p_t$. Then the following hold.
(i) If $t$ is odd, then $S_n^-$ is not in arithmetic progression.
(ii) If $t$ is even and $S_n^-$ is in arithmetic progression, then $p_2 \equiv -1 \bmod p_1$.

We have the following two corollaries.

Corollary 1. Let $n$ be an odd, square-free, composite number. Then $S_n^-$ and $S_n^+$ are not simultaneously in arithmetic progression.

Proof. Write $n = p_1 p_2 \cdots p_t$, where $p_1 < p_2 < \cdots < p_t$. If $t$ is odd, Theorem 1 says that $S_n^+$ is not in arithmetic progression. Suppose that $t$ is even and that both $S_n^+$ and $S_n^-$ are in arithmetic progression. By items (ii) of Theorems 1 and 2, $p_2 = m_1 p_1 + 1 = m_2 p_1 - 1$ for some $m_1, m_2 \in \mathbb{N}$. Hence, $2p_2 = (m_1 + m_2)p_1$, which implies that either $p_1 = 2$ or $p_1$ divides $p_2$. Both cases are impossible.

Corollary 2. Let $2 < p < q$ be primes. Then $S_{pq}^+$ forms an arithmetic progression if and only if $q = mp + 1$ for some $m \in \mathbb{N}$, and $S_{pq}^-$ forms an arithmetic progression if and only if $q = mp - 1$ for some $m \in \mathbb{N}$.

Proof. Due to similarity, we only prove the result for $S_{pq}^+$. The forward implication follows directly from Theorem 1 item (ii). For the backward implication, we use formula (1). Suppose that $q = mp + 1$ for some $m \in \mathbb{N}$. Then formulas (2) and (3) give $r = m(p-1)$ and $s = 0$. Combined with formula (1), this clearly indicates that $S_{pq}^+$ is in arithmetic progression of difference $p$.
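Corollary 2 is easy to test on small cases. The sketch below is an illustration rather than anything from the paper: it assumes the standard quotient $\Phi_{pq}(X) = (X-1)(X^{pq}-1)/((X^p-1)(X^q-1))$ for distinct primes $p, q$, stores polynomials as coefficient lists with the lowest degree first, and uses hypothetical helper names throughout.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_divexact(num, den):
    """Exact quotient num / den for a monic divisor den; raises if not exact."""
    num = num[:]
    deg_den = len(den) - 1
    quot = [0] * (len(num) - deg_den)
    for k in range(len(quot) - 1, -1, -1):
        c = num[k + deg_den]  # den is monic, so this is the next quotient coefficient
        quot[k] = c
        for j, d in enumerate(den):
            num[k + j] -= c * d
    if any(num):
        raise ValueError("division is not exact")
    return quot


def x_pow_minus_one(n):
    """Coefficient list of X**n - 1."""
    return [-1] + [0] * (n - 1) + [1]


def cyclotomic_pq(p, q):
    """Phi_pq for distinct primes p, q, via (X-1)(X^pq-1)/((X^p-1)(X^q-1))."""
    num = poly_mul([-1, 1], x_pow_minus_one(p * q))
    den = poly_mul(x_pow_minus_one(p), x_pow_minus_one(q))
    return poly_divexact(num, den)


def sign_exponents(coeffs):
    """Exponents with positive and with negative coefficient, in increasing order."""
    pos = [k for k, c in enumerate(coeffs) if c > 0]
    neg = [k for k, c in enumerate(coeffs) if c < 0]
    return pos, neg


def is_arithmetic_progression(values):
    """True when consecutive sorted values share one common difference."""
    values = sorted(values)
    return len({b - a for a, b in zip(values, values[1:])}) <= 1


if __name__ == "__main__":
    for p, q in [(3, 5), (3, 7), (3, 11), (5, 11), (5, 19)]:
        pos, neg = sign_exponents(cyclotomic_pq(p, q))
        print(p, q, is_arithmetic_progression(pos), is_arithmetic_progression(neg))
```

Note that `is_arithmetic_progression` treats any set with at most two elements as a progression, so very small exponent sets pass trivially; the examples above were chosen so that both sets have at least three elements.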
Finally, we generalize Corollary 2 to a family of inclusion-exclusion polynomials introduced by Bachman [2]. An inclusion-exclusion polynomial is defined as
$$P_{a,b}(X) = \frac{(X-1)(X^{ab}-1)}{(X^a-1)(X^b-1)},$$
where $a, b$ are relatively prime natural numbers; $P_{a,b}$ can also be interpreted as the semigroup polynomial of the numerical semigroup generated by $a$ and $b$ [9]. When $a$ and $b$ are odd primes, $P_{a,b}(X) = \Phi_{ab}(X)$.

Theorem 3. Let $1 < a < b$ be coprime natural numbers. Then $P_{a,b}(X)$ is a polynomial. Furthermore, the exponents of the monomials with positive coefficient are in arithmetic progression if and only if $b \equiv 1 \bmod a$. The exponents of the monomials with negative coefficient are in arithmetic progression if and only if $b \equiv -1 \bmod a$.

2. PROOFS OF THEOREMS 1 AND 2. We modify a powerful technique, which was used by Schur [8] to prove there exist cyclotomic polynomials with coefficients arbitrarily large in absolute value. The following lemma is the key ingredient.

Lemma 1. Let $n$ be an odd, square-free, composite number. Write $n = p_1 p_2 \cdots p_t$. Then modulo $X^{p_2+2}$,
$$\Phi_n(X) \equiv \begin{cases} \displaystyle\sum_{i=0}^{p_1-1} X^i - X^{p_2} - X^{p_2+1}, & \text{if } t \text{ is odd;} \\[2ex] \displaystyle\sum_{i=0}^{\infty} X^{ip_1} - \sum_{i=0}^{\infty} X^{ip_1+1} + X^{p_2} - X^{p_2+1}, & \text{if } t \text{ is even.} \end{cases}$$
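Proof. By [10, Lemma 1.2], we can write $\Phi_n(X) = \prod_{d \mid n} (X^d - 1)^{\mu(n/d)}$, where $\mu(n)$ denotes the Möbius function. Modulo $X^{p_2+2}$, we have
$$\begin{aligned}
\Phi_n(X) &\equiv (X-1)^{\mu(n)} (X^{p_1}-1)^{\mu(n/p_1)} (X^{p_2}-1)^{\mu(n/p_2)} \prod_{\substack{d \mid n \\ d > p_2}} (X^d-1)^{\mu(n/d)} \\
&\equiv (X-1)^{\mu(n)} (X^{p_1}-1)^{\mu(n/p_1)} (X^{p_2}-1)^{\mu(n/p_2)} (-1)^{\ell},
\end{aligned}$$
where
$$\ell = \sum_{\substack{d \mid n \\ d > p_2}} \mu(n/d).$$
If $t$ is odd, $\mu(n) = -1$ and $\mu(n/p_1) = \mu(n/p_2) = 1$. Since it is well known that $\sum_{d \mid n} \mu(d) = 0$ for all $n > 1$, we get $\ell = -1$. Thus,
$$\begin{aligned}
\Phi_n(X) &\equiv -\frac{(X^{p_1}-1)(X^{p_2}-1)}{X-1} \equiv -(1 + X + \cdots + X^{p_1-1})(X^{p_2}-1) \\
&\equiv 1 + X + \cdots + X^{p_1-1} - X^{p_2} - X^{p_2+1}.
\end{aligned}$$
If $t$ is even, we get $\mu(n) = 1$, $\mu(n/p_1) = \mu(n/p_2) = -1$, and $\ell = 1$. Therefore,
$$\begin{aligned}
\Phi_n(X) &\equiv -(X-1)\,\frac{1}{1-X^{p_1}}\,\frac{1}{1-X^{p_2}} \\
&\equiv -(X-1)(1 + X^{p_1} + X^{2p_1} + \cdots)(1 + X^{p_2} + X^{2p_2} + \cdots) \\
&\equiv \sum_{i=0}^{\infty} X^{ip_1} - \sum_{i=0}^{\infty} X^{ip_1+1} + X^{p_2} - X^{p_2+1}.
\end{aligned}$$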
We have finished our proof.

Lemma 2. Let $n$ be an odd, square-free, composite number. Write $n = p_1 p_2 \cdots p_t$. Suppose $2p_1 + 2p_2 \ge \phi(n) + 2$, where $\phi$ is the Euler totient function. Then $t = 2$.

Proof. Suppose that $t \ge 3$. Then $\phi(n) + 2 \ge (p_1-1)(p_2-1)(p_3-1) + 2 \ge 6(p_1-1)(p_2-1) + 2 = 6p_1p_2 - 6(p_1+p_2) + 8$. Hence, $2p_1 + 2p_2 \ge \phi(n) + 2$ implies that $4(p_1+p_2) \ge 3p_1p_2 + 4$, which is a contradiction since $p_1p_2 > 4p_1$ and $2p_1p_2 > 4p_2$. Therefore, $t = 2$.

We are ready to prove Theorems 1 and 2.

Proof of Theorem 1. If $t$ is odd and $S_n^+$ is in arithmetic progression, then by Lemma 1 it must be that $S_n^+ = \{0, 1, 2, \dots, \phi(n)\}$. However, $X^{p_2}$ and $X^{p_2+1}$ have coefficient $-1$, a contradiction. Therefore, $S_n^+$ is not in arithmetic progression. If $t$ is even, Lemma 1 guarantees that $0$ and $p_1$ are in $S_n^+$. Suppose that $S_n^+$ is in arithmetic progression. If $p_2 \in S_n^+$, then $p_1$ divides $p_2$, a contradiction. So, $X^{p_2}$ must be cancelled by $X^{mp_1+1}$ for some $m$. Therefore, $p_2 = mp_1 + 1$, as desired.

Proof of Theorem 2. If $t$ is odd, Lemma 1 says that $p_2$ and $p_2 + 1$ are in $S_n^-$. Suppose that $S_n^-$ is in arithmetic progression. Then $S_n^- = \{p_2, p_2+1, \dots, \phi(n) - p_2\}$. Thus, the number of powers with negative coefficients is exactly $\phi(n) - 2p_2 + 1$. Hence, $-(\phi(n) - 2p_2 + 1)$ is an upper bound for the sum of these coefficients. By symmetry of cyclotomic polynomials, we have $S_n^+ = \{0, 1, 2, \dots, p_1 - 1, \phi(n) - p_1 + 1, \dots, \phi(n) - 1, \phi(n)\}$. Thus, the number of powers with positive coefficient is exactly $2p_1$. Since each coefficient is 1, the sum of them is $2p_1$. Using the fact that $\Phi_n(1) = 1$ if $n$ is not a prime power, we know that $2p_1 - (\phi(n) - 2p_2 + 1) \ge 1$, which is equivalent to $2p_1 + 2p_2 \ge \phi(n) + 2$. By Lemma 2, we have $t = 2$, which contradicts the assumption that $t$ is odd. If $t$ is even, Lemma 1 says that $1$ and $p_1 + 1$ are in $S_n^-$. If $p_2 + 1$ is in $S_n^-$, $p_1$ must divide $p_2$, a contradiction. So $X^{p_2+1}$ must be cancelled by $X^{mp_1}$ for some $m \in \mathbb{N}$. Therefore, $p_2 = mp_1 - 1$.

3. PROOF OF THEOREM 3. We first prove that $P_{a,b}(X)$ is a polynomial and then consider powers of monomials with positive coefficients.

Lemma 3. For $1 < a < b$ and $\gcd(a, b) = 1$, there exists a unique $3 \le m \le b$ such that $b$ divides $(m-1)a - 1$.

Proof. Because $\gcd(a, b) = 1$, there exist integers $r$ and $s$ such that $ra + sb = 1$. All integral solutions of the equation $xa + yb = 1$ are of the form $(x, y) = (r + tb, s - ta)$ for some $t \in \mathbb{Z}$. Hence, there is a unique solution with $1 \le x = r + t_0 b \le b$. Set $m = r + t_0 b + 1$. By definition, $b$ divides $(m-1)a - 1$. It remains to show $3 \le m \le b$ or equivalently, $1 < r + t_0 b < b$. If $r + t_0 b = 1$, then $b$ divides $a - 1$, which contradicts $1 < a < b$. So $r + t_0 b > 1$. If $r + t_0 b = b$, then $b$ divides 1, which contradicts $b > 1$. So $r + t_0 b < b$. This completes the proof.

Proof of Theorem 3. We write
$$P_{a,b}(X) = \frac{(X-1)(X^{a(b-1)} + X^{a(b-2)} + \cdots + 1)}{X^b - 1}.$$
It suffices to prove that $f(X) := (X-1)(X^{a(b-1)} + X^{a(b-2)} + \cdots + 1)$ can be written as $(X^b - 1)g(X)$ for some polynomial $g(X)$. We have
$$f(X) = (X^{ab-a+1} + X^{ab-2a+1} + \cdots + X) - (X^{ab-a} + X^{ab-2a} + \cdots + 1). \tag{4}$$
Let $3 \le m \le b$ be chosen such that $b$ divides $(m-1)a - 1$. By Lemma 3, $m$ exists and is unique. For each $1 \le k \le m - 1$, we have
$$X^{ab-ka+1} - X^{ab-(b+k-m+1)a} = X^{a(m-k-1)}\left(X^{ab-((m-1)a-1)} - 1\right), \tag{5}$$
which is divisible by $X^b - 1$. For each $m \le k \le b$, we have
$$X^{ab-ka+1} - X^{ab-(k-m+1)a} = X^{ab-ka+1}\left(1 - X^{(m-1)a-1}\right), \tag{6}$$
which is divisible by $X^b - 1$. From (4)–(6), we know that $P_{a,b}(X)$ is a polynomial. Furthermore, letting $\ell := ((m-1)a - 1)/b \ge 1$, we can write
$$P_{a,b}(X) = \sum_{k=1}^{m-1} X^{a(m-k-1)}\,u(X) - \sum_{k=m}^{b} X^{ab-ka+1}\,v(X), \tag{7}$$
where $u(X) = X^{(a-\ell-1)b} + X^{(a-\ell-2)b} + \cdots + 1$ and $v(X) = X^{(\ell-1)b} + X^{(\ell-2)b} + \cdots + 1$.

Next, we prove that exponents of monomials with positive coefficients are in arithmetic progression if and only if $b \equiv 1 \bmod a$.

Forward implication: By (7), the two largest powers with positive coefficients are $a(m-2) + (a-\ell-1)b$ and $a(m-3) + (a-\ell-1)b$. (Note that the two monomials having these powers are not cancelled.)
Hence, we have an arithmetic progression of difference $a$. If $u(X)$ has exactly one summand or $a - \ell - 1 = 0$, then $b \equiv 1 \bmod a$. Assume that $a - \ell - 1 > 0$. Because $\gcd(a, b) = 1$, it follows that $X^{a(m-2)+(a-\ell-2)b}$ must get cancelled. Then there exist $1 \le j \le \ell$ and $m \le k \le b$ such that $a(m-2) + (a-\ell-2)b = (ab - ka + 1) + b(\ell - j)$. Replacing $\ell b = (m-1)a - 1$ and simplifying, we arrive at $a(k-m) + 1 = b(2-j)$, which gives $j = 1$. So $a(k-m) + 1 = b$ and thus, $b \equiv 1 \bmod a$.

Backward implication: straightforward calculations show $m = b - (b-1)/a + 1$ and $\ell = a - 1$. Hence, $u(X) = 1$ and we have
$$P_{a,b}(X) = \sum_{k=1}^{m-1} X^{a(m-k-1)} - \sum_{k=m}^{b} X^{ab-ka+1}\left(X^{(\ell-1)b} + X^{(\ell-2)b} + \cdots + 1\right).$$
Each monomial in the second sum has power $ab - ka + 1 + ib$ for some $0 \le i \le \ell - 1 = a - 2$, which is congruent to $1 + i \bmod a$ because $b \equiv 1 \bmod a$, and hence is not divisible by $a$. Since every power in the first sum is a multiple of $a$, no summand in the first sum gets cancelled. Therefore, the powers of monomials with positive coefficient are in an arithmetic progression. We have shown that exponents of monomials with positive coefficients are in arithmetic progression if and only if $b \equiv 1 \bmod a$. As the proof for negative coefficients is similar, we omit it.

ACKNOWLEDGMENTS. The author wishes to thank the referees and the editor for useful comments that improved this paper.
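The decomposition used in the proof of Theorem 3 can be checked mechanically for small coprime pairs. The sketch below is an illustration under stated assumptions, not the author's code: it takes $m$ to be one more than the inverse of $a$ modulo $b$ (as in Lemma 3), builds the right-hand side of (7) as a coefficient list, and confirms it by cross-multiplying against the definition of $P_{a,b}$. All function names are hypothetical, and the three-argument `pow` with exponent `-1` requires Python 3.8 or later.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def x_pow_minus_one(n):
    """Coefficient list of X**n - 1."""
    return [-1] + [0] * (n - 1) + [1]


def lemma3_m(a, b):
    """The unique m with 3 <= m <= b such that b divides (m - 1) * a - 1."""
    return pow(a, -1, b) + 1


def representation_7(a, b):
    """Return (m, ell, coeffs), where coeffs is the right-hand side of (7)."""
    m = lemma3_m(a, b)
    ell = ((m - 1) * a - 1) // b
    u_exponents = [i * b for i in range(a - ell)]  # exponents of u(X)
    v_exponents = [i * b for i in range(ell)]      # exponents of v(X)
    coeffs = [0] * ((a - 1) * (b - 1) + 1)
    for k in range(1, m):
        for e in u_exponents:
            coeffs[a * (m - k - 1) + e] += 1
    for k in range(m, b + 1):
        for e in v_exponents:
            coeffs[a * b - k * a + 1 + e] -= 1
    return m, ell, coeffs


def satisfies_definition(a, b, coeffs):
    """Check (X^a - 1)(X^b - 1) * P == (X - 1)(X^{ab} - 1)."""
    lhs = poly_mul(poly_mul(x_pow_minus_one(a), x_pow_minus_one(b)), coeffs)
    rhs = poly_mul([-1, 1], x_pow_minus_one(a * b))
    return lhs == rhs


if __name__ == "__main__":
    for a, b in [(3, 4), (3, 5), (3, 7), (4, 7), (5, 8)]:
        m, ell, coeffs = representation_7(a, b)
        print(a, b, m, ell, satisfies_definition(a, b, coeffs))
```

Checking by cross-multiplication avoids polynomial division altogether, so the test exercises only the exponents that appear in (7).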
REFERENCES
[1] Bachman, G. (1993). On the coefficients of cyclotomic polynomials. Mem. Amer. Math. Soc. 106(510): vi+80 pp.
[2] Bachman, G. (2010). On ternary inclusion-exclusion polynomials. Integers. 10: 623–638.
[3] Bateman, P. (1949). Note on the coefficients of the cyclotomic polynomial. Bull. Amer. Math. Soc. 55: 1180–1181.
[4] Beiter, M. (1964). The midterm coefficient of the cyclotomic polynomial $F_{pq}(x)$. Amer. Math. Monthly. 71(7): 769–770.
[5] Beiter, M. (1968). Magnitude of the coefficients of the cyclotomic polynomials $F_{pqr}(x)$. Amer. Math. Monthly. 75(4): 370–372.
[6] Dresden, G. (2004). On the middle coefficient of a cyclotomic polynomial. Amer. Math. Monthly. 111(6): 531–533.
[7] Lam, T., Leung, K. (1996). On the cyclotomic polynomial $\Phi_{pq}(X)$. Amer. Math. Monthly. 103(7): 562–564.
[8] Lehmer, E. (1936). On the magnitude of the coefficients of the cyclotomic polynomial. Bull. Amer. Math. Soc. 42(6): 389–392.
[9] Moree, P. (2014). Numerical semigroups, cyclotomic polynomials, and Bernoulli numbers. Amer. Math. Monthly. 121(1): 890–902.
[10] Thangadurai, R. (2000). On the coefficients of cyclotomic polynomials. In: Adhikari, S., Katre, S., Thakur, D., eds. Cyclotomic Fields and Related Topics. Pune: Bhaskaracharya Pratishthana, pp. 311–322.

Department of Mathematics, University of Illinois at Urbana-Champaign, Urbana, IL 61820 [email protected]
Note on Dirichlet’s Sinc Integral David M. Bradley
Abstract. We offer an alternative method for evaluating Dirichlet’s integral, i.e., the integral of the sinc function sin(t)/t over the positive real line. Our method employs the fundamental theorem of calculus for complex path integrals, but avoids the customary invocation of Cauchy’s theorem.
In this note, we provide an alternative derivation of the Dirichlet integral evaluation
$$\int_0^\infty \frac{\sin t}{t}\,dt = \frac{\pi}{2}. \tag{1}$$
For a selection of real-variable proofs of (1), see [2, Example 3, §10.16, p. 286], [3, Art. 165, Example 3, p. 465, or Art. 173, p. 488, or Example 10, p. 515], [4, Exercise 59, §2.6, p. 77], or [7]. Proofs that venture into the complex realm tend to rely on Cauchy's theorem and estimates for $e^{iz}/z$ along portions of a closed contour of arbitrarily large extent, indented to avoid the pole at the origin. Typically, the contour is either an indented rectangle [1, Chapter 4, §5.3.3, pp. 155–157] or an indented semicircle [5, Chapter II, §46, Example 2, pp. 98–99]. By contrast, our approach employs little beyond the fundamental theorem of calculus for complex path integrals. We begin by observing that for any nonzero real or complex $t$,
$$\frac{\sin t}{t} = \frac{e^{it} - e^{-it}}{2it} = \frac{1}{2}\int_{-1}^{1} e^{itz}\,dz. \tag{2}$$
The integral representation (2) is valid for any rectifiable path in the complex plane that joins $-1$ to $1$, simply because $z \mapsto e^{itz}/(it)$ is a primitive of $e^{itz}$. Accordingly, we choose a semicircular path in the upper half-plane. Our path may be parametrized by $z = e^{i\theta}$, where $\theta$ decreases through real values from $\pi$ to $0$. Then, for any real $x$, integrating (2) over real $t$ between $0$ and $x$ yields
$$\begin{aligned}
\int_0^x \frac{\sin t}{t}\,dt &= \frac{1}{2}\int_0^x \int_\pi^0 e^{ite^{i\theta}}\,ie^{i\theta}\,d\theta\,dt = \frac{1}{2}\int_\pi^0 \int_0^x e^{ite^{i\theta}}\,ie^{i\theta}\,dt\,d\theta \\
&= \frac{1}{2}\int_\pi^0 \left(e^{ixe^{i\theta}} - 1\right) d\theta = \frac{\pi}{2} - \frac{1}{2}\int_0^\pi e^{ix(\cos\theta + i\sin\theta)}\,d\theta.
\end{aligned}$$
The interchange of integration order in the foregoing calculation is justified by continuity of the integrand. Consequently,
$$\frac{\pi}{2} - \int_0^x \frac{\sin t}{t}\,dt = \frac{1}{2}\int_0^\pi e^{-x\sin\theta}\left[\cos(x\cos\theta) + i\sin(x\cos\theta)\right] d\theta. \tag{3}$$

doi.org/10.1080/00029890.2021.1856585 MSC: Primary 42A38, Secondary 30E20; 26A42
Since the left-hand side of equation (3) is real, the right-hand side must also be. Denoting the former by $\Delta(x)$, we infer that
$$\Delta(x) = \frac{1}{2}\int_0^\pi e^{-x\sin\theta}\cos(x\cos\theta)\,d\theta = \int_0^{\pi/2} e^{-x\sin\theta}\cos(x\cos\theta)\,d\theta.$$
Next, notice that if $0 \le \theta \le \pi/2$, then $\sin\theta \ge 2\theta/\pi$ because the graph of sine is above the chord. This inequality is called Jordan's inequality; see, e.g., [6, Chapter 2, §2.3, p. 33]. It follows that for any positive real $x$,
$$|\Delta(x)| \le \int_0^{\pi/2} e^{-x\sin\theta}\,|\cos(x\cos\theta)|\,d\theta \le \int_0^{\pi/2} e^{-2x\theta/\pi}\,d\theta = \frac{\pi}{2}\cdot\frac{1 - e^{-x}}{x}.$$
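Both the real part of identity (3) and the bound obtained from Jordan's inequality lend themselves to a quick numerical sanity check. The following sketch assumes nothing beyond the formulas above; the composite Simpson rule and the function names (`simpson`, `sine_integral`, `delta`, `jordan_bound`) are illustrative choices made here, not part of the note.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def sinc(t):
    """sin(t)/t, extended by continuity to the value 1 at t = 0."""
    return 1.0 if t == 0 else math.sin(t) / t


def sine_integral(x):
    """Numerical value of the integral of sin(t)/t over [0, x]."""
    return simpson(sinc, 0.0, x)


def delta(x):
    """Integral over [0, pi/2] of exp(-x sin(theta)) * cos(x cos(theta))."""
    def integrand(theta):
        return math.exp(-x * math.sin(theta)) * math.cos(x * math.cos(theta))
    return simpson(integrand, 0.0, math.pi / 2)


def jordan_bound(x):
    """The upper bound (pi/2) * (1 - exp(-x)) / x for positive x."""
    return (math.pi / 2) * (1 - math.exp(-x)) / x


if __name__ == "__main__":
    for x in (1.0, 5.0, 20.0):
        print(x, math.pi / 2 - sine_integral(x), delta(x), jordan_bound(x))
```

The first two printed columns should agree to many digits, and both should sit below the third, which decays like $1/x$.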
Letting x tend to infinity through positive real values, we deduce the evaluation (1). ACKNOWLEDGMENTS. It is a pleasure to thank the editor and the anonymous referees for their helpful comments and suggestions.
ORCID David M. Bradley
http://orcid.org/0000-0003-2952-2366
REFERENCES
[1] Ahlfors, L. V. (1966). Complex Analysis, 2nd ed. New York, NY: McGraw-Hill.
[2] Apostol, T. M. (1974). Mathematical Analysis, 2nd ed. Reading, MA: Addison-Wesley.
[3] Bromwich, T. J. I'a (1991). An Introduction to the Theory of Infinite Series, 3rd ed. New York, NY: Chelsea.
[4] Folland, G. (1999). Real Analysis: Modern Techniques and Their Applications, 2nd ed. New York, NY: Wiley.
[5] Goursat, É. (1916). Mathematical Analysis, Part I of Volume II: Functions of a Complex Variable, 2nd ed. (Hedrick, E. R., Dunkel, O., trans.) Boston, MA: Ginn.
[6] Mitrinović, D. S., Vasić, P. M. (1970). Analytic Inequalities. New York, NY: Springer.
[7] Williams, K. S. (1971). Note on $\int_0^\infty (\sin x/x)\,dx$. Math. Mag. 44(1): 9–11. jstor.org/stable/2688849

Department of Mathematics and Statistics, University of Maine, Orono, ME 04469 [email protected], [email protected]
Widder-style Real Analytic Functions

The uniform limit of polynomials on a real interval is continuous, but need not even be differentiable (e.g., Weierstrass's function). Widder [2, Chapter 4] proved that the uniform limit on $[0, 1]$ of polynomials with nonnegative real coefficients is real analytic, as the restriction of a complex analytic function in the closed unit disk. The uniform convergence of the sequence and the continuity of the limit are “feeding on each other,” in the sense that some of the assumptions in Widder's result might be superfluous. Indeed, we shall prove the following.

Proposition. The pointwise limit on $(0, 1)$ of polynomials with nonnegative real coefficients is real analytic.

Proof. Denote the polynomials by $P_n$, $n = 0, 1, 2, \dots$, and the pointwise limit by $f$. By passing to complex variables, note that the polynomials $P_n$ are uniformly bounded on every closed disk $\{z \in \mathbb{C} : |z| \le r\}$ for every $0 < r < 1$. In fact, let $P_n(z) = \sum_{k=0}^{\infty} a_k^n z^k$, where $(a_k^n)_{k \ge 0}$ is a sequence of nonnegative numbers with only a finite number of nonzero terms, for each $n = 0, 1, 2, \dots$. Then
$$|P_n(z)| \le \sum_{k=0}^{\infty} a_k^n |z|^k \le \sum_{k=0}^{\infty} a_k^n r^k = P_n(r) < C = C(r)$$
for $|z| \le r$, because the sequence $\{P_n(r)\}_{n \ge 0}$ is convergent by hypothesis and is therefore bounded. As the sequence $\{P_n\}_{n \ge 0}$ is convergent on a segment of the open unit disk, Osgood's classical theorem says that it is also convergent inside the entire open unit disk; moreover, the convergence is uniform on every compact subset of the open unit disk. This implies that $\lim_{n \to \infty} P_n(z)$ is complex analytic in the open unit disk, and so its restriction to $(0, 1)$, i.e., our function $f$, is real analytic.

In other words, pointwise convergence of $\{P_n\}_{n \ge 0}$ implies the analyticity of $f$ on $(0, 1)$. To include $x = 1$ in the domain of analyticity, Whitley [1, p. 51] assumed that $f$ is continuous at $x = 1$, and proved that the latter also implies the uniform convergence of a subsequence of $\{P_n\}_{n \ge 0}$ towards $f$, which is a partial, but solid, return to Widder's original result. The following example shows that our proposition is the best we can do under these circumstances: $P_n(x) = x^n + \sum_{k=0}^{n} 2^{-k} x^k$ converges pointwise on $[0, 1]$ toward $f(x) = \sum_{n=0}^{\infty} 2^{-n} x^n$ for $0 \le x < 1$, $f(1) = 3$, and the latter is discontinuous at $x = 1$.

REFERENCES
[1] Whitley, R. (1976). Limits of generalized polynomials with nonnegative coefficients. J. Approximation Theory. 18(1): 50–56.
[2] Widder, D. V. (1946). The Laplace Transform. Princeton, NJ: Princeton Univ. Press.
—Submitted by George Stoica, Saint John, New Brunswick, Canada doi.org/10.1080/00029890.2021.1856584 MSC: Primary 30C10, Secondary 30A10; 30B40
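The closing example of the note can be explored numerically. The sketch below uses hypothetical names `P` and `f`, and relies only on the geometric series $\sum_{n \ge 0} 2^{-n} x^n = 2/(2 - x)$ for $0 \le x < 1$; it shows the polynomials converging pointwise while the limit jumps from values near 2 to the value 3 at $x = 1$.

```python
def P(n, x):
    """The polynomial P_n(x) = x**n + sum_{k=0}^{n} 2**(-k) * x**k."""
    return x ** n + sum((x / 2) ** k for k in range(n + 1))


def f(x):
    """Pointwise limit on [0, 1]: 2 / (2 - x) for 0 <= x < 1, and 3 at x = 1."""
    if not 0 <= x <= 1:
        raise ValueError("x must lie in [0, 1]")
    return 3.0 if x == 1 else 2.0 / (2.0 - x)


if __name__ == "__main__":
    for x in (0.0, 0.5, 0.9, 0.99, 1.0):
        print(x, P(2000, x), f(x))
```

For $x$ close to 1 the term $x^n$ decays slowly, which is why a large index such as 2000 is used in the demonstration.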
PROBLEMS AND SOLUTIONS Edited by Daniel H. Ullman, Daniel J. Velleman, and Douglas B. West with the collaboration of Paul Bracken, Ezra A. Brown, Zachary Franco, László Lipták, Rick Luttmann, Hosam Mahmoud, Frank B. Miles, Lenhard Ng, Kenneth Stolarsky, Richard Stong, Stan Wagon, Lawrence Washington, and Li Zhou. Proposed problems should be submitted online at americanmathematicalmonthly.submittable.com/submit. Proposed solutions to the problems below should be submitted by July 31, 2021, via the same link. More detailed instructions are available online. Proposed problems must not be under consideration concurrently at any other journal nor be posted to the internet before the deadline date for solutions. An asterisk (*) after the number of a problem or a part of a problem indicates that no solution is currently available. In memory of Robin J. Chapman (1963–2020), an exceptional and prolific contributor.
PROBLEMS

12237. Proposed by Donald E. Knuth, Stanford University, Stanford, CA. Let $x_0 = 1$ and $x_{n+1} = x_n + x_n^{3/10}$ for $n \ge 0$. What are the first 40 decimal digits of $x_n$ when $n = 10^{100}$?

12238. Proposed by Tran Quang Hung, Hanoi, Vietnam. Let $ABCD$ be a convex quadrilateral with $AD = BC$. Let $P$ be the intersection of the diagonals $AC$ and $BD$, and let $K$ and $L$ be the circumcenters of triangles $PAD$ and $PBC$, respectively. Show that the midpoints of segments $AB$, $CD$, and $KL$ are collinear.

12239. Proposed by David Altizio, University of Illinois, Urbana, IL. Determine all positive integers $r$ such that there exist at least two pairs of positive integers $(m, n)$ satisfying the equation $2^m = n! + r$.

12240. Proposed by Yue Liu, Fuzhou University, Fuzhou, China, and Fuzhen Zhang, Nova Southeastern University, Fort Lauderdale, FL. We denote by $A^*$ the conjugate transpose of the matrix $A$.
(a) Let $x \in \mathbb{C}^m$ be a unit column vector. Find the eigenvalues of the $(m+1)$-by-$(m+1)$ matrices
$$\begin{pmatrix} x^*x & x^* \\ x & 0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} xx^* & x \\ x^* & 0 \end{pmatrix}.$$
(b) More generally, let $X$ be an $m$-by-$n$ complex matrix, and let $\rho$ be any real number. Find the eigenvalues of the $(m+n)$-by-$(m+n)$ matrices
$$\begin{pmatrix} X^*X & X^* \\ X & \rho I_m \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} XX^* & X \\ X^* & \rho I_n \end{pmatrix}.$$

12241. Proposed by Ovidiu Furdui and Alina Sîntămărian, Technical University of Cluj-Napoca, Cluj-Napoca, Romania. Prove
$$\sum_{n=1}^{\infty} (-1)^n\, n \left(\sum_{k=n+1}^{2n} \frac{1}{k} - \ln 2 + \frac{1}{4n}\right) = \frac{\ln 2 - 1}{8}.$$

doi.org/10.1080/00029890.2021.1861418
12242. Proposed by Elena Corobea, Technical College Carol I, Constanța, Romania. For $n \ge 1$, let
\[ I_n = \int_0^1 \frac{\left( \sum_{k=0}^{n} x^k/(2k+1) \right)^{2022}}{\left( \sum_{k=0}^{n+1} x^k/(2k+1) \right)^{2021}}\, dx. \]
Let $L = \lim_{n\to\infty} I_n$. Compute $L$ and $\lim_{n\to\infty} n(I_n - L)$.

12243. Proposed by M. L. Glasser, Clarkson University, Potsdam, NY. For $a > 0$, evaluate
\[ \int_0^a \frac{t}{\sinh t\, \sqrt{1 - \operatorname{csch}^2 a \cdot \sinh^2 t}}\, dt. \]
SOLUTIONS

A Trigonometric Identity From a Generating Function

12117 [2019, 469]. Proposed by Michel Bataille, Rouen, France. Let $n$ be a nonnegative integer. Prove
\[ \frac{\sin^{n+1}(4\pi/7)}{\sin^{n+2}(\pi/7)} - \frac{\sin^{n+1}(\pi/7)}{\sin^{n+2}(2\pi/7)} + (-1)^n \frac{\sin^{n+1}(2\pi/7)}{\sin^{n+2}(4\pi/7)} = 2\sqrt{7} \sum (-1)^{n-i}\, 2^i\, \frac{(i+j+k)!}{i!\, j!\, k!}, \]
where the sum is taken over all triples $(i, j, k)$ of nonnegative integers satisfying $i + 2j + 3k = n$.

Solution by Li Zhou, Polk State College, Winter Haven, FL. Consider the generating function $g(x) = x/(1 - 2x - x^2 + x^3)$ with formal power series $\sum_{n=0}^{\infty} a_n x^n$. We have
\[ g(x) = \frac{x}{1 - (2x + x^2 - x^3)} = x \sum_{m=0}^{\infty} (2x + x^2 - x^3)^m = \sum_{n=0}^{\infty} \left( \sum \frac{(i+j+k)!}{i!\, j!\, k!}\, 2^i (-1)^k \right) x^{n+1}, \]
where the sum in the coefficient of $x^{n+1}$ is over all triples $(i, j, k)$ of nonnegative integers satisfying $i + 2j + 3k = n$. Therefore, the right side of the requested identity is $2\sqrt{7}\, a_{n+1}$.

We claim that the zeros of the polynomial $1 - 2x - x^2 + x^3$ in the denominator of the generating function are $-2\cos(2\pi j/7)$ for $j \in \{1, 2, 3\}$. To see why, let $\zeta = \exp(2\pi i/7)$, so that $-2\cos(2\pi j/7) = -\zeta^j - \zeta^{-j}$. Substituting $x = -\zeta^j - \zeta^{-j}$ into $1 - 2x - x^2 + x^3$ yields
\[ -\zeta^{-3j}\left( \zeta^{6j} + \zeta^{5j} + \zeta^{4j} + \zeta^{3j} + \zeta^{2j} + \zeta^{j} + 1 \right) = 0. \]
With $\alpha = 1/(2\cos(3\pi/7))$, $\beta = 1/(2\cos(\pi/7))$, and $\gamma = -1/(2\cos(2\pi/7))$, the zeros are $1/\alpha$, $1/\beta$, and $1/\gamma$. Since the product of the zeros is the negative of the constant term, we have $\alpha\beta\gamma = -1$, whence
\[ \alpha = 4\cos(\pi/7)\cos(2\pi/7) = \frac{\sin(4\pi/7)}{\sin(\pi/7)}, \]
and similarly
\[ \beta = \frac{\sin(\pi/7)}{\sin(2\pi/7)} \quad\text{and}\quad \gamma = -\frac{\sin(2\pi/7)}{\sin(4\pi/7)}. \]
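Now write the partial fraction decomposition of $g(x)$:
\[ g(x) = \frac{x}{(1 - \alpha x)(1 - \beta x)(1 - \gamma x)} = \frac{A}{1 - \alpha x} + \frac{B}{1 - \beta x} + \frac{C}{1 - \gamma x} = \sum_{n=0}^{\infty} (A\alpha^n + B\beta^n + C\gamma^n)\, x^n, \]
so $a_{n+1} = A\alpha^{n+1} + B\beta^{n+1} + C\gamma^{n+1}$. We calculate the constants $A$, $B$, and $C$:
\[ A = \frac{\alpha}{(\alpha - \beta)(\alpha - \gamma)} = \frac{1}{4(\cos(\pi/7) - \cos(3\pi/7))(\cos(2\pi/7) + \cos(3\pi/7))} = \frac{1}{16 \sin(\pi/7) \sin(2\pi/7) \cos(5\pi/14) \cos(\pi/14)} = \frac{1}{2\sqrt{7}\, \sin(\pi/7)}, \]
where the final equality follows from the identity
\[ 8 \sin(\pi/7) \sin(2\pi/7) \sin(3\pi/7) = \sqrt{7}. \]
(This in turn follows from the well-known identity $\prod_{k=1}^{m-1} (2\sin(k\pi/m)) = m$.) Similarly,
\[ B = \frac{\beta}{(\beta - \gamma)(\beta - \alpha)} = -\frac{1}{2\sqrt{7}\, \sin(2\pi/7)} \quad\text{and}\quad C = \frac{\gamma}{(\gamma - \alpha)(\gamma - \beta)} = -\frac{1}{2\sqrt{7}\, \sin(4\pi/7)}. \]
Therefore, the left side of the requested identity is $2\sqrt{7}\,(A\alpha^{n+1} + B\beta^{n+1} + C\gamma^{n+1})$, which equals $2\sqrt{7}\, a_{n+1}$, completing the proof.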
Also solved by R. Chapman (UK), G. Fera (Italy), O. Kouba (Syria), P. Lalonde (Canada), C. R. Pranesachar (India), A. Stadler (Switzerland), R. Stong, R. Tauraso (Italy), and the proposer.
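The identity in Problem 12117 can be spot-checked numerically. The following Python sketch is not part of the published solution; the function names `coefficient_sum`, `lhs`, and `rhs` are hypothetical, and the comparison is a floating-point check for small n rather than a proof.

```python
import math
from math import factorial, pi, sin, sqrt


def coefficient_sum(n):
    """Sum of (-1)^(n-i) 2^i (i+j+k)!/(i! j! k!) over all i + 2j + 3k = n."""
    total = 0
    for k in range(n // 3 + 1):
        for j in range((n - 3 * k) // 2 + 1):
            i = n - 2 * j - 3 * k
            multinomial = factorial(i + j + k) // (
                factorial(i) * factorial(j) * factorial(k)
            )
            total += (-1) ** (n - i) * 2 ** i * multinomial
    return total


def lhs(n):
    """Left side of the identity, evaluated in floating point."""
    s1, s2, s4 = sin(pi / 7), sin(2 * pi / 7), sin(4 * pi / 7)
    return (
        s4 ** (n + 1) / s1 ** (n + 2)
        - s1 ** (n + 1) / s2 ** (n + 2)
        + (-1) ** n * s2 ** (n + 1) / s4 ** (n + 2)
    )


def rhs(n):
    """Right side of the identity: 2*sqrt(7) times the integer sum."""
    return 2 * sqrt(7) * coefficient_sum(n)


if __name__ == "__main__":
    for n in range(8):
        print(n, coefficient_sum(n), lhs(n), rhs(n))
```

The integer sums produced by `coefficient_sum` are the coefficients $a_{n+1}$ of the generating function in the solution, so the same loop also illustrates the recurrence implied by the denominator $1 - 2x - x^2 + x^3$.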
Greatest Prime Divisors for a Quadratic

12126 [2019, 658]. Proposed by Marian Tetiva, National College “Gheorghe Roșca Codreanu,” Bîrlad, Romania. Let $P(n)$ be the greatest prime divisor of the positive integer $n$. Prove that $P(n^2 - n + 1) < P(n^2 + n + 1)$ and $P(n^2 - n + 1) > P(n^2 + n + 1)$ each hold for infinitely many positive integers $n$.

Solution by Robert Tauraso, Università di Roma “Tor Vergata,” Rome, Italy. Define a sequence $q$ by $q_n = P(n^2 + n + 1)$. Since $n^2 - n + 1 = (n-1)^2 + (n-1) + 1$, we obtain $P(n^2 - n + 1) = q_{n-1}$. Moreover,
\[ \gcd(n^2 - n + 1,\, n^2 + n + 1) = \gcd(n^2 - n + 1,\, 2n) = \gcd(n^2 - n + 1,\, n) = \gcd(1, n) = 1, \]
which implies $q_n \ne q_{n-1}$. Therefore, falseness of the claim requires $q$ to be eventually strictly decreasing or strictly increasing. That contradicts $q_{n^2} = \max(q_n, q_{n-1})$, which follows from $(n^2)^2 + (n^2) + 1 = (n^2 + n + 1)(n^2 - n + 1)$.

Also solved by B. Ahn (Korea), R. Boukharfane (Saudi Arabia), R. Chapman (UK), K. Egamberganov (France), K. Gatesman, E. A. Herman, N. Hodges (UK), Y. J. Ionin, M. Javaheri, P. W. Lindstrom, R. Molinari, A. Natian, A. Pathak, M. Reid, C. Schacht, N. C. Singer, A. Stenger, R. Stong, L. Zhou, GCHQ Problem Solving Group (UK), ONU-SOLVE Problem Solving Club, and the proposer.
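The key relation $q_{n^2} = \max(q_n, q_{n-1})$ in the solution to Problem 12126 is easy to test by machine. The sketch below is an illustration only: the helper names are hypothetical, trial division is used purely for simplicity, and the convention $P(1) = 1$ (needed for $q_0$) is an assumption of this sketch rather than something stated in the problem.

```python
def greatest_prime_divisor(n):
    """Largest prime factor of n by trial division; returns 1 for n = 1 (a convention)."""
    largest = 1
    p = 2
    while p * p <= n:
        while n % p == 0:
            largest = p
            n //= p
        p += 1
    if n > 1:
        largest = n  # whatever remains is a prime larger than every earlier factor
    return largest


def q(n):
    """q_n = P(n^2 + n + 1), as in the solution."""
    return greatest_prime_divisor(n * n + n + 1)


def count_directions(limit):
    """Count n in 1..limit with q_{n-1} < q_n and with q_{n-1} > q_n."""
    less = sum(1 for n in range(1, limit + 1) if q(n - 1) < q(n))
    greater = sum(1 for n in range(1, limit + 1) if q(n - 1) > q(n))
    return less, greater


if __name__ == "__main__":
    print(all(q(n * n) == max(q(n), q(n - 1)) for n in range(1, 60)))
    print(count_directions(200))
```

Both counts returned by `count_directions` are positive for the small range tried here, which is consistent with (though of course no substitute for) the statement that each inequality holds infinitely often.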
Trailing Zeros of Fibonacci Numbers

12128 [2019, 658]. Proposed by Omran Kouba, Higher Institute for Applied Sciences and Technology, Damascus, Syria. Let $F_n$ be the $n$th Fibonacci number, defined by $F_0 = 0$, $F_1 = 1$, and $F_{n+1} = F_n + F_{n-1}$ for $n \ge 1$. Find, in terms of $n$, the number of trailing zeros in the decimal representation of $F_n$.

Solution by the Armstrong Problem Solvers, Georgia Southern University, Savannah, GA. For a positive integer $n$, let $z(n)$ denote the number of trailing zeros in the decimal representation of $F_n$. Write $n = 2^a \cdot 3^b \cdot 5^c \cdot d$, where $a$, $b$, and $c$ are nonnegative and $d$ is relatively prime to 30. We prove
\[ z(n) = \begin{cases} 0 & \text{if } bc = 0, \\ 1 & \text{if } a = 0 < bc, \\ \min\{a + 2,\, c\} & \text{if } abc > 0. \end{cases} \]
We study the divisibility of $F_n$ by powers of 2 and 5. It is well known (and easily proved by induction) that $F_n$ is even if and only if $n$ is a multiple of 3, and $F_n$ is divisible by 5 if and only if $n$ is divisible by 5. Thus, if $bc = 0$, then $F_n$ is odd or not a multiple of 5, so $z(n) = 0$. For a prime $p$, write $\nu_p(n)$ for the exponent of the highest power of $p$ dividing $n$. Our three lemmas about powers of 2 are proved by induction.

Lemma 1. If $n \equiv 3 \pmod 6$, then $\nu_2(F_n) = 1$.

Proof. For the basis, $\nu_2(F_3) = \nu_2(2) = 1$. For larger $n$, given $\nu_2(F_{n-6}) = 1$, we have $F_{n-6} = 2m$, where $m$ is odd. Five applications of the recurrence yield $F_n = 8F_{n-5} + 5F_{n-6}$. Hence $F_n = 2(4F_{n-5} + 5m)$, where $4F_{n-5} + 5m$ is odd, so $\nu_2(F_n) = 1$.

Lemma 2. If $n \equiv 6 \pmod{12}$, then $\nu_2(F_n) = 3$.

Proof. For the basis, $\nu_2(F_6) = \nu_2(8) = 3$. For larger $n$, given $\nu_2(F_{n-12}) = 3$, we have $F_{n-12} = 8m$, where $m$ is odd. Using the identity $F_{r+1}F_{s+1} + F_r F_s = F_{r+s+1}$, we obtain
\[ F_n = F_{12}F_{n-11} + F_{11}F_{n-12} = 144F_{n-11} + 89(8m) = 8(18F_{n-11} + 89m). \]
Since 18 is even and $89m$ is odd, $\nu_2(F_n) = 3$.

Lemma 3. If $n$ is a multiple of 6, then $\nu_2(F_n) = \nu_2(n) + 2$.
Proof. Write $n = 2^a \cdot 3s$, where $a, s \in \mathbb{N}$ and $s$ is odd; we use induction on $a$. For the basis, $n$ is an odd multiple of 6, so $\nu_2(n) = 1$, and Lemma 2 yields $\nu_2(F_n) = 3 = \nu_2(n) + 2$. To move from $n$ to $2n$, we are given $\nu_2(n) = a$ and $\nu_2(F_n) = a + 2$. Thus $\nu_2(2n) = a + 1$ and $F_n = 2^{a+2}m$, where $m$ is odd. Now
\[ F_{2n} = F_n F_{n+1} + F_n F_{n-1} = F_n(F_n + 2F_{n-1}) = 2^{a+2}m\,(2^{a+2}m + 2F_{n-1}) = 2^{a+3}m\,(2^{a+1}m + F_{n-1}). \]
Note that $n - 1$ is not a multiple of 3, so $F_{n-1}$ is odd. Hence $2^{a+1} \cdot m + F_{n-1}$ is odd, and $\nu_2(F_{2n}) = a + 3$, as desired.

The next two lemmas complete the study of powers of 5.

Lemma 4. If $t \in \mathbb{N}$ and $p$ is prime, then
\[ \nu_p(t!) = \sum_{j=1}^{\infty} \left\lfloor \frac{t}{p^j} \right\rfloor < \frac{t}{p - 1}. \]

Proof. The equality is immediate. For the inequality,
\[ \sum_{j=1}^{\infty} \left\lfloor \frac{t}{p^j} \right\rfloor < \sum_{j=1}^{\infty} \frac{t}{p^j} = \frac{t/p}{1 - (1/p)} = \frac{t}{p - 1}. \]
Lemma 5. $\nu_5(F_n) = \nu_5(n)$.

Proof. Binet’s formula yields
\[ F_n = \frac{(1 + \sqrt{5})^n - (1 - \sqrt{5})^n}{2^n \sqrt{5}}. \]
Expanding the numerator using the binomial theorem cancels the even powers of $\sqrt{5}$. Letting $m = \lfloor (n-1)/2 \rfloor$ yields
\[ F_n = \frac{1}{2^n \sqrt{5}} \sum_{k=0}^{m} \binom{n}{2k+1} \left( (\sqrt{5})^{2k+1} - (-\sqrt{5})^{2k+1} \right) = \frac{1}{2^{n-1}} \sum_{k=0}^{m} \binom{n}{2k+1} 5^k = \frac{n}{2^{n-1}} + \frac{1}{2^{n-1}} \sum_{k=1}^{m} \binom{n}{2k+1} 5^k. \]
Now let $n = 5^c d$, where $c \ge 0$ and $d$ is not divisible by 5. We have
\[ 2^{n-1} F_n = 5^c d + \sum_{k=1}^{m} \frac{\prod_{i=0}^{2k} (5^c d - i)}{(2k+1)!}\, 5^k. \]
By Lemma 4, $\nu_5((2k+1)!) < (2k+1)/4 < k$. Setting $k' = k - \nu_5((2k+1)!)$ yields
\[ 2^{n-1} F_n = 5^c d + \sum_{k=1}^{m} 5^{c+k'} s_k = 5^c \left( d + \sum_{k=1}^{m} 5^{k'} s_k \right), \]
where $s_1, \ldots, s_m$ are integers. Since $k' \ge 1$ and $d$ is not divisible by 5, $\nu_5(F_n) = \nu_5(2^{n-1}F_n) = c = \nu_5(n)$.
Since $z(n)$ is equal to the smaller of $\nu_2(F_n)$ and $\nu_5(F_n)$, Lemmas 1, 3, and 5 give the desired formula.

Editorial comment. For divisibility of Fibonacci numbers by prime powers, solvers cited one or more of the following: J. H. Halton (1966), On the divisibility properties of Fibonacci numbers, The Fibonacci Quarterly 4(3): 217–240; T. Lengyel (1995), The order of the Fibonacci and Lucas numbers, The Fibonacci Quarterly 33(3): 234–239; M. Renault (2013), The period, rank, and order of the (a, b)-Fibonacci sequence mod m, Math. Magazine 86: 372–380; and Problem 11968 [2017, 274; 2019, 85] from this Monthly.

Also solved by H. Al-Assad (Syria), R. Chapman (UK), G. Fera (Italy), N. Hodges (UK), E. J. Ionașcu, Y. J. Ionin, D. E. Knuth, J. H. Lindsey II, O. P. Lossers (Netherlands), G. Marks, R. Molinari, A. Natian, J. H. Nieto (Venezuela), A. Stadler (Switzerland), R. Stong, D. Terr, T. Wiandt, M. Wildon (UK), L. Zhou, and the proposer.
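The closed form for $z(n)$ in Problem 12128 can be compared against directly computed Fibonacci numbers. The following is a minimal sketch, assuming exact Python integers are acceptable for the range tested; the names `nu`, `z_formula`, `trailing_zeros`, and `mismatches` are hypothetical and do not come from the solution.

```python
def nu(p, n):
    """Exponent of the highest power of the prime p dividing the positive integer n."""
    count = 0
    while n % p == 0:
        n //= p
        count += 1
    return count


def z_formula(n):
    """Number of trailing zeros of F_n according to the three-case formula."""
    a, b, c = nu(2, n), nu(3, n), nu(5, n)
    if b == 0 or c == 0:
        return 0
    if a == 0:
        return 1
    return min(a + 2, c)


def trailing_zeros(value):
    """Count trailing decimal zeros of a positive integer."""
    count = 0
    while value % 10 == 0:
        value //= 10
        count += 1
    return count


def mismatches(limit):
    """Return every n in 1..limit where the formula disagrees with F_n itself."""
    bad = []
    prev, cur = 0, 1  # F_0 and F_1
    for n in range(1, limit + 1):
        # At this point cur equals F_n.
        if trailing_zeros(cur) != z_formula(n):
            bad.append(n)
        prev, cur = cur, prev + cur
    return bad


if __name__ == "__main__":
    print(mismatches(1500))
```

For instance, $n = 750 = 2 \cdot 3 \cdot 5^3$ falls in the third case with $\min\{a + 2, c\} = \min\{3, 3\} = 3$, while $n = 15$ falls in the second case and $F_{15} = 610$ ends in a single zero.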
Nested Radicals

12129 [2019, 659]. Proposed by Hideyuki Ohtsuka, Saitama, Japan. Compute
\[ \sqrt{2 + \sqrt{2 + \sqrt{2 + \cdots + \sqrt{2 - \sqrt{2 + \cdots}}}}}, \]
where the sequence of signs consists of $n - 1$ plus signs followed by a minus sign and repeats with period $n$.

Solution by Tamas Wiandt, Rochester Institute of Technology, Rochester, NY. The value is $2\cos\bigl(\pi/(2^n + 1)\bigr)$. Let
\[ a_k = \sqrt{2 + \varepsilon_1 \sqrt{2 + \varepsilon_2 \sqrt{2 + \cdots + \varepsilon_k \sqrt{2}}}}, \]
where $\varepsilon_j \in \{1, 0, -1\}$. A straightforward induction proves
\[ a_k = 2 \sin\left( \frac{\pi}{4} \sum_{i=0}^{k} \frac{\varepsilon_1 \cdots \varepsilon_i}{2^i} \right). \]
This result appears as problem 183 in G. Pólya, G. Szegő (1972), Problems and Theorems in Analysis I, Berlin: Springer-Verlag. It implies that the nested radical expression is convergent. Taking $\varepsilon_j = -1$ when $j$ is a multiple of $n$ and $\varepsilon_j = 1$ otherwise, we compute
\[ \sum_{i=0}^{\infty} \frac{\varepsilon_1 \cdots \varepsilon_i}{2^i} = \sum_{r=0}^{\infty} (-1)^r \sum_{s=rn}^{(r+1)n-1} \frac{1}{2^s} = \sum_{r=0}^{\infty} (-1)^r \frac{1}{2^{rn}} \left( 2 - \frac{1}{2^{n-1}} \right) = \left( 2 - \frac{1}{2^{n-1}} \right) \frac{1}{1 + 1/2^n} = 2 \cdot \frac{2^n - 1}{2^n + 1}. \]
For the limit $L$, we use $\sin x = \cos(\pi/2 - x)$ to compute
\[ L = 2 \sin\left( \frac{\pi}{4} \cdot 2 \cdot \frac{2^n - 1}{2^n + 1} \right) = 2 \cos\left( \frac{\pi}{2^n + 1} \right). \]

Also solved by B. Bradie, S. Brickman, R. Chapman (UK), H. Chen, G. A. Correia (Brazil), K. Egamberganov (France), G. Fera (Italy), M. Goldenberg & M. Kaplan, J. A. Grzesik, N. Hodges (UK), W. Janous (Austria), M. Javaheri, J. C. Kieffer, D. E. Knuth, O. Kouba (Syria), P. Lalonde (Canada), O. P. Lossers (Netherlands), J. McHugh, A. Natian, J. H. Nieto (Venezuela), M. Omarjee (Saudi Arabia), C. R. Pranesachar (India), J. Reid, A. Stadler (Switzerland), R. Stong, R. Tauraso (Italy), E. I. Verriest, S. Yadav (India), L. Zhou, GCHQ Problem Solving Group (UK), Missouri State University Problem Solving Group, Northwestern University Problem Solving Group, and the proposer.
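A truncated version of the nested radical in Problem 12129 can be evaluated from the inside out and compared with the closed form. This is a numerical sketch only: the truncation depth, the function names, and the tolerance are assumptions made for illustration, and agreement in floating point does not replace the induction cited in the solution.

```python
import math


def nested_radical(n, depth=60):
    """Evaluate a_depth with eps_j = -1 when j is a multiple of n, else +1."""
    value = math.sqrt(2.0)  # innermost sqrt(2)
    for j in range(depth, 0, -1):
        eps = -1.0 if j % n == 0 else 1.0
        # value always stays in [0, 2], so the radicand is never negative.
        value = math.sqrt(2.0 + eps * value)
    return value


def closed_form(n):
    """The claimed limit 2*cos(pi / (2^n + 1))."""
    return 2.0 * math.cos(math.pi / (2 ** n + 1))


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, nested_radical(n, depth=200), closed_form(n))
```

When $n = 1$ every sign is a minus sign, and both functions agree with the fixed point of $x = \sqrt{2 - x}$, namely $x = 1 = 2\cos(\pi/3)$.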
An Application of the Inverse Function Theorem

12131 [2019, 659]. Proposed by Michael Maltenfort, Northwestern University, Evanston, IL. Let $m$ and $n$ be positive integers with $n \ge 2$. Suppose that $U$ is an open subset of $\mathbb{R}^m$ and $f : U \to \mathbb{R}^n$ is continuously differentiable. Let $K$ be the set of all $x \in U$ such that the derivative $Df(x)$, as a linear transformation, has rank less than $n$. Prove that if $f(K)$ is countable, $U \setminus K \ne \emptyset$, and $f(U)$ is closed, then $f(U) = \mathbb{R}^n$.

Solution by the proposer. We need the following lemma.

Lemma. Let $A$ be a nonempty open set in $\mathbb{R}^n$, where $n \ge 2$. If the boundary of $A$ is countable, then $A$ is dense in $\mathbb{R}^n$.

Proof. We prove the contrapositive. Assume $A$ is not dense in $\mathbb{R}^n$. This guarantees an open ball $B$ in $\mathbb{R}^n$ that is disjoint from $A$. Select $a \in A$. Since $n \ge 2$, we can choose a closed line segment $l$ contained in $B$ such that $a$ is not on the line that contains $l$. For every $x \in l$, let $\sigma_x$ be the closed line segment with endpoints $a$ and $x$. Since $a \in A$ and $x \notin A$, $\sigma_x$ contains a boundary point of $A$, say $d_x$, and since $A$ is open, $d_x \ne a$. By our choice of $l$, for $x \ne x'$ we have $\sigma_x \cap \sigma_{x'} = \{a\}$, and thus $d_x \ne d_{x'}$. Therefore the points $d_x$ for $x \in l$ are distinct, so the boundary of $A$ is uncountable.
Let $A = f(U \setminus K)$. We show that $A$ is open. Suppose $c \in U \setminus K$ and $c = (c_1, \ldots, c_m)$. Some $n$-by-$n$ minor of the matrix of partial derivatives at $c$ has nonzero determinant. For simplicity of notation, assume that this minor is determined by the first $n$ coordinates. Let $c' = (c_1, \ldots, c_n)$ and define $g : \mathbb{R}^n \to \mathbb{R}^m$ by the formula $g(x_1, \ldots, x_n) = (x_1, \ldots, x_n, c_{n+1}, \ldots, c_m)$. Notice that $g(c') = c$. Applying the inverse function theorem to $f \circ g$ at $c'$, we see that there exists an open set $V \subseteq g^{-1}(U)$ that contains $c'$ such that on $V$ the function $f \circ g$ has, first, a derivative with nonzero determinant, and, second, a continuous inverse. By the first property, $(f \circ g)(V) \subseteq f(U \setminus K)$, and by the second, $(f \circ g)(V)$ is open. Thus $(f \circ g)(V)$ is an open neighborhood of $f(c)$ that is contained in $f(U \setminus K)$.

Using this and the fact that $A \cup f(K)$ equals $f(U)$, which is closed, we conclude that the boundary of $A$ is contained in $f(K)$ and is therefore countable. By our lemma, $A$ is dense in $\mathbb{R}^n$. Since $f(U) \supseteq A$ and $f(U)$ is closed, $f(U) = \mathbb{R}^n$.

Editorial comment. This problem and solution were inspired by Theorem 1 in P. Liu and S. Liu (2018), On the surjectivity of smooth maps into Euclidean spaces and the fundamental theorem of algebra, this Monthly, 125(10): 941–943.

Also solved by J.-P. Grivaux (France), Á. Plaza & F. Perdomo (Spain), T. Schonbek, and R. Stong.
A Process of Decay on Positive Integers

12132 [2019, 755]. Proposed by K. S. Bhanu and Mukta Deshpande, Institute of Science, Nagpur, India, and P. G. Dixit, Modern College, Pune, India. Let $n$ be a positive integer, and let $X_0 = n + 1$. Repeatedly choose the integer $X_k$ uniformly at random among the integers $j$ with $1 \le j < X_{k-1}$, stopping when $X_m = 1$.
(a) What is the expected value of $m$?
(b) What is the expected value of $X_{m-1}$?

Solution by William J. Cowieson, Fullerton College, CA. The $n$th harmonic number $H_n$ equals $\sum_{k=1}^{n} 1/k$. The solution to (a) is $H_n$ and to (b) is $1 + H_n$.

(a) Let $a_n = E[m]$. For $n > 0$, there is at least one “jump” to $X_1$ and from there we may need more jumps to get to 1. Conditioning on $X_1$ yields the recurrence
\[ a_n = 1 + \frac{1}{n} \sum_{i=0}^{n-1} a_i, \]
with boundary condition $a_0 = 0$. Taking the difference of successive instances of the recurrence, appropriately normalized, yields $n a_n - (n-1) a_{n-1} = 1 + a_{n-1}$, which simplifies to $a_n = a_{n-1} + 1/n$. Thus $a_n = a_0 + \sum_{i=1}^{n} 1/i$, so $E[m] = a_n = H_n$.

(b) Let $b_n = E[X_{m-1}]$. Note that $X_{m-1} = n + 1$ when the process moves to 1 in one step. For $n > 1$, conditioning on $X_1$ thus yields
\[ b_n = \frac{1}{n} \left( n + 1 + \sum_{i=1}^{n-1} b_i \right). \]
As in (a), this simplifies to $b_n = b_{n-1} + 1/n$. Since $b_1 = 2$, the solution is $E[X_{m-1}] = b_n = 1 + H_n$.

Editorial comment. Stephen J. Herschkorn observed that the problem is analyzed in detail on pages 193–195 in Sheldon M. Ross (1996), Stochastic Processes, 2nd ed., Hoboken, NJ: John Wiley & Sons.
Also solved by R. A. Agnew, K. F. Andersen (Canada), E. Bojaxhiu (Albania) & E. Hysnelaj (Australia), N. Caro (Brazil), R. Chapman (UK), M. P. Cohen, C. Curtis, K. Gatesman, O. Geupel (Germany), Ab. Goel, Ar. Goel, N. Grivaux (France), N. Hodges (UK), J. C. Kieffer, O. Kouba (Syria), P. Lalonde (Canada), J. H. Lindsey II, P. W. Lindstrom, O. P. Lossers (Netherlands), M. D. Meyerson, A. Natian, P. Putalapattu, K. Schilling, A. Stadler (Switzerland), R. Stong, R. Tauraso (Italy), M. Vowe (Switzerland), T. Wiandt, L. Zhou, Armstrong Problem Solvers, GCHQ Problem Solving Group (UK), Missouri State Problem Solving Group, and the proposer.
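The two recurrences in the solution to Problem 12132 can be iterated in exact rational arithmetic and compared with the harmonic numbers. The sketch below assumes Python's standard `fractions` module; the function name `expectations` is hypothetical, and the code evaluates the recurrences as stated rather than simulating the random process.

```python
from fractions import Fraction


def expectations(n_max):
    """Return lists a, b with a[n] = E[m] and b[n] = E[X_{m-1}] when X_0 = n + 1."""
    a = [Fraction(0)]        # boundary condition a_0 = 0
    b = [None, Fraction(2)]  # b_0 is undefined; boundary condition b_1 = 2
    for n in range(1, n_max + 1):
        # a_n = 1 + (1/n) * (a_0 + ... + a_{n-1})
        a.append(1 + sum(a[:n], Fraction(0)) / n)
    for n in range(2, n_max + 1):
        # b_n = (n + 1 + b_1 + ... + b_{n-1}) / n
        b.append((n + 1 + sum(b[1:n], Fraction(0))) / n)
    return a, b


if __name__ == "__main__":
    a, b = expectations(10)
    harmonic = Fraction(0)
    for n in range(1, 11):
        harmonic += Fraction(1, n)
        print(n, a[n], b[n], harmonic)
```

Using `Fraction` avoids rounding, so the equalities $a_n = H_n$ and $b_n = 1 + H_n$ can be asserted exactly for every $n$ in the tested range.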
Quadrilaterals with Infinitely Many Inscribed Squares

12133 [2019, 755]. Proposed by Daniel Hu, Los Altos High School, Los Altos, CA. Let ABCD be a convex quadrilateral. Suppose that lines AB and CD meet at P, lines AD and BC meet at Q, and AC and BD meet at R. Prove that there are infinitely many squares with one vertex on each side of ABCD if and only if AC ⊥ BD and PR ⊥ QR.

Composite solution by Davis Problem Solving Group, Davis, CA, and the editors. Assume ABCD has two inscribed squares EFGH and IJKL, with E and I on AB, F and J on BC, G and K on CD, and H and L on DA. Suppose that the circumcircles (DGH) and (DKL) intersect at M. Since ∠MGH = ∠MDL = ∠MKL and ∠MHG = ∠MDK = ∠MLK, M is the center of a spiral similarity mapping GH to KL. Since EFGH and IJKL are squares, this spiral similarity maps EFGH to IJKL as well. It follows that triangles MEI, MFJ, MGK, and MHL are all similar. Let MS, MT, MU, and MV be altitudes of these four triangles, respectively. Since there is a spiral similarity centered at M that maps EFGH to STUV, STUV is also a square.

Let N be the foot of the perpendicular dropped from M onto AC. The point N lies both on the circle with diameter AM, as do S and V, and on the circle with diameter CM, as do T and U. Hence the common perpendicular bisector l of SV and TU passes through the centers of both circles, and l is therefore also the perpendicular bisector of MN. It follows that l is parallel to AC, and hence AC ⊥ SV. Similarly, the circumcircles (BSMT) and (DUMV) intersect at a point O on BD, the common perpendicular bisector of ST and UV is also the perpendicular bisector of MO and is parallel to BD, and hence BD ⊥ ST. We conclude that AC ⊥ BD and that MNRO is a rectangle whose center coincides with the center of STUV. The segment TV passes through this center, which is the midpoint of MR. Also, since TV is the perpendicular bisector of SU, it passes through the center of circumcircle (SMUP), which is the midpoint of MP.
Thus TV is parallel to PR. Likewise, SU is parallel to QR, and we have PR ⊥ QR.

Conversely, suppose AC ⊥ BD and PR ⊥ QR. Let A′, B′, C′, D′, P′, and Q′ be the images of A, B, C, D, P, and Q, respectively, under the inversion centered at R with power 1. Let W, X, Y, and Z be the midpoints of A′B′, B′C′, C′D′, and D′A′, respectively. Since WX and ZY are parallel to AC, and WZ and XY are parallel to BD, and AC ⊥ BD, WXYZ is a rectangle. Also, the circumcircles (A′B′R) and (C′D′R) are the inversive images of AB and CD, and they intersect at P′. They have centers W and Y, respectively, so WY is the perpendicular bisector of P′R. Likewise, XZ ⊥ Q′R. Therefore WY ⊥ XZ, and WXYZ is a square. Consequently,
\[ \frac{1}{AR} + \frac{1}{CR} = A'R + C'R = 2WX = 2XY = B'R + D'R = \frac{1}{BR} + \frac{1}{DR}, \]
or equivalently,
\[ \frac{AC}{BD} = \frac{AR \cdot CR}{BR \cdot DR}. \]
Now locate M such that ∠BAM = ∠CAD, ∠DAM = ∠CAB, ∠ABM = ∠DBC, and ∠CBM = ∠DBA. Let S, T, U, and V be the orthogonal projections of M onto AB, BC, CD, and DA, respectively. Since ∠BAM and ∠ABM are acute, S is between A and B. By similar triangles, AS/MS = AR/DR and BS/MS = BR/CR, so
\[ \frac{AS}{BS} = \frac{AR \cdot CR}{BR \cdot DR} = \frac{AC}{BD}. \]
Also, BT/BM = BR/AB and BS/BM = BR/BC, so
\[ \frac{BT}{BC} = \frac{BR \cdot BM}{AB \cdot BC} = \frac{BS}{AB} < 1. \]
Therefore T is between B and C, CT/BT = AS/BS = AC/BD, and ST ∥ AC. Moreover, BT/MT = BR/AR, so
\[ \frac{CT}{MT} = \frac{CT}{BT} \cdot \frac{BT}{MT} = \frac{AS}{BS} \cdot \frac{BR}{AR} = \frac{AR \cdot CR}{BR \cdot DR} \cdot \frac{BR}{AR} = \frac{CR}{DR}. \]
Therefore ∠BCM = ∠ACD and ∠DCM = ∠ACB. By similar reasoning, we find that U is between C and D, and V is between D and A. Furthermore, TU ∥ BD ∥ SV and UV ∥ AC. Therefore STUV is a rectangle. Finally, ST = (BT/BC) · AC and TU = (CT/BC) · BD, so
\[ \frac{TU}{ST} = \frac{CT}{BT} \cdot \frac{BD}{AC} = \frac{AC}{BD} \cdot \frac{BD}{AC} = 1. \]
Thus STUV is a square. Using M as the center of spiral similarity, we get infinitely many inscribed squares, such as EFGH in the figure, where triangles MES, MFT, MGU, and MHV are all similar.

Also solved by J. A. Grzesik (sufficiency only), N. Hodges (UK), R. Stong, and the proposer.
REVIEWS Edited by Darren Glass Department of Mathematics, Gettysburg College, Gettysburg, PA 17325
How to Be an Antiracist. By Ibram X. Kendi, One World, New York, NY, 2019. 320 pp., ISBN 978-0525509288, $27.00.
Reviewed by Margaret Reese Editor’s Note: During the summer of 2020, events including the murders of George Floyd and Breonna Taylor and subsequent protests by Black Lives Matter groups have inspired many people to undertake a deeper exploration of issues of equality and racial justice. We decided to take this moment to step back from the traditional types of books we review and instead take a look at a book that looks at some of these issues.
A racist idea is any idea that suggests one racial group is inferior or superior to another racial group in any way . . . An antiracist idea is any idea that suggests the racial groups are equals in all their apparent differences—that there is nothing right or wrong with any racial group. (p. 20)
Antiracism. Ibram X. Kendi’s book How to Be an Antiracist is an engaging history of racist and antiracist ideas. Kendi interweaves his own life story with lessons in the history of racism. He ties historical events with his own coming to consciousness through his educational experience from elementary school to graduate school, as well as his extra-curricular struggles with bullies, authority figures, and cancer. Through these experiences, Kendi learns about racism and what he has named antiracism. He recalls himself as a young racist full of assimilationist ideas and shares embarrassing stories from his youth that show that racist ideas are hidden, self-damaging, and not restricted to white people.

For the better part of my life I held both racist and antiracist ideas, supported both racist and antiracist policies; I’ve been antiracist one moment, racist in many more moments. To say Black people can’t be racist is to say all Black people are being antiracist at all times. My own story tells me that is not true. History agrees. (p. 144)
doi.org/10.1080/00029890.2021.1858659

From believing that racism stems from ignorance, the young Kendi grows to understand the source of racist policy as self-interest and power. Mainstream ideas about ending racism in the 1970s and 80s now read like assimilationist propaganda. As Kendi mixes stories from the American struggle for equality among the races, he maps the changes in his own attitudes and ambitions, exploring his own understandings as a child, teen, and young adult about his place in the world. As a student in college and in graduate school, Kendi grows in awareness of the struggles faced by people in various group intersections (race-gender, race-class, race-gender-class, race-sexuality), which brings him to the conclusion that a just society will have equal opportunity and treatment for all, without regard to these groupings. He becomes aware of the sorts of prejudices that he himself has as he
interacts with fellow students, especially ones falling in these group intersections. He argues for an antiracist world, where everyone is free to be who they are. All the American subgroups can be celebrated for their distinctiveness, not diminishing the overall American experiment but, in fact, fulfilling it. These groups and subgroups will continue to exist, and society will be more vibrant because of them. As a historian, Kendi provides a well-documented case, and the scholarly notes abound: newspaper and magazine articles, historical monographs, statistical studies, Supreme Court decisions and dissents, plays, novels, essays, speeches, and more. Just reading the notes can be fun, although I will point out that reading on the Kindle makes checking the notes and references difficult: the endnotes are all at the end of the book and, unfortunately, are not linked from the text. While the book comes with a robust set of references, it aims to make the reader confront her own beliefs about race, about intersectionality, and about capitalism. Quantitative reasoning about social justice. Throughout How to Be an Antiracist, instructors will find excellent sources of material on social justice and racism issues for use in classes dealing with quantitative reasoning. Kendi provides references to studies that quantify the systemic racism in the United States: home ownership rate, lead poisoning, Alzheimer’s disease, life expectancy, infant mortality, cancer death, health insurance rates, voter suppression, prison populations, student suspensions, police shootings, PTSD rates, SAT scores, bias in jobs, income and housing, the partisan divide, the wage gap, the wealth gap, and the teaching force. Arguing that the Black body is not more prone to violence, Kendi reports that “researchers have found a much stronger and clearer correlation between violent-crime levels and unemployment levels than between violent crime and race.” (p. 
79)¹ But this also draws a relationship between the quantitative information and potential housing policy revisions in the Department of Housing and Urban Development (HUD). It is easy to imagine this being an inspiring topic for students in a quantitative reasoning general education course. Kendi’s family has a shockingly high incidence of cancer: both grandfathers, a grandmother, both parents, his wife, himself. He reports “African-Americans are 25% more likely to die of cancer than Whites.” (p. 21)² The National Institutes of Health’s National Cancer Institute (NCI) provides examples of cancer rate disparities. A little digging into the website reveals “Cancer Health Disparities” documentation based on government statistics and an interactive website that provides access to cancer data by gender, race and more. The website is supported by the Surveillance Research Program of NCI’s Division of Cancer Control and Population Sciences. In another example of racial inequity, a Stanford Center on Poverty and Inequality publication reports “71 percent of White families lived in owner-occupied homes in 2014, compared to 45 percent of Latinx families and 41 percent of Black families.” (p. 18)³ The Stanford Center provides data on poverty and inequality trends, Hispanic trends, California welfare trends and income segregation maps on the website.

¹ See Kendi’s note www.huduser.gov/portal/periodicals/em/summer16/highlight2.html
² See Kendi’s note www.cancer.gov/about-nci/organization/crchd/about-health-disparities/definitions
³ See Kendi’s note “Housing” Pathways: A Magazine on Poverty, Inequality, and Social Policy, Special Issue 2017, available at inequality.stanford.edu/publications/pathway/state-union-2017

Education is not the answer. While educational lessons can certainly be drawn from quantitative studies, Kendi laments, in a chapter called “Failure,” the failure of education to fix systemic racism. “Behavioral-enrichment programs, like mentoring and educational programs, can help individuals but are bound to fail racial groups, which
are held back by bad policies, not bad behavior.” (p. 201, emphasis added) Still, Kendi offers guidance for analyzing the structural racism within institutions. Although he raises awareness of racism and its effects through storytelling, his antiracist prescription for the way forward is quantitative analysis with an eye toward policy making. Applying Kendi’s method to the mathematical community, we can ask: What proportion of undergraduates is Black? What proportion of math majors is Black? What proportion of math professors is Black? Is there equity? We ask these questions with the antiracist axiom that interest in mathematics is not associated with race. That is, Black students are equally interested in majoring in mathematics and going to graduate school in mathematics, as well as being equally serious about careers in mathematics as any other racial group. There is simply a group of people interested in mathematics with a racial make-up that is not different from the general population, within the country, states or institutions. Do we find that the proportion of math majors differs significantly by race? Kendi does not provide statistics for our particular field, but his advice is nonetheless clear: Identify the racial inequities, find the racist policies that are causing them, and change those policies. The MAA Committee on Minority Participation in Mathematics (CMPM) seeks to increase participation in several ways. For example, the committee has invited speakers for the MAA-SIAM-AMS Hrabowski-Gates-Tapia-McBay Lecture and Panel at the Joint Mathematics Meetings and hosted breakfasts for minority chairs to network with each other and the leadership of the MAA. (Disclosure: The author is a member of CMPM.) Can we do more? Yes. In fact, the recent Black Lives Matter protests prompted CMPM, under the leadership of Carrie Diaz Eaton (Bates College), to issue a statement supporting the protests and acknowledging the inequities in the community of mathematicians. 
We can name these inequities more explicitly, find the policies that cause the inequities and change those policies. We can know the facts. We can work toward making our associations, institutions, and departments places where any person can flourish and where the racial groups within the flourishing group are proportional to the actual racial groups in the larger community.

The way forward. Kendi seeks to inspire readers to become committed antiracists: to assume that all racial groups are equal and equally capable and equally interested, and to work to establish policies that achieve equitable outcomes for all racial groups. In the mathematics community, we have underrepresented groups. We must determine what policies and practices are keeping the groups from full participation and reform those policies and practices. As college and university faculty and administrators, we can take up Kendi’s challenge: devise and implement policies that are known to work. And if they don’t work, try something else.

To be an antiracist is a radical choice in the face of this history, requiring a radical reorientation of our consciousness. (p. 23)

Department of Mathematics, Virginia Wesleyan University, 5817 Wesleyan Drive, Virginia Beach, VA 23455
In addition to Kendi’s book, there are many other sources one might turn to in order to further our understanding of racism, antiracism, and equity issues in mathematics. Alden Bradford runs a reading group on the subject in the mathematics department at Purdue, and had the following recommendations:

• For an introduction to how mathematics institutions are failing Black learners, there is the article “Equity, inclusion, and antiblackness in mathematics education” by Danny Bernard Martin, appearing in Vol. 22 (2019) of Race Ethnicity and Education.
• The article “Political conocimiento for teaching mathematics: Why teachers need it and how to develop it” by Rochelle Gutiérrez makes the strong case that anyone teaching mathematics needs to be aware of the political climate they are teaching in. This appears in the 2017 book Building Support for Scholarly Practices in Mathematics Methods edited by S. Kastberg, A. Tyminski, A. Lischka, and W. Sanchez, and published by The Association of Mathematics Teacher Educators.
• The article “The Role of Professional Societies in STEM Diversity” by Vernon R. Morris and Talitha M. Washington identifies specific issues they face and specific actions we can take which could help improve diversity in the mathematics community. This article originally appeared in Vol. 87 (2017) of the Journal of the National Technical Association.
• The book The Crest of the Peacock: The Non-European Roots of Mathematics by George Gheverghese Joseph takes a look at the history of mathematics through a non-Eurocentric lens. The newest edition of this book was published in 2011 by Princeton University Press.
• Cathy O’Neil’s 2016 book Weapons of Math Destruction, published by Crown Books, looks at the ways that big data algorithms are used to reinforce inequalities in society, many of which have racial aspects.
• For those interested in making a change in their own classroom, the 2019 MAA book Mathematics for Social Justice: Resources for the College Classroom, edited by Gizem Karaali and Lily Khadjavi, contains several ready-made lessons which simultaneously address social issues and teach college-level math concepts. Even if you don’t end up using one of the lessons, it provides some helpful advice for those wanting to make their teaching responsive to society.