English Pages 244 [258] Year 2016
Mathematical Surveys and Monographs Volume 213
Beurling Generalized Numbers Harold G. Diamond Wen-Bin Zhang (Cheung Man Ping)
American Mathematical Society
https://doi.org/10.1090//surv/213
American Mathematical Society Providence, Rhode Island
EDITORIAL COMMITTEE Robert Guralnick Michael A. Singer, Chair
Benjamin Sudakov Constantin Teleman
Michael I. Weinstein
2010 Mathematics Subject Classification. Primary 11N80.
For additional information and updates on this book, visit www.ams.org/bookpages/surv-213
Library of Congress Cataloging-in-Publication Data Names: Diamond, Harold G., 1940–. Zhang, Wen-Bin (Cheung, Man Ping), 1940– . Title: Beurling generalized numbers / Harold G. Diamond, Wen-Bin Zhang (Cheung Man Ping). Description: Providence, Rhode Island : American Mathematical Society, [2016] | Series: Mathematical surveys and monographs ; volume 213 | Includes bibliographical references and index. Identifiers: LCCN 2016022110 | ISBN 9781470430450 (alk. paper) Subjects: LCSH: Numbers, Prime. | Numbers, Real. | Riemann hypothesis. | AMS: Number theory – Multiplicative number theory – Generalized primes and integers. msc Classification: LCC QA246 .D5292 2016 | DDC 512/.2–dc23 LC record available at https://lccn.loc.gov/2016022110
Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Permissions to reuse portions of AMS publication content are handled by Copyright Clearance Center’s RightsLink service. For more information, please visit: http://www.ams.org/rightslink. Send requests for translation rights and licensed reprints to [email protected]. Excluded from these provisions is material for which the author holds copyright. In such cases, requests for permission to reuse or reprint material should be addressed directly to the author(s). Copyright ownership is indicated on the copyright page, or on the lower right-hand corner of the first page of each article within proceedings volumes.
© 2016 by the authors. All rights reserved. Printed in the United States of America.
The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability. Visit the AMS home page at http://www.ams.org/
Dedicated to our wives, Nancy Diamond and Kun-Ming Luo Zhang, and to the memory of our friend and colleague, Paul T. Bateman
Contents

Preface

Chapter 1. Overview
1.1. Some questions about primes
1.2. The cast
1.3. Examples
1.4. π, Π, and an extended notion of g-numbers
1.5. Notes

Chapter 2. Analytic Machinery
2.1. A function class
2.2. Measures
2.3. Mellin transforms
2.4. Norms
2.5. Convergence
2.6. Convolution of measures
2.7. Convolution of functions
2.8. The L and T operators
2.9. Notes

Chapter 3. dN as an Exponential and Chebyshev’s Identity
3.1. Goals and plan
3.2. Power series in measures
3.3. Inverses
3.4. The exponential on V
3.5. Three equivalent formulas
3.6. Notes

Chapter 4. Upper and Lower Estimates of N(x)
4.1. Normalization and restriction
4.2. O-log density
4.3. Lower log density
4.4. An example with infinite residue but 0 lower log density
4.5. Extreme thinness is inherited
4.6. Regular growth
4.7. Notes

Chapter 5. Mertens’ Formulas and Logarithmic Density
5.1. Introduction
5.2. Logarithmic density
5.3. The Hardy-Littlewood-Karamata Theorem
5.4. Mertens’ sum formula
5.5. Mertens’ product formula
5.6. A remark on γ
5.7. An equivalent form and proof of “only if”
5.8. Tauber’s Theorem and conclusion of the argument
5.9. Notes

Chapter 6. O-Density of g-integers
6.1. Non-relation of log-density and O-density
6.2. O-Criteria for O-density
6.3. Sharper criteria for O-density
6.4. Notes

Chapter 7. Density of g-integers
7.1. Densities and right hand residues
7.2. Axer’s Theorem
7.3. Criteria for density
7.4. An L^1 criterion for density
7.5. Estimates of N(x) with an error term
7.6. Notes

Chapter 8. Simple Estimates of π(x)
8.1. Unboundedness of π(x)
8.2. Can there be as many primes as integers?
8.3. π(x) estimates via regular growth
8.4. Lower bounds for ∑ 1/p_i via lower log-density
8.5. Notes

Chapter 9. Chebyshev Bounds – Elementary Theory
9.1. Introduction
9.2. Chebyshev bounds for natural primes
9.3. An auxiliary function
9.4. Chebyshev bounds for g-primes
9.5. A failure of Chebyshev bounds
9.6. Notes

Chapter 10. Wiener-Ikehara Tauberian Theorems
10.1. Introduction
10.2. Wiener-Ikehara Theorems
10.3. The Fejér kernel
10.4. Proof of the Wiener-Ikehara Theorems
10.5. A W-I oscillatory example
10.6. Notes

Chapter 11. Chebyshev Bounds – Analytic Methods
11.1. Introduction
11.2. Wiener-Ikehara setup
11.3. A first decomposition
11.4. Further decomposition of I_{2,σ}(y)
11.5. Chebyshev bounds
11.6. Notes

Chapter 12. Optimality of a Chebyshev Bound
12.1. Introduction
12.2. The g-prime system P_B
12.3. Chebyshev bounds and the zeta function ζ_B(s)
12.4. The counting function N_B(x)
12.5. Fundamental estimates
12.6. Proof of the Optimality Theorem
12.7. Notes

Chapter 13. Beurling’s PNT
13.1. Introduction
13.2. A lower bound for |ζ(σ + it)|
13.3. Nonvanishing of ζ(1 + it)
13.4. An L^1 condition and conclusion of the proof
13.5. Optimality – a continuous example
13.6. Optimality – a discrete example
13.7. Notes

Chapter 14. Equivalences to the PNT
14.1. Introduction
14.2. Implications
14.3. Sharp Mertens relation and the PNT
14.4. Optimality of the sharp Mertens theorem
14.5. Implications between M(x) = o(x) and m(x) = o(1)
14.6. Connections of the PNT with M(x) = o(x)
14.7. Sharp Mertens relation and m(x) = o(1)
14.8. Notes

Chapter 15. Kahane’s PNT
15.1. Introduction
15.2. Zeros of the zeta function
15.3. A lower bound for |ζ(σ + it)|
15.4. A Schwartz function and Poisson summation
15.5. Estimating the sum of a series by an improper integral
15.6. Conclusion of the proof
15.7. Notes

Chapter 16. PNT with Remainder
16.1. Introduction
16.2. Two general lemmas
16.3. A Nyman type remainder term
16.4. A dlVP-type remainder term
16.5. Notes

Chapter 17. Optimality of the dlVP Remainder Term
17.1. Background
17.2. Discrete random approximation
17.3. Generalized primes satisfying the Riemann Hypothesis
17.4. Generalized primes with large oscillation
17.5. Properties of G(z)
17.6. Representation of log G(z) as a Mellin transform
17.7. A template zeta function
17.8. Asymptotics of N_B(x)
17.9. Asymptotics of ψ_B(x)
17.10. Normalization and hybrid
17.11. Notes

Chapter 18. The Dickman and Buchstab Functions
18.1. Introduction
18.2. The ψ(x, y) function
18.3. The φ(x, y) function
18.4. A Beurling version of ψ(x, y)
18.5. G-numbers with primes from an interval
18.6. Other relations
18.7. Notes

Bibliography

Index
Preface Generalized numbers are a multiplicative structure introduced by A. Beurling [Be37] in 1937 to investigate the degree to which prime number theory is independent of the additive properties of the natural numbers. Beyond their own interest, the results and techniques of this theory apply to several other systems having the character of prime numbers and integers. Indeed, such ideas occurred already in a 1903 paper of E. Landau [La03] proving the prime number theorem for ideals of algebraic number fields. We shall introduce and use continuous (!) analogues of generalized (briefly: g-) numbers. As another application, these distributions provide an attractive path to the theories of Dickman and Buchstab for integers whose prime factors lie only in restricted ranges. A central question that we shall examine is the following: if a sequence of g-integers is generated by a sequence of g-primes, and if one of the collections is “reasonably near” its classical counterpart, does the other collection also have this property? This monograph does not examine all facets of g-number theory; some interesting topics that are largely ignored include probabilistic theory, oscillatory counting functions, and collections of primes and integers that are unusually dense or sparse. We hope that the accompanying list of references will help interested readers to explore these topics further. Our intended audience is readers having some familiarity with mathematical analysis and analytic number theory, particularly an analytic proof of the prime number theorem. Background material that we assume can be found in such books as those of Apostol [Ap76], Bateman-Diamond [BD04], Chandrasekharan [Ch68], [Ch70], Davenport [Da00], Ingham [In32], Montgomery-Vaughan [MV07], or Tenenbaum [Te95]. Specialized results will be developed as needed. Many examples are provided to illustrate how various hypotheses affect the behavior of g-number systems. They are important! 
But readers put off by details are encouraged to at least note the point of each example. This work contains published and new work of the authors. Also, we have benefited from the contributions of many others, and it is our pleasant duty to thank them here. These include Beurling, who originated the study and established its first important result; P. J. Cohen, who introduced the first author to this subject; P. T. Bateman and H. L. Montgomery, with whom we have worked in this area; and J.-P. Kahane, who established one of our main results. Also, we thank A. J. Hildebrand for his mathematical and TeXnical advice. The authors request that readers advise us of errors or obscurities they find. Our email address is [email protected] . We maintain corrections and comments at www.math.illinois.edu/∼hdiamond/hgdwbz/corrigenda.pdf .

Urbana and Chicago IL
May, 2016
https://doi.org/10.1090//surv/213/01
CHAPTER 1
Overview

What can you say if you drop all the hypotheses?

Summary. Some questions we shall consider, main functions, and examples of generalized number systems.
1.1. Some questions about primes

The positive integers can be generated from the number 1 by the operation of addition, or alternatively, from the prime numbers by multiplication. Generalized (henceforth g-) numbers were introduced by Arne Beurling to examine prime number theory in a purely multiplicative setting. This theory also provides a framework for studying several related topics of interest. As a brief preview of some areas we shall visit, we pose a few questions:

(1) How sensitive is the prime number theorem (PNT) to the additive structure of the integers?
(2) How important is this additive structure in the quest to establish the Riemann hypothesis?
(3) What is the exponential of a measure? Is the counting measure of integers representable as the exponential of something interesting?
(4) Is there a continuous analogue of prime number theory?
(5) Why has it been so hard to improve upon the 1899 PNT error estimate of Ch. J. de la Vallée Poussin?

For a glimpse at the scenery to be encountered en route, here are some brief, incomplete, and perhaps tantalizing answers.

(1) Suppose we use the rational primes exceeding 2 to generate g-integers. This set, the odd numbers, does not admit addition. However, it has just one fewer prime than in the classical case, and so the primes here obey the PNT. Thus we may suspect, and shall show, that the PNT can be established without use of the additive structure of integers. What is needed is that the counting function of the g-integers be “suitably close” to that of the rational integers.
(2) There exist g-number systems whose integer counting function $N(x)$ is, like that of the rational integers, very close to $x$ but have neither an additive structure nor a “zeta functional equation,” and for which the analogue of the Riemann hypothesis does not hold. Thus, a proof of the Riemann hypothesis will require more than just the multiplicative structure of integers and smallness of $N(x) - x$.
(3) With multiplicative convolution as the product operation on measures, the exponential of a measure can be defined by its power series. In these terms, the counting measure $dN$ of integers can be represented as the exponential of the weighted prime and prime power counting measure $d\Pi$.
(4) If we replace the counting functions of primes and integers by suitable continuous functions and recognize Chebyshev’s elementary prime number identity as the key relation connecting primes and integers, then we can develop an elegant and useful parallel version of prime number theory.
(5) The only improvements ever made in the PNT error estimate of de la Vallée Poussin have been achieved by exploiting the additive as well as the multiplicative structure of integers. We shall show by a construction that for g-numbers (where additive structure is missing) the dlVP error term is in fact optimal.

1.2. The cast

As a replacement for the natural primes, we introduce a sequence of real numbers $\mathcal{P} = \{p_i\}$ satisfying
\[ 1 < p_1 \le p_2 \le p_3 \le \dots, \qquad p_i \to \infty. \]
We call $\mathcal{P}$ a generalized prime number system or, briefly, a g-prime system. It is not important that the elements of $\mathcal{P}$ be distinct – e.g. prime ideals may have equal norms, and we shall meet further examples in which g-primes and g-integers occur with multiplicity. However, the smallest g-prime must exceed 1, for we do not admit 1 as a prime and any g-prime in the interval (0, 1) would generate a system of g-integers having an infinite number of elements in (0, 1). Also, the sequence of primes should be unbounded. Failure of either of the last two conditions would yield a system that little resembles the natural numbers. Using notation that parallels that of classical prime number theory, we introduce the g-prime counting functions
\[ \pi_{\mathcal{P}}(x) := \#\{i \ge 1 : p_i \le x\}, \]
\[ \Pi_{\mathcal{P}}(x) := \pi_{\mathcal{P}}(x) + \frac{1}{2}\,\pi_{\mathcal{P}}(x^{1/2}) + \frac{1}{3}\,\pi_{\mathcal{P}}(x^{1/3}) + \dots, \]
\[ \psi_{\mathcal{P}}(x) := \sum_{p_i^{\alpha_i} \le x} \log p_i = \int_1^x \log u \, d\Pi_{\mathcal{P}}(u). \]
When the context is clear, we shall generally suppress the $\mathcal{P}$ subscripts. The function $\Pi$ is given above by a formally infinite series. Here we show that it is well-defined and, importantly, close to the prime counting function.

Lemma 1.1. For each $x > 1$, the formally infinite series for $\Pi$ is terminating, and if we assume that $\pi(x) \ll x$, then $\Pi(x) - \pi(x) \ll x^{1/2}$.

Proof. We have $x^{1/n} < p_1$ for $n > \log x / \log p_1$. For such $n$, $\pi(x^{1/n}) = 0$, so the series for $\Pi(x)$ is in fact a finite sum. The claimed inequality holds because
\[ 0 \le \Pi(x) - \pi(x) \le \pi(x^{1/2}) + \frac{\log x}{\log p_1}\,\pi(x^{1/3}) \ll x^{1/2}. \]
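The three counting functions can be explored numerically. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, the list passed in is assumed to contain every g-prime not exceeding x (with multiplicity), and every g-prime is assumed to exceed 1. Instead of taking roots of x, it enumerates the prime powers $p^\alpha \le x$, which makes the termination asserted in Lemma 1.1 visible.

```python
from fractions import Fraction
import math


def prime_powers(primes, x):
    """Yield (p, alpha) for each listed g-prime p and each alpha >= 1 with p**alpha <= x.

    Assumes every p > 1, so the inner loop stops after finitely many steps.
    """
    for p in primes:
        power, alpha = p, 1
        while power <= x:
            yield p, alpha
            power *= p
            alpha += 1


def pi_count(primes, x):
    """pi(x): number of g-primes <= x, counted with multiplicity."""
    return sum(1 for p in primes if p <= x)


def Pi_weighted(primes, x):
    """Pi(x) = pi(x) + pi(x^(1/2))/2 + ..., as the sum of 1/alpha over p**alpha <= x."""
    return sum((Fraction(1, alpha) for _, alpha in prime_powers(primes, x)), Fraction(0))


def psi(primes, x):
    """psi(x): sum of log p over all prime powers p**alpha <= x."""
    return sum(math.log(p) for p, _ in prime_powers(primes, x))


# Illustration with the natural primes up to 100 (hard-coded for this sketch).
NATURAL_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47,
                  53, 59, 61, 67, 71, 73, 79, 83, 89, 97]

if __name__ == "__main__":
    x = 100
    print(pi_count(NATURAL_PRIMES, x), Pi_weighted(NATURAL_PRIMES, x), psi(NATURAL_PRIMES, x))
```

Exact rational arithmetic is used for Π so that the small difference Π(x) − π(x) is not blurred by rounding.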
Associated with $\mathcal{P}$ we have the semigroup of g-integers it generates,
\[ \mathcal{N} = \mathcal{N}_{\mathcal{P}} : \quad 1 = n_1 < n_2 \le n_3 \le n_4 \le \dots . \]
In view of possible nonunique representation of integers, we give a formal definition of $\mathcal{N}$ in terms of exponents in the product representation. Let $E$ denote the family of sequences of nonnegative integers $\nu = \{\nu_1, \nu_2, \dots\}$ with $\nu_i = 0$ from some point onward. The family $E$ is countable, and to each element $\nu_i = \{\nu_{i1}, \nu_{i2}, \dots, 0, 0, \dots\}$ associate a g-integer by the rule
\[ n_i \longleftrightarrow \prod_{j=1}^{\infty} p_j^{\nu_{ij}}. \]
Given a positive number $x$, at most a finite number of g-integers $n_i$ satisfy
\[ n_i = \prod_{j=1}^{\infty} p_j^{\nu_{ij}} \le x. \tag{1.1} \]
We define a counting function $N(\cdot) = N_{\mathcal{P}}(\cdot)$ on $[1, \infty)$ by setting $N(x)$ to be the number of g-integers for which (1.1) holds. The central theme of this work is the relation between $\mathcal{P}$ and $\mathcal{N}$, generally expressed as a hypothesis upon one of $N_{\mathcal{P}}$ or $\Pi_{\mathcal{P}}$, and a conclusion expressed in terms of the other function. For the rest of this chapter, we suppose that
\[ \limsup_{x \to \infty} N(x)/x < \infty, \tag{1.2} \]
which we describe as an O-density condition. This bound is a reasonable condition for g-integer systems; moreover, as we shall show in Chapter 4, it is not a real restriction for our work. In case $\lim_{x\to\infty} N(x)/x$ exists and equals $\delta$, we say that $\mathcal{N}$ has a density and $\delta$ is its value. Continuing with our list of functions, we have the g-zeta function, the analogue of the Riemann zeta function, defined on a half plane in $\mathbb{C}$ by
\[ \zeta(s) = \zeta_{\mathcal{N}}(s) := \sum_i n_i^{-s} = \int_{1-}^{\infty} u^{-s}\, dN(u). \tag{1.3} \]
The sum is called a generalized Dirichlet series and the integral is called the Mellin transform of $dN$, with each convergent for $\Re s > 1$. (See §2.3 for the general definition of Mellin transform. Also, the “1−” in the last integral denotes inclusion of $dN(\{1\})$; integrals having lower limit 1 omit this contribution.) The g-zeta function has an Euler product representation
\[ \prod_i \left(1 - p_i^{-s}\right)^{-1}, \qquad \Re s > 1, \tag{1.4} \]
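which is established in the same way as the classical formula. Also, in this region,
\[ \log \zeta(s) = -\sum_i \log\left(1 - p_i^{-s}\right) = \sum_i \sum_{\nu \ge 1} p_i^{-\nu s}/\nu = \int_1^{\infty} u^{-s}\, d\Pi(u) \tag{1.5} \]
and
\[ -\frac{\zeta'}{\zeta}(s) = -\frac{d}{ds} \int_1^{\infty} u^{-s}\, d\Pi(u) = \int_1^{\infty} u^{-s} \log u \, d\Pi(u) = \int_1^{\infty} u^{-s}\, d\psi(u). \tag{1.6} \]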
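The formal definition of g-integers through exponent sequences translates directly into a small enumeration. The sketch below is an illustration only, with hypothetical function names; it assumes the supplied list contains every g-prime not exceeding x (with multiplicity) and that each g-prime exceeds 1. Each exponent sequence in E contributes exactly one entry, so repeated values appear with the multiplicity described in (1.1).

```python
def g_integers(primes, x):
    """Sorted list of all g-integers <= x, with multiplicity.

    `primes` is assumed to list every g-prime <= x (repeats allowed), each > 1.
    """
    values = [1]  # the empty product, n_1 = 1
    for p in primes:
        if p > x:
            continue
        # Multiply every value built from earlier g-primes by p, p^2, ... while <= x.
        new_values = []
        for v in values:
            w = v * p
            while w <= x:
                new_values.append(w)
                w *= p
        values.extend(new_values)
    return sorted(values)


def count_N(primes, x):
    """N(x): the number of g-integers <= x."""
    return len(g_integers(primes, x))


# Rational primes exceeding 2, up to 30 (hard-coded for this sketch).
ODD_PRIMES = [3, 5, 7, 11, 13, 17, 19, 23, 29]

if __name__ == "__main__":
    print(g_integers([2, 3, 5, 7], 10))   # the natural numbers up to 10
    print(count_N(ODD_PRIMES, 30))        # the odd numbers up to 30
```

With non-integer g-primes the products are floating point numbers, so comparisons against x may be affected by rounding right at the boundary.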
1.3. Examples

Example 1.2. Odd numbers (again). Let $\mathcal{P}_1 = 3, 5, 7, 11, \dots$, so that $\mathcal{N}_1$ is the odd numbers. By inspection, $N_1(x) = \lfloor (x + 1)/2 \rfloor$, whence $\mathcal{N}_1$ has density 1/2. A moral to be drawn from this example is that the density of integers is sensitive to small changes in g-primes (whereas, as we shall see, the truth of the PNT is not). The zeta function of $\mathcal{N}_1$ can be expressed in terms of the Riemann zeta function as
\[ \zeta_1(s) = \prod_{p \ge 3} \left(1 - p^{-s}\right)^{-1} = \sum_{n \ge 1,\ n \text{ odd}} n^{-s} = (1 - 2^{-s})\, \zeta(s), \qquad \Re s > 1, \]
with the product extending over all natural primes from 3 onward. We found the last expression by multiplying and dividing the product by $(1 - 2^{-s})$. Note that the residue of $\zeta_1(s)$ at $s = 1$ is 1/2, which is the density of $\mathcal{N}_1$. If a g-number system has a density, then it equals the residue of the associated zeta function at $s = 1$. We shall return to this point and qualify use of the word “residue” in Chapter 7.

Example 1.3. A repeated prime. Let $\mathcal{P}_2 = 3, 3, 4, 5, 7, 11, 13, \dots$. Here the (natural) prime 2 is suppressed, 3 is introduced as a g-prime (in addition to its usual occurrence), the number 4 also is taken as a g-prime, and all the natural primes from 5 onward are g-primes. The sequence $\mathcal{N}_2$ of g-integers begins 1, 3, 3, 4, 5, 7, 9, 9, 9, 11, 12, 12, 13, 15, 15, 16, 17, 19, . . . . To find the counting function $N_2$ we determine the associated zeta function, first as an Euler product
\[ \zeta_2(s) = \left(1 - 3^{-s}\right)^{-2} \left(1 - 4^{-s}\right)^{-1} \prod_{p \ge 5} \left(1 - p^{-s}\right)^{-1}, \qquad \Re s > 1. \]
A small manipulation gives $\zeta_2(s)$ in terms of the Riemann zeta function:
\[ \zeta_2(s) = \left(1 + 2^{-s}\right)^{-1} \left(1 - 3^{-s}\right)^{-1} \zeta(s). \tag{1.7} \]
Expanding each of the factors in (1.7), we find that
\[ \zeta_2(s) = (1 - 2^{-s} + 2^{-2s} \mp \dots)(1 + 3^{-s} + 3^{-2s} + \dots)(1 + 2^{-s} + 3^{-s} + 4^{-s} + \dots) = \sum_{j \ge 0} \sum_{k \ge 0} \sum_{n \ge 1} (-1)^j (2^j 3^k n)^{-s}. \]
(The rearrangement is valid, because the series is absolutely convergent for $\Re s > 1$.) Since $\zeta_2(s)$ also has the Mellin transform representation
\[ \zeta_2(s) = \int_{1-}^{\infty} u^{-s}\, dN_2(u), \]
we have
\[ N_2(x) = \sum_{\substack{j,k \ge 0,\ n \ge 1 \\ 2^j 3^k n \le x}} (-1)^j = \sum_{j,k \ge 0} (-1)^j \left\lfloor \frac{x}{2^j 3^k} \right\rfloor . \]
The last sum has $O(\log^2 x)$ terms, and approximating $\lfloor t \rfloor$ by $t$, we find that $\mathcal{N}_2$ has a density and its value (cf. §7.1) is
\[ \delta_2 = \sum_{j,k \ge 0} \frac{(-1)^j}{2^j 3^k} = \frac{1}{1 + 1/2} \cdot \frac{1}{1 - 1/3} = 1. \]
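The counting formula for $N_2$ can be checked against the initial segment of g-integers listed in Example 1.3. The Python sketch below is an illustration only; the function names are hypothetical, and the list `LISTED` is simply the sequence quoted in the example, so the comparison is meaningful only for x < 20.

```python
def N2_formula(x):
    """Evaluate the sum over j, k >= 0 of (-1)**j * floor(x / (2**j * 3**k)).

    Terms with 2**j * 3**k > x vanish, so both loops are finite.
    """
    total = 0
    pow3 = 1
    while pow3 <= x:
        pow2, sign = 1, 1
        while pow2 * pow3 <= x:
            total += sign * int(x // (pow2 * pow3))
            pow2 *= 2
            sign = -sign
        pow3 *= 3
    return total


# Initial segment of the g-integers of Example 1.3, copied from the text.
LISTED = [1, 3, 3, 4, 5, 7, 9, 9, 9, 11, 12, 12, 13, 15, 15, 16, 17, 19]


def N2_listed(x):
    """Count the listed g-integers <= x (only meaningful for x < 20)."""
    return sum(1 for n in LISTED if n <= x)


if __name__ == "__main__":
    for x in range(1, 20):
        print(x, N2_formula(x), N2_listed(x))
    # Ratio N_2(x)/x for a large x, to compare with the density computed above.
    print(N2_formula(10**6) / 10**6)
```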
This example provides a g-number system that consists entirely of natural numbers, has density 1, and yet is different from the natural numbers.

Example 1.4. Continuous functions. Let
\[ N_c(x) := \int_{1-}^{x} (\delta_1 + du) = \begin{cases} x, & x \ge 1, \\ 0, & x < 1, \end{cases} \]
where $\delta_1$ is Dirac point measure at 1 and $du$ is Lebesgue measure on $[1, \infty)$. This function is continuous on $(1, \infty)$ and approximates $N(x) = \lfloor x \rfloor$, the counting function of the natural numbers. The “zeta function” associated with $N_c$ is
\[ \zeta_c(s) = \int_{1-}^{\infty} u^{-s} (\delta_1 + du) = 1 + \frac{1}{s - 1} = \frac{s}{s - 1}, \qquad \Re s > 1. \]
This function is close to the Riemann zeta function on the half line $\{s : s > 1\}$. (However, the two functions are not close on the complex half-plane $\{s : \Re s > 1\}$; e.g. the Riemann zeta function is unbounded as $s \to 1 + i\infty$ while $\zeta_c(s) \to 1$.) To determine a continuous function that corresponds to the classical weighted prime counting function $\Pi(\cdot)$, we express $\log \zeta_c(s)$ as a Mellin transform:

Theorem 1.5. For $\Re s > 1$,
\[ \int_1^{\infty} u^{-s}\, \frac{1 - u^{-1}}{\log u}\, du = \log \frac{s}{s - 1}. \tag{1.8} \]

Proof. Let $F(s)$ denote the integral on the left side of the last formula. Then
\[ F'(s) = -\int_1^{\infty} u^{-s} (1 - u^{-1})\, du = -\frac{1}{s - 1} + \frac{1}{s} = \left( \log \frac{s}{s - 1} \right)'. \]
Thus $F$ agrees with $\log s/(s - 1)$ to within a constant. The constant is 0 since
\[ \lim_{s \to +\infty} F(s) = 0 = \lim_{s \to +\infty} \log \frac{s}{s - 1}. \]

Recalling the relation
\[ \int_1^{\infty} u^{-s}\, d\Pi(u) = \log \zeta(s) \]
connecting the weighted prime counting function with the logarithm of the Riemann zeta function, the choice
\[ \Pi_c(x) := \int_1^{x} \frac{1 - u^{-1}}{\log u}\, du \tag{1.9} \]
yields an analogous “continuous prime counting function” associated with $N_c$. The function $\Pi_c(x)$ is close to the logarithmic integral li(x), which in turn approximates the counting function of the rational primes. This example suggests the plausibility of the PNT, but of course it is not a proof of that result. Continuous analogues of prime number theory will occur repeatedly in this work, and they will provide an essential ingredient for the constructions of Chapter 17.
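Formula (1.8) and the function $\Pi_c$ of (1.9) lend themselves to a numerical sanity check for real s > 1. The Python sketch below is only an illustration under stated assumptions: the function names and the truncation parameter `cutoff` are hypothetical, the substitution $u = e^t$ is used to remove the logarithm from the denominator, and a plain composite Simpson rule stands in for a proper quadrature routine.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def lhs_1_8(s, cutoff=50.0):
    """Left side of (1.8) for real s > 1, after substituting u = e**t.

    The integrand becomes exp(-(s-1)t) * (1 - exp(-t)) / t on (0, inf);
    the integral is truncated at T = cutoff / (s - 1), where the tail is negligible.
    """
    T = cutoff / (s - 1.0)

    def g(t):
        if t == 0.0:
            return 1.0  # limiting value of the integrand as t -> 0+
        return math.exp(-(s - 1.0) * t) * (-math.expm1(-t)) / t

    return simpson(g, 0.0, T)


def Pi_c(x):
    """Pi_c(x) from (1.9) via u = e**t: integral of (e**t - 1)/t over 0 < t < log x."""
    def g(t):
        return 1.0 if t == 0.0 else math.expm1(t) / t

    return simpson(g, 0.0, math.log(x))


if __name__ == "__main__":
    for s in (1.5, 2.0, 3.0):
        print(s, lhs_1_8(s), math.log(s / (s - 1.0)))
    print(Pi_c(10.0), Pi_c(100.0))
```

Expanding $(e^t - 1)/t$ as a power series and integrating termwise gives $\Pi_c(x) = \sum_{l \ge 1} (\log x)^l/(l \cdot l!)$, which provides an independent check on the quadrature.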
1.4. π, Π, and an extended notion of g-numbers

Example 1.4 is continuous, which perhaps seems unusual, but it has many of the distribution properties of discrete g-primes and g-integers. In the sequel, we shall make use of that example and some others that are not discrete. Using the notion of exp∗, to be defined in Chapter 3, we expand our definition of g-numbers to encompass these cases: A g-number system is a pair of right continuous increasing functions Π and N on [1, ∞) satisfying dN = exp∗ dΠ and Π(1) = 0. This usage differs from that of T. Hilberdink [Hi12], who calls such a system outer; he applies the name g-prime system to one for which there exists a function π that is connected to the weighted counting function Π by
\[ \Pi(x) = \pi(x) + \frac{1}{2}\,\pi(x^{1/2}) + \frac{1}{3}\,\pi(x^{1/3}) + \dots \tag{1.10} \]
with the property that π is increasing. (Here, of course, Π also is increasing.) Suppose we are given an increasing function Π. We can determine a function π that formally satisfies (1.10) by Moebius inversion:
\[ \pi(x) := \sum_{k=1}^{\infty} \frac{\mu(k)}{k}\, \Pi(x^{1/k}). \tag{1.11} \]
For given $x$, this series is terminating (and hence convergent) if $\Pi(u) = 0$ for all $u < a$ for some $a > 1$, for in this case $x^{1/k} < a$ for all sufficiently large $k$. If, on the other hand, $\Pi(x) > 0$ for all $x > 1$, we can easily show that the preceding series still converges if $\Pi$ satisfies a Lipschitz condition at $x = 1$.

Lemma 1.6. Suppose $\Pi(x)$ is increasing on $[1, \infty)$ and $\Pi(x) \ll x - 1$ for $x$ near 1+. Then the series (1.11) is absolutely convergent and is uniformly convergent on any given interval $(1, X)$. Moreover, the function defined by (1.11) satisfies (1.10).

Proof. Suppose $x > 1$ is given. For $k > \log x$ we have
\[ \Pi(x^{1/k}) \ll x^{1/k} - 1 = e^{(\log x)/k} - 1 \ll \frac{\log x}{k}. \]
Thus the series (1.11) is dominated by
\[ \Pi(x) \sum_{k \le \log x} \frac{1}{k} + \sum_{k > \log x} \frac{\log x}{k^2}, \]
so the series converges absolutely. For $x \le X$, the series is dominated by the number $\Pi(X) \log X + O(1)$, so it converges uniformly there by the Weierstrass M-test. If we insert $\pi(x)$ as defined by (1.11) into the right side of (1.10) we get an absolutely summable series: indeed
\[ \sum_{n=1}^{\infty} \frac{1}{n} \sum_{k=1}^{\infty} \frac{|\mu(k)|}{k}\, \Pi(x^{1/nk}) \le \sum_{\ell=1}^{\infty} \frac{1}{\ell}\, \Pi(x^{1/\ell})\, d(\ell), \]
where $d(\ell) = \sum_{kn = \ell} 1$, the “divisor function.” If we repeat the argument of the first part and then apply summation by parts using the famous estimate $\sum_{\ell \le t} d(\ell) \sim t \log t$, we find that the double sum is dominated by
\[ \Pi(x) \sum_{\ell \le \log x} \frac{d(\ell)}{\ell} + \sum_{\ell > \log x} \frac{\log x \; d(\ell)}{\ell^2} \ll \Pi(x) (\log\log x)^2 . \]
It follows that Moebius inversion is valid, and the function $\pi$ satisfies (1.10).
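When Π vanishes below some a > 1, the series (1.11) is a finite sum and can be evaluated directly. The Python sketch below is an illustration only: the function names are hypothetical, and the test case, a step function Π with a single unit jump at 2, is chosen purely for convenience. Because roots are computed in floating point, values of x that are exact powers of the jump point should be treated with care.

```python
def mobius(k):
    """Moebius function mu(k), computed by trial division."""
    result, d = 1, 2
    while d * d <= k:
        if k % d == 0:
            k //= d
            if k % d == 0:
                return 0  # a squared prime factor
            result = -result
        d += 1
    if k > 1:
        result = -result
    return result


def moebius_invert(Pi, x, a):
    """pi(x) = sum over k of mu(k)/k * Pi(x**(1/k)), assuming Pi(u) = 0 for u < a, a > 1."""
    total, k = 0.0, 1
    while x ** (1.0 / k) >= a:
        total += mobius(k) / k * Pi(x ** (1.0 / k))
        k += 1
    return total


def weighted_sum(pi, x, a):
    """Right side of (1.10): sum over n of pi(x**(1/n))/n, assuming pi(u) = 0 for u < a."""
    total, n = 0.0, 1
    while x ** (1.0 / n) >= a:
        total += pi(x ** (1.0 / n)) / n
        n += 1
    return total


def step(u):
    """Hypothetical test case: Pi has a single unit jump at u = 2."""
    return 1.0 if u >= 2 else 0.0


def pi_step(u):
    return moebius_invert(step, u, 2.0)


if __name__ == "__main__":
    for x in (1.5, 3.0, 5.0, 70.0):
        print(x, pi_step(x), weighted_sum(pi_step, x, 2.0), step(x))
```

In each printed row the last two columns should agree, which is the content of (1.10) for the inverted function.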
For $\Pi(x)$ to be increasing, it is not necessary that $\pi(x)$ given by (1.11) also be increasing. Indeed, a simple example is provided by setting $\Pi(x) = 0$ for $x < 2$ and $\Pi(x) = 1$ for $x \ge 2$. For $x < 8$, our formula gives
\[ \pi(x) = \Pi(x) - (1/2)\, \Pi(x^{1/2}) \mp \dots = \begin{cases} 0, & x < 2, \\ 1, & 2 \le x < 4, \\ \tfrac{1}{2}, & 4 \le x < 8. \end{cases} \]
In general, we shall require only of our g-number systems that $\Pi(x)$ be monotone, but it is interesting to show that the important Example 1.4 satisfies the extended condition that $\pi(x)$ is monotone.

Proposition 1.7. Let $\Pi = \Pi_c$ given by (1.9) be the (weighted) continuous “prime-counting” function of Example 1.4 and let $\pi = \pi_c$ be the associated function given by (1.11). Then $\pi$ is well-defined and is increasing on $(1, \infty)$.

Proof. The integrand of $\Pi$, $(1 - t^{-1})/\log t$, tends to 1 as $t \to 1+$. Thus $\Pi(x) \ll x - 1$ for $x$ near 1+, and hence the series defining $\pi(x)$ converges by Lemma 1.6. Formally differentiating the series for $\pi(x)$ yields
\[ \pi'(x) = \sum_{k=1}^{\infty} \frac{\mu(k)}{k}\, \frac{x^{1/k} - 1}{x \log x}. \]
The differentiated series converges absolutely and also uniformly on any given interval $(1, X)$ by the argument used in proving the last lemma. Thus the termwise differentiation is valid ([Ap74], Theorem 9.14; [BS00], Theorem 9.4.4). It remains to show that $\pi(x)$ is increasing. Writing $x^{1/k}$ as $\exp(\{\log x\}/k)$ and expanding exp as a Maclaurin series, we find
\[ \pi'(x)\, x \log x = \sum_{k=1}^{\infty} \frac{\mu(k)}{k} \left( x^{1/k} - 1 \right) = \sum_{k=1}^{\infty} \frac{\mu(k)}{k} \sum_{l=1}^{\infty} \frac{(\log x)^l}{l!\, k^l}. \]
We can exchange the order of the last two sums by the absolute convergence of the series. This yields
\[ \pi'(x) \log x = \frac{1}{x} \sum_{l=1}^{\infty} \frac{(\log x)^l}{l!} \sum_{k=1}^{\infty} \frac{\mu(k)}{k^{l+1}} = \frac{1}{x} \sum_{l=1}^{\infty} \frac{1}{\zeta(l + 1)}\, \frac{(\log x)^l}{l!} > \frac{6}{\pi^2}\, \frac{x - 1}{x} > 0 \]
for all $x > 1$. Thus $\pi$ is increasing.

1.5. Notes
A survey of g-numbers by the first author and P. T. Bateman is given in [BD69]. Some of our arguments come from this article. §1.4. In [Hi12], it is shown that N (x) := 1 + c(x − 1) determines a g-prime system in the sense of Hilberdink for 0 < c < 2, while for some c > 2, the prime counting function π(x) is not always monotonic increasing. Proposition 1.7 is the case c = 1 of Hilberdink’s result.
https://doi.org/10.1090//surv/213/02
CHAPTER 2
Analytic Machinery

Are we there yet?

Summary. A medley of functions and associated measures. The Mellin transform. A family of norms. Convergence of measures. Convolutions. A derivation operator on measures.
2.1. A function class

We begin by setting out conditions on functions that justify Mellin transforms, convergence, and other analytic techniques. Happily, these conditions are not very restrictive, and nearly all the functions that occur in our study, e.g. $N(\cdot)$, $\Pi(\cdot)$, satisfy them. Readers eager to get to g-numbers might skim this chapter for its definitions and conventions and return for details as needed. Here are the requirements for our functions, along with brief comments:

(1) have support contained in $[1, \infty)$, the reasonable “home” for g-integers.
(2) have total variation function of polynomial growth. We shall make essential use of Mellin transforms and integrations on $[1, \infty)$, for which we want limited growth at infinity. This condition also implies local bounded variation, so we can associate a Borel-Stieltjes measure $dF$ with each function $F$ as well as integrate by parts. (For “elementary” methods, transforms are not used, so growth is not an issue and local bounded variation alone would suffice.)
(3) be right continuous. This choice will be convenient in associating a Borel measure with each function.

Let $\mathcal{V}$ denote the class of right continuous real or complex valued functions $F$ with support contained in $[1, \infty)$ whose associated total variation function on the interval $[1, x]$, called $F_v(x)$, is of polynomial growth at infinity. Here are two instances of functions in $\mathcal{V}$ arising in Example 1.4: one is $N_c$, the continuous approximation to the integer counting function; the other is
\[ F(x) := \begin{cases} 0, & x < 1, \\ 1, & x \ge 1. \end{cases} \tag{2.1} \]

2.2. Measures

We may regard Borel sets on $\mathbb{R}$ as generated by the half open intervals of the form $(a, b]$ for $a < b$, and we introduce a signed Borel measure $dF$ corresponding to each element $F \in \mathcal{V}$ by setting
\[ dF((a, b]) := F(b) - F(a). \]
We shall often write “$dF \in d\mathcal{V}$” to say that $dF$ is a measure associated with $F \in \mathcal{V}$. (An exception to our convention of connecting function and measure names is the function given in (2.1), for which it is customary to represent the associated measure as $\delta_1$, Dirac measure at 1.) With these conventions, we can express a function $F$ in terms of a measure $dF$ by
\[ \int_{1-}^{x} dF = F(x), \qquad 1 \le x < \infty. \]
We shall call $F$ the cumulative distribution function (briefly: distribution function) associated with $dF$. For $b > a \ge 1$, we shall understand
\[ \int_a^b dF := \int_{(a, b]} dF = F(b) - F(a). \]
(Note that $\int_a^b dF$ includes any point mass that $dF$ may have at $b$, but not at $a$.) We associate with the total variation function $F_v$ the total variation measure $|dF| = d(F_v)$, briefly $dF_v$, and we have
\[ F_v(x) = \int_{1-}^{x} |dF|, \qquad 1 \le x < \infty. \]
2.3. Mellin transforms

Given $F \in \mathcal{V}$ with $F(x) = O(x^{\tau})$ for some $\tau \ge 0$, integration by parts shows that the Mellin integral
\[ \int_{1-}^{\infty} u^{-s}\, dF(u) = s \int_1^{\infty} u^{-s-1} F(u)\, du \]
exists for all $s \in \mathbb{C}$ with $\Re s > \tau$. (Note that the integrated terms vanish at 1− and at infinity.) The integral defines a function of $s$ that we call the Mellin (or Mellin-Stieltjes) transform of $dF$, and we set
\[ \widehat{F}(s) := \int_{1-}^{\infty} u^{-s}\, dF(u), \qquad \Re s > \tau. \]
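For a measure with finitely many point masses, both sides of the integration by parts identity reduce to finite sums, so the identity can be checked directly. The Python sketch below is an illustration only: the representation of a measure as a list of (point, weight) pairs and the function names are assumptions of this sketch. Since such an F is bounded, one may take τ = 0, and the check applies for any complex s with positive real part.

```python
def mellin_direct(atoms, s):
    """Mellin transform of a finite discrete measure: sum of w * x**(-s) over atoms (x, w)."""
    return sum(w * x ** (-s) for x, w in atoms)


def mellin_by_parts(atoms, s):
    """Evaluate s * integral_1^inf u**(-s-1) F(u) du for the step function F of the atoms.

    F is constant on [x_i, x_{i+1}), and s * integral of u**(-s-1) over that interval
    equals x_i**(-s) - x_{i+1}**(-s); on the last, unbounded interval the second term
    is 0 because Re s > 0 is assumed. All atoms are assumed to lie in [1, inf).
    """
    atoms = sorted(atoms, key=lambda a: a[0])
    total, F = 0.0, 0.0
    for i, (x, w) in enumerate(atoms):
        F += w  # F(x_i), by right continuity
        upper = atoms[i + 1][0] ** (-s) if i + 1 < len(atoms) else 0.0
        total += F * (x ** (-s) - upper)
    return total


if __name__ == "__main__":
    # Counting measure of the integers 1..50, a truncation chosen for this sketch.
    atoms = [(float(n), 1.0) for n in range(1, 51)]
    s = 2 + 3j
    print(mellin_direct(atoms, s), mellin_by_parts(atoms, s))
```

The agreement of the two values is Abel summation in disguise, which is the discrete form of the integration by parts used above.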
For example, the Riemann zeta function is expressible on $\{s : \Re s > 1\}$ as the Mellin transform of the counting measure $dN(x)$ of positive rational integers. Since the g-zeta function is expressible by a generalized Dirichlet series, as in (1.3), the reader may ask, Why not restrict ourselves to series? The main reason for the generality is that we shall often approximate discrete counting functions by continuous ones, and Mellin transforms enable us to treat continuous as well as discrete measures. We assume that the basic facts, such as the definition of abscissa of convergence (denoted $\sigma_c$), abscissa of absolute convergence (denoted $\sigma_a$), conditions justifying differentiation, etc., are known for Dirichlet series; similar relations hold for Mellin transforms. Here we note only that the region of convergence of a Mellin integral is a half plane and that the Mellin transform has the important property of carrying convolutions into pointwise products (see §2.6). We establish or give references for further information as we need it. We note here an inequality that yields two useful estimates for Mellin transforms with real arguments.

Lemma 2.1. Suppose that $F, G \in \mathcal{V}$ with $F(u) \le G(u)$ for all $u \ge 1$ and suppose that $H$ is a left continuous, positive, decreasing function on $[1, \infty)$. Then, for all $x > 1$,
\[ \int_{1-}^{x} H(u)\, dF(u) \le \int_{1-}^{x} H(u)\, dG(u). \]

Proof. Integration by parts gives
\[ \int_{1-}^{x} H(u) \{dG(u) - dF(u)\} = H(x) \{G(x) - F(x)\} + \int_1^{x} \{G(u) - F(u)\} \{-dH(u)\} \ge 0. \]
Specializing to $H(u) := u^{-\sigma}$ and letting $x \to \infty$, we obtain an inequality between Mellin transforms:

Lemma 2.2. Suppose that $F, G \in \mathcal{V}$ with $F(x) \le G(x)$ for all $x \ge 1$. Then for all $\sigma > \sigma_c$ we have $\widehat{F}(\sigma) \le \widehat{G}(\sigma)$.

2.4. Norms

We introduce a family of norms for measures via Mellin transforms. This will be useful for the next section and in later analysis. If $F \in \mathcal{V}$, $F_v(x) \ll x^{\tau}$ for some $\tau \ge 0$, and $\sigma > \tau$, then, by integration by parts,
\[ \widehat{F_v}(\sigma) = \int_{1-}^{\infty} u^{-\sigma}\, |dF(u)| = \sigma \int_1^{\infty} u^{-\sigma-1} F_v(u)\, du < \infty. \]
In this case, we say that the Mellin integral is absolutely convergent at $\sigma$. (We say that a Mellin integral is absolutely convergent at a complex point $s$ if it is absolutely convergent at $\sigma$, where $\sigma = \Re s$.) We define $d\mathcal{V}_{\sigma}$ to be the set of measures with this property and note that $d\mathcal{V}_{\sigma}$ is a linear space, since it admits addition and scalar multiplication. We define $\|dF\|_{\sigma} := \widehat{F_v}(\sigma)$ and call this the σ-norm of $dF$. Let $\mathcal{V}_{\sigma}$ denote the collection of functions $F \in \mathcal{V}$ whose associated measure has finite σ-norm. It is easy to see that $\|\cdot\|_{\sigma}$ satisfies the expected properties of a norm. Also, $\|dF\|_{\sigma'} \le \|dF\|_{\sigma}$ for any $\sigma' > \sigma$, and in particular $\|dF\|_{\sigma'}$ is finite whenever $\|dF\|_{\sigma}$ is. For the opposite relation between the growth of $F_v$ and the norm, note that if $\|dF\|_{\sigma} < \infty$, then, for any $x \ge 1$,
\[ F_v(x) = \int_{1-}^{x} |dF(u)| \le x^{\sigma} \int_{1-}^{x} u^{-\sigma}\, |dF(u)| \le x^{\sigma} \|dF\|_{\sigma} \ll x^{\sigma}. \tag{2.2} \]
It is important to have the restriction σ ≥ 0 for norms, since some of the preceding inequalities are false for negative values of σ.
2.5. Convergence
We are going to use limits of sequences of measures, and for this we set out a notion of convergence. For $dF_i \in dV$, i = 1, 2, . . . , we say that the sequence converges to a limit (in dV) if there exists a measure dF ∈ dV and some σ ≥ 0 for which
\[ \|dF_i - dF\|_\sigma \to 0, \qquad i \to \infty. \tag{2.3} \]
Of course, if (2.3) holds for some σ, then it holds for any larger value of the parameter. In a related vein, we have the following observation:
Lemma 2.3. Suppose that a sequence $\{dF_i\}$ converges and g is a continuous function of polynomial growth on [1, ∞). Then $\{g\,dF_i\}$ converges.
The proof consists of noting that, if $g(u) \ll u^{c}$, then $\|g\,dF_i\|_{\sigma+c} \ll \|dF_i\|_\sigma$.
Lemma 2.4. If a sequence of measures in dV converges, then the associated sequence of distribution functions converges locally uniformly on R.
Proof. We can assume that $dF_i$ converges to 0, i.e. $\|dF_i\|_\sigma \to 0$ for some σ, and let X > 1 be given. Then, by (2.2), we have, uniformly for x ∈ [1, X],
\[ |F_i(x)| \le (F_i)_v(x) \le x^{\sigma}\,\|dF_i\|_\sigma \le X^{\sigma}\,\|dF_i\|_\sigma \to 0, \qquad i \to \infty. \]
The space of measures $dV_\sigma$ is complete (in the usual sense that a Cauchy sequence in $dV_\sigma$ has a limit in this space): if $\sum_i \|dF_i\|_\sigma < \infty$ for some σ, then, by a simple argument, there exists a measure $dF \in dV_\sigma$ to which $\sum_i dF_i$ converges.
2.6. Convolution of measures
We define multiplicative convolution, a multiplication on dV, by setting
\[ \int_{E} dF * dG := \iint_{uv \in E} dF(u)\,dG(v) \tag{2.4} \]
first for E = (a, b], a half open interval, with the set {uv ∈ (a, b]} understood as {(u, v) ∈ (1/2, ∞) × (1/2, ∞) : a < uv ≤ b}. (The 1/2's serve only as numbers smaller than 1, in order to include possible contributions of dF, dG at 1.) The convolution measure is then extended to all Borel sets E in R. This operation is commutative and associative. Rudin [Ru87] gives an elegant proof of associativity of convolutions by transferring the relation into the Mellin transform space, where we can appeal to the associativity of C. The measure $\delta_1$, Dirac measure at 1, is the unity element of convolutions: for any dF ∈ dV we have $\delta_1 * dF = dF$. We shall occasionally use repeated convolutions: for dF ∈ dV, set $dF^{*0} := \delta_1$ and $dF^{*n} := dF * dF^{*(n-1)}$ for n ≥ 1.
The following integration formulas will be used often. Let dF, dG ∈ dV and f be a Borel function that is bounded on an interval [1, x]. Then
\[ \int_{1-}^{x} dF * dG = \int_{1-}^{x} F(x/v)\,dG(v) = \int_{1-}^{x} G(x/u)\,dF(u) \tag{2.5} \]
and
\[ \int_{1-}^{x} f\,(dF * dG) = \iint_{uv \le x} f(uv)\,dF(u)\,dG(v) = \int_{1-}^{x} \Big\{ \int_{1-}^{x/v} f(uv)\,dF(u) \Big\}\,dG(v). \tag{2.6} \]
The first formula follows from the definition of convolution and Fubini’s Theorem on iterated integration. Formula (2.6) holds for f = χS , the indicator function of some bounded Borel measurable set S, by (2.4) with S ∩ [1, x] in place of E. By linearity the formula holds for f a simple function, and finally it holds for a general bounded measurable function by uniform approximation by simple functions.
Lemma 2.5. Suppose that $F, G \in V_\sigma$, and define H(x) = 0 for x < 1 and
\[ H(x) := \int_{1-}^{x} dF * dG, \qquad x \ge 1. \]
Then $H \in V_\sigma$.
Proof. By construction, H has support in [1, ∞). Next, we show the growth condition for $H_v$. By the triangle inequality,
\[ H_v(x) = \int_{1-}^{x} |dF * dG| \le \int_{1-}^{x} |dF| * |dG|. \]
The last inequality and (2.5) together imply that
\[ H_v(x) \le \int_{1-}^{x} F_v(x/u)\,|dG(u)| \le \int_{1-}^{x} O\big((x/u)^{\sigma}\big)\,|dG(u)| = O(x^{\sigma}), \]
since $\widehat{G_v}(\sigma) < \infty$. It remains to show that H is right continuous. We have
\[ \lim_{\epsilon \to 0+} H(x+\epsilon) - H(x) = \lim_{\epsilon \to 0+} \int_{1-}^{x+\epsilon} \Big\{ F\Big(\frac{x+\epsilon}{u}\Big) - F\Big(\frac{x}{u}\Big) \Big\}\,dG(u). \]
(This formula is valid, since F(x/u) = 0 for x < u ≤ x + ε.) The F difference in the last integral goes to zero pointwise as ε → 0 by right continuity of F, and hence the integral goes to zero by dominated convergence. (Here $2F_v(\{x+1\}/u)$ is a dominating function for the integrand.)
The following formula shows how the Mellin transform converts convolutions into pointwise multiplications. This property is the reason the Mellin (rather than, say, Laplace) transform is useful in studying multiplicative properties.
Lemma 2.6. Let F, G ∈ V and suppose s ∈ C is a number for which the Mellin transform of dF converges and that of dG converges absolutely. Then
\[ \int_{1-}^{\infty} u^{-s}\,(dF * dG)(u) = \widehat{F}(s)\,\widehat{G}(s). \]
(Of course, the roles of F and G can be exchanged here.)
Proof. Let Z be any number exceeding 1. By (2.6),
\[ \int_{1-}^{Z} u^{-s}\,(dF * dG)(u) = \int_{y=1-}^{Z} \int_{x=1-}^{Z/y} (xy)^{-s}\,dF(x)\,dG(y) \]
\[ = \int_{1-}^{\sqrt{Z}} x^{-s}\,dF(x) \int_{1-}^{\sqrt{Z}} y^{-s}\,dG(y) + \int_{y=1-}^{\sqrt{Z}} y^{-s} \int_{x=\sqrt{Z}}^{Z/y} x^{-s}\,dF(x)\,dG(y) + \int_{y=\sqrt{Z}}^{Z} y^{-s} \int_{x=1-}^{Z/y} x^{-s}\,dF(x)\,dG(y) \]
\[ =: I + II + III, \]
say.
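As Z → ∞ we have $I \to \widehat{F}(s)\,\widehat{G}(s)$;
\[ |II| \le \int_{y=1-}^{\sqrt{Z}} y^{-\sigma}\,o(1)\,dG_v(y) = o(1), \]
since $\int_{\sqrt{Z}}^{Z/y} x^{-s}\,dF(x) \to 0$ uniformly for $1 \le y \le \sqrt{Z}$ as Z → ∞; and
\[ |III| \le \int_{y=\sqrt{Z}}^{Z} O(1)\,y^{-\sigma}\,dG_v(y) = o(1). \]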
Example 2.7. Divisor function. The divisor function d of elementary number theory corresponds to the convolution of the counting function of the (rational) integers with itself in the sense that
\[ \sum_{n \le x} d(n) = \sum_{mn \le x} 1 = \iint_{st \le x} d\lfloor s \rfloor\,d\lfloor t \rfloor = \int_{1-}^{x} d\lfloor \cdot \rfloor^{*2}. \]
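As a small numerical illustration of the divisor-function example, the following sketch compares a direct sum of d(n) with the lattice-point count obtained from (2.5) when both measures are the counting measure of the rational integers, whose distribution function is the integer part of x. The function names are illustrative and not taken from the text.

```python
def divisor_sum_direct(x):
    """Sum d(n) over n <= x by counting the divisors of each n."""
    return sum(1 for n in range(1, x + 1) for m in range(1, n + 1) if n % m == 0)


def divisor_sum_hyperbola(x):
    """Count pairs (m, n) with m*n <= x as the sum over m of floor(x/m).

    This is formula (2.5) with F = G = integer part, integrated against
    the point masses at m = 1, 2, ..., x.
    """
    return sum(x // m for m in range(1, x + 1))


x = 100
direct = divisor_sum_direct(x)
hyperbola = divisor_sum_hyperbola(x)
print(direct, hyperbola)
```

Both counts agree for every positive integer x, since each pair (m, n) with mn ≤ x is counted exactly once on either side.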
Example 2.8. Convolution powers of Lebesgue measure. For dt Borel–Lebesgue measure on [1, ∞), we have
\[ dt^{*n} = \log^{n-1} t\,dt/(n-1)!, \qquad n \ge 1. \]
(This relation is most easily verified via the cumulative distribution function or Mellin transforms.)
We conclude this section with a useful inequality.
Lemma 2.9. Suppose that F and G are nondecreasing functions in V with F(u) ≤ G(u) for all u ≥ 1. Then, for all positive integers n and real x > 1,
\[ \int_{1-}^{x} dF^{*n} \le \int_{1-}^{x} dG^{*n}. \]
Proof. We start with the algebraic identity
\[ dG^{*n} - dF^{*n} = (dG - dF) * \big( dG^{*(n-1)} + dG^{*(n-2)} * dF + \cdots + dF^{*(n-1)} \big) \]
and integrate each side:
\[ \int_{1-}^{x} dG^{*n} - dF^{*n} = \int_{1-}^{x} \{G(x/u) - F(x/u)\}\,\big( dG^{*(n-1)} + dG^{*(n-2)} * dF + \cdots + dF^{*(n-1)} \big)(u) \ge 0, \]
since G − F and the measures are each nonnegative.
We shall have occasion to convolve measures involving fractional powers of the argument, such as $d\pi(u^{1/2})$, which occurs in dΠ(u) (see (1.6) and (1.10)). For this we have the following formula.
Lemma 2.10. Let n, k be positive integers and
\[ f(u) = f_n(u) := \varphi(u^{1/n}), \qquad r(u) = r_n(u) := \rho(u^{1/n}) \]
for functions φ, ρ ∈ V. We have
\[ \int_{1-}^{x} df * dr = \int_{1-}^{x^{1/n}} d\varphi * d\rho, \tag{2.7} \]
\[ \int_{1-}^{x} df^{*k} = \int_{1-}^{x^{1/n}} d\varphi^{*k}. \tag{2.8} \]
With a small abuse of notation, we sometimes write the last formula as
\[ \int_{1-}^{x} d\varphi(u^{1/n})^{*k} = \int_{1-}^{x^{1/n}} d\varphi^{*k}. \]
Proof. We use iterated integration and a change of variable. Write
\[ \int_{1-}^{x} df * dr = \int_{1-}^{x} f(x/t)\,dr(t) = \int_{1-}^{x} \varphi(x^{1/n}/t^{1/n})\,d\rho(t^{1/n}) = \int_{1-}^{x^{1/n}} \varphi(x^{1/n}/u)\,d\rho(u) = \int_{1-}^{x^{1/n}} d\varphi * d\rho. \]
To show (2.8), we proceed inductively, taking $dr = df^{*(k-1)}$.
2.7. Convolution of functions
The preceding notion of convolution has a useful specialization to functions. Let $R_{loc}$ be the class of functions on [1, ∞) that are locally Riemann integrable (rather than Lebesgue integrable, to avoid some unneeded complexity). We define f ∗ g, the multiplicative convolution of functions $f, g \in R_{loc}$, by
\[ f * g\,(x) := \int_{1}^{x} f\Big(\frac{x}{u}\Big)\,g(u)\,\frac{du}{u}. \]
In the case of absolutely continuous measures (where we can write dF(x) = f(x) dx) this definition essentially coincides with the previous notion of convolution, as we show now.
Lemma 2.11. Suppose that $f, g \in R_{loc}$ and F, G are their integrals, i.e.
\[ F(x) = \int_{1}^{x} f(u)\,du, \qquad G(x) = \int_{1}^{x} g(u)\,du. \]
Then
\[ f * g(x)\,dx = (dF * dG)(x), \qquad x \ge 1. \]
Proof. It suffices to show that
\[ \int_{1}^{x} f * g(v)\,dv = \int_{1}^{x} (dF * dG)(v), \qquad x \ge 1. \tag{2.9} \]
Each side of the last formula is a continuous function (in each case $\int_{x}^{y} \to 0$ as y − x → 0), so the desired result follows from (2.9) upon differentiating. We show (2.9) by first changing the order of iterated integration and then changing the integration variables:
\[ \int_{1}^{x} f * g\,(v)\,dv = \int_{1}^{x} \int_{1}^{v} f\Big(\frac{v}{u}\Big)\,g(u)\,\frac{du}{u}\,dv = \int_{1}^{x} g(u) \Big\{ \int_{u}^{x} f\Big(\frac{v}{u}\Big)\,\frac{dv}{u} \Big\}\,du \]
\[ = \int_{1}^{x} \Big\{ \int_{1}^{x/u} f(w)\,dw \Big\}\,g(u)\,du = \int_{1}^{x} F\Big(\frac{x}{u}\Big)\,dG(u) = \int_{1}^{x} (dF * dG)(v). \]
Also, by making a change of variable in the defining relation, we see that f ∗ g = g ∗ f, i.e., convolution of functions is commutative. Suppose that f and g are of polynomial growth. Then the Mellin transform of dF exists in some half plane, and it is expressed as
\[ \widehat{F}(s) = \int_{1}^{\infty} x^{-s}\,dF(x) = \int_{1}^{\infty} x^{-s} f(x)\,dx. \]
The last integral is the classical definition of the Mellin transform of a function f, and it is usually denoted also by $\hat{f}(s)$ (sorry!). By the last lemma and Lemma 2.6 the product formula for the Mellin transform of functions is
\[ \int_{1}^{\infty} x^{-s} f(x)\,dx \int_{1}^{\infty} x^{-s} g(x)\,dx = \int_{1}^{\infty} x^{-s}\,(f * g)(x)\,dx, \tag{2.10} \]
or, $\hat{f}(s)\,\hat{g}(s) = (f * g)\hat{\ }(s)$. We shall also occasionally use repeated convolutions $f^{*n}$, n = 2, 3, . . . , and for these $(\hat{f}(s))^{n} = (f^{*n})\hat{\ }(s)$.
Example 2.12. Example 2.8 revisited. Define a function
\[ 1(x) := \begin{cases} 1, & \text{if } x \ge 1, \\ 0, & \text{if } x < 1. \end{cases} \]
We have the convolution formula
\[ 1^{*2}(x) = 1 * 1(x) = \int_{1}^{x} u^{-1}\,du = \log x \]
and, by induction,
\[ 1^{*n}(x) = \int_{1}^{x} \frac{(\log x/u)^{n-2}}{(n-2)!}\,u^{-1}\,du = \frac{(\log x)^{n-1}}{(n-1)!}. \]
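The induction in the last example can be checked numerically. The sketch below approximates the multiplicative convolution of functions, the integral of f(x/u) g(u) du/u over [1, x], by a midpoint rule in the variable t = log u and compares the results with log x and (log x)²/2. The helper `mult_conv` and its step count are assumptions made for illustration only.

```python
import math


def mult_conv(f, g, x, steps=2000):
    """Approximate (f*g)(x) = int_1^x f(x/u) g(u) du/u.

    With u = exp(t) we have du/u = dt, so the integral runs over
    t in [0, log x]; the midpoint rule is applied in t.
    """
    if x <= 1:
        return 0.0
    h = math.log(x) / steps
    total = 0.0
    for k in range(steps):
        u = math.exp((k + 0.5) * h)
        total += f(x / u) * g(u)
    return total * h


def one(u):
    """The function 1(x) of Example 2.12."""
    return 1.0 if u >= 1 else 0.0


def log_fn(u):
    """Closed form of 1*1, namely log u on [1, oo)."""
    return math.log(u) if u >= 1 else 0.0


x = 5.0
conv2 = mult_conv(one, one, x)      # approximates 1^{*2}(x)
conv3 = mult_conv(log_fn, one, x)   # approximates 1^{*3}(x)
print(conv2, math.log(x))
print(conv3, math.log(x) ** 2 / 2)
```

Because the integrands are constant and linear in t respectively, the midpoint rule reproduces both closed forms up to rounding error.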
2.8. The L and T operators
In this section we introduce two mappings of the class of measures dV to itself. These will appear many times in the sequel, particularly in connection with Mellin transforms and their derivatives. For k any nonnegative integer, define $L^{k}$ on dV by
\[ L^{k} : dF(t) \mapsto (\log t)^{k}\,dF(t); \]
for k = 1, we write L instead of $L^{1}$. For α ∈ C, define $T^{\alpha}$ on dV by
\[ T^{\alpha} : dF(t) \mapsto t^{\alpha}\,dF(t), \]
and for α = 1, we write T instead of $T^{1}$.
Let us look at some properties of L. Differentiation of a Mellin transform yields the formula
\[ \frac{d}{ds} \int_{1-}^{\infty} u^{-s}\,dF(u) = -\int_{1-}^{\infty} u^{-s} \log u\,dF(u) = -\int_{1-}^{\infty} u^{-s}\,(L\,dF)(u), \]
which suggests that L is connected with differentiation. We call a linear operator on a ring that satisfies the product rule of differentiation a derivation. We claim that L has this property. L is clearly linear, and we now show that it satisfies the product rule.
Lemma 2.13. For dF, dG ∈ dV, we have
\[ L\,(dF * dG) = (L\,dF) * dG + dF * (L\,dG). \tag{2.11} \]
Proof. It suffices to establish the relation for the associated distribution functions. By (2.6), we have
\[ \int_{1}^{x} L\,(dF * dG) = \iint_{uv \le x} \log(uv)\,dF(u)\,dG(v) = \iint_{uv \le x} \log u\,dF(u)\,dG(v) + \iint_{uv \le x} dF(u)\,\log v\,dG(v) \]
\[ = \int_{1}^{x} (L\,dF) * dG + dF * (L\,dG). \]
(Note that it is irrelevant whether these integrals start at 1− or at 1, since the logarithm vanishes at 1.)
If we define $L^{0}\,dF := dF$ and $L^{n}\,dF := L(L^{n-1}\,dF)$ for n ≥ 1, then the preceding formula implies that
\[ L(dF^{*n}) = n\,dF^{*(n-1)} * L\,dF, \tag{2.12} \]
\[ L^{n}(dF * dG) = \sum_{j=0}^{n} \binom{n}{j} L^{j}\,dF * L^{n-j}\,dG. \tag{2.13} \]
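For measures supported on the positive integers, convolution reduces to Dirichlet convolution of the point masses, and the product rule (2.11) and its consequence (2.12) can be tested directly. The following sketch does this in floating point for two arbitrary test measures; the list representation (index n holds the mass at n, index 0 is unused) and the chosen masses are assumptions made for the illustration.

```python
import math


def conv(a, b, X):
    """Multiplicative convolution of point masses on 1..X, truncated at X."""
    c = [0.0] * (X + 1)
    for m in range(1, X + 1):
        for n in range(1, X // m + 1):
            c[m * n] += a[m] * b[n]
    return c


def L(a):
    """The log operator: multiply the mass at n by log n."""
    return [0.0] + [a[n] * math.log(n) for n in range(1, len(a))]


X = 60
F = [0.0] + [1.0 / n for n in range(1, X + 1)]                 # test masses
G = [0.0] + [(-1.0) ** n * (n % 5) for n in range(1, X + 1)]   # test masses

# Product rule (2.11): L(F*G) = (LF)*G + F*(LG).
lhs = L(conv(F, G, X))
rhs = [p + q for p, q in zip(conv(L(F), G, X), conv(F, L(G), X))]
gap_product_rule = max(abs(u - v) for u, v in zip(lhs, rhs))

# Power rule (2.12) with n = 3: L(F^{*3}) = 3 F^{*2} * LF.
F2 = conv(F, F, X)
lhs3 = L(conv(F2, F, X))
rhs3 = [3.0 * v for v in conv(F2, L(F), X)]
gap_power_rule = max(abs(u - v) for u, v in zip(lhs3, rhs3))

print(gap_product_rule, gap_power_rule)
```

Truncating at X loses nothing here, because the mass of a convolution at n depends only on the masses of the factors at divisors of n.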
We are going to apply L term-by-term to infinite series. This is justified by the following observation.
Lemma 2.14. L is a continuous operator on dV.
The proof consists in noting that $\{L\,dF_i\}$ converges whenever $\{dF_i\}$ does, by applying Lemma 2.3.
Now we look at T.
Lemma 2.15. For dF, dG ∈ dV and any α ∈ C, we have
\[ T^{\alpha}(dF * dG) = (T^{\alpha}\,dF) * (T^{\alpha}\,dG). \tag{2.14} \]
Proof. Using the first part of (2.6), we get
\[ \int_{1-}^{x} T^{\alpha}(dF * dG) = \iint_{st \le x} (st)^{\alpha}\,dF(s)\,dG(t) = \iint_{st \le x} \big(s^{\alpha}\,dF(s)\big)\big(t^{\alpha}\,dG(t)\big) = \int_{1-}^{x} (T^{\alpha}\,dF) * (T^{\alpha}\,dG). \]
Since the integrated version of (2.14) holds, so does the measure form.
Some simple properties of T are that it is one-to-one and onto; also, the composition $T^{\alpha} \circ T^{-\alpha} = T^{0}$, the identity mapping on dV. The T and L maps also have obvious interpretations as operators on $R_{loc}$, the algebra of all locally Riemann integrable functions supported in [1, ∞):
\[ (T^{\alpha} f)(t) := t^{\alpha} f(t), \qquad (L^{k} f)(t) := (\log t)^{k} f(t), \]
with the understanding that (Lf)(t) = (log t) f(t). L is a derivation, i.e.
\[ L(f * g) = (Lf) * g + f * Lg. \tag{2.15} \]
Indeed, if we use Lemma 2.11 to interpret f ∗ g(x) dx as dF ∗ dG, then Lemma 2.13 shows the derivation property at once. Each T operator is an isomorphism on the algebra, satisfying
\[ T^{\alpha}(f * g) = T^{\alpha} f * T^{\alpha} g. \tag{2.16} \]
This is verified by noting that
\[ x^{\alpha} \int_{1}^{x} f\Big(\frac{x}{u}\Big)\,g(u)\,\frac{du}{u} = \int_{1}^{x} \Big\{ f\Big(\frac{x}{u}\Big) \Big(\frac{x}{u}\Big)^{\alpha} \Big\}\,\{g(u)\,u^{\alpha}\}\,\frac{du}{u}. \]
Example 2.16. A T product. The formula
\[ \Big\{ \int_{1}^{\infty} f(x)\,x^{-s-1}\,dx \Big\}^{n} = \int_{1}^{\infty} f^{*n}(x)\,x^{-s-1}\,dx \]
will be used later. This equality can be established in terms of the operator $T^{-1}$ by writing
\[ \Big\{ \int_{1}^{\infty} x^{-s}\,(T^{-1} f)(x)\,dx \Big\}^{n} = \int_{1}^{\infty} x^{-s}\,(T^{-1} f)^{*n}(x)\,dx = \int_{1}^{\infty} x^{-s}\,T^{-1}(f^{*n})(x)\,dx. \]
2.9. Notes References for this chapter include [Ro88], Ch. 6 and 7 of [Ru87], Ch. 9 of [Ti39], [Di70a], as well as the works cited in the Preface.
https://doi.org/10.1090//surv/213/03
CHAPTER 3
dN as an Exponential and Chebyshev's Identity
Can exponentials compound our interest?
Summary. The counting measure of integers as an exponential of the weighted prime and prime power counting measure. Connection with Chebyshev's elementary prime number identity.
3.1. Goals and plan
This chapter has two main goals. The first is to express dN, the integer counting measure, as the exponential of the measure dΠ, specifically
\[ dN = \delta_1 + d\Pi + \frac{1}{2!}\,d\Pi * d\Pi + \frac{1}{3!}\,d\Pi * d\Pi * d\Pi + \cdots =: \exp^{*} d\Pi, \tag{3.1} \]
with ∗ denoting multiplicative convolution. This relation provides the fundamental connection between the counting functions of g-primes and the associated g-integers, and it is the reason for introducing Π(x), which counts primes and prime powers.
Chebyshev showed in the 1850s that the rational primes satisfy the inequalities
\[ \pi(x) \le \Pi(x) \ll x/\log x. \tag{3.2} \]
By analogy with the classical case, we say that a g-number system satisfies a Chebyshev O-bound if (3.2) holds. We shall study these estimates in Chapters 9 and 11. The argument Chebyshev used was in essence based on the identity
\[ L\,dN = dN * d\psi, \tag{3.3} \]
where L denotes the log operator, introduced in §2.8, and dψ = L dΠ is the weighted prime and prime power counting measure. Our second goal for this chapter is to show that (3.3) holds for generalized number systems, and moreover this relation is intimately connected with the exponential representation above. If we recall the derivation property of L, the connection between the formulas is suggested by the calculus relation for differentiable functions
\[ y(x) = C e^{f(x)} \iff y'(x) = y(x)\,f'(x). \]
We shall show that the power series (3.1) is convergent and that the analogues of familiar analytic operations are valid. Then we show that exp∗ has the expected properties of an exponential, that (3.3) and (3.1) are equivalent, and that these relations hold for g-numbers.
At first glance the exponential representation (3.1) looks "far-out." However, the Mellin formulas for ζ(s) and log ζ(s) and the Mellin transform's homomorphic property of taking convolutions to products suggest how (3.1) might make sense:
starting with (1.5), expanding the exponential as a series, and using Lemma 2.6 for each term, for s in some half plane, we find
\[ \zeta(s) = \exp \log \zeta(s) = \exp \Big\{ \int_{1}^{\infty} u^{-s}\,d\Pi(u) \Big\} \]
\[ = 1 + \int u^{-s}\,d\Pi(u) + \frac{1}{2!} \Big\{ \int u^{-s}\,d\Pi(u) \Big\}^{2} + \frac{1}{3!} \Big\{ \int u^{-s}\,d\Pi(u) \Big\}^{3} + \cdots \]
\[ = \int u^{-s} \Big\{ \delta_1 + d\Pi + \frac{1}{2!}\,d\Pi^{*2} + \frac{1}{3!}\,d\Pi^{*3} + \cdots \Big\}(u). \]
(For compactness in the preceding formula, we have suppressed the limits on the integrals and used $d\Pi^{*i}$ to denote an i-fold convolution of dΠ with itself.) That is, we have shown
\[ \int_{1-}^{\infty} u^{-s}\,dN(u) = \int_{1-}^{\infty} u^{-s} \exp^{*}\{d\Pi\}(u). \]
The uniqueness theorem for Mellin transforms (see [BD04] §6.6, or for the Dirichlet series version, [Ch68] Ch. X, Th. 7; [MV07] Th. 1.6) implies
Theorem 3.1. The formula dN = exp∗ dΠ holds for any g-number system of polynomial growth.
Remark 3.2. The polynomial growth condition arises from our analytic proof, which requires the zeta function to be convergent in some half plane. This is not a serious restriction, for we shall study only g-number systems that are reasonably similar to the natural numbers. An unconditional version of (3.1) holds as well (cf. [Di70a]) if we use a weaker notion of convergence than that given in §2.5.
The preceding formulas together yield the interesting Mellin transform identity for the zeta function
\[ \exp \Big\{ \int_{1}^{\infty} u^{-s}\,d\Pi(u) \Big\} = \int_{1-}^{\infty} u^{-s} \exp^{*}\{d\Pi\}(u). \tag{3.4} \]
The last identity leads to some useful observations about the zeta function.
Lemma 3.3. Suppose that the integer counting function N of a g-number system $\mathcal{N}$ satisfies the O-density condition
\[ \limsup_{x \to \infty} N(x)/x < \infty. \tag{1.2 bis} \]
Then $\zeta_{\mathcal{N}}(s) = \zeta(s)$ converges in the open half-plane $\{\Re s > 1\}$ and is nonzero there.
Proof. First, for any σ > 1,
\[ \zeta(\sigma) = \int_{1-}^{\infty} u^{-\sigma}\,dN(u) = \sigma \int_{1}^{\infty} u^{-\sigma-1} N(u)\,du \ll \sigma \int_{1}^{\infty} u^{-\sigma}\,du = \frac{\sigma}{\sigma-1} < \infty. \]
Thus the zeta function converges absolutely on $\{\Re s > 1\}$. Next, we have for σ > 1,
\[ |\log \zeta(\sigma + it)| = \Big| \int_{1}^{\infty} u^{-\sigma-it}\,d\Pi(u) \Big| \le \int_{1}^{\infty} u^{-\sigma}\,d\Pi(u) = \log \zeta(\sigma) < \zeta(\sigma) < \infty, \]
and since log ζ(s) is thus finite at each point of this half plane, ζ(s) ≠ 0 here.
Relations (3.1) and (3.3) are at the heart of our treatment of g-numbers. They apply to continuous “counting functions” as well as discrete ones. Thus it is useful to study the properties of convolution exponentials.
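For the rational integers, the exponential relation (3.1) can be verified exactly on an initial segment. In the sketch below, dΠ is represented by its point masses (1/k at each prime power p^k), the convolution exponential is summed with exact rational arithmetic, and the result is compared with dN, which has mass 1 at every positive integer. The truncation level X and all function names are assumptions made for the illustration; the series is a finite sum at each n because the i-fold convolution of dΠ is supported on n ≥ 2^i.

```python
from fractions import Fraction


def dirichlet_conv(a, b, X):
    """Multiplicative convolution of point masses on 1..X (index 0 unused)."""
    c = [Fraction(0)] * (X + 1)
    for m in range(1, X + 1):
        if a[m] == 0:
            continue
        for n in range(1, X // m + 1):
            c[m * n] += a[m] * b[n]
    return c


def prime_power_masses(X):
    """dPi for the rational primes: mass 1/k at each prime power p^k <= X."""
    w = [Fraction(0)] * (X + 1)
    is_prime = [True] * (X + 1)
    for p in range(2, X + 1):
        if not is_prime[p]:
            continue
        for q in range(p * p, X + 1, p):
            is_prime[q] = False
        pk, k = p, 1
        while pk <= X:
            w[pk] = Fraction(1, k)
            pk, k = pk * p, k + 1
    return w


def conv_exp(a, X):
    """exp*(a) on 1..X; assumes a[1] == 0, so only finitely many terms matter."""
    delta = [Fraction(0)] * (X + 1)
    delta[1] = Fraction(1)
    result = list(delta)
    power, factorial, i = delta, 1, 0
    while any(power):
        i += 1
        factorial *= i
        power = dirichlet_conv(power, a, X)  # i-fold convolution of a
        result = [r + p / factorial for r, p in zip(result, power)]
    return result


X = 200
dN = conv_exp(prime_power_masses(X), X)
print(dN[1:13])
```

Every entry of `dN[1:]` comes out as exactly 1, which is the statement dN = exp∗ dΠ restricted to the integers up to X.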
3.2. Power series in measures
Convolution exponentials are one case of power series in measures. We shall presently meet some other series as well, so we consider here the general notion of g(dF) for measures dF and functions g that are holomorphic at the origin.
Let V(c) denote the collection of functions F ∈ V satisfying F(1) = c, and let dV(c) represent the associated measures. Since elements of V satisfy F(x) = 0 for all x < 1, we have dF({1}) = F(1), and elements of dV(c) have a point mass c at the point 1. Our main cases will be c = 0 and 1. The convergence of a series in a measure dF depends on the size of |dF({1})|, as we now show.
Proposition 3.4. Let g(z) be a holomorphic function at 0 whose Maclaurin series $\sum a_n z^{n}$ has radius of convergence ρ. Let dF ∈ dV(c) with |c| < ρ. Then the series
\[ g(dF) := \sum_{n=0}^{\infty} a_n\,dF^{*n} \]
converges to an element of dV(c). Further, g is continuous on $\bigcup_{|c| < \rho} dV(c)$.
Proof. Since $u^{-\sigma} \to 0$ as σ → ∞ for each u > 1, by Lebesgue's convergence theorem we have $\|dF\|_\sigma \to |c|$ as σ → ∞. Thus there exists a number $\sigma_1$ such that $\|dF\|_\sigma < \rho$ for $\sigma > \sigma_1$. For such a value of σ, we show that
\[ \Big\| g(dF) - \sum_{n=0}^{N} a_n\,dF^{*n} \Big\|_\sigma = \Big\| \sum_{n > N} a_n\,dF^{*n} \Big\|_\sigma \to 0, \qquad N \to \infty. \]
By the triangle inequality, it suffices to show that, as N → ∞,
\[ \sum_{n > N} |a_n| \int_{1-}^{\infty} u^{-\sigma}\,|dF|^{*n}(u) \to 0. \]
By Lemma 2.6, the last sum is
\[ \sum_{n > N} |a_n| \Big\{ \int_{1-}^{\infty} u^{-\sigma}\,|dF|(u) \Big\}^{n} = \sum_{n > N} |a_n|\,(\|dF\|_\sigma)^{n}, \]
and this converges to 0 as N → ∞, since $\|dF\|_\sigma < \rho$ and the last expression is the tail of a convergent series.
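To show continuity, suppose $\|dF\|_\sigma < \rho$ and $\|dF_\nu - dF\|_\sigma \to 0$ as ν → ∞. Let $M_\nu := \max(\|dF_\nu\|_\sigma, \|dF\|_\sigma)$, and say ν is large enough that $M_\nu \le \rho' < \rho$. Then
\[ \|g(dF_\nu) - g(dF)\|_\sigma \le \sum_{n=1}^{\infty} |a_n|\,\big\| dF_\nu^{*n} - dF^{*n} \big\|_\sigma \le \sum_{n=1}^{\infty} |a_n|\,\Big\| \sum_{j=0}^{n-1} dF_\nu^{*(n-1-j)} * dF^{*j} * (dF_\nu - dF) \Big\|_\sigma \]
\[ \le \sum_{n=1}^{\infty} |a_n|\,n M_\nu^{\,n-1}\,\|dF_\nu - dF\|_\sigma \le \sum_{n=1}^{\infty} |a_n|\,n \rho'^{\,n-1}\,\|dF_\nu - dF\|_\sigma \to 0, \qquad \nu \to \infty. \]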
Most of our applications will be to measures dF ∈ dV(0); in this case g(dF) will exist for any function g that is holomorphic at 0.
3.3. Inverses
We say that dF ∈ dV is invertible if there exists dG ∈ dV such that $dF * dG = \delta_1$. Obviously dG is invertible as well. The usual argument from group theory shows that if an inverse exists, it is unique; we will generally refer to dG as the inverse of dF and denote it by $dF^{*-1}$. One example of an inverse relation, from classical number theory, is $dN * dM = \delta_1$, where M is the summatory function of the Moebius function. Another example is
\[ (\delta_1 + dt) * (\delta_1 - dt/t) = \delta_1. \tag{3.5} \]
Lemma 2.6 implies that the Mellin transforms of a pair of mutually inverse measures are reciprocals. One way to verify the last formula is to use Mellin transforms.
Evaluation at the set {1} shows that dF({1}) ≠ 0 is a necessary condition for dF to be invertible. Now we show this condition is sufficient as well.
Proposition 3.5. If dF ∈ dV and dF({1}) ≠ 0, then dF is invertible. If dF({1}) = 1, then
\[ dF^{*-1} = \sum_{n=0}^{\infty} (-1)^{n}\,(dF - \delta_1)^{*n}. \tag{3.6} \]
Proof. First, suppose dF({1}) = 1, and set $dF_1 := dF - \delta_1$ with $dF_1 \in dV(0)$. The function $g(z) := (1+z)^{-1}$ has the series
\[ g(z) = \sum_{n=0}^{\infty} (-1)^{n} z^{n} \]
in the disc {z ∈ C : |z| < 1}. We claim that $g(dF_1)$ is the inverse of dF. First, note that $g(dF_1) \in dV$. Indeed, for σ large enough, we have $\|dF_1\|_\sigma < 1$ (as shown in the proof of the last proposition), and so
\[ \Big\| \sum_{n=0}^{\infty} (-1)^{n}\,(dF_1)^{*n} \Big\|_\sigma \le \sum_{n=0}^{\infty} \|dF_1\|_\sigma^{\,n} < \infty. \]
Next, we have
\[ \big\| (\delta_1 + dF_1) * g(dF_1) - \delta_1 \big\|_\sigma \le \Big\| (\delta_1 + dF_1) * \sum_{n=0}^{N-1} (-1)^{n}\,dF_1^{*n} - \delta_1 \Big\|_\sigma + \Big\| (\delta_1 + dF_1) * \sum_{n=N}^{\infty} (-1)^{n}\,dF_1^{*n} \Big\|_\sigma \]
\[ \le \|dF_1\|_\sigma^{\,N} + \|dF\|_\sigma \cdot \sum_{n=N}^{\infty} \|dF_1\|_\sigma^{\,n} \to 0 \]
as N → ∞. Thus $dF * g(dF_1) = \delta_1$, so dF ∈ dV(1) is invertible.
Finally, suppose more generally that dF({1}) = c ≠ 0. Let $dF_2 = c^{-1}\,dF$. Now $dF_2(\{1\}) = 1$, so there exists a measure $dG_2$ such that $dF_2 * dG_2 = \delta_1$. Now $dG = c^{-1}\,dG_2$ is the inverse of dF.
It is immediate that the measures of dV(0) form a group under addition, with 0 the neutral element and −dG the inverse of dG. We show now that dV(1) forms a group under convolution. It is clear that the convolution of two measures in dV(1) also lies there and that $\delta_1$ is the neutral element. As we noted in §2.6, convolution is associative. The preceding proposition implies that elements of dV(1) have an inverse in dV. Finally, since
\[ dF(\{1\})\,dF^{*-1}(\{1\}) = (dF * dF^{*-1})(\{1\}) = \delta_1(\{1\}) = 1, \]
the inverse of each element of dV(1) also lies in this set.
We show next that there is a homomorphic mapping from the group (dV(0), +) to (dV(1), ∗). As in the case of numbers, the exponential provides this map.
3.4. The exponential on V
Given F ∈ V, define
\[ \exp^{*} dF := \delta_1 + dF + \frac{1}{2!}\,dF * dF + \cdots = \sum_{i=0}^{\infty} \frac{1}{i!}\,(dF)^{*i}. \]
As usual, $dF^{*0} := \delta_1$, the Dirac measure at 1, and for i ≥ 1, $dF^{*i} := dF * dF^{*(i-1)}$, the i-fold convolution of dF. By Proposition 3.4, this series is convergent regardless of the value of dF({1}), since the exponential is an entire function.
Examples 3.6. Two exponential relations. In §3.1 we showed that dN = exp∗ dΠ holds for any g-number system. Another exponential relation is provided by the continuous analogue of dN, discussed in §1.4:
\[ \delta_1 + dt = \exp^{*} d\Pi_c \quad \text{with} \quad \Pi_c(x) := \int_{1}^{x} \frac{1 - u^{-1}}{\log u}\,du. \tag{3.7} \]
We shall establish this formula by a generalization of (3.4).
Proposition 3.7. Suppose F ∈ V and $\widehat{F}$ is absolutely convergent at $\sigma_a$. Then, for all s with $\Re s \ge \sigma_a$,
\[ \exp \Big\{ \int_{1-}^{\infty} u^{-s}\,dF(u) \Big\} = \int_{1-}^{\infty} u^{-s} \exp^{*} dF(u). \]
Proof. Expanding the exponential as a series and applying Lemma 2.6, we find
\[ \exp \widehat{F}(s) = \sum_{n=0}^{\infty} \frac{1}{n!} \Big\{ \int_{1-}^{\infty} u^{-s}\,dF(u) \Big\}^{n} = \sum_{n=0}^{\infty} \frac{1}{n!} \int_{1-}^{\infty} u^{-s}\,dF^{*n}(u) \]
\[ = \int_{1-}^{\infty} u^{-s} \sum_{n=0}^{\infty} \frac{1}{n!}\,dF^{*n}(u) = \int_{1-}^{\infty} u^{-s} \exp^{*} dF(u). \]
The exchange of the sum and integral is justified by absolute convergence.
Proof of (3.7). By Theorem 1.5 and the last proposition we have
\[ \frac{s}{s-1} = \int_{1-}^{\infty} u^{-s}\,(\delta_1 + du) = \exp \Big\{ \int_{1}^{\infty} u^{-s}\,d\Pi_c(u) \Big\} = \int_{1-}^{\infty} u^{-s} \exp^{*} d\Pi_c(u). \]
Now $\delta_1 + du = \exp^{*} d\Pi_c$ by the identity theorem for Mellin transforms.
We shall revisit (3.7) in the next section. As another application of the proposition, we show
Lemma 3.8. Let F ∈ V and α ∈ R. Then $T^{\alpha} \exp^{*} dF = \exp^{*}(T^{\alpha}\,dF)$.
Proof. One can apply Lemma 2.15 termwise followed by a convergence argument. As an alternative, we use Proposition 3.7 twice to obtain
\[ \int_{1-}^{\infty} u^{-s} u^{\alpha}\,(\exp^{*} dF)(u) = \int_{1-}^{\infty} u^{-(s-\alpha)}\,(\exp^{*} dF)(u) = \exp \int_{1-}^{\infty} u^{-(s-\alpha)}\,dF(u), \]
\[ \int_{1-}^{\infty} u^{-s}\,(\exp^{*} u^{\alpha}\,dF)(u) = \exp \int_{1-}^{\infty} u^{-s} u^{\alpha}\,dF(u) = \exp \int_{1-}^{\infty} u^{-(s-\alpha)}\,dF(u). \]
Since the Mellin transforms are the same, so are the measures.
We showed in Proposition 3.7 how the exponential "commutes" with Mellin transforms. The following inequality involves analogous exponentials and integrals, but with a positive measure and taken over a bounded interval.
Lemma 3.9. Let F ∈ V be nondecreasing. For x ≥ 1 we have
\[ \exp \Big\{ \int_{1-}^{x} dF(u) \Big\} \ge \int_{1-}^{x} \exp^{*}\{dF\}(u). \]
Proof. Remember that dF is zero to the left of 1. Now the left side of the last equation equals
\[ 1 + \int_{1-}^{x} dF(u) + \frac{1}{2!} \iint_{u \le x,\ v \le x} dF(u)\,dF(v) + \cdots, \]
while the right side equals
\[ 1 + \int_{1-}^{x} dF(u) + \frac{1}{2!} \iint_{uv \le x} dF(u)\,dF(v) + \cdots. \]
The multiple integrals are taken over subregions in the second expression, so the claimed inequality holds.
In our use of the exponential, we shall generally assume that dF ∈ dV(0), in which case exp∗ dF ∈ dV(1). If F(1) = c ≠ 0, we can express $dF = c\,\delta_1 + dF_1$ where $F_1(1) = 0$ and apply the following proposition to obtain
\[ \exp^{*} dF = \exp^{*}\{c\,\delta_1\} * \exp^{*} dF_1 = e^{c} \exp^{*} dF_1. \]
The exp∗ operator on dV has the algebraic properties of an exponential; in particular it takes sums into (convolution) products.
Proposition 3.10. For dF, dG ∈ dV, we have
\[ \exp^{*}(dF + dG) = \exp^{*} dF * \exp^{*} dG. \]
Proof. For K a positive integer, we add the elements of the square matrix
\[ \Big( \frac{1}{i!}\,\frac{1}{j!}\,dF^{*i} * dG^{*j} \Big)_{0 \le i, j \le K} \]
as a double sum in two ways: First, sum by rows and then by columns, yielding
\[ \sum_{0 \le i \le K} \frac{1}{i!}\,dF^{*i} * \sum_{0 \le j \le K} \frac{1}{j!}\,dG^{*j}, \tag{3.8} \]
an approximation to exp∗ dF ∗ exp∗ dG. Next, sum over points (i, j) on fixed diagonals of the form i + j = constant and lying within the square [0, K] × [0, K], yielding
\[ \sum_{0 \le k \le K} \frac{1}{k!} \sum_{i+j=k} \frac{k!}{i!\,j!}\,dF^{*i} * dG^{*j} + dT, \]
where dT represents the sum of the elements of the (K + 1) × (K + 1) matrix lying outside the triangular region {i + j ≤ K}. The contribution from the first region is
\[ \sum_{k \le K} \frac{1}{k!}\,(dF + dG)^{*k}, \tag{3.9} \]
an approximation to exp∗(dF + dG).
Let σ be sufficiently large that $\|dF\|_\sigma$ and $\|dG\|_\sigma$ are each finite. We next show that dT, the difference of the expressions in (3.8) and (3.9), has arbitrarily small σ-norm for large K. Replace dF and dG by their total variation measures and extend the dT sum beyond the (K + 1) × (K + 1) square to get
\[ \|dT\|_\sigma \le \sum_{k > K} \frac{1}{k!} \int_{1-}^{\infty} u^{-\sigma}\,(|dF| + |dG|)^{*k}(u) = \sum_{k > K} \frac{1}{k!}\,\big( \|dF\|_\sigma + \|dG\|_\sigma \big)^{k}. \]
The last series goes to 0 as K → ∞. The preceding argument implies also that (3.9) converges to exp∗(dF + dG).
It remains to show that (3.8) converges to exp∗ dF ∗ exp∗ dG. We have
\[ dH_K := \exp^{*} dF * \exp^{*} dG - \sum_{0 \le i \le K} \frac{1}{i!}\,dF^{*i} * \sum_{0 \le j \le K} \frac{1}{j!}\,dG^{*j} \]
\[ = \exp^{*} dF * \sum_{j > K} \frac{1}{j!}\,dG^{*j} + \sum_{0 \le j \le K} \frac{1}{j!}\,dG^{*j} * \sum_{i > K} \frac{1}{i!}\,dF^{*i}. \]
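Using the bound
\[ \Big| \sum_{0 \le j \le K} \frac{1}{j!}\,dG^{*j} \Big| \le \exp^{*} |dG|, \]
and Proposition 3.7 (connecting exponentials and Mellin transforms) we get
\[ \|dH_K\|_\sigma \le \exp(\|dF\|_\sigma) \sum_{j > K} \frac{1}{j!}\,(\|dG\|_\sigma)^{j} + \exp(\|dG\|_\sigma) \sum_{i > K} \frac{1}{i!}\,(\|dF\|_\sigma)^{i}, \]
which converges to 0 as K → ∞.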
Corollary 3.11. If dF ∈ dV and exp∗ dF = dG, then $\exp^{*}\{-dF\} = dG^{*-1}$.
The proof consists of noting that $\exp^{*} dF * \exp^{*}(-dF) = \exp^{*} 0 = \delta_1$.
Examples 3.12. $dN^{*2}$ and $dN^{*-1}$ as exponentials. In classical number theory, dN ∗ dN corresponds to the divisor function and $dN^{*-1}$ to the Moebius function. We have
\[ dN * dN = \exp^{*}(2\,d\Pi), \qquad dN^{*-1} = \exp^{*}(-d\Pi). \]
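Both relations in Examples 3.12 can be checked exactly for the rational integers on an initial segment: the convolution exponential of 2 dΠ should reproduce the divisor function and that of −dΠ the Moebius function. The sketch below is self-contained; the truncation level, the list representation of measures, and the helper names are assumptions made for the illustration.

```python
from fractions import Fraction


def dirichlet_conv(a, b, X):
    """Multiplicative convolution of point masses on 1..X (index 0 unused)."""
    c = [Fraction(0)] * (X + 1)
    for m in range(1, X + 1):
        if a[m] == 0:
            continue
        for n in range(1, X // m + 1):
            c[m * n] += a[m] * b[n]
    return c


def conv_exp(a, X):
    """exp*(a) on 1..X; assumes a[1] == 0, so the series is a finite sum."""
    delta = [Fraction(0)] * (X + 1)
    delta[1] = Fraction(1)
    result, power, factorial, i = list(delta), delta, 1, 0
    while any(power):
        i += 1
        factorial *= i
        power = dirichlet_conv(power, a, X)
        result = [r + p / factorial for r, p in zip(result, power)]
    return result


def prime_power_masses(X, scale):
    """scale * dPi: mass scale/k at each prime power p^k <= X."""
    w = [Fraction(0)] * (X + 1)
    for p in range(2, X + 1):
        if all(p % q for q in range(2, int(p ** 0.5) + 1)):  # p is prime
            pk, k = p, 1
            while pk <= X:
                w[pk] = Fraction(scale, k)
                pk, k = pk * p, k + 1
    return w


def divisor_count(n):
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def moebius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # a squared prime divides the original n
            result = -result
        p += 1
    return -result if n > 1 else result


X = 120
divisor_measure = conv_exp(prime_power_masses(X, 2), X)    # exp*(2 dPi)
moebius_measure = conv_exp(prime_power_masses(X, -1), X)   # exp*(-dPi)
print(divisor_measure[1:13])
print(moebius_measure[1:13])
```

The second result is also the convolution inverse of dN, in agreement with Corollary 3.11.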
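Example 3.13. Square roots. Recall from (3.1) and (1.9) that
\[ dN = \exp^{*} d\Pi, \qquad \delta_1 + dt = \exp^{*} \Big\{ \frac{1 - t^{-1}}{\log t}\,dt \Big\} =: \exp^{*} d\Pi_c. \]
Set
\[ dN^{*1/2} := \exp^{*}(d\Pi/2), \qquad (\delta_1 + dt)^{*1/2} := \exp^{*}(d\Pi_c/2). \]
We have
\[ dN^{*1/2} * dN^{*1/2} = dN \quad \text{and} \quad (\delta_1 + dt)^{*1/2} * (\delta_1 + dt)^{*1/2} = \delta_1 + dt. \]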
The PNT (for both rational and g-numbers) asserts that $\Pi(x) - \Pi_c(x)$ is "small," where $\Pi_c(x) := \int_{1}^{x} (1 - t^{-1})\,dt/(\log t)$. As an application of several of our relations, here are two parallel formulas involving $\Pi(x) - \Pi_c(x)$ that will be useful in estimating the integer counting function N(x).
Lemma 3.14. For x ≥ 1 we have
\[ N(x) - \int_{1}^{x} N(u)\,u^{-1}\,du = \int_{1-}^{x} \exp^{*}(d\Pi - d\Pi_c), \tag{3.10} \]
\[ N(x)/x = \int_{1-}^{x} T^{-1}\{dN * (\delta_1 - dt/t)\} = \int_{1-}^{x} T^{-1} \exp^{*}(d\Pi - d\Pi_c), \tag{3.11} \]
where T is defined in §2.8.
Proof. For the first relation, write
\[ dN * (\delta_1 - dt/t) = dN * (\delta_1 + dt)^{*-1} = \exp^{*}(d\Pi - d\Pi_c). \]
Thus
\[ \int_{1-}^{x} \exp^{*}(d\Pi - d\Pi_c) = N(x) - \int_{1}^{x} N\Big(\frac{x}{t}\Big)\,\frac{dt}{t} = N(x) - \int_{1}^{x} N(u)\,\frac{du}{u}. \]
For the second relation, use (3.5) to write
\[ N(x) = \int_{1-}^{x} dN * (\delta_1 - dt/t) * (\delta_1 + dt) = \int_{1-}^{x} \frac{x}{t}\,\{dN * (\delta_1 - dt/t)\}. \]
Thus N(x)/x has the first claimed form. By writing dN = exp∗ dΠ and $\delta_1 + dt = \exp^{*} d\Pi_c$, we obtain the second claimed formula for N(x)/x.
3.5. Three equivalent formulas
The exponential relation is equivalent to two other formulas. By this equivalence, Chebyshev's identity (3.3) follows at once from the representation (3.1).
Proposition 3.15. Let F, G ∈ V and suppose F(1) = 1 and G(1) = 0. Then the following three expressions are equivalent:
\[ dF = \exp^{*} dG, \tag{3.12} \]
\[ L\,dF = dF * L\,dG, \tag{3.13} \]
\[ dG = \sum_{i=1}^{\infty} (-1)^{i-1}\,(dF - \delta_1)^{*i}/i. \tag{3.14} \]
Here L is the log operator and $\delta_1$ Dirac measure at 1. (The last expression is formally similar to Taylor's series for the logarithm.)
Proof. (3.12) ⇒ (3.13): Apply L termwise to the exponential series. This is valid by Lemma 2.14. We then use (2.12) to evaluate $L\,dG^{*n}$ to get
\[ L\,dF = L\Big\{ \delta_1 + dG + \frac{1}{2!}\,dG^{*2} + \frac{1}{3!}\,dG^{*3} + \cdots \Big\} = L\,dG * \Big\{ 0 + \delta_1 + dG + \frac{1}{2!}\,dG^{*2} + \cdots \Big\} = L\,dG * dF. \]
(3.13) ⇒ (3.12): Consider $dH := dF^{*-1} * \exp^{*} dG$. We show $dH = \delta_1$. First,
\[ dH(\{1\}) = dF^{*-1}(\{1\})\,(\exp^{*} dG)(\{1\}) = e^{dG(\{1\})}/dF(\{1\}) = e^{0}/1 = 1 = \delta_1(\{1\}). \]
Now it suffices to show that L dH = 0. We have
\[ L\,dH = (L\,dF^{*-1}) * \exp^{*} dG + dF^{*-1} * L \exp^{*} dG. \]
We showed just above that $L \exp^{*} dG = (\exp^{*} dG) * L\,dG$. Next,
\[ 0 = L\,\delta_1 = L(dF * dF^{*-1}) = (L\,dF) * dF^{*-1} + dF * (L\,dF^{*-1}). \]
Thus
\[ L\,dF^{*-1} = -(L\,dF) * dF^{*-2}, \]
and we have
\[ L\,dH = (\exp^{*} dG) * \big\{ -(L\,dF) * dF^{*-2} + dF^{*-1} * L\,dG \big\} = (\exp^{*} dG) * dF^{*-2} * \big\{ dF * L\,dG - L\,dF \big\} = 0. \]
Thus $dF^{*-1} * \exp^{*} dG = \delta_1$, i.e. (3.12) holds.
(3.13) ⇔ (3.14): Formula (3.14) is equivalent to
\[ L\,dG = L\Big\{ \sum_{i=1}^{\infty} (-1)^{i-1}\,(dF - \delta_1)^{*i}/i \Big\}, \tag{3.15} \]
since both sides of (3.14) are 0 on the set {1}, the only place where L is 0. Now L is a continuous operator, so we can apply it termwise to the last sum. Using the series representation (3.6) for $dF^{*-1}$, we obtain
\[ L\,dG = L\,dF * \sum_{i=1}^{\infty} (-1)^{i-1}\,(dF - \delta_1)^{*(i-1)} = L\,dF * dF^{*-1}. \]
The last formula is equivalent to (3.13).
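For the rational integers, two of these equivalent formulas can be tested on an initial segment. The sketch below sums the logarithmic series (3.14) with dF = dN in exact rational arithmetic and checks that it returns mass 1/k at each prime power p^k and 0 elsewhere, then checks Chebyshev's identity (3.13) in the pointwise form that log n equals the sum over divisors d of n of the mass of dψ = L dΠ at d. The truncation level and the helper names are assumptions made for the illustration.

```python
import math
from fractions import Fraction


def dirichlet_conv(a, b, X):
    """Multiplicative convolution of point masses on 1..X (index 0 unused)."""
    c = [Fraction(0)] * (X + 1)
    for m in range(1, X + 1):
        if a[m] == 0:
            continue
        for n in range(1, X // m + 1):
            c[m * n] += a[m] * b[n]
    return c


def conv_log(f, X):
    """Series (3.14): sum over i >= 1 of (-1)^(i-1) (f - delta_1)^{*i} / i.

    Assumes f[1] == 1, so f - delta_1 has no mass at 1 and its i-fold
    convolution vanishes on 1..X once 2^i > X.
    """
    a = list(f)
    a[1] = Fraction(0)
    result = [Fraction(0)] * (X + 1)
    power, i = list(a), 1
    while any(power):
        sign = 1 if i % 2 == 1 else -1
        result = [r + sign * p / i for r, p in zip(result, power)]
        power = dirichlet_conv(power, a, X)
        i += 1
    return result


def expected_mass(n):
    """1/k if n = p^k for a prime p and k >= 1, else 0."""
    if n < 2:
        return Fraction(0)
    p = next(q for q in range(2, n + 1) if n % q == 0)  # smallest prime factor
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return Fraction(1, k) if n == 1 else Fraction(0)


X = 150
dN = [Fraction(0)] + [Fraction(1)] * X
dPi = conv_log(dN, X)

# d(psi) = L d(Pi): mass (log n)/k at n = p^k, i.e. the von Mangoldt weights.
dpsi = [0.0] + [float(dPi[n]) * math.log(n) for n in range(1, X + 1)]
chebyshev_gap = max(
    abs(math.log(n) - sum(dpsi[d] for d in range(1, n + 1) if n % d == 0))
    for n in range(1, X + 1)
)
print(dPi[2], dPi[4], dPi[6], dPi[8], chebyshev_gap)
```

The first check is exact; the second is carried out in floating point, so a small tolerance is needed when comparing.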
The exponential representation (3.7) (again). This time, we prove the formula using (3.13); specifically, we show that
\[ L(\delta_1 + dt) = (\delta_1 + dt) * (1 - t^{-1})\,dt. \tag{3.16} \]
It suffices to verify that the integrals of the two sides are equal. On the one hand,
\[ \int_{1-}^{x} L(\delta_1 + dt) = \int_{1-}^{x} \log t\,dt = x \log x - x + 1, \]
and on the other, by iterated integration,
\[ \int_{1-}^{x} (\delta_1 + dt) * (1 - t^{-1})\,dt = \int_{1}^{x} \frac{x}{u}\,(1 - u^{-1})\,du = x \log x - x + 1. \]
Thus (3.16) holds, and so $\delta_1 + dt = \exp^{*} d\Pi_c$ by the preceding proposition.
3.6. Notes
Exponentials of arithmetic functions on N are treated in [Di70a] and [BD04], §2.4.
§3.1. The interested reader might use (3.1) or (3.3) to show that $\int_{1}^{x} L\,dN \ge \psi(x)$.
§3.4. Exponentials are presented via Dirichlet series in [Ap76], §11.9.
§3.5. By Proposition 3.15 the exponential mapping from V(0) to V(1) is one-to-one and onto; this ensures that g-prime and g-integer systems determine one another uniquely.
https://doi.org/10.1090//surv/213/04
CHAPTER 4
Upper and Lower Estimates of N(x)
Size matters
Summary. Renormalizing g-integer counting functions having polynomial growth to make linear growth. Growth rate of type $x \log^{\alpha} x$. Upper and lower logarithmic densities. An example of g-primes and associated g-integers having the same growth rate. Regular growth.
4.1. Normalization and restriction
In §1.1 it was convenient to assume that a g-number system $\mathcal{N}$ satisfies the O-density condition (1.2), i.e. that N(x)/x is bounded. Suppose instead that our g-number system satisfies the more general polynomial growth condition
\[ \limsup_{x \to \infty} N(x)/x^{\alpha} < \infty \tag{4.1} \]
for some positive α ≠ 1. While systems of such growth are less usual, they lie in the function classes described in §2.1 and allow analysis by Mellin transform techniques. However, as we show now, a simple mapping takes such a g-number system $\mathcal{N}$ into another one, $\mathcal{N}'$, for which (4.1) holds with α = 1.
Let $\mathcal{P}'$ denote a new set of g-primes created from $\mathcal{P}$ by the mapping $p_i \mapsto p_i^{\alpha} =: p_i'$. The g-integers n′ of the associated g-number system $\mathcal{N}'$ are related to those of $\mathcal{N}$ by $n_i' = n_i^{\alpha}$. The counting function of $\mathcal{N}'$ satisfies
\[ N'(x) = \#\{n_i' \le x\} = \#\{n_i^{\alpha} \le x\} = \#\{n_i \le x^{1/\alpha}\} = N(x^{1/\alpha}) \ll x, \]
i.e. the O-density condition (1.2) holds for $\mathcal{N}'$. Thus, to treat a system whose counting function is of polynomial growth, we may as well assume from the outset that N(x)/x is bounded.
We now give another example of a g-number system $\mathcal{N}$ whose integer counting function has "excessive" size. Recall from classical number theory the divisor function τ(n) and its counting function D(x). The latter satisfies
\[ dD = dN * dN = \exp^{*}(2\,d\Pi). \]
Here N is the counting function of the rational integers and Π that of the weighted primes and prime powers. The g-primes of $\mathcal{N}$ are the rational primes, each with multiplicity 2, and the g-integers are the rational integers, with n having multiplicity τ(n). By Dirichlet's well-known approximation, the counting function of $\mathcal{N}$ has size D(x) ∼ x log x. In this book, we generally treat cases in which the g-prime counting function is near x/log x and the counting function of the g-integers is O(x).
In this chapter, we shall consider upper and lower logarithmic densities, which are cruder measures of the growth of N(·) than density. (In Chapters 6 and 7 we
30
4. UPPER AND LOWER ESTIMATES OF N (x)
shall take up g-number systems satisfying various notions of density.) At the end of the chapter we examine two anomalous cases. 4.2. O-log density We define ld (N ), the upper logarithmic density of N , to be
$$\limsup_{x\to\infty} \frac{1}{\log x}\int_{1-}^{x} u^{-1}\,dN(u).$$
If this quantity is finite, we say that $\mathcal N$ has O-log density. We define $\underline{\mathrm{ld}}(\mathcal N)$, lower logarithmic density, to be the limit inferior of the corresponding expression. We say that $\mathcal N$ has logarithmic density in case the two quantities are finite and equal. Lower log density is discussed in the following section and log density in §5.2.

When a number system satisfies an O-density condition (cf. §§1.2, 6.2), it satisfies an O-log density condition as well, as shown by summation or integration by parts. The converse implication is false, as we illustrate in Example 6.1 below; thus we say that O-log density is a “weaker” condition than O-density.

In the rest of this section we give simple criteria for generalized integers to have O-log density. These involve the behavior of either the associated zeta function at $s = 1+$ or (a weighted version of) the prime counting function. Generally similar relations hold for lower densities.

Lemma 4.1. Suppose that $F \in \mathcal V$ is a nondecreasing function with a Mellin transform
$$\widehat F(s) := \int_{1-}^{\infty} u^{-s}\,dF(u).$$
Then $\widehat F(s) = O(1/s)$ as $s \to 0+$ if and only if $F(x) = O(\log x)$ as $x \to \infty$.

Proof. Suppose first that there is a number $B$ such that $\widehat F(s) \le B/s$ for $0 < s < 1$. Given $x > e$, take $s = 1/\log x \in (0, 1)$. Since $u^{-s} \ge x^{-s} = 1/e$ for $1 \le u \le x$, we find
$$(1/e)F(x) \le \int_{1-}^{x} u^{-s}\,dF(u) \le \int_{1-}^{\infty} u^{-s}\,dF(u) \le B \log x.$$
In the other direction, suppose that
$$F(x) \le C + C\log x, \qquad x \ge 1.$$
For $s > 0$, we have by Lemma 2.2 (comparison of integrators),
$$\widehat F(s) = \int_{1-}^{\infty} u^{-s}\,dF(u) \le C\int_{1-}^{\infty} u^{-s}\,(\delta_1 + du/u) = C + C/s,$$
i.e. $\widehat F(s) \ll 1/s$, $0 < s < 1$.
Taking $\widehat F(s) = \zeta_{\mathcal N}(s + 1)$ in the preceding lemma, we obtain

Proposition 4.2. A g-number system $\mathcal N$ has O-log density if and only if
$$\limsup_{s\to 1+}\,(s - 1)\,\zeta_{\mathcal N}(s) < \infty. \tag{4.2}$$
We call the value of the left side of (4.2) the upper right hand residue of $\zeta_{\mathcal N}(s)$ at $s = 1$, or more briefly, the upper residue of $\zeta_{\mathcal N}(s)$ there. Adapting the proof of Lemma 4.1, we obtain
Corollary 4.3. A g-number system $\mathcal N$ has log density 0 if and only if $\zeta_{\mathcal N}(s)$ has upper residue 0 at $s = 1$.

What role do small primes play in determining whether a g-number system has O-log density? Not surprisingly, the answer is None, as we show next. Thus we can alter the initial part of a g-prime distribution in checking whether the associated g-integers have O-log density.

We treat differently a g-number system generated by discrete primes from one arising from a general increasing function $\Pi(x)$. In the first case, say $\mathcal P$ is a discrete g-prime system with counting function $\pi(x)$ and a number $B > 1$ is given. Partition $\mathcal P$ into $\mathcal P_1$, the g-primes not exceeding $B$, and $\mathcal P_2$, the remaining g-primes. For $i = 1, 2$, let $\pi_i$ denote the counting function of $\mathcal P_i$, and $\Pi_i$ the weighted g-prime and g-prime power counting function of $\mathcal P_i$. Of course, powers of a prime $p \in \mathcal P_1$ can exceed $B$ and will be counted in $\Pi_1$. In the second case, write $d\Pi = d\Pi_1 + d\Pi_2$ with $d\Pi_1(u) = \chi_{(1, B]}(u)\,d\Pi(u)$ and $d\Pi_2(u) = \chi_{(B, \infty)}(u)\,d\Pi(u)$, with $\chi_E$ the indicator function of a set $E$. In both cases let $N_i$ denote the counting function of $\mathcal N_i$. We have
$$dN = \exp^*\{d\Pi_1 + d\Pi_2\} = dN_1 * dN_2 \quad\text{and}\quad \zeta(s) = \zeta_1(s)\,\zeta_2(s),$$
where $\zeta_i$ is the zeta function associated with $\mathcal N_i$.

Proposition 4.4. Let $\mathcal N$ be a g-number system generated by $d\Pi = d\Pi_1 + d\Pi_2$ as above. Then $\mathcal N$ has O-log density if and only if $\mathcal N_2$ has the same.

Proof. It is obvious that if $\mathcal N$ has O-log density, so does $\mathcal N_2$. Now suppose that $\mathcal N_2$ has O-log density. By Proposition 4.2, $(s - 1)\zeta_{\mathcal N_2}(s)$ is bounded as $s \to 1+$. We now show that $\zeta_1(1)$ is finite. For $\mathcal P$ discrete with smallest prime $q > 1$,
$$\log \zeta_1(1) = \int_1^{\infty} u^{-1}\,d\Pi_1(u) = \int_1^{B}\Big(u^{-1} + \frac12 u^{-2} + \frac13 u^{-3} + \dots\Big)\,d\pi(u)$$
$$\le \int_1^{B} u^{-1}\Big(1 + \frac1q + \frac1{q^2} + \dots\Big)\,d\pi(u) \le \frac1q\cdot\frac{q}{q-1}\,\pi(B) < \infty,$$
and for a general increasing function $\Pi$,
$$\log \zeta_1(1) = \int_1^{\infty} u^{-1}\,d\Pi_1(u) = \int_1^{B} u^{-1}\,d\Pi(u) \le \Pi(B) < \infty.$$
For $s \in (1, 2)$, we have $\zeta(s) \le \zeta_1(1)\,\zeta_2(s) \ll 1/(s - 1)$. Thus $\mathcal N$ has O-log density, by another application of Proposition 4.2.

The reader is encouraged to give an elementary proof of this proposition.

Proposition 4.5. Let $\mathcal N$ be a g-number system for which
$$\int_1^{x} \frac{d\Pi(u)}{u} = \log\log ex + E(x), \qquad x \ge 1,$$
holds for some function $E(x)$. If $E(x)$ is bounded above, then $\mathcal N$ has O-log density. On the other hand, if $E(x)$ is unbounded above but bounded from below, then $\mathcal N$ does not have O-log density.
Proof. Since
$$\int_1^{x} \frac{1 - u^{-1}}{u\log u}\,du - \log\log ex$$
is bounded, the formula of the proposition can be changed to
$$\int_1^{x} \frac{d\Pi(u)}{u} = \int_1^{x} \frac{1 - u^{-1}}{u\log u}\,du + E_1(x), \qquad x \ge 1, \tag{4.3}$$
where $E$ and $E_1$ differ by at most a bounded quantity.

Suppose $E_1(x)$ is bounded above by some positive number $K$. Then (4.3) and Lemma 2.2 give the Mellin inequality
$$\int_1^{\infty} u^{-s}\,d\Pi(u) < \int_1^{\infty} u^{-(s-1)}\,\frac{1 - u^{-1}}{u\log u}\,du + K.$$
Recalling formulas (1.5) and (1.8), we find
$$\log \zeta(s) < \log\frac{s}{s-1} + K, \qquad s > 1.$$
Thus $\limsup_{s\to1+}(s-1)\zeta(s)$ is finite, and so $\mathcal N$ has O-log density by Proposition 4.2.

Next, suppose (4.3) holds with $E_1(x) \ge -M$ for all $x \ge 1$ and $E_1(x)$ is unbounded above. If the zeta function of $\mathcal N$ has abscissa of convergence $\sigma_c > 1$, then $\mathcal N$ does not have O-log density and we are done; thus it suffices to assume that $\sigma_c = 1$. Since $d\Pi \ge 0$, we have $\log \zeta(s) < \zeta(s)$ for $s > 1$, and hence the Mellin integral for $\log \zeta(s)$ also is convergent there. Thus,
$$\int_1^{\infty} u^{-(s-1)}\,dE_1(u) = \int_1^{\infty} u^{-s}\,d\Pi(u) - \int_1^{\infty} u^{-s}\,\frac{1 - u^{-1}}{\log u}\,du$$
holds for $s > 1$. Integrating by parts and writing $s = 1 + \varepsilon$, we then obtain
$$I(\varepsilon) := \varepsilon\int_1^{\infty} u^{-1-\varepsilon}E_1(u)\,du = \log\{\varepsilon\,\zeta(1+\varepsilon)/(1+\varepsilon)\}, \qquad \varepsilon > 0. \tag{4.4}$$
Given a large positive number $L$, we have $E_1(u_0) \ge L$ for some (large!) number $u_0$. Then $E_1(u)$ remains large for a rather long interval beyond $u_0$: we have
$$E_1(u) = E_1(u_0) + \int_{u_0}^{u}\Big\{\frac{d\Pi(t)}{t} - \frac{1 - t^{-1}}{t\log t}\,dt\Big\} > L - \int_{u_0}^{u}\frac{dt}{t\log t} = L - \log\frac{\log u}{\log u_0} \ge \frac{L}{2}$$
for $\log u_0 \le \log u \le (\log u_0)\exp(L/2) =: \log u_1$. It follows that, for a suitable small $\varepsilon$, $I(\varepsilon)$ is large: indeed,
$$I(\varepsilon) > -M\varepsilon\int_1^{\infty} u^{-1-\varepsilon}\,du + \frac{L}{2}\,\varepsilon\int_{u_0}^{u_1} u^{-1-\varepsilon}\,du = -M + \frac{L}{2}\,\big(u_0^{-\varepsilon} - u_1^{-\varepsilon}\big).$$
Now choose $\varepsilon = (\log 2)/(\log u_0)$, so that $u_0^{-\varepsilon} = 1/2$. A small calculation shows that $u_1^{-\varepsilon} < 1/10$ for any $L \ge 3$, say. Thus $I(\varepsilon) > -M + L/5$. Since $M$ is fixed and $L$ can be arbitrarily large (and hence $\varepsilon$ small), we have
$$\limsup_{\varepsilon\to0+}\log\{\varepsilon\,\zeta(1+\varepsilon)\} = \limsup_{\varepsilon\to0+} I(\varepsilon) = \infty.$$
By another application of Proposition 4.2, $\mathcal N$ does not have O-log density.
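The “small calculation” at the end of this proof can be checked numerically. With $\varepsilon = (\log 2)/(\log u_0)$ and $\log u_1 = (\log u_0)\exp(L/2)$, one has $u_1^{-\varepsilon} = 2^{-\exp(L/2)}$, independent of $u_0$. The following Python sketch (the function names are ours, chosen only for illustration) evaluates this quantity and the resulting lower bound $(L/2)(u_0^{-\varepsilon} - u_1^{-\varepsilon})$.

```python
import math

def tail_power(L: float) -> float:
    """Return u_1^(-eps) for eps = log 2 / log u_0 and log u_1 = (log u_0) * exp(L/2).

    The value equals exp(-log 2 * exp(L/2)) and does not depend on u_0.
    """
    return math.exp(-math.log(2.0) * math.exp(L / 2.0))

def gain(L: float) -> float:
    """Return (L/2) * (u_0^(-eps) - u_1^(-eps)), where u_0^(-eps) = 1/2."""
    return (L / 2.0) * (0.5 - tail_power(L))

# Compare the gain with the bound L/5 used in the proof.
for L in (3.0, 5.0, 10.0, 20.0):
    print(L, tail_power(L), gain(L), L / 5.0)
```

Since $u_1^{-\varepsilon}$ decreases as $L$ grows, checking $L = 3$ suffices for the claim $u_1^{-\varepsilon} < 1/10$.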
Example 4.6. PNT $\not\Rightarrow$ O-log density. Suppose that the prime counting function of a g-number system $\mathcal N$ satisfies
$$\Pi(x) - \int_1^{x}\frac{1 - v^{-1}}{\log v}\,dv = \frac{x + o(x)}{\log ex\,\log\log 10x}, \qquad x \ge 1. \tag{4.5}$$
The PNT clearly holds here. However, with the notation of the last proposition,
$$E_1(x) \sim \int_1^{x}\frac{dt}{t\,\log et\,\log\log 10t},$$
which diverges to infinity as $x \to \infty$. By the last proposition, $\mathcal N$ does not have O-log density.

4.3. Lower log density

Recall that we defined the lower logarithmic density of $\mathcal N$ by
$$\underline{\mathrm{ld}}(\mathcal N) = \liminf_{x\to\infty}\frac{1}{\log x}\int_{1-}^{x} u^{-1}\,dN(u).$$
Here we establish some conditions for $\underline{\mathrm{ld}}(\mathcal N)$ to be positive.

Theorem 4.7. If a g-number system $\mathcal N$ has positive lower logarithmic density, then
$$\int_1^{x} u^{-1}\,d\Pi(u) \ge \log\log ex - M \tag{4.6}$$
holds for some positive number $M$ and all $x \ge 1$. Conversely, if (4.6) holds, then $\mathcal N$ has either positive or infinite lower logarithmic density.

Proof. Suppose first that $\underline{\mathrm{ld}}(\mathcal N) > 0$. By Lemmas 3.9 and 2.15, we have
$$\exp\Big\{\int_{1-}^{x} T^{-1}\,d\Pi\Big\} \ge \int_{1-}^{x}\exp^*\{T^{-1}\,d\Pi\}(u) = \int_{1-}^{x} T^{-1}\exp^*\{d\Pi\}(u).$$
By (3.1) and the hypothesis, the last integral is
$$\int_{1-}^{x} T^{-1}\,dN = \int_{1-}^{x} u^{-1}\,dN(u) \ge c\log x$$
for all $x \ge 1$ and some positive number $c$. This establishes (4.6).

Next, assume that (4.6) holds. We have
$$\int_1^{x} u^{-1}\,d\Pi(u) + K \ge \int_1^{x}\frac{1 - u^{-1}}{u\log u}\,du$$
for some positive $K$, by the boundedness of
$$\int_1^{x}\frac{1 - u^{-1}}{u\log u}\,du - \log\log ex.$$
Defining $d\Pi_+ := d\Pi + K\delta_1$, where $\delta_1$ is Dirac measure at 1, we obtain the equivalent inequality
$$\int_{1-}^{x} T^{-1}\,d\Pi_+ = \int_{1-}^{x} u^{-1}\,d\Pi_+(u) \ge \int_{1-}^{x} u^{-1}\,\frac{1 - u^{-1}}{\log u}\,du = \int_{1-}^{x} T^{-1}\,d\Pi_c.$$
Using Lemmas 2.9 and 2.15, we find for each nonnegative integer $n$
$$\int_{1-}^{x} T^{-1}\,d\Pi_+^{*n} \ge \int_{1-}^{x} T^{-1}\,d\Pi_c^{*n},$$
and summing terms of the exponential series, we obtain
$$\int_{1-}^{x} T^{-1}\exp^* d\Pi_+ \ge \int_{1-}^{x} T^{-1}\exp^* d\Pi_c. \tag{4.7}$$
Now
$$\exp^* d\Pi_+ = \exp^*\{d\Pi + K\delta_1\} = (\exp^* d\Pi) * (\exp^*\{K\delta_1\}) = e^{K}\,dN$$
and $\exp^* d\Pi_c = \delta_1 + dt$, by the homomorphic property of exp and the identifications of $\exp^* d\Pi$ and $\exp^* d\Pi_c$ provided by (3.1) and (3.7) respectively. Thus (4.7) translates to
$$e^{K}\int_{1-}^{x} T^{-1}\,dN \ge \int_{1-}^{x} T^{-1}(\delta_1 + dt) = 1 + \log x,$$
so the lower logarithmic density of $\mathcal N$ is either positive or infinite.
Proposition 4.8. Suppose that a g-number system $\mathcal N$ has positive lower logarithmic density. Then
$$\liminf_{s\to1+}\,(s - 1)\zeta_{\mathcal N}(s) > 0. \tag{4.8}$$
We call the value of the left side of (4.8) the lower residue of $\zeta_{\mathcal N}(s)$ at $s = 1$. The proposition is proved by integration by parts, as was done for the corresponding part of Proposition 4.2.

Does the converse of Proposition 4.8 hold? The situation here is different from Proposition 4.2: without further conditions, the answer is No. In the next section we shall give an example in which $\underline{\mathrm{ld}}(\mathcal N) = 0$ but the zeta function has an infinite residue at $s = 1+$. However, if the zeta function also has a finite upper residue at 1, then, as we show in Corollary 4.10, $\underline{\mathrm{ld}}(\mathcal N) > 0$.

In the remainder of this section, we give some sufficient conditions for positive lower logarithmic density.

Proposition 4.9. Suppose that $\mathcal N$ satisfies (4.8) and the bound
$$\zeta_{\mathcal N}(1 + \varepsilon)/\zeta_{\mathcal N}(1 + 2\varepsilon) \ll 1 \tag{4.9}$$
for $0 < \varepsilon < 1$. Then $\mathcal N$ has positive lower logarithmic density.

Proof. By hypothesis, there exist numbers $m$, $B$, and $\varepsilon_0$ such that
$$\varepsilon\,\zeta(1 + \varepsilon) \ge m > 0 \quad\text{and}\quad \frac{2\zeta(1 + \varepsilon)}{\zeta(1 + 2\varepsilon)} \le B$$
for $0 < \varepsilon < \varepsilon_0$. Note that $B \ge 2$, since $\zeta(s)$ decreases on the half-line $\{s : s > 1\}$. Let $\varepsilon \in (0, \varepsilon_0]$, to be specified. We have
$$B\zeta(1 + 2\varepsilon) - \zeta(1 + \varepsilon) = \int_{1-}^{\infty} u^{-1-\varepsilon}\,(Bu^{-\varepsilon} - 1)\,dN(u).$$
The last integrand is positive for $1 < u < B^{1/\varepsilon} =: U$, and it is negative for all $u > U$. It follows that
$$(2 - 1)\zeta(1 + \varepsilon) < \int_{1-}^{U}(Bu^{-\varepsilon} - 1)\,u^{-1-\varepsilon}\,dN(u) < \int_{1-}^{U} Bu^{-1}\,dN(u)$$
(the last inequality is immediate) or
$$\frac{m}{\varepsilon} \le \zeta(1 + \varepsilon) < B\int_{1-}^{U} u^{-1}\,dN(u).$$
For $x$ sufficiently large, take $\varepsilon = \log B/\log x$. Then $U = x$ and we obtain finally
$$\int_{1-}^{x} u^{-1}\,dN(u) > \frac{m}{B\log B}\,\log x \gg \log x.$$
An immediate consequence of this proposition is

Corollary 4.10. If $\varepsilon\,\zeta_{\mathcal N}(1 + \varepsilon)$ has positive finite bounds from above and below as $\varepsilon \to 0+$, then $\mathcal N$ has positive lower logarithmic density.

The interested reader is encouraged to give a direct proof of the corollary.

Could we have proved the result of Proposition 4.9 without assuming condition (4.8)? By Proposition 4.8, this condition is necessary in order for $\mathcal N$ to have positive lower logarithmic density. Thus, our question is, Does (4.9) imply (4.8)? The answer is No, as we now show.

Example 4.11. Condition (4.9) $\not\Rightarrow$ Condition (4.8). Define the g-integer counting function $F(x)$ of a (continuous) system by taking $dF$ to be the convolution square root of $\delta_1 + dt$ (see Example 3.13). Note that $dF \ge 0$, since it is the exponential of a nonnegative measure. By Lemma 2.6, we have
$$\widehat F(s) := \int_{1-}^{\infty} x^{-s}\,dF(x) = \Big(\frac{s}{s-1}\Big)^{1/2}, \qquad s > 1,$$
and hence $\lim_{s\to1+}(s - 1)\widehat F(s) = 0$; i.e. (4.8) is not satisfied. On the other hand, (4.9) holds, as
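$$\frac{\widehat F(1 + \varepsilon)}{\widehat F(1 + 2\varepsilon)} \sim \Big(\frac{2\varepsilon}{\varepsilon}\Big)^{1/2} = \sqrt2 \qquad\text{as } \varepsilon \to 0+.$$
Another version of Proposition 4.9 can be deduced from the following

Proposition 4.12. Formula (4.9) holds for $\mathcal N$ if and only if
$$\frac{-\zeta_{\mathcal N}'}{\zeta_{\mathcal N}}(1 + \varepsilon) \ll \frac{1}{\varepsilon}, \qquad 0 < \varepsilon < 1. \tag{4.10}$$
Proof. By the mean value theorem,
$$\log\zeta(1 + \varepsilon) - \log\zeta(1 + 2\varepsilon) = \varepsilon\,\frac{-\zeta'}{\zeta}(1 + c\varepsilon)$$
for some $c \in (1, 2)$. Also, by (1.6),
$$\frac{-\zeta'}{\zeta}(s) = \int_1^{\infty} u^{-s}\,d\psi(u),$$
where $\psi\uparrow$, so $(-\zeta'/\zeta)(s)$ is a decreasing function of $s$. These relations yield
$$\varepsilon\,\frac{-\zeta'}{\zeta}(1 + 2\varepsilon) \le \log\{\zeta(1 + \varepsilon)/\zeta(1 + 2\varepsilon)\} \le \varepsilon\,\frac{-\zeta'}{\zeta}(1 + \varepsilon).$$
The right hand inequality and (4.10) imply that $\zeta(1 + \varepsilon)/\zeta(1 + 2\varepsilon) \ll 1$. Conversely, the left hand inequality and (4.9) give an upper bound for $2\varepsilon\,(-\zeta'/\zeta)(1 + 2\varepsilon)$.

Corollary 4.13. Suppose that (4.8) and (4.10) both hold. Then $\mathcal N$ has positive lower logarithmic density.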
4.4. An example with infinite residue but 0 lower log density

Example 4.14. Define a very lacunary g-prime system $\mathcal P$ by setting $p_1 = e$ and for $m \ge 2$, setting $p_m = \exp\{m\exp p_{m-1}\}$ with multiplicity $b_m = \lfloor p_m^{1+1/m}\rfloor$. Let $\mathcal N = \{n_i\}$ denote the associated g-integers with $n_0 = 1$ and, as usual, $\zeta(s)$ the g-zeta function. We show first that
$$\varepsilon\,\zeta(1 + \varepsilon) \to \infty, \qquad \varepsilon \to 0+, \tag{4.11}$$
and then that
$$\liminf_{x\to\infty}\frac{1}{\log x}\sum_{n_i \le x}\frac{1}{n_i} = 0. \tag{4.12}$$
To begin, we show that $\zeta(s)$ converges for $s > 1$. Without loss of generality, we can assume that $s \in (1, 2)$. Partition $\mathcal P$ into
$$\mathcal P_1 := \{p_m \text{ with multiplicity } b_m,\ m < 2/(s - 1)\}, \qquad \mathcal P_2 := \{p_m \text{ with multiplicity } b_m,\ m \ge 2/(s - 1)\}.$$
Let $\pi_1(x)$ and $\pi_2(x)$ denote the g-prime counting functions and $\Pi_1(x)$ and $\Pi_2(x)$ the weighted g-prime and g-prime power counting functions of $\mathcal P_1$ and $\mathcal P_2$ respectively. Then we have
$$\exp\{d\Pi\} = \exp\{d\Pi_1 + d\Pi_2\} = \exp\{d\Pi_1\} * \exp\{d\Pi_2\}.$$
It follows that
$$\zeta(s) = \int_{1-}^{\infty} x^{-s}\,(\exp\{d\Pi_1\} * \exp\{d\Pi_2\})(x) = \zeta_1(s)\cdot\zeta_2(s),$$
where
$$\zeta_\nu(s) := \int_{1-}^{\infty} x^{-s}\exp\{d\Pi_\nu\}(x), \qquad \nu = 1, 2.$$
Now, we have
ζ1 (s) =
(1 − pm −s )−bm ,
m 1, and hence so does that of ζ(s).
Next, we use the high multiplicity of integers to show that the right hand residue of $\zeta(s)$ at $s = 1$ is infinite. For $s > 1$, choose $m$ such that $1/(2m + 2) < s - 1 \le 1/(2m)$. Using a single term of the series for $\log\zeta_2(s)$, we have
$$(s - 1)\zeta(s) > (s - 1)\log\zeta_2(s) \ge (s - 1)\,p_m^{-s}\,\lfloor p_m^{1+1/m}\rfloor \ge (s - 1)\,p_m^{-s+1+1/m}\,(1 - p_m^{-1-1/m})$$
$$> \frac{1}{2m + 2}\cdot\frac12\,p_m^{1/(2m)} = \frac{1}{4m + 4}\exp\{(1/2)\exp p_{m-1}\},$$
which goes to $\infty$ as $s \to 1+$, proving (4.11).

Finally, we show $\underline{\mathrm{ld}}(\mathcal N) = 0$. For $\mu$ a positive integer, take $x = p_{\mu+1} - 1$. If a g-integer $n_i$ satisfies $n_i \le x$, then $n_i$ has only g-prime divisors $p_m$ with $m \le \mu$. Hence we have
$$\int_{1-}^{x} t^{-1}\,dN(t) \le \prod_{m \le \mu}\big(1 - p_m^{-1}\big)^{-b_m},$$
and so
$$\log\int_{1-}^{x} t^{-1}\,dN(t) \le \sum_{m \le \mu}\lfloor p_m^{1+1/m}\rfloor\sum_{k=1}^{\infty}\frac1k\,p_m^{-k} \le 2\sum_{m \le \mu} p_m^{1+1/m}\,p_m^{-1} \le 2\mu\,p_\mu^{1/\mu}.$$
Now
$$\log x \ge \frac12\log p_{\mu+1} = \frac{\mu + 1}{2}\exp p_\mu,$$
and hence
$$\int_{1-}^{x} t^{-1}\,dN(t) \le \frac{2}{\mu + 1}\exp\{-p_\mu + 2\mu\,p_\mu^{1/\mu}\}\,\log x = o(\log x)$$
for $x = p_{\mu+1} - 1 \to \infty$. This establishes (4.12).

4.5. Extreme thinness is inherited

The rational primes are distributed “thinly” in the rational integers in the sense that $\pi_{\mathbb N}(x) = o(N_{\mathbb N}(x))$. In this section we exhibit a condition under which the counting function of a system of g-integers actually has the same order of growth as that of the associated g-primes.

Proposition 4.15. Suppose that the counting function of a g-prime system $\mathcal P$ satisfies $\pi(x) = \pi_{\mathcal P}(x) \ll x/\log^{\lambda} x$, for a constant $\lambda > 1$. Then the g-integer counting function of $\mathcal N_{\mathcal P}$ also satisfies $N(x) \ll x/\log^{\lambda} x$.

Proof. Deleting the first few g-primes from $\mathcal P$ changes $N(x)$ asymptotically only by a fixed factor. Take $\alpha = \alpha(\lambda)$ so large that $\alpha > e^{\lambda}$ and $(\log\alpha)^{\lambda-1} > 2/(\lambda - 1)$. The function $x \mapsto x/\log^{\lambda} x$ is increasing for $x > e^{\lambda}$, so it is convenient to assume, without loss of generality, that $\pi(x) = 0$ for $x \le \alpha$.
By Lemma 1.1, the weighted g-prime and g-prime power counting function $\Pi(x)$ satisfies the same O-bound as does $\pi(x)$. Let $B > 0$ be a constant such that
$$\Pi(x) \le Bx/\log^{\lambda} x, \qquad x > 1. \tag{4.13}$$
Of course, $\Pi(x) = 0$ for $x \le \alpha$. We show by induction that, for all $k \in \mathbb N$ and $x > \alpha$,
$$\Pi_k(x) := \int_1^{x} d\Pi^{*k} \le 2^{\lambda(k-1)}B^{k}\,x/\log^{\lambda} x. \tag{4.14}$$
Indeed, (4.14) holds for $k = 1$ by (4.13). Next, by Dirichlet’s hyperbola method,
$$\Pi_{k+1}(x) \le \int_1^{\sqrt x}\Pi_k(x/t)\,d\Pi(t) + \int_1^{\sqrt x}\Pi(x/t)\,d\Pi_k(t) =: I_1 + I_2$$
say, since the contribution of the overlapped square $[1, \sqrt x] \times [1, \sqrt x]$ is nonnegative. We have
$$I_2 \le B\int_1^{\sqrt x}\frac{x/t}{\log^{\lambda}(x/t)}\,d\Pi_k(t) \le \frac{2^{\lambda}B\,x}{\log^{\lambda} x}\int_1^{\sqrt x} t^{-1}\,d\Pi_k(t).$$
Let $c_k$ denote the coefficient on the right side of (4.14). Extend the range of the last integral to infinity and apply integration by parts. We find
$$\int_1^{\infty} t^{-1}\,d\Pi_k(t) = \int_1^{\infty} t^{-2}\,\Pi_k(t)\,dt \le c_k\int_\alpha^{\infty}\frac{dt}{t\log^{\lambda} t} = \frac{c_k/(\lambda - 1)}{(\log\alpha)^{\lambda-1}} < \frac{c_k}{2},$$
by the definition of $\alpha$. Similarly,
$$I_1 \le c_k\int_1^{\sqrt x}\frac{x/t}{\log^{\lambda}(x/t)}\,d\Pi(t) \le \frac{2^{\lambda}c_k\,x}{\log^{\lambda} x}\int_1^{\sqrt x} t^{-1}\,d\Pi(t),$$
and using the bound for $\int_1^{\infty} t^{-1}\,d\Pi_k(t)$ with $k = 1$, we find that the last integral is at most $B/2$. Combining the bounds for $I_1$ and $I_2$, we find
$$\Pi_{k+1}(x) \le \frac{2^{\lambda}B\,c_k\,x}{\log^{\lambda} x} = \frac{c_{k+1}\,x}{\log^{\lambda} x},$$
proving (4.14).

Now we insert (4.14) into the exponential formula for $dN$. We get
$$N(x) = \int_{1-}^{x}\Big\{\delta_1 + d\Pi + \frac{1}{2!}\,d\Pi * d\Pi + \dots\Big\} < 1 + 2^{-\lambda}\sum_{k=1}^{\infty}\frac{1}{k!}\,(2^{\lambda}B)^{k}\,\frac{x}{\log^{\lambda} x} \ll \frac{x}{\log^{\lambda} x},$$
since the infinite series is convergent. Thus the claimed O-bound for $N(x)$ holds.
Example 4.16. A thin set of numbers. Let $\mathcal P$ be a g-prime system with
$$p_n := n\,(\log(2 + n))^{2}, \qquad n = 1, 2, \dots.$$
A small calculation shows that $\pi_{\mathcal P}(x) := \#\{n : p_n \le x\} \sim x/\log^{2} x$. By the last proposition, we have also $N_{\mathcal P}(x) \ll x/\log^{2} x$.
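The prime counting function of Example 4.16 is easy to tabulate. The Python sketch below (the names `g_prime` and `pi_P` are ours, used only for illustration) counts the indices $n$ with $p_n \le x$ by bisection, using the fact that $p_n$ is increasing in $n$, and prints the quotient $\pi_{\mathcal P}(x)\log^2 x/x$.

```python
import math

def g_prime(n: int) -> float:
    """The n-th g-prime of Example 4.16: p_n = n * (log(2 + n))^2."""
    return n * math.log(2 + n) ** 2

def pi_P(x: float) -> int:
    """Count the n >= 1 with p_n <= x, by doubling and bisection (p_n is increasing)."""
    if x < g_prime(1):
        return 0
    lo, hi = 1, 2
    while g_prime(hi) <= x:
        lo, hi = hi, 2 * hi
    # Invariant: g_prime(lo) <= x < g_prime(hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if g_prime(mid) <= x:
            lo = mid
        else:
            hi = mid
    return lo

def ratio(x: float) -> float:
    """pi_P(x) divided by x / log^2 x."""
    return pi_P(x) * math.log(x) ** 2 / x

for exponent in (4, 6, 9, 12):
    x = 10.0 ** exponent
    print(exponent, pi_P(x), ratio(x))
```

For moderate $x$ the quotient is still well above 1; it decreases only slowly, as one expects from the secondary $\log\log x$ terms hidden in the asymptotic relation.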
4.6. Regular growth

We say that a positive nondecreasing function $f$ on $[1, \infty)$ has regular growth if there exists a nonnegative exponent $\alpha$ such that, for all $c > 0$, $f(cx)/f(x) \sim c^{\alpha}$ as $x \to \infty$. One can show ([BD69], §4) that if $f$ has this property, then for any $\varepsilon > 0$,
$$c_\varepsilon\,x^{\alpha-\varepsilon} \le f(x) \le C_\varepsilon\,x^{\alpha+\varepsilon}$$
holds with suitable constants $c_\varepsilon$ and $C_\varepsilon$ for all $x \ge 1$. In particular, if the g-integer counting function of a system $\mathcal N$ is of regular growth with exponent $\alpha \ne 1$, then the abscissa of convergence of the corresponding zeta function is not 1. As noted in §4.1, systems we consider are normalized to make $\alpha = 1$.

It is immediate from the definitions of regular growth and density that positive density $\Rightarrow$ regular growth. However, regular growth does not have connections with other of the properties we treat, so it is not going to play an important role in this work.

Example 4.17. Regular growth $\not\Rightarrow$ logarithmic density (and a fortiori regular growth does not imply density). Let $N(x) = \lfloor x\rfloor$ and take $dN_2 := dN * dN$. By the Dirichlet divisor estimate, $N_2(x) \sim x\log x$. Easy calculations show that $N_2(cx)/N_2(x) \sim c$, while
$$\int_{1-}^{x}\frac{dN_2(u)}{u} = \frac{N_2(x)}{x} + \int_1^{x}\frac{N_2(u)}{u^{2}}\,du \sim \frac12\log^{2} x.$$
Thus, $N_2$ has regular growth but not logarithmic density.

Example 4.18. Logarithmic density $\not\Rightarrow$ regular growth. Set
$$\Pi(x) := \sum_{k \le \log x/(\log 2)} 2^{k}/k, \qquad x \ge 1.$$
We have
$$\log\zeta(s) = \sum_{k=1}^{\infty} 2^{k(1-s)}/k = \log\frac{1}{1 - 2^{1-s}}, \qquad s > 1,$$
so
$$\zeta(s) = \big(1 - 2^{1-s}\big)^{-1} = 1 + 2\cdot2^{-s} + 4\cdot4^{-s} + \dots,$$
and hence (by the identity theorem for Dirichlet series)
$$N(x) = \sum_{0 \le \nu \le \log x/(\log 2)} 2^{\nu}.$$
This g-number system has logarithmic density, for
$$\int_{1-}^{x} u^{-1}\,dN(u) = 1 + \sum_{2^{n} \le x} 1 = 1 + \Big\lfloor\frac{\log x}{\log 2}\Big\rfloor \sim \frac{\log x}{\log 2}.$$
On the other hand, this system is not of regular growth, for as $k \to \infty$,
$$\frac{N(\tfrac23\,2^{k})}{N(2^{k})} = \frac{2^{k} - 1}{2^{k+1} - 1} \sim \frac12 \ne \frac23.$$
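The two computations of Example 4.18 can be reproduced exactly with integer arithmetic. In the Python sketch below (function names are ours), `N` evaluates the counting function $\sum_{0\le\nu\le\log x/\log 2} 2^{\nu}$ and `log_mass` evaluates $\int_{1-}^{x}u^{-1}\,dN(u)$, to which each atom $2^{\nu}$ of $dN$ contributes exactly 1.

```python
from fractions import Fraction
import math

def top_exponent(x) -> int:
    """Largest K >= 0 with 2^K <= x (requires x >= 1)."""
    K = 0
    while 2 ** (K + 1) <= x:
        K += 1
    return K

def N(x) -> int:
    """Counting function of Example 4.18: sum of 2^nu over 2^nu <= x."""
    if x < 1:
        return 0
    return 2 ** (top_exponent(x) + 1) - 1

def log_mass(x) -> int:
    """Integral of u^{-1} dN(u) over [1, x]: one unit per power of 2 not exceeding x."""
    if x < 1:
        return 0
    return top_exponent(x) + 1

for k in (5, 10, 20, 40):
    growth = Fraction(N(Fraction(2, 3) * 2 ** k), N(2 ** k))
    density = log_mass(2 ** k) / (k * math.log(2))
    print(k, float(growth), density)
```

The first printed quotient approaches $1/2$ rather than $2/3$, while the second approaches $1/\log 2$, in agreement with the example.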
4.7. Notes

§4.1. Beurling [Be37] considered also number systems having an integer counting function satisfying $N(x) \sim c\,x\log^{k} x$. We do not treat this extension here.

§4.2. Here and below we discuss upper, lower, and right hand residues of a generating function $\widehat F(s)$ at a point, but the term residue is reserved for an analytic function that is meromorphically continuable at the point.

§4.5. It is likely that the integer counting function of Example 4.16 has regular growth, but we have not shown this.

§4.6. The notion of regular growth is closely connected with Karamata’s theory of regular variation. These topics are discussed in detail in the encyclopedic works of [BGT87] and [Ko04].
https://doi.org/10.1090//surv/213/05
CHAPTER 5
Mertens’ Formulas and Logarithmic Density
Euler’s number is really constant
Summary. Connection of Mertens’ sum and product formulas with logarithmic density and the right hand residue of the zeta function at 1. Theorems of A. Tauber and of Hardy-Littlewood-Karamata.
5.1. Introduction

In this chapter we examine for g-numbers
• O- and asymptotic relations for the Mertens product $\prod_{p \le x}(1 - 1/p)^{-1}$
• O- and asymptotic relations for the Mertens sum $\psi_1(x) := \int_1^{x} d\psi(u)/u$
• O-log density and logarithmic density of the associated g-integers.
We shall show the intimate connections of these notions.

We shall assume throughout this chapter that the g-primes are discrete, so that the Mertens product has a clear interpretation. Also, we assume that the counting function of g-integers satisfies $N(x) \ll x$, so $\zeta(s)$ has abscissa of convergence 1.

5.2. Logarithmic density

Recall that we say $\mathcal N$ has logarithmic density if
$$\lim_{x\to\infty}\frac{1}{\log x}\int_{1-}^{x}\frac1u\,dN(u)$$
exists. Let $\zeta(s)$ be the zeta function of a g-number system $\mathcal N$, and suppose that $\lim_{s\to1+}(s - 1)\zeta(s)$ exists and equals $\alpha$. We say in this case that zeta has a right hand residue $\alpha$ at $s = 1$. We begin by showing the equivalence of these properties.

Proposition 5.1. $\mathcal N$ has logarithmic density if and only if $\zeta(s)$ has a right hand residue at $s = 1$. In case these quantities exist, they are equal.

Proof. Assume $\mathcal N$ has logarithmic density $\alpha$. Then, for large $x$,
$$\ell(x) := \int_{1-}^{x} u^{-1}\,dN(u) \sim \alpha\log x.$$
For $s > 1$, integration by parts yields
$$\zeta(s) = \int_{1-}^{\infty} u^{-(s-1)}\,d\ell(u) = (s - 1)\int_1^{\infty} u^{-s}\,\ell(u)\,du = (s - 1)\int_1^{\infty} u^{-s}\,(\alpha + o(1))\log u\,du = \frac{\alpha + o(1)}{s - 1}$$
as $s \to 1+$, i.e. $(s - 1)\zeta(s)$ has limit $\alpha$.

In the other direction, the Hardy-Littlewood-Karamata (HLK) Tauberian Theorem, proved in the next section, applies directly: if $\zeta = \zeta_{\mathcal N}$ satisfies the asymptotic relation $\zeta(s) \sim \alpha/(s - 1)$, $s \to 1+$, then $\mathcal N$ has logarithmic density $\alpha$.

We noted in §4.2 that validity of the PNT does not guarantee that a g-number system satisfies even an O-logarithmic density condition. However, a slightly different hypothesis yields logarithmic density.

Corollary 5.2. $\mathcal N$ has a positive logarithmic density if and only if
$$\lim_{s\to1+}\int_1^{\infty} u^{-s-1}\Big\{\Pi(u) - \int_1^{u}\frac{1 - v^{-1}}{\log v}\,dv\Big\}\,du$$
exists.

Proof. The displayed expression equals
$$\lim_{s\to1+}\log\{\zeta(s)(s - 1)/s\},$$
the logarithm of the right hand residue of zeta, as we see by integrating by parts (1.5) and (1.8). By the last proposition, the latter expression is a real number if and only if $\mathcal N_{\mathcal P}$ has positive logarithmic density.

Another criterion is given in terms of $d\psi(u)/u$.

Theorem 5.3. Suppose that the Chebyshev function $\psi(x)$ of a g-number system $\mathcal N$ satisfies
$$\psi_1(x) := \int_1^{x}\frac{d\psi(u)}{u} = \log x + O\Big(\int_1^{x}\frac{du}{u\,f(u)}\Big), \qquad x > 1, \tag{5.1}$$
where $f > 0$ and $\int_2^{\infty} du/\{u f(u)\log u\} < \infty$. Then $\mathcal N$ has logarithmic density. On the other hand, if
$$\psi_1(x) \ge \log x + \int_1^{x}\frac{du}{u\,f(u)}, \qquad x > 1, \tag{5.2}$$
holds with $f$ instead satisfying $\int_2^{\infty} du/\{u f(u)\log u\} = \infty$, then $\mathcal N$ does not have logarithmic density.

Proof. Let
$$F(s) := (s - 1)\int_1^{\infty} u^{-s}\,(\psi_1(u) - \log u)\,du.$$
The integral is convergent for $s > 1$ under our standing assumption that $N(x) \ll x$, because of the (weak!) estimate
$$\psi_1(x) = \int_1^{x} T^{-1}\,d\psi \le \int_1^{x} T^{-1}\,dN * T^{-1}\,d\psi = \int_1^{x} T^{-1}L\,dN \ll \log^{2} x.$$
Now we have
$$-\frac{\zeta'}{\zeta}(s) = \int_1^{\infty} u^{-(s-1)}\,d\psi_1(u) = (s - 1)\int_1^{\infty} u^{-s}\log u\,du + (s - 1)\int_1^{\infty} u^{-s}\,(\psi_1(u) - \log u)\,du = \frac{1}{s - 1} + F(s).$$
Hence, with an integration, for $1 < s < 2$,
$$\log\zeta(s) - \log\zeta(2) = \int_s^{2} -\frac{\zeta'}{\zeta}(w)\,dw = \int_s^{2}\Big\{\frac{1}{w - 1} + F(w)\Big\}\,dw = \log\frac{1}{s - 1} + \int_s^{2} F(w)\,dw.$$
By Proposition 5.1, it suffices to show that the last term on the right side has a finite limit as $s \to 1+$. Indeed, for $1 < s \le 2$, we have
$$|F(s)| \le C(s - 1)\int_1^{\infty} u^{-s}\Big\{\int_1^{u}\frac{dv}{v f(v)}\Big\}\,du = C(s - 1)\int_1^{\infty}\frac{1}{v f(v)}\Big\{\int_v^{\infty} u^{-s}\,du\Big\}\,dv = C\int_1^{\infty}\frac{v^{-s}}{f(v)}\,dv.$$
Hence, for $1 < s_1 < s_2$,
$$\Big|\int_{s_1}^{s_2} F(w)\,dw\Big| \le \int_{s_1}^{s_2}|F(w)|\,dw \le C\int_{s_1}^{s_2}\Big\{\int_1^{\infty}\frac{v^{-w}}{f(v)}\,dv\Big\}\,dw = C\int_1^{\infty}\frac{v^{-(s_1-1)} - v^{-(s_2-1)}}{v f(v)\log v}\,dv \to 0$$
as $s_1, s_2 \to 1+$, by the dominated convergence theorem, since $v^{-(s_1-1)} - v^{-(s_2-1)}$ goes to 0 pointwise. By Cauchy’s criterion, $\int_s^{2} F(w)\,dw$ has a finite limit as $s \to 1+$.

On the other hand, if (5.2) holds with $\int_2^{\infty} du/\{u f(u)\log u\} = \infty$, then the preceding calculations show
$$\int_s^{2} F(w)\,dw \ge \int_s^{2}\int_1^{\infty}\frac{v^{-w}}{f(v)}\,dv\,dw = \int_1^{\infty}\frac{v^{-(s-1)} - v^{-1}}{v f(v)\log v}\,dv,$$
and the last integral goes to infinity as $s \to 1+$. Thus $(s - 1)\zeta(s) \to \infty$, and $\mathcal N$ does not have logarithmic density.

5.3. The Hardy-Littlewood-Karamata Theorem

If a real-valued function $F \in \mathcal V$ satisfies $F(x) = \{c + o(1)\}\log x$ as $x \to \infty$, then a small calculation shows that its Mellin transform satisfies
$$\widehat F(s) := \int_{1-}^{\infty} u^{-s}\,dF(u) = \frac{c + o(1)}{s}, \qquad s \to 0+. \tag{5.3}$$
We call a result of this type abelian, because the key step in its proof is typically an averaging argument based on Abel’s method of summation (or integration) by parts. The HLK Theorem asserts that the converse holds for each nondecreasing function $F(x)$. Note that Lemma 4.1 is an “O-” version of the theorem.
Theorem 5.4 (HLK). Suppose $F \in \mathcal V$ is real and nondecreasing and its Mellin transform satisfies (5.3). Then $F(x) = \{c + o(1)\}\log x$ as $x \to \infty$.

Proof. We may assume that $c > 0$; otherwise, replace $F(x)$ by $F(x) + c'\log x$ with some $c' > 0$. Also, we may assume that $F(x) = 0$ near 1, which does not affect (5.3) or the result of the theorem. For any polynomial
$$P(z) = \sum_{n=0}^{N} a_n z^{n}$$
with $\int_0^{1} P(u)\,du > 0$ and number $s > 0$, we have
$$\int_{1-}^{\infty} u^{-s}P(u^{-s})\,dF(u) = \sum_{n=0}^{N} a_n\int_{1-}^{\infty} u^{-s(n+1)}\,dF(u) = \sum_{n=0}^{N} a_n\,\frac{c + o_n(1)}{(n + 1)s} = \frac{c + o_P(1)}{s}\sum_{n=0}^{N}\frac{a_n}{n + 1}, \qquad s \to 0+. \tag{5.4}$$
It is OK to take the error term $o_P(1)$ outside the sum since
$$\sum_{n=0}^{N}\frac{a_n}{n + 1} = \int_0^{1} P(u)\,du > 0.$$
If we change the integration variable on the left side of (5.4), we find
$$-\int_0^{1} tP(t)\,dF(t^{-1/s}) = \frac{c + o_P(1)}{s}\int_0^{1} P(u)\,du.$$
Note that $c + o_P(1) > 0$ for sufficiently small positive $s$.

For any given Riemann integrable function $G$ on $[0, 1]$ with $\int G > 0$ and any positive number $\varepsilon$, we can choose polynomials $P_1$ and $P_2$, also with positive integrals, such that $P_1(x) \le G(x) \le P_2(x)$ for $0 \le x \le 1$ with
$$\int_0^{1}\{P_2(t) - G(t)\}\,dt < \varepsilon, \qquad \int_0^{1}\{G(t) - P_1(t)\}\,dt < \varepsilon.$$
The function $F(t^{-1/s})$ is non-increasing in $t$, hence
$$-\int_0^{1} tP_1(t)\,dF(t^{-1/s}) \le -\int_0^{1} tG(t)\,dF(t^{-1/s}) \le -\int_0^{1} tP_2(t)\,dF(t^{-1/s}),$$
or
$$\frac{c + o_{P_1}(1)}{s}\int_0^{1} P_1(u)\,du \le -\int_0^{1} tG(t)\,dF(t^{-1/s}) \le \frac{c + o_{P_2}(1)}{s}\int_0^{1} P_2(u)\,du.$$
Since $\int_0^{1}\{P_2(t) - P_1(t)\}\,dt$ can be taken arbitrarily small and $G$ lies between $P_1$ and $P_2$, we have
$$-\int_0^{1} tG(t)\,dF(t^{-1/s}) = \frac{c + o(1)}{s}\int_0^{1} G(u)\,du, \qquad s \to 0+.$$
Now take
$$G(t) := \begin{cases} 1/t, & 1/e \le t \le 1, \\ 0, & 0 \le t < 1/e. \end{cases}$$
With this choice, we find
$$F(e^{1/s}) - 0 = -\int_{1/e}^{1} dF(t^{-1/s}) = \frac{c + o(1)}{s}\int_{1/e}^{1} u^{-1}\,du = \frac{c + o(1)}{s},$$
and, with one last change of variable, $F(x) = \{c + o(1)\}\log x$ as $x \to \infty$.
5.4. Mertens’ sum formula

A weak form of Mertens’ classical sum formula asserts that
$$\int_1^{x}\frac{d\psi(t)}{t} \sim \log x. \tag{5.5}$$
Here we show the connection of this relation with logarithmic density for a g-number system. As a warm-up, we give an O-estimate of $\psi_1(x)$ that is sharper than the auxiliary estimate in the proof of Theorem 5.3.

Proposition 5.5. Suppose a g-number system has finite upper logarithmic density and positive lower logarithmic density. Then $\psi_1(x) \ll \log x$.

Proof. The Chebyshev identity (3.1) and Lemma 2.15 with $\alpha = -1$ together yield
$$\sum_{n \le y}\Big(\frac{\Lambda(n)}{n} * \frac1n\Big) = \sum_{n \le y}\frac{\log n}{n}.$$
The right side of the last equation is at most
$$\log y\sum_{n \le y}\frac1n \ll \log^{2} y,$$
while the left side equals
$$\sum_{mn \le y}\frac{\Lambda(m)}{m}\,\frac1n \ge \sum_{m \le \sqrt y}\frac{\Lambda(m)}{m}\sum_{n \le \sqrt y}\frac1n \gg \psi_1(\sqrt y)\log y.$$
Taking $y = x^{2}$, we get the desired estimate.

Theorem 5.6. Suppose that a g-number system has positive logarithmic density. Then (5.5) holds.

Proof. By integration by parts,
$$\zeta(s) = \int_{1-}^{\infty} u^{-(s-1)}\,dN(u)/u = (s - 1)\int_1^{\infty} u^{-s}\Big\{\int_{1-}^{u} dN(x)/x\Big\}\,du = (s - 1)\int_1^{\infty} u^{-s}\,\{A + o(1)\}\log u\,du = \frac{A}{s - 1} + o\Big(\frac{1}{s - 1}\Big), \qquad s \to 1+,$$
for some positive constant $A$. Similarly, again for $s \to 1+$,
$$-\zeta'(s) = \int_1^{\infty} u^{-(s-1)}\log u\,dN(u)/u = A(s - 1)^{-2} + o\big((s - 1)^{-2}\big).$$
Divide the two expressions and let $s \to 1+$ to obtain
$$\int_1^{\infty} u^{-s}\,d\psi(u) = -\zeta'(s)/\zeta(s) \sim \frac{1}{s - 1}.$$
Since $\psi\uparrow$, the result follows from the Hardy-Littlewood-Karamata Tauberian Theorem of the last section.

The converse assertion, “(5.5) $\Longrightarrow$ positive logarithmic density” is not true. This is easily seen by reconsidering Example 4.6: here
$$\int_1^{x}\frac{d\psi(t)}{t} = \log x + \frac{\{1 + o(1)\}\log x}{\log\log 10x} \sim \log x, \tag{5.6}$$
and in this case $\mathcal N$ does not have finite upper logarithmic density. However, as we showed in Theorem 5.3, a hypothesis slightly stronger than (5.6) ensures that $\mathcal N$ does have positive logarithmic density.

We shall show in Theorem 8.5 that $\pi(x) = o(N(x))$ under the assumption of a regular growth condition. Here we prove another o-estimate using logarithmic density. (Recall from §4.6 that log density is independent of regular growth.) Also, we give a lower bound for $\pi(x)$.

Corollary 5.7. Suppose that $\mathcal N$ is a g-number system with positive logarithmic density. Then $\pi(x) = o(x)$ and, for any $\varepsilon > 0$, $\pi(x) \gg x^{1-\varepsilon}$.

Proof. By the last theorem
$$\psi_1(x) := \int_1^{x}\frac{d\psi(u)}{u} = \{1 + o(1)\}\log x.$$
Then, by integration by parts,
$$\pi(x) \le \Pi(x) = \int_{p_1-}^{x}\frac{u}{\log u}\,d\psi_1(u) = \frac{\psi_1(x)\,x}{\log x} - \int_{p_1}^{x}\psi_1(u)\Big\{\frac{1}{\log u} - \frac{1}{\log^{2} u}\Big\}\,du$$
$$= \{1 + o(1)\}x - \int_{p_1}^{x}\{1 + o(1)\}\Big\{1 - \frac{1}{\log u}\Big\}\,du = o(x).$$
For the lower bound estimate, we difference the result of (5.5) and write
$$\varepsilon\log x \sim \int_{x^{1-\varepsilon}}^{x}\frac{d\psi(t)}{t} < x^{\varepsilon-1}\{\psi(x) - \psi(x^{1-\varepsilon})\} < x^{\varepsilon-1}\psi(x).$$
Thus $\psi(x) \gg x^{1-\varepsilon}\log x$ and so $\pi(x) \gg x^{1-\varepsilon}$.
Remark 5.8. Theorem 5.6 is proved under the assumption that $\mathcal N$ has positive logarithmic density. Can the condition be weakened to allow logarithmic density 0? The answer is No: in Example 6.1 (below), $\pi(a_k)/a_k > \log k \to \infty$ as $k \to \infty$, and on the other hand the g-number system constructed has logarithmic density 0.

5.5. Mertens’ product formula

Mertens’ classical product formula asserts that
$$\prod_{p \le x}\Big(1 - \frac1p\Big)^{-1}\Big/\log x \to e^{\gamma}, \qquad x \to \infty. \tag{5.7}$$
Here $p$ ranges over the natural primes and $\gamma$ denotes Euler’s constant. We shall establish an analogous formula for g-primes under the hypothesis that the g-integers have a logarithmic density.

Example 5.9. An odd Mertens product. As a hint of what to expect, consider the Mertens product for the g-prime system $\mathcal P_1$ which consists of the natural primes from 3 onwards. $\mathcal N_1$ is the collection of odd numbers and has density 1/2. Formula (5.7) gives
$$\prod_{3 \le p \le x}\Big(1 - \frac1p\Big)^{-1}\Big/\log x \to e^{\gamma}/2, \qquad x \to \infty.$$
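Both (5.7) and the variant in Example 5.9 are easy to observe numerically. The following Python sketch (our own helper names; a plain sieve of Eratosthenes supplies the rational primes) computes the normalized Mertens product, optionally discarding the primes below a given bound.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision

def primes_up_to(n: int) -> list:
    """Rational primes <= n via the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def mertens_ratio(x: int, smallest_prime: int = 2) -> float:
    """Product of (1 - 1/p)^(-1) over smallest_prime <= p <= x, divided by log x."""
    product = 1.0
    for p in primes_up_to(x):
        if p >= smallest_prime:
            product *= 1.0 / (1.0 - 1.0 / p)
    return product / math.log(x)

target = math.exp(EULER_GAMMA)
for x in (10 ** 3, 10 ** 4, 10 ** 5):
    print(x, mertens_ratio(x), mertens_ratio(x, smallest_prime=3), target, target / 2)
```

Dropping the prime 2 removes the single factor $(1 - 1/2)^{-1} = 2$, which accounts for the limit $e^{\gamma}/2$ for the odd numbers.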
We showed in Proposition 5.1 that a g-number system has a logarithmic density precisely when it has a right hand residue at $s = 1$, and in this case, the two quantities are the same. Now we show further that a g-number system $\mathcal N$ has these properties if and only if Mertens’ product formula holds in $\mathcal N$.

Theorem 5.10. Suppose that $\mathcal N$ is a g-number system generated by g-primes $\mathcal P = \{p_i\}$. We have
$$\prod_{p_i \le x}\Big(1 - \frac{1}{p_i}\Big)^{-1}\Big/\log x \to A\,e^{\gamma}, \qquad x \to \infty, \tag{5.8}$$
if and only if $\mathcal N$ has logarithmic density $A$.

5.6. A remark on $\gamma$

It seems natural enough that $A$ should appear in the last formula, but it is perhaps surprising also to have Euler’s constant $\gamma$, particularly since this number does not seem connected with Beurling numbers: if we use the definition
$$\gamma := \lim_{x\to\infty}\Big\{\sum_{n \le x}\frac1n - \log x\Big\} = \int_{1-}^{\infty} u^{-1}\,(d\lfloor u\rfloor - du),$$
then the most obvious definition of the corresponding quantity for a Beurling system $\mathcal N = \{n_i\}$ with density (or logarithmic density) $A$ would seem to be
$$\gamma_{\mathcal N} = \lim_{x\to\infty}\Big\{\sum_{n_i \le x}\frac{1}{n_i} - A\log x\Big\} = \int_{1-}^{\infty}\frac1u\,\{dN_{\mathcal N}(u) - A\,du\}.$$
This number, however, is generally different from Euler’s constant. For example, for $\mathcal N$ the odd numbers, the last formula gives
$$\gamma_{\mathrm{odds}} = \lim_{x\to\infty}\Big\{\sum_{\substack{n \le x \\ n\ \mathrm{odd}}}\frac1n - \frac12\log x\Big\} = \lim_{x\to\infty}\Big\{\sum_{n \le x}\frac1n - \log x - \Big(\sum_{n \le x/2}\frac{1}{2n} - \frac12\log x\Big)\Big\}$$
$$= \gamma - \frac{\log(1/2) + \gamma}{2} = \frac{\gamma + \log 2}{2} > \gamma.$$
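The value of $\gamma_{\mathrm{odds}}$ can be confirmed by direct summation. The short Python sketch below (the function name is ours) evaluates $\sum_{n \le x,\ n\ \mathrm{odd}} 1/n - \frac12\log x$ for a large $x$ and compares it with $(\gamma + \log 2)/2$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision

def gamma_odds(x: int) -> float:
    """Sum of 1/n over odd n <= x, minus (1/2) log x."""
    partial = math.fsum(1.0 / n for n in range(1, x + 1, 2))
    return partial - 0.5 * math.log(x)

limit = (EULER_GAMMA + math.log(2.0)) / 2.0
for x in (10 ** 2, 10 ** 4, 10 ** 6):
    print(x, gamma_odds(x), limit, EULER_GAMMA)
```

The computed values settle near $0.635$, visibly larger than $\gamma \approx 0.577$.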
Proof of the theorem. The argument proceeds via the following steps:
• an equivalent form of the theorem
• a generating function
• a tauberian theorem for Mellin transforms.
5.7. An equivalent form and proof of “only if”

Taking the log of each side of formula (5.8) and expanding $\log(1 - 1/p)$ as a series gives
\[ \sum_{p \le x} \sum_{\alpha \ge 1} \frac{1}{\alpha p^{\alpha}} - \log\log x \to \log A + \gamma \tag{5.9} \]
as an expression equivalent to (5.8). In the following lemma, we replace the double sum and the iterated log in (5.9) by functions that are better suited for analysis.

Lemma 5.11. The formula
\[ \int_1^x \frac{1}{u} \Big\{ d\Pi(u) - \frac{1 - u^{-1}}{\log u}\,du \Big\} \to \log A, \qquad x \to \infty, \tag{5.10} \]
is equivalent to (5.9).

Proof. As a preliminary, note that each of (5.9) and (5.10) implies the crude bound
\[ \sum_{p \le x} \frac{1}{p} \ll \log\log x. \tag{5.11} \]
Now, we change the double sum in (5.9) into $\int d\Pi(u)/u$; the two expressions differ only in some higher prime powers, and we show the difference goes to 0 as $x \to \infty$. We have
\[ 0 \le \sum_{p \le x} \sum_{\alpha \ge 1} \frac{1}{\alpha p^{\alpha}} - \int_1^x \frac{d\Pi(u)}{u} = \sum_{p \le x} \sum_{\alpha:\, p^{\alpha} > x} \frac{1}{\alpha p^{\alpha}} . \]
Note that $\alpha \ge \max\{2, \log x/(\log p)\}$. Thus the last sum is at most
\[ \sum_{p \le x} \frac{\log p}{\log x} \sum_{\alpha \ge 2} p^{-\alpha} \ll \frac{1}{\log x} \sum_{p \le x} \frac{\log p}{p} \cdot \frac{1}{p} \ll \frac{\log\log x}{\log x} = o(1), \]
by the boundedness of $\log p/p$ and (5.11).

Next, we replace $\log\log x$ by a function having a better Mellin transform. Integration by parts and a change of variable give
\[ \lim_{x\to\infty} \Big\{ \int_1^x \frac{1 - u^{-1}}{u \log u}\,du - \log\log x \Big\} = - \int_1^{\infty} u^{-2} \log\log u \,du = - \int_0^{\infty} e^{-v} \log v \,dv. \]
We interpret the last integral using two formulas for the derivative of Euler's gamma function (cf. [MV07], Appendix C). On the one hand,
\[ \Gamma'(1) = \frac{d}{ds} \int_0^{\infty} e^{-v} v^{s-1}\,dv \,\Big|_{s=1} = \int_0^{\infty} e^{-v} \log v \,dv, \]
and on the other, the logarithmic derivative of the product form of $\Gamma$ gives $\Gamma'(1) = \Gamma'(1)/\Gamma(1) = -\gamma$. Thus, as $x \to \infty$,
\[ \int_1^x \frac{1 - u^{-1}}{u \log u}\,du - \log\log x \to \gamma. \]
Together, the two expressions show the equivalence of (5.10) and (5.9).
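The limit relating the integral of $(1 - u^{-1})/(u \log u)$ to Euler's constant lends itself to a quick numerical check. The sketch below is illustrative only: it assumes the substitution $u = e^v$ (so the integral becomes $\int_0^{\log x} (1 - e^{-v})/v \, dv$) and uses a hand-written composite Simpson rule; the helper names and the choice $\log x = 30$ are not from the source.

```python
import math

EULER_GAMMA = 0.5772156649015329

def integrand(v: float) -> float:
    """(1 - e^{-v}) / v, extended continuously by the value 1 at v = 0."""
    return 1.0 if v == 0.0 else -math.expm1(-v) / v

def simpson(f, a: float, b: float, n: int) -> float:
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    odd = math.fsum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    even = math.fsum(f(a + 2 * i * h) for i in range(1, n // 2))
    return (f(a) + f(b) + 4 * odd + 2 * even) * h / 3

def gamma_via_integral(log_x: float, n: int = 20000) -> float:
    # integral over 1 <= u <= x, rewritten in v = log u, minus log log x
    return simpson(integrand, 0.0, log_x, n) - math.log(log_x)

value = gamma_via_integral(30.0)
print(value, EULER_GAMMA)
```

Already at $\log x = 30$ the difference from $\gamma$ is far below the quadrature error, which reflects how fast the tail integral decays.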
We now prove the “only if” assertion of the theorem by showing that the relation
\[ (s - 1)\zeta(s) \to A, \qquad s \to 1+ \tag{5.12} \]
follows from (5.10), and appealing to Proposition 5.1. We start by noting that (1.5) and (1.8) together yield
\[ \log \zeta(s) - \log \frac{s}{s-1} = \int_1^{\infty} u^{-s} \Big\{ d\Pi(u) - \frac{1 - u^{-1}}{\log u}\,du \Big\}. \tag{5.13} \]
Next, let $\ell(x)$ denote the left side of (5.10). If we express the last formula in terms of $\ell$, integrate by parts, and apply Lemma 5.11, we find
\begin{align*}
\log\{(s-1)\zeta(s)/s\} &= \int_1^{\infty} u^{-(s-1)}\,d\ell(u) = (s-1) \int_1^{\infty} u^{-s} \ell(u)\,du \\
&= (s-1) \int_1^{\infty} u^{-s} \{\log A + o(1)\}\,du = \log A + o(1)
\end{align*}
as $s \to 1+$. Also, $\log s \to 0$ as $s \to 1$ in the last formula, so upon exponentiating, (5.12) holds, which proves one implication of the theorem.

5.8. Tauber’s Theorem and conclusion of the argument

Applying Proposition 5.1 again, we complete the proof of Theorem 5.10 by deducing (5.10) from (5.12). We use the Mellin version of the following tauberian theorem.

Theorem 5.12 (Tauber). Let $f$ be a right continuous function on $[1, \infty)$ that is locally of bounded variation. Further, suppose that
\[ \widehat{f}(s) := \int_{1-}^{\infty} u^{-s}\,df(u) \]
exists for all $s > 1$, that $\widehat{f}(s) \to \ell$ as $s \to 1+$, and that
\[ \int_1^x \frac{\log u}{u}\,df(u) = o(\log x), \qquad x \to \infty. \tag{5.14} \]
Then
\[ \int_{1-}^x \frac{1}{u}\,df(u) \to \ell \quad \text{as } x \to \infty. \]
(We remark that (5.14) is also a necessary condition for this theorem.)

In Tauber’s theorem take $\ell = \log A$ and
\[ df(u) = d\Pi(u) - \frac{1 - u^{-1}}{\log u}\,du. \]
By (5.12), we have $\zeta(s) \sim A/(s-1)$ as $s \to 1+$. Thus
\[ \int_1^{\infty} u^{-s}\,df(u) = \log\{\zeta(s)(s-1)/s\} \to \log A = \ell. \]
Recalling that $\log u \, d\Pi(u) = d\psi(u)$, we have from Theorem 5.6,
\[ \int_1^x \frac{\log u}{u}\,df(u) = \int_1^x \frac{1}{u} \{ d\psi(u) - (1 - u^{-1})\,du \} = o(\log x). \]
Thus, the condition of the theorem is satisfied and so, as $x \to \infty$,
\[ \int_{1-}^x \frac{1}{u}\,df(u) = \int_1^x \frac{1}{u}\,d\Pi(u) - \int_1^x \frac{1 - u^{-1}}{u \log u}\,du \to \log A . \]
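Proof of Tauber’s Theorem. We suppose without loss of generality that $f(u) = 0$ for all $u \le e$; this avoids having factors such as $1/\log u$ near $u = 1$. Set
\[ G(x) := \int_e^x \frac{\log u}{u}\,df(u). \]
By the hypothesis on $\widehat{f}$, we have
\[ I(\varepsilon) := \int_e^{\infty} u^{-\varepsilon} \frac{dG(u)}{\log u} = \int_e^{\infty} u^{-1-\varepsilon}\,df(u) \to \ell, \qquad \varepsilon \to 0+. \]
Integrating by parts and recalling that $G(x) = o(\log x)$ by (5.14), we find
\[ I(\varepsilon) = I_1(\varepsilon) + I_2(\varepsilon) := \varepsilon \int_e^{\infty} \frac{G(u)\,du}{u^{1+\varepsilon} \log u} + \int_e^{\infty} \frac{G(u)\,du}{u^{1+\varepsilon} \log^2 u} \to \ell, \qquad \varepsilon \to 0+. \]
To proceed, we show first that $I_1(\varepsilon) \to 0$ as $\varepsilon \to 0+$. For given $\delta > 0$, we have $|G(u)/\log u| < \delta$ for all $u$ exceeding some $U = U(\delta)$ and $|G(u)/\log u| < M$ for some $M$ and $e \le u \le U$. Thus
\[ |I_1(\varepsilon)| \le \varepsilon \int_e^U \frac{M\,du}{u^{1+\varepsilon}} + \varepsilon \int_U^{\infty} \frac{\delta\,du}{u^{1+\varepsilon}} \le M \{ e^{-\varepsilon} - U^{-\varepsilon} \} + \delta, \]
whence $\limsup_{\varepsilon \to 0+} |I_1(\varepsilon)| \le \delta$. Since $\delta$ is arbitrarily small, $I_1(\varepsilon) \to 0$ as claimed. It follows that $I_2(\varepsilon) \to \ell$ as $\varepsilon \to 0+$.

Our next goal is to show that
\[ \int_e^x \frac{G(u)\,du}{u \log^2 u} \to \ell, \qquad x \to \infty. \tag{5.15} \]
Given a (large) positive $x$, set $\varepsilon := 1/\log x$ and
\begin{align*}
I_3(\varepsilon) &:= \int_e^x \frac{G(u)\,du}{u \log^2 u} - \int_e^{\infty} \frac{G(u)\,du}{u^{1+\varepsilon} \log^2 u} \\
&= \int_e^x (1 - u^{-\varepsilon}) \frac{G(u)\,du}{u \log^2 u} - \int_x^{\infty} \frac{G(u)\,du}{u^{1+\varepsilon} \log^2 u};
\end{align*}
we show $I_3(\varepsilon) \to 0$ as $\varepsilon \to 0+$. The last integral has absolute value at most
\[ \int_x^{\infty} \frac{o(1)\,du}{u^{1+\varepsilon} \log u} \le \frac{o(1)}{\log x} \int_x^{\infty} u^{-1-\varepsilon}\,du = \frac{o(1)\, x^{-\varepsilon}}{\varepsilon \log x} = o(1). \]
Now, $0 < 1 - u^{-\varepsilon} \le \varepsilon \log u$ for $\varepsilon > 0$ and $u > 1$; thus
\[ \Big| \int_e^x (1 - u^{-\varepsilon}) \frac{G(u)\,du}{u \log^2 u} \Big| \le \varepsilon \int_e^x \log u \, \frac{|G(u)|\,du}{u \log^2 u} = \varepsilon \int_e^x \frac{o(1)\,du}{u}. \]
By reasoning used above, for given $\delta > 0$, this is at most
\[ \varepsilon M \int_e^U \frac{du}{u} + \varepsilon \delta \int_U^x \frac{du}{u} \le \varepsilon M \log U + \varepsilon \delta \log x, \]
and so $I_3(\varepsilon) \to 0$. It follows that (5.15) holds.

Finally, by integration by parts,
\[ \int_e^x \frac{1}{u}\,df(u) = \int_e^x \frac{dG(u)}{\log u} = \frac{G(x)}{\log x} + \int_e^x \frac{G(u)\,du}{u \log^2 u} \to \ell, \qquad x \to \infty. \]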
5.9. Notes §5.5. The product result, Theorem 5.10, is based on an article [Ol11] by R. Olofsson. This theme is treated also by P. Pollack [Pl13]. §5.8. Tauber’s original paper from 1897 treated power series with radius of convergence 1. Our Mellin transform version follows that for Laplace transforms by J. Korevaar [Ko04] §I-14.
https://doi.org/10.1090//surv/213/06
CHAPTER 6
O-Density of g-integers How weighty are the matters contained in this space Summary. O-density and criteria. Independence of O-density and log density. Irrelevance of small primes. More on the residue of ζ(s) at s = 1.
6.1. Non-relation of log-density and O-density

Recall that we say that a g-number system $\mathcal N$ has O-density (finite upper density) if its integer counting function satisfies $N(x) \ll x$. Also, we say that $\mathcal N$ has positive lower density if $N(x) \gg x$. Here we give some conditions for these densities that are related to estimates of classical prime number theory. Integration by parts shows that if a g-number system has O-density, then it also has O-log density. Similarly, positive lower density implies positive lower log density. Example 4.18 shows that density is a stronger condition than log density. In this section we give two examples showing that if a g-number system has log density or has O-density, we cannot conclude the existence of the other.

Example 6.1. Log density does not imply O-density. Suppose that a g-number system has log density. We show that this does not ensure that the system has O-density or that its primes satisfy a Chebyshev O-bound $\pi(x) \ll x/\log x$. For $k = 2, 3, \dots$, let $a_k := e^{k^k}$ be a g-prime of multiplicity $\lfloor a_k \log k \rfloor$. This lacunary collection of primes, call it $\mathcal P$, violates the Chebyshev condition, for the number of primes at the single point $x = a_k$ is about $x \log\log\log x$, and a fortiori, $\mathcal N$, the g-integers generated by $\mathcal P$, do not satisfy an O-density condition. On the other hand, $\mathcal N$ does have a log density, and it is 0. By Corollary 4.3, it suffices to show that, for some function $B(\varepsilon) \to \infty$ as $\varepsilon \to 0+$,
\[ \log \zeta(1 + \varepsilon) \le \log(1/\varepsilon) - B(\varepsilon). \tag{6.1} \]
By (1.5), we have
\[ \log \zeta(s) = \int_1^{\infty} u^{-s}\,d\Pi(u) = S_1 + \frac{1}{2} S_2 + \frac{1}{3} S_3 + \dots, \]
where
\[ S_\nu = S_\nu(s) := \sum_{k=2}^{\infty} \lfloor a_k \log k \rfloor \, a_k^{-\nu s}, \qquad \nu = 1, 2, \dots. \]
The series of terms of $\log \zeta(s) - S_1$ is uniformly bounded for $s > 1$:
\[ \sum_{\nu=2}^{\infty} \frac{1}{\nu} S_\nu \le \sum_{k=2}^{\infty} \log k \sum_{\nu=2}^{\infty} a_k^{-(\nu-1)} \ll \sum_{k=2}^{\infty} (\log k)/a_k \ll 1. \tag{6.2} \]
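Also, with $s = 1 + \varepsilon$, we have
\[ S_1 \le \sum_{k=2}^{\infty} (\log k)\, a_k^{-\varepsilon}. \tag{6.3} \]
We show that there is a reasonably small integer $K = K(\varepsilon)$ beyond which the numbers $a_k^{-\varepsilon}$ go to zero so rapidly that the tail of the last series is dominated by its $(K+1)$st term. For the initial range of the series it suffices to use the trivial estimate $a_k^{-\varepsilon} < 1$ to get (by a weak form of Stirling’s formula) the bound
\[ \sum_{k=2}^{K} (\log k)\, a_k^{-\varepsilon} < (K + \tfrac{1}{2}) \log K - K + 1. \tag{6.4} \]
For $\varepsilon > 0$, take $K$ such that
\[ K^K \le 1/\varepsilon < (K+1)^{K+1}. \tag{6.5} \]
Let us suppose that $\varepsilon \le 1/30$. The right inequality implies that $K \ge 3$. Next, a small calculation shows that
\[ (a_{k+1})^{-\varepsilon} / (a_k)^{-\varepsilon} < \exp\{-2\varepsilon k^{k+1}\}, \qquad k \ge 3. \]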
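The choice of $K$ and the "small calculation" can be made concrete. The sketch below assumes $\log a_k = k^k$, so that the ratio bound reduces to the integer inequality $(k+1)^{k+1} - k^k > 2k^{k+1}$; the function names are illustrative and not taken from the text.

```python
def K_of(eps: float) -> int:
    """Integer K with K^K <= 1/eps < (K+1)^(K+1); assumes 0 < eps <= 1."""
    K = 1
    while (K + 1) ** (K + 1) <= 1.0 / eps:
        K += 1
    return K

def gap_exceeds(k: int) -> bool:
    """True if log a_{k+1} - log a_k = (k+1)^(k+1) - k^k exceeds 2 k^(k+1)."""
    return (k + 1) ** (k + 1) - k ** k > 2 * k ** (k + 1)

# K grows very slowly as eps shrinks, roughly like log(1/eps) / log log(1/eps).
for eps in (1 / 30, 1e-6, 1e-12):
    print(eps, K_of(eps))

# The inequality behind the ratio bound, checked exactly with integers.
print(all(gap_exceeds(k) for k in range(3, 60)))
```

For $\varepsilon = 1/30$ the loop stops at $K = 3$, since $3^3 = 27 \le 30 < 256 = 4^4$, in agreement with the remark that $K \ge 3$.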
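Now suppose that $k \ge K + 1 \ge 4$, so that $\varepsilon k^k > 1$. We have
\[ \frac{(a_{k+1})^{-\varepsilon} \log(k+1)}{(a_k)^{-\varepsilon} \log k} < \exp\{-2k\} \frac{\log(k+1)}{\log k} \le \exp\{-8\} \frac{\log 5}{\log 4} < \frac{1}{1000}, \]
and a geometric series estimate gives
\[ \sum_{k=K+1}^{\infty} (\log k)\, a_k^{-\varepsilon} < \frac{1000}{999} \log(K+1)\, a_{K+1}^{-\varepsilon} < \frac{3}{2} \log K. \]
This bound and (6.4) together give
\[ S_1 < K \log K + 1 + 2 \log K - K. \]
Recalling (6.2) and (6.5), there is an absolute constant $c$ such that, for all $\varepsilon < 1/30$,
\[ \log \zeta(1 + \varepsilon) < K \log K + 2 \log K - K + c \le \log 1/\varepsilon - B(\varepsilon), \]
where $B(\varepsilon) = K - 2 \log K - c \to \infty$ as $\varepsilon \to 0+$. By Corollary 4.3, the log density of $\mathcal N$ is 0.

Example 6.2. O-density does not imply log density. We give a continuous example of a g-number system having O-density but not log-density. Define a sequence of real numbers $\{Y_\nu\}$ by taking $Y_\nu := \exp(100^\nu)$ for $\nu = 1, 2, \dots$, and define a g-integer counting function by
\[ N(x) := \int_{1-}^x \exp^* \Big\{ \Big( 1 - \sum_{\nu=1}^{\infty} \big[ \chi_{(Y_\nu, Y_\nu^2)}(u) - \chi_{(Y_\nu^2, Y_\nu^4)}(u) \big] \Big) \frac{1 - u^{-1}}{\log u}\,du \Big\}. \tag{6.6} \]
We first show that $N(x) \ll x$ by applying Theorem 6.5 (see below). We have
\begin{align*}
\pi(x) \le \Pi(x) &= \int_1^x \Big\{ 1 - \sum_{\nu=1}^{\infty} \big[ \chi_{(Y_\nu, Y_\nu^2)}(u) - \chi_{(Y_\nu^2, Y_\nu^4)}(u) \big] \Big\} \frac{1 - u^{-1}}{\log u}\,du \\
&\le 2 \int_1^x \frac{1 - u^{-1}}{\log u}\,du \ll \frac{x}{\log x}.
\end{align*}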
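(The sequence $\{Y_\nu\}$ increases sufficiently rapidly that the intervals do not overlap.) Also, for $s > 1$, we have
\begin{align*}
\log \zeta(s) &= \int_1^{\infty} u^{-s} \Big\{ 1 - \sum_{\nu=1}^{\infty} \big[ \chi_{(Y_\nu, Y_\nu^2)}(u) - \chi_{(Y_\nu^2, Y_\nu^4)}(u) \big] \Big\} \frac{1 - u^{-1}}{\log u}\,du \\
&= \log \frac{s}{s-1} - \sum_{\nu=1}^{\infty} J_\nu(s) + \sum_{\nu=1}^{\infty} K_\nu(s), \tag{6.7}
\end{align*}
where
\[ J_\nu(s) := \int_{Y_\nu}^{Y_\nu^2} u^{-s} \frac{du}{\log u} - \int_{Y_\nu^2}^{Y_\nu^4} u^{-s} \frac{du}{\log u} \]
and
\[ K_\nu(s) := \int_{Y_\nu}^{Y_\nu^2} u^{-s} \frac{du}{u \log u} - \int_{Y_\nu^2}^{Y_\nu^4} u^{-s} \frac{du}{u \log u}. \]
We have at once for $s > 1$,
\[ 0 < \sum_{\nu=1}^{\infty} K_\nu(s) < \int_{Y_1}^{\infty} \frac{du}{u^2 \log u} < \frac{1}{Y_1} < 10^{-43}, \]
and each individual $J$ term satisfies
\[ J(s) = \int_Y^{Y^2} u^{-s} \frac{du}{\log u} - \int_{Y^2}^{Y^4} u^{-s} \frac{du}{\log u} = \int_Y^{Y^2} u^{-s} \{ 1 - u^{1-s} \} \frac{du}{\log u} > 0. \]
Thus $(s-1)\zeta(s)/s < \exp(Y_1^{-1}) \ll 1$, $s > 1$, i.e. $\zeta(s)$ has a finite upper residue at $s = 1$. Both the conditions of Theorem 6.5 hold, and so $N(x) \ll x$.

It remains to prove that this system does not have log-density. For this, we apply Proposition 5.1 and show that the zeta function does not have a right hand residue at $s = 1$. For each positive integer $M$, define $\varepsilon_i = \varepsilon_i(M)$ by $\varepsilon_1 := 1/(2 \log Y_M) = 0.5 \cdot 100^{-M}$ and $\varepsilon_2 = 7\varepsilon_1$. We show that the ratio $\varepsilon_2 \zeta(1 + \varepsilon_2) / \varepsilon_1 \zeta(1 + \varepsilon_1)$ is bounded away from 1. By (6.7), we have the key relation
\[ -\log\{ \varepsilon \zeta(1 + \varepsilon)/(1 + \varepsilon) \} = \sum_{\nu=1}^{\infty} J_\nu(1 + \varepsilon) + 10^{-43} \theta, \]
with $|\theta| < 1$. We show
• the contribution of $J_M(1 + \varepsilon_1)$ is reasonably large
• the contributions of all terms $J_\nu(1 + \varepsilon_2)$ are small,
which will yield the desired inequalities. We need one numerical evaluation. Using either computer algebra or Table 5.1 of [AS64], we find
\begin{align*}
J_M(1 + \varepsilon_1) &= \int_{\varepsilon_1 \log Y_M}^{2\varepsilon_1 \log Y_M} e^{-v} \frac{dv}{v} - \int_{2\varepsilon_1 \log Y_M}^{4\varepsilon_1 \log Y_M} e^{-v} \frac{dv}{v} \\
&= \int_{1/2}^{1} e^{-v} \frac{dv}{v} - \int_{1}^{2} e^{-v} \frac{dv}{v} = 0.169906\ldots.
\end{align*}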
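The numerical evaluation of $J_M(1 + \varepsilon_1)$ can be reproduced without tables. The following sketch uses a hand-written composite Simpson rule on the two integrals of $e^{-v}/v$ over $[1/2, 1]$ and $[1, 2]$; the helper names and the number of subintervals are illustrative choices.

```python
import math

def simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    odd = math.fsum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    even = math.fsum(f(a + 2 * i * h) for i in range(1, n // 2))
    return (f(a) + f(b) + 4 * odd + 2 * even) * h / 3

def e1_integrand(v: float) -> float:
    """Integrand e^{-v} / v of the exponential integral."""
    return math.exp(-v) / v

# J_M(1 + eps_1) after the change of variable v = eps_1 * log u
J_M = simpson(e1_integrand, 0.5, 1.0) - simpson(e1_integrand, 1.0, 2.0)
print(J_M)
```

The printed value agrees with the quoted $0.169906\ldots$ to the digits shown, and in particular exceeds $0.1699$.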
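Since all $J_\nu(1 + \varepsilon_1)$ are positive, we have
\[ \sum_{\nu=1}^{\infty} J_\nu(1 + \varepsilon_1) > J_M(1 + \varepsilon_1) > 0.1699. \tag{6.8} \]
On the other hand, for $J_\nu(1 + \varepsilon_2)$, we give two simple upper estimates, depending upon the size of $\varepsilon_2 \log Y_\nu$. For $\nu < M$, where $\varepsilon_2 \log Y_\nu$ is small, we have
\begin{align*}
0 < J_\nu(1 + \varepsilon_2) &= \int_{\varepsilon_2 \log Y_\nu}^{2\varepsilon_2 \log Y_\nu} \{ e^{-v} - e^{-2v} \} \frac{dv}{v} \le \int_{\varepsilon_2 \log Y_\nu}^{2\varepsilon_2 \log Y_\nu} v \, \frac{dv}{v} \\
&= \varepsilon_2 \log Y_\nu = \frac{7 \log Y_\nu}{2 \log Y_M} = 3.5 \cdot 100^{\nu - M};
\end{align*}
for $\nu \ge M$, $\varepsilon_2 \log Y_\nu$ is large, and we have
\[ 0 < J_\nu(1 + \varepsilon_2) \le \int_{\varepsilon_2 \log Y_\nu}^{2\varepsilon_2 \log Y_\nu} e^{-v} \frac{dv}{v} \le \int_{\varepsilon_2 \log Y_\nu}^{\infty} e^{-v}\,dv = \exp\{-\varepsilon_2 \log Y_\nu\} = \exp\{-3.5 \cdot 100^{\nu - M}\}. \]
Now
\[ \sum_{\nu < M} J_\nu(1 + \varepsilon_2) < 3.5 \sum_{j \ge 1} 100^{-j} < 0.0354, \qquad \sum_{\nu \ge M} J_\nu(1 + \varepsilon_2) < \sum_{j \ge 0} \exp\{-3.5 \cdot 100^{j}\} < 0.0303, \tag{6.9} \]
so that $\sum_{\nu} J_\nu(1 + \varepsilon_2) < 0.0657$. Combining this estimate with (6.8) and the key relation, and noting that $(1 + \varepsilon_2)/(1 + \varepsilon_1) > 1$, we obtain
\[ \frac{\varepsilon_2 \zeta(1 + \varepsilon_2)}{\varepsilon_1 \zeta(1 + \varepsilon_1)} > \exp\{0.10\} > 1.1 \]
for a sequence of pairs of numbers $\{\varepsilon_1(M), \varepsilon_2(M)\}$ tending to $0+$ as $M \to \infty$. Thus $\zeta(s)$ does not have a right hand residue at $s = 1$, and by Proposition 5.1, the g-number system does not have log-density.

6.2. O-Criteria for O-density

In the remainder of this chapter, we give several conditions, some necessary, others sufficient, for O-density. The conditions are typically O-bounds on the prime counting function and/or the growth of the zeta function near its pole at $s = 1$. We begin by showing, in analogy with Proposition 4.4, that the contribution of small primes plays no role in determining whether a g-number system has O-density or density. In the sequel, we shall freely remove initial primes when it is convenient.

Proposition 6.3. Let $B > 1$ and $d\Pi$ be the weighted prime and prime power counting measure of a g-number system $\mathcal N$, decomposed into $d\Pi_1 + d\Pi_2$ with $\Pi_1$ counting primes $p \le B$ and the powers of these primes, and $\Pi_2$ counting the remaining part of $\Pi$ (as in Proposition 4.4). Also set $dN_i := \exp^* d\Pi_i$, $i = 1, 2$, and let $\mathcal N_2$ denote the g-number system associated with $d\Pi_2$. Then $\mathcal N$ has O-density (resp. density) if and only if $\mathcal N_2$ has the same.

Proof. We prove the proposition for density; for O-density, replace asymptotic estimates by O-bounds.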
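Suppose first that $\mathcal N$ has density $A$. We have
\begin{align*}
N_2(x) &= \int_{1-}^x dN * \{ \exp^*(-d\Pi_1) \} = \int_{1-}^x N(x/t) \exp^*(-d\Pi_1)(t) \\
&= \int_{1-}^x \Big\{ \frac{Ax}{t} + o\Big( \frac{x}{t} \Big) \Big\} \exp^*(-d\Pi_1)(t).
\end{align*}
By the proof of Proposition 4.4, $\int_1^{\infty} T^{-1} d\Pi_1 < \infty$; thus $\int_{1-}^{\infty} T^{-1} \exp^*(-d\Pi_1)$ is (absolutely) convergent. This implies, first, that $\int_{1-}^x o(x/t) \exp^*(-d\Pi_1)(t) = o(x)$. Also, as $x \to \infty$,
\[ \int_{1-}^x T^{-1} \exp^*(-d\Pi_1) \sim \int_{1-}^{\infty} T^{-1} \exp^*(-d\Pi_1) =: I. \]
Then, by Lemma 3.8 and Proposition 3.7,
\[ I = \int_{1-}^{\infty} \exp^*(-T^{-1} d\Pi_1) = \exp\Big\{ - \int_{1-}^{\infty} T^{-1} d\Pi_1 \Big\} > 0. \]
Finally, as $x \to \infty$,
\[ N_2(x) \sim Ax \int_{1-}^x T^{-1} \exp^*(-d\Pi_1) \sim Ax \exp\Big\{ - \int_1^{\infty} T^{-1} d\Pi_1 \Big\}. \]
Next, suppose that $\mathcal N_2$ has density $c_2$. Write
\[ N(x) = \int_{1-}^x dN_2 * \exp^*(d\Pi_1) = \int_{1-}^x N_2(x/t) \exp^*(d\Pi_1)(t), \]
and apply the preceding reasoning to conclude that
\[ N(x) \sim c_2\, x \exp\Big\{ \int_1^{\infty} t^{-1} d\Pi_1(t) \Big\}. \]
Thus $\mathcal N$ has density as well.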
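For the rational integers the effect of removing small primes is easy to observe. In that case $A = 1$ and $\int T^{-1} d\Pi_1 = -\sum_{p \le B} \log(1 - 1/p)$, so the proposition predicts density $\prod_{p \le B} (1 - 1/p)$ for the integers free of prime factors $p \le B$. The sketch below, with the illustrative choice $B = 7$ and hypothetical helper names, compares this prediction with a direct count.

```python
from math import gcd, prod

def count_coprime(x: int, primes) -> int:
    """Number of integers 1 <= n <= x divisible by none of the given primes."""
    m = prod(primes)
    return sum(1 for n in range(1, x + 1) if gcd(n, m) == 1)

small_primes = [2, 3, 5, 7]      # the primes p <= B that are removed (B = 7)
x = 210_000                      # a multiple of 2*3*5*7, so the count is exact
N2 = count_coprime(x, small_primes)
predicted = x * prod(1 - 1 / p for p in small_primes)
print(N2, predicted)
```

Because $x$ is a multiple of $210$, the count equals the predicted value exactly (up to floating-point rounding in the product).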
The following result follows immediately from the proof of Lemma 3.3 or from Proposition 4.2, after noting that O-density implies O-log density.

Proposition 6.4. If $N(x) \ll x$, then the associated zeta function has a finite upper residue at $s = 1$.

Does Proposition 6.4 have a direct converse? The answer is No, for such an implication and Proposition 4.2 would together yield the relations “O-log density $\iff$ finite upper residue $\iff$ O-density.” However, Example 6.1 implies that O-log density $\not\Rightarrow$ O-density. Thus Proposition 6.4 does not have an unconditional converse.

Theorem 6.5. If the zeta function of a g-number system has a finite upper residue at $s = 1$ and the counting function of g-primes satisfies the upper Chebyshev bound $\pi(x) \ll x/\log x$, then $N(x) \ll x$.

Proof. We use the Chebyshev upper bound in the equivalent form $\psi(x) \ll x$. By Chebyshev’s identity (3.3) and Proposition 4.2,
\[ T(x) := \int_{1-}^x \log u \, dN(u) = \int_{1-}^x \psi(x/u)\,dN(u) \ll \int_{1-}^x \frac{x}{u}\,dN(u) \ll x \log x. \]
Integration by parts gives
\[ N(x) = 1 + \int_1^x (\log u)^{-1}\,dT(u) \ll x. \]
Could we have established the last theorem assuming only the existence of a Chebyshev upper bound? The answer is No; the g-integer system with counting function $D(x)$ of §4.1 satisfies $D(x) \sim x \log x \ne O(x)$ but $\pi(x) \sim 2x/\log x$. This example shows also that the condition $\pi(x) \ll x/\log x$ by itself does not imply that the associated g-integer system has a finite upper residue at $s = 1$.

Next, suppose that the prime counting function of a g-number system can be resolved into a main part satisfying the conditions of the last theorem and a small remaining part. In this case, we show that the g-integers have O-density. It is convenient to handle separately the following convergence result.

Lemma 6.6. Suppose $\Pi_2(x) \in \mathcal V$ and there exists a nondecreasing function $A(x)$ on $[1, \infty)$ such that
\[ \Big| \int_{1-}^x \exp^* d\Pi_2 \Big| \le A(x) \quad \text{and} \quad \int_1^{\infty} A(t)\, t^{-2}\,dt < \infty. \tag{6.10} \]
Then
\[ \int_{1-}^{\infty} T^{-1} \exp^* d\Pi_2 \tag{6.11} \]
converges.

Proof. The convergence of $\int A(t)\, t^{-2}\,dt$ and the inequality
\[ \frac{A(x)}{2x} = A(x) \int_x^{2x} \frac{dt}{t^2} \le \int_x^{2x} A(t) \frac{dt}{t^2} = o(1) \]
imply that $A(x) = o(x)$. Set
\[ E(u, x) := \int_u^x \exp^* d\Pi_2, \qquad u < x. \]
We have $|E(u, x)| \le 2A(x)$. Then, by integration by parts,
\[ \int_u^v t^{-1} (\exp^* d\Pi_2)(t) = \frac{E(u, v)}{v} + \int_u^v E(u, t)\, t^{-2}\,dt. \]
It follows that
\[ \Big| \int_u^v t^{-1} (\exp^* d\Pi_2)(t) \Big| \le \frac{2A(v)}{v} + 2 \int_u^v A(t)\, t^{-2}\,dt \to 0 \]
as $u, v \to \infty$, and thus (6.11) holds.
Corollary 6.7. Suppose that $\Pi(x)$ of a g-number system $\mathcal N$ can be written as a sum $\Pi_1(x) + \Pi_2(x)$, where $\Pi_1(x) \uparrow$, $\Pi_1(x) \ll x/\log x$, and
\[ \int_1^{\infty} x^{-s}\,d\Pi_1(x) - \log \frac{1}{s-1} \ll 1, \qquad 1 < s < 2. \]
Further, suppose that the conditions of the last lemma hold for $\Pi_2(x)$. Then the integer counting function of $\mathcal N$ satisfies $N(x) \ll x$.

Proof. Recall from (3.1) and Proposition 3.10 that $dN = \exp^* d\Pi = dN_1 * \exp^* d\Pi_2$ with $dN_1 := \exp^* d\Pi_1$. The conditions for $\Pi_1(x)$ are those of the last theorem. Thus, $N_1(x) \ll x$. Combining this bound with the last lemma, we obtain
\[ N(x) \le \int_{1-}^x A\Big( \frac{x}{t} \Big)\,dN_1(t) = \int_{1-}^x N_1\Big( \frac{x}{t} \Big)\,dA(t) \ll x \int_{1-}^x \frac{dA(t)}{t}. \]
Now integrate by parts. We find
\[ N(x) \ll A(x) + x \int_1^{\infty} A(t)\, t^{-2}\,dt \ll x. \]
Corollary 6.8. Suppose that $\Pi(x) = \Pi_1(x) + \Pi_2(x)$, with $\Pi_1(x)$ as in the last theorem. In place of (6.10), assume that
\[ \int_{1-}^{\infty} T^{-1} |\exp^* d\Pi_2| < \infty \]
or that
\[ \int_{1-}^{\infty} T^{-1} |d\Pi_2| < \infty. \]
In either case, $N(x) \ll x$ holds.

Proof. Suppose the first integral condition holds. If we take
\[ A(x) := \int_{1-}^x |\exp^* d\Pi_2| \]
in Lemma 6.6, we see by integration by parts that
\[ \int_1^x A(t)\, t^{-2}\,dt = - \frac{A(x)}{x} + \int_{1-}^x t^{-1} |\exp^* d\Pi_2|(t) \le \int_{1-}^{\infty} t^{-1} |\exp^* d\Pi_2|(t) < \infty. \]
Then, by the lemma, $N(x) \ll x$. Now assume the second integral condition. By the triangle inequality and Lemma 2.15,
\[ \int_{1-}^{\infty} T^{-1} |\exp^* d\Pi_2| \le \int_{1-}^{\infty} T^{-1} \exp^* |d\Pi_2| = \int_{1-}^{\infty} \exp^* \{ T^{-1} |d\Pi_2| \} = \exp\Big\{ \int_{1-}^{\infty} T^{-1} |d\Pi_2| \Big\}, \]
so we again have $N(x) \ll x$.
For a lower bound we have

Theorem 6.9. Suppose that a g-integer system $\mathcal N$ satisfies $N(x) \gg x$. Then $\liminf_{\varepsilon \to 0+} \varepsilon \zeta(1 + \varepsilon) > 0$. In the other direction, suppose that the lower Chebyshev bound $\psi(x) \gg x$ holds and also that $\varepsilon \zeta(1 + \varepsilon)$ has positive finite bounds from above and below as $\varepsilon \to 0+$. Then $N(x) \gg x$.

Proof. The first assertion is proved using integration by parts, as was done in showing the first part of Lemma 3.3. The argument for the converse statement is essentially that of Theorem 6.5, except that here Corollary 4.10 is used in place of Proposition 4.2 for the logarithmic density bound.
Remarks 6.10. Note that both upper and lower bounds for $\varepsilon \zeta(1 + \varepsilon)$ are conditions here and in the corollary. There is a further asymmetry between this theorem and Theorem 6.5: in the earlier one, we assumed the upper estimate $\pi(x) \ll x/\log x$ and used integration by parts to show $\psi(x) \ll x$; if we had assumed $\pi(x) \gg x/\log x$ here instead of $\psi(x) \gg x$, then we would not have been able to obtain the $\psi(x)$ bound without some further condition, such as $\int_1^x \Pi(t)\, t^{-1}\,dt = o(x)$.

6.3. Sharper criteria for O-density

Here, we show that a reasonably sharp Chebyshev upper bound yields an upper residue for $\zeta(s)$ which, in turn, leads to another O-density result (cf. Example 4.6 and Theorem 5.3).

Proposition 6.11. Suppose that the Chebyshev function $\psi(x)$ of a g-number system $\mathcal N$ satisfies the one-sided inequality
\[ \psi(x) \le x + x/f(x), \qquad x \ge 1, \tag{6.12} \]
where $f(x) \ge 1$ and $\int_2^{\infty} du/\{u f(u) \log u\} < \infty$. Then $\mathcal N$ has O-density. On the other hand, suppose $\psi(x) \ge x + x/f(x)$ holds with $f$ instead satisfying $\int_2^{\infty} du/\{u f(u) \log u\} = \infty$. Then $\mathcal N$ does not have O-density.

Proof. We have for $1 < s \le 2$,
\[ - \frac{\zeta'}{\zeta}(s) = s \int_1^{\infty} u^{-s-1} \psi(u)\,du \le s \int_1^{\infty} u^{-s} \{ 1 + 1/f(u) \}\,du. \]
Thus
\[ - \frac{\zeta'}{\zeta}(s) \le \frac{1}{s-1} + O(1) + 2 \int_2^{\infty} u^{-s} \frac{du}{f(u)}. \]
Then, upon integrating, we have uniformly for $1 < s < 2$,
\begin{align*}
\log \zeta(s) - \log \zeta(2) &= \int_s^2 - \frac{\zeta'}{\zeta}(u)\,du \\
&\le \log \frac{1}{s-1} + O(1) + 2 \int_s^2 \int_2^{\infty} u^{-v} \frac{du}{f(u)}\,dv.
\end{align*}
The double integral is less than
\[ \int_2^{\infty} \frac{du}{f(u)} \int_s^{\infty} u^{-v}\,dv = \int_2^{\infty} u^{-s} \frac{du}{f(u) \log u} < \int_2^{\infty} \frac{du}{u f(u) \log u} \ll 1, \]
whence $\zeta(s) \le C/(s-1)$ for some constant $C > 0$. Thus zeta has a finite upper residue at $s = 1$. Also, we have a Chebyshev bound for the g-number system by hypothesis. It follows that $N(x) \ll x$ by Theorem 6.5.

For the opposite estimate, we argue as follows. The contribution of small primes is irrelevant for O-density (Proposition 6.3), so we can assume that
\[ \psi(x) = 0, \quad 1 \le x < 2, \qquad \psi(x) \ge x + cx/f(x), \quad x \ge 2, \]
for some $c > 0$. Supposing $\int_2^{\infty} du/\{u f(u) \log u\}$ is divergent, our path is first to show that the zeta function does not have a finite upper residue at $s = 1$, and then to apply Proposition 6.4. We have for $s > 1$,
\[ - \frac{\zeta'}{\zeta}(s) = s \int_2^{\infty} u^{-s-1} \psi(u)\,du \ge \int_2^{\infty} u^{-s} \{ 1 + c/f(u) \}\,du. \]
Once more integrating, we have for $1 < s < 2$,
\[ \log \zeta(s) - \log \zeta(2) = \int_s^2 - \frac{\zeta'}{\zeta}(v)\,dv \ge \int_s^2 \int_2^{\infty} u^{-v} \{ 1 + c/f(u) \}\,du\,dv. \]
If we exchange the order of the last integrals and insert a negative term, we find
\begin{align*}
\log \zeta(s) - \log \zeta(2) &\ge \int_2^{\infty} \Big\{ 1 - \frac{1}{u} + \frac{c}{f(u)} \Big\}\,du \int_s^2 u^{-v}\,dv \\
&= \int_2^{\infty} (u^{-s} - u^{-2}) \Big\{ \frac{1 - u^{-1}}{\log u} + \frac{c}{f(u) \log u} \Big\}\,du \\
&\ge \int_1^{\infty} u^{-s} \frac{1 - u^{-1}}{\log u}\,du + \int_2^{\infty} u^{-s} \frac{c\,du}{f(u) \log u} - B
\end{align*}
for some number $B$, uniformly for $1 < s < 2$. The last integral diverges to infinity as $s \to 1+$ by the monotone convergence theorem and the hypothesis, and the next to last integral equals $\log\{s/(s-1)\}$ by Theorem 1.5. It follows that
\[ (s-1)\zeta(s) \gg \exp\Big\{ \int_2^{\infty} u^{-s} \frac{c\,du}{f(u) \log u} \Big\} \to \infty \]
as $s \to 1+$. By the contrapositive form of Proposition 6.4, the g-number system does not have O-density.

Examples 6.12. Critical exponent for O-density. If a g-number system satisfies $\psi(x) - x \le x (\log\log x)^{-1.001}$, then it has O-density. On the other hand, if $\psi(x) - x \ge x (\log\log x)^{-1}$, then there is no O-density.

We next deduce an O-density result from a one-sided Mertens-type hypothesis (see §5.1) by an elementary argument.

Theorem 6.13. Suppose that the Chebyshev function $\psi(x)$ of a g-number system satisfies the one-sided Mertens-type inequality
\[ \psi_1(x) := \int_1^x u^{-1}\,d\psi(u) \le \log x + O(1). \tag{6.13} \]
Then the associated g-integers satisfy $N(x) \ll x$.

Proof. Suppose first that $\psi(x) = 0$ for $x \le A$, where $A \ge 3$ and that
\[ \int_A^x u^{-1}\,d\psi(u) \le \log x - 1, \qquad x > A. \tag{6.14} \]
We shall show in this case that the explicit inequality $N(x) \le x$ holds for all $x \ge 1$. Our argument proceeds by induction on intervals $[A^n, A^{n+1})$. On the initial interval, Chebyshev’s identity gives
\[ (L\,dN)(t) = \log t \, dN(t) = \{ d\psi * dN \}(t) = 0, \qquad 1 \le t < A. \tag{6.15} \]
Thus $dN(t) = 0$ for $1 < t < A$, and since $dN\{1\} = 1$, we have
\[ N(x) = 1 \le x, \qquad 1 \le x < A. \]
Suppose that $N(u) \le u$ is known to hold for $1 \le u < A^n$, and let $A^n \le x < A^{n+1}$. For $t \ge A$ we have $x/t < A^n$, so $N(x/t) \le x/t$ holds by our assumption. Another application of Chebyshev’s identity yields
\[ T(x) := \int_1^x L\,dN = \int_1^x N(x/t)\,d\psi(t) \le \int_A^x \frac{x}{t}\,d\psi(t) \le x (\log x - 1) \tag{6.16} \]
for $A^n \le x < A^{n+1}$. Relation (6.15) together with the last inequality on each interval $[A^i, A^{i+1})$ for $1 \le i \le n$ imply that (6.16) holds on $A \le x < A^{n+1}$. On this interval we have
\begin{align*}
N(x) &= 1 + \int_A^x \frac{dT(u)}{\log u} = 1 + \frac{T(x)}{\log x} + \int_A^x \frac{T(u)}{u \log^2 u}\,du \\
&\le 1 + \frac{x (\log x - 1)}{\log x} + \int_A^x \frac{\log u - 1}{\log^2 u}\,du = 1 + x - \frac{A}{\log A} < x,
\end{align*}
which completes the induction.

To handle the general case, suppose in (6.13) that
\[ \int_1^x u^{-1}\,d\psi(u) \le \log x + B \]
for some constant $B$ and all $x \ge 1$. Choose $A \ge 3$ such that
\[ \int_1^A u^{-1}\,d\psi(u) \ge B + 1. \]
(If such a choice cannot be made, then $\int_1^{\infty} u^{-1}\,d\psi(u)$ is finite and (6.14) holds already for suitable $A$.) Define $\mathcal P_0 := \mathcal P \cap (A, \infty)$ and let $\psi_0$ and $N_0$ denote the Chebyshev and g-integer counting functions associated with $\mathcal P_0$. Now $\psi_0$ satisfies condition (6.14), so $N_0(x) \ll x$ for all $x \ge 1$. By Proposition 6.3, $N(x) \ll x$.
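As an illustration of hypothesis (6.13), one can tabulate $\psi_1(x) - \log x$ for the rational primes, where $\psi_1(x) = \sum_{n \le x} \Lambda(n)/n$ and the classical Mertens estimate gives $\psi_1(x) = \log x - \gamma + o(1)$. The sketch below is only a sanity check for this classical case, with hypothetical helper names; it says nothing about general g-number systems.

```python
import math

def mangoldt_table(limit: int):
    """Von Mangoldt values Lambda(n), 0 <= n <= limit, via a smallest-prime-factor sieve."""
    spf = list(range(limit + 1))
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:                      # p is prime
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0.0] * (limit + 1)
    for n in range(2, limit + 1):
        p, m = spf[n], n
        while m % p == 0:                    # strip all factors p from n
            m //= p
        if m == 1:                           # n is a power of the single prime p
            lam[n] = math.log(p)
    return lam

def psi1_minus_log(x: int) -> float:
    """psi_1(x) - log x for the rational primes."""
    lam = mangoldt_table(x)
    return math.fsum(lam[n] / n for n in range(2, x + 1)) - math.log(x)

diff = psi1_minus_log(100_000)
print(diff)   # stays bounded; close to minus Euler's constant
```

The difference stays bounded (near $-0.577$), so (6.13) holds for the rational primes, consistent with the trivial bound $N(x) = \lfloor x \rfloor \le x$.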
6.4. Notes

§4.6. A small calculation shows that the counting function of Example 4.18 has O-density. Thus we see also that O-density $\not\Rightarrow$ regular growth.

§§6.2, 4.3. In the proof of Theorem 6.9 and also that of Corollary 4.10, both lower bound estimates, we have used an upper bound hypothesis on $\varepsilon \zeta(1 + \varepsilon)$. While some condition is needed, we wonder whether a weaker one would suffice.
https://doi.org/10.1090//surv/213/07
CHAPTER 7
Density of g-integers Will a number system sink if its density is greater than 1? Summary. Density and criteria. More on the residue of ζ(s) at s = 1. Axer’s theorem on convolutions.
7.1. Densities and right hand residues

Recall that a g-number system $\mathcal N$ is said to have density $\delta$ if the counting function of its g-integers satisfies $\lim_{x \to \infty} N(x)/x = \delta$, and to have a right hand residue $r$ if $\lim_{s \to 1+} \{ (s-1)\, \zeta_{\mathcal N}(s) \}$ exists and equals $r$. We begin by giving the simple proof of the remark made at the end of Example 1.2 about evaluating a density that is known to exist.

Lemma 7.1. Let $F \in \mathcal V$ and suppose $F(x)/x \to \delta$ as $x \to \infty$. Then the Mellin transform $\widehat{F}(s)$ converges for $s > 1$ and
\[ \lim_{s \to 1+} (s-1)\, \widehat{F}(s) = \delta. \]
Proof. Using integration by parts,
\[ \widehat{F}(s) := \int_{1-}^{\infty} u^{-s}\,dF(u) = s \int_1^{\infty} u^{-s-1} F(u)\,du \sim \int_1^{\infty} u^{-s-1}\, \delta u \,du = \frac{\delta}{s-1}, \]
as $s \to 1+$. Thus $(s-1)\, \widehat{F}(s) \to \delta$.
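For the rational integers, $F(x) = \lfloor x \rfloor$ has density $\delta = 1$ and its Mellin transform is the Riemann zeta function, so the lemma predicts $(s-1)\zeta(s) \to 1$. The sketch below evaluates $\zeta(s)$ for real $s > 1$ by a partial sum with an Euler-Maclaurin tail correction; the function name and truncation point are illustrative choices.

```python
import math

def zeta_real(s: float, n_terms: int = 100_000) -> float:
    """Riemann zeta for real s > 1: partial sum plus Euler-Maclaurin tail correction."""
    partial = math.fsum(n ** (-s) for n in range(1, n_terms + 1))
    N = float(n_terms)
    # tail sum over n > N: integral term, half-endpoint term, first Bernoulli correction
    tail = N ** (1 - s) / (s - 1) - 0.5 * N ** (-s) + s * N ** (-s - 1) / 12
    return partial + tail

for s in (1.1, 1.01, 1.001):
    print(s, (s - 1) * zeta_real(s))

residue_near_one = (1.001 - 1) * zeta_real(1.001)
```

The printed products decrease toward 1 as $s \to 1+$; the excess over 1 is approximately $\gamma (s-1)$, as the Laurent expansion of $\zeta$ at $s = 1$ suggests.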
The converse of the last lemma is not true: there exist Mellin transforms of nonnegative measures having a right hand residue for which the cumulative function does not have a density. One case is Example 4.18. Another is Example 6.1, where the right hand residue is 0 but there is no density.

7.2. Axer’s Theorem

In the next section and in a later chapter we shall need versions of the following convolution estimate.

Theorem 7.2 (Axer). Let $A(x)$ and $B(x)$ be right continuous functions on $[1, \infty)$, each locally of bounded variation. Assume that $B(x) = o(x)$ and its total variation function satisfies $B_v(x) = O(x)$. Also, assume that $|A(x)| \le A_1(x)$, another right continuous function, with $\int_1^{\infty} A_1(u)\, u^{-2}\,du < \infty$. Finally, assume that either (A) $A_1(x) \uparrow$ or (B) $A_1(x)/x \downarrow$. Then $\int_{1-}^x dA * dB = o(x)$.

Proof. We apply the Dirichlet hyperbola method. Let $K$ be a large positive number and write
\[ \int_{1-}^x dA * dB = \int_{1-}^{x/K} A\Big( \frac{x}{t} \Big)\,dB(t) + \int_{1-}^{K} B\Big( \frac{x}{t} \Big)\,dA(t) - A(K)\, B\Big( \frac{x}{K} \Big) =: I + II - III, \]
say. In Case (A), we note first that $A_1(x) = o(x)$, by the first step in the proof of Lemma 6.6. Next, suppose that $B_v(t) \le Mt$ for all $t \ge 1$. We have, by Lemma 2.1,
\[ |I| \le \int_{1-}^{x/K} A_1\Big( \frac{x}{t} \Big)\,dB_v(t) \le M \int_{1-}^{x/K} A_1\Big( \frac{x}{t} \Big) (\delta_1 + dt) \]
since $t \mapsto A_1(x/t)$ is positive, left continuous, and decreasing. Thus
\[ |I| \le M A_1(x) + M \int_{1-}^{x/K} A_1\Big( \frac{x}{t} \Big)\,dt, \]
and if we change the variable in the last integral and extend the range of integration, we obtain
\[ |I| \le o(x) + M x \int_K^{\infty} A_1(u)\, u^{-2}\,du. \]
Choosing $K$ large makes the last integral small; given $\varepsilon > 0$, we have $|I| < \varepsilon x$ for a large $K$. With this choice of $K$, we also have
\[ |II| \le \int_{1-}^{K} o(x/t)\,dA_v(t) = o(x), \qquad x \to \infty. \]
(Here $A_v(x)$ is the total variation function of $A(x)$; it is not connected with $A_1(x)$.) Also $III = O(1)\, o(x/K) = o(x)$, and thus $\int_{1-}^x dA * dB = o(x)$ holds in Case (A).

In Case (B), we have $II + III = o(x)$ as before, for any fixed positive $K$. It remains to show $I = o(x)$. The assumption here is that $A_2(x) := A_1(x)/x \downarrow$, and the integral hypothesis can be restated as $\int_1^{\infty} A_2(u)\, u^{-1}\,du < \infty$. It follows that
\[ \frac{1}{2} A_2(x) \log x \le \int_{\sqrt{x}}^{x} t^{-1} A_2(t)\,dt = o(1), \qquad x \to \infty, \]
which implies that $A_2(x) = o(\log^{-1} x)$. Now, integrating by parts we find
\begin{align*}
|I| &\le \int_{1-}^{x/K} \frac{x}{t} A_2\Big( \frac{x}{t} \Big)\,dB_v(t) \\
&= B_v\Big( \frac{x}{K} \Big) K A_2(K) - x \int_1^{x/K} B_v(t) \Big\{ - t^{-2} A_2\Big( \frac{x}{t} \Big)\,dt + t^{-1}\,d_t A_2\Big( \frac{x}{t} \Big) \Big\}.
\end{align*}
We can drop the last term, since $t \mapsto A_2(x/t)$ is increasing. Using the upper estimate for $B_v$, we find
\[ |I| \le M x A_2(K) + x \int_1^{x/K} M t \, t^{-2} A_2\Big( \frac{x}{t} \Big)\,dt = O\Big( \frac{x}{\log K} \Big) + M x \int_1^{x/K} t^{-1} A_2\Big( \frac{x}{t} \Big)\,dt. \]
We change the integration variable in the last integral, obtaining
\[ \int_K^x A_2(u)\, u^{-1}\,du \le \int_K^{\infty} A_2(u)\, u^{-1}\,du. \]
Thus $I/x$ is arbitrarily small for sufficiently large $K$. With such a choice of $K$, each of $I$, $II$, $III = o(x)$, and so $\int_{1-}^x dA * dB = o(x)$ in this case as well.

7.3. Criteria for density

Our first result is a direct refinement of Theorem 6.5.

Theorem 7.3. If $N(x) \sim Ax$, then the associated zeta function has a right hand residue $A$ at $s = 1$. Conversely, if zeta has a right hand residue $A$ at $s = 1$ and the counting function of g-primes satisfies the PNT, then $N(x) \sim Ax$.

The first assertion can be deduced from Proposition 5.1 or directly by integration by parts of the Mellin integral for $\zeta(s)$. For the second assertion, replace the O-bounds of Theorem 6.5 with asymptotic estimates. (In place of Proposition 4.2, use Proposition 5.1.)

Example 7.4. A set of EVEN primes. Let $\mathcal P := \{p + 1\}$ with $\{p\}$ the rational primes, i.e. $\mathcal P = \{3, 4, 6, 8, 12, \dots\}$. We show that the associated set of g-integers $\mathcal N$ has a density and that it equals $6/\pi^2$. Clearly the PNT holds for $\mathcal P$. To apply the preceding theorem, it suffices to show that $\zeta_{\mathcal P}(s)$, the zeta function of $\mathcal P$, has a right hand residue of $6/\pi^2$ at $s = 1$. Recall that $(s-1)\zeta_0(s) \sim 1$ as $s \to 1+$ for $\zeta_0(s)$ the Riemann zeta function. Now
\[ \frac{\zeta_{\mathcal P}(s)}{\zeta_0(s)} = \prod_p \frac{1 - p^{-s}}{1 - (p+1)^{-s}} = \prod_p \Big\{ 1 - \frac{p^{-s} - (p+1)^{-s}}{1 - (p+1)^{-s}} \Big\}, \]
and for $1 < s < 2$,
\[ p^{-s} - (p+1)^{-s} \ll p^{-s-1} \ll p^{-2}. \]
Thus the last product is absolutely and uniformly convergent for $1 < s < 2$. It follows that we can take the limit pointwise in the product, and the density exists. We find its value to be
\[ \lim_{s \to 1+} (s-1)\zeta_{\mathcal P}(s) = \prod_p \Big\{ 1 - \frac{p^{-1} - (p+1)^{-1}}{1 - (p+1)^{-1}} \Big\} = \prod_p (1 - p^{-2}) = \frac{1}{\zeta_0(2)} = \frac{6}{\pi^2}. \]
An immediate consequence of the second assertion of the last theorem is

Corollary 7.5. Suppose $\mathcal N$ is a g-number system having logarithmic density and for which the PNT holds. Then $\mathcal N$ has density.

The last theorem is not of the form “if and only if,” because we assumed the PNT for one implication. Example 4.18 shows that the theorem is false without this condition. Recall that we had
\[ \Pi(x) := \sum_{1 \le k \le \log x/(\log 2)} 2^k / k. \]
The PNT does not hold here, since $\Pi(2^n) - \Pi(2^n - 1) = 2^n/n \gg \Pi(2^n)$.
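The size of the jumps of $\Pi$ in Example 4.18 can be tabulated directly. The sketch below assumes the definition $\Pi(2^n) = \sum_{k \le n} 2^k/k$ recalled above and computes the ratio of the jump at $2^n$ to $\Pi(2^n)$; the function names and sample values of $n$ are illustrative.

```python
def Pi_at_power_of_two(n: int) -> float:
    """Pi(2^n) = sum of 2^k / k over 1 <= k <= n (Example 4.18)."""
    return sum(2.0 ** k / k for k in range(1, n + 1))

def jump_ratio(n: int) -> float:
    """Jump of Pi at the point 2^n, divided by Pi(2^n)."""
    return (2.0 ** n / n) / Pi_at_power_of_two(n)

ratios = [jump_ratio(n) for n in (10, 20, 40, 80)]
print(ratios)
```

The ratios increase toward $1/2$, so a single point carries a fixed positive proportion of $\Pi(x)$; this is incompatible with a smooth asymptotic of the form $x/\log x$.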
Also, since
\[ \frac{s-1}{1 - 2^{1-s}} \to \frac{1}{\log 2}, \qquad s \to 1+, \]
$\zeta(s)$ has a right hand residue at $s = 1$. On the other hand, we showed that $N(x)/x$ has no limit as $x \to \infty$.

Is there a variant of Theorem 7.3 that is valid without assuming the PNT? The answer is Yes, if we assume a little more than just the existence of a right hand residue of $\zeta(s)$. This residue can be represented as the limit as $s \to 1+$ of $\zeta(s)(s-1)/s$, and the last function is representable as the Mellin transform
\[ \widehat{F}(s) := \frac{(s-1)\zeta(s)}{s} = \int_{1-}^{\infty} u^{-s} \{ dN * (\delta_1 - du/u) \}. \]
This observation and (3.11) immediately give the desired criterion for density:

Proposition 7.6. A g-number system $\mathcal N$ has a density if and only if $\widehat{F}(s)$, given above, converges at $s = 1$, i.e.
\[ \lim_{x \to \infty} \int_{1-}^x u^{-1} \{ dN * (\delta_1 - du/u) \} \]
exists. If it exists, the density equals $\widehat{F}(1)$.

A direct proof consists in noting that the last integral equals
\[ \int_{1-}^x \frac{dN(u)}{u} - \int_1^x \frac{N(u)\,du}{u^2} = \frac{N(x)}{x} \]
by integration by parts.

Another density criterion involving an integral of $N(x)$ is the following.

Proposition 7.7. Suppose $N(x)$ is the counting function of a g-number system $\mathcal N$ and $A$ is a positive number. Then
\[ I_1 := \int_{1-}^{\infty} \frac{dN(t) - A\,dt}{t} \ \text{exists} \iff I_2 := \int_1^{\infty} \frac{N(t) - At}{t^2}\,dt \ \text{exists}. \]
If either integral exists, $\mathcal N$ has density $A$.

Proof. Suppose first that $I_1$ exists, i.e.
\[ F(x) := \int_{1-}^x \frac{dN(u) - A\,du}{u} = L + o(1), \qquad x \to \infty. \]
Then
\begin{align*}
N(x) - Ax + A &= \int_{1-}^x u\,dF(u) = x F(x) - \int_1^x F(u)\,du \\
&= x \{ L + o(1) \} - \int_1^x \{ L + o(1) \}\,du = o(x).
\end{align*}
Thus $N(x) \sim Ax$ and $\mathcal N$ has density. Integrating by parts, we have
\[ \int_{1-}^x \frac{N(u) - Au}{u^2}\,du = -A - \frac{N(x) - Ax}{x} + \int_{1-}^x \frac{dN(u) - A\,du}{u}. \tag{7.1} \]
As $x \to \infty$, the last integral converges by hypothesis and $\{N(x) - Ax\}/x \to 0$ since $N(x) \sim Ax$. Thus $I_2$ exists as well.
Next, suppose that $I_2$ exists. Let $0 < \varepsilon < 1$. We have
\[ \frac{\varepsilon}{1 + \varepsilon} \frac{N(x)}{x} \le \int_x^{x + \varepsilon x} \frac{N(t) - At}{t^2}\,dt + A \log(1 + \varepsilon). \]
The integral goes to 0 as $x \to \infty$, and so
\[ \frac{\varepsilon}{1 + \varepsilon} \limsup_{x \to \infty} \frac{N(x)}{x} \le A \log(1 + \varepsilon). \]
Letting $\varepsilon \to 0$, we find $\limsup_{x \to \infty} N(x)/x \le A$. Similarly, we deduce from
\[ \frac{\varepsilon}{1 - \varepsilon} \frac{N(x)}{x} \ge \int_{x - \varepsilon x}^{x} \frac{N(t) - At}{t^2}\,dt + A \log \frac{1}{1 - \varepsilon} \]
that $\liminf_{x \to \infty} N(x)/x \ge A$. Thus $N(x)/x \to A$ as $x \to \infty$. Appealing again to (7.1), we see that $I_1$ exists as well.

Does Proposition 7.7 have a converse: if $\mathcal N$ is a g-number system having density $A$, does it follow that $I_1$ and $I_2$ exist? The answer is No. We shall show in Proposition 14.6 that the continuous g-number system with prime measure
\[ d\Pi(u) := \frac{1 - u^{-1}}{\log u}\,du + \Big( \frac{1 - u^{-1}}{\log u} \Big)^2 du \]
has g-integer counting function
\[ N(x) = 4x - \frac{4x}{\log x} + O\Big( \frac{x \log\log x}{\log^2 x} \Big). \]
It is clear that this system has density $A = 4$. On the other hand, the variant of $I_2$ having lower integration limit 10 satisfies
\[ \int_{10}^{\infty} \frac{N(t) - At}{t^2}\,dt = \int_{10}^{\infty} \Big\{ \frac{-4}{t \log t} + O\Big( \frac{\log\log t}{t \log^2 t} \Big) \Big\}\,dt = -\infty. \]

Theorem 7.8. Say that the function $\Pi(x)$ of a g-number system $\mathcal N$ can be written as a sum $\Pi_1(x) + \Pi_2(x)$, where $\Pi_1(x) \uparrow$, $\Pi_1(x) \sim x/\log x$, and
\[ \int_1^{\infty} x^{-s}\,d\Pi_1(x) - \log \frac{1}{s-1} \to \log A \]
as $s \to 1+$. Further, suppose that $\Pi_2(x)$ satisfies the conditions of Lemma 6.6. Then $\mathcal N$ has density $A \int_{1-}^{\infty} T^{-1} \exp^* d\Pi_2$.

Proof. By (3.1) and Proposition 3.10, we have $dN = \exp^* d\Pi = dN_1 * \exp^* d\Pi_2$ with $dN_1 := \exp^* d\Pi_1$. By Theorem 7.3,
\[ N_1(x) := \int_{1-}^x \exp^* d\Pi_1 \sim Ax. \]
Subtracting $A(\delta_1 + dt)$ from $dN_1$ and adding the same, we find
\[ N(x) = \int_{1-}^x \frac{Ax}{t} (\exp^* d\Pi_2)(t) + \int_{1-}^x \exp^* d\Pi_2 * (dN_1 - A\delta_1 - A\,dt). \tag{7.2} \]
We apply Axer’s Theorem to the second integral in this formula with
\[ dA := \exp^* d\Pi_2, \qquad dB := dN_1 - A\delta_1 - A\,dt. \]
We have $|A(x)| \le A_1(x)$, where $A_1(x)$ satisfies the conditions (6.10), and by Theorem 7.3,
\[ B(x) = N_1(x) - Ax = o(x), \qquad B_v(x) \le N_1(x) + Ax \ll x. \]
Thus $\int_{1-}^x dA * dB = o(x)$. Now the first term in (7.2) is asymptotic to
\[ Ax \int_{1-}^{\infty} T^{-1} \exp^* d\Pi_2 =: A' x, \]
by Lemma 6.6, and hence $\mathcal N$ has density $A'$.

We have, as analogues of Corollary 6.8,

Corollary 7.9. Suppose that $\Pi(x) = \Pi_1(x) + \Pi_2(x)$, with $\Pi_1(x)$ as in the last theorem. In place of (6.10) assume that
\[ \int_{1-}^{\infty} T^{-1} |\exp^* d\Pi_2| < \infty \]
or that
\[ \int_{1-}^{\infty} T^{-1} |d\Pi_2| < \infty. \]
In either case, $\mathcal N$ has a density as before.

A result that is related to but simpler than Theorem 7.8 is

Proposition 7.10. Suppose that the integer counting function $N(x)$ of a g-number system $\mathcal N$ satisfies
\[ \int_1^{\infty} \sup_{y \le x} \Big| N(y) - \int_1^y N(u)\, u^{-1}\,du \Big| \frac{dx}{x^2} < \infty. \]
Then $\mathcal N$ has a density.

Proof. Write $dN = \exp^* d\Pi = \exp^* d\Pi_c * \exp^* d\Pi_2$, where, as usual, $d\Pi_c = (1 - t^{-1})\,dt/\log t$. We have, from (3.10),
\[ N(x) - \int_1^x N(u)\, u^{-1}\,du = \int_{1-}^x \exp^* (d\Pi - d\Pi_c) = \int_{1-}^x \exp^* d\Pi_2. \]
If we take
\[ A(x) := \sup_{y \le x} \Big| N(y) - \int_1^y \frac{N(u)}{u}\,du \Big|, \]
then $A(x)$ is nondecreasing and satisfies (6.10). Now,
\[ N(x) = \int_{1-}^x \Big\{ \int_{1-}^{x/t} \exp^* d\Pi_c \Big\} (\exp^* d\Pi_2)(t) = x \int_{1-}^x T^{-1} \exp^* d\Pi_2, \]
and the last integral converges as $x \to \infty$ by Lemma 6.6. Thus $\mathcal N$ has a density, and it equals $\int_{1-}^{\infty} T^{-1} \exp^* d\Pi_2$.
7.4. An $L^1$ criterion for density
In this section we show that a g-number system satisfying an $L^1$ condition on $\Pi(t) - t/\log t$ has density.
Theorem 7.11. Suppose that $\mathcal N$ is a g-number system for which
\[ \int_{2}^{\infty} t^{-2}\,|\Pi(t) - t/\log t|\,dt < \infty. \tag{7.3} \]
Then the g-integers of $\mathcal N$ have density.
Without loss of generality, we can replace (7.3) by
\[ \int_{1}^{\infty} t^{-2}\,|\Pi(t) - \Pi_c(t)|\,dt < \infty \tag{7.4} \]
where
\[ \Pi_c(t) = \int_{1}^{t} \frac{1-u^{-1}}{\log u}\,du, \tag{1.9 bis} \]
since
\[ \int_{2}^{\infty} t^{-2}\,|\Pi_c(t) - t/\log t|\,dt \ll \int_{2}^{\infty} \frac{dt}{t\log^{2} t} < \infty. \]
The key step in the proof is the following lemma, showing that $\Pi$ can be decomposed into two parts that come within a prescribed $\varepsilon$ of fulfilling the conditions of Theorem 7.8. The proof will be concluded by applying the lemma iteratively.
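Lemma 7.12. Suppose $\Pi$ satisfies condition (7.4). Then there exist constants $B$ (depending on $\Pi$) and $C$ (absolute) such that for any $\varepsilon \in (0, 1)$ there is a decomposition $\Pi = \Pi_1 + \Pi_2$, where $\Pi_1 \uparrow$,
\[ 1 - \varepsilon < \Pi_1(x)/\Pi_c(x) < 1 + \varepsilon, \qquad 1 \le x < \infty, \tag{7.5} \]
\[ \int_{1}^{\infty} t^{-2}\,|\Pi_1(t) - \Pi_c(t)|\,dt \le B < \infty, \tag{7.6} \]
and
\[ \int_{4}^{\infty} t^{-1}\,|d\Pi_2(t)| \le C/\varepsilon^{2} < \infty. \tag{7.7} \]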
Proof. Let $x_0 \ge 1$ be a point at which $|\Pi(x_0) - \Pi_c(x_0)| \ge \varepsilon\Pi_c(x_0)$. If no such point $x_0$ exists, then the choice $\Pi_1 := \Pi$ and $\Pi_2 := 0$ obviously suffices. Suppose that $\Pi(x_0) \ge (1+\varepsilon)\Pi_c(x_0)$. Let $I = I(x_0) = [a, b]$ denote the maximal interval containing $x_0$ such that $\Pi(x) \ge (1 + \varepsilon/3)\Pi_c(x)$ holds for all $x \in I$. This interval is closed at $a$ by the right-continuity of $\Pi - (1 + \varepsilon/3)\Pi_c$, and it is closed at $b$ because $\Pi(b) = (1 + \varepsilon/3)\Pi_c(b)$. It is bounded because of condition (7.4). Next suppose that $\Pi(x_0) \le (1 - \varepsilon)\Pi_c(x_0)$. Let $J = J(x_0)$ denote the maximal interval containing $x_0$ such that $\Pi(x) < (1 - \varepsilon/3)\Pi_c(x)$ holds for all $x \in J$. Reasoning similar to that for $I$ shows that $J$ is bounded and open. Write $J = (c, d)$. We shall call points of each interval $I$ or $J$ "$\varepsilon$-bad," or, briefly, "bad," and those of the complementary intervals of $[1, \infty)$ are called "$\varepsilon$-good" or just "good." (Note that points of an interval on which $|\Pi(x) - \Pi_c(x)| > \varepsilon\Pi_c(x)/3$ holds are not bad unless there exists some point $x_0$ of the interval at which $|\Pi(x_0) - \Pi_c(x_0)| \ge \varepsilon\Pi_c(x_0)$.)
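For $x \ge 1$, set
\[ \Pi_1(x) = \begin{cases} \Pi(x), & x \text{ good},\\ (1 + \varepsilon/3)\Pi_c(x), & x \in \cup I,\\ (1 - \varepsilon/3)\Pi_c(x), & x \in \cup J, \end{cases} \]
and
\[ \Pi_2(x) = \begin{cases} 0, & x \text{ good},\\ \Pi(x) - (1 + \varepsilon/3)\Pi_c(x), & x \in \cup I,\\ \Pi(x) - (1 - \varepsilon/3)\Pi_c(x), & x \in \cup J. \end{cases} \]
By construction, $\Pi_1$ is nondecreasing and it satisfies (7.5). Also, we have
\[ |\Pi_1(x) - \Pi_c(x)| \le |\Pi(x) - \Pi_c(x)|, \qquad 1 \le x < \infty, \]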
and so $\Pi_1$ satisfies (7.6). It remains to prove (7.7); to do this, we apply an argument of a tauberian character in which we show that each bad interval is reasonably long. Separate but analogous arguments are needed for the $I$ and $J$ cases. We treat $I$ in detail and sketch the argument for $J$.
The behavior of $d\Pi_2$ on $I = [a, b]$ is as follows.
\[ 0 \le d\Pi_2\{a\} = \Pi(a) - (1 + \varepsilon/3)\Pi_c(a) \le \Pi(a) - \Pi(a-) = d\Pi\{a\}, \]
\[ d\Pi_2 = d\Pi - (1 + \varepsilon/3)\,d\Pi_c, \qquad a < x < b, \]
\[ d\Pi_2\{b\} = 0. \]
Our plan for estimating the integral over $I$ (recognizing a possible jump of $\Pi_2$ at $a$) is first to show that
\[ \int_I T^{-1}|d\Pi_2| = \int_{a-}^{b} t^{-1}|d\Pi_2(t)| \le 3\Pi(b)/b + 3\int_I \Pi(t)t^{-2}\,dt, \tag{7.8} \]
and then that each of the last two terms is bounded by a constant multiple of $\varepsilon^{-2}\int_I |\Pi(t) - \Pi_c(t)|\,t^{-2}\,dt$.
For $a \le t \le b$ we have
\[ \Pi(a-) - \Pi_c(a) \le \frac{\varepsilon}{3}\Pi_c(a) \le \frac{\varepsilon}{3}\Pi_c(t) \le \Pi(t) - \Pi_c(t). \]
It follows by integration by parts that
\[ \int_I T^{-1}(d\Pi - d\Pi_c) = \int_{a-}^{b} t^{-1}(d\Pi - d\Pi_c)(t) \ge 0, \]
and thus
\[ \int_I T^{-1}|d\Pi_2| \le (1 + \varepsilon/3)\int_I T^{-1}\,d\Pi_c + \int_I T^{-1}\,d\Pi \le \frac{7}{3}\int_I T^{-1}\,d\Pi. \]
If we integrate the last expression by parts, drop the negative term, and change 7/3 to 3, we get (7.8).
To estimate $\Pi(b)/b$, we first show the interval $I = [a, b]$ is rather long. Indeed, since $a \le x_0 < b$ and $\Pi \uparrow$, we have
\[ \Pi(y) \ge \Pi(x_0) \ge (1 + \varepsilon)\Pi_c(x_0) \quad\text{for } y \ge x_0. \]
The concavity of $\Pi_c$ implies that
\[ \frac{\Pi_c(y) - \Pi_c(x_0)}{y - x_0} \le \frac{\Pi_c(x_0) - 0}{x_0 - 1}. \]
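It follows that
\[ \Pi(y) \ge (1 + \varepsilon)\,\frac{x_0 - 1}{y - 1}\,\Pi_c(y) > (1 + \varepsilon/3)\Pi_c(y) \]
for $y < 1 + (x_0 - 1)(1 + \varepsilon)/(1 + \varepsilon/3)$. Therefore if $x_0 \ge 3$, we must have
\[ b \ge 1 + (x_0 - 1)\,\frac{1 + \varepsilon}{1 + \varepsilon/3} \ge x_0(1 + \varepsilon/3), \]
or $b \ge (1 + \varepsilon/3)a$. Now
\[ \int_a^b |\Pi(t) - \Pi_c(t)|\,\frac{dt}{t^{2}} \ge \frac{\varepsilon}{3}\int_{b/(1+\varepsilon/3)}^{b} \Pi_c(t)\,\frac{dt}{t^{2}} \ge \frac{\varepsilon^{2}}{9}\,\Pi_c\Bigl(\frac{b}{1 + \varepsilon/3}\Bigr)b^{-1} \ge \frac{\varepsilon^{2}}{9b}\,\Pi_c(3b/4), \]
and for $b \ge 4$, the concavity of $\Pi_c$ and the preceding inequalities yield
\[ \frac{\Pi(b)}{b} = \Bigl(1 + \frac{\varepsilon}{3}\Bigr)\frac{\Pi_c(b)}{b} \le \frac{(1 + \varepsilon/3)(b - 1)}{b\,(3b/4 - 1)}\,\Pi_c(3b/4) \le \frac{2\Pi_c(3b/4)}{b} \le \frac{18}{\varepsilon^{2}}\int_a^b t^{-2}\,|\Pi(t) - \Pi_c(t)|\,dt. \]
Also, for $a \le t \le b$ we have
\[ \Pi(t) \le \frac{1 + \varepsilon/3}{\varepsilon/3}\,\{\Pi(t) - \Pi_c(t)\}, \]
and hence
\[ \int_I t^{-2}\,\Pi(t)\,dt \le \frac{4}{\varepsilon}\int_I t^{-2}\,|\Pi(t) - \Pi_c(t)|\,dt. \]
If we insert the preceding estimates into (7.8), we get
\[ \int_I T^{-1}|d\Pi_2| \le \frac{66}{\varepsilon^{2}}\int_a^b t^{-2}\,|\Pi(t) - \Pi_c(t)|\,dt, \]
provided that $a \ge 4$.
Arguing similarly with intervals $J = (c, d)$, we show this time that the interval is long by moving from $x_0$ to the left. We find that
\[ \int_J T^{-1}|d\Pi_2| \le \frac{24}{\varepsilon^{2}}\int_c^d t^{-2}\,|\Pi(t) - \Pi_c(t)|\,dt \]
provided that $c \ge 3$. Also, $\Pi_2 \equiv 0$ on the set $S$ of good points, that is, the complement of $(\cup I) \cup (\cup J)$, and hence, trivially,
\[ \int_S T^{-1}|d\Pi_2| = 0 \le \int_S t^{-2}\,|\Pi(t) - \Pi_c(t)|\,dt. \]
Combining the three estimates for $\int T^{-1}|d\Pi_2|$, we conclude that (7.7) holds with $C = 66$.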
Proof of the theorem. Let $\varepsilon_n = 1/n$, $n = 1, 2, 3, \ldots$. Take $x_1 = 1$ and for $n = 2, 3, \ldots$ choose a strictly monotonic sequence $\{x_n\}$ such that (i) $x_n$ is $1/(n + 1)$ good and (ii) $x_n$ is so large that
\[ \int_{x_n}^{\infty} T^{-1}|d\Pi_{1/n,2}| < 2^{-n}, \]
where $\Pi_{1/n,2}$ is as in the preceding lemma. Define functions $\Pi_1$ and $\Pi_2$ on $[1, \infty)$ by taking $\Pi_1(x) = \Pi(x)$ and $\Pi_2(x) = 0$ for $1 \le x < x_2$; and for $n \ge 2$, take $\Pi_1(x) = \Pi_{1/n,1}(x)$ and $\Pi_2(x) = \Pi_{1/n,2}(x)$ for $x_n \le x < x_{n+1}$.
By construction, we have $\Pi = \Pi_1 + \Pi_2$, $\Pi_1 \uparrow$, $\Pi_1 \sim x/\log x$ as $x \to \infty$, and
\[ \int_{1}^{\infty} T^{-1}|d\Pi_2| \le \int_{1}^{x_2} T^{-1}|d\Pi_2| + \sum_{n=2}^{\infty} 2^{-n} < \infty. \tag{7.9} \]
Finally, for $s > 1$, write
\[ \int_{1}^{\infty} t^{-s}\,d\Pi_1(t) - \log\frac{s}{s-1} = \int_{1}^{\infty} x^{-s}(d\Pi - d\Pi_c)(x) - \int_{1}^{\infty} x^{-s}\,d\Pi_2(x). \]
The last integral has a limit as $s \to 1+$ by (7.9). Integration by parts gives
\[ \int_{1}^{\infty} x^{-s}(d\Pi - d\Pi_c)(x) = s\int_{1}^{\infty} x^{-s-1}\{\Pi(x) - \Pi_c(x)\}\,dx, \]
and this integral also tends to a limit as $s \to 1+$ by the hypothesis of the theorem and the fact that $\Pi_c(x) - x/\log x \ll x/\log^{2} x$. Now the hypotheses of Corollary 7.9 are satisfied, and hence $\mathcal N$ has density.
7.5. Estimates of $N(x)$ with an error term
In the last section we showed that a g-integer system has density if an integral involving $\Pi(t) - t/\log t$ converges. Here we show how error bounds assumed for $\Pi(t)$ lead to corresponding estimates of $N(x)$.
Theorem 7.13. Suppose there exists a positive number $A$ such that
\[ \int_{1}^{x} \frac{d\Pi(t)}{t} = \int_{1}^{x} \frac{(1 - t^{-1})\,dt}{t\log t} + \log A + E(x) \tag{7.10} \]
where $E(x)$ is an error term.
(A) If $E(x) = O(\log^{-\alpha} x)$ for some $\alpha > 2$, then $N(x) = Ax + O(x\log^{2-\alpha} x)$.
(B) If $E(x) = O(\exp\{-\log^{a} x\})$ for some $a \in (0, 1)$, then
\[ N(x) = Ax + O\bigl(x\exp\{-(\log x\,\log\log x)^{a'}\}\bigr), \qquad a' = a/(1 + a). \]
Let
\[ \Pi_c(x) := \int_{1}^{x} \frac{1 - t^{-1}}{\log t}\,dt, \qquad \lambda(x) := \int_{1}^{x} T^{-1}\,d\Pi_c = \int_{1}^{x} \frac{1 - t^{-1}}{t\log t}\,dt, \]
\[ \nu_j(x) := \int_{1-}^{x} T^{-1}(d\Pi - d\Pi_c - \{\log A\}\delta_1)^{*j}, \qquad j \ge 1, \]
and $\nu(x) := \nu_1(x)$. Note that $d\nu_j = d\nu^{*j}$,
\[ \int_{1}^{x} |d\nu| \le 2\lambda(x) + \nu(x) + 2|\log A|, \]
and
\[ \lambda(x) \le \int_{1}^{e} dt/t + \int_{e}^{x} dt/(t\log t) = 1 + \log\log x \quad\text{for } x \ge e. \]
We have (cf. (3.11))
\[ dN = \exp^{*} d\Pi = A\,\delta_1 * (\delta_1 + dt) * \exp^{*}(d\Pi - d\Pi_c - \{\log A\}\delta_1), \]
so, by Lemma 3.8,
\[ N(x) = A\int_{1-}^{x} (\delta_1 + dt) * \exp^{*}(T\,d\nu) = A\int_{1-}^{x} \frac{x}{t}\,T\exp^{*} d\nu = Ax\int_{1-}^{x} \exp^{*} d\nu. \]
Thus
\[ |N(x) - Ax| \le Ax\sum_{j=1}^{\infty} |\nu_j(x)|/j!, \]
and we shall show the last expression to be suitably small.
The key to proving each assertion is an estimate of $|\nu_j(x)|$. For (A) one shows that there exist constants $A_0$ and $A_1$ such that for all $x, j \ge 1$,
\[ |\nu_j(x)| \le jA_0(2\log\log 3x + A_1)^{j-1}\log^{-\alpha} x. \tag{7.11} \]
The proof of this inequality and its application are quite similar to that of (B) and are given in [Bzd99]. Here we establish the analogue of (7.11) and prove (B).
Lemma 7.14. Suppose that for some $b \ge 0$ and $a \in (0, 1)$
\[ |\nu(x)| \le A_0\log^{b} ex\,\exp\{-(\log x)^{a}\}, \qquad 1 \le x < \infty. \]
Then there exists a constant $A_1$ such that for all $x, j \ge 1$
\[ |\nu_j(x)| \le jA_0(2\lambda(x) + A_1)^{j-1}\log^{b} ex\,\exp\{-(j^{-1}\log x)^{a}\}. \tag{7.12} \]
The factor $\log^{b} ex$ will be used in Corollary 7.15 below.
Proof. The case $j = 1$ is true. Suppose that the $j$th case holds and set $y = x^{1/(j+1)}$ and $z = x/y$. By the Dirichlet hyperbola method,
\[ \nu_{j+1}(x) = \iint_{st \le x} d\nu^{*j}(s)\,d\nu(t) = \int_{1}^{y} \nu_j(x/t)\,d\nu(t) + \int_{1}^{z} \nu(x/t)\,d\nu_j(t) - \nu(y)\nu_j(z) = I + II - III, \]
say. Now
\[ |I| \le \int_{1}^{y} jA_0(2\lambda(x/t) + A_1)^{j-1}\log^{b}(ex/t)\exp\{-(j^{-1}\log(x/t))^{a}\}\,|d\nu|(t) \]
\[ \le jA_0(2\lambda(x) + A_1)^{j-1}\log^{b}(ex)\exp\{-(j^{-1}\log z)^{a}\}\int_{1}^{y} |d\nu| \]
\[ \le jA_0(2\lambda(x) + A_1)^{j-1}\log^{b}(ex)\exp\{-(\{j+1\}^{-1}\log x)^{a}\}\times(2\lambda(x) + B + 2|\log C|), \]
where $B = \sup|\nu(x)| < \infty$. Also,
\[ |II| \le \int_{1}^{z} A_0\log^{b}(ex/t)\exp\{-(\log x/t)^{a}\}\,|d\nu|^{*j}(t) \le A_0\log^{b} ex\,\exp\{-(\log x/z)^{a}\}\Bigl(\int_{1}^{x} |d\nu|\Bigr)^{j} \]
\[ \le A_0\log^{b} ex\,\exp\{-(\{j+1\}^{-1}\log x)^{a}\}(2\lambda(x) + B + 2|\log C|)^{j}, \]
and
\[ |III| \le BjA_0(2\lambda(x) + A_1)^{j-1}\log^{b} ex\,\exp\{-(\{j+1\}^{-1}\log x)^{a}\}. \]
Thus
\[ |\nu_{j+1}(x)| \le (j+1)A_0(2\lambda(x) + A_1)^{j}\log^{b} ex\,\exp\{-(\{j+1\}^{-1}\log x)^{a}\} \]
with $A_1 = 2B + 2|\log C|$.
Proof of the theorem. We have
\[ \frac{|N(x) - Ax|}{Ax} \le \sum_{j=1}^{\infty} |\nu_j(x)|/j! = \sum_{1}^{K} + \sum_{K+1}^{\infty}, \]
say, with $K$ to be determined, and with the common summand $A_0(2\lambda(x) + A_1)^{j-1}\exp\{-(j^{-1}\log x)^{a}\}/(j-1)!$.
To estimate $\sum_{1}^{K}$, we first give an upper bound for the logarithm of $\exp\{-(j^{-1}\log x)^{a}\}/(j-1)!$ over all $j$. Recalling the crude lower bound $j! \ge j^{j}e^{-j}$, it suffices to asymptotically approximate the larger and simpler function
\[ F(u) := -\frac{\log^{a} x}{u^{a}} - (u - 1)\log u + u \]
for large $x$. We have
\[ u^{a+1}F'(u) = a\log^{a} x - u^{a+1}(\log u - 1/u). \]
For large $x$, there is a unique zero $U$, for which
\[ U^{a+1}\log U - U^{a} = a\log^{a} x. \tag{7.13} \]
We have
\[ F(U) = \frac{-\log^{a} x - U^{a}(U - 1)\log U + U^{a+1}}{U^{a}} \sim \frac{-\log^{a} x - U^{a+1}\log U}{U^{a}}, \]
and inserting (7.13) into the last formula yields
\[ F(U) \sim \frac{-\log^{a} x - a\log^{a} x - U^{a}}{U^{a}} \sim \frac{-(a + 1)\log^{a} x}{U^{a}}. \]
If we drop the smaller term in (7.13), and set $a' = a/(a + 1)$, we obtain
\[ U^{a} \sim a^{a'}(\log x)^{aa'}/(\log U)^{a'}. \]
Then taking logarithms in (7.13), we find that $\log U \sim a'\log\log x$. The last three asymptotic formulas together give
\[ F(U) \sim -(a + 1)^{1/(a+1)}(\log x\,\log\log x)^{a'}. \]
Now, for any fixed arbitrarily small positive number $\varepsilon$, there exists $x_0 = x_0(\varepsilon)$ such that for all $x \ge x_0$ and all $j \ge 1$,
\[ \frac{\exp\{-(j^{-1}\log x)^{a}\}}{(j-1)!} \le \exp\{(\varepsilon - (a + 1)^{1/(a+1)})(\log x\,\log\log x)^{a'}\}. \tag{7.14} \]
Let $C(\varepsilon, x, a) = C(\varepsilon)$ denote the right side of (7.14). With $K$ still to be determined, we have
\[ \sum_{1}^{K} < A_0C(\varepsilon)\sum_{j=1}^{K} (2\lambda(x) + A_1)^{j-1} < A_0C(\varepsilon)\cdot 2(2\lambda(x) + A_1)^{K-1}, \]
provided that $2\lambda(x) + A_1 \ge 2$.
Next,
\[ \sum_{K+1}^{\infty} < A_0\sum_{j=K}^{\infty} (2\lambda(x) + A_1)^{j}/j! \le 2A_0(2\lambda(x) + A_1)^{K}/K!, \]
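provided that $K \ge 2(2\lambda(x) + A_1)$. Take $K := (\log x)^{a'}$, and recall that $\lambda(x) = O(\log\log x)$. For any $\varepsilon' > \varepsilon$, there is an $x_1 = x_1(\varepsilon')$ such that
\[ |N(x) - Ax|/Ax < C(\varepsilon', x, a), \qquad x \ge x_1. \]
If $\varepsilon'$ is chosen small enough that $(1 + a)^{1/(1+a)} - \varepsilon' > 1$, then (B) holds.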
Corollary 7.15. Suppose $d\Pi \ge 0$, $0 < a < 1$, and
\[ \Pi(x) = \Pi_c(x) + O(x\exp\{-(\log x)^{a}\}). \tag{7.15} \]
Then, with $a' := a/(a + 1)$, there exists a constant $C$ such that
\[ N(x) = Cx + O\bigl(x\exp\{-(\log x\,\log\log x)^{a'}\}\bigr). \]
Proof. Integrating by parts,
\[ \int_{1}^{x} T^{-1}(d\Pi - d\Pi_c) = \{\Pi(x) - \Pi_c(x)\}/x + \int_{1}^{x} \{\Pi(t) - \Pi_c(t)\}\,dt/t^{2} \]
\[ = O\{e^{-(\log x)^{a}}\} + \Bigl(\int_{1}^{\infty} - \int_{x}^{\infty}\Bigr)\{\Pi(t) - \Pi_c(t)\}\,dt/t^{2} \]
\[ = O\{e^{-(\log x)^{a}}\} + \log C + O\Bigl(\int_{x}^{\infty} e^{-(\log t)^{a}}\,dt/t\Bigr). \]
The last integral is asymptotic to $(\log x)^{1-a}\exp(-\log^{a} x)/a$, as can be seen by l'Hospital's rule. Thus (7.15) implies that there is a number $\log C$ such that
\[ \int_{1-}^{x} T^{-1}(d\Pi - d\Pi_c - \{\log C\}\,\delta_1) = O\{(\log x)^{1-a}\exp(-\log^{a} x)\}. \]
The last integral is the function $\nu(x)$ (with $C$ in the role of $A$). We estimate $\nu_j(x)$ by (7.12), taking $b = 1 - a$. It follows from the proof of the theorem that
\[ |N(x) - Cx|/x < (\log x)^{1-a}\,C(\varepsilon', x, a), \qquad x \ge x_1. \]
We may absorb the $(\log x)^{1-a}$ factor into $C(\varepsilon'', x, a)$ by choosing an $\varepsilon'' > \varepsilon'$ but still small enough that $(1 + a)^{1/(1+a)} - \varepsilon'' > 1$. Thus, there exists $x_2$ such that for all $x \ge x_2$,
\[ N(x) = Cx + O\bigl(x\exp\{-(\log x\,\log\log x)^{a'}\}\bigr). \]
7.6. Notes
Much of the material of this chapter comes from the authors' articles [Di70a], [Di77], and [Zh88].
§7.3. The reader is encouraged to give a direct proof of Corollary 7.5 starting from Chebyshev's Identity 3.3. The reader is invited to prove the following assertion by a small tauberian argument: If the counting function $N$ of a g-number system $\mathcal N$ satisfies
\[ \int_{1}^{x} N(t)t^{-1}\,dt \sim Ax \]
for some positive constant $A$, then $\mathcal N$ has density $A$.
§§7.3, 7.4. Several examples are given in [Di77] showing that Theorems 7.8 and 7.11 can fail if various of the hypotheses are not satisfied. For example, suppose $\varphi$ is any positive, continuous strictly increasing function on $[1, \infty)$ for which
\[ \int_{1}^{\infty} t^{-2}\varphi(t)\,dt = \infty \]
and $\varphi(x)$ is bounded pointwise by $Cx/\log x$ (a mild condition, but admittedly one not present in Theorem 7.11). Then there exists a g-prime system for which $\Pi(x) - \Pi_c(x) \ll \varphi(x)$ but
\[ \limsup_{x\to\infty} x^{-1}N(x) = \infty \quad\text{and}\quad \liminf_{x\to\infty} x^{-1}N(x) = 0. \]
§7.4. We ask whether there is a simpler proof of Theorem 7.11.
§7.5. Another example of a g-number system that is somewhat close to the rational integers $\mathbb Z$ is given in [AMH15]: define a generalized Chebyshev function by $\psi(x) := \lfloor x\rfloor - 1$. A small calculation shows its zeta function to be given by
\[ \zeta(s) = \exp\Bigl\{\sum_{n=2}^{\infty} \frac{n^{-s}}{\log n}\Bigr\}, \qquad s > 1. \]
This system satisfies Theorem 7.13 with error term $E(x) \ll 1/\{x\log x\}$, and so formula (B) holds for $N(x)$. In the article, under assumption of the Riemann Hypothesis, $O$- and $\Omega$-bounds are given for $N(x) - Ax$; the omega bound of $x^{1-o(1)}$ shows this number system is in fact not too close to $\mathbb Z$.
https://doi.org/10.1090//surv/213/08
CHAPTER 8
Simple Estimates of π(x)
Individual primes are ornery; it is easier to manage a whole herd.
Summary. Under weak conditions, π(x) is unbounded. An irregular prime distribution can lead to a surprising integer distribution. Simple prime estimates for integers that are regularly distributed or have logarithmic density.
8.1. Unboundedness of π(x)
In this chapter we give some simple estimates of the prime-counting function $\pi(x)$ for a g-prime system $\mathcal P$. We assume throughout this chapter that $\mathcal P$ is a discrete set of primes. To start, we show that, under a weak hypothesis, the g-primes, like their classical counterpart, must be infinite in number.
Proposition 8.1. Suppose that $\mathcal N = \mathcal N_{\mathcal P}$ is a number system whose zeta function has abscissa of convergence strictly greater than 0. Then $\mathcal P$ is infinite.
Proof. Suppose $\mathcal N$ were generated by only a finite number of primes. Then the Euler product for $\zeta_{\mathcal N}(s)$ would extend over a finite number of factors and so would converge at each positive value of $s$. By the usual proof of the equality of the Euler product and Dirichlet series for the zeta function, the Dirichlet series would converge there as well. This contradicts the hypothesis that the abscissa of convergence exceeds 0.
An equivalent form of this proposition has the hypothesis that
\[ \limsup_{x\to\infty} N(x)/x^{\alpha} > 0 \tag{8.1} \]
for some $\alpha > 0$. The two conditions can be shown equivalent using Theorem 6.9 of [BD69]. Here is a direct proof that (8.1) implies an infinitude of primes. (This argument introduces an estimate that will be used again below.)
Proof. Let $n \le x$ be an integer composed entirely of the first $r$ elements of $\mathcal P$, expressed with appropriate multiplicity. We have $p_1^{\nu_1}\cdots p_r^{\nu_r} \le x$, where the $\nu_i$ are nonnegative integers and clearly satisfy $\nu_i \le \log x/\log p_i$ for $1 \le i \le r$. Thus the number of integers that can be formed from $p_1, \ldots, p_r$ is at most
\[ \Bigl(1 + \frac{\log x}{\log p_1}\Bigr)\cdots\Bigl(1 + \frac{\log x}{\log p_r}\Bigr) \le \Bigl(1 + \frac{\log x}{\log p_1}\Bigr)^{r}. \tag{8.2} \]
If $\mathcal P$ contained only $r$ primes, then we would have $N(x) \ll \log^{r} x$ and hence $N(x)/x^{\alpha} \to 0$ for every positive value of $\alpha$, in violation of (8.1).
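The counting bound (8.2) is easy to test in the classical setting. The sketch below is an illustration only: it takes the rational primes 2, 3, 5 as the finite prime set (an assumption made for the example) and compares the number of integers up to $x$ built from them with the product in (8.2). The function names are hypothetical.

```python
import math

def count_smooth(x: float, primes: list[int]) -> int:
    """Count products p_1^{v_1} ... p_r^{v_r} <= x with nonnegative exponents."""
    def rec(i: int, n: int) -> int:
        if i == len(primes):
            return 1
        total = 0
        while n <= x:
            total += rec(i + 1, n)   # fix the exponent of primes[i], recurse
            n *= primes[i]
        return total
    return rec(0, 1)

def bound_8_2(x: float, primes: list[int]) -> float:
    """The product (1 + log x / log p_1) ... (1 + log x / log p_r) of (8.2)."""
    return math.prod(1 + math.log(x) / math.log(p) for p in primes)

for x in (100, 10**4, 10**6):
    print(x, count_smooth(x, [2, 3, 5]), round(bound_8_2(x, [2, 3, 5]), 1))
```

For $x = 100$ the count is 34, against a bound of roughly 153; both quantities grow only like a power of $\log x$, which is the point of the argument.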
8.2. Can there be as many primes as integers?
This seems a strange question: since each prime is an integer and 1 is an integer not counted as a prime, we have $N(x) \ge \pi(x) + 1$ for $x \ge 1$, and so the answer is, of course, No. In §4.5, we showed that if a g-prime set is sufficiently thin, then its g-integer counting function will have the same order of growth. Here, we give an example showing that, absent some regularity of $N$ (cf. Theorem 8.5), the ratio of the g-prime and g-integer counting functions can get arbitrarily near to 1.
Example 8.2. A surfeit of primes. There exists a number system $\mathcal N = \mathcal N_{\mathcal P}$ for which
\[ \limsup_{x\to\infty} \frac{\pi_{\mathcal P}(x)}{N_{\mathcal P}(x)} = 1. \tag{8.3} \]
We create a lacunary sequence of primes with each element having high multiplicity. Let $q_1 = 2$ and $q_2, q_3, \ldots$ be an increasing sequence of positive rational integers to be chosen successively. Let $\mathcal P$ be the prime system formed with $nq_n$ as a prime with multiplicity $q_n$ for $n = 1, 2, \ldots$. For given $n$, set $m := q_1 + \cdots + q_{n-1}$; then $p_j = nq_n$ for $m < j \le m + q_n$. Assuming that $q_1, \ldots, q_{n-1}$ have been selected, take $q_n$ to be an integer so large that
\[ \Bigl(1 + \frac{\log nq_n}{\log 2}\Bigr)^{m} q_n^{-1/2} < 1. \]
Such $q_n$ exists, since $(a + b\log x)^{m}x^{-1/2} \to 0$ as $x \to \infty$ for fixed $a$, $b$, and $m$.
The integers not exceeding $nq_n$ consist of the prime $nq_n$ with multiplicity $q_n$ along with the integers formed from the $m$ primes smaller than $nq_n$. The number of composite integers, by (8.2), is at most
\[ \Bigl(1 + \frac{\log nq_n}{\log 2}\Bigr)^{m}. \]
Thus
\[ N(nq_n) \le q_n + \Bigl(1 + \frac{\log nq_n}{\log 2}\Bigr)^{m} < q_n + q_n^{1/2}. \]
It is clear from the inequality $\pi(x) + 1 \le N(x)$ that $\limsup \pi(nq_n)/N(nq_n) \le 1$. For the other inequality, note that $\pi(nq_n) = q_n + q_{n-1} + \cdots + q_1 \ge q_n$ and hence
\[ \limsup \frac{\pi(nq_n)}{N(nq_n)} \ge \limsup \frac{q_n}{q_n + q_n^{1/2}} = 1. \]
Thus (8.3) holds for $\mathcal P$.
Remark 8.3. The g-number system created here has O-log density. We have
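\[ \sum_{n_i \le x} \frac{1}{n_i} < \prod_{kq_k \le x} \Bigl(1 - \frac{1}{kq_k}\Bigr)^{-q_k} = \exp\Bigl\{-\sum_{kq_k \le x} q_k\log\Bigl(1 - \frac{1}{kq_k}\Bigr)\Bigr\} \]
\[ = \exp\Bigl\{\sum_{kq_k \le x} \Bigl[\frac{q_k}{kq_k} + O\Bigl(\frac{q_k}{(kq_k)^{2}}\Bigr)\Bigr]\Bigr\} \ll \exp\Bigl\{\sum_{kq_k \le x} \frac{1}{k}\Bigr\}, \]
since $\sum_k q_k/(kq_k)^{2}$ is bounded.
By the construction,
\[ (\log q_k)^{q_{k-1}} < \Bigl(1 + \frac{\log kq_k}{\log 2}\Bigr)^{q_1 + \cdots + q_{k-1}} < q_k^{1/2} \]
or $q_{k-1}\log\log q_k < (1/2)\log q_k$. If $k$ is large enough that $\log\log q_k \ge 1/2$, we have
\[ q_k > \exp q_{k-1}, \tag{8.4} \]
showing that the $q$ sequence grows very rapidly.
Suppose that $kq_k \le x$. With two applications of (8.4), we get
\[ \exp\exp q_{k-2} < \exp q_{k-1} < q_k < x. \]
Since $k - 2 < q_{k-2}$, we have $k - 2 < \log\log x$, and hence $k - 1 \le \log x$. Returning to the first set of inequalities, we obtain
\[ \sum_{kq_k \le x} \frac{1}{k} \le \sum_{k \le 1 + \log x} \frac{1}{k} \le \log\log x + C_1, \]
and so
\[ \sum_{n_i \le x} \frac{1}{n_i} < C_2\exp\{\log\log x + C_1\} \ll \log x. \]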
8.3. π(x) estimates via regular growth
The following combinatorial lemma is a g-number version of the Moebius inversion formula. For it, we recall the notion of regular growth from §4.6 and introduce here an item of specialized notation. Given a discrete g-number system $\mathcal N = \mathcal N_{\mathcal P}$ and a positive integer $r$, let $N_{\mathcal P}^{(r)}(x) = N^{(r)}(x)$ denote the number of elements of $\mathcal N \cap [1, x]$ having no prime factor among the initial $r$ g-primes. More formally, let $p_1 \le p_2 \le \ldots$ be the primes of $\mathcal P$ and $N^{(r)}(x)$ the function on $[1, \infty)$ that counts the number of sequences $\{\nu_1, \nu_2, \ldots\}$ of nonnegative integers for which
\[ \nu_1 = \cdots = \nu_r = 0 \quad\text{and}\quad \prod p_i^{\nu_i} \le x. \]
Lemma 8.4. Let $N^{(r)}(x)$ be the counting function defined above. Then
\[ N^{(r)}(x) = \sum_{\delta_1=0}^{1} \cdots \sum_{\delta_r=0}^{1} (-1)^{\delta_1 + \cdots + \delta_r}\,N(xp_1^{-\delta_1}\cdots p_r^{-\delta_r}). \tag{8.5} \]
Further, if $N(x)$ has regular growth with exponent $\alpha$, then
\[ N^{(r)}(x) \sim N(x)\prod_{i=1}^{r} (1 - p_i^{-\alpha}), \qquad x \to \infty. \tag{8.6} \]
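Proof. We have
\[ p_1^{\mu_1}p_2^{\mu_2}\cdots \le xp_1^{-\delta_1}\cdots p_r^{-\delta_r} \]
if and only if
\[ p_1^{\mu_1+\delta_1}\cdots p_r^{\mu_r+\delta_r}\,p_{r+1}^{\mu_{r+1}}p_{r+2}^{\mu_{r+2}}\cdots \le x. \]
Thus
\[ N(xp_1^{-\delta_1}\cdots p_r^{-\delta_r}) = \sideset{}{^*}\sum_{p_1^{\nu_1}p_2^{\nu_2}\cdots \le x} 1, \]
where $*$ denotes the restriction $\nu_1 \ge \delta_1, \ldots, \nu_r \ge \delta_r$. Let $M^{(r)}(x)$ denote the right side of (8.5). We have
\[ M^{(r)}(x) = \sum_{\delta_1=0}^{1} \cdots \sum_{\delta_r=0}^{1} (-1)^{\delta_1 + \cdots + \delta_r} \sideset{}{^*}\sum_{p_1^{\nu_1}p_2^{\nu_2}\cdots \le x} 1, \]
with the same $\nu$ restriction. Changing the summation order, we get
\[ M^{(r)}(x) = \sum_{p_1^{\nu_1}p_2^{\nu_2}\cdots \le x}\ \sum_{\delta_1=0}^{\min(\nu_1,1)} \cdots \sum_{\delta_r=0}^{\min(\nu_r,1)} (-1)^{\delta_1 + \cdots + \delta_r}. \]
Since
\[ \sum_{\delta=0}^{\min(\nu,1)} (-1)^{\delta} = \begin{cases} 0 & \text{if } \nu \ge 1,\\ 1 & \text{if } \nu = 0, \end{cases} \]
we have
\[ M^{(r)}(x) = \sum_{\substack{p_1^{\nu_1}p_2^{\nu_2}\cdots \le x\\ \nu_1 = \cdots = \nu_r = 0}} 1 = N^{(r)}(x). \]
If $N(x)$ has regular growth with exponent $\alpha$, then, as $x \to \infty$,
\[ \frac{N^{(r)}(x)}{N(x)} = \sum_{\delta_1=0}^{1} \cdots \sum_{\delta_r=0}^{1} (-1)^{\delta_1 + \cdots + \delta_r}\,\frac{N(xp_1^{-\delta_1}\cdots p_r^{-\delta_r})}{N(x)} \sim \sum_{\delta_1=0}^{1} \cdots \sum_{\delta_r=0}^{1} (-1)^{\delta_1 + \cdots + \delta_r}(p_1^{-\delta_1}\cdots p_r^{-\delta_r})^{\alpha} = \prod_{i=1}^{r} (1 - p_i^{-\alpha}). \]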
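Formula (8.5) can be checked directly for the rational integers, where $N(y) = \lfloor y\rfloor$ and regular growth holds with exponent $\alpha = 1$. The sketch below is an illustration under that assumption; the function names are hypothetical. It compares a brute-force count of the integers up to $x$ free of the first $r$ primes with the alternating sum on the right side of (8.5).

```python
from itertools import product
from math import prod

def n_r_direct(x: int, first: list[int]) -> int:
    """Count n <= x divisible by none of the primes in `first`."""
    return sum(1 for n in range(1, x + 1) if all(n % p for p in first))

def n_r_formula(x: int, first: list[int]) -> int:
    """Right side of (8.5) with N(y) = floor(y), for integer x."""
    total = 0
    for deltas in product((0, 1), repeat=len(first)):
        d = prod(p ** e for p, e in zip(first, deltas))
        total += (-1) ** sum(deltas) * (x // d)   # N(x / d) = floor(x / d)
    return total

first = [2, 3, 5]
for x in (1, 30, 1000, 3000):
    print(x, n_r_direct(x, first), n_r_formula(x, first))
```

At $x = 3000$ both counts equal 800, which is exactly $3000\,(1 - \tfrac12)(1 - \tfrac13)(1 - \tfrac15)$, in line with (8.6).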
We apply this lemma to estimate $\pi(x)$.
Theorem 8.5. Let $\mathcal P$ be a g-prime system whose integer counting function $N = N_{\mathcal P}$ has regular growth with exponent of growth $\alpha$. For any positive integer $r$ we have
\[ \limsup_{x\to\infty} \frac{\pi(x)}{N(x)} \le \prod_{i=1}^{r} (1 - p_i^{-\alpha}). \]
In particular, if $\sum 1/p_i^{\alpha}$ diverges, then $\pi(x) = o(N(x))$.
Proof. Let $N^{(r)}$ be the counting function of the last lemma. We have
\[ N^{(r)}(x) = \sum_{p_{r+1}^{\nu_{r+1}}p_{r+2}^{\nu_{r+2}}\cdots \le x} 1 \ge \sum_{\substack{p_i \le x\\ i > r}} 1 = \pi(x) - r. \]
Combining this inequality with (8.6), we find that
\[ \limsup_{x\to\infty} \frac{\pi(x)}{N(x)} \le \limsup_{x\to\infty} \frac{r + N^{(r)}(x)}{N(x)} = \prod_{i=1}^{r} (1 - p_i^{-\alpha}). \]
Now
\[ \prod_{1}^{r} (1 - p_i^{-\alpha}) \le \exp\Bigl\{-\sum_{1}^{r} p_i^{-\alpha}\Bigr\}; \]
if the last sum diverges as $r \to \infty$, then the product diverges to 0, and we get $\pi(x) = o(N(x))$.
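The inequality $\pi(x) \le r + N^{(r)}(x)$ used in the proof holds for every $x$ with $\pi(x) \ge r$, not just in the limit, so it can be observed numerically. The sketch below again assumes the rational primes (so $\alpha = 1$ and $N(x) = \lfloor x\rfloor$); the helper names are hypothetical.

```python
from math import prod

def primes_upto(x: int) -> list[int]:
    """Sieve of Eratosthenes for the rational primes up to x (x >= 1)."""
    sieve = bytearray([1]) * (x + 1)
    sieve[:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [n for n in range(2, x + 1) if sieve[n]]

def check(x: int, r: int) -> tuple[int, int, float]:
    """Return pi(x), N^(r)(x), and prod_{i<=r} (1 - 1/p_i)."""
    ps = primes_upto(x)
    first = ps[:r]
    n_r = sum(1 for n in range(1, x + 1) if all(n % p for p in first))
    return len(ps), n_r, prod(1 - 1 / p for p in first)

pi_x, n_r, c = check(10**4, 4)
print(pi_x, n_r, n_r / 10**4, c)
```

With $r = 4$ the ratio $N^{(4)}(x)/x$ is already close to $\prod_{i\le 4}(1 - 1/p_i) \approx 0.2286$, while $\pi(x)/x$ is well below it; increasing $r$ pushes the bound toward 0.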
8.4. Lower bounds for $\sum 1/p_i$ via lower log-density
We showed in Theorem 4.7 that a g-number system has positive or infinite lower log density if and only if
\[ \int_{1}^{x} u^{-1}\,d\Pi(u) \ge \log\log x + O(1). \]
Here we extend one implication of that result to treat $\sum 1/p_i$.
Lemma 8.6. We have
\[ \Pi(x) - \pi(x) \le \Pi(x^{1/2}), \qquad x \ge 1. \]
Proof. By (1.10),
\[ \Pi(x) - \pi(x) = \tfrac12\pi(x^{1/2}) + \tfrac13\pi(x^{1/3}) + \tfrac14\pi(x^{1/4}) + \tfrac15\pi(x^{1/5}) + \tfrac16\pi(x^{1/6}) + \tfrac17\pi(x^{1/7}) + \cdots \]
\[ \le \pi(x^{1/2}) + (1/2)\,\pi(x^{1/4}) + (1/3)\,\pi(x^{1/6}) + \cdots = \Pi(x^{1/2}). \]
Corollary 8.7. Suppose that $\mathcal N$ is a g-number system generated by discrete g-primes $p_1 \le p_2 \le \ldots$ and $\mathcal N$ has positive lower logarithmic density. Then the g-primes of $\mathcal N$ satisfy
\[ \sum_{p_i \le x} \frac{1}{p_i} \ge \{1 + o(1)\}\log\log x. \]
If, in addition,
\[ \int_{1}^{\infty} u^{-2}\,d\Pi(u) < \infty, \tag{8.7} \]
then
\[ \sum_{p_i \le x} \frac{1}{p_i} \ge \log\log x + O(1). \]
Proof. We have $\Pi(x) - \pi(x) \le \Pi(x^{1/2})$ by the last lemma. Combining this inequality with Lemma 2.1, we find that
\[ \int_{1}^{x} u^{-1}\{d\Pi(u) - d\pi(u)\} \le \int_{1}^{x} u^{-1}\,d\Pi(u^{1/2}) \le \int_{1}^{x} u^{-2}\,d\Pi(u). \]
If (8.7) holds, then the second conclusion follows at once from Theorem 4.7. Otherwise, for large $K$, write
\[ \int_{1}^{x} u^{-1}\,d\pi(u) \ge \int_{1}^{x} \{u^{-1} - u^{-2}\}\,d\Pi(u) > \Bigl(1 - \frac{1}{K}\Bigr)\int_{K}^{x} u^{-1}\,d\Pi(u). \]
Since $K$ is arbitrary, we have
\[ \sum_{p_i \le x} \frac{1}{p_i} \ge \{1 + o(1)\}\log\log x. \]
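Lemma 8.6 can be verified exactly for the rational primes. The sketch below assumes the classical formula $\Pi(x) = \sum_{k\ge 1} \pi(x^{1/k})/k$ (the role played by (1.10) in the text) and uses exact rational arithmetic; `iroot`, `prime_counts`, and `big_pi` are hypothetical helper names.

```python
from fractions import Fraction

def iroot(n: int, k: int) -> int:
    """Largest integer m with m**k <= n (for n >= 1, k >= 1)."""
    m = round(n ** (1.0 / k))
    while m ** k > n:
        m -= 1
    while (m + 1) ** k <= n:
        m += 1
    return m

def prime_counts(x: int) -> list[int]:
    """pi[n] = number of rational primes <= n, for 0 <= n <= x (x >= 1)."""
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, x + 1, p):
                sieve[m] = False
    pi, count = [], 0
    for flag in sieve:
        count += flag
        pi.append(count)
    return pi

def big_pi(x: int, pi: list[int], step: int = 1) -> Fraction:
    """Pi(x^(1/step)) = sum over k >= 1 of pi(x^(1/(step*k))) / k."""
    total, k = Fraction(0), 1
    while 2 ** (step * k) <= x:       # later terms vanish: x^(1/(step*k)) < 2
        total += Fraction(pi[iroot(x, step * k)], k)
        k += 1
    return total

pi = prime_counts(2000)
print(big_pi(100, pi), big_pi(100, pi) - pi[100], big_pi(100, pi, step=2))
```

For $x = 100$ this gives $\Pi(100) = 428/15$, so $\Pi(100) - \pi(100) = 53/15$, which is below $\Pi(10) = 16/3$.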
8.5. Notes §§8.2, 8.3. Much of the material of these sections comes from [BD69].
https://doi.org/10.1090//surv/213/09
CHAPTER 9
Chebyshev Bounds – Elementary Theory
Are ballpark estimates OK off the field?
Summary. A review of Chebyshev's method. Conditions for a g-prime counting function to have the expected order of growth. An example of failure. (In Chapter 11 Chebyshev bounds are established under somewhat weaker hypotheses by analytic methods.)
9.1. Introduction
Before the PNT was established for the rational integers $\mathbb N$, the true order of the prime-counting function $\pi(x)$ was established for the first time by P. L. Chebyshev. He showed by elementary methods (see Chapter 14 for a discussion of this term) that there exist two numbers $\alpha > 0$ and $\beta < \infty$ such that
\[ \liminf_{x\to\infty} \frac{\pi(x)}{x/\log x} \ge \alpha, \qquad \limsup_{x\to\infty} \frac{\pi(x)}{x/\log x} \le \beta. \tag{9.1} \]
We call the preceding inequalities lower and upper Chebyshev bounds. The PNT asserts that (9.1) holds with $\alpha = \beta = 1$.
In this chapter we apply elementary methods to study the g-number analogues of these relations. As in the classical case, we call the inequalities Chebyshev bounds. We give several conditions for a g-integer counting function $N(x)$ that ensure the validity of one or the other of the bounds (9.1).
As a warm-up and also for its historical interest, we give a version of Chebyshev's argument. We express our estimates in terms of Chebyshev's function
\[ \psi(x) := \sum_{n \le x} \Lambda(n), \]
with $\Lambda(n) = \log p$ if $n = p^{\alpha}$ and $\Lambda(n) = 0$ otherwise. We establish inequalities
\[ \liminf_{x\to\infty} \frac{\psi(x)}{x} \ge \alpha, \qquad \limsup_{x\to\infty} \frac{\psi(x)}{x} \le \beta; \tag{9.2} \]
the form (9.1) with the same values of $\alpha$ and $\beta$ is obtained by summation by parts.
9.2. Chebyshev bounds for natural primes
The starting point of Chebyshev's investigation is the formula
\[ \log n = \sum_{p^{k} \mid n} \Lambda(p^{k}). \]
This can be expressed in (multiplicative) convolution terms as
\[ L1 = \Lambda * 1 \tag{9.3} \]
where $L(n) = \log n$ and $1(n) = 1$ for all $n \ge 1$. (In §3.1 we considered the measure version of this relation.) For $k = 1, 2, \ldots$ let functions $e_k$ be defined by
\[ e_k(n) = \delta_{k,n} = \begin{cases} 1, & n = k,\\ 0, & n \ne k. \end{cases} \]
If we convolve each side of (9.3) by the Moebius function $\mu(n)$ and sum the resulting expression, we obtain
\[ \sum_{n \le x} L1 * \mu(n) = \sum_{n \le x} \Lambda * 1 * \mu(n) = \sum_{n \le x} \Lambda * e_1(n) = \psi(x), \tag{9.4} \]
since $\mu * 1 = e_1$ and $e_1$ is the identity element of multiplicative convolution. This formula is difficult to use directly, because rather little is known at the outset about the factor $\mu$ on the left hand side. A good strategy is to introduce in place of $\mu(n)$ an arithmetic function $f(n)$ which serves as an "approximate convolution inverse" of $1(n)$. Specifically, $f$ will be chosen to have the following properties:
\[ \sum_{n=1}^{\infty} f(n)/n = 0, \tag{9.5} \]
\[ \sum_{n \le x} |f(n)| = O(x\log^{-2-\varepsilon} x) \quad\text{with some } \varepsilon > 0, \tag{9.6} \]
and one of the one-sided conditions
\[ \sum_{n \le x} f * 1(n) \le C, \qquad x \ge 1, \tag{9.7} \]
or
\[ \sum_{n \le x} f * 1(n) \ge \begin{cases} c_1, & 1 \le x < B,\\ 0, & x \ge B, \end{cases} \tag{9.8} \]
or
\[ \sum_{n \le x} f * 1(n) \ge \begin{cases} 0, & 1 \le x < B,\\ c_2, & x \ge B, \end{cases} \tag{9.9} \]
for some real constants $C < \infty$, $c_1, c_2 > 0$, and $B > 1$. It is known from classical prime number theory that $f = \mu$ satisfies (9.5) but not (9.6). Also, $\mu$ satisfies (9.7), (9.8), and (9.9) with $C = c_1 = c_2 = 1$ and $B > 1$. Property (9.6) makes $f$ more tractable than $\mu$.
Proposition 9.1. Let $f$ satisfy (9.5) and (9.6) and set
\[ \alpha_f := -\sum_{1}^{\infty} f(n)\,n^{-1}\log n. \]
Then
\[ \psi(x) \ge \frac{\alpha_f\,x}{C} + O(x\log^{-\varepsilon} x) \quad\text{if (9.7) holds} \tag{9.10} \]
and
\[ \psi(x) \le \begin{cases} \dfrac{\alpha_f\,x}{c_1(1 - B^{-1})} + O(x\log^{-\varepsilon} x) & \text{if (9.8) holds},\\[2ex] \dfrac{\alpha_f\,Bx}{c_2} + O(x\log^{-\varepsilon} x) & \text{if (9.9) holds}. \end{cases} \tag{9.11} \]
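Proof. If we convolve each side of (9.3) with $f$ and sum, we obtain
\[ \sum_{n \le x} L1 * f(n) = \sum_{n \le x} \Lambda * (1 * f)(n) \]
and hence
\[ \sum_{n \le x} f(n)\sum_{m \le x/n} \log m = \sum_{n \le x} \Lambda(n)\sum_{m \le x/n} 1 * f(m). \tag{9.12} \]
Using the simple estimate
\[ \sum_{m \le y} \log m = y\log y - y + O(\log y) \]
and (9.5) and (9.6), we evaluate the left hand side of (9.12) as
\[ \sum_{n \le x} f(n)\Bigl\{\frac{x}{n}\log\frac{x}{n} - \frac{x}{n} + O(\log x)\Bigr\} = (x\log x - x)\sum_{n \le x} \frac{f(n)}{n} - x\sum_{n \le x} \frac{f(n)\log n}{n} + O\Bigl(\log x\sum_{n \le x} |f(n)|\Bigr) \]
\[ = -(x\log x - x)\sum_{n > x} \frac{f(n)}{n} + \alpha_f\,x + x\sum_{n > x} \frac{f(n)\log n}{n} + O(x\log^{-\varepsilon-1} x). \]
Then we evaluate the two sums on the right hand side of the last equality by partial summation and use property (9.6). We have
\[ \sum_{n > x} \frac{f(n)}{n} = \int_{x}^{\infty} y^{-1}\,dS(y), \]
where $S(y) := \sum_{n \le y} f(n)$, and we find that
\[ \sum_{n > x} \frac{f(n)}{n} = O(\log^{-1-\varepsilon} x). \]
Similarly, we find that
\[ \sum_{n > x} f(n)\,n^{-1}\log n = \int_{x}^{\infty} y^{-1}\log y\,dS(y) = O(\log^{-\varepsilon} x). \]
Therefore the left hand side of (9.12) equals $\alpha_f\,x + O(x\log^{-\varepsilon} x)$.
Now we evaluate the right hand side of (9.12). In case (9.7) holds, we have
\[ \sum_{n \le x} \Lambda(n)\sum_{m \le x/n} 1 * f(m) \le C\sum_{n \le x} \Lambda(n) = C\psi(x), \]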
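and the desired lower bound for $\psi(x)$ follows. If (9.8) holds, we use the inequality
\[ \sum_{m \le x/n} 1 * f(m) \ge \begin{cases} c_1, & \text{for } 1 \le x/n < B,\\ 0, & \text{for } x/n \ge B. \end{cases} \]
In this case, we have
\[ \sum_{n \le x} \Lambda(n)\sum_{m \le x/n} 1 * f(m) \ge c_1\sum_{x/B < n \le x} \Lambda(n) = c_1\{\psi(x) - \psi(x/B)\}, \]
so that $\psi(x) - \psi(x/B) \le \alpha_f\,x/c_1 + O(x\log^{-\varepsilon} x)$. Applying this bound with $x/B^{k}$ in place of $x$ for $0 \le k \le \log(x^{1/2})/\log B$, summing, and estimating the remaining term trivially by $\psi(x^{1/2}) \le x^{1/2}\log x$, we find
\[ \psi(x) \le \frac{\alpha_f}{c_1}\sum_{k \ge 0} \frac{x}{B^{k}} + O(x\log^{-\varepsilon} x) = \frac{\alpha_f\,x}{c_1(1 - B^{-1})} + O(x\log^{-\varepsilon} x), \]
so the desired upper bound for $\psi$ holds. Finally, assuming (9.9), we apply the inequality
\[ \sum_{m \le x/n} 1 * f(m) \ge \begin{cases} 0, & 1 \le x/n < B,\\ c_2, & x/n \ge B, \end{cases} \]
in the right hand side of (9.12) and obtain
\[ \alpha_f\,x + O(x\log^{-\varepsilon} x) \ge c_2\sum_{n \le x/B} \Lambda(n) = c_2\,\psi(x/B). \]
Thus the upper bound for $\psi$ follows here too.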
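Identity (9.12), on which the proof rests, is exact and can be confirmed numerically. The sketch below does so for one concrete choice, Chebyshev's function $f = e_1 - e_2 - e_3 - e_5 + e_{30}$ (chosen here as an assumption for the illustration; any finitely supported $f$ would do), and also shows the left side staying close to $\alpha_f\,x$. The names `F`, `mangoldt`, and `both_sides` are hypothetical.

```python
import math

F = {1: 1, 2: -1, 3: -1, 5: -1, 30: 1}   # f = e1 - e2 - e3 - e5 + e30

def mangoldt(x: int) -> list[float]:
    """lam[n] = log p if n is a power of the prime p, else 0, for 0 <= n <= x."""
    lam = [0.0] * (x + 1)
    is_prime = [True] * (x + 1)
    for p in range(2, x + 1):
        if is_prime[p]:
            for m in range(2 * p, x + 1, p):
                is_prime[m] = False
            q = p
            while q <= x:
                lam[q] = math.log(p)
                q *= p
    return lam

def both_sides(x: int) -> tuple[float, float]:
    """Left and right sides of (9.12) for an integer x."""
    T = [0.0] * (x + 1)                   # T[y] = sum_{m <= y} log m
    for m in range(2, x + 1):
        T[m] = T[m - 1] + math.log(m)
    lam = mangoldt(x)
    lhs = sum(c * T[x // k] for k, c in F.items())
    # sum_{m <= y} (1 * f)(m) = sum_k f(k) * floor(y / k)
    rhs = sum(lam[n] * sum(c * ((x // n) // k) for k, c in F.items())
              for n in range(1, x + 1))
    return lhs, rhs

alpha_f = -sum(c * math.log(k) / k for k, c in F.items())
lhs, rhs = both_sides(10_000)
print(lhs, rhs, alpha_f * 10_000)
```

The two sides agree to rounding error, and both differ from $\alpha_f\,x$ only by a quantity of size $O(\log x)$, as the evaluation of the left hand side predicts.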
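Example 9.2. Take $f = e_1 - 2e_2$. Then
\[ \sum_{n \le x} f * 1(n) = \Bigl\lfloor\frac{x}{1}\Bigr\rfloor - 2\Bigl\lfloor\frac{x}{2}\Bigr\rfloor = \begin{cases} 0, & \lfloor x\rfloor \text{ even},\\ 1, & \lfloor x\rfloor \text{ odd}. \end{cases} \]
Thus we may take $c_1 = C = 1$, $B = 2$ in (9.7) and (9.8) and $\alpha_f = \log 2$ and obtain
\[ x\log 2 + o(x) \le \psi(x) \le 2x\log 2 + o(x). \]
Example 9.3. (Chebyshev) Take $f = e_1 - e_2 - e_3 - e_5 + e_{30}$. Then
\[ \sum_{n \le x} f * 1(n) = \Bigl\lfloor\frac{x}{1}\Bigr\rfloor - \Bigl\lfloor\frac{x}{2}\Bigr\rfloor - \Bigl\lfloor\frac{x}{3}\Bigr\rfloor - \Bigl\lfloor\frac{x}{5}\Bigr\rfloor + \Bigl\lfloor\frac{x}{30}\Bigr\rfloor, \]
a periodic function of $x$ with period 30 that assumes only the values 0 and 1. Also, the value 1 is assumed for $1 \le x < 6$. Thus properties (9.7) and (9.8) are satisfied with $c_1 = C = 1$ and $B = 6$. Furthermore,
\[ \sum_{1}^{\infty} \frac{f(n)}{n} = 0, \qquad \sum_{1}^{\infty} |f(n)| = O(1), \]
and
\[ \alpha_f = \frac{\log 2}{2} + \frac{\log 3}{3} + \frac{\log 5}{5} - \frac{\log 30}{30} = .92129\ldots. \]
Thus we find Chebyshev's estimates
\[ .92129\cdots + o(1) \le \psi(x)/x \le 1.10555\cdots + o(1). \]
Summation by parts yields (9.1) with $\alpha = .92129\ldots$ and $\beta = 1.10555\ldots$.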
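The claims of Example 9.3 are finite computations and can be reproduced in a few lines. The sketch below evaluates the summatory function at integer arguments (which suffices, since it depends only on $\lfloor x\rfloor$) and computes $\alpha_f$ and the upper constant $\alpha_f/\{c_1(1 - B^{-1})\}$ with $c_1 = 1$, $B = 6$; the names are hypothetical.

```python
import math

F = {1: 1, 2: -1, 3: -1, 5: -1, 30: 1}   # f = e1 - e2 - e3 - e5 + e30

def summatory(x: int) -> int:
    """sum_{n <= x} (f * 1)(n) = sum_k f(k) * floor(x / k)."""
    return sum(c * (x // k) for k, c in F.items())

values = [summatory(x) for x in range(1, 61)]      # two full periods
alpha_f = -sum(c * math.log(k) / k for k, c in F.items())
beta = alpha_f / (1 - 1 / 6)                       # (9.11) with c1 = 1, B = 6

print(values[:30])
print(round(alpha_f, 5), round(beta, 5))
```

The printed list contains only 0s and 1s, begins with five 1s, and repeats with period 30; the two constants round to 0.92129 and 1.10555.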
9.3. An auxiliary function
In the next section, we shall establish Chebyshev bounds for g-primes under the assumption that
\[ E^{*}(x) := \sup_{y \ge x} |N(y) - Ay|/y \]
satisfies
\[ \int_{1}^{\infty} E^{*}(x)\,x^{-1}\,dx < \infty. \tag{9.13} \]
The function $E^{*}(x)$ has several useful properties: it is a positive valued, nonincreasing upper bound for $|N(x) - Ax|/x$ satisfying an integral condition. However, it would be desirable for our analysis to have a bounding function with the further property of not decreasing too rapidly. The goal of this section is to construct such a function to use as a replacement for $E^{*}$.
Lemma 9.4. Assume $E^{*}(x)$ satisfies (9.13). Then there exists a positive valued function $Q(x)$ having the following four properties for $1 \le x < \infty$:
\[ Q(x) \ge E^{*}(x), \tag{9.14} \]
\[ Q(x) \text{ is nonincreasing}, \tag{9.15} \]
\[ Q(x) \le 4Q(x^{2}), \tag{9.16} \]
\[ \int_{1}^{\infty} Q(x)\,x^{-1}\,dx < \infty. \tag{9.17} \]
Remarks 9.5. (i) For $\gamma \in (1, 2]$, the function $\log^{-\gamma} ex$ has properties (9.15), (9.16), (9.17) of $Q(x)$; readers willing to settle for Chebyshev inequalities proved under the stronger hypothesis $E(x) \ll \log^{-\gamma} ex$ can skip the construction of $Q(x)$ below and simply use $\log^{-\gamma} ex$ in its place in the next section.
(ii) The condition (9.16) that we have introduced is reasonably sharp: if it were replaced by $Q(x) \le 2Q(x^{2})$, say, with $Q(x)$ assumed positive on some interval $(1, 1 + \varepsilon)$, then the inequality
\[ \int_{X}^{X^{2}} \frac{Q(x)}{x}\,dx \le 2\int_{X}^{X^{2}} \frac{Q(x^{2})}{x}\,dx = \int_{X^{2}}^{X^{4}} \frac{Q(u)}{u}\,du \]
shows that (9.17) could not hold.
Proof. Define $Q(x)$ recursively by setting $Q(x) := E^{*}(1)$ for $1 \le x \le 2$ and
\[ Q(x) := \max\{E^{*}(2^{2^{m}}),\ 4^{-1}Q(2^{2^{m}})\} \]
for $2^{2^{m}} < x \le 2^{2^{m+1}}$, $m = 0, 1, 2, \ldots$. We shall verify that $Q(x)$ satisfies conditions (9.14)–(9.17).
• $Q(x) \ge E^{*}(x)$. This is clear on $[1, 2]$, since $Q(x) = E^{*}(1) \ge E^{*}(x)$. For $x \in (2^{2^{m}}, 2^{2^{m+1}}]$, we have
\[ Q(x) = \max\{E^{*}(2^{2^{m}}),\ 4^{-1}Q(2^{2^{m}})\} \ge E^{*}(2^{2^{m}}) \ge E^{*}(x), \]
since $E^{*}(x)$ is nonincreasing.
• $Q(x)$ is nonincreasing. We show that $Q(2^{2^{m+1}}) \le Q(2^{2^{m}})$, $m = 0, 1, 2, \ldots$. Since $Q$ is constant on intervals $(2^{2^{m}}, 2^{2^{m+1}}]$ this will show the monotonicity. By definition, $Q(1) = Q(2) = E^{*}(1)$. Thus,
\[ Q(4) := \max\{E^{*}(2),\ 4^{-1}Q(2)\} \le \max\{E^{*}(1),\ 4^{-1}E^{*}(1)\} = Q(2). \]
Similarly, for $m \ge 1$, by (9.14), $E^{*}(2^{2^{m}}) \le Q(2^{2^{m}})$, and hence
\[ Q(2^{2^{m+1}}) = \max\{E^{*}(2^{2^{m}}),\ 4^{-1}Q(2^{2^{m}})\} \le Q(2^{2^{m}}). \]
• $Q(x) \le 4Q(x^{2})$. For $1 \le x \le 2$, we have $1 \le x^{2} \le 2^{2}$. If $1 \le x^{2} \le 2$, $Q(x) = Q(x^{2})$ and the result holds. If $2 < x^{2} \le 2^{2}$, then
\[ Q(x^{2}) = \max\{E^{*}(2),\ 4^{-1}Q(2)\} \ge 4^{-1}Q(2) = 4^{-1}Q(x). \]
For $2^{2^{m-1}} < x \le 2^{2^{m}}$, $m \ge 1$, we have $2^{2^{m}} < x^{2} \le 2^{2^{m+1}}$ and, similarly,
\[ Q(x^{2}) \ge 4^{-1}Q(2^{2^{m}}) = 4^{-1}Q(x). \]
• $\int_{1}^{\infty} Q(x)\,x^{-1}\,dx < \infty$. Since $Q(x)$ is constant on intervals, we have
\[ \int_{2^{2^{m}}}^{2^{2^{m+1}}} Q(x)\,x^{-1}\,dx = Q(2^{2^{m+1}})\log 2^{2^{m}} = \max\{E^{*}(2^{2^{m}}),\ 4^{-1}Q(2^{2^{m}})\}\,2^{m}\log 2. \]
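Suppose that, for some positive integers $\mu$, $\nu$,
\[ 4^{-1}Q(2^{2^{m}}) \ge E^{*}(2^{2^{m}}) \tag{9.18} \]
holds for $\mu \le m < \mu + \nu$. In this case, we have inductively for $1 \le \lambda \le \nu$,
\[ Q(2^{2^{\mu+\lambda}}) = 4^{-\lambda}Q(2^{2^{\mu}}), \]
and thus
\[ \int_{2^{2^{\mu}}}^{2^{2^{\mu+\nu}}} Q(x)\,x^{-1}\,dx = Q(2^{2^{\mu}})\log 2^{2^{\mu-1}}\sum_{\lambda=1}^{\nu} 2^{-\lambda}. \tag{9.19} \]
Now we distinguish two cases, according to occurrences of (9.18).
Case I. Suppose (9.18) holds for all $m \ge \lambda$ for some $\lambda \in \mathbb N$. By the preceding formula,
\[ \int_{2^{2^{\lambda}}}^{\infty} Q(x)\,x^{-1}\,dx = Q(2^{2^{\lambda}})\log 2^{2^{\lambda-1}} < \infty, \]
and thus (9.17) holds here.
Case II. Now suppose there exists an infinite sequence $m_1 < m_2 < \cdots < m_k < m_{k+1} < \cdots$ such that
\[ E^{*}(2^{2^{m}}) > 4^{-1}Q(2^{2^{m}}) \tag{9.20} \]
holds for each $m_k$ and the reverse inequality, (9.18), holds for each $m \in (m_k, m_{k+1})$. In this case, we show that
\[ \sum_{m=m_k}^{m_{k+1}-1} \int_{2^{2^{m}}}^{2^{2^{m+1}}} Q(x)\,x^{-1}\,dx \le 4E^{*}(2^{2^{m_k}})\log 2^{2^{m_k-1}}. \tag{9.21} \]
First, if $m_{k+1} = m_k + 1$, the left hand side of (9.21) equals
\[ \int_{2^{2^{m_k}}}^{2^{2^{m_k+1}}} Q(x)\,x^{-1}\,dx = E^{*}(2^{2^{m_k}})\int_{2^{2^{m_k}}}^{2^{2^{m_k+1}}} x^{-1}\,dx = E^{*}(2^{2^{m_k}})\log 2^{2^{m_k}} = 2E^{*}(2^{2^{m_k}})\log 2^{2^{m_k-1}}, \]
which establishes (9.21). Next, suppose $m_{k+1} \ge m_k + 2$. The preceding calculation gives
\[ \int_{2^{2^{m_k}}}^{2^{2^{m_k+1}}} Q(x)\,x^{-1}\,dx = 2E^{*}(2^{2^{m_k}})\log 2^{2^{m_k-1}}. \]
For the remaining intervals of (9.21) we can apply (9.19) to obtain
\[ \int_{2^{2^{m_k+1}}}^{2^{2^{m_{k+1}}}} Q(x)\,x^{-1}\,dx < Q(2^{2^{m_k+1}})\log 2^{2^{m_k}} = 2E^{*}(2^{2^{m_k}})\log 2^{2^{m_k-1}}, \]
since
\[ Q(2^{2^{m_k+1}}) = \max\{E^{*}(2^{2^{m_k}}),\ 4^{-1}Q(2^{2^{m_k}})\} = E^{*}(2^{2^{m_k}}). \]
Together, the last two integral expressions give (9.21). Thus we have
\[ \sum_{m=m_1}^{\infty} \int_{2^{2^{m}}}^{2^{2^{m+1}}} Q(x)\,x^{-1}\,dx = \sum_{k=1}^{\infty}\sum_{m=m_k}^{m_{k+1}-1} \int_{2^{2^{m}}}^{2^{2^{m+1}}} Q(x)\,x^{-1}\,dx \le 4\sum_{k=1}^{\infty} E^{*}(2^{2^{m_k}})\log 2^{2^{m_k-1}} \le 4\sum_{m=1}^{\infty} E^{*}(2^{2^{m}})\log 2^{2^{m-1}}. \]
To show the last sum is finite, we note that
\[ \int_{2^{2^{m-1}}}^{2^{2^{m}}} E^{*}(x)\,x^{-1}\,dx \ge E^{*}(2^{2^{m}})\int_{2^{2^{m-1}}}^{2^{2^{m}}} x^{-1}\,dx = E^{*}(2^{2^{m}})\log 2^{2^{m-1}}, \]
since $E^{*}(x)$ is nonincreasing. Combining the inequalities, we find
\[ \int_{2^{2^{m_1}}}^{\infty} Q(x)\,x^{-1}\,dx \le 4\sum_{m=1}^{\infty} E^{*}(2^{2^{m}})\log 2^{2^{m-1}} \le 4\sum_{m=1}^{\infty} \int_{2^{2^{m-1}}}^{2^{2^{m}}} E^{*}(x)\,x^{-1}\,dx = 4\int_{2}^{\infty} E^{*}(x)\,x^{-1}\,dx < \infty \]
by (9.13). Thus $\int_{1}^{\infty} Q(x)\,x^{-1}\,dx < \infty$.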
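The recursive definition of $Q$ in Lemma 9.4 is easy to carry out by machine, since $Q$ is constant on $[1, 2]$ and on each interval $(2^{2^{m}}, 2^{2^{m+1}}]$. The sketch below computes these constant values for a hypothetical, rapidly decreasing choice $E^{*}(x) = 1/x$ (an assumption made only for illustration; it satisfies (9.13)). Exact rational arithmetic keeps the comparison in the $\max$ free of rounding.

```python
from fractions import Fraction
from typing import Callable

def q_levels(e_star: Callable[[int], Fraction], m_max: int) -> list[Fraction]:
    """Values of Q on [1, 2] and on (2^(2^m), 2^(2^(m+1))] for m = 0..m_max."""
    levels = [e_star(1)]                      # Q = E*(1) on [1, 2]
    for m in range(m_max + 1):
        node = 2 ** (2 ** m)                  # left endpoint 2^(2^m)
        # Q(node) is the value on the previous interval, i.e. levels[-1].
        levels.append(max(e_star(node), levels[-1] / 4))
    return levels

levels = q_levels(lambda x: Fraction(1, x), 5)
print([str(q) for q in levels])
```

Here $E^{*}$ soon falls far below $Q$, and from that point on $Q$ decreases by exactly the factor 4 per interval: the values are $1, \tfrac12, \tfrac14, \tfrac1{16}, \tfrac1{64}, \tfrac1{256}, \tfrac1{1024}$. Consecutive values satisfy $Q(x) \le 4Q(x^{2})$, which is property (9.16).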
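Lemma 9.6. Assume (9.15) and (9.17) hold. Then $Q(x) = o(\log^{-1} ex)$.
Proof. Given $\varepsilon > 0$, we have $\int_{x}^{\infty} Q(t)\,t^{-1}\,dt < \varepsilon/2$ for all $x \ge X = X(\varepsilon)$. Thus, for $x \ge X^{2}$,
\[ Q(x)\log x \le 2\int_{\sqrt{x}}^{x} Q(t)\,t^{-1}\,dt < \varepsilon. \]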
The following lemma shows a kind of "stability" of $Q(x)$ under multiplicative convolution.
Lemma 9.7. Assume that $Q(x)$ satisfies (9.15), (9.16), and (9.17). Then we have $\int_{1}^{x} t^{-1}Q(x/t)Q(t)\,dt \le c_1Q(x)$ for some positive constant $c_1$.
Proof. We have
\[ \int_{1}^{x} t^{-1}Q(x/t)Q(t)\,dt = 2\int_{1}^{\sqrt{x}} t^{-1}Q(x/t)Q(t)\,dt \le 2Q(\sqrt{x})\int_{1}^{\sqrt{x}} Q(t)\,t^{-1}\,dt \le c_1Q(x) \]
since $Q(x/t) \le Q(\sqrt{x}) \le 4Q(x)$ for $1 \le t \le \sqrt{x}$ and $\int_{1}^{\infty} Q(t)\,t^{-1}\,dt < \infty$.
9.4. Chebyshev bounds for g-primes

In this section, we give finite upper and positive lower Chebyshev bounds for g-primes. (We shall revisit this topic in Chapter 11, when we have developed further analytic tools.) We first show a result in the opposite direction: that the lower bound cannot be too big nor the upper bound too small.

Proposition 9.8. Let $\alpha$ and $\beta$ be numbers satisfying (9.1). If a g-number system $\mathcal{N}$ has finite O-log density, then $\alpha \le 1$; if $\mathcal{N}$ has positive lower log density, then $\beta \ge 1$. In particular, if $N(x) \ll x$, then $\alpha \le 1$ and if $N(x) \gg x$, then $\beta \ge 1$.

Proof. First assume that $\mathcal{N}$ has finite O-log density. By Proposition 4.2, $(s-1)\zeta(s)$ is bounded above as $s \to 1+$. Thus
$$\limsup_{s \to 1+} \log\{(s-1)\zeta(s)/s\} = \limsup_{s \to 1+}\, s \int_1^{\infty} x^{-s-1} \{\Pi(x) - \Pi_c(x)\}\, dx < \infty,$$
with the continuous “prime counting function” of (1.9),
$$\Pi_c(x) := \int_1^x \frac{1 - u^{-1}}{\log u}\, du \sim \frac{x}{\log x}.$$
Suppose $\alpha > 1$, i.e.
$$\liminf_{x \to \infty} \frac{\pi(x)}{x/\log x} > 1 + \varepsilon$$
for some $\varepsilon > 0$. Then
$$\Pi(x) - \Pi_c(x) > \varepsilon\, \Pi_c(x), \qquad x \ge X,$$
for some number $X \ge 1$, and
$$\log\{(s-1)\zeta(s)/s\} > O(1) + \varepsilon \int_X^{\infty} x^{-s-1}\, \Pi_c(x)\, dx \to \infty$$
as $s \to 1+$, contradicting the boundedness of the function. The case of $\mathcal{N}$ having positive lower log density is handled analogously.
Now we turn to the more significant Chebyshev bounds $\alpha > 0$ and $\beta < \infty$, using an approximate convolution inverse relation analogous to (9.9). Again, we write the counting function $N(x)$ of the g-integers of a system $\mathcal{N}$ as
$$N(x) = Ax + xE(x), \qquad \text{for } x \ge 1, \tag{9.22}$$
with $A$ a positive constant, and set
$$E^*(x) := \sup_{y \ge x} |E(y)|.$$
Theorem 9.9. Suppose (9.22) holds with
$$\int_1^{\infty} x^{-1} E^*(x)\, dx < \infty.$$
Then there exist numbers $\alpha > 0$ and $\beta < \infty$ satisfying Chebyshev’s bounds (9.2).

Corollary 9.10. If
$$N(x) = Ax + O(x \log^{-\gamma} ex) \tag{9.23}$$
is satisfied with some constant $\gamma > 1$, then the Chebyshev bounds (9.2) hold.

Proof of the corollary. Say $|E(x)| \le c \log^{-\gamma} ex$ for some $c > 0$. Then $E^*(x) \le c \log^{-\gamma} ex$. It follows that
$$\int_1^{\infty} x^{-1} E^*(x)\, dx \le c \int_1^{\infty} x^{-1} \log^{-\gamma} ex\, dx < \infty,$$
and the claim of the corollary follows from the theorem.
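The convergence used in the proof of the corollary can be checked numerically. The sketch below is only an illustration, not part of the source: with the substitution $u = \log ex$ the integral $\int_1^\infty x^{-1}\log^{-\gamma} ex\,dx$ becomes $\int_1^\infty u^{-\gamma}\,du = 1/(\gamma-1)$ for $\gamma > 1$, and a further substitution $u = e^w$ turns it into $\int_0^\infty e^{(1-\gamma)w}\,dw$. The function name `log_power_integral` and the truncation point `w_max` are hypothetical choices.

```python
import math

def log_power_integral(gamma, w_max=60.0, steps=60000):
    """Midpoint-rule approximation of the integral of x^{-1} log^{-gamma}(ex) over x >= 1.

    After u = log(ex) and u = e^w the integrand becomes exp((1 - gamma) * w)
    on [0, w_max]; the tail beyond w_max is negligible when gamma > 1.
    """
    h = w_max / steps
    return h * sum(math.exp((1.0 - gamma) * (i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    for gamma in (1.5, 2.0, 3.0):
        print(gamma, log_power_integral(gamma), 1.0 / (gamma - 1.0))
```

For $\gamma \le 1$ the same integrand does not decay, which is consistent with the restriction $\gamma > 1$ in the corollary.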
We begin proving the theorem by establishing a convolution identity that will be used in the two succeeding lemmas.

Lemma 9.11. For $x \ge 1$ and $\eta > 0$, we have
$$\int_{1-}^{x} (\delta_1 + dt) * (\delta_1 - \eta t^{-\eta}\, dt) = x^{1-\eta}. \tag{9.24}$$

Proof. The left side of (9.24) is
$$\int_{1-}^{x} (x/t)\,(\delta_1 - \eta t^{-\eta}\, dt) = x + x\, t^{-\eta}\Big|_{1}^{x} = x^{1-\eta}.$$
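As a sanity check on (9.24), the following sketch evaluates the left side numerically. It assumes only the reduction in the proof, namely that the convolution equals $x - \eta x \int_1^x t^{-1-\eta}\,dt$; the function name `lhs_924` and the quadrature parameters are illustrative and not from the source.

```python
import math

def lhs_924(x, eta, steps=100000):
    """Approximate x - eta * x * (integral of t^{-1-eta} over [1, x]).

    The substitution t = e^w turns the integral into the integral of
    exp(-eta * w) over [0, log x], evaluated here by the midpoint rule.
    """
    w_max = math.log(x)
    h = w_max / steps
    integral = h * sum(math.exp(-eta * (i + 0.5) * h) for i in range(steps))
    return x - eta * x * integral

if __name__ == "__main__":
    for x, eta in ((50.0, 0.3), (1000.0, 1.0), (7.0, 2.5)):
        print(x, eta, lhs_924(x, eta), x ** (1.0 - eta))
```

With $\eta = 1$ the computed value is close to 1 for every $x$, in line with Remark 9.12.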
Remark 9.12. If $\eta = 1$, then the right side of (9.24) is 1. The associated measures satisfy $(\delta_1 + dt) * (\delta_1 - t^{-1}\, dt) = \delta_1$, i.e., $\delta_1 - t^{-1}\, dt$ is the convolution inverse of $\delta_1 + dt$. (The usual group theory argument for the uniqueness of inverses justifies the word “the.”)

Now we introduce the measure $(\delta_1 - \varepsilon t^{-\varepsilon}\, dt) * Q(t)\, dt$, with a suitable $\varepsilon > 0$, to prove Theorem 9.9. We shall show that this measure acts as an “approximate convolution inverse of $dN$”; it is an analogue of the function $f$ in (9.9), as the following lemma shows.

Lemma 9.13. Assume that $Q(x)$ satisfies the four properties of Lemma 9.4. Then, for a sufficiently small (fixed) number $\varepsilon > 0$, we have
$$u_\varepsilon(x) := \int_{1-}^{x} dN * (\delta_1 - \varepsilon t^{-\varepsilon}\, dt) * Q(t)\, dt \ge 0$$
for all $x \ge 1$; also, $u_\varepsilon(x) \to \infty$ as $x \to \infty$.
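Proof. Note that
$$\int_{1-}^{x} dN * \varepsilon t^{-\varepsilon}\, dt = \varepsilon \int_1^x N(x/t)\, t^{-\varepsilon}\, dt \le \varepsilon N(x) \int_1^x dt.$$
Thus, if $1 \le x \le 1/\varepsilon$, then
$$\int_{1-}^{x} dN * (\delta_1 - \varepsilon t^{-\varepsilon}\, dt) \ge N(x) - \varepsilon N(x)(x-1) \ge 0.$$
The lemma is certainly true for $1 \le x \le 1/\varepsilon$, since the third convolution factor in the $u_\varepsilon$ integral is everywhere nonnegative.

If $x > 1/\varepsilon$, we utilize all the convolution factors. With $A$ the density of the g-number system (cf. (9.22)), we shall subtract from $dN$ and then add back the measure $A(\delta_1 + dt)$. Write
$$u_\varepsilon(x) = A \int_{1-}^{x} (\delta_1 + dt) * (\delta_1 - \varepsilon t^{-\varepsilon}\, dt) * Q(t)\, dt + \int_{1-}^{x} (dN - A\delta_1 - A\, dt) * (\delta_1 - \varepsilon t^{-\varepsilon}\, dt) * Q(t)\, dt =: I_1 + I_2,$$
say. We shall show that $I_1 > 0$ on $(1, \infty)$, that $I_1 \to \infty$ as $x \to \infty$, and that $|I_2|$ is smaller than $I_1$. Using (9.24) on the first two factors in $I_1$, we find
$$I_1 = A x^{1-\varepsilon} \int_1^x t^{-1+\varepsilon} Q(t)\, dt = A x \int_1^x u^{-\varepsilon-1} Q(x/u)\, du.$$
Thus $I_1 > 0$, and $I_1 \gg x^{1-\varepsilon} \to \infty$ as $x \to \infty$. To estimate $I_2$, note first that
$$|N(x) - Ax| = x|E(x)| \le x E^*(x) \le x Q(x), \qquad x \ge 1.$$
This inequality and Lemma 9.7 give
$$\Bigl| \int_{1-}^{x} (dN - A\delta_1 - A\, dt) * Q(t)\, dt \Bigr| = \Bigl| \int_1^x (N(x/t) - Ax/t)\, Q(t)\, dt \Bigr| \le \int_1^x (x/t)\, Q(x/t)\, Q(t)\, dt \le c_1 x Q(x).$$
It follows that
$$|I_2| \le c_1 \int_{1-}^{x} (x/t)\, Q(x/t)\, (\delta_1 + \varepsilon t^{-\varepsilon}\, dt) = c_1 x Q(x) + c_1 \varepsilon x \int_1^x t^{-\varepsilon-1} Q(x/t)\, dt.$$
Now choose $\varepsilon > 0$ so small that $\log(1/\varepsilon) > 4c_1/A$; then $1/\varepsilon > 4c_1/A$ holds as well. Thus the last term in the $|I_2|$ estimate is smaller than $I_1/4$. Also, for $x \ge 1/\varepsilon$, we have
$$\frac{1}{2} I_1 = \frac{1}{2} A x \int_1^x t^{-\varepsilon-1} Q(x/t)\, dt \ge \frac{1}{2} A x Q(x) \int_1^{1/\varepsilon} t^{-\varepsilon-1}\, dt \ge \frac{1}{2} A x Q(x) \cdot \frac{1}{2} \log\frac{1}{\varepsilon} > c_1 x Q(x),$$
since $\varepsilon^{\varepsilon} \ge \exp(-e^{-1}) > 1/2$. Therefore, $|I_2| < (3/4) I_1$, and $u_\varepsilon(x) \ge I_1 - |I_2| > I_1/4$. Thus $u_\varepsilon(x)$ is always positive and tends to infinity with $x$.

We shall use the following approximation to estimate $\psi(x)$.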
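Lemma 9.14. Suppose that $N(x) = Ax + o(x \log^{-1} ex)$. Then, for given $\eta > 0$,
$$\int_{1-}^{x} L\, dN * (\delta_1 - \eta t^{-\eta}\, dt) = Ax/\eta + o(x).$$

Proof. Write the integral in the iterated form
$$\int_{1-}^{x} \Bigl\{ \int_1^{x/t} L\, dN \Bigr\} (\delta_1 - \eta t^{-\eta}\, dt).$$
Applying integration by parts and the hypothesis, we find
$$\int_1^u L\, dN = N(u) \log u - \int_1^u N(t)\, t^{-1}\, dt = Au \log u - Au + o(u).$$
Thus the integral of the lemma can be written as
$$\int_{1-}^{x} \Bigl\{ A \frac{x}{t} \log\frac{x}{t} - A \frac{x}{t} + o\Bigl(\frac{x}{t}\Bigr) \Bigr\} (\delta_1 - \eta t^{-\eta}\, dt)$$
$$= A(\log x - 1) \int_{1-}^{x} \frac{x}{t}\, (\delta_1 - \eta t^{-\eta}\, dt) - A x \int_{1-}^{x} t^{-1} (\log t)\, (\delta_1 - \eta t^{-\eta}\, dt) + \int_{1-}^{x} o\Bigl(\frac{x}{t}\Bigr) (\delta_1 + \eta t^{-\eta}\, dt)$$
$$=: I_1 + I_2 + I_3,$$
say. By Lemma 9.11, $I_1 = A(\log x - 1)\, x^{1-\eta} = o(x)$. Integration by parts shows
$$I_2 = A \eta x \int_1^x (\log t)\, t^{-1-\eta}\, dt = Ax/\eta + o(x),$$
and direct calculation gives $I_3 = o(x)$.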
Lemma 9.15. Assume that $Q(x)$ satisfies the four properties of Lemma 9.4. Then, for fixed $\varepsilon > 0$,
$$\int_{1-}^{x} L\, dN * (\delta_1 - \varepsilon t^{-\varepsilon}\, dt) * Q(t)\, dt = O(x).$$

Proof. Lemma 9.6 implies that the hypothesis of Lemma 9.14 is satisfied; by this and (9.17), the integral on the left hand side equals
$$\int_{1-}^{x} O(x/t)\, Q(t)\, dt = O\Bigl( x \int_{1-}^{x} t^{-1} Q(t)\, dt \Bigr) = O(x).$$
We are now ready to establish Theorem 9.9. The fundamental relation is Chebyshev’s identity $d\psi * dN = L\, dN$, which was established in Chapter 3. We convolve both sides of the identity by an appropriate measure for each of the bounds.

Proof of the Chebyshev upper bound. Here we start with the identity
$$\int_{1-}^{x} dN * (\delta_1 - \varepsilon t^{-\varepsilon}\, dt) * Q(t)\, dt * d\psi = \int_{1-}^{x} L\, dN * (\delta_1 - \varepsilon t^{-\varepsilon}\, dt) * Q(t)\, dt. \tag{9.25}$$
The left side is expressible as
$$\int_{1-}^{x} du_\varepsilon * d\psi = \int_1^x u_\varepsilon(x/t)\, d\psi(t) \ge \int_1^{x/B} d\psi(t),$$
since $u_\varepsilon(x)$, the function introduced in Lemma 9.13, is everywhere nonnegative and is at least 1 for all $x$ exceeding some number $B$.
The right side of (9.25) is O(x) by Lemma 9.15. Combining the estimates of (9.25), we obtain ψ(x/B) = O(x), whence
ψ(x) = O(x).
Proof of the Chebyshev lower bound. This time we take $\delta_1 - t^{-1}\, dt$ as an approximate inverse of $dN$. Starting again from the Chebyshev identity, we have
$$\int_{1-}^{x} dN * (\delta_1 - t^{-1}\, dt) * d\psi = \int_{1-}^{x} L\, dN * (\delta_1 - t^{-1}\, dt).$$
The right side of the last equation is $Ax + o(x)$ by Lemma 9.14 with $\eta = 1$. For the left side, set
$$v(u) := \int_{1-}^{u} dN * (\delta_1 - t^{-1}\, dt),$$
and restate the starting identity as
$$\int_1^x v(x/t)\, d\psi(t) = Ax + o(x). \tag{9.26}$$
Now write $v(u) = I + II$ with
$$I = I(u) := A \int_{1-}^{u} \frac{u}{t}\, (\delta_1 - t^{-1}\, dt), \qquad II = II(u) := \int_{1-}^{u} \{N(u/t) - Au/t\}\, (\delta_1 - t^{-1}\, dt).$$
We have $I = A$ by (9.24) with $\eta = 1$. Since $|N(w) - Aw| \le w Q(w)$, we have
$$|II| \le \int_{1-}^{u} (u/t)\, Q(u/t)\, (\delta_1 + t^{-1}\, dt) = u Q(u) + \int_1^u Q(w)\, dw = O\Bigl( 1 + \int_1^u Q(w)\, dw \Bigr),$$
the last because $\int_1^u Q(t)\, dt \ge (u-1) Q(u) \ge (u/2) Q(u)$ for $u \ge 2$. Combining the estimates, we see that
$$v(u) = O\Bigl( 1 + \int_1^u Q(w)\, dw \Bigr).$$
Insert into (9.26) both the preceding bound and $\psi(t) \le ct$. We find
$$\int_1^x v(x/t)\, d\psi(t) \le K \int_1^x \Bigl\{ 1 + \int_1^{x/t} Q(u)\, du \Bigr\}\, d\psi(t) \le K\psi(x) + K \int_1^x d\psi * Q(t)\, dt = K\psi(x) + K \int_1^x \psi(x/t)\, Q(t)\, dt$$
$$\le K \Bigl\{ \psi(x) \Bigl( 1 + \int_1^B Q(t)\, dt \Bigr) + c x \int_B^x t^{-1} Q(t)\, dt \Bigr\}$$
with some constant $K > 0$ and any number $B > 1$. For $B$ sufficiently large, we have $cK \int_B^{\infty} t^{-1} Q(t)\, dt \le \frac{1}{3} A$. Fixing such a $B$ and recalling (9.26), we find
$$K\psi(x) \Bigl\{ 1 + \int_1^B Q(t)\, dt \Bigr\} + \frac{1}{3} A x \ge \int_1^x v(x/t)\, d\psi(t) = Ax + o(x).$$
Thus, for $x$ sufficiently large,
$$K\psi(x) \Bigl\{ 1 + \int_1^B Q(t)\, dt \Bigr\} \ge \frac{1}{3} A x,$$
i.e. $\psi(x) \gg x$. This completes the proof of Theorem 9.9.
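For orientation, the classical integers give the simplest instance of Theorem 9.9: there $N(x) = \lfloor x \rfloor$, so $A = 1$ and $|E(x)| \le 1/x$, and the integral condition on $E^*$ holds. The sketch below is an illustration with hypothetical names, not a computation from the source; it evaluates Chebyshev's $\psi(x)$ for the rational primes with a simple sieve, so that the ratio $\psi(x)/x$ can be seen to stay between positive constants.

```python
import math

def chebyshev_psi(x):
    """Return psi(x), the sum of log p over prime powers p^k <= x (rational primes)."""
    n = int(x)
    if n < 2:
        return 0.0
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if is_prime[p]:
            # Strike out the multiples of p starting at p*p.
            is_prime[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    total = 0.0
    for p in range(2, n + 1):
        if is_prime[p]:
            power = p
            while power <= n:  # each prime power p^k <= x contributes log p
                total += math.log(p)
                power *= p
    return total

if __name__ == "__main__":
    for x in (10, 100, 1000, 10**5):
        print(x, chebyshev_psi(x) / x)
```

The printed ratios are only evidence for this one system; the theorem itself concerns arbitrary g-number systems satisfying (9.22) with the stated integral condition.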
9.5. A failure of Chebyshev bounds

In this section, we show that Chebyshev bounds need not hold for g-number systems satisfying condition (9.23) with $\gamma < 1$. Thus, apart from the case $\gamma = 1$, Corollary 9.10 is sharp. In Chapter 11, we consider this limiting case.

Theorem 9.16 (R. S. Hall). Let $\alpha \in [0, 1]$, $\beta \in [1, \infty]$ and $\gamma \in [0, 1)$ be given. There exists a g-prime system for which
(1) $N(x) = Ax + O(x \log^{-\gamma} x)$;
(2) $\liminf_{x \to \infty} \pi(x)(\log x)/x = \alpha$;
(3) $\limsup_{x \to \infty} \pi(x)(\log x)/x = \beta$.

Proof. We assume that $\gamma \in (0, 1)$, for a construction with this restriction clearly satisfies condition (1) with $\gamma = 0$. Let $\pi_0(x)$ be the counting function of the rational primes. For each $n \in \mathbb{N}$, define the interval $I_n := (a_n, b_n]$ with end points
$$a_n := \exp\{2^{n/(1-\gamma)}\} \quad \text{and} \quad b_n := a_n 2^{n/4}.$$
The intervals $\{I_n\}$ are nonoverlapping, for a small calculation shows that $b_n < a_{n+1}$. Let $C_n$ be the set consisting of the $\nu_n$ largest rational primes in $I_n$, with
$$\nu_n := \lfloor (1 - \alpha)\{\pi_0(b_n) - \pi_0(a_n)\} \rfloor.$$
($C_n$ consists of all primes in $I_n$ if $\alpha = 0$, and $C_n = \emptyset$ if $\alpha = 1$.) If $\beta < \infty$, let $D_n$ be a set consisting of any $\lfloor (\beta - 1)\{\pi_0(b_n) - \pi_0(a_n)\} \rfloor$ distinct real numbers in the interval $(b_n - 1, b_n]$. (If $\beta = 1$, then $D_n = \emptyset$.) If $\beta = \infty$, let $D_n$ be a set consisting of any $\lfloor 2^{n/8}\{\pi_0(b_n) - \pi_0(a_n)\} \rfloor$ distinct real numbers in $(b_n - 1, b_n]$. Let
$$C = \bigcup_{n \text{ odd}} C_n, \qquad D = \bigcup_{n \text{ even}} D_n.$$
Define $\mathcal{P} := \{p_j\}_{j=1}^{\infty}$ as the nondecreasing sequence of elements of $(R \setminus C) \cup D$, where $R$ is the set of rational primes, and take $\mathcal{N} := \{n_k\}_{k=1}^{\infty}$ to be the sequence of g-integers generated by $\mathcal{P}$, with $n_1 = 1$ and appropriate multiplicity for any integers that occur repeatedly.

We next show that the contribution of the primes of $C \cup D$ is of modest size. Since $(\log x)^{\gamma}/x$ is decreasing for $x > e^{\gamma}$,
$$\{\pi_0(b_n) - \pi_0(a_n)\}\, \frac{\log^{\gamma}(b_n - 1)}{b_n - 1} \le \{\pi_0(b_n) - \pi_0(a_n)\}\, \frac{\log^{\gamma} a_n}{a_n}.$$
By Chebyshev’s bounds for rational primes, the last quantity is of order
$$\frac{b_n}{\log b_n} \cdot \frac{\log^{\gamma} a_n}{a_n} = \frac{2^{n/4} \log^{\gamma} a_n}{\log a_n + (n/4) \log 2} \le \frac{2^{n/4}}{\log^{1-\gamma} a_n} = \frac{1}{2^{3n/4}}.$$
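The “small calculation” behind $b_n < a_{n+1}$ and the identity $\log^{1-\gamma} a_n = 2^n$ used in the last display can be checked in logarithmic form, which avoids overflow. This is a minimal sketch with hypothetical helper names, assuming $a_n = \exp\{2^{n/(1-\gamma)}\}$ and $b_n = a_n 2^{n/4}$ as defined above.

```python
import math

def log_a(n, gamma):
    """log a_n = 2^{n/(1-gamma)}."""
    return 2.0 ** (n / (1.0 - gamma))

def log_b(n, gamma):
    """log b_n = log a_n + (n/4) log 2."""
    return log_a(n, gamma) + (n / 4.0) * math.log(2.0)

def intervals_disjoint(gamma, n_max=40):
    """Check b_n < a_{n+1} for 1 <= n <= n_max by comparing logarithms."""
    return all(log_b(n, gamma) < log_a(n + 1, gamma) for n in range(1, n_max + 1))

def decay_exponent_ok(gamma, n_max=40):
    """Check that (log a_n)^{1-gamma} equals 2^n, so 2^{n/4}/log^{1-gamma} a_n = 2^{-3n/4}."""
    return all(
        math.isclose(log_a(n, gamma) ** (1.0 - gamma), 2.0 ** n, rel_tol=1e-9)
        for n in range(1, n_max + 1)
    )

if __name__ == "__main__":
    for gamma in (0.1, 0.5, 0.9):
        print(gamma, intervals_disjoint(gamma), decay_exponent_ok(gamma))
```

The range of `n` and the sample values of `gamma` are kept moderate so that the floating-point powers stay finite; the inequality itself holds for every $n \ge 1$ because $\log a_{n+1} - \log a_n \ge 2^n$.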
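Then, for some $K = K(\beta)$, we have
$$\sum_{p \in C} \frac{\log^{\gamma} p}{p} + \sum_{q \in D} \frac{\log^{\gamma} q}{q} \le \sum_{n \text{ odd}} \{\pi_0(b_n) - \pi_0(a_n)\}\, \frac{\log^{\gamma} a_n}{a_n} + K \sum_{n \text{ even}} 2^{n/8} \{\pi_0(b_n) - \pi_0(a_n)\}\, \frac{\log^{\gamma}(b_n - 1)}{b_n - 1}$$
$$= O\Bigl( \sum_{n=1}^{\infty} \frac{1}{2^{n/2}} \Bigr) = O(1).$$
It follows that
$$\prod_{p \in C} \bigl( 1 + (\log^{\gamma} p)/p \bigr) \prod_{q \in D} \bigl( 1 - (\log^{\gamma} q)/q \bigr)^{-1}$$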
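converges.

Let $\{k_j\}_{j=1}^{\infty}$ be the sequence of products $\prod p^{n} \prod q^{m}$, $n, m = 0, 1, 2, \dots$, of powers $p^n$ of primes $p \in C$ and powers $q^m$ of $q \in D$, arranged in ascending order. Also, we define a sequence $\{c_j\}_{j=1}^{\infty}$ by
$$\sum_{j=1}^{\infty} \frac{c_j}{k_j^{s}} = \prod_{p \in C} \Bigl( 1 - \frac{1}{p^s} \Bigr) \prod_{q \in D} \Bigl( 1 - \frac{1}{q^s} \Bigr)^{-1},$$
where $s \ge 1$. Then we have
$$\sum_{j=1}^{\infty} \frac{|c_j| \log^{\gamma} k_j}{k_j} \le \prod_{p \in C} \Bigl( 1 + \frac{\log^{\gamma} p}{p} \Bigr) \prod_{q \in D} \Bigl( 1 - \frac{\log^{\gamma} q}{q} \Bigr)^{-1} \tag{9.27}$$
by repeated use of the inequality
$$\log^{\gamma}(xy) = (\log x + \log y)^{\gamma} \le \log^{\gamma} x\, \log^{\gamma} y,$$
valid for $x, y \ge e^2$. (Note that each $k_j \ge a_1 \ge e^2$.)

Since the g-number system $\mathcal{N}$ was generated by g-primes $\mathcal{P} = (R \setminus C) \cup D$, the zeta function of $\mathcal{N}$ has the representation
$$\zeta(s) := \sum_{k=1}^{\infty} \frac{1}{n_k^{s}} = \zeta_0(s) \prod_{p \in C} \Bigl( 1 - \frac{1}{p^s} \Bigr) \prod_{q \in D} \Bigl( 1 - \frac{1}{q^s} \Bigr)^{-1}, \qquad \sigma > 1,$$
where $\zeta_0(s)$ is the Riemann zeta function. It follows that
$$\sum_{k=1}^{\infty} \frac{1}{n_k^{s}} = \sum_{j=1}^{\infty} \frac{c_j}{k_j^{s}} \cdot \sum_{m=1}^{\infty} \frac{1}{m^s},$$
and hence $N(x)$, the counting function of g-integers in $\mathcal{N}$, is given by
$$N(x) = \sum_{n_k \le x} 1 = \sum_{k_j \le x} c_j \Bigl\lfloor \frac{x}{k_j} \Bigr\rfloor = x \sum_{k_j \le x} \frac{c_j}{k_j} + O\Bigl( \sum_{k_j \le x} |c_j| \Bigr).$$
From (9.27) and the convergence of the infinite product on the right, the infinite series $\sum_{j=1}^{\infty} c_j/k_j$ converges absolutely and, by the definition of $c_j$,
$$\sum_{j=1}^{\infty} c_j/k_j = \prod_{p \in C} \Bigl( 1 - \frac{1}{p} \Bigr) \prod_{q \in D} \Bigl( 1 - \frac{1}{q} \Bigr)^{-1} =: A > 0.$$
Hence we have
$$N(x) = Ax + O\Bigl( x \sum_{k_j > x} \frac{|c_j|}{k_j} \Bigr) + O\Bigl( \sum_{k_j \le x} |c_j| \Bigr).$$
Also, we have
$$\log^{\gamma} x \sum_{k_j > x} \frac{|c_j|}{k_j} \le \sum_{k_j > x} \frac{|c_j| \log^{\gamma} k_j}{k_j} = o(1)$$
as $x \to \infty$ and
$$\sum_{k_j \le x} |c_j| \le 1 + \frac{x}{\log^{\gamma} x} \sum_{k_j \le x} \frac{|c_j| \log^{\gamma} k_j}{k_j} = O\Bigl( \frac{x}{\log^{\gamma} x} \Bigr).$$
Thus $N(x) = Ax + O(x/\log^{\gamma} x)$. Hence Property (1) holds.

We next prove Property (2). Since $\pi_0(x) \sim x/\log x$ by the PNT for $\mathbb{N}$, we have
$$\lim_{n \to \infty} \frac{\pi_0(a_n)}{\pi_0(b_n)} = 0 \quad \text{and} \quad \lim_{n \to \infty} \frac{\pi_0(b_{n-1})}{\pi_0(a_n)}\, 2^{n/8} \le \lim_{n \to \infty} \frac{\pi_0(a_{n-1} 2^{(n-1)/4})}{\pi_0(a_{n-1}^2)}\, 2^{n/8} = 0.$$
From the inequality
$$\pi(x) \ge \pi_0(x) - \sum_{b_n \le x,\ n \text{ odd}} \lfloor (1 - \alpha)\{\pi_0(b_n) - \pi_0(a_n)\} \rfloor \ge \pi_0(x) - (1 - \alpha)\pi_0(x),$$
we see that
$$\liminf_{x \to \infty} \frac{\pi(x) \log x}{x} = \liminf_{x \to \infty} \frac{\pi(x)}{\pi_0(x)} \ge \alpha.$$
In the other direction,
$$\pi(a_n) \le \pi_0(a_n) + K \sum_{k < n} 2^{k/8} \{\pi_0(b_k) - \pi_0(a_k)\} \le \pi_0(a_n) + K 2^{n/8} \pi_0(b_{n-1}).$$

[…] $\sigma > 1$, set
$$G(s) := \int_0^{\infty} F(x)\, e^{-sx}\, dx - \frac{L}{s - 1}. \tag{10.2}$$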
Theorem 10.1 (W-I Upper Bound). If there exist positive constants $\lambda$, $y_0$, and $K$ such that
$$\limsup_{\sigma \to 1+} \Bigl| \int_{-2\lambda}^{2\lambda} \frac{1}{2} \Bigl( 1 - \frac{|t|}{2\lambda} \Bigr) e^{ity}\, G(\sigma + it)\, dt \Bigr| \le K \tag{10.3}$$
for each $y \ge y_0$, then $F(x) e^{-x} \le C$ holds for some constant $C > 0$.

Conversely, if $F(x) e^{-x} \le C$ for some constant $C > 0$ then, for every $\lambda > 0$, (10.3) holds with $K = C\pi$ uniformly for $y \in \mathbb{R}$.

Remark 10.2. If there exist positive constants $\lambda_0$, $y_0$, and $K$ such that (10.3) holds with $\lambda = \lambda_0$ for all $y \ge y_0$ then this relation holds (with some other constant $K$) for all $\lambda > 0$ and all $y$.

Corollary 10.3. If there exists $\lambda > 0$ such that
$$\limsup_{\sigma \to 1+} \int_{-2\lambda}^{2\lambda} |G(\sigma + it)|\, dt < \infty$$
then $F(x) e^{-x} \le C$ for some constant $C > 0$.

We see from (10.2) that $G(\sigma)$ is real for real $\sigma > 1$. By the reflection principle, $G(\bar{s}) = \overline{G(s)}$ for complex $s$, and thus
$$\int_{-2\lambda}^{2\lambda} \frac{1}{2} \Bigl( 1 - \frac{|t|}{2\lambda} \Bigr) e^{ity}\, G(\sigma + it)\, dt$$
is real for real $y$.

Theorem 10.4 (W-I Lower Bound). If $F(x) e^{-x} \ge c$ for all $x \ge x_0 \ge 0$ and some constant $c > 0$, then there exists a constant $\gamma > 0$ such that, for every $\lambda > 0$,
$$\liminf_{\sigma \to 1+} \int_{-2\lambda}^{2\lambda} \frac{1}{2} \Bigl( 1 - \frac{|t|}{2\lambda} \Bigr) e^{ity}\, G(\sigma + it)\, dt \ge (-L + \gamma)\pi \tag{10.4}$$
uniformly for $y \ge y_0(\lambda)$.

Conversely, assume there exist positive constants $\lambda$, $\gamma$, and $y_0$ such that (10.4) holds for all $y \ge y_0$ and that $F(x) e^{-x} \le C$ for some constant $C > 0$. Then $F(x) e^{-x} \ge c$ for all $x \ge x_0$, with some constant $c > 0$.

Theorem 10.5 (Wiener-Ikehara). Let $F(x)$ be a nondecreasing, nonnegative function on $[0, \infty)$ which satisfies (10.1) and (10.2). Then
$$\lim_{x \to \infty} F(x) e^{-x} = L$$
if and only if there exists a constant $\lambda_0 \ge 0$ such that
$$\lim_{\sigma \to 1+} \int_{-2\lambda}^{2\lambda} \frac{1}{2} \Bigl( 1 - \frac{|t|}{2\lambda} \Bigr) e^{ity}\, G(\sigma + it)\, dt \quad \text{exists} \tag{10.5}$$
for every $\lambda > \lambda_0$ and every $y \ge y_0(\lambda)$, and such that
$$\lim_{y \to \infty} \lim_{\sigma \to 1+} \int_{-2\lambda}^{2\lambda} \frac{1}{2} \Bigl( 1 - \frac{|t|}{2\lambda} \Bigr) e^{ity}\, G(\sigma + it)\, dt = 0 \tag{10.6}$$
for every $\lambda > \lambda_0$.
Corollary 10.6. If there exists a constant $\lambda_0 \ge 0$ such that, for every $\lambda > \lambda_0$,
$$\lim_{\sigma, \sigma' \to 1+} \int_{-2\lambda}^{2\lambda} \frac{1}{2} \Bigl( 1 - \frac{|t|}{2\lambda} \Bigr) e^{ity}\, \{G(\sigma + it) - G(\sigma' + it)\}\, dt = 0 \tag{10.7}$$
uniformly for $y \ge y_0(\lambda)$, then $\lim_{x \to \infty} F(x) e^{-x} = L$. In particular, if
$$\lim_{\sigma, \sigma' \to 1+} \int_{-2\lambda}^{2\lambda} |G(\sigma + it) - G(\sigma' + it)|\, dt = 0 \tag{10.8}$$
for every $\lambda > 0$, then $\lim_{x \to \infty} F(x) e^{-x} = L$.

Remark 10.7. Our theorems apply for a Laplace transform having abscissa of convergence $\sigma_c = 1$. For other abscissa values, analogous theorems hold; alternatively, as noted in §4.1, one could normalize the function $F$ to make $\sigma_c = 1$.

Remark 10.8. In Theorem 10.5 we must assume that (10.6) is satisfied for all $\lambda > \lambda_0$; if this relation holds only for $0 < \lambda \le \lambda_1$, then we can conclude only that $\liminf F(x) e^{-x} \le \limsup F(x) e^{-x} < \infty$. An example with $\liminf F(x) e^{-x} < \limsup F(x) e^{-x}$ is given in §10.5.

10.3. The Fejér kernel

Our proofs of Theorems 10.1, 10.4, and 10.5 depend on analytic properties of the Fejér kernel, which is defined on $\mathbb{R}$ for every $\lambda > 0$ by
$$k_\lambda(x) := \int_{-2\lambda}^{2\lambda} \frac{1}{2} \Bigl( 1 - \frac{|t|}{2\lambda} \Bigr) e^{ixt}\, dt. \tag{10.9}$$
For completeness, we state and prove some familiar properties of $k_\lambda$.

Lemma 10.9. Let $x \in \mathbb{R}$, $\lambda > 0$, and $\delta > 0$. Then
$$k_\lambda(x) = \lambda \Bigl( \frac{\sin \lambda x}{\lambda x} \Bigr)^{2}, \tag{10.10}$$
$$0 \le k_\lambda(x) \le \min\{\lambda,\ \lambda^{-1} x^{-2}\}, \tag{10.11}$$
$$\int_{-\infty}^{\infty} k_\lambda(x)\, dx = \pi, \tag{10.12}$$
$$\int_{|u| \ge \delta} k_\lambda(u)\, du < \frac{2}{\lambda\delta}. \tag{10.13}$$

[…] Then, integration by parts and some familiar relations yield
$$\int_{-\infty}^{\infty} \frac{\sin^2 v}{v^2}\, dv = 2 \int_0^{\infty} \frac{\sin 2v}{v}\, dv = 2 \int_0^{\infty} \frac{\sin u}{u}\, du = \pi.$$

The next lemma estimates $L^1$-norms of
$$k_\lambda^{(\nu)}(x) = \int_{-2\lambda}^{2\lambda} \frac{1}{2} \Bigl( 1 - \frac{|t|}{2\lambda} \Bigr) (it)^{\nu} e^{ixt}\, dt, \tag{10.14}$$
the $\nu$-th derivative of $k_\lambda$. We shall use these results to establish Chebyshev bounds in the next chapter.

Lemma 10.10. Let $0 < \lambda \le 1/2$. Then, for $\nu = 1, 2, 3, \dots$,
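$$|k_\lambda^{(\nu)}(x)| \le \frac{3(2\lambda)^{\nu-1}}{1 + x^2}, \qquad x \in \mathbb{R}, \tag{10.15}$$
$$\int_{-\infty}^{\infty} |k_\lambda^{(\nu)}(x)|\, dx \le \frac{8(2\lambda)^{\nu}}{\nu + 1}. \tag{10.16}$$

Proof. We begin with an absolute bound for $k_\lambda^{(\nu)}(x)$, to be applied for $|x|$ small. From (10.14), we have the simple inequality
$$|k_\lambda^{(\nu)}(x)| \le \int_0^{2\lambda} \Bigl( 1 - \frac{t}{2\lambda} \Bigr) t^{\nu}\, dt = \frac{(2\lambda)^{\nu+1}}{(\nu + 1)(\nu + 2)}, \qquad x \in \mathbb{R}. \tag{10.17}$$
For application to larger $|x|$, we show that
$$|k_\lambda'(x)| \le \frac{8}{3x^2}, \qquad \lambda|x| \ge 6/5, \tag{10.18}$$
$$|k_\lambda^{(\nu)}(x)| \le \frac{8(2\lambda)^{\nu-1}}{3x^2}, \qquad \nu \ge 2,\ x \ne 0. \tag{10.19}$$
For $\nu = 1$, we have
$$|k_\lambda'(x)| = \Bigl| \frac{2 \sin \lambda x \cos \lambda x}{x^2} - \frac{2 \sin^2 \lambda x}{\lambda x^3} \Bigr| \le \frac{|\sin 2\lambda x|}{x^2} + \frac{2 \sin^2 \lambda x}{\lambda |x|^3} \le \frac{1}{x^2} + \frac{2}{\lambda |x|\, x^2} \le \frac{8}{3x^2}, \qquad \lambda|x| \ge 6/5.$$
For $\nu \ge 2$, start with the expression
$$k_\lambda(x) = \int_0^{2\lambda} \Bigl( 1 - \frac{t}{2\lambda} \Bigr) \cos xt\, dt$$
and make $\nu$ differentiations. We find
$$k_\lambda^{(\nu)}(x) = \int_0^{2\lambda} \Bigl( 1 - \frac{t}{2\lambda} \Bigr) t^{\nu}\, T(xt)\, dt,$$
where $T = \pm\sin$ or $\pm\cos$, depending on $\nu \pmod 4$. Integrating by parts twice, with $T_1 := \int T = \mp\cos$ or $\pm\sin$ and $T_2 := \int T_1 = \mp\sin$ or $\mp\cos$, we find
$$k_\lambda^{(\nu)}(x) = (2\lambda)^{\nu-1}\, \frac{T_2(2\lambda x)}{x^2} + \int_0^{2\lambda} \frac{T_2(xt)}{x^2}\, \nu t^{\nu-2} \Bigl\{ \nu - 1 - \frac{(\nu + 1)t}{2\lambda} \Bigr\}\, dt,$$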
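and so
$$|k_\lambda^{(\nu)}(x)| \le \frac{(2\lambda)^{\nu-1}}{x^2} + \int_0^{2\lambda} \frac{\nu t^{\nu-2}}{x^2} \Bigl| \nu - 1 - \frac{(\nu + 1)t}{2\lambda} \Bigr|\, dt.$$
To treat the last integral, set $t^* = 2\lambda(\nu - 1)/(\nu + 1)$ and note that
$$\nu - 1 - \frac{(\nu + 1)t}{2\lambda} \begin{cases} > 0, & \text{if } 0 \le t < t^*, \\ < 0, & \text{if } t^* < t \le 2\lambda. \end{cases}$$
Therefore
$$|k_\lambda^{(\nu)}(x)| \le \frac{(2\lambda)^{\nu-1}}{x^2} + \frac{1}{x^2} \int_0^{t^*} \nu t^{\nu-2} \Bigl\{ \nu - 1 - \frac{(\nu + 1)t}{2\lambda} \Bigr\}\, dt - \frac{1}{x^2} \int_{t^*}^{2\lambda} \nu t^{\nu-2} \Bigl\{ \nu - 1 - \frac{(\nu + 1)t}{2\lambda} \Bigr\}\, dt$$
$$= 2(2\lambda)^{\nu-1} x^{-2} \Bigl\{ 1 + \Bigl( \frac{\nu - 1}{\nu + 1} \Bigr)^{\nu-1} \Bigr\}.$$
Now we note that $\{(\nu - 1)/(\nu + 1)\}^{\nu-1}$ is decreasing for $\nu \ge 2$ and hence
$$\{(\nu - 1)/(\nu + 1)\}^{\nu-1} \le 1/3 \quad \text{for } \nu = 2, 3, \dots.$$
It follows that (10.19) holds for $x \ne 0$ and all integers $\nu \ge 2$.

To show (10.15), we treat separately the case $\nu = 1$. Recall that $0 < \lambda \le 1/2$. If $\lambda|x| \le 2$, we have $2\lambda^2(1 + x^2) \le 1/2 + 2(2)^2 < 9$, whence
$$\frac{(2\lambda)^2}{6} < \frac{3}{1 + x^2}.$$
On the other hand, if $\lambda|x| > 2$, then $|x| > 4$ and $8 + 8x^2 < 9x^2$, so
$$\frac{8}{3x^2} < \frac{3}{1 + x^2}.$$
For $\nu \ge 2$, we establish (10.15) by an analogous but simpler argument, treating separately the cases $|x| \le 3$ and $|x| > 3$.

While we could estimate $\int |k_\lambda^{(\nu)}(x)|\, dx$ by integrating (10.15), a sharper bound comes from applying (10.17) and (10.19) (resp. (10.18) for $\nu = 1$ and $|x| \ge 2/\lambda$) once again. We have
$$\int_{-\infty}^{\infty} |k_\lambda^{(\nu)}(x)|\, dx \le \frac{2(2\lambda)^{\nu+1} X}{(\nu + 1)(\nu + 2)} + 2 \int_X^{\infty} \frac{8(2\lambda)^{\nu-1}}{3x^2}\, dx = 8(2\lambda)^{\nu-1} \Bigl\{ \frac{\lambda^2 X}{(\nu + 1)(\nu + 2)} + \frac{2}{3X} \Bigr\} \le \frac{8(2\lambda)^{\nu}}{\nu + 1}$$
by taking
$$X = \frac{1}{\lambda} \sqrt{\frac{2(\nu + 1)(\nu + 2)}{3}}.$$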
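The closed form (10.10) and the bound (10.15) for $\nu = 1$ lend themselves to a quick numerical cross-check. The sketch below is an illustration under stated assumptions, not the authors' computation: it uses the cosine form $k_\lambda(x) = \int_0^{2\lambda}(1 - t/2\lambda)\cos xt\,dt$ from the proof, a midpoint rule with an arbitrary step count, and hypothetical function names.

```python
import math

def fejer_closed(x, lam):
    """Closed form (10.10): lam * (sin(lam x) / (lam x))^2, with value lam at x = 0."""
    if x == 0.0:
        return lam
    return math.sin(lam * x) ** 2 / (lam * x * x)

def fejer_integral(x, lam, steps=20000):
    """Midpoint rule for the integral of (1 - t/(2 lam)) cos(x t) over [0, 2 lam]."""
    h = 2.0 * lam / steps
    return h * sum(
        (1.0 - (i + 0.5) * h / (2.0 * lam)) * math.cos(x * (i + 0.5) * h)
        for i in range(steps)
    )

def fejer_derivative(x, lam, steps=4000):
    """First derivative: minus the integral of (1 - t/(2 lam)) t sin(x t) over [0, 2 lam]."""
    h = 2.0 * lam / steps
    return -h * sum(
        (1.0 - (i + 0.5) * h / (2.0 * lam)) * (i + 0.5) * h * math.sin(x * (i + 0.5) * h)
        for i in range(steps)
    )

def derivative_bound_holds(lam, x_max=20.0, step=0.25, tol=1e-6):
    """Check |k'(x)| <= 3 / (1 + x^2) on a grid, as in (10.15) with nu = 1."""
    n = int(2 * x_max / step)
    xs = [-x_max + i * step for i in range(n + 1)]
    return all(abs(fejer_derivative(x, lam)) <= 3.0 / (1.0 + x * x) + tol for x in xs)

if __name__ == "__main__":
    print(fejer_integral(1.7, 0.5), fejer_closed(1.7, 0.5))
    print(derivative_bound_holds(0.5))
```

A grid check of this kind cannot replace the proof; it only guards against transcription slips in the constants.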
10.4. Proof of the Wiener-Ikehara Theorems

To simplify the notation, let
$$\Delta_\lambda(t) := \frac{1}{2} \Bigl( 1 - \frac{|t|}{2\lambda} \Bigr)^{+}.$$

Proof of Theorem 10.1. We begin our argument by rewriting (10.2) as
$$G(\sigma + it) = \int_0^{\infty} e^{-(\sigma + it)x} \{F(x) - L e^x\}\, dx, \qquad \sigma > 1,$$
multiplying by $\Delta_\lambda(t) e^{ity}$, and integrating over $-2\lambda < t < 2\lambda$. If then we change the order of integration in the resulting double integral, we find
$$\int_{-2\lambda}^{2\lambda} \Delta_\lambda(t)\, e^{ity}\, G(\sigma + it)\, dt = \int_0^{\infty} e^{-\sigma x} F(x)\, k_\lambda(y - x)\, dx - L \int_0^{\infty} e^{-(\sigma - 1)x}\, k_\lambda(y - x)\, dx. \tag{10.20}$$
Suppose $F(x) e^{-x} \le C$ holds for some constant $C > 0$. In this case, by (10.1),
$$L = \lim_{\sigma \to 1+} (\sigma - 1) \int_0^{\infty} F(x)\, e^{-\sigma x}\, dx \le C \lim_{\sigma \to 1+} (\sigma - 1) \int_0^{\infty} e^{-(\sigma - 1)x}\, dx = C. \tag{10.21}$$
Then the left hand side of (10.20) is at most
$$C \int_0^{\infty} e^{-(\sigma - 1)x}\, k_\lambda(y - x)\, dx \le C\pi,$$
and it is at least
$$-L \int_0^{\infty} e^{-(\sigma - 1)x}\, k_\lambda(y - x)\, dx \ge -L\pi \ge -C\pi$$
for all $\lambda > 0$ and $y \in \mathbb{R}$ by (10.12) and (10.21). Thus (10.3) holds for all $y$.

Conversely, suppose (10.3) is satisfied for each $y \ge y_0$. In this case, the modulus of the left hand side of (10.20) is at most $K + 1$ for $y \ge y_0$ and $1 < \sigma < 1 + \delta(y)$, for some $\delta(y) > 0$. Hence, by (10.12),
$$\int_0^{\infty} e^{-\sigma x} F(x)\, k_\lambda(y - x)\, dx \le K + 1 + L\pi.$$
Let $B$ denote the constant on the right hand side. By the monotone convergence theorem,
$$\int_0^{\infty} e^{-x} F(x)\, k_\lambda(y - x)\, dx \le B$$
for all $y \ge y_0$; thus
$$B \ge \int_y^{y+1} e^{-x} F(x)\, k_\lambda(y - x)\, dx \ge e^{-y-1} F(y) \int_{-1}^{0} k_\lambda(u)\, du$$
for $y \ge y_0$. It follows that
$$F(y)\, e^{-y} \le eB \Bigl\{ \int_{-1}^{0} k_\lambda(u)\, du \Bigr\}^{-1}$$
for $y \ge y_0$, and therefore $F(x) e^{-x} \le C$ holds with a suitable constant $C > 0$.
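Proof of Theorem 10.4. Suppose that $F(x) e^{-x} \ge c$ for all $x \ge x_0$ with some constant $c > 0$. We have first, $L \ge c$, by an analogue of (10.21). Next, by (10.20) and the $F(x) e^{-x}$ condition, we have
$$\int_{-2\lambda}^{2\lambda} \Delta_\lambda(t)\, e^{ity}\, G(\sigma + it)\, dt \ge (c - L) \int_0^{\infty} e^{-(\sigma - 1)x}\, k_\lambda(y - x)\, dx + \int_0^{x_0} e^{-(\sigma - 1)x} (F(x) e^{-x} - c)\, k_\lambda(y - x)\, dx$$
$$\ge (c - L) \int_0^{\infty} e^{-(\sigma - 1)x}\, k_\lambda(y - x)\, dx - R \int_0^{x_0} e^{-(\sigma - 1)x}\, k_\lambda(y - x)\, dx$$
for $R$ a positive constant and every $\lambda > 0$. Hence, by (10.11) and (10.12),
$$\liminf_{\sigma \to 1+} \int_{-2\lambda}^{2\lambda} \Delta_\lambda(t)\, e^{ity}\, G(\sigma + it)\, dt \ge (c - L) \int_{-\infty}^{y} k_\lambda(u)\, du - R \int_{y - x_0}^{y} k_\lambda(u)\, du \ge (c - L)\pi - \frac{R}{\lambda(y - x_0)}$$
for $y > x_0$. Let $\gamma = c/2$ and fix $y_0 = y_0(\lambda)$ $(> x_0)$ satisfying
$$\frac{c\pi}{2} > \frac{R}{\lambda(y_0 - x_0)}.$$
Then (10.4) holds uniformly for $y \ge y_0$.

Conversely, if there exist positive constants $\lambda$, $\gamma$, and $y_0$ such that (10.4) holds for all $y \ge y_0$ then, by (10.20),
$$\int_0^{\infty} e^{-\sigma x} F(x)\, k_\lambda(y - x)\, dx - L \int_0^{\infty} e^{-(\sigma - 1)x}\, k_\lambda(y - x)\, dx \ge (-L + \gamma/2)\pi$$
for every $y \ge y_0$ and $1 < \sigma < 1 + \eta(y)$ with $\eta(y) > 0$. Hence, by the monotone convergence theorem,
$$\int_0^{\infty} e^{-x} F(x)\, k_\lambda(y - x)\, dx \ge L \int_0^{\infty} k_\lambda(y - x)\, dx + (-L + \gamma/2)\pi \tag{10.22}$$
for $y \ge y_0$. If we assume also that $F(x) e^{-x} \le C$ for some constant $C > 0$ then the left hand side is bounded above by
$$C \int_{|y - x| \ge \delta} k_\lambda(y - x)\, dx + \int_{y - \delta}^{y + \delta} e^{-x} F(x)\, k_\lambda(y - x)\, dx$$
for $y \ge \delta > 0$. Applying (10.12) and (10.13), the left hand side of (10.22) is bounded above further by
$$\frac{2C}{\lambda\delta} + \pi e^{2\delta}\, e^{-(y + \delta)} F(y + \delta)$$
for any $\delta > 0$ and $y \ge \max\{\delta, y_0\}$. Choosing $\delta$ sufficiently large so that $2C/(\lambda\delta) < \gamma\pi/4$, i.e., taking $\delta > 8C/(\gamma\pi\lambda)$ and fixing it, (10.22) yields
$$\pi e^{2\delta} \liminf_{y \to \infty} e^{-(y + \delta)} F(y + \delta) \ge L\pi + \Bigl( -L + \frac{\gamma}{2} \Bigr)\pi - \frac{2C}{\lambda\delta} > \frac{\gamma\pi}{4},$$
and hence
$$\liminf_{x \to \infty} e^{-x} F(x) > \frac{\gamma e^{-2\delta}}{4} > 0.$$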
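Proof of Theorem 10.5. If $\lim_{x \to \infty} F(x) e^{-x} = L$ then $F(x) e^{-x} = L + r(x)$, where $r(x) \to 0$ as $x \to \infty$. Hence the right hand side of (10.20) equals
$$\int_0^{\infty} e^{-(\sigma - 1)x}\, r(x)\, k_\lambda(y - x)\, dx.$$
Since $\int_0^{\infty} |r(x)|\, k_\lambda(y - x)\, dx < \infty$, then, by the dominated convergence theorem, the right hand side of (10.20) has a limit as $\sigma \to 1+$. Hence, so does the left hand side of (10.20), and we have
$$\lim_{\sigma \to 1+} \int_{-2\lambda}^{2\lambda} \Delta_\lambda(t)\, e^{ity}\, G(\sigma + it)\, dt = \int_0^{\infty} r(x)\, k_\lambda(y - x)\, dx,$$
which shows (10.5). Also, since $r(x) \to 0$ as $x \to \infty$ and the mass of $k_\lambda(y - x)$ is concentrated in the region where $x$ is near $y$, a small argument shows that the last integral goes to 0 as $y \to \infty$. This establishes (10.6) for every $\lambda > 0$.

Conversely, assume that the limit (10.5) exists for all $\lambda > \lambda_0$ and $y \ge y_0(\lambda)$, and that (10.6) is satisfied for every $\lambda > \lambda_0$. Then, by (10.20) again,
$$\int_0^{\infty} e^{-x} F(x)\, k_\lambda(y - x)\, dx - L \int_0^{\infty} k_\lambda(y - x)\, dx = \lim_{\sigma \to 1+} \int_{-2\lambda}^{2\lambda} \Delta_\lambda(t)\, e^{ity}\, G(\sigma + it)\, dt$$
for $\lambda > \lambda_0$ and $y \ge y_0(\lambda)$. From (10.6),
$$\int_0^{\infty} e^{-x} F(x)\, k_\lambda(y - x)\, dx = L \int_0^{\infty} k_\lambda(y - x)\, dx + o_\lambda(1),$$
where $o_\lambda(1) \to 0$ as $y \to \infty$. The last integral tends to $\pi$ as $y \to \infty$ by (10.11). Thus
$$\int_0^{\infty} e^{-x} F(x)\, k_\lambda(y - x)\, dx = L\pi + o_\lambda(1) \tag{10.23}$$
for $\lambda > \lambda_0$. For $\delta_0 > 0$ to be specified later, we have, by (10.23),
$$\int_{y - \delta_0}^{y + \delta_0} e^{-x} F(x)\, k_\lambda(y - x)\, dx \le L\pi + o_\lambda(1).$$
Note that the left hand side is at least
$$e^{-(y + \delta_0)} F(y - \delta_0) \int_{y - \delta_0}^{y + \delta_0} k_\lambda(y - x)\, dx.$$
By (10.12) and (10.13),
$$\int_{y - \delta_0}^{y + \delta_0} k_\lambda(y - x)\, dx = \pi - \int_{|u| \ge \delta_0} k_\lambda(u)\, du \ge \pi \Bigl\{ 1 - \frac{2}{\pi\lambda\delta_0} \Bigr\} > 0$$
for $\lambda \ge \lambda_1$ with $\lambda_1$ sufficiently large. Therefore we obtain
$$e^{-(y - \delta_0)} F(y - \delta_0)\, e^{-2\delta_0}\, \pi \Bigl\{ 1 - \frac{2}{\pi\lambda\delta_0} \Bigr\} \le L\pi + o_\lambda(1)$$
for $\lambda \ge \lambda_1$. Given $\varepsilon > 0$, we first choose $\delta_0 > 0$ satisfying $e^{2\delta_0} < 1 + \varepsilon/2$, and next choose $\lambda_1$ so large that $\lambda_1 \ge \lambda_0$, $\lambda_1\delta_0 \ge 2$, and
$$\frac{e^{2\delta_0}}{1 - 2/(\pi\lambda_1\delta_0)} < 1 + \varepsilon.$$
Then we obtain
$$e^{-(y - \delta_0)} F(y - \delta_0) \le \frac{e^{2\delta_0}}{1 - 2/(\pi\lambda\delta_0)}\, (L + o_\lambda(1)) \le L(1 + \varepsilon) + o_\lambda(1)$$
for $\lambda \ge \lambda_1$ $(> \lambda_0)$. It follows that
$$\limsup_{y \to \infty} e^{-y} F(y) \le L(1 + \varepsilon),$$
and since $\varepsilon$ can be taken arbitrarily small,
$$\limsup_{y \to \infty} e^{-y} F(y) \le L. \tag{10.24}$$
Furthermore, by (10.24), the left hand side of (10.23) is at most
$$K \int_{|y - x| \ge \delta} k_\lambda(y - x)\, dx + \int_{y - \delta}^{y + \delta} e^{-x} F(x)\, k_\lambda(y - x)\, dx \le \frac{2K}{\lambda\delta} + e^{-y + \delta} F(y + \delta) \int_{y - \delta}^{y + \delta} k_\lambda(y - x)\, dx$$
for some constant $K > 0$ and every $\delta > 0$. Hence
$$\pi e^{-y + \delta} F(y + \delta) + \frac{2K}{\lambda\delta} \ge L\pi + o_\lambda(1)$$
for $\lambda > \lambda_0$ and $y \ge y_0(\lambda)$, or
$$e^{2\delta}\, e^{-y - \delta} F(y + \delta) \ge L + o_\lambda(1) - \frac{2K}{\pi\lambda\delta}.$$
It follows that
$$\liminf_{y \to \infty} e^{-y} F(y) \ge (L - 2K/(\pi\lambda\delta))\, e^{-2\delta}$$
for every $\lambda > \lambda_0$. Therefore, letting $\lambda \to \infty$, we obtain
$$\liminf_{y \to \infty} e^{-y} F(y) \ge L e^{-2\delta}$$
for every $\delta > 0$. Now letting $\delta \to 0+$, we get
$$\liminf_{y \to \infty} e^{-y} F(y) \ge L.$$
This bound and (10.24) complete the proof of Theorem 10.5.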
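Proof of Corollary 10.6. If, for every $\lambda > \lambda_0$, (10.7) holds uniformly for $y \ge y_0(\lambda)$, then by Cauchy’s convergence criterion, (10.5) holds for every $\lambda > \lambda_0$ and every $y \ge y_0(\lambda)$. Let $\ell(y) = \ell(y, \lambda)$ denote the limit function in (10.5). Then
$$\int_{-2\lambda}^{2\lambda} \Delta_\lambda(t)\, e^{ity}\, G(\sigma + it)\, dt$$
converges to $\ell(y)$ as $\sigma \to 1+$ uniformly for $y \ge y_0(\lambda)$. Hence, given $\varepsilon > 0$,
$$\Bigl| \int_{-2\lambda}^{2\lambda} \Delta_\lambda(t)\, e^{ity}\, G(\sigma + it)\, dt - \ell(y) \Bigr| < \varepsilon \tag{10.25}$$
for $1 < \sigma < 1 + \delta(\lambda)$ and $y \ge y_0(\lambda)$. Fixing $\sigma = \sigma_0 < 1 + \delta(\lambda)$ and noting that $G(\sigma_0 + it)$ is a continuous function of $t$, we have by the Riemann-Lebesgue lemma
$$\lim_{y \to \infty} \int_{-2\lambda}^{2\lambda} \Delta_\lambda(t)\, e^{ity}\, G(\sigma_0 + it)\, dt = 0.$$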
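It follows from (10.25) that
$$\limsup_{y \to \infty} \ell(y) \le \lim_{y \to \infty} \int_{-2\lambda}^{2\lambda} \Delta_\lambda(t)\, e^{ity}\, G(\sigma_0 + it)\, dt + \varepsilon = \varepsilon.$$
Similarly, $\liminf_{y \to \infty} \ell(y) \ge -\varepsilon$, whence $\lim_{y \to \infty} \ell(y) = 0$, i.e., (10.6) is satisfied for every $\lambda > \lambda_0$. Thus $\lim_{x \to \infty} e^{-x} F(x) = L$ by Theorem 10.5.

10.5. A W-I oscillatory example

Let $\lambda_1 > 0$ and $F(x)$ be defined on $[0, \infty)$ by
$$F(x) := \int_0^x (1 - \cos\{2\lambda_1 u\})\, e^u\, du.$$
We show that $F(x) e^{-x}$ does not have a limit at infinity and the Laplace transform of $F$ has poles at nonreal points of $\{\sigma = 1\}$. First, direct integration shows
$$F(x) = e^x - \frac{e^x \sin(2\lambda_1 x + \theta)}{\sqrt{1 + 4\lambda_1^2}} - \frac{4\lambda_1^2}{1 + 4\lambda_1^2},$$
whence
$$\liminf_{x \to \infty} F(x) e^{-x} = 1 - \frac{1}{\sqrt{1 + 4\lambda_1^2}}, \qquad \limsup_{x \to \infty} F(x) e^{-x} = 1 + \frac{1}{\sqrt{1 + 4\lambda_1^2}}.$$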
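The closed form for $F$ and the two limits can be checked numerically. In the sketch below the phase is taken to be $\theta = \arctan(1/(2\lambda_1))$; the visible text does not state $\theta$, so this value is an assumption, chosen so that $(\cos ax + a \sin ax)/(1 + a^2) = \sin(ax + \theta)/\sqrt{1 + a^2}$ with $a = 2\lambda_1$. Function names and sampling parameters are illustrative.

```python
import math

def F_closed(x, lam1):
    """Closed form of F with the assumed phase theta = atan(1 / (2 lam1))."""
    a = 2.0 * lam1
    theta = math.atan2(1.0, a)  # sin(theta) = 1/sqrt(1+a^2), cos(theta) = a/sqrt(1+a^2)
    root = math.sqrt(1.0 + a * a)
    return math.exp(x) - math.exp(x) * math.sin(a * x + theta) / root - a * a / (1.0 + a * a)

def F_quadrature(x, lam1, steps=200000):
    """Midpoint rule for the defining integral of (1 - cos(2 lam1 u)) e^u over [0, x]."""
    h = x / steps
    return h * sum(
        (1.0 - math.cos(2.0 * lam1 * (i + 0.5) * h)) * math.exp((i + 0.5) * h)
        for i in range(steps)
    )

def envelope(lam1, x_start=20.0, x_end=40.0, step=0.001):
    """Sampled minimum and maximum of F(x) e^{-x} on [x_start, x_end]."""
    n = int((x_end - x_start) / step)
    values = [F_closed(x_start + i * step, lam1) * math.exp(-(x_start + i * step)) for i in range(n)]
    return min(values), max(values)

if __name__ == "__main__":
    print(F_quadrature(5.0, 1.0), F_closed(5.0, 1.0))
    print(envelope(1.0), 1 - 1 / math.sqrt(5), 1 + 1 / math.sqrt(5))
```

For $\lambda_1 = 1$ the sampled extremes should sit near $1 \mp 1/\sqrt{5}$, matching the displayed lim inf and lim sup.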
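The Laplace transform of $F$ can be written as
$$\int_0^{\infty} e^{-sx} F(x)\, dx = \frac{1}{s} \int_0^{\infty} e^{-(s - 1)x} (1 - \cos 2\lambda_1 x)\, dx = \frac{1}{s - 1} + G(s),$$
for $\sigma > 1$, where
$$G(s) := -\frac{1}{s} - \frac{1}{2s} \Bigl\{ \frac{1}{s - 1 - 2i\lambda_1} + \frac{1}{s - 1 + 2i\lambda_1} \Bigr\}.$$
It is easy to see that the condition (10.1) is satisfied here. To examine the W-I expression associated with $F$, let
$$I(\lambda, \sigma, y) := \int_{-2\lambda}^{2\lambda} \Delta_\lambda(t)\, e^{ity}\, G(\sigma + it)\, dt, \qquad \lambda > 0,\ \sigma > 1.$$
We shall show that $\lim_{\sigma \to 1+} I(\lambda, \sigma, y)$ exists for all $\lambda > 0$ and $y > 0$ and
$$\lim_{\sigma \to 1+} I(\lambda, \sigma, y) = \begin{cases} o_\lambda(1), & 0 < \lambda \le \lambda_1, \\[4pt] -\pi \Bigl( 1 - \dfrac{\lambda_1}{\lambda} \Bigr) \dfrac{\sin(2\lambda_1 y + \theta)}{\sqrt{1 + 4\lambda_1^2}} + o_\lambda(1), & \lambda > \lambda_1, \end{cases} \tag{10.26}$$
where $o_\lambda(1) \to 0$ as $y \to \infty$. Therefore (10.5) holds for all $\lambda > 0$ and $y > 0$, but (10.6) holds only for $0 < \lambda \le \lambda_1$, and this limit does not exist for any $\lambda > \lambda_1$.

Using the identity
$$\frac{1}{s(s - a)} = \frac{1}{a} \Bigl( \frac{1}{s - a} - \frac{1}{s} \Bigr)$$
and a small manipulation, we find
$$G(s) = -\frac{4\lambda_1^2}{1 + 4\lambda_1^2} \cdot \frac{1}{s} - \frac{1}{2(1 + 2i\lambda_1)} \cdot \frac{1}{s - 1 - 2i\lambda_1} - \frac{1}{2(1 - 2i\lambda_1)} \cdot \frac{1}{s - 1 + 2i\lambda_1}.$$
Then $I(\lambda, \sigma, y) = I_1(\lambda, \sigma, y) + I_2(\lambda, \sigma, y) + I_3(\lambda, \sigma, y)$, where
$$I_1(\lambda, \sigma, y) := -\frac{4\lambda_1^2}{1 + 4\lambda_1^2} \int_{-2\lambda}^{2\lambda} \frac{\Delta_\lambda(t)\, e^{ity}\, dt}{\sigma + it},$$
$$I_2(\lambda, \sigma, y) := -\frac{1}{2(1 + 2i\lambda_1)} \int_{-2\lambda}^{2\lambda} \frac{\Delta_\lambda(t)\, e^{ity}\, dt}{\sigma - 1 + i(t - 2\lambda_1)},$$
and
$$I_3(\lambda, \sigma, y) := -\frac{1}{2(1 - 2i\lambda_1)} \int_{-2\lambda}^{2\lambda} \frac{\Delta_\lambda(t)\, e^{ity}\, dt}{\sigma - 1 + i(t + 2\lambda_1)}.$$
By the continuity of the integrand, $\lim_{\sigma \to 1+} I_1(\lambda, \sigma, y)$ exists for all $y \ge 0$ and $\lambda > 0$, and, by the Riemann-Lebesgue lemma, this function tends to 0 as $y \to \infty$. Similarly, $\lim_{\sigma \to 1+} I_{2,3}(\lambda, \sigma, y)$ each exist for $0 < \lambda < \lambda_1$, $y \ge 0$ and each tends to 0 as $y \to \infty$ for $0 < \lambda < \lambda_1$.

It remains to study $\lim_{\sigma \to 1+} I_{2,3}(\lambda, \sigma, y)$ for $\lambda \ge \lambda_1$, where the integrands are not so well-behaved. First, for $\lambda > \lambda_1$, we evaluate $\lim_{\sigma \to 1+} I_2(\lambda, \sigma, y)$ by splitting the integration interval $[-2\lambda, 2\lambda]$ into parts
$$[-2\lambda, 2\lambda_1 - \varepsilon] \cup [2\lambda_1 - \varepsilon, 2\lambda_1 + \varepsilon] \cup [2\lambda_1 + \varepsilon, 2\lambda]$$
for some $\varepsilon$ satisfying $0 < \varepsilon < \min\{2\lambda - 2\lambda_1, 2\lambda_1\}$. The integrands of $I_2(\lambda, \sigma, y)$ over the first and third subintervals are continuous and, as before, have limits as $\sigma \to 1+$ for $y \ge 0$ and these limits tend to zero as $y \to \infty$. By changing the integration variable, the remaining integral becomes
$$\int_{2\lambda_1 - \varepsilon}^{2\lambda_1 + \varepsilon} \frac{\Delta_\lambda(t)\, e^{ity}\, dt}{\sigma - 1 + i(t - 2\lambda_1)} = \frac{e^{2i\lambda_1 y}}{2} \Bigl( 1 - \frac{\lambda_1}{\lambda} \Bigr) \int_{-\varepsilon}^{\varepsilon} \frac{e^{iuy}\, du}{\sigma - 1 + iu} - \frac{e^{2i\lambda_1 y}}{4\lambda} \int_{-\varepsilon}^{\varepsilon} \frac{u\, e^{iuy}\, du}{\sigma - 1 + iu}. \tag{10.27}$$
Let $A$ and $B$ denote the first and second integrals respectively on the right side of (10.27). By the bounded convergence theorem, $B$ has the limit
$$-i \int_{-\varepsilon}^{\varepsilon} e^{iuy}\, du = -\frac{e^{i\varepsilon y} - e^{-i\varepsilon y}}{y}$$
as $\sigma \to 1+$ for $y > 0$, and this tends to zero as $y \to \infty$. Also,
$$A = \int_0^{\varepsilon} \Bigl\{ \frac{e^{iuy}}{\sigma - 1 + iu} + \frac{e^{-iuy}}{\sigma - 1 - iu} \Bigr\}\, du = 2 \int_0^{\varepsilon} \frac{(\sigma - 1) \cos uy}{(\sigma - 1)^2 + u^2}\, du + 2 \int_0^{\varepsilon} \frac{u \sin uy}{(\sigma - 1)^2 + u^2}\, du =: 2A_1 + 2A_2. \tag{10.28}$$
A further change of integration variable makes $A_1$ into
$$\int_0^{\varepsilon/(\sigma - 1)} \frac{\cos(\{\sigma - 1\} v y)}{1 + v^2}\, dv,$$
and hence, as $\sigma \to 1+$, it has a limit
$$\int_0^{\infty} \frac{dv}{1 + v^2} = \frac{\pi}{2}$$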
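by the dominated convergence theorem. Then, by the bounded convergence theorem again, $A_2$ has a limit
$$\int_0^{\varepsilon} \frac{\sin uy}{u}\, du = \int_0^{\varepsilon y} \frac{\sin v}{v}\, dv = \frac{\pi}{2} + o(1)$$
as $\sigma \to 1+$. Therefore, $\lim_{\sigma \to 1+} I_2(\lambda, \sigma, y)$ exists for $\lambda > \lambda_1$ and $y > 0$ and
$$\lim_{\sigma \to 1+} I_2(\lambda, \sigma, y) = -\frac{e^{2i\lambda_1 y}}{2(1 + 2i\lambda_1)} \Bigl( 1 - \frac{\lambda_1}{\lambda} \Bigr) (\pi + o(1)).$$
Similarly, to evaluate $\lim_{\sigma \to 1+} I_3(\lambda, \sigma, y)$, we split the integration interval into parts
$$[-2\lambda, -2\lambda_1 - \varepsilon] \cup [-2\lambda_1 - \varepsilon, -2\lambda_1 + \varepsilon] \cup [-2\lambda_1 + \varepsilon, 2\lambda].$$
We can show that
$$\lim_{\sigma \to 1+} I_3(\lambda, \sigma, y) = -\frac{e^{-2i\lambda_1 y}}{2(1 - 2i\lambda_1)} \Bigl( 1 - \frac{\lambda_1}{\lambda} \Bigr) (\pi + o(1)).$$
Therefore, $\lim_{\sigma \to 1+} I(\lambda, \sigma, y)$ exists for all $\lambda > \lambda_1$ and $y > 0$ and
$$\lim_{\sigma \to 1+} I(\lambda, \sigma, y) = -\pi \Bigl( 1 - \frac{\lambda_1}{\lambda} \Bigr) \frac{\sin(2\lambda_1 y + \theta)}{\sqrt{1 + 4\lambda_1^2}} + o_\lambda(1).$$
Finally, for $\lambda = \lambda_1$, to evaluate $\lim_{\sigma \to 1+} I_2(\lambda_1, \sigma, y)$, we split the integration interval $[-2\lambda_1, 2\lambda_1]$ into $[-2\lambda_1, 2\lambda_1 - \varepsilon] \cup [2\lambda_1 - \varepsilon, 2\lambda_1]$ with $\varepsilon$ satisfying $0 < \varepsilon < 2\lambda_1$. The integral over the first subinterval has a limit as $\sigma \to 1+$ for all $y \ge 0$, and this limit tends to zero as $y \to \infty$. By changing the integration variable, the integral over the rest of the interval becomes
$$\int_{-\varepsilon}^{0} \frac{1}{2} \Bigl( 1 - \frac{2\lambda_1 + u}{2\lambda_1} \Bigr) \frac{e^{2i\lambda_1 y + iuy}\, du}{\sigma - 1 + iu} = -\frac{e^{2i\lambda_1 y}}{4\lambda_1} \int_{-\varepsilon}^{0} \frac{u\, e^{iuy}\, du}{\sigma - 1 + iu}.$$
By the bounded convergence theorem, the last integral has a limit
$$-i \int_{-\varepsilon}^{0} e^{iuy}\, du = -\frac{1 - e^{-i\varepsilon y}}{y}$$
as $\sigma \to 1+$, and this limit also tends to zero as $y \to \infty$. Hence $\lim_{\sigma \to 1+} I_2$ exists and is $o(1)$ as $y \to \infty$. Similarly, $\lim_{\sigma \to 1+} I_3(\lambda_1, \sigma, y)$ exists and is $o(1)$ as $y \to \infty$. Therefore, $\lim_{\sigma \to 1+} I(\lambda_1, \sigma, y)$ exists and is $o(1)$ as $y \to \infty$. This completes the proof of (10.26).

10.6. Notes

§10.2. The Wiener-Ikehara tauberian theorem is discussed in some detail in [Ko04], [Ha49], and [Te95]. The proof of the W-I theorem we give is based on one of S. Bochner (see [Ch68], Notes). Converse forms of the W-I theorem, including upper and lower bound estimates, are given in [Zh14]. A form of W-I using the hypothesis (10.8) is given in [BD69]. See also the Notes for Chapter 13.

§10.5. The example of this section is given in [Be37] and [BD69].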
CHAPTER 11

Chebyshev Bounds – Analytic Methods

Of course you can do better – Your mother

Summary. Analytic methods are applied to establish upper and lower bounds for Chebyshev’s function $\psi(x)$ under hypotheses weaker than those of Chapter 9. In the next chapter, we show the optimality of these results.

11.1. Introduction

Let $N(x) = Ax + xE(x)$ $(A > 0)$ denote the integer counting function of a g-number system $\mathcal{N}$. It was conjectured by the first author that Chebyshev bounds for $\psi(x)$ could be established under the assumption
$$\int_1^{\infty} x^{-1} |E(x)|\, dx < \infty. \tag{11.1}$$
Later, the second author established the Chebyshev bounds given in Theorem 9.9 using the stronger hypothesis
$$\int_1^{\infty} x^{-1} \sup_{y \ge x} |E(y)|\, dx < \infty.$$
The condition (11.1) was shown to be too weak by an example of J.-P. Kahane. Subsequently, Chebyshev bounds were established by J. Vindas and the authors under various auxiliary conditions on $E(x)$ (see Notes). Here we show the bounds under an average hypothesis on $E(x)$. In the next chapter, we shall show that these results are optimal in the class of systems $\mathcal{N}$ which satisfy (11.1) and a pointwise bound on $E(x) \log x$.

Theorem 11.1. Suppose that the counting function $N(x)$ of integers of a g-number system $\mathcal{N}$ satisfies both (11.1) and
$$\int_1^x E(u) \log u\, du \ll x, \qquad x \ge 1. \tag{11.2}$$
Then the Chebyshev bounds $x \ll \psi(x) \ll x$ hold.

Our proof is based on the application of W-I theorems to the Mellin transform $\hat{\psi}(s)$. Key ingredients in the argument are estimates of $L^1$ norms of derivatives of the Fejér kernel given in Lemma 10.10 and the treatment of $I_3(y)$ (given in §11.4) by a concrete version of Wiener’s theorem on division algebras.
11.2. Wiener-Ikehara setup

We begin with a few preliminaries. By (11.1) and Proposition 7.7, we have $N(x) \sim Ax$. The weak estimate
\[ \psi(x) := \sum_{p^\nu \le x}\log p \le N(x)\log x \sim Ax\log x \]
follows, since every power $p^\nu$ of a g-prime $p$ is a g-integer. This bound ensures convergence of the Mellin transform
\[ \hat\psi(s) := \int_1^\infty x^{-s}\,d\psi(x) = s\int_1^\infty x^{-s-1}\psi(x)\,dx = -\frac{\zeta'(s)}{\zeta(s)} \]
in the half plane $\{s = \sigma + it,\ \sigma > 1\}$. Here, as usual, $\zeta(s)$ is the zeta function associated with the g-number system $\mathcal{N}$. We make a change of variables in the last integral to bring it to our Laplace form for the W-I theorems, writing
\[ -\frac{\zeta'(s)}{s\,\zeta(s)} = \int_0^\infty e^{-su}\psi(e^u)\,du. \]
We claim that this transform has a right hand residue 1 at $s = 1$. Indeed, since $N(x) \sim Ax$, as $\sigma \to 1+$,
\[ \zeta(\sigma) = \sigma\int_1^\infty x^{-\sigma-1}N(x)\,dx \sim \frac{A}{\sigma-1}, \]
\[ \zeta'(\sigma) = \int_1^\infty x^{-\sigma-1}N(x)\,dx - \sigma\int_1^\infty x^{-\sigma-1}(\log x)N(x)\,dx \sim \frac{-A}{(\sigma-1)^2}; \]
thus
\[ \lim_{\sigma\to1+}(\sigma-1)\int_0^\infty e^{-\sigma u}\psi(e^u)\,du = 1, \]
and condition (10.1) is satisfied with $L = 1$. Following (10.2), for $\sigma > 1$, write
\[ G(s) := \int_0^\infty e^{-su}\psi(e^u)\,du - \frac{1}{s-1} = \int_0^\infty e^{-su}\{\psi(e^u) - e^u\}\,du = -\frac{1}{s}\frac{\zeta'(s)}{\zeta(s)} - \frac{1}{s-1} = -\frac{1}{s} - \frac{1}{s}\frac{d}{ds}\log\{(s-1)\zeta(s)\}. \tag{11.3} \]
Let $\lambda$ be a positive number, to be chosen later. (Spoiler alert: unlike the proof of the PNT, here $\lambda$ will be bounded.) Following the W-I method of the preceding chapter, multiply through (11.3) by $e^{ity}$ and the smoothing factor
\[ \Delta_\lambda(t) := \frac{1}{2}\Bigl(1 - \frac{|t|}{2\lambda}\Bigr)^{+}. \]
Then integrate both sides of the resulting formula over $-2\lambda < t < 2\lambda$ and exchange the integration order (which is valid by absolute convergence). For $\sigma > 1$, we find
\[ \int_{-2\lambda}^{2\lambda}\Delta_\lambda(t)e^{ity}G(\sigma+it)\,dt = \int_0^\infty e^{-\sigma u}\{\psi(e^u) - e^u\}k_\lambda(y-u)\,du = -\int_0^\infty e^{-\sigma u}k_\lambda(y-u)\,du - I_\sigma(y), \tag{11.4} \]
where
\[ k_\lambda(x) := \int_{-2\lambda}^{2\lambda}\frac{1}{2}\Bigl(1 - \frac{|t|}{2\lambda}\Bigr)e^{itx}\,dt \]
is the Fejér kernel for $\mathbb{R}$ and (still with $\sigma > 1$),
\[ I_\sigma(y) := \int_{-2\lambda}^{2\lambda}\Delta_\lambda(t)\,\frac{e^{ity}}{s}\,\frac{d}{ds}\log\{(s-1)\zeta(s)\}\,dt. \]
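As a numerical sanity check on the kernel just defined, the following Python sketch compares a quadrature of the defining integral of $k_\lambda$ with the closed form $\sin^2(\lambda x)/(\lambda x^2)$. The closed form is not quoted from this section; it is what a direct integration of the triangular weight gives, and it is stated here as an assumption of the sketch. The function names `delta`, `fejer_numeric` and `fejer_closed` are ours.

```python
import math


def delta(t, lam):
    """Smoothing factor (1/2) * (1 - |t| / (2 * lam))^+ from the text."""
    return 0.5 * max(0.0, 1.0 - abs(t) / (2.0 * lam))


def fejer_numeric(x, lam, n=20000):
    """Composite Simpson approximation of the integral of
    delta(t, lam) * exp(i * t * x) over [-2 * lam, 2 * lam].

    The weight delta is even in t, so the imaginary part vanishes and only
    cos(t * x) needs to be integrated.  n is a multiple of 4, so the kink
    of delta at t = 0 falls on a Simpson panel boundary.
    """
    a = -2.0 * lam
    h = 4.0 * lam / n
    total = 0.0
    for j in range(n + 1):
        t = a + j * h
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += weight * delta(t, lam) * math.cos(t * x)
    return total * h / 3.0


def fejer_closed(x, lam):
    """Assumed closed form sin^2(lam * x) / (lam * x^2); value lam at x = 0."""
    if x == 0.0:
        return lam
    return math.sin(lam * x) ** 2 / (lam * x * x)


if __name__ == "__main__":
    for lam in (0.5, 2.0):
        for x in (0.0, 0.3, 1.7, -4.2):
            print(lam, x, fejer_numeric(x, lam), fejer_closed(x, lam))
```

The closed form makes the nonnegativity $k_\lambda \ge 0$, which is used repeatedly below, visible at a glance.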
Now let $\sigma \to 1+$ in (11.4). Since $k_\lambda \ge 0$, each of the integrals
\[ \int_0^\infty e^{-\sigma u}\psi(e^u)k_\lambda(y-u)\,du, \qquad \int_0^\infty e^{-\sigma u}e^u k_\lambda(y-u)\,du, \qquad \int_0^\infty e^{-\sigma u}k_\lambda(y-u)\,du \]
has a limit by the monotone convergence theorem. Also, since $\int_0^\infty k_\lambda(y-u)\,du < \pi$, by (10.12), the last two integrals have finite limits for each $y$. (It remains to be shown that the limit of the first integral also is finite.) Hence $I_\sigma(y)$ has a limit as well. It follows that
\[ \lim_{\sigma\to1+}\int_{-2\lambda}^{2\lambda}\Delta_\lambda(t)e^{ity}G(\sigma+it)\,dt = -\int_0^\infty e^{-u}k_\lambda(y-u)\,du - \lim_{\sigma\to1+}I_\sigma(y). \]
The integral on the right side of the last formula can be rewritten as
\[ \int_{-\infty}^{y/2}e^{v-y}k_\lambda(v)\,dv + \int_{y/2}^{y}e^{v-y}k_\lambda(v)\,dv < \pi e^{-y/2} + \int_{y/2}^{y}k_\lambda(v)\,dv \to 0 \]
as $y \to \infty$. Thus we see that
\[ \lim_{\sigma\to1+}\int_{-2\lambda}^{2\lambda}\Delta_\lambda(t)e^{ity}G(\sigma+it)\,dt = o(1) - \lim_{\sigma\to1+}I_\sigma(y) \tag{11.5} \]
as $y \to \infty$.
Our proof of the Chebyshev bounds will depend on the size of $\lim_{\sigma\to1+}I_\sigma(y)$: if it is bounded, then $\psi(x) \ll x$ by Theorem 10.1; if it is sufficiently small, then $\psi(x) \gg x$ by Theorem 10.4. The analysis of $I_\sigma(y)$ is more delicate than that of the classical W-I proof of the PNT, mainly because, here, $(d/ds)\log\{(s-1)\zeta(s)\}$ need not have a continuous extension to the closed half plane $\sigma \ge 1$.

11.3. A first decomposition

It is convenient to express $I_\sigma(y)$ in terms of a more tractable function. We begin by examining $(s-1)\zeta(s)$ near $s = 1$. By (11.1),
\[ H := \int_1^\infty x^{-1}|E(x)|\,dx = \int_0^\infty |E(e^u)|\,du < \infty. \tag{11.6} \]
Also, we set
\[ g(s) := \frac{1}{A}\int_1^\infty x^{-s}E(x)\,dx = \frac{\zeta(s)}{As} - \frac{1}{s-1}. \tag{11.7} \]
The integral for $g$ converges uniformly for $\sigma \ge 1$, and defines a continuous function there. Since
\[ (s-1)\zeta(s) = As + (s-1)s\int_1^\infty x^{-s}E(x)\,dx = A\bigl(s + (s-1)s\,g(s)\bigr), \tag{11.8} \]
$(s-1)\zeta(s)$ has a continuous extension to the half plane $\sigma \ge 1$, and
\[ \frac{(s-1)\zeta(s)}{A} \to 1 \quad\text{as } s \to 1,\ \sigma \ge 1. \]
Thus there exists a number $\eta_1 > 0$ such that
\[ \Bigl|\frac{(s-1)\zeta(s)}{A} - 1\Bigr| \le \frac{1}{2} \quad\text{for } |s-1| \le \eta_1,\ \sigma \ge 1. \]
For $s$ in this semidisc, we can write
\[ \log\{(s-1)\zeta(s)\} = \log A + \log\{1 + (s-1)(1 + s\,g(s))\}, \]
and, differentiating, obtain
\[ \frac{d}{ds}\log\{(s-1)\zeta(s)\} = \frac{1 + (2s-1)g(s) + s(s-1)g'(s)}{1 + (s-1)(1 + s\,g(s))}. \tag{11.9} \]
It follows, for $0 < \lambda \le \eta_1/4$ and $1 < \sigma \le 1 + \eta_1/2$, that
\[ I_\sigma(y) = \int_{-2\lambda}^{2\lambda}\Delta_\lambda(t)\,\frac{e^{ity}}{s}\,\frac{1 + (2s-1)g(s)}{1 + (s-1)(1 + s\,g(s))}\,dt + \int_{-2\lambda}^{2\lambda}\Delta_\lambda(t)\,\frac{e^{ity}(s-1)g'(s)}{1 + (s-1)(1 + s\,g(s))}\,dt := I_{1,\sigma}(y) + I_{2,\sigma}(y). \tag{11.10} \]
The integrand of $I_{1,\sigma}(y)$ is continuous on the closed semidisc, and so
\[ \lim_{\sigma\to1+}I_{1,\sigma}(y) = \int_{-2\lambda}^{2\lambda}\Delta_\lambda(t)\,\frac{e^{ity}}{1+it}\,\frac{1 + (1+2it)g(1+it)}{1 + it\{1 + (1+it)g(1+it)\}}\,dt \]
exists and is finite for each $y$. By the Riemann-Lebesgue lemma,
\[ \lim_{\sigma\to1+}I_{1,\sigma}(y) = o(1) \tag{11.11} \]
as $y \to \infty$. Since $I_\sigma(y)$ has a limit and $I_{1,\sigma}(y)$ has a finite limit as $\sigma \to 1+$, it follows that $\lim_{\sigma\to1+}I_{2,\sigma}(y)$ exists for each $y$. It remains to study the last limit.

11.4. Further decomposition of $I_{2,\sigma}(y)$

It is a nuisance to deal with many factors in the integrand of $I_{2,\sigma}(y)$ moving at once as $\sigma \to 1+$, so we write
\[ \frac{s-1}{1 + (s-1)\{1 + s\,g(s)\}} = \frac{it}{(1+it)\{1 + it\,g(s)\}} + (\sigma-1)R(s), \tag{11.12} \]
where we have (after a bit of algebra)
\[ R(s) := \frac{1 - it(s-1)g(s)}{s(1+it)\,(1 + it\,g(s))\,(1 + (s-1)g(s))}. \]
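The "bit of algebra" behind (11.12) can be spot-checked numerically. The Python sketch below evaluates both sides of (11.12) at a few arbitrary complex points, treating the value $g(s)$ as a free complex parameter `g`; this is an assumption of the sketch, justified because (11.12) is a rational identity in $s$, $t$ and $g(s)$. The names `lhs` and `rhs` are ours.

```python
def lhs(s, g):
    """Left side of (11.12): (s - 1) / (1 + (s - 1) * (1 + s * g))."""
    return (s - 1) / (1 + (s - 1) * (1 + s * g))


def rhs(s, g):
    """Right side of (11.12): it / ((1 + it)(1 + it g)) + (sigma - 1) * R(s)."""
    sigma, t = s.real, s.imag
    it = 1j * t
    big_r = (1 - it * (s - 1) * g) / (
        s * (1 + it) * (1 + it * g) * (1 + (s - 1) * g)
    )
    return it / ((1 + it) * (1 + it * g)) + (sigma - 1) * big_r


if __name__ == "__main__":
    # Sample points with sigma > 1 and small |t|, as in the semidisc D.
    samples = [
        (complex(1.2, 0.3), complex(0.4, -0.1)),
        (complex(1.05, -0.2), complex(-0.7, 0.25)),
        (complex(1.5, 0.0), complex(0.3, 0.3)),
    ]
    for s, g in samples:
        print(s, g, abs(lhs(s, g) - rhs(s, g)))
```

The printed differences should be at the level of floating-point rounding, which is consistent with (11.12) holding exactly.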
The definition of g(s) and (11.6) give |g(s)| ≤ H/A and hence |itg(s)| ≤ 1/2 for |s − 1| ≤ η2 , σ ≥ 1 for some constant η2 > 0. If we further require that 0 < η2 ≤ min{η1 , 1}, then R(s) is well-defined in the semidisc D := {s : σ ≥ 1, |s − 1| ≤ η2 }. In this section, we shall approximate I2,σ (y) using (11.12) and applying a geometrical series device of Wiener to treat 1/{1 + itg(s)}.
Inserting (11.12) into the integrand of $I_{2,\sigma}(y)$ yields, for $0 < \lambda \le \eta_2/4$ and $1 < \sigma \le 1 + \eta_2/2$,
\[ I_{2,\sigma}(y) = \int_{-2\lambda}^{2\lambda}\Delta_\lambda(t)\,e^{ity}g'(s)\,\frac{it}{(1+it)(1 + it\,g(s))}\,dt + \int_{-2\lambda}^{2\lambda}\Delta_\lambda(t)\,e^{ity}(\sigma-1)g'(s)R(s)\,dt := I_{3,\sigma}(y) + I_{4,\sigma}(y). \tag{11.13} \]
The function $R(s)$ is continuous on the compact semidisc $D$ and bounded there by some constant $R$, say. It follows that
\[ |I_{4,\sigma}(y)| \le R\int_{-2\lambda}^{2\lambda}\Delta_\lambda(t)(\sigma-1)|g'(s)|\,dt, \qquad y \in \mathbb{R}. \]
Recalling the definition of $g(s)$, we have
\[ (\sigma-1)|g'(s)| \le \frac{1}{A}\int_1^\infty x^{-\sigma}(\sigma-1)|E(x)|\log x\,dx. \]
Note that
\[ x^{-\sigma}(\sigma-1)|E(x)|\log x = \frac{(\sigma-1)\log x}{\exp\{(\sigma-1)\log x\}}\,x^{-1}|E(x)| \le x^{-1}|E(x)|, \]
which is integrable by the hypothesis (11.1), and $x^{-(\sigma-1)}(\sigma-1)\log x \to 0$ at each point $x \ge 1$ as $\sigma \to 1+$. By the dominated convergence theorem, the last integral tends to 0 as $\sigma \to 1+$. Therefore $(\sigma-1)|g'(s)| \to 0$ as $\sigma \to 1+$ uniformly for all $t \in \mathbb{R}$, and so
\[ \lim_{\sigma\to1+}I_{4,\sigma}(y) = 0. \tag{11.14} \]
It remains to study
\[ I_3(y) := \lim_{\sigma\to1+}I_{3,\sigma}(y); \]
this limit exists since $I_{2,\sigma}(y)$ and $I_{4,\sigma}(y)$ have limits as $\sigma \to 1+$. Since $|it\,g(s)| \le 1/2$ for $s \in D$, we can write
\[ \frac{it}{(1+it)(1 + it\,g(s))} = \frac{it}{1+it}\sum_{\nu\ge0}(-1)^\nu(it)^\nu g(s)^\nu. \]
Thus
\[ I_{3,\sigma}(y) = \sum_{\nu\ge0}(-1)^\nu J_{\nu,\sigma}(y), \]
where
\[ J_{\nu,\sigma}(y) = \int_{-2\lambda}^{2\lambda}\Delta_\lambda(t)\,e^{ity}g'(s)\,\frac{(it)^{\nu+1}}{1+it}\,g(s)^\nu\,dt. \tag{11.15} \]
We shall evaluate $J_{\nu,\sigma}$ by expressing it as an additive convolution of $L^1$ functions. We have
\[ k_\lambda^{(\nu+1)}(y) = \int_{-2\lambda}^{2\lambda}\Delta_\lambda(t)(it)^{\nu+1}e^{ity}\,dt, \]
the $(\nu+1)$st derivative of the Fejér kernel, and for $\sigma \ge 1$, define
\[ G_\sigma(u) := A^{-1}e^{-(\sigma-1)u}E(e^u), \qquad G^D_\sigma(u) := -u\,G_\sigma(u), \qquad Z(u) := e^{-u}, \]
the last three functions with support in $[0, \infty)$. We commit a small abuse of notation by writing here the Fourier transforms of the functions as
\[ \hat G_\sigma(t) = \frac{1}{A}\int_0^\infty e^{-itu}e^{-(\sigma-1)u}E(e^u)\,du = g(s), \qquad \sigma \ge 1, \]
\[ \widehat{G^D_\sigma}(t) = -\frac{1}{A}\int_0^\infty e^{-itu}e^{-(\sigma-1)u}\,u\,E(e^u)\,du = g'(s), \qquad \sigma > 1, \]
and
\[ \hat Z(t) = \int_0^\infty e^{-itu}e^{-u}\,du = \frac{1}{1+it}. \]
Below and in the next section, we make another exception: we write '$*$' for additive convolution. By direct calculation or familiar Fourier relations, we obtain the representation
\[ J_{\nu,\sigma}(y) = \bigl(k_\lambda^{(\nu+1)} * G^D_\sigma * G_\sigma^{*\nu} * Z\bigr)(y), \qquad \sigma > 1. \tag{11.16} \]
11.5. Chebyshev bounds

Using the last formula and the bounds (10.16) for $L^1$-norms of $k_\lambda^{(\nu)}(x)$, we shall derive the inequality (11.19) (below) for $I_3(y)$. This, in turn, will yield an upper bound for $\psi(x)/x$. Afterward, with a further restriction on $\lambda$, we sharpen the estimate of $I_3(y)$ to establish a lower bound. Here, we introduce for the first time the hypothesis (11.2): there is a number $B > 0$ such that
\[ \Bigl|\,x^{-1}\int_1^x E(u)\log u\,du\,\Bigr| \le B, \qquad 1 \le x < \infty. \tag{11.17} \]
Our first task is to study $J_{\nu,\sigma}(y)$ $(\nu = 0, 1, 2, \dots)$ and estimate limits as $\sigma \to 1+$. To overcome convergence problems of (11.16) at $\sigma = 1+$, we combine $G^D_\sigma$ with $Z$: for $\sigma \ge 1$ let
\[ m_\sigma(u) := -\frac{1}{A}\int_0^u e^{-(u-v)}e^{-(\sigma-1)v}\,v\,E(e^v)\,dv. \]
Then $m_\sigma(u) = (Z * G^D_\sigma)(u)$, and, since $e^{-(\sigma-1)v}$ is decreasing in $v$, we find from (11.17) by integration by parts that
\[ |m_\sigma(u)| \le B/A \quad\text{for } \sigma \ge 1,\ u \ge 0. \tag{11.18} \]
By the dominated convergence theorem, $m_1(u) = \lim_{\sigma\to1+}m_\sigma(u)$. Also, by (11.6), $\int_0^\infty |G_1(u)|\,du = H/A$. Induction on $\nu$ shows that $\int_0^\infty |G_1|^{*\nu}(u)\,du = (H/A)^\nu$; hence $\int_0^\infty |G_1^{*\nu}(u)|\,du \le (H/A)^\nu$. It follows that
\[ (G_\sigma^{*\nu} * Z * G^D_\sigma)(u) = (G_\sigma^{*\nu} * m_\sigma)(u) \to (G_1^{*\nu} * m_1)(u) \]
as $\sigma \to 1+$, again by the dominated convergence theorem, and, with (11.18),
\[ |(G_\sigma^{*\nu} * Z * G^D_\sigma)(u)|,\ |(G_1^{*\nu} * m_1)(u)| \le \frac{BH^\nu}{A^{\nu+1}}. \]
Combining the preceding bounds for $(G_\sigma^{*\nu} * Z * G^D_\sigma)(u)$ with (10.16), we see that
\[ J_{\nu,\sigma}(y) = \int_{-\infty}^{y}(G^D_\sigma * G_\sigma^{*\nu} * Z)(y-t)\,k_\lambda^{(\nu+1)}(t)\,dt \]
is absolutely convergent. By one last application of the dominated convergence theorem, we conclude that $J_\nu(y) = \lim_{\sigma\to1+}J_{\nu,\sigma}(y)$ exists. Moreover, with (10.16) again, we have
\[ |J_\nu(y)| \le \frac{4B}{H}\Bigl(\frac{2\lambda H}{A}\Bigr)^{\nu+1} \quad\text{for } \nu \ge 0, \]
where $0 < \lambda \le \eta_2/4$. It now follows that
\[ I_3(y) = \lim_{\sigma\to1+}\sum_{\nu\ge0}(-1)^\nu J_{\nu,\sigma}(y) = \sum_{\nu\ge0}(-1)^\nu J_\nu(y) \]
holds for 0 < λ ≤ min{η2 , A/H}/4. Furthermore, under this λ restriction, |I3 (y)|
0 such that
\[ \liminf_{\sigma\to1+}\int_{-2\lambda}^{2\lambda}\Delta_\lambda(t)e^{ity}G(\sigma+it)\,dt \ge (-1+c)\pi \]
for all sufficiently large $y$ (equivalently,
\[ \int_0^\infty e^{-u}\psi(e^u)k_\lambda(y-u)\,du \ge c, \qquad y \to \infty). \]
From the preceding inequality and the fact that ψ(x)/x is bounded above, we deduce the desired lower bound. 11.6. Notes §11.1 Zhang’s article appears in [Zh87a]; Kahane’s in [Ka98]; Vindas established a Chebyshev upper bound assuming (11.1) and the additional hypothesis E(x) = o(1/ log x) in [Vn12]. The authors [DZ13a] and Vindas [Vn13] independently established upper and lower bounds with the auxiliary hypothesis E(x) = O(1/ log x), and the same people further weakened the E(x) hypothesis to the average condition given in this chapter.
https://doi.org/10.1090//surv/213/12
CHAPTER 12
Optimality of a Chebyshev Bound

Too much of a good thing can be wonderful – Mae West

Summary. An example is given of a g-number system satisfying the integral condition of Theorem 11.1 but having $\{N(x) - Ax\}(\log x)/x$ tend to infinity, possibly arbitrarily slowly, for which the Chebyshev O-bounds fail.
12.1. Introduction

In the last chapter, we proved that if the counting function of integers of a Beurling g-number system satisfies the conditions $N(x) = Ax + xE(x)$,
\[ \int_1^\infty x^{-1}|E(x)|\,dx < \infty, \tag{11.1 bis} \]
and an average of the pointwise bound
\[ E(x) = O(1/\log x), \tag{12.1} \]
then the associated g-prime system satisfies Chebyshev bounds. In this chapter we show the optimality of (12.1) in the class of g-number systems satisfying (11.1) by giving an example in which both (12.1) and the Chebyshev bounds fail.

Theorem 12.1. Let $f(x)$ be any positive-valued, increasing unbounded function on $[1, \infty)$. Then there exists a g-number system $\mathcal{N}_B$ such that
(1) The associated zeta function $\zeta_B(s)$ is analytic on the open half plane $\{\sigma > 1\}$, $(s-1)\zeta_B(s)$ has a continuous extension to $\{\sigma \ge 1\}$, and $it\,\zeta_B(1+it) \ne 0$ for all real $t$;
(2) The g-integer counting function $N_B(x)$ satisfies (11.1) and also
\[ E(x) = O(f(x)/\log x); \tag{12.2} \]
(3) The counting function $\pi_B(x)$ of the g-primes satisfies
\[ \limsup_{x\to\infty}\frac{\pi_B(x)}{x/\log x} = \infty \quad\text{and}\quad \liminf_{x\to\infty}\frac{\pi_B(x)}{x/\log x} = 0. \]
In other words, if (12.1) is replaced by (12.2) for an unbounded function $f$, no matter how slowly growing, then there exists a g-number system also satisfying (11.1) for which the Chebyshev bounds fail.

12.2. The g-prime system $\mathcal{P}_B$

The proof proceeds by construction of an example. We start by creating from $f$ another function which grows at least as slowly as $f$, but with some more useful analytical properties.
Lemma 12.2. Given a positive-valued, increasing unbounded function $f(x)$ on $[1, \infty)$, there exists a function $k(x)$ also defined on $[1, \infty)$ such that
(1) $k(x) \ge 1$ for $x \ge 1$ and $k(x) \ll f(x)$;
(2) $k(x)$ is increasing and $k(x) \to \infty$ as $x \to \infty$;
(3) $k(x)$ is differentiable and $(\log x)/k(x)$ is increasing on $(1, \infty)$.

Proof. First, set
\[ f_1(x) := \min\{f(x),\ \log\log(e^e x)\}, \qquad x \ge 1. \]
Thus $0 < f_1(x) \le f(x)$ for $x \ge 1$. Moreover, $f_1(x)$ is increasing and $f_1(x) \to \infty$ as $x \to \infty$. Next, set
\[ f_2(x) := x^{-1}\int_1^x f_1(t)\,dt, \qquad x \ge 1. \]
We have, for $x \ge 1$,
\[ 0 \le f_2(x) \le \frac{x-1}{x}f_1(x) < f_1(x). \]
Also, $f_2(x)$ is continuous ($x f_2(x)$ is an integral!) and it increases, since for $\delta > 0$,
\[ f_2(x+\delta) \ge \frac{1}{x+\delta}\Bigl\{\int_1^x f_1(t)\,dt + f_1(x)\,\delta\Bigr\} \ge \frac{1}{x+\delta}\Bigl\{\int_1^x f_1(t)\,dt + \frac{\delta}{x}\int_1^x f_1(t)\,dt\Bigr\} = f_2(x). \]
Moreover, $f_2(x) \to \infty$ as $x \to \infty$, for
\[ f_2(x) > \frac{1}{x}\int_{x/2}^x f_1(t)\,dt \ge \frac{1}{2}f_1(x/2) \to \infty. \]
Then set
\[ f_3(x) := 1 + x^{-1}\int_1^x f_2(t)\,dt. \]
Now, $1 \le f_3(x) \le 1 + f_2(x) \le 1 + f_1(x) \le 1 + f(x)$, $x \ge 1$. Also, as before, $f_3(x)$ is increasing and $f_3(x) \to \infty$ as $x \to \infty$. Moreover, $f_3(x)$ is differentiable on $(1, \infty)$, since $f_2(x)$ is continuous there. Finally, we define
\[ k(x) = f_3(\log\log(e^e x)), \qquad x \ge 1. \]
Thus we have
\[ 1 \le k(x) \le 1 + f(\log\log(e^e x)), \qquad x \ge 1, \]
and, from the definition of $f_1(x)$,
\[ k(x) \ll \log\log\log\log x. \]
Also, $k(x)$ is increasing and $k(x) \to \infty$ as $x \to \infty$. Moreover,
\[ \Bigl(\frac{\log x}{k(x)}\Bigr)' = \frac{1}{x\,k(x)}\Bigl\{1 - \frac{f_3'(\log\log(e^e x))\,\log x}{f_3(\log\log(e^e x))\,\log(e^e x)}\Bigr\}. \]
Note that $f_3(x) \ge 1$ and that
\[ 0 \le f_3'(x) = \frac{f_2(x)}{x} - \frac{\int_1^x f_2(t)\,dt}{x^2} < \frac{f_2(x)}{x} < \frac{\log\log(e^e x)}{x} \]
for $x > 1$. Therefore, for $x > 1$,
\[ \frac{f_3'(\log\log(e^e x))\,\log x}{f_3(\log\log(e^e x))\,\log(e^e x)} < \frac{\log\log\{e^e\log\log(e^e x)\}}{\log\log(e^e x)} < 1, \]
and hence
\[ \Bigl(\frac{\log x}{k(x)}\Bigr)' \ge \frac{1}{x\,k(x)}\Bigl\{1 - \frac{\log\log\{e^e\log\log(e^e x)\}}{\log\log(e^e x)}\Bigr\} > 0. \]
We next determine a sparse sequence of real numbers to use in the construction. First, since $k(x)$ is increasing and $k(x) \to \infty$ as $x \to \infty$, there exists a sequence $\{c_n\}$, $n = 1, 2, \dots$, such that
\[ \sum_{n\ge1} 1/\sqrt{k(c_n)} < \infty. \]
Next, define another sequence $\{A_n\}$, $n = 1, 2, \dots$, recursively by $A_1 = e$ and $A_{n+1} = \max\{e^{A_n}, c_{n+1}\}$. Note that the sequence $\{\log A_n\}$ grows faster than exponentially. Hence we have log k(n)
log An + (1/2) log k(n) 2 log An for n ≥ n0 . Hence, from the definition of $\mathcal{P}_B$,
\[ \pi_B(A_{n+1}-) = \bigl(\pi(A_{n+1}-) - \pi(\sqrt{k(n)}\,A_n)\bigr) + \bigl(\pi_B(A_n) - \pi_B(A_n-)\bigr) + \pi_B(A_n-) \]
\[ \le \bigl(\pi(A_{n+1}-) - \pi(\sqrt{k(n)}\,A_n)\bigr) + \frac{A_n\log k(n)}{2\log A_n} + \pi_B(A_n-) < \pi(A_{n+1}-). \]
Thus $\pi_B(A_n-) \le \pi(A_n-)$ holds for all (sufficiently large) $n$, and so
\[ \frac{\pi_B(A_n^*)}{A_n^*/\log A_n^*} = \frac{\pi_B(A_n)}{A_n^*/\log A_n^*} \le \frac{A_n\log k(n)/(2\log A_n) + \pi(A_n)}{\sqrt{k(n)}\,A_n/\log(\sqrt{k(n)}\,A_n)} \le (\log k(n))/\sqrt{k(n)} \to 0 \]
as $n \to \infty$.

For further analysis, we introduce an auxiliary g-number system. Set $d\pi_0 := d(\pi_B - \pi)_v$, the variation of $d(\pi_B - \pi)$. Notice that
\[ d\pi_0 = 0 \quad\text{on}\quad [1, A_{n_0}) \cup \bigcup_{n\ge n_0}(A_n^*, A_{n+1}), \]
and for $n \ge n_0$,
\[ d\pi_0\{A_n^*\} = d\pi\{A_n^*\}, \qquad d\pi_0\{A_n\} = A_n\log k(n)/(2\log A_n), \]
the last regardless of whether or not $A_n$ is a rational prime. By analogy with relations from Chapter 3, set
\[ d\Pi_0(x) := \sum_{k\ge1}\frac{1}{k}\,d\pi_0(x^{1/k}) \]
and
\[ N_0(x) := 1 + \sum_{n\ge1}\frac{1}{n!}\int_1^x d\Pi_0^{*n}, \qquad x \ge 1, \]
where $d\Pi_0^{*n}$ denotes the $n$-fold multiplicative convolution of $d\Pi_0$ with itself. Also, we need an estimate of a sum of reciprocals of rational primes.
Lemma 12.4.
(12.4)
Am 1,
and hence
\[ |E_2(x)| \le \frac{N_0(x)}{x} \ll \frac{k(x)}{\log x}. \tag{12.12} \]
Also,
\[ \int_1^\infty x^{-1}|E_2(x)|\,dx < \infty. \tag{12.13} \]
Proof. By Lemma 12.7,
\[ \int_{A_{n_0}}^x y^{-1}\frac{\log y}{k(y)}\,dN_0(y) < \int_1^\infty y^{-1}\frac{\log y}{k(y)}\,dN_0(y) < \infty. \]
The left hand side equals, by integration by parts,
\[ x^{-1}\frac{\log x}{k(x)}N_0(x) - A_{n_0}^{-1}\frac{\log A_{n_0}}{k(A_{n_0})}N_0(A_{n_0}) + \int_{A_{n_0}}^x\Bigl\{\frac{\log y - 1}{k(y)} + \frac{y\,k'(y)\log y}{k^2(y)}\Bigr\}N_0(y)\,y^{-2}\,dy. \]
Recall that $k'(x) \ge 0$ and note that $\log A_{n_0} > 1$. Hence we have
\[ \int_{A_{n_0}}^x y^{-1}\frac{\log y}{k(y)}\,dN_0(y) \ge x^{-1}\frac{\log x}{k(x)}N_0(x) - A_{n_0}^{-1}\frac{\log A_{n_0}}{k(A_{n_0})}N_0(A_{n_0}). \]
Also, the second term on the right hand side is a constant; thus (12.11) follows. Then,
\[ |E_2(x)| \le x^{-1}\sum_{n\ge1}\frac{1}{n!}\int_1^x d\Pi_0^{*n}(t) \le \frac{N_0(x)}{x}, \]
and (12.12) follows. Moreover, by Lemma 12.7 again,
\[ \int_1^\infty x^{-s}\,\frac{N_0(x)}{x}\,dx = \frac{\zeta_0(s)}{s} \]
holds for $\sigma \ge 1$. Hence
\[ \int_1^\infty x^{-1}|E_2(x)|\,dx \le \int_1^\infty x^{-2}N_0(x)\,dx = \zeta_0(1) < \infty. \]
12.5. Fundamental estimates

The study of $E_1(x)$ requires a more delicate argument. Here we develop estimates for use in the next section. We will see one motivation for the choice of $\mathcal{P}_B$: the considerable cancellation of $d(\Pi_B - \Pi)$. We begin with bounds for
\[ J_\ell(x) := \int_x^\infty t^{-\ell}\,d(\pi_B - \pi)(t), \qquad \ell \in \mathbb{N}. \]
The reader is encouraged to review Figure 1 (§12.2) to see the contributions of the various intervals to $\mathcal{P}_B$; also recall the notation $A_n^* := \sqrt{k(n)}\,A_n$.

Lemma 12.9. For sufficiently large $n_0$,
\[ |J_1(x)| \le \begin{cases} (\log^2 k(n_0))/(4\log^2 A_{n_0}) & \text{if } 1 \le x < A_{n_0}, \\ (\log k(n))/\log A_n & \text{if } A_n \le x < A_n^*,\ n \ge n_0, \\ (\log^2 k(n+1))/(4\log^2 A_{n+1}) & \text{if } A_n^* \le x < A_{n+1},\ n \ge n_0. \end{cases} \tag{12.14} \]
Also, for $\ell \ge 2$,
\[ |J_\ell(x)| \le \begin{cases} A_{n_0}^{1-\ell}(\log k(n_0))/\log A_{n_0} & \text{if } 1 \le x < A_{n_0}, \\ 2A_n^{1-\ell}/\log A_n & \text{if } A_n \le x < A_n^*,\ n \ge n_0, \\ A_{n+1}^{1-\ell}(\log k(n+1))/\log A_{n+1} & \text{if } A_n^* \le x < A_{n+1},\ n \ge n_0. \end{cases} \tag{12.15} \]
Proof. For A n ≤ x < An+1 with n ≥ n0 − 1 we have & ' Am log k(m) −1 J1 (x) = p S1 (m), A−1 − := m 2 log Am m≥n+1
Am x. Since d(ΠB − Π)(t) = 0 for t < An0 , d(ΠB − Π)∗ (t) = 0 for 1 ≤ t ≤ x, and the inner integral of I2 becomes ∞ ∞ −1 ∗ −1 t d(ΠB − Π) (t) = t d(ΠB − Π)(t) = c1 . 1
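Hence
\[ I_2 \le \int_1^\infty\sum_{\ell > \log x/\log A_{n_0}}\frac{|c_1|^\ell}{\ell!}\,\frac{dx}{x} = \sum_{\ell\ge1}\frac{|c_1|^\ell}{\ell!}\int_1^{A_{n_0}^\ell}\frac{dx}{x} = |c_1|e^{|c_1|}\log A_{n_0}. \tag{12.23} \]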
Exchanging the sum and outer integral in I1 , we obtain 1 K(), I1 ≤ ! ≥1
where
∞
K() :=
x An
0
−1
∞
x
t
−1
d(ΠB − Π) (t) dx. ∗
We have at once K(1) ≤ c2 (from Lemma 12.10). For ≥ 2, ∞ ∞ ∞ −1 ∗−1 dx −1 u d(ΠB − Π)(u) v dΠ0 (v) K() ≤ −1 x An An0 x/v ∞0 ∞ ∞ dx −1 ∗−1 u−1 d(ΠB − Π)(u) = v dΠ0 (v). x A−1 An x/v n0 0
Letting x/v = y, the inner integrals on the right hand side become ∞ ∞ ∞ ∞ dy dy −1 −1 ≤ u d(Π − Π)(u) u d(Π − Π)(u) B B y . y An /v 0
y
1
y
By Lemma 12.10, the last expression is the finite quantity c2 . Therefore, −1 ∞ ∞ −1 v −1 dΠ∗−1 (v) = c v dΠ (v) ≤ c2 c4−1 , K() ≤ c2 2 0 0 where c4 := (12.24)
∞ 1
A−1 n0
x
−1
An0
dΠ0 (x); by Lemma 12.5 this also is finite. Hence 1 I1 ≤ c 2 + c2 c−1 ≤ c2 (1 + ec4 )/2. ! 4 ≥2
Now (12.22) follows from (12.23) and (12.24).
Conclusion of the Proof. Property (1) of the theorem was established in Lemma 12.6, and Property (3) in Lemma 12.3. It remains only to show Property (2). By (12.8) and (12.9) we have
\[ |N(x) - Ax|\,x^{-1} = |E(x)| \le x^{-1} + |E_1(x)| + |E_2(x)|. \]
Then, by (12.21), (12.12), and (1) of Lemma 12.2,
\[ |E_1(x)| + |E_2(x)| \ll k(x)/\log x \ll f(x)/\log x. \]
Finally, by (12.13) and (12.22),
\[ \int_1^\infty x^{-1}|E(x)|\,dx \le \int_1^\infty x^{-1}\bigl(x^{-1} + |E_1(x)| + |E_2(x)|\bigr)\,dx < \infty. \]
12.7. Notes The main result of this section comes from the authors’ article [DZ13b]. §12.6. For convenience, we wrote |c1 | in (12.23). In fact, c1 > 0, as the reader might show.
https://doi.org/10.1090//surv/213/13
CHAPTER 13
Beurling's PNT

Our raison d'être

Summary. We give a proof of Beurling's PNT for g-numbers and show this result is optimal in the class Beurling considered.
13.1. Introduction

The main aim of Beurling's g-number investigation was to give conditions on a g-integer system $\mathcal{N}$ implying that the generalized version of the PNT holds, i.e., $\pi(x) \sim x/\log x$ as $x \to \infty$. His result was the following.

Theorem 13.1. Suppose that $\mathcal{N}$ has a counting function satisfying
\[ N(x) = N_{\mathcal{N}}(x) = Ax + O\bigl(x(\log ex)^{-\gamma}\bigr) \tag{13.1} \]
for some $A > 0$. If $\gamma > 3/2$, then the PNT holds.

By Chebyshev's bounds, $\psi(x) \asymp x$; thus $\psi(x) \sim \Pi(x)\log x$ as $x \to \infty$ by integration by parts, and it suffices to prove the PNT in the form $\psi(x) \sim x$. We first show that the g-zeta function is nonzero on $\{s : \Re s = 1\}$; our argument, starting with an extension of the classical 3-4-1 inequality via the Fejér kernel, follows that of Beurling. The second part is to apply the Wiener-Ikehara theorem, for which we use results established in Chapter 10. We remark that our tauberian method is somewhat simpler than Beurling's because we use $\psi$ rather than $d\psi$.

Beurling's result is optimal in the class of g-number systems having a counting function of the form (13.1), as we shall see below: there are examples satisfying (13.1) with $\gamma = 3/2$ for which the PNT fails. The first of these examples is continuous; the second has discrete g-integers.

Note that if (13.1) holds for a g-number system $\mathcal{N}$ with some number $\gamma$, then the relation holds a fortiori for a number $\gamma' < \gamma$. To simplify some calculations, we henceforth assume that $3/2 < \gamma \le 7/4$.

13.2. A lower bound for $|\zeta(\sigma + it)|$

We prove the nonvanishing of $\zeta(1 + it)$ for $t \ne 0$ by combining two relations: the first, which we give here, is a lower bound for $|\zeta(\sigma + it)|$; in the next section we show that $\zeta(s)$ satisfies a Lipschitz condition of a positive order. In Chapter 10 we introduced the Fejér kernel on $\mathbb{R}$; here we use a corresponding family of $2\pi$ periodic functions. For $L$ a positive integer and $x$ real, set
\[ K_L(x) := \sum_{-L\le\ell\le L}\Bigl(1 - \frac{|\ell|}{L}\Bigr)e^{i\ell x} = 1 + 2\sum_{\ell=1}^{L-1}\Bigl(1 - \frac{\ell}{L}\Bigr)\cos\ell x. \]

Lemma 13.2. For each $L \in \mathbb{N}$ and $x \in \mathbb{R}$, $K_L(x) \ge 0$.
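Proof. We have
\[ 0 \le \Bigl|\sum_{\ell=1}^{L}e^{i\ell x}\Bigr|^2 = \sum_{\ell=1}^{L}e^{i\ell x}\sum_{k=1}^{L}e^{-ikx} = \sum_{\ell=1}^{L}\sum_{k=1}^{L}e^{i(\ell-k)x} = L + \sum_{\ell=1}^{L-1}(L-\ell)\bigl(e^{i\ell x} + e^{-i\ell x}\bigr) = L\,K_L(x). \]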
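The identity $L\,K_L(x) = |\sum_{\ell=1}^{L}e^{i\ell x}|^2$ in the proof is easy to confirm numerically. The following Python sketch evaluates both sides on a small grid; the function names `k_sum` and `k_square` are ours and the grid is an arbitrary choice.

```python
import cmath
import math


def k_sum(big_l, x):
    """K_L(x) from the cosine formula 1 + 2 * sum_{l=1}^{L-1} (1 - l/L) cos(l x)."""
    return 1.0 + 2.0 * sum(
        (1.0 - l / big_l) * math.cos(l * x) for l in range(1, big_l)
    )


def k_square(big_l, x):
    """K_L(x) from the proof: |sum_{l=1}^{L} exp(i l x)|^2 / L."""
    z = sum(cmath.exp(1j * l * x) for l in range(1, big_l + 1))
    return abs(z) ** 2 / big_l


if __name__ == "__main__":
    for big_l in range(1, 9):
        for j in range(-20, 21):
            x = 0.37 * j
            print(big_l, x, k_sum(big_l, x), k_square(big_l, x))
```

Since the second expression is a squared modulus divided by $L$, agreement of the two columns makes the nonnegativity of $K_L$ evident.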
We use the nonnegativity of $K_L$ to derive a generalization of the classical bound $\zeta(\sigma)^3|\zeta(\sigma+it)|^4|\zeta(\sigma+2it)| \ge 1$ ("3-4-1 inequality"), which is used in many proofs of nonvanishing of the Riemann zeta function on $\{\Re s = 1\}$.

Lemma 13.3. For each integer $L \ge 2$, real $t$, and $\sigma > 1$,
\[ \zeta(\sigma)^L\prod_{\ell=1}^{L-1}|\zeta(\sigma + i\ell t)|^{2L-2\ell} \ge 1. \]
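Lemma 13.3 can be illustrated on a toy g-number system. The Python sketch below assumes a finite set of g-primes (any real numbers exceeding 1), so that $\zeta(s) = \prod_p (1 - p^{-s})^{-1}$ is a finite product, and evaluates the logarithm of the left side of the lemma. The particular primes and the names `zeta_finite` and `log_product` are illustrative choices of ours, not taken from the text.

```python
import math


def zeta_finite(primes, s):
    """Zeta function of the g-number system generated by finitely many
    g-primes: the finite Euler product of 1 / (1 - p^(-s))."""
    prod = complex(1.0, 0.0)
    for p in primes:
        prod /= 1.0 - p ** (-s)
    return prod


def log_product(primes, sigma, t, big_l):
    """L * log zeta(sigma) + 2 * sum_{l=1}^{L-1} (L - l) * log |zeta(sigma + i l t)|,
    i.e. the logarithm of the left side of Lemma 13.3."""
    total = big_l * math.log(abs(zeta_finite(primes, complex(sigma, 0.0))))
    for l in range(1, big_l):
        value = zeta_finite(primes, complex(sigma, l * t))
        total += 2 * (big_l - l) * math.log(abs(value))
    return total


if __name__ == "__main__":
    toy_primes = [1.5, 2.0, 2.7, 3.0, 5.0, 7.3]  # hypothetical g-primes
    for sigma in (1.01, 1.5, 2.0):
        for t in (0.1, 1.0, 3.7, 14.1):
            for big_l in range(2, 7):
                print(sigma, t, big_l, log_product(toy_primes, sigma, t, big_l))
```

Every printed value should be nonnegative (up to rounding), in agreement with the lemma; the case $L = 3$ is the 3-4-1 inequality in squared form.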
Proof. We show that
\[ L\log\zeta(\sigma) + 2\sum_{\ell=1}^{L-1}(L-\ell)\log|\zeta(\sigma + i\ell t)| \ge 0. \tag{13.2} \]
Recall from (1.5) that
\[ \log\zeta(\sigma + it) = \int_1^\infty u^{-\sigma}e^{-it\log u}\,d\Pi(u). \]
Taking real parts and using the evenness of cos, we get
\[ \log|\zeta(\sigma + it)| = \int_1^\infty u^{-\sigma}\cos(t\log u)\,d\Pi(u). \]
Thus the left side of (13.2) can be written as
\[ L\int_1^\infty u^{-\sigma}\Bigl\{1 + 2\sum_{\ell=1}^{L-1}\Bigl(1 - \frac{\ell}{L}\Bigr)\cos(\ell t\log u)\Bigr\}\,d\Pi(u). \]
Now the expression in curly brackets is $K_L(t\log u)$, which is nonnegative. Also, $d\Pi \ge 0$. This establishes (13.2) and hence the claim of the lemma.

Next, we show that $|\zeta(\sigma + it)|$ cannot approach 0 too swiftly as $\sigma \to 1+$.

Lemma 13.4. Let $\mathcal{N}$ be a g-number system which satisfies (13.1) with $\gamma > 3/2$. Let $t$ be any nonzero real number and $L$ any integer exceeding 1. There exists a positive number $C = C_{\mathcal{N}}(L, t)$ for which
\[ |\zeta(\sigma + it)| \ge C(\sigma-1)^{1/2 + 1/(2L)}, \qquad 1 < \sigma \le 2. \tag{13.3} \]

Proof. Once again, set
\[ g(s) := \frac{1}{A}\int_1^\infty x^{-s-1}\{N(x) - Ax\}\,dx = \frac{\zeta(s)}{As} - \frac{1}{s-1}. \tag{11.7 bis} \]
The integral for $g$ converges absolutely and uniformly for $\{\Re s \ge 1\}$. It defines there a continuous function with $|g(s)| \le H$ for some constant $H$, and we can write
\[ \zeta(s) = A + \frac{A}{s-1} + As\,g(s), \qquad \Re s \ge 1. \tag{13.4} \]
In particular, for $1 < \sigma \le 2$ and fixed $t \ne 0$,
\[ \zeta(\sigma) \le \frac{2A(H+1)}{\sigma-1} := \frac{A'}{\sigma-1}, \]
\[ |\zeta(\sigma + i\ell t)| \le \frac{A}{2|t|} + A + (2 + L|t|)AH := A'', \qquad 2 \le \ell \le L. \]
Now use Lemma 13.3 with $L+1$ in place of $L$ and insert the preceding estimates. We find
\[ 1 \le \zeta(\sigma)^{L+1}|\zeta(\sigma+it)|^{2L}|\zeta(\sigma+2it)|^{2L-2}\cdots|\zeta(\sigma+Lit)|^{2} \le \Bigl(\frac{A'}{\sigma-1}\Bigr)^{L+1}|\zeta(\sigma+it)|^{2L}(A'')^{2+4+\cdots+2L-2}, \]
and, with $C = (A')^{-(L+1)/(2L)}(A'')^{-(L-1)/2}$,
\[ |\zeta(\sigma + it)| \ge C(\sigma-1)^{1/2 + 1/(2L)}, \qquad 1 < \sigma \le 2. \]
13.3. Nonvanishing of $\zeta(1 + it)$

The next step is to show that zeta satisfies a Lipschitz condition of order larger than 1/2. This result implies that if $\zeta(s_0) = 0$, then $|\zeta(s)|$ is quite small for $s$ near $s_0$, which will lead to the desired contradiction. (In the classical case, one simply notes at this point that zeta is analytic and thus has a bounded derivative.)

Lemma 13.5. Suppose $\mathcal{N}$ is a g-number system that satisfies (13.1) for some $\gamma \in (3/2, 7/4]$. Let $0 < \delta < 1 < T$ be given and set
\[ \Omega = \Omega(\delta, T) := \{s = \sigma + it : 1 \le \sigma \le 2,\ \delta \le |t| \le T\}. \]
There is a positive number $D = D_{\mathcal{N}}(\delta, T)$ such that, if $s_1, s_2 \in \Omega$, then
\[ |\zeta(s_1) - \zeta(s_2)| \le D|s_1 - s_2|^{\gamma-1}. \]

Proof. We use the representation of zeta from (13.4) to write
\[ \zeta(s_1) - \zeta(s_2) = \Bigl\{\frac{A}{s_1-1} - \frac{A}{s_2-1}\Bigr\} + A\bigl\{s_1g(s_1) - s_2g(s_2)\bigr\} := I + II. \]
If $|s_1 - s_2| \le 1$, then
\[ |I| = \frac{A|s_1 - s_2|}{|s_1-1||s_2-1|} \le \frac{A}{\delta^2}|s_1 - s_2|^{\gamma-1}, \]
since $\gamma - 1 < 1$ by assumption. If, on the other hand, $|s_1 - s_2| > 1$, then
\[ |I| \le \frac{A|s_1-1| + A|s_2-1|}{|s_1-1||s_2-1|} \le \frac{2A}{\delta} \le \frac{2A}{\delta}|s_1 - s_2|^{\gamma-1}. \]
In any case,
\[ |I| \le \frac{2A}{\delta^2}|s_1 - s_2|^{\gamma-1}. \]
Also, using the definition of $g(s)$,
\[ |II| = \Bigl|\int_1^\infty\{s_1x^{-s_1} - s_2x^{-s_2}\}\,x^{-1}\{N(x) - Ax\}\,dx\Bigr| \le \int_1^\infty|s_1x^{-s_1} - s_2x^{-s_2}|\,B(\log ex)^{-\gamma}\,dx, \]
for some $B > 0$. We estimate $|s_1x^{-s_1} - s_2x^{-s_2}|$ differently depending on whether or not $|s_1 - s_2|\log x < 1$. Let $X = \exp(|s_1 - s_2|^{-1})$, and suppose first that $x < X$. We have
\[ |s_1x^{-s_1} - s_2x^{-s_2}| = \Bigl|\int_{s_1}^{s_2}\frac{d}{du}(u\,x^{-u})\,du\Bigr| \le |s_1 - s_2|\,x^{-1}\{1 + (T+2)\log x\} \le |s_1 - s_2|\,x^{-1}\,3T\log ex. \]
Now
\[ II_A := \int_1^X|s_1x^{-s_1} - s_2x^{-s_2}|\,B(\log ex)^{-\gamma}\,dx \le 3BT|s_1 - s_2|\int_1^X(\log ex)^{1-\gamma}\,\frac{dx}{x}, \]
and the last integral is less than
\[ \int_1^X(\log x)^{1-\gamma}\,\frac{dx}{x} = \frac{(\log X)^{2-\gamma}}{2-\gamma} = \frac{|s_1 - s_2|^{\gamma-2}}{2-\gamma} \le 4|s_1 - s_2|^{\gamma-2}. \]
Thus $II_A \le 12BT|s_1 - s_2|^{\gamma-1}$. On the other hand, for $x \ge X$, we use the inequality
\[ |s_1x^{-s_1} - s_2x^{-s_2}| \le 2(2+T)x^{-1} < 6Tx^{-1}. \]
It follows that
\[ II_B := \int_X^\infty|s_1x^{-s_1} - s_2x^{-s_2}|\,B(\log ex)^{-\gamma}\,dx < 6BT\int_X^\infty(\log x)^{-\gamma}\,\frac{dx}{x} = \frac{6BT}{\gamma-1}|s_1 - s_2|^{\gamma-1} \le 12BT|s_1 - s_2|^{\gamma-1}, \]
and hence $|II| \le 24BT|s_1 - s_2|^{\gamma-1}$. Finally, $I$ and $II$ together, and $D = 2A\delta^{-2} + 24BT$, give the claimed bound.
We apply the last two lemmas to show that $\zeta(s) \ne 0$ on $\Re s = 1$.

Theorem 13.6. Let $\mathcal{N}$ be a g-number system which satisfies (13.1) for some $\gamma > 3/2$. The associated zeta function $\zeta(s)$ has no zeros in the closed half plane $\{s : \Re s \ge 1\}$.

Proof. We have shown already in Lemma 3.3 that $\zeta(s)$ is nonvanishing in the open half plane $\{s : \Re s > 1\}$; indeed, a much weaker condition was used there. Now we assume that $\zeta(1 + it_0) = 0$ for some $t_0 \ne 0$ (Why can we assume that $t_0 \ne 0$?), and show that this leads to a contradiction. Taking $s_2 = 1 + it_0$ and $s_1 = \sigma + it_0$ with $1 < \sigma \le 2$ and $0 < \delta < |t_0|$ in Lemma 13.5, we have
\[ |\zeta(\sigma + it_0)| \le D(\sigma-1)^{\gamma-1}. \]
Next, take $L$ to be an integer that is so large that $(2L)^{-1} < \gamma - 3/2$. We apply Lemma 13.4 with this $L$ to obtain
\[ |\zeta(\sigma + it_0)| \ge C(\sigma-1)^{1/2 + 1/(2L)}, \qquad 1 < \sigma \le 2. \]
Together, the last two relations imply that
\[ (\sigma-1)^{\gamma - 3/2 - 1/(2L)} \ge C/D. \]
But this inequality is impossible for $\sigma - 1$ sufficiently small, since the exponent is positive. Thus $\zeta(1 + it_0) \ne 0$.

13.4. An $L^1$ condition and conclusion of the proof

We complete the proof of Theorem 13.1 by applying Corollary 10.6 to conclude that $\psi(e^x)e^{-x} \to 1$ as $x \to \infty$, i.e., the PNT holds. For this, we show that
\[ G(s) := -\frac{1}{s}\frac{\zeta'}{\zeta}(s) - \frac{1}{s-1} = \int_0^\infty\psi(e^u)\,e^{-su}\,du - \frac{1}{s-1} \]
satisfies an $L^1$ Cauchy condition. We shall use an $L^2$ hypothesis here that is slightly weaker than that of Theorem 13.1, in order to be able to use the same result for the punchline in the proof of the PNT given in Chapter 15.

Lemma 13.7. Suppose the zeta function of a g-number system has no zeros on the line $\{\Re s = 1\}$ and the integer counting function satisfies
\[ \int_1^\infty\Bigl|\frac{N(x) - Ax}{x}\log x\Bigr|^2\,\frac{dx}{x} < \infty. \]
Then the function $G$, defined above, satisfies
\[ \lim_{\sigma,\,\sigma'\to1+}\int_{-2\lambda}^{2\lambda}|G(\sigma+it) - G(\sigma'+it)|\,dt = 0 \tag{10.8 bis} \]
for every $\lambda > 0$.

Proof. We begin by expressing $G(s)$ in terms of
\[ g(s) := \frac{1}{A}\int_1^\infty x^{-s-1}\{N(x) - Ax\}\,dx = \frac{\zeta(s)}{As} - \frac{1}{s-1}, \tag{11.7 ter} \]
a continuous function for $\sigma = \Re s \ge 1$ to which we can conveniently apply the hypothesis of the theorem. We claim that, for $\sigma > 1$,
\[ G(s) = -\frac{1}{s} - \frac{1}{s^2} - \frac{A\,g(s)}{(s-1)\zeta(s)} - \frac{A\,g'(s)}{\zeta(s)}. \tag{13.5} \]
Indeed, $A(s-1)g(s) = (1 - s^{-1})\zeta(s) - A$, and if we differentiate each side of this formula and divide by $-(s-1)\zeta(s)$, we find
\[ -\frac{A\{g(s) + (s-1)g'(s)\}}{(s-1)\zeta(s)} = -\frac{(1 - s^{-1})\zeta'(s) + s^{-2}\zeta(s)}{(s-1)\zeta(s)} = -\frac{1}{s}\frac{\zeta'(s)}{\zeta(s)} - \frac{1}{s-1} + \frac{1 - s^{-2}}{s-1} = G(s) + \frac{1}{s} + \frac{1}{s^2}, \]
showing that (13.5) holds. Writing $s = \sigma + it$ with $\sigma > 1$, we set
\[ J_\sigma(t) := -\frac{1}{s} - \frac{1}{s^2} - \frac{A\,g(s)}{(s-1)\zeta(s)}, \qquad K_\sigma(t) := \frac{1}{\zeta(s)}, \qquad L_\sigma(t) := A\,g'(s). \]
By the triangle inequality,
\[ \int_{-2\lambda}^{2\lambda}|G(\sigma+it) - G(\sigma'+it)|\,dt \le I + II, \]
where
\[ I = \int_{-2\lambda}^{2\lambda}|J_\sigma(t) - J_{\sigma'}(t)|\,dt, \qquad II = \int_{-2\lambda}^{2\lambda}|L_\sigma(t)K_\sigma(t) - L_{\sigma'}(t)K_{\sigma'}(t)|\,dt. \]
We have $\lim_{\sigma,\,\sigma'\to1+}I = 0$, since $J_\sigma$ is continuous. (Recall that $(s-1)\zeta(s) \ne 0$ for $\Re s \ge 1$.) Also,
\[ II \le \int_{-2\lambda}^{2\lambda}|L_\sigma(t)\{K_\sigma(t) - K_{\sigma'}(t)\}|\,dt + \int_{-2\lambda}^{2\lambda}|\{L_\sigma(t) - L_{\sigma'}(t)\}K_{\sigma'}(t)|\,dt := II_a + II_b, \]
say. Now $K_\sigma(t)$ is uniformly continuous on
\[ \{s = \sigma + it : 1 \le \sigma \le 2,\ -2\lambda \le t \le 2\lambda\}, \]
and thus, as $\sigma, \sigma' \to 1+$,
\[ II_a = o(1)\int_{-2\lambda}^{2\lambda}|L_\sigma(t)|\,dt, \qquad II_b \ll \int_{-2\lambda}^{2\lambda}|L_\sigma(t) - L_{\sigma'}(t)|\,dt. \]
It remains to show that $\lim_{\sigma,\,\sigma'\to1+}II_b = 0$; this Cauchy condition implies that $\lim_{\sigma\to1+}\int_{-2\lambda}^{2\lambda}|L_\sigma|$ exists, and so $II_a \to 0$ also. By the Cauchy-Schwarz inequality,
\[ \int_{-2\lambda}^{2\lambda}|L_\sigma(t) - L_{\sigma'}(t)|\,dt \le \Bigl(\int_{-2\lambda}^{2\lambda}dt\Bigr)^{1/2}\cdot\Bigl(\int_{-2\lambda}^{2\lambda}|L_\sigma(t) - L_{\sigma'}(t)|^2\,dt\Bigr)^{1/2}, \]
and thus, for fixed $\lambda > 0$, it suffices to show that
\[ III := \int_{-2\lambda}^{2\lambda}|L_\sigma(t) - L_{\sigma'}(t)|^2\,dt \]
goes to 0 as $\sigma, \sigma' \to 1+$. Making a change of variable in the integral (11.7) for $g$ and differentiating, we obtain the Fourier integral representation
\[ L_\sigma(t) = A\,g'(\sigma + it) = \int_0^\infty e^{-itu}\phi_\sigma(u)\,du, \]
where
\[ \phi_\sigma(u) := -e^{-\sigma u}\,u\,\{N(e^u) - Ae^u\}. \]
By hypothesis,
\[ \int_0^\infty\phi_1(u)^2\,du < \infty, \tag{13.6} \]
and a small calculation, using the Cauchy-Schwarz inequality, shows that $\phi_\sigma$ is integrable on $(0, \infty)$ for $\sigma > 1$. The function $L_\sigma$ is the Fourier transform of $\phi_\sigma$, and we have
\[ L_\sigma(t) - L_{\sigma'}(t) = \int_0^\infty e^{-itu}\{\phi_\sigma(u) - \phi_{\sigma'}(u)\}\,du. \]
By (a weak form of) Plancherel's identity ([Kn68], Ch. 6),
\[ III \le 2\pi\int_0^\infty\{\phi_\sigma(u) - \phi_{\sigma'}(u)\}^2\,du. \]
The last integrand is dominated by the integrable function $\phi_1(u)^2$ and it tends to 0 pointwise as $\sigma, \sigma' \to 1+$, so by the dominated convergence theorem, $III \to 0$. Thus (10.8) holds.
If we insert the result of the preceding lemma into Corollary 10.6, we can conclude that $\psi(e^x)\,e^{-x} \to 1$ as $x \to \infty$. As we noted at the beginning of the chapter, $\psi(x) \sim \Pi(x)\log x$. Thus $\pi(x) \sim \Pi(x) \sim x/\log x$, and the PNT holds for the g-number system $\mathcal{N}$. This completes the proof of Theorem 13.1.

13.5. Optimality – a continuous example

We now show that Theorem 13.1 is optimal in the class of g-number systems $\mathcal{N}$ with counting function
\[ N(x) = N_{\mathcal{N}}(x) = Ax + O\bigl(x(\log ex)^{-\gamma}\bigr). \tag{13.1 bis} \]
Following Beurling, we give an example of a g-number system satisfying (13.1) with $\gamma = 3/2$ for which the PNT fails. Thus the exponent 3/2 is optimal in this class of g-number systems. (We shall show in Chapter 15 that the PNT can be established under an $L^2$ condition that is weaker than the hypothesis in Theorem 13.1.)

Example 13.8. (A wobbly prime counting function.) For $x \ge 1$, let a "wobbly g-prime counting function" be defined by
\[ \pi_w(x) := \int_1^x\frac{1 - \cos(\log t)}{\log t}\,dt. \tag{13.7} \]
The integrand is asymptotic to $\frac{1}{2}\log t$ as $t \to 1+$, so the integral is convergent. We shall show that $\pi_w(x)$ has the desired prime counting properties, and the "g-integer" system it generates satisfies (13.1). First, $\pi_w(1) = 0$ and $\pi_w \uparrow$; these properties hold also for the weighted function
\[ \Pi_w(x) := \pi_w(x) + \frac{1}{2}\pi_w(x^{1/2}) + \frac{1}{3}\pi_w(x^{1/3}) + \cdots. \tag{13.8} \]
Thus $\pi_w$ qualifies as a (continuous) g-prime counting function. Next,
\[ \pi_w(x) = \pi_w(e) + \mathrm{li}(x) - \mathrm{li}(e) - \int_e^x\frac{\cos(\log t)}{\log t}\,dt, \]
where
\[ \mathrm{li}(x) := \lim_{\epsilon\to0+}\Bigl\{\int_0^{1-\epsilon} + \int_{1+\epsilon}^x\Bigr\}\frac{du}{\log u} = \mathrm{li}\,e + \int_e^x\frac{du}{\log u}. \]
By integrating twice by parts,
\[ \pi_w(x) = \mathrm{li}(x) - \frac{x}{2\log x}\{\sin(\log x) + \cos(\log x)\} + O\Bigl(\frac{x}{\log^2 x}\Bigr), \]
or after application of a trigonometric identity,
\[ \pi_w(x)/\mathrm{li}(x) = 1 - \sin(\pi/4 + \log x)/\sqrt{2} + o(1), \qquad x \to \infty. \]
Thus the PNT does not hold for $\pi_w$. Now we turn to the g-integers it generates.

Theorem 13.9. Let $\pi_w$ be defined by (13.7) and let $\Pi_w$ be the companion function defined by (13.8). The associated g-integer counting function
\[ N_w(x) := \int_{1-}^x\exp^* d\Pi_w \]
satisfies (13.1) with $\gamma = 3/2$.
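The oscillation of $\pi_w(x)/\mathrm{li}(x)$ in Example 13.8 can be seen numerically. The Python sketch below substitutes $t = e^u$ in (13.7) and uses Simpson's rule; as a simplifying assumption it replaces $\mathrm{li}(x)$ by $\mathrm{li}(x) - \mathrm{li}(e) = \int_1^{\log x} e^u u^{-1}\,du$, which differs from $\mathrm{li}(x)$ by a bounded amount. The names `simpson` and `wobbly_ratio` and the sample points are ours.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson rule for f on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0


def wobbly_ratio(big_u, n=40000):
    """Approximate pi_w(e^U) / (li(e^U) - li(e)) after the substitution t = e^u.

    Both integrands carry the factor exp(u - U); the common factor exp(-U)
    cancels in the quotient and keeps the numbers of moderate size.
    """
    def numerator(u):
        if u == 0.0:
            return 0.0  # limit of (1 - cos u) / u as u -> 0
        return math.exp(u - big_u) * (1.0 - math.cos(u)) / u

    def denominator(u):
        return math.exp(u - big_u) / u

    return simpson(numerator, 0.0, big_u, n) / simpson(denominator, 1.0, big_u, n)


if __name__ == "__main__":
    for big_u in (math.pi / 4 + 16 * math.pi, 5 * math.pi / 4 + 16 * math.pi):
        predicted = 1.0 - math.sin(math.pi / 4 + big_u) / math.sqrt(2.0)
        print(big_u, wobbly_ratio(big_u), predicted)
```

At the first sample point the main term predicts a ratio near $1 - 1/\sqrt{2}$ and at the second near $1 + 1/\sqrt{2}$, so the ratio cannot tend to 1.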
We have
\[ N_w(x) = \int_{1-}^x\exp^*\{d\pi_w + d\Pi_2\}, \tag{13.9} \]
where
\[ \Pi_2(x) := \frac{1}{2}\pi_w(x^{1/2}) + \frac{1}{3}\pi_w(x^{1/3}) + \cdots. \]
We next analyze the two parts of the formula for $N_w$.

Lemma 13.10. For suitable constants $c_0$, $c_1$, we have
\[ N_1(x) := \int_{1-}^x\exp^* d\pi_w = x + \frac{c_0\,x\cos(\log x)}{\log^{3/2}x} + \frac{c_1\,x\cos(\log x)}{\log^{5/2}x} + O\Bigl(\frac{x}{\log^{7/2}x}\Bigr). \]
To establish Theorem 13.9, the weaker relation $N_1(x) = x + O(x/\log^{3/2}x)$ would suffice; the four-term form will be used in the next section.

Lemma 13.11.
\[ N_2(x) := \int_{1-}^x\exp^* d\Pi_2 \ll x^{1/2}. \]
By the homomorphic property of the exponential,
\[ N_w(x) = \int_{1-}^x dN_1 * dN_2. \]
After proving the two lemmas, we shall establish the theorem by showing that $N_w(x)$ has the form of its bigger convolution factor, $N_1(x)$.

Proof of Lemma 13.10. Recall
\[ N_1(x) := \int_{1-}^x\exp^* d\pi_w \quad\text{and}\quad d\pi_w(x) := \frac{1 - \cos(\log x)}{\log x}\,dx. \]
Thus $N_1(x)$ is continuous on $(1, \infty)$. We first find its generating function. For $\Re s > 1$, write
\[ \hat\pi_w(s) := \int_1^\infty x^{-s}\,d\pi_w(x) = \int_1^\infty x^{-s}\,\frac{1 - (x^i + x^{-i})/2}{\log x}\,dx. \]
Then
\[ \hat\pi_w'(s) = -\int_1^\infty x^{-s}\Bigl\{1 - \frac{x^i + x^{-i}}{2}\Bigr\}\,dx = -\frac{1}{s-1} + \frac{1/2}{s-1-i} + \frac{1/2}{s-1+i}, \]
and so
\[ \hat\pi_w(s) = \frac{\log(s-1-i) + \log(s-1+i)}{2} - \log(s-1) = \log\frac{\sqrt{(s-1-i)(s-1+i)}}{s-1} = \log\frac{\sqrt{(s-1)^2+1}}{s-1}. \]
We take the branch of the square root that is positive for $s$ real and exceeding 1. It follows that
\[ \hat N_1(s) = \int_{1-}^\infty x^{-s}\{\exp^* d\pi_w\}(x) = \exp\int_1^\infty x^{-s}\,d\pi_w(x) = \frac{\sqrt{(s-1)^2+1}}{s-1}. \]
By the Mellin inversion formula, for any $x > 1$ and $a > 1$,
\[ N_1(x) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}x^s\,\hat N_1(s)\,\frac{ds}{s}. \]
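The closed form of the generating function can be checked numerically for real $s > 1$. With $x = e^u$ and $a = s - 1$, the transform of $d\pi_w$ becomes $\int_0^\infty e^{-au}(1 - \cos u)\,u^{-1}\,du$; the Python sketch below truncates this integral at $u = 60/a$ (an assumption of the sketch; the neglected tail is below $e^{-60}$), applies Simpson's rule, and compares the exponential with $\sqrt{a^2+1}/a$. The names `simpson` and `log_generating_function` are ours.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson rule for f on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0


def log_generating_function(a, n=60000):
    """Truncated integral of exp(-a u) * (1 - cos u) / u over [0, 60 / a],
    i.e. the transform of d pi_w at s = 1 + a after the substitution x = e^u."""
    def integrand(u):
        if u == 0.0:
            return 0.0  # limit of (1 - cos u) / u as u -> 0
        return math.exp(-a * u) * (1.0 - math.cos(u)) / u

    return simpson(integrand, 0.0, 60.0 / a, n)


if __name__ == "__main__":
    for a in (0.5, 1.0, 3.0):
        numeric = math.exp(log_generating_function(a))
        closed = math.sqrt(a * a + 1.0) / a
        print(a, numeric, closed)
```

For instance, at $s = 2$ (so $a = 1$) the closed form gives $\sqrt{2}$.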
To simplify convergence considerations, introduce
\[ \phi(s) := \hat N_1(s) - 1 = \frac{\sqrt{(s-1)^2+1} - (s-1)}{s-1}. \]
Rationalizing the numerator, we find
\[ \phi(s) = \frac{1}{(s-1)\bigl\{\sqrt{(s-1)^2+1} + (s-1)\bigr\}} \ll \frac{1}{|s|^2}, \qquad |s| \text{ large}. \tag{13.10} \]
We restate the Mellin formula as $N_1(x) = I + 1$, say, with
\[ I := \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}x^s\phi(s)\,\frac{ds}{s} \quad\text{and}\quad \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}x^s\,\frac{ds}{s} = 1, \]
the last by a familiar calculation.

The integrand of $I$ has singularities at 0, 1, and $1 \pm i$. We deform the contour to the vertical line $\{s : \Re s = 1/2\}$, with loops taken to avoid crossing the rays $(-\infty - i, 1 - i)$ and $(-\infty + i, 1 + i)$. By the residue theorem, the contribution to $I$ from the pole of $\phi(s)$ at $s = 1$ is $x$. By the bound $\phi(s) \ll 1/|s|^2$ from (13.10) and the fact that $|x^s| = x^{1/2}$ on the vertical line, the contribution to $I$ from the vertical portion of the new contour is $O(x^{1/2})$. (We could have extended the contour further to the left, but this would not be useful for the remainder of this analysis.)

We next calculate the loop integral about $1 + i$. On a circle of radius $\epsilon$ about this point, $\phi(s)$ is bounded, and thus the integral converges to 0 as $\epsilon \to 0$. It remains to calculate the integral along the segments
\[ L_\pi = 1 + i + re^{i\pi}, \qquad L_{-\pi} = 1 + i + re^{-i\pi}, \]
with the former traversed from $r = 0$ to $r = 1/2$ and the latter from $1/2$ to 0. Writing
\[ \hat N_1(s)/s = (s-1-i)^{1/2}f(s), \]
where $f(s)$ is analytic for $|s-1-i| \le 1/2$, we have on this disc
\[ \phi(s)/s = (\hat N_1(s) - 1)/s = -1/s + c_0(s-1-i)^{1/2} + c_1(s-1-i)^{3/2} + O(|s-1-i|^{5/2}) \]
for $c_0$, $c_1$ suitable constants. The integral of $-1/s$ over the two segments cancels out, so it remains to study the contribution of the remaining terms.
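We have
$$\begin{aligned}
J_{1/2} &:= \frac{c_0}{2\pi i}\Big\{\int_{L_{\pi}} + \int_{L_{-\pi}}\Big\} x^{s}(s-1-i)^{1/2}\,ds \\
&= \frac{c_0}{2\pi i}\big(e^{\pi i/2} - e^{-\pi i/2}\big)\,x^{1+i}\int_0^{1/2} x^{-r} r^{1/2}\,dr \\
&= \frac{c_0}{\pi}\,x^{1+i}\Big\{\int_0^{\infty} - \int_{1/2}^{\infty}\Big\} e^{-r\log x} r^{1/2}\,dr \\
&= \frac{c_0\,x^{1+i}}{\pi \log^{3/2} x}\Big\{\int_0^{\infty} - \int_{(\log x)/2}^{\infty}\Big\} e^{-u} u^{1/2}\,du \\
&= \frac{c_0\,\Gamma(3/2)\,x^{1+i}}{\pi \log^{3/2} x} + O(x^{1/2}),
\end{aligned}$$
where $\Gamma$ denotes the Euler Gamma function. Also,
$$J_{3/2} := \frac{c_1}{2\pi i}\Big\{\int_{L_{\pi}} + \int_{L_{-\pi}}\Big\} x^{s}(s-1-i)^{3/2}\,ds,$$
and a similar calculation yields
$$J_{3/2} = -\frac{c_1\,\Gamma(5/2)\,x^{1+i}}{\pi \log^{5/2} x} + O(x^{1/2}).$$
Then, with $L$ representing either $L_{\pi}$ or $L_{-\pi}$, we have the error term
$$\int_{L} |x^{s}|\,|s-1-i|^{5/2}\,\frac{|ds|}{|s|} \ll x\int_{r=0}^{1/2} x^{-r} r^{5/2}\,dr \le x\int_{r=0}^{\infty} e^{-r\log x} r^{5/2}\,dr = \frac{x}{\log^{7/2} x}\int_0^{\infty} e^{-u}u^{5/2}\,du \ll \frac{x}{\log^{7/2} x},$$
since the last integral is $\Gamma(7/2)$, a finite quantity. The same calculations apply for the loop integral about $1 - i$, and noting that $x^{i} + x^{-i} = 2\cos(\log x)$, we obtain the claimed formula for $N_1(x)$.

Proof of Lemma 13.11. By Lemma 2.10 and a weak form of Lemma 13.10, we have first
$$N_3(x) := \int_{1-}^{x} \exp^*\Big\{\tfrac12\,d\pi_w(t^{1/2})\Big\} = \int_{1-}^{\sqrt{x}} \exp^*\Big\{\tfrac12\,d\pi_w\Big\} \le \int_{1-}^{\sqrt{x}} \exp^* d\pi_w \ll \sqrt{x}.$$
Now
$$\begin{aligned}
N_2(x) &:= \int_{1-}^{x} dN_3 * \exp^*\Big\{\tfrac13\,d\pi_w(t^{1/3}) + \tfrac14\,d\pi_w(t^{1/4}) + \dots\Big\} \\
&\ll \int_{1-}^{x} \sqrt{\frac{x}{t}}\;\exp^*\big\{d\pi_w(t^{1/3}) + d\pi_w(t^{1/4}) + \dots\big\}.
\end{aligned}$$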
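If we extend the integration to infinity and use Lemma 2.10 again, we find
$$\begin{aligned}
N_2(x) &\ll \sqrt{x}\int_{1-}^{\infty} \exp^*\big\{t^{-1/2}d\pi_w(t^{1/3}) + t^{-1/2}d\pi_w(t^{1/4}) + \dots\big\} \\
&= \sqrt{x}\,\exp\Big\{\int_1^{\infty} t^{-3/2}\,d\pi_w(t)\Big\}\cdot \exp\Big\{\int_1^{\infty} t^{-2}\,d\pi_w(t)\Big\}\cdots \\
&\le \sqrt{x}\,\exp\Big\{\int_1^{\infty} \big(t^{-3/2} + t^{-2} + t^{-5/2} + \dots\big)\,d\pi_w(t)\Big\} \\
&= \sqrt{x}\,\exp\Big\{\int_1^{\infty} \frac{t^{-3/2}}{1 - t^{-1/2}}\,\frac{1-\cos(\log t)}{\log t}\,dt\Big\}.
\end{aligned}$$
We note that
$$\frac{1-\cos(\log t)}{\log t} \le \begin{cases} (\log t)/2, & 1 < t \le e, \\ 2, & t > e. \end{cases}$$
Hence the last integral is at most
$$\int_1^{e} \frac{t^{-3/2}\log t}{2\,(1 - t^{-1/2})}\,dt + \int_e^{\infty} \frac{2\,t^{-3/2}}{1 - e^{-1/2}}\,dt \ll 1,$$
since $\log t/(1 - t^{-1/2})$ is bounded on $(1, e]$. It follows that $N_2(x) \ll \sqrt{x}$.

Proof of Theorem 13.9. First note that since $N_2(x) \ll \sqrt{x}$, integration by parts gives
$$\int_{1-}^{y} \frac{dN_2(t)}{t} = \int_{1-}^{\infty} \frac{dN_2(t)}{t} - \int_{y}^{\infty} \frac{dN_2(t)}{t} = A + O(y^{-1/2}), \qquad y \to \infty,$$
for some constant $A$. Apply the hyperbola method, writing
$$N_w(x) = \int_{1-}^{x} dN_1 * dN_2 = I + II$$
with
$$I := \int_{1-}^{\sqrt{x}} N_1(x/t)\,dN_2(t), \qquad II := \int_{1-}^{\sqrt{x}} \{N_2(x/t) - N_2(\sqrt{x})\}\,dN_1(t).$$
Now
$$\begin{aligned}
I &= \int_{1-}^{\sqrt{x}} \Big\{\frac{x}{t} + O\Big(\frac{x/t}{\log^{3/2}(x/t)}\Big)\Big\}\,dN_2(t) \\
&= Ax + O(x^{3/4}) + O\Big(\frac{x}{\log^{3/2} x}\int_{1-}^{\sqrt{x}} t^{-1}\,dN_2(t)\Big) \\
&= Ax + O\Big(\frac{x}{\log^{3/2} x}\Big)
\end{aligned}$$
with $A = \int_{1-}^{\infty} t^{-1}\,dN_2(t)$. Next, by another integration by parts,
$$II = O\Big(\int_{1-}^{\sqrt{x}} \sqrt{x/t}\;dN_1(t)\Big) = O\big(x^{3/4}\big).$$
The two calculations together give the claimed formula for $N_w(x)$.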
13.6. Optimality – a discrete example

In the theory of g-numbers presented here, discreteness of primes has been of minor importance: most of our results and several examples apply also to continuous number distributions. Still, it is psychologically satisfying to have examples using discrete g-primes. Here, we convert the continuous example of the last section into a discrete one by a construction using the floor function. (In Chapter 17, below, a continuous example is made discrete by a stochastic process.)

We start with the wobbly g-prime counting function $\pi_w$ of the last section and define a collection of g-primes by
$$\mathcal{P}_d := \{p_n := \pi_w^{-1}(n)\}, \quad\text{or}\quad \pi_d(x) = \lfloor \pi_w(x) \rfloor.$$
Since $\pi_d(x) = \pi_w(x) + O(1)$, the result of the last section shows that the PNT does not hold for $\mathcal{P}_d$. Now we show that $\mathcal{N}_d$, the collection of g-integers generated by $\mathcal{P}_d$, satisfies the counting condition (13.1). Our argument follows that of the last section. With
$$\Pi_d(x) := \pi_d(x) + \tfrac12\pi_d(x^{1/2}) + \tfrac13\pi_d(x^{1/3}) + \cdots =: \pi_d(x) + \Pi_r(x),$$
say, we have
$$N_d(x) := \int_{1-}^{x} \exp^* d\Pi_d = \int_{1-}^{x} \exp^* d\pi_d * \exp^* d\Pi_r. \tag{13.11}$$
As before, we analyze each of the last two convolution factors. We have $\pi_d(x) \le \pi_w(x)$ for all $x$, and both functions are nondecreasing. It follows from Lemma 2.9 that $\int_1^x d\pi_d^{*n} \le \int_1^x d\pi_w^{*n}$ holds for all positive integers $n$ and $x \ge 1$. Hence
$$N_r(x) := \int_{1-}^{x} \exp^* d\Pi_r \le \int_{1-}^{x} \exp^* d\Pi_2 =: N_2(x),$$
and, by Lemma 13.11, $N_r(x) \ll \sqrt{x}$. Our main job is to show, analogously with Lemma 13.10,

Lemma 13.12.
$$N_4(x) := \int_{1-}^{x} \exp^* d\pi_d = e^{k}x + O(x/\log^{3/2} x),$$
where $k := \int_1^{\infty} t^{-1}(d\pi_d - d\pi_w)(t)$ (a negative number).

Here is an outline of our argument: by the definitions of $N_1$ and $N_4$,
$$N_4(x) = e^{k}\int_{1-}^{x} dN_1 * \exp^*(d\pi_d - d\pi_w - k\delta_1).$$
Applying the Dirichlet hyperbola method, $N_4(x) = I + II$, say, with
$$I := e^{k}\int_{1-}^{\sqrt{x}} N_1(x/t)\,\exp^*(d\pi_d - d\pi_w - k\delta_1)(t),$$
$$II := e^{k}\int_{1-}^{\sqrt{x}} \Big\{\int_{\sqrt{x}}^{x/s} \exp^*(d\pi_d - d\pi_w - k\delta_1)\Big\}\,dN_1(s).$$
The main contribution to $N_4(x)$ comes from $I$, whose evaluation starts with Lemma 13.10. Then, we use the homomorphic property (2.16) of the multiplication operator $T^{\alpha}$ to obtain $T^{-1}\exp^* dF = \exp^*\{T^{-1}dF\}$ for any $dF \in d\mathcal{V}$. We find
$$I = e^{k}x\int_{1-}^{\sqrt{x}} \Big\{1 + \frac{c_0\cos(\log x/t)}{\log^{3/2}(ex/t)} + \frac{c_1\cos(\log x/t)}{\log^{5/2}(ex/t)} + O\Big(\log^{-7/2}\frac{ex}{t}\Big)\Big\}\,\exp^* d\nu(t), \tag{13.12}$$
with $d\nu := T^{-1}\{d\pi_d - d\pi_w - k\delta_1\}$. We now give growth estimates for $\nu$, its convolution powers, and its total variation function. Let $\nu_1(x) := \nu(x)$ and for $n \ge 2$,
$$\nu_n(x) := \int_{1-}^{x} (d\nu)^{*n} = \int_{1-}^{x} \nu_{n-1}(x/t)\,d\nu(t).$$

Lemma 13.13. There exist positive constants $A_0$, $A_1$, and $B_0$ such that for all $x > 1$ and all positive integers $n$,
$$|\nu_n(x)| \le nA_0(2\log\log ex + A_1)^{n-1}x^{-1/n} \tag{13.13}$$
and
$$\int_{1-}^{x} |d\nu| \le 2\log\log ex + B_0. \tag{13.14}$$

Proof. We establish (13.13) by induction. First,
$$\nu_1(x) = \int_{1-}^{x} u^{-1}\{d\pi_d - d\pi_w - k\delta_1\}(u).$$
The definition of $k$ implies that $\nu_1(\infty) = 0$; thus
$$\nu_1(x) = -\int_{x}^{\infty} u^{-1}\{d\pi_d - d\pi_w\}(u).$$
We have omitted the term $k\delta_1$ since $x > 1$. Integrating by parts and recalling that $-1 < \pi_d(u) - \pi_w(u) \le 0$, we see that $-1/x \le \nu_1(x) \le 1/x$ for all $x > 1$. Thus the case $n = 1$ is established, provided that $A_0 \ge 1$.

Before we go further, we show that (13.14) holds. Indeed, since $k < 0$,
$$\begin{aligned}
\int_{1-}^{x} |d\nu| &= \int_{1-}^{x} u^{-1}\{d\pi_d + d\pi_w + |k|\delta_1\}(u) \\
&= 2\int_1^{x} u^{-1}\,d\pi_w(u) + \int_{1-}^{x} u^{-1}\{d\pi_d - d\pi_w - k\delta_1\}(u) \\
&= 2\int_1^{x} \frac{1-\cos(\log u)}{u\log u}\,du + \nu(x) \\
&\le 2\int_0^{\log x} \frac{1-\cos v}{v}\,dv + \frac{1}{x} \le 2\log\log ex + B_0.
\end{aligned}$$
(We have used the facts that the last integrand is bounded for $v$ near 0 and, by integration by parts, $\int_1^{w} \cos v\,dv/v$ is bounded for all $w > 1$.)
Now suppose that case $n$ of (13.13) holds, and apply the hyperbola method again, this time with $y = x^{1/(n+1)}$ and $z = x/y$. Assuming also that $A_1 > B_0$, we have
$$\nu_{n+1}(x) = \iint_{st \le x} (d\nu)^{*n}(s)\,d\nu(t) =: I_1 + I_2 - I_3,$$
say, with
$$\begin{aligned}
|I_1| &= \Big|\int_{1-}^{y} \nu_n(x/t)\,d\nu(t)\Big| \le nA_0(2\log\log ex + A_1)^{n-1}(x/y)^{-1/n}\int_{1-}^{y} |d\nu| \\
&\le nA_0(2\log\log ex + A_1)^{n-1}x^{-1/(n+1)}(2\log\log ex + B_0),
\end{aligned}$$
$$|I_2| = \Big|\int_{1-}^{z} \nu(x/t)\,(d\nu)^{*n}(t)\Big| \le \frac{A_0}{x/z}\Big(\int_{1-}^{z} |d\nu|\Big)^{n} \le A_0x^{-1/(n+1)}(2\log\log ex + B_0)^{n},$$
$$|I_3| = |\nu(y)\,\nu_n(z)| \le \frac{A_0}{y}\,nA_0(2\log\log ex + A_1)^{n-1}z^{-1/n} \le nA_0^{2}\,x^{-2/(n+1)}(2\log\log ex + A_1)^{n-1}.$$
Now
$$\begin{aligned}
|I_1| + |I_2| + |I_3| &\le A_0x^{-1/(n+1)}(2\log\log ex + A_1)^{n-1} \times \big\{(n+1)(2\log\log ex + B_0) + nA_0x^{-1/(n+1)}\big\} \\
&\le (n+1)A_0x^{-1/(n+1)}(2\log\log ex + A_1)^{n},
\end{aligned}$$
if we choose $A_1 = B_0 + A_0$.

Recall $d\nu(t) := t^{-1}\{d\pi_d(t) - d\pi_w(t) - k\delta_1\}$ and $\nu_n(x) := \int_{1-}^{x} d\nu^{*n}$. We now apply the bounds of the last lemma.

Lemma 13.14. For $y > 1$ and each $k > 0$ we have the estimates
$$f(y) := -1 + \int_{1-}^{y} \exp^* d\nu \ll \log^{-k} y,$$
$$F(y) := \int_{1-}^{y} \exp^*\{d\pi_d - d\pi_w - k\delta_1\} \ll y\log^{-3} y,$$
$$h(y) := \int_{1-}^{y} \exp^* |d\nu| \ll \log^{2} y.$$

Proof. For $y > 1$, $f(y) = \sum_{n=1}^{\infty} \nu_n(y)/n!$. Break the sum at
$$N := \lfloor (8 + k)(\log\log ey + A_1) \rfloor,$$
with constants coming from estimates of the preceding lemma. We find
$$\Big|\sum_{n=1}^{N} \frac{\nu_n(y)}{n!}\Big| \le A_0\,y^{-1/N}\exp(2\log\log ey + A_1) \ll \log^{-k} y$$
(with much to spare!) and
$$\Big|\sum_{n=N+1}^{\infty} \frac{\nu_n(y)}{n!}\Big| \le A_0\sum_{n=N}^{\infty} \frac{(2\log\log ey + A_1)^{n}}{n!} \le 2A_0\frac{(2\log\log ey + A_1)^{N}}{N!} \le 2A_0\Big(\frac{e}{8+k}\Big)^{N} \le 2A_0e^{-N} \ll \log^{-k} y.$$
The two estimates give $f(y) \ll \log^{-k} y$.

For the second estimate, write
$$F(y) = \int_{1-}^{y} T\exp^* d\nu = \int_{1-}^{y} \{\delta_1 + t\,df(t)\},$$
integrate by parts, and use the preceding estimate for $f$ with $k = 3$. We find $F(y) \ll y\log^{-3} y$.

For the estimate of $h$, recall that $|d\nu| = T^{-1}d\pi_d + T^{-1}d\pi_w + |k|\delta_1$, and thus
$$\exp^* |d\nu| = T^{-1}dN_d * T^{-1}dN_w * \exp^*(|k|\delta_1).$$
Since $\pi_d(y) \le \pi_w(y)$, we have $N_d(y) \le N_w(y)$, and then, by Lemma 2.1,
$$\int_{1-}^{y} T^{-1}dN_d \le \int_{1-}^{y} T^{-1}dN_w$$
for all $y \ge 1$. Also, the integral of $\exp^*(|k|\delta_1)$ is $e^{|k|}$, a constant. Thus we have
$$\int_{1-}^{y} \exp^* |d\nu| \ll \int_{1-}^{y} T^{-1}dN_d * T^{-1}dN_w \le \int_{1-}^{y} \big(T^{-1}dN_w\big)^{*2} \le \Big(\int_{1-}^{y} T^{-1}dN_w\Big)^{2} \ll \log^{2} y,$$
the last since $N_w(x) \ll x$ by Theorem 13.9.

Proof of Lemma 13.12. We now approximate $N_4(x)$ by studying $I$ and $II$. Recall that $N_1 \uparrow$, $N_1(x) \ll x$, and $x/s > \sqrt{x}$ in our $s$-range. Also, we estimate $\int_{1-}^{y} \exp^*(d\pi_d - d\pi_w - k\delta_1)$ using the preceding lemma. We find
$$II := e^{k}\int_{1-}^{\sqrt{x}} \Big\{\int_{\sqrt{x}}^{x/s} \exp^*(d\pi_d - d\pi_w - k\delta_1)\Big\}\,dN_1(s) = \int_{1-}^{\sqrt{x}} O\Big(\frac{x}{s}\log^{-3}\frac{x}{s}\Big)\,dN_1(s) = O(x\log^{-2} x).$$
Now we turn to
$$I = e^{k}x\int_{1-}^{\sqrt{x}} \Big\{1 + \frac{c_0\cos(\log x/t)}{\log^{3/2}(ex/t)} + \frac{c_1\cos(\log x/t)}{\log^{5/2}(ex/t)} + O\Big(\log^{-7/2}\frac{ex}{t}\Big)\Big\}\,\exp^* d\nu(t). \tag{13.12 bis}$$
The main contribution to $I$ comes from the term 1: by Lemma 13.14,
$$I_1 := e^{k}x\int_{1-}^{\sqrt{x}} \exp^* d\nu(t) = e^{k}x\{1 + f(\sqrt{x})\} = e^{k}x + O(x\log^{-3/2} x).$$
For the $O$-term, use the bound on $h$ from Lemma 13.14:
$$I_4 := x\int_{1-}^{\sqrt{x}} O\Big(\log^{-7/2}\frac{ex}{t}\Big)\exp^* d\nu(t) \ll x\log^{-7/2} x\int_{1-}^{\sqrt{x}} \exp^* |d\nu| \ll x\log^{-3/2} x.$$
The remaining parts of (13.12 bis) can be expressed as a linear combination of four terms of the form
$$I_{a,b} := x\int_{1-}^{\sqrt{x}} (x/t)^{ai}\log^{-b}(ex/t)\,\exp^* d\nu(t),$$
with $a = \pm 1$ and $b = 3/2,\ 5/2$. The representation $\exp^* d\nu = \delta_1 + df$ gives
$$I_{a,b} = x^{1+ai}\log^{-b}(ex) + x^{1+ai}\int_{1-}^{\sqrt{x}} t^{-ai}\log^{-b}(ex/t)\,df(t).$$
Then, integrating by parts, and using the bound on $f(y)$ from the last lemma, we find $I_{a,b} \ll x\log^{-b}(ex)$. Together, these estimates imply that $N_4(x) = e^{k}x + O(x\log^{-3/2} x)$.

Conclusion of the example. It remains to establish the formula for $N_d(x)$. We can simply repeat mutatis mutandis the argument that concluded the proof of Theorem 13.9. For that result, we had
$$N_w(x) = \int_{1-}^{x} \exp^*(d\pi_w + d\Pi_2),$$
with (Lemmas 13.10 and 13.11)
$$\int_{1-}^{y} \exp^* d\pi_w = y + O(y\log^{-3/2} y), \qquad \int_{1-}^{y} \exp^* d\Pi_2 \ll y^{1/2},$$
while here we have
$$N_d(x) = \int_{1-}^{x} \exp^*(d\pi_d + d\Pi_r)$$
with
$$\int_{1-}^{y} \exp^* d\pi_d = e^{k}y + O(y\log^{-3/2} y), \qquad \int_{1-}^{y} \exp^* d\Pi_r \ll y^{1/2}.$$
Thus we have $N_d(x) = cx + O(x\log^{-3/2} x)$ for a suitable constant $c$.
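The floor construction of this section can be made concrete numerically. The Python sketch below is only an illustration under stated assumptions: the function names, the Simpson step count, and the bisection bracket `hi` are choices made here, not taken from the text. It evaluates $\pi_w(x) = \int_1^x (1-\cos(\log t))/\log t\,dt$ through the substitution $t = e^{v}$, sets $\pi_d(x) = \lfloor \pi_w(x) \rfloor$, and locates the first few discrete g-primes $p_n = \pi_w^{-1}(n)$ by bisection.

```python
import math

def pi_w(x, steps=2000):
    """pi_w(x) = int_1^x (1 - cos(log t))/log t dt.

    With t = e^v this equals int_0^{log x} e^v (1 - cos v)/v dv, which is
    evaluated by Simpson's rule (steps must be even).
    """
    if x <= 1.0:
        return 0.0
    upper = math.log(x)
    h = upper / steps

    def g(v):
        # The integrand tends to 0 as v -> 0.
        return 0.0 if v == 0.0 else math.exp(v) * (1.0 - math.cos(v)) / v

    total = g(0.0) + g(upper)
    for j in range(1, steps):
        total += (4.0 if j % 2 else 2.0) * g(j * h)
    return total * h / 3.0

def pi_d(x):
    """Counting function of the discrete g-primes: floor of pi_w."""
    return math.floor(pi_w(x))

def g_prime(n, hi=1.0e3):
    """Solve pi_w(p) = n by bisection on [1, hi].

    The bracket hi is an assumption; it must satisfy pi_w(hi) >= n.
    """
    lo = 1.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if pi_w(mid) < n:
            lo = mid
        else:
            hi = mid
    return hi

if __name__ == "__main__":
    for n in range(1, 6):
        p = g_prime(n)
        print(n, p, pi_w(p))
```

Because $\pi_w$ is continuous and nondecreasing, each $p_n$ is the point at which $\pi_d$ jumps from $n-1$ to $n$, which is the relation $\pi_d(x) = \pi_w(x) + O(1)$ used at the start of the section.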
13.7. Notes

§13.1. Theorem 13.1 was proved by Beurling in [Be37]. It is given also in [BD69] and [MV07]. A small extension appears in [Di69]. Another proof of the PNT under a weaker assumption than (13.1) has been given by J. C. Schlage-Puchta and J. Vindas [SPV12]. Their hypothesis is a Cesàro version of Beurling’s, namely, for some positive integer $m$ and real $\gamma > 3/2$,
$$\int_1^{x} \frac{N(t) - at}{t}\Big(1 - \frac{t}{x}\Big)^{m}\,dt = O\Big(\frac{x}{\log^{\gamma} x}\Big), \qquad x \to \infty.$$
The authors give an example where their condition holds and Beurling’s fails.
§13.4. Our proof of Theorem 13.1 cites the tauberian theorem of Chapter 10. An alternative approach would be to extend results from Chapter 11, for here $E(x) := x^{-1}\{N(x) - Ax\} \ll (\log x)^{-\gamma}$ holds with $\gamma > 3/2$, whence (11.1), the basic condition of that chapter, is satisfied. However, that path would retrace arguments of Chapter 10. The results of Lemma 13.7 and Corollary 10.6 substantiate the remark in [BD69], p. 199, that if the zeta function has no zeros with real part 1, then the PNT can be derived from an $L^2$ hypothesis.

§13.5. We emphasize that Beurling’s result is optimal in the class of counting functions having the form (13.1); as we shall show in Chapter 15, the PNT holds under a weaker $L^2$ condition.

§13.6. The reader is invited to consider why in proving Lemma 13.12 we needed more terms from Lemma 13.10 than we did in proving Theorem 13.9.
https://doi.org/10.1090//surv/213/14
CHAPTER 14
Equivalences to the PNT

Vive la différence

Summary. In classical prime number theory there are several asymptotic formulas that are called “equivalent” to the PNT. Here we investigate conditions under which Beurling analogues of these relations do or do not hold.

14.1. Introduction

In classical prime number theory, there are several asymptotic formulas said to be “equivalent” to the PNT. These include
$$\psi(x) \sim x, \tag{14.1}$$
$$M(x) := \sum_{n \le x} \mu(n) = o(x), \tag{14.2}$$
$$m(x) := \sum_{n \le x} \mu(n)/n = o(1), \tag{14.3}$$
$$\int_1^{x} \frac{d\psi(t)}{t} = \sum_{n \le x} \frac{\Lambda(n)}{n} = \log x - \gamma + o(1). \tag{14.4}$$
Here $\gamma$ is Euler’s constant. The sense of the equivalence is that these statements can be derived from one another without complex variable or Fourier theory. Further remarks on this notion are given in the Notes at the end of the chapter. Each of the preceding formulas has an interpretation for g-numbers, which we discuss in the next section.

In this chapter we investigate conditions under which the Beurling versions of these relations do or do not imply one another. Some of the implications hold unconditionally, while others are not valid without further hypotheses. Because the relations all imply each other in the classical case, we expect them to do the same under conditions that approximate the classical ones, e.g. if the g-integer system has a positive density $A$ or, perhaps, if $N(x) - Ax$ is suitably small.

14.2. Implications

First, we set out the Beurling version of the relations (14.1) – (14.4). The psi relation (14.1) is equivalent to $\Pi(x) \sim x/\log x$ by integration by parts. The formula
$$\psi_1(x) := \int_1^{x} \frac{d\psi(t)}{t} = \log x + c_1 + o(1), \qquad x \to \infty, \tag{14.5}$$
is an analogue of (14.4) for g-numbers. We call this a “sharp Mertens relation.” Note that $c_1 \ne -\gamma$ in general. We can see this by making a g-number system
containing all primes in $\mathbb{N}$ along with one additional g-prime $\lambda$. In such a system, (14.5) holds with $c_1 = -\gamma + (\log\lambda)/(\lambda - 1)$.

In classical number theory, $M(x)$ denotes the sum function of the Moebius $\mu$ function. The characteristic property of $\mu$, as we noted in §3.3, is that it is the convolution inverse of the 1 function. For a g-number system with counting function $N$, we define $dM := dN^{*-1}$, and, by Corollary 3.11, have $dM = \exp^*\{-d\Pi\}$. Also set $m(x) := \int_{1-}^{x} t^{-1}\,dM(t)$. Note that these definitions make sense even when $dM$ is not discrete or when factorization into primes is not unique.

The following diagram shows the implications we shall establish between our several asymptotic formulas. Double arrows indicate unconditional implications. Single arrows indicate implications subject to an additional condition. Absence of vertical arrows indicates that we cannot prove implications without hypotheses which themselves imply the assertions $\psi(x) \sim x$ and $\psi_1(x) = \log x + c_1 + o(1)$.
$$\begin{array}{ccc}
\psi(x) \sim x & \overset{\Longleftarrow}{\longrightarrow} & \psi_1(x) = \log x + c_1 + o(1) \\[6pt]
M(x) = o(x) & \overset{\Longleftarrow}{\longrightarrow} & m(x) = o(1)
\end{array}$$
Figure 2. Implications between four formulas
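The value $c_1 = -\gamma + (\log\lambda)/(\lambda - 1)$ quoted in §14.2 reflects the fact that each power $\lambda^{k}$ of the extra g-prime adds $\log\lambda/\lambda^{k}$ to $\psi_1(x)$ once $x \ge \lambda^{k}$, and these terms form a geometric series. The short Python sketch below checks that arithmetic; the function names and the cutoff `terms` are illustrative assumptions, and $\lambda$ is assumed moderately larger than 1 so that the truncated tail is negligible.

```python
import math

def extra_prime_contribution(lam, terms=80):
    """Sum of log(lam)/lam**k for k = 1..terms.

    This is the amount that the powers of one additional g-prime lam add
    to psi_1(x) once x exceeds lam**terms (terms is an assumed cutoff).
    """
    return sum(math.log(lam) / lam ** k for k in range(1, terms + 1))

def closed_form(lam):
    """Value of the full geometric series: log(lam)/(lam - 1)."""
    return math.log(lam) / (lam - 1.0)

if __name__ == "__main__":
    for lam in (2.5, 7.3):
        print(lam, extra_prime_contribution(lam), closed_form(lam))
```

Since the contribution is strictly positive for every $\lambda > 1$, the constant in (14.5) for such a system always exceeds $-\gamma$.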
14.3. Sharp Mertens relation and the PNT

One relation between the sharp Mertens formula and the PNT is true unconditionally:

Proposition 14.1. If a g-number system satisfies (14.5), then the PNT holds.

Proof. We verify (14.1) by applying integration by parts:
$$\begin{aligned}
\psi(x) &= \int_1^{x} t\,d\psi_1(t) = x\psi_1(x) - \int_1^{x} \psi_1(t)\,dt \\
&= x\log x + c_1x + o(x) - \int_1^{x} \{\log t + c_1 + o(1)\}\,dt \\
&= x + o(x).
\end{aligned}$$
Next we show that (14.5) actually implies somewhat more than the PNT.

Proposition 14.2. Suppose that a sharp Mertens-type relation holds for a g-number system. Then
$$\int_1^{x} \{\psi(t) - t\}\,t^{-2}\,dt$$
converges to a finite limit as $x \to \infty$.

Proof. Again apply integration by parts, this time to $d\psi(t)/t$, to get
$$\psi_1(x) = \frac{\psi(x)}{x} + \int_1^{x} \frac{\psi(t)}{t^{2}}\,dt = 1 + o(1) + \log x + \int_1^{x} \{\psi(t) - t\}\,t^{-2}\,dt.$$
Since $\psi_1(x) - \log x$ has a limit as $x \to \infty$, so does the last integral on the right hand side.
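For the rational primes, both (14.1) and the classical relation (14.4) can be observed numerically. The Python sketch below is an illustration only; the function name and the sample point $x = 10^{5}$ are assumptions made here. It sieves the primes up to $x$ and accumulates $\psi(x) = \sum_{n \le x} \Lambda(n)$ and $\psi_1(x) = \sum_{n \le x} \Lambda(n)/n$ over prime powers.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def chebyshev_sums(x):
    """Return (psi(x), psi_1(x)) for the rational primes.

    psi(x) is the sum of Lambda(n) over n <= x and psi_1(x) is the sum of
    Lambda(n)/n, where Lambda(p^k) = log p and Lambda is 0 elsewhere.
    """
    limit = int(x)
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    psi = 0.0
    psi1 = 0.0
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        # Cross out the multiples of the prime p.
        for m in range(p * p, limit + 1, p):
            is_prime[m] = False
        # Add the contributions of p, p^2, p^3, ... up to the limit.
        log_p = math.log(p)
        pk = p
        while pk <= limit:
            psi += log_p
            psi1 += log_p / pk
            pk *= p
    return psi, psi1

if __name__ == "__main__":
    x = 10 ** 5
    psi, psi1 = chebyshev_sums(x)
    print("psi(x)/x          =", psi / x)
    print("psi_1(x) - log x  =", psi1 - math.log(x))
    print("-gamma            =", -EULER_GAMMA)
```

In a general g-number system the same two quantities are defined through $d\psi = L\,d\Pi$, and the constant $-\gamma$ is replaced by the system-dependent $c_1$ of (14.5).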
The last proposition shows that proof of (14.5) requires more than the PNT alone. Note for later use that divergence of $\int_1^{\infty} \{\psi(t) - t\}\,t^{-2}\,dt$ could occur when $\psi(t) - t$ is too large, for example if it exceeds $t/\log t$, or for certain oscillatory cases. Now we give a condition under which the PNT does imply (14.5).

Theorem 14.3. Suppose that the PNT holds and that, for some constant $A > 0$,
$$|N(x) - Ax| \le xD(x), \qquad x \ge 1, \tag{14.6}$$
where $D$ is right continuous, monotone decreasing, and satisfies
$$\int_1^{\infty} x^{-1}D(x)\,dx < \infty. \tag{14.7}$$
Then the sharp Mertens relation (14.5) holds as well.

Corollary 14.4. Suppose that the PNT holds and that (14.7) is satisfied with
$$D(x) := \sup_{y \ge x} |N(y) - Ay|/y.$$
Then (14.5) holds.

Proof of Theorem 14.3. The key for our argument is the formula
$$\int_1^{x} \log t\,dN(t) = \int_1^{x} dN * d\psi,$$
the integral form of Chebyshev’s identity (3.3) for g-numbers. The left hand side, integrated by parts, becomes
$$N(x)\log x - \int_1^{x} \frac{N(y)}{y}\,dy = Ax\log x - Ax + o(x), \tag{14.8}$$
since $D(y) = o(1/\log ey)$ by Lemma 9.6, and
$$\int_1^{x} y^{-1}N(y)\,dy = A(x - 1) + \int_1^{x} O(D(y))\,dy = Ax + o(x/\log x).$$
For a measure $d\alpha$ on $[1, \infty)$ and Dirac point mass $\delta_1$ at 1, we have
$$\int_{1-}^{x} (\delta_1 + du) * d\alpha = \iint_{st \le x} (\delta_1 + ds)\,d\alpha(t) = \int_{1-}^{x}\int_{1-}^{x/t} (\delta_1 + ds)\,d\alpha(t) = \int_{1-}^{x} \frac{x}{t}\,d\alpha(t).$$
Thus, upon adding and subtracting $(A\delta_1 + A\,dt) * d\psi$ on the right hand side of the Chebyshev identity, it becomes
$$Ax\int_1^{x} \frac{d\psi(t)}{t} + \int_1^{x} (dN - A\delta_1 - A\,dt) * d\psi. \tag{14.9}$$
Now we show that
$$I := \int_1^{x} (dN - A\delta_1 - A\,dt) * d\psi = cx + o(x)$$
for some constant $c$. We rewrite $I$ as $I_1 + I_2$, with
$$I_1 := \int_{1-}^{x} (dN - A\delta_1 - A\,dt) * (\delta_1 + dt)$$
and
$$I_2 := \int_{1-}^{x} (dN - A\delta_1 - A\,dt) * (d\psi - \delta_1 - dt).$$
We have
$$\begin{aligned}
I_1 &= \int_{1-}^{x} \frac{x}{t}\,(dN - A\delta_1 - A\,dt) = N(x) - Ax + x\int_1^{x} \frac{N(t) - At}{t^{2}}\,dt \\
&= O(xD(x)) + x(c + o(1)) = cx + o(x),
\end{aligned}$$
with
$$c = \int_1^{\infty} \frac{N(t) - At}{t^{2}}\,dt.$$
(The integral is convergent, since it is dominated by $\int_1^{\infty} t^{-1}D(t)\,dt$.) Then apply Axer’s Theorem 7.2 to estimate $I_2$. For $x \ge 1$, take
$$A(x) := N(x) - Ax, \qquad B(x) := \psi(x) - x.$$
We have $|A(x)| \le xD(x)$, with $D(x)$ satisfying the conditions of Case (B). Also, $B(x) = o(x)$ by the PNT and $B_v(x) = \psi(x) + x \ll x$. By the theorem, $I_2 = o(x)$. Combining (14.8) and (14.9) with the approximation of $I$, we get
$$Ax\log x - Ax + o(x) = Ax\,\psi_1(x) + cx + o(x),$$
which is equivalent to (14.5).

We conclude this section by mentioning a special case of Theorem 14.3.

Corollary 14.5. Suppose that the PNT holds and that (14.6) is satisfied with
$$D(x) := C\log^{-\gamma} x, \qquad x \ge x_0, \tag{14.10}$$
with $C > 0$ and $\gamma > 1$. Then (14.5) holds.

14.4. Optimality of the sharp Mertens theorem

We observed in Proposition 14.1 that the PNT is needed in order to establish the sharp Mertens formula (14.5), but, as we saw in Proposition 14.2, the PNT by itself is not sufficient for this purpose. Theorem 14.3 and its corollaries provide conditions that yield (14.5). Are those conditions excessive? In particular, we ask whether perhaps we could prove the theorem under the weaker hypothesis
$$N(x) = cx + O(x/\log x). \tag{14.11}$$
The following example shows that the answer to this question is No.

Proposition 14.6. PNT and $N(x) = Ax + O(x/\log x)$ $\not\Rightarrow$ (14.5).

Proof. We give an example in which the first two conditions hold but the sharp Mertens relation fails. As the prime density, take
$$d\Pi(u) := \frac{1 - u^{-1}}{\log u}\,du + \Big(\frac{1 - u^{-1}}{\log u}\Big)^{2}\,du. \tag{14.12}$$
First note that the PNT holds for $d\Pi$. Indeed, for $x > e$,
$$\Pi(x) = c + \int_e^{x} \Big\{\frac{1}{\log u} + O\Big(\frac{1}{\log^{2} u}\Big)\Big\}\,du = \frac{x}{\log x} + O\Big(\frac{x}{\log^{2} x}\Big).$$
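We next observe that the sharp Mertens relation (14.5) does not hold for this prime density. Consider
$$\psi(x) = \int_1^{x} \log u\,d\Pi(u) = \int_1^{x} (1 - u^{-1})\,du + \int_1^{x} \frac{(1 - u^{-1})^{2}}{\log u}\,du = x + f(x)$$
with
$$f(x) := \int_1^{x} \frac{(1 - u^{-1})^{2}}{\log u}\,du - \log x - 1 \sim x/\log x.$$
By the contrapositive form of Proposition 14.2, (14.5) does not hold for this g-number system.

It remains to show that (14.11) holds for this example. Let
$$d\alpha(u) = \Big(\frac{1 - u^{-1}}{\log u}\Big)^{2}\frac{du}{u}.$$
Combining the convolution identities (3.1), (2.5), Lemma 3.10, (3.7), and Lemma 2.15, we get
$$\begin{aligned}
N(x) &= \int_{1-}^{x} \exp^*\Big\{\frac{1 - u^{-1}}{\log u}\,du + \Big(\frac{1 - u^{-1}}{\log u}\Big)^{2}du\Big\} \\
&= \int_{1-}^{x} (\delta_1 + du) * \exp^*\{u\,d\alpha(u)\} = \int_{1-}^{x} \frac{x}{u}\,\exp^*\{u\,d\alpha(u)\} \\
&= x\int_{1-}^{x} \exp^* d\alpha = x\Big\{1 + \sum_{n=1}^{\infty} \frac{1}{n!}\int_1^{x} d\alpha^{*n}\Big\} = e^{c}x - R(x),
\end{aligned}$$
where
$$c := \int_1^{\infty} d\alpha \quad\text{and}\quad R(x) := x\sum_{n=1}^{\infty} \frac{1}{n!}\int_x^{\infty} d\alpha^{*n}.$$
By the definition of convolution,
$$\int_x^{\infty} d\alpha^{*n} = \int\!\cdots\!\int_{s_1\cdots s_n > x} d\alpha(s_1)\cdots d\alpha(s_n).$$
At least one of the $s_i$ in the last integral exceeds $x^{1/n}$, and the remaining $s_j$ lie in $[1, \infty)$. Letting in succession $s_1, s_2, \dots, s_n$ assume the maximal value, we have
$$\int_x^{\infty} d\alpha^{*n} \le n\Big(\int_1^{\infty} d\alpha\Big)^{n-1}\int_{x^{1/n}}^{\infty} d\alpha \le n\,c^{n-1}\int_{x^{1/n}}^{\infty} \frac{du}{u\log^{2} u} = \frac{n^{2}c^{n-1}}{\log x},$$
and hence
$$0 \le R(x) < \sum_{n=1}^{\infty} \frac{x}{n!}\,\frac{n^{2}c^{n-1}}{\log x} = O\Big(\frac{x}{\log x}\Big).$$
This establishes (14.11).

14.5. Implications between $M(x) = o(x)$ and $m(x) = o(1)$

Proposition 14.7. Suppose $\mathcal{N}$ is a g-number system for which $m(x) = o(1)$. Then $M(x) = o(x)$.

Proof. Integrate by parts:
$$M(x) = \int_{1-}^{x} t\,dm(t) = x\,m(x) - \int_1^{x} m(t)\,dt = o(x).$$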
The converse implication need not be true without some condition on the density of integers.

Proposition 14.8. $M(x) = o(x)$ $\not\Rightarrow$ $m(x) = o(1)$.

Proof. We give the following example. Let $N_r(x) = \lfloor x \rfloor$, the counting function of rational integers, and let $dN := T^{-1}dN_r$. This is a g-number system which has integer density 0 and convolution inverse function
$$M(x) = \int_{1-}^{x} T^{-1}dM_r = \sum_{n \le x} \frac{1}{n}\,\mu(n).$$
By classical prime number theory, this $M(x) \to 0$ as $x \to \infty$, and so $M(x) = o(x)$ as $x \to \infty$. On the other hand,
$$m(x) := \int_{1-}^{x} T^{-1}dM = \sum_{n \le x} \frac{1}{n^{2}}\,\mu(n) = \frac{6}{\pi^{2}} + o(1) \ne o(1).$$
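The two sums in this example are easy to tabulate. The Python sketch below is an illustration only; the function names and the cutoff $x = 10^{4}$ are assumptions made here. It computes the Moebius function by a sieve and evaluates $\sum_{n \le x} \mu(n)/n$, which is small, and $\sum_{n \le x} \mu(n)/n^{2}$, which is close to $6/\pi^{2} \approx 0.6079$.

```python
import math

def mobius_upto(limit):
    """Return a list mu with mu[n] equal to the Moebius function, 0 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        # Each prime factor p flips the sign of mu at its multiples.
        for m in range(p, limit + 1, p):
            if m > p:
                is_prime[m] = False
            mu[m] = -mu[m]
        # Multiples of p^2 are not squarefree, so mu vanishes there.
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0
    return mu

def twisted_sums(x):
    """Return (sum of mu(n)/n, sum of mu(n)/n^2) over 1 <= n <= x.

    In the example these are M(x) and m(x) for the system dN = T^{-1} dN_r.
    """
    mu = mobius_upto(x)
    big_m = sum(mu[n] / n for n in range(1, x + 1))
    small_m = sum(mu[n] / (n * n) for n in range(1, x + 1))
    return big_m, small_m

if __name__ == "__main__":
    big_m, small_m = twisted_sums(10 ** 4)
    print("M(x) =", big_m)
    print("m(x) =", small_m, " 6/pi^2 =", 6.0 / math.pi ** 2)
```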
However, if $N(x) - Ax$ is small enough, then $M(x) = o(x) \Longrightarrow m(x) = o(1)$.

Proposition 14.9. Suppose $\mathcal{N}$ is a g-number system for which $M(x) = o(x)$ and for some $A > 0$,
$$|At - N(t)| \le tD(t) \quad\text{with } D \downarrow \text{ and } \int_1^{\infty} D(t)\,t^{-1}\,dt < \infty.$$
Then $m(x) = o(1)$.

Proof. We use Axer’s Theorem. For $x \ge 1$, write
$$x\,m(x) = \int_{1-}^{x} \frac{x}{t}\,dM(t) = A^{-1} + \int_{1-}^{x} \{\delta_1 + dt - A^{-1}dN\} * dM.$$
Evaluate the last integral using Form (B) of Theorem 7.2. With $A(t) := t - A^{-1}N(t)$ and $B(t) := M(t)$, the conditions are satisfied, and we conclude that the integral is $o(x)$ and thus $m(x) = o(1)$.

14.6. Connections of the PNT with $M(x) = o(x)$

Proposition 14.10. Let $\mathcal{N}$ be a g-number system for which the PNT holds and, further, suppose $\mathcal{N}$ has log density. Then $M(x) = o(x)$.

Proof. By Corollary 7.5, $\mathcal{N}$ has density; let $A$ denote the density. Applying Proposition 3.15 to the relation $dM = \exp^*\{-d\Pi\}$, we find
$$L\,dM = \exp^*\{-d\Pi\} * L\{-d\Pi\} = -dM * d\psi.$$
Now add and subtract terms $(\delta_1 + dt) * dM$ and $A^{-1}dN * dM$ $(= A^{-1}\delta_1)$ in the last formula and integrate to obtain
$$\int_1^{x} L\,dM = \int_{1-}^{x} \Big\{\frac{x}{t} - \psi\Big(\frac{x}{t}\Big)\Big\}\,dM(t) + \int_{1-}^{x} \Big\{\frac{1}{A}N\Big(\frac{x}{t}\Big) - \frac{x}{t}\Big\}\,dM(t) - \frac{1}{A} =: I + II - 1/A. \tag{14.13}$$
The left side of (14.13) equals
$$M(x)\log x - \int_1^{x} \frac{M(t)}{t}\,dt = M(x)\log x + O(x),$$
since $|\exp^*\{-d\Pi\}| \le \exp^* d\Pi$ and thus $|M(t)| \le N(t) \ll t$.
On the right side of (14.13), the PNT and the $|dM|$ bound give
$$|I| \le \int_{1-}^{x} o(x/t)\,dt = o(x\log x).$$
Similarly, the density condition implies that $A^{-1}N(x/t) - x/t = o(x/t)$, whence $II = o(x\log x)$ also. Thus $M(x)\log x = o(x\log x)$.

Does the last proposition have a converse? The answer is No: we give an example of a g-number system with a density and satisfying $M(x) = o(x)$ for which the PNT does not hold. Our method is based on the Wiener-Ikehara Theorem and the observation that
$$dN + dM = \exp^*\{d\Pi\} + \exp^*\{-d\Pi\} = 2\sum_{n=0}^{\infty} d\Pi^{*2n}/(2n)! \ge 0. \tag{14.14}$$

Lemma 14.11. Assume $\mathcal{N}$ is a g-number system with density $A$ and zeta function $\zeta(s)$ and that both $\zeta(\sigma + it) - A/(\sigma + it - 1)$ and $1/\zeta(\sigma + it)$ converge in $L^{1}$ norm on every fixed interval $-T \le t \le T$ as $\sigma \to 1+$. Then $M(x) = o(x)$.

Proof. By the density condition, $N(e^{x}) \sim Ae^{x}$. We shall show that also $F(x) := N(e^{x}) + M(e^{x}) \sim Ae^{x}$. Set
$$G(s) := \int_0^{\infty} e^{-sx}F(x)\,dx - \frac{A}{s-1} = \frac{1}{s}\Big\{\zeta(s) - \frac{A}{s-1} + \zeta(s)^{-1} - A\Big\}.$$
We apply Corollary 10.6, the W-I Theorem. By (14.14), $F$ is increasing; also condition (10.7) holds by the hypothesis on $\zeta$ and $1/\zeta$. Thus $F(x)e^{-x} \to A$ as $x \to \infty$, and so $M(e^{x}) = o(e^{x})$.

Proposition 14.12. $M(x) = o(x)$ and the condition
$$N(x) = N_w(x) = Ax + O\big(x(\log ex)^{-3/2}\big) \tag{14.15}$$
do not imply the PNT.

Proof. We shall show that $M(x) = o(x)$ holds, but the PNT does not, for the g-number system $\mathcal{N}_w$ from §13.5 that satisfies (14.15). (This condition is clearly more demanding than density alone.) Our argument is based on the last lemma. First note, by (14.15), that $\zeta(s) - A/(s-1)$ can be extended to $\{s : \sigma \ge 1\}$ as a continuous function. We spell out the form of the function $1/\zeta$ by using results from §13.5. We have
$$dN := \exp^* d\Pi_w = \exp^*\{d\pi_w + d\Pi_2\} = dN_1 * dN_2,$$
with $N_1(x)$ satisfying (14.15), with a different value of $A$, and $N_2(x) \ll x^{1/2}$. Thus $\zeta(s) = \widehat{N}_1(s)\widehat{N}_2(s)$ for $\Re s > 1$. Here
$$\widehat{N}_1(s) = \int_{1-}^{\infty} x^{-s}\{\exp^* d\pi_w\}(x) = \exp\Big\{\int_1^{\infty} x^{-s}\,d\pi_w(x)\Big\} = \frac{\sqrt{(s-1)^2+1}}{s-1}$$
(using the branch of the square root that is positive for $s$ real and $s > 1$) and
$$\widehat{N}_2(s) = \int_{1-}^{\infty} x^{-s}\{\exp^* d\Pi_2\}(x) = \exp\Big\{\int_{1-}^{\infty} x^{-s}\,d\Pi_2(x)\Big\},$$
the last a function that is analytic and nonzero on $\{s : \sigma > 1/2\}$. Thus
$$\frac{1}{\zeta(s)} = \frac{s-1}{\sqrt{(s-1-i)(s-1+i)}}\,\widehat{N}_2(s)^{-1},$$
which has an $L^{1}$ limit as $\sigma \to 1+$ on any bounded $t$ interval. Applying the last lemma, we conclude that $M(x) = o(x)$.

Proposition 14.13. $M(x) = o(x)$ and density $\not\Rightarrow$ $\psi_1(x) = \log x + c_1 + o(1)$.

Proof. If the implication were true, we would have
$$M(x) = o(x) \text{ and density} \;\Rightarrow\; \psi_1(x) = \log x + c_1 + o(1) \;\Rightarrow\; \psi(x) \sim x$$
(the last implication by Proposition 14.1), which we have just shown false.

14.7. Sharp Mertens relation and $m(x) = o(1)$

Theorem 14.14. Suppose $\mathcal{N}$ is a g-number system for which
$$\psi_1(x) := \int_1^{x} \frac{d\psi(t)}{t} = \log x + c_1 + o(1) \tag{14.5 bis}$$
holds. Then $m(x) = o(1)$.

Proof. We begin as in the proof of Proposition 14.10, but multiply through the Chebyshev-type formula by the homomorphism $T^{-1}$. We find
$$L\,dm = L\,T^{-1}dM = -T^{-1}dM * T^{-1}d\psi = -dm * d\psi_1.$$
Integration and use of the $\psi_1$ formula yield
$$\int_1^{x} \log t\,dm(t) = -\int_1^{x} \psi_1(x/t)\,dm(t) = -\int_1^{x} \{\log(x/t) + c_1 + o(1)\}\,dm(t)$$
or (cancelling $\int_1^{x} \log t\,dm(t)$ and using the bound $|dm(t)| \le dN(t)/t$)
$$m(x)\log x = -c_1m(x) + o\Big(\int_1^{x} T^{-1}dN\Big) = -c_1m(x) + o(\log x).$$
Thus
$$m(x) = \frac{o(\log x)}{\log x + c_1} = o(1).$$
The last theorem does not have a converse.

Proposition 14.15. $m(x) = o(1)$ and density $\not\Rightarrow$ $\psi_1(x) = \log x + c_1 + o(1)$.

Proof. Again, it suffices to show that $m(x) = o(1)$ and density $\not\Rightarrow$ $\psi(x) \sim x$. Our construction is a modification of Example 13.8 from §13.5: to reduce calculation, we this time define a “wobbly” g-number system $\mathcal{N}$ by setting
$$\Pi(x) := \int_1^{x} \frac{1-\cos(\log t)}{\log t}\,dt.$$
As in §13.5, the PNT does not hold, since
$$\Pi(x)/\operatorname{li}(x) = 1 - \sin(\pi/4 + \log x)/\sqrt{2} + o(1), \qquad x \to \infty.$$
Also, by Lemma 13.10, the integer-counting function satisfies
$$N(x) := \int_{1-}^{x} \exp^* d\Pi = x + \frac{c_0\,x\cos(\log x)}{\log^{3/2} x} + O\Big(\frac{x}{\log^{5/2} x}\Big),$$
and thus this g-integer system has density.
It remains to show that $m(x) = o(1)$ as $x \to \infty$. From the proof of Lemma 13.10,
$$\zeta_{\mathcal{N}}(s) = \zeta(s) = \frac{\sqrt{(s-1)^2+1}}{s-1},$$
with the branch of the square root that is positive for $s$ real and exceeding 1. Now
$$1/\zeta(s+1) = \int_{1-}^{\infty} u^{-s-1}\,dM(u) = \int_{1-}^{\infty} u^{-s}\,dm(u),$$
and by the Mellin inversion formula, for any real number $b > 0$,
$$\frac12\{m(x+) + m(x-)\} = \lim_{T \to \infty} \frac{1}{2\pi i}\int_{b-iT}^{b+iT} \frac{x^{s}}{s}\,\frac{ds}{\zeta(s+1)}$$
or
$$m(x) = \lim_{T \to \infty} \frac{1}{2\pi i}\int_{b-iT}^{b+iT} \frac{x^{s}\,ds}{\sqrt{s^{2}+1}} + O(1/x).$$
The integrand of the last integral has singularities at $\pm i$. Arguing as in the proof of Lemma 13.10, we deform the contour to the vertical line $\{s : \Re s = -1/2\}$, with loops taken about $\pm i$ to avoid crossing the rays $(-\infty - i,\, -i)$ and $(-\infty + i,\, i)$. We claim first that the contribution of the integral along the vertical line is of order $x^{-1/2}$. We see this by writing the integral as
$$\int_{-1/2-i\infty}^{-1/2+i\infty} \frac{x^{s}}{s}\,ds + \int_{-1/2-i\infty}^{-1/2+i\infty} x^{s}\Big\{\frac{1}{\sqrt{s^{2}+1}} - \frac{1}{s}\Big\}\,ds,$$
with the understanding that the last integral is broken at the points $-1/2 \pm i$. The first integral is 0 by the residue argument used in proving the Perron formula, and the second integral has order
$$x^{-1/2}\int_0^{\infty} (t+1)^{-3}\,dt \ll x^{-1/2}.$$
We next calculate the loop integral $I_i$ about $i$. The details are close to those of Lemma 13.10, so we will be brief. In the disc of radius $1/2$ about $i$, we have
$$\frac{1}{\sqrt{s^{2}+1}} = \frac{1}{\sqrt{s-i}}\,\frac{1}{\sqrt{s+i}} = \frac{1}{\sqrt{s-i}}\,\frac{1}{\sqrt{2i}}\,\{1 + O(|s-i|)\}.$$
The bottom of the loop is the line segment $s = i + re^{-\pi i}$ with $r$ going from $1/2$ to (a small number) $\epsilon$, then a circle about $i$ of radius $\epsilon$, and finally a line segment $s = i + re^{\pi i}$ along the top segment with $r$ going from $\epsilon$ to $1/2$. The integrand has order $\epsilon^{-1/2}$ on the small circle and the arc length is $2\pi\epsilon$, so this portion of the integral goes to 0 with $\epsilon$. Combining the contributions from the two horizontal line segments, we find
$$I_i = \frac{x^{i}}{\pi\sqrt{2i}}\int_0^{1/2} e^{-r\log x}\{r^{1/2} + O(r^{3/2})\}\,\frac{dr}{r} \ll \int_0^{\infty} e^{-r\log x}\,r^{1/2}\,\frac{dr}{r} = \frac{\Gamma(1/2)}{\sqrt{\log x}},$$
with $\Gamma$ the Euler gamma function. A similar bound holds for the integral taken about $-i$. Thus $m(x) = o(1)$.
14.8. Notes

§14.1. The word “equivalent” deserves a word of explanation because it is not used in a mathematical sense. Shortly after the classical PNT was proved, several other asymptotic formulas were established that were deducible from one another by simple real variable arguments ([BD04], [La09], [Na00]). These propositions were considered equivalent with respect to this property. While this notion was reasonable then, its logical basis completely disappeared with the discovery of elementary proofs of the PNT. Still, we continue to use this word to describe a convenient grouping of propositions.

§14.2. We have shown that no implications exist between some formulas in Figure 2 that are not connected with arrows, unless we make additional hypotheses that are at least as strong as those used in proving the PNT.

§14.3. Theorem 14.3 and the optimality example of §14.4 were established by the authors in [DZ12]. Assuming that the zeta function of $\mathcal{N}$ has right hand residue $A$ at $s = 1$ and the sharp Mertens relation (14.5) holds, then by the second part of Theorem 7.3, we have $N(x) \sim Ax$. Also, assuming (14.5), Proposition 14.2 implies that
$$\int_1^{\infty} \{\psi(x) - x\}\,x^{-2}\,dx$$
converges. Assuming (14.5), can we conclude that
$$\int_1^{\infty} \{N(x) - Ax\}\,x^{-2}\,dx$$
converges? We do not know the answer to this question, but if one assumes a rather stronger hypothesis than (14.5), the answer is Yes by Theorem 7.13.

§14.5. Another hint that the condition $m(x) = o(1)$ is stronger than $M(x) = o(x)$ is provided by the following evaluations of the Riemann zeta function for $\sigma \to 1+$, each made using one of these conditions:
$$\frac{1}{\zeta(\sigma)} = \int_{1-}^{\infty} x^{-\sigma}\,dM(x) = \sigma\int_1^{\infty} x^{-\sigma}\,\frac{M(x)}{x}\,dx = o\Big(\frac{1}{\sigma-1}\Big),$$
$$\frac{1}{\zeta(\sigma)} = \int_{1-}^{\infty} x^{-(\sigma-1)}\,dm(x) = (\sigma-1)\int_1^{\infty} x^{-\sigma}\,m(x)\,dx = o(1).$$

§14.7. We could have deduced Proposition 14.12 from Proposition 14.15.
https://doi.org/10.1090//surv/213/15
CHAPTER 15
Kahane’s PNT

What lies beyond optimal?

Summary. Following Kahane, the PNT is established under an $L^2$ condition on the error term of $N(x)$ that is weaker than Beurling’s condition. The argument uses the Poisson summation formula and Schwartz functions.

15.1. Introduction

Beurling’s g-number problem is to find weak conditions on the integer counting function $N = N_{\mathcal{N}}$ of a g-integer system $\mathcal{N}$ under which the generalized version of the PNT holds, i.e. $\pi(x) \sim x/\log x$ as $x \to \infty$. For $\mathcal{N}$ with a positive density $A$, we write throughout this chapter
$$N(x) = N_{\mathcal{N}}(x) = Ax + xE(x) \tag{15.1}$$
with $E(x) \to 0$ as $x \to \infty$ (see Notes for a comment on this condition). As we showed in Chapter 13, the hypothesis
$$E(x) = O(\log^{-\gamma} x) \quad\text{with } \gamma > 3/2 \tag{15.2}$$
yields the PNT, and this result is optimal in the class of error terms of this type. P. T. Bateman and the first author conjectured that the weaker $L^2$-condition
$$\int_1^{\infty} (E(x)\log x)^{2}\,x^{-1}\,dx < \infty \tag{15.3}$$
sufficed to establish the PNT [BD69]. Late in the twentieth century, this conjecture was established by J.-P. Kahane by an ingenious argument in a series of papers [Ka96] – [Ka98]. Here we give a proof of this theorem, based on Kahane’s work, but presented in a classical framework with full details.

Recall that the key step in Beurling’s proof of the PNT was to establish the nonvanishing of his zeta function on the line $\sigma = 1$; this is the main task here as well. We show

Theorem 15.1. If the counting function of a g-integer system $\mathcal{N}$ satisfies (15.3), then the zeta function of $\mathcal{N}$ has no zeros on the vertical line $\Re s = 1$.

Then we apply the Wiener-Ikehara theorem, in essentially the same way as in Theorem 13.1, to establish

Theorem 15.2. If $N_{\mathcal{N}}(x)$ satisfies (15.3), then the PNT holds for $\mathcal{N}$.

As a warm-up and for use in deriving the Poisson summation formula, we show that (15.3) insures that a g-number system satisfies Chebyshev bounds:

Lemma 15.3. Assume that (15.3) holds. Then $x/\log x \ll \pi(x) \ll x/\log x$.
Proof. The Cauchy-Schwarz inequality implies ∞ 1/2 ∞ −1 2 −1 x |E(x)| dx ≤ (E(x) log x) x dx 2
and
2
x 1
∞ 2
1 dx x log2 x
x 1/2 2 −1 E(u) log u du ≤ (E(u) log u) u du 1
x
1/2 0. If such a pair of zeros exists, then, as σ → 1+, log |ζ(σ + ia)| + log |ζ(σ − ia)| + log ζ(σ) = log o(1) → −∞. In §15.4, this estimate is captured by introducing a Schwartz function and applying the Poisson summation formula; the result is given in (15.22). (This is the device that replaces the Fej´er trigonometric polynomial in Beurling’s argument.) Finally, with some delicate Fourier analysis, the left hand side of (15.22) is shown to have a finite lower bound uniformly for all σ > 1, in contradiction to (15.22). We conclude that it is impossible for ζ(s) to have any zeros on s = 1. Condition (15.3) can be restated as ∞ (15.4) c := {E(eu )u}2 du < ∞. 0
We shall henceforth assume that
(15.5)  $\int_U^\infty \{E(e^u)u\}^2\,du > 0$
for every U > 0. Otherwise, E(x) = 0 for all $x \ge e^U$ and hence Beurling’s condition (15.2) would be satisfied. In this case Theorems 13.6 and 13.1 would hold, establishing Theorems 15.1 and 15.2 at once.

With s := σ + it, as usual, write the function g of (11.7) as a Fourier transform
$g(s) := \frac{1}{A}\int_1^\infty x^{-s}E(x)\,dx = \frac{1}{A}\int_0^\infty e^{-itu}e^{-(\sigma-1)u}E(e^u)\,du.$
Then (with $g'(\sigma + it)$ as a shorthand for $\{(1/i)\,d/dt\}\,g(\sigma + it)$) we have
$g'(\sigma + it) = \frac{-1}{A}\int_0^\infty e^{-itu}e^{-(\sigma-1)u}E(e^u)u\,du,$
and, by Plancherel’s identity for Fourier transforms, we have uniformly for σ > 1,
(15.6)  $\int_{-\infty}^\infty |g'(\sigma + it)|^2\,dt \le \frac{1}{2\pi A^2}\int_0^\infty \{E(e^u)u\}^2\,du = \frac{c}{2\pi A^2}.$
Now we examine the Lipschitz behavior of g(s) as $\Re s \to 1+$. By (11.7),
(15.7)  $\frac{\zeta(s)}{As} = \frac{1}{s-1} + g(s),$
and near a hypothetical zeta zero, g(s) is close to −1/(s − 1).
15.2. ZEROS OF THE ZETA FUNCTION
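Lemma 15.4. Assume that (15.3) holds. Then there exists a positive constant η ≤ 1 such that, whenever $1 \le \sigma' \le \sigma$ and $|\sigma - \sigma' + ih| \le \eta$, we have
(15.8)  $|g(\sigma' + it) - g(\sigma + i\{t + h\})| \le 4c^{1/4}\,|\sigma - \sigma' + ih|^{1/2}\,\Big\{\int_K^\infty \{E(e^u)u\}^2\,du\Big\}^{1/4},$
where K satisfies
(15.9)  $K\,\Big\{\int_K^\infty \{E(e^u)u\}^2\,du\Big\}^{-1/2} = \frac{1}{c^{1/2}\,|\sigma - \sigma' + ih|}.$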
Moreover, K → ∞ as $|\sigma - \sigma' + ih| \to 0$. In any case, the left hand side of (15.8) is at most
$c_1 := \frac{2}{A}\int_0^\infty |E(e^u)|\,du.$

Remark 15.5. The right hand side of (15.8) depends only on the distance between the two points, not on the point σ + it itself. This is essential for the proof of the lower bound (15.14) below.

Proof. We first note that, by (15.5), the left side of (15.9) is a continuous and increasing function of K; thus K satisfying (15.9) is well-defined. Now we write
(15.10)  $g(\sigma' + it) - g(\sigma + i\{t + h\}) = A^{-1}\int_0^\infty e^{-itu}e^{-(\sigma'-1)u}\big(1 - e^{-(\sigma - \sigma' + ih)u}\big)E(e^u)\,du =: A^{-1}(I_1 + I_2),$
say, where $I_1$ and $I_2$ are the partial integrals over the intervals [0, K] and [K, ∞) respectively and K satisfies condition (15.9). Let $\Delta := \sigma - \sigma' + ih$, and note that, as Δ → 0, we have K → ∞ and K|Δ| = o(1). Thus there exists a positive number η ≤ 1 such that K|Δ| ≤ 1 for |Δ| ≤ η. For 0 < u < K we then have
$|1 - e^{-u\Delta}| \le |u\Delta|\Big(1 + \frac{1}{2!} + \frac{1}{3!} + \cdots\Big) < 2|u\Delta|.$
Insert this estimate into $I_1$ and apply the Cauchy-Schwarz inequality to get
$|I_1| \le 2\int_0^K |E(e^u)|\,|\Delta|\,u\,du \le 2|\Delta|\,c^{1/2}K^{1/2}.$
Also,
$|I_2| \le 2\int_K^\infty |E(e^u)|\,du \le 2\Big\{\int_K^\infty \{E(e^u)u\}^2\,du\Big\}^{1/2} K^{-1/2}.$
Note that
$|\sigma - \sigma' + ih|\,c^{1/2}K^{1/2} = \Big\{\int_K^\infty \{E(e^u)u\}^2\,du\Big\}^{1/2} K^{-1/2}$
for K satisfying (15.9). Therefore we arrive at (15.8). The final claim of the lemma is immediate.
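The elementary estimate $|1 - e^{-z}| < 2|z|$ for $0 < |z| \le 1$, used above with $z = u\Delta$, can be checked numerically. The following is only an illustrative sketch: the function names and the polar grid are assumptions made for this example, and a grid scan is of course not a proof. The series bound gives the sharper constant $e - 1$, which the scan should reproduce at $z = -1$.

```python
import cmath
import math


def ratio(z: complex) -> float:
    """Return |1 - exp(-z)| / |z| for a nonzero complex number z."""
    return abs(1 - cmath.exp(-z)) / abs(z)


def worst_ratio(samples: int = 200) -> float:
    """Scan a polar grid on the punctured unit disc 0 < |z| <= 1
    and return the largest value of ratio(z) that was found."""
    worst = 0.0
    for i in range(1, samples + 1):
        r = i / samples  # radius in (0, 1]
        for j in range(samples):
            theta = 2 * math.pi * j / samples
            worst = max(worst, ratio(cmath.rect(r, theta)))
    return worst


if __name__ == "__main__":
    # The series bound |z|(1 + 1/2! + 1/3! + ...) = (e - 1)|z| suggests e - 1 here.
    print(worst_ratio(), math.e - 1)
```

The grid includes the point $z = -1$, where the ratio equals $e - 1 < 2$, so the constant 2 in the proof leaves some room.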
We use the last lemma to show the consequences of ζ(s) having a zero on σ = 1.
Lemma 15.6. Assume (15.3) and let a > 0.
(i) If ζ(1 + ia) = 0, then $\zeta(\sigma + ia) = o((\sigma - 1)^{1/2})$ as σ → 1+,
(ii) ζ(s) has at most one zero on the half line 1 + it, t > 0,
(iii) If ζ(s) has zeros 1 ± ia, then, as ε → 0+,
(15.11)  $\int_0^\infty e^{-(1+\varepsilon)v}\,d\pi(e^v) + \int_0^\infty e^{-(1+\varepsilon)v}e^{iav}\,d\pi(e^v) + \int_0^\infty e^{-(1+\varepsilon)v}e^{-iav}\,d\pi(e^v) \to -\infty.$
Proof. To show (i), apply the last lemma with σ′ = 1, t = a, and h = 0. We have K → ∞ as σ → 1+ and
$\int_K^\infty \{E(e^u)u\}^2\,du \to 0.$
Then, by (15.7) and (15.8),
$|\zeta(\sigma + ia)| = |\zeta(\sigma + ia) - \zeta(1 + ia)| \le \frac{(\sigma - 1)A}{|ia(\sigma - 1 + ia)|} + A(\sigma - 1)|g(1 + ia)| + A|\sigma + ia|\,|g(1 + ia) - g(\sigma + ia)| = o(\{\sigma - 1\}^{1/2}).$

To show (ii), suppose on the contrary that ζ(s) has two zeros $1 + it_1$ and $1 + it_2$ with $0 < t_1 < t_2$. Let
$k(x) := 1 + \cos xt_1 + \cos xt_2 + (1/2)\{\cos x(t_2 + t_1) + \cos x(t_2 - t_1)\}.$
Note that $k(x) = (1 + \cos xt_1)(1 + \cos xt_2) \ge 0$. Hence
(15.12)  $\int_1^\infty x^{-\sigma}k(\log x)\,d\Pi(x) \ge 0$  for σ > 1.
On the other hand, the left hand side of (15.12) equals
$\log\zeta(\sigma) + \log|\zeta(\sigma + it_1)| + \log|\zeta(\sigma + it_2)| + (1/2)\log|\zeta(\sigma + i\{t_2 + t_1\})| + (1/2)\log|\zeta(\sigma + i\{t_2 - t_1\})|.$
By (15.7),
$\log\zeta(\sigma) = \log\frac{1}{\sigma - 1} + O(1)$
and, by (i),
$\log|\zeta(\sigma + it_j)| = (1/2)\log(\sigma - 1) + \log o(1)$,  j = 1, 2,
as σ → 1+. Also, $\zeta(\sigma + i(t_2 \pm t_1)) \ll 1$ because g(s) is uniformly bounded on the closed half-plane σ ≥ 1. Therefore the left hand side of (15.12) is
$\le \log\frac{1}{\sigma - 1} + \log(\sigma - 1) + \log o(1) + O(1) \to -\infty$
as σ → 1+. This contradicts (15.12).
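The nonnegativity of the kernel k in part (ii) rests on the product-to-sum identity for cosines. As a hedged numerical illustration (the function names and sample values of $t_1$, $t_2$ below are chosen for this sketch only), one can compare the cosine sum with the product $(1 + \cos xt_1)(1 + \cos xt_2)$ on a grid:

```python
import math


def k_sum(x: float, t1: float, t2: float) -> float:
    """Cosine-sum form of the kernel k(x) from the proof of part (ii)."""
    return (1 + math.cos(x * t1) + math.cos(x * t2)
            + 0.5 * (math.cos(x * (t2 + t1)) + math.cos(x * (t2 - t1))))


def k_product(x: float, t1: float, t2: float) -> float:
    """Product form (1 + cos x t1)(1 + cos x t2), visibly nonnegative."""
    return (1 + math.cos(x * t1)) * (1 + math.cos(x * t2))


def max_discrepancy(t1: float, t2: float, points: int = 2000) -> float:
    """Largest gap between the two forms on a grid of x in [0, 50]."""
    return max(abs(k_sum(50 * i / points, t1, t2) - k_product(50 * i / points, t1, t2))
               for i in range(points + 1))


if __name__ == "__main__":
    print(max_discrepancy(1.3, 2.9))
```

The two forms agree up to rounding error, which is why integrating k against the nonnegative measure dΠ gives (15.12).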
To show (iii), assume that ζ(s) has zeros 1 ± ia. (If ζ(1 + ia) = 0, the reflection principle implies that ζ(1 − ia) = 0.) We write
$\zeta(s) = \exp\Big\{\int_1^\infty x^{-s}\,d\pi(x)\Big\}\exp\{Q(s)\},$
where
$Q(s) := \int_1^\infty \{\log(1 - x^{-s})^{-1} - x^{-s}\}\,d\pi(x)$
is a holomorphic function on {s : σ > 1/2}. Note that Q(s) and Q′(s) are bounded for $\sigma \ge \tfrac12 + \delta$ for each fixed δ > 0. Thus we have for ε > 0,
(15.13)  $\int_0^\infty e^{-itv}e^{-(1+\varepsilon)v}\,d\pi(e^v) = \log\zeta(1 + \varepsilon + it) - Q(1 + \varepsilon + it).$
Now, by (i), the left hand side of (15.11) equals
$\{\log\zeta(1 + \varepsilon) - Q(1 + \varepsilon)\} + \{\log\zeta(1 + \varepsilon - ia) - Q(1 + \varepsilon - ia)\} + \{\log\zeta(1 + \varepsilon + ia) - Q(1 + \varepsilon + ia)\} = \log o(1) + O(1) \to -\infty$
as ε → 0+.
15.3. A lower bound for |ζ(σ + it)|
The assumption that ζ(s) has zeros 1 ± ia implies, besides (15.11), that |ζ(s)| satisfies a lower bound on the half plane σ > 1.

Lemma 15.7. Assume that (15.3) holds and that ζ(s) has zeros s = 1 ± ia with a > 0. Then there exists a positive constant B such that
(15.14)  $|\zeta(1 + \varepsilon + it)| > B|t|^{-8}$
uniformly for |t| ≥ 3a/2 and 0 < ε ≤ 1.

Proof. We apply the same idea used in proving (ii) of the last lemma. Let
$k(x) := \Big(1 + \frac43\cos ax + \frac13\cos 2ax\Big)(1 + \cos tx).$
Since k(x) ≥ 0, we have
(15.15)  $\int_1^\infty x^{-\sigma}k(\log x)\,d\Pi(x) \ge 0$,  σ > 1.
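On the other hand,
(15.16)  $\int_1^\infty x^{-\sigma}k(\log x)\,d\Pi(x) = \log\zeta(\sigma) + \frac43\log|\zeta(\sigma + ia)| + \frac13\log|\zeta(\sigma + i2a)| + \log|\zeta(\sigma + it)|$
$\qquad + \frac23\big(\log|\zeta(\sigma + i\{a + t\})| + \log|\zeta(\sigma + i\{a - t\})|\big) + \frac16\big(\log|\zeta(\sigma + i\{2a + t\})| + \log|\zeta(\sigma + i\{2a - t\})|\big).$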
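The coefficients 4/3, 1/3, 2/3 and 1/6 in (15.16) come from expanding the product kernel into a cosine sum, and the first factor is nonnegative because it equals $(2/3)(1 + \cos ax)^2$. The sketch below checks both facts numerically; the helper names and the sample parameters a, t are assumptions made for illustration.

```python
import math


def kernel_product(x: float, a: float, t: float) -> float:
    """k(x) = (1 + (4/3) cos ax + (1/3) cos 2ax)(1 + cos tx)."""
    return (1 + 4 / 3 * math.cos(a * x) + 1 / 3 * math.cos(2 * a * x)) * (1 + math.cos(t * x))


def kernel_expanded(x: float, a: float, t: float) -> float:
    """The same kernel as a cosine sum, with the weights appearing in (15.16)."""
    return (1
            + 4 / 3 * math.cos(a * x)
            + 1 / 3 * math.cos(2 * a * x)
            + math.cos(t * x)
            + 2 / 3 * (math.cos((a + t) * x) + math.cos((a - t) * x))
            + 1 / 6 * (math.cos((2 * a + t) * x) + math.cos((2 * a - t) * x)))


def first_factor_gap(theta: float) -> float:
    """Difference between 1 + (4/3)cos(theta) + (1/3)cos(2 theta) and (2/3)(1 + cos(theta))^2."""
    lhs = 1 + 4 / 3 * math.cos(theta) + 1 / 3 * math.cos(2 * theta)
    return lhs - 2 / 3 * (1 + math.cos(theta)) ** 2


if __name__ == "__main__":
    a, t = 1.7, 5.3
    print(max(abs(kernel_product(0.01 * i, a, t) - kernel_expanded(0.01 * i, a, t))
              for i in range(5000)))
```

Each cosine $\cos(bx)$ in the expanded form contributes $\log|\zeta(\sigma + ib)|$ after integration against $x^{-\sigma}\,d\Pi(x)$, which is how the weights in (15.16) arise.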
By (15.7), we have
$\log\zeta(\sigma) = -\log(\sigma - 1) + O(1)$,  $\log|\zeta(\sigma + i2a)| = O(1)$,  $\log|\zeta(\sigma + i(ma \pm t))| \le \log|t| + O(1)$,  m = 1, 2,
for sufficiently large |t| and 1 < σ ≤ 2. Also, by a weak form of (15.11),
$\log|\zeta(\sigma + ia)| \le \frac12\log(\sigma - 1) + O(1).$
Therefore, by (15.15) and (15.16), we arrive at
$-\frac13\log(\sigma - 1) + \log|\zeta(\sigma + it)| + \frac53\log|t| + O(1) \ge 0$
and hence
$\log|\zeta(\sigma + it)| \ge \frac13\log(\sigma - 1) - \frac53\log|t| + O(1) \ge \frac13\log(\sigma - 1) - \frac{11}{6}\log|t|$
for sufficiently large |t| and 1 < σ ≤ 2. It follows, for $\sigma \ge 1 + |t|^{-\alpha}$, that
$\log|\zeta(\sigma + it)| \ge -\Big(\frac{\alpha}{3} + \frac{11}{6}\Big)\log|t|,$
i.e. $|\zeta(\sigma + it)| \ge |t|^{-(\alpha/3 + 11/6)}$. On the other hand, for $1 < \sigma < 1 + |t|^{-\alpha}$ with α > 2, by (15.7) and (15.8),
$|\zeta(\sigma + it) - \zeta(1 + |t|^{-\alpha} + it)| \le A|t|^{-\alpha-2} + O(|t|^{-\alpha}) + 2A|t|\,\big|g(\sigma + it) - g(1 + |t|^{-\alpha} + it)\big|$
$\qquad \le O(|t|^{-\alpha}) + 8c^{1/4}|t|\,(1 + |t|^{-\alpha} - \sigma)^{1/2}\,o(1) = o(|t|^{-\alpha/2+1})$
as |t| → ∞. Hence, by taking α = 18 (so that α/3 + 11/6 = 47/6 < 8 and α/2 − 1 = 8), we find
$|\zeta(\sigma + it)| \ge |t|^{-(\alpha/3 + 11/6)} + o\Big(\frac{1}{|t|^{\alpha/2 - 1}}\Big) \ge |t|^{-8}$
uniformly for $|t| \ge t_0 > 3a/2$, 1 < σ ≤ 2. Finally, for $3a/2 \le |t| \le t_0$, 1 < σ ≤ 2, we have $\zeta(\sigma + it) \ne 0$. It follows that $|\zeta(\sigma + it)| > B|t|^{-8}$ with some B > 0, uniformly for |t| ≥ 3a/2, 1 < σ ≤ 2.

15.4. A Schwartz function and Poisson summation

Next, by introducing a Schwartz function, we reformulate the hypothetical inequality (15.11) in terms of the Poisson summation formula (see (15.21) and (15.22)). As we noted above, this device replaces the Fejér trigonometric polynomials used in Beurling’s PNT proof.

For the remainder of this chapter, let φ(u) denote a real-valued even Schwartz function on R, i.e. one with the properties that it has derivatives of all orders and for all nonnegative integers k and ℓ, the k-th derivative satisfies
(15.17)  $\phi^{(k)}(u) = O_{k,\ell}(|u|^{-\ell})$,  k = 0, 1, 2, . . . , as |u| → ∞.
Informally, a Schwartz function is one that, along with all derivatives, vanishes rapidly at ±∞. We first show that the Fourier transform preserves the Schwartz property. For the rest of the chapter, ‘$\widehat{X}$’ denotes the Fourier transform of X.
Lemma 15.8. Let φ(u) denote a real-valued even Schwartz function on R. Its Fourier transform
$\widehat\phi(t) := \int_{-\infty}^\infty e^{-itu}\phi(u)\,du$
is real-valued, even, and has the following properties:
(i) $\widehat\phi \in C^\infty(\mathbb{R})$,
(ii) (15.17) holds for $(\widehat\phi)^{(k)}$ for all nonnegative integers k and ℓ, and
(iii) it satisfies the inversion formula
$\phi(u) = \frac{1}{2\pi}\int_{-\infty}^\infty e^{-itu}\,\widehat\phi(t)\,dt.$

Sketch of Proof. The complex conjugate of $\widehat\phi(t)$ is
$\overline{\widehat\phi(t)} = \int_{-\infty}^\infty e^{itu}\phi(u)\,du = \int_{-\infty}^\infty e^{-itv}\phi(-v)\,dv = \int_{-\infty}^\infty e^{-itv}\phi(v)\,dv = \widehat\phi(t);$
thus $\widehat\phi$ is real-valued. A similar argument shows $\widehat\phi$ is even. Differentiation of the integral for $\widehat\phi$ is valid, since the differentiated integral is in $L^1$. The inversion formula we quote from [Kn68], Ch. 6. (Note that the inversion formula is valid with the negative sign in the exponential because φ is even.) The remaining relations are easy to establish using calculus.

We next introduce a function whose Fourier transform is essentially the product of log ζ(1 + ε + iat) and a Schwartz function.

Lemma 15.9. For ε > 0, the function
(15.18)  $H_\varepsilon(u) := \int_0^\infty \widehat\phi(u - v)\,e^{-(1+\varepsilon)v}\,d\pi(e^v)$
is real-valued and satisfies
(15.19)  $\phi(at)\int_{-\infty}^\infty e^{-iatu}e^{-(1+\varepsilon)u}\,d\pi(e^u) = \frac{1}{2\pi}\int_{-\infty}^\infty e^{-iatu}H_\varepsilon(u)\,du.$

Proof. We note that, by (ii) of Lemma 15.8,
$\int_{-\infty}^\infty\int_0^\infty |\widehat\phi(u - v)|\,e^{-(1+\varepsilon)v}\,d\pi(e^v)\,du = \int_0^\infty e^{-(1+\varepsilon)v}\int_{-\infty}^\infty |\widehat\phi(u - v)|\,du\,d\pi(e^v) = \int_0^\infty e^{-(1+\varepsilon)v}\,d\pi(e^v)\int_{-\infty}^\infty |\widehat\phi(w)|\,dw < \infty.$
Hence we can exchange the order of the iterated integration on the right hand side of (15.19). Then the equality (15.19) follows by a simple manipulation.

The function $H_\varepsilon$ has the following elementary properties.

Lemma 15.10. Assume 0 < ε ≤ 1 and $\pi(x) \ll x/\log x$. Then we have
(i) $H_\varepsilon \in C^\infty(\mathbb{R})$;
(ii) For nonnegative integers k and ℓ,
$H_\varepsilon^{(\ell)}(u) = O_{k,\ell}\big(u^{-k} + e^{-\varepsilon u/2}u^{-1}\big)$ for u ≥ 2,
$H_\varepsilon^{(\ell)}(u) = O_\ell(1)$ for |u| < 2,
$H_\varepsilon^{(\ell)}(u) = O_{k,\ell}(|u|^{-k})$ for u ≤ −2,
where the O-constants are independent of ε;
(iii) For nonnegative integers ℓ and m, $|u|^m H_\varepsilon^{(\ell)}(u) \in L^1(\mathbb{R})$.

Proof. The lemma is proved by calculus. Given u, take K > |u| and suppose first v ≥ K + 1. Then, by (ii) of Lemma 15.8 we have for fixed k ≥ 2
$(\widehat\phi)'(u - v)\,e^{-(1+\varepsilon)v} \ll \frac{e^{-(1+\varepsilon)v}}{(v - K)^k},$
and the last function is integrable on [K + 1, ∞) with respect to $d\pi(e^v)$. Also, for |u| < K and 0 ≤ v ≤ K + 1, the left hand side of the last display is bounded. It follows that
$H_\varepsilon'(u) = \int_0^\infty (\widehat\phi)'(u - v)\,e^{-(1+\varepsilon)v}\,d\pi(e^v)$
and $H_\varepsilon'(u) \in C(\mathbb{R})$, since the differentiated integrand is absolutely integrable. Similarly, we can show that
$H_\varepsilon^{(\ell)}(u) = \int_0^\infty (\widehat\phi)^{(\ell)}(u - v)\,e^{-(1+\varepsilon)v}\,d\pi(e^v)$
and $H_\varepsilon^{(\ell)}(u) \in C(\mathbb{R})$. This proves (i).

To prove (ii), it suffices to show that the bounds asserted there are in fact satisfied by
(15.20)  $\overline{H}_\varepsilon^{(\ell)}(u) := \int_0^\infty |(\widehat\phi)^{(\ell)}(u - v)|\,e^{-(1+\varepsilon)v}\,d\pi(e^v).$
For |u| < 2, the bound on $(\widehat\phi)^{(\ell)}(u - v)$ given by (ii) of Lemma 15.8 ensures that
$\overline{H}_\varepsilon^{(\ell)}(u) = O_\ell(1).$
()
Next, suppose that u > 2, and split the integral for H (u) into three partial integrals I1 , I2 and I3 extending over intervals (0, u−1], (u−1, u+1], and (u+1, ∞) respectively. We have, by (ii) of Lemma 15.8, integration by parts, and application of the Chebyshev upper bound, u−1 −(1+)v e dπ(ev ) |I1 | (u − v)k+1 0 u−1 1 k + 1 −(1+)v = e−(1+)(u−1) π(eu−1 ) + π(ev ) − dv e k+1 (u − v) (u − v)k+2 0 u−1 −v e−u log u e dv e−u/2 + + k+1 k k+1 u v(u − v) u u 1 with k ≥ 2, since the last integral is bounded by u−1 −v u/2 dv e dv + k+1 k+1 v(u − v) v(u − v) 1 u/2 u/2 1 dv dv e−u/2 u−1 + k k+1 . u v u (u − v)k+1 1 u/2
An analogous but simpler calculation gives ∞ ∞ −(1+)v e dπ(ev ) dv −u |I3 | e e−u u−1 . k+1 k+1 (v − u) v(v − u) u+1 u+1 Also, since ( φ )() is bounded for each , u+1 |I2 | e−(1+)v dπ(ev ) ≤ e−(u−1)(1+) π(eu+1 ) e−u u−1 . u−1 ()
This establishes the bound for H (u) for u ≥ 2. ()
We handle u < −2 similarly: by (ii) of Lemma 15.8, the integral for H (u) is ∞ −(1+)v e dπ(ev ) k, (v + |u|)k+1 0 ∞ 1+ k+1 = π(ev ) + e−(1+)v dv k+1 k+2 (v + |u|) (v + |u|) 0 ∞ dv 1 1 + . k+1 k |u|k+1 (v + |u|) |u| 1 This finishes the proof of (ii). Note that all the O-constants are independent of . () Finally, we turn to (iii). To show that |u|m H (u) ∈ L1 (R) for any nonnegative () integers , m and > 0, we use the estimates of |H (u)| from (ii). We have −∞ −2 2 |u|m |H() (u)| du |u|m−k du + |u|m du −∞ −∞ −2 ∞ um u−k + e−u/2 u−1 du < ∞, + 2 ()
provided that k is taken suitably large in (ii). Therefore |u|m H (u) ∈ L1 (R).
To investigate the connection between (15.11) and (15.14), we fix a Schwartz function φ(t) satisfying the additional conditions (i) φ(t) = 1 on [−3a/2, 3a/2] and (ii) φ has support contained in (−2a, 2a). For 0 < ε ≤ 1, let
$M_\varepsilon(t) := \phi(at)\int_0^\infty e^{-iatv}e^{-(1+\varepsilon)v}\,d\pi(e^v).$
By Lemma 15.9, $M_\varepsilon(t)$ is the Fourier transform of $H_\varepsilon(u/a)/(2\pi a)$. Then, by (iii) of Lemma 15.10, $|H_\varepsilon(u/a)| + |(H_\varepsilon(u/a))'| \in L^1(\mathbb{R})$. Hence we have by the Poisson summation formula ([BD04], Appendix A3; [MV07], Appendix D2)
(15.21)  $\sum_{m\in\mathbb{Z}} M_\varepsilon(m) = \sum_{m\in\mathbb{Z}} H_\varepsilon(2\pi m/a)/a.$
We note that $M_\varepsilon(m) = 0$ for |m| ≥ 2, while φ(0) = φ(±a) = 1. Hence the left hand sides of (15.21) and (15.11) are equal, and we have another description of the contribution of log ζ(s) near the pole and purported zeros of zeta.

Lemma 15.11. Assume that (15.3) is satisfied and that ζ(s) has zeros 1 ± ia. Let $H_\varepsilon^-$ denote the negative part of $H_\varepsilon$. Then
(15.22)  $\sum_{n\ge1} H_\varepsilon^-(2\pi n/a) \to -\infty$ as ε → 0+.
Proof. By (15.11), the left hand side of (15.21) tends to −∞ as ε → 0+. Then, by (ii) of Lemma 15.10, the right hand side of (15.21) is at least
$\sum_{n\ge1} H_\varepsilon^-(2\pi n/a) + O(1).$
Hence (15.22) follows.

15.5. Estimating the sum of a series by an improper integral

For further analysis of (15.22), let $h_\varepsilon(u) := uH_\varepsilon(u/a)/(2\pi a)$. By (15.22),
(15.23)  $\sum_{n\ge1} \frac{h_\varepsilon^-(2\pi n)}{n} \to -\infty$ as ε → 0+,
provided that (15.3) is satisfied and ζ(s) has zeros at 1 ± ia. We now establish some analytic properties of $h_\varepsilon$.

Lemma 15.12. Assume that 0 < ε ≤ 1 and $\pi(x) \ll x/\log x$. We have
(i) $h_\varepsilon(u) \in L^1(\mathbb{R}) \cap C^\infty(\mathbb{R})$;
(ii) $h_\varepsilon(u) \to 0$ as |u| → ∞;
(iii) There is a constant C such that $|h_\varepsilon(u)|, |h_\varepsilon'(u)| \le C$ hold uniformly for −∞ < u < ∞ and 0 < ε ≤ 1.

Proof. (i) follows directly from (i) and (iii) of Lemma 15.10. To show (ii), we use the bound (ii) of the same lemma: $h_\varepsilon(u) = O(e^{-\varepsilon|u|/2} + |u|^{-k})$. For (iii), note that the preceding bound applies as well for $h_\varepsilon'(u)$. It remains to deal with small u, say |u| ≤ 2. In this case, for each ℓ, we have the bounds
$|H_\varepsilon(u)|, |H_\varepsilon'(u)| \ll_\ell \int_0^\infty (1 + v)^{-\ell}e^{-v}\,d\pi(e^v),$
independently of ε ∈ (0, 1]. For ℓ = 2, say, the last integral converges to a finite number. Thus $h_\varepsilon(u)$ and $h_\varepsilon'(u)$ are bounded in this range too, so the claimed absolute bounds hold.

We next estimate the sum of a series involving $|h_\varepsilon^-|$ by a definite integral, using a delicate calculation. For this, introduce
$S_j = S_{j,\varepsilon} := \{n \in \mathbb{N} : h_\varepsilon^-(2\pi n) \in [-2^{-j}, -2^{-j-1})\}$,  j = 1, 2, . . .
and
$S_0 = S_{0,\varepsilon} := \{n \in \mathbb{N} : h_\varepsilon^-(2\pi n) \in [-C, -2^{-1})\}$
with C given in (iii) of Lemma 15.12. By (ii) of the same lemma, $S_j$ and $S_0$ are all finite sets. Also, we set
(15.24)  K = C + 3
and define intervals
$I_n(\alpha) = I_{j,n}(\alpha) := \big(2n\pi - \alpha 2^{-j-1}/K,\ 2n\pi + \alpha 2^{-j-1}/K\big)$
for $n \in S_j$, j = 0, 1, 2, . . . , and α ∈ (0, 1].
Let τ(u) be a real-valued Schwartz function with support [−1, 1] such that τ(u) is even and nonnegative with $\int \tau(u)\,du = 1$, and let $\tau_\delta(u) = \frac{1}{\delta}\tau(u/\delta)$ with δ > 0.

Lemma 15.13. Assume that $\pi(x) \ll x/\log x$. Let $\delta = \delta_j := 2^{-j-3}/K$, where K is defined in (15.24), and let 0 < ε ≤ 1. Then
(15.25)  $\sum_{n\in S_j}\int_{I_n(3/4)} |{-h_\varepsilon^-} * \tau_\delta(u)|^2\,du \ge \frac{2^{-6}}{K^4}\sum_{n\in S_j} |h_\varepsilon^-(2n\pi)|^3$
for j = 0, 1, 2, . . . .

Proof. Recalling that $|h_\varepsilon'(u)| < K$, by Lemma 15.12 (iii), we see first that
(15.26)  $h_\varepsilon^-(u) = h_\varepsilon(u) < -2^{-j-1} + K|u - 2n\pi| \le 0$
for $u \in I_n(1)$. Then, for $u \in I_n(1/2)$,
$-h_\varepsilon^-(u) > 2^{-j-1} - K\,\frac{2^{-j-1}}{2K} = 2^{-j-2}.$
Hence, for $u \in I_n(1/4)$,
$-h_\varepsilon^- * \tau_\delta(u) = \int_{|u-v|\le\delta} \tau_\delta(u - v)\,(-h_\varepsilon^-(v))\,dv \ge \int_{|u-v|\le\delta} \tau_\delta(u - v)\,2^{-j-2}\,dv = 2^{-j-2}\int_{|w|\le\delta} \tau_\delta(w)\,dw = 2^{-j-2}.$
It follows that
(15.27)  $\int_{I_n(3/4)} (-h_\varepsilon^- * \tau_\delta(u))^2\,du \ge 2^{-2j-4}\,\frac{2^{-j-2}}{K},$
where $2^{-j-2}/K$ is the length of $I_n(1/4) \subset I_n(3/4)$. For $n \in S_j$, j = 1, 2, . . . , $|h_\varepsilon^-(2n\pi)| \le 2^{-j}$, and for $n \in S_0$, $|h_\varepsilon^-(2n\pi)| \le C < K$. Hence (15.25) follows from (15.27).

We remark that $K^4$ is extravagant in (15.25) for j = 1, 2, . . . , for there the factor K suffices.

Recall from Lemma 15.9 that
$M_\varepsilon(t) = \phi(at)\int_0^\infty e^{-iatv}e^{-(1+\varepsilon)v}\,d\pi(e^v) = \int_{-\infty}^\infty e^{-itu}h_\varepsilon(u)u^{-1}\,du.$
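To remove the effects of the pole of ζ(s) at s = 1 and its supposed zeros at s = 1 ± ia, introduce
(15.28)  $m_\varepsilon(t) := (1 - \phi(at))\int_0^\infty e^{-iatv}e^{-(1+\varepsilon)v}\,d\pi(e^v) = \int_{-\infty}^\infty e^{-itu}\{e^{-(1+\varepsilon)u/a}\,d\pi(e^{u/a}) - h_\varepsilon(u)u^{-1}\,du\}.$
Then $m_\varepsilon(t) = 0$ for t ∈ [−3/2, 3/2]. Also, since $h_\varepsilon(u) \in L^1(\mathbb{R})$, we have
$m_\varepsilon'(t) = -i\int_{-\infty}^\infty e^{-itu}\{u\,e^{-(1+\varepsilon)u/a}\,d\pi(e^{u/a}) - h_\varepsilon(u)\,du\}.$
Therefore
(15.29)  $i\,m_\varepsilon'(t)\,\widehat\tau_\delta(t) = \int_{-\infty}^\infty e^{-itu}\{f_\varepsilon(u) - h_\varepsilon * \tau_\delta(u)\}\,du,$
where
$f_\varepsilon(u) := \int_0^\infty \tau_\delta(u - v)\,v\,e^{-(1+\varepsilon)v/a}\,d\pi(e^{v/a}).$
Note that $f_\varepsilon(u) \ge 0$. We now come to a key inequality for showing the uniform boundedness of $\sum_{n\ge1} h_\varepsilon^-(2n\pi)/n$.

Lemma 15.14. Under the assumptions of the last lemma,
(15.30)  $\frac{1}{2\pi}\int_{-\infty}^\infty |m_\varepsilon'(t)\,\widehat\tau_\delta(t)|^2\,dt \ge \frac{2^{-6}}{K^4}\sum_{n\in S_j} |h_\varepsilon^-(2n\pi)|^3$,  j = 0, 1, 2, . . . .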
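Proof. It suffices to show that
(15.31)  $\int_{-\infty}^\infty |m_\varepsilon'(t)\,\widehat\tau_\delta(t)|^2\,dt \ge 2\pi\sum_{n\in S_j}\int_{I_n(3/4)} |{-h_\varepsilon^-} * \tau_\delta(u)|^2\,du$
for j = 0, 1, 2, . . . ; then (15.30) follows from (15.25). We have
$\int_{-\infty}^\infty f_\varepsilon(u)\,du = \int_{-\infty}^\infty \tau_\delta(u)\,du \cdot \int_0^\infty v\,e^{-(1+\varepsilon)v/a}\,d\pi(e^{v/a}) < \infty.$
Since $f_\varepsilon \in L^1(\mathbb{R})$ and $f_\varepsilon$ is bounded, $f_\varepsilon \in L^2(\mathbb{R})$. Also, $h_\varepsilon(u) \in L^2(\mathbb{R})$ for the same reason. Then $h_\varepsilon * \tau_\delta(u) \in L^2(\mathbb{R})$ too. Hence, from (15.29) and the Plancherel identity for Fourier transforms, the left hand side of (15.31) equals
(15.32)  $2\pi\int_{-\infty}^\infty |f_\varepsilon(u) - h_\varepsilon * \tau_\delta(u)|^2\,du \ge 2\pi\sum_{n\in S_j}\int_{I_n(3/4)} |f_\varepsilon(u) - h_\varepsilon * \tau_\delta(u)|^2\,du,$
since the sets $\{I_n(3/4) : n \in S_j\}$ are pairwise disjoint. For $u \in I_n(3/4)$ and |v| ≤ δ, we have $u - v \in I_n(1)$ and so by (15.26),
$-h_\varepsilon^- * \tau_\delta(u) = -\int_{|v|\le\delta} \tau_\delta(v)\,h_\varepsilon^-(u - v)\,dv \ge 0.$
Also, $f_\varepsilon \ge 0$. Thus
(15.33)  $\int_{I_n(3/4)} |f_\varepsilon(u) - h_\varepsilon * \tau_\delta(u)|^2\,du \ge \int_{I_n(3/4)} |{-h_\varepsilon^-} * \tau_\delta(u)|^2\,du,$
and (15.31) follows from (15.32) and (15.33).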
15.6. Conclusion of the proof

In the remainder of the argument, we use the zeta lower bound (15.14) to establish uniform boundedness of the left side of (15.23) for 0 < ε ≤ 1; this is the sought-after contradiction.

Lemma 15.15. Assume (15.3) and let 0 < δ ≤ 1. Then, we have uniformly for 0 < ε ≤ 1,
(15.34)  $\int_{-\infty}^\infty |m_\varepsilon'(t)\,\widehat\tau_\delta(t)|^2\,dt \ll \delta^{-18}.$

Proof. The Fourier transform $\widehat\tau_\delta(t)$ of the Schwartz function $\tau_\delta$ is itself a Schwartz function, and so for any positive n it satisfies
$|\widehat\tau_\delta(t)| = |\widehat\tau(\delta t)| = O_n(\{\delta|t|\}^{-n})$  as |t| → ∞.
Also, recall that $m_\varepsilon'(t) = 0$ for |t| < 3/2, and $|\widehat\tau_\delta|$ and $|m_\varepsilon'|$ are even, because each of $\widehat\tau_\delta$ and $m_\varepsilon$ is the Fourier transform of a real function. Therefore
(15.35)  $\int_{-\infty}^\infty |m_\varepsilon'(t)\,\widehat\tau_\delta(t)|^2\,dt = 2\int_{3/2}^\infty |m_\varepsilon'(t)\,\widehat\tau_\delta(t)|^2\,dt \le \frac{T_n}{\delta^{2n}}\int_{3/2}^\infty t^{-2n}|m_\varepsilon'(t)|^2\,dt$
for some constant $T_n$. Differentiate the first relation in (15.28), obtaining
$m_\varepsilon'(t) = \int_0^\infty e^{-iatu}e^{-(1+\varepsilon)u}\{(-iau)(1 - \phi(at)) - a\phi'(at)\}\,d\pi(e^u).$
Also, recall that 1 − φ(at) and φ′(at) are each bounded. It follows that
(15.36)  $|m_\varepsilon'(t)|^2 \ll \Big|\int_0^\infty e^{-iatu}e^{-(1+\varepsilon)u}\,d\pi(e^u)\Big|^2 + \Big|\int_0^\infty e^{-iatu}e^{-(1+\varepsilon)u}u\,d\pi(e^u)\Big|^2.$
For |t| ≥ 3/2, by (15.13) and (15.7), the first term on the right hand side of the last inequality is bounded by
$2\big(|\log\zeta(1 + \varepsilon + iat)|^2 + |Q(1 + \varepsilon + iat)|^2\big) = O(\log^2|t|).$
Here we use the lower bound (15.14) to show the second term is at most
$2\Big(\Big|\frac{\zeta'(1 + \varepsilon + iat)}{\zeta(1 + \varepsilon + iat)}\Big|^2 + |Q'(1 + \varepsilon + iat)|^2\Big) \ll 1 + |t|^{16}|\zeta'(1 + \varepsilon + iat)|^2.$
Also, by (15.7), for |t| ≥ 3/2,
$|\zeta'(1 + \varepsilon + iat)|^2 \ll 1 + |t|^2|g'(1 + \varepsilon + iat)|^2.$
The last inequalities, together with (15.35) and (15.36), give
$\int_{-\infty}^\infty |m_\varepsilon'(t)\,\widehat\tau_\delta(t)|^2\,dt \ll T_n\,\delta^{-2n}\int_{3/2}^\infty t^{16-2n}\{1 + t^2|g'(1 + \varepsilon + iat)|^2\}\,dt.$
Choosing n = 9 in the last expression, we obtain, by (15.6),
$\int_{-\infty}^\infty |m_\varepsilon'(t)\,\widehat\tau_\delta(t)|^2\,dt \le T\,\delta^{-18}\{1 + c/(2\pi aA^2)\}.$
The next step approaches the boundedness of $\sum_{n\ge1} h_\varepsilon^-(2n\pi)/n$.

Lemma 15.16. Under the assumptions of the last lemma there are a positive integer k and a constant Z, both independent of ε, such that
(15.37)  $\sum_{n\ge1} |h_\varepsilon^-(2n\pi)|^k \le Z.$

Proof. Combining (15.30) and (15.34), we obtain, uniformly for 0 < ε ≤ 1 and j = 0, 1, 2, . . . ,
(15.38)  $\sum_{n\in S_j} |h_\varepsilon^-(2n\pi)|^3 < Z_1\int_{-\infty}^\infty |m_\varepsilon'(t)\,\widehat\tau_\delta(t)|^2\,dt < Z_2\,\delta^{-18}.$
For j = 0, 1, 2, . . . , choose $\delta = \delta_j := 2^{-j-3}/K$, as in Lemma 15.13. We obtain
$\sum_{n\in S_j} |h_\varepsilon^-(2n\pi)|^3 < Z_2(2^{j+3}K)^{18} = Z_3\,2^{18j}.$
Also, $|h_\varepsilon(2n\pi)| \le 2^{-j}$ for $n \in S_j$, j = 1, 2, . . . , so we find for each index j ≥ 1
(15.39)  $\sum_{n\in S_j} |h_\varepsilon^-(2n\pi)|^{22} < Z_3\,2^{-j}.$
The preceding bound holds also for $n \in S_0$, and we have $|h_\varepsilon(2n\pi)| \le K$. Thus
$\sum_{n\in S_0} |h_\varepsilon^-(2n\pi)|^{22} < Z_3K^{19} = Z_4.$
Finally,
$\sum_{n\ge1} |h_\varepsilon^-(2n\pi)|^{22} = \sum_{j\ge0}\sum_{n\in S_j} |h_\varepsilon^-(2n\pi)|^{22} < Z_4 + Z_3\sum_{j\ge1} 2^{-j} =: Z.$
Proof of Theorem 15.1. By Hölder’s inequality and (15.37), we have
(15.40)  $\sum_{n\ge1} \frac{|h_\varepsilon^-(2n\pi)|}{n} \le \Big\{\sum_{n\ge1} |h_\varepsilon^-(2n\pi)|^k\Big\}^{1/k}\Big\{\sum_{n\ge1} \frac{1}{n^{k/(k-1)}}\Big\}^{(k-1)/k} \le P,$
where the constant P is independent of ε. This contradicts (15.23). Therefore, ζ(s) has no zeros on the line σ = 1.

Proof of Theorem 15.2. The argument here is the same as that used for the conclusion of Theorem 13.1. The hypothesis (15.3) enables us to apply Lemma 13.7. We have
(10.8 ter)  $\lim_{\sigma,\,\sigma'\to1+}\int_{-2\lambda}^{2\lambda} |G(\sigma + it) - G(\sigma' + it)|\,dt = 0$
for every λ > 0, where
$G(s) := -\frac{1}{s}\,\frac{\zeta'}{\zeta}(s) - \frac{1}{s - 1}.$
Then, an application of Corollary 10.6 implies that the PNT holds for the g-number system N. This completes the proof of Theorem 15.2.
15.7. Notes

§15.1. If E(x) satisfies (15.3), then $\int_2^\infty x^{-1}|E(x)|\,dx$ is finite by the first estimate of Lemma 15.3. It follows that N has density and E(x) = o(1) by Proposition 7.7.

Bateman and the first author showed in [BD69] that if condition (15.3) is satisfied and if ζ(s) has no zeros on the line $\Re s = 1$ then the PNT holds. This raised the question of whether (15.3) might in fact imply the nonvanishing of ζ(s) on the line, and so suffice for the PNT.

Further improvements of Kahane’s theorem were given by the second author in [Zh15]. Let
$E_0(x) := (N(x) - Ax)/x$ and $E_k(x) := x^{-1}\int_1^x E_{k-1}(t)\,dt$,  k ≥ 1.
One of the results asserts that if
$\int_1^\infty \{E_k(x)\log x\}^2 x^{-1}\,dx < \infty$
for some k ≥ 1, then the PNT holds. (Kahane’s theorem is the k = 0 case.)
https://doi.org/10.1090//surv/213/16
CHAPTER 16
PNT with Remainder

What’s left to say?

Summary. The PNT remainder is the difference $\Delta(x) := |\pi(x) - \mathrm{li}(x)|$. Here we establish “Nyman type” and “de la Vallée Poussin type” estimates for Δ, starting from different assumptions on the g-number counting function N(x).

16.1. Introduction

In this chapter, we shall establish the PNT with remainder terms of two forms. The first is $\Delta(x) \ll x\log^{-k}x$, which we call a “Nyman type remainder term,” after its discoverer, and the second is $\Delta(x) \ll x\exp\{-c\log^{\beta}x\}$, which we call a “dlVP type remainder term,” after Ch. J. de la Vallée Poussin, who first gave such a bound.

The rational integers differ from g-integers in that the former are equally spaced. This property allows the Riemann zeta function $\zeta_R(s)$ to be extended to an analytic function on C with a simple pole at s = 1. These facts and known zero-free regions for zeta yield a dlVP type error term with β any number less than 3/5. This result was achieved in 1958 and is still essentially the best one known.

The situation for a g-number system N can be quite different, depending on the distribution of N: if we assume that $N(x) = Ax + O(x^{\theta})$ with some number θ ∈ [0, 1), then the zeta function $\zeta = \zeta_{\mathcal N}$ can be analytically continued to the half plane {σ > θ}. In this case, estimates of Δ with β equal to 1/2 can be obtained by mimicking classical contour shifting methods. We shall not pursue this case further here. In the next chapter, we shall show with an example that such a bound is optimal.

The hypotheses made on N in this chapter do not allow extension of zeta to the left of the line {σ = 1}. In the first case we assume that $N(x) = Ax + O(x\log^{-\gamma}x)$, and in the second one, $N(x) = Ax + O(x\exp\{-c\log^{\alpha}x\})$. In both situations we apply an inversion method to a high order derivative of (essentially) a Mellin transform of Δ. In the first case we use an analogue of the Wiener-Ikehara method, but with a Gaussian function in place of the triangular function; and in the second case, we use the Plancherel identity. For one side of the resulting integral identities we estimate the derivative of the zeta function expression, and on the other side, obtain a pointwise bound for Δ(x) by simple tauberian arguments.

The zeta function estimates in the two cases are similar; we treat these together in the next section. The parts of the arguments that differ are presented separately in the last two sections.
16.2. Two general lemmas

Suppose that a g-number system N has integer counting function
(16.1)  $N(x) = Ax + O(x\log^{-\gamma}x)$
with constants A > 0 and γ > 3. As this is the weaker of our two conditions, results established for this case are valid in the dlVP case as well. We introduce an auxiliary function
$f(s) := (s - 1)\zeta(s)/s$,  σ > 1,
which will be very useful in both cases. By integration by parts
(16.2)  $f(s) = A + (s - 1)\int_1^\infty x^{-s-1}(N(x) - Ax)\,dx.$
This formula shows that f(s) is well-defined and differentiable in the closed half-plane {σ ≥ 1} and is bounded near 1. We begin by establishing a lower bound for |f| using upper bounds upon f and its derivatives.

Lemma 16.1. Suppose that
(16.3)  $|f^{(\ell)}(\sigma + it)| \le Bg(t)^{\ell+1}$,  ℓ = 0, 1,
holds with some constant B > 0 and some positive valued function g(t) uniformly for 1 < σ ≤ 2, t ∈ R, and that g(t) satisfies
(16.4)  $1 \le g(nt) \le Dg(t)$,  n = 1, 2, . . . , 7,
with some constant D ≥ 1 for all t ∈ R. Then
(16.5)  $|f(\sigma + it)|^{-1} \le Cg(t)^{\lambda}$
holds with constants C > 0 and λ = 5 for 1 < σ ≤ 2 and all t ∈ R.

Proof. It is convenient to treat separately the case of small values of |t|. We see from (16.2) that f(s) is continuous on {σ ≥ 1} and nonvanishing at 1, so |f(s)| has a positive lower bound on a sufficiently small closed semidisc about 1. Also, ζ(s) ≠ 0 on the open half plane {σ > 1}, so f(s) has the same property. It follows that |f(s)| ≥ ε for 1 ≤ σ ≤ 2 and $|t| \le t_0$ for sufficiently small ε > 0 and $t_0 > 0$. In this case, (16.5) holds with any λ ≥ 0.

For $|t| \ge t_0$, we are away from the pole of zeta, and, since ζ(s) = sf(s)/(s − 1), an estimate for either f(s) or ζ(s) translates to an estimate for the other. In this case, it suffices to show that
(16.6)  $|\zeta(\sigma + it)| \ge C_1 g(t)^{-\lambda}.$
We use the classical method of first giving a lower bound for |ζ(σ + it)| for all $\sigma \in [\sigma_1, 2]$ with a suitable $\sigma_1 > 1$ and then extending the result to all σ ∈ (1, 2] with the aid of an upper bound on |ζ′(σ + it)|. By (16.3),
(16.7)  $|\zeta^{(\ell)}(\sigma + it)| \le C_2 g(t)^{\ell+1}$,  ℓ = 0, 1.
We remark that the classical 3-4-1 inequality, $\zeta^3(\sigma)|\zeta(\sigma + it)|^4|\zeta(\sigma + i2t)| \ge 1$, will yield (16.6) with λ = 7. However, we shall obtain a better bound with the aid of a more complicated trigonometric polynomial.
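To illustrate the remark about the 3-4-1 inequality, the sketch below checks that $3 + 4\cos\theta + \cos 2\theta = 2(1 + \cos\theta)^2 \ge 0$ and evaluates the ratio $S = (a_0 + a_1 + \cdots)/(a_1 - a_0)$ that the proof introduces for a general cosine polynomial, together with the resulting exponent λ = S − 1. The function names are hypothetical and the code is only a numerical illustration of why the coefficients (3, 4, 1) lead to λ = 7.

```python
import math
from fractions import Fraction


def cosine_poly(coeffs, theta: float) -> float:
    """Evaluate a_0 + a_1 cos(theta) + a_2 cos(2 theta) + ... for coeffs = [a_0, a_1, ...]."""
    return sum(float(a) * math.cos(k * theta) for k, a in enumerate(coeffs))


def exponent_lambda(coeffs) -> Fraction:
    """Return S - 1, where S = (a_0 + a_1 + ... ) / (a_1 - a_0) as in the proof of Lemma 16.1."""
    a0, a1 = Fraction(coeffs[0]), Fraction(coeffs[1])
    if a1 <= a0:
        raise ValueError("the method needs a_1 > a_0 > 0")
    total = sum(Fraction(a) for a in coeffs)
    return total / (a1 - a0) - 1


if __name__ == "__main__":
    three_four_one = [3, 4, 1]
    grid = [2 * math.pi * i / 10000 for i in range(10000)]
    print(min(cosine_poly(three_four_one, th) for th in grid))  # nonnegative up to rounding
    print(exponent_lambda(three_four_one))                      # exponent obtained from 3-4-1
```

For (3, 4, 1) the ratio is S = 8, so λ = 7; a polynomial with S < 5.91 gives λ < 4.91, which is why the lemma can be stated with λ = 5.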
There exists a cosine polynomial (see Notes)
$p(\theta) = a_0 + a_1\cos\theta + a_2\cos2\theta + \cdots + a_7\cos7\theta$
with the properties
(i) p(θ) ≥ 0 for all θ ∈ R,
(ii) $a_1 > a_0 > 0$ and $a_k \ge 0$ for 2 ≤ k ≤ 7,
(iii) $S := (a_0 + a_1 + \cdots + a_7)/(a_1 - a_0) < 5.91$.
Property (i) implies that
(16.8)  $\zeta^{a_0}(\sigma)\,|\zeta(\sigma + it)|^{a_1}\cdots|\zeta(\sigma + 7it)|^{a_7} \ge 1$
for all σ > 1. It follows, from (16.7) with ℓ = 0, property (ii), and (16.4), that
$|\zeta(\sigma + it)| \ge C_3(\sigma - 1)^{a_0/a_1}g(t)^{-(a_2 + \cdots + a_7)/a_1}$
with some positive constant $C_3$; we may assume that $C_2 > C_3$. Therefore, for $1 + \delta g(t)^{-\eta} \le \sigma \le 2$ with δ > 0 and η > 0, both to be specified,
(16.9)  $|\zeta(\sigma + it)| \ge C_3\,\delta^{a_0/a_1}g(t)^{-(a_0\eta + a_2 + \cdots + a_7)/a_1}.$
Now suppose that $1 < \sigma \le 1 + \delta g(t)^{-\eta}$. By (16.7) with ℓ = 1, we have
(16.10)  $|\zeta(\sigma + it) - \zeta(1 + \delta g(t)^{-\eta} + it)| \le C_2\,\delta\,g(t)^{-\eta+2}.$
Choose η to make the exponents of g(t) in (16.9) and (16.10) the same and choose δ to make $C_3\delta^{a_0/a_1} = 2C_2\delta$, namely,
$\eta = \frac{a_0 + a_1 + \cdots + a_7}{a_1 - a_0} + 1 = S + 1$  and  $\delta = \Big(\frac{C_3}{2C_2}\Big)^{a_1/(a_1 - a_0)}.$
In this case, we obtain
(16.11)  $|\zeta(\sigma + it)| \ge |\zeta(1 + \delta g(t)^{-\eta} + it)| - |\zeta(\sigma + it) - \zeta(1 + \delta g(t)^{-\eta} + it)| \ge C_4\,g(t)^{-S+1}$
with some constant $C_4 > 0$. The lemma with λ = S − 1 follows from the last inequality, (16.9), and property (iii) of p(θ).

The following identity will enable us to combine bounds for 1/f(s) and for derivatives of f(s) to obtain bounds for derivatives of f′/f.

Lemma 16.2. Assume that a function h(s) is nonvanishing in a region and is n-times differentiable there for some positive integer n. Then
(16.12)  $\Big(\frac{h'(s)}{h(s)}\Big)^{(n-1)} = \sum_{k=1}^{n} (-1)^{k-1}P_{k,n}\Big(\frac{h'(s)}{h(s)}, \frac{h''(s)}{h(s)}, \ldots, \frac{h^{(n)}(s)}{h(s)}\Big),$
where
$P_{k,n}(x_1, \ldots, x_n) = \sum_{\mathbf{k}} C_{\mathbf{k},k}\,x_1^{k_1}\cdots x_n^{k_n}$
is a homogeneous polynomial of degree k with positive integral coefficients $C_{\mathbf{k},k}$, where $\mathbf{k} = \{k_1, \ldots, k_n\}$ is an n-vector with components satisfying
(16.13)  $k_1 + \cdots + k_n = k$  and  $k_1 + 2k_2 + \cdots + nk_n = n$.
Moreover, we have
(16.14)  $\sum_{k=1}^{n}\sum_{\mathbf{k}} C_{\mathbf{k},k} \le 2^{n-1}(n - 1)!.$

Remark. Also, for n ≥ 2 the coefficients $C_{\mathbf{k},k}$ satisfy
$\sum_{k=1}^{n} (-1)^{k-1}\sum_{\mathbf{k}} C_{\mathbf{k},k} = 0,$
but this property is not used in the sequel.

To simplify the notation, we will not write the argument s in the proof.

Proof. The lemma is proved by induction on n. Plainly it is true for n = 1. For n = 2 we have
$\Big(\frac{h'}{h}\Big)' = \frac{h''}{h} - \Big(\frac{h'}{h}\Big)^2,$
which corresponds to the monomials
$P_{1,2}(x_1, x_2) = C_{\{0,1\},1}\,x_2 = x_2$,  $P_{2,2}(x_1, x_2) = C_{\{2,0\},2}\,x_1^2 = x_1^2$;
thus the formula holds here as well. Now suppose that (16.12) and (16.13) hold and that h is (n + 1)-times differentiable. By the first condition in (16.13), we have
$P_{k,n}\Big(\frac{h'}{h}, \frac{h''}{h}, \ldots, \frac{h^{(n)}}{h}\Big) = \frac{P_{k,n}(h', h'', \ldots, h^{(n)})}{h^k}.$
Differentiating, we obtain
$\Big(\frac{h'}{h}\Big)^{(n)} = \Big\{\sum_{k=1}^{n} \frac{(-1)^{k-1}P_{k,n}(h', h'', \ldots, h^{(n)})}{h^k}\Big\}' = \sum_{k=1}^{n}\Big\{(-1)^k k\,\frac{h'}{h}\,\frac{P_{k,n}(h', h'', \ldots, h^{(n)})}{h^k} + \frac{(-1)^{k-1}DP_{k,n}(h', h'', \ldots, h^{(n)})}{h^k}\Big\} =: I + II.$
Here, $DP_{k,n}(h', \ldots)$ represents $(d/ds)\{P_{k,n}(h'(s), \ldots)\}$. In terms of the coefficients we have
$I = \sum_{k=1}^{n} (-1)^k k\sum_{\mathbf{k}} C_{\mathbf{k},k}\Big(\frac{h'}{h}\Big)^{k_1+1}\Big(\frac{h''}{h}\Big)^{k_2}\cdots\Big(\frac{h^{(n)}}{h}\Big)^{k_n} = \sum_{k=2}^{n+1} (-1)^{k-1}(k - 1)\sum_{\mathbf{k}} C_{\mathbf{k},k-1}\Big(\frac{h'}{h}\Big)^{k_1+1}\Big(\frac{h''}{h}\Big)^{k_2}\cdots\Big(\frac{h^{(n)}}{h}\Big)^{k_n}$
and
$II = \sum_{k=1}^{n} (-1)^{k-1}\sum_{\mathbf{k}} C_{\mathbf{k},k}\Big\{k_1\Big(\frac{h'}{h}\Big)^{k_1-1}\Big(\frac{h''}{h}\Big)^{k_2+1}\cdots\Big(\frac{h^{(n)}}{h}\Big)^{k_n} + k_2\Big(\frac{h'}{h}\Big)^{k_1}\Big(\frac{h''}{h}\Big)^{k_2-1}\Big(\frac{h^{(3)}}{h}\Big)^{k_3+1}\cdots\Big(\frac{h^{(n)}}{h}\Big)^{k_n} + \cdots + k_n\Big(\frac{h'}{h}\Big)^{k_1}\cdots\Big(\frac{h^{(n)}}{h}\Big)^{k_n-1}\frac{h^{(n+1)}}{h}\Big\}.$
Therefore, if we set polynomials
(16.15)  $P_{k,n+1}(x_1, \ldots, x_n, x_{n+1}) := (k - 1)\sum_{\mathbf{k}} C_{\mathbf{k},k-1}\,x_1^{k_1+1}x_2^{k_2}\cdots x_n^{k_n} + \sum_{\mathbf{k}} C_{\mathbf{k},k}\big\{k_1x_1^{k_1-1}x_2^{k_2+1}x_3^{k_3}\cdots x_n^{k_n} + k_2x_1^{k_1}x_2^{k_2-1}x_3^{k_3+1}\cdots x_n^{k_n} + \cdots + k_nx_1^{k_1}\cdots x_{n-1}^{k_{n-1}}x_n^{k_n-1}x_{n+1}\big\}$
for k = 1, . . . , n, n + 1, then
$\Big(\frac{h'}{h}\Big)^{(n)} = \sum_{k=1}^{n+1} (-1)^{k-1}P_{k,n+1}\Big(\frac{h'}{h}, \ldots, \frac{h^{(n)}}{h}, \frac{h^{(n+1)}}{h}\Big).$
By (16.15) and the induction hypothesis, the polynomial $P_{k,n+1}$ has coefficients $C_{\mathbf{k}',k}$ (with $\mathbf{k}' = \{k_1', \ldots, k_n', k_{n+1}'\}$) that are all positive integers and also terms $x_1^{k_1'}\cdots x_n^{k_n'}x_{n+1}^{k_{n+1}'}$ that satisfy
$k_1' + \cdots + k_n' + k_{n+1}' = k$  and  $k_1' + 2k_2' + \cdots + nk_n' + (n + 1)k_{n+1}' = n + 1.$
This proves (16.13). Finally, let $S_n$ denote the left hand side of (16.14). Note that (16.13) implies that $k = k_1 + \cdots + k_n \le n$. Then, by (16.15) (with the first form of I),
$S_{n+1} = \sum_{k=1}^{n+1}\sum_{\mathbf{k}'} C_{\mathbf{k}',k} = \sum_{k=1}^{n} k\sum_{\mathbf{k}} C_{\mathbf{k},k} + \sum_{k=1}^{n}\sum_{\mathbf{k}} C_{\mathbf{k},k}(k_1 + \cdots + k_n) \le 2nS_n.$
Therefore, by the induction hypothesis, $S_{n+1} \le 2^n n!$. This proves (16.14).
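The recursion (16.15) is easy to run by machine for small n. The sketch below is an illustration under assumptions of its own: the dictionary representation (exponent tuple $(k_1, \ldots, k_n)$ mapped to the unsigned coefficient) and the function names are not from the text. It generates the coefficients and lets one check the weight conditions (16.13), the bound (16.14), and the alternating-sum identity of the Remark for n ≥ 2.

```python
from math import factorial


def next_level(coeffs: dict) -> dict:
    """Apply rule (16.15) once: map the coefficients for n to those for n + 1.

    Keys are exponent tuples (k_1, ..., k_n); values are the unsigned
    coefficients C. The degree k of a monomial is the sum of its exponents.
    """
    new = {}
    for exps, c in coeffs.items():
        k = sum(exps)
        padded = exps + (0,)
        # Contribution from I: multiply the monomial by x_1, with weight k.
        mono = (padded[0] + 1,) + padded[1:]
        new[mono] = new.get(mono, 0) + k * c
        # Contribution from II: replace one factor x_j by x_{j+1}, with weight k_j.
        for j, kj in enumerate(exps):
            if kj == 0:
                continue
            shifted = list(padded)
            shifted[j] -= 1
            shifted[j + 1] += 1
            key = tuple(shifted)
            new[key] = new.get(key, 0) + kj * c
    return new


def coefficients(n: int) -> dict:
    """Unsigned coefficients of all P_{k,n}, starting from P_{1,1} = x_1."""
    coeffs = {(1,): 1}
    for _ in range(n - 1):
        coeffs = next_level(coeffs)
    return coeffs


def coefficient_sum(n: int) -> int:
    """S_n, the left hand side of (16.14)."""
    return sum(coefficients(n).values())


def alternating_sum(n: int) -> int:
    """Sum of (-1)^(k-1) C over all monomials, k being the degree."""
    return sum((-1) ** (sum(e) - 1) * c for e, c in coefficients(n).items())


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, coefficient_sum(n), 2 ** (n - 1) * factorial(n - 1), alternating_sum(n))
```

For n = 3 this reproduces $(h'/h)'' = x_3 - 3x_1x_2 + 2x_1^3$ with $x_j = h^{(j)}/h$, so $S_3 = 6 \le 8$.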
16.3. A Nyman type remainder term

We showed earlier that if γ > 1 in (16.1) then Chebyshev bounds hold for N (Corollary 9.10), and if γ > 3/2 then we have the PNT (Theorem 13.1). Now, for larger γ values, we establish the PNT with a remainder term.

Theorem 16.3. Assume that (16.1) holds with γ > 3. Then
(16.16)  $\psi(x) = x + O\big(x(\log x)^{-(\gamma-3)/8}\big).$

Remarks. This remainder term is not optimal. (See Notes.) By integration by parts, the conclusion of the theorem can be restated as
$\pi(x) = \mathrm{li}(x) + O\big(x(\log x)^{-(\gamma+5)/8}\big).$

Combining Theorem 16.3 with the g-integer approximation Theorem 7.13 (A) we obtain

Corollary 16.4. A Beurling generalized number system satisfies
$N(x) = Ax + O_k(x\log^{-k}x)$
with some constant A > 0 and all large k ∈ N if and only if
$\pi(x) = \mathrm{li}(x) + O_\ell(x\log^{-\ell}x)$
holds for all large ℓ ∈ N.
The method we apply in the proof of Theorem 16.3 has some similarities with that used for the Wiener-Ikehara Theorem. We exploit the “weight” produced by derivatives of −(ζ′/ζ)(1 + it) to obtain the remainder term; the required bounds for derivatives are established using condition (16.1). A second difference is that here we use a Gaussian convergence factor, which vanishes at infinity much faster than the Fejér kernel.

This is an outline of our argument. For a positive integer n of our choice, we bound the n-th derivative of (ζ′/ζ)(s); because n will be fixed for given γ, estimates need not be uniform in this parameter. We start calculations with the identity
$\frac{d^n}{ds^n}\Big\{\frac{-\zeta'(s)}{s\zeta(s)} - \frac{1}{s - 1}\Big\} = \int_1^\infty x^{-s}(-\log x)^n\,\frac{\psi(x) - x}{x}\,dx = \int_0^\infty e^{-(s-1)u}(-u)^n\,\frac{\psi(e^u) - e^u}{e^u}\,du$
for $\sigma = \Re s > 1$. With numbers y > 0 (large) and ε > 0 (small), multiply through the last formula by
$e^{ity - \varepsilon^2t^2/2}/(2\pi)$
and integrate on t. The double integral below converges absolutely, which justifies switching the order of integration. We define functions $L = L_n(\sigma, y)$ and $R = R_n(\sigma, y)$ with
$L := \frac{(-1)^n}{2\pi}\int_{-\infty}^\infty \Big\{\frac{-\zeta'(\sigma + it)}{(\sigma + it)\zeta(\sigma + it)} - \frac{1}{\sigma + it - 1}\Big\}^{(n)} e^{ity - \varepsilon^2t^2/2}\,dt$
$\quad = \int_{u=0}^\infty e^{-(\sigma-1)u}u^n\,\frac{\psi(e^u) - e^u}{e^u}\,\frac{1}{2\pi}\int_{t=-\infty}^\infty e^{it(y-u)}e^{-\varepsilon^2t^2/2}\,dt\,du$
$\quad = \frac{1}{\sqrt{2\pi}\,\varepsilon}\int_0^\infty e^{-(\sigma-1)u}u^n\,\frac{\psi(e^u) - e^u}{e^u}\,e^{-(y-u)^2/(2\varepsilon^2)}\,du =: R.$
(In the last integral, the Fourier transform of the Gaussian function was used.) Now let σ → 1+. By the dominated convergence theorem,
$R \to \frac{1}{\sqrt{2\pi}\,\varepsilon}\int_0^\infty u^n\,\frac{\psi(e^u) - e^u}{e^u}\,e^{-(y-u)^2/(2\varepsilon^2)}\,du =: R^*.$
We have the normalization
(16.17)  $\frac{1}{\sqrt{2\pi}\,\varepsilon}\int_{-\infty}^\infty e^{-v^2/(2\varepsilon^2)}\,dv = 1.$
Thus an average of $u^n(\psi(e^u)-e^u)/e^u$ is expressed in terms of the limit of $L_n$ as $\sigma \to 1+$. It remains to choose a suitable $n$, estimate $\lim_{\sigma\to1+} L_n(\sigma, y)$, and extract pointwise upper and lower bounds from this average. We begin with the last task.

Bounds for $\psi(x)$

Lemma 16.5. Suppose that the integer counting function of $\mathcal{N}$ satisfies (16.1) and that $L_n \ll \varepsilon^{-7(n+1)/\gamma}$ holds for some natural number $n < \gamma - 2$, uniformly for all $\sigma > 1$, all $y \ge 2$, and $0 < \varepsilon < 1$. Then, for $\eta = \eta(\gamma, n) > (7+\gamma)/n$,
\[ \psi(x) = x + O\big(x\{\log x\}^{-\gamma/(7+\eta)}\big). \]
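Proof. Take $n$ to be a positive integer satisfying $n < \gamma - 2$. (The reason for this restriction will appear in the proof of the next lemma.) The first step is to derive an upper bound for $\psi$. With a number $\delta$ to be chosen that is large compared with $\varepsilon$ and satisfying $\delta = o(y)$, the $\psi$-part of the integral for $R^*$ is expressible as
\begin{align*}
R_1^* &:= \frac{1}{\sqrt{2\pi}\,\varepsilon}\int_0^\infty u^n e^{-(y-u)^2/(2\varepsilon^2)}\psi(e^u)e^{-u}\,du \tag{16.18} \\
&\ge \frac{1}{\sqrt{2\pi}\,\varepsilon}\int_{y-\delta}^{y+\delta} u^n e^{-(y-u)^2/(2\varepsilon^2)}\psi(e^u)e^{-u}\,du \\
&\ge \frac{\psi(e^{y-\delta})(y-\delta)^n}{\sqrt{2\pi}\,\varepsilon\,e^{y+\delta}}\int_{y-\delta}^{y+\delta} e^{-(y-u)^2/(2\varepsilon^2)}\,du.
\end{align*}
Using (16.17) and the Chebyshev O-bounds, we find
\[ R_1^* \ge \psi(e^{y-\delta})(y-\delta)^n e^{-y-\delta} - \frac{O(y^n)}{\varepsilon}\int_\delta^\infty e^{-v^2/(2\varepsilon^2)}\,dv. \]
The last integral is smaller than
\[ \frac{1}{\delta}\int_\delta^\infty v e^{-v^2/(2\varepsilon^2)}\,dv = \frac{\varepsilon^2}{\delta}e^{-\delta^2/(2\varepsilon^2)} < \varepsilon e^{-\delta^2/(2\varepsilon^2)}; \tag{16.19} \]
thus
\[ R_1^* \ge \psi(e^{y-\delta})(y-\delta)^n e^{-y-\delta} + O\big(y^n e^{-\delta^2/(2\varepsilon^2)}\big). \]
The rest of $R^*$ is $-R_2^*$ with
\[ R_2^* := \frac{1}{\sqrt{2\pi}\,\varepsilon}\int_0^\infty u^n e^{-(y-u)^2/(2\varepsilon^2)}\,du = \frac{1}{\sqrt{2\pi}\,\varepsilon}\int_{-y}^\infty (y+v)^n e^{-v^2/(2\varepsilon^2)}\,dv. \]
Expanding the binomial and using (16.17) and (16.19) again (but with $y$ in place of $\delta$), we see that
\begin{align*}
R_2^* = {}& y^n + O\big(y^{n-1}\varepsilon e^{-y^2/(2\varepsilon^2)}\big) + \frac{ny^{n-1}}{\sqrt{2\pi}\,\varepsilon}\int_{-y}^\infty v e^{-v^2/(2\varepsilon^2)}\,dv \tag{16.20} \\
&+ \frac{n(n-1)}{2\sqrt{2\pi}\,\varepsilon}\,y^{n-2}\int_{-y}^\infty v^2 e^{-v^2/(2\varepsilon^2)}\,dv + \cdots.
\end{align*}
Since the integrand $ve^{-v^2/(2\varepsilon^2)}$ is odd,
\[ \int_{-y}^\infty v e^{-v^2/(2\varepsilon^2)}\,dv = \int_y^\infty v e^{-v^2/(2\varepsilon^2)}\,dv = \varepsilon^2 e^{-y^2/(2\varepsilon^2)}; \]
also, the last integral in (16.20) is less than
\[ \int_{-\infty}^\infty v^2 e^{-v^2/(2\varepsilon^2)}\,dv = O(\varepsilon^3). \]
Thus the first three terms of $R_2^*$ equal
\[ y^n + O\big(\varepsilon^2 y^{n-2}\big). \]
The remaining terms of $R_2^*$ have lower powers of $y$ and higher powers of $\varepsilon$, so the last estimate is valid for all of $R_2^*$. The preceding bounds for $R_1^*$ and $R_2^*$ together with the condition for $L_n$ give
\[ \frac{\psi(e^{y-\delta})(y-\delta)^n}{e^{y+\delta}} \le y^n\big\{1 + O(\varepsilon^2y^{-2}) + O(e^{-\delta^2/(2\varepsilon^2)})\big\} + O\big(\varepsilon^{-7(n+1)/\gamma}\big) \]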
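or
\[ \frac{\psi(e^{y-\delta})}{e^{y-\delta}} \le \Big(\frac{y}{y-\delta}\Big)^n e^{2\delta}\big\{1 + O(\varepsilon^2y^{-2}) + O(e^{-\delta^2/(2\varepsilon^2)}) + O(y^{-n}\varepsilon^{-7(n+1)/\gamma})\big\} \]
or
\[ \frac{\psi(e^{y-\delta})}{e^{y-\delta}} \le 1 + O\big(\delta + e^{-\delta^2/(2\varepsilon^2)} + y^{-n}\varepsilon^{-7(n+1)/\gamma}\big). \]
Taking $\delta := \varepsilon\sqrt{2\log(1/\varepsilon)}$, we find
\[ e^{-\delta^2/(2\varepsilon^2)} = \varepsilon. \]
Next, set $\varepsilon := y^{-\gamma/(7+\{7+\gamma\}/n)}$; with this choice, $y^{-n}\varepsilon^{-7(n+1)/\gamma} = \varepsilon$. Since $\delta > \varepsilon$, these bounds give
\[ \frac{\psi(e^{y-\delta})}{e^{y-\delta}} \le 1 + O(\delta). \]
Finally, $\delta$ is connected with $y$ by
\[ \delta = \varepsilon\sqrt{2\log(1/\varepsilon)} \ll y^{-\gamma/\{7+(7+\gamma)/n\}+o(1)} \]
for large $y$. Thus, we have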
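\[ \psi(x) \le x + O\big(x\{\log x\}^{-\gamma/(7+\eta)}\big) \]
provided that we take $\eta = \eta(\gamma, n) > (7+\gamma)/n$.

Now we turn to the lower bound for $\psi$, starting again with $R^* = R_1^* - R_2^*$ and using the same values for $\delta$, $\varepsilon$, and $n$. Note that we have shown already that $R_2^* = y^n + O(\varepsilon^2y^{n-2})$. To study $R_1^*$, recall (16.18) and the Chebyshev bound $\psi(x)/x < B$ for all $x \ge 1$. This time, write
\begin{align*}
R_1^* &:= \frac{1}{\sqrt{2\pi}\,\varepsilon}\int_0^\infty u^n e^{-(y-u)^2/(2\varepsilon^2)}\psi(e^u)e^{-u}\,du \\
&\le \frac{1}{\sqrt{2\pi}\,\varepsilon}\int_{y-\delta}^{y+\delta} u^n e^{-(y-u)^2/(2\varepsilon^2)}\,\frac{\psi(e^u)}{e^u}\,du + \frac{B}{\sqrt{2\pi}\,\varepsilon}\int_{[0,\infty)\setminus(y-\delta,\,y+\delta)} u^n e^{-(y-u)^2/(2\varepsilon^2)}\,du =: I + II.
\end{align*}
By (16.17), our main term is
\[ I \le \frac{\psi(e^{y+\delta})(y+\delta)^n}{\sqrt{2\pi}\,\varepsilon\,e^{y-\delta}}\int_{y-\delta}^{y+\delta} e^{-(y-u)^2/(2\varepsilon^2)}\,du \le \frac{\psi(e^{y+\delta})(y+\delta)^n}{e^{y-\delta}}. \]
Using the binomial expansion
\begin{align*}
II &< \frac{2B}{\sqrt{2\pi}\,\varepsilon}\int_{y+\delta}^\infty u^n e^{-(y-u)^2/(2\varepsilon^2)}\,du = \frac{2B}{\sqrt{2\pi}\,\varepsilon}\int_\delta^\infty (y+v)^n e^{-v^2/(2\varepsilon^2)}\,dv \\
&= \frac{2B}{\sqrt{2\pi}\,\varepsilon}\int_\delta^\infty\Big\{y^n + ny^{n-1}v + \frac{n(n-1)}{2}y^{n-2}v^2 + \cdots\Big\}e^{-v^2/(2\varepsilon^2)}\,dv,
\end{align*}
and from the upper bound calculations, we find
\[ II \le \varepsilon y^n + O\big(\varepsilon^2y^{n-1}\big). \]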
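Thus
\[ R_1^* \le \frac{\psi(e^{y+\delta})(y+\delta)^n}{e^{y-\delta}} + \varepsilon y^n + O\big(y^{n-1}\varepsilon^2\big), \]
and combining the bounds for $R_1^*$ and $R_2^*$ and that for $L_n$, we obtain
\[ \frac{\psi(e^{y+\delta})(y+\delta)^n}{e^{y-\delta}} + \varepsilon y^n - y^n + O\big(\varepsilon^2y^{n-1}\big) \ge -c\,\varepsilon^{-7(n+1)/\gamma}. \]
Proceeding as before, we find
\[ \frac{\psi(e^{y+\delta})}{e^{y+\delta}} \ge 1 - c_1\delta - c_2\varepsilon^2y^{-1} - c_3y^{-n}\varepsilon^{-7(n+1)/\gamma} \]
or
\[ \psi(x) \ge x + O\big(x\{\log x\}^{-\gamma/(7+\eta)}\big). \]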
The two inequalities for ψ(x) give the claimed formula.
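The two parameter choices in the proof above can be checked numerically. The following is a minimal sketch, assuming the choices $\varepsilon = y^{-\gamma/(7+(7+\gamma)/n)}$ and $\delta = \varepsilon\sqrt{2\log(1/\varepsilon)}$; the function name and the sample values of $y$, $\gamma$, $n$ are illustrative only and do not come from the text.

```python
import math

def nyman_parameters(y, gamma, n):
    """Return (eps, delta) as chosen in the proof of Lemma 16.5 (hypothetical helper)."""
    # eps := y^{-gamma / (7 + (7 + gamma)/n)}
    eps = y ** (-gamma / (7.0 + (7.0 + gamma) / n))
    # delta := eps * sqrt(2 log(1/eps))
    delta = eps * math.sqrt(2.0 * math.log(1.0 / eps))
    return eps, delta

# Illustrative values only; they satisfy n < gamma - 2.
y, gamma, n = 1.0e6, 4.5, 2
eps, delta = nyman_parameters(y, gamma, n)

# Both error terms are designed to collapse to eps.
gauss_tail = math.exp(-delta ** 2 / (2.0 * eps ** 2))
ln_term = y ** (-n) * eps ** (-7.0 * (n + 1) / gamma)
```

With these choices the Gaussian tail term and the $L_n$ term both equal $\varepsilon$ up to rounding, and $\delta$ exceeds $\varepsilon$, so all three error terms are absorbed into $O(\delta)$.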
Bounds for $L_n$

We need a bound for $L_n$ for use in Lemma 16.5. Recall that the integrand of the integral defining $L_n$ includes as a factor the $n$th derivative of
\[ Z(s) := \frac{-\zeta'(s)}{s\zeta(s)} - \frac{1}{s-1}; \tag{16.21} \]
we have to estimate $|Z^{(n)}|$. For this, instead of $\zeta(s)$, we use $f(s) := (s-1)\zeta(s)/s$, and we give bounds for $1/f$ and derivatives of $f$ along with the recipe for combining them. To start, we write a convenient representation for $Z$ in terms of the auxiliary function $f(s) = (s-1)\zeta(s)/s$. Since
\[ \frac{1}{s}\Big\{\log\Big(\frac{s-1}{s}\zeta(s)\Big)\Big\}' = \frac{1}{s}\Big\{\frac{\zeta'}{\zeta}(s) + \frac{1}{s-1} - \frac{1}{s}\Big\} = \frac{\zeta'(s)}{s\zeta(s)} + \frac{1}{s-1} - \frac{1}{s} - \frac{1}{s^2}, \]
we have
\[ Z(s) = \frac{-f'(s)}{sf(s)} - \frac{1}{s} - \frac{1}{s^2}. \tag{16.22} \]
Also, we introduce here
\[ g(t) := (|t| + e)^{1/\gamma}. \tag{16.23} \]
(In the next section, we choose $g$ differently.) We begin with upper bounds for the modulus of derivatives of $f$.

Lemma 16.6. If (16.1) is satisfied then
\[ |f^{(\ell)}(\sigma+it)| \le Bg(t)^{\ell+1} \tag{16.24} \]
with some constant $B > 0$, uniformly for $1 < \sigma \le 2$, $t \in \mathbb{R}$, and all integers $\ell$ satisfying $0 \le \ell < \gamma - 1$.

Proof. We have
\[ f(s) = \Big(1 - \frac{1}{s}\Big)\zeta(s) = \int_{1-}^\infty x^{-s}\Big\{dN(x) - \frac{N(x)\,dx}{x}\Big\} = \int_{1-}^\infty x^{-s}\,dN_1(x), \]
where
\[ N_1(x) := \int_{1-}^x\Big\{dN(u) - \frac{N(u)\,du}{u}\Big\} = N(x) - Ax - \int_1^x (N(u) - Au)\,\frac{du}{u} + A \ll x\log^{-\gamma}x. \]
For any $X > 1$, write
\[ f(s) = \int_{1-}^X x^{-s}\,dN_1(x) - N_1(X)X^{-s} + s\int_X^\infty x^{-s-1}N_1(x)\,dx. \]
Differentiating $\ell$ times ($\ell = 0$ is OK), we have by Leibniz's rule
\begin{align*}
(-1)^\ell f^{(\ell)}(s) = {}& \int_{1-}^X x^{-s}\log^\ell x\,dN_1(x) - N_1(X)X^{-s}\log^\ell X \\
&+ s\int_X^\infty x^{-s-1}\log^\ell x\,N_1(x)\,dx - \ell\int_X^\infty x^{-s-1}\log^{\ell-1}x\,N_1(x)\,dx \\
=: {}& I + II + III + IV,
\end{align*}
say. Integration by parts shows
\[ |I| < \int_{1-}^X x^{-1}\log^\ell x\,\Big\{dN(x) + \frac{N(x)\,dx}{x}\Big\} \ll (\log X)^{\ell+1} \]
uniformly for $\sigma > 1$ and all real $t$. Also, we have $II \ll (\log X)^{\ell-\gamma}$. And
\[ III \ll (|t|+e)\int_X^\infty x^{-1}\log^{\ell-\gamma}x\,dx \ll (|t|+e)(\log X)^{\ell+1-\gamma}, \]
with the integral converging, since $\ell < \gamma - 1$. This condition limits the order of zeta derivatives, whence the condition $n < \gamma - 2$ in Lemma 16.5. Also, for $1 \le \ell < \gamma - 1$,
\[ IV \ll \ell\int_X^\infty x^{-1}(\log x)^{\ell-1-\gamma}\,dx \ll (\log X)^{\ell-\gamma}. \]
Now $II$ and $IV$ are negligible, so we obtain
\[ f^{(\ell)}(s) \ll (\log X)^{\ell+1} + (|t|+e)(\log X)^{\ell+1-\gamma}. \]
Setting $\log X = g(t)$, we find $f^{(\ell)}(\sigma+it) \ll g(t)^{\ell+1}$, the claimed bound.
The function $g(t)$ of (16.23) satisfies the condition (16.4). It follows from (16.24) and Lemma 16.1 that $|f(\sigma+it)|^{-1} \le Cg(t)^\lambda$ holds with constants $C > 0$ and $\lambda = 5$ for $1 < \sigma \le 2$ and all $t \in \mathbb{R}$.

Lemma 16.7. Assume that $f$ satisfies
\[ |f^{(\ell)}(\sigma+it)| \le B_1g(t)^{\ell+1} \tag{16.24 bis} \]
for all nonnegative integers $\ell < \gamma - 1$ and
\[ |f(\sigma+it)|^{-1} \le B_2g(t)^5 \tag{16.5 bis} \]
with positive constants $B_1$ and $B_2$, uniformly for $1 < \sigma \le 2$ and all $t \in \mathbb{R}$. Then
\[ \Big|\Big\{\frac{f'(s)}{sf(s)}\Big\}^{(n-1)}\Big| \le \frac{\{Bng(t)^7\}^n}{|s|} \]
holds with some constant $B > 1$ for $1 < \sigma \le 2$ and $1 \le n < \gamma - 1$.

Proof. We apply Lemma 16.2 to estimate $\{f'(s)/f(s)\}^{(n-1)}$. By (16.24 bis) and (16.5 bis), we have for each natural number $j < \gamma - 1$,
\[ \Big|\frac{f^{(j)}(\sigma+it)}{f(\sigma+it)}\Big| \le B_1B_2\,g(t)^{j+6}. \tag{16.25} \]
Let $B_3 := B_1B_2$; we can assume $B_3 > 1$. Thus, for $n \ge 1$ and nonnegative integers $k_1, \dots, k_n$ satisfying $k_1 + \cdots + k_n = k$ and $k_1 + 2k_2 + \cdots + nk_n = n$, we have
\[ \prod_{j=1}^n\Big|\frac{f^{(j)}(\sigma+it)}{f(\sigma+it)}\Big|^{k_j} \le B_3^k\prod_{j=1}^n g(t)^{(j+6)k_j} \le B_3^n g(t)^{7n}. \]
(We have used the observation that $k \le n$.) Hence, by Lemma 16.2, for $1 \le n < \gamma - 1$, all real $t$, and $1 < \sigma \le 2$,
\[ \Big|\Big\{\frac{f'(s)}{f(s)}\Big\}^{(n-1)}\Big| \le \sum_{k=1}^n\sum_{\mathbf{k}} C_{k,\mathbf{k}}\{B_3g(t)^7\}^n \le (n-1)!\,\{2B_3g(t)^7\}^n. \tag{16.26} \]
Now we find derivatives of $f'(s)/\{sf(s)\}$ using Leibniz's rule. For each $n \ge 1$,
\[ \Big\{\frac{f'(s)}{sf(s)}\Big\}^{(n-1)} = \sum_{j=0}^{n-1}\binom{n-1}{j}\frac{(-1)^{n-1-j}(n-1-j)!}{s^{n-j}}\Big\{\frac{f'(s)}{f(s)}\Big\}^{(j)}. \]
Using (16.26) with $j$ in place of $n-1$, we have
\begin{align*}
\Big|\Big\{\frac{f'(s)}{sf(s)}\Big\}^{(n-1)}\Big| &\le \sum_{j=0}^{n-1}\binom{n-1}{j}\frac{(n-1-j)!\,j!}{|s|^{n-j}}\{2B_3g(t)^7\}^{j+1} \\
&\le \frac{(n-1)!}{|s|}\sum_{j=0}^{n-1}\{2B_3g(t)^7\}^{j+1} \le \frac{n!}{|s|}\{2B_3g(t)^7\}^n.
\end{align*}
Noting that $n! \le n^n$, we conclude that
\[ \Big|\Big\{\frac{f'(s)}{sf(s)}\Big\}^{(n-1)}\Big| \le \frac{1}{|s|}\{2B_3ng(t)^7\}^n. \]
Proof of Theorem 16.3. Our next step is to show that the $L_n$ condition of Lemma 16.5 is satisfied. Recall that
\[ L_n := \frac{(-1)^n}{2\pi}\int_{-\infty}^\infty Z^{(n)}(\sigma+it)\,e^{ity-\varepsilon^2t^2/2}\,dt, \]
with
\[ Z(s) := \frac{-\zeta'(s)}{s\zeta(s)} - \frac{1}{s-1} = \frac{-f'(s)}{sf(s)} - \frac{1}{s} - \frac{1}{s^2}, \]
and $f(s) := (s-1)\zeta(s)/s$. If we adjust the O-constant to account also for derivatives of $1/s$ and $1/s^2$, the last lemma implies that
\[ |Z^{(j-1)}(s)| \ll |s|^{-1}\{Bjg(t)^7\}^j, \tag{16.27} \]
provided that $j < \gamma - 1$. By the last estimate, with $n$ in place of $j-1$ (and $n < \gamma - 2$),
\begin{align*}
L_n &\ll_\gamma \int_0^\infty \frac{g(t)^{7(n+1)}}{t+e}\,e^{-\varepsilon^2t^2/2}\,dt = \int_0^\infty (t+e)^{7(n+1)/\gamma-1}e^{-\varepsilon^2t^2/2}\,dt \\
&\ll 1 + \int_0^\infty t^{7(n+1)/\gamma}e^{-\varepsilon^2t^2/2}\,\frac{dt}{t} \ll \varepsilon^{-7(n+1)/\gamma},
\end{align*}
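which is the condition of Lemma 16.5.

Finally, we determine the exponent for the remainder term in Theorem 16.3. In Lemma 16.5 we showed that $\psi(x) = x + O\big(x(\log x)^{-E(\gamma)}\big)$, provided that
\[ E(\gamma) < \frac{\gamma}{7 + (7+\gamma)/n} \]
with $n < \gamma - 2$. Here we obtain an explicit bound for $E(\gamma)$. Take $n$ to be the integer satisfying $\gamma - 3 \le n < \gamma - 2$; then
\[ \frac{\gamma}{7 + (7+\gamma)/n} \ge \frac{\gamma}{7 + (7+\gamma)/(\gamma-3)} = \frac{\gamma}{8} - \frac{5}{32} - \frac{35}{128\gamma - 224} > \frac{\gamma}{8} - \frac{3}{8} =: E(\gamma), \]
provided that $\gamma > 3$. This completes the proof of Theorem 16.3.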
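The rational expression $\gamma/8 - 5/32 - 35/(128\gamma - 224)$ in the exponent can be checked with exact arithmetic. The sketch below assumes that this expression arises from $\gamma/(7 + (7+\gamma)/n)$ evaluated at $n = \gamma - 3$ (an assumption of this illustration, consistent with the constraint $n < \gamma - 2$); the function names are hypothetical.

```python
from fractions import Fraction

def exponent_limit(gamma, n):
    """gamma / (7 + (7 + gamma)/n), the supremum of admissible exponents for given n."""
    return gamma / (7 + (7 + gamma) / n)

def closed_form(gamma):
    """gamma/8 - 5/32 - 35/(128*gamma - 224), the partial-fraction form."""
    return gamma / 8 - Fraction(5, 32) - Fraction(35) / (128 * gamma - 224)

# Sample values of gamma > 3, kept as exact rationals.
samples = [Fraction(7, 2), Fraction(4), Fraction(5), Fraction(11), Fraction(100)]
checks = [
    (exponent_limit(g, g - 3) == closed_form(g), closed_form(g) > (g - 3) / 8)
    for g in samples
]
```

Each pair in `checks` records whether the partial-fraction identity holds exactly and whether the value exceeds $(\gamma-3)/8$, the exponent claimed in Theorem 16.3.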
Remark. The exponent $E(\gamma)$ can be increased slightly if $\gamma$ is restricted to larger values; for example, for $\gamma \ge 11$ we have
\[ \frac{\gamma}{8} - \frac{5}{32} - \frac{35}{128\gamma - 224} > \frac{\gamma}{8} - \frac{3}{16}. \]
In the other direction, there is a limit for this approach, as shown by the inequality
\[ \frac{\gamma}{7 + (7+\gamma)/n} < \frac{\gamma}{7 + (7+\gamma)/(\gamma-2)} = \frac{\gamma}{8} - \frac{9}{64} - \frac{63/64}{8\gamma - 7} < \frac{\gamma}{8} - \frac{9}{64} \]
for all $\gamma > 3$.

16.4. A dlVP-type remainder term

In this section, we study a g-number system $\mathcal{N}$ whose g-integer counting function satisfies
\[ N(x) = Ax + O\big(x\exp\{-c(\log x)^\alpha\}\big) \tag{16.28} \]
with some constants $A > 0$, $c > 0$, and $\alpha \in (0, 1]$. Such a remainder term is smaller than those of the previous section, and we show that the g-prime counting function $\pi(x)$ of $\mathcal{N}$ has a smaller remainder term as well.
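Theorem 16.8. If (16.28) is satisfied then
\[ \pi(x) = \mathrm{li}(x) + O\big(x\exp\{-c'(\log x)^\beta\}\big) \tag{16.29} \]
with $\beta = \alpha/(7+\alpha)$ and some constant $c' > 0$.

Remark. In fact, (16.29) holds with a somewhat larger value of $\beta$, e.g. $\beta > \alpha/(6.91+\alpha)$, as will appear below.

Recall the function
\[ \Pi_c(x) := \int_1^x \frac{1-u^{-1}}{\log u}\,du, \qquad x \ge 1, \tag{1.9 bis} \]
a close relative of $\mathrm{li}(x)$. The proof of the theorem starts with the familiar equation
\[ \frac{1}{s}\log\Big\{\frac{s-1}{s}\zeta(s)\Big\} = \int_1^\infty x^{-s-1}\{\Pi(x) - \Pi_c(x)\}\,dx, \qquad \sigma > 1. \tag{16.30} \]
Differentiating $k$ times gives
\[ \frac{d^k}{ds^k}\Big\{\frac{1}{s}\log\Big(\frac{s-1}{s}\zeta(s)\Big)\Big\} = \int_1^\infty x^{-s-1}(-\log x)^k(\Pi(x) - \Pi_c(x))\,dx, \]
and, by the Plancherel identity with $s = \sigma + it$, for any fixed $\sigma > 1$,
\[ \int_{-\infty}^\infty\Big|\frac{d^k}{ds^k}\Big\{\frac{1}{s}\log\Big(\frac{s-1}{s}\zeta(s)\Big)\Big\}\Big|^2dt = 2\pi\int_1^\infty x^{-2\sigma-1}(\log x)^{2k}\{\Pi(x) - \Pi_c(x)\}^2\,dx. \tag{16.31} \]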
The main steps in proving Theorem 16.8 are to find a bound for the left side of (16.31) and, with a suitable choice of $k$, to extract an estimate for $\Pi(x) - \Pi_c(x)$ on the right side of the equation. We begin with the latter task, which we consider a tauberian problem, because we are deriving a pointwise estimate of $\Pi(x) - \Pi_c(x)$ from an integral of this function.

Lemma 16.9. Suppose that
\[ \int_{-\infty}^\infty\Big|\frac{d^k}{ds^k}\Big\{\frac{1}{s}\log\Big(\frac{s-1}{s}\zeta(s)\Big)\Big\}\Big|^2dt \le (Mk^\eta)^{2k} \tag{16.32} \]
holds with some constants $M > 0$ and $\eta \ge 1$, uniformly for $1 < \sigma \le 2$ and all positive integers $k$. Then
\[ \Pi(x) = \Pi_c(x) + O\big(x\exp\{-c'(\log x)^{1/\eta}\}\big) \tag{16.33} \]
holds with some constant $c' > 0$.

Proof. By (16.31), (16.32), and the monotone convergence theorem, we have
\[ \int_1^\infty x^{-3}\{\Pi(x) - \Pi_c(x)\}^2(\log x)^{2k}\,dx \le (Mk^\eta)^{2k}. \tag{16.34} \]
We apply a simple tauberian method to the last expression. Let
\[ \delta(x) := \Pi(x) - \Pi_c(x), \qquad x \ge 1. \]
Suppose first that $\delta(x) > 0$. Since $\Pi\uparrow$ and $(1-u^{-1})/\log u \le 1$, we see for $y > x$ that
\[ \delta(y) = \delta(x) + \int_x^y\Big\{d\Pi - \frac{1-u^{-1}}{\log u}\,du\Big\} > \delta(x) - (y-x), \]
and so $\delta(y) \ge \delta(x)/2$ for $x \le y \le x + \delta(x)/2 < 3x/2$. (The last inequality holds for all sufficiently large $x$, since $\Pi(x)$ and $\Pi_c(x)$ are each $O(x/\log x)$.) Then, from (16.34),
\[ \Big(\frac{3x}{2}\Big)^{-3}\Big(\frac{\delta(x)}{2}\Big)^3(\log x)^{2k} \le \int_x^{x+\delta(x)/2} y^{-3}\delta(y)^2(\log y)^{2k}\,dy \le (Mk^\eta)^{2k} \]
and hence
\[ \Big(\frac{\delta(x)}{x}\Big)^3 \le 27\Big(\frac{Mk^\eta}{\log x}\Big)^{2k}. \]
On the other hand, if $\delta(x) < 0$, consider an interval to the left of $x$. We have $\delta(y) \le \delta(x)/2$ for $x/2 < x + \delta(x)/2 \le y \le x$, and so
\[ \frac{1}{x^3}\Big(\frac{|\delta(x)|}{2}\Big)^3\Big(\log\frac{x}{2}\Big)^{2k} \le \int_{x+\delta(x)/2}^x y^{-3}\delta(y)^2(\log y)^{2k}\,dy \le (Mk^\eta)^{2k}. \]
Assuming that $x$ is sufficiently large, we find
\[ \Big(\frac{|\delta(x)|}{x}\Big)^3 \le \Big(\frac{3Mk^\eta}{\log x}\Big)^{2k}. \]
Thus, in either case,
\[ \Big(\frac{|\delta(x)|}{x}\Big)^3 \le \Big(\frac{M'k^\eta}{\log x}\Big)^{2k} \]
holds with some constant $M' \ge 1$, uniformly for all $k \in \mathbb{N}$. Then we have $(|\delta(x)|/x)^3 \le \exp\{-c'(\log x)^{1/\eta}\}$ with $c' > 0$ by taking
\[ k = \Big\lfloor\frac{1}{e}\Big(\frac{\log x}{M'}\Big)^{1/\eta}\Big\rfloor. \]
This proves (16.33).
Now we set out to obtain the required bound for the left side of (16.32). Recall the function $f(s) = (s-1)\zeta(s)/s$, $\sigma > 1$. The main contribution to the integrand on the left side of (16.32) comes from terms of the form
\[ \frac{d^n}{ds^n}\log\frac{(s-1)\zeta(s)}{s} = \Big\{\frac{f'(s)}{f(s)}\Big\}^{(n-1)}, \qquad 1 \le n \le k. \]
We find from formula (16.12) that
\[ \Big\{\frac{f'(s)}{f(s)}\Big\}^{(n-1)} = \sum_{\ell=1}^n (-1)^{\ell-1}P_{\ell,n}\Big(\frac{f'(s)}{f(s)}, \dots, \frac{f^{(n)}(s)}{f(s)}\Big). \]
To use this relation, we need upper bounds for $|f|^{-1}$ and for the modulus of derivatives of $f$; we start with the derivatives.
Lemma 16.10. If (16.28) is satisfied then
\[ |f^{(\ell)}(\sigma+it)| \le Bc^{-(\ell+1)/\alpha}\max\Big\{\frac{\ell+1}{\alpha},\ \log(|t|+e)\Big\}^{(\ell+1)/\alpha} \tag{16.35} \]
with some constant $B > 0$, uniformly for $1 < \sigma \le 2$, $t \in \mathbb{R}$, and all integers $\ell \ge 0$.

Proof. We have
\[ f(s) = \Big(1 - \frac{1}{s}\Big)\zeta(s) = \int_{1-}^\infty x^{-s}\Big\{dN(x) - \frac{N(x)\,dx}{x}\Big\} = \int_{1-}^\infty x^{-s}\,dN_1(x), \]
where
\[ N_1(x) := \int_{1-}^x\Big\{dN(u) - \frac{N(u)\,du}{u}\Big\} = N(x) - Ax - \int_1^x (N(u) - Au)\,\frac{du}{u} + A \ll xe^{-c(\log x)^\alpha} \]
(the last by (16.28) and L'Hospital's rule). For any $X > 1$, write
\[ f(s) = \int_{1-}^X x^{-s}\,dN_1(x) - N_1(X)X^{-s} + s\int_X^\infty x^{-s-1}N_1(x)\,dx. \]
Differentiating $\ell$ times, we have by Leibniz's rule
\begin{align*}
(-1)^\ell f^{(\ell)}(s) = {}& \int_{1-}^X x^{-s}\log^\ell x\,dN_1(x) - N_1(X)X^{-s}\log^\ell X \\
&+ s\int_X^\infty x^{-s-1}\log^\ell x\,N_1(x)\,dx - \ell\int_X^\infty x^{-s-1}\log^{\ell-1}x\,N_1(x)\,dx \\
=: {}& I + II + III + IV,
\end{align*}
say. Integration by parts shows
\[ |I| < \int_{1-}^X x^{-1}\log^\ell x\,\Big\{dN(x) + \frac{N(x)\,dx}{x}\Big\} \ll (\log X)^{\ell+1} \]
uniformly for $\sigma > 1$ and all real $t$. Also, on this set, $II \ll e^{-c(\log X)^\alpha}\log^\ell X$ and
\[ III \ll (|t|+e)\int_X^\infty x^{-1}e^{-c(\log x)^\alpha}\log^\ell x\,dx; \]
we estimate this integral using an inequality for the incomplete gamma function. If $x \ge b \ge 1$ then we claim
\[ \Gamma(x, b) := \int_x^\infty y^{b-1}e^{-y}\,dy \le x^be^{-x}. \tag{16.36} \]
The change of integration variable $y = x(1+v)$ gives
\[ \Gamma(x, b) = x^be^{-x}\int_0^\infty (1+v)^{b-1}e^{-xv}\,dv \le x^be^{-x}\int_0^\infty (1+v)^{b-1}e^{-bv}\,dv. \]
The integral on the right hand side is a decreasing function of $b \ge 1$ and equals 1 at $b = 1$. Thus (16.36) holds.
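The incomplete gamma inequality (16.36) can be spot-checked for integer $b$, where the integral has a closed form. This is a minimal sketch, not part of the proof: it assumes the standard finite-sum formula $\int_x^\infty y^{b-1}e^{-y}\,dy = (b-1)!\,e^{-x}\sum_{j<b}x^j/j!$ for positive integers $b$, and it keeps the argument order $\Gamma(x, b)$ used in (16.36). The function names are illustrative.

```python
import math

def upper_incomplete_gamma_int(x, b):
    """Integral of y^(b-1) e^(-y) over [x, inf) for a positive integer b."""
    # Closed form valid for integer b: (b-1)! * e^{-x} * sum_{j<b} x^j / j!
    partial = sum(x ** j / math.factorial(j) for j in range(b))
    return math.factorial(b - 1) * math.exp(-x) * partial

def claimed_bound(x, b):
    """Right-hand side x^b e^{-x} of the inequality (16.36)."""
    return x ** b * math.exp(-x)

# Check the inequality on a small grid with x >= b >= 1.
holds_on_grid = all(
    upper_incomplete_gamma_int(x, b) <= claimed_bound(x, b)
    for b in range(1, 9)
    for x in (float(b), b + 0.5, 2.0 * b, 10.0 * b)
)

# The hypothesis x >= b matters: for x = 1, b = 3 the inequality fails.
fails_without_hypothesis = upper_incomplete_gamma_int(1.0, 3) > claimed_bound(1.0, 3)
```

The second check illustrates why the condition (16.37) is imposed before (16.36) is applied to the integral in $III$.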
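Set $y = c\log^\alpha x$ in the integral estimate for $III$. We find
\begin{align*}
\int_X^\infty x^{-1}e^{-c(\log x)^\alpha}\log^\ell x\,dx &= \frac{1}{\alpha c^{(\ell+1)/\alpha}}\int_{c\log^\alpha X}^\infty y^{(\ell+1)/\alpha-1}e^{-y}\,dy \\
&= \frac{\Gamma(c\log^\alpha X,\ (\ell+1)/\alpha)}{\alpha c^{(\ell+1)/\alpha}} \le \frac{1}{\alpha}\log^{\ell+1}X\exp(-c\log^\alpha X),
\end{align*}
provided that
\[ c\log^\alpha X \ge (\ell+1)/\alpha. \tag{16.37} \]
In this case,
\[ III \ll (|t|+e)\log^{\ell+1}X\exp(-c\log^\alpha X). \]
Similarly, for $\ell \ge 1$, again assuming (16.37),
\[ IV \ll \ell\int_X^\infty x^{-1}e^{-c(\log x)^\alpha}\log^{\ell-1}x\,dx \ll \log^{\ell+\alpha}X\exp(-c\log^\alpha X). \]
Now $II$ and $IV$ are negligible, so we obtain
\[ f^{(\ell)}(s) \ll \log^{\ell+1}X\,\big\{1 + (|t|+e)\exp(-c\log^\alpha X)\big\}. \]
Setting
\[ \log X = \max\Big\{\Big(\frac{\ell+1}{c\alpha}\Big)^{1/\alpha},\ \Big(\frac{\log(|t|+e)}{c}\Big)^{1/\alpha}\Big\}, \]
we find $f^{(\ell)}(\sigma+it) \ll \log^{\ell+1}X$, giving the claimed bound.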
We find bounds for $f'/f$ much as in the last section, but this time we take $g(t) := (\log(|t|+e))^{1/\alpha}$. We want to apply Lemma 16.1, for which we need
\[ |f^{(\ell)}(\sigma+it)| \le Bg(t)^{\ell+1}, \qquad \ell = 0, 1, \tag{16.24 bis} \]
as well as
\[ 1 \le g(nt) \le Dg(t), \qquad n = 1, 2, \dots, 7. \tag{16.4 bis} \]
The last inequalities clearly hold. Also, (16.24 bis) follows from (16.35) by taking a larger constant $B$, because $\alpha \le 1$ and
\[ \max\Big\{\frac{\ell+1}{\alpha},\ \log(|t|+e)\Big\}^{(\ell+1)/\alpha} \le \Big(\frac{2}{\alpha}\Big)^{2/\alpha}\{\log(|t|+e)\}^{(\ell+1)/\alpha}, \qquad \ell = 0, 1. \]
Now by Lemma 16.1, $|f(\sigma+it)|^{-1} \le Cg(t)^\lambda$ holds with constants $C > 0$ and $\lambda = 5$ for $1 < \sigma \le 2$ and all $t \in \mathbb{R}$.

To evaluate the left hand side of (16.32) we give pointwise bounds for derivatives of $(1/s)\log f(s)$ by using Lemma 16.2. Let $C_\ell := ((\ell+1)/\alpha)^{1/\alpha}$.

Lemma 16.11. Assume that the function $f(s) = (s-1)\zeta(s)/s$ satisfies
\[ |f^{(\ell)}(\sigma+it)| \le M_1\big\{c^{-1/\alpha}\max\{C_\ell,\ g(t)\}\big\}^{\ell+1} \tag{16.35 bis} \]
for $\ell = 0, 1, \dots$, and
\[ |f(\sigma+it)|^{-1} \le M_2g(t)^\lambda \tag{16.38} \]
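with positive constants $M_1$, $M_2$ and $\lambda$, uniformly for $1 < \sigma \le 2$. Then
\[ \Big|\frac{d^k}{ds^k}\{s^{-1}\log f(s)\}\Big| \le \frac{1}{|s|}\big\{M_5k\max\{C_k^2,\ g(t)^2\}\,g(t)^\lambda\big\}^k \]
holds with some constant $M_5 > 1$ for $1 < \sigma \le 2$ and all integers $k \ge 1$.

Proof. By (16.35 bis) and (16.38), we have for each $j \ge 1$,
\[ \Big|\frac{f^{(j)}(\sigma+it)}{f(\sigma+it)}\Big| \le M_3\big\{c^{-1/\alpha}\max\{C_j,\ g(t)\}\big\}^{j+1}g(t)^\lambda. \tag{16.39} \]
Thus, for $n \ge 1$ and nonnegative integers $k_1, \dots, k_n$ satisfying $k_1 + \cdots + k_n = k$ and $k_1 + 2k_2 + \cdots + nk_n = n$,
\[ \prod_{j=1}^n\Big|\frac{f^{(j)}(\sigma+it)}{f(\sigma+it)}\Big|^{k_j} \le M_3^k\prod_{j=1}^n\Big\{\big(c^{-1/\alpha}\max\{C_n,\ g(t)\}\big)^{j+1}g(t)^\lambda\Big\}^{k_j} \le \big\{M_4\max\{C_n^2,\ g(t)^2\}\,g(t)^\lambda\big\}^n. \]
(We have used the observations that $C_j \le C_n$ for $1 \le j \le n$ and that $k \le n$.) Hence, by Lemma 16.2, for $n \ge 1$, all real $t$, and $1 < \sigma \le 2$,
\begin{align*}
\Big|\frac{d^n}{ds^n}\log f(s)\Big| = \Big|\Big\{\frac{f'(s)}{f(s)}\Big\}^{(n-1)}\Big| &\le \sum_{k=1}^n\sum_{\mathbf{k}} C_{k,\mathbf{k}}\big\{M_4\max\{C_n^2,\ g(t)^2\}\,g(t)^\lambda\big\}^n \tag{16.40} \\
&\le (n-1)!\,\big\{2M_4\max\{C_n^2,\ g(t)^2\}\,g(t)^\lambda\big\}^n.
\end{align*}
On the same set, we have
\[ |\log f(s)| \ll |s| \tag{16.41} \]
since, by Theorem 16.3, the integral on the right hand side of (16.30) is uniformly bounded for $1 < \sigma$ and all $t \in \mathbb{R}$. Now we find derivatives of $s^{-1}\log f(s)$ using Leibniz's rule. For each positive integer $k$, we have
\[ \frac{d^k}{ds^k}\{s^{-1}\log f(s)\} = \sum_{n=0}^k\binom{k}{n}\frac{(-1)^{k-n}(k-n)!}{s^{k-n+1}}\,\{\log f(s)\}^{(n)}. \]
Using (16.41) and (16.40) with $C_n \le C_k$, we get
\begin{align*}
\Big|\frac{d^k}{ds^k}\{s^{-1}\log f(s)\}\Big| &\le \frac{Bk!}{|s|^k} + \sum_{n=1}^k\binom{k}{n}\frac{(k-n)!\,(n-1)!}{|s|^{k-n+1}}\big\{2M_4\max\{C_n^2,\ g(t)^2\}\,g(t)^\lambda\big\}^n \\
&\le \frac{k!}{|s|}\Big\{B + \sum_{n=1}^k\frac{1}{n}\big\{2M_4\max\{C_k^2,\ g(t)^2\}\,g(t)^\lambda\big\}^k\Big\} \\
&\le \frac{1}{|s|}\big\{M_5k\max\{C_k^2,\ g(t)^2\}\,g(t)^\lambda\big\}^k.
\end{align*}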
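Proof of Theorem 16.8. It remains to show that the integral condition (16.32) is satisfied. If we replace $\max\{C_k^2, g(t)^2\}$ by $C_k^2 + g(t)^2$ in the last lemma, we find for $k \ge 1$,
\[ \int_{-\infty}^\infty\Big|\frac{d^k}{ds^k}\Big\{\frac{1}{s}\log\Big(\frac{s-1}{s}\zeta(s)\Big)\Big\}\Big|^2dt \le (M_5k)^{2k}\Big\{C_k^{4k}\int_{-\infty}^\infty\frac{g(t)^{2\lambda k}}{1+t^2}\,dt + \int_{-\infty}^\infty\frac{g(t)^{(2\lambda+4)k}}{1+t^2}\,dt\Big\}. \tag{16.42} \]
For any $\beta \ge 0$ we have
\begin{align*}
\int_0^\infty\frac{\log^\beta(t+e)\,dt}{1+t^2} &= \int_e^\infty\frac{\log^\beta w\,dw}{1+(w-e)^2} \le (e^2+2)\int_e^\infty\frac{\log^\beta w\,dw}{1+w^2} \tag{16.43} \\
&\le (e^2+2)\int_0^\infty e^{-v}v^\beta\,dv = (e^2+2)\Gamma(\beta+1).
\end{align*}
Applying (16.43) with $\beta = 2\lambda k/\alpha$ and $(2\lambda+4)k/\alpha$ respectively, we find that the right hand side of (16.42) is at most
\[ M_5^{2k}k^{2k}\Big\{\Big(\frac{k+1}{\alpha}\Big)^{4k/\alpha}\Big(\frac{2\lambda k}{\alpha}\Big)^{2\lambda k/\alpha} + \Big(\frac{2(\lambda+2)k}{\alpha}\Big)^{(2\lambda+4)k/\alpha}\Big\} \le \big(Mk^{1+(\lambda+2)/\alpha}\big)^{2k}. \]
It follows with $\eta = 1 + (\lambda+2)/\alpha$, and in particular, with $\eta = 1 + 7/\alpha$, that
\[ \int_{-\infty}^\infty\Big|\frac{d^k}{ds^k}\Big\{\frac{1}{s}\log\Big(\frac{s-1}{s}\zeta(s)\Big)\Big\}\Big|^2dt \le (Mk^\eta)^{2k}, \]
i.e. (16.32) holds uniformly for $1 < \sigma \le 2$ and all positive integers $k$. Thus the conditions of Lemma 16.9 are satisfied, and Theorem 16.8 is established.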
16.5. Notes

§16.3. H. Wegmann [We66] has applied the elementary prime number methodology of E. Wirsing to show that if $N(x) = Ax + O(x\log^{-\gamma}x)$, then $\pi(x) = \mathrm{li}(x) + O(x(\log x)^{-k})$ with any $k < \gamma/3$. This is a stronger result than ours, but, with details, would take us too far afield to present.

Nyman [Ny49] asserted that the results of Corollary 16.4 were equivalent to the statement that “to every $\varepsilon > 0$ and every nonnegative integer $n$, a constant $A = A(\varepsilon, n)$ can be chosen such that
\[ |\zeta^{(n)}(s)| < A|t|^\varepsilon \quad\text{and}\quad |1/\zeta(s)| < A|t|^\varepsilon \tag{16.44} \]
hold uniformly in the region $\sigma > 1$, $|t| \ge \varepsilon$.” It was observed by A. E. Ingham in Math Reviews [MR 0032693] that these conditions do not imply even the PNT. As a simple example, let $\mathcal{P} := \{2, 2, 3, 3, 5, 5, \dots, p, p, \dots\}$ be a set of Beurling generalized primes; that is, $\mathcal{P}$ consists of the rational primes $p$ each taken exactly twice. Let $\mathcal{N}$ be the associated set of g-integers. The function $\zeta_{\mathcal{N}}(s)$ is the square of the Riemann zeta function. It is known from classical number theory that the conditions (16.44) are satisfied for $\zeta^2(s)$. On the other hand, $\mathcal{N}$ consists of rational integers $n$, each repeated $d(n)$ times, where $d(n)$ is the classical divisor function. Here
\[ N(x) = x\log x + (2\gamma - 1)x + O(\sqrt{x}) \]
and
\[ \psi(x) = 2x + O\big(x\exp\{-c\sqrt{\log x}\}\big). \]
Thus neither of the assertions of Corollary 16.4 holds.

L. Tschakaloff (Chakalov) [Ts40] proved the existence of a cosine polynomial $p$ of order 7 with $S(p) = 5.90529\ldots$. An earlier version of his work appears in the Yearbook of the University of Sofia, 19 (1923) (in Bulgarian). This topic is studied more generally by S. Gy. Révész [Rv07]. Formula (16.5) is valid with some positive $\lambda < 4.91$ by use of the full force of Tschakaloff's trigonometric inequality.

Formula (16.12) for the $n$-th derivative of the logarithm of a function is a special case of an inverse Bell polynomial relation or of Faà di Bruno's identity for differentiating composite functions.

§16.4. The main result of this section is based on an article of R. S. Hall [Ha72].
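The counting function of the doubled-primes example in the note on §16.3 is the classical divisor summatory function, so its asymptotic formula is easy to tabulate. The sketch below is illustrative only: the function names are hypothetical, and the symbol $\gamma$ in $(2\gamma - 1)x$ is taken to be Euler's constant, as in the classical Dirichlet divisor formula (not the exponent $\gamma$ of (16.1)).

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def divisor_summatory(x):
    """N(x) = sum_{n <= x} d(n), computed as sum_{n <= x} floor(x/n), x a positive integer."""
    return sum(x // n for n in range(1, x + 1))

def main_term(x):
    """x log x + (2*gamma - 1) x, the main term for the divisor summatory function."""
    return x * math.log(x) + (2.0 * EULER_GAMMA - 1.0) * x

# Discrepancy N(x) - main_term(x), scaled by sqrt(x), for a few x.
scaled_errors = {
    x: (divisor_summatory(x) - main_term(x)) / math.sqrt(x)
    for x in (10 ** 2, 10 ** 3, 10 ** 4)
}
```

The scaled discrepancies stay bounded on these samples, in line with the $O(\sqrt{x})$ term, while $N(x)/x$ grows like $\log x$, so no formula $N(x) = Ax + O(x\log^{-k}x)$ can hold for this system.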
https://doi.org/10.1090//surv/213/17
CHAPTER 17
Optimality of the dlVP Remainder Term

How good is too good?

Summary. Can one establish a better PNT remainder estimate for a discrete g-number system than that of de la Vallée Poussin? Here we construct an example showing that the answer is in general, No. Also, we show, in the other direction, that there exist discrete g-number systems for which the analogue of the Riemann Hypothesis holds.
17.1. Background

The first proofs of the Prime Number Theorem relied on a global analysis of the Riemann zeta function, mainly, its analytic continuability and Hadamard product representation. In this way, C. J. de la Vallée Poussin proved in 1899 that
\[ \zeta(s) \ne 0 \quad\text{for}\quad \sigma > 1 - c/\log t,\ t \ge 2, \tag{17.1} \]
where $c$ is some positive constant. This is the so-called “classical” zero-free region of $\zeta(s)$, from which the PNT was deduced by contour integration, with a remainder term
\[ \pi(x) = \mathrm{li}(x) + O\big(x\exp\{-c\sqrt{\log x}\}\big), \tag{17.2} \]
where $\mathrm{li}(x) = \int_2^x du/\log u$ and $c$ is some positive constant (not necessarily the one in (17.1)).

Early in the twentieth century, E. Landau showed that the classical zero-free region (17.1) could be derived by means of local analysis. He showed also that essentially the same arguments used in proving the PNT could be applied to establish the Prime Ideal Theorem for a fixed algebraic number field. His reasoning in fact proves that if the number $N(x)$ of integral ideals with norm at most $x$ satisfies
\[ N(x) = kx + O(x^\theta) \tag{17.3} \]
with $k > 0$ and $\theta \in (0, 1)$ (a result that had been proved earlier by Weber), then the zeta function of the number field satisfies (17.1) and the number $\pi(x)$ of prime ideals with norm not exceeding $x$ satisfies (17.2).

In this chapter, we revisit questions (2) and (5) of Chapter 1, giving examples that illustrate two cases of extreme behavior of the prime number distributions of (discrete!) g-number systems:

(DLVP) to establish the optimality of (17.1) and (17.2), apart from the numerical value of the constants $c$; the result is given in Theorem 17.14.

(RH) to give, on the other hand, a g-number system for which the Riemann Hypothesis holds for the associated zeta function and the prime counting function satisfies $\pi(x) = \mathrm{li}(x) + O(x^{1/2})$. This result, which is slightly better than the classical conditional estimate, is given in Theorem 17.11.

In each case, the Beurling g-number system that is constructed satisfies (17.3) with $\theta \in (1/2, 1)$ (optimality is not known for $\theta \le 1/2$).

Our arguments show the existence of g-number systems having the desired properties, but they are “ineffective” in the sense that they do not provide explicit constructions. In each case, the process proceeds in two main steps. First, we construct an explicit “template” continuous distribution or, equivalently, a template zeta function. This provides the main term in the distribution of a g-prime system that will then be selected probabilistically. For RH, the template zeta function is $s/(s-1)$; that for DLVP includes $s/(s-1)$ along with a second factor that has infinitely many zeros on $\{\sigma = 1 - 1/\log|t|\}$ and none to the right of this curve. We will show that the Chebyshev function $\psi(x)$ for DLVP satisfies
\[ \limsup_{x\to\infty}\frac{\psi(x) - x}{x\exp\{-2\sqrt{\log x}\}} = 2, \qquad \liminf_{x\to\infty}\frac{\psi(x) - x}{x\exp\{-2\sqrt{\log x}\}} = -2. \]
For the second step, random g-primes are created from a sequence of positive real numbers $\{v_k\}$ as the nonzero values of $\{v_kX_k\}$, where $\{X_k\}$ are independent Bernoulli variables, with a “success” probability $p_k$ assigned to each number $v_k$. The sequence $\{v_k\}$ is chosen to be sufficiently dense in a neighborhood of infinity and $\{p_k\}$ are chosen to mimic the template density. In this case, the expectation of the Fourier transform of a truncation of the random g-prime distribution closely approximates that of the similarly truncated template distribution. Then a familiar inequality of Kolmogorov is used to select g-primes from $\{v_k\}$ so that the Fourier transform of its truncation is suitably close to that of the truncated template distribution, yielding the claimed result in each case.
In the following discussion, $c$ denotes a positive constant which may have different values in several consecutive occurrences. However, where it seems necessary to show clearly the relation between consecutive values of constants, we will instead write $c_1, c_2, \dots$.

17.2. Discrete random approximation

17.2.1. A general form. Our starting point is an inequality variously ascribed to Bernstein or Kolmogorov [DMV06, Lo77]. For completeness, we give a simple proof.

Lemma 17.1. Let $Y_k$, $k = 1, \dots, K$, be independent random variables such that $|Y_k| \le 1$ and the expectation $EY_k = 0$. Also, let $r_k$, $k = 1, \dots, K$, be real numbers such that $|r_k| \le 1$. Let $Y = \sum_{k=1}^K r_kY_k$ and set $\sigma^2 = \operatorname{Var} Y$. Then, for any number $\sigma_e^2 \ge \sigma^2$ and $0 \le v \le 2\sigma_e^2$, we have
\[ P[Y \ge v] \le \exp\{-v^2/(4\sigma_e^2)\}. \]

Proof. Let $\lambda \ge 0$ be a parameter (to be specified later). Then $P[Y \ge v] \le Ee^{\lambda(Y-v)}$. Since the $Y_k$ are independent,
\[ Ee^{\lambda(Y-v)} = e^{-\lambda v}E\prod_{k=1}^K e^{\lambda r_kY_k} = e^{-\lambda v}\prod_{k=1}^K Ee^{\lambda r_kY_k}. \]
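It is easily verified that $e^u \le 1 + u + u^2$ for $-1 \le u \le 1$. Thus, for $0 \le \lambda \le 1$,
\[ Ee^{\lambda r_kY_k} \le E(1 + \lambda r_kY_k + \lambda^2r_k^2Y_k^2) = 1 + \lambda^2r_k^2EY_k^2 \le \exp\{\lambda^2r_k^2EY_k^2\} \]
by the inequality $1 + w \le e^w$, which holds for all real $w$. Hence we have
\[ P[Y \ge v] \le \exp\{-\lambda v + \lambda^2\sigma^2\}. \tag{17.4} \]
If $0 \le v \le 2\sigma_e^2$, by taking $\lambda = v/(2\sigma_e^2)$, we obtain
\[ P[Y \ge v] \le \exp\Big\{-\frac{v^2}{2\sigma_e^2} + \frac{v^2\sigma^2}{4\sigma_e^4}\Big\} \le \exp\Big\{-\frac{v^2}{4\sigma_e^2}\Big\}. \]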
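The tail bound of Lemma 17.1 can be compared with an exactly computable case. The sketch below is an illustration under stated assumptions, not the construction used later in the chapter: it takes $Y_k$ to be independent fair $\pm1$ signs and $r_k = 1$, so that $\sigma^2 = K$, chooses $\sigma_e^2 = \sigma^2$, and evaluates $P[Y \ge v]$ from binomial coefficients. It also checks the elementary inequality $e^u \le 1 + u + u^2$ on a grid. All function names are hypothetical.

```python
import math

def rademacher_tail(K, v):
    """Exact P[Y >= v] for Y = sum of K independent fair +-1 signs."""
    # Y = 2H - K with H ~ Binomial(K, 1/2), so Y >= v iff H >= (K + v)/2.
    h_min = max(math.ceil((K + v) / 2), 0)
    favourable = sum(math.comb(K, h) for h in range(h_min, K + 1))
    return favourable / 2 ** K

def lemma_bound(K, v):
    """exp(-v^2 / (4 sigma_e^2)) with sigma_e^2 = Var Y = K."""
    return math.exp(-v * v / (4.0 * K))

# The lemma's range is 0 <= v <= 2 * sigma_e^2 = 2K.
bound_holds = all(
    rademacher_tail(K, v) <= lemma_bound(K, v)
    for K in (1, 5, 20, 50)
    for v in range(0, 2 * K + 1)
)

# Grid check of e^u <= 1 + u + u^2 on [-1, 1].
exp_inequality_holds = all(
    math.exp(u) <= 1.0 + u + u * u
    for u in (i / 100.0 for i in range(-100, 101))
)
```

For this special case the classical Hoeffding bound $\exp\{-v^2/(2K)\}$ is sharper; the lemma trades that constant for the flexibility of an arbitrary upper estimate $\sigma_e^2$ of the variance.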
We next describe how to make g-numbers using random choices, and then we show that a reasonably simple special case “works” for the two constructions. The argument also will reveal how much latitude there is for our choices. Lemma 17.2. Let 1 ≤ v0 < v1 < · · · < vk < · · · be a sequence such that vk → ∞ as k → ∞. Also, let {pk }∞ k=1 be a sequence of real numbers such that 0 < pk ≤ 1 for k ≥ k0 . Assume that there exists an increasing function F (x) on (−∞, ∞) with support on [1, ∞) which satisfies pk F (x), vk ≤x
pk
$ F (x) (1 + log x) ,
x 1,
∞
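\[ \zeta_C(s) = \exp\Big\{\int_1^\infty v^{-s}f_C(v)\,dv\Big\}, \tag{17.43} \]
with
\[ f_C(v) := \frac{1 - v^{-1}}{\log v} - 2\sum_{k\ge1}\frac{g(v^{1/\ell_k})}{\ell_k}\,v^{-1/\ell_k}\cos(\gamma_k\log v) \tag{17.44} \]
for $v \ge 1$. We have $f_C(v) > 0$ for $v > 1$. Furthermore there is a constant $c \in (0, 1)$ such that
\[ (1-c)\,\frac{1 - v^{-1}}{\log v} \le f_C(v) \le (1+c)\,\frac{1 - v^{-1}}{\log v} \tag{17.45} \]
holds for $v \ge e^4$.

Proof. Recall from (17.39) that our template zeta function is represented as $s/(s-1)$ times a product of $G$ factors involving zeros $\{\rho_k\}$ and their conjugates. For $\sigma > 1$, $\Re\{\ell_k(s - \rho_k)\} > 1$, and so, by Lemma 17.16,
\[ G(\ell_k(s - \rho_k)) = \exp\Big\{-\int_1^\infty\frac{g(v^{4^{-k}})}{4^k}\,v^{-4^{-k}+i\gamma_k}\,v^{-s}\,dv\Big\}. \]
Lemma 17.20 implies that
\[ \sum_{k\ge1} g(v^{4^{-k}})\,4^{-k}\,v^{-4^{-k}} \ll 1 \]
for $v \ge e^4$, and hence
\[ \sum_{k=1}^\infty\int_1^\infty\frac{g(v^{4^{-k}})}{4^k}\,v^{-4^{-k}}\cos(\gamma_k\log v)\,v^{-s}\,dv \]
converges absolutely for $\sigma > 1$. Thus we can write
\begin{align*}
\prod_{k=1}^\infty G(\ell_k(s - \rho_k))\,G(\ell_k(s - \bar\rho_k)) &= \prod_{k=1}^\infty\exp\Big\{-2\int_1^\infty\frac{g(v^{4^{-k}})}{4^k}\,v^{-4^{-k}}\cos(\gamma_k\log v)\,v^{-s}\,dv\Big\} \\
&= \exp\Big\{\int_1^\infty\Big(-2\sum_{k=1}^\infty\frac{g(v^{4^{-k}})}{4^k}\,v^{-4^{-k}}\cos(\gamma_k\log v)\Big)v^{-s}\,dv\Big\}.
\end{align*}
We showed in Theorem 1.5 that
\[ \frac{s}{s-1} = \exp\Big\{\int_1^\infty v^{-s}\,\frac{1 - v^{-1}}{\log v}\,dv\Big\}. \]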
Combining the preceding formulas, we obtain (17.43). For $v < e^4$, we have $v^{1/\ell_k} < e$ and so $g(v^{1/\ell_k})$ in (17.44) is 0. Thus
\[ f_C(v) = \frac{1 - v^{-1}}{\log v} > 0. \]
On the other hand, for $v \ge e^4$, (17.45) with $c = 2c_1/(1 - e^{-4})$ follows from (17.41), since $2c_1 < 1 - e^{-4}$.
17.8. Asymptotics of $N_B(x)$

The last lemma shows that $f_C(v)$ satisfies the conditions of Lemma 17.7. Thus there exists a sequence $\{p_j\} := \{v_{k_j}\}$, $j = 1, 2, \dots$, such that
\[ \Big|\sum_{p_j\le x} p_j^{-it} - \int_1^x v^{-it}f_C(v)\,dv\Big| \ll \sqrt{x}\,\Big\{1 + \sqrt{\frac{\log(t+1)}{1 + \log x}}\Big\} \tag{17.46} \]
for $1 \le x < \infty$ and $t \ge 0$. In particular,
\[ \pi_B(x) = \sum_{p_j\le x} 1 = \int_1^x f_C(v)\,dv + O(x^{1/2}). \tag{17.47} \]
We proceed as in the proof of Theorem 17.11. Let
\begin{align*}
F_1(s) &= \int_1^\infty\{v^{-s} + \log(1 - v^{-s})\}\,d\pi_B(v), \\
F_2(s) &= \int_1^\infty v^{-s}\{d\pi_B(v) - f_C(v)\,dv\}.
\end{align*}
By (17.47) and (17.45), $F_1(s)$ and $F_2(s)$ converge uniformly for $\sigma \ge \frac12 + \varepsilon$ for each $\varepsilon > 0$ and hence are analytic for $\sigma > 1/2$. The sought-after zeta function $\zeta_B(s)$ is convergent for $\sigma > 1$ and can be written as
\[ \zeta_B(s) = \zeta_C(s)\exp\{-F_1(s) + F_2(s)\}. \]
It follows that $\zeta_B(s)$ has an analytic continuation to the same half plane as $\zeta_C(s)$, and it has there the same set of zeros and poles. In particular, $\zeta_B(s)$ has infinitely many zeros on the curve $\sigma = 1 - 1/\log|t|$, $|t| \ge e^2$, and no zeros to its right. Also, it has a simple pole at $s = 1$ with residue $k_2 > 0$. (This number will, as usual, be the density of the g-integer system.)

To establish the asymptotics of $N_B(x)$, we need estimates of $\zeta_B(s)$ both close to and far from zeros.

Lemma 17.22. Given $x \ge x_0$ $(> 1)$, let $\sigma_1 = 1/2 + (\log x)^{-1/3}$. The following estimates hold uniformly for $\sigma_1 \le \sigma \le 2$, $t \ge 0$, and $|s - 1| \ge c > 0$.

(i) if $|t - \gamma_k| > \frac12\gamma_k$ for all $k \in \mathbb{N}$, then
\[ \zeta_B(\sigma+it) \ll \exp\Big\{c\Big((\log x)^{1/3} + (\log x)^{1/6}\sqrt{\log(t+1)}\Big)\Big\}; \]
(ii) if $|t - \gamma_{k_0}| \le \frac12\gamma_{k_0}$ for some $k_0 \in \mathbb{N}$, then
\[ \zeta_B(\sigma+it) \ll \exp\Big\{c\Big((\log x)^{1/3} + (\log x)^{1/6}\sqrt{\log(t+1)}\Big)\Big\}\times\Big\{1 + \frac{t}{4^{k_0}|\sigma+it-\rho_{k_0}|}\Big\}. \]

Remark 17.23. Our argument actually rules out the apparent blow-up of the last estimate at $s = \rho_{k_0}$, but we do not need the sharper bound.

Proof. Analogously with the proof of Theorem 17.11 (there with formulas (17.17) and (17.18) and Lemma 17.13), we have
\[ |\zeta_B(\sigma+it)| \le |\zeta_C(\sigma+it)|\exp\Big\{c\Big((\log x)^{1/3} + (\log x)^{1/6}\sqrt{\log(t+1)}\Big)\Big\} \tag{17.48} \]
uniformly for $\sigma_1 \le \sigma \le 2$ and $t \ge 0$. Now we examine $\zeta_C(\sigma+it)$, omitting the factor $s/(s-1)$, since we excluded a neighborhood of $s = 1$. By (17.38), if $|t - \gamma_k| > \gamma_k/2$, then
\[ G(4^k(s - \rho_k)) = 1 + \frac{4\theta_1}{e^2\,4^k}, \]
and $G(4^k(s - \bar\rho_k))$ satisfies the same equality. Thus, if $|t - \gamma_k| > \frac12\gamma_k$ holds for all $k \in \mathbb{N}$, then the remaining factors of $\zeta_C(\sigma+it)$ satisfy
\[ \Big|\prod_{k=1}^\infty G(4^k(s - \rho_k))\,G(4^k(s - \bar\rho_k))\Big| \le \prod_{k=1}^\infty\Big(1 + \frac{4}{e^2\,4^k}\Big)^2 \ll 1, \]
uniformly for $\sigma \ge 1/2$ and all such $t \ge 0$. This inequality together with (17.48) proves (i).

Next, we prove (ii). For any given $t \ge 0$, there is at most one $k$ such that $|t - \gamma_k| \le \gamma_k/2$, i.e., $\gamma_k/2 \le t \le 3\gamma_k/2$; indeed, the big gap between $\gamma_k$ and $\gamma_{k+1}$ ensures that $3\gamma_k/2 < \gamma_{k+1}/2$. If there is some $k_0$ such that $|t - \gamma_{k_0}| \le \frac12\gamma_{k_0}$, then $|t - \gamma_k| > \frac12\gamma_k$ for all $k \ne k_0$. Hence
\[ \prod_{\substack{k=1\\ k\ne k_0}}^\infty G(4^k(s - \rho_k))\,G(4^k(s - \bar\rho_k)) \ll 1 \]
uniformly for $\sigma \ge 1/2$ and for $t$ satisfying $|t - \gamma_{k_0}| \le \frac12\gamma_{k_0}$ with some $k_0$. Then
\[ \frac{s-1}{s}\,\zeta_C(\sigma+it) \ll |G(4^{k_0}(\sigma+it-\rho_{k_0}))|, \tag{17.49} \]
since
\[ G(4^{k_0}(s - \bar\rho_{k_0})) = 1 + \frac{4\theta_1}{e^2\,4^{k_0}} \ll 1. \]
To estimate the right hand side of (17.49), suppose first that
\[ 1 - 2\cdot4^{-k_0} =: \sigma_2 \le \sigma \le 2. \]
In this case, $Z := \Re\{4^{k_0}(\sigma+it-\rho_{k_0})\} = 4^{k_0}(\sigma - 1) + 1 \ge -1$, and by (17.28), $G(4^{k_0}(\sigma+it-\rho_{k_0})) \ll 1$. Now suppose $\sigma_1 \le \sigma < \sigma_2$. In this case $-Z > 1 > 0$ and so
\[ -Z < -2Z \le -2\cdot\{4^{k_0}(\sigma_1 - 1) + 1\} =: -2Z_1. \]
Applying (17.26), the right hand side of (17.49) is bounded by
\[ 1 + \frac{2\exp\{-2Z_1\}}{4^{k_0}|\sigma+it-\rho_{k_0}|}. \]
We have
\begin{align*}
\exp\{-2Z_1\} &= \exp\{-2\cdot4^{k_0}(\sigma_1 - 1) - 2\} \le \exp\big\{-2\cdot4^{k_0}\big(-1/2 + (\log x)^{-1/3}\big)\big\} \\
&= \gamma_{k_0}\exp\Big\{\frac{-2\log\gamma_{k_0}}{(\log x)^{1/3}}\Big\} \le \gamma_{k_0} \ll t.
\end{align*}
It follows that
\[ G(4^{k_0}(\sigma+it-\rho_{k_0})) \ll 1 + \frac{t}{4^{k_0}|\sigma+it-\rho_{k_0}|} \]
for $\sigma_1 \le \sigma < \sigma_2$ and $t$ satisfying $|t - \gamma_{k_0}| \le \frac12\gamma_{k_0}$. Combining this estimate with (17.49) and (17.48) establishes (ii), which completes the proof of the lemma.

Proceeding as in the proof of Theorem 17.11, write
\[ N_B(x) \le \lim_{T\to\infty}\frac{1}{2\pi i}\int_{b-iT}^{b+iT}\zeta_B(s)\,\frac{(x+1)^{s+1} - x^{s+1}}{s(s+1)}\,ds. \]
We apply contour integration to evaluate the last integral. Let $T_n = 2\gamma_n$. Consider the contour integral taken along the rectangle with vertices $b \pm iT_n$ and $\sigma_1 \pm iT_n$. Then $|\pm T_n - \gamma_k| > \frac12\gamma_k$ for all $k \in \mathbb{N}$. By (i) of the last lemma, along the top of the rectangle we have
\[ \int_{\sigma_1+iT_n}^{b+iT_n}\zeta_B(s)\,\frac{(x+1)^{s+1} - x^{s+1}}{s(s+1)}\,ds \ll \frac{(x+1)^{b+1}}{T_n^2}\exp\Big\{c\Big((\log x)^{1/3} + (\log x)^{1/6}\sqrt{\log(T_n+1)}\Big)\Big\} = o(1) \]
as $n \to \infty$. A similar inequality holds for the integral along the bottom side of the rectangle. Using the residue theorem, we shift the integration path to the line $\sigma = \sigma_1$. Then break the integral at $t = x$ and apply (17.21) to the two parts to obtain
\begin{align*}
N_B(x) &\le k_2\cdot(x + 1/2) + \lim_{n\to\infty}\frac{1}{2\pi i}\int_{\sigma_1-iT_n}^{\sigma_1+iT_n}\zeta_B(s)\,\frac{(x+1)^{s+1} - x^{s+1}}{s(s+1)}\,ds \\
&= k_2\cdot(x + 1/2) + O(I_1) + O(I_2),
\end{align*}
with $k_2$ the residue of $\zeta_B(s)$ at $s = 1$,
\[ I_1 := x^{\sigma_1}\int_0^x|\zeta_B(\sigma_1+it)|\,\frac{dt}{t+1} \quad\text{and}\quad I_2 := x^{\sigma_1+1}\int_x^\infty|\zeta_B(\sigma_1+it)|\,\frac{dt}{t^2}. \]
To estimate these integrals, we next show that
\[ \int_T^{2T}|\zeta_B(\sigma_1+it)|\,dt \ll T\exp\{c(\log x)^{2/3}\}, \qquad T \le x, \tag{17.50} \]
and
\[ \int_T^{2T}|\zeta_B(\sigma_1+it)|\,dt \ll T\exp\{c(\log x)^{1/6}\sqrt{\log T}\}, \qquad T > x. \tag{17.51} \]
In (17.50), suppose first that there exists no $\gamma_k \in [2T/3, 4T]$. In this case, $|t - \gamma_k| > \gamma_k/2$ holds for each point $t \in [T, 2T]$. Indeed, if $\gamma_k > 4T$, then $\gamma_k - t \ge \gamma_k - 2T > \gamma_k/2$; if $\gamma_k < 2T/3$, then $t - \gamma_k \ge T - \gamma_k > \gamma_k/2$. Now Case (i) of Lemma 17.22 applies, and we have $\zeta_B(\sigma_1 + it) \ll \exp\{c(\log x)^{2/3}\}$ for $T \le t \le 2T$, which establishes (17.50).

On the other hand, suppose $\gamma_K \in [2T/3, 4T]$ for some $K$. To avoid multiple cases, we extend the integral in (17.50) to the interval $[\gamma_K/4, 3\gamma_K]$. It is easy to see that $\gamma_K/4 \le T$ and $2T \le 3\gamma_K$. Since estimate (ii) of Lemma 17.22 is at least as large as that of (i), it is valid to use (ii) for all $t$ in the integration range. We find
$$\zeta_B(\sigma_1 + it) \ll \exp\{c(\log x)^{2/3}\}\Big(1 + \frac{t}{4^K\,|\sigma_1 + it - \rho_K|}\Big).$$
Now the denominator of the last expression is
$$4^K\,\big|-1/2 + (\log x)^{-1/3} + 4^{-K} + i(t - \gamma_K)\big| \gg 4^K(1 + |t - \gamma_K|),$$
and so
$$\int_{\gamma_K/4}^{3\gamma_K}|\zeta_B(\sigma_1 + it)|\,dt \ll \exp\{c(\log x)^{2/3}\}\int_{\gamma_K/4}^{3\gamma_K}\Big(1 + \frac{t}{4^K(1 + |t - \gamma_K|)}\Big)dt.$$
The last integral is bounded by
$$3\gamma_K + 2\cdot 4^{-K}\cdot 3\gamma_K\int_0^{2\gamma_K}\frac{dv}{1+v} \ll \gamma_K \ll T,$$
and so (17.50) is true in this case too. The same argument, with the obvious modification in applying Lemma 17.22, yields (17.51).

We return to the estimates of $N_B(x)$. By (17.50), we have
$$I_1 = x^{\sigma_1}\int_0^x |\zeta_B(\sigma_1 + it)|\,\frac{dt}{t+1} \ll x^{1/2}\exp\{c(\log x)^{2/3}\}\sum_{0\le k\le \frac{\log x}{\log 2}}\frac{2^{k+1}}{x}\int_{x/2^{k+1}}^{x/2^k}|\zeta_B(\sigma_1 + it)|\,dt \ll x^{1/2}\exp\{c(\log x)^{2/3}\}.$$
Also, from (17.51), we have
$$I_2 = x^{\sigma_1+1}\int_x^\infty |\zeta_B(\sigma_1 + it)|\,\frac{dt}{t^2} \ll x^{\sigma_1+1}\sum_{k=0}^\infty 2^{-2k}x^{-2}\int_{2^k x}^{2^{k+1}x}|\zeta_B(\sigma_1 + it)|\,dt \ll x^{\sigma_1}\sum_{k=0}^\infty 2^{-k}\exp\{c(\log x)^{1/6}(\log\{2^k x\})^{1/2}\}.$$
Substituting for $\sigma_1$, the right hand side becomes
$$x^{1/2}\exp\{c(\log x)^{2/3}\}\sum_{k=0}^\infty\frac{1}{2^k}\exp\{c(\log x)^{1/6}(\log\{2^k x\})^{1/2}\}.$$
We split the last sum into two parts, over the ranges $2^k \le x$ and $2^k > x$ respectively. On the first range $(\log\{2^k x\})^{1/2} \ll (\log x)^{1/2}$, and so the contribution of the first part is $\ll \exp\{c(\log x)^{2/3}\}$. The second part is
$$\ll \sum_{k > \frac{\log x}{\log 2}} 2^{-k}\exp\{c(\log\{2^k\})^{2/3}\} \le \sum_{k=1}^\infty 2^{-k/2} \ll 1,$$
since for $k > \frac{\log x}{\log 2} \ge 8c^3/\log 2$,
$$2^{-k}\exp\{c(\log\{2^k\})^{2/3}\} \le 2^{-k/2}.$$
Putting together the estimates, we find that
$$x^{\sigma_1+1}\int_x^\infty |\zeta_B(\sigma_1 + it)|\,\frac{dt}{t^2} \ll x^{1/2}\exp\{c(\log x)^{2/3}\},$$
and hence
$$N_B(x) \le k_2\cdot(x + 1/2) + O\big(x^{1/2}\exp\{c(\log x)^{2/3}\}\big).$$
A similar argument shows that
$$N_B(x) \ge k_2\cdot(x - 1/2) + O\big(x^{1/2}\exp\{c(\log x)^{2/3}\}\big),$$
and together, the two inequalities for $N_B(x)$ establish (i) of Theorem 17.14.

17.9. Asymptotics of $\psi_B(x)$

To complete the proof of Theorem 17.14, we analyze the asymptotics of $\psi_B(x)$ with a direct calculation. This will yield the sharp $\Omega_\pm$-constants of (iv). Recall from §1.2 that the Chebyshev function is defined by
$$\psi_B(x) := \sum_{p_i^{\alpha_i}\le x}\log p_i = \int_1^x \log v\,d\Pi_B(v),$$
where, as usual,
$$\Pi_B(x) := \pi_B(x) + \tfrac12\pi_B(x^{1/2}) + \tfrac13\pi_B(x^{1/3}) + \cdots.$$
Lemma 1.1 implies that $\Pi_B(x) = \pi_B(x) + O(x^{1/2})$. (A slightly better remainder term can be shown, but that is not needed here.) The preceding formula and integration by parts give
$$\psi_B(x) = \int_1^x \log v\,d\pi_B(v) + O(x^{1/2}\log x).$$
Define the template psi function by
$$\psi_C(x) := \int_1^x \log v\,f_C(v)\,dv.$$
Then, by (17.47) and more integration by parts,
$$\psi_B(x) - \psi_C(x) = \int_1^x \log v\,\{d\pi_B(v) - f_C(v)\,dv\} + O(x^{1/2}\log x) = O(x^{1/2}\log x).$$
Given $x > e^4$, define $K \in \mathbb{N}$ by $\gamma_K \le x < \gamma_{K+1}$. (As usual, $\gamma_k = e^{4^k}$.) By (17.44), $\psi_C(x) = x - 2F(x) + O(\log x)$, where
$$F(x) := \sum_{k=1}^K\int_{\gamma_k}^x (\log v)\,4^{-k}\,g(v^{4^{-k}})\,v^{-4^{-k}}\cos(\gamma_k\log v)\,dv =: \sum_{k=1}^K I_k(x),$$
say. To show the oscillation of $\psi_B$, it suffices to prove that
$$\limsup_{x\to\infty}\frac{F(x)}{x\exp\{-2\sqrt{\log x}\}} = 1 \tag{17.52}$$
and
$$\liminf_{x\to\infty}\frac{F(x)}{x\exp\{-2\sqrt{\log x}\}} = -1. \tag{17.53}$$
We shall show that the behavior of $F(x)$ is dominated by the contribution of a single term $I_{k_0}(x)$ or $I_{k_0+1}(x)$ with $k_0 = K/2$ or $k_0 = (K-1)/2$. For $k \le K-2$, change the variable and write
$$I_k(x) := 4^k\int_e^{x^{4^{-k}}} g(u)\,(\log u)\,u^{4^k-2}\cos(4^k\gamma_k\log u)\,du. \tag{17.54}$$
We split the integration interval into two subintervals $[e, e^5]$ and $[e^5, x^{4^{-k}}]$ and let $I_{k,1}$ and $I_{k,2}(x)$ denote the respective partial integrals. We estimate $I_{k,1}$ trivially, using the information that $g(u)\log u$ and the cosine are bounded. We find that
$$I_{k,1} \ll (e^5)^{4^k} \le x^{5/16}.$$
To estimate $I_{k,2}(x)$, we exploit the bounds we have found for $g(u)\log u$ and its derivative. Integrating by parts and applying Lemma 17.18, we obtain
$$\begin{aligned}\gamma_k I_{k,2}(x) ={}& x^{1-4^{-k}}\sin(\gamma_k\log x) + O\big(x^{1-4^{-k}(1+\frac12\log\{\pi/2\})}\big) \\ &- \int_{e^5}^{x^{4^{-k}}}\sin(4^k\gamma_k\log u)\,u^{4^k-1}\,(g(u)\log u)'\,du \\ &- \int_{e^5}^{x^{4^{-k}}}\sin(4^k\gamma_k\log u)\,g(u)(\log u)(4^k-1)\,u^{4^k-2}\,du.\end{aligned} \tag{17.55}$$
(The integrated term at the lower limit has order $\exp(5\cdot 4^k)$; when divided by $\gamma_k$, this is $O(x^{1/4})$ for $k \le K-2$.) By (17.36), the first integral in (17.55) is
$$\ll \int_{e^5}^{x^{4^{-k}}} u^{4^k-2-\frac12\log(\pi/2)}\,du \ll 4^{-k}\,x^{1-4^{-k}(1+\frac12\log\{\pi/2\})}.$$
To estimate the second integral in (17.55), subtract and add 1 in the integrand. Apply Lemma 17.18 (again) to the part with $g(u)\log u - 1$; we find
$$(4^k-1)\int_{e^5}^{x^{4^{-k}}}\sin(4^k\gamma_k\log u)\,(g(u)\log u - 1)\,u^{4^k-2}\,du \ll 4^k\int_{e^5}^{x^{4^{-k}}} u^{4^k-2-\frac12\log(\pi/2)}\,du \ll x^{1-4^{-k}(1+\frac12\log\{\pi/2\})}.$$
For the remaining portion of the integral, integration by parts (or application of the second mean value theorem) yields
$$(4^k-1)\int_{e^5}^{x^{4^{-k}}}\sin(4^k\gamma_k\log u)\,u^{4^k-2}\,du \ll \frac{x^{1-4^{-k}}}{\gamma_k}.$$
Together, we obtain
$$I_k(x) = \frac{x^{1-4^{-k}}\sin(\gamma_k\log x)}{\gamma_k} + O\bigg(\frac{x^{1-4^{-k}(1+\frac12\log(\pi/2))}}{\gamma_k} + \frac{x^{1-4^{-k}}}{\gamma_k^2} + x^{5/16}\bigg) \tag{17.56}$$
for $k \le K-2$. We estimate the maximum of each of the error terms as $k$ varies. With $4^k = \alpha$, the first error term in (17.56) is
$$x\exp\Big\{-\frac{(\log x)(1+\frac12\log(\pi/2))}{\alpha} - \alpha\Big\}.$$
By calculus, the maximum of this expression for $\alpha > 0$ is
$$x\exp\Big\{-2\sqrt{(1 + (1/2)\log(\pi/2))\log x}\Big\} < x\exp\{-2.21\sqrt{\log x}\}.$$
A similar calculation shows the second error estimate is at most
$$x\exp\{-2\sqrt2\sqrt{\log x}\}.$$
The third error term in (17.56) is independent of $k$ and is smaller than the estimates of the other two. We use the estimate of the first error term in (17.56) for $1 \le k \le K-2$ and find that the sum of the $I_k$ errors has order
$$x\exp\{-2.21\sqrt{\log x}\}\log\log x \ll x\exp\{-2.2\sqrt{\log x}\}.$$
Thus
$$\sum_{k=1}^{K-2} I_k(x) = \sum_{k=1}^{K-2}\frac{x^{1-4^{-k}}\sin(\gamma_k\log x)}{\gamma_k} + O\big(x\exp\{-2.2\sqrt{\log x}\}\big).$$
It remains to estimate $I_K(x)$ and $I_{K-1}(x)$. We note from the definition of $K$, viz. $\gamma_K \le x < \gamma_{K+1}$, that $x^{4^{-k}}$, the upper limit of the integration range in (17.54), can extend up to $\exp(4^{K+1-k})$, that is, up to $e^4$ for $I_K(x)$ and up to $e^{16}$ for $I_{K-1}(x)$. The key fact we shall use (Lemma 17.16) is that $g(u)$ is a polynomial in $\log u$ of degree at most $m-1$ on each interval $(e^m, e^{m+1}]$. Combining formula (17.54) with the representation of $g(u)$ as a polynomial, we can express $I_k(x)$ with $k = K$ or $K-1$ as a linear combination of terms
$$I_k(x, a) := 4^k\int_{e^a}^{\mu}(\log u)^j\,u^{4^k-1}\cos(4^k\gamma_k\log u)\,du/u,$$
with $1 \le a < 4^{-k}\log x$ and $\mu = \min(e^{a+1}, x^{4^{-k}}) \le e^{16}$. Integrating by parts yields
$$\begin{aligned} I_k(x, a) &= \gamma_k^{-1}\Big[\sin(4^k\gamma_k\log u)\,u^{4^k-1}(\log u)^j\Big]_{e^a}^{\mu} - \gamma_k^{-1}\int_{e^a}^{\mu}\sin(4^k\gamma_k\log u)\,\{u^{4^k-1}(\log u)^j\}'\,du \\ &\ll \gamma_k^{-1}\mu^{4^k}(\log\mu)^j + \gamma_k^{-1}\int_{e^a}^{\mu}\big|\{u^{4^k-1}(\log u)^j\}'\big|\,du.\end{aligned}$$
Now $u^{4^k-1}(\log u)^j$ is clearly increasing, $\mu^{4^k} \le x$, $\log\mu \le 16$, and
$$\gamma_k^{-1} = \begin{cases} e^{-4^K} \le x^{-1/4}, & k = K, \\ e^{-4^{K-1}} \le x^{-1/16}, & k = K-1. \end{cases}$$
Thus we have $I_K(x) \ll x^{3/4}$ and $I_{K-1}(x) \ll x^{15/16}$, and each of these quantities is $\ll x\exp\{-2.2\sqrt{\log x}\}$. Thus we arrive at the representation
$$F(x) = \sum_{k=1}^{K-2}\frac{x^{1-4^{-k}}\sin(\gamma_k\log x)}{\gamma_k} + O\big(x\exp\{-2.2\sqrt{\log x}\}\big). \tag{17.57}$$
We are now close to establishing the desired estimates (17.52) and (17.53). To analyze the last formula, we first look where the maximum of
$$x^{-4^{-k}}/\gamma_k = \exp\{-4^{-k}\log x - 4^k\}$$
occurs. Setting $\alpha := 4^k$, we see that $\exp\{-(\log x)/\alpha - \alpha\}$ achieves its maximum when $\alpha = \sqrt{\log x}$ (regarding $\alpha$ as a real variable). Given $x$ sufficiently large, let
$$4^{k_0} \le \sqrt{\log x} < 4^{k_0+1}.$$
The main contribution to the sum in (17.57) comes from the terms having indices $k_0$ or $k_0+1$. Indeed, we see that
$$\sum_{\substack{1\le k\le K-2 \\ k\ne k_0,\,k_0+1}}\frac{x^{1-4^{-k}}}{\gamma_k} = O\big(x\exp\{-4\sqrt{\log x}\}\big)$$
by evaluating the sum at the next-largest terms, $k = k_0-1$ and $k = k_0+2$, and then multiplying by $K - 4 \ll \log\log x$. The last sum is smaller than the error term in (17.57), so we drop all summands from that formula except those with indices $k_0$ and $k_0+1$. We now have
$$F(x) = \sum_{k=k_0,\,k_0+1}\frac{x^{1-4^{-k}}\sin(\gamma_k\log x)}{\gamma_k} + O\big(x\exp\{-2.2\sqrt{\log x}\}\big). \tag{17.58}$$
In choosing $k_0$, we noted that $x^{1-4^{-k}}/e^{4^k}$ has its maximum at $4^k = \sqrt{\log x}$. (In this instance, $k$ is not an integer, in general.) The value at the maximum is $x\exp(-2\sqrt{\log x})$. Next, we show that at most one of the terms with indices $k_0$, $k_0+1$ can achieve this maximum. Suppose first that $4^{k_0} \le \sqrt{\log x} < 2\cdot 4^{k_0}$. In this case,
$$\frac{x^{1-4^{-k_0-1}}}{\gamma_{k_0+1}} \le x\exp\{-(2.25)\sqrt{\log x}\}.$$
On the other hand, if $2\cdot 4^{k_0} \le \sqrt{\log x} < 4^{k_0+1}$, again we have
$$\frac{x^{1-4^{-k_0}}}{\gamma_{k_0}} \le x\exp\{-(2.25)\sqrt{\log x}\}.$$
It follows that
$$\limsup_{x\to\infty}\frac{|F(x)|}{x\exp\{-2\sqrt{\log x}\}} \le 1.$$
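The two case bounds above can be checked numerically. Writing $4^k = c\sqrt{\log x}$ gives $x^{1-4^{-k}}/\gamma_k = x\exp\{-(c + 1/c)\sqrt{\log x}\}$, so the question is how small $c + 1/c$ can be for the index that is not favored. The following Python sketch is an illustration only; the function names and the sample grid of values of $\log x$ are assumptions made for this example and do not come from the text.

```python
import math

def decay_coefficient(log_x, k):
    """Return a with x^(1 - 4^-k)/gamma_k = x*exp(-a*sqrt(log x)),
    where gamma_k = exp(4^k)."""
    alpha = 4.0 ** k
    return (log_x / alpha + alpha) / math.sqrt(log_x)

def two_candidates(log_x):
    """Return (k0, a_k0, a_k0plus1) with 4^k0 <= sqrt(log x) < 4^(k0+1)."""
    root = math.sqrt(log_x)
    k0 = 0
    while 4.0 ** (k0 + 1) <= root:
        k0 += 1
    return k0, decay_coefficient(log_x, k0), decay_coefficient(log_x, k0 + 1)

def suppressed_coefficient(log_x):
    """Coefficient of the term that the text bounds by x*exp(-2.25*sqrt(log x))."""
    root = math.sqrt(log_x)
    k0, a0, a1 = two_candidates(log_x)
    # First case: 4^k0 <= root < 2*4^k0, so the k0+1 term is the small one;
    # otherwise the k0 term is the small one.
    return a1 if root < 2.0 * 4.0 ** k0 else a0

# Hypothetical sample of log x values, geometrically spaced from 16 upward.
samples = [16.0 * 1.07 ** j for j in range(400)]
worst_suppressed = min(suppressed_coefficient(L) for L in samples)
best_overall = min(min(two_candidates(L)[1:]) for L in samples)
print(worst_suppressed, best_overall)
```

On this sample the suppressed term never has a coefficient below 2.25, while the smaller of the two coefficients never drops below 2, in agreement with the limsup bound.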
It remains to show that $F(x)/(x\exp\{-2\sqrt{\log x}\})$ approaches $\pm1$ infinitely often. By (17.58) and the analysis that follows, for any given (large) $k$ and for $x$ sufficiently near $x_1 := \exp(4^{2k})$ we have
$$F(x) \sim x^{1-4^{-k}}e^{-4^k}\sin(\gamma_k\log x) \sim x\exp\{-2\sqrt{\log x}\}\sin(\gamma_k\log x).$$
We conclude the argument by making a small perturbation of $x_1$ whose only significant effect is to make the value of sine be $\pm1$. Set $x_2 = x_1e^{\epsilon}$, where $\epsilon$ is a small number to be determined. We have
$$x_2^{1-4^{-k}}e^{-4^k} = x_1^{1-4^{-k}}e^{-4^k}e^{\epsilon(1-4^{-k})} \to x_1\exp\{-2\sqrt{\log x_1}\}$$
as $\epsilon \to 0$. For the argument of sine, suppose
$$e^{4^k}\log x_1 = 2\pi m + \pi\theta$$
for some positive integer $m$ and $|\theta| \le 1$. Then we have
$$\gamma_k\log x_2 = e^{4^k}(\log x_1 + \epsilon) = 2\pi m + \pi\theta + \epsilon e^{4^k} = 2\pi m \pm \pi/2$$
by taking $\epsilon = e^{-4^k}(\pm\pi/2 - \theta\pi)$.
The latter is a number that goes to zero as $k \to \infty$, and thus $F(x_2(k))$ satisfies (17.52) (resp. (17.53)). This completes the proof of Theorem 17.14.

17.10. Normalization and hybrid

We can construct other g-prime systems, based on $\mathcal{P}_R$ and $\mathcal{P}_B$ of Theorems 17.11 and 17.14, that have the same properties as the primes in $\mathbb{N}$ (assuming the truth of the Riemann hypothesis for (iv) and (v)). For instance, given any positive integer $n_X$, there is a g-prime system $\mathcal{P}_X$ such that
(i) $\mathcal{P}_X$ consists of numbers in $\{v_k\}$ given in (17.13) and $\mathcal{P}_X \cap [1, n_X]$ is identical with the primes in $\mathbb{N}$ of size at most $n_X$;
(ii) the counting function $N_X(x)$ of the resulting g-integers satisfies $N_X(x) = x + O(x^{1/2}\exp\{c(\log x)^{2/3}\})$;
(iii) the associated zeta function $\zeta_X(s)$ is analytic for $\sigma > 1/2$ except for a simple pole at $s = 1$ with residue 1;
(iv) the function $\zeta_X(s)$ has no zeros on the half plane $\sigma > 1/2$;
(v) the g-prime counting function $\pi_X(x)$ satisfies $\pi_X(x) = \mathrm{li}(x) + O(x^{1/2})$.

The construction can be done by successively inserting some rational primes into the g-prime system $\mathcal{P}_R$ of Theorem 17.11 and deleting some other g-primes from it. Specifically, we can first insert all rational primes up to $n_X$ that are not already in $\mathcal{P}_R$ and then delete all g-primes up to $n_X$ that are not rational primes. In this way, the new system satisfies all the properties (i) – (v) except that the residue $k$ of $\zeta_X(s)$ at $s = 1$, i.e., the density of the resulting g-integers, may not be 1. The desired estimates can be shown for (ii) by using the inclusion-exclusion principle, since only a finite number of g-primes are involved. We may suppose that $k < 1$ in the new system, for otherwise we delete a finite number of additional g-primes. Finally, there is a finite or infinite sequence $\{w_n\}$ of numbers $w_n \in \mathbb{N}$ such that
$$w_n > n_X, \qquad \prod(1 - w_n^{-1}) = k, \qquad \sum w_n^{-1/2} < \infty, \qquad \sum_{w_n\le x} 1 = O(\log x).$$
We enlarge PX to contain the collection {wn }. Then the resulting g-number system has all the expected properties. 17.11. Notes §17.1. The material of this chapter is based on the articles [DMV06] and [Zh07]. §17.2.1. A reference for the Bernstein-Kolmogorov Lemma is §19.1A(i) in [Lo77]. If we have v > 2σe2 , then the choice λ = 1 in (17.4) in the proof yields the further estimate P [Y ≥ v] ≤ exp{−v + σ 2 } ≤ exp{−v/2}. Besides the sequence {vk } defined by (17.13), any other sequence that is suitably dense (in some sense) in a neighborhood of ∞ also will satisfy the conditions of Lemmas 17.2 and 17.5. For example, the sequences {vk } defined by $ vk := log(k + k0 ) , k = 0, 1, 2, . . . , and vk := log(k + k0 ) log log(k + k0 ), k = 0, 1, 2, . . . , satisfy the conditions of Lemmas 17.2 and 17.5 as well. §17.2.2. In addition to our choice of the pair f and F , it is easy to show that if vα on [1, ∞), f (v) (1 + log v)β where α and β are constants with −1 < α < 1 and β ≥ 0, then xα+1 (1 + log x)β too satisfies the conditions of Lemmas 17.2 and 17.5. F (x) =
§17.3.2. The system PR of generalized primes in Theorem 17.11 can be taken as subsequences of the sequence {vk } given in (17.13) as well as the sequences mentioned above.
https://doi.org/10.1090//surv/213/18
CHAPTER 18
The Dickman and Buchstab Functions

Real numbers want to be sieved too

Summary. The Dickman and Buchstab functions appear in approximations to the counting functions of rational integers without large prime factors, resp. those having only large prime factors. We use the framework of Beurling numbers to suggest how these functions arise and give several related formulas.
18.1. Introduction

For $x \ge 1$, $y > 1$, the function $\psi(x, y)$ denotes the number of positive rational integers not exceeding $x$ all of whose prime factors are at most $y$. (This $\psi$ is not related to Chebyshev's function.) The integers counted here are called "y-smooth" or simply "smooth." To see the connection of $\psi(x, y)$ to primes and their powers, note that the generating function of $\psi(x, y)$ is a truncation of the product form of the Riemann zeta function: for $\Re s > 1$,
$$\zeta(s, y) := \int_{x=1-}^\infty x^{-s}\,d\psi(x, y) = \sum_{\substack{n\ge1 \\ p\mid n\Rightarrow p\le y}} n^{-s} = \prod_{p\le y}\big(1 - p^{-s}\big)^{-1}.$$
Thus
$$\log\zeta(s, y) = -\sum_{p\le y}\log\big(1 - p^{-s}\big) = \sum_{p\le y}\sum_{\alpha=1}^\infty\frac1\alpha p^{-\alpha s} = \int_{x=1}^\infty x^{-s}\,d\Pi(x, y),$$
where
$$\Pi(x, y) := \sum_{\substack{p^\alpha\le x \\ p\le y}}\frac1\alpha.$$
It follows by the argument that established (3.4) that
$$\int_{x=1-}^\infty x^{-s}\,d\psi(x, y) = \exp\Big\{\int_{x=1}^\infty x^{-s}\,d\Pi(x, y)\Big\} = \int_{x=1-}^\infty x^{-s}\exp^{*}d\Pi(x, y),$$
whence, by the identity theorem for Mellin transforms,
$$\psi(x, y) = \int_{t=1-}^x \exp^{*}d\Pi(t, y). \tag{18.1}$$
On the other hand, the function $\phi(x, y)$ denotes the number of positive rational integers not exceeding $x$ without any prime factors in $(1, y]$. Note that if $y \ge x \ge 1$, then 1 is the only number in $[1, x]$ having no prime factors in the range $(1, y]$, and in this case, $\phi(x, y) = 1$.
For given $y$, the Mellin transform of $d\phi(x, y)$ is
$$\int_{1-}^\infty x^{-s}\,d\phi(x, y) = \sum_{\substack{n\ge1 \\ p\mid n\Rightarrow p>y}} n^{-s} = \prod_{p>y}\big(1 - p^{-s}\big)^{-1}, \qquad \Re s > 1.$$
Also,
$$-\sum_{p>y}\log\big(1 - p^{-s}\big) = \sum_{p>y}\sum_{\alpha=1}^\infty\frac1\alpha p^{-\alpha s} = \int_{x=1}^\infty x^{-s}\,d\Upsilon(x, y)$$
with
$$\Upsilon(x, y) := \sum_{\substack{p^\alpha\le x \\ p>y}}\frac1\alpha,$$
whence
$$\phi(x, y) = \int_{t=1-}^x \exp^{*}d\Upsilon(t, y). \tag{18.2}$$
We have, for each $y > 1$, $\Pi(x, y) + \Upsilon(x, y) = \Pi(x)$. The preceding definitions and properties of exp from Chapter 3 give
$$\int_{1-}^x d\psi(\cdot, y) * d\phi(\cdot, y) = \int_{1-}^x \exp^{*}\{d\Pi(\cdot, y) + d\Upsilon(\cdot, y)\} = \int_{1-}^x \exp^{*}d\Pi = \lfloor x\rfloor := N(x), \qquad x \ge 1.$$
Also, (3.7) gives the relation
$$\int_{1-}^x \exp^{*}d\Pi_c = \int_{1-}^x \exp^{*}\Big\{\frac{1 - t^{-1}}{\log t}\,dt\Big\} = \int_{1-}^x (\delta_1 + dt) = x := N_c(x).$$
We have $N(x) \approx N_c(x)$, and by the classical PNT, $\Pi(x) \approx \Pi_c(x)$ ($\approx \mathrm{li}(x)$). In this chapter, we introduce continuous analogues of $\psi(x, y)$ and $\phi(x, y)$ in terms of exponentials involving $d\Pi_c$, and we show that the resulting functions are precisely those occurring in asymptotic formulas of Dickman–de Bruijn and of Buchstab.

18.2. The $\psi(x, y)$ function

By well-known theorems of Dickman and de Bruijn, $\psi(x, y)$ is asymptotic to $x\,\rho(\log x/\log y)$ as $x, y \to \infty$ in certain ranges. Here $\rho$ denotes the Dickman function, defined by $\rho(u) = 1$ for $0 \le u \le 1$ and for $u \ge 1$ by
$$u\,\rho(u) := \int_{u-1}^u \rho(t)\,dt.$$
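The defining integral equation for $\rho$ lends itself to a direct numerical check. The Python sketch below is an illustration, not part of the text: it applies the trapezoid rule to $u\,\rho(u) = \int_{u-1}^u\rho(t)\,dt$ on a uniform grid and solves for the newest value at each step. The names `dickman_rho_table` and `rho` and the resolution `N = 1000` are assumptions chosen for this example.

```python
import math

def dickman_rho_table(u_max, n=1000):
    """Tabulate rho(i/n) for 0 <= i <= n*u_max.

    Discretizes u*rho(u) = integral of rho over [u-1, u] with the
    trapezoid rule on a grid of step h = 1/n (n is an assumed resolution).
    """
    h = 1.0 / n
    total = int(round(u_max * n))
    rho_values = [1.0] * (total + 1)   # rho(u) = 1 for 0 <= u <= 1
    # Sum of the interior trapezoid nodes rho[i-n+1], ..., rho[i-1].
    interior = float(n - 1)            # correct for i = n + 1
    for i in range(n + 1, total + 1):
        u = i * h
        # u*rho_i = h*(rho_{i-n}/2 + interior + rho_i/2), solved for rho_i.
        rho_values[i] = h * (0.5 * rho_values[i - n] + interior) / (u - 0.5 * h)
        # Slide the window so that it is correct for index i + 1.
        interior += rho_values[i] - rho_values[i - n + 1]
    return rho_values

N = 1000
table = dickman_rho_table(4.0, N)

def rho(u):
    """Look up rho at the grid point nearest to u."""
    return table[int(round(u * N))]

print(rho(1.5), 1 - math.log(1.5))   # closed form 1 - log u on [1, 2]
print(rho(2.0), 1 - math.log(2.0))
print(rho(3.0))
```

On $[1, 2]$ the computed values can be compared with the closed form $\rho(u) = 1 - \log u$ that is used later in this section; beyond $u = 2$ the table simply continues the recursion.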
Further information on $\psi(x, y)$ and the $\rho$ function can be found in Chapter III.5 of [Te95], §7.1 of [MV07], and [Mo93]. We make one observation on $\rho(u)$: it vanishes more swiftly at infinity than the reciprocal of Euler's gamma function.

A continuous analogue of $\Pi(x, y)$ can be defined for $x \ge 1$, $y > 1$ by
$$\Pi_c(x, y) := \int_1^x\Big\{\chi_{[1,y]}(t)\,\frac{(1 - t^{-1})\,dt}{\log t} - \chi_{(y,\infty)}(t)\,\frac{dt}{t\log t}\Big\}, \tag{18.3}$$
with $\chi_E$ the indicator function of a set $E$, and a continuous analogue of $\psi(x, y)$ by
$$\psi_c(x, y) := \int_{1-}^x \exp^{*}\Big\{\chi_{[1,y]}(t)\,\frac{(1 - t^{-1})\,dt}{\log t} - \chi_{(y,\infty)}(t)\,\frac{dt}{t\log t}\Big\}, \tag{18.4}$$
i.e. $\psi_c(x, y) := \int_{1-}^x \exp^{*}d\Pi_c(\cdot, y)$. Now we give an exact representation of $\psi_c(x, y)$ in terms of the rho function, which resembles the Dickman asymptotic formula.

Theorem 18.1. We have $\psi_c(x, y) = x\,\rho(\log x/\log y)$ for all $x \ge 1$, $y > 1$.

Before proving the theorem, let's make a small calculation to show the result plausible. We start with another expression for $\psi_c(x, y)$ created by adding and subtracting a term $\chi_{(y,\infty)}(t)\,dt/\log t$. We find
$$\begin{aligned}\psi_c(x, y) &:= \int_{1-}^x \exp^{*}\Big\{\frac{(1 - t^{-1})\,dt}{\log t} - \chi_{(y,\infty)}(t)\,\frac{dt}{\log t}\Big\} \\ &= \int_{1-}^x (\delta_1 + dt) * \exp^{*}\Big\{-\chi_{(y,\infty)}(t)\,\frac{dt}{\log t}\Big\} \\ &= \int_{1-}^x \frac{x}{t}\,\exp^{*}\Big\{-\chi_{(y,\infty)}(t)\,\frac{dt}{\log t}\Big\},\end{aligned} \tag{18.5}$$
and by the homomorphic property of exp (Lemma 3.8),
$$\psi_c(x, y) = x\int_{1-}^x \exp^{*}\Big\{-\chi_{(y,\infty)}(t)\,\frac{dt}{t\log t}\Big\}.$$
We expand the last formula as
$$\frac{\psi_c(x, y)}{x} = 1 - \int_1^x \chi_{(y,\infty)}(t)\,\frac{dt}{t\log t} + \frac{1}{2!}\int_1^x\Big\{\chi_{(y,\infty)}(t)\,\frac{dt}{t\log t}\Big\}^{*2} \mp \cdots. \tag{18.6}$$
If $1 \le x \le y$, then 1 is the only term on the right side of (18.6), and we find
$$\frac{\psi_c(x, y)}{x} = 1 = \rho\Big(\frac{\log x}{\log y}\Big),$$
since $\rho(u) = 1$ for $0 \le u \le 1$. Next, suppose $y < x \le y^2$. In this case,
$$\frac{\psi_c(x, y)}{x} = 1 - \int_y^x\frac{dt}{t\log t} = 1 - \log\frac{\log x}{\log y} = \rho\Big(\frac{\log x}{\log y}\Big),$$
since, by a small calculation, $\rho(u) = 1 - \log u$ for $1 \le u \le 2$. Calculations for larger $x$ are more complicated, so we now turn to the proof of Theorem 18.1.

We establish the formula using Mellin transforms. For $s \in \mathbb{C}$ with $\Re s > 1$ we have, on the one hand,
$$\begin{aligned}\int_1^\infty x^{-s}\psi_c(x, y)\,\frac{dx}{x} &= \frac1s\int_{x=1-}^\infty x^{-s}\,d\psi_c(x, y) \\ &= \frac1s\int_{x=1-}^\infty x^{-s}\exp^{*}\Big\{\big(\chi_{[1,y]}(x) - x^{-1}\big)\frac{dx}{\log x}\Big\} \\ &= \frac1s\exp\Big\{\int_{x=1}^\infty x^{-s}\big(\chi_{[1,y]}(x) - x^{-1}\big)\frac{dx}{\log x}\Big\} \\ &= \frac1s\exp\Big\{\int_{x=1}^\infty x^{-s}(1 - x^{-1})\frac{dx}{\log x} - \int_y^\infty x^{-s}\frac{dx}{\log x}\Big\}.\end{aligned}$$
Set
$$E_1(z) := \int_z^\infty e^{-t}\,dt/t,$$
with the integration along the horizontal line from $z = x + iy$ to $+\infty + iy$. We see that $E_1(z)$ is analytic on the open half-plane $\Re z > 0$ and $E_1'(z) = -e^{-z}/z$. Hence
$$\int_1^\infty x^{-s}\psi_c(x, y)\,\frac{dx}{x} = \frac{1}{s-1}\exp(-E_1\{(s-1)\log y\}),$$
where we have used formula (1.8). On the other hand,
$$\int_1^\infty x^{-s}\rho\Big(\frac{\log x}{\log y}\Big)dx = \log y\int_0^\infty e^{-(s-1)u\log y}\rho(u)\,du.$$
The last integral is a Laplace transform, and we identify it by forming its differential equation and solving. For $\Re z > 0$, let
$$F(z) := \int_0^\infty e^{-zu}\rho(u)\,du.$$
It is convenient here to define $\rho(u) = 0$ for $u < 0$, for then we have
$$\{u\,\rho(u)\}' = \rho(u) - \rho(u-1), \qquad u > 0,\ u \ne 1.$$
If we differentiate $F$ and integrate by parts the resulting expression (noting that $u\,\rho(u)$ is continuous at $u = 1$, the point of nondifferentiability), we find
$$F'(z) = -\int_0^\infty e^{-zu}\,u\,\rho(u)\,du = -\frac1z\int_0^\infty e^{-zu}\{\rho(u) - \rho(u-1)\}\,du = -z^{-1}(1 - e^{-z})\,F(z).$$
Let
$$f(z) := \log z + E_1(z), \qquad \Re z > 0.$$
Then $f'(z) = (1 - e^{-z})/z$ and
$$\{e^{f(z)}F(z)\}' = 0.$$
Therefore
$$F(z) = c\,e^{-f(z)} = c\,z^{-1}e^{-E_1(z)}, \qquad \Re z > 0.$$
To evaluate the constant $c$, note that $\rho(u) = 1$ for $0 \le u < 1$, and hence
$$F(z) = \int_0^\infty e^{-zu}\,du + \int_1^\infty e^{-zu}\{\rho(u) - 1\}\,du = \frac1z + O(e^{-z}), \qquad z \to +\infty.$$
Therefore $zF(z) \to 1$, and also $E_1(z) \to 0$ as $z \to +\infty$. Thus $c = 1$ and we have $F(z) = z^{-1}\exp\{-E_1(z)\}$. For $\Re s > 1$, it follows that
$$\int_1^\infty x^{-s}\rho\Big(\frac{\log x}{\log y}\Big)dx = \log y\cdot F\{(s-1)\log y\} = \frac{1}{s-1}\exp(-E_1\{(s-1)\log y\}) = \int_1^\infty x^{-s}\psi_c(x, y)\,\frac{dx}{x}.$$
By the Mellin identity theorem, the claimed formula for $\psi_c(x, y)$ holds.

As an immediate consequence of the theorem and formula (18.6) we have
Corollary 18.2. Let $x \ge 1$, $y > 1$. Then
$$\rho\Big(\frac{\log x}{\log y}\Big) = \frac{\psi_c(x, y)}{x} = \int_{1-}^x \exp^{*}\Big\{-\chi_{(y,\infty)}(t)\,\frac{dt}{t\log t}\Big\}.$$
We shall consider $\psi_c(x, y)$ further in §18.4 below.

18.3. The $\phi(x, y)$ function

Buchstab's theorem provides an asymptotic estimate of the number of rational integers up to $x$ whose prime factors all exceed $y$, namely,
$$\phi(x, y) \sim \omega\Big(\frac{\log x}{\log y}\Big)\frac{x}{\log y}$$
as $x, y \to \infty$ in certain ranges. Here $\omega$ denotes Buchstab's function, which is defined as
$$\omega(u) = \begin{cases} 0, & u < 1, \\ 1/u, & 1 \le u \le 2, \end{cases}$$
and for $u > 2$ as the continuous solution of the difference differential equation
$$\{u\,\omega(u)\}' = \omega(u-1). \tag{18.7}$$
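The difference differential equation (18.7) can be integrated step by step. Since $2\,\omega(2) = 1$, integrating (18.7) from 2 to $u$ gives $u\,\omega(u) = 1 + \int_1^{u-1}\omega(s)\,ds$ for $u > 2$, and the right side involves only values of $\omega$ that are already known. The Python sketch below uses this form with a cumulative trapezoid rule; the names `buchstab_omega_table` and `omega` and the resolution `N = 1000` are assumptions made for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant

def buchstab_omega_table(u_max, n=1000):
    """Tabulate omega(1 + i/n) for 0 <= i <= n*(u_max - 1).

    Uses u*omega(u) = 1 + integral_1^{u-1} omega(s) ds for u > 2, the
    integrated form of (u*omega(u))' = omega(u - 1) with 2*omega(2) = 1.
    """
    h = 1.0 / n
    total = int(round((u_max - 1.0) * n))
    omega_values = [0.0] * (total + 1)
    cum = [0.0] * (total + 1)   # cum[i] approximates the integral over [1, 1 + i*h]
    omega_values[0] = 1.0       # omega(1) = 1
    for i in range(1, total + 1):
        u = 1.0 + i * h
        if i <= n:
            omega_values[i] = 1.0 / u                   # omega(u) = 1/u on [1, 2]
        else:
            omega_values[i] = (1.0 + cum[i - n]) / u    # u - 1 = 1 + (i - n)*h
        cum[i] = cum[i - 1] + 0.5 * h * (omega_values[i - 1] + omega_values[i])
    return omega_values

N = 1000
omega_table = buchstab_omega_table(10.0, N)

def omega(u):
    """Look up omega at the grid point nearest to u >= 1."""
    return omega_table[int(round((u - 1.0) * N))]

print(omega(3.0), (1 + math.log(2.0)) / 3)    # closed form at u = 3
print(omega(10.0), math.exp(-EULER_GAMMA))    # limiting value e^{-gamma}
```

The value at $u = 3$ can be compared with $(1 + \log 2)/3$, which follows from the integrated equation, and the value at $u = 10$ with the limit $e^{-\gamma}$.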
Further study of $\phi(x, y)$ and the omega function can be found in Chapter III.6 of [Te95] and in §7.2 of [MV07]. We remark that $\omega(u)$ tends rapidly toward $e^{-\gamma}$ ($\gamma$ = Euler's constant) as $u \to \infty$.

Define a continuous analogue of $\phi(x, y)$ for $x \ge 1$, $y > 1$ by
$$\phi_c(x, y) := \int_{1-}^x \exp^{*}\Big\{\chi_{(y,\infty)}(t)\,\frac{dt}{\log t}\Big\}. \tag{18.8}$$
If we combine the last formula with (18.5) and (3.7), we find, for any numbers $x \ge y > 1$,
$$\int_{1-}^x d\psi_c(\cdot, y) * d\phi_c(\cdot, y) = \int_{1-}^x (\delta_1 + dt) = x.$$
Thus $\psi_c(x, y)$ and $\phi_c(x, y)$ fit together in a similar way as their rational integer counterparts $\psi(x, y)$ and $\phi(x, y)$, as was noted in §18.1. Now we show that $\phi_c(x, y)$ has an exact representation in terms of Buchstab's function. Comparing the present formula with that for $\psi_c(x, y)$, here we have an integral and a constant term.

Theorem 18.3. For $x \ge 1$, $y > 1$,
$$\phi_c(x, y) = 1 + \frac{1}{\log y}\int_y^x \omega\Big(\frac{\log t}{\log y}\Big)dt. \tag{18.9}$$

Proof. Again, we verify the claim using Mellin transforms. We have
$$\int_{x=1-}^\infty x^{-s}\,d\phi_c(x, y) = \int_{x=1-}^\infty x^{-s}\exp^{*}\Big\{\chi_{(y,\infty)}(x)\,\frac{dx}{\log x}\Big\} = \exp\Big\{\int_y^\infty x^{-s}\frac{dx}{\log x}\Big\} = \exp\Big\{\int_{(s-1)\log y}^\infty e^{-t}\,\frac{dt}{t}\Big\} = \exp(E_1\{(s-1)\log y\}).$$
Also, making the substitution $u = \log x/\log y$, we get
$$\int_{x=1-}^\infty x^{-s}\Big\{\delta_1 + \omega\Big(\frac{\log x}{\log y}\Big)\frac{dx}{\log y}\Big\} = 1 + \int_1^\infty \exp\{-(s-1)u\log y\}\,\omega(u)\,du.$$
The last integral is a Laplace transform, and we identify it by forming its differential equation and solving. For $\Re z > 0$, let
$$F(z) := \int_1^\infty e^{-zu}\omega(u)\,du.$$
We differentiate $F$ and integrate by parts the resulting expression using the difference differential equation (18.7). That relation in fact holds for $u > 1$, aside from $u = 2$. At this point, $u\,\omega(u)$ is continuous, and thus the integration by parts is valid. We get
$$F'(z) = -\int_1^\infty e^{-zu}\,u\,\omega(u)\,du = -\frac{e^{-z}}{z} - \frac{e^{-z}}{z}\int_1^\infty e^{-zv}\omega(v)\,dv = -\frac{e^{-z}}{z}\{1 + F(z)\}.$$
Then
$$\big\{e^{-E_1(z)}\{F(z) + 1\}\big\}' = 0,$$
i.e., $F(z) + 1 = c\exp\{E_1(z)\}$. Hence $F(z) + 1 \to c$ as $z \to +\infty$. On the other hand, $F(z) \to 0$ as $z \to +\infty$. Thus $c = 1$ and $F(z) + 1 = \exp\{E_1(z)\}$. Therefore
$$\int_{x=1-}^\infty x^{-s}\Big\{\delta_1 + \omega\Big(\frac{\log x}{\log y}\Big)\frac{dx}{\log y}\Big\} = 1 + F\{(s-1)\log y\} = \exp(E_1\{(s-1)\log y\}).$$
We see that the two Mellin integrals are equal, so
$$d_t\phi_c(t, y) = \delta_1 + \omega\Big(\frac{\log t}{\log y}\Big)\frac{dt}{\log y}. \tag{18.10}$$
Formula (18.9) holds, as we see by integrating the last equation from $1-$ to $x$ and noting that the right side becomes
$$1 + \int_{1-}^x \omega\Big(\frac{\log t}{\log y}\Big)\frac{dt}{\log y} = 1 + \frac{1}{\log y}\int_y^x \omega\Big(\frac{\log t}{\log y}\Big)dt.$$

18.4. A Beurling version of $\psi(x, y)$

The reader will notice that $\Pi_c(x, y)$ of §18.2 does not qualify as a g-prime counting function, for it is decreasing in $x$ for $x > y$. (Its density function also has support extending beyond the smoothness bound $y$, but this is not a fatal flaw, since the classical counterpart $\Pi(x, y)$ also counts the contribution of higher prime powers lying in the range $(y, x]$.) As a replacement for $\Pi_c(x, y)$ we introduce
$$\Pi_c^+(x, y) := \int_1^x \chi_{[1,y]}(t)\,\frac{(1 - t^{-1})\,dt}{\log t}$$
and its associated Beurling g-number counting function
$$\psi_c^+(x, y) := \int_{1-}^x \exp^{*}\Big\{\chi_{[1,y]}(t)\,\frac{(1 - t^{-1})\,dt}{\log t}\Big\}.$$
We now represent $\psi_c^+(x, y)$ in terms of the rho and omega functions.

Theorem 18.4. For $x \ge y > 1$, set $u = \log x/\log y$. We have
$$\psi_c^+(x, y) = x\rho(u) + x\int_1^u \rho(u - v)\,y^{-v}\,\omega(v)\,dv.$$
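Proof. For given $y$, we have by (18.3)
$$\chi_{[1,y]}(t)\,\frac{(1 - t^{-1})\,dt}{\log t} = d\Pi_c(t, y) + \chi_{(y,\infty)}(t)\,\frac{dt}{t\log t}.$$
By results from the last two sections,
$$\begin{aligned}\psi_c^+(x, y) &= \int_{1-}^x \exp^{*}\Big\{d\Pi_c(t, y) + \chi_{(y,\infty)}(t)\,\frac{dt}{t\log t}\Big\} \\ &= \int_{1-}^x d\psi_c(\cdot, y) * T^{-1}d\phi_c(\cdot, y) = \int_{1-}^x \psi_c\Big(\frac{x}{t}, y\Big)\,t^{-1}\,d\phi_c(t, y) \\ &= \int_{1-}^x \frac{x}{t}\,\rho\Big(\frac{\log x - \log t}{\log y}\Big)\Big\{\delta_1 + \omega\Big(\frac{\log t}{\log y}\Big)\frac{dt}{t\log y}\Big\}.\end{aligned}$$
If we substitute $v = \log t/\log y$ and note that $\omega(v) = 0$ for $v < 1$, we obtain the claimed formula for $\psi_c^+(x, y)$.

18.5. G-numbers with primes from an interval

A series of papers by J. Friedlander [Fr76] and E. Saias [Sa92, Sa95] studied the counting function $\theta(x, y, z)$ of rational integers in $[1, x]$ having all their prime factors in a range $(z, y]$. Here we define a continuous version of this function by setting
$$\theta_c(x, y, z) := \int_{1-}^x \exp^{*}\Big\{\chi_{(z,y]}(t)\,\frac{dt}{\log t}\Big\},$$
and we describe $\theta_c(x, y, z)$ in terms of the Dickman and Buchstab functions.

Theorem 18.5. Suppose $x \ge y > z > 1$. We have
$$\theta_c(x, y, z) = \int_{1-}^x\Big\{\delta_1 + \rho'\Big(\frac{\log t}{\log y}\Big)\frac{dt}{\log y}\Big\} * \Big\{\delta_1 + \omega\Big(\frac{\log t}{\log z}\Big)\frac{dt}{\log z}\Big\}. \tag{18.11}$$

Proof. From the identity
$$\chi_{(z,y]}(t) = (\chi_{[1,y]}(t) - t^{-1}) - (1 - t^{-1}) + \chi_{(z,\infty)}(t)$$
we find
$$d_t\theta_c(t, y, z) = \exp^{*}\Big\{\big[(\chi_{[1,y]}(t) - t^{-1}) - (1 - t^{-1}) + \chi_{(z,\infty)}(t)\big]\frac{dt}{\log t}\Big\}.$$
Using the relations
$$\exp^{*}\Big\{(\chi_{[1,y]}(t) - t^{-1})\frac{dt}{\log t}\Big\} = d\psi_c(\cdot, y), \qquad \exp^{*}\Big\{-(1 - t^{-1})\frac{dt}{\log t}\Big\} = \delta_1 - t^{-1}dt, \qquad \exp^{*}\Big\{\chi_{(z,\infty)}(t)\frac{dt}{\log t}\Big\} = d\phi_c(\cdot, z)$$
and the homomorphic property of exp, we find
$$d_t\theta_c(t, y, z) = d\psi_c(\cdot, y) * (\delta_1 - t^{-1}dt) * d\phi_c(\cdot, z).$$
By (18.10)
$$d_t\phi_c(t, z) = \delta_1 + \omega\Big(\frac{\log t}{\log z}\Big)\frac{dt}{\log z}.$$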
Using Theorem 18.1 and the fact that $\psi_c(x, y)$ has a jump of 1 at $x = 1$, we get
$$d\psi_c(t, y) = \delta_1 + d\Big\{t\,\rho\Big(\frac{\log t}{\log y}\Big)\Big\} = \delta_1 + \rho\Big(\frac{\log t}{\log y}\Big)dt + \rho'\Big(\frac{\log t}{\log y}\Big)\frac{dt}{\log y}.$$
To treat the convolution factor $\delta_1 - t^{-1}dt$, we note that, for any measure $dF \in d\mathcal{V}$, we have
$$\int_{1-}^x dF * t^{-1}dt = \int_{1-}^x F(x/t)\,dt/t = \int_{1-}^x F(v)\,dv/v,$$
whence $dF * t^{-1}dt = F(t)\,dt/t$. Thus
$$d\psi_c(\cdot, y) * t^{-1}dt = t\,\rho\Big(\frac{\log t}{\log y}\Big)\frac{dt}{t}$$
and hence
$$d\psi_c(\cdot, y) * (\delta_1 - t^{-1}dt) = \delta_1 + \rho'\Big(\frac{\log t}{\log y}\Big)\frac{dt}{\log y}.$$
If we combine the last relation with that for $d_t\phi_c(t, z)$, we obtain the claimed formula for $\theta_c(x, y, z)$.
Here is another form of the last theorem.

Theorem 18.6. For $x \ge y > z > 1$ and $u = \log x/\log y$, $v = \log x/\log z$, $\sigma = \log z/\log y$, we have
$$d_x\theta_c(x, y, z) = \delta_1 + \Big\{\frac{\rho'(u)}{\log y} + \frac{\rho(u - \sigma)}{\log z} + \frac{1}{\log z}\int_1^v \rho(u - t\sigma)\,\omega'(t)\,dt\Big\}dx.$$

Proof. The key is to rewrite the convolution in (18.11). Let
$$F(x) := \int_1^x \rho'\Big(\frac{\log t}{\log y}\Big)dt * \omega\Big(\frac{\log t}{\log z}\Big)dt = \int_1^x\Big\{\int_1^{x/t}\rho'\Big(\frac{\log s}{\log y}\Big)ds\Big\}\,\omega\Big(\frac{\log t}{\log z}\Big)dt.$$
We differentiate the double integral, noting that the derivative of the outer integral is 0 because the inner integral vanishes as $t \to x$. We find
$$F'(x) = \int_1^x \rho'\Big(\frac{\log x/t}{\log y}\Big)\,\omega\Big(\frac{\log t}{\log z}\Big)\frac{dt}{t} = \log z\int_1^v \rho'(u - s\sigma)\,\omega(s)\,ds.$$
Now divide the last integral by $\log y\,\log z$ and integrate it by parts. We get
$$\frac{dF(x)}{\log y\,\log z} = \frac{1}{\log z}\Big\{\rho(u - \sigma) - \omega(v) + \int_1^v \rho(u - s\sigma)\,\omega'(s)\,ds\Big\}dx,$$
and if we insert this expression in place of the convolution in (18.11) we obtain the claimed formula for $d_x\theta_c(x, y, z)$.

Remark. In [Sa92], Saias gives the formula
$$\frac{\theta(x, y, z)}{x} \sim \frac{\rho'(u)}{\log y} + \frac{\rho(u - \sigma)}{\log z} + \frac{1}{\log z}\int_1^v \rho(u - t\sigma)\,\omega'(t)\,dt.$$
Although the left side of this formula differs from that of the last theorem, the results are not inconsistent; rather they are instances of a relation
$$\frac1x\int_1^x g(t)\,dt \sim g(x),$$
which holds under quite mild conditions, e.g. $xg(x) \to \infty$ and $xg'(x) = o(g(x))$.
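The averaging relation in this remark can be illustrated with two test functions chosen only for this example. For $g(t) = \log t$ both stated conditions hold, and the ratio of the mean to $g(x)$ tends (slowly) to 1. For $g(t) = \sqrt t$ the second condition fails, since $xg'(x) = g(x)/2$, and the ratio tends to $2/3$ instead. The Python sketch below uses the closed forms of the two integrals; the function names are hypothetical.

```python
import math

def mean_ratio_log(x):
    """(1/x) * integral_1^x log t dt, divided by log x (closed form)."""
    mean = math.log(x) - 1.0 + 1.0 / x
    return mean / math.log(x)

def mean_ratio_sqrt(x):
    """(1/x) * integral_1^x sqrt(t) dt, divided by sqrt(x) (closed form)."""
    mean = (2.0 / 3.0) * (x ** 1.5 - 1.0) / x
    return mean / math.sqrt(x)

for exponent in (10, 100, 200):
    x = math.exp(exponent)
    print(exponent, mean_ratio_log(x), mean_ratio_sqrt(x))
```

The first ratio equals $1 - 1/\log x + O(1/(x\log x))$, which shows how slow the convergence can be for slowly varying $g$.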
Remark. If we want to create a continuous g-number system having prime density that comes from several intervals (disjoint or not) we can convolve together factors dx θc (x, yi , zi ) associated with intervals (zi , yi ], i = 1, 2, . . . . 18.6. Other relations We conclude with brief mention of continuous analogues of some other Dickman/Buchstab relations. Buchstab Identity. If 1 < y < z ≤ x, then x ψ(x, y) = ψ(x, z) − ψ ,p . p y 1, x u ω(u) − y − y 0
with $u := \log x/\log y$. However, this approach is complicated by the discontinuity of $\omega$ and the need for manipulations. Instead, we prove the relation more simply using Mellin transforms.

Let $F_y(x) := \int_1^x \chi_{(y,\infty)}(t)\,dt$. We have, for $\Re s > 1$,
$$\widehat{F}_y(s) = \int_{x=1}^\infty x^{-s}\chi_{(y,\infty)}(x)\,dx = \frac{y^{1-s}}{s-1}$$
and, from the proof of Theorem 18.3,
$$\widehat{\phi}_c(s) = \int_{x=1-}^\infty x^{-s}\,d\phi_c(x, y) = \exp(E_1\{(s-1)\log y\}),$$
with $E_1(z) := \int_z^\infty e^{-t}\,dt/t$. We find
$$\int_{x=1}^\infty x^{-s}\log x\,d\phi_c(x, y) = -\widehat{\phi}_c{}'(s) = \exp(E_1\{(s-1)\log y\})\,\frac{e^{-(s-1)\log y}}{s-1} = \widehat{\phi}_c(s)\,\widehat{F}_y(s).$$
By Lemma 2.6 and the Mellin identity theorem, (18.13) holds.
18.7. Notes §18.2. There is a family of functions jκ with κ ≥ 1 that plays an important role in sieve theory. Up to a multiplicative constant, j1 is the function ρ. The κth convolution power of exp dΠc is expressible in terms of jκ . These functions are discussed in Chapter 14 of [DH08].
Bibliography

[AS64] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, National Bureau of Standards Applied Mathematics Series, 55, 1964. xiv+1046 pp. Reprinted by Dover Publications, New York, 1965.
[AMH15] F. Al-Maamori and T. Hilberdink, An example in Beurling's theory of generalised primes, Acta Arith. 168 (2015), 383–395. MR3352439.
[Ap74] T. M. Apostol, Mathematical analysis, 2nd ed., Addison-Wesley, Reading, MA, 1974. xvii+492 pp. MR49:9123.
[Ap76] T. M. Apostol, Introduction to analytic number theory, Undergrad. Texts in Math., Springer-Verlag, New York-Heidelberg, 1976. xii+338 pp. MR55:7892.
[Bzd99] M. Balazard, La version de Diamond de la méthode de l'hyperbole de Dirichlet, Enseign. Math. (2) 45 (1999), 253–270. MR2001a:11167.
[BS00] R. G. Bartle and D. R. Sherbert, Introduction to real analysis, Third ed. John Wiley, New York, 2000. xii+404 pp. MR1135107.
[BD69] P. T. Bateman and H. G. Diamond, Asymptotic distribution of Beurling's generalized prime numbers, in Studies in number theory, W. J. LeVeque, ed., Math. Assoc. Amer., 1969, 152–210. MR39:4105.
[BD04] P. T. Bateman and H. G. Diamond, Analytic number theory. An introductory course, World Scientific, Singapore, 2004. xiv+360 pp. Reprinted, with minor changes, in series Monographs in Number Theory, vol. 1, 2009. MR2005h:11208.
[Be37] A. Beurling, Analyse de la loi asymptotique de la distribution des nombres premiers généralisés. I, Acta Math. 68 (1937), 255–291.
[BGT87] N. H. Bingham, C. M. Goldie, and J. L. Teugels, Regular variation, Encyclopedia of Mathematics and its Applications, 27. Cambridge U. Press, Cambridge, 1987. xx+491 pp. MR88i:26004.
[Ch68] K. Chandrasekharan, Introduction to analytic number theory, Springer–Verlag, Berlin, 1968. viii+140 pp. MR40:2593.
[Ch70] K. Chandrasekharan, Arithmetical functions, Springer–Verlag, Berlin, 1970. xi+231 pp. MR43:3223.
[Di69] H. G. Diamond, The prime number theorem for Beurling's generalized numbers, J. Number Theory 1 (1969), 200–207. MR39:4106.
[Di70a] H. G. Diamond, Asymptotic distribution of Beurling's generalized integers, Illinois J. Math. 14 (1970), 12–28. MR40:5555.
[Di70b] H. G. Diamond, A set of generalized numbers showing Beurling's theorem to be sharp, Illinois J. Math. 14 (1970), 29–34. MR40:5556.
[Di73a] H. G. Diamond, Chebyshev estimates for Beurling generalized prime numbers, Proc. Amer. Math. Soc. 39 (1973), 503–508. MR47:3332.
[Di73b] H. G. Diamond, Chebyshev type estimates in prime number theory, Sém. de Théorie des Nombres, 1973-1974 (Univ. Bordeaux I, Talence), Exp. No. 24, 11 pp., 1974. MR52:13690.
[Di77] H. G. Diamond, When do Beurling generalized integers have a density?, J. Reine Angew. Math. 295 (1977), 22–39. MR56:8518.
[DH08] H. G. Diamond and H. Halberstam, A higher-dimensional sieve method. With an appendix ("Procedures for computing sieve functions") by William F. Galway. Cambridge Tracts in Mathematics, 177. Cambridge University Press, Cambridge, 2008. MR2458547 (2009h:11151).
[DMV06] H. G. Diamond, H. L. Montgomery, and U. M. A. Vorhauer, Beurling primes with large oscillation, Math. Ann. 334 (2006), 1–36. MR2006j:11131.
[DZ12] H. G. Diamond and W.-B. Zhang, A PNT equivalence for Beurling numbers, Funct. Approx. Comment. Math. 46 (2012), 225–234. MR2931668.
[DZ13a] H. G. Diamond and W.-B. Zhang, Chebyshev Bounds for Beurling Numbers, Acta Arith. 160 (2013), 143–157. MR3105332.
[DZ13b] H. G. Diamond and W.-B. Zhang, Optimality of Chebyshev Bounds for Beurling Generalized Numbers, Acta Arith. 160 (2013), 259–275. MR3106097.
[Da00] H. Davenport, Multiplicative number theory, 3rd ed. Revised and with a preface by Hugh L. Montgomery, Grad. Texts in Math., 74, Springer-Verlag, New York, 2000. xiv+177 pp. MR2001f:11001.
[Fr76] J. B. Friedlander, Integers free from large and small primes, Proc. London Math. Soc. (3) 33 (1976), 565–576. MR54:5139.
[Ha72] R. S. Hall, The prime number theorem for generalized primes, J. Number Theory 4 (1972), 313–320. MR0308069.
[Ha73] R. S. Hall, Beurling generalized prime number systems in which the Chebyshev inequalities fail, Proc. Amer. Math. Soc. 40 (1973), 79–82. MR47:6634.
[Ha49] G. H. Hardy, Divergent series, Oxford University Press, Oxford, 1949. xvi+396 pp. MR11,25a.
[Hi12] T. Hilberdink, Generalised prime systems with periodic integer counting function, Acta Arith. 152 (2012), 217–241. MR2885785.
[HL06] T. Hilberdink and M. Lapidus, Beurling zeta functions, generalised primes, and fractal membranes, Acta Appl. Math. 94 (2006), 21–48. MR2271675 (2007k:11174).
[In32] A. E. Ingham, The distribution of prime numbers, Cambridge Tracts in Math. and Math. Physics, 30, Cambridge Univ. Press, Cambridge, 1932.
[Iv85] A. Ivić, The Riemann zeta–function, Wiley–Interscience, New York, 1985. xvi+517 pp. MR87d:11062.
[Kn68] Y. Katznelson, An introduction to harmonic analysis, 2nd corrected ed., Dover Publications, Inc., New York, 1976. xiv+264 pp. MR10976.
[Ka96] J. P. Kahane, Une formule de Fourier sur les nombres premiers, Gazette des Mathématiciens 67 (1996), 3–9. MR97b:11113.
[Ka96b] J. P. Kahane, Une formule de Fourier pour les nombres premiers, Application aux nombres premiers généralisés de Beurling. (French) [A Fourier formula for primes. Application to Beurling generalized primes] Harmonic analysis from the Pichorides viewpoint (Anogia, 1995, Myriam Déchamps, ed.), 41–49, Publ. Math. Orsay, 96-01, Univ. Paris XI, Orsay, 1996. MR98c:11103.
[Ka97] J. P. Kahane, A Fourier formula for prime numbers, Harmonic analysis and number theory (Montreal, PQ, 1996), 89–102, CMS Conf. Proc., 21, Amer. Math. Soc., Providence, RI, 1997. MR98f:11099.
[Ka97b] J. P. Kahane, Sur les nombres premiers généralisés de Beurling. Preuve d'une conjecture de Bateman et Diamond, Journal de Théorie des Nombres de Bordeaux 9 (1997), 251–266. MR99f:11127.
[Ka98] J. P. Kahane, Le rôle des algèbres A de Wiener, A^∞ de Beurling et H^1 de Sobolev dans la théorie des nombres premiers généralisés de Beurling, Ann. Inst. Fourier (Grenoble) 48 (1998), 611–648. MR99k:11152.
[Ko04] J. Korevaar, Tauberian theory. A century of developments, Grund. der Math. Wiss. 329, Springer-Verlag, Berlin, 2004. xvi+483 pp. MR2006e:11139.
[Ko05] J. Korevaar, Distributional Wiener-Ikehara theorem and twin primes, Indag. Math. (N.S.) 16 (2005), no. 1, 37–49. MR2006d:11105.
[La03] E. Landau, Neuer Beweis des Primzahlsatzes und Beweis des Primidealsatzes, Math. Annalen 56 (1903), 645–670. Reprinted in Collected Works Vol. 1, Thales Verlag, Essen, 1987, 327–352.
[La09] E. Landau, Handbuch der Lehre von der Verteilung der Primzahlen. 2 Bände, 2d ed. With an appendix by Paul T. Bateman. Chelsea Publishing Co., New York, 1953. xviii+pp. 1–564; ix+pp. 565–1001. MR0068565.
[Lo77] M. Loève, Probability theory. I, 4th ed. Graduate Texts in Mathematics 45, Springer-Verlag, New York, 1977. xvii+425 pp. MR58:31324a.
[Mo93] P. Moree, Psixyology and Diophantine equations, Ph.D. Dissertation, Rijksuniversiteit, Leiden, 1993. x+196 pp. MR96e:11114.
BIBLIOGRAPHY
[MV07]
[Na00]
[Ny49] [Ol11] [Pl13] [Po99]
[Rv07] [Ro88] [Ru87] [Sa92] [Sa95] [SPV12] [So36]
[Te95] [Ti39] [Ts40] [Vn12] [Vn13] [We66]
[Zh86] [Zh87a] [Zh87b] [Zh88] [Zh07] [Zh14] [Zh15]
241
H. L. Montgomery and R. C. Vaughan, Multiplicative number theory. I. Classical theory, Cambridge Studies in Advanced Mathematics, 97, Cambridge University Press, Cambridge, 2007. xviii+552 pp. MR2009b:11001. W. Narkiewicz, The development of prime number theory. From Euclid to Hardy and Littlewood, Monographs in Math. Springer-Verlag, Berlin, 2000. xii+448 pp. MR2001c:11098. B. Nyman, A general prime number theorem, Acta Math. 81 (1949), 299–307. MR0032693. R. Olofsson, Properties of the Beurling generalized primes, J. Number Theory 131 (2011), 45–58. MR2729208. P. Pollack, On Mertens’ theorem for Beurling primes, Canad. Math. Bull. 56 (2013), no. 4, 829–843. MR3121692. Ch. J. de la Vall´ee Poussin, Sur la fonction ζ(s) de Riemann et le nombre des nombres premiers inf´ erieurs a ` une limite donn´ ee, M´ emoires couronn´ es et autres m´ emoires publi´ es par l’Acad´ emie Royale des Sciences, des Lettres et des Beaux-Arts de Belgique 59 (1899–1900), no. 1, 74 pp. S. Gy. R´ ev´ esz, On some extremal problems of Landau, Serdica Math. J. 33 (2007), 125–162. MR2313797. H. L. Royden, Real analysis, 3rd ed., Macmillan, New York, 1988. xx+444 pp. MR90g:00004. W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill, New York, 1987. xiv+416 pp. MR88k:00002. E. Saias, Entiers sans grand ni petit facteur premier.I, Acta Arith. 61 (1992), 347–374. MR93d:11096. E. Saias, Entiers sans grand ni petit facteur premier.III, Acta Arith. 71 (1995), 351–379. MR96g:11113. J. C. Schlage-Puchta and J. Vindas, The prime number theorem for Beurling’s generalized numbers. New cases, Acta Arith. 153 (2012), 299–324. MR2912720. S. L. Sobolev (Soboleff), Sur quelques ´ evaluations concernant les familles de fonctions etc. . . . . C. R. Acad. Sc. URSS (Doklady)I(X), N.7 (84), 1936, p.279 et III(XII), N.1 98, 1936, p. 107. G. Tenenbaum, Introduction to analytic and probabilistic number theory, Cambridge Studies in Adv. Math., 46, Cambridge Univ. Press, Cambridge, 1995. MR97e:11005b. E. C. 
Titchmarsh, Theory of functions, 2nd ed., Oxford Univ. Press, Oxford, 1939. L. Tschakaloff (Chakalov), Trigonometrische Polynome mit einer Minimumeigenschaft, Ann. Scuola Norm. Super. Pisa (2) 9, (1940). MR0001365. J. Vindas, Chebyshev estimates for Beurling generalized prime numbers. I., J. Number Theory 132 (2012), 2371–2376. MR2944760. , Chebyshev upper estimates for Beurling’s generalized prime numbers, Bull. Belg. Math. Soc. Simon Stevin 20 (2013), 175–180. MR3082752. Wegmann, H., Beitr¨ age zur Zahlentheorie auf freien Halbgruppen. II, J. Reine Angew. Math. 221 (1966), 150–159, MR32:4098. (The result of Wegmann and the reference to Wirsing are not correctly stated in the review.) W.-B. Zhang, Asymptotic distribution of Beurling’s generalized prime numbers and integers, Ph.D. Thesis, University of Illinois, Urbana, 1986. , Chebyshev type estimates for Beurling generalized prime numbers, Proc. Amer. Math. Soc. 101 (1987), 205–212. MR88m:11081. , A generalization of Halsz’s theorem to Beurling’s generalized integers and its application, Illinois J. Math. 31 (1987), 645–664. MR89a:11102. , Density and O-density of Beurling generalized integers, J. Number Theory 30 (1988), 120–139. MR90a:11111. , Beurling primes with RH and Beurling primes with large oscillation, Math. Ann. 337 (2007), 671–704. MR2007k:11148. , Wiener-Ikehara theorems and the Beurling generalized primes, Monatsh. Math. 174 (2014), 627–652. MR3233115. , Extensions of Beurling’s prime number theorem, Int. J. Number Theory 11 (2015), 1589–1616. MR3398753.
Index

D(x), divisor counting function, 29
L operator, 16
T operator, 16
Π_c(x), continuous prime counting function, 5
V function class, 9
V(c), dV(c), 21
li(x), 139
ω function, 233
φ_c(x, y), 233
ψ_c(x, y), 230
ρ function, 230
σ-norm, 11
dV measure class, 10

absolutely convergent, 11
Apostol, T. M., xi
abscissa of convergence, 10
Axer’s Theorem, 63, 154

Bateman, P. T., v, xi, 161
Bernoulli variables, 196
Bernstein inequality, 196
Beurling, A., xi, 1, 40
Bochner PNT proof, 110
Borel measure, 9
Buchstab identity for ψ_c(x, y), 237
Buchstab omega function, 233

Chandrasekharan, K., xi
Chebyshev bounds, 19, 53, 83, 111
Chebyshev identity, 19, 27
Chebyshev identity for φ_c(x, y), 237
Cohen, P. J., xi
continuous function example, 5
continuous prime number theory, 2
convergence of measures, 11
convolution of functions, 15
convolution of measures, 12
counting function of g-integers, 3
counting functions of g-primes, 2
cumulative distribution function, 10

Davenport, H., xi
de Bruijn, N. G., 230
de la Vallée Poussin, Ch. J., 1, 195
density, 3, 63
derivation, 17
Diamond, H. G., xi, 111, 161
Dickman rho function, 230
distribution function, 10
divisor function, 29
dlVP type remainder term, 175

equivalent, 151, 160
Euler product representation, 3
exponential representation, 19

Fejér kernel, 101
Friedlander, J., 235
function class V, 9

g- (generalized), 1
g-number counting functions, 2, 3
g-number system, 6
g-prime system, 2
g-zeta function, 3
generalized Dirichlet series, 3

Hall, R. S., 95, 193
Hardy-Littlewood-Karamata Theorem, 43
Hilberdink, T., 6
Hildebrand identity for ψ_c(x, y), 237
Hildebrand, A. J., xi

indicator function of a set, 12
Ingham, A. E., xi
invertible, 22

Kahane, J.-P., xi, 111
Kolmogorov inequality, 196

Landau, E., xi, 195
logarithmic density, 30, 41, 53
logarithmic integral, 5
lower density, 53
lower logarithmic density, 30
lower-residue, 34

measure, 9
Mellin integral, 10
Mellin transform, 3, 10
Mertens’ product formula, 46
Mertens’ sum formula, 45
Montgomery, H. L., xi
multiplicative convolution of measures, 12

norm, 11
Nyman type remainder term, 175
Nyman, B., 192

O-density, 3, 53
O-log density, 30
odd number example, 4
optimality of a Chebyshev bound, 119
optimality of PNT error term, 195
outer g-number system, 6

Plancherel’s identity, 138, 162
PNT (prime number theorem), 1
Poisson summation formula, 169
polynomial growth, 9

random g-primes, 196
regular growth, 39, 79
remainder term in PNT, 175
repeated prime example, 4
Riemann hypothesis, 1, 195
Riemann Hypothesis example, 205
Riemann-Lebesgue lemma, 107
right continuous, 9
right hand residue, 41, 63

Saias, E., 235
Schlage-Puchta, J. C., 148
Schwartz function, 166
sharp Mertens relation, 151
smooth numbers, 229

Tauber’s Theorem, 49
template distribution, 196
Tenenbaum, G., xi
thin set of primes, 37
total variation function, 9
Tschakaloff, L., 193

unboundedness of π(x), 77
upper density, 53
upper logarithmic density, 30
upper-residue, 30

Vaughan, R. C., xi
Vindas, J., 111, 117, 148

Weber, 195
Wegmann, H., 192
Wiener-Ikehara Theorem, 100
Wiener-Ikehara upper, lower bounds, 100
wobbly g-prime function, 139

Zhang, W.-B., 111, 117
“Generalized numbers” is a multiplicative structure introduced by A. Beurling to study how independent prime number theory is from the additivity of the natural numbers. The results and techniques of this theory apply to other systems having the character of prime numbers and integers; for example, they are used in the study of the prime number theorem (PNT) for ideals of algebraic number fields. Using both analytic and elementary methods, this book presents many old and new theorems, including several of the authors’ results, and many examples of extremal behavior of g-number systems. The authors also give detailed accounts of the PNT of J.-P. Kahane and of the example created with H. L. Montgomery, showing that additive structure is needed for proving the Riemann Hypothesis. Other topics discussed are propositions “equivalent” to the PNT, the role of multiplicative convolution and Chebyshev's prime number formula for g-numbers, and how Beurling theory provides an interpretation of the smooth number formulas of Dickman and de Bruijn.