Number Theory, Fourier Analysis and Geometric Discrepancy (London Mathematical Society Student Texts, Series Number 81) [1 ed.] 1107044030, 9781107044036

The study of geometric discrepancy, which provides a framework for quantifying the quality of a distribution of a finite


English Pages 252 [249] Year 2014

Table of contents :
Cover
Title: Number Theory, Fourier Analysis and Geometric Discrepancy
Contents
Introduction
1 Prelude
2 Arithmetic functions and integer points
3 Congruences
4 Quadratic reciprocity and Fourier series
5 Sums of squares
6 Uniform distribution and completeness of the trigonometric system
7 Discrepancy and trigonometric approximation
8 Integer points and Poisson summation formula
9 Integer points and exponential sums
10 Geometric discrepancy and decay of Fourier transforms
11 Discrepancy in high dimension and Bessel functions
References
Index


Author / Uploaded
Giancarlo Travaglini


London Mathematical Society Student Texts 81

Number Theory, Fourier Analysis and Geometric Discrepancy
GIANCARLO TRAVAGLINI
Università di Milano-Bicocca

University Printing House, Cambridge CB2 8BS, United Kingdom

Cambridge University Press is part of the University of Cambridge. It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/9781107044036

© G. Travaglini 2014

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2014
Printed in the United Kingdom by Clays, St Ives plc
A catalogue record for this publication is available from the British Library

Library of Congress Cataloguing in Publication data
Travaglini, Giancarlo, author.
Number theory, Fourier analysis and geometric discrepancy / Giancarlo Travaglini.
pages cm. – (London Mathematical Society student texts)
Includes bibliographical references and index.
ISBN 978-1-107-04403-6 (hardback) – ISBN 978-1-107-61985-2 (paperback)
1. Number theory – Textbooks. I. Title.
QA241.T68 2014
512.7–dc23 2014004844

ISBN 978-1-107-04403-6 Hardback
ISBN 978-1-107-61985-2 Paperback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.



To Frances

Contents

Introduction

PART ONE  ELEMENTARY NUMBER THEORY

1 Prelude
  1.1 Prime numbers and factorization
  1.2 Chebyshev’s theorem
2 Arithmetic functions and integer points
  2.1 Arithmetic functions and Dirichlet product
  2.2 μ(n)
  2.3 d(n) and σ(n)
  2.4 r(n)
  2.5 φ(n)
3 Congruences
  3.1 Basic properties
  3.2 The theorems of Fermat and Euler
  3.3 Almost prime numbers
  3.4 The Chinese remainder theorem
  3.5 The RSA encryption
4 Quadratic reciprocity and Fourier series
  4.1 Quadratic residues
  4.2 Gauss sums
  4.3 Fourier series
  4.4 Another proof of the quadratic reciprocity law
5 Sums of squares
  5.1 The theorems of Minkowski and Dirichlet
  5.2 Sums of two squares
  5.3 Gaussian integers
  5.4 Sums of four squares

PART TWO  FOURIER ANALYSIS AND GEOMETRIC DISCREPANCY

6 Uniform distribution and completeness of the trigonometric system
  6.1 Kronecker’s theorem
  6.2 Completeness of the trigonometric system
  6.3 The Weyl criterion
  6.4 Normal numbers
  6.5 Benford’s law
7 Discrepancy and trigonometric approximation
  7.1 One-sided trigonometric approximation
  7.2 The Erdős–Turán inequality
  7.3 The inequalities of Koksma and Koksma–Hlawka
8 Integer points and Poisson summation formula
  8.1 Fourier integrals
  8.2 The Poisson summation formula
  8.3 The Gauss circle problem
  8.4 Integer points in convex bodies
9 Integer points and exponential sums
  9.1 Preliminary results
  9.2 Van der Corput’s method and the Dirichlet divisor problem
10 Geometric discrepancy and decay of Fourier transforms
  10.1 Choosing N points in a cube
  10.2 Roth’s theorem
  10.3 Average decay of Fourier transforms
  10.4 Irregularities of distribution for convex bodies
11 Discrepancy in high dimension and Bessel functions
  11.1 Bessel functions
  11.2 Deterministic and probabilistic discrepancies
  11.3 The case d = 5, 9, 13, ...

References
Index

Introduction

Through this book we wish to achieve and connect the following three goals:
1) to present some elementary results in number theory;
2) to introduce classical and recent topics on the uniform distribution of infinite sequences and on the discrepancy of finite sequences in several variables;
3) to present a few results in Fourier analysis and use them to prove some of the theorems discussed in the two previous points.

The first part of this book is dedicated to the first goal. The reader will find some topics typically presented in introductory books on number theory: factorization, arithmetic functions and integer points, congruences and cryptography, quadratic reciprocity, and sums of two and four squares. Starting from the first few pages we introduce some simple and captivating findings, such as Chebyshev’s theorem and the elementary results for the Gauss circle problem and for the Dirichlet divisor problem, which may lead the reader to a deeper study of number theory, particularly students who are interested in calculus and analysis.

In the second part we start with the uniformly distributed sequences, introduced in 1916 by Weyl and related to the strong law of large numbers and to Kronecker’s approximation theorem. Then we introduce the definition of discrepancy, which is the quantitative counterpart of the uniform distribution and has natural applications in the computation of high-dimensional integrals. For the particular case of integer points we use different techniques to prove some classical but not trivial results for the Gauss circle problem and for the Dirichlet divisor problem. Then we introduce the geometric discrepancy, also known as irregularities of distribution because some of its main results show the existence of unavoidable errors in the approximation of a continuous object by a discrete sampling. This theory has grown over the last 60 years thanks to

the contributions by Roth, Schmidt, Beck and other authors, and it is presently a crossroads between number theory, combinatorics, Fourier analysis, algorithms and complexity, probability and numerical analysis [49]. Its current applications range from traditional science and engineering to modern computer science and financial mathematics [43]. A large number of the results in this book are proved through Fourier analytic arguments: pointwise convergence of Fourier series, completeness of the trigonometric system, trigonometric approximation, Poisson summation formula, exponential sums, decay of Fourier transforms and Bessel functions. The result is a short and self-contained course on Fourier analysis, which we present in parallel with the two previous points. This book is based on a number theory course of 60–70 hours given by the author at the University of Milano-Bicocca for several years. It was a postgraduate course, but many undergraduate students attended it with success. The prerequisites are limited: no prior knowledge of number theory or Fourier analysis is necessary, we assume a bit of algebra and a solid background in calculus, we need the Lebesgue integral, but we use complex analysis only in the last chapter. The lecture notes of the above course first appeared in the Italian book Appunti su teoria dei numeri, analisi di Fourier e distribuzione di punti [173]. We wish to thank the Unione Matematica Italiana who generously released the rights to this revised and expanded English version. We are very grateful to the students who attended the course. We also wish to thank Anatoly Podkorytov, Giacomo Gigante, Leonardo Colzani, Luca Brandolini and William Chen, who have read the original draft. We thank William Chen also for permission and encouragement to freely use his notes [46] of the number theory courses he taught at Imperial College London. 
These notes were modified many times from notes used by various colleagues over many years, both at Imperial College London and University College London. It seems likely that Davenport is the original source of these notes. We are also happy to thank all the people at Cambridge University Press who were involved in publishing this book, in particular Roger Astley, Roisin Munnelly, Samuel Harrison and Joanna Breeze. Giancarlo Travaglini March 2014

1 Prelude

Number theory deals with the properties of the positive integers, which were probably the first mathematical objects discovered by human beings. In this chapter we shall initially study the factorization of positive integers into primes, a basic result called the fundamental theorem of arithmetic. The possibly exaggerated title ‘Prelude’ refers to the second section, where we introduce Chebyshev’s theorem on the distribution of prime numbers. This result is remarkable and yet rather easy to understand, and it may encourage the reader to approach more advanced topics in number theory. For the first part of this book we have used various references, including [3, 4, 6, 8, 9, 42, 46, 63, 68, 72, 76, 90, 93, 96, 101, 103, 108, 119, 120, 127, 128, 136, 145, 151, 165].

Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 04 Jun 2017 at 03:10:35, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781107358379.002

1.1 Prime numbers and factorization

We shall denote by $\mathbb{N} = \{1, 2, \dots\}$ the set of natural numbers and by $\mathbb{Z}$ the set of integers. We shall say that $0 \ne b \in \mathbb{Z}$ divides $a \in \mathbb{Z}$ if there exists $c \in \mathbb{Z}$ such that $a = bc$. In this case we shall write $b \mid a$. If $b$ does not divide $a$ we shall write $b \nmid a$. We know (footnote 1) that, given $a \in \mathbb{Z}$ and $b \in \mathbb{N}$, there exist (unique) $q, r \in \mathbb{Z}$ such that $a = bq + r$, with $0 \le r < b$. We present the following consequence.

Footnote 1. The set $\{a - xb : x \in \mathbb{Z}\} \cap (\mathbb{N} \cup \{0\})$ is not empty and let $r = a - qb$ be its minimum. Observe that $(0 \le)\, r < b$, otherwise we would have $r = a - qb \ge b$ and $0 \le a - (q+1)b < r$, against the minimality of $r$. Now assume that $a = bq_1 + r_1 = bq_2 + r_2$, then $|r_1 - r_2| = b\,|q_2 - q_1|$. If $q_1 \ne q_2$ then $|r_1 - r_2| = b\,|q_2 - q_1| \ge b$, which is impossible since $0 \le r_1, r_2 < b$.

Theorem 1.1 Let $b > 1$ be an integer. Then every $a \in \mathbb{N}$ can be written in one and only one way in base $b$:
$$a = c_0 + c_1 b + c_2 b^2 + \dots + c_n b^n, \tag{1.1}$$
where $n \ge 0$ and $0 \le c_j < b$ for $j = 0, \dots, n-1$, while $1 \le c_n < b$.

Proof Let us prove by induction that if $b^n \le a < b^{n+1}$, then $a$ can be written in base $b$. This is true for $n = 0$ and we assume that it is true for every $0 \le m \le n-1$. Since $b^n \le a < b^{n+1}$ we have $a = c_n b^n + r$, with $0 \le r < b^n$ and $1 \le c_n < b$. If $r = 0$, then $a$ is written as in (1.1). If $r > 0$, we recall that $r < b^n$ and thus $b^m \le r < b^{m+1}$ for some $m \le n-1$. We now use the induction assumption to write
$$r = p_0 + p_1 b + p_2 b^2 + \dots + p_m b^m$$
with $0 \le p_j < b$ for $j = 0, \dots, m$. Then
$$a = p_0 + p_1 b + p_2 b^2 + \dots + p_m b^m + c_n b^n.$$
Finally, we assume that there are two ways to write $a$ in (1.1). By suitably subtracting them we obtain
$$0 = q_0 + q_1 b + q_2 b^2 + \dots + q_k b^k$$
with $q_k \ge 1$ ($k = 0$ gives a contradiction, so we assume that $k \ge 1$). For every $0 \le j \le k-1$ we have $|q_j| \le b - 1$. Then we have
$$q_k b^k = -\left(q_0 + q_1 b + q_2 b^2 + \dots + q_{k-1} b^{k-1}\right)$$
and
$$b^k \le q_k b^k \le |q_0| + |q_1|\, b + |q_2|\, b^2 + \dots + |q_{k-1}|\, b^{k-1} \le (b-1)\left(1 + b + b^2 + \dots + b^{k-1}\right) = b^k - 1,$$
which is impossible.
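The expansion in Theorem 1.1 can be computed by repeated division with remainder. The following is a minimal Python sketch under that assumption; the function names `to_base` and `from_base` are illustrative and do not come from the book.

```python
def to_base(a: int, b: int) -> list[int]:
    """Digits [c_0, c_1, ..., c_n] of a in base b, least significant first."""
    if a < 1 or b < 2:
        raise ValueError("need a >= 1 and b >= 2")
    digits = []
    while a > 0:
        a, r = divmod(a, b)  # a = b*q + r with 0 <= r < b
        digits.append(r)
    return digits


def from_base(digits: list[int], b: int) -> int:
    """Rebuild a = c_0 + c_1*b + ... + c_n*b**n from its digits."""
    return sum(c * b**j for j, c in enumerate(digits))


print(to_base(2014, 7))                # [5, 0, 6, 5]
print(from_base(to_base(2014, 7), 7))  # 2014
```

The last digit produced is always nonzero, which matches the condition on the leading coefficient in the theorem.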

The following theorem introduces the definition of greatest common divisor.

Theorem 1.2 Let $a, b$ be two integers not both zero. Then there exists a unique $d \in \mathbb{N}$ such that
(i) there exist $x, y \in \mathbb{Z}$ satisfying $d = ax + by$,
(ii) $d \mid a$ and $d \mid b$,
(iii) if $k \in \mathbb{N}$ divides $a$ and $b$, then it also divides $d$.

Proof Let
$$I := \{au + bv\}_{u, v \in \mathbb{Z}}.$$
Then $I \cap \mathbb{N}$ is not empty and let $d = \min(I \cap \mathbb{N})$. Observe that $d$ trivially satisfies (i) and (iii). In order to show that $d$ satisfies (ii), it is enough to prove that $d$ divides every element in $I$. Indeed, let $au + bv = z \in I$, assume that $q \in \mathbb{Z}$ and $0 \le r < d$ satisfy $z = dq + r$. Then
$$r = z - dq = au + bv - (ax + by)q = a(u - xq) + b(v - yq) \in I.$$
Since $0 \le r < d = \min(I \cap \mathbb{N})$ we deduce that $r = 0$, thus $d$ divides $z$. The uniqueness follows from (ii) and (iii).

The number $d$ is called the greatest common divisor (gcd) of $a$ and $b$. We shall write $d = (a, b)$. When $(a, b) = 1$ we shall say that $a$ and $b$ are coprime. Observe that $(a, b) = 1$ if and only if there exist integers $x, y$ such that $ax + by = 1$.

Theorem 1.3 (Euclid’s lemma) Let $a, b, c$ satisfy $a \mid bc$ and $(a, b) = 1$. Then $a \mid c$.

Proof Let $x$ and $y$ satisfy $ax + by = 1$. Then $c = cax + cby$. Since $a \mid acx$ and $a \mid bcy$, we deduce that $a \mid c$.

We are going to describe a famous method, called the Euclidean algorithm, which gives the gcd of two positive integers. We need a lemma.

Lemma 1.4 Let $a, b, q, r \in \mathbb{N}$ satisfy $a = qb + r$. Then $(a, b) = (b, r)$.

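Proof If $t \mid b$ and $t \mid r$, then $t \mid a$. In particular, $(b, r) \mid a$. Since $(b, r) \mid b$ we deduce that $(b, r) \mid (a, b)$. In the same way we see that $(a, b) \mid (b, r)$.

The Euclidean algorithm uses the previous lemma to compute the gcd of two integers $a \ge b > 0$. Indeed, let us write
$$\begin{aligned}
a &= q_1 b + r_1 && \text{with } 0 < r_1 < b \\
b &= q_2 r_1 + r_2 && \text{with } 0 < r_2 < r_1 \\
r_1 &= q_3 r_2 + r_3 && \text{with } 0 < r_3 < r_2 \\
r_2 &= q_4 r_3 + r_4 && \text{with } 0 < r_4 < r_3 \\
&\;\;\vdots \\
r_{n-3} &= q_{n-1} r_{n-2} + r_{n-1} && \text{with } 0 < r_{n-1} < r_{n-2} \\
r_{n-2} &= q_n r_{n-1}.
\end{aligned}$$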
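The chain of divisions above translates directly into a loop. The sketch below is an illustration rather than the book's own presentation: `euclid_steps` records each division, and `extended_gcd` additionally tracks coefficients $x, y$ with $d = ax + by$ as in Theorem 1.2 (i). Both function names are assumptions made here.

```python
def euclid_steps(a: int, b: int):
    """Run the Euclidean algorithm on a >= b > 0.

    Returns (gcd, steps), where each step (A, q, B, r) records A = q*B + r.
    """
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, q, b, r))
        a, b = b, r  # Lemma 1.4: (a, b) = (b, r)
    return a, steps


def extended_gcd(a: int, b: int):
    """Return (d, x, y) with d = gcd(a, b) = a*x + b*y."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0


d, steps = euclid_steps(252, 198)
print(d)                       # 18
for A, q, B, r in steps:
    print(f"{A} = {q}*{B} + {r}")
print(extended_gcd(252, 198))  # (18, 4, -5)
```

The last nonzero remainder is the gcd, since every division leaves the gcd of the current pair unchanged.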

[...] We shall say that an integer $p > 1$ is a prime number if it has only two positive divisors (namely $1$ and $p$). We shall write $\mathcal{P}$ for the set of prime numbers. We shall say that an integer $n > 1$ is a composite number if $n \notin \mathcal{P}$.

Lemma 1.7 Let $a, b \in \mathbb{N}$ and let $p$ be a prime. If $p \mid ab$, then $p \mid a$ or $p \mid b$.

Proof $p \in \mathcal{P}$ and $p \nmid a$ imply $(a, p) = 1$. Then Theorem 1.3 gives $p \mid b$.

By applying the previous lemma several times we obtain the following result.

Lemma 1.8 Let $a_1, a_2, \dots, a_k \in \mathbb{N}$ and let $p$ be a prime. If $p \mid (a_1 a_2 \cdots a_k)$, then $p \mid a_j$ for some $j = 1, \dots, k$.

We can now introduce the fundamental theorem of arithmetic.

Theorem 1.9 (Fundamental theorem of arithmetic) Every integer $n > 1$ can be written in a unique way (up to permutation) as
$$n = p_1^{m_1} p_2^{m_2} \cdots p_N^{m_N}, \tag{1.3}$$
where the numbers $p_j$ are prime and the numbers $m_j$ are positive integers.

In certain cases it may be useful to write $n = p_1^{m_1} p_2^{m_2} \cdots p_N^{m_N} \cdots$ as an infinite product, where $\{p_j\}_{j=1}^{+\infty} = \mathcal{P}$ and all but a finite number of exponents $m_j$ are zero. (1.3) is called the canonical decomposition (or factorization) of $n$.

Proof We shall use induction to prove that every integer $n \ge 2$ can be written as a product of prime numbers. This is true for $2$ and we consider $n > 2$. Assume that every $2 \le m \le n-1$ can be written as a product of prime numbers. If $n \in \mathcal{P}$, we are done. If not, let $n = n_1 n_2$ with $1 < n_1, n_2 < n$. Then, by the induction assumption, $n_1$ and $n_2$ are products of prime numbers. Then the same is true for $n$.

Now we shall prove the uniqueness of the decomposition. Let $A \subset \mathbb{N}$ be the set of natural numbers with more than one canonical decomposition, and assume that $A$ is not empty. Let $M = \min A$. Then we may write
$$M = p_1 p_2 \cdots p_r = p'_1 p'_2 \cdots p'_s,$$
where $p_1, p_2, \dots, p_r$ and $p'_1, p'_2, \dots, p'_s$ are prime numbers (possibly repeated), and the two products differ for at least one term. By Lemma 1.8 we deduce that $p_1 = p'_j$ for a suitable $j$. Then
$$\widetilde{M} = p_2 \cdots p_r = p'_1 \cdots p'_{j-1} p'_{j+1} \cdots p'_s$$
is smaller than $M$ and admits two different canonical decompositions, against the minimality of $M$.

See [20, 6.1] for a comment on the above theorem and its history. See also [61, Ch. 8] and [85].

The fundamental theorem of arithmetic allows us to write the gcd as follows. Let $a, b \in \mathbb{N}$. We may write $a = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_s^{\alpha_s}$ and $b = p_1^{\beta_1} p_2^{\beta_2} \cdots p_s^{\beta_s}$ (with the same prime numbers in the two products) as long as we allow some of the exponents to be zero. Then
$$(a, b) = p_1^{\min(\alpha_1, \beta_1)} p_2^{\min(\alpha_2, \beta_2)} \cdots p_s^{\min(\alpha_s, \beta_s)}.$$
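As an illustration of the formula above, here is a short Python sketch that computes canonical decompositions by trial division and builds the gcd from the minimum exponents. Trial division is chosen here only for brevity and is an assumption of this example, as are the names `factorize` and `gcd_from_factorizations`.

```python
from collections import Counter
from math import gcd


def factorize(n: int) -> Counter:
    """Canonical decomposition of n >= 1 as {prime: exponent} (empty for n = 1)."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:  # whatever is left is a prime factor
        factors[n] += 1
    return factors


def gcd_from_factorizations(a: int, b: int) -> int:
    """gcd as the product of p**min(alpha_p, beta_p) over the common primes."""
    fa, fb = factorize(a), factorize(b)
    d = 1
    for p in fa.keys() & fb.keys():
        d *= p ** min(fa[p], fb[p])
    return d


print(dict(factorize(360)))              # {2: 3, 3: 2, 5: 1}
print(gcd_from_factorizations(360, 84))  # 12
print(gcd(360, 84))                      # 12, via the Euclidean algorithm
```

Primes that appear in only one of the two numbers have minimum exponent zero, so restricting the product to the common primes gives the same value.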

Observe that the Euclidean algorithm does not need the fundamental theorem of arithmetic.

Euclid proved the infinitude of primes.

Theorem 1.10 (Euclid) There are infinitely many prime numbers.

Proof Assume that P = {p1 , p2 , . . . , pN }. Then every other number should be a product of elements in P. But this is impossible for (p1 p2 · · · pN ) + 1. Here we can see the original proof. ‘Prime numbers are more than any assigned multitude of prime numbers. Let A, B, and C be the assigned prime numbers. I say that there are more prime numbers than A, B, and C. Take the least number DE measured by A, B, and C. Add the unit DF to DE. Then EF is either prime or not. First, let it be prime. Then the prime numbers A, B, C, and EF have been found which are more than A, B, and C. Next, let EF not be prime. Therefore it is measured by some prime number. Let it be measured by the prime number G. I say that G is not the same with any of the numbers A, B, and C. If possible, let it be so. Now A, B, and C measure DE, therefore G also measures DE. But it also measures EF. Therefore G, being a number, measures the remainder, the unit DF, which is absurd. Therefore G is not the same with any one of the numbers A, B, and C. And by hypothesis it is prime. Therefore the prime numbers A, B, C, and G have been found which are more than the assigned multitude of A, B, and C. Therefore, prime numbers are more than any assigned multitude of prime numbers.’ (Euclid, Elements, Book IX)

Observe that the above argument does not offer an instrument to find prime numbers. Indeed, $(2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13) + 1 = 30031 = 59 \cdot 509$ is a composite number.

The following result was proved by Euler in 1737.

Theorem 1.11 (Euler)
$$\sum_{p \in \mathcal{P}} \frac{1}{p} = +\infty. \tag{1.4}$$

First proof For every real number $x \ge 2$ we write
$$P_x := \prod_{\mathcal{P} \ni p \le x} \left(1 - \frac{1}{p}\right)^{-1}.$$
Since
$$\log(1 + \varepsilon) = \varepsilon - \frac{1}{2}\varepsilon^2 + \frac{1}{3}\varepsilon^3 - \frac{1}{4}\varepsilon^4 + \dots$$
for $-1 < \varepsilon < 1$, we have
$$\begin{aligned}
\log P_x &= -\sum_{\mathcal{P} \ni p \le x} \log\left(1 - \frac{1}{p}\right) = \sum_{\mathcal{P} \ni p \le x} \left(\frac{1}{p} + \frac{1}{2p^2} + \frac{1}{3p^3} + \frac{1}{4p^4} + \dots\right) \\
&\le \sum_{\mathcal{P} \ni p \le x} \frac{1}{p}\left(1 + \frac{1}{2^2} + \frac{1}{2^3} + \frac{1}{2^4} + \dots\right) = \frac{3}{2} \sum_{\mathcal{P} \ni p \le x} \frac{1}{p}.
\end{aligned}$$
To complete the proof we show that $P_x \to +\infty$ when $x \to +\infty$. Indeed, let $p_1 < p_2 < p_3 < \dots < p_N$ be the prime numbers $\le x$. Since
$$\left(1 - \frac{1}{p}\right)^{-1} = \sum_{j=0}^{+\infty} p^{-j}$$
we have
$$\begin{aligned}
P_x &= \prod_{\mathcal{P} \ni p \le x} \left(1 - \frac{1}{p}\right)^{-1} = \prod_{j=1}^{N} \left(1 - \frac{1}{p_j}\right)^{-1} \\
&= \left(1 + \frac{1}{p_1} + \frac{1}{p_1^2} + \dots\right)\left(1 + \frac{1}{p_2} + \frac{1}{p_2^2} + \dots\right) \cdots \left(1 + \frac{1}{p_N} + \frac{1}{p_N^2} + \dots\right) \\
&= 1 + \left(\frac{1}{p_1} + \frac{1}{p_2} + \dots + \frac{1}{p_N}\right) + \left(\frac{1}{p_1^2} + \frac{1}{p_1 p_2} + \frac{1}{p_2^2} + \frac{1}{p_1 p_3} + \dots\right) + \left(\frac{1}{p_1^3} + \frac{1}{p_1^2 p_2} + \dots\right) + \dots \\
&= \sum_{\substack{n = p_1^{m_1} p_2^{m_2} \cdots p_N^{m_N} \\ p_1 \le x,\ p_2 \le x,\ \dots,\ p_N \le x}} \frac{1}{n} \;\ge\; \sum_{k \le x} \frac{1}{k} \longrightarrow +\infty
\end{aligned} \tag{1.5}$$
as $x \to +\infty$.
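The two inequalities used in the first proof, namely $P_x \ge \sum_{k \le x} 1/k$ and $\log P_x \le \frac{3}{2}\sum_{p \le x} 1/p$, can be checked numerically. The sketch below does so in floating point for a few values of $x$; the sieve helper `primes_up_to` and the function `euler_product_data` are names introduced only for this example.

```python
from math import log


def primes_up_to(x: float) -> list[int]:
    """Sieve of Eratosthenes: all primes p <= x (for x >= 2)."""
    n = int(x)
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for m in range(i * i, n + 1, i):
                sieve[m] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def euler_product_data(x: float):
    """Return (P_x, sum_{k<=x} 1/k, sum_{p<=x} 1/p) as floats."""
    primes = primes_up_to(x)
    P_x = 1.0
    for p in primes:
        P_x *= 1.0 / (1.0 - 1.0 / p)
    harmonic = sum(1.0 / k for k in range(1, int(x) + 1))
    reciprocal_primes = sum(1.0 / p for p in primes)
    return P_x, harmonic, reciprocal_primes


for x in (10, 100, 1000, 10000):
    P_x, H, S = euler_product_data(x)
    print(x, round(P_x, 3), round(H, 3), round(log(P_x), 3), round(1.5 * S, 3))
```

In each printed row the second column dominates the third and the fourth stays below the fifth, which is what the proof requires; the slow growth of all four columns reflects how slowly the series of reciprocal primes diverges.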

An argument similar to the one in (1.5) shows that for every real number $s > 1$ we have
$$\sum_{n=1}^{+\infty} \frac{1}{n^s} = \prod_{p \in \mathcal{P}} \left(1 - p^{-s}\right)^{-1}. \tag{1.6}$$
The function
$$\zeta(s) := \sum_{n=1}^{+\infty} \frac{1}{n^s}$$
is called the Riemann zeta function and it plays a fundamental role in the study of prime numbers (see [102] or the short introductions in [8] or [10]).

Let us now see a different proof of (1.4), which we owe to Clarkson [56].

Second proof of Theorem 1.11 Let $p_1 < p_2 < p_3 < \dots$ be all the prime numbers, and let us assume that $\sum_{p \in \mathcal{P}} 1/p$ converges. Then there exists a positive integer $k$ such that
$$\sum_{n > k} \frac{1}{p_n} < \frac{1}{2}. \tag{1.7}$$
Let $Q = p_1 p_2 \cdots p_k$. For no $\ell \in \mathbb{N}$ does there exist $p_j$ (with $1 \le j \le k$) such that $p_j \mid (1 + \ell Q)$. Then the prime divisors of $1 + \ell Q$ must be found among $p_{k+1}, p_{k+2}, \dots$ and, for every $N \ge 1$, we have
$$\sum_{\ell=1}^{N} \frac{1}{1 + \ell Q} \le \sum_{m=1}^{+\infty} \left(\sum_{n > k} \frac{1}{p_n}\right)^{m}. \tag{1.8}$$
In order to prove (1.8), we start by showing that every term $(1 + \ell Q)^{-1}$ in the LHS appears also in the RHS. Let, say,
$$\frac{1}{1 + \ell Q} = \frac{1}{p_{k+2}^3\, p_{k+5}\, p_{k+9}^4}.$$
Then $\left(p_{k+2}^3\, p_{k+5}\, p_{k+9}^4\right)^{-1}$ appears inside $\left(\sum_{n > k} \frac{1}{p_n}\right)^{8}$. In order to end the proof of (1.8) we observe that the terms $1 + \ell Q$ are distinct. Then (1.7) implies
$$\sum_{\ell=1}^{N} \frac{1}{1 + \ell Q} \le \sum_{m=1}^{+\infty} \left(\frac{1}{2}\right)^{m} = 1.$$
This is impossible because $\sum_{\ell=1}^{+\infty} \frac{1}{1 + \ell Q} = +\infty$.

Remark 1.12 We now want to estimate the divergence of the series $\sum_{p \in \mathcal{P}} 1/p$. We start by proving the inequality
$$\prod_{\mathcal{P} \ni p \le x} \left(1 - \frac{1}{p}\right)^{-1} \le \prod_{\mathcal{P} \ni p \le x} e^{p^{-1} + p^{-2}}. \tag{1.9}$$
Indeed, let $f(t) = (1 - t)\, e^{t + t^2}$. Then $f'(t) = t(1 - 2t)\, e^{t + t^2} \ge 0$ for every $t \in [0, 1/2]$. Since $f(0) = 1$, we have $f(t) \ge 1$, that is to say $\frac{1}{1 - t} \le e^{t + t^2}$ for every $t \in [0, 1/2]$. This implies (1.9). If we take logarithms on both sides, while recalling (1.5) and the inequalities (footnote 3)
$$\sum_{k \le x} \frac{1}{k} = \sum_{k=1}^{[x]} \frac{1}{k} \ge \int_1^{[x]+1} \frac{1}{x}\, dx = \log([x] + 1) > \log x,$$
we obtain
$$\log\log x < \log\left(\prod_{\mathcal{P} \ni p \le x} \left(1 - \frac{1}{p}\right)^{-1}\right) \le \log\left(\prod_{\mathcal{P} \ni p \le x} e^{p^{-1} + p^{-2}}\right) = \sum_{\mathcal{P} \ni p \le x} \left(p^{-1} + p^{-2}\right).$$
Since $\sum_{\mathcal{P} \ni p \le x} p^{-2} <$ [...]

Footnote 3. The integral part $[\alpha]$ of a real number $\alpha$ is the largest integer smaller than or equal to $\alpha$, for example, $[5] = 5$, $[e] = 2$, $[-\pi] = -4$.

1.2 Chebyshev’s theorem

[...] If $p > n$, then $p$ does not appear in the canonical decomposition of $n!$ (and $[n/p^j] = 0$ for every $j \ge 1$). If $p \le n$, then the $[n/p]$ numbers
$$p,\ 2p,\ 3p,\ \dots,\ \left[\frac{n}{p}\right] p \tag{1.12}$$
are all the positive integers not larger than $n$ that can be divided by $p$. Observe that $[n/p^2]$ numbers among those in (1.12) can also be divided by $p^2$. They are
$$p^2,\ 2p^2,\ 3p^2,\ \dots,\ \left[\frac{n}{p^2}\right] p^2$$
(of course $[n/p^2] = 0$ if $p^2 > n$). Then the power of $p$ in the canonical decomposition of $n!$ is at least $[n/p] + [n/p^2]$. Going on, we observe that the following $[n/p^3]$ numbers can be divided by $p^3$:
$$p^3,\ 2p^3,\ 3p^3,\ \dots,\ \left[\frac{n}{p^3}\right] p^3.$$
Then the power of $p$ in the canonical decomposition of $n!$ is at least $[n/p] + [n/p^2] + [n/p^3]$. In this way we see that the finite sum $\sum_{j \ge 1} [n/p^j]$ is the exponent of $p$ in the canonical decomposition of $n!$ (footnote 5).

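Footnote 5. The above lemma can be used to compute the canonical factorization of $n!$. Consider, for example, $10!$. The prime numbers not exceeding $10$ are $2, 3, 5, 7$. We have
$$\left[\frac{10}{2}\right] + \left[\frac{10}{2^2}\right] + \left[\frac{10}{2^3}\right] = 8, \quad \left[\frac{10}{3}\right] + \left[\frac{10}{3^2}\right] = 4, \quad \left[\frac{10}{5}\right] = 2, \quad \left[\frac{10}{7}\right] = 1.$$
Then $10! = 2^8 \cdot 3^4 \cdot 5^2 \cdot 7$.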
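The worked example for $10!$ can be reproduced with a few lines of Python. This is a minimal sketch of the sum $\sum_{j \ge 1} [n/p^j]$; the function names `legendre_exponent` and `factorial_factorization` are illustrative choices, and the small prime test by trial division is only meant for modest values of $n$.

```python
from math import factorial, prod


def legendre_exponent(n: int, p: int) -> int:
    """Exponent of the prime p in n!: the sum of floor(n / p**j) over j >= 1."""
    exponent, power = 0, p
    while power <= n:
        exponent += n // power
        power *= p
    return exponent


def factorial_factorization(n: int) -> dict[int, int]:
    """Canonical decomposition of n! as {prime: exponent}."""
    primes = [p for p in range(2, n + 1)
              if all(p % d for d in range(2, int(p ** 0.5) + 1))]
    return {p: legendre_exponent(n, p) for p in primes}


decomposition = factorial_factorization(10)
print(decomposition)  # {2: 8, 3: 4, 5: 2, 7: 1}
print(prod(p ** e for p, e in decomposition.items()) == factorial(10))  # True
```

The loop stops as soon as the power of $p$ exceeds $n$, which is why the sum is finite.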


Theorem 1.15 (Chebyshev) There exist two positive constants $c_1$, $c_2$ such that, for every real number $x \ge 2$,

$$c_1\,\frac{x}{\log x} \le \pi(x) \le c_2\,\frac{x}{\log x}\,. \tag{1.13}$$

Proof The main idea is to study the difference

$$\pi(2n) - \pi(n) = \operatorname{card}\,(P \cap (n, 2n])$$

for every integer $n \ge 2$. For this purpose we use the binomial coefficient

$$\binom{2n}{n} = \frac{2n\,(2n-1)\cdots(n+1)}{n\,(n-1)\cdots 1} \tag{1.14}$$

(i.e., the 'bisector' of Pascal's (or Tartaglia's) triangle) and observe that every number $p \in P \cap (n, 2n]$ divides the numerator in (1.14), but does not divide the denominator. Then $p \mid \binom{2n}{n}$, hence also $P_n := \prod_{p \in P \cap (n,2n]} p$ divides $\binom{2n}{n}$. Then we have

$$n^{\pi(2n)-\pi(n)} < P_n \le \binom{2n}{n}\,. \tag{1.15}$$

For every prime number $p$ let $r_p$ be the integer satisfying

$$p^{r_p} \le 2n < p^{r_p+1}\,. \tag{1.16}$$

We use Lemma 1.14 to compute the power of $p$ in the canonical decomposition of $\binom{2n}{n}$. Indeed, we know that $p$ appears with exponent $\sum_{j=1}^{+\infty} [2n/p^j]$ in $(2n)!$ and with exponent $2\sum_{j=1}^{+\infty} [n/p^j]$ in $(n!)^2$. Then it appears with exponent

$$\sum_{j=1}^{+\infty}\Bigl(\Bigl[\frac{2n}{p^j}\Bigr] - 2\Bigl[\frac{n}{p^j}\Bigr]\Bigr)$$

in the canonical decomposition of $\binom{2n}{n} = \frac{(2n)!}{(n!)^2}$. Now observe that, for every real number $\alpha$, the function $\alpha \mapsto [\alpha] - 2[\alpha/2]$ takes only the values 0 or 1. Indeed, $[\alpha] - 2[\alpha/2] \in \mathbb{Z}$ and

$$-1 = \alpha - 1 - 2\,\frac{\alpha}{2} < [\alpha] - 2\Bigl[\frac{\alpha}{2}\Bigr] < \alpha - 2\Bigl(\frac{\alpha}{2} - 1\Bigr) = 2\,.$$

[...] $> 1 \iff k <$ [...]

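$\binom{2n}{n}$ is the largest binomial coefficient in the expansion of $(1+1)^{2n}$. Hence

$$2^{2n} = (1+1)^{2n} = \sum_{k=0}^{2n}\binom{2n}{k} = 2 + \sum_{k=1}^{2n-1}\binom{2n}{k} \le 2 + (2n-1)\binom{2n}{n} \le 2n\binom{2n}{n}\,.$$

Then

$$\binom{2n}{n} \ge \frac{2^{2n}}{2n}\,. \tag{1.19}$$

Moreover, we obviously have

$$\binom{2n}{n} < \sum_{k=0}^{2n}\binom{2n}{k} = 2^{2n} \tag{1.20}$$

(this upper bound will be used later). Then (1.18) and (1.19) give

$$\frac{2^{2n}}{2n} \le \binom{2n}{n} \le (2n)^{\pi(2n)}\,,$$

that is,

$$\pi(2n) \ge \frac{n \log 2}{\log(2n)}\,.$$

From this we deduce that for every real number $x \ge 2$ we have

$$\pi(x) \ge \frac{1}{4}\,\log 2\;\frac{x}{\log x}\,. \tag{1.21}$$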

Observe that a simple use of Stirling's formula ($n! \sim \sqrt{2\pi n}\,(n/e)^n$, see Theorem 6.32) gives

$$\binom{2n}{n} \sim 4^n\,(n\pi)^{-1/2}\,.$$
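The inequalities (1.19), (1.20) and (1.21), together with the Stirling asymptotics for the central binomial coefficient, are easy to test numerically. The following Python sketch is only a sanity check on a finite range, not a proof; the helper names and the ranges chosen are our own assumptions.

```python
import math


def prime_counts(limit):
    """Return a list pi with pi[k] = number of primes <= k, for 0 <= k <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = False
    if limit >= 1:
        is_prime[1] = False
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    pi, count = [], 0
    for k in range(limit + 1):
        count += is_prime[k]
        pi.append(count)
    return pi


def check_binomial_bounds(n_max):
    """Check 4^n / (2n) <= C(2n, n) < 4^n, i.e. (1.19) and (1.20), for 1 <= n <= n_max."""
    return all(
        4 ** n <= 2 * n * math.comb(2 * n, n) and math.comb(2 * n, n) < 4 ** n
        for n in range(1, n_max + 1)
    )


def check_lower_bound(limit):
    """Check (1.21): pi(x) >= (log 2 / 4) x / log x for the integers 2 <= x <= limit."""
    pi = prime_counts(limit)
    return all(
        pi[x] >= math.log(2) / 4 * x / math.log(x) for x in range(2, limit + 1)
    )


def stirling_ratio(n):
    """C(2n, n) divided by 4^n / sqrt(pi n), computed through logarithms."""
    log_ratio = (
        math.log(math.comb(2 * n, n)) - n * math.log(4) + 0.5 * math.log(math.pi * n)
    )
    return math.exp(log_ratio)


if __name__ == "__main__":
    print(check_binomial_bounds(200), check_lower_bound(5000), stirling_ratio(500))
```

The ratio returned by `stirling_ratio` should approach 1 as $n$ grows, in agreement with the asymptotic formula above.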


As for the estimate from above, we observe that (1.15) and (1.20) imply $n^{\pi(2n)-\pi(n)} < 2^{2n}$, that is,

$$(\pi(2n) - \pi(n))\,\log n < 2n \log 2 \tag{1.22}$$

or π (2n)
16. We proceed by induction assuming (1.24) true for all indices between 2 and k − 1. Let n be the positive integer satisfying 2n − 2 < k ≤ 2n. Then (1.23) implies π (k) ≤ π (2n)
2 log 2

n . log (2n + 1)

This implies

$$\pi(x) \ge c_1\,\frac{x}{\log x}$$

for every $x \ge 2$. This proof is due to Tihomirov.

Remark 1.16 In (1.22) we have seen that

$$\pi(2n) - \pi(n) < \log 4\;\frac{n}{\log n}$$

for every integer $n \ge 2$. It is possible to prove a similar bound from below:

$$\pi(2n) - \pi(n) > c\,\frac{n}{\log n}\,.$$

This last inequality is a strong form of the so-called Bertrand's postulate (perhaps more correctly called the Bertrand–Chebyshev theorem, see [143]), which states that for every positive integer $n$ there is always at least one prime number $p$ such that $n < p \le 2n$. Bertrand's postulate is not a trivial result, since there exist arbitrarily long sequences of consecutive composite numbers (for every large integer $k$, consider $k! + 2,\ k! + 3,\ \ldots,\ k! + k$). Zhang has recently proved (see [183]) the following fundamental result.

Theorem 1.17 (Zhang) Let $p_n$ be the $n$th prime number. Then

$$\liminf_{n \to +\infty}\,(p_{n+1} - p_n) < +\infty\,.$$

Exercises

1) Let 1, 1, 2, 3, 5, 8, 13, 21, . . . be the Fibonacci sequence, defined by $a_1 = a_2 = 1$ and $a_n = a_{n-1} + a_{n-2}$ if $n \ge 3$. Prove that $(a_{n+1}, a_n) = 1$ for every $n$.
2) Find the positive integers $M$ which have 6 as their first digit and are equal to $M/25$ once the first digit is erased.
3) Let $a, b, c \in \mathbb{Z}$. Show that the equation $ax + by = c$ has an integer solution if and only if $(a, b) \mid c$.
4) Prove that if $2^n + 1$ is a prime number, then $n$ is a power of 2.
5) Prove that for all positive integers $a$ and $b$ we have $(2^a - 1,\ 2^b - 1) = 2^{(a,b)} - 1$.
6) Prove that 10 is the only positive composite number such that each one of its non-trivial divisors has the form $a^r + 1$ with integers $a \ge 1$ and $r \ge 2$.
7) Let $a$, $b$ be two positive integers with $(a, b) = 1$. Determine the largest integer $d$ that cannot be written as $d = ma + nb$ with $m$ and $n$ positive integers. This is part of the so-called coin problem. (Why this name?)
8) Let $k \ge 1$ and let $a_0, a_1, \ldots, a_k \in \mathbb{Z}$, with $a_k \ne 0$. For every positive integer $n$, let $Q(n) = \sum_{j=0}^{k} a_j n^j$. Prove that there exist infinitely many integers $n_0$ such that $Q(n_0) \notin P$.
9) Prove (1.6).
10) Prove that $\zeta(s) \sim \dfrac{1}{s-1}$ as $s \to 1^+$.
11) Euclid's proof of the infinitude of prime numbers shows that the (increasing) sequence $p_n$ of prime numbers satisfies $p_{N+1} \le 1 + \prod_{j=1}^{N} p_j$. Deduce the following result: $\pi(x) \ge c \log\log x$.
12) Write the prime numbers as an increasing sequence $p_1 < p_2 < \ldots$ and prove the existence of two positive constants $c_1$ and $c_2$ such that $c_1\, n \log(n) \le p_n \le c_2\, n \log(n)$.


2 Arithmetic functions and integer points

In number theory, an arithmetic function is a real or complex-valued function $f(n)$ defined on $\mathbb{N}$. Many of these functions represent important arithmetic properties of $n$; in particular, we shall see the Dirichlet function $d(n)$, which counts the positive divisors of $n$, the function $r(n)$, which gives the number of representations of $n$ as a sum of two squares, and the Euler function $\varphi(n)$, which says how many numbers between 1 and $n$ are coprime with $n$. Several important arithmetic functions show an irregular behaviour as $n$ grows. Consider, for example, $d(n)$: if $n$ is a prime number, then $d(n) = 2$, but this does not exclude that the number $n + 1$ may have many divisors, so that $d(n + 1)$ is large. It may therefore be difficult to consider the behaviour of a given arithmetic function $f(n)$ as $n$ tends to infinity. Sometimes it is useful to study the average

$$n \longmapsto \frac{1}{n}\sum_{j=1}^{n} f(j)\,,$$

as $n \to +\infty$. In certain cases the study of this arithmetic mean consists in counting the integer points (that is, the points with integral coordinates) inside suitable domains contained in the plane or in $\mathbb{R}^d$. As a simple yet amazing result, we shall see that (on average) a natural number $n$ can be written as a sum of two squares in ... $\pi$ ways. Several important arithmetic functions have algebraic properties, which we are going to introduce in a general setting.
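The claim about the average number of representations as a sum of two squares can be explored numerically. The sketch below assumes that $r(n)$ counts ordered pairs $(a, b)$ of integers (signs included) with $a^2 + b^2 = n$, so that $\sum_{n \le N} r(n)$ counts the integer points, other than the origin, in the disc of radius $\sqrt{N}$. The function names are hypothetical.

```python
import math


def r_two_squares_table(limit):
    """r[n] = number of integer pairs (a, b), order and signs counted, with a^2 + b^2 = n."""
    r = [0] * (limit + 1)
    bound = math.isqrt(limit)
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            s = a * a + b * b
            if s <= limit:
                r[s] += 1
    return r


def average_r(limit):
    """Arithmetic mean of r(1), ..., r(limit)."""
    r = r_two_squares_table(limit)
    return sum(r[1:]) / limit


if __name__ == "__main__":
    for N in (100, 1000, 10000):
        print(N, average_r(N), math.pi)
```

The individual values $r(n)$ jump irregularly (for instance $r(3) = 0$ and $r(5) = 8$), while the averages settle near $\pi$.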

2.1 Arithmetic functions and Dirichlet product

We define the sum of two arithmetic functions in the obvious way:

$$(f + g)(n) := f(n) + g(n)$$

(pointwise sum). As for the product, besides the pointwise product

$$(fg)(n) := f(n)\,g(n)\,,$$

we consider the Dirichlet convolution (or Dirichlet product)

$$(f * g)(n) := \sum_{d \mid n} f(d)\,g(n/d) = \sum_{dd' = n} f(d)\,g(d')\,, \tag{2.1}$$

where the sums $\sum_{d \mid n}$ and $\sum_{dd' = n}$ are over the positive divisors $d$ of $n$. Here we can see four simple arithmetic functions

$$I(n) := \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1, \end{cases} \qquad 0(n) := 0 \ \text{for every } n\,, \qquad u(n) := 1 \ \text{for every } n\,, \qquad N(n) := n\,. \tag{2.2}$$

Theorem 2.1 The set of all arithmetic functions, with pointwise sum and Dirichlet product, is a commutative ring with additive identity $0(n)$ and multiplicative identity $I(n)$.

Proof It is obvious that the set of all arithmetic functions, with the above sum, is an abelian group with identity $0(n)$. We need to show that

(i) $f * g = g * f$,
(ii) $(f * g) * h = f * (g * h)$,
(iii) $f * (g + h) = f * g + f * h$,
(iv) $f * I = f$.

The identity (i) follows from the symmetry of the second sum in (2.1). In order to prove (ii) we write

$$((f * g) * h)(n) = \sum_{dd' = n} (f * g)(d)\,h(d') = \sum_{dd' = n}\Bigl(\sum_{pq = d} f(p)\,g(q)\Bigr) h(d') = \sum_{pqd' = n} f(p)\,g(q)\,h(d')$$

and observe the symmetry of $p$, $q$, $d'$ in the last sum. We finally prove (iii)

$$(f * (g + h))(n) = \sum_{d \mid n} f(d)\,g(n/d) + \sum_{d \mid n} f(d)\,h(n/d) = (f * g)(n) + (f * h)(n)$$

and (iv)

$$(f * I)(n) = \sum_{d \mid n} f(d)\,I(n/d) = f(n)\,I(1) = f(n)\,.$$
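The ring structure described in Theorem 2.1 can be illustrated with a small Python sketch in which arithmetic functions are ordinary Python functions and the Dirichlet product returns a new function. The names `dirichlet`, `divisors`, `zero` and the test function `f` are illustrative choices of ours; the brute-force divisor search is only meant for small $n$.

```python
def divisors(n):
    """Positive divisors of n (brute force, fine for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def dirichlet(f, g):
    """Return the Dirichlet product f * g of (2.1) as a new function of n."""
    def product(n):
        return sum(f(d) * g(n // d) for d in divisors(n))
    return product


# The four simple arithmetic functions of (2.2).
def I(n):
    return 1 if n == 1 else 0


def zero(n):
    return 0


def u(n):
    return 1


def N(n):
    return n


if __name__ == "__main__":
    f = lambda n: n * n + 1          # an arbitrary test function
    for n in range(1, 13):
        lhs = dirichlet(dirichlet(f, u), N)(n)
        rhs = dirichlet(f, dirichlet(u, N))(n)
        print(n, dirichlet(f, I)(n) == f(n), lhs == rhs)
```

The loop prints `True True` on every line, which is what (iv) and (ii) predict for these particular functions.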

We now introduce the Dirichlet inverse of an arithmetic function.

Theorem 2.2 Let $f$ be an arithmetic function such that $f(1) \ne 0$. Then there exists one and only one arithmetic function $f^{*-1}$ (we call it the Dirichlet inverse of $f$) such that $f * f^{*-1} = I$.

Proof We shall define $f^{*-1}(n)$ by induction on $n$. For $n = 1$ let $f^{*-1}(1) = 1/f(1)$. Now assume that $f^{*-1}(k)$ has been introduced for every $1 \le k < n$. Observe that $f^{*-1}$ must satisfy

$$\sum_{d \mid n} f(n/d)\,f^{*-1}(d) = 0$$

(for $n > 1$). That is to say

$$0 = f(1)\,f^{*-1}(n) + \sum_{d \mid n,\ d < n} f(n/d)\,f^{*-1}(d)$$

[...] $> 1$ and $g(ab) = g(a)\,g(b)$ for every pair of positive integers $a$, $b$ such that $(a, b) = 1$ and $ab < m_0 n_0$. Let $d$ be a positive divisor of $m_0 n_0$. Again we write $d = d_1 d_2$, with $d_1 \mid m_0$, $d_2 \mid n_0$. Then

$$\begin{aligned}
(f * g)(m_0 n_0) &= \sum_{d \mid m_0 n_0} f(d)\,g(m_0 n_0 / d) \\
&= f(1)\,g(m_0 n_0) + \sum_{d_1 \mid m_0,\ d_2 \mid n_0,\ d_1 d_2 > 1} f(d_1 d_2)\,g(m_0 n_0 / d_1 d_2) \\
&= g(m_0 n_0) - g(m_0)\,g(n_0) + \sum_{d_1 \mid m_0,\ d_2 \mid n_0} f(d_1)\,f(d_2)\,g(m_0/d_1)\,g(n_0/d_2) \\
&= g(m_0 n_0) - g(m_0)\,g(n_0) + \sum_{d_1 \mid m_0} f(d_1)\,g(m_0/d_1) \sum_{d_2 \mid n_0} f(d_2)\,g(n_0/d_2) \\
&= g(m_0 n_0) - g(m_0)\,g(n_0) + (f * g)(m_0)\,(f * g)(n_0)\,.
\end{aligned}$$

Since $g(m_0 n_0) \ne g(m_0)\,g(n_0)$ we deduce that $(f * g)(m_0 n_0) \ne (f * g)(m_0)\,(f * g)(n_0)$. This is impossible if $f * g$ is a multiplicative function.

(iii) $f$ and $I = f * f^{*-1}$ are multiplicative functions. Therefore (iii) follows from (ii).

Corollary 2.6 Let $f$ be an arithmetic function. Then

$$k(n) = \sum_{m \mid n} f(m)$$

is a multiplicative function if and only if the same is true for $f$.

Proof We may write $k = f * u$ (see (2.2)). Since $u$ is a multiplicative function, Theorem 2.5 says that $k$ is a multiplicative function if and only if the same is true for $f$.

2.2 μ(n)

We now introduce the Möbius function $\mu(n)$:

$$\mu(n) := \begin{cases} 1 & \text{if } n = 1, \\ (-1)^N & \text{if } n > 1 \text{ is a product of } N \text{ distinct prime numbers}, \\ 0 & \text{otherwise.} \end{cases}$$

Then μ(1) = 1, μ(2) = −1, μ(3) = −1, μ(4) = 0, μ(5) = −1, μ(6) = 1, μ(7) = −1, μ(8) = 0, ... , μ(12) = 0, ... , μ(30) = −1, ...

The arithmetic function $\mu$ is multiplicative. Indeed, assume that $(m, n) = 1$. Then $mn$ admits a prime number $p$ such that $p^2 \mid mn$ if and only if at least one of $m$ and $n$ has the same property. That is to say, $\mu(mn) = 0$ if and only if $\mu(m)\mu(n) = 0$. Now let $m$ be the product of $M$ distinct prime numbers and let $n$ be the product of $N$ distinct prime numbers. Then (since $(m, n) = 1$) the number $mn$ is the product of $M + N$ distinct prime numbers, and

$$\mu(mn) = (-1)^{M+N} = (-1)^M (-1)^N = \mu(m)\,\mu(n)\,.$$

We are going to show that $u$ is the Dirichlet inverse of $\mu$.


Lemma 2.7 $\mu^{*-1} = u$, that is,

$$\sum_{m \mid n} \mu(m) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases} \tag{2.5}$$

Proof The identity (2.5) is obvious if $n = 1$. By Corollary 2.6, the function $n \mapsto \sum_{m \mid n} \mu(m)$ is a multiplicative function. Then it is enough to prove (2.5) when $n$ is a power of a prime number, say $n = p^s$ ($s \ge 1$). Indeed, in this case

$$\sum_{m \mid p^s} \mu(m) = \sum_{j=0}^{s} \mu(p^j) = 1 - 1 + 0 + 0 + \cdots + 0 = 0\,.$$

We introduce the Möbius inversion formula, which was proved by Möbius in 1832.

Theorem 2.8 (Möbius' inversion formula) Let $f$ and $g$ be arithmetic functions. Then $g = u * f$ if and only if $f = \mu * g$.

Proof Lemma 2.7 implies

$$\mu * g = \mu * f * u = f * (\mu * u) = f * I = f\,.$$

The converse is similar.
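The Möbius function, the identity (2.5) and the inversion formula lend themselves to a quick numerical check. The following Python sketch computes $\mu$ by trial division; the function names and the sample function `f` are our own illustrative choices.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    if n == 1:
        return 1
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original number
                return 0
            result = -result    # one more distinct prime factor
        p += 1
    if n > 1:                   # a remaining prime factor
        result = -result
    return result


def divisor_sum(f, n):
    """(u * f)(n): the sum of f(d) over the positive divisors d of n."""
    return sum(f(d) for d in range(1, n + 1) if n % d == 0)


def mobius_invert(g, n):
    """(mu * g)(n): the sum of mu(n/d) g(d) over the positive divisors d of n."""
    return sum(mobius(n // d) * g(d) for d in range(1, n + 1) if n % d == 0)


if __name__ == "__main__":
    print([mobius(n) for n in range(1, 9)])
    f = lambda n: n * n + 1
    g = lambda n: divisor_sum(f, n)          # g = u * f
    print(all(mobius_invert(g, n) == f(n) for n in range(1, 100)))
```

The first line printed lists $\mu(1), \ldots, \mu(8)$ and can be compared with the values given after the definition; the second line checks that $f$ is recovered from $g = u * f$.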

We are going to apply the Möbius function to the study of square-free numbers. A natural number is a square-free number if it is not divisible by the square of any integer > 1. Let $Q$ denote the set of square-free numbers. For every positive real number $x$ let

$$Q(x) := \operatorname{card}\,\{n \in Q : n \le x\}\,.$$

The following result was proved by Gegenbauer in 1885.

Theorem 2.9 (Gegenbauer) As $x \to +\infty$ we have

$$Q(x) = \frac{6}{\pi^2}\,x + O\bigl(\sqrt{x}\bigr)\,. \tag{2.6}$$

Proof Observe that

$$\mu^2(n) = \begin{cases} 1 & \text{if } n \text{ is square-free}, \\ 0 & \text{otherwise.} \end{cases}$$

Then

$$Q(x) = \sum_{n \le x} \mu^2(n)\,.$$

Let $g = \mu^2 * \mu$. Then the Möbius inversion formula implies $\mu^2 = u * g$. By Theorem 2.5, $g$ is a multiplicative function. Let $p$ be a prime number and let $\ell \ge 1$, then

$$g(p^{\ell}) = \sum_{k=0}^{\ell} \mu^2(p^k)\,\mu(p^{\ell - k}) = \begin{cases} -1 & \text{if } \ell = 2, \\ 0 & \text{otherwise.} \end{cases}$$

Then $g(n) = 0$ unless $n = m^2$, where $m$ is square-free. Hence $g(m^2) = \mu(m)$ for every $m \in \mathbb{N}$. Then

$$\begin{aligned}
\sum_{n \le x} \mu^2(n) &= \sum_{n \le x} (u * g)(n) = \sum_{n \le x} \sum_{d \mid n} u\Bigl(\frac{n}{d}\Bigr) g(d) = \sum_{d \le x} g(d) \sum_{n \le x,\ d \mid n} 1 \\
&= \sum_{d \le x} g(d) \Bigl[\frac{x}{d}\Bigr] = \sum_{m \le \sqrt{x}} \mu(m) \Bigl(\frac{x}{m^2} + O(1)\Bigr) \\
&= x \sum_{m=1}^{+\infty} \frac{\mu(m)}{m^2} - x \sum_{m > \sqrt{x}} \frac{\mu(m)}{m^2} + O\bigl(\sqrt{x}\bigr) = x \sum_{m=1}^{+\infty} \frac{\mu(m)}{m^2} + O\bigl(\sqrt{x}\bigr)\,.
\end{aligned}$$

We claim that

$$\sum_{m=1}^{+\infty} \frac{\mu(m)}{m^2} = \frac{6}{\pi^2}\,. \tag{2.7}$$

Indeed, by (2.5) we have

$$1 = \sum_{k=1}^{+\infty} \frac{1}{k^2} \Bigl(\sum_{m \mid k} \mu(m)\Bigr) = \sum_{k=1}^{+\infty} \frac{1}{k^2} \Bigl(\sum_{mn = k} \mu(m)\Bigr) = \sum_{m,n=1}^{+\infty} \frac{\mu(m)}{(mn)^2} = \sum_{m=1}^{+\infty} \frac{\mu(m)}{m^2} \sum_{n=1}^{+\infty} \frac{1}{n^2} = \frac{\pi^2}{6} \sum_{m=1}^{+\infty} \frac{\mu(m)}{m^2}\,,$$

by (4.35). This implies (2.6). Since

$$\frac{Q(x)}{x} = \frac{6}{\pi^2} + O\bigl(x^{-1/2}\bigr)$$

we may say that the percentage of square-free numbers is $6/\pi^2$.
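Gegenbauer's estimate can be compared with an exact count for moderate values of $x$. The Python sketch below sieves out the multiples of squares; the function name `squarefree_count` and the sample values of $x$ are our own assumptions, and the output is only a numerical illustration of (2.6).

```python
import math


def squarefree_count(x):
    """Q(x): number of square-free integers n with 1 <= n <= x."""
    squarefree = [True] * (x + 1)
    for m in range(2, math.isqrt(x) + 1):
        for k in range(m * m, x + 1, m * m):   # strike out multiples of m^2
            squarefree[k] = False
    return sum(squarefree[1:])


if __name__ == "__main__":
    for x in (10, 100, 10**4, 10**5):
        q = squarefree_count(x)
        print(x, q, q / x, 6 / math.pi**2)
```

The ratio $Q(x)/x$ is seen to stay close to $6/\pi^2 \approx 0.6079$, and the difference $Q(x) - 6x/\pi^2$ remains well within $\sqrt{x}$ on this range.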

2.3 d(n) and σ(n)

The Dirichlet function $d(n)$ counts the positive divisors of $n$:

$$d(n) := \sum_{m \mid n} 1\,. \tag{2.8}$$

Then d(1) = 1, d(2) = 2, d(3) = 2, d(4) = 3, d(5) = 2, d(6) = 4, ... On the one hand, we have $d(p) = 2$ if and only if $p$ is a prime number. On the other hand, we have $d(q) = 2^N$ if $q$ is a product of $N$ distinct prime numbers. We also introduce the arithmetic function

$$\sigma(n) := \sum_{m \mid n} m\,, \tag{2.9}$$

which sums the positive divisors of $n$. Then σ(1) = 1, σ(2) = 3, σ(3) = 4, σ(4) = 7, σ(5) = 6, σ(6) = 12, ... Observe that $\sigma(p) = p + 1$ if and only if $p$ is a prime number.

Proposition 2.10 Write $n = p_1^{m_1} p_2^{m_2} \cdots p_N^{m_N}$ according to its canonical decomposition. Then

$$d(n) = (m_1 + 1)(m_2 + 1) \cdots (m_N + 1) \tag{2.10}$$

and

$$\sigma(n) = \frac{p_1^{m_1+1} - 1}{p_1 - 1}\;\frac{p_2^{m_2+1} - 1}{p_2 - 1} \cdots \frac{p_N^{m_N+1} - 1}{p_N - 1}\,.$$

Hence $d(n)$ and $\sigma(n)$ are multiplicative functions.

Proof If $k \mid n$, then $k = p_1^{\ell_1} p_2^{\ell_2} \cdots p_N^{\ell_N}$ with $0 \le \ell_j \le m_j$ for every $j$. Then

$$d(n) = \operatorname{card}\,\{(\ell_1, \ell_2, \ldots, \ell_N) : 0 \le \ell_j \le m_j,\ j = 1, \ldots, N\} = (m_1 + 1)(m_2 + 1) \cdots (m_N + 1)\,.$$

In a similar way,

$$\begin{aligned}
\sigma(n) &= \sum_{\ell_1 = 0}^{m_1} \sum_{\ell_2 = 0}^{m_2} \cdots \sum_{\ell_N = 0}^{m_N} p_1^{\ell_1} p_2^{\ell_2} \cdots p_N^{\ell_N} = \Bigl(\sum_{\ell_1 = 0}^{m_1} p_1^{\ell_1}\Bigr) \Bigl(\sum_{\ell_2 = 0}^{m_2} p_2^{\ell_2}\Bigr) \cdots \Bigl(\sum_{\ell_N = 0}^{m_N} p_N^{\ell_N}\Bigr) \\
&= \frac{p_1^{m_1+1} - 1}{p_1 - 1}\;\frac{p_2^{m_2+1} - 1}{p_2 - 1} \cdots \frac{p_N^{m_N+1} - 1}{p_N - 1}\,.
\end{aligned}$$
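The two formulas of Proposition 2.10 translate directly into code once the canonical decomposition is available. The Python sketch below uses trial division for the factorization and compares the results with the definitions (2.8) and (2.9); all function names are illustrative and not taken from the text.

```python
def factorize(n):
    """Canonical decomposition of n >= 1 as a dict {prime: exponent} (trial division)."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def d(n):
    """Number of positive divisors, via (2.10)."""
    result = 1
    for m in factorize(n).values():
        result *= m + 1
    return result


def sigma(n):
    """Sum of the positive divisors, via the product formula of Proposition 2.10."""
    result = 1
    for p, m in factorize(n).items():
        result *= (p ** (m + 1) - 1) // (p - 1)
    return result


def d_brute(n):
    """d(n) straight from the definition (2.8)."""
    return sum(1 for m in range(1, n + 1) if n % m == 0)


def sigma_brute(n):
    """sigma(n) straight from the definition (2.9)."""
    return sum(m for m in range(1, n + 1) if n % m == 0)


if __name__ == "__main__":
    print([d(n) for n in range(1, 7)], [sigma(n) for n in range(1, 7)])
```

The printed lists can be compared with the values of $d$ and $\sigma$ listed before the proposition.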

The natural numbers $n$ which satisfy $\sigma(n) = 2n$ are called perfect numbers. For example, 6 = 1 + 2 + 3 and 28 = 1 + 2 + 4 + 7 + 14 are perfect numbers. The existence of at least one odd perfect number is an open problem. For even perfect numbers we have the following result, due to Euclid and Euler.

Theorem 2.11 (Euclid–Euler) An even integer $n$ is a perfect number if and only if it can be written in the form

$$n = 2^{p-1}\,(2^p - 1)\,,$$

where $2^p - 1$ is a prime number.¹

¹ Observe that if $2^p - 1$ is a prime number, then the same is true for $p$. Indeed, assume that $p = k\ell$ with $k > 1$ and $\ell > 1$. Then $2^p - 1 = 2^{k\ell} - 1 = (2^k - 1) \sum_{j=0}^{\ell - 1} 2^{kj}$.

Proof Let $n = 2^{p-1}(2^p - 1)$ and assume that $2^p - 1$ is a prime number. Since $\sigma$ is a multiplicative function and $(2^{p-1},\ 2^p - 1) = 1$, we have

$$\sigma(n) = \sigma(2^{p-1})\,\sigma(2^p - 1) = (1 + 2 + 2^2 + \ldots + 2^{p-1})\,\sigma(2^p - 1) = (2^p - 1)\,2^p = 2n\,.$$

Hence $n$ is perfect. Conversely, let $n$ be an even perfect number. We can write $n = 2^{p-1} q$, with $p > 1$ and $q$ odd. Then

$$2^p q = 2n = \sigma(n) = \sigma(2^{p-1})\,\sigma(q) = (2^p - 1)\,\sigma(q)\,.$$

Hence $(2^p - 1) \mid q$, and we write $q = (2^p - 1)\,s$. Then $\sigma(q) = 2^p s$. Now observe that $s$ and $q = (2^p - 1)\,s$ are two different divisors of $q$, and $q + s = 2^p s = \sigma(q)$. Then $q$ and $s$ are the only divisors of $q$. Hence $q$ is a prime number and $s = 1$. This gives $q = 2^p - 1$ and $n = (2^p - 1)\,2^{p-1}$, where $2^p - 1$ is a prime number.

Remark 2.12 The previous result is very elegant, but it is not as useful as it may appear. The problem is that we do not know which numbers of the form $2^m - 1$ are prime numbers. We have seen that $2^m - 1$ is a prime number only if $m$ is a prime number, but the converse is false: $2^5 - 1 = 31$ is a prime number, $2^7 - 1 = 127$ is a prime number, but $2^{11} - 1 = 2047 = 23 \cdot 89$ is not a prime number. Actually we do not know whether there are infinitely many prime numbers of this form. Therefore we do not know whether there are infinitely many even perfect numbers. These prime numbers are called Mersenne primes (after the French monk Mersenne, who investigated them in the seventeenth century). Up to the first half of the twentieth century only a few Mersenne primes were known. Then computers helped mathematicians (the first was Robinson in 1952 at UCLA) to find several new numbers, and at the moment we know 48 Mersenne primes. The largest one is $2^{57885161} - 1$, and it was discovered by Cooper in 2013 at the University of Central Missouri. Nowadays the search for Mersenne primes involves distributed computing and thousands of volunteers connected to the Great Internet Mersenne Prime Search (GIMPS). See [184].

The following results will provide some information on the behaviour of $d(n)$ and $\sigma(n)$ when $n$ is large.

Theorem 2.13 We have the following estimates.

(i) For every $\varepsilon > 0$ there exists a constant $c = c_\varepsilon$ such that $d(n) \le c\,n^{\varepsilon}$ for every $n \ge 1$.
(ii) For no constants $c_1$ and $c_2$ can the inequality $d(n) \le c_1 \log^{c_2} n$ be true for every $n \ge 2$.

Proof In order to prove (i) we may assume that $0 < \varepsilon < 1$. If $n$ has canonical decomposition $n = p_1^{m_1} p_2^{m_2} \cdots p_N^{m_N}$, then Proposition 2.10 implies

$$\frac{d(n)}{n^{\varepsilon}} = \frac{1 + m_1}{p_1^{\varepsilon m_1}}\;\frac{1 + m_2}{p_2^{\varepsilon m_2}} \cdots \frac{1 + m_N}{p_N^{\varepsilon m_N}}\,.$$

Since $p_j \ge 2$ for every $j$, we have

$$p_j^{\varepsilon m_j} \ge 2^{\varepsilon m_j} = e^{\varepsilon m_j \log 2} \ge 1 + \varepsilon m_j \log 2 > (1 + m_j)\,\varepsilon \log 2\,.$$

Hence

$$\frac{1 + m_j}{p_j^{\varepsilon m_j}} < \frac{1}{\varepsilon \log 2}\,.$$

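[...] 0, $m_1 m_2 \le R$ represents the area of the grey part below the graph of the hyperbola:

[Figure: the region in the first quadrant below the hyperbola $y = R/x$, with the points (0,1), (1,0), (0,R) and (R,0) marked on the axes.]

Now let A = (0, 0), B = (0, u), C = (u, u), D = (u, 0):

[Figure: the hyperbola $y = R/x$ with the square ABCD, the points F, G, H, and the points Y = (0,R) and (R,0) on the axes.]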

Since $(u + 1)^2 > R$, the point $(u + 1, u + 1)$ is above the graph of the hyperbola. Then the number of points between the axes and the graph of the hyperbola is twice the number of points inside the strip between AY and DF (we count the points on DF, but not the ones on AY), minus $u^2$ (that is, minus the number of points inside the square ABCD, where only half of the boundary counts). The strip between AY and DF is the union of the rectangles having base 1 and heights $[R/1], [R/2], \ldots, [R/u]$. Then (2.11), (2.12) and (2.13) give

$$\begin{aligned}
\sum_{n \le R} d(n) &= 2\Bigl(\Bigl[\frac{R}{1}\Bigr] + \Bigl[\frac{R}{2}\Bigr] + \ldots + \Bigl[\frac{R}{u}\Bigr]\Bigr) - u^2 = 2\Bigl(R + \frac{R}{2} + \ldots + \frac{R}{u}\Bigr) - u^2 + O(u) \\
&= 2R\Bigl(\log u + \gamma + O\Bigl(\frac{1}{u}\Bigr)\Bigr) - u^2 + O(u) = R \log R + (2\gamma - 1)R + O\bigl(\sqrt{R}\bigr)\,.
\end{aligned}$$

The search for better estimates of the error

$$\sum_{n \le R} d(n) - R \log R - (2\gamma - 1)R$$

is called the Dirichlet divisor problem. We shall return to it later in Chapter 9. We now show that $\sigma(n)$ is not much larger than $n$.

Proposition 2.16 For every integer $n \ge 2$ we have

$$n + 1 \le \sigma(n) < n + n \log n\,.$$
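Both the hyperbola count and the bounds of Proposition 2.16 can be checked on a finite range. The Python sketch below computes $\sum_{n \le R} d(n)$ as $2\sum_{k \le u} [R/k] - u^2$ with $u = [\sqrt{R}]$, which is our reading of the lattice point count described above, and compares it with the main term $R \log R + (2\gamma - 1)R$. The numerical value of $\gamma$ and all function names are supplied by us.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant (numerical value supplied here)


def divisor_summatory(R):
    """Sum of d(n) over n <= R, computed as 2 * sum_{k <= u} [R/k] - u^2 with u = [sqrt(R)]."""
    u = math.isqrt(R)
    return 2 * sum(R // k for k in range(1, u + 1)) - u * u


def divisor_summatory_brute(R):
    """Same sum, straight from the definition: count the pairs (m, n) with m | n <= R."""
    return sum(R // m for m in range(1, R + 1))


def main_term(R):
    """R log R + (2 gamma - 1) R."""
    return R * math.log(R) + (2 * EULER_GAMMA - 1) * R


def sigma(n):
    """Sum of the positive divisors of n (brute force)."""
    return sum(m for m in range(1, n + 1) if n % m == 0)


if __name__ == "__main__":
    for R in (10**3, 10**4, 10**5, 10**6):
        print(R, divisor_summatory(R) - main_term(R), math.sqrt(R))
    print(all(n + 1 <= sigma(n) < n + n * math.log(n) for n in range(2, 1000)))
```

On this range the error stays well below $\sqrt{R}$, which is consistent with the $O(\sqrt{R})$ term but of course says nothing about its true order of magnitude.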

Proof The first inequality is trivial (1 and n are divisors of n). We prove the second inequality: n 1 n 1 ≤n 2W ≥ eW/2

(3.15)

(note that if m is square-free, then d(m) = 2v(m) ). By Theorem 2.14 we have √ d(m) = x log x + (2γ − 1) x + O x . m≤x

Then (3.15) implies 2x log x >

d(m) ≥ card Γ1,2 eW/2

m∈Γ1,2

for large x. Hence

1√ card Γ1,2 < 2x log x e− 20 log x .

Then, for large x we have

√ 1 1√ card C 1x < x1/4 + 2x log x e− 20 log x < xe− 30 log x .

We now pass to C 2x and write it as the disjoint union C 2x = C2,1 ∪ C2,2 , where (3.16) C2,1 ' ( √ 4 = n ∈ C 2x : ∃p ∈ P such that p | n and δ := (n − 1, p − 1) ≥ e log x := T .

Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 04 Jun 2017 at 03:13:49, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781107358379.004


Congruences

Since n is a composite number, we have n = pm with m > 1. Since m ≤ x/p, (3.16) implies the existence of at most x/ (δp) choices for m. Indeed, if m0 is one of these choices (that is, m0 ≤ x/p and (pm0 − 1, p − 1) = δ), then we claim that m0 + j is not a choice for m if 0 < | j| < δ. Indeed, for suitable and t we have pm0 − 1 = δ ,

p − 1 = δt .

(3.17)

The assumption p(m0 + j) − 1 = δr would therefore imply pm0 + (1 + δt) j − 1 = δr , pm0 − 1 = − j + δ (r − jt) , against (3.17). Then we have at most x/ (δp) choices for m. Since δ | (p − 1) √ 4 log x we obtain and T = e x 1 x 1 1 do we have n | (2n − 1). 7) Letp ≡ 1 (mod 4) be a prime number. Use Wilson’s theorem to show that p | s2 + 1 for some s ∈ N. 8) Assume that the positive integer n is not a square and let a satisfy (a, n) = 1. Prove the existence of integers x and y such that √ √ 0 ≤ x < n , 0 ≤ |y| < n , ax ≡ y (mod n) . 9) Let p ∈ P. Prove that a p ≡ b p (mod p) implies a p ≡ b p mod p2 . 10) Give a diﬀerent proof of the infinitude of odd composite numbers which satisfy (3.12) by showing that if n is such a number, then the same is true for 2n − 1. 11) Where do we need p > 3 in the proof of Theorem 3.14?


4 Quadratic reciprocity and Fourier series

This chapter is devoted to the quadratic reciprocity law, proved by Gauss in 1796, when he was only 19 years old. See [118] for a story of the law. We will give two proofs. The second one will depend on Fourier series, and we are therefore going to start the parallel short course in Fourier analysis we talked about in the Introduction.

We now describe a general result on polynomial congruences. Let $f(x) = a_n x^n + \ldots + a_0$ be a polynomial with integral coefficients, and let $p$ be a prime number. Observe that the solutions of

$$f(x) \equiv 0 \pmod{p} \tag{4.1}$$

are residue classes (mod $p$). Lagrange proved the following result in 1770.

Theorem 4.1 Let $f(x) = a_n x^n + a_{n-1} x^{n-1} + \ldots + a_1 x + a_0$ be a polynomial with integral coefficients. Let $p \in P$ and assume that $p \nmid a_n$. Then the congruence

$$f(x) \equiv 0 \pmod{p} \tag{4.2}$$

has at most $n$ solutions (pairwise non-congruent (mod $p$)).

Proof The case $n = 0$ is not interesting and the case $n = 1$ follows from Theorem 3.5. We work by induction and assume that the result is true for polynomials of degree $n - 1$. Let $f(x)$ have degree $n$. If (4.2) has no solutions, then the result is true. If $x_0$ is a solution, then we write $f(x) = (x - x_0)\,f_1(x) + r$ with $f_1(x)$ of degree $n - 1$ and $r \equiv 0 \pmod{p}$. Assume that $x_1 \not\equiv x_0 \pmod{p}$ is another solution of (4.2). Then

$$0 \equiv f(x_1) \equiv (x_1 - x_0)\,f_1(x_1) \pmod{p}\,.$$

Since $p$ is a prime number and $x_1 - x_0 \not\equiv 0 \pmod{p}$ we have $f_1(x_1) \equiv 0 \pmod{p}$. Then the solutions of (4.2) which are not congruent to $x_0$ (mod $p$) must be solutions of $f_1(x) \equiv 0 \pmod{p}$, of which, by the induction assumption, there are at most $n - 1$ (incongruent (mod $p$)). Hence (4.2) has at most $n$ solutions.

Remark 4.2 If $p$ is not a prime number, then the above theorem may fail. Indeed, the equation $x^2 - 1 \equiv 0 \pmod{8}$ has four solutions: $x = \pm 1, \pm 3$.

Remark 4.3 The assumption $p \nmid a_n$ enables us to see the 'true' degree of a polynomial inside a congruence (mod $p$). As an example, the polynomial $15x^7 + x^6 - 3x + 2$ has degree 7, but the congruence $15x^7 + x^6 - 3x + 2 \equiv 0 \pmod{5}$ should be rewritten as $x^6 - 3x + 2 \equiv 0 \pmod{5}$.
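Lagrange's theorem and the two remarks above can be illustrated by listing the roots of a polynomial congruence by exhaustive search. In the Python sketch below, the function name `roots_mod` and the convention of passing the coefficients as a list `[a_0, a_1, ..., a_n]` are our own choices.

```python
def roots_mod(coeffs, m):
    """Residues x in {0, ..., m-1} with f(x) ≡ 0 (mod m), where coeffs = [a_0, a_1, ..., a_n]."""
    def f(x):
        value = 0
        for a in reversed(coeffs):        # Horner's rule, reduced mod m at each step
            value = (value * x + a) % m
        return value
    return [x for x in range(m) if f(x) == 0]


if __name__ == "__main__":
    # x^2 - 1 modulo the prime 7 and modulo the composite number 8
    print(roots_mod([-1, 0, 1], 7), roots_mod([-1, 0, 1], 8))
    # 15x^7 + x^6 - 3x + 2 and x^6 - 3x + 2 modulo 5
    print(roots_mod([2, -3, 0, 0, 0, 0, 1, 15], 5), roots_mod([2, -3, 0, 0, 0, 0, 1], 5))
```

Modulo 8 the residues 1, 3, 5, 7 appear (that is, $\pm 1, \pm 3$), while modulo a prime the number of roots never exceeds the degree.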

4.1 Quadratic residues

We consider the quadratic congruence

$$ax^2 + bx + c \equiv 0 \pmod{p}\,, \tag{4.3}$$

where $a, b, c \in \mathbb{Z}$ and $P \ni p \nmid a$. The case $p = 2$ is rather simple, so we shall assume that $p$ is odd. Since $a$ is invertible (mod $p$), we may rewrite (4.3) as

$$x^2 + \beta x + \gamma \equiv 0 \pmod{p}\,.$$

Moreover we can assume that $\beta$ is even, since we can always replace $\beta x$ by $(\beta + p)\,x$. We therefore investigate

$$x^2 + 2\delta x + \gamma \equiv 0 \pmod{p}\,, \qquad (x + \delta)^2 \equiv \delta^2 - \gamma \pmod{p}\,.$$

That is, we consider congruences of the form

$$z^2 \equiv a \pmod{p}\,, \tag{4.4}$$

where $P \ni p \nmid a$.

Definition 4.4 If (4.4) admits a solution, then we say that $a$ is a quadratic residue (mod $p$). Otherwise we say that $a$ is a quadratic non-residue (mod $p$).


Theorem 4.5 Let $p$ be an odd prime number. Then there are exactly $(p - 1)/2$ quadratic residues (mod $p$):

$$1^2,\ 2^2,\ \ldots,\ \Bigl(\frac{p-1}{2}\Bigr)^2\,. \tag{4.5}$$

Proof Of course (4.4) has a solution when $a$ equals one of the numbers in (4.5). Moreover, assume that $r^2 \equiv s^2 \pmod{p}$ with $1 \le s \le r \le \frac{p-1}{2}$. Then

$$r^2 - s^2 = (r - s)(r + s) \equiv 0 \pmod{p}\,.$$

Since $2 \le r + s \le p - 1$, we have $r \equiv s \pmod{p}$. Now assume that $a$ is a quadratic residue with $a \equiv t^2 \pmod{p}$ and $\frac{p+1}{2} \le t \le p - 1$. We claim that $a \equiv u^2 \pmod{p}$ with $1 \le u \le \frac{p-1}{2}$. Indeed, $t^2 \equiv (t - p)^2 \pmod{p}$ and $-\frac{p-1}{2} \le t - p \le -1$. Hence $t^2 \equiv u^2 \pmod{p}$ with $1 \le u \le \frac{p-1}{2}$.

Now we define the Legendre symbol, introduced by Legendre in 1798. Let $a$ be an integer and let $p$ be an odd prime number. We write

$$\Bigl(\frac{a}{p}\Bigr)_L := \begin{cases} 1 & \text{if } p \nmid a \text{ and } a \text{ is a quadratic residue (mod } p), \\ -1 & \text{if } p \nmid a \text{ and } a \text{ is a quadratic non-residue (mod } p), \\ 0 & \text{if } p \mid a. \end{cases}$$

Observe that $\bigl(\frac{1}{p}\bigr)_L = 1$ for every $p$, because the equation $x^2 \equiv 1 \pmod{p}$ has the solution $x = 1$.

Theorem 4.6 (Euler's criterion) Let $p$ be an odd prime number. Then for every integer $a$ we have

$$\Bigl(\frac{a}{p}\Bigr)_L \equiv a^{(p-1)/2} \pmod{p}\,. \tag{4.6}$$

Proof If $p \mid a$, then $0 \equiv a^{(p-1)/2} \pmod{p}$ and the theorem is proved. Assume that $p \nmid a$. If $\bigl(\frac{a}{p}\bigr)_L = 1$, then there exists $z$ such that $z^2 \equiv a \pmod{p}$, and $p \nmid z$. Then Fermat's little theorem implies

$$a^{(p-1)/2} \equiv z^{p-1} \equiv 1 \equiv \Bigl(\frac{a}{p}\Bigr)_L \pmod{p}\,.$$

Now assume that $\bigl(\frac{a}{p}\bigr)_L = -1$, so that the equation $z^2 \equiv a \pmod{p}$ has no solutions. Observe that for every $j = 1, \ldots, p - 1$, Theorem 3.5 shows the existence of a unique $1 \le j' \le p - 1$ such that $j j' \equiv a \pmod{p}$. For every $j$ we have $j \ne j'$ (because the equation $z^2 \equiv a \pmod{p}$ has no solutions). In this way we have subdivided the set $\{1, 2, \ldots, p - 1\}$ into $(p - 1)/2$ pairs $\{j, j'\}$. Then, by Wilson's theorem,

$$-1 \equiv (p - 1)! \equiv \prod_{j=1}^{p-1} j \equiv \prod_{j=1}^{(p-1)/2} j j' \equiv a^{(p-1)/2} \pmod{p}\,.$$
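Euler's criterion gives a fast way to evaluate the Legendre symbol, and it can be compared with the definition on small primes. The Python sketch below does both; the function names are our own, and the search over all $z$ in the first function is only sensible for small $p$.

```python
def legendre_by_definition(a, p):
    """Legendre symbol from the definition: look for z with z^2 ≡ a (mod p)."""
    a %= p
    if a == 0:
        return 0
    return 1 if any((z * z) % p == a for z in range(1, p)) else -1


def legendre_by_euler(a, p):
    """Legendre symbol from Euler's criterion (4.6): a^((p-1)/2) mod p, mapped to {-1, 0, 1}."""
    r = pow(a, (p - 1) // 2, p)       # fast modular exponentiation
    return -1 if r == p - 1 else r


def quadratic_residues(p):
    """The quadratic residues a in {1, ..., p-1} modulo the odd prime p."""
    return [a for a in range(1, p) if legendre_by_euler(a, p) == 1]


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        print(p, quadratic_residues(p))
```

For each odd prime $p$ the list contains exactly $(p - 1)/2$ residues, in agreement with Theorem 4.5.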

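Remark 4.7 In particular, if $p$ is an odd prime number, then the equation $x^2 \equiv -1 \pmod{p}$ has a solution if and only if $p \equiv 1 \pmod{4}$.

Theorem 4.8 Let $p$ be an odd prime number. Then for every $a, b \in \mathbb{Z}$ we have

$$\Bigl(\frac{ab}{p}\Bigr)_L = \Bigl(\frac{a}{p}\Bigr)_L \Bigl(\frac{b}{p}\Bigr)_L\,. \tag{4.7}$$

Proof By Euler's criterion we have

$$\Bigl(\frac{ab}{p}\Bigr)_L \equiv (ab)^{(p-1)/2} \equiv a^{(p-1)/2}\,b^{(p-1)/2} \equiv \Bigl(\frac{a}{p}\Bigr)_L \Bigl(\frac{b}{p}\Bigr)_L \pmod{p}\,. \tag{4.8}$$

In order to end the proof we observe that

$$\Bigl|\Bigl(\frac{ab}{p}\Bigr)_L - \Bigl(\frac{a}{p}\Bigr)_L \Bigl(\frac{b}{p}\Bigr)_L\Bigr| \le 2\,.$$

Then (4.8) implies (4.7).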

Theorem 4.9 (Gauss’ lemma) Let p be an odd prime number and let a be an integer such that p a. Let & ! ax p p and < ax − p

[...] 3. Arguing as before, we obtain

$$\Bigl(\frac{3}{p}\Bigr)_L = \Bigl(\frac{p}{3}\Bigr)_L (-1)^{\frac{p-1}{2}\left(\frac{3-1}{2}\right)} = \Bigl(\frac{p}{3}\Bigr)_L (-1)^{\frac{p-1}{2}}\,.$$

We recall (4.6). If $p \equiv 1 \pmod{3}$ we have

$$\Bigl(\frac{p}{3}\Bigr)_L = \Bigl(\frac{1}{3}\Bigr)_L = 1\,,$$

while if $p \equiv 2 \pmod{3}$ we have

$$\Bigl(\frac{p}{3}\Bigr)_L = \Bigl(\frac{-1}{3}\Bigr)_L \equiv (-1)^{\frac{3-1}{2}} \equiv -1 \pmod{3}\,.$$

Moreover,

$$(-1)^{\frac{p-1}{2}} = \begin{cases} +1 & \text{if } p \equiv 1 \pmod{4}, \\ -1 & \text{if } p \equiv 3 \pmod{4}. \end{cases}$$

Then $\bigl(\frac{3}{p}\bigr)_L = 1$ if and only if

$p \equiv 1 \pmod{3}$ and $p \equiv 1 \pmod{4}$

or

$p \equiv 2 \pmod{3}$ and $p \equiv 3 \pmod{4}$.

Then $\bigl(\frac{3}{p}\bigr)_L = 1$ if and only if $p \equiv \pm 1 \pmod{12}$, while there are no solutions for $p \equiv \pm 5 \pmod{12}$. Note that any odd prime $p > 3$ must satisfy $p \equiv \pm 1 \pmod{12}$ or $p \equiv \pm 5 \pmod{12}$.
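The conclusion about the residue 3 can be tested directly on small primes by searching for a square root of 3. The Python sketch below is a brute-force check on a finite range; the function names and the bound 500 are our own choices.

```python
def is_prime(n):
    """Primality test by trial division (adequate for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def three_is_residue(p):
    """True if z^2 ≡ 3 (mod p) has a solution, for a prime p > 3."""
    return any((z * z) % p == 3 for z in range(1, p))


def check_mod_12(limit):
    """Compare solvability of z^2 ≡ 3 (mod p) with the condition p ≡ ±1 (mod 12), for primes 3 < p <= limit."""
    return all(
        three_is_residue(p) == (p % 12 in (1, 11))
        for p in range(5, limit + 1)
        if is_prime(p)
    )


if __name__ == "__main__":
    print(check_mod_12(500))
    print([p for p in range(5, 60) if is_prime(p) and three_is_residue(p)])
```

The second line lists the primes below 60 for which 3 is a quadratic residue; each of them is congruent to $\pm 1$ modulo 12.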

4.2 Gauss sums

We shall give another proof of Theorem 4.11, based on Gauss sums and Fourier series. In this section we introduce Gauss sums.

Definition 4.12 Let $0 \ne a \in \mathbb{Z}$ and $q \in \mathbb{N}$. Assume that $(a, q) = 1$. We define the Gauss sum

$$S(q, a) := \sum_{n=1}^{q} e^{2\pi i a n^2 / q}\,. \tag{4.18}$$

Lemma 4.13 Let $0 \ne a \in \mathbb{Z}$ and $q_1, q_2 \in \mathbb{N}$ such that $(q_1, q_2) = 1$ and $(a, q_1 q_2) = 1$. Then

$$S(q_1 q_2, a) = S(q_1, q_2 a)\,S(q_2, q_1 a)\,.$$

Proof We recall Theorem 3.4, which says that if $R = \{x_j\}_{j=1}^{q_1}$ and $S = \{y_k\}_{k=1}^{q_2}$ are complete sets of residues mod $q_1$ and mod $q_2$, respectively, then the set $\{q_2 x_j + q_1 y_k\}_{1 \le j \le q_1,\ 1 \le k \le q_2}$ is a complete set of residues mod $q_1 q_2$. Then

$$S(q_1 q_2, a) = \sum_{n=1}^{q_1 q_2} e^{2\pi i a n^2 / (q_1 q_2)} = \sum_{j=1}^{q_1} \sum_{k=1}^{q_2} e^{2\pi i a (q_2 x_j + q_1 y_k)^2 / (q_1 q_2)}\,.$$

Since $(q_2 x_j + q_1 y_k)^2 \equiv q_2^2 x_j^2 + q_1^2 y_k^2 \pmod{q_1 q_2}$ we deduce that

$$S(q_1 q_2, a) = \sum_{j=1}^{q_1} e^{2\pi i a q_2 x_j^2 / q_1} \sum_{k=1}^{q_2} e^{2\pi i a q_1 y_k^2 / q_2} = S(q_1, q_2 a)\,S(q_2, q_1 a)\,.$$
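The multiplicative identity of Lemma 4.13 can be observed in floating-point arithmetic. The Python sketch below evaluates (4.18) with the standard `cmath` module, reducing $a n^2$ modulo $q$ before dividing in order to limit rounding errors; the function name and the sample triples are our own choices.

```python
import cmath


def gauss_sum(q, a):
    """S(q, a) = sum over n = 1..q of exp(2 pi i a n^2 / q), as in (4.18)."""
    return sum(
        cmath.exp(2j * cmath.pi * ((a * n * n) % q) / q) for n in range(1, q + 1)
    )


if __name__ == "__main__":
    # Triples (q1, q2, a) with (q1, q2) = 1 and (a, q1 q2) = 1.
    for q1, q2, a in ((5, 7, 3), (9, 4, 5), (8, 15, 7)):
        lhs = gauss_sum(q1 * q2, a)
        rhs = gauss_sum(q1, q2 * a) * gauss_sum(q2, q1 * a)
        print(q1, q2, a, abs(lhs - rhs))
```

The printed differences are at the level of rounding errors, as the lemma predicts.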

Gauss sums and the Legendre symbol are related by the following result.


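Lemma 4.14 Let $0 \neq a \in \mathbb{Z}$ and let $p$ be an odd prime number such that $(a, p) = 1$. Then
\[
S(p, a) = \left(\frac{a}{p}\right)_L S(p, 1).
\]

Proof In the proof of Theorem 4.5 we have seen that the equation
\[
x^2 \equiv m \pmod p \tag{4.19}
\]
has one solution if $m \equiv 0 \pmod p$, otherwise it has zero or two solutions. Then the number of solutions of (4.19) equals $1 + \left(\frac{m}{p}\right)_L$ and we can write
\[
S(p, a) = \sum_{n=1}^{p} e^{2\pi i (a n^2/p)} = \sum_{m=1}^{p} \left(1 + \left(\frac{m}{p}\right)_L\right) e^{2\pi i a m/p} \tag{4.20}
\]
\[
= \sum_{m=1}^{p} e^{2\pi i a m/p} + \sum_{m=1}^{p} \left(\frac{m}{p}\right)_L e^{2\pi i a m/p} = \sum_{m=1}^{p} \left(\frac{m}{p}\right)_L e^{2\pi i a m/p}.
\]
Indeed, by Theorem 3.3 and the assumption $(a, p) = 1$ we deduce that $\{am\}_{m=1}^{p}$ is a complete set of residues mod $p$. Then
\[
\sum_{m=1}^{p} e^{2\pi i a m/p} = \sum_{h=1}^{p} e^{2\pi i h/p} = 0
\]
(observe that $\sum_{h=1}^{p} e^{2\pi i h/p}$ is the sum in $\mathbb{C}$ of the $p$th roots of unity). Since $(a, p) = 1$, Corollary 3.6 implies the existence of an inverse $a^*$ of $a$. Then (4.20) and Theorem 4.8 give
\[
S(p, a) = \sum_{h=1}^{p} \left(\frac{a^* h}{p}\right)_L e^{2\pi i h/p} = \left(\frac{a^*}{p}\right)_L \sum_{h=1}^{p} \left(\frac{h}{p}\right)_L e^{2\pi i h/p} = \left(\frac{a}{p}\right)_L \sum_{h=1}^{p} \left(\frac{h}{p}\right)_L e^{2\pi i h/p},
\]
since
\[
1 = \left(\frac{1}{p}\right)_L = \left(\frac{a^* a}{p}\right)_L = \left(\frac{a^*}{p}\right)_L \left(\frac{a}{p}\right)_L .
\]
In particular we have
\[
S(p, 1) = \sum_{h=1}^{p} \left(\frac{h}{p}\right)_L e^{2\pi i h/p}.
\]
Then
\[
S(p, a) = \left(\frac{a}{p}\right)_L S(p, 1).
\]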
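Lemma 4.14 can also be tested numerically for a few small primes. This is a minimal sketch under stated assumptions: the Legendre symbol is computed by Euler's criterion (assumed as a known fact), and `gauss_sum` and `legendre` are hypothetical helper names.

```python
import cmath


def gauss_sum(q, a):
    """S(q, a) = sum_{n=1}^{q} exp(2*pi*i*a*n^2/q), as in (4.18)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * n * n) % q) / q)
               for n in range(1, q + 1))


def legendre(a, p):
    """Legendre symbol (a/p)_L for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


# Collect |S(p, a) - (a/p)_L * S(p, 1)| over all admissible a for a few primes.
errors = []
for p in (3, 5, 7, 11, 13):
    s1 = gauss_sum(p, 1)
    for a in range(1, p):
        errors.append(abs(gauss_sum(p, a) - legendre(a, p) * s1))
print(max(errors))  # expected to be at rounding-error level
```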


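4.3 Fourier series

In this section we start to discuss Fourier analysis. See [81, 159, 161, 164, 185]. For the real analysis results see [82, 162, 182].

A complex vector space $H$ with inner product $\langle \cdot, \cdot \rangle$ and norm $\|f\| := \sqrt{\langle f, f \rangle}$ is a (separable) Hilbert space if it is complete and separable with respect to the distance $d(f, g) := \|f - g\|$. A sequence $\{u_n\} \subset H$ is orthonormal if $\langle u_n, u_m \rangle = \delta_{n,m}$ for every $m, n$ (here $\delta_{n,m} = 0$ if $n \neq m$, while $\delta_{n,n} = 1$). We say that $\{u_n\}_{n=1}^{+\infty}$ is an orthonormal basis if it satisfies one of the following equivalent conditions.

Theorem 4.15 Let $H$ be a separable Hilbert space and let $\{u_n\}_{n=1}^{+\infty}$ be an orthonormal sequence in $H$. The following conditions are equivalent.

(i) The finite linear combinations of the terms $u_n$ are dense in $H$.
(ii) Let $f \in H$. If $\langle f, u_n \rangle = 0$ for every $n$, then $f = 0$.
(iii) For every $f \in H$ we have $\left\| f - \sum_{n=1}^{N} \langle f, u_n \rangle u_n \right\| \to 0$ as $N \to +\infty$.
(iv) For every $f, g \in H$ we have $\langle f, g \rangle = \sum_n \langle f, u_n \rangle \overline{\langle g, u_n \rangle}$.
(v) For every $f \in H$ we have $\|f\|^2 = \sum_n |\langle f, u_n \rangle|^2$ (Parseval identity).

Proof We assume (i). Let $f \in H$, then there is a sequence $\{g_m\}_{m=1}^{+\infty}$ of finite linear combinations of the terms $u_n$, such that $\|g_m - f\| \to 0$ as $m \to +\infty$. Assume that $\langle f, u_n \rangle = 0$ for every $n$. Then $\langle f, g_m \rangle = 0$ for every $m$. By the Cauchy–Schwarz inequality we have
\[
\|f\|^2 = \langle f, f \rangle = \langle f, f - g_m \rangle \le \|f\| \, \|f - g_m\| \longrightarrow 0
\]
as $m \to +\infty$. Then (i) implies (ii). Now we assume (ii). For every positive integer $N$ we define the partial sum
\[
S_N f := \sum_{n=1}^{N} \langle f, u_n \rangle u_n
\]
and observe that $\langle f - S_N f, S_N f \rangle = 0$. Then$^1$
\[
\|f\|^2 = \langle f - S_N f + S_N f,\ f - S_N f + S_N f \rangle = \|f - S_N f\|^2 + \langle S_N f, S_N f \rangle \tag{4.21}
\]
\[
= \|f - S_N f\|^2 + \sum_{m,n=1}^{N} \langle \langle f, u_m \rangle u_m, \langle f, u_n \rangle u_n \rangle = \|f - S_N f\|^2 + \sum_{n=1}^{N} |\langle f, u_n \rangle|^2 .
\]

$^1$ We recall that $\langle ax + by, z \rangle = a \langle x, z \rangle + b \langle y, z \rangle$, and that $\langle x, y \rangle = \overline{\langle y, x \rangle}$. Hence $\langle x, ay + bz \rangle = \overline{a} \langle x, y \rangle + \overline{b} \langle x, z \rangle$.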


Letting $N \to +\infty$ we obtain Bessel's inequality
\[
\sum_n |\langle f, u_n \rangle|^2 \le \|f\|^2 , \tag{4.22}
\]
which implies the convergence of the series $\sum_n |\langle f, u_n \rangle|^2$. Then $S_N f$ is a Cauchy sequence in $H$ and therefore it converges to an element $g \in H$. For every $j$ we have $\langle f - S_N f, u_j \rangle = 0$ when $N$ is large enough, and then $\langle f - g, u_j \rangle = 0$. Hence $f = g$ by (ii). Then $f = \sum_n \langle f, u_n \rangle u_n$ (in norm). We now assume (iii). Arguing as in (4.21) we obtain
\[
\langle f, g \rangle = \langle f - S_N f, g - S_N g \rangle + \langle S_N f, S_N g \rangle .
\]
By the Cauchy–Schwarz inequality and (iii) we have
\[
|\langle f - S_N f, g - S_N g \rangle| \le \|f - S_N f\| \, \|g - S_N g\| \longrightarrow 0 .
\]
Furthermore
\[
\langle S_N f, S_N g \rangle = \sum_{m,n=1}^{N} \langle \langle f, u_m \rangle u_m, \langle g, u_n \rangle u_n \rangle = \sum_{n=1}^{N} \langle f, u_n \rangle \overline{\langle g, u_n \rangle} .
\]
Letting $N \to +\infty$ we obtain (iv). Observe that (v) is a particular case of (iv). Finally, let us assume (v). Again from (4.21) we see that $\|f - S_N f\|^2 \to 0$. Since $S_N f$ is a finite linear combination of the terms $u_n$, we obtain (i).

We are interested in periodic functions and we assume that the period is equal to 1. Periodic functions can be seen as functions on the 1-dimensional torus $\mathbb{T} := \mathbb{R}/\mathbb{Z}$. In some problems this amounts to considering functions defined on $[0, 1)$ or on another interval of length 1. For the sake of symmetry we shall often use the interval $[-1/2, 1/2)$. We point out that in some situations it is misleading to identify $\mathbb{T}$ with an interval of length 1. For example, a function can be continuous on $[-1/2, 1/2)$, but it can have a discontinuous periodic continuation. Consider the Hilbert space $L^2(\mathbb{T})$ of square integrable functions on $\mathbb{T}$. Here
\[
\langle f, g \rangle := \int_{\mathbb{T}} f(x) \overline{g(x)} \, dx ; \qquad \|f\|_{L^2(\mathbb{T})} = \left\{ \int_{\mathbb{T}} |f(x)|^2 \, dx \right\}^{1/2} .
\]
We shall see that the exponentials $\{e_n\}_{n \in \mathbb{Z}} := \{e^{2\pi i n x}\}_{n \in \mathbb{Z}}$ are an orthonormal basis of $L^2(\mathbb{T})$. For every $f \in L^2(\mathbb{T})$ the projections
\[
\hat{f}(n) := \langle f, e_n \rangle = \int_{\mathbb{T}} f(x) \, e^{-2\pi i n x} \, dx
\]
are called Fourier coefficients of $f$. Indeed, we shall prove (Corollary 6.14) the following result.


Theorem 4.16 For every $f \in L^2(\mathbb{T})$ we have the Parseval identity
\[
\sum_{n=-\infty}^{+\infty} |\hat{f}(n)|^2 = \int_{\mathbb{T}} |f(x)|^2 \, dx . \tag{4.23}
\]
Hence
\[
f(x) = \sum_{n=-\infty}^{+\infty} \hat{f}(n) \, e^{2\pi i n x} \tag{4.24}
\]
in the $L^2$ norm.

The RHS of (4.24) is called the Fourier series of $f(x)$. We write $\ell^2(\mathbb{Z})$ for the Hilbert space of complex sequences $a = \{a_n\}_{n=-\infty}^{+\infty}$ which satisfy
\[
\sum_{n=-\infty}^{+\infty} |a_n|^2 < +\infty .
\]
If $a, b \in \ell^2(\mathbb{Z})$, the inner product and the norm, respectively, are
\[
\langle a, b \rangle := \sum_{n=-\infty}^{+\infty} a_n \overline{b_n} \qquad \text{and} \qquad \|a\| = \left\{ \sum_{n=-\infty}^{+\infty} |a_n|^2 \right\}^{1/2} .
\]
There is one and only one function $f \in L^2(\mathbb{T})$ such that $a_n = \hat{f}(n)$. Moreover, $f \leftrightarrow \hat{f}$ is a linear isometry between $L^2(\mathbb{T})$ and $\ell^2(\mathbb{Z})$. See, for example, [182].

The natural convergence of a Fourier series is in the $L^2$ sense, and the pointwise convergence for continuous functions may fail, see Theorem 4.19. We observe that if $\sum_{n \in \mathbb{Z}} \hat{f}(n) \, e^{2\pi i n x}$ converges absolutely, that is, $\sum_{n \in \mathbb{Z}} |\hat{f}(n)| < +\infty$, then $\sum_{n \in \mathbb{Z}} \hat{f}(n) \, e^{2\pi i n x}$ converges uniformly to $f(x)$. Unfortunately the absolute convergence fails in many relevant cases. Then we look for other conditions which imply the pointwise convergence.

Definition 4.17 We say that a function $f$ on an interval $[a, b]$ is piecewise continuous on $[a, b]$ if

(i) $f$ is continuous everywhere on $[a, b]$ except at finitely many points $x_1, \dots, x_k$.
(ii) $f(x_j^+) := \lim_{x \to x_j^+} f(x)$ and $f(x_j^-) := \lim_{x \to x_j^-} f(x)$ exist and are finite for every $j = 1, \dots, k$ (if $x_j = a$ or $x_j = b$, then we require only one limit).

We say that a function $f$ on an interval $[a, b]$ is piecewise smooth on $[a, b]$ if $f'$ is a piecewise continuous function on $[a, b]$.


Theorem 4.18 Let $f : \mathbb{R} \to \mathbb{C}$ be 1-periodic and piecewise smooth, say on $\left[-\frac{1}{2}, \frac{1}{2}\right]$. Then for every $x_0 \in \mathbb{R}$ we have, as $N \to +\infty$,
\[
S_N f(x_0) := \sum_{n=-N}^{+N} \hat{f}(n) \, e^{2\pi i n x_0} \longrightarrow \frac{f(x_0^-) + f(x_0^+)}{2} .
\]

Proof We call $S_N f$ the Fourier partial sum of $f$. Observe that if $\ell(x) = f(x + x_0)$, then $\hat{\ell}(n) = \hat{f}(n) \, e^{2\pi i n x_0}$. Then we may assume that $x_0 = 0$. Moreover, we can replace $f(x)$ with
\[
k(x) = f(x) - \frac{f(0^+) + f(0^-)}{2} .
\]
Then $\hat{f}(n) = \hat{k}(n)$ for every $n \neq 0$ and we may assume that $f(0^-) = -f(0^+)$. In this way the theorem reduces to the claim that
\[
S_N f(0) \to 0 .
\]
Observe that the function $g(x) = \frac{1}{2} \{ f(x) + f(-x) \}$ is piecewise smooth, continuous at $x = 0$ and satisfies $g(0) = 0$. Since $\sum_{n=-N}^{+N} e^{-2\pi i n x} = \sum_{n=-N}^{+N} e^{2\pi i n x}$, we obtain
\[
S_N f(0) = \sum_{n=-N}^{+N} \hat{f}(n) = \sum_{n=-N}^{+N} \int_{-1/2}^{1/2} f(x) \, e^{-2\pi i n x} \, dx
\]
\[
= \frac{1}{2} \left( \sum_{n=-N}^{+N} \int_{-1/2}^{1/2} f(x) \, e^{-2\pi i n x} \, dx + \sum_{n=-N}^{+N} \int_{-1/2}^{1/2} f(-x) \, e^{2\pi i n x} \, dx \right)
\]
\[
= \int_{-1/2}^{1/2} \frac{f(x) + f(-x)}{2} \sum_{n=-N}^{+N} e^{-2\pi i n x} \, dx = \sum_{n=-N}^{+N} \int_{-1/2}^{1/2} \frac{f(x) + f(-x)}{2} \, e^{-2\pi i n x} \, dx = \sum_{n=-N}^{+N} \hat{g}(n) = S_N g(0) .
\]
Now we prove that $S_N g(0) \to 0$. Indeed, let
\[
h(x) := \frac{g(x)}{e^{2\pi i x} - 1} .
\]
We observe that
\[
h(x) \sim \frac{g(x)}{2\pi i x} \to
\begin{cases}
\frac{1}{2\pi i} \, g'(0^-) & \text{as } x \to 0^- , \\
\frac{1}{2\pi i} \, g'(0^+) & \text{as } x \to 0^+ .
\end{cases}
\]
The function $h$ is piecewise continuous and therefore bounded. Then Bessel's inequality (4.22) implies $\hat{h}(n) \to 0$


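as $|n| \to +\infty$. Now observe that
\[
\hat{g}(n) = \int_{-1/2}^{1/2} g(x) \, e^{-2\pi i n x} \, dx = \int_{-1/2}^{1/2} h(x) \left( e^{2\pi i x} - 1 \right) e^{-2\pi i n x} \, dx
\]
\[
= \int_{-1/2}^{1/2} h(x) \, e^{-2\pi i (n-1) x} \, dx - \int_{-1/2}^{1/2} h(x) \, e^{-2\pi i n x} \, dx = \hat{h}(n - 1) - \hat{h}(n) .
\]
Then, as $N \to +\infty$,
\[
S_N g(0) = \sum_{n=-N}^{+N} \hat{g}(n) = \sum_{n=-N}^{+N} \left( \hat{h}(n - 1) - \hat{h}(n) \right)
\]
\[
= \hat{h}(-N - 1) - \hat{h}(-N) + \hat{h}(-N) - \hat{h}(-N + 1) + \dots + \hat{h}(N - 1) - \hat{h}(N) = \hat{h}(-N - 1) - \hat{h}(N) \longrightarrow 0 .
\]
Then $S_N f(0) = S_N g(0) \to 0$.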

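In particular, if $f \in C(\mathbb{T})$ is piecewise smooth we have, as $N \to +\infty$, $S_N f(x) \longrightarrow f(x)$ for every $x \in \mathbb{T}$. The above result is false when we only assume that $f$ is continuous.

Theorem 4.19 There exists $f \in C(\mathbb{T})$ such that
\[
\limsup_{N \to +\infty} |S_N f(0)| = +\infty . \tag{4.25}
\]

We need the following result.

Lemma 4.20 There exists a positive constant $\tau$ such that for every $N \in \mathbb{N}$ and $x \in \mathbb{R}$ we have
\[
\left| \sum_{j=1}^{N} \frac{\sin(2\pi j x)}{j} \right| < \tau . \tag{4.26}
\]

Proof Observe that for every $x \notin \mathbb{Z}$ we have
\[
\sum_{n=-N}^{N} e^{2\pi i n x} = e^{-2\pi i N x} \sum_{m=0}^{2N} e^{2\pi i m x} = e^{-2\pi i N x} \sum_{m=0}^{2N} \left( e^{2\pi i x} \right)^m \tag{4.27}
\]
\[
= e^{-2\pi i N x} \, \frac{e^{2\pi i (2N+1) x} - 1}{e^{2\pi i x} - 1} = \frac{e^{2\pi i (N + \frac{1}{2}) x} - e^{-2\pi i (N + \frac{1}{2}) x}}{e^{\pi i x} - e^{-\pi i x}} = \frac{\sin\left( 2\pi \left( N + \frac{1}{2} \right) x \right)}{\sin(\pi x)} .
\]
In order to prove (4.26) it is enough to consider $x \in (0, 1/2)$. By (4.27) we have
\[
\sum_{j=1}^{N} \frac{\sin(2\pi j x)}{j} = 2\pi \int_0^x \sum_{j=1}^{N} \cos(2\pi j y) \, dy = -\pi x + \pi \int_0^x \sum_{j=-N}^{N} e^{2\pi i j y} \, dy
\]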


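\[
= -\pi x + \pi \int_0^x \frac{\sin((2N + 1) \pi y)}{\sin(\pi y)} \, dy = -\pi x + \pi \int_0^x \left( \frac{1}{\sin(\pi y)} - \frac{1}{\pi y} + \frac{1}{\pi y} \right) \sin((2N + 1) \pi y) \, dy .
\]
Since the function $y \mapsto \frac{1}{\sin(\pi y)} - \frac{1}{\pi y}$ is bounded on $(0, 1/2)$, we only have to estimate
\[
\int_0^x \frac{\sin((2N + 1) \pi y)}{y} \, dy = \int_0^{(2N+1)\pi x} \frac{\sin u}{u} \, du .
\]
It is enough to bound $\int_1^r \frac{\sin u}{u} \, du$ for every $r \ge 1$. Integration by parts gives
\[
\left| \int_1^r \frac{\sin u}{u} \, du \right| = \left| \left[ -\frac{\cos u}{u} \right]_1^r - \int_1^r \frac{\cos u}{u^2} \, du \right| \le 2 + \int_1^{+\infty} \frac{1}{u^2} \, du = 3 .
\]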
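The uniform bound (4.26) can be observed numerically. The sketch below samples the partial sums on a grid for a few values of $N$; the function name `sine_sum`, the grid and the chosen values of $N$ are illustrative assumptions, and a grid maximum only suggests (it does not prove) the bound.

```python
import math


def sine_sum(N, x):
    """Partial sum sum_{j=1}^{N} sin(2*pi*j*x)/j appearing in (4.26)."""
    return sum(math.sin(2 * math.pi * j * x) / j for j in range(1, N + 1))


grid = [k / 2000 for k in range(2001)]  # sample points in [0, 1]
# Largest sampled absolute value of the partial sum, for several N.
sup_by_N = {N: max(abs(sine_sum(N, x)) for x in grid) for N in (5, 50, 200)}
print(sup_by_N)
```

The sampled maxima stay bounded as $N$ grows, in agreement with the lemma.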

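Proof of Theorem 4.19 For positive integers $M$ and $N$ let
\[
P_{M,N}(x) := e^{2\pi i M x} \sum_{j=1}^{N} \frac{\sin(2\pi j x)}{j} .
\]
By (4.26) we have, for every $M, N, x$,
\[
\left| P_{M,N}(x) \right| \le \tau , \tag{4.28}
\]
\[
\hat{P}_{M,N}(n) = 0 \quad \text{if } n < M - N \text{ or } n > M + N .
\]
Let $S_M$ denote the Fourier partial sum of degree $M$. Since
\[
S_M P_{M,N}(x) = e^{2\pi i M x} \sum_{j=-N}^{-1} \frac{1}{2ij} \, e^{2\pi i j x} ,
\]
we have
\[
\left| S_M P_{M,N}(0) \right| = \frac{1}{2} \sum_{n=1}^{N} \frac{1}{n} \ge \frac{1}{2} \log(N) . \tag{4.29}
\]
We choose two positive sequences $M_j$ and $N_j$ such that for every positive integer $j$ we have
\[
M_j + N_j < M_{j+1} - N_{j+1} , \tag{4.30}
\]
\[
\log N_j > 2 j^3 . \tag{4.31}
\]
Let
\[
f(x) = \sum_{j=1}^{+\infty} \frac{1}{j^2} \, P_{M_j, N_j}(x) . \tag{4.32}
\]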


The uniform convergence of the series in (4.32) implies the continuity of $f$. By (4.30) the functions $n \mapsto \hat{P}_{M_j, N_j}(n)$ and $n \mapsto \hat{P}_{M_k, N_k}(n)$ have disjoint supports provided $j \neq k$. By (4.28), (4.29), (4.31) and (4.32) we have
\[
\left| S_{M_h} f(0) \right| = \left| \sum_{j=1}^{h-1} \frac{1}{j^2} \, P_{M_j, N_j}(0) + \frac{1}{h^2} \, S_{M_h} P_{M_h, N_h}(0) \right| \ge h - c \longrightarrow +\infty .
\]
This proves (4.25).

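We now show some applications of Fourier series to the summation of numerical series. Let $s : \mathbb{R} \to \mathbb{R}$ be the sawtooth function
\[
s(x) := \{x\} - \frac{1}{2} \tag{4.33}
\]
(where $\{x\} = x - [x]$ is the fractional part of $x$). Note that $s(x)$ has period 1.

[Figure: graph of the sawtooth function $s(x)$ for $x$ between $-1$ and $2$, with values between $-0.5$ and $0.5$.]

The sawtooth function is piecewise smooth and has Fourier coefficients
\[
\hat{s}(n) =
\begin{cases}
\int_0^1 \left( x - \frac{1}{2} \right) e^{-2\pi i n x} \, dx = \frac{-1}{2\pi i n} & \text{if } n \neq 0 , \\
0 & \text{if } n = 0 .
\end{cases}
\]
Then Theorem 4.18 implies
\[
\sum_{\substack{n=-\infty \\ n \neq 0}}^{+\infty} \frac{-1}{2\pi i n} \, e^{2\pi i n x} = -\frac{1}{\pi} \sum_{n=1}^{+\infty} \frac{\sin(2\pi n x)}{n} =
\begin{cases}
s(x) & \text{if } x \notin \mathbb{Z} , \\
0 & \text{if } x \in \mathbb{Z} .
\end{cases} \tag{4.34}
\]
Then, for every $0 < x < 1$,
\[
s(x) = x - \frac{1}{2} = -\frac{1}{\pi} \sum_{n=1}^{+\infty} \frac{\sin(2\pi n x)}{n} .
\]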


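Letting $x = 1/4$ we obtain the identity
\[
\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \dots
\]
Now observe that Parseval's identity (4.23) implies
\[
\frac{1}{2\pi^2} \sum_{n=1}^{+\infty} \frac{1}{n^2} = \sum_{\substack{n=-\infty \\ n \neq 0}}^{+\infty} \left| \frac{-1}{2\pi i n} \right|^2 = \int_0^1 \left| x - \frac{1}{2} \right|^2 dx = \int_{-1/2}^{1/2} t^2 \, dt = \frac{1}{12} .
\]
In this way we obtain the amazing identity
\[
\sum_{n=1}^{+\infty} \frac{1}{n^2} = \frac{\pi^2}{6} , \tag{4.35}
\]
proved by Euler in 1735 and previously known as the Basel problem (originally proposed by Mengoli in 1644).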
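The three series above converge slowly but visibly. The following sketch, with the hypothetical helper `sawtooth_series` and an arbitrary truncation point, compares partial sums with the limits obtained from the Fourier series of the sawtooth function.

```python
import math


def sawtooth_series(x, terms):
    """Partial sum of -(1/pi) * sum_{n>=1} sin(2*pi*n*x)/n from (4.34)."""
    total = sum(math.sin(2 * math.pi * n * x) / n for n in range(1, terms + 1))
    return -total / math.pi


terms = 100_000
approx_s = sawtooth_series(0.25, terms)                       # limit: s(1/4) = -1/4
leibniz = sum((-1) ** k / (2 * k + 1) for k in range(terms))  # limit: pi/4
basel = sum(1 / n ** 2 for n in range(1, terms + 1))          # limit: pi^2/6
print(approx_s + 0.25, leibniz - math.pi / 4, basel - math.pi ** 2 / 6)
```

All three printed differences should be small; the truncation errors are of order $1/\text{terms}$.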

4.4 Another proof of the quadratic reciprocity law

We need the following lemma on Gauss sums.

Lemma 4.21 Let $q$ be an odd positive number. Then
\[
S(q, 1) = \varepsilon_q \, q^{1/2} ,
\]
where
\[
\varepsilon_q =
\begin{cases}
1 & \text{if } q \equiv 1 \pmod 4 , \\
i & \text{if } q \equiv 3 \pmod 4 .
\end{cases} \tag{4.36}
\]

Proof For every $x \in [0, 1]$ let
\[
f(x) = \sum_{n=0}^{q-1} e^{2\pi i (n + x)^2/q} .
\]
Since $e^{2\pi i (0^2/q)} = e^{2\pi i q^2/q}$ we have
\[
f(0) = \sum_{n=0}^{q-1} e^{2\pi i n^2/q} = \sum_{n=-1}^{q-2} e^{2\pi i (n+1)^2/q} = \sum_{n=0}^{q-1} e^{2\pi i (n+1)^2/q} = f(1) . \tag{4.37}
\]
Then the periodic continuation of $f(x)$ is continuous and piecewise smooth on $[0, 1]$. We compute the Fourier coefficients
\[
\hat{f}(k) = \int_0^1 \left( \sum_{n=0}^{q-1} e^{2\pi i (n + x)^2/q} \right) e^{-2\pi i k x} \, dx = \sum_{n=0}^{q-1} \int_n^{n+1} e^{2\pi i y^2/q} \, e^{-2\pi i k (y - n)} \, dy
\]


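\[
= \sum_{n=0}^{q-1} \int_n^{n+1} e^{2\pi i \left( \frac{y^2}{q} - k y \right)} \, dy = \int_0^q e^{2\pi i \left( \frac{y^2}{q} - k y \right)} \, dy = q \int_0^1 e^{2\pi i (q u^2 - k q u)} \, du
\]
\[
= q \, e^{-2\pi i q \frac{k^2}{4}} \int_0^1 e^{2\pi i q \left( u - \frac{k}{2} \right)^2} \, du = q \, e^{-2\pi i q \frac{k^2}{4}} \int_{-\frac{k}{2}}^{1 - \frac{k}{2}} e^{2\pi i q v^2} \, dv .
\]
By Definition 4.12, (4.37) and Theorem 4.18 we have
\[
S(q, 1) = f(0) = \lim_{N \to +\infty} \sum_{k=-N}^{+N} \hat{f}(k) = q \lim_{N \to +\infty} \sum_{k=-N}^{+N} e^{-2\pi i q \frac{k^2}{4}} \int_{-\frac{k}{2}}^{1 - \frac{k}{2}} e^{2\pi i q v^2} \, dv
\]
\[
= q \lim_{N \to +\infty} \left( \sum_{\substack{k=-N \\ k \text{ even}}}^{+N} \int_{-\frac{k}{2}}^{1 - \frac{k}{2}} e^{2\pi i q v^2} \, dv + e^{-2\pi i q/4} \sum_{\substack{k=-N \\ k \text{ odd}}}^{+N} \int_{-\frac{k}{2}}^{1 - \frac{k}{2}} e^{2\pi i q v^2} \, dv \right) .
\]
Since
\[
\int_R^{+\infty} e^{2\pi i v^2} \, dv = \int_{R^2}^{+\infty} \frac{e^{2\pi i t}}{2 t^{1/2}} \, dt = \left[ \frac{e^{2\pi i t}}{4\pi i \, t^{1/2}} \right]_{R^2}^{+\infty} + \int_{R^2}^{+\infty} \frac{e^{2\pi i t}}{8\pi i \, t^{3/2}} \, dt \longrightarrow 0
\]
as $|R| \to +\infty$, we obtain
\[
\lim_{N \to +\infty} \sum_{\substack{k=-N \\ k \text{ even}}}^{+N} \int_{-\frac{k}{2}}^{1 - \frac{k}{2}} e^{2\pi i q v^2} \, dv = \lim_{N \to +\infty} \sum_{\substack{k=-N \\ k \text{ odd}}}^{+N} \int_{-\frac{k}{2}}^{1 - \frac{k}{2}} e^{2\pi i q v^2} \, dv = \int_{-\infty}^{+\infty} e^{2\pi i q v^2} \, dv ,
\]
where
\[
\int_{-\infty}^{+\infty} e^{2\pi i q v^2} \, dv := \lim_{R \to +\infty} \int_{-R}^{R} e^{2\pi i q v^2} \, dv .
\]
Then
\[
S(q, 1) = q \left( 1 + e^{-\pi i q/2} \right) \int_{-\infty}^{+\infty} e^{2\pi i q v^2} \, dv = \sqrt{q} \left( 1 + i^{-q} \right) \int_{-\infty}^{+\infty} e^{2\pi i s^2} \, ds .
\]
If we choose $q = 1$ and observe that $S(1, 1) = 1$, then (without use of complex integration$^2$) we obtain
\[
\int_{-\infty}^{+\infty} e^{2\pi i v^2} \, dv = \frac{1}{1 - i} . \tag{4.38}
\]
Then
\[
S(q, 1) = \frac{\sqrt{q} \left( 1 + i^{-q} \right)}{1 - i} .
\]

$^2$ For a complex analytic proof of (4.38) see e.g. [156, p. 181]. For a real analytic proof see [140, 2.7.3].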


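Since
\[
\frac{1 + i^{-q}}{1 - i} =
\begin{cases}
1 & \text{if } q \equiv 1 \pmod 4 , \\
i & \text{if } q \equiv 3 \pmod 4 ,
\end{cases}
\]
we complete the proof.

Second proof of Theorem 4.11 Let $p, q$ be two distinct odd prime numbers. Then Lemmas 4.13 and 4.14 imply
\[
S(pq, 1) = S(p, q) \, S(q, p) = \left( \frac{q}{p} \right)_L \left( \frac{p}{q} \right)_L S(p, 1) \, S(q, 1) .
\]
By Lemma 4.21 and the notation in (4.36) we deduce that
\[
\left( \frac{p}{q} \right)_L \left( \frac{q}{p} \right)_L = \frac{S(pq, 1)}{S(p, 1) \, S(q, 1)} = \frac{\varepsilon_{pq}}{\varepsilon_p \, \varepsilon_q} .
\]
Finally observe that if $p \equiv q \equiv 3 \pmod 4$, then $pq \equiv 1 \pmod 4$ and
\[
\frac{\varepsilon_{pq}}{\varepsilon_p \, \varepsilon_q} = -1 ,
\]
while the other values of $p$ and $q$ give
\[
\frac{\varepsilon_{pq}}{\varepsilon_p \, \varepsilon_q} = 1 .
\]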
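Both Lemma 4.21 and the resulting reciprocity law lend themselves to a quick numerical check. In this sketch the names `gauss_sum`, `epsilon` and `legendre` are hypothetical, the Legendre symbol is computed by Euler's criterion (assumed as a known fact), and the ranges tested are arbitrary.

```python
import cmath
import math


def gauss_sum(q, a=1):
    """S(q, a) = sum_{n=1}^{q} exp(2*pi*i*a*n^2/q), as in (4.18)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * n * n) % q) / q)
               for n in range(1, q + 1))


def epsilon(q):
    """The factor from (4.36) for odd q: 1 if q = 1 (mod 4), i if q = 3 (mod 4)."""
    return 1 if q % 4 == 1 else 1j


def legendre(a, p):
    """Legendre symbol (a/p)_L for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


# Lemma 4.21: |S(q, 1) - epsilon_q * sqrt(q)| for odd q.
lemma_errors = [abs(gauss_sum(q) - epsilon(q) * math.sqrt(q)) for q in range(1, 40, 2)]

# Reciprocity: (p/q)_L (q/p)_L should equal epsilon_pq / (epsilon_p * epsilon_q).
odd_primes = [3, 5, 7, 11, 13, 17, 19, 23]
reciprocity_ok = all(
    legendre(p, q) * legendre(q, p)
    == round((epsilon(p * q) / (epsilon(p) * epsilon(q))).real)
    for p in odd_primes for q in odd_primes if p != q
)
print(max(lemma_errors), reciprocity_ok)
```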

Exercises

1) Do the following congruences have solutions?
\[
x^2 \equiv 219 \pmod{383} , \qquad x^2 \equiv 650 \pmod{1109} , \qquad x^2 \equiv 611 \pmod{1009} .
\]
2) Does the congruence $x^2 \equiv 196 \pmod{1357}$ have a solution? (Observe that 1357 is not a prime number.)
3) Use the identity
\[
\left( \frac{-1}{p} \right)_L =
\begin{cases}
1 & \text{if } p \equiv 1 \pmod 4 , \\
-1 & \text{if } p \equiv 3 \pmod 4
\end{cases}
\]
to deduce another proof of Theorem 3.10.
4) Let $p$ be an odd prime number. Prove the following identities:
\[
\sum_{a=1}^{p-1} \left( \frac{a}{p} \right)_L = 0 , \qquad \sum_{a=1}^{p-2} \left( \frac{a}{p} \right)_L = (-1)^{(p-1)/2} .
\]
5) Let $u_n$ and $v_n$ be two sequences in a Hilbert space $H$. Assume that $u_n \to u$ and $v_n \to v$. Prove that $\langle u_n, v_n \rangle \to \langle u, v \rangle$.
6) Let $f(x)$ have period 1 on $\mathbb{R}$ and satisfy $f(x) = x^2$ when $x \in [0, 1)$. Write the Fourier series of $f$ and find $N$ such that $\|S_N f - f\|_{L^2(\mathbb{T})} \le \frac{1}{10}$.
7) Let $f \in L^2(\mathbb{T})$ and $Q_N(x) = \sum_{k=-N}^{N} c_k \, e^{2\pi i k x}$. Prove that
\[
\|f - S_N f\|_{L^2(\mathbb{T})} \le \|f - Q_N\|_{L^2(\mathbb{T})} .
\]


8) Let $f \in L^1(\mathbb{T})$ and $g \in L^\infty(\mathbb{T})$. Prove that
\[
\lim_{n \to +\infty} \int_0^1 f(t) \, g(nt) \, dt = \hat{f}(0) \, \hat{g}(0) .
\]
9) Let $f \in C(\mathbb{T})$ be piecewise smooth on (say) $[0, 1]$. Prove that the Fourier series of $f$ converges absolutely (that is, $\sum_{n \in \mathbb{Z}} |\hat{f}(n)| < +\infty$) and deduce that $S_N f(x)$ converges uniformly to $f(x)$.
10) Let $f$ be piecewise smooth on $\mathbb{T}$. Prove that the Fourier series of $f$ converges uniformly to $f(x)$ in every closed interval where $f$ is continuous.
11) Compute $\sum_{n=1}^{+\infty} \frac{1}{n^4}$.


5 Sums of squares

In this chapter we first determine the integers which can be written as sums of two squares, then we shall prove a celebrated result of Lagrange which says that every positive integer can be written as a sum of four squares. In geometric terms, Lagrange's theorem says that for every positive integer $n$, we have
\[
\left\{ t \in \mathbb{R}^4 : |t| = \sqrt{n} \right\} \cap \mathbb{Z}^4 \neq \emptyset .
\]
That is, every 4-dimensional sphere with centre 0 and radius $\sqrt{n}$ contains at least one integer point. In the first section we introduce two elegant results with a wide range of applications: Minkowski's theorem and Dirichlet's theorem.

5.1 The theorems of Minkowski and Dirichlet

We need to define lattices in $\mathbb{R}^d$ (the most familiar example is $\mathbb{Z}^d$).

Definition 5.1 Let $\{p_1, \dots, p_d\}$ be a basis of $\mathbb{R}^d$. By a lattice we mean the additive group
\[
L = \left\{ \sum_{j=1}^{d} m_j p_j \right\}_{m_1, \dots, m_d \in \mathbb{Z}}
\]
generated by the above basis. The closed set
\[
K_L := \left\{ \sum_{j=1}^{d} x_j p_j : 0 \le x_j \le 1,\ j = 1, \dots, d \right\} \tag{5.1}
\]
and its translates in $L$ are called fundamental parallelepipeds of $L$, and we write $A(L)$ for their common volume.

Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 04 Jun 2017 at 03:16:08, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781107358379.006

Observe that the union
\[
\bigcup_{p \in L} (K_L + p) = \mathbb{R}^d
\]
is disjoint up to sets of measure zero. The following result has been proved by Minkowski in 1889. It concerns convex bodies in $\mathbb{R}^d$, that is, convex sets with positive and finite measure.

Theorem 5.2 (Minkowski) Let $L$ be a lattice in $\mathbb{R}^d$.

(i) Let $D \subset \mathbb{R}^d$ be a convex body, symmetric around the origin (i.e., $x \in D$ implies $-x \in D$), with volume $|D| > 2^d A(L)$. Then $D$ contains points of $L$ other than 0.
(ii) Let $D \subset \mathbb{R}^d$ be a closed convex body, symmetric around the origin, with volume $|D| \ge 2^d A(L)$. Then $D$ contains points of $L$ other than 0.

Proof (i) We first claim the existence of two different points $x, y \in D$ such that $\frac{1}{2} x - \frac{1}{2} y \in L$. Indeed, for every $p \in L$ we have
\[
D_p := \left( \frac{1}{2} D \cap (K_L + p) \right) - p \subseteq K_L ,
\]
and therefore $\bigcup_{p \in L} D_p \subseteq K_L$. Then
\[
\sum_{p \in L} |D_p| = \left| \frac{1}{2} D \right| = 2^{-d} |D| > A(L) = |K_L| \ge \left| \bigcup_{p \in L} D_p \right| .
\]
Hence the union $\bigcup_{p \in L} D_p$ is not disjoint and there are $p, q \in L$ and $x, y \in D$ ($x \neq y$) such that $\frac{1}{2} x - p = \frac{1}{2} y - q$. Observe that $p \neq q$. By symmetry and convexity we have
\[
0 \neq p - q = \frac{1}{2} x - \frac{1}{2} y = \frac{x + (-y)}{2} \in D .
\]
(ii) Now $D$ is a closed set. We may assume $|D| = 2^d A(L)$. Again we claim that the union $\bigcup_{p \in L} D_p$ is not disjoint. Assume the contrary. If we have $\left| \bigcup_{p \in L} D_p \right| < A(L)$, we proceed as before. Then we assume
\[
\left| \bigcup_{p \in L} D_p \right| = 2^{-d} |D| = A(L) .
\]
We observe that $D$ is a bounded set. Assume the contrary. Since $|D| > 0$ there exist $d + 1$ points $x_1, \dots, x_{d+1}$ in $D$ which are not coplanar. If $D$ is unbounded there exists $y \in D$ far away from a ball containing $x_1, \dots, x_{d+1}$. Then the convex hull of $x_1, \dots, x_{d+1}, y$ (that is, the intersection of all convex sets containing $x_1, \dots, x_{d+1}, y$) is contained in $D$ and has large volume. Hence $D$ is bounded


and the set $\bigcup_{p \in L} D_p$ is the disjoint union of a finite number of closed sets such that
\[
\bigcup_{p \in L} D_p \subseteq K_L , \qquad \left| \bigcup_{p \in L} D_p \right| = |K_L| .
\]
This is impossible because the sets $D_p$ are closed pairwise disjoint sets, therefore they must have positive distance from one another.

Minkowski's theorem has several interesting applications. Among them is Pick's theorem (see [126]), which says that a convex polygon $P$ (with interior $P^o$ and boundary $\partial P$) having vertices in $\mathbb{Z}^2$ has area
\[
|P| = \operatorname{card}\left( P^o \cap \mathbb{Z}^2 \right) + \frac{1}{2} \operatorname{card}\left( \partial P \cap \mathbb{Z}^2 \right) - 1 .
\]
We now introduce Dirichlet's theorem. We know that $\mathbb{Q}$ is dense in $\mathbb{R}$, and it is easy to show that for every $\alpha \in \mathbb{R}$ and $q \in \mathbb{N}$ there exists $p \in \mathbb{Z}$ such that $\left| \alpha - \frac{p}{q} \right| \le \frac{1}{2q}$. Indeed, this means that $|\alpha q - p| \le 1/2$ and this is obvious, since for every number $\alpha q$ there exists an integer which is at most a distance $1/2$ away. The following result, proved by Dirichlet in 1840, is far more interesting.

Theorem 5.3 (Dirichlet) For all $\alpha \in \mathbb{R}$ and $N \in \mathbb{N}$ there exist $p, q \in \mathbb{Z}$ such that $1 \le q \le N$ and $|\alpha q - p| \le 1/(N + 1)$.

Proof The $N + 2$ numbers
\[
0, \{\alpha\}, \{2\alpha\}, \dots, \{N\alpha\}, 1
\]
belong to the interval $[0, 1]$ and, as usual, we write $[\beta]$ for the integral part of a real number $\beta$ and $\{\beta\} = \beta - [\beta]$ for the fractional part. Then$^1$ at least two of them (termed $r$ and $s$, respectively) satisfy $|r - s| \le 1/(N + 1)$. If $r = 0$ and $s = \{k\alpha\}$, then
\[
0 \le k\alpha - [k\alpha] = \{k\alpha\} \le \frac{1}{N + 1}
\]
and we choose $q = k$ and $p = [k\alpha]$. The case $r = \{k\alpha\}$ and $s = 1$ is similar. Then we may assume that $r = \{k\alpha\}$ and $s = \{\ell\alpha\}$, with $0 < k < \ell$. We have
\[
\left| (\ell - k) \alpha - [\ell\alpha] + [k\alpha] \right| = \left| \{\ell\alpha\} - \{k\alpha\} \right| \le \frac{1}{N + 1} .
\]
Then we choose $q = \ell - k$ and $p = [\ell\alpha] - [k\alpha]$. This completes the proof.

$^1$ This easy argument has been popularized by Dirichlet as the pigeonhole principle. It says that if $n + 1$ pigeons have to be put in $n$ holes, then at least one hole will contain more than one pigeon.
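Theorem 5.3 is easy to test by direct search. The sketch below does not follow the pigeonhole argument; it simply scans all denominators $q \le N$ and keeps the best one, which by the theorem must meet the bound $1/(N+1)$. The function name `dirichlet_approx` and the choice $\alpha = \sqrt{2}$, $N = 10$ are illustrative assumptions.

```python
import math


def dirichlet_approx(alpha, N):
    """Return (p, q) with 1 <= q <= N minimising |alpha*q - p|.

    Theorem 5.3 guarantees that this minimum is at most 1/(N+1).
    """
    best = None
    for q in range(1, N + 1):
        p = round(alpha * q)          # nearest integer to alpha*q
        err = abs(alpha * q - p)
        if best is None or err < best[0]:
            best = (err, p, q)
    return best[1], best[2]


alpha, N = math.sqrt(2), 10
p, q = dirichlet_approx(alpha, N)
print(p, q, abs(alpha * q - p), 1 / (N + 1))
```

For these values the search returns the fraction 7/5.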


Corollary 5.4 For every irrational number $\alpha$ there exist infinitely many rational numbers $p/n$ satisfying
\[
\left| \alpha - \frac{p}{n} \right| < \frac{1}{n^2} . \tag{5.2}
\]

Proof Given a positive integer $N_1$, the previous theorem implies the existence of at least one rational number $p_1/n_1$ such that
\[
\left| \alpha - \frac{p_1}{n_1} \right| < \frac{1}{n_1 N_1} \le \frac{1}{n_1^2} .
\]
Since $\alpha$ is irrational, there exists a positive integer $N_2$ such that $|\alpha n_1 - p_1| > \frac{1}{N_2}$. We now repeat the previous argument with $N_2$ in place of $N_1$. In this way we find $\frac{p_2}{n_2} \neq \frac{p_1}{n_1}$ such that $\left| \alpha - \frac{p_2}{n_2} \right| \le \frac{1}{n_2^2}$. And so we go on.

Remark 5.5 Corollary 5.4 is no longer true if $\alpha \in \mathbb{Q}$. Assume the contrary, then there exist integers $a \ge 0$ and $b > 0$ such that for infinitely many pairwise different rational numbers $p_j/q_j$ we have
\[
\left| \frac{a}{b} - \frac{p_j}{q_j} \right| < \frac{1}{q_j^2} .
\]
We may assume $q_j > 0$ and $p_j/q_j \neq a/b$. Observe that the sequence $q_j$ is unbounded. Since
\[
0 \neq \left| \frac{a}{b} - \frac{p_j}{q_j} \right| = \frac{|a q_j - b p_j|}{b q_j} \ge \frac{1}{b q_j} ,
\]
we obtain a contradiction.

The following result shows that the exponent 2 in Corollary 5.4 cannot be improved.

Theorem 5.6 Let $\alpha$ be an irrational algebraic number of degree 2 (for example, $\alpha = \sqrt{2}$). Then there exists a positive constant $H$ such that for every rational number $p/n$ we have
\[
\left| \alpha - \frac{p}{n} \right| \ge \frac{H}{n^2} .
\]

Proof We may assume $n$ large. There exists a second-degree polynomial $Q(x) = M x^2 + N x + R$ with integral coefficients, such that $Q(\alpha) = 0$. By the mean value theorem there exists $\theta$ between $\alpha$ and $p/n$ such that
\[
\frac{Q(\alpha) - Q\left( \frac{p}{n} \right)}{\alpha - \frac{p}{n}} = Q'(\theta) . \tag{5.3}
\]


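Since $\alpha$ is irrational, we have $Q'(\alpha) = 2M\alpha + N \neq 0$. Hence there is $\delta > 0$ such that $|Q'(x)| < 2 |Q'(\alpha)|$ for $|x - \alpha| < \delta$. We may assume $|\alpha - p/n| < 1/n^2$, otherwise there is nothing to prove. Since $n$ is large we have $|\alpha - p/n| < \delta$ and $|\alpha - \theta| < \delta$. Then (5.3) implies
\[
\left| \alpha - \frac{p}{n} \right| = \frac{\left| Q\left( \frac{p}{n} \right) \right|}{|Q'(\theta)|} > \frac{\left| Q\left( \frac{p}{n} \right) \right|}{2 |Q'(\alpha)|} = \frac{1}{2 |Q'(\alpha)|} \left| M \frac{p^2}{n^2} + N \frac{p}{n} + R \right|
\]
\[
= \frac{1}{2 |Q'(\alpha)|} \, \frac{\left| M p^2 + N p n + R n^2 \right|}{n^2} \ge \frac{1}{2 |Q'(\alpha)|} \, \frac{1}{n^2} .
\]
Indeed, $Q\left( \frac{p}{n} \right) \neq 0$ and then $M p^2 + N p n + R n^2$ is a non-zero integer.

Hurwitz proved in 1891 (see [76, Ch. 4] or [90, Ch. XI]) that (5.2) can be improved up to the inequality
\[
\left| \alpha - \frac{p}{n} \right| < \frac{1}{\sqrt{5} \, n^2} . \tag{5.4}
\]
The following result shows that (5.4) cannot be improved further (see e.g. [151]).

Theorem 5.7 Let $L > \sqrt{5}$. Then only a finite number of rational numbers $p/q$ satisfy the inequality
\[
\left| \frac{p}{q} - \frac{\sqrt{5} - 1}{2} \right| < \frac{1}{L q^2} . \tag{5.5}
\]

Proof Assume that $p/q$ satisfies (5.5) for some $L > \sqrt{5}$. Let
\[
f(x) = x^2 + x - 1 = \left( x - \frac{\sqrt{5} - 1}{2} \right) \left( x + \frac{\sqrt{5} + 1}{2} \right) .
\]
Since $f(p/q) \neq 0$, (5.5) implies
\[
\frac{1}{q^2} \le \frac{\left| p^2 + p q - q^2 \right|}{q^2} = \left| \frac{p^2}{q^2} + \frac{p}{q} - 1 \right| = \left| f\left( \frac{p}{q} \right) \right|
\]
\[
= \left| \frac{p}{q} - \frac{\sqrt{5} - 1}{2} \right| \left| \frac{p}{q} + \frac{\sqrt{5} + 1}{2} \right| \le \frac{1}{L q^2} \left| \frac{p}{q} - \frac{\sqrt{5} - 1}{2} + \sqrt{5} \right| \le \frac{1}{L q^2} \left( \frac{1}{L q^2} + \sqrt{5} \right) .
\]
Hence
\[
q^2 \le \frac{1}{L \left( L - \sqrt{5} \right)} .
\]
Then only finitely many rational numbers $p/q$ satisfy (5.5).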
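The dichotomy around $\sqrt{5}$ in Theorem 5.7 can be seen experimentally. The sketch below lists the pairs $(p, q)$ satisfying (5.5) for one constant above $\sqrt{5}$ and counts them for one constant below it. The function name `solutions`, the constants 2.5 and 2.0, and the search range are illustrative assumptions; floating-point arithmetic is used, which is adequate for this small range.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2   # the number appearing in (5.5)


def solutions(L, q_max):
    """Pairs (p, q), 1 <= q <= q_max, with |p/q - ALPHA| < 1/(L*q^2)."""
    found = []
    for q in range(1, q_max + 1):
        # For L >= 2 only the numerator nearest to ALPHA*q can qualify.
        p = round(ALPHA * q)
        if abs(p / q - ALPHA) < 1 / (L * q * q):
            found.append((p, q))
    return found


above = solutions(2.5, 500)   # L > sqrt(5): the bound q^2 <= 1/(L(L - sqrt(5))) applies
below = solutions(2.0, 500)   # L < sqrt(5): solutions keep appearing as q grows
print(above, len(below))
```

With $L = 2.5$ the bound in the proof gives $q^2 \le 1/(L(L - \sqrt{5})) < 2$, so only $q = 1$ can occur.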


5.2 Sums of two squares

Let us go back to the arithmetic function $r(n)$ introduced in (2.14). The following result is due to Girard and Fermat. We will give two proofs, one based on Dirichlet's theorem and another based on Minkowski's theorem.

Theorem 5.8 (Girard and Fermat) Let $p \equiv 1 \pmod 4$ be a prime number. Then $r(p) \ge 1$.

First proof By Remark 4.7 we know that the congruence
\[
z^2 + 1 \equiv 0 \pmod p \tag{5.6}
\]
has solutions. By Theorem 5.3 there exists $w/s$ such that $1 \le s \le \sqrt{p}$ and
\[
\left| \frac{z}{p} - \frac{w}{s} \right| \le \frac{1}{s \left( \left[ \sqrt{p} \right] + 1 \right)} .
\]
Writing $t = zs - wp$ we have
\[
|t| \le \frac{p}{\left[ \sqrt{p} \right] + 1} < \sqrt{p} . \tag{5.7}
\]
Then
\[
s^2 + t^2 = s^2 + (zs - wp)^2 = s^2 \left( 1 + z^2 \right) + p \left( w^2 p - 2zsw \right) .
\]
By (5.6) $p$ divides the above RHS, so that $p \mid \left( s^2 + t^2 \right)$. By (5.7) and the inequality $1 \le s \le \sqrt{p}$ we obtain $0 < s^2 + t^2 < 2p$. Then $p = s^2 + t^2$.

Second proof Let $z$ be as in (5.6). We consider the lattice
\[
L = \left\{ (x, y) \in \mathbb{Z}^2 : y \equiv zx \pmod p \right\}
\]
generated by $(1, z)$ and $(0, p)$. We have $A(L) = p$. The open disc
\[
B^o := \left\{ (x, y) \in \mathbb{R}^2 : x^2 + y^2 < 2p \right\}
\]
centred at 0 and with radius $\sqrt{2p}$ has area $2\pi p > 2^2 p$. Then Minkowski's theorem implies that $B^o$ contains a point of $L$ other than the origin. In other words, there exists $(x, y) \in L$ such that $0 < x^2 + y^2 < 2p$. As in the first proof, note that (5.6) implies
\[
x^2 + y^2 \equiv x^2 \left( 1 + z^2 \right) \equiv 0 \pmod p .
\]
Hence $x^2 + y^2 = p$.
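The first proof is constructive enough to be turned into a small search. The sketch below finds a solution $z$ of (5.6) by brute force and then tries each $s \le \sqrt{p}$, taking for $t$ the residue of $zs$ modulo $p$ closest to 0. The function name `two_squares` is hypothetical, and the brute-force steps are only meant for small primes.

```python
import math


def two_squares(p):
    """Write a prime p = 1 (mod 4) as s^2 + t^2, following the first proof."""
    # Find z with z^2 + 1 = 0 (mod p); brute force is fine for small p.
    z = next(z for z in range(1, p) if (z * z + 1) % p == 0)
    for s in range(1, math.isqrt(p) + 1):
        t = (z * s) % p
        if t > p // 2:          # representative of z*s (mod p) closest to 0
            t -= p
        if s * s + t * t == p:
            return s, abs(t)
    raise ValueError("no representation found; is p a prime = 1 (mod 4)?")


for p in (5, 13, 17, 29, 37, 41, 101):
    s, t = two_squares(p)
    print(p, "=", s, "^2 +", t, "^2")
```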

The following theorem was proved by Jacobi in 1834 and provides a simple way to decide whether a given positive integer can be written as a sum of two squares.


Theorem 5.9 (Jacobi) Let an integer $n > 1$ have canonical decomposition
\[
n = 2^r p_1^{r_1} \cdots p_k^{r_k} q_1^{s_1} \cdots q_\ell^{s_\ell} , \tag{5.8}
\]
where $p_1 \equiv \dots \equiv p_k \equiv 1 \pmod 4$ and $q_1 \equiv \dots \equiv q_\ell \equiv 3 \pmod 4$. Then $n$ can be written as a sum of two squares if and only if every $s_m$ is even.

Proof Let $n = x^2 + y^2$ and let $d = (x, y)$ (the greatest common divisor of $x$ and $y$). Assume (say) that $s_1$ is odd and let
\[
\tilde{n} = \frac{n}{d^2} = \left( \frac{x}{d} \right)^2 + \left( \frac{y}{d} \right)^2 := x_0^2 + y_0^2 .
\]
Then $q_1$ appears with an odd exponent in the canonical decomposition of $\tilde{n}$. Hence $q_1 \mid \left( x_0^2 + y_0^2 \right)$. Then if $q_1 \mid x_0^2$ we also have $q_1 \mid y_0^2$. Since $(x_0, y_0) = 1$, we have $\left( q_1, x_0^2 \right) = \left( q_1, y_0^2 \right) = 1$. Then Theorem 3.5 implies the existence of an integer $z$ such that $y_0 z \equiv x_0 \pmod{q_1}$. Hence
\[
0 \equiv \tilde{n} \equiv x_0^2 + y_0^2 \equiv y_0^2 \left( 1 + z^2 \right) \pmod{q_1} .
\]
Then $z^2 \equiv -1 \pmod{q_1}$, against Remark 4.7. Conversely, let us write
\[
n = 2^r p_1^{r_1} \cdots p_k^{r_k} q_1^{s_1} \cdots q_\ell^{s_\ell} , \tag{5.9}
\]
where
\[
p_1 \equiv \dots \equiv p_k \equiv 1 \pmod 4 , \qquad q_1 \equiv \dots \equiv q_\ell \equiv 3 \pmod 4
\]
and every $s_h$ is even. Since
\[
\left( a^2 + b^2 \right) \left( A^2 + B^2 \right) = (aA - bB)^2 + (aB + bA)^2 , \tag{5.10}
\]
it is sufficient to prove that every prime power factor in the product (5.9) is a sum of two squares. Indeed $2 = 1 + 1$, each $p_j$ is a sum of two squares by Theorem 5.8 and every $q_m^{s_m} = \left( q_m^{s_m/2} \right)^2 + 0$ is trivially a sum of two squares since $s_m$ is even. This completes the proof.

Example 5.10 Let us consider the number $12103 = 19 \cdot 7^2 \cdot 13$. The prime number $19 \equiv 3 \pmod 4$ has an odd power, so 12103 cannot be written as a sum of two squares.

Example 5.11 Let us consider the number $196625 = 11^2 \cdot 5^3 \cdot 13$, where 11 is the only prime factor $\equiv 3 \pmod 4$. Its power is even, so 196625 can be written as a sum of two squares. Let us find a way of doing this. Observe that
\[
11^2 = 11^2 + 0^2 , \qquad 5^3 = 10^2 + 5^2 , \qquad 13 = 3^2 + 2^2
\]


and then, by the argument in (5.10),
\[
196625 = \left( 11^2 + 0^2 \right) \left( 10^2 + 5^2 \right) \left( 3^2 + 2^2 \right) = \left( (11 \cdot 10 + 0 \cdot 5)^2 + (11 \cdot 5 - 0 \cdot 10)^2 \right) \left( 3^2 + 2^2 \right)
\]
\[
= \left( 110^2 + 55^2 \right) \left( 3^2 + 2^2 \right) = (110 \cdot 3 + 55 \cdot 2)^2 + (110 \cdot 2 - 55 \cdot 3)^2 = 440^2 + 55^2 .
\]
The identity (5.10) is clearer in terms of complex numbers. Indeed, let us write
\[
z = a + ib , \qquad Z = A + iB .
\]
Then
\[
|z|^2 = a^2 + b^2 , \qquad |Z|^2 = A^2 + B^2 , \qquad zZ = (aA - bB) + i (aB + bA)
\]
and (5.10) is nothing but $|z|^2 |Z|^2 = |zZ|^2$. The above simple remark introduces the following section.
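The composition rule (5.10) is a one-line computation. The sketch below applies it, exactly as written in (5.10), to the three factors of Example 5.11; the function name `compose` is a hypothetical choice.

```python
def compose(u, v):
    """Combine a^2 + b^2 and A^2 + B^2 into one sum of two squares via (5.10)."""
    (a, b), (A, B) = u, v
    return (a * A - b * B, a * B + b * A)


# 11^2 = 11^2 + 0^2, 5^3 = 10^2 + 5^2, 13 = 3^2 + 2^2, as in Example 5.11.
x, y = compose(compose((11, 0), (10, 5)), (3, 2))
print(x, y, x * x + y * y)
```

This yields $196625 = 220^2 + 385^2$, a different representation from the one in Example 5.11. The example there amounts to applying (5.10) with $B$ replaced by $-B$, which is equally valid, so both decompositions are correct.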

5.3 Gaussian integers

In this section we shall extend Theorem 5.9 by establishing an explicit identity for $r(n)$. This will be done by replacing $\mathbb{Z}$ with the larger ring
\[
\mathbb{Z}[i] = \{ z \in \mathbb{C} : z = x + iy,\ x, y \in \mathbb{Z} \}
\]
of Gaussian integers. $\mathbb{Z}[i]$ can be identified with $\mathbb{Z}^2$, addition and multiplication come from $\mathbb{C}$, as well as conjugation and absolute value. Actually $\mathbb{Z}[i]$ is an integral domain, that is, a commutative ring with 1 that has no zero divisors (that is, $ab = 0$ implies $a = 0$ or $b = 0$). It will be useful to define the function $N(z) = |z|^2$, which takes integral values. Observe that
\[
N(wz) = N(w) \, N(z) .
\]
$\mathbb{Z}[i]$ is the simplest extension of $\mathbb{Z}$ and we shall see that several results from Section 1.1 still hold true for $\mathbb{Z}[i]$.

Proposition 5.12 Let $w \neq 0$ and $z$ be Gaussian integers. Then there exist $q, r \in \mathbb{Z}[i]$ such that $z = qw + r$ and $N(r) < N(w)$.

Proof Since $w \neq 0$, the set $\{qw\}_{q \in \mathbb{Z}[i]}$ is a lattice generated by $w$ and $iw$, with squares of side length $|w|$ as fundamental parallelograms. Then for every $z \in \mathbb{Z}[i]$ there exists $q \in \mathbb{Z}[i]$ such that $|z - qw| \le |w| / \sqrt{2}$, so that $r := z - qw$ satisfies $N(r) \le N(w)/2 < N(w)$. See the figure below.


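[Figure: the square lattice generated by $w$ and $iw$, showing the points $0$, $w$, $iw$, a Gaussian integer $z$ and a nearby lattice point $qw$.]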

If $r = 0$, we say that $w$ is a divisor of $z$ and write $w \mid z$. The four Gaussian integers $\pm 1, \pm i$ are the only elements with absolute value 1 and are the only invertible elements in $\mathbb{Z}[i]$. They are called units. We say that two elements $a, b \in \mathbb{Z}[i]$ are associated if there is a unit $u$ such that $a = ub$. Observe that the elements associated with $x + iy$ are
\[ x + iy , \qquad -x - iy , \qquad -y + ix , \qquad y - ix , \]
while in general $x - iy$ is not associated with $x + iy$.

An element $0 \neq \pi \in \mathbb{Z}[i]$ which is not a unit is called$^2$ a Gaussian prime if its only divisors are units or elements associated with $\pi$, that is, if $\pi = ab$ implies $N(a) = 1$ or $N(b) = 1$. If a positive integer is a Gaussian prime, then it is also a prime number. The converse is not true, as $2 = (1 + i)(1 - i)$ and $5 = (1 + 2i)(1 - 2i)$ are not Gaussian primes. More generally, by Theorem 5.8, every prime number $p \equiv 1 \pmod 4$ can be written as a sum of two squares:
\[ p = x^2 + y^2 = (x + iy)(x - iy) , \]
with non-zero $x$ and $y$, so that $p$ is not a Gaussian prime.

Observe that if $N(\pi)$ is a prime number, then $\pi$ is a Gaussian prime. The converse is false. Indeed, $N(3) = 9$ is not a prime number, but 3 is a Gaussian prime. To see this, assume the contrary and write $3 = ab$, with $1 < N(a) < N(3) = 9$. Then $N(a) = 3$. If $a = a_1 + i a_2$, then $3 = a_1^2 + a_2^2$, which is impossible.

$^2$ We shall not confuse the term prime (or prime number), used only for elements in $\mathbb{Z}$, with Gaussian prime.
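The examples above (2 and 5 are not Gaussian primes, 3 is) can be checked directly from the definition. The sketch below is a brute-force illustration only, not the method of the text: the function names are hypothetical, a Gaussian integer $x + iy$ is passed as two integers, and the search simply looks for a divisor $a + ib$ with $1 < N(a + ib) < N(x + iy)$.

```python
from math import isqrt


def norm(a, b):
    """N(a + ib) = a^2 + b^2."""
    return a * a + b * b


def divides(a, b, x, y):
    """True if the non-zero Gaussian integer a + ib divides x + iy."""
    n = norm(a, b)
    # (x + iy) * conj(a + ib) = (xa + yb) + i(ya - xb) must be a multiple of N(a + ib)
    return (x * a + y * b) % n == 0 and (y * a - x * b) % n == 0


def is_gaussian_prime(x, y):
    """Test x + iy by searching for a non-trivial divisor of smaller norm."""
    n = norm(x, y)
    if n <= 1:
        return False  # zero and the units are excluded by definition
    bound = isqrt(n)  # a proper divisor a + ib has |a|, |b| <= sqrt(N(x + iy))
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            if 1 < norm(a, b) < n and divides(a, b, x, y):
                return False
    return True


for z in [(2, 0), (3, 0), (5, 0), (1, 1), (1, 2)]:
    print(z, is_gaussian_prime(*z))
```

The search is quadratic in $\sqrt{N}$, which is fine for small examples; it is meant to mirror the definition rather than to be efficient.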


The following general result will be proved later in this section.

Theorem 5.13 The Gaussian primes are (up to conjugation and products by units)

(i) the prime numbers $\equiv 3 \pmod 4$,
(ii) the Gaussian integers $\pi$ such that $N(\pi) = 2$ or $N(\pi)$ is a prime number $\equiv 1 \pmod 4$.

First we have to show that the Gaussian integers satisfy a fundamental theorem of arithmetic (see [3, 79, 101, 120]). We need a few steps. We start with the Euclidean algorithm for Gaussian integers. Let $w, z \in \mathbb{Z}[i]$, then Proposition 5.12 allows us to write
\[
\begin{array}{ll}
z = q_1 w + r_1 & \text{with } 0 < N(r_1) < N(w) \\
w = q_2 r_1 + r_2 & \text{with } 0 < N(r_2) < N(r_1) \\
r_1 = q_3 r_2 + r_3 & \text{with } 0 < N(r_3) < N(r_2) \\
r_2 = q_4 r_3 + r_4 & \text{with } 0 < N(r_4) < N(r_3) \\
\quad \vdots & \\
r_{n-3} = q_{n-1} r_{n-2} + r_{n-1} & \text{with } 0 < N(r_{n-1}) < N(r_{n-2}) \\
r_{n-2} = q_n r_{n-1} . &
\end{array}
\]

Following the lines in reverse order we see that $r_{n-1} \mid z$ and $r_{n-1} \mid w$. In a similar way we see that every divisor of $z$ and $w$ is also a divisor of $r_{n-1}$. We have therefore proved the following result (see Remark 1.5).

Theorem 5.14 Let $z$ and $w$ be Gaussian integers which are not both zero. Then there exists $d \in \mathbb{Z}[i]$ such that

(i) $d \mid z$ and $d \mid w$,
(ii) if $t \mid z$ and $t \mid w$, then $t \mid d$,
(iii) there exist two Gaussian integers $a$ and $b$ such that $d = az + bw$.

We call $d$ a greatest common divisor (Gaussian gcd) of $z$ and $w$. The Gaussian gcd is unique up to units. We write $d := (z, w)$.

Theorem 5.15 Let $c \in \mathbb{Z}[i]$ satisfy $N(c) > 1$. Then $c$ can be written as a product of Gaussian primes.

Proof We use induction on the values of $N(c)$. If $N(c) = 2$, then (up to associated Gaussian integers) $c = 1 + i$, hence $c$ is a Gaussian prime. Let $N(c) > 2$. If $c$ is a Gaussian prime, the proof is complete, otherwise we have $c = ab$, with $1 < N(a) < N(c)$ and $1 < N(b) < N(c)$. Then the induction assumption implies $a = \pi_1 \cdots \pi_\ell$ and $b = \pi'_1 \cdots \pi'_m$, where $\pi_1, \ldots, \pi_\ell$ and $\pi'_1, \ldots, \pi'_m$ are Gaussian primes. Hence $c$ is a product of Gaussian primes.
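The division step of Proposition 5.12 and the Euclidean algorithm above can be sketched in exact integer arithmetic. This is an illustrative sketch under stated assumptions: Gaussian integers are represented as pairs `(re, im)`, the names `gdivmod` and `ggcd` are hypothetical, and the quotient is obtained by rounding each coordinate of $z/w$ to the nearest integer, which is one way to realise the choice of $q$ described in the proof.

```python
def gdivmod(z, w):
    """Return (q, r) with z = q*w + r and N(r) <= N(w)/2, for w != 0."""
    (x, y), (a, b) = z, w
    n = a * a + b * b                      # N(w)
    re, im = x * a + y * b, y * a - x * b  # z * conj(w) = N(w) * (z / w)
    # Round re/n and im/n to a nearest integer using integer arithmetic only.
    q = ((2 * re + n) // (2 * n), (2 * im + n) // (2 * n))
    qw = (q[0] * a - q[1] * b, q[0] * b + q[1] * a)
    return q, (x - qw[0], y - qw[1])


def ggcd(z, w):
    """A Gaussian gcd of z and w (determined only up to a unit)."""
    while w != (0, 0):
        _, r = gdivmod(z, w)
        z, w = w, r
    return z


d = ggcd((11, 3), (1, 8))
print(d, d[0] ** 2 + d[1] ** 2)
```

Since the result is only defined up to units, comparing norms (here $N(d)$) is a more robust check than comparing the pairs themselves.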


Proposition 5.16 If $s \mid zw$ and $(s, z) = 1$, then $s \mid w$. Moreover, if $(s, z) = 1$ and $(s, z') = 1$, then $(s, zz') = 1$.

Proof We have $zw = rs$ and $1 = as + bz$ for suitable Gaussian integers $r, a, b$. Then
\[ w = asw + bzw = asw + brs , \]
so that $s \mid w$. As for the second part, we have $1 = as + bz$ and $1 = \alpha s + \beta z'$, hence
\[ 1 = s\,(a\alpha s + bz\alpha + a\beta z') + zz'\, b\beta . \]
Then $(s, zz') = 1$.

Proposition 5.17 Let $\pi, \pi_1, \ldots, \pi_\ell$ be Gaussian primes such that $\pi \mid \pi_1 \cdots \pi_\ell$. Then $\pi$ is associated with one of the Gaussian primes $\pi_1, \ldots, \pi_\ell$.

Proof We may assume that $\pi$ is not associated with any of $\pi_1, \ldots, \pi_{\ell-1}$. Then $(\pi, \pi_j) = 1$ for every $j < \ell$, so that the previous proposition gives first $(\pi, \pi_1 \cdots \pi_{\ell-1}) = 1$ and then $\pi \mid \pi_\ell$. Since $\pi$ and $\pi_\ell$ are Gaussian primes we have $\pi_\ell = u\pi$, where $u$ is a unit.

Theorem 5.18 Every Gaussian integer $z$ (with $N(z) > 1$) can be written as a product of Gaussian primes. The canonical decomposition is unique up to order of factors and multiplication by units.

Proof We only have to prove the uniqueness. Assume that
\[ z = \pi_1 \cdots \pi_\ell = \tilde\pi_1 \cdots \tilde\pi_m . \]
Then $\pi_1 \mid \tilde\pi_1 \cdots \tilde\pi_m$ and the previous proposition implies that $\pi_1$ is associated with (say) $\tilde\pi_1$. Hence
\[ \pi_2 \cdots \pi_\ell = u\, \tilde\pi_2 \cdots \tilde\pi_m , \]
where $u$ is an invertible element. Continuing in this way we see that $\ell = m$ and the Gaussian primes $\pi_j$ and $\tilde\pi_k$ are associated in pairs.

Proof of Theorem 5.13 We prove that a prime number $p \equiv 3 \pmod 4$ is a Gaussian prime. Assume the contrary. Then $p = ab$ with $N(a) = N(b) = p$. If we write $a = x + iy$ we obtain $x^2 + y^2 = p$, contradicting Theorem 5.9. We have already seen that if $N(\pi)$ is a prime number, then $\pi$ is a Gaussian prime. To complete the proof we have to show that if $\pi$ is a Gaussian prime, then it satisfies one of the two conditions in the statement of the theorem. Indeed, $N(\pi) = \pi\bar\pi$, where the complex conjugate $\bar\pi$ is also a Gaussian prime. Hence the identity $N(\pi) = \pi\bar\pi$ represents (up to the order and up to products of units) the unique canonical decomposition of the Gaussian integer $N(\pi)$. If the integer $N(\pi)$ is a prime number and $N(\pi) = 2$ or $N(\pi) \equiv 1 \pmod 4$, then (ii) is satisfied. The case $N(\pi)$ is prime and $N(\pi) \equiv 3 \pmod 4$ is not possible, for $\pi = x + iy$ implies $N(\pi) = x^2 + y^2$, contradicting Theorem 5.9. Finally, if


$N(\pi)$ is not a prime number, we write $N(\pi) = tv$ with $t, v \in \mathbb{N}$ and not equal to 1. Note that $\pi\bar\pi = tv$, so it follows from uniqueness of factorization that $u\pi$ equals, say, $t$ for a suitable unit $u$. As a Gaussian prime, $u\pi = t$ is also a prime number. If $u\pi = 2$ or $u\pi \equiv 1 \pmod 4$, then $u\pi$ is not a Gaussian prime. Hence $u\pi \equiv 3 \pmod 4$.

Now we go back to the study of the arithmetic function $r(n)$.

Lemma 5.19 Let $p \equiv 1 \pmod 4$ be a prime number. Then $p$ is the product of two non-associated Gaussian primes and $r(p) = 8$.

Proof We already know that $p$ is not a Gaussian prime. We can write
\[ p = x^2 + y^2 = (x + iy)(x - iy) = \pi\bar\pi , \]
with $x \neq y$ and $\pi, \bar\pi$ Gaussian primes. By Theorem 5.18 we know that $p$ can be written only as $p = u\,(x + iy)\,\bar u\,(x - iy)$ or $p = u\,(y + ix)\,\bar u\,(y - ix)$, where $u$ is an arbitrary unit. Since $x \neq 0$, $y \neq 0$ and $x \neq y$, then $p$ can be written as a sum of two squares in precisely 8 ways, that is, $r(p) = 8$. In other words we have
\[ p = (\pm x)^2 + (\pm y)^2 = (\pm y)^2 + (\pm x)^2 . \]

We now establish an explicit identity for $r(n)$, proved by Jacobi in 1834.

Theorem 5.20 (Jacobi) Let $n \in \mathbb{N}$ have canonical decomposition
\[ n = 2^r p_1^{r_1} \cdots p_k^{r_k}\, q_1^{s_1} \cdots q_\ell^{s_\ell} , \tag{5.11} \]
where $p_1 \equiv \ldots \equiv p_k \equiv 1 \pmod 4$ and $q_1 \equiv \ldots \equiv q_\ell \equiv 3 \pmod 4$. Then
\[
r(n) =
\begin{cases}
4 \prod_{j=1}^{k} (r_j + 1) & \text{if every } s_m \text{ is even,} \\
0 & \text{otherwise.}
\end{cases}
\]

Proof We look at (5.11) and factor $n$ in $\mathbb{Z}[i]$:
\[ n = (1 + i)^r (1 - i)^r \prod_{j=1}^{k} (x_j + iy_j)^{r_j} (x_j - iy_j)^{r_j} \prod_{m=1}^{\ell} q_m^{s_m} , \]
where $(x_j + iy_j)(x_j - iy_j) = x_j^2 + y_j^2 = p_j$ and Theorem 5.13 implies that the Gaussian integers $1 \pm i$, $x_j \pm iy_j$, $q_m$ are all Gaussian primes. Observe that $n$ is a sum of two squares if and only if there exists $z \in \mathbb{Z}[i]$ such that $n = z\bar z$. Hence $r(n)$ counts the number of these factors $z$ of $n$ in $\mathbb{Z}[i]$. We can have
\[ z = (1 + i)^{r'} (1 - i)^{r - r'} \prod_{j=1}^{k} (x_j + iy_j)^{r'_j} (x_j - iy_j)^{r_j - r'_j} \prod_{m=1}^{\ell} q_m^{s_m/2} , \]


where $0 \le r' \le r$, $0 \le r'_j \le r_j$ for every $j = 1, \ldots, k$. Since $\frac{1+i}{1-i} = i$, we have to count the expressions
\[ u \prod_{j=1}^{k} (x_j + iy_j)^{r'_j} (x_j - iy_j)^{r_j - r'_j} , \]
where $u$ is a unit. Observe that we have four choices for $u$ and $r_j + 1$ choices for each $r'_j$. This completes the proof.

We now show that $r(n)$ and the Dirichlet function $d(n)$ (see (2.8)) are related by the arithmetic function
\[ \delta(n) := d_1(n) - d_3(n) , \tag{5.12} \]
where
\[
\begin{aligned}
d_1(n) &= \operatorname{card} \{ d \in \mathbb{N} : d \mid n,\ d \equiv 1 \pmod 4 \} , \\
d_3(n) &= \operatorname{card} \{ d \in \mathbb{N} : d \mid n,\ d \equiv 3 \pmod 4 \} .
\end{aligned}
\]

Theorem 5.21 We have $r(n) = 4\,\delta(n)$.

Proof Following the notation in (5.8) we write
\[ n = 2^r p_1^{r_1} \cdots p_k^{r_k}\, q_1^{s_1} \cdots q_\ell^{s_\ell} . \tag{5.13} \]
The odd divisors of $n$ are precisely the monomials that we obtain by expanding
\[ \prod_{j=1}^{k} \left( 1 + p_j + p_j^2 + \ldots + p_j^{r_j} \right) \prod_{m=1}^{\ell} \left( 1 + q_m + q_m^2 + \ldots + q_m^{s_m} \right) . \tag{5.14} \]
The divisors of $n$ which are $\equiv 1 \pmod 4$ are the monomials containing an even number of primes $q$ counted with multiplicity, while the divisors $\equiv 3 \pmod 4$ are the monomials with an odd number of terms $q$. In order to compute $\delta(n)$ we therefore replace every $p$ and $q$ in (5.14) with $+1$ and $-1$, respectively. Then
\[
\delta(n) =
\begin{cases}
\prod_{j=1}^{k} (r_j + 1) & \text{if the terms } s_m \text{ are all even,} \\
0 & \text{otherwise.}
\end{cases}
\]
The conclusion now follows from Theorem 5.20.
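The identity $r(n) = 4\,\delta(n)$ is easy to test on small integers. The sketch below is only a numerical check, with hypothetical function names: `r_bruteforce` counts the integer pairs $(x, y)$ with $x^2 + y^2 = n$ directly, and `delta` counts divisors by residue class modulo 4 exactly as in (5.12).

```python
from math import isqrt


def r_bruteforce(n):
    """Number of pairs (x, y) of integers with x^2 + y^2 = n."""
    count = 0
    for x in range(-isqrt(n), isqrt(n) + 1):
        y2 = n - x * x
        y = isqrt(y2)
        if y * y == y2:
            count += 1 if y == 0 else 2  # y and -y are distinct unless y = 0
    return count


def delta(n):
    """d_1(n) - d_3(n): divisors congruent to 1 minus divisors congruent to 3 mod 4."""
    d1 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 1)
    d3 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 3)
    return d1 - d3


for n in (1, 5, 9, 21, 25, 325):
    print(n, r_bruteforce(n), 4 * delta(n))
```

For $n = 325 = 5^2 \cdot 13$ both columns give $4 \cdot 3 \cdot 2 = 24$, in agreement with Theorem 5.20.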

We are now in a position to apply Theorem 2.13 and deduce similar estimates for the mean growth of $r(n)$.

Theorem 5.22 The arithmetic function $r(n)$ satisfies the following conditions.


(i) For every $\varepsilon > 0$ there is a constant $c = c_\varepsilon$ such that $r(n) \le c\, n^\varepsilon$.
(ii) There are no positive constants $c_1$ and $c_2$ such that
\[ r(n) \le c_1 \log^{c_2} n . \tag{5.15} \]

Proof (i) is a direct consequence of Theorem 5.21 and Theorem 2.13. In order to prove (ii), we assume that (5.15) is true and modify the argument in the proof of Theorem 2.13. By Corollary 3.10 there are infinitely many prime numbers $p_1 < p_2 < \ldots$ which are $\equiv 1 \pmod 4$. Choose an integer $h > c_2$ and let $a_n = (p_1 p_2 \cdots p_{h+1})^n$ for every positive integer $n$. Arguing as in the proof of Theorem 2.13, we show that $d(a_n) > K(c) \log^{c}(a_n)$, where $K(c) = \log^{-h-1}(p_1 p_2 \cdots p_{h+1})$ depends only on $h$. Observe that $a_n$ has no divisors $\equiv 3 \pmod 4$, so that $\delta(a_n) = d_1(a_n) = d(a_n)$. Hence $r(a_n) = 4\,\delta(a_n) = 4\,d(a_n)$ by Theorem 5.21, and for large $n$ this lower bound is incompatible with (5.15).

5.4 Sums of four squares

We shall prove the following result, stated by Bachet in 1621 and proved$^3$ by Lagrange in 1770.

Theorem 5.23 (Lagrange) Every positive integer can be written as a sum of four squares.

Assume that $a$ and $b$ are sums of four squares. Then the same is true for $ab$. Indeed
\[
\begin{aligned}
\left( x_1^2 + x_2^2 + x_3^2 + x_4^2 \right) \left( y_1^2 + y_2^2 + y_3^2 + y_4^2 \right)
&= (x_1 y_1 + x_2 y_2 + x_3 y_3 + x_4 y_4)^2 + (x_1 y_2 - x_2 y_1 + x_3 y_4 - x_4 y_3)^2 \\
&\quad + (x_1 y_3 - x_3 y_1 + x_4 y_2 - x_2 y_4)^2 + (x_1 y_4 - x_4 y_1 + x_2 y_3 - x_3 y_2)^2 .
\end{aligned}
\tag{5.16}
\]
Then Theorem 5.23 is a corollary of the following result.

Theorem 5.24 Every prime number can be written as a sum of four squares.

Let $\omega_d$ be the volume of the $d$-dimensional unit ball. Observe that
\[ \omega_{d+1} = \omega_d \int_{-1}^{1} \left( \sqrt{1 - x^2} \right)^d dx . \tag{5.17} \]

$^3$ Lagrange acknowledged the influence of Euler. See [72, Vol. II, Ch. VIII] for a story of this result.


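Then
\[
\omega_4 = \omega_3 \int_{-1}^{1} \left( 1 - x^2 \right)^{3/2} dx = \frac{8}{3}\pi \int_0^{\pi/2} \cos^4\theta \, d\theta = \frac{1}{6}\pi \int_0^{\pi/2} \left( e^{i\theta} + e^{-i\theta} \right)^4 d\theta = \frac{1}{2}\pi^2 .
\]
In general (see Lemma 11.1) we have $\omega_d = \dfrac{\pi^{d/2}}{\Gamma\left(\frac{d}{2} + 1\right)}$, where
\[ \Gamma(x) := \int_0^{+\infty} y^{x-1} e^{-y} \, dy \tag{5.18} \]
is the gamma function. Integration by parts proves the identities
\[ \Gamma(x + 1) = x\,\Gamma(x) , \qquad \Gamma(n + 1) = n! . \tag{5.19} \]
Then the identity $\omega_{2k} = \pi^k / k!$ follows.

Lemma 5.25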

For every prime number $p$ there exist $a, b \in \mathbb{Z}$ such that
\[ a^2 + b^2 \equiv -1 \pmod p . \tag{5.20} \]

Proof Since $1^2 + 0^2 \equiv -1 \pmod 2$, we may assume that $p$ is odd. Arguing as in the proof of Theorem 4.5 we observe that the $(p + 1)/2$ numbers in the set
\[ A = \left\{ 0^2, 1^2, 2^2, \ldots, \left( \frac{p-1}{2} \right)^2 \right\} \]
are pairwise non-congruent modulo $p$, and the same is true for the $(p + 1)/2$ numbers in the set
\[ B = \left\{ -1 - 0^2, -1 - 1^2, -1 - 2^2, \ldots, -1 - \left( \frac{p-1}{2} \right)^2 \right\} . \]
Together $A$ and $B$ list $p + 1$ numbers, while there are only $p$ residue classes modulo $p$. Then there exist $a^2 \in A$ and $-1 - b^2 \in B$ such that $a^2 \equiv -1 - b^2 \pmod p$.

Proof of Theorem 5.24 Let $L$ be the 4-dimensional lattice generated by
\[ (1, 0, a, -b) , \quad (0, 1, b, a) , \quad (0, 0, p, 0) , \quad (0, 0, 0, p) , \]
where $a, b$ satisfy (5.20). Then $A(L)$ is (the absolute value of)
\[
\det \begin{bmatrix}
1 & 0 & a & -b \\
0 & 1 & b & a \\
0 & 0 & p & 0 \\
0 & 0 & 0 & p
\end{bmatrix} ,
\]


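that is, $A(L) = p^2$. We take a ball $rB \subset \mathbb{R}^4$ centred at 0, with radius $r$ satisfying
\[ \frac{2^{5/2}\, p}{\pi} < r^2 < 2p . \tag{5.21} \]
The volume of the 4-dimensional ball $rB$ is equal to $\pi^2 r^4 / 2$. Then (5.21) yields
\[ \frac{\pi^2}{2}\, r^4 > 2^4 p^2 = 2^4 A(L) . \]
By Minkowski's theorem, $rB$ contains a point $0 \neq P \in L$. Let us write
\[
\begin{aligned}
P &= \gamma_1 (1, 0, a, -b) + \gamma_2 (0, 1, b, a) + \gamma_3 (0, 0, p, 0) + \gamma_4 (0, 0, 0, p) \\
&= (\gamma_1 ,\ \gamma_2 ,\ a\gamma_1 + b\gamma_2 + p\gamma_3 ,\ -b\gamma_1 + a\gamma_2 + p\gamma_4) .
\end{aligned}
\]
Then
\[ 0 < \gamma_1^2 + \gamma_2^2 + (a\gamma_1 + b\gamma_2 + p\gamma_3)^2 + (-b\gamma_1 + a\gamma_2 + p\gamma_4)^2 < 2p . \tag{5.22} \]
Since $(a\gamma_1 + b\gamma_2)^2 + (-b\gamma_1 + a\gamma_2)^2 = (a^2 + b^2)(\gamma_1^2 + \gamma_2^2)$ and since $1 + a^2 + b^2 \equiv 0 \pmod p$, we obtain
\[
\begin{aligned}
\gamma_1^2 + \gamma_2^2 + (a\gamma_1 + b\gamma_2 + p\gamma_3)^2 + (-b\gamma_1 + a\gamma_2 + p\gamma_4)^2
&\equiv \gamma_1^2 + \gamma_2^2 + (a\gamma_1 + b\gamma_2)^2 + (-b\gamma_1 + a\gamma_2)^2 \\
&\equiv (\gamma_1^2 + \gamma_2^2)(1 + a^2 + b^2) \equiv 0 \pmod p .
\end{aligned}
\tag{5.23}
\]
Then, by (5.22) and (5.23),
\[ \gamma_1^2 + \gamma_2^2 + (a\gamma_1 + b\gamma_2 + p\gamma_3)^2 + (-b\gamma_1 + a\gamma_2 + p\gamma_4)^2 = p . \]
In this way we have written $p$ as a sum of four squares.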
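The proof above is not constructive, since Minkowski's theorem only asserts that a short lattice point exists. For small primes one can nevertheless find it by exhaustive search. The following sketch is an illustration under stated assumptions, not the argument of the text: the names `find_ab` and `four_squares_of_prime` are hypothetical, `find_ab` follows the pigeonhole idea of Lemma 5.25, and the short lattice point is located by brute force over $\gamma_1, \gamma_2$, with the last two coordinates reduced to their residues of smallest absolute value modulo $p$.

```python
from math import isqrt


def find_ab(p):
    """Return (a, b) with a^2 + b^2 congruent to -1 mod p, for a prime p."""
    half = p // 2
    squares = {a * a % p: a for a in range(half + 1)}  # residues of the set A
    for b in range(half + 1):
        target = (-1 - b * b) % p                       # an element of the set B
        if target in squares:
            return squares[target], b
    raise ValueError("no solution found; p is assumed to be prime")


def four_squares_of_prime(p):
    """Search the lattice of the proof for a point of squared length p."""
    a, b = find_ab(p)

    def centred(v):
        # residue of v modulo p with the smallest absolute value
        return (v + p // 2) % p - p // 2

    m = isqrt(p)
    for g1 in range(-m, m + 1):
        for g2 in range(-m, m + 1):
            c3 = centred(a * g1 + b * g2)
            c4 = centred(-b * g1 + a * g2)
            if g1 * g1 + g2 * g2 + c3 * c3 + c4 * c4 == p:
                return g1, g2, c3, c4
    raise ValueError("no representation found; p is assumed to be prime")


for p in (2, 3, 7, 31, 97):
    print(p, four_squares_of_prime(p))
```

The search examines roughly $4p$ candidate pairs, so it is practical only for moderate $p$.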

As an example, let us write 682 as a sum of four squares. Following (5.16) we write
\[
\begin{aligned}
682 = 22 \cdot 31 &= \left( 3^2 + 3^2 + 2^2 + 0^2 \right) \left( 5^2 + 2^2 + 1^2 + 1^2 \right) \\
&= (3 \cdot 5 + 3 \cdot 2 + 2 \cdot 1)^2 + (3 \cdot 2 - 3 \cdot 5 + 2 \cdot 1)^2 + (3 \cdot 1 - 2 \cdot 5 - 3 \cdot 1)^2 + (3 \cdot 1 + 3 \cdot 1 - 2 \cdot 2)^2 \\
&= 23^2 + 7^2 + 10^2 + 2^2 .
\end{aligned}
\]
Lagrange's theorem can be proved without appealing to Minkowski's theorem (see e.g. [9] or [63]). Jacobi proved in 1828 the stronger result that the number of ways to write a


positive integer $n$ as a sum of four squares is eight times the sum of the divisors of $n$ that are not multiples of 4 (see [7] or [128]). If we want to write a positive integer as a sum of cubes, we need nine of them (not less, because $23 = 2 \cdot 2^3 + 7 \cdot 1^3$). Hilbert proved in 1909 the following general result (see e.g. [98]).

Theorem 5.26 (Hilbert) For every positive integer $k$ there exists a positive integer $g(k)$ such that every positive integer is a sum of $g(k)$ $k$th powers of non-negative integers.
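Both the four-square identity (5.16), as used in the example for 682, and Jacobi's count of representations can be checked numerically for small inputs. The sketch below uses hypothetical function names; `r4_bruteforce` counts ordered quadruples of integers (signs and order matter), which is the counting convention assumed here for Jacobi's statement.

```python
from math import isqrt


def four_square_product(x, y):
    """Apply identity (5.16) to two quadruples of integers."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4,
        x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3,
        x1 * y3 - x3 * y1 + x4 * y2 - x2 * y4,
        x1 * y4 - x4 * y1 + x2 * y3 - x3 * y2,
    )


def r4_bruteforce(n):
    """Number of integer quadruples (a, b, c, d) with a^2 + b^2 + c^2 + d^2 = n."""
    rng = range(-isqrt(n), isqrt(n) + 1)
    return sum(
        1
        for a in rng for b in rng for c in rng for d in rng
        if a * a + b * b + c * c + d * d == n
    )


def jacobi_count(n):
    """Eight times the sum of the divisors of n that are not multiples of 4."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)


z = four_square_product((3, 3, 2, 0), (5, 2, 1, 1))
print(z, sum(c * c for c in z))
print([(n, r4_bruteforce(n), jacobi_count(n)) for n in range(1, 9)])
```

The brute-force count grows like $n^2$ in cost, so the comparison is only meant for small $n$.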

Exercises

1) Let $k$ be a positive integer and let $A \subseteq \mathbb{R}^d$ be a measurable set with volume $> k$. Prove the existence of $k + 1$ different points $x_1, x_2, \ldots, x_{k+1}$ such that $x_i - x_j \in \mathbb{Z}^d$ for every $i, j$.
2) Let $D \subseteq \mathbb{R}^d$ be a convex set, symmetric around the origin. Assume that $D \cap \mathbb{Z}^d = \{0\}$. Prove that if $p$ and $q$ are different points in $\mathbb{Z}^d$, then
\[ \left( \tfrac{1}{2} D + p \right) \cap \left( \tfrac{1}{2} D + q \right) = \emptyset . \]
3) Given a convex set $D \subseteq \mathbb{R}^d$, let $\operatorname{ch}(D)$ be the convex hull of $D$, that is, the intersection of all convex sets containing $D$. Prove that
\[ \operatorname{ch}(D) = \left\{ t \in \mathbb{R}^d : t = \alpha_1 t_1 + \ldots + \alpha_k t_k,\ k \in \mathbb{N},\ t_1, \ldots, t_k \in D,\ \alpha_1 \ge 0, \ldots, \alpha_k \ge 0,\ \alpha_1 + \ldots + \alpha_k = 1 \right\} . \]
4) Prove in an elementary way that no integer $m \equiv 3 \pmod 4$ can be written as a sum of two squares.
5) Prove that $q$ and $r$ in Proposition 5.12 are not uniquely determined. Show that $r = 0$ if and only if $q \mid z$.
6) Let $\pi$ be a Gaussian prime. Prove that there exists one and only one prime number $p$ such that $\pi \mid p$.
7) Let $(a, b, c)$ be a primitive Pythagorean triple, that is, $a, b, c$ are positive integers with no common factor and satisfying $a^2 + b^2 = c^2$. Prove that $a + ib$ and $a - ib$ are relatively prime in $\mathbb{Z}[i]$.
8) Let $\alpha, \beta$ be coprime Gaussian integers. Assume that $\alpha\beta$ is a square. Prove that there exists a unit $u$ such that $u\alpha$ and $u^{-1}\beta$ are squares.
9) Give a proof of Lemma 5.19 without appealing to Gaussian integers.
10) Determine the smallest positive integer which can be written as a sum of two squares in more than eight ways.


11) Determine all representations of 325 as a sum of two squares.
12) Prove that a positive integer $n$ can be written as a difference of two squares if and only if $n \not\equiv 2 \pmod 4$. Prove that this representation is unique (up to signs) when $n$ is a prime number.
13) Write 874 as a sum of four squares.


6 Uniform distribution and completeness of the trigonometric system

In Chapter 2 we have seen proofs of the 'elementary' estimates for the mean growth of the arithmetic functions $d(n)$ and $r(n)$ which consist of counting the integer points (that is, the points with integral coordinates) in certain domains in $\mathbb{R}^2$. In the second part of this book we shall use Fourier analytic methods to improve the above-mentioned elementary estimates and to study several related results concerning lattice points in large domains in $\mathbb{R}^d$. As an essentially more general point of view, we shall consider finite sequences $\{x_j\}_{j=1}^{N} \subset \mathbb{T}^d = \mathbb{R}^d / \mathbb{Z}^d$, and we shall investigate the 'discrepancy' between the integral $\int_{\mathbb{T}^d} f(x)\,dx$ and the Riemann sum $\frac{1}{N} \sum_{j=1}^{N} f(x_j)$, where $f$ belongs to a suitable class of functions on $\mathbb{T}^d$.

We begin the study of this multidimensional numerical integration problem by considering the case where $f$ is the characteristic function of an arbitrary interval in $\mathbb{T}$ or $\mathbb{T}^d$. This will lead us to the definition of a uniformly distributed sequence, which is the main topic of this chapter.

Let us go back to Corollary 5.4, which says that for every irrational number $\alpha$ there exist infinitely many rational numbers $p/j$ such that
\[ \left| \alpha - \frac{p}{j} \right| < \frac{1}{j^2} . \]
Then for every interval $I_0 \subseteq \mathbb{T}$ containing the origin, and for every irrational number $\alpha$, there exist infinitely many different points of the sequence $j\alpha$ which belong to $I_0$. We will show that the same is true for every interval $I \subseteq \mathbb{T}$, so that $j\alpha$ is dense in $\mathbb{T}$ (this means that the fractional parts$^1$ $\{j\alpha\} = j\alpha - \lfloor j\alpha \rfloor$ are dense in $[0, 1)$). This comes as a consequence of Kronecker's theorem, a deep result that we are going to describe in the coming section.

$^1$ To avoid any possible confusion, we shall always write $\{\beta\}$ for the fractional part of a real number $\beta$. A sequence will be written with no curly brackets, for example, $j\alpha$, $t(j)$, $t_j$, or with brackets and explicit indices, for example, $\{j\alpha\}_{j=1}^{\infty}$, $\{t(j)\}_{j=1}^{\infty}$, $\{t_j\}_{j=1}^{N}$.

6.1 Kronecker's theorem

The density of $j\alpha$ in $\mathbb{T}$ concerns the set $\{j\alpha\}_{j=1}^{\infty}$, while Kronecker's theorem deals with the sequence $\{j\alpha\}_{j=1}^{\infty}$, showing that, in a certain sense, this sequence fills all the intervals in $\mathbb{T}$ at a proper speed. More precisely, $j\alpha$ satisfies the following definition (see [86, 95, 114]).

Definition 6.1 Let $t(j)$ be a sequence taking values in $\mathbb{T}$. We say that $t(j)$ is uniformly distributed in $\mathbb{T}$ if for every interval $I \subseteq \mathbb{T}$ we have
\[ \lim_{N \to \infty} \frac{\operatorname{card} \{ t(j) \in I : 1 \le j \le N \}}{N} = |I| , \]
where $|I|$ is the length of $I$.

Observe that every interval in $[0, 1)$ is an interval in $\mathbb{T}$, while there are intervals in $\mathbb{T}$ which are only the union of two intervals in $[0, 1)$. If $u(j)$ is a real sequence, we consider the sequence $t(j) = \{u(j)\}$ of its fractional parts. If $t(j)$ is uniformly distributed in $[0, 1)$, then we say that $u(j)$ is uniformly distributed mod 1. Sometimes we may simply say that $u(j)$ is uniformly distributed.

It is easy to extend Definition 6.1 to several variables. By an interval in $\mathbb{T}^d$ we shall mean the product $I = I_1 \times I_2 \times \ldots \times I_d$ of $d$ intervals in $\mathbb{T}$.

Definition 6.2 A sequence $t(j)$ taking values in $\mathbb{T}^d$ is uniformly distributed in $\mathbb{T}^d$ if for every interval $I \subseteq \mathbb{T}^d$ we have
\[ \lim_{N \to \infty} \frac{\operatorname{card} \{ t(j) \in I : 1 \le j \le N \}}{N} = |I| , \tag{6.1} \]
where $|I|$ is the volume of $I$.

We can now state Kronecker's theorem.

Theorem 6.3 (Kronecker) Let the real numbers $1, \alpha_1, \alpha_2, \ldots, \alpha_d$ be linearly independent over $\mathbb{Q}$. Let $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_d)$, then the sequence $j\alpha$ is uniformly distributed in $\mathbb{T}^d$.

We shall see the proof of this theorem in Section 6.3. Recall that if we assume $1, \alpha_1, \alpha_2, \ldots, \alpha_d$ linearly independent over $\mathbb{Q}$, then the numbers $\alpha_1, \alpha_2, \ldots, \alpha_d$ must be irrational. The example $\alpha_1 = \sqrt{2}$, $\alpha_2 = \sqrt{2} - 1$ shows that the converse is not true when $d \ge 2$.
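The role of the independence assumption can be seen in a small numerical experiment. The sketch below is only an illustration with floating-point arithmetic: the function name `box_fraction`, the box and the value of `N` are hypothetical choices. It counts the proportion of points $j\alpha$ (mod 1), $1 \le j \le N$, that fall in a box, first for $\alpha = (\sqrt{2}, \sqrt{3})$ and then for the dependent pair $(\sqrt{2}, \sqrt{2} - 1)$ from the remark above.

```python
from math import sqrt


def box_fraction(alpha, box, N):
    """Fraction of the points j*alpha mod 1, 1 <= j <= N, lying in the box.

    alpha: tuple of d real numbers.
    box: tuple of d half-open intervals (lo, hi) contained in [0, 1).
    """
    hits = 0
    for j in range(1, N + 1):
        if all(lo <= (j * a) % 1.0 < hi for a, (lo, hi) in zip(alpha, box)):
            hits += 1
    return hits / N


N = 20000
box = ((0.0, 0.5), (0.0, 0.5))      # an interval in T^2 of volume 1/4
independent = (sqrt(2), sqrt(3))    # 1, sqrt(2), sqrt(3) are independent over Q
dependent = (sqrt(2), sqrt(2) - 1)  # both irrational, but alpha_1 - alpha_2 = 1

print(box_fraction(independent, box, N))  # expected to be close to the volume 1/4
print(box_fraction(dependent, box, N))    # expected to be close to 1/2 instead
```

In the dependent case the two coordinates of $j\alpha$ coincide modulo 1, so the points stay on the diagonal of the square and the proportion reflects the length of an interval rather than the volume of the box.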


Going back to Definition 6.2, we rewrite (6.1) in the form
\[ \lim_{N \to \infty} \left( \frac{1}{N} \sum_{j=1}^{N} \chi_I(t(j)) \right) = \int_{\mathbb{T}^d} \chi_I(t) \, dt , \tag{6.2} \]
which shows an integral equal to a limit of Riemann sums, at least for the class of characteristic functions$^2$ of intervals. Therefore the definition of the Riemann integral makes the following characterization natural.

Theorem 6.4 (Weyl) A sequence $t(j)$ taking values in $\mathbb{T}^d$ is uniformly distributed if and only if
\[ \lim_{N \to \infty} \left( \frac{1}{N} \sum_{j=1}^{N} f(t(j)) \right) = \int_{\mathbb{T}^d} f(x) \, dx \tag{6.3} \]
for every Riemann integrable function $f$ on $\mathbb{T}^d$.

Proof The 'if' part is obvious since (6.2) is a particular case of (6.3). Let us prove the 'only if' part. Let $t(j)$ be uniformly distributed in $\mathbb{T}^d$ and let $f$ be a Riemann integrable function on $\mathbb{T}^d$. Then for every $\varepsilon > 0$ there exist two finite linear combinations
\[ s(x) = \sum_h a_h \chi_{I_h}(x) , \qquad S(x) = \sum_h A_h \chi_{I_h}(x) \]
of characteristic functions of intervals $I_h$ such that $s(x) \le f(x) \le S(x)$ for every $x$ and $\int_{\mathbb{T}^d} (S(x) - s(x)) \, dx \le \varepsilon$. Since $t(j)$ is uniformly distributed, we have, for every $h$,
\[ \lim_{N \to \infty} \left( \frac{1}{N} \sum_{j=1}^{N} \chi_{I_h}(t(j)) \right) = |I_h| . \]
Since the sum over $h$ is finite, we have
\[
\begin{aligned}
\lim_{N \to \infty} \left( \frac{1}{N} \sum_{j=1}^{N} s(t(j)) \right)
&= \lim_{N \to \infty} \left( \frac{1}{N} \sum_{j=1}^{N} \sum_h a_h \chi_{I_h}(t(j)) \right) \\
&= \sum_h a_h \lim_{N \to \infty} \left( \frac{1}{N} \sum_{j=1}^{N} \chi_{I_h}(t(j)) \right) = \sum_h a_h |I_h| = \int_{\mathbb{T}^d} s(x) \, dx ,
\end{aligned}
\]
and the same holds true for $S(x)$. Hence the theorem follows from the definition of the Riemann integral and from the inequalities
\[ \frac{1}{N} \sum_{j=1}^{N} s(t(j)) \le \frac{1}{N} \sum_{j=1}^{N} f(t(j)) \le \frac{1}{N} \sum_{j=1}^{N} S(t(j)) . \]

$^2$ The characteristic function of a set $E$ is
\[
\chi_E(x) :=
\begin{cases}
1 & \text{if } x \in E, \\
0 & \text{if } x \notin E.
\end{cases}
\]

The proof of Kronecker's theorem is an immediate corollary of a simple and deep characterization of uniformly distributed sequences, called the Weyl criterion, which is in turn related to the completeness of the trigonometric system.
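Weyl's characterization (6.3) can be illustrated numerically in dimension one. The sketch below is only an experiment with hypothetical names and parameters: it averages $f(x) = x^2$ (read on $[0, 1)$) along the sequence $j\alpha$ mod 1 and compares the result with $\int_0^1 x^2\,dx = 1/3$, once for $\alpha = \sqrt{2}$ and once for the rational value $\alpha = 1/2$, for which the sequence is not uniformly distributed.

```python
from math import sqrt


def sequence_average(f, alpha, N):
    """(1/N) * sum of f({j * alpha}) over 1 <= j <= N, with {.} the fractional part."""
    return sum(f((j * alpha) % 1.0) for j in range(1, N + 1)) / N


def f(x):
    return x * x  # Riemann integrable on [0, 1), with integral 1/3


N = 20000
print(sequence_average(f, sqrt(2), N))  # expected to be close to 1/3
print(sequence_average(f, 0.5, N))      # the points are only 0.5 and 0.0
```

For $\alpha = 1/2$ the average is that of the two values $f(1/2) = 1/4$ and $f(0) = 0$, so the Riemann sums converge to $1/8$ rather than to the integral.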

6.2 Completeness of the trigonometric system

Let $f$ be a measurable function on $\mathbb{T}^d$ and let $p \ge 1$ be a real number. We say that $f$ belongs to $L^p(\mathbb{T}^d)$ if
\[ \| f \|_{L^p(\mathbb{T}^d)} := \left( \int_{\mathbb{T}^d} | f(t) |^p \, dt \right)^{1/p} < +\infty . \]
Observe that $\| f \|_{L^p(\mathbb{T}^d)} = 0$ if and only if $f(t) = 0$ for almost every $t \in \mathbb{T}^d$. Let $f$ be a measurable function on $\mathbb{T}^d$. We define $\| f \|_{L^\infty(\mathbb{T}^d)}$ to be the smallest $K$ such that $| f(t) | \le K$ for almost every $t \in \mathbb{T}^d$. Then $L^\infty(\mathbb{T}^d)$ is the space of all functions admitting such a $K$.

Theorem 6.5 (Hölder's inequality) Let $1 \le p, q \le \infty$ with $\frac{1}{p} + \frac{1}{q} = 1$ and let $f, g$ be measurable non-negative functions on $\mathbb{T}^d$. Then
\[ \int_{\mathbb{T}^d} f g \le \| f \|_{L^p(\mathbb{T}^d)} \, \| g \|_{L^q(\mathbb{T}^d)} . \tag{6.4} \]

Proof If the two indices $p, q$ take values 1 and $\infty$, then the inequality is simple. We therefore assume that $p > 1$ and $q > 1$. It is enough to prove (6.4) when $\| f \|_{L^p(\mathbb{T}^d)} = \| g \|_{L^q(\mathbb{T}^d)} = 1$. Moreover, without loss of generality, we may assume that $f$ and $g$ are positive on $\mathbb{T}^d$. For every $t \in \mathbb{T}^d$ let $s = s(t)$ and $u = u(t)$ be the real numbers satisfying $f(t) = e^{s/p}$ and $g(t) = e^{u/q}$. We may assume that $s \le u$, and consequently observe that
\[ s \le s + \frac{u - s}{q} = \frac{s}{p} + \frac{u}{q} = u + \frac{s - u}{p} \le u . \tag{6.5} \]


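Let $\varphi(r) = ar + b$ be the straight line joining the points $(s, e^s)$ and $(u, e^u)$. By convexity we have $e^r \le \varphi(r)$ for every $s \le r \le u$. Then, by (6.5),
\[
\begin{aligned}
e^{\frac{s}{p} + \frac{u}{q}} \le \varphi\left( \frac{s}{p} + \frac{u}{q} \right)
&= a \left( \frac{s}{p} + \frac{u}{q} \right) + b = \frac{1}{p}(as + b) + \frac{1}{q}(au + b) \\
&= \frac{1}{p} \varphi(s) + \frac{1}{q} \varphi(u) = \frac{1}{p} e^s + \frac{1}{q} e^u ,
\end{aligned}
\]
that is,
\[ f(t)\, g(t) \le \frac{1}{p} f^p(t) + \frac{1}{q} g^q(t) . \]
The integration of both sides gives
\[ \int_{\mathbb{T}^d} f g \le \frac{1}{p} \int_{\mathbb{T}^d} f^p + \frac{1}{q} \int_{\mathbb{T}^d} g^q = 1 . \]

Corollary 6.6 Let $1 \le p \le q \le +\infty$. Then
\[ \| f \|_{L^p(\mathbb{T}^d)} \le \| f \|_{L^q(\mathbb{T}^d)} . \]

Proof We may assume $1 \le p < q < +\infty$, then $q/p > 1$. Since $\int_{\mathbb{T}^d} dt = 1$, Hölder's inequality implies
\[ \| f \|_{L^p(\mathbb{T}^d)} = \left( \int_{\mathbb{T}^d} | f |^p \right)^{1/p} = \left( \int_{\mathbb{T}^d} | f |^p \cdot 1 \right)^{1/p} \le \left( \int_{\mathbb{T}^d} | f |^q \right)^{1/q} \cdot 1 = \| f \|_{L^q(\mathbb{T}^d)} . \]

Proposition 6.7 (Minkowski's inequality) For every $1 \le p \le \infty$ we have
\[ \| f + g \|_{L^p(\mathbb{T}^d)} \le \| f \|_{L^p(\mathbb{T}^d)} + \| g \|_{L^p(\mathbb{T}^d)} . \]

Proof The result is obvious if $p = 1$ or $p = \infty$. Let $1 < p < \infty$. We may assume that $f \neq -g$. By Hölder's inequality we have
\[
\begin{aligned}
\int_{\mathbb{T}^d} | f + g |^p &\le \int_{\mathbb{T}^d} \left( | f | + | g | \right) | f + g |^{p-1} \\
&\le \| f \|_{L^p(\mathbb{T}^d)} \left\| | f + g |^{p-1} \right\|_{L^q(\mathbb{T}^d)} + \| g \|_{L^p(\mathbb{T}^d)} \left\| | f + g |^{p-1} \right\|_{L^q(\mathbb{T}^d)} \\
&= \left( \| f \|_{L^p(\mathbb{T}^d)} + \| g \|_{L^p(\mathbb{T}^d)} \right) \left( \int_{\mathbb{T}^d} | f + g |^p \right)^{1 - 1/p} ,
\end{aligned}
\]
since $\frac{1}{p} + \frac{1}{q} = 1$ means $q = p/(p - 1)$. Then
\[ \left( \int_{\mathbb{T}^d} | f + g |^p \right)^{1/p} \le \| f \|_{L^p(\mathbb{T}^d)} + \| g \|_{L^p(\mathbb{T}^d)} . \]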


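Remark 6.8 Let $1 < p < \infty$ and let $f$ be a real function belonging to $L^p(\mathbb{T}^d)$. Let
\[
g(t) =
\begin{cases}
| f(t) |^{p-1} & \text{if } f(t) \ge 0, \\
- | f(t) |^{p-1} & \text{if } f(t) < 0.
\end{cases}
\]
Then Hölder's inequality implies
\[ \int_{\mathbb{T}^d} | f |^p = \int_{\mathbb{T}^d} f g \le \| f \|_{L^p(\mathbb{T}^d)} \, \| g \|_{L^q(\mathbb{T}^d)} = \left( \int_{\mathbb{T}^d} | f |^p \right)^{1/p} \left( \int_{\mathbb{T}^d} | f |^p \right)^{1/q} = \int_{\mathbb{T}^d} | f |^p . \]
Then for every non-zero function $f \in L^p(\mathbb{T}^d)$ there exists a non-zero function $g \in L^q(\mathbb{T}^d)$ such that Hölder's inequality becomes an equality:
\[ \int_{\mathbb{T}^d} f g = \| f \|_{L^p(\mathbb{T}^d)} \, \| g \|_{L^q(\mathbb{T}^d)} , \]
that is,
\[ \| f \|_{L^p(\mathbb{T}^d)} = \frac{\int_{\mathbb{T}^d} f g}{\| g \|_{L^q(\mathbb{T}^d)}} . \]
Whence
\[ \| f \|_{L^p(\mathbb{T}^d)} = \sup_{\| g \|_{L^q(\mathbb{T}^d)} = 1} \int_{\mathbb{T}^d} f g . \]

Theorem 6.9 (Minkowski's integral inequality) Let $f$ be a measurable and non-negative function on $\mathbb{T}^d \times \mathbb{T}^d$, and let $1 \le p < \infty$. Then
\[ \left( \int_{\mathbb{T}^d} \left( \int_{\mathbb{T}^d} f(t, y) \, dy \right)^p dt \right)^{1/p} \le \int_{\mathbb{T}^d} \left( \int_{\mathbb{T}^d} f^p(t, y) \, dt \right)^{1/p} dy . \tag{6.6} \]
This means that the $L^p$ norm of the integral is not larger than the integral of the $L^p$ norm.

Proof When $p = 1$, (6.6) is a consequence of Tonelli's theorem. When $p > 1$, Remark 6.8 and Hölder's inequality imply
\[
\begin{aligned}
\left( \int_{\mathbb{T}^d} \left( \int_{\mathbb{T}^d} f(t, y) \, dy \right)^p dt \right)^{1/p}
&= \sup_{\| g \|_{L^q(\mathbb{T}^d)} = 1} \int_{\mathbb{T}^d} \left( \int_{\mathbb{T}^d} f(t, y) \, dy \right) g(t) \, dt \\
&= \sup_{\| g \|_{L^q(\mathbb{T}^d)} = 1} \int_{\mathbb{T}^d} \int_{\mathbb{T}^d} f(t, y)\, g(t) \, dt \, dy \\
&\le \sup_{\| g \|_{L^q(\mathbb{T}^d)} = 1} \int_{\mathbb{T}^d} \left( \int_{\mathbb{T}^d} f^p(t, y) \, dt \right)^{1/p} \left( \int_{\mathbb{T}^d} | g(t) |^q \, dt \right)^{1/q} dy \\
&= \int_{\mathbb{T}^d} \left( \int_{\mathbb{T}^d} f^p(t, y) \, dt \right)^{1/p} dy .
\end{aligned}
\]

For $f \in L^1(\mathbb{T}^d)$ and $n \in \mathbb{Z}^d$ we define the Fourier coefficients
\[ \hat f(n) := \int_{\mathbb{T}^d} f(t) \, e^{-2\pi i n \cdot t} \, dt , \]
where $n \cdot t = n_1 t_1 + n_2 t_2 + \ldots + n_d t_d$ is the inner product in $\mathbb{R}^d$. If $f, g \in L^1(\mathbb{T}^d)$ we define the convolution
\[ (f * g)(t) := \int_{\mathbb{T}^d} f(t - y)\, g(y) \, dy . \]

Proposition 6.10 For every $f, g \in L^1(\mathbb{T}^d)$ we have

(i) $(f * g)^\wedge(n) = \hat f(n)\, \hat g(n)$ for every $n \in \mathbb{Z}^d$,
(ii) $\| f * g \|_{L^p(\mathbb{T}^d)} \le \| f \|_{L^p(\mathbb{T}^d)} \, \| g \|_{L^1(\mathbb{T}^d)}$ if $f \in L^p(\mathbb{T}^d)$, $1 \le p \le \infty$.

Proof By the invariance of the Lebesgue measure under translation we have
\[
\begin{aligned}
(f * g)^\wedge(n) &= \int_{\mathbb{T}^d} \left( \int_{\mathbb{T}^d} f(t - y)\, g(y) \, dy \right) e^{-2\pi i n \cdot t} \, dt \\
&= \int_{\mathbb{T}^d} g(y) \int_{\mathbb{T}^d} f(t - y) \, e^{-2\pi i n \cdot t} \, dt \, dy \\
&= \int_{\mathbb{T}^d} g(y) \, e^{-2\pi i n \cdot y} \int_{\mathbb{T}^d} f(u) \, e^{-2\pi i n \cdot u} \, du \, dy = \hat f(n)\, \hat g(n) .
\end{aligned}
\]
By Minkowski's integral inequality we have
\[
\begin{aligned}
\| f * g \|_{L^p(\mathbb{T}^d)} &= \left( \int_{\mathbb{T}^d} \left| \int_{\mathbb{T}^d} f(y - t)\, g(t) \, dt \right|^p dy \right)^{1/p} \\
&\le \int_{\mathbb{T}^d} | g(t) | \left( \int_{\mathbb{T}^d} | f(y - t) |^p \, dy \right)^{1/p} dt \\
&= \int_{\mathbb{T}^d} | g(t) | \left( \int_{\mathbb{T}^d} | f(u) |^p \, du \right)^{1/p} dt = \| g \|_{L^1(\mathbb{T}^d)} \, \| f \|_{L^p(\mathbb{T}^d)} .
\end{aligned}
\]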

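Part (i) of Proposition 6.10 can be checked numerically for trigonometric polynomials on $\mathbb{T}$. The sketch below rests on an assumption made only for this illustration: for trigonometric polynomials of small degree, the integrals over $\mathbb{T}$ are replaced by Riemann sums over `M` equally spaced points, which are exact (up to rounding) as long as `M` exceeds every frequency that occurs. All names and the sample coefficients are hypothetical.

```python
import cmath
import math


def trig_poly(coeffs):
    """Return t -> sum of a_n * exp(2*pi*i*n*t) for a dict {n: a_n}."""
    return lambda t: sum(a * cmath.exp(2j * math.pi * n * t) for n, a in coeffs.items())


def fourier_coefficient(h, n, M):
    """Riemann-sum approximation of the n-th Fourier coefficient of h on T."""
    return sum(h(k / M) * cmath.exp(-2j * math.pi * n * k / M) for k in range(M)) / M


def convolve(f, g, M):
    """Riemann-sum approximation of (f * g)(t), the integral of f(t - y) g(y) dy."""
    return lambda t: sum(f(t - k / M) * g(k / M) for k in range(M)) / M


f_hat = {-1: 2, 0: 1, 2: 0.5j}
g_hat = {-1: 1j, 1: 3, 2: -1}
M = 16  # larger than every frequency appearing in the sums below

h = convolve(trig_poly(f_hat), trig_poly(g_hat), M)
for n in range(-3, 4):
    lhs = fourier_coefficient(h, n, M)
    rhs = f_hat.get(n, 0) * g_hat.get(n, 0)
    print(n, abs(lhs - rhs) < 1e-9)
```

Only the frequencies common to both polynomials survive in the convolution, which is the content of the product formula for the coefficients.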

We call a trigonometric polynomial on $\mathbb{T}^d$ any finite sum of the form
\[ P(t) = \sum_n a_n e^{2\pi i n \cdot t} , \]
where $a_n \in \mathbb{C}$. When $d = 1$ we call the degree of $P$ the largest $n \ge 0$ such that $a_n \neq 0$ or $a_{-n} \neq 0$. By (i) in the previous proposition the convolution of a trigonometric polynomial of degree $N$ with an integrable function is still a trigonometric polynomial, of degree at most $N$.

As a very significant example, we introduce the Fejér kernel on $\mathbb{T}$. This is the family of trigonometric polynomials of degree $N$ defined by
\[ K_N(x) = K_N^o(x) := \sum_{j=-N}^{N} \left( 1 - \frac{| j |}{N + 1} \right) e^{2\pi i j x} \tag{6.7} \]
(we shall write $K_N^o(x)$ only when it proves necessary to avoid confusion with the multidimensional Fejér kernel).

Lemma 6.11 For every $x \notin \mathbb{Z}$ we have
\[ K_N(x) = \frac{1}{N + 1} \left( \frac{\sin(\pi (N + 1) x)}{\sin(\pi x)} \right)^2 . \tag{6.8} \]

Proof By (4.27) we have, for every $x \notin \mathbb{Z}$,
\[ \sum_{n=-N}^{N} e^{2\pi i n x} = \frac{\sin((2N + 1)\pi x)}{\sin(\pi x)} . \tag{6.9} \]
Therefore (6.9) implies
\[
\begin{aligned}
(N + 1) \sin^2(\pi x)\, K_N(x) &= (N + 1) \sin^2(\pi x) \sum_{j=-N}^{N} \left( 1 - \frac{| j |}{N + 1} \right) e^{2\pi i j x} \\
&= \sin^2(\pi x) \sum_{j=-N}^{N} (N + 1 - | j |)\, e^{2\pi i j x} = \sin^2(\pi x) \sum_{\ell=0}^{N} \left\{ \sum_{j=-\ell}^{\ell} e^{2\pi i j x} \right\} \\
&= \sin^2(\pi x) \sum_{\ell=0}^{N} \frac{\sin\left( 2\pi \left( \ell + \frac{1}{2} \right) x \right)}{\sin(\pi x)} = \sin(\pi x) \operatorname{Im} \left\{ \sum_{\ell=0}^{N} e^{2\pi i \left( \ell + \frac{1}{2} \right) x} \right\} \\
&= \sin(\pi x) \operatorname{Im} \left\{ e^{\pi i x} \sum_{\ell=0}^{N} e^{2\pi i \ell x} \right\} = \sin(\pi x) \operatorname{Im} \left\{ e^{\pi i x}\, \frac{e^{2\pi i (N+1) x} - 1}{e^{2\pi i x} - 1} \right\} \\
&= \sin(\pi x) \operatorname{Im} \left\{ e^{\pi i (N+1) x}\, \frac{\sin(\pi (N + 1) x)}{\sin(\pi x)} \right\} = \sin^2(\pi (N + 1) x) ,
\end{aligned}
\tag{6.10}
\]
where Im denotes the imaginary part.
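The closed form (6.8) can be compared numerically with the defining sum (6.7). The sketch below uses hypothetical function names and a handful of sample points; it also checks the intermediate step of (6.10), namely that $(N + 1) K_N(x)$ equals the sum over $\ell = 0, \ldots, N$ of the quotients $\sin((2\ell + 1)\pi x) / \sin(\pi x)$ given by (6.9).

```python
import cmath
import math


def fejer_sum(N, x):
    """K_N(x) computed from the defining sum (6.7)."""
    total = sum(
        (1 - abs(k) / (N + 1)) * cmath.exp(2j * math.pi * k * x)
        for k in range(-N, N + 1)
    )
    return total.real  # the imaginary parts cancel in pairs k, -k


def fejer_closed(N, x):
    """K_N(x) computed from the closed form (6.8), for x not an integer."""
    return (math.sin(math.pi * (N + 1) * x) / math.sin(math.pi * x)) ** 2 / (N + 1)


def dirichlet_closed(n, x):
    """The quotient sin((2n + 1) pi x) / sin(pi x) appearing in (6.9)."""
    return math.sin((2 * n + 1) * math.pi * x) / math.sin(math.pi * x)


for N in (1, 5, 12):
    for x in (0.1, 0.37, 0.9):
        mean = sum(dirichlet_closed(n, x) for n in range(N + 1)) / (N + 1)
        print(N, x, fejer_sum(N, x), fejer_closed(N, x), mean)
```

The three columns should agree up to rounding, and they are non-negative, which is the feature of the Fejér kernel exploited later in the section.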


The family of trigonometric polynomials
\[
D_N(x) := \sum_{n=-N}^{N} e^{2\pi i n x} = \frac{\sin((2N+1)\pi x)}{\sin(\pi x)} \tag{6.11}
\]
is called the Dirichlet kernel and it is important because the partial sum
\[
S_N f(x) = \sum_{n=-N}^{N} \hat f(n)\, e^{2\pi i n x}
\]
(see Section 4.3) is the convolution of $f$ with $D_N$: $S_N f = f * D_N$. Indeed
\[
(f * D_N)^{\wedge}(n) = \hat f(n)\, \widehat{D_N}(n) =
\begin{cases}
\hat f(n) & \text{if } |n| \le N, \\
0 & \text{if } |n| > N,
\end{cases}
\]
since $\widehat{D_N}(n) = 1$ when $|n| \le N$ and $\widehat{D_N}(n) = 0$ when $|n| > N$. Observe, see inside the chain of identities (6.10), that
\[
K_N = \frac{1}{N+1} \sum_{j=0}^{N} D_j \tag{6.12}
\]
is the arithmetic mean of the Dirichlet kernels. Theorem 4.19 proved the existence of a continuous function with pointwise non-convergent Fourier series. We are going to see that in this sense the Fejér kernel behaves better than the Dirichlet kernel. We call Fejér means the trigonometric polynomials
\[
(K_N * f)(x) = \sum_{j=-N}^{N} \left(1 - \frac{|j|}{N+1}\right) \hat f(j)\, e^{2\pi i j x} .
\]
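To see concretely how Fejér means can behave better than partial sums, here is a small Python sketch. As an assumption made only for this illustration, it takes $f$ to be the indicator function of $[0, 1/2)$ on $\mathbb{T}$, for which $\hat f(0) = 1/2$, $\hat f(n) = 1/(\pi i n)$ for odd $n$ and $\hat f(n) = 0$ for even $n \neq 0$. The function names are hypothetical.

```python
import math


def fourier_partial_sum(N: int, x: float) -> float:
    """S_N f(x) for f = indicator of [0, 1/2): only odd frequencies contribute."""
    return 0.5 + sum(2.0 * math.sin(2 * math.pi * n * x) / (math.pi * n)
                     for n in range(1, N + 1, 2))


def fejer_mean(N: int, x: float) -> float:
    """(K_N * f)(x): the same terms, damped by the weights 1 - n/(N+1)."""
    return 0.5 + sum((1 - n / (N + 1)) * 2.0 * math.sin(2 * math.pi * n * x) / (math.pi * n)
                     for n in range(1, N + 1, 2))


if __name__ == "__main__":
    grid = [i / 2000 for i in range(2000)]
    print("max of S_49 f  :", max(fourier_partial_sum(49, x) for x in grid))
    print("max of K_49 * f:", max(fejer_mean(49, x) for x in grid))
    print("min of K_49 * f:", min(fejer_mean(49, x) for x in grid))
```

Since $K_N \ge 0$ has integral 1, the Fejér means of a function with values in $[0,1]$ stay in $[0,1]$, while the partial sums overshoot near the jumps (the Gibbs phenomenon).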

The Fejér kernel on $\mathbb{T}^d$ is defined as
\[
K_N(t) := K_N^{o}(t_1) \cdot K_N^{o}(t_2) \cdot \ldots \cdot K_N^{o}(t_d) ,
\]
where $t = (t_1, t_2, \ldots, t_d)$ and $K_N^{o}$ is the 1-dimensional Fejér kernel (see (6.7)). The $d$-dimensional generalization of the property (6.12) leads to a different kind of multidimensional Fejér kernel (see [170]), which we are not going to discuss here.

Lemma 6.12 The Fejér kernel on $\mathbb{T}^d$ satisfies the following conditions.


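(i) $K_N(t) \ge 0$ for every $N \in \mathbb{N}$ and every $t \in \mathbb{T}^d$. Moreover
\[
1 = \widehat{K_N}(0) = \int_{\mathbb{T}^d} K_N(t)\,dt = \int_{\mathbb{T}^d} |K_N(t)|\,dt . \tag{6.13}
\]
(ii) Let $0 < \delta < \frac12$ and let $B(0,\delta)$ be the ball centred at the origin and having radius $\delta$. Then
\[
\lim_{N\to+\infty} \int_{[-\frac12,\frac12]^d \setminus B(0,\delta)} K_N(t)\,dt = 0 .
\]

Proof Let $K_N^{o}$ be the 1-dimensional Fejér kernel. Then (i) is a simple consequence of (6.8) and of the identity $\widehat{K_N^{o}}(0) = 1$. In order to prove (ii) we observe that, as $N \to \infty$, (6.8) yields
\[
0 < \int_{\delta \le |x| \le 1/2} K_N^{o}(x)\,dx \le \frac{2}{N+1} \int_{\delta}^{1/2} \frac{1}{\sin^2(\pi x)}\,dx \le \frac{c_\delta}{N+1} \longrightarrow 0 .
\]
Then
\begin{align*}
\int_{[-\frac12,\frac12]^d \setminus B(0,\delta)} K_N(t)\,dt
&\le \int_{\{t \,:\, \max_j |t_j| \ge \delta/\sqrt d\}} K_N(t)\,dt \\
&\le \sum_{j=1}^{d} \int_{|t_j| \ge \delta/\sqrt d} K_N^{o}(t_j)\,dt_j \prod_{r \ne j} \left( \int_{\mathbb{T}} K_N^{o}(t_r)\,dt_r \right) \\
&= \sum_{j=1}^{d} \int_{|t_j| \ge \delta/\sqrt d} K_N^{o}(t_j)\,dt_j \longrightarrow 0 .
\end{align*}

Theorem 6.13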

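Let $f \in C(\mathbb{T}^d)$. Then, as $N \to \infty$,
\[
\| K_N * f - f \|_{L^\infty(\mathbb{T}^d)} \to 0 ,
\]
so that $(K_N * f)(t)$ converges uniformly to $f(t)$. Let $f \in L^p(\mathbb{T}^d)$, with $1 \le p < \infty$. Then, as $N \to \infty$,
\[
\| K_N * f - f \|_{L^p(\mathbb{T}^d)} \to 0 .
\]

Proof Since $\int_{\mathbb{T}^d} K_N = 1$ we have, for every $f \in C(\mathbb{T}^d)$,
\begin{align*}
\sup_{t\in\mathbb{T}^d} |K_N * f(t) - f(t)|
&= \sup_{t\in\mathbb{T}^d} \left| \int_{\mathbb{T}^d} f(t-y) K_N(y)\,dy - f(t) \int_{\mathbb{T}^d} K_N(y)\,dy \right| \tag{6.14} \\
&= \sup_{t\in\mathbb{T}^d} \left| \int_{\mathbb{T}^d} (f(t-y) - f(t))\, K_N(y)\,dy \right|
 \le \int_{\mathbb{T}^d} |K_N(y)| \sup_{t\in\mathbb{T}^d} |f(t-y) - f(t)|\,dy .
\end{align*}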


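Choose $\varepsilon > 0$. Since $f$ is uniformly continuous on $\mathbb{T}^d$, there exists $\delta > 0$ such that if $|y| < \delta$, then
\[
\sup_{t\in\mathbb{T}^d} |f(t-y) - f(t)| < \varepsilon .
\]
We may assume that $f$ is not identically zero. Then by the previous lemma there exists $N_0$ such that, for $N \ge N_0$,
\[
\int_{[-\frac12,\frac12]^d \setminus B(0,\delta)} K_N(y)\,dy < \frac{\varepsilon}{2 \|f\|_{L^\infty(\mathbb{T}^d)}} .
\]
We thus write
\[
\int_{\mathbb{T}^d} |K_N(y)| \sup_{t\in\mathbb{T}^d} |f(t-y) - f(t)|\,dy
= \int_{B(0,\delta)} + \int_{[-\frac12,\frac12]^d \setminus B(0,\delta)} = E + F .
\]
We observe that, by (6.13),
\[
E \le \varepsilon \int_{B(0,\delta)} |K_N(y)|\,dy \le \varepsilon ,
\]
while, for $N \ge N_0$,
\[
F \le 2 \|f\|_{L^\infty(\mathbb{T}^d)} \int_{[-\frac12,\frac12]^d \setminus B(0,\delta)} |K_N(y)|\,dy \le \varepsilon .
\]
Then
\[
\int_{\mathbb{T}^d} |K_N(y)| \sup_{t\in\mathbb{T}^d} |f(t-y) - f(t)|\,dy \le 2\varepsilon ,
\]
and the first part of the theorem is proved. In order to prove the second part let us recall that $C(\mathbb{T}^d)$ is dense in $L^p(\mathbb{T}^d)$ when $1 \le p < \infty$. Then, for every $f \in L^p(\mathbb{T}^d)$, there exists $g \in C(\mathbb{T}^d)$ such that $\|f - g\|_{L^p(\mathbb{T}^d)} < \varepsilon$. The first part of this theorem tells us that we can find $N$ such that
\[
\| K_N * g - g \|_{L^p(\mathbb{T}^d)} \le \| K_N * g - g \|_{L^\infty(\mathbb{T}^d)} \le \varepsilon .
\]
Finally, by (ii) in Proposition 6.10 we obtain
\begin{align*}
\| K_N * f - f \|_{L^p(\mathbb{T}^d)}
&\le \| K_N * f - K_N * g \|_{L^p(\mathbb{T}^d)} + \| K_N * g - g \|_{L^p(\mathbb{T}^d)} + \| g - f \|_{L^p(\mathbb{T}^d)} \\
&\le \| K_N \|_{L^1(\mathbb{T}^d)} \| f - g \|_{L^p(\mathbb{T}^d)} + \varepsilon + \varepsilon \le 3\varepsilon .
\end{align*}

Since $K_N * f$ is a trigonometric polynomial, we immediately deduce the completeness of the trigonometric system, that is, the following result, which is a consequence of Theorems 6.13 and 4.15.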


Corollary 6.14 The space of the trigonometric polynomials is dense in $C(\mathbb{T}^d)$ and in $L^p(\mathbb{T}^d)$ (if $1 \le p < \infty$).

In particular, it is dense in $L^2(\mathbb{T}^d)$. Then $\{e^{2\pi i n \cdot t}\}_{n \in \mathbb{Z}^d}$ is a complete orthonormal system and we obtain the following identity, proved by Parseval in 1799:
\[
\sum_{n \in \mathbb{Z}^d} |\hat f(n)|^2 = \int_{\mathbb{T}^d} |f(t)|^2\,dt . \tag{6.15}
\]

We can now go back to the uniformly distributed sequences.

6.3 The Weyl criterion

The following theorem, proved by Weyl in 1916 [180, 181], constitutes the fundamental result of the theory of uniformly distributed sequences.

Theorem 6.15 (Weyl criterion) A sequence $t(j)$ with values in $\mathbb{T}^d$ is uniformly distributed if and only if, for every $0 \ne k \in \mathbb{Z}^d$, we have
\[
\lim_{N\to\infty} \frac1N \sum_{j=1}^{N} e^{2\pi i k \cdot t(j)} = 0 . \tag{6.16}
\]
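The averages in (6.16) are easy to compute for $d = 1$. The sketch below (Python, with hypothetical function names) compares the sequence $t(j) = j\sqrt2$, for which the averages form a geometric sum bounded by $\frac{2}{N |e^{2\pi i k \sqrt2} - 1|} + \frac1N$, with the sequence $t(j) = 1/j$, which accumulates at $0$ and whose averages stay close to $1$.

```python
import cmath
import math


def weyl_average(seq, k: int, N: int) -> complex:
    """(1/N) * sum_{j=1}^{N} exp(2 pi i k seq(j)), i.e. the average in (6.16) with d = 1."""
    return sum(cmath.exp(2j * math.pi * k * seq(j)) for j in range(1, N + 1)) / N


def kronecker(j: int) -> float:
    """t(j) = j * sqrt(2); sqrt(2) is irrational."""
    return j * math.sqrt(2)


def accumulating(j: int) -> float:
    """t(j) = 1/j, which converges to 0 in T."""
    return 1.0 / j


if __name__ == "__main__":
    N = 5000
    for k in (1, 2, 3):
        print(k, abs(weyl_average(kronecker, k, N)), abs(weyl_average(accumulating, k, N)))
```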

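Proof Let $t(j)$ be uniformly distributed. If we choose $f(y) = e^{2\pi i k \cdot y}$, with $k \ne 0$, and apply (6.3) we obtain
\[
\lim_{N\to\infty} \frac1N \sum_{j=1}^{N} e^{2\pi i k \cdot t(j)} = \int_{\mathbb{T}^d} e^{2\pi i k \cdot y}\,dy = 0 .
\]
The other direction depends on the results obtained in the previous section. Let $P(t) = \sum_k a_k e^{2\pi i k \cdot t}$ be a trigonometric polynomial on $\mathbb{T}^d$. Then, if we assume (6.16),
\[
\frac1N \sum_{j=1}^{N} P(t(j)) = a_0 + \sum_{k \ne 0} a_k \left( \frac1N \sum_{j=1}^{N} e^{2\pi i k \cdot t(j)} \right) \longrightarrow a_0 = \int_{\mathbb{T}^d} P \tag{6.17}
\]
as $N \to \infty$. Then (6.3) is true for the trigonometric polynomials. Now let $f \in C(\mathbb{T}^d)$ and $\varepsilon > 0$. By (6.14) we know that
\begin{align*}
\left| \int_{\mathbb{T}^d} (K_M * f)(y)\,dy - \int_{\mathbb{T}^d} f(y)\,dy \right|
&\le \int_{\mathbb{T}^d} |K_M * f(y) - f(y)|\,dy \tag{6.18} \\
&\le \sup_{y\in\mathbb{T}^d} |(K_M * f)(y) - f(y)| < \varepsilon
\end{align*}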


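if $M$ is large enough. Since $K_M * f$ is a trigonometric polynomial, (6.17) implies
\[
\left| \frac1N \sum_{j=1}^{N} (K_M * f)(t(j)) - \int_{\mathbb{T}^d} (K_M * f)(y)\,dy \right| \le \varepsilon
\]
for $N$ large enough. Then (6.18) yields
\begin{align*}
\left| \frac1N \sum_{j=1}^{N} f(t(j)) - \int_{\mathbb{T}^d} f(y)\,dy \right|
&\le \frac1N \sum_{j=1}^{N} |f(t(j)) - (K_M * f)(t(j))| \\
&\quad + \left| \frac1N \sum_{j=1}^{N} (K_M * f)(t(j)) - \int_{\mathbb{T}^d} (K_M * f)(y)\,dy \right| \\
&\quad + \left| \int_{\mathbb{T}^d} (K_M * f)(y)\,dy - \int_{\mathbb{T}^d} f(y)\,dy \right| \\
&\le 3\varepsilon .
\end{align*}
Then (6.3) is true for the continuous functions. Now let $I$ be any interval in $\mathbb{T}^d$ and let $g_1, g_2 \in C(\mathbb{T}^d)$ satisfy $g_1(t) \le \chi_I(t) \le g_2(t)$ for every $t \in \mathbb{T}^d$ and $\int_{\mathbb{T}^d} (g_2 - g_1) \le \varepsilon$. Then
\begin{align*}
|I| - \varepsilon
&\le \int_{\mathbb{T}^d} g_2(t)\,dt - \varepsilon \le \int_{\mathbb{T}^d} g_1(t)\,dt = \lim_{N\to+\infty} \left( \frac1N \sum_{j=1}^{N} g_1(t(j)) \right) \\
&\le \liminf_{N\to+\infty} \frac1N \operatorname{card}\left( \{t(j)\}_{j=1}^{N} \cap I \right)
 \le \limsup_{N\to+\infty} \frac1N \operatorname{card}\left( \{t(j)\}_{j=1}^{N} \cap I \right) \\
&\le \lim_{N\to+\infty} \left( \frac1N \sum_{j=1}^{N} g_2(t(j)) \right) = \int_{\mathbb{T}^d} g_2(t)\,dt \le \int_{\mathbb{T}^d} g_1(t)\,dt + \varepsilon \le |I| + \varepsilon .
\end{align*}
The proof is now complete.

We can now prove Kronecker's theorem.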

Proof of Theorem 6.3 By the Weyl criterion it is enough to prove that for every $0 \ne k = (k_1, k_2, \ldots, k_d) \in \mathbb{Z}^d$ we have
\[
\frac1N \sum_{j=1}^{N} e^{2\pi i j \alpha \cdot k} \longrightarrow 0
\]
as $N \to \infty$. Indeed
\[
\left| \frac1N \sum_{j=1}^{N} e^{2\pi i j \alpha \cdot k} \right|
= \frac1N \left| \frac{e^{2\pi i (N+1) \alpha \cdot k} - 1}{e^{2\pi i \alpha \cdot k} - 1} - 1 \right|
\le \frac1N\, \frac{2}{\left| e^{2\pi i \alpha \cdot k} - 1 \right|} + \frac1N \longrightarrow 0 ,
\]
since $1, \alpha_1, \alpha_2, \ldots, \alpha_d$ are linearly independent over $\mathbb{Q}$, and then we cannot have $k \ne 0$ and $\alpha \cdot k \in \mathbb{Z}$.

We shall now exhibit a few conditions which imply uniform distribution.

Proposition 6.16 Let $t(j)$ be a uniformly distributed sequence in $\mathbb{T}^d$, let $c$ be a constant and let $u(j)$ be a sequence in $\mathbb{T}^d$ such that $(t(j) - u(j)) \to c$. Then $u(j)$ is uniformly distributed.

Proof By Definition 6.1 we may assume that $c = 0$. Let $0 \ne k \in \mathbb{Z}^d$ and let us write
\[
\frac1N \sum_{j=1}^{N} e^{2\pi i k \cdot u(j)}
= \frac1N \sum_{j=1}^{N} \left( e^{2\pi i k \cdot u(j)} - e^{2\pi i k \cdot t(j)} \right) + \frac1N \sum_{j=1}^{N} e^{2\pi i k \cdot t(j)} := A_N + B_N .
\]
By the Weyl criterion we know that $B_N \to 0$ as $N \to \infty$. We have
\begin{align*}
|A_N| &= \left| \frac1N \sum_{j=1}^{N} \left( e^{2\pi i k \cdot (u(j) - t(j))} - 1 \right) e^{2\pi i k \cdot t(j)} \right|
 \le \frac1N \sum_{j=1}^{N} \left| e^{2\pi i k \cdot (u(j) - t(j))} - 1 \right| \\
&\le \frac1N \sum_{j=1}^{N} |2\pi i k \cdot (u(j) - t(j))| \le \frac{2\pi |k|}{N} \sum_{j=1}^{N} |u(j) - t(j)| .
\end{align*}
We have used the inequality
\[
\left| e^{i\theta} - 1 \right| \le |\theta| \tag{6.19}
\]
(that is, the fact that a chord is not longer than its arc) and the Cauchy–Schwarz inequality. Let $0 < \varepsilon < 1$ and choose $M$ such that $|u(j) - t(j)| < \varepsilon$ if $j > M$. We may assume that $N \ge M\sqrt d/\varepsilon$. Since $|a - b| \le \sqrt d/2$ for every $a, b \in \mathbb{T}^d$, we have
\[
\frac1N \sum_{j=1}^{N} |u(j) - t(j)| = \frac1N \sum_{j=1}^{M} |u(j) - t(j)| + \frac1N \sum_{j=M+1}^{N} |u(j) - t(j)| \le \varepsilon + \varepsilon .
\]

By the Weyl criterion we deduce that $u(j)$ is a uniformly distributed sequence.

In order to find more uniformly distributed sequences we are going to present a result due to Fejér, which deals with sequences obtained by restrictions of functions defined on $[1, +\infty)$. Observe that Fejér's condition is geometric in nature, while Kronecker's theorem involves an arithmetic condition. First of all we present the Euler–Maclaurin summation formula, proved independently by Euler and Maclaurin in 1735.

Lemma 6.17 (Euler–Maclaurin summation formula) Let $g \in C^1([1, +\infty))$. Then for all integers $0 < M < N$ we have
\[
\sum_{h=M}^{N} g(h) = \frac12 \left( g(M) + g(N) \right) + \int_M^N g(x)\,dx + \int_M^N s(x)\, g'(x)\,dx ,
\]
where $s(x)$ is the sawtooth function (see (4.33)).

Proof Integrating by parts and recalling that $s(\ell^{\pm}) = \mp 1/2$ for every $\ell \in \mathbb{Z}$, and that $s'(x) = 1$ for every $x \notin \mathbb{Z}$, we have
\begin{align*}
\int_M^N s(x)\, g'(x)\,dx
&= \sum_{h=M}^{N-1} \int_h^{h+1} s(x)\, g'(x)\,dx \\
&= \sum_{h=M}^{N-1} \Bigl[ s(x) g(x) \Bigr]_{x=h^+}^{x=(h+1)^-} - \sum_{h=M}^{N-1} \int_h^{h+1} g(x)\,dx \\
&= \sum_{h=M}^{N-1} \left( \frac12 g(h+1) + \frac12 g(h) \right) - \int_M^N g(x)\,dx \\
&= \sum_{h=M}^{N} g(h) - \frac12 \left( g(N) + g(M) \right) - \int_M^N g(x)\,dx .
\end{align*}
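The Euler–Maclaurin formula of Lemma 6.17 can be tested numerically. The sketch below is written in Python under two assumptions made for the illustration: the sawtooth function is taken to be $s(x) = x - \lfloor x \rfloor - \frac12$ away from the integers (consistent with $s(\ell^{\pm}) = \mp\frac12$ and $s' = 1$), and the last integral is approximated with Simpson's rule on each interval $[h, h+1]$. The helper names are hypothetical.

```python
import math


def sawtooth_integral(g_prime, M: int, N: int, steps: int = 200) -> float:
    """Approximate the integral of s(x) g'(x) over [M, N].

    Simpson's rule is applied separately on each [h, h+1], where
    s(x) = x - h - 1/2 is smooth, so the jumps of s at the integers do no harm.
    `steps` must be even.
    """
    total = 0.0
    width = 1.0 / steps
    for h in range(M, N):
        acc = 0.0
        for i in range(steps + 1):
            x = h + i * width
            weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
            acc += weight * (x - h - 0.5) * g_prime(x)
        total += acc * width / 3
    return total


def euler_maclaurin_rhs(g, g_prime, g_integral, M: int, N: int) -> float:
    """Right-hand side of Lemma 6.17; g_integral(M, N) is the integral of g over [M, N]."""
    return 0.5 * (g(M) + g(N)) + g_integral(M, N) + sawtooth_integral(g_prime, M, N)


if __name__ == "__main__":
    lhs = sum(1.0 / h for h in range(1, 51))
    rhs = euler_maclaurin_rhs(lambda x: 1.0 / x,
                              lambda x: -1.0 / x ** 2,
                              lambda a, b: math.log(b / a), 1, 50)
    print(lhs, rhs)
```

For $g(x) = 1/x$ the last integral tends to $\gamma - \frac12$ as $N \to \infty$, where $\gamma$ is Euler's constant; this is a standard fact, not a statement from the text.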

Theorem 6.18 (Fejér) Let $f \in C^2([1, +\infty))$ have $f''(x)$ eventually of constant sign. Assume that
\[
\lim_{x\to+\infty} \frac{1}{x f'(x)} = \lim_{x\to+\infty} \frac{f(x)}{x} = 0 . \tag{6.20}
\]
Then the sequence $f(j)$ is uniformly distributed in $\mathbb{T}$.

Proof By our assumptions we know that $f'(x)$ is monotonic for large $x$. Then there exists $x_0$ such that $|f'(x)|$ and $|f''(x)|$ are positive for every $x \ge x_0$. By the Weyl criterion it is enough to prove that for every integer $k \ne 0$ we have
\[
\frac1N \sum_{j=1}^{N} e^{2\pi i k f(j)} \longrightarrow 0
\]


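as $N \to \infty$. Indeed, by the Euler–Maclaurin summation formula we have
\begin{align*}
\frac1N \sum_{j=1}^{N} e^{2\pi i k f(j)}
&= \frac1N \int_1^N e^{2\pi i k f(x)}\,dx + \frac{1}{2N} \left( e^{2\pi i k f(1)} + e^{2\pi i k f(N)} \right) \\
&\quad + \frac1N \int_1^N s(x)\, 2\pi i k f'(x)\, e^{2\pi i k f(x)}\,dx \\
&:= A_N + B_N + C_N .
\end{align*}
Of course $B_N \to 0$. We shall first of all show that $A_N \to 0$. Indeed, integrating by parts and recalling the assumptions in (6.20) we have, as $N \to \infty$,
\begin{align*}
|A_N| &= \left| \frac1N \int_1^N e^{2\pi i k f(x)}\,dx \right| = o(1) + \left| \frac1N \int_{x_0}^N e^{2\pi i k f(x)}\,dx \right| \\
&= o(1) + \left| \frac1N \int_{x_0}^N \frac{1}{2\pi i k f'(x)}\, 2\pi i k f'(x)\, e^{2\pi i k f(x)}\,dx \right| \\
&\le o(1) + \frac1N \left| \frac{e^{2\pi i k f(N)}}{2\pi i k f'(N)} \right| + \frac{1}{2\pi |k| N} \left| \int_{x_0}^N e^{2\pi i k f(x)}\, \frac{f''(x)}{(f'(x))^2}\,dx \right| \\
&\le o(1) + \frac{1}{2\pi |k| N} \int_{x_0}^N \frac{|f''(x)|}{(f'(x))^2}\,dx = o(1) + \frac{1}{2\pi |k| N |f'(N)|} \longrightarrow 0
\end{align*}
and
\begin{align*}
|C_N| &= \frac1N \left| \int_1^N s(x)\, 2\pi i k f'(x)\, e^{2\pi i k f(x)}\,dx \right| \le o(1) + \frac{2\pi |k|}{N} \int_{x_0}^N |f'(x)|\,dx \\
&= o(1) + \frac{2\pi |k f(N)|}{N} \to 0 .
\end{align*}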

As a consequence we immediately obtain the following result.

Corollary 6.19 The sequences
\[
j^{\alpha} \quad (0 < \alpha < 1) \qquad \text{and} \qquad \log^{\beta} j \quad (\beta > 1)
\]
are uniformly distributed mod 1.

Remark 6.20 The sequence $\log j$ is not uniformly distributed mod 1. Indeed, by Lemma 6.17,
\begin{align*}
\frac1N \sum_{j=1}^{N} e^{2\pi i \log j}
&= o(1) + \frac1N \int_1^N e^{2\pi i \log x}\,dx + \frac1N \int_1^N s(x)\, \frac{2\pi i}{x}\, e^{2\pi i \log x}\,dx \\
&= o(1) + \frac1N \int_0^{\log N} e^{(2\pi i + 1) y}\,dy + O\!\left( \frac{\log N}{N} \right) \\
&= o(1) + \frac1N\, \frac{e^{(2\pi i + 1) \log N} - 1}{2\pi i + 1} = o(1) + \frac{e^{2\pi i \log N} - 1/N}{2\pi i + 1} \not\longrightarrow 0 .
\end{align*}
Note that $\log j$ is dense in $\mathbb{T}$ since $\log j \to +\infty$, $\log(j+1) - \log j \longrightarrow 0$.
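The contrast between Corollary 6.19 and Remark 6.20 is visible numerically. The following Python sketch (names are hypothetical) evaluates the Weyl average with $k = 1$ for $f(j) = \sqrt j$ and for $f(j) = \log j$; by the computation in Remark 6.20 the modulus of the second average should approach $1/|2\pi i + 1| \approx 0.157$, while the first should be small.

```python
import cmath
import math


def weyl_average(f, N: int, k: int = 1) -> complex:
    """(1/N) * sum_{j=1}^{N} exp(2 pi i k f(j))."""
    return sum(cmath.exp(2j * math.pi * k * f(j)) for j in range(1, N + 1)) / N


N = 20000
sqrt_average = abs(weyl_average(math.sqrt, N))   # f(x) = x^(1/2), covered by Corollary 6.19
log_average = abs(weyl_average(math.log, N))     # f(x) = log x, covered by Remark 6.20
limit_modulus = 1 / abs(2j * math.pi + 1)        # modulus of the main term in Remark 6.20

if __name__ == "__main__":
    print("sqrt:", sqrt_average)
    print("log :", log_average, "expected modulus about", limit_modulus)
```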

6.4 Normal numbers

By Kronecker's theorem we know that the sequence $ne$ is uniformly distributed mod 1. Of course this cannot be true for every subsequence of $ne$. An explicit example is to consider the sequence $n!e$ and show that the fractional parts $\{n!e\}$ tend to zero, so that they are not uniformly distributed in $\mathbb{T}$. Indeed, as $n \to \infty$,
\begin{align*}
\{n!e\} &= \Bigl\{ n! \sum_{j=0}^{+\infty} \frac{1}{j!} \Bigr\}
 = \Bigl\{ n! \Bigl( \sum_{j=0}^{n} \frac{1}{j!} + \sum_{j=n+1}^{+\infty} \frac{1}{j!} \Bigr) \Bigr\}
 = \Bigl\{ n! \sum_{j=n+1}^{+\infty} \frac{1}{j!} \Bigr\} \\
&= \Bigl\{ \frac{n!}{(n+1)!} + \frac{n!}{(n+2)!} + \frac{n!}{(n+3)!} + \cdots \Bigr\} \\
&= \Bigl\{ \frac{1}{n+1} + \frac{1}{(n+1)(n+2)} + \frac{1}{(n+1)(n+2)(n+3)} + \cdots \Bigr\} \\
&= \Bigl\{ \frac{1}{n+1} \Bigl( 1 + \frac{1}{n+2} + \frac{1}{(n+2)(n+3)} + \cdots \Bigr) \Bigr\} \sim \frac{1}{n+1} \longrightarrow 0 .
\end{align*}
The following result, proved by Weyl in 1916 [181], shows that the previous example (the fact that $n!e$ is not uniformly distributed) is, in a certain sense, an exception.

Theorem 6.21 Let $a_n$ be a sequence of pairwise distinct integers. Then the sequence $a_n x$ is uniformly distributed mod 1 for almost every real number $x$.

Proof For every integer $k \ne 0$ let
\[
P_N(x) := \frac1N \sum_{n=1}^{N} e^{2\pi i k a_n x} .
\]
By our assumptions the terms $a_n$ are pairwise distinct, and the same holds true for the numbers $k a_n$. Therefore the Parseval identity (6.15) implies
\[
\int_0^1 |P_N(x)|^2\,dx = \int_0^1 \Bigl| \frac1N \sum_{n=1}^{N} e^{2\pi i k a_n x} \Bigr|^2 dx = \frac{1}{N^2} \sum_{n=1}^{N} 1 = \frac1N .
\]


If $N = m^2$ we have
\[
\int_0^1 \sum_{m=1}^{+\infty} |P_{m^2}(x)|^2\,dx = \sum_{m=1}^{+\infty} \int_0^1 |P_{m^2}(x)|^2\,dx = \sum_{m=1}^{+\infty} \frac{1}{m^2}
\]
[…] 0. We write the decimal expansion $\alpha = [\alpha].a_1 a_2 \ldots$ Consider a block $B_k = b_1 b_2 \ldots b_k$ of $k$ digits and assume that $a_{m+1} a_{m+2} \ldots a_{m+k} = b_1 b_2 \ldots b_k$. This means
\[
\alpha = [\alpha] + \sum_{n=1}^{m} \frac{a_n}{10^n} + \frac{b_1}{10^{m+1}} + \frac{b_2}{10^{m+2}} + \ldots + \frac{b_k}{10^{m+k}} + \sum_{n=m+k+1}^{+\infty} \frac{a_n}{10^n} ,
\]
that is,
\[
\{10^m \alpha\} = \frac{b_1}{10} + \frac{b_2}{10^2} + \ldots + \frac{b_k}{10^k} + \sum_{j=k+1}^{+\infty} \frac{a_{j+m}}{10^j} .
\]
This means that the fractional part $\{10^m \alpha\}$ belongs to the interval
\[
I_k := \left[ \frac{b_1}{10} + \frac{b_2}{10^2} + \ldots + \frac{b_k}{10^k} ,\ \frac{b_1}{10} + \frac{b_2}{10^2} + \ldots + \frac{b_k}{10^k} + \frac{1}{10^k} \right) .
\]
Then the number $A_N(B_k)$ of blocks equal to $B_k$ appearing among the first $N$ digits of $\alpha$ is equal to the number of elements of the finite sequence $\{\{10^m \alpha\}\}_{m+k \le N}$ which belong to $I_k$,
\[
A_N(B_k) := \sum_{\substack{1 \le m \le N-k \\ \{10^m \alpha\} \in I_k}} 1 . \tag{6.21}
\]

Let $\{10^m \alpha\}_{m=1}^{\infty}$ be uniformly distributed mod 1. Then, by (6.21),
\[
\lim_{N\to\infty} \frac1N A_N(B_k)
= \lim_{N\to\infty} \frac1N \sum_{\substack{1 \le m \le N-k \\ \{10^m \alpha\} \in I_k}} 1
= \lim_{N\to\infty} \frac{N-k}{N} \Biggl( \frac{1}{N-k} \sum_{\substack{1 \le m \le N-k \\ \{10^m \alpha\} \in I_k}} 1 \Biggr)
= |I_k| = 10^{-k} .
\]
Hence $\alpha$ is normal and the 'if' part of the lemma is proved. Now let $\alpha$ be normal. Then
\[
|I_k| = 10^{-k} = \lim_{N\to\infty} \frac1N A_N(B_k) = \lim_{N\to\infty} \frac{1}{N-k} \sum_{\substack{1 \le m \le N-k \\ \{10^m \alpha\} \in I_k}} 1
\]
and if we use all the possible choices of $B_k$ we shall see that the sequence $\{10^m \alpha\}_{m=1}^{\infty}$ satisfies Definition 6.2 for those intervals whose extremes are rational numbers with powers of 10 as denominators (that is, rational numbers with finite-digit expansion). We use the density of these numbers to end the proof. Indeed, let $[\alpha, \beta]$ be any interval in $\mathbb{T}$. Let $a_1, a_2, b_1, b_2$ have finite-digit expansions and satisfy $[a_1, b_1] \subset [\alpha, \beta] \subset [a_2, b_2]$ and $(b_2 - b_1) + (a_1 - a_2) < \varepsilon$. Then
\begin{align*}
&\frac1N \operatorname{card} \{ t(j) \in [a_1, b_1] : 1 \le j \le N \} - (b_1 - a_1) - \varepsilon \\
&\quad < \frac1N \operatorname{card} \{ t(j) \in [\alpha, \beta] : 1 \le j \le N \} - (\beta - \alpha) \\
&\quad < \frac1N \operatorname{card} \{ t(j) \in [a_2, b_2] : 1 \le j \le N \} - (b_2 - a_2) + \varepsilon ,
\end{align*}
and the lemma is proved.
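The counting identity (6.21), which drives the proof above, holds for any $\alpha$ whose expansion does not end in an infinite run of 9s, and it can be checked with exact rational arithmetic. The sketch below uses Python's `fractions` module; the function names and the choice of rational test numbers are assumptions made for the illustration (rational numbers are of course not normal, but the identity itself does not require normality).

```python
from fractions import Fraction


def decimal_digits(alpha: Fraction, count: int) -> list:
    """First `count` decimal digits a_1, a_2, ... of the fractional part of alpha > 0."""
    frac = alpha - (alpha.numerator // alpha.denominator)
    digits = []
    for _ in range(count):
        frac *= 10
        d = frac.numerator // frac.denominator
        digits.append(d)
        frac -= d
    return digits


def count_blocks_by_digits(alpha: Fraction, block: tuple, N: int) -> int:
    """Number of m in 1..N-k with a_{m+1} ... a_{m+k} equal to the block."""
    k = len(block)
    digits = decimal_digits(alpha, N)          # digits[0] is a_1
    return sum(1 for m in range(1, N - k + 1) if digits[m:m + k] == list(block))


def count_blocks_by_orbit(alpha: Fraction, block: tuple, N: int) -> int:
    """Number of m in 1..N-k with {10^m alpha} in I_k, as in (6.21)."""
    k = len(block)
    left = Fraction(int("".join(map(str, block))), 10 ** k)
    right = left + Fraction(1, 10 ** k)
    count = 0
    for m in range(1, N - k + 1):
        y = alpha * 10 ** m
        y -= y.numerator // y.denominator      # fractional part {10^m alpha}
        if left <= y < right:
            count += 1
    return count


if __name__ == "__main__":
    alpha = Fraction(22, 7)                    # 3.142857142857...
    print(count_blocks_by_digits(alpha, (4, 2), 60),
          count_blocks_by_orbit(alpha, (4, 2), 60))
```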

Remark 6.25 It is quite difficult to construct normal numbers. The first example was provided by Champernowne [41]:
\[
0.123456789101112131415 \cdots
\]
Champernowne conjectured that also $0.2357111317192329 \cdots$ (obtained by writing the sequence of prime numbers) is normal. Copeland and Erdős [62] proved the following more general fact. Let $p_j$ be a strictly increasing sequence of positive integers such that $\operatorname{card} \{ p_j : p_j \le N \} \ge c N^{1-\varepsilon}$ (observe that the prime numbers satisfy this density condition because of Theorem 1.15). Then the number $0.p_1 p_2 p_3 \cdots$ is normal.

Remark 6.26 Normal numbers have been studied by Borel and Cantelli while dealing with the strong law of large numbers (see e.g. [146]). Suppose we flip a coin infinitely many times, and add 0 for each tail and 1 for each head. Let $S_N$ be the sum obtained after $N$ tosses. The strong law of large numbers says that almost surely we have $\lim_{N\to+\infty} \frac{S_N}{N} = \frac12$. In other words, if we regard the above sequence of 0s and 1s as a number belonging to the interval $[0, 1)$, written in base 2, then the law says that almost every number in the interval $[0, 1)$ has asymptotically the same number of 0s and 1s. This is Theorem 6.23 for base 2 and blocks of one digit. The popular version of this result says that a monkey typing at random on a typewriter keyboard for an infinite amount of time will almost surely produce Dante's Divine Comedy. We actually know more than this: the monkey will produce it infinitely many times, with due frequency. We also know that we should not be in a hurry.
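As a small experiment related to Remark 6.25, one can count single digits in an initial segment of Champernowne's number. The Python sketch below concatenates the integers from 1 to 99999 (an arbitrary cut-off chosen for the illustration; the names are hypothetical). The frequencies are only roughly equal at this length: the digit 0 is still under-represented because leading zeros are never written.

```python
from collections import Counter


def champernowne_digits(last: int) -> str:
    """Digits after the decimal point obtained by concatenating 1, 2, ..., last."""
    return "".join(str(n) for n in range(1, last + 1))


digits = champernowne_digits(99999)
counts = Counter(digits)
frequencies = {d: counts[d] / len(digits) for d in "0123456789"}

if __name__ == "__main__":
    for d in "0123456789":
        print(d, round(frequencies[d], 4))
```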


6.5 Benford's law

In a short note [129] published in 1881, Newcomb wrote:

That the ten digits do not occur with equal frequency must be evident to anyone making use of logarithmic tables, and noticing how much faster the first pages wear out than the last ones. The first significant digit is oftener 1 than any other digit, and the frequency diminishes up to 9 ... The law of probability of the occurrence of numbers is such that all mantissæ of their logarithms are equally probable.

The first significant digit is the first non-zero digit appearing in the digit expansion of a positive number (we agree to write, say, 0.5 and not $0.4999\cdots$). For example, the first significant digits of the numbers $\pi = 3.14159265\cdots$, 2014 and $1/2014 = 0.00049652\cdots$ are, respectively, 3, 2 and 4. Observe that a positive real number $v$ has first significant digit $k$ if and only if
\[
\log_{10}(k) \le \{ \log_{10}(v) \} < \log_{10}(k+1) , \tag{6.22}
\]
where $\{ \log_{10}(v) \}$ is the fractional part of $\log_{10}(v)$. In his paper Newcomb stated that the probability of the first significant digit being $k$ is equal to the length $\log_{10}(k+1) - \log_{10}(k) = \log_{10}(1 + 1/k)$ of the interval $[\log_{10}(k), \log_{10}(k+1))$. Let us write the values of these lengths for $k = 1, 2, \ldots, 9$:
\begin{align*}
\log_{10}(2/1) &= 0.30103\cdots & \log_{10}(3/2) &= 0.17609\cdots \\
\log_{10}(4/3) &= 0.12494\cdots & \log_{10}(5/4) &= 0.09691\cdots \\
\log_{10}(6/5) &= 0.079181\cdots & \log_{10}(7/6) &= 0.066947\cdots \tag{6.23} \\
\log_{10}(8/7) &= 0.057992\cdots & \log_{10}(9/8) &= 0.051153\cdots \\
\log_{10}(10/9) &= 0.045757\cdots
\end{align*}

Therefore the first significant digit should be 1 with probability about 30.1%, 2 with probability about 17.6%, ..., 9 with probability about 4.6%. Newcomb gave no actual numerical data or evidence for this ‘logarithmic law’, but in a sense he was not wrong. Indeed, in 1938 Benford [17], unaware of Newcomb's paper, produced many sequences (areas of rivers, populations, addresses, powers of integers, factorials, etc.) taken from the ‘real world’ and from mathematics, which showed, especially if taken as a whole, a good verification of the above-mentioned ‘logarithmic law’, which was later named after him.³

³ Diaconis and Freeman have shown in [70] that Benford's data were manipulated, but even the unmanipulated data still show a remarkable evidence of the law.

Let us check Benford's law on the populations of the Italian cities and towns (we say towns for all of them). There are 8092 towns in Italy, and for every $k = 1, 2, \ldots, 9$ we shall write $C(k)$ for the number of towns with population starting with the digit $k$:

C(1) = 2510 ≈ 31.010%    C(2) = 1400 ≈ 17.297%
C(3) = 1013 ≈ 12.515%    C(4) = 734 ≈ 9.068%
C(5) = 637 ≈ 7.870%      C(6) = 545 ≈ 6.733%
C(7) = 461 ≈ 5.695%      C(8) = 444 ≈ 5.485%
C(9) = 348 ≈ 4.299%

(2510 ≈ 31.010% means that 2510 is about 31.010% of 8092, ...). Note that these percentages fit very well with the numbers $\log_{10}(1 + 1/k)$ in (6.23).

Up to now we have not described a mathematical theorem, but rather a phenomenon of the real world. It is clear that Benford's law can be applied only to large data sets which range over several orders of magnitude. Anyway, as pointed out in [18], a satisfactory explanation of the law does not seem to be at hand. Let us turn for the moment to a different feature of Benford's law and show that some familiar numerical sequences such as $2^n$ or $n!$ satisfy the law, which it is now time to state.

Definition 6.27 A positive real sequence $t_n$ is a Benford sequence if, for every $k \in \{1, 2, \ldots, 9\}$,
\[
\lim_{N\to+\infty} \frac{\operatorname{card} \{ n \le N : k \text{ is the first significant digit of } t_n \}}{N} = \log_{10}\left( 1 + \frac1k \right) .
\]
A modification of the definition above will quickly relate Benford's law to the main topic of this chapter.

Definition 6.28 Let $v$ be a positive real number and let $M(v)$ be the integer satisfying $1 \le 10^{-M(v)} v < 10$. Let $r \in \mathbb{N}$ and let $u_1 \in \{1, 2, \ldots, 9\}$, $u_\ell \in \{0, 1, \ldots, 9\}$ for every $\ell = 2, 3, \ldots, r$. If $10^{-M(v)} v$ has digital expansion
\[
10^{-M(v)} v = u_1.u_2 u_3 \ldots u_r \ldots
\]
we say that the number $u_1 u_2 \ldots u_r$ is the first significant $r$-block of $v$.

Definition 6.29 A positive real sequence $t_n$ is a strong Benford sequence if for every $r \in \mathbb{N}$ and every finite sequence $\{u_j\}_{j=1}^{r}$ such that $u_1 \in \{1, 2, \ldots, 9\}$ and $u_\ell \in \{0, 1, 2, \ldots, 9\}$ for every $\ell = 2, 3, \ldots, r$, we have
\[
\lim_{N\to+\infty} \frac{\operatorname{card} \{ n \le N : u_1 u_2 \ldots u_r \text{ is the first significant } r\text{-block of } t_n \}}{N} = \log_{10}\left( 1 + \frac{1}{u_1 u_2 \ldots u_r} \right) .
\]

Of course, strong Benford sequences are Benford sequences. The following result was proved by Diaconis in 1977 [69].

Theorem 6.30 (Diaconis) A real positive sequence $t_n$ is a strong Benford sequence if and only if the sequence $\log_{10}(t_n)$ is uniformly distributed mod 1.

In order to prove the ‘if’ part, recall (6.22) and apply Definition 6.1 to the intervals
\[
\left[ \log_{10}\left( u_1 + \frac{u_2}{10} + \ldots + \frac{u_r}{10^{r-1}} \right) ,\ \log_{10}\left( u_1 + \frac{u_2}{10} + \ldots + \frac{u_r}{10^{r-1}} + \frac{1}{10^{r-1}} \right) \right) .
\]
The proof of the ‘only if’ part is similar to the last step of the proof of Lemma 6.24 (see also [69] or [134, p. 111]).

Corollary 6.31 The sequence $2^n$ is a strong Benford sequence.
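The criterion (6.22), the lengths in (6.23) and the first-digit behaviour of $2^n$ can all be explored with a few lines of Python. This is only a sketch: the function names are hypothetical, `first_significant_digit` works in floating point and may fail for inputs extremely close to a power of 10, and the empirical frequencies for $2^n$ are computed from exact integers.

```python
import math


def benford_probability(k: int) -> float:
    """Length log10(1 + 1/k) of the interval [log10(k), log10(k+1)) listed in (6.23)."""
    return math.log10(1 + 1 / k)


def first_significant_digit(v: float) -> int:
    """First significant digit of v > 0 through the criterion (6.22)."""
    frac = math.log10(v) % 1.0                  # fractional part of log10(v)
    for k in range(1, 10):
        if math.log10(k) <= frac < math.log10(k + 1):
            return k
    raise ValueError("rounding pushed the fractional part to 1.0")


def leading_digit_frequencies(N: int) -> dict:
    """Empirical first-digit frequencies of 2^1, ..., 2^N, using exact integers."""
    counts = dict.fromkeys(range(1, 10), 0)
    for n in range(1, N + 1):
        counts[int(str(2 ** n)[0])] += 1
    return {k: c / N for k, c in counts.items()}


if __name__ == "__main__":
    print([first_significant_digit(v) for v in (math.pi, 2014, 1 / 2014)])
    observed = leading_digit_frequencies(1000)
    for k in range(1, 10):
        print(k, round(benford_probability(k), 5), observed[k])
```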

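Proof We need to show that $\log_{10}(2^n) = n \log_{10}(2)$ is uniformly distributed mod 1. This follows from Kronecker's theorem since $\log_{10}(2)$ is an irrational number.

Stirling's formula (see e.g. [150]) will allow us to prove that $n!$ is also a strong Benford sequence.

Theorem 6.32 (Stirling's formula) We have
\[
n! \sim \sqrt{2\pi n}\; n^n e^{-n} .
\]

Proof We know that $n! = \int_0^{+\infty} t^n e^{-t}\,dt$. Then, by the change of variables $t = n + s\sqrt{2n}$ we have
\begin{align*}
n! &= \int_{-\sqrt{n/2}}^{+\infty} \left( n + s\sqrt{2n} \right)^n e^{-n - s\sqrt{2n}} \sqrt{2n}\,ds \\
&= n^{n+1/2} e^{-n} \sqrt2 \int_{-\sqrt{n/2}}^{+\infty} \left( 1 + s\sqrt{2/n} \right)^n e^{-s\sqrt{2n}}\,ds \\
&:= n^{n+1/2} e^{-n} \sqrt2 \int_{-\infty}^{+\infty} g_n(s)\,ds ,
\end{align*}
where
\[
g_n(s) =
\begin{cases}
0 & \text{if } s \le -\sqrt{n/2}, \\
\left( 1 + s\sqrt{2/n} \right)^n e^{-s\sqrt{2n}} = e^{n \log(1 + s\sqrt{2/n}) - s\sqrt{2n}} & \text{if } s > -\sqrt{n/2}.
\end{cases}
\]
We claim that
\[
\lim_{n\to+\infty} \int_{-\infty}^{+\infty} g_n(s)\,ds = \int_{-\infty}^{+\infty} e^{-s^2}\,ds = \sqrt\pi . \tag{6.24}
\]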


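The second equality in (6.24) is well known:
\begin{align*}
\left( \int_{-\infty}^{+\infty} e^{-s^2}\,ds \right)^2
&= \int_{-\infty}^{+\infty} e^{-s^2}\,ds \int_{-\infty}^{+\infty} e^{-x^2}\,dx = \int_{\mathbb{R}^2} e^{-s^2 - x^2}\,ds\,dx \tag{6.25} \\
&= \int_0^{2\pi} \int_0^{+\infty} e^{-\rho^2} \rho\,d\rho\,d\theta = \pi \int_0^{+\infty} e^{-v}\,dv = \pi .
\end{align*}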
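The claim (6.24) can be probed numerically before it is proved: by the change of variables above, $\int g_n = n!\, e^{n} / (\sqrt2\, n^{n+1/2})$, so the claim says that this quantity tends to $\sqrt\pi$. The Python sketch below evaluates it through logarithms with `math.lgamma` to avoid overflow; the function name is hypothetical. The comparison with $e^{1/(12n)}$ used in the usage lines relies on the classical refinement $\sqrt{2\pi n}\, n^n e^{-n} < n! < \sqrt{2\pi n}\, n^n e^{-n} e^{1/(12n)}$, which is a known fact not stated in the text.

```python
import math


def integral_of_g(n: int) -> float:
    """Value of the integral of g_n, that is n! e^n / (sqrt(2) n^(n + 1/2)).

    Computed as exp(log n! + n - log(2)/2 - (n + 1/2) log n).
    """
    log_value = math.lgamma(n + 1) + n - 0.5 * math.log(2) - (n + 0.5) * math.log(n)
    return math.exp(log_value)


if __name__ == "__main__":
    for n in (1, 5, 20, 100, 10 ** 6):
        ratio = integral_of_g(n) / math.sqrt(math.pi)
        print(n, ratio, math.exp(1 / (12 * n)))
```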

Observe that
\begin{align*}
n \log\left( 1 + s\sqrt{2/n} \right) - s\sqrt{2n}
&= n \left( \frac{s\sqrt2}{\sqrt n} - \frac{s^2}{n} + O\!\left( \frac{s^3}{n^{3/2}} \right) \right) - s\sqrt{2n} \\
&= -s^2 + O\!\left( \frac{s^3}{\sqrt n} \right)
\end{align*}
as $n \to +\infty$. In order to prove (6.24), let
\[
k(x) = \frac{\log(1+x) - x}{x^2}
\]
for $x > -1$. Then
\[
k'(x) = \frac{\frac{x}{1+x} + x - 2\log(1+x)}{x^3} := \frac{a(x)}{x^3} .
\]
Since $a(x) < 0$ for $-1 < x < 0$ and $a(x) > 0$ for $x > 0$, we see that $k(x)$ is increasing on $(-1, +\infty)$. The function $k(x)$ becomes continuous if we set $k(0) = -1/2$. Then, for every $n \in \mathbb{N}$,
\[
e^{n \left( \log(1 + s\sqrt{2/n}) - s\sqrt{2/n} \right)} = e^{2 s^2 k\left( s\sqrt{2/n} \right)} \le
\begin{cases}
e^{2 s^2 k(0)} = e^{-s^2} & \text{if } -\sqrt{n/2} < s < 0, \\
e^{2 s^2 k(s\sqrt2)} = e^{-s\sqrt2} \left( 1 + s\sqrt2 \right) & \text{if } s > 0.
\end{cases}
\]
Hence the proof of the first equality in (6.24) follows from the dominated convergence theorem.

Theorem 6.33 $n!$ is a strong Benford sequence.

Proof By Stirling's formula we have
\[
\log_{10}(n!) - \log_{10}\left( \sqrt{2\pi n}\; n^n e^{-n} \right) \longrightarrow 0 ,
\]
and by Proposition 6.16 it is enough to prove that $w(n) := \log_{10}\left( n^{n+1/2} e^{-n} \right)$ is uniformly distributed mod 1. For every real number $x \ge 1$ define $w(x) = \log_{10}\left( x^{x+1/2} e^{-x} \right)$. Observe that for $1 \le x \le N$ we have
\[
w'(x) = \frac{1}{\log(10)} \left( \log x + \frac{1}{2x} \right) \le c \log N ,
\]
\[
w''(x) = \frac{1}{x \log(10)} \left( 1 - \frac{1}{2x} \right) \ge \frac{c}{N} .
\]

By applying Lemma 9.9 below we obtain, for every integer $k \ne 0$,
\[
\left| \frac1N \sum_{n=1}^{N} e^{2\pi i k w(n)} \right| \le c\, \frac{\log N}{\sqrt N} \longrightarrow 0
\]
as $N \to +\infty$. Then, by the Weyl criterion, $w(n)$ is uniformly distributed mod 1 and therefore $n!$ is a strong Benford sequence.

Remark 6.34 The argument in Remark 6.20 shows that for every non-zero real number $b$ the sequence $b \log_{10}(n)$ is not uniformly distributed, and therefore $n^b$ is not a strong Benford sequence. In particular, the sequence $n$ of the positive integers is not a strong Benford sequence. Of course we cannot deduce from this that $n$ is not a Benford sequence, so we prove this fact directly. For every positive integer $N$ and every $k \in \{1, 2, \ldots, 9\}$, let
\[
q_k(N) = \frac{\operatorname{card} \{ n \le N : k \text{ is the first significant digit of } n \}}{N} .
\]
Then a direct computation shows that
\[
q_1(9) = \frac19 , \qquad q_1(99) = \frac{11}{99} , \qquad q_1(999) = \frac{111}{999} , \qquad \ldots
\]
so that
\[
q_1(10^n - 1) = \frac19 \ne \log_{10}(2) .
\]
Then $n$ is not a Benford sequence. Observe that we have another proof of Remark 6.20.

The study of arithmetic functions in Chapter 2 may suggest replacing $q_1(N)$ with its arithmetic mean $p_1(N) = N^{-1} \sum_{j=1}^{N} q_1(j)$. Unfortunately this idea does not work, since it was proved in [80] that $p_1(N) \not\to \log_{10}(2)$. We may try again and average $p_1(N)$, but we still do not obtain the limit $\log_{10}(2)$. Flehinger [80] has shown that we obtain the correct limits only if we average $q_k(N)$ infinitely many times.

If $t_n$ is a strong Benford sequence then, for every $\alpha > 0$, the same is true for $\alpha t_n$, so that the law does not depend on the units of measurement (see [139]). Benford's law can be used to optimize computer data storage, to check demographic projections, to discover scientific and fiscal frauds [37, 122, 132, 133]. It is common to check not only the first digit, but also the first few digits, in a sort of compromise between Benford's law and the strong Benford's law.
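The values of $q_1(N)$ quoted in Remark 6.34 can be reproduced exactly with rational arithmetic. The Python sketch below uses a hypothetical helper `q` and a brute-force count, which is adequate for small $N$.

```python
from fractions import Fraction


def q(k: int, N: int) -> Fraction:
    """q_k(N): proportion of the integers 1 <= n <= N whose first significant digit is k."""
    hits = sum(1 for n in range(1, N + 1) if str(n)[0] == str(k))
    return Fraction(hits, N)


if __name__ == "__main__":
    for N in (9, 99, 999, 1999):
        print(N, q(1, N), float(q(1, N)))
```

Along $N = 10^n - 1$ the value is always $1/9$, while for instance $q_1(1999) = 1111/1999$ is much larger, so $q_1(N)$ oscillates instead of converging.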


Exercises

1) Prove the existence of dense sequences in $\mathbb{T}$ which are not uniformly distributed.

2) Let $t(j)$ be a sequence with values in $\mathbb{T}^d$. Prove that
\[
\lim_{N\to\infty} \left( \frac1N \sum_{j=1}^{N} f(t(j)) \right) = \int_{\mathbb{T}^d} f(x)\,dx
\]
holds true for every Riemann integrable function $f$ on $\mathbb{T}^d$ if and only if it holds true for every continuous function on $\mathbb{T}^d$.

3) Prove that the sequence $\left( \sqrt2\, j, (\sqrt2 - 1) j \right)$ is not uniformly distributed in $\mathbb{T}^2$.

4) Prove that we cannot replace the interval $I$ in (6.1) with a Lebesgue measurable set.

5) Prove that
\[
\| f \|_{L^\infty(\mathbb{T}^d)} = \lim_{p\to+\infty} \| f \|_{L^p(\mathbb{T}^d)} .
\]

6) Let $P(t) = \sum a_n e^{2\pi i n \cdot t}$ be a trigonometric polynomial on $\mathbb{T}^d$. Prove that $\hat P(n) = a_n$ for every $n \in \mathbb{Z}^d$.

7) Let $P_N = \sum_{k=0}^{N} a_k e^{2\pi i k x}$ be a trigonometric polynomial on $\mathbb{T}$. Use Proposition 6.10 to prove that
\[
\| P_N' \|_{L^p(\mathbb{T})} \le 2\pi N \| P_N \|_{L^p(\mathbb{T})} ,
\]
where $1 \le p \le \infty$.

8) Let $D_N$ be the Dirichlet kernel (see (6.11)). Prove the existence of two positive constants $c$ and $c'$ such that
\[
c \log N \le \| D_N \|_{L^1(\mathbb{T})} \le c' \log N .
\]

9) Use the argument in the proof of Theorem 4.19 to exhibit a function $f \in C(\mathbb{T}^2)$ such that the ‘summation by squares’
\[
S_N^{Q} f(t) := \sum_{|n_1| \le N,\ |n_2| \le N} \hat f(n)\, e^{2\pi i n \cdot t}
\]
converges uniformly as $N \to +\infty$, while
\[
\limsup_{N\to+\infty} S_N^{D} f(0) = +\infty ,
\]
where
\[
S_N^{D} f(t) := \sum_{|n| \le N} \hat f(n)\, e^{2\pi i n \cdot t}
\]
denotes the ‘summation by discs’.

10) Let $H_M = K_N * K_N * \ldots * K_N$ be the convolution of $M$ Fejér kernels. Prove that $\| H_M \|_{L^1(\mathbb{T}^d)} = 1$ for every $M$.

11) Find a Benford sequence which is not a strong Benford sequence.

12) Prove that the Fibonacci sequence is a strong Benford sequence.

13) Let $q_1(N)$ be as in Remark 6.34. Compute
\[
\liminf_{N\to+\infty} q_1(N) \qquad \text{and} \qquad \limsup_{N\to+\infty} q_1(N) .
\]


7 Discrepancy and trigonometric approximation

The definition of uniform distribution (Definition 6.1) does not consider the speed of convergence. In fact an estimate of this speed can be useful, for example, since Theorem 6.4 relates uniformly distributed sequences to the computation of integrals. In this chapter we introduce the definition of discrepancy, which is a quantitative counterpart of the uniform distribution. Then we shall see how to estimate the discrepancy and how to use it to evaluate the approximation of certain integrals by Riemann sums.

Definition 7.1 Let $A_N = \{\omega(j)\}_{j=1}^{N} \subset \mathbb{T}$ be a set of $N$ points. The discrepancy of $A_N$ (with respect to intervals) is defined by
\[
D_N = D(A_N) = D\left( \{\omega(j)\}_{j=1}^{N} \right) := \sup_I \Bigl| \, |I| N - \operatorname{card}\left( \{\omega(j)\}_{j=1}^{N} \cap I \right) \Bigr| ,
\]
where the supremum is over all intervals $I \subseteq \mathbb{T}$. It is sometimes useful to write
\[
\frac{D_N}{N} = \sup_I \Bigl| \, |I| - \frac1N \operatorname{card}\left( \{\omega(j)\}_{j=1}^{N} \cap I \right) \Bigr|
= \sup_I \Bigl| \int_{\mathbb{T}} \chi_I(x)\,dx - \frac1N \sum_{j=1}^{N} \chi_I(\omega(j)) \Bigr|
\]
as a sup of distances between integrals and Riemann sums.

It may be difficult to estimate $D_N$ directly, and it is natural to look for bounds related to the exponential sums introduced in the Weyl criterion. The Erdős–Turán inequality is the most important result in this direction (see Section 7.2). We first need a Fourier analytic result on the approximation of the characteristic function of an interval by trigonometric polynomials.

Downloaded from https:/www.cambridge.org/core. Cornell University Library, on 04 Jun 2017 at 03:19:04, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781107358379.008
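For a finite set of points the discrepancy of Definition 7.1 can be computed explicitly. The Python sketch below rests on an observation that is not taken from the text and should be read as an assumption of the illustration: the supremum is approached by closed arcs whose endpoints belong to the set (a deficient arc has a complementary arc with the same excess), and for sorted points $x_0 \le \ldots \le x_{N-1}$ in $[0,1)$ with $d_i = i - N x_i$ this gives $D_N = 1 + \max_i d_i - \min_i d_i$. The function names are hypothetical, and a quadratic-time enumeration is included as a cross-check.

```python
import math


def discrepancy(points) -> float:
    """D_N = sup_I | |I| N - card(points in I) | over arcs I of T = R/Z (sketch)."""
    xs = sorted(x % 1.0 for x in points)
    N = len(xs)
    d = [i - N * x for i, x in enumerate(xs)]
    return 1 + max(d) - min(d)


def discrepancy_bruteforce(points) -> float:
    """Same quantity by direct enumeration of the closed arcs from x_i forward to x_j."""
    xs = sorted(x % 1.0 for x in points)
    N = len(xs)
    best = 0.0
    for i in range(N):
        for j in range(N):
            if j >= i:
                count, length = j - i + 1, xs[j] - xs[i]
            else:  # the arc from x_i to x_j wraps around the point 1 ~ 0
                count, length = N - i + j + 1, 1.0 - xs[i] + xs[j]
            best = max(best, count - N * length)
    return best


if __name__ == "__main__":
    kronecker_points = [(j * math.sqrt(2)) % 1.0 for j in range(1, 41)]
    print(discrepancy(kronecker_points), discrepancy_bruteforce(kronecker_points))
    print(discrepancy([i / 10 for i in range(10)]))   # equally spaced points
```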


7.1 One-sided trigonometric approximation

The proof of the Weyl criterion uses the completeness of the trigonometric system to approximate the characteristic function $\chi_I$ of an interval $I$ with trigonometric polynomials. Here we shall obtain polynomial approximations of $\chi_I$ from above and from below, which in turn will lead us to the Erdős–Turán inequality. The proof of the following theorem follows the approach in [58], where a general multidimensional result is proved. We point out the existence of a deeper and sharper version of the following theorem, due to Beurling, Selberg and Vaaler [84, 123, 176].

Theorem 7.2 There exists a constant $c > 0$ such that for every interval $I \subset \mathbb{T}$ and for every positive integer $N$ there exist two trigonometric polynomials $p^{+}_{I,N}$ and $p^{-}_{I,N}$ of degree at most $N$ satisfying
\[
p^{-}_{I,N}(x) \le \chi_I(x) \le p^{+}_{I,N}(x)
\]
for every $x \in \mathbb{T}$, and
\[
\int_{\mathbb{T}} \left( p^{+}_{I,N}(x) - p^{-}_{I,N}(x) \right) dx \le \frac{c}{N} . \tag{7.1}
\]

Proof It is enough to construct $p^+_{I,N}$ (then we set $p^-_{I,N}(x) := 1 - p^+_{I^c,N}(x)$). Since a translation does not change the degree of a trigonometric polynomial, we may assume that $I$ is symmetric around the origin and write $I_s := [-s, s]$ (with $0 \le s < 1/2$). We also write $p^+_{s,N}$ in place of $p^+_{I_s,N}$. If $s \ge \frac12 - \frac2N$ we can choose $p^+_{s,N}(x) = 1$ for every $x \in \mathbb{T}$. We therefore assume that $0 \le s < \frac12 - \frac2N$. For every $x \in [-1/2, 1/2)$ we define
$$f_{s,N}(x) := \left( 1 + \frac{G}{N^3 (s - |x| + 2/N)^3} \right) \chi_{I_{s+1/N}}(x) , \tag{7.2}$$
where the constant $G \ge 1$ will be chosen later during the proof. Note that $f_{s,N}$ is a non-negative function. Let $K_M$ be the Fejér kernel of degree $M$. By Parseval's identity we have
$$\|K_M\|^2_{L^2(\mathbb{T})} = \sum_{j=-M}^{M} \left( 1 - \frac{|j|}{M+1} \right)^2 = 1 + \frac{2}{(M+1)^2} \sum_{j=1}^{M} (M+1-j)^2 = 1 + \frac{2}{(M+1)^2} \sum_{j=1}^{M} j^2 = 1 + \frac{M(2M+1)}{3(M+1)} .$$
Hence $\|K_M\|_{L^2(\mathbb{T})} \approx \sqrt{M}$, where $A \approx B$ means that $A > 0$, $B > 0$ and there exist two positive constants $c_1$ and $c_2$ such that $c_1 A \le B \le c_2 A$. We may assume that $N$ is even, say $N = 2M$, with $M \ge 2$. We introduce the Jackson kernel
$$J_N(x) := \frac{1}{\|K_M\|^2_{L^2(\mathbb{T})}} K_M^2(x) = \frac{1}{\|K_M\|^2_{L^2(\mathbb{T})}} \frac{1}{(M+1)^2} \frac{\sin^4(\pi(M+1)x)}{\sin^4(\pi x)} \approx \frac{1}{N^3} \frac{\sin^4(\pi(M+1)x)}{\sin^4(\pi x)} \le c \min\left( N, \frac{1}{N^3 |x|^4} \right) . \tag{7.3}$$
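The closed form for $\|K_M\|^2_{L^2(\mathbb{T})}$ obtained from Parseval's identity can be checked in exact rational arithmetic. The following is only a sanity-check sketch; the function names are illustrative and do not come from the text.

```python
from fractions import Fraction


def fejer_l2_norm_squared(M):
    """Sum over |j| <= M of the squared Fejer coefficients (1 - |j|/(M+1))^2."""
    return sum(Fraction(M + 1 - abs(j), M + 1) ** 2 for j in range(-M, M + 1))


def closed_form(M):
    """Closed form 1 + M(2M+1) / (3(M+1)) from the Parseval computation."""
    return 1 + Fraction(M * (2 * M + 1), 3 * (M + 1))


if __name__ == "__main__":
    for M in (1, 2, 5, 10, 50):
        exact = fejer_l2_norm_squared(M)
        # The two expressions agree exactly; the ratio to M stays bounded,
        # which is the statement ||K_M||^2 ≈ M.
        print(M, exact == closed_form(M), float(exact / M))
```

The ratio printed in the last column tends to $2/3$, consistent with $\|K_M\|_{L^2(\mathbb{T})} \approx \sqrt{M}$.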

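The Jackson kernel resembles the Fejér kernel, but it is more concentrated near the origin, see [105] and Lemma 6.11. Since $\widehat{K_M}(m) \ge 0$ for every integer $m$, we have
$$\|K_M\|^2_{L^2(\mathbb{T})}\, \widehat{J_N}(k) = \int_{\mathbb{T}} K_M^2(x)\, e^{-2\pi i k x}\,dx = \int_{\mathbb{T}} \Big( \sum_{m=-M}^{M} \widehat{K_M}(m)\, e^{2\pi i m x} \Big)^2 e^{-2\pi i k x}\,dx$$
$$= \int_{\mathbb{T}} \sum_{m,n=-M}^{M} \widehat{K_M}(m)\, \widehat{K_M}(n)\, e^{2\pi i (m+n-k)x}\,dx = \sum_{m,n=-M}^{M} \widehat{K_M}(m)\, \widehat{K_M}(n) \int_{\mathbb{T}} e^{2\pi i (m+n-k)x}\,dx = \sum_{m=-M}^{M} \widehat{K_M}(m)\, \widehat{K_M}(k-m) \ge 0 .$$
Therefore for every $k \in \mathbb{Z}$ we have
$$0 \le \widehat{J_N}(k) \le \|J_N\|_{L^1(\mathbb{T})} = \int_{\mathbb{T}} J_N = \frac{1}{\|K_M\|^2_{L^2(\mathbb{T})}} \int_{\mathbb{T}} K_M^2 = 1 . \tag{7.4}$$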

The trigonometric polynomial (see (7.2))
$$p^+_{I_s,N} := p^+_{s,N} := f_{s,N} * J_N$$
has degree $\le N$. Let us show that $p^+_{s,N}(x) \ge \chi_{I_s}(x)$ for every $x \in \mathbb{T}$. Since $p^+_{s,N}$ is non-negative and symmetric around the origin, we only have to prove that $p^+_{s,N}(x) \ge 1$ for every $0 \le x \le s$. By (7.4) this means proving that
$$\int_{\mathbb{T}} J_N(y)\, f_{s,N}(x-y)\,dy \ge \int_{\mathbb{T}} J_N(y)\,dy = 1$$
for every $0 \le x \le s$, that is,
$$\int_{\mathbb{T}} J_N(y) \big( f_{s,N}(x-y) - 1 \big)\,dy \ge 0 ,$$
so
$$\int_{x + I_{s+1/N}} J_N(y)\, \frac{G}{N^3 (s - |x-y| + 2/N)^3}\,dy \ge \int_{(x + I_{s+1/N})^c} J_N(y)\,dy . \tag{7.5}$$

We are going to prove (7.5). In the rest of this proof $c, c_1, c_2, \ldots$ will be positive constants independent of $s, N, G$. Let us call $D_x$ and $E_x$, respectively, the LHS and the RHS of (7.5). In order to prove that $D_x \ge E_x$ we first observe that if $|y| \le \frac1N$ and $0 \le x \le s$, then $y \in x + I_{s+1/N}$. Therefore (7.3) implies
$$D_x \ge \int_{-1/N}^{1/N} \frac{G}{N^3 (s - |x-y| + 2/N)^3}\, J_N(y)\,dy \ge \min_{|y| \le 1/N} \frac{G}{N^3 (s - |x-y| + 2/N)^3}\; \frac{1}{\|K_M\|^2_{L^2(\mathbb{T})}} \int_{-1/N}^{1/N} K_M^2(y)\,dy$$
$$\ge c_1 \frac{G}{N^6 (s - x + 3/N)^3} \int_0^{1/N} \frac{\sin^4(\pi(M+1)y)}{\sin^4(\pi y)}\,dy \ge c_2 \frac{G}{N^3 (s - x + 3/N)^3} ,$$
since $\sin(\pi(M+1)y) \ge c_3 N y$ when $0 \le y \le 1/N$. Now observe that if $|y| \le s - x + 1/N$, then $y \in x + I_{s+1/N}$. Then (7.3) implies
$$E_x \le \int_{|y| > s - x + 1/N} J_N(y)\,dy \le c_4 \frac{1}{N^3} \int_{s-x+1/N}^{1/2} \frac{1}{y^4}\,dy \le c_5 \frac{1}{N^3 \left( s + \frac1N - x \right)^3} .$$
If we choose $G = 27 c_5 / c_2$, then $E_x \le D_x$ for every $0 \le x \le s$ (note that $s - x + 3/N \le 3(s - x + 1/N)$).

We now prove (7.1). It is enough to estimate
$$\int_{\mathbb{T}} \big( p^+_{s,N} - \chi_{I_s} \big) = \int_{\mathbb{T}} p^+_{s,N} - 2s .$$
Since $p^+_{s,N}$ is a non-negative function, we can apply Proposition 6.10 and (7.4) to obtain
$$\int_{\mathbb{T}} p^+_{s,N} = \| f_{s,N} * J_N \|_{L^1(\mathbb{T})} \le \| f_{s,N} \|_{L^1(\mathbb{T})} = \int_{\mathbb{T}} \left( 1 + \frac{G}{N^3 (s - |x| + 2/N)^3} \right) \chi_{I_{s+1/N}}(x)\,dx = 2s + \frac2N + 2G \int_0^{s+1/N} \frac{1}{N^3 (s - x + 2/N)^3}\,dx .$$
Hence, with the change of variables $u = N(s - x + 2/N)$,
$$\int_{\mathbb{T}} \big( p^+_{I,N} - \chi_I \big) \le \frac2N + 2G\, \frac1N \int_1^{Ns+2} \frac{1}{u^3}\,du \le \frac{c_6}{N} .$$

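7.2 The Erdős–Turán inequality

We can now prove the following result (see [77]).

Theorem 7.3 (Erdős–Turán inequality) There exists $c > 0$ such that for every choice of positive integers $N, H$ and for every finite sequence $\omega = \{\omega(j)\}_{j=1}^N \subset \mathbb{T}$ we have
$$D_N \le c \left( \frac{N}{H} + \sum_{k=1}^{H} \frac1k \Big| \sum_{j=1}^{N} e^{2\pi i k \omega(j)} \Big| \right) . \tag{7.6}$$

Proof Let $I$ be an interval in $\mathbb{T}$. By Theorem 7.2 there exists a trigonometric polynomial $p^+_{I,H}$ having degree at most $H$ and satisfying
$$\operatorname{card}\big( \{\omega(j)\}_{j=1}^N \cap I \big) = \sum_{j=1}^{N} \chi_I(\omega(j)) \le \sum_{j=1}^{N} p^+_{I,H}(\omega(j)) = \sum_{j=1}^{N} \sum_{k=-H}^{H} \widehat{p^+_{I,H}}(k)\, e^{2\pi i k \omega(j)}$$
$$= \sum_{k=-H}^{H} \widehat{p^+_{I,H}}(k) \sum_{j=1}^{N} e^{2\pi i k \omega(j)} \le \left( |I| + \frac{c}{H} \right) N + \sum_{0 < |k| \le H} \big| \widehat{p^+_{I,H}}(k) \big|\, \Big| \sum_{j=1}^{N} e^{2\pi i k \omega(j)} \Big| \ \ldots$$

… $\varepsilon > 0$, choose $P$ and $x_0, x_1, x_2, \ldots, x_{P-1}, x_P = x$ such that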

$$\sum_{\ell=1}^{P} | f(x_\ell) - f(x_{\ell-1}) | \ge V_f(x) - \varepsilon .$$
Let $x < y \le 1$. Then
$$V_f(y) \pm f(y) \ge \Big( | f(y) - f(x) | + \sum_{\ell=1}^{P} | f(x_\ell) - f(x_{\ell-1}) | \Big) \pm \big( f(y) - f(x) + f(x) \big) \ge V_f(x) - \varepsilon \pm f(x) .$$
Hence $V_f(y) \pm f(y) \ge V_f(x) \pm f(x)$, so the functions $t \mapsto V_f(t) \pm f(t)$ are increasing. Writing
$$f(t) = \frac{V_f(t) + f(t)}{2} - \frac{V_f(t) - f(t)}{2}$$
proves the first part of the lemma. Finally observe that an increasing function $g : [0,1] \to \mathbb{R}$ satisfies $V_g = g(1) - g(0)$. Then
$$V_{(V_f + f)/2} + V_{(V_f - f)/2} = \frac12 (V_f + f)(1) - \frac12 (V_f + f)(0) + \frac12 (V_f - f)(1) - \frac12 (V_f - f)(0) = V_f(1) - V_f(0) = V_f .$$

The following definition is essentially equivalent to Definition 7.1, since it is easy to prove that $D^*_N \le D_N \le 2 D^*_N$.

Definition 7.7 Let
$$D^*_N = D^*_N\big( \{\omega(j)\}_{j=1}^N \big) := \sup_{\alpha \in [0,1]} \Big| \operatorname{card}\big( [0,\alpha] \cap \{\omega(j)\}_{j=1}^N \big) - N\alpha \Big| = \sup_{\alpha \in [0,1]} \Big| \sum_{j=1}^{N} \chi_{[0,\alpha]}(\omega(j)) - N \int_0^1 \chi_{[0,\alpha]}(x)\,dx \Big| .$$
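For a finite point set the supremum in Definition 7.7 can be evaluated exactly, because the function $\alpha \mapsto \operatorname{card}([0,\alpha] \cap \{\omega(j)\}) - N\alpha$ is piecewise linear and its extreme values occur at the points themselves or immediately to their left. The sketch below uses this observation; the name `star_discrepancy` is illustrative and is not taken from the text.

```python
def star_discrepancy(points):
    """Unnormalised D*_N = sup over alpha in [0, 1] of
    |card([0, alpha] ∩ points) - N * alpha| for points in [0, 1]."""
    xs = sorted(points)
    n = len(xs)
    if n == 0:
        return 0.0
    if xs[0] < 0.0 or xs[-1] > 1.0:
        raise ValueError("all points must lie in [0, 1]")
    # At alpha = x_(i) the count is at least i; just below x_(i) it is at most i - 1.
    return max(max(i - n * x, n * x - (i - 1)) for i, x in enumerate(xs, start=1))


if __name__ == "__main__":
    midpoints = [0.125, 0.375, 0.625, 0.875]  # centres of four equal cells
    print(star_discrepancy(midpoints))        # small: evenly spread points
    print(star_discrepancy([0.0, 0.0, 0.0]))  # large: all mass at the origin
```

Dividing the returned value by $N$ gives the normalised quantity $D^*_N / N$ that appears in Koksma's inequality.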


Theorem 7.8 (Koksma's inequality) Let $\{\omega(j)\}_{j=1}^N \subset [0,1]$. Then for every function $f$ of bounded variation on $[0,1]$ we have
$$\Big| \frac1N \sum_{j=1}^{N} f(\omega(j)) - \int_0^1 f(x)\,dx \Big| \le V_f\, \frac{D^*_N}{N} .$$

Proof The previous lemma allows us to assume that $f$ is a decreasing function satisfying $f(0) = 1$ and $f(1) = 0$. Choose a large positive integer $M$, let
$$I_1 = \Big\{ x \in [0,1] : 1 - \frac1M \le f(x) \le 1 \Big\}$$
and, for each $h = 2, \ldots, M$, consider the interval
$$I_h = \Big\{ x \in [0,1] : \frac{M-h}{M} \le f(x) < \frac{M-h+1}{M} \Big\} .$$
Then we have the disjoint union $[0,1] = \cup_{h=1}^{M} I_h$. Note that the intervals $I_h$ possibly reduce to one point or to the empty set. Let
$$S_M(x) := \sum_{h=1}^{M} \frac{M-h+1}{M}\, \chi_{I_h}(x) = \frac1M \sum_{h=1}^{M} \chi_{I^h}(x) , \tag{7.13}$$
where every $I^h := \cup_{s=1}^{h} I_s$ is an interval anchored at 0. Then
$$\Big| \frac1N \sum_{j=1}^{N} f(\omega(j)) - \int_0^1 f(x)\,dx \Big| \le \Big| \frac1N \sum_{j=1}^{N} S_M(\omega(j)) - \int_0^1 S_M(x)\,dx \Big| + \Big| \frac1N \sum_{j=1}^{N} \big( f(\omega(j)) - S_M(\omega(j)) \big) \Big| + \Big| \int_0^1 \big( f(x) - S_M(x) \big)\,dx \Big| . \tag{7.14}$$
By (7.13) we have
$$\Big| \frac1N \sum_{j=1}^{N} S_M(\omega(j)) - \int_0^1 S_M(x)\,dx \Big| \le \frac1M \sum_{h=1}^{M} \Big| \frac1N \sum_{j=1}^{N} \chi_{I^h}(\omega(j)) - \int_0^1 \chi_{I^h}(x)\,dx \Big| \le \frac1M \sum_{h=1}^{M} \frac{D^*_N}{N} = V_f\, \frac{D^*_N}{N}$$

(since $V_f = 1$). Now we consider the second term in the RHS of (7.14). Since
$$\chi_{[0,1]} = \sum_{h=1}^{M} \chi_{I_h}$$
we have
$$\Big| \frac1N \sum_{j=1}^{N} \big( f(\omega(j)) - S_M(\omega(j)) \big) \Big| = \Big| \frac1N \sum_{j=1}^{N} \Big( f(\omega(j)) - \sum_{h=1}^{M} \frac{M-h+1}{M}\, \chi_{I_h}(\omega(j)) \Big) \Big|$$
$$= \frac1N \sum_{j=1}^{N} \Big( \sum_{h=1}^{M} \chi_{I_h}(\omega(j)) \Big( \frac{M-h+1}{M} - f(\omega(j)) \Big) \Big) \le \frac1N \sum_{j=1}^{N} \Big( \sum_{h=1}^{M} \chi_{I_h}(\omega(j)) \Big) \frac1M = \frac1M .$$
Finally
$$\Big| \int_0^1 \big( f(x) - S_M(x) \big)\,dx \Big| = \Big| \int_0^1 \Big( f(x) - \sum_{h=1}^{M} \frac{M-h+1}{M}\, \chi_{I_h}(x) \Big)\,dx \Big| = \sum_{h=1}^{M} \int_{I_h} \Big( \frac{M-h+1}{M} - f(x) \Big)\,dx \le \frac1M .$$
This completes the proof since $M$ is an arbitrarily large number.
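Koksma's inequality can be observed numerically. The sketch below is an illustration only and makes its own choices: the test function $f(x) = x^2$ (increasing on $[0,1]$, so $V_f = f(1) - f(0) = 1$ and $\int_0^1 f = 1/3$), the points $\{j\sqrt2\}$, and a helper `star_discrepancy` that evaluates $D^*_N$ exactly from the sorted points. None of these names appear in the text.

```python
import math


def star_discrepancy(points):
    """Unnormalised D*_N, evaluated from the sorted points."""
    xs = sorted(points)
    n = len(xs)
    return max(max(i - n * x, n * x - (i - 1)) for i, x in enumerate(xs, start=1))


def koksma_check(n_points, alpha=math.sqrt(2)):
    """Return (quadrature error, Koksma bound) for f(x) = x^2 and points {j * alpha}."""
    pts = [(j * alpha) % 1.0 for j in range(1, n_points + 1)]
    mean = sum(x * x for x in pts) / n_points
    error = abs(mean - 1.0 / 3.0)                 # |(1/N) sum f - integral of f|
    bound = 1.0 * star_discrepancy(pts) / n_points  # V_f * D*_N / N with V_f = 1
    return error, bound


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        error, bound = koksma_check(n)
        print(n, error, bound, error <= bound)
```

In this experiment the error is typically well below the bound, which is expected: the inequality is a worst-case estimate over all functions with the same variation.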

By Theorem 7.4 we readily obtain the following result.

Corollary 7.9 Let $\alpha$ be an irrational algebraic number of degree 2 and let $f$ be a function of bounded variation on $[0,1]$. Then there exists a positive constant $c$ such that
$$\Big| \frac1N \sum_{j=1}^{N} f(\{j\alpha\}) - \int_{\mathbb{T}} f(t)\,dt \Big| \le c\, V_f\, \frac{\log^2 N}{N} .$$

We may say that Koksma's inequality turns the discrepancy for a small class of functions (characteristic functions of intervals) into the discrepancy for a large class (functions of bounded variation). See [22, 24, 25, 26, 91, 114]. We wish to show that it may be natural to choose (a mild modification of) the sawtooth function $s(x)$ (see (4.33)) and its translates as the 'small class of functions'. We consider the periodic function
$$g(x) := \frac12 - s(x) \tag{7.15}$$

(then $g(x) = 1 - x$ if $0 \le x < 1$). For $x \notin \mathbb{Z}$ we have (see (4.34))
$$g(x) = \int_0^1 \chi_{[0,\alpha] + \mathbb{Z}}(x)\,d\alpha = \frac12 + \sum_{k \ne 0} \frac{1}{2\pi i k}\, e^{2\pi i k x} .$$

Remark 7.10 The function $g(x)$ has a peculiar property. On the one hand, convolution with $g(x)$ is the inverse of differentiation. Indeed, for $k \ne 0$, integration by parts yields
$$\widehat{g}(k)\, \widehat{f'}(k) = \frac{1}{2\pi i k} \int_{\mathbb{T}} f'(x)\, e^{-2\pi i k x}\,dx = \int_{\mathbb{T}} f(x)\, e^{-2\pi i k x}\,dx = \widehat{f}(k) .$$
On the other hand, $g(x)$ is a superposition of characteristic functions of intervals anchored at the origin.

Theorem 7.11 Let $\{t(j)\}_{j=1}^N \subset \mathbb{T}$ and let $g(x)$ be as in (7.15). Let $f$ be a smooth function on $\mathbb{T}$. Then
$$\sup_{x \in \mathbb{T}} \Big| \frac1N \sum_{j=1}^{N} f(x + t(j)) - \int_{\mathbb{T}} f(y)\,dy \Big| \le \inf_{a \in \mathbb{R}} \Big( \int_{\mathbb{T}} | a + f'(x) |\,dx \Big) \Big( \sup_{x \in \mathbb{T}} \Big| \frac1N \sum_{j=1}^{N} g(x + t(j)) \Big| \Big) . \tag{7.16}$$

Proof The periodicity of $f$ and integration by parts yield
$$\widehat{f}(k) = \int_0^1 f(x)\, e^{-2\pi i k x}\,dx = \frac{1}{2\pi i k} \int_0^1 f'(x)\, e^{-2\pi i k x}\,dx = \frac{1}{2\pi i k}\, \widehat{f'}(k) = \frac{1}{-4\pi^2 k^2}\, \widehat{f''}(k) = \frac{1}{-8\pi^3 i k^3}\, \widehat{f'''}(k)$$
for every $k \ne 0$. Hence the Fourier series of $f(x)$ and $f'(x)$ converge absolutely, and therefore they converge pointwise to $f(x)$ and $f'(x)$, respectively. Observe that the function $x \mapsto f(x + t(j))$ has Fourier coefficients
$$\big( f(\cdot + t(j)) \big)^{\wedge}(k) = \int_{\mathbb{T}} f(x + t(j))\, e^{-2\pi i k x}\,dx = e^{2\pi i k t(j)} \int_{\mathbb{T}} f(u)\, e^{-2\pi i k u}\,du = e^{2\pi i k t(j)}\, \widehat{f}(k)$$
(in particular $\big( f(\cdot + t(j)) \big)^{\wedge}(0) = \widehat{f}(0)$). By (7.15) we have, for every $a \in \mathbb{R}$,
$$\sup_{x \in \mathbb{T}} \Big| \frac1N \sum_{j=1}^{N} f(x + t(j)) - \int_{\mathbb{T}} f(y)\,dy \Big| = \sup_{x \in \mathbb{T}} \Big| \sum_{k \ne 0} \Big( \frac1N \sum_{j=1}^{N} e^{2\pi i k t(j)} \Big) \widehat{f}(k)\, e^{2\pi i k x} \Big|$$
$$= \sup_{x \in \mathbb{T}} \Big| \sum_{k \ne 0} \frac{1}{2\pi i k} \Big( \frac1N \sum_{j=1}^{N} e^{2\pi i k t(j)} \Big) (2\pi i k)\, \widehat{f}(k)\, e^{2\pi i k x} \Big| = \sup_{x \in \mathbb{T}} \Big| \sum_{k \ne 0} \Big( \frac1N \sum_{j=1}^{N} e^{2\pi i k t(j)} \Big) \widehat{g}(k)\, (a + f')^{\wedge}(k)\, e^{2\pi i k x} \Big|$$
$$= \sup_{x \in \mathbb{T}} \Big| \Big( \Big( \frac1N \sum_{j=1}^{N} g(\cdot + t(j)) \Big) * (a + f') \Big)(x) \Big| .$$
Hence (7.16) follows from Proposition 6.10.

Koksma's inequality deals with functions of bounded variation, and most familiar (bounded) functions on $\mathbb{T}$ share this property. We now introduce a multidimensional extension of Koksma's inequality (results of this type are usually termed Koksma–Hlawka inequalities, see e.g. [94, 95, 114, 121, 131]) which holds for a reasonably large class of functions (possibly non-continuous).

Definition 7.12 We say that a function $h(t)$ on $\mathbb{T}^d$ is piecewise smooth² on $\mathbb{T}^d$ ($d > 1$) if we can write $h(t) = f(t)\, \chi_\Omega(t)$, where $f$ is smooth (i.e., it has derivatives of all orders) on $\mathbb{T}^d$ and $\chi_\Omega(t)$ is the characteristic function of an open set $\Omega$ in $\mathbb{T}^d$.

Lemma 7.13 Let $f \in C^1(\mathbb{T}^d)$. Then for $x = (x_1, \ldots, x_d)$ and $n = (n_1, \ldots, n_d)$ we have
$$\Big( \frac{\partial f}{\partial x_k} \Big)^{\wedge}(n) = 2\pi i n_k\, \widehat{f}(n)$$
for every $k = 1, \ldots, d$.

Proof We may assume that $k = 1$. Integration by parts gives
$$\int_{\mathbb{T}^d} \frac{\partial f}{\partial x_1}(x)\, e^{-2\pi i n \cdot x}\,dx = - \int_0^1 \cdots \int_0^1 \Big( \int_0^1 f(x)\, (-2\pi i n_1)\, e^{-2\pi i n_1 x_1}\,dx_1 \Big) e^{-2\pi i (n_2 x_2 + \ldots + n_d x_d)}\,dx_2 \cdots dx_d = 2\pi i n_1 \int_{\mathbb{T}^d} f(x)\, e^{-2\pi i n \cdot x}\,dx = 2\pi i n_1\, \widehat{f}(n) .$$

The following result was proved in [25]. For simplicity we state and prove it only in the planar case.

² Note that when $d = 1$ this definition does not coincide with Definition 4.17.

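Theorem 7.14 Let $\{t(j)\}_{j=1}^N \subset \mathbb{T}^2$ and let $h(t) = f(t)\, \chi_\Omega(t)$ be a piecewise smooth function on $\mathbb{T}^2$. Let $1 \le p, q \le +\infty$ satisfy $1/p + 1/q = 1$. Let
$$V_q(f) = 4\, \| f \|_{L^q(\mathbb{T}^d)} + 2\, \Big\| \frac{\partial f}{\partial t_1} \Big\|_{L^q(\mathbb{T}^d)} + 2\, \Big\| \frac{\partial f}{\partial t_2} \Big\|_{L^q(\mathbb{T}^d)} + \Big\| \frac{\partial^2 f}{\partial t_1 \partial t_2} \Big\|_{L^q(\mathbb{T}^d)} .$$
Let $I_s = [0, s_1] \times [0, s_2]$ be an interval anchored at the origin and let
$$D_p := \int_{[0,1]^2} \Big\| \frac1N \sum_{j=1}^{N} \chi_{(\cdot - I_s + \mathbb{Z}^2) \cap \Omega}(t(j)) - \big| (\cdot - I_s + \mathbb{Z}^2) \cap \Omega \big| \Big\|_{L^p(\mathbb{T}^d)}\,ds ,$$
where $|\cdot|$ denotes the volume. Then
$$\Big| \frac1N \sum_{j=1}^{N} h(t(j)) - \int_{\mathbb{T}^2} h(v)\,dv \Big| \le V_q(f)\, D_p .$$

Proof Let $g(t)$ be the periodic function
$$g(t) := \int_{[0,1]^2} \chi_{I_u + \mathbb{Z}^2}(t)\,du . \tag{7.17}$$
Then
$$\widehat{g}(n) = \int_0^1 \int_0^1 (1 - t_1)(1 - t_2)\, e^{-2\pi i (n_1 t_1 + n_2 t_2)}\,dt_1\,dt_2 = \frac{1}{2\delta(n_1) + 2\pi i n_1} \cdot \frac{1}{2\delta(n_2) + 2\pi i n_2} , \tag{7.18}$$
where
$$\delta(n_k) = \begin{cases} 1 & \text{if } n_k = 0, \\ 0 & \text{if } n_k \ne 0. \end{cases}$$
We introduce the operator $\mathcal{D}$ defined by
$$\mathcal{D} f(t) := \sum_{n \in \mathbb{Z}^2} \big( (2\delta(n_1) - 2\pi i n_1)(2\delta(n_2) - 2\pi i n_2) \big)\, \widehat{f}(n)\, e^{2\pi i n \cdot t} . \tag{7.19}$$
We observe that
$$\Big\{ \int_{\mathbb{T}^2} | \mathcal{D} f(t) |^q\,dt \Big\}^{1/q} \le V_q(f) . \tag{7.20}$$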

Indeed, for every $t = (t_1, t_2) \in \mathbb{T}^2$ Lemma 7.13 implies
$$\mathcal{D} f(t_1, t_2) = 4 \widehat{f}(0,0) - 2 \sum_{n_1 \in \mathbb{Z}} 2\pi i n_1\, \widehat{f}(n_1, 0)\, e^{2\pi i n_1 t_1} - 2 \sum_{n_2 \in \mathbb{Z}} 2\pi i n_2\, \widehat{f}(0, n_2)\, e^{2\pi i n_2 t_2} + \sum_{n_1, n_2 \in \mathbb{Z}} 2\pi i n_1\, 2\pi i n_2\, \widehat{f}(n_1, n_2)\, e^{2\pi i (n_1 t_1 + n_2 t_2)}$$
$$= 4 \widehat{f}(0,0) - 2 \sum_{n_1, n_2 \in \mathbb{Z}} 2\pi i n_1\, \widehat{f}(n_1, n_2)\, e^{2\pi i n_1 t_1} \int_0^1 e^{2\pi i n_2 t_2}\,dt_2 - 2 \sum_{n_1, n_2 \in \mathbb{Z}} 2\pi i n_2\, \widehat{f}(n_1, n_2)\, e^{2\pi i n_2 t_2} \int_0^1 e^{2\pi i n_1 t_1}\,dt_1 + \frac{\partial^2 f}{\partial t_1 \partial t_2}(t_1, t_2)$$
$$= 4 \int_{[0,1]^2} f(t_1, t_2)\,dt_1\,dt_2 - 2 \int_0^1 \frac{\partial f}{\partial t_1}(t_1, t_2)\,dt_2 - 2 \int_0^1 \frac{\partial f}{\partial t_2}(t_1, t_2)\,dt_1 + \frac{\partial^2 f}{\partial t_1 \partial t_2}(t_1, t_2) ,$$

which implies (7.20). Observe that Theorem 4.15, (7.17), (7.18) and (7.19) imply
$$\frac1N \sum_{j=1}^{N} f(t(j))\, \chi_\Omega(t(j)) - \int_{\mathbb{T}^2} f(t)\, \chi_\Omega(t)\,dt = \frac1N \sum_{j=1}^{N} \Big( \sum_{n \in \mathbb{Z}^2} \widehat{f}(n)\, e^{2\pi i n \cdot t(j)} \Big) \chi_\Omega(t(j)) - \sum_{n \in \mathbb{Z}^2} \widehat{f}(n)\, \overline{\widehat{\chi_\Omega}(n)}$$
$$= \frac1N \sum_{j=1}^{N} \Big( \sum_{n \in \mathbb{Z}^2} (\mathcal{D} f)^{\wedge}(n)\, \overline{\widehat{g_{-t(j)}}(n)} \Big) \chi_\Omega(t(j)) - \sum_{n \in \mathbb{Z}^2} (\mathcal{D} f)^{\wedge}(n)\, \overline{\widehat{g}(n)\, \widehat{\chi_\Omega}(n)}$$
$$= \frac1N \sum_{j=1}^{N} \int_{\mathbb{T}^2} \mathcal{D} f(u)\, g_{-t(j)}(u)\,du\; \chi_\Omega(t(j)) - \int_{\mathbb{T}^2} \mathcal{D} f(u)\, (g * \chi_\Omega)(u)\,du$$
$$= \int_{\mathbb{T}^2} \mathcal{D} f(u) \Big( \frac1N \sum_{j=1}^{N} \chi_\Omega(t(j))\, g(u - t(j)) - \int_{\mathbb{T}^2} \chi_\Omega(t)\, g(u - t)\,dt \Big)\,du ,$$
where $g_v(u) := g(u + v)$. Therefore Hölder's inequality, (7.20), (7.17) and Minkowski's integral inequality imply
$$\Big| \frac1N \sum_{j=1}^{N} f(t(j))\, \chi_\Omega(t(j)) - \int_{\mathbb{T}^2} f(t)\, \chi_\Omega(t)\,dt \Big| \le \Big\{ \int_{\mathbb{T}^2} | \mathcal{D} f(t) |^q\,dt \Big\}^{1/q} \times \Big\{ \int_{\mathbb{T}^2} \Big| \frac1N \sum_{j=1}^{N} \chi_\Omega(t(j))\, g(u - t(j)) - \int_{\mathbb{T}^2} \chi_\Omega(t)\, g(u - t)\,dt \Big|^p du \Big\}^{1/p}$$
$$\le V_q(f) \Big\{ \int_{\mathbb{T}^2} \Big| \frac1N \sum_{j=1}^{N} \chi_\Omega(t(j)) \int_{[0,1]^2} \chi_{I_s + \mathbb{Z}^2}(u - t(j))\,ds - \int_{\mathbb{T}^2} \chi_\Omega(t) \int_{[0,1]^2} \chi_{I_s + \mathbb{Z}^2}(u - t)\,ds\,dt \Big|^p du \Big\}^{1/p}$$
$$\le V_q(f) \int_{[0,1]^2} \Big\{ \int_{\mathbb{T}^2} \Big| \frac1N \sum_{j=1}^{N} \chi_\Omega(t(j))\, \chi_{I_s + \mathbb{Z}^2}(u - t(j)) - \int_{\mathbb{T}^2} \chi_\Omega(t)\, \chi_{I_s + \mathbb{Z}^2}(u - t)\,dt \Big|^p du \Big\}^{1/p} ds$$
$$= V_q(f) \int_{[0,1]^2} \Big\{ \int_{\mathbb{T}^2} \Big| \frac1N \sum_{j=1}^{N} \chi_{(u - I_s + \mathbb{Z}^2) \cap \Omega}(t(j)) - \big| (u - I_s + \mathbb{Z}^2) \cap \Omega \big| \Big|^p du \Big\}^{1/p} ds = V_q(f)\, D_p .$$

The previous theorem has introduced $L^p$ norms of multidimensional discrepancies with respect to fairly general sets. This will be one of our main interests in the last chapters.

Exercises

1) Let $N$ be a large positive integer. Prove the existence of an interval $I \subset \mathbb{T}$ such that, for every trigonometric polynomial $T(x)$ of degree $N$ satisfying $T(x) \ge \chi_I(x)$ for every $x \in \mathbb{T}$, we have
$$\int_{\mathbb{T}} (T - \chi_I) \ge \frac{1}{N+1} .$$

2) Prove that an infinite sequence $\{\omega(j)\}_{j=1}^{\infty} \subset \mathbb{T}$ is uniformly distributed if and only if $D\big( \{\omega(j)\}_{j=1}^N \big) = o(N)$ as $N \to \infty$.

3) Let $s(x)$ be the sawtooth function (see (4.33)) and let $\alpha$ be an irrational algebraic number of degree 2. Prove the inequality
$$\Big| \sum_{j=1}^{N} s(j\alpha) \Big| \le c \log^2 N .$$

4) Use the Erdős–Turán inequality to prove (see Definition 7.1)
$$D_N \le c\, N^{1/3} \Big( \sum_{h=1}^{+\infty} \Big| \frac1h \sum_{j=1}^{N} e^{2\pi i h \omega(j)} \Big|^2 \Big)^{1/3} .$$

5) Let $p$ be an odd prime. Use the results in Chapter 4 to prove that
$$D_p \le c\, p^{1/2} \log p .$$

8 Integer points and Poisson summation formula

In Chapter 2 we proved the following results (Theorems 2.19 and 2.14) for the arithmetic means of $r(n)$ and $d(n)$:
$$\frac1R \sum_{n \le R} r(n) = \pi + O\big( R^{-1/2} \big) , \qquad \frac1R \sum_{n \le R} d(n) = \log R + (2\gamma - 1) + O\big( R^{-1/2} \big) ,$$
as $R \to +\infty$, where $\gamma$ is the Euler–Mascheroni constant. In the last chapters we shall use different techniques to improve these estimates and to study several related problems. We first need to introduce the Fourier integrals.

8.1 Fourier integrals

For every $f \in L^1(\mathbb{R}^d)$ we define its Fourier transform
$$\widehat{f}(\xi) := \int_{\mathbb{R}^d} f(t)\, e^{-2\pi i \xi \cdot t}\,dt ,$$
where $\xi \cdot t = \xi_1 t_1 + \xi_2 t_2 + \ldots + \xi_d t_d$ is the inner product in $\mathbb{R}^d$.

Proposition 8.1 Let $f \in L^1(\mathbb{R}^d)$. Then the function $\widehat{f}(\xi)$ is uniformly continuous on $\mathbb{R}^d$.

Proof Let $R > 0$ satisfy $\int_{|t| > R} | f(t) |\,dt \le \varepsilon / 4$. By (6.19) we have, for $\eta$ small enough,
$$\big| \widehat{f}(\xi + \eta) - \widehat{f}(\xi) \big| = \Big| \int_{\mathbb{R}^d} f(t)\, e^{-2\pi i (\xi + \eta) \cdot t}\,dt - \int_{\mathbb{R}^d} f(t)\, e^{-2\pi i \xi \cdot t}\,dt \Big| \le \int_{\mathbb{R}^d} | f(t) |\, \big| e^{-2\pi i \eta \cdot t} - 1 \big|\,dt$$
$$\le \int_{|t| \le R} | f(t) |\, \big| e^{-2\pi i \eta \cdot t} - 1 \big|\,dt + 2 \int_{|t| > R} | f(t) |\,dt \le 2\pi |\eta| \int_{|t| \le R} |t|\, | f(t) |\,dt + \varepsilon / 2 \le \varepsilon .$$

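Theorem 8.2 The Fourier transform and differentiation are related as follows.

(i) Let $t = (t_1, \ldots, t_d) \in \mathbb{R}^d$ and $\xi = (\xi_1, \ldots, \xi_d) \in \mathbb{R}^d$. Assume that for a given $k$ we have $\frac{\partial f}{\partial t_k} \in L^1(\mathbb{R}^d)$. Then
$$\Big( \frac{\partial f}{\partial t_k} \Big)^{\wedge}(\xi) = 2\pi i \xi_k\, \widehat{f}(\xi) .$$

(ii) Let $f \in L^1(\mathbb{R}^d)$ and assume that for a given $k$ we have $t_k f(t) \in L^1(\mathbb{R}^d)$. Then $\widehat{f}(\xi)$ can be differentiated with respect to $\xi_k$ and
$$\frac{\partial \widehat{f}}{\partial \xi_k}(\xi) = \big( -2\pi i t_k f(t) \big)^{\wedge}(\xi) .$$

Proof For the proof of (i) we refer to Lemma 7.13. In order to prove (ii), let $h = (0, \ldots, 0, h_k, 0, \ldots, 0)$. We apply the dominated convergence theorem to obtain
$$\frac{\widehat{f}(\xi + h) - \widehat{f}(\xi)}{h_k} = \frac{1}{h_k} \Big( \int_{\mathbb{R}^d} f(t)\, e^{-2\pi i t \cdot (\xi + h)}\,dt - \int_{\mathbb{R}^d} f(t)\, e^{-2\pi i t \cdot \xi}\,dt \Big) = \frac{1}{h_k} \int_{\mathbb{R}^d} f(t)\, e^{-2\pi i t \cdot \xi} \big( e^{-2\pi i t \cdot h} - 1 \big)\,dt$$
$$= \Big( \frac{e^{-2\pi i t_k h_k} - 1}{h_k}\, f(t) \Big)^{\wedge}(\xi) \to \big( -2\pi i t_k f(t) \big)^{\wedge}(\xi)$$
as $h_k \to 0$.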

Remark 8.3 Let $\mathcal{S}$ be the set of functions $\varphi \in C^{\infty}(\mathbb{R}^d)$ such that for every polynomial $P$ on $\mathbb{R}^d$ and every choice of non-negative integers $\alpha_1, \ldots, \alpha_d$ we have
$$\sup_{t \in \mathbb{R}^d} \Big| P(t) \Big( \frac{\partial}{\partial t_1} \Big)^{\alpha_1} \cdots \Big( \frac{\partial}{\partial t_d} \Big)^{\alpha_d} \varphi(t) \Big| < \infty .$$
The previous result shows that if $\varphi \in \mathcal{S}$, then $\widehat{\varphi} \in \mathcal{S}$. We observe that $\mathcal{S}$ is dense in $L^p(\mathbb{R}^d)$ for every $1 \le p < +\infty$.

The convolution is defined as on $\mathbb{T}^d$:
$$(f * g)(t) := \int_{\mathbb{R}^d} f(t - s)\, g(s)\,ds .$$
The proof of the following result is close to the corresponding argument for the periodic case, see Proposition 6.10 and Theorem 6.9.

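Theorem 8.4 We have

(i) If $f, g \in L^1(\mathbb{R}^d)$, then $(f * g)^{\wedge}(\xi) = \widehat{f}(\xi)\, \widehat{g}(\xi)$.

(ii) If $f \in L^1(\mathbb{R}^d)$ and $g \in L^p(\mathbb{R}^d)$, $p \ge 1$, then $\| f * g \|_{L^p(\mathbb{R}^d)} \le \| f \|_{L^1(\mathbb{R}^d)}\, \| g \|_{L^p(\mathbb{R}^d)}$.

(iii) Let $f$ be a measurable and non-negative function on $\mathbb{R}^d \times \mathbb{R}^d$, and let $1 \le p < \infty$. Then we have the Minkowski integral inequality
$$\Big\{ \int_{\mathbb{R}^d} \Big( \int_{\mathbb{R}^d} f(t, y)\,dy \Big)^p dt \Big\}^{1/p} \le \int_{\mathbb{R}^d} \Big\{ \int_{\mathbb{R}^d} f(t, y)^p\,dt \Big\}^{1/p} dy .$$

Lemma 8.5 Let $\phi \in L^1(\mathbb{R}^d)$ satisfy $\int_{\mathbb{R}^d} \phi = 1$. For every $\varepsilon > 0$ let
$$\phi_\varepsilon(t) := \varepsilon^{-d}\, \phi(\varepsilon^{-1} t) . \tag{8.1}$$
Then, as $\varepsilon \to 0$,

(i) if $1 \le p < +\infty$ and $f \in L^p(\mathbb{R}^d)$, then $f * \phi_\varepsilon \to f$ in the $L^p$ norm;

(ii) if $f$ is bounded and continuous on $\mathbb{R}^d$, then $f * \phi_\varepsilon \to f$ uniformly.

Proof We prove only (i). Since $\int_{\mathbb{R}^d} \phi_\varepsilon = 1$ for every $\varepsilon > 0$, Minkowski's integral inequality yields
$$\Big\{ \int_{\mathbb{R}^d} | (f * \phi_\varepsilon)(t) - f(t) |^p\,dt \Big\}^{1/p} = \Big\{ \int_{\mathbb{R}^d} \Big| \int_{\mathbb{R}^d} \big( f(t - s) - f(t) \big)\, \phi_\varepsilon(s)\,ds \Big|^p dt \Big\}^{1/p}$$
$$= \Big\{ \int_{\mathbb{R}^d} \Big| \int_{\mathbb{R}^d} \big( f(t - s) - f(t) \big)\, \varepsilon^{-d} \phi(\varepsilon^{-1} s)\,ds \Big|^p dt \Big\}^{1/p} = \Big\{ \int_{\mathbb{R}^d} \Big| \int_{\mathbb{R}^d} \big( f(t - \varepsilon u) - f(t) \big)\, \phi(u)\,du \Big|^p dt \Big\}^{1/p}$$
$$\le \int_{\mathbb{R}^d} \Big\{ \int_{\mathbb{R}^d} \big| \big( f(t - \varepsilon u) - f(t) \big)\, \phi(u) \big|^p\,dt \Big\}^{1/p} du = \int_{\mathbb{R}^d} | \phi(u) | \Big\{ \int_{\mathbb{R}^d} | f(t - \varepsilon u) - f(t) |^p\,dt \Big\}^{1/p} du .$$
Then, by the dominated convergence theorem it is enough to prove that
$$\int_{\mathbb{R}^d} | f(t - v) - f(t) |^p\,dt \to 0 \tag{8.2}$$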

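as $v \to 0$. Indeed, (8.2) is true if $f$ is continuous and has compact support. The general case follows by approximation.

Lemma 8.6 Let $f(t) = e^{-\pi a |t|^2}$, with $t \in \mathbb{R}^d$, $a > 0$. Then
$$\widehat{f}(\xi) = a^{-d/2}\, e^{-\pi |\xi|^2 / a} .$$

Proof We prove the 1-dimensional case first. Let $f(x) = e^{-\pi a x^2}$. By differentiation under the integral sign and integration by parts we obtain
$$\frac{d}{du} \widehat{f}(u) = \big( -2\pi i x f(x) \big)^{\wedge}(u) = \int_{\mathbb{R}} (-2\pi i x)\, e^{-\pi a x^2} e^{-2\pi i u x}\,dx = \frac{i}{a} \int_{\mathbb{R}} (-2\pi a x)\, e^{-\pi a x^2} e^{-2\pi i u x}\,dx$$
$$= - \frac{i}{a} \int_{\mathbb{R}} e^{-\pi a x^2} e^{-2\pi i u x} (-2\pi i u)\,dx = \frac{-2\pi u}{a}\, \widehat{f}(u) .$$
The differential equation
$$\frac{d}{du} \widehat{f}(u) = \frac{-2\pi u}{a}\, \widehat{f}(u)$$
has solution $\widehat{f}(u) = K e^{-\pi u^2 / a}$, where, by (6.25),
$$K = \widehat{f}(0) = \int_{-\infty}^{\infty} e^{-\pi a x^2}\,dx = a^{-1/2} .$$
Now the proof of the $d$-dimensional case is simple:
$$\widehat{f}(\xi) = \int_{\mathbb{R}^d} e^{-\pi a |t|^2} e^{-2\pi i t \cdot \xi}\,dt = \int_{\mathbb{R}} e^{-\pi a t_1^2} e^{-2\pi i t_1 \xi_1}\,dt_1 \int_{\mathbb{R}} e^{-\pi a t_2^2} e^{-2\pi i t_2 \xi_2}\,dt_2 \cdots \int_{\mathbb{R}} e^{-\pi a t_d^2} e^{-2\pi i t_d \xi_d}\,dt_d$$
$$= a^{-1/2} e^{-\pi \xi_1^2 / a} \cdot a^{-1/2} e^{-\pi \xi_2^2 / a} \cdots a^{-1/2} e^{-\pi \xi_d^2 / a} = a^{-d/2}\, e^{-\pi |\xi|^2 / a} .$$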
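The one-dimensional case of Lemma 8.6 is easy to test numerically. The following sketch approximates the Fourier integral of $e^{-\pi a x^2}$ with the trapezoidal rule on a truncated interval and compares it with $a^{-1/2} e^{-\pi u^2 / a}$. The truncation width, the step size and the function names are assumptions made for this illustration; only the cosine part is integrated because the sine part vanishes by symmetry.

```python
import math


def gaussian_hat_numeric(a, u, half_width=8.0, step=1e-3):
    """Trapezoidal approximation of the integral over [-half_width, half_width]
    of exp(-pi a x^2) * cos(2 pi u x) dx."""
    n = int(round(half_width / step))
    total = 0.0
    for k in range(-n, n + 1):
        x = k * step
        weight = 0.5 if abs(k) == n else 1.0  # endpoint weights of the trapezoidal rule
        total += weight * math.exp(-math.pi * a * x * x) * math.cos(2 * math.pi * u * x)
    return total * step


def gaussian_hat_exact(a, u):
    """Closed form a^{-1/2} exp(-pi u^2 / a) from Lemma 8.6 with d = 1."""
    return a ** -0.5 * math.exp(-math.pi * u * u / a)


if __name__ == "__main__":
    for a, u in [(1.0, 0.0), (2.0, 0.7), (0.5, 1.3)]:
        print(a, u, gaussian_hat_numeric(a, u), gaussian_hat_exact(a, u))
```

For $a = 1$ the function $e^{-\pi x^2}$ is its own Fourier transform, which is the case used in the proof of the inversion formula.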

We can now prove the Fourier inversion formula.

Theorem 8.7 (Fourier inversion formula) Let $f \in L^1(\mathbb{R}^d)$ satisfy $\widehat{f} \in L^1(\mathbb{R}^d)$. Then $f$ is (almost everywhere equal to) a continuous function and it is the inverse Fourier transform of its Fourier transform. That is,
$$f(t) = \int_{\mathbb{R}^d} \widehat{f}(\xi)\, e^{2\pi i \xi \cdot t}\,d\xi . \tag{8.3}$$

Proof For $t \in \mathbb{R}^d$ and $\varepsilon > 0$ let
$$\phi(\xi) = e^{2\pi i \xi \cdot t - \pi \varepsilon^2 |\xi|^2} .$$
We apply the previous lemma, following the notation in (8.1),
$$\widehat{\phi}(y) = \int_{\mathbb{R}^d} e^{2\pi i \xi \cdot t - \pi \varepsilon^2 |\xi|^2} e^{-2\pi i \xi \cdot y}\,d\xi = \int_{\mathbb{R}^d} e^{-\pi \varepsilon^2 |\xi|^2} e^{-2\pi i \xi \cdot (y - t)}\,d\xi = \varepsilon^{-d} e^{-\pi |y - t|^2 / \varepsilon^2} = h_\varepsilon(y - t) ,$$
where $h(y) = e^{-\pi |y|^2}$. Since $\int_{\mathbb{R}^d} h = 1$, Lemma 8.5 implies, as $\varepsilon \to 0$,
$$\int_{\mathbb{R}^d} e^{-\pi \varepsilon^2 |\xi|^2} e^{2\pi i \xi \cdot t}\, \widehat{f}(\xi)\,d\xi = \int_{\mathbb{R}^d} \phi\, \widehat{f} = \int_{\mathbb{R}^d} \widehat{\phi}\, f = \int_{\mathbb{R}^d} f(y)\, h_\varepsilon(y - t)\,dy = (f * h_\varepsilon)(t) \to f(t)$$
in $L^1(\mathbb{R}^d)$. Then there exists a sequence $\varepsilon_j \to 0$ such that, as $j \to +\infty$,
$$\int_{\mathbb{R}^d} e^{-\pi \varepsilon_j^2 |\xi|^2} e^{2\pi i \xi \cdot t}\, \widehat{f}(\xi)\,d\xi \longrightarrow f(t) \quad \text{a.e.}$$
Since $\widehat{f} \in L^1(\mathbb{R}^d)$, the dominated convergence theorem implies
$$\int_{\mathbb{R}^d} e^{-\pi \varepsilon^2 |\xi|^2} e^{2\pi i \xi \cdot t}\, \widehat{f}(\xi)\,d\xi \longrightarrow \int_{\mathbb{R}^d} e^{2\pi i \xi \cdot t}\, \widehat{f}(\xi)\,d\xi$$
as $\varepsilon \to 0$. Then, up to a set of measure zero,
$$f(t) = \int_{\mathbb{R}^d} e^{2\pi i \xi \cdot t}\, \widehat{f}(\xi)\,d\xi .$$
The function $f(t)$ is continuous because it is the Fourier transform of an integrable function.

Corollary 8.8 Let $f \in X := \big\{ f \in L^1(\mathbb{R}^d) : \widehat{f} \in L^1(\mathbb{R}^d) \big\}$. Then $f$ and $\widehat{f}$ belong to $L^2(\mathbb{R}^d)$ and
$$\| f \|_{L^2(\mathbb{R}^d)} = \big\| \widehat{f} \big\|_{L^2(\mathbb{R}^d)} .$$

Proof By the inversion formula (Theorem 8.7) we have $f \in L^{\infty}(\mathbb{R}^d)$. So
$$\int_{\mathbb{R}^d} | f |^2 \le \| f \|_{L^{\infty}(\mathbb{R}^d)} \int_{\mathbb{R}^d} | f | .$$
Hence $f \in L^2(\mathbb{R}^d)$. Given $f, g \in X$, let $h = \overline{\widehat{g}}$. Then Theorem 8.7 implies
$$\widehat{h}(y) = \int_{\mathbb{R}^d} \overline{\widehat{g}(\xi)}\, e^{-2\pi i y \cdot \xi}\,d\xi = \overline{ \int_{\mathbb{R}^d} \widehat{g}(\xi)\, e^{2\pi i y \cdot \xi}\,d\xi } = \overline{g(y)} ,$$
and therefore
$$\int_{\mathbb{R}^d} f\, \overline{g} = \int_{\mathbb{R}^d} f\, \widehat{h} = \int_{\mathbb{R}^d} \widehat{f}\, h = \int_{\mathbb{R}^d} \widehat{f}\, \overline{\widehat{g}} .$$
If we let $g = f$ we obtain $\| f \|_{L^2(\mathbb{R}^d)} = \big\| \widehat{f} \big\|_{L^2(\mathbb{R}^d)}$ for every function in $X$.

The density of $\mathcal{S}$ allows us to extend the previous result to every function $f \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$. The following general result was proved by Plancherel in 1910 [141]. See, for example, [82] for a proof.

Theorem 8.9 (Plancherel) If $f \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$, then $\widehat{f} \in L^2(\mathbb{R}^d)$ and
$$\| f \|_{L^2(\mathbb{R}^d)} = \big\| \widehat{f} \big\|_{L^2(\mathbb{R}^d)} . \tag{8.4}$$
Moreover, the map $\mathcal{F} : f \mapsto \widehat{f}$ extends uniquely to a unitary isomorphism on $L^2(\mathbb{R}^d)$.

Now we start studying the behaviour of $\widehat{f}(\xi)$ for large $|\xi|$. Of course we have the easy bound
$$\big| \widehat{f}(\xi) \big| \le \| f \|_{L^1(\mathbb{R}^d)} . \tag{8.5}$$
The main result is the Riemann–Lebesgue lemma.

Lemma 8.10 (Riemann–Lebesgue) For every $f \in L^1(\mathbb{R}^d)$ we have $\widehat{f}(\xi) \to 0$ as $|\xi| \to \infty$.

Proof A separation of variables and (7.7) imply the result for the characteristic function of a $d$-dimensional interval. Then the result is true for every finite linear combination of characteristic functions of $d$-dimensional intervals. Hence it is true for a dense subspace of $L^1(\mathbb{R}^d)$. Then for every $f \in L^1(\mathbb{R}^d)$ there exists a sequence $f_n \in L^1(\mathbb{R}^d)$ such that $\widehat{f_n}(\xi) \to 0$ as $|\xi| \to +\infty$ and $\| f - f_n \|_{L^1(\mathbb{R}^d)} \to 0$. Then
$$\sup_{\xi} \big| \widehat{f}(\xi) - \widehat{f_n}(\xi) \big| \le \| f - f_n \|_{L^1(\mathbb{R}^d)} \longrightarrow 0$$
as $n \to \infty$. Hence $\widehat{f}(\xi) \to 0$.

For certain families of functions we can say more on the decay of the Fourier transform. We need a lemma on convex functions.

Lemma 8.11 Let $f$ be a convex function on an open interval $(a, b)$. Then $f'$ exists on $(a, b)$ except at most in a countable set. Moreover, $f'$ is increasing.

Proof The function $s \mapsto \frac{f(x+s) - f(x)}{s}$ increases with $s$. Then the function $x \mapsto f'_+(x)$ is increasing on $(a, b)$. Indeed, if $x < y$,
$$f'_+(x) := \lim_{s \to 0^+} \frac{f(x+s) - f(x)}{s} \le \frac{f(y) - f(x)}{y - x} \le \lim_{s \to 0^-} \frac{f(y+s) - f(y)}{s} := f'_-(y) \le \lim_{s \to 0^+} \frac{f(y+s) - f(y)}{s} := f'_+(y) .$$
Then $f'_+(x)$ has at most a countable number of discontinuities. Since the continuity of $f'_+$ at $x$ implies the continuity of $f'$ at $x$, we end the proof.

The following two results are due to Podkorytov [142]. See also [35, 38].

Lemma 8.12 (Podkorytov) Assume that $f : \mathbb{R} \to [0, +\infty)$ is supported and concave on the interval $[-1, 1]$. Then, for $|\xi| \ge 1$, we have
$$\big| \widehat{f}(\xi) \big| = \Big| \int_{-1}^{1} f(x)\, e^{-2\pi i \xi x}\,dx \Big| \le \frac{1}{|\xi|} \Big( f\Big( 1 - \frac{1}{2|\xi|} \Big) + f\Big( -1 + \frac{1}{2|\xi|} \Big) \Big) . \tag{8.6}$$

Proof We may assume that $\xi \ge 1$. By Lemma 8.11 we can integrate by parts and obtain
$$\big| \widehat{f}(\xi) \big| \le \frac{1}{2\pi\xi} f(1^-) + \frac{1}{2\pi\xi} f(-1^+) + \frac{1}{2\pi\xi} \Big| \int_{-1}^{1} f'(x)\, e^{-2\pi i \xi x}\,dx \Big| . \tag{8.7}$$
Assume that $f(\alpha) \ge f(x)$ for every $x \in [-1, 1]$. Then $f$ increases in $[-1, \alpha]$ and decreases in $[\alpha, 1]$. We can assume that $0 \le \alpha \le 1$. Then $f(-1^+) \le f(-1 + 1/(2\xi))$. In order to estimate $f(1^-)$ we observe that when $\alpha \le 1 - 1/(2\xi)$ we have $f(1^-) \le f(1 - 1/(2\xi))$. Since $f$ is concave in the interval $[-1, 1]$, if $\alpha > 1 - 1/(2\xi)$ we have
$$f(1^-) \le f(\alpha) \le 2 f(0) \le 2 f(1 - 1/(2\xi)) .$$
In order to estimate the integral in (8.7) we change variables,
$$I := \int_{-1}^{1} f'(x)\, e^{-2\pi i \xi x}\,dx = - \int_{-1 + \frac{1}{2\xi}}^{1 + \frac{1}{2\xi}} f'\Big( x - \frac{1}{2\xi} \Big) e^{-2\pi i \xi x}\,dx .$$
Hence
$$2I = \int_{-1}^{1} f'(x)\, e^{-2\pi i \xi x}\,dx - \int_{-1 + \frac{1}{2\xi}}^{1 + \frac{1}{2\xi}} f'\Big( x - \frac{1}{2\xi} \Big) e^{-2\pi i \xi x}\,dx$$
$$= \int_{-1}^{-1 + \frac{1}{2\xi}} f'(x)\, e^{-2\pi i \xi x}\,dx + \int_{-1 + \frac{1}{2\xi}}^{1} \Big( f'(x) - f'\Big( x - \frac{1}{2\xi} \Big) \Big) e^{-2\pi i \xi x}\,dx - \int_{1}^{1 + \frac{1}{2\xi}} f'\Big( x - \frac{1}{2\xi} \Big) e^{-2\pi i \xi x}\,dx := I_1 + I_2 + I_3 .$$
Since $0 \le \alpha \le 1$ we have
$$| I_1 | \le \int_{-1}^{-1 + \frac{1}{2\xi}} f'(x)\,dx = f\Big( -1 + \frac{1}{2\xi} \Big) - f(-1^+) \le f\Big( -1 + \frac{1}{2\xi} \Big) .$$

The estimate for $I_3$ is similar when $\alpha \le 1 - 1/(2\xi)$. If $\alpha > 1 - 1/(2\xi)$, then
$$| I_3 | \le \int_{1}^{\alpha + \frac{1}{2\xi}} f'\Big( x - \frac{1}{2\xi} \Big)\,dx - \int_{\alpha + \frac{1}{2\xi}}^{1 + \frac{1}{2\xi}} f'\Big( x - \frac{1}{2\xi} \Big)\,dx = 2 f(\alpha) - f\Big( 1 - \frac{1}{2\xi} \Big) - f(1^-) \le 2 f(\alpha) \le 4 f(0) \le 4 f\Big( 1 - \frac{1}{2\xi} \Big) .$$
As for $I_2$, the monotonicity of $f'$ implies
$$| I_2 | \le \int_{-1 + \frac{1}{2\xi}}^{1} \Big( f'\Big( x - \frac{1}{2\xi} \Big) - f'(x) \Big)\,dx = f\Big( 1 - \frac{1}{2\xi} \Big) - f(-1^+) - f(1^-) + f\Big( -1 + \frac{1}{2\xi} \Big) \le f\Big( 1 - \frac{1}{2\xi} \Big) + f\Big( -1 + \frac{1}{2\xi} \Big) .$$

The previous lemma has the following geometric meaning. Assume for simplicity that $f(x)$ is even. Then
$$\int_{-1}^{1} f(x)\, e^{-2\pi i \xi x}\,dx = -i \int_{-1}^{1} f(x) \sin(2\pi \xi x)\,dx .$$
Let $\xi \ge 1$. In the figure we overlap the graphs of $f(x)$ and $f(x) \sin(2\pi \xi x)$ on $[0, 1]$:

[Figure: graphs of $f(x)$ and $f(x) \sin(2\pi \xi x)$ on $[0, 1]$, with the last oscillation shaded.]

Then we see that $\int_{-1}^{1} f(x) \sin(2\pi \xi x)\,dx$ behaves like a Leibniz sum. Indeed, it is (from right to left) a sum of terms having alternating signs and decreasing absolute value. Then it is smaller than the first term, that is, the area of the shaded part, which is essentially contained in a rectangle with sides $1/\xi$ and $f(s - 1/(2\xi))$. We acknowledge that the above figure represents a suitable choice of $\xi$.

We deduce a useful geometric estimate for the Fourier transform of the characteristic function of a planar convex body.

Theorem 8.13 (Podkorytov) Let $C \subset \mathbb{R}^2$ be a convex body. We write $\Theta := (\cos\theta, \sin\theta)$ and, for $0 \le \theta < \pi$ and small $\delta > 0$, let
$$\lambda(\delta, \theta) = \lambda_C(\delta, \theta) := \Big\{ t \in C : \delta + t \cdot \Theta = \sup_{y \in C} (y \cdot \Theta) \Big\} \tag{8.8}$$
be the chord perpendicular to $\Theta$ and 'at distance $\delta$ from the boundary' $\partial C$ of $C$ (see the following figure). Then, for large $\rho > 0$ we have
$$\big| \widehat{\chi_C}(\rho\Theta) \big| \le c\, \rho^{-1} \Big( \big| \lambda_C(\rho^{-1}, \theta) \big| + \big| \lambda_C(\rho^{-1}, \theta + \pi) \big| \Big) , \tag{8.9}$$
where $|\lambda_C|$ is the length of the segment $\lambda_C$.

Proof

We may assume that Θ = (1, 0). Then 2 χC (ξ1 , 0) =

+∞

−∞

+∞

−∞

χC (t1 , t2 ) dt2 e−2πiξ1 t1 dt1 = 2 h(ξ1 ) ,

where h (s) is the length of the intersection of C with the line t1 = s. Let [a, b] be the support of h (s). Observe that h (s) can be seen as the diﬀerence between a concave function and a convex function on [a, b]. Hence h (s) is concave on

Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 04 Jun 2017 at 03:20:10, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781107358379.009

8.1 Fourier integrals

159

$[a, b]$ and the previous lemma (see footnote 1) implies
\[ |\hat h(\xi_1)| \le \frac{c}{|\xi_1|}\Big[ h\Big(b - \frac{1}{2|\xi_1|}\Big) + h\Big(a + \frac{1}{2|\xi_1|}\Big) \Big] \le c\rho^{-1}\big( |\lambda_C(\rho^{-1},0)| + |\lambda_C(\rho^{-1},\pi)| \big). \]

Let $C$ be any convex planar body. The previous lemma yields, for every $\xi \in \mathbb{R}^2$,
\[ |\hat\chi_C(\xi)| \le \frac{c}{1+|\xi|}. \tag{8.10} \]
Observe that (8.10) cannot be improved. Indeed, for a square $Q = [-1/2, 1/2]^2$ we have
\[ \hat\chi_Q(\xi_1,\xi_2) = \int_{-1/2}^{1/2}\int_{-1/2}^{1/2} e^{-2\pi i(t_1\xi_1 + t_2\xi_2)}\,dt_1\,dt_2 = \frac{\sin(\pi\xi_1)}{\pi\xi_1}\,\frac{\sin(\pi\xi_2)}{\pi\xi_2} \tag{8.11} \]
(with obvious modifications when $\xi_1$ or $\xi_2$ vanish) and therefore, for every integer $n$,
\[ \hat\chi_Q\Big(2n + \frac12,\, 0\Big) = \frac{1}{\pi(2n + 1/2)}. \]
The next lemma shows that the case of a disc is different.

Lemma 8.14 Let $\chi_1(t) := \chi_{B(0,1)}(t)$ be the characteristic function of the disc $B(0, 1) \subset \mathbb{R}^2$ having centre 0 and radius 1. Then there exists a positive constant $c$ such that, for every $\xi \in \mathbb{R}^2$,
\[ |\hat\chi_1(\xi)| \le \frac{c}{1+|\xi|^{3/2}}. \]

Proof We observe that the length of the chords at distance $|\xi|^{-1}$ from the boundary is $\approx |\xi|^{-1/2}$. Then we apply (8.5) and Lemma 8.12.

In the last chapter we shall prove that, as $|\xi| \to \infty$,
\[ \hat\chi_1(\xi) = \frac{1}{\pi}\,|\xi|^{-3/2}\cos\Big(2\pi|\xi| - \frac{3\pi}{4}\Big) + O(|\xi|^{-5/2}). \]

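Footnote 1: We first pass from $[a, b]$ to the symmetric support $[-\alpha, \alpha]$. Then we define $k(x) = h(\alpha x)$, so that
\[ |\hat h(\xi)| = \Big|\int_{-\alpha}^{\alpha} h(s)e^{-2\pi i s\xi}\,ds\Big| = \alpha\Big|\int_{-1}^{1} h(\alpha t)e^{-2\pi i\xi\alpha t}\,dt\Big| = \alpha\,|\hat k(\alpha\xi)| \]
\[ \le \frac{c\,\alpha}{|\alpha\xi|}\Big[ k\Big(1 - \frac{1}{2|\alpha\xi|}\Big) + k\Big(-1 + \frac{1}{2|\alpha\xi|}\Big) \Big] = \frac{c}{|\xi|}\Big[ h\Big(\alpha - \frac{1}{2|\xi|}\Big) + h\Big(-\alpha + \frac{1}{2|\xi|}\Big) \Big]. \]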
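The contrast between the square and the disc can be checked numerically. The following is a minimal sketch, assuming only the slicing argument used in the proof of Theorem 8.13: along the $\xi_1$ axis the two-dimensional transform reduces to the one-dimensional transform of the chord-length function $h$. The function names and the midpoint rule are illustrative choices, not part of the text.

```python
import cmath
import math


def fourier_1d(h, a, b, xi, n=20000):
    """Midpoint-rule approximation of the integral of h(s) exp(-2 pi i xi s) over [a, b]."""
    step = (b - a) / n
    total = 0j
    for j in range(n):
        s = a + (j + 0.5) * step
        total += h(s) * cmath.exp(-2j * math.pi * xi * s)
    return total * step


def square_chord(s):
    # Chord length of Q = [-1/2, 1/2]^2 along the line t1 = s (constant on the support).
    return 1.0


def disc_chord(s):
    # Chord length of the unit disc along the line t1 = s.
    return 2.0 * math.sqrt(max(0.0, 1.0 - s * s))


def square_hat(xi):
    """Approximates the transform of chi_Q at (xi, 0)."""
    return fourier_1d(square_chord, -0.5, 0.5, xi)


def disc_hat(xi):
    """Approximates the transform of chi_1 at (xi, 0)."""
    return fourier_1d(disc_chord, -1.0, 1.0, xi)


if __name__ == "__main__":
    for n in (1, 5, 25):
        xi = 2 * n + 0.5
        # First column stays near 1/pi (decay exactly 1/|xi| for the square);
        # second column stays bounded (decay |xi|^(-3/2) for the disc).
        print(xi, abs(square_hat(xi)) * xi, abs(disc_hat(xi)) * xi ** 1.5)
```

At the points $\xi_1 = 2n + 1/2$ the square attains the rate $1/|\xi|$ of (8.10), while the rescaled values for the disc remain bounded, in line with Lemma 8.14.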


8.2 The Poisson summation formula

The Poisson summation formula is a bridge between Fourier integrals and Fourier series. The basic idea is to consider a function $f \in L^1(\mathbb{R}^d)$ and periodize it.

Theorem 8.15 Let $f \in L^1(\mathbb{R}^d)$. Then there exists $g \in L^1(\mathbb{T}^d)$ such that
\[ \|g\|_{L^1(\mathbb{T}^d)} \le \|f\|_{L^1(\mathbb{R}^d)}, \qquad \hat g(m) = \hat f(m) \quad\text{for every } m \in \mathbb{Z}^d. \]

Observe that $\hat f(m)$ is a Fourier transform of the function $f \in L^1(\mathbb{R}^d)$, while $\hat g(m)$ is a Fourier coefficient of the periodic function $g \in L^1(\mathbb{T}^d)$.

Proof The function
\[ g(t) = \sum_{k\in\mathbb{Z}^d} f(t+k) \]
is periodic (that is, $g(t + k) = g(t)$ for every $t \in \mathbb{R}^d$ and $k \in \mathbb{Z}^d$). We have
\[ \|g\|_{L^1(\mathbb{T}^d)} = \int_{[-\frac12,\frac12)^d} |g(t)|\,dt \le \sum_{k\in\mathbb{Z}^d}\int_{[-\frac12,\frac12)^d} |f(t+k)|\,dt = \sum_{k\in\mathbb{Z}^d}\int_{[-\frac12,\frac12)^d - k} |f(t)|\,dt = \int_{\mathbb{R}^d} |f(t)|\,dt. \]
By the dominated convergence theorem we have, for every $m \in \mathbb{Z}^d$,
\[ \hat g(m) = \int_{[-\frac12,\frac12)^d}\sum_{k\in\mathbb{Z}^d} f(t+k)\,e^{-2\pi i m\cdot t}\,dt = \sum_{k\in\mathbb{Z}^d}\int_{[-\frac12,\frac12)^d} f(t+k)\,e^{-2\pi i m\cdot t}\,dt \]
\[ = \sum_{k\in\mathbb{Z}^d}\int_{[-\frac12,\frac12)^d - k} f(t)\,e^{-2\pi i m\cdot t}\,dt = \int_{\mathbb{R}^d} f(t)\,e^{-2\pi i m\cdot t}\,dt = \hat f(m). \]

The following result was proved by Poisson between 1823 and 1827.

Theorem 8.16 (Poisson summation formula) Let $f \in L^1(\mathbb{R}^d)$ and let $a, c > 0$ satisfy
\[ |f(t)| \le \frac{c}{(1+|t|)^{d+a}}, \qquad |\hat f(\xi)| \le \frac{c}{(1+|\xi|)^{d+a}} \tag{8.12} \]
for every $t, \xi \in \mathbb{R}^d$. Then $f$ is (a.e. equal to) a continuous function and for every $t \in \mathbb{T}^d$ we have
\[ \sum_{m\in\mathbb{Z}^d} f(t+m) = \sum_{m\in\mathbb{Z}^d} \hat f(m)\,e^{2\pi i m\cdot t}. \tag{8.13} \]


Letting $t = 0$ we obtain
\[ \sum_{m\in\mathbb{Z}^d} f(m) = \sum_{m\in\mathbb{Z}^d} \hat f(m). \tag{8.14} \]

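The particular case (8.14) is very elegant and is usually termed the Poisson summation formula.

Proof Let $g(t) = \sum_{m\in\mathbb{Z}^d} f(t+m)$. By (8.12) and Theorem 8.15 we have $g \in L^1(\mathbb{T}^d)$ and $\hat g(m) = \hat f(m)$ for every $m \in \mathbb{Z}^d$. The second inequality in (8.12) implies
\[ \sum_{m\in\mathbb{Z}^d} |\hat g(m)| \le c\int_0^{+\infty}\frac{\rho^{d-1}}{(1+\rho)^{d+a}}\,d\rho < +\infty, \]
so that the series $\sum_{m\in\mathbb{Z}^d}\hat g(m)\,e^{2\pi i m\cdot t}$ converges absolutely and uniformly. Then
\[ \sum_{m\in\mathbb{Z}^d} f(t+m) = \sum_{m\in\mathbb{Z}^d}\hat f(m)\,e^{2\pi i m\cdot t} \]
for every $t \in \mathbb{T}^d$.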
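As a quick sanity check of (8.13) in dimension $d = 1$, one can take the Gaussian $f(x) = e^{-\pi x^2}$, which coincides with its own Fourier transform under the convention $\hat f(\xi) = \int f(x)e^{-2\pi i x\xi}\,dx$ and satisfies (8.12). The sketch below is only an illustration; the truncation parameter `cutoff` is an assumption of the example.

```python
import cmath
import math


def gaussian(x):
    # f(x) = exp(-pi x^2); with the convention above, f_hat = f.
    return math.exp(-math.pi * x * x)


def periodized_sum(t, cutoff=20):
    """Left-hand side of (8.13): sum of f(t + m) over |m| <= cutoff."""
    return sum(gaussian(t + m) for m in range(-cutoff, cutoff + 1))


def fourier_side(t, cutoff=20):
    """Right-hand side of (8.13): sum of f_hat(m) exp(2 pi i m t) over |m| <= cutoff."""
    total = sum(
        gaussian(m) * cmath.exp(2j * math.pi * m * t)
        for m in range(-cutoff, cutoff + 1)
    )
    return total.real


if __name__ == "__main__":
    for t in (0.0, 0.1, 0.37, 0.5):
        print(t, periodized_sum(t), fourier_side(t))
```

Both sides agree to machine precision because the Gaussian tails beyond the cutoff are negligible.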

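As an exercise we deduce, for every $x \notin \mathbb{Z}$, the identity
\[ \sum_{n=-\infty}^{+\infty}\frac{1}{(x+n)^2} = \frac{\pi^2}{\sin^2(\pi x)}. \tag{8.15} \]
Indeed, the function
\[ A(y) := \begin{cases} 1 - |y| & \text{if } |y| \le 1,\\ 0 & \text{if } |y| > 1 \end{cases} \]
satisfies, when $\eta \ne 0$,
\[ \hat A(\eta) = \int_{-1}^{1}(1-|y|)\,e^{-2\pi i\eta y}\,dy = 2\int_0^1 (1-y)\cos(2\pi\eta y)\,dy = \frac{\sin^2(\pi\eta)}{\pi^2\eta^2}. \]
By the Fourier inversion formula (Theorem 8.7), the function
\[ \varphi(x) := \Big(\frac{\sin(\pi x)}{\pi x}\Big)^2 \]
has Fourier transform
\[ \hat\varphi(\xi) = \begin{cases} 1 - |\xi| & \text{if } |\xi| \le 1,\\ 0 & \text{if } |\xi| > 1. \end{cases} \tag{8.16} \]
We apply the Poisson summation formula (8.13) to $\varphi(x)$. By (8.16) we obtain
\[ \sum_{n\in\mathbb{Z}}\Big(\frac{\sin(\pi(x+n))}{\pi(x+n)}\Big)^2 = \sum_{n\in\mathbb{Z}}\hat\varphi(n)\,e^{2\pi i n x} = \hat\varphi(0) = 1. \]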


This implies (8.15), since $\sin^2(\pi(x+n)) = \sin^2(\pi x)$ for every integer $n$. We can use (8.15) to obtain another proof of (4.35). Indeed,
\[ \sum_{n=0}^{+\infty}\frac{1}{(2n+1)^2} = \frac14\sum_{n=0}^{+\infty}\frac{1}{(n+1/2)^2} = \frac18\sum_{n=-\infty}^{+\infty}\frac{1}{(n+1/2)^2} = \frac18\Big(\frac{\pi}{\sin(\pi/2)}\Big)^2 = \frac{\pi^2}{8}. \tag{8.17} \]
Let $X := \sum_{n=1}^{+\infty} n^{-2}$; then $X/4 = \sum_{n=1}^{+\infty}(2n)^{-2}$, so that, by (8.17), $X - X/4 = \pi^2/8$. Hence $X = \pi^2/6$. The Poisson summation formula is related to the Euler–Maclaurin summation formula (Lemma 6.17). See [113, p. 22].
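The identity (8.15) and the value $\pi^2/6$ obtained from it are easy to test numerically. The sketch below truncates the series at $|n| \le$ `cutoff` and adds the crude tail estimate $2/\text{cutoff}$ (an assumption of this example, obtained by comparing the tail with $\int x^{-2}\,dx$).

```python
import math


def shifted_inverse_squares(x, cutoff=200000):
    """Partial sum of 1/(x+n)^2 over |n| <= cutoff, plus the tail estimate 2/cutoff."""
    total = sum(1.0 / (x + n) ** 2 for n in range(-cutoff, cutoff + 1))
    return total + 2.0 / cutoff


def closed_form(x):
    """Right-hand side of (8.15)."""
    return (math.pi / math.sin(math.pi * x)) ** 2


if __name__ == "__main__":
    for x in (0.5, 0.3, 0.05):
        print(x, shifted_inverse_squares(x), closed_form(x))
    # (8.17): one eighth of the value at x = 1/2 is the sum over odd squares,
    # and multiplying by 4/3 recovers X = pi^2 / 6.
    print(closed_form(0.5) / 8 * 4 / 3, math.pi ** 2 / 6)
```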

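8.3 The Gauss circle problem

For every $R > 0$ let $\chi_R(t) := \chi_{B(0,R)}(t)$ be the characteristic function of the disc $B_R := B(0, R) = \{t \in \mathbb{R}^2 : |t| \le R\}$ with centre 0 and radius $R$. Let $R$ be large and let
\[ N(R) := \operatorname{card}\big(B_R \cap \mathbb{Z}^2\big) = \sum_{m\in\mathbb{Z}^2}\chi_R(m) \]
be the number of integer points in $B_R$. Let $D_R := N(R) - \pi R^2$. We know (see (2.15)) that $D_R = O(R)$. The following result was proved by Sierpinski in 1906 [155] (see also [113, 160]).

Theorem 8.17 (Sierpinski) There exists $c > 0$ such that $|D_R| \le cR^{2/3}$.

Proof First we use a convolution to smooth the discontinuous function $\chi_R(t)$. Let $\varepsilon > 0$ be small (we shall choose it later on) and let
\[ \varphi_\varepsilon(t) := \pi^{-1}\varepsilon^{-2}\chi_\varepsilon(t). \tag{8.18} \]
The support of $\varphi_\varepsilon$ is contained in $B(0, \varepsilon)$. Moreover, $\int_{\mathbb{R}^2}\varphi_\varepsilon = 1$ and
\[ \hat\varphi_\varepsilon(\xi) = \int_{\mathbb{R}^2}\pi^{-1}\varepsilon^{-2}\chi_\varepsilon(t)\,e^{-2\pi i\xi\cdot t}\,dt = \pi^{-1}\int_{\mathbb{R}^2}\chi_\varepsilon(t\varepsilon)\,e^{-2\pi i\xi\cdot t\varepsilon}\,dt = \pi^{-1}\int_{\mathbb{R}^2}\chi_1(t)\,e^{-2\pi i\varepsilon\xi\cdot t}\,dt = \pi^{-1}\hat\chi_1(\varepsilon\xi). \tag{8.19} \]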


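Let
\[ \chi_R^{(\varepsilon)}(t) := (\varphi_\varepsilon * \chi_R)(t), \qquad N_\varepsilon(R) := \sum_{m\in\mathbb{Z}^2}\chi_R^{(\varepsilon)}(m). \]
We are going to compare $N(R)$ with $N_\varepsilon(R)$. By Lemma 8.14 the function $\chi_R^{(\varepsilon)}(t)$ has absolutely convergent Fourier series, hence it is continuous. Moreover, it coincides with $\chi_R(t)$ when $t$ is away from the boundary of the disc. Namely,
\[ \chi_R^{(\varepsilon)}(t) = \chi_R(t) \qquad\text{if } t \notin B_{R+\varepsilon}\setminus B_{R-\varepsilon}. \]
Indeed, if $t \notin B_{R+\varepsilon}$, then $|t - y| > \varepsilon$ for every $y \in B_R$ and, therefore,
\[ \chi_R^{(\varepsilon)}(t) = \int_{B_R}\varphi_\varepsilon(t-y)\,dy = 0 = \chi_R(t). \]
If $t \in B_{R-\varepsilon}$ then $B(t, \varepsilon) \subseteq B_R$ and, therefore,
\[ \chi_R^{(\varepsilon)}(t) = \int_{B_R}\varphi_\varepsilon(t-y)\,dy = \int_{B_\varepsilon}\varphi_\varepsilon(u)\,du = 1 = \chi_R(t). \]
Since, for every $t$,
\[ 0 \le \chi_R^{(\varepsilon)}(t) = \int_{\mathbb{R}^2}\varphi_\varepsilon(t-y)\chi_R(y)\,dy \le \int_{\mathbb{R}^2}\varphi_\varepsilon(t-y)\,dy = \int_{\mathbb{R}^2}\varphi_\varepsilon = 1, \]
we have
\[ N_\varepsilon(R-\varepsilon) = \sum_{m\in\mathbb{Z}^2}\chi_{R-\varepsilon}^{(\varepsilon)}(m) \le \sum_{m\in\mathbb{Z}^2}\chi_R(m) = N(R) \le \sum_{m\in\mathbb{Z}^2}\chi_{R+\varepsilon}^{(\varepsilon)}(m) = N_\varepsilon(R+\varepsilon). \tag{8.20} \]
Since
\[ \hat\chi_R(\xi) = \int_{B_R} e^{-2\pi i\xi\cdot t}\,dt = R^2\int_{B_1} e^{-2\pi i\xi\cdot Rs}\,ds = R^2\hat\chi_1(R\xi), \]
Theorem 8.4 and (8.19) give
\[ \big(\chi_R^{(\varepsilon)}\big)^{\wedge}(\xi) = \hat\varphi_\varepsilon(\xi)\,\hat\chi_R(\xi) = \pi^{-1}R^2\,\hat\chi_1(\varepsilon\xi)\,\hat\chi_1(R\xi). \tag{8.21} \]
By Lemma 8.14, (8.19) and (8.21), we can apply the Poisson summation formula to the function $\chi_R^{(\varepsilon)}(t)$ and obtain
\[ N_\varepsilon(R) = \sum_{m\in\mathbb{Z}^2}\chi_R^{(\varepsilon)}(m) = \sum_{m\in\mathbb{Z}^2}\big(\chi_R^{(\varepsilon)}\big)^{\wedge}(m) = \sum_{m\in\mathbb{Z}^2}\hat\varphi_\varepsilon(m)\,\hat\chi_R(m) = \pi R^2 + \pi^{-1}R^2\sum_{m\in\mathbb{Z}^2,\ m\ne 0}\hat\chi_1(\varepsilon m)\,\hat\chi_1(Rm). \]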


We apply Lemma 8.14 again, then we bound the series with an integral and use polar coordinates:
\[ R^2\sum_{m\in\mathbb{Z}^2,\ m\ne 0}|\hat\chi_1(\varepsilon m)|\,|\hat\chi_1(Rm)| \le c_1R^2\sum_{m\in\mathbb{Z}^2,\ m\ne 0}(1+|\varepsilon m|)^{-3/2}\,|Rm|^{-3/2} \]
\[ \le c_2R^{1/2}\int_{|\xi|\ge 1}\frac{1}{(1+\varepsilon|\xi|)^{3/2}}\,\frac{1}{|\xi|^{3/2}}\,d\xi = c_2R^{1/2}\int_1^{\infty}\frac{1}{(1+\varepsilon r)^{3/2}}\,r^{-1/2}\,dr \]
\[ \le c_2R^{1/2}\varepsilon^{-1/2}\int_0^{\infty}\frac{1}{(1+s)^{3/2}}\,s^{-1/2}\,ds = c_3R^{1/2}\varepsilon^{-1/2}. \]
Hence
\[ N_\varepsilon(R) = \pi R^2 + O\big(R^{1/2}\varepsilon^{-1/2}\big). \tag{8.22} \]
Now we replace $R$ with $R \pm \varepsilon$ in (8.22), then (8.20) implies
\[ N(R) \le N_\varepsilon(R+\varepsilon) = \pi(R+\varepsilon)^2 + O\big((R+\varepsilon)^{1/2}\varepsilon^{-1/2}\big) = \pi R^2 + 2\pi R\varepsilon + O\big(R^{1/2}\varepsilon^{-1/2}\big) = \pi R^2 + O\big(R\varepsilon + R^{1/2}\varepsilon^{-1/2}\big) \]
together with a similar estimate from below. Let $\varepsilon = R^{-1/3}$ (this choice makes the terms $R\varepsilon$ and $R^{1/2}\varepsilon^{-1/2}$ equal). Then
\[ N(R) = \pi R^2 + O(R^{2/3}). \]

Sierpinski's result is one step in a sequence of estimates for the circle problem. Let
\[ \theta := \inf\big\{\alpha \in \mathbb{R} : N(R) - \pi R^2 = O(R^{\alpha})\big\}. \]
The following results have been obtained so far:

θ ≤ 1                          Gauss (1801)
θ ≤ 2/3                        Sierpinski (1906)
θ ≤ 2/3 − ε                    van der Corput (1923)
θ ≤ 37/56 = 0.66071···         Landau (1924), Littlewood and Walfisz (1924)
θ ≤ 163/247 = 0.65992···       Walfisz (1927)
θ ≤ 27/41 = 0.65854···         Nieland (1928)
θ ≤ 15/23 = 0.65217···         Titchmarsh (1934)
θ ≤ 13/20 = 0.65               Hua (1942)
θ ≤ 24/37 = 0.64865···         Chen (1963)
θ ≤ 35/54 = 0.64815···         Nowak (1984)
θ ≤ 278/429 = 0.64802···       Kolesnik (1985)
θ ≤ 7/11 = 0.63636···          Iwaniec and Mozzochi (1987)
θ ≤ 46/73 = 0.63014···         Huxley (1993)
θ ≤ 131/208 = 0.62981···       Huxley (2003).
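For small radii the quantities $N(R)$ and $D_R$ can be computed directly. The following minimal sketch counts the integer points column by column; the function names are illustrative and the routine is a brute-force check, not the method of the proof above.

```python
import math


def lattice_points_in_disc(radius):
    """N(R): number of integer points (a, b) with a^2 + b^2 <= R^2."""
    r2 = radius * radius
    top = math.floor(radius)
    count = 0
    for a in range(-top, top + 1):
        # For fixed a, the admissible b satisfy b^2 <= R^2 - a^2.
        count += 2 * math.isqrt(math.floor(r2 - a * a)) + 1
    return count


def discrepancy(radius):
    """D_R = N(R) - pi R^2."""
    return lattice_points_in_disc(radius) - math.pi * radius * radius


if __name__ == "__main__":
    for radius in (10, 100, 1000):
        print(radius, lattice_points_in_disc(radius), discrepancy(radius), radius ** (2 / 3))
```

The last column prints $R^{2/3}$ for comparison with Theorem 8.17; the constant $c$ in the theorem is not made explicit in the text, so the comparison is only indicative.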

In 1915 Hardy and Landau proved independently that θ ≥ 1/2 [88, 115]. In a subsequent paper [89], Hardy proved that, as R → ∞, N(R) − πR2 0, (8.23) * R log R which shows that the bound N(R) − πR2 ≤ c R1/2 is false. We present a proof of the result of Hardy and Landau which follows a general argument due to Erd˝os and Fuchs [75, 130], and has been communicated to us by Podkorytov. Theorem 8.18 (Hardy–Landau) Let N (R) := card Z2 ∩ B (0, R) and let D (R) := N (R) − πR2 . Assume the existence of two constants c0 > 0 and α < 1 such that |D (R)| ≤ c0 Rα for every R ≥ 1. Then α ≥ 1/2. Proof

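The power series
\[ f(z) = \sum_{h=-\infty}^{+\infty} z^{h^2} \]
is defined on the open unit disc $\{z \in \mathbb{C} : |z| < 1\}$. As in Chapter 2, let $r(m) = \operatorname{card}\{(a, b) \in \mathbb{Z}^2 : a^2 + b^2 = m\}$. Then $r(m) = N(\sqrt m) - N(\sqrt{m-1})$ for every positive integer $m$. Therefore
\[ f^2(z) = \sum_{h,k=-\infty}^{+\infty} z^{h^2+k^2} = \sum_{m=0}^{+\infty} r(m)\,z^m \tag{8.24} \]
\[ = 1 + \sum_{m=1}^{+\infty}\big(N(\sqrt m) - N(\sqrt{m-1})\big)z^m = (1-z)\sum_{m=0}^{+\infty} N(\sqrt m)\,z^m \]
\[ = (1-z)\sum_{m=0}^{+\infty}\big(D(\sqrt m) + \pi m\big)z^m = \frac{\pi z}{1-z} + (1-z)\sum_{m=0}^{+\infty} D(\sqrt m)\,z^m. \]
Since $\log x \le x - 1$ for every $x > 0$, then, for $0 < r < 1$, (6.25) implies
\[ f(r) > \sum_{h=0}^{+\infty} r^{h^2} \ge \int_0^{+\infty} r^{t^2}\,dt = \frac{1}{\sqrt{\log(1/r)}}\int_0^{+\infty} e^{-u^2}\,du = \frac{\sqrt\pi}{2\sqrt{\log(1/r)}} \ge \frac{\sqrt{\pi r}}{2\sqrt{1-r}}. \tag{8.25} \]
For every positive integer $K$ let
\[ S_K(z) := 1 + z + \ldots + z^{K-1}, \qquad I_K(r) := \int_{-\pi}^{\pi}\big|f\big(re^{i\theta}\big)\,S_K\big(re^{i\theta}\big)\big|^2\,d\theta. \]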


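Observe that $f S_K$ is a power series with non-negative integral coefficients. Let us write
\[ (f S_K)(z) = \sum_{m=0}^{+\infty}\alpha_m z^m. \]
By Parseval's theorem and (8.25) we obtain
\[ I_K(r) = 2\pi\int_0^1\Big|\sum_{m=0}^{+\infty}\alpha_m r^m e^{2\pi i m\theta}\Big|^2 d\theta = 2\pi\sum_{m=0}^{+\infty}\alpha_m^2 r^{2m} \ge 2\pi\sum_{m=0}^{+\infty}\alpha_m r^{2m} = 2\pi f(r^2)\,S_K(r^2) \ge \frac{\pi^{3/2}\,r}{\sqrt{1-r^2}}\,K r^{2(K-1)} \ge \frac{K r^{2K}}{\sqrt{1-r}}. \]
We now want to bound $I_K(r)$ from above. (8.24) implies
\[ I_K(r) \le \int_{-\pi}^{\pi}\Big|\frac{\pi re^{i\theta}}{1-re^{i\theta}}\Big|\,\big|1 + re^{i\theta} + \ldots + r^{K-1}e^{i(K-1)\theta}\big|^2\,d\theta + \int_{-\pi}^{\pi}\big|1 + re^{i\theta} + \ldots + r^{K-1}e^{i(K-1)\theta}\big|^2\,\big|1-re^{i\theta}\big|\,\Big|\sum_{m=0}^{+\infty} D(\sqrt m)\,r^m e^{im\theta}\Big|\,d\theta := A + B. \]
We estimate $A$. For every $\theta \in [-\pi, \pi]$ we have
\[ \big|1-re^{i\theta}\big|^2 = \big(1-re^{i\theta}\big)\big(1-re^{-i\theta}\big) = (1-r)^2 + 2r(1-\cos\theta) = (1-r)^2 + 4r\sin^2\frac{\theta}{2} \ge (1-r)^2 + \frac{4r}{\pi^2}\theta^2 \ge \max\Big\{1-r,\ \frac{2r^{1/2}}{\pi}|\theta|\Big\}^2 \]
(since $\sin t \ge \frac{2}{\pi}t$ for every $0 \le t \le \pi/2$). Then
\[ A \le \pi K^2\int_{-\pi}^{\pi}\frac{1}{|1-re^{i\theta}|}\,d\theta \le c_1K^2\int_0^{\pi}\frac{1}{\max\{(1-r),\,\theta\}}\,d\theta \le c_1K^2\int_0^{1-r}\frac{1}{1-r}\,d\theta + c_1K^2\int_{1-r}^{\pi}\theta^{-1}\,d\theta \le c_2K^2\log\frac{1}{1-r}. \]
We estimate $B$. By the Cauchy–Schwarz inequality, Parseval's identity and the assumption $|D(R)| \le c_0R^{\alpha}$ we have
\[ B \le 2\int_{-\pi}^{\pi}\big|1 + re^{i\theta} + \ldots + r^{K-1}e^{i(K-1)\theta}\big|\,\Big|\sum_{m=0}^{+\infty} D(\sqrt m)\,r^m e^{im\theta}\Big|\,d\theta \]
\[ \le 2\Big\{\int_{-\pi}^{\pi}\big|1 + re^{i\theta} + \ldots + r^{K-1}e^{i(K-1)\theta}\big|^2 d\theta\Big\}^{1/2}\Big\{\int_{-\pi}^{\pi}\Big|\sum_{m=0}^{+\infty} D(\sqrt m)\,r^m e^{im\theta}\Big|^2 d\theta\Big\}^{1/2}. \]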


\[ B \le 4\pi\Big\{\sum_{m=0}^{K-1} r^{2m}\Big\}^{1/2}\Big\{\sum_{m=0}^{+\infty} D^2(\sqrt m)\,r^{2m}\Big\}^{1/2} \le 4\pi c_0K^{1/2}\Big\{\sum_{m=0}^{+\infty} m^{\alpha}r^{2m}\Big\}^{1/2} \le 4\pi c_0K^{1/2}\Big\{\sum_{m=0}^{+\infty} m^{\alpha}r^{m}\Big\}^{1/2} \le 4\pi c_0\frac{K^{1/2}}{(1-r)^{(\alpha+1)/2}}, \]
where the last step is a consequence of Hölder's inequality. Indeed (see footnote 2), for $0 < a < 1$ and $0 < r < 1$ we have
\[ \sum_{m=0}^{\infty} m^a r^m = \sum_{m=0}^{\infty} m^a r^{am}\,r^{m-am} = \sum_{m=0}^{\infty}(m r^m)^a\,(r^m)^{1-a} \le \Big\{\sum_{m=0}^{\infty} m r^m\Big\}^{a}\Big\{\sum_{m=0}^{\infty} r^m\Big\}^{1-a} \le \Big\{\frac{1}{(1-r)^2}\Big\}^{a}\Big\{\frac{1}{1-r}\Big\}^{1-a} = \frac{1}{(1-r)^{a+1}}. \]
Hence
\[ \frac{K r^{2K}}{(1-r)^{1/2}} \le I_K(r) \le c_2K^2\log\frac{1}{1-r} + 4\pi c_0\frac{K^{1/2}}{(1-r)^{(\alpha+1)/2}}. \tag{8.26} \]
We now choose $r = r_K := 1 - K^{-b}$, with $b > 2$. Then (8.26) implies
\[ c_3K^{1+b/2}\Big(1 - \frac{1}{K^b}\Big)^{2K} \le bK^2\log K + K^{1/2}K^{b(\alpha+1)/2}. \]
Observe that $(1 - K^{-b})^{2K} \to 1$ as $K \to +\infty$. Then for large $K$ we have $(1 - K^{-b})^{2K} \ge 1/2$. Hence there exists $c_4 > 0$ such that
\[ c_4K^{1+b/2} \le bK^2\log K + K^{1/2}K^{b(\alpha+1)/2}. \tag{8.27} \]
Since $b > 2$, then (8.27) implies
\[ 1 + \frac{b}{2} \le \frac12 + \frac{b(\alpha+1)}{2}, \]
that is, $b\alpha \ge 1$ for every $b > 2$. Hence $\alpha \ge 1/2$.
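The two combinatorial identities behind (8.24) can be verified by brute force. The sketch below is an illustration only: it expands the coefficients of $f^2$ up to a chosen degree (the helper names and the parameter `limit` are assumptions of the example) and compares them with the differences $N(\sqrt m) - N(\sqrt{m-1})$.

```python
import math


def lattice_count(radius_squared):
    """N(sqrt(M)): number of (a, b) in Z^2 with a^2 + b^2 <= M, for an integer M >= 0."""
    top = math.isqrt(radius_squared)
    return sum(
        2 * math.isqrt(radius_squared - a * a) + 1 for a in range(-top, top + 1)
    )


def theta_square_coefficients(limit):
    """Coefficients of z^0, ..., z^limit in f(z)^2, where f(z) is the sum of z^(h^2) over all integers h."""
    top = math.isqrt(limit)
    coeffs = [0] * (limit + 1)
    for h in range(-top, top + 1):
        for k in range(-top, top + 1):
            if h * h + k * k <= limit:
                coeffs[h * h + k * k] += 1
    return coeffs


if __name__ == "__main__":
    coeffs = theta_square_coefficients(30)
    for m in range(1, 31):
        # The coefficient of z^m is r(m), which equals N(sqrt(m)) - N(sqrt(m - 1)).
        print(m, coeffs[m], lattice_count(m) - lattice_count(m - 1))
```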

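Before going on we state the general result proved by Erdős and Fuchs. See [75, 130].

Footnote 2: If the numbers $\alpha_j$ and $\beta_j$ are $\ge 0$, then
\[ \sum_{j=1}^{+\infty}\alpha_j\beta_j \le \Big(\sum_{j=1}^{+\infty}\alpha_j^p\Big)^{1/p}\Big(\sum_{j=1}^{+\infty}\beta_j^q\Big)^{1/q}, \]
where $1 < p, q < +\infty$ and $\frac1p + \frac1q = 1$.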


Theorem 8.19 (Erdős and Fuchs) Let $A$ be a set of non-negative integers and, for every non-negative integer $n$, let $\gamma(n)$ be the number of pairs $(a_1, a_2) \in A \times A$ which solve the equation $n = a_1 + a_2$. Let us assume the existence of positive numbers $L, \alpha, c$ such that
\[ \Big|\sum_{h=0}^{n}(\gamma(h) - L)\Big| \le cn^{\alpha}. \]
Then $\alpha \ge 1/4$.

The following result was proved by Kendall in 1948 [106] (see also [35]). Kendall's result deals with the 'shifted circle problem'. It shows that the conjecture $\theta = 1/2$ is true in this easier problem. Apparently, Kendall was the first to realize that certain lattice point problems can be studied using Fourier analysis in several variables.

Theorem 8.20 (Kendall) For every $t \in \mathbb{R}^2$ let
\[ D_R(t) := \operatorname{card}\big(B(t,R) \cap \mathbb{Z}^2\big) - \pi R^2 = -\pi R^2 + \sum_{k\in\mathbb{Z}^2}\chi_{B(t,R)}(k), \]
where $B(t, R)$ is the disc with centre $t$ and radius $R$. Then there exists $c > 0$ such that, for every $R \ge 1$, we have
\[ \|D_R\|_{L^2(\mathbb{T}^2)} \le cR^{1/2}. \]
A comparison between Kendall's result and (8.23) shows that the discs centred at the origin have a discrepancy larger than the average.

Proof The integration is over $\mathbb{T}^2$ since the function $D_R(t)$ is periodic. We compute the Fourier coefficients of $D_R(t)$. For $m = 0$ we have
\[ \hat D_R(0) = \int_{[0,1)^2}\Big(-\pi R^2 + \sum_{k\in\mathbb{Z}^2}\chi_{B(t,R)}(k)\Big)dt = -\pi R^2 + \sum_{k\in\mathbb{Z}^2}\int_{[0,1)^2}\chi_{B(t,R)}(k)\,dt \]
\[ = -\pi R^2 + \sum_{k\in\mathbb{Z}^2}\int_{[0,1)^2}\chi_{B(0,R)}(k-t)\,dt = -\pi R^2 + \int_{\mathbb{R}^2}\chi_{B(0,R)}(t)\,dt = 0. \]
For $0 \ne m \in \mathbb{Z}^2$ we have
\[ \hat D_R(m) = \int_{[0,1)^2} D_R(t)\,e^{-2\pi i m\cdot t}\,dt = \int_{[0,1)^2}\Big(-\pi R^2 + \sum_{k\in\mathbb{Z}^2}\chi_{B(t,R)}(k)\Big)e^{-2\pi i m\cdot t}\,dt \]
\[ = \sum_{k\in\mathbb{Z}^2}\int_{[0,1)^2}\chi_{B(t,R)}(k)\,e^{-2\pi i m\cdot t}\,dt = \sum_{k\in\mathbb{Z}^2}\int_{[0,1)^2}\chi_{B(0,R)}(k-t)\,e^{-2\pi i m\cdot t}\,dt \]
\[ = \int_{\mathbb{R}^2}\chi_{B(0,R)}(t)\,e^{-2\pi i m\cdot t}\,dt = \hat\chi_R(m) = R^2\hat\chi_1(Rm). \]


By Lemma 8.14 there exists $c > 0$ such that, for every $m \ne 0$,
\[ |\hat D_R(m)| \le cR^{1/2}\frac{1}{|m|^{3/2}}. \]
Then Parseval's identity (4.23) implies
\[ \|D_R\|_{L^2(\mathbb{T}^2)}^2 = \sum_{m\in\mathbb{Z}^2}|\hat D_R(m)|^2 \le c_1R\sum_{0\ne m\in\mathbb{Z}^2}|m|^{-3} \le c_2R\int_1^{+\infty} r^{-2}\,dr = c_2R. \]

In the last chapter we shall prove the lower bound $\|D_R\|_{L^2(\mathbb{T}^2)} \ge cR^{1/2}$. Moreover we shall extend Theorem 8.20 to $d$ variables and see that the corresponding lower bound may change significantly with $d$. For other mean estimates related to the circle problem see, for example, [60, 99].
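Kendall's estimate can be explored numerically by averaging $D_R(t)^2$ over a grid of centres $t \in [0,1)^2$. The sketch below is only a rough experiment under stated assumptions: the grid size and the helper names are choices of this example, and a Riemann sum over a grid is used in place of the exact integral over $\mathbb{T}^2$.

```python
import math


def shifted_discrepancy(radius, tx, ty):
    """D_R(t) = card(B(t, R) ∩ Z^2) - pi R^2 for the centre t = (tx, ty)."""
    count = 0
    for a in range(math.floor(tx - radius), math.ceil(tx + radius) + 1):
        span_sq = radius * radius - (a - tx) ** 2
        if span_sq < 0:
            continue
        half = math.sqrt(span_sq)
        # Integers b with |b - ty| <= half.
        count += math.floor(ty + half) - math.ceil(ty - half) + 1
    return count - math.pi * radius * radius


def rms_discrepancy(radius, grid=40):
    """Approximates the L^2(T^2) norm of D_R by averaging over a grid of centres."""
    total = 0.0
    for i in range(grid):
        for j in range(grid):
            total += shifted_discrepancy(radius, (i + 0.5) / grid, (j + 0.5) / grid) ** 2
    return math.sqrt(total / (grid * grid))


if __name__ == "__main__":
    for radius in (5.0, 10.0, 20.0, 40.0):
        # The ratio in the last column should stay bounded as R grows.
        print(radius, rms_discrepancy(radius), rms_discrepancy(radius) / math.sqrt(radius))
```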

8.4 Integer points in convex bodies

In this section we shall replace the disc with a square or, more generally, with a planar convex body. In the case of a square $Q$ we observe that if the sides are parallel to the axes (or have rational slopes), then the function $R \mapsto R^2 - \operatorname{card}(RQ \cap \mathbb{Z}^2)$ changes sign infinitely many times and we have $cR \le |R^2 - \operatorname{card}(RQ \cap \mathbb{Z}^2)| \le c_1R$ for infinitely many values of $R$. See the figure below, where the two squares have almost the same area, while the larger one contains $\approx R$ more integer points than the smaller one.


Averaging over the rotations we obtain a different estimate [28, 144, 169].

Theorem 8.21 Let $P \subset \mathbb{R}^2$ be a polygon and let $SO(2)$ be the rotation group in $\mathbb{R}^2$. For every
\[ \sigma_\theta = \begin{pmatrix}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\end{pmatrix} \in SO(2) \]
let $\sigma_\theta(P)$ be the polygon rotated by the angle $\theta$. Then there exists a positive constant $c$ such that, for every $R \ge 2$,
\[ \int_0^{2\pi}\Big|\operatorname{card}\big(R\sigma_\theta(P) \cap \mathbb{Z}^2\big) - R^2|P|\Big|\,d\theta \le c\log^2 R. \]

We need a lemma on the average decay of the Fourier transform of the characteristic function $\chi_P$ of $P$.

Lemma 8.22 Let $\chi_P(t)$ be the characteristic function of a polygon $P$. Let $1 \le p \le \infty$, then there exist positive constants $c$ and $c_p$ such that, for every $\rho \ge 2$,
\[ \|\hat\chi_P(\rho\,\cdot)\|_{L^p([0,2\pi])} \le \begin{cases} c\rho^{-2}\log\rho & \text{if } p = 1,\\ c_p\,\rho^{-1-1/p} & \text{if } 1 < p \le \infty, \end{cases} \tag{8.28} \]
where $\hat\chi_P(\rho\Theta) = \hat\chi_P(\rho\cos\theta, \rho\sin\theta)$ is written in polar coordinates.

Proof We may subdivide $P$ as a disjoint (up to sets of measure zero) union of a finite number of convex polygons. That is, we may assume that $P$ is convex and therefore apply Theorem 8.13. Then the case $p = \infty$ follows from (8.10). For the case $p < \infty$ we look at the figure below:

which shows that the chord at distance $\rho^{-1}$ from the boundary (see (8.8)) has length
\[ |\lambda(\rho^{-1},\theta)| \le c\min\big(1,\ \rho^{-1}\theta^{-1}\big). \]


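Then
\[ \Big\{\int_0^{2\pi}|\hat\chi_P(\rho\Theta)|^p\,d\theta\Big\}^{1/p} \le \Big\{\int_0^{1/\rho}\Big(\frac{c_1}{\rho}\Big)^p d\theta\Big\}^{1/p} + \Big\{\int_{1/\rho}^{c_2}\Big(\frac{1}{\rho}\,\frac{c_1}{\rho\theta}\Big)^p d\theta\Big\}^{1/p} \le \begin{cases} c_3\,\rho^{-2}\log\rho & \text{if } p = 1,\\ c_p\,\rho^{-1-1/p} & \text{if } p > 1, \end{cases} \]
where $c_p$ depends on $p$.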

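A different proof of the previous result will be given implicitly in the proof of Lemma 10.6. The following lemma was pointed out to us by Colzani.

Lemma 8.23 Let $C \subset \mathbb{R}^d$ be a closed convex body. Let $A^o$ denote the interior of a set $A \subseteq \mathbb{R}^d$ and let $\partial(A)$ be the boundary of $A$. Then for large $R$ and small $\varepsilon$ we have, for every $q \in \partial(RC)$,
\[ B(q,\varepsilon) \subseteq (R+\varepsilon)C \setminus \big((R-\varepsilon)C\big)^o, \]
where $B(q, \varepsilon) = \{t \in \mathbb{R}^d : |t - q| \le \varepsilon\}$.

Proof Since $C$ has positive measure, there is an open ball contained in $C$. We may assume that $B(0, 1) \subset C$. By convexity we have
\[ \frac{R}{R+\varepsilon}\,C + \frac{\varepsilon}{R+\varepsilon}\,C \subseteq C. \]
Hence
\[ (R+\varepsilon)C \supseteq RC + \varepsilon C \supseteq RC + B(0,\varepsilon), \tag{8.29} \]
and therefore $B(q, \varepsilon) \subseteq (R + \varepsilon)C$ for every $q \in \partial(RC)$. We now replace $R + \varepsilon$ with $R$ and apply (8.29) to $C^o$. Then $((R - \varepsilon)C)^o + B(0, \varepsilon) \subseteq (RC)^o$. If $y \in B(q, \varepsilon) \cap ((R - \varepsilon)C)^o$, then $q \in (RC)^o$. Hence $q \notin \partial(RC)$.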

Proof of Theorem 8.21 We may assume that $P$ is convex. Let $\chi_P$ be the characteristic function of $P$. We follow the argument in the proof of Theorem 8.17. Let
\[ \varphi_\varepsilon(t) := \pi^{-1}\varepsilon^{-2}\chi_\varepsilon(t), \qquad \chi_R^{(\varepsilon)}(t) := (\varphi_\varepsilon * \chi_{RP})(t), \qquad N_R^{(\varepsilon,\theta)} := \sum_{m\in\mathbb{Z}^2}\chi_R^{(\varepsilon)}(\sigma_\theta(m)). \]
We observe that
\[ \hat\varphi_\varepsilon(\sigma_\theta(m)) = \pi^{-1}\varepsilon^{-2}\int_{|t|\le\varepsilon} e^{-2\pi i\sigma_\theta(m)\cdot t}\,dt = \pi^{-1}\varepsilon^{-2}\int_{|t|\le\varepsilon} e^{-2\pi i m\cdot\sigma_\theta^{-1}(t)}\,dt = \pi^{-1}\varepsilon^{-2}\int_{|t|\le\varepsilon} e^{-2\pi i m\cdot t}\,dt = \hat\varphi_\varepsilon(m) = \pi^{-1}\hat\chi_1(\varepsilon m). \]
Moreover,
\[ \big(\chi_{RP}(\sigma_\theta(\cdot))\big)^{\wedge}(m) = \int_{\mathbb{R}^2}\chi_{RP}(\sigma_\theta(t))\,e^{-2\pi i m\cdot t}\,dt = \int_{\mathbb{R}^2}\chi_{RP}(u)\,e^{-2\pi i m\cdot\sigma_\theta^{-1}(u)}\,du = \int_{\mathbb{R}^2}\chi_{RP}(u)\,e^{-2\pi i\sigma_\theta(m)\cdot u}\,du \]
\[ = R^2\int_{\mathbb{R}^2}\chi_P(s)\,e^{-2\pi iR\sigma_\theta(m)\cdot s}\,ds = R^2\hat\chi_P(R\sigma_\theta(m)). \]

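Then Lemma 8.14 and (8.10) give, for $|\xi| \ge 1$,
\[ \Big|\big(\chi_R^{(\varepsilon)}(\sigma_\theta(\cdot))\big)^{\wedge}(\xi)\Big| = \Big|\big(\chi_R^{(\varepsilon)}\big)^{\wedge}\big(\sigma_\theta^{-1}(\xi)\big)\Big| = \Big|\hat\varphi_\varepsilon\big(\sigma_\theta^{-1}(\xi)\big)\,\hat\chi_{RP}\big(\sigma_\theta^{-1}(\xi)\big)\Big| = \pi^{-1}R^2\,|\hat\chi_1(\varepsilon\xi)|\,\big|\hat\chi_P\big(R\sigma_\theta^{-1}(\xi)\big)\big| \le cR\,|\xi|^{-5/2}\varepsilon^{-3/2}, \]
where $c$ is independent of $\theta$. Hence we can apply the Poisson summation formula and obtain
\[ N_R^{(\varepsilon,\theta)} = \sum_{m\in\mathbb{Z}^2}\big(\chi_R^{(\varepsilon)}(\sigma_\theta(\cdot))\big)^{\wedge}(m) = \sum_{m\in\mathbb{Z}^2}\hat\varphi_\varepsilon(m)\,\big(\chi_{RP}(\sigma_\theta(\cdot))\big)^{\wedge}(m) \tag{8.30} \]
\[ = R^2\sum_{m\in\mathbb{Z}^2}\hat\varphi_\varepsilon(m)\,\hat\chi_P(R\sigma_\theta(m)) = R^2|P| + R^2\sum_{m\ne 0}\hat\varphi_\varepsilon(m)\,\hat\chi_P(R\sigma_\theta(m)). \]
By Lemma 8.23 we have
\[ \chi_R^{(\varepsilon)}(t) = (\varphi_\varepsilon * \chi_{RP})(t) = \int_{RP}\varphi_\varepsilon(t-s)\,ds = \begin{cases} 1 & \text{if } t \in (R-\varepsilon)P,\\ 0 & \text{if } t \notin (R+\varepsilon)P. \end{cases} \]
Then
\[ N_{R-\varepsilon}^{(\varepsilon,\theta)} \le \operatorname{card}\big(R\sigma_\theta(P) \cap \mathbb{Z}^2\big) = \sum_{m\in\mathbb{Z}^2}\chi_{RP}(\sigma_\theta(m)) \le N_{R+\varepsilon}^{(\varepsilon,\theta)} \tag{8.31} \]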

and therefore (8.28), (8.30) and (8.31) imply
\[ c\int_0^{2\pi}\Big|\operatorname{card}\big(R\sigma_\theta(P) \cap \mathbb{Z}^2\big) - R^2|P|\Big|\,d\theta \le 2R\varepsilon + \varepsilon^2 + \max_{\pm}\Big((R\pm\varepsilon)^2\sum_{m\ne 0}|\hat\varphi_\varepsilon(m)|\int_0^{2\pi}\Big|\hat\chi_P\Big((R\pm\varepsilon)\,|m|\,\sigma_\theta\Big(\frac{m}{|m|}\Big)\Big)\Big|\,d\theta\Big) \]
\[ \le 2R\varepsilon + c_1R^2\sum_{m\ne 0}|\hat\varphi_\varepsilon(m)|\,(R|m|)^{-2}\log(R|m|). \]
Now we choose $\varepsilon = 1/R$. Then (8.19) and Lemma 8.14 give
\[ R^2\sum_{m\ne 0}|\hat\varphi_{1/R}(m)|\,(R|m|)^{-2}\log(R|m|) = \pi^{-1}\sum_{m\ne 0}|\hat\chi_1(R^{-1}m)|\,|m|^{-2}\log(R|m|) \le c_2\sum_{m\ne 0}\frac{1}{(1+|m|/R)^{3/2}}\,|m|^{-2}\log(R|m|) \]
\[ \le c_3\int_1^{\infty}\frac{1}{x\,(1+x/R)^{3/2}}\log(Rx)\,dx \le c_4\int_1^{R} x^{-1}\log(R^2)\,dx + c_4R^{3/2}\int_R^{\infty} x^{-5/2}\log(x^2)\,dx \le c_8\log^2 R. \]

The following result, proved by Davenport in 1958 [67], shows that suitable rotations make the discrepancy of a square at most logarithmic.

Theorem 8.24 (Davenport) Let $\alpha$ be an irrational algebraic number of degree 2 and let $Q$ be a unit square with a side parallel to the vector $(1, \alpha)$. Then there exists $c > 0$ such that
\[ \int_{\mathbb{T}^2}\Big|\operatorname{card}\big(R(Q+t) \cap \mathbb{Z}^2\big) - R^2\Big|^2 dt \le c\log R. \]

Proof For simplicity, let us assume that the unit square $Q$ has sides parallel to the unit vectors $\sigma = \big(-1/\sqrt3,\ \sqrt2/\sqrt3\big)$ and $\sigma^{\perp} = \big(\sqrt2/\sqrt3,\ 1/\sqrt3\big)$ respectively. Arguing as in the proof of Theorem 8.20, we write
\[ \int_{\mathbb{T}^2}\Big|\operatorname{card}\big(R(Q+t) \cap \mathbb{Z}^2\big) - R^2\Big|^2 dt = R^4\sum_{m\ne 0}|\hat\chi_Q(Rm)|^2 \]

⎧ ⎪ ⎪ ⎪ ⎪ 4⎨ ≤R ⎪ ⎪ ⎪ ⎪ ⎩

+

m0 , |m·σ|≤R−1

+

+

m0 , |m·σ⊥ |≤R−1

R−1 0 . ρ→+∞

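Remark 10.5 The lower bound
\[ \|\hat\chi_C(\rho\,\cdot)\|_{L^2([0,2\pi))} \ge c\rho^{-3/2} \]
may fail. Indeed, let $Q = [-1/2, 1/2]^2$ be a unit square and let $k \in \mathbb{N}$. Then, for every $p > 1$, (8.11) implies
\[ \int_0^{2\pi}|\hat\chi_Q(k\cos\theta, k\sin\theta)|^p\,d\theta = 8\int_0^{\pi/4}\Big|\frac{\sin(\pi k\cos\theta)}{\pi k\cos\theta}\,\frac{\sin(\pi k\sin\theta)}{\pi k\sin\theta}\Big|^p d\theta \le c_p\frac{1}{k^{2p}}\int_0^{\pi/4}\Big|\frac{\sin(\pi k\cos\theta)}{\sin\theta}\Big|^p d\theta \]
\[ \le c_p\frac{1}{k^{2p}}\int_0^{\pi/4}\theta^{-p}\big|\sin\big(2\pi k\sin^2(\theta/2)\big)\big|^p d\theta \le c_p\frac{1}{k^{2p}}\int_0^{k^{-1/2}} k^p\theta^p\,d\theta + c_p\frac{1}{k^{2p}}\int_{k^{-1/2}}^{\pi/4}\theta^{-p}\,d\theta \le c_p\,k^{-(3p+1)/2}, \]
which is $o(k^{-p-1})$ as $k \to +\infty$.

For a triangle we have the expected lower bound.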

Downloaded from https:/www.cambridge.org/core. Cornell University Library, on 04 Jun 2017 at 03:22:55, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781107358379.011


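Lemma 10.6 Let $T$ be a triangle in $\mathbb{R}^2$ and let $1 < p \le +\infty$. Then there exists a constant $c_p > 0$ such that, for large $\rho$,
\[ \|\hat\chi_T(\rho\,\cdot)\|_{L^p([0,2\pi))} \ge c_p\,\rho^{-1-1/p}. \]

Proof We may assume that $1 < p < +\infty$. Let $\Theta = (\cos\theta, \sin\theta)$ and let
\[ \omega(t) = \frac{-1}{2\pi i\rho}\,e^{-2\pi i\rho\Theta\cdot t}\,\Theta. \]
As in the proof of Theorem 8.25, we have $\operatorname{div}(\omega(t)) = e^{-2\pi i\rho\Theta\cdot t}$. Then the divergence theorem implies
\[ \hat\chi_T(\rho\Theta) = \int_T e^{-2\pi i\rho\Theta\cdot t}\,dt = \int_{\partial T}\omega(s)\cdot\nu(s)\,ds, \]
where $ds$ is the 1-dimensional measure on the three sides $\lambda_1, \lambda_2, \lambda_3$, while $\nu(s)$ is the outward unit normal vector at the point $s$. Since $T$ is a triangle, $\nu(s)$ takes only three values: $\nu_1, \nu_2, \nu_3$.

Then
\[ \hat\chi_T(\rho\Theta) = -\frac{\Theta\cdot\nu_1}{2\pi i\rho}\int_{\lambda_1} e^{-2\pi i\rho\Theta\cdot s}\,ds - \frac{\Theta\cdot\nu_2}{2\pi i\rho}\int_{\lambda_2} e^{-2\pi i\rho\Theta\cdot s}\,ds - \frac{\Theta\cdot\nu_3}{2\pi i\rho}\int_{\lambda_3} e^{-2\pi i\rho\Theta\cdot s}\,ds := A(\rho,\Theta) + B(\rho,\Theta) + C(\rho,\Theta). \]
We may assume that
\[ \lambda_1 = \Big\{(s, 0) : -\frac12 \le s \le \frac12\Big\} \]
is the base of $T$. It is enough to prove that, for a given $\delta$, we have
\[ \int_{\frac{\pi}{2}-\delta}^{\frac{\pi}{2}+\delta}|\hat\chi_T(\rho\Theta)|^p\,d\theta \ge c\,\rho^{-p-1}. \]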


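We show that
\[ \int_{\frac{\pi}{2}-\delta}^{\frac{\pi}{2}+\delta}|A(\rho,\Theta)|^p\,d\theta \ge c\,\rho^{-p-1}. \]
Indeed, since $|\Theta\cdot\nu_1| = |\sin\theta|$ we have, for large $\rho$ and suitable constants,
\[ \int_{\frac{\pi}{2}-\delta}^{\frac{\pi}{2}+\delta}|A(\rho,\Theta)|^p\,d\theta = c_p\frac{1}{\rho^p}\int_{\frac{\pi}{2}-\delta}^{\frac{\pi}{2}+\delta}\Big|\sin\theta\int_{-1/2}^{1/2} e^{-2\pi i\rho s\cos\theta}\,ds\Big|^p d\theta = c_p\frac{1}{\rho^p}\int_{\frac{\pi}{2}-\delta}^{\frac{\pi}{2}+\delta}\Big|\sin\theta\,\frac{\sin(\pi\rho\cos\theta)}{\pi\rho\cos\theta}\Big|^p d\theta \]
\[ \ge c_p\frac{1}{\rho^p}\int_{\frac{\pi}{2}}^{\frac{\pi}{2}+c_1/\rho}\Big|\sin\theta\,\frac{\sin(\pi\rho\cos\theta)}{\pi\rho\cos\theta}\Big|^p d\theta \ge c_p\frac{1}{\rho^{p+1}}\int_0^{\pi/4}\frac{(\sin x)^p}{x^p}\,dx \ge c_p\,\rho^{-p-1}. \]
As for $B(\rho,\Theta)$ and $C(\rho,\Theta)$, the restriction $|\theta - \frac{\pi}{2}| \le \delta$ and a similar computation lead us to evaluate two integrals of the form
\[ \frac{1}{\rho^p}\int_{c_3}^{c_4}\Big|\frac{\sin(2\pi\rho x)}{\rho x}\Big|^p dx, \]
where $0 < c_3 < c_4$. Then
\[ \int_{\frac{\pi}{2}-\delta}^{\frac{\pi}{2}+\delta}|B(\rho,\Theta)|^p\,d\theta + \int_{-\frac{\pi}{2}-\delta}^{-\frac{\pi}{2}+\delta}|C(\rho,\Theta)|^p\,d\theta \le c_5\,\rho^{-2p}. \]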

10.4 Irregularities of distribution for convex bodies

In this section we use the results on the average decay of Fourier transforms to study the discrepancy of a finite sequence $\{t(j)\}_{j=1}^{N} \subset \mathbb{T}^2$ with respect to a family $\{B_\alpha\}$ of subsets of $\mathbb{T}^2$. If the family is too large or too small, then the problem may have little interest. It is meaningful to consider families of discs, rectangles, polygons, convex bodies, ... and look for finite sequences which yield a small discrepancy for the whole family. Let $C$ be a convex body with diameter less than 1 and let $\varepsilon\sigma^{-1}(C) - t$ be the rotated, dilated and translated copy of $C$, where $\sigma \in SO(2)$, $0 < \varepsilon \le 1$ and $t \in \mathbb{T}^2$. Note that the assumption on the diameter of $C$ makes the projection of $\varepsilon\sigma(C) - t$ from $\mathbb{R}^2$ on $\mathbb{T}^2$ injective. We define the discrepancies


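\[ D_N^{C,\{t(1),\ldots,t(N)\}} := -N|C| + \sum_{j=1}^{N}\chi_C(t(j)), \tag{10.14} \]
\[ D_N^{C,\{t(1),\ldots,t(N)\}}(t) = D_N(t) := -N|C| + \sum_{j=1}^{N}\chi_{C-t}(t(j)), \]
\[ D_N^{C,\{t(1),\ldots,t(N)\}}(\sigma,t) = D_N(\sigma,t) := -N|C| + \sum_{j=1}^{N}\chi_{\sigma^{-1}(C)-t}(t(j)), \]
\[ D_N^{C,\{t(1),\ldots,t(N)\}}(\varepsilon,\sigma,t) = D_N(\varepsilon,\sigma,t) := -N\varepsilon^2|C| + \sum_{j=1}^{N}\chi_{\varepsilon\sigma^{-1}(C)-t}(t(j)). \]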

When no confusion will arise we simply write $D_N(t)$ in place of $D_N(\sigma, t)$ or $D_N(\varepsilon, \sigma, t)$. We show that the function $t \mapsto D_N^C(\varepsilon, \sigma, t) = D_N(t)$ has Fourier series
\[ \sum_{k\ne 0}\Big(\sum_{j=1}^{N} e^{2\pi i k\cdot t(j)}\Big)\,\varepsilon^2\hat\chi_C(\varepsilon\sigma(k))\,e^{2\pi i k\cdot t}, \tag{10.15} \]
where we note that here $\hat\chi_C$ is a Fourier transform on $\mathbb{R}^2$. We compute $\hat D_N(k)$. Since
\[ \int_{\mathbb{T}^2}\Big(\sum_{j=1}^{N}\chi_{\varepsilon\sigma^{-1}(C)-t}(t(j))\Big)dt = \sum_{j=1}^{N}\int_{\mathbb{T}^2}\chi_{\varepsilon\sigma^{-1}(C)}(t(j)+t)\,dt = \sum_{j=1}^{N}\int_{\mathbb{T}^2}\chi_{\varepsilon\sigma^{-1}(C)}(t)\,dt = N\varepsilon^2|C| \]
we have $\hat D_N(0) = 0$. For $k \ne 0$ we have
\[ \hat D_N(k) = \int_{\mathbb{T}^2}\Big(\sum_{j=1}^{N}\chi_{\varepsilon\sigma^{-1}(C)-t}(t(j)) - N\varepsilon^2|C|\Big)e^{-2\pi i k\cdot t}\,dt = \sum_{j=1}^{N}\int_{\mathbb{T}^2}\chi_{\varepsilon\sigma^{-1}(C)}(t(j)+t)\,e^{-2\pi i k\cdot t}\,dt \]
\[ = \sum_{j=1}^{N} e^{2\pi i k\cdot t(j)}\int_{\mathbb{T}^2}\chi_{\varepsilon\sigma^{-1}(C)}(u)\,e^{-2\pi i k\cdot u}\,du = \sum_{j=1}^{N} e^{2\pi i k\cdot t(j)}\,\hat\chi_{\varepsilon\sigma^{-1}(C)}(k) = \sum_{j=1}^{N} e^{2\pi i k\cdot t(j)}\,\varepsilon^2\hat\chi_C(\varepsilon\sigma(k)). \]

We now recall the Monte Carlo method. Let $H$ be a measurable subset of $\mathbb{T}^2$. For every $j = 1, \ldots, N$ let $du_j$ denote the Lebesgue measure on $\mathbb{T}^2$. We define the Monte Carlo error (we also call it the Monte Carlo discrepancy; see footnote 3)
\[ D_N^{MC} := \Big\{\int_{\mathbb{T}^2}\cdots\int_{\mathbb{T}^2}\Big|N|H| - \sum_{j=1}^{N}\chi_H(u_j)\Big|^2 du_1\cdots du_N\Big\}^{1/2}. \]

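This term does not change if we introduce an extra translation, which allows us to apply Parseval's identity. Then by (10.15) we have
\[ \big(D_N^{MC}\big)^2 = \int_{\mathbb{T}^2}\cdots\int_{\mathbb{T}^2}\int_{\mathbb{T}^2}\Big|N|H| - \sum_{j=1}^{N}\chi_{H-s}(u_j)\Big|^2 ds\,du_1\cdots du_N \tag{10.16} \]
\[ = \int_{\mathbb{T}^2}\cdots\int_{\mathbb{T}^2}\sum_{0\ne k\in\mathbb{Z}^2}|\hat\chi_H(k)|^2\,\Big|\sum_{j=1}^{N} e^{2\pi i k\cdot u_j}\Big|^2 du_1\cdots du_N \]
\[ = \sum_{0\ne k\in\mathbb{Z}^2}|\hat\chi_H(k)|^2\int_{\mathbb{T}^2}\cdots\int_{\mathbb{T}^2}\sum_{j=1}^{N}\sum_{\ell=1}^{N} e^{2\pi i k\cdot u_j}\,e^{-2\pi i k\cdot u_\ell}\,du_1\cdots du_N \]
\[ = \sum_{0\ne k\in\mathbb{Z}^2}|\hat\chi_H(k)|^2\Big(N + \sum_{j\ne\ell}\int_{\mathbb{T}^2}\int_{\mathbb{T}^2} e^{2\pi i k\cdot u_j}\,e^{-2\pi i k\cdot u_\ell}\,du_j\,du_\ell\Big) \]
\[ = N\sum_{0\ne k\in\mathbb{Z}^2}|\hat\chi_H(k)|^2 = N\big(-|H|^2 + \|\chi_H\|_{L^2(\mathbb{T}^2)}^2\big) = N\big(|H| - |H|^2\big). \]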
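The identity $(D_N^{MC})^2 = N(|H| - |H|^2)$ is the variance of a binomial count, and it can be checked by simulation. The sketch below is a hedged illustration: it assumes the particular set $H = [0, s)^2$ (identified with a subset of $\mathbb{T}^2$), and the helper names, the seed and the number of trials are choices of this example.

```python
import random


def monte_carlo_mean_square(n_points, side, trials, seed=0):
    """Empirical mean of (N|H| - #{j : u_j in H})^2 for H = [0, side)^2, with uniform random u_j."""
    rng = random.Random(seed)
    area = side * side
    total = 0.0
    for _ in range(trials):
        # A point lies in H when both of its coordinates fall below `side`.
        hits = sum(
            1 for _ in range(n_points) if rng.random() < side and rng.random() < side
        )
        total += (n_points * area - hits) ** 2
    return total / trials


def predicted_mean_square(n_points, side):
    """The value N(|H| - |H|^2) obtained in (10.16)."""
    area = side * side
    return n_points * (area - area * area)


if __name__ == "__main__":
    print(monte_carlo_mean_square(50, 0.5, 4000), predicted_mean_square(50, 0.5))
```

The empirical value fluctuates around the prediction; the square root of either quantity grows like $\sqrt N$.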

Observe that, up to sets of measure zero, we have $|H| - |H|^2 = 0$ if and only if $H = \emptyset$ or $H = \mathbb{T}^2$. In all other cases we have $D_N^{MC} = c\sqrt N$. One would guess that suitable choices of the sequence $\{t(j)\}_{j=1}^{N}$ should improve the order of increasing $\sqrt N$ of the Monte Carlo discrepancy. The following theorem, proved independently by Beck [13] and Montgomery [123] (see also [28, 47] and the results of Schmidt [152, 153]) shows that the $L^2$ discrepancy cannot go beyond the lower bound $\sqrt[4]{N}$.

Theorem 10.7 (Beck–Montgomery) For every convex body $C \subset \mathbb{T}^2$ having diameter less than 1 there exists a constant $c > 0$ such that for every finite set $\{t(j)\}_{j=1}^{N} \subset \mathbb{T}^2$ we have
\[ \Big\{\int_0^1\int_{SO(2)}\int_{\mathbb{T}^2}|D_N(\varepsilon,\sigma,t)|^2\,dt\,d\sigma\,d\varepsilon\Big\}^{1/2} \ge cN^{1/4}. \tag{10.17} \]

Footnote 3: The comparison between the Monte Carlo method and the Riemann sums associated with low-discrepancy sequences is a classical problem in the study of high-dimensional integration. See e.g. [131, Ch. 1]; also [71] and [158, Ch. 2].


Proof Let $0 < q < 1$ and $0 < r \le 1$. By (10.15) and Lemma 10.4 we have
\[ \frac1r\int_{qr}^{r}\int_{SO(2)}\int_{\mathbb{T}^2}|D_N(\varepsilon,\sigma,t)|^2\,dt\,d\sigma\,d\varepsilon \tag{10.18} \]
\[ = \frac1r\int_{qr}^{r}\int_{SO(2)}\sum_{m\ne 0}\Big|\sum_{j=1}^{N} e^{2\pi i t(j)\cdot m}\Big|^2\varepsilon^4\,|\hat\chi_C(\varepsilon\sigma(m))|^2\,d\sigma\,d\varepsilon \approx \sum_{m\ne 0}\Big|\sum_{j=1}^{N} e^{2\pi i t(j)\cdot m}\Big|^2\frac1r\int_{\{qr\le|\xi|\le r\}}|\hat\chi_C(|m|\xi)|^2\,|\xi|^3\,d\xi \]
\[ \approx \sum_{m\ne 0}\Big|\sum_{j=1}^{N} e^{2\pi i t(j)\cdot m}\Big|^2 r^2|m|^{-2}\int_{\{qr|m|\le|\xi|\le r|m|\}}|\hat\chi_C(\xi)|^2\,d\xi \approx \sum_{m\ne 0}\Big|\sum_{j=1}^{N} e^{2\pi i t(j)\cdot m}\Big|^2 r^2|m|^{-2}(1+r|m|)^{-1}, \]
where the implicit constants associated with $\approx$ do not depend on $N$ or $r$. We now apply (10.18) first with $r = 1$ and second with $r = kN^{-1/2}$ (the constant $k$ will be chosen later):
\[ \int_q^1\int_{SO(2)}\int_{\mathbb{T}^2}|D_N(\varepsilon,\sigma,t)|^2\,dt\,d\sigma\,d\varepsilon \approx \sum_{m\ne 0}\Big|\sum_{j=1}^{N} e^{2\pi i t(j)\cdot m}\Big|^2|m|^{-3} \tag{10.19} \]
\[ \ge c\inf_{m\ne 0}\Big\{|m|^{-1}k^{-2}N\big(1 + kN^{-1/2}|m|\big)\Big\}\times\Big\{\sum_{m\ne 0}\Big|\sum_{j=1}^{N} e^{2\pi i t(j)\cdot m}\Big|^2 k^2N^{-1}|m|^{-2}\big(1 + kN^{-1/2}|m|\big)^{-1}\Big\} \]
\[ \approx k^{-1}N^{1/2}\Big\{k^{-1}N^{1/2}\int_{qkN^{-1/2}}^{kN^{-1/2}}\int_{SO(2)}\int_{\mathbb{T}^2}|D_N(\varepsilon,\sigma,t)|^2\,dt\,d\sigma\,d\varepsilon\Big\}. \]
Since $qkN^{-1/2} \le \varepsilon \le kN^{-1/2}$ there exists $\delta > 0$ such that, for suitable constants $q$ and $k$, we have
\[ \delta \le q^2k^2|C| \le N\varepsilon^2|C| \le k^2|C| \le 1 - \delta. \]
Then (see footnote 4)
\[ |D_N(\varepsilon,\sigma,t)| = \big|\text{an integer} - N\varepsilon^2|C|\big| \ge \delta \]
for every $\sigma$, $t$ and $\varepsilon \in [qkN^{-1/2}, kN^{-1/2}]$. Then
\[ N^{1/2}\int_{qkN^{-1/2}}^{kN^{-1/2}}\int_{SO(2)}\int_{\mathbb{T}^2}|D_N(\varepsilon,\sigma,t)|^2\,dt\,d\sigma\,d\varepsilon > c > 0. \tag{10.20} \]
Finally, by (10.19) we have
\[ \int_q^1\int_{SO(2)}\int_{\mathbb{T}^2}|D_N(\varepsilon,\sigma,t)|^2\,dt\,d\sigma\,d\varepsilon \ge cN^{1/2}. \]

Footnote 4: This is a trivial estimate and the proof consists of blowing up this error.

Corollary 10.8 For every convex body $C \subset \mathbb{T}^2$ having diameter less than 1 and for every finite set $\{t(j)\}_{j=1}^{N} \subset \mathbb{T}^2$ there exists a (dilated, translated and rotated) copy $\tilde C$ of $C$ such that
\[ \big|D_N^{\tilde C,\{t(1),\ldots,t(N)\}}\big| \ge cN^{1/4}. \]

We show that in certain cases the dilation is not necessary (and in other cases it cannot be avoided). We first need an estimate (essentially due to Cassels [40, 123]) for the sums $\sum_{j=1}^{N} e^{2\pi i m\cdot t(j)}$ which appear in the Fourier coefficients of $D_N(t)$, see (10.15).

Lemma 10.9 (Cassels) For every positive integer $N$ let
\[ Q_N := \big\{m = (m_1, m_2) \in \mathbb{Z}^2 : |m_1| \le \sqrt{2N},\ |m_2| \le \sqrt{2N}\big\}. \tag{10.21} \]
Then for every finite set $\{t(j)\}_{j=1}^{N} \subset \mathbb{T}^2$ we have
\[ \sum_{0\ne m\in Q_N}\Big|\sum_{j=1}^{N} e^{2\pi i m\cdot t(j)}\Big|^2 \ge N^2. \tag{10.22} \]

Proof We add $N^2$ on both sides of (10.22), which becomes
\[ \sum_{|m_1|\le\sqrt{2N}}\ \sum_{|m_2|\le\sqrt{2N}}\Big|\sum_{j=1}^{N} e^{2\pi i m\cdot t(j)}\Big|^2 \ge 2N^2. \]
It is enough to prove that
\[ \sum_{|m_1|\le\sqrt{2N}}\ \sum_{|m_2|\le\sqrt{2N}}\Big|\sum_{j=1}^{N} e^{2\pi i m\cdot t(j)}\Big|^2 \ge N\big(\sqrt{2N} + 1\big)^2. \tag{10.23} \]


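Let $t(\ell) = (t_1(\ell), t_2(\ell))$. Then the LHS of (10.23) is larger than
\[
\begin{aligned}
&\sum_{|m_1|\le\sqrt{2N}}\ \sum_{|m_2|\le\sqrt{2N}} \left(1-\frac{|m_1|}{\left[\sqrt{2N}\right]+1}\right)\left(1-\frac{|m_2|}{\left[\sqrt{2N}\right]+1}\right) \left|\sum_{j=1}^N e^{2\pi i m\cdot t(j)}\right|^2 \\
&\quad= \sum_{|m_1|\le\sqrt{2N}}\ \sum_{|m_2|\le\sqrt{2N}} \left(1-\frac{|m_1|}{\left[\sqrt{2N}\right]+1}\right)\left(1-\frac{|m_2|}{\left[\sqrt{2N}\right]+1}\right) \sum_{j=1}^N\sum_{k=1}^N e^{2\pi i m\cdot(t(j)-t(k))} \\
&\quad= \sum_{j=1}^N\sum_{k=1}^N \left[\sum_{|m_1|\le\sqrt{2N}} \left(1-\frac{|m_1|}{\left[\sqrt{2N}\right]+1}\right) e^{2\pi i m_1(t_1(j)-t_1(k))}\right] \\
&\qquad\qquad\times \left[\sum_{|m_2|\le\sqrt{2N}} \left(1-\frac{|m_2|}{\left[\sqrt{2N}\right]+1}\right) e^{2\pi i m_2(t_2(j)-t_2(k))}\right] \\
&\quad= \sum_{j=1}^N\sum_{k=1}^N K_{\left[\sqrt{2N}\right]}\big(t_1(j)-t_1(k)\big)\, K_{\left[\sqrt{2N}\right]}\big(t_2(j)-t_2(k)\big) ,
\end{aligned} \tag{10.24}
\]
where $K_M$ is the Fejér kernel on $\mathbb{T}$ (see (6.7)). Since $K_M(x) \ge 0$ for every $x$, the last term in (10.24) is not smaller than the 'diagonal':
\[
\sum_{j=1}^N K_{\left[\sqrt{2N}\right]}\big(t_1(j)-t_1(j)\big)\, K_{\left[\sqrt{2N}\right]}\big(t_2(j)-t_2(j)\big) = N\, K_{\left[\sqrt{2N}\right]}(0)\, K_{\left[\sqrt{2N}\right]}(0) = N\left(\left[\sqrt{2N}\right]+1\right)^2 .
\]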
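The inequality (10.22) can be checked numerically on small examples. The following is only an illustrative sketch: the function name `cassels_sum`, the choice $N = 25$ and the two test sets are assumptions made here, not part of the text.

```python
import cmath
import math
import random


def cassels_sum(points):
    """Left-hand side of (10.22): sum over nonzero m in Q_N of
    |sum_j exp(2 pi i m.t(j))|^2, where N = len(points)."""
    n = len(points)
    bound = math.isqrt(2 * n)  # integers m with |m| <= sqrt(2N)
    total = 0.0
    for m1 in range(-bound, bound + 1):
        for m2 in range(-bound, bound + 1):
            if m1 == 0 and m2 == 0:
                continue
            s = sum(cmath.exp(2j * math.pi * (m1 * x + m2 * y)) for x, y in points)
            total += abs(s) ** 2
    return total


rng = random.Random(0)  # fixed seed, so the run is reproducible
random_points = [(rng.random(), rng.random()) for _ in range(25)]
grid_points = [(p / 5, q / 5) for p in range(5) for q in range(5)]

print(cassels_sum(random_points) >= 25 ** 2)
print(cassels_sum(grid_points) >= 25 ** 2)
```

For the $5 \times 5$ grid only the frequencies $m \in 5\mathbb{Z}^2$ contribute, each with $|\cdot|^2 = 625$, which makes the sum easy to verify by hand.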

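We have the following result [172].

Theorem 10.10 Let $T \subset \mathbb{T}^2$ be a triangle having sides of length less than 1. Then for every finite set $\{t(j)\}_{j=1}^N \subset \mathbb{T}^2$ we have
\[
\int_{\mathbb{T}^2}\int_{SO(2)} |D_N(\sigma,t)|^2\,d\sigma\,dt \ge c\,N^{1/2} .
\]

Corollary 10.11 Given a triangle $T \subset \mathbb{T}^2$ having sides of length less than 1, and given a finite set $\{t(j)\}_{j=1}^N \subset \mathbb{T}^2$, there exists a copy (translated and rotated) $\widetilde{T}$ of $T$ such that
\[
\left| -N|\widetilde{T}| + \operatorname{card}\left(\widetilde{T}\cap\{t(j)\}_{j=1}^N\right) \right| \ge cN^{1/4} .
\]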


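Proof of Theorem 10.10 Applying (10.15) with $\varepsilon = 1$, Parseval's identity and Lemmas 10.6 and 10.9 we obtain
\[
\begin{aligned}
\int_{\mathbb{T}^2}\int_{SO(2)} |D_N(\sigma,t)|^2\,d\sigma\,dt
&= \sum_{0\ne m\in\mathbb{Z}^2} \left|\sum_{j=1}^N e^{2\pi i m\cdot t(j)}\right|^2 \int_{SO(2)} \left|\widehat{\chi}_T(\sigma(m))\right|^2 d\sigma \\
&\ge c \sum_{0\ne m\in Q_N} \left|\sum_{j=1}^N e^{2\pi i m\cdot t(j)}\right|^2 |m|^{-3}
\ge c\,N^{-3/2} \sum_{0\ne m\in Q_N} \left|\sum_{j=1}^N e^{2\pi i m\cdot t(j)}\right|^2 \ge c\,N^{1/2} ,
\end{aligned}
\]
where $Q_N$ has been defined in (10.21).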

A variant of the above argument can be used to give a different proof of Roth's theorem (see [123, Ch. 6]). So far we have seen two different estimates from below for the $L^2$ discrepancy: the logarithmic estimate in Roth's theorem and the $N^{1/4}$ estimate proved in this section. The logarithmic estimate is optimal, since Theorem 8.24 gives an upper counterpart of Roth's theorem. Now we show also that Theorem 10.7 and Corollary 10.4 cannot be improved. This result is due to Beck and Chen [15], and we provide two proofs in this chapter. The first one will depend on Kendall's result (Theorem 8.20 [106], see also [35]). The second one is probabilistic in nature and will be a particular case of an $L^p$ result. A third proof will be given in the next chapter (Remark 11.5).

Theorem 10.12 (Beck and Chen) Let $C \subset [-1/2, 1/2)^2$ be a convex body with diameter less than 1. Then for every positive integer $N$ there exists a finite set $\{t(j)\}_{j=1}^N \subset \mathbb{T}^2$ such that
\[
\int_{SO(2)}\int_{\mathbb{T}^2} |D_N(\sigma_\theta,t)|^2\,dt\,d\theta \le cN^{1/2} . \tag{10.25}
\]

Proof By Lagrange's theorem (Theorem 5.23) there are four non-negative integers $j_1, j_2, j_3, j_4$ such that $N = j_1^2 + j_2^2 + j_3^2 + j_4^2$. Arguing as in [34] we choose $a_1, a_2, a_3, a_4 \in [0,1)^2$ such that
\[
\left(a_\ell + j_\ell^{-1}\mathbb{Z}^2\right) \cap \left(a_k + j_k^{-1}\mathbb{Z}^2\right) = \emptyset \tag{10.26}
\]
whenever $\ell \ne k$. For each $\ell = 1, 2, 3, 4$, the set
\[
A_{j_\ell^2} := \left(a_\ell + j_\ell^{-1}\mathbb{Z}^2\right) \cap [0,1)^2
\]
contains $j_\ell^2$ elements. By (10.26) the union
\[
A_N = A_{j_1^2} \cup A_{j_2^2} \cup A_{j_3^2} \cup A_{j_4^2}
\]
is disjoint and therefore it contains $N$ points. Observe that for every measurable set $H \subset \mathbb{T}^2$ we have
\[
\operatorname{card}(A_N\cap H) - N|H| = \left(\operatorname{card}\left(A_{j_1^2}\cap H\right) - j_1^2|H|\right) + \dots + \left(\operatorname{card}\left(A_{j_4^2}\cap H\right) - j_4^2|H|\right) .
\]
Hence (10.25) will be a consequence of, say,
\[
\int_{SO(2)}\int_{\mathbb{T}^2} \left|\operatorname{card}\left(A_{j_1^2}\cap(\sigma(C)+t)\right) - j_1^2|C|\right|^2 dt\,d\theta \le cN^{1/2} .
\]
In other words, we may prove the theorem assuming $N = M^2$ (a square) and considering the finite set
\[
\{t(j)\}_{j=1}^N = A_N = A_{M^2} = \left(a + \left\{\left(\frac{p}{M},\frac{q}{M}\right)\right\}_{p,q\in\mathbb{Z}}\right) \cap [0,1)^2 ,
\]
where $a \in \left[0, M^{-1}\right)^2$ is given. Then
\[
\begin{aligned}
&\int_{SO(2)}\int_{\mathbb{T}^2} \left|\operatorname{card}\left(A_{M^2}\cap(\sigma(C)+t)\right) - M^2|C|\right|^2 dt\,d\sigma \\
&\quad= \int_{SO(2)}\int_{\mathbb{T}^2} \left|\operatorname{card}\left(A_{M^2}\cap(\sigma(C)+t+a)\right) - M^2|C|\right|^2 dt\,d\sigma \\
&\quad= \int_{SO(2)}\int_{\mathbb{T}^2} \left|\operatorname{card}\left(\left\{\left(\frac{p}{M},\frac{q}{M}\right)\right\}_{p,q=0}^{M-1}\cap(\sigma(C)+t)\right) - M^2|C|\right|^2 dt\,d\sigma \\
&\quad= \int_{SO(2)}\int_{[0,1)^2} \left|\operatorname{card}\left(\{(p,q)\}_{p,q\in\mathbb{Z}}\cap(\sigma(MC)+Mt)\right) - M^2|C|\right|^2 dt\,d\sigma \\
&\quad= M^{-2}\int_{SO(2)}\int_{[0,M)^2} \left|\operatorname{card}\left(\mathbb{Z}^2\cap(M\sigma(C)+v)\right) - M^2|C|\right|^2 dv\,d\sigma \\
&\quad= \int_{SO(2)}\int_{\mathbb{T}^2} \left|\operatorname{card}\left(\mathbb{Z}^2\cap(M\sigma(C)+v)\right) - M^2|C|\right|^2 dv\,d\sigma ,
\end{aligned} \tag{10.27}
\]
since the function $v \mapsto \operatorname{card}\left(\mathbb{Z}^2\cap(M\sigma(C)+v)\right) - M^2|C|$ is periodic and the square $[0,M)^2$ consists of $M^2$ disjoint copies of $[0,1)^2$. In this way we have changed our point of view slightly. Now $M$ is no longer the (square root of the) number of points, but rather a (discrete) dilation parameter, while (10.27) is the lattice point problem studied in Theorem 8.26. Then
\[
\int_{SO(2)}\int_{\mathbb{T}^2} \left|\operatorname{card}\left(A_{M^2}\cap(\sigma(C)+t)\right) - M^2|C|\right|^2 dt\,d\sigma \le cM = cN^{1/2} .
\]
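The construction of the set $A_N$ in the proof above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the specific shifts $a_\ell = \left(\ell/(5P), \ell/(5P)\right)$, with $P$ the product of the non-zero $j_\ell$, are one convenient choice satisfying (10.26); they are not the choice made in [34].

```python
from fractions import Fraction
from itertools import product
from math import isqrt


def four_squares(n):
    """Brute-force search for (j1, j2, j3, j4) with j1^2 + j2^2 + j3^2 + j4^2 == n."""
    for j1 in range(isqrt(n) + 1):
        for j2 in range(j1, isqrt(n - j1 * j1) + 1):
            for j3 in range(j2, isqrt(n - j1 * j1 - j2 * j2) + 1):
                rest = n - j1 * j1 - j2 * j2 - j3 * j3
                j4 = isqrt(rest)
                if j4 * j4 == rest:
                    return (j1, j2, j3, j4)
    raise ValueError("no representation found (excluded by Lagrange's theorem)")


def union_of_grids(n):
    """Union of four shifted square grids in [0, 1)^2 with j_l^2 points each."""
    js = four_squares(n)
    scale = 1
    for j in js:
        if j:
            scale *= j  # P = product of the non-zero j_l
    points = set()
    for ell, j in enumerate(js, start=1):
        # Assumed shift a_l = (l/(5P), l/(5P)): two shifts differ by (l-k)/(5P),
        # which is never a multiple of 1/(j_l j_k), so the grids are disjoint.
        shift = Fraction(ell, 5 * scale)
        for p, q in product(range(j), repeat=2):
            points.add((shift + Fraction(p, j), shift + Fraction(q, j)))
    return points


print(four_squares(30), len(union_of_grids(30)))
```

Exact rational arithmetic (`Fraction`) is used so that disjointness of the four grids is tested without rounding error.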

The second proof of Theorem 10.12 comes from the case $p = 2$ in the next theorem, proved by Chen [22, 44]. Let $M$ be a large positive integer and let $N = M^2$. Let
\[
\{t(j)\}_{j=1}^N := \frac{1}{M}\mathbb{Z}^2 \cap \left[-\frac12,\frac12\right)^2 \tag{10.28}
\]
be the restriction of the shrunk lattice $\frac{1}{M}\mathbb{Z}^2$ to the unit square. We replace every point $t(j)$ with a random point $u_j$ in the small square
\[
S_j = t(j) + \left[-\frac{1}{2M},\frac{1}{2M}\right)^2 \tag{10.29}
\]
centred at $t(j)$ and having side length $1/M$ (in statistics this choice may be termed jittered discrepancy, see e.g. [16, 110]).

Theorem 10.13 (Chen) Let $C \subset [0,1)^2$ be a convex body. Let $M$ be a large positive integer and let $N = M^2$. Let $S_1, \dots, S_N$ be the $N$ small squares with area $1/N$ in (10.29). Then for every $p < +\infty$ we have (see (10.14))
\[
\left\{ N^N \int_{S_N}\cdots\int_{S_1} \left| D_N^{C,\{u_1,\dots,u_N\}} \right|^p du_1\cdots du_N \right\}^{1/p} \le c_p N^{1/4} ,
\]

where $c_p$ is independent of $N$ and, for every $j$, $du_j$ is the Lebesgue measure.

Proof By Corollary 6.6 we may assume that $p$ is an even integer. For every $S_j$ we have
\[
\int_{S_j} \chi_{C\cap S_j}(u_j)\,du_j = |C\cap S_j| . \tag{10.30}
\]
Since $C$ is convex, there are $M^\#$ ($\approx M$) small squares which intersect the boundary of $C$. Up to a reordering, we may write the set of these small squares as $\mathcal{A} = \{S_k\}_{k=1}^{M^\#}$. Observe that for every $S \notin \mathcal{A}$ and every $u_S \in S$ we have
\[
\chi_{C\cap S}(u_S) - N|C\cap S| = 0 . \tag{10.31}
\]
For every $j = 1, \dots, N$ let $u_j \in S_j$. Then (10.31) implies
\[
\begin{aligned}
D_N^{C,\{u_1,\dots,u_N\}} &= -N|C| + \sum_{j=1}^N \chi_C(u_j) = -N\sum_{j=1}^N |C\cap S_j| + \sum_{j=1}^N \chi_{C\cap S_j}(u_j) \\
&= \sum_{S\in\mathcal{A}} \left(\chi_{C\cap S}(u_S) - N|C\cap S|\right) .
\end{aligned}
\]
Hence
\[
\left(D_N^{C,\{u_1,\dots,u_N\}}\right)^p = \sum_{S_1,\dots,S_p\in\mathcal{A}} \left(N|C\cap S_1| - \chi_{C\cap S_1}(u_{S_1})\right)\cdots\left(N|C\cap S_p| - \chi_{C\cap S_p}(u_{S_p})\right) .
\]
We have
\[
\begin{aligned}
&N^N\int_{S_N}\cdots\int_{S_1} \left(D_N^{C,\{u_1,\dots,u_N\}}\right)^p du_1\cdots du_N \\
&\quad= N^N\int_{S_N}\cdots\int_{S_1} \sum_{S_1,\dots,S_p\in\mathcal{A}} \left(\chi_{C\cap S_1}(u_{S_1}) - N|C\cap S_1|\right)\cdots\left(\chi_{C\cap S_p}(u_{S_p}) - N|C\cap S_p|\right) du_1\cdots du_N \\
&\quad= N^{M^\#}\sum_{S_1,\dots,S_p\in\mathcal{A}} \int_{S_{M^\#}}\cdots\int_{S_1} \left(\chi_{C\cap S_1}(u_{S_1}) - N|C\cap S_1|\right)\cdots\left(\chi_{C\cap S_p}(u_{S_p}) - N|C\cap S_p|\right) du_1\cdots du_{M^\#} .
\end{aligned} \tag{10.32}
\]
By (10.30) the term in (10.32) is zero whenever at least one of the terms $\chi_{C\cap S_i}(u_{S_i}) - N|C\cap S_i|$ appears exactly once in the above product. Then the non-zero contributions to the sum $\sum_{S_1,\dots,S_p\in\mathcal{A}}$ in (10.32) may come only when each term appears at least twice. For simplicity, let us first think that the small squares $S_1, S_2, \dots, S_p \in \mathcal{A}$ are subdivided into $p/2$ pairs, which can be chosen in
\[
\binom{M^\#}{p/2} = \frac{M^\#\left(M^\#-1\right)\cdots\left(M^\#-p/2+1\right)}{(p/2)!} \le c_p M^{p/2}
\]
ways. In general, the small squares are grouped into sets of two, three, four, ... Then we have to choose $q \le p/2$ sets of two, three, four, ... This can be done in at most $c_p M^{p/2}$ ways. Observe that we have
\[
\left|\chi_{C\cap S_j}(u_j) - N|C\cap S_j|\right| \le 1
\]
for every $j$ and every $u_j \in S_j$. Then
\[
N^N\int_{S_N}\cdots\int_{S_1} \left| D_N^{C,\{u_1,\dots,u_N\}} \right|^p du_1\cdots du_N \le c_p M^{p/2} = c_p N^{p/4} .
\]
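The key point of the proof, that only the small squares meeting the boundary of $C$ contribute to the discrepancy, can be illustrated with a short simulation. This is a sketch under assumptions made here: $C$ is taken to be a disc of radius $r$ centred at the origin, the small squares tile $[-1/2,1/2)^2$ (rather than being centred at lattice points as in (10.29)), and all function names are hypothetical.

```python
import math
import random


def jittered_points(m, rng):
    """One uniform random point in each of the m*m small squares of side 1/m."""
    pts = []
    for p in range(m):
        for q in range(m):
            x0, y0 = -0.5 + p / m, -0.5 + q / m  # lower-left corner of the square
            pts.append((x0 + rng.random() / m, y0 + rng.random() / m))
    return pts


def classify_squares(m, r):
    """Count the small squares inside the disc and those meeting its boundary."""
    inside = boundary = 0
    for p in range(m):
        for q in range(m):
            x0, y0 = -0.5 + p / m, -0.5 + q / m
            x1, y1 = x0 + 1 / m, y0 + 1 / m
            # squared distance from the origin to the farthest corner
            far = max(x0 * x0, x1 * x1) + max(y0 * y0, y1 * y1)
            # squared distance from the origin to the nearest point of the square
            nx = min(max(0.0, x0), x1)
            ny = min(max(0.0, y0), y1)
            near = nx * nx + ny * ny
            if far <= r * r:
                inside += 1
            elif near <= r * r:
                boundary += 1
    return inside, boundary


def count_in_disc(points, r):
    return sum(1 for x, y in points if x * x + y * y <= r * r)


rng = random.Random(1)  # fixed seed, so the run is reproducible
m, r = 20, 0.37
inside, boundary = classify_squares(m, r)
count = count_in_disc(jittered_points(m, rng), r)
discrepancy = count - m * m * math.pi * r * r
print(inside, boundary, count, discrepancy)
```

Whatever the random choices, the count lies between `inside` and `inside + boundary`, so the discrepancy is at most the number of boundary squares in absolute value; the theorem says that on average it is much smaller, of order $M^{1/2} = N^{1/4}$.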

For the case $p = \infty$ a different argument due to Beck gives the upper estimate $cN^{1/4}\log^{1/2} N$ [12, 14, 50].


Exercises

1) Let $Q = [-1/2, 1/2]^2$ and let $1 < p < \infty$. Prove the existence of a positive constant $c_p$ such that
\[
\left\| \widehat{\chi}_Q(\rho\,\cdot) \right\|_{L^p([0,2\pi))} \ge c_p\,\rho^{-3/2-1/(2p)}
\]
for every real $\rho \ge 2$.

2) Let $Q = [-1/2, 1/2]^2$ and let $1 < p < +\infty$. Prove the existence of a sequence $\rho_k \to +\infty$ and a positive constant $c_p$ such that
\[
\left\| \widehat{\chi}_Q(\rho_k\,\cdot) \right\|_{L^p([0,2\pi))} \ge c_p\,\rho_k^{-1-1/p} .
\]

3) Use the argument in the proof of Lemma 10.6 to give another proof of Lemma 8.22.

4) Prove that the Fourier transform of the characteristic function $\chi_1(t)$ of the unit disc $B(0,1) \subset \mathbb{R}^2$, centred at the origin, admits a diverging sequence of zeros.

5) Let $T$ be a triangle in $\mathbb{R}^2$ and let $1 < p < \infty$. Prove the existence of a constant $c_p > 0$ such that, for large $R$,
\[
\left\{ \int_{SO(2)}\int_{\mathbb{T}^2} \left| \operatorname{card}\left(\mathbb{Z}^2\cap(R\sigma_\theta(T)+t)\right) - R^2|T| \right|^p dt\,d\theta \right\}^{1/p} \ge c_p R^{1-1/p} ,
\]

where σθ denotes rotation by the angle θ. 6) Let Q = [−1/2, 1/2]2 . For 0 ≤ x < 1/4 and 0 ≤ θ < 2π let π (x, θ) be the half-plane with slope θ and distance x from the origin. Let Aθ,t = Q ∩ π (t, θ). Prove the existence of a sequence N → +∞ of positive integers such that N ⊂ Q satisfying there exists a distribution u j j=1 2π N χAθ,t u j dθdt ≤ c log2 (N) . N Aθ,t − T2 0 j=1

7) Let $\{u_j\}_{j=1}^N \subset [-1/2, 1/2]^2$. Prove the existence of a convex body $C \subset [-1/2, 1/2]^2$ and a positive constant $c$ such that
\[
\left| D_N^{C,\{u_1,\dots,u_N\}} \right| \ge c\,N^{1/3} .
\]


11 Discrepancy in high dimension and Bessel functions

In this chapter we investigate the discrepancy associated with the translates of a $d$-dimensional ball $rB$ in $\mathbb{T}^d$ (here $B$ is the unit ball centred at the origin, and $0 < r < 1/2$). We denote by $\chi_1$ the characteristic function of $B$. In Lemma 8.14 we used a geometric argument to prove an upper bound for $\widehat{\chi}_1(\xi)$ when $d = 2$. Here we shall use complex analysis to obtain, for every $d$, an asymptotic estimate of $\widehat{\chi}_1(\xi)$, as $|\xi| \to +\infty$. We start by recalling a probably well-known result.

Lemma 11.1 The $d$-dimensional unit ball $B = \left\{t \in \mathbb{R}^d : |t| \le 1\right\}$ has volume
\[
\omega_d = \frac{\pi^{d/2}}{\Gamma\left(\frac{d}{2}+1\right)} ,
\]
where $\Gamma$ is the gamma function (see (5.18)).

Proof By the $d$-dimensional integral formula in spherical coordinates there exists $H_d > 0$ such that for every continuous radial integrable function (see footnote 1) $f$ on $\mathbb{R}^d$ we have
\[
\int_{\mathbb{R}^d} f(t)\,dt = H_d \int_0^{+\infty} f_o(r)\,r^{d-1}\,dr , \tag{11.1}
\]
where $f_o$ is defined by $f_o(|t|) = f(t)$. We are going to compute $H_d$. On the one hand, (6.25) gives
\[
\int_{\mathbb{R}^d} e^{-|t|^2}\,dt = \int_{\mathbb{R}} e^{-t_1^2}\,dt_1 \cdots \int_{\mathbb{R}} e^{-t_d^2}\,dt_d = \pi^{d/2} .
\]
On the other hand, (11.1) gives
\[
\int_{\mathbb{R}^d} e^{-|t|^2}\,dt = H_d\int_0^{+\infty} e^{-r^2} r^{d-1}\,dr = \frac12 H_d \int_0^{+\infty} e^{-x} x^{(d-2)/2}\,dx = \frac12 H_d\,\Gamma\left(\frac{d}{2}\right) .
\]
Then
\[
H_d = \frac{2\pi^{d/2}}{\Gamma(d/2)} .
\]
Therefore, (5.19) implies
\[
\omega_d = H_d\int_0^1 r^{d-1}\,dr = \frac{1}{d}\,\frac{2\pi^{d/2}}{\Gamma\left(\frac{d}{2}\right)} = \frac{\pi^{d/2}}{\Gamma\left(\frac{d}{2}+1\right)} .
\]

Footnote 1. We say that a function $f(t)$ on $\mathbb{R}^d$ is radial if $f(t) = f(u)$ whenever $|t| = |u|$. It is easy to prove that the Fourier transform of a radial integrable function is a radial function.

Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 04 Jun 2017 at 03:23:30, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781107358379.012
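The formula of Lemma 11.1 is easy to evaluate. The following sketch (the function names are chosen here for illustration) computes $\omega_d$ and the constant $H_d$ from the proof, using the gamma function from the Python standard library.

```python
import math


def unit_ball_volume(d):
    """omega_d = pi^(d/2) / Gamma(d/2 + 1), as in Lemma 11.1."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def sphere_constant(d):
    """H_d = 2 pi^(d/2) / Gamma(d/2), the constant in (11.1)."""
    return 2 * math.pi ** (d / 2) / math.gamma(d / 2)


for d in (1, 2, 3, 5, 10, 20, 100):
    print(d, unit_ball_volume(d), sphere_constant(d) / d)
```

The two printed columns agree, reflecting the identity $\omega_d = H_d/d$, and the values decrease rapidly for large $d$.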

Observe that $\omega_d \to 0$ as $d \to +\infty$. Let $\chi_1(t)$ be the characteristic function of the $d$-dimensional unit ball centred at the origin. If we argue as in the proof of Theorem 8.13 we obtain
\[
\widehat{\chi}_1(\xi) = \omega_{d-1}\int_{-1}^1 \left(\sqrt{1-s^2}\right)^{d-1} e^{-2\pi i|\xi|s}\,ds = \frac{\pi^{(d-1)/2}}{\Gamma\left(\frac{d+1}{2}\right)} \int_{-1}^1 \left(1-s^2\right)^{(d-1)/2} e^{-2\pi i|\xi|s}\,ds . \tag{11.2}
\]
The Fourier transform $\widehat{\chi}_1(\xi)$ can be expressed in terms of Bessel functions, a field of great relevance and independent interest. See, for example, [81, 117, 164, 178].

11.1 Bessel functions

For $\nu > -1/2$ and $x > 0$, let
\[
J_\nu(x) := \frac{(x/2)^\nu}{\Gamma(\nu+1/2)\sqrt{\pi}} \int_{-1}^1 \left(1-s^2\right)^{\nu-1/2} e^{isx}\,ds \tag{11.3}
\]
be the Bessel function (see footnote 2) (of the first kind) of order $\nu$. See, for example, [164, p. 155]. Then (11.2) yields
\[
\widehat{\chi}_1(\xi) = |\xi|^{-d/2} J_{d/2}(2\pi|\xi|) . \tag{11.4}
\]

Footnote 2. (11.3) is known as the Poisson representation of Bessel functions. Another way to introduce Bessel functions is to consider the wave equation in $\mathbb{R}^2$, write the Laplacian in polar coordinates and separate variables [81]. This leads to the identity
\[
J_\nu(x) = \sum_{k=0}^{+\infty} \frac{(-1)^k}{k!\,\Gamma(\nu+k+1)} \left(\frac{x}{2}\right)^{2k+\nu} .
\]


If $rB$ is the ball centred at the origin and having radius $r$, then we have the identity
\[
\widehat{\chi}_{rB}(\xi) = r^{d/2}|\xi|^{-d/2} J_{d/2}(2\pi r|\xi|) . \tag{11.5}
\]
A variant of the above computation shows that, for every radial $f \in L^1\left(\mathbb{R}^d\right)$ and $f_o$ defined by $f(t) = f_o(|t|)$ for almost every $t \in \mathbb{R}^d$, we have the following identity:
\[
\widehat{f}(\xi) = 2\pi|\xi|^{-(d-2)/2} \int_0^{+\infty} f_o(x)\,J_{(d-2)/2}(2\pi x|\xi|)\,x^{d/2}\,dx
\]
(see [164, p. 155]). (11.4) shows that here we only need Bessel functions of integer order $n$ or of half-integer order $n + 1/2$. For $J_n(x)$ we have the identity
\[
J_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{ix\sin\theta - in\theta}\,d\theta ,
\]
while the Bessel functions $J_{n+1/2}(x)$ can be written in terms of trigonometric functions:
\[
J_{1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\sin x , \qquad J_{3/2}(x) = \sqrt{\frac{2}{\pi x}}\left(\frac{\sin x}{x} - \cos x\right) , \quad \dots
\]
\[
J_{n+1/2}(x) = (-1)^n \sqrt{\frac{2}{\pi}}\; x^{n+1/2} \left(\frac{1}{x}\frac{d}{dx}\right)^n \frac{\sin x}{x}
\]
(see e.g. [117, Ch. 5] and [178]). We shall estimate $J_\nu(x)$ for large $x$.

Theorem 11.2 For every $\nu > -1/2$ there exists a positive constant $c_\nu$ such that, for $x \ge 1$,
\[
J_\nu(x) = \sqrt{\frac{2}{\pi x}}\,\cos\left(x - \frac{\nu\pi}{2} - \frac{\pi}{4}\right) + E_\nu(x) , \tag{11.6}
\]
where $|E_\nu(x)| \le c_\nu x^{-3/2}$.

Proof We consider the region of the complex plane obtained by deleting the rays $(-\infty, -1]$ and $[1, +\infty)$. We then choose the branch of $\left(1-z^2\right)^{\nu-1/2}$ which is positive on $(-1, 1)$. We compute the integral in (11.3) via Cauchy's integral theorem. We integrate
\[
f(z) = \left(1-z^2\right)^{\nu-1/2} e^{izx}
\]


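over the contour in the figure below, where $R$ and $\varepsilon$ are respectively a large positive number and a small positive number.

[Figure: the integration contour, with marked points $-1$, $-1+\varepsilon$, $1-\varepsilon$, $1$ and $iR$.]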

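Let $\varepsilon \to 0$ and $R \to +\infty$. Then the two integrals on the small arcs vanish, as well as the integral on the upper side $\{t + iR : -1 \le t \le 1\}$. In this way we obtain
\[
\begin{aligned}
\int_{-1}^1 \left(1-t^2\right)^{\nu-1/2} e^{itx}\,dt
&= \int_0^{+\infty} \left(1-(-1+iy)^2\right)^{\nu-1/2} e^{i(-1+iy)x}\, i\,dy - \int_0^{+\infty} \left(1-(1+iy)^2\right)^{\nu-1/2} e^{i(1+iy)x}\, i\,dy \\
&= i e^{-ix}\int_0^{+\infty} \left(y^2+2iy\right)^{\nu-1/2} e^{-yx}\,dy - i e^{ix}\int_0^{+\infty} \left(y^2-2iy\right)^{\nu-1/2} e^{-yx}\,dy .
\end{aligned} \tag{11.7}
\]
We have
\[
\left(y^2 \pm 2iy\right)^{\nu-1/2} = y^{\nu-1/2}(\pm 2i)^{\nu-1/2} + A(y) ,
\]
where
\[
|A(y)| \le c_\nu \begin{cases} y^{\nu+1/2} & \text{if } 0 < y \le 1, \\ y^{2\nu-1} & \text{if } y > 1. \end{cases}
\]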


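Then, as $x \to +\infty$,
\[
\begin{aligned}
\int_0^{+\infty} \left(y^2 \pm 2iy\right)^{\nu-1/2} e^{-yx}\,dy
&= (\pm 2i)^{\nu-1/2}\int_0^{+\infty} y^{\nu-1/2} e^{-yx}\,dy + O\left(\int_0^1 y^{\nu+1/2} e^{-yx}\,dy\right) + O\left(\int_1^{+\infty} y^{2\nu-1} e^{-yx}\,dy\right) \\
&= (\pm 2i)^{\nu-1/2} x^{-\nu-1/2}\,\Gamma\left(\nu+\frac12\right) + O\left(x^{-\nu-3/2}\int_0^x s^{\nu+1/2} e^{-s}\,ds\right) + O\left(x^{-2\nu}\int_x^{+\infty} s^{2\nu-1} e^{-s}\,ds\right) \\
&= (\pm 2i)^{\nu-1/2} x^{-\nu-1/2}\,\Gamma\left(\nu+\frac12\right) + O\left(x^{-\nu-3/2}\right) .
\end{aligned}
\]
Then (11.3) and (11.7) imply
\[
\begin{aligned}
J_\nu(x) &= \frac{(x/2)^\nu}{\Gamma(\nu+1/2)\sqrt{\pi}} \int_{-1}^1 \left(1-t^2\right)^{\nu-1/2} e^{itx}\,dt \\
&= \frac{(x/2)^\nu}{\Gamma(\nu+1/2)\sqrt{\pi}}\; x^{-\nu-1/2} \left[ (2i)^{\nu-1/2}\,\Gamma\left(\nu+\frac12\right) i e^{-ix} - (-2i)^{\nu-1/2}\,\Gamma\left(\nu+\frac12\right) i e^{ix} \right] + O\left(x^{-3/2}\right) \\
&= \frac{i x^{-1/2}}{\sqrt{2\pi}} \left[ i^{\nu-1/2} e^{-ix} - (-i)^{\nu-1/2} e^{ix} \right] + O\left(x^{-3/2}\right) \\
&= \frac{i x^{-1/2}}{\sqrt{2\pi}} \left[ e^{i(\nu-1/2)\pi/2} e^{-ix} - e^{-i(\nu-1/2)\pi/2} e^{ix} \right] + O\left(x^{-3/2}\right) \\
&= \sqrt{\frac{2}{\pi x}}\,\cos\left(x - \nu\frac{\pi}{2} - \frac{\pi}{4}\right) + O\left(x^{-3/2}\right) .
\end{aligned}
\]
In the following figure we compare the graph of $J_1(x)$ with its approximation
\[
\sqrt{\frac{2}{\pi x}}\,\cos\left(x - \frac34\pi\right)
\]
(the dashed curve):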


[Figure: graph of $J_1(x)$ and of its approximation (dashed) for $0 < x \le 50$; the vertical axis ranges from $-0.3$ to $0.5$.]

11.2 Deterministic and probabilistic discrepancies

Let $M$ be a large positive integer and let $N = M^d$. As in (10.28) let
\[
\{t(j)\}_{j=1}^N = \frac{1}{M}\mathbb{Z}^d \cap \left[-\frac12,\frac12\right)^d
\]
be the restriction of $M^{-1}\mathbb{Z}^d$ to the unit cube. For every $j = 1, \dots, N$, we consider, as in Theorem 10.13, a random point inside each small cube
\[
S_j = t(j) + \left[-\frac{1}{2M},\frac{1}{2M}\right)^d ,
\]
centred at $t(j)$ and with side length $1/M$. Let
\[
\lambda(t) := N\chi_{[-1/(2M),\,1/(2M))^d}(t) \tag{11.8}
\]
be the (normalized) characteristic function of $[-1/(2M), 1/(2M))^d$. For every $j = 1, \dots, N$ let
\[
\lambda_j(t) = \lambda(t - t(j)) . \tag{11.9}
\]
Then $\lambda_j(t)$ is supported on $S_j$ and $\lambda_j(t)\,dt$ is the (normalized) Lebesgue measure on $S_j$. We are going to compare the grid discrepancy associated with the finite sequence $\{t(j)\}_{j=1}^N$, that is, the square root of
\[
D^2_{\mathrm{grid}}(N) := \int_{\mathbb{T}^d} \left| -N|rB| + \sum_{j=1}^N \chi_{rB-t}(t(j)) \right|^2 dt ,
\]
and the jittered discrepancy associated with the $N$ random points in the cubes $S_j$, that is, the square root of
\[
D^2_{\mathrm{jittered}}(N) := \int_{\mathbb{T}^d}\cdots\int_{\mathbb{T}^d}\int_{\mathbb{T}^d} \left| -N|rB| + \sum_{j=1}^N \chi_{rB-t}(u_j) \right|^2 dt\,\lambda_1(u_1)\,du_1\cdots\lambda_N(u_N)\,du_N . \tag{11.10}
\]
These two choices of points are represented in the figures below:

[Two figures: the grid points and the jittered points in the square $[-0.5, 0.5]^2$.]

For every $k = (k_1, \dots, k_d) \in \mathbb{Z}^d$ we have
\[
\widehat{\lambda}(k) = N\prod_{s=1}^d \frac{\sin(\pi k_s/M)}{\pi k_s} \tag{11.11}
\]
with obvious modifications when $k_s = 0$ for some $s = 1, \dots, d$. Moreover
\[
\widehat{\lambda}_j(k) = e^{2\pi i k\cdot t(j)}\,\widehat{\lambda}(k) \tag{11.12}
\]
for every $j = 1, \dots, N$ and $k \in \mathbb{Z}^d$.

Theorem 11.3 Let $rB$ be the ball centred at the origin, with radius $r < 1/2$. The jittered discrepancy satisfies the following identity:
\[
D^2_{\mathrm{jittered}}(N) = N\left( \|\chi_{rB}\|^2_{L^2(\mathbb{T}^d)} - \|\chi_{rB}*\lambda\|^2_{L^2(\mathbb{T}^d)} \right) . \tag{11.13}
\]

Proof

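We argue as in (10.16). By Parseval's identity and (11.12) we have
\[
\begin{aligned}
D^2_{\mathrm{jittered}}(N)
&= \int_{\mathbb{T}^d}\cdots\int_{\mathbb{T}^d} \sum_{0\ne k\in\mathbb{Z}^d} \left|\widehat{\chi}_{rB}(k)\right|^2 \left| \sum_{j=1}^N e^{2\pi i k\cdot u_j} \right|^2 \lambda_1(u_1)\,du_1\cdots\lambda_N(u_N)\,du_N \\
&= \sum_{0\ne k\in\mathbb{Z}^d} \left|\widehat{\chi}_{rB}(k)\right|^2 \sum_{j=1}^N\sum_{\ell=1}^N \int_{\mathbb{T}^d}\int_{\mathbb{T}^d} e^{2\pi i k\cdot u_j} e^{-2\pi i k\cdot u_\ell}\,\lambda_j(u_j)\,du_j\,\lambda_\ell(u_\ell)\,du_\ell \\
&= \sum_{0\ne k\in\mathbb{Z}^d} \left|\widehat{\chi}_{rB}(k)\right|^2 \left( N + \sum_{j\ne\ell} \int_{\mathbb{T}^d}\int_{\mathbb{T}^d} e^{2\pi i k\cdot u_j} e^{-2\pi i k\cdot u_\ell}\,\lambda_j(u_j)\,du_j\,\lambda_\ell(u_\ell)\,du_\ell \right) \\
&= \sum_{0\ne k\in\mathbb{Z}^d} \left|\widehat{\chi}_{rB}(k)\right|^2 \left( N + \sum_{j\ne\ell} e^{2\pi i k\cdot t(j)} e^{-2\pi i k\cdot t(\ell)} \int_{\mathbb{T}^d}\int_{\mathbb{T}^d} e^{2\pi i k\cdot u_j} e^{-2\pi i k\cdot u_\ell}\,\lambda(u_j)\,du_j\,\lambda(u_\ell)\,du_\ell \right) \\
&= \sum_{0\ne k\in\mathbb{Z}^d} \left|\widehat{\chi}_{rB}(k)\right|^2 \left( N + \left|\widehat{\lambda}(k)\right|^2 \sum_{j\ne\ell} e^{2\pi i k\cdot t(j)} e^{-2\pi i k\cdot t(\ell)} \right) .
\end{aligned}
\]
Hence
\[
\begin{aligned}
D^2_{\mathrm{jittered}}(N)
&= \sum_{0\ne k\in\mathbb{Z}^d} \left|\widehat{\chi}_{rB}(k)\right|^2 \left( N + \left|\widehat{\lambda}(k)\right|^2 \left( -N + \left| \sum_{j=1}^N e^{2\pi i k\cdot t(j)} \right|^2 \right) \right) \\
&= N\sum_{k\in\mathbb{Z}^d} \left|\widehat{\chi}_{rB}(k)\right|^2 \left( 1 - \left|\widehat{\lambda}(k)\right|^2 \right) + \sum_{0\ne k\in\mathbb{Z}^d} \left|\widehat{\chi}_{rB}(k)\right|^2 \left|\widehat{\lambda}(k)\right|^2 \left| \sum_{j=1}^N e^{2\pi i k\cdot t(j)} \right|^2 .
\end{aligned} \tag{11.14}
\]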

Observe that
\[
\sum_{j=1}^N e^{2\pi i k\cdot t(j)} = \begin{cases} N & \text{if } k \in M\mathbb{Z}^d, \\ 0 & \text{otherwise.} \end{cases} \tag{11.15}
\]
Indeed, by (10.28) we have
\[
\sum_{j=1}^N e^{2\pi i k\cdot t(j)} = \prod_{s=1}^d \left( \sum_{j=0}^{M-1} e^{2\pi i k_s j/M} \right) ,
\]
where $k = (k_1, \dots, k_d)$. If $k \notin M\mathbb{Z}^d$ there is $k_s \notin M\mathbb{Z}$. Then
\[
\sum_{j=0}^{M-1} e^{2\pi i k_s j/M} = \sum_{j=0}^{M-1} \left( e^{2\pi i k_s/M} \right)^j = 0
\]
and (11.15) is proved. Then (11.11) and (11.15) imply
\[
\widehat{\lambda}(k) \sum_{j=1}^N e^{2\pi i k\cdot t(j)} = 0
\]
for every $k \ne 0$. Hence
\[
D^2_{\mathrm{jittered}}(N) = N\sum_{k\in\mathbb{Z}^d} \left|\widehat{\chi}_{rB}(k)\right|^2 \left( 1 - \left|\widehat{\lambda}(k)\right|^2 \right) = N\left( \|\chi_{rB}\|^2_{L^2(\mathbb{T}^d)} - \|\chi_{rB}*\lambda\|^2_{L^2(\mathbb{T}^d)} \right) .
\]
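The orthogonality relation (11.15) used in the proof above is easy to test by direct summation. The sketch below (the function name is chosen here for illustration) sums $e^{2\pi i k\cdot t}$ over the grid $\frac{1}{M}\mathbb{Z}^d \cap [-1/2,1/2)^d$, with the dimension $d$ read off from the length of $k$.

```python
import cmath
import math
from itertools import product


def grid_exponential_sum(k, m):
    """Sum of exp(2 pi i k.t) over the grid (1/m) Z^d in [-1/2, 1/2)^d, d = len(k)."""
    start = -(m // 2)  # the integers n with -m/2 <= n < m/2 are start, ..., start + m - 1
    total = 0j
    for idx in product(range(start, start + m), repeat=len(k)):
        phase = sum(ks * n for ks, n in zip(k, idx)) / m
        total += cmath.exp(2j * math.pi * phase)
    return total


print(abs(grid_exponential_sum((4, 8, 0), 4)))  # k in 4 Z^3: modulus close to N = 64
print(abs(grid_exponential_sum((1, 4, 0), 4)))  # k not in 4 Z^3: modulus close to 0
```

Up to rounding error the sum equals $N = M^d$ when every $k_s$ is a multiple of $M$ and vanishes otherwise.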

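Lemma 11.4 For every dimension $d$ we have
\[
D^2_{\mathrm{jittered}}(N) < M^{d-1}\,\frac{\pi^{d/2}\,d^{3/2}\,r^{d-1}}{2\,\Gamma\left(\frac{d}{2}+1\right)} .
\]

Proof For every $t \in \mathbb{T}^d$ we have
\[
(\chi_{rB}*\lambda)(t) = N\left| rB \cap \left( \left[-\frac{1}{2M},\frac{1}{2M}\right)^d + t \right) \right| . \tag{11.16}
\]
Observe that the support of the function $\lambda$ is small and we have $(\chi_{rB}*\lambda)(t) = \chi_{rB}(t)$ for every
\[
t \notin \left\{ x \in \mathbb{T}^d : \min_{y\in\partial(rB)} |x-y| \le \frac{\sqrt{d}}{2M} \right\} .
\]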

Hence (11.13) and Lemma 11.1 imply, for large M, 2 dt Djittered (N) ≤ N |rB| − N √ |t| 1000 2Γ d2 + 1 the previous lemma implies (11.17).


Remark 11.7 The inequality (11.17) is not true for every $d \not\equiv 1 \pmod 4$. A cumbersome computation [51] shows that for $d = 2$ we have
\[
D^2_{\mathrm{jittered}}\left(M^2\right) > D^2_{\mathrm{grid}}\left(M^2\right) .
\]

11.3 The case d = 5, 9, 13, . . .

The case $1 < d \equiv 1 \pmod 4$ is different. As a preliminary step, we prove a simultaneous approximation result (see [90, Ch. XI] and [138]), which is related to Theorem 5.3.

Lemma 11.8 Let $\alpha_1, \alpha_2, \dots, \alpha_n \in \mathbb{R}$ and let $H$ be a positive integer. Then there are $p_1, p_2, \dots, p_n \in \mathbb{Z}$ and an integer $\widetilde{M}$ such that
\[
H \le \widetilde{M} \le H^{n+1} , \qquad \left| \widetilde{M}\alpha_j - p_j \right| < 1/H \tag{11.22}
\]
for every $j = 1, \dots, n$.

We first show why this result is related to the case $d \equiv 1 \pmod 4$. Indeed, let $d = 4\ell + 1$, then (11.18) gives
\[
D^2_{\mathrm{grid}}(N) = N r^d \sum_{h\ne 0} |h|^{-d} J^2_{d/2}(2\pi rM|h|) ,
\]
while (11.6) yields
\[
J^2_{d/2}(2\pi rM|h|) \sim \frac{1}{\pi^2 rM|h|}\cos^2\left( 2\pi rM|h| - \frac{4\ell+1}{2}\,\frac{\pi}{2} - \frac{\pi}{4} \right) = \frac{1}{\pi^2 rM|h|}\sin^2(2\pi rM|h|) .
\]
Then we can use (11.22) to obtain an integer $\widetilde{M}$ and a useful upper bound for each term $\sin^2\left(2\pi r\widetilde{M}|h|\right)$, where $h$ is any integer point contained in a suitably large ball centred at the origin.

Proof of Lemma 11.8 We may assume that $0 < \alpha_j < 1$ for every $j = 1, \dots, n$. We subdivide the cube $[0,1)^n$ as the disjoint union
\[
[0,1)^n = \bigcup_{k=1}^{H^n} Q_k ,
\]
where the cubes $Q_k$ have sides parallel to the axes and length $H^{-1}$. For every integer $0 \le m \le H^{n+1}$ we consider the point
\[
\left( \{m\alpha_1\}, \{m\alpha_2\}, \dots, \{m\alpha_n\} \right) \in [0,1)^n ,
\]
where $\{m\alpha_k\}$ is the fractional part of $m\alpha_k$. Then there exists a cube $Q_{k_0}$ which contains at least $H + 1$ of the above $H^{n+1} + 1$ points. Hence there are integers $0 \le m_1 < m_2 < \dots < m_{H+1}$ such that $m_{H+1} - m_1 \ge H$ and, for every $j = 1, \dots, n$,
\[
H^{-1} > \left| \{m_{H+1}\alpha_j\} - \{m_1\alpha_j\} \right| = \left| (m_{H+1} - m_1)\,\alpha_j - \left( [m_{H+1}\alpha_j] - [m_1\alpha_j] \right) \right| .
\]
We let $\widetilde{M} = m_{H+1} - m_1$ and, for every $j$, we let $p_j = [m_{H+1}\alpha_j] - [m_1\alpha_j]$. Then
\[
\left| \widetilde{M}\alpha_j - p_j \right| < 1/H .
\]
Since $H \le \widetilde{M} \le H^{n+1}$ we obtain (11.22).
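Lemma 11.8 guarantees that a suitable integer exists in the range $[H, H^{n+1}]$, so for small $n$ and $H$ it can be found by exhaustive search. The sketch below is a brute-force illustration, not the pigeonhole argument of the proof; the function name and the sample values $\alpha = (\sqrt2, \sqrt3)$, $H = 10$ are assumptions, and floating-point arithmetic is used for simplicity.

```python
import math


def simultaneous_approximation(alphas, h):
    """Search H <= m <= H^(n+1) with |m*alpha_j - p_j| < 1/H for all j.

    Returns (m, [p_1, ..., p_n]) for the first m found, or None.
    """
    n = len(alphas)
    for m in range(h, h ** (n + 1) + 1):
        ps = [round(m * a) for a in alphas]  # nearest integers to m * alpha_j
        if all(abs(m * a - p) < 1 / h for a, p in zip(alphas, ps)):
            return m, ps
    return None


alphas = [math.sqrt(2), math.sqrt(3)]
result = simultaneous_approximation(alphas, 10)
print(result)
```

The search range grows like $H^{n+1}$, which is why the sequence $M_j$ built from this lemma in Theorem 11.9 increases so quickly.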

The following result is due to Parnovski and Sobolev.

Theorem 11.9 (Parnovski and Sobolev) Let $1 < d \equiv 1 \pmod 4$ and let $0 < r < 1/2$. For every $\varepsilon > 0$ there exists a diverging sequence $\{M_j\}_{j=1}^{+\infty}$ of positive integers satisfying
\[
D^2_{\mathrm{grid}}\left(M_j^d\right) \le c_{\varepsilon,d,r}\, M_j^{d-1} \log^{-1/(d+\varepsilon)} M_j . \tag{11.23}
\]

Proof For every positive integer $j$ let
\[
\widetilde{B}_j = \left\{ m \in \mathbb{Z}^d : 0 < |m| \le j^2 \right\} .
\]
We have
\[
\widetilde{B}_j \supset \left\{ m \in \mathbb{Z}^d : 0 < 2r|m| \le j^2 \right\} , \qquad \operatorname{card}\widetilde{B}_j \le 2^d j^{2d} .
\]
Then Lemma 11.8 implies the existence of a diverging sequence $M_j$ of positive integers satisfying
\[
j \le M_j \le j^{2^d j^{2d}+1} , \qquad \left| \sin\left(2\pi r M_j|m|\right) \right| \le j^{-1} \tag{11.24}
\]
for every $m \in \widetilde{B}_j$. The assumption $d \equiv 1 \pmod 4$ implies
\[
\cos^2\left( x - \frac{\nu\pi}{2} - \frac{\pi}{4} \right) = \sin^2(x)
\]
when $\nu = d/2$ in (11.6). Let $N_j = M_j^d$ for every $j$. Then Parseval's identity, (11.18), (11.4) and (11.24) imply
\[
\begin{aligned}
D^2_{\mathrm{grid}}(N_j) &= N_j r^d \sum_{h\ne 0} |h|^{-d} J^2_{d/2}\left(2\pi r M_j|h|\right) \\
&= N_j r^d \sum_{0<|h|\le j^2} |h|^{-d} J^2_{d/2}\left(2\pi r M_j|h|\right) + N_j r^d \sum_{|h|>j^2} |h|^{-d} J^2_{d/2}\left(2\pi r M_j|h|\right) \\
&\le M_j^{d-1} r^{d-1}\pi^{-2} \sum_{0<|h|\le j^2} |h|^{-(d+1)} \sin^2\left(2\pi r M_j|h|\right) + M_j^{d-1} r^{d-1}\pi^{-2} \sum_{|h|>j^2} |h|^{-(d+1)} + O\left(M_j^{d-2} r^{d-2}\right) \\
&\le c_d\, M_j^{d-1} r^{d-1} j^{-2} \int_1^{j^2} s^{-2}\,ds + c\, M_j^{d-1} r^{d-1} \int_{j^2}^{+\infty} s^{-2}\,ds + O\left(M_j^{d-2} r^{d-2}\right) \\
&\le c_d\, j^{-2} M_j^{d-1} r^{d-1} + O_d\left(M_j^{d-2} r^{d-2}\right) .
\end{aligned} \tag{11.25}
\]
Observe that (11.24) implies, for every $\varepsilon > 0$,
\[
\log M_j < \left(2^d j^{2d} + 1\right)\log j < c_{\varepsilon,d}\left(j^2\right)^{d+\varepsilon} .
\]
Then the inequality
\[
j^2 > c_{\varepsilon,d}\,\log^{1/(d+\varepsilon)} M_j
\]
and (11.25) complete the proof.

Remark 11.10 The inequality (11.23) is almost sharp. Indeed, Parnovski and Sobolev [138, Theorem 3.1] have proved that if $d \equiv 1 \pmod 4$ then, for every positive real number $\delta$, we have $D^2_{\mathrm{grid}}(N) \ge c_{d,\delta} M^{d-1-\delta}$. An inequality like (11.23) can be true also for sets different from balls. See [137, 175].

We can now prove the following result [51].

Theorem 11.11 Let $d \equiv 1 \pmod 4$. Then for infinitely many values of $M$ we have
\[
D^2_{\mathrm{grid}}\left(M^d\right) \le D^2_{\mathrm{jittered}}\left(M^d\right) .
\]

Proof

Let $N = M^d$. By Proposition 6.10 we have
\[
\|\chi_{rB}*\lambda\|_{L^1(\mathbb{T}^d)} \le \|\chi_{rB}\|_{L^1(\mathbb{T}^d)}\,\|\lambda\|_{L^1(\mathbb{T}^d)} = |rB| .
\]
Then (11.13) and Corollary 6.6 yield
\[
\begin{aligned}
D^2_{\mathrm{jittered}}(N) &= N\left( |rB| - \|\chi_{rB}*\lambda\|^2_{L^2(\mathbb{T}^d)} \right) \\
&= N\left( |rB| - \|\chi_{rB}*\lambda\|_{L^1(\mathbb{T}^d)} \right) + N\left( \|\chi_{rB}*\lambda\|_{L^1(\mathbb{T}^d)} - \|\chi_{rB}*\lambda\|^2_{L^2(\mathbb{T}^d)} \right) \\
&\ge N\int_{\mathbb{T}^d} \left( (\chi_{rB}*\lambda)(t) - (\chi_{rB}*\lambda)^2(t) \right) dt \\
&= N\int_{\mathbb{T}^d} (\chi_{rB}*\lambda)(t)\,\big(1 - (\chi_{rB}*\lambda)(t)\big)\,dt .
\end{aligned}
\]
Observe that
\[
0 \le (\chi_{rB}*\lambda)(t)\,\big(1 - (\chi_{rB}*\lambda)(t)\big) \le 1 \tag{11.26}
\]
for every $t$. Also note that
\[
(\chi_{rB}*\lambda)(t)\,\big(1 - (\chi_{rB}*\lambda)(t)\big) = 0
\]
whenever
\[
\left( \left[-\frac{1}{2M},\frac{1}{2M}\right)^d + t \right) \cap S(0,r) = \emptyset ,
\]
where $S(0,r) = \{t : |t| = r\}$ is the sphere centred at 0, with radius $r$. We write $t = \rho\sigma$ in polar coordinates. For every $\sigma \in \Sigma_{d-1}$ (the unit sphere in $\mathbb{R}^d$) let $B_{r,\sigma}$ be the half-space containing the origin and such that its boundary is tangent to the sphere $S(0,r)$ at the point $r\sigma$. By (11.16) and the symmetry of the cube about its centre we have
\[
(\chi_{rB}*\lambda)(r\sigma) = N\left| rB \cap \left( \left[-\frac{1}{2M},\frac{1}{2M}\right)^d + r\sigma \right) \right| = \frac12 + O_{d,r}\left(M^{-1}\right) .
\]
By the monotonicity of the function $\rho \mapsto (\chi_{rB}*\lambda)(\rho\sigma)$ we have, for $r \le \rho \le r + 1/(4M)$,
\[
0 \le (\chi_{rB}*\lambda)(\rho\sigma) \le \frac12 + O_{d,r}\left(M^{-1}\right) .
\]
Then, for large $M$,
\[
\begin{aligned}
D^2_{\mathrm{jittered}}(N) &\ge N\int_{r\le|t|\le r+1/(4M)} (\chi_{rB}*\lambda)(t)\,\big(1 - (\chi_{rB}*\lambda)(t)\big)\,dt \\
&\ge \frac13 N\int_{r\le|t|\le r+1/(4M)} (\chi_{rB}*\lambda)(t)\,dt \\
&\ge \frac13 N^2\int_{r\le|t|\le r+1/(4M)} \left| B_{r,\sigma} \cap \left( \left[-\frac{1}{2M},\frac{1}{2M}\right)^d + \rho\sigma \right) \right| dt + O_{d,r}\left(M^{d-2}\right) \\
&\ge c_d\,N^2 M^{-d}\int_{r\le|t|\le r+1/(4M)} dt + O_{d,r}\left(M^{d-2}\right) \ge c_d\,M^{d-1} .
\end{aligned}
\]
By appealing to (11.23) we complete the proof.


Exercises

1) Prove the existence of a positive constant $c$ such that for every positive integer $m$ and real $x > 0$, the Bessel function $J_m(x)$ satisfies
\[
|J_m(x)| \le c\,x^{-1/3} .
\]

2) Prove that for every non-negative integer $k$ and every $x > 0$ we have
\[
x^k\,\frac{d}{dx}\left( x^{-k} J_k(x) \right) = -J_{k+1}(x) .
\]

3) Let $Q = [-1/2, 1/2]^d$. Prove that for every $\rho \ge 2$ we have
\[
\int_{\Sigma_{d-1}} \left| \widehat{\chi}_Q(\rho\sigma) \right| d\sigma \ge c\,\frac{\log(\rho)}{\rho^d} .
\]

4) Let $rB$ be a $d$-dimensional ball of radius $r$, and let $D_{\mathrm{jittered}}$ be as in (11.10). Prove the existence of a positive constant $c_{d,r}$ such that
\[
\frac{D^2_{\mathrm{jittered}}\left(M^d\right)}{M^{d-1}} \longrightarrow c_{d,r}
\]
as $M \to +\infty$.


References

[1] T. van Aardenne-Ehrenfest, Proof of the impossibility of a just distribution of an infinite sequence of points over an interval, Proc. Kon. Ned. Akad. v. Wetensch 48 (1945), 266–271.
[2] T. van Aardenne-Ehrenfest, On the impossibility of a just distribution, Proc. Kon. Ned. Akad. v. Wetensch 52 (1949), 734–739.
[3] W.W. Adams, L.J. Goldstein, Introduction to number theory, Prentice-Hall, 1976.
[4] J. Agnew, Explorations in number theory, Contemporary Undergraduate Mathematics Series, Brooks/Cole, 1972.
[5] W.R. Alford, A. Granville, C. Pomerance, There are infinitely many Carmichael numbers, Ann. of Math. 139 (1994), 703–722.
[6] G.E. Andrews, Number theory, Dover Publications, 1994.
[7] G.E. Andrews, S.B. Ekhad, D. Zeilberger, A short proof of Jacobi's formula for the number of representations of an integer as a sum of four squares, Amer. Math. Monthly 100 (1993), 274–276.
[8] T.M. Apostol, Introduction to analytic number theory, Undergraduate Texts in Mathematics, Springer, 1998.
[9] A. Baker, A concise introduction to the theory of numbers, Cambridge University Press, 1984.
[10] P.T. Bateman, H.G. Diamond, Analytic number theory. An introductory course, World Scientific Publishing, 2004.
[11] D. Bayer, P. Diaconis, Trailing the dovetail shuffle to its lair, Ann. Appl. Probab. 2 (1992), 294–313.
[12] J. Beck, Balanced two-colourings of finite sets in the square I, Combinatorica 1 (1981), 50–64.
[13] J. Beck, Irregularities of distribution I, Acta Math. 159 (1987), 1–49.
[14] J. Beck, W.W.L. Chen, Irregularities of distribution, Cambridge Tracts in Mathematics, 89, Cambridge University Press, 2008.
[15] J. Beck, W.W.L. Chen, Note on irregularities of distribution II, Proc. London Math. Soc. 61 (1990), 251–272.
[16] D.R. Bellhouse, Area estimation by point counting techniques, Biometrics 37 (1981), 303–312.

229 Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 04 Jun 2017 at 03:24:10, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781107358379.013


[17] F. Benford, The law of anomalous numbers, Proc. Am. Philos. Soc. 78 (1938), 551–572.
[18] A. Berger, T.P. Hill, Benford's law strikes back: no simple explanation in sight for mathematical gem, Math. Intelligencer 33 (2011), 85–91.
[19] D. Bilyk, Roth's orthogonal functions method in discrepancy theory and some new connections, in 'A panorama of discrepancy theory' (W.W.L. Chen, A. Srivastav, G. Travaglini - Editors), Lecture Notes in Mathematics, Springer, to appear.
[20] S. Bochner, The role of mathematics in the rise of science, Princeton University Press, 1966.
[21] E. Borel, Les probabilités dénombrables et leurs applications arithmétiques, Rend. Circ. Mat. Palermo 27 (1909), 247–271.
[22] L. Brandolini, W.W.L. Chen, L. Colzani, G. Gigante, G. Travaglini, Discrepancy and numerical integration in Sobolev spaces on metric measures spaces, preprint.
[23] L. Brandolini, W.W.L. Chen, G. Gigante, G. Travaglini, Discrepancy for randomized Riemann sums, Proc. Amer. Math. Soc. 137 (2009), 3177–3185.
[24] L. Brandolini, C. Choirat, L. Colzani, G. Gigante, R. Seri, G. Travaglini, Quadrature rules and distribution of points on manifolds, Ann. Sc. Norm. Super. Pisa Cl. Sci., to appear.
[25] L. Brandolini, L. Colzani, G. Gigante, G. Travaglini, On the Koksma–Hlawka inequality, J. Complexity 29 (2013), 158–172.
[26] L. Brandolini, L. Colzani, G. Gigante, G. Travaglini, A Koksma–Hlawka inequality for simplices, in 'Trends in harmonic analysis' (M. Picardello - Editor), Springer INdAM Series, Springer, 2013, 33–46.
[27] L. Brandolini, L. Colzani, A. Iosevich, A. Podkorytov, G. Travaglini, Geometry of the Gauss map and lattice points in convex domains, Mathematika 48 (2001), 107–117.
[28] L. Brandolini, L. Colzani, G. Travaglini, Average decay of Fourier transforms and integer points in polyhedra, Ark. Mat. 35 (1997), 253–275.
[29] L. Brandolini, G. Gigante, S. Thangavelu, G. Travaglini, Convolution operators defined by singular measures on the motion group, Indiana Univ. Math. J. 59 (2010), 1935–1945.
[30] L. Brandolini, G. Gigante, G. Travaglini, Irregularities of distribution and average decay of Fourier transforms, in 'A panorama of discrepancy theory' (W.W.L. Chen, A. Srivastav, G. Travaglini - Editors), Lecture Notes in Mathematics, Springer, to appear.
[31] L. Brandolini, A. Greenleaf, G. Travaglini, $L^p$–$L^{p'}$ estimates for overdetermined Radon transforms, Trans. Amer. Math. Soc. 359 (2007), 2559–2575.
[32] L. Brandolini, S. Hofmann, A. Iosevich, Sharp rate of average decay of the Fourier transform of a bounded set, Geom. Funct. Anal. 13 (2003), 671–680.
[33] L. Brandolini, A. Iosevich, G. Travaglini, Spherical means and the restriction phenomenon, J. Fourier Anal. Appl. 7 (2001), 359–372.
[34] L. Brandolini, A. Iosevich, G. Travaglini, Planar convex bodies, Fourier transform, lattice points, and irregularities of distribution, Trans. Amer. Math. Soc. 355 (2003), 3513–3535.

Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 04 Jun 2017 at 03:24:10, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781107358379.013

References

[35] L. Brandolini, M. Rigoli, G. Travaglini, Average decay of Fourier transforms and geometry of convex sets, Rev. Mat. Iberoamer. 14 (1998), 519–560.
[36] L. Brandolini, G. Travaglini, Pointwise convergence of Fejér type means, Tohoku Math. J. 49 (1997), 323–336.
[37] L. Brandolini, G. Travaglini, La legge di Benford, Emmeciquadro 45 (2012).
[38] J. Bruna, A. Nagel, S. Wainger, Convex hypersurfaces and Fourier transforms, Ann. of Math. 127 (1988), 333–365.
[39] F. Cantelli, Sulla probabilità come limite della frequenza, Atti Accad. Naz. Lincei 26 (1917), 39–45.
[40] J.W.S. Cassels, On the sums of powers of complex numbers, Acta Math. Hungar. 7 (1957), 283–289.
[41] D.G. Champernowne, The construction of decimals normal in the scale of ten, J. London Math. Soc. 8 (1933), 254–260.
[42] K. Chandrasekharan, Introduction to analytic number theory, Die Grundlehren der mathematischen Wissenschaften, Band 148, Springer, 1968.
[43] B. Chazelle, The discrepancy method. Randomness and complexity, Cambridge University Press, 2000.
[44] W.W.L. Chen, On irregularities of distribution III, J. Austr. Math. Soc. 60 (1996), 228–244.
[45] W.W.L. Chen, Lectures on irregularities of point distribution, unpublished, 2000.
[46] W.W.L. Chen, Elementary number theory, unpublished, 2003.
[47] W.W.L. Chen, Fourier techniques in the theory of irregularities of point distribution, in ‘Fourier analysis and convexity’ (L. Brandolini, L. Colzani, A. Iosevich, G. Travaglini - Editors), Birkhäuser, 2004, 59–82.
[48] W.W.L. Chen, M. Skriganov, Upper bounds in irregularities of point distribution, in ‘A panorama of discrepancy theory’ (W.W.L. Chen, A. Srivastav, G. Travaglini - Editors), Lecture Notes in Mathematics, Springer, to appear.
[49] W.W.L. Chen, A. Srivastav, G. Travaglini - Editors, A panorama of discrepancy theory, Lecture Notes in Mathematics, Springer, to appear.
[50] W.W.L. Chen, G. Travaglini, Discrepancy with respect to convex polygons, J. Complexity 23 (2007), 662–672.
[51] W.W.L. Chen, G. Travaglini, Deterministic and probabilistic discrepancies, Ark. Mat. 47 (2009), 273–293.
[52] W.W.L. Chen, G. Travaglini, Some of Roth’s ideas in discrepancy theory, in ‘Analytic number theory: essays in honour of Klaus Roth’ (W.W.L. Chen, W.T. Gowers, H. Halberstam, W.M. Schmidt, R.C. Vaughan - Editors), Cambridge University Press, 2009, 150–163.
[53] P.R. Chernoff, Pointwise convergence of Fourier series, Amer. Math. Monthly 87 (1980), 399–400.
[54] K.L. Chung, A course in probability theory, Academic Press, 2001.
[55] M. Cipolla, Sui numeri composti P che verificano la congruenza di Fermat $a^{P-1} \equiv 1 \pmod{P}$, Ann. Mat. Pura Appl. 9 (1904), 139–160.
[56] J.A. Clarkson, On the series of prime reciprocals, Proc. Amer. Math. Soc. 17 (1966), 541.
[57] L. Colzani, G. Gigante, Summation formulas and integer points under shifted generalized hyperbolae, preprint.

[58] L. Colzani, G. Gigante, G. Travaglini, Trigonometric approximation and a general form of the Erdős–Turán inequality, Trans. Amer. Math. Soc. 363 (2011), 1101–1123.
[59] L. Colzani, G. Gigante, G. Travaglini, unpublished, 2012.
[60] L. Colzani, I. Rocco, G. Travaglini, Quadratic estimates for the number of integer points in convex bodies, Rend. Circ. Mat. Palermo 54 (2005), 241–252.
[61] J.H. Conway, R.K. Guy, The book of numbers, Copernicus, 1996.
[62] A.H. Copeland, P. Erdős, Note on normal numbers, Bull. Amer. Math. Soc. 52 (1946), 857–860.
[63] W.A. Coppel, Number theory. An introduction to mathematics, Springer, 2009.
[64] J.G. van der Corput, Zahlentheoretische abschätzungen, Math. Ann. 84 (1921), 53–79.
[65] J.G. van der Corput, Zahlentheoretische abschätzungen mit anwendung auf gitterpunktprobleme, Math. Z. 17 (1923), 250–259.
[66] J.G. van der Corput, Verteilungsfunktionen I–VIII, Proc. Akad. Amsterdam 38 (1935), 813–821, 1058–1066; 39 (1936), 10–19, 19–26, 149–153, 339–344, 489–494, 579–590.
[67] H. Davenport, Notes on irregularities of distribution, Mathematika 3 (1956), 131–135.
[68] J. De Koninck, F. Luca, Analytic number theory. Exploring the anatomy of integers, Graduate Studies in Mathematics, 134, American Mathematical Society, 2012.
[69] P. Diaconis, The distribution of leading digits and uniform distribution mod 1, Ann. Prob. 5 (1977), 72–81.
[70] P. Diaconis, D. Freedman, On rounding percentages, J. Amer. Statist. Assoc. 366 (1979), 359–364.
[71] J. Dick, F. Pillichshammer, Discrepancy theory and quasi-Monte Carlo integration, in ‘A panorama of discrepancy theory’ (W.W.L. Chen, A. Srivastav, G. Travaglini - Editors), Lecture Notes in Mathematics, Springer, to appear.
[72] L.E. Dickson, History of the theory of numbers, Vol. I, II, Chelsea Publishing Co., 1966.
[73] M. Drmota, R.F. Tichy, Sequences, discrepancies and applications, Lecture Notes in Mathematics, 1651, Springer, 1997.
[74] P. Erdős, On almost primes, Amer. Math. Monthly 57 (1950), 404–407.
[75] P. Erdős, W.H.J. Fuchs, On a problem of additive number theory, J. London Math. Soc. 31 (1956), 67–73.
[76] P. Erdős, J. Surányi, Topics in the theory of numbers, Undergraduate Texts in Mathematics, Springer, 2003.
[77] P. Erdős, P. Turán, On a problem in the theory of uniform distribution I, II, Indag. Math. 10 (1948), 370–378, 406–413.
[78] G. Everest, T. Ward, An introduction to number theory, Graduate Texts in Mathematics, 232, Springer, 2005.
[79] D.E. Flath, Introduction to number theory, John Wiley & Sons, 1989.
[80] B. Flehinger, On the probability that a random integer has initial digit A, Amer. Math. Monthly 73 (1966), 1056–1061.
[81] G.B. Folland, Fourier analysis and its applications, Wadsworth & Brooks/Cole, 1992.

[82] G.B. Folland, Real analysis. Modern techniques and their applications, John Wiley & Sons, 1999.
[83] L.J. Goldstein, A history of the prime number theorem, Amer. Math. Monthly 80 (1973), 599–615.
[84] S.W. Graham, G. Kolesnik, Van der Corput’s method of exponential sums, London Mathematical Society Lecture Note Series, 126, Cambridge University Press, 1991.
[85] A. Granville, The Fundamental theorem of arithmetic, preprint.
[86] A. Granville, Z. Rudnick - Editors, Equidistribution in number theory, an introduction, Springer, 2007.
[87] T.H. Gronwall, Some asymptotic expressions in the theory of numbers, Trans. Amer. Math. Soc. 14 (1913), 113–122.
[88] G.H. Hardy, On the expression of a number as the sum of two squares, Quart. J. Math. 46 (1915), 263–283.
[89] G.H. Hardy, On Dirichlet’s divisor problem, Proc. London Math. Soc. 15 (1916), 1–25.
[90] G.H. Hardy, E.M. Wright, An introduction to the theory of numbers, Oxford University Press, 1938.
[91] G. Harman, Metric number theory, London Mathematical Society Monographs, New Series, 18, Oxford University Press, 1998.
[92] G. Harman, Variations on the Koksma–Hlawka inequality, Unif. Distr. Theory 5 (2010), 65–78.
[93] H. Hasse, Number theory, Springer, 1980.
[94] F.J. Hickernell, Koksma–Hlawka inequality, in ‘Encyclopedia of statistical sciences’ (S. Kotz, C.B. Read, D.L. Banks - Editors), Wiley-Interscience, 2006.
[95] E. Hlawka, The theory of uniform distribution, AB Academic Publishers, 1984.
[96] E. Hlawka, J. Schoißengeier, R. Taschner, Geometric and analytic number theory, Universitext, Springer, 1991.
[97] P. Hoffman, The man who loved only numbers: The story of Paul Erdős and the search for mathematical truth, Hyperion Books, 1998.
[98] L.K. Hua, Introduction to number theory, Springer, 1982.
[99] M.N. Huxley, The mean lattice point discrepancy, Proc. Edinburgh Math. Soc. 38 (1995), 523–531.
[100] M.N. Huxley, Area, lattice points and exponential sums, London Mathematical Society Monographs, New Series, 13, Oxford Science Publications, 1996.
[101] K. Ireland, M. Rosen, A classical introduction to modern number theory, Graduate Texts in Mathematics, 84, Springer, 1990.
[102] A. Ivić, The Riemann zeta-function. Theory and applications, Dover Publications, 2003.
[103] G.A. Jones, J.M. Jones, Elementary number theory, Springer Undergraduate Mathematics Series, Springer, 1998.
[104] C. Joy, P.P. Boyle, K.S. Tan, Quasi-Monte Carlo methods in finance, Management Science 42 (1996), 926–938.
[105] Y. Katznelson, An introduction to harmonic analysis, Cambridge Mathematical Library, Cambridge University Press, 2004.
[106] D.G. Kendall, On the number of lattice points in a random oval, Quart. J. Math. Oxford Series 19 (1948), 1–26.

[107] N. Koblitz, A course in number theory and cryptography, Graduate Texts in Mathematics, 114, Springer, 1994.
[108] H. Koch, Number theory, Graduate Studies in Mathematics, American Mathematical Society, 2000.
[109] J.F. Koksma, Een algemeene stelling uit de theorie der gelijkmatige verdeeling modulo 1, Mathematica B (Zutphen) 11 (1942-43), 7–11.
[110] T. Kollig, A. Keller, Efficient multidimensional sampling, Computer Graphics Forum 21 (2002), 557–563.
[111] M.N. Kolountzakis, T. Wolff, On the Steinhaus tiling problem, Mathematika 46 (1999), 253–280.
[112] S.V. Konyagin, M.M. Skriganov, A.V. Sobolev, On a lattice point problem arising in the spectral analysis of periodic operators, Mathematika 50 (2003), 87–98.
[113] E. Krätzel, Lattice points, Mathematics and its Applications, Kluwer Academic Publisher, 1988.
[114] L. Kuipers, H. Niederreiter, Uniform distribution of sequences, Dover Publications, 2006.
[115] E. Landau, Über die gitterpunkte in einem kreise (Erste, zweite Mitteilung), Nachr. K. Gesellschaft Wiss. Göttingen, Math.-Phys. Klasse (1915), 148–160, 161–171.
[116] E. Landau, Über Dirichlets teilerproblem, Sitzungsber. Math.–Phys. Klasse Königl. Bayer. Akad. Wiss. (1915), 317–328.
[117] N.N. Lebedev, Special functions and their applications, Dover Publications, 1972.
[118] F. Lemmermeyer, Reciprocity laws (from Euler to Eisenstein), Springer Monographs in Mathematics, Springer, 2000.
[119] W.J. LeVeque, Fundamentals of number theory, Addison-Wesley, 1977.
[120] W.J. LeVeque, Elementary theory of numbers, Dover Publications, 1990.
[121] J. Matoušek, Geometric discrepancy. An illustrated guide, Algorithms and Combinatorics, 18, Springer, 2010.
[122] R. Matthews, The power of one, New Scientist, 10 July 1999.
[123] H. Montgomery, Ten lectures on the interface between analytic number theory and harmonic analysis, CBMS Regional Conference Series in Mathematics, 84, American Mathematical Society, 1994.
[124] W.J. Morokoff, R.E. Caflisch, A quasi-Monte Carlo approach to particle simulation of the heat equation, SIAM J. Numer. Anal. 30 (1993), 1558–1573.
[125] R.M. Murty, N. Thain, Prime numbers in certain arithmetic progressions, Funct. Approx. Comment. Math. 35 (2006), 249–259.
[126] R.M. Murty, N. Thain, Pick’s theorem via Minkowski’s theorem, Amer. Math. Monthly 114 (2007), 732–736.
[127] W. Narkiewicz, Number theory, World Scientific, 1983.
[128] M.B. Nathanson, Elementary methods in number theory, Graduate Texts in Mathematics, 195, Springer, 2000.
[129] S. Newcomb, Note on the frequency of use of the different digits in natural numbers, Amer. J. Math. 4 (1881), 39–40.
[130] D.J. Newman, A simplified proof of the Erdős–Fuchs theorem, Proc. Amer. Math. Soc. 75 (1979), 209–210.

[131] H. Niederreiter, Random number generation and quasi-Monte Carlo methods, CBMS-NSF Regional Conference Series in Applied Mathematics, 63, SIAM, 1992.
[132] M. Nigrini, Benford’s law. Applications for forensic accounting, auditing, and fraud detection, John Wiley & Sons, 2012.
[133] M. Nigrini, L. Mittermaier, The use of Benford’s law as an aid in analytical procedures, Auditing - A Journal of Practice & Theory 16 (1997), 52–67.
[134] I. Niven, Irrational numbers, Carus Mathematical Monographs, 11, MAA, 2005.
[135] I. Niven, H. Zuckerman, An introduction to the theory of numbers, John Wiley & Sons, 1980.
[136] O. Ore, Number theory and its history, Dover Publications, 1988.
[137] L. Parnovski, N. Sidorova, Critical dimensions for counting lattice points in Euclidean annuli, Math. Model. Nat. Phenom. 5 (2010), 293–316.
[138] L. Parnovski, A. Sobolev, On the Bethe–Sommerfeld conjecture for the polyharmonic operator, Duke Math. J. 107 (2001), 209–238.
[139] R. Pinkham, On the distribution of first significant digits, Ann. Math. Stat. 32 (1961), 1223–1230.
[140] M.A. Pinsky, Introduction to Fourier analysis and wavelets, Graduate Studies in Mathematics, 102, American Mathematical Society, 2002.
[141] M. Plancherel, Contribution à l’étude de la représentation d’une fonction arbitraire par les intégrales définies, Rend. del Circ. Mat. Palermo 30 (1910), 298–335.
[142] A.N. Podkorytov, The asymptotic of a Fourier transform on a convex curve, Vestn. Leningr. Univ. Mat. 24 (1991), 57–65.
[143] S. Ramanujan, A proof of Bertrand’s postulate, J. Indian Math. Soc. 11 (1919), 181–182.
[144] B. Randol, On the Fourier transform of the indicator function of a planar set, Trans. Amer. Math. Soc. 139, 271–278.
[145] D. Redmond, Number theory. An introduction, Monographs and Textbooks in Pure and Applied Mathematics, 201, Marcel Dekker, 1996.
[146] E. Regazzini, Probability and statistics in Italy during the First World War I: Cantelli and the laws of large numbers, J. Électron. Hist. Probab. Stat. 1 (2005), 1–12.
[147] F. Ricci, G. Travaglini, Convex curves, Radon transforms and convolution operators defined by singular measures, Proc. Amer. Math. Soc. 129 (2001), 1739–1744.
[148] S. Robinson, Still guarding secrets after years of attacks, RSA earns accolades for its founders, SIAM News 36 5 (2003).
[149] K.F. Roth, On irregularities of distribution, Mathematika 1 (1954), 73–79.
[150] W. Rudin, Principles of mathematical analysis, International Series in Pure and Applied Mathematics, McGraw-Hill, 1976.
[151] J.D. Sally, P.J. Sally, Roots to research. A vertical development of mathematical problems, American Mathematical Society, 2007.
[152] W.M. Schmidt, Irregularities of distribution IV, Invent. Math. 7 (1968), 55–82.
[153] W.M. Schmidt, Lectures on irregularities of distribution, Tata Institute of Fundamental Research Lectures on Mathematics and Physics, 56, Tata Institute of Fundamental Research, Bombay, 1977.

[154] E. Scholz (Editor), Hermann Weyl’s Raum–Zeit–Materie and a general introduction to his scientific work, DMV Seminar 30, Birkhäuser, 2001.
[155] W. Sierpiński, O pewnem zagadnieniu z rachunku funkcyj asymptotycznych, Prace mat.-fiz. 17 (1906), 77–118.
[156] R.A. Silverman, Complex analysis with applications, Dover Publications, 1984.
[157] S. Singh, The code book: The science of secrecy from ancient Egypt to quantum cryptography, Doubleday Books, 1999.
[158] I.H. Sloan, H. Woźniakowski, When are quasi-Monte Carlo algorithms efficient for high dimensional integrals? J. Complexity 14 (1998), 1–33.
[159] P. Soardi, Serie di Fourier in più variabili, Unione Matematica Italiana – Pitagora, Quaderni dell’Unione Matematica Italiana 26, 1984.
[160] C.D. Sogge, Fourier integrals in classical analysis, Cambridge Tracts in Mathematics, 105, Cambridge University Press, 1993.
[161] E. Stein, R. Shakarchi, Fourier analysis, An introduction, Princeton Lectures in Analysis, I, Princeton University Press, 2003.
[162] E. Stein, R. Shakarchi, Real analysis, measure theory, integration, and Hilbert spaces, Princeton Lectures in Analysis, III, Princeton University Press, 2005.
[163] E. Stein, R. Shakarchi, Functional analysis, Princeton Lectures in Analysis IV, Princeton University Press, 2011.
[164] E. Stein, G. Weiss, Introduction to Fourier analysis in Euclidean spaces, Princeton Mathematical Series, 32, Princeton University Press, 1971.
[165] I.N. Stewart, D.O. Tall, Algebraic number theory, Chapman and Hall Mathematics Series, Chapman and Hall, 1979.
[166] J. Stillwell, Elements of algebra, geometry, numbers, equations, Undergraduate Texts in Mathematics, Springer, 1994.
[167] J. Stillwell, Elements of number theory, Undergraduate Texts in Mathematics, Springer, 2003.
[168] D.R. Stinson, Cryptography, theory and practice, CRC Press Series on Discrete Mathematics and its Applications, Chapman & Hall/CRC, 2002.
[169] M. Tarnopolska-Weiss, On the number of lattice points in planar domains, Proc. Amer. Math. Soc. 69 (1978), 308–311.
[170] G. Travaglini, Fejér kernels for Fourier series on $T^n$ and on compact Lie groups, Math. Z. 216 (1994), 265–281.
[171] G. Travaglini, Crittografia, Emmeciquadro 21 (2004), 21–28.
[172] G. Travaglini, Average decay of the Fourier transform, in ‘Fourier analysis and convexity’ (L. Brandolini, L. Colzani, A. Iosevich, G. Travaglini - Editors), Birkhäuser, 2004, 245–268.
[173] G. Travaglini, Appunti su teoria dei numeri, analisi di Fourier e distribuzione di punti, Unione Matematica Italiana – Pitagora, Quaderni dell’Unione Matematica Italiana 52, 2010.
[174] K. Tsang, Recent progress on the Dirichlet divisor problem and the mean square of the Riemann zeta-function, Sci. China Math. 53 (2010), 2561–2572.
[175] M. Tupputi, in preparation.
[176] J.D. Vaaler, Some extremal problems in Fourier analysis, Bull. Amer. Math. Soc. 12 (1985), 183–216.
[177] G. Voronoï, Sur un problème du calcul des fonctions asymptotiques, J. Reine Angew. Math. 126 (1903), 241–282.

[178] G.N. Watson, A treatise on the theory of Bessel functions, Cambridge Mathematical Library, Cambridge University Press, 1922.
[179] D.D. Wall, Normal numbers, Thesis, University of California, 1949.
[180] H. Weyl, Über ein problem aus dem gebiete der diophantischen approximationen, Nachr. Ges. Wiss. Göttingen (1914), 234–244.
[181] H. Weyl, Über die gleichverteilung von zahlen mod. eins, Math. Ann. 77 (1916), 313–352.
[182] R.L. Wheeden, A. Zygmund, Measure and integral. An introduction to real analysis, Pure and Applied Mathematics, 43, Marcel Dekker, 1977.
[183] Y. Zhang, Bounded gaps between primes, Ann. Math. 179 (2014), 1121–1174.
[184] G. Ziegler, The great prime number record races, Notices Amer. Math. Soc. 51 (2004), 414–416.
[185] A. Zygmund, Trigonometric series, I, II, Cambridge Mathematical Library, Cambridge University Press, 1993.

Index

b | a, 3
b ∤ a, 3
(a, b), 5
[·], 10
≡, 41
{·}, 82
|·|, 35
∼, 11
≈, 135
‖·‖, 138
A(L), 87
Absolute convergence of Fourier series, 78
Almost prime number, 51
$\left(\frac{a}{p}\right)_L$, 67
Arithmetic function, 19
Associated elements in Z[i], 95
Average decay, 170
$B_R$, 162
B(t, R), 34
Beck and Chen’s theorem, 209
Beck–Montgomery theorem, 205
Benford sequence, 128
Benford’s law, 128
Bertrand’s postulate, 16
Bertrand–Chebyshev theorem, 16
Bessel function, 215
Bessel’s inequality, 77
Borel–Cantelli theorem, 124
Caesar cipher, 59
Canonical decomposition, 7
card, 11
Cassels’ lemma, 207
Champernowne number, 126
Characteristic function, 109
Chebyshev’s theorem, 13
Chen’s theorem, 211
χ characteristic function, 109
$\chi_R$, 162
Chinese remainder theorem, 56
Cipolla’s theorem, 51
Complete set of residues, 43
Completeness of the trigonometric system, 110
Composite number, 6
Congruent, 41
Convex body, 88
Convolution on $R^d$, 151
Convolution on $T^d$, 113
Copeland and Erdős theorem, 126
Coprime numbers, 5
van der Corput conjecture, 194
van der Corput sequence, 194
van der Corput’s theorem, 186
$D(A_N)$, 134
d(n), 25
Davenport’s theorem, 173
Degree of a trigonometric polynomial, 114
δ(n), 99
Diaconis’ theorem, 129
Dirichlet convolution, 20
Dirichlet divisor problem, 32
Dirichlet function, 25
Dirichlet inverse, 21
Dirichlet kernel, 115
Dirichlet product, 20
Dirichlet’s approximation theorem, 89
Dirichlet’s theorem on primes in arithmetic progressions, 48
Dirichlet’s theorem on the average of d(n), 29
Dirichlet’s theorem on the average of σ(n), 32
Discrepancy in several variables, 193

Discrepancy in T, 134
$D_N$, 134
$D_{\mathrm{grid}}(N)$, 219
$D_{\mathrm{jittered}}(N)$, 220
$D_N^{C,\{t(1),\ldots,t(N)\}}$, 203
$D_N^{*}$, 141
$D_N(t)$, 203
$D_N^{C,\{t(1),\ldots,t(N)\}}(t)$, 203
$D_N(\sigma, t)$, 203
$D_N^{C,\{t(1),\ldots,t(N)\}}(\sigma, t)$, 203
$D_N(\varepsilon, \sigma, t)$, 203
$D_N^{C,\{t(1),\ldots,t(N)\}}(\varepsilon, \sigma, t)$, 203
$D_N^{MC}$, 205
Enigma, 60
Erdős and Fuchs’ theorem, 168
Erdős number, 51
Erdős’ theorem, 51
Erdős–Turán inequality, 138
Euclid’s lemma, 5
Euclid’s theorem, 7
Euclid–Euler theorem, 26
Euclidean algorithm, 5
Euclidean algorithm for Gaussian integers, 96
Euler function, 35
Euler’s criterion, 67
Euler’s theorem on the prime numbers of the form 4k + 1, 47
Euler’s theorem on the sum of the inverses of the primes, 8
Euler–Fermat theorem, 46
Euler–Mascheroni constant, 29
Euler–Maclaurin summation formula, 121
Exponential sum, 186
$\hat{f}(n)$, 77
fo, 214
Fejér kernel on T, 114
Fejér kernel on $T^d$, 115
Fejér mean, 115
Fejér’s theorem, 121
Fermat’s little theorem, 47
φ(n), 35
Fibonacci sequence, 17
First significant digit, 127
First significant r-block, 128
$\hat{f}(\xi)$, 150
Fourier coefficient in $T^d$, 113
Fourier coefficient in T, 77
Fourier inversion formula, 153
Fourier partial sum, 79
Fourier series, 78
Fourier transform, 150

Fractional part, 82
Frequency analysis, 59
Function of bounded variation, 140
Fundamental parallelepiped, 87
Fundamental theorem of arithmetic, 7
Fundamental theorem of arithmetic in Z[i], 97
γ, 29
Gamma function, 101
Γ(s), 101
Gauss circle problem, 34, 162
Gauss sum, 74
Gauss’ lemma, 68
Gauss’ theorem, 34
Gaussian integers, 94
Gaussian prime, 95
gcd, 5
Gegenbauer’s theorem, 24
Geometric discrepancy, 194
Girard–Fermat theorem, 92
Greatest common divisor, 5
Greatest common divisor in Z[i], 96
Gronwall’s theorem, 33
Hardy–Landau theorem, 165
Hilbert space, 76
Hilbert’s theorem, 103
Hölder’s inequality, 110
Hurwitz’s theorem, 91
I(n), 20
Integer number, 3
Integer point, 19
Integral part, 10
Interval in $T^d$, 108
ISBN code, 41
Jackson kernel, 136
Jacobi’s theorem on r(n), 98
Jacobi’s theorem on sums of two squares, 93
Jittered discrepancy, 211
Jordan’s theorem, 140
$J_\nu$, 215
Kendall’s theorem, 168
$K_L$, 87
$K_N$, 114
KNo, 114
Koksma’s inequality, 142
Koksma–Hlawka inequality, 145
Kronecker’s theorem, 108
$L^2(T)$, 77
$L^p(T^d)$, 110
λ(t), 219
$\ell^2(Z)$, 78

Lagrange’s theorem on finite groups, 46
Lagrange’s theorem on polynomial equations, 65
Lagrange’s theorem on sums of four squares, 100
Landau’s theorem, 38
Lattice, 87
Legendre symbol, 67
Mersenne prime, 27
Mertens’ theorem on the average of φ(n), 38
Mertens’ theorem on the sum of the inverses of the primes, 11
Minkowski’s inequality, 111
Minkowski’s integral inequality on $R^d$, 152
Minkowski’s integral inequality on $T^d$, 112
Minkowski’s theorem, 88
Möbius function, 23
Möbius’ inversion formula, 24
Monte Carlo discrepancy, 205
μ(n), 23
Multiplicative function, 21
N, 3
N(·), 94
N(n), 20
Natural number, 3
Normal number, 124
Orthonormal basis, 76
P, 6
π(x), 11
Parnovski and Sobolev theorem, 225
Parseval identity, 118
Partial summation formula, 139
Partition of unity, 176
Perfect numbers, 26
Pick’s theorem, 89
Piecewise continuous function, 78
Piecewise smooth function in one variable, 78
Piecewise smooth function on $T^d$, 145
Pigeonhole principle, 89
Plancherel’s identity, 155
Podkorytov’s theorem, 176
Poisson summation formula, 160
Polynomial congruences, 65
Prime number, 6
Prime number theorem, 11
Public key, 60
Q, 24
Q(x), 24
Quadratic congruence, 66
Quadratic non-residue, 66
Quadratic reciprocity law, 70
Quadratic residue, 66
r(n), 33
Radial function, 214
Reduced set of residues, 44
Residue class, 41
Riemann sum, 109
Riemann zeta function, 9
Riemann–Lebesgue lemma, 155
Roth’s theorem, 195
RSA encryption, 59
$\sum_{d \mid n}$, 20
S(q, a), 74
S(t, R), 34
s(x), 82
$S_N f$, 79
Sawtooth function, 82
Sierpinski’s theorem, 162
σ(n), 25
$\sigma_\theta(\cdot)$, 170
$S_j$, 211
S, 151
Square-free number, 24
Stirling’s formula, 129
Strong Benford sequence, 128
Strong Benford’s law, 128
Strong law of large numbers, 126
T, 77
$T^d$, 107
Θ, 158
Trigonometric polynomial, 114
u(n), 20
Uniformly distributed sequence, 108
Uniformly distributed sequence mod 1, 108
Unit in Z[i], 95
$V_f$, 140
Voronoï’s theorem, 183
Wall’s lemma, 124
Weyl criterion, 118
Weyl’s theorem, 109
Wilson’s theorem, 50
Z, 3
Z[i], 94
Z/mZ, 41
$(Z/mZ)^{\times}$, 46
Z/pZ, 46
Zhang’s theorem, 17
ζ(s), 9
0(n), 20