Number Theory: Sailing on the Sea of Number Theory: Proceedings of the 4th China-Japan Seminar 9789812708106, 981-270-810-3

This volume is not an ordinary proceedings volume assembling papers submitted but a collection of prestigious survey pap


English Pages 268 Year 2007


Table of contents :
Contents......Page 22
Preface......Page 6
Program......Page 14
1. The analytic continuation of multiple Dirichlet series......Page 24
2. An example of double Dirichlet series with a natural boundary......Page 27
3. Proof of Theorem 2.1......Page 31
4. Proof of Theorem 2.2......Page 33
5. An application to the Riesz mean......Page 37
6. The multiple case......Page 41
References......Page 44
1. Introduction......Page 47
2. Selmer groups......Page 49
3. Oddness of graphs......Page 51
4. New non-congruent numbers......Page 53
References......Page 60
1. Introduction......Page 62
2.1. Polynomial g(x)......Page 66
2.2. Upper bound for #E(p)......Page 68
2.3. Conjecture......Page 70
3.1. Structure of o as an η-module......Page 72
3.2. Relη and κ(η)......Page 82
3.3.1. Action of automorphisms......Page 86
3.3.2. Evaluation of κ(η)......Page 90
4.1. Case of η = id......Page 93
4.1.1. Case of real quadratic fields......Page 95
4.1.3. Case of non-cyclic abelian fields of degree 4......Page 96
4.1.4. Case of imaginary abelian fields of degree 4......Page 99
4.1.5. Case where F is the Galois closure of a real cubic field F0 with negative discriminant......Page 101
4.2. Case of complex conjugation......Page 104
4.2.1. Case of [F : Q] = 4......Page 105
4.2.2. Case of [F : Q] = 6......Page 106
4.3. Case where F is an imaginary abelian field with [F : Q] = 6 and the order of Gal(F/Q) is 3......Page 107
5.1. Divisors of f(p)......Page 110
5.2. Structure of Galois group extended by roots of units......Page 112
References......Page 119
2. The starting point......Page 120
3. Elliptic modular forms......Page 122
4. Siegel modular forms of genus two......Page 126
References......Page 129
Shifted Convolution Sums of Fourier Coefficients of Cusp Forms Yuk-Kam Lau, Jianya Liu and Yangbo Ye......Page 131
1.1. The classical case: the Riemann zeta-function and Dirichlet L-functions......Page 132
1.2. L-functions of degree two......Page 133
1.3. Rankin-Selberg L-functions......Page 134
1.5. Notations......Page 135
2.1. Spectral theory of automorphic forms......Page 136
2.2. The Rankin-Selberg method and shifted convolution sums......Page 137
3. Variants of the circle method......Page 138
3.1. The δ-symbol method......Page 139
3.2. Jutila’s variant......Page 142
4. The spectral method......Page 145
5. The spectral method: meromorphic continuation to σ > 1/2......Page 148
6.1. Further meromorphic continuation to σ > 1/2......Page 153
6.2. Illustration for the proof of Theorem 1.1......Page 155
References......Page 156
1. Introduction......Page 159
2. Part I: Generic polynomials of degree 3......Page 160
2.1. Generic polynomials of degree 3......Page 161
2.2. The Splitting Field of R(t;X)......Page 163
2.3. A Criterion for Kt Kt......Page 164
2.4. Notes on the Reducible Cases......Page 165
2.5. An Application: Parametrization of Unramified Cyclic Cubic Extensions of Quadratic Fields......Page 166
3.1. Elliptic Curves of the Form w^3 = u^3 + au^2 + bu + c......Page 168
3.2. Some facts on the Hessian curves......Page 171
3.3. Twists of Hessian Elliptic Curves (1)......Page 172
3.4. Twists of Hessian Elliptic Curves (2)......Page 174
3.5. Some cases of non-empty H (µ, t)[Q]......Page 175
References......Page 176
1.1. Modular Hyperbolas and Kloosterman Sums......Page 178
1.3. Acknowledgements......Page 180
2.1. Exponential and Character Sums......Page 181
2.2. Theory of Uniform Distribution......Page 182
2.3. Arithmetic Functions, Divisors, Prime Numbers......Page 184
3.1. Points on Ha,m in Intervals for All a......Page 186
3.2. Points on Ha,m in Intervals on Average Over a......Page 188
3.3. Points on Ha,m in Sets with Arithmetic Conditions......Page 192
4.1. Distances......Page 193
4.2. Convex Hull......Page 197
4.3. Visible Points......Page 200
5.2. Distribution of Angles in Some Point Sets......Page 203
5.4. Sato-Tate Conjecture in the “Vertical” Aspect......Page 204
5.6. Approximations by Sums of Two Rationals......Page 205
5.8. Computing Discrete Logarithms and Factoring......Page 206
6.1. Generalisations......Page 207
References......Page 208
1. Erdős-Heilbronn conjecture and the polynomial method......Page 213
2. Various sumsets with polynomial restrictions......Page 218
3. Snevily’s conjecture and additive theorems......Page 223
4. On a conjecture of Lev and related results......Page 227
5. Working with general abelian groups......Page 230
6. On value sets of polynomials......Page 233
References......Page 234
1. Preliminaries......Page 237
2. Assumptions......Page 240
3. Theorem......Page 242
4. A General Modular Relation associated to the Riemann Zeta-function......Page 246
5. A General Modular Relation associated to a product of the Riemann Zeta-function......Page 252
References......Page 257
1. Introduction......Page 260
2. ℓ-adic case: ℓ ≠ p......Page 261
3. p-adic case......Page 262
References......Page 264
Index......Page 266

NUMBER THEORY Sailing on the Sea of Number Theory

Series on Number Theory and Its Applications

ISSN 1793-3161

Series Editor: Shigeru Kanemitsu (Kinki University, Japan) Editorial Board Members: V. N. Chubarikov (Moscow State University, Russian Federation) Christopher Deninger (Universität Münster, Germany) Chaohua Jia (Chinese Academy of Sciences, PR China) H. Niederreiter (National University of Singapore, Singapore) M. Waldschmidt (Université Pierre et Marie Curie, France) Advisory Board: K. Ramachandra (Tata Institute of Fundamental Research, India (retired)) A. Schinzel (Polish Academy of Sciences, Poland)

Vol. 1 Arithmetic Geometry and Number Theory edited by Lin Weng & Iku Nakamura Vol. 2 Number Theory: Sailing on the Sea of Number Theory edited by S. Kanemitsu & J.-Y. Liu

Series on Number Theory and Its Applications Vol.2

NUMBER THEORY Sailing on the Sea of Number Theory Proceedings of the 4th China-Japan Seminar Weihai, China

30 August - 3 September 2006

Editors S. Kanemitsu (Kinki University, Japan) J.-Y. Liu (Shandong University, China)

World Scientific NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI

Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

Series on Number Theory and Its Applications — Vol. 2 NUMBER THEORY Sailing on the Sea of Number Theory Copyright © 2007 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN-13 978-981-270-810-6 ISBN-10 981-270-810-3

Printed in Singapore.


PREFACE The present volume is a collection of survey papers of the talks presented at the very successful fourth China-Japan Seminar on number theory. The seminar was held in Shandong University Academic Center, in Weihai, Shandong, People’s Republic of China, during August 30 – September 3, 2006, under the support of Shandong University, the Japan Society for the Promotion of Science (JSPS) and the National Natural Science Foundation of China (NSFC). The organizers Shigeru Kanemitsu and Jianya Liu would like to express their hearty thanks to Shandong University for their generosity and support. The title of the seminar reads “Sailing on the sea of number theory” which suggests that we are supposed to sail freely both on the sea of number theory and the sea of Weihai. The talks were given in a wide conference hall looking over the sea all round and the atmosphere was superb, and all the participants really got the feeling of sailing on the sea of number theory and the sea of Weihai.

Weihai Sea

It should be mentioned that Ms. Nan Luo, together with other staff members of Shandong University, helped the organizers throughout in preparing and conducting the seminar very successfully. The organizers would also like to thank two of their Chinese colleagues, Professors T. X. Cai and Z.-W. Sun, for their attractions. The former read his own poem, which has a passage referring to Pythagoras, and the latter made a spot announcement of some recent results in a live fashion.

Shandong University Academic Center (Weihai)

Traditionally, we have published the collection of papers presented at the seminars, but from this volume onwards we are going to publish survey papers which give good perspectives on recent progress of research in number theory and related fields. We would like to thank the authors of these papers for their kind cooperation in preparing excellent manuscripts and in the subsequent unification of style, which is intended to give the volume the quality of a readable book. In this volume we assemble the following papers:

1. S. Egami and K. Matsumoto, “Convolutions of the von Mangoldt Function and Related Dirichlet Series” has as its theme the analytic continuation of multiple Dirichlet series of various types and their possible natural boundaries. Analytic continuation is furnished largely by means of the Mellin-Barnes integral, which is a form of the definition of the beta-function. The authors consider Φ2, whose coefficients are given by the Abel convolution G2 of the von Mangoldt function, as an example of a multiple Dirichlet series which may have a natural boundary. They also consider the Riesz sum of G2 and obtain an asymptotic formula.

2. K.-Q. Feng and Y. Xue, “Constructing New Non-congruent Numbers by Graph Theory” concerns the application of graph theory (combined with results on elliptic curves) to finding non-congruent numbers with arbitrarily many prime factors. The main tool they use from elliptic curve theory is that if the rank of the group of rational points of a certain elliptic curve (with n in the coefficients) is 0, then n is a non-congruent number, and this condition in turn is realized if the corresponding Selmer groups have minimal size. This last condition is then checked on the basis of the oddness of suitable graphs. The reader can learn various facts of this kind about elliptic curves and graph theory.

3. Y. Kitaoka, “Distribution of Units of an Algebraic Number Field Modulo an Ideal” is a massive work expounding the author’s new investigation on the distribution of units modulo an ideal. Here the author combines algebraic structures with an analytic output, i.e. the density etc., and the work should open up fertile, as yet uncultivated land for the coming younger generation. Uncultivated because, compared with the extensive study of other ingredients of an algebraic number field such as class numbers, the units have not been paid much attention. Not only is the field promising, but the problem setting will be very beneficial to younger scientists: setting a problem in algebraic terms and incorporating analytic tools to arrive at some statistical data. In the paper the reader can learn practically all the notions and tools of algebraic number theory: the Frobenius automorphism, Hilbert’s ramification theory, Galois action, the Čebotarëv density theorem, and the Artin conjecture.

4. W. Kohnen, “Sign Changes of Fourier Coefficients and Eigenvalues of Cusp Forms”. The paper summarizes recent results on the sign change problem for the Fourier coefficients a(n) of cusp forms f (including Siegel modular forms). The Theorem in §1 about infinitely many sign changes is proved using the Hecke L-series for f and the Rankin-Selberg zeta-function, which motivates the study of the first occurrence of a sign change. For a normalized Hecke eigenform that is a newform of even integral weight k and level N, sharp bounds for the occurrence of the first sign change are obtained, with or without the symmetric square L-function and the Hecke relations, (sub-)convexity bounds, the prime number theorem, etc. The same ideas give results including sign changes in short intervals.
Sign changes for Siegel modular forms of genus 2 are also summarized. Infinitely many sign changes may not occur in general, yet there are theorems with a flavor similar to those for the elliptic modular case. The proof uses the spinor zeta-function in place of the Hecke L-series.

5. Y.-K. Lau, J.-Y. Liu and Y.-B. Ye, “Shifted Convolution Sums of Fourier Coefficients of Cusp Forms”. Let α denote the infimum of the exponents of t in the estimates of the Riemann zeta-function on the critical line σ = 1/2. The GLH (Generalized Lindelöf Hypothesis), which follows from the GRH (Generalized Riemann Hypothesis), implies α = 0. The convexity bound, which is obtained with the aid of the Phragmén-Lindelöf convexity principle, is α = 1/4, and the authors call any improvement on the convexity bound a subconvexity bound, while the bound α = 1/6, which is the 2/3 power of the convexity bound and is due to Weyl, is called a Weyl-like bound. The paper summarizes the research made hitherto toward these subconvexity and Weyl-like bounds for automorphic L-functions. The authors speak of L-functions of degree n according to the degree of the generic polynomial factor in p^{−s} in their Euler product, and after mentioning the degree 2 case they present their newest Weyl-like bound for the Rankin-Selberg L-function L(1/2 + it, f × g) formed from f, a holomorphic Hecke eigenform for Γ0(N) of weight k (or a Maass Hecke eigenform for Γ0(N) with Laplace eigenvalue 1/4 + k^2), and g, a fixed holomorphic or Maass cusp form for Γ0(N), or for Γ0(N′) with (N, N′) = 1. The Weyl-like bound obtained is 2/3 in the weight aspect, which is expounded in the authors’ most recent article of 78 pages. The method hinges on the meromorphic continuation of the shifted convolution sum formed from the Fourier coefficients. In the paper, the intermediate developments toward the above bound are explained in some detail, and the reader can learn a rich mixture of new methods which have their origin in analytic number theory and spectral theory.

6. K. Miyake, “Two Expositions on Arithmetic of Cubics”. The paper consists of two parts. Part I is devoted to the study of cubic generic polynomials R = R(t; x) and Q = Q(s; x) with parameters t and s for the symmetric group of degree 3 and the cyclic group of order 3, respectively. As an application, the divisibility of the class numbers of quadratic fields by 3 is obtained. For the algebraic tools used, the reader is referred to the recent books by G. Gras or J. Neukirch. Thus Part I motivates the study of cubic and, more generally, non-abelian fields. In Part II, two types of families of elliptic curves are considered whose sets of rational points over Q are described by certain subsets of cubic fields.
One type is given by w^3 = u^3 + au^2 + bu + c, whose short forms are Mordell curves, while the other family consists of the twists of the Hessian family of elliptic curves over the splitting field of the cubic polynomial R and also over the quadratic field contained in the splitting field. Part II therefore introduces the reader to the world of arithmetic on elliptic curves. Basic knowledge is to be found in the textbooks of J. Silverman and of Silverman and Tate.

7. I. Shparlinski, “Distribution of Points on Modular Hyperbolas”. For a positive integer m and an arbitrary integer a with gcd(a, m) = 1, the author defines the modular hyperbola as xy ≡ a (mod m) and denotes the set of all points (x, y) on it by H_{a,m}. The author considers the distribution and geometric properties of the points of H_{a,m} whose coordinates x and y lie in prescribed sets of integers X and Y, respectively; this set is denoted H_{a,m}(X, Y). Acquiring precise asymptotic formulas and establishing positivity of #H_{a,m}(X, Y) for various interesting sets have been the central theme in this area. However, as the author writes, “there is a large number of papers which rather routinely study various problems related to H_{a,m} on the case by case basis. Here, we explain some standard principles which can be used to derive these and many other results of similar spirit about the points on H_{a,m} as simple corollaries of just one general result about the uniformity of distribution of point on H_{a,m} in certain domains. In §3.1, this result, Theorem 10, is derived in a very straightforward fashion from (1.1), the known bound of Kloosterman sums, using some standard arguments”, and we learn one important lesson: if one keeps doing routine work, eventually all the results obtained will be relegated to history’s dustbin and will not be referred to as illustrative examples in subsequent publications. §§2.1–2.3 give a nice survey of the diversity of methods used to study the points on H_{a,m}, and in particular it is remarked that multiplicative character sums can sometimes give better results than Kloosterman sums. Various applications are described as well. In particular, in §5 one can find some interesting results, especially an unexpected result on torsion of elliptic curves in §5.4.

8. Z.-W. Sun, “A Survey of Problems and Results on Restricted Sumsets”. The paper gives a useful survey of recent results in additive number theory, currently an exciting area related to combinatorics. A restricted sumset has the form {a1 + · · · + an : a1 ∈ A1, . . . , an ∈ An, P(a1, ..., an) ≠ 0}, where A1, . . . , An are finite subsets of Z, or of a field or an abelian group, and P is a suitable polynomial. The book “Additive Combinatorics” (2006) by T. Tao and V.
Vu mainly summarizes important results on sumsets without restrictions, obtained by various different tools. The author is concerned with lower bounds for the cardinalities of various sumsets obtained by algebraic methods. §1 gives a rather enlightening discussion of the powerful “polynomial method” via Alon’s “Combinatorial Nullstellensatz”, the proof of which is given and which implies the Alon-Nathanson-Ruzsa theorem, generalizing Dias da Silva and Hamidoune’s extension of the Erdős-Heilbronn conjecture. In the remaining sections one can find recent developments by the author and his school. The survey also contains some open conjectures.

9. H. Tsukada, “A General Modular Relation in Analytic Number Theory” presents equivalent forms of the functional equation with multiple gamma factors satisfied by two sets of Dirichlet series, in the form of a modular relation between the corresponding two sets of H-function series, where H-function means the Fox H-function. As far as we have checked, there is no formula which does not come under this big umbral theorem. In the paper, there are some illustrative examples with multiple gamma factors. Hopefully, the theorem also covers those zeta-functions studied in the papers of Kohnen and Lau-Liu-Ye.

10. D.-Q. Wan, “L-functions of Function Fields”. Let F_q denote a finite field of q elements with characteristic p > 0. As a generalization of the case of C = the projective line, U = the projective line with the origin and infinity removed, and K = F_q(t), the function field, the following general situation is considered: Let C denote a smooth projective geometrically connected curve defined over F_q with function field K, with its absolute Galois group denoted by G_K, and let j : U → C be a Zariski open dense subset of C. Let F_ℓ be a finite extension of Q_ℓ, ℓ being a prime number. With V a finite dimensional vector space over F_ℓ, let ρ : G_K → GL(V) be a continuous representation of G_K unramified on U. Two L-functions are introduced, the L-function L(U, ρ) and the complete L-function of ρ on C, L(C, ρ), which differ by a finite number of Euler factors. The paper gives a concise survey of recent results on the analyticity of these L-functions, especially for those representations arising from geometry. It is stated that in the ℓ-adic case, with ℓ ≠ p, all ℓ-adic representations are essentially geometric from the L-function point of view, including the ℓ-adic function field analogue of Artin’s entireness conjecture. In the more complicated ℓ = p case, the location of zeros and poles on the compact unit disc is determined, and the author’s result says that if the p-adic representation is geometric, the L-function L(U, ρ) is p-adic meromorphic everywhere. The interested reader can go on by reading the references given.
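The restricted sumsets described in item 8 can be made concrete with a few lines of code. The sketch below is only an illustration: the function name `restricted_sumset` and the choice of the prime 7 and the set A are assumptions made here, not taken from the survey. It takes A1 = A2 = A in Z/pZ and P(a1, a2) = a1 − a2, so that only sums of distinct elements are kept, and compares the size of the result with the lower bound min(p, 2|A| − 3) of the Dias da Silva-Hamidoune theorem (the former Erdős-Heilbronn conjecture); the preface names this result but does not state the bound.

```python
from itertools import product


def restricted_sumset(sets, modulus, allowed):
    """Return {a_1 + ... + a_n mod modulus} over all tuples (a_1, ..., a_n)
    with a_i taken from sets[i] and allowed(a_1, ..., a_n) true.

    `allowed` plays the role of the condition P(a_1, ..., a_n) != 0.
    """
    return {sum(t) % modulus for t in product(*sets) if allowed(*t)}


# Hypothetical example data: A is a subset of Z/7Z.
p = 7
A = {0, 1, 2, 3}

# P(a, b) = a - b, so the restriction keeps only sums of distinct elements.
distinct_sums = restricted_sumset([A, A], p, lambda a, b: (a - b) % p != 0)

# Lower bound from the Dias da Silva-Hamidoune theorem.
lower_bound = min(p, 2 * len(A) - 3)

print(sorted(distinct_sums), len(distinct_sums), lower_bound)
```

For this choice of A the restricted sumset has exactly min(p, 2|A| − 3) elements, so the bound is attained; arithmetic progressions are the typical extremal sets.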
All in all, the interested reader can really benefit from going through this volume, and we hope to continue this work of putting together recent developments in handy volumes. The editors would like to express their hearty thanks to Professor Haruo Tsukada, Dr. Jing Ma and Ms. Nan Luo for their devoted help toward the completion of this volume. Dr. Jing Ma, especially, spent an immense amount of time editing the final versions of all the papers; without her help, the volume could not have been completed so well and in time. Professor Haruo Tsukada kindly checked the final version of the manuscript and made essential improvements therein, for which the editors would like to express their hearty gratitude. Last but not least, thanks are due to the editor at World Scientific, Mrs. Ji Zhang, for her kind and timely help throughout the preparation of the proceedings. As usual, we end by recording a poem describing our feeling, and this time it is the following.

The fifth China-Japan Seminar will be held in Osaka, at Kinki University and we do hope we’ll meet there again. Wo men xi wang zai jian dao ni men. Jin Gunagzi and Liu Jianya—the editors-organizers


PROGRAM

Wednesday, August 30, 2006

9:00–9:40 The Organizers, Opening Address

Morning Session (Chair: Jianya Liu)
10:00–10:40 Andrzej Schinzel, The Number of Solutions in a Box of a Linear Homogeneous Congruence
11:00–11:40 Shou-Wu Zhang, Periods Integrals and Special Values of L-series

Afternoon Session (Chair: Krishnaswami Alladi)
14:30–15:00 Zhi-Wei Sun, Curious Identities and Congruences Involving Bernoulli Polynomials
15:20–15:40 Shigeki Akiyama, Rational Based Number System and Mahler’s Problem
15:50–16:00 Masayuki Toda, On Gauss’ Formula for ψ and Finite Expressions for the L-series at 1
16:20–16:40 Xiumin Ren, Estimates of Exponential Sums over Primes and Applications in Number Theory
16:50–17:05 Haruo Tsukada, A General Modular Relation Associated with the Riemann Zeta-function
17:10–17:20 Takako Kuzumaki Kobayashi, A Transformation Formula for Certain Lambert Series


Thursday, August 31, 2006

Morning Session (Chair: Vladimir N. Chubarikov)
9:00–9:40 Krishnaswami Alladi, New Approaches to Jacobi’s Triple Product Identity and a Quadruple Product Extension
10:00–10:40 Trevor D. Wooley, Waring’s Problem in Function Fields
11:00–11:40 Yangbo Ye, A New Bound for Rankin-Selberg L-functions

Afternoon Session (Chair: Yoshio Tanigawa)
14:30–15:00 Wenpeng Zhang, The Mean Value of Dedekind Sum and Cochrane Sum in Short Intervals
15:20–15:50 Isao Wakabayashi, Some Thue Equations and Continued Fractions
16:10–16:30 Yuk-Kam Lau, The Error Terms in Dirichlet’s Divisor Problem, the Circle Problem and the Mean Square Formula of the Riemann Zeta-function
16:40–17:00 Wenguang Zhai, On the Fourth Power Moments of ∆(x) and E(t)
17:10–17:20 Tianxin Cai, A Generalization of a Curious Congruence of Harmonic Numbers
17:40–17:50 Kentaro Ihara, On the Structure of Algebra of Multiple Zeta Values

Saturday, September 2, 2006

Morning Session (Chair: Kohji Matsumoto)
9:00–9:40 Vladimir N. Chubarikov, Trigonometric Sums in Number Theory, Analysis and Probability Theory
10:00–10:40 Winfried Kohnen, Sign Changes of Fourier Coefficients and Hecke Eigenvalues of Cusp Forms
11:00–11:40 Igor Shparlinski, Exponential and Character Sums with Combinatorial Sequences

Afternoon Session (Chair: Wenpeng Zhang)
14:30–15:00 Chaohua Jia, On a Conjecture of Yiming Long
15:10–15:40 Koichi Kawada, On the Sum of Five Cubes of Primes
15:50–16:10 Honggang Xia, On Zeros of Cubic L-functions
16:20–16:40 Zhiguo Liu, A Theta Function Identity to the Quintic Base
16:50–17:05 Masaki Sudo, On the Exponential Equations a^x − b^y = c (1 ≤ c ≤ 300)
17:25–17:40 Yoshinobu Nakai, On a Function in the Biquadratic Theta-Weyl Sums
17:45–17:55 Deyu Zhang, Zero Density Estimates for Automorphic L-functions

Sunday, September 3, 2006

Morning Session (Chair: Winfried Kohnen)
9:00–9:40 Daqing Wan, L-functions of Infinite Symmetric Powers
10:00–10:30 Yoshiyuki Kitaoka, Distribution of Units of an Algebraic Number Field
10:50–11:20 Kohji Matsumoto, The Riesz Mean of the Convolution Product of Von Mangoldt Functions and the Related Zeta-function
11:30–12:00 Katsuya Miyake, Twists of Hessian Elliptic Curves and Cubic Fields

Afternoon Session (Chair: Shigeru Kanemitsu)
14:30–15:00 Leo Murata, On a Property of the Multiplicative Order of a (mod p)
15:20–15:50 Yonggao Chen, On the Prime Power Factorization of n!
16:00–16:20 Yoshio Tanigawa, Kronecker’s Limit Formula and the Hypergeometric Function
16:25–16:35 Guangshi Lü, Some Results in Classical Analytic Number Theory
16:40–16:50 Huaning Liu, Mean Value of Dirichlet L-functions and Applications to Pseudorandom Binary Sequences in Cryptography
16:55–17:05 Hailong Li, The Structural Elucidation of Eisenstein’s Formula


Advisory Committee: Professor Tao Zhan, Shandong University, China Organizing Committee: Professor Shigeru Kanemitsu, Kinki University, Japan Professor Jianya Liu, Shandong University, China Guest Speakers: Professor Krishnaswami Alladi, University of Florida, USA Professor Vladimir N. Chubarikov, Moscow State University, Russia Professor Winfried Kohnen, Universitaet Heidelberg, Germany Professor Andrzej Schinzel, Polish Academy of Sciences, Poland Professor Igor Shparlinski, Macquarie University, Australia Professor Daqing Wan, University of California, USA Professor Trevor D. Wooley, University of Michigan, USA Professor Yangbo Ye, University of Iowa, USA Professor Shou-Wu Zhang, Columbia University, USA

Peripatetic Mathemagicians


Speakers: Professor Shigeki Akiyama, Niigata University, Japan Professor Tianxin Cai, Zhejiang University, China Professor Yonggao Chen, Nanjing Normal University, China Professor Chaohua Jia, The Chinese Academy of Sciences, China Professor Koichi Kawada, Iwate University, Japan Professor Yoshiyuki Kitaoka, Meijo University, Japan Professor Yuk-Kam Lau, The University of Hong Kong, Hong Kong Professor Zhiguo Liu, East China Normal University, China Professor Kohji Matsumoto, Nagoya University, Japan Professor Katsuya Miyake, Waseda University, Japan Professor Leo Murata, Meiji-Gakuin University, Japan Professor Yoshinobu Nakai, Yamanashi University, Japan Professor Xiumin Ren, Shandong University, China Professor Masaki Sudo, Seikei University, Japan Professor Zhi-Wei Sun, Nanjing University, China Professor Yoshio Tanigawa, Nagoya University, Japan Professor Haruo Tsukada, Kinki University, Japan Professor Isao Wakabayashi, Seikei University, Japan Professor Honggang Xia, Shandong University, China Professor Wenguang Zhai, Shandong Normal University, China Professor Wenpeng Zhang, Northwest University, China

A Dead Heat Match

Short Communications: Dr. Kentaro Ihara, Kinki University, Japan Professor Takako Kuzumaki Kobayashi, Gifu University, Japan Professor Hailong Li, Weinan Teacher’s College, China Dr. Huaning Liu, Northwest University, China Professor Guangshi Lü, Shandong University, China Mr. Masayuki Toda, Kinki University, Japan Dr. Deyu Zhang, Shandong University, China


CONTENTS

Preface ...... v
Program ...... xiii
Convolutions of the von Mangoldt Function and Related Dirichlet Series, Shigeki Egami and Kohji Matsumoto ...... 1
Constructing New Non-congruent Numbers by Graph Theory, Keqin Feng and Yan Xue ...... 24
Distribution of Units of an Algebraic Number Field Modulo an Ideal, Yoshiyuki Kitaoka ...... 39
Sign Changes of Fourier Coefficients and Eigenvalues of Cusp Forms, Winfried Kohnen ...... 97
Shifted Convolution Sums of Fourier Coefficients of Cusp Forms, Yuk-Kam Lau, Jianya Liu and Yangbo Ye ...... 108
Two Expositions on Arithmetic of Cubics, Katsuya Miyake ...... 136
Distribution of Points on Modular Hyperbolas, Igor E. Shparlinski ...... 155
A Survey of Problems and Results on Restricted Sumsets, Zhi-Wei Sun ...... 190
A General Modular Relation in Analytic Number Theory, Haruo Tsukada ...... 214
L-Functions of Function Fields, Daqing Wan ...... 237
Index ...... 243


CONVOLUTIONS OF THE VON MANGOLDT FUNCTION AND RELATED DIRICHLET SERIES

SHIGEKI EGAMI
Faculty of Engineering, Toyama University, Gofuku, Toyama 930-8555, Japan
E-mail: [email protected]

KOHJI MATSUMOTO
Graduate School of Mathematics, Nagoya University, Chikusa-ku, Nagoya 464-8602, Japan
E-mail: [email protected]

In this paper, we first give a brief survey of the theory of meromorphic continuation and natural boundaries of multiple Dirichlet series. Then we consider the double Dirichlet series $\Phi_2(s)$ defined by the convolution of logarithmic derivatives of the Riemann zeta-function. In particular, we propose the conjecture that $\Phi_2(s)$ has a natural boundary on $\Re s = 1$, and give supporting evidence. We further present an application of $\Phi_2(s)$ to the Riesz mean, and discuss its multiple analogues.

1. The analytic continuation of multiple Dirichlet series

Let $s = \sigma + it$ be a complex variable, and $P(X_1, \dots, X_r)$ a polynomial with complex coefficients. The multiple zeta-function

\[
\zeta_r(s; P) = \sum_{m_1=1}^{\infty} \cdots \sum_{m_r=1}^{\infty} P(m_1, \dots, m_r)^{-s} \tag{1.1}
\]

was first studied by Mellin [29,30], and independently by Barnes [5,6] for $P$ a linear form, at the beginning of the 20th century. Mellin proved the meromorphic continuation of (1.1) to the whole complex plane $\mathbb{C}$ if all the coefficients of $P$ have positive real parts. Several mathematicians after Mellin proved the meromorphic continuation of (1.1) under weaker assumptions. At present, the assumption $(H_0 S)$ introduced by Essouabri [12] is the weakest. Essouabri [11] also pointed out that the multi-variable generalization

\[
\zeta_r(s_1, \dots, s_n; P_1, \dots, P_n) = \sum_{m_1=1}^{\infty} \cdots \sum_{m_r=1}^{\infty} P_1(m_1, \dots, m_r)^{-s_1} \times \cdots \times P_n(m_1, \dots, m_r)^{-s_n} \tag{1.2}
\]

of (1.1), where $s_1, \dots, s_n \in \mathbb{C}$ and $P_1, \dots, P_n \in \mathbb{C}[X_1, \dots, X_r]$, can be continued meromorphically to the whole space $\mathbb{C}^n$ under the same type of assumption. A special type of multi-variable multiple series

\[
\zeta_{EZ,r}(s_1, \dots, s_r) = \sum_{m_1=1}^{\infty} \cdots \sum_{m_r=1}^{\infty} m_1^{-s_1} (m_1 + m_2)^{-s_2} \times \cdots \times (m_1 + \cdots + m_r)^{-s_r}, \tag{1.3}
\]

which is called the Euler-Zagier $r$-fold sum, has been studied extensively in recent years. The meromorphic continuation of (1.3) to $\mathbb{C}^r$ is included in the above theorem of Essouabri [11], but [11] is unpublished. Various different proofs of the continuation were published by Arakawa and Kaneko [3], Zhao [37], Akiyama, Egami and Tanigawa [1], and the second-named author [27]. The method in [27] is based on the Mellin-Barnes integral formula

\[
(1 + \lambda)^{-s} = \frac{1}{2\pi i} \int_{(c)} \frac{\Gamma(s - z)\Gamma(z)}{\Gamma(s)} \lambda^{-z} \, dz \tag{1.4}
\]

(where $s, \lambda \in \mathbb{C}$, $\lambda \neq 0$, $|\arg \lambda| < \pi$, $\Re s > 0$, $0 < c < \Re s$, and the path of integration is the vertical line from $c - i\infty$ to $c + i\infty$), which was already used in Mellin’s papers [29,30]. For arithmetical applications, it is important to consider various multiple Dirichlet series with arithmetical coefficients. Peter [32] discussed the analytic continuation of the series

\[
\sum_{m_1=1}^{\infty} \cdots \sum_{m_r=1}^{\infty} \frac{a_1(m_1) \cdots a_r(m_r)}{P(m_1, \dots, m_r)^{s}}, \tag{1.5}
\]

where ak (mk ) (1 ≤ k ≤ r) are complex numbers. Actually he treated the more general situation that P (m1 , . . . , mr ) in the denominator is replaced by P (λ1 (m1 ), . . . , λr (mr )), where λk (m) are complex numbers in a certain fixed cone on C satisfying limm→∞ |λk (m)| = ∞ (1 ≤ k ≤ r). The multivariable series ∞ ∞ X X f (m1 , . . . , mr ) , (1.6) ··· ms11 · · · msrr m =1 m =1 1

r

CONVOLUTIONS OF THE VON MANGOLDT FUNCTION

3

where f (m1 , . . . , mr ) is a non-negative arithmetical function, was studied by de la Bret`eche [8]. In connection with sums of the Euler-Zagier type, multiple L-series defined by twisting (1.3) by Dirichlet characters have been investigated by Goncharov [19], Arakawa and Kaneko [3,4], Akiyama and Ishikawa [2], and Ishikawa [21,22]. More generally, we may claim that if Dirichlet series ϕk (s) =

∞ X ak (m) , ms m=1

1≤k≤r

(1.7)

behave nicely, then we can show that the multiple Dirichlet series of the form ∞ ∞ X X a1 (m1 ) a2 (m2 ) ··· Φr (s1 , . . . , sr ; ϕ1 , . . . , ϕr ) = ms11 (m1 + m2 )s2 mr =1 m1 =1 (1.8) ar (mr ) × ··· × (m1 + · · · + mr )sr

also behaves nicely. In fact, the following theorem was proved in Matsumoto and Tanigawa [28].

Theorem 1.1 ([28]). Assume that $\varphi_k(s)$ ($1 \le k \le r$) are absolutely convergent for $\sigma > \alpha_k$ ($> 0$), can be continued meromorphically to the whole plane $\mathbb{C}$, are holomorphic except for a possible pole (of order at most 1) at $s = \alpha_k$, and are of polynomial order in any fixed strip $\sigma_1 \le \sigma \le \sigma_2$. Then $\Phi_r(s_1, \ldots, s_r; \varphi_1, \ldots, \varphi_r)$ can be continued meromorphically to the whole space $\mathbb{C}^r$, and the location of its possible singularities can be described explicitly. In particular, if all $\varphi_k(s)$ are entire, then $\Phi_r(s_1, \ldots, s_r; \varphi_1, \ldots, \varphi_r)$ is also entire.

The proof of the above theorem is an analogue of the second-named author's proof of the meromorphic continuation of (1.3) given in [27], whose basic tool is the Mellin-Barnes formula (1.4). The idea of applying formula (1.4) in such a situation had already been mentioned by the first-named author [10] in the one-variable case.

The authors express their sincere gratitude to Professor Gautami Bhowmik for pointing out an error in the original manuscript, and for useful suggestions. In particular, the form of Conjecture (B) below was first suggested by her.


SHIGEKI EGAMI AND KOHJI MATSUMOTO

2. An example of double Dirichlet series with a natural boundary

In Theorem 1.1, there is the condition that each $\varphi_k(s)$ is holomorphic except for only one possible pole. Actually it is possible to prove a result of similar type under the weaker condition that each $\varphi_k(s)$ has finitely many poles. However, if some $\varphi_k(s)$ has infinitely many poles, the behaviour of the multiple series $\Phi_r(s_1, \ldots, s_r; \varphi_1, \ldots, \varphi_r)$ may be quite different. The following simple example illustrates this phenomenon.

Let $\Lambda(n)$ be the von Mangoldt function, and
\[
M(s) = -\frac{\zeta'}{\zeta}(s) = \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^{s}}, \tag{2.1}
\]
where $\zeta(s)$ is the Riemann zeta-function. Then $M(s)$ is meromorphic in the whole plane, and has infinitely many poles because all zeros of $\zeta(s)$ are poles of $M(s)$. In fact it is known that
\[
N(T) \sim \frac{1}{2\pi} T \log T, \qquad T \ge 2 \tag{2.2}
\]
(Theorem 9.4 of Titchmarsh [36]), where $N(T)$ is the number of zeros (counted with multiplicity) of $\zeta(s)$ in the region $0 < \sigma < 1$, $0 < t \le T$, which is expected to be equal to the number of poles of $M(s)$ in the same region because all zeros of $\zeta(s)$ are conjectured to be simple. Let
\[
\Phi_2(s) = \Phi_2(0, s; M, M) = \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} \frac{\Lambda(k)\Lambda(m)}{(k + m)^{s}}. \tag{2.3}
\]
This can be rewritten as
\[
\Phi_2(s) = \sum_{n=1}^{\infty} \frac{G_2(n)}{n^{s}}, \tag{2.4}
\]
where
\[
G_2(n) = \sum_{k+m=n} \Lambda(k)\Lambda(m). \tag{2.5}
\]
The series on the right-hand side of (2.3), (2.4) is absolutely convergent for $\Re s > 2$, because
\[
G_2(n) \le \sum_{k=1}^{n-1} \log k \cdot \log(n - k) \le n (\log n)^2. \tag{2.6}
\]
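As a small numerical illustration of (2.5) and (2.6), the following Python sketch tabulates $\Lambda(n)$ with a sieve and evaluates $G_2(n)$ directly from its definition. The function names `von_mangoldt_table` and `goldbach_G2` and the cut-off 200 are assumptions made for this example only; they do not come from the paper.

```python
import math

def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(2 * p, limit + 1, p):
                is_prime[multiple] = False
            # Lambda(p^j) = log p for every prime power p^j <= limit
            q = p
            while q <= limit:
                lam[q] = math.log(p)
                q *= p
    return lam

def goldbach_G2(n, lam):
    """G_2(n) = sum over k + m = n (k, m >= 1) of Lambda(k) * Lambda(m), as in (2.5)."""
    return sum(lam[k] * lam[n - k] for k in range(1, n))

lam = von_mangoldt_table(200)          # hypothetical cut-off for the demonstration
for n in (4, 10, 100, 200):
    g2 = goldbach_G2(n, lam)
    bound = n * math.log(n) ** 2       # right-hand side of (2.6)
    print(n, round(g2, 4), round(bound, 4))
```

The direct double loop costs $O(n)$ operations per value of $n$, which is enough for checking the inequality on a small range; it is not meant as an efficient way to tabulate $G_2$.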

In the present paper we will show, under the assumption of certain conjectures, that $\Phi_2(s)$ has a natural boundary on the line $\Re s = 1$ (Theorem 2.2 below). Therefore it seems that the behaviour of $\Phi_2(s)$ is completely different from that of the multiple series studied in [28].

The history of the investigation of natural boundaries of Dirichlet series also goes back to the beginning of the 20th century. The analytic continuation and the natural boundary of the function $\sum_p p^{-s}$ ($p$ runs over primes) were studied by Kluyver [23], Landau [25], and Landau and Walfisz [26]. In 1928, Estermann published two papers [13], [14] on natural boundaries of Dirichlet series. In the former paper [13], he considered a certain class of Dirichlet series which have Euler products, and gave a criterion for when the series can be continued to the whole plane and when it has a natural boundary. The continuation and natural boundaries of Euler products were further studied in more general situations by several mathematicians such as Dahlquist [9] and Kurokawa [24]. A multi-variable generalization was recently discussed by Bhowmik, Essouabri and Lichtin [7].

The results in the present paper give a different direction of research on natural boundaries of Dirichlet series. A part of the present work was already announced on the occasion of a conference on number theory (in honour of Professor Akio Fujii) held at Rikkyo University, Tokyo, in January 2005. On the other hand, independently of the present work, Tanigawa and Zhai [35] have considered Dirichlet series which are more general than ours, and have discussed the same type of problems (except for the Riesz mean). Their proof of the claim on natural boundaries (Theorem 1.3 of [35]) seems incomplete; some condition similar to our (B) below seems to be necessary to verify their argument.

We mention here the number-theoretic motivation of the study of $\Phi_2(s)$. The function $G_2(n)$ defined by (2.5) is a classical subject matter of number theory, because it is connected with the famous conjecture of C. Goldbach (that is, any even integer ($\ge 4$) can be expressed as a sum of two primes); in fact, the conjecture implies that $G_2(n) > 0$ for all even $n \ge 4$. Fujii [15] studied the mean value of $G_2(n)$ and proved that, if we assume the Riemann hypothesis (RH) for $\zeta(s)$, then
\[
\sum_{n \le X} G_2(n) = \frac{1}{2} X^2 + O(X^{3/2}) \tag{2.7}
\]
for any large positive $X$. In [16], Fujii improved his result to obtain
\[
\sum_{n \le X} G_2(n) = \frac{1}{2} X^2 - H(X) + O((X \log X)^{4/3}) \tag{2.8}
\]

under RH. Here
\[
H(X) = 2 \sum_{\rho} \frac{X^{1+\rho}}{\rho(1 + \rho)},
\]
where $\rho$ runs over the non-trivial zeros of $\zeta(s)$, counted with multiplicity. From the work [20] of Hardy and Littlewood it is expected that $G_2(n)$ for even $n$ is approximated by $n S_2(n)$, where
\[
S_2(n) = \prod_{p \mid n} \left(1 + \frac{1}{p - 1}\right) \prod_{(p,n)=1} \left(1 - \frac{1}{(p - 1)^2}\right). \tag{2.9}
\]
Moreover it follows from Lemma 1 of Montgomery and Vaughan [31] that
\[
\sum_{n \le X} n S_2(n) = \frac{1}{2} X^2 + O(X \log X). \tag{2.10}
\]
From this viewpoint, Fujii [16] reformulated his formula (2.8) into
\[
\sum_{n \le X} (G_2(n) - n S_2(n)) = -H(X) + O((X \log X)^{4/3}). \tag{2.11}
\]
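The singular series (2.9) and the mean value (2.10) can be explored numerically. The sketch below is only an approximation: the infinite product is truncated at a hypothetical cut-off `PRIME_CUTOFF`, and the names `primes_up_to` and `singular_series_S2` are chosen here for illustration. Note that for odd $n$ the factor at $p = 2$ vanishes, so $S_2(n) = 0$.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes; returns all primes <= limit (limit >= 1)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [p for p in range(2, limit + 1) if sieve[p]]

PRIME_CUTOFF = 2000                    # assumption: truncation point of the product
PRIMES = primes_up_to(PRIME_CUTOFF)

def singular_series_S2(n, primes=PRIMES):
    """Truncated form of (2.9): only the primes in `primes` are used."""
    value = 1.0
    for p in primes:
        if n % p == 0:
            value *= 1.0 + 1.0 / (p - 1)
        else:
            value *= 1.0 - 1.0 / (p - 1) ** 2
    return value

X = 1000                               # must stay below PRIME_CUTOFF for this sketch
total = sum(n * singular_series_S2(n) for n in range(1, X + 1))
print(total, X * X / 2)                # compare with the main term in (2.10)
```

Keeping $X$ below the cut-off ensures that every prime divisor of $n \le X$ is seen by the truncated product; the omitted factors for larger primes are all of the form $1 - 1/(p-1)^2$ and are close to 1.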

Hence the term $H(X)$ represents the main oscillation in the above formulation of Goldbach's problem. Some properties of $H(X)$ have been studied in Fujii [17]. By (2.4) and Perron's formula we have
\[
\sum_{n \le X} G_2(n) = \frac{1}{2\pi i} \int_{c - iT}^{c + iT} \Phi_2(s) \frac{X^{s}}{s} \, ds + O(T^{-1} X^{2+\varepsilon}) \tag{2.12}
\]
with $c > 2$. Therefore the study of $\Phi_2(s)$ will be useful to understand the behaviour of $G_2(n)$. In the next section we will prove the following:

Theorem 2.1 (under RH). The function $\Phi_2(s)$ can be continued meromorphically to the half-plane $\Re s > 1$, and is holomorphic there except for the simple poles at $s = 2$ (with residue 1) and $s = 1 + \rho$ (with residue $-2n(\rho)/\rho$) for any non-trivial zero $\rho$ of $\zeta(s)$, where $n(\rho)$ is the multiplicity of $\rho$.

By this theorem, we can shift (under RH) the path of integration on the right-hand side of (2.12) to $\Re s = 1 + \varepsilon$. We encounter the poles $s = 2$ and $s = 1 + \rho$, and the sum of their residues is $(1/2)X^2 - H(X)$, which coincides with the explicit terms on the right-hand side of (2.8). In particular, we find that the properties of $H(X)$ are closely connected with the behaviour of $\Phi_2(s)$ on the line $\Re s = 3/2$.
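The main term $\tfrac{1}{2}X^2$ in (2.7) and (2.8) can be observed numerically on a small range. The following sketch, with illustrative names `von_mangoldt_table` and `goldbach_partial_sum` that are not taken from the paper, uses the identity $\sum_{n \le X} G_2(n) = \sum_{k < X} \Lambda(k)\,\psi(X - k)$, where $\psi(x) = \sum_{m \le x} \Lambda(m)$. It only illustrates the size of the main term and says nothing about the error terms.

```python
import math

def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(2 * p, limit + 1, p):
                is_prime[multiple] = False
            q = p
            while q <= limit:          # Lambda(p^j) = log p
                lam[q] = math.log(p)
                q *= p
    return lam

def goldbach_partial_sum(X, lam):
    """Sum of G_2(n) over n <= X, computed as sum_k Lambda(k) * psi(X - k)."""
    psi = [0.0] * (X + 1)              # psi[x] = sum of Lambda(m) for m <= x
    for x in range(1, X + 1):
        psi[x] = psi[x - 1] + lam[x]
    return sum(lam[k] * psi[X - k] for k in range(1, X))

for X in (500, 1000, 2000):            # sample values chosen for illustration
    lam = von_mangoldt_table(X)
    total = goldbach_partial_sum(X, lam)
    print(X, round(total), X * X // 2, round(total / (X * X / 2), 4))
```

The prefix sums make each evaluation linear in $X$, so the comparison can be pushed to much larger $X$ if desired.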

CONVOLUTIONS OF THE VON MANGOLDT FUNCTION


Next we consider the behaviour of $\Phi_2(s)$ on the line $\Re s = 1$. We propose the following:

Conjecture 2.1. The line $\Re s = 1$ is the natural boundary of $\Phi_2(s)$.

In the present paper we present evidence which supports the above conjecture. Let $I$ be the set of all imaginary parts of non-trivial zeros of $\zeta(s)$. A well-known conjecture speculates that the positive elements of $I$ would be linearly independent over the rationals. The following statement is a special case of this conjecture:

(A) If $\gamma_j \in I$ ($1 \le j \le 4$) and $\gamma_1 + \gamma_2 = \gamma_3 + \gamma_4$ ($\neq 0$), then $(\gamma_3, \gamma_4)$ equals $(\gamma_1, \gamma_2)$ or $(\gamma_2, \gamma_1)$.

These conjectures were mentioned on p.50 of Fujii [18]. In that paper Fujii made an extensive study of additive properties of the zeros of $\zeta(s)$. For instance he proved that the set $\{\gamma_1 + \gamma_2 \mid \gamma_1, \gamma_2 \in I, \gamma_1 > 0, \gamma_2 > 0\}$ is uniformly distributed mod 1 (Corollary 3 of [18]). Here we introduce the following quantitative version of (A):

(B) There exists a constant $\alpha$, with $0 < \alpha < \pi/2$, such that if $\gamma_j \in I$ ($1 \le j \le 4$), $\gamma_1 + \gamma_2 \neq 0$, and $(\gamma_3, \gamma_4)$ is neither equal to $(\gamma_1, \gamma_2)$ nor to $(\gamma_2, \gamma_1)$, then
\[
|(\gamma_1 + \gamma_2) - (\gamma_3 + \gamma_4)| \ge \exp\left(-\alpha(|\gamma_1| + |\gamma_2| + |\gamma_3| + |\gamma_4|)\right). \tag{2.13}
\]
Clearly (B) implies (A). In §4 of the present paper we will prove that, under RH, the set
\[
K = \{\kappa \mid \kappa = \gamma_1 + \gamma_2 \text{ for some } \gamma_1, \gamma_2 \in I\} \setminus \{0\} \tag{2.14}
\]
is dense in the whole set of real numbers $\mathbb{R}$. This result will yield the following theorem.

Theorem 2.2 (under RH). If we assume that (B) is true, then Conjecture 2.1 is true.

Hence the continuation achieved by Theorem 2.1 seems to be best possible. It is therefore not rash to propose the following

Conjecture 2.2. The error term on the right-hand side of (2.8) is to be $O(X^{1+\varepsilon})$ and $\Omega(X)$, where $\Omega(X)$ means that it is not $o(X)$.


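3. Proof of Theorem 2.1

In this section we prove Theorem 2.1. First we assume $\Re s > 2 + 2\varepsilon$. Then we have
\[
\Phi_2(s) = \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} \frac{\Lambda(k)\Lambda(m)}{(k + m)^{s}}
= \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} \frac{\Lambda(k)\Lambda(m)}{k^{s}} \left(1 + \frac{m}{k}\right)^{-s}. \tag{3.1}
\]
We apply the Mellin-Barnes formula (1.4) with $\lambda = m/k$ to (3.1) to obtain
\[
\begin{aligned}
\Phi_2(s) &= \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} \frac{\Lambda(k)\Lambda(m)}{k^{s}} \frac{1}{2\pi i} \int_{(c)} \frac{\Gamma(s - z)\Gamma(z)}{\Gamma(s)} \left(\frac{m}{k}\right)^{-z} dz \\
&= \frac{1}{2\pi i} \int_{(c)} \frac{\Gamma(s - z)\Gamma(z)}{\Gamma(s)} \sum_{k=1}^{\infty} \Lambda(k) k^{-s+z} \sum_{m=1}^{\infty} \Lambda(m) m^{-z} \, dz.
\end{aligned} \tag{3.2}
\]
The two infinite series in the integrand are convergent when $\sigma - c > 1$ and $c > 1$. These conditions, and also the condition $0 < c < \sigma$ (which is necessary to apply (1.4)), are satisfied by the choice $c = 1 + \varepsilon$. Under this choice of $c$, we have
\[
\Phi_2(s) = \frac{1}{2\pi i} \int_{(c)} \frac{\Gamma(s - z)\Gamma(z)}{\Gamma(s)} M(s - z) M(z) \, dz. \tag{3.3}
\]
The next step is to shift the path of integration from $\Re z = c = 1 + \varepsilon$ to $\Re z = -\varepsilon$. First we have to show that this shifting is possible. It is known that
\[
N(T + 1) - N(T) \ll \log T \tag{3.4}
\]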

for any $T \ge 2$, where $f \ll g$ means $f = O(g)$ (Theorem 9.2 of Titchmarsh [36]). Hence we can find an arbitrarily large $T$ such that
\[
|T - \gamma| \gg (\log T)^{-1} \tag{3.5}
\]
for any $\gamma \in I$. Combining (3.5) with the formula
\[
M(z) = -\sum_{|y - \gamma| < 1} \frac{1}{z - \rho} + O(\log(|y| + 2)), \qquad y = \Im z, \ \gamma = \Im \rho, \tag{3.6}
\]
[...] $\Re s > 1 - \varepsilon$, and hence holomorphic in that half-plane. Actually it is possible to continue this integral meromorphically to the whole plane, by shifting the path further to the left. The most difficult part is the second term
\[
B_2(s) = -\sum_{\rho} \frac{\Gamma(s - \rho)\Gamma(\rho)}{\Gamma(s)} M(s - \rho). \tag{3.10}
\]
The factor $\Gamma(s - \rho)$ has poles at $s = \rho - \ell$ ($\ell = 0, 1, 2, \ldots$), while the factor $M(s - \rho)$ has poles at $s = \rho + 1$ and at $s = \rho + \rho'$, where $\rho'$ denotes the

non-trivial zeros of $\zeta(s)$. In order to control this situation, we now assume RH (to the end of this section). Then the only poles of $B_2(s)$ in the region $\Re s > 1$ are $s = \rho + 1$ for non-trivial zeros $\rho$, and the residue there is
\[
-n(\rho) \frac{\Gamma(1)\Gamma(\rho)}{\Gamma(\rho + 1)} = -\frac{n(\rho)}{\rho}.
\]
These poles are isolated singularities, and hence $B_2(s)$ can be continued to $\Re s > 1$. This implies the meromorphic continuation of $\Phi_2(s)$ to $\Re s > 1$. The residue of $\Phi_2(s)$ at $s = 2$ is 1, and at $s = 1 + \rho$ is
\[
-\frac{n(\rho)}{\rho} - \frac{n(\rho)}{\rho} = -\frac{2n(\rho)}{\rho}.
\]

Now the proof of Theorem 2.1 is complete.

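4. Proof of Theorem 2.2

To prove Theorem 2.2, we use the classical explicit formula
\[
M(s) = b + \frac{1}{s - 1} + \frac{1}{2} \frac{\Gamma'}{\Gamma}\left(\frac{s}{2} + 1\right) - \sum_{\rho} \left(\frac{1}{s - \rho} + \frac{1}{\rho}\right), \tag{4.1}
\]
where $b = 1 + (C_0/2) - \log 2\pi$ and $C_0$ is Euler's constant (formula (2.12.7) of [36]). Substituting this into (3.10), for $\Re s > 1$ we obtain
\[
\begin{aligned}
B_2(s) &= -\sum_{\rho} \frac{\Gamma(s - \rho)\Gamma(\rho)}{\Gamma(s)} \left\{ b + \frac{1}{s - \rho - 1} + \frac{1}{2} \frac{\Gamma'}{\Gamma}\left(\frac{s - \rho}{2} + 1\right) \right\} \\
&\quad + \sum_{\rho} \sum_{\rho'} \frac{\Gamma(s - \rho)\Gamma(\rho)}{\Gamma(s)} \left(\frac{1}{s - \rho - \rho'} + \frac{1}{\rho'}\right) \\
&= B_{21}(s) + B_{22}(s),
\end{aligned} \tag{4.2}
\]
say. Clearly $B_{21}(s)$ is meromorphic on the whole plane, and has no pole on the line $\Re s = 1$. To investigate $B_{22}(s)$, we assume RH (to the end of this section), and rewrite $\rho = \rho_1 = 1/2 + i\gamma_1$ and $\rho' = \rho_2 = 1/2 + i\gamma_2$ to obtain
\[
B_{22}(s) = \frac{1}{\Gamma(s)} \sum_{\rho_1} \sum_{\rho_2} \frac{\Gamma(s + 1 - \rho_1)\Gamma(\rho_1)}{(s - \rho_1 - \rho_2)\rho_2}, \qquad \Re s > 1. \tag{4.3}
\]
Therefore $B_{22}(s)$ may behave singularly as $s$ tends to $\rho_1 + \rho_2$, that is, any point of the form $1 + i\kappa$ with $\kappa \in K$ (where $K$ is the set defined by (2.14)). Before studying this phenomenon closely, we first prove

Lemma 4.1 (under RH). The set $K$ is dense in $\mathbb{R}$.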


Proof. It is classically known that
\[
N(T) = \frac{1}{2\pi} T \log T - C_1 T + O(\log T), \qquad C_1 = \frac{1 + \log 2\pi}{2\pi}
\]
(Theorem 9.4 of [36]), and, under RH, the error term in the above formula can be replaced by $O(\log T / \log\log T)$ (Theorem 14.13 of [36]). Therefore, for any fixed $h > 0$, the number of zeros on the interval $(1/2 + iT, 1/2 + i(T + h)]$ is
\[
\begin{aligned}
&\frac{1}{2\pi}(T + h)\log(T + h) - C_1(T + h) - \frac{1}{2\pi} T \log T + C_1 T + O\left(\frac{\log T}{\log\log T}\right) \\
&\quad = \frac{1}{2\pi} T \left\{ \log T + \log\left(1 + \frac{h}{T}\right) \right\} + \frac{1}{2\pi} h \left\{ \log T + \log\left(1 + \frac{h}{T}\right) \right\} \\
&\qquad - C_1 h - \frac{1}{2\pi} T \log T + O\left(\frac{\log T}{\log\log T}\right) \\
&\quad = \frac{h}{2\pi} \log T + O\left(\frac{\log T}{\log\log T}\right).
\end{aligned} \tag{4.4}
\]
There exists a sufficiently large $T_0 = T_0(h)$, such that the right-hand side of (4.4) is positive for any $T \ge T_0$. Let $\alpha$ be any non-zero real number, and $\varepsilon$ be arbitrarily small. Then, by using this positivity, we can find a sufficiently large $T = T(\alpha, \varepsilon)$ and $\gamma_1, \gamma_2 \in I$, satisfying
\[
\gamma_1 \in (T + \alpha - \varepsilon/2, \ T + \alpha + \varepsilon/2], \qquad \gamma_2 \in (-T - \varepsilon/2, \ -T + \varepsilon/2].
\]
Hence $|\alpha - (\gamma_1 + \gamma_2)| < \varepsilon$. Moreover, if $\varepsilon < |\alpha|$, then $\gamma_2 \neq -\gamma_1$, so $\gamma_1 + \gamma_2 \in K$. Thus we conclude the assertion of the lemma.

In view of the above lemma we now know that the points of the form $1 + i\kappa$ ($\kappa \in K$) are dense on the line $\Re s = 1$. Now we assume (B), and prove the following

Lemma 4.2 (under RH and (B)). For any $\kappa \in K$, the function $B_{22}(s)$ tends to infinity as $s$ tends to $1 + i\kappa$ from the right.

Proof. By (A) we see that there is only one pair $(\gamma_1^0, \gamma_2^0)$ (and its reverse-ordered pair $(\gamma_2^0, \gamma_1^0)$) satisfying $\gamma_1^0 + \gamma_2^0 = \kappa$. Put $\rho_1^0 = (1/2) + i\gamma_1^0$, $\rho_2^0 =$

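$(1/2) + i\gamma_2^0$. Then
\[
\begin{aligned}
B_{22}(s) &= \frac{n(\rho_1^0) n(\rho_2^0)}{\Gamma(s)} \left\{ \frac{\Gamma(s + (1/2) - i\gamma_1^0)\Gamma((1/2) + i\gamma_1^0)}{(s - 1 - i\gamma_1^0 - i\gamma_2^0)((1/2) + i\gamma_2^0)} + \frac{\Gamma(s + (1/2) - i\gamma_2^0)\Gamma((1/2) + i\gamma_2^0)}{(s - 1 - i\gamma_1^0 - i\gamma_2^0)((1/2) + i\gamma_1^0)} \right\} \\
&\quad + \frac{1}{\Gamma(s)} \sum_{\gamma_1} \sum_{\gamma_2}{}^{*} \frac{\Gamma(s + 1 - \rho_1)\Gamma(\rho_1)}{(s - \rho_1 - \rho_2)\rho_2} \\
&= B_{22}^{*}(s) + B_{22}^{**}(s),
\end{aligned}
\]
say, where the symbol $\sum\sum^{*}$ means the sum over all $(\gamma_1, \gamma_2)$ satisfying $(\gamma_1, \gamma_2) \neq (\gamma_1^0, \gamma_2^0), (\gamma_2^0, \gamma_1^0)$. Then $B_{22}^{*}(s)$ is meromorphic on the whole plane, and its residue at $s = 1 + i\kappa = 1 + i(\gamma_1^0 + \gamma_2^0)$ is
\[
\begin{aligned}
&\frac{n(\rho_1^0) n(\rho_2^0)}{\Gamma(1 + i\kappa)} \left\{ \frac{\Gamma((3/2) + i(\kappa - \gamma_1^0))\Gamma((1/2) + i\gamma_1^0)}{(1/2) + i\gamma_2^0} + \frac{\Gamma((3/2) + i(\kappa - \gamma_2^0))\Gamma((1/2) + i\gamma_2^0)}{(1/2) + i\gamma_1^0} \right\} \\
&\quad = \frac{2 n(\rho_1^0) n(\rho_2^0)}{\Gamma(1 + i\kappa)} \Gamma(\rho_1^0)\Gamma(\rho_2^0),
\end{aligned} \tag{4.5}
\]
which does not vanish. That is, $B_{22}^{*}(s) \to \infty$ as $s \to 1 + i\kappa$. Therefore the remaining task is to show that $B_{22}^{**}(s)$ remains finite as $s \to 1 + i\kappa$. Putting $s = 1 + \eta + i\kappa$ ($\eta \ge 0$, small), we have
\[
B_{22}^{**}(1 + \eta + i\kappa) = \frac{1}{\Gamma(1 + \eta + i\kappa)} \sum_{\gamma_1} \sum_{\gamma_2}{}^{*} \frac{\Gamma((3/2) + \eta + i(\kappa - \gamma_1))\Gamma((1/2) + i\gamma_1)}{(\eta + i(\kappa - \gamma_1 - \gamma_2))((1/2) + i\gamma_2)}. \tag{4.6}
\]
To prove the lemma, it is enough to show that the right-hand side of (4.6) is absolutely convergent, uniformly in $\eta$. By using Stirling's formula we have
\[
B_{22}^{**}(1 + \eta + i\kappa) \ll \frac{1}{\Gamma(1 + \eta + i\kappa)} \sum_{\gamma_1} (|\kappa - \gamma_1| + 1)^{1+\eta} e^{-(\pi/2)(|\kappa - \gamma_1| + |\gamma_1|)} \sum_{\gamma_2}{}^{*} \frac{1}{|\kappa - \gamma_1 - \gamma_2|(1 + |\gamma_2|)}. \tag{4.7}
\]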

The inner sum on the right-hand side of (4.7) can be divided into
\[
\sum_{\cdots} + \sum_{\cdots} = \Sigma_1 + \Sigma_2,
\]
say, where $\lambda = \kappa - \gamma_1$. If $\lambda = 0$, then obviously $\Sigma_2 = O(1)$. If $\lambda > 0$, we divide $\Sigma_2$ as
\[
\Sigma_2 = \sum_{\cdots} + \sum_{\cdots} + \sum_{\gamma_2 > \lambda + 1} = \Sigma_{21} + \Sigma_{22} + \Sigma_{23},
\]

[...] $\Re s > r - 1$, and holomorphic there except for the simple poles at $s = r$ and $s = r - 1 + \rho$ for all non-trivial zeros $\rho$ of $\zeta(s)$. The residues at $s = r$ and $s = r - 1 + \rho$ are
\[
\frac{1}{(r - 1)!}, \qquad -\frac{r \cdot n(\rho)}{\rho(1 + \rho) \cdots (r - 2 + \rho)},
\]

respectively.

Proof. We prove this theorem by induction on $r$. When $r = 2$, this theorem is exactly Theorem 2.1. Assume that the theorem is true for $r - 1$. Applying (1.4) to (6.2), we obtain
\[
\Phi_r(s) = \frac{1}{2\pi i} \int_{(c)} \frac{\Gamma(s - z)\Gamma(z)}{\Gamma(s)} \Phi_{r-1}(s - z) M(z) \, dz \tag{6.4}
\]
for $\Re s > r$, where $1 < c < \Re s - (r - 1)$. Shift the path of integration to $\Re z = -\varepsilon$. By using the same $T$ as in (3.5), we can show that this shifting is possible. (Note that in the strip $-\varepsilon \le \Re z \le c$ the factor $\Phi_{r-1}(s - z)$ is in the domain of its absolute convergence, hence is $O(1)$.) The result is that
\[
\begin{aligned}
\Phi_r(s) &= \frac{\Phi_{r-1}(s - 1)}{s - 1} - \sum_{\rho} \frac{\Gamma(s - \rho)\Gamma(\rho)}{\Gamma(s)} \Phi_{r-1}(s - \rho) \\
&\quad - \Phi_{r-1}(s) \log 2\pi + \frac{1}{2\pi i} \int_{(-\varepsilon)} \frac{\Gamma(s - z)\Gamma(z)}{\Gamma(s)} \Phi_{r-1}(s - z) M(z) \, dz.
\end{aligned} \tag{6.5}
\]
Under the induction assumption, this expression gives the continuation of $\Phi_r(s)$ to $\Re s > r - 1$. Moreover, the residues of $\Phi_{r-1}(s - 1)/(s - 1)$ at $s = r$, $s = r - 1 + \rho$ are
\[
\frac{1}{(r - 1)!}, \qquad -\frac{(r - 1) n(\rho)}{\rho(1 + \rho) \cdots (r - 2 + \rho)},
\]

respectively, while the residue of
\[
B_r(s) = -\sum_{\rho} \frac{\Gamma(s - \rho)\Gamma(\rho)}{\Gamma(s)} \Phi_{r-1}(s - \rho)
\]
at $s = r - 1 + \rho$ is
\[
-n(\rho) \cdot \frac{\Gamma(r - 1)\Gamma(\rho)}{\Gamma(r - 1 + \rho)} \cdot \frac{1}{(r - 2)!} = -\frac{n(\rho)}{\rho(1 + \rho) \cdots (r - 2 + \rho)}.
\]

Hence the assertion of Theorem 6.1 follows.
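The Gamma-function identity used in the residue of $B_r(s)$ can be sanity-checked numerically. The sketch below is an illustration only: it replaces the complex zero $\rho$ by a positive real stand-in `x` (an assumption made so that the standard-library `math.gamma` applies), and the function names are hypothetical.

```python
import math

def residue_factor_gamma(r, x):
    """Gamma(r-1) * Gamma(x) / Gamma(r-1+x) / (r-2)!, with real x standing in for rho."""
    return math.gamma(r - 1) * math.gamma(x) / math.gamma(r - 1 + x) / math.factorial(r - 2)

def residue_factor_product(r, x):
    """1 / (x (1+x) ... (r-2+x)), the closed form used in the proof."""
    product = 1.0
    for j in range(r - 1):
        product *= x + j
    return 1.0 / product

for r in (2, 3, 5):
    for x in (0.5, 1.7):
        print(r, x, residue_factor_gamma(r, x), residue_factor_product(r, x))
```

Adding the coefficient $(r-1)$ coming from $\Phi_{r-1}(s-1)/(s-1)$ to the coefficient $1$ coming from $B_r(s)$ gives the factor $r$ in the residue stated in Theorem 6.1.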

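The function $\Phi_{r-1}(s - \rho)$ is singular at $s = r - 2 + \rho + \rho'$ for any non-trivial zero $\rho'$. Hence, in view of Lemma 4.1, it is natural to raise the following:

Conjecture 6.1. The line $\Re s = r - 1$ is the natural boundary of $\Phi_r(s)$.

In fact, under a certain assumption, we can show that
\[
\Phi_r(s) \sim \frac{r(r - 1)}{\Gamma(r - 1 + i\kappa)} n(\rho_1^0) n(\rho_2^0) \Gamma(\rho_1^0)\Gamma(\rho_2^0) \frac{1}{s - (r - 1 + i\kappa)} \tag{6.6}
\]
as $s \to r - 1 + i\kappa$ for any $\kappa = \gamma_1^0 + \gamma_2^0 \in K$. This implies, as in the proof of Theorem 2.2, that Conjecture 6.1 is true. When $r = 2$, (6.6) is nothing but (4.11), which has been shown under RH and (B). We prove (6.6) for general $r$ by induction. When $s \to r - 1 + i\kappa$, we have
\[
\begin{aligned}
\frac{\Phi_{r-1}(s - 1)}{s - 1} &\sim \frac{(r - 1)(r - 2)}{\Gamma(r - 2 + i\kappa)} n(\rho_1^0) n(\rho_2^0) \Gamma(\rho_1^0)\Gamma(\rho_2^0) \times \frac{1}{r - 2 + i\kappa} \cdot \frac{1}{(s - 1) - (r - 2 + i\kappa)} \\
&= \frac{(r - 1)(r - 2)}{\Gamma(r - 1 + i\kappa)} n(\rho_1^0) n(\rho_2^0) \Gamma(\rho_1^0)\Gamma(\rho_2^0) \frac{1}{s - (r - 1 + i\kappa)}
\end{aligned} \tag{6.7}
\]
by the induction assumption. Next, we divide $B_r(s)$ as
\[
\begin{aligned}
B_r(s) &= -n(\rho_1^0) \frac{\Gamma(s - \rho_1^0)\Gamma(\rho_1^0)}{\Gamma(s)} \Phi_{r-1}(s - \rho_1^0) - n(\rho_2^0) \frac{\Gamma(s - \rho_2^0)\Gamma(\rho_2^0)}{\Gamma(s)} \Phi_{r-1}(s - \rho_2^0) \\
&\quad - \sum_{\rho \neq \rho_1^0, \rho_2^0} \frac{\Gamma(s - \rho)\Gamma(\rho)}{\Gamma(s)} \Phi_{r-1}(s - \rho) \\
&= B_{r1}(s) + B_{r2}(s) + B_{r3}(s),
\end{aligned} \tag{6.8}
\]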

say. The factor $\Phi_{r-1}(s - \rho_1^0)$ has a pole at $s = r - 1 + i\kappa = r - 2 + \rho_1^0 + \rho_2^0$, whose residue is given by Theorem 6.1. Therefore
\[
\begin{aligned}
B_{r1}(s) &\sim -n(\rho_1^0) \frac{\Gamma(r - 2 + \rho_2^0)\Gamma(\rho_1^0)}{\Gamma(r - 1 + i\kappa)} \times \left( -\frac{(r - 1) n(\rho_2^0)}{\rho_2^0 (1 + \rho_2^0) \cdots (r - 3 + \rho_2^0)} \right) \frac{1}{s - (r - 1 + i\kappa)} \\
&= \frac{r - 1}{\Gamma(r - 1 + i\kappa)} n(\rho_1^0) n(\rho_2^0) \Gamma(\rho_1^0)\Gamma(\rho_2^0) \frac{1}{s - (r - 1 + i\kappa)}
\end{aligned}
\]
as $s \to r - 1 + i\kappa$. The asymptotic behaviour of $B_{r2}(s)$ when $s \to r - 1 + i\kappa$ is exactly the same. Therefore, if we assume

(C)$_r$ The sum $B_{r3}(s)$ remains finite when $s \to r - 1 + i\kappa$,

then we have
\[
B_r(s) \sim \frac{2(r - 1)}{\Gamma(r - 1 + i\kappa)} n(\rho_1^0) n(\rho_2^0) \Gamma(\rho_1^0)\Gamma(\rho_2^0) \frac{1}{s - (r - 1 + i\kappa)} \tag{6.9}
\]
as $s \to r - 1 + i\kappa$. From (6.5), (6.7) and (6.9), we obtain (6.6), which implies the following:

Theorem 6.2 (under RH). If we assume that (B) and (C)$_k$ ($k \le r$) are true, then Conjecture 6.1 is true.

References

1. S. Akiyama, S. Egami and Y. Tanigawa, Analytic continuation of multiple zeta functions and their values at non-positive integers, Acta Arith., 98 (2001), 107–116.
2. S. Akiyama and H. Ishikawa, On analytic continuation of multiple L-functions and related zeta-functions, in Analytic Number Theory, C. Jia and K. Matsumoto (eds.), Dev. Math., Vol.6, Kluwer, 2002, pp.1–16.
3. T. Arakawa and M. Kaneko, Multiple zeta values, poly-Bernoulli numbers, and related zeta functions, Nagoya Math. J., 153 (1999), 189–209.
4. T. Arakawa and M. Kaneko, On multiple L-values, J. Math. Soc. Japan, 56 (2004), 967–991.
5. E. W. Barnes, The theory of the double gamma function, Philos. Trans. Roy. Soc. (A), 196 (1901), 265–387.
6. E. W. Barnes, On the theory of the multiple gamma function, Trans. Cambridge Philos. Soc., 19 (1904), 374–425.
7. G. Bhowmik, D. Essouabri and B. Lichtin, Meromorphic continuation of multivariable Euler products, Forum Math., to appear.
8. R. de la Bretèche, Estimation de sommes multiples de fonctions arithmétiques, Compositio Math., 128 (2001), 261–298.
9. G. Dahlquist, On the analytic continuation of Eulerian products, Ark. Mat., 1 (1952), 533–555.
10. S. Egami, Some curious Dirichlet series, Sūrikaisekikenkyūsho Kōkyūroku, 1091, RIMS Kyoto Univ. (1999), 172–174.
11. D. Essouabri, Singularités des séries de Dirichlet associées à des polynômes de plusieurs variables et applications à la théorie analytique des nombres, Thèse, Univ. Henri Poincaré - Nancy I, 1995.
12. D. Essouabri, Singularités des séries de Dirichlet associées à des polynômes de plusieurs variables et applications en théorie analytique des nombres, Ann. Inst. Fourier, 47 (1997), 429–483.
13. T. Estermann, On certain functions represented by Dirichlet series, Proc. London Math. Soc., (2)27 (1928), 435–448.
14. T. Estermann, On a problem of analytic continuation, ibid., 471–482.
15. A. Fujii, An additive problem of prime numbers, Acta Arith., 58 (1991), 173–179.
16. A. Fujii, An additive problem of prime numbers II, Proc. Japan Acad., 67A (1991), 248–252.
17. A. Fujii, An additive problem of prime numbers III, ibid., 278–283.
18. A. Fujii, An additive theory of the zeros of the Riemann zeta function, Comment. Math. Univ. St. Pauli, 45 (1996), 49–116.
19. A. B. Goncharov, Multiple polylogarithms, cyclotomy and modular complexes, Math. Res. Letters, 5 (1998), 497–516.
20. G. H. Hardy and J. E. Littlewood, Some problems of "partitio numerorum" (V): A further contribution to the study of Goldbach's problem, Proc. London Math. Soc., (2)22 (1924), 46–56.
21. H. Ishikawa, On analytic properties of a multiple L-function, in Analytic Extension Formulas and their Applications, S. Saitoh et al. (eds.), Soc. Anal. Appl. Comput., Vol.9, Kluwer, 2001, pp.105–122.
22. H. Ishikawa, A multiple character sum and a multiple L-function, Arch. Math., 79 (2002), 439–448.
23. J. C. Kluyver, Benaderingsformules betreffende de priemgetallen beneden eene gegeven grens, Koninglijke Akad. Wet. Amsterdam, Versl. Gew. Verg. Wis- en Natuur. Afd., 8 (1900), 672–682.
24. N. Kurokawa, On the meromorphy of Euler products (I), Proc. London Math. Soc., (3)53 (1986), 1–47; (II), ibid., 209–236.
25. E. Landau, Über die Multiplikation Dirichlet'scher Reihen, Rend. Circ. Mat. Palermo, 24 (1907), 81–160.
26. E. Landau and A. Walfisz, Über die Nichtfortsetzbarkeit einiger durch Dirichletsche Reihen definierter Funktionen, ibid. 44 (1920), 82–86.
27. K. Matsumoto, Asymptotic expansions of double zeta-functions of Barnes, of Shintani, and Eisenstein series, Nagoya Math. J., 172 (2003), 59–102.
28. K. Matsumoto and Y. Tanigawa, The analytic continuation and the order estimate of multiple Dirichlet series, J. Théorie des Nombres de Bordeaux, 15 (2003), 267–274.
29. H. Mellin, Eine Formel für den Logarithmus transcendenter Funktionen von endlichem Geschlecht, Acta. Soc. Sci. Fenn., 29 (1900), no.4.
30. H. Mellin, Die Dirichlet'schen Reihen, die zahlentheoretischen Funktionen und die unendlichen Produkte von endlichem Geschlecht, Acta Math., 28 (1904), 37–64.
31. H. L. Montgomery and R. C. Vaughan, Error terms in additive prime number theory, Quart. J. Math. Oxford, (2)24 (1973), 207–216.
32. M. Peter, Dirichlet series associated with polynomials, Acta Arith., 84 (1998), 245–278.
33. P. Sargos, Prolongement méromorphe des séries de Dirichlet associées à des fractions rationnelles de plusieurs variables, Ann. Inst. Fourier, 34(3) (1984), 83–123.
34. P. Sargos, Croissance de certaines séries de Dirichlet et applications, J. Reine Angew. Math., 367 (1986), 139–154.
35. Y. Tanigawa and W. Zhai, Dirichlet series associated with polynomials and applications, J. Number Theory, to appear.
36. E. C. Titchmarsh, The Theory of the Riemann Zeta-function, Oxford, 1951.
37. J. Zhao, Analytic continuation of multiple zeta-functions, Proc. Amer. Math. Soc., 128 (2000), 1275–1283.


CONSTRUCTING NEW NON-CONGRUENT NUMBERS BY GRAPH THEORY

KEQIN FENG∗
Department of Mathematical Science, Tsinghua University, Beijing 100084, China

YAN XUE
Department of Mathematical Science, Tsinghua University, Beijing 100084, China

This paper is a survey of recent results on new series of non-congruent numbers which can be obtained by the traditional arithmetic theory of elliptic curves plus a result from algebraic graph theory. More precisely, we start from the following two facts: (1) A square-free positive integer $n$ is a non-congruent number if and only if the rank of the group $E_n(\mathbb{Q})$ of rational points of the elliptic curve $E_n : y^2 = x^3 - n^2 x$ is zero. (2) If the 2-Selmer groups $S_n$ and $\hat{S}_n$ of the elliptic curve $E_n$ and its dual curve $\hat{E}_n : y^2 = x^3 + 4n^2 x$ have minimal sizes $|S_n| = 1$ and $|\hat{S}_n| = 4$, then $\mathrm{rank}(E_n(\mathbb{Q})) = 0$. Selmer groups can be determined by the data of local solvability conditions of the homogeneous spaces of the elliptic curves $E_n$ and $\hat{E}_n$. The next step is to organize the data into carefully constructed graphs so that the Selmer groups are minimal if and only if the graphs have a specific "odd" property. By a result in algebraic graph theory, an odd graph can be described by the rank of the Laplace matrix of the graph over $\mathbb{F}_2$. Thus, by computing the rank of a certain matrix over $\mathbb{F}_2$, we can determine all $n$ such that $S_n$ and $\hat{S}_n$ are minimal, in which case $n$ is a non-congruent number. After explaining this method and related concepts, we describe the results on new series of non-congruent numbers obtained in this way and illustrate them by examples.

1. Introduction

A square-free positive integer $n$ is called a congruent number if $n$ is the area of a certain rational right triangle, where "rational" means that the lengths of the three sides of this triangle are rational numbers. Otherwise $n$ is called a non-congruent number. Determining all congruent numbers is a long-standing problem in number theory. Using the theory of modular forms, Tunnell [17] (also see [10]) presented an elementary criterion on congruent numbers under the Birch and Swinnerton-Dyer conjecture, but this problem has not been solved completely. Many congruent numbers and non-congruent numbers have been determined [6,7,11–15], but most of them have at most 4 prime divisors. In this paper we are concerned only with non-congruent numbers, and we list several known results below ($p, q, r, s$ are odd prime numbers, and $\left(\frac{p}{q}\right)$ is the Legendre symbol).

∗ Supported by a National Scientific Research Project 973 of China No. 2004 CB 3180004 and a NSFC grant No. 60433050.

Lemma 1.1. In the following cases, $n$ is a non-congruent number.

(1) (Genocchi [6])
$n = p \equiv 3 \pmod 8$,
$n = pq$, $p \equiv q \equiv 3 \pmod 8$,
$n = 2p$, $p \equiv 5 \pmod 8$,
$n = 2pq$, $p \equiv q \equiv 5 \pmod 8$.

(2) (Lagrange [12])
$n = pq$, $(p, q) \equiv (1, 3) \pmod 8$, $\left(\frac{p}{q}\right) = -1$,
$n = 2pq$, $(p, q) \equiv (1, 5) \pmod 8$, $\left(\frac{p}{q}\right) = -1$,
$n = pqr$, $(p, q, r) \equiv (1, 1, 3) \pmod 8$, with the condition (∗),
$n = 2pqr$, $(p, q, r) \equiv (1, 1, 5) \pmod 8$, with the condition (∗).

Condition (∗): $n$ can be written as $n = p_1 p_2 p_3$ or $2 p_1 p_2 p_3$ such that $\left(\frac{p_1}{p_2}\right) = \left(\frac{p_1}{p_3}\right) = -1$.

(3) (Serf [15])
$n = pq$, $(p, q) \equiv (5, 7) \pmod 8$, $\left(\frac{p}{q}\right) = -1$,
$n = pqr$, $(p, q, r) \equiv (1, 3, 3) \pmod 8$, $\left(\frac{p}{q}\right) = -\left(\frac{p}{r}\right)$,
$n = pqr$, $(p, q, r) \equiv (3, 5, 7) \pmod 8$, $\left(\frac{q}{r}\right) = -1$,
$n = 2pqr$, $(p, q, r) \equiv (1, 5, 5) \pmod 8$, $\left(\frac{p}{q}\right) = -\left(\frac{p}{r}\right)$,
$n = pqrs$, $(p, q, r, s) \equiv (5, 5, 7, 7) \pmod 8$, and
\[
\begin{aligned}
&1 = \left(\frac{p}{r}\right) = -\left(\frac{p}{s}\right) = -\left(\frac{q}{r}\right); \text{ or} \\
&1 = -\left(\frac{p}{r}\right) = \left(\frac{p}{s}\right) = -\left(\frac{q}{s}\right); \text{ or} \\
&1 = -\left(\frac{p}{r}\right) = -\left(\frac{p}{s}\right), \quad \left(\frac{q}{r}\right) = -\left(\frac{q}{s}\right).
\end{aligned}
\]
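The simplest cases of Lemma 1.1 are easy to test mechanically. The Python sketch below checks only the Genocchi cases (1) and the first two Lagrange cases of (2); the function names and the calling convention (a list of distinct odd primes plus a flag for the factor 2) are assumptions made for this illustration. A return value of `False` means only that these particular criteria do not apply, not that $n$ is congruent.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    value = pow(a % p, (p - 1) // 2, p)
    return -1 if value == p - 1 else value

def genocchi_noncongruent(odd_primes, has_factor_two):
    """True if n = (2 or 1) * product(odd_primes) falls under Lemma 1.1 (1).

    odd_primes is assumed to be a list of distinct odd primes.
    """
    residues = sorted(p % 8 for p in odd_primes)
    if not has_factor_two:
        return residues in ([3], [3, 3])
    return residues in ([5], [5, 5])

def lagrange_two_prime_noncongruent(p, q, has_factor_two):
    """First two cases of Lemma 1.1 (2): n = pq or n = 2pq with p = 1 (mod 8)."""
    target = 5 if has_factor_two else 3
    return p % 8 == 1 and q % 8 == target and legendre(p, q) == -1

print(genocchi_noncongruent([3, 11], False))          # n = 33
print(genocchi_noncongruent([5], True))               # n = 10
print(lagrange_two_prime_noncongruent(17, 3, False))  # n = 51
```

The remaining cases of the lemma involve several Legendre symbols at once and could be added along the same lines.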

Serf also claimed in [15] that he found more cases of non-congruent numbers $n$ with 5 or 6 odd prime divisors, but "it is almost impossible to exhibit them in a reasonable way." In this survey paper we introduce several works [2–4,18,19] trying to find a reasonable way to describe a remarkable portion of new non-congruent numbers, including many series of such $n$ with an arbitrarily large number of prime factors. The method starts from the following basic fact (see N. Koblitz' book [10], for example).

Lemma 1.2. A square-free positive integer $n$ is a non-congruent number if and only if the group $E_n(\mathbb{Q})$ of rational points of the elliptic curve $E_n : y^2 = x^3 - n^2 x$ has rank zero.

The traditional way to deal with $\mathrm{rank}(E_n(\mathbb{Q}))$ is the 2-descent method, which we explain in §2. With this method, the rank problem of $E_n(\mathbb{Q})$ is reduced to determining the Selmer groups $S_n$ and $\hat{S}_n$ of the elliptic curve $E_n$ and its dual curve $\hat{E}_n : y^2 = x^3 + 4n^2 x$. In particular, if the Selmer groups have minimal sizes $|S_n| = 1$ and $|\hat{S}_n| = 4$, then $\mathrm{rank}(E_n(\mathbb{Q})) = 0$ so that $n$ is a non-congruent number by Lemma 1.2. The Selmer groups are defined by local solvability of certain equations, so they can be determined in principle by Hensel's lemma. But the data of such local solvability conditions are rather complicated when the number of prime factors of $n$ is larger. The next step is to organize the data into carefully constructed graphs so that the Selmer groups are minimal if and only if the graphs have a specific "odd" property. By a result in algebraic graph theory, odd graphs can be described by the rank of the Laplace matrix of the graph over $\mathbb{F}_2$. Thus, all $n$ such that $S_n$ and $\hat{S}_n$ are minimal, so that $\mathrm{rank}(E_n(\mathbb{Q})) = 0$ and, a fortiori, $n$ is a non-congruent number, can be determined by computing the rank of a certain matrix over $\mathbb{F}_2$. We explain our graph-theory tools in §3 and introduce the results on new series of non-congruent numbers obtained in this way in §4. Finally we mention the Birch and Swinnerton-Dyer conjecture for the elliptic curve $E_n$ briefly in §5.

2. Selmer groups

In this section we describe the 2-descent method for the elliptic curve $E_n$. For details we refer to the last chapter of Silverman's book [16]. Let $n = p_1 \cdots p_t$ or $n = 2 p_1 \cdots p_t$ where $t > 1$ and $p_1, \cdots, p_t$ are distinct odd prime numbers. We define a set of prime divisors of the rational number field $\mathbb{Q}$ by $S = \{\infty, 2, p_1, \cdots, p_t\}$, and a subgroup $M$ of the multiplicative group $\mathbb{Q}^*/(\mathbb{Q}^*)^2$ generated by $-1, 2, p_1, \cdots, p_t$:
\[
M = \langle -1, 2, p_1, \cdots, p_t \rangle \subseteq \mathbb{Q}^*/(\mathbb{Q}^*)^2.
\]
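Since the generators $-1, 2, p_1, \ldots, p_t$ are independent modulo squares, the group $M$ has $2^{t+2}$ elements, each represented by a signed square-free product of some of the generators. The following Python sketch lists these representatives; the function name `square_class_representatives` is an assumption made for this example.

```python
from itertools import combinations
from math import prod

def square_class_representatives(odd_primes):
    """Square-free integer representatives of M = <-1, 2, p_1, ..., p_t>."""
    generators = [-1, 2] + list(odd_primes)
    representatives = []
    for size in range(len(generators) + 1):
        for subset in combinations(generators, size):
            representatives.append(prod(subset))   # empty product is 1
    return sorted(representatives)

reps = square_class_representatives([3, 5])        # the case n = 15 or n = 30
print(len(reps), reps)
```

A search for the Selmer groups defined below only has to run over this finite list of candidates $d$.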

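For each $d \in M$ we have the homogeneous spaces $C_d$ and $\hat{C}_d$ of $E_n$ and its 2-dual curve $\hat{E}_n : y^2 = x^3 + 4n^2 x$ defined by
\[
C_d : d w^2 = d^2 t^4 + 4 n^2 z^4, \qquad \hat{C}_d : d w^2 = d^2 t^4 - n^2 z^4.
\]
For each prime divisor $v \in S$, we denote by $C_d(\mathbb{Q}_v)$ and $\hat{C}_d(\mathbb{Q}_v)$ the sets of non-trivial solutions $(w, t, z) \neq (0, 0, 0)$ of $C_d$ and $\hat{C}_d$ in the local field $\mathbb{Q}_v$, respectively. The Selmer groups $S_n$ and $\hat{S}_n$ of $E_n$ and $\hat{E}_n$ are defined by local solvability of $C_d$ and $\hat{C}_d$:
\[
S_n = \{d \in M : C_d(\mathbb{Q}_v) \neq \emptyset \text{ for all } v \in S\}, \qquad \hat{S}_n = \{d \in M : \hat{C}_d(\mathbb{Q}_v) \neq \emptyset \text{ for all } v \in S\}.
\]
It is proved that $S_n$ and $\hat{S}_n$ are subgroups of $M$, and
\[
1 \in S_n, \qquad \{\pm 1, \pm n\} \subseteq \hat{S}_n,
\]
since $C_1$ and $\hat{C}_d$ for $d = \pm 1, \pm n$ have global non-trivial solutions in $\mathbb{Q}$. The following result is a special consequence of the 2-descent method.

Lemma 2.1. If $S_n$ and $\hat{S}_n$ have minimal sizes: $S_n = \{1\}$ and $\hat{S}_n = \{\pm 1, \pm n\}$, then $\mathrm{rank}(E_n(\mathbb{Q})) = 0$, so that $n$ is a non-congruent number.

Now the problem is reduced to finding an explicit criterion to describe $S_n = \{1\}$ and $\hat{S}_n = \{\pm 1, \pm n\}$. By the definition of Selmer groups, $S_n = \{1\}$ if and only if for all $d \in M$ with $d \neq 1$, there exists $v \in S$ such that $C_d(\mathbb{Q}_v) = \emptyset$. Similarly, $\hat{S}_n = \{\pm 1, \pm n\}$ can also be described by this kind of local solvability of $\hat{C}_d$. It is not difficult to obtain the following result by Hensel's lemma and careful computation.

Lemma 2.2 ([3, Lemma 3.1, 3.2, 5.1, 5.2]). Let $p_1, \cdots, p_t$ ($t > 1$) be distinct odd prime numbers, $d \in M = \langle -1, 2, p_1, \cdots, p_t \rangle \subset \mathbb{Q}^*/(\mathbb{Q}^*)^2$, and let $p$ denote an odd prime number.

(A) If $n = p_1 \cdots p_t$, then
(A1) $C_d(\mathbb{Q}_\infty) = \emptyset \Leftrightarrow d < 0$.
(A2) For $p \mid d$, $C_d(\mathbb{Q}_p) = \emptyset \Leftrightarrow \left(\frac{-1}{p}\right) = -1$ or "$\left(\frac{-1}{p}\right) = 1$ and $\left(\frac{n/d}{p}\right) = -1$".
(A3) For $p \mid \frac{2n}{d}$, $C_d(\mathbb{Q}_p) = \emptyset \Leftrightarrow \left(\frac{d}{p}\right) = -1$.
(A4) If $n \equiv \pm 3 \pmod 8$ and $2 \mid d$, then $C_d(\mathbb{Q}_2) = \emptyset$.
(A5) $d \equiv 1 \pmod 4 \Rightarrow C_d(\mathbb{Q}_2) \neq \emptyset$.
(A6) If $n \equiv \pm 1 \pmod 8$, $d = 2d'$ where $d' \mid n$ and $d' \equiv 1 \pmod 4$, then $C_d(\mathbb{Q}_2) \neq \emptyset$.
(A1′) $2 \mid d \Rightarrow \hat{C}_d(\mathbb{Q}_2) = \emptyset$.
(A2′) If $2 \nmid d$, then $\hat{C}_d(\mathbb{Q}_2) = \emptyset \Leftrightarrow d \equiv \pm 3 \pmod 8$ and $\frac{n}{d} \equiv \pm 3 \pmod 8$.
(A3′) If $p \mid d$, then $\hat{C}_d(\mathbb{Q}_p) = \emptyset \Leftrightarrow \left(\frac{-1}{p}\right) = 1$ and $\left(\frac{n/d}{p}\right) = -1$.
(A4′) If $p \mid \frac{2n}{d}$, then $\hat{C}_d(\mathbb{Q}_p) = \emptyset \Leftrightarrow \left(\frac{-1}{p}\right) = 1$ and $\left(\frac{d}{p}\right) = -1$.

(B) If $n = 2 p_1 \cdots p_t$, then
(B1) $C_d(\mathbb{Q}_\infty) = \emptyset \Leftrightarrow d < 0$.
(B2) $2 \mid d \Rightarrow C_d(\mathbb{Q}_2) = \emptyset$.
(B3) For $p \mid d$, if $\left(\frac{-1}{p}\right) = -1$ or "$\left(\frac{-1}{p}\right) = 1$ and $\left(\frac{-1}{p}\right)_4 \left(\frac{2n/d}{p}\right) = -1$", then $C_d(\mathbb{Q}_p) = \emptyset$.
(B4) For $p \mid \frac{n}{d}$, $\left(\frac{d}{p}\right) = -1 \Rightarrow C_d(\mathbb{Q}_p) = \emptyset$.
(B1′) For $p \mid d$, $\hat{C}_d(\mathbb{Q}_p) = \emptyset \Leftrightarrow \left(\frac{-1}{p}\right) = 1$ and $\left(\frac{n/d}{p}\right) = -1$.
(B2′) For $p \mid \frac{n}{d}$, $\hat{C}_d(\mathbb{Q}_p) = \emptyset \Leftrightarrow \left(\frac{-1}{p}\right) = 1$ and $\left(\frac{d}{p}\right) = -1$.
(B3′) $2 \nmid d \Rightarrow \hat{C}_d(\mathbb{Q}_2) \neq \emptyset$.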

For the Genocchi cases in Lemma 1.1 (1), it can be seen from Lemma 2.2 that $S_n = \{1\}$ and $\hat{S}_n = \{\pm 1, \pm n\}$, so that $n$ is a non-congruent number. For the first two cases of Lemma 1.1 (2), it can be seen that $S_n = \{1\}$ and $\hat{S}_n = \{\pm 1, \pm n\}$ if and only if $\left(\frac{p}{q}\right) = -1$. In general, however, there seems to be no simple way to write down a necessary and sufficient condition for $S_n = \{1\}$ and $\hat{S}_n = \{\pm 1, \pm n\}$ when $n$ has a large number of prime divisors. Our next step is to find suitable graphs such that the condition can be described by a specific property of graphs, which we introduce in the next section.

3. Oddness of graphs

We use standard terminology in graph theory (see [8] for example). Let $G = (V, A)$ be a (simple) directed graph, where $V = V(G) = \{v_1, \dots, v_m\}$ is the set of vertices of $G$ and $A = A(G)$ is the set of arcs in $G$. We denote an arc $(v_i, v_j) \in A$ by $\overrightarrow{v_i v_j}$. If both $\overrightarrow{v_i v_j}$ and $\overrightarrow{v_j v_i}$ belong to $A$, we have a two-direction arc $v_i v_j$ in $G$ and call it an edge. If all arcs in $A(G)$ are two-directed, the graph $G$ is called non-directed. The adjacency matrix of $G$ is defined by $M(G) = (a_{ij})_{1 \le i, j \le m}$, where

$$a_{ij} = \begin{cases} 1 & \text{if } \overrightarrow{v_i v_j} \in A(G), \\ 0 & \text{otherwise,} \end{cases} \qquad 1 \le i \ne j \le m.$$

Let

$$d_i = \sum_{j=1}^{m} a_{ij} \quad \text{(the outdegree of vertex } v_i\text{)}, \qquad 1 \le i \le m.$$

The Laplace matrix of $G$ is defined by

$$L(G) = \operatorname{diag}(d_1, \dots, d_m) - M(G).$$

Since the sum of each row of $L(G)$ is zero, we know that $\operatorname{rank}_{\mathbb{Q}}(L(G)) \le m - 1$ and $L_{ij} = (-1)^{j+k} L_{ik}$, where $L_{ij}$ is the co-factor of $L(G)$ at the position $(i, j)$. For a non-directed graph $G$, the matrices $M(G)$ and $L(G)$ are symmetric and

$$L_{11} = (-1)^{i+k} L_{ik}, \qquad 1 \le i, k \le m.$$

In this case, it is well known that the absolute value of $L_{11}$ is the number of spanning trees of the non-directed graph $G$ (see [2] Section 1.2.4).

Definition 3.1. Let $G = (V, A)$ be a directed graph. A partition $\{V_1, V_2\}$ of $V$ is called odd if either there exists a vertex $v_1 \in V_1$ such that $\#\{v_1 \to V_2\}$ (the total number of arcs from $v_1$ to vertices in $V_2$) is odd, or there exists $v_2 \in V_2$ such that $\#\{v_2 \to V_1\}$ is odd. Otherwise the partition $\{V_1, V_2\}$ is called even. The graph $G$ is called odd if all non-trivial partitions $\{V_1, V_2\} \ne \{V, \emptyset\}$ of $V$ are odd.

The following result presents a simple criterion for the oddness of a graph $G$ in terms of the rank of the Laplace matrix $L(G)$ over the finite field $\mathbb{F}_2$. Note that $\operatorname{rank}_{\mathbb{F}_2}(L(G)) \le \operatorname{rank}_{\mathbb{Q}}(L(G)) \le m - 1$ and that the total number of partitions of $V$ is $2^{m-1}$ ($m = |V|$), since we regard $\{V_1, V_2\}$ and $\{V_2, V_1\}$ as the same partition.
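Definition 3.1 can be checked directly on small graphs. The following Python sketch is only an illustration and is not taken from the cited papers: the names `undirected`, `is_odd_partition` and `is_odd_graph` are hypothetical, arcs are assumed to be given as a set of ordered pairs, and the enumeration of partitions is exponential in $|V|$.

```python
from itertools import combinations

def undirected(edges):
    """Turn each edge {u, v} into the two arcs (u, v) and (v, u)."""
    return {(u, v) for u, v in edges} | {(v, u) for u, v in edges}

def is_odd_partition(arcs, part1, part2):
    """Definition 3.1: some vertex sends an odd number of arcs into the other part."""
    def has_odd_vertex(src, dst):
        return any(sum((u, v) in arcs for v in dst) % 2 == 1 for u in src)
    return has_odd_vertex(part1, part2) or has_odd_vertex(part2, part1)

def is_odd_graph(vertices, arcs):
    """Check every non-trivial partition {V1, V2}; V1 always contains the first vertex."""
    vertices = list(vertices)
    first, rest = vertices[0], vertices[1:]
    for size in range(len(rest)):            # size < len(rest) keeps V2 non-empty
        for chosen in combinations(rest, size):
            part1 = {first, *chosen}
            part2 = set(vertices) - part1
            if not is_odd_partition(arcs, part1, part2):
                return False
    return True

directed_cycle = {(1, 2), (2, 3), (3, 1)}                       # directed 3-cycle
path = undirected([(1, 2), (2, 3)])                             # a tree
triangle = undirected([(1, 2), (2, 3), (3, 1)])                 # K_3
four_cycle = undirected([(1, 2), (2, 3), (3, 4), (4, 1)])       # C_4

for name, verts, arcs in [
    ("directed 3-cycle", [1, 2, 3], directed_cycle),
    ("path", [1, 2, 3], path),
    ("K_3", [1, 2, 3], triangle),
    ("C_4", [1, 2, 3, 4], four_cycle),
]:
    print(name, is_odd_graph(verts, arcs))
```

For the 4-cycle the partition $\{\{1, 3\}, \{2, 4\}\}$ is even, because every vertex sends exactly two arcs to the other part, so this graph is not odd.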


Lemma 3.1 ([2, Lemma 2.2]). Let $G = (V, A)$ be a directed graph, $m = |V|$ and $r = \operatorname{rank}_{\mathbb{F}_2}(L(G))$ ($\le m - 1$). Then the total number of even partitions of $V$ is $2^{m-r-1}$. In particular, the graph $G$ is odd if and only if $r = m - 1$. A non-directed graph $G$ is odd if and only if the number $t(G) = |L_{11}(G)|$ of spanning trees of $G$ is odd.

An odd graph must be connected. There are plenty of odd graphs, as shown in the following examples:
(1) All directed cycles $C'_m$ with $V = \{v_1, \dots, v_m\}$ and $A = \{\overrightarrow{v_1 v_2}, \dots, \overrightarrow{v_{m-1} v_m}, \overrightarrow{v_m v_1}\}$.
For non-directed graphs:
(2) All trees $T$, since $t(T) = 1$.
(3) All complete graphs $K_m$ with odd integer $m \ge 3$, defined by $V = \{v_1, \dots, v_m\}$ and $A = \{v_i v_j : 1 \le i \ne j \le m\}$, since by the Cayley formula $t(K_m) = m^{m-2}$.
(4) All cycles $C_m$ with odd integer $m \ge 3$, defined by $V = \{v_1, \dots, v_m\}$ and $A = \{v_1 v_2, \dots, v_{m-1} v_m, v_m v_1\}$.

The concept of odd graph was used in number theory by L. Rédei and H. Reichardt in the 1930s to determine the 4-rank of the class group of imaginary quadratic number fields, and in the 1970s to present a sufficient condition for the Pell equation $x^2 - ny^2 = -1$ to have no integral solution. In the next section we show a new application in number theory, which yields series of new non-congruent numbers.

4. New non-congruent numbers

Let $n = p_1 \cdots p_t$ ($t \ge 1$) be a product of distinct odd primes. We define a graph $G(n) = (V, A)$ by

$$V = \{p_1, \dots, p_t\}, \qquad A = \left\{ \overrightarrow{p_i p_j} : \left(\frac{p_j}{p_i}\right) = -1,\ 1 \le i \ne j \le t \right\}.$$

In 1996, the first author found that for specific $n$, the oddness of the graph $G(n)$ is a sufficient condition for $S_n = \{1\}$ and $\hat{S}_n = \{\pm 1, \pm n\}$, as described in Lemma 4.1.

Lemma 4.1 ([2, Theorem 3.1]). (1) Suppose that $n = p_1 \cdots p_t$ ($t \ge 1$), $p_1 \equiv 3 \pmod 8$ and $p_i \equiv 1 \pmod 8$ when $i \ge 2$. If $G(n)$ is an odd graph, then $S_n = \{1\}$ and $\hat{S}_n = \{\pm 1, \pm n\}$.
(2) Suppose that $n = 2p_1 \cdots p_t$ ($t \ge 1$), $p_1 \equiv 5 \pmod 8$ and $p_i \equiv 1 \pmod 8$ when $i \ge 2$. If $G(\frac{n}{2})$ is an odd graph, then $S_n = \{1\}$ and $\hat{S}_n = \{\pm 1, \pm n\}$.
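The rank criterion of Lemma 3.1 makes the hypothesis of Lemma 4.1 easy to test by machine. The sketch below is a hedged illustration, not code from the cited papers: the helper names (`legendre`, `laplacian_rows_mod2`, `rank_mod2`, `graph_of`, `is_odd_by_rank`) are hypothetical, the Legendre symbol is computed by Euler's criterion, and rows of $L(G)$ modulo 2 are stored as integer bitmasks.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def graph_of(primes):
    """G(n) for n = product of `primes`: arc p_i -> p_j when (p_j / p_i) = -1."""
    arcs = {(p, q) for p in primes for q in primes
            if p != q and legendre(q, p) == -1}
    return list(primes), arcs

def laplacian_rows_mod2(vertices, arcs):
    """Rows of L(G) = diag(outdegrees) - M(G), reduced mod 2, as bitmasks."""
    index = {v: k for k, v in enumerate(vertices)}
    rows = [0] * len(vertices)
    for u, v in arcs:
        rows[index[u]] ^= 1 << index[v]   # off-diagonal entry: -a_ij = a_ij mod 2
        rows[index[u]] ^= 1 << index[u]   # each arc adds 1 to the outdegree d_i
    return rows

def rank_mod2(rows):
    """Rank over F_2 by Gaussian elimination on bitmask rows."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot          # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def is_odd_by_rank(vertices, arcs):
    """Lemma 3.1: G is odd if and only if rank over F_2 equals m - 1."""
    return rank_mod2(laplacian_rows_mod2(vertices, arcs)) == len(vertices) - 1

for primes in ([3, 17], [3, 73], [3, 17, 41]):
    vertices, arcs = graph_of(primes)
    print(primes, is_odd_by_rank(vertices, arcs))
```

In these examples $p_1 = 3 \equiv 3 \pmod 8$ and the other primes are $\equiv 1 \pmod 8$, so they are of form (1) in Lemma 4.1. For $n = 3 \cdot 17$ the graph is a single edge (a tree), hence odd; for $n = 3 \cdot 73$ the graph has no arcs and is not odd, so the lemma gives no information in that case.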


This result yields series of non-congruent numbers $n$ with an arbitrarily large number of prime divisors, since it is easy to show by the Dirichlet theorem on primes in arithmetic progressions that for each non-directed graph $G$ there exist infinitely many $n$ of form (1) and infinitely many $n$ of form (2) of Lemma 4.1 such that $G(n) = G$ and $G(\frac{n}{2}) = G$, respectively. Later, with more careful consideration, we found suitable graphs to do this for all cases of $n$. We now describe our results, omitting the technical details of the proofs.

Case $2 \mid n$. Let $n = 2n'$, where

$$n' = p_1 \cdots p_t q_1 \cdots q_s, \qquad p_i \equiv 1 \pmod 4, \quad q_j \equiv 3 \pmod 4, \qquad 1 \le i \le t, \quad 1 \le j \le s, \qquad (4.1)$$

is a product of distinct prime numbers ($t + s \ge 1$). Let $P = \{p_1, \dots, p_t\}$, $Q = \{q_1, \dots, q_s\}$. We define a graph $\hat{G}(n) = (V, A)$ by

$$V = \{2, p_1, \dots, p_t, q_1, \dots, q_s\},$$

$$A = \left\{ p_i p_j : \left(\frac{p_j}{p_i}\right) = -1,\ p_i, p_j \in P \right\} \cup \left\{ \overrightarrow{pq} : \left(\frac{q}{p}\right) = -1,\ p \in P,\ q \in Q \right\} \cup \left\{ \overrightarrow{p2} : \left(\frac{2}{p}\right) = -1\ (\Leftrightarrow p \equiv 5 \pmod 8),\ p \in P \right\}.$$

From Lemma 2.2 (B) we obtain the following result.

Theorem 4.1 ([3, Lemmas 5.3, 5.4]). For $n = 2n'$, we have

$$\hat{S}_n = \{\pm 1, \pm n\} \Leftrightarrow \hat{G}(n) \text{ is an odd graph} \Rightarrow S_n = \{1\}.$$

Then, by the matrix characterization of odd graphs in Lemma 3.1, we get the following result.

Theorem 4.2 ([3, Theorem 2.6]). Let $n = 2n'$, where $n'$ has the decomposition (4.1). Then $S_n = \{1\}$ and $\hat{S}_n = \{\pm 1, \pm n\}$ (so that $\operatorname{rank}(E_n(\mathbb{Q})) = 0$ and $n$ is a non-congruent number) if and only if the following two conditions are satisfied.
(1) $s = 0$, so that $n' = p_1 \cdots p_t$ ($t \ge 1$) and $p_i \equiv 1 \pmod 4$.


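(2) Define the following numbers:

$$a_{ij} = \begin{cases} 1 & \text{if } \left(\frac{p_j}{p_i}\right) = -1, \\ 0 & \text{if } \left(\frac{p_j}{p_i}\right) = 1, \end{cases} \quad 1 \le i \ne j \le t, \qquad c_i = \begin{cases} 1 & \text{if } p_i \equiv 5 \pmod 8, \\ 0 & \text{otherwise,} \end{cases} \quad 1 \le i \le t,$$

$$a^*_{ii} = \sum_{j=1,\, j \ne i}^{t} a_{ij} + c_i, \qquad 1 \le i \le t.$$

Then

$$\begin{vmatrix} a^*_{11} & a_{12} & \cdots & a_{1t} \\ a_{21} & a^*_{22} & \cdots & a_{2t} \\ \vdots & \vdots & & \vdots \\ a_{t1} & a_{t2} & \cdots & a^*_{tt} \end{vmatrix} = 1 \in \mathbb{F}_2.$$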

In particular, there exists at least one $i$ ($1 \le i \le t$) such that $p_i \equiv 5 \pmod 8$.

As an example of new consequences, the following result can be derived from Theorem 4.2.

Corollary 4.1. If $n = 2p_1 \cdots p_t$ ($t \ge 1$), $p_i \equiv 5 \pmod 8$ ($1 \le i \le t$) and $\left(\frac{p_j}{p_i}\right) = 1$ for all $1 \le i \ne j \le t$, then $S_n = \{1\}$ and $\hat{S}_n = \{\pm 1, \pm n\}$, so that $\operatorname{rank}(E_n(\mathbb{Q})) = 0$ and $n$ is a non-congruent number.

Case $2 \nmid n$. For $n = p_1 \cdots p_t \equiv \pm 3 \pmod 8$ we have the following result, where $G(n)$ is the (non-directed) graph $G(n) = (V, E)$ defined by

$$V = \{p_1, \dots, p_t\}, \qquad E = \left\{ p_i p_j : \left(\frac{p_j}{p_i}\right) = -1,\ 1 \le i \ne j \le t \right\}.$$

Theorem 4.3 ([3, Theorem 2.4]). For $n \equiv \pm 3 \pmod 8$, $S_n = \{1\}$ and $\hat{S}_n = \{\pm 1, \pm n\}$ (so that $\operatorname{rank}(E_n(\mathbb{Q})) = 0$ and $n$ is a non-congruent number) if and only if the following three conditions are satisfied.
(1) $n \equiv 3 \pmod 8$.
(2) $n = p_1 \cdots p_t$, $p_1 \equiv 3 \pmod 4$ and $p_i \equiv 1 \pmod 4$ for $2 \le i \le t$.
(3) $G(n)$ is an odd graph.

For the case $n \equiv \pm 1 \pmod 8$, we need to generalize the concept of odd graph a little further. Here we state the final result. Let $n = p_1 \cdots p_t q_1 \cdots q_s$ be a product of distinct prime numbers, where $p_i \equiv 1 \pmod 4$, $q_j \equiv 3 \pmod 4$ ($1 \le i \le t$, $1 \le j \le s$) and $t + s \ge 1$. Let $P = \{p_1, \dots, p_t\}$, $Q = \{q_1, \dots, q_s\}$. We define a graph $G^*(n) = (V, A)$ by

$$V = \{2, p_1, \dots, p_t, q_1, \dots, q_s\},$$

$$A = \left\{ p_i p_j : \left(\frac{p_j}{p_i}\right) = -1,\ p_i, p_j \in P \right\} \cup \left\{ \overrightarrow{pq} : \left(\frac{q}{p}\right) = -1,\ p \in P,\ q \in Q \right\} \cup \left\{ \overrightarrow{2r} : r \equiv \pm 3 \pmod 8,\ r \in P \cup Q \right\}.$$

Theorem 4.4 ([3, Theorem 2.5]). For $n \equiv \pm 1 \pmod 8$, $S_n = \{1\}$ and $\hat{S}_n = \{\pm 1, \pm n\}$ if and only if the following three conditions are satisfied.
(1) $n \equiv 1 \pmod 8$.
(2) The decomposition of $n$ has one of the following forms:
(2.1) $n = p_1 \cdots p_r P_1 \cdots P_s Q_1 Q_2$;
(2.2) $n = p_1 \cdots p_r P_1 \cdots P_t q_1 q_2$;
(2.3) $n = p_1 \cdots p_r P_1 \cdots P_l Q_1 q_1$,
where $p_i \equiv 1$, $P_j \equiv 5$, $Q_\lambda \equiv 3$, $q_\mu \equiv 7 \pmod 8$ and

$$r \ge 0, \qquad 2 \mid s \ge 0, \qquad 2 \mid t \ge 2, \qquad 2 \nmid l \ge 1.$$

(3) There exists only one non-trivial even partition, namely $V_1 = \{2\}$ and $V_2 = V \setminus V_1$, for the graph $G^*(n)$. In other words, the rank of $L(G^*(n))$ over $\mathbb{F}_2$ is $|V| - 2$, where $L(G^*(n))$ is the Laplace matrix of $G^*(n)$ defined in §3.

Many known results on non-congruent numbers in Lemma 1.1 are special cases of Theorems 4.1–4.4.

Now, under a change of variables, the elliptic curve $E_n : y^2 = x^3 - n^2 x$ transforms into the elliptic curve

$$E'_n : y^2 = x^3 - 3nx^2 + 2n^2 x.$$

It is obvious that $\operatorname{rank}(E_n(\mathbb{Q})) = \operatorname{rank}(E'_n(\mathbb{Q}))$. The homogeneous spaces of $E'_n$ and its dual curve $\hat{E}'_n : y^2 = x^3 + 6nx^2 + n^2 x$ are

$$C'_d : dw^2 = d^2 t^4 + 6nd\,t^2 z^2 + n^2 z^4,$$

$$\hat{C}'_d : dw^2 = d^2 t^4 - 3nd\,t^2 z^2 + 2n^2 z^4 = (dt^2 - 2nz^2)(dt^2 - nz^2).$$

Let $S'_n$ and $\hat{S}'_n$ be the Selmer groups of $E'_n$ and $\hat{E}'_n$, respectively. Then $1 \in S'_n$ and $\{1, 2, n, 2n\} \subset \hat{S}'_n$. As a consequence of the 2-descent method, we also have the following fact.

Lemma 2.1′. If $S'_n = \{1\}$ and $\hat{S}'_n = \{1, 2, n, 2n\}$, then $\operatorname{rank}(E'_n(\mathbb{Q})) = \operatorname{rank}(E_n(\mathbb{Q})) = 0$, so that $n$ is a non-congruent number.


In 2002, Goto [7] obtained new non-congruent numbers $n$ with at most 4 prime divisors by using Lemma 2.1′. With the help of odd graphs, we obtain the following general results, which yield more non-congruent numbers.

Case $2 \nmid n$. Let

$$n = p_1 \cdots p_t q_1 \cdots q_s, \quad t + s \ge 1, \qquad p_i \equiv \pm 1 \pmod 8, \quad q_j \equiv \pm 3 \pmod 8, \qquad 1 \le i \le t, \quad 1 \le j \le s. \qquad (4.2)$$

Theorem 4.5 ([4, Theorem 3.1]). Assume that $n \equiv 3 \pmod 4$ with decomposition (4.2). Then $S'_n = \{1\}$ and $\hat{S}'_n = \{1, 2, n, 2n\}$ if and only if the following two conditions are satisfied.
(1) $n \equiv 3 \pmod 8$ and $s = 1$, so that $n = p_1 \cdots p_t q$ ($t \ge 0$), where $p_i \equiv \pm 1 \pmod 8$ ($1 \le i \le t$) and $q \equiv \pm 3 \pmod 8$.
(2) The graph $\tilde{G}(n) = (\tilde{V}, \tilde{A})$ defined by

$$\tilde{V} = \{p_1, \dots, p_t, q\}, \qquad \tilde{A} = \left\{ \overrightarrow{p_i p_j} \;\middle|\; \left(\frac{p_j}{p_i}\right) = -1,\ 1 \le i \ne j \le t \right\} \cup \left\{ \overrightarrow{p_i q} \;\middle|\; \left(\frac{q}{p_i}\right) = -1,\ 1 \le i \le t \right\}$$

is odd.

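Theorem 4.6 ([4, Theorem 3.2]). Assume that $n \equiv 1 \pmod 4$ with decomposition (4.2). Then $S'_n = \{1\}$ and $\hat{S}'_n = \{1, 2, n, 2n\}$ if and only if the following two conditions are satisfied.
(1) $s = 2$ and $n \equiv 1 \pmod 8$, so that $n = p_1 \cdots p_t q_1 q_2$.
(2)

$$\begin{vmatrix} m^*_{11} & m_{12} & \cdots & m_{1t} & b_{11} \\ \vdots & \vdots & & \vdots & \vdots \\ m_{t1} & m_{t2} & \cdots & m^*_{tt} & b_{t1} \\ l_1 & l_2 & \cdots & l_t & k_1 \end{vmatrix} = 1 \in \mathbb{F}_2,$$

where

$$m_{ij} = \begin{cases} 1 & \text{if } \left(\frac{p_j}{p_i}\right) = -1, \\ 0 & \text{otherwise,} \end{cases} \qquad 1 \le i \ne j \le t,$$

$$b_{i\lambda} = \begin{cases} 1 & \text{if } \left(\frac{q_\lambda}{p_i}\right) = -1, \\ 0 & \text{otherwise,} \end{cases} \qquad 1 \le i \le t,\ 1 \le \lambda \le 2,$$

$$l_i = \begin{cases} 1 & \text{if } p_i \equiv 7 \pmod 8, \\ 0 & \text{if } p_i \equiv 1 \pmod 8, \end{cases} \qquad 1 \le i \le t,$$

$$k_1 = \begin{cases} 1 & \text{if } q_1 \equiv 3 \pmod 8, \\ 0 & \text{if } q_1 \equiv 5 \pmod 8, \end{cases}$$

and

$$m^*_{ii} = \sum_{j=1,\, j \ne i}^{t} m_{ij} + b_{i1} + b_{i2}, \qquad 1 \le i \le t.$$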

Theorems 4.5 and 4.6 yield series of explicit non-congruent numbers, as follows.

Corollary 4.2 ([4, Corollary 3.3]). Suppose that $n = p_1 \cdots p_t q$ ($t \ge 1$) and

$$p_i \equiv 1 \pmod 8 \ (1 \le i \le t - 1), \quad p_t \equiv 7 \pmod 8, \quad q \equiv 5 \pmod 8, \qquad \left(\frac{q}{p_i}\right) = -1 \ (1 \le i \le t).$$

Then $n$ is a non-congruent number provided one of the following conditions is satisfied.
(1) $\left(\frac{p_j}{p_i}\right) = 1$ for all $1 \le i \ne j \le t$.
(2) $\left(\frac{p_j}{p_i}\right) = -1$ for all $1 \le i \ne j \le t$ and $t$ is even.

Corollary 4.3 ([4, Corollary 3.4]). Suppose that $n = p_1 \cdots p_t q$ ($t \ge 1$) and

$$p_i \equiv 1 \pmod 8 \ (1 \le i \le t - 1), \quad p_t \equiv 7 \pmod 8, \quad q \equiv 5 \pmod 8.$$

Then $n$ is a non-congruent number if the following two conditions are satisfied.
(1) There is exactly one $i$ ($1 \le i \le t$) such that $\left(\frac{q}{p_i}\right) = -1$.
(2) The (non-directed) graph $G = (V, A)$ defined by

$$V = \{p_1, \dots, p_t\}, \qquad A = \left\{ p_i p_j \;\middle|\; \left(\frac{p_j}{p_i}\right) = -1,\ 1 \le i \ne j \le t \right\}$$

is odd.

Corollary 4.4 ([4, Corollary 3.5]). Suppose that $n = p_1 \cdots p_t q_1 q_2$ ($t \ge 0$) and
(1) $p_i \equiv 1 \pmod 8$, $q_1 \equiv q_2 \equiv 3 \pmod 8$, $\left(\frac{q_1}{p_i}\right)\left(\frac{q_2}{p_i}\right) = -1$ ($1 \le i \le t$).
(2) $\left(\frac{p_j}{p_i}\right) = 1$ ($1 \le i \ne j \le t$), or "$\left(\frac{p_j}{p_i}\right) = -1$ ($1 \le i \ne j \le t$) and $t$ is even".
Then $n$ is a non-congruent number.

Corollary 4.5 ([4, Corollary 3.6]). Suppose that $n = p_1 \cdots p_t q_1 q_2$ ($t \ge 1$) and
(1) $p_i \equiv 1 \pmod 8$ ($1 \le i \le t - 1$), $(p_t, q_1, q_2) \equiv (7, 5, 3) \pmod 8$;


(2) All $\left(\frac{q_\lambda}{p_i}\right) = 1$ ($1 \le i \le t$, $1 \le \lambda \le 2$) except $\left(\frac{q_1}{p_t}\right) = -1$;
(3) The non-directed graph $G = (V, A)$ defined in Corollary 4.3 is odd.
Then $n$ is a non-congruent number.

Case $2 \mid n = 2n'$. We proved (see [5]) that there is no $n' \equiv 7 \pmod 8$ such that $S'_n = \{1\}$ and $\hat{S}'_n = \{1, 2, n', n\}$. We have not completed the case $n' \equiv 1$ or $3 \pmod 8$, but for $n' \equiv 5 \pmod 8$ we get the following result.

Theorem 4.7 ([5, Theorem 3.1]). Suppose that $n = 2n'$ and $n' = p_1 \cdots p_t q_1 \cdots q_s \equiv 5 \pmod 8$, where $t, s \ge 0$, $t + s \ge 1$, $p_i \equiv \pm 1 \pmod 8$ and $q_j \equiv \pm 3 \pmod 8$ ($1 \le i \le t$, $1 \le j \le s$). Then the following two conditions are equivalent.
(1) $S'_n = \{1\}$ and $\hat{S}'_n = \{1, 2, n', n\}$, so that $\operatorname{rank}(E'_n(\mathbb{Q})) = 0$ and $n$ is a non-congruent number;
(2) $s = 1$, so that $n' = p_1 \cdots p_t q$ with $q \equiv \pm 3 \pmod 8$; and

$$D = \begin{vmatrix} m^*_{11} & m_{12} & \cdots & m_{1t} \\ \vdots & \vdots & & \vdots \\ m_{t1} & m_{t2} & \cdots & m^*_{tt} \end{vmatrix} = 1 \in \mathbb{F}_2,$$

where

$$m_{ij} = \begin{cases} 1 & \text{if } \left(\frac{p_j}{p_i}\right) = -1, \\ 0 & \text{otherwise,} \end{cases} \quad 1 \le i \ne j \le t, \qquad b_i = \begin{cases} 1 & \text{if } \left(\frac{q}{p_i}\right) = -1, \\ 0 & \text{otherwise,} \end{cases} \quad 1 \le i \le t,$$

and

$$m^*_{ii} = \sum_{j=1,\, j \ne i}^{t} m_{ij} + b_i, \qquad 1 \le i \le t.$$

(We assume $D = 1$ for $t = 0$.)

From simple computations, we can derive the following consequences of Theorem 4.7.

Corollary 4.6 ([5, Corollary 3.2]). Suppose that $n = 2n'$ and $n'$ satisfies one of the following conditions, where $p, q, p_1, p_2$ are prime numbers and $p_1 \ne p_2$.
(1) $n' = q \equiv 5 \pmod 8$;
(2) $n' = pq$, $\left(\frac{q}{p}\right) = -1$ and $(p, q) \equiv (1, 5)$ or $(7, 3) \pmod 8$;
(3) $n' = p_1 p_2 q$, $(p_1, p_2, q) \equiv (1, 1, 5)$ or $(1, 7, 3) \pmod 8$, and one of $p_1, p_2, q$ is a quadratic non-residue of the other two prime numbers;
(4) $n' = p_1 p_2 q$, $(p_1, p_2, q) \equiv (7, 7, 5) \pmod 8$ and $\left(\frac{p_2}{p_1}\right) = \left(\frac{q}{p_2}\right) = -\left(\frac{q}{p_1}\right)$.
Then $S'_n = \{1\}$ and $\hat{S}'_n = \{1, 2, n', n\}$, so that $\operatorname{rank}(E'_n(\mathbb{Q})) = 0$ and $n$ is a non-congruent number.
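The determinant condition of Theorem 4.7 (2) is straightforward to evaluate numerically. The following Python sketch is an illustration under stated assumptions, not code from [5]: the function names are hypothetical, the symbol entering $b_i$ is read as $\left(\frac{q}{p_i}\right)$, and $D = 1$ in $\mathbb{F}_2$ is tested as "the matrix has full rank over $\mathbb{F}_2$". It does not check the congruence hypotheses on $n'$, $p_i$ and $q$.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def rank_mod2(rows):
    """Rank over F_2 of a matrix whose rows are integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def theorem_4_7_D(ps, q):
    """D of Theorem 4.7 (2) in F_2 for n' = p_1 ... p_t q; D = 1 when t = 0."""
    t = len(ps)
    rows = []
    for i, pi in enumerate(ps):
        row = 0
        diag = 1 if legendre(q, pi) == -1 else 0       # b_i
        for j, pj in enumerate(ps):
            if i != j and legendre(pj, pi) == -1:      # m_ij = 1
                row |= 1 << j
                diag ^= 1                              # m*_ii accumulates m_ij mod 2
        rows.append(row | (diag << i))
    return 1 if rank_mod2(rows) == t else 0

print(theorem_4_7_D([17], 5))    # n' = 85,  (p, q) = (1, 5) mod 8, (5/17) = -1
print(theorem_4_7_D([7], 3))     # n' = 21,  (p, q) = (7, 3) mod 8, (3/7) = -1
print(theorem_4_7_D([17], 13))   # n' = 221, (13/17) = 1
```

The first two calls are instances of Corollary 4.6 (2) and give $D = 1$; in the third, $b_1 = 0$, so $D = 0$ and condition (2) of Theorem 4.7 fails.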

Corollary 4.7 ([5, Corollary 3.3]). Let $n = 2n'$, $n' = p_1 \cdots p_t q \equiv 5 \pmod 8$, where $p_1, \dots, p_t$ are distinct prime numbers, $p_i \equiv \pm 1 \pmod 8$ ($1 \le i \le t$) and $q \equiv \pm 3 \pmod 8$. Let $r_1, \dots, r_l$ be distinct prime numbers with $r_\lambda \equiv \pm 1 \pmod 8$ ($1 \le \lambda \le l$), $\left(\frac{r_j}{r_i}\right) = 1$ ($1 \le i \ne j \le l$), $r_1 \cdots r_l \equiv 1 \pmod 8$ and

$$\left(\frac{p_i}{r_\lambda}\right) = \left(\frac{r_\lambda}{p_i}\right) = 1, \quad 1 \le i \le t,\ 1 \le \lambda \le l, \qquad \left(\frac{q}{r_\lambda}\right) = -1, \quad 1 \le \lambda \le l.$$

If $D = 1 \in \mathbb{F}_2$, where $D$ is given by Theorem 4.7 (2) (so that $n$ is a non-congruent number), then $N = r_1 \cdots r_l n$ is a non-congruent number.

5. Birch and Swinnerton-Dyer conjecture for $E_n$

The Birch and Swinnerton-Dyer conjecture says that for each elliptic curve $E$ over $\mathbb{Q}$,
(BSD1) The order of the zero of the L-function $L(E, s)$ at $s = 1$ is equal to $\operatorname{rank}(E(\mathbb{Q}))$.
(BSD2) If $\operatorname{rank}(E(\mathbb{Q})) = 0$ (so that $L(E, 1) \ne 0$ by (BSD1)), then $L(E, 1)$ is equal to a certain conjectured value.
For several cases of $E_n$ such that $\operatorname{rank}(E_n(\mathbb{Q})) = 0$, Chunlai Zhao [18,19] calculated $L(E_n, 1)$ by using Eisenstein series and odd-graph language and then verified conjectures (BSD1) and (BSD2). For the case that $n$ has at most 2 odd prime divisors, this can be done by using Tunnell's elementary criterion, see [2].

References
1. N. Aoki, On the 2-Selmer groups of elliptic curves arising from the congruent number problems, Comment. Math. Univ. St. Paul., 48 (1999), 77–101.
2. K. Feng, Non-congruent numbers, odd graphs and the Birch-Swinnerton-Dyer conjecture, Acta Arith., 80 (1996), 71–83.
3. K. Feng and M. Xiong, On elliptic curves $y^2 = x^3 - n^2 x$ with rank zero, Jour. of Number Theory, 109 (2004), 1–26.
4. K. Feng and Y. Xue, New series of odd non-congruent numbers, to appear in Science in China (A), 2006.
5. K. Feng and Y. Xue, New series of non-congruent numbers $n \equiv 10 \pmod{16}$, preprint, 2006.
6. A. Genocchi, Sur l'impossibilité de quelques égalités doubles, C.R. Acad. Sci. Paris, 78 (1874), 423–436.
7. T. Goto, A study on the Selmer groups of elliptic curves with a rational 2-torsion, Kyushu University, Doctoral thesis, 2002.
8. J. M. Harris, J. L. Hirst, M. J. Mossinghoff, Combinatorics and Graph Theory, Springer-Verlag, Berlin, 2000.
9. B. Iskra, Non-congruent numbers with arbitrarily many prime factors congruent to 3 modulo 8, Proc. Japan Acad., 72 (1996), 168–169.
10. N. Koblitz, Introduction to Elliptic Curves and Modular Forms, GTM 97, 2nd ed., Springer-Verlag, 1993.
11. J. Lagrange, Construction d'une table de nombres congruents, Bull. Soc. Math. France, Suppl. Mem., 49–50 (1977), 125–130.
12. J. Lagrange, Nombres congruents et courbes elliptiques, Sémin. Delange-Pisot-Poitou, 1974/75, Fasc. 1, Exposé 16, 17pp.
13. F. Lemmermeyer, Some families of non-congruent numbers, Acta Arith., 110 (2003), 15–36.
14. F. R. Nemenzo, All congruent numbers less than 40000, Proc. Japan Acad., 74 (1998), 29–31.
15. P. Serf, Congruent numbers and elliptic curves, in Computational Number Theory (Debrecen, 1989), de Gruyter, 1991, 227–238.
16. J. Silverman, The Arithmetic of Elliptic Curves, GTM 106, Springer-Verlag, 1986.
17. J. B. Tunnell, A classical Diophantine problem and modular forms of weight 3/2, Invent. Math., 72 (1983), 323–334.
18. C. Zhao, A criterion for elliptic curves with lowest 2-power in L(1), Math. Proc. Cambridge Philos. Soc., 121 (1997), 385–400.
19. C. Zhao, A criterion for elliptic curves with lowest 2-power in L(1) II, Acta Math. Sinica (English ser.), 21 (2005), 961–976.


DISTRIBUTION OF UNITS OF AN ALGEBRAIC NUMBER FIELD MODULO AN IDEAL

YOSHIYUKI KITAOKA

Department of Mathematics, Meijo University, Tenpaku, Nagoya, 468-8502, Japan
E-mail: [email protected]

Let $F$ be an algebraic number field and $o_F$ the maximal order of $F$. We are interested in how units of $F$ distribute in $(o_F/\mathfrak{n})^\times$, where $\mathfrak{n}$ is an integral ideal. When $\mathfrak{n}$ is a prime ideal, we give an upper bound for the order of the subgroup represented by units in $(o_F/\mathfrak{n})^\times$, using new invariants. Prime ideals are governed by an automorphism of an overfield of $F$ which is a Galois extension of the rationals. We give the expected density of the set of prime ideals which attain the upper bound, taking account of Chebotarev's density theorem. In the third section, we try to generalize the above to principal ideals generated by rational primes. In contrast to the prime ideal case, much remains to be done even to complete the algebraic framework.

1. Introduction

Let $F$ be an algebraic number field and $o_F$ the ring of algebraic integers in $F$. The structure of the group $o_F^\times$ of units in $F$ is well described by Dirichlet's Theorem, which says that there exist a primitive $w$-th root $\zeta_w$ of unity and units $\epsilon_1, \dots, \epsilon_r$ which generate $o_F^\times$ and for which the equality $\zeta_w^{a_0} \prod_i \epsilon_i^{a_i} = 1$ implies $a_0 \equiv 0 \bmod w$, $a_1 = \cdots = a_r = 0$. In this paper, we are interested in the distribution of units modulo an integral ideal. For an integral ideal $\mathfrak{n}$, we put

$$E(\mathfrak{n}) = \{ \epsilon \bmod \mathfrak{n} \mid \epsilon \in o_F^\times \} \quad \left( \subset (o_F/\mathfrak{n})^\times \right).$$

This is a finite group, and therefore infinitely many multiplicative relations modulo $\mathfrak{n}$ arise among units. We would like to know these relations and the structures of $E(\mathfrak{n})$ and $(o_F/\mathfrak{n})^\times / E(\mathfrak{n})$. However, these depend heavily on the modulus ideal $\mathfrak{n}$, and we aim to extract a property common to some appropriate set of ideals.


In §2, we take up the case that $\mathfrak{n}$ is a prime ideal. In this case, $(o_F/\mathfrak{n})^\times$ is cyclic, so the structure is determined by the order; this case is easier and was already studied in [7]. Let us briefly explain its outline (see the text for details). Let $K$ be a subsidiary extension field of $F$, and assume that $K$ is a Galois extension of the rational number field $\mathbb{Q}$. We take an element $\eta \in \operatorname{Gal}(K/\mathbb{Q})$ to control prime ideals of $F$. Let $g(x) \in \mathbb{Z}[x]$ be the monic polynomial of minimal degree such that

$$W_1(g(x)) := \left\{ \epsilon^{g(\eta)} \mid \epsilon \in o_F^\times \right\}$$

is a finite group, whose order we denote by $\delta_1$ (cf. (2.1)), and put (cf. (2.3))

$$\delta_0 = \max \left\{ m \;\middle|\; \sqrt[m]{\epsilon}^{\,\delta_1 g(\rho)} = 1 \text{ for } \forall \epsilon \in o_F^\times,\ \forall \rho|_K = \eta \right\}.$$

We say that a prime ideal $\mathfrak{p}$ of $F$ corresponds to $\eta$ if there is a prime ideal of $K$ lying above $\mathfrak{p}$ whose Frobenius automorphism is $\eta$. Then, for every prime ideal $\mathfrak{p}$ of $F$ corresponding to $\eta$, we see that $\#E(\mathfrak{p})$ divides $\delta_1 g(p)/\delta_0$, where $p$ denotes the rational prime number lying below $\mathfrak{p}$, and we conjectured in [7] that

$$\# \left\{ \mathfrak{p} \;\middle|\; p < x,\ p \nmid 2D_K,\ \#E(\mathfrak{p}) = \delta_1 g(p)/\delta_0 \text{ and } \mathfrak{p} \text{ corresponds to } \eta \right\} \sim \operatorname{den}(\eta) \operatorname{Li}(x),$$

where

$$\operatorname{den}(\eta) = \left[ \{ \sigma \in \operatorname{Gal}(K/F) \mid \sigma\eta = \eta\sigma \} : \operatorname{Gal}(K/F) \cap \langle \eta \rangle \right]^{-1} \times \sum_{m=1}^{\infty} \frac{\mu(m)\, \#H_{\delta_0 m}(\eta)}{[K_{\delta_0 m} : K_\eta]}.$$

Here, $\mu(m)$ denotes the Möbius function, $K_m = K\left(\sqrt[m]{o_K^\times}\right)$, $K_\eta$ is the subfield of $K$ fixed by $\eta$, and

$$H_m(\eta) = \left\{ \rho \in \operatorname{Gal}(K_m/\mathbb{Q}) \;\middle|\; \rho|_K = \eta \text{ and } \sqrt[m]{\epsilon}^{\,\delta_1 g(\rho)} = 1 \text{ for } \forall \epsilon \in o_F^\times \right\}.$$

We showed that the expected density $\operatorname{den}(\eta)$ is indeed finite and positive. The conjecture is true under the Generalized Riemann Hypothesis in a few cases [2,6,12–14]. We know the arithmetic framework, but, as with Artin's conjecture on primitive roots, the remaining problem is to estimate the accumulation of error terms when we apply Chebotarev's density theorem to infinitely many algebraic number fields. In §3, we go on to study the case where $\mathfrak{n}$ is a principal ideal generated by a rational prime number. We have already studied a few cases ([8,9]). In


this paper, we deal with slightly wider classes; let $F$ be a Galois extension of $\mathbb{Q}$ and $\eta$ an element of the center of $\operatorname{Gal}(F/\mathbb{Q})$. These two assumptions play an essential role at present, although they should be loosened. The automorphism $\eta$ controls rational primes through the Frobenius automorphism. Denote the group of roots of unity in $F$ by $W_F$. Then we see, as $\eta$-modules,

$$\mathbb{Q} \otimes_{\mathbb{Z}} \left( o_F^\times / W_F \right) \cong \bigoplus_i \mathbb{Q}[d_i],$$

where $\mathbb{Q}[d_i]$ denotes the cyclotomic field $\mathbb{Q}(\zeta_{d_i})$ with $\eta$-action given by $\alpha^\eta = \zeta_{d_i} \alpha$ for $\alpha \in \mathbb{Q}(\zeta_{d_i})$, and $d_i$ is a divisor of the order $d$ of $\eta$. Then, writing $E(p)$ for $E((p))$, we will see in Corollary 3.3 that

$$\#E(p) = \#W_F \prod_i |\Phi_{d_i}(p)| \Big/ \left[ \ker \iota_p : W_F\, (o_F^\times)^{\eta - p} \right]$$

for a rational prime $p$ ($\nmid 2D_F$) whose Frobenius automorphism is $\eta$. Here, $\Phi_m(x)$ is the cyclotomic polynomial of index $m$, and a canonical mapping $\iota_p : o_F^\times \to E(p)/W_F$ is defined by $\epsilon \mapsto \epsilon \bmod p$. Putting

$$\operatorname{Rel}_\eta = \gcd_p \left[ \ker \iota_p : W_F\, (o_F^\times)^{\eta - p} \right],$$

where $p$ corresponds to $\eta$ as above and is sufficiently large (cf. (3.7)), we have

$$\#E(p) \;\Big|\; \#W_F \prod_i |\Phi_{d_i}(p)| \Big/ \operatorname{Rel}_\eta$$

and hence

$$\operatorname{Rel}_\eta \;\Big|\; \#W_F \prod_i |\Phi_{d_i}(p)| \Big/ \#E(p).$$

Experimentally, we expect that there are infinitely many prime numbers $p$ which correspond to $\eta$ and satisfy $\operatorname{Rel}_\eta = \#W_F \prod_i |\Phi_{d_i}(p)| / \#E(p)$, replacing "$\mid$" by "$=$". We do not know how to evaluate $\operatorname{Rel}_\eta$. But there is a plausible description for it with supporting experimental data, which we will explain below. Putting together subgroups corresponding to the same $d_i$, we put, for a divisor $m$ of $d$,

$$U(m) = \left\{ u \in o_F^\times \;\middle|\; u^{\Phi_m(\eta)} \in W_F \right\}.$$


Let $g(x)$ be the polynomial defined in the case $K = F$ above, and define natural numbers $\tilde{\Delta}$ and $\tau_m(\tilde{\eta})$ for any extension $\tilde{\eta}$ of $\eta$ as follows:

$$\tilde{\Delta} = \max \left\{ t \;\middle|\; \zeta_t^{g(\rho)} = 1 \text{ for } \forall \rho \in \operatorname{Gal}(F(\zeta_t)/\mathbb{Q}) \text{ satisfying } \rho|_F = \eta \right\},$$

$$\tau_m = \tau_m(\tilde{\eta}) := \max \left\{ t \;\middle|\; \zeta_t^{\Phi_m(\tilde{\eta})} = 1,\ t \mid \tilde{\Delta} \right\}.$$

For a prime number $p$ whose Frobenius class in $\operatorname{Gal}\left( F\left( \sqrt[\tilde{\Delta}]{o_F^\times} \right) \big/ \mathbb{Q} \right)$ contains $\tilde{\eta}$, we put

$$R(\tilde{\eta}) = \left\{ \prod_{m \mid d} v_m \;\middle|\; \text{(i) } v_m \in U(m), \quad \text{(ii) } \prod_{m \mid d} v_m^{\Phi_m(p)/\tau_m} \equiv \zeta \bmod p \text{ for } \exists \zeta \in W_F \right\}.$$

Here $\Phi_m(p)/\tau_m$ is an integer, and the group $R(\tilde{\eta})$ depends only on $\tilde{\eta}$ in spite of its definition. Obviously, it induces a subgroup of $\ker \iota_p$,

$$R(\tilde{\eta}, p) = \left\{ \prod_{m \mid d} v_m^{\Phi_m(p)/\tau_m} \;\middle|\; \prod_{m \mid d} v_m \in R(\tilde{\eta}) \right\} W_F\, (o_F^\times)^{\eta - p},$$

and hence $[\ker \iota_p : W_F\, (o_F^\times)^{\eta - p}]$ is divisible by the index $[R(\tilde{\eta}, p) : W_F\, (o_F^\times)^{\eta - p}]$, which depends only on $\tilde{\eta}$. Now we have

$$\kappa(\eta) := \gcd_{\tilde{\eta}} \left[ R(\tilde{\eta}, p) : W_F\, (o_F^\times)^{\eta - p} \right] \;\Big|\; \operatorname{Rel}_\eta.$$

After these preparations, we give $\kappa(\eta)$ explicitly for several types of algebraic number fields of low degree in §4, and, making use of it, we confirm by computer experiment the existence of a prime number $p$ satisfying $\kappa(\eta) = [\ker \iota_p : W_F\, (o_F^\times)^{\eta - p}]$, which yields the expectation $\operatorname{Rel}_\eta = \kappa(\eta)$. As mentioned above, we have already studied several cases [8] where the rank of $o_F^\times$ is one. There, we gave the value $\kappa(\eta)$ by experiments and showed that the expected density of the set of primes $p$ satisfying $\kappa(\eta) = \#W_F \prod |\Phi_{d_i}(p)| / \#E(p)$ is indeed positive. Our argument here elucidates the theoretical background of their values. In the appendix, we give the structure of the Galois group of the field extended by roots of units, which is necessary to consider all extensions $\tilde{\eta}$ of $\eta$ explicitly.

Notations: For an algebraic number field $L$, we denote by $o_L$, $o_L^\times$, $W_L$, $D_L$, $L_m$ the ring of algebraic integers in $L$, the group of units in $L$, the group of roots of unity in $L$, the discriminant of $L$, and the field $L\left(\sqrt[m]{o_L^\times}\right)$ extended by all $m$-th roots of units in $L$, respectively. Assume that $L$ is a


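Galois extension over $\mathbb{Q}$. $\sigma_{L/\mathbb{Q}}(\mathfrak{p})$ denotes the Frobenius automorphism of a prime ideal $\mathfrak{p}$ of $L$, and for a prime number $p$ lying below $\mathfrak{p}$, $\sigma_{L/\mathbb{Q}}(p)$ denotes the conjugacy class $\{ \rho\, \sigma_{L/\mathbb{Q}}(\mathfrak{p})\, \rho^{-1} \mid \rho \in \operatorname{Gal}(L/\mathbb{Q}) \}$. For integers $a, b$ we denote their greatest common divisor by $(a, b)$ or by $\gcd(a, b)$. For a polynomial $f(x) = a_0 + a_1 x + \cdots + a_n x^n \in \mathbb{Z}[x]$, $\rho \in \operatorname{Gal}(L/\mathbb{Q})$ and $u \in L^\times$, we write

$$u^{f(\rho)} = \prod_{t=0}^{n} u^{a_t \rho^t}.$$

We denote by $F$ an algebraic number field, with which we are mainly concerned in this article, and for an integral ideal $\mathfrak{n}$ of $F$, we put

$$E(\mathfrak{n}) = \{ \epsilon \bmod \mathfrak{n} \mid \epsilon \in o_F^\times \} \quad \left( \subset (o_F/\mathfrak{n})^\times \right).$$

We denote $\#W_F$ by $w$. For a natural number $n$, $\zeta_n$ is a primitive $n$-th root of unity and the polynomial

$$\Phi_n(x) = \prod_{(a, n) = 1} (x - \zeta_n^a)$$

is the cyclotomic polynomial of index $n$. For a polynomial $h(x)$, we put

$$h(x, y) = \frac{h(x) - h(y)}{x - y}.$$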

2. Case of prime ideals

Throughout this section, $K \supset F$ are fixed algebraic number fields, and we assume that $\#o_F^\times = \infty$ and that $K$ is a Galois extension of the rational number field $\mathbb{Q}$. We choose and fix an element $\eta \in \operatorname{Gal}(K/\mathbb{Q})$ to control prime ideals of $F$ through the Frobenius automorphism.

2.1. Polynomial $g(x)$

First, we introduce a key polynomial $g(x)$, which plays a central role.

Lemma 2.1. Let $g(x)$ be a non-zero polynomial in $\mathbb{Z}[x]$ such that

$$W_1(g(x)) := \left\{ \epsilon^{g(\eta)} \mid \epsilon \in o_F^\times \right\}$$

is a finite group. We fix a primitive polynomial $g(x)$ of minimal degree among them. Then it divides $x^d - 1$ in $\mathbb{Z}[x]$ for

$$d := \left[ \langle \eta \rangle : \langle \eta \rangle \cap \operatorname{Gal}(K/F) \right].$$


Proof. By virtue of $\eta^d \in \operatorname{Gal}(K/F)$, $W_1(x^d - 1) = \{1\}$ is clear, and we may take a primitive polynomial $g(x)$ of minimal degree satisfying $\#W_1(g(x)) < \infty$. Then there exist an integer $a$ and polynomials $q(x), r(x) \in \mathbb{Z}[x]$ so that $a(x^d - 1) = q(x)g(x) + r(x)$ and $\deg r(x) < \deg g(x)$. The assumption $\#o_F^\times = \infty$ implies $\deg g(x) \ge 1$. For $\epsilon \in o_F^\times$, we have $\epsilon^{r(\eta)} = \epsilon^{a(\eta^d - 1) - q(\eta)g(\eta)} = (\epsilon^{g(\eta)})^{-q(\eta)}$, and hence $W_1(r(x))$ is a finite group. Hence the minimality of $\deg g(x)$ implies $r(x) = 0$, and then the primitiveness of both $x^d - 1$ and $g(x)$ entails that $g(x)$ divides $x^d - 1$.

Since the polynomial $g(x)$ divides $x^d - 1$, we may assume that $g(x)$ is monic. Hereafter the monic polynomial $g(x)$ means the one defined in Lemma 2.1, and we put

$$\delta_1 := \#W_1(g(x)). \qquad (2.1)$$

Example 2.1. If $\eta \in \operatorname{Gal}(K/F)$ holds, then obviously $\epsilon^\eta = \epsilon$ for any $\epsilon \in o_F^\times$, and so we have $g(x) = x - 1$ and $\delta_1 = 1$.

Although the determination of $g(x)$ in general is complicated, we know the following [7]: Suppose that $K = F$ is a Galois extension of $\mathbb{Q}$; then the polynomial $g(x)$ is given as follows:
(R) The case where $F$ is real.
(R1) $g(x) = x^{d-1} + x^{d-2} + \cdots + 1$ if $\operatorname{Gal}(F/\mathbb{Q}) = \langle \eta \rangle$.
(R2) $g(x) = x^d - 1$ otherwise.
(I) The case where $F$ is imaginary. We denote the complex conjugation by $J$.
(I1) $g(x) = x^{d-1} + x^{d-2} + \cdots + 1$ if $[\operatorname{Gal}(F/\mathbb{Q}) : \langle \eta \rangle] = 2$ and $J \notin \langle \eta \rangle$.
(I2) $g(x) = x^d - 1$ if $[\operatorname{Gal}(F/\mathbb{Q}) : \langle \eta \rangle] > 2$ and $J \notin \langle \eta \rangle$.
(I3) $g(x) = x^{d/2-1} + x^{d/2-2} + \cdots + 1$ if $\operatorname{Gal}(F/\mathbb{Q}) = \langle \eta \rangle$.
(I4) The case of $J \in \langle \eta \rangle \ne \operatorname{Gal}(F/\mathbb{Q})$.
(i) If there is an element $u \in \operatorname{Gal}(F/\mathbb{Q})$ such that $JuJ^{-1}u^{-1} \notin \langle \eta \rangle$, then $g(x) = x^d - 1$.
(ii) If $JuJ^{-1}u^{-1} \in \langle \eta \rangle$ holds for every element $u \in \operatorname{Gal}(F/\mathbb{Q})$, then $g(x) = x^{d/2} - 1$.

Remark. In the case of (ii) in (I4), $g(x)$ is $(x^{d/2} - 1)(\sum_i x^{a_i} - \sum_i x^{b_i})$ in [7], but it turns out that it must be $x^{d/2} - 1$ in virtue of $J = \eta^{d/2} \in Z(\operatorname{Gal}(F/\mathbb{Q}))$.


In the general case, suppose that there is a real infinite place of $F$; then we can show

$$g(x) = \begin{cases} x^{d-1} + x^{d-2} + \cdots + 1 & \text{if } [F : \mathbb{Q}] = d, \\ x^d - 1 & \text{otherwise.} \end{cases}$$

But in the case of $F$ being totally imaginary, the evaluation of $g(x)$ is not easy.

2.2. Upper bound for $\#E(\mathfrak{p})$

Let $\mathfrak{P}$ ($\nmid 2D_K$) be an unramified prime ideal of $K$ whose Frobenius automorphism $\sigma_{K/\mathbb{Q}}(\mathfrak{P})$ is $\eta$; then we say that a prime number $p$ and a prime ideal $\mathfrak{p}$ of $F$ lying below $\mathfrak{P}$ correspond to $\eta$. Note that the assumption yields that the condition $\zeta \equiv 1 \bmod \mathfrak{P}$ for a root of unity $\zeta$ in $K$ implies $\zeta = 1$. By ramification theory, in particular by applying the assertion (3) for $i = 1$ in the following theorem to $L = K$, $M = F$, $N = \mathbb{Q}$ and $\mathfrak{Q} = \mathfrak{P}$ with $H = \operatorname{Gal}(K/F)$ and $Z = \langle \eta \rangle$, we derive that

$$d := \left[ \langle \eta \rangle : \langle \eta \rangle \cap \operatorname{Gal}(K/F) \right] = \deg(\mathfrak{P} \cap F), \qquad (2.2)$$

where $p^{\deg \mathfrak{p}}$ signifies the number of elements of the residue class field modulo a prime ideal $\mathfrak{p}$.

Hilbert's ramification theory for intermediate fields: Let $L \supset M \supset N$ be algebraic number fields, and suppose that $L/N$ is a Galois extension with Galois group $G$, and that $H$ is the subgroup corresponding to $M$. For a prime ideal $\mathfrak{Q}$ of $L$, the decomposition group and the inertia group of $\mathfrak{Q}$ with respect to $L/N$ are denoted by $Z$, $T$ respectively. Then we have:
(1) For $\sigma, \tau \in G$, $\mathfrak{Q}^\sigma \cap M = \mathfrak{Q}^\tau \cap M$ if and only if $Z\sigma H = Z\tau H$.
(2) Let

$$G = Z\sigma_1 H + \cdots + Z\sigma_s H \qquad (\sigma_1 = \mathrm{id})$$

be the double coset decomposition. Then the ideals $\mathfrak{Q}^{\sigma_1} \cap M, \dots, \mathfrak{Q}^{\sigma_s} \cap M$ are all the distinct prime ideals of $M$ lying above $\mathfrak{Q} \cap N$.
(3) Let $e_i$, $f_i$ be the ramification index and the relative degree of $\mathfrak{Q}^{\sigma_i} \cap M$ with respect to $M/N$, respectively. Then we have

$$e_i f_i = \left[ \sigma_i^{-1} Z \sigma_i : \sigma_i^{-1} Z \sigma_i \cap H \right], \qquad e_i = \left[ \sigma_i^{-1} T \sigma_i : \sigma_i^{-1} T \sigma_i \cap H \right].$$


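Lemma 2.2. We put $h(x) := (x^d - 1)/g(x)$ ($\in \mathbb{Z}[x]$). If a prime number $p$ ($\nmid 2D_K$) corresponds to $\eta$, then $\delta_1$ divides $h(p)$.

Proof. Take a unit $\epsilon \in o_F^\times$ such that $\epsilon^{g(\eta)}$ is a primitive $\delta_1$-th root of unity, let $\mathfrak{P}$ be an unramified prime ideal of $K$ lying above $p$ such that $\sigma_{K/\mathbb{Q}}(\mathfrak{P}) = \eta$, and put $\mathfrak{p} = \mathfrak{P} \cap F$. For a generator $\alpha \in o_F$ of $(o_F/\mathfrak{p})^\times$, we put $\epsilon \equiv \alpha^a \bmod \mathfrak{p}$ ($a \in \mathbb{Z}$). Then we have

$$1 = \epsilon^{\delta_1 g(\eta)} \equiv \epsilon^{\delta_1 g(p)} \equiv \alpha^{a \delta_1 g(p)} \bmod \mathfrak{P},$$

which implies $1 \equiv \alpha^{a \delta_1 g(p)} \bmod \mathfrak{p}$. Since $d = \deg \mathfrak{p}$ by (2.2), there is an integer $b$ such that $a g(p) \delta_1 = (p^d - 1) b$, or $a \delta_1 = h(p) b$. We have only to show that $(\delta_1, b) = 1$. Suppose that $q$ is a prime number dividing $(\delta_1, b)$; then we have

$$\epsilon^{g(p) \delta_1 / q} \equiv \alpha^{a g(p) \delta_1 / q} \equiv \alpha^{(p^d - 1) b / q} \equiv 1 \bmod \mathfrak{p},$$

which implies $\epsilon^{g(\eta) \delta_1 / q} \equiv 1 \bmod \mathfrak{P}$. Since $\epsilon^{g(\eta) \delta_1 / q}$ is a root of unity in $K$ and $\mathfrak{P} \nmid 2D_K$, we have $\epsilon^{g(\eta) \delta_1 / q} = 1$. This contradicts the fact that $\epsilon^{g(\eta)}$ is a primitive $\delta_1$-th root of unity.

Lemma 2.3. Let $m$ be a natural number and $\mathfrak{p}$ ($\nmid m$) a prime ideal of $F$ corresponding to $\eta$. Let $\mathfrak{P}_m$ ($\mid \mathfrak{p}$) be a prime ideal of $K_m$ whose Frobenius automorphism $\rho$ is an extension of $\eta$. Then we have

$$m\, \#E(\mathfrak{p}) \mid \delta_1 g(p) \iff \sqrt[m]{\epsilon}^{\,\delta_1 g(\rho)} = 1 \text{ for } \forall \epsilon \in o_F^\times,$$

where $p$ is a prime number lying below $\mathfrak{p}$ and $\sqrt[m]{\epsilon}$ means all $m$-th roots of $\epsilon$.

Proof. The left-hand side assertion is equivalent to $m \mid \delta_1 g(p)$ and $\epsilon^{\delta_1 g(p)/m} \equiv 1 \bmod \mathfrak{p}$ for $\forall \epsilon \in o_F^\times$, as $(o_F/\mathfrak{p})^\times$ is cyclic. Since $\sqrt[m]{\epsilon}^{\,\delta_1 g(\rho)}$ is an $m$-th root of unity in $K_m$ for $\epsilon \in o_F^\times$, the congruence $\epsilon^{\delta_1 g(p)/m} \equiv 1 \bmod \mathfrak{P}_m$ is equivalent to $\sqrt[m]{\epsilon}^{\,\delta_1 g(\rho)} = 1$ by $p \nmid 2mD_K$. Therefore the left-hand side assertion is equivalent to $m \mid \delta_1 g(p)$ and $\sqrt[m]{\epsilon}^{\,\delta_1 g(\rho)} = 1$ for $\forall \epsilon \in o_F^\times$. Noting that the condition $m \mid \delta_1 g(p)$ is contained in the second condition, taking $1$ as $\epsilon$, we complete the proof.

In the lemma, the right-hand side assertion holds for $m = 1$, and therefore $\#E(\mathfrak{p}) \mid \delta_1 g(p)$.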
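The divisibility $\#E(\mathfrak{p}) \mid \delta_1 g(p)$ can be observed numerically in the simplest case. The sketch below is an illustration only, under the following assumptions that are not taken from the text: $K = F = \mathbb{Q}(\sqrt{2})$, $\eta$ is the non-trivial automorphism, and $p \equiv \pm 3 \pmod 8$ is inert, so that $\mathfrak{p} = (p)$. Then case (R1) gives $g(x) = x + 1$, and $W_1(g(x))$ consists of the norms $\pm 1$ of units, so $\delta_1 = 2$. The unit group is generated by $-1$ and $\epsilon = 1 + \sqrt{2}$, and since $\epsilon^{p+1} \equiv \epsilon \cdot \epsilon^{\eta} = -1 \bmod p$, the group $E(\mathfrak{p})$ is generated by $\epsilon$ alone. The function names are hypothetical.

```python
def mul(x, y, p):
    """Product of a + b*sqrt(2) and c + d*sqrt(2), coefficients reduced mod p."""
    a, b = x
    c, d = y
    return ((a * c + 2 * b * d) % p, (a * d + b * c) % p)

def order_of_unit(p, unit=(1, 1)):
    """Multiplicative order of unit = 1 + sqrt(2) in (Z[sqrt 2]/(p))^x, p an odd prime."""
    power, k = unit, 1
    while power != (1, 0):
        power = mul(power, unit, p)
        k += 1
    return k

# Primes p = 3 or 5 mod 8 are inert in Q(sqrt 2); here #E(p) is the order of 1 + sqrt(2).
for p in (3, 5, 11, 13, 19, 29):
    n = order_of_unit(p)
    print(p, n, (2 * (p + 1)) % n == 0)   # delta_1 * g(p) = 2 * (p + 1)
```

For $p = 3$ and $p = 5$ the orders are $8$ and $12$, so the bound $2(p + 1)$ is attained in these two cases. In the same setting $h(x) = (x^2 - 1)/(x + 1) = x - 1$, and $\delta_1 = 2$ divides $h(p) = p - 1$, in agreement with Lemma 2.2.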

DISTRIBUTION OF UNITS


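We note that
$$m\,\#E(\mathfrak{p}) \mid \delta_1 g(p) \iff m \cdot h(p)/\delta_1 \mid [(\mathfrak{o}_F/\mathfrak{p})^{\times} : E(\mathfrak{p})],$$
where $h(p)/\delta_1$ is an integer by Lemma 2.2. We put for a natural number $m$
$$H_m(\eta) := \left\{ \rho \in \mathrm{Gal}(K_m/\mathbb{Q}) \;\middle|\; \rho|_K = \eta \text{ and } \left(\sqrt[m]{\epsilon}\right)^{\delta_1 g(\rho)} = 1 \text{ for } \forall \epsilon \in \mathfrak{o}_F^{\times} \right\}.$$
Here $\sqrt[m]{\epsilon}$ means all $m$-th roots of $\epsilon$ and so $\zeta_m^{\delta_1 g(\rho)} = 1$ for $\rho \in H_m(\eta)$. Now we introduce another constant
$$\delta_0 = \max\left\{ m \;\middle|\; \left(\sqrt[m]{\epsilon}\right)^{\delta_1 g(\rho)} = 1 \text{ for } \forall \epsilon \in \mathfrak{o}_F^{\times},\ \forall \rho \in \mathrm{Gal}(K_m/\mathbb{Q}) \text{ with } \rho|_K = \eta \right\}. \tag{2.3}$$
The maximum is assured to exist, by applying Proposition 5.1 in the appendix to $L = K$, $f(x) = \delta_1 g(x)$ with $\epsilon = 1$. Then, taking $m = \delta_0$ in Lemma 2.3, we have
$$\#E(\mathfrak{p}) \mid \delta_1 g(p)/\delta_0$$
for all prime ideals $\mathfrak{p}$ corresponding to $\eta$. The evaluation of $\delta_0$ is not easy in general.

Proposition 2.1. We have $(\delta_0, \delta_1) = 1$ and $\delta_0 \mid g(p)$ for a prime number $p$ corresponding to $\eta$.

Proof. Let $m$ be a divisor of $(\delta_0, \delta_1)$. Take $\epsilon = \epsilon_0 \in \mathfrak{o}_F^{\times}$ so that the order of $\epsilon_0^{g(\eta)}$ is $\delta_1$; then $\left(\sqrt[\delta_0]{\epsilon_0}\right)^{\delta_1 g(\rho)} = 1$ for any extension $\rho$ of $\eta$ implies $\epsilon_0^{(\delta_1/m)g(\eta)} = 1$, which yields $m = 1$, i.e., $(\delta_0, \delta_1) = 1$. Then, Lemma 2.3 implies $\delta_0 \mid g(p)$.

Proposition 2.2. Let $\mathfrak{p}$ be a prime ideal corresponding to $\eta$. For a natural number $m$ which is not divisible by $p$, the condition $m \mid \delta_1 g(p)/\#E(\mathfrak{p})$ holds if and only if $\rho := \sigma_{K_m/\mathbb{Q}}(\mathfrak{P}_m) \in H_m(\eta)$, where $\mathfrak{P}_m$ is a prime ideal of $K_m$ lying above $\mathfrak{p}$ and satisfies $\sigma_{K/\mathbb{Q}}(\mathfrak{P}_m \cap K) = \eta$.

Proof. This is an immediate consequence of Lemma 2.3.

2.3. Conjecture

The previous proposition means that for any given natural number $m$, the condition on $\mathfrak{p}$ that $\delta_1 g(p)/\#E(\mathfrak{p})$ is a multiple of $m$ is characterized in terms of Frobenius automorphisms. Therefore, after some transformations, we can apply Chebotarev's density theorem (see [7] for details).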


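Theorem 2.1. Let $m$ be a natural number. If $\mathfrak{p}$ is a prime ideal of $F$ and $p$ is a prime number lying below it, then the density of the set
$$\left\{ \mathfrak{p} \;\middle|\; p \nmid 2m\delta_0 D_K,\ m\,\#E(\mathfrak{p}) \mid g(p)\delta_1/\delta_0, \text{ and } \mathfrak{p} \text{ corresponds to } \eta \right\}$$
is equal to
$$\frac{\#H_{m\delta_0}(\eta)}{[K_{m\delta_0} : K_\eta]} \Big[ \{\sigma \in \mathrm{Gal}(K/F) \mid \sigma\eta = \eta\sigma\} : \mathrm{Gal}(K/F) \cap \langle\eta\rangle \Big]^{-1}.$$
Here $K_\eta$ is the subfield of $K$ fixed by $\langle\eta\rangle$.

Then the usual procedure
$$\sum_{\mathfrak{p}\,:\,\#E(\mathfrak{p}) = g(p)\delta_1/\delta_0} 1 \;=\; \sum_{m} \mu(m) \sum_{\mathfrak{p}\,:\,m\#E(\mathfrak{p}) \mid g(p)\delta_1/\delta_0} 1$$
suggests

Conjecture 2.1. Denoting by $\mathrm{den}(\eta)$
$$\sum_{m=1}^{\infty} \frac{\mu(m)\,\#H_{\delta_0 m}(\eta)}{[K_{\delta_0 m} : K_\eta]} \Big[ \{\sigma \in \mathrm{Gal}(K/F) \mid \sigma\eta = \eta\sigma\} : \mathrm{Gal}(K/F) \cap \langle\eta\rangle \Big]^{-1},$$
and denoting a prime number and a prime ideal of $F$ by $p$, $\mathfrak{p}$ ($\mathfrak{p} \mid p$), we have
$$\#\left\{ \mathfrak{p} \;\middle|\; p < x,\ p \nmid 2D_K,\ \#E(\mathfrak{p}) = g(p)\delta_1/\delta_0 \text{ and } \mathfrak{p} \text{ corresponds to } \eta \right\} \sim \mathrm{den}(\eta)\,\mathrm{Li}(x).$$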
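The "usual procedure" above rests on the elementary identity $\sum_{m \mid n} \mu(m) = 1$ if $n = 1$ and $0$ otherwise, applied to the integer $n = g(p)\delta_1/(\delta_0 \#E(\mathfrak{p}))$. The following Python sketch illustrates only this counting step; the list of index values is made-up sample data, and the function names are ours, not taken from the paper.

```python
def mobius(n):
    """Moebius function mu(n) for n >= 1, by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # a square divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result


def exact_count(indices):
    """Number of entries equal to 1 (the analogue of #E(p) = g(p) d1/d0)."""
    return sum(1 for n in indices if n == 1)


def sieved_count(indices):
    """Same count via sum_m mu(m) * #{n : m | n}."""
    top = max(indices)
    return sum(
        mobius(m) * sum(1 for n in indices if n % m == 0)
        for m in range(1, top + 1)
    )


# Hypothetical values of g(p) d1 / (d0 #E(p)) for a handful of primes.
sample_indices = [1, 2, 6, 1, 12, 5, 1, 30]
```

Both `exact_count(sample_indices)` and `sieved_count(sample_indices)` count the same entries; the conjecture replaces the inner counts by the densities of Theorem 2.1.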

That the infinite sum $\mathrm{den}(\eta)$ converges to a positive number is shown in [7]. We note that the condition $p < x$ is used instead of $N_{F/\mathbb{Q}}(\mathfrak{p}) < x$, and so this is a modification of the usual natural density. This conjecture is a generalization of [2,6,12–14] and hence the conjecture for a real quadratic field $K = F$ is true under the Generalized Riemann Hypothesis. When $\eta \in \mathrm{Gal}(K/F)$, we know that $d = 1$ and $g(x) = x - 1$ in Lemma 2.1 and hence both $\sigma_{K/\mathbb{Q}}(\mathfrak{P}) = \sigma_{K/F}(\mathfrak{P})$ and $h(x) = 1$ hold. Therefore the conjecture is true under G.R.H. by a result of [12].

The situation of [12] is as follows: Let $K/F$ be a finite Galois extension, let $C$ be a union of conjugacy classes of $\mathrm{Gal}(K/F)$, let $W$ be a finitely generated subgroup of $F^{\times}$ of finite rank ($\geq 1$) modulo its torsion subgroup, and let $k$ be an integer ($> 0$). In [12], under G.R.H. it is shown that the density of the set $M(F, K, C, W, k)$ of prime ideals $\mathfrak{p}$ of $F$ exists. Here $\mathfrak{p}$ is in $M(F, K, C, W, k)$ if and only if for a prime ideal $\mathfrak{P}$ of $K$ lying above $\mathfrak{p}$, the Frobenius automorphism $\sigma_{K/F}(\mathfrak{P})$ is in $C$, $\mathrm{ord}_{\mathfrak{p}}(w) = 0$ for all $w \in W$, and the index $[(\mathfrak{o}_F/\mathfrak{p})^{\times} : \{w \bmod \mathfrak{p} \mid w \in W\}]$ divides $k$. Hence the index is bounded. In contrast to this, our case allows that $[(\mathfrak{o}_F/\mathfrak{p})^{\times} : E(\mathfrak{p})]$ tends to infinity.

3. Case of rational primes

Hereafter, we study the distribution of units modulo $(p)$, writing $E(p)$ for $E((p))$, where $p$ is a rational prime, and we restrict ourselves to the case where $K = F$ is a Galois extension of $\mathbb{Q}$, and we let $\eta \in \mathrm{Gal}(F/\mathbb{Q})$. From Corollary 3.3 to the end of this section, we will assume $\eta \in Z(\mathrm{Gal}(F/\mathbb{Q}))$. This is because the assumption yields $\epsilon^{\eta} \equiv \epsilon^{p} \bmod p$ for $\epsilon \in \mathfrak{o}_F^{\times}$ if a prime number $p$ corresponds to $\eta$.

3.1. Structure of $\mathfrak{o}_F^{\times}$ as an $\eta$-module

Lemma 3.1. Let $d$ be a natural number. For a divisor $m$ of $d$, we put
$$\Theta_m(x) = \sum_{k=0}^{d-1} \Bigg( \sum_{\substack{a \bmod m \\ (a,m)=1}} \zeta_m^{ak} \Bigg) x^k \in \mathbb{Z}[x].$$
Then we have
$$\sum_{m \mid d} \Theta_m(x) = d, \qquad x^d - 1 \mid \Theta_m(x)\Phi_m(x).$$
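As a numerical sanity check of Lemma 3.1 (it is not part of the argument), the following Python sketch builds $\Theta_m(x)$ from Ramanujan sums and tests both identities for small $d$. It assumes the standard closed form $\sum_{(a,m)=1} \zeta_m^{ak} = \sum_{e \mid (m,k)} \mu(m/e)\,e$; polynomials are lists of integer coefficients, lowest degree first, and all function names are ours.

```python
from math import gcd


def mobius(n):
    """Moebius function mu(n) for n >= 1."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def ramanujan_sum(m, k):
    """Sum of zeta_m^(a*k) over a mod m with gcd(a, m) = 1, as an exact integer."""
    g = gcd(m, k)  # gcd(m, 0) == m
    return sum(mobius(m // e) * e for e in range(1, g + 1) if g % e == 0)


def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def poly_divmod(f, g):
    """Divide f by a monic g over Z; return (quotient, remainder)."""
    f = f[:]
    q = [0] * max(len(f) - len(g) + 1, 1)
    for i in range(len(f) - len(g), -1, -1):
        c = f[i + len(g) - 1]
        q[i] = c
        for j, b in enumerate(g):
            f[i + j] -= c * b
    return q, f[:len(g) - 1]


def cyclotomic(n):
    """Phi_n(x), obtained by dividing x^n - 1 by Phi_e(x) for proper divisors e."""
    f = [-1] + [0] * (n - 1) + [1]
    for e in range(1, n):
        if n % e == 0:
            f, r = poly_divmod(f, cyclotomic(e))
            assert not any(r)
    return f


def theta(m, d):
    """Coefficients of Theta_m(x) for a divisor m of d."""
    return [ramanujan_sum(m, k) for k in range(d)]


def check_lemma_3_1(d):
    divisors = [m for m in range(1, d + 1) if d % m == 0]
    total = [sum(theta(m, d)[k] for m in divisors) for k in range(d)]
    sum_ok = total == [d] + [0] * (d - 1)
    x_d_minus_1 = [-1] + [0] * (d - 1) + [1]
    div_ok = all(
        not any(poly_divmod(poly_mul(theta(m, d), cyclotomic(m)), x_d_minus_1)[1])
        for m in divisors
    )
    return sum_ok and div_ok
```

For instance, `check_lemma_3_1(d)` can be evaluated for each `d` in `range(1, 13)`; for $d = 2$ one finds $\Theta_1 = 1 + x$ and $\Theta_2 = 1 - x$.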

Proof. Let $\zeta$ be a $d$-th root of unity; then there exist a divisor $m$ of $d$ and an integer $a$ so that $\zeta = \zeta_m^{a}$, where $m$ and $a \bmod m$ ($(a, m) = 1$) are uniquely determined, and hence we have
$$\sum_{m \mid d} \Bigg( \sum_{\substack{a \bmod m \\ (a,m)=1}} \zeta_m^{ak} \Bigg) = \sum_{j \bmod d} \zeta_d^{jk} = \begin{cases} 0 & \text{if } k \not\equiv 0 \bmod d, \\ d & \text{if } k \equiv 0 \bmod d, \end{cases}$$
which yields
$$\sum_{m \mid d} \Theta_m(x) = d.$$
Next, let us show that $(x^d - 1)/\Phi_m(x) \mid \Theta_m(x)$. Since a root of $(x^d - 1)/\Phi_m(x) = 0$ is not a primitive $m$-th root of unity, we have only to show that $\Theta_m(\zeta_d^{b}) = 0$ for every integer $b$ such that $\zeta_d^{b}$, being a $d$-th root of unity, is not a primitive $m$-th root of unity. Let $b$ be an integer; then for an integer $a$ satisfying $(a, m) = 1$, we have
$$\begin{aligned}
\zeta_d^{da/m + b} = 1 &\Rightarrow ad/m + b \equiv 0 \bmod d \\
&\Rightarrow a m_1 + b \equiv 0 \bmod m m_1 \quad (\text{putting } d = m m_1) \\
&\Rightarrow b = m_1 m_2,\ m_2 \equiv -a \bmod m \quad (\text{for } \exists m_2 \in \mathbb{Z}) \\
&\Rightarrow \zeta_d^{b} = \zeta_m^{m_2} \text{ is a primitive } m\text{-th root of unity.}
\end{aligned}$$
This shows for $(a, m) = 1$ that if $\zeta_d^{b}$ is not a primitive $m$-th root of unity, then $\zeta_d^{ad/m + b} \neq 1$ and so we have
$$\Theta_m(\zeta_d^{b}) = \sum_{\substack{a \bmod m \\ (a,m)=1}} \sum_{k=0}^{d-1} \left(\zeta_d^{ad/m + b}\right)^{k} = 0.$$
This completes the proof of $(x^d - 1)/\Phi_m(x) \mid \Theta_m(x)$.

Lemma 3.2. For a divisor $m$ of $d$, decompose $x^d - 1$ as $\Phi_m(x)\Psi_m(x) = x^d - 1$. Then there exist polynomials $u_m(x)$, $v_m(x)$ in $\mathbb{Z}[x]$ satisfying
$$u_m(x)\Phi_m(x) + v_m(x)\Psi_m(x) = d. \tag{3.1}$$

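Proof. We put directly as follows:
$$u_m(x) = x\Psi_m'(x) - \deg\Psi_m(x)\cdot\Psi_m(x) \in \mathbb{Z}[x], \qquad v_m(x) = x\Phi_m'(x) - \deg\Phi_m(x)\cdot\Phi_m(x) \in \mathbb{Z}[x].$$
Then we have
$$\begin{aligned}
u_m(x)\Phi_m(x) + v_m(x)\Psi_m(x)
&= \big(x\Psi_m'(x) - \deg\Psi_m(x)\cdot\Psi_m(x)\big)\Phi_m(x) + \big(x\Phi_m'(x) - \deg\Phi_m(x)\cdot\Phi_m(x)\big)\Psi_m(x) \\
&= x\big(\Psi_m'(x)\Phi_m(x) + \Phi_m'(x)\Psi_m(x)\big) - \big(\deg\Psi_m(x) + \deg\Phi_m(x)\big)\Phi_m(x)\Psi_m(x) \\
&= x(x^d - 1)' - d(x^d - 1) \\
&= d.
\end{aligned}$$

We fix $\eta \in \mathrm{Gal}(F/\mathbb{Q})$ and denote the order of $\eta$ by $d$:
$$\#\langle\eta\rangle = d. \tag{3.2}$$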
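The explicit choice of $u_m$, $v_m$ in the proof of Lemma 3.2 is easy to test by machine. The sketch below is an illustration with $d = 6$, $m = 6$, where $\Phi_6(x) = x^2 - x + 1$ and $\Psi_6(x) = x^4 + x^3 - x - 1$ are entered by hand; the helper names are ours, and polynomials are coefficient lists with the lowest degree first.

```python
def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def poly_add(f, g):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]


def trim(f):
    """Drop zero coefficients of highest degree (keep at least one entry)."""
    f = f[:]
    while len(f) > 1 and f[-1] == 0:
        f.pop()
    return f


def x_times_derivative(f):
    """Coefficients of x * f'(x): the coefficient of x^i is multiplied by i."""
    return [i * c for i, c in enumerate(f)]


def lemma_3_2_pair(phi, psi):
    """u = x*psi' - deg(psi)*psi and v = x*phi' - deg(phi)*phi, as in the proof."""
    u = [a - (len(psi) - 1) * b for a, b in zip(x_times_derivative(psi), psi)]
    v = [a - (len(phi) - 1) * b for a, b in zip(x_times_derivative(phi), phi)]
    return u, v


phi6 = [1, -1, 1]          # x^2 - x + 1
psi6 = [-1, -1, 0, 1, 1]   # x^4 + x^3 - x - 1, so that phi6 * psi6 = x^6 - 1
u6, v6 = lemma_3_2_pair(phi6, psi6)
combination = trim(poly_add(poly_mul(u6, phi6), poly_mul(v6, psi6)))
```

Here `combination` is the constant polynomial $d = 6$. The computation uses only that $\Phi\Psi = x^d - 1$ with both factors monic, so the same check applies to any such factorization.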

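Proposition 3.1. Let $m$ be a positive divisor of $d$. Then
$$U(m) = \left\{ u \in \mathfrak{o}_F^{\times} \;\middle|\; u^{\Phi_m(\eta)} \in W_F \right\} \tag{3.3}$$
is an $\eta$-stable subgroup of $\mathfrak{o}_F^{\times}$, and we have
$$(\mathfrak{o}_F^{\times})^{\Theta_m(\eta)},\ (\mathfrak{o}_F^{\times})^{\Psi_m(\eta)} \subset U(m), \qquad (\mathfrak{o}_F^{\times})^{d} \subset \prod_{m \mid d} U(m) \subset \mathfrak{o}_F^{\times}.$$
If $\eta$ is in the center of $\mathrm{Gal}(F/\mathbb{Q})$, then $U(m)$ is $\mathrm{Gal}(F/\mathbb{Q})$-stable.

Proof. It is easy to see that $U(m)$ is $\eta$-stable, and if we assume that $\eta$ is in the center of $\mathrm{Gal}(F/\mathbb{Q})$, then $u^{\rho\Phi_m(\eta)} = u^{\Phi_m(\eta)\rho}$ holds for $\rho \in \mathrm{Gal}(F/\mathbb{Q})$ and $u \in \mathfrak{o}_F^{\times}$, and so $U(m)$ is a $\mathrm{Gal}(F/\mathbb{Q})$-stable subgroup. By the previous lemmas, we know that $\Theta_m(x)\Phi_m(x) \equiv \Psi_m(x)\Phi_m(x) \equiv 0 \bmod x^d - 1$, whence $\Theta_m(\eta)\Phi_m(\eta) = \Psi_m(\eta)\Phi_m(\eta) = 0$, which implies the left-hand side inclusion. Then Lemma 3.1 yields
$$u^{d} = u^{\sum_{m \mid d}\Theta_m(\eta)} = \prod_{m \mid d} u^{\Theta_m(\eta)},$$
which implies the right-hand side inclusion.

Proposition 3.2. (i) Let $u_m$, $v_m$ be elements in $U(m)$ and $\zeta \in W_F$. Then
$$\prod_{m \mid d} u_m = \zeta \prod_{m \mid d} v_m$$
implies
$$u_m/v_m \in W_F, \quad \forall m \mid d.$$
(ii) Let $q$ be a natural number and $u_m \in U(m)$, $\zeta \in W_F$. If
$$\prod_{m \mid d} u_m \equiv \zeta \bmod q,$$
then there are roots of unity $\kappa_m$ in $W_F$ such that
$$u_m^{d} \equiv \kappa_m \bmod q, \quad \forall m \mid d.$$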


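Proof. To prove the assertions, we may assume $\zeta = v_m = 1$, taking the quotient of both sides and absorbing $\zeta^{-1}$ into $u_1$. Let $u_m(x)$, $v_m(x)$, $\Psi_m(x)$ be those in Lemma 3.2. The assertion (i) is proved as follows. Noting that $u_m^{-1} = \prod_{n \mid d,\, n \neq m} u_n$, we have by (3.1)
$$u_m^{d} = \prod_{\substack{n \mid d \\ n \neq m}} u_n^{-u_m(\eta)\Phi_m(\eta) - v_m(\eta)\Psi_m(\eta)} = \Bigg( \prod_{\substack{n \mid d \\ n \neq m}} u_n \Bigg)^{-v_m(\eta)\Psi_m(\eta)} \cdot u_m^{u_m(\eta)\Phi_m(\eta)},$$
which we rewrite as
$$u_m^{d} = \Bigg( \prod_{\substack{n \mid d \\ n \neq m}} \big(u_n^{\Phi_n(\eta)}\big)^{-v_m(\eta)(\Psi_m/\Phi_n)(\eta)} \Bigg) \cdot u_m^{\Phi_m(\eta)u_m(\eta)},$$
where we note that $\Phi_n(x)$ divides $\Psi_m(x)$ if $n \neq m$. Hence, recalling (3.3), $u_n^{\Phi_n(\eta)} \in W_F$ implies $u_m^{d} \in W_F$ and so $u_m \in W_F$.

Next, we assume $\prod u_m \equiv 1 \bmod q$; then similarly as above we have
$$u_m^{d} \equiv \Bigg( \prod_{\substack{n \mid d \\ n \neq m}} \big(u_n^{\Phi_n(\eta)}\big)^{-v_m(\eta)(\Psi_m/\Phi_n)(\eta)} \Bigg) \cdot u_m^{\Phi_m(\eta)u_m(\eta)} \bmod q,$$
and the right-hand side is in $W_F$ and denoted by $\kappa_m$.

Remark 3.1. In (ii), a stronger conclusion $u_m \equiv \kappa_m \bmod q$ does not hold in general.

Considering $U(m)/W_F$ as a $\mathbb{Z}$-lattice, we put $V(m) = U(m)/W_F \otimes_{\mathbb{Z}} \mathbb{Q}$. A polynomial $f(x) \in \mathbb{Z}[x]$ acts on $U(m)$ by $u \mapsto u^{f(\eta)}$, and so $U(m)/W_F$ is a $\mathbb{Z}[x]$-module annihilated by $\Phi_m(x)$. Hence $\mathbb{Q}[x]/(\Phi_m(x))$ acts on $V(m)$. Thus $V(m)$ is a vector space over $\mathbb{Q}(\zeta_m)$, and thus the following Lemma 3.3 (Exercise 2 on p.282 in [3]) is clear. Note that $f(x) \in \mathbb{Q}[x]/(\Phi_m(x))$ acts on $\mathbb{Q}(\zeta_m)$ by $\alpha \mapsto f(\zeta_m)\alpha$ ($\alpha \in \mathbb{Q}(\zeta_m)$).

Lemma 3.3. As $\eta$-modules, we have
$$\mathbb{Q} \otimes_{\mathbb{Z}} \big(\mathfrak{o}_F^{\times}/W_F\big) \cong \bigoplus_{i} \mathbb{Q}[d_i],$$
where $\mathbb{Q}[d_i]$ is $\mathbb{Q}(\zeta_{d_i})$ viewed as a representation space of $\eta$, on which $\eta$ acts by $\alpha^{\eta} = \zeta_{d_i}\alpha$ for $\alpha \in \mathbb{Q}(\zeta_{d_i})$, and $d_i$ is a divisor of the order $d$ of $\eta$.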

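Now let $1, \zeta_{d_i}, \dots, \zeta_{d_i}^{\varphi(d_i)-1}$ be a basis of $\mathbb{Z}[\zeta_{d_i}]$ ($\subset \mathbb{Q}[d_i]$) and put
$$\Phi_{d_i}(x) = a_0 + a_1 x + \dots + a_{\varphi(d_i)-1}x^{\varphi(d_i)-1} + x^{\varphi(d_i)}.$$
Then we have
$$1^{\eta} = \zeta_{d_i},\quad \zeta_{d_i}^{\eta} = \zeta_{d_i}^{2},\quad \dots,\quad \big\{\zeta_{d_i}^{\varphi(d_i)-1}\big\}^{\eta} = \zeta_{d_i}^{\varphi(d_i)} = -\big(a_0 + a_1\zeta_{d_i} + \dots + a_{\varphi(d_i)-1}\zeta_{d_i}^{\varphi(d_i)-1}\big).$$
Hence, denoting by $U_i$ the subgroup of $\mathfrak{o}_F^{\times}$ corresponding to $n\mathbb{Z}[\zeta_{d_i}]$ for an appropriate natural number $n$ in Lemma 3.3, where $n$ is sufficiently large to kill the ambiguity of $W_F$, $\prod U_i$ is a subgroup of finite index of $\mathfrak{o}_F^{\times}$, and each $U = U_i$ has a basis $U = \langle \epsilon_0, \epsilon_1, \dots, \epsilon_{\varphi(d_i)-1} \rangle$ ($\epsilon_j \leftrightarrow \zeta_{d_i}^{j}$) such that
$$\epsilon_i^{\eta} = \epsilon_{i+1} \quad (i = 0, 1, \dots, \varphi(d_i) - 2), \qquad \epsilon_0^{a_0}\epsilon_1^{a_1}\cdots\epsilon_{\varphi(d_i)-1}^{a_{\varphi(d_i)-1}}\,\epsilon_{\varphi(d_i)-1}^{\eta} = 1.$$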

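Lemma 3.4. For the subgroup $U = U_i$ above, and $h(x) \in \mathbb{Z}[x]$, we have
$$U^{h(\eta)} \subset W_F \iff \Phi_{d_i}(x) \mid h(x).$$

Proof. The assertion follows from
$$U^{h(\eta)} \subset W_F \iff h(\zeta_{d_i})\mathbb{Z}[\zeta_{d_i}] = 0 \iff h(\zeta_{d_i}) = 0 \iff \Phi_{d_i}(x) \mid h(x).$$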

Corollary 3.1. Let $g(x)$ be the polynomial defined in the previous section for $K = F$ and $\eta$, i.e. the monic polynomial in $\mathbb{Z}[x]$ of minimal degree such that
$$\left\{ \epsilon^{g(\eta)} \;\middle|\; \epsilon \in \mathfrak{o}_F^{\times} \right\}$$
is a finite group. Then $g(x)$ is equal to
$$\operatorname{lcm}_i \Phi_{d_i}(x) = \prod_{m \mid d,\ U(m) \neq W_F} \Phi_m(x).$$

Proof. By Lemma 3.4, the following equivalence holds for $h(x) \in \mathbb{Z}[x]$:
$$\left\{ \epsilon^{h(\eta)} \;\middle|\; \epsilon \in \mathfrak{o}_F^{\times} \right\} \subset W_F \iff U_i^{h(\eta)} \subset W_F \text{ for } \forall i \iff \Phi_{d_i} \mid h \text{ for } \forall i.$$
Thus we have $g(x) = \operatorname{lcm}_i \Phi_{d_i}(x)$. Similarly for $U(m)$ in (3.3), it is easy to see
$$\left\{ \epsilon^{h(\eta)} \;\middle|\; \epsilon \in \mathfrak{o}_F^{\times} \right\} \subset W_F \iff U(m)^{h(\eta)} \subset W_F \text{ for } \forall m \iff \Phi_m \mid h \text{ for } \forall m,$$
where $m$ should satisfy the condition $U(m) \neq W_F$. Indeed, the first equivalence is obvious, and the right-hand side divisibility implies the middle inclusion. Assume the middle inclusion. If $\Phi_m(x) \nmid h(x)$, then there are polynomials $f_1(x), f_2(x) \in \mathbb{Z}[x]$ such that $f_1(x)\Phi_m(x) + f_2(x)h(x) = e \in \mathbb{Z}$ ($e \neq 0$), and hence $U(m)^{e} \subset W_F$, which yields $U(m) \subset W_F$. This contradicts $U(m) \neq W_F$ and hence we obtain the right-hand side divisibility. Thus we have $g(x) = \prod_{m \mid d,\ U(m) \neq W_F} \Phi_m(x)$.

Theorem 3.1. For an integer $p$ ($\neq \pm 1$), we have
$$\big[\mathfrak{o}_F^{\times} : W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big] = \prod_{i} |\Phi_{d_i}(p)|,$$
where the $d_i$'s are those in Lemma 3.3.

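To prove the theorem, we need some lemmas.

Lemma 3.5. For $\epsilon \in \mathfrak{o}_F^{\times}$ and for an integer $p$ ($\neq \pm 1$), the inclusion $\epsilon^{\eta - p} \in W_F$ implies $\epsilon \in W_F$. Moreover, let $\mathfrak{o}_F^{\times} \supset U \supset V \supset W_F$ be $\eta$-groups; then the mapping $\phi : \epsilon \mapsto \epsilon^{\eta - p}$ from $U$ to $W_F U^{\eta - p}$ induces an isomorphism
$$U/V \cong W_F U^{\eta - p}/W_F V^{\eta - p}.$$

Proof. Suppose $\epsilon^{\eta - p} \in W_F$ for $\epsilon \in \mathfrak{o}_F^{\times}$. Inductively, it is easy to see that there is an element $\kappa_n \in W_F$ such that
$$\epsilon^{\eta^{n}} = \kappa_n \epsilon^{p^{n}}.$$
The assumption $\eta^{d} = \mathrm{id}$ yields $\epsilon = \kappa_d \epsilon^{p^{d}}$. Hence we have $\epsilon^{1 - p^{d}} = \kappa_d \in W_F$, which implies $\epsilon \in W_F$ by $1 - p^{d} \neq 0$. Next, suppose $\phi(\epsilon) \in W_F V^{\eta - p}$ for $\epsilon \in U$; then $\epsilon^{\eta - p} = \zeta v^{\eta - p}$ ($\zeta \in W_F$, $v \in V$) holds. Thus $(\epsilon/v)^{\eta - p} = \zeta$ implies $\epsilon/v \in W_F$ and $\epsilon \in V$, which completes the proof.

Lemma 3.6. Let $U_1$, $U_2$, $U$ be $\eta$-subgroups of $\mathfrak{o}_F^{\times}$ and suppose
$$U_1U_2 \subset U, \qquad U_1 \cap U_2 = W_F, \qquad [U : U_1U_2] < \infty.$$
Then we have
$$[U : W_F U^{\eta - p}] = [U_1 : W_F U_1^{\eta - p}]\,[U_2 : W_F U_2^{\eta - p}].$$

Proof. A canonical mapping $(u_1, u_2) \mapsto u_1u_2$ from $U_1 \times U_2$ to $U_1U_2$ induces a surjective homomorphism
$$f : U_1 \times U_2 \to U_1U_2 / W_F(U_1U_2)^{\eta - p},$$
and it is easy to see $\ker f \supset W_F U_1^{\eta - p} \times W_F U_2^{\eta - p}$ and
$$\begin{aligned}
(u_1, u_2) \in \ker f &\Rightarrow u_1u_2 \in W_F(U_1U_2)^{\eta - p} \\
&\Rightarrow u_1u_2 = \zeta(v_1v_2)^{\eta - p} \quad (\zeta \in W_F,\ v_i \in U_i) \\
&\Rightarrow \zeta' := \frac{u_1}{v_1^{\eta - p}} = \zeta\,\frac{v_2^{\eta - p}}{u_2} \in U_1 \cap U_2 = W_F \\
&\Rightarrow u_1 = \zeta' v_1^{\eta - p},\ u_2 = (\zeta/\zeta')v_2^{\eta - p}.
\end{aligned}$$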

Hence we have $\ker f \subset W_F U_1^{\eta - p} \times W_F U_2^{\eta - p}$ and so $\ker f = W_F U_1^{\eta - p} \times W_F U_2^{\eta - p}$, which implies
$$[U_1U_2 : W_F(U_1U_2)^{\eta - p}] = [U_1 : W_F U_1^{\eta - p}]\,[U_2 : W_F U_2^{\eta - p}]. \tag{3.4}$$
Lemma 3.5 yields
$$[U : U_1U_2] = [W_F U^{\eta - p} : W_F(U_1U_2)^{\eta - p}]. \tag{3.5}$$
From (3.4) and (3.5) we have
$$[U : W_F U^{\eta - p}] = \frac{[U : U_1U_2]\,[U_1U_2 : W_F(U_1U_2)^{\eta - p}]}{[W_F U^{\eta - p} : W_F(U_1U_2)^{\eta - p}]} = [U_1 : W_F U_1^{\eta - p}]\,[U_2 : W_F U_2^{\eta - p}].$$

Lemma 3.7. Suppose for an $\eta$-subgroup $U$ we have
$$U = W_F\left\{ \epsilon^{f(\eta)} \;\middle|\; f(x) \in \mathbb{Z}[x] \right\}$$
for some $\epsilon \in \mathfrak{o}_F^{\times}$. Let $h(x) \in \mathbb{Z}[x]$ be a primitive polynomial of minimal degree such that $U^{h(\eta)} \subset W_F$. Then $U/W_F U^{\eta - p}$ is a cyclic group generated by a coset $\epsilon W_F U^{\eta - p}$ and the following holds:
$$\big[U : W_F U^{\eta - p}\big] = |h(p)|,$$
$$W_F U^{\eta - p} = W_F\left\{ \epsilon^{A(\eta)} \;\middle|\; A(x) \in \mathbb{Z}[x] \text{ with } \deg A(x) < \deg h(x) \text{ and } A(p) \equiv 0 \bmod h(p) \right\}.$$

Proof. By the assumption $\eta^{d} = \mathrm{id}$, we have $U^{\eta^{d} - 1} = \{1\} \subset W_F$ and so the polynomial $h(x)$ referred to in the lemma exists. Dividing $x^d - 1$ by $h(x)$, we write $x^d - 1 = q(x)h(x) + r(x)$ ($q(x), r(x) \in \mathbb{Q}[x]$, $\deg r(x) < \deg h(x)$), and choose a non-zero integer $a$ such that $aq(x), ar(x) \in \mathbb{Z}[x]$. Then by virtue of
$$\epsilon^{ar(\eta)} = \epsilon^{a(\eta^{d} - 1) - aq(\eta)h(\eta)} = \big(\epsilon^{h(\eta)}\big)^{-aq(\eta)} \in W_F,$$


we have $r(x) = 0$ by the choice of $h(x)$. Hence $x^d - 1 = q(x)h(x)$, and we may assume that $h(x)$ is monic. We let its degree be $n$. Because of $\epsilon \in U$, $\epsilon^{\eta - p} \in U^{\eta - p}$ is clear and hence we have $\epsilon^{\eta}\,W_F U^{\eta - p} = \epsilon^{p}\,W_F U^{\eta - p}$, and so $U/W_F U^{\eta - p}$ is a cyclic group generated by $\epsilon W_F U^{\eta - p}$. Since $h(x)$ is a monic polynomial of degree $n$, we have
$$U = W_F\big\langle \epsilon, \epsilon^{\eta}, \dots, \epsilon^{\eta^{n-1}} \big\rangle.$$

We note that $\epsilon^{A(\eta)} \in W_F$ for a polynomial $A(x) \in \mathbb{Z}[x]$ with $\deg A(x) < \deg h(x)$ implies $A(x) = 0$. Indeed, $\epsilon^{A(\eta)} \in W_F$ yields $U^{A(\eta)} \subset W_F$ and therefore the definition of $h(x)$ implies $A(x) = 0$. Now, let us show the second assertion; put
$$v = \epsilon^{A(\eta)} \in U, \qquad A(x) = a_0 + a_1x + \dots + a_{n-1}x^{n-1} \in \mathbb{Z}[x].$$
We shall show that $v \in W_F U^{\eta - p}$ is equivalent to $A(p) \equiv 0 \bmod h(p)$. To this end, write
$$h(x) = \sum_{i=0}^{n} h_i x^{i}, \qquad h_n = 1.$$

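First, assume $v = \kappa u^{\eta - p} \in W_F U^{\eta - p}$ and put $u = \epsilon^{b_0 + b_1\eta + \dots + b_{n-1}\eta^{n-1}}$; then we have
$$\begin{aligned}
v &= \kappa\,\epsilon^{(b_0 + b_1\eta + \dots + b_{n-1}\eta^{n-1})(\eta - p)} \\
&= \kappa\,\epsilon^{b_0\eta + b_1\eta^{2} + \dots + b_{n-1}\eta^{n} - p(b_0 + b_1\eta + \dots + b_{n-1}\eta^{n-1})} \\
&= \kappa\,\epsilon^{b_{n-1}h(\eta)} \times \epsilon^{b_0\eta + b_1\eta^{2} + \dots + b_{n-2}\eta^{n-1} - b_{n-1}(h_0 + \dots + h_{n-1}\eta^{n-1}) - p(b_0 + b_1\eta + \dots + b_{n-1}\eta^{n-1})}.
\end{aligned}$$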

By the choice of $h(x)$, $\epsilon^{b_{n-1}h(\eta)} \in W_F$, whence comparing the exponent of $\epsilon$, we obtain
$$a_0 = -b_{n-1}h_0 - pb_0, \qquad a_k = b_{k-1} - b_{n-1}h_k - pb_k, \quad 1 \leq k \leq n - 1.$$
Hence, putting $B(x) = \sum_{k=1}^{n} b_{k-1}x^{k-1}$, we get
$$B(x) = \sum_{k=1}^{n-1} (b_{n-1}h_k + pb_k + a_k)x^{k-1} + b_{n-1}x^{n-1},$$
which we may rewrite as
$$b_{n-1}(h(x) - h_0)/x + p(B(x) - b_0)/x + \sum_{k=1}^{n-1} a_k x^{k-1},$$
whence
$$(x - p)B(x) = b_{n-1}h(x) - b_{n-1}h_0 - pb_0 + \sum_{k=1}^{n-1} a_k x^{k} = b_{n-1}h(x) + A(x).$$

Substituting $x = p$, we have $A(p) = -b_{n-1}h(p)$, i.e., $h(p) \mid A(p)$. If, conversely, $h(p) \mid A(p)$ holds, then we define $b_{n-1} \in \mathbb{Z}$ by $A(p) = -b_{n-1}h(p)$. Then $x - p$ divides $b_{n-1}h(x) + A(x)$ and we may put
$$b_{n-1}h(x) + A(x) = (x - p)B(x), \qquad B(x) \in \mathbb{Z}[x].$$
The leading coefficient of $B(x)$ is $b_{n-1}$ and so we may put $B(x) = \sum_{k=0}^{n-1} b_k x^{k}$ for some integers $b_0, \dots, b_{n-2}$. Then we have $v = \epsilon^{A(\eta)} = \big(\epsilon^{h(\eta)}\big)^{-b_{n-1}}\big(\epsilon^{B(\eta)}\big)^{\eta - p}$. Since by the choice of $h(x)$, $\epsilon^{h(\eta)} \in W_F$, we obtain $v \in W_F U^{\eta - p}$. Thus we have shown the equivalence and the last assertion in the lemma. Since $U/W_F U^{\eta - p}$ is generated by $\epsilon W_F U^{\eta - p}$, the index $[U : W_F U^{\eta - p}]$ is equal to the order of $\epsilon W_F U^{\eta - p}$. Applying the last assertion to $A(x) = m \in \mathbb{Z}$, we conclude that the condition $\epsilon^{m} \in W_F U^{\eta - p}$ is equivalent to $m \equiv 0 \bmod h(p)$. Hence the assertion $[U : W_F U^{\eta - p}] = |h(p)|$ follows.

Proof of Theorem 3.1. By Lemma 3.3, there are $\eta$-subgroups $U_i$ of $\mathfrak{o}_F^{\times}$ such that $U_i/W_F \cong \mathbb{Z}[\zeta_{d_i}]$ and the $U_i/W_F$'s form a direct product in $\mathfrak{o}_F^{\times}/W_F$. The equality $[U_i : W_F U_i^{\eta - p}] = |\Phi_{d_i}(p)|$ follows from Lemma 3.7, and then Lemma 3.6 completes the proof of the theorem.

Corollary 3.2. Suppose that $p$ ($\neq \pm 1$) is an integer; then we have for $U(m)$ defined in (3.3)
$$\big[U(m) : W_F U(m)^{\eta - p}\big] = |\Phi_m(p)|^{r}, \qquad U(m)^{\Phi_m(p)} \subset W_F U(m)^{\eta - p},$$
where $r$ is defined by $r\varphi(m) = \operatorname{rank}_{\mathbb{Z}} U(m)$.

Proof. Recalling that $\mathbb{Q} \otimes_{\mathbb{Z}} (U(m)/W_F)$ is a vector space over $\mathbb{Q}(\zeta_m)$, we denote its dimension by $r$. Therefore $U(m)$ contains a subgroup which is isomorphic to a direct product of $r$ copies of $\mathbb{Z}[\zeta_m]$ as $\eta$-modules. Then the first equation follows from Lemmas 3.6 and 3.7. Let $u \in U(m)$; then $u^{\Phi_m(\eta) - \Phi_m(p)} = u^{\Phi_m(\eta, p)(\eta - p)} \in U(m)^{\eta - p}$ (cf. Notation) and $U(m)^{\Phi_m(\eta)} \subset W_F$ (cf. (3.3)) together imply $u^{\Phi_m(p)} \in W_F U(m)^{\eta - p}$.

From now on, we assume $\eta \in Z(\mathrm{Gal}(F/\mathbb{Q}))$.
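The index formula $[U : W_F U^{\eta - p}] = |h(p)|$ of Lemma 3.7 can be cross-checked by linear algebra. The sketch below is an illustration, not the argument used in the text: it assumes the identification $U/W_F \cong \mathbb{Z}[x]/(h) \cong \mathbb{Z}^{n}$ with $\eta$ acting through the companion matrix $C$ of the monic polynomial $h$, and uses the standard fact that the index of the image of an injective integer matrix equals the absolute value of its determinant. Function names are ours.

```python
from fractions import Fraction


def companion(h):
    """Companion matrix of a monic h = h[0] + h[1] x + ... + x^n (multiplication by x)."""
    n = len(h) - 1
    C = [[0] * n for _ in range(n)]
    for j in range(n - 1):
        C[j + 1][j] = 1            # x * x^j = x^(j+1)
    for i in range(n):
        C[i][n - 1] = -h[i]        # x * x^(n-1) = -(h_0 + ... + h_{n-1} x^(n-1))
    return C


def det(M):
    """Exact determinant of an integer matrix by Gaussian elimination over Q."""
    M = [[Fraction(v) for v in row] for row in M]
    n = len(M)
    d = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return int(d)


def index_of_eta_minus_p(h, p):
    """|det(C - p I)|: index of the sublattice (eta - p) Z^n in Z^n."""
    C = companion(h)
    n = len(C)
    M = [[C[i][j] - (p if i == j else 0) for j in range(n)] for i in range(n)]
    return abs(det(M))


def evaluate(h, p):
    return sum(c * p**i for i, c in enumerate(h))
```

For example, with `h = [1, 1, 1, 1, 1]` (that is, $\Phi_5$) and `p = 7`, the value `index_of_eta_minus_p(h, p)` agrees with `abs(evaluate(h, p))`, in line with Theorem 3.1 for a single summand $\mathbb{Q}[5]$.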


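Corollary 3.3. Suppose that $\eta$ is in the center of $\mathrm{Gal}(F/\mathbb{Q})$, and a prime number $p$ ($\nmid 2D_F$) satisfies $\eta = \sigma_{F/\mathbb{Q}}(p)$ (cf. Notation). Then for a canonical surjective mapping $\iota_p : \mathfrak{o}_F^{\times} \to E(p)/W_F$ defined by $\epsilon \mapsto \epsilon \bmod p$ we have
$$\#E(p) = w\,\frac{[\mathfrak{o}_F^{\times} : W_F(\mathfrak{o}_F^{\times})^{\eta - p}]}{[\ker\iota_p : W_F(\mathfrak{o}_F^{\times})^{\eta - p}]} = \frac{w\prod_{i}|\Phi_{d_i}(p)|}{[\ker\iota_p : W_F(\mathfrak{o}_F^{\times})^{\eta - p}]}, \tag{3.6}$$
where $w = \#W_F$.

Proof. Since $\eta$ is in the center of $\mathrm{Gal}(F/\mathbb{Q})$, it follows from $\eta = \sigma_{F/\mathbb{Q}}(p)$ that $\epsilon^{\eta - p} \equiv 1 \bmod p$ holds for $\epsilon \in \mathfrak{o}_F^{\times}$, and so we have $W_F(\mathfrak{o}_F^{\times})^{\eta - p} \subset \ker\iota_p$, whence by the homomorphism theorem
$$\#(E(p)/W_F) = \#E(p)/w = \frac{[\mathfrak{o}_F^{\times} : W_F(\mathfrak{o}_F^{\times})^{\eta - p}]}{[\ker\iota_p : W_F(\mathfrak{o}_F^{\times})^{\eta - p}]} = \frac{\prod_{i}|\Phi_{d_i}(p)|}{[\ker\iota_p : W_F(\mathfrak{o}_F^{\times})^{\eta - p}]}$$
by Theorem 3.1.

In regard to this, under the assumption $\eta \in Z(\mathrm{Gal}(F/\mathbb{Q}))$, we put
$$\mathrm{Rel}_\eta = \gcd_{p}\,\big[\ker\iota_p : W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big], \tag{3.7}$$
where prime numbers $p$ satisfy $\sigma_{F/\mathbb{Q}}(p) = \eta$ and $p \nmid 2D_{F_{\tilde\Delta}}$ for a constant $\tilde\Delta$ defined in the next subsection. Then we have
$$\#E(p) \;\Big|\; \frac{w\prod_{i}|\Phi_{d_i}(p)|}{\mathrm{Rel}_\eta}.$$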

This upper bound seems to be the best one. Although we do not know how to evaluate Relη , there is a candidate κ(η) for it, which is a divisor of Relη by definition. We will explain it in the next subsection, and in §4, we describe κ(η) explicitly for several types of algebraic number fields.
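To make the bound of Corollary 3.3 concrete, the following Python sketch computes $E(p)$ by brute force for the real quadratic field $F = \mathbb{Q}(\sqrt{2})$. This example is our own illustration: we assume $\mathfrak{o}_F = \mathbb{Z}[\sqrt{2}]$, $\mathfrak{o}_F^{\times} = \langle -1, 1 + \sqrt{2} \rangle$ and $w = 2$. For primes inert in $F$ ($p \equiv \pm 3 \bmod 8$) the Frobenius $\eta$ has order 2 and acts on $\mathfrak{o}_F^{\times}/W_F$ by $-1$, so the bound reads $\#E(p) \mid 2(p + 1)$; for split primes ($p \equiv \pm 1 \bmod 8$) we have $\eta = \mathrm{id}$ and the bound reads $\#E(p) \mid 2(p - 1)$.

```python
def unit_subgroup_mod_p(p, D=2, eps=(1, 1)):
    """Subgroup of (Z[sqrt(D)]/p)^x generated by -1 and eps[0] + eps[1]*sqrt(D).

    Elements are pairs (a, b) standing for a + b*sqrt(D) with a, b mod p.
    """
    def mul(u, v):
        return ((u[0] * v[0] + D * u[1] * v[1]) % p,
                (u[0] * v[1] + u[1] * v[0]) % p)

    one = (1 % p, 0)
    gens = [((-1) % p, 0), (eps[0] % p, eps[1] % p)]
    seen = {one}
    stack = [one]
    while stack:                      # closure under multiplication by generators
        u = stack.pop()
        for g in gens:
            v = mul(u, g)
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def bound(p):
    """w * |Phi_{d_i}(p)| for Q(sqrt 2): 2(p-1) if p splits, 2(p+1) if p is inert."""
    return 2 * (p - 1) if p % 8 in (1, 7) else 2 * (p + 1)


def check(primes):
    """True if #E(p) divides the bound for every odd prime in the list."""
    return all(bound(p) % len(unit_subgroup_mod_p(p)) == 0 for p in primes)
```

Calling `check([3, 5, 7, 11, 13, 17, 19, 23, 29, 31])` tests the divisibility for the first few odd primes; the quotient `bound(p) // len(unit_subgroup_mod_p(p))` is the index $[\ker\iota_p : W_F(\mathfrak{o}_F^{\times})^{\eta - p}]$ of (3.6).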


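3.2. $\mathrm{Rel}_\eta$ and $\kappa(\eta)$

In this subsection, we define a candidate $\kappa(\eta)$ for $\mathrm{Rel}_\eta$; in §3.3, we rewrite it in a form that is easier to evaluate, and in §4, we write it down explicitly for several types of algebraic number fields. Computer experiments convince us of the truth of the conjecture. As before, let $\eta \in Z(\mathrm{Gal}(F/\mathbb{Q}))$ and let the polynomial $g(x)$ be as in Corollary 3.1. We put
$$\tilde\Delta = \max\left\{ t \in \mathbb{N} \;\middle|\; \zeta_t^{g(\rho)} = 1 \text{ for } \forall \rho \in \mathrm{Gal}(F(\zeta_t)/\mathbb{Q}) \text{ with } \rho|_F = \eta \right\} \tag{3.8}$$
and for a divisor $m$ of $d$ (cf. (3.2)) and an extension $\tilde\eta \in \mathrm{Gal}(F_{\tilde\Delta}/\mathbb{Q})$ of $\eta$, put
$$\tau_m = \tau_m(\tilde\eta) = \max\left\{ t \;\middle|\; \zeta_t^{\Phi_m(\tilde\eta)} = 1,\ t \mid \tilde\Delta \right\},$$
where $F_{\tilde\Delta} = F\Big(\sqrt[\tilde\Delta]{\mathfrak{o}_F^{\times}}\Big)$. The existence of $\tilde\Delta$ is guaranteed by Proposition 5.1. By defining an integer $a$ by
$$\zeta_{\tilde\Delta}^{\tilde\eta} = \zeta_{\tilde\Delta}^{a},$$
it is easy to see that
$$\tau_m = (\Phi_m(a), \tilde\Delta), \tag{3.9}$$
and if $\mathfrak{o}_F^{\times} = U(m)$, then we have $g(x) = \Phi_m(x)$ and $\tau_m = \tilde\Delta$ by Corollary 3.1. Note that for a prime number $p$ ($\nmid 2D_{F_{\tilde\Delta}}$) with $\tilde\eta \in \sigma_{F_{\tilde\Delta}/\mathbb{Q}}(p)$, $\tau_m$ divides $\Phi_m(p)$ because of $\zeta_{\tau_m}^{\Phi_m(p)} \equiv \zeta_{\tau_m}^{\Phi_m(\tilde\eta)} = 1 \bmod \mathfrak{p}$ for a prime ideal $\mathfrak{p}$ lying above $p$, and we may put
$$R(\tilde\eta) = \left\{ \prod_{m \mid d} v_m \;\middle|\; \begin{array}{l} \text{(i) } v_m \in U(m), \\ \text{(ii) } \prod_{m \mid d} v_m^{\Phi_m(p)/\tau_m} \equiv \zeta \bmod p \text{ for } \exists \zeta \in W_F \end{array} \right\}. \tag{3.10}$$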

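This is well-defined by Proposition 3.2 and forms a group. Moreover it is independent of the choice of a prime $p$, which follows from the following proposition.

Proposition 3.3. Let $\tilde\eta \in \mathrm{Gal}(F_{\tilde\Delta}/\mathbb{Q})$ be an extension of $\eta$. Suppose that a prime number $p$ ($\nmid 2D_{F_{\tilde\Delta}}$) satisfies $\sigma_{F_{\tilde\Delta}/\mathbb{Q}}(p) \ni \tilde\eta$, and $v_m \in U(m)$, $\zeta \in W_F$; then we have
$$\prod_{m \mid d} v_m^{\Phi_m(p)/\tau_m} \equiv \zeta \bmod p \iff \prod_{m \mid d} \big(\sqrt[\tau_m]{v_m}\big)^{\Phi_m(\rho\tilde\eta\rho^{-1})} = \zeta \ \text{ for } \forall \rho \in \mathrm{Gal}(F_{\tilde\Delta}/\mathbb{Q}).$$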


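Proof. Suppose a prime ideal $\mathfrak{p}$ of $F_{\tilde\Delta}$ lying above $p$ satisfies $\tilde\eta = \sigma_{F_{\tilde\Delta}/\mathbb{Q}}(\mathfrak{p})$; then we have, for $\rho \in \mathrm{Gal}(F_{\tilde\Delta}/\mathbb{Q})$,
$$\begin{aligned}
\prod_{m \mid d} v_m^{\Phi_m(p)/\tau_m} \equiv \zeta \bmod p
&\Rightarrow \zeta \equiv \prod_{m \mid d} \big(\sqrt[\tau_m]{v_m}\big)^{\Phi_m(p)} \equiv \prod_{m \mid d} \big(\sqrt[\tau_m]{v_m}\big)^{\Phi_m(\rho\tilde\eta\rho^{-1})} \bmod \mathfrak{p}^{\rho^{-1}} \\
&\Rightarrow \zeta = \prod_{m \mid d} \big(\sqrt[\tau_m]{v_m}\big)^{\Phi_m(\rho\tilde\eta\rho^{-1})},
\end{aligned}$$
on noting that the right-hand side is a root of unity in $F_{\tilde\Delta}$ by $\eta \in Z(\mathrm{Gal}(F/\mathbb{Q}))$ and $v_m^{\Phi_m(\eta)} \in W_F$. We may trace the above argument in the reverse way to prove the converse.

For a prime number $p$ ($\nmid 2D_{F_{\tilde\Delta}}$) satisfying $\sigma_{F_{\tilde\Delta}/\mathbb{Q}}(p) \ni \tilde\eta$, we define a mapping
$$\phi_p : \prod_{m \mid d} U(m) \to \mathfrak{o}_F^{\times}/W_F$$
by
$$\phi_p\Big(\prod_{m \mid d} v_m\Big) = \prod_{m \mid d} v_m^{\Phi_m(p)/\tau_m}. \tag{3.11}$$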

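It is well-defined in view of $\Phi_m(p)/\tau_m \in \mathbb{Z}$ and Proposition 3.2, and we see that
$$\iota_p \circ \phi_p(R(\tilde\eta)) = \{1\}, \tag{3.12}$$
by the definition of $R(\tilde\eta)$ and $\iota_p$ in Corollary 3.3.

Proposition 3.4. For $v_m \in U(m)$ and an integer $p$ ($\neq \pm 1$), we have
$$\phi_p\Big(\prod v_m\Big) \in W_F(\mathfrak{o}_F^{\times})^{\eta - p} \iff \prod v_m^{\Phi_m(\eta, p)\tilde\Delta/\tau_m} \in W_F(\mathfrak{o}_F^{\times})^{\tilde\Delta},$$
where $\Phi_m(x, y) = (\Phi_m(x) - \Phi_m(y))/(x - y)$ as in the notation.

Proof. It is easy to see by (3.11)
$$\phi_p\Big(\prod v_m\Big) \in W_F(\mathfrak{o}_F^{\times})^{\eta - p} \iff \prod v_m^{\Phi_m(p)/\tau_m} \cdot \epsilon^{-(\eta - p)} \in W_F, \quad \exists \epsilon \in \mathfrak{o}_F^{\times},$$
which is equivalent to
$$\prod v_m^{\Phi_m(p)\tilde\Delta/\tau_m} \cdot \epsilon^{-(\eta - p)\tilde\Delta} \in W_F, \quad \exists \epsilon \in \mathfrak{o}_F^{\times}, \tag{3.13}$$
noting that for $u \in \mathfrak{o}_F^{\times}$, $u \in W_F$ if and only if $u^{\tilde\Delta} \in W_F$. Since $v_m^{\Phi_m(\eta)\tilde\Delta/\tau_m} \in W_F$ holds by (3.9) and (3.3), (3.13) is equivalent to
$$\begin{aligned}
&\prod v_m^{(\Phi_m(\eta) - \Phi_m(p))\tilde\Delta/\tau_m} \cdot \epsilon^{(\eta - p)\tilde\Delta} \in W_F \\
\iff\ &\Big(\prod v_m^{\Phi_m(\eta, p)\tilde\Delta/\tau_m} \cdot \epsilon^{\tilde\Delta}\Big)^{\eta - p} \in W_F \\
\iff\ &\prod v_m^{\Phi_m(\eta, p)\tilde\Delta/\tau_m} \cdot \epsilon^{\tilde\Delta} \in W_F \quad (\text{by Lemma 3.5}),
\end{aligned}$$
which completes the proof.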

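Proposition 3.5. For an extension $\tilde\eta \in \mathrm{Gal}(F_{\tilde\Delta}/\mathbb{Q})$ of $\eta$ and a prime number $p$ ($\nmid 2D_{F_{\tilde\Delta}}$) satisfying $\tilde\eta \in \sigma_{F_{\tilde\Delta}/\mathbb{Q}}(p)$, we put
$$R(\tilde\eta, p) = \phi_p(R(\tilde\eta))\,W_F(\mathfrak{o}_F^{\times})^{\eta - p} \quad (\subset \mathfrak{o}_F^{\times}),$$
where the image of $\phi_p$ is viewed in $\mathfrak{o}_F^{\times}$, and define an integer $a$ by $\zeta_{\tilde\Delta}^{\tilde\eta} = \zeta_{\tilde\Delta}^{a}$. Then, we have
$$\big[R(\tilde\eta, p) : W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big] = \left[ R(\tilde\eta) : \left\{ \prod_{m \mid d} v_m \;\middle|\; v_m \in U(m),\ \prod_{m \mid d} v_m^{\Phi_m(\eta, a)\tilde\Delta/\tau_m} \in W_F(\mathfrak{o}_F^{\times})^{\tilde\Delta} \right\} \right],$$
which is independent of the choice of $p$.

Proof. Let us see first $\phi_p^{-1}\big(W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big) \subset R(\tilde\eta)$. Let $v_m \in U(m)$ and suppose $\phi_p(\prod v_m) \in W_F(\mathfrak{o}_F^{\times})^{\eta - p}$; we must show that $\prod v_m \in R(\tilde\eta)$. The supposition yields
$$\begin{aligned}
\phi_p\Big(\prod v_m\Big) \in W_F(\mathfrak{o}_F^{\times})^{\eta - p}
&\Rightarrow \prod v_m^{\Phi_m(\eta, p)\tilde\Delta/\tau_m} \in W_F(\mathfrak{o}_F^{\times})^{\tilde\Delta} \quad (\text{by Proposition 3.4}) \\
&\Rightarrow \prod v_m^{(\Phi_m(\eta) - \Phi_m(p))\tilde\Delta/\tau_m} \in W_F(\mathfrak{o}_F^{\times})^{\tilde\Delta(\eta - p)} \\
&\Rightarrow \prod v_m^{\Phi_m(p)\tilde\Delta/\tau_m} = \zeta\epsilon^{\tilde\Delta(\eta - p)} \quad (\zeta \in W_F,\ \epsilon \in \mathfrak{o}_F^{\times}) \\
&\Rightarrow \prod v_m^{\Phi_m(p)/\tau_m} = \zeta'\epsilon^{\eta - p}.
\end{aligned}$$
Here $\zeta'$ is a root of unity and lies in $W_F$, because $\prod v_m^{\Phi_m(p)/\tau_m},\ \epsilon^{\eta - p} \in F$. Therefore, we have
$$\prod v_m^{\Phi_m(p)/\tau_m} = \zeta'\epsilon^{\eta - p} \equiv \zeta' \bmod p$$
and so $\prod v_m \in R(\tilde\eta)$. Now we have by the second homomorphism theorem
$$\big[R(\tilde\eta, p) : W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big] = \big[\phi_p(R(\tilde\eta)) : \phi_p(R(\tilde\eta)) \cap W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big],$$
and it is equal to
$$\begin{aligned}
&\big[R(\tilde\eta) : \phi_p^{-1}\big(W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big)\big] \quad \big(\text{by } \phi_p^{-1}\big(W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big) \subset R(\tilde\eta)\big) \\
&= \left[ R(\tilde\eta) : \left\{ \prod v_m \;\middle|\; v_m \in U(m),\ \prod v_m^{\Phi_m(\eta, p)\tilde\Delta/\tau_m} \in W_F(\mathfrak{o}_F^{\times})^{\tilde\Delta} \right\} \right]
\end{aligned}$$
by Proposition 3.4. The definition of $a$ implies $a \equiv p \bmod \tilde\Delta$ and therefore $v_m^{\Phi_m(\eta, p) - \Phi_m(\eta, a)} \in (\mathfrak{o}_F^{\times})^{\tilde\Delta}$, which completes the proof.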

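Since by (3.12), $R(\tilde\eta, p) \subset \ker\iota_p$ holds, we have
$$\big[R(\tilde\eta, p) : W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big] \;\Big|\; \big[\ker\iota_p : W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big],$$
and
$$\big[R(\tilde\eta, p) : W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big] \;\Big|\; \gcd_{\substack{\sigma_{F_{\tilde\Delta}/\mathbb{Q}}(p) \ni \tilde\eta, \\ p \nmid 2D_{F_{\tilde\Delta}}}} \big[\ker\iota_p : W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big],$$
since the left index is independent of the choice of $p$ ($\nmid 2D_{F_{\tilde\Delta}}$) by Proposition 3.5. Therefore, putting
$$\kappa(\eta) = \gcd_{\tilde\eta|_F = \eta} \big[R(\tilde\eta, p) : W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big], \tag{3.14}$$
we have
$$\kappa(\eta) \;\Big|\; \gcd_{\substack{\sigma_{F/\mathbb{Q}}(p) = \eta, \\ p \nmid 2D_{F_{\tilde\Delta}}}} \big[\ker\iota_p : W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big],$$
where the right-hand side is $\mathrm{Rel}_\eta$ by definition (cf. (3.7)). Hence we have
$$\kappa(\eta) \;\Big|\; \mathrm{Rel}_\eta \;\Big|\; \big[\ker\iota_p : W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big]$$
for prime numbers $p$ which satisfy $\eta \in \sigma_{F/\mathbb{Q}}(p)$ and $p \nmid 2D_{F_{\tilde\Delta}}$. Thus, we have shown with (3.6),

Theorem 3.2.
$$\#E(p) \;\Big|\; \frac{w\prod_{i}|\Phi_{d_i}(p)|}{\kappa(\eta)}. \tag{3.15}$$

We expect $\kappa(\eta) = \mathrm{Rel}_\eta$, and we conjecture that for infinitely many primes $p$, “$\mid$” is replaced by “$=$” in (3.15). Note that if there is at least one prime $p$ such that
$$\kappa(\eta) = w\prod_{i}|\Phi_{d_i}(p)| \Big/ \#E(p) \quad \big(= \big[\ker\iota_p : W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big]\big),$$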


then $\kappa(\eta) = \mathrm{Rel}_\eta$ holds. The computer experiment supports this in all the examples in §4.

To proceed to the next step, we need to know in terms of Frobenius automorphisms the condition on $p$ for which $[\ker\iota_p : W_F(\mathfrak{o}_F^{\times})^{\eta - p}]/\kappa(\eta)$ is a multiple of a natural number $m$ (cf. Proposition 2.2). Successful cases are some number fields with $\operatorname{rank} \mathfrak{o}_F^{\times} = 1$ [8], and cubic abelian fields [9].

Remark 3.2. It is desirable to generalize the prime number case to the more general situation of “modulo a prime ideal” treated in the previous section.

3.3. Evaluation of $\kappa(\eta)$

In this subsection, we give another description of the index $[R(\tilde\eta, p) : W_F(\mathfrak{o}_F^{\times})^{\eta - p}]$ and $\kappa(\eta)$ convenient for evaluation.

3.3.1. Action of automorphisms

First, we study the explicit action of $\tilde\eta$ on $\sqrt[\Delta]{\epsilon}$ as a preparation. Let $\Delta$ be a natural number and $U$ a $\mathrm{Gal}(F/\mathbb{Q})$-stable subgroup of $\mathfrak{o}_F^{\times}$ such that
$$U = \langle \zeta_w, u_1, \dots, u_s \rangle, \tag{3.16}$$
where $\zeta_w$ is a primitive $w$-th root of unity and the $u_i$'s are multiplicatively independent. Therefore $\{u_1, \dots, u_s\}$ is a basis of $U/W_F$ as a $\mathbb{Z}$-module. For a polynomial with integral coefficients $h(x) = h_nx^{n} + h_{n-1}x^{n-1} + \dots + h_0$, we assume
$$U^{h(\eta)} \subset W_F \tag{3.17}$$
and write, as in the introduction,
$$h(x, y) = (h(x) - h(y))/(x - y) = \sum_{t=1}^{n} h_t \sum_{k=0}^{t-1} x^{t-k-1}y^{k}.$$
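The two-variable polynomial $h(x, y)$ is later evaluated at an integer $a$ and an integer matrix $A$, where the key identity is $h(a, A)(a - A) = h(a) - h(A)$. The following Python sketch checks this identity in one small case; the choice $h = \Phi_3$, $a = 4$ and $A$ the companion matrix of $\Phi_3$ (so that $h(A) = 0$) is ours, as are the helper names.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_scale(c, X):
    return [[c * x for x in row] for row in X]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_pow(X, k):
    out = identity(len(X))
    for _ in range(k):
        out = mat_mul(out, X)
    return out


def h_of_matrix(h, A):
    """h(A) = sum_t h_t A^t for h given as [h_0, h_1, ..., h_n]."""
    out = mat_scale(0, identity(len(A)))
    for t, c in enumerate(h):
        out = mat_add(out, mat_scale(c, mat_pow(A, t)))
    return out


def h_two_var(h, a, A):
    """h(a, A) = sum_{t=1}^{n} h_t sum_{k=0}^{t-1} a^(t-k-1) A^k."""
    out = mat_scale(0, identity(len(A)))
    for t in range(1, len(h)):
        for k in range(t):
            out = mat_add(out, mat_scale(h[t] * a ** (t - k - 1), mat_pow(A, k)))
    return out


h = [1, 1, 1]                 # Phi_3(x) = x^2 + x + 1
a = 4
A = [[0, -1], [1, -1]]        # companion matrix of Phi_3, so h(A) = 0
lhs = mat_mul(h_two_var(h, a, A),
              mat_add(mat_scale(a, identity(2)), mat_scale(-1, A)))
h_a = sum(c * a ** t for t, c in enumerate(h))
rhs = mat_add(mat_scale(h_a, identity(2)), mat_scale(-1, h_of_matrix(h, A)))
```

Here `lhs` and `rhs` are both $21$ times the identity matrix, since $\Phi_3(4) = 21$ and $\Phi_3(A) = 0$.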

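We suppose $\eta \in Z(\mathrm{Gal}(F/\mathbb{Q}))$ as before, and let $\tilde\eta$ be an extension of $\eta$ to $\mathrm{Gal}(F_\Delta/\mathbb{Q})$, and fix a $\Delta$-th root $\sqrt[\Delta]{u_j} \in F_\Delta$ once and for all. Write
$$\zeta_{\Delta w}^{\tilde\eta} = \zeta_{\Delta w}^{a}, \qquad \big(\sqrt[\Delta]{u_i}\big)^{\tilde\eta} = \zeta_{\Delta w}^{a_i}\prod_{j} \big(\sqrt[\Delta]{u_j}\big)^{a_{ij}}, \tag{3.18}$$
and similarly for $\rho \in \mathrm{Gal}(F_\Delta/\mathbb{Q})$
$$\zeta_{\Delta w}^{\rho} = \zeta_{\Delta w}^{b}, \qquad \big(\sqrt[\Delta]{u_i}\big)^{\rho} = \zeta_{\Delta w}^{b_i}\prod_{j} \big(\sqrt[\Delta]{u_j}\big)^{b_{ij}}, \tag{3.19}$$
and
$$\mathbf{a} = {}^{t}(a_1, \dots, a_s), \quad \mathbf{b} = {}^{t}(b_1, \dots, b_s), \quad A = (a_{ij}), \quad B = (b_{ij}).$$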

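Lemma 3.8. We have
$$\rho\eta = \eta\rho \text{ on } U \iff b\,\mathbf{a} + A\mathbf{b} \equiv a\,\mathbf{b} + B\mathbf{a} \bmod w, \text{ and } AB = BA.$$

Proof. The assertion follows, comparing
$$u_i^{\eta\rho} = \Big(\zeta_w^{a_i}\prod_{j} u_j^{a_{ij}}\Big)^{\rho} = \zeta_w^{ba_i}\prod_{j}\Big\{\zeta_w^{b_j}\prod_{k} u_k^{b_{jk}}\Big\}^{a_{ij}} = \zeta_w^{ba_i + \sum_j a_{ij}b_j}\prod_{k} u_k^{\sum_j a_{ij}b_{jk}}$$
and
$$u_i^{\rho\eta} = \Big(\zeta_w^{b_i}\prod_{j} u_j^{b_{ij}}\Big)^{\eta} = \zeta_w^{ab_i}\prod_{j}\Big\{\zeta_w^{a_j}\prod_{k} u_k^{a_{jk}}\Big\}^{b_{ij}} = \zeta_w^{ab_i + \sum_j b_{ij}a_j}\prod_{k} u_k^{\sum_j b_{ij}a_{jk}}.$$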

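Lemma 3.9. Putting $A^{k} = (a_{ij}(k))$ for each non-negative integer $k$, we have
$$\big(\sqrt[\Delta]{u_i}\big)^{\tilde\eta^{k}} = \zeta_{\Delta w}^{\alpha_i(k)}\prod_{j} \big(\sqrt[\Delta]{u_j}\big)^{a_{ij}(k)}, \tag{3.20}$$
where $\alpha_i(k)$ is defined by
$${}^{t}(\alpha_1(k), \dots, \alpha_s(k)) = \begin{cases} 0 & \text{if } k = 0, \\ {}^{t}(a_1, a_2, \dots, a_s) & \text{if } k = 1, \\ (a^{k-1} + a^{k-2}A + \dots + A^{k-1})\,\mathbf{a} & \text{if } k > 1. \end{cases} \tag{3.21}$$

Proof. The case of $k = 0, 1$ is clear. Inductively, we see the assertion, using (with $\ell$ as the inner product index)
$$\big(\sqrt[\Delta]{u_i}\big)^{\tilde\eta^{k+1}} = \zeta_{\Delta w}^{\alpha_i(k)a}\prod_{j}\Big\{\zeta_{\Delta w}^{a_j}\prod_{\ell} \big(\sqrt[\Delta]{u_\ell}\big)^{a_{j\ell}}\Big\}^{a_{ij}(k)} = \zeta_{\Delta w}^{\alpha_i(k)a + \sum_j a_{ij}(k)a_j}\prod_{\ell} \big(\sqrt[\Delta]{u_\ell}\big)^{\sum_j a_{ij}(k)a_{j\ell}}.$$

Lemma 3.10. We have
$$u_i^{h(\eta, a)} \in W_F\prod_{j} u_j^{h(a, A)_{(i,j)}},$$
where $h(a, A)_{(i,j)}$ is the $(i, j)$-entry of $h(a, A)$.

Proof. The assertion follows from
$$\begin{aligned}
u_i^{h(\eta, a)} &= u_i^{\sum_{t=1}^{n}\sum_{k=0}^{t-1} h_t a^{t-k-1}\eta^{k}} \\
&\in W_F\prod_{j=1}^{s} u_j^{\sum_{t=1}^{n}\sum_{k=0}^{t-1} h_t a^{t-k-1}a_{ij}(k)} \quad (\text{by Lemma 3.9}) \\
&= W_F\prod_{j} u_j^{h(a, A)_{(i,j)}}.
\end{aligned}$$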

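Lemma 3.11. For an integral vector $\mathbf{x} = (x_1, \dots, x_s)$, we put
$$\sqrt[\Delta]{\epsilon} = \prod_{i} \big(\sqrt[\Delta]{u_i}\big)^{x_i}.$$
Then we have
$$\big(\sqrt[\Delta]{\epsilon}\big)^{h(\tilde\eta)} = \prod_{i} \big(\sqrt[\Delta]{u_i}\big)^{h(\tilde\eta)x_i} = \zeta_{\Delta w}^{\mathbf{x}h(a, A)\mathbf{a}}$$
and
$$h(A) = 0.$$

Proof. We have by (3.20)
$$\big(\sqrt[\Delta]{u_i}\big)^{h(\tilde\eta)} = \prod_{k=0}^{n} \big(\sqrt[\Delta]{u_i}\big)^{h_k\tilde\eta^{k}} = \zeta_{\Delta w}^{\sum_k h_k\alpha_i(k)}\prod_{j} \big(\sqrt[\Delta]{u_j}\big)^{\sum_k a_{ij}(k)h_k}.$$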

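The assumption $u_i^{h(\eta)} \in W_F$ (cf. (3.17)) yields $\sum_k a_{ij}(k)h_k = 0$, i.e., $h(A) = 0$, and $\sum_k h_k\alpha_i(k)$ is the $i$-th component of $h(a, A)\mathbf{a}$ by (3.21), from which the first assertion follows.

Lemma 3.12. For $\rho^{-1}$, we put
$$\zeta_{\Delta w}^{\rho^{-1}} = \zeta_{\Delta w}^{b'}, \qquad \big(\sqrt[\Delta]{u_i}\big)^{\rho^{-1}} = \zeta_{\Delta w}^{b'_i}\prod_{j} \big(\sqrt[\Delta]{u_j}\big)^{b'_{ij}}, \qquad \mathbf{b}' = {}^{t}(b'_1, \dots, b'_s). \tag{3.22}$$
Then we have
$$bb' \equiv 1 \bmod \Delta w, \qquad b'\mathbf{b} \equiv -B\mathbf{b}' \bmod \Delta w, \qquad (b'_{ij}) = B^{-1}. \tag{3.23}$$

Proof. By (3.19), we have
$$\zeta_{\Delta w} = \zeta_{\Delta w}^{\rho\rho^{-1}} = \zeta_{\Delta w}^{bb'},$$
and the equation
$$\sqrt[\Delta]{u_i} = \big(\sqrt[\Delta]{u_i}\big)^{\rho\rho^{-1}} = \zeta_{\Delta w}^{b_ib' + \sum_j b_{ij}b'_j}\prod_{k} \big(\sqrt[\Delta]{u_k}\big)^{\sum_j b_{ij}b'_{jk}}.$$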


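The first equation implies the first congruence in (3.23), and together with the multiplicative independence of the $u_i$'s the second equation implies both $(b'_{ij}) = B^{-1}$ and the second congruence $b'\mathbf{b} + B\mathbf{b}' \equiv 0 \bmod \Delta w$.

Lemma 3.13. Putting
$$\big(\sqrt[\Delta]{u_i}\big)^{\rho\tilde\eta\rho^{-1}} = \zeta_{\Delta w}^{q_i}\prod_{j} \big(\sqrt[\Delta]{u_j}\big)^{q_{ij}}, \qquad \mathbf{q} = {}^{t}(q_1, \dots, q_s), \tag{3.24}$$
we have
$$q_{ij} = a_{ij}, \qquad \mathbf{q} \equiv b'\{(a - A)\mathbf{b} + B\mathbf{a}\} \bmod \Delta w.$$

Proof. By $u_i^{\rho\eta\rho^{-1}} = u_i^{\eta}$, we have $q_{ij} = a_{ij}$. Since we have
$$\begin{aligned}
\big(\sqrt[\Delta]{u_i}\big)^{\rho\tilde\eta\rho^{-1}}
&= \Big\{\zeta_{\Delta w}^{b_i}\prod_{j} \big(\sqrt[\Delta]{u_j}\big)^{b_{ij}}\Big\}^{\tilde\eta\rho^{-1}} && (\text{by (3.19)}) \\
&= \Big\{\zeta_{\Delta w}^{b_ia + \sum_j b_{ij}a_j}\prod_{j}\Big\{\prod_{k} \big(\sqrt[\Delta]{u_k}\big)^{a_{jk}}\Big\}^{b_{ij}}\Big\}^{\rho^{-1}} && (\text{by (3.18)}) \\
&= \zeta_{\Delta w}^{b'(b_ia + \sum_j b_{ij}a_j)}\prod_{j,k}\Big\{\zeta_{\Delta w}^{b'_k}\prod_{\ell} \big(\sqrt[\Delta]{u_\ell}\big)^{b'_{k\ell}}\Big\}^{b_{ij}a_{jk}} && (\text{by (3.22)}) \\
&= \zeta_{\Delta w}^{b'(b_ia + \sum_j b_{ij}a_j) + \sum_{j,k} b'_kb_{ij}a_{jk}}\prod_{\ell} \big(\sqrt[\Delta]{u_\ell}\big)^{a_{i\ell}}, && (\text{by Lemma 3.8})
\end{aligned}$$
it is easy to see by comparing this with (3.24)
$$\begin{aligned}
\mathbf{q} &\equiv b'(a\mathbf{b} + B\mathbf{a}) + BA\mathbf{b}' \\
&\equiv b'(a\mathbf{b} + B\mathbf{a}) + AB\mathbf{b}' && (\text{by Lemma 3.8}) \\
&\equiv b'(a\mathbf{b} + B\mathbf{a}) - Ab'\mathbf{b} && (\text{by Lemma 3.12}) \\
&\equiv b'\{(a - A)\mathbf{b} + B\mathbf{a}\} \bmod \Delta w.
\end{aligned}$$

Corollary 3.4. For
$$\sqrt[\Delta]{\epsilon} = \prod_{i} \big(\sqrt[\Delta]{u_i}\big)^{x_i},$$
we have
$$\Big(\big(\sqrt[\Delta]{\epsilon}\big)^{h(\rho\tilde\eta\rho^{-1})}\Big)^{b} = \zeta_{\Delta w}^{\mathbf{x}\{h(a)\mathbf{b} + h(a, A)B\mathbf{a}\}}.$$

Proof. Using Lemmas 3.11 and 3.13, we have
$$\big(\sqrt[\Delta]{\epsilon}\big)^{h(\rho\tilde\eta\rho^{-1})} = \prod_{i} \big(\sqrt[\Delta]{u_i}\big)^{h(\rho\tilde\eta\rho^{-1})x_i} = \zeta_{\Delta w}^{\mathbf{x}h(a, A)\mathbf{q}} = \zeta_{\Delta w}^{\mathbf{x}h(a, A)b'\{(a - A)\mathbf{b} + B\mathbf{a}\}}.$$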


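Noting that $h(a, A)(a - A) = h(a) - h(A) = h(a)$, we have $h(a, A)\{(a - A)\mathbf{b} + B\mathbf{a}\} \equiv h(a)\mathbf{b} + h(a, A)B\mathbf{a} \bmod \Delta w$, which completes the proof.

3.3.2. Evaluation of $\kappa(\eta)$

With the preparation in the previous subsubsection, we may now give a formula for the index $[R(\tilde\eta, p) : W_F(\mathfrak{o}_F^{\times})^{\eta - p}]$, which is easier to evaluate. For $\tilde\eta \in \mathrm{Gal}(F_{\tilde\Delta}/\mathbb{Q})$ with $\tilde\eta|_F = \eta$, we recall (3.10) in the form
$$R(\tilde\eta) = \left\{ \prod_{m \mid d} v_m \;\middle|\; \begin{array}{l} \text{(i) } v_m \in U(m), \\ \text{(ii) } \prod_{m \mid d} \big(\sqrt[\tau_m]{v_m}\big)^{\Phi_m(\tilde\eta)} = \prod_{m \mid d} \big(\sqrt[\tau_m]{v_m}\big)^{\Phi_m(\rho\tilde\eta\rho^{-1})} \in W_F \text{ for } \forall \rho \end{array} \right\},$$
by Proposition 3.3.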

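Lemma 3.14. Recalling (3.16), (3.18), (3.19), we put
$$U(m) = \langle \zeta_w, u_{m,1}, \dots, u_{m,s_m} \rangle,$$
$$\zeta_{\tilde\Delta w}^{\tilde\eta} = \zeta_{\tilde\Delta w}^{a}, \qquad \big(\sqrt[\tilde\Delta]{u_{m,j}}\big)^{\tilde\eta} = \zeta_{\tilde\Delta w}^{a_{m,j}}\prod_{k} \big(\sqrt[\tilde\Delta]{u_{m,k}}\big)^{a^{(m)}_{jk}},$$
$$\zeta_{\tilde\Delta w}^{\rho} = \zeta_{\tilde\Delta w}^{b}, \qquad \big(\sqrt[\tilde\Delta]{u_{m,i}}\big)^{\rho} = \zeta_{\tilde\Delta w}^{b_{m,i}}\prod_{j} \big(\sqrt[\tilde\Delta]{u_{m,j}}\big)^{b^{(m)}_{ij}},$$
$$\mathbf{a}_m = {}^{t}(a_{m,1}, \dots, a_{m,s_m}), \quad A^{(m)} = (a^{(m)}_{ij}), \qquad \mathbf{b}_m = {}^{t}(b_{m,1}, \dots, b_{m,s_m}), \quad B^{(m)} = (b^{(m)}_{ij}).$$
Then for
$$v_m = \prod_{j} u_{m,j}^{y_{m,j}}, \qquad \mathbf{y}_m = (y_{m,1}, \dots, y_{m,s_m}), \tag{3.25}$$
the condition $\prod_m v_m \in R(\tilde\eta)$ is equivalent to
$$(\#)\quad \begin{cases} \displaystyle\sum_{m} (\tilde\Delta/\tau_m)\,\mathbf{y}_m\Phi_m(a, A^{(m)})\mathbf{a}_m \equiv 0 \bmod \tilde\Delta, \\[2mm] \displaystyle\sum_{m} (\tilde\Delta/\tau_m)\,\mathbf{y}_m\{\Phi_m(a)\mathbf{b}_m + \Phi_m(a, A^{(m)})(B^{(m)} - b)\mathbf{a}_m\} \equiv 0 \bmod \tilde\Delta w \end{cases}$$
for every $\rho \in \mathrm{Gal}(F_{\tilde\Delta}/\mathbb{Q})$.

Proof. By putting
$$x_{m,j} = y_{m,j}\tilde\Delta/\tau_m, \qquad \mathbf{x}_m = (x_{m,1}, \dots, x_{m,s_m}),$$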


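the condition $\prod_m v_m \in R(\tilde\eta)$ clearly amounts to
$$\prod_{m}\Big(\prod_{j} \big(\sqrt[\tilde\Delta]{u_{m,j}}\big)^{x_{m,j}}\Big)^{\Phi_m(\rho\tilde\eta\rho^{-1})} = \prod_{m}\Big(\prod_{j} \big(\sqrt[\tilde\Delta]{u_{m,j}}\big)^{x_{m,j}}\Big)^{\Phi_m(\tilde\eta)} \in W_F.$$
By Lemma 3.11, we have
$$\Big(\prod_{j} \big(\sqrt[\tilde\Delta]{u_{m,j}}\big)^{x_{m,j}}\Big)^{\Phi_m(\tilde\eta)} = \zeta_{\tilde\Delta w}^{\mathbf{x}_m\Phi_m(a, A^{(m)})\mathbf{a}_m},$$
while by Corollary 3.4
$$\Big(\prod_{j} \big(\sqrt[\tilde\Delta]{u_{m,j}}\big)^{x_{m,j}}\Big)^{\Phi_m(\rho\tilde\eta\rho^{-1})} = \zeta_{\tilde\Delta w}^{b'\mathbf{x}_m\{\Phi_m(a)\mathbf{b}_m + \Phi_m(a, A^{(m)})B^{(m)}\mathbf{a}_m\}},$$
on putting $\zeta_{\tilde\Delta w}^{\rho^{-1}} = \zeta_{\tilde\Delta w}^{b'}$. Hence the condition $\prod_m v_m \in R(\tilde\eta)$ is equivalent to
$$\prod_{m} \zeta_{\tilde\Delta w}^{b'\mathbf{x}_m\{\Phi_m(a)\mathbf{b}_m + \Phi_m(a, A^{(m)})B^{(m)}\mathbf{a}_m\}} = \prod_{m} \zeta_{\tilde\Delta w}^{\mathbf{x}_m\Phi_m(a, A^{(m)})\mathbf{a}_m} \in W_F. \tag{3.26}$$
It is easy to see that
$$\begin{aligned}
&\prod_{m} \zeta_{\tilde\Delta w}^{\mathbf{x}_m\Phi_m(a, A^{(m)})\mathbf{a}_m} \in W_F \\
\iff\ &\prod_{m} \zeta_{\tilde\Delta}^{\mathbf{x}_m\Phi_m(a, A^{(m)})\mathbf{a}_m} = 1 \\
\iff\ &\sum_{m} \mathbf{x}_m\Phi_m(a, A^{(m)})\mathbf{a}_m \equiv 0 \bmod \tilde\Delta \\
\iff\ &\sum_{m} (\tilde\Delta/\tau_m)\,\mathbf{y}_m\Phi_m(a, A^{(m)})\mathbf{a}_m \equiv 0 \bmod \tilde\Delta,
\end{aligned}$$
and the equality in (3.26) is equivalent to
$$\sum_{m} b'\mathbf{x}_m\{\Phi_m(a)\mathbf{b}_m + \Phi_m(a, A^{(m)})B^{(m)}\mathbf{a}_m\} \equiv \sum_{m} \mathbf{x}_m\Phi_m(a, A^{(m)})\mathbf{a}_m \bmod \tilde\Delta w,$$
which completes the proof.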

Remark 3.3. The condition $(\#)$ depends only on (i) $\mathbf{y}_m \bmod \tau_m$, (ii) $b$, $\mathbf{b}_m \bmod w$, i.e., on $\rho|_F$, (iii) $\mathbf{a}_m \bmod w\tau_m$. Further, (iv) $\mathbf{a}_m \bmod w$ is uniquely determined by $\eta$.

With respect to the assertion (i), we have only to note that
$$\Phi_m(a)\mathbf{b}_m + \Phi_m(a, A^{(m)})(B^{(m)} - b)\mathbf{a}_m = \Phi_m(a, A^{(m)})\{(a - A^{(m)})\mathbf{b}_m + (B^{(m)} - b)\mathbf{a}_m\} \equiv 0 \bmod w$$
by Lemma 3.8.

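(3.9) implies $\Phi_m(a) \equiv 0 \bmod \tau_m$, which implies the assertion on $\mathbf{b}_m$. The assertion on $b$ follows from the first equation of $(\#)$. The statements (iii), (iv) are obvious.

Lemma 3.15. Let $v_m \in U(m)$ and $\mathbf{y}_m$ be as in the previous lemma; then the condition
$$\prod_{m \mid d} v_m^{\Phi_m(\eta, a)\tilde\Delta/\tau_m} \in W_F(\mathfrak{o}_F^{\times})^{\tilde\Delta}$$
is equivalent to
$$(\natural)\quad \prod_{m \mid d}\prod_{j} u_{m,j}^{\sum_i y_{m,i}\Phi_m(a, A^{(m)})_{(i,j)}\tilde\Delta/\tau_m} \in W_F(\mathfrak{o}_F^{\times})^{\tilde\Delta}.$$
If a stronger condition $\mathfrak{o}_F^{\times} = \prod U(m)$ holds, then it is equivalent to
$$\mathbf{y}_m\Phi_m(a, A^{(m)}) \equiv 0 \bmod \tau_m.$$

Proof. By Lemma 3.10, we know that $u_{m,i}^{\Phi_m(\eta, a)} \in W_F\prod_j u_{m,j}^{\Phi_m(a, A^{(m)})_{(i,j)}}$, which yields
$$\begin{aligned}
\prod_{m \mid d} v_m^{\Phi_m(\eta, a)\tilde\Delta/\tau_m}
&= \prod_{m \mid d}\prod_{i} u_{m,i}^{y_{m,i}\Phi_m(\eta, a)\tilde\Delta/\tau_m} \quad (\text{by (3.25)}) \\
&\in W_F\prod_{m \mid d}\prod_{i}\prod_{j} u_{m,j}^{y_{m,i}\Phi_m(a, A^{(m)})_{(i,j)}\tilde\Delta/\tau_m} \\
&= W_F\prod_{m \mid d}\prod_{j} u_{m,j}^{\sum_i y_{m,i}\Phi_m(a, A^{(m)})_{(i,j)}\tilde\Delta/\tau_m},
\end{aligned}$$
whence follows the first assertion. If $\mathfrak{o}_F^{\times} = \prod U(m)$ holds, then $\{u_{m,j}\}$ is a basis of $\mathfrak{o}_F^{\times}/W_F$, and so the first equivalence implies the second one.

Proposition 3.6. Let $u_{m,i}$, $y_{m,i}$, $\mathbf{y}_m$ be as in Lemma 3.14, and let $\tilde\eta$ be an extension of $\eta$, and let $p$ ($\nmid 2D_{F_{\tilde\Delta}}$) be a prime number satisfying $\tilde\eta \in \sigma_{F_{\tilde\Delta}/\mathbb{Q}}(p)$; then we have
$$\big[R(\tilde\eta, p) : W_F(\mathfrak{o}_F^{\times})^{\eta - p}\big] = \big[V_1(\tilde\eta) : V_2(\tilde\eta)\big],$$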


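where

$$V_1(\tilde\eta) = W_F\Bigl\{\prod_m\prod_j u_{m,j}^{y_{m,j}} \Bigm| y_m \text{ satisfies (\#) in Lemma 3.14}\Bigr\},$$
$$V_2(\tilde\eta) = W_F\Bigl\{\prod_m\prod_j u_{m,j}^{y_{m,j}} \Bigm| y_m \text{ satisfies } (\natural) \text{ in Lemma 3.15}\Bigr\}.$$

If $o_F^\times = \prod U(m)$ holds, then we have

$$V_2(\tilde\eta) = W_F\Bigl\{\prod_m\prod_j u_{m,j}^{y_{m,j}} \Bigm| y_m\Phi_m(a,A^{(m)}) \equiv 0 \bmod \tau_m \text{ for } \forall m|d\Bigr\}.$$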

Proof. The assertion follows easily from Proposition 3.5 and the previous two lemmas.

Suppose $o_F^\times = \prod U(m)$; then $\{u_{m,i}\}$ is a basis of $o_F^\times/W_F$, whence comparing exponents, we may assume

$$V_1(\tilde\eta) = \bigl\{\{y_m \bmod \tau_m\} \bigm| y_m\text{'s satisfy (\#) in Lemma 3.14}\bigr\}, \tag{3.27}$$
$$V_2(\tilde\eta) = \bigl\{\{y_m \bmod \tau_m\} \bigm| y_m\Phi_m(a,A^{(m)}) \equiv 0 \bmod \tau_m \text{ for } \forall m|d\bigr\}. \tag{3.28}$$

The inclusion $V_2(\tilde\eta) \subset V_1(\tilde\eta)$ follows from their original definitions, but we can check it directly when $o_F^\times = \prod U(m)$ holds, as follows: Suppose $\{y_m\} \in V_2(\tilde\eta)$; then the first equality of (#) is obvious. Noting that $\Phi_m(a) = \Phi_m(a) - \Phi_m(A^{(m)}) = \Phi_m(a,A^{(m)})(a-A^{(m)})$, we have

$$\Phi_m(a)b_m + \Phi_m(a,A^{(m)})(B^{(m)}-b)a_m = \Phi_m(a,A^{(m)})(ab_m - A^{(m)}b_m + B^{(m)}a_m - ba_m)$$

and then Lemma 3.8 yields the second equality of (#).

4. Examples

4.1. Case of η = id

We assume $\eta = \mathrm{id}$ throughout this subsection. Then obviously, we have $d$ (= the order of $\eta$) $= 1$, $o_F^\times = U(1)$, and the polynomial $g(x)$ defined in Corollary 3.1 is equal to $g(x) = x - 1$.


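Corollary 5.1 in the appendix yields $\tilde\Delta = \tau_1 = w$. Let $\tilde\eta$ be an extension of $\eta$. Since $g(x,y) = 1$ yields $V_2(\tilde\eta) = 0$, for $r = \operatorname{rank}_{\mathbb Z} o_F^\times$, (3.27) and (3.28) read

$$V_1(\tilde\eta)/V_2(\tilde\eta) = \Bigl\{ y \in (\mathbb Z/w\mathbb Z)^r \Bigm| y\mathbf a \equiv 0 \bmod w,\ \ y\{(a-1)\mathbf b + (B-b)\mathbf a\} \equiv 0 \bmod w^2 \Bigr\},$$

where we put, as in §3.3.1 with (ii) in Remark 3.3,

$$o_F^\times = \langle \zeta_w, u_1, \dots, u_r\rangle,$$
$$\sqrt[w]{u_i}^{\,\tilde\eta} = \zeta_{w^2}^{a_i}\sqrt[w]{u_i}, \qquad \zeta_{w^2}^{\tilde\eta} = \zeta_{w^2}^{a}, \qquad {}^t\mathbf a = (a_1,\dots,a_r), \qquad A = 1_r,$$
$$\zeta_w^{\rho} = \zeta_w^{b}, \qquad u_i^{\rho} = \zeta_w^{b_i}\prod_j u_j^{b_{ij}}, \qquad {}^t\mathbf b = (b_1,\dots,b_r), \qquad B = (b_{ij}).$$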

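Now, we note that the assumption $\eta = \mathrm{id}$ yields in the above

$$a \equiv 1 \bmod w, \qquad \mathbf a \equiv 0 \bmod w,$$

and so $a = 1 + wa'$, $\mathbf a = w\mathbf a'$, say. Then we have

$$V_1(\tilde\eta)/V_2(\tilde\eta) = \bigl\{ y \bmod w \bigm| y(a'\mathbf b + (B-b)\mathbf a') \equiv 0 \bmod w \bigr\}.$$

Then, putting

$$R(\rho) = (B - b1_r,\ \mathbf b), \tag{4.1}$$

and replacing $a'$, $\mathbf a'$ by $a$, $\mathbf a$, we have

$$V(\tilde\eta) := V_1(\tilde\eta)/V_2(\tilde\eta) = \Bigl\{ y \bmod w \Bigm| yR(\rho)\begin{pmatrix}\mathbf a\\ a\end{pmatrix} \equiv 0 \bmod w \text{ for } \forall\rho \in \operatorname{Gal}(F/\mathbb Q) \Bigr\},$$

where, redefining $a$, $\mathbf a$ as above,

$$\sqrt[w]{u_i}^{\,\tilde\eta} = \zeta_w^{a_i}\sqrt[w]{u_i}, \qquad \zeta_{w^2}^{\tilde\eta} = \zeta_{w^2}\cdot\zeta_w^{a}, \qquad {}^t\mathbf a = (a_1,\dots,a_r), \tag{4.2}$$
$$\zeta_w^{\rho} = \zeta_w^{b}, \qquad u_i^{\rho} = \zeta_w^{b_i}\prod_j u_j^{b_{ij}}, \qquad {}^t\mathbf b = (b_1,\dots,b_r), \qquad B = (b_{ij}). \tag{4.3}$$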


We note that if $\rho|_F = \mathrm{id}$, then (4.3) implies $b \equiv 1 \bmod w$, $\mathbf b \equiv 0 \bmod w$ and $B$ is the identity matrix, and hence $R(\rho) \equiv 0 \bmod w$. By denoting $B$, $b$ corresponding to $\rho$ by $B(\rho)$, $b(\rho)$, respectively, it is easy to see that

$$R(\rho_1\rho_2) = B(\rho_1)R(\rho_2) + b(\rho_2)R(\rho_1). \tag{4.4}$$

In the following, we evaluate $\kappa(\eta) = \gcd \#V(\tilde\eta)$ (cf. (3.14)) for several types of algebraic number fields, and furthermore we show that for $\tilde\eta \in \operatorname{Gal}(F_w/F)$, there is $\tilde\eta_0$ such that $R(\tilde\eta) \supset R(\tilde\eta_0)$ and $\kappa(\eta) = \#V(\tilde\eta_0)$. This is not necessarily true if $\eta \ne \mathrm{id}$. Once we find a prime number $p$ such that $\sigma_{F/\mathbb Q}(p) = \eta$ and $w(p-1)^r/\#E(p) = \kappa(\eta)$, we have $\kappa(\eta) = \mathrm{Rel}_\eta$.

4.1.1. Case of real quadratic fields

Let $F$ be a real quadratic field and let $\epsilon$ ($> 1$) be the fundamental unit with $N(\epsilon) = (-1)^s$; then clearly $\tilde\Delta = w = 2$. We take $\epsilon$ as $u_1$; then for $\rho$ ($\ne \mathrm{id}$) $\in \operatorname{Gal}(F/\mathbb Q)$,

$$(-1)^\rho = -1, \qquad \epsilon^\rho = (-1)^s\epsilon^{-1}$$

imply $B = (-1)$, $b = 1$, $\mathbf b = (s)$ and so $R(\rho) = (-2, s) \equiv (0, s) \bmod 2$. Therefore

$$V(\tilde\eta) = \Bigl\{ x \bmod 2 \Bigm| x(0,s)\begin{pmatrix}a_1\\ a\end{pmatrix} = sax \equiv 0 \bmod 2 \Bigr\}.$$

Here $a$ is defined by $\zeta_4^{\tilde\eta} = (-1)^a\zeta_4$ as in (4.2). Let us see that

$$\kappa(\eta) = \gcd \#V(\tilde\eta) = \begin{cases} 2 & \text{if } N(\epsilon) = 1,\\ 1 & \text{if } N(\epsilon) = -1.\end{cases}$$

The first is obvious because of $s = 0$, and for the second, we have only to take $\tilde\eta_0$ so that $a = 1$. This is compatible with [8], where we have shown that the expected density of the set (cf. (3.15))

$$\bigl\{ p \bigm| \#E(p) = 2(p-1)/\kappa(\eta),\ \sigma_{F/\mathbb Q}(p) = \mathrm{id} \bigr\}$$

is positive.
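As a quick sanity check of the case distinction above, the following Python sketch enumerates $V(\tilde\eta) = \{x \bmod 2 \mid sax \equiv 0 \bmod 2\}$ for both values of the parameter $a$ and takes the gcd of the cardinalities. The function name `kappa_real_quadratic` is ours, and the code assumes, as the text asserts, that both $a = 0$ and $a = 1$ occur for suitable extensions $\tilde\eta$.

```python
from math import gcd

def kappa_real_quadratic(s):
    """gcd over a in {0, 1} of #{x mod 2 : s*a*x = 0 mod 2}.

    s is defined by N(eps) = (-1)**s, and a is the exponent in
    zeta_4^{eta~} = (-1)**a * zeta_4.  Treating both values of a as
    realizable is an assumption taken from the text.
    """
    result = 0
    for a in (0, 1):
        size = sum(1 for x in (0, 1) if (s * a * x) % 2 == 0)
        result = gcd(result, size)
    return result

print(kappa_real_quadratic(0), kappa_real_quadratic(1))
```

The parameter $a_1$ does not enter because the first entry of $R(\rho)$ vanishes modulo 2.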


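4.1.2. Case of real cubic abelian fields

Let $F$ be a real cubic abelian field and $\sigma$ a generator of $\operatorname{Gal}(F/\mathbb Q)$; then we have $\tilde\Delta = w = 2$ and as a set of fundamental units, we can take $u_1$, $u_2$ so that

$$(-1)^\sigma = (-1)^1, \qquad u_1^\sigma = u_2, \qquad u_2^\sigma = (u_1u_2)^{-1}$$

and $N_{F/\mathbb Q}(u_1) = N_{F/\mathbb Q}(u_2) = 1$ [9]. Thus we have (cf. (4.1))

$$R(\sigma) = \left(\begin{pmatrix} 0 & 1\\ -1 & -1\end{pmatrix} - 1_2,\ \begin{pmatrix}0\\0\end{pmatrix}\right) \equiv \begin{pmatrix} 1 & 1 & 0\\ 1 & 0 & 0\end{pmatrix} \bmod 2, \qquad R(\sigma^2) \equiv \begin{pmatrix} 0 & 1 & 0\\ 1 & 1 & 0\end{pmatrix} \bmod 2,$$

which yield

$$V(\tilde\eta) = \bigl\{ (x_1,x_2) \bmod 2 \bigm| x_1(a_1+a_2) + x_2a_1 \equiv x_1a_2 + x_2(a_1+a_2) \equiv 0 \bmod 2 \bigr\}.$$

Here $a_1$, $a_2$ are defined by $\sqrt{u_i}^{\,\tilde\eta} = (-1)^{a_i}\sqrt{u_i}$ in (4.2). We can choose $\tilde\eta_0$ which corresponds to $a_1 = a_2 = 1$ and then $\#V(\tilde\eta_0) = 1$, i.e. $\kappa(\eta) = \gcd \#V(\tilde\eta) = 1$. This is compatible with [9], where the expected density of the set

$$\bigl\{ p \bigm| \#E(p) = 2(p-1)^2,\ \sigma_{F/\mathbb Q}(p) = \mathrm{id} \bigr\}$$

is explicitly given.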
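The matrices $R(\sigma)$, $R(\sigma^2)$ and the resulting set $V(\tilde\eta)$ can be recomputed mechanically from (4.1). The sketch below is only an illustration: the helper names `matmul`, `R_of` and `V` are ours, and it takes $B(\sigma^2) = B(\sigma)^2$, $b = 1$ and $\mathbf b = 0$ as read off from the unit relations above.

```python
from itertools import product

B = [[0, 1], [-1, -1]]  # exponents of u1^sigma = u2, u2^sigma = (u1 u2)^(-1)

def matmul(X, Y):
    """Plain integer matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def R_of(Bm, b, bvec):
    """R(rho) = (B - b*1_r, bvec) as an r x (r+1) integer matrix, cf. (4.1)."""
    r = len(Bm)
    return [[Bm[i][j] - (b if i == j else 0) for j in range(r)] + [bvec[i]]
            for i in range(r)]

R_sigma = R_of(B, 1, [0, 0])
R_sigma2 = R_of(matmul(B, B), 1, [0, 0])

def V(a1, a2, a):
    """Solutions (x1, x2) mod 2 of x R(rho) t(a1, a2, a) = 0 for rho = sigma, sigma^2."""
    col = [a1, a2, a]
    sols = []
    for x in product(range(2), repeat=2):
        if all(sum(x[i] * R[i][j] * col[j]
                   for i in range(2) for j in range(3)) % 2 == 0
               for R in (R_sigma, R_sigma2)):
            sols.append(x)
    return sols

print(V(1, 1, 0))
```

The same matrices also satisfy the composition rule (4.4) with $\rho_1 = \rho_2 = \sigma$, which the checks for this block confirm over the integers.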

4.1.3. Case of non-cyclic abelian fields of degree 4

Let $F = \mathbb Q(\sqrt{d_1}, \sqrt{d_2})$, where $d_1$, $d_2$ ($> 1$) are natural numbers and let $F_1$, $F_2$, $F_3$ be three real quadratic subfields of $F$. Let $\epsilon_i$ ($> 1$) be the fundamental unit of $F_i$, $N_{F_i/\mathbb Q}\,\epsilon_i = (-1)^{s_i}$, $s_i = 0, 1$. Put

$$Q = \bigl[o_F^\times : \langle -1, \epsilon_1, \epsilon_2, \epsilon_3\rangle\bigr];$$


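then the type of a set $\{u_1, u_2, u_3\}$ of fundamental units of $F$ is given as follows [10]:

(i) $u_1 = \epsilon_1$, $u_2 = \epsilon_2$, $u_3 = \epsilon_3$ ($Q = 1$)
(ii) $u_1 = \sqrt{\epsilon_1}$, $u_2 = \epsilon_2$, $u_3 = \epsilon_3$ ($Q = 2$)
(iii) $u_1 = \sqrt{\epsilon_1}$, $u_2 = \sqrt{\epsilon_2}$, $u_3 = \epsilon_3$ ($Q = 4$)
(iv) $u_1 = \sqrt{\epsilon_1\epsilon_2}$, $u_2 = \epsilon_2$, $u_3 = \epsilon_3$ ($Q = 2$)
(v) $u_1 = \sqrt{\epsilon_1\epsilon_2}$, $u_2 = \sqrt{\epsilon_3}$, $u_3 = \epsilon_2$ ($Q = 4$)
(vi) $u_1 = \sqrt{\epsilon_1\epsilon_2}$, $u_2 = \sqrt{\epsilon_2\epsilon_3}$, $u_3 = \sqrt{\epsilon_3\epsilon_1}$ ($Q = 4$)
(vii) $u_1 = \sqrt{\epsilon_1\epsilon_2\epsilon_3}$, $u_2 = \epsilon_2$, $u_3 = \epsilon_3$ ($Q = 2$).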

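In the cases (ii) to (vi), $s_i = 0$ is supposed if $\epsilon_i$ appears under the symbol $\sqrt{\ }$, and $s_1 = s_2 = s_3$ is supposed for the case (vii). We denote by $\sigma_i$ the nontrivial automorphism fixing $\epsilon_i$, and so $\operatorname{Gal}(F/\mathbb Q) = \{\sigma_1, \sigma_2, \sigma_3 = \sigma_1\sigma_2, \mathrm{id}\}$. The scalar $b = b(\rho)$ in (4.3) is equal to 1 because of $w = 2$, and hence we have by (4.4)

$$R(\sigma_3) = R(\sigma_1\sigma_2) = B(\sigma_1)R(\sigma_2) + R(\sigma_1).$$

Proposition 4.1. Let $\eta$ be the identity; then the value of $\kappa(\eta) = \gcd \#V(\tilde\eta)$ is given as follows:

Case (i)
$$\kappa(\eta) = \begin{cases} 8 & \text{if } s_1+s_2+s_3 = 0,\\ 4 & \text{if } s_1+s_2+s_3 = 1,\\ 2 & \text{otherwise,}\end{cases}$$

Case (ii)
$$\kappa(\eta) = \begin{cases} 4 & \text{if } \begin{cases} s_2 = s_3 = 0 \text{ or}\\ s_2 = 0,\ s_3 = 1,\ u_1u_1^{\sigma_2} = -1 \text{ or}\\ s_2 = 1,\ s_3 = 0,\ u_1u_1^{\sigma_3} = -1,\end{cases}\\ 2 & \text{otherwise,}\end{cases}$$

Case (iii)
$$\kappa(\eta) = \begin{cases} 4 & \text{if } u_1u_1^{\sigma_2} = u_2u_2^{\sigma_1} = -1,\\ 2 & \text{otherwise,}\end{cases}$$

Case (iv)
$$\kappa(\eta) = \begin{cases} 2 & \text{if } u_1^{\sigma_1}u_1^{\sigma_2} = -1 \text{ and } s_3 = 1,\\ 4 & \text{otherwise,}\end{cases}$$

Case (v), (vi) $\kappa(\eta) = 2$,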


Case (vii)
$$\kappa(\eta) = \begin{cases} 4 & \text{if } s_1 = 0,\\ 1 & \text{if } s_1 = 1.\end{cases}$$

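Proof. We prove the case (vii). Proofs of the other cases are similar. For $u_1 = \sqrt{\epsilon_1\epsilon_2\epsilon_3}$, $u_2 = \epsilon_2$, $u_3 = \epsilon_3$, we have $(-1)^{\sigma_1} = (-1)^{\sigma_2} = -1$, and

$$u_1^{\sigma_1} = (-1)^{\kappa_1}u_1u_2^{-1}u_3^{-1}, \qquad u_2^{\sigma_1} = (-1)^{s_2}u_2^{-1}, \qquad u_3^{\sigma_1} = (-1)^{s_3}u_3^{-1},$$
$$u_1^{\sigma_2} = (-1)^{\kappa_2}u_1^{-1}u_2, \qquad u_2^{\sigma_2} = u_2, \qquad u_3^{\sigma_2} = (-1)^{s_3}u_3^{-1},$$

for some $\kappa_1, \kappa_2 = 0, 1$, and put $s_1 = s_2 = s_3 = s$. Then it is easy to see that

$$R(\sigma_1) \equiv \begin{pmatrix} 0 & 1 & 1 & \kappa_1\\ 0 & 0 & 0 & s\\ 0 & 0 & 0 & s\end{pmatrix} \bmod 2, \qquad
R(\sigma_2) \equiv \begin{pmatrix} 0 & 1 & 0 & \kappa_2\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & s\end{pmatrix} \bmod 2,$$
$$R(\sigma_1\sigma_2) \equiv \begin{pmatrix} 0 & 0 & 1 & \kappa_1+\kappa_2+s\\ 0 & 0 & 0 & s\\ 0 & 0 & 0 & 0\end{pmatrix} \bmod 2,$$

and hence, putting ${}^t\mathbf a = (a_1, a_2, a_3, a)$, we have in due order

$$(x_1,x_2,x_3) \in V(\tilde\eta)
\iff \begin{cases} (0, x_1, x_1, x_1\kappa_1 + x_2s + x_3s)\,\mathbf a \equiv 0 \bmod 2\\ (0, x_1, 0, x_1\kappa_2 + x_3s)\,\mathbf a \equiv 0 \bmod 2\\ (0, 0, x_1, x_1(\kappa_1+\kappa_2+s) + x_2s)\,\mathbf a \equiv 0 \bmod 2\end{cases}$$
$$\iff \begin{cases} x_1(a_2+a_3+\kappa_1a) + x_2sa + x_3sa \equiv 0 \bmod 2\\ x_1(a_2+\kappa_2a) + x_3sa \equiv 0 \bmod 2\\ x_1(a_3+\kappa_1a+\kappa_2a+sa) + x_2sa \equiv 0 \bmod 2.\end{cases}$$

We divide the proof into two cases. For $s = 0$ we get

$$(x_1,x_2,x_3) \in V(\tilde\eta)
\iff \begin{cases} x_1(a_2+a_3+\kappa_1a) \equiv 0 \bmod 2\\ x_1(a_2+\kappa_2a) \equiv 0 \bmod 2\\ x_1(a_3+\kappa_1a+\kappa_2a) \equiv 0 \bmod 2\end{cases}
\ \Longrightarrow\ \{(0,x_2,x_3) \mid x_2, x_3 \bmod 2\} \subset V(\tilde\eta),$$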


where the inclusion becomes the equality for $a = 0$, $a_2 = 1$. Therefore $\kappa(\eta) = 4$ holds. For $s = 1$, we have

$$(x_1,x_2,x_3) \in V(\tilde\eta)
\iff \begin{cases} x_1(a_2+a_3+\kappa_1a) + x_2a + x_3a \equiv 0 \bmod 2,\\ x_1(a_2+\kappa_2a) + x_3a \equiv 0 \bmod 2,\\ x_1(a_3+\kappa_1a+\kappa_2a+a) + x_2a \equiv 0 \bmod 2\end{cases}$$

and since the coefficient matrix is regular for $a = 1$, we have $V(\tilde\eta) = \{(0,0,0)\}$ and so $\kappa(\eta) = 1$.
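The two sub-cases of case (vii) can be confirmed by brute force over $\mathbb Z/2\mathbb Z$. The following sketch (function names ours) counts the solutions of the three congruences for every parameter vector $(a_1,a_2,a_3,a)$ and takes the gcd. Letting all 16 parameter vectors occur is a simplifying assumption; the result only uses that the particular choices named in the proof are realizable, as the text states.

```python
from itertools import product
from math import gcd

def V_size(s, k1, k2, a1, a2, a3, a):
    """Number of (x1, x2, x3) mod 2 satisfying the three congruences of case (vii)."""
    count = 0
    for x1, x2, x3 in product(range(2), repeat=3):
        e1 = x1 * (a2 + a3 + k1 * a) + x2 * s * a + x3 * s * a
        e2 = x1 * (a2 + k2 * a) + x3 * s * a
        e3 = x1 * (a3 + k1 * a + k2 * a + s * a) + x2 * s * a
        if e1 % 2 == 0 and e2 % 2 == 0 and e3 % 2 == 0:
            count += 1
    return count

def kappa_case_vii(s, k1, k2):
    """gcd of #V over all parameter vectors (a1, a2, a3, a) mod 2."""
    g = 0
    for a1, a2, a3, a in product(range(2), repeat=4):
        g = gcd(g, V_size(s, k1, k2, a1, a2, a3, a))
    return g

for k1, k2 in product(range(2), repeat=2):
    print(k1, k2, kappa_case_vii(0, k1, k2), kappa_case_vii(1, k1, k2))
```

The outcome does not depend on the signs $\kappa_1$, $\kappa_2$, in agreement with the statement of Proposition 4.1.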

Remark 4.1. Suppose $F = \mathbb Q(\sqrt{d_1}, \sqrt{d_2})$ ($2 \le d_1, d_2 \le 500$); then $\kappa(\eta) = \mathrm{Rel}_\eta$ is confirmed by finding a prime number $p$ satisfying $\#E(p) = 2(p-1)^3/\kappa(\eta)$ by computer.

4.1.4. Case of imaginary abelian fields of degree 4

Let $F$ be an imaginary abelian field of degree 4 and let $F_0$ be the real quadratic subfield in $F$, and $\epsilon_0$ ($> 1$) the fundamental unit of $F_0$. Put

$$Q = \bigl[o_F^\times : W_F\,o_{F_0}^\times\bigr].$$

We define a fundamental unit $\epsilon$ of $F$ as follows: In case of $Q = 1$, we put $\epsilon = \epsilon_0$. Next, we assume $Q = 2$; let us see then that we can choose a fundamental unit $\epsilon$ of $F$ so that

$$\epsilon_0 = \zeta_w\epsilon^2, \qquad \epsilon^J = \zeta_w\epsilon,$$

where $J$ means the complex conjugation. We argue as follows. We may suppose $\epsilon_0 = \zeta_w^a\epsilon^2$ with $a = 0, 1$ without loss of generality. Assume $a = 1$; $\epsilon\epsilon^J \in F_0$ implies $\epsilon\epsilon^J = \epsilon_0^n$, and so $\epsilon\epsilon^J = (\zeta_w\epsilon^2)^n$. Thus we have $\epsilon^J = \zeta_w^n\epsilon^{2n-1}$ and, comparing the absolute values, $n = 1$, which yields $\epsilon^J = \zeta_w\epsilon$, and $\epsilon_0 = \epsilon\epsilon^J = \zeta_w\epsilon^2$. If $a = 0$, then $\epsilon_0 = \epsilon^2$ follows, which implies $\epsilon^J = \epsilon^{2n-1}$ by $\epsilon\epsilon^J = \epsilon_0^n$ for an integer $n$, whence $n = 1$ and $\epsilon^J = \epsilon$, contradicting $Q = 2$.

Proposition 4.2. We have

$$\kappa(\eta) = \begin{cases} 2 & \text{if either } Q = 1 \text{ and } N_{F_0/\mathbb Q}(\epsilon_0) = 1, \text{ or } Q = 1 \text{ and } \sqrt{-1} \in F,\\ 1 & \text{otherwise.}\end{cases}$$


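Proof. Putting

$$\zeta_{w^2}^{\tilde\eta} = \zeta_{w^2}\zeta_w^{a}, \qquad \sqrt[w]{\epsilon}^{\,\tilde\eta} = \zeta_w^{a_1}\sqrt[w]{\epsilon}, \qquad \zeta_w^{\rho} = \zeta_w^{b}, \qquad \epsilon^{\rho} = \zeta_w^{b_1}\epsilon^{b_{11}},$$

we have

$$V(\tilde\eta) = \{ x \bmod w \mid x((b_{11}-b)a_1 + b_1a) \equiv 0 \bmod w \text{ for } \forall\rho \in \operatorname{Gal}(F/\mathbb Q)\}.$$

• Case of $Q = 1$, $N_{F_0/\mathbb Q}(\epsilon_0) = 1$: In this case, we show that

$$V(\tilde\eta) \supset V(\tilde\eta_0) = \{ x \bmod w \mid x \equiv 0 \bmod w/2\} \quad (\tilde\eta_0 \leftrightarrow a_1 = 1),$$

from which we have $\kappa(\eta) = 2$. We note that $\epsilon^\rho = \epsilon$ or $\epsilon^{-1}$ by virtue of $\epsilon = \epsilon_0$, and so $b_1 = 0$ for every $\rho$ and so

$$V(\tilde\eta) = \{ x \bmod w \mid x(b_{11}-b)a_1 \equiv 0 \bmod w \text{ for } \forall\rho \in \operatorname{Gal}(F/\mathbb Q)\}.$$

If $w = 2, 4, 6$, then the possibilities for $b$, $b_{11}$ are

$$w = 2 \Rightarrow b = 1,\ b_{11} = \pm 1; \qquad w = 4, 6 \Rightarrow b = \pm 1,\ b_{11} = \pm 1,$$

and hence the assertion above is true.
If $w = 8$, then we have $F = \mathbb Q(\zeta_8)$ and $\epsilon_0 = \sqrt 2 + 1$, $N_{F_0/\mathbb Q}(\epsilon_0) = -1$, which contradicts the assumption.
If $w = 12$, then $F = \mathbb Q(\sqrt{-1}, \sqrt 3)$ holds, and the possibilities are either $b = \pm 1$, $b_{11} = 1$, or $b = \pm 5$, $b_{11} = -1$. The automorphism $\rho$ corresponding to $b = -1$, $b_{11} = 1$ implies $2x \equiv 0 \bmod 12$, i.e. $x \equiv 0 \bmod 6$ and so $V(\tilde\eta_0) = \{x \bmod 12 \mid x \equiv 0 \bmod 6\}$. Therefore $\#V(\tilde\eta_0) = 2$ holds.

• Case of $Q = 1$, $N_{F_0/\mathbb Q}(\epsilon_0) = -1$, $\sqrt{-1} \in F$ and $\zeta_8 \in F$: In this case, we have $F = \mathbb Q(\zeta_8)$ and $\epsilon_0 = \sqrt 2 + 1$, and the possibilities are

$$\begin{array}{c|cccc}
b & 1 & 3 & 5 & 7\\ \hline
b_{11} & 1 & -1 & -1 & 1\\
b_1 & 0 & 4 & 4 & 0\\
(b_{11}-b)a_1 + b_1a & 0 & -4a_1+4a & -6a_1+4a & -6a_1
\end{array}$$

which also implies $V(\tilde\eta) \supset V(\tilde\eta_0) = \{x \bmod w \mid x \equiv 0 \bmod w/2\}$ ($\tilde\eta_0 \leftrightarrow a = a_1 = 1$). This means $\kappa(\eta) = 2$.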
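For $F = \mathbb Q(\zeta_8)$ the table can be turned directly into a finite computation. The sketch below copies the triples $(b, b_{11}, b_1)$ from the table and lists $V(\tilde\eta)$ for a given pair $(a_1, a)$. The names `ROWS`, `V` and `kappa` are ours, and letting $(a_1, a)$ range over all residues modulo 8 is an assumption made only to illustrate the gcd.

```python
from math import gcd

# (b, b11, b1) for the four automorphisms zeta_8 -> zeta_8^b of Q(zeta_8),
# copied from the table in the text.
ROWS = [(1, 1, 0), (3, -1, 4), (5, -1, 4), (7, 1, 0)]
W = 8

def V(a1, a):
    """All x mod 8 with x*((b11 - b)*a1 + b1*a) = 0 mod 8 for every row."""
    return [x for x in range(W)
            if all((x * ((b11 - b) * a1 + b1 * a)) % W == 0
                   for b, b11, b1 in ROWS)]

def kappa():
    """gcd of #V(a1, a) over all residues a1, a mod 8 (illustrative range)."""
    g = 0
    for a1 in range(W):
        for a in range(W):
            g = gcd(g, len(V(a1, a)))
    return g

print(V(1, 1), kappa())
```

Every $V(a_1, a)$ contains $x = 4$, since each coefficient in the last row of the table is even, which is why the gcd cannot drop below 2.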


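• Case of $Q = 1$, $N_{F_0/\mathbb Q}(\epsilon_0) = -1$, $\sqrt{-1} \in F$ and $\zeta_8 \notin F$: We note that $w = 4$ and $F$ is the composite of $\mathbb Q(\sqrt{-1})$ and $F_0$. Then we have the table:

$$\begin{array}{c|cccc}
\rho & \rho|_{F_0} = \mathrm{id} & \rho|_{F_0} = \mathrm{id} & \rho|_{F_0} \ne \mathrm{id} & \rho|_{F_0} \ne \mathrm{id}\\ \hline
b & 1 & 3 & 1 & 3\\
b_{11} & 1 & 1 & -1 & -1\\
b_1 & 0 & 0 & 2 & 2\\
(b_{11}-b)a_1 + b_1a & 0 & -2a_1 & -2a_1+2a & -4a_1+2a
\end{array}$$

Therefore we have $V(\tilde\eta) \supset V(\tilde\eta_0) = \{x \bmod w \mid x \equiv 0 \bmod w/2\}$ ($\tilde\eta_0 \leftrightarrow a_1 = 1$, $a = 0$) and $\kappa(\eta) = 2$.

• Case of $Q = 1$, $N_{F_0/\mathbb Q}(\epsilon_0) = -1$, $\sqrt{-1} \notin F$: In case of $w = 2$, we take $\tilde\eta_0$ corresponding to $a_1 = 0$, $a = 1$, so that $b_1 = 1$ for $\rho|_{F_0} \ne \mathrm{id}$ implies $V(\tilde\eta) \supset V(\tilde\eta_0) = \{0\}$, i.e. $\kappa(\eta) = 1$. In case of $w = 6$, $F = F_0(\zeta_3)$ holds, and we take $\tilde\eta_0$ corresponding to $a = a_1 = 1$. Then we have $V(\tilde\eta) \supset V(\tilde\eta_0) = \{0\}$ and so $\kappa(\eta) = 1$, considering $\rho$ corresponding to

$$\zeta_w^\rho = \zeta_w, \qquad \epsilon_0^\rho = -\epsilon_0^{-1},$$

for which we get $b = 1$, $b_1 = 3$, $b_{11} = -1$.

• Case of $Q = 2$: Let $\rho$ be the complex conjugation; then $b = -1$, $b_1 = 1$, $b_{11} = 1$ hold. Therefore $\tilde\eta_0$ corresponding to $a_1 = 0$, $a = 1$ gives $V(\tilde\eta) \supset V(\tilde\eta_0) = \{0\}$ and $\kappa(\eta) = 1$.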

The proposition explains the theoretical background of the constant $\Delta$ in [8], which is our $\kappa(\eta)$, where the positivity of the expected density of $\{p \mid \#E(p) = w(p-1)/\kappa(\eta),\ \sigma_{F/\mathbb Q}(p) = \mathrm{id}\}$ is shown.

4.1.5. Case where F is the Galois closure of a real cubic field F0 with negative discriminant

We note that $F$ is an $S_3$-extension of $\mathbb Q$, and so $w = 2, 4, 6$. Let $\epsilon$ ($> 1$) be the fundamental unit of $F_0$, and $\sigma$ an automorphism of order 3 in $\operatorname{Gal}(F/\mathbb Q)$; then $\zeta_w^\sigma = \zeta_w$ holds, and putting $\epsilon' = \epsilon^\sigma$, we see

$$\epsilon'^{J} = \epsilon^{\sigma^2} = \epsilon'^{\sigma} = \epsilon^{-1-\sigma},$$

where $J$ denotes the complex conjugation. That $[o_F^\times : \langle\zeta_w, \epsilon, \epsilon'\rangle] = 1$ or $3$ is known [4]. We still suppose $\eta \in \operatorname{Gal}(F/\mathbb Q)$ is the identity. Then we have

Proposition 4.3.
$$\kappa(\eta) = \begin{cases} 3 & \text{if } F_0 \text{ is pure cubic and } [o_F^\times : \langle\zeta_w, \epsilon, \epsilon'\rangle] = 1,\\ 1 & \text{otherwise.}\end{cases}$$


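Proof. For an extension $\tilde\eta$ of $\eta$, we write as (4.2), (4.3),

$$\zeta_{w^2}^{\tilde\eta} = \zeta_{w^2}^{aw+1}, \qquad \sqrt[w]{u_i}^{\,\tilde\eta} = \zeta_w^{a_i}\sqrt[w]{u_i},$$
$$\zeta_w^\rho = \zeta_w^b, \qquad u_1^\rho = \zeta_w^{b_1}u_1^{b_{11}}u_2^{b_{12}}, \qquad u_2^\rho = \zeta_w^{b_2}u_1^{b_{21}}u_2^{b_{22}},$$

where $o_F^\times = \langle\zeta_w, u_1, u_2\rangle$.

Suppose $[o_F^\times : \langle\zeta_w, \epsilon, \epsilon'\rangle] = 1$; then we have, on putting $u_1 = \epsilon$, $u_2 = \epsilon'$,

$$\begin{array}{lll}
\zeta_w^{\sigma} = \zeta_w, & u_1^{\sigma} = u_2, & u_2^{\sigma} = u_1^{-1}u_2^{-1},\\
\zeta_w^{\sigma^2} = \zeta_w, & u_1^{\sigma^2} = u_1^{-1}u_2^{-1}, & u_2^{\sigma^2} = u_1,\\
\zeta_w^{J} = \zeta_w^{-1}, & u_1^{J} = u_1, & u_2^{J} = u_1^{-1}u_2^{-1},\\
\zeta_w^{\sigma J} = \zeta_w^{-1}, & u_1^{\sigma J} = u_1^{-1}u_2^{-1}, & u_2^{\sigma J} = u_2,\\
\zeta_w^{\sigma^2 J} = \zeta_w^{-1}, & u_1^{\sigma^2 J} = u_2, & u_2^{\sigma^2 J} = u_1.
\end{array}$$

Therefore we have $b_1 = b_2 = 0$ for $\forall\rho \in \operatorname{Gal}(F/\mathbb Q)$, and

$$(x_1,x_2) \bmod w \in V(\tilde\eta)
\iff (x_1,x_2)\begin{pmatrix} b_{11}-b & b_{12}\\ b_{21} & b_{22}-b\end{pmatrix}\begin{pmatrix}a_1\\ a_2\end{pmatrix} \equiv 0 \bmod w \text{ for } \forall\rho$$
$$\iff \begin{cases}
x_1(-a_1+a_2) + x_2(-a_1-2a_2) \equiv 0 \bmod w & (\rho = \sigma)\\
x_1(-2a_1-a_2) + x_2(a_1-a_2) \equiv 0 \bmod w & (\rho = \sigma^2)\\
2x_1a_1 - x_2a_1 \equiv 0 \bmod w & (\rho = J)\\
-x_1a_2 + 2x_2a_2 \equiv 0 \bmod w & (\rho = \sigma J)\\
x_1(a_1+a_2) + x_2(a_1+a_2) \equiv 0 \bmod w & (\rho = \sigma^2 J).
\end{cases} \tag{4.5}$$

If $F_0$ is pure cubic, then we have $w = 6$, and taking $\tilde\eta_0$ corresponding to $a_1 = a_2 = 1$, we get by (4.5)

$$(x_1,x_2) \bmod 6 \in V(\tilde\eta_0)
\iff \begin{cases} -3x_2 \equiv 0 \bmod 6\\ -3x_1 \equiv 0 \bmod 6\\ 2x_1 - x_2 \equiv 0 \bmod 6\\ -x_1 + 2x_2 \equiv 0 \bmod 6\\ 2x_1 + 2x_2 \equiv 0 \bmod 6\end{cases}
\iff \begin{cases} x_j \equiv 0 \bmod 2,\\ x_1 + x_2 \equiv 0 \bmod 3.\end{cases}$$

It is easy to see $V(\tilde\eta) \supset V(\tilde\eta_0)$ for any extension $\tilde\eta$, and therefore $\kappa(\eta) = 3$.

Next, we suppose that $F_0$ is not pure cubic, which implies $\mathbb Q(\sqrt{D_{F_0}}) \ne \mathbb Q(\sqrt{-3})$ and $w = 2$ or $4$. For $\tilde\eta_0$ corresponding to $a_1 = a_2 = 1$, we have (cf. (4.5))

$$(x_1,x_2) \bmod w \in V(\tilde\eta_0) \iff x_1 \equiv x_2 \equiv 0 \bmod w.$$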
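The system (4.5) is small enough to solve by exhaustive search. The sketch below (function names ours) lists $V(\tilde\eta)$ for given $(a_1, a_2)$ and modulus $w$, and takes the gcd of the cardinalities over all residues $(a_1, a_2)$. Treating every pair of residues as realizable is an assumption made for illustration; the argument in the text only needs the choice $a_1 = a_2 = 1$.

```python
from itertools import product
from math import gcd

def V(a1, a2, w=6):
    """Solutions (x1, x2) mod w of the five congruences (4.5)."""
    sols = []
    for x1, x2 in product(range(w), repeat=2):
        rows = (
            x1 * (-a1 + a2) + x2 * (-a1 - 2 * a2),   # rho = sigma
            x1 * (-2 * a1 - a2) + x2 * (a1 - a2),    # rho = sigma^2
            2 * x1 * a1 - x2 * a1,                   # rho = J
            -x1 * a2 + 2 * x2 * a2,                  # rho = sigma J
            x1 * (a1 + a2) + x2 * (a1 + a2),         # rho = sigma^2 J
        )
        if all(r % w == 0 for r in rows):
            sols.append((x1, x2))
    return sols

def kappa(w=6):
    """gcd of #V(a1, a2) over all residues a1, a2 mod w (illustrative range)."""
    g = 0
    for a1, a2 in product(range(w), repeat=2):
        g = gcd(g, len(V(a1, a2, w)))
    return g

print(V(1, 1), kappa())
```

For $w = 6$ the three solutions at $a_1 = a_2 = 1$ are exactly the pairs with even entries and $x_1 + x_2 \equiv 0 \bmod 3$; for $w = 2$ or $4$ the same function returns only the zero solution.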


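Therefore we have $\kappa(\eta) = 1$.

Now suppose $[o_F^\times : \langle\zeta_w, \epsilon, \epsilon'\rangle] = 3$. First we show that there exist $\epsilon_0 \in o_F^\times$, $e \in \mathbb Z$ so that

$$o_F^\times = \langle\zeta_w, \epsilon_0, \epsilon_0^\sigma\rangle, \quad \epsilon_0^3 = \zeta_w^e\,\epsilon/\epsilon', \quad \epsilon_0^{\sigma^2} = \zeta_w^e\epsilon_0^{-1-\sigma}, \quad \epsilon_0^J = \zeta_w^{-e}\epsilon_0^{1+\sigma}, \quad \epsilon_0^{\sigma J} = \epsilon_0^{-\sigma}. \tag{4.6}$$

Take a unit $\epsilon_0 \in o_F^\times \setminus \langle\zeta_w, \epsilon, \epsilon'\rangle$ and let $\epsilon_0^3 = \zeta_w^e\epsilon^b\epsilon'^c$ for $e, b, c \in \mathbb Z$. If $b \equiv c \equiv 0 \bmod 3$, then we may assume $b = c = 0$, i.e. that $\epsilon_0$ is a root of unity. This contradicts $\epsilon_0 \notin \langle\zeta_w, \epsilon, \epsilon'\rangle$.

If $b \equiv 0 \bmod 3$, then $c \not\equiv 0 \bmod 3$ and $(\epsilon_0^\sigma)^3 = \zeta_w^e\epsilon'^b(\epsilon\epsilon')^{-c} = \zeta_w^e\epsilon^{-c}\epsilon'^{b-c}$ allows us to assume $b \not\equiv 0 \bmod 3$, taking $\epsilon_0^\sigma$ instead of $\epsilon_0$. Thus we may assume $\epsilon_0^3 = \zeta_w^e\epsilon\epsilon'^c$, taking $\epsilon_0^{-1}$ instead, if necessary. If $c \equiv 0 \bmod 3$ holds, then $\epsilon = (\zeta_{3w}^{-e}\epsilon_0\epsilon'^{-c/3})^3 \in F(\zeta_9)^3$, which contradicts Lemma 1.2 in [6]. If $c = 1$ holds, then $\epsilon_0^3 = \zeta_w^e\epsilon^{-\sigma^2}$ and $\epsilon_0^{3\sigma} = \zeta_w^e\epsilon^{-1}$, which yields the contradiction $\epsilon = (\zeta_{3w}^{e}\epsilon_0^{-\sigma})^3 \in F(\zeta_9)^3$ as above. Therefore we may assume $\epsilon_0^3 = \zeta_w^e\,\epsilon/\epsilon'$.

Then we have $\epsilon_0^{3\sigma} = \zeta_w^e\epsilon'(\epsilon\epsilon') = \zeta_w^e\epsilon\epsilon'^2 = (\epsilon_0\epsilon')^3$. Hence there is a third root $\omega$ of unity so that

$$\epsilon_0^\sigma = \omega\epsilon_0\epsilon', \tag{4.7}$$

which implies

$$\epsilon' = \omega^{-1}\epsilon_0^\sigma/\epsilon_0, \qquad \epsilon = \zeta_w^{-e}\epsilon_0^3\epsilon' = \zeta_w^{-e}\omega^{-1}\epsilon_0^2\epsilon_0^\sigma.$$

This yields $o_F^\times = \langle\zeta_w, \epsilon_0, \epsilon_0^\sigma\rangle$, and $\epsilon_0^{\sigma^2} = \zeta_w^e\epsilon_0^{-1-\sigma}$ in (4.6) follows from

$$\epsilon_0^{\sigma^2} = (\omega\epsilon_0\epsilon')^\sigma = \omega(\omega\epsilon_0\epsilon')(\epsilon^{-1}\epsilon'^{-1}) = \omega^3\zeta_w^e\epsilon_0^{-1-\sigma} = \zeta_w^e\epsilon_0^{-1-\sigma}.$$

$\epsilon_0^{\sigma+\sigma J} = 1$ in (4.6) follows from $\epsilon_0^{\sigma+\sigma J} = (\epsilon_0^\sigma)^{1+J} > 0$ and the fact that $\epsilon_0^{\sigma+\sigma J}$ is a root of unity, since

$$(\epsilon_0^{\sigma+\sigma J})^3 = (\zeta_w^e\,\epsilon/\epsilon')^{\sigma+\sigma J} = \epsilon^{\sigma+\sigma J-\sigma^2-\sigma^2J} = 1$$

by $\epsilon^{\sigma^2} = \epsilon^{\sigma J}$, $\epsilon^{\sigma^2J} = \{\epsilon'^J\}^J = \epsilon^\sigma$. By (4.7), we have

$$\epsilon_0^J = (\omega^{-1}\epsilon'^{-1}\epsilon_0^\sigma)^J = \omega\epsilon^{1+\sigma}\epsilon_0^{-\sigma} = \omega(\zeta_w^{-e}\omega^{-1}\epsilon_0^2\epsilon_0^\sigma)(\zeta_w^{-e}\omega^{-1}\epsilon_0^{2\sigma}\epsilon_0^{\sigma^2})\epsilon_0^{-\sigma} = \omega^{-1}\zeta_w^{-2e}\epsilon_0^{2+2\sigma+\sigma^2} = \omega^{-1}\zeta_w^{-e}\epsilon_0^{1+\sigma}$$

by $\epsilon_0^{1+\sigma+\sigma^2} = \zeta_w^e$. We have only to show $\omega = 1$ to complete the proof of (4.6). It follows from

$$\epsilon_0^\sigma = \epsilon_0^{J\sigma^2J} = (\omega^{-1}\zeta_w^{-e}\epsilon_0^{1+\sigma})^{\sigma^2J} = \omega\zeta_w^e\epsilon_0^{\sigma^2J+J} = \omega\zeta_w^e(\zeta_w^e\epsilon_0^{-1-\sigma})^J\epsilon_0^J = \omega\epsilon_0^\sigma.$$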


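Now, putting $u_1 = \epsilon_0$, $u_2 = \epsilon_0^\sigma$, we have by (4.6)

$$\begin{array}{lll}
\zeta_w^{\sigma} = \zeta_w, & u_1^{\sigma} = u_2, & u_2^{\sigma} = \zeta_w^{e}u_1^{-1}u_2^{-1},\\
\zeta_w^{\sigma^2} = \zeta_w, & u_1^{\sigma^2} = \zeta_w^{e}u_1^{-1}u_2^{-1}, & u_2^{\sigma^2} = u_1,\\
\zeta_w^{J} = \zeta_w^{-1}, & u_1^{J} = \zeta_w^{-e}u_1u_2, & u_2^{J} = u_2^{-1},\\
\zeta_w^{\sigma J} = \zeta_w^{-1}, & u_1^{\sigma J} = u_2^{-1}, & u_2^{\sigma J} = u_1^{-1},\\
\zeta_w^{\sigma^2 J} = \zeta_w^{-1}, & u_1^{\sigma^2 J} = u_1^{-1}, & u_2^{\sigma^2 J} = \zeta_w^{-e}u_1u_2,
\end{array}$$

which yield

$$(x_1,x_2) \bmod w \in V(\tilde\eta)
\iff x_1((b_{11}-b)a_1 + b_{12}a_2 + b_1a) + x_2(b_{21}a_1 + (b_{22}-b)a_2 + b_2a) \equiv 0 \bmod w \text{ for } \forall\rho$$
$$\iff \begin{cases}
x_1(-a_1+a_2) + x_2(-a_1-2a_2+ea) \equiv 0 \bmod w & (\rho = \sigma)\\
x_1(-2a_1-a_2+ea) + x_2(a_1-a_2) \equiv 0 \bmod w & (\rho = \sigma^2)\\
x_1(2a_1+a_2-ea) \equiv 0 \bmod w & (\rho = J)\\
x_1(a_1-a_2) + x_2(-a_1+a_2) \equiv 0 \bmod w & (\rho = \sigma J)\\
x_2(a_1+2a_2-ea) \equiv 0 \bmod w & (\rho = \sigma^2 J)
\end{cases}$$
$$\Longrightarrow \begin{cases} x_1(a_1-a_2) \equiv 0 \bmod w,\\ x_2(a_1-a_2) \equiv 0 \bmod w,\end{cases}
\ \Longrightarrow\ x_1 \equiv x_2 \equiv 0 \bmod w \text{ for } \tilde\eta_0\ (\leftrightarrow a_1 - a_2 = 1).$$

Hence we have $\kappa(\eta) = 1$ under the assumption $[o_F^\times : \langle\zeta_w, \epsilon, \epsilon'\rangle] = 3$. This completes the proof.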

When $F_0$ is defined by $x^3 + a_1x + a_0 = 0$ with $0 \le a_1, |a_0| \le 100$, we have checked $\kappa(\eta) = \mathrm{Rel}_\eta$ by finding a prime number $p$ satisfying $\#E(p) = w(p-1)^2/\kappa(\eta)$ by computer.

4.2. Case of complex conjugation

In this subsection, let $F$ be an imaginary abelian extension of the rational number field $\mathbb Q$ with $[F : \mathbb Q] = 2n$ ($\ge 4$), and we assume that $\eta$ is the complex conjugation $J$. Denote the maximal real subfield by $F_0$; then $d = 2$ and $g(x) = x - 1$ are obvious, and it is known that

$$Q = \bigl[o_F^\times : W_F\,o_{F_0}^\times\bigr]$$

is 1 or 2. Therefore, we have $o_F^\times = U(1)$ and so $\tilde\Delta = \tau_1 = 2$ by Corollary 5.1. Now Proposition 3.6 reads as follows:


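For an extension $\tilde\eta \in \operatorname{Gal}(F_2/\mathbb Q)$ of $\eta$, put

$$o_F^\times = \langle\zeta_w, u_1, \dots, u_r\rangle,$$
$$\zeta_{2w}^{\tilde\eta} = \zeta_{2w}^{aw-1}, \qquad \sqrt{u_i}^{\,\tilde\eta} = \zeta_{2w}^{a_i}\prod_j \sqrt{u_j}^{\,a_{ij}}, \qquad {}^t\mathbf a = (a_1,\dots,a_r), \qquad A = (a_{ij}),$$
$$\zeta_w^\rho = \zeta_w^b, \qquad u_i^\rho = \zeta_w^{b_i}\prod_j u_j^{b_{ij}}, \qquad {}^t\mathbf b = (b_1,\dots,b_r), \qquad B = (b_{ij}).$$

Then we have (cf. (3.27), (3.28))

$$V_1(\tilde\eta) = \Bigl\{ x \in \mathbb Z^r/2\mathbb Z^r \Bigm| x\mathbf a \equiv 0 \bmod 2, \text{ and } x((aw-2)\mathbf b + (B-b)\mathbf a) \equiv 0 \bmod 2w \text{ for } \forall\rho \in \operatorname{Gal}(F/\mathbb Q)\Bigr\}, \qquad V_2(\tilde\eta) = \{0\}, \tag{4.8}$$

and we note that $u_i^J = \zeta_w^{a_i}\prod_j u_j^{a_{ij}}$ determines $a_i \bmod w$ uniquely.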

4.2.1. Case of [F : Q] = 4

Let $\epsilon_0$ ($> 1$) be the fundamental unit of $F_0$.

Proposition 4.4. We have the following:

$$\kappa(\eta) = \begin{cases} 2 & \text{if } N_{F_0/\mathbb Q}(\epsilon_0) = 1 \text{ and } Q = 1,\\ 1 & \text{otherwise.}\end{cases}$$

Proof. Since $r = \operatorname{rank} o_F^\times = 1$, (4.8) amounts to

$$V_1(\tilde\eta) = \Bigl\{ x_1 \in \mathbb Z/2\mathbb Z \Bigm| x_1a_1 \equiv 0 \bmod 2,\ \ x_1\{(aw-2)b_1 + (b_{11}-b)a_1\} \equiv 0 \bmod 2w \text{ for } \forall\rho \in \operatorname{Gal}(F/\mathbb Q)\Bigr\}. \tag{4.9}$$

First, we assume $Q = 1$; then $\epsilon_0$ being a fundamental unit of $F$, we can take $\epsilon_0$ as the fundamental unit $u_1$ of $F$, and $a_1 \equiv 0 \bmod w$ is clear, and so the first congruence $x_1a_1 \equiv 0 \bmod 2$ is satisfied for any $x_1 \in \mathbb Z$. If $\rho = \mathrm{id}$ on $F_0$, then we have $\rho = \mathrm{id}$ or $J$, and hence $b = \pm 1$, $b_1 \equiv 0 \bmod w$, and $b_{11} = 1$. Therefore, the above equation for this $\rho$ is satisfied for all $x_1$ in this case. Suppose $\rho \ne \mathrm{id}$ on $F_0$ and put $N_{F_0/\mathbb Q}(\epsilon_0) = (-1)^s$; then noting $b_1 \equiv sw/2 \bmod w$, $b_{11} = -1$, $b \equiv 1 \bmod 2$, the second equation in (4.9) becomes

$$x_1(aw-2)b_1 \equiv x_1(aw-2)sw/2 \equiv 0 \bmod 2w.$$


Hence, if $s = 0$, this is satisfied for all $x_1$, and so $\kappa(\eta) = 2$. If $s = 1$, then it is equivalent to $x_1(aw-2)w/2 \equiv 0 \bmod 2w$. Taking $a = 0$, we have $x_1 \equiv 0 \bmod 2$, and so $\kappa(\eta) = 1$.

Next, we assume $Q = 2$; then we can choose a fundamental unit $\epsilon$ of $F$ as in §4.1.4 so that

$$\epsilon_0 = \zeta_w\epsilon^2, \qquad \epsilon^J = \zeta_w\epsilon.$$

Let $u_1 = \epsilon$; then $a_1$ is odd, and then the first congruence implies $x_1 \equiv 0 \bmod 2$ and so $\kappa(\eta) = 1$.

We remark that this is also compatible with [8] and explains the theoretical background of the constant $\Delta$ there, which is $\kappa(\eta)$ here.

4.2.2. Case of [F : Q] = 6

Proposition 4.5. In this case, we have $\kappa(\eta) = 1$.

Proof. Let $\sigma$ be an element in $\operatorname{Gal}(F/\mathbb Q)$ of order 3. Let us show that there is a system $\{u_1, u_2\}$ of fundamental units so that

$$u_1^\sigma = u_2, \qquad u_2^\sigma = (u_1u_2)^{-1}, \qquad u_i^J = u_i. \tag{4.10}$$

Because of $\operatorname{rank} o_F^\times = 2$, there is a system $u_1$, $u_2$ of fundamental units with

$$u_1^\sigma = u_2, \qquad u_2^\sigma = \zeta_w^c(u_1u_2)^{-1},$$

using the theory of integral representation of the cyclic group of prime order (the theorem on p. 508 [3]). Put $u_i^J = \zeta_w^{d_i}u_i$; then defining an integer $e$ by $\zeta_w^\sigma = \zeta_w^e$, we have

$$u_1^{\sigma J} = u_2^J = \zeta_w^{d_2}u_2, \qquad u_1^{J\sigma} = (\zeta_w^{d_1}u_1)^\sigma = \zeta_w^{ed_1}u_2,$$

and so $\operatorname{Gal}(F/\mathbb Q)$ being abelian, we may assume $d_2 = ed_1$. The equality $u_1^{1+\sigma+\sigma^2} = \zeta_w^c$ yields $u_1^{(1+\sigma+\sigma^2)J} = \zeta_w^{-c}$ and $u_1^{J(1+\sigma+\sigma^2)} = (\zeta_w^{d_1}u_1)^{1+\sigma+\sigma^2} = \zeta_w^{(1+e+e^2)d_1+c}$, and hence $\zeta_w^{(1+e+e^2)d_1+2c} = 1$. Therefore we have $(1+e+e^2)d_1 + 2c \equiv 0 \bmod w$, which implies $d_1$ is even. Thus we have $(\zeta_w^{d_1/2}u_1)^J = \zeta_w^{d_1/2}u_1$, and so we may assume $d_1 = 0$. This necessitates $c \equiv 0 \bmod w/2$, i.e. $\zeta_w^c = \pm 1$. If $\zeta_w^c = -1$, taking $-u_i$ as $u_i$, we can assume $\zeta_w^c = 1$, which completes the proof of the above assertion (4.10). Hence we have

$$\zeta_w^{J\sigma} = \zeta_w^{-e}, \qquad u_1^{J\sigma} = u_2, \qquad u_2^{J\sigma} = u_1^{-1}u_2^{-1},$$
$$\zeta_w^{J\sigma^2} = \zeta_w^{-e^2}, \qquad u_1^{J\sigma^2} = u_1^{-1}u_2^{-1}, \qquad u_2^{J\sigma^2} = u_1,$$


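whence

$$\rho = J\sigma \Rightarrow b \equiv 1 \bmod 2, \quad B = \begin{pmatrix} 0 & 1\\ -1 & -1\end{pmatrix}, \quad \mathbf b \equiv 0 \bmod w,$$
$$\rho = J\sigma^2 \Rightarrow b \equiv 1 \bmod 2, \quad B = \begin{pmatrix} -1 & -1\\ 1 & 0\end{pmatrix}, \quad \mathbf b \equiv 0 \bmod w,$$

and this implies, for an extension $\tilde\eta_0$ corresponding to $a_1 = a_2 = 1$,

$$(x_1,x_2) \in V_1(\tilde\eta_0)
\iff \begin{cases} x\mathbf a \equiv 0 \bmod 2,\\ x((aw-2)\mathbf b + (B-b)\mathbf a) \equiv 0 \bmod 2w\end{cases}
\ \Longrightarrow\ (x_1,x_2)(B+1_2)\begin{pmatrix}1\\1\end{pmatrix} \equiv 0 \bmod 2 \quad (\rho = J\sigma, J\sigma^2)$$
$$\Longrightarrow \begin{cases}
(x_1,x_2)\left(\begin{pmatrix} 0 & 1\\ -1 & -1\end{pmatrix} + 1_2\right)\begin{pmatrix}1\\1\end{pmatrix} \equiv 0 \bmod 2 & (\rho = J\sigma),\\[2mm]
(x_1,x_2)\left(\begin{pmatrix} -1 & -1\\ 1 & 0\end{pmatrix} + 1_2\right)\begin{pmatrix}1\\1\end{pmatrix} \equiv 0 \bmod 2 & (\rho = J\sigma^2)
\end{cases}
\ \Longrightarrow\ x_1 \equiv x_2 \equiv 0 \bmod 2.$$

This yields $V_1(\tilde\eta_0) = V_2(\tilde\eta_0) = 2\mathbb Z$, and so $\kappa(\eta) = 1$.

Remark 4.2. When equations $y^3 - ay + b = 0$ and $x^2 + c = 0$ ($0 < a, b < 1000$, $0 < c < 100$, $a, b, c \in \mathbb Z$) define a real cubic abelian subfield and an imaginary quadratic subfield of $F$, respectively, the equality $\kappa(\eta) = \mathrm{Rel}_\eta$ is confirmed by finding a prime number $p$ so that $\#E(p) = w(p-1)^2$ with the aid of computer.

4.3. Case where F is an imaginary abelian field with [F : Q] = 6 and the order of η ∈ Gal(F/Q) is 3

Proposition 4.6. In this case, we have $\kappa(\eta) = 1$.

Proof. As in §4.2.2 (cf. (4.10)), we may assume

$$o_F^\times = \langle\zeta_w, u_1, u_2\rangle, \qquad u_1^\eta = u_2, \qquad u_2^\eta = (u_1u_2)^{-1} \quad (u_1, u_2 \in F_0), \tag{4.11}$$

where $F_0$ is the maximal real subfield of $F$. It is easy to see that $o_F^\times = U(3)$, i.e. $o_F^\times = \{\epsilon \in o_F^\times \mid \epsilon^{g(\eta)} \in W_F\}$, on putting $g(x) = \Phi_3(x) = x^2 + x + 1$. Hence we have $\tilde\Delta = \tau_3$ for every extension $\tilde\eta$ of $\eta$ (cf. the remark just after (3.9)).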

Lemma 4.1. We have

$$\tilde\Delta = \begin{cases} 7 & \text{if } F = \mathbb Q(\zeta_7),\\ 3 & \text{if } \zeta_3 \in F,\\ 1 & \text{otherwise.}\end{cases}$$

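Proof. Let $p$ be a prime divisor of $\tilde\Delta$ and let $p^n \,\|\, \tilde\Delta$. Then $\zeta_{p^n}^{g(\rho)} = 1$ for every extension $\rho \in \operatorname{Gal}(F(\zeta_{p^n})/\mathbb Q)$ of $\eta$ by definition (cf. (3.8)). Suppose that $\sigma_a \in \operatorname{Gal}(\mathbb Q(\zeta_{p^n})/\mathbb Q)$ with $\zeta_{p^n}^{\sigma_a} = \zeta_{p^n}^a$ coincides with $\eta$ on $F \cap \mathbb Q(\zeta_{p^n})$; then it is extended to an element $\tilde\sigma_a \in \operatorname{Gal}(F(\zeta_{p^n})/\mathbb Q)$ with $\tilde\sigma_a|_F = \eta$, and so we have $a^2 + a + 1 \equiv 0 \bmod p^n$ by $\zeta_{p^n}^{g(\tilde\sigma_a)} = 1$. In particular, $p$ is an odd prime ($\ne 5$), and

$$\bigl[\mathbb Q(\zeta_{p^n}) : F \cap \mathbb Q(\zeta_{p^n})\bigr] \le \#\{a \bmod p^n \mid a^2 + a + 1 \equiv 0 \bmod p^n\}.$$

Let us check that the right-hand side is $\le 2$. In case of $p = 3$, this is obvious, since there is no solution of $x^2 + x + 1 \equiv 0 \bmod 9$. Suppose $p \ne 3$; then Lemma 5.2 with $f(x) = x^2 + x + 1$, $A(x) = 4$, $B(x) = -(2x+1)$, $n = 3$ yields $\#\{a \bmod p^n \mid a^2 + a + 1 \equiv 0 \bmod p^n\} \le 2$. Now the inequality $[\mathbb Q(\zeta_{p^n}) : F \cap \mathbb Q(\zeta_{p^n})] \le 2$ yields

$$p^{n-1}(p-1) = [\mathbb Q(\zeta_{p^n}) : F \cap \mathbb Q(\zeta_{p^n})]\,[F \cap \mathbb Q(\zeta_{p^n}) : \mathbb Q] \mid 12. \tag{4.12}$$

Therefore the possibilities of $p$, $n$ are $p = 13$, $n = 1$; $p = 7$, $n = 1$; $p = 3$, $n \le 2$.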
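The root counts used in this reduction are easy to tabulate. The sketch below (helper names ours) lists, for every prime power $q = p^n < 30$ with $\varphi(q) \mid 12$, the residues $a$ with $a^2 + a + 1 \equiv 0 \bmod q$. It only illustrates the congruence conditions and does not model the field $F$ itself.

```python
def roots_mod(q):
    """Residues a mod q with a^2 + a + 1 = 0 mod q."""
    return [a for a in range(q) if (a * a + a + 1) % q == 0]

def prime_power_base(q):
    """Return p if q = p^n for a prime p and n >= 1, else None."""
    for p in range(2, q + 1):
        if q % p == 0:          # p is the smallest divisor > 1, hence prime
            while q % p == 0:
                q //= p
            return p if q == 1 else None
    return None

table = {}
for q in range(2, 30):
    p = prime_power_base(q)
    if p is not None and 12 % (q - q // p) == 0:   # phi(p^n) = q - q/p divides 12
        table[q] = roots_mod(q)

for q, roots in table.items():
    print(q, roots)
```

Only $q = 3, 7, 13$ admit a root, and never more than two; in particular $q = 9$ admits none, and $a = 3, 5$ are not roots modulo 7, which are the facts the case analysis relies on.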

• Case of $p = 13$, $n = 1$: (4.12) implies $[F \cap \mathbb Q(\zeta_{13}) : \mathbb Q] = 6$, and so $F \subset \mathbb Q(\zeta_{13})$. Since the subgroup corresponding to $F$ in $\operatorname{Gal}(\mathbb Q(\zeta_{13})/\mathbb Q)$ is of order 2, it is generated by the complex conjugation. Hence $F$ is a real subfield, which is a contradiction.

• Case of $p = 7$, $n = 1$: (4.12) implies $6 \le 2[F \cap \mathbb Q(\zeta_7) : \mathbb Q]$, and then $[F \cap \mathbb Q(\zeta_7) : \mathbb Q] = 3, 6$. Suppose $[F \cap \mathbb Q(\zeta_7) : \mathbb Q] = 6$; then $F = \mathbb Q(\zeta_7)$. Therefore we have $\eta = \sigma_2$ or $= \sigma_4$, and they satisfy $\zeta_7^{g(\eta)} = 1$. Thus we conclude that $7 \mid \tilde\Delta$ in case of $F = \mathbb Q(\zeta_7)$. Suppose $[F \cap \mathbb Q(\zeta_7) : \mathbb Q] = 3$; then $F_0$ and $F \cap \mathbb Q(\zeta_7)$ coincide with the maximal real subfield of $\mathbb Q(\zeta_7)$. $\sigma_a$ ($a = 3, 5$) induces an automorphism of order 3 in $\operatorname{Gal}(F \cap \mathbb Q(\zeta_7)/\mathbb Q)$, and one of them coincides with $\eta$ on $F \cap \mathbb Q(\zeta_7) = F_0$, but neither of them satisfies $a^2 + a + 1 \equiv 0 \bmod 7$. Thus this case does not occur.

• Case of $p = 3$, $n = 2$: In this case, $p^n = 9$, and there is no integer $a$ which satisfies $a^2 + a + 1 \equiv 0 \bmod 9$. Hence this case does not happen.


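• Case of $p = 3$, $n = 1$: In case of $F \ni \zeta_3$, an automorphism $\rho \in \operatorname{Gal}(F(\zeta_3)/\mathbb Q) = \operatorname{Gal}(F/\mathbb Q)$ which is an extension of $\eta$ fixes $\zeta_3$ ($\in F \cap \mathbb Q(\zeta_3)$) because the order of $\eta$ is three, and so $\zeta_3^{g(\rho)} = 1$, i.e. $3 \mid \tilde\Delta$. In case of $\zeta_3 \notin F$, $F \cap \mathbb Q(\zeta_3) = \mathbb Q$ holds, and so extending $\eta$ to $\rho$ by $\zeta_3^\rho = \zeta_3^{-1}$, we have $\zeta_3^{g(\rho)} \ne 1$, i.e. $3 \nmid \tilde\Delta$.

Now we distinguish three cases. Suppose $\tilde\Delta = 1$; then $\tau_3 = 1$ and Proposition 3.5 implies $[R(\tilde\eta,p) : W_F\,(o_F^\times)^{\eta-p}] = 1$. Hence we get $\kappa(\eta) = 1$.

Suppose $\tilde\Delta = 3$; then $\zeta_3 \in F$ implies $3 \mid w$. As in Lemma 3.14, we put

$$\zeta_{3w}^{\tilde\eta} = \zeta_{3w}^{a}, \qquad \sqrt[3]{u_i}^{\,\tilde\eta} = \zeta_{3w}^{a_i}\prod_j \sqrt[3]{u_j}^{\,a_{ij}}, \qquad \zeta_w^\rho = \zeta_w^b, \qquad u_i^\rho = \zeta_w^{b_i}\prod_j u_j^{b_{ij}},$$

which yields $\mathbf a \equiv \mathbf b \equiv 0 \bmod w$ by (4.11), and since the quadratic subfield of $F$ is $\mathbb Q(\zeta_3)$, we have $a \equiv 1 \bmod 3$ and so $\Phi_3(a) \equiv 0 \bmod 3$. Hence by (3.9), we have $\tau_3 = 3 = \tilde\Delta$, and then (3.27) means

$$V_1(\tilde\eta) = \{ y \bmod 3 \mid y\Phi_3(a,A)(B-b)\mathbf a \equiv 0 \bmod 3w \text{ for } \forall\rho\}.$$

It is easy to see that

$$B - b = \begin{cases} \begin{pmatrix} 2 & 0\\ 0 & 2\end{pmatrix} & \text{if } \rho = J,\\[3mm] \begin{pmatrix} -a & 1\\ -1 & -a-1\end{pmatrix} & \text{if } \rho = \eta,\end{cases}$$

whence choosing $\tilde\eta$ with $\mathbf a = \begin{pmatrix} w\\ 2w\end{pmatrix}$, we have, by $a \equiv 1 \bmod 3$,

$$(B-b)\mathbf a \equiv \begin{pmatrix} 2w\\ w\end{pmatrix},\ \begin{pmatrix} w\\ w\end{pmatrix} \bmod 3w.$$

Since $2x + y \equiv x + y \equiv 0 \bmod 3$ has only a trivial solution, $y \in V_1(\tilde\eta)$ satisfies $y\Phi_3(a,A) \equiv 0 \bmod 3$. Thus $V_1(\tilde\eta) \subset V_2(\tilde\eta)$ (cf. (3.27), (3.28)) holds, which means $\kappa(\eta) = 1$.

Suppose $\tilde\Delta = 7$, i.e. $F = \mathbb Q(\zeta_7)$ as above. As in Lemma 3.14, we put

$$\zeta_{7w}^{\tilde\eta} = \zeta_{7w}^{a}, \qquad \sqrt[7]{u_i}^{\,\tilde\eta} = \zeta_{7w}^{a_i}\prod_j \sqrt[7]{u_j}^{\,a_{ij}}, \qquad \zeta_w^\rho = \zeta_w^b, \qquad u_i^\rho = \zeta_w^{b_i}\prod_j u_j^{b_{ij}}.$$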

Since the order of $\eta$ is 3, we have $a \equiv 2, 4 \bmod 7$, and $\mathbf a \equiv \mathbf b \equiv 0 \bmod w$ (cf. (4.11)). Therefore we have $\Phi_3(a) \equiv 0 \bmod 7$, from which $\tau_3 = 7 = \tilde\Delta$ follows by (3.9), and hence $y \in V_1(\tilde\eta)$ (cf. (3.27)) yields

$$y\Phi_3(a,A)(B-b)\mathbf a \equiv 0 \bmod \tilde\Delta w.$$


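Here, since we have

$$B - b = \begin{cases} \begin{pmatrix} 2 & 0\\ 0 & 2\end{pmatrix} & \text{if } \rho = J,\\[3mm] \begin{pmatrix} -a & 1\\ -1 & -a-1\end{pmatrix} & \text{if } \rho = \eta,\end{cases}$$

taking $\mathbf a = \begin{pmatrix} w\\ w\end{pmatrix}$, we obtain

$$(B-b)\mathbf a = \begin{pmatrix} 2w\\ 2w\end{pmatrix},\ \begin{pmatrix} (-a+1)w\\ (-a-2)w\end{pmatrix}.$$

The determinant of $\begin{pmatrix} 2 & -a+1\\ 2 & -a-2\end{pmatrix}$ is not $0 \bmod 7$, and so $y \in V_1(\tilde\eta)$ yields $y\Phi_3(a,A) \equiv 0 \bmod 7$ and so $y \in V_2(\tilde\eta)$. Therefore we conclude that $\kappa(\eta) = 1$ as above.

Remark 4.3. In all the examples given here, we have $o_F^\times = U(m)$ for a single $m$. If $F$ is a real cyclic extension of degree 4, then we see that there is a system of fundamental units $\{u_1, u_2, u_3\}$ such that

$$u_1^\sigma = \zeta u_1^{-1}, \qquad u_2^\sigma = u_3^{-1}, \qquad u_3^\sigma = \zeta'u_2 \qquad (\zeta, \zeta' = \pm 1), \tag{4.13}$$
$$u_1^\sigma = \zeta u_1^{-1}, \qquad u_2^\sigma = u_3^{-1}, \qquad u_3^\sigma = u_1^{-1}u_2 \qquad (\zeta = \pm 1), \tag{4.14}$$

where $\sigma$ is a generator of $\operatorname{Gal}(F/\mathbb Q)$. Let $\eta = \sigma^2$; then we have

$$U(1) = \langle -1, u_1\rangle, \qquad U(2) = \langle -1, u_2, u_3\rangle \qquad \text{for (4.13)},$$
$$U(1) = \langle -1, u_1\rangle, \qquad U(2) = \langle -1, u_1u_2^{-2}, u_1^{-1}u_2u_3\rangle \qquad \text{for (4.14)},$$

and so $o_F^\times \ne U(1)U(2)$ in case of (4.14). We can see that

$$\kappa(\eta) = \begin{cases} 2 & \text{if } \zeta' = -1\\ 4 & \text{if } \zeta' = 1\end{cases} \text{ for (4.13)}, \qquad \kappa(\eta) = \begin{cases} 1 & \text{if } \zeta = -1\\ 2 & \text{if } \zeta = 1\end{cases} \text{ for (4.14)},$$

and as far as we have checked, $\kappa(\eta) = \mathrm{Rel}_\eta$ is true.

5. Appendix

5.1. Divisors of f(p)

Proposition 5.1. Let $L$ be a Galois extension of $\mathbb Q$ and $\eta \in \operatorname{Gal}(L/\mathbb Q)$, and let $f(x) \in \mathbb Z[x]$ be a polynomial with $(f(x), f'(x)) = 1$ in $\mathbb Q[x]$. Then there exists the maximum $\delta$ of natural numbers $m$ such that

$$\text{if } \tilde\eta \in \operatorname{Gal}(L(\zeta_m)/\mathbb Q) \text{ and } \tilde\eta|_L = \eta, \text{ then } \zeta_m^{f(\tilde\eta)} = 1 \text{ holds.} \tag{5.1}$$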


We have the following expression for $\delta$:

$$\delta = \gcd_{\eta \in \sigma_{L/\mathbb Q}(p),\ p \nmid 2D_L\delta} f(p) = \lim_{x\to\infty}\ \gcd_{\eta \in \sigma_{L/\mathbb Q}(p),\ p > x} f(p).$$
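To see the expression for $\delta$ in the simplest setting, take $L = \mathbb Q$ (so $\eta$ is trivial and $D_L = 1$) and the example polynomial $f(x) = x^2 - 1$, which is our choice and not taken from the text. Condition (5.1) then says $f(a) \equiv 0 \bmod m$ for every $a$ prime to $m$. The sketch searches $m < 500$ for the largest such modulus (the bound is an assumption that happens to suffice here) and compares it with the gcd of $f(p)$ over primes $5 \le p < 500$.

```python
from math import gcd

def f(x):
    return x * x - 1  # example polynomial; squarefree, so (f, f') = 1

def satisfies_5_1(m):
    """Condition (5.1) for L = Q: f(a) = 0 mod m for every a prime to m."""
    return all(f(a) % m == 0 for a in range(1, m + 1) if gcd(a, m) == 1)

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

# Largest modulus below the (assumed sufficient) search bound 500.
delta = max(m for m in range(1, 500) if satisfies_5_1(m))

# gcd of f(p) over primes p not dividing 2 * delta, i.e. p >= 5 here.
G = 0
for p in range(5, 500):
    if is_prime(p):
        G = gcd(G, f(p))

print(delta, G)
```

Both numbers agree in this example; including $p = 3$ would lower the gcd to $\gcd(8, 24) = 8$, which illustrates why small primes have to be excluded.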

We need a few lemmas to prove Proposition 5.1.

Lemma 5.1 (Newton Approximation). Let $q$ be a prime number and $f(x) \in \mathbb Z_q[x]$. If $a \in \mathbb Z_q$ satisfies $|f(a)|_q < |f'(a)|_q^2$, then there is a solution $\alpha \in \mathbb Z_q$ of $f(x) = 0$ such that $|\alpha - a|_q \le |f(a)/f'(a)|_q$.

Proof. See p. 83 in [1].

Lemma 5.2. Let $q$ be a prime number and let $f(x) \in \mathbb Z[x]$ be the polynomial in the proposition. Taking integral polynomials $A(x), B(x) \in \mathbb Z[x]$ and a natural number $n$ such that $A(x)f(x) + B(x)f'(x) = n$, we define the integer $s$ by $q^s \,\|\, n$. Then for any natural number $t \ge 2s + 1$,

$$\#\{a \bmod q^t \mid f(a) \equiv 0 \bmod q^t\} \le \deg f(x)\cdot q^s$$

holds.

Proof. If $t \ge 2s + 1$ and $f(a) \equiv 0 \bmod q^t$, then we have $q^{s+1} \nmid f'(a)$ and hence $|f(a)|_q < |f'(a)|_q^2$. Hence, by Lemma 5.1, there is an element $\alpha \in \mathbb Z_q$ such that $f(\alpha) = 0$ and $|\alpha - a|_q \le |f(a)/f'(a)|_q$. $q^{t-s} \mid f(a)/f'(a)$ implies $a \equiv \alpha \bmod q^{t-s}$. Since the number of roots $\alpha$ is less than or equal to $\deg f(x)$, we obtain the assertion.

Lemma 5.3. The maximal integer $\delta$ in Proposition 5.1 exists.

Proof. For relatively prime natural numbers $m_1$ and $m_2$, the condition (5.1) holds for $m = m_1m_2$ if and only if it holds for $m = m_1, m_2$. Hence, assuming that $m$ is a power $q^t$ of a prime $q$, we have only to show that it holds for only finitely many such integers $m$. Suppose that $\eta' \in \operatorname{Gal}(\mathbb Q(\zeta_{q^t})/\mathbb Q)$ coincides with $\eta$ on $L \cap \mathbb Q(\zeta_{q^t})$; the number of such $\eta'$'s is equal to $[\mathbb Q(\zeta_{q^t}) : L \cap \mathbb Q(\zeta_{q^t})]$. Since $\eta'$ is extended to an element of $\operatorname{Gal}(L(\zeta_{q^t})/\mathbb Q)$ whose restriction on $L$ is $\eta$, the condition (5.1) yields $\zeta_{q^t}^{f(a)} = 1$, where $a$ is defined by $\zeta_{q^t}^{\eta'} = \zeta_{q^t}^{a}$. Hence $a \bmod q^t$ corresponding to $\eta'$ satisfies $f(a) \equiv 0 \bmod q^t$. Let $n$, $s$ be those in Lemma 5.2. If $t \ge 2s + 1$ holds, then Lemma 5.2 implies $[\mathbb Q(\zeta_{q^t}) : L \cap \mathbb Q(\zeta_{q^t})] \le \deg f(x)\cdot q^s$ and so

$$\varphi(q^t) \le \deg f(x)\cdot q^s\,[L \cap \mathbb Q(\zeta_{q^t}) : \mathbb Q] \le \deg f(x)\cdot n\,[L : \mathbb Q].$$

Hence $q^t$ is bounded.

Proof of Proposition 5.1. Put

$$G = \gcd_{\eta \in \sigma_{L/\mathbb Q}(p),\ p > x} f(p)$$

for a large number $x$ ($> 2D_L\delta$). Take any extension $\tilde\eta$ of $\eta$ in $\operatorname{Gal}(L(\zeta_G)/\mathbb Q)$; then we have $\zeta_G^{f(\tilde\eta)} \equiv \zeta_G^{f(q)} \equiv 1 \bmod \mathfrak q$ if $q$ ($> xG$) is a prime number such that $\sigma_{L(\zeta_G)/\mathbb Q}(q) \ni \tilde\eta$. If $\zeta_G^{f(\tilde\eta)} \ne 1$, then for a prime divisor $\ell$ of the order of $\zeta_G^{f(\tilde\eta)}$, we have $\ell \mid G$ and $\zeta_\ell - 1 \in \mathfrak q$. Hence we get $\ell = q$ and so $q \mid G$. This contradicts $q > G$. Thus we obtain $\zeta_G^{f(\tilde\eta)} = 1$ and so $G \mid \delta$. Conversely, take a prime $p$ so that $\eta \in \sigma_{L/\mathbb Q}(p)$ and $p \nmid 2D_L\delta$. For $\tilde\eta \in \sigma_{L_\delta/\mathbb Q}(p)$, $\zeta_\delta^{f(\tilde\eta)} = 1$ holds, whence $\zeta_\delta^{f(p)} \equiv 1 \bmod \mathfrak p$ follows. Then $\zeta_\delta^{f(p)} = 1$ should occur and $\delta$ divides $f(p)$. Thus we have $\delta \mid G$.

In general, $\delta = \gcd_{\eta \in \sigma_{L/\mathbb Q}(p),\ p \nmid 2D_L} f(p)$ does not hold.
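Lemmas 5.1 and 5.2 can be illustrated numerically with the polynomial $f(x) = x^2 + x + 1$ from §4.3, for which $4f(x) - (2x+1)f'(x) = 3$, so that $n = 3$, $s = 1$ for $q = 3$ and $s = 0$ otherwise. The following sketch (function names ours) performs one Newton step modulo $q^t$ and counts roots by brute force. It is a finite stand-in for the $q$-adic statements, not a proof of them.

```python
def f(x):
    return x * x + x + 1

def df(x):
    return 2 * x + 1

def newton_step(a, q, t):
    """One Newton step a - f(a)/f'(a) modulo q**t; needs f'(a) invertible mod q."""
    m = q ** t
    return (a - f(a) * pow(df(a), -1, m)) % m

def count_roots(q, t):
    """Number of residues a mod q**t with f(a) = 0 mod q**t."""
    m = q ** t
    return sum(1 for a in range(m) if f(a) % m == 0)

lifted = newton_step(2, 7, 2)        # lift the root 2 mod 7 to a root mod 49
print(lifted, f(lifted) % 49)
print(count_roots(7, 1), count_roots(7, 2), count_roots(3, 3))
```

Starting from $a = 2$ with $|f(2)|_7 = 1/7 < 1 = |f'(2)|_7^2$, the step returns a root modulo $49$, and the counts stay within the bound $\deg f \cdot q^s$ of Lemma 5.2. The modular inverse via three-argument `pow` with exponent $-1$ requires Python 3.8 or later.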

Corollary 5.1. Suppose $f(x) = x - 1$ in the proposition. If $\eta$ is the identity (resp. the complex conjugation), then we have $\delta = w$ (resp. $\delta = 2$).

Proof. Suppose $\delta$ satisfies the condition (5.1). By the Galois theory, there are $[\mathbb Q(\zeta_\delta) : L \cap \mathbb Q(\zeta_\delta)]$ extensions $\tilde\eta$ of $\eta$ to $\operatorname{Gal}(L(\zeta_\delta)/\mathbb Q)$, and then the supposition $f(x) = x - 1$ implies $\zeta_\delta^{\tilde\eta-1} = 1$, i.e. $\zeta_\delta^{\tilde\eta} = \zeta_\delta$. Thus the extension to $\operatorname{Gal}(L(\zeta_\delta)/\mathbb Q)$ is uniquely determined as the identity. This means $[\mathbb Q(\zeta_\delta) : L \cap \mathbb Q(\zeta_\delta)] = 1$, that is $\mathbb Q(\zeta_\delta) \subset L$ and hence $\delta \mid w$. Hence, in case that $\eta$ is the identity, $w$ divides $\delta$ clearly and so $\delta = w$, and if $\eta$ is the complex conjugation, then $\zeta_\delta^{-1} = \zeta_\delta^{\tilde\eta} = \zeta_\delta$ yields $\delta = 2$.

5.2. Structure of Galois group extended by roots of units

The aim of this subsection is to study the structure of

$$\operatorname{Gal}\Bigl(L\bigl(\sqrt[p^n]{o_L^\times}\bigr)\Big/L\bigl(\sqrt[p^n]{W_L}\bigr)\Bigr)$$

for an algebraic number field $L$. Let us recall [11]

Theorem 5.1. Let $L$ be a field of characteristic 0 and let $p$ be a prime and $a \in L \setminus L^p$. Then we have

(i) If $p \ne 2$, then $x^{p^n} - a$ is irreducible over $L$ for every natural number $n$,


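(ii) If $p = 2$, then $x^2 - a$ is irreducible over $L$, and $a \notin -4L^4$ if and only if $x^{2^n} - a$ is irreducible over $L$ for any integer $n$ ($\ge 2$).

Proposition 5.2. Let $L$ be an algebraic number field. Then the following hold for any natural number $n$:

(i) For $\epsilon \in o_L^\times$ with $\epsilon \notin W_LL^2$, we have $-4\epsilon \notin L(\zeta_{2^n})^4$.
(ii) For $\epsilon \in o_L^\times$ with $\epsilon \notin W_LL^2$, $x^{2^n} - \epsilon$ is reducible over $L(\zeta_{2^n})$ if and only if $\sqrt\epsilon \in L(\zeta_{2^n})$.
(iii) If, for $\epsilon_i \in o_L^\times$ ($i = 1, 2$), $\sqrt{\epsilon_i} \in L(\zeta_{2^n})$ and $\epsilon_i \notin W_LL^2$ hold for $i = 1$ and $2$, then $\epsilon_1\epsilon_2 \in W_LL^2$.

Proof. Proof of (i): Suppose $-4\epsilon \in L(\zeta_{2^n})^4$; then there is an element $\alpha \in L(\zeta_{2^n})$ such that $\alpha^4 = -4\epsilon$. Putting $M = L(\alpha)$, we see that $L \subset M \subset L(\zeta_{2^n})$. Since $L(\zeta_{2^n})$ is abelian over $L$, $M$ is a Galois extension of $L$. The assumption on $\epsilon$ and (ii) in Theorem 5.1 imply that $x^4 + 4\epsilon$ is irreducible over $L$ and so $[M : L] = 4$ is valid. Since $M$ is a Galois extension of $L$, $M$ contains a conjugate $\sqrt{-1}\,\alpha$ of $\alpha$ and so $\sqrt{-1}$.

In case of $\sqrt{-1} \notin L(\sqrt{-\epsilon})$, we have $L(\sqrt{-\epsilon})(\sqrt{-1}) = M = L(\sqrt{-\epsilon})(\sqrt{\alpha^2})$ ($\alpha^2 = \pm 2\sqrt{-\epsilon} \in L(\sqrt{-\epsilon})$), which yields $\alpha^2/(-1) \in L(\sqrt{-\epsilon})^2$. Then $-\alpha^2 = \mp 2\sqrt{-\epsilon} = (c + d\sqrt{-\epsilon})^2$ holds for $\exists c, \exists d \in L$, and this implies $\mp 2\sqrt{-\epsilon} = c^2 - \epsilon d^2 + 2cd\sqrt{-\epsilon}$, which yields a contradiction $\epsilon = (c/d)^2$. Thus $\sqrt{-1} \in L(\sqrt{-\epsilon})$ holds.

If $\sqrt{-1} \notin L$, then $L(\sqrt{-1}) = L(\sqrt{-\epsilon})$ and so $\sqrt{-1}/\sqrt{-\epsilon} \in L$ follows, which implies the contradiction $\epsilon \in L^2$.

Suppose $\sqrt{-1} \in L$. We note that $\mathbb Q(\zeta_{2^n}) = \mathbb Q(\sqrt{-1})\,\mathbb Q(\zeta_{2^n} + \zeta_{2^n}^{-1})$ and $\mathbb Q(\zeta_{2^n} + \zeta_{2^n}^{-1})/\mathbb Q$ is cyclic, and so a subfield of $\mathbb Q(\zeta_{2^n})$ containing $\sqrt{-1}$ coincides with $\mathbb Q(\zeta_{2^m})$ for some integer $m$. Hence by $\sqrt{-1} \in L \subset M \subset L(\zeta_{2^n})$ there is an integer $m$ such that $L \cap \mathbb Q(\zeta_{2^n}) = \mathbb Q(\zeta_{2^m})$, $M \cap \mathbb Q(\zeta_{2^n}) = \mathbb Q(\zeta_{2^{m+2}})$. Hence $x^4 - \zeta_{2^m}$ is irreducible over $L$ by (ii) in Theorem 5.1. Thus $M = L(\sqrt[4]{-4\epsilon}) = L(\sqrt[4]{\zeta_{2^m}})$ holds. By Kummer's theory, we have $\sqrt[4]{-4\epsilon}\big/\sqrt[4]{\zeta_{2^m}}^{\,a} \in L^\times$ for $\exists a \in \mathbb Z$, which implies a contradiction $\epsilon \in -4\zeta_{2^m}^aL^4 \subset W_L(o_L)^2$. Thus we have completed the proof of (i).

Proof of (ii): We know

$x^{2^n} - \epsilon$ is irreducible over $L(\zeta_{2^n})$
$\iff$ $x^2 - \epsilon$ is irreducible over $L(\zeta_{2^n})$ and $-4\epsilon \notin L(\zeta_{2^n})^4$ if $n \ge 2$
$\iff$ $x^2 - \epsilon$ is irreducible over $L(\zeta_{2^n})$.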


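The first equivalence follows from Theorem 5.1. The second follows from (i). This completes the proof of the case (ii).

Proof of (iii): If $L(\sqrt{\epsilon_1}) = L(\sqrt{\epsilon_2})$, then $\epsilon_1/\epsilon_2 \in L^2$ holds and the assertion (iii) is clear. We assume $L(\sqrt{\epsilon_1}) \neq L(\sqrt{\epsilon_2})$ hereafter. We need the following: For a natural number $n$, the subfields of $\mathbb{Q}(\zeta_{2^n})$ are

$N_a = \mathbb{Q}(\zeta_{2^a})$, $[N_a : \mathbb{Q}] = 2^{a-1}$ ($a = 1, 2, \dots, n$),
$N_{a,+} = \mathbb{Q}(\zeta_{2^a} + \zeta_{2^a}^{-1})$, $[N_{a,+} : \mathbb{Q}] = 2^{a-2}$ ($a = 3, 4, \dots, n$),
$N_{a,-} = \mathbb{Q}(\zeta_{2^a} - \zeta_{2^a}^{-1})$, $[N_{a,-} : \mathbb{Q}] = 2^{a-2}$ ($a = 3, 4, \dots, n$).

To show this, we have only to verify that the subfields of $N_n = \mathbb{Q}(\zeta_{2^n})$ not contained in $N_{n-1}$, other than $N_n$ itself, are $N_{n,+}$ and $N_{n,-}$ for $n \geq 3$. Since $\mathrm{Gal}(N_n/\mathbb{Q}) \cong \mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2^{n-2}\mathbb{Z}$ and the subgroup corresponding to $N_{n-1}$ is of order 2 and generated by $g := (0 \bmod 2) \oplus (2^{n-3} \bmod 2^{n-2}) \in \mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2^{n-2}\mathbb{Z}$, the assertion above follows from the fact that the subgroups of $\mathrm{Gal}(N_n/\mathbb{Q})$ which do not contain $g$ are $(0 \bmod 2) \oplus (0 \bmod 2^{n-2})$, $\langle (1 \bmod 2) \oplus (0 \bmod 2^{n-2}) \rangle$, $\langle (1 \bmod 2) \oplus (2^{n-3} \bmod 2^{n-2}) \rangle$. They correspond to $N_n$, $N_{n,+}$, $N_{n,-}$ respectively, since $(1 \bmod 2) \oplus (0 \bmod 2^{n-2})$ corresponds to the complex conjugation and $(0 \bmod 2) \oplus (1 \bmod 2^{n-2})$ corresponds to $\zeta_{2^n} \to \zeta_{2^n}^5$, and $5^{2^{n-3}} \equiv 1 + 2^{n-1} \bmod 2^n$.

By virtue of $\mathrm{Gal}(L(\zeta_{2^n})/L) \cong \mathrm{Gal}(\mathbb{Q}(\zeta_{2^n})/L \cap \mathbb{Q}(\zeta_{2^n}))$, we have

$[L(\sqrt{\epsilon_i}) \cap \mathbb{Q}(\zeta_{2^n}) : L \cap \mathbb{Q}(\zeta_{2^n})]$
$= [\mathbb{Q}(\zeta_{2^n}) : L \cap \mathbb{Q}(\zeta_{2^n})] / [\mathbb{Q}(\zeta_{2^n}) : L(\sqrt{\epsilon_i}) \cap \mathbb{Q}(\zeta_{2^n})]$
$= [L(\zeta_{2^n}) : L] / [L(\sqrt{\epsilon_i}, \zeta_{2^n}) : L(\sqrt{\epsilon_i})]$
$= [L(\zeta_{2^n}) : L] / [L(\zeta_{2^n}) : L(\sqrt{\epsilon_i})] = [L(\sqrt{\epsilon_i}) : L] = 2.$

Since $L \subset L(\sqrt{\epsilon_i}) \subset L(\zeta_{2^n})$ and $L(\sqrt{\epsilon_1}) \neq L(\sqrt{\epsilon_2})$, the fields $L(\sqrt{\epsilon_1}) \cap \mathbb{Q}(\zeta_{2^n})$ and $L(\sqrt{\epsilon_2}) \cap \mathbb{Q}(\zeta_{2^n})$ are different quadratic extensions of $L \cap \mathbb{Q}(\zeta_{2^n})$. Using the classification above, we get $L \cap \mathbb{Q}(\zeta_{2^n}) = \mathbb{Q}$ or $N_{a,+}$ ($a = 3, 4, \dots, n-1$), on noting that
the quadratic extensions of $N_a$ ($2 \leq a \leq n-1$) in $N_n$ are only $N_{a+1}$,
the quadratic extensions of $N_{a,+}$ ($3 \leq a \leq n-1$) in $N_n$ are $N_a$, $N_{a+1,+}$, $N_{a+1,-}$,
the quadratic extensions of $N_{a,-}$ ($3 \leq a \leq n$) in $N_n$ are only $N_a$.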
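The group-theoretic fact used above can be checked by brute force for small $n$. The following is only an illustrative sketch, not part of the original argument: it identifies $\mathrm{Gal}(N_n/\mathbb{Q})$ with $(\mathbb{Z}/2^n\mathbb{Z})^\times$, takes $g = 1 + 2^{n-1}$, and lists the subgroups that avoid $g$. The function names are hypothetical.

```python
def generated_subgroup(gens, mod):
    """Subgroup of (Z/mod Z)^x generated by gens (closure under multiplication)."""
    group = {1}
    frontier = [1]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = (x * g) % mod
            if y not in group:
                group.add(y)
                frontier.append(y)
    return frozenset(group)


def subgroups_avoiding_g(n):
    """Subgroups of (Z/2^n Z)^x that do not contain g = 1 + 2^(n-1)."""
    mod = 2 ** n
    units = range(1, mod, 2)
    g = 1 + 2 ** (n - 1)
    # The group is Z/2 x Z/2^(n-2), so two generators suffice for every subgroup.
    all_subgroups = {generated_subgroup((a, b), mod) for a in units for b in units}
    return {h for h in all_subgroups if g not in h}


for n in range(3, 7):
    mod = 2 ** n
    # The congruence 5^(2^(n-3)) = 1 + 2^(n-1) mod 2^n quoted in the text.
    assert pow(5, 2 ** (n - 3), mod) == 1 + 2 ** (n - 1)
    print(n, sorted(sorted(h) for h in subgroups_avoiding_g(n)))
```

For each $n$ tested, exactly three subgroups survive: the trivial one, the one generated by $-1$ (complex conjugation) and the one generated by $2^{n-1} - 1$, in line with the three fields $N_n$, $N_{n,+}$, $N_{n,-}$.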


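(iii.1) Suppose $L \cap \mathbb{Q}(\zeta_{2^n}) = \mathbb{Q}$.
We note that $L(\sqrt{\epsilon_1})$, $L(\sqrt{\epsilon_2})$, $L(\sqrt{\epsilon_1\epsilon_2})$ are quadratic extensions of $L$ contained in $L(\zeta_{2^n})$ and they are equal to $L(\sqrt{-1})$, $L(\sqrt{2})$, $L(\sqrt{-2})$. If $L(\sqrt{\epsilon_j}) = L(\sqrt{-1})$ ($j = 1$ or $2$), then $\sqrt{\epsilon_j}/\sqrt{-1} \in L$ and $-\epsilon_j \in L^2$ follows, which is a contradiction. Hence we have $L(\sqrt{\epsilon_1\epsilon_2}) = L(\sqrt{-1})$, whence $-\epsilon_1\epsilon_2 \in L^2$ follows.
For the remaining cases, we may assume $L(\sqrt{2}) = L(\sqrt{\epsilon_1})$, $L(\sqrt{-2}) = L(\sqrt{\epsilon_2})$; then $\epsilon_1 \in 2L^2$, $\epsilon_2 \in -2L^2$ yield $\epsilon_1\epsilon_2 \in -L^2$.

(iii.2) Suppose $L \cap \mathbb{Q}(\zeta_{2^n}) = N_{a,+}$ for $a = 3, 4, \dots, n-1$.
Put $\kappa = \zeta_{2^a} + \zeta_{2^a}^{-1}$; then $N_{a,+} = \mathbb{Q}(\kappa)$, $N_{a+1,+} = \mathbb{Q}(\sqrt{\kappa + 2})$ and $N_{a+1,-} = \mathbb{Q}(\sqrt{\kappa - 2})$ hold, and $\mathbb{Q}(\sqrt{\kappa + 2})$, $\mathbb{Q}(\sqrt{\kappa - 2})$ and $\mathbb{Q}(\zeta_{2^a})$ are quadratic extensions of $N_{a,+}$ in $\mathbb{Q}(\zeta_{2^n})$. Also $L(\sqrt{\epsilon_1})$, $L(\sqrt{\epsilon_2})$, $L(\sqrt{\epsilon_1\epsilon_2})$ are quadratic extensions of $L$ in $L(\zeta_{2^n})$ and should be equal to $L(\sqrt{\kappa + 2})$, $L(\sqrt{\kappa - 2})$, $L(\zeta_{2^a})$. Since $\sqrt{-1}(\zeta_{2^a} - \zeta_{2^a}^{-1})$ is real, it is in $N_{a,+}$ and it follows that
$(\kappa + 2)(\kappa - 2) = \kappa^2 - 4 = (\zeta_{2^a} - \zeta_{2^a}^{-1})^2 = -(\sqrt{-1}(\zeta_{2^a} - \zeta_{2^a}^{-1}))^2 \in -N_{a,+}^2 \subset -L^2.$
Hence we have $\epsilon_1\epsilon_2$, $\epsilon_1(\epsilon_1\epsilon_2)$ or $\epsilon_2(\epsilon_1\epsilon_2) \in -L^2$. The assumption $\epsilon_i \notin W_L L^2$ now implies $-\epsilon_1\epsilon_2 \in L^2$.

Corollary 5.2. Let $L$ be an algebraic number field and suppose $\epsilon \in \mathfrak{o}_L^\times$ satisfies $\epsilon \notin W_L L^2$. If $\sqrt{\epsilon} \in L(\zeta_{2^m})$, then $x^{2^m} - \sqrt{\epsilon}$ is irreducible over $L(\zeta_{2^m})$.

Proof. The assumption $\sqrt{\epsilon} \in L(\zeta_{2^m})$ implies $m \geq 2$. By Theorem 5.1, $x^{2^m} - \sqrt{\epsilon}$ is irreducible over $L(\zeta_{2^m})$ if and only if $\sqrt[4]{\epsilon} \notin L(\zeta_{2^m})$ and $-4\sqrt{\epsilon} \notin L(\zeta_{2^m})^4$. Suppose that $x^{2^m} - \sqrt{\epsilon}$ is reducible over $L(\zeta_{2^m})$; first, assume $m \geq 3$. If $f := \sqrt[4]{\epsilon} \in L(\zeta_{2^m})$, then $-4\epsilon = (\zeta_8 \sqrt{2} f)^4 \in L(\zeta_{2^m})^4$, which contradicts the assertion (i) of Proposition 5.2. If $-4\sqrt{\epsilon} \in L(\zeta_{2^m})^4$, then putting $-4\sqrt{\epsilon} = f_1^4$ ($f_1 \in L(\zeta_{2^m})$), we have $-4\epsilon = (f_1^2/(\zeta_8\sqrt{2}))^4 \in L(\zeta_{2^m})^4$, which is also a contradiction. Therefore $x^{2^m} - \sqrt{\epsilon}$ is irreducible over $L(\zeta_{2^m})$ in case of $m \geq 3$.
Now, applying the above to $m = 3$, $x^8 - \sqrt{\epsilon} = (x^2)^4 - \sqrt{\epsilon}$ is irreducible over $L(\zeta_8)$. Hence $x^4 - \sqrt{\epsilon}$ is irreducible over $L(\zeta_4)$, which completes the proof of the case $m = 2$.

Corollary 5.3. Let $L$ be an algebraic number field and let $\epsilon_1, \epsilon_2 \in \mathfrak{o}_L^\times$ satisfy $\epsilon_1 \notin W_L L^2$. If $\sqrt{\epsilon_1} \in L(\zeta_{2^m})$ for a natural number $m$, then $\sqrt{\epsilon_1}\,\epsilon_2 \notin L(\zeta_{2^m})^2$.

Proof. By applying Corollary 5.2 to $\epsilon = \epsilon_1\epsilon_2^2$, $x^{2^m} - \sqrt{\epsilon_1}\,\epsilon_2$ is irreducible over $L(\zeta_{2^m})$ and so is $x^2 - \sqrt{\epsilon_1}\,\epsilon_2$. Therefore $\sqrt{\epsilon_1}\,\epsilon_2 \notin L(\zeta_{2^m})^2$.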
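The identities for $\kappa = \zeta_{2^a} + \zeta_{2^a}^{-1}$ used in case (iii.2) of the proof of Proposition 5.2 can be sanity-checked numerically. The sketch below is an illustration only, in floating point with a hypothetical helper name: it verifies that $(\zeta_{2^{a+1}} \pm \zeta_{2^{a+1}}^{-1})^2 = \kappa \pm 2$, that $(\kappa+2)(\kappa-2) = -(\sqrt{-1}(\zeta_{2^a} - \zeta_{2^a}^{-1}))^2$, and that $\sqrt{-1}(\zeta_{2^a} - \zeta_{2^a}^{-1})$ is real.

```python
import cmath


def check_kappa_identities(a, tol=1e-12):
    """Numerically check the identities around kappa = z + 1/z, z = exp(2*pi*i/2^a)."""
    z = cmath.exp(2j * cmath.pi / 2 ** a)        # zeta_{2^a}
    w = cmath.exp(2j * cmath.pi / 2 ** (a + 1))  # zeta_{2^(a+1)}, so w*w == z
    kappa = z + 1 / z
    real_generator = 1j * (z - 1 / z)            # sqrt(-1) * (z - 1/z)
    errors = [
        abs((w + 1 / w) ** 2 - (kappa + 2)),     # N_{a+1,+} = Q(sqrt(kappa + 2))
        abs((w - 1 / w) ** 2 - (kappa - 2)),     # N_{a+1,-} = Q(sqrt(kappa - 2))
        abs((kappa + 2) * (kappa - 2) + real_generator ** 2),
        abs(real_generator.imag),                # the generator is a real number
    ]
    return max(errors) < tol


print([check_kappa_identities(a) for a in range(3, 9)])
```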


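The following is the main result of this subsection.

Theorem 5.2. Let $L$ be an algebraic number field. Denote by $r$ the rank of $\mathfrak{o}_L^\times$. Let $p$ be a prime number and $n$ a natural number. Then we have the following:
(i) Suppose either $p \neq 2$ or that $p = 2$ and $\mathfrak{o}_L^\times \cap L(\sqrt[2^n]{W_L})^2 \subset W_L \cdot (\mathfrak{o}_L^\times)^2$. Then we have
\[ \mathrm{Gal}\left( L\left(\sqrt[p^n]{\mathfrak{o}_L^\times}\right) \Big/ L\left(\sqrt[p^n]{W_L}\right) \right) \cong (\mathbb{Z}/p^n\mathbb{Z})^r. \]
(ii) Suppose that $p = 2$ and $\mathfrak{o}_L^\times \cap L(\sqrt[2^n]{W_L})^2 \not\subset W_L \cdot (\mathfrak{o}_L^\times)^2$. Then we have $n \geq 2$ and
\[ \mathrm{Gal}\left( L\left(\sqrt[p^n]{\mathfrak{o}_L^\times}\right) \Big/ L\left(\sqrt[p^n]{W_L}\right) \right) \cong \mathbb{Z}/2^{n-1}\mathbb{Z} \times (\mathbb{Z}/2^n\mathbb{Z})^{r-1}. \]

We need more lemmas.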

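Lemma 5.4. Suppose that $p$ is an odd prime number and $\epsilon \in \mathfrak{o}_L^\times$ is not in $W_L(\mathfrak{o}_L^\times)^p$. Then $\epsilon \notin L(\zeta_{p^n})^p$ for $\forall n \geq 1$.

Proof. Suppose $\epsilon \in L(\zeta_{p^n})^p$; then $L \subset L(\sqrt[p]{\epsilon}) \subset L(\zeta_{p^n})$ clearly and
\[ \mathrm{Gal}\left(L(\zeta_{p^n})/L\right) \cong \mathrm{Gal}\left(\mathbb{Q}(\zeta_{p^n})/L \cap \mathbb{Q}(\zeta_{p^n})\right). \]
Since $L(\zeta_{p^n})/L$ is an abelian extension, $L(\sqrt[p]{\epsilon})/L$ is a Galois extension. Hence in view of $\sqrt[p]{\epsilon} \notin L$ and $p \neq 2$, $x^p - \epsilon$ is irreducible over $L$. Thus we have $[L(\sqrt[p]{\epsilon}) : L] = p$ and so the conjugate $\zeta_p \sqrt[p]{\epsilon}$ is in $L(\sqrt[p]{\epsilon})$. Therefore $\zeta_p \in L(\sqrt[p]{\epsilon})$ and $L \subset L(\zeta_p) \subset L(\sqrt[p]{\epsilon})$ follows. $[L(\sqrt[p]{\epsilon}) : L] = p$ and $[L(\zeta_p) : L] \mid p - 1$ imply $L(\zeta_p) = L$. Therefore we have $L \cap \mathbb{Q}(\zeta_{p^n}) \supset \mathbb{Q}(\zeta_p)$ and hence $L \cap \mathbb{Q}(\zeta_{p^n}) = \mathbb{Q}(\zeta_{p^m})$ and $L(\sqrt[p]{\epsilon}) \cap \mathbb{Q}(\zeta_{p^n}) = \mathbb{Q}(\zeta_{p^{m+1}})$ hold for $1 \leq \exists m < n$ by $p \neq 2$. Since $L(\sqrt[p]{\epsilon})$ ($\subset L(\zeta_{p^n})$) is the composite of $L$ and $L(\sqrt[p]{\epsilon}) \cap \mathbb{Q}(\zeta_{p^n}) = \mathbb{Q}(\zeta_{p^{m+1}})$, we have $L(\sqrt[p]{\epsilon}) = L(\zeta_{p^{m+1}})$ and hence $f := \sqrt[p]{\epsilon}/\sqrt[p]{\zeta_{p^m}^{a}} \in L$ for $\exists a \in \mathbb{Z}$ by virtue of $\zeta_{p^m} \in L$. Therefore $\epsilon = \zeta_{p^m}^{a} f^p \in W_L(\mathfrak{o}_L^\times)^p$ and this contradicts the assumption on $\epsilon$.

Lemma 5.5. Suppose $\epsilon_1 \in \mathfrak{o}_L^\times$ is not in $W_L(\mathfrak{o}_L^\times)^p$ and let $n$ ($\geq 2$) be a natural number. Under the further assumption of $\epsilon_1 \notin L(\sqrt[2^n]{W_L})^2$ in case of $p = 2$, we have
\[ \mathrm{Gal}\left( L\left(\sqrt[p^n]{W_L}, \sqrt[p^n]{\epsilon_1}\right) \Big/ L\left(\sqrt[p^n]{W_L}\right) \right) \cong \mathbb{Z}/p^n\mathbb{Z}. \]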


Proof. We note that $L(\sqrt[p^n]{W_L}) = L(\zeta_{p^m})$ for some integer $m$ ($\geq n$) and by Kummer's theory, it suffices to prove that $x^{p^n} - \epsilon_1$ is irreducible over $L(\sqrt[p^n]{W_L})$.
In the case of $p \neq 2$, $x^{p^n} - \epsilon_1$ is irreducible over $L(\sqrt[p^n]{W_L})$ if and only if $x^p - \epsilon_1$ is so over $L(\sqrt[p^n]{W_L})$, which is true by virtue of Lemma 5.4.
Suppose that $p = 2$ and $x^{2^n} - \epsilon_1$ is reducible over $L(\sqrt[2^n]{W_L})$; then either $\sqrt{\epsilon_1} \in L(\sqrt[2^n]{W_L})$ or $-4\epsilon_1 \in L(\sqrt[2^n]{W_L})^4$ occurs. However, neither of them can occur, by the assumption and by (i) in Proposition 5.2 respectively. Thus $x^{2^n} - \epsilon_1$ is irreducible over $L(\sqrt[2^n]{W_L})$.

Lemma 5.6. Under the assumption in (ii) in Theorem 5.2, $n \geq 2$ holds and there is an element $\epsilon_1 \in \mathfrak{o}_L^\times$ such that $\epsilon_1 \notin W_L(\mathfrak{o}_L^\times)^\ell$ for $\forall \ell \geq 2$, $\epsilon_1 \in L(\sqrt[2^n]{W_L})^2$ and
\[ \mathrm{Gal}\left( L\left(\sqrt[2^n]{W_L}, \sqrt[2^n]{\epsilon_1}\right) \Big/ L\left(\sqrt[2^n]{W_L}\right) \right) \cong \mathbb{Z}/2^{n-1}\mathbb{Z}. \]

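Proof. Suppose $\epsilon \in \mathfrak{o}_L^\times \cap L(\zeta_{2m})^2$ for $m := \#W_L$; then we have $\epsilon = (a + b\zeta_{2m})^2$ for $\exists a, \exists b \in L$, which implies $\epsilon = a^2 + b^2\zeta_m$, $ab = 0$ and hence $\epsilon \in W_L(\mathfrak{o}_L^\times)^2$. Thus we have $\mathfrak{o}_L^\times \cap L(\sqrt{W_L})^2 \subset W_L(\mathfrak{o}_L^\times)^2$, and hence $n \geq 2$.
By the assumption in (ii), there is an element $\epsilon \in \mathfrak{o}_L^\times$ such that $\epsilon = \alpha^2$ ($\alpha \in L(\sqrt[2^n]{W_L})$) and $\epsilon \notin W_L(\mathfrak{o}_L^\times)^2$. Write $\epsilon = \zeta\epsilon_1^k$ where $\zeta \in W_L$ and $\epsilon_1 \in \mathfrak{o}_L^\times$ with $\epsilon_1 \notin W_L(\mathfrak{o}_L^\times)^\ell$ for $\forall \ell \geq 2$. The condition $\epsilon \notin W_L(\mathfrak{o}_L^\times)^2$ implies $2 \nmid k$. Then we have $\alpha = \sqrt{\epsilon} = \sqrt{\zeta}\,\epsilon_1^{(k-1)/2}\sqrt{\epsilon_1} \in L(\sqrt[2^n]{W_L})$. Therefore $\sqrt{\epsilon_1} \in L(\sqrt[2^n]{W_L})$ holds. Defining the integer $a$ by $\#W_L = 2^a b$ ($2 \nmid b$), Corollary 5.2 implies that $x^{2^{n+a}} - \sqrt{\epsilon_1} = (x^{2^{a+1}})^{2^{n-1}} - \sqrt{\epsilon_1}$ is irreducible over $L(\zeta_{2^{n+a}}) = L(\sqrt[2^n]{W_L})$. Therefore $x^{2^{n-1}} - \sqrt{\epsilon_1}$ is also irreducible over $L(\sqrt[2^n]{W_L})$.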

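Lemma 5.7. Let $\epsilon_1, \dots, \epsilon_r$ be a system of fundamental units of $L$. Suppose the assumptions of (i), (ii) in Theorem 5.2 in each case, with one extra condition $\sqrt{\epsilon_1} \in L(\sqrt[2^n]{W_L})$ in case of (ii). Then we have
\[ \sqrt[p^{a+1}]{\epsilon_{s+1}} \notin L\left(\sqrt[p^n]{W_L}, \sqrt[p^n]{\epsilon_1}, \dots, \sqrt[p^n]{\epsilon_s}, \sqrt[p^a]{\epsilon_{s+1}}\right) \]
for $0 \leq \forall a < n$ and $1 \leq \forall s < r$.

Proof. Suppose $\sqrt[p^{a+1}]{\epsilon_{s+1}} \in L(\sqrt[p^n]{W_L}, \sqrt[p^n]{\epsilon_1}, \dots, \sqrt[p^n]{\epsilon_s}, \sqrt[p^a]{\epsilon_{s+1}})$; then we have, for some integers $a_1, a_2, \dots, a_{s+1}$,
\[ f := \sqrt[p^n]{\epsilon_1}^{\,a_1} \cdots \sqrt[p^n]{\epsilon_s}^{\,a_s}\, \sqrt[p^a]{\epsilon_{s+1}}^{\,a_{s+1}} \Big/ \sqrt[p^{a+1}]{\epsilon_{s+1}} \in L(\sqrt[p^n]{W_L}), \]
since $L(\sqrt[p^n]{W_L}, \sqrt[p^n]{\epsilon_1}, \dots, \sqrt[p^a]{\epsilon_{s+1}})$ is a Kummer extension of $L(\sqrt[p^n]{W_L})$.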


In case of (i): Define the integer $b$ by
\[ p^b \,\|\, (a_1, \dots, a_s, (a_{s+1}p - 1)p^{n-a-1}), \qquad 0 \leq b \leq n - a - 1. \]
Then we have
\[ \epsilon := \epsilon_1^{a_1/p^b} \cdots \epsilon_s^{a_s/p^b}\, \epsilon_{s+1}^{(a_{s+1}p - 1)p^{n-a-1-b}} = f^{p^{n-b}} = (f^{p^{n-b-1}})^p \in L(\sqrt[p^n]{W_L})^p. \]
By noting that $\epsilon$ is not in $W_L(\mathfrak{o}_L^\times)^p$, the equation above contradicts Lemma 5.4 if $p \neq 2$. Suppose $p = 2$; the assumption of (i) implies $\epsilon \in W_L(\mathfrak{o}_L^\times)^2$, which contradicts the choice of the integer $b$.

In case of (ii): Define the integer $b$ by
\[ 2^b \,\|\, (2a_1, a_2, \dots, a_s, (2a_{s+1} - 1)2^{n-a-1}), \qquad 0 \leq b \leq n - a - 1. \]

Then we have 2a1 /2b a2 /2b ǫ2

ǫ := ǫ1

b

(2a

· · · ǫsas /2 ǫs+1s+1

−1)2n−a−1−b

p n ∈ L( 2 WL )2 .

Suppose $2^b \,\|\, 2a_1$; then put

(2a /2b −1)/2 a /2b

√ 2a /2b n−b−1 2 ) = ( ǫ1 1 f 2

b

(2a

−1)2n−a−1−b

. η1 = ǫ1 and η2 = ǫ1 1 ǫ2 2 · · · ǫsas /2 ǫs+1s+1 √ √ √ √ n n We have η1 , η2 ∈ o× η1 ∈ L( 2 WL ) and η1 η2 = ǫ ∈ L( 2 WL )2 . This L, √ n contradicts Corollary 5.3, since L( 2 WL ) = L(ζ2m ) for some m. √ 2n b 2 Suppose 2 | a1 ; then ǫ ∈ L( WL ) as above and one of A2 := a2 /2b , · · · , b As := as /2b , As+1 := (2as+1 − 1)2n−a−1−b is odd, and ǫ1 2a1 /2 = √ b n (ǫ1 a1 /2 )2 ∈ L( 2 WL )2 . Hence we have ´2 ³p As+1 −2a /2b 2n 2 . ǫǫ1 1 = ǫA · · · ǫ W ∈ L L 2 s+1 A

s+1 2 Applying (iii) of Proposition 5.2 to ǫ1 and ǫA 2 · · · ǫs+1 , we have the incluAs+1 A2 sion ǫ1 ǫ2 · · · ǫs+1 ∈ WL L2 , which is a contradiction.

Proof of Theorem 5.2. Let $\epsilon_1, \dots, \epsilon_r$ be a system of fundamental units of $L$, and we may suppose that $\epsilon_1$ is a unit given in Lemma 5.6 in the case (ii). Then Lemmas 5.5, 5.6, 5.7 imply
\[ \left[ L\left(\sqrt[p^n]{\mathfrak{o}_L^\times}\right) : L\left(\sqrt[p^n]{W_L}\right) \right] = \begin{cases} p^{rn} & \text{in the case (i)}, \\ 2^{n-1+(r-1)n} & \text{in the case (ii)}. \end{cases} \]
Since $L(\sqrt[p^n]{\mathfrak{o}_L^\times})/L(\sqrt[p^n]{W_L})$ is a Kummer extension, this completes the proof.


Acknowledgments

I thank Professor S. Kanemitsu for many helpful suggestions, and this work was partially supported by Grant-in-Aid for Scientific Research (C), The Ministry of Education, Culture, Sports, Science and Technology of Japan.

References

1. J.W.S. Cassels and A. Fröhlich, Algebraic Number Theory, Academic Press, 1967.
2. Y-M. J. Chen, Y. Kitaoka and J. Yu, Distribution of units of real quadratic number fields, Nagoya Math. J., 158 (2000), 167–184.
3. C.W. Curtis and I. Reiner, Representation theory of finite groups and associative algebras, Interscience, 1962.
4. T. Honda, Pure cubic fields whose class numbers are multiple of three, J. Number Theory, 3 (1971), 7–12.
5. M. Ishikawa and Y. Kitaoka, On the distribution of units modulo prime ideals in real quadratic fields, J. reine angew. Math., 494 (1998), 65–72.
6. Y. Kitaoka, Distribution of units of a cubic field with negative discriminant, J. Number Theory, 91 (2001), 318–355.
7. Y. Kitaoka, Distribution of units of an algebraic number field, in Galois Theory and Modular Forms, (2003), 287–303. Developments in Mathematics, Kluwer Academic Publishers.
8. Y. Kitaoka, Distribution of units of an algebraic number fields with only one fundamental unit, Proc. Japan Acad., 80A (2004), 86–89.
9. Y. Kitaoka, Distribution of units of a cubic abelian field modulo prime numbers, J. Math. Soc. Japan, (2)58 (2006), 563–584.
10. T. Kubota, Über den bizyklischen biquadratischen Zahlkörper, Nagoya Math. J., 10 (1955), 65–85.
11. S. Lang, Algebra, Springer-Verlag, 2002.
12. H.W. Lenstra, Jr., On Artin's conjecture and Euclid's algorithm in global fields, Inventiones math., 42 (1977), 201–224.
13. K. Masima, On the distribution of units in the residue class field of real quadratic fields and Artin's conjecture (in Japanese), RIMS Kokyuroku, 1026 (1998), 156–166.
14. H. Roskam, A quadratic analogue of Artin's conjecture on primitive roots, J. Number Theory, 81 (2000), 93–109.


SIGN CHANGES OF FOURIER COEFFICIENTS AND EIGENVALUES OF CUSP FORMS

WINFRIED KOHNEN
Universität Heidelberg, Mathematisches Institut, INF 288, D-69120 Heidelberg, Germany
E-mail: [email protected]

We give a survey about recent results on sign changes of Fourier coefficients and eigenvalues of cusp forms, both in the elliptic case and in the case of Siegel modular forms.

1. Introduction

Fourier coefficients of elliptic cusp forms are mysterious objects and in general no simple arithmetical formulas are known for them. If one checks tables, one finds, for example, that sign changes of those coefficients occur quite often, and it seems a natural task to try to understand them. For example, one may ask if there are infinitely many sign changes or when the first sign change occurs, or one may study sign changes in short intervals. This might be particularly interesting when the cusp form is a normalized Hecke eigenform, so that the Fourier coefficients are equal to the Hecke eigenvalues. In this article we would like to give a survey on recent results obtained in this direction. In the last section we will also address the case of Siegel modular forms of genus two, where the situation gets more involved.

2. The starting point

The result in the following Theorem seems to be well known. However, we are not able to give a precise reference where it first appeared. As a substitute, we refer to the joint paper with M. Knopp and W. Pribitkin [10] for an extension to quite general subgroups of $SL_2(\mathbb{R})$ and a discussion of related topics.


As usual, we define
\[ \Gamma_0(N) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_1 \;\middle|\; c \equiv 0 \pmod{N} \right\}, \]
where of course $\Gamma_1 := SL_2(\mathbb{Z})$ denotes the full modular group.

Theorem 2.1. Let $f$ be a non-zero cusp form of even integral weight $k$ on $\Gamma_0(N)$ and suppose that its Fourier coefficients $a(n)$ are real for all $n \geq 1$. Then the sequence $(a(n))_{n \in \mathbb{N}}$ has infinitely many sign changes, i.e. there are infinitely many $n$ such that $a(n) > 0$ and there are infinitely many $n$ such that $a(n) < 0$.

Proof. It is sufficient to assume that $a(n) \geq 0$ for all but finitely many $n$ and to derive a contradiction. Let
\[ L_f(s) = \sum_{n \geq 1} a(n) n^{-s}, \qquad \Re(s) \gg 1 \]
be the Hecke L-series of $f$. Our assumption and a very classical result of Landau (published in 1909) then imply that either $L_f(s)$ converges everywhere or $L_f(s)$ has a singularity at the real point of its line of convergence. However, according to a classical result of Hecke, $L_f(s)$ extends to an entire function, and the former case occurs. Hence we conclude in particular that $a(n) \ll n^c$ and hence also $a(n)^2 \ll n^c$ for all real $c$. As a consequence, the Rankin-Selberg zeta function
\[ R_{f,f}(s) = \sum_{n \geq 1} a(n)^2 n^{-s}, \qquad \Re(s) \gg 1 \]
must converge everywhere, hence is entire. This contradicts a classical result of Rankin-Selberg according to which the latter has a pole at $s = k$ of residue proportional (up to a non-zero constant depending only on $k$ and $N$) to the square of the Petersson norm of $f$, which is non-zero since $f$ is non-zero. This completes the proof.
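To see Theorem 2.1 at work on a concrete example, one can tabulate the coefficients of the discriminant function $\Delta = q \prod_{n \geq 1} (1 - q^n)^{24}$, the cusp form of weight 12 on $\Gamma_1$ whose coefficients are the Ramanujan numbers $\tau(n)$. This example is an illustration chosen here, not one taken from the text, and the function names are hypothetical.

```python
def delta_coefficients(limit):
    """Return [tau(1), ..., tau(limit)] from Delta = q * prod_{n>=1} (1 - q^n)^24."""
    # poly[i] holds the coefficient of q^i in prod (1 - q^n)^24, truncated at q^(limit-1).
    poly = [1] + [0] * (limit - 1)
    for n in range(1, limit):
        for _ in range(24):
            # Multiply in place by (1 - q^n), highest degree first.
            for i in range(limit - 1, n - 1, -1):
                poly[i] -= poly[i - n]
    return poly  # the factor q shifts indices: tau(m) = poly[m - 1]


def sign_change_positions(values):
    """Indices n (1-based) where a(n) has the opposite sign to the previous non-zero a(m)."""
    positions = []
    previous_sign = 0
    for n, value in enumerate(values, start=1):
        if value == 0:
            continue
        sign = 1 if value > 0 else -1
        if previous_sign and sign != previous_sign:
            positions.append(n)
        previous_sign = sign
    return positions


taus = delta_coefficients(60)
print(taus[:8])
print(sign_change_positions(taus))
```

A finite table can of course only suggest, not prove, the infinitude of sign changes asserted by the theorem.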

According to the above Theorem, a reasonable question to ask is if it is possible to obtain a bound on the first sign change, say in terms of $k$ and $N$. Of course, in general this seems to be a difficult question. If $f \neq 0$, then recall that by the valence formula for modular forms the orders of zeros of $f$ on the compactified Riemann surface $X_0(N) = \Gamma_0(N) \backslash \mathcal{H} \cup \mathbb{P}^1(\mathbb{Q})$ (where $\mathcal{H}$ is the complex upper half-plane) sum up to $\frac{k}{12}[\Gamma_1 : \Gamma_0(N)]$. Hence there must exist a number $n$ in the range
\[ 1 \leq n \leq \frac{k}{12}[\Gamma_1 : \Gamma_0(N)] \]


such that $a(n) \neq 0$. Therefore being optimistic, one might hope for a sign change in the range
\[ 1 \leq n \leq \frac{k}{12}[\Gamma_1 : \Gamma_0(N)] + 1. \]
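To get a feeling for the size of this range, the following sketch evaluates it for a few weights and levels. It relies on the standard index formula $[\Gamma_1 : \Gamma_0(N)] = N \prod_{p \mid N} (1 + 1/p)$, which is not stated in the text and is used here as an assumption; the function names are hypothetical.

```python
from fractions import Fraction


def prime_divisors(n):
    """Distinct prime divisors of n by trial division."""
    primes = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def gamma0_index(N):
    """[Gamma_1 : Gamma_0(N)] = N * prod_{p | N} (1 + 1/p), computed in integers."""
    index = N
    for p in prime_divisors(N):
        index = index // p * (p + 1)
    return index


def optimistic_range(k, N):
    """Upper end (k/12) * [Gamma_1 : Gamma_0(N)] + 1 of the hoped-for range."""
    return Fraction(k * gamma0_index(N), 12) + 1


for k, N in [(12, 1), (2, 11), (4, 6), (12, 12)]:
    print(k, N, gamma0_index(N), optimistic_range(k, N))
```

For $k = 12$, $N = 1$ the range is $1 \leq n \leq 2$, which is consistent with the discriminant function, whose first two coefficients are $1$ and $-24$.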

In a very special case, this indeed follows from work of Siegel [18]. To formulate the result, suppose that $k \geq 4$ is even and denote by $d_k$ the dimension of the space $M_k(\Gamma_1)$ of modular forms of weight $k$ on $\Gamma_1$. Recall that $d_k$ is given by the formula
\[ d_k = \begin{cases} \left[\frac{k}{12}\right] & \text{if } k \equiv 2 \pmod{12}, \\ \left[\frac{k}{12}\right] + 1 & \text{otherwise}. \end{cases} \]
Then Siegel showed that there are explicitly computable rational numbers $c_n$ ($n = 0, 1, \dots, d_k$), depending on $k$, such that
\[ \sum_{n=0}^{d_k} c_n a_f(n) = 0, \qquad \forall f \in M_k(\Gamma_1). \]
Of course, this more or less seems to follow from elementary linear algebra. However, Siegel's explicit formulas imply that if $k \equiv 2 \pmod 4$, then all the $c_n$ are strictly positive. Since a cusp form of weight $k$ on $\Gamma_1$ is determined by its $n$-th Fourier coefficients where $n$ runs from 1 to $d_k - 1$, we conclude immediately that if $k \equiv 2 \pmod 4$, then there must be a sign change of the $a(n)$ in the range $1 \leq n \leq d_k$. Thus using the formula for $d_k$ given above, we see that the above optimistic expectation in this special case was justified.

Unfortunately, if $k \equiv 0 \pmod 4$ or if $N > 1$, then Siegel's arguments do not work any longer, and so one has to look for other devices.

3. Elliptic modular forms

In the following, we first look at a normalized Hecke eigenform $f$ that is a newform of level $N$. Recall that "normalized" means that $a(1) = 1$ and "newform" essentially means that the exact level of $f$ is $N$. In this case, the Fourier coefficients are equal to the Hecke eigenvalues. It seems that sign changes of the $a(p)$ (where $p$ runs through primes only), in case $f$ is of level 1, were first studied by Ram Murty [17]. The following result was obtained in joint work with J. Sengupta.


Theorem 3.1 ([12]). Suppose that $f$ is a normalized Hecke eigenform of even integral weight $k$ and squarefree level $N$ that is a newform. Then one has $a(n) < 0$ for some $n$ with
\[ n \ll kN \exp\left( c \sqrt{\log N / \log\log 3N} \right) (\log k)^{27}, \qquad (n, N) = 1. \]
Here $c > 2$ and the constant implied in $\ll$ is absolute.

Note that it is reasonable to assume that $(n, N) = 1$, since the eigenvalues $a(p)$ with $p \mid N$ are explicitly known by Atkin-Lehner theory. The proof of the above result uses techniques from analytic number theory (e.g. Perron's formula and a strong convexity principle) and properties of the symmetric square L-function of $f$, notably the fact that the value of the latter at $s = 1$ is universally bounded from below by $\gg \frac{1}{\log(kN)}$, an important result by D. Goldfeld, J. Hoffstein and D. Lieman [8].

Recently, the above result was improved in a joint paper with H. Iwaniec and J. Sengupta, as follows.

Theorem 3.2 ([9]). Suppose that $f$ is a normalized Hecke eigenform of even integral weight $k$ and level $N$ (not necessarily squarefree) that is a newform. Then one has $a(n) < 0$ for some $n$ with
\[ n \ll k\sqrt{N} \cdot \log^{8+\epsilon}(kN), \qquad (n, N) = 1, \quad \epsilon > 0. \]

Indeed, this immediately follows from Theorem 1 in [9]. The proof is "elementary" in the sense that it completely avoids the use of the symmetric square L-function. Instead, the Hecke relations for the eigenvalues are exploited. Let us be a bit more precise. One proves the following two Propositions.

Proposition 3.1. One has
\[ \sum_{n \leq x,\ (n,N)=1} \lambda(n) \log^2\left(\frac{x}{n}\right) \ll_\epsilon (k^2 N)^{1/4} \log^{2+\epsilon}(kN) \sqrt{x}, \qquad x \geq 1;\ \epsilon > 0. \]

The proof follows in a standard way from the convexity principle in combination with Perron's formula.

Proposition 3.2. Suppose that $\lambda(n) \geq 0$ for $1 \leq n \leq x$, $(n, N) = 1$. Then
\[ \sum_{n \leq x,\ (n,N)=1} \lambda(n) \log^2\left(\frac{x}{n}\right) \gg \frac{x}{\log^2 x}, \qquad x \gg \sqrt{N}. \]


We indicate the proof in the case $N = 1$ (the general case, of course, is similar). We clearly have
\[ \sum_{n \leq x} \lambda(n) \log^2\left(\frac{x}{n}\right) \gg \sum_{n \leq x/2} \lambda(n). \]
We now restrict the summation to $n = p\ell$, where $p$ and $\ell$ are primes $\leq \sqrt{x/2}$. We then find
\[ \sum_{n \leq x} \lambda(n) \log^2\left(\frac{x}{n}\right) \gg \Big( \sum_{p \leq \sqrt{x/2}} \lambda(p) \Big)^2 - \sum_{p \leq \sqrt{x/2}} 1 \]
(since $\lambda(p\ell) = \lambda(p)\lambda(\ell)$ if $p$ and $\ell$ are different and $\lambda(p)^2 = \lambda(p^2) + 1$)
\[ \gg \Big( \sum_{p \leq \sqrt{x/2}} 1 \Big)^2 - \sum_{p \leq \sqrt{x/2}} 1 \]
(since $\lambda(p) \geq 0$ in the given range)
\[ \gg \frac{x}{\log^2 x} \]
(by the Prime Number Theorem).

It is easy to see that Propositions 3.1 and 3.2 imply Theorem 3.2. Using the same ideas, but working a bit harder, one can obtain in a similar way

Theorem 3.3 ([9]). Suppose that $f$ is a normalized Hecke eigenform of level $N$ (not necessarily squarefree) and even integral weight $k$. Then $a(n) < 0$ for some $n$ with
\[ n \ll (k^2 N)^{29/60}, \qquad (n, N) = 1. \]

Note that the bound in Theorem 3.3 in weight aspect is better than the one obtained by convexity, although no sub-convexity bounds for L-functions have been used. We remark that using the recent sub-convexity bound in the case of the full modular group
\[ \left| L_f\left(\tfrac{1}{2} + it\right) \right| \ll_\epsilon (|t| + k)^{1/3+\epsilon}, \qquad \epsilon > 0, \]
due to Jutila and Motohashi (2006, to appear), whose proof is much more difficult and involved, one can improve the bound in Theorem 3.3 in weight aspect to $k^{2/3+\epsilon}$ if $N = 1$.


The method used in [12] can be extended in various directions. First, one can study sign changes in short intervals. More precisely, denote by $S_f^+(x)$ and $S_f^-(x)$ the number of positive integers $n \leq x$ with $(n, N) = 1$ for which $a(n) > 0$ and $a(n) < 0$, respectively. The following result was proved in joint work with I. Shparlinski.

Theorem 3.4 ([13]). Suppose that $f$ is a normalized Hecke eigenform of even integral weight $k$ and squarefree level $N$ that is a newform. Then there are absolute constants $\eta < 1$ and $A > 0$ such that for $y = x^\eta$ one has
\[ S_f^{\pm}(x + y) - S_f^{\pm}(x) > 0 \]
whenever $x \geq (kN)^A$.

In another way, one can generalize Theorem 3.1 and its method of proof to arbitrary non-zero cusp forms with real Fourier coefficients. The following result was obtained in joint work with Y.J. Choie.

Theorem 3.5 ([4]). Let $f$ be a non-zero cusp form of even integral weight $k$ and squarefree level $N$ with real Fourier coefficients $a(n)$. Then there exist $n_1, n_2 \in \mathbb{N}$ with
\[ n_1, n_2 \ll k^3 N^4 \log^{10}(kN) \cdot \exp\left( c\, \frac{\log(N+1)}{\log\log(N+2)} \right) \cdot \max\{ \psi_k(N),\ k^2 N^{1/2} \log^{16}(kN) \} \]
such that $a(n_1) > 0$, $a(n_2) < 0$. Here $c > 0$ is an absolute constant and
\[ \psi_k(N) := \prod_{p \mid N} \frac{\log(kN)}{\log p}. \]

The proof of Theorem 3.5, being a bit more technically involved, proceeds as follows. One writes $f$ as a linear combination of a special orthogonal basis $\{F_\nu\}$ of Hecke eigenforms of weight $k$ and level $N$ and carries over to the Rankin-Selberg zeta functions $R_{F_\nu, F_\mu}(s)$ estimates partially already proved in [12] in the context of $L(\mathrm{sym}^2 F_\nu, s)$. To obtain final corresponding statements for $f$ itself one applies Chebyshev's inequality in conjunction with uniform lower bounds for the Petersson scalar products $\langle F_\nu, F_\nu \rangle$. The bounds obtained in this way are somewhat weaker than those in [12], partially due to the fact that one averages over a basis of Hecke eigenforms and in this way some extra factors depending on $k$ and $N$ are introduced. Somewhat better bounds (using similar methods) can be obtained if one restricts to forms $f$, e.g. in the subspace of newforms.


We also note that in Theorem 3.1 the additional assumption $(n, N) = 1$ was made, since for Hecke eigenforms the eigenvalues $a(p)$ ($p$ a prime, $p \mid N$) are explicitly known as already stated above. For arbitrary cusp forms, however, it seems unnatural to enforce this condition.

4. Siegel modular forms of genus two

Let $\mathcal{H}_g := \{ Z \in \mathbb{C}^{g,g} \mid Z = Z',\ \Im(Z) > 0 \}$ be the Siegel upper half-space of genus $g$ and recall that the real symplectic group $Sp_g(\mathbb{R}) \subset GL_{2g}(\mathbb{R})$ operates on $\mathcal{H}_g$ by
\[ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \circ Z = (AZ + B)(CZ + D)^{-1}. \]
Let $\Gamma_g := Sp_g(\mathbb{Z})$ be the group of integral symplectic matrices of size $2g$, also called the Siegel modular group of genus $g$. Let $F$ be a Siegel cusp form of integral weight $k$ and genus $g$, i.e. $F$ is a complex-valued holomorphic function on $\mathcal{H}_g$ satisfying the transformation law
\[ F(M \circ Z) = \det(CZ + D)^k F(Z), \qquad \forall M = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \Gamma_g \]
and having a Fourier expansion of the form
\[ F(Z) = \sum_{T > 0} a(T) e^{2\pi i\, \mathrm{tr}(TZ)}, \qquad Z \in \mathcal{H}_g, \]
where $T$ runs over all positive definite, symmetric half-integral matrices of size $g$. For basic facts on Siegel modular forms we refer e.g. to [7]. Note that
\[ a(T[U]) = (\det U)^k a(T), \qquad \forall U \in GL_g(\mathbb{Z}) \]
(where $GL_g(\mathbb{Z})$ operates on $T > 0$ by $T[U] = U'TU$). This easily follows from the transformation formula for $F$ applied with
\[ M = \begin{pmatrix} U' & 0 \\ 0 & U^{-1} \end{pmatrix}. \]
Here as usual, for a matrix $U$ we denote by $U'$ its transpose.

Using the analytic properties of the Koecher-Maass Dirichlet series attached to $F$ (cf. e.g. [11] and the literature given there) and of the Rankin-Selberg Dirichlet zeta function attached to $F$ (for $g = 1$ cf. p. 2, l. 6; cf.


e.g. [2] in the general case), it should not be difficult to generalize Theorem 2.1 to the situation here, i.e. if $F$ has real Fourier coefficients and is not identically zero, then there should exist infinitely many $T > 0$ (modulo $GL_g(\mathbb{Z})$) such that $a(T) > 0$ and there should be infinitely many $T > 0$ (modulo $GL_g(\mathbb{Z})$) such that $a(T) < 0$. However, we have not checked this in detail.

Now suppose that $F$ is an eigenfunction of all Hecke operators. Note that eigenvalues and Fourier coefficients for $g > 1$ are no longer "proportional" in any reasonable sense, and properties of the former in general cannot easily be deduced from the latter, nor conversely. Although the Fourier coefficients remain rather mysterious and not much is known about them, the situation is a bit better for the eigenvalues, since the latter can be studied with the help of representation theory and algebraic geometry.

The situation is particularly good if $g = 2$, the easiest case after the elliptic case, and for the rest of this section we will stick to this case. Thus in the following $F$ will denote a cuspidal Hecke eigenform of weight $k$ and genus 2. We will denote the linear space of all cusp forms of weight $k$ on $\Gamma_2$ by $S_k(\Gamma_2)$. Recall that the spinor zeta function attached to $F$ is given by
\[ Z_F(s) = \prod_p Z_{F,p}(p^{-s})^{-1}, \qquad \Re(s) \gg 1, \]
where
\[ Z_{F,p}(X) := (1 - \alpha_{0,p}X)(1 - \alpha_{0,p}\alpha_{1,p}X)(1 - \alpha_{0,p}\alpha_{2,p}X)(1 - \alpha_{0,p}\alpha_{1,p}\alpha_{2,p}X) \]
and where $\alpha_{0,p}, \alpha_{1,p}, \alpha_{2,p}$ are "the" Satake $p$-parameters attached to $F$. One has
\[ \sum_{n \geq 1} \lambda(n) n^{-s} = \zeta(2s - 2k + 4)^{-1} Z_F(s), \qquad \Re(s) \gg 1. \]
Here $\lambda(n)$ denotes the eigenvalue of $F$ under the usual Hecke operator $T(n)$ in genus 2. The numbers $\lambda(n)$ are always real. The completed function
\[ Z_F^*(s) = (2\pi)^{-2s} \Gamma(s)\Gamma(s - k + 2) Z_F(s) \]
has meromorphic continuation to $\mathbb{C}$ and is $(-1)^k$-invariant under $s \mapsto 2k - 2 - s$ [1]. Moreover, $Z_F^*(s)$ is entire if either $k$ is odd or if $k$ is even and $F$ is contained in the orthogonal complement of the Maass subspace $S_k^M(\Gamma_2)$ [6,16].

The Maass subspace is invariant under all Hecke operators and is Hecke isomorphic to the space of elliptic cusp forms of weight $2k - 2$ on $\Gamma_1$. More


precisely, for a Hecke eigenform $F$ in $S_k^M(\Gamma_2)$ there is a uniquely determined normalized cuspidal Hecke eigenform $f$ of weight $2k - 2$ on $\Gamma_1$ such that the relation
\[ Z_F(s) = \zeta(s - k + 1)\zeta(s - k + 2) L_f(s) \]
holds [5].

As a first surprise, it is not generally true that the eigenvalues of a Hecke eigenform in $S_k(\Gamma_2)$ change signs infinitely often. More precisely, S. Breulmann proved the following

Theorem 4.1 ([3]). Suppose that $k$ is even and let $F$ be a Hecke eigenform in $S_k^M(\Gamma_2)$, with eigenvalues $\lambda(n)$ ($n \in \mathbb{N}$). Then $\lambda(n) > 0$ for all $n$.

The proof exploits the relation between the $\lambda(n)$ and the eigenvalues of the form $f$ corresponding to $F$, as stated above in terms of a relation of zeta functions, together with the fact that the latter satisfy Deligne's theorem, previously known as the Ramanujan-Petersson conjecture (actually, a simpler estimate is sufficient).

In contrast to the above, one has the following

Theorem 4.2 ([15]). Let $F$ be a Hecke eigenform in $S_k(\Gamma_2)$ with Hecke eigenvalues $\lambda_n$ ($n \in \mathbb{N}$). Suppose that $F$ lies in the orthogonal complement of $S_k^M(\Gamma_2)$ if $k$ is even. Then the sequence $(\lambda_n)_{n \in \mathbb{N}}$ has infinitely many sign changes.

The proof is based on Landau's theorem coupled with the analytic properties of the spinor zeta function of $F$ and a theorem of Weissauer [19] according to which the generalized Ramanujan-Petersson conjecture (saying that $|\alpha_{1,p}| = |\alpha_{2,p}| = 1$ for all $p$) is true for forms as in Theorem 4.2.

Of course, after Theorem 4.2 the question arises when the first sign change occurs. In this respect, very recent joint work with J. Sengupta says the following.

Theorem 4.3 ([14]). Let $F$ be a Siegel-Hecke eigenform in $S_k(\Gamma_2)$ and suppose either that $k$ is odd or that $k$ is even and $F$ is in the orthogonal complement of $S_k^M(\Gamma_2)$. Denote by $\lambda(n)$ ($n \in \mathbb{N}$) the eigenvalues of $F$. Then there exists $n \in \mathbb{N}$ with
\[ n \ll k^2 \log^{20} k \]
such that $\lambda(n) < 0$. Here the constant implied in $\ll$ is absolute.


The proof follows a similar pattern as that of Theorem 3.2, with the Hecke L-function Lf (s) replaced by the spinor zeta function. However, since the Hecke relations for λ(n) are more involved in genus 2 than in the elliptic case, exploiting them naturally turns out to be more difficult. One also makes use of the result of [19].
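Theorem 4.1 can be illustrated at prime arguments. Comparing the coefficients of $p^{-s}$ in $\sum \lambda(n) n^{-s} = \zeta(2s - 2k + 4)^{-1}\zeta(s - k + 1)\zeta(s - k + 2)L_f(s)$ gives $\lambda(p) = p^{k-1} + p^{k-2} + a_f(p)$ for $F$ in the Maass subspace. The sketch below evaluates this for $k = 10$ and $k = 12$, under the assumption (not stated in the text) that the normalized eigenform of weight $2k - 2$ is $\Delta \cdot E_{2k-14}$, with the classical Eisenstein series $E_6 = 1 - 504\sum \sigma_5(n)q^n$ and $E_{10} = 1 - 264\sum \sigma_9(n)q^n$; this works because the spaces of cusp forms of weight 18 and 22 on $\Gamma_1$ are one-dimensional. All function names are hypothetical.

```python
def delta_series(limit):
    """Coefficients of q^0..q^limit of Delta = q * prod_{n>=1} (1 - q^n)^24 (limit >= 1)."""
    poly = [1] + [0] * (limit - 1)
    for n in range(1, limit):
        for _ in range(24):
            for i in range(limit - 1, n - 1, -1):
                poly[i] -= poly[i - n]
    return [0] + poly  # multiply by q


# E_w = 1 + c_w * sum_{n>=1} sigma_{w-1}(n) q^n for the two weights needed here.
EISENSTEIN_CONSTANT = {6: -504, 10: -264}


def eisenstein_series(weight, limit):
    c = EISENSTEIN_CONSTANT[weight]
    series = [1]
    for n in range(1, limit + 1):
        sigma = sum(d ** (weight - 1) for d in range(1, n + 1) if n % d == 0)
        series.append(c * sigma)
    return series


def cusp_form_coefficients(k, limit):
    """a_f(0..limit) for f = Delta * E_{2k-14}, of weight 2k - 2 (k = 10 or 12 only)."""
    delta = delta_series(limit)
    eis = eisenstein_series(2 * k - 14, limit)
    return [sum(delta[i] * eis[m - i] for i in range(m + 1)) for m in range(limit + 1)]


def maass_eigenvalue_at_prime(k, p, coeffs):
    """lambda(p) = p^(k-1) + p^(k-2) + a_f(p) for F in the Maass subspace."""
    return p ** (k - 1) + p ** (k - 2) + coeffs[p]


PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
for k in (10, 12):
    coeffs = cusp_form_coefficients(k, 29)
    print(k, [maass_eigenvalue_at_prime(k, p, coeffs) for p in PRIMES[:4]])
```

Positivity at primes is what Deligne's bound $|a_f(p)| \leq 2p^{k-3/2}$ predicts, since $p^{k-1} + p^{k-2} - 2p^{k-3/2} = p^{k-2}(\sqrt{p} - 1)^2 > 0$.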

References

1. A.N. Andrianov, Euler products corresponding to Siegel modular forms of genus 2, Russ. Math. Surv., 29 (1974), 45–116.
2. S. Böcherer and S. Raghavan, On Fourier coefficients of Siegel modular forms, J. Reine Angew. Math., 384 (1988), 80–101.
3. S. Breulmann, On Hecke eigenforms in the Maass space, Math. Z., 232(3) (1999), 527–530.
4. Y.J. Choie and W. Kohnen, The first sign change of Fourier coefficients of cusp forms. Preprint 2006.
5. M. Eichler and D. Zagier, The theory of Jacobi forms, Progress in Math., 55 (1985), Birkhäuser: Boston.
6. S.A. Evdokimov, A characterization of the Maass space of Siegel cusp forms of genus 2 (in Russian), Mat. Sbornik, (154)112 (1980), 133–142.
7. E. Freitag, Siegelsche Modulformen, Grundl. d. Math. Wiss., 254 (1983). Springer: Berlin Heidelberg New York.
8. J. Hoffstein and P. Lockhart, Coefficients of Maass forms and the Siegel zero (with an appendix by D. Goldfeld, J. Hoffstein and D. Lieman), Ann. of Math., 140 (1994), 161–180.
9. H. Iwaniec, W. Kohnen and J. Sengupta, The first negative Hecke eigenvalue. To appear in Intern. J. Number Theory.
10. M. Knopp, W. Kohnen and W. Pribitkin, On the signs of Fourier coefficients of cusp forms, The Ramanujan Journal, 7 (2003), 269–277.
11. W. Kohnen and J. Sengupta, On Koecher-Maass series of Siegel modular forms, Math. Z., 242 (2002), 149–157.
12. W. Kohnen and J. Sengupta, On the first sign change of Hecke eigenvalues of newforms, Math. Z., 254 (2006), 173–184.
13. W. Kohnen and I. Shparlinski, On the number of sign changes of Hecke eigenvalues of newforms. To appear in J. Austral. Math. Soc.
14. W. Kohnen and J. Sengupta, The first negative Hecke eigenvalue of a Siegel cusp form of genus two. To appear in Acta Arithm.
15. W. Kohnen, Sign changes of Hecke eigenvalues of Siegel cusp forms of genus two. To appear in Proc. AMS.
16. T. Oda, On the poles of Andrianov L-functions, Math. Ann., 256 (1981), 323–340.
17. Ram Murty, Oscillations of Fourier coefficients of modular forms, Math. Ann., (4)262 (1983), 431–446.
18. C.L. Siegel, Berechnung von Zetafunktionen an ganzzahligen Stellen, Nachr. Akad. Wiss. Göttingen Math.-Phys. Kl. II, (1969), 87–102.
19. R. Weissauer, The Ramanujan conjecture for genus 2 Siegel modular forms (an application of the trace formula). Preprint, Mannheim 1993.


SHIFTED CONVOLUTION SUMS OF FOURIER COEFFICIENTS OF CUSP FORMS YUK-KAM LAU Department of Mathematics, The University of Hong Kong, Pokfulam Road, Hong Kong E-mail: [email protected] JIANYA LIU∗ School of Mathematics and System Sciences, Shandong University, Jinan, Shandong 250100, China E-mail: [email protected] YANGBO YE† Department of Mathematics, The University of Iowa, Iowa City, IA 52242-1419, U.S.A. E-mail: [email protected] Let g be a holomorphic Hecke eigenform for Γ0 (N ) of weight l, or a Maass eigenform for Γ0 (N ) with Laplace eigenvalue 1/4 + l2 . Let λg (n) be the nth Fourier coefficient of g. A shifted convolution sum of λg (n) is a sum of the form P n λg (n)λg (n + h)w(n), where h is a nonzero integer, and w a nice weight function. These shifted convolution sums play a crucial role in analytic number theory, and in particular, in subconvexity bound problems of automorphic Lfunctions. This article will survey historical developments and recent progress on estimation and analytic continuation of a type of shifted convolution sums. The techniques to be used include spectral decomposition using Poincar´ e series, a special choice of an orthonormal basis of Hecke eigenforms, a classical result of Good and its generalization by Kr¨ otz and Stanton, and a spectral large sieve. The shifted convolution sum will be meromorphically continued to ℜs > −1/2, passing through all poles from Laplace eigenvalues.

∗ Supported in part by the 973 Program, by NSFC Grant # 10531060, and by a Ministry of Education Major Grant Program in Sciences and Technology # 305009.
† Supported in part by the USA National Security Agency under Grant Number H9823006-1-0075. The United States Government is authorized to reproduce and distribute reprints notwithstanding any copyright notation herein.


1. Automorphic L-functions and subconvexity problems

1.1. The classical case: the Riemann zeta-function and Dirichlet L-functions

In his 1859 memoir, Riemann introduced the approach of using an analytic object, the Riemann zeta-function, to study the arithmetic problem of the distribution of primes. Nowadays this approach has been exploited in various settings with fruitful results. The associated analytic objects are known as L-functions. They are functions defined on the complex plane by analytic or meromorphic continuation, sharing common features and conjectures with the Riemann zeta-function.

There are two important open conjectures for L-functions: the Generalized Riemann Hypothesis (GRH) and the Generalized Lindelöf Hypothesis (GLH). The former concerns the location of nontrivial zeros and the latter is about the size of an L-function on the critical line ℜs = 1/2. By standard complex analysis, GLH follows from GRH. Although GLH is the weaker statement, progress towards it is rather slow, even for the Riemann zeta-function ζ(s). In the case of ζ(s), GLH is the assertion α = 0 in the order estimate

$$\zeta(1/2 + it) \ll_\varepsilon |t|^{\alpha+\varepsilon} \quad \text{for } |t| \ge 1. \tag{1.1}$$

The upper estimate (1.1) holds for α = 1/4 by the Phragmén–Lindelöf convexity principle, a robust method in complex analysis. The record to date is α = 32/205 due to Huxley [14], but it is still far from what GLH anticipates. Amazingly, Weyl was able to show α = 1/6 about eighty years ago. Note that 32/205 = 1/6 − 13/1230. The progress since then is small, and it seems that some kind of obstruction is present at Weyl's bound. Below, we shall see that such an obstruction occurs in other cases as well.

The convexity principle applies well to other L-functions. Naturally, the bound resulting from this principle is called a convexity bound, and we refer to any improvement (usually on the exponent of the convexity bound) as a subconvexity bound. Besides, we call a bound Weyl-like if it is the 2/3-th power of the convexity bound up to an arbitrarily small ε > 0. For instance, the convexity bound of ζ(s) is |t|^{1/4+ε} and its Weyl-like bound is |t|^{1/6+ε}.

The Dirichlet L-function L(s, χ) was introduced to study the primes in an arithmetic progression. In addition to the t-aspect on the critical line s = 1/2 + it, we are interested in the conductor aspect, that is, the modulus of the character χ. For either aspect, the exponent of the convexity bound is 1/4. The best known exponent for the t-aspect is 1/6, which is Weyl-like. Unlike the case of the Riemann zeta-function, nobody has been able to break the Weyl bound so far, though it was proven quite a long time ago. In the conductor aspect, the Weyl-like bound was achieved only recently by Conrey and Iwaniec [6] for real characters. Before this, only the exponent 3/16 due to Burgess [4] was known, and it remains the best for all characters.

Both ζ(s) and L(s, χ) are L-functions of degree one, referring to the degree of a generic polynomial factor in p^{−s} of their Euler product factorizations. Next, we turn to higher degree examples.

1.2. L-functions of degree two

The L-function associated to a holomorphic Hecke eigenform or Hecke Maass eigenform is a typical example of degree two. Let Γ_0(N) be the congruence subgroup consisting of the matrices in SL_2(Z) whose lower left entry is a multiple of N. The upper half plane H is identified with G/K where G = GL(2, R) and K = O(2, R), and hence the quotient space Γ_0(N)\H ≅ Γ_0(N)\G/K. But instead, we consider the space Γ_0(N)\G, which is regarded as the unit tangent bundle of Γ_0(N)\H. The Haar measure on Γ_0(N)\G descends from

$$dg = \frac{dx\,dy}{y^2}\,\frac{d\varphi}{2\pi},$$

where an element g ∈ G is expressed via the Iwasawa decomposition as

$$g = \begin{pmatrix} 1 & x \\ & 1 \end{pmatrix} \begin{pmatrix} y^{1/2} & \\ & y^{-1/2} \end{pmatrix} \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix},$$

mapped under the natural projection to z = x + iy ∈ H. Consider the Laplace operator

$$\tilde\Delta = -y^2\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right) + y\,\frac{\partial^2}{\partial x\,\partial\varphi}$$

on the Hilbert space L^2(Γ_0(N)\G). A Maass eigenform with eigenvalue 1/4 + k^2 is a square-integrable K-invariant eigenfunction of Δ̃ with eigenvalue 1/4 + k^2. A holomorphic cusp form of weight k is a holomorphic function f on H such that the function in L^2(Γ_0(N)\G) corresponding to y^{k/2} f(z) e^{ikφ} is an eigenfunction of Δ̃ with eigenvalue (k/2)(k/2 − 1). When these forms are invariant under the Hecke operators T_n with n ≥ 1, we call them Hecke Maass eigenforms and holomorphic Hecke eigenforms, respectively. Their associated L-functions involve three parameters t, k and N, and the convexity bound is (|t| + k)^{1/2} N^{1/4} (|t|kN)^ε.


The current best subconvexity bounds are accordingly |t|^{1/3+ε} by Good [11] and Meurman [28], k^{1/3+ε} by Peng [31] and Ivić [15], and N^{1/6+ε} for certain forms by Conrey and Iwaniec [6]. Amazingly, all of them are only Weyl-like, that is, no further advance has been achieved for L-functions of degree two. Perhaps there is a barrier that we can break through only in the very special case of ζ(s).

1.3. Rankin-Selberg L-functions

The subconvexity problem for L-functions of degree ≥ 3 is mostly unsolved. One accessible case is the Rankin-Selberg L-function, which is of degree 4. Let f be a holomorphic Hecke eigenform for Γ_0(N) of weight k, or a Hecke Maass eigenform with eigenvalue 1/4 + k^2. Let g be a fixed holomorphic or Maass cusp form of weight l or eigenvalue 1/4 + l^2, and of level D. The Rankin-Selberg L-function L(s, f × g) satisfies the convexity bound

$$L(1/2 + it, f \times g) \ll \big(ND(|t| + k + l)(|t| + |k - l|)\big)^{1/2+\varepsilon}. \tag{1.2}$$

We fix the level N, the form g, and t, and study the subconvexity estimate of L(1/2 + it, f × g) in the k-aspect. Thus, we are seeking bounds like

$$L(1/2 + it, f \times g) \ll_{N,g,t,\varepsilon} k^{\beta+\varepsilon} \tag{1.3}$$

for some 0 ≤ β < 1. This was first achieved by Sarnak [34] for holomorphic f, and by Liu and Ye [25,26] for f being Maass. Progress in this direction is summarized in the following table. Throughout the paper, θ denotes a bound towards the Generalized Ramanujan Conjecture (GRC) for GL_2, for which θ = 1/2 is trivial, and the best bound known to date is θ = 7/64 due to Kim and Sarnak [20]. GRC actually predicts that θ = 0.

β                   author(s)                     the shifted convolution sum is treated by
18/(19−2θ)          Sarnak [34]                   spectral method
(15+2θ)/16          Liu and Ye [25,26]            spectral method
(6−2θ)/(7−4θ)       Blomer [2]                    circle method
1 − 1/(8+4θ)        Lau, Liu, and Ye [23]         spectral method
2/3                 Lau, Liu, and Ye [24]         spectral method
2/3                 Jutila and Motohashi [19]     spectral method
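To compare the exponents in the table, the following minimal sketch evaluates each β exactly at θ = 7/64 (the Kim-Sarnak value quoted above) and at the conjectured θ = 0. The dictionary `BETAS` and the function `beta_table` are names introduced here for illustration only.

```python
from fractions import Fraction

# Exponents beta from the table, as functions of theta (the bound towards GRC).
# The labels are only the table's author column; nothing else is assumed.
BETAS = {
    "Sarnak [34]": lambda th: Fraction(18) / (19 - 2 * th),
    "Liu and Ye [25,26]": lambda th: (15 + 2 * th) / Fraction(16),
    "Blomer [2]": lambda th: (6 - 2 * th) / (7 - 4 * th),
    "Lau, Liu, and Ye [23]": lambda th: 1 - 1 / (8 + 4 * th),
    "Lau, Liu, and Ye [24]": lambda th: Fraction(2, 3),
    "Jutila and Motohashi [19]": lambda th: Fraction(2, 3),
}


def beta_table(theta):
    """Return {label: beta} evaluated exactly at the given theta."""
    theta = Fraction(theta)
    return {name: f(theta) for name, f in BETAS.items()}


if __name__ == "__main__":
    for theta in (Fraction(7, 64), Fraction(0)):
        print(f"theta = {theta}")
        for name, beta in beta_table(theta).items():
            print(f"  {name:28s} beta = {beta} ~ {float(beta):.4f}")
```

Every value is below 1, so each row is a genuine subconvexity exponent, and only the last two rows reach the Weyl-like value 2/3 independently of θ.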


The Weyl-like bound achieved in [24] is as follows.

Theorem 1.1. Let f be a holomorphic Hecke eigenform for Γ_0(N) of weight k, or a Maass Hecke eigenform for Γ_0(N) with Laplace eigenvalue 1/4 + k^2, and correspondingly let g be a fixed holomorphic or Maass cusp form for Γ_0(N), or for Γ_0(N′) with (N, N′) = 1. Then, for any small ε > 0,

$$L(1/2 + it, f \times g) \ll_{N,t,g,\varepsilon} k^{2/3+\varepsilon}, \tag{1.4}$$

where the implied constant grows at most polynomially in t and N, with the degree of the polynomial growth depending on ε.

The same bound is obtained by Jutila and Motohashi [19], but for the full modular group SL_2(Z). The result in [19] also provides a subconvexity bound in the t-aspect, when |t| is suitably smaller than k. For the Rankin-Selberg L-function in question, the Weyl-like bound is attained only in the weight/spectral aspect among the various aspects. A subconvexity bound is available in the level aspect N, see [29] and [13], but the t-aspect remains unsettled.

1.4. Plan of the article

In this article, we try to give some historical developments and recent progress on the estimation and analytic continuation of a type of shifted convolution sum, and indicate some of their applications to subconvexity bounds for automorphic L-functions. In view of the huge amount of material in these areas, we will mainly mention applications to the Rankin-Selberg L-function L(s, f × g) in the weight/spectral aspect. §2 presents some fundamentals of the shifted convolution sums. Basically, there are two methods to treat the shifted convolution sums: the circle method and the spectral method. In §3, we describe two variants of the circle method and their consequences for subconvexity bounds for L(s, f × g). §§4–6 are devoted to the spectral method and its recent developments. The material is organized in logical order, which is not always the historical order. For example, some results in §§4–6 were actually obtained earlier than those in §3.

1.5. Notations

As usual, τ(n) and ϕ(n) denote, respectively, the divisor function and the Euler totient function. For z ∈ C, ℜz and ℑz denote, respectively, the real and imaginary parts of z. Following Riemann, for the specific complex variable s, we write ℜs = σ and ℑs = t; thus s = σ + it. The symbol A ≍ B means both A ≪ B and B ≪ A, and e(x) = e^{2πix}.

2. Shifted convolution sums

2.1. Spectral theory of automorphic forms

In 1949, Maass [27] introduced the Γ-invariant eigenfunctions of the non-Euclidean Laplacian on H, which are now understood to be the basic elements in the harmonic analysis of Γ\H with Γ being a discrete subgroup of SL_2(R). To see their importance, one should recall that the Fourier series expansion is a kind of spectral decomposition in the harmonic analysis of the Euclidean plane R^2. More specifically, consider the group G = R^2 acting on the plane R^2 by translations; then the G-invariant eigenfunctions of the Euclidean Laplacian are the exponential functions e(mx + ny) with eigenvalues 4π^2(m^2 + n^2), where m, n ∈ Z.

In the 1950s, Selberg wrote a couple of influential papers on harmonic analysis and discontinuous groups. In particular, he founded harmonic analysis for weakly symmetric Riemannian spaces and the Selberg trace formula, which generalizes the Poisson summation formula. Around the same time, important closely related works by other authors appeared.

The upper half plane H equipped with the hyperbolic metric is a Riemannian manifold. As the metric is invariant under SL_2(Z) and hence under Γ, one may look for a spectral decomposition with Γ-invariant eigenfunctions of the Laplacian associated to the hyperbolic manifold. Such a decomposition is possible by virtue of the theory of unbounded self-adjoint operators in Hilbert spaces. However, as Γ may not be cocompact (for example, the congruence subgroups), the Γ-invariant eigenfunctions for the discrete spectrum, called Maass cusp forms, are not sufficient for a complete spectral decomposition. One needs to include Eisenstein series, for the continuous spectrum, in the spectral decomposition.

The spectral theory is interesting in its own right. For instance, the trace formula developed by Selberg yields Weyl's law on the discrete spectrum for congruence subgroups. Consequently it guarantees the existence of infinitely many Maass cusp forms. It is worth remarking that the existence problem of Maass cusp forms for general discrete subgroups remains open; the answer is likely to be negative due to the work of Phillips and Sarnak [32].
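The Euclidean analogy in §2.1 can be checked numerically. The sketch below, which assumes the sign convention −(∂²/∂x² + ∂²/∂y²) for the Euclidean Laplacian, approximates the second derivatives of e(mx + ny) by central differences and compares the result with 4π²(m² + n²). The function names are illustrative only.

```python
import cmath
import math


def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def minus_laplacian_ratio(m, n, x, y, step=1e-3):
    """Approximate (-Laplacian f)/f at (x, y) for f = e(m*x + n*y),
    using central second differences with the given step."""
    f = lambda u, v: e(m * u + n * v)
    fxx = (f(x + step, y) - 2 * f(x, y) + f(x - step, y)) / step**2
    fyy = (f(x, y + step) - 2 * f(x, y) + f(x, y - step)) / step**2
    return -(fxx + fyy) / f(x, y)


def expected_eigenvalue(m, n):
    """Eigenvalue 4*pi^2*(m^2 + n^2) stated in the text."""
    return 4 * math.pi**2 * (m * m + n * n)


if __name__ == "__main__":
    for m, n in [(1, 0), (2, -1), (3, 3)]:
        approx = minus_laplacian_ratio(m, n, 0.3, 0.7)
        print(m, n, abs(approx), expected_eigenvalue(m, n))
```

The ratio does not depend on the base point (x, y), which is the numerical shadow of the fact that e(mx + ny) is an eigenfunction rather than merely satisfying the equation at one point.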


2.2. The Rankin-Selberg method and shifted convolution sums

One major topic in modular forms is the study of their Fourier coefficients. Rankin and Selberg independently established a method, nowadays called the Rankin-Selberg method, based on the integral

$$\int_{\Gamma\backslash H} y^l g(z)\overline{g(z)}E(z,s)\,d\mu(z),$$

where Γ\H denotes a fundamental domain of Γ, g is a modular form of even integral weight l for Γ,

$$E(z,s) = \sum_{\gamma\in\Gamma_\infty\backslash\Gamma} \Im(\gamma z)^s$$

is an Eisenstein series, and σ > 1. It is well known that g(z) admits a Fourier expansion

$$g(z) = \sum_{n\ge 1} a_g(n)e(nz). \tag{2.1}$$

By the invariance under Γ, the integral is unfolded to give

$$\int_{\Gamma\backslash H} y^l g(z)\overline{g(z)}E(z,s)\,d\mu(z) = \int_{\Gamma_\infty\backslash H} y^{l+s}|g(z)|^2\,d\mu(z) = G(s)D_g(s), \tag{2.2}$$

where G(s) is a product of some Gamma factors, and

$$D_g(s) = \sum_{n\ge 1}\frac{|a_g(n)|^2}{n^s}.$$

A classical method with Perron's formula will yield an asymptotic formula for the summatory function

$$\sum_{n\le x}|a_g(n)|^2$$

from D_g(s), provided that D_g(s) can be analytically continued to the left beyond σ = 1. With the available information on E(z, s), D_g(s) is meromorphically continued to the whole complex plane, and is regular on σ ≥ 1/2 except for a finite number of poles lying on the segment 1/2 < s ≤ 1. This is the basic principle of the Rankin-Selberg method.

Prior to the works of Rankin and Selberg, Petersson developed an explicit formula, namely the Petersson trace formula, for the Fourier coefficients of a modular form. This formula involves the Kloosterman sum S(m, n, c) and Bessel functions, and is derived from the Poincaré series

$$P_m(z,s) = \sum_{\gamma\in\Gamma_\infty\backslash\Gamma} \Im(\gamma z)^s e(m\gamma z).$$

Evidently a good understanding of the Kloosterman sum will result in better knowledge of the Fourier coefficients. To this end, Selberg [35] investigated the series

$$Z(s,m,n) = \sum_{c\ge 1}\frac{S(m,n,c)}{c^{2s}},$$

which is a crucial component in the Fourier coefficient of P_m(z, s). His method is to give P_m(z, s) a spectral decomposition, regarding P_m(·, s) as a function in L^2(Γ\H). By the aforementioned spectral theory, P_m(·, s) is a combination of a series of Maass cusp forms and spectral integrals of Eisenstein series. The coefficients of the discrete part, i.e. the series of Maass cusp forms, are products of gamma functions, which determine the analytic properties of the function P_m(z, s) in s. The continuous part is similar but, for simplicity, will not be discussed further here. As a result, P_m(z, s) is regular for σ > 1/2 except possibly for a finite number of simple poles on (1/2, 1].

Replacing the Eisenstein series E(z, s) in (2.2) by a Poincaré series P_m(z, s), the two methods can be combined to study the shifted convolution sum

$$\sum_{n\ge 1}\frac{a_g(n)a_g(n+h)}{(n+h/2)^s}.$$

This idea was pointed out by Selberg in the last section of [35], but at that time he did not find an application for this shifted convolution sum. During the past decades, uses of the shifted convolution sum have come up in the study of L-functions. Indeed, for the classical example of degree one, the Riemann zeta-function ζ(s), the investigation of its fourth moment already leads naturally to the shifted convolution sum for the divisor function d(n), which was considered by Heath-Brown [7]. One needs to handle this or similar types of sums for higher degree automorphic L-functions.

3. Variants of the circle method

Let g be a holomorphic Hecke eigenform for Γ_0(N) of even weight l, or a Hecke Maass eigenform with eigenvalue 1/4 + l^2. Then g admits the following Fourier expansions:

$$g(z) = \sum_{n\ge 1}\lambda_g(n)n^{(l-1)/2}e(nz) \tag{3.1}$$

when g is holomorphic, and

$$g(z) = y^{1/2}\sum_{n\ne 0}\lambda_g(n)K_{il}(2\pi|n|y)e(nx) \tag{3.2}$$

when g is Maass, where K_{il} is the modified Bessel function of the third kind, and z = x + iy. We normalize λ_g(1) = 1 in (3.1) and (3.2). In this section, we describe variants of the circle method to treat shifted convolution sums like

$$D_g(\nu_1,\nu_2,h) = \sum_{\nu_1 m-\nu_2 n=h}\lambda_g(m)\lambda_g(n)W(m,n) \tag{3.3}$$

uniformly in positive integers ν_1, ν_2, h, where W : R × R → R is a nice test function. For example, one may suppose that W is smooth, supported on [M_1, 2M_1] × [M_2, 2M_2], and satisfies

$$\|W^{(ij)}\|_\infty \ll_{i,j} M_1^{-i}M_2^{-j} \quad \text{for all } i\ge 0,\ j\ge 0, \tag{3.4}$$

where M_1, M_2 are real numbers greater than 1.

3.1. The δ-symbol method

To attack D_g(ν_1, ν_2, h) in (3.3), Duke, Friedlander, and Iwaniec [9,10] developed the δ-symbol method, which can be viewed as a variant of the circle method. This δ-symbol method has also been used on many occasions; see for example the series of papers by Duke, Friedlander, and Iwaniec, Kowalski, Michel, and Vanderkam [21], and Michel [29]. The following description is based on [9] and Michel [30]. Let

$$\delta(n) = \begin{cases} 1 & \text{if } n = 0, \\ 0 & \text{if } n \ne 0, \end{cases} \tag{3.5}$$

be the Dirac symbol at 0 restricted to integers n; the basic idea of the δ-symbol method is to express δ(n) in terms of additive characters. One starts with a smooth, compactly supported, even function ω(x) with

$$\omega(0) = 0, \qquad \sum_{r\ge 1}\omega(r) = 1.$$

Put

$$\delta_d(n) = \omega(d) - \omega\left(\frac{n}{d}\right);$$

then we have

$$\delta(n) = \sum_{d|n}\delta_d(n).$$
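The identity δ(n) = ∑_{d|n} δ_d(n) rests on the involution d ↦ |n|/d of the divisors of n ≠ 0, together with the normalization of ω for n = 0. A minimal sketch follows; since only integer arguments of ω enter when d | n, it replaces the smooth function of the text by a hypothetical toy weight defined on integers, which is an assumption made purely for illustration.

```python
def omega(x):
    """Toy even weight with omega(0) = 0 and sum_{r >= 1} omega(r) = 1.
    Only its values at integers are used (an illustrative stand-in for
    the smooth compactly supported omega of the text)."""
    return {3: 0.25, 4: 0.5, 5: 0.25}.get(abs(x), 0.0)


def delta_d(d, n):
    """delta_d(n) = omega(d) - omega(n/d), used here only when d divides n."""
    return omega(d) - omega(n // d)


def delta_via_divisors(n, d_max=50):
    """Sum of delta_d(n) over d >= 1 with d | n.
    For n = 0 every d divides n; the sum is truncated at d_max,
    beyond which the toy omega vanishes."""
    if n == 0:
        return sum(delta_d(d, 0) for d in range(1, d_max + 1))
    return sum(delta_d(d, n) for d in range(1, abs(n) + 1) if n % d == 0)


if __name__ == "__main__":
    print([delta_via_divisors(n) for n in range(-6, 7)])
```

For n ≠ 0 the two sums ∑ ω(d) and ∑ ω(n/d) run over the same set of divisors and cancel; for n = 0 only ∑ ω(d) = 1 survives because ω(0) = 0.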


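Now the condition d|n can be detected by additive characters. Thus,

$$\delta(n) = \sum_{d\ge 1}\frac{1}{d}\sum_{h \bmod d} e\left(\frac{hn}{d}\right)\delta_d(n) = \sum_{c\ge 1}\frac{1}{c}\ \sum^{*}_{a \bmod c} e\left(\frac{an}{c}\right)\Delta_c(n), \tag{3.6}$$

where r = (h, d), a = h/r, c = d/r, and

$$\Delta_c(n) = \sum_{r\ge 1}\frac{1}{r}\,\delta_{cr}(n).$$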
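The two steps behind (3.6) can be sketched numerically: the orthogonality relation that detects d | n, and the regrouping of h mod d according to r = (h, d) into reduced residues a mod c with c = d/r. The function names below are illustrative and not taken from the source.

```python
import cmath
import math


def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def divides_indicator(d, n):
    """(1/d) * sum_{h mod d} e(h*n/d): equals 1 if d | n and 0 otherwise."""
    return sum(e(h * n / d) for h in range(d)) / d


def reduced_character_sum(c, n):
    """Sum of e(a*n/c) over a mod c with gcd(a, c) = 1."""
    return sum(e(a * n / c) for a in range(1, c + 1) if math.gcd(a, c) == 1)


def regrouped(d, n):
    """Sum over h mod d of e(h*n/d), regrouped by c = d / gcd(h, d)."""
    return sum(reduced_character_sum(c, n) for c in range(1, d + 1) if d % c == 0)


if __name__ == "__main__":
    print(abs(divides_indicator(6, 12)), abs(divides_indicator(6, 10)))
    print(abs(6 * divides_indicator(6, 10) - regrouped(6, 10)))
```

The regrouping is what turns the sum over all d with the full set of residues into the sum over c with reduced residues, at the cost of the extra average over r inside Δ_c(n).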

In practice, one applies the above identity to integers |n| < U/2, say, with the test function ω(x) supported on [K/2, K] and whose derivatives satisfy

$$\|\omega^{(j)}\|_\infty \ll K^{-j-1} \quad \text{for all } j\ge 0.$$

Then δ_d(n) vanishes save for 1 ≤ d < max(K, U/K) = K by choosing K = U^{1/2}. Hence Δ_c(n) vanishes save for 1 ≤ c < K, and Δ_c(n) ≪ K^{−1}. Now applying (3.6) to the Dirac symbol δ(ν_1 m − ν_2 n − h) in (3.3), one gets rid of the condition ν_1 m − ν_2 n − h = 0. For technical reasons, one introduces a localization factor φ(ν_1 x − ν_2 y − h) in D_g(ν_1, ν_2, h), where φ is a smooth function compactly supported on [−U/2, U/2], satisfying φ(0) = 1 and

$$\|\phi^{(j)}\|_\infty \ll U^{-j} \quad \text{for all } j\ge 0.$$

Hence

$$D_g(\nu_1,\nu_2,h) = \sum_{1\le c\le K}\ \sum^{*}_{a \bmod c} e\left(-\frac{ah}{c}\right)\sum_{m,n}\lambda_g(m)\lambda_g(n)\, e\left(\frac{\nu_1 ma-\nu_2 na}{c}\right)E_c(m,n,h), \tag{3.7}$$

where

$$E_c(x,y,h) = W(x,y)\,\phi(\nu_1 x-\nu_2 y-h)\,\frac{1}{c}\,\Delta_c(\nu_1 x-\nu_2 y-h).$$

It turns out that the derivatives of E_c(x, y, h) are well controlled; in fact

$$\|E_c^{(ij)}\|_\infty \ll_{i,j} \frac{1}{cK+|\nu_1 x-\nu_2 y-h|}\cdot\frac{\nu_1^i\nu_2^j}{\min(M_1,M_2)^{i+j}}\left(\frac{K}{c}\right)^{i+j}. \tag{3.8}$$


Next one applies the Voronoi summation formula to both variables m and n, and the shifted convolution sum in question is transformed to

$$D_g(\nu_1,\nu_2,h) = \sum_{c\le K}\frac{(\nu_1\nu_2,c)}{c^2}\sum_{m,n}\lambda_g(m)\lambda_g(n)\,S(-\nu_1' m+\nu_2' n,-h,c)\,I_c(m,n,h), \tag{3.9}$$

where S(a, b, c) is the classical Kloosterman sum, and

$$I_c(m,n,h) = (2\pi i^k)^2\int_0^\infty\!\!\int_0^\infty E_c(x,y,h)\,J_{k-1}\left(\frac{4\pi\sqrt{mx}}{c/(\nu_1,c)}\right)J_{k-1}\left(\frac{4\pi\sqrt{ny}}{c/(\nu_2,c)}\right)dx\,dy \tag{3.10}$$

with ν_j′ = ν_j/(ν_j, c). Integrating the Bessel functions in (3.10) by parts many times, one shows that I_c(m, n, h) is very small unless m and n lie in certain short ranges, and one can therefore restrict the summations over m and n in (3.9) to these short ranges. Applying Weil's bound for Kloosterman sums,

$$|S(a,b,r)| \le \tau(r)\sqrt{(a,b,r)\,r},$$

one gets the following result.

Theorem 3.1. Let g be a holomorphic cusp form for Γ_0(N) of even weight l, or a Maass cusp form for Γ_0(N) with eigenvalue 1/4 + l^2. Then

$$D_g(\nu_1,\nu_2,h) \ll_{N,l,\varepsilon} (\nu_1 M_1+\nu_2 M_2)^{3/4+\varepsilon}. \tag{3.11}$$

This is proved by Duke, Friedlander, and Iwaniec [9] for the full modular group, and by Kowalski, Michel, and Vanderkam [21] for Γ_0(N). From Theorem 3.1, one can get a subconvexity bound for L(1/2 + it, f × g).

Theorem 3.2. Let f be a holomorphic Hecke eigenform for Γ_0(N) of weight k, or a Maass Hecke eigenform for Γ_0(N) with Laplace eigenvalue 1/4 + k^2, and let g be a fixed holomorphic or Maass cusp form for Γ_0(N), or for Γ_0(N′) with (N, N′) = 1. Then

$$L(1/2+it, f\times g) \ll_{N,t,g,\varepsilon} k^{11/12+\varepsilon}. \tag{3.12}$$

Theorem 3.2 does not appear in the literature. However, it can be compared with Iwaniec's bound k^{5/12+ε} for single automorphic L-functions [16].
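The Weil bound invoked before Theorem 3.1 can be sanity-checked on small moduli. The sketch below assumes the usual definition S(a, b, c) = ∑* e((ax + bx̄)/c), summed over reduced residues x mod c with xx̄ ≡ 1 (mod c); this definition is not written out in the text and is stated here as an assumption. The helper names are illustrative.

```python
import cmath
import math


def kloosterman(a, b, c):
    """Kloosterman sum S(a, b, c) = sum* e((a*x + b*xbar)/c) over x mod c
    with gcd(x, c) = 1 and x*xbar = 1 (mod c) (assumed standard definition)."""
    if c == 1:
        return 1.0
    total = 0j
    for x in range(1, c):
        if math.gcd(x, c) == 1:
            xbar = pow(x, -1, c)  # modular inverse, available in Python 3.8+
            total += cmath.exp(2j * math.pi * ((a * x + b * xbar) % c) / c)
    # The sum is real because x -> -x conjugates each term.
    return total.real


def tau(n):
    """Number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def weil_bound(a, b, c):
    """Right-hand side tau(c) * sqrt(gcd(a, b, c) * c) of Weil's bound."""
    g = math.gcd(math.gcd(a, b), c)
    return tau(c) * math.sqrt(g * c)


if __name__ == "__main__":
    for c in (5, 7, 12, 25):
        worst = max(abs(kloosterman(a, b, c)) / weil_bound(a, b, c)
                    for a in range(c) for b in range(c))
        print(c, round(worst, 4))
```

The printed ratios stay at most 1 for these moduli, consistent with the bound; the square-root saving in c is what produces the exponent 3/4 in (3.11).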


3.2. Jutila's variant

Let a′/q′ < a/q < a′′/q′′ be three consecutive Farey fractions with denominators ≤ Q, and

$$M\left(\frac{a}{q}\right) = \left(\frac{a+a'}{q+q'},\ \frac{a+a''}{q+q''}\right].$$

Then (0, 1] is a disjoint union of these M(a/q),

$$(0,1] = \bigsqcup_{q\le Q}\ \bigsqcup^{*}_{a \bmod q} M\left(\frac{a}{q}\right).$$

Let δ(n) be the function defined in (3.5). Then the circle method of Hardy and Littlewood actually starts with the following decomposition of δ(n):

$$\delta(n) = \sum_{q\le Q}\ \sum^{*}_{a \bmod q}\int_{M(a/q)} e(n\alpha)\,d\alpha. \tag{3.13}$$
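The Farey dissection behind (3.13) can be made concrete with exact rational arithmetic. The sketch below builds the arcs M(a/q) for a small Q and checks that their lengths add up to 1; the arc around 1/1 is taken with the wrapped right neighbour 1 + 1/Q, an implementation choice made here so that the arcs tile a full period. Function names are illustrative.

```python
from fractions import Fraction


def farey(Q):
    """Farey fractions a/q in [0, 1] with 1 <= q <= Q, in increasing order."""
    return sorted({Fraction(a, q) for q in range(1, Q + 1) for a in range(q + 1)})


def mediant(x, y):
    """Mediant (a + a')/(q + q') of two reduced fractions."""
    return Fraction(x.numerator + y.numerator, x.denominator + y.denominator)


def farey_arcs(Q):
    """Map each Farey fraction a/q in (0, 1] to its arc (left, right],
    with endpoints given by the mediants with its two neighbours."""
    fr = farey(Q)            # fr[0] = 0/1 and fr[-1] = 1/1
    fr.append(1 + fr[1])     # wrapped neighbour of 1/1, namely 1 + 1/Q
    arcs = {}
    for i in range(1, len(fr) - 1):
        arcs[fr[i]] = (mediant(fr[i - 1], fr[i]), mediant(fr[i], fr[i + 1]))
    return arcs


if __name__ == "__main__":
    arcs = farey_arcs(5)
    for r, (left, right) in arcs.items():
        print(f"M({r}) = ({left}, {right}], length {right - left}")
    print("total length:", sum(right - left for left, right in arcs.values()))
```

The printed lengths differ from arc to arc, which is exactly the dependence on a and q discussed next.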

Note that the length of M(a/q) depends on a and q, and therefore in general one cannot interchange the order of the summation and the integration above. This is known as the leveling problem. Jutila introduced another variant of the circle method in [17] and [18] to attack this leveling problem. This variant has also been used on many occasions, for example by Harcos [12], Harcos and Michel [13], Blomer [1,2], and Blomer, Harcos, and Michel [3].

Theorem 3.3. Let Q ≥ 1 and Q^{−2} ≤ δ ≤ Q^{−1} be two parameters. Let ω be a non-negative function supported in [Q, 2Q] satisfying

$$\|\omega\|_\infty \le 1, \qquad \sum_q \omega(q) > 0.$$

For r ∈ Q, let I_r(α) be the characteristic function of the interval [r − δ, r + δ], and define

$$\Lambda = \sum_q \omega(q)\varphi(q), \qquad \tilde I(\alpha) = \frac{1}{2\delta\Lambda}\sum_q \omega(q)\ \sum^{*}_{d \bmod q} I_{d/q}(\alpha).$$

Then Ĩ(α) is a good approximation to the characteristic function of [0, 1] in the sense that

$$\int_0^1 |1-\tilde I(\alpha)|^2\,d\alpha \ll_\varepsilon \frac{Q^{2+\varepsilon}}{\delta\Lambda^2}. \tag{3.14}$$
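A rough numerical illustration of Theorem 3.3: the sketch below evaluates Ĩ(α) on a grid and reports its mean and its mean square distance from 1. It assumes the weight ω ≡ 1 on the integers in [Q, 2Q] and treats the intervals [d/q − δ, d/q + δ] as arcs on the circle R/Z; both are simplifying assumptions of this sketch, and all names are illustrative. It does not attempt to verify the implied constant in (3.14).

```python
import math


def jutila_setup(Q, omega):
    """Return (centres, Lambda), where centres lists (d/q, omega(q)) over
    Q <= q <= 2Q and reduced residues d mod q, and
    Lambda = sum_q omega(q) * phi(q)."""
    centres = [(d / q, omega(q))
               for q in range(Q, 2 * Q + 1)
               for d in range(1, q + 1)
               if math.gcd(d, q) == 1]
    lam = sum(w for _, w in centres)
    return centres, lam


def jutila_I(alpha, centres, lam, delta):
    """Evaluate I~(alpha) as a weighted count of centres within delta."""
    total = 0.0
    for r, w in centres:
        dist = abs(alpha - r) % 1.0
        dist = min(dist, 1.0 - dist)   # distance on the circle R/Z
        if dist <= delta:
            total += w
    return total / (2 * delta * lam)


def mean_and_l2_error(Q, delta, omega, grid=2000):
    """Midpoint-rule approximations of the integrals of I~ and |1 - I~|^2."""
    centres, lam = jutila_setup(Q, omega)
    values = [jutila_I((i + 0.5) / grid, centres, lam, delta) for i in range(grid)]
    mean = sum(values) / grid
    l2 = sum((1 - v) ** 2 for v in values) / grid
    return mean, l2


if __name__ == "__main__":
    print(mean_and_l2_error(8, 1 / 8, lambda q: 1.0))
```

The mean is close to 1 because every arc has the same length 2δ, which is the feature that later permits the interchange of summation and integration.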


To transform the sum in question by Jutila's variant of the circle method, we let Q > Nν_1ν_2 and δ = Q^{−1}. Let ω̃ be a function supported in [Q, 2Q] satisfying

$$\|\tilde\omega^{(j)}\|_\infty \asymp Q^{-j} \quad \text{for all } j\ge 0,$$

and let ω(q) = ω̃(q)χ_{[Nν_1ν_2|q]}(q), where χ_{[Nν_1ν_2|q]}(q) is the characteristic function of the condition Nν_1ν_2 | q. Then

$$\Lambda \asymp \frac{Q^2}{N\nu_1\nu_2}.$$

For simplicity, put

$$T(\alpha) = \sum_{m,n}\lambda_g(m)\lambda_g(n)e(\nu_1 m\alpha)e(-\nu_2 n\alpha)W(m,n).$$

Then the shifted convolution sum in (3.3) can be written as

$$D_g(\nu_1,\nu_2,h) = \int_0^1 T(\alpha)e(-\alpha h)\,d\alpha = \int_0^1 \tilde I(\alpha)T(\alpha)e(-\alpha h)\,d\alpha + \int_0^1 (1-\tilde I(\alpha))T(\alpha)e(-\alpha h)\,d\alpha =: D^{mt} + D^{et}, \tag{3.15}$$

say. The error term D^{et} is, by Cauchy's inequality and Theorem 3.3,

$$D^{et} \ll \|1-\tilde I\|_2\cdot\|T\|_2 \ll \frac{Q^{1+\varepsilon}}{\delta^{1/2}\Lambda}\max_{\alpha_1,\alpha_2}\Big|\sum_{m,n}\lambda_g(m)\lambda_g(n)e(\alpha_1 m)e(-\alpha_2 n)W(m,n)\Big|. \tag{3.16}$$

This last quantity is acceptable by applying the estimate

$$\sum_{n\le x}\lambda_g(n)e(n\alpha) \ll_\varepsilon N k^{5/4}x^{1/2}(Nkx)^\varepsilon$$

uniformly in α. This is due to Wilton, but the explicit dependence on N and k is shown in [13]. The main term D^{mt} in (3.15) can be computed as

$$D^{mt} = \frac{1}{2\delta\Lambda}\sum_{N\nu_1\nu_2|q}\tilde\omega(q)\ \sum^{*}_{d \bmod q}\int_{-\delta}^{\delta} T\left(\frac{d}{q}+\eta\right)e\left(-h\left(\frac{d}{q}+\eta\right)\right)d\eta. \tag{3.17}$$


Applying Voronoi's summation formula, we find that the double sum over m, n in (3.17) is, for Nν_1ν_2 | q,

$$T\left(\frac{d}{q}+\eta\right) = \sum_{m,n}\lambda_g(m)\lambda_g(n)\,e\left(\frac{\bar d(\nu_2 n-\nu_1 m)}{q}\right)\omega^{*}_{q,\nu_1,\nu_2,\eta}(m,n), \tag{3.18}$$

where

$$\omega^{*}_{q,\nu_1,\nu_2,\eta}(x_1,x_2) = \frac{4\pi^2\nu_1\nu_2}{q^2}\int_0^\infty\!\!\int_0^\infty W(t_1,t_2)\,e(\nu_1 t_1\eta-\nu_2 t_2\eta)\,J_{k-1}\left(\frac{4\pi\nu_1\sqrt{x_1t_1}}{q}\right)J_{k-1}\left(\frac{4\pi\nu_2\sqrt{x_2t_2}}{q}\right)dt_1\,dt_2.$$

Inserting (3.18) into (3.17), we get

$$\sum_{N\nu_1\nu_2|q}\tilde\omega(q)\ \sum^{*}_{d \bmod q}\int_{-\delta}^{\delta} T\left(\frac{d}{q}+\eta\right)e\left(-h\left(\frac{d}{q}+\eta\right)\right)d\eta$$
$$= \int_{-\delta}^{\delta} e(-\eta h)\sum_{N\nu_1\nu_2|q}\tilde\omega(q)\sum_{m,n} S(-h,\nu_2 n-\nu_1 m,q)\,\lambda_g(m)\lambda_g(n)\,\omega^{*}_{q,\nu_1,\nu_2,\eta}(m,n)\,d\eta$$
$$=: \int_{-\delta}^{\delta} e(-\eta h)\sum_{N\nu_1\nu_2|q}\tilde\omega(q)\,Y(m,n)\,d\eta, \tag{3.19}$$

say. This is where the interchange of the order of summation and integration is needed, and it is justified by the fact that the length of the intervals [−δ, δ] is independent of the variables. The quantity Y(m, n) above can be transformed as

$$Y(m,n) = \sum_{r\in\mathbb{Z}} S(-h,r,q)\sum_{\nu_2 n-\nu_1 m=r}\lambda_g(m)\lambda_g(n)\,\omega^{*}_{q,\nu_1,\nu_2,\eta}(m,n),$$

which is similar to the inner sums in (3.9). Arguing similarly, and invoking the spectral large sieve inequality, one gets

Theorem 3.4. Let g be a holomorphic cusp form for Γ_0(N) of even weight l, or a Maass cusp form for Γ_0(N) with eigenvalue 1/4 + l^2. Then

$$D_g(\nu_1,\nu_2,h) \ll_{N,l,\varepsilon} (\nu_1 M_1+\nu_2 M_2)^{1/2+\theta+\varepsilon}. \tag{3.20}$$

The estimate above leads to a subconvexity bound for Rankin-Selberg L-functions.

Theorem 3.5. Let f be a holomorphic Hecke eigenform for Γ_0(N) of weight k, or a Maass Hecke eigenform for Γ_0(N) with Laplace eigenvalue 1/4 + k^2, and correspondingly let g be a fixed holomorphic or Maass cusp form for Γ_0(N), or for Γ_0(N′) with (N, N′) = 1. Then for any small ε > 0, we have

$$L(1/2+it, f\times g) \ll_{N,t,g,\varepsilon} k^{(6-2\theta)/(7-4\theta)+\varepsilon}. \tag{3.21}$$

Theorems 3.4 and 3.5 are proved by Blomer [1,2].

4. The spectral method

In 2001, Sarnak [34] considered the subconvexity problem for Rankin-Selberg L-functions associated to two cusp forms, one of varying weight and one of fixed weight, in which the shifted convolution sum for the cusp form of fixed weight came into play. Sarnak applied Selberg's approach but made the modification of replacing P_m(z, s) by

$$U_h(z,s) = \sum_{\gamma\in\Gamma_\infty\backslash\Gamma} \Im(\gamma z)^s e(-h\Re(\gamma z)),$$

where h is a positive integer. This helps simplify calculations. Together with his ingenious estimates on inner products of eigenfunctions [34], he obtained a subconvexity estimate for the Rankin-Selberg L-function, with an application to quantum unique ergodicity.

To explain the ideas of Sarnak [34], let ν_1, ν_2 be positive integers, and h an integer. For σ > 1, we define

$$D_g(s,\nu_1,\nu_2,h) = \sum_{\substack{m,n>0 \\ \nu_1 m-\nu_2 n=h}}\lambda_g(n)\lambda_g(m)\left(\frac{\sqrt{\nu_1\nu_2 mn}}{\nu_1 m+\nu_2 n}\right)^{l-1}(\nu_1 m+\nu_2 n)^{-s} \tag{4.1}$$

when g is a holomorphic cusp form, and

$$D_g(s,\nu_1,\nu_2,h) = \sum_{\substack{m,n\ne 0 \\ \nu_1 m-\nu_2 n=h}}\lambda_g(n)\lambda_g(m)\left(\frac{\sqrt{\nu_1\nu_2|mn|}}{\nu_1|m|+\nu_2|n|}\right)^{2il}(\nu_1|m|+\nu_2|n|)^{-s} \tag{4.2}$$

when g is a Maass form. To illustrate the ideas, let us consider the case of g being a holomorphic cusp form on Γ_0(N) of weight l. Write Γ = Γ_0(Nν_1ν_2) and V(z) = y^l g(ν_1 z)g(ν_2 z).


Then V is a Γ-invariant function rapidly decreasing at the cusps of Γ, and V ∈ L^2(Γ\H). By the standard unfolding method, D_g(s, ν_1, ν_2, h) can be expressed in terms of the inner product (see [34, p.444], (A7)-(A9))

$$D_g(s,\nu_1,\nu_2,h) = (2\pi)^{s+l-1}(\nu_1\nu_2)^{(l-1)/2}\,\frac{\langle U_h(\cdot,s),V\rangle}{\Gamma(s+l-1)}. \tag{4.3}$$

Note that V is square-integrable on Γ\H, because it is built from a cusp form. On the other hand, as a Poincaré series, U_h is not square integrable on Γ\H. However, since Γ\H is of finite volume, Parseval's identity applies. Therefore

$$\langle U_h(\cdot,s),V\rangle = \sum_{j\ge 1}\langle U_h(\cdot,s),\phi_j\rangle\langle V,\phi_j\rangle + \frac{1}{4\pi}\sum_{\mathfrak a}\int_{-\infty}^{\infty}\langle U_h(\cdot,s),E_{\mathfrak a}(\cdot,1/2+i\tau)\rangle\langle V,E_{\mathfrak a}(\cdot,1/2+i\tau)\rangle\,d\tau, \tag{4.4}$$

where a runs over all the cusps. Note that ⟨U_h, φ_0⟩ = 0. In view of (4.3), one may investigate the right-hand side of (4.4) to obtain the properties of D_g(s, ν_1, ν_2, h). These inner products can be computed as follows.

Theorem 4.1. We have

$$\langle U_h(\cdot,s),\phi_j\rangle = \frac{\pi^{1/2-s}\rho_j(-h)}{4|h|^{s-1/2}}\,\Gamma\left(\frac{s-1/2+it_j}{2}\right)\Gamma\left(\frac{s-1/2-it_j}{2}\right),$$

and

$$\langle U_h(\cdot,s),E_{\mathfrak a}(\cdot,1/2+i\tau)\rangle = \frac{\pi^{1-s-i\tau}\rho_{\mathfrak a}(1/2+i\tau,-h)}{2|h|^{s-1/2+i\tau}\,\Gamma(1/2-i\tau)}\,\Gamma\left(\frac{s-1/2+i\tau}{2}\right)\Gamma\left(\frac{s-1/2-i\tau}{2}\right).$$

Recall that by the Maass-Selberg theory (see Deshouillers and Iwaniec [8, p.227]), L^2(Γ\H) admits a spectral decomposition with respect to ∆. The spectrum of ∆ consists of two components: the discrete spectrum 0 = λ_0 < λ_1 ≤ λ_2 ≤ · · ·, and the continuous spectrum covering the segment [1/4, ∞). Each eigenvalue in the discrete spectrum has finite multiplicity, and λ_j → ∞ as j → ∞. Moreover, there are two types of eigenvalues: those with 0 < λ_j < 1/4, which are called exceptional, and those with λ_j ≥ 1/4. The famous Selberg conjecture asserts that there is no exceptional eigenvalue for congruence groups, but the currently best known result is λ_1 ≥ 1/4 − θ^2, where θ = 7/64 is the exponent of the best known bound toward the Generalized Ramanujan Conjecture for Maass forms, due to Kim and Sarnak [20]. Write λ_j = s_j(1 − s_j) and s_j = 1/2 + it_j, where

0 < it_j ≤ θ if λ_j is exceptional, and t_j ∈ [0, ∞) otherwise. (4.5)
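The parametrization (4.5) and the role of θ can be illustrated with exact arithmetic. The sketch below computes the lower bound 1/4 − θ² for θ = 7/64 and lists the rightmost poles in s of the gamma factors of Theorem 4.1, rewritten via s_j = 1/2 + it_j as Γ((s − s_j)/2)Γ((s − (1 − s_j))/2). The function names are illustrative.

```python
from fractions import Fraction

THETA = Fraction(7, 64)  # Kim-Sarnak exponent quoted in the text


def eigenvalue_from_s(s):
    """lambda = s * (1 - s) for a real s in [1/2, 1] (the exceptional range)."""
    return s * (1 - s)


def smallest_allowed_eigenvalue(theta=THETA):
    """Lower bound 1/4 - theta^2, which corresponds to s = 1/2 + theta."""
    return eigenvalue_from_s(Fraction(1, 2) + theta)


def gamma_poles(s_j, count=3):
    """Poles in s of Gamma((s - s_j)/2) * Gamma((s - (1 - s_j))/2):
    s = s_j - 2k and s = 1 - s_j - 2k for k = 0, 1, ..., count - 1,
    listed from right to left."""
    poles = {s_j - 2 * k for k in range(count)}
    poles |= {1 - s_j - 2 * k for k in range(count)}
    return sorted(poles, reverse=True)


if __name__ == "__main__":
    print("lambda_1 >=", smallest_allowed_eigenvalue())
    print("poles:", gamma_poles(Fraction(1, 2) + THETA))
```

The rightmost possible pole sits at s = 1/2 + θ, while for t_j real all poles have real part at most 1/2.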

Theorem 4.1 with (4.5) implies immediately that each summand on the right-hand side of (4.4) is holomorphic in σ > 1/2 + θ. Using the estimate of the individual ⟨V, φ_j⟩ developed in [33], Sarnak [34, Theorem A.1] concluded that D_g(s, ν_1, ν_2, h) extends to a holomorphic function on σ > 1/2 + θ and has the following upper bound estimate.

Theorem 4.2. Let g be a holomorphic cusp form for Γ_0(N) of even weight l, or a Maass cusp form for Γ_0(N) of Laplace eigenvalue 1/4 + l^2. Then D_g(s, ν_1, ν_2, h) extends to a holomorphic function for σ ≥ 1/2 + θ + ε, for any ε > 0. Moreover, in this region it satisfies

$$D_g(s,\nu_1,\nu_2,h) \ll_{N,g,\varepsilon} (\nu_1\nu_2)^{1/2+\varepsilon}|h|^{1/2+\theta+\varepsilon-\sigma}(1+|t|)^3 + \chi(g)|h|^{1-\sigma},$$

where χ(g) = 0 or 1 according as g is a holomorphic or a Maass form.

From this, Sarnak [34] deduced the following subconvexity bound for holomorphic Hecke eigenforms f. Note that it is the first subconvexity bound in the k-aspect for L(1/2 + it, f × g).

Theorem 4.3. Let f be a holomorphic Hecke eigenform for Γ_0(N) of weight k, and let g be a fixed holomorphic or Maass cusp form for Γ_0(N), or for Γ_0(N′) with (N, N′) = 1. Then for any small ε > 0,

$$L(1/2+it, f\times g) \ll_{N,t,g,\varepsilon} k^{18/(19-2\theta)+\varepsilon}. \tag{4.6}$$

Theorem 4.2 also enables Liu and Ye [25,26] to derive a subconvexity bound for L(s, f × g) with Maass eigenforms f.

Theorem 4.4. Let f be a Maass Hecke eigenform for Γ_0(N) with Laplace eigenvalue 1/4 + k^2, and let g be a fixed holomorphic or Maass cusp form for Γ_0(N), or for Γ_0(N′) with (N, N′) = 1. Then for any small ε > 0,

$$L(1/2+it, f\times g) \ll_{N,t,g,\varepsilon} k^{(15+2\theta)/16+\varepsilon}. \tag{4.7}$$


5. The spectral method: meromorphic continuation to σ > 1/2

If we allow the occurrence of poles, D_g(s, ν_1, ν_2, h) can indeed be meromorphically continued to a wider region. According to (4.5), if we continue D_g(s, ν_1, ν_2, h) to σ > 1/2, the possible poles are those at s_j = 1/2 + it_j, where 0 < it_j ≤ θ with λ_j = s_j(1 − s_j) being exceptional Laplace eigenvalues. As predicted by the GRC, these poles should not exist. Since we do not assume GRC, we will have to control the residues at these possible poles. Furthermore, we may refine Sarnak's Theorem 4.2 in the t-aspect via the mean square estimate in Good [11] rather than the term-wise bound. Good's result was proved originally for holomorphic cusp forms of weight l ≥ 4. It was generalized recently to the other cases by Krötz and Stanton [22]. As we will see, switching from an individual bound for ⟨V, φ_j⟩ to a mean square estimate provides a significant saving.

In view of (4.3), (4.4), and Theorem 4.1, we introduce the following functions

$$B_j(h,s) = (2\pi)^{s+l-1}(\nu_1\nu_2)^{(l-1)/2}\,\frac{\langle U_h(\cdot,s),\phi_j\rangle}{\Gamma(s+l-1)} = (\nu_1\nu_2)^{(l-1)/2}\,\frac{2^{s+l-3}\pi^{l-1/2}}{\Gamma(s+l-1)}\,\frac{\rho_j(-h)}{|h|^{s-1/2}}\,\Gamma\left(\frac{s-1/2+it_j}{2}\right)\Gamma\left(\frac{s-1/2-it_j}{2}\right), \tag{5.1}$$

$$C_{\mathfrak a}(h,s,\tau) = (2\pi)^{s+l-1}(\nu_1\nu_2)^{(l-1)/2}\,\frac{\langle U_h(\cdot,s),E_{\mathfrak a}(\cdot,1/2+i\tau)\rangle}{\Gamma(s+l-1)} = (\nu_1\nu_2)^{(l-1)/2}\,\frac{2^{s+l-2}\pi^{l-i\tau}}{\Gamma(s+l-1)}\,\frac{\rho_{\mathfrak a}(1/2+i\tau,-h)}{\Gamma(1/2-i\tau)|h|^{s-1/2+i\tau}}\,\Gamma\left(\frac{s-1/2+i\tau}{2}\right)\Gamma\left(\frac{s-1/2-i\tau}{2}\right), \tag{5.2}$$

and denote by R_h(s) the following sum over the exceptional eigenvalues,

$$R_h(s) = \sum_{1/2\le s_j\le 1/2+\theta}\frac{(\nu_1\nu_2)^{(l-1)/2}\,2^{s+l-3}\pi^{l-1/2}}{\Gamma(s+l-1)}\,\frac{\rho_j(-h)}{|h|^{s-1/2}}\,\Gamma\left(\frac{s-(1-s_j)}{2}\right)\Gamma\left(\frac{s-s_j}{2}\right)\langle V,\phi_j\rangle. \tag{5.3}$$

Note here that we include the possible nonexceptional eigenvalue λ_j = 1/4 with s_j = 1/2 and t_j = 0 in R_h(s) just for technical simplicity. Then, for σ > 1,

$$D_g(s,\nu_1,\nu_2,h) - R_h(s) = \sum_{j:\ t_j>0} B_j(h,s)\langle V,\phi_j\rangle + \frac{1}{4\pi}\sum_{\mathfrak a}\int_{-\infty}^{\infty} C_{\mathfrak a}(h,s,\tau)\langle V,E_{\mathfrak a}(\cdot,1/2+i\tau)\rangle\,d\tau. \tag{5.4}$$

Since Rh (s) is a finite sum and hV, φj i ≪ kV kkφj k ≪ν1 ,ν2 ,g 1, it follows that Rh (s) is analytic in the half-plane σ > 0 except for poles at sj and 1 − sj . By Sarnak [34, (A.16)], we can choose {φj } to be Hecke eigenforms such that µ ¶ (mN tj )ε πtj ρj (m) ≪ε √ cosh mθ . (5.5) 2 N Inserting (5.5) into (5.3), and then applying Stirling’s formula, we deduce, for 1/2 ≤ σ ≤ 2 and |t| ≥ 1, Rh (s) ≪ν1 ,ν2 ,g |h|1/2+θ−σ+ε .

(5.6)

However, the above estimate is not true in the region $1/2 \le \sigma \le 2$ and $|t| \le 1$, since the factor $\Gamma((s-s_j)/2)$ in (5.3) has a pole at $s = s_j = 1/2+it_j$ with $0 < it_j \le \theta$ as in (4.5). Obviously, these poles lie in the interval $[1/2, 1/2+\theta] \subset [1/2, 1]$. This is why we require $|t| \ge 1$ in Theorem 5.1 below. By Theorem 4.1, $B_j(h,s)$ (when $t_j \ge 0$) and $C_a(h,s,\tau)$ are holomorphic in $\sigma > 1/2$. The right side of (5.4) is analytically continued to a holomorphic function on $\sigma > 1/2$, provided that uniform convergence on compact sets is justified. From (5.5), we infer that for $1/2+\varepsilon \le \sigma \le 3/2$,
\[
e^{-\pi t_j/2} B_j(h,s) \ll_{l,N} |h|^{1/2-\sigma+\theta+\varepsilon}\, t_j^{\varepsilon}\,
\frac{(1+||t|-t_j|)^{\sigma/2-3/4}}{(1+|t|)^{\sigma/2+l-3/4}}\;
e^{-\pi(|t-t_j|+|t+t_j|-2|t|)/4}, \tag{5.7}
\]
and, with the spectral large sieve in place of (5.5),
\[
e^{-\pi|\tau|/2} C_a(h,s,\tau) \ll_{l,N} |h|^{1/2-\sigma+\varepsilon}\,(1+|\tau|)^{\varepsilon}\,
\frac{(1+||t|-\tau|)^{\sigma/2-3/4}}{(1+|t|)^{\sigma/2+l-3/4}}\;
e^{-\pi(|t-\tau|+|t+\tau|-2|t|)/4}. \tag{5.8}
\]

To verify the uniform convergence of (5.4) on compact sets, we assume for instance l ≥ 4 and invoke Good [11, Theorem 1]. The function V is of

SHIFTED CONVOLUTION SUMS OF FOURIER COEFFICIENTS OF CUSP FORMS


different form from that of $f$ there; nonetheless, Good's result still covers our case. This is because his proof applies to $f_l(z) = y^k F(z) P_l(z)$ where $F$ and $P_l$ are a cusp form and a Poincaré series for $\Gamma$, respectively; see [11, (3.2)] and [11, §4]. Note that $g(\nu_1 z)$ and $g(\nu_2 z)$ are cusp forms for $\Gamma$, and therefore $g(\nu_2 z)$ can be written as a linear combination of the Poincaré series. Hence,
\[
\sum_{t_j\le T} |\langle V,\phi_j\rangle|^2 e^{\pi t_j}
+ \frac{1}{4\pi}\sum_a \int_{-T}^{T} |\langle V,E_a(\cdot,1/2+i\tau)\rangle|^2 e^{\pi|\tau|}\,d\tau \ll T^{2l}. \tag{5.9}
\]

The estimate (5.9) is also valid for other cases, by Krötz and Stanton [22]. Plainly $|t-\tau|+|t+\tau|-2|t| \ge |\tau|$ if $|\tau| \ge 2|t|$. Thus, by (5.7) and Weyl's law
\[
\#\{j : t_j \le T\} = cT^2 + O(T\log T) \tag{5.10}
\]
for $T \ge 2|t|$, we have
\[
\sum_{t_j\ge T} e^{-\pi t_j}|B_j(h,s)|^2 \ll |h|^{1+2\theta-2\sigma+\varepsilon}(1+|t|)^{3/2-\sigma-2l}\sum_{t_j\ge T} e^{-\pi t_j/4}
\ll |h|^{1+2\theta-2\sigma+\varepsilon}\, e^{-3T/4}. \tag{5.11}
\]
Also, by (5.8), we have, for $T \ge 2|t|$,
\[
\int_{|\tau|\ge T} e^{-\pi|\tau|}\,|C_a(h,s,\tau)|^2\,d\tau \ll |h|^{1-2\sigma+\varepsilon}\, e^{-3T/4}
\ll |h|^{1+2\theta-2\sigma+\varepsilon}\, e^{-3T/4}. \tag{5.12}
\]

Now assume T0 ≥ 2|t|. Dividing dyadically and applying the CauchySchwarz inequality, we obtain X |Bj (h, s)hV, φj i| j: T0 −1/2 To reach the Weyl bound as in (1.4), we need to meromorphically continue Dg (s, ν1 , ν2 , h) further to the left. 6.1. Further meromorphic continuation to σ > −1/2 First, let us look at Rh (s) in (5.3). As Rh (s) is a finite sum and hV, φj i ≪ kV kkφj k ≪ν1 ,ν2 ,g 1, Rh (s) is analytic in the complex plane except for poles lying on the real axis, which arise from the two gamma functions. In particular, on the halfplane σ > 0, there are only finitely many poles at sj and 1 − sj lying in the interval [1/2 − θ, 1/2 + θ] ⊂ [0, 1]. Using Stirling’s formula, we deduce from

(5.5) in the same way as we deduce (5.6) that, for $|\sigma| \le A_0$ and $|t| \ge 1$,
\[
R_h(\sigma+it) \ll_{A_0} \frac{|\rho_j(-h)|}{|h|^{\sigma-1/2}\,|\Gamma(\sigma+l-1+it)|}
\left|\Gamma\!\left(\frac{s-s_j}{2}\right)\Gamma\!\left(\frac{s-(1-s_j)}{2}\right)\right| |\langle V,\phi_j\rangle|
\ll |h|^{1/2-\sigma+\theta+\varepsilon}|t|^{-l} \ll |h|^{1/2-\sigma+\theta+\varepsilon}. \tag{6.1}
\]

Now let us turn to the first sum on the right side of (5.4). Recall (5.1) and (5.2) and write
\[
B_j(s) = (\nu_1\nu_2)^{(l-1)/2}\,\frac{2^{s+l-3}\pi^{l-1/2}}{\Gamma(s+l-1)}\,
\Gamma\!\left(\frac{s-1/2+it_j}{2}\right)\Gamma\!\left(\frac{s-1/2-it_j}{2}\right), \tag{6.2}
\]
\[
C_a(s,\tau) = (\nu_1\nu_2)^{(l-1)/2}\,\frac{2^{s+l-2}\pi^{l-i\tau}}{\Gamma(s+l-1)}\,
\Gamma\!\left(\frac{s-1/2+i\tau}{2}\right)\Gamma\!\left(\frac{s-1/2-i\tau}{2}\right). \tag{6.3}
\]

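Then from (5.4), (5.1), (5.2), (6.2), and (6.3) we have
\[
D_g(s,\nu_1,\nu_2,h) - R_h(s) = \sum_{j:\ t_j>0} \frac{\rho_j(-h)}{|h|^{s-1/2}}\,B_j(s)\,\langle V,\phi_j\rangle
+ \frac{1}{4\pi}\sum_a \int_{-\infty}^{\infty} \frac{\rho_a(1/2+i\tau,-h)}{\Gamma(1/2-i\tau)\,|h|^{s-1/2+i\tau}}\,C_a(s,\tau)\,\langle V,E_a(\cdot,1/2+i\tau)\rangle\,d\tau. \tag{6.4}
\]
We deduce from (6.2) that
\[
B_j(s) \ll_{l,\nu_1,\nu_2,\varepsilon} T \tag{6.5}
\]
for $0 \le t_j \le 2T$ and $-1/2 \le \sigma \le 3/2$. Similarly, from (6.3) we derive that $C_a(s,\tau) \ll_{l,\nu_1,\nu_2,\varepsilon} T$ for $|\tau| \le 2T$ and $|s-(1/2\pm it_j)| \ge \varepsilon$ or $|s-(1/2\pm i\tau)| \ge \varepsilon$. Besides, we may deduce that
\[
\int_{|t|\asymp T} \left|B_j\!\left(\tfrac12+\varepsilon+it\right)\right|^2 dt
\quad\text{and}\quad
\int_{|t|\asymp T} \left|C_a\!\left(\tfrac12+\varepsilon+it,\tau\right)\right|^2 dt
\ \ll_{l,\nu_1,\nu_2,\varepsilon}\ T^{1-2l} \tag{6.6}
\]
and
\[
\int_{|t|\asymp T} \left|B_j\!\left(-\tfrac12+it\right)\right|^2 dt,\quad
\int_{|t|\asymp T} \left|C_a\!\left(-\tfrac12+it,\tau\right)\right|^2 dt
\ \ll_{l,\nu_1,\nu_2}\ T^{2-2l}. \tag{6.7}
\]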

6.2. Illustration for the proof of Theorem 1.1 The proof of Theorem 1.1 follows the line of arguments in [34], and our salient point is a delicate study on the Mellin transform of the shifted convolution sum against an oscillatory function. We need to give it a good upper estimate. To do so, we decompose spectrally the shifted convolution sum. The oscillatory function is given by an exponential integral, to which we apply the stationary phase method to extract the main part. Our desired estimate then follows from the spectral large sieve inequality and an estimate of Good on inner products of eigenfunctions. The spectral decomposition of shifted convolution sum is powerful and interesting on its own. It plays a key role in [34] as well, but there, Sarnak considered only for his purpose the analytic continuation to the plane σ > 1/2 + θ. We need a more precise form so that the meromorphic continuation is carried out to the wider region σ > −1/2. To illustrate the crucial roles played by the meromorphic continuation to σ > −1/2 and bounds in (6.7), let us look at [24, (9.25)]: Σ′′d (C, T )ℓ0 =

1 2πi

Z

X

ℓ′′ j: 0 1, respectively. Finally, µ(k) denotes the M¨obius function. We recall that µ(1) = 1, µ(k) = 0 if k > 2 is not squarefree and µ(k) = (−1)ω(k) otherwise. 1.3. Acknowledgements The author is very grateful to the organisers of the 4th China-Japan Seminar on number theory Jianya Liu and Shigeru Kanemitsu for their kind invitation to this meeting and help with preparation of this manuscript.


IGOR E. SHPARLINSKI

The author would like to thank Arne Winterhof for careful reading of the manuscript and making many valuable suggestions; in particular, the equation (4.6) is due to him. This work was supported in part by ARC grant DP0556431.

2. Number Theory Background

2.1. Exponential and Character Sums

We have already mentioned the prominent role of Kloosterman sums and the bound (1.1) in particular. Most of the works also use the identity
\[
\frac{1}{m}\sum_{r\in\mathbb{Z}/m\mathbb{Z}} e_m(rv) =
\begin{cases} 1 & \text{if } v \equiv 0 \pmod m,\\ 0 & \text{if } v \not\equiv 0 \pmod m \end{cases} \tag{2.1}
\]
to express various characteristic functions and thus relate various counting questions to exponential sums. It is very often complemented by the bound
\[
\sum_{z=W+1}^{W+Z} e_m(rz) \ll \min\{Z,\ m/|r|\} \tag{2.2}
\]
which holds for any integers $r$, $W$ and $Z \ge 1$ with $0 < |r| \le m/2$, see [39, Bound (8.6)]. We need the estimate from [63] of exponential sums with rational functions of special type, which generalises the bound (1.1) of Kloosterman sums.

Lemma 2.1. Let $n_1,\dots,n_s$ be nonzero pairwise distinct integers. Then the bound
\[
\max_{\gcd(a_1,\dots,a_s,m)=d}\ \sum_{\substack{z=1\\ \gcd(z,m)=1}}^{m} e_m\!\left(a_1 z^{n_1}+\dots+a_s z^{n_s}\right) \ll d^{1/s}\, m^{1-1/s+o(1)}
\]
holds, where the implied constant depends only on $n_1,\dots,n_s$.

However, in many cases using bounds of multiplicative character sums yields stronger results. Let $\Phi_m$ be the set of all $\varphi(m)$ multiplicative characters modulo $m$. We have the following analogue of (2.1). For any integer $r$,
\[
\frac{1}{\varphi(m)}\sum_{\chi\in\Phi_m}\chi(r) =
\begin{cases} 1 & \text{if } r \equiv 1 \pmod m,\\ 0 & \text{otherwise.} \end{cases} \tag{2.3}
\]
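The identity (2.1) and the bound (2.2) are easy to check numerically. The following minimal sketch assumes the usual normalisation e_m(t) = exp(2πit/m); the function names are illustrative only and are not taken from the paper. For (2.2) it uses the explicit constant 1/2 that the geometric series gives when 0 < |r| ≤ m/2.

```python
import cmath

def e(m, t):
    """Additive character e_m(t) = exp(2*pi*i*t/m) (assumed normalisation)."""
    return cmath.exp(2j * cmath.pi * t / m)

def indicator(m, v):
    """Left-hand side of (2.1): (1/m) * sum of e_m(r*v) over r modulo m."""
    return sum(e(m, r * v) for r in range(m)) / m

def interval_sum(m, r, W, Z):
    """Sum of e_m(r*z) over z = W+1, ..., W+Z, as in (2.2)."""
    return sum(e(m, r * z) for z in range(W + 1, W + Z + 1))

if __name__ == "__main__":
    m = 12
    print(abs(indicator(m, 24)), abs(indicator(m, 5)))
    for r in range(1, m // 2 + 1):
        print(r, abs(interval_sum(m, r, 3, 20)), min(20, m / (2 * r)))
```

The printed values illustrate that the normalised sum detects v ≡ 0 (mod m), and that the incomplete sums stay below min{Z, m/(2|r|)}.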

DISTRIBUTION OF POINTS ON MODULAR HYPERBOLAS


We also use $\chi_0$ to denote the principal character. The following result is a combination of the Pólya-Vinogradov (for $\nu = 1$) and Burgess (for $\nu \ge 2$) bounds, see [39, Theorems 12.5 and 12.6].

Lemma 2.2. For any integers $W$ and $Z$ with $1 \le Z \le m$, the bound
\[
\max_{\substack{\chi\in\Phi_m\\ \chi\ne\chi_0}} \left|\sum_{z=W+1}^{W+Z}\chi(z)\right| \le Z^{1-1/\nu}\, m^{(\nu+1)/(4\nu^2)+o(1)}
\]
holds with $\nu = 1, 2, 3$ for any $m$ and with arbitrary positive integer $\nu$ if $m = p$ is a prime.

The identity (2.3) immediately implies that for $1 \le Z \le m$
\[
\sum_{\chi\in\Phi_m}\left|\sum_{z=W+1}^{W+Z}\chi(z)\right|^2 = \varphi(m)\sum_{\substack{z=W+1\\ \gcd(z,m)=1}}^{W+Z} 1 \le \varphi(m)Z \tag{2.4}
\]
which has been used in many works on $H_{a,m}$. Furthermore, it turns out that sometimes one gets better results using the following 4th moment estimate from [3] (for prime $m = p$) and [28] (for arbitrary $m$), see also [32].

Lemma 2.3. For an arbitrary integer $W$, if $m = p$ is a prime, and for $W = 0$ for arbitrary $m$, and an arbitrary positive integer $Z \le m$, the bound
\[
\sum_{\chi\in\Phi_m}\left|\sum_{z=W+1}^{W+Z}\chi(z)\right|^4 \le m^{1+o(1)} Z^2
\]
holds.

2.2. Theory of Uniform Distribution

For a finite set $\mathcal{F} \subseteq [0,1]^s$ of the $s$-dimensional unit cube, we define its discrepancy with respect to a domain $\Xi \subseteq [0,1]^s$ as
\[
\Delta(\mathcal{F},\Xi) = \left|\frac{\#\{f\in\mathcal{F} : f\in\Xi\}}{\#\mathcal{F}} - \lambda(\Xi)\right|,
\]
where $\lambda$ is the Lebesgue measure on $[0,1]^s$. We now define the discrepancy of $\mathcal{F}$ as
\[
D(\mathcal{F}) = \sup_{\Pi\subseteq[0,1]^s} \Delta(\mathcal{F},\Pi),
\]
where the supremum is taken over all boxes $\Pi = [\alpha_1,\beta_1)\times\dots\times[\alpha_s,\beta_s) \subseteq [0,1]^s$. A link between the discrepancy and exponential sums is provided by the celebrated Koksma–Szüsz inequality, see [22, Theorem 1.21]. However, for points of $H_{a,m}$, due to the discrete structure of the problem, one can immediately establish such a link directly by the identity (2.1). For example, one can consider the points
\[
\left(\frac{x}{m},\frac{y}{m}\right)\in[0,1]^2,\qquad (x,y)\in H_{a,m},
\]
and apply the bound (1.1) to estimate their discrepancy, which in turn is equivalent to studying points of $H_{a,m}(\mathcal{X},\mathcal{Y})$ where $\mathcal{X}$ and $\mathcal{Y}$ are sets of consecutive integers. Moreover, the Koksma–Hlawka inequality, see [22, Theorem 1.14], allows one to estimate average values of various functions on the points $(x,y)\in H_{a,m}$.

Lemma 2.4. For any continuous function $\psi(z)$ on the unit cube $z\in[0,1]^s$ and a finite set $\mathcal{F}\subseteq[0,1]^s$ of discrepancy $D(\mathcal{F})$, the following bound holds:
\[
\frac{1}{\#\mathcal{F}}\sum_{f\in\mathcal{F}}\psi(f) = \int_{[0,1]^s}\psi(z)\,dz + O\left(D(\mathcal{F})\right)
\]
where the implied constant depends only on $s$ and the function $\psi$.

To study $H_{a,m}\cap\mathcal{W}$ for more general sets $\mathcal{W}$ some additional tools are required from the theory of uniform distribution. As usual, we define the distance between a vector $u\in[0,1]^s$ and a set $\Xi\subseteq[0,1]^s$ by
\[
\operatorname{dist}(u,\Xi) = \inf_{w\in\Xi}\|u-w\|,
\]
where $\|v\|$ denotes the Euclidean norm of $v$. Given $\varepsilon>0$ and a domain $\Xi\subseteq[0,1]^s$ we define the sets
\[
\Xi_\varepsilon^{+} = \{u\in[0,1]^s\setminus\Xi : \operatorname{dist}(u,\Xi)<\varepsilon\}
\quad\text{and}\quad
\Xi_\varepsilon^{-} = \{u\in\Xi : \operatorname{dist}(u,[0,1]^s\setminus\Xi)<\varepsilon\}.
\]
Let $h(\varepsilon)$ be an arbitrary increasing function defined for $\varepsilon>0$ and such that
\[
\lim_{\varepsilon\to0} h(\varepsilon) = 0.
\]
As in [47,56], we define the class $\mathcal{S}_h$ of domains $\Xi\subseteq[0,1]^s$ for which
\[
\lambda\left(\Xi_\varepsilon^{+}\right)\le h(\varepsilon)\quad\text{and}\quad\lambda\left(\Xi_\varepsilon^{-}\right)\le h(\varepsilon).
\]
A relation between $D(\mathcal{F})$ and $\Delta(\mathcal{F},\Xi)$ for $\Xi\in\mathcal{S}_h$ is given by the following inequality of [47] (see also [56]).

Lemma 2.5. For any domain $\Xi\in\mathcal{S}_h$, we have
\[
\Delta(\mathcal{F},\Xi) \ll h\left(s^{1/2}D(\mathcal{F})^{1/s}\right).
\]

Finally, the following bound, which is a special case of a more general result of H. Weyl [74], shows that if $\Xi$ has a piecewise smooth boundary then $\Xi\in\mathcal{S}_h$ for some linear function $h(\varepsilon) = C\varepsilon$.

Lemma 2.6. For any domain $\Xi\in\mathcal{S}_h$ with piecewise smooth boundary, we have
\[
\lambda\left(\Xi_\varepsilon^{\pm}\right) = O(\varepsilon).
\]

To use the above results for the study of points on $H_{a,m}$, one usually considers points
\[
\left(\frac{x}{m},\frac{y}{m}\right)\in[0,1]^2,\qquad (x,y)\in H_{a,m},\quad 1\le x,y\le m. \tag{2.5}
\]

2.3. Arithmetic Functions, Divisors, Prime Numbers

Certainly some elementary bounds such as
\[
\varphi(k)\gg\frac{k}{\log\log(k+2)}
\quad\text{and}\quad
2^{\omega(k)}\le\tau(k)\ll\exp\left((\log 2+o(1))\frac{\log k}{\log\log k}\right), \tag{2.6}
\]
see [71, Section I.5.2 and I.5.4], appear at various stages of the proofs of relevant results. The following well known consequence of the sieve of Eratosthenes (essentially of the inclusion-exclusion principle expressed via the Möbius function) is very often needed to estimate the main terms of various asymptotic formulas (see, for example, [67,68]).

Lemma 2.7. For any integers $m, Z\ge1$ and $W\ge0$,
\[
\sum_{\substack{z=W+1\\ \gcd(z,m)=1}}^{W+Z} 1 = \frac{\varphi(m)}{m}Z + O\left(2^{\omega(m)}\right).
\]
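Lemma 2.7 can be checked by direct enumeration. The sketch below uses illustrative helper names that are not from the paper. It compares the number of z in (W, W+Z] coprime to m with φ(m)Z/m. Inclusion-exclusion over the 2^ω(m) squarefree divisors of m shows that the difference is smaller than 2^ω(m) in absolute value, and this is what the script checks.

```python
from math import gcd

def phi(m):
    """Euler function, computed naively."""
    return sum(1 for x in range(1, m + 1) if gcd(x, m) == 1)

def omega(m):
    """Number of distinct prime divisors of m."""
    count, p = 0, 2
    while p * p <= m:
        if m % p == 0:
            count += 1
            while m % p == 0:
                m //= p
        p += 1
    return count + (1 if m > 1 else 0)

def coprime_count(m, W, Z):
    """Number of z in {W+1, ..., W+Z} with gcd(z, m) = 1."""
    return sum(1 for z in range(W + 1, W + Z + 1) if gcd(z, m) == 1)

if __name__ == "__main__":
    for m, W, Z in [(30, 7, 100), (210, 0, 57), (97, 1000, 40)]:
        exact = coprime_count(m, W, Z)
        main = phi(m) * Z / m
        print(m, W, Z, exact, round(main, 3), 2 ** omega(m))
```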


For an infinite monotonically increasing sequence of positive integers $\mathcal{A} = (a_n)_{n=1}^{\infty}$, we define
\[
H(x,y,z;\mathcal{A}) = \#\{n\le x : \exists\, d\mid a_n \text{ with } y<d\le z\}.
\]
For $\mathcal{A} = \mathbb{N}$, the set of natural numbers, the order of magnitude of $H(x,y,z;\mathbb{N})$ for all $x, y, z$ has been determined in [25], see also [38]. Also in [25], one can find upper bounds for $H(x,y,z;P_b)$ of the expected order of magnitude, where $P_b = \{p+b : p \text{ prime}\}$ is a set of so-called shifted primes. However, for the problem of studying $H_{a,m}$, we need analogous results where $n$ is restricted to an arithmetic progression. More precisely, let us define the sequences
\[
T_k = \{mk-1 : m\in\mathbb{N}\}\quad\text{and}\quad U_k = \{pk-1 : p \text{ prime}\}.
\]
It has been shown in [26] that the arguments of [25] imply the following estimates. It is usual that in questions of this kind, the constant
\[
\kappa = 1-\frac{1+\log\log 2}{\log 2} = 0.086071\ldots \tag{2.7}
\]
plays an important role, see also [38].

Lemma 2.8. Uniformly for $100\le y\le x^{0.51}$, $1.1y\le z\le y^{1.1}$, $1\le k\le\log x$, we have
\[
H(x,y,z;T_k)\ll x\,\frac{k}{\varphi(k)}\,u^{\kappa}(\log(1/u))^{-3/2},
\qquad
H(x,y,z;U_k)\ll x\,\frac{k}{\varphi(k)}\,u^{\kappa}(\log(1/u))^{-3/2},
\]
where $z = y^{1+u}$.

A certain result of [26] relies on the existence of infinitely many primes $p$ with a prescribed structure of divisors of $p-1$, which is established using a very deep result of [10] concerning the Bombieri-Vinogradov theorem. For an integer $k>1$ we write
\[
T(k) = \max_{i=1,\dots,\tau(k)-1}\frac{d_{i+1}}{d_i}
\]
where $1 = d_1<\dots<d_{\tau(k)} = k$ are the positive divisors of $k$. By [60, Theorem 1], we have:

Lemma 2.9. Uniformly in $z\ge t\ge2$,
\[
\frac{z\log t}{\log z}\gg\#\{k\le z : T(k)\le t\}\gg\frac{z\log t}{\log z}.
\]
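The quantity T(k), the largest ratio of consecutive divisors of k, is straightforward to compute. The sketch below uses a naive divisor enumeration, so it is only meant for small k. The function names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def divisors(k):
    """Sorted list of the positive divisors of k (naive, for small k)."""
    return [d for d in range(1, k + 1) if k % d == 0]

def T(k):
    """Largest ratio d_{i+1}/d_i of consecutive divisors of k, for k > 1."""
    ds = divisors(k)
    return max(Fraction(b, a) for a, b in zip(ds, ds[1:]))

if __name__ == "__main__":
    for k in (6, 7, 12, 30, 64):
        print(k, T(k))
```

For a prime k the only divisors are 1 and k, so T(k) = k, while integers with many closely spaced divisors have small T(k).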


Finally, we remark that several interesting results about the distribution of points on $H_{a,m}(\mathcal{X},\mathcal{Y})$ on average over $a$, for some special sets $\mathcal{X}$ and $\mathcal{Y}$, such as intervals $\mathcal{X} = \mathcal{Y} = \{z : 1\le z\le m/2\}$, are based on various asymptotic formulas for average values of Dirichlet $L$-functions, see, for example, [48,75,86].

3. Distribution of Points on $H_{a,m}$

3.1. Points on $H_{a,m}$ in Intervals for All $a$

A classical conjecture asserts that for any fixed $\varepsilon>0$ and a sufficiently large $p$, for every integer $a$ there are integers $x$ and $y$ with $|x|,|y|\le p^{1/2+\varepsilon}$ and such that $xy\equiv a \pmod p$, see [30,33–35] and references therein. The question has probably been motivated by the following observation. Using the Dirichlet pigeon-hole principle, one can easily show that for every integer $a$ there are integers $x$ and $y$ with $|x|,|y|\le 2p^{1/2}$ with $y/x\equiv a \pmod p$. Unfortunately, this is known only with $|x|,|y|\ge Cp^{3/4}$ for some absolute constant $C>0$, which is shown in [31]. Several modifications of this bound, for example for composite $m$, are also known, see [44]. These results are based on the bound (1.1) of Kloosterman sums (and its more precise form in the case when $m = p$ is a prime) combined with some other standard arguments. The same arguments also produce the following estimate, which is a slight generalisation of several previously known results, see [5,29] and references therein.

Theorem 3.1. Let $\mathcal{X} = \{U+1,\dots,U+X\}$, where $X\ge1$ and $U\ge0$ are arbitrary integers. Suppose that for every $x\in\mathcal{X}$ we are given a set $\mathcal{Y}_x = \{V_x+1,\dots,V_x+Y\}$ where $Y\ge1$ and $V_x\ge0$ are arbitrary integers. Then for any integer $m\ge1$ and $a$ with $\gcd(a,m) = 1$, we have
\[
\sum_{\substack{(x,y)\in H_{a,m}\\ x\in\mathcal{X},\ y\in\mathcal{Y}_x}} 1 = XY\,\frac{\varphi(m)}{m^2} + O\left(m^{1/2+o(1)}\right).
\]
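To get a feeling for Theorem 3.1, one can count the solutions of xy ≡ a (mod m) in a box and compare the count with the main term XYφ(m)/m². The sketch below is an illustration under simplifying assumptions: it takes the same shift V for every x (the theorem allows V_x to depend on x), and the function names are hypothetical. It requires Python 3.8+ for the modular inverse via `pow(x, -1, m)`.

```python
from math import gcd

def phi(m):
    """Euler function, computed naively."""
    return sum(1 for x in range(1, m + 1) if gcd(x, m) == 1)

def count_points(m, a, U, X, V, Y):
    """Number of (x, y) with xy = a (mod m), U < x <= U+X, V < y <= V+Y.

    Assumes gcd(a, m) = 1 and m > 1; the same shift V is used for all x.
    """
    total = 0
    for x in range(U + 1, U + X + 1):
        if gcd(x, m) != 1:
            continue
        y0 = (a * pow(x, -1, m)) % m  # the residue class y must belong to
        # number of integers y in (V, V+Y] congruent to y0 modulo m
        total += (V + Y - y0) // m - (V - y0) // m
    return total

def main_term(m, X, Y):
    """Main term X*Y*phi(m)/m^2 of Theorem 3.1."""
    return X * Y * phi(m) / m**2

if __name__ == "__main__":
    m, a = 10007, 5
    for X, Y in [(m, m), (4000, 3000), (1500, 1500)]:
        print(X, Y, count_points(m, a, 17, X, 29, Y), round(main_term(m, X, Y), 1))
```

For a full period X = Y = m the count equals φ(m) exactly; for shorter boxes the discrepancy from the main term is what the theorem bounds by m^(1/2+o(1)).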

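Proof. Using (2.1) we write
\[
\sum_{\substack{(x,y)\in H_{a,m}\\ x\in\mathcal{X},\ y\in\mathcal{Y}_x}} 1
= \frac{1}{m^2}\sum_{\substack{(x,y)\in H_{a,m}\\ 1\le x,y\le m}}\ \sum_{w\in\mathcal{X}}\ \sum_{z\in\mathcal{Y}_w}\ \sum_{r,s\in\mathbb{Z}/m\mathbb{Z}} e_m\left(r(x-w)+s(y-z)\right)
= \frac{1}{m^2}\sum_{r,s\in\mathbb{Z}/m\mathbb{Z}} K_m(r,as)\sum_{w\in\mathcal{X}} e_m(-rw)\sum_{z\in\mathcal{Y}_w} e_m(-sz).
\]
We now separate the main term which corresponds to $r = s = 0$ and is equal to
\[
\frac{XY}{m^2}\sum_{\substack{(x,y)\in H_{a,m}\\ 1\le x,y\le m}} 1 = XY\,\frac{\varphi(m)}{m^2}.
\]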

For the error term E, for each divisor d|m, we collect together pairs (r, s) with the same value gcd(r, s, m) = d. Applying the bounds (1.1) and (2.2) we obtain E ≪ m1/2+o(1)

≪ m1/2+o(1) ≪ m1/2+o(1)

X

d|m d m1/2 (log m)κ/2 (log log m)3/4−ε

max{|x|, |y| : xy ≡ 1 holds:

• for all positive integers $m\le M$, except for possibly $o(M)$ of them,
• for all prime $m = p\le M$ except for possibly $o(M/\log M)$ of them.

Similar questions about the ratios $x/y$ have also been studied, see [30,34,61]. The result of [34] shows that almost all reduced classes modulo $m$ can be represented as $xy$ with $1\le x,y\le m^{1/2+\varepsilon}$. However, it does not imply that these products are uniformly distributed in reduced residue classes, which is sometimes required in applications. In this respect, the following bound is a minor modification of a result of [68] and gives the desired uniformity of distribution for $1\le x\le X$, $V+1\le y\le V+Y$ provided that $X, Y\ge m^{1/2+\varepsilon}$ for a fixed $\varepsilon>0$ and sufficiently large integer $m$. In turn, it is based on some ideas from [4].

Theorem 3.2. Let $\mathcal{X} = \{1,\dots,X\}$ and $\mathcal{Y} = \{V+1,\dots,V+Y\}$ where $X, Y\ge1$ and $V\ge0$ are arbitrary integers. Then for any integer $m\ge1$,
\[
\sum_{\substack{a=1\\ \gcd(a,m)=1}}^{m}\left|\#H_{a,m}(\mathcal{X},\mathcal{Y}) - XY\,\frac{\varphi(m)}{m^2}\right|^2 \ll X(X+Y)\,m^{o(1)}.
\]

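Proof. We rewrite the congruence $xy\equiv a \pmod m$ as $y\equiv ax^{-1} \pmod m$ (where the inversion is taken modulo $m$). Using the identity (2.1), we write
\[
\#H_{a,m}(\mathcal{X},\mathcal{Y})
= \frac{1}{m}\sum_{\substack{x=1\\ \gcd(x,m)=1}}^{X}\ \sum_{y=V+1}^{V+Y}\ \sum_{-(m-1)/2\le r\le m/2} e_m\left(r(ax^{-1}-y)\right)
= \frac{1}{m}\sum_{-(m-1)/2\le r\le m/2} e_m(-rV)\sum_{\substack{x=1\\ \gcd(x,m)=1}}^{X} e_m\left(arx^{-1}\right)\sum_{y=1}^{Y} e_m(-ry).
\]
By Lemma 2.7, the main term corresponding to $r = 0$ is
\[
\frac{1}{m}\sum_{\substack{x=1\\ \gcd(x,m)=1}}^{X}\ \sum_{y=1}^{Y} 1 = XY\,\frac{\varphi(m)}{m^2} + O\left(Ym^{-1+o(1)}\right).
\]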

Hence
\[
\#H_{a,m}(\mathcal{X},\mathcal{Y}) - XY\,\frac{\varphi(m)}{m^2} \ll \frac{1}{m}E_{a,m}(X,Y) + Ym^{-1+o(1)},
\]
where
\[
E_{a,m}(X,Y) = \sum_{1\le|r|\le m/2}\left|\sum_{\substack{x=1\\ \gcd(x,m)=1}}^{X} e_m\left(arx^{-1}\right)\right|\left|\sum_{y=1}^{Y} e_m(-ry)\right|.
\]
Using the Cauchy inequality we have
\[
\sum_{a=1}^{m}\left|\#H_{a,m}(\mathcal{X},\mathcal{Y}) - XY\,\frac{\varphi(m)}{m^2}\right|^2 \ll \frac{1}{m^2}\sum_{a=1}^{m}E_{a,m}(X,Y)^2 + Y^2m^{-1+o(1)}. \tag{3.4}
\]
We now put $J = \lfloor\log(Y/2)\rfloor$ and define the sets
\[
\mathcal{R}_0 = \left\{r : 1\le|r|\le\frac{m}{Y}\right\},\qquad
\mathcal{R}_j = \left\{r : e^{j-1}\frac{m}{Y}<|r|\le e^{j}\frac{m}{Y}\right\},\quad j = 1,\dots,J,\qquad
\mathcal{R}_{J+1} = \left\{r : e^{J}\frac{m}{Y}<|r|\le m/2\right\}
\]
(we can certainly assume that $J\ge1$ since otherwise the bound is trivial). Applying the Cauchy inequality again, we deduce
\[
E_{a,m}(X,Y)^2 \le (J+2)\sum_{j=0}^{J+1}E_{a,m,j}(X,Y)^2, \tag{3.5}
\]
where
\[
E_{a,m,j}(X,Y) = \sum_{r\in\mathcal{R}_j}\left|\sum_{\substack{x=1\\ \gcd(x,m)=1}}^{X} e_m\left(arx^{-1}\right)\right|\left|\sum_{y=1}^{Y} e_m(-ry)\right|.
\]
Using (2.2), we conclude that
\[
\sum_{y=1}^{Y} e_m(-ry) \ll e^{-j}Y
\]
for $r\in\mathcal{R}_j$, $j = 0,\dots,J+1$. Thus
\[
E_{a,m,j}(X,Y) \ll e^{-j}Y\left|\sum_{r\in\mathcal{R}_j}\vartheta_r\sum_{\substack{x=1\\ \gcd(x,m)=1}}^{X} e_m\left(arx^{-1}\right)\right|,\qquad j = 0,\dots,J+1,
\]

for some complex numbers $\vartheta_r$ with $|\vartheta_r|\le1$ for $|r|\le m/2$. Therefore,
\[
\sum_{a=1}^{m}E_{a,m,j}(X,Y)^2 \ll e^{-2j}Y^2\sum_{a=1}^{m}\left|\sum_{r\in\mathcal{R}_j}\ \sum_{\substack{x=1\\ \gcd(x,m)=1}}^{X}\vartheta_r\,e_m\left(arx^{-1}\right)\right|^2
= e^{-2j}Y^2\sum_{r_1,r_2\in\mathcal{R}_j}\vartheta_{r_1}\overline{\vartheta_{r_2}}\sum_{\substack{|x_1|,|x_2|\le X\\ \gcd(x_1x_2,m)=1}}\ \sum_{a=1}^{m} e_m\left(a\left(r_1x_1^{-1}-r_2x_2^{-1}\right)\right).
\]
Clearly the inner sum vanishes if $r_1x_1^{-1}\not\equiv r_2x_2^{-1} \pmod m$ and is equal to $m$ otherwise. Therefore
\[
\sum_{a=1}^{m}E_{a,m,j}(X,Y)^2 \ll e^{-2j}Y^2mT_j, \tag{3.6}
\]
where $T_j$ is the number of solutions to the congruence
\[
r_1x_2\equiv r_2x_1 \pmod m,\qquad r_1,r_2\in\mathcal{R}_j,\quad |x_1|,|x_2|\le X,\quad \gcd(x_1x_2,m) = 1.
\]
We now see that if $r_1$ and $x_2$ are fixed, then $r_2$ and $x_1$ are such that their product $s = r_2x_1\ll e^{j}mX/Y$ belongs to a prescribed residue class modulo $m$. Thus there are at most $O\left(e^{j}X/Y+1\right)$ possible values of $s$ and for each fixed $s\ll e^{j}mX/Y$ there are $\tau(s) = m^{o(1)}$ values of $r_2$ and $x_1$ with $s = r_2x_1$, see (2.6). Therefore
\[
T_j \le X\,\#\mathcal{R}_j\left(e^{j}X/Y+1\right)m^{o(1)} = \frac{e^{2j}X^2m^{1+o(1)}}{Y^2} + \frac{e^{j}Xm^{1+o(1)}}{Y}
\]
and after substitution into (3.6) we get
\[
\sum_{a=1}^{m}E_{a,m,j}(X,Y)^2 \ll e^{-2j}Y^2mT_j = X^2m^{2+o(1)} + e^{-j}XYm^{2+o(1)}.
\]
Substituting this bound in (3.5) and recalling (3.4), we conclude the proof.

We note that the proof of Theorem 3.2 can easily be extended to arbitrary sets $\mathcal{X}\subseteq\{1,\dots,X\}$, see [68]. However, it breaks down if $x$ runs through a short interval away from the origin. An alternative approach has been suggested in [32] and is based on bounds of the 4th moment of multiplicative character sums, see Lemma 2.3. If $m = p$ is prime, it can handle such shifted intervals $\mathcal{X} = \{U+1,\dots,U+X\}$ (but not arbitrary sets $\mathcal{X}\subseteq\{1,\dots,X\}$ as that of [68]). Furthermore, the technique of [32] leads to more explicit expressions instead of $m^{o(1)}$ in the error term. Thus,


although the approaches of [32] and [68] complement each other, they still leave some natural open questions.

Question 3.3. Extend Theorem 3.2 to sets $\mathcal{X} = \{U+1,\dots,U+X\}$ with arbitrary $U$.

As we have mentioned, if $m = p$, Question 3.3 is addressed in [32]; however, some of the necessary ingredients are not known for composite $m$. We also note that several results "on average" related to various modifications of the Lehmer problem are given in [48,73,75,86,87]. Finally, we remark that $\#H_{a,m}(\mathcal{X},\mathcal{Y})$ has been studied in [17] for the same sets as in Theorem 3.1, that is, for $\mathcal{X} = \{1,\dots,X\}$ and $\mathcal{Y} = \{V+1,\dots,V+Y\}$, but on average over $V$. It is shown in [17] that in this case one can also obtain stronger bounds than that of Theorem 3.1.

3.3. Points on $H_{a,m}$ in Sets with Arithmetic Conditions

Theorems 3.1 and 3.2 consider the case when $x$ and $y$ belong to sets of consecutive integers. However, studying points on $H_{a,m}$ in other sets is of ultimate interest as well. We start with a very simple observation that no general result of the type of Theorems 3.1 and 3.2 applying to arbitrary sets $\mathcal{X}$ and $\mathcal{Y}$ is possible (even for very massive sets $\mathcal{X}$ and $\mathcal{Y}$). For example, if $m = p$ is a prime and $\mathcal{X} = \mathcal{Y}$ consist of all $(p-1)/2$ quadratic residues modulo $p$, then $H_{a,p}(\mathcal{X},\mathcal{Y}) = \emptyset$ for every quadratic nonresidue $a$. The problem of distribution of pairs of primes $(p,q)\in H_{a,m}$ has been considered in [24]. Unfortunately, it seems that even the Extended Riemann Hypothesis is not powerful enough to get a satisfactory answer to this question, see [24] for details. However, it seems that the method of [64] can be used to study points $(x,y)\in H_{a,m}$ with squarefree $x$ and $y$.

Question 3.4. Obtain an asymptotic formula for
\[
\#\{(x,y)\in H_{a,m}(\mathcal{X},\mathcal{Y}) : x \text{ and } y \text{ are squarefree}\}
\]
where $\mathcal{X} = \{1,\dots,X\}$ and $\mathcal{Y} = \{1,\dots,Y\}$ and $1\le X,Y\le m$ are arbitrary integers.

One can also study the distribution of points $(x,y)\in H_m(\mathcal{X},\mathcal{Y})$ with some prescribed structure of prime factors.
For example, let P+ (k) and P− (k) denote the largest and the smallest prime divisors of an integer k > 1.


Question 3.5. Obtain an asymptotic formula for
\[
\#\{(x,y)\in H_{a,m}(\mathcal{X},\mathcal{Y}) : P_+(xy)\le R\}
\quad\text{and}\quad
\#\{(x,y)\in H_{a,m}(\mathcal{X},\mathcal{Y}) : P_-(xy)\ge r\}
\]
where $\mathcal{X} = \{1,\dots,X\}$ and $\mathcal{Y} = \{1,\dots,Y\}$ and $1\le X,Y\le m$ are arbitrary integers with $R$ and $r$ in reasonably large ranges.

We remark that using elementary sieving arguments one can extend the result of Theorem 3.1 to counting $(x,y)\in H_{a,m}$ such that $x$ is of the largest possible multiplicative order modulo $m$ (and thus so is $y$), which is given by the Carmichael function $\lambda(m)$. In particular, when $m = p$ is a prime, this addresses the problem of counting $(x,y)\in H_{a,m}$ where $x$ is a primitive root modulo $p$, for example, see [5,44]. Proofs of these results usually follow the same standard lines as the proof of Theorem 3.1, except that instead of (1.1) one uses the bound of the same strength on Kloosterman sums twisted with multiplicative characters
\[
\sum_{\substack{(x,y)\in H_m\\ 1\le x,y\le m}}\chi(x)\,e_m(rx+sy) \ll \left(m\gcd(r,s,m)\right)^{1/2+o(1)}.
\]

There are still some delicate issues of getting the $m^{o(1)}$ term as small as possible. Finally, we note that several very interesting results have recently been obtained in [55] about points $(x,y)\in H_{a,m}$ such that $x$ and $y$ have restricted $g$-ary expansions to some fixed base $g\ge2$.

4. Geometric Properties of $H_{a,m}$

4.1. Distances

We observe that the asymptotic formulas (3.2) and (3.3) have a natural interpretation as the bounds on the power moments and the distribution function of the distances between an element $x\in\{1,\dots,m\}$ with $\gcd(x,m) = 1$ and its modular inverse. Several results about the average (over $a$) value of power moments can be found in [48–51], see also references therein. We now define the width $w_{a,m}$ of the set $H_{a,m}$:
\[
w_{a,m} = \max\{|x-y| : (x,y)\in H_{a,m}\}.
\]
We also put
\[
w_m = w_{1,m}
\]
for the width of $H_m$, which has been the main object of study of [26,43,44]. Using Theorem 3.1 one easily derives that
\[
w_{a,m} = m + O\left(m^{3/4+o(1)}\right). \tag{4.1}
\]

Using the same arguments as in [44], one can obtain a more precise expression for the factor $m^{o(1)}$. On the other hand, it has been noticed in [43] that
\[
m-w_m \ge \left\lceil 2\sqrt{m-1}\,\right\rceil
\]
with equality for all $m$ of the form
\[
m = k^2+\ell k+1 \tag{4.2}
\]
with integers $k$ and $\ell$ such that $k>0$, $0\le\ell<2\sqrt{k}+1$ and hence
\[
\liminf_{m\to\infty}\frac{m-w_m}{\sqrt{m}} = 2. \tag{4.3}
\]
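The lower bound for m − w_m and the equality case (4.2) can be confirmed by brute force for small moduli. The sketch below uses illustrative function names and requires Python 3.8+ for `pow(x, -1, m)`. It reads the range of ℓ in (4.2) as 0 ≤ ℓ < 2√k + 1, which for integers is equivalent to ℓ ≤ 1 or (ℓ − 1)² < 4k.

```python
from math import gcd, isqrt

def width(m):
    """w_m = max |x - y| over 1 <= x, y < m with xy = 1 (mod m)."""
    return max(abs(x - pow(x, -1, m)) for x in range(1, m) if gcd(x, m) == 1)

def ceil_two_sqrt(n):
    """ceil(2*sqrt(n)) computed exactly as ceil(sqrt(4n))."""
    r = isqrt(4 * n)
    return r if r * r == 4 * n else r + 1

def special_moduli(kmax):
    """Moduli m = k^2 + l*k + 1 with k > 0 and 0 <= l < 2*sqrt(k) + 1."""
    out = []
    for k in range(1, kmax + 1):
        l = 0
        while l <= 1 or (l - 1) ** 2 < 4 * k:
            out.append(k * k + l * k + 1)
            l += 1
    return sorted(set(out))

if __name__ == "__main__":
    for m in special_moduli(6):
        print(m, m - width(m), ceil_two_sqrt(m - 1))
```

For each modulus printed, the last two columns coincide, in line with the equality claim for m of the form (4.2).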

Question 4.1. Show that there are infinitely many primes $m = p$ of the form (4.2) with $0\le\ell<2\sqrt{k}+1$.

As a curiosity, we recall the following observation from [26]: $m-w_m\le\sqrt{8m}$ for all positive integers $m = 2^s$ with $s\in\mathbb{Z}$. Indeed, if $s$ is even, then $m = (2^{s/2}-1)^2+2(2^{s/2}-1)+1$ is of the form (4.2); if $s$ is odd, then this follows from
\[
\left(2^{(s+1)/2}-1\right)\left(2^{s}-2^{(s+1)/2}-1\right)\equiv1 \pmod{2^s}.
\]
In the opposite direction it is shown in [26] that
\[
\limsup_{m\to\infty}\frac{m-w_m}{\sqrt{m}} = \infty. \tag{4.4}
\]
Furthermore, analogues of (4.3) and (4.4) also hold for prime values $m = p$:
\[
\liminf_{p\to\infty}\frac{p-w_p}{\sqrt{p}} = 2
\quad\text{and}\quad
\limsup_{p\to\infty}\frac{p-w_p}{\sqrt{p}} = \infty,
\]
which follow from the following two results given in [26].

Theorem 4.1. For infinitely many primes $p$, we have
\[
p-w_p \le 2\sqrt{p}+\frac{\sqrt{p}}{\log p}.
\]


Proof. Let $\varepsilon = 1/(4\log Q)$. Using [10] one can show that for sufficiently large $Q$, there is a prime $p$ in the interval $((1-\varepsilon)Q, Q]$ such that $p-1$ has a divisor $d$ in the interval $((1-2\varepsilon)\sqrt{Q}, (1-\varepsilon)\sqrt{Q}]$. If we write $p-1 = df$, then $w_p\ge p-f-d$. But, if $Q$ is so large that $\varepsilon\le0.01$, then
\[
f+d = \frac{p-1}{d}+d \le \frac{Q}{(1-2\varepsilon)\sqrt{Q}}+(1-\varepsilon)\sqrt{Q} \le (2+3\varepsilon)\sqrt{p},
\]
which implies the desired result.

Theorem 4.2. Let $f(M)$ be any positive function tending monotonically to zero as $M\to\infty$. Then the inequality
\[
m-w_m \ge m^{1/2}(\log m)^{\kappa/2}(\log\log m)^{3/4}f(m)
\]
holds:
• for all positive integers $m\le M$, except for possibly $o(M)$ of them,
• for all prime $m = p\le M$ except for possibly $o(M/\log M)$ of them.

Proof. Let $M$ be large and set $z = (\log M)^{\kappa/2}(\log\log M)^{3/4}f(M/2)$. It suffices to show $m-w_m\le zm^{1/2}$ for $o(M)$ of the integers $m$ between $M/2$ and $M$. Without loss of generality, suppose $f(M)\ge1/\log\log M$ for all $M\ge10$. We define $\mathcal{J}_k$ to be the set of positive integers $m\in(M/2,M]$ for which $m-w_m\le zm^{1/2}$ and such that there are $(x,y)\in H_m$ with $w_m = y-x$ and $x(m-y) = km-1$. By the arithmetic-geometric mean inequality, for every $m\in\mathcal{J}_k$, we have
\[
\frac{m-w_m}{2} = \frac{m-y+x}{2} \ge \sqrt{x(m-y)} = \sqrt{km-1}. \tag{4.5}
\]
Thus $\mathcal{J}_k = \emptyset$ for $k\ge z^2+1$. Suppose $1\le k<z^2+1$, $m\in\mathcal{J}_k$, $(x,y)\in H_m$ and $x(m-y) = km-1$. Then
\[
\sqrt{kM/2-1} \le \max(x,m-y) \le z\sqrt{M}.
\]
By Lemma 2.8,
\[
\#\mathcal{J}_k \le H\left(M,\sqrt{kM/2-1},z\sqrt{M};T_k\right) \ll \frac{kM\left(\log(3z^2/k)\right)^{\kappa}}{\varphi(k)(\log M)^{\kappa}(\log\log M)^{3/2}}
\]

which after simple calculations leads to the estimate X #Jk = o(M ) 16k m1/4+o(1) . However we are mostly interested in more precise results which should certainly be based on some additional ideas. Finally, we ask a question of a different flavour, which is about the number of possible directions on the Euclidean plane defined by the pairs of distinct points (x1 , y1 ), (x2 , y2 ) ∈ Ha,m . Question 4.6. Estimate the cardinality of the set ¾ ½ x1 − x2 La,m = : (x1 , y1 ), (x2 , y2 ) ∈ Ha,m , (x1 , y1 ) 6= (x2 , y2 ) . y1 − y2 Obviously Questions 4.5 and 4.6 are influenced by the Erd˝ os and Kakeya problems, respectively, see [13, Sections 5.3 and 7.1] and also surveys [12,42]. They can also be asked for points of Ha,m (X , Y) for various sets X and Y. Finally, motivated by [6,7] and some other works one can also ask various questions about the distribution of the angles of elevation arctan(y/x) of points (x, y) ∈ Ha,m (X , Y) over the horisontal line. 4.2. Convex Hull We consider the convex closure Cm of the point set Hm . It is not hard to see that Cm is always a convex polygon with nonempty interior, except when m = 2, 3, 4, 6, 8, 12, 24, see [45].


Following [45], we denote by $v(m)$ the number of vertices of $C_m$ and by $V(M)$ its average value,
\[
V(M) = \frac{1}{M-1}\sum_{m=2}^{M}v(m).
\]
Using Theorem 3.1, one can easily derive
\[
v(m) \le m^{3/4+o(1)}. \tag{4.7}
\]
Naturally, exactly as Theorem 3.1, the bound (4.7) can be extended to sets of solutions of many other congruences. More interestingly, one can obtain another bound, which is much better in some cases and relies on more specific properties of $H_m$:
\[
v(m) \le T(m-1)\,m^{o(1)}, \tag{4.8}
\]
see [45], where $T(k)$ is defined in §2.3. The lower bound
\[
v(m) \ge 2(\tau(m-1)-1) \tag{4.9}
\]
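For small m the number of vertices v(m) can be computed directly and set against the lower bound (4.9). The sketch below uses Andrew's monotone chain algorithm, which is a standard convex hull method and not necessarily one of the algorithms of [45]. Collinear boundary points are not counted as vertices, and all function names are illustrative. It requires Python 3.8+ for `pow(x, -1, m)`.

```python
from math import gcd

def modular_hyperbola(m):
    """Points (x, y) with 1 <= x, y < m and xy = 1 (mod m); needs m >= 2."""
    return [(x, pow(x, -1, m)) for x in range(1, m) if gcd(x, m) == 1]

def cross(o, p, q):
    """Cross product of the vectors op and oq; positive for a left turn."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])

def hull_vertices(points):
    """Strict vertices of the convex hull (monotone chain algorithm)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def chain(seq):
        out = []
        for p in seq:
            # drop points that make a right turn or are collinear
            while len(out) >= 2 and cross(out[-2], out[-1], p) <= 0:
                out.pop()
            out.append(p)
        return out
    lower, upper = chain(pts), chain(reversed(pts))
    return lower[:-1] + upper[:-1]

def v(m):
    """Number of vertices of the convex closure of H_m."""
    return len(hull_vertices(modular_hyperbola(m)))

def tau(n):
    """Number of positive divisors of n (naive)."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

if __name__ == "__main__":
    for m in (5, 7, 24, 61, 101):
        print(m, v(m), 2 * (tau(m - 1) - 1))
```

For the degenerate moduli, such as m = 24, all points lie on the diagonal and the routine reports the two endpoints of the segment.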

is also given in [45]. Furthermore, it is shown in [45] that in fact $v(m) = 2(\tau(m-1)-1)$ whenever $T(m-1)\le5$. Thus, this and the bound (4.8) explain why Lemma 2.9 comes into play. One can also find in [45] several efficient algorithms for computing $v(m)$, together with their complexity analysis. Numerical calculations show that while the behaviour of $v(m)$ is not adequately described by any of the above bounds, the lower bound (4.9) seems to be more precise than (4.7) and (4.8). It is quite natural to view the points of $H_m$ as being randomly distributed in the square $[0,m]\times[0,m]$ (which is supported by the theoretic results which we have presented in §3.1 and 3.2) and then appeal to the following result of Rényi and Sulanke [57, Satz 1]. Let $R$ be a convex polygon in the plane with $r$ vertices and let $P_i$, $i = 1,\dots,n$, be $n$ points chosen at random in $R$ with uniform distribution. Let $X_n$ be the number of vertices of the convex closure of the points $P_i$, and let $E(X_n)$ be the expectation of $X_n$. Then
\[
E(X_n) = \frac{2}{3}r(\log n+\gamma)+c_R+o(1), \tag{4.10}
\]

where $\gamma = 0.577215\ldots$ is the Euler constant, and $c_R$ depends on $R$ and is maximal when $R$ is a regular $r$-gon or is affine equivalent to a regular $r$-gon. In particular, for the unit square $R = [0,1]^2$ we have
\[
c_R = -\frac{8}{3}\log 2.
\]
Using (4.10) with $r = 4$, it seems plausible to conjecture that for most $m$
\[
v(m)\approx h(m),\qquad\text{where}\qquad h(m) = \frac{8}{3}\left(\log\varphi(m)+\gamma-\log 2\right).
\]
However, surprisingly enough, the numerical results of [45] show that $V(M)$ deviates from
\[
H(M) = \frac{1}{M-1}\sum_{m=2}^{M}\frac{8}{3}\left(\log\varphi(m)+\gamma-\log 2\right) = \frac{8}{3}\left(\log M+\gamma+\eta-1-\log 2\right),
\]
where
\[
\eta = \sum_{p\ \mathrm{prime}}\frac{\log(1-1/p)}{p} = -0.580058\ldots,
\]
quite significantly, and is apparently larger than $H(M)$ by a fixed factor. Some partial explanation of this phenomenon has been given in [45] and suggests that for each $m$, the convex hull $C_m$, besides some "random" points, also contains a "regular" component with $2(\tau(m-1)-1)$ points associated with divisors of $m-1$ whose average contribution
\[
\frac{1}{M-1}\sum_{m=2}^{M}2(\tau(m-1)-1)\sim2\log M
\]
is of the same order of magnitude as the size of $H(M)$. It is also shown in [45] that this effect is specific to the points of $H_m$ and disappears for the convex hull of points on a "generic" curve, which behaves in much better agreement with (4.10) than $v(m)$. Clearly, since $(1,1), (m-1,m-1)\in H_m$, the diameter of $H_m$ takes the largest possible value $\sqrt{2}(m-2)$. However, for the other values of $a$ the question about the diameter of $H_{a,m}$ is more interesting.

Question 4.7. Estimate the diameter
\[
\Delta_{a,m} = \max\left\{\sqrt{(x_1-x_2)^2+(y_1-y_2)^2} : (x_1,y_1),(x_2,y_2)\in H_{a,m}\right\}.
\]


4.3. Visible Points

For two sets of integers $\mathcal{X}$ and $\mathcal{Y}$ we denote
\[
G_{a,m}(\mathcal{X},\mathcal{Y}) = \{(x,y)\in H_{a,m}(\mathcal{X},\mathcal{Y}) : \gcd(x,y) = 1\}
\]
and also, following our usual agreement, we put $G_m(\mathcal{X},\mathcal{Y}) = G_{1,m}(\mathcal{X},\mathcal{Y})$. Clearly $G_{a,m}(\mathcal{X},\mathcal{Y})$ is the set of points $(x,y)\in H_{a,m}(\mathcal{X},\mathcal{Y})$ which are "visible" from the origin (that is, which are not "blocked" by other points with integer coordinates). The following estimate is obtained in [64] for $G_m(\mathcal{X},\mathcal{Y})$ but its extension to the general case is immediate and we present it here (in fact we also simplify the argument).

Theorem 4.3. Let $\mathcal{X} = \{1,\dots,X\}$ and $\mathcal{Y} = \{1,\dots,Y\}$ where $1\le X,Y\le m$ are arbitrary integers. For all integers $m$, we have
\[
\#G_{a,m}(\mathcal{X},\mathcal{Y}) = \frac{6}{\pi^2}\cdot\frac{XY}{m}\prod_{p\mid m}\left(1+\frac{1}{p}\right)^{-1} + O\left(X^{1/2}Y^{1/2}m^{-1/4+o(1)}\right),
\]
where the product is taken over all prime numbers $p\mid m$.

Proof. For an integer $z$ we let
\[
H_{a,m}(z;\mathcal{X},\mathcal{Y}) = \{(x,y)\in H_{a,m}(\mathcal{X},\mathcal{Y}) : z\mid\gcd(x,y)\}.
\]
By the inclusion-exclusion principle, we write
\[
\#G_{a,m}(\mathcal{X},\mathcal{Y}) = \sum_{z=1}^{\infty}\mu(z)\,\#H_{a,m}(z;\mathcal{X},\mathcal{Y}).
\]
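The inclusion-exclusion step over the Möbius function can be verified exactly on small examples: the direct count of visible points must coincide with the weighted sum over z. The sketch below is a brute-force illustration with hypothetical function names. It assumes gcd(a, m) = 1 and 1 ≤ X, Y ≤ m, as in Theorem 4.3, and requires Python 3.8+ for `pow(x, -1, m)`.

```python
from math import gcd

def mobius(n):
    """Moebius function mu(n), by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a square factor
            result = -result
        p += 1
    return -result if n > 1 else result

def points(m, a, X, Y):
    """Points (x, y) with xy = a (mod m), 1 <= x <= X, 1 <= y <= Y."""
    pts = []
    for x in range(1, X + 1):
        if gcd(x, m) == 1:
            y = (a * pow(x, -1, m)) % m
            if 1 <= y <= Y:
                pts.append((x, y))
    return pts

def visible_direct(m, a, X, Y):
    """Number of points with gcd(x, y) = 1."""
    return sum(1 for x, y in points(m, a, X, Y) if gcd(x, y) == 1)

def visible_mobius(m, a, X, Y):
    """Sum of mu(z) * #{points with z | gcd(x, y)} over z >= 1."""
    pts = points(m, a, X, Y)
    return sum(
        mobius(z) * sum(1 for x, y in pts if x % z == 0 and y % z == 0)
        for z in range(1, min(X, Y) + 1)
    )

if __name__ == "__main__":
    print(visible_direct(101, 3, 100, 100), visible_mobius(101, 3, 100, 100))
```

Only z up to min(X, Y) can contribute, so the infinite sum is finite in practice.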

Ha,m (z; X , Y) = ∅ if gcd(z, m) > 1 or z > m. For gcd(z, m) = 1, writing x = zs

and

y = zt,

we have Ha,m (z; X , Y) = {(zs, zt) : st ≡ az −2

(mod m),

1 6 s 6 X/z, 1 6 t 6 Y /z}.

Now, we define l m R = X 1/2 Y 1/2 m−3/4

and

l m Q = X 1/2 Y 1/2 m−1/2 ,

(4.11)

178

IGOR E. SHPARLINSKI

and note that XY /Q2 6 m. A variant of Theorem 3.1 gives #Ha,m (z; X , Y) =

³ ´ XY ϕ(m) + O m1/2+o(1) , 2 2 z m

which we apply for “small” $z \le R$.

We also note that for each $z$, the product $r = st \le XY/z^2$, where $s$ and $t$ are given by (4.11), belongs to a fixed residue class modulo $m$ and thus can take at most $XY/(z^2 m) + 1$ possible values. For each fixed $r \le XY/z^2 \le XY \le m^2$, there are $\tau(r) = r^{o(1)} = m^{o(1)}$ pairs $(s,t)$ of integers $s$ and $t$ with $r = st$, see (2.6). Therefore,
$$\#H_{a,m}(z;\mathcal{X},\mathcal{Y}) \le \left(\frac{XY}{z^2 m} + 1\right) m^{o(1)},$$
which we apply for “medium” $z$ with $Q \ge z > R$.

Finally, we note that Lemma 2.1 gives the bound
$$\sum_{\substack{z=1 \\ \gcd(z,m)=1}}^{m} e_m(Az^{-2} + Bz) \ll \gcd(A,B,m)^{1/2}\, m^{1/2+o(1)}.$$

Using the same arguments as in the proof of Theorem 3.1, we deduce that the number of positive integers $z \le Z$ with $az^{-2} \equiv w \pmod m$ for some $w \le W$ is $ZW + O(m^{1/2+o(1)})$. We now note that $az^{-2} \equiv r \pmod m$ where, as before, $r = st \le XY/z^2 \le XY/Q^2 \le m$. Furthermore, for every $z$ the value of $r$ is uniquely defined and leads to at most $\tau(r) = m^{o(1)}$ possible pairs $(s,t)$. Hence, the total contribution from “large” $z > Q$ can be estimated as
$$\begin{aligned}
\sum_{m \ge z > Q} \#H_{a,m}(z;\mathcal{X},\mathcal{Y}) &\le \sum_{\nu=0}^{\lceil 2\log m\rceil}\ \sum_{2^{\nu+1}Q \ge z > 2^{\nu}Q} \#H_{a,m}(z;\mathcal{X},\mathcal{Y}) \\
&\le \sum_{\nu=0}^{\lceil 2\log m\rceil} \left(2^{\nu+1}Q \cdot \frac{XY}{(2^{\nu}Q)^2} + m^{1/2+o(1)}\right) m^{o(1)} \\
&\le \sum_{\nu=0}^{\lceil 2\log m\rceil} \left(\frac{XY}{2^{\nu}Q} + m^{1/2+o(1)}\right) m^{o(1)} = \frac{XY}{Q}\, m^{o(1)}.
\end{aligned}$$


Combining the above bounds and recalling our choice of $R$ and $Q$ (which optimises the error term), we derive the desired result.

In particular, under the conditions of Theorem 4.3,
$$\#G_{a,m}(\mathcal{X},\mathcal{Y}) \sim \frac{6}{\pi^2}\cdot\frac{XY}{m}\prod_{p \mid m}\left(1+\frac{1}{p}\right)^{-1}$$
provided that $XY \ge m^{3/2+\varepsilon}$ for some fixed $\varepsilon > 0$.

There is little doubt that our approach can also be used to obtain asymptotic formulas for $(x,y) \in H_{a,m}(\mathcal{X},\mathcal{Y})$ with arbitrary integer $a$, and also the sums
$$\sum_{(x,y)\in H_m(\mathcal{X},\mathcal{Y})} |\mu(xy)| \quad\text{and}\quad \sum_{(x,y)\in H_m(\mathcal{X},\mathcal{Y})} |\mu(x)\mu(y)| \tag{4.12}$$
under the same conditions on the sets $\mathcal{X}$ and $\mathcal{Y}$ as in Theorem 4.3, see also Question 3.4. However, we do not have any approaches to the following problem.

Question 4.8. Extend Theorem 4.3 to intervals of the form
$$\mathcal{X} = \{U+1,\ldots,U+X\} \quad\text{and}\quad \mathcal{Y} = \{V+1,\ldots,V+Y\}$$
with arbitrary $U$ and $V$.

It is also interesting to study sums of other arithmetic functions on $(x,y) \in H_m(\mathcal{X},\mathcal{Y})$.

Question 4.9. For intervals
$$\mathcal{X} = \{U+1,\ldots,U+X\} \quad\text{and}\quad \mathcal{Y} = \{V+1,\ldots,V+Y\}$$
of length $X, Y \le m$,
• obtain nontrivial bounds for the sums
$$\sum_{(x,y)\in H_m(\mathcal{X},\mathcal{Y})} \mu(xy) \quad\text{and}\quad \sum_{(x,y)\in H_m(\mathcal{X},\mathcal{Y})} \left(\frac{x}{y}\right),$$
where $(x/y)$ is the Jacobi symbol of $x$ modulo $y$, which we also extend to even values of $y$ by simply putting $(x/y) = 0$ in this case;
• obtain asymptotic formulas for the sums
$$\sum_{(x,y)\in H_m(\mathcal{X},\mathcal{Y})} \varphi(|x-y|) \quad\text{and}\quad \sum_{(x,y)\in H_m(\mathcal{X},\mathcal{Y})} \omega(|x-y|).$$
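The main term of Theorem 4.3 can be compared with a direct count for small moduli. The following is only a brute-force sketch: the function names are illustrative and not taken from [64], and the quadratic-time enumeration is chosen for transparency rather than efficiency.

```python
from math import gcd, pi

def prime_divisors(m):
    """Return the distinct prime divisors of m by trial division."""
    primes, d = [], 2
    while d * d <= m:
        if m % d == 0:
            primes.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        primes.append(m)
    return primes

def hyperbola_points(a, m, X, Y):
    """Points (x, y) with 1 <= x <= X, 1 <= y <= Y and x*y = a (mod m)."""
    return [(x, y) for x in range(1, X + 1) for y in range(1, Y + 1)
            if (x * y - a) % m == 0]

def visible_count(a, m, X, Y):
    """Number of points on the modular hyperbola with gcd(x, y) = 1."""
    return sum(1 for x, y in hyperbola_points(a, m, X, Y) if gcd(x, y) == 1)

def main_term(m, X, Y):
    """Main term (6/pi^2) * (XY/m) * prod_{p | m} (1 + 1/p)^(-1)."""
    prod = 1.0
    for p in prime_divisors(m):
        prod /= 1 + 1 / p
    return 6 / pi**2 * X * Y / m * prod

if __name__ == "__main__":
    a, m = 1, 1009  # illustrative choice of parameters
    for X, Y in [(1009, 1009), (600, 800)]:
        print(X, Y, visible_count(a, m, X, Y), round(main_term(m, X, Y), 1))
```

For boxes with $XY$ well above $m^{3/2}$ the two printed quantities should be of comparable size, while for very small boxes only the error term of the theorem is informative.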


5. Applications

5.1. Lehmer Problem

One of the most natural and immediate applications of the uniformity of distribution results outlined in §3.1 is a positive solution to the Lehmer problem, see [37, Problem F12], about the joint distribution of the parity of $x$ and $y$ for $(x,y) \in H_{a,m}(\mathcal{X},\mathcal{Y})$ with some intervals
$$\mathcal{X} = \{U+1,\ldots,U+X\} \quad\text{and}\quad \mathcal{Y} = \{V+1,\ldots,V+Y\}.$$
This distribution is naturally expected to be close to uniform (that is, each parity combination occurs in about 25% of the cases) for any odd $m$ and sufficiently large $X$ and $Y$. The Lehmer problem can easily be reformulated as a question about the cardinality $\#H_{a,m}(\widetilde{\mathcal{X}},\widetilde{\mathcal{Y}})$ for some $a$ and some other sets $\widetilde{\mathcal{X}}$ and $\widetilde{\mathcal{Y}}$ of about $X/2$ and $Y/2$ consecutive integers, respectively. Indeed, we are interested in solutions to the congruence
$$(2\tilde{x} + \vartheta_1)(2\tilde{y} + \vartheta_2) \equiv 1 \pmod m$$
with some fixed $\vartheta_1, \vartheta_2 \in \{0,1\}$ and
$$\frac{U+1-\vartheta_1}{2} \le \tilde{x} \le \frac{U+X-\vartheta_1}{2}, \qquad \frac{V+1-\vartheta_2}{2} \le \tilde{y} \le \frac{V+Y-\vartheta_2}{2}.$$
It remains to notice that the above congruence is equivalent to
$$(\tilde{x} + 2^{-1}\vartheta_1)(\tilde{y} + 2^{-1}\vartheta_2) \equiv 4^{-1} \pmod m$$
(where the inversion is taken modulo $m$), which after a shift of variables takes the desired shape.

Close links between the Lehmer problem and bounds of Kloosterman sums were first observed in [79,80]. The question and the above approach have been extended in several directions. Accordingly, instead of the bound (1.1), more general bounds of incomplete or multiple Kloosterman sums are used, see [2,19,20,48,49,51,52,75,81–84,86] and references therein. However, it has turned out that for multivariate analogues of the Lehmer problem, and a number of similar questions, bounds of character sums provide a more efficient tool than Kloosterman sums, see [67], where Lemmas 2.2 and 2.3 and the bound (2.4) have been used to improve several previous results.

5.2. Distribution of Angles in Some Point Sets

A version of Theorem 3.1 has been used in [7] to study the distribution of angles between visible points (viewed from the origin) in a dilation of


a certain plane region $\Omega \subseteq [0,1]^2$. More precisely, for $\Omega \subseteq [0,1]^2$ and a sufficiently large $Q$, we define $\Omega_Q = \{(Q\alpha, Q\beta) : (\alpha,\beta) \in \Omega\}$. We now consider the $\#F_\Omega(Q)$ angles between the horizontal line and the points of the set
$$F_\Omega(Q) = \{(x,y) \in \Omega_Q \cap \mathbb{Z}^2 : \gcd(x,y) = 1\}.$$
The distribution of these angles is studied in [7] and shown to exhibit somewhat unexpected behaviour. A slight generalisation of Theorem 3.1 has also played an important role in a study of angles defined by some other point set [6], see also [8].

5.3. Sums of Divisor Functions over Arithmetic Progressions

For example, the strength of estimates on the error terms in asymptotic formulas for sums of some divisor functions over an arithmetic progression is closely related to the precision of our knowledge of $\#H_{a,m}(\mathcal{X},\mathcal{Y})$, where $\mathcal{X}$ and $\mathcal{Y}$ are sets of consecutive integers, see [4,27,28]. This link becomes transparent if one recalls that a standard approach to evaluating divisor sums is approximating the hyperbolic region $\{(x,y) : x, y > 0,\ xy \le N\}$ by a union of “small” rectangles. In fact, as we have mentioned, the proof of Theorem 3.2 is based on some ideas of [4].

5.4. Sato-Tate Conjecture in the “Vertical” Aspect

We also recall that Theorem 3.2 about the distribution of points on $H_{a,m}$ on average over $a$ has been used in studying the so-called “vertical” aspect of the Sato-Tate conjecture for Kloosterman sums, see [68]. The relation is provided by the identity
$$K_m(r,s) = K_m(1,rs),$$
which holds if $\gcd(r,m) = 1$. Clearly for the complex conjugated sum we have
$$\overline{K_m(r,s)} = K_m(-r,-s) = K_m(r,s),$$
hence we see that $K_m(r,s)$ is real. Since by the Weil bound, see [39], for any prime $p$, we have
$$|K_p(r,s)| \le 2\sqrt{p}, \qquad \gcd(r,s,p) = 1,$$
we can now define the angles $\psi_p(r,s)$ by the relations
$$K_p(r,s) = 2\sqrt{p}\,\cos\psi_p(r,s) \quad\text{and}\quad 0 \le \psi_p(r,s) \le \pi.$$
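The three properties of Kloosterman sums used above (the identity $K_m(r,s) = K_m(1,rs)$, realness, and the Weil bound) are easy to observe numerically. The sketch below assumes the standard definition $K_m(r,s) = \sum_{\gcd(z,m)=1} e_m(rz + sz^{-1})$, which is not restated in this section; the function names are illustrative, and `pow(z, -1, m)` requires Python 3.8 or later.

```python
import cmath
import math

def kloosterman(r, s, m):
    """Sum of exp(2*pi*i*(r*z + s*z^{-1})/m) over residues z coprime to m."""
    total = 0j
    for z in range(1, m + 1):
        if math.gcd(z, m) == 1:
            z_inv = pow(z, -1, m)  # modular inverse of z
            total += cmath.exp(2j * cmath.pi * ((r * z + s * z_inv) % m) / m)
    return total

def angle(r, s, p):
    """Angle psi in [0, pi] with K_p(r, s) = 2*sqrt(p)*cos(psi), p prime."""
    c = kloosterman(r, s, p).real / (2 * math.sqrt(p))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding

if __name__ == "__main__":
    p = 101
    K = kloosterman(3, 5, p)
    print("imaginary part:", abs(K.imag))
    print("K(3,5) - K(1,15):", abs(K - kloosterman(1, 15, p)))
    print("Weil ratio:", abs(K.real) / (2 * math.sqrt(p)))
    print("angle:", angle(3, 5, p))
```

Floating-point summation is accurate enough here for moduli of moderate size; for large $m$ one would want a more careful evaluation.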


The famous Sato–Tate conjecture asserts that, in the “horizontal” aspect, that is, for any fixed non-zero integers $r$ and $s$, the angles $\psi_p(r,s)$ are distributed according to the Sato–Tate density
$$\mu_{ST}(\alpha,\beta) = \frac{2}{\pi}\int_{\alpha}^{\beta} \sin^2\gamma \, d\gamma,$$
see [39, Section 21.2]. More precisely, if $\pi_{r,s}(\alpha,\beta;T)$ denotes the number of primes $p \le T$ with $\alpha \le \psi_p(r,s) \le \beta$ and, as usual, $\pi(T)$ denotes the total number of primes $p \le T$, then the Sato–Tate conjecture predicts that
$$\pi_{r,s}(\alpha,\beta;T) \sim \mu_{ST}(\alpha,\beta)\,\pi(T), \qquad T \to \infty, \tag{5.1}$$
for all fixed real $0 \le \alpha < \beta \le \pi$, see [39, Section 21.2]. We remark that for elliptic curves, the Sato-Tate conjecture in its original (and the most difficult) “horizontal” aspect has recently been settled [70], but it still remains open for Kloosterman sums. It is shown in [68] that Theorem 3.2 implies that
$$\frac{1}{4RS}\sum\sum \pi_{r,s}(\alpha,\beta;T) \sim \mu_{ST}(\alpha,\beta)\,\pi(T)$$
[...]

5.5. Torsion of Elliptic Curves

A variant of Theorem 3.1 has been obtained in [54] and applied to estimating torsion of elliptic curves. More precisely, let $E$ be an elliptic curve over an algebraic number field $K$ of degree $d$ over $\mathbb{Q}$. It is shown in [54] that if $E$ contains a point of prime order $p$ then
$$p \le d^{3d^2}.$$

5.6. Approximations by Sums of Two Rationals

Following [15], we consider the problem of obtaining an upper bound on the approximation of a real $\alpha$ by $s$ rational fractions with denominators at most $Q$, that is, for
$$\delta_{\alpha,s}(Q) = \min_{1 \le q_1,\ldots,q_s \le Q} \left|\alpha - \frac{r_1}{q_1} - \cdots - \frac{r_s}{q_s}\right|$$
with positive integers $q_1,\ldots,q_s \le Q$. The question is motivated by the Dirichlet theorem on rational approximations which corresponds to the case $s = 1$.
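For small $Q$ the quantity $\delta_{\alpha,2}(Q)$ can be computed directly. The sketch below rests on an assumption not spelled out above, namely that the numerators $r_1, r_2$ range over all integers; in that case $r_1/q_1 + r_2/q_2$ runs over all multiples of $1/\mathrm{lcm}(q_1,q_2)$, so only the distance from $\alpha$ to that grid matters. The function names are illustrative.

```python
from math import gcd, sqrt

def dist_to_grid(alpha, L):
    """Distance from alpha to the nearest integer multiple of 1/L."""
    t = alpha * L
    return abs(t - round(t)) / L

def delta_1(alpha, Q):
    """Best approximation of alpha by one fraction with denominator <= Q."""
    return min(dist_to_grid(alpha, q) for q in range(1, Q + 1))

def delta_2(alpha, Q):
    """Best approximation of alpha by a sum of two fractions, denominators <= Q."""
    best = float("inf")
    for q1 in range(1, Q + 1):
        for q2 in range(q1, Q + 1):
            L = q1 * q2 // gcd(q1, q2)  # sums r1/q1 + r2/q2 fill the grid Z/L
            best = min(best, dist_to_grid(alpha, L))
    return best

if __name__ == "__main__":
    alpha, Q = sqrt(2), 50
    print(delta_1(alpha, Q), delta_2(alpha, Q), 1 / Q)
```

Since the pair $q_1 = q_2 = q$ reproduces the one-fraction case, the computed $\delta_{\alpha,2}(Q)$ never exceeds $\delta_{\alpha,1}(Q)$, which in turn is below $1/Q$ by the Dirichlet theorem.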


It is shown in [15] that for $s = 2$ the results on the distribution of points on $H_{a,m}$ are directly related to this question and imply nontrivial bounds on $\delta_{\alpha,2}(Q)$. It is natural to expect that there are also links between $\delta_{\alpha,s}(Q)$ and multivariate analogues of the sets $H_{a,m}$.

5.7. $SL_2(\mathbb{F}_p)$ Matrices of Bounded Height

For a finite field $\mathbb{F}_p$ of $p$ elements and a positive integer $T \le (p-1)/2$, we use $N_p(T)$ to denote the number of matrices
$$\begin{pmatrix} u & v \\ x & y \end{pmatrix} \in SL_2(\mathbb{F}_p)$$
with $|u|, |v|, |x|, |y| \le T$ (we assume that $\mathbb{F}_p$ is represented by the elements of the set $\{0, \pm 1, \ldots, \pm(p-1)/2\}$). Clearly
$$N_p(T) = \sum_{a \in \mathbb{F}_p} \#H_{a+1,p}(\mathcal{T},\mathcal{T})\,\#H_{a,p}(\mathcal{T},\mathcal{T}),$$
where $\mathcal{T} = \{0, \pm 1, \ldots, \pm T\}$ and also $\#H_{0,p}(\mathcal{T},\mathcal{T}) = 4T + 1$. It has been shown in [1] that using the identity
$$\sum_{a \in \mathbb{F}_p} \#H_{a,p}(\mathcal{T},\mathcal{T}) = (2T+1)^2$$
and Theorem 3.2, one can derive that
$$N_p(T) = \frac{(2T+1)^4}{p} + O\left(T^2 p^{o(1)}\right),$$
which is nontrivial if $T \ge p^{1/2+\varepsilon}$ for any fixed $\varepsilon > 0$ and sufficiently large $p$. We remark that this is an $\mathbb{F}_p$ analogue of the results of similar spirit for matrices over $\mathbb{Z}$ and algebraic number fields, see [23,58] and references therein.

5.8. Computing Discrete Logarithms and Factoring

It has been discovered in [62] that the distribution of points on hyperbolas $H_{a,m}$ has direct links with the analysis of some discrete logarithm algorithms. Namely, the analysis of the algorithm from [62] rests on a weaker version of Theorem 3.2 obtained in [61].

Question 5.1. Study whether Theorem 3.2 can be used to improve some of the results of [62].
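Returning to Section 5.7, the decomposition of $N_p(T)$ into products of hyperbola counts can be confirmed by brute force for small $p$. This is only a sanity-check sketch with illustrative function names; it enumerates all $(2T+1)^4$ matrices and so is usable only for tiny parameters.

```python
def hyperbola_count(a, p, T):
    """Number of (x, y) with |x|, |y| <= T and x*y = a (mod p)."""
    rng = range(-T, T + 1)
    return sum(1 for x in rng for y in rng if (x * y - a) % p == 0)

def sl2_count_direct(p, T):
    """Number of matrices [[u, v], [x, y]] with entries in [-T, T], det = 1 mod p."""
    rng = range(-T, T + 1)
    return sum(1 for u in rng for v in rng for x in rng for y in rng
               if (u * y - v * x - 1) % p == 0)

def sl2_count_via_hyperbolas(p, T):
    """Same count, grouping matrices by a = v*x so that u*y = a + 1 (mod p)."""
    h = [hyperbola_count(a, p, T) for a in range(p)]
    return sum(h[(a + 1) % p] * h[a] for a in range(p))

if __name__ == "__main__":
    p, T = 11, 3  # requires T <= (p - 1) / 2
    print(sl2_count_direct(p, T), sl2_count_via_hyperbolas(p, T),
          (2 * T + 1) ** 4 / p)
```

The last printed value is the main term $(2T+1)^4/p$; for such a small $T$ it is not expected to be close to the exact count.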


A new deterministic integer factorisation algorithm has recently been suggested in [59]. Its analysis depends on a variant of Theorem 3.1.

6. Concluding Remarks

6.1. Generalisations

Multidimensional variants of the above problems have also been considered, and in principle one can use bounds of multidimensional Kloosterman sums to study the distribution of solutions to the congruence
$$x_1 \cdots x_s \equiv a \pmod m \tag{6.1}$$
in the same fashion as in the case of two variables. However, it has turned out that in many cases bounds of multiplicative character sums lead to much stronger results. For example, in [67], using the bounds of Lemmas 2.2 and 2.3 and the bound (2.4), an improvement is given of some results of [2] on the generalised Lehmer problem. Similar ideas have also led in [65] to some other results on the distribution of solutions to (6.1). For instance, let $\Delta_{s,a,m}$ denote the discrepancy of the $s$-dimensional points
$$\left(\frac{x_1}{m},\ldots,\frac{x_s}{m}\right) \in [0,1]^s, \qquad x_1 \cdots x_s \equiv a \pmod m, \quad 1 \le x_1,\ldots,x_s \le m.$$
It is shown in [65] that
$$\Delta_{s,a,m} \le \begin{cases} m^{-1/2+o(1)} & \text{if } s = 3, \\ m^{-1+o(1)} & \text{if } s \ge 4. \end{cases}$$
It is worth noticing that one has to be careful with posing multidimensional generalisations, which sometimes lead to rather simple questions for $s \ge 3$ (even though for $s = 2$ they are not trivial at all). For example, it has been noticed in [65] that often such generalisations do not need any analytic technique used in [78] but follow immediately from a very elementary argument.

Another unexplored line of research is in the direction of function field generalisations, which conceivably should admit results of the same strength or maybe even stronger. For example, a polynomial analogue of the conjecture of [24] could be more accessible.

Question 6.1. Let $\mathbb{F}_q$ be a finite field of $q$ elements. Given an irreducible polynomial $F(X) \in \mathbb{F}_q[X]$ of sufficiently large degree $d$, show that for any polynomial $A(X) \in \mathbb{F}_q[X]$, relatively prime to $F(X)$, there are two irreducible polynomials $G(X), H(X) \in \mathbb{F}_q[X]$ of degree at most $d$ such that $G(X)H(X) \equiv A(X) \pmod{F(X)}$.


Finally, we believe that it is interesting to explore how much of the theory developed for modular hyperbolas can be extended to modular circles
$$C_{a,m} = \{(x,y) : x^2 + y^2 \equiv a \pmod m\}.$$
Well known parallels between the properties of integer points on hyperbolas and circles on the Euclidean plane suggest that many of the results obtained for $H_{a,m}$ can be extended to $C_{a,m}$.

6.2. Further Improvements

We have seen that bounds of Kloosterman sums play a prominent role in this field. It would be interesting to find some applications of bounds of very short incomplete Kloosterman sums from [11,40,41,46,53,66].

In [76,85] some questions are studied about the distribution of residues modulo $m$ of powers $(x^k, y^k)$ taken over all $(x,y) \in H_{a,m}$. These results can be substantially extended and improved if one uses Lemma 2.1. In particular, Lemma 2.1 immediately implies that the error term in the asymptotic formula of [85] for the quantity $N(k,q)$ can be lowered from $q^{3/4+o(1)}$ to $q^{1/2+o(1)}$. It may also be extended and generalised in various directions. All these results can be obtained by combining Lemma 2.1 with standard arguments like those used in the proof of Theorem 3.1.

References

1. O. Ahmadi and I. E. Shparlinski, Distribution of matrices with restricted entries over finite fields, Preprint, 2006.
2. E. Alkan, F. Stan and A. Zaharescu, Lehmer k-tuples, Proc. Amer. Math. Soc., 134 (2006), 2807–2815.
3. A. Ayyad, T. Cochrane and Z. Zheng, The congruence $x_1x_2 \equiv x_3x_4 \pmod p$, the equation $x_1x_2 = x_3x_4$ and the mean value of character sums, J. Number Theory, 59 (1996), 398–413.
4. W. D. Banks, R. Heath-Brown and I. E. Shparlinski, On the average value of divisor sums in arithmetic progressions, Intern. Math. Research Notices, 2005 (2005), 1–25.
5. J. Beck and M. R. Khan, On the uniform distribution of inverses modulo n, Period. Math. Hung., 44 (2002), 147–155.
6. F. P. Boca, On the distribution of angles between geodesic rays associated with hyperbolic lattice points, Preprint, 2006.
7. F. P. Boca, C. Cobeli and A. Zaharescu, Distribution of lattice points visible from the origin, Commun. Math. Phys., 213 (2000), 433–470.
8. F. P. Boca and A. Zaharescu, Farey fractions and two-dimensional tori, Noncommutative Geometry and Number Theory, Aspects of Mathematics E37, Vieweg Verlag, Wiesbaden, 2006, 57–77.


9. E. Bombieri, On exponential sums in finite fields, Amer. J. Math., 88 (1966), 71–105.
10. E. Bombieri, J. Friedlander and H. Iwaniec, Primes in arithmetic progressions to large moduli, III, J. Amer. Math. Soc., 2 (1989), 215–224.
11. J. Bourgain, More on the sum-product phenomenon in prime fields and its applications, Intern. J. Number Theory, 1 (2005), 1–32.
12. J. Bourgain, New encounters in combinatorial number theory: From the Kakeya problem to cryptography, Perspectives in Analysis, Mathematical Physics Studies, vol. 27, Springer-Verlag, Berlin, 2005, 17–26.
13. P. Brass, W. Moser and J. Pach, Research problems in discrete geometry, Springer, New York, 2005.
14. T. H. Chan, Distribution of difference between inverses of consecutive integers modulo p, Integers, 4 (2004), Paper A03, 1–11.
15. T. H. Chan, Approximating reals by sums of two rationals, Preprint, 2006.
16. C. Cobeli, S. Gonek and A. Zaharescu, The distribution of patterns of inverses modulo a prime, J. Number Theory, 101 (2003), 209–222.
17. C. Cobeli, M. Vâjâitu and A. Zaharescu, Average estimates for the number of tuples of inverses mod p in short intervals, Bull. Math. Soc. Sci. Math. Roumanie, 43 (2000), 155–164.
18. C. Cobeli, M. Vâjâitu and A. Zaharescu, Distribution of gaps between the inverses mod q, Proc. Edinb. Math. Soc., 46 (2003), 185–203.
19. C. Cobeli and A. Zaharescu, The order of inverses mod q, Mathematika, 47 (2000), 87–108.
20. C. Cobeli and A. Zaharescu, Generalization of a problem of Lehmer, Manuscr. Math., 104 (2001), 301–307.
21. C. Cobeli and A. Zaharescu, On the distribution of the $\mathbb{F}_p$-points on an affine curve in r dimensions, Acta Arithmetica, 99 (2001), 321–329.
22. M. Drmota and R. Tichy, Sequences, discrepancies and applications, Springer-Verlag, Berlin, 1997.
23. W. Duke, Z. Rudnick and P. Sarnak, Density of integer points on affine homogeneous varieties, Duke Math. J., 71 (1993), 143–179.
24. P. Erdős, A. M. Odlyzko and A. Sárközy, On the residues of products of prime numbers, Period. Math. Hung., 18 (1987), 229–239.
25. K. Ford, The distribution of integers with a divisor in a given interval, Ann. Math., (to appear).
26. K. Ford, M. R. Khan, I. E. Shparlinski and C. L. Yankov, On the maximal difference between an element and its inverse in residue rings, Proc. Amer. Math. Soc., 133 (2005), 3463–3468.
27. J. B. Friedlander and H. Iwaniec, Incomplete Kloosterman sums and a divisor problem, Ann. of Math., 121 (1985), 319–350.
28. J. B. Friedlander and H. Iwaniec, The divisor problem for arithmetic progressions, Acta Arith., 45 (1985), 273–277.
29. A. Fujii and Y. Kitaoka, On plain lattice points whose coordinates are reciprocals modulo a prime, Nagoya Math. J., 147 (1997), 137–146.
30. M. Z. Garaev, Character sums in short intervals and the multiplication table modulo a prime, Monatsh. Math., 148 (2006), 127–138.


31. M. Z. Garaev, On the logarithmic factor in error term estimates in certain additive congruence problems, Acta Arith., 124 (2006), 27–39.
32. M. Z. Garaev and V. Garcia, The equation $x_1x_2 = x_3x_4 + \lambda$ in fields of prime order and applications, Preprint, 2007.
33. M. Z. Garaev and A. A. Karatsuba, On character sums and the exceptional set of a congruence problem, J. Number Theory, 114 (2005), 182–192.
34. M. Z. Garaev and A. A. Karatsuba, The representation of residue classes by products of small integers, Proc. Edin. Math. Soc., (to appear).
35. M. Z. Garaev and K.-L. Kueh, Distribution of special sequences modulo a large prime, Int. J. Math. Math. Sci., 50 (2003), 3189–3194.
36. A. Granville, I. E. Shparlinski and A. Zaharescu, On the distribution of rational functions along a curve over $\mathbb{F}_p$ and residue races, J. Number Theory, 112 (2005), 216–237.
37. R. K. Guy, Unsolved problems in number theory, Springer-Verlag, Berlin, 1994.
38. R. Hall and G. Tenenbaum, Divisors, Cambridge Tracts in Math. 90, Cambridge Univ. Press, 1988.
39. H. Iwaniec and E. Kowalski, Analytic number theory, Amer. Math. Soc., Providence, RI, 2004.
40. A. A. Karatsuba, Fractional parts of functions of a special form, Izv. Ross. Akad. Nauk Ser. Mat. (Transl. as Russian Acad. Sci. Izv. Math.), (4)55 (1995), 61–80 (in Russian).
41. A. A. Karatsuba, Analogues of Kloosterman sums, Izv. Ross. Akad. Nauk Ser. Mat. (Transl. as Russian Acad. Sci. Izv. Math.), (5)55 (1995), 93–102 (in Russian).
42. N. Katz and T. Tao, Recent progress on the Kakeya conjecture, Proceedings of the 6th International Conference on Harmonic Analysis and Partial Differential Equations, Publ. Matem., U. Barcelona, 2002, 161–180.
43. M. R. Khan, Problem 10736: An optimization with a modular constraint, Amer. Math. Monthly, 108 (2001), 374–375.
44. M. R. Khan and I. E. Shparlinski, On the maximal difference between an element and its inverse modulo n, Period. Math. Hung., 47 (2003), 111–117.
45. M. R. Khan, I. E. Shparlinski and C. L. Yankov, On the convex closure of the graph of modular inversions, Preprint, 2006.
46. M. A. Korolev, Incomplete Kloosterman sums and their applications, Izv. Ross. Akad. Nauk Ser. Mat. (Transl. as Russian Acad. Sci. Izv. Math.), (6)64 (2000), 41–64.
47. M. Laczkovich, Discrepancy estimates for sets with small boundary, Studia Sci. Math. Hungar., 30 (1995), 105–109.
48. H. N. Liu and W. Zhang, On a problem of D. H. Lehmer, Acta Math. Sinica, 22 (2006), 61–68.
49. H. N. Liu and W. Zhang, Hybrid mean value on the difference between a quadratic residue and its inverse modulo p, Publ. Math. Debrecen, 69 (2006), 227–243.
50. H. N. Liu and W. Zhang, General Kloosterman sums and the difference between an integer and its inverse modulo q, Acta Math. Sinica, 23 (2007), 77–82.
51. H. N. Liu and W. Zhang, Mean value on the difference between a quadratic residue and its inverse modulo p, Acta Math. Sinica, 23 (2007), 915–924.
52. S. R. Louboutin, J. Rivat and A. Sárközy, On a problem of D. H. Lehmer, Proc. Amer. Math. Soc., 135 (2007), 969–975.
53. W. Luo, Bounds for incomplete hyper-Kloosterman sums, J. Number Theory, 75 (1999), 41–46.
54. L. Merel, Bornes pour la torsion des courbes elliptiques sur les corps de nombres, Invent. Math., 124 (1996), 437–449.
55. N. G. Moshchevitin, On numbers with missing digits: Solvability of the congruences $x_1x_2 \equiv \lambda \pmod p$, Doklady Akad. Nauk, 410 (2006), 730–733 (in Russian).
56. H. Niederreiter and J. M. Wills, Diskrepanz und Distanz von Massen bezüglich konvexer und Jordanscher Mengen, Math. Z., 144 (1975), 125–134.
57. A. Rényi and R. Sulanke, Über die konvexe Hülle von n zufällig gewählten Punkten, Z. Wahrscheinlichkeitstheorie, 2 (1963), 75–84.
58. C. Roettger, Counting invertible matrices and uniform distribution, J. Théorie Nombres Bordeaux, 17 (2005), 301–322.
59. M. Rubinstein, Hide and seek – A naive factoring algorithm, Preprint, 2006.
60. É. Saias, Entiers à diviseurs denses 1, J. Number Theory, 62 (1997), 163–191.
61. I. A. Semaev, On the number of small solutions of a linear homogeneous congruence, Mat. Zametki, 50 (1991), no. 4, 102–107 (in Russian).
62. I. A. Semaev, An algorithm for evaluation of discrete logarithms in some nonprime finite fields, Math. Comp., 67 (1998), 1679–1689.
63. I. E. Shparlinski, On exponential sums with sparse polynomials and rational functions, J. Number Theory, 60 (1996), 233–244.
64. I. E. Shparlinski, Primitive points on a modular hyperbola, Bull. Polish Acad. Sci. Math., 54 (2006), 193–200.
65. I. E. Shparlinski, On the distribution of points on multidimensional modular hyperbolas, Proc. Japan Acad. Sci., Ser. A, 83 (2007), 5–9.
66. I. E. Shparlinski, Bounds of incomplete multiple Kloosterman sums, J. Number Theory, (to appear).
67. I. E. Shparlinski, On a generalisation of a Lehmer problem, Preprint, 2006.
68. I. E. Shparlinski, Distribution of inverses and multiples of small integers and the Sato–Tate conjecture on average, Preprint, 2006.
69. I. E. Shparlinski and A. Winterhof, Distances between the points on modular hyperbolas, J. Number Theory, (to appear).
70. R. Taylor, Automorphy for some l-adic lifts of automorphic mod l representations, II, Preprint, 2006.
71. G. Tenenbaum, Introduction to analytic and probabilistic number theory, Cambridge Univ. Press, 1995.
72. M. Vajaitu and A. Zaharescu, Distribution of values of rational maps on the $\mathbb{F}_p$-points on an affine curve, Monatsh. Math., 136 (2002), 81–86.
73. Y. Weili, On the generalization of the D. H. Lehmer problem and its mean value, J. Algebra Number Theory and Appl., 6 (2006), 479–491.


74. H. Weyl, On the volume of tubes, Amer. J. Math., 61 (1939), 461–472.
75. Z. Xu and W. Zhang, On a problem of D. H. Lehmer over short intervals, J. Math. Anal. Appl., 320 (2006), 756–770.
76. Y. Yi and W. Zhang, On the generalization of a problem of D. H. Lehmer, Kyushu J. Math., 56 (2002), 235–241.
77. A. Zaharescu, The distribution of the values of a rational function modulo a big prime, J. Theor. Nombres Bordeaux, 15 (2003), 863–872.
78. T. Zhang and W. Zhang, A generalization on the difference between an integer and its inverse modulo q, II, Proc. Japan Acad. Sci., Ser. A, 81 (2005), 7–11.
79. W. Zhang, On a problem of D. H. Lehmer and its generalization, Compos. Math., 86 (1993), 307–316.
80. W. Zhang, On a problem of D. H. Lehmer and its generalization, II, Compos. Math., 91 (1994), 47–56.
81. W. Zhang, On the difference between a D. H. Lehmer number and its inverse modulo q, Acta Arith., 68 (1994), 255–263.
82. W. Zhang, On the difference between an integer and its inverse modulo n, J. Number Theory, 52 (1995), 1–6.
83. W. Zhang, On the distribution of inverses modulo n, J. Number Theory, 61 (1996), 301–310.
84. W. Zhang, On a problem of P. Gallagher, Acta Math. Hung., 78 (1998), 345–357.
85. W. Zhang, On the distribution of inverses modulo p, Acta Arith., 100 (2001), 189–194.
86. W. Zhang, On a problem of D. H. Lehmer and Kloosterman sums, Monatsh. Math., 139 (2003), 247–257.
87. W. Zhang, Z. Xu and Y. Yi, A problem of D. H. Lehmer and its mean square value formula, J. Number Theory, 103 (2003), 197–213.
88. Z. Zheng, The distribution of zeros of an irreducible curve over a finite field, J. Number Theory, 59 (1996), 106–118.


A SURVEY OF PROBLEMS AND RESULTS ON RESTRICTED SUMSETS

ZHI-WEI SUN
Department of Mathematics, Nanjing University, Nanjing 210093, China
E-mail: [email protected]

Additive number theory is currently an active field related to combinatorics. In this paper we give a survey of problems and results concerning lower bounds for cardinalities of various restricted sumsets with elements in a field or an abelian group.

1. Erdős-Heilbronn conjecture and the polynomial method

Let $A = \{a_1,\ldots,a_k\}$ and $B = \{b_1,\ldots,b_l\}$ be two finite subsets of $\mathbb{Z}$ with $a_1 < \cdots < a_k$ and $b_1 < \cdots < b_l$. Observe that
$$a_1 + b_1 < a_2 + b_1 < \cdots < a_k + b_1 < a_k + b_2 < \cdots < a_k + b_l,$$
whence we see that the sumset $A + B = \{a + b : a \in A \text{ and } b \in B\}$ contains at least $k + l - 1$ elements. In particular, $|2A| \ge 2|A| - 1$, where $|A|$ denotes the cardinality of $A$, and $2A$ stands for $A + A$.

The following fundamental theorem was first proved by A. Cauchy [9] in 1813 and then rediscovered by H. Davenport [11] in 1935.

Cauchy-Davenport Theorem. Let $A$ and $B$ be non-empty subsets of the field $\mathbb{Z}/p\mathbb{Z}$ where $p$ is a prime. Then
$$|A + B| \ge \min\{p, |A| + |B| - 1\}. \tag{1.1}$$

For many important results on sumsets over $\mathbb{Z}$, the reader is referred to the recent book [38] by T. Tao and V. H. Vu. In this paper we mainly focus our attention on restricted sumsets with elements in a field or an abelian group.
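The Cauchy-Davenport inequality (1.1) can be verified exhaustively for very small primes. The sketch below (with illustrative function names) checks every pair of non-empty subsets of $\mathbb{Z}/p\mathbb{Z}$; running the same check for a composite modulus shows that primality is needed, since for instance $A = B = \{0, 2\}$ in $\mathbb{Z}/4\mathbb{Z}$ gives $|A + B| = 2 < 3$.

```python
from itertools import combinations

def sumset(A, B, n):
    """The sumset A + B computed in Z/nZ."""
    return {(a + b) % n for a in A for b in B}

def check_cauchy_davenport(n):
    """True if |A + B| >= min(n, |A| + |B| - 1) for all non-empty A, B in Z/nZ."""
    subsets = [c for k in range(1, n + 1) for c in combinations(range(n), k)]
    return all(len(sumset(A, B, n)) >= min(n, len(A) + len(B) - 1)
               for A in subsets for B in subsets)

if __name__ == "__main__":
    for n in (3, 4, 5, 7):
        print(n, check_cauchy_davenport(n))
```

The number of subset pairs grows like $4^n$, so this is feasible only for single-digit moduli.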


In combinatorics, for a finite sequence $\{A_i\}_{i=1}^n$ of sets, a sequence $\{a_i\}_{i=1}^n$ is called a system of distinct representatives of $\{A_i\}_{i=1}^n$ if $a_1 \in A_1, \ldots, a_n \in A_n$ and $a_1, \ldots, a_n$ are distinct. A fundamental theorem of P. Hall [17] states that $\{A_i\}_{i=1}^n$ has a system of distinct representatives if and only if $\left|\bigcup_{i \in I} A_i\right| \ge |I|$ for all $I \subseteq \{1,\ldots,n\}$. The reader may consult [31] for a simple proof of Hall’s theorem.

In 1964 P. Erdős and H. Heilbronn [13] made the following challenging conjecture.

Erdős-Heilbronn Conjecture. Let $p$ be a prime, and let $A$ be a non-empty subset of the field $\mathbb{Z}/p\mathbb{Z}$. Then $|2^{\wedge}A| \ge \min\{p, 2|A| - 3\}$, where $2^{\wedge}A = \{a + b : a, b \in A \text{ and } a \ne b\}$.

This conjecture remained open until it was confirmed by Dias da Silva and Y. Hamidoune [12] thirty years later, with the help of the representation theory of groups.

For a general field $F$, the additive order of the (multiplicative) identity of $F$ is either infinite or a prime, which we denote by $p(F)$. The characteristic of the field $F$ is defined as follows:
$$\operatorname{ch}(F) = \begin{cases} p & \text{if } p(F) \text{ is a prime } p, \\ 0 & \text{if } p(F) = \infty. \end{cases} \tag{1.2}$$
Now we state Dias da Silva and Y. Hamidoune’s extension of the Erdős-Heilbronn conjecture.

Dias da Silva–Hamidoune Theorem ([12]). Let $F$ be a field, and let $n \in \mathbb{Z}^+ = \{1, 2, 3, \ldots\}$. Then, for any finite subset $A$ of $F$, we have
$$|n^{\wedge}A| \ge \min\{p(F),\ n|A| - n^2 + 1\}, \tag{1.3}$$
where $n^{\wedge}A$ denotes the set of all sums of $n$ distinct elements of $A$.

If $p$ is a prime, $A \subseteq \mathbb{Z}/p\mathbb{Z}$ and $|A| > \sqrt{4p - 7}$, then by the Dias da Silva–Hamidoune theorem, any element of $\mathbb{Z}/p\mathbb{Z}$ can be written as a sum of $\lfloor |A|/2 \rfloor$ distinct elements of $A$ (see [12]), where $\lfloor\cdot\rfloor$ is the well-known floor function.

In 1995–1996 N. Alon, M. B. Nathanson and I. Z. Ruzsa [4,5] developed a polynomial method rooted in [6] to prove the Erdős-Heilbronn conjecture and some similar results. The method turns out to be very powerful and has many applications in number theory and combinatorics.

Now we introduce the above-mentioned polynomial method. We begin with a lemma.


Lemma 1.1 (Alon, Nathanson and Ruzsa [4,5]). Let $F$ be a field and let $A_1, \ldots, A_n$ be finite non-empty subsets of $F$. Let $f(x_1,\ldots,x_n) \in F[x_1,\ldots,x_n]$ have degree smaller than $k_i = |A_i|$ in $x_i$ for each $i = 1,\ldots,n$. If $f(a_1,\ldots,a_n) = 0$ for all $a_1 \in A_1, \ldots, a_n \in A_n$, then $f(x_1,\ldots,x_n)$ is identically zero.

This lemma can be proved by using induction on $n$ and noting that a non-zero polynomial $P(x) \in F[x]$ of degree smaller than a positive integer $k$ cannot have $k$ distinct zeros in $F$. The central part of the polynomial method is the following important principle formulated by Alon in 1999.

Combinatorial Nullstellensatz (Alon [1]). Let $A_1, \ldots, A_n$ be finite subsets of a field $F$, and let $f(x_1,\ldots,x_n) \in F[x_1,\ldots,x_n]$.

(i) Set $g_i(x) = \prod_{a \in A_i}(x - a)$ for $i = 1,\ldots,n$. Then
$$f(a_1,\ldots,a_n) = 0 \quad\text{for all } a_1 \in A_1, \ldots, a_n \in A_n \tag{1.4}$$
if and only if there are $h_1(x_1,\ldots,x_n), \ldots, h_n(x_1,\ldots,x_n) \in F[x_1,\ldots,x_n]$ with $\deg h_i \le \deg f - \deg g_i$ for $i = 1,\ldots,n$, such that
$$f(x_1,\ldots,x_n) = \sum_{i=1}^{n} g_i(x_i)\,h_i(x_1,\ldots,x_n). \tag{1.5}$$

(ii) Suppose that $\deg f = k_1 + \cdots + k_n$ where $0 \le k_i < |A_i|$ for $i = 1,\ldots,n$. If (1.4) holds then $[x_1^{k_1}\cdots x_n^{k_n}]f(x_1,\ldots,x_n) = 0$, where $[x_1^{k_1}\cdots x_n^{k_n}]f(x_1,\ldots,x_n)$ denotes the coefficient of $x_1^{k_1}\cdots x_n^{k_n}$ in $f(x_1,\ldots,x_n)$.

Proof. (i) If there are $h_1(x_1,\ldots,x_n), \ldots, h_n(x_1,\ldots,x_n) \in F[x_1,\ldots,x_n]$ such that (1.5) holds, then for any $a_1 \in A_1, \ldots, a_n \in A_n$ we have
$$f(a_1,\ldots,a_n) = \sum_{i=1}^{n} g_i(a_i)\,h_i(a_1,\ldots,a_n) = 0.$$
Now we consider the converse. Write
$$f(x_1,\ldots,x_n) = \sum_{j_1,\ldots,j_n \ge 0} f_{j_1,\ldots,j_n}\, x_1^{j_1}\cdots x_n^{j_n}$$
and
$$x^j = g_i(x)\,q_{ij}(x) + r_i^{(j)}(x),$$
where $q_{ij}(x), r_i^{(j)}(x) \in F[x]$ and $\deg r_i^{(j)}(x) < \deg g_i(x) = |A_i|$. Note that both $r_i^{(j)}(x)$ and $g_i(x)q_{ij}(x) = x^j - r_i^{(j)}(x)$ have degree not exceeding $j$. Clearly
$$\begin{aligned}
f(x_1,\ldots,x_n) &= \sum_{\substack{j_1,\ldots,j_n \ge 0 \\ j_1+\cdots+j_n \le \deg f}} f_{j_1,\ldots,j_n} \prod_{i=1}^{n}\left(g_i(x_i)\,q_{ij_i}(x_i) + r_i^{(j_i)}(x_i)\right) \\
&= \bar{f}(x_1,\ldots,x_n) + \sum_{i=1}^{n} g_i(x_i)\,h_i(x_1,\ldots,x_n),
\end{aligned}$$
where
$$\bar{f}(x_1,\ldots,x_n) = \sum_{j_1,\ldots,j_n \ge 0} f_{j_1,\ldots,j_n} \prod_{i=1}^{n} r_i^{(j_i)}(x_i)$$
and each $h_i(x_1,\ldots,x_n)$ is a suitable polynomial over $F$ with $\deg g_i + \deg h_i \le \deg f$. If $a_1 \in A_1, \ldots, a_n \in A_n$, then
$$\bar{f}(a_1,\ldots,a_n) = \sum_{j_1,\ldots,j_n \ge 0} f_{j_1,\ldots,j_n} \prod_{i=1}^{n} a_i^{j_i} = f(a_1,\ldots,a_n) = 0.$$
Since the degree of $\bar{f}(x_1,\ldots,x_n)$ in $x_i$ is smaller than $|A_i|$, by Lemma 1.1 the polynomial $\bar{f}(x_1,\ldots,x_n)$ is identically zero. Therefore (1.5) holds.

(ii) By part (i) we can write
$$f(x_1,\ldots,x_n) = \sum_{i=1}^{n} g_i(x_i)\,h_i(x_1,\ldots,x_n)$$
with $h_i(x_1,\ldots,x_n) \in F[x_1,\ldots,x_n]$ and $\deg h_i \le \deg f - \deg g_i$. Since $k_1 + \cdots + k_n = \deg f$ and $k_i < |A_i|$ for $i = 1,\ldots,n$, we have
$$[x_1^{k_1}\cdots x_n^{k_n}]f(x_1,\ldots,x_n) = \sum_{i=1}^{n} [x_1^{k_1}\cdots x_n^{k_n}]\,x_i^{|A_i|}\,h_i(x_1,\ldots,x_n) = 0.$$

This concludes the proof.

Here is a useful lemma implied by the Combinatorial Nullstellensatz.

ANR Lemma ([5]). Let $A_1, \ldots, A_n$ be finite subsets of a field $F$ with $k_i = |A_i| > 0$ for $i = 1,\ldots,n$. Let $f(x_1,\ldots,x_n) \in F[x_1,\ldots,x_n] \setminus \{0\}$ and $\deg f \le \sum_{i=1}^{n}(k_i - 1)$. If
$$[x_1^{k_1-1}\cdots x_n^{k_n-1}]\,f(x_1,\ldots,x_n)\,(x_1 + \cdots + x_n)^{\sum_{i=1}^{n}(k_i-1) - \deg f} \ne 0, \tag{1.6}$$
then
$$|\{a_1 + \cdots + a_n : a_i \in A_i,\ f(a_1,\ldots,a_n) \ne 0\}| \ge \sum_{i=1}^{n}(k_i - 1) - \deg f + 1. \tag{1.7}$$

Proof. Assume that $C = \{a_1 + \cdots + a_n : a_i \in A_i,\ f(a_1,\ldots,a_n) \ne 0\}$ has cardinality not exceeding $K = \sum_{i=1}^{n}(k_i - 1) - \deg f$. Then the polynomial
$$P(x_1,\ldots,x_n) := f(x_1,\ldots,x_n)\,(x_1 + \cdots + x_n)^{K - |C|} \prod_{c \in C}(x_1 + \cdots + x_n - c)$$
is of degree $\sum_{i=1}^{n}(k_i - 1)$ with the coefficient of $x_1^{k_1-1}\cdots x_n^{k_n-1}$ non-zero. Applying the second part of the Combinatorial Nullstellensatz, we find that $P(a_1,\ldots,a_n) \ne 0$ for some $a_1 \in A_1, \ldots, a_n \in A_n$. This is impossible since $a_1 + \cdots + a_n \in C$ if $f(a_1,\ldots,a_n) \ne 0$.
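The ANR lemma can be followed on a small instance by computing the coefficient in (1.6) and the restricted sumset in (1.7) directly. The sketch below works over $\mathbb{Z}/p\mathbb{Z}$ and stores a polynomial as a dictionary from exponent tuples to coefficients; this representation and all function names are choices made here for illustration, not part of [5]. The instance uses $f = x_2 - x_1$, so that the condition $f(a_1,a_2) \ne 0$ means $a_1 \ne a_2$.

```python
from itertools import product

def poly_mul(f, g, p):
    """Multiply polynomials stored as {exponent tuple: coefficient}, mod p."""
    h = {}
    for ef, cf in f.items():
        for eg, cg in g.items():
            e = tuple(a + b for a, b in zip(ef, eg))
            h[e] = (h.get(e, 0) + cf * cg) % p
    return {e: c for e, c in h.items() if c}

def anr_check(sets, f, p):
    """Return (coefficient in (1.6), lower bound in (1.7), actual sumset size)."""
    n = len(sets)
    ks = [len(A) for A in sets]
    deg_f = max(sum(e) for e in f)
    K = sum(k - 1 for k in ks) - deg_f
    # The linear form x_1 + ... + x_n.
    s = {tuple(1 if i == j else 0 for j in range(n)): 1 for i in range(n)}
    P = dict(f)
    for _ in range(K):
        P = poly_mul(P, s, p)
    coeff = P.get(tuple(k - 1 for k in ks), 0)

    def evaluate(point):
        total = 0
        for e, c in f.items():
            term = c
            for x, ex in zip(point, e):
                term = term * pow(x, ex, p) % p
            total = (total + term) % p
        return total

    sums = {sum(pt) % p for pt in product(*sets) if evaluate(pt) != 0}
    return coeff, K + 1, len(sums)

if __name__ == "__main__":
    p = 7
    f = {(0, 1): 1, (1, 0): p - 1}  # f = x_2 - x_1 over Z/7Z
    print(anr_check([[0, 1, 2], [0, 1, 2, 3]], f, p))
```

When the first returned value is non-zero, the lemma guarantees that the third value is at least the second; when it is zero, the lemma gives no information for that instance.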

We remark that a variant of this lemma appeared in Q. H. Hou and Z. W. Sun [18].

Alon-Nathanson-Ruzsa Theorem ([5]). Let $A_1,\dots,A_n$ be finite nonempty subsets of a field $F$ with $|A_1|<\cdots<|A_n|$. Then, for the set
\[ A_1\dotplus\cdots\dotplus A_n=\Big\{\sum_{i=1}^{n}a_i:\ a_i\in A_i,\ \text{and } a_i\ne a_j \text{ if } i\ne j\Big\}, \tag{1.8} \]
we have
\[ |A_1\dotplus\cdots\dotplus A_n|\ \ge\ \min\Big\{p(F),\ \sum_{i=1}^{n}|A_i|-\frac{n(n+1)}{2}+1\Big\}. \tag{1.9} \]
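As a small sanity check on (1.9), the following Python sketch enumerates the restricted sumset over $F=\mathbb{Z}/p\mathbb{Z}$ (so that $p(F)=p$) by brute force and compares its size with the right-hand side of (1.9). The function names and the sample sets are illustrative assumptions and do not come from [5].

```python
from itertools import product

def restricted_sumset(sets, p):
    """All sums a_1+...+a_n (mod p) with a_i in sets[i] and the a_i pairwise distinct."""
    sums = set()
    for choice in product(*sets):
        if len(set(choice)) == len(choice):  # enforce a_i != a_j for i != j
            sums.add(sum(choice) % p)
    return sums

def anr_bound(sets, p):
    """Right-hand side of (1.9) for F = Z/pZ, where p(F) = p."""
    n = len(sets)
    return min(p, sum(len(a) for a in sets) - n * (n + 1) // 2 + 1)

# Hypothetical example with |A_1| < |A_2| < |A_3| in Z/11Z.
p = 11
sets = [{0, 1}, {0, 1, 2}, {0, 1, 2, 3}]
sumset = restricted_sumset(sets, p)
bound = anr_bound(sets, p)
```

In this example the restricted sumset has exactly as many elements as the bound allows, so initial segments of an arithmetic progression show that (1.9) cannot be improved in general.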

This follows from the ANR lemma and the following fact: If k1 , . . . , kn ∈ Z+ , then Y Pn (xj − xi ) × (x1 + · · · + xn ) i=1 ki −n(n+1)/2 [xk11 −1 · · · xnkn −1 ] 1≤i 1 and k > m(n − 1). (i) (Q. H. Hou and Z. W. Sun [18]) We have Y [xk−1 · · · xk−1 (xi − xj )2m · (x1 + · · · + xn )(k−1−m(n−1))n n ] 1 16i 0 then Y (xi − xj )2m−1 · (x1 + · · · + xn )(k−1−m(n−1))n [xk−n · · · xk−1 n ] 1 1≤i max{mn, (k − 1 − m(n − 1))n}, and let A1 , . . . , An be finite subsets of F with max1≤i≤n |Ai | = k. Set C = {a1 + · · · + an : a1 ∈ A1 , . . . , an ∈ An , ai − aj 6∈ Sij if i < j}, where Sij (1 ≤ i < j ≤ n) are subsets of F . (i) (Q. H. Hou and Z. W. Sun [18]) If |A1 | = · · · = |An | = k, and |Sij | ≤ 2m for all 1 ≤ i < j ≤ n, then we have |C| ≥ (k −1−m(n−1))n+1. (ii) (Z. W. Sun and Y. N. Yeh [37]) If |Ai | = k − n + i for i = 1, . . . , n, and |Sij | < 2m for all 1 ≤ i < j ≤ n, then |C| ≥ (k − 1 − m(n − 1))n + 1. The following conjecture posed by Z. W. Sun in [18] is open even for the rational field. Conjecture 2.1 (Z. W. Sun, 2002). Let A1 , . . . , An be finite non-empty subsets of a field F . For 1 ≤ i < j ≤ n, let Sij and Sji be finite subsets of F with |Sij | ≡ |Sji | (mod 2). Then |{a1 + · · · + an : a1 ∈ A1 , . . . , an ∈ An , ai − aj 6∈ Sij if i 6= j}| ¾ ½ n X X (|Sij | + |Sji |) − n + 1 . ≥ min p(F ), |Ai | − i=1

(2.5)

1≤i (k − 1)n − (m + 1) n2 , then |{a1 + · · · + an : ai ∈ Ai , and Pi (ai ) 6= Pj (aj ) if i 6= j}| µ ¶ n ≥ (k − 1)n − (m + 1) + 1. 2

(2.7)

Here we pose the following conjecture. Conjecture 2.2. Under the conditions of Theorem 2.3, we have |{a1 + · · · + an : ai ∈ Ai , and Pi (ai ) 6= Pj (aj ) if i 6= j}| ≥ p(F ) ¡ ¢ if p(F ) ≤ (k − 1)n − (m + 1) n2 .

Lemma 2.3 (Z. W. Sun [33]). Let R be a commutative ring with identity. Let A = (aij )1≤i,j≤n be a matrix over R, and let det(A) = |aij |1≤i,j≤n be the determinant of A. Let k, m1 , . . . , mn ∈ N. (i) If m1 ≤ · · · ≤ mn ≤ k, then we have kn− i [xk1 · · · xkn ]|aij xm j |1≤i,j≤n (x1 + · · · + xn ) Pn (kn − i=1 mi )! = Qn det(A). i=1 (k − mi )!

Pn

i=1

mi

(2.8)

(ii) If m1 < · · · < mn ≤ k then i [xk1 · · · xkn ]|aij xm j |16i,j6n

n (kn − = (−1)( 2 ) Qn Q

i=1

Y

(xj − xi ) ·

16i (m − 1) n2 .

Conjecture 2.4. Under the conditions of Theorem 2.5, if $p(F)\le (k-1)n-(m+1)\binom{n}{2}$ then the restricted sumset in (2.17) has cardinality at least $p(F)$.

Corollary 2.2 (Z. W. Sun [35]). Let $A_1,\dots,A_n$ and $B=\{b_1,\dots,b_n\}$ be subsets of a field with cardinality $n$. Then there are distinct $a_1\in A_1,\dots,a_n\in A_n$ such that the permanent $\|(a_jb_j)^{i-1}\|_{1\le i,j\le n}$ is non-zero.

Theorem 2.6 (Z. W. Sun [35]). Let $h,k,l,m,n$ be positive integers satisfying $k-1\ge m(n-1)$ and $l-1\ge h(n-1)$. Let $F$ be a field with $p(F)>\max\{K,L\}$, where
\[ K=(k-1)n-(m+1)\binom{n}{2}\quad\text{and}\quad L=(l-1)n-(h+1)\binom{n}{2}. \]


Assume that c1 , . . . , cn ∈ F are distinct and A1 , . . . , An , B1 , . . . , Bn are subsets of F with |A1 | = · · · = |An | = k and |B1 | = · · · = |Bn | = l. Let P1 (x), . . . , Pn (x), Q1 (x), . . . , Qn (x) ∈ F [x] be monic polynomials with deg Pi (x) = m and deg Qi (x) = h for i = 1, . . . , n. Then, for any S, T ⊆ F with |S| ≤ K and |T | ≤ L, there exist a1 ∈ A1 , . . . , an ∈ An , b1 ∈ B1 , . . . , bn ∈ Bn such that a1 + · · · + an 6∈ S, b1 + · · · + bn 6∈ T , and also ai bi ci 6= aj bj cj , Pi (ai ) 6= Pj (aj ), Qi (bi ) 6= Qj (bj ) if 1 ≤ i < j ≤ n. (2.18) Lemma 2.5 (Z. W. Sun [35]). Let k, m, n ∈ Z+ with k − 1 > m(n − 1). Then Y [xk−1 · · ·xk−1 (xj − xi )2m−1 (xj yj − xi yi ) · (x1 + · · · + xn )N n ] 1 16i max{mn, (k−1−m(n−1))n}. Assume that c1 , . . . , cn ∈ F are distinct, and A1 , . . . , An , B1 , . . . , Bn are subsets of F with |A1 | = · · · = |An | = k and |B1 | = · · · = |Bn | = n. Let Sij ⊆ F with |Sij | < 2m for all 1 ≤ i < j ≤ n. Then there are distinct b1 ∈ B1 , . . . , bn ∈ Bn such that the restricted sumset S = {a1 + · · · + an : ai ∈ Ai , ai − aj 6∈ Sij and ai bi ci 6= aj bj cj if i < j} (2.20) has at least (k − 1 − m(n − 1))n + 1 elements. 3. Snevily’s conjecture and additive theorems Suppose that {a1 , . . . , an }, {b1 , . . . , bn } and {a1 + b1 , . . . , an + bn } are complete systems of residues modulo n. Let σ = 0+1+· · ·+(n−1) = n(n−1)/2. Pn Pn Pn Since i=1 (ai + bi ) = i=1 ai + i=1 bi , we have σ ≡ σ + σ (mod n) and hence 2 ∤ n. In 1999 H. S. Snevily [28] made the following interesting conjecture. Snevily’s Conjecture. Let G be an additive abelian group with |G| odd. Let A and B be subsets of G with cardinality n > 0. Then there is a numbering {ai }ni=1 of the elements of A and a numbering {bi }ni=1 of the elements of B such that a1 + b1 , . . . , an + bn are distinct.
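To make the statement of Snevily's conjecture concrete, here is a minimal Python sketch that searches, by brute force over permutations, for numberings of two equal-size subsets of the cyclic group $\mathbb{Z}/n\mathbb{Z}$ whose sums $a_i+b_i$ are distinct. The function name `snevily_pairing` and the sample sets are assumptions made for illustration; the search is exponential and only meant for tiny cases.

```python
from itertools import permutations

def snevily_pairing(A, B, n):
    """Return pairs (a_i, b_i) with all a_i + b_i distinct mod n, or None if none exists."""
    a_list = sorted(A)
    for b_perm in permutations(sorted(B)):
        sums = {(a + b) % n for a, b in zip(a_list, b_perm)}
        if len(sums) == len(a_list):  # all sums are pairwise distinct
            return list(zip(a_list, b_perm))
    return None

# Odd order: a numbering is found for this sample in Z/5Z.
odd_case = snevily_pairing({0, 1, 2}, {0, 1, 3}, 5)

# Even order: A = B = Z/2Z admits no such numbering, matching the
# parity argument with sigma = n(n-1)/2 given above.
even_case = snevily_pairing({0, 1}, {0, 1}, 2)
```

The even case illustrates why the hypothesis that $|G|$ is odd cannot simply be dropped.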

A SURVEY OF PROBLEMS AND RESULTS ON RESTRICTED SUMSETS


Theorem 3.1. (i) (N. Alon [2]) Let p be an odd prime and A be a nonempty subset of Z/pZ with cardinality n < p. For any given b1 , . . . , bn ∈ Z/pZ, we can find a numbering {ai }ni=1 of the elements of A such that the sums a1 + b1 , · · · , an + bn are distinct. (ii) (Q. H. Hou and Z. W. Sun [18]) Let k ≥ n ≥ 1 be integers, and let F be a field with p(F ) > max{n, (k − n)n}. Let A1 , . . . , An be subsets of F with cardinality k, and let b1 , . . . , bn be elements of F . Then the restricted sumset {a1 + · · · + an : ai ∈ Ai , ai 6= aj and ai + bi 6= aj + bj if i 6= j} has more than (k − n)n elements. Note that part (ii) in the case k = n and A1 = · · · = An yields part (i). In order to get part (i) by the polynomial method, Alon noted that Y (xj − xi )(xj + bj − (xi + bi )) = (−1)n(n−1)/2 n!. [x1n−1 · · · xnn−1 ] 1≤i 0 be any odd integer. As 2ϕ(m) ≡ 1 (mod m) by Euler’s theorem, the multiplicative group of the finite field with order 2ϕ(m) has a cyclic subgroup of order m. Thus, in view of the Combinatorial Nullstellensatz, Snevily’s conjecture for the cyclic group of order m follows from the following statement: If F is a field of characteristic 2 and b1 , . . . , bn are distinct elements of F ∗ = F \{0}, then Y (xj − xi )(bj xj − bi xi ) 6= 0. c := [x1n−1 · · · xnn−1 ] 1≤i h(n − 1). Assume that c1 , . . . , cn ∈ G are distinct, and A1 , . . . , An , B1 , . . . , Bn are subsets of G with |A1 | = · · · = |An | = k and |B1 | = · · · = |B ¡ n¢| = l. Then, for any sets S and ¡T ¢ with |S| ≤ (k − 1)n − (m + 1) n2 and |T | ≤ (l − 1)n − (h + 1) n2 , there are a1 ∈ A1 , . . . , an ∈ An , b1 ∈ B1 , . . . , bn ∈ Bn such that {a1 , . . . , an } 6∈ S, {b1 , . . . , bn } 6∈ T , and also ai + bi + ci 6= aj + bj + cj , mai 6= maj , hbi 6= hbj if 1 ≤ i < j ≤ n. (3.4) Corollary 3.1 (Z. W. Sun [35]). Let G be an additive abelian group with cyclic torsion subgroup, and let A1 , . . . , An , B1 , . . . , Bn and C = {c1 , . . . 
, cn } be finite subsets of G with the same cardinality n > 0. Then there are distinct a1 ∈ A1 , . . . , an ∈ An and distinct b1 ∈ B1 , . . . , bn ∈ Bn such that all the sums a1 + b1 + c1 , . . . , an + bn + cn are distinct. Proof. Just apply Theorem 3.4 with k = l = n and m = h = 1. In contrast with Snevily’s conjecture, Corollary 3.1 in the case A1 = · · · = An = A and B1 = · · · = Bn = B is of particular interest. Here we state a general additive theorem. Theorem 3.5 (Z. W. Sun [35]). Let G be any additive abelian group with cyclic torsion subgroup, and let A1 , . . . , Am be subsets of G with the same cardinality n ∈ Z+ . If m is odd or all the elements of Am are of odd order, then the elements of Ai (1 ≤ i ≤ m) can be listed in a suitable order Pm ai1 , . . . , ain , so that all the sums i=1 aij (1 ≤ j ≤ n) are distinct.

Sun [35] also noted that Theorem 3.5 with m odd cannot be extended to general abelian groups, since there are counter-examples for the Klein four-group Z/2Z ⊕ Z/2Z.


A line of a square, an n × n matrix, is a row or column of the matrix. We define a line of an n × n × n cube in a similar way. A Latin cube over a set S of cardinality n is an n × n × n cube whose entries come from the set S and no line of which contains a repeated element. A transversal of an n × n × n cube is a collection of n cells no two of which lie in the same line. A Latin transversal is a transversal whose cells contain no repeated element.

Corollary 3.2 (Z. W. Sun [35]). Let N be any positive integer. For the N × N × N Latin cube over Z/NZ formed by the Cayley addition table, each n × n × n sub-cube with n ≤ N contains a Latin transversal.

Proof. Just apply Theorem 3.5 with m = 3 (or Corollary 3.1) to the cyclic group Z/NZ.

In contrast, Theorem 3.2 has the following equivalent version observed by Snevily [28]: Let N be a positive odd integer. For the N × N Latin square over Z/NZ formed by the Cayley addition table, each of its sub-squares contains a Latin transversal.

Conjecture 3.1 (Z. W. Sun [35]). Every n × n × n Latin cube contains a Latin transversal.

4. On a conjecture of Lev and related results

Let A and B be finite non-empty subsets of an additive abelian group G. In contrast with the Cauchy-Davenport theorem, J. H. B. Kemperman [21] and P. Scherk [27] proved that
\[ |A+B|\ \ge\ |A|+|B|-\min_{c\in A+B}\nu_{A,B}(c), \tag{4.1} \]
where
\[ \nu_{A,B}(c)=|\{(a,b)\in A\times B:\ a+b=c\}|; \tag{4.2} \]
in particular, we have $|A+B|\ge|A|+|B|-1$ if some $c\in A+B$ can be uniquely written as $a+b$ with $a\in A$ and $b\in B$.

Motivated by the Kemperman-Scherk theorem and the Erdős-Heilbronn conjecture, V. F. Lev [22] proposed the following interesting conjecture.

Lev's Conjecture. Let G be an abelian group, and let A and B be finite non-empty subsets of G. Then we have
\[ |A\dotplus B|\ \ge\ |A|+|B|-2-\min_{c\in A+B}\nu_{A,B}(c). \tag{4.3} \]
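The quantities in (4.1)-(4.3) are easy to experiment with. The Python sketch below, restricted to the cyclic group $\mathbb{Z}/n\mathbb{Z}$ as an assumption of the example, computes $\nu_{A,B}(c)$, checks the Kemperman-Scherk inequality (4.1) over all pairs of non-empty subsets of $\mathbb{Z}/6\mathbb{Z}$, and exposes the two sides of (4.3) so that Lev's conjecture can be explored. All function names are hypothetical, and nothing is asserted about (4.3) itself.

```python
from itertools import combinations

def nu(A, B, c, n):
    """Number of pairs (a, b) in A x B with a + b = c in Z/nZ, as in (4.2)."""
    return sum(1 for a in A for b in B if (a + b) % n == c)

def kemperman_scherk_holds(A, B, n):
    """Check inequality (4.1) for the given subsets of Z/nZ."""
    sumset = {(a + b) % n for a in A for b in B}
    return len(sumset) >= len(A) + len(B) - min(nu(A, B, c, n) for c in sumset)

def lev_sides(A, B, n):
    """Return (|A dotplus B|, right-hand side of (4.3)) for experimentation only."""
    sumset = {(a + b) % n for a in A for b in B}
    restricted = {(a + b) % n for a in A for b in B if a != b}
    rhs = len(A) + len(B) - 2 - min(nu(A, B, c, n) for c in sumset)
    return len(restricted), rhs

# Exhaustive check of (4.1) over all non-empty subsets of Z/6Z.
n = 6
subsets = [set(s) for r in range(1, n + 1) for s in combinations(range(n), r)]
all_ok = all(kemperman_scherk_holds(A, B, n) for A in subsets for B in subsets)
```

Calling `lev_sides(A, B, n)` on small inputs gives a quick way to look at how much room (4.3) leaves in particular cases.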


By a sophisticated application of the first part of the Combinatorial Nullstellensatz, H. Pan and Z. W. Sun [26] made the following progress on Lev's conjecture.

Theorem 4.1 (H. Pan and Z. W. Sun [26]). Let A and B be finite non-empty subsets of a field F. Let $P(x,y)\in F[x,y]$ and
\[ C=\{a+b:\ a\in A,\ b\in B,\ \text{and } P(a,b)\ne 0\}. \tag{4.4} \]
If C is non-empty, then
\[ |C|\ \ge\ |A|+|B|-\deg P-\min_{c\in C}\nu_{A,B}(c). \tag{4.5} \]

Theorem 4.2 (H. Pan and Z. W. Sun [26]). Let A and B be finite non-empty subsets of an abelian group G with cyclic torsion subgroup. For $i=1,\dots,l$ let $m_i$ and $n_i$ be non-negative integers and let $d_i\in G$. Suppose that
\[ C=\{a+b:\ a\in A,\ b\in B,\ \text{and } m_ia-n_ib\ne d_i \text{ for all } i=1,\dots,l\} \tag{4.6} \]
is non-empty. Then
\[ |C|\ \ge\ |A|+|B|-\sum_{i=1}^{l}(m_i+n_i)-\min_{c\in C}\nu_{A,B}(c). \tag{4.7} \]

The following result on difference-restricted sumsets follows from Theorems 4.1 and 4.2.

Theorem 4.3 (H. Pan and Z. W. Sun [26]). Let G be an abelian group, and let A, B, S be finite non-empty subsets of G with
\[ C=\{a+b:\ a\in A,\ b\in B,\ \text{and } a-b\notin S\}\ne\emptyset. \tag{4.8} \]
(i) If G is torsion-free or elementary abelian, then
\[ |C|\ \ge\ |A|+|B|-|S|-\min_{c\in C}\nu_{A,B}(c). \tag{4.9} \]
(ii) If Tor(G) is cyclic, then
\[ |C|\ \ge\ |A|+|B|-2|S|-\min_{c\in C}\nu_{A,B}(c). \tag{4.10} \]

Proof. Without loss of generality we can assume that G is generated by the finite set A ∪ B. If $G\cong\mathbb{Z}^n$, then we can simply view G as the ring of algebraic integers in an algebraic number field K with $[K:\mathbb{Q}]=n$. If $G\cong(\mathbb{Z}/p\mathbb{Z})^n$ where p is a prime, then G is isomorphic to the additive group of the finite field with $p^n$ elements. Thus part (i) follows from Theorem 4.1 with $P(x,y)=\prod_{s\in S}(x-y-s)$. Let $d_1,\dots,d_l$ be a list of all the elements of S. Applying Theorem 4.2 with $m_i=n_i=1$ for all $i=1,\dots,l$, we immediately get part (ii), completing the proof.

Given two finite subsets A and B of a field F and a general $P(x,y)\in F[x,y]$, what can we say about the cardinality of the restricted sumset $\{a+b:\ a\in A,\ b\in B,\ \text{and } P(a,b)\ne 0\}$? In 2002 H. Pan and Z. W. Sun [25] made progress in this direction by relaxing (to some extent) the limitations of the polynomial method; their approach allows one to draw conclusions even if no coefficients in question are explicitly known.

Lemma 4.1 (H. Pan and Z. W. Sun [25]). Let $P(x)$ be a polynomial over a field F. Let $\bar F$ be the algebraic closure of the field F and $m_P(\alpha)$ be the multiplicity of $\alpha\in\bar F$ as a root of $P(x)=0$ over $\bar F$. Suppose that there exist non-negative integers $k<l$ such that $[x^i]P(x)=0$ for all i with $k<i<l$. Then either $x^l\mid P(x)$, or $\deg P(x)\le k$, or $N_q(P)\ge l-k$ for some $q\in\mathcal{P}(p)=\{1,p,p^2,\dots\}$, where $p=\mathrm{ch}(F)$,
\[ N_q(P)=q\,|\{\alpha\in\bar F\setminus\{0\}:\ m_P(\alpha)\ge q\}|-\sum_{\alpha\in\bar F\setminus\{0\}}\{m_P(\alpha)\}_q \tag{4.11} \]
and $\{m\}_q$ denotes the least non-negative residue of $m\in\mathbb{Z}$ modulo q.

We remark that $N_1(P)$ is the number of distinct roots in $\bar F\setminus\{0\}$ of the equation $P(x)=0$ over $\bar F$.

Theorem 4.4 (H. Pan and Z. W. Sun [25]). Let A and B be two finite non-empty subsets of a field F. Furthermore, let $P(x,y)$ be a polynomial over F of degree $d=\deg P(x,y)$ such that for some $i<|A|$ and $j<|B|$ we have $[x^iy^{d-i}]P(x,y)\ne 0$ and $[x^{d-j}y^j]P(x,y)\ne 0$. Define $P_0(x,y)$ to be the homogeneous polynomial of degree d such that $P(x,y)=P_0(x,y)+R(x,y)$ for some $R(x,y)\in F[x,y]$ with $\deg R(x,y)<d$, and put $P^*(x)=P_0(x,1)$. For any $\alpha$ in the algebraic closure $\bar F$ of F, let $m_{P^*}(\alpha)$ denote the multiplicity of $\alpha$ as a zero of $P^*(x)\in\bar F[x]$. Then
\[ |\{a+b:\ a\in A,\ b\in B,\ \text{and } P(a,b)\ne 0\}|\ \ge\ \min\{p(F)-m_{P^*}(-1),\ |A|+|B|-1-d-N(P^*)\}, \tag{4.12} \]
where
\[ N(P^*)=\max_{q\in\mathcal{P}(\mathrm{ch}(F))} q\,|\{\alpha\in\bar F\setminus\{0,-1\}:\ m_{P^*}(\alpha)\ge q\}|. \tag{4.13} \]


For the sake of clarity, here we state a consequence of Theorem 4.4.

Corollary 4.1 (H. Pan and Z. W. Sun [25]). Let F be a field with $p=\mathrm{ch}(F)\ne 2$, and let A, B and S be finite non-empty subsets of F. Then
\[ |\{a+b:\ a\in A,\ b\in B,\ \text{and } a-b\notin S\}|\ \ge\ \min\{p(F),\ |A|+|B|-|S|-q-1\}, \tag{4.14} \]
where q is the largest element of $\mathcal{P}(p)$ not exceeding $|S|$.

5. Working with general abelian groups

Theorem 5.1 (Kneser's Theorem). Let G be an additive abelian group. Let A and B be finite non-empty subsets of G, and let $H=H(A+B)$ be the stabilizer $\{g\in G:\ g+A+B=A+B\}$. If $|A+B|\le|A|+|B|-1$, then
\[ |A+B|=|A+H|+|B+H|-|H|. \tag{5.1} \]

The following consequence is an extension of the Cauchy-Davenport theorem.

Corollary 5.1. Let G be an additive abelian group. Let $p(G)=+\infty$ if G is torsion-free, otherwise we let $p(G)$ be the least order of a non-zero element of G. Then, for any finite non-empty subsets A and B of G, we have
\[ |A+B|\ \ge\ \min\{p(G),\ |A|+|B|-1\}. \tag{5.2} \]

Proof. Suppose that $|A+B|<|A|+|B|-1$. Then $H=H(A+B)\ne\{0\}$ by Kneser's theorem. Therefore $|H|\ge p(G)$ and hence
\[ |A+B|=|A+H|+|B+H|-|H|\ \ge\ |A+H|\ \ge\ |H|\ \ge\ p(G). \]
We are done.

G. Károlyi [19,20] extended the Erdős-Heilbronn conjecture to general abelian groups.

Theorem 5.2. Let G be an additive abelian group and let A be a finite non-empty subset of G.
(i) (G. Károlyi [19]) We have
\[ |2^{\wedge}A|\ \ge\ \min\{p(G),\ 2|A|-3\}. \tag{5.3} \]
(ii) (G. Károlyi [20]) When $|A|\ge 5$ and $p(G)>2|A|-3$, the equality $|2^{\wedge}A|=2|A|-3$ holds if and only if A is an arithmetic progression.
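For a cyclic group $G=\mathbb{Z}/n\mathbb{Z}$ the quantity $p(G)$ is the smallest prime factor of n, so the bound (5.3) can be checked directly on small examples. The following Python sketch does this; the helper names are hypothetical, and the restriction to cyclic groups is an assumption of the example rather than of Theorem 5.2.

```python
def p_of_cyclic(n):
    """p(Z/nZ) for n >= 2: the least order of a non-zero element,
    which equals the smallest prime factor of n."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def two_wedge(A, n):
    """2^A = {a + b : a, b in A, a != b} computed in Z/nZ."""
    return {(a + b) % n for a in A for b in A if a != b}

def karolyi_bound(A, n):
    """Right-hand side of (5.3) for G = Z/nZ."""
    return min(p_of_cyclic(n), 2 * len(A) - 3)

# An arithmetic progression in Z/7Z attains 2|A| - 3.
prog = {0, 1, 2, 3}
size_mod7 = len(two_wedge(prog, 7))

# In Z/9Z the bound is governed by p(G) = 3 instead.
size_mod9 = len(two_wedge(prog, 9))
```

The first example shows the bound $2|A|-3$ being met by an arithmetic progression; part (ii) above characterizes equality only for $|A|\ge 5$.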


Using the fact that any finitely generated abelian group can be written as the direct sum of some cyclic groups of infinite or prime power order, Károlyi proved Theorem 5.2 in two steps. First, he showed that Theorem 5.2 is true for any cyclic group G of infinite or prime power order; then, he proved that those abelian groups possessing the required property are closed under direct sum. In the first step for Theorem 5.2(i), he actually obtained the following more general result.

Theorem 5.3 (G. Károlyi [19]). Let $\emptyset\ne A,B\subseteq\mathbb{Z}/p^{\alpha}\mathbb{Z}$, where p is a prime and $\alpha\in\mathbb{Z}^+$. Then
\[ |A\dotplus B|\ \ge\ \min\{p,\ |A|+|B|-3\}. \tag{5.4} \]

For non-empty subsets A and B of Z/pZ with p a prime, if $|A|\ne|B|$ then we have
\[ |A\dotplus B|\ \ge\ \min\Big\{p,\ |A|+|B|-\frac{2(2+1)}{2}+1\Big\}=\min\{p,\ |A|+|B|-2\} \]
by the ANR theorem; if $|A|=|B|$ then
\[ |A\dotplus B|\ \ge\ |(A\setminus\{a_0\})\dotplus B|\ \ge\ \min\{p,\ (|A|-1)+|B|-2\}, \]
where $a_0$ is any fixed element of A. When $q=p^{\alpha}$ is not a prime, Z/qZ is not a subgroup of the additive group of a field, but Károlyi considered it as the group of qth roots of unity (up to isomorphism), which can be viewed as a subgroup of the multiplicative group $\mathbb{C}^*$ of non-zero complex numbers.

Lemma 5.1 (Z. W. Sun [29,32]). Let $\lambda_1,\dots,\lambda_k$ be qth roots of unity, and let $c_1,\dots,c_k$ be non-negative integers with $c_1\lambda_1+\cdots+c_k\lambda_k=0$. Then $c_1+\cdots+c_k\in D(q)$, where $D(q)$ is as in Theorem 2.4(iii).

Proof of Theorem 5.3. Since Z/qZ is isomorphic to the multiplicative group $C_q$ of qth roots of unity, we may view A and B as subsets of $C_q$. If $|A|+|B|-3>p$, then we can choose $\emptyset\ne A'\subseteq A$ and $\emptyset\ne B'\subseteq B$ so that $|A'|+|B'|-3=p$. Thus, without loss of generality, we may assume that $k+l-3\le p$ where $k=|A|$ and $l=|B|$. Suppose that $|C|\not\ge\min\{p,\ k+l-3\}=k+l-3$, where $C=\{ab:\ a\in A,\ b\in B \text{ and } a\ne b\}$. If
\[ c_0:=[x^{k-1}y^{l-1}]\,(xy-1)\prod_{c\in C}(x-cy)\times(x-y)^{k+l-4-|C|}\ne 0, \]


then by the polynomial method, there exist a ∈ A and b−1 ∈ B −1 such that ab−1 6= 1 and a 6= cb−1 for all c ∈ C, which leads to a contradiction since a 6= b and ab ∈ C. Thus, it suffices to show c0 6= 0. Observe that c0 = [xk−2 y l−2 ]

k+l−4 Y s=1

(x − ρs y) = (−1)l−2

X

1≤i1 k, we have |{f (a1 , . . . , an ) : a1 , . . . , an ∈ A, and ai 6= aj if i 6= j}| ½ ¾ n n o ½ |A| − n ¾ n(|A| − n) ≥ min p(F ) − δ, −k +1 , k k k

(6.7)

where $\{\alpha\}$ denotes the fractional part $\alpha-\lfloor\alpha\rfloor$ of a real number $\alpha$, and
\[ \delta=\begin{cases}1 & \text{if } n=2 \text{ and } c_1=-c_2,\\ 0 & \text{otherwise.}\end{cases} \tag{6.8} \]
By Corollary 3 of [25], this conjecture holds when n = 2. Note also that the Dias da Silva–Hamidoune theorem is a special case of Conjecture 6.1 with k = 1.

References

1. N. Alon, Combinatorial Nullstellensatz, Combin. Prob. Comput., 8 (1999), 7–29.
2. N. Alon, Additive Latin transversals, Israel J. Math., 117 (2000), 125–130.
3. N. Alon, Discrete mathematics: methods and challenges, Proceedings of the International Congress of Mathematicians, Vol. I, (Beijing, 2002), Higher Ed. Press, Beijing, 2002, 119–135.
4. N. Alon, M. B. Nathanson and I. Z. Ruzsa, Adding distinct congruence classes modulo a prime, Amer. Math. Monthly, 102 (1995), 250–255.
5. N. Alon, M. B. Nathanson and I. Z. Ruzsa, The polynomial method and restricted sums of congruence classes, J. Number Theory, 56 (1996), 404–417.
6. N. Alon and M. Tarsi, A nowhere-zero point in linear mappings, Combinatorica, 9 (1989), 393–395.
7. H. Q. Cao and Z. W. Sun, On sums of distinct representatives, Acta Arith., 87 (1998), 159–169.
8. L. Carlitz, Solvability of certain equations in a finite field, Quart. J. Math., 7 (1956), 3–4.
9. A. Cauchy, Recherches sur les nombres, Jour. Ecole Polytechn., 9 (1813), 99–116.
10. S. Dasgupta, G. Károlyi, O. Serra and B. Szegedy, Transversals of additive Latin squares, Israel J. Math., 126 (2001), 17–28.
11. H. Davenport, On the addition of residue classes, J. London Math. Soc., 10 (1935), 30–32.
12. J. A. Dias da Silva and Y. O. Hamidoune, Cyclic spaces for Grassmann derivatives and additive theory, Bull. London Math. Soc., 26 (1994), 140–146.
13. P. Erdős and H. Heilbronn, On the addition of residue classes modulo p, Acta Arith., 9 (1964), 149–159.
14. R. K. Guy, Parker's permutation problem involves the Catalan numbers, Amer. Math. Monthly, 100 (1993), 287–289.
15. B. Felzeghy, On the solvability of some special equations over finite fields, Publ. Math. Debrecen, 68 (2006), 15–23.
16. M. Hall, A combinatorial problem on abelian groups, Proc. Amer. Math. Soc., 3 (1952), 584–587.
17. P. Hall, On representatives of subsets, J. London Math. Soc., 10 (1935), 26–30.
18. Q. H. Hou and Z. W. Sun, Restricted sums in a field, Acta Arith., 102 (2002), 239–249.
19. G. Károlyi, The Erdős-Heilbronn problem in abelian groups, Israel J. Math., 139 (2004), 349–359.
20. G. Károlyi, An inverse theorem for the restricted set addition in abelian groups, J. Algebra, 290 (2005), 557–593.
21. J. H. B. Kemperman, On small sumsets in an abelian group, Acta Math., 103 (1960), 63–88.
22. V. F. Lev, Restricted set addition in Abelian groups: results and conjectures, J. Théor. Nombres Bordeaux, 17 (2005), 181–193.
23. J. X. Liu and Z. W. Sun, Sums of subsets with polynomial restrictions, J. Number Theory, 97 (2002), 301–304.
24. M. B. Nathanson, Additive Number Theory: Inverse Problems and the Geometry of Sumsets (Graduate Texts in Mathematics; 165), Springer, New York, 1996.
25. H. Pan and Z. W. Sun, A lower bound for $|\{a+b:\ a\in A,\ b\in B,\ P(a,b)\ne 0\}|$, J. Combin. Theory Ser. A, 100 (2002), 387–393.
26. H. Pan and Z. W. Sun, Restricted sumsets and a conjecture of Lev, Israel J. Math., 154 (2006), 21–28.
27. P. Scherk, Distinct elements in a set of sums, Amer. Math. Monthly, 62 (1955), 46–47.
28. H. S. Snevily, The Cayley addition table of $Z_n$, Amer. Math. Monthly, 106 (1999), 584–585.
29. Z. W. Sun, Covering the integers by arithmetic sequences II, Trans. Amer. Math. Soc., 348 (1996), 4279–4320.
30. Z. W. Sun, Restricted sums of subsets of Z, Acta Arith., 99 (2001), 41–60.
31. Z. W. Sun, Hall's theorem revisited, Proc. Amer. Math. Soc., 129 (2001), 3129–3131.
32. Z. W. Sun, On the function $w(x)=|\{1\le s\le k:\ x\equiv a_s \pmod{n_s}\}|$, Combinatorica, 23 (2003), 681–691.
33. Z. W. Sun, On Snevily's conjecture and restricted sumsets, J. Combin. Theory Ser. A, 103 (2003), 291–304.
34. Z. W. Sun, Unification of zero-sum problems, subset sums and covers of Z, Electron. Res. Announc. Amer. Math. Soc., 9 (2003), 51–60.
35. Z. W. Sun, An additive theorem and restricted sumsets, preprint, 2006. Online version: http://arxiv.org/abs/math.CO/0610981.
36. Z. W. Sun, On value sets of polynomials over a field, preprint, 2007. Online version: http://arxiv.org/abs/math.NT/0703180.
37. Z. W. Sun and Y. N. Yeh, On various restricted sumsets, J. Number Theory, 114 (2005), 209–220.
38. T. Tao and V. H. Vu, Additive Combinatorics, Cambridge Univ. Press, Cambridge, 2006.


A GENERAL MODULAR RELATION IN ANALYTIC NUMBER THEORY

HARUO TSUKADA

Department of Information and Computer Sciences, School of Humanity-Oriented Science and Engineering, University of Kinki, Iizuka, Fukuoka 820-8555, Japan

E-mail: [email protected]

In this paper, we consider Dirichlet series $\sum_{k=1}^{\infty}\dfrac{\alpha_k}{\lambda_k^{s}}$ and associated series of type
\[ \sum_{k=1}^{\infty}\alpha_k\,H^{m+M,\,n+N}_{p+P,\,q+Q}\left(z\lambda_k\,\middle|\,\begin{matrix}\{(a_j-A_js,A_j)\}_{j=1}^{n},\ \{(c_j,C_j)\}_{j=1}^{N},\ \{(a_j-A_js,A_j)\}_{j=n+1}^{p},\ \{(c_j,C_j)\}_{j=N+1}^{P}\\[2pt] \{(b_j-B_js,B_j)\}_{j=1}^{m},\ \{(d_j,D_j)\}_{j=1}^{M},\ \{(b_j-B_js,B_j)\}_{j=m+1}^{q},\ \{(d_j,D_j)\}_{j=M+1}^{Q}\end{matrix}\right), \]
where H denotes the Fox H-function. Assuming a functional equation with gamma factors between the Dirichlet series, we prove a modular relation of a general form for the corresponding H-function series. This formula enables us to treat wide varieties of similar formulas in analytic number theory in a systematic way. For example, if the Dirichlet series satisfy a functional equation of Hecke type, the modular relation includes Bochner's formula, the Riesz sum, a K-Bessel expansion, the incomplete gamma expansion, etc. as special cases, and we present examples associated to the Riemann zeta-function. This paper is an expanded version of the preprint [28].

1. Preliminaries

In this section, we fix some notation related to Fox H-functions. Let $\mathbb{C}$ be the set of complex numbers, and let $\mathbb{R}^+$ be the set of positive real numbers. For any set A, the set of finite sequences of elements of A is denoted by $A^{(\infty)}$. We set $\Omega=\big((\mathbb{C}\times\mathbb{R}^+)^{(\infty)}\big)^4$ and for an element
\[ \Delta=\begin{pmatrix}\{(a_j,A_j)\}_{j=1}^{n}\,;\,\{(a_j,A_j)\}_{j=n+1}^{p}\\[2pt] \{(b_j,B_j)\}_{j=1}^{m}\,;\,\{(b_j,B_j)\}_{j=m+1}^{q}\end{pmatrix} \tag{1.1} \]


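of $\Omega$, we define
\[ \Delta^{+}=\begin{pmatrix}\{(a_j,A_j)\}_{j=1}^{n}\,;\,\{(a_j,A_j)\}_{j=n+1}^{p}\\[2pt] -\,;\,\{(b_j,B_j)\}_{j=m+1}^{q}\end{pmatrix}\in\Omega \tag{1.2} \]
and
\[ \Delta^{-}=\begin{pmatrix}-\,;\,\{(a_j,A_j)\}_{j=n+1}^{p}\\[2pt] \{(b_j,B_j)\}_{j=1}^{m}\,;\,\{(b_j,B_j)\}_{j=m+1}^{q}\end{pmatrix}\in\Omega, \tag{1.3} \]
where "−" means the empty sequence. We define the Gamma factor associated to $\Delta$ by
\[ \Delta(s)=\frac{\displaystyle\prod_{j=1}^{m}\Gamma(b_j+B_js)\prod_{j=1}^{n}\Gamma(1-a_j-A_js)}{\displaystyle\prod_{j=n+1}^{p}\Gamma(a_j+A_js)\prod_{j=m+1}^{q}\Gamma(1-b_j-B_js)}. \tag{1.4} \]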

Finally, we define the Fox H-function [12,20,24,25] associated to $\Delta$ by
\[ H(z\mid\Delta)=H^{m,n}_{p,q}\left(z\,\middle|\,\begin{matrix}\{(a_j,A_j)\}_{j=1}^{n},\ \{(a_j,A_j)\}_{j=n+1}^{p}\\[2pt] \{(b_j,B_j)\}_{j=1}^{m},\ \{(b_j,B_j)\}_{j=m+1}^{q}\end{matrix}\right)=\frac{1}{2\pi i}\int_{L}\Delta(s)\,z^{-s}\,ds, \tag{1.5} \]

Fig. 1. The path L


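provided that the integral converges absolutely, where the path $L:\ \gamma-i\infty\to\gamma+i\infty$ ($\gamma$: real) is taken so that the poles of $\Delta^{+}(s)$ lie to the right of L, and those of $\Delta^{-}(s)$ lie to the left of L (cf. Fig. 1). We note that the Meijer G-functions [21,22] are special cases of H-functions:
\[ G^{m,n}_{p,q}\left(z\,\middle|\,\begin{matrix}\{a_j\}_{j=1}^{n},\ \{a_j\}_{j=n+1}^{p}\\[2pt] \{b_j\}_{j=1}^{m},\ \{b_j\}_{j=m+1}^{q}\end{matrix}\right)=H^{m,n}_{p,q}\left(z\,\middle|\,\begin{matrix}\{(a_j,1)\}_{j=1}^{n},\ \{(a_j,1)\}_{j=n+1}^{p}\\[2pt] \{(b_j,1)\}_{j=1}^{m},\ \{(b_j,1)\}_{j=m+1}^{q}\end{matrix}\right). \tag{1.6} \]
For $c\in\mathbb{C}$, we define
\[ \Delta+c=\begin{pmatrix}\{(a_j+A_jc,A_j)\}_{j=1}^{n}\,;\,\{(a_j+A_jc,A_j)\}_{j=n+1}^{p}\\[2pt] \{(b_j+B_jc,B_j)\}_{j=1}^{m}\,;\,\{(b_j+B_jc,B_j)\}_{j=m+1}^{q}\end{pmatrix}\in\Omega, \tag{1.7} \]
and for $C\in\mathbb{R}^+$, we define
\[ C\cdot\Delta=\begin{pmatrix}\{(a_j,CA_j)\}_{j=1}^{n}\,;\,\{(a_j,CA_j)\}_{j=n+1}^{p}\\[2pt] \{(b_j,CB_j)\}_{j=1}^{m}\,;\,\{(b_j,CB_j)\}_{j=m+1}^{q}\end{pmatrix}\in\Omega. \tag{1.8} \]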

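We also set
\[ \Delta^{*}=\begin{pmatrix}\{(1-b_j,B_j)\}_{j=1}^{m}\,;\,\{(1-b_j,B_j)\}_{j=m+1}^{q}\\[2pt] \{(1-a_j,A_j)\}_{j=1}^{n}\,;\,\{(1-a_j,A_j)\}_{j=n+1}^{p}\end{pmatrix}\in\Omega, \tag{1.9} \]
and for another element
\[ \Delta'=\begin{pmatrix}\{(a'_j,A'_j)\}_{j=1}^{n'}\,;\,\{(a'_j,A'_j)\}_{j=n'+1}^{p'}\\[2pt] \{(b'_j,B'_j)\}_{j=1}^{m'}\,;\,\{(b'_j,B'_j)\}_{j=m'+1}^{q'}\end{pmatrix} \]
of $\Omega$, we define
\[ \Delta\oplus\Delta'=\begin{pmatrix}\{(a_j,A_j)\}_{j=1}^{n},\{(a'_j,A'_j)\}_{j=1}^{n'}\,;\,\{(a_j,A_j)\}_{j=n+1}^{p},\{(a'_j,A'_j)\}_{j=n'+1}^{p'}\\[2pt] \{(b_j,B_j)\}_{j=1}^{m},\{(b'_j,B'_j)\}_{j=1}^{m'}\,;\,\{(b_j,B_j)\}_{j=m+1}^{q},\{(b'_j,B'_j)\}_{j=m'+1}^{q'}\end{pmatrix}\in\Omega. \tag{1.10} \]
Then we have the following relations between the Gamma factors:
\[ (\Delta+c)(s)=\Delta(s+c)\ \ (c\in\mathbb{C}),\qquad (C\cdot\Delta)(s)=\Delta(Cs)\ \ (C\in\mathbb{R}^+),\qquad \Delta^{*}(s)=\Delta(-s),\qquad (\Delta\oplus\Delta')(s)=\Delta(s)\,\Delta'(s). \]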
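The four relations between the Gamma factors can be checked numerically. The sketch below is only an illustration under stated assumptions: an element of Ω is stored as a tuple of four Python lists of (parameter, coefficient) pairs in the order (top-left, top-right, bottom-left, bottom-right) of (1.1), the variable s is taken real so that the standard-library `math.gamma` suffices, and the sample parameters are chosen to avoid poles. All names are hypothetical.

```python
from math import gamma, isclose

def gamma_factor(delta, s):
    """Evaluate Delta(s) of (1.4) for real s; delta = (a1, a2, b1, b2)."""
    a1, a2, b1, b2 = delta
    num = 1.0
    for b, B in b1:                 # product over j = 1..m
        num *= gamma(b + B * s)
    for a, A in a1:                 # product over j = 1..n
        num *= gamma(1 - a - A * s)
    den = 1.0
    for a, A in a2:                 # product over j = n+1..p
        den *= gamma(a + A * s)
    for b, B in b2:                 # product over j = m+1..q
        den *= gamma(1 - b - B * s)
    return num / den

def shift(delta, c):
    """Delta + c as in (1.7)."""
    return tuple([(x + X * c, X) for x, X in part] for part in delta)

def scale(delta, C):
    """C . Delta as in (1.8)."""
    return tuple([(x, C * X) for x, X in part] for part in delta)

def star(delta):
    """Delta^* as in (1.9): swap the rows and replace each parameter x by 1 - x."""
    a1, a2, b1, b2 = delta
    flip = lambda part: [(1 - x, X) for x, X in part]
    return (flip(b1), flip(b2), flip(a1), flip(a2))

def oplus(d1, d2):
    """Delta (+) Delta' as in (1.10): concatenate corresponding sequences."""
    return tuple(p + q for p, q in zip(d1, d2))

# Hypothetical sample data with n = m = 1, p = q = 2.
delta = ([(0.3, 1.0)], [(1.2, 0.5)], [(0.7, 2.0)], [(0.4, 1.0)])
delta2 = ([], [], [(0.0, 1.0)], [])   # the factor Gamma(s)
s = 0.1
```

With these definitions, `gamma_factor(shift(delta, c), s)` should agree with `gamma_factor(delta, s + c)` up to rounding, and similarly for the other three relations.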


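Furthermore, we have the following relations between the H-functions:
\[ H(z\mid\Delta+c)=z^{c}\,H(z\mid\Delta)\quad(c\in\mathbb{C}), \tag{1.11} \]
\[ H\Big(z\,\Big|\,\frac{1}{C}\cdot\Delta\Big)=C\,H\big(z^{C}\mid\Delta\big)\quad(C\in\mathbb{R}^+), \tag{1.12} \]
\[ H(z\mid\Delta^{*})=H\Big(\frac{1}{z}\,\Big|\,\Delta\Big), \tag{1.13} \]
\[ \int_{0}^{\infty}H(tz\mid\Delta)\,H\Big(\frac{1}{tz'}\,\Big|\,\Delta'\Big)\frac{dt}{t}=H\Big(\frac{z}{z'}\,\Big|\,\Delta\oplus\Delta'\Big). \tag{1.14} \]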

2. Assumptions

In this section, we state the assumptions in our theorem. Let H and I be two natural numbers. Let $\{\lambda_k^{(h)}\}_{k=1}^{\infty}$ $(1\leqq h\leqq H)$, $\{\mu_k^{(i)}\}_{k=1}^{\infty}$ $(1\leqq i\leqq I)$ be increasing sequences of positive real numbers tending to $\infty$, and let $\{\alpha_k^{(h)}\}_{k=1}^{\infty}$ $(1\leqq h\leqq H)$, $\{\beta_k^{(i)}\}_{k=1}^{\infty}$ $(1\leqq i\leqq I)$ be complex sequences. We assume that the Dirichlet series
\[ \varphi_h(s)=\sum_{k=1}^{\infty}\frac{\alpha_k^{(h)}}{\lambda_k^{(h)s}}\quad(1\leqq h\leqq H) \tag{2.1} \]
and
\[ \psi_i(s)=\sum_{k=1}^{\infty}\frac{\beta_k^{(i)}}{\mu_k^{(i)s}}\quad(1\leqq i\leqq I) \tag{2.2} \]
have finite abscissas of absolute convergence $\sigma_{\varphi_h}$ $(1\leqq h\leqq H)$ and $\sigma_{\psi_i}$ $(1\leqq i\leqq I)$, respectively. We also assume the existence of a meromorphic function $\chi$ satisfying the functional equation
\[ \chi(s)=\begin{cases}\displaystyle\sum_{h=1}^{H}\Delta_1^{(h)}(s)\,\varphi_h(s), & \mathrm{Re}(s)>\max\limits_{1\leqq h\leqq H}\sigma_{\varphi_h},\\[8pt] \displaystyle\sum_{i=1}^{I}\Delta_2^{(i)*}(r-s)\,\psi_i(r-s), & \mathrm{Re}(s)<\min\limits_{1\leqq i\leqq I}\big(r-\sigma_{\psi_i}\big),\end{cases} \tag{2.3} \]
where $\Delta_1^{(h)}$ $(1\leqq h\leqq H)$, $\Delta_2^{(i)}$ $(1\leqq i\leqq I)$ are elements of $\Omega$, and r is a real number. We further assume that among the poles of $\chi(s)$, only finitely many distinct $s_k$ $(1\leqq k\leqq L)$ are neither a pole of $\Delta_1^{(h)+}(s)$ nor a pole of $\big(\Delta_2^{(i)}-r\big)^{-}(s)$. We denote the set of such poles by $S=\{s_k\mid 1\leqq k\leqq L\}$. Now, we introduce a gamma factor associated to $\Delta\in\Omega$ and suppose that for any pair of real numbers $u_1,u_2$ $(u_1<u_2)$, we have a convergence


\[ \lim_{|v|\to\infty}(\Delta-s)(u+iv)\,\chi(u+iv)=0 \]
uniformly in $u_1\leqq u\leqq u_2$. Let
\[ L_1=L_1(s):\ \gamma_1-i\infty\to\gamma_1+i\infty,\qquad L_2=L_2(s):\ \gamma_2-i\infty\to\gamma_2+i\infty \]
($\gamma_1,\gamma_2$: real, $\gamma_2<\gamma_1$) be two non-intersecting paths in the w-plane depending on s. We assume that there exists a real number Y such that $L_1$ and $L_2$ coincide with the two lines $\mathrm{Re}(w)=\gamma_1$ and $\mathrm{Re}(w)=\gamma_2$, respectively, for $|\mathrm{Im}(w)|>Y$, and that all the poles of $\big((\Delta-s)\oplus\Delta_1^{(h)}\big)^{+}(w)$ lie to the right of $L_1$, and those of $\big((\Delta-s)\oplus\Delta_1^{(h)}\big)^{-}(w)$ and S lie to the left of $L_1$, and all the poles of $\big((\Delta-s)\oplus(\Delta_2^{(i)}-r)\big)^{+}(w)$ and S lie to the right of $L_2$, and those of $\big((\Delta-s)\oplus(\Delta_2^{(i)}-r)\big)^{-}(w)$ lie to the left of $L_2$ (cf. Fig. 2).

Fig. 2. The paths L1 and L2

Under these assumptions, we define a function $X(z,s,\Delta)$ by
\[ X(z,s,\Delta)=\frac{1}{2\pi i}\int_{L_1}\Delta(w-s)\,\chi(w)\,z^{-w}\,dw. \tag{2.4} \]
Then, by the Cauchy residue theorem, we have
\[ X(z,s,\Delta)=\frac{1}{2\pi i}\int_{L_2}\Delta(w-s)\,\chi(w)\,z^{-w}\,dw+\sum_{k=1}^{L}\mathrm{Res}\big(\Delta(w-s)\,\chi(w)\,z^{-w},\ w=s_k\big). \tag{2.5} \]

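3. Theorem

In this section we state and prove our theorem.

Theorem 3.1. For the H-function series
\[ \Phi_h(z,s,\Delta)=\sum_{k=1}^{\infty}\alpha_k^{(h)}\,H\big(z\lambda_k^{(h)}\,\big|\,(\Delta-s)\oplus\Delta_1^{(h)}\big) \tag{3.1} \]
and
\[ \Psi_i(z,s,\Delta)=\sum_{k=1}^{\infty}\beta_k^{(i)}\,H\big(z\mu_k^{(i)}\,\big|\,(\Delta-s)\oplus\Delta_2^{(i)*}\big), \tag{3.2} \]
the following modular relation holds:
\[ X(z,s,\Delta)=\begin{cases}\displaystyle\sum_{h=1}^{H}\Phi_h(z,s,\Delta) & \text{if } L_1 \text{ can be taken to the right of } \max\limits_{1\leqq h\leqq H}\sigma_{\varphi_h},\\[10pt] \displaystyle z^{-r}\sum_{i=1}^{I}\Psi_i\Big(\frac{1}{z},\,r-s,\,\Delta^{*}\Big)+\sum_{k=1}^{L}\mathrm{Res}\big(\Delta(w-s)\,\chi(w)\,z^{-w},\ w=s_k\big) & \text{if } L_2 \text{ can be taken to the left of } \min\limits_{1\leqq i\leqq I}\big(r-\sigma_{\psi_i}\big),\end{cases} \tag{3.3} \]
for $\Delta$ and z such that the H-functions on the right-hand side converge absolutely.

Proof. If $L_1$ can be taken to the right of $\max_{1\leqq h\leqq H}\sigma_{\varphi_h}$ (cf. Fig. 3), by using (2.4), (2.3) and (2.1) successively, we have
\begin{align*}
X(z,s,\Delta)&=\frac{1}{2\pi i}\int_{L_1}\Delta(w-s)\,\chi(w)\,z^{-w}\,dw\\
&=\frac{1}{2\pi i}\int_{L_1}\Delta(w-s)\sum_{h=1}^{H}\Delta_1^{(h)}(w)\,\varphi_h(w)\,z^{-w}\,dw\\
&=\frac{1}{2\pi i}\int_{L_1}\Delta(w-s)\sum_{h=1}^{H}\Delta_1^{(h)}(w)\sum_{k=1}^{\infty}\frac{\alpha_k^{(h)}}{\big(z\lambda_k^{(h)}\big)^{w}}\,dw\\
&=\sum_{h=1}^{H}\sum_{k=1}^{\infty}\alpha_k^{(h)}\,\frac{1}{2\pi i}\int_{L_1}(\Delta-s)(w)\,\Delta_1^{(h)}(w)\,\big(z\lambda_k^{(h)}\big)^{-w}\,dw,
\end{align*}
where the exchange of the integration and the summation is permitted by absolute convergence.

Fig. 3. The paths L1 and L2

Now the resulting integral is an H-function, and by (3.1), we have
\[ X(z,s,\Delta)=\sum_{h=1}^{H}\sum_{k=1}^{\infty}\alpha_k^{(h)}\,H\big(z\lambda_k^{(h)}\,\big|\,(\Delta-s)\oplus\Delta_1^{(h)}\big)=\sum_{h=1}^{H}\Phi_h(z,s,\Delta). \]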

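On the other hand, if $L_2$ can be taken to the left of $\min_{1\leqq i\leqq I}\big(r-\sigma_{\psi_i}\big)$ (cf. Fig. 4), by using (2.5), (2.3) and (2.2) successively, we have
\begin{align*}
X(z,s,\Delta)&-\sum_{k=1}^{L}\mathrm{Res}\big(\Delta(w-s)\,\chi(w)\,z^{-w},\ w=s_k\big)\\
&=\frac{1}{2\pi i}\int_{L_2}\Delta(w-s)\,\chi(w)\,z^{-w}\,dw\\
&=\frac{1}{2\pi i}\int_{L_2}\Delta(w-s)\sum_{i=1}^{I}\Delta_2^{(i)*}(r-w)\,\psi_i(r-w)\,z^{-w}\,dw\\
&=\sum_{i=1}^{I}\frac{1}{2\pi i}\int_{L_2}\Delta(w-s)\,\Delta_2^{(i)}(w-r)\sum_{k=1}^{\infty}\beta_k^{(i)}\Big(\frac{z}{\mu_k^{(i)}}\Big)^{-w}\Big(\frac{1}{\mu_k^{(i)}}\Big)^{r}\,dw\\
&=\sum_{i=1}^{I}\sum_{k=1}^{\infty}\beta_k^{(i)}\,\frac{1}{2\pi i}\int_{L_2}(\Delta-s)(w)\,\big(\Delta_2^{(i)}-r\big)(w)\Big(\frac{z}{\mu_k^{(i)}}\Big)^{-w}\,dw\ \Big(\frac{1}{\mu_k^{(i)}}\Big)^{r},
\end{align*}

Fig. 4. The paths L1 and L2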


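where we changed the order of the integration and the summation. Now the resulting integral is an H-function which can be transformed as follows by the properties (1.11) and (1.13):
\begin{align*}
\frac{1}{2\pi i}\int_{L_2}(\Delta-s)(w)\,\big(\Delta_2^{(i)}-r\big)(w)\Big(\frac{z}{\mu_k^{(i)}}\Big)^{-w}\,dw\ \Big(\frac{1}{\mu_k^{(i)}}\Big)^{r}
&=z^{-r}\,H\Big(\frac{z}{\mu_k^{(i)}}\,\Big|\,(\Delta-s)\oplus\big(\Delta_2^{(i)}-r\big)\Big)\Big(\frac{z}{\mu_k^{(i)}}\Big)^{r}\\
&=z^{-r}\,H\Big(\frac{z}{\mu_k^{(i)}}\,\Big|\,\big(\Delta+(r-s)\big)\oplus\Delta_2^{(i)}\Big)\\
&=z^{-r}\,H\Big(\frac{\mu_k^{(i)}}{z}\,\Big|\,\big(\Delta^{*}-(r-s)\big)\oplus\Delta_2^{(i)*}\Big).
\end{align*}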

Therefore, we have, by (3.2),
\begin{align*}
X(z,s,\Delta)-\sum_{k=1}^{L}\mathrm{Res}\big(\Delta(w-s)\,\chi(w)\,z^{-w},\ w=s_k\big)
&=z^{-r}\sum_{i=1}^{I}\sum_{k=1}^{\infty}\beta_k^{(i)}\,H\Big(\frac{\mu_k^{(i)}}{z}\,\Big|\,\big(\Delta^{*}-(r-s)\big)\oplus\Delta_2^{(i)*}\Big)\\
&=z^{-r}\sum_{i=1}^{I}\Psi_i\Big(\frac{1}{z},\,r-s,\,\Delta^{*}\Big).
\end{align*}
This completes the proof of (3.3).

Remark 3.1. Let $p:\Omega\to\Xi=\big(\mathbb{R}^{+(\infty)}\big)^4$ be a map defined by
\[ \begin{pmatrix}\{(a_j,A_j)\}_{j=1}^{n}\,;\,\{(a_j,A_j)\}_{j=n+1}^{p}\\[2pt] \{(b_j,B_j)\}_{j=1}^{m}\,;\,\{(b_j,B_j)\}_{j=m+1}^{q}\end{pmatrix}\mapsto\begin{pmatrix}\{A_j\}_{j=1}^{n}\,;\,\{A_j\}_{j=n+1}^{p}\\[2pt] \{B_j\}_{j=1}^{m}\,;\,\{B_j\}_{j=m+1}^{q}\end{pmatrix}, \]
then for each element $\Lambda\in\Xi$, the inverse image $p^{-1}(\Lambda)$ is a finite-dimensional complex vector space, and we can regard the space $\Omega=\coprod_{\Lambda\in\Xi}p^{-1}(\Lambda)$ as a disjoint union. In this sense, we may regard $\Delta\in\Omega$ as a "complex variable", and then the use of variable s in Theorem 3.1 is useful but unnecessary, since we have $\Phi_h(z,s,\Delta)=\Phi_h(z,0,\Delta-s)$ and $\Psi_i(z,s,\Delta)=\Psi_i(z,0,\Delta-s)$.


4. A General Modular Relation associated to the Riemann Zeta-function

Even in the simplest case of Hecke type

$$
\chi(s) = \begin{cases} \Gamma(s)\,\varphi(s), & \operatorname{Re}(s) > \sigma_\varphi, \\ \Gamma(r-s)\,\psi(r-s), & \operatorname{Re}(s) < r - \sigma_\psi, \end{cases} \tag{4.1}
$$

(i.e. $H = I = 1$, $\Delta_1 = \Delta_2^{*} = \begin{pmatrix} - \,;\, - \\ (0,1) \,;\, - \end{pmatrix}$), the above modular relation (3.3) includes Bochner's formula [6], the Riesz sum [8], a K-Bessel expansion [5], the incomplete Gamma expansion [9,11,17,27], and another formula of Bochner [7], to name a few. Examples of non-Hecke type were studied in [1–4], and also in [14,15] using our formalism. Numerous concrete examples will appear in the forthcoming book [16].

In this section, we consider the general modular relation associated to the Riemann zeta-function defined by the Dirichlet series

$$
\zeta(s) = \sum_{k=1}^{\infty} \frac{1}{k^{s}}, \qquad \operatorname{Re}(s) > 1. \tag{4.2}
$$

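If we set

$$
Z(s) = \frac{\zeta(2s)}{\pi^{s}} = \sum_{k=1}^{\infty} \frac{1}{(\pi k^{2})^{s}}, \tag{4.3}
$$

then it satisfies the functional equation of Hecke type:

$$
\Gamma(s)\, Z(s) = \Gamma\bigl(\tfrac12 - s\bigr)\, Z\bigl(\tfrac12 - s\bigr), \tag{4.4}
$$

after analytic continuation. By Theorem 3.1, we have the following modular relation:

$$
\begin{aligned}
&\sum_{k=1}^{\infty} H^{m+1,n}_{p,q+1}\!\left( z\pi k^{2} \,\middle|\, \begin{array}{l} \{(1-a_j-A_j s,\, A_j)\}_{j=1}^{n},\ \{(a_j-A_j s,\, A_j)\}_{j=n+1}^{p} \\ (0,1),\ \{(b_j-B_j s,\, B_j)\}_{j=1}^{m},\ \{(1-b_j-B_j s,\, B_j)\}_{j=m+1}^{q} \end{array} \right) \\
&\qquad - \operatorname{Res}\bigl(\Gamma(w)\,Z(w)\,\Delta(w-s)\,z^{-w},\ w=0\bigr) \\
&= z^{-\frac12} \sum_{k=1}^{\infty} H^{n+1,m}_{q,p+1}\!\left( \frac{\pi k^{2}}{z} \,\middle|\, \begin{array}{l} \{(1-b_j-B_j(\tfrac12-s),\, B_j)\}_{j=1}^{m},\ \{(b_j-B_j(\tfrac12-s),\, B_j)\}_{j=m+1}^{q} \\ (0,1),\ \{(a_j-A_j(\tfrac12-s),\, A_j)\}_{j=1}^{n},\ \{(1-a_j-A_j(\tfrac12-s),\, A_j)\}_{j=n+1}^{p} \end{array} \right) \\
&\qquad + \operatorname{Res}\bigl(\Gamma(w)\,Z(w)\,\Delta(w-s)\,z^{-w},\ w=\tfrac12\bigr),
\end{aligned}
\tag{4.5}
$$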


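which can be rewritten as

$$
\begin{aligned}
&\frac{1}{\pi^{s}} \sum_{k=1}^{\infty} \frac{1}{k^{2s}}\, H^{m+1,n}_{p,q+1}\!\left( z\pi k^{2} \,\middle|\, \begin{array}{l} \{(1-a_j,\, A_j)\}_{j=1}^{n},\ \{(a_j,\, A_j)\}_{j=n+1}^{p} \\ (s,1),\ \{(b_j,\, B_j)\}_{j=1}^{m},\ \{(1-b_j,\, B_j)\}_{j=m+1}^{q} \end{array} \right) \\
&\qquad + \operatorname{Res}\Bigl(\Gamma(w)\,Z(w)\,\Delta\bigl(-w-s+\tfrac12\bigr)\,z^{s-\frac12+w},\ w=\tfrac12\Bigr) \\
&= \frac{1}{\pi^{\frac12-s}} \sum_{k=1}^{\infty} \frac{1}{k^{1-2s}}\, H^{n+1,m}_{q,p+1}\!\left( \frac{\pi k^{2}}{z} \,\middle|\, \begin{array}{l} \{(1-b_j,\, B_j)\}_{j=1}^{m},\ \{(b_j,\, B_j)\}_{j=m+1}^{q} \\ \bigl(\tfrac12-s,1\bigr),\ \{(a_j,\, A_j)\}_{j=1}^{n},\ \{(1-a_j,\, A_j)\}_{j=n+1}^{p} \end{array} \right) \\
&\qquad + \operatorname{Res}\bigl(\Gamma(w)\,Z(w)\,\Delta(w-s)\,z^{s-w},\ w=\tfrac12\bigr),
\end{aligned}
\tag{4.6}
$$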

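and in this form, the modular relation is given by the following simultaneous exchange of the parameters:

$$
\begin{aligned}
s &\leftrightarrow \tfrac12 - s, \qquad z \leftrightarrow \frac{1}{z}, \\
\{(a_j, A_j)\}_{j=1}^{n} &\leftrightarrow \{(b_j, B_j)\}_{j=1}^{m}, \\
\{(a_j, A_j)\}_{j=n+1}^{p} &\leftrightarrow \{(b_j, B_j)\}_{j=m+1}^{q}
\end{aligned}
\qquad \bigl(\Delta \leftrightarrow \Delta^{*}\bigr). \tag{4.7}
$$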
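The simplest instance of the exchange $z \leftrightarrow 1/z$ is the theta relation (4.9) obtained in Example 4.1 below. The following Python sketch checks that relation numerically for real $z > 0$; it is only a sanity check with a truncated series, and the function name `theta` and the truncation length `terms` are illustrative choices rather than anything taken from the text.

```python
import math

def theta(z, terms=50):
    """Truncated theta series: sum over k in Z of exp(-pi * z * k^2), real z > 0.

    The k = 0 term contributes 1; the terms for k and -k are equal.
    """
    tail = sum(math.exp(-math.pi * z * k * k) for k in range(1, terms + 1))
    return 1.0 + 2.0 * tail

def theta_relation_gap(z):
    """Absolute difference between theta(z) and z^(-1/2) * theta(1/z)."""
    return abs(theta(z) - theta(1.0 / z) / math.sqrt(z))

if __name__ == "__main__":
    for z in (0.3, 1.0, 2.5):
        print(z, theta(z), theta(1.0 / z) / math.sqrt(z), theta_relation_gap(z))
```

Dividing both sides by 2 recovers Bochner's formula in the form with the extra terms $\tfrac12$, since $\theta(z) = 1 + 2\sum_{k\ge 1} e^{-z\pi k^2}$.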

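Example 4.1. (Bochner's formula [6]). As the simplest example of (4.5), we have

$$
\sum_{k=1}^{\infty} H^{1,0}_{0,1}\!\left( z\pi k^{2} \,\middle|\, \begin{array}{c} - \\ (0,1) \end{array} \right) + \frac12
= z^{-\frac12} \sum_{k=1}^{\infty} H^{1,0}_{0,1}\!\left( \frac{\pi k^{2}}{z} \,\middle|\, \begin{array}{c} - \\ (0,1) \end{array} \right) + \frac12\, z^{-\frac12}.
$$

Since

$$
H^{1,0}_{0,1}\!\left( z \,\middle|\, \begin{array}{c} - \\ (0,1) \end{array} \right) = e^{-z}, \tag{4.8}
$$

we get Bochner's formula:

$$
\sum_{k=1}^{\infty} e^{-z\pi k^{2}} + \frac12 = z^{-\frac12} \left( \sum_{k=1}^{\infty} e^{-\frac{\pi k^{2}}{z}} + \frac12 \right),
$$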


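which can be rewritten as the modular relation of the theta function:

$$
\theta(z) = \frac{1}{\sqrt{z}}\, \theta\!\left(\frac{1}{z}\right), \qquad \theta(z) = \sum_{k \in \mathbb{Z}} e^{-z\pi k^{2}}, \tag{4.9}
$$

where $\operatorname{Re}(z) > 0$.

Example 4.2. (Riesz sum [8]). As an example of (4.6), we have

$$
\begin{aligned}
&\frac{1}{\pi^{s}} \sum_{k=1}^{\infty} \frac{1}{k^{2s}}\, H^{1,0}_{1,1}\!\left( z\pi k^{2} \,\middle|\, \begin{array}{c} (a,1) \\ (s,1) \end{array} \right)
+ \operatorname{Res}\!\left( \frac{\Gamma(w)\,Z(w)}{\Gamma\bigl(a-w-s+\tfrac12\bigr)}\, z^{s-\frac12+w},\ w=\tfrac12 \right) \\
&= \frac{1}{\pi^{\frac12-s}} \sum_{k=1}^{\infty} \frac{1}{k^{1-2s}}\, H^{1,0}_{0,2}\!\left( \frac{\pi k^{2}}{z} \,\middle|\, \begin{array}{c} - \\ \bigl(\tfrac12-s,1\bigr),\ (1-a,1) \end{array} \right)
+ \operatorname{Res}\!\left( \frac{\Gamma(w)\,Z(w)}{\Gamma(a+w-s)}\, z^{s-w},\ w=\tfrac12 \right).
\end{aligned}
$$

Setting $a = 1$, and using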

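$$
H^{1,0}_{1,1}\!\left( z \,\middle|\, \begin{array}{c} (1,1) \\ (s,1) \end{array} \right)
= \begin{cases} \dfrac{1}{\Gamma(1-s)} \left( \dfrac{z}{1-z} \right)^{s}, & |z| < 1, \\[2mm] 0, & |z| > 1, \end{cases} \tag{4.10}
$$

and

$$
H^{1,0}_{0,2}\!\left( z \,\middle|\, \begin{array}{c} - \\ (s,1),\ (0,1) \end{array} \right) = z^{\frac{s}{2}}\, J_s\bigl(2\sqrt{z}\bigr), \tag{4.11}
$$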

where Js denotes the Bessel function of the first kind, we get the Riesz sum formula: 1 Γ(1 − s)

X

−w 1,

(4.29)

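is the Hurwitz zeta-function. We note that Formula (4.28) leads to Hurwitz's formula:

$$
\zeta(2s, w) = \frac{\Gamma(1-2s)}{(2\pi)^{1-2s}} \left( e^{\frac{1-2s}{2}\pi i}\, l_{1-2s}(-w) + e^{-\frac{1-2s}{2}\pi i}\, l_{1-2s}(w) \right), \tag{4.30}
$$

where

$$
l_s(w) = \sum_{k=1}^{\infty} \frac{e^{2\pi i k w}}{k^{s}}, \qquad \operatorname{Re}(s) > 1, \tag{4.31}
$$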

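is the Lerch zeta-function or the polylogarithm function.

5. A General Modular Relation associated to a product of the Riemann Zeta-function

In this section, we consider a product of the Riemann zeta-function

$$
Z\bigl(s+\tfrac{\nu}{2}\bigr)\, Z\bigl(s-\tfrac{\nu}{2}\bigr) = \frac{\zeta(2s+\nu)\,\zeta(2s-\nu)}{\pi^{2s}}
= \sum_{k=1}^{\infty}\sum_{l=1}^{\infty} \frac{l^{\nu} k^{-\nu}}{(\pi^{2} l^{2} k^{2})^{s}}
= \sum_{k=1}^{\infty} \frac{\sigma_{2\nu}(k)}{k^{\nu}\,(\pi^{2} k^{2})^{s}}, \tag{5.1}
$$

where $\sigma_t(k) = \sum_{j \mid k} j^{t}$.

By the functional equation (4.4), we have

$$
\Gamma\bigl(s+\tfrac{\nu}{2}\bigr)\,\Gamma\bigl(s-\tfrac{\nu}{2}\bigr)\, Z\bigl(s+\tfrac{\nu}{2}\bigr)\, Z\bigl(s-\tfrac{\nu}{2}\bigr)
= \Gamma\bigl(\tfrac12-s+\tfrac{\nu}{2}\bigr)\,\Gamma\bigl(\tfrac12-s-\tfrac{\nu}{2}\bigr)\, Z\bigl(\tfrac12-s+\tfrac{\nu}{2}\bigr)\, Z\bigl(\tfrac12-s-\tfrac{\nu}{2}\bigr). \tag{5.2}
$$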
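The rearrangement behind (5.1) can be checked numerically. The Python sketch below compares three truncated quantities: the divisor-sum series, the double sum restricted to $lk \le N$ (which should agree with it up to rounding, since grouping by $n = lk$ gives $\sum_{l \mid n} l^{\nu}(n/l)^{-\nu} = \sigma_{2\nu}(n)/n^{\nu}$), and the product of truncated zeta sums (which agrees only up to the truncation error). The function names, the cut-off `n_max`, and the sample values $s = 2$, $\nu = 0.5$ are illustrative assumptions, chosen so that all series converge absolutely.

```python
import math

def sigma_table(t, n_max):
    """Return a list with sigma[k] = sum of j**t over divisors j of k, 1 <= k <= n_max."""
    sigma = [0.0] * (n_max + 1)
    for j in range(1, n_max + 1):
        jt = float(j) ** t
        for k in range(j, n_max + 1, j):
            sigma[k] += jt
    return sigma

def divisor_side(s, nu, n_max):
    """Truncated right-hand side of (5.1): sum of sigma_{2nu}(k) / (k^nu (pi^2 k^2)^s)."""
    sigma = sigma_table(2 * nu, n_max)
    return sum(sigma[k] / (k ** nu * (math.pi ** 2 * k * k) ** s)
               for k in range(1, n_max + 1))

def double_sum_side(s, nu, n_max):
    """Double sum of l^nu k^(-nu) / (pi^2 l^2 k^2)^s over pairs with l * k <= n_max."""
    total = 0.0
    for k in range(1, n_max + 1):
        for l in range(1, n_max // k + 1):
            total += l ** nu * k ** (-nu) / (math.pi ** 2 * l * l * k * k) ** s
    return total

def zeta_product_side(s, nu, n_max):
    """Product of truncated zeta sums zeta(2s+nu) zeta(2s-nu), divided by pi^(2s)."""
    z_plus = sum(k ** -(2 * s + nu) for k in range(1, n_max + 1))
    z_minus = sum(k ** -(2 * s - nu) for k in range(1, n_max + 1))
    return z_plus * z_minus / math.pi ** (2 * s)

if __name__ == "__main__":
    s, nu, n_max = 2.0, 0.5, 2000
    print(divisor_side(s, nu, n_max))
    print(double_sum_side(s, nu, n_max))
    print(zeta_product_side(s, nu, n_max))
```

The first two values differ only by floating-point rounding, while the third differs by the contribution of pairs with $l, k \le N$ and $lk > N$, which is small for these parameters.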


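Therefore, by Theorem 3.1, we have the following modular relation:

$$
\begin{aligned}
&\sum_{k=1}^{\infty} \frac{\sigma_{2\nu}(k)}{k^{\nu}}\, H^{m+2,n}_{p,q+2}\!\left( z\pi^{2} k^{2} \,\middle|\, \begin{array}{l} \{(1-a_j-A_j s,\, A_j)\}_{j=1}^{n},\ \{(a_j-A_j s,\, A_j)\}_{j=n+1}^{p} \\ \bigl(\tfrac{\nu}{2},1\bigr),\ \bigl(-\tfrac{\nu}{2},1\bigr),\ \{(b_j-B_j s,\, B_j)\}_{j=1}^{m},\ \{(1-b_j-B_j s,\, B_j)\}_{j=m+1}^{q} \end{array} \right) \\
&= z^{-\frac12} \sum_{k=1}^{\infty} \frac{\sigma_{2\nu}(k)}{k^{\nu}}\, H^{n+2,m}_{q,p+2}\!\left( \frac{\pi^{2} k^{2}}{z} \,\middle|\, \begin{array}{l} \{(1-b_j-B_j(\tfrac12-s),\, B_j)\}_{j=1}^{m},\ \{(b_j-B_j(\tfrac12-s),\, B_j)\}_{j=m+1}^{q} \\ \bigl(\tfrac{\nu}{2},1\bigr),\ \bigl(-\tfrac{\nu}{2},1\bigr),\ \{(a_j-A_j(\tfrac12-s),\, A_j)\}_{j=1}^{n},\ \{(1-a_j-A_j(\tfrac12-s),\, A_j)\}_{j=n+1}^{p} \end{array} \right) \\
&\qquad + \sum_{v \in S_\nu} \operatorname{Res}\Bigl( \Gamma\bigl(w+\tfrac{\nu}{2}\bigr)\,\Gamma\bigl(w-\tfrac{\nu}{2}\bigr)\, Z\bigl(w+\tfrac{\nu}{2}\bigr)\, Z\bigl(w-\tfrac{\nu}{2}\bigr)\, \Delta(w-s)\, z^{-w},\ w=v \Bigr),
\end{aligned}
\tag{5.3}
$$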

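which can be rewritten as

$$
\begin{aligned}
&\frac{1}{\pi^{2s}} \sum_{k=1}^{\infty} \frac{\sigma_{2\nu}(k)}{k^{2s+\nu}}\, H^{m+2,n}_{p,q+2}\!\left( z\pi^{2} k^{2} \,\middle|\, \begin{array}{l} \{(1-a_j,\, A_j)\}_{j=1}^{n},\ \{(a_j,\, A_j)\}_{j=n+1}^{p} \\ \bigl(s+\tfrac{\nu}{2},1\bigr),\ \bigl(s-\tfrac{\nu}{2},1\bigr),\ \{(b_j,\, B_j)\}_{j=1}^{m},\ \{(1-b_j,\, B_j)\}_{j=m+1}^{q} \end{array} \right) \\
&= \frac{1}{\pi^{1-2s}} \sum_{k=1}^{\infty} \frac{\sigma_{2\nu}(k)}{k^{1-2s+\nu}} \\
&\qquad \times H^{n+2,m}_{q,p+2}\!\left( \frac{\pi^{2} k^{2}}{z} \,\middle|\, \begin{array}{l} \{(1-b_j,\, B_j)\}_{j=1}^{m},\ \{(b_j,\, B_j)\}_{j=m+1}^{q} \\ \bigl(\tfrac12-s+\tfrac{\nu}{2},1\bigr),\ \bigl(\tfrac12-s-\tfrac{\nu}{2},1\bigr),\ \{(a_j,\, A_j)\}_{j=1}^{n},\ \{(1-a_j,\, A_j)\}_{j=n+1}^{p} \end{array} \right) \\
&\qquad + \sum_{v \in S_\nu} \operatorname{Res}\Bigl( \Gamma\bigl(w+\tfrac{\nu}{2}\bigr)\,\Gamma\bigl(w-\tfrac{\nu}{2}\bigr)\, Z\bigl(w+\tfrac{\nu}{2}\bigr)\, Z\bigl(w-\tfrac{\nu}{2}\bigr)\, \Delta(w-s)\, z^{s-w},\ w=v \Bigr),
\end{aligned}
\tag{5.4}
$$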

where © 1 ª 0.   © 2 ª Sν = − 14 , 14 , 34   ª © ν ± 2 , ± ν2 + 21

is the set of poles.

ν = 0, ν = ± 12 ,

ν 6= 0, ± 12 ,

(5.5)


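Example 5.1. As the simplest example of (5.3), we have

$$
\begin{aligned}
\sum_{k=1}^{\infty} \frac{\sigma_{2\nu}(k)}{k^{\nu}}\, H^{2,0}_{0,2}\!\left( z\pi^{2} k^{2} \,\middle|\, \begin{array}{c} - \\ \bigl(\tfrac{\nu}{2},1\bigr),\ \bigl(-\tfrac{\nu}{2},1\bigr) \end{array} \right)
&= z^{-\frac12} \sum_{k=1}^{\infty} \frac{\sigma_{2\nu}(k)}{k^{\nu}}\, H^{2,0}_{0,2}\!\left( \frac{\pi^{2} k^{2}}{z} \,\middle|\, \begin{array}{c} - \\ \bigl(\tfrac{\nu}{2},1\bigr),\ \bigl(-\tfrac{\nu}{2},1\bigr) \end{array} \right) \\
&\quad + \sum_{v \in S_\nu} \operatorname{Res}\Bigl( \Gamma\bigl(w+\tfrac{\nu}{2}\bigr)\,\Gamma\bigl(w-\tfrac{\nu}{2}\bigr)\, Z\bigl(w+\tfrac{\nu}{2}\bigr)\, Z\bigl(w-\tfrac{\nu}{2}\bigr)\, z^{-w},\ w=v \Bigr).
\end{aligned}
$$

Since

$$
H^{2,0}_{0,2}\!\left( z \,\middle|\, \begin{array}{c} - \\ \bigl(\tfrac{\nu}{2},1\bigr),\ \bigl(-\tfrac{\nu}{2},1\bigr) \end{array} \right) = 2\, K_\nu\bigl(2\sqrt{z}\bigr), \tag{5.6}
$$

we get

$$
\begin{aligned}
2 \sum_{k=1}^{\infty} \frac{\sigma_{2\nu}(k)}{k^{\nu}}\, K_\nu\bigl(2\sqrt{z}\,\pi k\bigr)
&= 2\, z^{-\frac12} \sum_{k=1}^{\infty} \frac{\sigma_{2\nu}(k)}{k^{\nu}}\, K_\nu\!\left( \frac{2\pi k}{\sqrt{z}} \right) \\
&\quad + \sum_{v \in S_\nu} \operatorname{Res}\Bigl( \Gamma\bigl(w+\tfrac{\nu}{2}\bigr)\,\Gamma\bigl(w-\tfrac{\nu}{2}\bigr)\, Z\bigl(w+\tfrac{\nu}{2}\bigr)\, Z\bigl(w-\tfrac{\nu}{2}\bigr)\, z^{-w},\ w=v \Bigr).
\end{aligned}
\tag{5.7}
$$

In particular, for $\nu = \tfrac12$, we have

$$
H^{2,0}_{0,2}\!\left( z \,\middle|\, \begin{array}{c} - \\ \bigl(\tfrac14,1\bigr),\ \bigl(-\tfrac14,1\bigr) \end{array} \right) = \frac{\sqrt{\pi}}{\sqrt[4]{z}}\, e^{-2\sqrt{z}}, \tag{5.8}
$$

and by using the relation

$$
\sum_{k=1}^{\infty} \frac{\sigma_1(k)}{k}\, e^{-kz}
= \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} \frac{1}{m}\, e^{-nmz}
= -\sum_{n=1}^{\infty} \log\bigl(1 - e^{-nz}\bigr)
= -\log\left( \eta\!\left( \frac{iz}{2\pi} \right) \right) - \frac{z}{24}, \tag{5.9}
$$

where

$$
\eta(\tau) = e^{\frac{\pi i}{12}\tau} \prod_{n=1}^{\infty} \bigl(1 - e^{2\pi i n \tau}\bigr)
$$

is the eta function, we get its transformation formula

$$
\log\bigl(\eta(\tau)\bigr) = \log\left( \eta\!\left( -\frac{1}{\tau} \right) \right) - \frac12 \log(-i\tau) \tag{5.10}
$$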


or

$$
\eta\!\left( -\frac{1}{\tau} \right) = \sqrt{-i\tau}\; \eta(\tau). \tag{5.11}
$$
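Both the Lambert-series relation used above and the transformation formula (5.11) are easy to test numerically. The following Python sketch uses a truncated product for $\eta$ and the principal branch of the square root (appropriate here because $\operatorname{Re}(-i\tau) = \operatorname{Im}(\tau) > 0$). The function names, the truncation lengths, and the sample points $\tau = 0.3 + 0.8i$ and $z = 1$ are illustrative assumptions, not part of the text.

```python
import cmath
import math

def eta(tau, terms=200):
    """Truncated Dedekind eta function for Im(tau) > 0:
    exp(pi*i*tau/12) * prod_{n=1}^{terms} (1 - exp(2*pi*i*n*tau))."""
    q = cmath.exp(2j * math.pi * tau)
    product = 1.0 + 0j
    q_power = 1.0 + 0j
    for _ in range(terms):
        q_power *= q              # q_power runs through q^1, q^2, ...
        product *= (1 - q_power)
    return cmath.exp(1j * math.pi * tau / 12) * product

def eta_transformation_gap(tau):
    """|eta(-1/tau) - sqrt(-i*tau) * eta(tau)| with the principal square root."""
    return abs(eta(-1 / tau) - cmath.sqrt(-1j * tau) * eta(tau))

def sigma1(k):
    """Sum of the positive divisors of k."""
    return sum(j for j in range(1, k + 1) if k % j == 0)

def lambert_series(z, terms=200):
    """Truncated sum of sigma_1(k)/k * exp(-k*z) for real z > 0."""
    return sum(sigma1(k) / k * math.exp(-k * z) for k in range(1, terms + 1))

def lambert_via_eta(z):
    """-log(eta(i*z/(2*pi))) - z/24; eta is real and positive at this point."""
    value = eta(1j * z / (2 * math.pi))
    return -math.log(value.real) - z / 24

if __name__ == "__main__":
    tau = 0.3 + 0.8j
    print(eta_transformation_gap(tau))
    print(lambert_series(1.0), lambert_via_eta(1.0))
```

For points with small $\operatorname{Im}(\tau)$ the product converges slowly, so `terms` would need to be increased.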

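Example 5.2. (Oppenheim's formula [23]). As the second example of (5.3), we have

$$
\begin{aligned}
&\sum_{k=1}^{\infty} \frac{\sigma_{2\nu}(k)}{k^{\nu}}\, H^{2,0}_{2,2}\!\left( z\pi^{2} k^{2} \,\middle|\, \begin{array}{c} (a-s,1),\ (b-s,1) \\ \bigl(\tfrac{\nu}{2},1\bigr),\ \bigl(-\tfrac{\nu}{2},1\bigr) \end{array} \right) \\
&= z^{-\frac12} \sum_{k=1}^{\infty} \frac{\sigma_{2\nu}(k)}{k^{\nu}} \\
&\qquad \times H^{2,0}_{0,4}\!\left( \frac{\pi^{2} k^{2}}{z} \,\middle|\, \begin{array}{c} - \\ \bigl(\tfrac{\nu}{2},1\bigr),\ \bigl(-\tfrac{\nu}{2},1\bigr),\ \bigl(\tfrac12-a+s,1\bigr),\ \bigl(\tfrac12-b+s,1\bigr) \end{array} \right) \\
&\qquad + \sum_{v \in S_\nu} \operatorname{Res}\!\left( \frac{\Gamma\bigl(w+\tfrac{\nu}{2}\bigr)\,\Gamma\bigl(w-\tfrac{\nu}{2}\bigr)\, Z\bigl(w+\tfrac{\nu}{2}\bigr)\, Z\bigl(w-\tfrac{\nu}{2}\bigr)}{\Gamma(a+w-s)\,\Gamma(b+w-s)}\, z^{-w},\ w=v \right).
\end{aligned}
$$

Setting $a = s + \tfrac{\nu}{2} + 1$, $b = s - \tfrac{\nu}{2}$, and using

$$
H^{2,0}_{2,2}\!\left( z \,\middle|\, \begin{array}{c} \bigl(\tfrac{\nu}{2}+1,1\bigr),\ \bigl(-\tfrac{\nu}{2},1\bigr) \\ \bigl(\tfrac{\nu}{2},1\bigr),\ \bigl(-\tfrac{\nu}{2},1\bigr) \end{array} \right)
= H^{1,0}_{1,1}\!\left( z \,\middle|\, \begin{array}{c} \bigl(\tfrac{\nu}{2}+1,1\bigr) \\ \bigl(\tfrac{\nu}{2},1\bigr) \end{array} \right)
= \begin{cases} z^{\frac{\nu}{2}}, & |z| < 1, \\ 0, & |z| > 1, \end{cases} \tag{5.12}
$$

and

$$
H^{2,0}_{0,4}\!\left( z \,\middle|\, \begin{array}{c} - \\ \bigl(\tfrac{\nu}{2},1\bigr),\ \bigl(-\tfrac{\nu}{2},1\bigr),\ \bigl(-\tfrac12-\tfrac{\nu}{2},1\bigr),\ \bigl(\tfrac12+\tfrac{\nu}{2},1\bigr) \end{array} \right)
= z^{-\frac14}\, F_{2\nu+1}\bigl(4\sqrt[4]{z}\bigr), \tag{5.13}
$$

where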

$$
F_\nu(z) = -\frac{2}{\pi} \sin\Bigl(\frac{\nu}{2}\pi\Bigr) K_\nu(z) - \sin\Bigl(\frac{\nu}{2}\pi\Bigr) Y_\nu(z) + \cos\Bigl(\frac{\nu}{2}\pi\Bigr) J_\nu(z), \tag{5.14}
$$

and $Y_\nu$ denotes the Bessel function of the second kind, we get ∞ ³ √ ´ X X 1 σ2ν (k) −ν 2 σ2ν (k) = z z zk 1 F2ν+1 4π k ν+ 2 k 1, (5.16) and µ ¯ ¯ z ¯¯¡ ν

¢ ¡ ν ¢ ¡ ν −λ ¢ ¡ ν ¢ λ 1 ,1 , 2 , 1 , − 2 , 1 , − 2 − 2 , 1 , − 2 ¡− 2 −¢ 2 ¡ ¢¶ ν ν 1 ¡ 2 ν− λ − 2 ,11 ¢ ¡ ν− 2 ,11 ,¢ − 2 + 2, 1 , −2 − λ − 2, 1 µ ¯ ¯ − 3,0 ¢ ¡ ν ¢ = H1,5 z ¯¯¡ ν ¢ ¡ ν λ , 1 , − − , 1 , − 2 − λ2 − 12 , 1 , 2 2 2 ¢ ¡ ν ¶ 1 − λ − , 1 − 2 2 ¢ ¡ ¢ ¡ν 1 ν 1 2 + 2, 1 , −2 − λ − 2, 1

4,0 H2,6


=

µ ¯ ¯ z ¯¯¡ ν

4,0 H2,6

λ

1

= z− 4 − 4 where

¢ ¡ν ¢ ¡ ν −λ ¢ ¡ ν ¢ 1 λ 1 2 , 1 , 2 + 2 , 1 , − 2 − 2 , 1¡ , − 2 −¢2 ¡− 2 , 1 , ¢¶ 1 ν 1 ν ¡ 2ν + 21 , 1 ¢, ¡− 2ν − λ − 12 , 1¢ 2 + 2 , 1 , − 2 −λ − 2 , 1 ¢ ¡ √ Gλ2ν+λ+1 4 4 z , (5.17)

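$$
G^{\lambda}_{\nu}(z) = -(-1)^{\lambda}\, \frac{2}{\pi} \sin\Bigl(\frac{\nu-\lambda}{2}\pi\Bigr) K_\nu(z) - \sin\Bigl(\frac{\nu-\lambda}{2}\pi\Bigr) Y_\nu(z) + \cos\Bigl(\frac{\nu-\lambda}{2}\pi\Bigr) J_\nu(z), \tag{5.18}
$$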

we get X ¡ ¢λ 2λ σ2ν (k) z − k Γ(λ + 1) k