Pillars of transcendental number theory 9789811541544, 9789811541551

English Pages 184 Year 2020

Table of contents :
Preface
Contents
About the Authors
Symbols
1 Preliminaries
1.1 Algebraic Independence of Functions
1.2 Gauss's Lemma
1.3 Properties of Algebraic Numbers
1.4 Linear Independence of Functions
References
2 Early Transcendence Results from Nineteenth Century
2.1 Functional Identity of Hermite
2.2 e Is Transcendental
2.3 π Is Transcendental
2.4 A Lemma from Galois Theory
2.5 Theorem of Hermite–Lindemann–Weierstrass
2.6 Applications of Theorem 2.5.1
References
3 Theorem of Gelfond and Schneider
3.1 Lemmas on Linear Equations
3.2 Proof of Gelfond–Schneider Theorem 3.0.1
References
4 Extensions Due to Ramachandra
4.1 Functions Satisfying Differential Equations
4.2 First Extension
4.3 Theorem 4.2.1 Implies Theorem 4.1.1
4.4 Another Consequence of Theorem 4.2.1
4.5 Second Extension
4.6 Some Consequences of Theorem 4.5.1
References
5 Diophantine Approximation and Transcendence
5.1 Approximation Theorem of Dirichlet
5.2 Theorems of Liouville and Thue
5.3 Theorem of Siegel
References
6 Roth's Theorem
6.1 Index of a Polynomial
6.2 Set of Polynomials
6.3 A Combinatorial Lemma
6.4 The Approximation Polynomial
6.5 Statement and Proof of Roth's Theorem
References
7 Baker's Theorems and Applications
7.1 Statement of Baker's Theorems
7.2 Applications of the Qualitative Result: Theorem 7.1.1
7.3 Applications of the Quantitative Result: Theorem 7.1.2
7.4 Effective Version of Thue's Theorem
7.4.1 Proof of Theorem 7.4.1
7.5 p-Adic Version of Baker's Result and an Application
References
8 Baker's Theorem
8.1 Ground Work for the Proof of Baker's Theorem
8.1.1 A Lower Bound for a Non-vanishing Linear Form
8.1.2 A Special Augmentative Polynomial
8.1.3 Construction of the Auxiliary Function
8.1.4 Basic Estimates Relating to Φ
8.1.5 Extrapolation Technique to Get More Zeros
8.1.6 Smallness of Derivatives
8.2 Proof of Baker's Theorem
References
9 Subspace Theorem
9.1 Statement of Subspace Theorem
9.2 Dirichlet's Multidimensional Approximation Results
9.3 Applications of Subspace Theorems to Diophantine Approximation
9.4 A Different Application
References
Appendix A Introductory Quotes
Index

Saradha Natarajan Ravindranathan Thangadurai

Pillars of Transcendental Number Theory

This picture shows the great corridor at Rameshwaram Temple, Tamilnadu. It is taken from the following URL: https://commons.wikimedia.org/wiki/File:Grand_corridor, rameshwaram_temple,tamilnadu_-_panoramio.jpg. Our thanks to Mr. Rajaraman Sundaram, who took the photograph and made it publicly available on the above website.

Saradha Natarajan DAE Centre for Excellence in Basic Sciences University of Mumbai Mumbai, Maharashtra, India

Ravindranathan Thangadurai Department of Mathematics Harish-Chandra Research Institute Prayagraj, Uttar Pradesh, India

ISBN 978-981-15-4154-4
ISBN 978-981-15-4155-1 (eBook)
https://doi.org/10.1007/978-981-15-4155-1

© Springer Nature Singapore Pte Ltd. 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

I seem to have been only like a boy playing on the seashore, and diverting myself in now and then finding a smoother pebble or a prettier shell than ordinary, whilst the great ocean of truth lay all undiscovered before me.
Isaac Newton

Dedicated To Our families

Preface

Since Hermite's proof in 1873 of the transcendence of the classical constant e, the theory of transcendence has evolved into a well-developed subject due to the contributions of many mathematicians such as Lindemann, Thue, Siegel, Gelfond, Schneider, Roth and others. Baker's work on linear forms in logarithms from 1965 to 1970 gave an impetus to the subject. The years that followed saw a resurgence of the subject. Several improvements were made in the lower bound estimates for linear forms in logarithms by Baker himself, Waldschmidt, Shorey and many others. These improved bounds have a large number of effective applications in Diophantine equations, class number problems, powers in recurrence sequences, etc. Numerous papers have been written and are still being written using the theory of linear forms. The theory is a gold mine for researchers. There are several mathematicians all around the world who have made and are still making important contributions to this wonderful subject. We have mentioned only very few names and fewer results to keep the book as simple as possible.

There are a handful of books written on the theory of linear forms and its applications, beginning with the book of Baker [1]. Some of the other books are by Waldschmidt [2], Shorey & Tijdeman [3], Baker & Wüstholz [4], Nesterenko [5] and Ram Murty & Rath [6]. With the ever-growing applications, it becomes important for any student who wants to work in this area to know the proofs of Baker's original results. For this purpose, the only available sources are Baker's book, his original papers or some online notes. One of our primary aims in writing this book is to present Baker's original results in a way suitable for students at postgraduate or first-year Ph.D. level. We intend to keep the exposition simple and easily accessible.

This book will begin with some classical results like the transcendence of e, π and the Hermite–Lindemann–Weierstrass theorem. Our proof of the Gelfond–Schneider theorem is based on Siegel's method. A new feature here is that we will present some results of Ramachandra which are not widely known. The Gelfond–Schneider theorem and many other interesting results will be derived from his results.

An important area, which received an impetus from an ingenious method initiated by Thue in 1909, is Diophantine approximation. This was developed by Siegel, Dyson and Gelfond, and finally Roth, in 1955, obtained the best possible result for the approximation of an algebraic number by rationals. This result is now known as the Thue–Siegel–Roth theorem. This theorem is an important pillar of the subject. The proof of Roth's theorem can be found in Schmidt [7]. There are a few other books which outline the proof of Roth's theorem. We will give proofs of the theorem of Thue, its improvement by Siegel and the theorem of Roth in this book. We will follow the original paper of Roth [8] (and [9]) for the proof of his theorem. When students see all three proofs in one place, they can understand how the ideas developed. Our aim is not only to give the proofs of the theorems, but also to contrast the ineffectiveness of these theorems with the effectiveness of Baker's result in solving Diophantine equations. Thue equations have attracted a lot of attention lately. Computational programs have been developed with which it is nowadays a routine matter to solve a Thue equation. For any student who wants to work in this and allied areas, this is a prerequisite.

Another important pillar of the subject is Schmidt's subspace theorem. This is a multidimensional analogue of Roth's theorem. Of late, many interesting applications of the subspace theorem are being discovered. Although we will not be able to give the proof of this theorem, we will illustrate it with some applications, including a recent one.

Each chapter will conclude with a few problems in the form of Exercises and some interesting information as Notes. We do not intend to be exhaustive in either the Exercises or the Notes. These are meant to arouse the reader's curiosity. Our aim is to bring important theorems of transcendence theory under one roof so that the subject can be taught as a well-knit graduate course. The style will be classical, simple and friendly to students. Postgraduate and Ph.D. students of pure mathematics can benefit from a course based on this book. It will be a good addition to any pure mathematician's library and can be used for teaching a course in transcendence. As the book will be self-contained, the need to refer to other books will be minimal. A basic course in algebraic number theory [10] and in real and complex analysis [11, 12] will be required of any reader of this book.

The first author would like to thank the Indian National Science Academy for awarding her the Senior Scientist Fellowship and the DAE Centre for Excellence in Basic Sciences, University of Mumbai, for providing facilities. The second author is thankful to Harish-Chandra Research Institute for the excellent environment which helped in writing the book. He is also thankful to Dr. Veekesh Kumar and Ms. Bidisha Roy for going through some chapters of the first draft of this book and for suggesting some of the exercises.

Saradha Natarajan, Mumbai, India
Ravindranathan Thangadurai, Prayagraj, India

References

1. A. Baker, Transcendental Number Theory (Cambridge Tracts, 1975) (Preface, Chapter 7 and Chapter 8).
2. M. Waldschmidt, Nombres Transcendants (Springer-Verlag, 1974) (Preface and Chapter 4).
3. T.N. Shorey, R. Tijdeman, Exponential Diophantine Equations (Cambridge Tracts, 1986 and re-printed in 2008) (Preface and Chapter 7).
4. A. Baker, G. Wüstholz, Logarithmic Forms and Diophantine Geometry (Cambridge Tracts, 2007) (Preface and Chapter 7).
5. Y.V. Nesterenko, Algebraic Independence, vol. 14 (Tata Institute of Fundamental Research Publications, 2008), 157pp (Preface, Chapter 2 and Chapter 3).
6. M. Ram Murty, P. Rath, Transcendental Numbers (Springer, 2014), 217pp (Preface, Chapter 2, Chapter 4 and Chapter 7).
7. W.M. Schmidt, Diophantine Approximation, vol. 785 (Springer-Verlag LNM, Berlin-Heidelberg, 1980) (Preface and Chapter 9).
8. K.F. Roth, Rational Approximations to Algebraic Numbers. Mathematika 2, 1–20 (1955); Corrigendum 2 (1955), p. 168 (Preface and Chapter 6).
9. W.J. LeVeque, Topics in Number Theory, Vol I and II (Dover Publication Inc, New York, 1984) (Preface and Chapter 6).
10. S. Lang, Algebraic Number Theory, vol. 110, 2nd edn. Graduate Texts in Mathematics (Springer-Verlag, New York, 1994) (Preface and Chapter 1).
11. T.M. Apostol, Mathematical Analysis: A Modern Approach to Advanced Calculus (Addison-Wesley Publishing Company, Inc., Reading, Mass., 1957) (Preface).
12. W. Rudin, Real and Complex Analysis, 3rd edn (McGraw-Hill Book Co., New York, 1987) (Preface).

About the Authors

Saradha Natarajan is an INSA Senior Scientist at the DAE Centre for Excellence in Basic Sciences at the University of Mumbai, India, and an elected fellow of the Indian National Science Academy (INSA). Earlier, she was Professor of Mathematics at the Tata Institute of Fundamental Research, Mumbai, India, until 2016. She earned her Ph.D. in 1983 under the guidance of Prof. T. S. Bhanumurthy from the Ramanujan Institute for Advanced Study in Mathematics, University of Madras, Chennai. She was a postdoctoral fellow at Concordia University, Canada; Macquarie University, Australia; and National Board of Higher Mathematics (NBHM), India. Her area of specialization is number theory in general, and transcendental number theory and Diophantine equations in particular. She has published several papers in international journals of repute and has collaborated with many mathematicians both in India and abroad. Several students have completed their Ph.D. under her supervision. She has travelled extensively and given invited talks and lectures at national and international seminars and conferences. Professor Natarajan has made substantial contributions to the conjectures of Erdős on perfect powers in arithmetic progressions, where combinatorial and computational methods, linear forms in logarithms and the modular method are combined. She has also made significant contributions to Thue equations and Diophantine approximations, especially towards conjectures of Bombieri, Mueller and Schmidt on the number of solutions of Thue inequalities for forms in terms of the number of non-zero coefficients of the form. In the area of transcendence, she has obtained best possible simultaneous approximation measures for values of the exponential function and the Weierstrass elliptic function. Further, significant lower bounds were shown for the Ramanujan tau-function for almost all primes p.

Ravindranathan Thangadurai is Professor at Harish-Chandra Research Institute, Prayagraj, India. He earned his Ph.D. in Combinatorial Number Theory in 1999 from the Mehta Research Institute for Mathematics and Theoretical Physics, Allahabad (now Harish-Chandra Research Institute, Prayagraj) under the supervision of Prof. S. D. Adhikari. He spent two years as a postdoc at the Institute of Mathematical Sciences, Chennai, India, and two years at Indian Statistical Institute, Kolkata, India. His areas of research include analytic, combinatorial and transcendental number theory, specifically, major contributions in the area of zero-sum problems in finite abelian groups, distribution of residues modulo p, Liouville numbers and Schanuel's conjecture in transcendental number theory. He has collaborated with reputed mathematicians and his research articles have been published in journals of repute. He has computed the exact values of Olson's constant and Alon–Dubiner constant for subsets for the group. He proved a conjecture of Schmid and Zhuang for a large class of finite abelian p-groups and the current best known upper bound for Davenport's constant for a general finite abelian group. He has also made a major contribution to the theory of distribution of a particular type of element (specifically, quadratic non-residues which are not primitive roots) among the residues modulo p. He has proved a strong form of Schanuel's conjecture in transcendental number theory for many n-tuples.

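Symbols

$\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, $\mathbb{A}$, $K$, $O_K$, $O_K$, $N(\alpha)$, $N_{K/\mathbb{Q}}(\alpha)$, $F[X]$, $F[X_1, \ldots, X_n]$, $K[[z]]$, $[x]$, $\{x\}$, $\|x\|$

[…]

1 Preliminaries

1.1 Algebraic Independence of Functions

[…] Let $\rho > 0$ be a real number. We say that an entire function $f$ is of order $\rho$ if there exists an absolute constant $C > 0$ such that

$$|f|_R = \max_{|z|=R} |f(z)| \le C^{R^{\rho}} \quad \text{for } R \to \infty.$$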

For example, any polynomial $P(z) \in \mathbb{C}[z]$ is of order 0 while $e^z$ has order 1. This notion is extended to meromorphic functions as follows. A meromorphic function is said to be of order $\rho$ if it is the quotient of two entire functions of order $\le \rho$. For example, the Weierstrass elliptic function $\wp(z)$ is known to be the quotient of entire functions of order 2. Hence $\wp(z)$ is of order 2. As a consequence of the well-known Jensen's formula, the number of zeros of an entire function of order $\le \rho$ inside a circle of radius $R$ is at most $O(R^{\rho})$ as $R \to \infty$. We will use these facts to show the algebraic independence of certain classical functions below.

It is well known that $e^z$ is a transcendental function. We show the following result.

Lemma 1.1.1 For any non-zero $a \in \mathbb{C}$, the functions $z$ and $e^{az}$ are algebraically independent over $\mathbb{C}$.

Proof Suppose the lemma is false. Then there exists a non-zero polynomial $P(x_1, x_2)$ such that $P(e^{az}, z) = 0$ for all $z \in \mathbb{C}$. We shall take the polynomial $P$ to be of least degree in $x_1$. Let the degree of $P$ in $x_1$ be $\nu$. Thus there exist polynomials $f_0(z), \ldots, f_\nu(z) \in \mathbb{C}[z]$, not all constants, such that $f_0(z) \not\equiv 0$ and

$$f_0(z)e^{\nu a z} + f_1(z)e^{(\nu-1)az} + \cdots + f_\nu(z) = 0.$$

Divide out by $f_0(z)$ to get an equation of the form

$$Q(z) = e^{\nu a z} + g_1(z)e^{(\nu-1)az} + \cdots + g_\nu(z) = 0$$

with $g_i(z) = f_i(z)/f_0(z)$ for $1 \le i \le \nu$. Note that $Q(z)$ is uniquely determined since $\nu$ is the least degree in $x_1$. We know that $e^{az}$ is invariant under $z \to z + 2n\pi i/a$, $n \in \mathbb{Z}$. Thus each $g_i(z)$, $1 \le i \le \nu$, must be invariant under these transformations. That is, $g_i(z) = g_i(z + 2n\pi i/a)$. That means each $g_i(z)$, unless it is constant, has infinitely many zeros or poles. Since $g_i$ is a rational function, we conclude that each $g_i(z)$ is a constant, which is a contradiction.

By a similar argument, we can also show the following lemma.

Lemma 1.1.2 The functions $e^z$ and $e^{az}$, $a$ irrational, are algebraically independent over $\mathbb{C}$.

Proof Suppose the lemma is false. Arguing as in Lemma 1.1.1, there exists a unique relation

$$Q(z) = e^{\nu a z} + g_1(e^z)e^{(\nu-1)az} + \cdots + g_\nu(e^z) = 0$$

with $g_i(e^z) = f_i(e^z)/f_0(e^z)$ for $1 \le i \le \nu$. Again, each $g_i(e^z)$ must be invariant under the transformation $z \to z + 2n\pi i/a$, $n \in \mathbb{Z}$. Since $a$ is irrational, each $g_i(e^z)$ has two independent periods $w_1 := 2\pi i$ and $w_2 := 2\pi i/a$, and so $\ell_1 w_1 + \ell_2 w_2$ is a period for any integers $\ell_1$ and $\ell_2$. Hence the number of zeros or poles of any $g_i(e^z)$ inside a circle of radius $R$ is bounded below by a constant multiple of $R^2$. This is a contradiction as the order of $g_i(e^z)$ is 1.

Another important function which has been widely studied is the Weierstrass elliptic function $\wp(z)$. This is a doubly periodic meromorphic function which is a quotient of entire functions of order 2. In fact, these entire functions are known as $\sigma$ functions. For various properties of this function which will be used in this book, we refer to [1].

Lemma 1.1.3 Let $\wp(z)$ and $\wp^*(z)$ be two elliptic functions with periods $(\omega_1, \omega_2)$ and $(\omega_1^*, \omega_2^*)$. Then $\wp$ and $\wp^*$ are algebraically dependent if and only if their periods are commensurable, i.e. there exists a $2 \times 2$ rational matrix $M$ such that

$$\begin{pmatrix} \omega_1 \\ \omega_2 \end{pmatrix} = M \begin{pmatrix} \omega_1^* \\ \omega_2^* \end{pmatrix}. \tag{1.1}$$

Proof Suppose there exists $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ satisfying (1.1) with $a, b, c, d \in \mathbb{Q}$. Then

$$\omega_1 = a\omega_1^* + b\omega_2^*; \quad \omega_2 = c\omega_1^* + d\omega_2^*.$$

Hence there exists an integer $m$ such that

$$m\omega_1 = a'\omega_1^* + b'\omega_2^*; \quad m\omega_2 = c'\omega_1^* + d'\omega_2^*$$

with $a', b', c', d' \in \mathbb{Z}$. (One may take $m$ to be the least common multiple of the denominators of $a, b, c, d$.) Thus $\wp^*(mz)$ has $\omega_1$ and $\omega_2$ as periods. Hence $\wp^*(mz)$ is a rational function of $\wp(z)$. A priori, $\wp^*(mz)$ is a rational function of $\wp^*(z)$, which therefore implies that $\wp(z)$ and $\wp^*(z)$ are algebraically dependent.

Now we prove the converse. Suppose $\wp(z)$ and $\wp^*(z)$ are algebraically dependent. Arguing as in Lemma 1.1.1, there exists a unique relation

$$Q(z) = \wp(z)^{\nu} + g_1(\wp^*(z))\wp(z)^{\nu-1} + \cdots + g_\nu(\wp^*(z)) = 0$$

with $g_i(\wp^*(z)) = f_i(\wp^*(z))/f_0(\wp^*(z))$ for $1 \le i \le \nu$. Then each $g_i(\wp^*(z))$ is invariant under $z \to z + m\omega_1$, $m \in \{0, 1, 2, \ldots\}$, and under $z \to z + n\omega_2$, $n \in \{0, 1, 2, \ldots\}$. This means that $\wp^*(z + m\omega_1)$ and $\wp^*(z + n\omega_2)$ are not all distinct. Hence there exist non-zero integers $m_0$ and $n_0$ such that $m_0\omega_1$ and $n_0\omega_2$ are periods of $\wp^*(z)$. Thus for some integers $m_1, m_2, n_1, n_2$ we have

$$m_0\omega_1 = m_1\omega_1^* + m_2\omega_2^* \quad \text{and} \quad n_0\omega_2 = n_1\omega_1^* + n_2\omega_2^*,$$

which proves (1.1).
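The step of clearing denominators in the proof of Lemma 1.1.3 can be made concrete. The following is a minimal Python sketch, not part of the original text: the function name `clear_denominators` and the sample matrix are illustrative assumptions. It takes a rational matrix $M$, computes $m$ as the least common multiple of the denominators of its entries, and returns the integer matrix $mM$.

```python
from fractions import Fraction
from math import gcd


def clear_denominators(matrix):
    """Return (m, m*matrix), where m is the lcm of the denominators of the entries.

    `matrix` is a list of rows of Fraction objects (a hypothetical input format).
    """
    m = 1
    for row in matrix:
        for entry in row:
            # lcm(m, denominator), computed through the gcd
            m = m * entry.denominator // gcd(m, entry.denominator)
    # every m*entry is now an integer, so int() is exact here
    integer_matrix = [[int(m * entry) for entry in row] for row in matrix]
    return m, integer_matrix


# Sample rational matrix M = (a b; c d), chosen only for illustration.
M = [[Fraction(1, 2), Fraction(2, 3)],
     [Fraction(-3, 4), Fraction(5, 6)]]
m, M_int = clear_denominators(M)
print(m, M_int)
```

With this choice, $m\omega_1$ and $m\omega_2$ become integer combinations of $\omega_1^*$ and $\omega_2^*$, which is exactly what the proof uses.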

1.2 Gauss's Lemma

Let $P(z) \in \mathbb{Z}[z]$ be a polynomial. Let $C(P)$ denote the greatest common divisor of the coefficients of $P$; it is called the content of $P$. We say $P(z)$ is a primitive polynomial if $C(P) = 1$. We show

Lemma 1.2.1 The product of two primitive polynomials is a primitive polynomial. Thus for any two polynomials $P, Q \in \mathbb{Z}[z]$ we have $C(PQ) = C(P)C(Q)$.

Proof Let $P(z)$ and $Q(z)$ be two primitive polynomials with coefficients in $\mathbb{Z}$. Suppose their product $PQ$ is not primitive. Then there exists a prime $p$ such that $P(z)Q(z)$ is identically zero in $(\mathbb{Z}/p\mathbb{Z})[z]$, which we write as $P(z)Q(z) \equiv 0 \pmod{p}$. Let $P_1(z)$ and $Q_1(z)$ be the polynomials $P(z)$ and $Q(z)$ reduced $\pmod{p}$, i.e. $P_1(z) \equiv P(z) \pmod{p}$ and $Q_1(z) \equiv Q(z) \pmod{p}$. Hence

$$P_1(z)Q_1(z) \equiv 0 \pmod{p}. \tag{1.2}$$

Since $P(z)$ and $Q(z)$ are both primitive, $P_1(z)$ and $Q_1(z)$ are not identically 0. Let $p_1$ and $q_1$ be the leading coefficients of $P_1$ and $Q_1$, respectively. Then $p_1 \not\equiv 0 \pmod{p}$, $q_1 \not\equiv 0 \pmod{p}$ and hence, since $p$ is prime, $p_1q_1 \not\equiv 0 \pmod{p}$. This contradicts (1.2). Thus $PQ$ is primitive.

Any polynomial $P(z) \in \mathbb{Z}[z]$ can be written as $P(z) = C(P)P'(z)$ with $P'$ primitive. If $R = PQ$, then $C(R)R' = C(P)C(Q)P'Q'$. Since $P'Q'$ is primitive, it follows that $C(R) = C(P)C(Q)$.

It is possible to generalise the above result to any number of variables as follows.

Lemma 1.2.2 Let $P, Q \in \mathbb{Z}[z_1, \ldots, z_m]$. Then $C(PQ) = C(P)C(Q)$.

For a proof, we refer to Cassels [2]. It follows from the above lemmas that if $P$ can be factored over $\mathbb{Q}$, it can also be factored over $\mathbb{Z}$.

Lemma 1.2.3 Let $1 \le r < m$. Suppose that $F(z_1, \ldots, z_m) \in \mathbb{Z}[z_1, \ldots, z_m]$, $G(z_1, \ldots, z_r) \in \mathbb{Q}[z_1, \ldots, z_r]$ and $H(z_{r+1}, \ldots, z_m) \in \mathbb{Q}[z_{r+1}, \ldots, z_m]$ are such that

$$F(z_1, \ldots, z_m) = G(z_1, \ldots, z_r)H(z_{r+1}, \ldots, z_m).$$

Let $\gamma$ be a coefficient in $F$. Then there is a factorisation $\gamma = \alpha\beta$ in $\mathbb{Q}$ such that $\alpha G \in \mathbb{Z}[z_1, \ldots, z_r]$ and $\beta H \in \mathbb{Z}[z_{r+1}, \ldots, z_m]$.

Proof Let the coefficients of $G$ be $\alpha_1, \ldots, \alpha_s$ and those of $H$ be $\beta_1, \ldots, \beta_t$ in some order. Since the variables in $G$ and $H$ are disjoint, the coefficients of $F$ are the products $\alpha_i\beta_j$ and they are all in $\mathbb{Z}$. In particular, $\alpha_1\beta_j \in \mathbb{Z}$ for $1 \le j \le t$ and $\beta_1\alpha_j \in \mathbb{Z}$ for $1 \le j \le s$. But these are the coefficients of $\alpha_1 H$ and $\beta_1 G$, respectively, and $\alpha_1\beta_1$ is some coefficient of $F$. By a similar argument, every coefficient of $F$ can be written in the required form.
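The multiplicativity of the content in Lemma 1.2.1 is easy to check on examples. The following Python sketch is only an illustration and not part of the original text: polynomials are represented (by assumption) as lists of integer coefficients in increasing degree, and the names `content` and `multiply` are our own.

```python
from functools import reduce
from math import gcd


def content(poly):
    """Greatest common divisor C(P) of the integer coefficients of P."""
    return reduce(gcd, (abs(c) for c in poly), 0)


def multiply(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


P = [6, 10, 4]   # 6 + 10z + 4z^2, content 2
Q = [9, 3]       # 9 + 3z, content 3
R = multiply(P, Q)

print(R, content(R), content(P) * content(Q))
```

Here both contents agree, as the lemma predicts; the same check can be repeated with any other pair of integer polynomials.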

1.3 Properties of Algebraic Numbers

A complex number $\alpha \in \mathbb{C}$ is said to be an algebraic number if there exists a non-zero polynomial $P(x) \in \mathbb{Z}[x]$ such that $P(\alpha) = 0$. If $P(x)$ is monic, then $\alpha$ is said to be an algebraic integer. A complex number $\alpha \in \mathbb{C}$ is said to be a transcendental number if $\alpha$ is not an algebraic number. The set of all algebraic numbers is denoted by $\mathbb{A}$.

For a given $\alpha \in \mathbb{A}$, there exists a polynomial of least degree, say $\nu$, with integer coefficients satisfied by $\alpha$. This is known as the minimal polynomial of $\alpha$. The other roots of the minimal polynomial are called the conjugates of $\alpha$. We shall denote the conjugates of $\alpha$ by $\alpha^{(1)} = \alpha, \alpha^{(2)}, \ldots, \alpha^{(\nu)}$. We denote by $N(\alpha)$ the norm of $\alpha$, which is defined by

$$N(\alpha) = \alpha^{(1)} \cdots \alpha^{(\nu)}.$$

Observe that for any algebraic number $\alpha \in \mathbb{A}$, there exists a positive rational integer $d$ such that $d\alpha$ is an algebraic integer. For instance, one may take $d$ to be the leading coefficient of the minimal polynomial satisfied by $\alpha$ over $\mathbb{Q}$. The least such $d$, denoted by $d(\alpha)$, is called the denominator of $\alpha$.

For any polynomial $P(X) \in \mathbb{Q}[X]$, by the height of $P$, denoted by $H(P)$, we mean the maximum of the absolute values of the coefficients of $P$. By the height of an algebraic number $\alpha$, denoted by $h(\alpha)$, we mean the height of the minimal polynomial of $\alpha$. By $\overline{|\alpha|}$, called the house of $\alpha$, we mean the maximum of the absolute values of $\alpha$ and its conjugates. By the size of $\alpha$, denoted by $s(\alpha)$, we mean $d(\alpha) + \overline{|\alpha|}$. We will also use the absolute logarithmic height of an algebraic number $\alpha$, which we denote by $h^{\circ}(\alpha)$ and define as

$$h^{\circ}(\alpha) = \frac{1}{\nu}\log a_0 + \frac{1}{\nu}\sum_{i=1}^{\nu} \log\max\big(1, |\alpha^{(i)}|\big)$$

where $a_0 > 0$ is the leading coefficient of the minimal polynomial of $\alpha$.

We know that when $\alpha$ is an algebraic integer, any non-negative integral power of $\alpha$ is a linear combination of $1, \alpha, \ldots, \alpha^{\nu-1}$ with coefficients in $\mathbb{Z}$. The following lemma gives a bound for the coefficients of this linear combination.

Lemma 1.3.1 Let $\alpha$ be an algebraic integer of degree $\nu \ge 2$ and of height $h(\alpha)$. Let

$$\alpha^s = \sum_{j=0}^{\nu-1} b_{j,s}\,\alpha^j, \quad s \in \mathbb{N} \cup \{0\},\ b_{j,s} \in \mathbb{Z}. \tag{1.3}$$

Then

$$\max_{0 \le j < \nu} |b_{j,s}| \le (2h(\alpha))^s.$$
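The bound of Lemma 1.3.1 can be tested numerically. The sketch below is an illustration only, not the proof from the text: it assumes the minimal polynomial is given as the list of its lower coefficients $[c_0, \ldots, c_{\nu-1}]$ of the monic polynomial $x^{\nu} + c_{\nu-1}x^{\nu-1} + \cdots + c_0$, and the names `power_coefficients` and `height` are hypothetical. Each multiplication by $\alpha$ shifts the coefficients and replaces $\alpha^{\nu}$ by $-(c_0 + c_1\alpha + \cdots + c_{\nu-1}\alpha^{\nu-1})$.

```python
def power_coefficients(lower_coeffs, s):
    """Coefficients [b_0, ..., b_{nu-1}] of alpha^s in the basis 1, alpha, ..., alpha^(nu-1).

    alpha is a root of x^nu + c_{nu-1} x^(nu-1) + ... + c_0,
    with lower_coeffs = [c_0, ..., c_{nu-1}] (assumed input format).
    """
    nu = len(lower_coeffs)
    b = [1] + [0] * (nu - 1)            # alpha^0 = 1
    for _ in range(s):
        top = b[-1]                     # coefficient of alpha^(nu-1)
        shifted = [0] + b[:-1]          # multiply by alpha
        # reduce alpha^nu using the minimal polynomial
        b = [shifted[j] - top * lower_coeffs[j] for j in range(nu)]
    return b


def height(lower_coeffs):
    """Height of the monic polynomial: maximum absolute value of its coefficients."""
    return max([1] + [abs(c) for c in lower_coeffs])


golden = [-1, -1]         # x^2 - x - 1
cube_root_2 = [-2, 0, 0]  # x^3 - 2

for poly in (golden, cube_root_2):
    h = height(poly)
    for s in range(12):
        b = power_coefficients(poly, s)
        assert max(abs(c) for c in b) <= (2 * h) ** s

print(power_coefficients(golden, 5))  # alpha^5 = 3 + 5*alpha
```

For the root of $x^2 - x - 1$ the coefficients are Fibonacci numbers, which grow far more slowly than $2^s$, so the bound is comfortable in this case.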

0≤ j 0. Further let α1 , . . . , αd be the roots of φ(X ) = 0. Then any symmetric rational integral polynomial in aα1 , . . . , aαd is a rational integer. Proof Let φ(X ) = a X d + a1 X d−1 + · · · + ad . Then any symmetric rational integral polynomial in aα1 , . . . , aαd is a rational integral polynomial in

aαi ,

1≤i≤d

aαi aα j , . . . ,

1≤i< j≤d

d

aαi

i=1

by Lemma 2.3.1 and hence a rational integral polynomial in a1 , . . . , ad showing that it is a rational integer. Now we prove the transcendence of π. Suppose π is algebraic. Then πi is also algebraic. Let α1 = πi be of degree n satisfying the polynomial a X n + a1 X n−1 + · · · + an = 0. Since 1 + eα1 = 0, we have (1 + eα1 )(1 + eα2 ) · · · (1 + eαn ) = 0 where α2 , . . . , αn are conjugates of α1 . The product on the left-hand side can be expanded as ⎛ ⎞ n exp ⎝ jαj⎠ j=1

where the first sum is over all tuples $(\epsilon_1, \dots, \epsilon_n)$ with $\epsilon_j \in \{0, 1\}$ for $1 \le j \le n$. Thus each summand is of the form $e^{\theta}$, and there are $2^{n}$ such summands. Assume that exactly $d$ of the $\theta$'s are non-zero, say $\theta_1, \dots, \theta_d$. Then the above sum becomes

\[ 2^{n} - d + e^{\theta_1} + \cdots + e^{\theta_d} = 0. \]

Hence with $I(\phi, x)$ as defined in Lemma 2.1.1, we get

\[ S := (2^{n} - d)\,I(\phi, 0) + I(\phi, \theta_1) + \cdots + I(\phi, \theta_d) = -(2^{n} - d)\sum_{j \ge 0} \phi^{(j)}(0) - \sum_{j \ge 0}\sum_{k=1}^{d} \phi^{(j)}(\theta_k). \]


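Choice of φ(X)

Now we specialise $\phi(X)$ as

\[ \phi(X) = a^{dp} X^{p-1} \bigl((X - \theta_1) \cdots (X - \theta_d)\bigr)^{p} \]

where $p$ is a prime with $p > \max(a,\ 2^{n} - d,\ a^{d}|\theta_1 \cdots \theta_d|)$. Note that

\[ (aX)^{2^{n}-d}\, a^{d} (X - \theta_1) \cdots (X - \theta_d) = \prod_{\epsilon_1 = 0}^{1} \cdots \prod_{\epsilon_n = 0}^{1} \Bigl(aX - \sum_{i=1}^{n} \epsilon_i\, a\alpha_i\Bigr). \]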

The polynomial on the right-hand side of the above equality has coefficients which are symmetric polynomials in $a\alpha_1, \dots, a\alpha_n$ with rational integer coefficients, and hence are rational integers by Lemma 2.3.2. So by the above identity and the definition of $\phi(X)$, it follows that $\phi(X) \in \mathbb{Z}[X]$.

Lower Bound for |S|

As seen earlier, we have

\[ \phi^{(p-1)}(0) = (p-1)!\,(-a)^{dp} (\theta_1 \cdots \theta_d)^{p} \in \mathbb{Z}, \]

and this integer is divisible by $(p-1)!$, but $p!$ does not divide it by the choice of $p$. Further,

\[ \phi^{(j)}(0) \in \mathbb{Z} \ \text{and} \ \phi^{(j)}(0) \equiv 0 \pmod{p!} \quad \text{for } j \ne p - 1, \]

and

\[ \sum_{k=1}^{d} \phi^{(j)}(\theta_k) \in \mathbb{Z} \ \text{and} \ \sum_{k=1}^{d} \phi^{(j)}(\theta_k) \equiv 0 \pmod{p!} \quad \text{for } j \ge 0. \]

Thus $S$ is an integer divisible by $(p-1)!$, but $p!$ does not divide $S$. Hence $S$ is non-zero and $|S| \ge (p-1)!$.

Upper Bound for |S|

Note that

\[
\begin{aligned}
|S| &\le |I(\phi, \theta_1)| + \cdots + |I(\phi, \theta_d)| = \sum_{k=1}^{d} \Bigl| \int_{0}^{\theta_k} e^{\theta_k - t} \phi(t)\,dt \Bigr| \\
&\le \sum_{k=1}^{d} e^{|\theta_k|} \Bigl| \int_{0}^{\theta_k} |\phi(t)|\,dt \Bigr| \\
&\le \sum_{k=1}^{d} e^{|\theta_k|}\,|\theta_k|\,\hat{\phi}(|\theta_k|) \\
&\le c_{2.3}^{\,p},
\end{aligned}
\]

where $\hat{\phi}(X) = a^{dp} |X|^{p-1} (|X| + |\theta_1|)^{p} \cdots (|X| + |\theta_d|)^{p}$ and $c_{2.3}$ is a positive number depending on $a$, $d$ and the $\theta$'s and independent of $p$. For instance, $c_{2.3}$ can be taken as $d(2a)^{d} |\theta_0|^{d+2} e^{|\theta_0|}$ where $|\theta_0| = \max(|\theta_1|, \dots, |\theta_d|)$.

Final Contradiction

Comparing the upper and lower bounds for $|S|$ we get

\[ (p-1)! \le |S| \le c_{2.3}^{\,p}, \]

which is not possible if $p$ is sufficiently large. For instance, p can be taken > c2.3 .
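The final step compares a factorial with an exponential. The following is a minimal sketch, assuming a hypothetical integer value `c` standing in for the constant c2.3 (the helper names are illustrative and not from the text); it searches for the first prime p at which (p − 1)! actually exceeds c^p, using exact integer arithmetic.

```python
from math import factorial


def is_prime(n: int) -> bool:
    """Trial-division primality test; adequate for the small values used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def first_prime_beating(c: int) -> int:
    """Smallest prime p with (p - 1)! > c**p.

    `c` is a hypothetical stand-in for the constant c_{2.3}.
    """
    p = 2
    while True:
        if is_prime(p) and factorial(p - 1) > c ** p:
            return p
        p += 1


if __name__ == "__main__":
    for c in (2, 10):
        print(c, first_prime_beating(c))
```

In this sketch the threshold depends on the size of `c`, so in practice one simply takes p large enough for the strict inequality to hold.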

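2.4 A Lemma from Galois Theory

Lemma 2.4.1 Let $K$ be a normal extension of $\mathbb{Q}$ with $[K : \mathbb{Q}] = \nu$ and let $\sigma_1, \dots, \sigma_\nu$ be the automorphisms of $K$. Let $A(z) \in K[[z]]$ be a power series. Then

\[ B(z) = \prod_{i=1}^{\nu} \sigma_i(A(z)) \in \mathbb{Q}[[z]]. \]

Proof Observe that for any $A(z) = \sum_{k \ge 0} \gamma_k z^{k} \in K[[z]]$ and any automorphism $\sigma_i$ of $K$, we have

\[ \sigma_i(A(z)) = \sum_{k \ge 0} \sigma_i(\gamma_k) z^{k}. \]

Hence for any automorphism $\sigma$, we have

\[ \sigma(B(z)) = \sigma\Bigl(\prod_{i=1}^{\nu} \sigma_i(A(z))\Bigr) = \prod_{i=1}^{\nu} \sigma\sigma_i(A(z)) = \prod_{j=1}^{\nu} \sigma_j(A(z)) = B(z). \]

Hence $B(z) \in \mathbb{Q}[[z]]$.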
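The mechanism behind Lemma 2.4.1 is that a Galois-symmetric combination of the conjugate series has coefficients fixed by every automorphism. The sketch below illustrates this for the hypothetical example K = Q(√2), whose two automorphisms are the identity and √2 ↦ −√2; the pair representation and all function names are assumptions made for illustration. It checks that both the sum and the product of the two conjugate (truncated) series have rational coefficients.

```python
from fractions import Fraction

# An element a + b*sqrt(2) of K = Q(sqrt 2) is stored as the pair (a, b).


def add(x, y):
    return (x[0] + y[0], x[1] + y[1])


def mul(x, y):
    # (a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def conj(x):
    """The non-trivial automorphism sqrt(2) -> -sqrt(2)."""
    return (x[0], -x[1])


def series_mul(A, B, n):
    """First n coefficients of the product of two truncated power series."""
    out = [(Fraction(0), Fraction(0))] * n
    for i, a in enumerate(A[:n]):
        for j, b in enumerate(B[: n - i]):
            out[i + j] = add(out[i + j], mul(a, b))
    return out


def conjugate_sum_and_product(A):
    """Sum and product of A(z) and its conjugate series, truncated at len(A)."""
    A_bar = [conj(c) for c in A]
    S = [add(c, d) for c, d in zip(A, A_bar)]
    P = series_mul(A, A_bar, len(A))
    return S, P


# A(z) = (1 + sqrt2) + 3*sqrt2*z + (2/5 - sqrt2)*z^2, truncated
A = [(Fraction(1), Fraction(1)), (Fraction(0), Fraction(3)), (Fraction(2, 5), Fraction(-1))]
S, P = conjugate_sum_and_product(A)
```

Every coefficient of `S` and `P` has vanishing √2-part, which is the finite-precision-free analogue of the statement B(z) ∈ Q[[z]].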


2.5 Theorem of Hermite–Lindemann–Weierstrass

Theorem 2.5.1 Let $\alpha_0, \dots, \alpha_m$ be distinct algebraic numbers and $a_0, \dots, a_m$ be non-zero algebraic numbers. Then

\[ \sum_{i=0}^{m} a_i e^{\alpha_i} \ne 0. \]

In other words, if $\alpha_0, \dots, \alpha_m$ are distinct algebraic numbers, then $e^{\alpha_0}, \dots, e^{\alpha_m}$ are linearly independent over $A$. The proof is similar to the proof of the transcendence of $e$, but more sophisticated, as we will now deal with a normal extension of $\mathbb{Q}$.

Proof We assume that

\[ \sum_{i=0}^{m} a_i e^{\alpha_i} = 0. \tag{2.6} \]

Application of Lemma 2.4.1

Let us consider the function

\[ A(z) = \sum_{i=0}^{m} a_i e^{\alpha_i z}, \qquad z \in \mathbb{C}. \]

Since the $\alpha_i$ are distinct, $A(z) \not\equiv 0$. For, suppose $\sum_{i=0}^{m} a_i e^{\alpha_i z} = 0$ for all $z$. In particular, $\sum_{i=0}^{m} a_i e^{\alpha_i j} = 0$ for all $j = 1, 2, \dots, m + 1$. Since the matrix $(e^{j\alpha_i})$ has non-zero determinant as the $\alpha_i$ are distinct (the numbers $e^{\alpha_i}$ are then distinct as well, because a non-zero algebraic number $\alpha_i - \alpha_k$ cannot be an integer multiple of $2\pi i$, $\pi$ being transcendental), we get $a_i = 0$ for all $i$, which is a contradiction.

Let $K$ be the normal extension of $\mathbb{Q}$ containing the $a_i$ and the $\alpha_i$, and let $[K : \mathbb{Q}] = \nu$. Since $A(1) = 0$ by (2.6), we see that $\sigma(A(1)) = 0$ for all automorphisms of $K$, and hence we may replace $A(z)$ by $B(z)$ in Lemma 2.4.1 to assume that $A(z) \in \mathbb{Q}[[z]]$.

Application of Lemma 2.1.2

Since $A(1) = 0$, we may multiply $A(1)$ by a common denominator of $a_0, \dots, a_m$ to assume that $a_i \in \mathcal{O}_K$ for $0 \le i \le m$. Let $d \in \mathbb{Z}$ be a common denominator of the $\alpha_i$, $0 \le i \le m$. Further let $n$ be a large integer to be chosen later and let

\[ f(z) = (z - \alpha_0)^{n} (z - \alpha_1)^{n+1} \cdots (z - \alpha_m)^{n+1}; \]

\[ h(z) = (z - d\alpha_0)^{n} (z - d\alpha_1)^{n+1} \cdots (z - d\alpha_m)^{n+1} \]

and

\[ g(z) = \frac{1}{n!} \sum_{\ell \ge n} f^{(\ell)}(z). \]

Then $f(z) \in K[z]$, $h(z) \in \mathcal{O}_K[z]$, and since

\[ d^{m(n+1)} f(z) = d^{-n} h(dz) \]

we get

\[ d^{m(n+1)} \frac{f^{(\ell)}(\alpha_j)}{\ell!} = d^{\ell - n} \frac{h^{(\ell)}(d\alpha_j)}{\ell!} \in \mathcal{O}_K \quad \text{for } \ell \ge n. \]

Let

\[ I = d^{m(n+1)} \sum_{j=0}^{m} a_j\, g(\alpha_j). \]

An Upper Bound for |I|

We have

\[
\begin{aligned}
I &= \sum_{j=0}^{m} a_j \sum_{\ell \ge n} d^{m(n+1)} \frac{1}{n!} f^{(\ell)}(\alpha_j) \\
&= d^{m(n+1)} a_0 \frac{f^{(n)}(\alpha_0)}{n!} + (n + 1) \sum_{j=0}^{m} a_j \sum_{\ell \ge n+1} \frac{\ell!}{(n+1)!}\, d^{m(n+1)} \frac{f^{(\ell)}(\alpha_j)}{\ell!}.
\end{aligned}
\]

Note that

\[ \frac{f^{(n)}(\alpha_0)}{n!} = \prod_{j=1}^{m} (\alpha_0 - \alpha_j)^{n+1}. \]

Thus we get from the above expression for $I$ that

\[ I = a_0 \prod_{j=1}^{m} (d\alpha_0 - d\alpha_j)^{n+1} + (n + 1) J \tag{2.7} \]

where $J \in \mathcal{O}_K$. Suppose $I = 0$. Then from (2.7), we see that $(n + 1)$ divides

\[ N(a_0) \prod_{j=1}^{m} N(d\alpha_0 - d\alpha_j)^{n+1}. \]

By taking

\[ n = k\, N(d\alpha_0 - d\alpha_j), \tag{2.8} \]

for some $1 \le j \le m$ with $k \ge |N(a_0)|$, we see that the above property does not hold. Hence $I \ne 0$. By Lemma 2.1.2, with $\phi = f$ we conclude that $0 < |I| <$ …
… $> w$. The following lemma gives a bound for a non-trivial solution.

Lemma 3.1.1 Let

\[ y_j := a_{j1} x_1 + \cdots + a_{jv} x_v = 0 \quad \text{for } 1 \le j \le w \]

be $w$ linear equations in $v$ variables $x_1, \dots, x_v$ with $a_{ji} \in \mathbb{Z}$, $1 \le j \le w$, $1 \le i \le v$, and let $v > w$. Assume that $|a_{ji}| \le A$. Then there exist $x_1, \dots, x_v \in \mathbb{Z}$, not all zero, satisfying the equations such that

\[ |x_i| \le (vA)^{\frac{w}{v-w}}. \]

Proof Let $s_j = \sum_{a_{ji} \ge 0} a_{ji}$ and $t_j = -\sum_{a_{ji} < 0} a_{ji}$ for any $j$ with $1 \le j \le w$. Then $s_j +$ … $> (1 + XvA)^{w}$, there exist two distinct tuples $(x_1', \dots, x_v')$ and $(x_1'', \dots, x_v'')$ such that

\[ \psi(x_1', \dots, x_v') = \psi(x_1'', \dots, x_v'') = (y_1^{(0)}, \dots, y_w^{(0)}). \]

Hence $(x_1, \dots, x_v) = ((x_1' - x_1''), \dots, (x_v' - x_v''))$ satisfies the linear system with $x_i \in \mathbb{Z}$, not all zero, and $|x_i| \le X$. We choose

\[ X = \bigl[(vA)^{\frac{w}{v-w}}\bigr] \]

where $[x]$ denotes the integral part of $x$. Then (3.2) is satisfied and the proof is complete.

Lemma 3.1.2 Suppose $\{\gamma_1, \dots, \gamma_\nu\}$ is a $\mathbb{Z}$-basis for $\mathcal{O}_K$. Let $\alpha \in \mathcal{O}_K$ be written as

\[ \alpha = a_1\gamma_1 + \cdots + a_\nu\gamma_\nu, \qquad a_i \in \mathbb{Z}. \tag{3.3} \]

Then $|a_i| \le c_{3.1}\,\overline{|\alpha|}$ where $c_{3.1} > 0$ depends only on $K$.

Proof Taking conjugates on both sides of (3.3), we have

\[ \alpha^{(i)} = a_1\gamma_1^{(i)} + \cdots + a_\nu\gamma_\nu^{(i)}, \qquad 1 \le i \le \nu. \]

Then by Cramer's rule,

\[ a_k = \frac{\det A_k}{\det(\gamma_j^{(i)})} \tag{3.4} \]

where $A_k$ is the matrix $(\gamma_j^{(i)})$ with its $k$th column replaced by $(\alpha^{(1)}, \dots, \alpha^{(\nu)})^{T}$. It is well known that $\det(\gamma_j^{(i)}) \ne 0$. Further, $\det A_k$ can be written as a linear expression in $\alpha^{(1)}, \dots, \alpha^{(\nu)}$ with coefficients defined in terms of the $\gamma_j^{(i)}$. Taking absolute values on both sides of (3.4) we get the result.
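Lemma 3.1.1 guarantees a small non-zero integer solution but its proof (a pigeonhole argument) is not constructive. The sketch below is a brute-force illustration only, with a hypothetical function name and toy coefficient matrices chosen for this example: it searches the box |x_i| ≤ X, where X is the integral part of (vA)^{w/(v−w)}, and returns the first non-zero solution it meets.

```python
from itertools import product


def small_solution(rows, A):
    """Search for a non-zero integer solution with |x_i| <= [(vA)^(w/(v-w))].

    rows: list of w integer coefficient rows, each of length v > w, |a_ji| <= A.
    Returns (solution_or_None, X). Exhaustive search, so only for tiny systems.
    """
    w, v = len(rows), len(rows[0])
    assert v > w and all(abs(a) <= A for row in rows for a in row)
    # Largest integer X with X^(v-w) <= (vA)^w, i.e. X = [(vA)^(w/(v-w))],
    # computed with exact integer arithmetic.
    X = 0
    while (X + 1) ** (v - w) <= (v * A) ** w:
        X += 1
    for x in product(range(-X, X + 1), repeat=v):
        if any(x) and all(sum(a * xi for a, xi in zip(row, x)) == 0 for row in rows):
            return x, X
    return None, X


if __name__ == "__main__":
    print(small_solution([[1, 2, -2]], 2))
    print(small_solution([[1, -1, 1, 0], [0, 1, 1, -1]], 1))
```

The search cost grows like (2X + 1)^v, so this is meant only to make the size of the guaranteed solution concrete, not as a practical method.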

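We now give an analogue of Lemma 3.1.1 when the coefficients of the linear equations are in $\mathcal{O}_K$.

Lemma 3.1.3 Let $K$ be a number field of degree $\nu > 1$. Let

\[ y_j := a_{j1} x_1 + \cdots + a_{jv} x_v = 0 \quad \text{for } 1 \le j \le w \tag{3.5} \]

be $w$ linear equations in $v$ variables $x_1, \dots, x_v$ with $a_{ji} \in \mathcal{O}_K$, $1 \le i \le v$, $1 \le j \le w$, and $v > w$. Assume that $\overline{|a_{ji}|} \le A$. Then the following assertions hold.

(i) There exist $x_1, \dots, x_v \in \mathcal{O}_K$, not all zero, satisfying the equations, and positive numbers $c_{3.2}$ and $c_{3.3}$ depending only on $K$, such that

\[ \overline{|x_i|} \le c_{3.2}\,(c_{3.3}\, vA)^{\frac{w}{v-w}}. \]

(ii) There exist $x_1, \dots, x_v \in \mathbb{Z}$, not all zero, satisfying the equations such that

\[ |x_i| \le 1 + (2vA)^{\frac{w\nu(\nu+1)}{2v - w\nu(\nu+1)}}, \qquad 1 \le i \le v, \]

provided $2v > w\nu(\nu + 1)$ and $A \ge 1$.

Proof of (i) We denote by $c_{3.4}, c_{3.5}$ positive numbers depending only on $K$. Let $\gamma_1, \dots, \gamma_\nu$ be a $\mathbb{Z}$-basis for $\mathcal{O}_K$. If $x_1, \dots, x_v \in \mathcal{O}_K$ are solutions of (3.5), then write

\[ x_s = \xi_{s1}\gamma_1 + \cdots + \xi_{s\nu}\gamma_\nu, \qquad 1 \le s \le v, \tag{3.6} \]

with all $\xi_{si} \in \mathbb{Z}$. Further write

\[ a_{js}\gamma_r = b_{jsr1}\gamma_1 + \cdots + b_{jsr\nu}\gamma_\nu \quad \text{for } 1 \le j \le w,\ 1 \le s \le v,\ 1 \le r \le \nu, \]

with all $b_{jsri} \in \mathbb{Z}$. Then for $1 \le j \le w$,

\[
\begin{aligned}
0 = \sum_{s=1}^{v} a_{js} x_s &= \sum_{s=1}^{v} a_{js} \sum_{r=1}^{\nu} \xi_{sr}\gamma_r \\
&= \sum_{r=1}^{\nu} \sum_{s=1}^{v} \xi_{sr} \sum_{u=1}^{\nu} b_{jsru}\gamma_u \\
&= \sum_{u=1}^{\nu} \Bigl( \sum_{r=1}^{\nu} \sum_{s=1}^{v} b_{jsru}\,\xi_{sr} \Bigr) \gamma_u.
\end{aligned}
\]

Since $\gamma_1, \dots, \gamma_\nu$ is a basis for $K$,

\[ \sum_{r=1}^{\nu} \sum_{s=1}^{v} b_{jsru}\,\xi_{sr} = 0 \tag{3.7} \]

for $1 \le j \le w$ and $1 \le u \le \nu$. This gives $w\nu$ equations in $v\nu$ variables $\xi_{sr}$. Further, by Lemma 3.1.2 and $\overline{|ab|} \le \overline{|a|}\;\overline{|b|}$, we get

\[ |b_{jsru}| \le c_{3.1}\,\overline{|a_{js}\gamma_r|} \le c_{3.1} A \max_{1 \le r \le \nu} \overline{|\gamma_r|} \le c_{3.4} A. \]

Hence by Lemma 3.1.1, the system of equations (3.7) has a non-trivial solution in $\mathbb{Z}$ satisfying

\[ |\xi_{sr}| < (c_{3.4} A v\nu)^{w/(v-w)}, \qquad 1 \le s \le v,\ 1 \le r \le \nu. \]

Hence from (3.6), we get

\[ \overline{|x_s|} \le |\xi_{s1}|\,\overline{|\gamma_1|} + \cdots + |\xi_{s\nu}|\,\overline{|\gamma_\nu|} \le \nu\,(c_{3.4} A v\nu)^{w/(v-w)} \max_{1 \le r \le \nu} \overline{|\gamma_r|} \le c_{3.5}\,(c_{3.4} A v\nu)^{w/(v-w)}. \]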

This completes the proof of (i) of the lemma.

Proof of (ii) Let $X \ge 2$ and $Y$ be natural numbers. Let

\[ I_X = \{(x_1, \dots, x_v) : x_i \in \mathbb{Z},\ |x_i| \le X,\ 1 \le i \le v\} \]

and

\[ J_Y = \{(y_1, \dots, y_w) : \overline{|y_j|} \le Y,\ 1 \le j \le w\}. \]

Note that $\overline{|y_j|} \le vAX$ for any $(x_1, \dots, x_v) \in I_X$. Hence there is a mapping from $I_X$ to $J_{vAX}$. We have $|I_X| \le (2X + 1)^{v}$. Now $y_j \in \mathcal{O}_K$ and hence satisfies an equation of the form

\[ y_j^{\nu} + b_1 y_j^{\nu-1} + \cdots + b_\nu = 0, \qquad b_i \in \mathbb{Z},\ 1 \le i \le \nu. \]

In fact, for $1 \le i \le \nu$, $b_i$ is (up to sign) the $i$th elementary symmetric function in $y_j$ and its conjugates. So

\[ |b_i| \le \binom{\nu}{i} (vAX)^{i}. \]

Hence the number of possible equations satisfied by $y_j$ is at most

\[ \prod_{i=1}^{\nu} \Bigl( 2\binom{\nu}{i} (vAX)^{i} + 1 \Bigr), \]

and each such equation has at most $\nu$ possible values for $y_j$. Thus

\[ |J_{vAX}| \le \nu^{w} \prod_{i=1}^{\nu} \Bigl( 2\binom{\nu}{i} (vAX)^{i} + 1 \Bigr)^{w} = \nu^{w}\, 2^{w\nu}\, (vAX)^{\frac{w\nu(\nu+1)}{2}} \prod_{i=1}^{\nu} \Bigl( \binom{\nu}{i} + \frac{1}{2(vAX)^{i}} \Bigr)^{w}. \]

Since $vAX \ge 2$, using the inequality between the arithmetic and geometric means, the right-hand side of the above inequality can be estimated as

\[ \nu^{w}\, 2^{w\nu}\, (vAX)^{w\nu(\nu+1)/2} \Bigl( \frac{1}{\nu} \sum_{i=1}^{\nu} \Bigl( \binom{\nu}{i} + \frac{1}{2^{i+1}} \Bigr) \Bigr)^{w\nu} \le \nu^{w}\, 2^{w\nu} \Bigl( \frac{2^{\nu} - 1/2}{\nu} \Bigr)^{w\nu} (vAX)^{w\nu(\nu+1)/2}. \]

Thus we get

\[ |J_{vAX}| < 2^{w\nu(\nu+1)}\, (vAX)^{w\nu(\nu+1)/2}. \]

Let

\[ (2X + 1)^{v} \ge 2^{w\nu(\nu+1)}\, (vAX)^{w\nu(\nu+1)/2} = (4vAX)^{w\nu(\nu+1)/2}. \tag{3.8} \]

Arguing as in Lemma 3.1.1, there exist $x_1, \dots, x_v \in \mathbb{Z}$, not all zero, satisfying the given system of linear equations with $|x_i| \le 2X$. Define $\lambda$ by

\[ \lambda^{v - w\nu(\nu+1)/2} = (2vA)^{w\nu(\nu+1)/2} \]

and take $X$ satisfying $\lambda - 1 \le 2X < \lambda + 1$. Then

\[ (4vAX)^{w\nu(\nu+1)/2} = (2vA)^{w\nu(\nu+1)/2} (2X)^{w\nu(\nu+1)/2} \le (2X + 1)^{v - w\nu(\nu+1)/2} (2X)^{w\nu(\nu+1)/2} < (2X + 1)^{v}. \]

Hence (3.8) is satisfied. Also, since $2X \le \lambda + 1$, we get the assertion of the lemma.
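The choice of X at the end of the proof of (ii) can be checked on a small numerical instance. The following is a minimal sketch; the parameter values v = 10, w = 1, ν = 2, A = 3 and the function name are assumptions picked only for illustration. It computes λ from its defining relation, takes the integer X with λ − 1 ≤ 2X < λ + 1, and verifies the strict form of (3.8) with exact integers.

```python
from math import ceil


def choose_X(v: int, w: int, nu: int, A: int):
    """Return (E, lam, X) where E = w*nu*(nu+1)/2, lam^(v-E) = (2vA)^E,
    and X is the integer with lam - 1 <= 2X < lam + 1."""
    E = w * nu * (nu + 1) // 2          # nu*(nu+1) is even, so E is an integer
    assert v > E and A >= 1             # this is the condition 2v > w*nu*(nu+1)
    lam = (2 * v * A) ** (E / (v - E))  # floating-point value of lambda
    X = ceil((lam - 1) / 2)
    return E, lam, X


v, w, nu, A = 10, 1, 2, 3               # hypothetical toy parameters
E, lam, X = choose_X(v, w, nu, A)
# Exact integer check of (3.8): (2X+1)^v > (4vAX)^E
holds = (2 * X + 1) ** v > (4 * v * A * X) ** E
```

With these values the resulting bound 2X on |x_i| stays below 1 + λ, matching the shape of the estimate in assertion (ii).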

3.2 Proof of Gelfond–Schneider Theorem 3.0.1

Assume that $\alpha$, $\beta$, $\omega = \alpha^{\beta}$ are all algebraic. Let $K$ be the number field containing these numbers with $[K : \mathbb{Q}] = \nu$.

Choice of Parameters

Let

\[ m = 2\nu + 2; \qquad t = u^{2}; \qquad 2m \mid t; \qquad n = t/(2m). \]

Thus

\[ u = c_{3.6}\sqrt{n} \]

where $c_{3.6} = 2\sqrt{\nu + 1}$. Here and henceforth, we denote by $c_{3.6}, c_{3.7}, \dots$ positive numbers depending only on $\alpha$, $\beta$ and independent of $n$. Let

\[ \theta_{au+b} = (a + b\beta)\log\alpha, \qquad a, b \in \mathbb{Z},\ 0 \le a \le u - 1,\ 1 \le b \le u. \]

There are $t = u^{2}$ such $\theta$'s; let them be labelled as $\theta_1, \dots, \theta_t$. Note that the $\theta$'s are all distinct since $\beta$ is irrational.

Auxiliary Polynomial

Put

\[ R(z) = \eta_1 e^{\theta_1 z} + \cdots + \eta_t e^{\theta_t z} \tag{3.9} \]

where $\eta_1, \dots, \eta_t$ are variables to be determined such that

\[ (\log\alpha)^{-k} R^{(k)}(s) = 0 \quad \text{for } 0 \le k \le n - 1 \text{ and } 1 \le s \le m. \tag{3.10} \]
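The bookkeeping in the choice of parameters can be checked mechanically. The sketch below is only an illustration: the function name and the device of taking u to be a multiple of 2m (one convenient way to force 2m | u²) are assumptions, not taken from the text. It verifies that t = 2mn, that u² = 4(ν + 1)n, and that the labels au + b with 0 ≤ a ≤ u − 1, 1 ≤ b ≤ u run exactly over 1, …, t.

```python
def gs_parameters(nu: int, k: int):
    """Parameters m, u, t, n as in the choice above.

    Taking u = 2*m*k is a hypothetical convenience ensuring that 2m divides t = u^2.
    """
    m = 2 * nu + 2
    u = 2 * m * k
    t = u * u
    n = t // (2 * m)
    return m, u, t, n


nu = 2                                   # hypothetical degree [K : Q]
m, u, t, n = gs_parameters(nu, k=1)

# Labels a*u + b for 0 <= a <= u-1, 1 <= b <= u
labels = sorted(a * u + b for a in range(u) for b in range(1, u + 1))

equations = m * n                        # number of conditions in (3.10)
unknowns = t                             # number of variables eta_1, ..., eta_t
```

Here there are twice as many unknowns as equations, which is the ratio used when Lemma 3.1.3 (i) is applied with v = 2mn and w = mn.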

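This is a set of $mn$ equations in $t$ $(= 2mn)$ variables $\eta_1, \dots, \eta_t$.

Coefficients of $\eta_1, \dots, \eta_t$ in (3.10) are in K

A typical coefficient is of the form

\[ (\log\alpha)^{-k}\, \theta_i^{k}\, e^{\theta_i s} = (a + b\beta)^{k} e^{s(a + b\beta)\log\alpha} = (a + b\beta)^{k} \alpha^{sa} \omega^{sb} \]

and hence is in $K$ by our assumption on $\omega$. Let $c_{3.7}$ be a common denominator of $\alpha$, $\beta$ and $\omega$. Then $c_{3.7}\alpha, c_{3.7}\beta, c_{3.7}\omega \in \mathcal{O}_K$. Hence

\[ c_{3.7}^{\,n-1+2mu} (\log\alpha)^{-k} R^{(k)}(s) \tag{3.11} \]

have coefficients in $\mathcal{O}_K$. Also the house of the coefficients is bounded by

\[ c_{3.7}^{\,n-1+2mu} \bigl(u + u\,\overline{|\beta|}\bigr)^{n-1}\, \overline{|\alpha|}^{\,mu}\, \overline{|\omega|}^{\,mu} \le c_{3.8}^{\,n} u^{n-1} \le c_{3.9}^{\,n} t^{(n-1)/2} \le c_{3.10}^{\,n} n^{(n-1)/2} \]

since $u = c_{3.6}\sqrt{n}$.

Application of Lemma 3.1.3

By Lemma 3.1.3 (i), with $v = t = 2mn$ and $w = mn$, the set of equations in (3.10) has a solution $\{\eta_1, \dots, \eta_t\}$, not all zero, in $\mathcal{O}_K$ satisfying

\[ \overline{|\eta_k|} < c_{3.2} \bigl( c_{3.3}\, t\, c_{3.10}^{\,n}\, n^{\frac{n-1}{2}} \bigr)^{\frac{mn}{t-mn}} < c_{3.11}^{\,n}\, n^{\frac{n-1}{2}}\, u^{2} < c_{3.12}^{\,n}\, n^{\frac{n+1}{2}} \quad \text{for } 1 \le k \le t. \tag{3.12} \]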

R(z) is Not a Zero Polynomial

Suppose $R(z) \equiv 0$. Then expanding the exponentials in (3.9) we have

\[ \eta_1\theta_1^{k} + \cdots + \eta_t\theta_t^{k} = 0 \quad \text{for } k = 0, 1, \dots. \]

Since the determinant of the matrix $(\theta_i^{\,j})_{i,j=1}^{t}$ is non-zero as $\theta_i \ne \theta_k$ for $i \ne k$, we get $\eta_i = 0$ for $1 \le i \le t$, a contradiction. Thus $R(z) \not\equiv 0$. Hence there exist integers $r$ and $s^{*}$ with $r \ge n$, $1 \le s^{*} \le m$ such that

\[ R^{(k)}(s) = 0 \quad \text{for } 0 \le k \le r - 1 \text{ and } 1 \le s \le m \]

and

\[ R^{(r)}(s^{*}) \ne 0. \]

Define

\[ \lambda = (\log\alpha)^{-r} R^{(r)}(s^{*}). \]

Thus $\lambda \ne 0$. In the rest of the proof, we find lower and upper bounds for $|N(\lambda)|$ which contradict each other.

Lower Bound for |N(λ)|

By (3.11), we know that

\[ c_{3.7}^{\,r+2mu}\, \lambda \in \mathcal{O}_K. \]

Hence, as $r \ge n$,

\[ |N(\lambda)| \ge c_{3.7}^{-\nu(r+2mu)} \ge c_{3.13}^{-r}. \tag{3.13} \]

Upper Bound for |N(λ)|

Note that

\[ |N(\lambda)| \le \overline{|\lambda|}^{\,\nu-1}\, |\lambda|. \tag{3.14} \]

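We give upper estimates for $\overline{|\lambda|}$ and $|\lambda|$.

Upper Bound for $\overline{|\lambda|}$

We have

\[ \overline{|\lambda|} \le t \max_{1 \le k \le t} \overline{|\eta_k|}\;\overline{|e^{s^{*}\theta_k}|}\;\overline{|\theta_k|}^{\,r}. \]

Now

\[ \overline{|\theta_k|} = \overline{|(a + b\beta)\log\alpha|} \le c_{3.14}\, u \quad \text{and} \quad \overline{|e^{s^{*}\theta_k}|} \le c_{3.15}^{\,u}. \]

Hence, by (3.12),

\[ \overline{|\lambda|} \le t\, c_{3.12}^{\,n}\, n^{\frac{n+1}{2}}\, c_{3.15}^{\,u}\, (c_{3.14}\, u)^{r} \le c_{3.16}^{\,r}\, r^{(2r+3)/2}. \tag{3.15} \]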

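Upper Bound for |λ| Using Cauchy Integral Formula

Let

\[ T(z) = r!\, \frac{R(z)}{(z - s^{*})^{r}} \prod_{\substack{k=1 \\ k \ne s^{*}}}^{m} \Bigl( \frac{s^{*} - k}{z - k} \Bigr)^{r}. \]

Then observe that

\[ T(s^{*}) = R^{(r)}(s^{*}) = (\log\alpha)^{r} \lambda. \]

Since $R^{(j)}(s) = 0$ for $0 \le j < r$ and $1 \le s \le m$, the zeros of $R$ cancel the poles of the remaining factors, and $T(z)$ has a Taylor expansion at $z = s^{*}$ which is valid for all $z \in \mathbb{C}$. Hence $T(z)$ is an entire function. Applying the Cauchy integral formula for $\lambda$ on $C : |z| = m\bigl(1 + \frac{r}{u}\bigr)$ we get

\[ \lambda = (\log\alpha)^{-r}\, T(s^{*}) = (\log\alpha)^{-r}\, \frac{1}{2\pi i} \int_{C} \frac{T(z)}{z - s^{*}}\, dz. \]

Note that for $z$ on $C$, we have $s^{*} \le m < |z|$. We estimate $|T(z)|$ as below. First,

\[
\begin{aligned}
|R(z)| &\le t \max_{k,i} |\eta_k|\, e^{|\theta_i||z|} \\
&\le t \max_{k} |\eta_k|\, e^{u(1 + |\beta|)\, m \left(\frac{u+r}{u}\right) |\log\alpha|} \\
&\le t\, c_{3.12}^{\,n}\, n^{(n+1)/2}\, c_{3.17}^{\,u+r}.
\end{aligned}
\]

Thus

\[ |R(z)| \le c_{3.18}^{\,r}\, r^{(r+3)/2} \tag{3.16} \]

since $r \ge n$ and $t = 2mn$. Next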

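\[ |z - k| \ge |z| - k \ge m\Bigl(1 + \frac{r}{u}\Bigr) - m = \frac{mr}{u} \quad \text{for } 1 \le k \le m. \]

Hence

\[ \Bigl| (z - s^{*})^{-r} \prod_{\substack{k=1 \\ k \ne s^{*}}}^{m} \Bigl( \frac{s^{*} - k}{z - k} \Bigr)^{r} \Bigr| \le \Bigl( \frac{mr}{u} \Bigr)^{-r} \prod_{\substack{k=1 \\ k \ne s^{*}}}^{m} \Bigl( m \cdot \frac{u}{mr} \Bigr)^{r} \le c_{3.19}^{\,r} \Bigl( \frac{u}{r} \Bigr)^{mr}. \]

We use the above inequality along with (3.16) to get

\[ |T(z)| \le r!\, c_{3.18}^{\,r}\, r^{(r+3)/2}\, c_{3.19}^{\,r} \Bigl( \frac{u}{r} \Bigr)^{mr} \le c_{3.20}^{\,r}\, r^{r}\, r^{\frac{r+3}{2}}\, r^{-\frac{mr}{2}} = c_{3.20}^{\,r}\, r^{\frac{r(3-m)+3}{2}}. \]

Thus

\[ |\lambda| \le |\log\alpha|^{-r}\, \frac{1}{2\pi} \Bigl| \int_{C} \frac{T(z)}{z - s^{*}}\, dz \Bigr| \le |\log\alpha|^{-r}\, m\Bigl(1 + \frac{r}{u}\Bigr) \frac{u}{mr}\, c_{3.20}^{\,r}\, r^{\frac{r(3-m)+3}{2}}. \]

Hence

\[ |\lambda| \le c_{3.21}^{\,r}\, r^{\frac{r(3-m)+3}{2}}. \tag{3.17} \]

Using (3.15) and (3.17) in (3.14), we get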

\[ |N(\lambda)| \le c_{3.16}^{\,(\nu-1)r}\, r^{(\nu-1)(2r+3)/2}\, c_{3.21}^{\,r}\, r^{(r(3-m)+3)/2} \le c_{3.22}^{\,r}\, r^{((\nu-1)(2r+3) + r(3-2\nu-2) + 3)/2}. \]

Thus

\[ |N(\lambda)| \le c_{3.22}^{\,r}\, r^{(3\nu - r)/2}. \tag{3.18} \]

Comparing (3.13) and (3.18), we see that

\[ c_{3.13}^{-r} \le c_{3.22}^{\,r}\, r^{(3\nu - r)/2} \]

or

\[ r^{(r - 3\nu)/2} < c_{3.23}^{\,r}. \]

Since $r \ge n$ and $c_{3.23}$ is independent of $n$, this inequality does not hold for $n$ sufficiently large. Thus we conclude that $\omega = \alpha^{\beta}$ is not algebraic.

Exercise

1. If $z \in \mathbb{C}$ is a non-rational zero of the equation

\[ \sqrt{3}\,(1 + z) = \tan(z\pi/2), \tag{3.19} \]

then show that $z$ is transcendental.

2. Prove that the following statement is equivalent to Theorem 3.0.1. Let $\alpha$ and $\beta$ be algebraic numbers such that they are $\mathbb{Q}$-linearly independent. Then, for any $t \in \mathbb{C} \setminus \{0\}$, at least one of $e^{t\alpha}$ and $e^{t\beta}$ is transcendental.

Notes

There are several different proofs of the Gelfond–Schneider theorem available now; for instance, see [1] for a proof based on the method of interpolation determinants introduced in 1992 by M. Laurent. It is known that the roots of Eq. (3.19) with $z > -1$ are simple and real. The Amick–Fraenkel conjecture asserts that the set $\{1, z_1, z_2, \dots\}$, where $z_1, z_2, \dots$ are the zeros of (3.19), is linearly independent over $\mathbb{Q}$. This is known to hold under Schanuel's conjecture. See [2].

References

1. Yu.V. Nesterenko, Algebraic Independence, vol. 14 (Tata Institute of Fundamental Research Publications, Mumbai, 2008), 157 p
2. E. Shargorodsky, On the Amick-Fraenkel conjecture. Quart. J. Math. 65, 267–278 (2014)

Chapter 4

Extensions Due to Ramachandra

Change is hard at first, messy in the middle, and gorgeous at the end —Robin Sharma

In 1968, Ramachandra [1, 2] proved results relating to the set of complex numbers at which a given set of algebraically independent meromorphic functions assumes values in a fixed algebraic number field. These results proved to be significant in the case, to quote his own words, “(overlooked by Gelfond) where the functions concerned do not satisfy algebraic differential equations of the first order with algebraic number coefficients.” His result, besides simplifying Schneider’s method, enables one to study the set of all complex numbers at which two algebraically independent meromorphic functions $f(z)$ and $g(z)$ take values which are algebraic numbers. In particular, he was able to obtain results when

\[ (f(z), g(z)) \in \{(z, \wp(az)),\ (e^{z}, \wp(az)),\ (\wp_1(z), \wp_2(az))\} \]

where $a \ne 0$ is an arbitrary complex number and $\wp$, $\wp_1$ and $\wp_2$ are Weierstrass elliptic functions. We refer to [2] for these results. In this chapter we give two theorems of Ramachandra. Theorem 3.0.1 is deduced from these theorems. Further, we give a few other applications, for instance on the transcendence of values of the Weierstrass elliptic function.

© Springer Nature Singapore Pte Ltd. 2020
S. Natarajan and R. Thangadurai, Pillars of Transcendental Number Theory, https://doi.org/10.1007/978-981-15-4155-1_4

4.1 Functions Satisfying Differential Equations

Let

\[ P(x_1, \dots, x_r) = \sum_{\lambda_1, \dots, \lambda_r} p_{\lambda_1, \dots, \lambda_r}\, x_1^{\lambda_1} \cdots x_r^{\lambda_r}, \qquad Q(x_1, \dots, x_r) = \sum_{\lambda_1, \dots, \lambda_r} q_{\lambda_1, \dots, \lambda_r}\, x_1^{\lambda_1} \cdots x_r^{\lambda_r} \]

be two polynomials in $\mathbb{C}[x_1, \dots, x_r]$. We say that the polynomial $P$ is majorised by the polynomial $Q$, written as $P \ll Q$, if $|p_{\lambda_1, \dots, \lambda_r}| \le |q_{\lambda_1, \dots, \lambda_r}|$ for every $(\lambda_1, \dots, \lambda_r)$. We need to study meromorphic functions when they satisfy some differential equation. The following lemma describes one such situation. For any non-zero algebraic number $\alpha$, we denote by $s(\alpha)$ the size of $\alpha$ as in Sect. 1.3. We will be using the results from Lemma 1.3.2 often without any mention.

Lemma 4.1.1 Let $K$ be a number field. Let $f$ be a meromorphic function. For some integer $k \ge 1$, suppose $f$ satisfies a differential equation as follows:

\[ f^{(k)}(z) = \sum_{\nu_1=0}^{n_1} \cdots \sum_{\nu_k=0}^{n_k} d_{\nu_1, \dots, \nu_k}\, (f^{(0)}(z))^{\nu_1} \cdots (f^{(k-1)}(z))^{\nu_k} \quad \text{with } d_{\nu_1, \dots, \nu_k} \in K. \tag{4.1} \]

Suppose $f^{(0)}(z_0), \dots, f^{(k-1)}(z_0)$ are all in $K$ for some $z = z_0$. Then there exist positive integers $b'$ and $c'$ such that for any integer $\tau \ge 0$, we have

(i) $(b')^{\tau+1} f^{(\tau)}(z_0)$ is an algebraic integer in $K$.
(ii) $s(f^{(\tau)}(z_0)) \le (c')^{\tau+1} (\tau + 1)^{\tau}$.

Proof Let $n = 1 + \sum_{i=1}^{k} n_i$. Treating $f^{(0)}(z), \dots, f^{(k-1)}(z)$ as variables, we see from (4.1) that $f^{(k)}(z)$ is of degree at most $n$. Further, $f^{(k+1)}(z)$ is of degree at most $2n$; $f^{(k+2)}(z)$ is of degree at most $3n$, and so on. Thus $f^{(k+j)}(z)$ is of degree at most $n(j + 1)$. In other words, $f^{(j)}(z)$ is of degree at most $n(j - k + 1) < n(j + 1)$ for $j \ge k$. This is obviously true for $j < k$. Let $d$ be a common denominator of the $d_{\nu_1, \dots, \nu_k}$ and of $f^{(\tau)}(z_0)$, $0 \le \tau \le k - 1$. Then we can take $b' = d^{n}$. This proves (i) of the lemma.

For the second part of the lemma, we take

\[ Q(z) = Q(f^{(0)}(z), \dots, f^{(k-1)}(z)) = 1 + \sum_{\nu_1=0}^{n_1} \cdots \sum_{\nu_k=0}^{n_k} \overline{|d_{\nu_1, \dots, \nu_k}|} + f^{(0)}(z) + \cdots + f^{(k-1)}(z). \]

We claim that for any integer $j \ge k$, we get

\[ f^{(j)}(z) \ll (n\, Q^{n}(z))^{j+1} (j + 1)^{j}. \]

Note that by (4.1),

\[ f^{(k)}(z) \ll Q^{n}(z), \]

i.e. $f^{(k)}(z)$ is majorised by $Q^{n}(z)$. Hence

\[ f^{(k+1)}(z) \ll n\, Q^{n-1}(z) \bigl( f^{(1)}(z) + \cdots + f^{(k)}(z) \bigr) \ll n\, Q^{n-1}(z) \bigl( Q(z) + Q^{n}(z) \bigr) \ll n\, Q^{2n}(z), \]

from which we also get

\[ f^{(k+2)}(z) \ll n(2n)\, Q^{2n-1}(z)\, Q^{n+1}(z) = 2!\, n^{2}\, Q^{3n}(z). \]

Proceeding thus, we find that $f^{(k+j)}(z) \ll j!\, n^{j}\, Q^{(j+1)n}(z)$. Therefore for $j \ge k$,

\[ f^{(j)}(z) \ll (n\, Q^{n}(z))^{j+1} (j + 1)^{j}, \tag{4.2} \]

as claimed. The above claim is trivially true for $j < k$. Let us choose

\[ c' = n \Bigl( 1 + \sum_{\nu_1, \dots, \nu_k} s(d_{\nu_1, \dots, \nu_k}) + \sum_{r=0}^{k-1} s(f^{(r)}(z_0)) \Bigr)^{n}. \]

To complete the proof of (ii), for any integer $\tau \ge 0$, we need to estimate $s(f^{(\tau)}(z_0))$. By (4.2),

\[ \overline{|f^{(\tau)}(z_0)|} \le (\tau + 1)^{\tau}\, n^{\tau+1} \Bigl( 1 + \sum_{\nu_1, \dots, \nu_k} \overline{|d_{\nu_1, \dots, \nu_k}|} + \sum_{r=0}^{k-1} \overline{|f^{(r)}(z_0)|} \Bigr)^{n(\tau+1)} \]

and we know by the first part (i) of the lemma that $d(f^{(\tau)}(z_0)) \le d^{\,n(\tau+1)}$. Hence

\[ s(f^{(\tau)}(z_0)) \le (\tau + 1)^{\tau}\, n^{\tau+1} \Bigl( d + 1 + \sum_{\nu_1, \dots, \nu_k} \overline{|d_{\nu_1, \dots, \nu_k}|} + \sum_{r=0}^{k-1} \overline{|f^{(r)}(z_0)|} \Bigr)^{n(\tau+1)} \le (\tau + 1)^{\tau} (c')^{\tau+1}, \]

proving (ii).

In the next lemma we deal with two meromorphic functions.
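The degree count at the start of the proof of Lemma 4.1.1 can be made concrete on a simple example. The sketch below assumes the hypothetical test case f = tan, which satisfies f′ = 1 + f² (so k = 1, n₁ = 2, n = 3, K = Q) with f(0) = 0; the function names are illustrative. Each derivative is written as a polynomial in f, and the code records its degree and its value at z₀ = 0.

```python
def derivative(poly):
    """Differentiate a polynomial in f with respect to z, using f' = 1 + f^2.

    poly[k] is the coefficient of f^k.
    """
    out = [0] * (len(poly) + 1)
    for k, c in enumerate(poly):
        if k >= 1:
            out[k - 1] += k * c      # k*c*f^(k-1) * 1
            out[k + 1] += k * c      # k*c*f^(k-1) * f^2
    return out


def derivative_polys(count):
    """Polynomials in f representing f^(0), f^(1), ..., f^(count-1)."""
    polys = [[0, 1]]                 # f^(0) = f
    for _ in range(count - 1):
        polys.append(derivative(polys[-1]))
    return polys


polys = derivative_polys(8)
# Values at z0 = 0, where f(0) = 0: only the constant term survives.
values_at_0 = [p[0] for p in polys]
degrees = [max(i for i, c in enumerate(p) if c) for p in polys]
n = 3                                # n = 1 + n_1 for this example
```

In this example all values f^(τ)(0) are rational integers, so b′ = 1 works in part (i), and the degree of f^(j) as a polynomial in f stays within the bound n(j − k + 1) = 3j used in the proof.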

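Lemma 4.1.2 Let the hypothesis of Lemma 4.1.1 be satisfied for $f = f_i$ with $k = k_i$, $i = 1, 2$. Let $b_i'$, $c_i'$ be the corresponding values of $b'$ and $c'$. Let $b = \max(b_1', b_2')$ and $c = b \max(c_1', c_2')$. Let $\rho_1$ and $\rho_2$ be given natural numbers. Then for any integer $j \ge 0$, we have

(i) $b^{\,j+\rho_1+\rho_2}\, \dfrac{d^{j}}{dz^{j}} \bigl( (f_1(z))^{\rho_1} (f_2(z))^{\rho_2} \bigr) \Big|_{z=z_0}$ is an algebraic integer.

(ii) $s\Bigl( \dfrac{d^{j}}{dz^{j}} \bigl( (f_1(z))^{\rho_1} (f_2(z))^{\rho_2} \bigr) \Big|_{z=z_0} \Bigr) \le (\rho_1 + \rho_2)^{j} (j + 1)^{j}\, c^{\,j+\rho_1+\rho_2}$.

Proof We have

\[ \frac{d^{j}}{dz^{j}} \bigl( (f_1(z))^{\rho_1} (f_2(z))^{\rho_2} \bigr) = \sum_{\nu=0}^{j} \binom{j}{\nu} \frac{d^{\nu}}{dz^{\nu}} \bigl( (f_1(z))^{\rho_1} \bigr)\, \frac{d^{j-\nu}}{dz^{j-\nu}} \bigl( (f_2(z))^{\rho_2} \bigr). \]

Now, by the general Leibniz rule,

\[ \frac{d^{\nu}}{dz^{\nu}} \bigl( (f_1(z))^{\rho_1} \bigr) = \sum_{\mu_1 + \cdots + \mu_{\rho_1} = \nu} \frac{\nu!}{\mu_1! \cdots \mu_{\rho_1}!}\, f_1^{(\mu_1)}(z) \cdots f_1^{(\mu_{\rho_1})}(z). \]

Hence by Lemma 4.1.1

\[ b^{\,\nu+\rho_1}\, \frac{d^{\nu}}{dz^{\nu}} \bigl( (f_1(z))^{\rho_1} \bigr) \Big|_{z=z_0} \]

is an algebraic integer. Similarly,

\[ b^{\,j-\nu+\rho_2}\, \frac{d^{j-\nu}}{dz^{j-\nu}} \bigl( (f_2(z))^{\rho_2} \bigr) \Big|_{z=z_0} \]

is an algebraic integer. Thus we get (i).

(ii) Since $b^{\,j+1} f_i^{(j)}(z_0)$ is an algebraic integer, we get that

\[
\begin{aligned}
s\Bigl( b^{\,\rho_1+\nu}\, \frac{d^{\nu}}{dz^{\nu}} (f_1^{\rho_1}) \Big|_{z=z_0} \Bigr) &\le \sum_{\mu_1 + \cdots + \mu_{\rho_1} = \nu} \frac{\nu!}{\mu_1! \cdots \mu_{\rho_1}!}\, s\bigl( b^{\,\mu_1+1} f_1^{(\mu_1)} \cdots b^{\,\mu_{\rho_1}+1} f_1^{(\mu_{\rho_1})} \big|_{z=z_0} \bigr) \\
&\le \sum_{\mu_1 + \cdots + \mu_{\rho_1} = \nu} \frac{\nu!}{\mu_1! \cdots \mu_{\rho_1}!}\, s\bigl( b^{\,\mu_1+1} f_1^{(\mu_1)} \bigr)\big|_{z=z_0} \cdots s\bigl( b^{\,\mu_{\rho_1}+1} f_1^{(\mu_{\rho_1})} \bigr)\big|_{z=z_0}.
\end{aligned}
\]

Note that by Lemma 4.1.1(ii), the right-hand side of the above inequality is bounded by

\[ \sum_{\mu_1 + \cdots + \mu_{\rho_1} = \nu} \frac{\nu!}{\mu_1! \cdots \mu_{\rho_1}!} \Bigl( b^{\,\rho_1+\nu} (c_1')^{\rho_1+\nu} \prod_{k=1}^{\rho_1} (\mu_k + 1)^{\mu_k} \Bigr). \]

Thus

\[ s\Bigl( \frac{d^{\nu}}{dz^{\nu}} (f_1^{\rho_1}) \Big|_{z=z_0} \Bigr) \le \rho_1^{\nu}\, (b c_1')^{\rho_1+\nu}\, (\nu + 1)^{\nu} \tag{4.3} \]

since $\sum_{\mu_1 + \cdots + \mu_{\rho_1} = \nu} \frac{\nu!}{\mu_1! \cdots \mu_{\rho_1}!}$ is the value of $(x_1 + \cdots + x_{\rho_1})^{\nu}$ when $x_1 = \cdots = x_{\rho_1} = 1$, and

\[ \prod_{k=1}^{\rho_1} (\mu_k + 1)^{\mu_k} \le \Bigl( \sum_{k=1}^{\rho_1} \mu_k + 1 \Bigr)^{\sum_{k=1}^{\rho_1} \mu_k}. \]

A similar inequality to (4.3) holds for $f_2^{\rho_2}$ with $\rho_1$, $c_1'$ replaced by $\rho_2$, $c_2'$. Hence

\[
\begin{aligned}
s\Bigl( \frac{d^{j}}{dz^{j}} (f_1^{\rho_1} f_2^{\rho_2}) \Big|_{z=z_0} \Bigr) &\le \sum_{\nu=0}^{j} \binom{j}{\nu} \rho_1^{\nu} \rho_2^{\,j-\nu} (j - \nu + 1)^{j-\nu} (\nu + 1)^{\nu}\, c^{\,j+\rho_1+\rho_2} \\
&\le (\rho_1 + \rho_2)^{j} (j + 1)^{j}\, c^{\,j+\rho_1+\rho_2},
\end{aligned}
\]

proving (ii).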
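The proof of Lemma 4.1.2 rests on three elementary combinatorial facts: the multinomial coefficients over μ₁ + ⋯ + μ_ρ = ν sum to ρ^ν, the product of the (μ_k + 1)^{μ_k} is at most (ν + 1)^ν, and the final binomial sum is at most (ρ₁ + ρ₂)^j (j + 1)^j. The following sketch (function names are illustrative) checks these by brute force over small ranges; it is a sanity check, not a proof.

```python
from itertools import product
from math import comb, factorial, prod


def compositions(rho, nu):
    """All tuples (mu_1, ..., mu_rho) of non-negative integers summing to nu."""
    return [mu for mu in product(range(nu + 1), repeat=rho) if sum(mu) == nu]


def multinomial_sum(rho, nu):
    """Sum of nu!/(mu_1! ... mu_rho!) over all compositions; equals rho**nu."""
    return sum(factorial(nu) // prod(factorial(m) for m in mu)
               for mu in compositions(rho, nu))


def product_bound_holds(rho, nu):
    """Check prod (mu_k + 1)^mu_k <= (nu + 1)^nu for every composition."""
    return all(prod((m + 1) ** m for m in mu) <= (nu + 1) ** nu
               for mu in compositions(rho, nu))


def binomial_bound_holds(rho1, rho2, j):
    """Check the last inequality in the proof (with the common factor c^(...) removed)."""
    lhs = sum(comb(j, v) * rho1 ** v * rho2 ** (j - v)
              * (j - v + 1) ** (j - v) * (v + 1) ** v
              for v in range(j + 1))
    return lhs <= (rho1 + rho2) ** j * (j + 1) ** j
```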

4.2 First Extension

Let $F_1, \dots, F_s$ be algebraically independent entire functions and let $\{(a_\mu, n_\mu)\}$ be a sequence of pairs with $a_\mu \in \mathbb{C}$ and $n_\mu \in \mathbb{N}$ with $\{n_\mu\}$ a non-decreasing sequence. We make the following hypotheses.

H1. Let $F_1, \dots, F_s$ be entire functions of finite order $\le \rho$.
H2. Let $N(Q) = |\{a_\mu : n_\mu \le Q\}|$. Assume that $N(Q)$ is finite for any $Q \ge 1$.
H3. Let $D(Q) = \max_{n_\mu \le Q} (|a_\mu|)$.
H4. Assume that $\liminf \dfrac{\log N(Q)}{\log D(Q)} > \rho$.
H5. Assume that the $F_t(a_\mu)$ are all algebraic. Let $\nu(Q)$ be the degree of the field $\mathbb{Q}(F_t(a_\mu))$, $1 \le t \le s$ and $n_\mu \le Q$.
H6. Let $M^{(t)}(R) = 1 + \max_{|z|=R} |F_t(z)|$, $1 \le t \le s$.
H7. Put $M_1^{(t)}(Q) = 1 + \max_{n_\mu \le Q} \{s(F_t(a_\mu))\}$, $1 \le t \le s$.

Let $f(q)$ be a positive and increasing function. Let $s \ge 1$ and $r_1, \dots, r_s$ be positive integers. We say that $r_1 \cdots r_s \sim f(q)$ if given $\epsilon > 0$, there exists $q_0(\epsilon)$ such that for $q \ge q_0(\epsilon)$ we have

\[ (1 - \epsilon) f(q) < r_1 \cdots r_s < (1 + \epsilon) f(q). \]

For instance, suppose $f(q) = q$ and $s = 2$; then choosing $r_1 = r_2 = [q^{1/2}]$, we see that $r_1 r_2 \sim f(q)$.

Theorem 4.2.1 Let $F_1, \dots, F_s$ be algebraically independent entire functions and let $\{(a_\mu, n_\mu)\}$ be a sequence of pairs with $a_\mu \in \mathbb{C}$ and $n_\mu \in \mathbb{N}$ with $\{n_\mu\}$ a non-decreasing sequence. Suppose the hypotheses H1–H7 are satisfied. Let $q \ge 1$ and $r_1, \dots, r_s$ be natural numbers such that $r_1 \cdots r_s \sim \nu(q)(\nu(q) + 1) N(q)$. Then there exists $Q > q$ such that for any positive number $R$, we have

\[ \Bigl( \prod_{t=1}^{s} (M_1^{(t)}(Q))^{r_t} \Bigr)^{8\nu(Q)} \prod_{t=1}^{s} (M^{(t)}(R))^{r_t} \Bigl( \frac{8D(Q)}{R} \Bigr)^{N(Q-1)} \ge 1. \tag{4.4} \]
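The asymptotic notation r₁ ⋯ r_s ∼ f(q) used in the hypothesis of the theorem can be illustrated with the example f(q) = q, s = 2 given above. The sketch below (hypothetical function names) takes r₁ = r₂ = [q^{1/2}] and computes the ratio r₁r₂/q, which lies between 1 − 2/√q and 1 and therefore tends to 1.

```python
from math import isqrt, sqrt


def balanced_pair(q: int):
    """r1 = r2 = [q^(1/2)], computed exactly with the integer square root."""
    r = isqrt(q)
    return r, r


def ratio(q: int) -> float:
    """The quotient r1*r2 / f(q) for f(q) = q."""
    r1, r2 = balanced_pair(q)
    return r1 * r2 / q


if __name__ == "__main__":
    for q in (10 ** 2, 10 ** 4 + 7, 10 ** 6 + 13):
        print(q, ratio(q), 1 - 2 / sqrt(q))
```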

Proof Note that (4.4) is trivially true if $R < 2D(Q)$. So we assume from now on that $R \ge 2D(Q)$.

Auxiliary Polynomial

Take the entire function

\[ R(z) = \sum_{k_1=0}^{r_1-1} \cdots \sum_{k_s=0}^{r_s-1} c_{k_1, \dots, k_s}\, (F_1(z))^{k_1} \cdots (F_s(z))^{k_s} \]

where the $c_{k_1, \dots, k_s}$ are rational integers to be chosen soon. Consider

\[ R(a_\mu) = 0 \quad \text{for all } a_\mu \text{ with } n_\mu \le q. \tag{4.5} \]

This is a system of $N(q)$ linear equations in $q_1 = r_1 \cdots r_s$ unknowns $c_{k_1, \dots, k_s}$. Let $d$ be a denominator of the coefficients of this linear system. Then

\[ d \le (M_1^{(1)}(q))^{r_1} \cdots (M_1^{(s)}(q))^{r_s} =: J_1(q), \ \text{say}. \tag{4.6} \]

Upper Bound for $|c_{k_1, \dots, k_s}|$

The size of the algebraic integer coefficients so obtained is bounded by $J_1(q)^{2}$. Hence by Lemma 3.1.3 (ii), we get rational integers $c_{k_1, \dots, k_s}$, not all zero, with

\[ |c_{k_1, \dots, k_s}| < 1 + \bigl( 2 r_1 \cdots r_s\, J_1(q)^{2} \bigr)^{\frac{N(q)\nu(q)(\nu(q)+1)}{2q_1 - N(q)\nu(q)(\nu(q)+1)}} \]

satisfying (4.5). Since by assumption $q_1 \sim \nu(q)(\nu(q) + 1) N(q)$, we find that given any $\epsilon > 0$ there exists $q_0(\epsilon)$ such that for $q > q_0(\epsilon)$, the exponent above satisfies

\[ \frac{N(q)\nu(q)(\nu(q) + 1)}{2q_1 - N(q)\nu(q)(\nu(q) + 1)} < \frac{1}{1 - \epsilon}. \]

Using the inequality $2 r_1 \cdots r_s \le 2^{r_1} \cdots 2^{r_s} \le J_1(q)$, we therefore get that

\[ |c_{k_1, \dots, k_s}| < 1 + (J_1(q))^{\frac{3}{1-\epsilon}} \quad \text{for } q \ge q_0(\epsilon). \]

$R(a_\mu) \ne 0$ for Some $a_\mu$

By the algebraic independence of $F_1, \dots, F_s$, we see that $R(z) \not\equiv 0$. Further, it is an entire function of order $\le \rho$. Suppose $R(a_\mu) = 0$ for all $a_\mu$. Then

\[ \liminf \frac{\log N(Q)}{\log D(Q)} \le \rho \]

which contradicts H4. Hence there exist points $a_j$ in $\{a_\mu\}$ for which $R(a_j) \ne 0$. Choose such an $a_j$ with least possible $n_j$, say $n_j = Q$. Then $Q > q$, $\gamma = R(a_j) \ne 0$ and $R(a_\ell) = 0$ for all $\ell$ with $n_\ell < Q$.

Lower Bound for |N(γ)|

By (4.6), there exists a natural number $d \le J_1(Q)$ such that $d\gamma$ is an algebraic integer of degree $\le \nu(Q)$ and hence

\[ |N(\gamma)| \ge J_1(Q)^{-\nu(Q)}. \tag{4.7} \]

Upper Bound for |N(γ)| Using Cauchy Integral Formula

Note that

\[ s(\gamma) \le r_1 \cdots r_s \bigl( 1 + J_1(q)^{\frac{3}{1-\epsilon}} \bigr) J_1(Q) \le (J_1(Q))^{2 + \frac{3}{1-\epsilon}}. \tag{4.8} \]

Integrating on $C : |z| = R_0$ with $R_0 \ge 2D(Q)$, we get

\[ \gamma = \frac{1}{2\pi i} \int_{C} R(z) \prod_{n_\ell < Q} \frac{a_j - a_\ell}{z - a_\ell}\; \frac{dz}{z - a_j}. \]

… $/\log D(Q) > \rho$.

Let $\nu(Q) = \nu$. Note that $\nu$ is the degree of $\mathbb{Q}(2^{\pi}, 2^{\pi^{2}}, 2^{\pi^{3}})$ over $\mathbb{Q}$, which is independent of $Q$. Further, $M_1^{(1)}(Q) \le c_{4.12}^{\,Q}$ and $M_1^{(2)}(Q) \le c_{4.13}^{\,Q}$. We need to satisfy

\[ r_1 r_2 \sim \nu(\nu + 1) q^{3}, \]

which is possible by taking

\[ r_1 = r_2 = \bigl[ (\nu(\nu + 1) q^{3})^{1/2} \bigr] < (\nu(\nu + 1) Q^{3})^{1/2}. \]

Now take $R = 16 c_{4.11} Q$. Then the inequality in Theorem 4.2.1 implies that

\[ \Bigl( (c_{4.12}\, c_{4.13})^{(\nu(\nu+1))^{1/2} Q^{5/2}} \Bigr)^{8\nu} (c_{4.8}\, c_{4.9})^{16 c_{4.11} Q^{5/2} (\nu(\nu+1))^{1/2}}\; 2^{-Q^{3}} \ge 1. \]

This is not possible for sufficiently large Q.
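Taking logarithms makes the final contradiction transparent: the positive terms grow like Q^{5/2} while the negative term grows like Q³. The sketch below evaluates the logarithm of the left-hand side for hypothetical numerical values of the constants (the actual values of c4.8, c4.9, c4.11, c4.12, c4.13 are not specified in the text, and the function name is illustrative).

```python
from math import log, sqrt


def log_lhs(Q: float, nu: int, c8c9: float, c11: float, c12c13: float) -> float:
    """Natural logarithm of the left-hand side of the final inequality.

    c8c9 stands for the product c_{4.8}*c_{4.9}, c12c13 for c_{4.12}*c_{4.13};
    all constants are hypothetical placeholders.
    """
    root = sqrt(nu * (nu + 1))
    growth = (8 * nu * root * log(c12c13) + 16 * c11 * root * log(c8c9)) * Q ** 2.5
    return growth - Q ** 3 * log(2)


if __name__ == "__main__":
    for Q in (10, 100, 1000, 10 ** 4):
        print(Q, log_lhs(Q, nu=1, c8c9=2.0, c11=1.0, c12c13=2.0))
```

With these placeholder constants the logarithm is positive for small Q and becomes negative once Q is large, at which point the inequality "≥ 1" fails.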

4.5 Second Extension

Theorem 4.5.1 Let $i = 1, 2$. Suppose each $f_i$ is a meromorphic function which is a quotient of entire functions of order $\le \mu$ and satisfies a differential equation of the form

\[ f_i^{(k_i)}(z) = \sum_{\nu_1=0}^{n_1} \cdots \sum_{\nu_{k_i}=0}^{n_{k_i}} d_{\nu_1, \dots, \nu_{k_i}}\, (f_i^{(0)}(z))^{\nu_1} \cdots (f_i^{(k_i-1)}(z))^{\nu_{k_i}} \]

where all the coefficients $d_{\nu_1, \dots, \nu_{k_i}}$ belong to a number field $K$ of degree $\nu$. Suppose there exists an infinite sequence of distinct points $\{z_\lambda\}$, $\lambda \ge 0$, such that

\[ f_i^{(\ell)}(z_\lambda) \in K \quad \text{for } 0 \le \ell \le k_i - 1. \]

Then $f_1$ and $f_2$ are algebraically dependent.

Proof The proof is similar to the proofs of Theorems 3.0.1 and 4.2.1.

Auxiliary Polynomial

Suppose $f_1$ and $f_2$ are algebraically independent. Let us consider a function

\[ \Phi(z) = \sum_{\rho_1=0}^{r} \sum_{\rho_2=0}^{r} C_{\rho_1, \rho_2}\, (f_1(z))^{\rho_1} (f_2(z))^{\rho_2} \]

with

\[ \Phi^{(s)}(z_\lambda) = 0 \quad \text{for } 0 \le s \le t - 1,\ 0 \le \lambda \le m - 1. \]

Here $m$ and $t$ are free parameters and $r = [mt\nu(\nu + 1)]$.

Coefficients $C_{\rho_1, \rho_2}$ are in K and Not All Zero

Note that there are $mt$ equations in $(r + 1)^{2}$ unknowns. By Lemma 4.1.1, corresponding to each $z_\lambda$ there exist positive integers $b_\lambda$ and $c_\lambda$ such that

\[ b_\lambda^{\,t+2r}\, \Phi^{(s)}(z_\lambda) = 0 \quad \text{for } 0 \le s \le t - 1,\ 0 \le \lambda \le m - 1 \]

have algebraic integer coefficients, and the size of the coefficients is bounded by

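\[ (2r)^{t} (t + 1)^{t} (b_\lambda c_\lambda)^{t+2r} \le t^{\delta_1 t}\, \gamma_1^{\,t} \]

as seen in (4.3). Here and in the sequel, $\gamma_1, \gamma_2, \dots$ denote numbers depending on $m$, while $\delta_1, \delta_2, \dots$ denote numbers depending on $\nu$ but independent of $m$ and $t$. A priori, these numbers depend on $K$. By Lemma 3.1.3, there exist rational integers $C_{\rho_1, \rho_2}$, not all zero, satisfying the given system of linear equations and such that

\[ |C_{\rho_1, \rho_2}| \le 1 + \bigl( t^{\delta_1 t}\, \gamma_1^{\,t} \bigr)^{\frac{mt\nu(\nu+1)}{(r+1)^{2} - mt\nu(\nu+1)}} \le t^{\delta_2 t}\, \gamma_2^{\,t}. \tag{4.10} \]

By the construction above, $\Phi(z) \not\equiv 0$. So it has zeros of finite order at $z_\lambda$, $0 \le \lambda \le m - 1$. Hence there exists $j \ge t$ with

\[ \Phi^{(\tau)}(z_\lambda) = 0 \quad \text{for } 0 \le \tau \le j - 1;\ 0 \le \lambda \le m - 1 \]

but

\[ \Phi^{(j)}(z_{\lambda_0}) \ne 0 \quad \text{for some } \lambda_0 \text{ with } 0 \le \lambda_0 \le m - 1. \]

We shall find lower and upper bounds for $|\Phi^{(j)}(z_{\lambda_0})|$.

Lower Bound for $|\Phi^{(j)}(z_{\lambda_0})|$

We know that

\[ b_{\lambda_0}^{\,j+2r}\, \Phi^{(j)}(z_{\lambda_0}) \]

is a non-zero algebraic integer. Hence

\[ \bigl| N\bigl( b_{\lambda_0}^{\,j+2r}\, \Phi^{(j)}(z_{\lambda_0}) \bigr) \bigr| \ge 1. \tag{4.11} \]

On the other hand, by (4.10) and Lemma 4.1.2 (ii), there exists a positive integer $c_{\lambda_0}$ such that

\[ s\bigl( \Phi^{(j)}(z_{\lambda_0}) \bigr) \le (r + 1)^{2}\, \gamma_2^{\,t}\, t^{\delta_2 t}\, (2r)^{j} (j + 1)^{j}\, c_{\lambda_0}^{\,j+2r} \le j^{\delta_3 j}\, \gamma_3^{\,j}. \]

Using this in (4.11), we get

\[ |\Phi^{(j)}(z_{\lambda_0})| > j^{-\delta_4 \nu j}\, \gamma_4^{\,j} = j^{-\delta_5 j}\, \gamma_4^{\,j}. \tag{4.12} \]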

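Upper Bound for $|\Phi^{(j)}(z_{\lambda_0})|$ Using Cauchy Integral Formula

Write $f_i = h_i/g_i$, $i = 1, 2$, with $h_i$, $g_i$ entire functions of order $\le \mu$. Consider $G = (g_1 g_2)^{r}$. By the hypothesis, $g_1$ and $g_2$ do not vanish at $z_0, \dots, z_{m-1}$. Hence

\[ \Bigl| \frac{1}{G(z_{\lambda_0})} \Bigr| \le \gamma_5^{\,2r} \le \gamma_6^{\,t}. \]

Consider the integral

\[ I = \frac{1}{2\pi i} \int_{T} \frac{\Phi(z) G(z)\, dz}{(z - z_{\lambda_0}) \prod_{\lambda=0}^{m-1} (z - z_\lambda)^{j}} \]

where $T : |z| = j^{\delta}$ with $\delta$ to be chosen later. Note that since $j \ge t$, by taking $t$ sufficiently large we may assume that $\frac{1}{2}T : |z| = \frac{1}{2} j^{\delta}$ contains $z_0, \dots, z_{m-1}$. By the Cauchy integral formula,

\[ I = \frac{G(z_{\lambda_0})}{\prod_{\lambda=0, \lambda \ne \lambda_0}^{m-1} (z_{\lambda_0} - z_\lambda)^{j}}\; \frac{1}{j!}\, \Phi^{(j)}(z_{\lambda_0}). \]

Hence

\[ \Phi^{(j)}(z_{\lambda_0}) = j!\; \frac{\prod_{\lambda=0, \lambda \ne \lambda_0}^{m-1} (z_{\lambda_0} - z_\lambda)^{j}}{G(z_{\lambda_0})}\; I. \]

Note that

\[ \Bigl| \prod_{\lambda=0, \lambda \ne \lambda_0}^{m-1} (z_{\lambda_0} - z_\lambda)^{j} \Bigr| < \gamma_7^{\,j}. \]

We now use the fact that $f_1$, $f_2$ are meromorphic functions of order $\le \mu$. Hence for any given $\epsilon > 0$, there exists $j_0(\epsilon)$ such that for $j \ge j_0(\epsilon)$ we have

\[ \max_{|z| = j^{\delta}} |\Phi(z) G(z)| < (r + 1)^{2}\, \gamma_2^{\,t}\, t^{\delta_2 t}\, e^{2r j^{\delta(\mu+\epsilon)}} \]

by using also (4.10). Choose $\delta = 1/(2\mu + 1)$. Then for $\epsilon < 1/2$, $\delta(\mu + \epsilon) < 1/2$. Hence

\[ \max_{|z| = j^{\delta}} |\Phi(z) G(z)| < \gamma_8^{\,j}\, j^{\delta_6 j} \]

and

\[ \min_{|z| = j^{\delta}} \Bigl| (z - z_{\lambda_0}) \prod_{\lambda=0}^{m-1} (z - z_\lambda)^{j} \Bigr| \ge 2^{-(mj+1)}\, j^{\delta(mj+1)}. \]

Hence by all the above estimates, we have

\[ |\Phi^{(j)}(z_{\lambda_0})| < j!\, \gamma_6^{\,t}\, \gamma_7^{\,j}\, \gamma_8^{\,j}\, j^{\delta_6 j}\, 2^{mj+1}\, j^{-\delta(mj+1)}\, j^{\delta} < \gamma_9^{\,j}\, j^{(\delta_6 + 1 - m\delta) j} \]

for $j \ge j_0$ and hence for $t \ge t_0$, where $t_0$ is sufficiently large. Comparing this upper bound for $|\Phi^{(j)}(z_{\lambda_0})|$ with the lower bound in (4.12), we get

\[ j^{\delta_5 + \delta_6 + 1 - m\delta} > \frac{\gamma_4}{\gamma_9}. \]

This is not valid for large $m$. This completes the proof of Theorem 4.5.1.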

4.6 Some Consequences of Theorem 4.5.1 Derivation of Gelfond–Schneider Theorem In Theorem 4.5.1 take f 1 (z) = e z , f 2 (z) = eβz with β algebraic irrational; z λ = λ log α, λ = 0, 1, . . . with α algebraic having log α = 0. If αβ were algebraic, then by Theorem 4.5.1, f 1 and f 2 are algebraically dependent which contradicts Lemma 1.1.2. Another Consequence of Theorem 4.5.1 For basic definition and properties of functions involved for the rest of the discussion, we refer to Apostol [3]. Theorem 4.6.1 Let j (z) be the modular invariant function associated with ℘ (z). If τ is algebraic complex number with its imaginary part positive but not imaginary quadratic, then j (τ ) is transcendental. It is well known that j (τ ) is algebraic if τ is imaginary quadratic (see [4]). We first prove a lemma on Weierstrass ℘-function which itself is interesting. Lemma 4.6.2 Let g2 , g3 be invariants of ℘ (z) and g2∗ , g3∗ be the invariants of ℘ ∗ (z). Let ℘ (z) and ℘ ∗ (βz) be algebraically independent. Then for any α ∈ C which is not a pole of ℘ (z) and ℘ ∗ (βz), one at least of the seven numbers g2 , g3 , g2∗ , g3∗ , β, ℘ (α), ℘ ∗ (βα)

(4.13)

is transcendental. Proof Suppose all the seven numbers in (4.13) are algebraic. We take z λ = λα, λ = 0, 1, 2, . . . except those non-negative integers for which λα is a pole of either ℘ (z) or ℘ ∗ (βz). We know ℘ 2 = 4℘ 3 − g2 ℘ − g3 ; ℘ = 6℘ 2 − g2 /2.


Hence $\wp^{(i)}(\alpha)$, $i \ge 0$, are all algebraic. Similarly $(\wp^*)^{(i)}(\beta\alpha)$, $i \ge 0$, are all algebraic. We also know

$$\wp(z + z^*) = -\wp(z) - \wp(z^*) + \frac{1}{4}\left(\frac{\wp'(z) - \wp'(z^*)}{\wp(z) - \wp(z^*)}\right)^2$$

for $z \not\equiv z^* \pmod{(\omega_1, \omega_2)}$, where $\omega_1$ and $\omega_2$ are periods of $\wp$, and

$$\wp(2z) = -2\wp(z) + \frac{1}{4}\left(\frac{\wp''(z)}{\wp'(z)}\right)^2.$$

Let $\lambda = 1, 2, \ldots$. If $\lambda\alpha$ is not a pole of $\wp(z)$, then $\wp(\lambda\alpha)$ is expressible in terms of $\wp^{(i)}(\alpha)$, by applying L'Hospital's rule if necessary. Thus the $\wp(\lambda\alpha)$ are all algebraic. Similarly $\wp^*(\lambda\beta\alpha)$ is expressible in terms of $(\wp^*)^{(i)}(\beta\alpha)$ and so the $\wp^*(\lambda\beta\alpha)$ are all algebraic. By assumption, all these lie in the field generated by the seven algebraic numbers in (4.13) and $\wp'(\alpha)$, $(\wp^*)'(\beta\alpha)$. Hence $\wp(z)$ and $\wp^*(\beta z)$ are algebraically dependent by Theorem 4.5.1, a contradiction to Lemma 1.1.3.

Proof of Theorem 4.6.1 Assume that $j(\tau)$ is algebraic. Write $\tau = \omega_2/\omega_1$. We have $j(\tau) = g_2^3/(g_2^3 - 27g_3^2)$. If $j(\tau) = 0$, then $g_2 = 0$ and $\omega_1, \omega_2$ may be normalised to give $g_3 = 1$. If $j(\tau) \neq 0$, then $g_2 \neq 0$ and we normalise $g_2$ to be 1; then $g_3$ has to be algebraic since $j(\tau)$ is algebraic. Thus $\tau, g_2, g_3$ are all algebraic. Since

$$(\wp'(z))^2 = 4(\wp(z) - \wp(\omega_1/2))(\wp(z) - \wp(\omega_2/2))(\wp(z) - \wp((\omega_1 + \omega_2)/2)),$$

the numbers $\wp(\omega_1/2)$, $\wp(\omega_2/2)$ and $\wp((\omega_1 + \omega_2)/2)$ are all algebraic. Next set $\omega_1^* = \omega_1\tau$, $\omega_2^* = \omega_2\tau$. Let $\wp^*(z)$ be the corresponding Weierstrass elliptic function. Then $g_2^*, g_3^*, \wp^*(\omega_1^*/2), \wp^*(\omega_2^*/2), \wp^*((\omega_1^* + \omega_2^*)/2)$ are all algebraic. Further since

$$\wp^*\!\left(\frac{\omega_2^*}{2}\right) = \tau^{-2}\,\wp\!\left(\frac{\omega_2}{2}\right), \qquad \wp^*\!\left(\frac{\omega_2}{2}\right) = \wp^*\!\left(\frac{\omega_1^*}{2}\right),$$

we find that $\wp^*(\omega_2/2)$ is algebraic. Thus $g_2, g_3, g_2^*, g_3^*, 1, \wp(\omega_2/2), \wp^*(\omega_2/2)$ are algebraic and hence $\wp(z)$, $\wp^*(z)$ are algebraically dependent by Lemma 4.6.2. So their periods are commensurable by Lemma 1.1.3, i.e. there exist non-zero integers $n_1, n_2$ such that

$$n_1\omega_1\tau = a\omega_1 + b\omega_2; \qquad n_2\omega_2\tau = c\omega_1 + d\omega_2$$

with $a, b, c, d$ integers and $b, d$ non-zero. Hence

$$\tau\,\frac{n_2}{n_1} = \frac{c + d\tau}{a + b\tau}$$

leading to a quadratic equation for τ , a contradiction.

Exercise

1. Let $\alpha, \beta, \gamma, \delta$ be algebraic numbers such that $\log\alpha$, $\log\beta$, $\log\delta$ are $\mathbb{Q}$-linearly independent and $\log\gamma/\log\delta \notin \mathbb{Q}$. Show that at least one of $\alpha^{\log\gamma/\log\delta}$ and $\beta^{\log\gamma/\log\delta}$ is transcendental.

Notes

By using the proof of Theorem 4.4.1, one can show the following result, known as the six exponentials theorem. Let $\{a_1, a_2\}$ and $\{b_1, b_2, b_3\}$ be two sets of $\mathbb{Q}$-linearly independent complex numbers. Then at least one of the six numbers $\exp(a_i b_j)$, $1 \le i \le 2$, $1 \le j \le 3$, is transcendental. The corresponding result for four exponentials still remains unsolved. A particular case of the four exponentials result was conjectured in 1944 by Alaoglu and Erdős [5]: for any distinct prime numbers $p$ and $q$, if $p^x$ and $q^x$ are integers for some real number $x$, then $x$ must be an integer. This case also remains unsolved. For related details, see [6]. A good source for various transcendence and algebraic independence results on the values of the exponential function is the book of Waldschmidt [7].

A simultaneous approximation measure for the three numbers $2^{\pi}$, $2^{\pi^2}$ and $2^{\pi^3}$ was obtained by Shorey in 1974; see [8]. This was generalised to $2^{\pi^k}$ for $k = 1, 2, \ldots$ by Srinivasan; see [9, 10].

In 1962, Lang generalised Schneider's method, which is similar to Theorem 4.5.1, as follows. Let $K$ be a number field. Let $f_1, f_2, \ldots, f_n$ be meromorphic functions of order $\le \rho$ such that at least two of these functions are algebraically independent. Suppose that the derivative $D = d/dz$ as a map takes the ring $K[f_1, f_2, \ldots, f_n]$ into itself. If $\alpha_1, \alpha_2, \ldots, \alpha_m$ are distinct elements of $\mathbb{C}$ not among the poles of the $f_i$ such that $f_i(\alpha_j) \in K$ for all $1 \le i \le n$ and $1 \le j \le m$, then $m \le 4\rho[K : \mathbb{Q}]$. For its proof and some consequences, we refer to [4]. The above result was further generalised by Bombieri; see [11].


References

1. K. Ramachandra, Contributions to the theory of transcendental numbers I. Acta Arithmetica 14, 65–72 (1968); II 14, 73–88 (1968)
2. K. Ramachandra, Lectures on Transcendental Numbers (The Ramanujan Institute, University of Madras, Chennai, 1969)
3. T.M. Apostol, Modular Functions and Dirichlet Series in Number Theory, 2nd edn. Graduate Texts in Mathematics, vol. 41 (Springer, New York, 1990)
4. M. Ram Murty, P. Rath, Transcendental Numbers (Springer, Berlin, 2014), 217 pp
5. L. Alaoglu, P. Erdős, On highly composite and similar numbers. Trans. Amer. Math. Soc. 56, 448–469 (1944)
6. K. Senthil Kumar, R. Thangadurai, V. Kumar, On a problem of Alaoglu and Erdős. Resonance 23(7), 749–758 (2018)
7. M. Waldschmidt, Nombres Transcendants (Springer, Berlin, 1974)
8. T.N. Shorey, On the sum $\sum_{k=1}^{3} |2^{\pi^k} - \alpha_k|$, $\alpha_k$ algebraic numbers. J. Number Theory 6, 248–260 (1974)
9. S. Srinivasan, On algebraic approximation to $2^{\pi^k}$ ($k = 1, 2, 3, \ldots$), I. Indian J. Pure Appl. Math. 5, 513–523 (1974)
10. S. Srinivasan, On algebraic approximation to $2^{\pi^k}$ ($k = 1, 2, 3, \ldots$), II. J. Indian Math. Soc. (N.S.) 43 (1979); (1–4), 53–60 (1980)
11. Yu.V. Nesterenko, Algebraic Independence, vol. 14 (Tata Institute of Fundamental Research Publications, Mumbai, 2008), 157 pp

Chapter 5

Diophantine Approximation and Transcendence

Come friends, it’s not too late to seek a newer world —Tennyson

Diophantine approximation deals with the solubility of inequalities in integers. Dirichlet obtained one of the first results of this type in 1842, based on the pigeonhole principle. He showed that when $\alpha$ is irrational, there exist infinitely many rationals $p/q$ ($q > 0$) such that

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2}.$$

In 1844, Liouville proved that for any algebraic number $\alpha$ of degree $n \ge 2$, there exists a computable number $c(\alpha) > 0$ such that

$$\left|\alpha - \frac{p}{q}\right| > \frac{c(\alpha)}{q^{\kappa}}$$

with $\kappa = n$. This led him to construct the first examples of transcendental numbers. When $\alpha$ is a quadratic irrational, then $\kappa = 2$, and this cannot be improved by the first inequality of Dirichlet above. Thus in Liouville's result $\kappa = 2 + \varepsilon$, $\varepsilon > 0$, is essentially the best exponent to be expected. In 1909, the Norwegian mathematician Thue used approximation techniques in an ingenious way to make a major advance in solving certain equations. Thus he could show that the equation

$$F(x, y) = m$$

© Springer Nature Singapore Pte Ltd. 2020 S. Natarajan and R. Thangadurai, Pillars of Transcendental Number Theory, https://doi.org/10.1007/978-981-15-4155-1_5


where $F$ is an irreducible binary form with integral coefficients of degree $\ge 3$, possesses only finitely many solutions in integers $x$ and $y$. For this, he realised that the exponent $\kappa$ should be lowered. He showed that for any given $\varepsilon > 0$, one can take $\kappa = n/2 + 1 + \varepsilon$. Sections 5.1 and 5.2 describe the results of Dirichlet, Liouville and Thue. Siegel improved Thue's result by showing that

$$\kappa = \min_{1 \le s \le n-1,\ s \in \mathbb{Z}} \left(\frac{n}{s+1} + s\right) + \varepsilon. \tag{5.1}$$

In particular, $\kappa$ can be taken as $2\sqrt{n} + \varepsilon$. The next improvement was by Dyson in 1947 with $\kappa = \sqrt{2n} + \varepsilon$. In 1948, Gelfond obtained the same value for $\kappa$ as a corollary of a more general theorem. Finally in 1955, Roth proved the best possible result of $\kappa = 2 + \varepsilon$. In Sect. 5.3, we give the proof of Siegel's result as given in Mordell [1] so that the reader can compare and contrast it with the proof of Thue's theorem.
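To get a feel for how the successive exponents compare, the following minimal Python sketch tabulates them for a few degrees $n$, ignoring the $\varepsilon$ in each case. The function names are illustrative and not taken from the text; the Siegel value is computed directly from the minimum in (5.1), and $n \ge 2$ is assumed.

```python
import math


def thue_exponent(n: int) -> float:
    """Thue's exponent n/2 + 1 (the epsilon is omitted)."""
    return n / 2 + 1


def siegel_exponent(n: int) -> float:
    """Minimum of n/(s+1) + s over integers 1 <= s <= n-1, as in (5.1).

    Assumes n >= 2 so that the range of s is non-empty.
    """
    return min(n / (s + 1) + s for s in range(1, n))


def dyson_exponent(n: int) -> float:
    """Dyson's exponent sqrt(2n) (the epsilon is omitted)."""
    return math.sqrt(2 * n)


def exponent_table(degrees):
    """Rows (n, Liouville, Thue, Siegel, Dyson, Roth) for each degree n."""
    return [
        (n, float(n), thue_exponent(n), siegel_exponent(n), dyson_exponent(n), 2.0)
        for n in degrees
    ]


if __name__ == "__main__":
    print("n   Liouville   Thue  Siegel   Dyson  Roth")
    for row in exponent_table([3, 5, 10, 50, 100]):
        print("{:<4d}{:>9.3f}{:>7.3f}{:>8.3f}{:>8.3f}{:>6.1f}".format(*row))
```

Taking $s = 1$ in (5.1) recovers Thue's value $n/2 + 1$, so the Siegel column is never larger than the Thue column, and it stays below $2\sqrt{n}$.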

5.1 Approximation Theorem of Dirichlet

Theorem 5.1.1 Let $\alpha$ and $Q$ be real numbers with $Q > 1$. Then there exist integers $p, q$ such that $1 \le q < Q$ and $|\alpha q - p| \le 1/Q$.

Proof First let us assume that $Q$ is an integer. Consider the following $Q + 1$ numbers

$$0,\ 1,\ \{\alpha\},\ \{2\alpha\},\ \ldots,\ \{(Q-1)\alpha\}$$

where $\{x\}$ means the fractional part of the real number $x$. They lie in the unit interval $[0, 1]$. Divide the unit interval into $Q$ subintervals

$$\frac{u}{Q} \le x < \frac{u+1}{Q}, \quad u \in \{0, 1, \ldots, Q-1\}$$

with $<$ replaced by $\le$ if $u = Q - 1$. Since there are only $Q$ such intervals, at least one such subinterval contains at least two of the $Q + 1$ numbers listed above. Hence there are integers $r_1, r_2, s_1, s_2$ with $0 \le r_i < Q$, $i = 1, 2$ and $r_1 \neq r_2$ such that

$$|(r_1\alpha - s_1) - (r_2\alpha - s_2)| \le 1/Q.$$

Taking $q = |r_1 - r_2|$, $p = s_1 - s_2$ or $p = s_2 - s_1$ according as $r_1 > r_2$ or $r_1 < r_2$, respectively, we get $1 \le q < Q$ and $|q\alpha - p| \le 1/Q$.
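The pigeonhole argument for integer $Q$ is constructive, and can be sketched in Python as follows. This is only an illustration under stated assumptions: `dirichlet_pair` is a hypothetical helper name, $\alpha$ is given as a floating-point number, and the computation is therefore subject to rounding for very large $Q$.

```python
import math


def dirichlet_pair(alpha: float, Q: int):
    """Return integers (p, q) with 1 <= q < Q and |q*alpha - p| <= 1/Q.

    Follows the pigeonhole proof for an integer Q > 1: the Q + 1 numbers
    0, 1, {alpha}, ..., {(Q-1)alpha} are placed into Q subintervals of [0, 1].
    """
    if Q != int(Q) or Q < 2:
        raise ValueError("Q must be an integer greater than 1")
    Q = int(Q)
    # Each point is stored as (value, r, s) with value = r*alpha - s.
    points = [(0.0, 0, 0), (1.0, 0, -1)]
    for r in range(1, Q):
        s = math.floor(r * alpha)
        points.append((r * alpha - s, r, s))
    seen = {}
    for value, r, s in points:
        # Index u of the subinterval [u/Q, (u+1)/Q); the value 1 goes to the last one.
        u = min(int(value * Q), Q - 1)
        if u in seen:
            _, r2, s2 = seen[u]
            q = abs(r - r2)
            p = (s - s2) if r > r2 else (s2 - s)
            return p, q
        seen[u] = (value, r, s)
    raise AssertionError("unreachable: Q + 1 points cannot avoid a collision in Q boxes")


if __name__ == "__main__":
    for alpha, Q in [(math.sqrt(2), 10), (math.pi, 8), (math.e, 100)]:
        p, q = dirichlet_pair(alpha, Q)
        print(alpha, Q, (p, q), abs(q * alpha - p))
```

The two points $0$ and $1$ fall in different subintervals whenever $Q \ge 2$, so any collision involves two points with $r_1 \neq r_2$, exactly as in the proof.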


Suppose $Q$ is not an integer. Then let $Q' = [Q] + 1 > Q$ and apply the above result with $Q$ replaced by $Q'$. Then $1 \le q < Q'$ implies that $1 \le q \le [Q]$, and hence $1 \le q < Q$. Further $|q\alpha - p| \le 1/Q'$ implies $|q\alpha - p| < 1/Q$ since $Q' > Q$.

It follows from the above theorem that

$$\left|\alpha - \frac{p}{q}\right| \le \frac{1}{Qq} < \frac{1}{q^2}.$$

In fact there exist infinitely many coprime integers $p, q$ with this property if $\alpha$ is irrational, as shown in the corollary below.

Corollary 5.1.2 Suppose $\alpha$ is irrational. Then there exist infinitely many pairs $p, q$ of relatively prime integers with

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2}.$$

Proof Suppose $p = dp'$, $q = dq'$, with $\gcd(p', q') = 1$. Then

$$\left|\alpha - \frac{p'}{q'}\right| = \left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2} \le \frac{1}{q'^2}.$$

Hence in Theorem 5.1.1 we may take $p$ and $q$ as coprime. Since $\alpha$ is irrational, $q\alpha - p$ is never zero. Hence for any given $p, q$, setting $Q_0 = |q\alpha - p|^{-1}$, the inequality $|q\alpha - p| < 1/Q$ can be satisfied only when $Q \le Q_0$. Hence as $Q \to \infty$ there will be infinitely many distinct pairs $p, q$ with $\gcd(p, q) = 1$ satisfying the inequality in Theorem 5.1.1.

Remark The above corollary is not true if $\alpha$ is rational. For, let $\alpha = u/v$; if $\alpha \neq p/q$, then

$$\left|\alpha - \frac{p}{q}\right| = \left|\frac{u}{v} - \frac{p}{q}\right| = \frac{|qu - pv|}{vq} \ge \frac{1}{vq}.$$

Hence the inequality in Corollary 5.1.2 is satisfied only if $q < v$. Thus there are only finitely many $q$ values, and since $|p| < (1 + |\alpha|)|q|$ there are only finitely many $p$ values satisfying the inequality in the corollary.

In view of the corollary above, we define an irrational number $\alpha$ as badly approximable if there is a constant $c = c(\alpha) > 0$ such that

$$\left|\alpha - \frac{p}{q}\right| > \frac{c}{q^2}$$


for every rational $p/q$.

It can be shown that $c$ must satisfy $0 < c < 1/\sqrt{5}$. This is due to Hurwitz. Further if $\alpha$ is a quadratic irrational, then by a well-known result of Legendre we know that

$$\left|\alpha - \frac{p_n}{q_n}\right| < \frac{1}{q_n^2}$$

for all $n \ge 0$, where $p_n/q_n$ is the $n$th convergent in the continued fraction expansion of $\alpha$. Using the best approximation properties of these convergents, one can show that every quadratic irrational is badly approximable. We state here without proof a beautiful result of Khintchine from 1926 which forms the basis for metrical transcendence theory.

Theorem 5.1.3 Suppose $\psi(q)$ is a positive, non-increasing function defined for $q = 1, 2, \ldots$. Consider the inequality

$$\left|\alpha - \frac{p}{q}\right| < \frac{\psi(q)}{q} \tag{5.2}$$

and the sum

$$\sum_{q=1}^{\infty} \psi(q).$$

If the sum is convergent, then (5.2) has only finitely many solutions in rationals $p/q$ with $q > 0$ for almost all $\alpha$ (in the sense of Lebesgue measure). If the sum is divergent, then (5.2) has infinitely many solutions for almost all $\alpha$.

As a result of the above theorem, we find that for every $\delta > 0$, the inequality

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^{2+\delta}} \tag{5.3}$$

has only finitely many solutions for almost all $\alpha$, but

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2 \log q} \tag{5.4}$$

has infinitely many solutions for almost all $\alpha$. Since quadratic irrationals are badly approximable, they behave like almost every algebraic number with respect to (5.3), but not with respect to (5.4).

Some Reductions

In the problem of approximating a number $\alpha \in \mathbb{C}$ by rationals, note that if there are only finitely many rationals satisfying the inequality, say

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^{\kappa}}, \quad \kappa \ge 1, \tag{5.5}$$


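then there exists a number $C(\alpha, \kappa)$ such that

$$\left|\alpha - \frac{p}{q}\right| > \frac{C(\alpha, \kappa)}{q^{\kappa}} \tag{5.6}$$

for all $p/q$. Thus in order to prove a result of the type (5.6), we need to prove that (5.5) has only finitely many solutions. Note that when $\alpha$ is a rational number, say $s/t$, then obviously we have

$$\left|\alpha - \frac{p}{q}\right| \ge \frac{1}{qt}$$

for any $p/q \neq \alpha$. Hence (5.6) holds. So we will consider only $\alpha \notin \mathbb{Q}$. While dealing with the approximation problem, we observe that the following simplifications can be made.

(a) We may assume that $\alpha \in \mathbb{R}$. For, otherwise, suppose $\alpha = a + ib$ with $b \neq 0$. Then

$$\left|\alpha - \frac{p}{q}\right| \ge |b| \ge \frac{|b|}{q^{\kappa}}$$

for any $p/q$.

(b) By a similar argument as in (a), we may also assume that

$$\left|\alpha - \frac{p}{q}\right| \le 1$$

for any $p/q$. This implies that if $q$ is bounded, then $|p|$ is also bounded since $|p| \le |q|(1 + |\alpha|)$.

(c) Suppose $\alpha$ is an algebraic number. Then we may assume that $\alpha$ is an algebraic integer. For otherwise, let $d$ be the denominator of $\alpha$. Then

$$\left|\alpha - \frac{p}{q}\right| = \frac{1}{d}\left|d\alpha - \frac{dp}{q}\right|.$$

Hence if (5.6) holds for algebraic integers, then the above equality implies that

$$\left|\alpha - \frac{p}{q}\right| \ge \frac{C(\alpha, \kappa)}{dq^{\kappa}} = \frac{C'(\alpha, \kappa)}{q^{\kappa}}$$

where $C'(\alpha, \kappa) = C(\alpha, \kappa)/d$.

(d) Suppose (5.6) is true for all reduced rationals. Let $p = fp'$ and $q = fq'$ with $f \ge 1$ and $\gcd(p', q') = 1$. Then

$$\left|\alpha - \frac{p}{q}\right| = \left|\alpha - \frac{p'}{q'}\right| \ge \frac{C(\alpha, \kappa)}{q'^{\kappa}} = \frac{C(\alpha, \kappa) f^{\kappa}}{q^{\kappa}} \ge \frac{C(\alpha, \kappa)}{q^{\kappa}}.$$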


Thus it is enough to prove (5.6) for all reduced fractions p/q. (e) As observed earlier, in order to prove (5.6), we may assume that (5.5) has infinitely many solutions in reduced rationals p/q. We assume from now onwards that (a)–(e) hold.
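The elementary bound used for rational $\alpha = s/t$ in the reductions above, namely that $|\alpha - p/q|$ is at least $1/(qt)$ whenever $p/q \neq \alpha$, can be checked with exact rational arithmetic. The sketch below is illustrative only: `min_scaled_gap` is a hypothetical helper, and it searches just the few numerators $p$ nearest to $q\alpha$ for each denominator $q$.

```python
import math
from fractions import Fraction


def min_scaled_gap(alpha: Fraction, max_q: int) -> Fraction:
    """Smallest value of q * t * |alpha - p/q| over p/q != alpha with 1 <= q <= max_q.

    Here t is the denominator of alpha in lowest terms, so the scaled gap
    equals |s*q - p*t|, a positive integer.
    """
    t = alpha.denominator
    best = None
    for q in range(1, max_q + 1):
        nearest = math.floor(alpha * q)
        # Only numerators close to q*alpha can give a small gap.
        for p in (nearest - 1, nearest, nearest + 1, nearest + 2):
            candidate = Fraction(p, q)
            if candidate == alpha:
                continue
            gap = abs(alpha - candidate) * q * t
            if best is None or gap < best:
                best = gap
    return best


if __name__ == "__main__":
    print(min_scaled_gap(Fraction(3, 7), 200))
```

A returned value of at least 1 is exactly the statement that $|\alpha - p/q| \ge 1/(qt)$ on the searched range.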

5.2 Theorems of Liouville and Thue

Theorem 5.2.1 Let $\alpha$ be an algebraic number of degree $n \ge 2$. There exists a number $c_{5.1} = c_{5.1}(\alpha) > 0$ such that

$$\left|\alpha - \frac{p}{q}\right| > \frac{c_{5.1}}{q^n} \tag{5.7}$$

for any integers $p, q$ with $q > 0$.

Proof Let $f(X) = a_0X^n + \cdots + a_n$, $a_0 \neq 0$, be the minimal polynomial of $\alpha$. Then by the mean value theorem,

$$-f\!\left(\frac{p}{q}\right) = f(\alpha) - f\!\left(\frac{p}{q}\right) = f'(\psi)\left(\alpha - \frac{p}{q}\right) \tag{5.8}$$

where $\psi$ is some number between $\alpha$ and $p/q$. Note that

$$|f'(\psi)| \le nH(f)(1 + |\psi| + \cdots + |\psi^n|) \le c_{5.2}$$

since $|\psi| \le 1 + |\alpha|$ as $|\alpha - p/q| < 1$. Here $c_{5.2}$ denotes a number depending only on $\alpha$. Thus

$$\left|f\!\left(\frac{p}{q}\right)\right| < c_{5.2}\left|\alpha - \frac{p}{q}\right|.$$

Further, $f(p/q)$ is non-zero since $\alpha$ is of degree $\ge 2$, and

$$\left|f\!\left(\frac{p}{q}\right)\right| = \frac{|a_0p^n + a_1p^{n-1}q + \cdots + a_nq^n|}{q^n} \ge \frac{1}{q^n}. \tag{5.9}$$

Combining (5.8) and (5.9) we get

$$\left|\alpha - \frac{p}{q}\right| > \frac{\min(1, c_{5.2}^{-1})}{q^n}.$$

Theorem 5.2.1 enables one to construct transcendental numbers. For example,

$$L(\mu) = \sum_{j=0}^{\infty} \frac{1}{\mu^{j!}}$$

is transcendental for any integer $\mu > 1$. This is seen as follows. Suppose the number $L(\mu)$ is algebraic of degree $n \ge 1$. For any integer $m \ge 1$, writing

$$\frac{p_m}{q_m} = \sum_{j=0}^{m} \frac{1}{\mu^{j!}}$$

with $\gcd(p_m, q_m) = 1$, we see that $q_m = \mu^{m!}$. Then

$$\left|L(\mu) - \frac{p_m}{q_m}\right| = \sum_{j=m+1}^{\infty} \frac{1}{\mu^{j!}} < \frac{2}{q_m^{n+1}} \quad \text{for all } m \ge n.$$

Hence $L(\mu)$ cannot be algebraic of degree $n$, thus showing that $L(\mu)$ is transcendental.

In Theorem 5.2.1, if the exponent $n$ in (5.7) can be reduced, then one can show that certain equations have only finitely many solutions. In 1909, Thue was able to reduce the exponent from $n$ to $n/2 + 1 + \varepsilon$, $\varepsilon > 0$. This enabled him to show the following result. Let

$$F(X, Y) = a_0X^n + a_1X^{n-1}Y + \cdots + a_nY^n \in \mathbb{Z}[X, Y]$$

be an irreducible binary form of degree $n \ge 3$, $a_0 \neq 0$. Consider equations of the form

$$F(x, y) = k \tag{5.10}$$

in integers $x$ and $y$ for any fixed integer $k$. Then the theorem of Thue is as follows.

Theorem 5.2.2 Equation (5.10) has only finitely many solutions.

Such equations are called Thue equations. This theorem is a consequence of the following result.

Theorem 5.2.3 Let $\alpha$ be an algebraic number of degree $n \ge 3$ and $\varepsilon > 0$. Let $\kappa = \frac{n}{2} + 1 + \varepsilon$. Then there exists a number $c_{5.3} = c_{5.3}(\alpha, \varepsilon) > 0$ such that

$$\left|\alpha - \frac{p}{q}\right| > \frac{c_{5.3}}{q^{\kappa}}$$

for any $p, q \in \mathbb{Z}$, $q > 0$.

Derivation of Theorem 5.2.2 from Theorem 5.2.3 Denote by $c_{5.4}, c_{5.5}$ positive numbers depending only on $F$ and $k$. Write

$$F(X, Y) = a_0(X - \alpha_1Y) \cdots (X - \alpha_nY)$$

where $\alpha_1, \ldots, \alpha_n$ are algebraic numbers. They are distinct since $F$ is irreducible. Suppose (5.10) has a solution in integers $x$ and $y$. We assume that $|x| \le |y|$; the other case is similar. Let $\alpha \in \{\alpha_1, \ldots, \alpha_n\}$ be such that

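$$\left|\alpha - \frac{x}{y}\right| = \min_{1 \le i \le n}\left|\alpha_i - \frac{x}{y}\right|.$$

Let

$$c_{5.4} = \min_{i \neq j}|\alpha_i - \alpha_j|.$$

First assume that

$$\left|\alpha - \frac{x}{y}\right| < \frac{c_{5.4}}{2}.$$

Then for $\alpha_i \neq \alpha$, we have

$$\left|\alpha_i - \frac{x}{y}\right| \ge |\alpha - \alpha_i| - \left|\alpha - \frac{x}{y}\right| > \frac{c_{5.4}}{2}.$$

From (5.10) we have

$$\prod_{i=1}^{n}\left|\alpha_i - \frac{x}{y}\right| = \frac{|k|}{|a_0||y|^n}. \tag{5.11}$$

Hence

$$\left|\alpha - \frac{x}{y}\right| < \frac{c_{5.5}}{|y|^n}$$

where

$$c_{5.5} = |k|2^{n-1}/(|a_0|c_{5.4}^{n-1}).$$

By Theorem 5.2.3, we get

$$\frac{c_{5.3}}{|y|^{\kappa}} < \frac{c_{5.5}}{|y|^n}$$

which gives

$$|y|^{n-\kappa} < c_{5.5}c_{5.3}^{-1}.$$

By taking $\varepsilon < n/2 - 1$, the above inequality implies that there are only finitely many values for $y$ and hence for $x$ since $|x| \le |y|$.

Next if $\left|\alpha - \frac{x}{y}\right| \ge c_{5.4}/2$, then from (5.11), we get

$$|y|^n \le |k|2^n/(|a_0|c_{5.4}^n)$$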

giving the desired result as before.
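Theorem 5.2.2 guarantees finiteness but gives no procedure for finding the solutions. As a purely numerical illustration, the sketch below enumerates the solutions of $F(x, y) = k$ inside a box $|x|, |y| \le B$ by brute force. The helper name `thue_solutions` and the choice of examples are assumptions made for illustration; a search in a box does not by itself prove that no solutions exist outside it.

```python
def thue_solutions(coeffs, k, bound):
    """All integer (x, y) with |x|, |y| <= bound and F(x, y) = k.

    coeffs = [a0, a1, ..., an] are the coefficients of the binary form
    F(X, Y) = a0*X^n + a1*X^(n-1)*Y + ... + an*Y^n.
    """
    n = len(coeffs) - 1
    solutions = []
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            value = sum(a * x ** (n - i) * y ** i for i, a in enumerate(coeffs))
            if value == k:
                solutions.append((x, y))
    return solutions


if __name__ == "__main__":
    # Cubic form x^3 - 2y^3 (irreducible, degree 3): the list stops growing.
    for bound in (5, 20, 60):
        print(bound, thue_solutions([1, 0, 0, -2], 1, bound))
    # Quadratic form x^2 - 2y^2 (degree 2, outside Theorem 5.2.2): the list keeps growing.
    for bound in (5, 20, 100):
        print(bound, len(thue_solutions([1, 0, -2], 1, bound)))
```

The contrast with the Pell-type equation $x^2 - 2y^2 = 1$ shows why the hypothesis $n \ge 3$ in Theorem 5.2.2 cannot be dropped.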


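Proof of Theorem 5.2.3 The proof is based on a paper by Davenport [2]. Let the assumptions (a)–(e) in Sect. 5.1 hold.

Step 1. Construction of an auxiliary polynomial which vanishes at $(\alpha, \alpha)$ to a high order

Let $L, k$ be parameters to be chosen later. Put

$$R(X, Y) = \sum_{\lambda_1=0}^{L}\sum_{\lambda_2=0}^{1} p(\lambda_1, \lambda_2)X^{\lambda_1}Y^{\lambda_2} \in \mathbb{Z}[X, Y] \tag{5.12}$$

and

$$R_m(X, Y) = \frac{1}{m!}\frac{\partial^m}{\partial X^m}R(X, Y) \tag{5.13}$$

where $p(\lambda_1, \lambda_2)$ are integral variables to be determined subject to the condition

$$R_m(\alpha, \alpha) = 0 \quad \text{for } 0 \le m < k. \tag{5.14}$$

We have

$$R_m(\alpha, \alpha) = \sum_{\lambda_1=0}^{L}\sum_{\lambda_2=0}^{1} p(\lambda_1, \lambda_2)\binom{\lambda_1}{m}\alpha^{\lambda_1+\lambda_2-m}$$

which by Lemma 1.3.1 can be written as

$$R_m(\alpha, \alpha) = \sum_{j=0}^{n-1}\alpha^j\sum_{\lambda_1=0}^{L}\sum_{\lambda_2=0}^{1} p(\lambda_1, \lambda_2)\binom{\lambda_1}{m}b_{j,\lambda_1+\lambda_2-m}$$

and

$$\max_{(\lambda_1,\lambda_2)}\left|\binom{\lambda_1}{m}b_{j,\lambda_1+\lambda_2-m}\right| < 2^L(2h(\alpha))^{L+1} \le (4h(\alpha))^{2L} \quad \text{for } m \ge 0.$$

Since $\alpha$ is of degree $n$, the system of equations in (5.14) is equivalent to

$$\sum_{\lambda_1=0}^{L}\sum_{\lambda_2=0}^{1} p(\lambda_1, \lambda_2)\binom{\lambda_1}{m}b_{j,\lambda_1+\lambda_2-m} = 0, \quad 0 \le m < k,\ 0 \le j \le n-1.$$

There are $nk$ equations in $2(L+1)$ unknowns $p(\lambda_1, \lambda_2)$. We shall choose

$$L = \left[\tfrac{1}{2}(1+\delta)nk\right] \tag{5.15}$$

so that $2(L+1) > (1+\delta)nk$, $\delta > 0$. Then by Lemma 3.1.1, there exist $p(\lambda_1, \lambda_2) \in \mathbb{Z}$, not all zero, such that

$$|p(\lambda_1, \lambda_2)| \le (4(L+1)(4h(\alpha))^{2L})^{1/\delta} \le (4h(\alpha))^{4L/\delta}. \tag{5.16}$$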

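Step 2. An upper bound for $R_m\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right)$ where $\frac{p_1}{q_1}$ and $\frac{p_2}{q_2}$ are two rationals very close to $\alpha$

Denote by $c_{5.6}, c_{5.7}, \ldots$ positive numbers depending only on $\alpha$ and $\delta$. Let $p_1/q_1$ and $p_2/q_2$ be two reduced rational numbers with

$$\left|\alpha - \frac{p_i}{q_i}\right| < 1, \quad i = 1, 2.$$

Then $|p_i/q_i| < 1 + |\alpha|$, $i = 1, 2$. By (5.12), we can write $R(X, Y) = P(X) - YQ(X)$ where

$$P(X) = \sum_{\lambda_1=0}^{L} p(\lambda_1, 0)X^{\lambda_1}; \qquad Q(X) = -\sum_{\lambda_1=0}^{L} p(\lambda_1, 1)X^{\lambda_1}.$$

Then by the definition of $R_m(X, Y)$, it follows that $R_m(X, Y) = P_m(X) - YQ_m(X)$ where

$$P_m(X) = \frac{1}{m!}P^{(m)}(X) = \sum_{\lambda_1=0}^{L} p(\lambda_1, 0)\binom{\lambda_1}{m}X^{\lambda_1-m}$$

and

$$Q_m(X) = \frac{1}{m!}Q^{(m)}(X) = -\sum_{\lambda_1=0}^{L} p(\lambda_1, 1)\binom{\lambda_1}{m}X^{\lambda_1-m}.$$

Applying (5.16) we get

$$|P_m(X)| \le (L+1)(4h(\alpha))^{4L/\delta}2^L(1+|X|)^L \le (c_{5.6}(1+|X|))^L$$

where $c_{5.6}$ may be taken as $4(4h(\alpha))^{4/\delta}$. Similarly

$$|Q_m(X)| \le (c_{5.6}(1+|X|))^L. \tag{5.17}$$

Thus

$$|R_m(X, Y)| \le |P_m(X)| + |Y||Q_m(X)| \le (c_{5.6}(1+|X|))^L(1+|Y|). \tag{5.18}$$

Taking $X = p_1/q_1$, $Y = p_2/q_2$ and observing that

$$R_m\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right) = P_m\!\left(\frac{p_1}{q_1}\right) - \alpha Q_m\!\left(\frac{p_1}{q_1}\right) - \left(\frac{p_2}{q_2} - \alpha\right)Q_m\!\left(\frac{p_1}{q_1}\right),$$

we obtain

$$\left|R_m\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right)\right| \le \left|R_m\!\left(\frac{p_1}{q_1}, \alpha\right)\right| + \left|\alpha - \frac{p_2}{q_2}\right|\left|Q_m\!\left(\frac{p_1}{q_1}\right)\right|$$

which by (5.17) implies that

$$\left|R_m\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right)\right| \le \left|R_m\!\left(\frac{p_1}{q_1}, \alpha\right)\right| + \left|\alpha - \frac{p_2}{q_2}\right|c_{5.7}^L. \tag{5.19}$$

Estimate for $R_m\!\left(\frac{p_1}{q_1}, \alpha\right)$

Fix an integer $m$ with $0 \le m < k$. Let

$$S(X) = R_m(X, \alpha) \quad \text{and} \quad S_\nu(\alpha) = \frac{1}{\nu!}S^{(\nu)}(\alpha).$$

Then by (5.13),

$$S_\nu(\alpha) = \binom{m+\nu}{\nu}R_{m+\nu}(\alpha, \alpha).$$

Using (5.14), we get $S_\nu(\alpha) = 0$ for $\nu < k - m$. Hence

$$S\!\left(\frac{p_1}{q_1}\right) = S\!\left(\frac{p_1}{q_1} - \alpha + \alpha\right) = \sum_{\nu=k-m}^{\infty} S_\nu(\alpha)\left(\frac{p_1}{q_1} - \alpha\right)^{\nu}$$

by Taylor expansion, thus giving

$$R_m\!\left(\frac{p_1}{q_1}, \alpha\right) = \sum_{\nu=k-m}^{L-m}\binom{m+\nu}{\nu}R_{m+\nu}(\alpha, \alpha)\left(\frac{p_1}{q_1} - \alpha\right)^{\nu}.$$

By $\left|\alpha - \frac{p_1}{q_1}\right| < 1$ and (5.18), we therefore get

$$\left|R_m\!\left(\frac{p_1}{q_1}, \alpha\right)\right| \le c_{5.8}^L\left|\frac{p_1}{q_1} - \alpha\right|^{k-m}.$$

Combining this with (5.19) we obtain

$$\left|R_m\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right)\right| \le \left(\left|\alpha - \frac{p_1}{q_1}\right|^{k-m} + \left|\alpha - \frac{p_2}{q_2}\right|\right)c_{5.9}^L. \tag{5.20}$$

Step 3. There exists some integer $m$ with $R_m\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right) \neq 0$

Let $t$ be a positive integer such that

$$R_m\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right) = 0 \quad \text{for } 0 \le m \le t.$$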

We shall get an upper bound $T$ for $t$. Then there exists some $m$ with $0 \le m \le T + 1$ such that $R_m\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right) \neq 0$. Let us take two distinct integers $m$ and $m'$ with $0 \le m, m' \le t$. From the definition of $R_m$ it is clear that

$$P^{(m)}\!\left(\frac{p_1}{q_1}\right) - \frac{p_2}{q_2}Q^{(m)}\!\left(\frac{p_1}{q_1}\right) = 0$$

and

$$P^{(m')}\!\left(\frac{p_1}{q_1}\right) - \frac{p_2}{q_2}Q^{(m')}\!\left(\frac{p_1}{q_1}\right) = 0.$$

Eliminating $p_2/q_2$ in these equations, we get

$$P^{(m)}\!\left(\frac{p_1}{q_1}\right)Q^{(m')}\!\left(\frac{p_1}{q_1}\right) - P^{(m')}\!\left(\frac{p_1}{q_1}\right)Q^{(m)}\!\left(\frac{p_1}{q_1}\right) = 0.$$

Put

$$W(X) = P(X)Q^{(1)}(X) - P^{(1)}(X)Q(X).$$

Then $W(X) \in \mathbb{Z}[X]$ and by the previous identity, we get

$$W^{(\mu)}\!\left(\frac{p_1}{q_1}\right) = 0 \quad \text{for } 0 \le \mu < t.$$

Therefore,

$$W(X) = (q_1X - p_1)^tU(X) \quad \text{with } U(X) \in \mathbb{Z}[X] \tag{5.21}$$

by $\gcd(p_1, q_1) = 1$ and Gauss Lemma 1.2.1. By Step 1, it follows that $H(W)$, the height of $W$, satisfies

$$H(W) \le c_{5.10}^L \le c_{5.11}^k. \tag{5.22}$$

Suppose $W(X) \not\equiv 0$. Then by (5.21), $H(W) \ge q_1^t$. Comparing this with (5.22), we get

$$t \le \frac{k\log c_{5.11}}{\log q_1}.$$

Thus we may take

$$T = \frac{k\log c_{5.11}}{\log q_1}. \tag{5.23}$$

Now we show that $W(X) \not\equiv 0$. By Step 1, we observe that either $P(X)$ or $Q(X)$ is not identically zero. Suppose $Q(X) \equiv 0$. Then $P(X) \not\equiv 0$. Again by Step 1, $P^{(m)}(\alpha) = 0$ for $0 \le m < k$, which implies

$$P^{(m)}(\alpha^{(j)}) = 0 \quad \text{for } 0 \le m < k,\ 1 \le j \le n$$

where $\alpha^{(1)} = \alpha, \ldots, \alpha^{(n)}$ are the conjugates of $\alpha$. Since $P(X) \not\equiv 0$ we have

$$nk \le \deg P \le L < nk,$$

by (5.15) and taking $0 < \delta < 1$. This is a contradiction. Thus $Q(X) \not\equiv 0$. Next suppose that $P(X)/Q(X)$ is a constant function, say $\lambda$. Then $\lambda \in \mathbb{Q}$. Hence $\lambda \neq \alpha$. Also $R(X, \alpha) = (\lambda - \alpha)Q(X)$. So by Step 1, we get that

$$Q^{(m)}(\alpha^{(j)}) = 0 \quad \text{for } 0 \le m < k,\ 1 \le j \le n.$$

By a comparison of the degree, as earlier, we get a contradiction. Thus $P(X)/Q(X)$ is not a constant function. Hence

$$\left(\frac{P(X)}{Q(X)}\right)' = -\frac{W(X)}{Q^2(X)} \not\equiv 0$$

implying $W(X) \not\equiv 0$. In conclusion, there exists an integer $m$, say $m_0$, with $0 \le m_0 \le T + 1$ for which $R_{m_0}\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right) \neq 0$ with $T$ as in (5.23).

Step 4. A lower bound for $R_{m_0}\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right)$

Among the infinitely many rationals satisfying (5.5) we will choose suitably two rationals $p_1/q_1$ and $p_2/q_2$ satisfying $q_2 > q_1 > 1$. Let $L$ be given by (5.15) and take $0 < \delta < 1/12$ so that Steps 1–3 are valid. Take

$$k = \left[\frac{\log q_2}{\log q_1}\right]$$

and

$$q_1 > e^{\log c_{5.11}/\delta}.$$

Then

$$q_1^k \le q_2 < q_1^{k+1} \tag{5.24}$$

and

$$m_0 \le \frac{k\log c_{5.11}}{\log q_1} + 1 \le k\delta + 2. \tag{5.25}$$

From the definition of $R_m$ and (5.24), it is easy to see that

$$\left|R_{m_0}\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right)\right| \ge q_1^{-(1+\delta)nk/2-k-1}. \tag{5.26}$$

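Step 5. An upper bound for $R_{m_0}\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right)$

By (5.20), assumption (5.5) and (5.25), we get

$$\left|R_{m_0}\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right)\right| \le \max\left(q_1^{\kappa(-k(1-\delta)+2)}, q_2^{-\kappa}\right)c_{5.12}^k.$$

Let us now take $q_1 > \max\left(e^{\log c_{5.11}/\delta}, c_{5.12}^{1/\delta}\right)$. Then Step 4 is valid and also

$$\left|R_{m_0}\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right)\right| \le q_1^{\kappa(-k(1-\delta)+2)+\delta k}. \tag{5.27}$$

Step 6. Gap principle

For any two reduced rationals $P_1/Q_1$ and $P_2/Q_2$ with $Q_2 > Q_1 > 1$ satisfying (5.5) we see that

$$\frac{1}{Q_1Q_2} \le \left|\frac{P_1}{Q_1} - \frac{P_2}{Q_2}\right| \le \left|\alpha - \frac{P_1}{Q_1}\right| + \left|\alpha - \frac{P_2}{Q_2}\right| < \frac{2}{Q_1^{\kappa}}$$

which gives

$$Q_2 > Q_1^{\kappa-1}/2.$$

In other words, denominators of such rationals are far apart. This phenomenon is also described by saying that good approximations repel one another.

Step 7. Final contradiction

Take a rational $p_2/q_2$ with $q_2 > 0$ satisfying (5.5) and so large that

$$k = \left[\frac{\log q_2}{\log q_1}\right] > \delta^{-1}.$$

This is possible by Step 6. Comparing (5.26) and (5.27), we get

$$\kappa \le \frac{(1+\delta)k(n/2+1)+1}{k(1-\delta)-2} \le \frac{n}{2} + 1 + \frac{(2\delta k+2)(n/2+1)+1}{k(1-\delta)-2}.$$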


Since δ < 1/12, we estimate the last term in the above inequality as
0. There exists a number $c_{5.13} = c_{5.13}(\alpha, \varepsilon) > 0$ such that

$$\left|\alpha - \frac{p}{q}\right| > \frac{c_{5.13}}{q^{\kappa}}$$

for any $p, q \in \mathbb{Z}$, $q > 0$ provided $\kappa$ is as in (5.1).

Proof We denote by $c_{5.14}, \ldots$ positive numbers depending only on $\alpha$ and some parameter $\delta > 0$ which will be chosen later. We need to generalise several ideas in the proof of Theorem 5.2.3. As before, we do them in several steps. Let all the assumptions (a)–(e) of Sect. 5.1 hold.

Step 1′. Construction of an auxiliary polynomial which vanishes at $(\alpha, \alpha)$ to a high order

Note that in the proof of Theorem 5.2.3 the polynomial $R(X, Y)$ is linear in $Y$. Now we shall construct a nonlinear polynomial in $Y$. Let $a, b, r$ be non-negative integers. Put

$$R(X, Y) = \sum_{\lambda_1=0}^{a}\sum_{\lambda_2=0}^{b} p(\lambda_1, \lambda_2)X^{\lambda_1}Y^{\lambda_2} \in \mathbb{Z}[X, Y]$$

and

$$R_m(X, Y) = \frac{1}{m!}\frac{\partial^m}{\partial X^m}R(X, Y) \tag{5.28}$$

where $p(\lambda_1, \lambda_2)$ are integral variables to be determined subject to the condition $R(X, \alpha) = (X - \alpha)^rS(X)$ for some polynomial $S$ and

$$R_m(\alpha, \alpha) = 0 \quad \text{for } 0 \le m < r. \tag{5.29}$$

The vanishing of any polynomial $P(X) \in \mathbb{Q}[X]$ at $X = \alpha$ requires $n$ conditions given by expressing the powers of $\alpha$ in $P(\alpha)$ in terms of $1, \alpha, \ldots, \alpha^{n-1}$, and then equating to zero the coefficients of these powers. We also have $P(\alpha_1) = \cdots = P(\alpha_n) = 0$ where $\alpha = \alpha_1, \ldots, \alpha_n$ are the conjugates of $\alpha$. This is equivalent to saying

$$\alpha_1^tP(\alpha_1) + \cdots + \alpha_n^tP(\alpha_n) = 0, \quad 0 \le t < n.$$

Taking $P(X) = R_m(X, \alpha)$, each condition in (5.29) gives rise to $n$ conditions on the coefficients $p(\lambda_1, \lambda_2)$ in

$$\alpha_1^tR_m(\alpha_1, \alpha_1) + \cdots + \alpha_n^tR_m(\alpha_n, \alpha_n) = 0. \tag{5.30}$$

Thus there are $rn$ conditions for $(a+1)(b+1)$ unknowns $p(\lambda_1, \lambda_2)$. Note also that the coefficients of each $p(\lambda_1, \lambda_2)$ are in $\mathbb{Z}$. Now we choose the parameters $a, b$ as follows:

$$0 < b < n \quad \text{and} \quad a = \left[\frac{n+\delta}{b+1}\,r\right].$$

Then

$$(a+1)(b+1) - nr > \delta r. \tag{5.31}$$

Note that

$$R_m(\alpha_i, \alpha_i) = \sum_{\lambda_1=m}^{a}\sum_{\lambda_2=0}^{b}\binom{\lambda_1}{m}p(\lambda_1, \lambda_2)\alpha_i^{\lambda_1+\lambda_2-m}, \quad 1 \le i \le n.$$

Hence the coefficient of $p(\lambda_1, \lambda_2)$ in $R_m(\alpha_i, \alpha_i)$ has absolute value bounded by

$$2^a\max(1, |\alpha_i|^{a+b}) < c_{5.14}^r.$$

Thus from (5.30) we get that the coefficients of $p(\lambda_1, \lambda_2)$ are bounded in absolute value by

$$c_{5.14}^r(|\alpha_1|^t + \cdots + |\alpha_n|^t) \le c_{5.15}^r.$$

Applying Lemma 3.1.1, there exist rational integers $p(\lambda_1, \lambda_2)$, not all zero, such that

$$\max|p(\lambda_1, \lambda_2)| \le \left((a+1)(b+1)c_{5.15}^r\right)^{nr/((a+1)(b+1)-nr)}.$$
5.3 Theorem of Siegel


By (5.31) and since (a + 1)(b +

r 1)c5.15

c5.16 .

(5.33)

r From now on we shall assume that q2 > c5.16 . We now claim that there exists m with 0 ≤ m < r such that p2 = 0 for any h ∈ Q Rm h, q2

provided

(γ) (h) = 0 for some γ with b + γ < r.

Assume first that (5.34) holds. We show the claim. By (5.33), we have

(5.34)


b p2 p2 = ≡ 0. R X, Fλ (X )G λ q2 q2 λ=0

Hence we may assume without loss of generality that G 0 the above expression repeatedly, we get m!Rm

p2 X, q2

=

b

Fλ(m) (X )G λ

λ=0

p2 q2

p2 q2

= 0. Differentiating

, 0 ≤ m ≤ b .

Since the determinant of the coefficient matrix of the above system, (X ) is not identically zero, we can solve for G λ qp22 . Thus we have (X )G 0

p2 q2

=

b λ=0

p2 , Hλ (X )Rλ X, q2

where Hλ (X ) are polynomials with integer coefficients. Differentiating the above expression γ times and putting X = h, we obtain

(γ)

(h)G 0

p2 q2

b +γ

=

λ=0

p2 h λ Rλ h, q2

where h λ are rational numbers. Since b + γ ≤ b + γ < r, the Rλ are defined by p2 (5.28) and not all the Rλ h, q2 are zero for 0 ≤ λ ≤ b + γ. This proves the claim. Next we find a γ satisfying (5.34). We have

m!Rm (X, α) =

b

Fλ(m) (X )G λ (α), 0 ≤ m ≤ b .

λ=0

The polynomial Rm (X, α) in X is divisible by (X − α)r −m and so by (X − α)r −b since m ≤ b < r. Arguing as in the previous paragraph, we get that (X )G 0 (α) is a linear combination of Rm (X, α) and hence divisible by (X − α)r −m . Now G 0 (X ) ≡ 0 and G 0 (α) = 0 since G 0 (X ) is of degree b < n. Hence (X ) is divisible by (X − α)r −b . So (X ) is divisible by the minimal polynomial say, f (X ) of α to the power r − b . Thus we can write

(X ) = f (X )r −b D(X ) where D(X ) ∈ Q[X ] is of degree d, say. We know that the elements of (X ) are of degree at most a. Hence (X ) is of degree at most a(b + 1). From the above expression for (X ) we get


n(r − b ) + d ≤ a(b + 1). By the choices of a and b we get d≤

n+δ r (b + 1) − nr + nb ≤ δr + nb ≤ δr + n 2 − n. b+1

Since $h \in \mathbb{Q}$, $f(h) \neq 0$, and so the determinant above has a zero at $X = h$ of order $\gamma$ with $\gamma \le \delta r + n^2 - n$. Thus (5.34) is satisfied if $b + \delta r + n^2 - n < \delta r + n^2 < r$. Taking $h = p_1/q_1$ and summarising the above arguments, we get

$$R_m\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right) \neq 0 \quad \text{for some } m \text{ with } 0 \le m < r$$

provided $\delta r + n^2 < r$ and $q_2 > c_{5.16}^r$.

Step 4′. A lower bound for $R_m\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right)$

We follow Step 4 in the proof of Theorem 5.2.3. Among the infinitely many rationals satisfying (5.5) we will choose suitably two rationals $p_1/q_1$ and $p_2/q_2$ satisfying

$$q_1^r \le q_2 < q_1^{r+1}$$

so that

$$r = \left[\frac{\log q_2}{\log q_1}\right].$$

Let $a$ and $b$ be as given in Step 1′ and take $0 < \delta < 1/2$ and $q_2 > c_{5.16}^r$ so that Steps 1′–3′ are valid. Further we take $q_2$ so large that $r > 2n^2$. Then

$$m \le \delta r + n^2 < r. \tag{5.35}$$

Thus

$$q_1^aq_2^b\left|R_m\!\left(\frac{p_1}{q_1}, \frac{p_2}{q_2}\right)\right| \ge 1.$$

Together with (5.32), this gives

$$c_{5.19}^r\,q_1^aq_2^b\max\left(\left|\alpha - \frac{p_1}{q_1}\right|^{r-m}, \left|\alpha - \frac{p_2}{q_2}\right|\right) > 1. \tag{5.36}$$

Step 5′. Final contradiction

We shall choose $q_1$ and $\delta$ in such a way that inequality (5.36) is contradicted. Let

$$\kappa = \frac{n}{b+1} + b + \varepsilon, \quad 0 < \varepsilon < 1.$$

Then

$$\kappa < \frac{n}{2} + n - 1 + 1 < 2n.$$

Consider

$$A := q_1^aq_2^b\left|\alpha - \frac{p_1}{q_1}\right|^{r-m} \le q_1^{a+b(r+1)-\kappa(r-m)}$$

by the choice of $q_2$ and (5.5). Using the value for $a$ and (5.35) we get

$$A \le q_1^{\left(\frac{n+\delta}{b+1}+b+\frac{b}{r}-\kappa+\frac{m\kappa}{r}\right)r} < q_1^{\left(\frac{\delta}{b+1}+\delta\kappa+\frac{b}{r}+\frac{\kappa n^2}{r}-\varepsilon\right)r}.$$

Take $\delta$ such that

$$\frac{\delta}{b+1} + \delta\kappa < \varepsilon/2$$

and $q_2$ so large that $r$ satisfies

$$\frac{b}{r} + \frac{\kappa n^2}{r} < \varepsilon/4.$$

Then

$$A < q_1^{-\varepsilon r/4}. \tag{5.37}$$

Next,

$$B := q_1^aq_2^b\left|\alpha - \frac{p_2}{q_2}\right| < q_1^{\frac{n+\delta}{b+1}r}q_2^{b-\kappa} < q_1^{\left(\frac{n+\delta}{b+1}+b-\kappa\right)r}$$

since $b < \kappa$ and $q_2 \ge q_1^r$. Thus

$$B < q_1^{\left(\frac{\delta}{b+1}-\varepsilon\right)r} < q_1^{-\varepsilon r/2}$$

by the choice of $\delta$. Hence (5.37) is true with $A$ replaced by $B$. From (5.36), we get

$$1 < c_{5.19}^r\max(A, B) < \left(c_{5.19}\,q_1^{-\varepsilon/4}\right)^r.$$

Choose $q_1$ large so that $\left(c_{5.19}\,q_1^{-\varepsilon/4}\right)^r < 1$. This choice of $q_1$ contradicts the above inequality, giving the final contradiction.

Exercise

(1) Let $P$ be the set of natural numbers all of whose prime divisors belong to a finite set of primes $\{p_1, \ldots, p_m\}$. Show that the equation

$$x - y = k, \quad k \in \mathbb{Z},\ k \neq 0$$

has only finitely many solutions in $x, y \in P$.

(2) Let $F(X, Y) = a_0X^n + a_1X^{n-1}Y + \cdots + a_nY^n \in \mathbb{Z}[X, Y]$ be an irreducible binary form of degree $n \ge 3$. Then show that for any $\nu < n - 2\sqrt{n}$, there are only finitely many integer points $(x, y)$ with

$$0 < |F(x, y)| < (|x| + |y|)^{\nu}.$$

Suppose $G(X, Y)$ is a polynomial of total degree $\nu < n - 2\sqrt{n}$. Then show that there are only finitely many integer points $(x, y)$ with $F(x, y) = G(x, y)$.

(3) Let $\alpha$ be an algebraic number of degree $n$ and let $P(X) \in \mathbb{Z}[X]$ be of degree $k$. Show that there exists a number $c = c(\alpha) > 0$ such that either $P(\alpha) = 0$ or

$$|P(\alpha)| > \frac{c^k}{H(P)^{n-1}}.$$

Notes

Given any real number $\alpha$, one would like to know whether it behaves like almost every other number. For example, if $\alpha$ is a quadratic irrational, then it is badly approximable, and hence it behaves like almost every number with respect to inequality (5.3) but not with respect to inequality (5.4). As another example, consider Euler's number $e$. It has the continued fraction expansion $[2; 1, 2, 1, 1, 4, 1, 1, 6, 1, \ldots]$. In 1978, Davis [3] used this continued fraction expansion to obtain the following rational approximation to $e$. Let $\varepsilon > 0$. Then for any $p, q \in \mathbb{N}$ with $q > q_0(\varepsilon)$, one has

$$\left|e - \frac{p}{q}\right| > \left(\frac{1}{2} - \varepsilon\right)\frac{\log\log q}{q^2\log q}.$$

From this it is clear that neither of the inequalities (5.3) and (5.4) has infinitely many solutions if $\alpha = e$. Thus with respect to (5.3), the constant $e$ behaves like almost every number but not with respect to (5.4). In 1953, Mahler [4] showed that

$$\left|\pi - \frac{p}{q}\right| < \frac{1}{q^{42}}$$

has only finitely many solutions.

Although the method of Thue is ineffective, i.e. one cannot bound the solutions of the equation

$$F(x, y) = k,$$

it is possible to bound the number of solutions of this equation. Several mathematicians like Bombieri, Evertse, Győry, Schmidt, Siegel and others have contributed to this problem. See the recent book of Evertse and Győry [5] for various developments in this direction. We mention a result of Siegel [6]. A binary form $F(x, y) \in \mathbb{Z}[x, y]$ of degree $r$ is said to be diagonalisable if it is of the form

$$(\alpha x + \beta y)^r - (\gamma x + \delta y)^r.$$

Here we consider forms with $\alpha, \beta, \gamma$ and $\delta$ algebraic satisfying $\alpha\delta - \beta\gamma \neq 0$ and $r \ge 3$. Let $\Delta$ be the discriminant of $F$. As $F(x, y) \in \mathbb{Z}[x, y]$, it is possible to write

$$(\alpha x + \beta y)(\gamma x + \delta y) = \chi(Ax^2 + Bxy + Cy^2), \quad A, B, C \in \mathbb{Z}.$$

Let $D = D(F) = B^2 - 4AC$. Let $N_F(k)$ denote the number of primitive solutions (i.e. $\gcd(x, y) = 1$) of the inequality

$$|F(x, y)| \le k.$$

Theorem 5.3.2 Assume that F(x, y) is a diagonalisable form of degree r and with discriminant satisfying > 2r

2

−r r 2r −2

r k

4 cl r 2−l r k

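where $r \ge 6 - l$, $l = 1, 2, 3$, $c_1 = 45 + \frac{593}{913}$, $c_2 = 6 + \frac{134}{4583}$ and $c_3 = 75 + \frac{156}{167}$. Then
$$N_F(k) \le \begin{cases} 2lr & \text{if } D < 0, \\ 4l & \text{if } D > 0,\ r \text{ is even and } F \text{ is indefinite}, \\ 2l & \text{if } D > 0,\ r \text{ is odd and } F \text{ is indefinite}, \\ 1 & \text{if } D > 0 \text{ and } F \text{ is definite}. \end{cases}$$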

In particular, if D < 0 and l = 1, then N F (k) ≤ 2r provided || > 2r

2

−r 183.6r 47.6r −2

r

k

.

For a recent improvement of this result, see [7]. The most interesting families of diagonalisable forms are the binomial forms, i.e. forms of the shape $a x^r - b y^r$. In an important work, Bennett [8] combined the hypergeometric method with Chebyshev-like estimates for primes in arithmetic progressions to show that


$$a x^r - b y^r = 1, \quad a, b \in \mathbb{N},$$
has at most one solution in positive integers $x$ and $y$. This result is best possible, since $(a + 1) x^r - a y^r = 1$ has precisely one solution, namely $(x, y) = (1, 1)$. Although Dyson’s improvement of the Thue–Siegel theorem was mild, a lemma of Dyson proved to be very useful. Bombieri [9] generalised this lemma and used it to prove effective results on the approximation of certain algebraic numbers by rationals.
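As an informal illustration of Bennett's result, the following sketch searches a finite box for positive solutions of $(a+1)x^r - a y^r = 1$ with $r = 3$. The function name `binomial_thue_solutions` and the search bound are our own choices; a brute-force search of this kind only illustrates the statement and proves nothing.

```python
def binomial_thue_solutions(a, b, r, bound):
    """Return all (x, y) with 1 <= x, y <= bound and a*x**r - b*y**r == 1."""
    # Index the values b*y**r so that each x needs only one dictionary lookup.
    y_values = {b * y**r: y for y in range(1, bound + 1)}
    solutions = []
    for x in range(1, bound + 1):
        y = y_values.get(a * x**r - 1)
        if y is not None:
            solutions.append((x, y))
    return solutions


if __name__ == "__main__":
    # The family (a+1) x^3 - a y^3 = 1 from the text, searched up to an arbitrary bound.
    for a in range(1, 6):
        print(a, binomial_thue_solutions(a + 1, a, 3, 500))
```

In each case the search finds only $(x, y) = (1, 1)$, in agreement with the theorem.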

References

1. L.J. Mordell, Diophantine Equations (Academic, New York, 1969)
2. H. Davenport, A note on Thue’s theorem. Mathematika 15, 76–87 (1968)
3. C.S. Davis, Rational approximation to e. J. Aust. Math. Soc. 25, 497–502 (1978)
4. K. Mahler, On the approximation of π. Proc. Akad. Wetensch. Ser. A 56, 30–42 (1953)
5. J.-H. Evertse, K. Győry, Unit Equations in Diophantine Number Theory (Cambridge University Press, Cambridge, 2015), 378 p.
6. C.L. Siegel, Einige Erläuterungen zu Thue’s Untersuchungen über Annäherungswerte algebraischer Zahlen und diophantische Gleichungen. Nachr. Akad. Wiss. Göttingen Math.-Phys., 169–195 (1970)
7. S. Akhtari, N. Saradha, D. Sharma, Thue’s inequalities and the hypergeometric method. Ramanujan J. 45(2), 521–567 (2018)
8. M.A. Bennett, Rational approximation to algebraic numbers of small height: the Diophantine equation $|ax^n - by^n| = 1$. J. Reine Angew. Math. 535, 1–49 (2001)
9. E. Bombieri, On the Thue–Siegel–Dyson theorem. Acta Math. 148, 255–296 (1982)

Chapter 6

Roth’s Theorem

I’m not lost for I know where I am. But however, where I am may be lost. –Winnie the Pooh

Thue’s and Siegel’s improvements of Liouville’s theorem depend on the construction of an auxiliary polynomial in two variables possessing zeros to a high order. Any further progress seemed to require a non-trivial extension of the arguments to polynomials in several variables, especially regarding the possible multiplicities of their zeros. This was achieved by Roth in 1955, when he proved that $\kappa$ in Theorem 5.2.3 can be taken as $2 + \epsilon$, $\epsilon > 0$. To deal with the multiplicities of zeros of multi-variable polynomials, Roth introduced the notion of the index of a polynomial; see Sect. 6.1. This notion was later used by Vojta in 1991 in his proof of Faltings’s famous theorem on the Mordell conjecture. There are now other proofs of Roth’s theorem available, based on techniques of algebraic geometry. Roth was awarded the Fields Medal at the ICM in Edinburgh in 1958. We have already seen in Chap. 5 that Roth’s theorem is essentially best possible with respect to the exponent $2 + \epsilon$. This can also be seen via continued fractions. Let $h_n/k_n$ be the $n$-th convergent to the irrational number $\xi$. Then it is well known that
$$\left| \xi - \frac{h_n}{k_n} \right| < \frac{1}{k_n k_{n+1}} < \frac{1}{k_n^2}.$$
Thus there are infinitely many rational numbers $h/k$ such that
$$\left| \xi - \frac{h}{k} \right| < \frac{1}{k^2}.$$
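The convergent inequality can be checked on a concrete example. The sketch below is our own illustration, with $\xi = \sqrt{2} = [1; 2, 2, 2, \ldots]$ chosen for convenience. It builds the convergents from the usual recurrence and tests $|\sqrt{2} - h/k| < 1/k^2$ exactly in integer arithmetic, using the fact that for $hk \ge 1$ the inequality is equivalent to $(hk - 1)^2 < 2k^4 < (hk + 1)^2$.

```python
def convergents(partial_quotients):
    """Return the convergents (h_n, k_n) of [a_0; a_1, a_2, ...] as a list."""
    h_prev, k_prev = 1, 0                 # h_{-1}, k_{-1}
    h, k = partial_quotients[0], 1        # h_0, k_0
    result = [(h, k)]
    for a in partial_quotients[1:]:
        # h_n = a_n h_{n-1} + h_{n-2}, and similarly for k_n.
        h, h_prev = a * h + h_prev, h
        k, k_prev = a * k + k_prev, k
        result.append((h, k))
    return result


def close_to_sqrt2(h, k):
    """Exact integer test of |sqrt(2) - h/k| < 1/k**2, valid when h*k >= 1."""
    return (h * k - 1) ** 2 < 2 * k ** 4 < (h * k + 1) ** 2


if __name__ == "__main__":
    for h, k in convergents([1] + [2] * 10):
        print(h, k, close_to_sqrt2(h, k))
```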

© Springer Nature Singapore Pte Ltd. 2020 S. Natarajan and R. Thangadurai, Pillars of Transcendental Number Theory, https://doi.org/10.1007/978-981-15-4155-1_6


For proving the theorem, we follow Roth [1]. One may also refer to LeVeque [2] for approximation of ξ by algebraic numbers. We need some preparation.

6.1 Index of a Polynomial

We saw in the proofs of Theorems 5.2.3 and 5.3.1 that a polynomial in two variables was constructed. Roth used a polynomial $P(X_1, \ldots, X_m)$ in several variables. If $p_1/q_1, \ldots, p_m/q_m$ are very good rational approximations to $\alpha$, then one may substitute these into $P(X_1, \ldots, X_m)$. The main difficulty is to show that $P(p_1/q_1, \ldots, p_m/q_m) \ne 0$. This was fairly easy in Lemmas 6.3.1 and 6.4.1 where $m = 2$. Roth used the polynomial described in Lemma 1.4.3 for this purpose. One also needs $m$ very good rational approximations, and their number $m$ depends on $\kappa$. A simple way to define the order of vanishing of $P(X_1, \ldots, X_m)$ at a given point $(\xi_1, \ldots, \xi_m)$ is to take the smallest value of $i_1 + \cdots + i_m$ for which the partial derivative
$$\left( \frac{\partial}{\partial X_1} \right)^{i_1} \cdots \left( \frac{\partial}{\partial X_m} \right)^{i_m} P \,\Big|_{(\xi_1, \ldots, \xi_m)} := P^{(i_1, \ldots, i_m)}(\xi_1, \ldots, \xi_m) \ne 0. \tag{6.1}$$

But it became necessary to study polynomials $P(X_1, \ldots, X_m)$ which have different degrees in $X_1, \ldots, X_m$, and hence it was better to attach different weights to $i_1, \ldots, i_m$.

Definition of Index Let $P \in \mathbb{Z}[X_1, \ldots, X_m]$ and $P \not\equiv 0$. The index of $P$ at $(\xi_1, \ldots, \xi_m)$ with respect to a tuple $(r_1, \ldots, r_m)$ is defined to be the least value of
$$\frac{i_1}{r_1} + \cdots + \frac{i_m}{r_m}$$
for which (6.1) holds. If $P \equiv 0$, the index is defined to be $+\infty$. We denote the index by $I_{P, r_1, \ldots, r_m}(\xi_1, \ldots, \xi_m)$. If $r_1, \ldots, r_m, \xi_1, \ldots, \xi_m$ remain the same, we omit referring to them and write simply $I_P$. Note that $I_P \ge 0$, and $I_P = 0$ if and only if $P(\xi_1, \ldots, \xi_m) \ne 0$. Further, if $P^{(k_1, \ldots, k_m)}(X_1, \ldots, X_m) \not\equiv 0$, then its index at $(\xi_1, \ldots, \xi_m)$ is at least
$$I_P - \frac{k_1}{r_1} - \cdots - \frac{k_m}{r_m}.$$


We list here some properties which can be easily verified. Let $P(X_1, \ldots, X_m)$ and $Q(X_1, \ldots, X_m)$ be two polynomials, neither of which vanishes identically. Let the indices be formed at $(\xi_1, \ldots, \xi_m)$. Then
$$I_{P+Q} \ge \min(I_P, I_Q) \quad \text{and} \quad I_{PQ} = I_P + I_Q.$$
Suppose $P$ is a polynomial in $X_1, \ldots, X_{m-1}$ and $Q$ is a polynomial in $X_m$. Then
$$I_{PQ, r_1, \ldots, r_m}(\xi_1, \ldots, \xi_m) = I_{P, r_1, \ldots, r_{m-1}}(\xi_1, \ldots, \xi_{m-1}) + I_{Q, r_m}(\xi_m).$$
We leave the proofs of these properties to the reader.
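The definition can be explored numerically. The following is a minimal sketch under our own conventions: a polynomial is stored as a dictionary mapping exponent tuples to coefficients, and the helper names `shift`, `index` and `multiply` are hypothetical. It uses the fact that the coefficient of $X_1^{i_1} \cdots X_m^{i_m}$ in $P(X_1 + \xi_1, \ldots, X_m + \xi_m)$ is $P^{(i_1, \ldots, i_m)}(\xi_1, \ldots, \xi_m)/(i_1! \cdots i_m!)$, so the index is the least weighted degree of a monomial occurring in the shifted polynomial.

```python
from fractions import Fraction
from itertools import product
from math import comb


def shift(poly, point):
    """Coefficients of P(X_1 + xi_1, ..., X_m + xi_m), as {exponents: coefficient}."""
    shifted = {}
    for exps, coeff in poly.items():
        # Expand each factor (X_j + xi_j)**e_j with the binomial theorem.
        for ks in product(*(range(e + 1) for e in exps)):
            term = Fraction(coeff)
            for e, k, xi in zip(exps, ks, point):
                term *= comb(e, k) * Fraction(xi) ** (e - k)
            shifted[ks] = shifted.get(ks, 0) + term
    return {ks: c for ks, c in shifted.items() if c != 0}


def index(poly, point, weights):
    """Index of P at `point` with respect to the tuple `weights` = (r_1, ..., r_m)."""
    shifted = shift(poly, point)
    if not shifted:
        return float("inf")  # the zero polynomial has index +infinity
    return min(sum(Fraction(i, r) for i, r in zip(ks, weights)) for ks in shifted)


def multiply(p, q):
    """Product of two polynomials given in the dictionary representation."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}


# P = (X_1 - 1)^2 and Q = X_2 - 2, both viewed as polynomials in (X_1, X_2).
P = {(2, 0): 1, (1, 0): -2, (0, 0): 1}
Q = {(0, 1): 1, (0, 0): -2}
point, weights = (1, 2), (4, 2)
print(index(P, point, weights), index(Q, point, weights),
      index(multiply(P, Q), point, weights))
```

With these inputs the index of $PQ$ equals the sum of the indices of $P$ and $Q$, as the product rule above predicts.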

6.2 Set of Polynomials

In this section we define a set of polynomials in several variables and obtain an upper bound for the index of the polynomials in the set. Let us denote by $\mathcal{R}_m = \mathcal{R}_m(B; r_1, \ldots, r_m)$ the set of polynomials $R(X_1, \ldots, X_m)$ satisfying the following conditions.

1. $R(X_1, \ldots, X_m) \in \mathbb{Z}[X_1, \ldots, X_m]$ and $R \not\equiv 0$.
2. $R$ is of degree at most $r_j$ in $X_j$ for $1 \le j \le m$.
3. $H(R) \le B$.

Let $\psi_i = p_i/q_i$ with $q_i > 0$, $\gcd(p_i, q_i) = 1$ and $h(\psi_i) = q_i$ for $1 \le i \le m$. Let $\theta(R) = I_{R, r_1, \ldots, r_m}(\psi_1, \ldots, \psi_m)$. Define
$$\Theta_m = \Theta_m(B; q_1, \ldots, q_m; r_1, \ldots, r_m) = \sup \theta(R)$$
where the supremum is taken over all $R \in \mathcal{R}_m$ and all rational numbers $\psi_1, \ldots, \psi_m$ of heights $q_1, \ldots, q_m$, respectively. Our aim is to bound $\Theta_m$. We begin with the case $m = 1$.

Lemma 6.2.1 We have
$$\Theta_1(B; q_1; r_1) \le \frac{\log B}{r_1 \log q_1}.$$

Proof Let $\theta = \theta(R)$. By the definition of index, $R(X_1)$ is divisible by $(X_1 - \psi_1)^{r_1 \theta}$ and since $R(X_1) \in \mathbb{Z}[X_1]$ and $\gcd(p_1, q_1) = 1$, by Lemma 1.2.1, it is divisible by $(q_1 X_1 - p_1)^{r_1 \theta}$ and


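$$R(X_1) = (q_1 X_1 - p_1)^{r_1 \theta} Q(X_1)$$
for some $Q(X_1) \in \mathbb{Z}[X_1]$. Comparing leading coefficients, $q_1^{r_1 \theta}$ divides the leading coefficient of $R$, which is a non-zero integer of absolute value at most $H(R) \le B$. Thus
$$q_1^{r_1 \theta} \le B$$
which gives the result.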
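A small numerical check of Lemma 6.2.1 may help fix ideas. In the sketch below, the example polynomial $R(X) = (3X - 2)^2 (X + 1) = 9X^3 - 3X^2 - 8X + 4$ and the helper name `multiplicity` are our own choices. We take $r_1 = 3$, $\psi_1 = 2/3$ and $B = H(R) = 9$, so that $r_1 \theta$ is the multiplicity of the root $2/3$.

```python
from fractions import Fraction
from math import log


def multiplicity(coeffs, root):
    """Multiplicity of `root` in a non-zero polynomial (coefficients highest degree first)."""
    coeffs = [Fraction(c) for c in coeffs]
    root = Fraction(root)
    mult = 0
    while len(coeffs) > 1:
        # Synthetic division by (X - root); the last accumulated value is the remainder.
        quotient, acc = [], Fraction(0)
        for c in coeffs:
            acc = acc * root + c
            quotient.append(acc)
        if quotient.pop() != 0:
            break
        coeffs = quotient
        mult += 1
    return mult


R = [9, -3, -8, 4]                 # 9X^3 - 3X^2 - 8X + 4
p1, q1, r1 = 2, 3, 3
B = max(abs(c) for c in R)         # height of R
theta = Fraction(multiplicity(R, Fraction(p1, q1)), r1)

print(theta, q1 ** (r1 * theta) <= B, log(B) / (r1 * log(q1)))
```

Here $\theta = 2/3$ and $q_1^{r_1 \theta} = 9 = B$, so the bound of the lemma is attained in this example.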

Lemma 6.2.2 Let $p \ge 2$ be a positive integer. Let $r_1, \ldots, r_p$ be positive integers such that
$$r_p > \frac{10}{\delta}, \quad \frac{r_{j-1}}{r_j} > \frac{1}{\delta} \ \text{ for } 2 \le j \le p, \tag{6.2}$$
where $0 < \delta < 1$ and $q_1, \ldots, q_p$ are positive integers. Further for any integer $1 \le a \le r_p + 1$, let
$$\Phi_a = \Theta_1(M; q_p; a r_p) + \Theta_{p-1}(M; q_1, \ldots, q_{p-1}; a r_1, \ldots, a r_{p-1}) \tag{6.3}$$
where
$$M = M_a = (r_1 + 1)^{pa} \, 2^{r_1 p a} \, a! \, B^a. \tag{6.4}$$
Then there exists an integer $\ell$ with $1 \le \ell \le r_p + 1$ such that
$$\Theta_p(B; q_1, \ldots, q_p; r_1, \ldots, r_p) \le 2\left( \Phi_\ell + \Phi_\ell^{1/2} + \delta^{1/2} \right). \tag{6.5}$$

Proof Let R(X 1 , . . . , X p ) be any polynomial in the set R p (B; r1 , . . . , r p ) and let ψi = pi /qi with gcd( pi , qi ) = 1 and h(ψi ) = qi for 1 ≤ i ≤ p. We prove that θ = θ(R) satisfies (6.5). By Lemma 1.4.3 and the Remark following it, there exists an integer and a polynomial 0 ≡ F(X 1 , . . . , X p ) ∈ Z[X 1 , . . . , X p ] such that if ∂ ν 1 R , 0 ≤ μ, ν ≤ − 1, F(X 1 , . . . , X p ) = det μ ν! ∂ X p then F(X 1 , . . . , X p ) = U (X 1 , . . . , X p−1 )V (X p ) where U ∈ Z[X 1 , . . . , X p−1 ] with deg X j U ≤ r j , 1 ≤ j ≤ p − 1 and V (X p ) ∈ Z[X p ] with deg X p V ≤ r p . Further by (1.8), H (F) ≤ {(r1 + 1) · · · (r p + 1)} 2(r1 +···+r p ) ! B . An Upper Bound for the I F Since r1 > r2 > · · · > r p by (6.2), we get


H (F) ≤ M with a = . Also H (U ) < M, H (V ) < M. The polynomial U (X 1 , . . . , X p−1 ) is of degree at most r j in X j for 1 ≤ j ≤ p − 1. Hence U ∈ R p−1 (M; r1 , . . . , r p−1 ). Thus its index at (ψ1 , . . . , ψ p−1 ) relative to r1 , . . . , r p−1 is at most p−1 (M; q1 , . . . , q p−1 ; r1 , . . . , r p−1 ). Hence it follows that its index at (ψ1 , . . . , ψ p−1 ) relative to r1 , . . . , r p−1 is at most p−1 (M; q1 , . . . , q p−1 ; r1 , . . . , r p−1 ). Similarly V (X p ) ∈ R1 (M; r p ) and its index at ψ p relative to r p is at most 1 (M; q p ; r p ). Further by index property, I F = IU + I V ≤ .

(6.6)

A Lower Bound for I F The polynomial F is given by ∂ ν 1 F(X 1 , . . . , X p ) = β det μ R , 0 ≤ μ, ν ≤ − 1. ν! ∂ X p Hence F is a sum of ! terms and a typical term is of the form ∂ −1 1 ∂ 1 ± β(μ0 R) μ1 R · · · μ−1 R , 1! ∂ X p ( − 1)! ∂ X p

(6.7)

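where $\Delta_{\mu_0}, \ldots, \Delta_{\mu_{\ell-1}}$ are differential operators on $X_1, \ldots, X_{p-1}$ whose orders are at most $\ell - 1$. We shall determine a lower bound for the index of such a typical term. Let
$$\Delta = \frac{1}{i_1! \cdots i_{p-1}!} \left( \frac{\partial}{\partial X_1} \right)^{i_1} \cdots \left( \frac{\partial}{\partial X_{p-1}} \right)^{i_{p-1}}$$
be of order $w = i_1 + \cdots + i_{p-1} \le \ell - 1$. If
$$\Delta \, \frac{1}{\nu!} \left( \frac{\partial}{\partial X_p} \right)^{\nu} R(X_1, \ldots, X_p), \quad \nu \le \ell - 1,$$
does not vanish identically, its index at $(\psi_1, \ldots, \psi_p)$ relative to $r_1, \ldots, r_p$ is at least
$$\theta - \frac{i_1}{r_1} - \cdots - \frac{i_{p-1}}{r_{p-1}} - \frac{\nu}{r_p} \ge \theta - \frac{w}{r_{p-1}} - \frac{\nu}{r_p}.$$
By (6.2) and $\ell \le r_p + 1$, we get
$$\frac{w}{r_{p-1}} \le \frac{\ell - 1}{r_{p-1}} \le \frac{r_p}{r_{p-1}} < \delta.$$
Thus the index of a term as in (6.7), which does not vanish identically, is at least
$$\sum_{\nu=0}^{\ell-1} \max\left( 0, \theta - \frac{\nu}{r_p} \right) - \ell\delta.$$
Hence
$$I_F \ge \sum_{\nu=0}^{\ell-1} \max\left( 0, \theta - \frac{\nu}{r_p} \right) - \ell\delta.$$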

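Now we shall complete the proof of the lemma. Suppose $\theta r_p \le 10$. Then $\theta \le 10/r_p < \delta$ by (6.2), and (6.5) holds. Suppose now that $\theta r_p > 10$. Then $[\theta r_p]^2 > 2\theta^2 r_p^2/3$. If now $\theta r_p < \ell$, then we have
$$\sum_{\nu=0}^{\ell-1} \max\left( 0, \theta - \frac{\nu}{r_p} \right) = r_p^{-1} \sum_{\nu=0}^{[\theta r_p]} (\theta r_p - \nu) \ge \frac{[\theta r_p]^2}{2 r_p} \ge \frac{\theta^2 r_p}{3}.$$
If $\theta r_p \ge \ell$, then
$$\sum_{\nu=0}^{\ell-1} \max\left( 0, \theta - \frac{\nu}{r_p} \right) = \sum_{\nu=0}^{\ell-1} \left( \theta - \frac{\nu}{r_p} \right) \ge \frac{\ell\theta}{2}.$$
Thus in either case,
$$I_F \ge \min\left( \frac{\ell\theta}{2}, \frac{r_p \theta^2}{3} \right) - \ell\delta. \tag{6.8}$$
We combine (6.6) and (6.8) to obtain
$$\min\left( \frac{\ell\theta}{2}, \frac{r_p \theta^2}{3} \right) \le \ell(\Phi_\ell + \delta).$$
In the above inequality, suppose the minimum is $\ell\theta/2$. Then $\theta \le 2(\Phi_\ell + \delta)$ and hence (6.5) is satisfied. If the minimum is $r_p \theta^2/3$, then
$$\frac{r_p \theta^2}{3} \le \ell(\Phi_\ell + \delta) \le (r_p + 1)(\Phi_\ell + \delta) \le \frac{4 r_p (\Phi_\ell + \delta)}{3}.$$
Hence
$$\theta \le 2(\Phi_\ell + \delta)^{1/2} \le 2\left( \Phi_\ell^{1/2} + \delta^{1/2} \right)$$
which completes the proof.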

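In the next lemma we impose some growth conditions on $h(\psi_1) = q_1, \ldots, h(\psi_m) = q_m$ to obtain the following result.

Lemma 6.2.3 Let $m$ be a positive integer and assume that
$$0 < \delta < \frac{1}{m}. \tag{6.9}$$
Let $r_1, \ldots, r_m$ be positive integers such that
$$r_m > \frac{10}{\delta}, \quad \frac{r_{j-1}}{r_j} > \frac{1}{\delta} \ \text{ for } 2 \le j \le m. \tag{6.10}$$
Let $q_1, \ldots, q_m$ be positive integers such that
$$\log q_1 > m(2m + 1)/\delta, \tag{6.11}$$
and
$$r_j \log q_j \ge r_1 \log q_1 \ \text{ for } 2 \le j \le m. \tag{6.12}$$
Then
$$\Theta_m\left( q_1^{\delta r_1}; q_1, \ldots, q_m; r_1, \ldots, r_m \right) < 10^m \delta^{(1/2)^m}.$$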

Proof The proof is by induction on $m$. Let $m = 1$. By Lemma 6.2.1, (6.9) and (6.11), we get
$$\Theta_1\left( q_1^{\delta r_1}; q_1; r_1 \right) < \frac{\log\left( q_1^{\delta r_1} \right)}{r_1 \log q_1} = \delta \le 10 \delta^{1/2}$$


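giving the desired inequality. Next suppose that $p \ge 2$ is an integer and the lemma holds for $m = p - 1$. We prove the result for $m = p$. By (6.9) and (6.10), Lemma 6.2.2 is valid. We shall estimate $M$.

Estimate for M By (6.4)
$$M = (r_1 + 1)^{p\ell} \, 2^{r_1 p \ell} \, \ell! \, B^{\ell} \le (r_1 + 1)^{p\ell} \, 2^{r_1 p \ell} \, \ell^{\ell} \, q_1^{\delta r_1 \ell}$$
for some integer $\ell$ satisfying $\ell \le r_p + 1 < r_1 + 1 \le 2 r_1$. Hence
$$M < 2^{(2p+1) r_1 \ell} \, q_1^{\delta r_1 \ell} < e^{(2p+1) r_1 \ell} \, q_1^{\delta r_1 \ell}.$$
By (6.11), with $m = p$ we have $2p + 1 < \delta \log q_1 / p$, so that
$$M < q_1^{\delta_1 r_1 \ell} \quad \text{where } \delta_1 = \delta(1 + 1/p). \tag{6.13}$$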

Note also that by (6.9) with m = p we have δ1
δ satisfying (6.14) and replacing r1 , . . . , r p−1 by r1 , . . . , r p−1 . Hence


1/2 p−1 p−1 q1δ1 r1 ; q1 , . . . , q p−1 ; r1 , . . . , r p−1 < 10 p−1 δ1 . Since δ1 < 2δ by (6.13), from the estimates for 1 (M; q p ; r p ), p−1 (M; q1 , . . . , q p−1 ; r1 , . . . , r p−1 ) and (6.3), we get that

p−1 p−1 < 2δ + 2 10 p−1 δ 1/2 < 3 10 p−1 δ 1/2 . Final Estimate for p By (6.5), we get

p−1 p p q1δr1 ; q1 , . . . , q p ; r1 , . . . , r p < 2 3(10 p−1 δ 1/2 ) + 31/2 10( p−1)/2 δ 1/2 + δ 1/2 p 3 31/2 1 + 0. Let Sm (λ) be the set of integers ( j1 , . . . , jm ) satisfying the two conditions below. (i) 0 ≤ ji ≤ ri with 1 ≤ i ≤ m, (ii) Then

jm 1 j1 + ··· + ≤ (m − λ). r1 rm 2 |Sm (λ)| ≤ 2m 1/2 λ−1 (r1 + 1) · · · (rm + 1).

Proof The proof is by induction on $m$. Suppose $m = 1$. Then the number of integers $j_1$ satisfying $0 \le j_1 \le r_1$ and $j_1 \le \frac{1}{2}(1 - \lambda) r_1$ is at most $r_1 + 1$ if $\lambda \le 1$ and is $0$ if $\lambda > 1$, and hence the result is true for $m = 1$. Let $m > 1$. Suppose $\lambda \le 2 m^{1/2}$. Then
$$2 m^{1/2} \lambda^{-1} (r_1 + 1) \cdots (r_m + 1) \ge (r_1 + 1) \cdots (r_m + 1).$$
The product on the right-hand side is the total number of $m$-tuples satisfying (i). Hence the result is true in this case. Now we assume that $\lambda > 2 m^{1/2}$. Fix $j_m$. We count the number of $(m - 1)$-tuples $(j_1, \ldots, j_{m-1})$ such that

(i) $0 \le j_i \le r_i$ with $1 \le i \le m - 1$,

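(ii) $\dfrac{j_1}{r_1} + \cdots + \dfrac{j_{m-1}}{r_{m-1}} \le \dfrac{1}{2}\left( m - \lambda - \dfrac{2 j_m}{r_m} \right)$.

Put
$$\lambda' = \lambda'(j_m) = \lambda - 1 + \frac{2 j_m}{r_m}.$$
Then
$$|S_m(\lambda)| = \sum_{j_m = 0}^{r_m} |S_{m-1}(\lambda'(j_m))|.$$
Hence by induction hypothesis,
$$|S_m(\lambda)| \le 2 (m - 1)^{1/2} (r_1 + 1) \cdots (r_{m-1} + 1) \sum_{j=0}^{r_m} \left( \lambda - 1 + \frac{2j}{r_m} \right)^{-1}.$$
Thus it is enough to show that
$$\sum_{j=0}^{r} \left( \lambda - 1 + \frac{2j}{r} \right)^{-1} \le \lambda^{-1} (m - 1)^{-1/2} m^{1/2} (r + 1)$$
for all positive integers $r$ and $m$. Let $r$ be even. Substitute $j = \frac{r}{2} + k$. Then
$$\begin{aligned} \sum_{j=0}^{r} \left( \lambda - 1 + \frac{2j}{r} \right)^{-1} &= \sum_{k=-r/2}^{r/2} \left( \lambda + \frac{2k}{r} \right)^{-1} \\ &= \lambda^{-1} + \sum_{k=1}^{r/2} \left\{ \left( \lambda + \frac{2k}{r} \right)^{-1} + \left( \lambda - \frac{2k}{r} \right)^{-1} \right\} \\ &= \lambda^{-1} + 2\lambda \sum_{k=1}^{r/2} \left( \lambda^2 - \frac{4k^2}{r^2} \right)^{-1} \\ &\le \lambda^{-1} + 2\lambda \sum_{k=1}^{r/2} (\lambda^2 - 1)^{-1} = \lambda^{-1} + 2\lambda^{-1} \sum_{k=1}^{r/2} (1 - \lambda^{-2})^{-1} \\ &\le (r + 1) \lambda^{-1} (1 - \lambda^{-2})^{-1}. \end{aligned}$$
By our assumption on $\lambda$,
$$1 - \lambda^{-2} > 1 - \frac{1}{4m} > \left( 1 - \frac{1}{m} \right)^{1/2}.$$
Hence we get the desired inequality. Let now $r$ be odd. Put $j = (r - 1)/2 + k$. Then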


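$$\begin{aligned} \sum_{j=0}^{r} \left( \lambda - 1 + \frac{2j}{r} \right)^{-1} &= \sum_{k=-(r-1)/2}^{(r+1)/2} \left( \lambda + \frac{2k - 1}{r} \right)^{-1} \\ &= \sum_{k=1}^{(r+1)/2} \left\{ \left( \lambda + \frac{2k - 1}{r} \right)^{-1} + \left( \lambda - \frac{2k - 1}{r} \right)^{-1} \right\} \\ &= 2\lambda \sum_{k=1}^{(r+1)/2} \left( \lambda^2 - \frac{(2k - 1)^2}{r^2} \right)^{-1} \\ &\le (r + 1) \lambda (\lambda^2 - 1)^{-1} = (r + 1) \lambda^{-1} (1 - \lambda^{-2})^{-1}, \end{aligned}$$
and the result follows as before.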
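The counting bound of the lemma can be compared with an exhaustive count for small parameters. The sketch below is our own illustration (the function names `count_S` and `lemma_bound` are hypothetical): it enumerates all tuples with $0 \le j_i \le r_i$ and counts those satisfying $j_1/r_1 + \cdots + j_m/r_m \le \frac{1}{2}(m - \lambda)$, using exact rational arithmetic for the comparison.

```python
from fractions import Fraction
from itertools import product
from math import prod, sqrt


def count_S(r, lam):
    """Brute-force count of tuples j with 0 <= j_i <= r_i and sum(j_i/r_i) <= (m - lam)/2."""
    m = len(r)
    threshold = (Fraction(m) - Fraction(lam)) / 2
    return sum(
        1
        for j in product(*(range(ri + 1) for ri in r))
        if sum(Fraction(ji, ri) for ji, ri in zip(j, r)) <= threshold
    )


def lemma_bound(r, lam):
    """The bound 2 * m**(1/2) * lam**(-1) * (r_1 + 1) ... (r_m + 1)."""
    m = len(r)
    return 2 * sqrt(m) / lam * prod(ri + 1 for ri in r)


if __name__ == "__main__":
    r = (4, 4, 4)
    for lam in (1, 2, Fraction(5, 2)):
        print(lam, count_S(r, lam), lemma_bound(r, lam))
```

For small $m$ the bound is far from sharp; its strength lies in the factor $m^{1/2} \lambda^{-1}$ when $\lambda$ is a large multiple of $m^{1/2}$.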

6.4 The Approximation Polynomial

Let $\alpha$ be an algebraic integer of degree $n \ge 2$ with $h(\alpha) = A$. Note that $|\alpha| \le A + 1$ (see Chap. 1, Exercise 2).

Choice of Parameters We choose the values $m, \delta, q_1, \ldots, q_m, r_1, \ldots, r_m$ satisfying the following conditions.

P1. $0 < \delta < \dfrac{1}{m}$
P2. $10^m \delta^{(1/2)^m} + 2n(1 + 3\delta) m^{1/2} < \dfrac{m}{2}$
P3. $r_m > \dfrac{10}{\delta}$, $\dfrac{r_{j-1}}{r_j} > \dfrac{1}{\delta}$ for $2 \le j \le m$
P4. $\delta^2 \log q_1 > 2m + 1 + 4m \log(A + 1)$
P5. $r_j \log q_j \ge r_1 \log q_1$ for $2 \le j \le m$.

From P4 and P1, we see that $\log q_1 > (2m + 1)/\delta^2 > m(2m + 1)/\delta$. Thus (6.11) is valid. The conditions (6.9), (6.10) and (6.12) are listed as conditions P1, P3 and P5, respectively. Thus Lemma 6.2.3 is valid. We put

(i) $\lambda = 4n(1 + 3\delta) m^{1/2}$;
(ii) $\mu = \dfrac{m - \lambda}{2}$;
(iii) $\eta = 10^m \delta^{(1/2)^m}$;
(iv) $B_1 = [q_1^{\delta r_1}]$.


Using these notations, we see from the condition P2 above that
$$\eta < \mu; \quad q_1^{\delta r_1 / 2} < B_1.$$
Further by conditions P4 and P1, we also have
$$B_1 > 4; \quad B_1^{\delta} > 2^{m r_1}; \quad B_1^{\delta} > (A + 1)^{2 m r_1}. \tag{6.15}$$

The following lemma will be the main lemma to be used in the proof of Roth’s theorem. For any polynomial $P(X_1, \ldots, X_m) \in \mathbb{Z}[X_1, \ldots, X_m]$, we put
$$P_{i_1, \ldots, i_m} = P_{i_1, \ldots, i_m}(X_1, \ldots, X_m) = \frac{1}{i_1! \cdots i_m!} \left( \frac{\partial}{\partial X_1} \right)^{i_1} \cdots \left( \frac{\partial}{\partial X_m} \right)^{i_m} P$$
with integers $i_1 \ge 0, \ldots, i_m \ge 0$.

Lemma 6.4.1 Assume that conditions P1–P5 on the choice of the parameters are satisfied. Suppose that $\psi_i = p_i/q_i$ with $\gcd(p_i, q_i) = 1$, $h(\psi_i) = q_i$ for $1 \le i \le m$. Then there exists $Q(X_1, \ldots, X_m) \in \mathbb{Z}[X_1, \ldots, X_m]$ of degree at most $r_j$ in $X_j$ for $1 \le j \le m$ having the following properties.

(a) $I_{Q, r_1, \ldots, r_m}(\alpha, \ldots, \alpha) \ge \mu - \eta$.
(b) $Q(\psi_1, \ldots, \psi_m) \ne 0$.
(c) $|Q_{i_1, \ldots, i_m}(X_1, \ldots, X_m)| < B_1^{1 + 3\delta} (1 + |X_1|)^{r_1} \cdots (1 + |X_m|)^{r_m}$.

Proof Estimation for $|P_{j_1, \ldots, j_m}(\alpha, \ldots, \alpha)|$. Consider

$$P(X_1, \ldots, X_m) = \sum_{s_1 = 0}^{r_1} \cdots \sum_{s_m = 0}^{r_m} \gamma(s_1, \ldots, s_m) X_1^{s_1} \cdots X_m^{s_m} \in \mathbb{Z}[X_1, \ldots, X_m]$$
with $0 \le \gamma(s_1, \ldots, s_m) \le B_1$. Let $(r_1 + 1) \cdots (r_m + 1) = r$. Then there are $(B_1 + 1)^r$ distinct polynomials $P(X_1, \ldots, X_m)$. Note that
$$P_{j_1, \ldots, j_m}(X_1, \ldots, X_m) = \sum_{s_1 = j_1}^{r_1} \cdots \sum_{s_m = j_m}^{r_m} \gamma(s_1, \ldots, s_m) \binom{s_1}{j_1} \cdots \binom{s_m}{j_m} X_1^{s_1 - j_1} \cdots X_m^{s_m - j_m}.$$
Then by (6.15), we have


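$$H(P_{j_1, \ldots, j_m}) \le 2^{r_1 + \cdots + r_m} B_1 \le 2^{m r_1} B_1 < B_1^{1 + \delta}.$$
Also
$$A^{r_1 + \cdots + r_m} \le (A + 1)^{m r_1} < B_1^{\delta}.$$
Hence
$$|P_{j_1, \ldots, j_m}(\alpha, \ldots, \alpha)| \le (r_1 + 1) \cdots (r_m + 1) B_1^{1 + \delta} A^{r_1 + \cdots + r_m} \le (2A)^{m r_1} B_1^{1 + \delta} \le B_1^{1 + 3\delta}.$$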

Application of Pigeonhole Principle Let $L = \mathbb{Q}(\alpha)$ and order the conjugates of $\alpha$ as follows. Let $\alpha_1, \ldots, \alpha_{\rho_1}$ be real and $\alpha_{\rho_1 + \nu}$ and $\alpha_{\rho_1 + \rho_2 + \nu}$ be complex conjugates for $1 \le \nu \le \rho_2$ so that $\rho_1 + 2\rho_2 = n$. Let us fix $\varphi = P_{j_1, \ldots, j_m}(\alpha, \ldots, \alpha)$ for some $(j_1, \ldots, j_m)$ satisfying
$$0 \le j_1 \le r_1, \ldots, 0 \le j_m \le r_m, \quad \frac{j_1}{r_1} + \cdots + \frac{j_m}{r_m} \le \mu. \tag{6.16}$$
Then $\varphi$ is a polynomial in $\alpha$ with rational coefficients and let its conjugates be denoted by $\varphi^{(\nu)}$, $1 \le \nu \le n$. Define $n$ real numbers by the equations
$$\varphi_\nu = \varphi^{(\nu)}, \ 1 \le \nu \le \rho_1, \qquad \varphi_\nu + i \varphi_{\nu + \rho_2} = \varphi^{(\nu)}, \ \rho_1 + 1 \le \nu \le \rho_1 + \rho_2.$$
Let us consider the tuple $(\varphi_1, \ldots, \varphi_n)$ for $(j_1, \ldots, j_m)$ satisfying (6.16). By Lemma 6.3.1, there are $M \le 2 n m^{1/2} r / \lambda$ tuples, and each element of the tuple has its absolute value bounded by $B_1^{1 + 3\delta} = t$. Hence all the tuples lie in a cube of edge $2t$ in an $M$-dimensional space. Dividing each edge into $3t$ equal parts we get $(3t)^M$ subcubes of edge $2/3$. There are $(B_1 + 1)^r$ distinct polynomials. As $B_1 > 4$ by (6.15), we have $(B_1 + 1)^r > (3t)^M$. Hence the points corresponding to two different polynomials $P^*$ and $P^{**}$ in the variables $X_1, \ldots, X_m$ lie in the same subcube, and taking
$$P(X_1, \ldots, X_m) = P^*(X_1, \ldots, X_m) - P^{**}(X_1, \ldots, X_m),$$


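we get
$$|P_{j_1, \ldots, j_m}(\alpha, \ldots, \alpha)| \le \sqrt{2} \times 2/3 < 1$$
for every $(j_1, \ldots, j_m)$ satisfying (6.16), and the same bound holds for each conjugate of $P_{j_1, \ldots, j_m}(\alpha, \ldots, \alpha)$. Since $P_{j_1, \ldots, j_m}(\alpha, \ldots, \alpha)$ is an algebraic integer, it must be zero. Hence $I_P(\alpha, \ldots, \alpha) \ge \mu$. Further the coefficients of $P$ are not all zero and $H(P) \le B_1$.

Application of Lemma 6.2.3 Note that $P \in \mathcal{R}_m(q_1^{\delta r_1}; r_1, \ldots, r_m)$ since $P^*$ and $P^{**}$ have integer coefficients in $[0, B_1]$. Hence by Lemma 6.2.3, we have $I_{P, r_1, \ldots, r_m}(\psi_1, \ldots, \psi_m) < \eta$. Therefore there exists $Q(X_1, \ldots, X_m)$ given by
$$Q(X_1, \ldots, X_m) = \frac{1}{k_1! \cdots k_m!} \left( \frac{\partial}{\partial X_1} \right)^{k_1} \cdots \left( \frac{\partial}{\partial X_m} \right)^{k_m} P,$$
with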

km k1 + ··· + 2 the inequality α − p < 1 (6.17) q qκ has only finitely many solutions in p/q. Proof We may take α as an algebraic integer. From now on we assume that (6.17) holds for infinitely many reduced rationals p/q with κ = 2 + for some > 0. See (a)–(e) in Sect. 6.1 for justifying these assumptions. Choice of m and δ Take κ = 2 + with > 0. Choose m such that m > 16n 2 Thus m > 4nm 1/2 and κ>

κ κ−2

2 .

2m . m − 4nm 1/2

Note that $\eta = 10^m \delta^{(1/2)^m}$ becomes arbitrarily small with $\delta$. Hence for sufficiently small $\delta$
$$m - 4n(1 + 3\delta) m^{1/2} - 2\eta > 0.$$
This is the same as condition P2 in Sect. 6.4. We also choose $\delta$ so that P1 is satisfied and
$$\kappa (m - 4 n m^{1/2}) - 2m \ge \kappa (12 \delta n m^{1/2} + 2\eta) + (2m + 4 + 10\delta)\delta.$$
This gives
$$\kappa > \frac{2m(1 + \delta) + 2\delta(2 + 5\delta)}{m - 4(1 + 3\delta) n m^{1/2} - 2\eta}$$
which is equivalent to
$$\kappa > \frac{m(1 + \delta) + \delta(2 + 5\delta)}{\mu - \eta}. \tag{6.18}$$

Choice of $q_1, \ldots, q_m$ First choose a solution $p_1/q_1$ of (6.17) with $q_1$ large so that condition P4 is satisfied. Then choose $p_2/q_2, \ldots, p_m/q_m$ such that
$$\frac{\log q_j}{\log q_{j-1}} > \frac{2}{\delta}, \quad 2 \le j \le m.$$


Choice of $r_1, \ldots, r_m$ Take $r_1$ to be any integer such that
$$r_1 > \frac{10 \log q_m}{\delta \log q_1} \tag{6.19}$$
and define $r_j$ by
$$\frac{r_1 \log q_1}{\log q_j} \le r_j < \frac{r_1 \log q_1}{\log q_j} + 1, \quad 2 \le j \le m. \tag{6.20}$$

Then condition P5 is satisfied. Also r j log q j log q j log qm δ

rj log q j−1

δ −1 1+ > δ −1 . 10

Hence P3 is also satisfied. By the above choices of $m, \delta, q_1, \ldots, q_m, r_1, \ldots, r_m$, all the conditions P1–P5 are satisfied. So Lemma 6.4.1 holds.

Application of Lemma 6.4.1 By Lemma 6.4.1, there exists a polynomial $Q(X_1, \ldots, X_m)$ satisfying the properties (a)–(c) listed therein. Thus for a set of $m$ reduced rationals $p_1/q_1, \ldots, p_m/q_m$, the number
$$\varphi_0 = Q\left( \frac{p_1}{q_1}, \ldots, \frac{p_m}{q_m} \right) \ne 0$$
and is in $\mathbb{Q}$, and hence there exist $k_1, \ldots, k_m \in \mathbb{N}$ such that
$$|k_1^{r_1} \cdots k_m^{r_m} \varphi_0| \ge 1.$$
On the other hand, we have
$$Q\left( \frac{p_1}{q_1}, \ldots, \frac{p_m}{q_m} \right) = \sum_{i_1 = 0}^{r_1} \cdots \sum_{i_m = 0}^{r_m} Q_{i_1, \ldots, i_m}(\alpha, \ldots, \alpha) \left( \frac{p_1}{q_1} - \alpha \right)^{i_1} \cdots \left( \frac{p_m}{q_m} - \alpha \right)^{i_m}$$

6.5 Statement and Proof of Roth’s Theorem

and the terms with


i1 im + ··· + q1 by (6.20). Hence by Lemma 6.4.1(c) and (6.15), we get −r1 (μ−η)κ

|φ0 | < (r1 + 1) · · · (rm + 1)B11+3δ (A + 1)mr1 q1 −r1 (μ−η)κ

< B11+5δ q1

.

Thus −r1 (μ−η)κ

1 ≤ |k1r1 · · · kmrm φ0 | ≤ B11+5δ q1

m

qiri

i=1 δr1 (1+5δ)+r1 +···+rm −r1 (μ−η)κ q1 δr (1+5δ)+r1 m(1+δ)−r1 (μ−η)κ q1 1

< < since r j < δr j−1 for 2 ≤ j ≤ m. Hence

δ(1 + 5δ) + m(1 + δ) > (μ − η)κ which gives κ
2. Here $[x]$ denotes the greatest integer in $x$. (2) Let $k \in \mathbb{N}$. Suppose $3^k = 2^k q + r$; then show that $0 < r < 2^k - q$ for $k$ sufficiently large. (Hint: Use Theorem 6.5.2 below.)
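Before attempting exercise (2), it can be instructive to tabulate the quantities involved for small $k$. The sketch below is only a numerical illustration (the function name `quotient_and_remainder` is ours, and we assume $0 \le r < 2^k$ as in ordinary division). It says nothing about large $k$, which is where Theorem 6.5.2 is needed.

```python
def quotient_and_remainder(k):
    """Return (q, r) with 3**k == 2**k * q + r and 0 <= r < 2**k."""
    return divmod(3**k, 2**k)


if __name__ == "__main__":
    for k in range(1, 21):
        q, r = quotient_and_remainder(k)
        # The last column records whether 0 < r < 2**k - q holds for this k.
        print(k, q, r, 0 < r < 2**k - q)
```

Note that the inequality fails for $k = 1$, where $q = r = 1$, which is consistent with the restriction to sufficiently large $k$.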


Notes In 1957, Ridout [3] obtained an extension of Roth’s theorem which is as follows.

Theorem 6.5.2 Let $\alpha$ be a non-zero algebraic number and let $p_1, \ldots, p_r, q_1, \ldots, q_s$ be distinct prime numbers. Let $\mu, \nu$ and $c$ be real numbers with $0 \le \mu, \nu \le 1$ and $c > 0$. Let $p$ and $q$ be integers of the form
$$p = p^* p_1^{a_1} \cdots p_r^{a_r}, \quad q = q^* q_1^{b_1} \cdots q_s^{b_s}$$
where the $a_i$ and $b_j$ are non-negative integers and $p^*$ and $q^*$ are non-zero integers satisfying
$$|p^*| \le c p^{\mu}; \quad |q^*| \le c q^{\nu}.$$
Further suppose that $\kappa > \mu + \nu$. Then there are only a finite number of solutions to (5.5).

We recover Roth’s theorem by taking $\mu = \nu = c = 1$. Roth’s theorem and Ridout’s theorem can be applied to establish that $\alpha$ is transcendental if it admits infinitely many very good rational approximants. For instance, the transcendence of the Champernowne number $0.1234567891011\ldots$ can be proven using these theorems. For this and other results, see [4] and references given therein. In Roth’s theorem, it is believable that the factor $q^{\epsilon}$ could be replaced by a smaller factor. By Theorem 5.1.3 of Khintchine, we know that for almost all $\alpha \in \mathbb{R}$
$$\left| \alpha - \frac{p}{q} \right| < \frac{1}{q^2 (\log q)^{1 + \epsilon}}$$
has only finitely many solutions for every $\epsilon > 0$. It was conjectured by Lang [5] that the same conclusion holds for all algebraic irrationals. In 1959, Cugiani [6] showed that if the rational numbers $\frac{p(1)}{q(1)}, \frac{p(2)}{q(2)}, \ldots$ are solutions of
$$\left| \alpha - \frac{p}{q} \right| < \frac{1}{q^{2 + 20 (\log \log \log q)^{-1/2}}}$$
with $0 < q(1) < q(2) < \cdots$, then
$$\limsup \frac{\log q(k + 1)}{\log q(k)} = \infty.$$
Similar results were proved earlier by Siegel and Schneider. For analogues of Roth’s theorem over function fields, see the article of Thakur [7].


References

1. K.F. Roth, Rational approximations to algebraic numbers. Mathematika 2, 1–20 (1955). Corrigendum 2, 168 (1955)
2. W.J. LeVeque, Topics in Number Theory, Vol I and II (Dover Publication Inc, New York, 1984)
3. D. Ridout, Rational approximations to algebraic numbers. Mathematika 4, 125–131 (1957)
4. Y. Bugeaud, Approximation by Algebraic Numbers (Cambridge University Press, Cambridge, 2004), 274 pp
5. S. Lang, Report on diophantine approximations. Bull. de la Soc. Math. de France 93, 117–192 (1965)
6. M. Cugiani, Sulla approssimabilitá dei numeri algebrici mediante numeri razionali. Ann. Mat. Pura Appl. (4) 48, 135–145 (1959)
7. D.S. Thakur, Diophantine Approximation and Transcendence in Finite Characteristic. Diophantine Equations, ed. by N. Saradha (Narosa Publishing House, New Delhi, 2005), pp. 265–278

Chapter 7

Baker’s Theorems and Applications

The cave you fear to enter holds the treasure you seek –Joseph Campbell

Gelfond and Schneider’s theorem can be restated as follows. For any algebraic number $\alpha \ne 0, 1$, the logarithm of $\alpha$ to any algebraic base other than 0 or 1 is either rational or transcendental. Gelfond, by a refinement of his method, obtained a positive lower bound for the absolute value of $\beta_1 \log \alpha_1 + \beta_2 \log \alpha_2$ where $\beta_1, \beta_2$ denote algebraic numbers not both 0, and $\alpha_1, \alpha_2$ denote algebraic numbers not 0 or 1, with $\log \alpha_1 / \log \alpha_2$ irrational. Gelfond also remarked that an analogous theorem for linear forms in arbitrarily many logarithms of algebraic numbers would be of great value for the solution of some apparently very difficult problems in number theory. In 1966–68, Baker established such a result. See his papers [1] and his prize-winning book [2]. Corollaries 7.2.1, 7.2.2 and 7.2.3 resolve the multidimensional analogue of Hilbert’s seventh problem. We have chosen to give as applications some results on Pillai’s equation, the growth of the greatest prime factor of polynomial values and an effective version of Thue’s theorem; see Sects. 7.3 and 7.4. At the ICM in Nice in 1970, Baker was awarded the Fields Medal for his contributions to linear forms in the logarithms of algebraic numbers and their applications to various problems in number theory.

© Springer Nature Singapore Pte Ltd. 2020 S. Natarajan and R. Thangadurai, Pillars of Transcendental Number Theory, https://doi.org/10.1007/978-981-15-4155-1_7


7.1 Statement of Baker’s Theorems

Define the complex logarithm by $\log z = \log |z| + i \arg z$ with $-\pi < \arg z \le \pi$. In fact one may take any branch of the logarithm. We denote by $c_{7.m} = c_{7.m}(\cdots)$, $m \ge 1$, effectively computable positive numbers and we will specify within brackets the parameters on which the number depends.

Theorem 7.1.1 Let $\alpha_1, \ldots, \alpha_n \in \mathbb{A} \setminus \{0, 1\}$, $\beta_0 \in \mathbb{A}$ and $\beta_1, \ldots, \beta_n \in \mathbb{A} \setminus \{0\}$. Assume that $\log \alpha_1, \ldots, \log \alpha_n$ are linearly independent over $\mathbb{Q}$. Then
$$\Lambda := \beta_0 + \beta_1 \log \alpha_1 + \cdots + \beta_n \log \alpha_n \ne 0.$$
In other words, $1, \log \alpha_1, \ldots, \log \alpha_n$ are linearly independent over $\mathbb{A}$.

Once non-vanishing of $\Lambda$ is known, one would like to find a non-trivial lower bound for $|\Lambda|$. Such a bound proved to have several applications in Diophantine equations. We shall present some such results and give a few applications. The following is a quantitative version of the above theorem, proved by Baker himself in 1975.

Theorem 7.1.2 Let $\alpha_1, \ldots, \alpha_n \in \mathbb{A} \setminus \{0, 1\}$ and $\beta_0, \beta_1, \ldots, \beta_n \in \mathbb{A}$. Assume that $\Lambda \ne 0$. Then
$$|\Lambda| \ge (eB)^{-c_{7.1}}$$
where $B = \max(h(\beta_0), h(\beta_1), \ldots, h(\beta_n))$ and $c_{7.1} = c_{7.1}(n, \alpha_1, \ldots, \alpha_n)$.

In 1977, Baker proved a more explicit bound as follows.

Theorem 7.1.3 Let $\alpha_1, \ldots, \alpha_n \in \mathbb{A} \setminus \{0, 1\}$. Let the field $\mathbb{Q}(\alpha_1, \ldots, \alpha_n)$ have degree at most $d$ over $\mathbb{Q}$. Let $h(\alpha_j) \le A_j$ with $A_j \ge 4$ for $1 \le j \le n$. Put
$$A = \max(A_j), \quad \Omega = (\log A_1) \cdots (\log A_n), \quad \Omega' = (\log A_1) \cdots (\log A_{n-1}).$$
Let $\beta_0, \beta_1, \ldots, \beta_n \in \mathbb{Z}$ with $eB \ge 4$. Assume that $\Lambda \ne 0$. Then
$$\log |\Lambda| \ge -(16nd)^{200n} (\log B) \, \Omega \log \Omega'.$$

Although completely explicit, the above bound has some drawbacks. The factor $(16nd)^{200n}$ is very large. It is expected that it can be replaced by a polynomial expression in $n$ and $d$, and the constants can be greatly reduced. The product of the logarithms in $\Omega$ was expected to be replaced by the sum of the logarithms. After the efforts of several authors, Baker and Wüstholz [3] proved the following improved result in 1993.


Theorem 7.1.4 Under the conditions of Theorem 7.1.3, we have either $\Lambda = 0$ or
$$\log |\Lambda| \ge -(16nd)^{2(n+2)} \, \Omega \, (\log B).$$

For many applications, only two or three logarithms occur. The best results in these cases were obtained by Laurent, Mignotte and Nesterenko and by Bennett et al.; see [4] and [5]. Lastly, we state the result of Matveev [6] from 2000, which is the best known so far.

Theorem 7.1.5 Let $\alpha_1, \ldots, \alpha_n \in \mathbb{A} \setminus \{0, 1\}$. Let $h^{\circ}(\alpha_j)$, $1 \le j \le n$, denote the absolute logarithmic height. Let $K = \mathbb{Q}(\alpha_1, \ldots, \alpha_n)$ have degree at most $d$ over $\mathbb{Q}$. Let
$$\kappa = \begin{cases} 1 & \text{if } K \subset \mathbb{R}, \\ 2 & \text{if } K \subset \mathbb{C}. \end{cases}$$
Consider
$$\Lambda = b_1 \log \alpha_1 + \cdots + b_n \log \alpha_n \quad \text{with } b_j \in \mathbb{Z} \text{ for } 1 \le j \le n.$$
Put $B = \max(|b_1|, \ldots, |b_n|)$ and let $A_j$ be real numbers such that
$$A_j \ge \max(d h^{\circ}(\alpha_j), |\log \alpha_j|, 0.16) \quad \text{for } 1 \le j \le n.$$
Then either $\Lambda = 0$ or
$$\log |\Lambda| \ge -c_{7.2} \, d^2 A_1 \cdots A_n \log(ed) \log(eB)$$
where
$$c_{7.2} = \min\left( \kappa^{-1} (en/2)^{\kappa} \, 30^{n+3} n^{3.5}, \ 2^{6n + 20} \right).$$

There are innumerable papers in the literature giving various applications of the above effective results. We shall give a few results here. The reader may see the books of Bugeaud [7], Shorey and Tijdeman [8] and the references therein for many other results.

7.2 Applications of the Qualitative Result—Theorem 7.1.1

Theorem 7.1.1 with $n = 1$ is the Hermite–Lindemann–Weierstrass theorem and with $n = 2$ is the Gelfond–Schneider theorem. We derive some more results in the following corollaries.

Corollary 7.2.1 Let $\alpha_1, \ldots, \alpha_n \in \mathbb{A} \setminus \{0\}$ and $\beta_1, \ldots, \beta_n \in \mathbb{A}$. Then


$$\beta_1 \log \alpha_1 + \cdots + \beta_n \log \alpha_n$$
is either zero or transcendental.

Proof When $n = 1$, the statement is the Hermite–Lindemann–Weierstrass theorem. We prove the corollary by induction on $n$. We assume that the corollary is true for fewer than $m$ logarithms and prove it for $n = m$. Suppose the assertion does not hold for $n = m$. Then there exist $\beta_0, \beta_1, \ldots, \beta_m \in \mathbb{A}$ such that
$$\beta_1 \log \alpha_1 + \cdots + \beta_m \log \alpha_m = \beta_0 \tag{7.1}$$
with $\beta_0 \ne 0$. Then by Theorem 7.1.1, $\log \alpha_1, \ldots, \log \alpha_m$ are linearly dependent over $\mathbb{Q}$. Hence there exist rational numbers $c_1, \ldots, c_m$, not all zero, such that
$$c_1 \log \alpha_1 + \cdots + c_m \log \alpha_m = 0.$$
We may assume without loss of generality that $c_m \ne 0$. Using this equation along with (7.1), we eliminate $\log \alpha_m$ to get a linear form
$$\beta'_1 \log \alpha_1 + \cdots + \beta'_{m-1} \log \alpha_{m-1} = c_m \beta_0, \quad \beta'_i = c_m \beta_i - c_i \beta_m,$$
where $c_m \beta_0 \ne 0$ and algebraic. By the induction hypothesis, the left-hand side is either zero or transcendental, whereas the right-hand side is a non-zero algebraic number, which is a contradiction. This proves the corollary.

As a consequence of Corollary 7.2.1 we show the following result.

Corollary 7.2.2 The number

$$e^{\beta_0} \alpha_1^{\beta_1} \cdots \alpha_n^{\beta_n}$$
is transcendental for $\alpha_1, \ldots, \alpha_n, \beta_0, \beta_1, \ldots, \beta_n \in \mathbb{A} \setminus \{0\}$.

Proof Suppose $e^{\beta_0} \alpha_1^{\beta_1} \cdots \alpha_n^{\beta_n}$ equals an algebraic number $\alpha_{n+1}$. Then $\alpha_{n+1} \ne 0$ and
$$\beta_1 \log \alpha_1 + \cdots + \beta_n \log \alpha_n - \log \alpha_{n+1} = -\beta_0$$
with $\beta_0 \ne 0$. This contradicts Corollary 7.2.1.

The corollary implies that numbers like $e \cdot 2^{\sqrt{2}}$ and $\pi + \log \alpha$ for any non-zero algebraic $\alpha$ are transcendental. The following is a homogeneous version ($\beta_0 = 0$) of the theorem.

Corollary 7.2.3 Let $\alpha_1, \ldots, \alpha_n \in \mathbb{A} \setminus \{0, 1\}$ and $\beta_1, \ldots, \beta_n$ be algebraic numbers with $1, \beta_1, \ldots, \beta_n$ linearly independent over the rationals. Then $\alpha_1^{\beta_1} \cdots \alpha_n^{\beta_n}$ is transcendental.

Proof It suffices to show that for any $\alpha_1, \ldots, \alpha_n \in \mathbb{A} \setminus \{0, 1\}$ and $\mathbb{Q}$-linearly independent $\beta_1, \ldots, \beta_n$,
$$\beta_1 \log \alpha_1 + \cdots + \beta_n \log \alpha_n \ne 0. \tag{7.2}$$
For suppose $\alpha_1^{\beta_1} \cdots \alpha_n^{\beta_n}$ were algebraic, say $\alpha_{n+1}$. Then
$$\beta_1 \log \alpha_1 + \cdots + \beta_n \log \alpha_n - \log \alpha_{n+1} = 0$$
and by hypothesis, $1, \beta_1, \ldots, \beta_n$ are $\mathbb{Q}$-linearly independent, which contradicts (7.2) with $n$ replaced by $n + 1$. Thus we need to prove (7.2), which we do by induction on $n$. We see that (7.2) is true for $n = 1$. We assume that (7.2) is true for any $m < n$. If $\log \alpha_1, \ldots, \log \alpha_n$ are $\mathbb{Q}$-linearly independent, then the result follows from Theorem 7.1.1. Hence we may assume that $\log \alpha_1, \ldots, \log \alpha_n$ are $\mathbb{Q}$-linearly dependent. Then there are rational numbers $c_1, \ldots, c_n$, not all zero, such that
$$c_1 \log \alpha_1 + \cdots + c_n \log \alpha_n = 0.$$
Let us assume without loss of generality that $c_n \ne 0$. Suppose (7.2) does not hold for $n$. Then together with the above equality, we get
$$(c_n \beta_1 - c_1 \beta_n) \log \alpha_1 + \cdots + (c_n \beta_{n-1} - c_{n-1} \beta_n) \log \alpha_{n-1} = 0.$$
Thus to conclude the proof, it is enough to show that the $(n - 1)$ algebraic numbers $(c_n \beta_1 - c_1 \beta_n), \ldots, (c_n \beta_{n-1} - c_{n-1} \beta_n)$ are linearly independent over $\mathbb{Q}$. Suppose not. Then there exist rational numbers $A_1, \ldots, A_{n-1}$, not all zero, such that
$$A_1 (c_n \beta_1 - c_1 \beta_n) + \cdots + A_{n-1} (c_n \beta_{n-1} - c_{n-1} \beta_n) = 0$$
which gives
$$c_n A_1 \beta_1 + \cdots + c_n A_{n-1} \beta_{n-1} - (A_1 c_1 + \cdots + A_{n-1} c_{n-1}) \beta_n = 0.$$
Since $\beta_1, \ldots, \beta_n$ are $\mathbb{Q}$-linearly independent and $c_n \ne 0$, we deduce that $A_1 = \cdots = A_{n-1} = 0$, which is a contradiction.

7.3 Applications of the Quantitative Result—Theorem 7.1.2

We shall restrict to the case when $\beta_0 = 0$ and $\beta_i \in \mathbb{Z}$ for $1 \le i \le n$.

Corollary 7.3.1 Let $\alpha_1, \ldots, \alpha_n \in \mathbb{A} \setminus \{0, 1\}$ and $b_1, \ldots, b_n \in \mathbb{Z}$ such that $\alpha_1^{b_1} \cdots \alpha_n^{b_n} \ne 1$. Then

112

7 Baker’s Theorems and Applications

|α1b1 · · · αnbn − 1| ≥ (eB)−c7.3 where B = max(|b1 |, . . . , |bn |) and c7.3 = c7.3 (n, α1 , . . . , αn ). Proof In the proof, c7.4 , c7.5 , c7.6 depend on n, α1 , . . . , αn . For any complex number z, we take log z = log |z| + i arg z with −π < arg z ≤ π. Then log(1 + z) =

∞ (−1)n−1

n

i=1

z n for z ∈ C with |z| < 1.

Thus when |z| ≤ 1/2 we get | log(1 + z)| ≤ |z|(1 + |z| + |z|2 + · · · ) ≤ 2|z|. We take z = α1b1 · · · αnbn − 1. If |z| > 1/2, the assertion of the corollary follows. So we may assume that |z| ≤ 1/2. Now log(1 + z) = b1 log α1 + · · · + bn log αn + 2kπi = b1 log α1 + · · · + bn log αn + 2k log(−1)

for some k ∈ Z as log(−1) = πi. We apply Theorem 7.1.2 to get | log(1 + z)| ≥ (e max(B, |2k|))−c7.4 . As | log(1 + z)| ≤ 2|z| ≤ 1, we have |2kπi| ≤ 1 +

n

| log αi ||bi | ≤ 1 +

i=1

n

| log αi | B.

i=1

Hence |2k| ≤ c7.5 B and we get | log(1 + z)| ≥ (ec7.5 B)−c7.4 . Again using | log(1 + z)| ≤ 2|z| we find |z| ≥ (eB)−c7.6 . We shall see the implication of the quantitative result for the following well-known conjecture of Pillai. Conjecture 7.1 Let a, b ∈ Z and k ∈ Z\{0} be given. Then there exists a positive number c7.7 = c7.7 (a, b, k) such that the equation ax m − by n = k in integer s x > 1, y > 1, m > 1, n > 1 with mn ≥ 6

(7.3)

implies that max(x, y, m, n) ≤ c7.7 . This conjecture is still open although many special cases are known. We shall consider some cases.
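To get a feel for equation (7.3), the following is a minimal brute-force sketch that lists its solutions with $x^m$ and $y^n$ below a chosen bound. The function name `pillai_solutions` and the parameter `bound` are illustrative assumptions, not part of the text, and a finite search of this kind of course says nothing about the conjecture itself.

```python
def pillai_solutions(a, b, k, bound):
    """List all (x, y, m, n) with a*x**m - b*y**n == k, where
    x, y, m, n > 1, m*n >= 6 and both x**m and y**n are <= bound.
    (Hypothetical helper for experimentation only.)"""
    if b == 0:
        return []
    # Map every perfect power <= bound to its representations (base, exponent).
    powers = {}
    base = 2
    while base * base <= bound:
        value, exponent = base * base, 2
        while value <= bound:
            powers.setdefault(value, []).append((base, exponent))
            value *= base
            exponent += 1
        base += 1
    solutions = []
    for xm, reps_x in powers.items():
        rhs = a * xm - k              # this must equal b * y**n
        if rhs % b != 0:
            continue
        yn = rhs // b
        for (x, m) in reps_x:
            for (y, n) in powers.get(yn, []):
                if m * n >= 6:
                    solutions.append((x, y, m, n))
    return sorted(solutions)


if __name__ == "__main__":
    # a = b = 1, k = 1 is the Catalan equation x^m - y^n = 1.
    print(pillai_solutions(1, 1, 1, 10**6))
```

With $a = b = k = 1$ the search space contains the classical solution $3^2 - 2^3 = 1$; other values of $k$ can be explored the same way.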

Case 1 Let equation (7.3) hold with $a = b = 1$. Then $\max(m, n) \le c_{7.8}(x, y, k)$.

Proof Let $B = \max(m, n)$. We may assume without loss of generality that $x^m \ge y^n$. Then by Corollary 7.3.1, we have

\[ |1 - x^{-m} y^n| \ge (eB)^{-c_{7.9}} \]

with $c_{7.9} = c_{7.9}(x, y)$. Hence

\[ |x^m - y^n| \ge \frac{x^m}{(eB)^{c_{7.9}}}. \]

Since $x \ge 2$, $y \ge 2$, we have $x^m \ge 2^B$. Thus

\[ |k| = |x^m - y^n| \ge \frac{2^B}{(eB)^{c_{7.9}}}, \]

giving $2^B \le c_{7.10} B^{c_{7.9}}$ with $c_{7.10} = c_{7.10}(x, y, k)$. Thus $B \le c_{7.11}(x, y, k)$.

In particular, the equation $3^m - 2^n = 1$ has only finitely many solutions in integers $m > 1$, $n > 1$.

Case 2 Let equation (7.3) hold with $n = m$ and $k > 1$. Then $m \le c_{7.12}(a, b, k)$.

Proof Note that we may assume that $a \neq 0$, $b \neq 0$. First let $a > 0$, $b < 0$. Let $b = -b'$ with $b' > 0$. Then (7.3) becomes

\[ a x^m + b' y^m = k, \]

which implies that $a x^m \le k$ and $b' y^m \le k$. Thus $m, x, y$ are all bounded by, say, $k$, which proves the statement in this case. The case $a < 0$, $b > 0$ is similar.

Let us now consider $a > 0$, $b > 0$. We shall assume $x \ge y$. The case $y \ge x$ is similar. Now

\[ |k| = |a x^m - b y^m| = |a x^m| \left| \frac{b}{a} \Bigl(\frac{y}{x}\Bigr)^m - 1 \right| = a x^m |e^z - 1| \]

where $z = \log(b/a) + m \log(y/x)$. We may assume without loss of generality that $|e^z - 1| < 1/2$. Otherwise, we have $2^m \le x^m \le 2|k|/a$, proving the statement. As a general fact, we claim that there exists an absolute constant $c_{7.13}$ such that

\[ |e^z - 1| > c_{7.13} |z| \quad \text{whenever } |e^z - 1| < 1/2. \tag{7.4} \]

To see this, we note that

\[ e^{\Re z} = |e^z| = |e^z - 1 + 1| \le |e^z - 1| + 1 \le 3/2, \]

giving $\Re z \le \log(3/2) \le 1$. Similarly, $|e^z| \ge 1 - |e^z - 1| \ge 1/2$ gives $\Re z \ge -\log 2 \ge -1$. Also $|\Im z| \le \pi$. Thus $|z| \le 1 + \pi \le 3\pi/2$. The function $(e^z - 1)/z$ is holomorphic and non-vanishing in $|z| \le 3\pi/2$, hence its absolute value attains a positive minimum there. Thus there exists an absolute constant $c_{7.13}$ satisfying (7.4).

We apply Theorem 7.1.3 to $\Lambda = z$ with $n = 2$, $B = m$, $A_1 = h(b/a) = c_{7.14}(a, b) > 0$, $A_2 = h(y/x) = x$. Thus $|z| > \exp(-c_{7.15} \log x \log m)$ with $c_{7.15} = c_{7.15}(a, b)$. Thus

\[ |k| \ge c_{7.13}\, a x^m \exp(-c_{7.15} (\log x)(\log m)), \]

which gives the assertion.

If $a < 0$, $b < 0$, then writing $a = -a'$, $b = -b'$, the equation becomes

\[ b' y^n - a' x^n = k \]

with $a' > 0$, $b' > 0$. As in the previous case, the assertion follows.

We proceed to give another application. Let $|m| \ge 2$ be an integer. Denote by $P(m)$ the greatest prime factor of the integer $m$. We put $P(\pm 1) = 1$.

Theorem 7.3.2 Let $f(X) \in \mathbb{Z}[X]$ have at least two distinct rational roots. Then $P(f(x)) \to \infty$ effectively as $x \to \infty$, $x \in \mathbb{N}$.

Remark In fact, the theorem is true when $f(X)$ has two distinct roots. The theorem asserts that for any given $\ell > 0$, there exists $c_{7.16} = c_{7.16}(\ell, f) > 0$ such that $P(f(x)) \ge \ell$ for $x \ge c_{7.16}$. This is what we mean by effectiveness here. Suppose $f(X) = X^r$, $r \ge 1$. Then $P(f(x))$ does not tend to infinity, as one sees by letting $x$ run through $\{2^n : n = 1, 2, \dots\}$. Thus the condition that $f(X)$ has two distinct roots is necessary.

Proof Let $f(X) = a_0 X^d + \cdots + a_d$, $a_i \in \mathbb{Z}$. Then

\[ a_0^{d-1} f(X) = (a_0 X)^d + a_1 (a_0 X)^{d-1} + \cdots + a_d a_0^{d-1}. \]

Putting $g(X) = X^d + a_1 X^{d-1} + \cdots + a_d a_0^{d-1}$, we see that $g(X)$ is monic, $g(a_0 X) = a_0^{d-1} f(X)$ and hence $P(f(x)) \to \infty$ if and only if $P(g(x)) \to \infty$. Thus we may assume that $f(X)$ itself is monic.

Let $\ell > 0$ and let $x$ be a positive integer satisfying $P(f(x)) \le \ell$. Then we should show that there exists $c_{7.17} = c_{7.17}(\ell, f) > 0$ such that $x \le c_{7.17}$. Let the two distinct rational roots of $f(X)$ be $\alpha_1$ and $\alpha_2$. Since $f(X)$ is monic, these are indeed rational integers. Let

\[ c_{7.18} = \max(16,\ 2|\alpha_1|,\ 2|\alpha_2|,\ 4|\alpha_2 - \alpha_1|^2). \]

We assume from now on that $x \ge c_{7.18}$. Write

\[ x - \alpha_1 = p_1^{a_1} \cdots p_s^{a_s}; \qquad x - \alpha_2 = q_1^{b_1} \cdots q_t^{b_t} \]

with the $a_i$ and $b_j$ positive integers and the $p_i$ and $q_j$ primes. By our assumption, $x - \alpha_1 > 0$ and $x - \alpha_2 > 0$. Consider

\[ \alpha_2 - \alpha_1 = (x - \alpha_1) - (x - \alpha_2) = p_1^{a_1} \cdots p_s^{a_s} - q_1^{b_1} \cdots q_t^{b_t} \neq 0. \]

Then we get

\[ 0 \neq |p_1^{a_1} \cdots p_s^{a_s} q_1^{-b_1} \cdots q_t^{-b_t} - 1| = \frac{|\alpha_2 - \alpha_1|}{|x - \alpha_2|} \le \frac{1}{\sqrt{x}} \tag{7.5} \]

since $x \ge \max(2|\alpha_2|, 4|\alpha_2 - \alpha_1|^2)$. Take

\[ z = a_1 \log p_1 + \cdots + a_s \log p_s - b_1 \log q_1 - \cdots - b_t \log q_t. \]

Then by (7.5), we have $z \neq 0$ and

\[ 0 \neq |e^z - 1| \le 1/\sqrt{x}. \tag{7.6} \]

Further, $z$ is real. We consider three cases according as $z > 1/2$, $z < -1/2$ and $|z| \le 1/2$.

Suppose $z > 1/2$. Then $|e^z - 1| = e^z - 1 > e^{1/2} - 1 \ge 1/2$. Then by (7.6), we get $x \le 4$, a contradiction since $x \ge c_{7.18}$.

Suppose $z < -1/2$. Then $|e^z - 1| \ge 1 - e^{-1/2}$. Along with (7.6), we get a contradiction since $x > 6.5 > 1/(1 - e^{-1/2})^2$.

Suppose $|z| \le 1/2$. Then it is easy to see that $|e^z - 1| \ge |z|/2$. Hence, by (7.6), we get

\[ |z| \le \exp(\log 2 - (\log x)/2) \le \exp(-(\log x)/4) \quad \text{since } x \ge 16. \tag{7.7} \]

Our final step is to apply Theorem 7.1.4 to get a lower bound for $|z|$ and compare it with the upper bound in (7.7). We need to bound the coefficients $a_i$, $b_j$ and the primes $p_i$, $q_j$ in terms of $\ell$ and $x$. Now $x - \alpha_1 = p_1^{a_1} \cdots p_s^{a_s}$. Hence for $1 \le i \le s$ we have

\[ 2^{a_i} \le p_i^{a_i} \le x - \alpha_1 \le x^2 \]

since $x \ge \max\{2, |\alpha_1|\}$. Thus $a_i \le 2 \log x / \log 2 \le 3 \log x$. Similarly, $b_j \le 3 \log x$ for $1 \le j \le t$. Since $X - \alpha_1$ divides $f(X)$ and by assumption $P(f(x)) \le \ell$, we find that each $p_i \le \ell$ and similarly each $q_j \le \ell$. Also, since $s$ is the number of distinct prime divisors of $x - \alpha_1$, it cannot exceed the largest prime divisor of $f(x)$. Thus $s \le \ell$. Similarly, $t \le \ell$. Finally, $h(p_i) \le p_i \le \ell$ and $h(q_j) \le q_j \le \ell$. Using these estimates in Theorem 7.1.4 with $n = s + t \le 2\ell$ and $d = 1$, we get

\[ |z| > \exp(-c_{7.19} (\log \log x)) \]

where $c_{7.19} = c_{7.19}(\ell, f)$. Comparing with the upper bound for $|z|$ in (7.7), we get

\[ -c_{7.19} (\log \log x) \le -(\log x)/4, \]

which gives $x \le c_{7.20}(\ell, f)$. Taking $c_{7.17} = \max(c_{7.18}, c_{7.20})$, we get $x \le c_{7.17}$.
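Theorem 7.3.2 can be observed numerically in the simplest case $f(X) = X(X + 1)$, whose roots are $0$ and $-1$. The sketch below computes $P(n)$ by trial division and lists the $x$ up to a search limit with $P(f(x)) \le \ell$. The names `greatest_prime_factor`, `smooth_values` and the limit `x_max` are illustrative assumptions; the search only exhibits the thinning out of such $x$ and is not a substitute for the effective bound $c_{7.17}$.

```python
def greatest_prime_factor(n):
    """P(n): the greatest prime factor of |n|, with P(1) = P(-1) = 1.
    Intended for non-zero n; trial division is enough for small inputs."""
    n = abs(n)
    largest = 1
    d = 2
    while d * d <= n:
        while n % d == 0:
            largest = d
            n //= d
        d += 1
    if n > 1:          # whatever remains is a prime larger than all found so far
        largest = n
    return largest


def smooth_values(f, ell, x_max):
    """All x in 1..x_max with P(f(x)) <= ell (hypothetical helper)."""
    return [x for x in range(1, x_max + 1)
            if greatest_prime_factor(f(x)) <= ell]


if __name__ == "__main__":
    # x(x+1) composed only of the primes 2, 3, 5, searched up to 2000.
    print(smooth_values(lambda x: x * (x + 1), 5, 2000))
```

For $\ell = 5$ the list found below 2000 ends at $x = 80$, where $80 \cdot 81 = 2^4 \cdot 3^4 \cdot 5$.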

7.4 Effective Version of Thue’s Theorem

In Chap. 5, we saw that equations of the form $f(x, y) = m$, where $f$ is an irreducible binary form of degree at least 3, have only finitely many solutions. The method of Thue was ineffective in the sense that it was not possible to bound the solutions $x$ and $y$. Bounding the solutions will also bound the number of solutions. We shall show below how Baker’s results can be applied to make the result effective, i.e. we will be able to bound the solutions $x$ and $y$ of the Thue equation.

Theorem 7.4.1 Let $f(X, Y) \in \mathbb{Z}[X, Y]$ be an irreducible binary form of degree $n \ge 3$ and $m$ a non-zero integer. Then there exists an effectively computable number $c_{7.21} = c_{7.21}(f) > 0$ such that the equation

\[ f(x, y) = m \quad \text{with } x, y \in \mathbb{Z} \]

implies that

\[ \max(|x|, |y|) \le (2|m|)^{c_{7.21}}. \]

Remark
(1) It is well known that Pell’s equation $X^2 - 2Y^2 = 1$ has infinitely many solutions. Hence the condition $n \ge 3$ is necessary.
(2) The coefficients of $X^n$ and $Y^n$ are non-zero; otherwise, $f(X, Y)$ is not irreducible.
(3) Suppose $f(X, Y) = a_0 X^n + a_1 X^{n-1} Y + a_2 X^{n-2} Y^2 + \cdots + a_n Y^n$. Then

\[ a_0^{n-1} f(X, Y) = (a_0 X)^n + a_1 (a_0 X)^{n-1} Y + a_0 a_2 (a_0 X)^{n-2} Y^2 + \cdots + a_0^{n-1} a_n Y^n = g(a_0 X, Y) \]

where $g$ is a monic polynomial with integral coefficients and $g(a_0 x, y) = m a_0^{n-1}$. Suppose the theorem is true for $g$. Then $\max(|a_0 x|, |y|)$ is bounded as in the theorem, with $m$ replaced by $m a_0^{n-1}$, which implies a bound of the same shape for $\max(|x|, |y|)$. Hence we may assume that the coefficient of $X^n$ in $f$ is 1.
(4) Write $f(X, Y) = Y^n ((X/Y)^n + a_1 (X/Y)^{n-1} + \cdots + a_n)$. Then $z^n + a_1 z^{n-1} + \cdots + a_n$ is a monic, irreducible polynomial with integral coefficients since $f$ is monic and irreducible. We shall denote this polynomial as $f(z)$.
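Remark (1) above can be made concrete. The following small sketch generates solutions of $X^2 - 2Y^2 = 1$ from the smallest positive solution $(3, 2)$ using the standard recurrence $(x, y) \mapsto (3x + 4y,\ 2x + 3y)$, which comes from multiplying $x + y\sqrt{2}$ by $3 + 2\sqrt{2}$. The function name `pell_solutions` is an illustrative assumption.

```python
def pell_solutions(count):
    """First `count` positive solutions (x, y) of x^2 - 2 y^2 = 1,
    obtained by repeatedly multiplying x + y*sqrt(2) by 3 + 2*sqrt(2)."""
    x, y = 3, 2                     # smallest positive solution
    solutions = []
    for _ in range(count):
        solutions.append((x, y))
        x, y = 3 * x + 4 * y, 2 * x + 3 * y
    return solutions


if __name__ == "__main__":
    for x, y in pell_solutions(5):
        print(x, y, x * x - 2 * y * y)   # last column is always 1
```

Since the solutions grow without bound, no estimate of the form $\max(|x|, |y|) \le (2|m|)^{c}$ can hold for this quadratic form, which is the point of the remark.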

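7.4.1 Proof of Theorem 7.4.1

In the sequel, we denote by $c_{7.22}, c_{7.23}, \dots$ positive numbers depending only on $f$. Write

\[ f(X, Y) = (X - \alpha^{(1)} Y) \cdots (X - \alpha^{(n)} Y) \]

with $\alpha^{(1)} = \alpha$. Put $\beta = x - \alpha y$. Then $\beta$ is an algebraic integer in $K = \mathbb{Q}(\alpha)$ with conjugates $\beta_i = x - \alpha^{(i)} y$, $1 \le i \le n$, and hence $|N(\beta)| = |m|$. It is clear that any positive number which depends only on the fundamental units of $K$ is in fact a number depending only on $f$. This fact will be used several times without any mention.

By Lemma 1.3.8, there exists an associate $\gamma$ of $\beta$ such that $\overline{|\gamma|} \le c_{7.22}$ and

\[ \gamma = \zeta \beta \eta_1^{b_1} \cdots \eta_r^{b_r}, \quad b_i \in \mathbb{Z}, \ \zeta \text{ a root of unity}. \]

Also the denominator $d(\gamma) = 1$ since $\gamma$ is an algebraic integer. Hence $s(\gamma) \le c_{7.23}$. Let

\[ c' = \max_{1 \le i, j \le r} s\bigl((\eta_j^{(i)})^{-1}\bigr) \quad \text{and} \quad B = \max_{1 \le i \le r} |b_i|. \]

Then

\[ |\beta_i| = \left| \frac{\gamma^{(i)}}{(\eta_1^{(i)})^{b_1} \cdots (\eta_r^{(i)})^{b_r}} \right| \le c_{7.23} (c')^{B} \le c_{7.24}^{B}. \tag{7.8} \]

From

\[ x - \alpha^{(1)} y - \beta_1 = 0; \qquad x - \alpha^{(2)} y - \beta_2 = 0, \]

we get

\[ x = \frac{\alpha^{(1)} \beta_2 - \alpha^{(2)} \beta_1}{\alpha^{(1)} - \alpha^{(2)}}; \qquad y = \frac{\beta_2 - \beta_1}{\alpha^{(1)} - \alpha^{(2)}} \tag{7.9} \]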

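which gives $\max(|x|, |y|) \le c_{7.25} \max(|\beta_1|, |\beta_2|) \le c_{7.26}^{B}$ by (7.8). Thus if $B \le c_{7.27}$, we see that the theorem is valid. So we may assume that $B > c_{7.27}$ with $c_{7.27}$ sufficiently large. We have the identity

\[ (\alpha^{(2)} - \alpha^{(3)}) \beta_1 + (\alpha^{(3)} - \alpha^{(1)}) \beta_2 + (\alpha^{(1)} - \alpha^{(2)}) \beta_3 = 0. \]

Thus

\[ 0 \neq \left| \frac{(\alpha^{(2)} - \alpha^{(3)}) \beta_1}{(\alpha^{(1)} - \alpha^{(3)}) \beta_2} - 1 \right| = \left| \frac{(\alpha^{(1)} - \alpha^{(2)}) \beta_3}{(\alpha^{(3)} - \alpha^{(1)}) \beta_2} \right|. \tag{7.10} \]

The left-hand side of the above equality is

\[ 0 \neq \left| \frac{(\alpha^{(2)} - \alpha^{(3)})}{(\alpha^{(1)} - \alpha^{(3)})} \cdot \frac{\gamma^{(1)} (\zeta^{(1)})^{-1} (\eta_1^{(1)})^{-b_1} \cdots (\eta_r^{(1)})^{-b_r}}{\gamma^{(2)} (\zeta^{(2)})^{-1} (\eta_1^{(2)})^{-b_1} \cdots (\eta_r^{(2)})^{-b_r}} - 1 \right| = |e^z - 1| \]

where

\[ z = \log \frac{\alpha^{(2)} - \alpha^{(3)}}{\alpha^{(1)} - \alpha^{(3)}} + \log \frac{\gamma^{(1)}}{\gamma^{(2)}} + \log \frac{\zeta^{(2)}}{\zeta^{(1)}} + b_1 \log \frac{\eta_1^{(2)}}{\eta_1^{(1)}} + \cdots + b_r \log \frac{\eta_r^{(2)}}{\eta_r^{(1)}} + 2M \log(-1) \neq 0 \tag{7.11} \]

for a suitable integer $M$. We shall take the principal value of the logarithm so that $|\Im z| \le \pi$.

Since $|\beta_{r_1 + i}| = |\beta_{r_1 + r_2 + i}|$ for $1 \le i \le r_2$ and $|\beta_1 \cdots \beta_n| = |m|\ (\ge 1)$, we see that the $|\beta_i|$ for $1 \le i \le r$ determine the $|\beta_i|$ for $r_1 + r_2 \le i \le n$. Thus the above equality implies that there exists $I$ with $1 \le I \le r$ and $|\beta_I| \ge 1$. We claim that there exists an integer $J$, $1 \le J \le n$ and $I \neq J$, such that

\[ |\beta_J| \le |m|^{1/(n-1)} e^{-c_{7.28} B}. \tag{7.12} \]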

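To see this, note that

\[ \log \left| \frac{\gamma^{(a)}}{\beta_a} \right| = b_1 \log |\eta_1^{(a)}| + \cdots + b_r \log |\eta_r^{(a)}|, \quad 1 \le a \le n. \tag{7.13} \]

Let $k$ be an index with

\[ \left| \log \left| \frac{\gamma^{(k)}}{\beta_k} \right| \right| = \max_{1 \le a \le r} \left| \log \left| \frac{\gamma^{(a)}}{\beta_a} \right| \right|. \]

Now (7.13) can be written in matrix form as

\[ T \begin{pmatrix} b_1 \\ \vdots \\ b_r \end{pmatrix} = \begin{pmatrix} \log \left| \gamma^{(1)} / \beta_1 \right| \\ \vdots \\ \log \left| \gamma^{(r)} / \beta_r \right| \end{pmatrix} \]

where

\[ T = \begin{pmatrix} \log |\eta_1^{(1)}| & \cdots & \log |\eta_r^{(1)}| \\ \vdots & & \vdots \\ \log |\eta_1^{(r)}| & \cdots & \log |\eta_r^{(r)}| \end{pmatrix}. \]

Hence

\[ \begin{pmatrix} b_1 \\ \vdots \\ b_r \end{pmatrix} = \frac{\operatorname{Adj} T}{\det T} \begin{pmatrix} \log \left| \gamma^{(1)} / \beta_1 \right| \\ \vdots \\ \log \left| \gamma^{(r)} / \beta_r \right| \end{pmatrix}. \]

Thus

\[ B \le c_{7.29} \left| \log \left| \frac{\gamma^{(k)}}{\beta_k} \right| \right|, \]

giving

\[ \left| \log \left| \frac{\gamma^{(k)}}{\beta_k} \right| \right| \ge c_{7.30} B. \]

Consider

\[ \bigl| \log |\beta_k| \bigr| \ge \left| \log \left| \frac{\gamma^{(k)}}{\beta_k} \right| \right| - \bigl| \log |\gamma^{(k)}| \bigr| \ge c_{7.30} B - c_{7.22} \ge c_{7.31} B \]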

since $s(\gamma) \le c_{7.22}$ and $B$ is large. Then either $\log |\beta_k| \le -c_{7.31} B$ or $\log |\beta_k| \ge c_{7.31} B$. Suppose $\log |\beta_k| \le -c_{7.31} B$; then (7.12) holds with $J = k$. So we assume that $\log |\beta_k| \ge c_{7.31} B$. Then

\[ |\beta_1 \cdots \beta_{k-1} \beta_{k+1} \cdots \beta_n| = \frac{|m|}{|\beta_k|} \le |m| e^{-c_{7.31} B}. \]

Let

\[ |\beta_i| = \min_{1 \le j \le n,\ j \neq k} |\beta_j|. \]

Then

\[ |\beta_i|^{n-1} \le |m| e^{-c_{7.31} B}, \]

giving

\[ |\beta_i| \le |m|^{1/(n-1)} e^{-c_{7.32} B}, \]

showing that (7.12) holds with $J = i$. Note that $I \neq J$ since $|\beta_J| < 1$ for $B > c_{7.33} \log(2|m|)$.

Now we fix $\beta_2 = \beta_I$ and $\beta_3 = \beta_J$ in (7.10). Then

\[ |e^z - 1| = \left| \frac{(\alpha^{(1)} - \alpha^{(2)}) \beta_J}{(\alpha^{(3)} - \alpha^{(1)}) \beta_I} \right| \le c_{7.34} |m|^{1/(n-1)} e^{-c_{7.32} B} < 1/2 \tag{7.14} \]

for $B > c_{7.35} \log(2|m|)$, with $z$ given by (7.11). As seen in (7.4), we have $|e^z - 1| \ge c_{7.36} |z|$. By Theorem 7.1.3, $|z| \ge \exp(-c_{7.37} \log B)$. Hence

\[ |e^z - 1| \ge \exp(-c_{7.38} \log B). \]

Together with the upper bound in (7.14), this implies that $B \le c_{7.39} \log(2|m|)$. Thus by (7.8), $|\beta| \le (2|m|)^{c_{7.40}}$ and hence $|x|$ and $|y|$ are bounded by $(2|m|)^{c_{7.41}}$ by (7.9).

As an application of the effective Thue Theorem 7.4.1, we shall show an improvement of Liouville’s Theorem 5.2.1, which we recall below.

Let $\alpha$ be an algebraic number of degree $n \ge 2$. Then there exists $c_{5.1} = c_{5.1}(\alpha)$ such that for all integers $p$ and $q$ with $q \ge 1$,

\[ \left| \alpha - \frac{p}{q} \right| > \frac{c_{5.1}}{q^n}. \]

We show

Theorem 7.4.2 Let $\alpha$ be an algebraic number of degree $n \ge 3$. Then there exist $c_{7.42} = c_{7.42}(\alpha) > 0$ and $\kappa = \kappa(\alpha) > 0$, $\kappa < n$, such that for all integers $p$ and $q$ with $q \ge 1$,

\[ \left| \alpha - \frac{p}{q} \right| > \frac{c_{7.42}}{q^{\kappa}}. \]

Remark The theorems of Thue, Siegel and Roth in Chap. 6 are ineffective since the constant appearing in those theorems corresponding to $c_{7.42}$ is not computable.

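Proof of Theorem 7.4.2 We may assume that

\[ \left| \alpha - \frac{p}{q} \right| \le 1. \]

Let $f(X)$ be the minimal polynomial of $\alpha$. Thus $\deg f = n$. Now

\[ \left| f(\alpha) - f\Bigl(\frac{p}{q}\Bigr) \right| = \left| f\Bigl(\frac{p}{q}\Bigr) \right| \ge \frac{1}{q^n}. \]

Let $g(p, q) = q^n f(p/q)$. Then $g(p, q)$ is an irreducible binary form with integral coefficients and $\deg g(p, q) \ge 3$. Hence by Theorem 7.4.1, there exists $c_{7.43} = c_{7.43}(f)$ such that $g(p, q) = r$ implies that

\[ \max(|p|, |q|) \le (2|r|)^{c_{7.43}}. \]

Now $0 \neq q \le \max(|p|, |q|) \le (2|r|)^{c_{7.43}}$ implies $|r| \ge \dfrac{q^{1/c_{7.43}}}{2}$. Thus

\[ \left| f\Bigl(\frac{p}{q}\Bigr) \right| = \frac{|g(p, q)|}{q^n} = \frac{|r|}{q^n} \ge \frac{1}{2 q^{\,n - 1/c_{7.43}}}. \]

Also, if $f(X) = a_0 X^n + \cdots + a_n$, then

\[ \left| f\Bigl(\frac{p}{q}\Bigr) \right| = \left| f(\alpha) - f\Bigl(\frac{p}{q}\Bigr) \right| = \left| a_0 \Bigl(\alpha^n - \Bigl(\frac{p}{q}\Bigr)^n\Bigr) + a_1 \Bigl(\alpha^{n-1} - \Bigl(\frac{p}{q}\Bigr)^{n-1}\Bigr) + \cdots + a_{n-1} \Bigl(\alpha - \frac{p}{q}\Bigr) \right| \]
\[ = \left| \alpha - \frac{p}{q} \right| \left| a_0 \frac{\alpha^n - (p/q)^n}{\alpha - (p/q)} + \cdots + a_{n-1} \right|. \]

For any integer $t$ with $2 \le t \le n$,

\[ \left| \frac{\alpha^t - (p/q)^t}{\alpha - (p/q)} \right| = |\alpha^{t-1} + \alpha^{t-2} (p/q) + \cdots + (p/q)^{t-1}|. \tag{7.15} \]

Since $\left| \alpha - \frac{p}{q} \right| \le 1$, we have $\left| \frac{p}{q} \right| \le 1 + |\alpha|$. Hence from (7.15), we get

\[ \left| \frac{\alpha^t - (p/q)^t}{\alpha - (p/q)} \right| \le t (|\alpha| + 1)^{t-1} \le n (|\alpha| + 1)^{n-1}. \]

Therefore,

\[ \left| f\Bigl(\frac{p}{q}\Bigr) \right| \le n^2 h(\alpha) (|\alpha| + 1)^{n-1} \left| \alpha - \frac{p}{q} \right|, \]

giving

\[ \left| \alpha - \frac{p}{q} \right| \ge \Bigl( 2 q^{\,n - (1/c_{7.43})}\, n^2 h(\alpha) (|\alpha| + 1)^{n-1} \Bigr)^{-1} \ge \frac{c_{7.44}}{q^{\kappa}} \]

with $\kappa = n - \dfrac{1}{c_{7.43}}$.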
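The starting point of the proof above, $|f(p/q)| \ge 1/q^n$, is just the statement that the integer $g(p, q) = q^n f(p/q)$ is non-zero. As a small illustration, the sketch below takes $\alpha = 2^{1/3}$, so that $f(X) = X^3 - 2$ and $g(p, q) = p^3 - 2q^3$, and evaluates $g$ at the numerator $p$ closest to $\alpha q$ using integer arithmetic only. The choice of $\alpha$ and the names `nearest_numerator` and `form_value` are assumptions made for the example.

```python
def nearest_numerator(q):
    """The integer p closest to 2^(1/3) * q, computed exactly for q >= 1."""
    target = 2 * q**3
    # Start from a floating-point guess and correct it to floor(target^(1/3)).
    p = int(round(target ** (1.0 / 3.0)))
    while p**3 > target:
        p -= 1
    while (p + 1)**3 <= target:
        p += 1
    # 2^(1/3) q >= p + 1/2 exactly when 8 * target >= (2p + 1)^3: then round up.
    if 8 * target >= (2 * p + 1)**3:
        p += 1
    return p


def form_value(p, q):
    """g(p, q) = q^3 f(p/q) for f(X) = X^3 - 2."""
    return p**3 - 2 * q**3


if __name__ == "__main__":
    for q in (1, 4, 27, 50):
        p = nearest_numerator(q)
        print(q, p, form_value(p, q))   # the last column is never 0
```

Theorem 7.4.1 strengthens the trivial bound $|g(p, q)| \ge 1$ to $|g(p, q)| \ge q^{1/c_{7.43}}/2$, and that gain is what lowers the exponent from $n$ to $\kappa$.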

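7.5 p-Adic Version of Baker’s Result and an Application

Analogous results for Baker’s theorems in the $p$-adic metric are due to Kunrui Yu. We state one of his theorems and give an application. Let $\mathbb{Q}_p$ be the completion of $\mathbb{Q}$ under the $p$-adic metric. For any prime $p$, define the $p$-adic absolute value $|\ |_p : \mathbb{Z} \to \mathbb{Q}_{\ge 0}$ as follows: $|0|_p = 0$, $|1|_p = 1$, $|p|_p = 1/p$, $|q|_p = 1$ for a prime $q \neq p$, and for an integer $\alpha$ with prime factorisation $\alpha = \pm p_1^{\alpha_1} \cdots p_r^{\alpha_r}$,

\[ |\alpha|_p = 1/p^{\alpha_i} \ \text{ if } p = p_i \text{ for some } 1 \le i \le r, \quad \text{and} \quad |\alpha|_p = 1 \ \text{ if } p \neq p_i \text{ for all } 1 \le i \le r. \]

Extend $|\ |_p$ to $\mathbb{Q}$ as follows: if $n = a/b$, then $|n|_p = |a|_p / |b|_p$. Also $|\ |_p$ defined over $\mathbb{Q}$ satisfies the properties:
(1) $|u|_p \ge 0$ for all $u \in \mathbb{Q}$, and $|u|_p = 0$ if and only if $u = 0$.
(2) $|uv|_p = |u|_p |v|_p$.
(3) $|u + v|_p \le \max(|u|_p, |v|_p)$.

Let $K$ be an algebraic number field. The absolute value $|\ |_p$ can be extended to $K$ as follows. Let $\mathfrak{p}$ be a prime ideal of $K$ which sits over $p$, i.e. $\mathfrak{p} \cap \mathbb{Z} = p\mathbb{Z}$. Let $\beta \in K$. Define $|\beta|_{\mathfrak{p}}$ as follows. Suppose the principal ideal $[\beta] = \mathfrak{p}^{\ell} \mathfrak{a}$ for some ideal $\mathfrak{a}$ in $K$ with $\gcd(\mathfrak{p}, \mathfrak{a}) = 1$. Define $\operatorname{ord}_{\mathfrak{p}}[\beta] = \ell$. Then

\[ |\beta|_{\mathfrak{p}} = \frac{1}{p^{\operatorname{ord}_{\mathfrak{p}}[\beta]}}. \]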
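The definition of $|\ |_p$ on $\mathbb{Q}$ and its three properties are easy to experiment with. The following is a minimal sketch using exact rational arithmetic; the function names `ord_p` and `abs_p` are illustrative assumptions, and only the rational case (not the extension to a number field $K$) is covered.

```python
from fractions import Fraction


def ord_p(n, p):
    """Exponent of the prime p in the non-zero integer n."""
    n = abs(n)
    exponent = 0
    while n % p == 0:
        n //= p
        exponent += 1
    return exponent


def abs_p(u, p):
    """p-adic absolute value of a rational number u, as an exact Fraction."""
    u = Fraction(u)
    if u == 0:
        return Fraction(0)
    exponent = ord_p(u.numerator, p) - ord_p(u.denominator, p)
    return Fraction(1, p) ** exponent


if __name__ == "__main__":
    print(abs_p(12, 2))               # 12 = 2^2 * 3
    print(abs_p(Fraction(3, 8), 2))   # 2 appears in the denominator
    u, v = Fraction(5, 4), Fraction(7, 12)
    # Property (3), the ultrametric inequality, for one sample pair:
    print(abs_p(u + v, 2) <= max(abs_p(u, 2), abs_p(v, 2)))
```

Returning a `Fraction` rather than a float keeps comparisons such as property (3) exact.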

When $\beta = p$, then $\operatorname{ord}_{\mathfrak{p}}[\beta] = 1$. Hence $|p|_{\mathfrak{p}} = 1/p$. When $\beta = q$, a prime not equal to $p$, the ideal $\mathfrak{p}$ does not occur in the prime factorisation of $[q]$ and we get $\operatorname{ord}_{\mathfrak{p}}[q] = 0$. Therefore $|q|_{\mathfrak{p}} = 1$. Thus $|\ |_{\mathfrak{p}}$ coincides with $|\ |_p$ on the set of integers and hence on $\mathbb{Q}$. We now state a result of Yu.

Theorem 7.5.1 Let $p$ be a prime number and $K$ an algebraic number field of degree $d$. Let $\alpha_1, \dots, \alpha_n \in K$ with $s(\alpha_i) \le H$ for $1 \le i \le n$. Further, let $b_1, \dots, b_n$ be rational integers such that $\alpha_1^{b_1} \cdots \alpha_n^{b_n} \neq 1$. Put $B = \max(|b_1|, \dots, |b_n|)$. Let $\mathfrak{p}$ be a prime ideal of $K$ which sits over the rational prime $p$. Then

\[ |\alpha_1^{b_1} \cdots \alpha_n^{b_n} - 1|_{\mathfrak{p}} \ge (eB)^{-c_{7.45}} \]

where $c_{7.45} = c_{7.45}(H, n, d, p)$.

We give an application of the above result.

Theorem 7.5.2 Let $p_1 < p_2 < \cdots < p_s$ be fixed primes and

\[ S = \{ \pm p_1^{a_1} \cdots p_s^{a_s} : a_i \in \mathbb{Z}_{\ge 0} \}, \]

i.e., $S$ is the semi-group generated by $p_1, \dots, p_s$. Let $X_1, X_2, X_3 \in S$ be such that $X_1 + X_2 = X_3$ with $\gcd(X_1, X_2) = 1$. Then there exists $c_{7.46} = c_{7.46}(p_s, S)$ such that

\[ \max(|X_1|, |X_2|) \le c_{7.46}. \]

In other words, the number of solutions of $X_1 + X_2 = X_3$ in $S$ is bounded.

Proof Write

\[ X_1 = \pm p_1^{a_{11}} \cdots p_s^{a_{1s}}, \quad X_2 = \pm p_1^{a_{21}} \cdots p_s^{a_{2s}}, \quad X_3 = \pm p_1^{a_{31}} \cdots p_s^{a_{3s}} \]

with $a_{ij} \in \mathbb{Z}$, $a_{ij} \ge 0$. Let $Z = \max(|X_1|, |X_2|, |X_3|)$. Observe that for $1 \le i \le s$,

\[ 2^{a_{1i}} \le p_i^{a_{1i}} \le Z, \]

giving $a_{1i} \le 2 \log Z$. Similarly, $a_{2i} \le 2 \log Z$. Consider

\[ |X_3|_{p_i} = \frac{1}{p_i^{a_{3i}}} = |X_1 + X_2|_{p_i} = |p_1^{a_{11}} \cdots p_s^{a_{1s}} \pm p_1^{a_{21}} \cdots p_s^{a_{2s}}|_{p_i}. \tag{7.16} \]

Since $\gcd(X_1, X_2) = 1$, $p_i$ cannot divide both $X_1$ and $X_2$. Let us assume without loss of generality that $p_i \nmid X_2$. Hence $|X_2|_{p_i} = 1$. Thus from (7.16), we have

\[ |X_3|_{p_i} = \frac{|X_3|_{p_i}}{|X_2|_{p_i}} = |\pm p_1^{a_{11} - a_{21}} \cdots p_s^{a_{1s} - a_{2s}} - 1|_{p_i} \]

where $|a_{1i} - a_{2i}| \le 4 \log Z$. Now applying Theorem 7.5.1 we get

\[ |X_3|_{p_i} > \exp(-c_{7.47} \log \log Z) \]

with $c_{7.47} = c_{7.47}(s, p_i)$. On the other hand,

\[ |X_3|_{p_i} = \frac{1}{p_i^{a_{3i}}}. \]

Along with the lower bound, this gives

\[ p_i^{a_{3i}} < \exp(c_{7.47} \log \log Z). \]

This is true for $1 \le i \le s$. Thus

\[ |X_3| = \prod_{i=1}^{s} p_i^{a_{3i}} < \exp(c_{7.48} \log \log Z). \]

If $Z = |X_3|$, the above inequality implies that $Z$ is bounded. Suppose $Z = \max(|X_1|, |X_2|)$, say $Z = |X_2|$. Then, by Corollary 7.3.1,

\[ |X_3| = |X_1 + X_2| = |X_2| \times |\pm p_1^{a_{11} - a_{21}} \cdots p_s^{a_{1s} - a_{2s}} - 1| \ge Z \exp(-c_{7.50} \log \log Z). \]

Again from the upper bound we get

\[ \exp(\log Z - c_{7.50} \log \log Z) < \exp(c_{7.48} \log \log Z), \]

implying $Z \le c_{7.49}$. Hence $\max(|X_1|, |X_2|)$ is bounded. Since $|X_3| \le |X_1| + |X_2|$, $|X_3|$ is also bounded. Thus the number of solutions of $X_1 + X_2 = X_3$ in $S$ is bounded.

abc-Conjecture

In 1985, Masser and Oesterlé formulated the abc-conjecture as a possible approach to Fermat’s last theorem, conjectured by Fermat in 1637 and proved by Wiles in 1995 after centuries of tireless efforts and ideas developed by many mathematicians such as Euler, Dirichlet, Legendre and Kummer. In the twentieth century the problem got an impetus with the works of Frey, Serre, Ribet, Wiles and many others. Before stating the abc-conjecture, we make some definitions. Given a sum $a + b = c$ with $a, b, c \in \mathbb{Z}$, coprime and non-zero, define the height $h$ and the radical $r$ of this sum by

\[ h = h(a, b, c) = \max(\log |a|, \log |b|, \log |c|); \qquad r = r(a, b, c) = \sum_{p \mid abc} \log p \]

where $p$ runs over all prime divisors of $a$, $b$ and $c$. For instance, take the following triples $(a, b, c)$:

$(2, 3, 5)$: $h = \log 5$, $r = \log 30$;
$(9, 16, 25)$: $h = \log 25$, $r = \log 30$;
$(3, 125, 128)$: $h = \log 128$, $r = \log 30$;
$(19 \times 1307,\ 7 \times 29^2 \times 31^8,\ 2^8 \times 3^{22} \times 5^4)$: $h = 36.15\ldots$, $r = 22.26\ldots$.

In the first two examples the height is smaller than the radical. In the next two examples the height is larger than the radical. The abc-conjecture says that the height cannot be much larger than the radical.

abc-Conjecture Let $a, b, c$ be given non-zero, coprime integers such that $a + b = c$ and let $\varepsilon > 0$ be given. Then there exists a number $K(\varepsilon) > 0$ such that

\[ h(a, b, c) \le r(a, b, c) + \varepsilon\, h(a, b, c) + K(\varepsilon). \]

Equivalently, for $0 < \varepsilon < 1$, the above inequality can be written as

\[ h(a, b, c) \le \frac{1}{1 - \varepsilon}\, r(a, b, c) + \frac{K(\varepsilon)}{1 - \varepsilon}. \]
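The height and radical of the sample triples above can be recomputed directly from the definitions. The sketch below does this with plain trial division; the helper names `prime_divisors`, `height` and `radical` are illustrative assumptions, and trial division is only adequate here because the cofactors left after removing small primes are small.

```python
from math import log


def prime_divisors(n):
    """Sorted list of the distinct prime divisors of |n| (n non-zero)."""
    n = abs(n)
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def height(a, b, c):
    """h(a, b, c) = max(log|a|, log|b|, log|c|)."""
    return max(log(abs(a)), log(abs(b)), log(abs(c)))


def radical(a, b, c):
    """r(a, b, c) = sum of log p over the primes p dividing abc."""
    return sum(log(p) for p in prime_divisors(a * b * c))


if __name__ == "__main__":
    triples = [
        (2, 3, 5),
        (9, 16, 25),
        (3, 125, 128),
        (19 * 1307, 7 * 29**2 * 31**8, 2**8 * 3**22 * 5**4),
    ]
    for a, b, c in triples:
        assert a + b == c
        print(round(height(a, b, c), 2), round(radical(a, b, c), 2))
```

The ratio $h/r$ for the last triple is about $1.62$, which is why it is a popular example of a sum whose height noticeably exceeds its radical.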

From this we see that if the radical is fixed, i.e. if we consider sums of integers composed of a fixed set of prime numbers, then there are only finitely many such sums and one can bound the summands. This is Theorem 7.5.2.

abc-Conjecture Implies Fermat’s Last Theorem

For a given integer $n \ge 3$, consider $x^n + y^n = z^n$ in positive integers $x$, $y$ and $z$. Take $(a, b, c) = (x^n, y^n, z^n)$. Then $h = n \log z$ and

\[ r = \sum_{p \mid xyz} \log p \le \log xyz < 3 \log z. \]

Applying the abc-conjecture with $\varepsilon = 1/2$, we get

\[ n \log z \le 6 \log z + 2K(1/2). \]

Since we know the equation has no solution for $n = 3, 4, 5$ and $6$, we may assume that $n > 6$. Thus

\[ (n - 6) \log z \le 2K(1/2) \]

which implies that $n$, $z$ and hence $x$ and $y$ are bounded, and these finitely many values are left for direct verification.

We give another application of the abc-conjecture. In 1909, Wieferich proved that if $p$ is a prime satisfying

\[ 2^{p-1} \not\equiv 1 \pmod{p^2}, \tag{7.17} \]

then the equation $x^p + y^p = z^p$ has no non-trivial integral solutions satisfying $p \nmid xyz$. It is still not known if there are infinitely many primes $p$ satisfying (7.17). We shall show this under the abc-conjecture. This was proved by Silverman [9] in 1988.

We need the following notion. We say that an integer $N$ is square full if whenever a prime $q$ divides $N$, then $q^2$ divides $N$. For example, $N = 2^8 \cdot 3^{17}$ is a square-full number. In fact, any positive integer $N$ can be written as

\[ N = u_N v_N \]

where $u_N$ is the square-free part of $N$ and $v_N$ is the square-full part of $N$, so that $(u_N, v_N) = 1$.

Theorem 7.5.3 Under the abc-conjecture, there are infinitely many primes $p$ satisfying (7.17).

Proof For any positive integer $n$, write

\[ 2^n - 1 = u_n v_n \]

where $u_n$ is the square-free part of $2^n - 1$ and $v_n$ is the square-full part of $2^n - 1$. Thus $(u_n, v_n) = 1$. We claim the following.

Claim Assume that the abc-conjecture is true. Then as $n \to \infty$, the factor $u_n$ is unbounded.

Suppose $u_n$ is bounded. Consider the equality $(2^n - 1) + 1 = 2^n$. Taking $a = 2^n - 1$, $b = 1$, $c = 2^n$, we get $h(a, b, c) = n \log 2$ and

\[ r(a, b, c) = \log\Bigl(2 \prod_{p \mid (2^n - 1)} p\Bigr) \le \log 2 + \log u_n + (\log v_n)/2. \]

Then by the abc-conjecture, for any $\varepsilon > 0$, there exists $K(\varepsilon)$ such that

\[ n \log 2 \le \frac{\log 2 + \log u_n + (\log v_n)/2}{1 - \varepsilon} + \frac{K(\varepsilon)}{1 - \varepsilon}. \]

On the other hand,

\[ n \log 2 > \log(2^n - 1) = \log u_n + \log v_n. \]

Comparing with the upper bound, we see that

\[ (1/2 - \varepsilon) \log v_n \le \log u_n + \log 2 + K(\varepsilon). \]

Fixing $\varepsilon = 1/4$, we get that $v_n$ is bounded since $u_n$ is bounded. This implies that $2^n - 1$ is bounded, which is not true, as one sees by taking $n$ arbitrarily large. This proves the claim.

By the claim, as $u_n$ is square free, we conclude that there are infinitely many primes $p$ such that $p$ divides $u_n$ for some positive integer $n$. Hence, in order to prove the theorem, it is enough to show that every prime divisor $p$ of $u_n$ for $n \ge 1$ satisfies (7.17).

Let $p \mid u_n$ for some $n \ge 1$. Then $p^2 \nmid 2^n - 1$. Let $d$ be the order of $2 \pmod{p}$. Thus $2^d \equiv 1 \pmod{p}$ and $2^h \not\equiv 1 \pmod{p}$ for any $h < d$, $d \mid n$ and $d \mid (p - 1)$. So let $n = de$ and $p - 1 = df$. Suppose $2^d \equiv 1 \pmod{p^2}$. Then $2^d = 1 + kp$ with $p \mid k$. Hence

\[ 2^n = 2^{de} = (1 + pk)^e \equiv 1 + pke \equiv 1 \pmod{p^2}, \]

a contradiction. Thus $2^d \not\equiv 1 \pmod{p^2}$. Then $2^d = 1 + mp$ for some integer $m$ with $p \nmid m$. Hence,

\[ 2^{p-1} = 2^{df} = (1 + mp)^f \equiv 1 + fmp \pmod{p^2}. \]

Since $f \mid (p - 1)$ and $p \nmid m$, we see that $2^{p-1} \not\equiv 1 \pmod{p^2}$, which proves the theorem.

Exercise
(1) Show that $\exp(\alpha\pi + \beta)$ is transcendental if $\alpha, \beta \in \mathbb{A}$, $\beta \neq 0$.
(2) Show that the series
\[ 1 - \frac{1}{4} + \frac{1}{7} - \frac{1}{10} + \cdots \]
is transcendental.
(3) Is $\pi^e$ transcendental? What can be said about $\log \log 2$ and $\log^2 2 + \log^2 3$?
(4) Show that the Fermat equation $x^n + y^n = z^n$ has no solution for $n = 3, 4, 5$ and $6$.
(5) Assuming the abc-conjecture, show that there are only finitely many consecutive cube-full numbers.
(6) Assuming the abc-conjecture, show that the number of Wieferich primes not exceeding $X$ is $\gg \log X / \log \log X$.

Notes

There are innumerable articles on the application of Baker’s results, both qualitative and quantitative, in different areas of number theory. In the past decade, a theorem of Baker, Birch and Wirsing [10], in which Baker’s theory of linear forms was used, found many applications. Starting with a paper of Adhikari et al. [11] and of Murty and Saradha [12], their theorem was applied to prove the transcendence of many infinite series. We refer to the book of Murty and Rath [13] and an article of Saradha

and Sharma [14] for various results in this connection, including transcendence of some integrals.

The equation

\[ x^m - y^n = 1 \tag{7.18} \]

is known as the Catalan equation. In 1844, Catalan conjectured that the only two consecutive perfect powers are $2^3$ and $3^2$, i.e. the only positive integral solution of (7.18) is $(x, y, m, n) = (3, 2, 2, 3)$. Using linear forms in logarithms, in 1976, Tijdeman [15] showed that the equation (7.18) in positive integers $x, y, m, n$ implies that $\max(x, y, m, n)$ is bounded by an effectively computable absolute constant. In 2002, the Catalan conjecture was solved by Mihăilescu [16] using cyclotomic fields and Galois modules.

In 1897, Störmer used Pell’s equation to show that $P(x(x + 1)) \to \infty$ as $x \to \infty$. He showed that there are only 23 pairs $(x, x + 1)$ which are composed of fixed primes and he gave explicitly all the 23 pairs. Thue, in 1908, noted that his method would give finiteness of the number of such pairs $(x, x + 1)$. But his method was ineffective in the sense that one cannot furnish the actual pairs as Störmer did. On the other hand, we saw in Theorem 7.3.2 that Baker’s method gives an effective result even if $x(x + 1)$ is replaced by a polynomial $P(x)$ having at least two distinct roots. But computing all the $x$ values needs more analysis and techniques.

In 1964, Baker [17] used properties of hyper-geometric series to obtain effective results for approximation of certain fractional powers of rationals by rationals. For instance, it was shown that

\[ \left| 2^{1/3} - \frac{p}{q} \right| > \frac{10^{-6}}{q^{2.955}}. \]

Since then, using the Padé approximants occurring in the hyper-geometric method has become an important tool in Diophantine approximation. Further, approximation of more than one algebraic number simultaneously has also been studied.
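Returning to Theorem 7.5.3: the second half of its proof, that every prime $p$ dividing the square-free part $u_n$ of $2^n - 1$ satisfies (7.17), does not depend on the abc-conjecture and can be checked by machine for small $n$. The sketch below does so with trial division and modular exponentiation. The names `prime_factorization`, `squarefree_part_primes` and `satisfies_7_17` are illustrative assumptions, and the range of $n$ is kept small so that trial division stays fast.

```python
def prime_factorization(n):
    """Dictionary {prime: exponent} for an integer n >= 2 (trial division)."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def squarefree_part_primes(n):
    """Primes p with p | 2^n - 1 but p^2 not dividing 2^n - 1, i.e. p | u_n."""
    factors = prime_factorization(2**n - 1)
    return sorted(p for p, e in factors.items() if e == 1)


def satisfies_7_17(p):
    """True when 2^(p-1) is not congruent to 1 modulo p^2."""
    return pow(2, p - 1, p * p) != 1


if __name__ == "__main__":
    for n in range(2, 21):
        primes = squarefree_part_primes(n)
        print(n, primes, all(satisfies_7_17(p) for p in primes))
    # The only known primes failing (7.17) are 1093 and 3511.
    print(satisfies_7_17(1093), satisfies_7_17(3511))
```

The three-argument form of `pow` keeps every intermediate value below $p^2$, so the test remains cheap even for large $p$.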

References
1. A. Baker, Linear forms in the logarithms of algebraic numbers. Mathematika 13, 204–216 (1966); II 14, 102–107 (1967); III 14, 220–228 (1967)
2. A. Baker, Transcendental Number Theory. Cambridge Tracts (1975)
3. A. Baker, G. Wüstholz, Logarithmic Forms and Diophantine Geometry. Cambridge Tracts (2007)
4. M. Laurent, M. Mignotte, Y. Nesterenko, Formes linéaires en deux logarithmes et déterminants d’interpolation. J. Number Theory 55, 285–321 (1995)
5. C.D. Bennett, J. Blass, A.M.W. Glass, D.B. Meronk, R.P. Steiner, Linear forms in the logarithms of three positive rational integers. J. Theo. Nombr. Bordeaux 9, 97–136 (1997)
6. E.M. Matveev, An explicit lower bound for a homogeneous rational linear form in logarithms of algebraic numbers. Izv. Math. 62, 81–136 (1998)
7. Y. Bugeaud, Linear Forms in Logarithms and Applications. IRMA Lectures in Mathematics and Theoretical Physics, vol. 28 (2018)
8. T.N. Shorey, R. Tijdeman, Exponential Diophantine Equations. Cambridge Tracts (1986 and re-printed in 2008)
9. J. Silverman, Wieferich’s criterion and the ABC-conjecture. J. Number Theory 30, 226–237 (1988)
10. A. Baker, B.J. Birch, E.A. Wirsing, On a problem of Chowla. J. Number Theory 5, 224–236 (1973)
11. S.D. Adhikari, N. Saradha, T.N. Shorey, R. Tijdeman, Transcendental infinite sums. Indag Math. (N. S.) 12(1), 1–14 (2001)
12. M.R. Murty, N. Saradha, Transcendental values of the digamma function. J. Number Theory 125, 298–318 (2007)
13. M. Ram Murty, P. Rath, Transcendental Numbers (Springer, Berlin, 2014), 217 pp
14. N. Saradha, D. Sharma, Arithmetic Nature of Some Infinite Series and Integrals. Contemporary Mathematics 655, 191–207 (2015)
15. R. Tijdeman, On the equation of Catalan. Acta Arith. 29, 197–209 (1976)
16. P. Mihăilescu, Catalan’s conjecture: another old Diophantine problem solved. Bull. Am. Math. Soc. 41, 43–57 (2013)
17. A. Baker, Rational approximations to $2^{1/3}$ and other algebraic numbers. Quart J. Math. Oxford 15, 375–383 (1964)

Chapter 8 Baker’s Theorem

The city you’re dreaming of it’s at the end of this road –Lal Ded

We begin with some basic tools necessary for the proof of Theorem 7.1.1 in Sect. 8.1. First, Theorem 7.1.1 is reduced to an equivalent statement; see Theorem 8.1.2. In Sect. 8.1.1, we derive a simple, but useful, non-trivial lower bound for a non-vanishing linear form in logarithms of algebraic numbers with bounded coefficients. Section 8.1.2 provides the construction of an augmentative polynomial. In Sect. 8.1.3, we give the construction of the auxiliary polynomial $\Phi(Z_0, \dots, Z_{n-1})$ in several variables, which generalises the function of a single complex variable employed by Gelfond. Basic estimates on $\Phi$ are shown in Sect. 8.1.4. The main difficulty is in the interpolation techniques. Usually the order of the derivatives is increased while leaving the points of interpolation fixed. Baker used a special extrapolation procedure in which the range of interpolation points is extended while the order of the derivatives is reduced, and the absolute values of these derivatives are shown to be very small. See Sects. 8.1.5 and 8.1.6.

In Sect. 8.2, all the tools of Sect. 8.1 are combined in an ingenious way. The auxiliary polynomial $\Phi(Z_0, \dots, Z_{n-1})$ at $Z_0 = \cdots = Z_{n-1} = z$ is expressed as an exponential polynomial

\[ \Phi(z, \dots, z) = \sum_{\alpha} p_\alpha z^{\nu_\alpha} e^{\psi_\alpha z} \not\equiv 0 \quad \text{with } p_\alpha \in \mathbb{Z}. \]

Then, using the augmentative polynomial, the coefficients $p_\alpha$ of $\Phi(z, \dots, z)$ are expressed in terms of the derivatives of $\Phi$ at $z = 0$. Using the smallness of the derivatives, it is shown that

\[ |p_\alpha| < 1 \]

for every $\alpha$, which implies that $\Phi(z, \dots, z) \equiv 0$, a contradiction to the construction. We refer to the original papers of Baker [1] and his book [2] for the presentation here. The reader may also refer to [3] for some fascinating exposition.

© Springer Nature Singapore Pte Ltd. 2020 S. Natarajan and R. Thangadurai, Pillars of Transcendental Number Theory, https://doi.org/10.1007/978-981-15-4155-1_8

8.1 Ground Work for the Proof of Baker’s Theorem

A Reduction

Let us recall Baker’s result from Theorem 7.1.1.

Theorem 8.1.1 Let $\alpha_1, \dots, \alpha_n \in \mathbb{A}\setminus\{0, 1\}$, $\beta_0 \in \mathbb{A}$ and $\beta_1, \dots, \beta_n \in \mathbb{A}\setminus\{0\}$. Assume that $\log \alpha_1, \dots, \log \alpha_n$ are linearly independent over $\mathbb{Q}$. Then

\[ \Lambda := \beta_0 + \beta_1 \log \alpha_1 + \cdots + \beta_n \log \alpha_n \neq 0. \]

In other words, $1, \log \alpha_1, \dots, \log \alpha_n$ are linearly independent over $\mathbb{A}$.

In order to prove Theorem 8.1.1, we first show that it is enough to prove the following statement.

Theorem 8.1.2 Let $\alpha_1, \dots, \alpha_n$ be algebraic numbers such that $\log \alpha_1, \dots, \log \alpha_n$ are $\mathbb{Q}$-linearly independent. Suppose $\beta_0, \beta_1, \dots, \beta_n$ are non-zero algebraic numbers. Then there exist non-negative integers $\lambda_1, \dots, \lambda_n$ such that

\[ e^{\lambda_n \beta_0}\, \alpha_1^{\lambda_1 + \lambda_n \beta_1} \cdots \alpha_{n-1}^{\lambda_{n-1} + \lambda_n \beta_{n-1}} \neq \alpha_1^{\lambda_1} \cdots \alpha_n^{\lambda_n}. \tag{8.1} \]

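We shall show that Theorem 8.1.2 implies Theorem 8.1.1. Assume that (8.1) holds. We need to prove that for any $\beta_0, \beta_1, \dots, \beta_n \in \mathbb{A}$, not all zero, we have

\[ \beta_0 + \beta_1 \log \alpha_1 + \cdots + \beta_n \log \alpha_n \neq 0. \]

Since the $\beta_j$ are not all zero, by rearranging the indices, if necessary, we can assume that $\beta_n \neq 0$. Since $\beta_n \in \mathbb{A}\setminus\{0\}$, we have $\beta_n^{-1} \in \mathbb{A}\setminus\{0\}$. By letting $\gamma_i = -\beta_i / \beta_n$ for $0 \le i < n$, we need to prove that for any $\gamma_0, \gamma_1, \dots, \gamma_{n-1} \in \mathbb{A}$, not all zero, we have

\[ \gamma_0 + \gamma_1 \log \alpha_1 + \cdots + \gamma_{n-1} \log \alpha_{n-1} \neq \log \alpha_n. \tag{8.2} \]

Suppose (8.2) is not true. Then there exist $\gamma_0, \dots, \gamma_{n-1} \in \mathbb{A}$, not all zero, for which

\[ \gamma_0 + \gamma_1 \log \alpha_1 + \cdots + \gamma_{n-1} \log \alpha_{n-1} = \log \alpha_n. \]

By exponentiating both sides, we get

\[ e^{\gamma_0} \alpha_1^{\gamma_1} \cdots \alpha_{n-1}^{\gamma_{n-1}} = \alpha_n. \]

Therefore, for every non-negative integer $\lambda_n$, we have

\[ e^{\lambda_n \gamma_0} \alpha_1^{\lambda_n \gamma_1} \cdots \alpha_{n-1}^{\lambda_n \gamma_{n-1}} = \alpha_n^{\lambda_n}. \]

Multiplying both sides by $\alpha_1^{\lambda_1} \cdots \alpha_{n-1}^{\lambda_{n-1}}$, we get, for all non-negative integers $\lambda_1, \dots, \lambda_n$, that

\[ e^{\lambda_n \gamma_0}\, \alpha_1^{\lambda_1 + \lambda_n \gamma_1} \cdots \alpha_{n-1}^{\lambda_{n-1} + \lambda_n \gamma_{n-1}} = \alpha_1^{\lambda_1} \cdots \alpha_n^{\lambda_n}, \]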

which is a contradiction to (8.1).

Thus we need to prove Theorem 8.1.2. We assume that the assertion of the theorem is not true and arrive at a contradiction. This means we can assume that for all non-negative integers $\lambda_1, \dots, \lambda_n$, we have

\[ e^{\lambda_n \beta_0}\, \alpha_1^{\lambda_1 + \lambda_n \beta_1} \cdots \alpha_{n-1}^{\lambda_{n-1} + \lambda_n \beta_{n-1}} = \alpha_1^{\lambda_1} \cdots \alpha_n^{\lambda_n}. \tag{8.3} \]

Therefore, it follows, by raising both sides to the $\ell$th power, that for all positive integers $\ell$ we have

\[ e^{\lambda_n \beta_0 \ell}\, \alpha_1^{(\lambda_1 + \lambda_n \beta_1)\ell} \cdots \alpha_{n-1}^{(\lambda_{n-1} + \lambda_n \beta_{n-1})\ell} = \alpha_1^{\lambda_1 \ell} \cdots \alpha_n^{\lambda_n \ell}. \tag{8.4} \]

We shall denote by $c_{8.\cdots} = c_{8.\cdots}(\alpha_1, \dots, \alpha_n, \beta_0, \dots, \beta_{n-1})$ effectively computable positive numbers depending only on $\alpha_1, \dots, \alpha_n, \beta_0, \dots, \beta_{n-1}$. By this dependence, we mean any quantity which can be determined once $\alpha_1, \dots, \alpha_n, \beta_0, \dots, \beta_{n-1}$ are given, for instance, when $n$, $\deg(\alpha_i)$, $d(\alpha_i)$, $h(\alpha_i)$ for $1 \le i \le n$ and $\deg(\beta_j)$, $d(\beta_j)$, $h(\beta_j)$ for $0 \le j < n$ are given. Let $K = \mathbb{Q}(\alpha_1, \dots, \alpha_n, \beta_0, \dots, \beta_{n-1})$ and $[K : \mathbb{Q}] = \nu$. We put $a_i = d(\alpha_i)$, $e_i = d(1/\alpha_i)$ for $1 \le i \le n$ and $b_j = d(\beta_j)$ for $0 \le j \le n - 1$.

8.1.1 A Lower Bound for a Non-vanishing Linear Form

Lemma 8.1.3 Let $\alpha_1,\dots,\alpha_n$ be non-zero algebraic numbers such that $\log\alpha_1,\dots,\log\alpha_n$ are $\mathbb{Q}$-linearly independent complex numbers. Let $T > 0$ be any real number. Suppose $t_1,\dots,t_n$ are integers, not all zero, with $|t_i| \le T$ for $1 \le i \le n$. Then there exists $c_{8.1}$ such that
\[ |t_1\log\alpha_1+t_2\log\alpha_2+\cdots+t_n\log\alpha_n| \ge c_{8.1}^{-T}. \]

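Proof Since $\log\alpha_1,\dots,\log\alpha_n$ are $\mathbb{Q}$-linearly independent, we see that $t_1\log\alpha_1+\cdots+t_n\log\alpha_n \neq 0$. We want to prove a lower bound for this number. So, let
\[ \Lambda = t_1\log\alpha_1+\cdots+t_n\log\alpha_n \in \mathbb{C}^*. \]
Then
\[ e^{\Lambda} = e^{t_1\log\alpha_1+\cdots+t_n\log\alpha_n} = \alpha_1^{t_1}\cdots\alpha_n^{t_n} \in A. \]
Without loss of generality, we may assume that $t_1,\dots,t_k \ge 0$ and $t_{k+1},\dots,t_n < 0$. Let $t_j = -s_j$ with $s_j > 0$ for all $j = k+1,k+2,\dots,n$. Consider the complex number
\[ \omega = a_1^{t_1}\cdots a_k^{t_k}\,e_{k+1}^{s_{k+1}}\cdots e_n^{s_n}\left(\alpha_1^{t_1}\cdots\alpha_k^{t_k}\,\alpha_{k+1}^{-s_{k+1}}\cdots\alpha_n^{-s_n}-1\right) \in A. \]
Then $\omega$ is an algebraic integer in $K$. Hence $\deg(\omega) \le \nu$ and
\[ |\sigma(\omega)| \le c_{8.2}^{T} \tag{8.5} \]
for any conjugate $\sigma(\omega)$ of $\omega$.

Suppose $\omega = 0$. Then $e^{\Lambda} = 1$, giving $\Lambda = 2\pi i m$ for some integer $m \neq 0$, as $\Lambda \neq 0$. Therefore, we get $|\Lambda| = |2\pi m| \ge 2\pi > 1$, which proves the assertion of the lemma by taking $c_{8.1} = 1$.

Now consider $\omega \neq 0$. Then
\[ \prod_{\sigma}|\sigma(\omega)| = |N(\omega)| \ge 1, \]
giving by (8.5) that
\[ |\omega| \ge \frac{1}{\prod_{\sigma(\omega)\neq\omega}|\sigma(\omega)|} \ge \left(c_{8.2}^{\nu-1}\right)^{-T}. \]
Thus we get
\[ |e^{\Lambda}-1| = \frac{|\omega|}{a_1^{t_1}\cdots a_k^{t_k}\,e_{k+1}^{s_{k+1}}\cdots e_n^{s_n}} \ge \frac{1}{\left(c_{8.2}^{\nu-1}c_{8.3}\right)^{T}}. \]
Since for any complex number $z$ we know that $|e^z-1| \le |z|e^{|z|}$, and since $|\Lambda| \le c_{8.4}T$, we get
\[ |\Lambda| = \frac{|\Lambda|e^{|\Lambda|}}{e^{|\Lambda|}} \ge \frac{|e^{\Lambda}-1|}{e^{|\Lambda|}} \ge \frac{1}{c_{8.5}^{T}}, \]
where $c_{8.5} = c_{8.2}^{\nu-1}c_{8.3}e^{c_{8.4}}$.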
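The last step of the proof rests on the elementary inequality $|e^z-1| \le |z|e^{|z|}$. The following is a minimal numerical sanity check of that inequality at a handful of sample points; the function name and the sample points are illustrative choices, not part of the text, and a floating-point check is of course not a proof.

```python
import cmath
import math

def exp_minus_one_gap(z: complex) -> float:
    """Return |z| e^{|z|} - |e^z - 1|; the inequality says this is >= 0."""
    return abs(z) * math.exp(abs(z)) - abs(cmath.exp(z) - 1)

# Illustrative sample points (an assumption of this sketch), including a
# purely imaginary point close to 2*pi*i, where e^z - 1 is small.
samples = [0.3, -0.7, 1 + 2j, -2 + 0.5j, 6.2j, 1e-3j]
gaps = [exp_minus_one_gap(complex(z)) for z in samples]
```

In the proof the inequality is applied with $z = \Lambda$, which turns the lower bound for $|e^{\Lambda}-1|$ into a lower bound for $|\Lambda|$ itself.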


8.1.2 A Special Augmentative Polynomial

Lemma 8.1.4 Let $R$ and $S$ be given positive integers. Let $\sigma_0,\sigma_1,\dots,\sigma_{R-1}$ be given distinct complex numbers. Define
\[ \sigma = \max\{1, |\sigma_n| : 0 \le n \le R-1\} \quad\text{and}\quad \rho = \min\{1, |\sigma_i-\sigma_j| : 0 \le i < j \le R-1\}. \]
Let $(r,s) \in [0,R-1]\times[0,S-1]$ be a given pair of integers. Then there exists a polynomial
\[ W(z) = \sum_{j=0}^{RS-1} c_j z^j \in \mathbb{C}[z] \]
of degree at most $RS-1$ such that

(a) $|c_j| \le \left(\dfrac{2\sigma}{\rho}\right)^{RS}$ for $0 \le j < RS$;

(b) $\dfrac{d^i}{dz^i}(W(z))\Big|_{z=\sigma_j} = \begin{cases} 0 & \text{if } (i,j) \neq (s,r),\\ 1 & \text{if } (i,j) = (s,r). \end{cases}$

Proof Fix a pair $(r,s)$. Consider
\[ W(z) = a_0 (z-\sigma_r)^s \prod_{\substack{m=0\\ m\neq r}}^{R-1} (z-\sigma_m)^S, \quad\text{where}\quad a_0 = \left(s! \prod_{\substack{m=0\\ m\neq r}}^{R-1} (\sigma_r-\sigma_m)^S\right)^{-1}. \]
Clearly, $W(z)$ is a polynomial with complex coefficients of degree at most $RS-1$. Using Leibnitz's formula, it can be seen that
\[ \frac{d^i}{dz^i}(W(z))\Big|_{z=\sigma_j} = 0 \quad\text{for } i < S \text{ and } j \neq r \]
and
\[ \frac{d^s}{dz^s}(W(z))\Big|_{z=\sigma_r} = a_0\, s! \prod_{\substack{m=0\\ m\neq r}}^{R-1} (\sigma_r-\sigma_m)^S = 1, \]
by the definition of $a_0$. Hence (b) is satisfied.

Now, we estimate the coefficients of $W(z)$. Write $W(z) = c_0+c_1z+\cdots+c_{RS-1}z^{RS-1}$. Note that
\[ W(z) \ll |a_0|(z+\sigma)^{RS}. \]
Hence $|c_m| \le |a_0| \times$ (coefficient of $z^m$ in $(z+\sigma)^{RS}$) for $0 \le m < RS$. Note that the coefficient of $z^m$ in $(z+\sigma)^{RS}$ is equal to
\[ \binom{RS}{m}\sigma^{RS-m} < (\sigma+1)^{RS} \]
and
\[ |a_0| = \left(s! \prod_{\substack{m=0\\ m\neq r}}^{R-1} |\sigma_r-\sigma_m|^S\right)^{-1} \le \rho^{-S(R-1)}, \]
by the definition of $\rho$. Thus we get $|c_m| \le (\sigma+1)^{RS}\rho^{-S(R-1)} \le (2\sigma/\rho)^{RS}$, since $\sigma \ge 1$ and $\rho \le 1$. This proves (a).
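To make the construction in the proof of Lemma 8.1.4 concrete, here is a small exact computation with rational arithmetic. It is only a sketch under stated assumptions: the nodes $0, 1, 3$, the values $S = 2$ and $(r,s) = (1,1)$, and all helper names are illustrative and not taken from the text. The example deliberately takes $s = S-1$; other choices of $s$ are not examined here.

```python
from fractions import Fraction
from math import factorial

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (low degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_deriv(p):
    """Derivative of a coefficient list (low degree first)."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def poly_eval(p, x):
    """Horner evaluation."""
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

def augmentative_poly(nodes, S, r, s):
    """W(z) = a0 (z - nodes[r])^s * prod_{m != r} (z - nodes[m])^S."""
    W = [Fraction(1)]
    a0_inv = Fraction(factorial(s))
    for m, node in enumerate(nodes):
        power = s if m == r else S
        for _ in range(power):
            W = poly_mul(W, [Fraction(-node), Fraction(1)])
        if m != r:
            a0_inv *= Fraction(nodes[r] - node) ** S
    return [c / a0_inv for c in W]

def derivative_table(W, nodes, S):
    """Values of the i-th derivative of W at nodes[j], for 0 <= i < S."""
    table, D = {}, W
    for i in range(S):
        for j, node in enumerate(nodes):
            table[(i, j)] = poly_eval(D, Fraction(node))
        D = poly_deriv(D)
    return table

# Hypothetical sample data: R = 3 nodes, S = 2, (r, s) = (1, 1).
nodes, S, r, s = [0, 1, 3], 2, 1, 1
W = augmentative_poly(nodes, S, r, s)
table = derivative_table(W, nodes, S)
sigma = max(1, *(abs(x) for x in nodes))
rho = min(1, *(abs(a - b) for i, a in enumerate(nodes) for b in nodes[i + 1:]))
bound = Fraction(2 * sigma, rho) ** (len(nodes) * S)
```

For this sample, $W(z) = \tfrac14 z^2(z-1)(z-3)^2$, the table has the single entry $1$ at $(i,j) = (s,r)$ and zeros elsewhere, and every coefficient is far below the bound $(2\sigma/\rho)^{RS} = 6^6$.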
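\[ (L+1)^{n+1} \ge h^{(n+1)\left(2-\frac{1}{4n}\right)} > \nu(\nu+1)\,h(h^2+1)^n \tag{8.6} \]
by taking $h > (\nu(\nu+1)2^n)^{4n/(3n-1)}$, say. Here $\nu = [K:\mathbb{Q}]$.

Simplification of a Typical Equation

Consider $\Phi_{m_0,m_1,\dots,m_{n-1}}(\ell) = 0$. By the definition of $\Phi_{m_0,\dots,m_{n-1}}(z_0,\dots,z_{n-1})$, we get
\[
\sum_{\lambda_0=0}^{L}\cdots\sum_{\lambda_n=0}^{L} p(\lambda_0,\dots,\lambda_n)\,
\frac{\partial^{m_0}}{\partial z_0^{m_0}}\left(z_0^{\lambda_0}e^{\lambda_n\beta_0 z_0}\right)\Big|_{z_0=\ell}\,
\frac{\partial^{m_1}}{\partial z_1^{m_1}}\alpha_1^{(\lambda_1+\lambda_n\beta_1)z_1}\Big|_{z_1=\ell}
\cdots
\frac{\partial^{m_{n-1}}}{\partial z_{n-1}^{m_{n-1}}}\alpha_{n-1}^{(\lambda_{n-1}+\lambda_n\beta_{n-1})z_{n-1}}\Big|_{z_{n-1}=\ell} = 0. \tag{8.7}
\]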

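By Leibnitz's formula, we find
\[ \frac{\partial^{m_i}}{\partial z_i^{m_i}}\alpha_i^{(\lambda_i+\lambda_n\beta_i)z_i}\Big|_{z_i=\ell} = (\lambda_i+\lambda_n\beta_i)^{m_i}(\log\alpha_i)^{m_i}\,\alpha_i^{(\lambda_i+\lambda_n\beta_i)\ell} \quad\text{for } 1 \le i \le n-1 \]
and
\[
\frac{\partial^{m_0}}{\partial z_0^{m_0}}\left(z_0^{\lambda_0}e^{\lambda_n\beta_0 z_0}\right)\Big|_{z_0=\ell}
= \sum_{\mu_0=0}^{m_0}\binom{m_0}{\mu_0}\frac{\partial^{\mu_0}}{\partial z_0^{\mu_0}}\left(z_0^{\lambda_0}\right)\Big|_{z_0=\ell}\,\frac{\partial^{m_0-\mu_0}}{\partial z_0^{m_0-\mu_0}}e^{\lambda_n\beta_0 z_0}\Big|_{z_0=\ell}
\]
\[
= \sum_{\mu_0=0}^{m_0}\binom{m_0}{\mu_0}\lambda_0(\lambda_0-1)\cdots(\lambda_0-\mu_0+1)\,\ell^{\lambda_0-\mu_0}(\lambda_n\beta_0)^{m_0-\mu_0}e^{\lambda_n\beta_0\ell}.
\]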
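The Leibnitz expansion of $\partial^{m}(z^{\lambda}e^{cz})$ just used can be cross-checked symbolically. The sketch below compares two ways of obtaining the polynomial $P$ with $\frac{d^m}{dz^m}(z^{\lambda}e^{cz}) = P(z)e^{cz}$: repeated use of the product rule, and the closed Leibnitz sum. The integer $c$ stands in for $\lambda_n\beta_0$, which is an assumption made purely so that the arithmetic is exact; the function names are hypothetical.

```python
from math import comb

def falling(x, k):
    """Falling factorial x (x-1) ... (x-k+1)."""
    out = 1
    for i in range(k):
        out *= x - i
    return out

def derivative_poly_factor(lam, c, m):
    """Coefficients (low degree first) of P, where
    d^m/dz^m (z^lam e^{cz}) = P(z) e^{cz}.
    Uses only the step (P e^{cz})' = (P' + c P) e^{cz}."""
    P = [0] * lam + [1]
    for _ in range(m):
        dP = [k * a for k, a in enumerate(P)][1:] + [0]
        P = [d + c * a for d, a in zip(dP, P)]
    return P

def leibniz_poly_factor(lam, c, m):
    """The same polynomial, read off from the Leibnitz sum over mu."""
    P = [0] * (lam + 1)
    for mu in range(min(m, lam) + 1):  # terms with mu > lam vanish
        P[lam - mu] += comb(m, mu) * falling(lam, mu) * c ** (m - mu)
    return P
```

For example, both functions return `[1, 1]` for `lam = 1, c = 1, m = 1`, matching $(ze^{z})' = (1+z)e^{z}$.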

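Therefore, Eq. (8.7) becomes
\[
\sum_{\lambda_0=0}^{L}\cdots\sum_{\lambda_n=0}^{L} p(\lambda_0,\dots,\lambda_n)
\left(\sum_{\mu_0=0}^{m_0}\binom{m_0}{\mu_0}\prod_{i=0}^{\mu_0-1}(\lambda_0-i)\;\ell^{\lambda_0-\mu_0}(\lambda_n\beta_0)^{m_0-\mu_0}e^{\lambda_n\beta_0\ell}\right)
\times\prod_{i=1}^{n-1}(\log\alpha_i)^{m_i}(\lambda_i+\lambda_n\beta_i)^{m_i}\,\alpha_i^{(\lambda_i+\lambda_n\beta_i)\ell} = 0. \tag{8.8}
\]
Since $\log\alpha_1,\dots,\log\alpha_{n-1}$ are $\mathbb{Q}$-linearly independent, we see that $(\log\alpha_1)^{m_1}\cdots(\log\alpha_{n-1})^{m_{n-1}} \neq 0$, and this factor is independent of the summation variables. Thus Eq. (8.8) reduces to
\[
\sum_{\lambda_0,\dots,\lambda_n=0}^{L} p(\lambda_0,\dots,\lambda_n)\,q(m_0,\lambda_0,\lambda_n,\ell,\beta_0)\prod_{i=1}^{n-1}(\lambda_i+\lambda_n\beta_i)^{m_i}\;e^{\lambda_n\beta_0\ell}\prod_{i=1}^{n-1}\alpha_i^{(\lambda_i+\lambda_n\beta_i)\ell} = 0, \tag{8.9}
\]
where
\[
q(m_0,\lambda_0,\lambda_n,\ell,\beta_0) = \sum_{\mu_0=0}^{m_0}\binom{m_0}{\mu_0}\lambda_0(\lambda_0-1)\cdots(\lambda_0-\mu_0+1)\,\ell^{\lambda_0-\mu_0}(\lambda_n\beta_0)^{m_0-\mu_0}.
\]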

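Application of (8.4)

Using Eq. (8.4) in (8.9) we get
\[
\sum_{\lambda_0,\dots,\lambda_n=0}^{L} p(\lambda_0,\dots,\lambda_n)\,q(m_0,\lambda_0,\lambda_n,\ell,\beta_0)\prod_{i=1}^{n-1}(\lambda_i+\lambda_n\beta_i)^{m_i}\;\alpha_1^{\lambda_1\ell}\cdots\alpha_n^{\lambda_n\ell} = 0. \tag{8.10}
\]
Recall that $a_i = d(\alpha_i)$ for $1 \le i \le n$ and $b_j = d(\beta_j)$ for $0 \le j \le n-1$. We multiply Eq. (8.10) by $(a_1a_2\cdots a_n)^{L\ell}\,b_0^{m_0}\cdots b_{n-1}^{m_{n-1}}$ to get
\[
\sum_{\lambda_0,\dots,\lambda_n=0}^{L} p(\lambda_0,\dots,\lambda_n)\,q'(m_0,\lambda_0,\lambda_n,\ell,\beta_0)\prod_{i=1}^{n-1}\bigl(b_i\lambda_i+\lambda_n(b_i\beta_i)\bigr)^{m_i}\prod_{i=1}^{n}a_i^{(L-\lambda_i)\ell}\prod_{i=1}^{n}(a_i\alpha_i)^{\lambda_i\ell} = 0, \tag{8.11}
\]
where
\[
q'(m_0,\lambda_0,\lambda_n,\ell,\beta_0) = \sum_{\mu_0=0}^{m_0}\binom{m_0}{\mu_0}\lambda_0(\lambda_0-1)\cdots(\lambda_0-\mu_0+1)\,\ell^{\lambda_0-\mu_0}\,b_0^{\mu_0}(\lambda_n b_0\beta_0)^{m_0-\mu_0}.
\]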

Note that the coefficients of $p(\lambda_0,\dots,\lambda_n)$ in (8.11) are algebraic integers in $K$.

Estimation of the Coefficients

(a) $\displaystyle\prod_{i=1}^{n}|a_i\alpha_i|^{\lambda_i\ell} \le c_{8.7}^{Lh}$.

(b) $\displaystyle\prod_{i=1}^{n-1}|b_i\lambda_i+\lambda_n b_i\beta_i|^{m_i} \le L^{(n-1)h^2}c_{8.8}^{(n-1)h^2} \le L^{c_{8.9}h^2}$.

(c) $\displaystyle\prod_{i=1}^{n}a_i^{(L-\lambda_i)\ell} \le c_{8.10}^{Lh}$.

(d) $\displaystyle |q'(m_0,\lambda_0,\lambda_n,\ell,\beta_0)| = \left|\sum_{\mu_0=0}^{m_0}\binom{m_0}{\mu_0}\lambda_0(\lambda_0-1)\cdots(\lambda_0-\mu_0+1)\,\ell^{\lambda_0-\mu_0}b_0^{\mu_0}(\lambda_n b_0\beta_0)^{m_0-\mu_0}\right| \le (m_0+1)\,2^{m_0}\max(1,\lambda_0)^{m_0}\,\ell^{\lambda_0}\,\lambda_n^{m_0}\,c_{8.11}^{m_0} \le L^{c_{8.12}h^2}h^{L}$.

Combining (a)–(d), the coefficients in Eq. (8.11) are bounded by
\[ e^{c_{8.13}Lh+c_{8.14}h^2\log L}. \]
In fact, the above bound is true for any conjugate of the coefficients. Therefore, by Lemma 3.1.3(ii) with $v = (L+1)^{n+1}$, $w = h(h^2+1)^n$, and (8.6), we get
\[ |p(\lambda_0,\dots,\lambda_n)| \le c_{8.15}(L+1)^{n+1}e^{c_{8.13}Lh+c_{8.14}h^2\log L}. \]

Final Estimate

We have

(i) $(n+1)\log(L+1) \le (n+1)\log 2+(n+1)\left(2-\frac{1}{4n}\right)\log h$,

(ii) $Lh \le h^{3-\frac{1}{4n}}$,

(iii) $h^2\log L \le \left(2-\frac{1}{4n}\right)h^2\log h$.

Hence we get
\[ |p(\lambda_0,\dots,\lambda_n)| \le e^{c_{8.16}h^{3-1/(4n)}} < e^{h^3} \]
whenever $h > c_{8.16}^{4n}$. Thus the lemma is proved by taking $h > c_{8.6}$, where
\[ c_{8.6} = \max\left(c_{8.16}^{4n},\ (\nu(\nu+1)2^n)^{4n/(3n-1)}\right). \]
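The role of the threshold $(\nu(\nu+1)2^n)^{4n/(3n-1)}$ in the choice of parameters can be illustrated numerically. The sketch below assumes $L = \lfloor h^{2-1/(4n)} \rfloor$, which is suggested by estimates (i) and (ii) but is an assumption of this example, and compares the number of unknowns $v = (L+1)^{n+1}$ with $\nu(\nu+1)$ times the number of equations $w = h(h^2+1)^n$. The function names and the sample values of $n$ and $\nu$ are hypothetical.

```python
import math

def parameters(h, n):
    """Assumed choice L = floor(h^(2 - 1/(4n))).
    Returns (L, number of unknowns, number of equations)."""
    L = math.floor(h ** (2 - 1 / (4 * n)))
    unknowns = (L + 1) ** (n + 1)
    equations = h * (h * h + 1) ** n
    return L, unknowns, equations

def threshold(nu, n):
    """The lower bound on h quoted in the text."""
    return (nu * (nu + 1) * 2 ** n) ** (4 * n / (3 * n - 1))

def enough_unknowns(h, n, nu):
    """True when (L+1)^(n+1) > nu (nu+1) h (h^2+1)^n."""
    _, unknowns, equations = parameters(h, n)
    return unknowns > nu * (nu + 1) * equations
```

With $n = 1$ and $\nu = 2$ the threshold is $144$; the comparison succeeds for the values of $h$ just above it and fails for a small value such as $h = 2$, which is why the parameter $h$ must be taken large.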


8.1.4 Basic Estimates Relating to $\Phi$

Lemma 8.1.6 Let $h \ge c_{8.6}$ and let $\Phi$ be as constructed in Lemma 8.1.5. Let $m_0,m_1,\dots,m_{n-1}$ be non-negative integers such that $m_0+m_1+\cdots+m_{n-1} \le h^2$. Let
\[ f_{m_0,\dots,m_{n-1}}(z) := \frac{\partial^{m_0}}{\partial z_0^{m_0}}\cdots\frac{\partial^{m_{n-1}}}{\partial z_{n-1}^{m_{n-1}}}\bigl(\Phi(z_0,\dots,z_{n-1})\bigr)\Big|_{z_0=\cdots=z_{n-1}=z}. \]
Then there exist $c_{8.17}$ and $c_{8.18}$ satisfying the following properties:

(a) For any $z \in \mathbb{C}$,
\[ |f_{m_0,\dots,m_{n-1}}(z)| \le c_{8.17}^{h^3+L|z|}; \]

(b) For any integer $\ell$, we have either $f_{m_0,\dots,m_{n-1}}(\ell) = 0$ or
\[ |f_{m_0,\dots,m_{n-1}}(\ell)| > c_{8.18}^{-h^3-L|\ell|}. \]

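Proof

Proof of (a) From (8.8), by replacing $\ell$ by $z$, we see that
\[
f_{m_0,\dots,m_{n-1}}(z) = \sum_{\lambda_0,\dots,\lambda_n=0}^{L} p(\lambda_0,\dots,\lambda_n)\,q(m_0,\lambda_0,\lambda_n,z,\beta_0)\prod_{i=1}^{n-1}(\lambda_i+\lambda_n\beta_i)^{m_i}\prod_{i=1}^{n-1}(\log\alpha_i)^{m_i}\left(\alpha_1^{\lambda_1}\cdots\alpha_n^{\lambda_n}\right)^{z}, \tag{8.12}
\]
where
\[
q(m_0,\lambda_0,\lambda_n,z,\beta_0) = \sum_{\mu_0=0}^{m_0}\binom{m_0}{\mu_0}\bigl(\lambda_0(\lambda_0-1)\cdots(\lambda_0-\mu_0+1)\bigr)z^{\lambda_0-\mu_0}(\lambda_n\beta_0)^{m_0-\mu_0}.
\]
Now, we shall estimate the individual terms to get an upper bound.

(a′) By Lemma 8.1.5, we know that $|p(\lambda_0,\dots,\lambda_n)| \le e^{h^3}$.

(b′) Consider
\[
|q(m_0,\lambda_0,\lambda_n,z,\beta_0)| \le \sum_{\mu_0=0}^{m_0}\binom{m_0}{\mu_0}|\lambda_0||\lambda_0-1|\cdots|\lambda_0-\mu_0+1|\,|z|^{\lambda_0-\mu_0}|\lambda_n\beta_0|^{m_0-\mu_0}
\le (c_{8.19})^{m_0}L^{m_0}(1+|z|)^{\lambda_0} \le c_{8.19}^{h^2}L^{h^2}(1+|z|)^{L}.
\]

(c′) $\displaystyle\left|\prod_{i=1}^{n-1}(\lambda_i+\lambda_n\beta_i)^{m_i}\right| \le L^{m_1+\cdots+m_{n-1}}c_{8.20}^{m_1+\cdots+m_{n-1}} \le L^{h^2}c_{8.20}^{h^2}$.

(d′) $\displaystyle\left|\prod_{i=1}^{n-1}(\log\alpha_i)^{m_i}\right| \le \max_i|\log\alpha_i|^{m_1+\cdots+m_{n-1}} \le c_{8.21}^{h^2}$.

(e′) $\displaystyle\left|\left(\alpha_1^{\lambda_1}\cdots\alpha_n^{\lambda_n}\right)^{z}\right| \le \left|\alpha_1^{\lambda_1}\cdots\alpha_n^{\lambda_n}\right|^{|z|} \le c_{8.22}^{L|z|}$.

Thus, by (a′)–(e′), we get
\[ |f_{m_0,\dots,m_{n-1}}(z)| \le e^{h^3}L^{c_{8.23}h^2+L}\,c_{8.24}^{L|z|} \le c_{8.25}^{h^3+L|z|}, \]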

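which proves (a).

Proof of (b) Let $\ell$ be a given integer. We assume that $f_{m_0,\dots,m_{n-1}}(\ell) \neq 0$. Let
\[ g_0(\ell) = \frac{f_{m_0,\dots,m_{n-1}}(\ell)}{(\log\alpha_1)^{m_1}\cdots(\log\alpha_{n-1})^{m_{n-1}}} \]
and
\[ g(\ell) = b_0^{m_0}\cdots b_{n-1}^{m_{n-1}}\,a_1^{L\ell}\cdots a_n^{L\ell}\,g_0(\ell). \]
Then $g(\ell)$ is a non-zero algebraic integer in $K$. Hence
\[ |N(g(\ell))| = |g(\ell)|\prod_{\sigma\neq 1}|\sigma(g(\ell))| \ge 1, \]
where the product is over all the embeddings $\sigma$ of $K$ into $\mathbb{C}$ except the identity map. Note that for any embedding $\sigma : K \to \mathbb{C}$, we have
\[
\sigma(g(\ell)) = \sum_{\lambda_0,\dots,\lambda_n=0}^{L} p(\lambda_0,\dots,\lambda_n)\,q'(m_0,\lambda_0,\lambda_n,\ell,\sigma(\beta_0))\prod_{i=1}^{n-1}\bigl(b_i\lambda_i+\lambda_n b_i\sigma(\beta_i)\bigr)^{m_i}\times\prod_{i=1}^{n}(a_i\sigma(\alpha_i))^{\lambda_i\ell}a_i^{(L-\lambda_i)\ell}.
\]
By (8.12), we see that
\[ |\sigma(g(\ell))| \le \frac{\prod_{i=1}^{n}a_i^{L\ell}\prod_{j=0}^{n-1}b_j^{m_j}}{\prod_{i=1}^{n-1}|\log\alpha_i|^{m_i}}\,|f_{m_0,\dots,m_{n-1}}(\ell)|. \]
Therefore, using the upper bound for $|f_{m_0,\dots,m_{n-1}}(\ell)|$ from (a), we get
\[ |\sigma(g(\ell))| \le c_{8.26}^{h^3+L|\ell|}. \]
Thus
\[
|g(\ell)| \ge \frac{1}{\prod_{\sigma\neq 1}|\sigma(g(\ell))|} \ge \left(\prod_{\sigma\neq 1}c_{8.26}^{h^3+L|\ell|}\right)^{-1} = c_{8.26}^{(\nu-1)(-h^3-L|\ell|)} = c_{8.27}^{-h^3-L|\ell|}.
\]
Therefore, we get
\[ |f_{m_0,\dots,m_{n-1}}(\ell)| \ge c_{8.28}^{-h^3-L|\ell|}, \]
as desired.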

8.1.5 Extrapolation Technique to Get More Zeros

By Lemma 8.1.5, we know that $f_{m_0,\dots,m_{n-1}}(\ell) = 0$ whenever $1 \le \ell \le h$ and $m_0+\cdots+m_{n-1} \le h^2$. Now we will increase the range of $\ell$ at the cost of a reduction in $m_0+\cdots+m_{n-1}$.

Lemma 8.1.7 There exists $c_{8.29}$ such that for any integer $h > c_{8.29}$ the following holds. Let $J$ be an integer with $0 \le J \le (8n)^2$. For all integers $\ell$ with $1 \le \ell \le h^{1+\frac{J}{8n}}$ and for all non-negative integers $m_0,\dots,m_{n-1}$ with $m_0+\cdots+m_{n-1} \le h^2/2^J$, we have $f_{m_0,\dots,m_{n-1}}(\ell) = 0$.

Proof We assume $h > c_{8.6}$, where $c_{8.6}$ is the number occurring in Lemma 8.1.5, so that the results of Lemmas 8.1.5 and 8.1.6 hold.

We prove the lemma by induction on $J$. When $J = 0$, we have $h^{1+\frac{J}{8n}} = h$ and $h^2/2^J = h^2$. Therefore the conclusion follows from Lemma 8.1.5. Hence, we shall assume that the result holds when $J = k$ and prove the result for $J = k+1$. We take $m_0,\dots,m_{n-1}$ to be non-negative integers such that
\[ m_0+\cdots+m_{n-1} \le h^2/2^{k+1}. \tag{8.13} \]
Then $m_0+\cdots+m_{n-1} \le h^2/2^k$. Let $\ell$ be any integer with $1 \le \ell \le h^{1+\frac{k}{8n}}$. By the induction hypothesis, it follows that $f_{m_0,\dots,m_{n-1}}(\ell) = 0$. Hence, we may assume that
\[ h^{1+\frac{k}{8n}} < \ell \le h^{1+\frac{k+1}{8n}}. \tag{8.14} \]
Suppose that $f_{m_0,\dots,m_{n-1}}(\ell) \neq 0$ with $m_0,\dots,m_{n-1}$ and $\ell$ satisfying (8.13) and (8.14). We compute lower and upper bounds for this non-zero quantity which contradict each other. This will prove the lemma.

Lower Bound for $|f_{m_0,\dots,m_{n-1}}(\ell)|$

By Lemma 8.1.6,
\[ |f_{m_0,\dots,m_{n-1}}(\ell)| \ge c_{8.18}^{-h^3-L|\ell|} \ge c_{8.30}^{-h^{3+k/(8n)}}, \tag{8.15} \]
where $c_{8.30}$ can be taken as $c_{8.18}^2$. The upper bound for $|f_{m_0,\dots,m_{n-1}}(\ell)|$ comes after many steps.

Construction of an Entire Function with Many Zeros

Let
\[ R_k = h^{1+\frac{k}{8n}}; \qquad S_k = \frac{h^2}{2^k}. \]
By (8.14) and (8.13), we see that $\ell$ satisfies $R_k < \ell \le R_{k+1}$ and $m_0,\dots,m_{n-1}$ satisfy $m_0+\cdots+m_{n-1} \le S_{k+1}$. Take any non-negative integer $m \le S_{k+1}$. Writing $m = j_0+\cdots+j_{n-1}$ with each $j_t$ a non-negative integer, we see that
\[
(m_0+j_0)+\cdots+(m_{n-1}+j_{n-1}) = (m_0+\cdots+m_{n-1})+(j_0+\cdots+j_{n-1}) \le S_{k+1}+m \le 2S_{k+1} \le 2\,\frac{h^2}{2^{k+1}} \le \frac{h^2}{2^k}.
\]
Therefore, by the induction hypothesis, for any integer $r$ with $1 \le r \le R_k$, we have
\[ f_{m_0+j_0,\dots,m_{n-1}+j_{n-1}}(r) = 0 \tag{8.16} \]
with $j_0+\cdots+j_{n-1} = m \le S_{k+1}$. Let $g(z) = f_{m_0,\dots,m_{n-1}}(z)$ and put $\frac{d^m}{dz^m}(g(z)) = g_m(z)$. Since
\[ g(r) = f_{m_0,\dots,m_{n-1}}(r) = \frac{\partial^{m_0}}{\partial z_0^{m_0}}\cdots\frac{\partial^{m_{n-1}}}{\partial z_{n-1}^{m_{n-1}}}\Phi(z_0,\dots,z_{n-1})\Big|_{z_0=\cdots=z_{n-1}=r}, \]
by Leibnitz's formula, we get
\[
g_m(r) = \sum_{\substack{j_0,\dots,j_{n-1}\\ j_0+\cdots+j_{n-1}=m}}\binom{m}{j_0,\dots,j_{n-1}}\frac{\partial^{j_0}}{\partial z_0^{j_0}}\cdots\frac{\partial^{j_{n-1}}}{\partial z_{n-1}^{j_{n-1}}}\Phi_{m_0,\dots,m_{n-1}}(z_0,\dots,z_{n-1})\Big|_{z_0=\cdots=z_{n-1}=r}
\]
\[
= \sum_{\substack{j_0,\dots,j_{n-1}\\ j_0+\cdots+j_{n-1}=m}}\binom{m}{j_0,\dots,j_{n-1}}\Phi_{m_0+j_0,\dots,m_{n-1}+j_{n-1}}(z_0,\dots,z_{n-1})\Big|_{z_0=\cdots=z_{n-1}=r}
= \sum_{\substack{j_0,\dots,j_{n-1}\\ j_0+\cdots+j_{n-1}=m}}\binom{m}{j_0,\dots,j_{n-1}}f_{m_0+j_0,\dots,m_{n-1}+j_{n-1}}(r). \tag{8.17}
\]

Thus, by Eqs. (8.16) and (8.17), we see that the entire function $g(z)$ has zeros at $z = 1,2,\dots,R_k$ of order $\ge S_{k+1}$.

Application of Maximum–Minimum Modulus Principle

Let $F(z) = [(z-1)(z-2)\cdots(z-R_k)]^{S_{k+1}}$. Then $g(z)/F(z)$ is an entire function. Let
\[ R = R_{k+1}h^{\frac{1}{8n}}. \]
Thus $R > 2R_k$ by taking $h > 2^{8n}$, say. Let $D_R$ be the closed disc $|z| \le R$. By the maximum–minimum modulus principle, we have
\[ \left|\frac{g(z)}{F(z)}\right| \le \frac{\max_{|z|=R}|g(z)|}{\min_{|z|=R}|F(z)|} \quad\text{for } z \in D_R. \]
Thus, for any $z \in D_R$, we get
\[ |g(z)| \le \frac{|F(z)|\left(\max_{|z|=R}|g(z)|\right)}{\min_{|z|=R}|F(z)|}. \]

Upper Bound for $\max_{|z|=R}|g(z)|$

By Lemma 8.1.6 (a),
\[ \max_{|z|=R}|g(z)| \le c_{8.17}^{h^3+LR}. \tag{8.18} \]

Lower Bound for $\min_{|z|=R}|F(z)|$

Since $\ell > R_k$, $F(\ell) \neq 0$. Further
\[ |F(\ell)| = (|\ell-1||\ell-2|\cdots|\ell-R_k|)^{S_{k+1}} \le R_{k+1}^{R_kS_{k+1}} \tag{8.19} \]
since $|\ell-j| \le \ell \le R_{k+1}$. As $R > 2R_k$, we get for $1 \le j \le R_k$ and $|z| = R$ that $|z-j| > \frac{R}{2}$. Therefore,
\[ \min_{|z|=R}|F(z)| \ge \left(\frac{R}{2}\right)^{R_kS_{k+1}}. \tag{8.20} \]
From (8.18)–(8.20), we get
\[ |g(\ell)| \le c_{8.17}^{h^3+LR}\left(2R_{k+1}/R\right)^{R_kS_{k+1}}. \]
Note that
\[ R_kS_{k+1} = \frac{h^{3+\frac{k}{8n}}}{2^{k+1}} \tag{8.21} \]
and
\[ c_{8.17}^{h^3+LR} \le c_{8.17}^{h^3+h^{3+\frac{k}{8n}}} \le c_{8.31}^{h^{3+\frac{k}{8n}}} \tag{8.22} \]
by taking $c_{8.31} = c_{8.17}^2$, say. Using (8.21) and (8.22), we get
\[
|g(\ell)| \le c_{8.31}^{h^{3+\frac{k}{8n}}}\left(\frac{2}{h^{1/(8n)}}\right)^{\left(h^{3+\frac{k}{8n}}\right)/2^{k+1}} \le c_{8.32}^{h^{3+\frac{k}{8n}}}\;h^{-\left(h^{3+\frac{k}{8n}}\right)/(2^{k+4}n)}, \tag{8.23}
\]
where $c_{8.32}$ can be taken as $2c_{8.31}$.

Final Contradiction

Comparing (8.15) and (8.23), we have
\[ h^{3+\frac{k}{8n}}\log c_{8.32}-\frac{h^{3+\frac{k}{8n}}}{n2^{k+4}}\log h \ge -h^{3+\frac{k}{8n}}\log c_{8.30}, \]
i.e.
\[ h^{3+\frac{k}{8n}}\left(\log c_{8.32}+\log c_{8.30}\right) \ge h^{3+\frac{k}{8n}}\,\frac{\log h}{n2^{k+4}}, \]
or
\[ \log h \le n2^{k+4}\log(c_{8.33}) \]
with $c_{8.33} = c_{8.30}c_{8.32}$. Taking $h > \max\left(c_{8.33}^{\,n2^{(8n)^2+4}},\ 2^{8n},\ c_{8.6}\right)$, we get the final contradiction.

8.1.6 Smallness of Derivatives

Let $f(z) = \Phi(z,\dots,z)$ with $\Phi$ as in Lemma 8.1.5.

Lemma 8.1.8 There exists a constant $c_{8.34}$ such that for all integers $h > c_{8.34}$ the following property holds. For all integers $j$ with $0 \le j \le h^{8n}$, we have
\[ \left|\frac{d^j}{dz^j}(f(z))\Big|_{z=0}\right| < e^{-h^{8n}}. \]

Proof Let $h > c_{8.29}$ so that Lemma 8.1.7 is valid. Let
\[ X = h^{8n} \quad\text{and}\quad Y = \frac{h^2}{2^{(8n)^2}}. \]
Then by Lemma 8.1.7 we have $f_{m_0,\dots,m_{n-1}}(\ell) = 0$ for $1 \le \ell \le X$ and for all non-negative integers $m_0,\dots,m_{n-1}$ with $m_0+\cdots+m_{n-1} \le Y$. That is, $f$ has zeros at $z = 1,2,\dots,X$ of order at least $Y$.

Application of Maximum–Minimum Modulus Principle

Let $R = Xh^{1/(8n)}$. Then $R > 2X$ since $h > c_{8.29} \ge 2^{8n}$. Let $C$ be the circle $|z| = R$ and $D_R$ the closed disc $|z| \le R$. Put
\[ F(z) = \frac{f(z)}{[(z-1)(z-2)\cdots(z-X)]^{Y}}. \]
Then $F(z)$ is an entire function. Therefore, by the maximum–minimum modulus principle, for all $z \in D_R$, we get
\[ |F(z)| \le \frac{\max_{|z|=R}|f(z)|}{\min_{|z|=R}(|z-1|\cdots|z-X|)^{Y}}. \]
This implies that
\[ |f(z)| \le \frac{\max_{|z|=R}|f(z)|\,(|z-1|\cdots|z-X|)^{Y}}{\min_{|z|=R}(|z-1|\cdots|z-X|)^{Y}}. \]
First we estimate $\max_{|z|=R}|f(z)|$. By Lemma 8.1.6 (a), we get
\[ \max_{|z|=R}|f(z)| \le c_{8.17}^{h^3+LR} \le c_{8.17}^{2h^{2+8n}} \le c_{8.17}^{2^{(8n)^2+1}XY} \le c_{8.35}^{XY}. \tag{8.24} \]
Next, since $X < R/2$, we get
\[ \min_{|z|=R}(|z-1|\cdots|z-X|)^{Y} > (R-X)^{XY} > \left(\frac{R}{2}\right)^{XY}. \tag{8.25} \]
Lastly, for all $|z| \le X$, we have
\[ (|z-1|\cdots|z-X|)^{Y} \le (2X)^{XY}. \tag{8.26} \]
Combining (8.24)–(8.26), we obtain for all $|z| \le X$,
\[ |f(z)| \le (4c_{8.35}X/R)^{XY} < \exp(-XY) \tag{8.27} \]
for $h > (4ec_{8.35})^{8n}$.

Application of Cauchy Integral Formula

For any integer $0 \le j \le h^{8n}$ and circle $C : |z| = X$, we have by the Cauchy integral formula and (8.27) that
\[ |f^{(j)}(0)| \le \frac{j!}{2\pi}\oint_C\frac{|f(z)|}{|z|^{j+1}}\,|dz| \le j^{j}\,\frac{e^{-XY}}{X^{j}} \le e^{-XY}. \]
Observe that $XY = h^{8n+2}/2^{(8n)^2} > h^{8n}$ if $h > 2^{(8n)^2/2}$. This completes the proof of the lemma by taking $c_{8.34} = \max\left(c_{8.29},\ (4ec_{8.35})^{8n},\ 2^{(8n)^2/2}\right)$.

8.2 Proof of Baker's Theorem

We assume that the hypotheses of Theorem 8.1.2 hold. Then Lemmas 8.1.5–8.1.8 are valid. The main strategy now is to show that the inequalities in Lemma 8.1.8 cannot all be valid simultaneously.

Unique Representation of Integers

Let $n \ge 1$ and $L \ge 1$ be integers. Put $S = L+1$ and $R = S^n$. It is well known that any non-negative integer $\alpha$ can be represented uniquely in base $S$ as
\[ \alpha = \lambda_0+\lambda_1S+\cdots+\lambda_\ell S^{\ell} \tag{8.28} \]
with $0 \le \lambda_i \le L$ for $0 \le i \le \ell$, for some integer $\ell \ge 0$. It is easy to see that every integer $\alpha$ with $0 \le \alpha \le S^{n+1}-1$ can be represented uniquely as in (8.28) with $\ell \le n$, and we write $\alpha = (\lambda_0,\lambda_1,\dots,\lambda_n)$. Set
\[ \nu_\alpha = \lambda_0, \qquad p_\alpha = p(\lambda_0,\dots,\lambda_n), \qquad \psi_\alpha = \lambda_1\log\alpha_1+\cdots+\lambda_n\log\alpha_n. \tag{8.29} \]
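The indexing of tuples $(\lambda_0,\dots,\lambda_n)$ by a single integer $\alpha$ in (8.28) is ordinary base-$S$ expansion. A minimal sketch follows; the function names `digits` and `from_digits` are hypothetical and chosen only for this illustration.

```python
def digits(alpha, L, n):
    """Base-(L+1) digits (lambda_0, ..., lambda_n) of an integer alpha
    with 0 <= alpha <= (L+1)^(n+1) - 1, least significant digit first."""
    S = L + 1
    if not 0 <= alpha < S ** (n + 1):
        raise ValueError("alpha out of range")
    lam = []
    for _ in range(n + 1):
        alpha, d = divmod(alpha, S)
        lam.append(d)
    return tuple(lam)

def from_digits(lam, L):
    """Inverse map: lambda_0 + lambda_1 S + ... + lambda_n S^n."""
    S = L + 1
    return sum(d * S ** i for i, d in enumerate(lam))
```

For instance, with $L = 2$ and $n = 2$ (so $S = 3$), `digits(14, 2, 2)` gives `(2, 1, 1)`, since $14 = 2 + 1\cdot 3 + 1\cdot 9$; in the notation of (8.29) this means $\nu_{14} = 2$.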

Cardinality of the set $\{\psi_\alpha\}$ with $0 \le \alpha \le RS-1$

Lemma 8.2.1 Let $\mathcal{S} = \{\psi_\alpha : 0 \le \alpha \le RS-1\}$. Then $|\mathcal{S}| = S^n$.

Proof Let $(\lambda_0,\dots,\lambda_n)$ and $(\kappa_0,\dots,\kappa_n) \in [0,L]^{n+1}$ be two tuples. Then $(\lambda_0,\dots,\lambda_n) \neq (\kappa_0,\dots,\kappa_n)$ if and only if either $\lambda_0 \neq \kappa_0$ or $(\lambda_1,\dots,\lambda_n) \neq (\kappa_1,\dots,\kappa_n)$. Therefore, the number of distinct elements $(\lambda_0,\dots,\lambda_n) \in [0,L]^{n+1}$ such that $\lambda_0 = \kappa_0$ is precisely $(L+1)^n = R$. Consider $\alpha = (\kappa_0,\lambda_1,\dots,\lambda_n)$ and $\beta = (\kappa_0,\kappa_1,\dots,\kappa_n)$ with $\alpha \neq \beta$. Then we claim that $\psi_\alpha \neq \psi_\beta$. Suppose not. Then
\[ (\lambda_1-\kappa_1)\log\alpha_1+\cdots+(\lambda_n-\kappa_n)\log\alpha_n = 0. \]
Since for some $i$ with $1 \le i \le n$ we have $\lambda_i-\kappa_i \neq 0$, the above equality implies that $\log\alpha_1,\dots,\log\alpha_n$ are $\mathbb{Q}$-linearly dependent, a contradiction which proves the claim. Thus the $\alpha$'s with fixed $\lambda_0$ correspond to distinct $\psi_\alpha$. Thus $|\mathcal{S}| = R$.

Connecting to Augmentative Polynomial

We list the distinct elements of $\mathcal{S}$ as $\psi_0,\psi_1,\dots,\psi_{R-1}$. Let $r$ and $s$ be two given integers with $0 \le r \le R-1$ and $0 \le s \le S-1$. Then by Lemma 8.1.4 there exists a polynomial
\[ W(z) = \sum_{i=0}^{RS-1}c_iz^i \in \mathbb{C}[z] \tag{8.30} \]
of degree at most $RS-1$ and satisfying

(a) $|c_i| < (2\sigma/\rho)^{RS}$;

(b) $\dfrac{d^j}{dz^j}(W(z))\Big|_{z=\psi_i} = \begin{cases} 1 & \text{if } (i,j) = (r,s),\\ 0 & \text{if } (i,j) \neq (r,s), \end{cases}$

where $\sigma = \max\{1,|\psi_i| : 0 \le i \le R-1\}$ and $\rho = \min\{1,|\psi_i-\psi_j| : 0 \le i < j \le R-1\}$.

Lemma 8.2.2 Let $\psi_\alpha$, $\nu_\alpha$ and $p_\alpha$ for $\alpha \in [0,RS-1]$ be as defined in (8.29) and $W(z)$ be the polynomial given in (8.30). Then
\[ \left(\frac{d}{dz}\right)^{\nu_\alpha}(W(z))\Big|_{z=\psi_\alpha} = \sum_{j=0}^{RS-1}c_j\left(\frac{d}{dz}\right)^{j}\left(z^{\nu_\alpha}e^{\psi_\alpha z}\right)\Big|_{z=0}. \]

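Proof Consider
\[
\left(\frac{d}{dz}\right)^{\nu_\alpha}(W(z))\Big|_{z=\psi_\alpha} = \left(\frac{d}{dz}\right)^{\nu_\alpha}\sum_{k=0}^{RS-1}c_kz^k\Big|_{z=\psi_\alpha} = \sum_{k=\nu_\alpha}^{RS-1}c_k\bigl(k(k-1)\cdots(k-\nu_\alpha+1)\bigr)\psi_\alpha^{k-\nu_\alpha}. \tag{8.31}
\]
Now, for any positive integer $k$, by Leibnitz's formula, we have
\[
\left(\frac{d}{dz}\right)^{k}\left(z^{\nu_\alpha}e^{\psi_\alpha z}\right) = \sum_{j=0}^{k}\binom{k}{j}\left(\frac{d}{dz}\right)^{j}(z^{\nu_\alpha})\left(\frac{d}{dz}\right)^{k-j}(e^{\psi_\alpha z})
= \sum_{j=0}^{k}\binom{k}{j}\bigl(\nu_\alpha(\nu_\alpha-1)\cdots(\nu_\alpha-j+1)\bigr)z^{\nu_\alpha-j}\,\psi_\alpha^{k-j}e^{\psi_\alpha z}.
\]
Therefore,
\[
\left(\frac{d}{dz}\right)^{k}\left(z^{\nu_\alpha}e^{\psi_\alpha z}\right)\Big|_{z=0} = \text{constant term in the above expression} = \binom{k}{\nu_\alpha}(\nu_\alpha!)\,\psi_\alpha^{k-\nu_\alpha} = \bigl(k(k-1)\cdots(k-\nu_\alpha+1)\bigr)\psi_\alpha^{k-\nu_\alpha}. \tag{8.32}
\]
The lemma follows from (8.31) and (8.32).

Connecting to the Auxiliary Polynomial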

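Lemma 8.2.3 Let $\psi_\alpha$, $\nu_\alpha$ and $p_\alpha$ for $\alpha \in [0,RS-1]$ be as defined in (8.29). Let $\Phi(z_0,\dots,z_{n-1})$ be the polynomial defined in Lemma 8.1.5. Then we have
\[ \Phi(z,z,\dots,z) = \sum_{\alpha=0}^{RS-1}p_\alpha z^{\nu_\alpha}e^{\psi_\alpha z}. \]

Proof By Lemma 8.1.5, we have
\[
\Phi(z_0,\dots,z_{n-1}) = \sum_{\lambda_0,\dots,\lambda_n=0}^{L}p(\lambda_0,\dots,\lambda_n)\,z_0^{\lambda_0}e^{\lambda_n\beta_0z_0}\,\alpha_1^{(\lambda_1+\lambda_n\beta_1)z_1}\cdots\alpha_{n-1}^{(\lambda_{n-1}+\lambda_n\beta_{n-1})z_{n-1}}
\]
and hence
\[
\Phi(z,\dots,z) = \sum_{\lambda_0,\dots,\lambda_n=0}^{L}p(\lambda_0,\dots,\lambda_n)\,z^{\lambda_0}\left(e^{\lambda_n\beta_0}\alpha_1^{\lambda_1+\lambda_n\beta_1}\cdots\alpha_{n-1}^{\lambda_{n-1}+\lambda_n\beta_{n-1}}\right)^{z}.
\]
Therefore, by Eq. (8.3), we get
\[
\Phi(z,\dots,z) = \sum_{\lambda_0,\dots,\lambda_n=0}^{L}p(\lambda_0,\dots,\lambda_n)\,z^{\lambda_0}\left(\alpha_1^{\lambda_1}\cdots\alpha_n^{\lambda_n}\right)^{z}
= \sum_{\lambda_0,\dots,\lambda_n=0}^{L}p(\lambda_0,\dots,\lambda_n)\,z^{\lambda_0}\left(e^{\lambda_1\log\alpha_1+\cdots+\lambda_n\log\alpha_n}\right)^{z}
= \sum_{\alpha=0}^{RS-1}p_\alpha z^{\nu_\alpha}e^{\psi_\alpha z},
\]
as desired.

Expressing the Coefficients $p_\alpha$ in Terms of $\Phi$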

Lemma 8.2.4 Suppose there exists a coefficient $p(\lambda_0,\dots,\lambda_n) \neq 0$ for some tuple $(\lambda_0,\dots,\lambda_n) \in [0,L]^{n+1}$. Let $r'$ be the integer given by $r' = \lambda_0+\lambda_1S+\cdots+\lambda_nS^n$. Then there exist integers $r, s$ with $0 \le r \le R-1$ and $0 \le s \le S-1$ such that
\[
p(\lambda_0,\dots,\lambda_n) = p_{r'} = \sum_{j=0}^{RS-1}c_j\left(\frac{d}{dz}\right)^{j}\bigl(\Phi(z,\dots,z)\bigr)\Big|_{z=0} = \sum_{j=0}^{RS-1}c_jf^{(j)}(0),
\]
where the $c_j$ are the coefficients of the augmentative polynomial $W(z)$ as given in (8.30).

Proof Note that $0 \le r' \le RS-1$. Consider $\psi_{r'} = \lambda_1\log\alpha_1+\cdots+\lambda_n\log\alpha_n$ and $\nu_{r'} = \lambda_0$. If $r' \le R-1$, then we take $r = r'$. Let $r' > R-1$. Since the distinct elements of $\mathcal{S}$ are listed as $\psi_0,\dots,\psi_{R-1}$, by Lemma 8.2.1, there exists $r \le R-1$ such that $\psi_r = \psi_{r'}$. Take $s = \nu_{r'} = \lambda_0 \le L \le S-1$. With these choices of integers $(r,s)$ and with $\psi_0,\dots,\psi_{R-1}$ distinct complex numbers, there exists a polynomial $W(z) = \sum_{j=0}^{RS-1}c_jz^j$ as in Lemma 8.1.4, satisfying

(a) $|c_i| < (2\sigma/\rho)^{RS}$;

(b) $\dfrac{d^j}{dz^j}(W(z))\Big|_{z=\psi_i} = \begin{cases} 1 & \text{if } (i,j) = (r,s),\\ 0 & \text{if } (i,j) \neq (r,s). \end{cases}$

Here $\sigma = \max\{1,|\psi_i| : 0 \le i \le R-1\}$ and $\rho = \min\{1,|\psi_i-\psi_j| : 0 \le i < j \le R-1\}$. By the explicit values of $(r,s)$, we rewrite (b) as follows:
\[
\left(\frac{d}{dz}\right)^{\nu_j}(W(z))\Big|_{z=\psi_i} = \begin{cases} 1 & \text{if } \psi_i = \psi_{r'} \text{ and } \nu_j = \nu_{r'},\\ 0 & \text{if } \psi_i \neq \psi_{r'} \text{ or } \nu_j \neq \nu_{r'}. \end{cases}
\]
Suppose $\psi_i = \psi_j$ for some $0 \le i \neq j \le RS-1$. Then $\nu_i \neq \nu_j$. Therefore, we can write
\[ p_{r'} = \sum_{i=0}^{RS-1}p_i\left(\frac{d}{dz}\right)^{\nu_i}(W(z))\Big|_{z=\psi_i}. \]
Now by Lemma 8.2.2,
\[
p_{r'} = \sum_{i=0}^{RS-1}p_i\left(\frac{d}{dz}\right)^{\nu_i}(W(z))\Big|_{z=\psi_i}
= \sum_{i=0}^{RS-1}p_i\sum_{j=0}^{RS-1}c_j\left(\frac{d}{dz}\right)^{j}\left(z^{\nu_i}e^{\psi_iz}\right)\Big|_{z=0}
= \sum_{j=0}^{RS-1}c_j\left(\frac{d}{dz}\right)^{j}\left(\sum_{i=0}^{RS-1}p_iz^{\nu_i}e^{\psi_iz}\right)\Big|_{z=0}
= \sum_{j=0}^{RS-1}c_j\left(\frac{d}{dz}\right)^{j}\bigl(\Phi(z,\dots,z)\bigr)\Big|_{z=0},
\]
by Lemma 8.2.3. This proves the lemma.
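The step just used rests on the identity of Lemma 8.2.2. The sketch below checks that identity exactly for a sample polynomial, computing the right-hand side independently of (8.32): it tracks the polynomial $P_j$ with $\frac{d^j}{dz^j}(z^{\nu}e^{\psi z}) = P_j(z)e^{\psi z}$ through the product rule. Here $\psi$ is a rational stand-in for $\psi_\alpha$ (which in the text is a combination of logarithms); this substitution, the sample coefficients and the function names are assumptions made only so that the arithmetic is exact.

```python
from fractions import Fraction

def deriv(p):
    """Derivative of a coefficient list (low degree first)."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def evaluate(p, x):
    """Horner evaluation."""
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

def lhs(c, nu, psi):
    """(d/dz)^nu W at z = psi, for W(z) = sum_k c[k] z^k."""
    p = [Fraction(a) for a in c]
    for _ in range(nu):
        p = deriv(p)
    return evaluate(p, Fraction(psi))

def rhs(c, nu, psi):
    """sum_j c[j] * (d/dz)^j (z^nu e^{psi z}) at z = 0, using
    (P e^{psi z})' = (P' + psi P) e^{psi z}."""
    psi = Fraction(psi)
    P = [Fraction(0)] * nu + [Fraction(1)]
    total = Fraction(0)
    for cj in c:
        total += cj * P[0]          # value of the current derivative at z = 0
        dP = [k * a for k, a in enumerate(P)][1:] + [Fraction(0)]
        P = [d + psi * a for d, a in zip(dP, P)]
    return total
```

Both sides agree for every sample tried in the accompanying checks, which is what allows the double sum over $i$ and $j$ to be interchanged in the proof.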

Finally...

Let $r'$ be as given in Lemma 8.2.4. We will show that $|p_{r'}| < 1$. This will complete the proof of Baker's theorem. By Lemma 8.2.4, we have
\[ |p_{r'}| \le \sum_{j=0}^{RS-1}|c_j||f^{(j)}(0)|, \]
where $RS = (L+1)^{n+1} \le (h^{2-\frac{1}{4n}}+1)^{n+1} < h^{4n}$ and $|c_j| < (2\sigma/\rho)^{RS}$, where $\sigma$ and $\rho$ are defined as in Lemma 8.1.4. Firstly,
\[ |\psi_i| \le |\lambda_1^{(i)}||\log\alpha_1|+\cdots+|\lambda_n^{(i)}||\log\alpha_n| \le L(|\log\alpha_1|+\cdots+|\log\alpha_n|) \le c_{8.36}L. \]
Hence $\sigma \le c_{8.36}L$. Now, we compute $\rho$ as follows:
\[
\rho = \min_{0\le i<j\le R-1}\{1,|\psi_i-\psi_j|\} = \min_{0\le i<j\le R-1}\left\{1,\ \left|(\lambda_1^{(i)}-\lambda_1^{(j)})\log\alpha_1+\cdots+(\lambda_n^{(i)}-\lambda_n^{(j)})\log\alpha_n\right|\right\}.
\]
Let $\rho \neq 1$. Then
\[ \rho = \left|(\lambda_1^{(i)}-\lambda_1^{(j)})\log\alpha_1+\cdots+(\lambda_n^{(i)}-\lambda_n^{(j)})\log\alpha_n\right|. \]
Putting $\kappa_k = \lambda_k^{(i)}-\lambda_k^{(j)}$ for $1 \le k \le n$, we get
\[ \rho = |\kappa_1\log\alpha_1+\cdots+\kappa_n\log\alpha_n| \tag{8.33} \]
for some integers $\kappa_i$ with $|\kappa_i| \le L$, not all zero since $i \neq j$. Therefore, by Lemma 8.1.3, we get
\[ \rho \ge c_{8.37}^{-L}. \tag{8.34} \]
This estimate also holds if $\rho = 1$ by taking $c_{8.37} > 1$. Thus, for all integers $j$ with $0 \le j < RS$, we have
\[ |c_j| \le \left(2c_{8.36}L\,c_{8.37}^{L}\right)^{RS} \le L^{RS}c_{8.38}^{LRS}. \tag{8.35} \]
Thus, using also Lemma 8.1.8, we get
\[
1 \le |p_{r'}| \le \sum_{j=0}^{RS-1}|c_j||f^{(j)}(0)| \le RS\,L^{RS}c_{8.38}^{LRS}\,e^{-h^{8n}} \le h^{4n}\,h^{2h^{4n}}\,c_{8.38}^{h^{4n+2}}\,e^{-h^{8n}} < 1,
\]
as
\[ 4n\log h+2h^{4n}\log h+h^{4n+2}\log c_{8.38} < c_{8.39}h^{4n+2} < h^{8n} \]
by taking $h > c_{8.39}^{1/(4n-2)}$. This proves Baker's theorem.

Exercise

(1) Let $x, y > 0$ be real numbers. Show that
\[ x > \log(1+x) > y\left(1-(1+x)^{-1/y}\right). \]

(2) Prove Leibnitz's rule: if $f_j(x)$ has $n$ derivatives for $1 \le j \le m$, show that
\[ \frac{\partial^n}{\partial x^n}\prod_{j=1}^{m}f_j(x) = \sum\frac{n!}{i_1!\cdots i_m!}\,\frac{\partial^{i_1}f_1}{\partial x^{i_1}}\cdots\frac{\partial^{i_m}f_m}{\partial x^{i_m}} \]
with summation over all $i_1,\dots,i_m \ge 0$ with $i_1+\cdots+i_m = n$.

(3) For a real number $t$, let $t^{1/3}$ denote the real cubic root of $t$. Let $\alpha = (2+\sqrt{5})^{1/3}$ and $\beta = (2-\sqrt{5})^{1/3}$. Are $\alpha$ and $\beta$ linearly independent over $\mathbb{Q}$? Are $1, \alpha, \beta$ linearly independent over $\mathbb{Q}$?

Notes

The ground work shows the importance of some factors:

(i) Choice of parameters.

(ii) Liouville-type argument to get a non-trivial lower bound for a non-zero algebraic integer, since its norm is ≥ 1.

(iii) Construction of an auxiliary polynomial by solving linear equations.

The above three factors have been encountered in earlier chapters as well.

(iv) The role of the auxiliary polynomial in several variables:

(a) The factor $(\log\alpha_1)^{m_1}\cdots(\log\alpha_{n-1})^{m_{n-1}}$ can be pulled out nicely (see (8.8) and (8.9)), giving rise to linear equations with algebraic coefficients.

(b) More importantly, in the extrapolation argument, the partial derivatives can be taken in any direction instead of just along the diagonal (see (8.16)).

The quantitative results are more involved, as the dependence on the $\alpha_i$ and $\beta_j$ has to be tracked. Since the results of Baker, several mathematicians such as Ramachandra, Shorey, Waldschmidt and others have worked on improving the lower bounds. As mentioned in Chap. 7, the best known result in the general case to date is due to Matveev [4], while for linear forms in two and three logarithms the reader may see [5, 6].

References

1. A. Baker, Linear forms in the logarithms of algebraic numbers. Mathematika 13, 204–216 (1966); II 14, 102–107 (1967); III 14, 220–228 (1967)
2. A. Baker, Transcendental Number Theory, Tracts (Cambridge, 1975)
3. N.I. Fel'dman, Y.V. Nesterenko, Transcendental Numbers. Number Theory IV, Encyclopaedia of Mathematical Sciences, vol. 44 (Springer, Berlin, 1991)
4. E.M. Matveev, An explicit lower bound for a homogeneous rational linear form in logarithms of algebraic numbers. Izv. Math. 62, 81–136 (1998)
5. M. Laurent, M. Mignotte, Y. Nesterenko, Formes linéaires en deux logarithmes et déterminants d'interpolation. J. Number Theory 55, 285–321 (1995)
6. C.D. Bennett, J. Blass, A.M.W. Glass, D.B. Meronk, R.P. Steiner, Linear forms in the logarithms of three positive rational integers. J. Théor. Nombres Bordeaux 9, 97–136 (1997)

Chapter 9

Subspace Theorem

All knowledge that the world has ever received comes from the mind; the infinite library of universe is in our own mind —Ramakrishna

The subspace theorem is a multidimensional extension of Roth's theorem developed by Schmidt in 1980. He introduced several new ideas, especially from the geometry of numbers. An important ingredient was the properties of successive minima. Since the proofs of his results are beyond the scope of this book, we limit ourselves to stating two versions of his results and deriving Roth's theorem from them; see Sect. 9.1. In Sect. 9.2, we present some classical approximation results derived from Dirichlet's multidimensional theorem. Section 9.3 deals with the application of the subspace theorem to simultaneous approximation of algebraic numbers by rationals. Sections 9.1–9.3 are based on results from Schmidt [1, 2]. Several innovative applications of this result have been found recently. Starting from 1998, Bugeaud, Corvaja, Luca, Zannier and others have applied the subspace Theorem 9.1.2 successfully to some number theoretic problems, such as the growth of the greatest prime factor of $(ab+1)(ac+1)(bc+1)$ where $a > b > c$ are integers. In Sect. 9.4, we present one such result.

9.1 Statement of Subspace Theorem

We denote an $n$-tuple of variables by $X = (X_1,\dots,X_n)$. By $L = L(X)$ we mean a linear form in $n$ variables $X_1,\dots,X_n$. For any $x = (x_1,\dots,x_n) \in \mathbb{R}^n$, by $|x|$ we mean $\max(1,|x_1|,\dots,|x_n|)$.

© Springer Nature Singapore Pte Ltd. 2020 S. Natarajan and R. Thangadurai, Pillars of Transcendental Number Theory, https://doi.org/10.1007/978-981-15-4155-1_9


Theorem 9.1.1 Let $n \ge 2$ be an integer. Let $L_1,\dots,L_n$ be $n$ linearly independent linear forms in $n$ variables with real or complex algebraic coefficients. Let $\varepsilon > 0$. Then the set of solutions $x = (x_1,\dots,x_n) \in \mathbb{Z}^n$ to the inequality
\[ \prod_{i=1}^{n}|L_i(x)| < |x|^{-\varepsilon} \tag{9.1} \]
lies in finitely many proper rational subspaces of $\mathbb{Q}^n$.

By a rational subspace of $\mathbb{Q}^n$, we mean a subspace that can be defined by linear equations with rational coefficients. Let us derive Roth's theorem from the above result. Let $n = 2$, $L_1(X) = \alpha X_2-X_1$, $L_2(X) = X_2$. Then for $\alpha \in A$ and for any $\varepsilon > 0$ the above theorem implies that all the points $x = (x_1,x_2) \in \mathbb{Z}^2$ for which
\[ |\alpha x_2-x_1||x_2| < |x|^{-\varepsilon} \tag{9.2} \]
lie on a finite set of lines $x_1 = kx_2$, $k \in \mathbb{Q}$. If $(p_0,q_0)$ and $(tp_0,tq_0)$, $t \in \mathbb{Z}$, are two points lying on the line $x_1 = kx_2$, then
\[ |tq_0\alpha-tp_0| < \max(|tq_0|,|tp_0|)^{-\varepsilon}|tq_0|^{-1} \]
implies
\[ |t|^{2+\varepsilon} < |q_0|^{-1-\varepsilon}|q_0\alpha-p_0|^{-1}, \]

i.e., $|t|$ is bounded from above. Hence each such line can contain only finitely many points satisfying the inequality (9.2). So we conclude that there are only finitely many solutions, which gives Roth's theorem.

The subspace theorem states that the set of solutions of (9.1) lies in a finite union of proper subspaces of $\mathbb{Q}^n$. But one may ask if indeed (9.1) has only finitely many solutions. For instance, suppose there is a non-zero $x_0$ such that $L_1(x_0) = 0$. Then for any $\lambda x_0$, $\lambda \in \mathbb{Z}$, inequality (9.1) is satisfied. Thus there are infinitely many solutions to (9.1). In the case $n = 2$, we have seen in the derivation of Roth's theorem that there are only finitely many solutions to (9.1). If $n \ge 3$, then (9.1) may very well have infinitely many solutions. For example, consider the following linear forms:
\[ L_1 = x_1+\sqrt{2}x_2+\sqrt{3}x_3; \quad L_2 = x_1+\sqrt{2}x_2-\sqrt{3}x_3; \quad L_3 = x_1-\sqrt{2}x_2-\sqrt{3}x_3. \]
Take the subspace of $\mathbb{Q}^3$ defined by $x_3 = 0$. Then consider the infinitely many solutions to Pell's equation $x_1^2-2x_2^2 = 1$, say with $x_1 > 0$ and $x_2 < 0$ so that $x_1+\sqrt{2}x_2$ is small. Then for any of these infinitely many solutions
\[ |L_1L_2L_3| = |x_1+\sqrt{2}x_2| \le (\max(|x_1|,|x_2|))^{-\varepsilon} \]
for many choices of $\varepsilon$.

Now we state another form of the subspace theorem, which is a special case of a result from [3].

Theorem 9.1.2 Let $n \ge 2$ be an integer. Let $S$ be a finite set of primes. Let $L_{1,\infty},\dots,L_{n,\infty}$ be $n$ linearly independent linear forms in $n$ variables with real algebraic coefficients. For any prime number $\ell$ in $S$ let $L_{1,\ell},\dots,L_{n,\ell}$ be $n$ linearly independent linear forms with integer coefficients. Let $\varepsilon > 0$ be given. Then the set of solutions $x = (x_1,\dots,x_n) \in \mathbb{Z}^n$ to the inequality
\[ \prod_{i=1}^{n}|L_{i,\infty}(x)| \cdot \prod_{\ell\in S}\prod_{i=1}^{n}|L_{i,\ell}(x)|_{\ell} \le |x|^{-\varepsilon} \]
lies in finitely many proper subspaces of $\mathbb{Q}^n$.
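Returning to the three forms $L_1, L_2, L_3$ of the Pell example in this section, the identity $L_1L_2L_3 = x_1+\sqrt{2}x_2$ on the subspace $x_3 = 0$ can be verified with exact arithmetic in $\mathbb{Z}[\sqrt{2}]$. In the sketch below a pair `(a, b)` stands for $a+b\sqrt{2}$; this representation, the recurrence used to generate Pell solutions from the solution $(3,2)$, and the function names are illustrative choices, not taken from the text.

```python
def mul(u, v):
    """Product in Z[sqrt(2)], with (a, b) standing for a + b*sqrt(2)."""
    return (u[0] * v[0] + 2 * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def pell_solutions(count):
    """First `count` solutions of x^2 - 2 y^2 = 1 with x, y > 0,
    obtained by repeated multiplication by 3 + 2*sqrt(2)."""
    x, y = 3, 2
    out = []
    for _ in range(count):
        out.append((x, y))
        x, y = 3 * x + 4 * y, 2 * x + 3 * y
    return out

def product_of_forms(x1, x2):
    """L1 * L2 * L3 on the subspace x3 = 0, as an element of Z[sqrt(2)]."""
    L1 = L2 = (x1, x2)
    L3 = (x1, -x2)
    return mul(mul(L1, L2), L3)

# Points with x1 > 0 and x2 < 0, as in the example.
points = [(x, -y) for x, y in pell_solutions(6)]
```

Each of these points satisfies `product_of_forms(x1, x2) == (x1, x2)`, and since $x_1+\sqrt{2}x_2 = 1/(x_1-\sqrt{2}x_2)$ with $x_2 < 0$, the product is smaller than $1/x_1$ in absolute value, so it decays as the solutions grow.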

9.2 Dirichlet's Multidimensional Approximation Results

Before going on to an application of the subspace theorem, we present some classical results due to Dirichlet which motivate the ensuing application. We recall Theorem 5.1.1 due to Dirichlet. In this section we denote by $c_{9.\cdots} = c_{9.\cdots}(\cdots)$ effectively computable positive numbers depending on the parameters mentioned within brackets.

Theorem 9.2.1 Let $\alpha$ and $Q$ be real numbers with $Q > 1$. Then there exist integers $p, q$ such that $1 \le q < Q$ and $|\alpha q-p| \le 1/Q$.

It follows from this theorem that the inequality $\left|\alpha-\frac{p}{q}\right| < \frac{1}{q^2}$ has solutions in integers $p, q$ and, in fact, there exist infinitely many coprime integers $p, q$ with this property if $\alpha$ is irrational. This is not true if $\alpha$ is rational. See Corollary 5.1.2 and the Remark following it. An irrational number $\alpha$ is defined to be badly approximable if there is a constant $c_{9.1} = c_{9.1}(\alpha) > 0$ such that
\[ \left|\alpha-\frac{p}{q}\right| > \frac{c_{9.1}}{q^2} \tag{9.3} \]
for every rational $p/q$. In 1842, Dirichlet gave an extension of Theorem 9.2.1 to higher dimension.

Theorem 9.2.2 Suppose that $\alpha_{ij}$, $1 \le i \le n$, $1 \le j \le m$, are $nm$ real numbers and $Q > 1$. Then there exist integers $q_1,\dots,q_m,p_1,\dots,p_n$ with
\[ 1 \le \max(|q_1|,\dots,|q_m|) < Q^{n/m} \]
such that
\[ |\alpha_{i1}q_1+\cdots+\alpha_{im}q_m-p_i| \le \frac{1}{Q} \quad\text{for } 1 \le i \le n. \]

Proof We may assume that $Q$ is an integer. Otherwise one may work with $[Q] + 1$ in place of $Q$. Consider the tuples
\[ (\{\alpha_{11} x_1 + \cdots + \alpha_{1m} x_m\}, \ldots, \{\alpha_{n1} x_1 + \cdots + \alpha_{nm} x_m\}) \]
where each $x_j$ is an integer satisfying $0 \le x_j < Q^{n/m}$, $1 \le j \le m$. Here $\{x\}$ denotes the fractional part of $x$. There are at least $Q^n$ such tuples, and each such tuple lies in the unit cube $U$ in $\mathbb{R}^n$. Along with the tuple $(1, \ldots, 1)$ the unit cube $U$ contains $Q^n + 1$ such tuples. Divide $U$ into $Q^n$ pairwise disjoint subcubes of side length $1/Q$. So among the $Q^n + 1$ tuples under consideration, at least two tuples will lie in the same cube. Let us say two such tuples are
\[ (\alpha_{11} x_1 + \cdots + \alpha_{1m} x_m - y_1, \ldots, \alpha_{n1} x_1 + \cdots + \alpha_{nm} x_m - y_n), \]
\[ (\alpha_{11} x'_1 + \cdots + \alpha_{1m} x'_m - y'_1, \ldots, \alpha_{n1} x'_1 + \cdots + \alpha_{nm} x'_m - y'_n) \]
with $(x_1, \ldots, x_m) \ne (x'_1, \ldots, x'_m)$, $y_i = [\alpha_{i1} x_1 + \cdots + \alpha_{im} x_m]$ and $y'_i = [\alpha_{i1} x'_1 + \cdots + \alpha_{im} x'_m]$, $1 \le i \le n$. Taking $q_i = x_i - x'_i$ for $1 \le i \le m$ and $p_j = y_j - y'_j$ for $1 \le j \le n$, we get the assertion of the theorem.

We derive some consequences. Taking $m = 1$ we immediately get the following result on simultaneous approximation.

Theorem 9.2.3 Suppose that $\alpha_1, \ldots, \alpha_n$ are $n$ real numbers and that $Q > 1$ is an integer. Then there exist integers $q, p_1, \ldots, p_n$ with $1 \le q < Q^n$ and
\[ |q\alpha_i - p_i| \le \frac{1}{Q} \quad \text{for } 1 \le i \le n. \]
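The pigeonhole argument in the proof of Theorem 9.2.2 is constructive, and for $m = 1$ it can be sketched in a few lines. The following is only an illustration under stated assumptions: the function name `dirichlet_simultaneous` is hypothetical, floating-point arithmetic stands in for real numbers, and the sample values $\sqrt{2}, \sqrt{3}$ are not taken from the text.

```python
from math import floor, sqrt


def dirichlet_simultaneous(alphas, Q):
    """Pigeonhole search for q, p_1..p_n as in Theorem 9.2.3 (float sketch).

    Points ({alpha_1 x}, ..., {alpha_n x}) for 0 <= x < Q^n, together with
    the extra point (1, ..., 1), are sorted into Q^n subcubes of side 1/Q.
    Two points in the same subcube give the required q and p_i.
    """
    n = len(alphas)
    boxes = {}
    # The extra point (1, ..., 1) is alpha_i * 0 - (-1); it sits in the top box.
    boxes[(Q - 1,) * n] = (0, [-1] * n)
    for x in range(Q ** n):
        ys = [floor(a * x) for a in alphas]
        # Box index of each fractional part; min() guards against rounding.
        key = tuple(min(int(Q * (a * x - y)), Q - 1) for a, y in zip(alphas, ys))
        if key in boxes:
            x_prev, ys_prev = boxes[key]
            q = x - x_prev
            ps = [y - y_prev for y, y_prev in zip(ys, ys_prev)]
            return q, ps
        boxes[key] = (x, ys)
    raise AssertionError("unreachable: Q^n + 1 points in Q^n boxes")


if __name__ == "__main__":
    q, ps = dirichlet_simultaneous([sqrt(2), sqrt(3)], 10)
    print(q, ps)
```

The dictionary plays the role of the subcubes; the first collision already yields a valid pair, so the loop usually stops long before $x$ reaches $Q^n$.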

As seen in the derivation of Corollary 5.1.2, we can deduce from the above theorem the following result.

Corollary 9.2.4 Suppose that at least one of $\alpha_1, \ldots, \alpha_n$ is irrational. Then there are infinitely many $n$-tuples $(p_1/q, \ldots, p_n/q)$ with
\[ \left|\alpha_i - \frac{p_i}{q}\right| < \frac{1}{q^{1+1/n}}, \quad 1 \le i \le n. \]

Taking $n = 1$ in Theorem 9.2.2, we get the next result, which deals with a single linear form in $\alpha_1, \ldots, \alpha_n$.

Theorem 9.2.5 Suppose that $\alpha_1, \ldots, \alpha_n$ are $n$ real numbers and that $Q > 1$ is an integer. Then there exist integers $q_1, \ldots, q_n, p$ with $1 \le \max(|q_1|, \ldots, |q_n|) < Q^{1/n}$ and
\[ |\alpha_1 q_1 + \cdots + \alpha_n q_n - p| \le \frac{1}{Q}. \]
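For small parameters the statement of Theorem 9.2.5 can be checked by exhaustive search. The sketch below is an assumption-laden illustration, not part of the text: `best_linear_form` is a hypothetical helper, floats replace real numbers, and the sample coefficients $\sqrt{2}, \sqrt{3}$ are chosen only for the demonstration.

```python
from itertools import product
from math import sqrt


def best_linear_form(alphas, Q):
    """Smallest |alpha_1 q_1 + ... + alpha_n q_n - p| over 0 < max|q_i| < Q^(1/n)."""
    n = len(alphas)
    bound = 0
    while (bound + 1) ** n < Q:
        bound += 1  # largest integer with bound^n < Q
    best = None
    for qs in product(range(-bound, bound + 1), repeat=n):
        if not any(qs):
            continue  # the zero vector is excluded by the theorem
        value = sum(a * q for a, q in zip(alphas, qs))
        p = round(value)
        err = abs(value - p)
        if best is None or err < best[0]:
            best = (err, qs, p)
    return best


if __name__ == "__main__":
    print(best_linear_form([sqrt(2), sqrt(3)], 100))
```

With $Q = 100$ and $n = 2$ the search runs over $|q_i| \le 9$, and the theorem guarantees that the returned error is at most $1/100$.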

As a consequence of the above theorem we get

Corollary 9.2.6 Suppose $1, \alpha_1, \ldots, \alpha_n$ are linearly independent over the rationals. Then there are infinitely many coprime $(n+1)$-tuples $(q_1, \ldots, q_n, p)$ with $q = \max(|q_1|, \ldots, |q_n|) > 0$ and
\[ |\alpha_1 q_1 + \cdots + \alpha_n q_n - p| < \frac{1}{q^n}. \]

[...] $> 0$ such that
\[ |\mathbf{x}|^m \, |L(\mathbf{x}) - \mathbf{y}|^n > c_{9.2} \tag{9.4} \]
for all integer tuples $\mathbf{x}$ and $\mathbf{y}$ with $\mathbf{x} \ne \mathbf{0}$.

Suppose $m = n = 1$. Then (9.4) gives that the linear form $L(x) = \alpha_{11} x - y$ is badly approximable if
\[ \left|\alpha_{11} - \frac{y}{x}\right| > \frac{c_{9.2}}{x^2}. \]
Thus $\alpha_{11}$ is badly approximable in the sense of (9.3).

Suppose $n = 1$. Then a single linear form $L(\mathbf{x})$ is badly approximable if
\[ |\alpha_1 q_1 + \cdots + \alpha_m q_m - p| > \frac{c_{9.3}}{q^m} \]
for every integer point $(q_1, \ldots, q_m, p)$ with $q = \max(|q_1|, \ldots, |q_m|) > 0$ and $c_{9.3} = c_{9.3}(L)$.


Suppose $m = 1$. Then we have $L_1(x) = \alpha_1 x, \ldots, L_n(x) = \alpha_n x$ and
\[ \max(|\alpha_1 q - p_1|, \ldots, |\alpha_n q - p_n|) > \frac{c_{9.4}}{q^{1/n}} \]
for integers $q > 0, p_1, \ldots, p_n$ with $c_{9.4} = c_{9.4}(L_1, \ldots, L_n)$. In other words,
\[ \max\left(\left|\alpha_1 - \frac{p_1}{q}\right|, \ldots, \left|\alpha_n - \frac{p_n}{q}\right|\right) > \frac{c_{9.4}}{q^{1+1/n}}. \]
In this case, we say the tuple $(\alpha_1, \ldots, \alpha_n)$ is a badly approximable tuple. We now show the following result.

Theorem 9.2.7 Suppose $1, \alpha_1, \ldots, \alpha_m$ is a basis of a real algebraic number field of degree $m + 1$. Then the linear form $\alpha_1 x_1 + \cdots + \alpha_m x_m$ is badly approximable.

Proof We denote by $c_{9.5}, c_{9.6}, \ldots$ positive numbers which depend only on $\alpha_1, \ldots, \alpha_m$. We may restrict to integers $q_1, \ldots, q_m, p$ with $|\alpha_1 q_1 + \cdots + \alpha_m q_m - p| < 1$ since otherwise the assertion is obviously true. Then $|p| \le c_{9.5}\, q$ where $q = \max(|q_1|, \ldots, |q_m|)$. Also each conjugate satisfies
\[ |\alpha_1^{(i)} q_1 + \cdots + \alpha_m^{(i)} q_m - p| \le c_{9.6}\, q. \]
Hence the norm satisfies
\[ |N(\alpha_1 q_1 + \cdots + \alpha_m q_m - p)| \le c_{9.7}\, q^m \, |\alpha_1 q_1 + \cdots + \alpha_m q_m - p|. \tag{9.5} \]
Let $d$ be the denominator of $\alpha_1, \ldots, \alpha_m$. Then $d\alpha_1 q_1 + \cdots + d\alpha_m q_m - dp$ is a non-zero algebraic integer, so
\[ |N(d\alpha_1 q_1 + \cdots + d\alpha_m q_m - dp)| \ge 1 \]
which gives
\[ |N(\alpha_1 q_1 + \cdots + \alpha_m q_m - p)| \ge d^{-m-1} = c_{9.8} \]
which together with (9.5) yields the result.

By taking $q_{v+1} = \cdots = q_m = 0$, we may state the above theorem in a slightly more general way as follows.

Theorem 9.2.8 Suppose $1, \alpha_1, \ldots, \alpha_v$ are linearly independent over $\mathbb{Q}$ and they generate an algebraic number field of degree $\nu$. Then there exists a number $c_{9.9} = c_{9.9}(\alpha_1, \ldots, \alpha_v)$ such that
\[ |\alpha_1 q_1 + \cdots + \alpha_v q_v - p| > c_{9.9}\, q^{-\nu+1} \]
for any integers $q_1, \ldots, q_v, p$ with $q = \max(|q_1|, \ldots, |q_v|) > 0$.


When v = 1, we obtain Liouville’s theorem. When v = ν − 1, the exponent is best possible by Theorem 9.2.7. So far, note that the results are effective in the sense that the numbers c9.··· are computable. When v < ν − 1, the exponent can be improved as shown in Theorem 9.3.1 below.
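The norm argument behind Theorems 9.2.7 and 9.2.8 can be made concrete in the simplest case, the field $\mathbb{Q}(\sqrt{2})$ with the single form $\sqrt{2}\,x$. The sketch below is an illustration only: `scaled_error` is a hypothetical helper, and the explicit constant $1/(2\sqrt{2}+1)$ is what the argument gives in this special case, not a value quoted from the text. It evaluates $q\,\|q\sqrt{2}\|$ through the identity $|q\sqrt{2} - p| = |p^2 - 2q^2| / (p + q\sqrt{2})$, where the numerator is a non-zero integer.

```python
from math import isqrt, sqrt


def scaled_error(q):
    """q * (distance from q*sqrt(2) to the nearest integer), via the norm p^2 - 2q^2."""
    p_floor = isqrt(2 * q * q)  # floor(q * sqrt(2)), computed exactly
    values = []
    for p in (p_floor, p_floor + 1):
        norm = abs(p * p - 2 * q * q)  # non-zero integer, since sqrt(2) is irrational
        values.append(q * norm / (p + q * sqrt(2)))
    return min(values)


if __name__ == "__main__":
    smallest = min(scaled_error(q) for q in range(1, 10001))
    print(smallest, 1 / (2 * sqrt(2) + 1))
```

In this range the minimum is attained at $q = 2$ (the approximation $3/2$) and equals $6 - 4\sqrt{2} \approx 0.343$, which stays above the bound $1/(2\sqrt{2}+1) \approx 0.261$ obtained from $|p^2 - 2q^2| \ge 1$.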

9.3 Applications of Subspace Theorems to Diophantine Approximation

We state and prove Roth-type results as applications of the subspace Theorem 9.1.1. Wherever the subspace theorem is applied, the reader should note that the results are ineffective. Henceforth, we shall denote by $\gamma_{9.\cdots} = \gamma_{9.\cdots}(\ldots)$ numbers which are ineffective, i.e. which cannot be effectively computed. On the other hand, $c_{9.10}, \ldots$ are effectively computable numbers depending only on $\alpha_1, \ldots, \alpha_n$.

Theorem 9.3.1 Suppose $\alpha_1, \ldots, \alpha_u$ are real algebraic numbers such that $1, \alpha_1, \ldots, \alpha_u$ are linearly independent over $\mathbb{Q}$. Let $\varepsilon > 0$. Then there are only finitely many positive integers $q$ with
\[ q^{1+\varepsilon}\, \|\alpha_1 q\| \cdots \|\alpha_u q\| < 1 \tag{9.6} \]
where $\|x\|$ denotes the distance of $x$ to the nearest integer.

Proof Let $q$ satisfy (9.6). Choose $p_1, \ldots, p_u$ with $\|\alpha_i q\| = |\alpha_i q - p_i|$ for $1 \le i \le u$. Note that $|p_i| \le |\alpha_i| q + 1/2$, $1 \le i \le u$. Put $n = u + 1$ and take $\mathbf{x}_0 = (x_1, \ldots, x_n) = (p_1, \ldots, p_u, q)$. Then $q \le |\mathbf{x}_0| \le c_{9.10}\, q$. Let us introduce linear forms
\[ L_i(\mathbf{x}) = \alpha_i X_n - X_i, \quad 1 \le i \le u; \qquad L_n(\mathbf{x}) = X_n. \]
Then (9.6) implies that
\[ |L_1(\mathbf{x}_0) \cdots L_n(\mathbf{x}_0)| < q^{-\varepsilon} \le c_{9.10}^{\varepsilon}\, |\mathbf{x}_0|^{-\varepsilon} \le |\mathbf{x}_0|^{-\varepsilon/2} \tag{9.7} \]
whenever $q > c_{9.10}^2$. Consider
\[ |L_1(\mathbf{x}) \cdots L_n(\mathbf{x})| < |\mathbf{x}|^{-\varepsilon/2}. \tag{9.8} \]
By Theorem 9.1.1, the solutions to (9.8) lie in finitely many proper rational subspaces, and by (9.7), $\mathbf{x}_0$ is a solution of (9.8). Let $T$ be a subspace containing $\mathbf{x}_0$. Let it be defined by
\[ c_1 x_1 + \cdots + c_u x_u + c_n x_n = 0. \]
Then for $\mathbf{x} = (p_1, \ldots, p_u, q)$ in $T$ we have
\[ c_1(\alpha_1 q - p_1) + \cdots + c_u(\alpha_u q - p_u) = (c_1 \alpha_1 + \cdots + c_u \alpha_u + c_n)\, q. \]
Hence
\[ |c_1|\, \|\alpha_1 q\| + \cdots + |c_u|\, \|\alpha_u q\| \ge \gamma_{9.1}\, q \tag{9.9} \]
where $\gamma_{9.1} = |c_1 \alpha_1 + \cdots + c_u \alpha_u + c_n|$. Note that $\gamma_{9.1} > 0$ since $1, \alpha_1, \ldots, \alpha_u$ are linearly independent over $\mathbb{Q}$. Also $\gamma_{9.1}$ is ineffective since we only know the existence of $c_1, \ldots, c_u, c_n$. Since $\|\alpha_i q\| \le 1/2$ for each $i$, we get from (9.9) that
\[ q \le \frac{|c_1| + \cdots + |c_u|}{2\gamma_{9.1}} \]
showing that $q$ is bounded.

Suppose
\[ \left|\alpha_i - \frac{p_i}{q}\right| < q^{-1-(1/u)-\varepsilon}, \quad 1 \le i \le u. \tag{9.10} \]
This implies that
\[ \|\alpha_i q\| < q^{-1/u-\varepsilon}, \quad 1 \le i \le u. \]
Hence (9.6) is satisfied. So (9.10) has only finitely many solutions. Thus we get the following result.

Corollary 9.3.2 Suppose $\alpha_1, \ldots, \alpha_u$ are real algebraic numbers such that $1, \alpha_1, \ldots, \alpha_u$ are linearly independent over $\mathbb{Q}$. Let $\varepsilon > 0$. Then there are only finitely many rational $u$-tuples $\left(\frac{p_1}{q}, \ldots, \frac{p_u}{q}\right)$ satisfying (9.10).

Theorem 9.3.3 Suppose $\alpha_1, \ldots, \alpha_v$ are real algebraic numbers such that $1, \alpha_1, \ldots, \alpha_v$ are linearly independent over $\mathbb{Q}$. Let $\varepsilon > 0$. Then there are only finitely many $v$-tuples of non-zero integers $(q_1, \ldots, q_v)$ with
\[ |q_1 \cdots q_v|^{1+\varepsilon}\, \|\alpha_1 q_1 + \cdots + \alpha_v q_v\| < 1. \tag{9.11} \]
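When $u = v = 1$, inequalities (9.6) and (9.11) coincide and read $q^{1+\varepsilon}\|\alpha q\| < 1$. A small search makes the finiteness visible for $\alpha = \sqrt{2}$. This is a hedged illustration only: the helper names are hypothetical, the choice $\varepsilon = 0.5$ is arbitrary, and for a quadratic irrational the finiteness already follows effectively from a Liouville-type bound, so the subspace theorem is not needed here.

```python
from math import isqrt, sqrt


def dist_sqrt2(q):
    """Distance from q*sqrt(2) to the nearest integer, via |p^2 - 2q^2| / (p + q*sqrt(2))."""
    p_floor = isqrt(2 * q * q)
    return min(abs(p * p - 2 * q * q) / (p + q * sqrt(2)) for p in (p_floor, p_floor + 1))


def solutions(eps, q_max):
    """All q in [1, q_max] with q^(1 + eps) * ||q * sqrt(2)|| < 1."""
    return [q for q in range(1, q_max + 1) if q ** (1 + eps) * dist_sqrt2(q) < 1]


if __name__ == "__main__":
    print(solutions(0.5, 10 ** 4))  # prints [1, 2, 5]
```

For $\varepsilon = 0.5$ no further solutions can occur beyond the search range: $q\,\|q\sqrt{2}\| \ge 1/(2\sqrt{2}+1)$ forces $q^{1.5}\|q\sqrt{2}\| > 1$ as soon as $q \ge 15$.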

Remark We know that $\|\alpha_1 q_1 + \cdots + \alpha_v q_v\| < 1$. Taking $q = \max(|q_1|, \ldots, |q_v|)$, we get, under the hypothesis of Theorem 9.3.3, that there exists an integer $p$ such that
\[ \|\alpha_1 q_1 + \cdots + \alpha_v q_v\| = |\alpha_1 q_1 + \cdots + \alpha_v q_v - p| < q^{-v-\varepsilon}. \]
The exponent is best possible by Theorem 9.2.8. Now we prove Theorem 9.3.3.

Proof Suppose $q_1, \ldots, q_v$ are given integers satisfying (9.11). Choose $p$ with $\|\alpha_1 q_1 + \cdots + \alpha_v q_v\| = |\alpha_1 q_1 + \cdots + \alpha_v q_v - p|$. Then $|\alpha_1 q_1 + \cdots + \alpha_v q_v - p| \le 1/2$ giving $|p| \le c_{9.11}\, q$ where $c_{9.11} = \max(1, |\alpha_1|, \ldots, |\alpha_v|)$. Put $n = v + 1$ and write $\mathbf{x}_0 = (q_1, \ldots, q_v, p)$. Then $q \le |\mathbf{x}_0| \le c_{9.11}\, q$. Let us introduce linear forms
\[ L_i(\mathbf{x}) = X_i, \quad 1 \le i \le v; \qquad L_n(\mathbf{x}) = \alpha_1 X_1 + \cdots + \alpha_v X_v - X_n. \]
Then by Theorem 9.1.1, the solutions lie in a finite number of rational subspaces. Let a typical such subspace, in which $\mathbf{x}_0$ lies, be given by
\[ c_1 x_1 + \cdots + c_v x_v + c_n x_n = 0. \]
Suppose $c_v \ne 0$. Then we get
\[
\begin{aligned}
|c_v|\, \|\alpha_1 q_1 + \cdots + \alpha_v q_v\| &= |c_v|\, |\alpha_1 q_1 + \cdots + \alpha_v q_v - p| \\
&= |(c_v \alpha_1 - c_1 \alpha_v) q_1 + \cdots + (c_v \alpha_{v-1} - c_{v-1} \alpha_v) q_{v-1} - (c_v + c_n \alpha_v)\, p| \\
&= |c_v + c_n \alpha_v|\, |\alpha'_1 q_1 + \cdots + \alpha'_{v-1} q_{v-1} - p|
\end{aligned}
\]
where $\alpha'_i = (c_v \alpha_i - c_i \alpha_v)/(c_v + c_n \alpha_v)$, $1 \le i \le v - 1$. We prove the theorem by induction on $v$. Suppose that $v = 1$. Then $|\alpha_1 q - p| < 1/2$ implies that $|\alpha_1| q - |p| < 1/2$ giving $|p| > c_{9.12}\, q$ for $q > c_{9.13}$. Further, we get
\[ |c_1|\, \|\alpha_1 q\| = |p\,(c_1 + c_2 \alpha_1)| \ge \gamma_{9.2}\, |p| \ge 1 \]
if $q > 1/(\gamma_{9.2}\, c_{9.12})$. For $v > 1$ we obtain
\[ |c_v|\, \|\alpha_1 q_1 + \cdots + \alpha_v q_v\| \ge \gamma_{9.3}\, \|\alpha'_1 q_1 + \cdots + \alpha'_{v-1} q_{v-1}\|. \]
Hence by (9.11), we get
\[ |q_1 \cdots q_{v-1}|^{1+\varepsilon/2}\, \|\alpha'_1 q_1 + \cdots + \alpha'_{v-1} q_{v-1}\| < 1. \]
Since $1, \alpha'_1, \ldots, \alpha'_{v-1}$ are linearly independent over $\mathbb{Q}$ this inequality has only finitely many solutions, by induction. A similar argument holds if $c_j \ne 0$ for some $j$ with $1 \le j \le v$, or if $c_1 = \cdots = c_v = 0$ but $c_n \ne 0$.
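The elimination step in the proof of Theorem 9.3.3 (using the subspace equation with $c_v \ne 0$ to remove $q_v$) is a purely algebraic identity, so it can be sanity-checked with exact rational arithmetic. In the sketch below the function name `eliminate_last` and all sample values are hypothetical; rational numbers stand in for the $\alpha_i$ only because the identity does not depend on their arithmetic nature.

```python
from fractions import Fraction


def eliminate_last(alphas, c, c_n, q_head, p):
    """Return both sides of c_v (alpha.q - p) = sum (c_v a_i - c_i a_v) q_i - (c_v + c_n a_v) p.

    q_head holds q_1..q_{v-1}; q_v is solved from c_1 q_1 + ... + c_v q_v + c_n p = 0.
    """
    alphas = [Fraction(a) for a in alphas]
    c = [Fraction(ci) for ci in c]
    c_n, p = Fraction(c_n), Fraction(p)
    v = len(alphas)
    c_v, a_v = c[-1], alphas[-1]
    # Point on the hyperplane: solve the subspace equation for q_v (needs c_v != 0).
    q_v = -(sum(ci * qi for ci, qi in zip(c[:-1], q_head)) + c_n * p) / c_v
    qs = [Fraction(qi) for qi in q_head] + [q_v]
    lhs = c_v * (sum(a * q for a, q in zip(alphas, qs)) - p)
    rhs = sum((c_v * alphas[i] - c[i] * a_v) * qs[i] for i in range(v - 1)) - (c_v + c_n * a_v) * p
    return lhs, rhs


if __name__ == "__main__":
    print(eliminate_last([Fraction(3, 7), Fraction(5, 11), Fraction(13, 4)], [2, -3, 5], 7, [4, -9], 6))
```

Both returned values agree exactly for any choice of inputs with $c_v \ne 0$, which is what allows the proof to pass from $v$ variables to $v - 1$.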

9.4 A Different Application

Here we present a result of Bugeaud et al. [4].

Theorem 9.4.1 Let $a, b$ be multiplicatively independent integers $\ge 2$ and let $\varepsilon > 0$. Then there exists a positive number $n_0$ such that
\[ \gcd(a^n - 1, b^n - 1) < \exp(\varepsilon n) \]
whenever $n \ge n_0$.

Remarks
(1) The theorem implies that $a^n - 1$ and $b^n - 1$ cannot have a common factor of significant size.
(2) Let $g = \gcd(a^n - 1, b^n - 1)$. Write
\[ \frac{b^n - 1}{a^n - 1} = \frac{c}{d} \]
with $\gcd(c, d) = 1$. Suppose $d \ge a^{(1-\varepsilon)n}$; then Theorem 9.4.1 follows since
\[ g = \frac{a^n - 1}{d} \le a^{\varepsilon n}. \]
(3) The number $n_0$ cannot be computed due to the ineffective nature of the subspace theorem.

Proof As seen in Remark (2), it is enough to show that $d \ge a^{(1-\varepsilon)n}$. For integers $j \ge 1$, let
\[ z_j(n) = \frac{b^{jn} - 1}{a^n - 1} = \frac{c_{j,n}}{d_n} \]
with $\gcd(c_{j,n}, d_n) = 1$. Since
\[ z_j(n) = \frac{b^{jn} - 1}{b^n - 1}\, z_1(n) \quad \text{and} \quad \frac{b^{jn} - 1}{b^n - 1} \in \mathbb{Z} \]
we see that $d_n = d_1 = d$. We shall assume that $d_n \le a^{(1-\varepsilon)n}$ for infinitely many values of $n$ and arrive at a contradiction. We fix two positive integers $k$ and $h$ satisfying
\[ \text{(i) } k > 2/\varepsilon \quad \text{and} \quad \text{(ii) } a^h > 2 a^k b^{k^2}. \tag{9.12} \]
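Two quantities introduced so far lend themselves to a quick numerical look: the size of $g = \gcd(a^n - 1, b^n - 1)$ relative to $n$, and a pair $(k, h)$ satisfying (9.12). The sketch below is only an illustration under assumptions. The function names are hypothetical, the sample values $a = 2$, $b = 3$, $\varepsilon = 0.5$ are not from the text, and taking the smallest admissible $k$ and $h$ is one possible choice among many.

```python
from math import gcd, log


def gcd_profile(a, b, n_max):
    """Rows (n, g, log(g)/n) with g = gcd(a^n - 1, b^n - 1) for 1 <= n <= n_max."""
    rows = []
    for n in range(1, n_max + 1):
        g = gcd(a ** n - 1, b ** n - 1)
        rows.append((n, g, log(g) / n))
    return rows


def choose_k_h(a, b, eps):
    """Smallest k > 2/eps, then smallest h with a^h > 2 a^k b^(k^2), as in (9.12)."""
    k = int(2 / eps) + 1
    h = 1
    while a ** h <= 2 * a ** k * b ** (k * k):
        h += 1
    return k, h


def log_final_base(a, b, eps, k, h):
    """log of b^(k^2) a^(h+k) a^(-eps N), with N = hk + h + k."""
    N = h * k + h + k
    return k * k * log(b) + (h + k) * log(a) - eps * N * log(a)


if __name__ == "__main__":
    for row in gcd_profile(2, 3, 12):
        print(row)
    k, h = choose_k_h(2, 3, 0.5)
    print(k, h, log_final_base(2, 3, 0.5, k, h) < log(0.5))
```

The last function evaluates the base of the geometric bound that the proof later needs to be smaller than $1/2$; conditions (i) and (ii) together guarantee this, since $\varepsilon N > 2h$.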

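Approximation for $z_j(n)$ We have
\[ \frac{1}{a^n - 1} = \sum_{r=1}^{\infty} \frac{1}{a^{rn}} = \sum_{r=1}^{h} \frac{1}{a^{rn}} + O\bigl(a^{-(h+1)n}\bigr) \]
giving
\[ \frac{b^{jn} - 1}{a^n - 1} = (b^{jn} - 1) \left( \sum_{r=1}^{h} \frac{1}{a^{rn}} + O\bigl(a^{-(h+1)n}\bigr) \right). \]
Thus
\[ \left| z_j(n) + \sum_{r=1}^{h} \frac{1}{a^{rn}} - \sum_{r=1}^{h} \left(\frac{b^j}{a^r}\right)^n \right| = O\bigl(b^{jn} a^{-(h+1)n}\bigr). \]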
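The truncation error in this approximation can be written in closed form, since the tail of the geometric series sums to $a^{-hn}/(a^n - 1)$. The following exact-arithmetic check is a sketch with hypothetical function names and sample values; the constant 2 in the final bound is an assumption made for the illustration (it uses $a^n - 1 \ge a^n/2$).

```python
from fractions import Fraction


def z(a, b, j, n):
    """z_j(n) = (b^(jn) - 1) / (a^n - 1) as an exact rational."""
    return Fraction(b ** (j * n) - 1, a ** n - 1)


def truncation_error(a, b, j, n, h):
    """z_j(n) + sum_{r<=h} a^(-rn) - sum_{r<=h} (b^j / a^r)^n, computed exactly."""
    head = sum(Fraction(1, a ** (r * n)) for r in range(1, h + 1))
    main = sum(Fraction(b ** j, a ** r) ** n for r in range(1, h + 1))
    return z(a, b, j, n) + head - main


def closed_form(a, b, j, n, h):
    """Tail of the geometric series: (b^(jn) - 1) / (a^(hn) (a^n - 1))."""
    return Fraction(b ** (j * n) - 1, a ** (h * n) * (a ** n - 1))


if __name__ == "__main__":
    print(truncation_error(2, 3, 2, 3, 4), closed_form(2, 3, 2, 3, 4))
```

The two values coincide, and the closed form is at most $2\, b^{jn} a^{-(h+1)n}$, which is the $O$-estimate used in the text.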

Linear Forms for Applying Theorem 9.1.2 We take $S = \{\text{primes dividing } ab\}$. Let $N = hk + h + k$. For any $\mathbf{x} \in \mathbb{R}^N$, we write
\[ \mathbf{x} = (x_1, \ldots, x_N) = (z_1, \ldots, z_k, y_{01}, \ldots, y_{0h}, \ldots, y_{k1}, \ldots, y_{kh}). \]
We take the following $N$ linearly independent linear forms:
\[ L_{i,\infty}(\mathbf{x}) = z_i + y_{01} + \cdots + y_{0h} - y_{i1} - \cdots - y_{ih} \quad \text{for } 1 \le i \le k; \tag{9.13} \]
for $(i, v) \notin \{(1, \infty), \ldots, (k, \infty)\}$, take $L_{i,v}(\mathbf{x}) = x_i$.

The Smallness of the Linear Forms at $\mathbf{x}_0$ We consider the point $\mathbf{x}_0$ where
\[ z_i = d_n a^{hn} z_i(n) \ \text{ for } 1 \le i \le k, \qquad y_{it} = d_n a^{hn} (b^i a^{-t})^n \ \text{ for } 0 \le i \le k;\ 1 \le t \le h. \]

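Let $A = a^{h+1} b^k$. Note that
\[ |z_j| = |d_n a^{hn} z_j(n)| \le d_n a^{hn} b^{jn} \le A^n \quad \text{for } 1 \le j \le k; \]
\[ |y_{it}| \le d_n (a^h b^k)^n \le A^n \quad \text{for } 0 \le i \le k;\ 1 \le t \le h. \]
Hence $|\mathbf{x}_0| \le A^n$. Further by the product formula we have
\[ \prod_{v \in S \cup \{\infty\}} |m|_v = 1 \]
for any non-zero rational $m$ composed of only the primes dividing $a$ and $b$. For $i > k$, consider
\[ \prod_{v \in S \cup \{\infty\}} |L_{i,v}(\mathbf{x}_0)|_v = \prod_{v \in S \cup \{\infty\}} |d_n a^{hn} (b^i a^{-t})^n|_v = \prod_{v \in S \cup \{\infty\}} |d_n|_v \le |d_n|_\infty = d_n. \tag{9.14} \]
For $1 \le i \le k$, $x_i = d_n a^{hn} z_i(n)$ which gives
\[ \prod_{p \mid ab} |x_i|_p \le a^{-hn}. \tag{9.15} \]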
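The product formula and the two estimates (9.14) and (9.15) can be checked on small numbers with exact $p$-adic absolute values. The sketch below is illustrative only: the helper names are hypothetical, and the sample values ($a = 2$, $b = 3$, and the stand-in denominator 7) are assumptions chosen for the demonstration.

```python
from fractions import Fraction


def prime_divisors(n):
    """Distinct prime divisors of an integer n >= 2, by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def vp(n, p):
    """Exponent of the prime p in the non-zero integer n."""
    n, e = abs(n), 0
    while n % p == 0:
        n //= p
        e += 1
    return e


def padic_abs(m, p):
    """|m|_p = p^(-v_p(m)) for a non-zero rational m."""
    m = Fraction(m)
    return Fraction(1, p) ** (vp(m.numerator, p) - vp(m.denominator, p))


def product_over_S(m, S):
    """Product of |m|_v over v in S together with the ordinary absolute value."""
    m = Fraction(m)
    result = abs(m)
    for p in S:
        result *= padic_abs(m, p)
    return result


if __name__ == "__main__":
    S = prime_divisors(2 * 3)
    print(product_over_S(Fraction(3 ** 5, 2 ** 7), S))      # S-unit: product is 1
    print(product_over_S(7 * Fraction(3 ** 5, 2 ** 7), S))  # extra factor 7 survives
```

For a rational built only from primes in $S$ the product is exactly 1, and multiplying by an integer $d$ can only contribute at most $|d|_\infty = d$, which is the mechanism behind (9.14).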

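Further by (9.13), for $1 \le i \le k$, we have
\[
\begin{aligned}
|L_{i,\infty}(\mathbf{x}_0)| &= |z_i + y_{01} + \cdots + y_{0h} - y_{i1} - \cdots - y_{ih}| \\
&= |d_n a^{hn}|\, |z_i(n) + a^{-n} + \cdots + a^{-hn} - (b^i a^{-1})^n - \cdots - (b^i a^{-h})^n| \\
&= O\bigl(b^{in} a^{-hn-n} d_n a^{hn}\bigr) = O\bigl(b^{in} a^{-n} d_n\bigr).
\end{aligned}
\]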


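Combining this with (9.14) and (9.15) we get
\[
\begin{aligned}
\prod_{v \in S \cup \{\infty\}} \prod_{i=1}^{N} |L_{i,v}(\mathbf{x}_0)|_v
&\le d_n^{N-k} \prod_{v \in S \cup \{\infty\}} \prod_{i=1}^{k} |L_{i,v}(\mathbf{x}_0)|_v \\
&= d_n^{N-k} \prod_{i=1}^{k} |L_{i,\infty}(\mathbf{x}_0)| \prod_{p \mid ab} \prod_{i=1}^{k} |x_i|_p \\
&= O\bigl(d_n^{N} b^{k^2 n} a^{-hkn}\bigr) \\
&= O\bigl(a^{(1-\varepsilon)nN} b^{k^2 n} a^{-hkn}\bigr) \\
&= O\bigl((b^{k^2} a^{h+k} a^{-\varepsilon N})^n\bigr) = O(2^{-n})
\end{aligned}
\]
by the choice of $k$ and $h$ in (9.12). Let us take $\delta < (\log 2)/(\log A)$. Thus
\[ \prod_{v \in S \cup \{\infty\}} \prod_{i=1}^{N} |L_{i,v}(\mathbf{x}_0)|_v = O(A^{-\delta n}) = O\bigl(|\mathbf{x}_0|^{-\delta}\bigr). \]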

This is true for infinitely many $n$.

Application of Theorem 9.1.2 By the subspace Theorem 9.1.2, $\mathbf{x}_0$ must lie on finitely many proper subspaces of $\mathbb{Q}^N$. Hence there is a proper subspace containing infinitely many such $\mathbf{x}_0$. So for infinitely many $n$, such $\mathbf{x}_0$ must lie in a hyperplane, say,
\[ \zeta_1 Z_1 + \cdots + \zeta_k Z_k + \sum_{i,j} \alpha_{ij} Y_{ij} = 0 \]
i.e.
\[ \zeta_1 \frac{b^n - 1}{a^n - 1} + \cdots + \zeta_k \frac{b^{kn} - 1}{a^n - 1} + \sum_{i,j} \alpha_{ij} \left(\frac{b^j}{a^i}\right)^n = 0 \tag{9.16} \]
valid for infinitely many $n$. Let $\mathcal{A}$ denote the set of these infinitely many values of $n$.

Algebraic Independence of the Functions $a^z$ and $b^z$ Suppose $a^z$ and $b^z$ are algebraically dependent. Then there exists a relation
\[ \sum_{i,j} c_{ij} (a^i b^j)^z = 0 \tag{9.17} \]
with $c_{ij}$ not all zero. Since $a$ and $b$ are multiplicatively independent, the numbers $a^i b^j$ are distinct. Let $a^{i_0} b^{j_0}$ be the largest among them with $c_{i_0 j_0} \ne 0$. Now let $z$ run through the integer values in $\mathcal{A}$. Since $\mathcal{A}$ is an infinite set, we find that the left-hand side of (9.17), divided by $(a^{i_0} b^{j_0})^z$, tends to $c_{i_0 j_0} \ne 0$, a contradiction.


Final Contradiction Since $a^z$ and $b^z$ are algebraically independent, (9.16) gives rise to an identity
\[ \zeta_1 \frac{Y - 1}{X - 1} + \cdots + \zeta_k \frac{Y^k - 1}{X - 1} + \sum_{i,j} \alpha_{ij} \frac{Y^j}{X^i} = 0 \]
in $\mathbb{Q}(X, Y)$. This may be rewritten as
\[ \frac{f(Y)}{X - 1} + \frac{g(X, Y)}{X^h} = 0 \]
where $f(Y) = \zeta_1 (Y - 1) + \cdots + \zeta_k (Y^k - 1)$ and $g(X, Y) = \sum_{i,j} \alpha_{ij} X^{h-i} Y^j$. Thus $X - 1$ divides $f(Y)$ in $\mathbb{Q}[X, Y]$, which means $f(Y) = 0$ and hence $g(X, Y) = 0$. This implies that $\zeta_1 = \cdots = \zeta_k = 0$ and $\alpha_{ij} = 0$ for all $i, j$, giving the final contradiction.

Exercise

1. Prove that if $a$ and $b$ are multiplicatively independent, $a^z$ and $b^z$ are algebraically independent functions using Lemma 1.1.2.

2. Let $S$ be the set containing all the integers composed of given primes. Assume that a relation of the type
\[ \sum_{i,j} \gamma_{i,j}\, s_1^i s_2^j = 0, \quad \gamma_{ij} \in \mathbb{Q} \tag{9.18} \]
does not hold for infinitely many pairs of multiplicatively independent integers $(s_1, s_2) \in S^2$ with $|s_2|/2 \le |s_1| < |s_2|$. Show that for any $\varepsilon > 0$,
\[ \gcd(s_1 - 1, s_2 - 1) < \max(|s_1|, |s_2|)^{\varepsilon} \tag{9.19} \]
for all pairs $(s_1, s_2) \in S^2$ with $\min(|s_1|, |s_2|) > 1$ and $\max(|s_1|, |s_2|)$ exceeding some number depending on $\varepsilon$, and $\log |s_2| / \log |s_1| \notin \mathbb{Q}$. (Hint: Follow the proof of Theorem 9.4.1 and prove (9.19).)

3. Let $0 < \delta < 1$. For $\mathbf{x} = (x_1, x_2, x_3) \in \mathbb{Z}^3$, consider
\[ 0 < |(x_1 + \sqrt{2} x_2 + \sqrt{3} x_3)(x_1 - \sqrt{2} x_2 + \sqrt{3} x_3)(x_1 - \sqrt{2} x_2 - \sqrt{3} x_3)| \le |\mathbf{x}|^{-\delta}. \tag{9.20} \]
(a) Prove that (9.20) has infinitely many solutions in the spaces $x_1 = 0$ and $x_2 = 0$.
(b) Prove that (9.20) has only finitely many solutions with $x_1 x_2 x_3 \ne 0$.

Notes

Schmidt himself gave an upper bound for the number of subspaces in Theorem 9.1.1. Such results are known as quantitative subspace theorems. He applied his result to estimate the number of solutions of norm form equations. Schlickewei generalised the theorem to arbitrary absolute values (the p-adic subspace theorem) and also to algebraic integer solutions. These results have been applied to the study of various Diophantine equations. See [3, 5, 6]. Important contributions to this area were also made by Evertse, Győry, Thunder and others. Hernández and Luca [7] generalised Theorem 9.4.1 to S-units. They showed that the assertion (9.19) in Exercise 2 above holds even if the condition (9.18) does not hold. As an application of their result, they could resolve a conjecture of Győry et al. [8] that for positive integers $a > b > c$ we have
\[ \lim_{a \to \infty} P((ab + 1)(ac + 1)(bc + 1)) = \infty. \]

Here $P(n)$ denotes the greatest prime factor of an integer $n > 1$. As mentioned earlier, Corvaja and Zannier successfully applied various versions of the subspace theorem to many number theoretic problems. Here we mention two of their interesting results. Let $\theta > 1$. The distribution of the sequence $\{\theta^n\}$, where $\{x\}$ denotes the fractional part of $x$, is one of the intriguing problems in number theory. It is well known that $\{\theta^n\}$ is uniformly distributed in $[0, 1]$ for almost all real $\theta > 1$. On the other hand, it is not known whether $\{(3/2)^n\}$ is dense in $[0, 1]$. Mahler [9] showed that for any non-integral rational $\theta > 1$ and $0 < \varepsilon < 1$, the inequality
\[ \{\theta^n\} \le \varepsilon^n \tag{9.21} \]
fails for all but a finite set of integers $n$ depending on $\theta$ and $\varepsilon$. His result depends on Theorem 6.5.2. A similar result does not hold if $\theta$ is a suitable algebraic number, for instance, if $\theta$ is a Pisot number. Mahler asked for which algebraic numbers a similar result holds. In [10], Corvaja and Zannier answered Mahler’s question by showing a best possible result: if (9.21) holds for infinitely many $n$, then $\theta^d$ is a Pisot number for some positive integer $d$. As another application, they showed that the length of the period of the continued fraction expansion of $\alpha^n$ tends to infinity as $n \to \infty$. Here $\alpha$ is a real quadratic irrational which is not a square root of a rational number and not a unit in $O_{\mathbb{Q}(\alpha)}$. This solves a question of Mendès France. We strongly urge the reader to browse through the papers of Corvaja, Zannier and their co-authors for a variety of other results.
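The exceptional role of Pisot numbers can be seen on the golden ratio $\varphi = (1 + \sqrt{5})/2$. Since $\varphi^n + (-1/\varphi)^n$ is the Lucas number $L_n$, one has $\{\varphi^n\} = \varphi^{-n}$ for every odd $n$, so the fractional parts decay geometrically along odd exponents. The sketch below is an illustration under assumptions: the names `lucas` and `frac_part_direct` are hypothetical, and the float computation is only reliable for moderate $n$.

```python
from math import floor, sqrt

PHI = (1 + sqrt(5)) / 2


def lucas(n):
    """Lucas number L_n, with L_0 = 2, L_1 = 1 and L_{n+1} = L_n + L_{n-1}."""
    a, b = 2, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def frac_part_direct(n):
    """Fractional part of phi^n computed naively in floating point."""
    value = PHI ** n
    return value - floor(value)


if __name__ == "__main__":
    for n in (1, 3, 5, 7, 9, 11):
        print(n, lucas(n), frac_part_direct(n), PHI ** (-n))
```

For odd $n$ the two printed columns agree up to rounding, and the integer part of $\varphi^n$ is exactly $L_n$.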

References

1. W.M. Schmidt, Diophantine Approximation. LNM, vol. 785 (Springer, Berlin, 1980)
2. W.M. Schmidt, Diophantine Approximations and Diophantine Equations. LNM, vol. 1467 (Springer, Berlin, 1991)
3. H.P. Schlickewei, The p-adic Thue-Siegel-Roth-Schmidt theorem. Arch. Math. (Basel) 29, 267–270 (1977)
4. Y. Bugeaud, P. Corvaja, U. Zannier, An upper bound for the G.C.D. of $a^n - 1$ and $b^n - 1$. Math. Z. 243, 79–84 (2003)
5. H.P. Schlickewei, Die p-adische Verallgemeinerung des Satzes von Thue-Siegel-Roth-Schmidt. J. Reine Angew. Math. 288, 86–105 (1976)
6. H.P. Schlickewei, Linearformen mit algebraischen Koeffizienten. Manuscripta Math. 18, 147–185 (1976)
7. S. Hernández, F. Luca, On the largest prime factor of (ab + 1)(ac + 1)(bc + 1). Bol. Soc. Mat. Mexicana (3) 9, 235–244 (2003)
8. K. Győry, A. Sárközy, C.L. Stewart, On the number of prime factors of integers of the form ab + 1. Acta Arith. 74(4), 365–385 (1996)
9. K. Mahler, On the fractional parts of the powers of a rational number II. Mathematika 4, 122–124 (1957)
10. P. Corvaja, U. Zannier, On the rational approximation to the powers of an algebraic number: Solution of two problems of Mahler and Mendès France. Acta Math. 193, 175–191 (2004)

Appendix A

Introductory Quotes

Later on, when they had all said “Good-bye” and “Thank You” to Christopher Robin, Pooh and Piglet walked home thoughtfully together in the golden evening and for a long time they were silent. “When you wake up in the morning, Pooh,” said Piglet at last, “what’s the first thing you say to yourself?” “What’s for breakfast?” said Pooh. “What do you say, Piglet?” “I say, I wonder what’s going to happen exciting to-day?” said Piglet. Pooh nodded thoughtfully. “It’s the same thing,” he said.
-Winnie the Pooh

© Springer Nature Singapore Pte Ltd. 2020 S. Natarajan and R. Thangadurai, Pillars of Transcendental Number Theory, https://doi.org/10.1007/978-981-15-4155-1

