177 56 547KB
English Pages [68] Year 2016
Part II — Number Fields Based on lectures by I. Grojnowski Notes taken by Dexter Chua
Lent 2016 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures. They are nowhere near accurate representations of what was actually lectured, and in particular, all errors are almost surely mine.
Part IB Groups, Rings and Modules is essential and Part II Galois Theory is desirable Definition of algebraic number fields, their integers and units. Norms, bases and discriminants. [3] Ideals, principal and prime ideals, unique factorisation. Norms of ideals.
[3]
Minkowski’s theorem on convex bodies. Statement of Dirichlet’s unit theorem. Determination of units in quadratic fields. [2] Ideal classes, finiteness of the class group. Calculation of class numbers using statement of the Minkowski bound. [3] Dedekind’s theorem on the factorisation of primes. Application to quadratic fields. [2] Discussion of the cyclotomic field and the Fermat equation or some other topic chosen by the lecturer. [3]
1
Contents
II Number Fields
Contents 0 Introduction
3
1 Number fields
4
2 Norm, trace, discriminant, numbers
10
3 Multiplicative structure of ideals
17
4 Norms of ideals
27
5 Structure of prime ideals
32
6 Minkowski bound and finiteness of class group
37
7 Dirichlet’s unit theorem
48
8 L-functions, Dirichlet series*
56
Index
68
2
0
0
Introduction
II Number Fields
Introduction
Technically, IID Galois Theory is not a prerequisite of this course. However, many results we have are analogous to what we did in Galois Theory, and we will not refrain from pointing out the correspondence. If you have not learnt Galois Theory, then you can ignore them.
3
1
1
Number fields
II Number Fields
Number fields
The focus of this course is, unsurprisingly, number fields. Before we define what number fields are, we look at some motivating examples. Suppose we wanted to find all numbers of the form x2 + y 2 , where x, y ∈ Z. For example, if a, b can both be written in this form, does it follow that ab can? In IB Groups, Rings and Modules, we did the clever thing of working with Z[i]. The integers of the form x2 + y 2 are exactly the norms of integers in Z[i], where the norm of x + iy is N (x + iy) = |x + iy|2 = x2 + y 2 . Then the previous result is obvious — if a = N (z) and b = N (w), then ab = N (zw). So ab is of the form x2 + y 2 . Similarly, in the IB Groups, Rings and Modules example √ sheet, we found all solutions to the equation x2 + 2 = y 3 by working in Z[ −2]. This is a√very general technique — working with these rings, and corresponding fields Q( −d) can tell us a lot about arithmetic we care about. In this chapter, we will begin by writing down some basic definitions and proving elementary properties about number fields. Definition (Field extension). A field extension is an inclusion of fields K ⊆ L. We sometimes write this as L/K. Definition (Degree of field extension). Let K ⊆ L be fields. Then L is a vector space over K, and the degree of the field extension is [L : K] = dimK (L). Definition (Finite extension). A finite field extension is a field extension with finite degree. Definition (Number field). A number field is a finite field extension over Q. A field is the most boring kind of ring — the only ideals are the trivial one and the whole field itself. Thus, if we want to do something interesting with number fields algebraically, we need to come up with something more interesting. In the case of Q itself, one interesting thing to talk about is the integers Z. It turns out the right generalization to number fields is algebraic integers. Definition (Algebraic integer). Let L be a number field. An algebraic integer is an α ∈ L such that there is some monic f ∈ Z[x] with f (α) = 0. We write OL for the set of algebraic integers in L. Example. It is a fact that if L = Q(i), then OL = Z[i]. We will prove this in the next chapter after we have the necessary tools. These are in fact the main objects of study in this course. Since we say this is a generalization of Z ⊆ Q, the following had better be true: Lemma. OQ = Z, i.e. α ∈ Q is an algebraic integer if and only if α ∈ Z.
4
1
Number fields
II Number Fields
Proof. If α ∈ Z, then x − α ∈ Z[x] is a monic polynomial. So α ∈ OQ . On the other hand, let α ∈ Q. Then there is some coprime r, s ∈ Z such that α = rs . If it is an algebraic integer, then there is some f (x) = xn + an−1 xn−1 + · · · + a0 with ai ∈ Z such that f (α) = 0. Substituting in and multiplying by sn , we get rn + an−1 rn−1 s + · · · + a0 sn = 0, | {z } divisible by s
So s | rn . But if s 6= 1, there is a prime p such that p | s, and hence p | rn . Thus p | r. So p is a common factor of s and r. This is a contradiction. So s = 1, and α is an integer. How else is this a generalization of Z? We know Z is a ring. So perhaps OL also is. Theorem. OL is a ring, i.e. if α, β ∈ OL , then so is α ± β and αβ. Note that in general OL is not a field. For example, Z = OQ is not a field. The proof of this theorem is not as straightforward as the previous one. Recall we have proved a similar theorem in IID Galois Theory before with “algebraic integer” replaced with “algebraic number”, namely that if L/K is a field extension with α, β ∈ L algebraic over K, then so is αβ and α ± β, as well as α1 if α 6= 0. To prove this, we notice that α ∈ K is algebraic if and only if K[α] is a finite extension — if α is algebraic, with f (α) = an αn + · · · + a0 = 0,
an 6= 0
then K[α] has degree at most n, since αn (and similarly α−1 ) can be written as a linear combination of 1, α, · · · , αn−1 , and thus these generate K[α]. On the other hand, if K[α] is finite, say of degree k, then 1, α, · · · , αk are independent, hence some linear combination of them vanishes, and this gives a polynomial for which α is a root. Moreover, by the same proof, if K 0 is any finite extension over K, then any element in K 0 is algebraic. Thus, to prove the result, notice that if K[α] is generated by 1, α, · · · , αn and K[β] is generated by 1, β, · · · , β m , then K[α, β] is generated by {αi β j } for 1 ≤ i ≤ n, 1 ≤ j ≤ m. Hence K[α, β] is a finite extension, hence αβ, α ± β ∈ K[α, β] are algebraic. We would like to prove this theorem in an analogous way. We will consider OL as a ring extension of Z. We will formulate the general notion of “being an algebraic integer” in general ring extensions: Definition (Integrality). Let R ⊆ S be rings. We say α ∈ S is integral over R if there is some monic polynomial f ∈ R[x] such that f (α) = 0. We say S is integral over R if all α ∈ S are integral over R. Definition (Finitely-generated). We say S is finitely-generated over R if there exists elementsPα1 , · · · , αn ∈ S such that the function Rn → S defined by (r1 , · · · , rn ) 7→ ri αi is surjective, i.e. every element of S can be written as a Rlinear combination of elements α1 , · · · , αn . In other words, S is finitely-generated as an R-module. 5
1
Number fields
II Number Fields
This is a refinement of the idea of being algebraic. We allow the use of rings and restrict to monic polynomials. In Galois theory, we showed that finiteness and algebraicity “are the same thing”. We will generalize this to integrality of rings. Example. In a number field Z ⊆ Q ⊆ L, α ∈ L is an algebraic integer if and only if α is integral over Z by definition, and OL is integral over Z. Notation. If α1 , · · · , αr ∈ S, we write R[α1 , · · · , αr ] for the subring of S generated by R, α1 , · · · , αr . In other words, it is the image of the homomorphism from the polynomial ring R[x1 , · · · , xn ] → S given by xi 7→ αi . Proposition. (i) Let R ⊆ S be rings. If S = R[s] and s is integral over R, then S is finitely-generated over R. (ii) If S = R[s1 , · · · , sn ] with si integral over R, then S is finitely-generated over R. This is the easy direction in identifying integrality with finitely-generated. Proof. (i) We know S is spanned by 1, s, s2 , · · · over R. However, since s is integral, there exists a0 , · · · , an ∈ R such that sn = a0 + a1 s + · · · + an−1 sn−1 . So the R-submodule generated by 1, s, · · · , sn−1 is stable under multiplication by s. So it contains sn , sn+1 , sn+2 , · · · . So it is S. (ii) Let Si = R[s1 , · · · , si ]. So Si = Si−1 [si ]. Since si is integral over R, it is integral over Si−1 . By the previous part, Si is finitely-generated over Si−1 . To finish, it suffices to show that being finitely-generated is transitive. More precisely, if A ⊆ B ⊆ C are rings, B is finitely generated over A and C is finitely generated over B, then C is finitely generated over A. This is not hard to see, since if x1 , · · · , xn generate B over A, and y1 , · · · , ym generate C over B, then C is generated by {xi yj }1≤i≤n,1≤j≤m over A. The other direction is harder. Theorem. If S is finitely-generated over R, then S is integral over R. The idea of the proof is as follows: if s ∈ S, we need to find a monic polynomial which it satisfies. In Galois theory, we have fields and vector spaces, and the proof is easy. We can just consider 1, s, s2 , · · · , and linear dependence kicks in and gives us a relation. But even if this worked in our case, there is no way we can make this polynomial monic. Instead, consider the multiplication-by-s map: ms : S → S by γ 7→ sγ. If S were a finite-dimensional vector space over R, then Cayley-Hamilton tells us ms , and thus s, satisfies its characteristic polynomial, which is monic. Even though S is not a finite-dimensional vector space, the proof of Cayley-Hamilton will work.
6
1
Number fields
II Number Fields
Proof. Let α1 , · · · , αn generate S as an R-module. wlog take α1 = 1 ∈ S. For any s ∈ S, write X sαi = bij αj for some bij ∈ R. We write B = (bij ). This is the “matrix of multiplication by S”. By construction, we have α1 .. (sI − B) . = 0. (∗) an Now recall for any matrix X, we have adj(X)X = (det X)I, where the i, jth entry of adj(X) is given by the determinant of the matrix obtained by removing the ith row and jth column of X. We now multiply (∗) by adj(sI − B). So we get α1 .. det(sI − B) . = 0 αn In particular, det(sI −B)α1 = 0. Since we picked α1 = 1, we get det(sI −B) = 0. Hence if f (x) = det(xI − B), then f (x) ∈ R[x], and f (s) = 0. Hence we obtain the following: Corollary. Let L ⊇ Q be a number field. Then OL is a ring. Proof. If α, β ∈ OL , then Z[α, β] is finitely-generated by the proposition. But then Z[α, β] is integral over Z, by the previous theorem. So α ± β, αβ ∈ Z[α, β]. Note that it is not necessarily true that if S ⊇ R is an integral extension, then S is finitely-generated over R. For example, if S is the set of all algebraic integers in C, and R = Z, then by definition S is an integral extension of Z, but S is not finitely generated over Z. Thus the following corollary isn’t as trivial as the case with “integral” replaced by “finitely generated”: Corollary. If A ⊆ B ⊆ C be ring extensions such that B over A and C over B are integral extensions. Then C is integral over A. The idea of the proof is that while the extensions might not be finitely generated, only finitely many things are needed to produce the relevant polynomials witnessing integrality. Proof. If c ∈ C, let f (x) =
N X
bi xi ∈ B[x]
i=0
be a monic polynomial such that f (c) = 0. Let B0 = A[b0 , · · · , bN ] and let C0 = B0 [c]. Then B0 /A is finitely generated as b0 , · · · , bN are integral over A. Also, C0 is finitely-generated over B0 , since c is integral over B0 . Hence C0 is finitely-generated over A. So c is integral over A. Since c was arbitrary, we know C is integral over A. 7
1
Number fields
II Number Fields
Now how do we recognize algebraic integers? If we want to show something is an algebraic integer, we just have to exhibit a monic polynomial that vanishes on the number. However, if we want to show that something is not an algebraic integer, we have to make sure no monic polynomial kills the number. How can we do so? It turns out to check if something is an algebraic integer, we don’t have to check all monic polynomials. We just have to check one. Recall that if K ⊆ L is a field extensions with α ∈ L, then the minimal polynomial is the monic polynomial pα (x) ∈ K[x] of minimal degree such that pα (α) = 0. Note that we can always make the polynomial monic. It’s just that the coefficients need not lie in Z. Recall that we had the following lemma about minimal polynomials: Lemma. If f ∈ K[x] with f (α) = 0, then pα | f . Proof. Write f = pα h + r, with r ∈ K[x] and deg(r) < deg(pα ). Then we have 0 = f (α) = p(α)h(α) + r(α) = r(α). So if r 6= 0, this contradicts the minimality of deg pα . In particular, this lemma implies pα is unique. One nice application of this result is the following: Proposition. Let L be a number field. Then α ∈ OL if and only if the minimal polynomial pα (x) ∈ Q[x] for the field extension Q ⊆ L is in fact in Z[x]. This is a nice proposition. This gives us an necessary and sufficient condition for whether something is algebraic. Proof. (⇐) is trivial, since this is just the definition of an algebraic integer. (⇒) Let α ∈ OL and pα ∈ Q[x] be the minimal polynomial of α, and h(x) ∈ Z[x] be a monic polynomial which α satisfies. The idea is to use h to show that the coefficients of pα are algebraic, thus in fact integers. Now there exists a bigger field M ⊇ L such that pα (x) = (x − α1 ) · · · (x − αr ) factors in M [x]. But by our lemma, pα | h. So h(αi ) = 0 for all αi . So αi ∈ OM is an algebraic integer. But OM is a ring, i.e. sums and products of the αi ’s are still algebraic integers. So the coefficients of pα are algebraic integers (in OM ). But they are also in Q. Thus the coefficients must be integers. Alternatively, we can deduce this proposition from the previous lemma plus Gauss’ lemma. Another relation between Z and Q is that Q is the fraction field of Z. This is true for general number fields Lemma. We have Frac OL =
α : α, β ∈ OL , β 6= 0 = L. β
In fact, for any α ∈ L, there is some n ∈ Z such that nα ∈ OL . 8
1
Number fields
II Number Fields
Proof. If α ∈ L, let g(x) ∈ Q[x] be its monic minimal polynomial. Then there exists n ∈ Z non-zero such that ng(x) ∈ Z[x] (pick n to be the least common multiple of the denominators of the coefficients of g(x)). Now the magic is to put x h(x) = ndeg(g) g . n Then this is a monic polynomial with integral coefficients — in effect, we have just multiplied the coefficient of xi by ndeg(g)−i ! Then h(nα) = 0. So nα is integral.
9
2
Norm, trace, discriminant, numbers
2
II Number Fields
Norm, trace, discriminant, numbers
Recall that in our motivating example of Z[i], one important tool was the norm of an algebraic integer x + iy, given by N (x + iy) = x2 + y 2 . This can be generalized to arbitrary number fields, and will prove itself to be a very useful notion to consider. Apart from the norm, we will also consider a number known as the trace, which is also useful. We will also study numbers associated with the number field itself, rather than particular elements of the field, and it turns out they tell us a lot about how the field behaves. Norm and trace Recall the following definition from IID Galois Theory: Definition (Norm and trace). Let L/K be a field extension, and α ∈ L. We write mα : L → L for the map ` 7→ α`. Viewing this as a linear map of L vector spaces, we define the norm of α to be NL/K (α) = det mα , and the trace to be trL/K (α) = tr mα . The following property is immediate: Proposition. For a field extension L/K and a, b ∈ L, we have N (ab) = N (a)N (b) and tr(a + b) = tr(a) + tr(b). We can alternatively define the norm and trace as follows: Proposition. Let pα ∈ K[x] be the minimal polynomial of α. Then the characteristic polynomial of mα is det(xI − mα ) = p[L:K(α)] α Hence if pα (x) splits in some field L0 ⊇ K(α), say pα (x) = (x − α1 ) · · · (x − αr ), then NK(α)/K (α) =
Y
αi ,
trK(α)/K (α) =
X
αi ,
and hence NL/K (α) =
Y
αi
[L:K(α)]
,
trL/K = [L : K(α)]
X
αi .
This was proved in the IID Galois Theory course, and we will just use it without proving. Corollary. Let L ⊇ Q be a number field. Then the following are equivalent: (i) α ∈ OL . (ii) The minimal polynomial pα is in Z[x] 10
2
Norm, trace, discriminant, numbers
II Number Fields
(iii) The characteristic polynomial of mα is in Z[x]. This in particular implies NL/Q (α) ∈ Z and trL/Q (α) ∈ Z. Proof. The equivalence between the first two was already proven. For the equivalence between (ii) and (iii), if mα ∈ Z[x], then α ∈ OL since it vanishes on a monic polynomial in Z[x]. On the other hand, if pα ∈ Z[x], then so is the characteristic polynomial, since it is just pN α. The final implication comes from the fact that the norm and trace are just coefficients of the characteristic polynomial. It would be nice if the last implication is an if and only if. This is in general not true, but it occurs, obviously, when the characteristic polynomial is quadratic, since the norm and trace would be the only coefficients. √ 2 Example. Let L = K( d) = K[z]/(z √ − d), where d is not a square in K. As a vector space over K, we can take 1, d as our basis. So every α can be written as √ α = x + y d. Hence the matrix of multiplication by α is x dy mα = . y x So the trace and norm are given by √ √ √ trL/K (x + y d) = 2x = (x + y d) + (x − y d) √ √ √ NL/K (x + y d) = x2 − dy 2 = (x + y d)(x − y d) We can also polynomial of √ obtain this by consider the roots of the minimal √ α = x + y d, namely (α − x)2 − y 2 d = 0, which has roots x ± y d. √ In particular, if L = Q( d), with d < 0, then the norm of an element is just the norm of it as a complex number. Now that we have computed the general trace and norm, we can use the proposition to find out what the algebraic integers are. It turns out the result is (slightly) unexpected: √ Lemma. Let L = Q( d), where d ∈ Z is not 0, 1 and is square-free. Then ( √ Z[h d] √ i OL = Z 21 (1 + d)
d ≡ 2 or 3 d≡1
(mod 4)
(mod 4)
√ Proof. We know x + y λ ∈ OL if and only if 2x, x2 − dy 2 ∈ Z by the previous example. These imply 4dy 2 ∈ Z. So if y = rs with r, s coprime, r, s ∈ Z, then we must have s2 | 4d. But d is square-free. So s = 1 or 2. So x=
u , 2
y=
v 2
for some u, v ∈ Z. Then we know u2 − dv 2 ∈ 4Z, i.e. u2 ≡ dv 2 (mod 4). But we know the squares mod 4 are always 0 and 1. So if d 6≡ 1 (mod 4), then u2 ≡ dv 2 11
2
Norm, trace, discriminant, numbers
II Number Fields
2 2 (mod 4) imply that √ u = v = 0 (mod 4), and hence u, v are even. So x, y ∈ Z, giving OL = Z[ d]. On the other hand,√if d ≡ 1 (mod 4), then u, v have the same √ parity mod 2, i.e. we can write x + y d as a Z-combination of 1 and 21 (1 + d). √ As a sanity check, we find that the minimal polynomial of 12 (1 + d) is x2 − x + 14 (1 − d) which is in Z if and only if d ≡ 1 (mod 4).
Field embeddings Recall the following theorem from IID Galois Theory: Theorem (Primitive element theorem). Let K ⊆ L be a separable field extension. Then there exists an α ∈ L such that K(α) = L. √ √ √ √ For example, Q( 2, 3) = Q( 2 + 3). Since Q has characteristic zero, it follows that all number fields are separable extensions. So any number field L/Q is of the form L = Q(α). This makes it much easier to study number fields, as the only extra “stuff” we have on top of Q. One particular thing we can do√is to look at the number of ways we can embed√L ,→ C. For example, for Q( √ −1), there are two such embeddings — one sends −1 to i and the other sends −1 to −i. Lemma. The degree [L : Q] = n of a number field is the number of field embeddings L ,→ C. Proof. Let α be a primitive element, and pα (x) ∈ Q[x] its minimal polynomial. Then by we have deg pα = [L : Q] = n, as 1, α, α2 , · · · , αn−1 is a basis. Moreover, Q[x] ∼ = Q(α) = L. (pα ) Since L/Q is separable, we know pα has n distinct roots in C. Write pα (x) = (x − α1 ) · · · (x − αn ). Now an embedding Q[x]/(pα ) ,→ C is uniquely determined by the image of x, and x must be sent to one of the roots of pα . So for each i, the map x 7→ αi gives us a field embedding, and these are all. So there are n of them. Using these field embeddings, we can come up with the following alternative formula for the norm and trace. Corollary. Let L/Q be a number field. If σ1 , · · · , σn : L → C are the different field embeddings and β ∈ L, then X Y trL/Q (β) = σi (β), NL/Q (β) = σi (β). i
We call σ1 (β), · · · , σn (β) the conjugates of β in C. Proof is in the Galois theory course. Using this characterization, we have the following very concrete test for when something is a unit. 12
2
Norm, trace, discriminant, numbers
II Number Fields
Lemma. Let x ∈ OL . Then x is a unit if and only if NL/Q (x) = ±1. × Notation. Write OL = {x ∈ OL : x−1 ∈ OL }, the units in OL . × Proof. (⇒) We know N (ab) = N (a)N (b). So if x ∈ OL , then there is some y ∈ OL such that xy = 1. So N (x)N (y) = 1. So N (x) is a unit in Z, i.e. ±1. (⇐) Let σ1 , · · · , σn : L → C be the n embeddings of L in C. For notational convenience, We suppose that L is already subfield of C, and σ1 is the inclusion map. Then for each x ∈ OL , we have
N (x) = xσ2 (x) · · · σn (x). Now if N (x) = ±1, then x−1 = ±σ2 (x) · · · σn (x). So we have x−1 ∈ OL , since this is a product of algebraic integers. So x is a unit in OL . Corollary. If x ∈ OL is such that N (x) is prime, then x is irreducible. Proof. If x = ab, then N (a)N (b) = N (x). Since N (x) is prime, either N (a) = ±1 or N (b) = ±1. So a or b is a unit. We can consider a more refined notion than just the number of field embeddings. Definition (r and s). We write r for the number of field embeddings L ,→ R, and s the number of pairs of non-real field embeddings L ,→ C. Then n = r + 2s. Alternatively, r is the number of real roots of pα , and s is the number of pairs of complex conjugate roots. The distinction between real embeddings and complex embeddings will be important in the second half of the course. Discriminant The final invariant we will look at in this chapter is the discriminant. It is based on the following observation: Proposition. Let L/K be a separable extension. Then a K-bilinear form L × L → K defined by (x, y) 7→ trL/K (xy) is non-degenerate. Equivalent, if α1 , · · · , αn are a K-basis for L, the Gram matrix (tr(αi αj ))i,j=1,··· ,n has non-zero determinant. Recall from Galois theory that if L/K is not separable, then trL/K = 0, and it is very very degenerate. Also, note that if K is of characteristic 0, then there is a quick and dirty proof of this fact — the trace map is non-degenerate, because for any x ∈ K, we have trL/K (x · x−1 ) = n 6= 0. This is really the only case we care about, but in the proof of the general result, we will also find a useful formula for the discriminant when the basis is 1, θ, θ2 , . . . , θn−1 . We will use the following important notation: Notation. ∆(α1 , · · · , αn ) = det(trL/K (αi αj )). 13
2
Norm, trace, discriminant, numbers
II Number Fields
¯ be the n distinct K-linear field embeddings Proof. Let σ1 , · · · , σn : L → K ¯ Put L ,→ K. σ1 (α1 ) · · · σ1 (αn ) .. .. S = (σi (αj ))i,j=1,··· ,n = ... . . σn (α1 ) · · · Then
n X
T
S S=
σn (αn ).
! σk (αi )σk (αj )
k=1
. i,j=1,···n
We know σk is a field homomorphism. So n X
σk (αi )σk (αj ) =
k=1
n X
σk (αi αj ) = trL/K (αi αj ).
k=1
So S T S = (tr(αi αj ))i,j=1,··· ,n . So we have ∆(α1 , · · · , αn ) = det(S T S) = det(S)2 . Now we use the theorem of primitive elements to write L = K(θ) such that 1, θ, · · · , θn−1 is a basis for L over K, with [L : K] = n. Now S is just 1 σ1 (θ) · · · σ1 (θ)n−1 .. .. .. S = ... . . . . 1
σn (θ) · · ·
σn (θ)n−1
This is a Vandermonde matrix, and so ∆(1, θ, · · · , θn−1 ) = (det S)2 =
Y
(σi (θ) − σj (θ))2 .
i 0 such that Bε (x) ∩ X = {x}. This is true if and only if for every compact K ⊆ Rn , K ∩ X is finite. We have the following very useful characterization of discrete subgroups of Rn : Proposition. Suppose Λ ⊆ Rn is a subgroup. Then Λ is a discrete subgroup of (Rn , +) if and only if (m ) X Λ= ni x i : ni ∈ Z 1
for some x1 , · · · , xm linearly independent over R. √ √ Note that linear independence is important. For example, Z 2 + Z 3 ⊆ √R is not discrete. On the other hand, if Λ = a C OL is an ideal, where L = Q( d) and d < 0, then this is discrete.
42
6
Minkowski bound and finiteness of class group
II Number Fields
Proof. Suppose Λ is generated by x1 , · · · , xm . By linear independence, there is some g ∈ GLn (R) such that gxi = ei for all 1 ≤ i ≤ m, where e1 , · · · , en is the standard basis. We know acting by g preserves discreteness, since it is a homeomorphism, and gΛ = Zm ⊆ Rm × Rn−m is clearly discrete (take ε = 12 ). So this direction follows. For the other direction, suppose Λ is discrete. We pick y1 , · · · , ym ∈ Λ which are linearly independent over R, with m maximal (so m ≤ n). Then by maximality, we know (m ) (m ) X X λi yi : λi ∈ R = λi zi : λi ∈ R, zi ∈ Λ , 1
i=1
and this is the smallest vector subspace of Rn containing Λ. We now let (m ) X X= λi yi : λi ∈ [0, 1] ∼ = [0, 1]m . i=1
This is closed and bounded, and hence compact. So X ∩ Λ is finite. Also, we know M Zyi = Zm ⊆ Λ, and if γ is any element of Λ, we can write it as γ = γ0 + γ1 , where γ0 ∈ X and γ1 ∈ Zm . So Λ Zm ≤ |X ∩ Λ| < ∞. So let d = |Λ/Zm |. Then dΛ ⊆ Zm , i.e. Λ ⊆ d1 Zm . So Zm ⊆ Λ ⊆
1 m Z . d
So Λ is a free abelian group of rank m. So there exists x1 , · · · , xm ∈ d1 Zm which is an integral basis of Λ and are linearly independent over R. Definition (Lattice). If rank Λ = n = dim Rn , then Λ is a lattice in Rn . Definition (Covolume and fundamental domain). Let Λ ⊆ Rn be a lattice, and x1 , · · · , xn be a basis of Λ, then let ( n ) X P = λi xi : λi ∈ [0, 1] , i=1
and define the covolume of Λ to be covol(Λ) = vol(P ) = | det A|, P where A is the matrix such that xi = aij ej . We say P is a fundamental domain for the action of Λ on Rn , i.e. [ Rn = (γ + P ), γ∈Λ
and (γ + P ) ∩ (µ + P ) ⊆ ∂(γ + P ). In particular, the intersection has zero volume. 43
6
Minkowski bound and finiteness of class group
II Number Fields
This is called the covolume since if we consider the space Rn /Λ, which is an n-dimensional torus, then this has volume covol(Λ). 0 0 Observe now P that if x1 , · · · , xn is a different basis of Λ, then the transition matrix x0i = bij xj has B ∈ GLn (Z). So we have det B = ±1, and covol(Λ) is independent of the basis choice. With these notations, we can now state Minkowski’s theorem. Theorem (Minkowski’s theorem). Let Λ ⊆ Rn be a lattice, and P be a fundamental domain. We let S ⊆ Rn be a measurable set, i.e. one for which vol(S) is defined. (i) Suppose vol(S) > covol(Λ). Then there exists distinct x, y ∈ S such that x − y ∈ Λ. (ii) Suppose 0 ∈ S, and S is symmetric around 0, i.e. s ∈ S if and only if −s ∈ S, and S is convex, i.e. for all x, y ∈ S and λ ∈ [0, 1], then λx + (1 − λ)y ∈ S. Then suppose either (a) vol(S) > 2n covol(Λ); or (b) vol(S) ≥ 2n covol(Λ) and S is closed. Then S contains a γ ∈ Λ with γ 6= 0. Note that for n = 2, this is what we used for quadratic fields. By considering Λ = Zn ⊆ Rn and S = [−1, 1]n , we know the bounds are sharp. Proof. (i) Suppose vol(S) > covol(Λ) = vol(P ). Since P ⊆ Rn is a fundamental domain, we have X X vol(S) = vol(S ∩ Rn ) = vol S ∩ (P + γ) = vol(S ∩ (P + γ)). γ∈Λ
γ∈Λ
Also, we know vol(S ∩ (P + γ)) = vol((S − γ) ∩ P ), as volume is translation invariant. We now claim the sets (S − γ) ∩ P for γ ∈ Λ are not pairwise disjoint. If they were, then X X vol(P ) ≥ vol((S − γ) ∩ P ) = vol(S ∩ (P + γ)) = vol(S), γ∈Λ
γ∈Λ
contradicting our assumption. Then in particular, there are some distinct γ and µ such that (S − γ) and (S − µ) are not disjoint. In other words, there are x, y ∈ S such that x − γ = y − µ, i.e. x − y = γ − µ ∈ Λ 6= 0.
44
6
Minkowski bound and finiteness of class group
II Number Fields
(ii) We now let 1 S = S= 2 0
1 s:s∈S . 2
So we have vol(S 0 ) = 2−n vol(S) > covol(Λ), by assumption. (a) So there exists some distinct y, z ∈ S 0 such that y − z ∈ Λ \ {0}. We now write 1 y − z = (2y + (−2z)), 2 Since 2z ∈ S implies −2z ∈ S by symmetry around 0, so we know y − z ∈ S by convexity. 1 (b) We apply the previous part to Sm = 1 + m S for all m ∈ N, m > 0. So we get a non-zero γm ∈ Sm ∩ Λ. By convexity, we know Sm ⊆ S1 = 2S for all m. So γ1 , γ2 , · · · ∈ S1 ∩Λ. But S1 is compact set. So S1 ∩ Λ is finite. So there exists γ such that γm is γ infinitely often. So \ γ∈ Sm = S. m≥0
So γ ∈ S. We are now going to use this to mimic our previous proof that the class group of an imaginary quadratic field is finite. To begin with, we need to produce lattices from ideals of OL . Let L be a number field, and [L : Q] = n. We let σ1 , · · · , σr : L → R be the real embeddings, and σr+1 , · · · , σr+s , σ ¯r+1 , · · · , σ ¯r+s : L → C be the complex embeddings (note that which embedding is σr+i and which is σ ¯r+i is an arbitrary choice). Then this defines an embedding σ = (σ1 , σ2 , · · · , σr , σr+1 , · · · , σr+s ) : L ,→ Rr × Cs ∼ = Rr × R2s = Rr+2s = Rn , under the isomorphism C → R2 by x + iy 7→ (x, y). Just as we did for quadratic fields, we can relate the norm of ideals to their covolume. Lemma. 1
(i) σ(OL ) is a lattice in Rn of covolume 2−s |DL | 2 . (ii) More generally, if a C OL is an ideal, then σ(a) is a lattice and the covolume 1
covol(σ(a)) = 2−s |DL | 2 N (a). Proof. Obviously (ii) implies (i). So we just prove (ii). Recall that a has an integral basis γ1 , · · · , γn . Then a is the integer span of the vectors (σ1 (γi ), σ2 (γi ), · · · , σr+s (γi ))
45
6
Minkowski bound and finiteness of class group
II Number Fields
for i = 1, · · · , n, and they are independent as we will soon see when we compute the determinant. So it is a lattice. We also know that ∆(γ1 , · · · , γn ) = det(σi (γj ))2 = N (a)2 DL , where the σi run over all σ1 , · · · , σr , σr+1 , · · · , σr+s , σ ¯r+1 , · · · σ ¯r+s . So we know 1 | det(σi (γj ))| = N (a)|DL | 2 . So what we have to do is to relate det(σi (γj )) to the covolume of σ(a). But these two expressions are very similar. In the σi (γj ) matrix, we have columns that look like σr+i (γj ) σ ¯r+i (γj ) = z z¯ . On the other hand, the matrix of σ(γ) has corresponding entries 1 1 1 z Re(z) Im(z) = 12 (z + z¯) 2i (¯ z − z) = z¯ 2 i −i 1 1 1 We call the last matrix A = . We can compute the determinant as 2 i −i 1 1 1 1 = . | det A| = det 2 i −i 2 Hence the change of basis matrix from (σi (γj )) to σ(γ) is s diagonal copies of A, so has determinant 2−s . So this proves the lemma. Proposition. Let a C OL be an ideal. Then there exists an α ∈ a with α = 6 0 such that |N (α)| ≤ cL N (a), where cL =
s 1 n! 4 |DL | 2 . n π n
This is the Minkowski bound . Proof. Let n o X X |yi | + 2 |zi | ≤ t . Br,s (t) = (y1 , · · · , yr , z1 , · · · , zs ) ∈ Rr × Cs : This (i) is closed and bounded; (ii) is measurable (it is defined by polynomial inequalities); (iii) has volume vol(Br,s (t)) = 2r (iv) is convex and symmetric about 0. 46
π s tn ; 2 n!
6
Minkowski bound and finiteness of class group
II Number Fields
Only (iii) requires proof, and it is on the second example sheet, i.e. we are not doing it here. It is just doing the integral. We now choose t so that vol Br,s (t) = 2n covol(σ(a)). Explicitly, we let
s 4 t = n!|DL |1/2 N (a). π n
Then by Minkowski’s lemma, there is some α ∈ a non-zero such that σ(α) ∈ Br,s (t). We write σ(α) = (y1 , · · · , yr , z1 , · · · , zs ). Then we observe N (α) = y1 · · · yr z1 z¯1 z2 z¯2 · · · zs z¯s =
Y
yi
Y
|zj |2 .
By the AM-GM inequality, we know |N (α)|1/n ≤
X t 1 X yi + 2 |zj | ≤ , n n
as we know σ(a) ∈ Br,s (t). So we get |N (α)| ≤
tn = cL N (a). nn
Corollary. Every [a] ∈ clL has a representative a ∈ OL with N (a) ≤ cL . Theorem (Dirichlet). The class group clL is finite, and is generated by prime ideals of norm ≤ cL . Proof. Just as the case for imaginary quadratic fields.
47
7
7
Dirichlet’s unit theorem
II Number Fields
Dirichlet’s unit theorem
We have previously characterized the units on OL as the elements with unit norm, i.e. α ∈ OL is a unit if and only if |N (α)| = 1. However, this doesn’t tell us much about how many units there are, and how they are distributed. The answer to this question is given by Dirichlet’s unit theorem. Theorem (Dirichlet unit theorem). We have the isomorphism × ∼ OL = µL × Zr+s−1 ,
where µL = {α ∈ L : αN = 1 for some N > 0} is the group of roots of unity in L, and is a finite cyclic group. Just as in the finiteness of the class group, we do it for an example first, or else it will be utterly incomprehensible. √ We do the example of real quadratic fields, L = Q( d), where d > 1 is square-free. So r = 2, s = 0, and L ⊆ R implies µL = {±1}. So × ∼ OL = {±1} × Z.
Also, we know that √ √ √ N (x + y d) = (x + y d)(x − y d) = x2 − dy 2 . So Dirichlet’s theorem is saying that there are infinitely many solutions of x2 − dy 2 = ±1, and are all (plus or minus) the powers of one single element. √ Theorem (Pell’s equation). There are infinitely many x + y d ∈ OL such that x2 − dy 2 = ±1. You might have seen this in IIC Number Theory, where we proved it directly by continued fractions. We will provide a totally unconstructive proof here, since this is more easily generalized to arbitrary number fields. This is actually just half of Dirichlet’s theorem. The remaining part is to show that they are all powers of the same element. Proof. Recall that σ : OL → R2 sends √ √ √ α = x + y d 7→ (σ1 (α), σ2 (α)) = (x + y d, x − y d). √ (in the domain, d is a formal symbol, while in the codomain, it is a real number, namely the positive square root of d) Also, we know 1 covol(σ(OL )) = |DL | 2 .
48
7
Dirichlet’s unit theorem
II Number Fields
Z[X]
(1, 1) N (α) = 1
Consider st =
|DL |1/2 (y1 , y2 ) ∈ R : |y1 | ≤ t, |y2 | ≤ t 2
So
.
1
vol(st ) = 4|DL | 2 = 2n covol(OL ), as n = [L : Q] = 2. Now Minkowski implies there is an α ∈ OL non-zero such that σ(α) ∈ st . Also, if we write σ(α) = (y1 , y2 ), then N (α) = y1 y2 . So such an α will satisfy 1 ≤ |N (α)| ≤ |DL |1/2 . This is not quite what we want, since we need |N (α)| = 1 exactly. Nevertheless, this is a good start. So let’s try to find infinitely such elements. First notice that no points on the lattice (apart origin) hits the x √ from the √ or y axis, since any such point must satisfy x ± y d = 0, but d is not rational. Also, st is compact. So st ∩ σ(OL ) contains finitely many points. So we can find a t2 such that for each (y1 , y2 ) ∈ st ∩ OL , we have |y1 | > t2 . In particular, st2 does not contain any point in st ∩ σ(OL ). So we get a new set of points α ∈ st2 ∩ OL such that 1 ≤ |N (α)| ≤ |DL |1/2 .
s2
s1
49
7
Dirichlet’s unit theorem
II Number Fields
We can do the same thing for st2 and get a new t3 . In general, given t1 > · · · > tn , pick tn+1 be such that ( ) n [ 0 < tn+1 < min |y1 | : (y1 , y2 ) ∈ sti ∩ σ(OL ) , i=1
and the minimum is finite since st is compact and hence contains finitely many lattice points on σ(OL ). Then we get an infinite sequence of ti such that sti ∩ σ(OL ) are disjoint for different i. Since each must contain at least one point, we have got infinitely many points in OL satisfying 1 ≤ |N (α)| ≤ |DL |1/2 . Since there are only finitely many integers between 1 and |DL |1/2 , we can apply the pigeonhole principle, and get that there is some integer satisfying 1 ≤ |m| ≤ |DL |1/2 such that there exists infinitely many α ∈ OL with N (α) = m. This is not quite good enough. We consider OL /mOL ∼ = (Z/mZ)[L:Q] , another finite set. We notice that each α ∈ OL must fall into one of finitely many the cosets of mOL in OL . In particular, each α such that N (α) = m must belong to one of these cosets. So again by the pigeonhole principle, there exists a β ∈ OL with N (β) = m, and infinitely many α ∈ OL with N (α) = m and α = β (mod mOL ). Now of course α and β are not necessarily units, if m 6= 1. However, we will show that α/β is. The hard part is of course showing that it is in OL itself, because it is clear that α/β has norm 1 (alternatively, by symmetry, β/α is in OL , so an inverse exists). Hence all it remains is to prove the general fact that if α = β + mγ, where α, β, γ ∈ OL and N (α) = N (β) = m, then α/β ∈ OL . To show this, we just have to compute α m N (β) ¯ ∈ OL , =1+ γ =1+ γ = 1 + βγ β β β ¯ So done. since N (β) = β β. We have thus constructed infinitely many units. We now prove the remaining part √ Theorem (Dirchlet’s unit theorem for real quadratic fields). Let L = Q( d). × Then there is some ε0 ∈ OL such that × OL = {±εn0 : n ∈ Z}.
We call such an ε0 a fundamental unit (which is not unique). So × ∼ OL = {±1} × Z.
50
7
Dirichlet’s unit theorem
II Number Fields
Proof. We have just proved the really powerful theorem that there are infinitely many ε with N (ε) = 1. We are not going to need the full theorem. All we need is that there are three — in particular, something that is not ±1. × We pick some ε ∈ OL with ε = 6 ±1. This exists by what we just proved. Then we know |σ1 (ε)| = 6 1, as |σ1 (ε)| = 1 if and only if ε = ±1. Replacing by ε−1 if necessary, we wlog E = |σ1 (ε)| > 1. Now consider {α ∈ OL : N (α) = ±1, 1 ≤ |σ1 (α)| ≤ E}. This is again finite, since it is specified by a compact subset of the OL -lattice. So we pick ε0 in this set with ε0 = 6 ±1 and |σ1 (ε0 )| minimal (> 1). Replacing ε0 by −ε0 if necessary, we can assume σ1 (ε) > 1. × Finally, we claim that if ε ∈ OL and σ1 (ε) > 0, then ε = εN 0 for some N ∈ Z. This is obvious if we have addition instead of multiplication. So we take logs. Suppose log ε = N + γ, log ε0 where N ∈ Z and 0 ≤ γ < 1. Then we know × εε−N = εγ0 ∈ OL , 0
but |εγ0 | = |ε0 |γ < |ε0 |, as |ε0 | > 1. By our choice of ε0 , we must have γ = 0. So done. Now we get to prove the Dirichlet unit theorem in its full glory. Theorem (Dirichlet unit theorem). We have the isomorphism × ∼ OL = µL × Zr+s−1 ,
where µL = {α ∈ L : αN = 1 for some N > 0} is the group of roots of unity in L, and is a finite cyclic group. Proof. We do the proof in the opposite order. We throw in the logarithm at the very beginning. We define × ` : OL → Rr+s by x 7→ (log |σ1 (x)|, · · · , log |σr (x)|, 2 log |σr+1 (x)|, · · · , 2 log |σr+s (x)|). Note that |σr+i (x)| = |σr+` (x)|. So this is independent of the choice of one of σr+i , σ ¯r+i . Claim. We now claim that im ` is a discrete group in Rr+s and ker ` = µL is a finite cyclic group.
51
7
Dirichlet’s unit theorem
II Number Fields
We note that log |ab| = log |a| + log |b|. So this is a group homomorphism, and the image is a subgroup. To prove the first part, it suffices to show that im ` ∩ [−A, A]r+s is finite for all A > 0. We notice ` factors as × OL
OL
σ
R r × Cs
j
Rr+s .
where σ maps α 7→ (σ1 (α), · · · , σr+s (α)), and j : (y1 , · · · , yr , z1 , · · · , zs ) 7→ (log |y1 |, · · · , log |yr |, 2 log |z1 |, · · · , 2 log |z2 |). We see j −1 ([−A, A]r+s ) = {(yi , zj ) : e−A ≤ |yi | ≤ eA , e−A ≤ 2|zj | ≤ eA } is a compact set, and σ(OL ) is a lattice, in particular discrete. So σ(OL ) ∩ j −1 ([−A, A]r+s ) is finite. This also shows the kernel is finite, since the kernel is the inverse image of a compact set. Now as ker ` is finite, all elements are of finite order. So ker ` ⊆ µL . Conversely, it is clear that µL ⊆ ker `. So it remains to show that µL is cyclic. Since L embeds in C, we know µL is contained in the roots of unity in C. Since µL is finite, we know L is generated by a root of unity with the smallest argument (from, say, IA Groups). Claim. We claim that n o X im ` ⊆ (y1 , · · · , yr+s ) : yi = 0 ∼ = Rr+s−1 . × To show this, note that if α ∈ OL , then
N (α) =
n Y i=1
σi (α)
s Y
σr+` (α)¯ σr+` = ±1.
`=1
Taking the log of the absolute values, we get X X 0= log |σi (α)| + 2 log |σr+i (α)|. So we know im ` ⊆ Rr+s−1 as a discrete subgroup. So it is isomorphic to Za for some a ≤ r + s − 1. Then what we want to show is that im ` ⊆ Rr+s−1 is a lattice, i.e. it is congruent to Zr+s−1 . Note that so far what we have done is the second part of what we did for the real quadratic fields. We took the logarithm to show that these form a discrete subgroup. Next, we want to find r + s − 1 independent elements to show it is a lattice. Claim. Fix a k such that 1 ≤ k ≤ r + s and α ∈ OL with α = 6 0. Then there exists a β ∈ OL such that s 2 |N (β)| ≤ |DL |1/2 , π 52
7
Dirichlet’s unit theorem
II Number Fields
and moreover if we write `(α) = (a1 , · · · , ar+s ) `(β) = (b1 , · · · , br+s ), then we have bi < ai for all i 6= k. We can apply Minkowski to the region S = {(y1 , · · · , yr , z1 , · · · , zs ) ∈ Rr × Cs : |yi | ≤ ci , |zj | ≤ cr+j } (we will decide what values of ci to take later). Then this has volume vol(S) = 2r π s c1 · · · cr+s . We notice S is convex and symmetric around 0. So if we choose 0 < ci < eai for i 6= k, and choose s 2 1 ck = |DL |1/2 . π c1 · · · cˆk · · · cr+s Then Minkowski gives β ∈ σ(OL ) ∩ S, satisfying the two conditions above. × Claim. For any k = 1, · · · , r + s, there is a unit uk ∈ OL suchP that if `(uk ) = (y1 , · · · , yr+s , then yi < 0 for all i 6= k (and hence yk > 0 since yi = 0).
This is just as in the proof for the real quadratic case. We can repeatedly apply the previous claim to get a sequence α1 , α2 , · · · ∈ OL such that N (αt ) is bounded for all t, and for all i 6= k, the ith coordinate of `(α1 ), `(α2 ), · · · is strictly decreasing. But then as with real quadratic fields, the pigeonhole principle implies we can find t, t0 such that N (αt ) = N (αt0 ) = m, say, and αt ≡ αt0
(mod mOL ),
i.e. αt = αt0 in OL /mOL . Hence for each k, we get a unit uk = αt /αt0 such that `(uk ) = `(αt ) − `(αt0 ) = (y1 , · · · , yr+s ) P has yi < 0 if i 6= k (and hence yk > 0, since yi = 0). We need a final trick to show the following: Claim. The units u1 , · · · , ur+s−1 are linearly independent in Rr+s−1 . Hence × the rank of `(OL ) = r + s − 1, and Dirichlet’s theorem is proved. We let A be the (r + s) × (r + s) matrix whose jth row is `(uj ), and apply the following lemma: Claim. P Let A ∈ Matm (R) be such that aii > 0 for all i and aij < 0 for all i 6= j, and j aij ≥ 0 for each i. Then rank(A) ≥ m − 1.
53
7
Dirichlet’s unit theorem
II Number Fields
To show this, we let vi be the ith column of A. We show that v1 , · · · , vm−1 are linearly independent. If not, there exists a sequence ti ∈ R such that m−1 X
ti vi = 0,
(∗)
i=1
with not all of the ti non-zero. We choose k so that |tk | is maximal among the t1 , · · · , tm−1 ’s. We divide the whole equation by tk . So we can wlog assume tk = 1, ti ≤ 1 for all i. Now consider the kth row of (∗). We get 0=
m−1 X
ti aki ≥
i=1
m−1 X
aki ,
i=1
as a < 0 and t ≤ 1 implies at ≥ a. Moreover, we know ami > 0 strictly. So we get m X aki ≥ 0. 0> i=1
This is a contradiction. So done. You should not expect this to be examinable. We make a quick definition that we will need later. Definition (Regulator). The regulator of a number field L is × RL = covol(`(OL ) ⊆ Rr+s−1 ). × More concretely, we pick fundamental units ε1 , · · · , εr+s−1 ∈ OL so that n
× r+s−1 OL = µL × {εn1 1 · · · εr+s−1 : ni ∈ Z}.
We take any (r + s − 1)(r + s − 1) subminor of the matrix `(ε1 ) · · · Their determinants all have the same absolute value, and
`(εr+s ) .
| det(subminor)| = RL . This is a definition we will need later. √ We quickly look at some examples with quadratic fields. Consider L = Q( d), where d 6= 0, 1 square-free. × Example. If d < 0, then r = 0 and s = 1. So r + s − 1 = 0. So OL = µL is a finite group. So RL = 1.
Lemma. (i) If d = −1, then Z[i]× = {±1, ±i} = Z/4Z. √ (ii) If d = −3, then let ω = 12 (1 + d), and we have ω 6 = 1. So Z[ω]× = {1, ω, · · · , ω 5 } ∼ = Z/6Z. × (iii) For any other d < 0, we have OL = {±1}.
54
7
Dirichlet’s unit theorem
II Number Fields
Proof. This is just a direct check. If d ≡ 2, 3 (mod 4), then by looking at the solution of x2 − dy 2 = ±1 in the integers, we get (i) and (iii). 2 If d ≡ 1 (mod 4), then by looking at the solutions to x + y2 − d4 y 2 = ±1 in the integers, we get (ii) and (iii). Now if d > 0, then RL = | log |ε||, where ε is a fundamental unit. So how do we find a fundamental unit? In general, there is no good algorithm for finding the fundamental unit of a fundamental field. The best algorithm takes exponential time. We do have a good algorithm for quadratic fields using continued fractions, but we are not allowed to use that. Instead, we could just guess a solution — we find a unit by guessing, and then show there is no smaller one by direct check. √ √ Example. Consider the field Q( 2). We can try ε = 1 + 2. We have N (ε) = 1 − 2 = −1. So this √ is a unit. We claim this is fundamental. If not, there there exists u = a + b 2, where a, b ∈ Z and 1 < u < ε (as real numbers). Then we have √ u ¯=a−b 2 has u¯ u = ±1. Since u > 1, we know |¯ u| < 1. Then we must have u ± u ¯ > 0. So we need a, b > 0. We know can only be finitely many possibilities for √ √ 1 < a + b 2 < 1 + 2, where a, b are positive integers. But there actually are none. So done. √ √ Example. Consider Q( 11). We guess ε = 10 − 3 11 is a unit. We can compute N (ε) = 100 − 99 = 1. Note that ε < 1 and ε−1 > 1. Suppose this is not fundamental. Then we have some u such that √ √ 1 < u = a + b 11 < 10 + 3 11 = ε−1 < 20. (∗) We can check all the cases, but there is a faster way. We must have N (u) = ±1. If N (u) = −1, then a2 − 11b2 = −1. But −1 is not a square mod 11. So there we must have N (u) = 1. Then u−1 = u ¯. We get 0 < ε < u−1 = u ¯0
This is since every n = pe11 · · · perr factors uniquely as a product of primes, and each such product appears exactly once in this. If there were finitely many −1 P 1 1 primes, as , the sum pn converges to 1 − p Y X1 1 = 1− n p
n≥1
p prime
must be finite. But the harmonic series diverges. This is a contradiction. We all knew that. What we now want to prove is something more interesting. Theorem (Dirichlet’s theorem). Let a, q ∈ Z be coprime. Then there exists infinitely many primes in the sequence a, a + q, a + 2q, · · · , i.e. there are infinitely many primes in any such arithmetic progression. We want to imitate the Euler proof, but then that would amount to showing that −1 Y 1 1− p p≡a mod q p prime
is divergent, and there is no nice expression for this. So it will be a lot more work. To begin with, we define the Riemann zeta function. Definition (Riemann zeta function). The Riemann zeta function is defined as X ζ(s) = n−s n≥1
for s ∈ C. There are some properties we will show (or assert): Proposition. (i) The Riemann zeta function ζ(s) converges for Re(s) > 1.
56
8
L-functions, Dirichlet series*
II Number Fields
(ii) The function 1 s−1 extends to a holomorphic function when Re(s) > 0. ζ(s) −
In other words, ζ(s) extends to a meromorphic function on Re(s) > 0 with a simple pole at 1 with residue 1. (iii) We have the expression Y
ζ(s) =
1−
p prime
1 ps
−1
for Re(s) > 1, and the product is absolutely convergent. This is the Euler product. The first part follows from the following general fact about Dirichlet series. P Definition (Dirichlet series). A Dirichlet series is a series of the form an n−s , where a1 , a2 , · · · ∈ C. Lemma. If there is a real number r ∈ R such that a1 + · · · + aN = O(N r ), then X
an n−s
converges for Re(s) > r, and is a holomorphic function there. Then (i) is immediate by picking r = 1, since in the Riemann zeta function, a1 = a2 = · · · = 1. Recall that xs = es log x has |xs | = |xRe(s) | if x ∈ R, x > 0. Proof. This is just IA Analysis. Suppose Re(s) > r. Then we can write N X
an n−s = a1 (1−s − 2−s ) + (a1 + a2 )(2−s − 3−s ) + · · ·
n=1
+ (a1 + · · · + aN −1 )((N − 1)−s − N −s ) + RN , where
a1 + · · · + aN . Ns This is getting annoying, so let’s write RN =
T (N ) = a1 + · · · + aN . We know
T (N ) T (N ) 1 = N s N r N Re(s)−r → 0 57
8
L-functions, Dirichlet series*
II Number Fields
as N → ∞, by assumption. Thus we have X X an n−s = T (n)(n−s − (n + 1)−s ) n≥1
n≥1
if Re(s) > r. But again by assumption, T (n) ≤ B · nr for some constant B and all n. So it is enough to show that X nr (n−s − (n + 1)−s ) n
converges. But n
−s
− (n + 1)
−s
n+1
Z =
n
s dx, xs+1
and if x ∈ [n, n + 1], then nr ≤ xr . So we have Z n+1 Z n+1 dx r −s −s r s n (n − (n + 1) ) ≤ . x s+1 dx = s s+1−r x x n n It thus suffices to show that
Z 1
converges, which it does (to
n
dx xs+1−r
s s−r ).
We omit the proof of (ii). The idea is to write ∞ Z n+1 X dx 1 = , s − 1 n=1 n xs P and show that φn is uniformly convergent when Re(s) > 0, where Z n+1 dx φn = n−s − . xs n For (iii), consider the first r primes p1 , · · · , pr , and r Y
−1 (1 − p−s = i )
X
n−s ,
i=1
where the sum is over the positive integers n whose prime divisors are among p1 , · · · , pr . Notice that 1, · · · , r are certainly in the set. So r X Y X −s −1 (1 − pi ) ≤ |n−s | = n− Re(s) . ζ(s) − i=1 n≥r n≥r P − Re(s) But n≥r n → 0 as r → ∞, proving the result, if we also show that it converges absolutely. We omit this proof, but it follows from the fact that X X p−s ≤ n−s . n
p prime
Q and the latter P converges absolutely, plus the fact that (1 − an ) converges if and only if an converges, by IA Analysis I. This is good, but not what we want. Let’s mimic this definition for an arbitrary number field! 58
8
L-functions, Dirichlet series*
II Number Fields
Definition (Zeta function). Let L ⊇ Q be a number field, and [L : Q] = n. We define the zeta function of L by X ζL (s) = N (a)−s . aCOL
It is clear that if L = Q and OL = Z, then this is just the Riemann zeta function. Theorem. (i) ζL (s) converges to a holomorphic function if Re(s) > 1. (ii) Analytic class number formula: ζL (s) is a meromorphic function if Re(s) > 1 − n1 and has a simple pole at s = 1 with residue | clL |2r (2π)s RL , |DL |1/2 |µL | where clL is the class group, r and s are the number of real and complex embeddings, you know what π is, RL is the regulator, DL is the discriminant and µL is the roots of unity in L. (iii) ζL (s) =
Y
(1 − N (p)−s )−1 .
pCOL prime ideal
This is again known as the Euler product. We will not prove this, but the proof does not actually require any new ideas. Note that X Y N (a)−s = (1 − N (p)−s )−1 aCOL
pCOL ,p prime
holds “formally”, as in the terms match up when you expand, as an immediate consequence of the unique factorization P of ideals into a product of prime ideals. The issue is to study convergence of N (a)−s , and this comes down to estimating the number of ideals of fixed norm geometrically, and that is where all the factors in the pole come it. √ Example. We try to compute ζL (s), where L = Q( d). This has discriminant D, which may be d or 4d. We first look at the prime ideals. If p is a prime ideal in OL , then p | hpi for a unique p. So let’s enumerate the factors of ηL controlled by p ∈ Z. Now if p | |DL |, then hpi = p2 ramifies, and N (p) = p. So this contributes a factor of (1 − p−s )−1 . Now if p remains prime, then we have N (hpi) = p2 . So we get a factor of (1 − p−2s )−1 = (1 − p−s )−1 (1 + p−s )−1 . If p splits completely, then hpi = p1 p2 . So N (pi ) = p, 59
8
L-functions, Dirichlet series*
II Number Fields
and so we get a factor of (1 − p−s )−1 (1 − p−s )−1 . So we find that ζL (s) = ζ(s)L(χD , s), where we define Definition (L-function). We define the L-function by Y (1 − χ(p)p−s )−1 . L(χ, s) = p prime
In our case, χ is 0 χD (p) = −1 1
given by ( p|D D p p remains prime = depends on d mod 8 p splits
p is odd
.
p=2
√ Example. If L = Q( −1), then we know p−1 −1 −4 = = (−1) 2 if p 6= 2, p p and χD (2) = 0 as 2 ramifies. We then have L(χD , s) =
Y
(1 − (−1)
p−1 2
p−s )−1 = 1 −
p>2 prime
1 1 1 + s − s + ··· . 3s 5 7
Note that χD was defined for primes only, but we can extend it to a function χD : Z → C by imposing χD (nm) = χD (n)χD (m), i.e. we define χD (pe11 · · · perr ) = χD (p1 )e1 · · · χD (pr )er . √ Example. Let L = Q( −1). Then ( m−1 (−1) 2 m odd χ−4 (m) = 0 m even. It is an exercise to show that this is really the extension, i.e. χ−4 (mn) = χ−4 (m)χ−4 (n). Notice that this has the property that χ−4 (m − 4) = χ−4 (m). We give these some special names
60
8
L-functions, Dirichlet series*
II Number Fields
Definition (Dirichlet character). A function χ : Z → C is a Dirichlet character of modulus D if there exists a group homomorphism w: such that
( χ(m) =
Z DZ
×
→ C×
w(m mod D) 0
gcd(m, D) = 1 . otherwise
We say χ is non-trivial if ω is non-trivial. Example. χ−4 is a Dirichlet character of modulus 4. Note that χ(mn) = χ(m)χ(n) for such Dirichlet characters, and so L(χ, s) =
Y
(1 − χ(p)p−s )−1 =
p prime
X χ(n) ns
n≥1
for such χ. √ Proposition. χD , as defined for L = Q( d) is a Dirichlet character of modulus D. Note that this is a very special Dirichlet character, as it only takes values 0, ±1. We call this a quadratic Dirichlet character. Proof. We must show that χD (p + Da) = χD (p) for all p, a. (i) If d ≡ 3 (mod 4), then D = 4d. Then χD (2) = 0, as (2) ramifies. So χD (even) = 0. For p > 2, we have p−1 D d p χD (p) = = = (−1) 2 p p d as
d−1 2
≡ 1 (mod 2), by quadratic reciprocity. So p−1 p + Da χD (p + Da) = (−1) 2 (−1)4da/2 = χD (p). d
(ii) If d ≡ 1, 2 (mod 4), see example sheet. Lemma. Let χ be any non-trivial Dirichlet character. Then L(χ, s) is holomorphic for Re(s) > 0. 61
8
L-functions, Dirichlet series*
II Number Fields
Proof. By our lemma on convergence of Dirichlet series, we have to show that N X
χ(i) = O(1),
i=1
i.e. it is bounded. Recall from Representation Theory that distinct irreducible characters of a finite group G are orthogonal, i.e. ( 1 χ1 = χ2 1 X . χ1 (g)χ2 (g) = |G| 0 otherwise g∈G We apply this to G = (Z/DZ)× , where χ1 is trivial and χ2 = χ. So orthogonality gives X X χ(i) = χ(i) = 0, i∈(Z/DZ)×
aD 1, then this is “non-abelian class field theory”, known as Langlands programme.
67
Index
II Number Fields
Index DL , 15 IL , 24 L-function, 60 clL , 26 ζ, 56 ζL , 59 pα , 8 r, 13 s, 13
fractional, 22 honest, 22 integral, 22 invertible, 22 multiplication, 17 norm, 27 prime, 18 sum, 24 unique factorization, 24 ideal class group, 26 inert prime, 32 integral, 5 integral basis, 14 integral ideal, 22 invertible fractional ideal, 22
addition of ideals, 24 algebraic integer, 4 analytic class number formula, 59 Artin L-function, 67 associates, 18 class field theory, 67 class group, 26 finiteness, 47 conjugate, 12 covolume, 43
Kronecker–Weber theorem, 67 Langlands programme, 67 lattice, 43 minimal polynomial, 8 Minkowski bound, 40, 46 Minkowski’s lemma, 38 Minkowski’s theorem, 44 multiplication of ideals, 17
Dedekind domain, 18 Dedekind’s criterion, 33 degree, 4 Dirichlet character, 61 Dirichlet series, 57 Dirichlet unit theorem, 48 Dirichlet’s theorem, 56 Dirichlet’s theorem on primes in AP, 65 Dirichlet’s unit theorem, 50, 51 discrete subset, 42 discriminant, 15 divides, 17
norm, 10 of ideal, 27 number field, 4 Pell’s equation, 48 prime ideal, 18 primitive element theorem, 12 quadratic Dirichlet character, 61
Euler factor, 67
ramification index, 32 ramified prime, 32 regulator, 54 Riemann zeta function, 56
field extension, 4 finite extension, 4 finitely-generated, 5 finiteness of class group, 47 fractional ideal, 22 invertible, 22 fundamental domain, 43
splitting prime, 32 sum of ideals, 24 trace, 10
honest ideal, 22
unique factorization of ideals, 24
ideal
zeta function, 59 68