English Pages 316 Year 2015
637
Algorithmic Arithmetic, Geometry, and Coding Theory 14th International Conference Arithmetic, Geometry, Cryptography and Coding Theory June 3–7, 2013 CIRM, Marseille, France
Stéphane Ballet Marc Perret Alexey Zaytsev Editors
American Mathematical Society
American Mathematical Society Providence, Rhode Island
Editorial Board of Contemporary Mathematics Dennis DeTurck, managing editor Michael Loss
Kailash Misra
Martin J. Strauss
2010 Mathematics Subject Classification. Primary 11G10, 11G20, 11G25, 11H71, 11Y16, 14G05, 14G15, 14Q05, 14Q15, 94B27.
Library of Congress Cataloging-in-Publication Data International Conference Arithmetic, Geometry, Cryptography and Coding Theory (14th : 2013 : Marseille, France) Algorithmic arithmetic, geometry, and coding theory : 14th International Conference on Arithmetic, Geometry, Cryptography, and Coding Theory, June 3-7 2013, CIRM Marseille, France / Stéphane Ballet, Marc Perret, Alexey Zaytsev, editors. pages cm. – (Contemporary mathematics ; volume 637) Includes bibliographical references. ISBN 978-1-4704-1461-0 (alk. paper) 1. Coding theory—Congresses. 2. Geometry, Algebraic—Congresses. 3. Cryptography—Congresses. 4. Number theory—Congresses. I. Ballet, Stéphane, 1971– editor. II. Perret, M. (Marc), 1963– editor. III. Zaytsev, Alexey (Alexey I.), 1976– editor. IV. Title. QA268.I57 2013 510–dc23
2014037646
Contemporary Mathematics ISSN: 0271-4132 (print); ISSN: 1098-3627 (online) DOI: http://dx.doi.org/10.1090/conm/637
Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Permissions to reuse portions of AMS publication content are handled by Copyright Clearance Center’s RightsLink service. For more information, please visit: http://www.ams.org/rightslink. Send requests for translation rights and licensed reprints to [email protected]. Excluded from these provisions is material for which the author holds copyright. In such cases, requests for permission to reuse or reprint material should be addressed directly to the author(s). Copyright ownership is indicated on the copyright page, or on the lower right-hand corner of the first page of each article within proceedings volumes.
© 2015 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America.
∞ The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability. Visit the AMS home page at http://www.ams.org/
10 9 8 7 6 5 4 3 2 1 20 19 18 17 16 15
Contents

Preface

Geometric error correcting codes

On products and powers of linear codes under componentwise multiplication
Hugues Randriambololona

Higher weights of affine Grassmann codes and their duals
Mrinmoy Datta and Sudhir R. Ghorpade

Algorithmic: special varieties

The geometry of efficient arithmetic on elliptic curves
David Kohel

2–2–2 isogenies between Jacobians of hyperelliptic curves
Ivan Boyer

Easy scalar decompositions for efficient scalar multiplication on elliptic curves and genus 2 Jacobians
Benjamin Smith

Algorithmic: point counting

A point counting algorithm for cyclic covers of the projective line
Cécile Gonçalves

Point counting on non-hyperelliptic genus 3 curves with automorphism group Z/2Z using Monsky-Washnitzer cohomology
Yih-Dar Shieh

Wiman’s and Edge’s sextic attaining Serre’s bound II
Motoko Qiu Kawakita

Algorithmic: general

Genetics of polynomials over local fields
Jordi Guàrdia and Enric Nart

Explicit algebraic geometry

Explicit equations of optimal curves of genus 3 over certain finite fields with three parameters
Ekaterina Alekseenko and Alexey Zaytsev

Smooth embeddings for the Suzuki and Ree curves
Abdulla Eid and Iwan Duursma

Arithmetic geometry

Uniform distribution of zeroes of L-functions of modular forms
Alexey Zykin

A survey on class field theory for varieties
Alexander Schmidt
Preface

The 14th AGCT conference (Arithmetic, Geometry, Cryptography, and Coding Theory) took place at CIRM (Centre International de Rencontres Mathématiques) in Marseille, France, on June 3–7, 2013. This international conference has been a major event in the area of arithmetic geometry and its applications for more than 25 years; 77 participants attended this year. We thank all of them for creating a stimulating research environment. The topics of the talks extended from algebraic number theory to diophantine geometry, curves and abelian varieties over finite fields from the theoretical or the algorithmic point of view, and applications to error-correcting codes.

We especially thank the speakers Ekaterina Alekseenko, Nurdagul Ambar, Alp Bassa, Peter Beelen, Jean-Robert Belliard, Ivan Boyer, Niels Bruin, Florian Caullery, Claus Diem, Virgile Ducet, Iwan Duursma, Sudhir Ghorpade, Cécile Gonçalves, Emmanuel Hallouin, Safia Haloui, Johan Peter Hansen, Masaaki Homma, Grigory Kabatiansky, Motoko Kawakita, David Kohel, Dmitry Kubrak, Gilles Lachaud, Kristin Lauter, Winnie Li, Enric Nart, Ferruh Ozbudak, Laurent Poinsot, Hugues Randriambololona, Christophe Ritzenthaler, Damien Robert, Karl Rökaeus, Robert Rolland, Sergey Rybakov, Alexander Schmidt, Jeroen Sijsling, Benjamin Smith, Patrick Solé, and Milakulo Tukumuli for their lectures.

The editors would like to thank the anonymous referees and the staff of CIRM (Olivia Barbarroux, Muriel Milton and Laure Stefanini) for their remarkable professionalism.
Geometric error correcting codes
Contemporary Mathematics Volume 637, 2015 http://dx.doi.org/10.1090/conm/637/12749
On products and powers of linear codes under componentwise multiplication

Hugues Randriambololona

Abstract. In this text we develop the formalism for products and powers of linear codes under componentwise multiplication. As an expanded version of the author’s talk at AGCT-14, focus is put mostly on basic properties and descriptive statements that could otherwise probably not fit in a regular research paper. On the other hand, more advanced results and applications are only quickly mentioned with references to the literature. We also point out a few open problems. Our presentation alternates between two points of view, which the theory intertwines in an essential way: that of combinatorial coding, and that of algebraic geometry. In appendices that can be read independently, we investigate topics in multilinear algebra over finite fields; notably, we establish a criterion for a symmetric multilinear map to admit a symmetric algorithm, or equivalently, for a symmetric tensor to decompose as a sum of elementary symmetric tensors.
Contents

1. Introduction
   Basic definitions
   Link with tensor constructions
   Rank functions
   Geometric aspects
2. Basic structural results and miscellaneous properties
   Support
   Decomposable codes
   Repeated columns
   Extension of scalars
   Monotonicity
   Stable structure
   Adjunction properties
   Symmetries and automorphisms
3. Estimates involving the dual distance
4. Pure bounds
   The generalized fundamental functions
   An upper bound: Singleton
   Lower bounds for q large: AG codes
   Lower bounds for q small: concatenation
5. Some applications
   Multilinear algorithms
   Construction of lattices from codes
   Oblivious transfer
   Decoding algorithms
   Analysis of McEliece-type cryptosystems
Appendix A: A criterion for symmetric tensor decomposition
   Frobenius symmetric maps
   Trisymmetric and normalized multiplication algorithms
Appendix B: On symmetric multilinearized polynomials
   Polynomial description of symmetric powers of an extension field
   Equidistributed beads on a necklace
Appendix C: Review of open questions
References
Notations and conventions. In the first three sections of this text we will be working over an arbitrary field $\mathbb{F}$, although we will keep in mind the case where $\mathbb{F} = \mathbb{F}_q$ is the finite field with $q$ elements.

If $V$ is a vector space over $\mathbb{F}$, we denote by $V^\vee$ its dual, that is, the vector space of all linear forms $V \to \mathbb{F}$. If $X \subseteq V$ is an arbitrary subset, we denote by $\langle X \rangle$ its linear span in $V$.

We let $S^{\cdot} V = \bigoplus_{t \ge 0} S^t V$ be the symmetric algebra of $V$, which is the largest commutative graded quotient algebra of the tensor algebra of $V$. In particular, the $t$-th symmetric power $S^t V$ is a quotient of $V^{\otimes t}$, and should not be confused with $\mathrm{Sym}^t V$, the space of symmetric tensors of order $t$, which is a subspace of $V^{\otimes t}$. If $W$ is another vector space, we also let $\mathrm{Sym}^t(V; W)$ be the space of symmetric $t$-multilinear maps from $V^t$ to $W$. All these objects are related by natural identifications such as $\mathrm{Sym}^t(V^\vee) = \mathrm{Sym}^t(V; \mathbb{F}) = (S^t V)^\vee$. Here it was always understood that we were working over $\mathbb{F}$, but in case of ambiguity we will use more precise notations such as $S^{\cdot}_{\mathbb{F}} V$, $\mathrm{Sym}^t_{\mathbb{F}}(V; W)$, etc.

By $[n]$ we denote the standard set with $n$ elements, the precise definition of which will depend on the context: when doing combinatorics, it will be $[n] = \{1, 2, \dots, n\}$, and when doing algebraic geometry over $\mathbb{F}$, it will be $[n] = \operatorname{Spec} \mathbb{F}^n$. We also let $\mathfrak{S}_n$ be the symmetric group on $n$ elements, which acts naturally on $[n]$.

By a linear code of length $n$ over $\mathbb{F}$ we mean a linear subspace $C \subseteq \mathbb{F}^n$; moreover if $\dim(C) = k$ we say $C$ is an $[n, k]$ code. Given a word $x \in \mathbb{F}^n$, its support $\mathrm{Supp}(x) \subseteq [n]$ is the set of indices over which $x$ is nonzero, and its Hamming weight is $w(x) = |\mathrm{Supp}(x)|$. If $S \subseteq [n]$ is a subset we let $1_S \in \mathbb{F}^n$ be the characteristic vector of $S$, that is, the vector with coordinates 1 over $S$ and 0 over $[n] \setminus S$. We also let $\pi_S : \mathbb{F}^n \twoheadrightarrow \mathbb{F}^S$ be the natural linear projection and $\iota_S : \mathbb{F}^S \hookrightarrow \mathbb{F}^n$ be the natural linear inclusion of vector spaces.

The dual code $C^\perp \subseteq \mathbb{F}^n$ is defined as the orthogonal of $C$ with respect to the standard scalar product in $\mathbb{F}^n$. One should be careful not to confuse this notion with that of the dual vector space $C^\vee$.
1. Introduction

Basic definitions.

1.1. Given a field $\mathbb{F}$ and an integer $n \ge 1$, we let $*$ denote componentwise multiplication in $\mathbb{F}^n$, so
$$(x_1, \dots, x_n) * (y_1, \dots, y_n) = (x_1 y_1, \dots, x_n y_n)$$
for $x_i, y_j \in \mathbb{F}$. This makes $\mathbb{F}^n$ a commutative $\mathbb{F}$-algebra, with unit element the all-1 vector $1_{[n]} = (1, \dots, 1)$. Its group of invertible elements is $(\mathbb{F}^n)^\times = (\mathbb{F}^\times)^n$. This algebra $\mathbb{F}^n$ can also be identified with the algebra of diagonal matrices of size $n$ over $\mathbb{F}$ (see 2.37 and 2.48 for more).

1.2. If $S, S' \subseteq \mathbb{F}^n$ are two subsets, that is, two (nonlinear) codes of the same length $n$, we let
$$S \mathbin{\dot{*}} S' = \{c * c' \,;\, (c, c') \in S \times S'\} \subseteq \mathbb{F}^n$$
be the set of componentwise products of their elements. This operation $\dot{*}$ is easily seen to be commutative and associative, and distributive with respect to the union of subsets: $S \mathbin{\dot{*}} (S' \cup S'') = (S \mathbin{\dot{*}} S') \cup (S \mathbin{\dot{*}} S'')$ for all $S, S', S'' \subseteq \mathbb{F}^n$. This means that the set of (nonlinear) codes over $\mathbb{F}$ of some given length $n$ becomes a commutative semiring under these laws $\cup, \dot{*}$, with zero element the empty code $\emptyset$ and with unit element the singleton $\{1_{[n]}\}$. This semiring is in fact an ordered semiring (under the inclusion relation $\subseteq$), since these laws are obviously compatible with $\subseteq$.

1.3. If moreover $C, C' \subseteq \mathbb{F}^n$ are two linear subspaces, that is, two linear codes of the same length $n$, we let
$$C * C' = \langle C \mathbin{\dot{*}} C' \rangle \subseteq \mathbb{F}^n$$
be the linear span of $C \mathbin{\dot{*}} C'$. (The reader should be careful of the shift in notations from [43].) For $x \in \mathbb{F}^n$, we also write $x * C = \langle x \rangle * C = \{x * c \,;\, c \in C\}$. Some authors also call our $C * C'$ the Schur product of $C$ and $C'$.

It is easily seen that this operation $*$, defined on pairs of linear codes of a given length $n$, is commutative and associative. Given linear codes $C_1, \dots, C_t \subseteq \mathbb{F}^n$, their product $C_1 * \cdots * C_t \subseteq \mathbb{F}^n$ is then the linear code spanned by componentwise products $c_1 * \cdots * c_t$ for $c_i \in C_i$.

Also, given three linear codes $C, C', C'' \subseteq \mathbb{F}^n$, we have the distributivity relation
$$C * (C' + C'') = C * C' + C * C'' \subseteq \mathbb{F}^n$$
where $+$ is the usual sum of subspaces in $\mathbb{F}^n$. This means that the set of linear codes over $\mathbb{F}$ of some given length $n$ becomes a commutative semiring under these operations $+, *$. The zero element of this semiring is the zero subspace, and its unit element is the one-dimensional repetition code $\mathbf{1}$ (generated by the all-1 vector $1_{[n]}$). And as above, this semiring is easily seen to be an ordered semiring (under inclusion).

The $t$-th power of an element $C$ in this semiring will be denoted $C^{\langle t \rangle}$. For instance, $C^{\langle 0 \rangle} = \mathbf{1}$, $C^{\langle 1 \rangle} = C$, $C^{\langle 2 \rangle} = C * C$, and we have the usual relations $C^{\langle t \rangle} * C^{\langle t' \rangle} = C^{\langle t + t' \rangle}$, $(C^{\langle t \rangle})^{\langle t' \rangle} = C^{\langle t t' \rangle}$.
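The product of 1.3 can be computed directly from generator matrices, since $C * C'$ is spanned by the products of generator rows. The following is a minimal sketch under the assumption that $\mathbb{F} = \mathbb{F}_p$ with $p$ prime and that words are lists of integers; the function names `star`, `row_reduce` and `code_product` are illustrative and do not come from the text.

```python
from itertools import product

def star(x, y, p):
    """Componentwise product of two words of the same length over F_p."""
    return [(a * b) % p for a, b in zip(x, y)]

def row_reduce(rows, p):
    """Reduced row echelon basis of the span of `rows` over F_p (p prime)."""
    basis = [[v % p for v in r] for r in rows]
    n = len(basis[0]) if basis else 0
    pivot_row = 0
    for col in range(n):
        # Look for a row with a nonzero entry in the current column.
        pr = next((i for i in range(pivot_row, len(basis)) if basis[i][col]), None)
        if pr is None:
            continue
        basis[pivot_row], basis[pr] = basis[pr], basis[pivot_row]
        inv = pow(basis[pivot_row][col], -1, p)  # modular inverse (Python 3.8+)
        basis[pivot_row] = [(v * inv) % p for v in basis[pivot_row]]
        for i in range(len(basis)):
            if i != pivot_row and basis[i][col]:
                f = basis[i][col]
                basis[i] = [(a - f * b) % p for a, b in zip(basis[i], basis[pivot_row])]
        pivot_row += 1
    return basis[:pivot_row]

def code_product(G1, G2, p):
    """Basis of C1 * C2, the span of all products of generator rows."""
    return row_reduce([star(g1, g2, p) for g1, g2 in product(G1, G2)], p)

# Illustration over F_5: C is spanned by the evaluations of 1 and x at 0, 1, 2.
G = [[1, 1, 1], [0, 1, 2]]
square = code_product(G, G, 5)          # C * C
unit = code_product([[1, 1, 1]], G, 5)  # repetition code times C
```

In this illustration the square of $C$ is the whole space, while multiplying by the repetition code returns $C$ itself, in line with its role as unit element of the semiring.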
1.4. Definition. Let $C \subseteq \mathbb{F}^n$ be a linear code. The sequence of integers
$$\dim(C^{\langle i \rangle}), \quad i \ge 0$$
is called the dimension sequence, or the Hilbert sequence, of $C$. The sequence of integers
$$d_{\min}(C^{\langle i \rangle}), \quad i \ge 0$$
is called the distance sequence of $C$. Occasionally we will also consider the dual distance sequence, $d^\perp(C^{\langle i \rangle}) = d_{\min}((C^{\langle i \rangle})^\perp)$, $i \ge 0$.

Probably the more self-describing term “dimension sequence” should be preferred over “Hilbert sequence,” although two very good reasons for using the latter would be: (i) emphasis on the geometric analogy that will be given in Proposition 1.28; (ii) pedantry.

One can show ([43, Prop. 11], or Theorem 2.32 below) that the dimension sequence of a nonzero linear code is non-decreasing; since it is bounded by $n$, it becomes ultimately constant. This allows the following:

1.5. Definition. The (Castelnuovo-Mumford) regularity of a nonzero linear code $C \subseteq \mathbb{F}^n$ is the smallest integer $r = r(C) \ge 0$ such that
$$\dim(C^{\langle r \rangle}) = \dim(C^{\langle r + i \rangle}), \quad \forall i \ge 0.$$

1.6. Example. For $2 \le k \le n \le |\mathbb{F}|$, the $[n, k]$ Reed-Solomon code $C$ obtained by evaluating polynomials of degree up to $k - 1$ at $n$ distinct elements of $\mathbb{F}$ has dimension sequence
$$1,\; k,\; 2k - 1,\; \dots,\; \left\lfloor \tfrac{n-1}{k-1} \right\rfloor (k - 1) + 1,\; n,\; n,\; n,\; \dots$$
and distance sequence
$$n,\; n - k + 1,\; n - 2k + 2,\; \dots,\; n - \left\lfloor \tfrac{n-1}{k-1} \right\rfloor (k - 1),\; 1,\; 1,\; 1,\; \dots$$
and regularity
$$r(C) = \left\lceil \tfrac{n-1}{k-1} \right\rceil.$$
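Example 1.6 can be checked numerically on a small case. The sketch below assumes $\mathbb{F} = \mathbb{F}_p$ with $p$ prime and computes $\dim C^{\langle t \rangle}$ as the rank of the set of $t$-fold products of generator rows; the helper names `rank_mod_p`, `rs_generator` and `dimension_sequence` are hypothetical, and the spanning set grows like $k^t$, so this is only meant for toy parameters.

```python
def rank_mod_p(rows, p):
    """Rank over F_p (p prime) of a nonempty list of vectors."""
    rows = [[v % p for v in r] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pr = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pr is None:
            continue
        rows[rank], rows[pr] = rows[pr], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def rs_generator(points, k, p):
    """Rows 1, x, ..., x^(k-1) evaluated at distinct points of F_p."""
    return [[pow(a, j, p) for a in points] for j in range(k)]

def dimension_sequence(G, p, length):
    """First `length` terms dim C^<0>, dim C^<1>, ... of the dimension sequence."""
    n = len(G[0])
    power = [[1] * n]  # C^<0> is the repetition code
    dims = []
    for _ in range(length):
        dims.append(rank_mod_p(power, p))
        # C^<t+1> is spanned by products of spanning vectors of C^<t> with rows of G.
        power = [[(a * b) % p for a, b in zip(u, g)] for u in power for g in G]
    return dims

n, k, p = 7, 3, 7
dims = dimension_sequence(rs_generator(range(n), k, p), p, 6)
expected = [min(t * (k - 1) + 1, n) for t in range(6)]
regularity = -(-(n - 1) // (k - 1))  # ceiling of (n-1)/(k-1)
```

For these parameters the computed sequence agrees with the closed form $\min(t(k-1)+1, n)$, and it becomes constant from index $\lceil (n-1)/(k-1) \rceil = 3$ on.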
1.7. Example. More generally, one class of linear codes that behave particularly well with respect to the operation $*$ is that of evaluation codes. For example if $\mathrm{RM}_{\mathbb{F}}(r, m)$ is the generalized Reed-Muller code, obtained by evaluating polynomials in $m \ge 2$ variables, of total degree up to $r \ge 0$, at all points of $\mathbb{F}^m$, then we have
$$\mathrm{RM}_{\mathbb{F}}(r, m) * \mathrm{RM}_{\mathbb{F}}(r', m) = \mathrm{RM}_{\mathbb{F}}(r + r', m).$$
Likewise for algebraic-geometry codes we have
$$C(D, G) * C(D', G) \subseteq C(D + D', G)$$
where $C(D, G)$ is the code obtained by evaluating functions from the Riemann-Roch space $L(D)$, associated with a divisor $D$, at a set $G$ of $\mathbb{F}$-points out of the support of $D$, on an algebraic curve over $\mathbb{F}$. Note that here one can construct examples in which the inclusion is strict (even when $D, D'$ are taken effective): indeed $L(D + D')$ contains $L(D) L(D')$ but need not be spanned by it; see e.g. [37], §§1-2, for a study of questions of this sort.

1.8. There is a natural semiring morphism, from the $(\cup, \dot{*})$-semiring of (nonlinear) codes of length $n$ over $\mathbb{F}$, to the $(+, *)$-semiring of linear codes of length $n$ over $\mathbb{F}$, mapping a subset $S \subseteq \mathbb{F}^n$ to its linear span $\langle S \rangle$ (moreover this map respects the ordered semiring structures defined by inclusion).

Last, there is another semiring structure, defined by the operations $\oplus, \otimes$ on the set of all linear codes over $\mathbb{F}$ (not of fixed length; see e.g. [54]). One should not confuse these constructions, although we will see some more relations between them in 1.10 and 2.8 below.

1.9. Also there are links between products of codes and the theory of intersecting codes [12][45], but one should be careful to avoid certain misconceptions. Given two linear codes $C_1, C_2 \subseteq \mathbb{F}^n$, we define their intersection number
$$i(C_1, C_2) = \min_{\substack{c_1 \in C_1,\ c_2 \in C_2 \\ c_1, c_2 \ne 0}} w(c_1 * c_2).$$
We say the pair $(C_1, C_2)$ is intersecting if $i(C_1, C_2) > 0$, that is, if any two nonzero codewords from $C_1$ and $C_2$ have intersecting supports. And given an integer $s > 0$, we say $(C_1, C_2)$ is $s$-intersecting if $i(C_1, C_2) \ge s$. (We also say a linear code $C$ is ($s$-)intersecting if the pair $(C, C)$ is.)

Now the two quantities $d_{\min}(C_1 * C_2)$ and $i(C_1, C_2)$ might seem related, but there are some subtleties:
• Let $s > 0$ and suppose $(C_1, C_2)$ is $s$-intersecting. Then this does not necessarily imply $d_{\min}(C_1 * C_2) \ge s$. More precisely, what we have is that any codeword $z \in C_1 * C_2$ of the specific form $z = c_1 * c_2$ has weight $w(z) \ge s$. But there are codewords in $C_1 * C_2$ that are not of this form (because $C_1 * C_2$ is defined as a linear span), which can make its minimum distance smaller: see Example 1.22.
• On the other hand, $d_{\min}(C_1 * C_2) \ge s$ does not necessarily imply that $(C_1, C_2)$ is $s$-intersecting. In fact it could well be that there are nonzero $c_1, c_2$ such that $c_1 * c_2 = 0$, so $(C_1, C_2)$ is not even intersecting at all! Indeed note that $c_1 * c_2 = 0$ does not contribute to $d_{\min}(C_1 * C_2)$. The possibility of such “unexpected zero codewords” is one difficulty in the estimation of the minimum distance of products of codes; see also the discussion in 1.32.

However it remains that, if $(C_1, C_2)$ is intersecting, then it is at least $d_{\min}(C_1 * C_2)$-intersecting. Or equivalently, if $i(C_1, C_2) > 0$, then $i(C_1, C_2) \ge d_{\min}(C_1 * C_2)$.

Link with tensor constructions. Here we present algebraic constructions related to products and powers of codes. Later (in 1.25-1.31) they will be revisited from a geometric point of view.

1.10. We identify the tensor product $\mathbb{F}^m \otimes \mathbb{F}^n$ with the space $\mathbb{F}^{m \times n}$ of $m \times n$ matrices, by identifying the elementary tensor $(x_1, \dots, x_m) \otimes (y_1, \dots, y_n)$ with the matrix with entries $(x_i y_j)$, and extending by linearity.
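The identification of elementary tensors with matrices can be made concrete in a few lines. This is only a sketch: entries are plain integers (reduce modulo $p$ to work over $\mathbb{F}_p$), and the function names are illustrative rather than taken from the text. For $m = n$, the diagonal of the matrix of $x \otimes y$ has entries $x_i y_i$, so it is the componentwise product $x * y$.

```python
def elementary_tensor(x, y):
    """Matrix with entries x_i * y_j, representing the elementary tensor x (x) y."""
    return [[xi * yj for yj in y] for xi in x]

def add_matrices(A, B):
    """Entrywise sum, used to extend the identification by linearity."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def diagonal(M):
    """Diagonal entries of a square matrix."""
    return [M[i][i] for i in range(len(M))]

x, y = [1, 2, 3], [4, 5, 6]
T = elementary_tensor(x, y)

# A sum of two elementary tensors need not be elementary: this one is the
# 2 x 2 identity matrix, which has matrix rank 2.
I2 = add_matrices(elementary_tensor([1, 0], [1, 0]), elementary_tensor([0, 1], [0, 1]))
```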
Given two linear codes $C, C' \subseteq \mathbb{F}^n$ of the same length $n$, their tensor product $C \otimes C' \subseteq \mathbb{F}^{n \times n}$ is then the linear code spanned by elementary tensor codewords $c \otimes c'$, for $c \in C$, $c' \in C'$. It then follows from the definitions that $C * C'$ is the projection of $C \otimes C'$ on the diagonal. In fact this could be taken as an alternative definition for $C * C'$. So we have an exact sequence
$$0 \longrightarrow I(C, C') \longrightarrow C \otimes C' \xrightarrow{\ \pi_\Delta\ } C * C' \longrightarrow 0$$
where $\pi_\Delta$ is the projection on the diagonal, and $I(C, C')$ is its kernel in $C \otimes C'$. One might view $I(C, C')$ as the space of formal bilinear expressions in codewords of $C, C'$ that evaluate to zero.

Equivalently, if $c_1, \dots, c_k$ are the rows of a generator matrix of $C$, and $c'_1, \dots, c'_{k'}$ are the rows of a generator matrix of $C'$, then the products $c_i * c'_j$ generate $C * C'$, but need not be linearly independent: the space of linear relations between these product generators is precisely $I(C, C')$. In particular we have
$$\dim(C * C') = k k' - \dim(I(C, C')) \le \min(n, k k').$$
It is true, although not entirely obvious from its definition, that $C \otimes C'$ can also be described as the space of $n \times n$ matrices all of whose columns are in $C$ and all of whose rows are in $C'$. Then $I(C, C')$ is the subspace made of such matrices that are zero on the diagonal.

1.11. Likewise, by the universal property of symmetric powers, there is a natural surjective map $S^t C \twoheadrightarrow C^{\langle t \rangle}$, whose kernel $I^t(C)$ can be viewed as the space of formal homogeneous polynomials of degree $t$ in codewords of $C$ that evaluate to zero.

Equivalently, if $c_1, \dots, c_k$ are the rows of a generator matrix of $C$, then the monomials $(c_1)^{i_1} * \cdots * (c_k)^{i_k}$, for $i_1 + \cdots + i_k = t$, generate $C^{\langle t \rangle}$, but need not be linearly independent: the space of linear relations between these monomial generators is precisely $I^t(C)$. In particular we have
$$\dim(C^{\langle t \rangle}) = \binom{k + t - 1}{t} - \dim(I^t(C)) \le \min\left(n, \binom{k + t - 1}{t}\right).$$
For example, let $5 \le n \le |\mathbb{F}|$, and $C$ be the $[n, 3]$ Reed-Solomon code obtained by evaluating polynomials of degree up to 2 in $\mathbb{F}[x]$ at $n$ points in $\mathbb{F}$. Denote by $1, x, x^2$ the canonical basis of $C$ (obtained from the corresponding monomials in $\mathbb{F}[x]$). Then $1 \cdot x^2 - x \cdot x$ is nonzero in $S^2 C$ but it evaluates to zero in $C^{\langle 2 \rangle}$, that is, it defines a nonzero element in $I^2(C)$. We have $\dim(S^2 C) = 6$ and $\dim(C^{\langle 2 \rangle}) = 5$, so in fact $I^2(C)$ has dimension 1 and admits $1 \cdot x^2 - x \cdot x$ as a generating element.

1.12. The direct sum
$$C^{\langle \cdot \rangle} = \bigoplus_{t \ge 0} C^{\langle t \rangle}$$
admits a natural structure of graded $\mathbb{F}$-algebra, which makes it a quotient of the symmetric algebra $S^{\cdot} C$ under the maps described in the previous entry. Equivalently, the direct sum
$$I^{\cdot}(C) = \bigoplus_{t \ge 0} I^t(C)$$
is a homogeneous ideal in the symmetric algebra $S^{\cdot} C$, and we have a natural identification
$$C^{\langle \cdot \rangle} = S^{\cdot} C / I^{\cdot}(C)$$
of graded $\mathbb{F}$-algebras.

Rank functions.

1.13. Let $C$ be a finite dimensional $\mathbb{F}$-vector space, and $S \subseteq C$ a generating set.

Definition. The rank function associated with $S$ is the function $\mathrm{rk}_S : C \to \mathbb{Z}_{\ge 0}$ defined as follows: the rank $\mathrm{rk}_S(c)$ of an element $c \in C$ is the smallest integer $r \ge 0$ such that there is a decomposition
$$c = \lambda_1 s_1 + \cdots + \lambda_r s_r, \quad \lambda_i \in \mathbb{F},\ s_i \in S$$
of $c$ as a linear combination of $r$ generators from $S$. Conversely we say that $\mathrm{rk} : C \to \mathbb{Z}_{\ge 0}$ is a rank function on $C$ if $\mathrm{rk} = \mathrm{rk}_S$ for some generating set $S$.

It is easily seen that a rank function is a norm on $C$ (relative to the trivial absolute value on $\mathbb{F}$); in particular it satisfies $\mathrm{rk}(\lambda x) = \mathrm{rk}(x)$ and $\mathrm{rk}(x + y) \le \mathrm{rk}(x) + \mathrm{rk}(y)$ for all $\lambda \in \mathbb{F}^\times$, $x, y \in C$. In fact, a generating set $S$ defines a surjective linear map $\mathbb{F}^{(S)} \twoheadrightarrow C$, and $\mathrm{rk}_S$ is then the quotient norm on $C$ of the Hamming norm on $\mathbb{F}^{(S)}$.

(In some instances we will also make the following abuse: given $S \subseteq C$ that does not span the whole of $C$, we define $\mathrm{rk}_S$ on $\langle S \rangle$ as above, and then we let $\mathrm{rk}_S(c) = \infty$ for $c \in C \setminus \langle S \rangle$.)

1.14. Example. Suppose given a full rank $(n - k) \times n$ matrix $H$, and let $y \in \mathbb{F}^n$ be an arbitrary word (in row vector convention), and $z = y H^T \in \mathbb{F}^{n-k}$ the corresponding “syndrome”. The set $S$ of columns of $H$, or equivalently, of rows of $H^T$, is a generating set in $\mathbb{F}^{n-k}$, and the rank $\mathrm{rk}_S(z)$ is then equal to the weight of a minimum error vector $e$ such that $y - e$ is in the code defined by the parity-check matrix $H$.

1.15. Given a rank function on $C$, we let
$$C_{(i)} = \{c \in C \,;\, \mathrm{rk}(c) = i\}, \qquad C_{(\le i)} = \{c \in C \,;\, \mathrm{rk}(c) \le i\}$$
for all $i \ge 0$. Obviously $\mathrm{rk}(c) \le k = \dim(C)$ for all $c \in C$, so $0 = C_{(\le 0)} \subseteq C_{(\le 1)} \subseteq \cdots \subseteq C_{(\le k)} = C$.

Two generating sets $S, S' \subseteq C$ define the same rank function if the sets of lines $\{\mathbb{F} \cdot s \,;\, s \in S,\ s \ne 0\}$ and $\{\mathbb{F} \cdot s' \,;\, s' \in S',\ s' \ne 0\}$ are equal. Given a rank function $\mathrm{rk}$ on $C$, there is a preferred generating set $S$ such that $\mathrm{rk} = \mathrm{rk}_S$, namely it is $S = C_{(1)}$.
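The rank function $\mathrm{rk}_S$ and its syndrome interpretation in Example 1.14 can be explored by brute force on small instances. The sketch below assumes $\mathbb{F} = \mathbb{F}_2$, where a linear combination with nonzero coefficients is just a sum of distinct generators; the names `rank_wrt` and `syndrome` are hypothetical, and the search is exponential in $|S|$.

```python
from itertools import combinations

def rank_wrt(S, z):
    """rk_S(z) over F_2: least number of vectors of S summing to z (None if no such sum)."""
    for r in range(len(S) + 1):
        for idx in combinations(range(len(S)), r):
            total = [0] * len(z)
            for i in idx:
                total = [a ^ b for a, b in zip(total, S[i])]
            if total == list(z):
                return r
    return None

def syndrome(y, columns):
    """z = y H^T over F_2, where `columns` lists the columns of H."""
    z = [0] * len(columns[0])
    for yi, col in zip(y, columns):
        if yi:
            z = [a ^ b for a, b in zip(z, col)]
    return z

# Columns of a 3 x 4 parity-check matrix H of the [4, 1] repetition code.
S = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]

y = [1, 1, 0, 0]        # received word: codeword 0000 or 1111 plus two errors
z = syndrome(y, S)      # its syndrome
r = rank_wrt(S, z)      # weight of a lightest error pattern with this syndrome
```

Here $z = (1, 1, 0)$ is not a column of $H$ but is a sum of two columns, so its rank is 2, matching the fact that $y$ is at Hamming distance 2 from both codewords.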
1.16. Lemma. Let $f : C \to C'$ be a linear map between two finite dimensional $\mathbb{F}$-vector spaces. Suppose $S \subseteq C$, $S' \subseteq C'$ are generating sets with $f(S) \subseteq S'$. Then for all $c \in C$ we have $\mathrm{rk}_S(c) \ge \mathrm{rk}_{S'}(f(c))$.

Proof. Obvious from the definition.
1.17. Now we generalize constructions 1.2-1.3 slightly. Suppose given sets $V_1, \dots, V_t, W$ and a map $\Phi : V_1 \times \cdots \times V_t \to W$. Then for subsets $S_1 \subseteq V_1, \dots, S_t \subseteq V_t$ we define
$$\dot{\Phi}(S_1, \dots, S_t) = \{\Phi(c_1, \dots, c_t) \,;\, c_1 \in S_1, \dots, c_t \in S_t\} \subseteq W.$$
If moreover $V_1, \dots, V_t, W$ are $\mathbb{F}$-vector spaces and $C_1 \subseteq V_1, \dots, C_t \subseteq V_t$ are $\mathbb{F}$-linear subspaces, we let
$$\Phi(C_1, \dots, C_t) = \langle \dot{\Phi}(C_1, \dots, C_t) \rangle \subseteq W.$$
In this definition $\Phi$ could be an arbitrary map, although in most examples it will be $t$-multilinear. In this case, the ($\mathbb{F}$-)linear span is easily seen to reduce to just an additive span.

Also, we will use analogous notations when $\Phi$ is written as a composition law; for instance if $V, V'$ are $\mathbb{F}$-vector spaces and $S \subseteq V$, $S' \subseteq V'$ arbitrary subsets, then
$$S \mathbin{\dot{\otimes}} S' = \{c \otimes c' \,;\, (c, c') \in S \times S'\} \subseteq V \otimes V'.$$

1.18. Let $V, V', W$ be $\mathbb{F}$-vector spaces and $\Phi : V \times V' \to W$ a bilinear map. Let $C \subseteq V$ and $C' \subseteq V'$ be linear subspaces, and suppose $C, C'$ equipped with rank functions $\mathrm{rk}, \mathrm{rk}'$ respectively. Then $\dot{\Phi}(C_{(1)}, C'_{(1)}) \subseteq \Phi(C, C')$ is a generating set, and we define $\mathrm{rk}^\Phi$ as the associated rank function on $\Phi(C, C')$, also called the $\Phi$-rank function deduced from $\mathrm{rk}$ and $\mathrm{rk}'$.

Alternatively, the rank $\mathrm{rk}^\Phi(z)$ of an element $z \in \Phi(C, C')$ can be computed as the smallest value of the sum $\sum_i \mathrm{rk}(c_i)\, \mathrm{rk}'(c'_i)$ over all possible decompositions (of arbitrary length) $z = \sum_i \Phi(c_i, c'_i)$, $c_i \in C$, $c'_i \in C'$.

Here we considered a bilinear $\Phi$, but these constructions easily generalize to the $t$-multilinear case, $t \ge 3$. When $\Phi = \otimes$ is tensor product, or when $\Phi = *$ is componentwise multiplication in $\mathbb{F}^n$, we will often make the following additional assumptions:

1.19. From now on, unless otherwise specified, when a linear code $\widetilde{C}$ is written as a tensor product $\widetilde{C} = C_1 \otimes \cdots \otimes C_t$, it will be assumed that the $C_i$ are equipped with the trivial rank function (such that $\mathrm{rk}(c_i) = 1$ for all nonzero $c_i \in C_i$), and that $\widetilde{C}$ is equipped with the $\otimes$-rank function from these. A nonzero $c \in \widetilde{C}$ is then of rank 1 if it is an elementary tensor $c = c_1 \otimes \cdots \otimes c_t$, with $c_i \in C_i$.

For example, with these conventions, the rank function on $\mathbb{F}^{m \times n} = \mathbb{F}^m \otimes \mathbb{F}^n$ is the rank of matrices in the usual sense. Given two linear codes $C \subseteq \mathbb{F}^m$, $C' \subseteq \mathbb{F}^n$, we have an inclusion $C \otimes C' \subseteq \mathbb{F}^m \otimes \mathbb{F}^n$, and by Lemma 1.16 the rank of a codeword $z \in C \otimes C'$ is greater than or equal to its rank as a matrix.

1.20. Likewise, when a linear code $\widetilde{C} \subseteq \mathbb{F}^n$ is written as a product $\widetilde{C} = C_1 * \cdots * C_t$ for some $C_i \subseteq \mathbb{F}^n$, it will be assumed that these $C_i$ are equipped with the trivial rank function, and that $\widetilde{C}$ is equipped with the $*$-rank function from these. A nonzero $c \in \widetilde{C}$ is then of rank 1 if it is an elementary product $c = c_1 * \cdots * c_t$, with $c_i \in C_i$.

In particular, given a code $C$ and its $t$-th power $C^{\langle t \rangle}$, it will be assumed that $C$ is equipped with the trivial rank function, and $C^{\langle t \rangle}$ with its $t$-th power.

As in 1.10 there is a natural projection $\pi_\Delta : C_1 \otimes \cdots \otimes C_t \to C_1 * \cdots * C_t$ (or $C^{\otimes t} \to C^{\langle t \rangle}$), and by Lemma 1.16 this map can only make the rank decrease.

1.21. Definition. Let $C$ be a nonzero linear code equipped with a rank function $\mathrm{rk}$. For any $i \ge 1$ we then define $d_{\min,i}(C)$ as the minimum weight of a nonzero element in $C_{(\le i)}$. If $C$ has dimension $k$, then obviously
$$d_{\min,1}(C) \ge d_{\min,2}(C) \ge \cdots \ge d_{\min,k}(C) = d_{\min}(C).$$

1.22. Example. Given two linear codes $C \subseteq \mathbb{F}^m$ and $C' \subseteq \mathbb{F}^n$, it follows from the description of $C \otimes C'$ as the space of $m \times n$ matrices with columns in $C$ and rows in $C'$, that $d_{\min}(C \otimes C') = d_{\min}(C)\, d_{\min}(C')$, and that moreover this value is attained by an elementary tensor codeword, that is
$$d_{\min,1}(C \otimes C') = d_{\min}(C \otimes C') = d_{\min}(C)\, d_{\min}(C').$$
On the other hand, let $C, C' \subseteq (\mathbb{F}_2)^7$ be the linear codes with generator matrices
$$G = \begin{pmatrix} 1 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 & 1 & 0 & 0 \end{pmatrix}, \qquad G' = \begin{pmatrix} 1 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \end{pmatrix}$$
respectively, so the nonzero codewords of $C$ are $c_1 = (1001111)$, $c_2 = (0111100)$, $c_1 + c_2 = (1110011)$ and the nonzero codewords of $C'$ are $c'_1 = c_1 = (1001111)$, $c'_2 = (0110011)$, $c'_1 + c'_2 = (1111100)$. We can then check that $c * c'$ has weight at least 2 for all nonzero $c \in C$, $c' \in C'$, while $C * C'$ also contains $c_1 * c'_1 + c_1 * c'_2 + c_2 * c'_1 = (1000000)$. Hence
$$d_{\min,1}(C * C') = 2 > d_{\min}(C * C') = 1,$$
and in this example, $d_{\min}(C * C')$ cannot be attained by an elementary product codeword.

Geometric aspects. Part of the discussion here is aimed at readers with a certain working knowledge of algebraic geometry. Other readers can still read 1.23-1.24, the first halves of 1.30 and 1.31, and also 1.32, which remain elementary, and then skip to the next section with no harm.

1.23. Let $C \subseteq \mathbb{F}^n$ be a linear code and $C^\perp \subseteq \mathbb{F}^n$ its dual. For each integer $i \ge 0$ we have
$$C = C^{\perp\perp} \subseteq \langle x \in C^\perp \,;\, w(x) \le i \rangle^\perp,$$
and we denote the dimension of the latter as
$$n_i = \dim \langle x \in C^\perp \,;\, w(x) \le i \rangle^\perp.$$
Obviously $n_i \ge n_{i+1}$, and the first values are easily computed (see also 2.3 and 2.19):
• $n_0 = n$ is the length of $C$
• $n_1 = |\mathrm{Supp}(C)|$ is the support length of $C$, that is, the number of nonzero columns of a generator matrix of $C$
• $n_2$ is the projective length of $C$, that is, the number of proportionality classes of nonzero columns of a generator matrix of $C$.
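The binary codes of Example 1.22 are small enough to check exhaustively, and the same generator matrix illustrates the quantities $n_1$ and $n_2$ above. The sketch below works over $\mathbb{F}_2$ only, where proportionality of nonzero columns is plain equality; the rows of `G` and `Gp` are the codewords $c_1, c_2$ and $c'_1, c'_2$ listed in the example, and all function names are illustrative.

```python
def span_f2(gens):
    """All words of the F_2-span of `gens`, as a set of tuples."""
    words = {tuple([0] * len(gens[0]))}
    for g in gens:
        words |= {tuple(a ^ b for a, b in zip(w, g)) for w in words}
    return words

def star(x, y):
    """Componentwise product over F_2."""
    return tuple(a & b for a, b in zip(x, y))

def min_weight(words):
    """Minimum Hamming weight of a nonzero word in the collection."""
    return min(sum(w) for w in words if any(w))

def support_length(G):
    """n_1: number of nonzero columns of G."""
    return sum(1 for col in zip(*G) if any(col))

def projective_length(G):
    """n_2 over F_2: number of distinct nonzero columns of G."""
    return len({col for col in zip(*G) if any(col)})

G = [(1, 0, 0, 1, 1, 1, 1), (0, 1, 1, 1, 1, 0, 0)]   # c_1, c_2
Gp = [(1, 0, 0, 1, 1, 1, 1), (0, 1, 1, 0, 0, 1, 1)]  # c'_1, c'_2

# Rank-1 elements of C * C': elementary products c * c'.
elementary = {star(c, cp) for c in span_f2(G) for cp in span_f2(Gp)}
d_min_1 = min_weight(elementary)

# Full product code: span of the products of generator rows.
product_code = span_f2([star(g, gp) for g in G for gp in Gp])
d_min = min_weight(product_code)
```

The word $(1000000)$ lies in the span but is not an elementary product, which is what separates the two minima in the example.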
The author does not know such a nice interpretation for the subsequent values $n_3, n_4, \dots$ At some point the sequence must stabilize: more precisely, if $C^\perp$ is generated by its codewords of weight at most $i_0$, then $n_{i_0} = n_{i_0 + 1} = \cdots = k = \dim(C)$. Putting $C^\perp$ in systematic form, one sees one can take $i_0 \le k + 1$.

1.24. Let $C \subseteq \mathbb{F}^n$ be a linear code of dimension $k$, and let $G$ be a generator matrix for $C$. We will suppose that $C$ has full support, that is, $G$ has no zero column, or with the notations of 1.23, $n_1 = n$.

In many applications, the properties of $C$ that are of interest are preserved when a column of $G$ is replaced with a proportional one, or when columns are permuted. So ([57][58]) these properties only depend on the projective set $\Pi_C \subseteq \mathbb{P}^{k-1}$ of proportionality classes of columns of $G$, where possibly elements of $\Pi_C$ may be assigned multiplicities to reflect the fact that some columns of $G$ may be repeated (up to proportionality).

However, in our context we need to keep track of the ordering of columns, since the product of codes works coordinatewise. This can be done by considering the labeling
$$\nu_C : [n] \twoheadrightarrow \Pi_C \subseteq \mathbb{P}^{k-1}$$
where $\nu_C$ maps $i \in [n]$ to the proportionality class in $\mathbb{P}^{k-1}$ of the $i$-th column of $G$. Note that, in particular, the image of $\nu_C$ is $\Pi_C$. It has $n_2$ elements (with the notations of 1.23), and it spans $\mathbb{P}^{k-1}$ (because $G$ has rank $k$).

In fact this description can be made slightly more intrinsic. We can view $C \subseteq \mathbb{F}^n$ as an abstract vector space $C$ equipped with $n$ linear forms $C \to \mathbb{F}$, which span the dual vector space $C^\vee$. That $C$ has full support means that each of these $n$ linear forms is nonzero, so it defines a line in $C^\vee$. Seeing $\mathbb{P}^{k-1}$ as the projective space of these lines, we retrieve the definition of $\nu_C$.

1.25. Recall [23, §4.1][24, §II.7] that if $V$ is a finite dimensional $\mathbb{F}$-vector space, then $\mathbb{P}(V) = \operatorname{Proj} S^{\cdot} V$ is the scheme whose points represent lines in $V^\vee$, or equivalently, hyperplanes in $V$, or equivalently, invertible quotients of $V$.
If A is a F-algebra, then giving a map ν : Spec A −→ P(V ) is the same as giving an invertible A-module L and a F-linear map V −→ L whose image generates L over A. The closure of the image of ν (which will be the full image of ν if A is finite) is then the closed subscheme of P(V ) defined by the homogeneous ideal t≥0 ker(S t V −→ L⊗t ) of S · V . We apply this with V = C and L = A = Fn . Indeed, that C has full support means Fn ∗ C = Fn , that is, C generates Fn as a Fn -module. So we deduce a morphism [n] = Spec Fn −→ P(C). This morphism is precisely the νC defined in 1.24: indeed the n points of [n] correspond to the n projections Fn −→ F, so their images in P(C) correspond to the n coordinate linear forms C −→ F. Recalling notations from 1.11-1.12, we then find: 1.26. Proposition. The map νC fits into the commutative diagram νC :
[n] Spec Fn
ΠC ⊆ Pk−1 Proj C . ⊆ P(C)
where the homogeneous ideal defining ΠC = Proj C . in Pk−1 = P(C) is I · (C).
ON PRODUCTS AND POWERS OF LINEAR CODES
Proof. Indeed we have ker(S t C −→ Fn ) = ker(S t C −→ C t ) = I t (C).
13
Note that the linear span of $\operatorname{Proj} C^{\cdot}$ is the whole of $\mathbf{P}(C)$ since $I^1(C) = 0$.

1.27. — A possible application of what precedes is to the interpolation problem, where one seeks a subvariety $\Sigma \subseteq \mathbf{P}^{k-1}$ passing through $\Pi_C$, in order to write $C$ as an evaluation code on $\Sigma$. Viewing $S^t C$ as the space of homogeneous functions of degree $t$ on $\mathbf{P}^{k-1}$, the homogeneous equations defining $\Sigma$ are then to be found in $I^{\cdot}(C)$.

Another consequence is the following, which explains the names in Definitions 1.4-1.5. We define the Hilbert function and the Castelnuovo-Mumford regularity of a closed subscheme in a projective space, as those of its homogeneous coordinate ring (see e.g. [19]). Then:

1.28. Proposition. Let $C \subseteq \mathbb{F}^n$ be a linear code with full support, and $\nu_C : [n] \twoheadrightarrow \Pi_C \subseteq \mathbf{P}^{k-1}$ the associated projective spanning map. Then:
(i) The dimension sequence of $C$ is equal to the Hilbert function of $\Pi_C$.
(ii) The regularity $r(C)$ of $C$ is equal to the Castelnuovo-Mumford regularity of $\Pi_C$.
(iii) The stable value of the dimension sequence is the projective length of $C$: $\dim C^t = n_2$ for $t \ge r(C)$.

Proof. As established during the construction of $\nu_C$ in 1.25, the homogeneous coordinate ring of $\Pi_C$ is $S^{\cdot}C / I^{\cdot}(C) = C^{\cdot}$. This gives point (i), and then point (ii) follows by [19, Th. 4.2(3)] (see also [36, Lect. 14]). To show point (iii), first recall that if $A^{\cdot}$ is a graded algebra such that the dimension $\dim A^t$ becomes constant for $t \gg 0$, then this stable value is precisely the length of the finite projective scheme $\operatorname{Proj} A^{\cdot}$. Now here $\Pi_C$ is a reduced union of some $\mathbb{F}$-points (as an image of $[n]$), so this length is precisely the number $n_2$ of these points.

Another (perhaps more concrete) proof of point (iii) will be given in Theorem 2.35. For more on the geometric significance of (ii), see the discussion in 3.12.

1.29. Remark. From the short exact sequence of sheaves on $\mathbf{P}(C)$
$$0 \to \mathcal{J}_{\Pi_C}\mathcal{O}_{\mathbf{P}(C)}(t) \to \mathcal{O}_{\mathbf{P}(C)}(t) \to \mathcal{O}_{\Pi_C}(t) \to 0$$
one can form a long exact sequence in cohomology, in which the first terms can be identified as $\Gamma(\mathbf{P}(C), \mathcal{J}_{\Pi_C}\mathcal{O}_{\mathbf{P}(C)}(t)) = I^t(C)$ and $\Gamma(\mathbf{P}(C), \mathcal{O}_{\mathbf{P}(C)}(t)) = S^t C$, leading to a short exact sequence
$$0 \to C^t \to \Gamma(\Pi_C, \mathcal{O}_{\Pi_C}(t)) \to H^1(\mathbf{P}(C), \mathcal{J}_{\Pi_C}\mathcal{O}_{\mathbf{P}(C)}(t)) \to 0.$$
Now $\Gamma(\Pi_C, \mathcal{O}_{\Pi_C}(t))$ is a vector space of dimension $n_2$ over $\mathbb{F}$, and choosing a subset $S \subseteq [n]$ of size $|S| = n_2$ mapped bijectively onto $\Pi_C$ by $\nu_C$, we can identify this vector space with $\mathbb{F}^S$. From this we finally deduce an identification
$$H^1(\mathbf{P}(C), \mathcal{J}_{\Pi_C}\mathcal{O}_{\mathbf{P}(C)}(t)) \simeq \mathbb{F}^S / \pi_S(C^t).$$
In particular, since $C$ and $C^t$ have the same projective length for $t \ge 1$ (this should be obvious, but if not see 2.19-2.21), we find that
$$\dim H^1(\mathbf{P}(C), \mathcal{J}_{\Pi_C}\mathcal{O}_{\mathbf{P}(C)}(t)) = n_2 - \dim(C^t) = \dim\big((C^t)^{\perp} / \langle x \in (C^t)^{\perp} ;\ w(x) \le 2 \rangle\big)$$
HUGUES RANDRIAMBOLOLONA
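is the minimum number of parity-check relations of weight at least 3 necessarily appearing in any set of relations defining $C^t$.

1.30. — Let $C, C' \subseteq \mathbb{F}^n$ be two linear codes. Choose corresponding generating matrices $G, G'$. For $1 \le i \le n$, let $p_i \in \mathbb{F}^k$ be the $i$-th column of $G$ and $p'_i \in \mathbb{F}^{k'}$ the $i$-th column of $G'$. Then $C, C'$ are the respective images of the evaluation maps
$$\mathbb{F}[X_1,\dots,X_k]_1 \to \mathbb{F}^n, \qquad L \mapsto (L(p_1),\dots,L(p_n))$$
and
$$\mathbb{F}[Y_1,\dots,Y_{k'}]_1 \to \mathbb{F}^n, \qquad L' \mapsto (L'(p'_1),\dots,L'(p'_n))$$
defined on spaces of linear homogeneous polynomials in $k$ and $k'$ variables. Then, $C * C'$ is the image of the evaluation map
$$\mathbb{F}[X_1,\dots,X_k; Y_1,\dots,Y_{k'}]_{1,1} \to \mathbb{F}^n, \qquad B \mapsto (B(p_1; p'_1),\dots,B(p_n; p'_n))$$
defined on the space of bilinear homogeneous polynomials in $k + k'$ variables, that is, on polynomials of the form
$$B(X_1,\dots,X_k; Y_1,\dots,Y_{k'}) = \sum_{i,j} \mu_{i,j} X_i Y_j$$
where $\mu_{i,j} \in \mathbb{F}$, $1 \le i \le k$, $1 \le j \le k'$. This is just a reformulation of 1.10.

Geometrically, it corresponds to the Segre construction. Suppose $C, C' \subseteq \mathbb{F}^n$ have full support, and let $\nu_C : [n] \twoheadrightarrow \Pi_C \subseteq \mathbf{P}^{k-1}$ and $\nu_{C'} : [n] \twoheadrightarrow \Pi_{C'} \subseteq \mathbf{P}^{k'-1}$ be the associated projective spanning maps. Composing this pair of maps with the Segre embedding we get
$$[n] \xrightarrow{(\nu_C,\nu_{C'})} \mathbf{P}^{k-1} \times \mathbf{P}^{k'-1} \to \mathbf{P}^{kk'-1}$$
which should be essentially $\nu_{C*C'}$, except that its image $\Pi_{C*C'}$ might not span $\mathbf{P}^{kk'-1}$ as requested, so we have to replace $\mathbf{P}^{kk'-1}$ with the linear span of the image $\Pi_{C*C'}$, which is a $\mathbf{P}^{\dim(C*C')-1}$. More intrinsically, we have $\mathbf{P}^{k-1} = \mathbf{P}(C)$, $\mathbf{P}^{k'-1} = \mathbf{P}(C')$, and $\mathbf{P}^{kk'-1} = \mathbf{P}(C \otimes C')$. The linear span $\langle \Pi_{C*C'} \rangle$ of $\Pi_{C*C'}$ is then easily identified: we have $C * C' = C \otimes C' / I(C, C')$, so $\langle \Pi_{C*C'} \rangle = \mathbf{P}(C * C') \subseteq \mathbf{P}(C \otimes C')$ is the linear subspace cut by $I(C, C')$ (where we view elements of $I(C, C') \subseteq C \otimes C'$ as linear homogeneous functions on $\mathbf{P}^{kk'-1} = \mathbf{P}(C \otimes C')$). We summarize this with the commutative diagram
$$\begin{array}{cccccccc}
\nu_{C*C'} : & [n] & \twoheadrightarrow & \Pi_{C*C'} & \subseteq & \langle \Pi_{C*C'} \rangle & \subseteq & \mathbf{P}^{kk'-1} \\
 & \| & & \| & & \| & & \| \\
 & \operatorname{Spec} \mathbb{F}^n & \twoheadrightarrow & \operatorname{Proj} (C*C')^{\cdot} & \subseteq & \mathbf{P}(C*C') & \subseteq & \mathbf{P}(C \otimes C').
\end{array}$$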
1.31. — Keep the same notations as in the previous entry. First, in coordinates, if $C$ is the image of the evaluation map
$$\mathbb{F}[X_1,\dots,X_k]_1 \to \mathbb{F}^n, \qquad L \mapsto (L(p_1),\dots,L(p_n))$$
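defined on the space of linear homogeneous polynomials in $k$ variables, then $C^t$ is the image of the evaluation map
$$\mathbb{F}[X_1,\dots,X_k]_t \to \mathbb{F}^n, \qquad Q \mapsto (Q(p_1),\dots,Q(p_n))$$
defined on the space of homogeneous polynomials of degree $t$ in $k$ variables. This is just a reformulation of 1.11.

Geometrically, it corresponds to the Veronese construction. Suppose $C \subseteq \mathbb{F}^n$ has full support, and let $\nu_C : [n] \twoheadrightarrow \Pi_C \subseteq \mathbf{P}^{k-1}$ be the associated projective spanning map. Composing with the $t$-fold Veronese embedding we get
$$[n] \xrightarrow{\nu_C} \mathbf{P}^{k-1} \to \mathbf{P}^{\binom{k+t-1}{t}-1},$$
the image of which spans the linear subspace $\langle \Pi_{C^t} \rangle = \mathbf{P}(C^t) \subseteq \mathbf{P}(S^t C)$ cut by $I^t(C)$ (where now we see elements of $I^t(C) \subseteq S^t C$ not as homogeneous functions of degree $t$ on $\mathbf{P}^{k-1} = \mathbf{P}(C)$, but as linear homogeneous functions on $\mathbf{P}^{\binom{k+t-1}{t}-1} = \mathbf{P}(S^t C)$). Again we summarize this with the commutative diagram
$$\begin{array}{cccccccc}
\nu_{C^t} : & [n] & \twoheadrightarrow & \Pi_{C^t} & \subseteq & \langle \Pi_{C^t} \rangle & \subseteq & \mathbf{P}^{\binom{k+t-1}{t}-1} \\
 & \| & & \| & & \| & & \| \\
 & \operatorname{Spec} \mathbb{F}^n & \twoheadrightarrow & \operatorname{Proj} (C^t)^{\cdot} & \subseteq & \mathbf{P}(C^t) & \subseteq & \mathbf{P}(S^t C).
\end{array}$$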
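The coordinate description in 1.31 can be checked numerically on a small case. The following is only a sketch under stated assumptions: the prime field size `P = 7`, the generator matrix `G` (columns $p_j = (1, j)$) and the helper names `rank_mod_p` and `power_generators` are illustrative choices, not taken from the text. It evaluates every degree-$t$ monomial at the columns of $G$, so the resulting words span $C^t$, and compares $\dim C^t$ with $\dim S^t C = \binom{k+t-1}{t}$; the difference is $\dim I^t(C)$.

```python
from itertools import combinations_with_replacement
from math import comb

P = 7  # assumed prime field size for this illustration


def rank_mod_p(rows, p=P):
    """Rank of a list of row vectors over the prime field F_p (Gaussian elimination)."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def power_generators(G, t, p=P):
    """Evaluate every degree-t monomial in X_1..X_k at the columns of G.

    The monomial X_{i1}...X_{it} evaluates to the componentwise product of
    rows i1, ..., it of G, so these words span the t-th power of the code.
    """
    n = len(G[0])
    words = []
    for idx in combinations_with_replacement(range(len(G)), t):
        word = [1] * n
        for i in idx:
            word = [(a * b) % p for a, b in zip(word, G[i])]
        words.append(word)
    return words


# Hypothetical example: columns p_j = (1, j) for j = 0..6 over F_7.
G = [[1] * 7, list(range(7))]
k = len(G)
for t in range(1, 8):
    dim_power = rank_mod_p(power_generators(G, t))
    dim_forms = comb(k + t - 1, t)               # dim of S^t C
    print(t, dim_power, dim_forms - dim_power)   # last column: dim I^t(C)
```

In this example the seven points are distinct, so the dimension grows by one at each step until it reaches $n = 7$, and the first nonzero form vanishing on all points appears in degree 7.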
1.32. — This geometric view is especially interesting when one considers the distance problem. Given $C \subseteq \mathbb{F}^n$ with full support, and $\nu_C : [n] \twoheadrightarrow \Pi_C \subseteq \mathbf{P}^{k-1}$ the associated projective spanning map, nonzero codewords $c \in C$ correspond to hyperplanes $H_c \subseteq \mathbf{P}^{k-1}$, and the weight of $c$ is $w(c) = n - |\nu_C^{-1}(H_c)|$. As a consequence, the minimum distance of $C$ is
$$d_{\min}(C) = n - \max_{H \subseteq \mathbf{P}^{k-1}\ \text{hyperplane}} |\nu_C^{-1}(H)|.$$
Applying the Veronese construction which identifies hyperplanes in $\mathbf{P}^{\binom{k+t-1}{t}-1}$ with hypersurfaces of degree $t$ in $\mathbf{P}^{k-1}$, we find likewise
$$d_{\min}(C^t) = n - \max_{\substack{H \subseteq \mathbf{P}^{k-1},\ H \not\supseteq \Pi_C \\ \text{hypersurface of degree } t}} |\nu_C^{-1}(H)|.$$
Note that here we have to add the extra condition $H \not\supseteq \Pi_C$, reflecting the fact that $I^t(C)$ could be nonzero. This makes the distance problem slightly more delicate as soon as $t \ge 2$. In many code constructions, often the very same argument that gives a lower bound on the minimum distance shows at the same time that the code has “full dimension”. For example, if $C(D, G) = \operatorname{Im}(L(D) \to \mathbb{F}^G)$ is the algebraic-geometry code defined in 1.7, then, provided $m = \deg(D) < n = |G|$, a function in $L(D)$ can have at most $m$ zeroes in $G$, from which we get at the same time injectivity of the evaluation map, so $\dim(C(D, G)) = \dim(L(D))$, and $d_{\min}(C(D, G)) \ge n - m$. On the other hand if a code is defined as a power of another code, we have to deal separately with the fact that it could have dimension smaller than expected.
Given $\nu_C : [n] \twoheadrightarrow \Pi_C \subseteq \mathbf{P}^{k-1}$, to show $d_{\min}(C^t) \ge n - m$ one has to show that for any homogeneous form $F$ of degree $t$ on $\mathbf{P}^{k-1}$, either:
• $\nu_C^* F$ has at most $m$ zeroes in $[n]$, or
• $\nu_C^* F$ vanishes on all of $[n]$.
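The dichotomy in 1.32 can be observed by brute force on a toy case. This is a sketch only, under assumptions that are not in the text: the prime field size `P = 7`, the generator matrix `G` with columns $(1, j)$, and the helper names are all hypothetical. Each coefficient vector plays the role of a form $F$ of degree $t$; its word lists the values of $F$ at the $n$ points, forms vanishing on all points give the zero word and are skipped, and the minimum weight is $n$ minus the largest number of zeroes among the remaining forms.

```python
from itertools import combinations_with_replacement, product

P = 7  # assumed prime field size for this illustration


def power_generators(G, t, p=P):
    """Words spanning C^t: componentwise products of t rows of G (with repetition)."""
    n = len(G[0])
    words = []
    for idx in combinations_with_replacement(range(len(G)), t):
        word = [1] * n
        for i in idx:
            word = [(a * b) % p for a, b in zip(word, G[i])]
        words.append(word)
    return words


def min_distance(gens, p=P):
    """Brute-force minimum weight of the code spanned by gens over F_p.

    Returns None if the code is zero. Exponential in len(gens): toy sizes only.
    """
    n = len(gens[0])
    best = None
    for coeffs in product(range(p), repeat=len(gens)):
        word = [sum(c * g[j] for c, g in zip(coeffs, gens)) % p for j in range(n)]
        weight = sum(1 for x in word if x)
        if weight and (best is None or weight < best):
            best = weight
    return best


# Hypothetical example: columns (1, j) for j = 0..6 over F_7, so n = 7.
G = [[1] * 7, list(range(7))]
for t in (1, 2, 3):
    print(t, min_distance(power_generators(G, t)))
```

Here a nonzero form of degree $t \le 3$ in two variables has at most $t$ zeroes among the seven points, so the printed distances decrease by one at each step, in line with the monotonicity stated later in Theorem 2.32.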
1.33. — Now the author would like to share some (personal) speculations about the objects constructed so far. From Proposition 1.28, we see that the dimension sequence of a linear code $C$ is a notion that has been already well studied, albeit under a different (but equivalent) form. In fact, its study can also be reduced to an interpolation problem: since $\dim C^t = \binom{k+t-1}{t} - \dim I^t(C)$, to estimate the Hilbert function we can equivalently count the hypersurfaces of degree $t$ passing through $\Pi_C$. This problem is not really of a coding-theoretic nature. We can do the same thing for powers of a linear subspace in any finite-dimensional algebra $A$, not only in $\mathbb{F}^n$.

However, things change if one is also interested in the distance sequence of $C$. While we’re still doing geometry over $\mathbb{F}$, that is, over a base of dimension 0, now, following the philosophy of Arakelov theory, the introduction of metric data (such as defined, here, by the Hamming metric) is very similar to passing to a base of dimension 1. In this way, the study of the joint dimension and distance sequences of a code might be viewed as a finite field analogue of the study of the “arithmetic Hilbert(-Samuel) function” associated in [30] to interpolation matrices over a number field, and further analyzed in [41]. For example the monotonicity results that will be given in 2.32-2.33 are very similar in spirit to those of [41, 5.2]; in turn, keeping Remark 1.29 in mind, a natural interpretation is as the size of some $H^1$ decreasing, as in [36, p. 102].

For another illustration of this principle, to give an upper bound on $d_{\min}(C^t)$ one has to find a nonzero codeword of small weight in $C^t$, that is, a function $P \in S^t C$ whose zero locus intercepts a large part of, but not all, the image of $[n]$ under $\nu_{C^t}$. This is somehow reminiscent of the situation in transcendental number theory, where one has to construct an auxiliary function that is small but nonzero, which often involves a Siegel lemma. Conversely, to give a lower bound on $d_{\min}(C^t)$, one has to show that for all $P \in S^t C$, either $P$ vanishes on all the image (which means $P \in I^t(C)$), or else it misses a certain part of it, of controlled size. Perhaps one could see this as a loose analogue of a zero lemma.

2. Basic structural results and miscellaneous properties

In this section we study basic properties of codes with respect to componentwise product, while aiming at the widest generality. This means including the case of “degenerated” codes (e.g. not having full support, or having repeated columns, or also decomposable codes) that are often not of primary interest to coding theorists; the hurried reader should feel free to skip the corresponding entries. This said, it turns out these degenerated codes sometimes appear in some natural situations, which motivates having them treated here for reference. For instance, even if a code $C$ is indecomposable, its powers $C^t$ might be decomposable. Also, to study a code $C$, it can be useful to filter it by a chain of subcodes $C_i$ (see e.g. 3.10, or [14][35][59]), and even for the nicest $C$, the $C_i$ under consideration might very well then be degenerated.
Support.

2.1. — From now on, by the $i$-th column of a linear code $C \subseteq \mathbb{F}^n$, we will mean the $i$-th coordinate projection $\pi_i : C \to \mathbb{F}$, which is an element of the dual vector space $C^{\vee}$. This name is justified because, given a generator matrix $G$, which corresponds to a basis of $C$ over $\mathbb{F}$, the column vector of the coordinates of $\pi_i$ with respect to this basis is precisely the $i$-th column of $G$.

2.2. — A possible definition of the support of words or codes, in terms of our product $*$, can be given as follows. First, note that for all $S, T \subseteq [n]$, we have $1_S * 1_T = 1_{S \cap T}$. In particular, $1_S$ is an idempotent of $\mathbb{F}^n$. In fact, as a linear endomorphism of $\mathbb{F}^n$, we have $1_S * \cdot = \iota_S \circ \pi_S$ where $\pi_S : \mathbb{F}^n \twoheadrightarrow \mathbb{F}^S$ and $\iota_S : \mathbb{F}^S \hookrightarrow \mathbb{F}^n$ are the natural linear maps. Then the support of a word $x \in \mathbb{F}^n$ can be defined as the smallest, or the intersection, of all subsets $S \subseteq [n]$ such that $1_S * x = x$. Likewise the support of a linear code $C \subseteq \mathbb{F}^n$ is the smallest, or the intersection, of all subsets $S \subseteq [n]$ such that $1_S * C = C$.

2.3. — Equivalently, for $i \in [n]$, we have $i \in \operatorname{Supp}(C)$ if and only if the $i$-th column of $C$ is nonzero. This may be rephrased in terms of vectors of weight 1 in the dual code:
$$i \in \operatorname{Supp}(C) \iff 1_{\{i\}} \notin C^{\perp}.$$
As a consequence we have $\langle x \in C^{\perp} ;\ w(x) \le 1 \rangle^{\perp} = \iota(\mathbb{F}^{\operatorname{Supp}(C)})$ (where $\iota = \iota_{\operatorname{Supp}(C)} : \mathbb{F}^{\operatorname{Supp}(C)} \hookrightarrow \mathbb{F}^n$) and we retrieve the relation
$$n_1 = \dim \langle x \in C^{\perp} ;\ w(x) \le 1 \rangle^{\perp} = |\operatorname{Supp}(C)|$$
as stated in 1.23.

2.4. Lemma. If $c_1, \dots, c_t \in \mathbb{F}^n$ are words of the same length, then
$$\operatorname{Supp}(c_1 * \cdots * c_t) = \operatorname{Supp}(c_1) \cap \cdots \cap \operatorname{Supp}(c_t).$$
If $C_1, \dots, C_t \subseteq \mathbb{F}^n$ are linear codes of the same length, then
$$\operatorname{Supp}(C_1 * \cdots * C_t) = \operatorname{Supp}(C_1) \cap \cdots \cap \operatorname{Supp}(C_t).$$
In particular, for $C \subseteq \mathbb{F}^n$ and $t \ge 1$ we have $\operatorname{Supp}(C^t) = \operatorname{Supp}(C)$.

Proof. Obvious.
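As a small sanity check of Lemma 2.4 for two codes, one may compare both sides on an example. This is a minimal sketch; the field size `P = 5`, the generator lists `C1`, `C2` and the helper names are hypothetical choices made for illustration. It uses the fact that the support of a code is the union of the supports of any set of generators, and that the pairwise products of generators span the product code.

```python
P = 5  # assumed prime field size for this illustration


def star(x, y, p=P):
    """Componentwise product of two words over F_p."""
    return [(a * b) % p for a, b in zip(x, y)]


def support(word, p=P):
    """Set of coordinates where the word is nonzero."""
    return {i for i, a in enumerate(word) if a % p}


def code_support(gens, p=P):
    """Support of the code spanned by gens: union of the generators' supports."""
    result = set()
    for g in gens:
        result |= support(g, p)
    return result


def product_generators(gens1, gens2, p=P):
    """Pairwise products of generators; they span the product code."""
    return [star(g, h, p) for g in gens1 for h in gens2]


# Hypothetical generator lists of two codes of length 5 over F_5.
C1 = [[1, 2, 0, 0, 3], [0, 1, 1, 0, 0]]
C2 = [[0, 1, 1, 1, 0], [0, 0, 2, 1, 4]]

lhs = code_support(product_generators(C1, C2))   # Supp(C1 * C2)
rhs = code_support(C1) & code_support(C2)        # Supp(C1) ∩ Supp(C2)
print(lhs, rhs)
```

Coordinates are indexed from 0 in the code, whereas the text indexes them from 1.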
2.5. — In most applications we can discard the 0 columns of a linear code without affecting its good properties, that is, we can replace C with its projection on Supp(C) so that it then has full support.
In particular, given $C_1, \dots, C_t \subseteq \mathbb{F}^n$, if we let $I = \operatorname{Supp}(C_1) \cap \cdots \cap \operatorname{Supp}(C_t)$ and we replace each $C_i$ with $\pi_I(C_i)$, this replaces $C_1 * \cdots * C_t$ with $\pi_I(C_1 * \cdots * C_t)$, which does not change its essential parameters (dimension, weight distribution...). In this way, many results on products of codes can be reduced to statements on products of codes which all have full support. However, this intersection $I$ may be strictly smaller than some of the $\operatorname{Supp}(C_i)$, so replacing $C_i$ with $\pi_I(C_i)$ might change some relevant parameter of this code. In some applications, namely when both the parameters of $C_1 * \cdots * C_t$ and those of the $C_i$ are relevant, this added difficulty has to be taken into account carefully.

Decomposable codes. We recast some classical results of [54] in the light of the $*$ operation, elaborating from 2.2. Beside reformulating elementary notions in a fancy language, what is done here will also appear naturally while studying automorphisms in 2.48 and following.

2.6. Definition. Let $C \subseteq \mathbb{F}^n$ be a linear code. The extended stabilizing algebra of $C$ is
$$\widehat{A}(C) = \{a \in \mathbb{F}^n ;\ a * C \subseteq C\},$$
and the (proper) stabilizing algebra of $C$ is
$$A(C) = 1_{\operatorname{Supp}(C)} * \widehat{A}(C) = \{a \in \mathbb{F}^n ;\ \operatorname{Supp}(a) \subseteq \operatorname{Supp}(C),\ a * C \subseteq C\}.$$
Clearly $\widehat{A}(C)$ is a subalgebra of $\mathbb{F}^n$, while projection $\pi_{\operatorname{Supp}(C)}$ identifies $A(C)$ with a subalgebra of $\mathbb{F}^{\operatorname{Supp}(C)}$ (the identity element of $A(C)$ is the idempotent $1_{\operatorname{Supp}(C)}$ of $\mathbb{F}^n$). Moreover we have
$$\widehat{A}(C) = A(C) \oplus \iota(\mathbb{F}^{[n] \setminus \operatorname{Supp}(C)})$$
where $\iota = \iota_{[n] \setminus \operatorname{Supp}(C)}$ is the natural inclusion $\mathbb{F}^{[n] \setminus \operatorname{Supp}(C)} \hookrightarrow \mathbb{F}^n$.

2.7. Proposition. Let $C, C' \subseteq \mathbb{F}^n$ be two linear codes of the same length. Then
$$A(C) * A(C') \subseteq A(C * C'),$$
and for all $t \ge 1$
$$A(C) = A(C)^t \subseteq A(C^t).$$
Also we have $A(A(C)) = A(C)$.

Proof. If $a * C \subseteq C$ and $a' * C' \subseteq C'$, then $(a * a') * C * C' \subseteq C * C'$. Using Lemma 2.4 and passing to the linear span we find $A(C) * A(C') \subseteq A(C * C')$ as claimed. Induction then gives $A(C)^t \subseteq A(C^t)$. Last, we have $A(C) = A(C)^t$ and $A(A(C)) = A(C)$ because $A(C)$ is an algebra under $*$, with unit $1_{\operatorname{Supp}(C)}$.

2.8. Definition. Let $C \subseteq \mathbb{F}^n$ be a linear code and $\mathcal{P} = \{P_1, \dots, P_s\}$ a partition of $\operatorname{Supp}(C)$. We say that $C$ decomposes under $\mathcal{P}$ if $1_{P_i} \in A(C)$ for all $i$. Equivalently, this means there are linear subcodes $C_1, \dots, C_s \subseteq C$ with $\operatorname{Supp}(C_i) = P_i$ such that $C = C_1 \oplus \cdots \oplus C_s$.
To show the equivalence, write $C_i = 1_{P_i} * C$, so by definition $C_i$ is a subcode of $C$ if and only if $1_{P_i} \in A(C)$.

2.9. — We recall that the set of partitions of a given set $S$ forms a lattice under refinement. In particular if $\mathcal{P} = \{P_1, \dots, P_s\}$ and $\mathcal{Q} = \{Q_1, \dots, Q_t\}$ are two partitions of $S$, their coarsest common refinement is the partition
$$\mathcal{P} \wedge \mathcal{Q} = \{P_i \cap Q_j ;\ P_i \cap Q_j \ne \emptyset\}.$$
More generally, if $S, T$ are two sets, $\mathcal{P}$ is a partition of $S$, and $\mathcal{Q}$ a partition of $T$, then $\mathcal{P} \wedge \mathcal{Q}$, formally defined by the very same formula as above, is a partition of $S \cap T$.

2.10. Lemma-definition. If $C$ decomposes under two partitions $\mathcal{P}, \mathcal{Q}$ of $\operatorname{Supp}(C)$, then it decomposes under $\mathcal{P} \wedge \mathcal{Q}$. Hence there is a finest partition $\mathcal{P}(C)$ under which $C$ decomposes. If $\mathcal{P}(C) = \{A_1, \dots, A_r\}$, we have
$$C = C_1 \oplus \cdots \oplus C_r$$
where the $C_i = 1_{A_i} * C$ are called the indecomposable components of $C$. This is the finest decomposition of $C$ as a direct sum of nonzero subcodes with pairwise disjoint supports.

Proof. If $1_{P_i} \in A(C)$ and $1_{Q_j} \in A(C)$, then $1_{P_i \cap Q_j} = 1_{P_i} * 1_{Q_j} \in A(C)$.
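For very small binary codes, the stabilizing algebra of Definition 2.6 can simply be enumerated, which makes the decomposition criterion of 2.8 concrete. The sketch below rests on several assumptions that are not part of the text: it works over $\mathbb{F}_2$ only, it uses full-support examples (so the extended and proper algebras coincide), and the helper names and the `two_blocks` code are hypothetical. Over $\mathbb{F}_2$ the algebra has $2^r$ elements when its dimension is $r$.

```python
from itertools import product


def span_gf2(gens):
    """All codewords of the binary code spanned by gens (tuples of 0/1)."""
    n = len(gens[0])
    return {
        tuple(sum(c * g[j] for c, g in zip(coeffs, gens)) % 2 for j in range(n))
        for coeffs in product((0, 1), repeat=len(gens))
    }


def star(x, y):
    """Componentwise product of two binary words."""
    return tuple(a * b for a, b in zip(x, y))


def stabilizing_algebra(gens):
    """All a in F_2^n with a * C contained in C (it suffices to test generators)."""
    code = span_gf2(gens)
    n = len(gens[0])
    return {a for a in product((0, 1), repeat=n)
            if all(star(a, g) in code for g in gens)}


def square_generators(gens):
    """Pairwise products of generators; they span the square of the code."""
    return [star(g, h) for g in gens for h in gens]


parity = [(1, 1, 0), (0, 1, 1)]            # the [3,2,2] parity code
two_blocks = [(1, 1, 0, 0), (0, 0, 1, 1)]  # hypothetical decomposable code

for name, gens in [("parity", parity),
                   ("parity squared", square_generators(parity)),
                   ("two blocks", two_blocks)]:
    algebra = stabilizing_algebra(gens)
    print(name, len(algebra), sorted(algebra))
```

For the parity code only the zero word and the all-one word stabilize the code, which reflects indecomposability, while for its square every word does.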
2.11. Proposition. We have $\dim A(C) = |\mathcal{P}(C)|$. More precisely, if $\mathcal{P}(C) = \{A_1, \dots, A_r\}$, then
$$A(C) = \langle 1_{A_1}, \dots, 1_{A_r} \rangle = \langle 1_{A_1} \rangle \oplus \cdots \oplus \langle 1_{A_r} \rangle.$$

Proof. Let $V = \langle 1_{A_1}, \dots, 1_{A_r} \rangle$. Obviously the $1_{A_i}$ are linearly independent so $\dim(V) = r$; and by definition we have $1_{A_i} \in A(C)$, so $V \subseteq A(C)$. Conversely, let $x \in A(C)$. We want to show $x \in V$. Let $\lambda_1, \dots, \lambda_s \in \mathbb{F}$ be the elements that appear at least once as a coordinate of $x$ over $\operatorname{Supp}(C)$, and for each such $\lambda_j$, let $B_j \subseteq \operatorname{Supp}(C)$ be the set of indices on which $x$ takes coordinate $\lambda_j$, so $x = \lambda_1 1_{B_1} + \cdots + \lambda_s 1_{B_s}$. For each $j$, there is a Lagrange interpolation polynomial $P$ such that $P(\lambda_j) = 1$ and $P(\lambda_{j'}) = 0$ for $j' \ne j$. Evaluating $P$ on $x$ in the algebra $A(C)$ we find $1_{B_j} = P(x) \in A(C)$. This means $C$ decomposes under the partition $\mathcal{Q} = \{B_1, \dots, B_s\}$, hence $\mathcal{P}(C)$ refines $\mathcal{Q}$. So, for all $j$, we get that $B_j$ is a union of some of the $A_i$, and $1_{B_j} \in V$. The conclusion follows.

2.12. Corollary. Let $C, C' \subseteq \mathbb{F}^n$ be two linear codes of the same length. Then $\mathcal{P}(C * C')$ is a (possibly strict) refinement of $\mathcal{P}(C) \wedge \mathcal{P}(C')$. For $t \ge 1$, $\mathcal{P}(C^t)$ is a (possibly strict) refinement of $\mathcal{P}(C)$. More generally, if $C$ decomposes under a partition $\mathcal{P}$ of $\operatorname{Supp}(C)$ as
$$C = C_1 \oplus \cdots \oplus C_s$$
and $C'$ under a partition $\mathcal{P}'$ of $\operatorname{Supp}(C')$ as
$$C' = C'_1 \oplus \cdots \oplus C'_{s'}$$
then $C * C'$ decomposes under $\mathcal{P} \wedge \mathcal{P}'$ (which is a partition of $\operatorname{Supp}(C * C')$) as
$$C * C' = \bigoplus_{i,j} C_i * C'_j$$
where we keep only those of the $i, j$ for which $C_i * C'_j \ne 0$. (However, these $C_i * C'_j$ need not necessarily be indecomposable, even if the $C_i$ and $C'_j$ are.) And for any $t \ge 1$, the $t$-th power $C^t$ also decomposes under $\mathcal{P}$ as
$$C^t = C_1^t \oplus \cdots \oplus C_s^t.$$
(However, these $C_i^t$ need not necessarily be indecomposable, even if the $C_i$ are.)
Proof. Everything is clear and can be proved directly. An alternative proof for the first assertion is as a consequence of Propositions 2.7 and 2.11.

2.13. Example. Note that the parity $[3, 2, 2]_2$ code $C$ is indecomposable, while its square is the trivial $[3, 3, 1]_2$ code, which decomposes totally. That is, this gives an example where $\mathcal{P}(C^2) = \{\{1\}, \{2\}, \{3\}\}$ strictly refines $\mathcal{P}(C) = \{\{1, 2, 3\}\}$, and
$$A(C) = A(C)^2 = \langle 1 \rangle \subsetneq A(C^2) = (\mathbb{F}_2)^3.$$

2.14. — We gave results only for the proper stabilizing algebra. However, since $\widehat{A}(C) = A(C) \oplus \iota(\mathbb{F}^{[n] \setminus \operatorname{Supp}(C)})$, one immediately deduces similar statements for the extended algebra. For instance, Proposition 2.7 is replaced with $\widehat{A}(C) * \widehat{A}(C') \subseteq \widehat{A}(C * C')$, $\widehat{A}(C) = \widehat{A}(C)^t \subseteq \widehat{A}(C^t)$, and $\widehat{A}(\widehat{A}(C)) = A(\widehat{A}(C)) = \widehat{A}(C)$. Instead of Definition 2.8, we say that $C$ weakly decomposes under a partition $\mathcal{Q}$ of $[n]$ if, for each $Q \in \mathcal{Q}$, we have $1_Q \in \widehat{A}(C)$. This means there are subcodes $C_i \subseteq C$ with disjoint supports such that $C = \bigoplus_i C_i$, with each $\operatorname{Supp}(C_i)$ included (possibly strictly) in some $Q_i \in \mathcal{Q}$. There is a finest partition of $[n]$ under which $C$ weakly decomposes, it is $\widehat{\mathcal{P}}(C) = \mathcal{P}(C) \cup \{\{j\} ;\ j \notin \operatorname{Supp}(C)\}$. Then Proposition 2.11 becomes
$$\widehat{A}(C) = \bigoplus_{Q \in \widehat{\mathcal{P}}(C)} \langle 1_Q \rangle.$$
Last, $\widehat{\mathcal{P}}(C * C')$ is a (possibly strict) refinement of $\widehat{\mathcal{P}}(C) \wedge \widehat{\mathcal{P}}(C')$, and if $C$ weakly decomposes under $\mathcal{Q}$ as $C = \bigoplus_i C_i$ and $C'$ weakly decomposes under $\mathcal{Q}'$ as $C' = \bigoplus_j C'_j$, then $C * C'$ weakly decomposes under $\mathcal{Q} \wedge \mathcal{Q}'$ as $C * C' = \bigoplus_{i,j} C_i * C'_j$.

Additional properties of $A(C)$, involving the dual code $C^{\perp}$, will be given in 2.41 and 2.42.

Repeated columns.

2.15. — We keep the same notations as in 2.1: by the columns of a linear code $C \subseteq \mathbb{F}^n$ we mean the $n$ coordinate projections $C \to \mathbb{F}$. Then:

Definition. We define an equivalence relation $\sim$ (or $\sim_C$) on $\operatorname{Supp}(C)$ by setting $i \sim j$ when the $i$-th and $j$-th columns of $C$ are proportional. By abuse of language we also say these are two repeated columns. We let $\mathcal{U}(C) = \operatorname{Supp}(C)/\!\sim$ be the set of equivalence classes of $\sim$, which is a partition of $\operatorname{Supp}(C)$.
2.16. Lemma. Let $i, j \in \operatorname{Supp}(C)$, $i \ne j$. Then
$$i \sim j \iff \exists x \in C^{\perp},\ \operatorname{Supp}(x) = \{i, j\}.$$
Conversely
$$i \not\sim j \iff \exists c \in C,\ \pi_i(c) = 1,\ \pi_j(c) = 0$$
(and then likewise with $i, j$ permuted).

Proof. Basic manipulation in linear algebra.
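The first equivalence of Lemma 2.16 can be illustrated on a generator matrix. The following is a sketch under assumptions not found in the text: the field size `P = 5`, the matrix `G` and the helper names are hypothetical. It groups the nonzero columns into classes of proportional columns, and for a pair of indices it searches for a dual codeword supported exactly on that pair (normalized to have coordinate 1 at the first index, which is harmless up to scaling).

```python
P = 5  # assumed prime field size for this illustration


def column(G, j):
    """The j-th column of the generator matrix G."""
    return [row[j] for row in G]


def proportional(u, v, p=P):
    """True if all 2x2 minors of (u, v) vanish mod p (u, v assumed nonzero)."""
    k = len(u)
    return all((u[a] * v[b] - u[b] * v[a]) % p == 0
               for a in range(k) for b in range(a + 1, k))


def repeated_column_classes(G, p=P):
    """Partition of the support into classes of proportional (repeated) columns."""
    classes = []
    for j in range(len(G[0])):
        col = column(G, j)
        if not any(x % p for x in col):
            continue  # zero column: j lies outside the support
        for cls in classes:
            if proportional(column(G, cls[0]), col, p):
                cls.append(j)
                break
        else:
            classes.append([j])
    return classes


def weight_two_dual_word(G, i, j, p=P):
    """Search for x in the dual code with Supp(x) = {i, j}; None if there is none."""
    n = len(G[0])
    for b in range(1, p):
        if all((row[i] + b * row[j]) % p == 0 for row in G):
            x = [0] * n
            x[i], x[j] = 1, b
            return x
    return None


# Hypothetical generator matrix over F_5; its last column is zero.
G = [[1, 2, 0, 1, 0],
     [0, 0, 1, 3, 0]]
print(repeated_column_classes(G))
print(weight_two_dual_word(G, 0, 1), weight_two_dual_word(G, 0, 2))
```

Indices start at 0 in the code. The number of classes returned is the projective length $n_2$ of the code.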
2.17. Proposition. Let $C \subseteq \mathbb{F}^n$ be a linear code. Then $\mathcal{U}(C)$ is a refinement of $\mathcal{P}(C)$.

Proof. We have to show that if $A, B \in \mathcal{P}(C)$, $A \ne B$, and $i \in A$, $j \in B$, then $i \not\sim j$. Since $i \in \operatorname{Supp}(C)$, we can find $c \in C$ with $\pi_i(c) = 1$. Then $1_A * c \in C$ satisfies $\pi_i(1_A * c) = 1$, $\pi_j(1_A * c) = 0$, and we conclude with Lemma 2.16.

2.18. — An equivalent formulation for Lemma 2.16 is: $i \not\sim j$ if and only if $\dim(1_{\{i,j\}} * C) = 2$. Conversely a subset $B \subseteq \operatorname{Supp}(C)$ is contained in an equivalence class for $\sim$ if and only if $\dim(1_B * C) = 1$. In particular, $B \in \mathcal{U}(C)$ if and only if $B$ is maximal for this property.

Definition. We call these $1_B * C$, for $B \in \mathcal{U}(C)$, the one-dimensional slices of $C$. If $c \in C$ is nonzero over $B$, then $v = 1_B * c$ is a generator of the corresponding slice: $1_B * C = \langle v \rangle$. Beware that since $\mathcal{U}(C)$ might be a strict refinement of $\mathcal{P}(C)$, this slice $1_B * C$ need not actually be a subcode of $C$, or equivalently, $v$ need not actually belong to $C$.

2.19. — If $\mathcal{U}(C) = \{B_1, \dots, B_s\}$ and $v_1, \dots, v_s$ are corresponding slice generators, then $1_{\operatorname{Supp}(C)} = 1_{B_1} + \cdots + 1_{B_s}$ from which it follows
$$C = 1_{\operatorname{Supp}(C)} * C \subseteq \langle 1_{B_1}, \dots, 1_{B_s} \rangle * C = \langle v_1 \rangle \oplus \cdots \oplus \langle v_s \rangle.$$
The right hand side is easily identified thanks to 2.3 and Lemma 2.16:
$$\langle v_1 \rangle \oplus \cdots \oplus \langle v_s \rangle = \langle x \in C^{\perp} ;\ w(x) \le 2 \rangle^{\perp}.$$
As a consequence we retrieve the relation
$$n_2 = \dim \langle x \in C^{\perp} ;\ w(x) \le 2 \rangle^{\perp} = s = |\mathcal{U}(C)|$$
as stated in 1.23.

2.20. — To restate all this more concretely, choose a set of representatives $S = \{j_1, \dots, j_s\} \subseteq \operatorname{Supp}(C)$, with $j_i \in B_i$, so each nonzero column of $C$ is repeated from one (and only one) column indexed by $S$. Then a codeword $c \in C$ is entirely determined over $B_i$ by its value at $j_i$. More precisely, after possibly multiplying by scalars, we can suppose our slice generators are normalized with respect to $S$, that is, $v_i$ is 1 at $j_i$ for all $i$. Then for each $c \in C$, the slice of $c$ over $B_i$ is $1_{B_i} * c = \pi_{j_i}(c) v_i$. Said otherwise, $\pi_S$ induces a commutative diagram
$$\begin{array}{ccc}
C & \subseteq & \langle v_1 \rangle \oplus \cdots \oplus \langle v_s \rangle \\
\downarrow{\scriptstyle\simeq} & & \downarrow{\scriptstyle\simeq} \\
\pi_S(C) & \subseteq & \mathbb{F}^S
\end{array}$$
identifying $C$ with the code $\pi_S(C)$ of length $|S| = s = n_2$, which has full support in $\mathbb{F}^S$ and no repeated column (so dual distance $d^{\perp}(\pi_S(C)) \ge 3$ by 2.3 and Lemma 2.16). Each column of $C$ is repeated from one column of $\pi_S(C)$, or more precisely, each $(\lambda_{j_1}, \dots, \lambda_{j_s}) \in \pi_S(C)$ extends uniquely to $\lambda_{j_1} v_1 + \cdots + \lambda_{j_s} v_s \in C$.

2.21. Proposition. Let $C, C' \subseteq \mathbb{F}^n$ be linear codes of the same length, and let $i, j \in \operatorname{Supp}(C) \cap \operatorname{Supp}(C')$. Then the $i$-th and $j$-th columns are repeated in $C * C'$ if and only if they are repeated in $C$ and in $C'$. Said otherwise, $\mathcal{U}(C * C') = \mathcal{U}(C) \wedge \mathcal{U}(C')$. If $v_1, \dots, v_s$ are slice generators for $C$ and $w_1, \dots, w_{s'}$ are slice generators for $C'$, then those among the $v_i * w_j$ that are nonzero form a family of slice generators for $C * C'$. In particular $\mathcal{U}(C^t) = \mathcal{U}(C)$ for all $t \ge 1$, and $(v_1)^t, \dots, (v_s)^t$ are slice generators for $C^t$. If $S \subseteq \operatorname{Supp}(C)$ is a set of representatives for $\sim_C$, then the dimension sequences of $C$ and $\pi_S(C)$ are the same: $\dim(C^t) = \dim(\pi_S(C)^t)$ for all $t \ge 0$. Hence they also have the same regularity: $r(C) = r(\pi_S(C))$.

Proof. Suppose $\pi_i = \lambda \pi_j$ on $C$ and $\pi_i = \lambda' \pi_j$ on $C'$, for some $\lambda, \lambda' \in \mathbb{F}^{\times}$. Then $\pi_i = \lambda\lambda' \pi_j$ on $C * C'$: indeed it is so on elementary product vectors, and this extends by linearity. Conversely, suppose for example $i \not\sim_C j$, so by Lemma 2.16 we can find $c \in C$ with $\pi_i(c) = 1$, $\pi_j(c) = 0$. Since $i \in \operatorname{Supp}(C')$, we can find $c' \in C'$ with $\pi_i(c') = 1$. Then $\pi_i(c * c') = 1$, $\pi_j(c * c') = 0$, hence $i \not\sim_{C * C'} j$. The rest follows easily (note $\pi_S(C^t) = \pi_S(C)^t$).

Extension of scalars. Let $\mathbb{F} \subseteq \mathbb{K}$ be a field extension. In many applications, one is given a “nice” linear code over $\mathbb{K}$ and one wants to deduce from it a “nice” linear code over $\mathbb{F}$. Several techniques have been designed for this task, especially when the extension has finite degree: subfield subcodes, trace codes, and concatenation.
How these operations behave with respect to the product ∗ turns out to be quite difficult to analyze, although we will give results involving concatenation in 4.15 and following. In the other direction, base field extension (or extension of scalars) allows to pass from a linear code C ⊆ Fn over F to a linear code CK ⊆ Kn over K. In general this operation is less useful for practical applications, however in some cases it can be of help in order to prove theorems. The definition is simple: we let CK be the K-linear span of C in Kn (where we implicitly used the chain of inclusions C ⊆ Fn ⊆ Kn ).
2.22. Lemma. Let C ⊆ Fn be a linear code over F. Then: (i) The inclusion C ⊗F K ⊆ Fn ⊗F K = Kn induces the identification C ⊗F K = CK . (ii) If G is a generator matrix for C over F, then G is a generator matrix for CK over K. (iii) If H is a parity-check matrix for C over F, then H is a parity-check matrix for CK over K. Proof. Basic manipulation in linear algebra.
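Extension of scalars is compatible with most operations on codes:

2.23. Lemma.
(i) If $C \subseteq \mathbb{F}^n$ is a linear code, then $(C^{\perp})_{\mathbb{K}} = (C_{\mathbb{K}})^{\perp} \subseteq \mathbb{K}^n$.
(ii) If $C, C' \subseteq \mathbb{F}^n$ are linear codes, then $C \subseteq C' \iff C_{\mathbb{K}} \subseteq C'_{\mathbb{K}}$.
(iii) Let $C, C' \subseteq \mathbb{F}^n$ be linear codes. Then:
$$(C + C')_{\mathbb{K}} = C_{\mathbb{K}} + C'_{\mathbb{K}}, \qquad (C \cap C')_{\mathbb{K}} = C_{\mathbb{K}} \cap C'_{\mathbb{K}}$$
and
$$(C * C')_{\mathbb{K}} = C_{\mathbb{K}} * C'_{\mathbb{K}}$$
(where on the left hand side, $*$ denotes product in $\mathbb{F}^n$, and on the right hand side, in $\mathbb{K}^n$).
(iv) Let $C \subseteq \mathbb{F}^m$, $C' \subseteq \mathbb{F}^n$ be linear codes. Then:
$$(C \oplus C')_{\mathbb{K}} = C_{\mathbb{K}} \oplus C'_{\mathbb{K}} \subseteq \mathbb{K}^{m+n}, \qquad (C \otimes C')_{\mathbb{K}} = C_{\mathbb{K}} \otimes C'_{\mathbb{K}} \subseteq \mathbb{K}^{m \times n}.$$

Proof. Routine verifications, using Lemma 2.22.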
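2.24. Proposition. If $C \subseteq \mathbb{F}^n$ is a linear code, then $\mathcal{P}(C_{\mathbb{K}}) = \mathcal{P}(C)$, and $A(C_{\mathbb{K}}) = A(C)_{\mathbb{K}}$ in $\mathbb{K}^n$. In particular $C$ is indecomposable if and only if $C_{\mathbb{K}}$ is indecomposable.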
Proof. For P ⊆ Supp(C) we have 1P ∗ C ⊆ C ⇐⇒ 1P ∗ CK ⊆ CK by Lemma 2.23(ii)-(iii). Conclude with Proposition 2.11. 2.25. Proposition. If C ⊆ Fn is a linear code, then U(CK ) = U(C). If v1 , . . . , vs ∈ Fn are slice generators for C, then they are also for CK . Proof. Obvious: πi = λπj on C ⇐⇒ πi = λπj on CK .
2.26. — For $C \subseteq \mathbb{F}^n$ and $S \subseteq [n]$ we let
$$C_S = \iota_S(\iota_S^{-1}(C)) = C \cap \iota_S(\mathbb{F}^S) = \{c \in C ;\ \operatorname{Supp}(c) \subseteq S\}$$
be the largest subcode of $C$ with support in $S$. Also we recall from [59] that for $1 \le i \le \dim(C)$, the $i$-th generalized Hamming weight $w_i(C)$ of $C$ is the smallest integer $s$ such that $C$ admits a linear subcode of dimension $i$ and support size $s$. Equivalently:
$$w_i(C) = \min\{|S| ;\ S \subseteq [n],\ \dim(C_S) \ge i\}.$$
In particular w1 (C) = dmin (C). 2.27. Proposition. Let F ⊆ K be a field extension, and C ⊆ Fn a linear code over F. Then for any subset S ⊆ [n] we have (CS )K = (CK )S . In particular, dimK ((CK )S ) = dimF (CS ). Proof. Write CS = C ∩ ιS (FS ) and use Lemma 2.23(iii) (for ∩).
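The formula of 2.26 for the generalized Hamming weights through the subcodes $C_S$ lends itself to a direct, if exponential, computation. The following is only a brute-force sketch for tiny binary codes; the bitmask encoding of words, the helper names, and the choice of a systematic generator matrix of the binary $[7, 4, 3]$ Hamming code as test case are assumptions made for illustration.

```python
def codewords(gens):
    """All words of the binary code spanned by gens (words as int bitmasks)."""
    words = {0}
    for g in gens:
        words |= {w ^ g for w in words}
    return words


def generalized_hamming_weights(gens, n):
    """Brute-force w_1, ..., w_k via w_i = min{|S| : dim(C_S) >= i}."""
    words = codewords(gens)
    k = len(words).bit_length() - 1          # |C| = 2^k
    best = [None] + [n] * k                  # best[i] tracks w_i; index 0 unused
    for S in range(1 << n):                  # S runs over all subsets of [n]
        inside = sum(1 for w in words if (w & ~S) == 0)   # |C_S|
        dim_S = inside.bit_length() - 1      # |C_S| = 2^dim(C_S)
        size = bin(S).count("1")
        for i in range(1, dim_S + 1):
            best[i] = min(best[i], size)
    return best[1:]


# Systematic generator matrix [I | P] of the binary [7,4,3] Hamming code.
HAMMING = [int(row, 2) for row in ("1000110", "0100101", "0010011", "0001111")]
print(generalized_hamming_weights(HAMMING, 7))
```

For this code the computation should return the weight hierarchy $(3, 5, 6, 7)$, whose first entry is the minimum distance.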
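2.28. Corollary. Let $\mathbb{F} \subseteq \mathbb{K}$ be a field extension, and $C \subseteq \mathbb{F}^n$ a linear code over $\mathbb{F}$. Then we have $\dim_{\mathbb{K}}(C_{\mathbb{K}}) = \dim_{\mathbb{F}}(C)$ and $w_i(C_{\mathbb{K}}) = w_i(C)$ for all $i$. In particular, $d_{\min}(C_{\mathbb{K}}) = d_{\min}(C)$.

Proof. The first equality follows from Lemma 2.22(i). The second follows from Proposition 2.27 and the definition of the generalized Hamming weights.

2.29. Lemma. Let $\mathbb{F} \subseteq \mathbb{K}$ be a field extension, and suppose $\lambda_1, \dots, \lambda_r \in \mathbb{K}$ are linearly independent over $\mathbb{F}$. Let $C \subseteq \mathbb{F}^n$ be a linear code, and $x_1, \dots, x_r \in \mathbb{F}^n$ be arbitrary words. Set $x = \lambda_1 x_1 + \cdots + \lambda_r x_r \in \mathbb{K}^n$. Then we have
$$\operatorname{Supp}(x) = \bigcup_i \operatorname{Supp}(x_i)$$
and
$$x \in C_{\mathbb{K}} \iff \forall i,\ x_i \in C.$$

Proof. The only nontrivial point is the implication $x \in C_{\mathbb{K}} \implies x_i \in C$. It is in fact a consequence of Lemma 2.22(iii).

Alternative proofs for Propositions 2.24 and 2.27 could also be given using the following:

2.30. Proposition. Let $\mathbb{F} \subseteq \mathbb{K}$ be a field extension, $C \subseteq \mathbb{F}^n$ a linear code over $\mathbb{F}$, and $C' \subseteq C_{\mathbb{K}}$ a linear subcode over $\mathbb{K}$. Then there is a linear subcode $C_0 \subseteq C$ over $\mathbb{F}$ of support $\operatorname{Supp}(C_0) = \operatorname{Supp}(C')$ such that $C' \subseteq (C_0)_{\mathbb{K}}$ and
$$\dim_{\mathbb{K}}(C') \le \dim_{\mathbb{F}}(C_0) \le \min(\dim_{\mathbb{F}}(C),\ [\mathbb{K} : \mathbb{F}] \dim_{\mathbb{K}}(C')).$$
Moreover, if the extension is finite separable, we can take $C_0 = \operatorname{tr}_{\mathbb{K}/\mathbb{F}}(C')$ where we extended the trace map $\operatorname{tr}_{\mathbb{K}/\mathbb{F}}$ to a map $\mathbb{K}^n \to \mathbb{F}^n$ by letting it act componentwise.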
Proof. Choose a basis $(\lambda_i)$ of $\mathbb{K}$ over $\mathbb{F}$, and decompose each element $x$ of a ($\mathbb{K}$-)basis of $C'$ as a finite sum $x = \lambda_1 x_1 + \cdots + \lambda_r x_r$ for some $x_i \in \mathbb{F}^n$ (after possibly renumbering the $\lambda_i$). Then apply Lemma 2.29. When the extension is separable we have $x_i = \operatorname{tr}_{\mathbb{K}/\mathbb{F}}(\lambda_i^* x)$, where the basis $(\lambda_i^*)$ is dual to $(\lambda_i)$ with respect to the trace bilinear form.

2.31. Lemma. Let $C \subseteq \mathbb{F}^n$ be a linear code of dimension $k$ over $\mathbb{F}$. Then there exists an extension field $\mathbb{K}$ of finite degree $[\mathbb{K} : \mathbb{F}] \le k$, and a codeword $c \in C_{\mathbb{K}}$, such that $\operatorname{Supp}(c) = \operatorname{Supp}(C)$.

Proof. Let $G$ be a generator matrix for $C$. Let $\mathbb{F}_0 \subseteq \mathbb{F}$ be the prime subfield of $\mathbb{F}$ (that is, $\mathbb{F}_0 = \mathbb{Q}$ if $\operatorname{char}(\mathbb{F}) = 0$, and $\mathbb{F}_0 = \mathbb{Z}/p\mathbb{Z}$ if $\operatorname{char}(\mathbb{F}) = p > 0$), and let $\mathbb{F}_1 = \mathbb{F}_0(G) \subseteq \mathbb{F}$ be the field generated over $\mathbb{F}_0$ by the entries of $G$. So $\mathbb{F}_1$ is finitely generated over a prime field, and as such we contend it admits finite extensions of any degree (this is clear if $\mathbb{F}$, and thus also $\mathbb{F}_1$, is a finite field, which is the case in most applications; and for completeness a proof of the general case is given at the end of this proof). Let then $\mathbb{K}_1$ be an extension of $\mathbb{F}_1$ of degree $k$, and let $\mathbb{K}$ be a compositum of $\mathbb{F}$ and $\mathbb{K}_1$. Now if $c_1, \dots, c_k \in \mathbb{F}^n$ are the rows of $G$, and if $\lambda_1, \dots, \lambda_k \in \mathbb{K}_1$ are linearly independent over $\mathbb{F}_1$, we set $c = \lambda_1 c_1 + \cdots + \lambda_k c_k \in C_{\mathbb{K}}$ and conclude with Lemma 2.29.

Concerning the general case of the claim made in the middle of this proof, it can be established as follows: write $\mathbb{F}_1$ as a finite extension (say of degree $d$) of a purely transcendental extension of $\mathbb{F}_0$, and let $\mathbb{F}_{1,c}$ be its constant field, that is, the algebraic closure of $\mathbb{F}_0$ in $\mathbb{F}_1$; proceeding as in [56] Prop. 3.6.1 and Lemma 3.6.2, one then gets (a) that $\mathbb{F}_{1,c}$ is finite over $\mathbb{F}_0$ (more precisely, of degree at most $d$), and (b) that any algebraic extension of $\mathbb{F}_{1,c}$ is linearly disjoint from $\mathbb{F}_1$.
Now (a) means F1,c is either a number field or a finite field, and as such it admits finite extensions of any degree, for instance cyclotomic extensions do the job; and then by (b), such an extension of F1,c induces an extension of F1 of the same degree. (An alternative, more geometric proof, would be to consider F1 as the field of functions of a projective variety over F0 , and then get properties (a) and (b) of F1,c from finiteness of cohomology and its properties under base field extension.) Monotonicity. 2.32. Theorem. Let C ⊆ Fn be a linear code. Then for t ≥ 1 we have dim(C t+1 ) ≥ dim(C t ). Also the generalized Hamming weights satisfy wi (C t+1 ) ≤ wi (C t ) for 1 ≤ i ≤ dim(C t ), and wi ((C t+1 )⊥ ) ≥ wi ((C t )⊥ ) for 1 ≤ i ≤ dim((C t+1 )⊥ ). In particular, the minimum distances satisfy dmin (C t+1 ) ≤ dmin (C t ), and the dual distances, d⊥ (C t+1 ) ≥ d⊥ (C t ).
Proof. Thanks to Lemmas 2.23(i),(iii) and 2.31, and Corollary 2.28, it suffices to treat the case where there is $c \in C$ with $\operatorname{Supp}(c) = \operatorname{Supp}(C)$. The multiplication map $c * \cdot$ is then injective from $C^t$ into $C^{t+1}$, so $\dim(C^t) = \dim(c * C^t) \le \dim(C^{t+1})$. Likewise if $C' \subseteq C^t$ has dimension $i$ and support weight $w_i(C^t)$, we have $\dim(c * C') = \dim(C') = i$ and $w_i(C^{t+1}) \le |\operatorname{Supp}(c * C')| = |\operatorname{Supp}(C')| = w_i(C^t)$.

Now extend $c$ to $\tilde{c} \in (\mathbb{F}^n)^{\times}$ by setting it equal to 1 out of $\operatorname{Supp}(C)$, that is, formally, $\tilde{c} = c + 1_{[n] \setminus \operatorname{Supp}(C)}$. The multiplication map $\tilde{c} * \cdot$ is then injective as a linear endomorphism of $\mathbb{F}^n$, and on $C^t$ it coincides with the multiplication map $c * \cdot$ as above. So $\tilde{c} * \cdot$ sends $C^t$ into $C^{t+1}$, which implies that it sends $(C^{t+1})^{\perp}$ into $(C^t)^{\perp}$ (here this is easily checked, but see Corollary 2.39 to put it in a more general context). Then if $C' \subseteq (C^{t+1})^{\perp}$ has dimension $i$ and support weight $w_i((C^{t+1})^{\perp})$, we have $\dim(\tilde{c} * C') = \dim(C') = i$ and $w_i((C^t)^{\perp}) \le |\operatorname{Supp}(\tilde{c} * C')| = |\operatorname{Supp}(C')| = w_i((C^{t+1})^{\perp})$.

This shows that the regularity $r(C)$ is well defined (in 1.5). One can then give a slightly stronger monotonicity result for the dimension sequence:

2.33. Corollary. For $1 \le t < r(C)$, we have $\dim(C^{t+1}) > \dim(C^t)$.

Proof. Again it suffices to treat the case where there is $c \in C$ with $\operatorname{Supp}(c) = \operatorname{Supp}(C)$. Let $t \ge 1$, and suppose $\dim(C^{t+1}) = \dim(C^t)$. Then necessarily $C^{t+1} = c * C^t$ so
$$C^{t+2} = C * C^{t+1} = C * (c * C^t) = c * C^{t+1} = c^2 * C^t.$$
We continue in the same way and, for all $i \ge 0$, we find $C^{t+i} = c^i * C^t$, hence $\dim(C^{t+i}) = \dim(C^t)$. This means precisely $t \ge r(C)$.

An alternative proof can be given using Proposition 2.21 to reduce to the case where $C$ has dual distance at least 3, and then concluding with Proposition 3.5 below.

Stable structure.
2.34. — In what follows we use the same notations as in 2.19-2.20. So $C \subseteq \mathbb{F}^n$ is a linear code and
$$|\mathcal{U}(C)| = \dim \langle x \in C^{\perp} ;\ w(x) \le 2 \rangle^{\perp} = n_2$$
is its projective length. We choose a set of representatives $S = \{j_1, \dots, j_{n_2}\} \subseteq \operatorname{Supp}(C)$, and associated normalized slice generators $v_1, \dots, v_{n_2}$ for $C$, that is, $v_i \in \mathbb{F}^n$ with pairwise disjoint supports such that
$$C \subseteq \langle v_1 \rangle \oplus \cdots \oplus \langle v_{n_2} \rangle$$
and $\pi_{j_i}(v_i) = 1$, so we have an isomorphism
$$\varphi : \pi_S(C) \to C, \qquad (\lambda_{j_1}, \dots, \lambda_{j_{n_2}}) \mapsto \lambda_{j_1} v_1 + \cdots + \lambda_{j_{n_2}} v_{n_2}$$
inverse to $\pi_S$. Here $\pi_S(C) \subseteq \mathbb{F}^S$ has full support and no repeated column. Then by Proposition 2.21 for all $t \ge 1$, we have an inclusion
$$C^t \subseteq \langle (v_1)^t \rangle \oplus \cdots \oplus \langle (v_{n_2})^t \rangle$$
and an isomorphism
$$\varphi_t : \pi_S(C)^t \to C^t, \qquad (\lambda_{j_1}, \dots, \lambda_{j_{n_2}}) \mapsto \lambda_{j_1} (v_1)^t + \cdots + \lambda_{j_{n_2}} (v_{n_2})^t$$
inverse to $\pi_S$ (observe $\pi_S(C)^t = \pi_S(C^t)$).

2.35. Theorem. Let $C \subseteq \mathbb{F}^n$ be a linear code of dimension $k$. Then, with the notations of 2.34, the code $C$ has regularity
$$r(C) \le n_2 - k + 1,$$
and we have
$$C^t = \langle (v_1)^t \rangle \oplus \cdots \oplus \langle (v_{n_2})^t \rangle$$
for all $t \ge r(C)$. In particular the stable value of the dimension sequence of $C$ is its projective length: $\dim(C^t) = n_2$ for $t \ge r(C)$.

Proof. Because of the inclusion $C^t \subseteq \langle (v_1)^t \rangle \oplus \cdots \oplus \langle (v_{n_2})^t \rangle$ we have $\dim(C^t) \le n_2$ for all $t$. However for $t = 1$ we have $\dim(C) = k$. So the dimension sequence can increase at most $n_2 - k$ times, which, joint with Corollary 2.33, implies the bound on $r(C)$. To conclude it suffices to show that there exists one $t$ with $\dim(C^t) = n_2$. Since $S$ is a set of representatives for $\sim$, Lemma 2.16 gives, for all $i, i' \in S$, $i \ne i'$, a word $x_{i,i'} \in \pi_S(C)$ which is 1 at $i$ and 0 at $i'$. Fixing $i$ and letting $i' \ne i$ vary we find
$$1_{\{i\}} = \prod_{i' \in S \setminus \{i\}} x_{i,i'} \in \pi_S(C)^{n_2 - 1} \subseteq \mathbb{F}^S.$$
Now this holds for all i ∈ S, so dim(πS (C)n2 −1 ) = n2 . Then applying ϕn2 −1 we find dim(C n2 −1 ) = n2 as claimed. 2.36. Corollary. Let C ⊆ Fn be a linear code and t ≥ 0 an integer. The following are equivalent: (i) t ≥ r(C) (ii) dim(C t ) = dim(C t+1 ) (iii) dim(C t ) = n2 the projective length of C (iv) C t is generated by some codewords with pairwise disjoint supports (v) (C t )⊥ is generated by its codewords of weight at most 2 (vi) there is a subset S ⊆ [n] such that πS : C t FS is onto, and every nonzero column of C t is repeated from a column indexed by S. Proof. The first equivalence (i)⇐⇒(ii) is essentially Corollary 2.33. Also (iv)⇐⇒(v)⇐⇒(vi) is clear. Now Theorem 2.35 gives (i)⇐⇒(iii)=⇒(iv). Conversely suppose (iv), so C t is generated by r = dim(C t ) codewords c1 , . . . , cr with pairwise disjoint supports. Then necessarily these codewords are slice generators for C t , so r = |U(C t )| = |U(C)|, where for the last equality we used Proposition 2.21. But by 2.19 we have |U(C)| = n2 , so (iii) holds.
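As an illustration of Theorem 2.35, the following Python sketch computes the dimension sequence of a small code over a prime field and compares its stable value with the projective length. It is not taken from the paper: the helper names (`rref`, `product_code`, `projective_length`, `dimension_sequence`) and the example generator matrix over $\mathbb{F}_5$ are assumptions made here for illustration, and the arithmetic only handles prime fields.

```python
def rref(rows, p):
    """Reduced row echelon basis, over the prime field F_p, of the span of `rows`."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [x * inv % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return m[:rank]


def product_code(A, B, p):
    """Basis of the span of all componentwise products a * b, a in A, b in B."""
    return rref([[x * y % p for x, y in zip(a, b)] for a in A for b in B], p)


def projective_length(G, p):
    """Number of proportionality classes of nonzero columns of G."""
    classes = set()
    for col in zip(*G):
        if any(col):
            inv = pow(next(x for x in col if x), -1, p)
            classes.add(tuple(x * inv % p for x in col))
    return len(classes)


def dimension_sequence(G, p, tmax):
    """List of dim(C^<t>) for t = 1, ..., tmax."""
    base = rref(G, p)
    power, dims = base, []
    for _ in range(tmax):
        dims.append(len(power))
        power = product_code(base, power, p)
    return dims


# Hypothetical example over F_5: column 5 is proportional to column 3,
# column 6 is zero, so n = 6 but the projective length is n_2 = 4.
G = [[1, 1, 1, 1, 2, 0],
     [0, 1, 2, 3, 4, 0]]
print(dimension_sequence(G, 5, 4), projective_length(G, 5))
```

In this example the sequence reaches the projective length at $t = 3 = n_2 - k + 1$, so the bound of Theorem 2.35 is attained.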
Adjunction properties.

2.37. — If $A$ is a finite dimensional algebra over $\mathbb{F}$ (with unit), then letting $A$ act on itself by multiplication (on the left) allows us to identify $A$ with a subalgebra of the algebra of linear endomorphisms $\mathrm{End}(A)$. We then define the trace linear form on $A$ as the linear form inherited from the usual trace in $\mathrm{End}(A)$, so formally $\mathrm{tr}(a) = \mathrm{tr}(x \mapsto ax)$ for $a \in A$. We also define the trace bilinear form $\langle \cdot | \cdot \rangle$ on $A$, by the formula $\langle x | y \rangle = \mathrm{tr}(xy)$ for $x, y \in A$. Moreover the identity $\mathrm{tr}(xy) = \mathrm{tr}(yx)$ then shows that $\langle \cdot | \cdot \rangle$ is in fact a symmetric bilinear form, and that for any $a$, the left-multiplication-by-$a$ map $a\,\cdot$ and the right-multiplication-by-$a$ map $\cdot\,a$, which are elements of $\mathrm{End}(A)$, are adjoint to each other with respect to $\langle \cdot | \cdot \rangle$. In the particular case $A = \mathbb{F}^n$, this construction identifies $\mathbb{F}^n$ with the algebra of diagonal matrices of size $n$ over $\mathbb{F}$. The trace function is the linear map $\mathrm{tr}(x) = x_1 + \cdots + x_n$ and the trace bilinear form is the standard scalar product $\langle x | y \rangle = x_1 y_1 + \cdots + x_n y_n$, where $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n)$, $x_i, y_j \in \mathbb{F}$.

2.38. Proposition. For any $c \in \mathbb{F}^n$, the multiplication-by-$c$ map $c * \cdot$ acting on $\mathbb{F}^n$ is autoadjoint with respect to the standard scalar product: $\langle c * x | y \rangle = \langle x | c * y \rangle$ for all $x, y \in \mathbb{F}^n$.

Proof. This can be checked directly, or seen as a special case of the discussion 2.37.

2.39. Corollary. Let $C_1, C_2 \subseteq \mathbb{F}^n$ be two linear codes. Then for any $c \in \mathbb{F}^n$ we have
$$c * C_1 \subseteq C_2 \iff c * C_2^\perp \subseteq C_1^\perp$$
and for any linear code $C \subseteq \mathbb{F}^n$ we have
$$C * C_1 \subseteq C_2 \iff C * C_2^\perp \subseteq C_1^\perp.$$

Proof. The first assertion is a consequence of Proposition 2.38, and the second follows after passing to the linear span.

2.40. Corollary. For any two linear codes $C, C' \subseteq \mathbb{F}^n$ we have
$$C * (C * C')^\perp \subseteq C'^\perp.$$
Given two integers $t \ge t' \ge 0$ we have
$$C^{\langle t'\rangle} * (C^{\langle t\rangle})^\perp \subseteq (C^{\langle t - t'\rangle})^\perp.$$
(Note: it is easy to construct examples where the inclusion is strict.)

Proof. Apply the second equivalence in Corollary 2.39.
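For readers who want the direct check behind Proposition 2.38 and the first equivalence of Corollary 2.39 spelled out, here is one way to write it. This is only a sketch of the "can be checked directly" step, using the standard fact $(C_2^\perp)^\perp = C_2$ for linear codes.

```latex
% Proposition 2.38: multiplication by c is self-adjoint.
\langle c * x \mid y \rangle
  = \sum_{i=1}^{n} (c_i x_i)\, y_i
  = \sum_{i=1}^{n} x_i\, (c_i y_i)
  = \langle x \mid c * y \rangle .

% Corollary 2.39, first equivalence:
c * C_1 \subseteq C_2
  \iff \langle c * x \mid y \rangle = 0 \ \text{ for all } x \in C_1,\ y \in C_2^{\perp}
  \iff \langle x \mid c * y \rangle = 0 \ \text{ for all } x \in C_1,\ y \in C_2^{\perp}
  \iff c * C_2^{\perp} \subseteq C_1^{\perp} .
```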
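Recall from 2.6, 2.11, and 2.14 that we described the extended stabilizing algebra $\widehat{A}(C) = \{a \in \mathbb{F}^n ;\ a * C \subseteq C\}$ of a linear code $C$ as
$$\widehat{A}(C) = \bigoplus_{Q \in \widehat{\mathcal{P}}(C)} \langle 1_Q \rangle,$$
where $\widehat{\mathcal{P}}(C) = \mathcal{P}(C) \cup \{\{j\} ;\ j \notin \mathrm{Supp}(C)\}$ is the partition of $[n]$ associated with the decomposition of $C$ into indecomposable components.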
2.41. Corollary. For any linear code $C$ we have $\widehat{A}(C) = \widehat{A}(C^\perp)$, hence also $\widehat{\mathcal{P}}(C) = \widehat{\mathcal{P}}(C^\perp)$. In particular, a linear code $C$ of length $n \ge 2$ is indecomposable with full support if and only if $C^\perp$ is.

Proof. Apply the first equivalence in Corollary 2.39.

The following interesting characterization of $\widehat{A}(C)$ was apparently first noticed by Couvreur and Tillich:

2.42. Corollary. For any linear code $C$ we have $\widehat{A}(C) = (C * C^\perp)^\perp$, or equivalently, $C * C^\perp$ is the space of words orthogonal to the $1_Q$ for $Q \in \widehat{\mathcal{P}}(C)$. In particular, if $C$ is indecomposable with full support, of length $n \ge 2$, then $C * C^\perp = 1^\perp$ is the $[n, n-1, 2]$ parity code.

Proof. The first inclusion in Corollary 2.40, applied with $C' = C^\perp$, shows that $(C * C^\perp)^\perp$ stabilizes $C$, hence $(C * C^\perp)^\perp \subseteq \widehat{A}(C)$. By duality, to get the converse inclusion, it suffices now to show $C * C^\perp \subseteq \widehat{A}(C)^\perp$, i.e. we have to show $C * C^\perp$ orthogonal to the $1_Q$, for $Q \in \widehat{\mathcal{P}}(C)$. Now if $Q = \{j\}$ for $j$ out of $\mathrm{Supp}(C)$ this is clear. Otherwise we have $Q \in \mathcal{P}(C)$, and projecting onto $Q$ we can now suppose that $C$ is indecomposable with full support, in which case we have to show $C * C^\perp$ orthogonal to $1$, which is obvious (this can also be seen as the second inclusion in Corollary 2.40 applied with $t = t' = 1$).

Symmetries and automorphisms.

2.43. — For any integer $n$, we let the symmetric group $\mathfrak{S}_n$ act on the right on $\mathbb{F}^n$ by the formula $(x_1, \dots, x_n)^\sigma = (x_{\sigma(1)}, \dots, x_{\sigma(n)})$ where $\sigma \in \mathfrak{S}_n$, $x_i \in \mathbb{F}$. Equivalently, if $x \in \mathbb{F}^n$ is a row vector, then $x^\sigma = x P_\sigma$ where $P_\sigma$ is the permutation matrix with entries $(P_\sigma)_{i,j} = 1_{\{i = \sigma(j)\}}$. For $\sigma, \tau \in \mathfrak{S}_n$ we have $(x^\sigma)^\tau = x^{\sigma\tau}$.

2.44. Lemma. For $x, y \in \mathbb{F}^n$ and $\sigma \in \mathfrak{S}_n$ we have $x^\sigma * y^\sigma = (x * y)^\sigma$.

Proof. Obvious.

2.45. Definition. If $C \subseteq \mathbb{F}^n$ is a linear code, its group of symmetries is
$$S(C) = \{\sigma \in \mathfrak{S}_n ;\ C^\sigma = C\} = \{\sigma \in \mathfrak{S}_n ;\ \forall c \in C,\ c^\sigma \in C\}.$$
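The "in particular" part of Corollary 2.42 can be checked numerically on a small case. The sketch below is an illustration added here, not part of the paper: it takes the binary $[7,4]$ Hamming code (indecomposable with full support), computes a basis of $C * C^\perp$ over $\mathbb{F}_2$, and verifies that it is the $[7,6,2]$ parity code. The helper names `rref`, `dual_code` and `product_code` are hypothetical, and the arithmetic assumes a prime field.

```python
def rref(rows, p):
    """Reduced row echelon basis, over the prime field F_p, of the span of `rows`."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [x * inv % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return m[:rank]


def dual_code(G, p):
    """Basis of the dual code {y : <x|y> = 0 for all x in the row space of G}."""
    n = len(G[0])
    R = rref(G, p)
    pivots = [next(j for j, x in enumerate(r) if x) for r in R]
    basis = []
    for f in (j for j in range(n) if j not in pivots):
        y = [0] * n
        y[f] = 1
        for r, pc in zip(R, pivots):
            y[pc] = -r[f] % p
        basis.append(y)
    return basis


def product_code(A, B, p):
    """Basis of the span of all componentwise products a * b, a in A, b in B."""
    return rref([[x * y % p for x, y in zip(a, b)] for a in A for b in B], p)


# Generator matrix of the binary [7,4] Hamming code (example chosen here).
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
P = product_code(G, dual_code(G, 2), 2)
print(len(P), all(sum(row) % 2 == 0 for row in P))
```

A dimension of $n - 1 = 6$ together with orthogonality to the all-one word identifies $C * C^\perp$ with $1^\perp$.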
2.46. Proposition. Let $C, C' \subseteq \mathbb{F}^n$ be two linear codes. Then we have $S(C) \cap S(C') \subseteq S(C * C')$. Given two integers $t, t' \ge 1$, then
$$t \mid t' \implies S(C^{\langle t\rangle}) \subseteq S(C^{\langle t'\rangle}).$$
Proof. Direct consequence of Lemma 2.44.
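2.47. Example. Let $C$ be the one-dimensional code of length 2 over $\mathbb{F}_5$ generated by the row vector $(1, 2)$. Then $S(C^{\langle t\rangle}) = \{1\}$ for $t$ odd, and $S(C^{\langle t\rangle}) = \mathfrak{S}_2$ for $t$ even.

2.48. — We define $\mathrm{Aut}(\mathbb{F}^n)$ as the group with elements the pairs $(\sigma, a)$ with $\sigma \in \mathfrak{S}_n$ and $a \in (\mathbb{F}^n)^\times$, and composition law given by $(\sigma, a)(\tau, b) = (\sigma\tau, a^\tau * b)$ for $\sigma, \tau \in \mathfrak{S}_n$ and $a, b \in (\mathbb{F}^n)^\times$. This is a semidirect product of $\mathfrak{S}_n$ and $(\mathbb{F}^n)^\times$, with $(\mathbb{F}^n)^\times$ normal. We let $\mathrm{Aut}(\mathbb{F}^n)$ act on the right on $\mathbb{F}^n$, where $(\sigma, a)$ acts as $x \mapsto x^\sigma * a$. For $a \in (\mathbb{F}^n)^\times$, let $D(a) \in \mathbb{F}^{n \times n}$ be the associated diagonal matrix. Then the map $(\sigma, a) \mapsto P_\sigma D(a)$ is an isomorphism of $\mathrm{Aut}(\mathbb{F}^n)$ with the group of $n \times n$ monomial matrices (note $D(a^\tau) = P_\tau^{-1} D(a) P_\tau$). The latter acts on the right on $\mathbb{F}^n$, seen as a space of row vectors, and this isomorphism preserves the actions.

2.49. Lemma. For $x, y \in \mathbb{F}^n$, $\sigma \in \mathfrak{S}_n$, and $a, b \in (\mathbb{F}^n)^\times$, we have $(x^\sigma * a) * (y^\sigma * b) = (x * y)^\sigma * (a * b)$.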
Proof. Obvious.
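Example 2.47 is small enough to verify by hand or by machine. In the sketch below (an illustration added here; the function name `has_swap_symmetry` is an assumption), $C^{\langle t\rangle}$ is the line spanned by $(1, 2^t)$ over $\mathbb{F}_5$, and the transposition of the two coordinates is a symmetry exactly when $(2^t, 1)$ lies on that line, that is, when $1 - 4^t = 0$ in $\mathbb{F}_5$.

```python
P = 5  # the field F_5 of Example 2.47


def has_swap_symmetry(t):
    """True if swapping the two coordinates preserves the t-th power of C = <(1, 2)>."""
    g = (1, pow(2, t, P))        # generator of the one-dimensional code C^<t>
    swapped = (g[1], g[0])       # image of g under the transposition
    # swapped is a multiple of g iff the 2x2 determinant vanishes in F_5
    return (g[0] * swapped[1] - g[1] * swapped[0]) % P == 0


print([has_swap_symmetry(t) for t in range(1, 7)])
# [False, True, False, True, False, True]
```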
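2.50. — By the definition of $\mathrm{Aut}(\mathbb{F}^n)$ as a semidirect product, we have a split exact sequence
$$1 \longrightarrow (\mathbb{F}^n)^\times \longrightarrow \mathrm{Aut}(\mathbb{F}^n) \overset{\pi}{\longrightarrow} \mathfrak{S}_n \longrightarrow 1.$$

Definition. Given two subgroups $H, H'$ of $\mathrm{Aut}(\mathbb{F}^n)$, we write $H \mathrel{\widehat{\subseteq}} H'$ when $\pi(H) \subseteq \pi(H')$ and $H \cap (\mathbb{F}^n)^\times \subseteq H' \cap (\mathbb{F}^n)^\times$. If $H'$ is finite (for example if $\mathbb{F}$ is finite) this implies that $|H|$ divides $|H'|$. We also set $H \sim H'$ when $H \mathrel{\widehat{\subseteq}} H'$ and $H' \mathrel{\widehat{\subseteq}} H$. This implies $|H| = |H'|$.

2.51. Definition. Let $C \subseteq \mathbb{F}^n$ be a linear code. Then the group of linear automorphisms of $C$ in $\mathbb{F}^n$ is
$$\mathrm{Aut}(C) = \{(\sigma, a) \in \mathrm{Aut}(\mathbb{F}^n) ;\ C^\sigma * a = C\}.$$
We also let
$$\widehat{S}(C) = \pi(\mathrm{Aut}(C)) = \{\sigma \in \mathfrak{S}_n ;\ \exists a \in (\mathbb{F}^n)^\times,\ C^\sigma * a = C\}$$
be the group of projective symmetries of $C$, and we note that
$$\mathrm{Aut}(C) \cap (\mathbb{F}^n)^\times = \widehat{A}(C)^\times$$
by Definition 2.6, so we get an exact sequence
$$1 \longrightarrow \widehat{A}(C)^\times \longrightarrow \mathrm{Aut}(C) \overset{\pi}{\longrightarrow} \widehat{S}(C) \longrightarrow 1.$$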
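2.52. Proposition. Let $C, C' \subseteq \mathbb{F}^n$ be two linear codes. Then we have
$$\widehat{S}(C) \cap \widehat{S}(C') \subseteq \widehat{S}(C * C')$$
and
$$\widehat{A}(C)^\times \widehat{A}(C')^\times \subseteq \widehat{A}(C * C')^\times.$$
Given two integers $t, t' \ge 1$, then
$$t \mid t' \implies \mathrm{Aut}(C^{\langle t\rangle}) \mathrel{\widehat{\subseteq}} \mathrm{Aut}(C^{\langle t'\rangle}).$$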
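Proof. The first inclusion is a direct consequence of Lemma 2.49. The second follows from Proposition 2.7 (and 2.14). Then together they imply the last assertion.

2.53. — Given a subset $S \subseteq [n]$ and a partition $\mathcal{U}$ of $S$, we define
$$S(\mathcal{U}) = \{\sigma \in \mathfrak{S}_n ;\ \forall U \in \mathcal{U},\ \sigma(U) \in \mathcal{U}\}$$
and
$$A(\mathcal{U}) = \bigoplus_{U \in \mathcal{U}} \langle 1_U \rangle \oplus \bigoplus_{j \notin S} \langle 1_{\{j\}} \rangle$$
which is a subalgebra of $\mathbb{F}^n$.

2.54. Proposition. Let $v_1, \dots, v_s \in \mathbb{F}^n$, $v_i \ne 0$, be vectors with pairwise disjoint supports, and let $C = \langle v_1 \rangle \oplus \cdots \oplus \langle v_s \rangle \subseteq \mathbb{F}^n$ be the linear code they generate. Let $B_i = \mathrm{Supp}(v_i)$, so $\mathcal{U} = \{B_1, \dots, B_s\}$ is a partition of $\mathrm{Supp}(C)$. Then we have $\widehat{S}(C) = S(\mathcal{U})$ and $\widehat{A}(C) = A(\mathcal{U})$.

Proof. Obviously the $\langle v_i \rangle$ are the indecomposable components of $C$, and an automorphism of a code must map indecomposable components to indecomposable components. This implies $\widehat{S}(C) \subseteq S(\mathcal{U})$. Conversely, let $\sigma \in S(\mathcal{U})$. We have to construct $a \in (\mathbb{F}^n)^\times$ such that $C^\sigma * a = C$. First, we set the coordinates of $a$ equal to 1 out of $\mathrm{Supp}(C)$. Now $\sigma$ determines a permutation $j \mapsto j'$ of $[s]$, such that $\sigma(B_j) = B_{j'}$ (so $|B_j| = |B_{j'}|$). Then, for $i \in \mathrm{Supp}(C)$, we have $i \in B_j$ for some $j$, and we can just set $\pi_i(a) = \pi_i(v_j) / \pi_{\sigma(i)}(v_{j'})$. This gives $v_{j'}^\sigma * a = v_j$, hence, letting $j$ vary, $C^\sigma * a = C$ as claimed. Last, $\widehat{A}(C) = A(\mathcal{U})$ follows from Proposition 2.7 (and 2.14).

2.55. — Given a subset $S \subseteq [n]$ and a partition $\mathcal{U}$ of $S$, we define $C(\mathcal{U}) = \bigoplus_{U \in \mathcal{U}} \langle 1_U \rangle$.

Corollary. Let $C \subseteq \mathbb{F}^n$ be a linear code. Then for all $t \ge r(C)$ we have $\mathrm{Aut}(C^{\langle t\rangle}) \sim \mathrm{Aut}(C(\mathcal{U}))$.

Proof. Consequence of Theorem 2.35 and Proposition 2.54.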
Said otherwise, up to $\sim$, the sequence $\mathrm{Aut}(C^{\langle t\rangle})$ becomes ultimately constant. For all $t, t' \ge r(C)$ we have $\mathrm{Aut}(C^{\langle t\rangle}) \sim \mathrm{Aut}(C^{\langle t'\rangle})$.

2.56. — Here are three open problems that, for lack of time, the author did not try to address.

First, considering Example 2.47, Proposition 2.52, and Corollary 2.55, it might be interesting to compare $\mathrm{Aut}(C^{\langle t\rangle})$ and $\mathrm{Aut}(C^{\langle t'\rangle})$ for all $t, t'$, not only for $t \mid t'$ or for $t, t' \ge r(C)$. By Proposition 2.7 (and 2.14) we have $\widehat{A}(C^{\langle t\rangle}) \subseteq \widehat{A}(C^{\langle t+1\rangle})$ for all $t \ge 1$, so a key point would be to compare $\widehat{S}(C^{\langle t\rangle})$ and $\widehat{S}(C^{\langle t+1\rangle})$.

Second, note that, as defined, $\mathrm{Aut}(C)$ is a subgroup of $\mathrm{Aut}(\mathbb{F}^n)$, and its action on $C$ need not be faithful. Another perhaps equally interesting object is the group $\mathrm{Aut}^{\mathrm{in}}(C)$ of invertible linear endomorphisms of $C$ (seen as an abstract vector space) that preserve the Hamming metric. We might call $\mathrm{Aut}^{\mathrm{in}}(C)$ the group of “internally defined” automorphisms (or isometries) of $C$. Obviously, an element of $\mathrm{Aut}(C)$ acts on $C$ through an element of $\mathrm{Aut}^{\mathrm{in}}(C)$, and conversely, the MacWilliams equivalence theorem [33] shows that all elements of $\mathrm{Aut}^{\mathrm{in}}(C)$ arise in this way. So we have an identification $\mathrm{Aut}^{\mathrm{in}}(C) = \mathrm{Aut}(C) / \mathrm{Aut}^0(C)$ where $\mathrm{Aut}^0(C) \subseteq \mathrm{Aut}(C)$ is the kernel of the action of $\mathrm{Aut}(C)$ on $C$. It then appears very natural to try to compare the $\mathrm{Aut}^{\mathrm{in}}(C^{\langle t\rangle})$ as $t$ varies (and for this, it might be useful to compare the $\mathrm{Aut}^0(C^{\langle t\rangle})$ first).

Last, we were interested here only in groups acting linearly on codes. However, when $\mathbb{F}$ is a nonprime finite field, we can also consider the action of the Frobenius, which preserves the Hamming metric, leading to the notion of semilinear automorphism. One could then try to extend the study to this semilinear setting.

3. Estimates involving the dual distance

3.1. — A characterization of the dual distance $d^\perp(C)$ of a linear code $C \subseteq \mathbb{F}^n$ is as the smallest possible length of a linear dependence relation between columns of $C$. In case $C = \mathbb{F}^n$, there is no such relation, but it might then be convenient to set $d^\perp(\mathbb{F}^n) = n + 1$.

This can be rephrased as:

Lemma. Let $0 \le m \le n$. Then we have $d^\perp(C) \ge m + 1$ if and only if, for any set of indices $J \subseteq [n]$ of size $|J| = m$, and for any $j \in J$, there is a codeword $y \in C$ with coordinate $\pi_j(y) = 1$ and $\pi_{j'}(y) = 0$ for $j' \in J \setminus \{j\}$. Equivalently, $d^\perp(C) \ge m + 1$ if and only if, for any $J \subseteq [n]$ of size $|J| = m$, $\dim(\pi_J(C)) = m$.

3.2. — From this we readily derive the following properties:

Lemma. Let $C \subseteq \mathbb{F}^n$ be a linear code. Then:
(i) For any subcode $C' \subseteq C$, we have $d^\perp(C') \le d^\perp(C)$.
(ii) For any set of indices $S \subseteq [n]$, we have $d^\perp(\pi_S(C)) \ge \min(|S| + 1, d^\perp(C))$.
(iii) We have $d^\perp(C) \le \dim(C) + 1$ with equality if and only if $C$ is MDS.

3.3. — The simplest estimate involving products of codes and the dual distance is probably the following:
Proposition. Let $C_1, C_2 \subseteq \mathbb{F}^n$ be two linear codes with full support, i.e. dual distances $d_1^\perp, d_2^\perp \ge 2$. Then we have
$$d^\perp(C_1 * C_2) \ge \min(n + 1,\ d_1^\perp + d_2^\perp - 2).$$

Proof. It suffices to show that any subset $J \subseteq [n]$ of size $m = \min(n,\ d_1^\perp + d_2^\perp - 3)$ satisfies the condition in Lemma 3.1. So pick $j \in J$ and write $J \setminus \{j\} = A_1 \cup A_2$ with $|A_1| = d_1^\perp - 2$ and $|A_2| \le d_2^\perp - 2$. Then by Lemma 3.1 we can find $y_1 \in C_1$ that is 1 at $j$ and 0 over $A_1$, and $y_2 \in C_2$ that is 1 at $j$ and 0 over $A_2$. Then $y = y_1 * y_2 \in C_1 * C_2$ is 1 at $j$ and 0 over $J \setminus \{j\}$ as requested.
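The bound of Proposition 3.3 can be tested on small codes by computing dual distances through the column characterization of 3.1: $d^\perp$ is the size of the smallest linearly dependent set of columns, or $n + 1$ if there is none. The Python sketch below is an illustration added here, not the paper's method; it uses brute force over column subsets, so it is only meant for short codes over a prime field, and the names `rref`, `dual_distance`, `rs_code` and `product_generators` are assumptions.

```python
from itertools import combinations


def rref(rows, p):
    """Reduced row echelon basis, over the prime field F_p, of the span of `rows`."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [x * inv % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return m[:rank]


def dual_distance(G, p):
    """Smallest size of a linearly dependent set of columns of G, or n + 1 if none."""
    n = len(G[0])
    cols = [list(c) for c in zip(*G)]
    for m in range(1, n + 1):
        for J in combinations(range(n), m):
            if len(rref([cols[j] for j in J], p)) < m:
                return m
    return n + 1


def rs_code(k, p):
    """Generators of the Reed-Solomon code of dimension k evaluated at all of F_p."""
    return [[pow(x, i, p) for x in range(p)] for i in range(k)]


def product_generators(A, B, p):
    """Generators (not necessarily independent) of the product code A * B."""
    return [[x * y % p for x, y in zip(a, b)] for a in A for b in B]


p = 7
C1, C2 = rs_code(2, p), rs_code(3, p)
d1, d2 = dual_distance(C1, p), dual_distance(C2, p)
d12 = dual_distance(product_generators(C1, C2, p), p)
print(d1, d2, d12, min(p + 1, d1 + d2 - 2))
```

For these two Reed-Solomon codes of length 7 the product is again Reed-Solomon, and the bound of Proposition 3.3 holds with equality.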
From this one deduces the following estimate, that the author first learned from A. Couvreur:

3.4. Corollary. Let $C \subseteq \mathbb{F}^n$ have full support and no repeated column, i.e. dual distance $d^\perp \ge 3$. Then for all $t \ge 1$ we have
$$\dim(C^{\langle t\rangle}) \ge \min(n,\ 1 + (d^\perp - 2)t).$$
As a consequence $C$ has regularity
$$r(C) \le \left\lceil \frac{n - 1}{d^\perp - 2} \right\rceil$$
and for $t \ge r(C)$ we have $C^{\langle t\rangle} = \mathbb{F}^n$.

Proof. Write $\dim(C^{\langle t\rangle}) \ge d^\perp(C^{\langle t\rangle}) - 1$ and make induction on $t$ using Proposition 3.3.

In fact it is possible to say slightly better, as will be seen below.

3.5. Proposition. Let $C_1, C_2 \subseteq \mathbb{F}^n$ be two linear codes. Suppose $C_2$ has full support, i.e. dual distance $d_2^\perp \ge 2$. Then we have
$$\dim(C_1 * C_2) \ge \min(n_1,\ k_1 + d_2^\perp - 2)$$
where $n_1 = |\mathrm{Supp}(C_1)|$ and $k_1 = \dim(C_1)$.

Proof. We can suppose $C_1$ has a generator matrix of the form
$$G_1 = \begin{pmatrix} I_{k_1} & X & 0 \end{pmatrix}$$
where $I_{k_1}$ is the $k_1 \times k_1$ identity matrix, and $X$ is a $k_1 \times (n_1 - k_1)$ matrix with no zero column. Then, multiplying rows of $G_1$ with suitable codewords of $C_2$ given by Lemma 3.1, one constructs codewords in $C_1 * C_2$ that form the rows of a matrix of the form
$$\begin{pmatrix} I_k & Z \end{pmatrix}$$
with $k = \min(n_1,\ k_1 + d_2^\perp - 2)$. See [44, Lemma 6] for more details.

For example, if $C_1, C_2 \subseteq \mathbb{F}^n$ have full support, and $C_2$ is MDS of dimension $k_2$ so $d_2^\perp = k_2 + 1$, we find
$$\dim(C_1 * C_2) \ge \min(n,\ k_1 + k_2 - 1).$$

3.6. — In another direction, an easy induction on Proposition 3.5 also shows that if $C \subseteq \mathbb{F}^n$ is a linear code with full support and no repeated column, then for any integers $t_0 \ge 0$, $a \ge 1$, and $j \ge 0$, we have
$$\dim(C^{\langle t_0 + aj\rangle}) \ge \min(n,\ k_0 + (d_a^\perp - 2)j)$$
where $k_0 = \dim(C^{\langle t_0\rangle})$ and $d_a^\perp = d^\perp(C^{\langle a\rangle})$. As a consequence $C$ has regularity
$$r(C) \le t_0 + a \left\lceil \frac{n - k_0}{d_a^\perp - 2} \right\rceil$$
and for $t \ge r(C)$ we have $C^{\langle t\rangle} = \mathbb{F}^n$. We retrieve Corollary 3.4 by setting $t_0 = 0$, $k_0 = 1$, $a = 1$.

3.7. Corollary. Let $C_1, C_2 \subseteq \mathbb{F}^n$ be two linear codes. Suppose $C_2$ has full support, i.e. dual distance $d_2^\perp \ge 2$. Fix an integer $i$ in the interval $1 \le i \le \dim(C_1)$, and set
$$m = \min(w_i(C_1) - i,\ d_2^\perp - 2) \ge 0$$
where $w_i(C_1)$ is the $i$-th generalized Hamming weight of $C_1$. Then for all $j$ in the interval $1 \le j \le i + m$ we have
$$w_j(C_1 * C_2) \le w_i(C_1) - i - m + j.$$
In particular (for $C_1$ nonzero) setting $i = j = 1$ we find
$$d_{\min}(C_1 * C_2) \le \max(1,\ d_1 - d_2^\perp + 2)$$
where $d_1 = w_1(C_1) = d_{\min}(C_1)$.

Proof. Since $w_j(C_1 * C_2) \le w_{i+m}(C_1 * C_2) - i - m + j$ (proof: shortening), it suffices to show $w_{i+m}(C_1 * C_2) \le w_i(C_1)$. But then, just take $C' \subseteq C_1$ with support size $w_i(C_1)$ and dimension $i$, and observe that $C' * C_2 \subseteq C_1 * C_2$ has support size $w_i(C_1)$ and dimension at least $i + m$ by Proposition 3.5.

3.8. — The same works for the dual weights of a product, improving on 3.3:

Corollary. Let $C_1, C_2 \subseteq \mathbb{F}^n$ be two linear codes with full support. Then for all $i$ in the interval $1 \le i \le n - \dim(C_1 * C_2)$ we have
$$w_i((C_1 * C_2)^\perp) \ge w_{i + d_2^\perp - 2}(C_1^\perp).$$

Proof. By Corollary 2.40 we have $(C_1 * C_2)^\perp * C_2 \subseteq C_1^\perp$, so for all $j$,
$$w_j(C_1^\perp) \le w_j((C_1 * C_2)^\perp * C_2).$$
Set $m = \min(w_i((C_1 * C_2)^\perp) - i,\ d_2^\perp - 2) \ge 0$. Then for $1 \le j \le i + m$ we can apply Corollary 3.7 with $C_1$ replaced by $(C_1 * C_2)^\perp$, to get
$$w_j((C_1 * C_2)^\perp * C_2) \le w_i((C_1 * C_2)^\perp) - i - m + j = \max(j,\ w_i((C_1 * C_2)^\perp) - i + j - d_2^\perp + 2).$$
We combine these two inequalities, and we note that $C_1$ having full support implies $w_j(C_1^\perp) \ge j + 1$, so the only possibility left in the max is
$$w_j(C_1^\perp) \le w_i((C_1 * C_2)^\perp) - i + j - d_2^\perp + 2.$$
Now for $j = i$ this gives $w_i((C_1 * C_2)^\perp) \ge i + d_2^\perp - 2$ so in fact $m = d_2^\perp - 2$. Then setting $j = i + d_2^\perp - 2$ finishes the proof.
3.9. — The last assertion in Corollary 3.7 extends to distances with rank constraints (as defined in 1.21):
Proposition. Let $C_1, C_2 \subseteq \mathbb{F}^n$ be two linear codes of the same length $n$. Suppose $C_2$ has full support, i.e. dual distance $d_2^\perp \ge 2$. Let $C_1$ be equipped with an arbitrary rank function, and $C_2$ with the trivial rank function. Equip then $C_1 * C_2$ with the product rank function. Then for $1 \le i \le \dim(C_1)$ we have
$$d_{\min,i}(C_1 * C_2) \le \max(1,\ d_{\min,i}(C_1) - d_2^\perp + 2).$$

Proof. Let $x \in C_1$ with rank $\mathrm{rk}(x) \le i$ and weight $w = d_{\min,i}(C_1)$. Choose $j \in \mathrm{Supp}(x)$. Then Lemma 3.1 gives $y \in C_2$ nonzero at $j$ but vanishing at $m = \min(w - 1,\ d_2^\perp - 2)$ other positions in $\mathrm{Supp}(x)$. Then $z = x * y \in C_1 * C_2$ is nonzero, has rank $\mathrm{rk}(z) \le \mathrm{rk}(x)$ and weight at most $w - m$.

In particular, if $C \subseteq \mathbb{F}^n$ is a linear code with dual distance $d^\perp \ge 2$, then, with the convention of 1.20, for all $t, t' \ge 0$ we have
$$d_{\min,i}(C^{\langle t + t'\rangle}) \le \max(1,\ d_{\min,i}(C^{\langle t\rangle}) - (d^\perp - 2)t').$$

3.10. — Since the dual distance behaves nicely under projection, its use combines well with the following filtration inequality:

Lemma. Let $C, C' \subseteq \mathbb{F}^n$ be two linear codes. Suppose $C$ equipped with a filtration
$$0 = C_0 \subseteq C_1 \subseteq \cdots \subseteq C_\ell = C$$
by linear subcodes $C_i$. For $1 \le i \le \ell$ set $T_i = \mathrm{Supp}(C_i) \setminus \mathrm{Supp}(C_{i-1})$. Then we have
$$\dim(C * C') \ge \sum_{i=1}^{\ell} \dim(\pi_{T_i}(C_i) * \pi_{T_i}(C')).$$
In particular if $\ell = k = \dim(C)$ and $\dim(C_i) = i$ for all $i$, we have
$$\dim(C * C') \ge \sum_{i=1}^{k} \dim(\pi_{T_i}(C')).$$

Proof. We have a filtration $0 = C_0 * C' \subseteq C_1 * C' \subseteq \cdots \subseteq C_\ell * C' = C * C'$ where for all $i \ge 1$ we have $C_{i-1} * C' \subseteq \ker \pi_{T_i}$, hence
$$\dim(C_i * C') - \dim(C_{i-1} * C') \ge \dim(C_i * C') - \dim((C_i * C') \cap \ker \pi_{T_i}) = \dim(\pi_{T_i}(C_i * C')).$$
Then we observe $\pi_{T_i}(C_i * C') = \pi_{T_i}(C_i) * \pi_{T_i}(C')$ and we sum over $i$. In case $\ell = k = \dim(C)$ and $\dim(C_i) = i$ for all $i$, we can pick $c_i \in C_i \setminus C_{i-1}$ and we have $\pi_{T_i}(C_i) = \langle v_i \rangle$, where $v_i = \pi_{T_i}(c_i) \in \mathbb{F}^{T_i}$ has full support (except if $T_i = \emptyset$, but then the contribution is 0, which is fine). The conclusion follows.

From this, another bound involving the dual distance was established by D. Mirandola:
3.11. Theorem ([35]). Let $q$ be a prime power. Fix an odd integer $D \ge 3$. Then for all $\varepsilon > 0$, there is an integer $N$ such that, for any integers $n, k$ such that $n - k \ge N$ and for any $[n, k]_q$ linear code $C$ with dual distance $d^\perp \ge D$ we have
$$\dim(C^{\langle 2\rangle}) \ge k + \frac{D - 1}{2} \left( \frac{1}{2} - \varepsilon \right) \log_q^2(n - k).$$

The proof uses two main ingredients. The first is to transform the condition on $d^\perp$ into a lower bound on the terms $\dim(\pi_{T_i}(C))$ that appear in Lemma 3.10 (applied with $C' = C$). For this one can use any of the classical bounds of coding theory, applied to $\pi_{T_i}(C)^\perp$. For example, the Singleton bound gives $\dim(\pi_{T_i}(C)) \ge \min(|T_i|,\ d^\perp - 1)$, as already mentioned; Mirandola also uses the Hamming bound. Then, in order to optimize the resulting estimates, one needs the filtration of $C$ to be constructed with some control on the $|T_i|$; for this one uses the Plotkin bound. However this leads to quite involved computations, and a careful analysis remains necessary in order to make all this work.

3.12. — Some of the results above become especially interesting when seen from the geometric point of view. Recall from Proposition 1.28 that $r(C)$ is also equal to the Castelnuovo-Mumford regularity of the projective set of points $\Pi_C \subseteq \mathbf{P}^{k-1}$ associated to $C$. Then $\Pi_C$ admits syzygies
$$0 \longrightarrow \bigoplus_j \mathcal{O}_{\mathbf{P}^{k-1}}(-a_{k-1,j}) \longrightarrow \cdots \longrightarrow \bigoplus_j \mathcal{O}_{\mathbf{P}^{k-1}}(-a_{1,j}) \longrightarrow \mathcal{O}_{\mathbf{P}^{k-1}} \longrightarrow \mathcal{O}_{\Pi_C} \longrightarrow 0$$
for some integers $a_{i,j} \ge i$, and we have
$$r(C) = \max_{i,j}\,(a_{i,j} - i)$$
(see e.g. Chapter 4 of [19]). Thus, from estimates such as the one in Corollary 3.4 (or 3.6), we see that important information on the syzygies of $\Pi_C$ can be extracted from the dual code $C^\perp$. This situation is very similar to that of [20] although, admittedly, the results proved there are much deeper. There, the duality of codes, seen from a geometric point of view, is called Gale duality, in reference to another context where it is of use.

4. Pure bounds

Here we consider bounds on the basic parameters (dimension, distance) of a code and its powers, or of a family of codes and their product. In contrast with section 3, no other auxiliary parameter (such as a dual distance) should appear. Most of the material here will be taken from [43] and [44]. The results are more involved and in some places we will only give partial proofs, but the reader can refer to the original papers for details.

The generalized fundamental functions.

4.1. — In many applications, a linear code is “good” when both its dimension and its minimum distance are “large”. In order to measure to what extent this
is possible, it is customary to consider the “fundamental functions of linear block coding theory” given by
$$a(n, d) = \max\{k \ge 0 ;\ \exists C \subseteq \mathbb{F}^n,\ \dim(C) = k,\ d_{\min}(C) \ge d\}$$
and
$$\alpha(\delta) = \limsup_{n \to \infty} \frac{a(n, \delta n)}{n}.$$
Now suppose we need a code $C$ such that all powers $C, C^{\langle 2\rangle}, \dots, C^{\langle t\rangle}$, up to a certain $t$, are good (see 5.4-5.14 for situations where this condition naturally appears). Thanks to Theorem 2.32, to give a lower bound on the dimension and the minimum distance of all these codes simultaneously, it suffices to do so only for $\dim(C)$ and for $d_{\min}(C^{\langle t\rangle})$. This motivates the introduction of the generalized fundamental functions
$$a^{\langle t\rangle}(n, d) = \max\{k \ge 0 ;\ \exists C \subseteq \mathbb{F}^n,\ \dim(C) = k,\ d_{\min}(C^{\langle t\rangle}) \ge d\}$$
and
$$\alpha^{\langle t\rangle}(\delta) = \limsup_{n \to \infty} \frac{a^{\langle t\rangle}(n, \delta n)}{n},$$
first defined in [43].

If the base field $\mathbb{F}$ is not clear from the context, we will use more explicit notations such as $a_{\mathbb{F}}^{\langle t\rangle}$ and $\alpha_{\mathbb{F}}^{\langle t\rangle}$. Also if $q$ is a prime power, we set $a_q^{\langle t\rangle} = a_{\mathbb{F}_q}^{\langle t\rangle}$ and $\alpha_q^{\langle t\rangle} = \alpha_{\mathbb{F}_q}^{\langle t\rangle}$, where $\mathbb{F}_q$ is the finite field with $q$ elements.

4.2. — From Theorem 2.32 we get at once:

Lemma. Let $t \ge 1$. Then for all $n, d$ we have
$$a^{\langle t+1\rangle}(n, d) \le a^{\langle t\rangle}(n, d)$$
and for all $\delta$ we have
$$\alpha^{\langle t+1\rangle}(\delta) \le \alpha^{\langle t\rangle}(\delta).$$

As a consequence, any upper bound on the usual fundamental functions passes to the generalized functions. However, improvements can be obtained by working directly on the latter. Concerning lower bounds, we have the following:

4.3. Proposition. Let $t \ge 1$. Then for all $1 \le d \le n$ we have
$$a^{\langle t\rangle}(n, d) \ge \left\lfloor \frac{n}{d} \right\rfloor.$$
Moreover if $n \le |\mathbb{F}| + 1$ we also have
$$a^{\langle t\rangle}(n, d) \ge \left\lfloor \frac{n - d}{t} \right\rfloor + 1.$$

Proof. For the first inequality, partition the set $[n]$ of coordinates into $\lfloor n/d \rfloor$ subsets of size $d$ or $d + 1$, and consider the code $C$ spanned by their characteristic vectors. Then $C^{\langle t\rangle} = C$ has dimension $\lfloor n/d \rfloor$ and minimum distance $d$. For the second inequality, consider the (possibly extended) Reed-Solomon code, obtained by evaluating polynomials of degree up to $\lfloor (n-d)/t \rfloor$ at $n$ given distinct elements of $\mathbb{F}$ (or possibly also at infinity). It has dimension $\lfloor (n-d)/t \rfloor + 1$, and its $t$-th power is also a Reed-Solomon code, obtained by evaluating polynomials of degree up to $t \lfloor (n-d)/t \rfloor \le n - d$, so of minimum distance at least $d$.
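The Reed-Solomon construction in the proof of Proposition 4.3 can be reproduced on a toy case. The following Python sketch is an illustration added here under stated assumptions: it works over a prime field $\mathbb{F}_p$ with $n \le p$ evaluation points $0, \dots, n-1$ (no point at infinity), computes the minimum distance by exhaustive enumeration, and uses hypothetical helper names (`rref`, `rs_code`, `power_code`, `min_distance`).

```python
from itertools import product as tuples


def rref(rows, p):
    """Reduced row echelon basis, over the prime field F_p, of the span of `rows`."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [x * inv % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return m[:rank]


def rs_code(n, k, p):
    """Evaluate the monomials 1, x, ..., x^(k-1) at the points 0, ..., n-1 of F_p."""
    return [[pow(x, i, p) for x in range(n)] for i in range(k)]


def power_code(G, t, p):
    """Basis of the t-th power C^<t> of the code generated by G (t >= 1)."""
    base = rref(G, p)
    power = base
    for _ in range(t - 1):
        power = rref([[a * b % p for a, b in zip(u, v)]
                      for u in base for v in power], p)
    return power


def min_distance(basis, p):
    """Minimum weight of a nonzero codeword, by exhaustive enumeration."""
    n = len(basis[0])
    best = n
    for coeffs in tuples(range(p), repeat=len(basis)):
        if any(coeffs):
            word = [sum(c * b[i] for c, b in zip(coeffs, basis)) % p
                    for i in range(n)]
            best = min(best, sum(1 for x in word if x))
    return best


p, n, d, t = 7, 7, 3, 2
k = (n - d) // t + 1                  # dimension promised by Proposition 4.3
C = rs_code(n, k, p)
Ct = power_code(C, t, p)
print(k, len(Ct), min_distance(Ct, p))
```

Here $k = 3$, and the square is the Reed-Solomon code of dimension 5 and minimum distance $3 = d$, so the code $C$ witnesses $a^{\langle 2\rangle}(7, 3) \ge 3$ over $\mathbb{F}_7$.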
It will be a consequence of the product Singleton bound that these inequalities are tight, leading to the exact determination of the functions $a^{\langle t\rangle}$ and $\alpha^{\langle t\rangle}$ when $\mathbb{F}$ is infinite (see Corollary 4.10).

4.4. — On the other hand, when $\mathbb{F} = \mathbb{F}_q$ is a finite field, the corresponding (generalized) fundamental functions $a_q^{\langle t\rangle}$ and $\alpha_q^{\langle t\rangle}$ are much more mysterious. For example, note that the function $\alpha_q^{\langle t\rangle}$ is nontrivial if and only if there is an asymptotically good family of linear codes over $\mathbb{F}_q$ whose $t$-th powers also form an asymptotically good family (and then so do all powers between 1 and $t$). We let
$$\tau(q) = \sup\{t \ge 1 ;\ \exists \delta > 0,\ \alpha_q^{\langle t\rangle}(\delta) > 0\} \in \mathbb{N} \cup \{\infty\}$$
be the supremum of the integers $t$ for which this holds, for a given $q$. There is no $q$ for which it is known whether $\tau(q)$ is finite or infinite, although algebraic-geometry codes will provide examples showing that $\tau(q) \to \infty$ as $q \to \infty$. It is also true that $\tau(q) \ge 2$ for all $q$, that is, there exists an asymptotically good family of $q$-ary linear codes whose squares also form an asymptotically good family. But as we will see, to include the case of small $q$ requires a quite intricate construction. This leads to the author’s favorite open problem on this topic: try to improve (any side of) the estimate
$$2 \le \tau(2) \le \infty.$$
That is, answer one of these two questions: does there exist an asymptotically good family of binary linear codes whose cubes also form an asymptotically good family? or instead of cubes, is it possible with powers of some arbitrarily high given degree?

An upper bound: Singleton.

4.5. — The Singleton bound is one of the simplest upper bounds on the parameters of (possibly nonlinear) codes. In the linear case, it states that for any $C \subseteq \mathbb{F}^n$ of dimension $k$ and minimum distance $d$, we have $k + d \le n + 1$. At least three strategies of proof can be devised:
(i) Shortening. Shorten $C$ at any set of coordinates $I$ of size $|I| = k - 1$, that is, consider the subcode made of codewords vanishing at $I$. This subcode has codimension at most $k - 1$, since it is defined by the vanishing of $k - 1$ linear forms, so it is nonzero. Hence $C$ contains a nonzero codeword $c$ supported in $[n] \setminus I$, and $d \le w(c) \le n - k + 1$.
(ii) Duality. Let $H$ be a parity-check matrix for $C$, and let $C^\perp$ be the dual code. Recall that codewords of $C$ are precisely linear relations between columns of $H$. So $d_{\min}(C) \ge d$ means any $d - 1$ columns of $H$ are linearly independent, hence $n - k = \dim(C^\perp) = \mathrm{rk}(H) \ge d - 1$.
(iii) Puncturing. Puncture $C$ at any set of coordinates $J$ of size $|J| = n - k + 1$. By dimension, the corresponding projection $C \longrightarrow (\mathbb{F}_q)^{[n] \setminus J}$ is not injective. This means there are two codewords in $C$ that differ only over $J$, hence $d \le |J| \le n - k + 1$.
Of course these methods are not entirely independent, since shortening is somehow the dual operation to puncturing. Note that proofs (i) and (ii) work only in the linear case, while proof (iii) remains valid for general codes (suppose $\mathbb{F} = \mathbb{F}_q$ finite, set $k = \log_q |C|$, and use cardinality instead of dimension to show the projection noninjective). Also a variant of proof (iii) is to puncture at a set of coordinates of size $d - 1$ (instead of $n - k + 1$), and conclude using injectivity of the projection (instead of noninjectivity).

4.6. — Now let $C_1, \dots, C_t \subseteq \mathbb{F}^n$ be linear codes, and set $k_i = \dim(C_i)$ and $d = d_{\min}(C_1 * \cdots * C_t)$. By the shortening argument of (i) above, we see that for any choice of $I_i \subseteq [n]$ of size $|I_i| = k_i - 1$, there is a nonzero codeword $c_i \in C_i$ supported in $[n] \setminus I_i$. If we could do so in such a way that the intersection of the supports of the $c_i$ is nonempty (and the $I_i$ are pairwise disjoint), then $c_1 * \cdots * c_t$ would be a nonzero codeword in $C_1 * \cdots * C_t$ of weight at most $n - (k_1 - 1) - \cdots - (k_t - 1)$. Although this argument is incomplete, it makes plausible that, perhaps under a few additional hypotheses, the Singleton bound should extend to products of codes essentially in the form of a linear inequality
$$k_1 + \cdots + k_t + d \le n + t.$$
A result of this sort has been proved in [44] and will be discussed in 4.9 below. It turns out that the case $t = 2$ is already of interest:

4.7. Proposition. Let $C_1, C_2 \subseteq \mathbb{F}^n$ be linear codes, with nondisjoint supports. Set $k_i = \dim(C_i)$ and $d = d_{\min}(C_1 * C_2)$. Then
$$d \le \max(1,\ n - k_1 - k_2 + 2).$$

It is interesting to try to prove this result with each of the three methods in 4.5. Quite surprisingly, the shortening approach (i) does not appear to adapt easily (or at least, the author did not succeed). On the other hand, the duality approach (ii) and the puncturing approach (iii) will give two very different proofs.

A first common step is to reduce to the case where $C_1$ and $C_2$ both have full support, by projecting on $I = \mathrm{Supp}(C_1) \cap \mathrm{Supp}(C_2)$. Indeed, setting $\overline{C}_i = \pi_I(C_i)$, $\overline{k}_i = \dim(\overline{C}_i)$, $\overline{n} = |I|$, and $J_i = \mathrm{Supp}(C_i) \setminus I$, we have $d_{\min}(\overline{C}_1 * \overline{C}_2) = d$ and $\overline{k}_i \ge k_i - |J_i|$, so $\overline{n} - \overline{k}_1 - \overline{k}_2 + 2 \le n - k_1 - k_2 + 2$. Hence the result holds for $C_1$ and $C_2$ as soon as it holds for $\overline{C}_1$ and $\overline{C}_2$. So in the two proofs below we suppose that $C_1$ and $C_2$ both have full support. Also we set $d_i = d_{\min}(C_i)$.

First proof of Proposition 4.7. We will reason by duality (the reader can check that our argument reduces to 4.5(ii) when $C_1 = 1$). If $d = 1$ the proof is finished, so we can suppose $d \ge 2$ and we have to show $d \le n - k_1 - k_2 + 2$. First, by Corollary 2.40 we have
$$n - k_2 = \dim(C_2^\perp) \ge \dim(C_1 * (C_1 * C_2)^\perp),$$
while by Proposition 3.5
$$\dim(C_1 * (C_1 * C_2)^\perp) \ge \min(n,\ k_1 + d - 2).$$
Then Corollary 3.7 gives $d \le d_1$, so $k_1 + d \le n + 1$ by the classical Singleton bound. Thus
$$\min(n,\ k_1 + d - 2) = k_1 + d - 2$$
and we conclude.

Second proof of Proposition 4.7. We distinguish two cases:
• high dimension: suppose $k_1 + k_2 > n$, show $d = 1$
• low dimension: suppose $k_1 + k_2 \le n$, show $d \le n - k_1 - k_2 + 2$.

To start with, we reduce the low dimension case to the high dimension case using a puncturing argument similar to 4.5(iii). So suppose $k_1 + k_2 \le n$, and puncture at any set of coordinates $J$ of size $|J| = n - k_1 - k_2 + 1$. If one of the projections $C_i \longrightarrow \mathbb{F}^{[n] \setminus J}$ is not injective, then $d_i \le |J| \le n - k_1 - k_2 + 2$, and the proof is finished since $d \le d_i$ by Corollary 3.7. On the other hand, if both projections are injective, set $\overline{C}_i = \pi_{[n] \setminus J}(C_i)$ and $\overline{n} = k_1 + k_2 - 1$. Then we have $\dim(\overline{C}_i) = k_i$ with $k_1 + k_2 > \overline{n}$, while $d \le d_{\min}(\overline{C}_1 * \overline{C}_2) + |J|$. Replacing $C_i$ with $\overline{C}_i$ we are now reduced to the high dimension case. It is treated in the following Lemma.

4.8. Lemma. Let $C_1, C_2 \subseteq \mathbb{F}^n$ be linear codes, both with full support. Set $k_i = \dim(C_i)$ and suppose $k_1 + k_2 > n$. Then there are codewords $c_1 \in C_1$ and $c_2 \in C_2$, the product of which has weight $w(c_1 * c_2) = 1$. In particular we have $d_{\min}(C_1 * C_2) = d_{\min,1}(C_1 * C_2) = 1$.

This Lemma was first proved by N. Kashyap, as follows:

Proof. Let $H_i$ be a parity-check matrix for $C_i$. It is enough to find a pair of disjoint subsets $A_1, A_2 \subseteq [n]$ and a coordinate $j \in [n] \setminus (A_1 \cup A_2)$ such that, for each $i = 1, 2$:
• the columns of $H_i$ indexed by $A_i$ are linearly independent
• the columns of $H_i$ indexed by $A_i \cup \{j\}$ are linearly dependent.
These can be found by a simple greedy algorithm:

    Initialize A1 = A2 = ∅
    FOR j = 1, . . . , n
        IF columns of H1 indexed by A1 ∪ {j} are independent THEN
            append j to A1
        ELSE IF columns of H2 indexed by A2 ∪ {j} are independent THEN
            append j to A2
        ELSE
            Output A1, A2, j and STOP.

The stopping criterion must be met for some value of $j$, since $\mathrm{rk}(H_1) + \mathrm{rk}(H_2) = (n - k_1) + (n - k_2) < n$.

It is remarkable that, in this case, we can show that the minimum distance of $C_1 * C_2$ is attained by a codeword in product form (compare with Example 1.22). We now turn to the product Singleton bound for general $t$.
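Returning to the proof of Lemma 4.8, the greedy procedure can be sketched in Python as follows. This is an illustration added here, not Kashyap's or the author's code: it assumes a prime field $\mathbb{F}_p$, takes the parity-check matrices as lists of rows, indexes coordinates from 0, and uses hypothetical helper names (`rref`, `nullspace`, `col_rank`, `greedy_sets`, `relation_codeword`). Once $A_1, A_2, j$ are found, the unique (up to scalar) dependence relation among the columns of $H_i$ indexed by $A_i \cup \{j\}$ is a codeword $c_i \in C_i$ that is nonzero at $j$ and supported in $A_i \cup \{j\}$, so $c_1 * c_2$ is supported at $j$ alone.

```python
def rref(rows, p):
    """Reduced row echelon basis, over the prime field F_p, of the span of `rows`."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [x * inv % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return m[:rank]


def nullspace(M, ncols, p):
    """Basis of {x in F_p^ncols : M x = 0}."""
    R = rref(M, p)
    pivots = [next(j for j, x in enumerate(r) if x) for r in R]
    basis = []
    for f in (j for j in range(ncols) if j not in pivots):
        x = [0] * ncols
        x[f] = 1
        for r, pc in zip(R, pivots):
            x[pc] = -r[f] % p
        basis.append(x)
    return basis


def col_rank(H, cols, p):
    """Rank of the submatrix of H made of the columns listed in `cols`."""
    return len(rref([[row[c] for c in cols] for row in H], p))


def greedy_sets(H1, H2, n, p):
    """Greedy search for disjoint A1, A2 and a coordinate j as in the proof of Lemma 4.8."""
    A1, A2 = [], []
    for j in range(n):
        if col_rank(H1, A1 + [j], p) == len(A1) + 1:
            A1.append(j)
        elif col_rank(H2, A2 + [j], p) == len(A2) + 1:
            A2.append(j)
        else:
            return A1, A2, j
    raise ValueError("no stopping index: the lemma needs rk(H1) + rk(H2) < n")


def relation_codeword(H, A, j, n, p):
    """Codeword supported in A + [j], nonzero at j, read off the column dependence."""
    cols = A + [j]
    (x,) = nullspace([[row[c] for c in cols] for row in H], len(cols), p)
    c = [0] * n
    for pos, val in zip(cols, x):
        c[pos] = val
    return c


# Hypothetical example over F_3 with n = 4, k1 = 3, k2 = 2 (so k1 + k2 > n).
H1 = [[1, 1, 1, 1]]
H2 = [[1, 1, 0, 0], [0, 0, 1, 1]]
A1, A2, j = greedy_sets(H1, H2, 4, 3)
c1 = relation_codeword(H1, A1, j, 4, 3)
c2 = relation_codeword(H2, A2, j, 4, 3)
print(A1, A2, j, c1, c2, [a * b % 3 for a, b in zip(c1, c2)])
```

The design follows the proof literally: the rank tests play the role of the independence checks, and the final nullspace computation makes explicit the step "codewords of $C_i$ are linear relations between columns of $H_i$".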
4.9. Theorem. Let C1 , . . . , Ct ⊆ Fn be linear codes; if t ≥ 3 suppose these codes all have full support. Then there are codewords c1 ∈ C1 , . . . , ct ∈ Ct , the product of which has weight 1 ≤ w(c1 ∗ · · · ∗ ct ) ≤ max(t − 1, n − (k1 + · · · + kt ) + t) where ki = dim(Ci ). As a consequence we have dmin (C1 ∗ · · · ∗ Ct ) ≤ dmin ,1 (C1 ∗ · · · ∗ Ct ) ≤ max(t − 1, n − (k1 + · · · + kt ) + t). The proof is a direct generalization of the second proof of Proposition 4.7 above. The puncturing step is essentially the same, so the only difficulty is to find the proper generalization of Lemma 4.8. We refer to [44] for the details. By Proposition 4.3 we see that this upper bound is tight. Also it turns out that, for t ≥ 3, the condition that the codes have full support is necessary. Actually, projecting on the intersection of the supports (as in the case t = 2) allows to slightly relax this condition, but not to remove it entirely. More details can be found in [44]. 4.10. Corollary. For any field F we have n for 1 ≤ d ≤ t, at (n, d) = d n−d at (n, d) = +1 for t < d ≤ n ≤ |F| + 1, t and (in case F finite) n−d at (n, d) ≤ +1 for t < d ≤ n with n > |F| + 1. t Likewise, we have αt (0) = 1, and αt (δ) ≤
1−δ t
for 0 < δ ≤ 1
with equality when F is infinite. Proof. Consequence of Theorem 4.9 and Proposition 4.3, and of the inequality at (n, d) ≤ ad (n, d) (Lemma 4.2) in case d ≤ t. It should be noted that the bound in Theorem 4.9 holds not only for the minimal distance dmin of the product code, but also for the distance with rank constraint dmin,1 . As a consequence, the estimates in Corollary 4.10 hold in fact for the functions at (n, d)1 = max{k ≥ 0 ; ∃C ⊆ Fn , dim(C) = k, dmin,1 (C t ) ≥ d} 1 and αt (δ)1 = lim supn→∞ a(n,δn) . n Another interesting remark is that, for t ≥ 2, the function αt is not continuous at δ = 0. This might make wonder whether the definitions of the functions at and αt are the “right” ones. For example one could ask how these functions are modified when one considers only codes with no repeated columns. Lower bounds for q large: AG codes.
4.11. — When $\mathbb{F}=\mathbb{F}_q$ is a finite field, the inequality $a_q^{\langle t\rangle}(n,d)\le\left\lfloor\frac{n-d}{t}\right\rfloor+1$ in Corollary 4.10 might be strict for $n>q+1$, because the length of Reed-Solomon codes is bounded.
HUGUES RANDRIAMBOLOLONA
In this setting, a classical way to get codes sharing most of the good properties of Reed-Solomon codes, but without this limitation on the length, is to consider so-called algebraic-geometry codes constructed from curves of higher genus. We recall from Example 1.7 that $C(D,G)$ is the code obtained by evaluating functions from the Riemann-Roch space $L(D)$, associated with a divisor $D$, at a set $G$ of points outside the support of $D$, on an algebraic curve over $\mathbb{F}$.

4.12. Proposition. Let $q$ be a prime power and $t\ge 1$ an integer. Suppose there is a curve $X$ of genus $g$ over $\mathbb{F}_q$ having at least $n$ rational points. Then we have
\[ a_q^{\langle t\rangle}(n,d)\ge\left\lfloor\frac{n-d}{t}\right\rfloor+1-g \quad\text{for } t<d\le n-tg. \]

Proof. Let $G$ be a set of $n$ rational points on $X$, and $D$ a divisor of degree $\deg(D)=\left\lfloor\frac{n-d}{t}\right\rfloor$ with support disjoint from $G$. Then we have $g\le\deg(D)\le t\deg(D)<n$, so by the Goppa estimates
\[ \dim(C(D,G))=l(D)\ge\deg(D)+1-g \quad\text{and}\quad d_{\min}(C(tD,G))\ge n-t\deg(D)\ge d. \]
The conclusion follows since $C(D,G)^{\langle t\rangle}\subseteq C(tD,G)$ by Example 1.7.
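The genus 0 case of this estimate is the Reed-Solomon case, and it can be checked by brute force on a small example. The following is only a sketch under stated assumptions: the field $\mathbb{F}_7$, the choice $n=7$, $k=2$, and all function names are illustrative and not taken from the text. It builds a Reed-Solomon code $C$, computes its square as the span of componentwise products of basis vectors, and finds the minimum distance by exhaustive enumeration.

```python
from itertools import product

P = 7  # illustrative prime field size, so arithmetic is plain arithmetic mod P

def rs_basis(k, points):
    """Rows x^0, ..., x^(k-1) evaluated at the given points of F_P."""
    return [[pow(x, j, P) for x in points] for j in range(k)]

def row_reduce(rows):
    """Return a basis of the span of the given rows over F_P."""
    basis = []
    for row in rows:
        row = list(row)
        for b in basis:
            pivot = next(i for i, x in enumerate(b) if x)
            if row[pivot]:
                f = row[pivot]
                row = [(x - f * y) % P for x, y in zip(row, b)]
        if any(row):
            pivot = next(i for i, x in enumerate(row) if x)
            inv = pow(row[pivot], -1, P)  # modular inverse (Python 3.8+)
            basis.append([(x * inv) % P for x in row])
    return basis

def product_code(basis1, basis2):
    """Basis of C1 * C2: span of componentwise products of basis vectors."""
    prods = [[(a * b) % P for a, b in zip(u, v)] for u in basis1 for v in basis2]
    return row_reduce(prods)

def min_distance(basis):
    """Minimum weight of a nonzero codeword, by exhaustive enumeration."""
    n = len(basis[0])
    best = n
    for coeffs in product(range(P), repeat=len(basis)):
        if not any(coeffs):
            continue
        word = [sum(c * b[i] for c, b in zip(coeffs, basis)) % P for i in range(n)]
        best = min(best, sum(1 for x in word if x))
    return best

points = list(range(P))      # n = 7 evaluation points
C = rs_basis(2, points)      # Reed-Solomon code with k = 2
C2 = product_code(C, C)      # its square under componentwise multiplication
```

With these choices the square has dimension 3 and minimum distance 5, which is the value $n-2k+2$ allowed by Theorem 4.9 for $t=2$, so a code of dimension $2=\lfloor(7-5)/2\rfloor+1$ is attained for $d=5$.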
4.13. — We let $N_q(g)$ be the largest integer $n$ such that there is a curve of genus $g$ over $\mathbb{F}_q$ with $n$ rational points, and we define the Ihara constant
\[ A(q)=\limsup_{g\to\infty}\frac{N_q(g)}{g}. \]

Corollary. For any prime power $q$ and for any integer $t\ge 1$ we have
\[ \alpha_q^{\langle t\rangle}(\delta)\ge\frac{1-\delta}{t}-\frac{1}{A(q)} \quad\text{for } 0<\delta\le 1-\frac{t}{A(q)}. \]
As a consequence, $\tau(q)\ge\lceil A(q)\rceil-1$.

Proof. Apply Proposition 4.12 with $g\to\infty$ and $n/g\to A(q)$.
4.14. — For any prime power $q$ we have [17] $A(q)\le q^{1/2}-1$, and in the other direction the following are known:
(i) There is a constant $c>0$ such that, for any prime power $q$, we have [51] $A(q)>c\log(q)$.
(ii) If $p$ is a prime, then for any integer $s\ge 1$ we have [25] $A(p^{2s})=p^s-1$.
(iii) If $p$ is a prime, then for any integer $s\ge 1$ we have [22]
\[ A(p^{2s+1})\ge\left(\frac{1}{2}\left(\frac{1}{p^s-1}+\frac{1}{p^{s+1}-1}\right)\right)^{-1}. \]
ON PRODUCTS AND POWERS OF LINEAR CODES
Then from Corollary 4.13 and from (i) we see that $\tau(q)\to\infty$ for $q\to\infty$, as claimed in 4.4. Moreover, we also see that the lower bound in Corollary 4.13 asymptotically matches the upper bound in Corollary 4.10.

Lower bounds for q small: concatenation.

4.15. — The lower bound in Corollary 4.13 is too weak to give any nontrivial information on $\tau(q)$ for $q$ small. A useful tool in such a situation is concatenation, which allows one to construct codes over small alphabets from codes over large alphabets. For technical reasons we present this notion in a more general context.

First, if $A_1,\dots,A_t$ and $B$ are sets, seen as "alphabets", and if $\Phi:A_1\times\cdots\times A_t\to B$ is any map, then for any integer $n$, applying $\Phi$ componentwise we get a map that (by a slight abuse of notation) we also denote $\Phi:(A_1)^n\times\cdots\times(A_t)^n\to B^n$. From this point on, we can proceed as in 1.17: given subsets $S_1\subseteq(A_1)^n,\dots,S_t\subseteq(A_t)^n$ we let
\[ \dot\Phi(S_1,\dots,S_t)=\{\Phi(c_1,\dots,c_t)\ ;\ c_1\in S_1,\dots,c_t\in S_t\}\subseteq B^n. \]
If moreover $A_1,\dots,A_t,B$ are $\mathbb{F}$-vector spaces and $C_1\subseteq(A_1)^n,\dots,C_t\subseteq(A_t)^n$ are $\mathbb{F}$-linear subspaces, we let
\[ \Phi(C_1,\dots,C_t)=\langle\dot\Phi(C_1,\dots,C_t)\rangle\subseteq B^n. \]
For instance, if $t=2$, $A_1=A_2=B=\mathbb{F}$, and $\Phi$ is multiplication in $\mathbb{F}$, we retrieve the definition of $C_1*C_2$ given in 1.3.

4.16. — Concatenation in the usual sense corresponds to $t=1$, $A=\mathbb{F}_{q^r}$, $B=(\mathbb{F}_q)^m$, and $\varphi:\mathbb{F}_{q^r}\to(\mathbb{F}_q)^m$ an injective $\mathbb{F}_q$-linear map. Using the natural identification $((\mathbb{F}_q)^m)^n=(\mathbb{F}_q)^{nm}$, we see that if $C$ is a $[n,k]_{q^r}$ code, then $\varphi(C)$ is a $[nm,kr]_q$ code. In the classical terminology, $\varphi(C)\subseteq(\mathbb{F}_q)^{nm}$ is called the concatenated code obtained from the external code $C\subseteq(\mathbb{F}_{q^r})^n$ and the internal code $C_\varphi=\varphi(\mathbb{F}_{q^r})\subseteq(\mathbb{F}_q)^m$, using the symbol mapping $\varphi$. It is easily seen that we have $d_{\min}(\varphi(C))\ge d_{\min}(C)\,d_{\min}(C_\varphi)$.

Now let $t\ge 2$ be an integer. It turns out that if $\varphi$ is well chosen, then there is a $T\ge t$ such that $d_{\min}(\varphi(C)^{\langle t\rangle})$ can be estimated from $d_{\min}(C^{\langle T\rangle})$ similarly.
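To state this more precisely we need to introduce the following notations:

4.17. — Suppose we are given $h$ symmetric maps $\psi_1:(\mathbb{F}_{q^r})^t\to\mathbb{F}_{q^r},\ \dots,\ \psi_h:(\mathbb{F}_{q^r})^t\to\mathbb{F}_{q^r}$ such that:
• viewed over $\mathbb{F}_{q^r}$, each $\psi_i$ is a polynomial of degree $D_i\le T$
• viewed over $\mathbb{F}_q$, each $\psi_i$ is $t$-multilinear.
In case $t>q$, suppose also these $\psi_i$ satisfy the Frobenius exchange condition A.5 in Appendix A. Then by Theorem A.7, the map
\[ \Psi=(\psi_1,\dots,\psi_h):(\mathbb{F}_{q^r})^t\to(\mathbb{F}_{q^r})^h \]
admits a symmetric algorithm, that is, one can find an integer $m$ and $\mathbb{F}_q$-linear maps $\varphi:\mathbb{F}_{q^r}\to(\mathbb{F}_q)^m$ and $\omega:(\mathbb{F}_q)^m\to(\mathbb{F}_{q^r})^h$ such that the following diagram is commutative
\[
\begin{array}{ccc}
(\mathbb{F}_{q^r})^t & \xrightarrow{\ \Psi\ } & (\mathbb{F}_{q^r})^h \\
{\scriptstyle(\varphi,\dots,\varphi)}\big\downarrow & & \big\uparrow{\scriptstyle\omega} \\
((\mathbb{F}_q)^m)^t & \xrightarrow{\ \ast\ } & (\mathbb{F}_q)^m
\end{array}
\]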
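where $*$ is componentwise multiplication in $(\mathbb{F}_q)^m$. Now suppose both $\varphi$ and $\omega$ are injective.

4.18. Proposition. Let $C$ be a $[n,k]$ code over $\mathbb{F}_{q^r}$. With the notations just above, suppose either:
• each polynomial $\psi_i$ is homogeneous of degree $D_i$, or
• $C$ contains the all-1 word $1_{[n]}$.
Then the $[nm,kr]$ code $\varphi(C)$ over $\mathbb{F}_q$ satisfies
\[ d_{\min}(\varphi(C)^{\langle t\rangle})\ge d_{\min}(C^{\langle T\rangle}). \]

Proof. This is a straightforward generalization of [43, Prop. 8]. Following the conventions in 4.15, first viewing $\psi_i$ as a symmetric $t$-multilinear map, it defines an $\mathbb{F}_q$-linear subspace $\psi_i(C,\dots,C)\subseteq(\mathbb{F}_{q^r})^n$. Then viewing $\psi_i$ as a polynomial of degree $D_i$, and using our hypothesis that either $\psi_i$ is homogeneous or $C$ contains $1_{[n]}$, we deduce $\psi_i(C,\dots,C)\subseteq C^{\langle D_i\rangle}$. The commutative diagram in 4.17 then translates into the diagram
\[
\begin{array}{ccc}
(C)^t & \xrightarrow{\ \Psi\ } & C^{\langle D_1\rangle}\oplus\cdots\oplus C^{\langle D_h\rangle} \\
{\scriptstyle(\varphi,\dots,\varphi)}\big\downarrow & & \big\uparrow{\scriptstyle\omega} \\
(\varphi(C))^t & \xrightarrow{\ \ast\ } & \varphi(C)^{\langle t\rangle}.
\end{array}
\]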
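Now let $z\in\varphi(C)^{\langle t\rangle}$ be a nonzero codeword of minimum weight $w(z)=d_{\min}(\varphi(C)^{\langle t\rangle})$. From the diagram just above we can write $\omega(z)=(c_1,\dots,c_h)$ with $c_i\in C^{\langle D_i\rangle}$, and since $\omega$ is injective, there is at least one $i$ such that $c_i\ne 0$. On the other hand $\omega$ is defined blockwise on $(\mathbb{F}_q)^{nm}=((\mathbb{F}_q)^m)^n$, so $w(z)\ge w(c_i)\ge d_{\min}(C^{\langle D_i\rangle})$. We conclude since $D_i\le T$ implies $d_{\min}(C^{\langle D_i\rangle})\ge d_{\min}(C^{\langle T\rangle})$ by Theorem 2.32.

4.19. Theorem. Keep the notations above and suppose $t\le q$. Set
\[ m=\binom{r+t-1}{t} \quad\text{and}\quad T=q^{\lfloor(t-1)r/t\rfloor}+q^{\lfloor(t-2)r/t\rfloor}+\cdots+q^{\lfloor r/t\rfloor}+1. \]
Then:
(i) We have
\[ a_q^{\langle t\rangle}(nm,d)\ge r\,a_{q^r}^{\langle T\rangle}(n,d) \quad\text{for all } 1\le d\le n. \]
(ii) We have
\[ \alpha_q^{\langle t\rangle}(\delta)\ge\frac{r}{m}\,\alpha_{q^r}^{\langle T\rangle}(m\delta) \quad\text{for all } 0\le\delta\le 1/m. \]
(iii) If $\tau(q^r)\ge T$, then $\tau(q)\ge t$.

Proof. Since $t\le q$, any symmetric $t$-multilinear map admits a symmetric algorithm by Theorem A.7. Equivalently, for any $\mathbb{F}_q$-vector space $V$, the space of symmetric tensors $\mathrm{Sym}^t_{\mathbb{F}_q}(V)$ is spanned by elementary symmetric tensors. We apply this with $V=(\mathbb{F}_{q^r})^\vee$, so we can find $m$ linear forms $\varphi_1,\dots,\varphi_m\in(\mathbb{F}_{q^r})^\vee$ such that $\varphi_1^{\otimes t},\dots,\varphi_m^{\otimes t}$ span $\mathrm{Sym}^t_{\mathbb{F}_q}((\mathbb{F}_{q^r})^\vee)=(S^t_{\mathbb{F}_q}\mathbb{F}_{q^r})^\vee$. Note that this implies that $\varphi_1,\dots,\varphi_m$ span $(\mathbb{F}_{q^r})^\vee$, so $\varphi=(\varphi_1,\dots,\varphi_m):\mathbb{F}_{q^r}\to(\mathbb{F}_q)^m$ is injective, and also that $\Phi=(\varphi_1^{\otimes t},\dots,\varphi_m^{\otimes t}):(\mathbb{F}_{q^r})^t\to(\mathbb{F}_q)^m$ induces an isomorphism $S^t_{\mathbb{F}_q}\mathbb{F}_{q^r}\simeq(\mathbb{F}_q)^m$ of $\mathbb{F}_q$-vector spaces. However, by Theorem B.7 we also have an isomorphism $S^t_{\mathbb{F}_q}\mathbb{F}_{q^r}\simeq\prod_{I\in S}\mathbb{F}_{q^{r_I}}$ (for some $r_I\mid r$), induced by a symmetric $t$-multilinear map $\Psi:(\mathbb{F}_{q^r})^t\to\prod_{I\in S}\mathbb{F}_{q^{r_I}}$, so all this fits in a commutative diagram:
\[
\begin{array}{ccc}
(\mathbb{F}_{q^r})^t & \xrightarrow{\ \Psi\ } & \prod_{I\in S}\mathbb{F}_{q^{r_I}}\subseteq(\mathbb{F}_{q^r})^{|S|} \\
{\scriptstyle(\varphi,\dots,\varphi)}\big\downarrow & \searrow{\scriptstyle\Phi} & \big\uparrow{\scriptstyle\simeq} \\
((\mathbb{F}_q)^m)^t & \xrightarrow{\ \ast\ } & (\mathbb{F}_q)^m.
\end{array}
\]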
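The components of $\Psi$ are homogeneous polynomials $S_I$, which can be chosen of degree $D_I\le T$ by Corollary B.22. Now we only have to apply Proposition 4.18 to get (i), from which (ii) and (iii) follow.

4.20. Example. For $t=2$, the bounds in (i) and (ii) give
\[ a_q^{\langle 2\rangle}(r(r+1)n/2,\ d)\ge r\,a_{q^r}^{\langle q^{\lfloor r/2\rfloor}+1\rangle}(n,d) \]
and
\[ \alpha_q^{\langle 2\rangle}(\delta)\ge\frac{2}{r+1}\,\alpha_{q^r}^{\langle q^{\lfloor r/2\rfloor}+1\rangle}(r(r+1)\delta/2) \]
for any prime power $q$ and for any $r$. For $t=3$, they give
\[ a_q^{\langle 3\rangle}(r(r+1)(r+2)n/6,\ d)\ge r\,a_{q^r}^{\langle q^{\lfloor 2r/3\rfloor}+q^{\lfloor r/3\rfloor}+1\rangle}(n,d) \]
and
\[ \alpha_q^{\langle 3\rangle}(\delta)\ge\frac{6}{(r+1)(r+2)}\,\alpha_{q^r}^{\langle q^{\lfloor 2r/3\rfloor}+q^{\lfloor r/3\rfloor}+1\rangle}(r(r+1)(r+2)\delta/6) \]
for any $q\ge 3$ and for any $r$. Note that the proof does not apply for $q=2$, since it requires $t\le q$.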
4.21. — Let $q$ be a prime power and let $t\le q$ be an integer. Then for any integer $r$, combining Corollary 4.13 with Theorem 4.19(ii) we find
\[ \alpha_q^{\langle t\rangle}(\delta)\ge\frac{r}{m}\left(\frac{1-m\delta}{T}-\frac{1}{A(q^r)}\right) \]
where $m=\binom{r+t-1}{t}$ and $T=q^{\lfloor(t-1)r/t\rfloor}+\cdots+q^{\lfloor r/t\rfloor}+1$.
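The bound just stated is easy to evaluate exactly. The sketch below does so with rational arithmetic; the function names are illustrative, the floors in the exponents of $T$ are read off from Theorem 4.19, and the lower bound on $A(p^{2s+1})$ is the one quoted in 4.14(iii). It is meant as a numerical sanity check, not as part of the argument.

```python
from fractions import Fraction
from math import comb

def concat_parameters(q, r, t):
    """m = binom(r+t-1, t) and T = q^floor((t-1)r/t) + ... + q^floor(r/t) + 1."""
    m = comb(r + t - 1, t)
    T = sum(q ** ((j * r) // t) for j in range(t))  # the j = 0 term is the final 1
    return m, T

def ihara_lower_bound_odd(p, s):
    """Lower bound on A(p^(2s+1)) quoted in 4.14(iii), as an exact fraction."""
    return 2 / (Fraction(1, p**s - 1) + Fraction(1, p**(s + 1) - 1))

def alpha_lower_bound(q, r, t, A):
    """Coefficients (c0, c1) such that the bound of 4.21 reads c0 - c1*delta."""
    m, T = concat_parameters(q, r, t)
    c0 = Fraction(r, m) * (Fraction(1, T) - 1 / A)
    c1 = Fraction(r, T)
    return c0, c1

# Illustrative choice: q = 2, r = 9 = 2*4 + 1, t = 2.
A_bound = ihara_lower_bound_odd(2, 4)
c0, c1 = alpha_lower_bound(2, 9, 2, A_bound)
```

For $q=2$, $r=9$, $t=2$ this gives $m=45$, $T=17$, the bound $A(2^9)\ge 465/23$, and the coefficients $74/39525$ and $9/17$.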
For this to be nontrivial we need $T<A(q^r)$. Since $A(q^r)\le q^{r/2}-1$, this can happen only for $t=2$ and $r$ odd, and it turns out we have indeed $T<A(q^r)$ in this case: this follows from 4.14(ii) if $q$ is a square, and from 4.14(iii) otherwise. As a consequence we see $\tau(q)\ge 2$ for all $q$. In particular for $q=2$ and $r=9$ we find $m=45$, $T=17$, and $A(2^9)\ge\frac{465}{23}$, so [43]
\[ \alpha_2^{\langle 2\rangle}(\delta)\ge\frac{74}{39525}-\frac{9}{17}\,\delta\approx 0.001872-0.5294\,\delta. \]

5. Some applications

The product of codes being such a natural operation, it is no wonder it has already been used for a long time, implicitly or explicitly, in numerous applications. Our aim here is to quickly survey the most significant of these applications, without entering too much into the historical details (for which the reader can refer to the literature), but rather focusing on where the various bounds, structural results, and geometric interpretations presented in this text can be brought into play.

Multilinear algorithms.

5.1. — Let $V_1,\dots,V_t$ and $W$ be finite-dimensional $\mathbb{F}$-vector spaces, and let $\Phi:V_1\times\cdots\times V_t\to W$ be a $t$-multilinear map. A multilinear algorithm of length $n$ for $\Phi$ is a collection of $t+1$ linear maps $\varphi_1:V_1\to\mathbb{F}^n,\ \dots,\ \varphi_t:V_t\to\mathbb{F}^n$ and $\omega:\mathbb{F}^n\to W$, such that the following diagram commutes:
\[
\begin{array}{ccc}
V_1\times\cdots\times V_t & \xrightarrow{\ \Phi\ } & W \\
{\scriptstyle(\varphi_1,\dots,\varphi_t)}\big\downarrow & & \big\uparrow{\scriptstyle\omega} \\
\mathbb{F}^n\times\cdots\times\mathbb{F}^n & \xrightarrow{\ \ast\ } & \mathbb{F}^n
\end{array}
\]
or equivalently, such that
\[ \Phi(v_1,\dots,v_t)=\omega(\varphi_1(v_1)*\cdots*\varphi_t(v_t)) \]
for all $v_1\in V_1,\dots,v_t\in V_t$. If we let $\varepsilon_1,\dots,\varepsilon_n$ be the canonical basis of $\mathbb{F}^n$ and $\pi_1,\dots,\pi_n$ the canonical projections $\mathbb{F}^n\to\mathbb{F}$, then setting $w_j=\omega(\varepsilon_j)\in W$ and $l_{i,j}=\pi_j\circ\varphi_i\in V_i^\vee$, the last formula can also be written
\[ \Phi(v_1,\dots,v_t)=\sum_{1\le j\le n}\Big(\prod_{1\le i\le t}l_{i,j}(v_i)\Big)\,w_j. \]
Said otherwise, $\Phi$ can be viewed as a tensor in $V_1^\vee\otimes\cdots\otimes V_t^\vee\otimes W$, and a multilinear algorithm of length $n$ corresponds to a decomposition
\[ \Phi=\sum_{1\le j\le n}l_{1,j}\otimes\cdots\otimes l_{t,j}\otimes w_j \]
as a sum of $n$ elementary tensors. In turn, since elementary tensors are essentially the image of the Segre map in $V_1^\vee\otimes\cdots\otimes V_t^\vee\otimes W$, all this can be viewed geometrically in a way similar to 1.30.
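As a concrete illustration of the definition in 5.1, here is a sketch of a symmetric bilinear algorithm of length 3 for multiplication in $\mathbb{F}_4$ over $\mathbb{F}_2$, obtained by evaluating at 0, 1 and "infinity" (a Karatsuba-type scheme). The model $\mathbb{F}_4=\mathbb{F}_2[x]/(x^2+x+1)$, the coordinate conventions and the names `phi`, `omega`, `star` are assumptions made for this example only.

```python
from itertools import product

def f4_mul(a, b):
    """Multiply a = (a0, a1) and b = (b0, b1) in F_4 = F_2[x]/(x^2 + x + 1)."""
    (a0, a1), (b0, b1) = a, b
    return ((a0 * b0 + a1 * b1) % 2, (a0 * b1 + a1 * b0 + a1 * b1) % 2)

def phi(v):
    """phi : F_4 -> (F_2)^3, values of v0 + v1*x at 0, 1 and 'infinity'."""
    v0, v1 = v
    return (v0, (v0 + v1) % 2, v1)

def omega(y):
    """omega : (F_2)^3 -> F_4, recombination of the three local products."""
    y0, y1, y2 = y
    return ((y0 + y2) % 2, (y0 + y1) % 2)

def star(u, v):
    """Componentwise product in (F_2)^3."""
    return tuple((a * b) % 2 for a, b in zip(u, v))

F4 = list(product(range(2), repeat=2))
# The diagram commutes iff omega(phi(a) * phi(b)) equals the product a.b for all a, b.
algorithm_is_correct = all(
    omega(star(phi(a), phi(b))) == f4_mul(a, b) for a in F4 for b in F4
)
```

Here $\varphi_1=\varphi_2=\varphi$, so the algorithm is symmetric, and only three multiplications in $\mathbb{F}_2$ are used instead of the four of the schoolbook formula.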
5.2. — More precisely, consider the linear codes
\[ C_i=\varphi_i(V_i)\subseteq\mathbb{F}^n \quad\text{and}\quad C'=\omega^T(W^\vee)\subseteq(\mathbb{F}^n)^\vee=\mathbb{F}^n \]
(where $\omega^T$ is the transpose of $\omega$). Let also $\overline{V_i}=V_i/\ker(\varphi_i)$, and $\overline{W}=\mathrm{Im}(\omega)$. Note that $\Phi$ and the $\varphi_i$ pass to the quotient, and as such they define a $t$-multilinear map $\overline{\Phi}:\overline{V_1}\times\cdots\times\overline{V_t}\to\overline{W}$ as well as a multilinear algorithm of length $n$ for it. So, after possibly replacing $\Phi$ with $\overline{\Phi}$, we can suppose the $\varphi_i$ and $\omega^T$ are injective, hence give identifications $C_i\simeq V_i$ and $C'\simeq W^\vee$. Also we can suppose the $C_i$ and $C'$ all have full support (otherwise some coordinates are not "used" in the algorithm, and can be discarded).

Then $\Phi$ defines a point $P_\Phi$ in the projective space $\mathbf{P}=\mathbf{P}(C_1\otimes\cdots\otimes C_t\otimes C')$, and the multilinear algorithm of length $n$ for $\Phi$ defines $n$ points in the Segre subvariety in $\mathbf{P}$ whose linear span contains $P_\Phi$. Now, unwinding all the definitions, we see these $n$ points are precisely those in the projective set of points $\Pi_{C_1*\cdots*C_t*C'}$ constructed in 1.30.

5.3. — When $V_1=\cdots=V_t=V$ and $\Phi$ is a symmetric multilinear map, the algorithm is said to be symmetric if $\varphi_1=\cdots=\varphi_t=\varphi$. A symmetric algorithm of length $n$ for $\Phi$ corresponds to a decomposition of the associated tensor as a sum of $n$ elementary symmetric tensors in $\mathrm{Sym}^t(V^\vee)\otimes W$. In turn, the $\mathrm{Sym}^t(V^\vee)$ part of this tensor space is essentially the image of a Veronese map, and links with the constructions in 1.31 could be given as above.

When $\mathbb{F}$ is finite it is not always true that a symmetric multilinear map admits a symmetric algorithm (counterexamples can be given as soon as $t>|\mathbb{F}|$), but Theorem A.7 in Appendix A provides a necessary and sufficient criterion for this to occur.

5.4. — From a more concrete point of view, the multilinear algorithm for $\Phi$ can be interpreted as follows: each $v_i\in V_i$ is split into $n$ local shares in $\mathbb{F}_q$ using the fixed map $\varphi_i$, the shares are multiplied locally, and the results are combined using $\omega$ to recover the final value $\Phi(v_1,\dots,v_t)$.
This is of interest in at least two contexts:

• In algebraic complexity theory one is interested in having $n$ as small as possible, in order to minimize the number of $t$-variable multiplications in $\mathbb{F}_q$ needed to compute $\Phi$. This is relevant for applications in which the cost of a fixed linear operation is negligible compared to the cost of a $t$-variable multiplication. We thus define $\mu(\Phi)$, the multilinear complexity of $\Phi$, as the smallest possible length of a multilinear algorithm for $\Phi$; and when $\Phi$ is symmetric, $\mu^{\mathrm{sym}}(\Phi)$, the symmetric multilinear complexity of $\Phi$, as the smallest possible length of a symmetric multilinear algorithm for $\Phi$ (provided such an algorithm exists). These are rank functions (in the sense of 1.13) on the corresponding spaces of multilinear maps. There is a very broad literature on this subject, for various classes of multilinear maps $\Phi$. We mention [6] for first pointing out the link between these questions and coding theory, and [11] for studying this link much further, in particular bringing AG codes into play. For more recent results and other points of view still close to the one presented here, we refer the reader to [1][2][13][42][49], and to the references therein for a more thorough historical coverage.

• One can also view this process as a very naive instance of multi-party computation, in which the local shares are given to $n$ remote users, who "collectively" compute $\Phi$. Now this scheme has to be modified because it is too weak for most practical applications, in which it is customary to impose various security requirements. For example, some shares could be altered by noise, or even by malicious users, to which the computation should remain robust. Also, these malicious users should be able to determine neither the entries nor the final value of the computation by putting their shares in common. Of special importance is the case where $\Phi$ is multiplication in $\mathbb{F}$ (or in an extension field), since addition and multiplication are the basic gates in arithmetic circuits, which allow one to represent arbitrary computable functions. A more precise formalization of these problems, as well as some important initial constructions, can be found in [4][9][15]. For the more mathematically minded reader, especially if interested in the use of AG codes, a nice point of entry to the literature could be [10] and then [8].

Since the vectors in $\mathbb{F}^n$ involved in the computation are codewords in the $C_i$ or in their product $C_1*\cdots*C_t$, all the questions above are linked to the possible parameters of these codes. For example it is easily shown:

5.5. Proposition. If the given multilinear algorithm for $\Phi$ has length $n>\dim(C_1*\cdots*C_t)$, then one can puncture coordinates to deduce a shorter multilinear algorithm of length $\dim(C_1*\cdots*C_t)$. Moreover, if the original algorithm is symmetric, then so is the punctured algorithm.
Proof. Let $S\subseteq[n]$ be an information set for $C_1*\cdots*C_t$, and let $\sigma:\mathbb{F}^S\xrightarrow{\ \sim\ }C_1*\cdots*C_t$ be the inverse of the natural projection $\pi_S$ (on $C_1*\cdots*C_t$). Then $\pi_S\circ\varphi_1,\dots,\pi_S\circ\varphi_t$ and $\omega\circ\sigma$ define a multilinear algorithm of length $|S|$ for $\Phi$.

In this way $\mu(\Phi)$ (and if relevant, $\mu^{\mathrm{sym}}(\Phi)$) can be expressed as the dimension of a product code. Likewise, in the multi-party computation scenario, one is interested in constructing codes $C$ having high rate (for efficiency), and such that $C^{\langle 2\rangle}$ has high minimum distance (for resilience), and also $C^\perp$ has high minimum distance (for privacy). The reader can consult [8] for recent advances in this direction.

Construction of lattices from codes.

5.6. — Barnes and Sloane's Construction D, first introduced in [3], and Forney's code formula from [21], are two closely related ways to construct lattices from binary linear codes. Up to some details, they can be described as follows. Consider the lifting
\[ \varepsilon:\mathbb{F}_2\xrightarrow{\ \sim\ }\{0,1\}\subseteq\mathbb{Z}. \]
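As in 4.15 we extend $\varepsilon$ coordinatewise to $\varepsilon:(\mathbb{F}_2)^n\to\mathbb{Z}^n$. Then given a chain of binary linear codes
\[ \mathcal{C}:\ C_0\subseteq C_1\subseteq\cdots\subseteq C_{a-1}\subseteq C_a=(\mathbb{F}_2)^n \]
we construct a subset
\[ \Lambda_{\mathcal{C}}=\varepsilon(C_0)+2\varepsilon(C_1)+\cdots+2^{a-1}\varepsilon(C_{a-1})+2^a\mathbb{Z}^n\subseteq\mathbb{Z}^n. \]
It turns out that $\Lambda_{\mathcal{C}}$ need not be a lattice in general, a fact that was sometimes overlooked in the literature. One can show:

5.7. Proposition. With these notations, $\Lambda_{\mathcal{C}}$ is a lattice if and only if the codes $C_i$ satisfy
\[ C_i^{\langle 2\rangle}\subseteq C_{i+1}. \]

Proof. This follows from the relation
\[ \varepsilon(u)+\varepsilon(v)=\varepsilon(u+v)+2\varepsilon(u*v)\in\mathbb{Z}^n \]
which holds for any $u,v\in(\mathbb{F}_2)^n$.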
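The criterion can be observed on a toy example. The following sketch treats the case $a=2$ in length $n=3$, where $\Lambda_{\mathcal{C}}$ contains $4\mathbb{Z}^n$ and is therefore a lattice exactly when its set of residues modulo 4 is closed under addition. The particular codes and all function names are choices made for this illustration.

```python
from itertools import product

def span_f2(gens, n):
    """All F_2-linear combinations of the given generators, as tuples."""
    words = {(0,) * n}
    for g in gens:
        words |= {tuple((a + b) % 2 for a, b in zip(w, g)) for w in words}
    return words

def lifting_identity_holds(n):
    """Check eps(u) + eps(v) == eps(u + v) + 2*eps(u * v) coordinatewise in Z."""
    vectors = list(product(range(2), repeat=n))
    return all(
        a + b == (a + b) % 2 + 2 * (a * b)
        for u in vectors for v in vectors for a, b in zip(u, v)
    )

def lambda_mod4(c0, c1):
    """Residues mod 4 of eps(C0) + 2*eps(C1) + 4*Z^n (Construction D with a = 2)."""
    return {tuple((x + 2 * y) % 4 for x, y in zip(u, v)) for u in c0 for v in c1}

def is_lattice(c0, c1):
    """A finite nonempty subset of (Z/4)^n is a subgroup iff it is closed under +."""
    lam = lambda_mod4(c0, c1)
    return all(
        tuple((x + y) % 4 for x, y in zip(s, t)) in lam for s in lam for t in lam
    )

n = 3
C0 = span_f2([(1, 1, 0), (0, 1, 1)], n)                 # its square is all of (F_2)^3
FULL = span_f2([(1, 0, 0), (0, 1, 0), (0, 0, 1)], n)
```

With $C_1=C_0$ the square of $C_0$ is not contained in $C_1$ and closure fails, for instance on $\varepsilon(1,1,0)+\varepsilon(0,1,1)=(1,2,1)$; with $C_1=(\mathbb{F}_2)^3$ the condition holds and the set is closed.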
This observation was first made in Kositwattanarerk and Oggier's paper [28], where examples and a more careful analysis of the connection between the constructions in [3] and [21] are also given. Roughly at the same time and independently, it was rediscovered by Boutros and Zémor, from whom the author learned it.

A feature of this Construction D is that, if $\Lambda_{\mathcal{C}}$ is a lattice, then its parameters (volume, distance) can be estimated from those of the $C_i$. One motivation for its introduction was to reformulate a construction of the Barnes-Wall lattices. In this case the $C_i$ are essentially Reed-Muller codes, and the condition $C_i^{\langle 2\rangle}\subseteq C_{i+1}$ is satisfied, as noted in Example 1.7.

5.8. — What precedes can be generalized to codes over larger alphabets. Let $p$ be a prime number, choose an arbitrary set of representatives $R$ for $\mathbb{Z}$ modulo $p$, and consider the lifting
\[ \varepsilon:\mathbb{F}_p\xrightarrow{\ \sim\ }R\subseteq\mathbb{Z}. \]
Then given a chain of linear codes over $\mathbb{F}_p$
\[ \mathcal{C}:\ C_0\subseteq C_1\subseteq\cdots\subseteq C_{a-1}\subseteq C_a=(\mathbb{F}_p)^n \]
we construct a subset
\[ \Lambda_{\mathcal{C}}=\varepsilon(C_0)+p\,\varepsilon(C_1)+\cdots+p^{a-1}\varepsilon(C_{a-1})+p^a\mathbb{Z}^n\subseteq\mathbb{Z}^n. \]
Again, $\Lambda_{\mathcal{C}}$ need not be a lattice in general. To give a criterion for this, one can introduce carry operations
\[ \kappa_j:\mathbb{F}_p\times\mathbb{F}_p\to\mathbb{F}_p \]
for $1\le j\le a-1$, such that for each $x,y\in\mathbb{F}_p$ we have
\[ \varepsilon(x)+\varepsilon(y)=\varepsilon(x+y)+p\,\varepsilon(\kappa_1(x,y))+\cdots+p^{a-1}\varepsilon(\kappa_{a-1}(x,y)) \mod p^a \]
in $\mathbb{Z}$.

5.9. Proposition. With these notations, $\Lambda_{\mathcal{C}}$ is a lattice if and only if for any $i,j$ we have
\[ \kappa_j(C_i,C_i)\subseteq C_{i+j} \]
(where $\kappa_j(C_i,C_i)$ is defined according to 4.15).

Proof. Same as above.
The usefulness of this criterion depends on our ability to control the $\kappa_j(C_i,C_i)$, which in turn depends on the choice of the lifting $\varepsilon$, or equivalently, of the set of representatives $R$. It turns out that there are at least two natural choices for this.

5.10. — The first choice is to take $R=\{0,1,\dots,p-1\}$. We call the associated $\varepsilon$ the "naive" lifting. Then one has $\kappa_1=\kappa$ where
\[ \kappa(x,y)=\begin{cases}1 & \text{if } x+y\ge p\\ 0 & \text{else}\end{cases} \]
(the sum $x+y$ being computed in $\mathbb{Z}$ on the representatives), while $\kappa_j=0$ for $j>1$. One drawback of this choice is that the expression for $\kappa$ does not appear to have much algebraic structure, except for the cocycle relation
\[ \kappa(x,y)+\kappa(x+y,z)=\kappa(x,y+z)+\kappa(y,z). \]
Anyway the fact that the higher $\kappa_j$ are 0 should help in the computations.

5.11. — Another choice is to take as $R$ a set of $(p-1)$-th roots of unity in $\mathbb{Z}$ modulo $p^a$, plus 0. We call the associated $\varepsilon$ the multiplicative or Teichmüller lifting. The $\kappa_j$ are then essentially given by the addition formulae for Witt vectors [50, §II.6][36, Lect. 26]; more precisely this allows one to express $\kappa_j$ as a symmetric homogeneous polynomial of degree $p^j$. However since $\kappa_j$ is defined on $\mathbb{F}_p\times\mathbb{F}_p$, sometimes this expression can be simplified.

Example. For $p=3$ we can take $R=\{0,1,-1\}$ as multiplicative representatives for any $a$. The first carry operation is given by
\[ \kappa_1(x,y)=-xy(x+y). \]
Then an expression for $\kappa_2$ is $\kappa_2(x,y)=-xy(x+y)(x-y)^6$; however over $\mathbb{F}_3\times\mathbb{F}_3$ this is always 0. In fact the same holds for all the higher $\kappa_j$: as in the case $p=2$ we can take $\kappa_j=0$ for $j>1$.

5.12. Corollary. When $\varepsilon$ is the Teichmüller lifting, a sufficient condition for $\Lambda_{\mathcal{C}}$ to be a lattice is that the codes $C_i$ satisfy
\[ C_i^{\langle p\rangle}\subseteq C_{i+1}. \]

Proof. Indeed, since $\kappa_j$ can be expressed as a symmetric homogeneous polynomial of degree $p^j$, we then have $\kappa_j(C_i,C_i)\subseteq C_i^{\langle p^j\rangle}\subseteq C_{i+j}$.
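The carry operations of 5.10 and of the Example in 5.11 can be checked mechanically. The sketch below verifies, for the naive lifting with the illustrative choice $p=5$, the defining relation and the cocycle relation, and for the Teichmüller lifting with $p=3$ the formula $\kappa_1(x,y)=-xy(x+y)$ together with the vanishing of the stated expression for $\kappa_2$. Function names are assumptions of this example.

```python
def eps_naive(x, p):
    """Naive lifting: representative of x in {0, ..., p-1}."""
    return x % p

def kappa_naive(x, y, p):
    """Carry of the naive lifting: 1 if the representatives sum to >= p, else 0."""
    return 1 if eps_naive(x, p) + eps_naive(y, p) >= p else 0

def eps_teich3(x):
    """Teichmueller lifting for p = 3, with representatives {0, 1, -1}."""
    return (0, 1, -1)[x % 3]

def kappa1_teich3(x, y):
    """First carry for p = 3: kappa_1(x, y) = -x*y*(x + y), reduced mod 3."""
    return (-x * y * (x + y)) % 3

p = 5  # illustrative prime for the naive lifting
naive_ok = all(
    eps_naive(x, p) + eps_naive(y, p)
    == eps_naive(x + y, p) + p * kappa_naive(x, y, p)
    for x in range(p) for y in range(p)
)
cocycle_ok = all(
    kappa_naive(x, y, p) + kappa_naive(x + y, z, p)
    == kappa_naive(x, y + z, p) + kappa_naive(y, z, p)
    for x in range(p) for y in range(p) for z in range(p)
)
teich_ok = all(
    eps_teich3(x) + eps_teich3(y)
    == eps_teich3(x + y) + 3 * eps_teich3(kappa1_teich3(x, y))
    for x in range(3) for y in range(3)
)
kappa2_vanishes = all(
    (-x * y * (x + y) * (x - y) ** 6) % 3 == 0
    for x in range(3) for y in range(3)
)
```

In both cases the relation between $\varepsilon(x)+\varepsilon(y)$ and $\varepsilon(x+y)$ holds exactly in $\mathbb{Z}$, not only modulo $p^a$, which reflects the vanishing of the higher carries.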
A natural way to obtain a family of codes satisfying this condition is to take for the $C_i$ evaluation codes, e.g. (generalized) Reed-Muller codes, as in the binary case. It remains to be investigated whether these lead to new examples of good lattices. Also, note that we worked over $\mathbb{Z}$ for simplicity, but most of the discussion remains valid over the ring of integers of an algebraic number field, possibly allowing further improvements. All this will be eventually considered in a forthcoming paper.

Oblivious transfer.

5.13. — In an oblivious transfer (OT) protocol, Alice has two secrets $s_0,s_1\in\{0,1\}^n$, and Bob has a selection bit $b$. At the end of the protocol, Bob should get $s_b$ but no other information, and Alice should get no information on $b$. In the Crépeau-Kilian protocol, Alice and Bob achieve this through communication over a noisy channel [16]. As a first step, emulating a coset coding scheme over a wiretap channel, they construct an almost-OT protocol, in which Alice can cheat to learn s with a certain positive probability. Then, from this almost-OT protocol, they construct a true OT protocol. More precisely, Alice chooses a $N\times n$ random matrix $A_0$ such that (in row vector convention) $1_{[N]}\cdot A_0=s_0$, and she sets $A_1=A_0+(1_{[N]})^T\cdot(s_0+s_1)$. Bob has a selection bit $b$, and he chooses a random selection vector $v=(b_1,\dots,b_N)\in\{0,1\}^N$ such that $v\cdot(1_{[N]})^T=b$. Using the almost-OT protocol $N$ times, for each $i$ he then learns the $i$-th row of $A_{b_i}$. Putting these rows together in a $N\times n$ matrix $B$, he finally finds $1_{[N]}\cdot B=v\cdot A_1+(1_{[N]}-v)\cdot A_0=s_b$. One can then show that for $N\approx n^2$, Alice cannot cheat without Bob noticing it.

5.14. — A slight drawback of this protocol is that the number $N$ of channel uses grows quadratically in the size $n$ of the secret, so the overall communication rate tends to 0. A first solution was proposed in [26]. Their construction combines several sub-protocols, one of which is based on the results of [10] already mentioned in the discussion about multilinear algorithms and multi-party computation. Another construction is proposed in [38]. Quite interestingly it also makes an essential use of products of codes, while staying very close in spirit to the original Crépeau-Kilian protocol. The key idea is to replace the vector $1_{[N]}$ above, which is the generator matrix of a repetition code, by the generator matrix $G$ of a code $C$ of fixed rate $R>0$ (so the secrets $s_0$ and $s_1$ also become matrices). For Bob the reconstruction step is only slightly more complicated: if he is interested in $s_0$, he has to choose a random selection vector $b\in(C^{\langle 2\rangle})^\perp$.
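The linear algebra behind the reconstruction step of the original protocol in 5.13 (with the repetition code) can be simulated directly. The sketch below only models the sharing and recombination over $\mathbb{F}_2$; the noisy channel and the almost-OT subprotocol are not modelled, and the function names and the use of a seeded pseudo-random generator are assumptions of this example.

```python
import random

def share_secrets(s0, s1, N, rng):
    """Alice: random N x n binary A0 with column sums s0, and A1 = A0 + 1^T (s0 + s1)."""
    n = len(s0)
    A0 = [[rng.randrange(2) for _ in range(n)] for _ in range(N - 1)]
    # The last row is forced so that the rows of A0 sum to s0 over F_2.
    A0.append([(s0[j] + sum(row[j] for row in A0)) % 2 for j in range(n)])
    diff = [(a + b) % 2 for a, b in zip(s0, s1)]
    A1 = [[(x + d) % 2 for x, d in zip(row, diff)] for row in A0]
    return A0, A1

def selection_vector(b, N, rng):
    """Bob: random v in {0,1}^N whose coordinates sum to b over F_2."""
    v = [rng.randrange(2) for _ in range(N - 1)]
    v.append((b + sum(v)) % 2)
    return v

def reconstruct(A0, A1, v):
    """Bob sums the rows he learned: row i of A1 if v[i] == 1, else row i of A0."""
    n = len(A0[0])
    rows = [A1[i] if bit else A0[i] for i, bit in enumerate(v)]
    return [sum(row[j] for row in rows) % 2 for j in range(n)]

rng = random.Random(0)
s0, s1 = [1, 0, 1, 1], [0, 0, 1, 0]
A0, A1 = share_secrets(s0, s1, N=16, rng=rng)
recovered = [reconstruct(A0, A1, selection_vector(b, 16, rng)) for b in (0, 1)]
```

Whatever the random choices, the sum of the learned rows is $s_0+b\,(s_0+s_1)=s_b$, since the rows of $A_0$ sum to $s_0$ and the coordinates of $v$ sum to $b$.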
One can then show that Alice cannot cheat as soon as $(C^{\langle 2\rangle})^\perp$ has dual distance at least $\delta n$ for some fixed $\delta>0$. Note that the dual distance of $(C^{\langle 2\rangle})^\perp$ is just $d_{\min}(C^{\langle 2\rangle})$, and the linear span in the definition of $C^{\langle 2\rangle}$ is relevant here since the distance appears through a duality argument. This raised the question of the existence of asymptotically good binary linear codes with asymptotically good squares, as discussed in section 4 above, to which a positive answer was finally given in [43].

Decoding algorithms.

5.15. — There are several applications of products of codes to the decoding problem. The first and probably the most famous of them is through the notion of error-correcting pairs [29][39]. If $C$ is a linear code of length $n$, a $t$-error-correcting pair for $C$ is a pair of codes $(A,B)$ of length $n$ such that:
(i) $A*B\perp C$
(ii) $\dim(A)>t$
(iii) $d_{\min}(A)>n-d_{\min}(C)$
(iv) $d^\perp(B)>t$.
(In [39] the product $A*B$ is defined without taking the linear span, but this is equivalent here since we are interested in the orthogonal.) Given such a pair, $C$ then admits a decoding algorithm that corrects $t$ errors with complexity $O(n^3)$. This might be viewed as a way to reformulate most of the classical decoding algorithms for algebraic codes.

It is then natural to investigate the existence of error-correcting pairs for given $C$ and $t$. Results are known for many classes of codes, and the study deeply involves properties of the $*$ product, such as those presented in this text. For cyclic codes, a key result is the Roos bound, which is essentially the following:

5.16. Proposition ([46][40]). Let $A,B,C\subseteq\mathbb{F}^n$ be linear codes such that:
• $A*B\perp C$
• $A$ has full support
• $\dim(A)+d_{\min}(A)+d^\perp(B)\ge n+3$.
Then
\[ d_{\min}(C)\ge\dim(A)+d^\perp(B)-1. \]
Proof. Since $C\subseteq(A*B)^\perp$, it suffices to show $d^\perp(A*B)\ge\dim(A)+d^\perp(B)-1$. Then by Lemma 3.1, setting $s=\dim(A)+d^\perp(B)-2$, it suffices to show $\dim(\pi_S(A*B))=s$ for all $S\subseteq[n]$ of size $|S|=s$. Now for any such $S$ we have $|[n]\setminus S|=n+2-\dim(A)-d^\perp(B)<d_{\min}(A)$, so the projection $\pi_S:A\to\pi_S(A)$ is injective, that is, $\dim(\pi_S(A))=\dim(A)$. On the other hand, by Lemma 3.2(ii), we have $d^\perp(\pi_S(B))\ge d^\perp(B)$. We can then conclude with Proposition 3.5 applied to $\pi_S(A)$ and $\pi_S(B)$.

Various improvements as well as generalizations of Proposition 5.16 are given in [31][18], all of which can also be re-proved along these lines.

5.17. — As another example of application of the product $*$ to the decoding problem, we can cite the technique of so-called power syndrome decoding for Reed-Solomon codes [47]. If $c$ is a codeword of a Reed-Solomon code $C$, then for any integer $i$, the (componentwise) power $c^i$ is also a codeword of a Reed-Solomon code of higher degree. Now let $x$ be a received word, with error $e=x-c$. Then $x^i-c^i$ has support included in that of $e$. Arranging the $x^i$ together, we thus get a virtual received word for an interleaved Reed-Solomon code, with a burst error. Special decoding algorithms exist for this situation, and using them one eventually expects to get an improved decoding algorithm for the original $C$. A more detailed analysis shows this works when $C$ has sufficiently low rate, allowing one to decode it beyond half the minimum distance.

Analysis of McEliece-type cryptosystems.

5.18. — McEliece-type cryptosystems [32] rely on the fact that decoding a general linear code is an NP-hard problem [5]. First, Alice chooses a particular linear code $C$ with generator matrix $G$, for which an efficient decoding algorithm (up to a certain number $t$ of errors) is known, and she sets $G'=SGP$ where $S$ is a randomly chosen invertible matrix and $P$ a random permutation matrix. Her secret key is then the triple $(G,S,P)$, while her public key is essentially $G'$ (plus the number $t$). Typically, $C$ is chosen among a class of codes with a strong algebraic structure, which is used for decoding. However, multiplying $G$ by $S$ and $P$ allows one to conceal this algebraic structure and make $G'$ look like the generator matrix of a general linear code $C'$.

A possible attack against such a scheme uses the fact that the study of products $C_1*\cdots*C_t$ allows one to find hidden algebraic relationships between subcodes $C_i$ of $C'$, and ultimately, to uncover the algebraic structure of $C'$, from which a decoding algorithm could be designed. This strategy was carried out successfully against certain variants of the McEliece cryptosystem, e.g. when $C$ is a (generalized) Reed-Solomon code [60][14].

5.19. — However, the original McEliece cryptosystem remains unbroken. There, $C$ is a binary Goppa code, which is constructed as a subfield subcode of an algebraic code defined over a larger field. As D. Augot pointed out to the author, one key difficulty in the analysis comes from the fact that it is not yet well understood how subfield subcodes behave under the $*$ product.

Appendix A: A criterion for symmetric tensor decomposition

Frobenius symmetric maps.

A.1. — First we recall definitions from 5.1. Let $\mathbb{F}$ be a field, let $V,W$ be finite-dimensional $\mathbb{F}$-vector spaces, and let $\Phi:V^t\to W$ be a symmetric $t$-multilinear map. A symmetric multilinear algorithm of length $n$ for $\Phi$ is a pair of linear maps $\varphi:V\to\mathbb{F}^n$ and $\omega:\mathbb{F}^n\to W$, such that the following diagram commutes:
\[
\begin{array}{ccc}
V^t & \xrightarrow{\ \Phi\ } & W \\
{\scriptstyle(\varphi,\dots,\varphi)}\big\downarrow & & \big\uparrow{\scriptstyle\omega} \\
(\mathbb{F}^n)^t & \xrightarrow{\ \ast\ } & \mathbb{F}^n
\end{array}
\]
or equivalently, such that $\Phi(v_1,\dots,v_t)=\omega(\varphi(v_1)*\cdots*\varphi(v_t))$ for all $v_1,\dots,v_t\in V$.

In turn, using the natural identification $\mathrm{Sym}^t(V;W)=\mathrm{Sym}^t(V^\vee)\otimes W\subseteq(V^\vee)^{\otimes t}\otimes W$ to view $\Phi$ as an element in this tensor space, this corresponds to a decomposition
\[ \Phi=\sum_{1\le j\le n}l_j^{\otimes t}\otimes w_j \]
as a sum of $n$ elementary symmetric tensors. Here, "symmetry" refers to the action of $\mathfrak{S}_t$ by permutation on the $t$ copies of $V^\vee$ in $(V^\vee)^{\otimes t}\otimes W$, and we call elementary the symmetric tensors of the form $l^{\otimes t}\otimes w$ for $l\in V^\vee$, $w\in W$. To show the equivalence, write $w_j=\omega(\varepsilon_j)\in W$ and $l_j=\pi_j\circ\varphi\in V^\vee$, where $\varepsilon_1,\dots,\varepsilon_n$ and $\pi_1,\dots,\pi_n$ are the canonical bases of $\mathbb{F}^n$ and $(\mathbb{F}^n)^\vee$ respectively.
A.2. — When $\mathbb{F}=\mathbb{F}_q$ is a finite field, it turns out that not all symmetric multilinear maps admit a symmetric algorithm, or equivalently, not all symmetric tensors can be decomposed as a sum of elementary symmetric tensors. There are at least two ways to see that.

The first is by a dimension argument: setting $r=\dim(V)$ and $s=\dim(W)$, we have
\[ \dim\mathrm{Sym}^t(V;W)=\binom{r+t-1}{t}\,s \]
which goes to infinity as $t\to\infty$, while
\[ \dim\langle l^{\otimes t}\ ;\ l\in V^\vee\rangle\otimes W\le\frac{q^r-1}{q-1}\,s \]
remains bounded. So, for $t$ big enough, $\langle l^{\otimes t}\ ;\ l\in V^\vee\rangle\otimes W$ cannot be all of $\mathrm{Sym}^t(V;W)$, as claimed. However this proof is nonconstructive, because given an element in $\mathrm{Sym}^t(V;W)$, it does not provide a practical way to check whether or not this element admits a symmetric algorithm.

The second is to show that the $l^{\otimes t}$ all satisfy certain algebraic identities. So, by linearity, if an element in $\mathrm{Sym}^t(V;W)$ does not satisfy these identities, then it cannot admit a symmetric algorithm. One such identity will come from the Frobenius property, $x^q=x$ for all $x\in\mathbb{F}_q$. At first sight this gives a necessary condition for the existence of a symmetric algorithm. However, by elaborating on this Frobenius property, we will show how to turn it into a necessary and sufficient condition.

A.3. — Symmetric tensor decomposition in characteristic 0 has been extensively studied; see e.g. [13] for a survey of recent results. Over finite fields, perhaps the earliest appearance of the notion of symmetric bilinear algorithm was in [49]. In this context, an important problem is the determination of $\mu_q^{\mathrm{sym}}(k)$, the symmetric bilinear complexity of $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$, defined as the smallest possible length of a symmetric bilinear algorithm for the multiplication map $\mathbb{F}_{q^k}\times\mathbb{F}_{q^k}\to\mathbb{F}_{q^k}$ (seen as an $\mathbb{F}_q$-bilinear map). One can also consider $\mu_q(k)$, the (general) bilinear complexity of $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$, defined similarly but without the symmetry condition. A survey of results up to 2005 can be found in [1].
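The dimension count of A.2 is simple to tabulate. The sketch below compares the two quantities for $s=1$ and returns the first $t$ at which the count alone shows that elementary symmetric tensors cannot span; the function names are illustrative, and $r\ge 2$ is assumed (for $r=1$ both quantities equal 1 and the loop would not terminate). The count gives only a sufficient condition for an obstruction, so it says nothing about smaller $t$.

```python
from math import comb

def dim_sym(r, t, s=1):
    """dim Sym^t(V; W) = binom(r + t - 1, t) * s, with r = dim V and s = dim W."""
    return comb(r + t - 1, t) * s

def elementary_bound(q, r, s=1):
    """Upper bound (q^r - 1)/(q - 1) * s on the span of the l^{(x)t} (x) w."""
    return (q**r - 1) // (q - 1) * s

def first_obstructed_t(q, r):
    """Smallest t with dim Sym^t > bound; requires r >= 2 to terminate."""
    t = 1
    while dim_sym(r, t) <= elementary_bound(q, r):
        t += 1
    return t
```

For instance `first_obstructed_t(2, 2)` is 3 and `first_obstructed_t(3, 2)` is 4, in line with the remark in 5.3 that counterexamples exist as soon as $t>|\mathbb{F}|$.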
Quite strangely, although most authors gave constructions of symmetric algorithms, they only stated their results for the weaker complexity μq (k) (this is perhaps because another central topic in algebraic complexity theory is that of matrix multiplication, which is noncommutative). At some point the development of the theory faced problems due to the fact that construction of symmetric algorithms from interpolation on a curve required a careful analysis of the 2-torsion class group of the curve [8], that was previously overlooked; in particular, some of the bounds cited in [1] (especially the one from [52]) had to be revised. Finally the situation was clarified by the author in [42]; among the contributions of this work we can cite: • emphasis is put on the distinction between symmetric and general bilinear complexity, rediscovering some results of [49] • it is shown that nonsymmetric bilinear algorithms are much easier to construct, since the 2-torsion obstruction does not apply to them
ON PRODUCTS AND POWERS OF LINEAR CODES
55
• for symmetric algorithms, a new construction is given (the idea of which originates from [45]) that bypasses the 2-torsion obstruction, repairing most of the broken bounds except perhaps for very small $q$.¹

¹ For very small $q$, the effect of 2-torsion on the bounds cannot be entirely discarded, but the methods of [8] allow one to make it smaller, leading to the best results in this case.

However, although this work settled a certain number of problems for bilinear algorithms, in particular showing that symmetric bilinear complexity is always well-defined, at the same time it raised the question of the existence of symmetric algorithms for symmetric $t$-multilinear maps, $t \ge 3$.

A.4. Proposition-definition. Let $V, W$ be $\mathbb{F}_q$-vector spaces and $f : V^t \longrightarrow W$ a symmetric $t$-multilinear map, for an integer $t \ge q$. Then, for $1 \le i \le t - q + 1$, the map $f_{(i)} : V^{t-q+1} \longrightarrow W$ defined by
\[ f_{(i)}(v_1, \dots, v_{t-q+1}) = f(v_1, \dots, v_{i-1}, \underbrace{v_i, \dots, v_i}_{q \text{ times}}, v_{i+1}, \dots, v_{t-q+1}) \]
for $v_1, \dots, v_{t-q+1} \in V$, is $(t-q+1)$-multilinear. We call this $f_{(i)}$ the $i$-th Frobenius reduced of $f$.

Proof. This follows from these two facts: (1) all elements $\lambda \in \mathbb{F}_q$ satisfy $\lambda^q = \lambda$, and (2) the binomial coefficients $\binom{q}{j}$ are zero in $\mathbb{F}_q$ for $0 < j < q$.

When $i$ is not specified, we let $\bar f = f_{(1)}$ be "the" Frobenius reduced of $f$.

A.5. Definition. Let $V, W$ be $\mathbb{F}_q$-vector spaces and $f : V^t \longrightarrow W$ a symmetric $t$-multilinear map. We say that $f$ satisfies the Frobenius exchange condition, or that $f$ is Frobenius symmetric if, either:
• $t \le q$, or
• $t \ge q + 1$ and $f_{(1)} = f_{(2)}$, that is,
\[ f(\underbrace{u, \dots, u}_{q \text{ times}}, v, z_1, \dots, z_{t-q-1}) = f(u, \underbrace{v, \dots, v}_{q \text{ times}}, z_1, \dots, z_{t-q-1}) \]
for all $u, v, z_1, \dots, z_{t-q-1} \in V$.

We let $\mathrm{Sym}^t_{Frob}(V; W) \subseteq \mathrm{Sym}^t(V; W)$ be the subspace of Frobenius symmetric $t$-multilinear maps from $V$ to $W$ (in case the base field is not clear from the context, we will use a more precise notation such as $\mathrm{Sym}^t_{\mathbb{F}_q, Frob}(V; W)$).

A.6. — If $f : V^t \longrightarrow W$ is a symmetric $t$-multilinear map, for $t \ge q + 1$, then in general its (first) Frobenius reduced $\bar f$ need not be symmetric. However:

Proposition. Let $f \in \mathrm{Sym}^t(V; W)$ be a symmetric $t$-multilinear map, for an integer $t \ge q + 1$. Then these two conditions are equivalent:
(i) $f$ satisfies the Frobenius exchange condition, that is, it is Frobenius symmetric;
(ii) its Frobenius reduced $\bar f$ is symmetric.
Moreover, suppose these conditions are satisfied. Then:
• all the Frobenius reduced of $f$ are equal: $f_{(i)} = \bar f$ for $1 \le i \le t - q + 1$
• $\bar f$ also satisfies the Frobenius exchange condition, that is, $\bar f$ also is Frobenius symmetric.

Proof. Direct computation from the definitions.

So, as a summary, for $t \le q$ we have $\mathrm{Sym}^t_{Frob}(V; W) = \mathrm{Sym}^t(V; W)$ by definition, and for $t \ge q + 1$ we have
\[ f \in \mathrm{Sym}^t_{Frob}(V; W) \iff \bar f \in \mathrm{Sym}^{t-q+1}_{Frob}(V; W). \]
Now we can state our main result:

A.7. Theorem. Let $V, W$ be $\mathbb{F}_q$-vector spaces of finite dimension, and $f \in \mathrm{Sym}^t(V; W)$ a symmetric multilinear map. Then $f$ admits a symmetric multilinear algorithm if and only if $f$ is Frobenius symmetric. In particular when $t \le q$ (for example, when $t = 2$) this holds automatically.

Proof. Choose vectors $w_i$ forming a basis of $W$, and let $w_i^\vee$ be the dual basis. Then $f$ admits a symmetric multilinear algorithm if and only if all the $w_i^\vee \circ f$ admit a symmetric multilinear algorithm, and $f$ is Frobenius symmetric if and only if all the $w_i^\vee \circ f$ are Frobenius symmetric. As a consequence, it suffices to prove the Theorem when $W = \mathbb{F}_q$, that is, when $f \in \mathrm{Sym}^t(V^\vee)$ is a symmetric multilinear form.

By definition, a symmetric multilinear form $f \in \mathrm{Sym}^t(V^\vee)$ admits a symmetric multilinear algorithm if and only if it belongs to the subspace spanned by elementary symmetric multilinear forms $l^{\otimes t}$, for $l \in V^\vee$. Thus, setting $\mathrm{Sym}^t_{Frob}(V^\vee) = \mathrm{Sym}^t_{Frob}(V, \mathbb{F}_q)$, we have to show the equality
\[ \mathrm{Sym}^t_{Frob}(V^\vee) = \langle l^{\otimes t} ;\ l \in V^\vee \rangle \]
of subspaces of $\mathrm{Sym}^t(V^\vee)$. For this, we proceed by duality. Given a subspace $Z \subseteq \mathrm{Sym}^t(V^\vee)$, we let $Z^\perp \subseteq S^t V$ be its orthogonal, under the natural duality between $\mathrm{Sym}^t(V^\vee)$ and the $t$-th symmetric power $S^t V$. So we have to show the equality
\[ \mathrm{Sym}^t_{Frob}(V^\vee)^\perp = \langle l^{\otimes t} ;\ l \in V^\vee \rangle^\perp \]
of subspaces of $S^t V$.

By definition A.5, a symmetric multilinear form $f \in \mathrm{Sym}^t(V^\vee)$ is Frobenius symmetric if and only if it is orthogonal to all elements of the form $(u^q v - u v^q) z_1 \cdots z_{t-q-1} \in S^t V$. Using biduality, this means we have
\[ \mathrm{Sym}^t_{Frob}(V^\vee)^\perp = J_t \]
where $J \subseteq S^\cdot V$ is the homogeneous ideal spanned by the $(u^q v - u v^q) \in S^{q+1} V$, for $u, v \in V$. Now let $X_1, \dots, X_n$ denote a basis of $V$, so formally we can identify the symmetric algebra of $V$ with the polynomial algebra in the $X_i$, as graded $\mathbb{F}_q$-algebras:
\[ S^\cdot V = \mathbb{F}_q[X_1, \dots, X_n]. \]
With this identification, $J$ then becomes the homogeneous ideal generated by the polynomials $X_i^q X_j - X_i X_j^q$, for $1 \le i, j \le n$.
Also this choice of a basis $X_1, \dots, X_n$ for $V$ gives an identification $V^\vee = (\mathbb{F}_q)^n$: a linear form $l \in V^\vee$ corresponds to the $n$-tuple $(x_1, \dots, x_n) \in (\mathbb{F}_q)^n$ where $x_i = l(X_i)$. With this identification, the value of $l^{\otimes t} \in \mathrm{Sym}^t(V^\vee)$ at an element $P(X_1, \dots, X_n) \in S^t V$ is precisely $P(x_1, \dots, x_n)$. Thus $\langle l^{\otimes t} ;\ l \in V^\vee \rangle^\perp$ is the space of homogeneous polynomials of degree $t$ that vanish at all points in $(\mathbb{F}_q)^n$, or equivalently, that vanish at all points in the projective space $\mathbb{P}^{n-1}(\mathbb{F}_q)$. The conclusion then follows from Lemma A.8 below.

A.8. Lemma. The homogeneous ideal in $\mathbb{F}_q[X_1, \dots, X_n]$ of the finite projective algebraic set $\mathbb{P}^{n-1}(\mathbb{F}_q)$ is the homogeneous ideal generated by the $X_i^q X_j - X_i X_j^q$, for $1 \le i, j \le n$.

This is a well-known fact, and a nice exercise, from elementary algebraic geometry. One possible proof is by induction on $n$, in which one writes $\mathbb{P}^n(\mathbb{F}_q) = (\mathbb{F}_q)^n \cup \mathbb{P}^{n-1}(\mathbb{F}_q)$ and one shows at the same time the affine variant, that the set of polynomials in $\mathbb{F}_q[X_1, \dots, X_n]$ vanishing at all points in $(\mathbb{F}_q)^n$ is the ideal generated by the $X_i^q - X_i$, for $1 \le i \le n$. (See also [34] for more.)

A.9. Definition. The Frobenius symmetric algebra of an $\mathbb{F}_q$-vector space $V$ is
\[ S^\cdot_{Frob} V = S^\cdot V / (u^q v - u v^q)_{u, v \in V} \]
the homogeneous quotient algebra of the symmetric algebra of $V$ by its graded ideal generated by elements of the form $u^q v - u v^q$. Also, by the $t$-th Frobenius symmetric power of $V$ we mean the $t$-th graded part $S^t_{Frob} V$ of this quotient algebra. It comes equipped with a canonical Frobenius symmetric $t$-multilinear map $V^t \longrightarrow S^t_{Frob} V$, with the universal property that, given another $\mathbb{F}_q$-vector space $W$, any Frobenius symmetric $t$-multilinear map $V^t \longrightarrow W$ uniquely factorizes through it. Thus we get a natural identification
\[ \mathrm{Sym}^t_{Frob}(V, W) = (S^t_{Frob} V)^\vee \otimes W \]
of $\mathbb{F}_q$-vector spaces, and in particular $\mathrm{Sym}^t_{Frob}(V^\vee) = (S^t_{Frob} V)^\vee$.

A.10. — If $V$ has dimension $n$, with basis $X_1, \dots, X_n$, we have an identification
\[ S^\cdot_{Frob} V = \mathbb{F}_q[X_1, \dots, X_n] / (X_i^q X_j - X_i X_j^q)_{i,j}, \]
hence by Lemma A.8
\[ S^t_{Frob} V \simeq PRM_q(t, n-1) \]
where the projective Reed-Muller code $PRM_q(t, n-1) \subseteq (\mathbb{F}_q)^{\frac{q^n - 1}{q - 1}}$ is defined as the image of the map
\[ \mathbb{F}_q[X_1, \dots, X_n]_t \longrightarrow (\mathbb{F}_q)^{\frac{q^n - 1}{q - 1}} \]
that evaluates homogeneous polynomials of degree $t$ at (a set of representatives of) all points in $\mathbb{P}^{n-1}(\mathbb{F}_q)$. In particular we have $\dim S^t_{Frob} V = \dim PRM_q(t, n-1)$, which is computed in [55] (and also by another method in [34]). Last, note that we have
\[ PRM_q(t, n-1) = PRM_q(1, n-1)^{\langle t \rangle} \]
where $PRM_q(1, n-1)$ is the $[\frac{q^n - 1}{q - 1}, n, q^{n-1}]$ simplex code. So we see that the sequence of dimensions $\dim S^t_{Frob} V$ coincides with the Hilbert sequence of this simplex code, as defined in 1.4 (and also, as an illustration of Proposition 1.28, with that of the projective algebraic set $\mathbb{P}^{n-1}(\mathbb{F}_q)$).
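As an illustration of A.10 and of the dimension argument of A.2, here is a minimal Python sketch (not part of the original text) that computes $\dim PRM_q(t, n-1)$ as the rank of the evaluation map and compares it with $\dim S^t V = \binom{n+t-1}{t}$. It assumes $q$ is prime, so that arithmetic in $\mathbb{F}_q$ is just arithmetic modulo $q$; all function names are hypothetical.

```python
from itertools import combinations_with_replacement, product
from math import comb


def projective_points(q, n):
    """Representatives of P^{n-1}(F_q) for prime q: first nonzero coordinate is 1."""
    points = []
    for v in product(range(q), repeat=n):
        nonzero = [c for c in v if c]
        if nonzero and nonzero[0] == 1:
            points.append(v)
    return points


def rank_mod_p(rows, p):
    """Rank of an integer matrix over F_p (p prime), by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def prm_dimension(q, t, n):
    """dim PRM_q(t, n-1): rank of evaluating all degree-t monomials at P^{n-1}(F_q)."""
    points = projective_points(q, n)
    rows = []
    for mono in combinations_with_replacement(range(n), t):
        row = []
        for pt in points:
            value = 1
            for i in mono:
                value = value * pt[i] % q
            row.append(value)
        rows.append(row)
    return rank_mod_p(rows, q)


def sym_dimension(t, n):
    """dim S^t V for dim V = n."""
    return comb(n + t - 1, t)


if __name__ == "__main__":
    for q, n in ((2, 2), (3, 2), (2, 3)):
        for t in range(1, 6):
            print(q, n, t, prm_dimension(q, t, n), sym_dimension(t, n))
```

For $t \le q$ the two dimensions agree, while for larger $t$ the first one stays bounded by the number of points of $\mathbb{P}^{n-1}(\mathbb{F}_q)$ and the second keeps growing.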
A.11. — That Frobenius symmetry is a necessary condition in Theorem A.7 was already known, and is in fact easy to show: indeed, an elementary symmetric multilinear map $l^{\otimes t} \otimes w$, for $l \in V^\vee$, $w \in W$, is obviously Frobenius symmetric, so any linear combination of such maps should also be. The author learned it from I. Cascudo, who provided the example of the map
\[ m : \mathbb{F}_4 \times \mathbb{F}_4 \times \mathbb{F}_4 \longrightarrow \mathbb{F}_4, \qquad (x, y, z) \mapsto xyz \]
which is trilinear symmetric over $\mathbb{F}_2$, but does not admit a symmetric trilinear algorithm, since one can find $x, y \in \mathbb{F}_4$ with $x^2 y \ne x y^2$.

So the new part in Theorem A.7 is that Frobenius symmetry is also a sufficient condition. For example the map
\[ m' : \mathbb{F}_4 \times \mathbb{F}_4 \times \mathbb{F}_4 \longrightarrow \mathbb{F}_4, \qquad (x, y, z) \mapsto x^2 y z + x y^2 z + x y z^2 \]
is trilinear symmetric over $\mathbb{F}_2$, and it satisfies $m'(x, x, y) = m'(x, y, y)$ for all $x, y \in \mathbb{F}_4$, so it has to admit a symmetric trilinear algorithm. And indeed, one can check
\[ m'(x, y, z) = \mathrm{tr}(x)\,\mathrm{tr}(y)\,\mathrm{tr}(z) + \alpha^2\, \mathrm{tr}(\alpha x)\,\mathrm{tr}(\alpha y)\,\mathrm{tr}(\alpha z) + \alpha\, \mathrm{tr}(\alpha^2 x)\,\mathrm{tr}(\alpha^2 y)\,\mathrm{tr}(\alpha^2 z) \]
where $\mathbb{F}_4 = \mathbb{F}_2[\alpha]/(\alpha^2 + \alpha + 1)$ and $\mathrm{tr}$ is the trace from $\mathbb{F}_4$ to $\mathbb{F}_2$.

A.12. — Theorem A.7 allows us to retrieve (and perhaps give a slightly more conceptual proof of) a result of N. Bshouty [7, Th. 5]: given a finite dimensional commutative $\mathbb{F}_q$-algebra $A$, the $t$-wise multiplication map
\[ m_t : A^t \longrightarrow A, \qquad (a_1, \dots, a_t) \mapsto a_1 \cdots a_t \]
admits a symmetric multilinear algorithm if and only if, either:
• $t \le q$, or
• $t \ge q + 1$ and all elements $a \in A$ satisfy $a^q = a$.
Indeed, this is easily seen to be equivalent to $m_t$ satisfying the Frobenius exchange condition A.5.

So, for $t \le q$, a symmetric algorithm for $m_t$ exists for any algebra $A$. On the other hand, for $t > q$, it turns out that Bshouty's condition is very restrictive. An example of an algebra in which $m_t$ obviously admits a symmetric multilinear algorithm is $(\mathbb{F}_q)^n$ (one can take the trivial algorithm of length $n$, which is moreover easily seen to be minimal). We show that, in fact, it is the only possibility, leading to the following "all-or-nothing" result:

Proposition. If all elements $a$ in a commutative $\mathbb{F}_q$-algebra $A$ of dimension $n$ satisfy $a^q = a$, then we have an isomorphism $A \simeq (\mathbb{F}_q)^n$ of $\mathbb{F}_q$-algebras. As a consequence, the $t$-wise multiplication map $m_t$ in a commutative $\mathbb{F}_q$-algebra $A$ of dimension $n$ admits a symmetric multilinear algorithm if and only if, either:
• $t \le q$, or
• $t \ge q + 1$ and $A \simeq (\mathbb{F}_q)^n$.

Proof. Let $A$ be a commutative $\mathbb{F}_q$-algebra of dimension $n$ in which all elements $a$ satisfy $a^q = a$. Then $a^{q^e} = a$ for arbitrarily large $e$, implying that $A$ has no (nonzero) nilpotent element. This means the finite commutative $\mathbb{F}_q$-algebra $A$ is reduced, and as such, it can be written as a product of extension fields
\[ A = \prod_i \mathbb{F}_{q^{r_i}}. \]
Now using the condition $a^q = a$ once again, we see all $r_i = 1$.
(This proof, shorter than the author's original one, was inspired by a remark from I. Cascudo.)

Trisymmetric and normalized multiplication algorithms.

A.13. — We conclude with a last application of Theorem A.7. Let $q$ be a prime power, and $k \ge 1$ an integer. For any $a \in \mathbb{F}_{q^k}$ we define a linear form $t_a : \mathbb{F}_{q^k} \longrightarrow \mathbb{F}_q$ by setting $t_a(x) = \mathrm{tr}(ax)$, where $\mathrm{tr}$ is the trace from $\mathbb{F}_{q^k}$ to $\mathbb{F}_q$. It is well known that the map $a \mapsto t_a$ induces an isomorphism of $\mathbb{F}_q$-vector spaces $\mathbb{F}_{q^k} \simeq (\mathbb{F}_{q^k})^\vee$.

Now let $m : \mathbb{F}_{q^k} \times \mathbb{F}_{q^k} \longrightarrow \mathbb{F}_{q^k}$ be the multiplication map in $\mathbb{F}_{q^k}$, viewed as a tensor $m \in \mathrm{Sym}^2_{\mathbb{F}_q}((\mathbb{F}_{q^k})^\vee) \otimes_{\mathbb{F}_q} \mathbb{F}_{q^k}$. Then by construction, $\mu_q^{\mathrm{sym}}(k)$ is the rank of $m$ (in the sense of 1.13) with respect to the set of elementary symmetric tensors, that is, tensors of the form $t_a^{\otimes 2} \otimes b$ for $a, b \in \mathbb{F}_{q^k}$. By definition, "symmetry" here refers only to the first two factors of the tensors. However, one could try to strengthen this notion as follows:

Definition. The elementary trisymmetric tensors in $\mathrm{Sym}^2_{\mathbb{F}_q}((\mathbb{F}_{q^k})^\vee) \otimes_{\mathbb{F}_q} \mathbb{F}_{q^k}$ are those of the form $t_a^{\otimes 2} \otimes a$ for $a \in \mathbb{F}_{q^k}$. We then define $\mu_q^{\mathrm{tri}}(k)$, the trisymmetric bilinear complexity of $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$, as the rank of $m$ with respect to these. Equivalently, $\mu_q^{\mathrm{tri}}(k)$ is the smallest possible length of a trisymmetric bilinear algorithm for multiplication in $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$, that is, of a decomposition of $m$ as a linear combination of elementary trisymmetric tensors. (Conveniently, if no such algorithm exists, we set $\mu_q^{\mathrm{tri}}(k) = \infty$.)

An equivalent definition can already be found in [49]. Now we determine the values of $q$ and $k$ for which $\mu_q^{\mathrm{tri}}(k)$ is finite:

A.14. Proposition. A trisymmetric bilinear algorithm for multiplication in $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$ exists for all $q$ and $k$, except precisely for $q = 2$, $k \ge 3$.

Proof. Applying the trace, we see that a trisymmetric bilinear multiplication algorithm of the form
\[ m = \sum_{1 \le j \le n} \lambda_j\, t_{a_j}^{\otimes 2} \otimes a_j \ \in\ \mathrm{Sym}^2_{\mathbb{F}_q}((\mathbb{F}_{q^k})^\vee) \otimes_{\mathbb{F}_q} \mathbb{F}_{q^k} \]
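for $\lambda_j \in \mathbb{F}_q$, $a_j \in \mathbb{F}_{q^k}$, corresponds to a symmetric trilinear algorithm
\[ T = \sum_{1 \le j \le n} \lambda_j\, t_{a_j}^{\otimes 3} \ \in\ \mathrm{Sym}^3_{\mathbb{F}_q}((\mathbb{F}_{q^k})^\vee) \]
for the symmetric trilinear form
\[ T : \mathbb{F}_{q^k} \times \mathbb{F}_{q^k} \times \mathbb{F}_{q^k} \longrightarrow \mathbb{F}_q, \qquad (x, y, z) \mapsto \mathrm{tr}(xyz). \]
By Theorem A.7, a symmetric trilinear form always admits a symmetric trilinear algorithm when $q \ge 3$. Now, suppose $q = 2$. Then $T$ satisfies the Frobenius exchange condition if and only if $\mathrm{tr}(x^2 y) = \mathrm{tr}(x y^2)$ for all $x, y \in \mathbb{F}_{2^k}$. On the other hand we have $\mathrm{tr}(a) = \mathrm{tr}(a^2)$ for all $a$, so in particular $\mathrm{tr}(x^2 y) = \mathrm{tr}(x^4 y^2)$, hence $T$ satisfies the Frobenius exchange condition if and only if $\mathrm{tr}(x^4 y^2) = \mathrm{tr}(x y^2)$ for all $x, y \in \mathbb{F}_{2^k}$. But this means $t_{x^4} = t_x$, or equivalently $x^4 = x$, for all $x$. We conclude since this holds if and only if $k = 1$ or $2$.

A.15. — In case $q = k = 2$, one can check that the trisymmetric bilinear complexity of $\mathbb{F}_4$ over $\mathbb{F}_2$ is 3, so
\[ \mu_2^{\mathrm{tri}}(2) = \mu_2^{\mathrm{sym}}(2) = \mu_2(2) = 3. \]
Indeed, setting $\mathbb{F}_4 = \mathbb{F}_2[\alpha]/(\alpha^2 + \alpha + 1)$, a symmetric trilinear algorithm for $T$ is given by the tensor decomposition
\[ T = t_1^{\otimes 3} + t_\alpha^{\otimes 3} + t_{\alpha^2}^{\otimes 3} \]
in $\mathrm{Sym}^3_{\mathbb{F}_2}((\mathbb{F}_4)^\vee)$. In fact $T$ is the only element of rank 3 in $\mathrm{Sym}^3_{\mathbb{F}_2}((\mathbb{F}_4)^\vee)$. This formula could be compared with that for the symmetric algorithm for $m'$ in A.11, which can be rewritten
\[ m' = t_1^{\otimes 3} \otimes 1 + t_\alpha^{\otimes 3} \otimes \alpha^2 + t_{\alpha^2}^{\otimes 3} \otimes \alpha \]
in $\mathrm{Sym}^3_{\mathbb{F}_2}((\mathbb{F}_4)^\vee) \otimes_{\mathbb{F}_2} \mathbb{F}_4$. It also motivates the following:

A.16. Definition. A normalized trisymmetric bilinear algorithm of length $n$ for multiplication in $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$ is a decomposition of the multiplication tensor $m$ as a sum of $n$ elementary trisymmetric tensors
\[ m = \sum_{1 \le j \le n} t_{a_j}^{\otimes 2} \otimes a_j \]
in $\mathrm{Sym}^2_{\mathbb{F}_q}((\mathbb{F}_{q^k})^\vee) \otimes_{\mathbb{F}_q} \mathbb{F}_{q^k}$, or equivalently, of the trace trilinear form $T$ as a sum of $n$ cubes
\[ T = \sum_{1 \le j \le n} t_{a_j}^{\otimes 3} \]
in $\mathrm{Sym}^3_{\mathbb{F}_q}((\mathbb{F}_{q^k})^\vee)$. We then define $\mu_q^{\mathrm{nrm}}(k)$, the normalized trisymmetric bilinear complexity of $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$, as the smallest possible length of such a normalized algorithm. (Conveniently, if no such algorithm exists, we set $\mu_q^{\mathrm{nrm}}(k) = \infty$.)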
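The $\mathbb{F}_4$ identities of A.11 and A.15 can be checked by brute force. The following Python sketch (not part of the original text) encodes the elements of $\mathbb{F}_4 = \mathbb{F}_2[\alpha]/(\alpha^2 + \alpha + 1)$ as the integers 0, 1, 2, 3, with 2 standing for $\alpha$ and 3 for $\alpha^2 = \alpha + 1$; this encoding and all function names are assumptions made for the illustration.

```python
from itertools import product

ALPHA, ALPHA2 = 2, 3  # alpha and alpha^2 = alpha + 1 in the chosen encoding


def mul(x, y):
    """Multiply in F4 = F2[a]/(a^2 + a + 1); addition in F4 is bitwise XOR."""
    r = 0
    if y & 1:
        r ^= x
    if y & 2:
        r ^= x << 1
    if r & 4:          # reduce a^2 to a + 1
        r ^= 0b111
    return r


def tr(x):
    """Trace from F4 to F2: tr(x) = x + x^2, which is 0 or 1."""
    return mul(x, x) ^ x


def m_prime(x, y, z):
    """m'(x, y, z) = x^2 y z + x y^2 z + x y z^2."""
    x2, y2, z2 = mul(x, x), mul(y, y), mul(z, z)
    return mul(mul(x2, y), z) ^ mul(mul(x, y2), z) ^ mul(mul(x, y), z2)


def m_prime_decomposed(x, y, z):
    """Right-hand side of the decomposition of m' stated in A.11."""
    total = 0
    for a, b in ((1, 1), (ALPHA, ALPHA2), (ALPHA2, ALPHA)):
        # term b * tr(a x) tr(a y) tr(a z); the product of traces is 0 or 1
        if tr(mul(a, x)) & tr(mul(a, y)) & tr(mul(a, z)):
            total ^= b
    return total


def T(x, y, z):
    """Trace trilinear form T(x, y, z) = tr(xyz)."""
    return tr(mul(mul(x, y), z))


def T_decomposed(x, y, z):
    """Right-hand side of T = t_1^3 + t_alpha^3 + t_{alpha^2}^3 from A.15."""
    total = 0
    for a in (1, ALPHA, ALPHA2):
        total ^= tr(mul(a, x)) & tr(mul(a, y)) & tr(mul(a, z))
    return total


def holds_everywhere(f, g):
    """True when f and g agree on all 64 triples of F4."""
    return all(f(*v) == g(*v) for v in product(range(4), repeat=3))


if __name__ == "__main__":
    print(holds_everywhere(m_prime, m_prime_decomposed))
    print(holds_everywhere(T, T_decomposed))
```

The same `mul` can be used to exhibit a pair with $x^2 y \ne x y^2$, which is the obstruction for the plain product $xyz$ in A.11.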
The new restriction here is that we require the decomposition to be a sum, not a mere linear combination (and as a consequence, this time $\mu_q^{\mathrm{nrm}}(k)$ cannot be interpreted in terms of a rank function). This is somehow reminiscent of the distinction between orthogonal and self-dual bases in a nondegenerate quadratic space. In fact, suppose instead of the trace trilinear form $T(x, y, z) = \mathrm{tr}(xyz)$ of $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$, we're interested in the much more manageable trace bilinear form $B(x, y) = \mathrm{tr}(xy)$. Since $B$ is nondegenerate, it has symmetric complexity $k$. Moreover, symmetric algorithms for $B$ correspond to orthogonal bases of $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$, although not necessarily self-dual. See e.g. [48][27] for more on this topic and related questions.

A.17. — The various notions of bilinear complexity defined so far can be compared. Obviously (or for the first three, as a consequence of Lemma 1.16) we always have
\[ \mu_q(k) \le \mu_q^{\mathrm{sym}}(k) \le \mu_q^{\mathrm{tri}}(k) \le \mu_q^{\mathrm{nrm}}(k). \]
In the other direction, by [49, Th. 1] or [42, Lemma 1.6] we have
\[ \mu_q^{\mathrm{sym}}(k) \le 2\, \mu_q(k) \qquad \text{for } \mathrm{char}(\mathbb{F}_q) \ne 2, \]
and by [49, Th. 2]
\[ \mu_q^{\mathrm{tri}}(k) \le 4\, \mu_q^{\mathrm{sym}}(k) \qquad \text{for } q \ne 2,\ \mathrm{char}(\mathbb{F}_q) \ne 3. \]
Now we want an upper bound on $\mu_q^{\mathrm{nrm}}(k)$. This can be stated in a greater generality. Let $V$ be an $\mathbb{F}_q$-vector space, and let $F \in \mathrm{Sym}^t(V^\vee)$ be a symmetric $t$-multilinear form with a $t$-symmetric algorithm of length $n$
\[ F = \sum_{1 \le j \le n} \lambda_j\, l_j^{\otimes t}, \]
for $\lambda_j \in \mathbb{F}_q$, $l_j \in V^\vee$. Now suppose each $\lambda_j$ can be written as a sum of $g$ $t$-th powers in $\mathbb{F}_q$
\[ \lambda_j = \xi_{j,1}^t + \cdots + \xi_{j,g}^t. \]
Then we have
\[ F = \sum_{1 \le j \le n} \big( (\xi_{j,1} l_j)^{\otimes t} + \cdots + (\xi_{j,g} l_j)^{\otimes t} \big) \]
so $F$ is a sum of $gn$ $t$-th powers in $\mathrm{Sym}^t(V^\vee)$. In particular, if $g = g(t, q)$ is the smallest integer such that any element in $\mathbb{F}_q$ is a sum of $g$ $t$-th powers in $\mathbb{F}_q$ (if no such integer exists we set $g = \infty$), we find
\[ \mu_q^{\mathrm{nrm}}(k) \le g(3, q)\, \mu_q^{\mathrm{tri}}(k). \]
Determination of $g(t, q)$ is an instance of Waring's problem (note that determination of $\mu_q^{\mathrm{nrm}}(k)$, that is, of the shortest decomposition of $T$ as a sum of cubes in $\mathrm{Sym}^3_{\mathbb{F}_q}((\mathbb{F}_{q^k})^\vee)$, also is!). For $t = 3$ the answer is well known (see e.g. [53]):

A.18. Lemma. For $q \ne 2, 4, 7$, we have $g(3, q) = 2$, i.e. any element in $\mathbb{F}_q$ is a sum of two cubes (and this is optimal). The exceptions are: $g(3, 2) = 1$, $g(3, 4) = \infty$, and $g(3, 7) = 3$.

In $\mathbb{F}_4 = \mathbb{F}_2[\alpha]/(\alpha^2 + \alpha + 1)$, note that every nonzero $x$ satisfies $x^3 = 1$. As a consequence, neither $\alpha$ nor $\alpha^2$ can be written as a sum of cubes, and $g(3, 4) = \infty$ as asserted.
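For prime $q$, the quantity $g(t, q)$ of A.17 can be computed by a direct search. The Python sketch below (not part of the original text) does so by repeatedly adding the set of $t$-th powers to itself; it handles prime fields only, so the case $q = 4$ is out of its scope, and the function name and the `max_terms` cut-off are assumptions made for the illustration.

```python
def waring_number(t, p, max_terms=20):
    """Smallest g such that every element of F_p (p prime) is a sum of g t-th powers.

    Returns None if no such g <= max_terms is found.
    """
    powers = {pow(x, t, p) for x in range(p)}
    # Sums of exactly g powers; since 0 is a t-th power, this also covers fewer terms.
    reachable = set(powers)
    g = 1
    while len(reachable) < p:
        if g >= max_terms:
            return None
        reachable = {(a + b) % p for a in reachable for b in powers}
        g += 1
    return g


if __name__ == "__main__":
    for p in (2, 7, 13):
        print(p, waring_number(3, p))
```

In a prime field the search always terminates, because 1 is a $t$-th power and every element is a sum of ones; the cut-off only guards against misuse.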
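A.19. Proposition. A normalized trisymmetric multiplication algorithm for $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$ exists for all $q$ and $k$, except precisely for $q = 2$, $k \ge 3$ and for $q = 4$, $k \ge 2$. More precisely, we have:
(i) $\mu_q^{\mathrm{nrm}}(1) = 1$ for all $q$,
(ii) $\mu_2^{\mathrm{nrm}}(2) = 3$,
(iii) $\mu_2^{\mathrm{nrm}}(k) = \infty$ for $k \ge 3$,
(iv) $\mu_4^{\mathrm{nrm}}(k) = \infty$ for $k \ge 2$,
(v) $\mu_7^{\mathrm{nrm}}(k) \le 3\, \mu_7^{\mathrm{tri}}(k)$ for $k \ge 2$,
(vi) $\mu_q^{\mathrm{nrm}}(k) \le 2\, \mu_q^{\mathrm{tri}}(k)$ for $q \ne 2, 4, 7$ and $k \ge 2$.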
Proof. Item (i) is obvious, (ii) comes from A.15, and (iii)(v)(vi) from Proposition A.14 joint with the discussion in A.17 and Lemma A.18. Now to prove (iv) we have to show that, for $k \ge 2$, there is no normalized multiplication algorithm for $\mathbb{F}_{4^k}$ over $\mathbb{F}_4$. We proceed by contradiction, so we suppose we have a decomposition
\[ T = \sum_{1 \le j \le n} t_{a_j}^{\otimes 3} \]
which implies, for any $x \in \mathbb{F}_{4^k}$,
\[ \mathrm{tr}(x^3) = \sum_{1 \le j \le n} \mathrm{tr}(a_j x)^3. \]
Now let $\alpha \in \mathbb{F}_4$ with $\alpha^2 = \alpha + 1$. The trace function $\mathrm{tr} : \mathbb{F}_{4^k} \longrightarrow \mathbb{F}_4$ is surjective, so $\alpha = \mathrm{tr}(z)$ for some $z \in \mathbb{F}_{4^k}$. Moreover, by Lemma A.18, we can write $z = x^3 + y^3$ as a sum of two cubes in $\mathbb{F}_{4^k}$. So we conclude that
\[ \alpha = \mathrm{tr}(x^3 + y^3) = \sum_{1 \le j \le n} \big( \mathrm{tr}(a_j x)^3 + \mathrm{tr}(a_j y)^3 \big) \]
is a sum of cubes in $\mathbb{F}_4$, in contradiction with Lemma A.18 and the discussion following it.

The more constraints we put on the structure of the algorithms, the smaller the set of such algorithms is, and hopefully the lower the complexity of search in this set should be. This makes one wonder whether an adaptation of the methods of [2] could allow one to successfully compute the exact values of $\mu_q^{\mathrm{tri}}(k)$ and $\mu_q^{\mathrm{nrm}}(k)$ for a not-so-small range of $k$, and find the corresponding optimal algorithms.

Appendix B: On symmetric multilinearized polynomials

B.1. — Let $A$ be the algebra of all functions from $(\mathbb{F}_{q^r})^t$ to $\mathbb{F}_{q^r}$. It is easily checked that any such function can be represented as a polynomial function, and moreover, since elements of $\mathbb{F}_{q^r}$ satisfy $x^{q^r} = x$, we have a natural identification
\[ A = \mathbb{F}_{q^r}[x_1, \dots, x_t] / (x_1^{q^r} - x_1, \dots, x_t^{q^r} - x_t). \]
Often we will identify an element $f \in A$ with its (unique) representative of minimum degree in $\mathbb{F}_{q^r}[x_1, \dots, x_t]$. Likewise we identify $\mathbb{Z}/r\mathbb{Z}$ with $\{0, 1, \dots, r - 1\}$.
The symmetric group $S_t$ acts linearly on $A$, by permutation of the variables: for $\sigma \in S_t$, $f \in A$, and $(u_1, \dots, u_t) \in (\mathbb{F}_{q^r})^t$, we set $(\sigma f)(u_1, \dots, u_t) = f(u_{\sigma(1)}, \dots, u_{\sigma(t)})$. Also, the Frobenius $f \mapsto f^q$ defines an automorphism of $A$ over $\mathbb{F}_q$, of order $r$, hence an action of $\mathbb{Z}/r\mathbb{Z}$ on $A$, where $j \in \mathbb{Z}/r\mathbb{Z}$ acts as $f \mapsto f^{q^j}$. These two actions commute, so $A$ is equipped with an action of $G = S_t \times \mathbb{Z}/r\mathbb{Z}$, the invariants of which are the symmetric functions on $(\mathbb{F}_{q^r})^t$ with values in $\mathbb{F}_q$. Note that the action of $S_t$ is linear over $\mathbb{F}_{q^r}$, while the action of $\mathbb{Z}/r\mathbb{Z}$ (hence that of $G$) is only linear over $\mathbb{F}_q$.

B.2. Definition. The $t$-multilinearized polynomials with coefficients in $\mathbb{F}_{q^r}$ over $\mathbb{F}_q$ are the polynomials of the form
\[ \sum_{0 \le i_1, \dots, i_t \le r-1} a_{i_1, \dots, i_t}\, x_1^{q^{i_1}} \cdots x_t^{q^{i_t}} \]
with $a_{i_1, \dots, i_t} \in \mathbb{F}_{q^r}$. These $t$-multilinearized polynomials form an $\mathbb{F}_{q^r}$-linear subspace $B \subseteq A$, of dimension $r^t$ over $\mathbb{F}_{q^r}$ (hence also of dimension $r^{t+1}$ over $\mathbb{F}_q$). It is easily checked that $B$ coincides precisely with the space of functions from $(\mathbb{F}_{q^r})^t$ to $\mathbb{F}_{q^r}$ that are $t$-multilinear over $\mathbb{F}_q$.

For $t = 1$, one retrieves the notion of linearized polynomials, which is an important tool in the theory of finite fields and in coding theory. For $t = 2$, bilinearized polynomials have also been introduced to solve various problems in bilinear algebra, as illustrated in [49] and [43]. Our aim here is to extend some of the results from [43] to arbitrary $t$. More precisely, in Theorem B.7 we construct a family of homogeneous symmetric $t$-multilinearized polynomials
\[ S_I : (\mathbb{F}_{q^r})^t \longrightarrow \mathbb{F}_{q^{r_I}} \]
(the index $I$ ranges in a certain set $S$ and is essentially the multidegree of $S_I$) taking values in an intermediate field $\mathbb{F}_{q^{r_I}}$, and satisfying the following universal property: for any $\mathbb{F}_q$-vector space $V$, and for any map $F : (\mathbb{F}_{q^r})^t \longrightarrow V$ symmetric $t$-multilinear over $\mathbb{F}_q$, there is a unique family of $\mathbb{F}_q$-linear maps $f_I : \mathbb{F}_{q^{r_I}} \longrightarrow V$ such that
\[ F = \sum_{I \in S} f_I \circ S_I. \]
Then in Theorem B.9 (or more precisely in Corollary B.22) we give an upper bound on the degrees of the $S_I$.

Polynomial description of symmetric powers of an extension field.

First we state (without proof) two easy results on group actions.
B.3. Lemma. Let $\Gamma$ be a finite group acting on a finite set $P$, let $R \subseteq P$ be a set of representatives for the action, and for $I \in R$ let $o(I) \subseteq P$ be its orbit. Suppose also $\Gamma$ acts linearly on a vector space $V$ with basis $(b_I)_{I \in P}$, such that $\gamma \cdot b_I = b_{\gamma I}$ for all $\gamma \in \Gamma$, $I \in P$. Now for $I \in R$, set
\[ s_I = \sum_{J \in o(I)} b_J. \]
Then, the subspace of invariants $V^\Gamma$ admits the $(s_I)_{I \in R}$ as a basis.

B.4. Lemma. Let the finite cyclic group $\mathbb{Z}/r\mathbb{Z}$ act on a finite set $R$, and let $S \subseteq R$ be a set of representatives for the action. For each $I \in S$, let $r_I = |o(I)|$ be the size of its orbit, so $r_I \mid r$ and $r_I \mathbb{Z}/r\mathbb{Z} \subseteq \mathbb{Z}/r\mathbb{Z}$ is the stabilizer subgroup of $I$. Now set $P = \mathbb{Z}/r\mathbb{Z} \times R$ and let $\mathbb{Z}/r\mathbb{Z}$ act on this product set, on the first factor, by translation, and on the second factor, by the action we started with. Then the action of $\mathbb{Z}/r\mathbb{Z}$ on $P$ is free, and it admits
\[ T = \{(i, I) ;\ 0 \le i \le r_I - 1,\ I \in S\} \subseteq P \]
as a set of representatives. Moreover we have
\[ |T| = \sum_{I \in S} r_I = |R|. \]
Note that there are other possible choices for a set of representatives, for instance a more obvious one would be $\{0\} \times R$. However, $T$ is the choice that will make the proof of Theorem B.7 below work.

B.5. — To each $I = (i_1, \dots, i_t) \in (\mathbb{Z}/r\mathbb{Z})^t$ we associate a monomial
\[ M_I = x^{q^I} = x_1^{q^{i_1}} \cdots x_t^{q^{i_t}}, \]
of degree $D_I = q^{i_1} + \cdots + q^{i_t}$, so $t \le D_I \le t q^{r-1}$. These form a basis of $B$ over $\mathbb{F}_{q^r}$. As we saw in B.1, the symmetric group $S_t$ acts on $A$, and we let it act also on $(\mathbb{Z}/r\mathbb{Z})^t$ by permutation of coordinates. The map $I \mapsto M_I$ is compatible with these actions, in that $\sigma M_I = M_{\sigma(I)}$ for $\sigma \in S_t$, so $B$ is stable under $S_t$.

Let $R \subseteq (\mathbb{Z}/r\mathbb{Z})^t$ be the set of nonincreasing $t$-tuples of elements of $\mathbb{Z}/r\mathbb{Z} \simeq \{0, 1, \dots, r - 1\}$. It has cardinality
\[ |R| = \binom{r+t-1}{t}. \]
Clearly $R$ is a set of representatives for the action of $S_t$ on $(\mathbb{Z}/r\mathbb{Z})^t$, so we have a bijection
\[ R \overset{\sim}{\longrightarrow} (\mathbb{Z}/r\mathbb{Z})^t / S_t, \qquad I \mapsto o(I) \]
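where $o(I) \subseteq (\mathbb{Z}/r\mathbb{Z})^t$ is the orbit of $I$ under $S_t$. Now for $I \in R$ we set
\[ S_I = \sum_{J \in o(I)} M_J, \]
so $S_I$ is a symmetric homogeneous polynomial of degree $D_I$ in $t$ variables over $\mathbb{F}_{q^r}$, which is also symmetric $t$-multilinear over $\mathbb{F}_q$. The number $|o(I)|$ of monomials in $S_I$ is a divisor of $t!$ (it can be a strict divisor if $I$ has repeated elements). Then, by Lemma B.3, the subspace of invariants
\[ B^{S_t} = \mathrm{Sym}^t_{\mathbb{F}_q}(\mathbb{F}_{q^r}; \mathbb{F}_{q^r}) \]
admits the $(S_I)_{I \in R}$ as a basis over $\mathbb{F}_{q^r}$.

B.6. — As we saw in B.1, the cyclic group $\mathbb{Z}/r\mathbb{Z}$ acts on $A$ by Frobenius, and we let it act also on $(\mathbb{Z}/r\mathbb{Z})^t$ diagonally by translation, that is, we let $j \in \mathbb{Z}/r\mathbb{Z}$ act as
\[ I = (i_1, \dots, i_t) \mapsto I + j = (i_1 + j, \dots, i_t + j) \]
where addition is modulo $r$. The map $I \mapsto M_I$ is compatible with these actions, in that
\[ (M_I)^{q^j} = M_{I+j} \]
so $B$ is stable under $\mathbb{Z}/r\mathbb{Z}$. This diagonal action of $\mathbb{Z}/r\mathbb{Z}$ on $(\mathbb{Z}/r\mathbb{Z})^t$ commutes with that of $S_t$, so it defines an action of $\mathbb{Z}/r\mathbb{Z}$ on $R \simeq (\mathbb{Z}/r\mathbb{Z})^t / S_t$, which can be written as
\[ \mathbb{Z}/r\mathbb{Z} \times R \longrightarrow R, \qquad (j, I) \mapsto I \oplus j \]
where $I \oplus j$ is the representative of $I + j$ in $R$. More precisely, $I + j$ need not be nonincreasing since addition is modulo $r$, but there is a (cyclic) permutation that puts it back in nonincreasing order, the result of which is $I \oplus j$. (Example: $r = 10$, $t = 5$, $I = (8, 7, 4, 2, 2)$, $I + 3 = (1, 0, 7, 5, 5)$, $I \oplus 3 = (7, 5, 5, 1, 0)$.) By construction we then have
\[ S_I^{q^j} = S_{I \oplus j} \]
in $B^{S_t}$. Choosing a set of representatives $S \subseteq R$ for this action of $\mathbb{Z}/r\mathbb{Z}$ on $R$, we note that
\[ S \simeq R/(\mathbb{Z}/r\mathbb{Z}) \simeq ((\mathbb{Z}/r\mathbb{Z})^t / S_t)/(\mathbb{Z}/r\mathbb{Z}) \simeq (\mathbb{Z}/r\mathbb{Z})^t / G, \]
so $S$ is also a set of representatives for the action of $G$ on $(\mathbb{Z}/r\mathbb{Z})^t$.

B.7. — For each $I \in S$, we let $r_I$ be the size of its orbit under $\mathbb{Z}/r\mathbb{Z}$ in $R$, so $r_I \mid r$ and $r_I \mathbb{Z}/r\mathbb{Z} \subseteq \mathbb{Z}/r\mathbb{Z}$ is the stabilizer subgroup of $I$. We then have $S_I^{q^{r_I}} = S_I$, so in fact $S_I$ defines a map
\[ S_I : (\mathbb{F}_{q^r})^t \longrightarrow \mathbb{F}_{q^{r_I}} \]
whose image lies in the subfield $\mathbb{F}_{q^{r_I}} \subseteq \mathbb{F}_{q^r}$.

Theorem. With these notations, the map
\[ \Psi : (\mathbb{F}_{q^r})^t \longrightarrow \prod_{I \in S} \mathbb{F}_{q^{r_I}} \]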
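whose components are the $S_I$ for $I \in S$, is symmetric $t$-multilinear over $\mathbb{F}_q$, and moreover it is universal for this property. In particular, it induces an isomorphism
\[ S^t_{\mathbb{F}_q} \mathbb{F}_{q^r} \simeq \prod_{I \in S} \mathbb{F}_{q^{r_I}} \]
of $\mathbb{F}_q$-vector spaces.

Proof. We have to show that, for a certain basis $B$ of the dual $\mathbb{F}_q$-vector space $(\prod_{I \in S} \mathbb{F}_{q^{r_I}})^\vee$, the $(b \circ \Psi)_{b \in B}$ form a basis of $(S^t_{\mathbb{F}_q} \mathbb{F}_{q^r})^\vee = \mathrm{Sym}^t_{\mathbb{F}_q}((\mathbb{F}_{q^r})^\vee)$ over $\mathbb{F}_q$. For this we would like, ultimately, to apply Lemma B.3 to the action of $\mathbb{Z}/r\mathbb{Z}$ on $B^{S_t}$. Indeed, as already noted $B^{S_t} = \mathrm{Sym}^t_{\mathbb{F}_q}(\mathbb{F}_{q^r}; \mathbb{F}_{q^r})$, so its subspace of invariants under Frobenius is
\[ B^G = (B^{S_t})^{\mathbb{Z}/r\mathbb{Z}} = \mathrm{Sym}^t_{\mathbb{F}_q}(\mathbb{F}_{q^r}; \mathbb{F}_q) = \mathrm{Sym}^t_{\mathbb{F}_q}((\mathbb{F}_{q^r})^\vee). \]
Now, since the action of $\mathbb{Z}/r\mathbb{Z}$ is only $\mathbb{F}_q$-linear, we need a basis of $B^{S_t}$ over $\mathbb{F}_q$, stable under the action.

First choose a $\gamma \in \mathbb{F}_{q^r}$ such that $\gamma, \gamma^q, \dots, \gamma^{q^{r-1}}$ is a (normal) basis of $\mathbb{F}_{q^r}$ over $\mathbb{F}_q$. Then, given $I \in S$, we set
\[ \beta_{i,I} = \sum_{j \in \mathbb{Z}/r\mathbb{Z},\ j \equiv i \bmod r_I} \gamma^{q^j} \]
for $0 \le i \le r_I - 1$, which happen to form a basis of $\mathbb{F}_{q^{r_I}}$ over $\mathbb{F}_q$. This is easily checked directly, but can also be viewed as a consequence of Lemma B.3 (with $\Gamma = r_I \mathbb{Z}/r\mathbb{Z}$, $P = \mathbb{Z}/r\mathbb{Z}$, $V = \mathbb{F}_{q^r}$, $V^\Gamma = \mathbb{F}_{q^{r_I}}$).

In B.5 we saw that the $(S_I)_{I \in R}$ form a basis of $B^{S_t}$ over $\mathbb{F}_{q^r}$. It then follows that the $(\gamma^{q^i} S_I)_{i \in \mathbb{Z}/r\mathbb{Z},\, I \in R}$ form a basis of $B^{S_t}$ over $\mathbb{F}_q$. This basis is stable under the action of $\mathbb{Z}/r\mathbb{Z}$ on $B^{S_t}$ by Frobenius, more precisely we have
\[ (\gamma^{q^i} S_I)^{q^j} = \gamma^{q^{i+j}} S_{I \oplus j}. \]
This means our basis is indexed by $P = \mathbb{Z}/r\mathbb{Z} \times R$, and $\mathbb{Z}/r\mathbb{Z}$ acts on this product set, on the first factor, by translation, and on the second factor, by the action $\oplus$. Let
\[ T = \{(i, I) ;\ 0 \le i \le r_I - 1,\ I \in S\} \subseteq P \]
be the set of representatives given by Lemma B.4. Now we can apply Lemma B.3, which gives that the
\[ F_{i,I} = \sum_{j \in \mathbb{Z}/r\mathbb{Z}} (\gamma^{q^i} S_I)^{q^j}, \]
for $(i, I) \in T$, form a basis of $(B^{S_t})^{\mathbb{Z}/r\mathbb{Z}} = \mathrm{Sym}^t_{\mathbb{F}_q}((\mathbb{F}_{q^r})^\vee)$ over $\mathbb{F}_q$. Using the invariance of $S_I$ under $r_I \mathbb{Z}/r\mathbb{Z}$ and grouping together the $j$ according to their class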
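modulo $r_I$, these can also be written
\[ F_{i,I} = \sum_{j \in \mathbb{Z}/r_I\mathbb{Z}} (\beta_{i,I} S_I)^{q^j} = \varphi_{i,I} \circ S_I = \varphi_{i,I} \circ \pi_I \circ \Psi \]
where
\[ \varphi_{i,I} : \mathbb{F}_{q^{r_I}} \longrightarrow \mathbb{F}_q, \qquad x \mapsto \mathrm{tr}_{\mathbb{F}_{q^{r_I}}/\mathbb{F}_q}(\beta_{i,I}\, x) \]
is the trace linear form deduced from $\beta_{i,I}$, and
\[ \pi_I : \prod_{I' \in S} \mathbb{F}_{q^{r_{I'}}} \longrightarrow \mathbb{F}_{q^{r_I}} \]
is projection on the $I$-th factor.

Now, for fixed $I$, the $\beta_{i,I}$ form a basis of $\mathbb{F}_{q^{r_I}}$ over $\mathbb{F}_q$, so the $\varphi_{i,I}$ form a basis of $(\mathbb{F}_{q^{r_I}})^\vee$ over $\mathbb{F}_q$. Hence as $i$ and $I$ vary, the $\varphi_{i,I} \circ \pi_I$ form a basis of $(\prod_{I \in S} \mathbb{F}_{q^{r_I}})^\vee$ over $\mathbb{F}_q$. This is the basis $B$ we were looking for at the beginning of the proof.

As a double check, Lemma B.4 also gives directly
\[ \dim_{\mathbb{F}_q} \prod_{I \in S} \mathbb{F}_{q^{r_I}} = \sum_{I \in S} r_I = |R| = \binom{r+t-1}{t} = \dim_{\mathbb{F}_q} S^t_{\mathbb{F}_q} \mathbb{F}_{q^r}. \]
Also we note that Burnside's lemma allows us to compute
\[ |S| = |R/(\mathbb{Z}/r\mathbb{Z})| = \frac{1}{r} \sum_{d \mid \gcd(r,t)} \phi(d) \binom{(r+t)/d - 1}{t/d}, \]
although this will not be needed in the sequel.

Equidistributed beads on a necklace.

B.8. — Recall from B.5-B.6 we are interested in the set $R = R_{r,t} \subseteq (\mathbb{Z}/r\mathbb{Z})^t$ of nonincreasing $t$-tuples of elements of $\mathbb{Z}/r\mathbb{Z} \simeq \{0, 1, \dots, r - 1\}$, of cardinality $|R_{r,t}| = \binom{r+t-1}{t}$, modulo the action $\oplus$ of $\mathbb{Z}/r\mathbb{Z}$, inherited from the diagonal action of $\mathbb{Z}/r\mathbb{Z}$ on $(\mathbb{Z}/r\mathbb{Z})^t$ by translation. There are several ways to interpret this object. For instance, we can also view it as the set of multisets of cardinality $t$ of elements of $\mathbb{Z}/r\mathbb{Z}$, or as the set of vectors in $\mathbb{N}^r$ that sum to $t$ (identify a multiset with its characteristic vector), with the natural action of $\mathbb{Z}/r\mathbb{Z}$ by cyclic permutation. So, in a sense, the quotient set $R_{r,t}/(\mathbb{Z}/r\mathbb{Z})$ describes all the possible arrangements of $r$ beads with weight in $\mathbb{N}$ into a circular necklace of total weight $t$.

We introduce a particular element $I_{r,t} \in R_{r,t}$, which corresponds to the weight being equidistributed on the necklace:
\[ I_{r,t} = \left( \left\lfloor \tfrac{(t-1)r}{t} \right\rfloor, \left\lfloor \tfrac{(t-2)r}{t} \right\rfloor, \dots, \left\lfloor \tfrac{r}{t} \right\rfloor, 0 \right) = \left( r - \left\lceil \tfrac{r}{t} \right\rceil, r - \left\lceil \tfrac{2r}{t} \right\rceil, \dots, r - \left\lceil \tfrac{(t-1)r}{t} \right\rceil, 0 \right). \]
Equip $R_{r,t}$ with the lexicographic order, so for $I = (i_1, \dots, i_t)$ and $J = (j_1, \dots, j_t)$ in $R_{r,t}$, we set $I < J$ if and only if there exists an index $a$ such that $i_b = j_b$ for all $b < a$, and $i_a < j_a$.

B.9. Theorem. Each orbit in $R_{r,t}/(\mathbb{Z}/r\mathbb{Z})$ admits a representative $I \le I_{r,t}$.

Example: $r = 10$, $t = 7$, $I_{10,7} = (8, 7, 5, 4, 2, 1, 0)$. Let $J = (9, 8, 7, 6, 4, 3, 1)$. Then in the orbit of $J$ we can find $J \oplus 4 = (8, 7, 5, 3, 2, 1, 0) < I_{10,7}$, and also $J \oplus 7 = (8, 6, 5, 4, 3, 1, 0) < I_{10,7}$ (so there is no uniqueness in the Theorem).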
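The objects of B.6 to B.9 are easy to experiment with. The following Python sketch (not part of the original text) implements the action of $\mathbb{Z}/r\mathbb{Z}$ on nonincreasing tuples, the equidistributed element $I_{r,t}$, a brute-force check of Theorem B.9, and the orbit count from Burnside's lemma in B.7. All function names are hypothetical, and the brute-force check is only meant for small $r$ and $t$.

```python
from itertools import combinations_with_replacement
from math import comb, gcd


def equidistributed(r, t):
    """The element I_{r,t} = (floor((t-1)r/t), ..., floor(r/t), 0)."""
    return tuple((t - a) * r // t for a in range(1, t + 1))


def shift(I, j, r):
    """Translate every entry of I by j modulo r, then sort back to nonincreasing order."""
    return tuple(sorted(((i + j) % r for i in I), reverse=True))


def orbit(I, r):
    """Orbit of I under the action of Z/rZ."""
    return {shift(I, j, r) for j in range(r)}


def all_multisets(r, t):
    """All nonincreasing t-tuples of elements of {0, ..., r-1}."""
    return [tuple(reversed(c)) for c in combinations_with_replacement(range(r), t)]


def theorem_b9_holds(r, t):
    """Check that every orbit has a representative <= I_{r,t} (lexicographic order)."""
    bound = equidistributed(r, t)
    # Python compares tuples lexicographically, so min() gives the smallest representative.
    return all(min(orbit(I, r)) <= bound for I in all_multisets(r, t))


def count_orbits(r, t):
    """Number of orbits, by direct enumeration."""
    return len({min(orbit(I, r)) for I in all_multisets(r, t)})


def totient(d):
    """Euler's totient function, computed naively."""
    return sum(1 for k in range(1, d + 1) if gcd(k, d) == 1)


def burnside_count(r, t):
    """Number of orbits according to the Burnside formula of B.7."""
    g = gcd(r, t)
    total = sum(totient(d) * comb((r + t) // d - 1, t // d)
                for d in range(1, g + 1) if g % d == 0)
    return total // r


if __name__ == "__main__":
    J = (9, 8, 7, 6, 4, 3, 1)
    print(equidistributed(10, 7), shift(J, 4, 10), shift(J, 7, 10))
    print(theorem_b9_holds(6, 4), count_orbits(6, 4), burnside_count(6, 4))
```

Taking the lexicographically smallest element of each orbit as its representative is one convenient choice here; the Theorem only asserts that some representative lies below $I_{r,t}$.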
Note that the Theorem is stated for multisets, but then, it applies a fortiori to ordinary sets. So it gives that, for any subset $J \subseteq \mathbb{Z}/r\mathbb{Z}$ of cardinality $|J| = t$, there is a translate $I$ of $J$ in $\mathbb{Z}/r\mathbb{Z}$ whose largest elements are $\lfloor \frac{(t-1)r}{t} \rfloor, \dots, \lfloor \frac{(t-a+1)r}{t} \rfloor$, but then the next one (if applicable) is smaller than $\lfloor \frac{(t-a)r}{t} \rfloor$. Moreover, here we used the lexicographic order, but the very same method of proof gives a similar result for the antilexicographic order: there is also a translate $I$ of $J$ in $\mathbb{Z}/r\mathbb{Z}$ whose smallest elements are $0, \lfloor \frac{r}{t} \rfloor, \dots, \lfloor \frac{(a-1)r}{t} \rfloor$, but then the next one (if applicable) is smaller than $\lfloor \frac{ar}{t} \rfloor$.

The proof of the Theorem will require several intermediary results:

B.10. Definition. We say $J = (j_1, \dots, j_t) \in R_{r,t}$ is reduced if $j_t = 0$. We let $R^0_{r,t} \subseteq R_{r,t}$ be the set of reduced elements.

For instance, $I_{r,t} \in R^0_{r,t}$ is reduced. Note also that forgetting the last coordinate gives $R^0_{r,t} \simeq R_{r,t-1}$, so $|R^0_{r,t}| = \binom{r+t-2}{t-1}$.

B.11. — Let $G_{r,t} \subseteq \mathbb{N}_{>0} \times \mathbb{N}^{t-1}$ be the set of $t$-tuples of integers $(g_1, \dots, g_t)$ with $g_1 > 0$ and sum
\[ g_1 + \cdots + g_t = r \]
so $|G_{r,t}| = \binom{r+t-2}{t-1}$. Equip $G_{r,t}$ with the lexicographic order. Given $J = (j_1, \dots, j_t) \in R^0_{r,t}$, so $j_t = 0$, we define its $i$-th gap, $1 \le i \le t$, as
\[ g_i(J) = \begin{cases} r - j_1 & \text{for } i = 1 \\ j_{i-1} - j_i & \text{for } 2 \le i \le t \end{cases} \]
so in particular $g_t(J) = j_{t-1}$, and we let its gap sequence be $g(J) = (g_1(J), \dots, g_t(J)) \in G_{r,t}$. Then:

B.12. Lemma. This map $g : R^0_{r,t} \longrightarrow G_{r,t}$ is an order-reversing bijection.

Proof. Indeed, to $(g_1, \dots, g_t) \in G_{r,t}$, the inverse map associates the $t$-tuple $(j_1, \dots, j_t) \in R^0_{r,t}$ given by $j_i = r - (g_1 + \cdots + g_i)$, which is clearly order-reversing.

B.13. — We let $\mathbb{Z}/t\mathbb{Z}$ act on $\mathbb{N}^t$ by cyclic permutation. More precisely, we let $\sigma_0$ be the identity on $\mathbb{N}^t$, and for $g = (g_1, \dots, g_t) \in \mathbb{N}^t$ and $1 \le a \le t - 1$ we set
\[ \sigma_a(g) = (g_{a+1}, g_{a+2}, \dots, g_t, g_1, g_2, \dots, g_a). \]
This action "almost" preserves $G_{r,t}$: more precisely, for $g \in G_{r,t}$, we have $\sigma_a(g) \in G_{r,t}$ if and only if $g_{a+1} > 0$.

B.14. — Let $J = (j_1, \dots, j_t) \in R^0_{r,t}$ and let $j > 0$ be such that $j = j_a$ for an index $a$; if there are several choices for such an $a$, choose it maximum, so $j_a > j_{a+1}$, hence $g_{a+1} > 0$. Then:

Lemma. With these notations, we have $g(J \oplus (r - j_a)) = \sigma_a(g(J))$.
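Before turning to the proof, the statement of this Lemma can be sanity-checked numerically. The Python sketch below (not part of the original text) computes gap sequences as in B.11, the cyclic permutation $\sigma_a$ of B.13, and compares both sides of the Lemma on all reduced tuples for small $r$ and $t$; the function names are hypothetical.

```python
from itertools import combinations_with_replacement


def shift(J, j, r):
    """Translate every entry of J by j modulo r, then sort back to nonincreasing order."""
    return tuple(sorted(((x + j) % r for x in J), reverse=True))


def gaps(J, r):
    """Gap sequence of a reduced tuple J: (r - j_1, j_1 - j_2, ..., j_{t-1} - j_t)."""
    return (r - J[0],) + tuple(J[i - 1] - J[i] for i in range(1, len(J)))


def sigma(g, a):
    """Cyclic permutation sigma_a(g) = (g_{a+1}, ..., g_t, g_1, ..., g_a)."""
    return tuple(g[a:]) + tuple(g[:a])


def reduced_tuples(r, t):
    """All nonincreasing t-tuples in {0, ..., r-1} whose last entry is 0."""
    return [tuple(reversed(c)) + (0,)
            for c in combinations_with_replacement(range(r), t - 1)]


def lemma_b14_holds(r, t):
    """Check g(J + (r - j_a)) = sigma_a(g(J)) for every admissible index a."""
    for J in reduced_tuples(r, t):
        for a in range(1, t):  # a is 1-based; a = t is excluded because j_t = 0
            j_a, j_next = J[a - 1], J[a]
            if j_a > 0 and j_a > j_next:  # a is the largest index carrying the value j_a
                if gaps(shift(J, r - j_a, r), r) != sigma(gaps(J, r), a):
                    return False
    return True


if __name__ == "__main__":
    J = (8, 7, 5, 3, 2, 1, 0)
    print(gaps(J, 10), gaps(shift(J, 10 - 5, 10), 10), sigma(gaps(J, 10), 3))
    print(lemma_b14_holds(7, 4))
```

On the sample tuple, shifting by $r - j_3 = 5$ rotates the gap sequence by three positions, as the Lemma predicts.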
Proof. Clear, from
\[ J \oplus (r - j_a) = (j_{a+1} + r - j_a,\ j_{a+2} + r - j_a,\ \dots,\ j_{t-1} + r - j_a,\ r - j_a,\ j_1 - j_a,\ j_2 - j_a,\ \dots,\ j_{a-1} - j_a,\ 0). \]

B.15. Lemma ("large gap"). Let $J = (j_1, \dots, j_t) \in R^0_{r,t}$ have gap sequence $g(J) = (g_1, \dots, g_t)$. Suppose there is an index $a$ such that $g_a > \lceil \frac{r}{t} \rceil$. Then there is $j \in \mathbb{Z}/r\mathbb{Z}$ such that $J \oplus j < I_{r,t}$.

Proof. Thanks to Lemma B.14, after possibly replacing $J$ with $J \oplus (r - j_{a-1})$ if $a > 1$, we can suppose $a = 1$. Then
\[ j_1 = r - g_1 < r - \left\lceil \tfrac{r}{t} \right\rceil \]
so $J < I_{r,t}$.

B.16. Lemma ("small gap"). Let $J = (j_1, \dots, j_t) \in R^0_{r,t}$ have gap sequence $g(J) = (g_1, \dots, g_t)$. Suppose there is an index $a$ such that $g_a < \lfloor \frac{r}{t} \rfloor$. Then there is $j \in \mathbb{Z}/r\mathbb{Z}$ such that $J \oplus j \le I_{r,t}$.

Proof. Choose $a$ such that $g_a$ is minimum; if there are several choices for such an $a$, choose it maximum, so $g_{a+1} > g_a$ if $a < t$. Then, thanks to Lemma B.14, after possibly replacing $J$ with $J \oplus (r - j_a)$ if $a < t$, we can suppose $a = t$, so $j_{t-1} = g_t < \lfloor \frac{r}{t} \rfloor$, hence, since these are integers,
\[ j_{t-1} \le \left\lfloor \tfrac{r}{t} \right\rfloor - 1 \le \tfrac{r}{t} - 1. \]
Now we proceed by contradiction: suppose $J \oplus j > I_{r,t}$ for all $j$. We will construct a sequence of indices $1 \le a_1 < a_2 < \cdots < a_k < t$ with the following properties:
(i) $r - j_{a_1} < \frac{a_1 r}{t}$
(ii) $j_{a_i} - j_{a_{i+1}} < \frac{(a_{i+1} - a_i) r}{t}$ for $1 \le i \le k - 1$
(iii) $j_{a_k} < \frac{(t - a_k) r}{t}$.
Summing all these inequalities we find $r < r$, a contradiction.

The sequence is constructed as follows. To start with, we have $J > I_{r,t}$ so there is an index $a < t$ such that $j_a > r - \lceil \frac{ar}{t} \rceil$. If there are several choices for such an $a$, choose it maximum (which implies $j_a > j_{a+1}$), and call it $a_1$. Then $r - j_{a_1} < \lceil \frac{a_1 r}{t} \rceil$ and moreover $r - j_{a_1}$ is an integer, so $r - j_{a_1} < \frac{a_1 r}{t}$.

Now suppose we have already constructed $1 \le a_1 < a_2 < \cdots < a_\ell < t$ satisfying (i) and (ii), and with $j_{a_\ell} > j_{a_\ell + 1}$. If $a_\ell = t - 1$, then we are done: (iii) is more than satisfied (with $k = \ell$) since $j_{t-1} \le \frac{r}{t} - 1 < \frac{r}{t}$.

If $a_\ell < t - 1$, we use the fact that $J \oplus (r - j_{a_\ell}) > I_{r,t}$, so for some $b \ge 1$, these sequences coincide on the first $b - 1$ positions, while the $b$-th coefficient of $J \oplus (r - j_{a_\ell})$ (whose expression is given in the proof of Lemma B.14) is larger than that of $I_{r,t}$. We distinguish two cases.

First case: $b < t - a_\ell$. Then $j_{a_\ell + b} + r - j_{a_\ell} > r - \lceil \frac{br}{t} \rceil$, with the left-hand side an integer, hence in fact $j_{a_\ell + b} + r - j_{a_\ell} > r - \frac{br}{t}$. Thus there exists an index $a$ (namely here $a = a_\ell + b$ works) with $a_\ell < a < t$ and $j_{a_\ell} - j_a < \frac{(a - a_\ell) r}{t}$. If there are several choices for such an $a$, choose it maximum (which implies $j_a > j_{a+1}$), and call it $a_{\ell+1}$.
HUGUES RANDRIAMBOLOLONA
Second case: $b \ge t - a_\ell$, so the $(t - 1 - a_\ell)$-th coefficient of $J\ (r - j_{a_\ell})$ is equal to that of $I_{r,t}$, that is, $j_{t-1} + r - j_{a_\ell} = r - \lceil (t-1-a_\ell) r / t \rceil$, or
$$j_{a_\ell} = j_{t-1} + \left\lceil \frac{(t-1-a_\ell) r}{t} \right\rceil < j_{t-1} + \frac{(t-1-a_\ell) r}{t} + 1.$$
But then we use $j_{t-1} \le \frac{r}{t} - 1$ to conclude that (iii) is satisfied with $k = \ell$.

B.17. Definition. Let $J = (j_1, \dots, j_t) \in R^0_{r,t}$ have gap sequence $g(J) = (g_1, \dots, g_t)$. We say $J$ is balanced if $|g_a - r/t| < 1$ for all $a$ (i.e. $g_a = \lfloor r/t \rfloor$ or $\lceil r/t \rceil$). We let $R^{\mathrm{bal}}_{r,t} \subseteq R^0_{r,t}$ be the set of such balanced sequences.

B.18. — Suppose $t \nmid r$, set $Q = \lceil r/t \rceil$, so $Q - 1 = \lfloor r/t \rfloor$, and write $r = tQ - u$ with $0 < u < t$, which can also be written
$$r = u \left\lfloor \frac{r}{t} \right\rfloor + (t - u) \left\lceil \frac{r}{t} \right\rceil.$$
Let $J = (j_1, \dots, j_t) \in R^{\mathrm{bal}}_{r,t}$ be a balanced sequence, with gaps $g(J) = (g_1, \dots, g_t)$. Since all $g_a = \lfloor r/t \rfloor$ or $\lceil r/t \rceil$ and they must sum to $r$, we deduce exactly $u$ of them are equal to $\lfloor r/t \rfloor$, and the other $t - u$ are equal to $\lceil r/t \rceil$. So let $a_1 < a_2 < \dots < a_u$ be those indices $a$ with $g_a = \lfloor r/t \rfloor$, and then for $1 \le i \le u$ set $b_i = t - a_i$. Note $1 \le a_i \le t$ so $b_i \in \{0, \dots, t-1\} \simeq \mathbb{Z}/t\mathbb{Z}$, and the $b_i$ form a decreasing sequence, hence they define an element of $R_{t,u}$.

Definition. With these notations, we call $\partial(J) = (b_1, \dots, b_u)$ the derived sequence of $J$.

B.19. Proposition. This map $\partial : R^{\mathrm{bal}}_{r,t} \to R_{t,u}$ is injective and order-preserving.

Proof. The map $\partial$ factorizes as
$$R^{\mathrm{bal}}_{r,t} \xrightarrow{\ g|_{R^{\mathrm{bal}}_{r,t}}\ } G^{\mathrm{bal}}_{r,t} \xrightarrow{\ b\ } R_{t,u}.$$
Here $G^{\mathrm{bal}}_{r,t} = g(R^{\mathrm{bal}}_{r,t}) \subseteq G_{r,t}$ is the set of $t$-tuples of integers $(g_1, \dots, g_t) \in \mathbb{N}^t$ among which $u$ of them are equal to $\lfloor r/t \rfloor$, and the other $t - u$ are equal to $\lceil r/t \rceil$, and with $g_1 > 0$. Note that, if $t < r$, the last condition vanishes since $g_1 \ge \lfloor r/t \rfloor > 0$ automatically, while if $t > r$, then $g_1 > 0$ means $g_1 = \lceil r/t \rceil$. Then by Lemma B.12, the restriction $g|_{R^{\mathrm{bal}}_{r,t}}$ is an order-reversing bijection.

The second map is $b(g_1, \dots, g_t) = (b_1, \dots, b_u)$, where $b_i = t - a_i$ and $a_1 < a_2 < \dots < a_u$ are those indices $a$ with $g_a = \lfloor r/t \rfloor$. This map is clearly injective. Now suppose $(g_1, \dots, g_t) < (g'_1, \dots, g'_t)$, so there is some index $k$ with $g_i = g'_i$ for $i < k$, and $g_k = \lfloor r/t \rfloor < g'_k = \lceil r/t \rceil$. So $k = a_v < a'_v$ for some $v$, while $a_w = a'_w$ for $w < v$. This means $(a_1, \dots, a_u) < (a'_1, \dots, a'_u)$, or equivalently, $(b_1, \dots, b_u) > (b'_1, \dots, b'_u)$. Hence the map $b$ is also order-reversing. We conclude since composing two order-reversing maps gives an order-preserving map.
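B.20. Proposition. Suppose $t \nmid r$, and write $r = tQ - u$ with $Q = \lceil r/t \rceil$, $0 < u < t$. Then $I_{r,t} \in R^{\mathrm{bal}}_{r,t}$ and
$$\partial(I_{r,t}) = I_{t,u}.$$
Proof. We have $g(I_{r,t}) = (g_1, \dots, g_t)$ with
$$g_a = \left\lceil \frac{ar}{t} \right\rceil - \left\lceil \frac{(a-1)r}{t} \right\rceil.$$
However, writing $\frac{ar}{t} = aQ - \frac{au}{t}$ we find $\lceil ar/t \rceil = aQ - \lfloor au/t \rfloor$. Then, since $\frac{u}{t} < 1$, there are only two possibilities:
• either $\lfloor au/t \rfloor = \lfloor (a-1)u/t \rfloor$, in which case $g_a = Q = \lceil r/t \rceil$,
• or $\lfloor au/t \rfloor = \lfloor (a-1)u/t \rfloor + 1$, and $g_a = Q - 1 = \lfloor r/t \rfloor$.
We are interested in the second case. It happens precisely when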
au (a − 1)u jv+1 . Then ∂(J (r − jv )) = ∂(J) (t − v) in Rt,u . Proof. Consequence of Lemma B.14 and the definition of ∂.
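The derived sequence of B.18 and the identity $\partial(I_{r,t}) = I_{t,u}$ of Proposition B.20 can be checked numerically on small cases. The following Python sketch is only illustrative. It assumes that the gap sequence of $J$ is $g_1 = r - j_1$ and $g_a = j_{a-1} - j_a$ for $a \ge 2$, and that $I_{r,t} = (r - \lceil ar/t \rceil)_{1 \le a \le t}$; both conventions are inferred from the proofs of Lemma B.14 and Proposition B.20 rather than quoted from their definitions, and all function names are hypothetical.

```python
def ceil_div(a, b):
    """Ceiling of a / b for a >= 0 and b > 0, using integer arithmetic only."""
    return -(-a // b)


def reference_sequence(r, t):
    """Assumed form of I_{r,t}: the a-th entry is r - ceil(a*r/t)."""
    return tuple(r - ceil_div(a * r, t) for a in range(1, t + 1))


def gap_sequence(J, r):
    """Assumed gap sequence: g_1 = r - j_1 and g_a = j_{a-1} - j_a for a >= 2."""
    previous = (r,) + tuple(J[:-1])
    return tuple(p - j for p, j in zip(previous, J))


def derived_sequence(J, r):
    """Derived sequence (b_1, ..., b_u), with b_i = t - a_i, where the a_i
    are the (1-based) positions of the gaps equal to floor(r/t)."""
    t = len(J)
    low = r // t
    gaps = gap_sequence(J, r)
    small = [a for a in range(1, t + 1) if gaps[a - 1] == low]
    return tuple(t - a for a in small)


def check_b20(r, t):
    """Compare the derived sequence of I_{r,t} with I_{t,u}, for t not dividing r."""
    assert r % t != 0, "the statement assumes that t does not divide r"
    u = t * ceil_div(r, t) - r
    return derived_sequence(reference_sequence(r, t), r) == reference_sequence(t, u)
```

For instance, with $r = 7$ and $t = 3$ the sketch gives $I_{7,3} = (4, 2, 0)$ with gaps $(3, 2, 2)$, so $u = 2$ and the derived sequence is $(1, 0) = I_{3,2}$.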
Proof of Theorem B.9. We proceed by induction on $t$. The result is clear for $t = 1$. Suppose it holds for all $t' < t$. Let $J = (j_1, \dots, j_t) \in R_{r,t}$. After possibly replacing $J$ with $J\ (r - j_t) = J - j_t$, we can suppose $J \in R^0_{r,t}$. If $J$ has a gap larger than $\lceil r/t \rceil$ we conclude with Lemma B.15; and if $J$ has a gap smaller than $\lfloor r/t \rfloor$ we conclude with Lemma B.16.

So the only case remaining is $J \in R^{\mathrm{bal}}_{r,t}$. If $t \mid r$, this means all gaps are equal to $\frac{r}{t}$, and then $J = I_{r,t}$. Now suppose $t \nmid r$, set $Q = \lceil r/t \rceil$, and write $r = tQ - u$ with $0 < u < t$. Then $\partial(J) \in R_{t,u}$ and we can apply the induction hypothesis to find $v$ such that
$$\partial(J)\ (t - v) \le I_{t,u}.$$
In particular the first coefficient of $\partial(J)\ (t - v)$ is at most $t - \lceil t/u \rceil < t - 1$. This means that the first gap of $\partial(J)\ (t - v)$, or equivalently the $(v+1)$-th gap of $J$, is not equal to $\lfloor r/t \rfloor$, so it is $\lceil r/t \rceil \ge 1$, which means in turn $j_v > j_{v+1}$. We can then apply Lemma B.21 to the left-hand side of our last inequality, and Proposition B.20 to the right-hand side, to restate it as
$$\partial(J\ (r - j_v)) \le \partial(I_{r,t}).$$
But $\partial$ is order-preserving by Proposition B.19, so
$$J\ (r - j_v) \le I_{r,t},$$
which finishes the proof.
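The conclusion just proved, namely that every orbit contains an element at most $I_{r,t}$ for the lexicographic order, can be tested by brute force for small parameters. The sketch below is written under explicit assumptions that are not quoted from the text: $R_{r,t}$ is taken to be the set of non-increasing $t$-tuples with entries in $\{0, \dots, r-1\}$, the action of $\mathbb{Z}/r\mathbb{Z}$ is taken to be "add $j$ to every entry modulo $r$, then re-sort in decreasing order" (as suggested by the formula in the proof of Lemma B.14), and $I_{r,t} = (r - \lceil ar/t \rceil)_{1 \le a \le t}$. Function names are hypothetical.

```python
from itertools import combinations_with_replacement


def ceil_div(a, b):
    """Ceiling of a / b for a >= 0 and b > 0, using integer arithmetic only."""
    return -(-a // b)


def reference_sequence(r, t):
    """Assumed form of I_{r,t}: the a-th entry is r - ceil(a*r/t)."""
    return tuple(r - ceil_div(a * r, t) for a in range(1, t + 1))


def shift(J, j, r):
    """Assumed action of j in Z/rZ: translate every entry mod r, then re-sort."""
    return tuple(sorted(((x + j) % r for x in J), reverse=True))


def orbit_minimum(J, r):
    """Lexicographically smallest element in the orbit of J (tuples compare lexicographically)."""
    return min(shift(J, j, r) for j in range(r))


def minimal_representatives(r, t):
    """One lexicographically minimal representative per orbit, in sorted order."""
    sequences = (
        tuple(sorted(c, reverse=True))
        for c in combinations_with_replacement(range(r), t)
    )
    return sorted({orbit_minimum(J, r) for J in sequences})


def check_theorem(r, t):
    """True if every orbit has a representative that is at most I_{r,t}."""
    bound = reference_sequence(r, t)
    return all(J <= bound for J in minimal_representatives(r, t))
```

For $r = t = 3$ the sketch finds the four minimal representatives $(0,0,0)$, $(1,0,0)$, $(1,1,0)$, $(2,1,0)$, all of which are at most $I_{3,3} = (2,1,0)$.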
B.22. — Theorem B.9 can be used to make Theorem B.7 more precise. Recall there we are working in an extension $\mathbb{F}_{q^r}$ of a finite field $\mathbb{F}_q$, and we have constructed a universal family of symmetric homogeneous $t$-multilinearized polynomials $S_I : (\mathbb{F}_{q^r})^t \to \mathbb{F}_{q^{r_I}}$ for $I$ ranging in a set $S$ of representatives of $R_{r,t}$ modulo $\mathbb{Z}/r\mathbb{Z}$.

Corollary. Suppose $t \le q$, and set
$$T = q^{\lfloor (t-1)r/t \rfloor} + q^{\lfloor (t-2)r/t \rfloor} + \dots + q^{\lfloor r/t \rfloor} + 1.$$
Then the set $S$ can be chosen so that the polynomials $S_I$ all have $\deg(S_I) \le T$.

Proof. Recall for $I \in R_{r,t}$ we defined $D_I = q^{i_1} + \dots + q^{i_t} = \deg(M_I) = \deg(S_I)$. In particular we have $T = D_{I_{r,t}}$. However, since $t \le q$, it is easily seen that $D_I \le D_{I'}$ if and only if $I \le I'$ for the lexicographic order. To construct $S$, in each orbit of $R_{r,t}/(\mathbb{Z}/r\mathbb{Z})$ we choose the representative $I$ that is minimum for the lexicographic order, so $I \le I_{r,t}$ by Theorem B.9, and we conclude.

(Note that for $t > q$, it can be that $D_I > D_{I'}$ although $I < I'$. This happens for instance, for $t = 3$, $q = 2$, $I = (2, 2, 2)$, $I' = (3, 0, 0)$. In such a case, instead of choosing in each orbit the representative $I$ minimum for the lexicographic order, perhaps it is better to choose the one that gives the smallest $D_I$.)

B.23. Example. For $t = 2$, Theorem B.9 specializes to the results of [43]. Here we have
$$S = \{(0, 0), (1, 0), (2, 0), \dots, (\lfloor r/2 \rfloor, 0)\}$$
and the $S_I$ for $I \in S$ are the maps $m_0(x, y) = xy$ and $m_i(x, y) = x^{q^i} y + x y^{q^i}$ for $1 \le i \le \lfloor r/2 \rfloor$. The maximum degree is $q^{\lfloor r/2 \rfloor} + 1$, reaching the bound in Corollary B.22.

If $r$ is odd, then $(\mathbb{F}_{q^r})^{\frac{r+1}{2}}$ can be seen as a $\mathbb{F}_q$-vector space of dimension $\frac{r(r+1)}{2}$. On the other hand, if $r$ is even, by abuse of notation we set
$$(\mathbb{F}_{q^r})^{\frac{r+1}{2}} = (\mathbb{F}_{q^r})^{r/2} \times \mathbb{F}_{q^{r/2}}$$
and again this can be seen as a $\mathbb{F}_q$-vector space of dimension $\frac{r(r+1)}{2}$; note then that $m_{r/2}$ takes values in $\mathbb{F}_{q^{r/2}}$. In any case, the map $\Psi = (m_0, m_1, \dots, m_{\lfloor r/2 \rfloor})$ induces an isomorphism
$$S^2_{\mathbb{F}_q} \mathbb{F}_{q^r} \simeq (\mathbb{F}_{q^r})^{\frac{r+1}{2}}$$
of $\mathbb{F}_q$-vector spaces. This can be viewed as a symmetric variant of the isomorphism
$$\mathbb{F}_{q^r} \otimes_{\mathbb{F}_q} \mathbb{F}_{q^r} \simeq (\mathbb{F}_{q^r})^r$$
induced by the maps $(x, y) \mapsto x^{q^i} y$ for $i \in \mathbb{Z}/r\mathbb{Z}$, although the latter has the additional property that it is in fact an isomorphism of $\mathbb{F}_q$-algebras.

B.24. Example. For $q = 2$, $r = 2$, $t = 3$, we can take $S = \{(0, 0, 0), (1, 0, 0)\}$ with associated maps $m(x, y, z) = xyz$ and $m'(x, y, z) = x^2 yz + x y^2 z + x y z^2$ on $(\mathbb{F}_4)^3$. Then $(m, m')$ induces an isomorphism
$$S^3_{\mathbb{F}_2} \mathbb{F}_4 \simeq (\mathbb{F}_4)^2$$
of $\mathbb{F}_2$-vector spaces, of maximum degree 4.

For $q = 2$, $r = 3$, $t = 3$, we can take $S = \{(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0)\}$ with associated maps
$\psi_1(x, y, z) = xyz$,
$\psi_2(x, y, z) = x^2 yz + x y^2 z + x y z^2$,
$\psi_3(x, y, z) = x^2 y^2 z + x^2 y z^2 + x y^2 z^2$,
$\psi_4(x, y, z) = x^4 y^2 z + x^4 y z^2 + x^2 y^4 z + x^2 y z^4 + x y^4 z^2 + x y^2 z^4$
on $(\mathbb{F}_8)^3$. Note that $\psi_4$ is invariant under Frobenius, so it takes values in $\mathbb{F}_2$, and then $(\psi_1, \psi_2, \psi_3, \psi_4)$ induces an isomorphism
$$S^3_{\mathbb{F}_2} \mathbb{F}_8 \simeq (\mathbb{F}_8)^3 \times \mathbb{F}_2$$
of $\mathbb{F}_2$-vector spaces, of maximum degree 7.

Appendix C: Review of open questions

Here is a subjective selection of (hopefully) interesting open problems related to the topic of products of codes, some of which were already mentioned in the text.

C.1. — Study arithmetic in the $(+, *, \subseteq)$-ordered semiring of linear codes of length $n$ over $\mathbb{F}_q$ (either for $n$ fixed, or for $n \to \infty$). One could devise an almost infinite number of questions, but perhaps the most natural ones to start with concern the distribution of squares [43]:
• Among all linear codes of length $n$ over $\mathbb{F}_q$, how many of them are squares?
• What is the maximum number of square roots a code can admit?
• If a code is not a square, what are the largest squares it contains? or the smallest squares it is contained in?
This leads to some approximation problems. Consider the metric $\mathrm{dist}(C, C') = \dim(C + C') - \dim(C \cap C')$. Then:
• How far can a code be from the set of squares?

C.2. — These questions can also be made more algorithmic:
• Is there an efficient algorithm to decide if a code is a square?
• If so, is there an efficient algorithm to compute one of its square roots? or to compute all of them?

C.3. — In 2.56 we pointed out three open problems concerning symmetries and automorphisms of powers of codes:
• Compare the $\mathrm{Aut}(C^{\langle t \rangle})$ as $t$ varies, where $\mathrm{Aut}(C) \subseteq \mathrm{Aut}(\mathbb{F}^n)$ is the group of monomial transformations preserving $C$; e.g. for all $C$ and $t$, does it hold $\mathrm{Aut}(C^{\langle t \rangle}) \subseteq \mathrm{Aut}(C^{\langle t+1 \rangle})$? (From 2.52 and 2.55 we know $\mathrm{Aut}(C^{\langle t \rangle}) \subseteq \mathrm{Aut}(C^{\langle t' \rangle})$ only for $t \mid t'$ or $t \ge r(C)$.)
• Instead of $\mathrm{Aut}(C)$, compare the $\mathrm{Aut}^{\mathrm{in}}(C^{\langle t \rangle})$ as $t$ varies, where $\mathrm{Aut}^{\mathrm{in}}(C)$ is the group of invertible linear endomorphisms of $C$ (seen as an abstract vector space) that preserve the Hamming metric.
• Do the same thing for semilinear automorphisms.

C.4. — Study how change of field operations interact with product of codes. Here is what is known:

extension of scalars    obvious (2.23(iii), 2.28)
concatenation           partial results, highly dependent on the inner code (4.18)
trace code              not yet understood
subfield subcode        not yet understood

As noted in 5.18, results on products of subfield subcodes would be useful in the analysis of the original McEliece cryptosystem. Since the trace code operation is somehow dual to the subfield subcode operation, one could guess both to be equally difficult to understand. With some abuse of language one can consider trace codes as a very specific class of concatenated codes (with a noninjective symbol mapping). This raises the question of whether some of the techniques introduced for the study of products of concatenated codes (e.g. polynomial representation of $\mathbb{F}_q$-multilinear maps) could also be applied to trace codes.

C.5. — Improve the bounds on the possible joint parameters of a family of codes and their product, or of a code and its powers, given in section 4. In particular:
• Beside the Singleton bound, try to generalize the other classical bounds of coding theory in the context of products of codes.
• Fill the large gap between 4.10 and 4.21:
$$0.001872 - 0.5294\,\delta \le \alpha_2^{\langle 2 \rangle}(\delta) \le 0.5 - 0.5\,\delta.$$
• Is $\tau(2) > 2$? That means, does there exist an asymptotically good family of binary linear codes $C$ whose cubes $C^{\langle 3 \rangle}$ also form an asymptotically good family?
• Is $\tau(2)$ infinite? That means, for any $t$ (arbitrarily large), does there exist an asymptotically good family of binary linear codes $C$ whose $t$-th powers $C^{\langle t \rangle}$ also form an asymptotically good family?

C.6. — Does there exist an asymptotically good family of binary linear codes $C$, whose squares $C^{\langle 2 \rangle}$, and also whose dual codes $C^\perp$, form an asymptotically good family? (Motivation comes from the theory of multi-party computation, as mentioned after 5.5.) If instead of binary codes, one is interested in $q$-ary codes, then AG codes are easily seen to provide a positive answer, at least for $q$ large. When $q$ becomes smaller, AG codes still work, except perhaps for $q = 2, 3, 4, 5, 7, 11, 13$: this is shown in [8], using a careful analysis of the 2-torsion in the class group of a certain tower of curves. However, for $q = 2$, curves simply do not have enough points, so there is no hope that bare AG codes work in this case. Probably one should combine AG codes with another tool. Note that concatenation as in [43] does not work, since it destroys the dual distance.
C.7. — Our bound 4.19 on $t$-th powers of concatenated codes, for $t \le q$, relied on Appendix B, in which we gave a polynomial description of the symmetric power $S^t_{\mathbb{F}_q} \mathbb{F}_{q^r}$, using homogeneous $t$-multilinearized polynomials of controlled degree. To extend this bound for $t > q$, it would be nice to have a similar result also for the Frobenius symmetric power $S^t_{\mathrm{Frob},\mathbb{F}_q} \mathbb{F}_{q^r}$, as defined in A.9. That means: construct a universal family of Frobenius symmetric $t$-multilinearized polynomials, of controlled degree. Such a construction seems highly unlikely if one keeps the homogeneity condition; so we might drop this condition, to allow more flexibility. Indeed, usually this will not be a problem for applications. For instance, Proposition 4.18 can deal with non-homogeneous polynomials, provided the external code $C$ contains the all-1 word $1_{[n]}$. It is often so in practice, e.g. when $C$ is an AG code.

C.8. — In 1.23 we defined $n_i = \dim \langle x \in C^\perp ;\ w(x) \le i \rangle^\perp$, and noted that $n_0$ is the length of $C$, while $n_1$ is its support length, and $n_2$ its projective length. Is there such a nice interpretation for the subsequent values $n_i$, $i \ge 3$? Or conversely, is there another “natural” sequence of which $n_0, n_1, n_2$ are the first terms?
References
[1] Stéphane Ballet and Robert Rolland, On the bilinear complexity of the multiplication in finite fields (English, with English and French summaries), Arithmetic, geometry and coding theory (AGCT 2003), Sémin. Congr., vol. 11, Soc. Math. France, Paris, 2005, pp. 179–188. MR2182843 (2006g:11245)
[2] Razvan Barbulescu, Jérémie Detrey, Nicolas Estibals, and Paul Zimmermann, Finding optimal formulae for bilinear maps, Arithmetic of finite fields, Lecture Notes in Comput. Sci., vol. 7369, Springer, Heidelberg, 2012, pp. 168–186, DOI 10.1007/978-3-642-31662-3_12. MR2992378
[3] E. S. Barnes and N. J. A. Sloane, New lattice packings of spheres, Canad. J. Math. 35 (1983), no. 1, 117–130, DOI 10.4153/CJM-1983-008-1. MR685820 (84f:52010)
[4] M. Ben-Or, S. Goldwasser & A. Wigderson. “Completeness theorems for non-cryptographic fault-tolerant distributed computation.” Proc. 20th Ann. ACM Symp. on Theory of Computing (STOC ’88), 1988, pp. 1-10.
[5] Elwyn R. Berlekamp, Robert J. McEliece, and Henk C. A. van Tilborg, On the inherent intractability of certain coding problems, IEEE Trans. Information Theory IT-24 (1978), no. 3, 384–386. MR0495180 (58 #13912)
[6] Roger W. Brockett and David Dobkin, On the optimal evaluation of a set of bilinear forms, Fifth Annual ACM Symposium on Theory of Computing (Austin, Tex., 1973), Assoc. Comput. Mach., New York, 1973, pp. 88–95. MR0416098 (54 #4174)
[7] N. Bshouty. “Multilinear complexity is equivalent to optimal tester size.” Electron. Colloq. Comput. Complexity, Report No. TR13-011 (2013).
[8] Ignacio Cascudo, Ronald Cramer, and Chaoping Xing, The torsion-limit for algebraic function fields and its application to arithmetic secret sharing, Advances in cryptology—CRYPTO 2011, Lecture Notes in Comput. Sci., vol. 6841, Springer, Heidelberg, 2011, pp. 685–705, DOI 10.1007/978-3-642-22792-9_39. MR2874882
[9] D. Chaum, C. Crépeau & I. Damgård. “Multiparty unconditionally secure protocols.” Proc. 20th Ann. ACM Symp. on Theory of Computing (STOC ’88), 1988, pp. 11-19.
[10] Hao Chen and Ronald Cramer, Algebraic geometric secret sharing schemes and secure multiparty computations over small fields, Advances in cryptology—CRYPTO 2006, Lecture Notes in Comput. Sci., vol. 4117, Springer, Berlin, 2006, pp. 521–536, DOI 10.1007/11818175_31. MR2422182 (2009d:94111)
[11] D. V. Chudnovsky and G. V. Chudnovsky, Algebraic complexities and algebraic curves over finite fields, J. Complexity 4 (1988), no. 4, 285–316, DOI 10.1016/0885-064X(88)90012-X. MR974928 (90g:11167)
[12] Gérard Cohen and Abraham Lempel, Linear intersecting codes, Discrete Math. 56 (1985), no. 1, 35–43, DOI 10.1016/0012-365X(85)90190-6. MR808084 (87c:94046)
[13] Pierre Comon, Gene Golub, Lek-Heng Lim, and Bernard Mourrain, Symmetric tensors and symmetric tensor rank, SIAM J. Matrix Anal. Appl. 30 (2008), no. 3, 1254–1279, DOI 10.1137/060661569. MR2447451 (2009i:15039)
[14] A. Couvreur, P. Gaborit, V. Gauthier, A. Otmani & J.-P. Tillich. “Distinguisher-based attacks on public-key cryptosystems using Reed-Solomon codes.” Presented at WCC 2013, to appear in Des. Codes Crypto. Preprint: http://arxiv.org/abs/1307.6458
[15] Ronald Cramer, Ivan Damgård, and Ueli Maurer, General secure multi-party computation from any linear secret-sharing scheme, Advances in cryptology—EUROCRYPT 2000 (Bruges), Lecture Notes in Comput. Sci., vol. 1807, Springer, Berlin, 2000, pp. 316–334, DOI 10.1007/3-540-45539-6_22. MR1772026 (2001b:94031)
[16] C. Crépeau & J. Kilian. “Achieving oblivious transfer using weakened security assumptions.” Proc. 29th IEEE Symp. on Found. of Computer Sci. (FOCS ’88), 1988, pp. 42-52.
[17] S. G. Vlèduts and V. G. Drinfeld, The number of points of an algebraic curve (Russian), Funktsional. Anal. i Prilozhen. 17 (1983), no. 1, 68–69. MR695100 (85b:14028)
[18] Iwan M. Duursma and Ruud Pellikaan, A symmetric Roos bound for linear codes, J. Combin. Theory Ser. A 113 (2006), no. 8, 1677–1688, DOI 10.1016/j.jcta.2006.03.020. MR2269547 (2007i:94124)
[19] David Eisenbud, The geometry of syzygies, Graduate Texts in Mathematics, vol. 229, Springer-Verlag, New York, 2005. A second course in commutative algebra and algebraic geometry. MR2103875 (2005h:13021)
[20] David Eisenbud and Sorin Popescu, Gale duality and free resolutions of ideals of points, Invent. Math. 136 (1999), no. 2, 419–449, DOI 10.1007/s002220050315. MR1688433 (2000i:13014)
[21] G. David Forney Jr., Coset codes. I. Introduction and geometrical classification, IEEE Trans. Inform. Theory 34 (1988), no. 5, 1123–1151, DOI 10.1109/18.21245. Coding techniques and coding theory. MR987661 (90f:94051)
[22] A. Garcia, H. Stichtenoth, A. Bassa & P. Beelen. “Towers of function fields over non-prime finite fields.” To appear. Preprint: http://arxiv.org/abs/1202.5922
[23] A. Grothendieck, Éléments de géométrie algébrique. II. Étude globale élémentaire de quelques classes de morphismes, Inst. Hautes Études Sci. Publ. Math. 8 (1961), 222. MR0217084 (36 #177b)
[24] Robin Hartshorne, Algebraic geometry, Springer-Verlag, New York-Heidelberg, 1977. Graduate Texts in Mathematics, No. 52. MR0463157 (57 #3116)
[25] Yasutaka Ihara, Some remarks on the number of rational points of algebraic curves over finite fields, J. Fac. Sci. Univ. Tokyo Sect. IA Math. 28 (1981), no. 3, 721–724 (1982). MR656048 (84c:14016)
[26] Yuval Ishai, Eyal Kushilevitz, Rafail Ostrovsky, Manoj Prabhakaran, and Amit Sahai, Efficient non-interactive secure computation, Advances in cryptology—EUROCRYPT 2011, Lecture Notes in Comput. Sci., vol. 6632, Springer, Heidelberg, 2011, pp. 406–425, DOI 10.1007/978-3-642-20465-4_23. MR2813653 (2012h:94191)
[27] Dieter Jungnickel, Alfred J. Menezes, and Scott A. Vanstone, On the number of self-dual bases of GF(q^m) over GF(q), Proc. Amer. Math. Soc. 109 (1990), no. 1, 23–29, DOI 10.2307/2048357. MR1007501 (90i:11144)
[28] W. Kositwattanarerk & F. Oggier. “On construction D and related constructions of lattices from linear codes.” Presented at WCC 2013, to appear in Des. Codes Crypto. Preprint: http://arxiv.org/abs/1308.6175
[29] R. Kötter. “A unified description of an error locating procedure for linear codes.” Proc. Int. Workshop on Algebraic and Comb. Coding Theory, Voneshta Voda, Bulgaria, June 22-28, 1992.
[30] Michel Laurent, Hauteur de matrices d’interpolation (French, with English summary), Approximations diophantiennes et nombres transcendants (Luminy, 1990), de Gruyter, Berlin, 1992, pp. 215–238. MR1176532 (93i:11072)
[31] Jacobus H. van Lint and Richard M. Wilson, On the minimum distance of cyclic codes, IEEE Trans. Inform. Theory 32 (1986), no. 1, 23–40, DOI 10.1109/TIT.1986.1057134. MR831557 (87j:94017)
[32] R. McEliece. “A public-key system based on algebraic coding theory.” Deep Space Network Progress Report 44 (1978) 114-116.
[33] F. J. McWilliams. Combinatorial properties of elementary abelian groups. Ph.D. dissertation, Harvard University, Cambridge, Mass., 1962.
[34] Dany-Jack Mercier and Robert Rolland, Polynômes homogènes qui s’annulent sur l’espace projectif P^m(F_q) (French, with French summary), J. Pure Appl. Algebra 124 (1998), no. 1-3, 227–240, DOI 10.1016/S0022-4049(96)00104-1. MR1600301 (99h:13015)
[35] D. Mirandola. Schur products of linear codes: a study of parameters. Master Thesis (under the supervision of G. Zémor), Univ. Bordeaux 1 & Stellenbosch Univ., July 2012. Online version: http://www.algant.eu/documents/theses/mirandola.pdf
[36] David Mumford, Lectures on curves on an algebraic surface, With a section by G. M. Bergman. Annals of Mathematics Studies, No. 59, Princeton University Press, Princeton, N.J., 1966. MR0209285 (35 #187)
[37] David Mumford, Varieties defined by quadratic equations, Questions on Algebraic Varieties (C.I.M.E., III Ciclo, Varenna, 1969), Edizioni Cremonese, Rome, 1970, pp. 29–100. MR0282975 (44 #209)
[38] F. Oggier & G. Zémor. “Coding constructions for efficient oblivious transfer from noisy channels.” In preparation.
[39] Ruud Pellikaan, On decoding by error location and dependent sets of error positions, Discrete Math. 106/107 (1992), 369–381, DOI 10.1016/0012-365X(92)90567-Y. A collection of contributions in honour of Jack van Lint. MR1181934 (93h:94021)
[40] Ruud Pellikaan, On the existence of error-correcting pairs, J. Statist. Plann. Inference 51 (1996), no. 2, 229–242, DOI 10.1016/0378-3758(95)00088-7. Shanghai Conference Issue on Designs, Codes, and Finite Geometries, Part 1 (Shanghai, 1993). MR1397531 (97k:94069)
[41] Hugues Randriambololona, Hauteurs des sous-schémas de dimension nulle de l’espace projectif (French, with English and French summaries), Ann. Inst. Fourier (Grenoble) 53 (2003), no. 7, 2155–2224. MR2044170 (2005a:11097)
[42] Hugues Randriambololona, Bilinear complexity of algebras and the Chudnovsky-Chudnovsky interpolation method, J. Complexity 28 (2012), no. 4, 489–517, DOI 10.1016/j.jco.2012.02.005. MR2925903
[43] Hugues Randriambololona, Asymptotically good binary linear codes with asymptotically good self-intersection spans, IEEE Trans. Inform. Theory 59 (2013), no. 5, 3038–3045, DOI 10.1109/TIT.2013.2237944. MR3053393
[44] Hugues Randriambololona, An upper bound of singleton type for componentwise products of linear codes, IEEE Trans. Inform. Theory 59 (2013), no. 12, 7936–7939, DOI 10.1109/TIT.2013.2281145. MR3142273
[45] Hugues Randriambololona, (2, 1)-separating systems beyond the probabilistic bound, Israel J. Math. 195 (2013), no. 1, 171–186, DOI 10.1007/s11856-012-0126-9. MR3101247
[46] Cornelis Roos, A new lower bound for the minimum distance of a cyclic code, IEEE Trans. Inform. Theory 29 (1983), no. 3, 330–332, DOI 10.1109/TIT.1983.1056672. MR712390 (85e:94021)
[47] Georg Schmidt, Vladimir R. Sidorenko, and Martin Bossert, Syndrome decoding of Reed-Solomon codes beyond half the minimum distance based on shift-register synthesis, IEEE Trans. Inform. Theory 56 (2010), no. 10, 5245–5252, DOI 10.1109/TIT.2010.2060130. MR2808676 (2012a:94241)
[48] Gadiel Seroussi and Abraham Lempel, Factorization of symmetric matrices and trace-orthogonal bases in finite fields, SIAM J. Comput. 9 (1980), no. 4, 758–767, DOI 10.1137/0209059. MR592766 (81k:15019)
[49] G. Seroussi and A. Lempel, On symmetric algorithms for bilinear forms over finite fields, J. Algorithms 5 (1984), no. 3, 327–344, DOI 10.1016/0196-6774(84)90014-2. MR756160 (86g:11076)
[50] Jean-Pierre Serre, Corps locaux (French), Hermann, Paris, 1968. Deuxième édition; Publications de l’Université de Nancago, No. VIII. MR0354618 (50 #7096)
[51] Jean-Pierre Serre, Nombres de points des courbes algébriques sur F_q (French), Seminar on number theory, 1982–1983 (Talence, 1982/1983), Univ. Bordeaux I, Talence, 1983, pp. Exp. No. 22, 8. MR750323 (86d:11051)
[52] Igor E. Shparlinski, Michael A. Tsfasman, and Serge G. Vladut, Curves with many points and multiplication in finite fields, Coding theory and algebraic geometry (Luminy, 1991), Lecture Notes in Math., vol. 1518, Springer, Berlin, 1992, pp. 145–169, DOI 10.1007/BFb0087999. MR1186422 (93h:11063)
[53] Sahib Singh, Analysis of each integer as sum of two cubes in a finite integral domain, Indian J. Pure Appl. Math. 6 (1975), no. 1, 29–35. MR0409419 (53 #13174)
[54] David Slepian, Some further theory of group codes, Bell System Tech. J. 39 (1960), 1219–1252. MR0122628 (22 #13351)
[55] Anders Bjært Sørensen, Projective Reed-Muller codes, IEEE Trans. Inform. Theory 37 (1991), no. 6, 1567–1576, DOI 10.1109/18.104317. MR1134296 (92g:94018)
[56] Henning Stichtenoth, Algebraic function fields and codes, 2nd ed., Graduate Texts in Mathematics, vol. 254, Springer-Verlag, Berlin, 2009. MR2464941 (2010d:14034)
[57] M. A. Tsfasman and S. G. Vlăduţ, Algebraic-geometric codes, Mathematics and its Applications (Soviet Series), vol. 58, Kluwer Academic Publishers Group, Dordrecht, 1991. Translated from the Russian by the authors. MR1186841 (93i:94023)
[58] Michael A. Tsfasman and Serge G. Vlăduţ, Geometric approach to higher weights, IEEE Trans. Inform. Theory 41 (1995), no. 6, 1564–1588, DOI 10.1109/18.476213. Special issue on algebraic geometry codes. MR1391017 (97m:94042)
[59] Victor K. Wei, Generalized Hamming weights for linear codes, IEEE Trans. Inform. Theory 37 (1991), no. 5, 1412–1418, DOI 10.1109/18.133259. MR1136673 (92i:94019)
[60] Christian Wieschebrink, Cryptanalysis of the Niederreiter public key scheme based on GRS subcodes, Post-quantum cryptography, Lecture Notes in Comput. Sci., vol. 6061, Springer, Berlin, 2010, pp. 61–72, DOI 10.1007/978-3-642-12929-2_5. MR2776311
Contemporary Mathematics Volume 637, 2015 http://dx.doi.org/10.1090/conm/637/12750
Higher weights of affine Grassmann codes and their duals

Mrinmoy Datta and Sudhir R. Ghorpade

Abstract. We consider the question of determining the higher weights or the generalized Hamming weights of affine Grassmann codes and their duals. Several initial as well as terminal higher weights of affine Grassmann codes of an arbitrary level are determined explicitly. In the case of duals of these codes, we give a formula for many initial as well as terminal higher weights. As a special case, we obtain an alternative simpler proof of the formula of Beelen et al. for the minimum distance of the dual of an affine Grassmann code.
2010 Mathematics Subject Classification. Primary 15A03, 11T06, 05E99; Secondary 11T71.
The first named author was partially supported by a doctoral fellowship from the National Board for Higher Mathematics, a division of the Department of Atomic Energy, Govt. of India. The second named author was partially supported by Indo-Russian project INT/RFBR/P114 from the Department of Science & Technology, Govt. of India and IRCC Award grant 12IRAWD009 from IIT Bombay.
© 2015 American Mathematical Society

1. Introduction

A $q$-ary linear code of length $n$ and dimension $k$, or in short, an $[n, k]_q$-code, is simply a $k$-dimensional subspace of the $n$-dimensional vector space $\mathbb{F}_q^n$ over the finite field $\mathbb{F}_q$ with $q$ elements. A basic example is that of a (generalized) Reed-Muller code $\mathrm{RM}(\nu, \delta)$ of order $\nu$ and length $n := q^\delta$, given by the image of the evaluation map
$$\mathrm{Ev} : \mathbb{F}_q[X_1, \dots, X_\delta]_{\le \nu} \to \mathbb{F}_q^n \quad \text{defined by} \quad \mathrm{Ev}(f) = (f(P_1), \dots, f(P_n)),$$
where $\mathbb{F}_q[X_1, \dots, X_\delta]_{\le \nu}$ denotes the space of polynomials in $\delta$ variables of (total) degree $\le \nu$ with coefficients in $\mathbb{F}_q$ and $P_1, \dots, P_n$ is an ordered listing of the points of the affine space $\mathbb{A}^\delta(\mathbb{F}_q) = \mathbb{F}_q^\delta$. A useful variant of this is the projective Reed-Muller code $\mathrm{PRM}(\nu, \delta)$ of order $\nu$ and length $n := (q^{\delta+1} - 1)/(q - 1)$, which is obtained by evaluating homogeneous polynomials in $\delta + 1$ variables of degree $\nu$ with coefficients in $\mathbb{F}_q$ at points of the projective space $\mathbb{P}^\delta = \mathbb{P}^\delta(\mathbb{F}_q)$ or rather at suitably normalized representatives in $\mathbb{F}_q^{\delta+1}$ of an ordered listing of the points of $\mathbb{P}^\delta$.

From a geometric viewpoint, projective Reed-Muller codes $\mathrm{PRM}(\nu, \delta)$ correspond (at least when $\nu < q$) to the Veronese variety given by the image of $\mathbb{P}^\delta$ in $\mathbb{P}^{k-1}$ under the Veronese map of degree $\nu$, where $k := \binom{\nu+\delta}{\nu}$. In this set-up, $\mathrm{RM}(\nu, \delta)$ corresponds to the image of this Veronese map when restricted to an $\mathbb{A}^\delta$ inside $\mathbb{P}^\delta$ (for instance, the set of points $(x_0 : x_1 : \dots : x_\delta)$ of $\mathbb{P}^\delta$ with $x_0 = 1$).

Reed-Muller codes are classical objects and in the generalized setting above, their study goes back at least to Kasami, Lin, and Peterson [10] as well as Delsarte, Goethals, and MacWilliams [4]. One may refer to [2, Prop. 4] for a summary of several of the basic properties of $\mathrm{RM}(\nu, \delta)$. Projective Reed-Muller codes appeared explicitly in the work of Lachaud [11, 12] and Sørensen [17]. Around the same time, a new class of codes called Grassmann codes were studied by Ryan [14, 15], and later by Nogin [13] and several others (see, e.g., [5–8]). These correspond geometrically to the Grassmann variety $G_{\ell,m}$ formed by the $\ell$-dimensional subspaces
of $\mathbb{F}_q^m$ together with the Plücker embedding $G_{\ell,m} \to \mathbb{P}^{k-1}$, where $k = \binom{m}{\ell}$. In effect, the Grassmann code $C(\ell, m)$ is a linear code whose generator matrix has as its columns certain fixed representatives in $\mathbb{F}_q^k$ of the Plücker coordinates of all $\mathbb{F}_q$-rational points of $G_{\ell,m}$.

Affine Grassmann codes were introduced in [1] and further studied in [2] and [5]. Given positive integers $\ell, \ell'$ with $\ell \le \ell'$, upon letting $m = \ell + \ell'$ and $\delta = \ell\ell'$, the affine Grassmann code $C^{\mathbb{A}}(\ell, m)$ is defined, like a Reed-Muller code, as the $q$-ary linear code of length $n = q^\delta$ given by the image of the evaluation map
$$\mathrm{Ev} : \mathcal{F}(\ell, m) \to \mathbb{F}_q^n \quad \text{defined by} \quad \mathrm{Ev}(f) = (f(P_1), \dots, f(P_n)), \tag{1}$$
where $F(\ell, m)$ is the space of linear polynomials in the minors of a generic $\ell \times \ell'$ matrix $X$ and $P_1, \dots, P_n$ is an ordered listing of the $\delta$-dimensional affine space of all $\ell \times \ell'$ matrices with entries in $\mathbb{F}_q$. The relationship between affine Grassmann codes $C^{\mathbb{A}}(\ell, m)$ and Grassmann codes $C(\ell, m)$ is akin to that between Reed-Muller codes $\mathrm{RM}(\nu, \delta)$ and projective Reed-Muller codes $\mathrm{PRM}(\nu, \delta)$.

The notion of higher weight, also known as generalized Hamming weight, of a linear code is a natural and useful generalization of the basic notion of minimum distance (cf. [20]). If $C$ is an $[n, k]_q$-code, then for $r = 0, 1, \dots, k$, the $r$-th higher weight of $C$ is defined by
$$d_r = d_r(C) = \min\{w_H(D) : D \text{ is a subspace of } C \text{ with } \dim D = r\},$$
where $w_H(D)$ denotes the support weight of $D$ (see Section 2 below for a definition). Clearly, $d_1(C)$ is the minimum distance $d(C)$ of $C$. It is well known and easy to see that $0 = d_0 < d_1 < \cdots < d_k$ and moreover $d_k = n$ provided $C$ is nondegenerate.

It is, in general, an interesting and difficult question to determine the weight hierarchy, i.e., all the higher weights, of a given class of codes. For example, in a significant piece of work, Heijnen and Pellikaan [9] completely determined the higher weights of Reed-Muller codes $\mathrm{RM}(\nu, \delta)$. In the case of projective Reed-Muller codes, the minimum distance was determined by Lachaud [12] and independently by Sørensen [17]. In fact, Lachaud derives it as a consequence of an affirmative answer given by Serre [16] to a question of Tsfasman concerning the maximum number of $\mathbb{F}_q$-rational points on a projective hypersurface of a given degree. The second higher weight was determined by Boguslavsky [3], while the determination of $d_r(\mathrm{PRM}(\nu, \delta))$ is still open for $r > 2$. In the case of Grassmann codes, the $r$-th higher weight is known for the first few and the last few values of $r$, thanks to Nogin [13] (see also [6]) and Hansen, Johnsen and Ranestad [8] (see also [7]). More precisely, for $r = 0, 1, \dots, \mu$, where $\mu := 1 + \max\{\ell, m - \ell\}$, we have
$$d_r(C(\ell, m)) = q^{\delta} + q^{\delta - 1} + \cdots + q^{\delta - r + 1} \quad\text{and}\quad d_{k-r}(C(\ell, m)) = n - (1 + q + \cdots + q^{r-1}),$$
where $n$ denotes the length of $C(\ell, m)$ or, in other words, the number of $\mathbb{F}_q$-rational points of $G_{\ell,m}$, and it is given by the Gaussian binomial coefficient $\binom{m}{\ell}_q$. In case $\ell = 2$, we know a little more (cf. [7]), but the general case is still open.
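As a numerical companion to the two displayed formulas, the following Python sketch computes the length $n$ as a Gaussian binomial coefficient and evaluates the stated initial and terminal higher weights of $C(\ell, m)$. It is only an illustration: the function names are hypothetical and not taken from the paper, and the formulas are applied only in the range $r \le \mu$ in which the text states them.

```python
def gaussian_binomial(m: int, l: int, q: int) -> int:
    """Number of l-dimensional subspaces of F_q^m (Gaussian binomial coefficient)."""
    num, den = 1, 1
    for i in range(l):
        num *= q ** (m - i) - 1
        den *= q ** (i + 1) - 1
    return num // den  # the quotient is always an integer


def grassmann_initial_weight(l: int, m: int, q: int, r: int) -> int:
    """d_r(C(l, m)) = q^delta + ... + q^(delta - r + 1), stated for 0 <= r <= mu."""
    delta = l * (m - l)
    mu = 1 + max(l, m - l)
    if not 0 <= r <= mu:
        raise ValueError("the formula is only stated for 0 <= r <= mu")
    return sum(q ** (delta - i) for i in range(r))


def grassmann_terminal_weight(l: int, m: int, q: int, r: int) -> int:
    """d_{k-r}(C(l, m)) = n - (1 + q + ... + q^(r-1)), stated for 0 <= r <= mu."""
    mu = 1 + max(l, m - l)
    if not 0 <= r <= mu:
        raise ValueError("the formula is only stated for 0 <= r <= mu")
    n = gaussian_binomial(m, l, q)
    return n - sum(q ** i for i in range(r))
```

For $(\ell, m, q) = (2, 4, 2)$ this gives $n = 35$ and $d_1 = 2^4 = 16$.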
We consider in this paper the problem of determining the higher weights of affine Grassmann codes and their duals. Our main result is an explicit formula for $d_r(C^{\mathbb{A}}(\ell, m))$ for the first few and the last few values of $r$, or more precisely, for $0 \le r \le \mu'$ and for $k - \mu \le r \le k$, where
$$\mu' = 1 + \max\{\ell, \ell' - \ell\} \quad\text{and}\quad \mu := 1 + \max\{\ell, \ell'\} = \ell' + 1. \tag{2}$$
In the case of the result for the first $\mu'$ higher weights, we have to make an additional mild assumption that $\ell < \ell'$. The result for the last $\mu$ higher weights can be deduced from the corresponding results for Grassmann codes using a geometric approach. However, we give here self-contained proofs in the spirit of [1, 2] and this has the advantage that analogous results are also obtained for affine Grassmann codes of arbitrary level introduced in [2]. As for the duals, we can in fact go much farther, and determine many more higher weights of the duals of affine Grassmann codes, except that the result we give here is best described recursively. As a corollary, we obtain a new and simpler proof of [2, Theorem 17], which states that if $\ell' > 1$, then the minimum distance of $C^{\mathbb{A}}(\ell, m; h)^{\perp}$ is 3 or 4 according as $q > 2$ or $q = 2$. The geometric approach and an alternative proof of the result about the last $\mu$ higher weights are also outlined in an appendix for the convenience of the reader.

2. Initial higher weights

For any $q$-ary linear code $C$ of length $n$, and any $D \subseteq C$, we let
$$\mathrm{Supp}(D) := \{i \in \{1, \dots, n\} : c_i \ne 0 \text{ for some } c \in D\} \quad\text{and}\quad w_H(D) = |\mathrm{Supp}(D)|$$
denote, respectively, the support and the support weight of $D$. For a codeword $c = (c_1, \dots, c_n) \in C$, we write $w_H(c) = w_H(\{c\})$ and note that this is simply the Hamming weight of $c$.

Fix, throughout this paper, positive integers $h, \ell, \ell'$ with $h \le \ell \le \ell'$ and an $\ell \times \ell'$ matrix $X = (X_{ij})$ whose entries are algebraically independent indeterminates over $\mathbb{F}_q$. Let $\mathbb{F}_q[X]$ denote the ring of polynomials in the variables $X_{ij}$ with coefficients in $\mathbb{F}_q$. As in [2], we let $\Delta(\ell, m; h)$ denote the set of all minors of $X$ of degree $\le h$. Note that $\Delta(\ell, m; h)$ is a subset of $\mathbb{F}_q[X]$ that contains the constant polynomial 1, which corresponds to the $0 \times 0$ minor of $X$. Further let
$$F(\ell, m; h) := \text{the } \mathbb{F}_q\text{-linear subspace of } \mathbb{F}_q[X] \text{ generated by } \Delta(\ell, m; h).$$
Note that the space $F(\ell, m)$ defined in the Introduction contains $F(\ell, m; h)$ and the equality holds when $h = \ell$. The affine Grassmann code of level $h$, denoted $C^{\mathbb{A}}(\ell, m; h)$, is defined to be the image of $F(\ell, m; h)$ under the evaluation map $\mathrm{Ev}$ given by (1). Evidently, $C^{\mathbb{A}}(\ell, m; \ell) = C^{\mathbb{A}}(\ell, m)$ and $C^{\mathbb{A}}(\ell, m; h)$ is a subcode of $\mathrm{RM}(h, \delta)$. Now here is a slightly refined version of a basic result proved in [2].

Proposition 2.1. The minimum distance $d(\ell, m; h)$ of $C^{\mathbb{A}}(\ell, m; h)$ is
$$d(\ell, m; h) = q^{\delta} \prod_{i=1}^{h} \left(1 - \frac{1}{q^{i}}\right) = q^{\delta - h^2}\, |GL_h(\mathbb{F}_q)|. \tag{3}$$
Moreover, if $\mathcal{M}$ is any $h \times h$ minor of $X$, then $w_H(\mathrm{Ev}(\mathcal{M})) = d(\ell, m; h)$.

Proof. The first equality in (3) is proved in [2, Theorem 5], while the second is easily deduced. It is also shown in [2, Theorem 5] that if $L_h = \det(X_{ij})_{1 \le i, j \le h}$ is the $h$-th leading principal minor of $X$, then $w_H(\mathrm{Ev}(L_h)) = d(\ell, m; h)$. Now if $\mathcal{M}$ is any $h \times h$ minor of $X$, then there are positive integers $p_1, \dots, p_h, q_1, \dots, q_h$ with
$p_1 < \cdots < p_h \le \ell$ and $q_1 < \cdots < q_h \le \ell'$ such that $\mathcal{M} = \det(X_{p_i q_j})_{1 \le i, j \le h}$. Let $\sigma \in S_{\ell}$ be a permutation such that $\sigma(i) = p_i$ for $1 \le i \le h$ and $P \in GL_{\ell}(\mathbb{F}_q)$ be the permutation matrix corresponding to $\sigma$, so that for $1 \le i, j \le \ell$, the $(i, j)$-th entry of $P$ is 1 if $j = \sigma(i)$ and 0 otherwise. Likewise, let $\tau \in S_{\ell'}$ be such that $\tau(i) = q_i$ for $1 \le i \le h$ and $Q \in GL_{\ell'}(\mathbb{F}_q)$ be the permutation matrix corresponding to $\tau$. Then it is easily seen that $\mathcal{M}$ is the $h$-th leading principal minor of $P X Q^{-1}$. Moreover, we know from [2, §IV] that $X \mapsto P X Q^{-1}$ induces a permutation automorphism of $C^{\mathbb{A}}(\ell, m; h)$. It follows that $w_H(\mathrm{Ev}(\mathcal{M})) = w_H(\mathrm{Ev}(L_h)) = d(\ell, m; h)$.

The following general observation about the support weights of linear codes will be useful in the sequel.

Lemma 2.2. Let $C$ be an $[n, k]_q$-code and for $i = 1, \dots, n$, let $\pi_i : C \to \mathbb{F}_q$ denote the $i$-th projection map defined by $\pi_i(c_1, \dots, c_n) = c_i$. Also let $D$ be a subcode of $C$ and $\{y_1, \dots, y_r\}$ be a generating set of $D$. Then
$$\mathrm{Supp}(D) = \bigcup_{j=1}^{r} A_j, \quad\text{where for } 1 \le j \le r, \quad A_j := \{i \in \{1, \dots, n\} : \pi_i(y_j) \ne 0\}.$$

Proof. Clearly, $\bigcup_{j=1}^{r} A_j \subseteq \mathrm{Supp}(D)$. On the other hand, suppose $i \in \{1, \dots, n\}$ is such that $i \notin \bigcup_{j=1}^{r} A_j$. Then $\pi_i(y_j) = 0$ for all $j = 1, \dots, r$. Now for any $x \in D$, we can write $x = \sum_{j=1}^{r} c_j y_j$ for some $c_1, \dots, c_r \in \mathbb{F}_q$; hence $\pi_i(x) = \sum_{j=1}^{r} c_j \pi_i(y_j) = 0$. Thus $i \notin \mathrm{Supp}(D)$. This shows that $\mathrm{Supp}(D) \subseteq \bigcup_{j=1}^{r} A_j$.

The next two lemmas extend Proposition 2.1 and show that for a judicious choice of a family $\{M_1, \dots, M_r\}$ of minors of $X$, the support weight of the product of any nonempty subfamily is given by a formula analogous to (3).

Lemma 2.3. Let $r$ be a positive integer such that $r \le \ell' - h + 1$ and let $Y$ be any $h \times (h + r - 1)$ submatrix of $X$. Also for $j = 1, \dots, r$, let $M_j$ denote the $h \times h$ minor of $Y$ (and hence of $X$) corresponding to the first $h - 1$ columns of $Y$ together with the $(h + j - 1)$-th column of $Y$, and let $A_j = \{P \in \mathbb{A}^{\delta} : M_j(P) \ne 0\}$. Then for any positive integer $s$ with $s \le r$ and any $j_1, \dots, j_s \in \{1, \dots, r\}$ with $j_1 < \cdots < j_s$,
$$|A_{j_1} \cap \cdots \cap A_{j_s}| = d(\ell, m; h) \left(1 - \frac{1}{q}\right)^{s-1}. \tag{4}$$

Proof. Given any $\ell \times \ell'$ matrix $P \in \mathbb{A}^{\delta}$ with entries in $\mathbb{F}_q$, let $Q$ denote the $h \times (h + r - 1)$ submatrix of $P$ formed in exactly the same way as $Y$, and let $Q_1, \dots, Q_{h+r-1}$ denote the column vectors of $Q$. For any positive integer $s$ with $s \le r$ and any $j_1, \dots, j_s \in \{1, \dots, r\}$ with $j_1 < \cdots < j_s$, the condition $P \in A_{j_1} \cap \cdots \cap A_{j_s}$ is equivalent to the condition that the column vectors $Q_1, \dots, Q_{h-1}, Q_{h+j-1}$ in $\mathbb{F}_q^h$ are linearly independent for each $j \in \{j_1, \dots, j_s\}$. This holds precisely when the first $h - 1$ columns of $Q$ are linearly independent, which allows exactly $(q^h - 1)(q^h - q) \cdots (q^h - q^{h-2})$ choices, and each of $Q_{h+j_1-1}, \dots, Q_{h+j_s-1}$ lies outside their span, which allows $q^h - q^{h-1}$ choices for each. The remaining $r - s$ columns of $Q$ may be chosen arbitrarily in $q^{h(r-s)}$ ways. Since $P$ has $\ell\ell' - h(h + r - 1)$, i.e., $\delta - h^2 - h(r - 1)$, entries outside $Q$, it follows that
$$|A_{j_1} \cap \cdots \cap A_{j_s}| = (q^h - 1)(q^h - q) \cdots (q^h - q^{h-2})\,(q^h - q^{h-1})^{s}\, q^{\delta - h^2 - h(s-1)} = d(\ell, m; h) \left(1 - \frac{1}{q}\right)^{s-1},$$
where the last equality follows from (3).
Lemma 2.4. Assume that $h < \ell'$. Let $r$ be a positive integer such that $r \le h + 1$ and let $Y$ be any $h \times (h + 1)$ submatrix of $X$. Also for $j = 1, \dots, r$, let $M_j$ denote the determinant of the $h \times h$ submatrix of $Y$ formed by all except the $(h - r + j + 1)$-th column of $Y$, and let $A_j = \{P \in \mathbb{A}^{\delta} : M_j(P) \ne 0\}$. Then (4) holds for any positive integer $s$ with $s \le r$ and any $j_1, \dots, j_s \in \{1, \dots, r\}$ with $j_1 < \cdots < j_s$.

Proof. Given $P \in \mathbb{A}^{\delta}$, let $Q$ be the $h \times (h + 1)$ submatrix of $P$ corresponding to $Y$, and let $Q_1, \dots, Q_{h+1}$ denote its column vectors. Fix any positive integer $s$ with $s \le r$ and $j_1, \dots, j_s \in \{1, \dots, r\}$ with $j_1 < \cdots < j_s$. Now $M_{j_1}(P) \ne 0$ implies that $Q$ has rank $h$ and in particular, $Q_{h-r+j_1+1}$ is an $\mathbb{F}_q$-linear combination of the remaining $h$ column vectors of $Q$. Moreover, for $2 \le t \le s$, if $M_{j_t}(P) \ne 0$, then the coefficient of $Q_{h-r+j_t+1}$ in this $\mathbb{F}_q$-linear combination must be nonzero. Conversely, if all except the $(h - r + j_1 + 1)$-th column of $Q$ are linearly independent (and these columns can thus be chosen in $|GL_h(\mathbb{F}_q)|$ ways), while $Q_{h-r+j_1+1}$ is an $\mathbb{F}_q$-linear combination of the remaining $h$ column vectors of $Q$ with a nonzero coefficient for each of the $s - 1$ columns $Q_{h-r+j_2+1}, \dots, Q_{h-r+j_s+1}$, then $M_{j_t}(P) \ne 0$ for each $t = 1, \dots, s$. The $h$ coefficients in this $\mathbb{F}_q$-linear combination can thus be chosen in $q^{h-s+1}(q - 1)^{s-1}$ ways. Since $P$ has $\delta - h(h + 1)$ entries outside of $Q$, it follows that
$$|A_{j_1} \cap \cdots \cap A_{j_s}| = |GL_h(\mathbb{F}_q)|\, q^{h-s+1} (q - 1)^{s-1}\, q^{\delta - h(h+1)} = d(\ell, m; h) \left(1 - \frac{1}{q}\right)^{s-1},$$
where the last equality follows once again from (3).
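The counting identity (4) can be sanity-checked by exhaustive enumeration for very small parameters. The Python sketch below is an illustration only and is not part of the original argument. It assumes $q$ is prime (so that arithmetic modulo $q$ models $\mathbb{F}_q$), takes $Y$ to sit in the first $h$ rows of $X$, and uses hypothetical helper names; minors are specified by 0-based column indices.

```python
from itertools import combinations, permutations, product


def det_mod(rows, q):
    """Determinant of a small square matrix modulo q, via the Leibniz formula."""
    n = len(rows)
    total = 0
    for perm in permutations(range(n)):
        # sign of the permutation, read off from its inversion count
        inversions = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= rows[i][perm[i]]
        total += term
    return total % q


def intersection_size(q, l, lp, h, column_sets):
    """Count l x lp matrices over F_q (q prime) on which every listed h x h minor is nonzero.

    Each minor uses rows 0..h-1 and the columns given by one entry of column_sets.
    """
    count = 0
    for entries in product(range(q), repeat=l * lp):
        P = [entries[i * lp:(i + 1) * lp] for i in range(l)]
        if all(det_mod([[P[i][c] for c in cols] for i in range(h)], q) != 0
               for cols in column_sets):
            count += 1
    return count


def gl_order(h, q):
    """Order of GL_h(F_q)."""
    order = 1
    for i in range(h):
        order *= q ** h - q ** i
    return order


def check_formula_4(q, l, lp, h, family):
    """Check |A_{j1} ∩ ... ∩ A_{js}| = d (1 - 1/q)^(s-1) for every nonempty subfamily."""
    d = q ** (l * lp - h * h) * gl_order(h, q)  # minimum distance from (3)
    for s in range(1, len(family) + 1):
        for sub in combinations(family, s):
            # compare after clearing the denominator q^(s-1)
            if intersection_size(q, l, lp, h, sub) * q ** (s - 1) != d * (q - 1) ** (s - 1):
                return False
    return True
```

With $q = 2$, $\ell = h = 2$ and $\ell' = 3$, the family of Lemma 2.3 (with $r = 2$) corresponds to the column sets `[(0, 1), (0, 2)]`, and the family of Lemma 2.4 (with $r = 3$) to `[(1, 2), (0, 2), (0, 1)]`.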
Theorem 2.5. Let $r$ be a positive integer such that $r \le \max\{\ell' - h, h\} + 1$. Assume that $h < \ell'$ in case $\max\{\ell' - h, h\} = h$, i.e., $\ell' \le 2h$. Then the $r$-th higher weight $d_r(\ell, m; h)$ of $C^{\mathbb{A}}(\ell, m; h)$ is
$$d_r(\ell, m; h) = q^{\delta - r + 1}\, \frac{q^r - 1}{q - 1} \prod_{i=1}^{h} \left(1 - \frac{1}{q^{i}}\right) = q^{\delta - h^2 - r + 1}\, \frac{(q^r - 1)\, |GL_h(\mathbb{F}_q)|}{q - 1}. \tag{5}$$
Moreover, the $r$-th higher weight of $C^{\mathbb{A}}(\ell, m; h)$ attains the Griesmer-Wei bound.

Proof. The hypotheses on $r$ and $h$ together with Lemmas 2.3 and 2.4 ensure that there exist minors $M_1, \dots, M_r \in \Delta(\ell, m; h)$ with supports $A_1, \dots, A_r$ respectively, such that (4) holds for any positive integer $s$ with $s \le r$ and any $j_1, \dots, j_s \in \{1, \dots, r\}$ with $j_1 < \cdots < j_s$. Consequently,
$$\Bigl|\bigcup_{j=1}^{r} A_j\Bigr| = \sum_{s=1}^{r} (-1)^{s-1} \sum_{1 \le j_1 < \cdots < j_s \le r} |A_{j_1} \cap \cdots \cap A_{j_s}|$$

[…]

[…] $> 0$. Let $d_0 := 0$ and for $1 \le r \le k$, let $d_r$ denote the $r$-th higher weight of $C$. Also let
$$e_j := d_j - j \quad\text{and}\quad f_j := n - j - d_{k-j} \quad\text{for } 0 \le j \le k. \tag{10}$$
Then the $e$-sequence and the $f$-sequence partition $\{0, 1, \dots, n - k\}$; more precisely,
$$0 = e_0 \le e_1 \le \cdots \le e_k = n - k \quad\text{and}\quad 0 = f_0 \le f_1 \le \cdots \le f_k = n - k. \tag{11}$$
Moreover, for $0 \le s < n - k$, the last $s$-th higher weight of the dual of $C$ is given by
$$d_{n-k-s}(C^{\perp}) = n - s - j \quad\text{if } j \text{ is the unique integer} < k \text{ with } e_j \le s < e_{j+1}. \tag{12}$$
Equivalently, for $0 < s \le n - k$, the $s$-th higher weight of the dual of $C$ is given by
$$d_s(C^{\perp}) = s + j + 1 \quad\text{if } j \text{ is the unique integer} < k \text{ with } f_j < s \le f_{j+1}. \tag{13}$$

Proof. Note that $d_k = n$, since $C$ is nondegenerate. With this in view, part (i) of Proposition 4.1 implies (11). Next, parts (i) and (ii) of Proposition 4.1 together with (11) readily imply (12) and (13).

Remark 4.3. The above Corollary shows that the higher weights of the dual of a nondegenerate linear code $C$ of positive dimension $k$ take consecutive values in strings of length $d_{r+1} - d_r - 1$ for $0 \le r < k$, where $d_r$ denotes the $r$-th higher weight of $C$. Evidently, this phenomenon is prevalent if there are large gaps among the consecutive higher weights of $C$. In fact, a duality of sorts seems to prevail here: the more consecutive strings there are among the higher weights of a code, the fewer there are among the higher weights of its dual, and vice versa. In this connection,
it may be useful to note the following result of Tsfasman and Vlăduţ [19, Cor. 3.5], which states that for $1 \le r \le s \le k$,
$$d_s \ge d_r + \sum_{i=1}^{s-r} \frac{(q - 1)\, d_r}{(q^r - 1)\, q^{i}} \quad\text{and in particular,}\quad d_{r+1} - d_r \ge \frac{(q - 1)\, d_r}{(q^r - 1)\, q}.$$
Another special case of Corollary 4.2 worth noting is that $C^{\perp}$ is nondegenerate if and only if $d_1(C) > 1$.

We now turn to duals of affine Grassmann codes. Recall that we have fixed positive integers $h, \ell, \ell'$ with $h \le \ell \le \ell'$ and that the length of the corresponding affine Grassmann code $C^{\mathbb{A}}(\ell, m; h)$ of level $h$ is given by $n := q^{\delta}$ and the dimension $k_h$ is given by (9). To avoid trivialities we will further assume that $\ell' > 1$. Indeed, it is easy to describe what the affine Grassmann code $C^{\mathbb{A}}(\ell, m; h)$ and its dual are in the trivial case $\ell' = 1$ (or another trivial case $h = 0$ that we have ignored from the beginning) and in fact, this has been done in the paragraph before Theorem 17 in [2]. Using the results of Sections 2 and 3, we obtain a more concrete version of Corollary 4.2, which determines several initial and terminal higher weights of $C^{\mathbb{A}}(\ell, m; h)^{\perp}$. As a very special case, we also obtain an alternative and simpler proof of [2, Theorem 17].

Theorem 4.4. Assume that $\ell' > 1$. For $1 \le r \le n - k_h$, let $d_r^{\perp}$ denote the $r$-th higher weight of $C^{\mathbb{A}}(\ell, m; h)^{\perp}$. Then

(i) The minimum distance of $C^{\mathbb{A}}(\ell, m; h)^{\perp}$ is given by
$$d_1\bigl(C^{\mathbb{A}}(\ell, m; h)^{\perp}\bigr) = \begin{cases} 3 & \text{if } q > 2, \\ 4 & \text{if } q = 2. \end{cases}$$
More generally, upon letting $Q_j = q^{j} - j$ for $j \ge 0$, the $s$-th higher weight of $C^{\mathbb{A}}(\ell, m; h)^{\perp}$ for $1 \le s < q^{\ell'} - \ell'$ is given by
$$d_s\bigl(C^{\mathbb{A}}(\ell, m; h)^{\perp}\bigr) = s + j + 1, \quad\text{where } j \text{ is the unique positive integer} \le \ell' \text{ such that } Q_{j-1} \le s < Q_j.$$

(ii) With $d(\ell, m; h)$ as in (3), we have $d(\ell, m; h) \ge 2$ and
$$d_{n-k_h-s}\bigl(C^{\mathbb{A}}(\ell, m; h)^{\perp}\bigr) = q^{\delta} - s \quad\text{for } 0 \le s \le d(\ell, m; h) - 2.$$
In particular, $C^{\mathbb{A}}(\ell, m; h)^{\perp}$ is nondegenerate. Further if we assume that $h < \ell'$ or $\ell' > 2h$, and we let $G_0 = H_0 = 0$ and
$$G_j := \sum_{i=0}^{j-1} q^{-i} \quad\text{and}\quad H_j := d(\ell, m; h)\, G_j - j \quad\text{for any positive integer } j,$$
then the last $s$-th higher weight of $C^{\mathbb{A}}(\ell, m; h)^{\perp}$ for $H_1 \le s \le \max\{H_h, H_{\ell' - h}\}$ is given by
$$d_{n-k_h-s}\bigl(C^{\mathbb{A}}(\ell, m; h)^{\perp}\bigr) = n - s - j,$$
where $j$ is the unique positive integer $\le \max\{h, \ell' - h\}$ with $H_{j-1} \le s < H_j$.

Proof. Let $C = C^{\mathbb{A}}(\ell, m; h)$, $n = q^{\delta}$ and $k = k_h$, and let $d_j, e_j, f_j$ be as in Corollary 4.2. By Theorem 3.3, the code $C$ is nondegenerate and $d_{k-j} = n - q^{j-1}$ for $1 \le j \le \ell' + 1$. Consequently, the condition $f_j < s \le f_{j+1}$ translates to $Q_{j-1} \le s < Q_j$, provided $1 \le j \le \ell'$. Thus (13) implies the desired formula in (i) for $d_s\bigl(C^{\mathbb{A}}(\ell, m; h)^{\perp}\bigr)$. In the particular case when $s = 1$, we have $Q_1 = 1 < 2 = Q_2$ or $Q_0 = 1 < Q_1$ according as $q = 2$ or $q > 2$, and this yields the formula for the minimum distance of $C^{\mathbb{A}}(\ell, m; h)^{\perp}$.
Next, $d_1 = d(\ell, m; h) = q^{\delta - h^2}\, |GL_h(\mathbb{F}_q)|$ and since $\ell' > 1$ we see that $q^{\delta - h^2} \ge 2$ when $h < \ell'$, whereas $|GL_h(\mathbb{F}_q)| \ge (q^2 - 1)(q^2 - q) \ge 6$ when $h = \ell'$. Thus in any case $d(\ell, m; h) \ge 2$ and so (12) implies the first assertion in (ii). Further, Theorem 2.5 shows that $e_j = H_j$ for $1 \le j \le 1 + \max\{h, \ell' - h\}$. Thus (12) implies the remaining assertion in (ii) as well. □
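The direct formula in part (i) of Theorem 4.4 is easy to evaluate mechanically. The following Python sketch is an illustration with a hypothetical function name, where the parameter `lp` stands for $\ell'$. It returns $s + j + 1$ for the unique $j$ with $Q_{j-1} \le s < Q_j$, and its output can be compared with the rows of Table 1.

```python
def dual_initial_weight(q: int, s: int, lp: int) -> int:
    """s-th higher weight of the dual code, following Theorem 4.4(i).

    Returns s + j + 1, where j in {1, ..., lp} satisfies Q_{j-1} <= s < Q_j
    and Q_j = q^j - j.  Only valid for 1 <= s < Q_lp.
    """
    def Q(j: int) -> int:
        return q ** j - j

    if not 1 <= s < Q(lp):
        raise ValueError("the formula is only stated for 1 <= s < q^lp - lp")
    for j in range(1, lp + 1):
        if Q(j - 1) <= s < Q(j):
            return s + j + 1
    # Q is nondecreasing with Q_0 = 1, so the loop above always returns.
    raise AssertionError("no admissible j found")
```

For instance, with $q = 2$ the first five values are 4, 6, 7, 8, 10, in agreement with the first row of Table 1.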
Remark 4.5. With the first higher weight of $C^{\mathbb{A}}(\ell, m; h)^{\perp}$ given as in part (i) of Theorem 4.4 above, we can also describe many of the initial higher weights $d_s^{\perp} := d_s\bigl(C^{\mathbb{A}}(\ell, m; h)^{\perp}\bigr)$ by the recursive formula
$$d_s^{\perp} = \begin{cases} d_{s-1}^{\perp} + 2 & \text{if } d_{s-1}^{\perp} \text{ is a power of } q, \\ d_{s-1}^{\perp} + 1 & \text{otherwise,} \end{cases}$$
provided $2 \le s \le q^{\ell'} - \ell'$. Likewise, the last higher weight of $C^{\mathbb{A}}(\ell, m; h)^{\perp}$ is $n = q^{\delta}$, and many terminal higher weights of $C^{\mathbb{A}}(\ell, m; h)^{\perp}$ are given by the recursive formula
$$d_{n-k_h-s}^{\perp} = \begin{cases} d_{n-k_h-s+1}^{\perp} - 2 & \text{if } d_{n-k_h-s+1}^{\perp} = n + 1 - d(\ell, m; h)\, G_j \text{ for some } j, \\ d_{n-k_h-s+1}^{\perp} - 1 & \text{otherwise,} \end{cases}$$
provided $1 \le s \le \max\{d(\ell, m; h)\, G_h - h,\ d(\ell, m; h)\, G_{\ell' - h} - (\ell' - h)\}$ and it is assumed that $h < \ell'$ or $\ell' > 2h$.

Using the direct formula in Theorem 4.4 or the recursive formula in Remark 4.5, we can easily write down several of the initial and terminal higher weights of $C^{\mathbb{A}}(\ell, m; h)^{\perp}$. Table 1 illustrates the first few higher weights $d_s^{\perp}$ of the dual of $C^{\mathbb{A}}(\ell, m; h)$, where $h, \ell, \ell'$ are sufficiently large, say $\ell' > \ell \ge h \ge 27$.

Appendix A. A geometric approach to higher weights

Let $n, k$ be positive integers with $k \le n$. A nondegenerate $[n, k]_q$-projective system is simply a (multi)set $X$ of $n$ points in the projective space $\mathbb{P}^{k-1}$ over the finite field $\mathbb{F}_q$. If we write $\mathbb{P}^{k-1} = \mathbb{P}(V)$, where $V$ is a $k$-dimensional vector space over $\mathbb{F}_q$, and fix some lifts, say $v_1, \dots, v_n$, of these $n$ points to $V$, then the associated nondegenerate linear code $C_X$ is the image of the evaluation map
$$\mathrm{Ev} : V^{*} \to \mathbb{F}_q^{n} \quad\text{defined by}\quad \mathrm{Ev}(\phi) = (\phi(v_1), \dots, \phi(v_n)),$$
where $V^{*}$ denotes the dual of $V$, i.e., the space of all linear maps from $V$ to $\mathbb{F}_q$. It is shown in [18, 19] that the association $X \mapsto C_X$ is a one-to-one correspondence, modulo natural notions of equivalence, from the class of nondegenerate $[n, k]_q$-projective systems onto the class of nondegenerate $[n, k]_q$-codes. For $1 \le r \le k$, the $r$-th higher weight of $C_X$ corresponds to maximal sections of $X$ by (projective) linear subspaces of $\mathbb{P}^{k-1}$ of codimension $r$; more precisely,
$$d_r(C_X) = n - \max\{|X \cap \Pi| : \Pi \text{ linear subspace of } \mathbb{P}^{k-1} \text{ with } \operatorname{codim} \Pi = r\}.$$
For more on this, we refer to [18, 19].
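The description of $d_r(C_X)$ in terms of maximal linear sections can be tried out by brute force on tiny examples. The Python sketch below is an illustration, not part of the original text. It is restricted to $q = 2$ for simplicity, encodes vectors and linear forms of $\mathbb{F}_2^k$ as integers, and realizes a codimension-$r$ subspace $\Pi$ as the common zero locus of $r$ linearly independent linear forms; the function name is hypothetical.

```python
from itertools import product


def higher_weights_from_points(points, k):
    """Weight hierarchy [d_1, ..., d_k] of the binary code C_X of a projective system X.

    points: nonzero vectors of F_2^k encoded as integers in [1, 2^k).
    A linear form phi is encoded the same way, with phi(v) = parity of popcount(phi & v).
    """
    n = len(points)

    def vanishes(phi, v):
        return bin(phi & v).count("1") % 2 == 0

    weights = []
    for r in range(1, k + 1):
        best = -1
        for forms in product(range(1, 2 ** k), repeat=r):
            # the r forms are independent iff they span exactly 2^r forms
            span = {0}
            for phi in forms:
                span |= {x ^ phi for x in span}
            if len(span) != 2 ** r:
                continue
            # |X ∩ Pi|, where Pi is the common zero locus of the r forms
            section = sum(1 for v in points if all(vanishes(phi, v) for phi in forms))
            best = max(best, section)
        weights.append(n - best)
    return weights
```

Taking $X$ to be all seven points of the projective plane over $\mathbb{F}_2$ (the integers 1 to 7 with $k = 3$) yields the hierarchy 4, 6, 7.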
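Now let $\ell, m$ be positive integers with $\ell \le m$ and as before let $k = \binom{m}{\ell}$, $\ell' := m - \ell$ and $\delta = \ell\ell'$. Assume that $1 < \ell \le \ell'$. Consider the Grassmannian $G_{\ell,m} = G_{\ell,m}(\mathbb{F}_q)$ of $\ell$-dimensional subspaces of $\mathbb{F}_q^{m}$. The Plücker embedding
$$G_{\ell,m} \hookrightarrow \mathbb{P}^{\binom{m}{\ell} - 1} = \mathbb{P}(\wedge^{\ell}\, \mathbb{F}_q^{m}) \quad\text{given by}\quad W = \langle w_1, \dots, w_{\ell} \rangle \mapsto [w_1 \wedge \cdots \wedge w_{\ell}]$$
is known to be nondegenerate and the corresponding nondegenerate linear code is the Grassmann code $C(\ell, m)$. The formula stated in the Introduction for the last few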
Table 1. Dual Higher Weights of Affine Grassmann Codes. Each row below lists $q$ followed by $d_1^{\perp}, d_2^{\perp}, \dots, d_{27}^{\perp}$.
2 4 6 7 8 10 11 12 13 14 15 16 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 34
3 3 5 6 7 8 9 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 29 30 31 32
4 3 4 6 7 8 9 10 11 12 13 14 15 16 18 19 20 21 22 23 24 25 26 27 28 29 30 31
5 3 4 5 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 27 28 29 30 31
7 3 4 5 6 7 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
8 3 4 5 6 7 8 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
9 3 4 5 6 7 8 9 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
11 3 4 5 6 7 8 9 10 11 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
13 3 4 5 6 7 8 9 10 11 12 13 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
16 3 4 5 6 7 8 9 10 11 12 13 14 15 16 18 19 20 21 22 23 24 25 26 27 28 29 30
17 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 19 20 21 22 23 24 25 26 27 28 29 30
higher weights of $C(\ell, m)$ follows readily from the structure of linear subvarieties of $G_{\ell,m}$ or, in algebraic parlance, the structure of decomposable subspaces of exterior powers. Indeed, $G_{\ell,m}$ contains a linear subspace $\Pi$ of dimension $r - 1$, provided $r \le \mu$, where $\mu := 1 + \max\{\ell, m - \ell\}$; see, for example, [7, Cor. 7] or [5, Lemma 3.5]. This subspace $\Pi$ has codimension $(k - 1) - (r - 1) = k - r$ in $\mathbb{P}^{k-1}$, and clearly, $|\Pi \cap G_{\ell,m}| = |\Pi| = |\mathbb{P}^{r-1}| = 1 + q + \cdots + q^{r-1}$. Consequently,
$$d_{k-r}(C(\ell, m)) = n - (1 + q + \cdots + q^{r-1}) \quad\text{for } 1 \le r \le \mu.$$
Suppose we fix an ordered basis $\{e_1, \dots, e_m\}$ of $\mathbb{F}_q^{m}$ and the corresponding basis $\{e_{\alpha} : \alpha \in I(\ell, m)\}$ of $\wedge^{\ell}\, \mathbb{F}_q^{m}$, where
$$I(\ell, m) = \{\alpha = (\alpha_1, \dots, \alpha_{\ell}) \in \mathbb{Z}^{\ell} : 1 \le \alpha_1 < \cdots < \alpha_{\ell} \le m\}$$
and $e_{\alpha} := e_{\alpha_1} \wedge \cdots \wedge e_{\alpha_{\ell}}$ for $\alpha = (\alpha_1, \dots, \alpha_{\ell}) \in I(\ell, m)$. The Plücker coordinates of an $\ell$-dimensional subspace $W \in G_{\ell,m}$ spanned by $\{w_1, \dots, w_{\ell}\}$ are precisely
$p = (p_{\alpha})_{\alpha \in I(\ell, m)}$, where $p_{\alpha} \in \mathbb{F}_q$ are determined by the relation
$$w_1 \wedge \cdots \wedge w_{\ell} = \sum_{\alpha \in I(\ell, m)} p_{\alpha}\, e_{\alpha}.$$
For $\alpha \in I(\ell, m)$, let $H_{\alpha}$ denote the hyperplane $\{p \in \mathbb{P}^{k-1} : p_{\alpha} = 0\}$ in $\mathbb{P}^{k-1}$, and let $U_{\alpha} := \{p \in \mathbb{P}^{k-1} : p_{\alpha} \ne 0\}$ be the corresponding basic open set. It is a classical fact that $U_{\alpha} \cap G_{\ell,m}$ is isomorphic to the affine space $\mathbb{A}^{\delta}$ of $\ell \times \ell'$ matrices over $\mathbb{F}_q$. This correspondence is given explicitly by the Basic Cell Lemma of [6]. For the sake of definiteness, consider $\theta := (\ell' + 1, \ell' + 2, \dots, m) \in I(\ell, m)$. Then the Plücker embedding restricted to $U_{\theta} \cap G_{\ell,m}$ gives a nondegenerate embedding of $\mathbb{A}^{\delta}$ into $\mathbb{P}^{k-1}$, and the linear code corresponding to this projective system is, in fact, the affine Grassmann code $C^{\mathbb{A}}(\ell, m)$ of length $q^{\delta}$.

If $\Pi$ is a linear subspace of $\mathbb{P}^{k-1}$ of dimension $r - 1$, then $\Pi \cap H_{\theta}$ would be a linear subspace of dimension $r - 2$ or $r - 1$ according as $\Pi \not\subseteq H_{\theta}$ or $\Pi \subseteq H_{\theta}$. Consequently,
$$|\Pi \cap U_{\theta}| = |\Pi| - |\Pi \cap H_{\theta}| \le (1 + q + \cdots + q^{r-1}) - (1 + q + \cdots + q^{r-2}) = q^{r-1}.$$
Consequently, $|\Pi \cap U_{\theta} \cap G_{\ell,m}| \le q^{r-1}$ and so $d_{k-r}(C^{\mathbb{A}}(\ell, m)) \ge q^{\delta} - q^{r-1}$ for all $r = 1, \dots, k$. This proves a stronger version of Lemma 3.2 in the case $h = \ell$. Further, if $r \le \mu = \ell' + 1$ and if $\Pi$ is a linear subspace of $\mathbb{P}^{k-1}$ of codimension $k - r$ chosen in such a way that $\Pi \subseteq G_{\ell,m}$ and $\Pi \not\subseteq H_{\theta}$, then $|\Pi \cap U_{\theta} \cap G_{\ell,m}| = |\Pi \cap U_{\theta}| = q^{r-1}$ and so $d_{k-r}(C^{\mathbb{A}}(\ell, m)) = q^{\delta} - q^{r-1}$ for all $r = 1, \dots, \mu$. Since $1 < \ell \le \ell'$, choosing such a subspace $\Pi$ is possible for $1 \le r \le \mu$; for example, we can take
$$\Pi = \{p \in \mathbb{P}^{k-1} : p_{(1, 2, \dots, \ell - 1, j)} = 0 \text{ for } j = \ell, \ell + 1, \dots, \ell + r - 1\}$$
to be the intersection of Plücker coordinate hyperplanes that are "close" to each other. Thus we obtain an alternative proof of Theorem 3.3 when $h = \ell$. On the other hand, deriving the formulas that we have for initial higher weights of $C^{\mathbb{A}}(\ell, m)$ from the corresponding results for the Grassmann code $C(\ell, m)$ is not so straightforward.
To be sure, the optimal linear subspaces in $G_{\ell,m}$ of large dimension (or small codimension) are obtained in [6] by considering close families in $I(\ell, m)$ and the corresponding linear subspaces of $\mathbb{P}^{k-1}$ given by the intersections of Plücker coordinate hyperplanes. Recall that $\Lambda \subseteq I(\ell, m)$ is said to be close if any two distinct elements of $\Lambda$ have $\ell - 1$ coordinates in common. However, determining the maximum possible cardinality of the intersection of the corresponding linear subspace $\Pi$ with $U_{\theta} \cap G_{\ell,m}$ is not easy. It may be tempting to consider $\Lambda \subseteq I(\ell, m)$ not containing $\theta$ such that $\Lambda \cup \{\theta\}$ is close. But this does not work even when $\Lambda$ is a singleton (which would correspond to looking at the minimum distance). In fact, it is better to keep the elements of $\Lambda$ as far away from $\theta$ as possible. Thus choosing a close family in $I(\ell, \ell')$ rather than $I(\ell, m)$ is helpful and this has, in fact, motivated the proofs of Lemmas 2.3 and 2.4, which paved the way for Theorem 2.5.

References

[1] Peter Beelen, Sudhir R. Ghorpade, and Tom Høholdt, Affine Grassmann codes, IEEE Trans. Inform. Theory 56 (2010), no. 7, 3166–3176, DOI 10.1109/TIT.2010.2048470. MR2798982 (2012b:94135)
[2] Peter Beelen, Sudhir R. Ghorpade, and Tom Høholdt, Duals of affine Grassmann codes and their relatives, IEEE Trans. Inform. Theory 58 (2012), no. 6, 3843–3855, DOI 10.1109/TIT.2012.2187171. MR2924405
[3] Michael I. Boguslavsky, On the number of solutions of polynomial systems, Finite Fields Appl. 3 (1997), no. 4, 287–299, DOI 10.1006/ffta.1997.0186. MR1478830 (98m:11132)
[4] Philippe Delsarte, Jean-Marie Goethals, and Florence Jessie MacWilliams, On generalized Reed-Muller codes and their relatives, Information and Control 16 (1970), 403–442. MR0274186 (42 #9061)
[5] Sudhir R. Ghorpade and Krishna V. Kaipa, Automorphism groups of Grassmann codes, Finite Fields Appl. 23 (2013), 80–102, DOI 10.1016/j.ffa.2013.04.005. MR3061086
[6] Sudhir R. Ghorpade and Gilles Lachaud, Higher weights of Grassmann codes, Coding theory, cryptography and related areas (Guanajuato, 1998), Springer, Berlin, 2000, pp. 122–131. MR1749453 (2001d:94036)
[7] Sudhir R. Ghorpade, Arunkumar R. Patil, and Harish K. Pillai, Decomposable subspaces, linear sections of Grassmann varieties, and higher weights of Grassmann codes, Finite Fields Appl. 15 (2009), no. 1, 54–68, DOI 10.1016/j.ffa.2008.08.001. MR2468992 (2009h:14043)
[8] Johan P. Hansen, Trygve Johnsen, and Kristian Ranestad, Grassmann codes and Schubert unions (English, with English and French summaries), Arithmetics, geometry, and coding theory (AGCT 2005), Sémin. Congr., vol. 21, Soc. Math. France, Paris, 2010, pp. 103–123. MR2856563
[9] Petra Heijnen and Ruud Pellikaan, Generalized Hamming weights of q-ary Reed-Muller codes, IEEE Trans. Inform. Theory 44 (1998), no. 1, 181–196, DOI 10.1109/18.651015. MR1486657 (99a:94068)
[10] Tadao Kasami, Shu Lin, and W. Wesley Peterson, New generalizations of the Reed-Muller codes. I. Primitive codes, IEEE Trans. Information Theory IT-14 (1968), 189–199. MR0275989 (43 #1742)
[11] Gilles Lachaud, Projective Reed-Muller codes (English, with French summary), Coding theory and applications (Cachan, 1986), Lecture Notes in Comput. Sci., vol. 311, Springer, Berlin, 1988, pp. 125–129, DOI 10.1007/3-540-19368-5_13. MR960714 (89i:94038)
[12] Gilles Lachaud, The parameters of projective Reed-Muller codes (English, with French summary), Discrete Math. 81 (1990), no. 2, 217–221, DOI 10.1016/0012-365X(90)90155-B. MR1054981 (91g:94025)
[13] Dmitri Yu. Nogin, Codes associated to Grassmannians, Arithmetic, geometry and coding theory (Luminy, 1993), de Gruyter, Berlin, 1996, pp. 145–154. MR1394931 (97k:94075)
[14] Charles Ryan, An application of Grassmannian varieties to coding theory, Congr. Numer. 57 (1987), 257–271. Sixteenth Manitoba conference on numerical mathematics and computing (Winnipeg, Man., 1986). MR889714 (88f:94043)
[15] Charles T. Ryan, Projective codes based on Grassmann varieties, Congr. Numer. 57 (1987), 273–279. Sixteenth Manitoba conference on numerical mathematics and computing (Winnipeg, Man., 1986). MR889715 (88f:94044)
[16] Jean-Pierre Serre, Lettre à M. Tsfasman (French, with English summary), Astérisque 198-200 (1991), 11, 351–353 (1992). Journées Arithmétiques, 1989 (Luminy, 1989). MR1144337 (93e:14026)
[17] Anders Bjært Sørensen, Projective Reed-Muller codes, IEEE Trans. Inform. Theory 37 (1991), no. 6, 1567–1576, DOI 10.1109/18.104317. MR1134296 (92g:94018)
[18] Michael A. Tsfasman and Serge G. Vlăduţ, Algebraic-geometric codes, Mathematics and its Applications (Soviet Series), vol. 58, Kluwer Academic Publishers Group, Dordrecht, 1991. Translated from the Russian by the authors. MR1186841 (93i:94023)
[19] Michael A. Tsfasman and Serge G. Vlăduţ, Geometric approach to higher weights, IEEE Trans. Inform. Theory 41 (1995), no. 6, 1564–1588, DOI 10.1109/18.476213. Special issue on algebraic geometry codes. MR1391017 (97m:94042)
[20] Victor K. Wei, Generalized Hamming weights for linear codes, IEEE Trans. Inform. Theory 37 (1991), no. 5, 1412–1418, DOI 10.1109/18.133259. MR1136673 (92i:94019)

Department of Mathematics, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
E-mail address: [email protected]

Department of Mathematics, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
E-mail address: [email protected]
Algorithmic: special varieties
Contemporary Mathematics Volume 637, 2015 http://dx.doi.org/10.1090/conm/637/12751
The geometry of efficient arithmetic on elliptic curves

David Kohel

Abstract. The arithmetic of elliptic curves, namely polynomial addition and scalar multiplication, can be described in terms of global sections of line bundles on $E \times E$ and $E$, respectively, with respect to a given projective embedding of $E$ in $\mathbb{P}^r$. By means of a study of the finite dimensional vector spaces of global sections, we reduce the problem of constructing and finding efficiently computable polynomial maps defining the addition morphism or isogenies to linear algebra. We demonstrate the effectiveness of the method by improving the best known complexity for doubling and tripling, by considering families of elliptic curves admitting a 2-torsion or 3-torsion point.
2010 Mathematics Subject Classification. Primary 11G, 14G.
This work was supported by a project of the Agence Nationale de la Recherche, reference ANR-12-BS01-0010-01.

1. Introduction

The computational complexity of arithmetic on an elliptic curve, determined by polynomial maps, depends on the choice of projective embedding of the curve. Explicit counts of multiplications and squarings are expressed in terms of operations on the coordinate functions determined by this embedding. The perspective of this work is to reduce the determination of the complexity of evaluating a morphism, up to additions and multiplication by constants, to a problem of computing a $d$-dimensional subspace $V$ of the space of monomials of degree $n$. This in turn can be conceptually reduced to the construction of a flag
$$V_1 \subset V_2 \subset \cdots \subset V_d = V.$$
For this purpose we recall results from Kohel [13] for the linear classification of projectively normal models. We generalize this further by analysing the conditions under which a degree $n$ isogeny is determined by polynomials of degree $n$ in terms of the given projective embeddings. This approach allows us to derive conjecturally optimal or nearly optimal algorithms for operations of doubling and tripling, which form the basic building blocks for efficient scalar multiplication.

2. Background

An elliptic curve $E$ is a projective nonsingular genus one curve with a fixed base point. In order to consider the arithmetic, namely addition and scalar multiplication defined by polynomial maps, we need to fix the additional structure of a projective embedding. We call an embedding $\iota : E \to \mathbb{P}^r$ a (projective) model for $E$. A model given by a complete linear system is called projectively normal (see Birkenhake-Lange [6, Chapter 7, Section 3] or Hartshorne [10, Chapter I, Exercise 3.18 & Chapter II, Exercise 5.14] for the general definition and its equivalence with this one for curves). If $\iota : E \to \mathbb{P}^r$ is a projectively normal model, letting $\{X_0, \dots, X_r\}$ denote the coordinate functions on $\mathbb{P}^r$, we have a surjection of rings:
$$\iota^{*} : k[\mathbb{P}^r] = k[X_0, \dots, X_r] \longrightarrow k[E] = \frac{k[X_0, \dots, X_r]}{I_E},$$
where $I_E$ is the defining ideal for the embedding. In addition, using the property that $\iota$ is given by a complete linear system, there exists $T$ in $E(k)$ such that every hyperplane intersects $E$ in $d = r + 1$ points $\{P_0, \dots, P_r\} \subset E(\bar{k})$, with multiplicities, such that $P_0 + \cdots + P_r = T$, and we say that the degree of the embedding is $d$. The invertible sheaf $\mathcal{L}$ attached to the embedding is $\iota^{*} \mathcal{O}_{\mathbb{P}^r}(1)$, where $\mathcal{O}_{\mathbb{P}^r}(1)$ is the sheaf spanned by $\{X_0, \dots, X_r\}$. Similarly, the space of global sections of $\mathcal{O}_{\mathbb{P}^r}(n)$ is generated by the monomials of degree $n$ in the $X_i$. Let $\mathcal{L}^n$ denote its image under $\iota^{*}$; then the space of global sections $\Gamma(E, \mathcal{L}^n)$ is the finite dimensional $k$-vector space spanned by monomials of degree $n$ modulo $I_E$, and hence
$$k[E] = \bigoplus_{n=0}^{\infty} \Gamma(E, \mathcal{L}^n),$$
which is a subspace of $k(E)[X_0]$. Now let $D$ be the divisor on $E$ cut out by $X_0 = 0$; then we can identify $\Gamma(E, \mathcal{L}^n)$ with the Riemann–Roch space associated to $nD$:
$$L(nD) = \{f \in k(E)^{*} \mid \operatorname{div}(f) \ge -nD\} \cup \{0\}.$$
More precisely, we have $\Gamma(E, \mathcal{L}^n) = L(nD) X_0^n \subset k(E) X_0^n$ for each $n \ge 0$. While the dimension of $L(nD)$ is $nd$, the dimension of the space of all monomials of degree $n$ is:
$$\dim_k \Gamma(\mathbb{P}^r, \mathcal{O}_{\mathbb{P}^r}(n)) = \binom{n + r}{r} = \binom{n + d - 1}{d - 1}.$$
The discrepancy is accounted for by relations of a given degree in $I_E$. More precisely, for the ideal sheaf $\mathcal{I}_E$ of $E$ on $\mathbb{P}^r$, with Serre twist $\mathcal{I}_E(n) = \mathcal{I}_E \otimes \mathcal{O}_{\mathbb{P}^r}(n)$, the space of relations of degree $n$ is $\Gamma(\mathbb{P}^r, \mathcal{I}_E(n))$, such that the defining ideal of $E$ in $\mathbb{P}^r$ is
$$I_E = \bigoplus_{n=1}^{\infty} \Gamma(\mathbb{P}^r, \mathcal{I}_E(n)) \subset k[X_0, \dots, X_r].$$
Consequently, each polynomial in the quotient $\Gamma(E, \mathcal{L}^n) \subset k[E]$ represents a coset $f + \Gamma(\mathbb{P}^r, \mathcal{I}_E(n))$ of polynomials. From the following table of dimensions (where $r = d - 1$ and $C(n+r, r)$ denotes the binomial coefficient $\binom{n+r}{r}$):

              d = 3        d = 4        d = 5        d = 6
n             1  2  3      1  2  3      1  2  3      1  2  3
C(n+r, r)     3  6 10      4 10 20      5 15 35      6 21 56
nd            3  6  9      4  8 12      5 10 15      6 12 18

we see the well-known result that a degree-3 curve in $\mathbb{P}^2$ is generated by a cubic relation, and a degree-4 curve in $\mathbb{P}^3$ is the intersection of two quadrics. Similarly, a quintic model in $\mathbb{P}^4$ and a sextic model in $\mathbb{P}^5$ are generated by a space of quadrics of dimensions 5 and 9, respectively.
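The dimension counts behind these statements can be reproduced in a few lines. The Python sketch below is an illustration with a hypothetical function name. It assumes, as in the discussion above, a projectively normal model of degree $d$ in $\mathbb{P}^{d-1}$, so that for $n \ge 1$ the space of degree-$n$ relations has dimension $\binom{n+r}{r} - nd$.

```python
from math import comb


def relation_count(d: int, n: int) -> int:
    """Dimension of the space of degree-n relations for a degree-d model in P^(d-1).

    Computed as (number of degree-n monomials in d variables) minus dim L(nD) = n*d,
    which is valid for n >= 1 under the projective normality assumption.
    """
    r = d - 1
    return comb(n + r, r) - n * d


if __name__ == "__main__":
    for d in (3, 4, 5, 6):
        print(d, [relation_count(d, n) for n in (1, 2, 3)])
```

For $d = 3$ the first relation appears in degree 3, while for $d = 4$, 5, 6 the quadric relations have dimensions 2, 5 and 9.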
When considering polynomial maps between curves, this space of relations $\Gamma(\mathbb{P}^r, \mathcal{I}_E(n))$, which evaluate to zero, gives a source of ambiguity but also room for optimization when evaluating a representative polynomial $f$ in its class.

Addition law relations. A similar analysis applies to the set of addition laws, from $E \times E$ to $E$. The set of polynomials of bidegree $(m, n)$ on $E \times E$ is well-defined modulo relations in
$$\Gamma(\mathbb{P}^r, \mathcal{I}_E(m)) \otimes_k \Gamma(\mathbb{P}^r, \mathcal{O}_{\mathbb{P}^r}(n)) + \Gamma(\mathbb{P}^r, \mathcal{O}_{\mathbb{P}^r}(m)) \otimes_k \Gamma(\mathbb{P}^r, \mathcal{I}_E(n)).$$
As the kernel of the surjective homomorphism
$$\Gamma(\mathbb{P}^r, \mathcal{O}_{\mathbb{P}^r}(m)) \otimes_k \Gamma(\mathbb{P}^r, \mathcal{O}_{\mathbb{P}^r}(n)) \longrightarrow \Gamma(E, \mathcal{L}^m) \otimes_k \Gamma(E, \mathcal{L}^n),$$
its dimension is
$$\binom{m + r}{r} \binom{n + r}{r} - mnd^2.$$
In particular, this space of relations will be of interest in the case of minimal bidegree $(m, n) = (2, 2)$ for addition laws, where it becomes:
$$\binom{d + 1}{2}^{2} - 4d^2 = \frac{d^2 (d - 3)(d + 5)}{4}.$$
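The bidegree count and its closed form for $(m, n) = (2, 2)$ can be checked numerically. The following Python sketch is an illustration with hypothetical function names; it simply evaluates the two expressions for the dimension and confirms that they agree for small $d$.

```python
from math import comb


def bidegree_relation_count(d: int, m: int, n: int) -> int:
    """Dimension of the space of bidegree-(m, n) relations on E x E, with r = d - 1."""
    r = d - 1
    return comb(m + r, r) * comb(n + r, r) - m * n * d * d


def closed_form_2_2(d: int) -> int:
    """Closed form d^2 (d - 3)(d + 5) / 4 for bidegree (2, 2)."""
    return d * d * (d - 3) * (d + 5) // 4


if __name__ == "__main__":
    for d in range(3, 7):
        print(d, bidegree_relation_count(d, 2, 2), closed_form_2_2(d))
```

The agreement of the two functions reflects the identity $\frac{d^2(d+1)^2}{4} - 4d^2 = \frac{d^2\left((d+1)^2 - 16\right)}{4}$.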
For $d = 3$, this dimension is zero since there are no degree-2 relations, but for $d = 4$, 5 or 6, the dimensions, 36, 125, and 297, respectively, are significant and provide a large search space in which to find sparse or efficiently computable forms in a coset.

A category of pairs. The formalization of the above concepts is provided by the introduction of a category of pairs $(X, \mathcal{L})$, consisting of a variety $X$ and very ample invertible sheaf $\mathcal{L}$. For more general varieties $X$, in order to maintain the correspondence between the spaces of sections $\Gamma(X, \mathcal{L}^n)$ and spaces of homogeneous functions of degree $n$ on $X$, the embedding determined by $\mathcal{L}$ should be projectively normal. The isomorphisms $\phi : (X_1, \mathcal{L}_1) \to (X_2, \mathcal{L}_2)$ in this category are isomorphisms $X_1 \to X_2$ for which $\phi^{*} \mathcal{L}_2 \cong \mathcal{L}_1$. These are the linear isomorphisms whose classification, for elliptic curves, is recalled in the next section. In general the space of tuples of defining polynomials of degree $n$ can be identified with $\operatorname{Hom}(\phi^{*} \mathcal{L}_2, \mathcal{L}_1^{n})$. The exact morphisms, for which $\phi^{*} \mathcal{L}_2 \cong \mathcal{L}_1^{n}$ for some $n$, are the subject of Section 4.

3. Linear classification of models

Hereafter we consider only projectively normal models. A linear change of variables gives a model with equivalent arithmetic, up to additions and multiplication by constants, thus it is natural to consider linear isomorphisms between models of elliptic curves. In this section we recall results from Kohel [13] classifying elliptic curve morphisms which are linear. This provides the basis for a generalization to exact morphisms in the next section.

Definition 3.1. Suppose that $E \subset \mathbb{P}^r$ is a projectively normal model of an elliptic curve. The point $T = P_0 + \cdots + P_r$, where $H \cap E = \{P_0, \dots, P_r\}$ for a hyperplane $H$ in $\mathbb{P}^r$, is an invariant of the embedding called the embedding class of the model. The divisor $r(O) + (T)$ is called the embedding divisor class.

We recall a classification of elliptic curve models up to projective linear equivalence (cf. Lemmas 2 and 3 of Kohel [13]).
DAVID KOHEL
Theorem 3.2. Let $E_1$ and $E_2$ be two projectively normal models of an elliptic curve $E$ in $\mathbb{P}^r$. There exists a linear transformation of $\mathbb{P}^r$ inducing an isomorphism of $E_1$ to $E_2$ if and only if $E_1$ and $E_2$ have the same embedding divisor class.

Remark. The theorem is false if the isomorphism in the category of elliptic curves is weakened to an isomorphism of curves. In particular, if $Q$ is a point of $E$ such that $[d](Q) = T_2 - T_1$, then the pullback of the embedding divisor class $r(O) + (T_2)$ by the translation morphism $\tau_Q$ is $r(O) + (T_1)$, and $\tau_Q$ is given by a projectively linear transformation (see Theorem 3.5 for this statement for $T_1 = T_2$).

Corollary 3.3. Two projectively normal models for an elliptic curve of the same degree have equivalent arithmetic up to additions and multiplication by fixed constants if they have the same embedding divisor class.

A natural condition is to assume that $[-1]$ is also linear on $E$ in its embedding, for which we recall the notion of a symmetric model (cf. Lemmas 2 and 4 of Kohel [13] for the equivalence of the following conditions).

Definition 3.4. A projectively normal elliptic curve model $\iota : E \to \mathbb{P}^r$ is symmetric if and only if any of the following is true:
(1) $[-1]$ is given by a projective linear transformation,
(2) $[-1]^*\mathcal{L} \cong \mathcal{L}$ where $\mathcal{L} = \iota^*\mathcal{O}_{\mathbb{P}^r}(1)$,
(3) $T \in E[2]$, where $T$ is the embedding class.

In view of the classification of the linear isomorphism class, this reduces the classification of projectively normal symmetric models of a given degree $d$ to the finite set of points $T$ in $E[2]$ (and more precisely, for models over $k$, to $T$ in $E[2](k)$).

To complete the analysis of models up to linear equivalence, we finally recall a classification of linear translation maps. Although the automorphism group of an elliptic curve is finite, and in particular $\mathrm{Aut}(E) = \{\pm 1\}$ if $j(E) \neq 0, 12^3$, there exist additional automorphisms as genus-one curves: each point $T$ induces a translation-by-$T$ morphism $\tau_T$. Those which act linearly on a given model have the following simple characterization (see Lemma 5 of Kohel [13]).

Theorem 3.5. Let $E$ be a projectively normal projective degree $d$ model of an elliptic curve. The translation-by-$T$ morphism $\tau_T$ acts linearly if and only if $T$ is in $E[d]$.

Remark. The statement is geometric, in the sense that it is true for all $T$ in $E(\bar{k})$, but if $T$ is not in $E(k)$ then the linear transformation is not $k$-rational.

4. Exact morphisms and isogenies

In order to minimize the number of arithmetic operations, it is important to control the degree of the defining polynomials for an isogeny. For an isomorphism, we gave conditions for the isomorphism to be linear. In general we want to characterize those morphisms of degree $n$ given by polynomials of degree $n$. A tuple $(f_0, \dots, f_r)$ of polynomials defining a morphism $\varphi : X \to Y$ as a rational map is defined to be complete if the exceptional set
$$\{P \in X(\bar{k}) \mid f_0(P) = \cdots = f_r(P) = 0\}$$
is empty. In this case a single tuple defines $\varphi$ as a morphism. The following theorem characterizes the existence and uniqueness of such a tuple for a morphism of curves.
THE GEOMETRY OF EFFICIENT ARITHMETIC ON ELLIPTIC CURVES
Theorem 4.1. Let $\varphi : C_1 \to C_2$ be a morphism of curves, embedded as projectively normal models by invertible sheaves $\mathcal{L}_1$ and $\mathcal{L}_2$, respectively. The morphism $\varphi$ is given by a complete tuple $s = (f_0, \dots, f_r)$ of defining polynomials of degree $n$ if and only if $\varphi^*\mathcal{L}_2 \cong \mathcal{L}_1^n$. If it exists, $s$ is unique in $k[C_1]_d$ up to a scalar multiple.

Proof. Under the hypotheses that the $C_i$ are projectively normal models, we identify the spaces of polynomials of degree $n$ with global sections of $\mathcal{L}_1^n$. A tuple of polynomials of degree $n$ defining $\varphi$ corresponds to an element of
$$\mathrm{Hom}(\varphi^*\mathcal{L}_2, \mathcal{L}_1^n) \cong \Gamma(C_1, \varphi^*\mathcal{L}_2^{-1} \otimes \mathcal{L}_1^n).$$
Being complete implies that $s$ is a generator for all such tuples of degree $n$ defining polynomials for $\varphi$, as a vector space over $k = \Gamma(C_1, \mathcal{O}_{C_1})$. Explicitly, let $(g_0, \dots, g_r)$ be another tuple, and set $c = g_0/f_0 = \cdots = g_r/f_r \in k(C_1)$. Since the $f_j$ have no common zero, $c$ has no poles and thus lies in $k$. Consequently $k = \Gamma(C_1, \varphi^*\mathcal{L}_2^{-1} \otimes \mathcal{L}_1^n)$, and hence $\varphi^*\mathcal{L}_2 \cong \mathcal{L}_1^n$.

Conversely, if the latter isomorphism holds, $\mathrm{Hom}(\varphi^*\mathcal{L}_2, \mathcal{L}_1^n) \cong k$, and a generator $s$ for $\mathrm{Hom}(\varphi^*\mathcal{L}_2, \mathcal{L}_1^n)$ is also a generator of the spaces of defining polynomials of all degrees:
$$\mathrm{Hom}(\varphi^*\mathcal{L}_2, \mathcal{L}_1^{n+m}) = ks \otimes_k \Gamma(C_1, \mathcal{L}_1^m).$$
Since $\varphi$ is a morphism, $s$ has no base point, hence is complete.
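The notion of exceptional set can be made concrete on the 2-isogeny that is taken up again in the Example after Corollary 4.4. The following Python sketch is illustrative only: the prime $p = 13$, the coefficients $a = 1$, $b = 3$ and all function names are assumptions chosen for the illustration, and only $\mathbb{F}_p$-rational points are enumerated (not all of $E(\bar{k})$). It evaluates the three degree-3 tuples from that Example, records where each one vanishes identically, and checks that non-vanishing tuples agree projectively and land on the codomain curve.

```python
def curve_points(p, a, b):
    """F_p-rational projective points of Y^2 Z = X (X^2 + a X Z + b Z^2)."""
    pts = [(0, 1, 0)]  # the only point with Z = 0
    for x in range(p):
        for y in range(p):
            if (y * y - x * (x * x + a * x + b)) % p == 0:
                pts.append((x, y, 1))
    return pts


def degree3_tuples(P, p, a, b):
    """Evaluate the three degree-3 tuples for the 2-isogeny at P."""
    X, Y, Z = P
    q = X * X + a * X * Z + b * Z * Z
    t1 = (Y * Y * Z, (X * X - b * Z * Z) * Y, X * X * Z)
    t2 = ((X + a * Z) * Y * Y,
          (Y * Y - 2 * b * X * Z - a * b * Z * Z) * Y,
          X * X * (X + a * Z))
    t3 = (q * Y, X * Y * Y - b * q * Z, X * Y * Z)
    return [tuple(c % p for c in t) for t in (t1, t2, t3)]


def on_codomain(Q, p, a, b):
    """Test Y^2 Z = X ((X - a Z)^2 - 4 b Z^2) modulo p."""
    X, Y, Z = Q
    return (Y * Y * Z - X * ((X - a * Z) ** 2 - 4 * b * Z * Z)) % p == 0


def proportional(Q, R, p):
    """Projective equality: all 2x2 minors vanish modulo p."""
    return all((Q[i] * R[j] - Q[j] * R[i]) % p == 0
               for i in range(3) for j in range(i + 1, 3))


def analyse(p, a, b):
    """Return the exceptional set of each tuple and a consistency flag."""
    exceptional = [set(), set(), set()]
    consistent = True
    for P in curve_points(p, a, b):
        images = []
        for i, Q in enumerate(degree3_tuples(P, p, a, b)):
            if Q == (0, 0, 0):
                exceptional[i].add(P)
            else:
                images.append(Q)
                consistent = consistent and on_codomain(Q, p, a, b)
        consistent = consistent and all(
            proportional(images[0], Q, p) for Q in images[1:])
    return exceptional, consistent


exceptional, consistent = analyse(13, 1, 3)
print(exceptional, consistent)
```

In this sample run each tuple has a nonempty exceptional set, so none of them is complete on its own, while no rational point is exceptional for all three at once.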
We say that a morphism between projectively normal models of curves is exact if it satisfies the condition $\varphi^*\mathcal{L}_2 \cong \mathcal{L}_1^n$ for some $n$. If $\varphi$ is exact, then $n$ is uniquely determined by
$$\deg(\varphi)\deg(\mathcal{L}_2) = \deg(\varphi^*\mathcal{L}_2) = \deg(\mathcal{L}_1^n) = n\deg(\mathcal{L}_1),$$
and, in particular, $n = \deg(\varphi)$ if $C_1$ and $C_2$ are models of the same degree.

Corollary 4.2. Let $E_1$ and $E_2$ be projectively normal models of elliptic curves of the same degree $d$ with embedding classes $T_1$ and $T_2$. An isogeny $\varphi : E_1 \to E_2$ of degree $n$ and kernel $G$ is exact if and only if
$$n(T_1 - S_1) = d \sum_{Q \in G} Q, \quad \text{where } S_1 \in \varphi^{-1}(T_2).$$
Proof. This statement expresses the sheaf isomorphism $\mathcal{L}_1^n \cong \varphi^*\mathcal{L}_2$ in terms of equivalence of divisors:
$$n((d-1)(O_1) + (T_1)) = nD_1 \sim \varphi^*D_2 = \varphi^*((d-1)(O_2) + (T_2)).$$
This equivalence holds if and only if the evaluations of the divisors on the curve are equal, from which the result follows.

Corollary 4.3. The multiplication-by-$n$ map on any symmetric projectively normal model is exact.

Proof. In the case of a symmetric model we take $E = E_1 = E_2$ in the previous corollary. The embedding class $T = T_1 = T_2$ is in $E[2]$, and $S \in [n]^{-1}(T)$ satisfies $nS = T$, so
$$\deg([n])(T - S) = n^2(T - S) = n(nT - T) = n(n-1)T = O.$$
On the other hand, the sum over the points of $E[n]$ is $O$, hence the result.
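The group-theoretic fact used at the end of this proof, and its failure for cyclic kernels of even order (which drives Corollary 4.4 below), can be checked in a toy model. The sketch below is only an illustration under stated assumptions: it replaces $E[n]$ by the abstract group $(\mathbb{Z}/n\mathbb{Z})^2$, which is its structure when the characteristic does not divide $n$, and a cyclic kernel $G$ by $\mathbb{Z}/n\mathbb{Z}$. The function names are hypothetical.

```python
def sum_full_torsion(n):
    """Sum of all elements of (Z/nZ)^2, a stand-in for E[n]."""
    sx = sum(a for a in range(n) for _ in range(n)) % n
    sy = sum(b for _ in range(n) for b in range(n)) % n
    return (sx, sy)


def sum_cyclic(n):
    """Sum of all elements of Z/nZ, a stand-in for a cyclic kernel of order n."""
    return sum(range(n)) % n


for n in range(2, 9):
    # The full torsion sum is always trivial; the cyclic sum is the
    # element n/2 of order 2 when n is even and trivial when n is odd.
    print(n, sum_full_torsion(n), sum_cyclic(n))
```

The printed rows show the trivial sum for the full $n$-torsion, and a nontrivial 2-torsion element for every cyclic group of even order.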
This contrasts with the curious fact that 2-isogenies are not well-suited to elliptic curves in Weierstrass form.

Corollary 4.4. There does not exist an exact cyclic isogeny of even degree $n$ between curves in Weierstrass form.

Proof. For a cyclic subgroup $G$ of even order, the sum over its points is a nontrivial 2-torsion point $Q$. For Weierstrass models we have $T_1 = O_1$ and $T_2 = O_2$, and may choose $S_1 = O_1$, so that (for $d = 3$) $n(T_1 - S_1) = O \neq 3Q = Q$, so we never have equality.

Example. Let $E : Y^2Z = X(X^2 + aXZ + bZ^2)$ be an elliptic curve with rational 2-torsion point $(0 : 0 : 1)$. The quotient by $G = \langle (0 : 0 : 1) \rangle$, to the curve $Y^2Z = X((X - aZ)^2 - 4bZ^2)$, is given by a 3-dimensional space of polynomial maps of degree 3:
$$(X : Y : Z) \longmapsto \begin{cases} (Y^2Z : (X^2 - bZ^2)Y : X^2Z) \\ ((X + aZ)Y^2 : (Y^2 - 2bXZ - abZ^2)Y : X^2(X + aZ)) \\ ((X^2 + aXZ + bZ^2)Y : XY^2 - b(X^2 + aXZ + bZ^2)Z : XYZ) \end{cases}$$
but not by any system of polynomials of degree 2.

Corollary 4.5. Let $\varphi : E_1 \to E_2$ be an isogeny of even degree $n$ of symmetric models of elliptic curves of the same even degree $d$, and let $T_1$ and $T_2$ be the respective embedding classes. Then $\varphi$ is exact if and only if $T_2 \in \varphi(E_1[n])$.

Proof. This is a consequence of Corollary 4.2. Since $n$ is even and $E_1$ symmetric, $nT_1 = O_1$, and since $d$ is even,
$$d \sum_{Q \in G} Q = O_1.$$
The conclusion follows since $nS_1 = \hat{\varphi}\varphi(S_1) = \hat{\varphi}(T_2)$, which equals $O_1$ if and only if $T_2$ is in $\varphi(E_1[n])$.

5. Other models for elliptic curves

Alternative models have been proposed for efficient arithmetic on elliptic curves. Since the classification of models up to isomorphism is more natural for projective embeddings, providing a reduction to linear algebra, we describe how to interpret other models in terms of a standard projective embedding.

Affine models. An affine plane model in $\mathbb{A}^2$ provides a convenient means of specifying (an open neighborhood of) an elliptic curve. A direct description of arithmetic in terms of the affine model requires inversions, interpolation of points, and special conventions for representations of points at infinity, which we seek to avoid. Affine models of degree 3 extend naturally to an embedding in the projective closure $\mathbb{P}^2$ of $\mathbb{A}^2$. When the degree of the model is greater than three, the standard projective closure is singular. However, in general there exists a well-defined divisor at infinity of degree $d$ $(= r + 1)$, which uniquely determines a Riemann–Roch space and associated embedding in $\mathbb{P}^r$, up to linear isomorphism.

Product space $\mathbb{P}^1 \times \mathbb{P}^1$. Elliptic curve models in $\mathbb{P}^1 \times \mathbb{P}^1$ arise naturally by equipping an elliptic curve $E$ with two independent maps to $\mathbb{P}^1$. The product projective space $\mathbb{P}^1 \times \mathbb{P}^1$ embeds via the Segre embedding as the hypersurface $X_0X_3 = X_1X_2$
in $\mathbb{P}^3$. This construction is particularly natural when the maps to $\mathbb{P}^1$ are given by inequivalent divisors $D_1$ and $D_2$ of degree two (such that the coordinate functions are identified with the Riemann–Roch basis), in which case the Segre embedding of $\mathbb{P}^1 \times \mathbb{P}^1$ in $\mathbb{P}^3$ induces an embedding by the Riemann–Roch space of $D = D_1 + D_2$. In order for both $D_1$ and $D_2$ to be symmetric (so that $[-1]$ stabilizes each of the projections to $\mathbb{P}^1$), each must be of the form $D_i = (O) + (T_i)$ for points $T_i \in E[2]$. Moreover, for the $D_i$ to be independent, $T_1 \neq T_2$, which implies that $D$ is not equivalent to $4(O)$.

Weighted projective spaces. Various embeddings of elliptic curves in weighted projective spaces appear in the computational and cryptographic literature for optimization of arithmetic on elliptic curves (particularly of isogenies). We detail a few of the standard models below, and their transformation to projective models of degree 3 or 4.

• $\mathbb{P}^2_{2,3,1}$. An elliptic curve in this weighted projective space is referred to as being in Jacobian coordinates [7], taking the Weierstrass form
$$Y(Y + a_1XZ + a_3Z^3) = X^3 + a_2X^2Z^2 + a_4XZ^4 + a_6Z^6.$$
The space encodes the order of the polar divisor of the functions $x$ and $y$ of a Weierstrass model. An elliptic curve in this coordinate system embeds as a Weierstrass model in the ordinary projective plane $\mathbb{P}^2$ by the map $(X : Y : Z) \mapsto (XZ : Y : Z^3)$, with birational inverse $(X : Y : Z) \mapsto (XZ : YZ^2 : Z)$ defined outside of $(0 : 1 : 0)$ (whose image is $(1 : 1 : 0)$). This weighted projective space gives interesting algorithmic efficiencies, since an isogeny can be expressed in the form
$$P \longmapsto (\varphi(P) : \omega(P) : \psi(P)) = \left(\frac{\varphi(P)}{\psi(P)^2} : \frac{\omega(P)}{\psi(P)^3} : 1\right).$$
Unfortunately, addition does not preserve the poles, so when mixing isogenies (e.g. doublings and triplings) with addition, one loses the advantages of the special form.

• $\mathbb{P}^2_{1,2,1}$. An elliptic curve in this weighted projective space is referred to as being in López–Dahab coordinates [7]. This provides an artifice for deflating a model in $\mathbb{P}^3$ to the surface $\mathbb{P}^2_{1,2,1}$. It embeds as the surface $X_0X_2 = X_1^2$ in $\mathbb{P}^3$ by $(X : Y : Z) \mapsto (X^2 : XZ : Z^2 : Y)$, with inverse
$$(X_0 : X_1 : X_2 : X_3) \longmapsto \begin{cases} (X_1 : X_2X_3 : X_2), \\ (X_0 : X_0X_3 : X_1). \end{cases}$$

• $\mathbb{P}^3_{1,2,1,2}$. An elliptic curve in this weighted projective space is commonly referred to as being in extended López–Dahab coordinates. Denoting the coordinates $(X : Y : Z : W)$, an elliptic curve is usually embedded in the surface $S : W = Z^2$ (variants have $W = XZ$ or extend further to include $XZ$ and $Z^2$). As above, the space $S$ is birationally equivalent to $\mathbb{P}^3$:
$$(X : Y : Z : W) \longmapsto (X^2 : XZ : Z^2 : Y) = (X^2 : XZ : W : Y).$$
For isogenies (e.g. doubling and tripling), by replacing a final squaring with an initial squaring, one can revert to $\mathbb{P}^2_{1,2,1}$.
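The passage from Jacobian coordinates to the plane Weierstrass model described in the first item can be checked numerically. The sketch below is a minimal illustration, not part of the original text: the prime, the coefficients and the function name are assumptions. For every affine point of a Weierstrass curve over $\mathbb{F}_p$ it runs through all weighted representatives $(\lambda^2 x : \lambda^3 y : \lambda)$, checks the weighted equation, and checks that $(X : Y : Z) \mapsto (XZ : Y : Z^3)$ returns the projective point $(x : y : 1)$ on the plane model.

```python
def jacobian_to_plane_ok(p, coeffs):
    """Check the map (X : Y : Z) -> (XZ : Y : Z^3) over F_p."""
    a1, a2, a3, a4, a6 = coeffs
    for x in range(p):
        for y in range(p):
            w = y * y + a1 * x * y + a3 * y - (x**3 + a2 * x * x + a4 * x + a6)
            if w % p:
                continue  # (x, y) is not on the affine Weierstrass curve
            for lam in range(1, p):
                # weighted representative of the same point, weights (2, 3, 1)
                X, Y, Z = lam * lam * x % p, lam**3 * y % p, lam
                lhs = Y * (Y + a1 * X * Z + a3 * Z**3)
                rhs = X**3 + a2 * X * X * Z * Z + a4 * X * Z**4 + a6 * Z**6
                if (lhs - rhs) % p:
                    return False  # weighted equation fails
                U, V, W = X * Z % p, Y, Z**3 % p
                plane = (V * V * W + a1 * U * V * W + a3 * V * W * W
                         - (U**3 + a2 * U * U * W + a4 * U * W * W + a6 * W**3))
                if plane % p:
                    return False  # image is not on the plane model
                if (U - x * W) % p or (V - y * W) % p:
                    return False  # image is not (x : y : 1)
    return True


print(jacobian_to_plane_ok(11, (1, 0, 1, 2, 3)))
```

Both equations are homogeneous for the respective gradings, which is why the check passes for every scaling $\lambda$ and does not depend on the curve being nonsingular.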
6. Efficient arithmetic

We first recall some notions of complexity, which we use to describe the cost of evaluating the arithmetic on elliptic curves. The notation M and S denote the cost of a field multiplication and squaring, respectively. For a finite field of $q$ elements, typical algorithms for multiplication take time $c_M \log(q)^\omega$ for some $1 + \varepsilon \le \omega \le 2$, with a possibly better constant for squaring (or in characteristic 2 where squarings can reduce to the class $O(\log(q))$). The upper bound of 2 arises by a naive implementation, while a standard Karatsuba algorithm gives $\omega = \log_2(3)$, and fast Fourier transform gives an asymptotic complexity of $1 + \varepsilon$. We ignore additions, which lie in the class $O(\log(q))$, and distinguish multiplication by a constant (of fixed small size or sparse), using the notation m for its complexity.

The principal focus for efficient arithmetic is the operation of scalar multiplication by $k$. Using a windowing computation, we write $k = \sum_{i=0}^{t} a_i n^i$ in base $n = \ell^k$ (the window), and precompute $[a_i](P)$ for $a_i$ in a set of coset representatives for $\mathbb{Z}/n\mathbb{Z}$. A sliding window lets us restrict representatives for $a_i$ to $(\mathbb{Z}/n\mathbb{Z})^*$. We may then compute $[k](P)$, using at most $t$ additions for $kt$ scalings by $[\ell]$.

In order to break down the problem further, we suppose the existence of an isogeny decomposition $[\ell] = \hat{\varphi}\varphi$, for which we need a rational cyclic subgroup $G \subset E[n]$ (where in practice $n = \ell = 3$ or $n = \ell^2 = 4$; the window may be a higher power of $\ell$). For this purpose we study families of elliptic curves with $G$-level structure. In view of the analysis of torsion action and degrees of defining polynomials, we give preference to degree-$d$ models where $n$ divides $d$, and $G$ will be either $\mathbb{Z}/n\mathbb{Z}$ or $\mu_n$ as a group scheme.

We now describe the strategy for efficient isogeny computation. Given $E_1$ and $E_2$ in $\mathbb{P}^r$ with isogeny $\varphi : E_1 \to E_2$ given by defining polynomials $(f_0, \dots, f_r)$ of degree $n = \deg(\varphi)$, we set
$$V_0 = \Gamma(\mathbb{P}^r, I_E(n)) = \ker\big(\Gamma(\mathbb{P}^r, \mathcal{O}(n)) \to \Gamma(E_1, \mathcal{L}^n)\big),$$
and successively construct a flag
$$V_0 \subset V_1 \subset \cdots \subset V_d = V_0 + \langle f_0, \dots, f_r \rangle$$
such that each space $V_{i+1}$ is constructed by adjoining to $V_i$ a new form $g_i$ in $V_d \setminus V_i$, whose evaluation minimizes the number of M and S. Subsequently the forms $f_0, \dots, f_r$ can be expressed in terms of the generators $g_0, \dots, g_r$ with complexity $O(\mathrm{m})$.

In Sections 6.1 and 6.3 we analyze the arithmetic of tripling and doubling, on a family of degree 3 with a rational 3-torsion point and a family of degree 4 with rational 2-torsion point, respectively, such that the translation maps are linear. Let $G$ be the subgroup generated by this point. Using the $G$-module structure, and an associated norm map, we construct explicit generators $g_i$ for the flag decompositions. In Sections 6.2 and 6.4 we compare the resulting algorithms of Sections 6.1 and 6.3 to previous work.

6.1. Arithmetic on cubic models. For optimization of arithmetic on a cubic family we consider a universal curve with $\mu_3$ level structure, the twisted Hessian normal form:
$$H : aX^3 + Y^3 + Z^3 = XYZ, \quad O = (0 : 1 : -1),$$
obtained by descent of the Hessian model $X^3 + Y^3 + Z^3 = cXYZ$ to $a = c^3$, by coordinate scaling (see [4]). Addition on this model is reasonably efficient at a cost
of 12M. In order to optimize the tripling morphism $[3]$, we consider the quotient by $\mu_3 = \langle (0 : \omega : -1) \rangle$. By means of the isogeny $(X : Y : Z) \mapsto (aX^3 : Y^3 : Z^3)$, with kernel $\mu_3$, we obtain the quotient elliptic curve
$$E : XYZ = a(X + Y + Z)^3, \quad O = (0 : 1 : -1).$$
This yields an isogeny $\varphi$ of cubic models by construction, at a cost of three cubings: 3M + 3S.

In the previous construction, by using the $\mu_n$ structure, with respect to which the coordinate functions are diagonalized, we were able to construct the quotient isogeny $(X_0 : \cdots : X_r) \mapsto (X_0^n : \cdots : X_r^n)$ without much effort. It remains to construct the dual. In the case of the twisted Hessian, the dual isogeny $\psi = \hat{\varphi}$ is given by $(X : Y : Z) \mapsto (f_0 : f_1 : f_2)$, where
$$f_0 = X^3 + Y^3 + Z^3 - 3XYZ, \quad f_1 = X^2Y + Y^2Z + XZ^2 - 3XYZ, \quad f_2 = XY^2 + YZ^2 + X^2Z - 3XYZ,$$
as we can compute by pushing $[3]$ through $\varphi$. The quotient curve $E : XYZ = a(X + Y + Z)^3$ admits a $\mathbb{Z}/3\mathbb{Z}$-level structure, acting by cyclic coordinate permutation. The isogeny $\psi : E \to H$ is the quotient by this group $G = \ker(\psi)$, hence must be defined by polynomials in
$$\Gamma(E, \mathcal{L}_E^3)^G = \langle X^3 + Y^3 + Z^3,\ X^2Y + Y^2Z + XZ^2,\ XY^2 + YZ^2 + X^2Z \rangle$$
modulo the relation $XYZ = a(X + Y + Z)^3$. We note, however, that the map $\psi^* : \Gamma(H, \mathcal{L}_H) \to \Gamma(E, \mathcal{L}_E^3)^G$ must be surjective since both have dimension 3.

Using the group action, we construct the norm map $N_G : \Gamma(E, \mathcal{L}_E) \to \Gamma(E, \mathcal{L}_E^3)^G$, by
$$N_G(f) = f(X, Y, Z)\, f(Y, Z, X)\, f(Z, X, Y).$$
It is nonlinear but sufficient to provide a set of generators using 2M each, and by fixing a generator of the fixed subspace of $G$, we construct a distinguished generator $g_0$ requiring 1M + 1S for cubing. For the first norm we set
$$g_0 = N_G(X + Y + Z) = (X + Y + Z)^3,$$
noting that $N_G(X) = N_G(Y) = N_G(Z) = XYZ = ag_0$. We complete a basis with forms $g_1$ and $g_2$ given by
$$g_1 = N_G(Y + Z) = (Y + Z)(X + Z)(X + Y), \quad g_2 = N_G(Y - Z) = (Y - Z)(Z - X)(X - Y),$$
then solve for the linear transformation to the basis $\{f_0, f_1, f_2\}$:
$$f_0 = (1 - 3a)g_0 - 3g_1, \quad f_1 = -4ag_0 + (g_1 - g_2)/2, \quad f_2 = -4ag_0 + (g_1 + g_2)/2.$$
This gives an algorithm for $\psi$ using 5M + 1S, for a total tripling complexity of 8M + 4S using the decomposition $[3] = \psi \circ \varphi$. Attributing 1m for the multiplications by $a$, ignoring additions implicit in the small integers (after scaling by 2), this gives 8M + 4S + 2m.
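The change of basis from the norm forms $g_i$ to the defining forms $f_i$ of $\psi$ can be confirmed numerically. The following Python sketch is an illustration under assumptions (the prime $p = 101$, the sampling scheme and the function name are not from the text): it samples $(X, Y, Z)$ over $\mathbb{F}_p$ with $X + Y + Z \neq 0$, defines $a$ so that the point satisfies $XYZ = a(X + Y + Z)^3$, and tests the three identities modulo $p$. It requires Python 3.8 or later for modular inverses via `pow`.

```python
import random


def check_dual_isogeny_basis(p, trials=200, seed=1):
    """Test f0, f1, f2 against their expressions in g0, g1, g2 modulo p."""
    rng = random.Random(seed)
    inv2 = pow(2, -1, p)  # p is assumed to be an odd prime
    for _ in range(trials):
        X, Y, Z = (rng.randrange(p) for _ in range(3))
        s = (X + Y + Z) % p
        if s == 0:
            continue  # a would be undefined for this sample
        g0 = pow(s, 3, p)
        # choose a so that the point lies on E : XYZ = a (X + Y + Z)^3
        a = X * Y * Z * pow(g0, -1, p) % p
        g1 = (Y + Z) * (X + Z) * (X + Y) % p
        g2 = (Y - Z) * (Z - X) * (X - Y) % p
        f0 = (X**3 + Y**3 + Z**3 - 3 * X * Y * Z) % p
        f1 = (X * X * Y + Y * Y * Z + X * Z * Z - 3 * X * Y * Z) % p
        f2 = (X * Y * Y + Y * Z * Z + X * X * Z - 3 * X * Y * Z) % p
        if f0 != ((1 - 3 * a) * g0 - 3 * g1) % p:
            return False
        if f1 != (-4 * a * g0 + (g1 - g2) * inv2) % p:
            return False
        if f2 != (-4 * a * g0 + (g1 + g2) * inv2) % p:
            return False
    return True


print(check_dual_isogeny_basis(101))
```

The identities rest on $(X + Y + Z)^3 = X^3 + Y^3 + Z^3 + 3(X + Y)(Y + Z)(Z + X)$ together with the curve relation, so each $f_i$ costs only multiplications by constants once $g_0$, $g_1$, $g_2$ are known.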
6.2. Comparison with previous work. A naive analysis, and the previously best known algorithm for tripling, required 8M + 6S + 1m. To compare with scalar multiplication using doubling and a binary chain, one scales by $\log_3(2)$ to account for the reduced length of the addition chain. For comparison, the best known doubling algorithms on ordinary projective models (in characteristic other than 2) are:

• Extended Edwards models in $\mathbb{P}^3$, using 4M + 4S (Hisil et al. [11])
• Singular Edwards models in $\mathbb{P}^2$, using 3M + 4S (Bernstein et al. [2])
• Jacobi quartic models in $\mathbb{P}^3$, using 2M + 5S (Hisil et al. [11])

We note that the Jacobi quartic models are embeddings in $\mathbb{P}^3$ of the affine curve $y^2 = x^4 + 2ax^2 + 1$ extended to a projective curve in the $(1, 2, 1)$-weighted projective plane. The embedding $(x, y) \mapsto (x^2 : y : 1 : x) = (X_0 : X_1 : X_2 : X_3)$ gives
$$X_1^2 = X_0^2 + 2aX_0X_2 + X_2^2, \quad X_0X_2 = X_3^2,$$
in ordinary projective space $\mathbb{P}^3$. There also exist models in weighted projective space with complexity 2M + 5S on the 2-isogeny oriented curves [8] with improvements of Bernstein and Lange [1], and a tripling algorithm with complexity of 6M + 6S for 3-isogeny oriented curves [8]. Each of these comes with a significantly higher cost for addition (see [8], the EFD [3], and the table below for more details).

The relative comparison of complexities of $[\ell]$ and addition $\oplus$ on twisted Hessians ($[\ell] = [3]$) and on twisted Edwards models and Jacobi quartics ($[\ell] = [2]$) yields the following:

[ℓ]                      1S = 1.00M   1S = 0.80M   1S = 0.66M   ⊕
4M + 4S                  8.00M        7.20M        6.66M        9M
(8M + 4S) log_3(2)       7.57M        7.07M        6.73M        12M log_3(2) = 7.57M
(6M + 6S) log_3(2)       7.57M        6.81M        6.28M        (11M + 6S) log_3(2)
3M + 4S                  7.00M        6.20M        5.66M        10M + 1S
2M + 5S                  7.00M        6.00M        5.33M        7M + 3S
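The entries of the table follow from a one-line cost model. The sketch below is an illustration with assumed names (`ROWS`, `cost`): it converts an operation count $a\mathrm{M} + b\mathrm{S}$ into multiplications for a given ratio S/M and applies the factor $\log_3(2)$ to the tripling rows. The third column uses the ratio $2/3$, so its last digit may differ slightly from the printed values, which appear to be computed with 0.66 or truncated.

```python
from math import log

LOG3_2 = log(2, 3)  # scaling of a tripling chain relative to a binary chain

# (label, number of M, number of S, scaling factor)
ROWS = [
    ("4M + 4S", 4, 4, 1.0),
    ("(8M + 4S) log_3(2)", 8, 4, LOG3_2),
    ("(6M + 6S) log_3(2)", 6, 6, LOG3_2),
    ("3M + 4S", 3, 4, 1.0),
    ("2M + 5S", 2, 5, 1.0),
]


def cost(m, s, scale, s_over_m):
    """Cost in field multiplications, with one squaring costing s_over_m M."""
    return scale * (m + s * s_over_m)


for label, m, s, scale in ROWS:
    values = [round(cost(m, s, scale, r), 2) for r in (1.0, 0.8, 2 / 3)]
    print(f"{label:22s}", values)
```

Changing the ratio tuple in the loop gives the comparison for other squaring costs, for instance on hardware where squarings are no cheaper than multiplications.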
This analysis brings tripling on a standard projective model, coupled with an efficient addition algorithm, in line with doubling (on optimal models for each). In the section which follows we improve the 2M + 5S result for doubling.

6.3. Arithmetic on level-2 quartic models. The arithmetic of quartic models provides the greatest advantages in terms of the existence of exact 2-isogeny decompositions and symmetric action of 4-torsion subgroups. A study of standard models with a level-4 structure, which provide the best complexity for addition to complement doubling complexities, will be detailed elsewhere. The best doubling algorithms, however, are obtained for embedding divisor class $4(O)$, as in the Jacobi quartic, rather than $3(O) + (T)$, for a 2-torsion point $T$, as is the case for the Edwards model (see [9] and [1]) or its twists, the $\mathbb{Z}/4\mathbb{Z}$-normal form or the $\mu_4$-normal form in characteristic 0.

In what follows we seek the best possible complexity for doubling in a family of elliptic curves. In order to exploit an isogeny decomposition for doubling and linear action of torsion, we construct a universal family with 2-torsion point and embed the family in $\mathbb{P}^3$ by the divisor $4(O)$. We note that any of the recent profusion of models with rational 2-torsion point can be transformed to this model, hence the complexity results obtained apply to any such family.
A universal level-2 curve. Over a field of characteristic different from 2, a general Weierstrass model with 2-torsion point has the form $y^2 = x^3 + a_1x^2 + b_1x$. The quotient by the subgroup $\langle (0, 0) \rangle$ of order 2 gives $y^2 = x^3 + a_2x^2 + b_2x$, where $a_2 = -2a_1$ and $b_2 = a_1^2 - 4b_1$, by formulas of Vélu [18]. In order to have a family with good reduction at 2, we may express $(a_1, b_1)$ by the change of variables $a_1 = 4u + 1$ and $b_1 = -16v$, after which $y^2 = x^3 + a_1x^2 + b_1x$ is isomorphic to the curve
$$E_1 : y^2 + xy = x^3 + ux^2 - vx$$
with isogenous curve
$$E_2 : y^2 + xy = x^3 + ux^2 + 4vx + (4u + 1)v.$$
A quartic model in $\mathbb{P}^3$ for each of these curves is given by the embedding
$$(x, y) \longmapsto (X_0, X_1, X_2, X_3) = (x^2, x, 1, y),$$
with respect to the embedding divisor $4(O)$. This gives the quartic curve in $\mathbb{P}^3$ given by
$$Q_1 : X_3^2 + X_1X_3 = (X_0 + uX_1 - vX_2)X_1, \quad X_1^2 = X_0X_2,$$
with isogenous curve
$$Q_2 : X_3^2 + X_1X_3 = (X_0 + 4vX_2)(X_1 + uX_2) + vX_2^2, \quad X_1^2 = X_0X_2,$$
each having $(1 : 0 : 0 : 0)$ as identity. The Weierstrass model has discriminant $\Delta = v^2((4u + 1)^2 - 64v)$, hence the $E_i$ and $Q_i$ are elliptic curves provided that $\Delta$ is nonzero. Translating the Vélu 2-isogeny through to these models we find the following expressions for the isogeny decomposition of doubling.

Lemma 6.1. The 2-isogeny $\psi : Q_1 \to Q_2$ with kernel $\langle (0 : 0 : 1 : 0) \rangle$ sends $(X_0, X_1, X_2, X_3)$ to
$$\big((X_0 - vX_2)^2,\ (X_0 - vX_2)X_1,\ X_1^2,\ vX_1X_2 + (X_0 + vX_2)X_3\big)$$
and the dual isogeny $\varphi$ sends $(X_0, X_1, X_2, X_3)$ to
$$\big((X_0 + 4vX_2)^2,\ (X_1 + 2X_3)^2,\ (4X_1 + (4u + 1)X_2)^2,\ f_3\big),$$
where $f_3 = uX_1^2 - 8vX_1X_2 - (4u + 1)vX_2^2 + 2X_0X_3 + 4uX_1X_3 - 8vX_2X_3 - X_3^2$.

Efficient isogeny evaluation. For each of the tuples $(f_0, f_1, f_2, f_3)$, we next determine quadratic forms $g_0, g_1, g_2, g_3$, each a square or product, spanning the same space and such that the basis transformation involves only coefficients which are polynomials in the parameters $u$ and $v$. In order to determine a projective isomorphism, it is necessary and sufficient that the determinant of the transformation be invertible, but it is not necessary to compute its inverse. As previously noted, the evaluation of equality among quadratic polynomials on the domain curve $Q_i$ is in $k[Q_i]$, i.e. modulo the 2-dimensional space of relations for $Q_i$.

Lemma 6.2. If $k$ is a field of characteristic different from 2, the quadratic defining polynomials for $\psi$ are spanned by the following forms:
$$(g_0, g_1, g_2, g_3) = \big((X_0 - vX_2)^2,\ (X_0 - vX_2 + X_1)^2,\ X_1^2,\ (X_0 + X_1 + vX_2 + 2X_3)^2\big).$$

Proof. By scaling the defining polynomials $(f_0, f_1, f_2, f_3)$ by 4, the projective transformation from $(g_0, g_1, g_2, g_3)$ is given by $(4f_0, 4f_2) = (4g_0, 4g_2)$,
$$4f_1 = -2(g_0 - g_1 + g_2) \quad \text{and} \quad 4f_3 = 2g_0 - 3g_1 + 2(1 - 2(u + v))g_2 + g_3.$$
Since the transformation has determinant 32, it defines an isomorphism over any field of characteristic different from 2.
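The relations in this proof hold modulo the equations of $Q_1$, which makes them easy to test on points. The sketch below is a hedged illustration: the prime $p = 103$, the parameters $u = 2$, $v = 5$ and the function name are assumptions. It enumerates the affine points of $E_1$ over $\mathbb{F}_p$, embeds them in $Q_1$ (with two different projective scalings), checks the expressions of $4f_1$ and $4f_3$ in terms of the $g_i$, and checks that the image under $\psi$ satisfies both equations of $Q_2$.

```python
def check_psi(p, u, v):
    """Test Lemma 6.2 and psi(Q1) inside Q2 on F_p-points, p an odd prime."""
    for x in range(p):
        for y in range(p):
            if (y * y + x * y - (x**3 + u * x * x - v * x)) % p:
                continue  # (x, y) is not on E1
            for lam in (1, 2):  # two projective representatives
                X0, X1, X2, X3 = lam * x * x, lam * x, lam, lam * y
                f0 = (X0 - v * X2) ** 2
                f1 = (X0 - v * X2) * X1
                f2 = X1 * X1
                f3 = v * X1 * X2 + (X0 + v * X2) * X3
                g0 = (X0 - v * X2) ** 2
                g1 = (X0 - v * X2 + X1) ** 2
                g2 = X1 * X1
                g3 = (X0 + X1 + v * X2 + 2 * X3) ** 2
                # basis change of Lemma 6.2 (valid on the curve Q1)
                if (4 * f1 + 2 * (g0 - g1 + g2)) % p:
                    return False
                rhs = 2 * g0 - 3 * g1 + 2 * (1 - 2 * (u + v)) * g2 + g3
                if (4 * f3 - rhs) % p:
                    return False
                # the image (f0, f1, f2, f3) must satisfy the equations of Q2
                q2a = f3 * f3 + f1 * f3 - ((f0 + 4 * v * f2) * (f1 + u * f2) + v * f2 * f2)
                q2b = f1 * f1 - f0 * f2
                if q2a % p or q2b % p:
                    return False
    return True


print(check_psi(103, 2, 5))
```

Since every $g_i$ is a square of a linear form, the four values cost 4S, and the passage to $(f_0, \dots, f_3)$ uses only additions and multiplications by constants.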
Lemma 6.3. If $k$ is a field of characteristic 2, the quadratic defining polynomials for $\psi$ are spanned by the following forms:
$$(g_0, g_1, g_2, g_3) = \big((X_0 + vX_2)^2,\ (X_1 + X_3)X_3,\ X_1^2,\ (X_0 + v(X_1 + X_3))(X_2 + X_3)\big).$$

Proof. The transformation from $(g_0, g_1, g_2, g_3)$ to the tuple $(f_0, f_1, f_2, f_3)$ of defining polynomials is given by $(g_0, g_2) = (f_0, f_2)$, $(g_1, g_3) = (f_1 + uf_2, vf_1 + f_2 + f_3)$. The transformation has determinant 1 hence is an isomorphism.

Corollary 6.4. The isogeny $\psi$ can be evaluated with 4S in characteristic different from 2 and 2M + 2S in characteristic 2.

Lemma 6.5. Over any field $k$, the quadratic defining polynomials for $\varphi$ are spanned by the square forms $(g_0, g_1, g_2, g_3)$:
$$\big((X_0 + 4vX_2)^2,\ (X_1 + 2X_3)^2,\ (4X_1 + (4u + 1)X_2)^2,\ (X_0 + (2u + 1)X_1 - 4vX_2 + X_3)^2\big).$$

Proof. The forms $(g_0, g_1, g_2)$ equal $(f_0, f_1, f_2)$, and it suffices to verify the equality $f_3 = -g_0 - (u + 1)g_1 + vg_2 + g_3$, a transformation of determinant 1.

Lemma 6.6. If $k$ is a field of characteristic 2, the quadratic defining polynomials for $\varphi$ are spanned by $(X_0^2, X_1^2, X_2^2, X_3^2)$.

Proof. It is verified by inspection that the isogeny $\varphi$ is defined by a linear combination of the squares of $(X_0, X_1, X_2, X_3)$, or by specializing the previous lemma to characteristic 2.

Corollary 6.7. The isogeny $\varphi$ can be evaluated with 4S over any field.

Corollary 6.8. Doubling on $Q_1$ or $Q_2$ can be carried out with 8S over a field of characteristic different from 2, or 2M + 6S over a field of characteristic 2.

Factorization through singular quotients. With the given strategy of computing the isogenies of projectively normal models $\psi : Q_1 \to Q_2$ then $\varphi : Q_2 \to Q_1$, this result is optimal or nearly so: to span the spaces of forms of dimension 4, in each direction, one needs at least four operations. We thus focus on replacing $Q_1$ by a singular quartic curve $D_1$ in $\mathbb{P}^2$ such that the morphisms induced by the isogenies between $Q_2$ and $Q_1$ remain well-defined but for which we can save one operation in the construction of the coordinate functions of the singular curves. We treat characteristic different from 2 and the derivation of a doubling algorithm improving on 2M + 5S; an analogous construction in characteristic 2 appears in Kohel [14].

Let $T = (0 : 0 : 1 : 0)$ be the 2-torsion point on $Q_1$, which acts by translation as:
$$\tau_T(X_0 : X_1 : X_2 : X_3) = (vX_2 : -X_1 : v^{-1}X_0 : X_1 + X_3).$$
Similarly, the inverse morphism is:
$$[-1](X_0 : X_1 : X_2 : X_3) = (X_0 : X_1 : X_2 : -(X_1 + X_3)).$$
Over a field of characteristic different from 2, the morphism from $Q_1$ to $\mathbb{P}^2$
$$(X_0 : X_1 : X_2 : X_3) \longmapsto (X : Y : Z) = (X_0 : X_1 + 2X_3 : X_2)$$
has image curve:
$$D_1 : (Y^2 - (4u + 1)XZ)^2 = 16XZ(X - vZ)^2,$$
on which $\tau_T$ and $[-1]$ induce linear transformations, since the subspace generated by $\{X_0, X_1 + 2X_3, X_2\}$ is stabilized by pullbacks of both $[-1]$ and $\tau_T$. The singular subscheme of $D_1$ is $X = vZ$, $Y^2 = (4u + 1)vZ^2$, which has no rational points if $(4u + 1)v$ is not a square, and in this case, the projection to $D_1$ induces an isomorphism of the set of nonsingular points. Since $\tau_T$ acts linearly, the morphism $\psi : Q_1 \to Q_2$ maps through $D_1$, as given by the next lemma.

Lemma 6.9. The 2-isogeny $\psi : Q_1 \to Q_2$ induces a morphism $D_1 \to Q_2$ sending $(X, Y, Z)$ to
$$\big(8(X - vZ)^2,\ 2(Y^2 - (4u + 1)XZ),\ 8XZ,\ 4Y(X + vZ) - (Y^2 - (4u + 1)XZ)\big).$$
These defining polynomials are spanned by
$$(g_0, g_1, g_2, g_3) = \big((X - vZ)^2,\ Y^2,\ (X + vZ)^2,\ (X + Y + vZ)^2\big).$$
In particular the morphism can be evaluated with 4S.

Lemma 6.10. The 2-isogeny $\varphi : Q_2 \to Q_1$ induces a morphism $\varphi : Q_2 \to D_1$ sending $(X_0, X_1, X_2, X_3)$ to
$$\big((X_0 + 4vX_2)^2,\ (X_1 + 2X_3)(2X_0 + (4u + 1)X_1 - 8vX_2),\ (4X_1 + (4u + 1)X_2)^2\big),$$
which can be evaluated with 1M + 2S. If $4u + 1 = -(2s + 1)^2$, then the forms
$$\big((X_0 + 4vX_2)^2,\ (X_0 - 4vX_2 - (2s + 1)(sX_1 - X_3))^2,\ (4X_1 + (4u + 1)X_2)^2\big)$$
span the defining polynomials for $\varphi : Q_2 \to D_1$, and can be evaluated with 3S.

Proof. The form of the defining polynomials $(f_0, f_1, f_2)$ for the map $\varphi : Q_2 \to D_1$ follows from composing the 2-isogeny $\varphi : Q_2 \to Q_1$ with the projection to $D_1$. The latter statement holds since the square forms $(g_0, g_1, g_2)$ of Lemma 6.10 satisfy $f_0 = g_0$, $f_2 = g_2$, and $(2s + 1)f_1 = -2(g_0 - g_1 - vg_2)$.

Composing the morphism $Q_2 \to D_1$ with $D_1 \to Q_2$ gives the following complexity result.

Theorem 6.11. The doubling map on $Q_2$ over a field of characteristic different from 2 can be evaluated with 1M + 6S, and if $4u + 1 = -(2s + 1)^2$, with 7S.

Remark. The condition $a_1 = 4u + 1 = -(2s + 1)^2$ is equivalent to the condition $u = -(s^2 + s + 1/2)$. This implies that the curves in the family are isomorphic to one of the form $y^2 = x^3 - x^2 + b_1x$, where $b_1 = -16v/(4u + 1)^2$, fixing the quadratic twist but not changing the level structure. In light of this normalization in the subfamily, we may as well fix $s = 0$ and $4u + 1 = -1$ to achieve a simplification of the formulas in terms of the constants.

6.4. Comparison with previous doubling algorithms. We recall that the previously best known algorithms for doubling require 2M + 5S, obtained for Jacobi quartic models in $\mathbb{P}^3$ (see Hisil et al. [11]) or specialized models in weighted projective space [8]. We compare this base complexity to the above complexities which apply to any elliptic curve with a rational 2-torsion point. We include the naive 8S algorithm of Corollary 6.8, and improvements of Theorem 6.11 to 1M + 6S generically, and 7S for an optimal choice of twist. The relative costs of the various
doubling algorithms, summarized below, show that the proposed doubling algorithms determined here give a non-negligible improvement on previous algorithms.

[2]          1S = 1.00M   1S = 0.80M   1S = 0.66M
8S           8.0M         6.40M        5.33M
2M + 5S      7.0M         6.00M        5.33M
1M + 6S      7.0M         5.80M        5.00M
7S           7.0M         5.60M        4.66M

The improvements for doubling require only a 2-torsion point, but imposing additional 2-torsion or 4-torsion structure would allow us to carry this doubling algorithm over to a normal form with symmetries admitting more efficient addition laws.
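The saving from 8S to 1M + 6S rests on the projection of $Q_1$ to the singular quartic $D_1$ of Section 6.3, on which $[-1]$ and $\tau_T$ act linearly. The Python sketch below is an illustrative check only: the prime, the parameters $u$, $v$ and the function name are assumptions, and the induced plane maps $(X : Y : Z) \mapsto (X : -Y : Z)$ and $(X : Y : Z) \mapsto (vZ : Y : v^{-1}X)$ are derived here from the formulas for $[-1]$ and $\tau_T$ rather than quoted from the text. It verifies on $\mathbb{F}_p$-points that the projection lands on $D_1$ and is compatible with both maps. It requires Python 3.8 or later.

```python
def check_projection(p, u, v):
    """Check Q1 -> D1, (X0:X1:X2:X3) -> (X0 : X1 + 2 X3 : X2), on F_p-points."""
    def on_D1(X, Y, Z):
        lhs = (Y * Y - (4 * u + 1) * X * Z) ** 2
        return (lhs - 16 * X * Z * (X - v * Z) ** 2) % p == 0

    vinv = pow(v, -1, p)  # v is assumed nonzero modulo the odd prime p
    for x in range(p):
        for y in range(p):
            if (y * y + x * y - (x**3 + u * x * x - v * x)) % p:
                continue  # (x, y) is not on E1
            X0, X1, X2, X3 = x * x, x, 1, y
            X, Y, Z = X0, X1 + 2 * X3, X2
            if not on_D1(X, Y, Z):
                return False
            # [-1] on Q1 projects to (X : -Y : Z)
            N = (X0, X1, X2, -(X1 + X3))
            if ((N[1] + 2 * N[3]) + Y) % p or not on_D1(X, -Y, Z):
                return False
            # tau_T on Q1 projects to (v Z : Y : v^{-1} X)
            T = (v * X2, -X1, vinv * X0, X1 + X3)
            image = (T[0] % p, (T[1] + 2 * T[3]) % p, T[2] % p)
            if image != (v * Z % p, Y % p, vinv * X % p):
                return False
            if not on_D1(*image):
                return False
    return True


print(check_projection(103, 2, 5))
```

Because both maps are linear on the three plane coordinates, the morphisms of Lemmas 6.9 and 6.10 can be written with three coordinate functions on $D_1$ instead of four on $Q_1$, which is where one squaring is saved.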
References

[1] Daniel J. Bernstein and Tanja Lange, Faster addition and doubling on elliptic curves, Advances in cryptology—ASIACRYPT 2007, Lecture Notes in Comput. Sci., vol. 4833, Springer, Berlin, 2007, pp. 29–50, DOI 10.1007/978-3-540-76900-2_3. MR2565722 (2011d:11125)
[2] Daniel J. Bernstein, Peter Birkner, Marc Joye, Tanja Lange, and Christiane Peters, Twisted Edwards curves, Progress in cryptology—AFRICACRYPT 2008, Lecture Notes in Comput. Sci., vol. 5023, Springer, Berlin, 2008, pp. 389–405, DOI 10.1007/978-3-540-68164-9_26. MR2482341 (2010e:11057)
[3] D. J. Bernstein, T. Lange, Explicit formulas database. http://www.hyperelliptic.org/EFD/
[4] Progress in cryptology—AFRICACRYPT 2010, Lecture Notes in Computer Science, vol. 6055, Springer, Berlin, 2010. Edited by Daniel J. Bernstein and Tanja Lange. MR2905894 (2012j:94004)
[5] Daniel J. Bernstein and Tanja Lange, A complete set of addition laws for incomplete Edwards curves, J. Number Theory 131 (2011), no. 5, 858–872, DOI 10.1016/j.jnt.2010.06.015. MR2772476 (2012d:11127)
[6] Christina Birkenhake and Herbert Lange, Complex abelian varieties, 2nd ed., Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 302, Springer-Verlag, Berlin, 2004. MR2062673 (2005c:14001)
[7] Handbook of elliptic and hyperelliptic curve cryptography, Discrete Mathematics and its Applications (Boca Raton), Chapman & Hall/CRC, Boca Raton, FL, 2006. Edited by Henri Cohen, Gerhard Frey, Roberto Avanzi, Christophe Doche, Tanja Lange, Kim Nguyen and Frederik Vercauteren. MR2162716 (2007f:14020)
[8] Christophe Doche, Thomas Icart, and David R. Kohel, Efficient scalar multiplication by isogeny decompositions, Public key cryptography—PKC 2006, Lecture Notes in Comput. Sci., vol. 3958, Springer, Berlin, 2006, pp. 191–206, DOI 10.1007/11745853_13. MR2423190 (2009m:94045)
[9] Harold M. Edwards, A normal form for elliptic curves, Bull. Amer. Math. Soc. (N.S.) 44 (2007), no. 3, 393–422 (electronic), DOI 10.1090/S0273-0979-07-01153-6. MR2318157 (2008b:14052)
[10] Robin Hartshorne, Algebraic geometry, Springer-Verlag, New York-Heidelberg, 1977. Graduate Texts in Mathematics, No. 52. MR0463157 (57 #3116)
[11] Huseyin Hisil, Kenneth Koon-Ho Wong, Gary Carter, and Ed Dawson, Twisted Edwards curves revisited, Advances in cryptology—ASIACRYPT 2008, Lecture Notes in Comput. Sci., vol. 5350, Springer, Berlin, 2008, pp. 326–343, DOI 10.1007/978-3-540-89255-7_20. MR2546103
[12] D. Kohel et al., Echidna Algorithms (Version 4.0), http://echidna.maths.usyd.edu.au/ kohel/alg/index.html, 2013.
[13] David Kohel, Addition law structure of elliptic curves, J. Number Theory 131 (2011), no. 5, 894–919, DOI 10.1016/j.jnt.2010.12.001. MR2772478 (2012c:14072)
[14] David Kohel, Efficient arithmetic on elliptic curves in characteristic 2, Progress in cryptology—INDOCRYPT 2012, Lecture Notes in Comput. Sci., vol. 7668, Springer, Heidelberg, 2012, pp. 378–398, DOI 10.1007/978-3-642-34931-7_22. MR3064782
[15] H. Lange and W. Ruppert, Complete systems of addition laws on abelian varieties, Invent. Math. 79 (1985), no. 3, 603–610, DOI 10.1007/BF01388526. MR782238 (86f:14029)
[16] Magma Computational Algebra System (Version 2.19), http://magma.maths.usyd.edu.au/magma/handbook/, 2013.
[17] W. A. Stein et al., Sage Mathematics Software (Version 5.12), The Sage Development Team, http://www.sagemath.org, 2013.
[18] Jacques Vélu, Isogénies entre courbes elliptiques (French), C. R. Acad. Sci. Paris Sér. A-B 273 (1971), A238–A241. MR0294345 (45 #3414)

Aix Marseille Université, CNRS, Centrale Marseille, I2M, UMR 7373, 13453 Marseille, France
Current address: Institut de Mathématiques de Marseille (I2M), 163 avenue de Luminy, Case 907, 13288 Marseille, cedex 9, France
E-mail address: [email protected]
Contemporary Mathematics Volume 637, 2015 http://dx.doi.org/10.1090/conm/637/12752
2–2–2 isogenies between Jacobians of hyperelliptic curves
Ivan Boyer
Abstract. We show that there are essentially four irreducible families of hyperelliptic curves of genus 3 whose jacobians admit a 2−2−2 isogeny to the jacobian of another hyperelliptic curve of genus 3. We give explicit irreducible polynomials defining these families. The first part of this article deals mainly with theta functions, and more precisely with Thomae's formula and the duplication formula. The second part fully describes the curves of two of the four families, and gives a correspondence between them that commutes with the hyperelliptic involutions.
Contents
Introduction
Part 1. 2 · · · 2 isogenies and theta functions
1. Isogenies
2. Formulae
3. Classification of kernels
4. Computation of the four families
Part 2. Correspondences between family (f-2 ) and family (f-3 )
5. Trigonal maps and trigonal construction
6. The curve C
7. A correspondence preserving hyperelliptic involutions
8. Numerical examples
References
Notation. We consider curves of genus $g$ and abelian varieties of dimension $g$. A 2 · · · 2 isogeny is an isogeny whose kernel is isomorphic to $(\mathbb{Z}/2\mathbb{Z})^g$. We mainly study the case $g = 3$, where we write 2−2−2 isogeny. Moreover, every isogeny between jacobians is assumed to be compatible with the canonical principal polarizations on the jacobians.
2010 Mathematics Subject Classification. Primary 11G10; Secondary 14K25, 14K02.
This work is part of the author's Ph.D. thesis, under the supervision of Jean-François Mestre. The author is very grateful to the organizers of AGCT-14 and deeply thanks the referee for his useful comments and suggestions.
© 2015 American Mathematical Society
Let $H$ be a hyperelliptic curve of genus 3, and $y^2 = (x - x_1) \cdots (x - x_8)$ a Weierstrass model of $H$. In the following, we call a trigonal map any map $\mathbb{P}^1 \to \mathbb{P}^1$ of degree 3 which identifies the Weierstrass points of $H$ in pairs, i.e. the $x_i$ have the same image in pairs. These maps are used by Recillas in [9] to generalize Humbert's techniques using bigonal maps.

Introduction

In the first four sections of this article, we find all the pairs of genus 3 hyperelliptic curves for which there exists a 2−2−2 isogeny between their jacobians. These pairs are given explicitly by Theorem 4.1 and by four irreducible polynomials, denoted by (f-1 ), (f-2 ), (f-3 ) and (f-4 ). For the first pair, both curves are in the locus of the same irreducible polynomial (f-1 ), up to a change of variables; this is the genus 3 specialization of the result obtained by Mestre in [5]. The second pair is defined by two irreducible polynomials (f-2 ) and (f-3 ): we have a situation of duality, where the first curve is in the locus of one irreducible polynomial while the second curve is in the locus of the other, and vice versa. Finally, the third pair behaves like the first one and is defined by a fourth irreducible polynomial, (f-4 ). An example of such a pair is given by Smith in [11].
In the last four sections, we study the duality which appears between the second and third families. We give geometric criteria characterizing the two curves of this second pair.
Theorem. A hyperelliptic curve belongs to the family (f-2 ) if and only if it has exactly one map $\mathbb{P}^1 \to \mathbb{P}^1$ of degree 3 identifying its Weierstrass points in pairs.
We can also characterize the curves of the first pair, defined by (f-1 ): these are the hyperelliptic curves for which there are no trigonal maps.
Theorem. A hyperelliptic curve belongs to the family (f-3 ) if and only if its 8 Weierstrass points can be split into two subsets of 4 points with a common cross-ratio or, equivalently, with the same j-invariant.
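The criterion for the family (f-3 ) can be tested directly on a list of Weierstrass points. The following is a minimal sketch, not the author's method: it assumes the 8 points are finite and rational (a Weierstrass point at infinity is not handled), it normalizes the j-invariant of a quadruple with cross-ratio $\lambda$ as $256(\lambda^2 - \lambda + 1)^3 / (\lambda^2 (\lambda - 1)^2)$, and the function names `cross_ratio`, `j_invariant` and `equal_j_splittings` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four distinct finite rational points."""
    return Fraction((a - c) * (b - d), (a - d) * (b - c))


def j_invariant(a, b, c, d):
    """j-invariant of the unordered quadruple {a, b, c, d}.

    The value does not depend on the ordering of the four points,
    unlike the cross-ratio itself.
    """
    lam = cross_ratio(a, b, c, d)
    return 256 * (lam**2 - lam + 1) ** 3 / (lam**2 * (lam - 1) ** 2)


def equal_j_splittings(points):
    """Return all splittings of 8 distinct points into two quadruples
    having the same j-invariant. Each splitting is listed once."""
    points = list(points)
    first, rest = points[0], points[1:]
    found = []
    # Fixing points[0] in the left quadruple avoids counting a splitting twice.
    for triple in combinations(rest, 3):
        left = (first,) + triple
        right = tuple(p for p in rest if p not in triple)
        if j_invariant(*left) == j_invariant(*right):
            found.append((left, right))
    return found


# Illustrative data (an assumption, not taken from the article): the second
# quadruple is the image of the first under the Moebius map z -> 1 / (z + 4),
# so both quadruples have the same cross-ratio.
pts = [0, 1, 2, 3] + [Fraction(1, z + 4) for z in (0, 1, 2, 3)]
splits = equal_j_splittings(pts)
for left, right in splits:
    print(left, right)
```

Exact rational arithmetic is used so that the equality test on j-invariants is meaningful; with floating-point coordinates one would need a tolerance instead.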
We use the trigonal construction to explicitly compute a correspondence between the two curves of a pair, in a slightly different manner than Smith in [10]. This correspondence does not commute with the hyperelliptic involutions, but we have the following theorem.
Theorem. There exists a correspondence between the two curves of a pair defined by the families (f-2 ) and (f-3 ) that commutes with the hyperelliptic involution on each curve.

Part 1. 2 · · · 2 isogenies and theta functions

1. Isogenies

We want to find pairs of hyperelliptic curves whose jacobians are 2 · · · 2 isogenous. Such an isogeny must be a factor of multiplication by two, as shown in the following commutative diagram for $g$-dimensional abelian varieties over $\mathbb{C}$. We use the action of the symplectic group $\mathrm{Sp}_{2g}(\mathbb{Z})$, following Mumford in [7]:
\[
\begin{array}{ccc}
\mathbb{C}^g/(\mathbb{Z}^g + \Omega\mathbb{Z}^g) & \xrightarrow{\ \varphi\ } & \mathbb{C}^g/(\mathbb{Z}^g + 2\Omega'\mathbb{Z}^g) \\
{\scriptstyle \alpha_\Gamma}\big\downarrow{\scriptstyle \sim} & \nearrow{\scriptstyle \beta} & \big\downarrow{\scriptstyle z \mapsto z} \\
\mathbb{C}^g/(\mathbb{Z}^g + \Omega'\mathbb{Z}^g) & \xrightarrow{\ [2]\ } & \mathbb{C}^g/(\mathbb{Z}^g + \Omega'\mathbb{Z}^g)
\end{array}
\tag{1}
\]
where $\Omega' = (A\Omega + B)(C\Omega + D)^{-1}$ and the isogenies $\beta$ and $\alpha_\Gamma$ are defined by
\[
\beta : \mathbb{C}^g/(\mathbb{Z}^g + \Omega'\mathbb{Z}^g) \longrightarrow \mathbb{C}^g/(\mathbb{Z}^g + 2\Omega'\mathbb{Z}^g), \quad z \longmapsto 2z ;
\qquad
\alpha_\Gamma : \mathbb{C}^g/(\mathbb{Z}^g + \Omega\mathbb{Z}^g) \longrightarrow \mathbb{C}^g/(\mathbb{Z}^g + \Omega'\mathbb{Z}^g), \quad z \longmapsto {}^t(C\Omega + D)^{-1} z .
\]
Then $\varphi$ is a 2 · · · 2 isogeny between $\mathbb{C}^g/(\mathbb{Z}^g + \Omega\mathbb{Z}^g)$ and $\mathbb{C}^g/(\mathbb{Z}^g + 2\Omega'\mathbb{Z}^g)$.
Now that we have the 2 · · · 2 isogenies, our problem is to find the conditions under which the abelian varieties are jacobians of hyperelliptic curves. The main tool we use is theta functions: with Thomae's formula, we can determine the theta constants of the first abelian variety, which we set to be the jacobian of a hyperelliptic curve. Then, with the duplication formula and the functional equation of theta, we can compute the theta constants of the second abelian variety and find the conditions under which it is the jacobian of a hyperelliptic curve.
In genus 3, the condition for an abelian variety to be the jacobian of a hyperelliptic curve is that exactly one of the 36 even theta constants vanishes (cf. [8]). We compute the product of all even theta constants, which is easier to handle. Indeed, this product cannot have more than one vanishing factor in general. Otherwise, the isogenous variety would always have more than one vanishing even theta constant and so would never be the jacobian of a hyperelliptic curve. But the equations in the second part show that there are such jacobians.

2. Formulae

In this section, the genus is $g$. We want to obtain formulae to classify abelian varieties defined over $\mathbb{C}$. We first compute these relations over the base field $\mathbb{R}$ but, as they are polynomial, the results extend to $\mathbb{C}$. So we begin by considering a hyperelliptic curve with $2g + 2$ real Weierstrass points satisfying, without loss of generality, the inequalities $x_1 > x_2 > \cdots > x_{2g+2} > 0$. Following the notation of [7], we set $U = \{x_1, x_3, \ldots, x_{2g+1}\}$ and we can write Thomae's formula as ⎧ (xi − xj ) (xi − xj ) if |SΔU | = g + 1 , ⎪ ⎨c 4 i,j∈SΔU i,j ∈ SΔU ϑ[ηS ](0, Ω) = i