Ohio State University Mathematical Research Institute Publications 10 Editors: Gregory R. Baker, Walter D. Neumann, Karl Rubin
Ohio State University Mathematical Research Institute Publications
1 Topology ’90, B. Apanasov, W. D. Neumann, A. W. Reid, L. Siebenmann (Eds.)
2 The Arithmetic of Function Fields, D. Goss, D. R. Hayes, M. I. Rosen (Eds.)
3 Geometric Group Theory, R. Charney, M. Davis, M. Shapiro (Eds.)
4 Groups, Difference Sets, and the Monster, K. T. Arasu, J. F. Dillon, K. Harada, S. Sehgal, R. Solomon (Eds.)
5 Convergence in Ergodic Theory and Probability, V. Bergelson, P. March, J. Rosenblatt (Eds.)
6 Representation Theory of Finite Groups, R. Solomon (Ed.)
7 The Monster and Lie Algebras, J. Ferrar, K. Harada (Eds.)
8 Groups and Computation III, W. M. Kantor, Á. Seress (Eds.)
9 Complex Analysis and Geometry, J. D. McNeal (Ed.)
Codes and Designs
Proceedings of a conference honoring Professor Dijen K. Ray-Chaudhuri on the occasion of his 65th birthday
The Ohio State University, May 18–21, 2000
Editors
K. T. Arasu
Á. Seress
Walter de Gruyter · Berlin · New York 2002
Editors
K. T. Arasu, Department of Mathematics and Statistics, Wright State University, Dayton, OH 45435, USA
Ákos Seress, Department of Mathematics, The Ohio State University, Columbus, OH 43210-1174, USA

Series Editors
Gregory R. Baker, Department of Mathematics, The Ohio State University, Columbus, OH 43210-1174, USA
Karl Rubin, Department of Mathematics, Stanford University, Stanford, CA 94305-2125, USA
Walter D. Neumann, Department of Mathematics, Columbia University, New York, NY 10027, USA

Mathematics Subject Classification 2000: 05-06; 94-06

Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.
Library of Congress Cataloging-in-Publication Data
Codes and designs : proceedings of a conference honoring Professor Dijen K. Ray-Chaudhuri on the occasion of his 65th birthday, The Ohio State University, May 18–21, 2000 / editors, K. T. Arasu, Á. Seress.
p. cm. – (Ohio State University Mathematical Research Institute publications ; vol. 10)
ISBN 3 11 017396 4 (acid-free-paper)
I. Arasu, K. T., 1954– II. Seress, Ákos, 1958– III. Ohio State University Mathematical Research Institute publications ; 10.
2002023667
Die Deutsche Bibliothek – Cataloging-in-Publication Data
Codes and designs : proceedings of a Conference Honoring Professor Dijen K. Ray-Chaudhuri on the Occasion of His 65th Birthday, the Ohio State University, May 18–21, 2000 / ed. K. T. Arasu ; Á. Seress. – Berlin ; New York : de Gruyter, 2002
(Ohio State University Mathematical Research Institute publications ; 10)
ISBN 3-11-017396-4

© Copyright 2002 by Walter de Gruyter GmbH & Co. KG, 10785 Berlin. All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Printed in Germany.
Cover design: Thomas Bonnie, Hamburg. Typeset using the authors’ TeX files: I. Zimmermann, Freiburg. Printing and binding: Hubert & Co. GmbH & Co. KG, Göttingen.
Preface
Following an initiative of the late Professor Hans Zassenhaus in 1965, the Departments of Mathematics at The Ohio State University and Denison University have been holding conferences in Combinatorics, Group Theory, and Ring Theory. Initially, these meetings were held annually, and later biennially; in the year 2000, the 25th meeting of this series was conducted. These conferences have primarily attracted mathematicians from institutions in Ohio and nearby states, but there have been many participants from other parts of the country, as well as from abroad. There are usually twenty to thirty invited 20-minute talks given in each of the three main areas. However, at the last conference, held during May 18–21, 2000 on the Ohio State main campus in Columbus, there was a special addition to the Combinatorics program in tribute to the 65th birthday of Dijen Ray-Chaudhuri. The Dijen 65 part of the conference consisted of fourteen 40-minute lectures by either former students of Dijen or other mathematicians with strong personal and professional ties with him. The topics ranged from Coding Theory, Design Theory, Geometry and Optimization to Graph Theory, reflecting the wide range of areas to which Professor Ray-Chaudhuri has made substantial contributions during his exemplary career. The banquet to celebrate his 65th birthday included remarks made by Professors R. M. Wilson, Jeff Kahn, Thomas Dowling, and Dr. John F. Dillon. The highlight of this party was the presentation of the Euler medal to Professor Ray-Chaudhuri for his life-long achievements and contributions to Combinatorics. This medal was presented to him by Professor Ralph Stanton on behalf of the Institute of Combinatorics and its Applications. We are indebted to Professor Thomas Dowling for his invaluable help with the organization of the combinatorics part of the conference, and to the referees of this volume for their conscientious work.
We are very grateful for the generous support of the Mathematical Research Institute of The Ohio State University and the National Security Agency.

K. T. Arasu
Ákos Seress
Table of contents
Preface

Ákos Seress
Highlights of Dijen Ray-Chaudhuri’s research

K. T. Arasu, Henk D. L. Hollmann, Kevin Player, and Qing Xiang
On the p-ranks of GMW difference sets

Sejeong Bang and Sung-Yell Song
Characterization of maximal rational circulant association schemes

Michel Deza
Face-regular polyhedra and tilings with two combinatorial types of faces

J. F. Dillon
Geometry, codes and difference sets: exceptional connections

Jeffrey H. Dinitz and Douglas R. Stinson
A singular direct product for bicolorable Steiner triple systems

Dominic Elvira and Yutaka Hiramine
On semi-regular relative difference sets in non-abelian p-groups

Nick C. Fiala
Every λ-design on 6p + 1 points is type-1

Christian Fremuth-Paeger and Dieter Jungnickel
An introduction to balanced network flows

Derek W. Hein and Yury J. Ionin
On the λ-design conjecture for v = 5p + 1 points

Hadi Kharaghani and Vladimir D. Tonchev
On a class of twin balanced incomplete block designs

Jon-Lark Kim and Vera Pless
Decoding some doubly-even self-dual [32, 16, 8] codes by hand

Donald L. Kreher and Rolf S. Rees
On the maximum size of a hole in an incomplete t-wise balanced design with specified minimum block size

Warwick de Launey
On a family of cocyclic Hadamard matrices

Akihiro Munemasa
A mass formula for Type II codes over finite fields of characteristic two

Erin J. Schram
A posteriori probability decoding through the discrete Fourier transform and the dual code

Mohan S. Shrikhande
Subdesigns of symmetric designs

Irfan Siap
Linear codes over F2 + uF2 and their complete weight enumerators

N. J. A. Sloane
On single-deletion-correcting codes

Zhe-Xian Wan
Critical problems in finite vector spaces

Richard M. Wilson
Existence of Steiner systems that admit automorphisms with large cycles

Andrew J. Woldar
Rainbow graphs
Dijen Ray-Chaudhuri
Highlights of Dijen Ray-Chaudhuri’s research Ákos Seress
Dijen Ray-Chaudhuri, in over 80 published papers, books, and monographs, has worked on a broad range of problems in combinatorics that arose in the theory of error-correcting codes, graph theory, design theory, difference sets, geometry, information retrieval, and combinatorial optimization. His first major contribution appeared in his Ph.D. thesis [1], where he constructed the 2-error-correcting version of the codes which later became known as BCH codes. The name BCH stands for Bose and Chaudhuri, since Dijen constructed the d-error-correcting version of these codes with his advisor R. Bose [2], and for Hocquenghem, who independently discovered the same codes. BCH codes are the first major application of algebra in coding theory, and are considered of fundamental importance in the subject. Books in the area (for example, Algebraic Theory of Coding by Berlekamp, and Theory of Error-Correcting Codes by MacWilliams and Sloane) devote at least a chapter to BCH codes. Another fundamental result is Dijen’s joint work with R. Wilson [18], on a theory of recursive construction of designs. These constructions led to the solution of a century-old problem on the existence of Steiner systems, known also as the Kirkman School Girl Problem. Later, Dijen extended the scope of these investigations. He proved (with N. Singhi) [47] the λ-large existence theorem for designs in projective spaces and affine spaces, and (with E. Schram) [54] he constructed designs and large sets of designs in vector spaces, using the theory of quadratic forms. Popular scientific articles about BCH codes and the Kirkman School Girl Problem appeared in the Scientific American and in the Encyclopedia Britannica. Besides these fundamental results, Dijen’s work opened up new areas of research in other branches of combinatorics.
His early paper [4] on minimally redundant systems of Boolean functions has a significant follow-up in the Russian electrical engineering community, while another early paper [9] on the connection of association schemes with finite projective spaces and designs is the basis of research on association schemes in China. Another highlight is Dijen’s work with A. Sprague [33] and E. Brickell [42], on the characterization of graphs and association schemes arising from the intersection properties of flats of finite projective spaces, affine spaces, and attenuated spaces. These results are deep, and they attracted the attention of finite geometers. In [39], Dijen with R. Roth developed a theory of nonassociative commutative Moufang loops of exponent 3 and nilpotence class 2, arising from Hall triple systems. This theory was used to construct new Hall triple systems, which are also perfect matroid designs. The seminal paper [41], with S. B. Rao and N. M. Singhi, develops a structure theory for imprimitive association schemes. The paper [57], with N. M. Singhi and G. R. Vijayakumar, is a continuation of Dijen’s interest in the spectral characterization of line graphs [12], and uses root lattices and root systems for the classification of signed graphs with least eigenvalue at least −2. Earlier work with A. J. Hoffman [10], [11] gives a spectral characterization of line graphs of symmetric designs and affine planes. Dijen also contributed to extremal set theory. His most important results are an algorithm for the computation of the covering number of a hypergraph [6], a bound for the size of set systems with pairwise intersections of prescribed sizes (with R. Wilson) [24], and the generalization of this result to polynomial semilattices (with T. Zhu and J. Qian) [58], [69]. Dijen gave over a hundred invited lectures at various institutions. The most important ones are a 45-minute address at the International Congress of Mathematicians in 1970, and an hour-long invited talk at the Combined Winter Meeting of the AMS and MAA in 1973. He received a Senior US Scientist Award of the Alexander von Humboldt Foundation, the Distinguished Senior Research Award of The Ohio State University, and the Euler Medal of the Institute of Combinatorics and its Applications [76]. Last, but not least, we have to mention Dijen’s enormous contribution to the development of young researchers. So far, he has been the advisor of 31 Ph.D. students. In the order of graduation, they are R. M. Wilson, B. T. Datta, A. P. Sprague, K. S. Vijayan, A. H. Chan, K. Chang, D. Nemzer, J. LeFever, H.-P. Ko, J. Kahn, R. Roth, R. Games, E. Brickell, A. Moon, K. T. Arasu, Á. Seress, D. Miklós, E. J. Schram, J. J. Kim, L. Narayani, H.-M. Shaw, T. Zhu, X. Wu, Q. Xiang, H. Mohácsy, K. Liu, T. Blackford, J. Qian, I. Siap, G. Yeh, and A. Nabavi.
Among these, there are two Pólya Prize winners, an Associate Director of the Rényi Institute of the Hungarian Academy of Sciences, over ten professors at universities all over the world, and several who hold leadership positions in industry.
References
[1] D. K. Ray-Chaudhuri, On the application of the geometry of quadrics to the construction of partially balanced incomplete block designs and error-correcting codes, The Institute of Statistics, Univ. North Carolina, Chapel Hill, NC Mimeo Series 230, 1959.
[2] R. C. Bose and D. K. Ray-Chaudhuri, On a class of binary error-correcting group codes, Inform. and Control 3 (1960), 68–79.
[3] R. C. Bose and D. K. Ray-Chaudhuri, Further results on error correcting binary group codes, Inform. and Control 3 (1960), 279–290.
[4] D. K. Ray-Chaudhuri, On the construction of minimally redundant reliable system designs, Bell Systems Technical Journal 40 (1961), 595–611.
[5] D. K. Ray-Chaudhuri, Some results on quadrics in finite projective geometry based on Galois fields, Canad. J. Math. 14 (1962), 129–138.
[6] D. K. Ray-Chaudhuri, An algorithm for the minimum cover of an abstract complex, Canad. J. Math. 15 (1963), 11–24.
[7] D. K. Ray-Chaudhuri, Application of the geometry of quadrics for constructing PBIB designs, Ann. Math. Statist. 33 (1962), 1175–1186.
[8] D. K. Ray-Chaudhuri, On some connections between balanced incomplete block designs and minimum covers, in: Coll. Internat. du Centre National de la Recherche Sci. 110, Le Plan d’Expériences, Paris, August 29–September 6, 1961, 129–136.
[9] D. K. Ray-Chaudhuri, Some configurations in finite projective spaces and partially balanced incomplete block designs, Canad. J. Math. 17 (1965), 114–123.
[10] A. J. Hoffman and D. K. Ray-Chaudhuri, On the line graph of a finite affine plane, Canad. J. Math. 17 (1965), 687–694.
[11] A. J. Hoffman and D. K. Ray-Chaudhuri, On the line graph of a symmetric balanced incomplete block design, Trans. Amer. Math. Soc. 116 (1965), 238–252.
[12] D. K. Ray-Chaudhuri, Characterization of line graphs, J. Combin. Theory 3 (1967), 201–214.
[13] G. C. Chow and D. K. Ray-Chaudhuri, An alternative proof of Hannan’s theorem on canonical correlation and multiple equation systems, Econometrica 35 (1967), 139–142.
[14] D. K. Ray-Chaudhuri, Combinatorial information retrieval systems for files, SIAM J. Appl. Math. 16 (1968), 973–992.
[15] C. T. Abraham, S. P. Ghosh and D. K. Ray-Chaudhuri, File organization schemes based on finite geometries, Inform. and Control 12 (1968), 143–163.
[16] D. K. Ray-Chaudhuri, On some connections between graph theory and experimental designs and some recent existence results, in: Graph Theory Appl. (Proc. Adv. Sem., Math. Research Center, Univ. of Wisconsin, Madison, WI, 1969), Academic Press, New York 1970, 149–166.
[17] D. K. Ray-Chaudhuri and R. M. Wilson, On the existence of resolvable balanced incomplete block designs, in: Combinatorial Structures and their Applications (Proc. Calgary Internat. Conf., Calgary, AB, 1969), Gordon and Breach, New York 1970, 331–341.
[18] D. K. Ray-Chaudhuri and R. M. Wilson, Solution of Kirkman’s schoolgirl problem, in: Combinatorics (Univ. California, Los Angeles, CA, 1968), Proc. Sympos. Pure Math. XIX, Amer. Math. Soc., Providence, R.I., 1971, 187–203.
[19] H. Hanani, D. K. Ray-Chaudhuri and R. M. Wilson, On resolvable designs, Discrete Math. 3 (1972), 343–357.
[20] D. K. Ray-Chaudhuri, Recent developments on combinatorial designs, in: Actes du Congrès International des Mathématiciens (Nice, 1970), Tome 3, Gauthier-Villars, Paris 1971, 223–227.
[21] D. K. Ray-Chaudhuri and R. M. Wilson, The existence of resolvable block designs, in: Survey of combinatorial theory (Proc. Internat. Sympos., Colorado State Univ., Fort Collins, CO, 1971), North-Holland, Amsterdam 1973, 361–375.
[22] C. Berge and D. K. Ray-Chaudhuri (editors), Hypergraph Seminar, Proceedings of the First Working Seminar on Hypergraphs, The Ohio State University, August 16 – September 9, 1972. Lecture Notes in Math. 411, Springer-Verlag, Berlin, New York 1974.
[23] H. B. Mann and D. K. Ray-Chaudhuri, Lectures on error correcting codes, The University of Arizona Department of Mathematics Lecture Note Series, University of Arizona, Tucson, AZ, 1974.
[24] D. K. Ray-Chaudhuri and R. M. Wilson, On t-designs, Osaka J. Math. 12 (1975), 737–744.
[25] D. K. Ray-Chaudhuri, Uniqueness of association schemes, in: Colloquio Internazionale sulle Teorie Combinatorie (Rome, 1973), Tomo II, Atti dei Convegni Lincei 17, Accad. Naz. Lincei, Rome 1976, 465–479.
[26] D. K. Ray-Chaudhuri and A. P. Sprague, Characterization of projective incidence structures, Geom. Dedicata 5 (1976), 361–376.
[27] D. K. Ray-Chaudhuri and N. M. Singhi, A characterization of the line-hyperplane design of a projective space and some extremal theorems for matroid designs, in: Number theory and algebra, Academic Press, New York 1977, 289–301.
[28] D. K. Ray-Chaudhuri, Combinatorial characterization theorems for geometric incidence structures, in: Combinatorial surveys (Proc. Sixth British Combinatorial Conf., Royal Holloway Coll., Egham, 1977), Academic Press, London 1977, 87–116.
[29] D. K. Ray-Chaudhuri, Some characterization theorems for graphs and incidence structures, in: Combinatorics (Proc. Fifth Hungarian Colloq., Keszthely, 1976), Vol. II, Colloq. Math. Soc. János Bolyai 18, North-Holland, Amsterdam, New York 1978, 821–842.
[30] D. K. Ray-Chaudhuri (editor), Relations between combinatorics and other parts of mathematics, The Ohio State University, March 20–23, 1978, Proc. Sympos. Pure Math. XXXIV, Amer. Math. Soc., Providence, RI, 1979.
[31] A. H. Chan and D. K. Ray-Chaudhuri, Characterization of “linegraph of an affine space”, J. Combin. Theory Ser. A 26 (1979), 48–64.
[32] A. H. Chan and D. K. Ray-Chaudhuri, Embedding of a pseudoresidual design into a Möbius plane, J. Combin. Theory Ser. A 32 (1982), 73–98.
[33] D. K. Ray-Chaudhuri and A. P. Sprague, A combinatorial characterization of attenuated spaces, Utilitas Math. 15 (1979), 3–29.
[34] H.-P. Ko and D. K. Ray-Chaudhuri, Group divisible difference sets and families from s-flats of finite geometries, Proceedings of the Tenth Southeastern Conference on Combinatorics, Graph Theory and Computing, Florida Atlantic Univ., 1979, Congr. Numer. 24 (1979), 601–627.
[35] D. K. Ray-Chaudhuri, Affine triple systems, in: Combinatorics and graph theory (Calcutta, 1980), Lecture Notes in Math. 885, Springer-Verlag, Berlin, New York 1981, 60–69.
[36] D. K. Ray-Chaudhuri and S. S. Rappaport, Sampled multiserver queues with general arrivals and deterministic service time, Proc. IEE-E 127 (1980), 88–92.
[37] H.-P. Ko and D. K. Ray-Chaudhuri, Multiplier theorems, J. Combin. Theory Ser. A 30 (1981), 134–157.
[38] H.-P. Ko and D. K. Ray-Chaudhuri, Intersection theorems for group divisible difference sets, Discrete Math. 39 (1982), 37–58.
[39] R. Roth and D. K. Ray-Chaudhuri, Hall triple systems and commutative Moufang exponent 3 loops: the case of nilpotence class 2, J. Combin. Theory Ser. A 36 (1984), 129–162.
[40] D. K. Ray-Chaudhuri, Group divisible difference sets, in: Enumeration and design (Waterloo, Ont., 1982), Academic Press, Toronto, ON, 1984, 271–283.
[41] S. B. Rao, D. K. Ray-Chaudhuri and N. M. Singhi, On imprimitive association-schemes, in: Combinatorics and applications (Calcutta, 1982), Indian Statist. Inst., Calcutta 1984, 273–291.
[42] E. Brickell and D. K. Ray-Chaudhuri, Characterization of incidence structures of intervals of affine geometries, Mitt. Math. Sem. Giessen 166 (1984), 17–34.
[43] K. T. Arasu and D. K. Ray-Chaudhuri, Divisible quotient lists and their multipliers, in: Combinatorics, Graph theory and Computing (Proceedings of the sixteenth Southeastern international conference on combinatorics, graph theory and computing, Boca Raton, FL, 1985), Congr. Numer. 49 (1985), 321–338.
[44] K. T. Arasu and D. K. Ray-Chaudhuri, Multiplier theorem for a difference list, Ars Combin. 22 (1986), 119–137.
[45] D. K. Ray-Chaudhuri and N. M. Singhi, On existence of t-designs with large v and λ, SIAM J. Discrete Math. 1 (1988), 98–104.
[46] D. K. Ray-Chaudhuri and N. M. Singhi, On existence and number of orthogonal arrays, J. Combin. Theory Ser. A 47 (1988), 28–36; Corrigendum: J. Combin. Theory Ser. A 66 (1994), 327–328.
[47] D. K. Ray-Chaudhuri and N. M. Singhi, q-analogues of t-designs and their existence, Linear Algebra Appl. 114/115 (1989), 57–68.
[48] K. T. Arasu and D. K. Ray-Chaudhuri, Affine difference sets and their homomorphic images, Mitt. Math. Sem. Giessen 192 (1989), 71–78.
[49] D. K. Ray-Chaudhuri (editor), Coding theory and design theory. Part I. Coding theory, IMA Vol. Math. Appl. 20, Springer-Verlag, New York 1990.
[50] D. K. Ray-Chaudhuri (editor), Coding theory and design theory. Part II. Design theory, IMA Vol. Math. Appl. 21, Springer-Verlag, New York 1990.
[51] M. Deza, D. K. Ray-Chaudhuri and N. M. Singhi, Positive independence and enumeration of codes with a given distance pattern, in: Coding theory and design theory, Part I, IMA Vol. Math. Appl. 20, Springer-Verlag, New York 1990, 93–101.
[52] D. K. Ray-Chaudhuri (editor), Combinatorial mathematics and applications, Sankhyā Ser. A 54 (1992), special issue dedicated to the memory of R. C. Bose.
[53] D. K. Ray-Chaudhuri and N. M. Singhi, Some recent results on t-designs, Sankhyā Ser. A 54 (1992), special issue (Combinatorial mathematics and applications, Calcutta, 1988), 383–391.
[54] D. K. Ray-Chaudhuri and E. J. Schram, Designs on vector spaces constructed using quadratic forms, Geom. Dedicata 42 (1992), 1–42.
[55] D. K. Ray-Chaudhuri and T. Zhu, A recursive method for construction of designs, Discrete Math. 106/107 (1992), A collection of contributions in honour of Jack van Lint, 399–406.
[56] D. K. Ray-Chaudhuri and N. M. Singhi (editors), Prof. R. C. Bose Memorial Issue, J. Combin. Inform. System Sci. 17 (1992), 1–2.
[57] D. K. Ray-Chaudhuri, N. M. Singhi and G. R. Vijayakumar, Signed graphs having least eigenvalue around −2, J. Combin. Inform. System Sci. 17 (1992), 148–165.
[58] D. K. Ray-Chaudhuri and T. Zhu, s-intersection families and tight designs, in: Coding theory, design theory, group theory (Burlington, VT, 1990), Wiley-Interscience Publishers, New York 1993, 67–75.
[59] K. T. Arasu, D. K. Ray-Chaudhuri and N. M. Singhi, Simple designs, J. Combin. Inform. System Sci. 18 (1993), 130–135.
[60] D. K. Ray-Chaudhuri, N. M. Singhi, S. Sanyal, and P. S. Subramanian, Theory and design of t-unidirectional error-correcting and d-unidirectional error-detecting code, IEEE Trans. Comput. 43 (1994), 1221–1226.
[61] D. K. Ray-Chaudhuri and E. J. Schram, A large set of designs on vector spaces, J. Number Theory 47 (1994), 247–272.
[62] D. K. Ray-Chaudhuri and H.-M. Shaw, A greedy algorithm for maximum multicommodity flows on dominance networks, J. Combin. Inform. System Sci. 20 (1995), 161–171.
[63] D. K. Ray-Chaudhuri and X. Wu, Abelianizations of non-abelian difference sets, J. Combin. Inform. System Sci. 20 (1995), 173–195.
[64] D. K. Ray-Chaudhuri, Vector space designs, in: The CRC Handbook of Combinatorial Designs (Colbourn, Charles J. et al., eds.), CRC Press Ser. Discrete Math. Appl., CRC Press, Boca Raton, FL, 1996, 492–496.
[65] D. K. Ray-Chaudhuri and Q. Xiang, Constructions of partial difference sets and relative difference sets using Galois rings, Des. Codes Cryptogr. 8 (1996), special issue dedicated to Hanfried Lenz, 215–227.
[66] Y. Q. Chen, D. K. Ray-Chaudhuri and Q. Xiang, Constructions of partial difference sets and relative difference sets using Galois rings. II, J. Combin. Theory Ser. A 76 (1996), 179–196.
[67] D. K. Ray-Chaudhuri and T. Zhu, Orthogonal arrays and ordered designs, J. Statist. Plann. Inference 58 (1997), 177–183.
[68] D. K. Ray-Chaudhuri and Q. Xiang, New necessary conditions for abelian Hadamard difference sets, J. Statist. Plann. Inference 62 (1997), 69–79.
[69] J. Qian and D. K. Ray-Chaudhuri, Frankl-Füredi type inequalities for polynomial semilattices, Electron. J. Combin. 4 (1997), no. 1, Research Paper 28, 15 pp. (electronic).
[70] J. Qian and D. K. Ray-Chaudhuri, Combinatorial inequalities for quasi polynomial semilattices, Recent advances in interdisciplinary mathematics, Portland, ME, 1997, J. Combin. Inform. System Sci. 25 (2000), 59–76.
[71] I. Siap, N. Aydin, and D. K. Ray-Chaudhuri, New ternary quasi-cyclic codes with better minimum distances, IEEE Trans. Inform. Theory 46 (2000), 1554–1558.
[72] T. Blackford and D. K. Ray-Chaudhuri, A transform approach to permutation groups of cyclic codes over Galois rings, IEEE Trans. Inform. Theory 46 (2000), 2350–2358.
[73] I. Siap and D. K. Ray-Chaudhuri, New linear codes over F3 and F5 and improvements on bounds, Des. Codes Cryptogr. 21 (2000), special issue dedicated to Dr. Jaap Seidel on the occasion of his 80th birthday (Oisterwijk, 1999), 223–233.
[74] J. Qian and D. K. Ray-Chaudhuri, On mod-p Alon-Babai-Suzuki inequality, J. Algebraic Combin. 12 (2000), 85–93.
[75] I. Siap and D. K. Ray-Chaudhuri, On r-fold complete weight enumerators of r linear codes, in: Algebra and its applications (Athens, OH, 1999), Contemp. Math. 259, Amer. Math. Soc., Providence, RI, 2000, 501–513.
[76] The 1999 Euler, Hall, and Kirkman medals, Bull. Inst. Combin. Appl. 30 (2000), 7–9.
[77] I. Siap, N. Aydin, and D. K. Ray-Chaudhuri, New 1-generator quasi-twisted codes over GF(5), in: Codes and association schemes (Piscataway, NJ, 1999), DIMACS Ser. Discrete Math. Theoret. Comput. Sci. 56, Amer. Math. Soc., Providence, RI, 2001, 265–275.
[78] H. Mohácsy and D. K. Ray-Chaudhuri, A construction for infinite families of Steiner 3-designs, J. Combin. Theory Ser. A 94 (2001), 127–141.
[79] J. Qian and D. K. Ray-Chaudhuri, Extremal case of Frankl-Ray-Chaudhuri-Wilson inequality, J. Statist. Plann. Inference 95 (2001), special issue on design combinatorics: in honor of S. S. Shrikhande, 293–306.
[80] H. Mohácsy and D. K. Ray-Chaudhuri, Candelabra systems and t-designs, J. Statist. Plann. Inference, to appear.
[81] H. Mohácsy and D. K. Ray-Chaudhuri, A construction for group-divisible t-designs with strength t ≥ 2 and index 1, J. Statist. Plann. Inference, to appear.
[82] N. Aydin, D. K. Ray-Chaudhuri and I. Siap, The structure of 1-generator quasi-twisted codes and improvements on bounds, Des. Codes Cryptogr., to appear.

Á. Seress
Department of Mathematics
The Ohio State University
Columbus, OH 43210-1174, U.S.A.
[email protected]
On the p-ranks of GMW difference sets K. T. Arasu, Henk D. L. Hollmann, Kevin Player, and Qing Xiang ∗
Abstract. We determine the p-ranks of the classical GMW difference sets (p even or odd). In the p odd case, this solves an open problem mentioned in [4], p. 461 and [15], p. 84. We also compute the 2-ranks of some non-classical GMW difference sets arising from monomial hyperovals. 2000 Mathematics Subject Classification: primary 05B10; secondary 11L05.
1. Introduction

Let $G$ be a finite (multiplicative) group of order $v$. A $k$-element subset $D$ of $G$ is called a $(v, k, \lambda)$ difference set in $G$ if the list of “differences” $d_1 d_2^{-1}$, $d_1, d_2 \in D$, $d_1 \neq d_2$, represents each nonidentity element in $G$ exactly $\lambda$ times. We say that two $(v, k, \lambda)$ difference sets $D_1$ and $D_2$ in an abelian group $G$ are equivalent if there exists an automorphism $\alpha$ of $G$ and an element $g \in G$ such that $\alpha(D_1) = D_2 g$. In particular, if $G$ is cyclic, then $D_1$ and $D_2$ are equivalent if there exists an integer $t$, $\gcd(t, v) = 1$, such that $D_1^{(t)} = D_2 g$ for some $g \in G$, where $D_1^{(t)} = \{d^t \mid d \in D_1\}$. Singer [17] discovered a large class of difference sets which are related to finite projective geometry. These difference sets have parameters
\[
v = \frac{q^d - 1}{q - 1}, \qquad k = \frac{q^{d-1} - 1}{q - 1}, \qquad \lambda = \frac{q^{d-2} - 1}{q - 1} \tag{1}
\]
where $d \geq 3$, and they exist whenever $q$ is a prime power. In this paper, difference sets with parameters (1), or the complementary parameters $v = (q^d - 1)/(q - 1)$, $k = q^{d-1}$, $\lambda = q^{d-2}(q - 1)$, are called difference sets with classical parameters. In ([2], p. 143), it is mentioned that on one hand, Singer [17] conjectured that there is only one equivalence class of difference sets with parameters (1) if $d = 3$ (i.e., $\lambda = 1$); on the other hand, the largest known class of multiple inequivalent difference sets also have classical parameters. While little progress has been made on the Singer conjecture above, there has been a great deal of research on constructing inequivalent difference sets with classical parameters, especially when $q = 2$. For a survey of recent work in this area, we refer the reader to [20].

The first infinite series of examples of mutually inequivalent difference sets with parameters $\left(\frac{q^m - 1}{q - 1},\, q^{m-1},\, q^{m-2}(q - 1)\right)$ is due to Gordon, Mills and Welch [8]. These difference sets will be called GMW difference sets, and the symmetric designs developed from these difference sets are called GMW designs. When $q = 2$, the 2-ranks of the so-called classical GMW difference sets (see Section 2 for definition) were computed by Scholtz and Welch [16] in terms of the linear spans of their characteristic sequences. However, the $p$-ranks of the classical GMW difference sets in the case $q \neq 2$ are not known (cf. [4], p. 461, [15], p. 84). In this paper, we compute the $p$-ranks of the classical GMW difference sets. We also compute the 2-ranks of some non-classical GMW difference sets from monomial hyperovals.

The methods used here to compute the $p$-ranks are essentially the same as those in [7], but the details are more complicated because of the recursive nature of GMW difference sets. We first show that the character sums of the GMW difference sets under consideration are related to Gauss or Jacobi sums, then we use Stickelberger’s theorem on the prime ideal factorization of Gauss sums to reduce the problem of computing the $p$-ranks to certain counting problems. The counting problems are then solved either in a straightforward manner or with the help of the so-called transfer matrix method. It should be noted that the $p$-ranks of GMW difference sets usually cannot distinguish GMW designs because very often inequivalent GMW difference sets have the same $p$-ranks (see, for example, Corollary 3.7). The difficult problem of whether inequivalent GMW difference sets lead to nonisomorphic GMW designs was recently solved by Kantor [11] using group theory.

∗ K. T. Arasu’s research is supported by NSF grant CCR-9814106 and by NSA grant 904-01-1-0041. Kevin Player was partially supported by an REU grant from the NSF. Q. Xiang was supported by NSA grant 904-01-1-006.
2. Preliminaries

We first recall a construction of Singer difference sets. Let $\mathbb{F}_{q^d}$ be the finite field with $q^d$ elements, where $q = p^s$, $p$ is a prime, $d \ge 3$, and let $\mathrm{Tr}$ be the trace from $\mathbb{F}_{q^d}$ to $\mathbb{F}_q$. We may take a system $L$ of coset representatives of $\mathbb{F}_q^*$ in $\mathbb{F}_{q^d}^*$ such that $\mathrm{Tr}$ maps $L$ into $\{0, 1\}$. Write $L = L_0 \cup L_1$, where
$$L_0 = \{x \in L \mid \mathrm{Tr}(x) = 0\}, \qquad L_1 = \{x \in L \mid \mathrm{Tr}(x) = 1\}. \tag{2}$$

Theorem 2.1. With the above notation, $L_0$ is a $\left(\frac{q^d-1}{q-1},\ \frac{q^{d-1}-1}{q-1},\ \frac{q^{d-2}-1}{q-1}\right)$ difference set in the quotient group $\mathbb{F}_{q^d}^*/\mathbb{F}_q^*$, and $L_1$ is a $\left(\frac{q^d-1}{q-1},\ q^{d-1},\ q^{d-2}(q-1)\right)$ difference set in $\mathbb{F}_{q^d}^*/\mathbb{F}_q^*$.

Proofs of Theorem 2.1 can of course be found in many places. For our later use, we mention a proof by Yamamoto [22] (see also [7]), in which it is shown that the
On the p-ranks of GMW difference sets
character values of $L_0$ and $L_1$ are related to Gauss sums. More precisely, let $\chi$ be a nontrivial multiplicative character of $\mathbb{F}_{q^d}$ whose restriction to $\mathbb{F}_q^*$ is trivial. Then
$$\chi(L_0) = g(\chi)/q, \quad \text{and} \quad \chi(L_1) = -g(\chi)/q, \tag{3}$$
where $g(\chi)$ is the Gauss sum defined over $\mathbb{F}_{q^d}$, i.e.,
$$g(\chi) = \sum_{a \in \mathbb{F}_{q^d}^*} \chi(a)\, \xi_p^{\mathrm{Tr}_{q^d/p}(a)};$$
here $\xi_p$ is a fixed complex primitive $p$th root of unity and $\mathrm{Tr}_{q^d/p}$ is the absolute trace from $\mathbb{F}_{q^d}$ to $\mathbb{F}_p$, the field of $p$ elements.

Now we proceed to discuss the GMW construction. Let $m = d \cdot e$, where $d > 2$, $e > 1$ are integers, and let $q$ be a prime power. We define
$$R' = \{x \in \mathbb{F}_{q^m} \mid \mathrm{Tr}_{q^m/q^d}(x) = 1\},$$
where $\mathrm{Tr}_{q^m/q^d}$ is the trace from $\mathbb{F}_{q^m}$ to $\mathbb{F}_{q^d}$. Let $\mu : \mathbb{F}_{q^m}^* \to \mathbb{F}_{q^m}^*/\mathbb{F}_q^*$ be the canonical epimorphism, and let $R = \mu(R')$. Using the terminology of relative difference sets (see, for example, [15], p. 13), the set $R'$ is a $\left(\frac{q^m-1}{q^d-1},\ q^d-1,\ q^{d(e-1)},\ q^{d(e-2)}\right)$ relative difference set in $\mathbb{F}_{q^m}^*$ relative to $\mathbb{F}_{q^d}^*$, and $R$ is a
$$\left(\frac{q^m-1}{q^d-1},\ \frac{q^d-1}{q-1},\ q^{d(e-1)},\ q^{d(e-2)}(q-1)\right)$$
relative difference set in $\mathbb{F}_{q^m}^*/\mathbb{F}_q^*$ relative to $\mathbb{F}_{q^d}^*/\mathbb{F}_q^*$. We state the following theorem of Gordon, Mills and Welch.

Theorem 2.2 ([8]). If $\Delta$ is any $\left(\frac{q^d-1}{q-1},\ q^{d-1},\ q^{d-2}(q-1)\right)$ difference set in $\mathbb{F}_{q^d}^*/\mathbb{F}_q^*$, then $D = \Delta R$ is a $\left(\frac{q^m-1}{q-1},\ q^{m-1},\ q^{m-2}(q-1)\right)$ difference set in $\mathbb{F}_{q^m}^*/\mathbb{F}_q^*$, with the above definition of $R$. Moreover, if $\Delta'$ is another $\left(\frac{q^d-1}{q-1},\ q^{d-1},\ q^{d-2}(q-1)\right)$ difference set in $\mathbb{F}_{q^d}^*/\mathbb{F}_q^*$, then the two cyclic difference sets $D = \Delta R$ and $D' = \Delta' R$ are equivalent if and only if $\Delta'$ is a translate of $\Delta$.

In Theorem 2.2, if one uses the difference sets $L_1^{(r)} = \{x^r \mid x \in L_1\}$ as $\Delta$, where $\gcd\left(r, \frac{q^d-1}{q-1}\right) = 1$ and $L_1$ is the same as in Theorem 2.1, then the resulting difference sets $D = L_1^{(r)} R$ are called classical GMW difference sets. If we assume furthermore that $q = 2$, so that $R = R' = \{x \in \mathbb{F}_{2^m} \mid \mathrm{Tr}_{2^m/2^d}(x) = 1\}$, then the characteristic sequence of $D = L_1^{(r)} R$ in $\mathbb{F}_{2^m}^*$ is given by
$$\left\{\mathrm{Tr}_{2^d/2}\left(\left[\mathrm{Tr}_{2^m/2^d}(\alpha^i)\right]^{1/r}\right)\right\}_{0 \le i \le 2^m-2}.$$
This sequence is called a binary GMW sequence in [16]. The linear spans of GMW sequences (i.e., the 2-ranks of classical GMW difference sets with $q = 2$) are computed in [16]. Antweiler and Bömer [1] considered sequences defined over $\mathbb{F}_p$ in a way analogous to the definition of GMW sequences, and computed their linear spans. We
note that the sequences they considered apparently are different from the characteristic sequences of the GMW difference sets when $q \neq 2$ (cf. [15], p. 84). It therefore remains a problem to compute the $p$-ranks of GMW difference sets for general $p$. We will solve this problem in Section 3.

We emphasize that in Theorem 2.2, the choice of the $\left(\frac{q^d-1}{q-1},\ q^{d-1},\ q^{d-2}(q-1)\right)$ difference set $\Delta$ in $\mathbb{F}_{q^d}^*/\mathbb{F}_q^*$ is completely arbitrary. When $q = 2$, and $\Delta$ is not of the form $L_1^{(r)}$ for any $r$ relatively prime to $2^d - 1$, the characteristic sequences of the GMW difference sets $\Delta R$ are studied in [14] and [6].

We now define the $p$-rank of a difference set. Let $G$ be a (multiplicative) abelian group of order $v$, and let $D$ be a $(v, k, \lambda)$ difference set in $G$. Then $\mathcal{D} = (P, B)$ is a $(v, k, \lambda)$ symmetric design with a regular automorphism group $G$, where the set $P$ of points of $\mathcal{D}$ is $G$, and where the set $B$ of blocks of $\mathcal{D}$ is $\{gD \mid g \in G\}$. This design is usually called the development of $D$. The incidence matrix of $\mathcal{D}$ is the matrix $A$ whose rows are indexed by the blocks of $\mathcal{D}$ and whose columns are indexed by the points of $\mathcal{D}$, where the entry $A_{B,g}$ in row $B$ and column $g$ is 1 if $g \in B$, and 0 otherwise. The $p$-ary code of $\mathcal{D}$, denoted $C_p(\mathcal{D})$, is defined to be the row space of $A$ over $\mathbb{F}_p$, the field of $p$ elements. This code is also called the $p$-ary code of $D$, denoted by $C_p(D)$. The $\mathbb{F}_p$-dimension of $C_p(D)$ is usually called the $p$-rank of the difference set $D$. It is well known that $C_p(D)$ is of interest only if $p \mid (k - \lambda)$ (see [5]). So from now on, we always assume that $p \mid (k - \lambda)$.

In our computation of $p$-ranks of the GMW difference sets, we will take the well known approach described by the following lemma.
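As a concrete illustration of the definition of the $p$-rank just given (this is not part of the original argument), the following minimal Python sketch builds the incidence matrix of the development of a difference set in a cyclic group and computes its rank over $\mathbb{F}_p$ by Gaussian elimination. It assumes the group is written additively as $\mathbb{Z}_v$; the function names `development` and `rank_mod_p` are illustrative choices, not notation from the paper.

```python
def development(D, v):
    """Incidence matrix of the development of D in Z_v (written additively).

    Row g corresponds to the block g + D; the entry in column h is 1
    exactly when h lies in g + D.
    """
    D = set(D)
    return [[1 if (h - g) % v in D else 0 for h in range(v)] for g in range(v)]


def rank_mod_p(matrix, p):
    """Rank of an integer matrix over the prime field F_p (Gaussian elimination)."""
    rows = [[x % p for x in row] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        # Find a row at or below position `rank` with a nonzero entry in this column.
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(x - f * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def p_rank(D, v, p):
    """p-rank of the difference set D in Z_v."""
    return rank_mod_p(development(D, v), p)


if __name__ == "__main__":
    # (7, 4, 2) difference set in Z_7 (complement of the Fano-plane set {1, 2, 4}).
    print(p_rank({0, 3, 5, 6}, 7, 2))
```

For the parameters $(7, 4, 2)$ we have $k - \lambda = 2$, so $p = 2$ is the relevant prime; larger examples are limited only by the $O(v^3)$ cost of the elimination.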
Lemma 2.3. Let $G$ be an abelian group of order $v$ and exponent $v^*$, let $p$ be a prime not dividing $v^*$, and let $\mathfrak{p}$ be a prime ideal above $p$ in $\mathbb{Z}[\xi_{v^*}]$. Let $D$ be a $(v, k, \lambda)$ difference set in $G$. Then the $p$-rank of $D$ is equal to the number of complex characters $\chi$ of $G$ with $\chi(D) \not\equiv 0 \pmod{\mathfrak{p}}$.

For a proof of this lemma, we refer the reader to [12] and ([4], p. 465).

We will also need Stickelberger's result (Theorem 2.4 below) on the prime ideal factorization of Gauss sums. We first introduce some notation. Let $p$ be a prime, $q = p^s$, and let $\xi_{q-1}$ be a complex primitive $(q-1)$th root of unity. Fix any prime ideal $\mathfrak{p}$ in $\mathbb{Z}[\xi_{q-1}]$ lying over $p$. Then $\mathbb{Z}[\xi_{q-1}]/\mathfrak{p}$ is a finite field of order $q$, which we identify with $\mathbb{F}_q$. Let $\omega_{\mathfrak{p}}$ be the Teichmüller character on $\mathbb{F}_q$, i.e., an isomorphism
$$\omega_{\mathfrak{p}} : \mathbb{F}_q^* \to \{1, \xi_{q-1}, \xi_{q-1}^2, \ldots, \xi_{q-1}^{q-2}\}$$
satisfying
$$\omega_{\mathfrak{p}}(\alpha) \pmod{\mathfrak{p}} = \alpha, \tag{4}$$
for all $\alpha$ in $\mathbb{F}_q^*$. The Teichmüller character $\omega_{\mathfrak{p}}$ has order $q - 1$; hence it generates all multiplicative characters of $\mathbb{F}_q$.
Let $\mathfrak{P}$ be the prime ideal of $\mathbb{Z}[\xi_{q-1}, \xi_p]$ lying above $\mathfrak{p}$. For an integer $a$, let
$$s(a) = v_{\mathfrak{P}}\left(g(\omega_{\mathfrak{p}}^{-a})\right),$$
where $v_{\mathfrak{P}}$ is the $\mathfrak{P}$-adic valuation. Thus $\mathfrak{P}^{s(a)} \,\|\, g(\omega_{\mathfrak{p}}^{-a})$. The following evaluation of $s(a)$ is due to Stickelberger (see [3], p. 344, [19], p. 96).

Theorem 2.4. Let $p$ be a prime, and $q = p^s$. For an integer $a$ not divisible by $q - 1$, let $a_0 + a_1 p + a_2 p^2 + \cdots + a_{s-1} p^{s-1}$, $0 \le a_i \le p - 1$, be the $p$-adic expansion of the reduction of $a$ modulo $q - 1$. Then
$$s(a) = a_0 + a_1 + \cdots + a_{s-1},$$
that is, $s(a)$ is the sum of the $p$-adic digits of the reduction of $a$ modulo $q - 1$.

As an easy application of Stickelberger's theorem, we prove the following lemma.

Lemma 2.5. Let $q = p^s$, and let $d > 2$ be an integer. For any integer $a$ not divisible by $q^d - 1$, let $s(a)$ be the sum of the $p$-adic digits of the reduction of $a$ modulo $q^d - 1$. Then $s((q-1)b) \ge (p-1)s$, for all integers $b$, $0 < b < (q^d - 1)/(q - 1)$.

Proof. For $\mathfrak{p}$ a prime ideal in $\mathbb{Z}[\xi_{q^d-1}]$ lying over $p$, let $\omega_{\mathfrak{p}}$ be the Teichmüller character on $\mathbb{F}_{q^d}$ and let $\chi = \omega_{\mathfrak{p}}^{-(q-1)}$. Then $\chi$ is a generator of the character group of $\mathbb{F}_{q^d}^*/\mathbb{F}_q^*$.
By (3), we know that for each $b$, $0 < b$ […] $d > 2$, $e > 1$ are integers, $R' = \{x \in \mathbb{F}_{q^m} \mid \mathrm{Tr}_{q^m/q^d}(x) = 1\}$; here $\mathrm{Tr}_{q^m/q^d}$ is the trace from $\mathbb{F}_{q^m}$ to $\mathbb{F}_{q^d}$. Let $\mu : \mathbb{F}_{q^m}^* \to \mathbb{F}_{q^m}^*/\mathbb{F}_q^*$ be the canonical epimorphism. As before, we define $R = \mu(R')$. Let $\chi$ be a nontrivial character of $\mathbb{F}_{q^m}^*/\mathbb{F}_q^*$. Our goal here is to compute $\chi(R) := \sum_{x \in R} \chi(x)$.
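Since $R = \mu(R')$, we see that $\chi(R) = \chi \circ \mu(R')$. Let $\eta = \chi \circ \mu$. Then $\eta$ is a multiplicative character of $\mathbb{F}_{q^m}^*$ whose restriction to $\mathbb{F}_q^*$ is trivial. By the definition of Gauss sums over $\mathbb{F}_{q^m}$, we have
$$g(\eta) = \sum_{y \in \mathbb{F}_{q^m}^*} \eta(y)\, \xi_p^{\mathrm{Tr}_{q^m/p}(y)}.$$
Let $L'$ be a system of coset representatives of $\mathbb{F}_{q^d}^*$ in $\mathbb{F}_{q^m}^*$ such that $\{\mathrm{Tr}_{q^m/q^d}(x) \mid x \in L'\} = \{0, 1\}$. Define $L'_0 = \{x \in L' \mid \mathrm{Tr}_{q^m/q^d}(x) = 0\}$ and $L'_1 = \{x \in L' \mid \mathrm{Tr}_{q^m/q^d}(x) = 1\}$. Then
$$\begin{aligned}
g(\eta) &= \sum_{x \in L'} \eta(x) \sum_{a \in \mathbb{F}_{q^d}^*} \eta(a)\, \xi_p^{\mathrm{Tr}_{q^d/p}\left(a \mathrm{Tr}_{q^m/q^d}(x)\right)} \\
&= \sum_{x \in L'_0} \eta(x) \sum_{a \in \mathbb{F}_{q^d}^*} \eta(a) + \sum_{x \in L'_1} \eta(x) \sum_{a \in \mathbb{F}_{q^d}^*} \eta(a)\, \xi_p^{\mathrm{Tr}_{q^d/p}(a)}.
\end{aligned}$$
Therefore, if $\eta|_{\mathbb{F}_{q^d}^*} = 1$, then
$$g(\eta) = -q^d\, \eta(L'_1);$$
and if $\eta|_{\mathbb{F}_{q^d}^*} \neq 1$, then $\sum_{a \in \mathbb{F}_{q^d}^*} \eta(a) = 0$, hence
$$g(\eta) = \eta(L'_1) \cdot g_1(\eta_1),$$
where $\eta_1 = \eta|_{\mathbb{F}_{q^d}^*}$ (the restriction of $\eta$ to $\mathbb{F}_{q^d}^*$), and $g_1(\eta_1)$ is the Gauss sum over $\mathbb{F}_{q^d}$ with respect to the character $\eta_1$. Noting that $L'_1 = R'$, we have
$$\eta(R') = \chi(R) = \begin{cases} -\dfrac{1}{q^d}\, g(\eta), & \text{if } \eta|_{\mathbb{F}_{q^d}^*} = 1, \\[2mm] \dfrac{g(\eta)}{g_1(\eta_1)}, & \text{if } \eta|_{\mathbb{F}_{q^d}^*} \neq 1. \end{cases} \tag{6}$$
We will use this evaluation of $\chi(R)$ in later sections.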
3. The p-ranks of the classical GMW difference sets

Let $m = d \cdot e$, where $d > 2$, $e > 1$ are integers. Let $R$ be defined as in Section 2. Let $L_1$ be defined as in (2), and let $r$ be an integer such that $\gcd\left(r, \frac{q^d-1}{q-1}\right) = 1$. Then Theorem 2.2 tells us that the set $D := L_1^{(r)} R$ is a $\left(\frac{q^m-1}{q-1},\ q^{m-1},\ q^{m-2}(q-1)\right)$ difference set in $\mathbb{F}_{q^m}^*/\mathbb{F}_q^*$, and such a difference set is called a classical GMW difference
set. In this section, we compute the $p$-ranks of the classical GMW difference sets. We will maintain the notation in Section 2. We begin with a lemma which reduces the computation of $p$-ranks of the classical GMW difference sets to a combinatorial counting problem.

Lemma 3.1. Let $D = L_1^{(r)} R$ be the difference set in $\mathbb{F}_{q^m}^*/\mathbb{F}_q^*$ defined above. Let $q = p^s$, where $p$ is a prime. Then the $p$-rank of $D$ is equal to the cardinality of the set $B = \{a \mid 0$ […] $e > 1$ are integers. Let $X$ be an integer not divisible by $q^d - 1$, $0 < X < q^m - 1$, and let $s(X)$, $s_1(X)$ be the $p$-weight of the reduction of $X$ modulo $q^m - 1$ and $q^d - 1$ respectively. Then
$$s(X) - s_1(X) = (p - 1)\alpha,$$
for some integer $\alpha \ge 0$.

Proof. We write
$$X = \sum_{i=0}^{e-1} X_i q^{di}, \quad \text{where} \quad X_i = \sum_{j=0}^{ds-1} X_{ij} p^j$$
with $0 \le X_{ij} \le p - 1$. We will use $x = \sum_{j=0}^{ds-1} x_j p^j$, $0 \le x_j \le p - 1$, to denote the reduction of $X \pmod{q^d - 1}$. So $0 \le x \le q^d - 1$ and $x \equiv X \bmod q^d - 1$. By the add-with-carry algorithm, there are nonnegative carries $c_j$, $j = 0, 1, \ldots, ds - 1$, such that
$$p c_j + x_j = \sum_{i=0}^{e-1} X_{ij} + c_{j-1}$$
holds for all $j = 0, 1, \ldots, ds - 1$. Here $c_{-1} = c_{ds-1}$. This implies that
$$(p - 1) \sum_j c_j + \sum_j x_j = \sum_j \sum_i X_{ij},$$
that is, $(p - 1)\alpha + s_1(X) = s(X)$, where $\alpha = \sum_j c_j$.
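The statement of Lemma 3.2 is easy to test numerically for small parameters. The sketch below is an illustration only, with hypothetical helper names (`p_weight`, `carry_total`, `check_lemma_3_2`). It computes $\alpha = (s(X) - s_1(X))/(p-1)$ directly from the base-$p$ digit sums and checks that it is a nonnegative integer for every admissible $X$; it does not reconstruct the individual carries $c_j$.

```python
def p_weight(n, p):
    """Sum of the base-p digits of the nonnegative integer n."""
    total = 0
    while n:
        total += n % p
        n //= p
    return total


def carry_total(X, p, s, d):
    """Return alpha with s(X) - s1(X) = (p - 1) * alpha, as in Lemma 3.2.

    s(X) is the p-weight of X (assumed to satisfy 0 < X < q^m - 1) and
    s1(X) is the p-weight of X reduced modulo q^d - 1, where q = p^s.
    """
    q = p ** s
    x = X % (q ** d - 1)
    diff = p_weight(X, p) - p_weight(x, p)
    # Digit sums are congruent to the number modulo p - 1, and p - 1 divides q^d - 1.
    assert diff % (p - 1) == 0
    return diff // (p - 1)


def check_lemma_3_2(p, s, d, e):
    """Brute-force check that alpha >= 0 for all admissible X."""
    q = p ** s
    for X in range(1, q ** (d * e) - 1):
        if X % (q ** d - 1) == 0:
            continue  # the lemma excludes multiples of q^d - 1
        assert carry_total(X, p, s, d) >= 0
    return True


if __name__ == "__main__":
    print(check_lemma_3_2(3, 1, 3, 2))
```

The divisibility by $p - 1$ is automatic; the content of the lemma is the sign of $\alpha$, which is what the loop verifies.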
We now give a completely elementary proof of Lemma 2.5. In fact, we will prove a strengthening of the lemma. We first introduce some notation. Let $b \ge 2$ be any integer. Define $\mathbb{Z}_{\ge 0} = \{0, 1, \ldots\}$. For any index set $I \subseteq \mathbb{Z}_{\ge 0}$, let $R(I)$ be the collection of all sequences $x = (x_i)_{i \in I}$ with $x_i$ a nonnegative integer for all $i \in I$ and $x_i = 0$ for all but finitely many $i$. For convenience, we define $0 \in R(I)$ as the sequence $x$ with $x_i = 0$ for all $i \in I$. Also, we write $R$ and $R_m$ to denote $R(\mathbb{Z}_{\ge 0})$ and $R(\{0, 1, \ldots, m - 1\})$, respectively. For each $x \in R(I)$, we associate its numerical value
$$\nu(x) = \sum_{i \in I} x_i b^i$$
and its $b$-ary weight
$$s_b(x) = \sum_{i \in I} x_i.$$
Note that if $I = \mathbb{Z}_{\ge 0} = \{0, 1, \ldots\}$ and the $x_i$ are the $b$-ary digits of a number, then the numerical value of the sequence is just the number itself and the weight of the sequence is just the weight of the number. With these definitions, we have the following.

Lemma 3.3. Let $x \in R \setminus \{0\}$ satisfy $(b^s - 1) \mid \nu(x)$ for some integer $s \ge 1$. Then $s_b(x) \ge (b - 1)s$, with equality if and only if
$$\sum_{i \equiv r \ (\mathrm{mod}\ s)} x_i = b - 1 \tag{7}$$
for $r = 0, 1, \ldots, s - 1$. Conversely, if (7) holds, then $(b^s - 1) \mid \nu(x)$ and $s_b(x) = (b - 1)s$.

Proof. Let $x = (x_i)_{i \ge 0} \in R \setminus \{0\}$ satisfy the assumptions in the lemma. For $i = 0, \ldots, s - 1$, write $x_{i,j} = x_{i+js}$ for all $j \ge 0$ and define
$$y_i = \sum_{j \ge 0} x_{i,j}.$$
We consider $y = (y_0, \ldots, y_{s-1})$ as a member of $R_s$. Note that
$$s_b(x) = \sum_{i=0}^{s-1} y_i = s_b(y). \tag{8}$$
Now modulo $b^s - 1$ we have that
$$0 \equiv \nu(x) = \sum_{i=0}^{s-1} \sum_{j \ge 0} x_{i,j}\, b^i b^{js} \equiv \sum_{i=0}^{s-1} \sum_{j \ge 0} x_{i,j}\, b^i = \sum_{i=0}^{s-1} y_i b^i = \nu(y) \pmod{b^s - 1}.$$
For each $i = 0, 1, \ldots, s - 1$ (considered modulo $s$), we define the transformation $\tau_i$ on sequences $z$ from $R_s$ with $z_i \ge b$ as follows. The image $z' = \tau_i(z)$ will have $z'_k = z_k$ for $k \neq i, i + 1$; $z'_i = z_i - b$, and $z'_{i+1} = z_{i+1} + 1$. Note that we have that $\tau_i(z) \in R_s \setminus \{0\}$, $s_b(\tau_i(z)) = s_b(z) - (b - 1) < s_b(z)$, and
$$\nu(\tau_i(z)) = \begin{cases} \nu(z), & \text{if } i \neq s - 1; \\ \nu(z) - (b^s - 1), & \text{if } i = s - 1. \end{cases}$$
In particular, we have that $\nu(\tau_i(z)) \equiv \nu(z) \bmod b^s - 1$.

Now, repeatedly apply transformations $\tau_i$ to the sequence $y = (y_j)_{0 \le j \le s-1}$ until we obtain a sequence $y' = (y'_0, y'_1, \ldots, y'_{s-1})$, where $y'_i \le b - 1$ for all $i = 0, 1, \ldots, s - 1$. By the above remarks, we have that $\nu(y') \equiv \nu(y) \equiv 0 \pmod{b^s - 1}$, $y' \neq 0$, and $s_b(y') \le s_b(y)$ with equality if and only if $y_i \le b - 1$ for all $i$. To finish the proof, it suffices to remark that we may consider the sequence $y'$ as the $b$-ary representation of the number $\nu(y')$; so $0 < \nu(y') \le b^s - 1$ and hence $\nu(y') \equiv 0 \pmod{b^s - 1}$ implies that $y'_i = b - 1$ for all $i$.

The converse in the lemma is evident: indeed, if the condition in the lemma holds, that is, if $y_i = b - 1$ for each $i = 0, \ldots, s - 1$, then $\nu(x) \equiv \nu(y) = b^s - 1 \equiv 0 \bmod b^s - 1$ and $s_b(x) = s_b(y) = (b - 1)s$.
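Lemma 3.3 can be checked by brute force in the special case where the sequence $x$ consists of the $b$-ary digits of a positive multiple of $b^s - 1$. The following sketch does this under that assumption; the names `digits`, `class_sums` and `check_lemma_3_3` are illustrative only. It verifies both the bound $s_b(x) \ge (b-1)s$ and the characterization (7) of equality.

```python
def digits(n, b):
    """Base-b digits of n, least significant digit first."""
    out = []
    while n:
        out.append(n % b)
        n //= b
    return out


def class_sums(n, b, s):
    """Sums of the base-b digits of n over the positions i = r (mod s), r = 0..s-1."""
    sums = [0] * s
    for i, x in enumerate(digits(n, b)):
        sums[i % s] += x
    return sums


def check_lemma_3_3(b, s, limit):
    """Check Lemma 3.3 for all positive multiples of b^s - 1 below `limit`."""
    modulus = b ** s - 1
    for n in range(modulus, limit, modulus):
        weight = sum(digits(n, b))
        assert weight >= (b - 1) * s
        # Equality holds exactly when every residue-class sum equals b - 1, as in (7).
        meets_bound = (weight == (b - 1) * s)
        satisfies_7 = all(t == b - 1 for t in class_sums(n, b, s))
        assert meets_bound == satisfies_7
    return True


if __name__ == "__main__":
    print(check_lemma_3_3(3, 2, 5000))
```

With $b = p$ and modulus $p^s - 1 = q - 1$ this is the digit-sum bound used in Lemma 2.5.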
Remarks. (1) In fact, from the proof of Lemma 3.3 we see the following: if $0 < \nu(x) \equiv \nu(z) \bmod b^s - 1$ with $0 \neq z = (z_0, z_1, \ldots, z_{s-1}, 0, 0, \ldots)$ and $0 \le z_i \le b - 1$ for all $i$, then we have that $s_b(x) \ge s_b(z)$ with equality if and only if
$$\sum_{i \equiv r \ (\mathrm{mod}\ s)} x_i = z_r$$
for all $r = 0, \ldots, s - 1$. (The lemma is simply the case where $z = (b - 1, b - 1, \ldots, b - 1, 0, 0, \ldots)$ so that $\nu(z) = b^s - 1 \equiv 0 \bmod b^s - 1$.)

(2) As a consequence of this lemma, we see that if $(b^s - 1) \mid x$ and $s_b(x) = (b - 1)s$, then certainly $(b^t - 1) \nmid x$ for $t > s$.

Using the notation introduced at the beginning of this section, we now have the following.

Theorem 3.4. Let $q = p^s$, $p$ a prime, and let $m = de$, where $d > 2$, $e > 1$ are integers. Let $r$ be an integer relatively prime to $\frac{q^d-1}{q-1}$. Then the $p$-rank of the difference set $D = L_1^{(r)} R$ is equal to
$$\sum_x \prod_{j=0}^{ds-1} \binom{x_j + e - 1}{e - 1},$$
where the sum is over all $x = \sum_{j=0}^{ds-1} x_j p^j$, $0 \le x_j \le p - 1$, such that the number $y \equiv rx \pmod{q^d - 1}$ has the form
$$y = \sum_{i=0}^{s-1} \sum_{j=0}^{d-1} y_{ij}\, p^i q^j$$
with $0 \le y_{ij} \le p - 1$ and
$$\sum_{j=0}^{d-1} y_{ij} = p - 1$$
for $i = 0, \ldots, s - 1$.

Proof. By Lemma 3.1, the $p$-rank of $D$ is equal to $|B|$, the cardinality of the following set
$$B = \{X \mid 0 < X < q^m - 1,\ (q - 1) \mid X,\ (q^d - 1) \nmid X,\ s_1(Xr) + s(X) - s_1(X) = (p - 1)s\},$$
where $s_1(X)$ is the $p$-weight of $X \pmod{q^d - 1}$.
Let $X \in B$. By Lemma 3.2, we see that $s(X) - s_1(X) \ge 0$, hence $s_1(Xr) \le (p - 1)s$. Since $(q - 1) \mid X$ and $X \mid Xr$, by Lemma 3.3, we have either $Xr \equiv 0 \bmod q^d - 1$, or $s_1(Xr) = (p - 1)s$. In the latter case, $Xr$ is of a special form as specified in Lemma 3.3.

Let us first show that $Xr \equiv 0 \bmod q^d - 1$ is impossible. Indeed, in that case we have $rX = c(q^d - 1)$, for some integer $c$. So with $X' = X/(q - 1)$, which is an integer by our assumption on $X$, we have that $rX' = c(q^d - 1)/(q - 1)$, that is, $rX' \equiv 0 \bmod (q^d - 1)/(q - 1)$. So by our assumption on $r$, this implies $X' \equiv 0 \bmod (q^d - 1)/(q - 1)$, and that implies $X \equiv 0 \bmod q^d - 1$, contradicting the assumption that $(q^d - 1) \nmid X$.

So we must have $s_1(Xr) = (p - 1)s$ (with $Xr$ of a special form). Hence $s(X) = s_1(X)$. Let $x$ denote the integer in the range $[0, q^d - 1)$ such that $x \equiv X \bmod q^d - 1$. Then $s(X) = s_1(x)$. Therefore, in order to compute the cardinality of $B$, we must count, for each $x$, $0 < x < q^d - 1$, with $s_1(xr) = (p - 1)s$, the number of $X \in B$ such that $X \equiv x \bmod q^d - 1$ and $s(X) = s_1(x)$.

We will use the same notation as in Lemma 3.2, i.e., $X = \sum_{i=0}^{e-1} X_i q^{di}$, $X_i = \sum_{j=0}^{ds-1} X_{ij} p^j$, with $0 \le X_{ij} \le p - 1$. Given an $x = \sum_{j=0}^{ds-1} x_j p^j$, $0 \le x_j \le p - 1$, since we want to count those $X \in B$ such that $X \equiv x \bmod q^d - 1$ and $s(X) = s_1(x)$, we require that
$$x_j = \sum_{i=0}^{e-1} X_{ij},$$
that is, the addition $X_0 + X_1 + \cdots + X_{e-1} \pmod{q^d - 1}$ has no carry. As before, given an $x_j$, there are precisely
$$\binom{x_j + e - 1}{e - 1}$$
ways to distribute the quantity $x_j$ over the $X_{ij}$'s. So for each $x$, $0 < x < q^d - 1$, with $s_1(xr) = (p - 1)s$, the number of "liftings" $X \in B$ of $x$ is $\prod_{j=0}^{ds-1} \binom{x_j + e - 1}{e - 1}$. Summing over these $x$, we get the desired formula for the $p$-rank of $D$.

Example 3.5. We use a concrete example to illustrate the $p$-rank formula in Theorem 3.4. Let us take $p = 3$, $s = 1$, $d = 3$, $e = 2$, so $m = de = 6$. Let $r \equiv 1/5 \pmod{3^3 - 1}$. We have 6 choices for $y \equiv x/5 \pmod{3^3 - 1}$ such that $s_1(y) = p - 1 = 2$. Therefore we have 6 choices for $x$. These are
$$x \equiv 1 + 3^2,\ 1 + 3,\ 3 + 3^2,\ 2 + 2 \cdot 3,\ 2 \cdot 3 + 2 \cdot 3^2,\ 2 + 2 \cdot 3^2 \pmod{3^3 - 1}.$$
By Theorem 3.4, the 3-rank of $D = L_1^{(1/5)} R$ is
$$3 \cdot \binom{1 + e - 1}{1} \binom{0 + e - 1}{1} \binom{1 + e - 1}{1} + 3 \cdot \binom{2 + e - 1}{1} \binom{2 + e - 1}{1} \binom{0 + e - 1}{1} = 39.$$
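As a numerical cross-check of the formula in Theorem 3.4 and of the value 39 just obtained, the sum can be evaluated by direct enumeration for small parameters. The sketch below is an illustration only: `gmw_p_rank` is a hypothetical helper name, $r$ is passed as an integer residue modulo $q^d - 1$ (assumed coprime to $(q^d-1)/(q-1)$), and the running time grows like $q^d$, so it is only suitable for toy cases.

```python
from math import comb


def base_p_digits(n, p, length):
    """The first `length` base-p digits of n, least significant first."""
    out = []
    for _ in range(length):
        out.append(n % p)
        n //= p
    return out


def gmw_p_rank(p, s, d, e, r):
    """Evaluate the sum of Theorem 3.4 by enumerating all x with 0 < x < q^d - 1."""
    q = p ** s
    modulus = q ** d - 1
    total = 0
    for x in range(1, modulus):
        y = (r * x) % modulus
        y_digits = base_p_digits(y, p, d * s)
        # y_{ij} is the base-p digit of y at position i + s*j (since q = p^s);
        # the condition is sum_j y_{ij} = p - 1 for each i = 0, ..., s - 1.
        if all(sum(y_digits[i + s * j] for j in range(d)) == p - 1 for i in range(s)):
            term = 1
            for x_j in base_p_digits(x, p, d * s):
                term *= comb(x_j + e - 1, e - 1)
            total += term
    return total


if __name__ == "__main__":
    r = pow(5, -1, 3 ** 3 - 1)  # r = 1/5 modulo 26, as in Example 3.5
    print(gmw_p_rank(3, 1, 3, 2, r))
```

Taking $r = 1$ in the same function gives a quick check of the closed form for that special case, since the admissible $x$ are then exactly those whose digits satisfy (7) with $b = p$.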
This agrees with the result in the table on page 86 of [15].

In some special cases, the $p$-rank formula in Theorem 3.4 can be made more explicit.

Corollary 3.6. Let $D = L_1^{(r)} R$, with $r = 1$ (or a power of $p$). Then the $p$-rank of $D$ is $\binom{p + de - 2}{de - 1}^s$.

Proof. Since $r = 1$, we have $rx = x$. By Theorem 3.4, the $p$-rank of $D$ is
$$\sum_x \prod_{j=0}^{ds-1} \binom{x_j + e - 1}{e - 1},$$
where the sum is over all $x = \sum_{j=0}^{ds-1} x_j p^j$, $0 \le x_j \le p - 1$, with
$$x_i + x_{i+s} + x_{i+2s} + \cdots + x_{i+(d-1)s} = p - 1,$$
for $i = 0, 1, \ldots, s - 1$. The above sum is easily seen to be
$$\left[\sum_{\substack{z_0 + \cdots + z_{d-1} = p - 1 \\ z_i \ge 0}} \prod_{i=0}^{d-1} \binom{z_i + e - 1}{e - 1}\right]^s = \binom{p + de - 2}{de - 1}^s.$$

[…] 1. In fact, in what follows we will be interested in the numbers
$$A_k := (2k + 3)^{-1} B_{2k+3} \tag{18}$$
for $k \ge 0$. We will show that the following holds.

Theorem 4.4. For all $s$ and $t$, the numbers $A_k$ satisfy a linear recurrence relation.
Our approach to this counting problem involves the transfer matrix method (see [18], Sect. 4.7). We will first define a weighted digraph $G$ (the $(s, t)$-extended Segre graph) together with a partition of its vertex set $V$ into two sets $V_0$ and $V_1$. Then we will show that there is a 1-1 correspondence between closed walks $C$ of length $d$ in the digraph $G$ containing precisely one vertex from $V_1$ and pairs $(v, x)$ and their cyclic shifts for which (16) and (17) hold. Moreover, the correspondence is such that the product $w(C)$ of the weights of the arcs in $C$ equals $e^{w(v)}$. Once this is achieved, a routine application of the transfer matrix method shows that the generating function
$$A(z) := \sum_{k \ge 0} A_k z^k \tag{19}$$
is rational. Once this is known, we can use Lemma 4.5 from [7] to obtain an explicit recurrence by computing a limited number of the $A_k$'s. We will also discuss a slight variation of the transfer matrix method to help keep the necessary computations to a minimum.

The idea to associate a weighted digraph with our counting problems is based on the observation that certain add/subtract-with-carry algorithms to do (bitwise) arithmetic modulo $2^d - 1$ can be thought of as walking along arcs in an associated graph. Before we explain this, we need a lemma. Let $N_d$ denote the collection of all periodic integer sequences with period $d$. For a sequence $s = s_0, \ldots, s_{d-1}$ from $N_d$, we define $\nu(s) = \sum_{j=0}^{d-1} s_j 2^j$ and $w(s) = \sum_{j=0}^{d-1} s_j$; if the sequence is in fact binary (that is, if all the $s_j$ are 0 or 1), then we abuse notation and write $s$ instead of $\nu(s)$.

Lemma 4.5. Let $s \in N_d$. There exists a sequence $c \in N_d$ such that
$$c_{j-1} + s_j = 2c_j \tag{20}$$
holds for all $j$ if and only if $\nu(s) \equiv 0 \bmod 2^d - 1$. In that case the $c_j$ are unique and satisfy
$$c_{k-1} = \nu_k(s)/(2^d - 1), \tag{21}$$
where $\nu_k(s) = \sum_{j=0}^{d-1} s_{j+k} 2^j$; moreover, $w(c) = w(s)$ and if $L \le s_j \le U$ for all $j$, then

(i) $c_j \le U - 1$ for all $j$ or $c_j = s_j = U$ for all $j$;
(ii) $L + 1 \le c_j$ for all $j$ or $c_j = s_j = L$ for all $j$.

The main part of this lemma has appeared in [9]. For the convenience of the reader, we give a proof here.

Proof. Suppose that (20) holds for all $j$. Multiplying (20) by $2^{j-k}$ and adding the resulting equations for $j = k, \ldots, k + d - 1$, we obtain that
$$\nu_k(s) = 2^d c_{d+k-1} - c_{k-1} = (2^d - 1) c_{k-1}. \tag{22}$$
Since $c_{k-1}$ is an integer, we have that $\nu_k(s) \equiv 0 \bmod 2^d - 1$. This is true for all $k$ since $c$ is an integer sequence; in particular, we have that $\nu_0(s) = \nu(s) \equiv 0 \bmod 2^d - 1$. Conversely, it is easily verified that (20) holds if the $c_j$ satisfy (21). The fact that $w(c) = w(s)$ is a trivial consequence of (20); the bounds on the $c_j$ are easily verified.

Now consider a 4-tuple $(v, x, y, z) \in N_d^4$, where all of $v$, $x$, $y$, and $z$ are binary, for which $y \equiv 5x$, $z \equiv 6x$, $tv \equiv sx$, and $x \not\equiv 0$, where all congruences are modulo $2^d - 1$. According to Lemma 4.5, there are $b, c \in N_d$ such that
$$2b_j + y_j = x_j + x_{j-2} + b_{j-1}, \tag{23}$$
$$2c_j + z_j = x_j + y_j + c_{j-1}. \tag{24}$$
Writing
$$s = \sum_{i \in S} \delta_i 2^i, \qquad t = \sum_{i \in T} \epsilon_i 2^i, \tag{25}$$
by Lemma 4.5, we see that there exists $a \in N_d$ such that
$$2a_j + \sum_{i \in T} \epsilon_i v_{j-i} = \sum_{i \in S} \delta_i x_{j-i} + a_{j-1}. \tag{26}$$
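The carry sequences $b$ and $c$ in (23) and (24) can be computed explicitly from formula (21) of Lemma 4.5. The sketch below is an illustration under stated assumptions: sequences are stored as Python lists indexed modulo $d$ with the least significant bit first, $x$ ranges over $0 < x < 2^d - 1$, a residue $0$ is represented by the all-zero sequence, and the helper names `bits`, `carries` and `check_carries` are hypothetical.

```python
def bits(n, d):
    """Binary digits of n as a list of length d, least significant first."""
    return [(n >> j) & 1 for j in range(d)]


def carries(s):
    """Solve c[j-1] + s[j] = 2*c[j] (indices mod d) using formula (21).

    Requires nu(s) = sum_j s[j] * 2^j to be divisible by 2^d - 1.
    """
    d = len(s)
    modulus = 2 ** d - 1
    c = [0] * d
    for k in range(d):
        nu_k = sum(s[(j + k) % d] * 2 ** j for j in range(d))
        assert nu_k % modulus == 0
        c[(k - 1) % d] = nu_k // modulus
    return c


def check_carries(d):
    """Verify (23), (24) and b_j, c_j in {0, 1} for all 0 < x < 2^d - 1."""
    modulus = 2 ** d - 1
    for x in range(1, modulus):
        xb = bits(x, d)
        yb = bits(5 * x % modulus, d)
        zb = bits(6 * x % modulus, d)
        # (23) rewritten as b[j-1] + (x[j] + x[j-2] - y[j]) = 2*b[j]
        b = carries([xb[j] + xb[(j - 2) % d] - yb[j] for j in range(d)])
        # (24) rewritten as c[j-1] + (x[j] + y[j] - z[j]) = 2*c[j]
        c = carries([xb[j] + yb[j] - zb[j] for j in range(d)])
        for j in range(d):
            assert 2 * b[j] + yb[j] == xb[j] + xb[(j - 2) % d] + b[(j - 1) % d]
            assert 2 * c[j] + zb[j] == xb[j] + yb[j] + c[(j - 1) % d]
        assert set(b) <= {0, 1} and set(c) <= {0, 1}
    return True


if __name__ == "__main__":
    print(check_carries(7))
```

The input sequences here take values in $\{-1, 0, 1, 2\}$, so the bounds (i) and (ii) of Lemma 4.5 with $L = -1$ and $U = 2$ are what force the carries into $\{0, 1\}$ once the all-zero and all-one $x$ are excluded.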
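Note that (23) and (24) define an add-with-carry algorithm modulo $2^d - 1$ for the computation of $y \equiv 5x$ and $z \equiv x + y$. It is easily seen that $b_j, c_j \in \{0, 1\}$. Also, write
$$\delta^+ = \sum_{i,\ \delta_i > 0} \delta_i, \qquad \delta^- = \sum_{i,\ \delta_i < 0} \delta_i, \qquad \epsilon^+ = \sum_{i,\ \epsilon_i > 0} \epsilon_i, \qquad \epsilon^- = \sum_{i,\ \epsilon_i < 0} \epsilon_i$$
[…] 0), then
$$X(\mathbb{Z}_n) \cong X(K_{p_1})^{[\alpha_1]} \times X(K_{p_2})^{[\alpha_2]} \times \cdots \times X(K_{p_\ell})^{[\alpha_\ell]}.$$

Proof. Use Theorem 4.1 and observe that $\mathbb{Z}_n \cong \mathbb{Z}_{p_1^{\alpha_1}} \times \cdots \times \mathbb{Z}_{p_\ell^{\alpha_\ell}}$ implies that $X(\mathbb{Z}_n) \cong X(\mathbb{Z}_{p_1^{\alpha_1}}) \times \cdots \times X(\mathbb{Z}_{p_\ell^{\alpha_\ell}})$. Now the proof follows from Lemma 3.1 and Lemma 4.2.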
Sejeong Bang and Sung-Yell Song
Theorem 4.4. If $n_1, n_2, \ldots, n_k$ are positive integers with prime factorizations $n_i = p_{i1}^{\alpha_{i1}} \cdots p_{i\ell_i}^{\alpha_{i\ell_i}}$, for $i = 1, 2, \ldots, k$, and if $A = \prod_{i=1}^{k} \mathbb{Z}_{n_i}$, the product ring of the $\mathbb{Z}_{n_i}$, then
$$X(A) \cong \prod_{i=1}^{k} \prod_{j=1}^{\ell_i} X(\mathbb{Z}_{p_{ij}})^{[\alpha_{ij}]}.$$

Proof. It is immediate from Theorem 4.1 and Corollary 4.3.

Acknowledgement

We are grateful to our colleague Mitsugu Hirasaka for his valuable suggestions. The notion of the internal direct product in Theorem 3.2 is due to his suggestion. We also thank the referee for suggesting the name 'maximal rational circulant' for the class of schemes investigated here, by pointing out the fact that the corresponding S-rings are maximal among rational circulant S-rings.

References

[1] E. Bannai and T. Ito, Algebraic Combinatorics I: Association Schemes, Benjamin/Cummings, Menlo Park, CA, 1984.
[2] W. G. Bridges and A. R. Mena, Rational circulants with rational spectra and cyclic strongly regular graphs, Ars Combin. 8 (1979), 143–161.
[3] I. A. Faradzev, M. H. Klin, and M. E. Muzichuk, Cellular rings and groups of automorphisms of graphs, in: Investigations in Algebraic Theory of Combinatorical Objects (I. A. Faradzev, A. A. Ivanov, M. H. Klin and A. J. Woldar, eds.), Kluwer, Dordrecht 1992, 1–152.
[4] P. A. Ferguson and A. Turull, Algebraic Decompositions of Commutative Association Schemes, J. Algebra 96 (1985), 211–229.
[5] M. E. Muzychuk, The structure of rational Schur rings over cyclic groups, European J. Combin. 14 (1993), 479–490.
[6] Sung Y. Song, Fusion Relation in Products of Association Schemes, Graphs and Combinatorics, to appear.
[7] H. M. Sun, PBID designs and association schemes from rings with unit, preprint.
[8] H. Wielandt, Finite Permutation Groups, Academic Press, 1974.
[9] P.-H. Zieschang, An Algebraic Approach to Association Schemes, Lecture Notes in Math. 1628, Springer-Verlag, Berlin 1996.

S. Bang
Pohang University of Science and Technology
Pohang, 790-784, Korea

S.-Y. Song
Iowa State University
Ames, Iowa 50011, U.S.A.
Face-regular polyhedra and tilings with two combinatorial types of faces

Michel Deza

Abstract. We present in detail the list of all 71 face-regular bifaced polyhedra, found in [BD00], and give for them symmetry groups and constructions as decorations. All 41 2-isohedral bifaced polyhedra among those 71 are identified, as well as all polyhedra $P$ such that the 1-skeleton of $P$ or $P^*$ embeds isometrically into the 1-skeleton of a hypercube $H_m$ or of a half-hypercube $\frac{1}{2}H_m$. The list of all face-regular bifaced tilings of the Euclidean plane is also presented and compared with the list of all 39 2-homeohedral types of such tilings of [GLST85]; there are 33 realisable sets of parameters and a continuum of face-regular tilings for 11 of them. All those tilings are decorations of the regular tilings $(6^3)$, $(4^4)$ or their truncations.

2000 Mathematics Subject Classification: primary 52B10; secondary 05B45
1. Introduction

We consider here face-regular bifaced polyhedra, i.e., $k$-valent convex polyhedra with only $a$- and $b$-gonal faces ($3 \le a < b$), such that each $a$-gonal (respectively, $b$-gonal) face has the same number $t_a$ (respectively, $t_b$) of $a$-gonal (respectively, $b$-gonal) neighbors. For a given polyhedron $P$, let $v$, $p_a$, $p_b$, $P^*$, $\mathrm{Aut}\, P$ denote the number of vertices, the number of $a$-gonal faces, the number of $b$-gonal faces, the dual polyhedron and the group of symmetry. See [Grün67], [Joh66], for example, for the notions on polyhedra.

It was proved in [BD00] that there are 3 infinite families of face-regular bifaced polyhedra: prisms $\mathrm{Prism}_b$ ($b \ge 5$), anti-prisms $\mathrm{APrism}_b$ ($b \ge 4$) and barrels $\mathrm{Barrel}_b$ ($b \ge 6$), i.e. simple $4b$-vertex polyhedra with two $b$-gonal faces, separated by two $b$-rings of 5-gons; clearly, $\mathrm{Barrel}_b$ is a decoration of $\mathrm{Prism}_b$. Besides these, there are 68 sporadic examples. The results are presented in Table 1 (illustrated by drawings of polyhedra in Figure 2) for the 71 polyhedra, and in Table 2 for tilings.
2. 71 polyhedra: constructions and decorations

The 3 infinite families are represented as Nos. 15, 61 and 44 in Table 1, by their smallest members. Nos. 2, 18 of Table 1 can be seen as $\mathrm{Barrel}_3$ (the cube, truncated on two opposite vertices, is also called the Dürer octahedron) and $\mathrm{Barrel}_4$. $\mathrm{Barrel}_5$, $\mathrm{Prism}_4$, $\mathrm{APrism}_3$ are Platonic polyhedra. $\mathrm{Prism}_3$ is given as No. 1 and not as a case in No. 15 of Table 1, because it has a different $(a, b)$-value and its dual is embeddable.

The following operations (decorations) are used for the 71 polyhedra (and, in the last two sections, for tilings):

m-cap: the m-cap of a polyhedron is obtained by putting a pyramid on all m-gonal faces (it is the dual of the truncation on all vertices of degree m);
4-triakon: the 4-triakon of a polyhedron is obtained by partitioning each triangle into a ring of 3 4-gons, by putting a vertex in the middle and connecting it to the midpoint of every edge on the boundary;
5-triakon: a 5-triakon of a polyhedron is obtained by partitioning each hexagon into a ring of 3 pentagons, by putting a vertex in the middle and connecting it to the midpoint of every second edge on the boundary;
m-halving: given an even number m, an m-halving of a polyhedron is obtained by putting a new edge, connecting the mid-points of opposite edges, on each m-gon.

Nos. 50, 58 of Table 1 give examples of two different face-regular bifaced polyhedra, both coming as a 5-triakon of No. 22 (truncated octahedron). Also Nos. 52, 59 are both derived from No. 50 as different decorations of 6 of its hexagons. Nos. 35, 45 are derived from No. 3 (truncated tetrahedron) as a 4- and 5-triakon, respectively. Nos. 3, 45, 48 arise as consecutive 5-triakons; Nos. 23, 46, 56 come by consecutive halving of some hexagons.

The list of 71 polyhedra consists of (see the 8th column of Table 1 for details):

1) 10 semi-regular ones (truncations of all 5 Platonic solids, both quasi-regular (cuboctahedron and icosidodecahedron) and both infinite families ($\mathrm{Prism}_3$, $\mathrm{Prism}_b$ ($b \ge 5$) and $\mathrm{APrism}_b$ ($b \ge 4$)));
2) 13 polyhedra obtained as the duals of the b-cap of the above 10 polyhedra and of 3 twisted forms (of both quasi-regular ones and of rhombicuboctahedron);
3) 10 partial truncations of the cube (2- (on opposite vertices), 6- (all but two opposite vertices), 4- (4 non-adjacent vertices), 4- (4 vertices of 2 opposite edges)) and of the dodecahedron (4- (on 4 vertices with pairwise distance 3), 16- (all but the above 4 vertices), 8- (8 vertices with pairwise distance at least 2), 12- (all but the above 8 vertices), 8- (2 opposite vertices $s$, $s'$ and 6 vertices of 3 edges, on geodesics $(s, s')$, with pairwise distance 3 between edges), 12- (all but the above 8 vertices));
4) 14 4-triakons of Nos. 1–14;
5) b-cap of a 1-, 2- (on 2 opposite vertices), 4- (all but 2 opposite vertices), 6- (i.e. fully) truncated octahedron;
6) Nos. 21, 27, 28, 56 are m-halvings of Nos. 17, 24, 25, 46, respectively; Nos. 46, 49, 52 are partial (on some 3, 4, 6 hexagons) 6-halvings of Nos. 23, 48, 50; No. 16 (dual of the disphenoid) comes as a partial (on two opposite faces) 4-halving of the cube;
7) Nos. 48, 50 are 5-triakons of Nos. 45, 22 (Nos. 45, 48 and 60 can be obtained also as 5-triakons of truncated Platonic polyhedra, having hexagons);
8) Nos. 67, 68, 59 come as decorations of the truncated tetrahedron, the truncated cube and the truncated octahedron;
9) Nos. 64, 65 are obtained by putting (in two different ways) diagonals on 4 4-gons of the dual cuboctahedron, so that those diagonals cover all 8 vertices of degree 3;
10) Nos. 44, 47, 57 come as decorations of Nos. 15, 44 (in the smallest case $b = 6$), and 47 (Nos. 47, 57, 62 are organized into alternated concentric rings of $a$- and $b$-gons);
11) No. 23 is a decoration of the cube; No. 34 can be seen as a decoration of No. 20, or of No. 1, or of the cube. No. 34 comes from No. 20 by putting an "H" on all of its 6-gonal faces and a quadrangle on all of its 4-gonal faces; No. 59 comes from No. 22 by putting an "H" on all of its 4-gonal faces and a 5-triakon on all of its 6-gonal faces.

Nos. $16^*$, $17^*$, $18^*$, 62 are polyhedra with regular faces, and they occur as numbers 84, 51, 17, 15 in the list of all 92 such polyhedra found in [Joh66]; Nos. $1^*$, $16^*$, $17^*$, $18^*$ and $(\mathrm{Prism}_5)^*$ (the smallest case of $15^*$) are all 5 non-Platonic convex deltahedra. Nos. 5, 25, 54 are twisted forms of Nos. 4, 24, 53, respectively; the last 3 are the chambered Platonic tetrahedron, cube, and dodecahedron, respectively (i.e., they are obtained by putting a prism on each face, followed by deleting all original edges).
3. The symmetry groups and the 41 2-isohedral bifaced polyhedra

All bifaced face-regular polyhedra have symmetry groups $D_{nh}$, $D_{nd}$ (for $n \ge 2$) or as below (see the 6th column of Table 1 for details):

Aut P:                              C3v  D3   T    Th   Td   O    Oh   I    Ih
Number of polyhedra:                2    7    6    4    9    2    8    1    6
Number of 2-isohedral polyhedra:    0    1    3    2    5    2    7    1    5

The remaining 23 sporadic bifaced face-regular polyhedra have the following symmetry groups:

Aut P:                              D2h  D2d  D3h  D3d  D4h  D4d  D5h
Number of polyhedra:                2    3    7    4    4    2    1
Number of 2-isohedral polyhedra:    0    1    3    3    4    1    0
Using the Euler formula it is easy to check that any $k$-valent bifaced $(a, b)$-polyhedron fulfills
$$p_a = \frac{4b - v(2k + 2b - kb)}{2b - 2a} \quad \text{and} \quad p_b = \frac{v(2k + 2a - ka) - 4a}{2b - 2a}.$$
By face-regularity,
$$\frac{p_a}{p_b} = \frac{b - t_b}{a - t_a},$$
implying
$$\frac{\left(k - 2 + \frac{4}{v}\right)b - 2k}{2k - \left(k - 2 + \frac{4}{v}\right)a} = \frac{b - t_b}{a - t_a},$$
i.e.
$$v = \frac{4(2ab - at_b - bt_a)}{2k(a + b - t_a - t_b) - (k - 2)(2ab - at_b - bt_a)}.$$
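The three formulas above are easy to evaluate mechanically. The following sketch (an illustration, with the hypothetical function name `bifaced_counts`) computes $v$, $p_a$ and $p_b$ from the parameters $(k, a, b, t_a, t_b)$ using exact rational arithmetic, so that non-integral results, which cannot correspond to a polyhedron, are visible as fractions. A vanishing denominator in the formula for $v$ raises `ZeroDivisionError`; no attempt is made here to interpret that case.

```python
from fractions import Fraction


def bifaced_counts(k, a, b, t_a, t_b):
    """Return (v, p_a, p_b) for a k-valent face-regular bifaced (a, b)-polyhedron.

    v is obtained from the face-regularity relation, and p_a, p_b from the
    Euler-formula identities; all values are exact Fractions.
    """
    mixed = 2 * a * b - a * t_b - b * t_a
    v = Fraction(4 * mixed, 2 * k * (a + b - t_a - t_b) - (k - 2) * mixed)
    p_a = (4 * b - v * (2 * k + 2 * b - k * b)) / (2 * b - 2 * a)
    p_b = (v * (2 * k + 2 * a - k * a) - 4 * a) / (2 * b - 2 * a)
    return v, p_a, p_b


if __name__ == "__main__":
    # Truncated icosahedron: 3-valent, 5- and 6-gons, t_5 = 0, t_6 = 3.
    print(bifaced_counts(3, 5, 6, 0, 3))
    # Prism_5: 3-valent, 4- and 5-gons, t_4 = 2, t_5 = 0.
    print(bifaced_counts(3, 4, 5, 2, 0))
```

Both examples return integer values, and in each case $p_a (a - t_a) = p_b (b - t_b)$, which is the double count of edges between an $a$-gon and a $b$-gon behind the face-regularity relation.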
A polyhedron $P$ is called 2-isohedral if its symmetry group $\mathrm{Aut}\, P$ has exactly two orbits of faces. Clearly, any bifaced 2-isohedral polyhedron $P$ should be face-regular. Moreover, it should satisfy the conditions:

(i) $p_a$, $p_b$ divide the order of $\mathrm{Aut}(P)$,
(ii) any face has the same 1-corona, that is, the same sequence of gonality (i.e. the number of edges) of its neighboring faces.

We found all 2-isohedral polyhedra in our list of 71 polyhedra (see the 7th column of Table 1 for details). They are the three infinite families and 38 sporadic ones. Only Nos. 25, 46, 52, 65 (marked by the sign ! in the 7th column of Table 1) among the remaining 30 polyhedra (i.e. of those having more than 2 orbits of faces) satisfy (i); their pairs of numbers of orbits of $a$- and $b$-gons are (1,2), (2,1), (1,2), (2,2), respectively. For example, No. 30 satisfies (ii), but it has two orbits (of sizes 6 and 12) of 4-gonal faces, which differ only on the 2-corona (see Figure 1).

Comparing with the partition of the list of 71 polyhedra into groups 1)–11), one can see that the list of 41 2-isohedral bifaced polyhedra consists of:

1′) all 10 polyhedra in 1),
2′) all but 4 polyhedra in 2), i.e. all but the dual b-cap of the rhombicuboctahedron and of the 3 twisted Archimedean polyhedra,
3′) 5 out of the 10 polyhedra in 3),
4′) 6 out of the 14 polyhedra in 4) (the 4-triakon of the truncated tetrahedron, of the truncated cube, of $\mathrm{Prism}_3$ and of the 3 polyhedra of 3′) above),
Face-regular polyhedra and tilings with two combinatorial types of faces
53
5 ) all 4 polyhedra in 5), 6 ) Nos. 16, 21, 27 (4-halvings) out of the 8 polyhedra in 6), 7 ) Nos. 44, 48, 64, 68 out of the 12 polyhedra in 7)–11).
Figure 1. Two 2-coronas of 4-gons of the polyhedron No. 30
Remarks. (i) It is easy to see that the skeletons of all 71 polyhedra are Hamiltonian. (ii) The duals of all 5 non-Platonic convex deltahedra, as well as all 3 infinite families – Nos. 15, 44, 61 – are 2-isohedral. (iii) Among the largest 3-valent (Nos. 43, 55, 60 with v = 140) and 4-valent (Nos. 68, 70, 71 with v = 30) polyhedra of our list of 71, only No. 43 is not 2-isohedral. (iv) Nos. 9, 13, 28, 33, 34, 36, 38, 43, 47, 54, 57, 65, for example, have more than 3 orbits of faces.
4. Embedding

The fact that the skeleton of any polyhedron is a planar graph implies that if it can be embedded into the skeleton of a hypercube then, using a result from [CDGr97], the embedding can be chosen isometric up to a scale 1 or 2; i.e., an isometric embedding exists into a hypercube or a half-hypercube. An embeddable graph embeds into a hypercube if and only if it is bipartite, i.e. in the case of bifaced (a, b)-polyhedra, if and only if both a and b are even. Recall that the half-hypercube $\frac{1}{2}H_m$ is (in coordinate terms) the set of all binary m-tuples with an even number of ones, two of them being adjacent if their Hamming distance is 2. The notation $P \to H_m$ (or $P \to \frac{1}{2}H_m$) means that the skeleton of the polyhedron P embeds isometrically into the m-cube (or m-half-cube); see the last two columns of Table 1 for details. Among polyhedra of the 3 infinite families $\mathrm{Prism}_b$ (b ≥ 5), $\mathrm{Barrel}_b$ (b ≥ 6), $\mathrm{APrism}_b$ (b ≥ 4) and their duals, all embeddings are:
$\mathrm{Prism}_b \to \frac{1}{2}H_{b+2}$ (moreover, $\mathrm{Prism}_b \to H_{b/2+1}$ for even b), $\mathrm{APrism}_b \to \frac{1}{2}H_{b+1}$.

Among the 71 polyhedra P, only Nos. 1, 2, 62, 66 have both P and P* embeddable. Exactly 9 polyhedra P have only P embeddable, and exactly 17 polyhedra P have only P* embeddable. All 14 polyhedra with (k, a) = (3, 3) (but none of the other polyhedra with k = 3) have embeddable P*. All 108 non-embeddable P, P* (except the hypermetric 17* and 18*, marked by the sign ! in the last column of Table 1) already fail the 5-gonal inequality, which is necessary for embedding (see [DGr97], [DGr99], [DL97], [DS96] for related notions and results).

All embeddings of P, P* among the 68 sporadic polyhedra P are:
Nos. 62*, 63* → $H_4$; No. 66* → $H_5$; Nos. 22, 70* → $H_6$; Nos. 24, 25, 69* → $H_7$;
No. 1* → $\frac{1}{2}H_4$; No. 1 → $\frac{1}{2}H_5$; Nos. 2*, 62 → $\frac{1}{2}H_6$; Nos. 3*, 45* → $\frac{1}{2}H_7$;
Nos. 2, 4*, 5*, 16, 66 → $\frac{1}{2}H_8$; Nos. 6*, 7*, 51* → $\frac{1}{2}H_{10}$; Nos. 10*, 71 → $\frac{1}{2}H_{12}$;
Nos. 8*, 9* → $\frac{1}{2}H_{14}$; No. 48 → $\frac{1}{2}H_{16}$; Nos. 11*, 12* → $\frac{1}{2}H_{18}$;
Nos. 13*, 53 → $\frac{1}{2}H_{22}$; No. 14* → $\frac{1}{2}H_{26}$.
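The embedding of Prism_b into the hypercube of dimension b/2 + 1 for even b, stated above, can be verified by machine for small b. The sketch below assumes one particular 0/1 labelling (the function `cube_label` is an illustrative choice, not taken from the paper): a vertex of the b-cycle is sent to a "sliding window" of ones of length b/2, and one extra coordinate records which of the two b-gons the vertex lies on. Graph distances are computed by breadth-first search and compared with Hamming distances.

```python
from collections import deque

def prism_graph(b):
    """Adjacency sets of Prism_b: two b-cycles joined by a perfect matching."""
    adj = {(layer, i): set() for layer in (0, 1) for i in range(b)}
    for layer in (0, 1):
        for i in range(b):
            j = (i + 1) % b
            adj[(layer, i)].add((layer, j))
            adj[(layer, j)].add((layer, i))
    for i in range(b):
        adj[(0, i)].add((1, i))
        adj[(1, i)].add((0, i))
    return adj

def bfs_distances(adj, source):
    """Shortest-path distances from source to every vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def cube_label(b, layer, i):
    """Assumed labelling of Prism_b (b even) by binary tuples of length b/2 + 1."""
    n = b // 2
    if i <= n:
        cycle_part = [1] * i + [0] * (n - i)
    else:
        cycle_part = [0] * (i - n) + [1] * (2 * n - i)
    return tuple(cycle_part + [layer])

def hamming(x, y):
    return sum(s != t for s, t in zip(x, y))

def is_isometric_into_cube(b):
    """True if the labelling preserves all pairwise distances of Prism_b."""
    adj = prism_graph(b)
    for u in adj:
        dist = bfs_distances(adj, u)
        for w in adj:
            if dist[w] != hamming(cube_label(b, *u), cube_label(b, *w)):
                return False
    return True

for b in (6, 8, 10):
    print(b, is_isometric_into_cube(b))
```

For odd b the skeleton is not bipartite, so only the half-cube embedding is available; the sketch does not cover that case.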
Bifaced polyhedra with a weaker notion of face-regularity, in which only the c-gonal faces (where c is a or b) have the same number $t_c$ of c-gonal neighbors, were considered in [DGr01] and [DGr99]. For example, all fullerenes (i.e. simple polyhedra with (a, b) = (5, 6)) with fixed $t_6$ or fixed $t_5 \ge 2$ were found. Simple bifaced polyhedra such that the c-gonal faces form a ring (so $t_c = 2$) were considered in [DGr02]. Face-regular polyhedra with more than two types of faces were considered in [BD00].

No. k  a,b   v    ta,tb  Aut  Polyhedron P              emb. P      emb. P*
1   3  3,4   6    0,2    D3h  Prism3                    1/2H5       1/2H4
2   3  3,5   12   0,4    D3d  2-truncated cube          1/2H8       1/2H6
3   3  3,6   12   0,3    Td   trunc.tetrahedron         -           1/2H7
4   3  3,6   16   0,4    Td   4-truncated cube          -           1/2H8
5   3  3,6   16   0,4    D2h  twisted No. 4             -           1/2H8
6   3  3,6   28   0,5    T    4-truncated dodec.        -           1/2H10
7   3  3,7   20   0,4    D3d  6-truncated cube          -           1/2H10
8   3  3,7   36   0,5    Th   8-truncated dodec.        -           1/2H14
9   3  3,7   36   0,5    D3   twisted No. 8             -           1/2H14
10  3  3,8   24   0,4    Oh   truncated cube            -           1/2H12
11  3  3,8   44   0,5    Th   12-truncated dodec.       -           1/2H18
12  3  3,8   44   0,5    D3   twisted No. 11            -           1/2H18
13  3  3,9   52   0,5    T    16-truncated dodec.       -           1/2H22
14  3  3,10  60   0,5    Ih   trunc.dodecahedron        -           1/2H26
15  3  4,b   2b   2,0    Dbd  Prism_b, b ≥ 5            1/2H_{b+2}  -
16  3  4,5   12   1,2    D2d  decorated cube            1/2H8       -
17  3  4,5   14   0,3    D3h  (b-cap Prism3)*           -           -!
18  3  4,5   16   0,4    D4d  (b-cap APrism4)*          -           -!
19  3  4,6   14   2,2    D3h  4-triakon No. 1           -           -
20  3  4,6   20   2,4    D3d  4-triakon No. 2           -           -
21  3  4,6   20   1,3    D3   4-halved No. 17           -           -
22  3  4,6   24   0,3    Oh   trunc.octahedron          H6          -
23  3  4,6   26   1,4    D3h  decorated cube            -           -
24  3  4,6   32   0,4    Oh   (b-cap cuboct.)*          H7          -
25  3  4,6   32   0,4    D3h  (b-cap tw.cuboct.)*       H7          -
26  3  4,6   56   0,5    O    (b-cap snub cube)*        -           -
27  3  4,7   44   1,4    Th   4-halved No. 24           -           -
28  3  4,7   44   1,4    D3   4-halved No. 25           -           -
29  3  4,7   44   2,5    T    4-triakon No. 6           -           -
30  3  4,7   80   0,4    Oh   (b-cap rhombicbct.)*      -           -
31  3  4,7   80   0,4    D4d  (b-cap tw.rhombicbct.)*   -           -
32  3  4,8   32   2,4    Td   4-triakon No. 4           -           -
33  3  4,8   32   2,4    D2h  4-triakon No. 5           -           -
34  3  4,8   80   1,4    D3   decorated No. 20          -           -
35  3  4,9   28   2,6    Td   4-triakon No. 3           -           -
36  3  4,9   68   2,4    Th   4-triakon No. 8           -           -
37  3  4,9   68   2,4    D3   4-triakon No. 9           -           -
38  3  4,10  44   2,6    D3d  4-triakon No. 7           -           -
39  3  4,11  92   2,6    Th   4-triakon No. 11          -           -
40  3  4,11  92   2,6    D3   4-triakon No. 12          -           -
41  3  4,12  56   2,8    Oh   4-triakon No. 10          -           -
42  3  4,13  116  2,8    T    4-triakon No. 13          -           -
43  3  4,15  140  2,10   Ih   4-triakon No. 14          -           -
44  3  5,b   4b   4,0    Dbd  Barrel_b, b ≥ 6           -           -
45  3  5,6   28   3,0    Td   (b-cap trunc. tetr.)*     -           1/2H7
46  3  5,6   32   3,2    D3h  decorated No. 23          -           -
47  3  5,6   38   2,2    C3v  decorated Barrel6         -           -
48  3  5,6   44   2,3    T    5-triakon No. 45          1/2H16      -
49  3  5,6   52   1,3    T    decorated No. 48          -           -
50  3  5,6   56   2,4    Td   5-triakon No. 22          -           -
51  3  5,6   60   0,3    Ih   trunc.icosahedron         -           1/2H10
52  3  5,6   68   1,4    Td   decorated No. 50          -           -
53  3  5,6   80   0,4    Ih   (b-cap icosido.)*         1/2H22      -
54  3  5,6   80   0,4    D5h  (b-cap tw.icosido.)*      -           -
55  3  5,6   140  0,5    I    (b-cap snub dodec.)*      -           -
56  3  5,7   44   3,1    D3h  6-halved No. 46           -           -
57  3  5,7   92   2,2    C3v  decorated No. 47          -           -
58  3  5,8   56   3,0    Oh   (b-cap trunc.cube)*       -           -
59  3  5,8   92   3,2    Td   decorated trunc.oct.      -           -
60  3  5,10  140  3,0    Ih   (b-cap trunc.dodec.)*     -           -
61  4  3,b   2b   2,0    Dbd  APrism_b, b ≥ 4           1/2H_{b+1}  -
62  4  3,4   10   2,2    D4h  capp. 1-trunc.oct.        1/2H6       H4
63  4  3,4   12   0,0    Oh   cuboctahedron             -           H4
64  4  3,4   14   1,2    D4h  decorated (cuboct.)*      -           -
65  4  3,4   14   1,2    D2d  decorated (cuboct.)*      -           -
66  4  3,4   14   2,3    D4h  capp. 2-trunc.oct.        1/2H8       H5
67  4  3,4   22   1,3    D2d  decorated trunc.tetr.     -           -
68  4  3,4   30   0,3    O    decorated trunc.cube      -           -
69  4  3,5   22   2,3    D4h  4-cap 4-trunc.oct.        -           H7
70  4  3,5   30   0,0    Ih   icosidodecahedron         -           H6
71  4  3,6   30   2,3    Oh   4-cap trunc.oct.          1/2H12      -

2 orbits (7th column; marks in row order, unmarked rows omitted): Nos. 1–21: + + + + + + + + + + + + + + + +; Nos. 22–71: + + -! + + + + + + + + -! + + -! + + + + + + + + -! + + + + +

Table 1. All face-regular bifaced polyhedra
Figure 2. The 71 face-regular bifaced polyhedra (drawings Nos. 1–71)
5. Infinite face-regular bifaced polyhedra

We present now all k-valent infinite (a, b)-polyhedra (i.e. tilings of the Euclidean plane) which are face-regular. Of course, our k-valent tiling of the Euclidean plane by a- and b-gons only should be normal and balanced, in the sense of [GLST85], [GS85]. So the limits $v = \lim \frac{v(r,P)}{t(r,P)}$ and $e = \lim \frac{e(r,P)}{t(r,P)}$ for $r \to \infty$ exist (and are finite), and Euler's relation for tilings, v − e + 1 = 0, holds. Here v(r, P), e(r, P), t(r, P) denote the numbers of vertices, edges, tiles, respectively, in the patch of tiles corresponding to the circular disk in the plane with center P and radius r. This Euler formula implies that any k-valent tiling by a- and b-gons only fulfills (cf. the formulas in Section 3)

$$\frac{p_a}{p_b} = \frac{(k - 2)b - 2k}{2k - (k - 2)a}$$

and, by face-regularity, $\frac{p_a}{p_b} = \frac{b - t_b}{a - t_a}$. So, we have the equation (in fact, this is equation (4.6.11) in [GS85], given there for k-valent 2-homeohedral tilings)

$$\frac{b - t_b}{a - t_a} = \frac{(k - 2)b - 2k}{2k - (k - 2)a}$$

(it is the case $v \to \infty$ of the corresponding equation from Section 3).

Following [GLST85] and [GS85], call a tiling 2-homeohedral (or topologically 2-tile-transitive) if the faces form two transitivity orbits under the group of combinatorial (or topological) self-transformations of the Euclidean plane that map the tiling onto itself. Such a tiling is called, moreover, 2-isohedral if its symmetry group is isomorphic to the group of combinatorial self-transformations. The list of all 39 2-homeohedral types of k-valent tilings is given in [GLST85], pages 135-136, and (with some errors, corrected in the Remark below) in [GS85], Figure 4.6.3; see also the pioneering thesis [L68] and the extension of the results in [DHZ90]. Each of the 39 types is represented there by a 2-isohedral tiling. (The tilings with two orbits of edges were considered in [GS83]; they have at most three orbits of tiles and vertices each.)

Remarks.
(i) Grünbaum ([Gr99]) indicated that Figure 4.6.3 in [GS85] (the list of 39 tilings) contains the following errors, with respect to the corrected list given in [GLST85]:
a) the diagrams 43 83 I and 43 83 III have the same 2-homeohedral type, and so do the diagrams 53 86 I and 53 86 II; so only one of the two should be included;
b) two 2-homeohedral types (32 52 10 and 43 83 1, in the notation of [GLST85]) should be added;
c) on page 188, the second diagram in the second row and the left-most diagram in the bottom row should be interchanged.
(ii) Grünbaum ([Gr99]) also asked for the enumeration of non-convex face-regular polyhedra (thus giving up the 3-connectedness, or the requirement that the polyhedra be of genus 0, or both) and also of those in which the two kinds of faces differ only by their "color" and, possibly, by the number of neighbors of the same color.
(iii) One can show that the p-vector of any k-valent tiling of the Euclidean plane which does not contain i-gons with $i > i_0$ satisfies $3p_3 + 2p_4 + p_5 \le 6$ for $(k, i_0) = (3, 6)$ and $p_3 \le 4$ for $(k, i_0) = (4, 4)$.
6. The list of face-regular bifaced tilings

The main result of this section is the list of face-regular bifaced tilings given in Table 2. We hope that this list is complete, but (similarly to what [GLST85] say on page 132 about their enumeration of the normal 2-homeohedral tilings) we cannot be certain, because of the large number of cases to be considered, that some possibilities have not been overlooked.

A sketch of the proof follows. First, one should prove that the 33 parameter sets $(k; a, b; t_a, t_b)$ of Table 2 are the only realizable ones. The main tool is the equation above; it gives a finite list of possible parameter sets. Further, one can remove many admissible parameter sets by combinatorial and geometric ad hoc considerations (usually, in terms of the possible 1- and 2-coronas of faces). Such a messy, case-by-case enumeration was used in [GLST85], [GS85] (and illustrated there by examples) in order to obtain the list of 2-homeohedral tilings; see also the Appendix in [DFSV00]. For example, for k = 3, a = 5, all possibilities for the parameters $(b; t_a, t_b)$ are (7; 1, 3), (7; 2, 4), (8; 2, 2), (12; 3, 0) (all realized by seven 2-isohedral tilings; see Figure 3) and, possibly, $t_a = 3$, $t_b = 12 - b$ with 8 ≤ b ≤ 11 (the cases b = 8, 10, 11 will occur; see Figure 6).

The second part is to classify the combinatorial types of all face-regular tilings for each of the realizable 33 parameter sets. All non-existence and unicity results come by the same ad hoc considerations as above. We list now all existing tilings. All 39 2-homeohedral types (except No. 12, which is simply the 4-triakon of the Archimedean (3.12^2)) are represented in Figures 4 and 5. Now we will list all others: first the sporadic ones and then, in detail, 11 continuums.

Nos. 4', 4'', 4''', 5 are truncations of (6^3) on all vertices which were not truncated in Nos. 2', 2'', 2''', 1, respectively. Now we consider the tilings Nos. 2', 2'', 2''' in Figure 6. No. 2' is 2-homeohedral; No. 2'' comes from (6^3) by the truncation of pairs of vertices of edges from the non-extendable set of edges with minimal distance 3; No. 2''' comes from (6^3) by taking each third zone of 6-gons and truncating a pair of opposite vertices of each 6-gon of the zone (on the diameter perpendicular to the direction of the zone). Nos. 7–12 are simply 4-triakons (see the definition in Section 2) of Nos. 1–6, respectively.
Next, we consider the tilings Nos. 20–22 in Figure 6. No. 20 comes from (6^3) by decoration, by the letter H, of each 6-gon of each third zone of 6-gons; each of the two vertical lines of H goes in the direction perpendicular to the direction of the zone and connects the midpoints of 2 edges. No. 21 comes from (6^3) by the same decoration of each 6-gon of each second zone of 6-gons. No. 22 comes from (6^3) by the following decoration of each zone of 6-gons: 2 non-decorated 6-gons are followed by the 5-triakon of a 6-gon, the above H-decorated 6-gon, the 5-triakon of a 6-gon, 2 non-decorated 6-gons and so on; the decorations are shifted by two 6-gons on each second zone.

Finally, we describe the 11 continuums. Each continuum of such tilings is represented by all infinite (in both directions) words over the alphabet {u, v}, spanned between the words (u)^∞ := ...uuuu... and (uv)^∞ := ...uvuv... (cf. the packings f.c.c. and h.c.p. within the continuum of Kelvin partitions of the 3-space by regular tetrahedra and octahedra). For example, the description of the continuum No. 17 comes by the following steps.
1) Only two types of 2-corona are possible.
2) Each of those two motifs should propagate in both directions on the tiling.
3) Denote the resulting two sequences by the letters u, v and see the tiling as an infinite word over u, v.

For each of the 11 parameter sets No. i realized by a continuum, we denote by No. iA and No. iB the tilings corresponding to the words (u)^∞ and (uv)^∞. All such tilings which are not 2-homeohedral are Nos. 13B, 18B, 26A and 26B (for example, No. 18B has 4 orbits of faces). Denote by No. ic the unique tiling outside of the continuum, if it exists; those are Nos. 13c, 15c, 18c, 29c. Each of the 11 continuums can be visualized using the definition of the letters u, v below and Figure 5, representing the extremal cases (u)^∞, (uv)^∞.

No. 13 (except No. 13c): for each zone of 8-gons in (4.8^2), each 8-gon is cut in half by a new edge, all new edges being parallel. The letters u, v correspond to zones in which the direction of cutting edges was SW-NE or NW-SE. This is a continuum of different metric realizations of the same topological type; only 13A and 13B are 2-isohedral. Apropos, No. 13c corresponds to cutting in half all "black" 8-gons of (4.8^2) (in a chess-board coloring of all 8-gons) by an edge N-S (North-South) and all "white" 8-gons by an edge W-E (West-East).

No. 15 is nothing but cutting in half each 4-gon of each sequence of 4-gons (alternated by linking edges) in any fixed tiling No. 13. The cutting edges for any sequence are in the same direction; the choice of the direction does not affect the combinatorial type.

No. 16 comes from (4.8^2) by partitioning it into parallel zones of 8-gons and decorating each sequence of 4-gons (between two neighboring zones of 8-gons). Namely, each 4-gon is cut in half and all cutting edges for a fixed sequence are parallel, in one of two mutually perpendicular directions. The letters u, v correspond each to one of those two directions.

No. 29 (except No. 29c, which is a decoration of (4^4)) comes from (4.8^2) by partitioning it into "mixed zones" of 4-gons alternated by 8-gons. On each 8-gon of each mixed zone, put a diagonal connecting vertices of 4-gons, all diagonals of a fixed
mixed zone in the same direction. There are two choices, and let them correspond to the letters u, v. So, the tiling will be partitioned into decorated (u or v) mixed zones. Then put a diagonal in each 4-gon, so that the valency of each vertex becomes 4.

No. 24: each second vertex of each second column of vertices in (4^4) is truncated and then all resulting 4-gons are capped. Now shift some columns (with truncated vertices) by one step; the letters u, v correspond to the choice "shift or not".

No. 26: it is a "complement" of No. 24 in the sense that in (4^4) those vertices are truncated which were not truncated in No. 24. For each truncated vertex, the resulting 4-gon is capped.

No. 25: a tiling is defined by the translation of a path of vertices (in king's move on Z^2, i.e. in the l∞-metric on Z^2). The letters u, v correspond to steps by a side (of a 4-gon) and by a diagonal; then all vertices of every second translation of the path are truncated and all resulting 4-gons are capped.

No. 33: a tiling is defined by a path of 4-gons in (4^4). The letters u, v correspond to the moves right and up; then all 4-gons of every second translation of the path are decorated by diagonals, all in the same direction (the choice of the direction does not affect the combinatorial type; cf. No. 15).

The continuums Nos. 17–19 are described in [DFSV00]; see also Figure 3. A tiling No. 19 is defined by the translation of a path of pairs of 5-gons. Each such path can be seen as a word in the letters u, v, where those letters correspond, respectively, to one of the two ways of adjacency of pairs. A tiling No. 17 is defined by the translation of a sequence of pairs of 5-gons, alternated by pairs of 7-gons. All edges of a sequence that separate two 5-gons are supposed to be parallel; the letters u, v correspond, as for No. 16, to two mutually perpendicular directions.

Remark that for Nos. 15, 19, 29, and 33 the continuum can be defined over 3 letters; it introduces new symmetries and new metric realizations, but does not affect the combinatorial type.

No. k  a,b   ta,tb  No. of all  No. of 2-homeohedral  Tilings
1   3  3,7   0,6    1           1                     1/6-truncated (6^3)
2   3  3,8   0,6    3           1                     1/3-truncated (6^3)
3   3  3,9   0,6    3           3                     1/2-truncated (6^3)
4   3  3,10  0,6    3           -                     2/3-truncated (6^3)
5   3  3,11  0,6    1           -                     5/6-truncated (6^3)
6   3  3,12  0,6    1           1                     trunc.(6^3) = (3.12^2)
7   3  4,8   2,6    1           1                     4-triakon of No. 1
8   3  4,10  2,6    3           1                     4-triakon of No. 2
9   3  4,12  2,6    3           1                     4-triakon of No. 3
10  3  4,14  2,6    3           -                     4-triakon of No. 4
11  3  4,16  2,6    1           -                     4-triakon of No. 5
12  3  4,18  2,6    1           1                     4-triakon of No. 6
13  3  4,7   0,5    1(∞)+1      1+1                   8-halved (4.8^2)
14  3  4,8   0,4    1           1                     trunc.(4^4) = (4.8^2)
15  3  4,8   1,5    ∞+1         2+1                   4-halved No. 13
16  3  4,10  1,4    ∞           2                     4-halved (4.8^2)
17  3  5,7   1,3    ∞           2                     a 6-halved (6^3)
18  3  5,7   2,4    ∞+1         1+1                   decorated (6^3)
19  3  5,8   2,2    ∞           2                     a 6-halved (6^3)
20  3  5,8   3,4    1           -                     decorated (6^3)
21  3  5,10  3,2    1           -                     decorated (6^3)
22  3  5,11  3,1    1           -                     decorated (6^3)
23  3  5,12  3,0    1           1                     a 5-triakon (6^3)
24  4  3,5   2,4    ∞           2                     decorated (4^4)
25  4  3,6   2,4    ∞           2                     decorated (4^4)
26  4  3,7   2,4    ∞           -                     decorated (4^4)
27  4  3,8   2,4    1           1                     4-capped (4.8^2)
28  4  3,5   0,2    1           1                     decorated (4^4)
29  4  3,5   1,3    ∞+1         2+1                   decorated (4^4), (4.8^2)
30  4  3,6   0,0    1           1                     Archimedean (3.6.3.6)
31  4  3,6   1,2    1           1                     decorated (4^4)
32  5  3,4   1,0    1           1                     Archimedean (3^2.4.3.4)
33  5  3,4   2,2    ∞           2                     decorated (4^4)

Table 2. All face-regular bifaced tilings
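The necessary condition of Section 5 can be checked mechanically for the 33 parameter sets of Table 2. The following is a small sketch; the name `satisfies_tiling_equation` is illustrative, and the equation is used in cross-multiplied form to stay in integer arithmetic. Passing the check is only a necessary condition, since the ad hoc corona arguments described above are still needed to decide realizability.

```python
# (No., k, a, b, t_a, t_b) for the 33 parameter sets listed in Table 2.
TABLE_2 = [
    (1, 3, 3, 7, 0, 6), (2, 3, 3, 8, 0, 6), (3, 3, 3, 9, 0, 6),
    (4, 3, 3, 10, 0, 6), (5, 3, 3, 11, 0, 6), (6, 3, 3, 12, 0, 6),
    (7, 3, 4, 8, 2, 6), (8, 3, 4, 10, 2, 6), (9, 3, 4, 12, 2, 6),
    (10, 3, 4, 14, 2, 6), (11, 3, 4, 16, 2, 6), (12, 3, 4, 18, 2, 6),
    (13, 3, 4, 7, 0, 5), (14, 3, 4, 8, 0, 4), (15, 3, 4, 8, 1, 5),
    (16, 3, 4, 10, 1, 4), (17, 3, 5, 7, 1, 3), (18, 3, 5, 7, 2, 4),
    (19, 3, 5, 8, 2, 2), (20, 3, 5, 8, 3, 4), (21, 3, 5, 10, 3, 2),
    (22, 3, 5, 11, 3, 1), (23, 3, 5, 12, 3, 0), (24, 4, 3, 5, 2, 4),
    (25, 4, 3, 6, 2, 4), (26, 4, 3, 7, 2, 4), (27, 4, 3, 8, 2, 4),
    (28, 4, 3, 5, 0, 2), (29, 4, 3, 5, 1, 3), (30, 4, 3, 6, 0, 0),
    (31, 4, 3, 6, 1, 2), (32, 5, 3, 4, 1, 0), (33, 5, 3, 4, 2, 2),
]

def satisfies_tiling_equation(k, a, b, ta, tb):
    """Cross-multiplied form of (b - t_b)/(a - t_a) = ((k-2)b - 2k)/(2k - (k-2)a)."""
    return (b - tb) * (2 * k - (k - 2) * a) == (a - ta) * ((k - 2) * b - 2 * k)

failures = [row[0] for row in TABLE_2 if not satisfies_tiling_equation(*row[1:])]
print("rows violating the equation:", failures)
```

By contrast, the parameters (3; 5, 6; 0, 3) of the truncated icosahedron do not satisfy the equation, as expected for a finite polyhedron.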
Figure 3. The 3-valent 2-homeohedral types of tilings by 5- and b-gons: Nos. 17A , 17B , 18A , 18c , 19B , 19A , 23
Figure 4. 20 sporadic 2-homeohedral types of tilings (panels: Nos. 1, 7, 2, 8, 3, 9, 3, 3, 6, 14, 30, 32, 13c, 15c, 18c, 29c, 23, 27, 28, 31)

Figure 5. 18 2-homeohedral types from continuums (panels: Nos. 13A, 16A, 15A, 16B, 18A, 15B, 17A, 17B, 19A, 19B, 24A, 24B, 25A, 25B, 29A, 29B, 33A, 33B)

Figure 6. Face-regular tilings Nos. 2', 2'', 2''' and 20, 21, 22 (only No. 2' is 2-homeohedral)
Remarks.
(i) The classification of the 11 continuums of Table 2 is just an extension of the ideas used in [DFSV00] to classify the continuums Nos. 17–19. Moreover, [DFSV00] gives, for tilings No. 17, all possible groups of symmetry (the strip groups p111, p112, p1m1 and the plane groups cmm (only for No. 17A), pgg (including No. 17B), pg, p2, p1) and the minimal polyhedral torus or Klein bottle for tilings No. 17 of each of those 7 symmetry groups. (See [L68], pp. 61–72, for the symmetry groups of 2-isohedral tilings.)
(ii) Amongst face-regular bifaced tilings, only Nos. 6, 14, 30, 32, 33A, the Archimedean plane tilings (3.12^2), (4.8^2), (3.6.3.6), (3^2.4.3.4), (3^3.4^2) (with symmetry groups p6m, p4m, p6m, p4g, cmm, respectively; the unique remaining bifaced Archimedean tiling (3^4.6) is not face-regular) are mosaics, i.e., tilings of the Euclidean plane by regular polygons; see [DS02] for the list of 58 mosaics T with embeddable skeletons of T or T*. All five are 2-isohedral. The duals of (3.12^2), (4.8^2), (3.6.3.6), (3^2.4.3.4), (3^3.4^2) embed isometrically (or isometrically up to scale 2, which is indicated by "$\frac{1}{2}$") into $\frac{1}{2}Z_\infty$, $Z_4$, $Z_3$, $\frac{1}{2}Z_4$, $\frac{1}{2}Z_3$, respectively.
(iii) Tedious checks show that all embeddable ones among the face-regular bifaced tilings are: No. 3' (the 2-homeohedral one), No. 13c, No. 14, all No. 29 (except No. 29c), No. 32, No. 33A and all other No. 33; they embed into $\frac{1}{2}Z_3$, $\frac{1}{2}Z_8$, $Z_4$, $\frac{1}{2}Z_5$, $\frac{1}{2}Z_4$, $\frac{1}{2}Z_3$, $\frac{1}{2}Z_4$, respectively.
(iv) Tilings Nos. 1–6 are truncations of the regular tiling (6^3) (Archimedean No. 6 being its full truncation); Nos. 7–12 are their 4-triakon decorations. Nos. 17–23 are decorated (6^3). The Archimedean No. 14 is the full truncation of the regular tiling (4^4); Nos. 13–16, 27 and all No. 29, except No. 29c, are decorations of it. All other face-regular bifaced tilings (including Archimedean Nos. 30, 32, 33A) are decorated (4^4).
(v) Only one of the 33 sets of parameters of face-regular bifaced tilings (the 3 tilings No. 8, with $(k; a, b; t_a, t_b) = (3; 4, 10; 2, 6)$) is among the parameter sets of the 71 polyhedra (No. 38); all are 4-triakons.

Acknowledgement. This work was done during the visit of SFB 343, Department of Mathematics, University of Bielefeld, in November 1999. Many thanks to SFB 343, to Gunnar Brinkmann, who did pictures of polyhedra, and to Slava Grishukhin and Misha Shtogrin.
References

[BD00] G. Brinkmann and M. Deza, Tables of face-regular polyhedra, J. Chem. Inf. Comput. Sci. 40-3 (1999), 530–541.
[DHZ90] O. Delgado, D. Huson and E. Zamorzaeva, The classification of 2-isohedral tilings of the plane, Geom. Dedicata 42 (1992), 43–117.
[CDGr97] V. Chepoi, M. Deza and V. P. Grishukhin, A clin d'oeil on l1-embeddable planar graphs, Discrete Appl. Math. 80 (1997), 3–19.
[DGr97] M. Deza and V. P. Grishukhin, A zoo of l1-embeddable polyhedra, Bull. Inst. Math. Acad. Sinica 25 (1997), 181–231.
[DGr01] M. Deza and V. P. Grishukhin, Face-regular bifaced polyhedra, J. Statist. Plann. Inference 95 (1/2) (2001), 175–195, special issue in honor of S. S. Shrikhande (V. Tonchev, ed.).
[DGr99] M. Deza and V. P. Grishukhin, l1-embeddable bifaced polyhedra, in: Algebras and Combinatorics, Internat. Congress ICAC '97 Hong Kong (Kar-Ping Shum et al., eds.), Springer-Verlag, Singapore 1999, 189–210.
[DGr02] M. Deza and V. P. Grishukhin, Maps of p-gons with a ring of q-gons, Bull. Inst. Combin. Appl. 34 (2002).
[DL97] M. Deza and M. Laurent, Geometry of cuts and metrics, Algorithms Combin. 15, Springer-Verlag, Berlin 1997.
[DS96] M. Deza and M. Shtogrin, Isometric embeddings of semi-regular polyhedra, plane partitions and their duals into hypercubes and cubic lattices, Russian Math. Surveys 51 (1996), 1193–1194.
[DS02] M. Deza and M. Shtogrin, Mosaics, embeddable into cubic lattices, Discrete Math. 244 (1–3) (2002), 43–53.
[DFSV00] M. Deza, P. W. Fowler, M. I. Shtogrin and K. Vietze, Pentaheptite modifications of the graphite sheet, J. Chem. Inf. Comput. Sci. 40 (2000), 1325–1332.
[Grün67] B. Grünbaum, Convex polytopes, Interscience, New York 1967.
[GS83] B. Grünbaum and G. C. Shephard, The 2-Homeotoxal Tilings of the Plane and the 2-Sphere, J. Combin. Theory Ser. B 34 (1983), 113–150.
[GLST85] B. Grünbaum, H.-D. Löckenhoff, G. C. Shephard and A. H. Temesvari, The enumeration of normal 2-homeohedral tilings, Geom. Dedicata 19 (1985), 109–174.
[GS85] B. Grünbaum and G. C. Shephard, Tilings and Patterns, Freeman, New York 1985.
[Gr99] B. Grünbaum, private communication, 1999.
[Joh66] N. W. Johnson, Convex polyhedra with regular faces, Canad. J. Math. 18 (1966), 169–200.
[L68] H.-D. Löckenhoff, Über die Zerlegung der Ebene in zwei Arten topologisch verschiedener Flächen, Inaugural Dissertation, Marburg 1968.
M. Deza Laboratoire Interdisciplinaire de Géometrie Appliquée CNRS/ENS 45, rue d’Ulm 75230 Paris Cedex 05, France [email protected]
Geometry, codes and difference sets: exceptional connections J. F. Dillon
Abstract. We survey some recent results which tie together a number of areas which have been enriched by the work of D. K. Ray-Chaudhuri. We call the odd positive integer s exceptional if the binary cyclic code $C_s^{(m)}$ of length $2^m - 1$ with zeros $\omega$ and $\omega^s$, $\omega$ primitive in $L = \mathbb{F}_{2^m}$, is double-error-correcting for infinitely many m. We show that $C_s^{(m)}$ is double-error-correcting precisely when the reverse Dickson/Fibonacci polynomial $h_s(z) = D_s(1, z)$, which can be defined recursively by $h_1 = 1 = h_2$, $h_{t+2} = h_{t+1} + z h_t$, $t \ge 0$, induces a 1-to-1 map on the elements of absolute trace 0 in L. In particular, s is exceptional if $g_s(z) = 1 + h_s(z)$ is an exceptional polynomial. This characterization explains the known classes of exceptional s, namely $s = 2^k + 1$ and $s = 4^k - 2^k + 1$, and provides a tool to bring to bear on the open conjecture of Janwa, McGuire and Wilson that there are no others. We also point out another property shared by the known exceptional s, namely that the set $D = L \setminus \{1 + x^s + (x + 1)^s : x \in L\}$ is a cyclic difference set with Singer parameters in $L^\times$; and we derive several new descriptions of the sets of type $s = 4^k - 2^k + 1$ by exploiting the theory of quadratic forms on the $\mathbb{F}_2$-vector space L2 and thereby obtaining yet another connection between the two known exceptional exponents $2^k + 1$ and $4^k - 2^k + 1$.

2000 Mathematics Subject Classification: primary 05B10; secondary 11T24, 94A55.
Codes and Designs, Ohio State Univ. Math. Res. Inst. Publ. 10, © Walter de Gruyter 2002

1. Introduction

It has been forty years since R. C. Bose and his student Dijen K. Ray-Chaudhuri (and, independently, A. Hocquenghem) discovered the beauty and utility of BCH-codes. And the rich context and interplay of finite fields, geometry and codes continue to this day to challenge our understanding with beautiful tantalizing questions whose answers hold important information for applications in communications reliability and security. Typical BCH-codes are the binary cyclic codes of length $2^m - 1$ with generator polynomial $g(x) = M_1(x)M_3(x)$, where $M_t(x)$ denotes the minimal polynomial over $\mathbb{F}_2$ of $\omega^t$, $\omega$ primitive in $\mathbb{F}_{2^m}$. These codes are double-error-correcting for infinitely many m (in fact, for ALL m ≥ 3); and it is this exceptional property that we wish to explore further in this paper.

We shall call the odd positive integer s exceptional if for infinitely many m the binary cyclic code $C_s^{(m)}$ of length $2^m - 1$ with generator polynomial $g(x) = M_1(x)M_s(x)$ is double-error-correcting. There are only two known classes of exceptional s, namely $s = 2^k + 1$ and $s = 4^k - 2^k + 1$, and it has been conjectured [JMCW] by Janwa, Wilson and McGuire (the last two being Dijen's academic son and grandson, respectively) that no others exist. [JMCW] makes giant inroads on this problem, but the complete solution remains elusive. Note that the two known classes intersect in the classical BCH s = 3.

It turns out that the code $C_s^{(m)}$ is double-error-correcting precisely when the polynomial $x^s + (x + 1)^s$ induces a 2-to-1 map on $L = \mathbb{F}_{2^m}$. While this property was exploited implicitly in [JMCW], nowadays the map $x \mapsto x^s$ on L is called Almost Perfect Nonlinear (abbreviated APN); and this property provides further motivation for the determination of the exceptional s. The wonderfully informative and thought-provoking survey paper by Pascale Charpin [PASC] actually says "We have here exceptional objects which appear in other contexts."

In the next section we review the connection between the codes $C_s^{(m)}$ and APN maps. In Section 3 we justify the epithet exceptional for these codes and maps by exhibiting an intimate tie-in with the well-established technical notion of exceptional polynomial. In Section 4 we consider the two known classes of exceptional s, showing how their properties follow immediately via the connection with exceptional polynomials. In Section 5 we point out another property shared by the known exceptional s, namely that the set $D = L \setminus \{1 + x^s + (x + 1)^s : x \in L\}$ is a difference set in the cyclic multiplicative group $L^\times$. In the case $s = 2^k + 1$, for given m all k with (k, m) = 1 produce the same difference set, which is the classical Singer set comprised of all elements of Trace 1 in $L^\times$. But in the case $s = 4^k - 2^k + 1$, for given m the $\varphi(m)/2$ values of k with 1 ≤ k < m/2, (k, m) = 1, produce pairwise inequivalent difference sets [DOB1], [DIDO]. In the last section we give several alternative descriptions of these latter difference sets which follow from yet another connection between the exceptional exponents $2^k + 1$ and $4^k - 2^k + 1$. For this purpose we exploit the theory of quadratic forms on the $\mathbb{F}_2$-vector space $\mathbb{F}_{2^m}$, another topic pioneered by that exceptional visionary and sexagenarian Dijen K. Ray-Chaudhuri.
2. The codes In this section we give a brief introduction to a class of codes which generalize in many ways the classical double-error-correcting BCH-codes. For a much more comprehensive treatment of some of the notions touched on here the reader may consult the excellent survey paper of Pascale Charpin in the Handbook of Coding Theory [PASC]. Let s > 1 be an odd integer. For any integer m ≥ 3 consider a binary cyclic code (m) Cs of length 2m − 1 whose zeros are ω and ωs (and their conjugates), where ω is
Geometry, codes and difference sets: exceptional connections
75
primitive in L = F2m . Thus, Cs(m) = {c ∈ F22 (m)
where Hs
m −1
: Hs(m) c = 0},
is the 2 × 2m − 1 parity check matrix 1 ω ω2 · · · ωj · · · (m) Hs = 1 ωs ω2s · · · ωj s · · ·
ω2 −2 m ω(2 −2)s m
.
(m)
The generator polynomial of Cs is g(x) = M1 (x)Ms (x), where Mt (x) denotes the minimal polynomial of ωt over F2 . For example, when s = 3 C3(m) is the now classical double-error-correcting BCH-code invented by R. C. Bose and his student Dijen K. Ray-Chaudhuri [DKRC] (and, independently, by A. Hocquenguem [HOCQ]) some forty years ago. Note that s = 3 has the property that the code Cs(m) is double-errorcorrecting for infinitely many m; and it is this property that we wish to explore further in this paper. Now suppose that w, x, y and z are distinct elements of L which satisfy the equations w+x+y+z=0 (1) s w + x s + y s + zs = 0. (m)
If all four elements are nonzero then they correspond to four distinct columns of $H_s^{(m)}$ which sum to 0 and, hence, to a codeword of weight 4 in the code $C_s^{(m)}$. On the other hand, if one of the elements, say $w$, is equal to 0, then the others correspond to three distinct columns of $H_s^{(m)}$ which sum to 0 and, hence, to a codeword of weight 3 in the code $C_s^{(m)}$. In any case, if we define elements $a$ and $b$ in $L$ by

$$a = w + x, \qquad b = w^s + x^s,$$

then we see that the equation

$$X^s + (X + a)^s = b \tag{2}$$

has the four distinct solutions $w$, $x$, $y$ and $z$. Moreover, this process can be reversed. It is clear that for any $a$ and $b$ with $a \ne 0$ the solutions of (2) in $L$ come in pairs $\{X, X + a\}$. If (2) should have more than two solutions and, hence, at least four, say $w$, $x = w + a$, $y$ and $z = y + a$, then

$$w + x + y + z = a + a = 0$$

and

$$w^s + x^s + y^s + z^s = w^s + (w + a)^s + y^s + (y + a)^s = b + b = 0,$$

so that $w$, $x$, $y$ and $z$ satisfy the equations (1). This critical derivative property, although used implicitly by many others much earlier, has been given a name by Kaisa Nyberg [KNYB].
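The link between solutions of (1) and codewords of weight 3 or 4 can be checked by brute force for a small field. The Python sketch below is only illustrative: the helper names (`gf_mul`, `gf_pow`, `min_distance`) and the choice of the primitive polynomial $x^4 + x + 1$ to represent $\mathbb{F}_{16}$ are assumptions made here, not taken from the text. It searches for the smallest set of columns of $H_s^{(m)}$ that sums to 0.

```python
from itertools import combinations

def gf_mul(a, b, m, poly):
    """Multiply a and b in GF(2^m) = F_2[x]/(poly); elements are bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> m:            # degree reached m: reduce modulo poly
            a ^= poly
    return r

def gf_pow(a, e, m, poly):
    """Square-and-multiply exponentiation in GF(2^m)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a, m, poly)
        a = gf_mul(a, a, m, poly)
        e >>= 1
    return r

def min_distance(m, s, poly, max_weight=5):
    """Smallest number of columns of H_s^(m) summing to 0 (None if > max_weight)."""
    omega = 2                 # class of x; primitive because poly is assumed primitive
    cols = []
    for j in range(2**m - 1):
        top = gf_pow(omega, j, m, poly)
        bottom = gf_pow(top, s, m, poly)
        cols.append((top << m) | bottom)   # pack the column (omega^j, omega^(js))
    for w in range(1, max_weight + 1):
        for combo in combinations(cols, w):
            acc = 0
            for c in combo:
                acc ^= c                   # addition in L x L is XOR
            if acc == 0:
                return w
    return None

if __name__ == "__main__":
    print(min_distance(4, 3, 0b10011), min_distance(4, 5, 0b10011))
```

For $m = 4$ the exponent $s = 3$ gives the BCH-code, while $s = 5$ already admits a dependent triple of columns, coming from the subfield $\mathbb{F}_4$.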
J. F. Dillon
Definition 2.1. The mapping $P : L \to L$ is called Almost Perfect Nonlinear (abbreviated APN) if the equation $P(X) + P(X + a) = b$ has at most 2 (hence exactly 0 or 2) solutions for all $a, b$ in $L$, $a \ne 0$.

Thus, the map $P$ is APN iff all directional derivatives $P(X) + P(X + a)$, $a \ne 0$, are 2-to-1 on $L$.

Theorem 2.2 (van Lint, Wilson, Janwa, McGuire, Charpin). The following are equivalent:

1. $C_s^{(m)}$ has minimum distance 5;
2. $P(X) = X^s$ is APN on $L = \mathbb{F}_{2^m}$;
3. $X^s + (X + 1)^s$ is 2-to-1 on $L$.

Since for any nonzero element $a$ of $L$ the substitution $X \leftarrow aX$ converts $X^s + (X + a)^s$ into $a^s (X^s + (X + 1)^s)$, the power map $X^s$ has the nice property that all directional derivatives $X^s + (X + a)^s$, $a \ne 0$, are 2-to-1 as soon as the particular one $X^s + (X + 1)^s$ is 2-to-1. Hence the equivalence of the last two statements in the Theorem. We remark that the same argument above with $X^s$ replaced by $P(X)$ establishes the equivalence of 1. and 2. for a more general polynomial $P(X)$ [PASC]. But in this paper we shall restrict our attention to the monomial case $P(X) = X^s$.

Definition 2.3. The odd positive integer $s$ is exceptional if $X^s + (X + 1)^s$ is 2-to-1 on $L = \mathbb{F}_{2^m}$ for infinitely many $m$.

Conjecture 2.4 (Janwa, McGuire, Wilson [JMCW]). The only exceptional integers $s$ are $s = 2^k + 1$ and $s = 4^k - 2^k + 1$.

The fact that these two types of $s$ produce double-error-correcting codes whenever $k$ and $m$ are relatively prime has been proven in many different ways over the course of many years. Janwa and Wilson [JAWI] were perhaps the first to prove the complete result for both types. Janwa and Wilson [JAWI] and later Janwa, McGuire and Wilson [JMCW] developed an approach exploiting Weil's Theorem on counting rational points on curves which proved that these $s$'s are exceptional, ruled out infinitely many other values of $s$ and led them to enunciate the stronger

Conjecture 2.5 (Janwa, McGuire, Wilson [JMCW]). The polynomial

$$\frac{X^s + Y^s + Z^s + (X + Y + Z)^s}{(X + Y)(X + Z)(Y + Z)}$$

in $\mathbb{F}_2[X, Y, Z]$ is absolutely irreducible for all $s$ not of the form $2^k + 1$ or $4^k - 2^k + 1$.
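Statement 3 of Theorem 2.2 gives a test that is easy to run on small fields. The following Python sketch is a hedged illustration: the function names and the table `PRIMITIVE` of defining polynomials are choices made here, and the APN property itself does not depend on which irreducible polynomial is used.

```python
def gf_mul(a, b, m, poly):
    """Multiply a and b in GF(2^m) = F_2[x]/(poly); elements are bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> m:
            a ^= poly
    return r

def gf_pow(a, e, m, poly):
    """Square-and-multiply exponentiation in GF(2^m)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a, m, poly)
        a = gf_mul(a, a, m, poly)
        e >>= 1
    return r

def is_apn_power(s, m, poly):
    """True iff X^s + (X + 1)^s is 2-to-1 on GF(2^m)."""
    counts = {}
    for x in range(2**m):
        d = gf_pow(x, s, m, poly) ^ gf_pow(x ^ 1, s, m, poly)
        counts[d] = counts.get(d, 0) + 1
    return all(c == 2 for c in counts.values())

# Assumed defining polynomials: x^3+x+1, x^4+x+1, x^5+x^2+1, x^6+x+1.
PRIMITIVE = {3: 0b1011, 4: 0b10011, 5: 0b100101, 6: 0b1000011}

if __name__ == "__main__":
    for m, poly in PRIMITIVE.items():
        good = [k for k in range(1, m) if is_apn_power(2**k + 1, m, poly)]
        print(m, good)   # values of k for which X^(2^k+1) is APN on GF(2^m)
```

Running the loop lists, for each $m$, the values of $k$ for which $s = 2^k + 1$ passes the test; replacing the exponent by `4**k - 2**k + 1` probes the second family in the same way.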
Geometry, codes and difference sets: exceptional connections
3. The geometry – part I

In this section we relate the relevant derivative polynomials to various classes of polynomials studied in other contexts. For a more comprehensive treatment of Dickson polynomials and their relatives the reader should consult [LIMT]; while for more information on exceptional polynomials [COMA] is recommended.

Recall that Newton's formulae express the power sum symmetric functions as polynomials in the elementary symmetric functions. In the case of two variables this correspondence is given by

$$X^n + Y^n = D_n(X + Y, XY),$$

where $D_n(X, a)$ is the Dickson polynomial (of the first kind) which is given explicitly by

$$D_n(X, a) = \sum_{i=0}^{\lfloor n/2 \rfloor} \frac{n}{n-i} \binom{n-i}{i} (-a)^i X^{n-2i}.$$

Now most treatments of Dickson polynomials ([LIMT] is a good example) point out that the Dickson polynomials $D_n(X, a)$ are determined by the functional equation

$$X^n + \left(\frac{a}{X}\right)^n = D_n\!\left(X + \frac{a}{X},\, a\right).$$

In particular, taking $a = 1$, we obtain the special case of univariate Dickson polynomials $D_n(X) = D_n(X, 1)$ determined by

$$X^n + \frac{1}{X^n} = D_n\!\left(X + \frac{1}{X}\right).$$

For the problem at hand, however, it is not the special case $Y = 1/X$ that we wish to consider, but rather the case $Y = X + 1$. We thus have in $\mathbb{F}_2[X]$

$$X^n + (X + 1)^n = D_n(1, X + X^2) = \sum_{i=0}^{\lfloor n/2 \rfloor} \frac{n}{n-i} \binom{n-i}{i} (-[X + X^2])^i \qquad \forall n \ge 1.$$

Now Kang [LIMT] studied polynomials $T_n(X, a)$ obtained from the Dickson polynomials $D_n(X, a)$ by interchanging $X$ and $a$; i.e. $T_n(X, a) = D_n(a, -X)$. Putting $T_n(Z) = T_n(Z, 1) = D_n(1, -Z)$, we see that

$$X^n + (X + 1)^n = T_n(X + X^2) \qquad \forall n \ge 1.$$

In fact, for all $n \ge 1$, $T_n(Z) = F_{n-1}(Z)$, where the $F_n(Z)$ are Fibonacci polynomials given explicitly by

$$F_n(Z) = \sum_{i=0}^{\lfloor n/2 \rfloor} \binom{n-i}{i} Z^i,$$

and, recursively, by

$$F_0(Z) = F_1(Z) = 1 \quad \text{and} \quad F_{n+2}(Z) = Z F_n(Z) + F_{n+1}(Z) \qquad \forall n \ge 0.$$
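The identity $X^n + (X + 1)^n = F_{n-1}(X + X^2)$ in $\mathbb{F}_2[X]$ can be confirmed mechanically from the recursion. The sketch below is an illustration only; it stores a polynomial over $\mathbb{F}_2$ as a Python integer whose bit $i$ is the coefficient of the $i$-th power, and the helper names are chosen here for convenience.

```python
def pmul(a, b):
    """Carry-less product of two polynomials over F_2 stored as bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def ppow(a, e):
    """Square-and-multiply power of a polynomial over F_2."""
    r = 1
    while e:
        if e & 1:
            r = pmul(r, a)
        a = pmul(a, a)
        e >>= 1
    return r

def compose(f, g):
    """f(g(X)) over F_2, evaluated by Horner's rule."""
    r = 0
    for i in reversed(range(f.bit_length())):
        r = pmul(r, g) ^ ((f >> i) & 1)
    return r

def fibonacci_polys(n_max):
    """F_0, ..., F_{n_max} reduced mod 2, from F_{n+2} = Z F_n + F_{n+1}."""
    F = [1, 1]
    for n in range(n_max - 1):
        F.append(pmul(0b10, F[n]) ^ F[n + 1])
    return F

def derivative_poly(n):
    """X^n + (X + 1)^n in F_2[X]."""
    return ppow(0b10, n) ^ ppow(0b11, n)

if __name__ == "__main__":
    F = fibonacci_polys(40)
    ok = all(derivative_poly(n) == compose(F[n - 1], 0b110) for n in range(1, 41))
    print("identity holds for n = 1..40:", ok)
```

Here `0b110` encodes $X + X^2$, so `compose(F[n - 1], 0b110)` is $F_{n-1}(X + X^2)$.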
Notice that $F_n(Z)$, interpreted as a polynomial with integer coefficients, takes at $Z = 1$ the value $F_n$, the $n$th Fibonacci number. Thus, $X^n + (X + 1)^n = F_{n-1}(X + X^2)$; and, since $x \mapsto x + x^2$ maps $L = \mathbb{F}_{2^m}$ in a 2-to-1 manner onto the kernel of the trace map $\mathrm{Tr}_{L/\mathbb{F}_2}(Z) = Z + Z^2 + \cdots + Z^{2^{m-1}}$, we have

Theorem 3.1. The power map $X^n$ is APN on $L = \mathbb{F}_{2^m}$ precisely when the Fibonacci polynomial $F_{n-1}(Z)$ induces on $L$ a map that is 1-to-1 on the hyperplane of elements of trace 0.

Now a polynomial with coefficients in a field $K$ is exceptional [COMA] if it induces a permutation on infinitely many extensions of $K$. We thus have the result that explains our choice of terminology:

Corollary 3.2. The integer $s$ is exceptional if the Fibonacci polynomial $F_{s-1}(Z)$ in $\mathbb{F}_2[Z]$ is an exceptional polynomial.

Now for any $n$ we define related polynomials $\Delta_n(X)$ and $g_n(Z)$ by

$$\Delta_n(X) = X^n + (X + 1)^n + 1 \quad \text{and} \quad g_n(Z) = T_n(Z) - 1.$$

Table 1 gives a list of the first few polynomials $g_n(Z) \in \mathbb{F}_2[Z]$. They are generated recursively by $g_1 = g_2 = 0$ and $g_{n+2} = Z + g_{n+1} + Z g_n$, for all $n \ge 0$; or, alternatively, by $g_0 = 1$, $g_1 = 0 = g_2$ and $g_{n+3} = (Z + 1) g_{n+1} + Z g_n$, $n \ge 0$. A number of properties of these polynomials are already visible in the table; e.g. $g_{2n} = g_n^2$ and $g_{2^k+1} = \sum_{i=0}^{k-1} Z^{2^i}$.

$$\begin{array}{r|l|r|l|r|l}
n & g_n & n & g_n & n & g_n \\ \hline
1 & 0 & 9 & Z + Z^2 + Z^4 & 17 & Z + Z^2 + Z^4 + Z^8 \\
2 & 0 & 10 & Z^2 + Z^4 & 18 & Z^2 + Z^4 + Z^8 \\
3 & Z & 11 & Z + Z^3 + Z^4 + Z^5 & 19 & Z + Z^3 + Z^4 + Z^5 + Z^8 + Z^9 \\
4 & 0 & 12 & Z^4 & 20 & Z^4 + Z^8 \\
5 & Z + Z^2 & 13 & Z + Z^2 + Z^5 + Z^6 & 21 & Z + Z^2 + Z^5 + Z^6 + Z^8 + Z^9 + Z^{10} \\
6 & Z^2 & 14 & Z^2 + Z^6 & 22 & Z^2 + Z^6 + Z^8 + Z^{10} \\
7 & Z + Z^3 & 15 & Z + Z^3 + Z^7 & 23 & Z + Z^3 + Z^7 + Z^8 + Z^9 + Z^{11} \\
8 & 0 & 16 & 0 & 24 & Z^8
\end{array}$$

Table 1. Polynomials $g_n(Z)$ with $1 + X^n + (X + 1)^n = g_n(X + X^2)$

We remark that Waring's formula for three variables [LIMT] gives $g_n(Z)$ directly as $1 + X^n + (X + 1)^n = g_n(X + X^2)$, where

$$g_n(Z) = \sum_{2i + 3j = n} \frac{n}{n - i - 2j} \binom{n - i - 2j}{i + j} \binom{i + j}{i} (Z + 1)^i Z^j, \qquad n \ge 2.$$
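The entries of Table 1 and the two properties noted beside it can be regenerated from the three-term recursion. The following Python sketch is illustrative only (bit $i$ of an integer stands for the coefficient of $Z^i$; the function names are chosen here).

```python
def pmul(a, b):
    """Carry-less product of two polynomials over F_2 stored as bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def g_polys(n_max):
    """g_0, ..., g_{n_max} from g_{n+3} = (Z + 1) g_{n+1} + Z g_n over F_2."""
    g = [1, 0, 0]
    for n in range(n_max - 2):
        g.append(pmul(0b11, g[n + 1]) ^ pmul(0b10, g[n]))
    return g

def to_string(p):
    """Human-readable form, lowest degree first, e.g. 'Z + Z^2'."""
    terms = []
    for i in range(p.bit_length()):
        if (p >> i) & 1:
            terms.append("1" if i == 0 else "Z" if i == 1 else f"Z^{i}")
    return " + ".join(terms) if terms else "0"

if __name__ == "__main__":
    g = g_polys(24)
    for n in range(1, 25):
        print(n, to_string(g[n]))
```

The printed list can be compared line by line with Table 1.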
Now we have $\Delta_n(X) = g_n(X + X^2)$; and, since changing the constant term of a polynomial does not affect its quality of inducing a 1-to-1 or a 2-to-1 mapping on a set, we can restate our earlier observations as

Theorem 3.3. The following are equivalent:

1. $C_s^{(m)}$ has minimum distance 5;
2. $P(X) = X^s$ is APN on $L = \mathbb{F}_{2^m}$;
3. $\Delta_s(X) = 1 + X^s + (X + 1)^s$ is 2-to-1 on $L$;
4. $g_s(Z)$ induces a 1-to-1 map on the elements of $L$ of trace 0.

Corollary 3.4. The integer $s$ is exceptional if $g_s(Z) \in \mathbb{F}_2[Z]$ is an exceptional polynomial.

We note here that for a fixed $m$ there may be other power maps $x \mapsto x^s$ which are APN on $\mathbb{F}_{2^m}$. For example, it is easy to see that $s = 2^{m-1} - 1$, which is related to the "inverse map" $x \mapsto x^{2^m - 2} = (x^2)^s$, gives an APN map on every $\mathbb{F}_{2^m}$, $m$ odd. Also, for odd $m$, $s = 2^{(m-1)/2} + 3$ and $s = 4^r + 2^r - 1$, $4r + 1 \equiv 0 \pmod m$, have been shown to give APN maps on $\mathbb{F}_{2^m}$ by Hans Dobbertin [DOB2], who has made an art form out of finding appropriate permutation polynomials $p(Z)$ with which to express the relevant derivative maps in the form $1 + X^s + (X + 1)^s = p(X + X^{2^t})$, $(t, m) = 1$. In these cases, however, $s$, $p(Z)$ and $t$ all depend on $m$; so, while the above results (e.g. Theorem 3.3) still apply, we do not have the exceptionality criterion which is the main theme of this paper.
4. Two exceptional cases

In this section we take a close look at the two known classes of exceptional integers. We begin with $s = 2^k + 1$. In this case we have

$$\Delta_s(X) = 1 + X^s + (X + 1)^s = X + X^{2^k} = g_s(X + X^2),$$

where

$$g_s(Z) = T_k(Z) = \sum_{i=0}^{k-1} Z^{2^i}.$$

(Here and below $T_k(Z)$ denotes this trace polynomial of $\mathbb{F}_{2^k}$, not the polynomial $T_n(Z)$ of Section 3.) The zeros of $T_k(Z)$ in the algebraic closure of $\mathbb{F}_2$ are the elements of absolute trace 0 in $\mathbb{F}_{2^k}$. Therefore, for any $m$ relatively prime to $k$, $T_k(Z)$ induces on $L = \mathbb{F}_{2^m}$ an $\mathbb{F}_2$-linear map whose kernel is comprised of those elements of $\mathbb{F}_{2^k} \cap \mathbb{F}_{2^m} = \mathbb{F}_2$ which have trace 0 as elements of $\mathbb{F}_{2^k}$; i.e. the kernel is $\{0\}$ if $k$ is odd and $\mathbb{F}_2$ if $k$ is even. Thus, if $k$ is odd, $T_k(Z)$ is an exceptional polynomial permuting every field $L = \mathbb{F}_{2^m}$ with $(m, k) = 1$. If $k$ is even, then $T_k$ is 2-to-1 on all of $L$ but is 1-to-1 when restricted to elements of $L$ of trace 0, since $(m, k) = 1$ means that $m$ is odd and 1 does not have trace 0. In light of Theorem 3.3 we have derived the well-known result [MACS] that the codes $C_s^{(m)}$ are double-error-correcting for all $s = 2^k + 1$ and $m$ with $(m, k) = 1$.

Next we consider the case $s = 4^k - 2^k + 1$. In this case we have $\Delta_s(X) = 1 + X^s + (X + 1)^s = g_s(X + X^2)$, where

$$g_s(Z) = f_{k, 2^k + 1}(Z) = Z \left( \frac{T_k(Z)}{Z} \right)^{2^k + 1}.$$

This polynomial belongs to the family of polynomials studied by Müller (for $k = 3$) and by Cohen and Matthews for general $k$ [COMA] and which are given by

$$f_{k,d}(Z) = \frac{T_k(Z^e)^d}{Z^{2^k}},$$

for any factorization $2^k + 1 = de$. Cohen and Matthews proved that any such polynomial is exceptional for $k$ odd; indeed it permutes all fields $L = \mathbb{F}_{2^m}$ for $(m, k) = 1$. Dillon [JFD1] deduced the APN property from the MCM-polynomials $g_s(Z)$ for all $k$ and $m$, $(k, m) = 1$. Later, in the case that $k$ is even, Dillon and Dobbertin [DIDO] proved that $g_s(Z)$ induces a 2-to-1 map on all of $L = \mathbb{F}_{2^m}$, $(m, k) = 1$, which is 1-to-1 when restricted to the elements of $L$ of trace 0. Again, in light of Theorem 3.3, we have corroborated the known result [JAWI] that the codes $C_s^{(m)}$ are double-error-correcting for all $s = 4^k - 2^k + 1$ and $m$ with $(m, k) = 1$.
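Both closed forms for $g_s(Z)$ in this section can be compared with $1 + X^s + (X + 1)^s$ as polynomials over $\mathbb{F}_2$. The Python sketch below is a hedged illustration (polynomials are bitmasks, and the names `trace_poly`, `mcm_poly` and `delta` are chosen here); it builds $T_k(Z)$ and $T_k(Z)^{2^k+1}/Z^{2^k}$ and composes them with $X + X^2$.

```python
def pmul(a, b):
    """Carry-less product of two polynomials over F_2 stored as bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def ppow(a, e):
    """Square-and-multiply power of a polynomial over F_2."""
    r = 1
    while e:
        if e & 1:
            r = pmul(r, a)
        a = pmul(a, a)
        e >>= 1
    return r

def compose(f, g):
    """f(g(X)) over F_2, evaluated by Horner's rule."""
    r = 0
    for i in reversed(range(f.bit_length())):
        r = pmul(r, g) ^ ((f >> i) & 1)
    return r

def trace_poly(k):
    """T_k(Z) = Z + Z^2 + Z^4 + ... + Z^(2^(k-1))."""
    return sum(1 << (2**i) for i in range(k))

def mcm_poly(k):
    """T_k(Z)^(2^k + 1) / Z^(2^k); the division is an exact shift."""
    num = ppow(trace_poly(k), 2**k + 1)
    assert num & ((1 << (2**k)) - 1) == 0   # divisible by Z^(2^k)
    return num >> (2**k)

def delta(s):
    """Delta_s(X) = 1 + X^s + (X + 1)^s over F_2."""
    return 1 ^ ppow(0b10, s) ^ ppow(0b11, s)

if __name__ == "__main__":
    for k in range(1, 5):
        gold = delta(2**k + 1) == compose(trace_poly(k), 0b110)
        kasami = delta(4**k - 2**k + 1) == compose(mcm_poly(k), 0b110)
        print(k, gold, kasami)
```

For $k = 2$ the second polynomial is $Z + Z^2 + Z^5 + Z^6$, the entry $g_{13}$ of Table 1.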
5. The difference sets

We may compare and contrast the two exceptional cases in another respect.

Definition 5.1. The $k$-subset $D$ of the group $G$ of order $v$ is a difference set with parameters $(v, k, \lambda)$ if for all nonidentity elements $g$ of $G$ the equation $g = xy^{-1}$ has exactly $\lambda$ solutions with $x$ and $y$ in $D$.

$D$ is a $(v, k, \lambda)$-difference set in $G$ iff its complement $G \setminus D$ is a $(v, v - k, v - 2k + \lambda)$-difference set in $G$. Two such difference sets are equivalent if they are in the same orbit under the action of the holomorph of $G$ acting on subsets of $G$. If $G$ is a cyclic group of order $2^m - 1$, which we may take to be the multiplicative group $L^\times$ of the finite field $L = \mathbb{F}_{2^m}$, then the subset $D$ of $G$ is a difference set with the so-called Singer parameters $(v, k, \lambda) = (2^m - 1, 2^{m-1}, 2^{m-2})$ (or the complementary parameters $(v, k, \lambda) = (2^m - 1, 2^{m-1} - 1, 2^{m-2} - 1)$) precisely when its characteristic binary sequence $\{a_t\}$ given by

$$a_t = \begin{cases} 1 & \text{if } \omega^t \in D \\ 0 & \text{otherwise} \end{cases}$$

has the ideal autocorrelation property

$$c(s) = \sum_{t=0}^{2^m - 2} (-1)^{a_{t+s} + a_t} = \begin{cases} 2^m - 1 & \text{if } 2^m - 1 \mid s \\ -1 & \text{otherwise.} \end{cases}$$
Two such sequences which are related by any combination of decimation and cyclic shift correspond to equivalent difference sets. Note that according to this notion of equivalence the sequences $\{a_t\}$ obtained above by varying the primitive element $\omega$ are all equivalent. The best known examples are the Singer sets $D = \{x \in L^\times : \mathrm{Tr}(x) = 1\}$ whose corresponding sequences $\{a_t\} = \{\mathrm{Tr}(\omega^t)\}$ are called m-sequences or maximal length sequences. Note that in this case $D$ is the complement in $L$ of the image of the map $x \mapsto x + x^{2^k}$ for any $k$ relatively prime to $m$. For a more comprehensive treatment of difference sets the interested reader is referred to [BJL2].

Now let us consider for an integer $s$ the set $\Delta_s(L)$ and its complement $L \setminus \Delta_s(L)$, where $\Delta_s(X) = 1 + X^s + (X + 1)^s$. For the case $s = 2^k + 1$ and $(k, m) = 1$, $L \setminus \Delta_s(L)$ is precisely the classical Singer set given above. Note in particular that for a given field $L = \mathbb{F}_{2^m}$ the difference sets obtained as $k$ varies over integers prime to $m$ are all identical!

What about the case $s = 4^k - 2^k + 1$? In this case something quite different happens. For any $k$, $1 \le k \le m$, $(k, m) = 1$, let us denote the set $L \setminus \Delta_s(L)$ more simply by $D_k$. For each such $k$ the sets $D_k$ and $D_{m-k}$ coincide since for any $\alpha$ in $L$, $\Delta_s(\alpha^2) = \Delta_s(\alpha)^2$ and $\Delta_{s'}(\alpha) = \Delta_s(\alpha)^{4^{m-k}} = \Delta_s(\alpha^{4^{m-k}})$ for $s' = 4^{m-k} - 2^{m-k} + 1$. In [DOB1] Dobbertin conjectured that the sets $D_k$ were difference sets in $L^\times$ and he proved that the binary cyclic code which is the row space of the circulant matrix $[a_{j-i}]$ over $\mathbb{F}_2$ has dimension $(2F_{k_1} - 1)m$, where $k_1 = \min\{k', m - k'\}$, $k' = k^{-1} \pmod m$ and $F_n$ denotes the Fibonacci number defined earlier. The conjecture is proved in [DIDO] which also contains much more information about the maps given by Dickson and Müller–Cohen–Matthews polynomials. We summarize these results on $D_k$ in the

Theorem 5.2 (Dillon and Dobbertin [DIDO]). Let $L = \mathbb{F}_{2^m}$ and for each $k$, $1 \le k < m/2$, $(k, m) = 1$, let

$$\Delta_k(X) = X^s + (X + 1)^s + 1, \qquad s = 4^k - 2^k + 1.$$

Then $D_k = L \setminus \Delta_k(L)$ is a difference set with Singer parameters in $L^\times$. Equivalently, the binary sequence $\{a_t^{(k)}\}$, $a_t^{(k)} = 1$ iff $\omega^t \in D_k$, has ideal autocorrelation. Moreover, for each fixed $m$, the $\varphi(m)/2$ difference sets $D_k$ (resp. sequences $\{a_t^{(k)}\}$) are pairwise inequivalent.

We mention a few special cases. When $k = 1$ we get the classical Singer set. When $k = 2$ we get the Maschietti difference set corresponding to the Segre monomial hyperoval in $PG(2, 2^m)$ [MASC]. When $k = (m - 1)/2$ we get the set conjectured to be a difference set by Gong, Golomb et al. [NOGO]. And when $k = (m \pm 1)/3$ we get the sets conjectured to be difference sets by No et al. [NOCY], [NOGO]. Note that for any $m$ we get $\varphi(m)/2$ pairwise inequivalent difference sets; and, in particular, for $m = p$ a prime we get $(p - 1)/2$ pairwise inequivalent difference sets, in marked contrast to the technique of Gordon, Mills and Welch [GMWD] which requires that $m$ be composite. In the next section we shall give an alternative description of the sets $D_k$ which is provided by yet another connection between the exceptional exponents $2^k + 1$ and $4^k - 2^k + 1$.
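For a small field the sets $D_k$ of Theorem 5.2 can be built explicitly and tested against Definition 5.1 and the autocorrelation criterion. The Python sketch below is illustrative only: the representation of $\mathbb{F}_{32}$ by $x^5 + x^2 + 1$, the choice of $\omega$ as the class of $x$, and all function names are assumptions made here.

```python
def gf_mul(a, b, m, poly):
    """Multiply a and b in GF(2^m) = F_2[x]/(poly); elements are bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> m:
            a ^= poly
    return r

def gf_pow(a, e, m, poly):
    """Square-and-multiply exponentiation in GF(2^m)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a, m, poly)
        a = gf_mul(a, a, m, poly)
        e >>= 1
    return r

def kasami_set(m, k, poly):
    """D_k = L minus the image of X^s + (X + 1)^s + 1, with s = 4^k - 2^k + 1."""
    s = 4**k - 2**k + 1
    image = {1 ^ gf_pow(x, s, m, poly) ^ gf_pow(x ^ 1, s, m, poly)
             for x in range(2**m)}
    return set(range(2**m)) - image

def difference_counts(D, m, poly):
    """For each g other than 0 and 1, count pairs (x, y) in D x D with x = g*y."""
    return {g: sum(gf_mul(g, y, m, poly) in D for y in D)
            for g in range(2, 2**m)}

def autocorrelation(D, m, poly):
    """c(tau) for the characteristic sequence a_t = [omega^t in D], omega = x."""
    n = 2**m - 1
    a, w = [], 1
    for _ in range(n):
        a.append(1 if w in D else 0)
        w = gf_mul(w, 2, m, poly)
    return [sum((-1) ** (a[(t + tau) % n] ^ a[t]) for t in range(n))
            for tau in range(n)]

if __name__ == "__main__":
    m, poly = 5, 0b100101
    for k in (1, 2):
        D = kasami_set(m, k, poly)
        print(k, len(D), set(difference_counts(D, m, poly).values()),
              set(autocorrelation(D, m, poly)[1:]))
```

With $m = 5$ the two admissible values $k = 1, 2$ correspond to the $\varphi(5)/2 = 2$ sets of the theorem; the sketch checks the parameters only and does not test inequivalence.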
6. The geometry – part II

Let $L = \mathbb{F}_{2^m}$ and let $(k, m) = 1$. Define $K_k : L \to L$ by $K_k(x) = x^{2^k + 1} + x$ for all $x \in L$. It is easy to see that the map $K_k$ has multiplicities 0, 1, 2 and 3. Indeed, $K_k(X)$ has two zeros in $L$ and for each nonzero $\alpha$ in $L$

$$\begin{aligned} \#(X^{2^k + 1} + X + \alpha = 0) &= \#(X^{2^{2k} - 1} + X^{2^k - 1} + \alpha = 0) \\ &= \#(X^{2^{2k}} + X^{2^k} + \alpha X = 0) - 1. \end{aligned}$$

Put $\mathcal{K}_k = \{\alpha \in L : |K_k^{-1}[\alpha]| = 1\}$. It is not hard to see that

$$|\mathcal{K}_k| = \begin{cases} 2^{m-1} - 1 & \text{if } m \text{ is odd} \\ 2^{m-1} & \text{if } m \text{ is even.} \end{cases}$$

Indeed, we have the

Theorem 6.1.

$$\mathcal{K}_k^{(-1)} = \begin{cases} L^\times \setminus D_k & \text{if } m \text{ is odd} \\ D_k & \text{if } m \text{ is even.} \end{cases}$$

Proof. We shall assume here that $m$ is odd. The other case is very similar but slightly more complicated by the fact that $2^k + 1$ and $2^m - 1$ are not relatively prime. We certainly have $|K_k^{-1}[0]| = 2$ and for all nonzero $\alpha \in L$ (writing $q = 2^m$)

$$\begin{aligned} |K_k^{-1}[\alpha]| &= \frac{1}{q} \sum_{x, y \in L} (-1)^{\mathrm{Tr}([x^{2^k + 1} + x + \alpha] y)} \\ &= 1 + \frac{1}{q} \sum_{x} \sum_{y \ne 0} (-1)^{\mathrm{Tr}(x^{2^k + 1} y + xy + \alpha y)} \\ &= 1 + \frac{1}{q} \sum_{x, y} (-1)^{\mathrm{Tr}(x^{2^k + 1} + \alpha^{-2^k/(2^k + 1)} xy + y^{2^k + 1})}. \end{aligned}$$

Therefore, $\alpha$ is in $\mathcal{K}_k$ if and only if $Q_c(x, y)$ is balanced, where $c = \alpha^{-2^k/(2^k + 1)}$, and $Q_c(x, y)$ denotes the quadratic form $\mathrm{Tr}(x^{2^k + 1} + cxy + y^{2^k + 1})$ on the $\mathbb{F}_2$-linear space $V = L^2$. Now $Q_c$ is balanced (i.e. is neither elliptic nor hyperbolic, but takes the values 0 and 1 equally often) precisely when it does not vanish identically on the radical $V^\perp$, where the relevant bilinear form on $V$ is given by

$$\begin{aligned} B_c((x, y), (u, v)) &= Q_c(x + u, y + v) + Q_c(x, y) + Q_c(u, v) \\ &= \mathrm{Tr}([u^{2^k} + u^{2^{-k}} + cv] x + [v^{2^k} + v^{2^{-k}} + cu] y). \end{aligned}$$

Thus, $Q_c$ is balanced if and only if there exists a $(u, v)$ in $L^2$ satisfying

$$\begin{aligned} u^{2^k} + u^{2^{-k}} &= cv \\ v^{2^k} + v^{2^{-k}} &= cu \end{aligned} \tag{3}$$

and

$$Q_c(u, v) = \mathrm{Tr}(u^{2^k + 1} + cuv + v^{2^k + 1}) = 1. \tag{4}$$

Now suppose that $\alpha$ is in $\mathcal{K}_k$ and that $(u, v) \in L^2$ satisfies (3) and (4). Multiply the first equation by $u$ and the second by $v$ to obtain

$$\begin{aligned} u^{2^k + 1} + u^{(2^k + 1) 2^{-k}} &= cuv \\ v^{2^k + 1} + v^{(2^k + 1) 2^{-k}} &= cuv. \end{aligned} \tag{5}$$

Thus, $\mathrm{Tr}(cuv) = 0$ which gives

$$1 = Q_c(u, v) = \mathrm{Tr}(u^{2^k + 1} + v^{2^k + 1})$$

and adding the equations in (5) gives

$$u^{2^k + 1} + v^{2^k + 1} = (u^{2^k + 1} + v^{2^k + 1})^{2^k}.$$

The last equation implies that $u^{2^k + 1} + v^{2^k + 1} \in \mathbb{F}_{2^m} \cap \mathbb{F}_{2^k} = \mathbb{F}_{2^{(m,k)}} = \mathbb{F}_2$. But then $\mathrm{Tr}(u^{2^k + 1} + v^{2^k + 1}) = 1$ implies that $u^{2^k + 1} + v^{2^k + 1} = 1$. We may now deduce that

$$\begin{aligned} c^{2^k + 1} &= c^{2^k + 1} (u^{2^k + 1} + v^{2^k + 1}) \\ &= (cu)^{2^k + 1} + (cv)^{2^k + 1} \\ &= (u^{2^k} + u^{2^{-k}})^{2^k + 1} + (v^{2^k} + v^{2^{-k}})^{2^k + 1} \\ &= 1 + \zeta^s + (\zeta + 1)^s, \end{aligned}$$

where $\zeta = u^{(2^k + 1) 2^{-k}}$ and $s = (2^{3k} + 1)/(2^k + 1) = 4^k - 2^k + 1$. It follows that

$$\alpha^{-2^k} = \left( \alpha^{-2^k/(2^k + 1)} \right)^{2^k + 1} = c^{2^k + 1} \in L^\times \setminus D_k.$$

Since $D_k$ is closed under the Frobenius, $\alpha^{-1} \in L^\times \setminus D_k$. Thus $\mathcal{K}_k^{(-1)} \subseteq L^\times \setminus D_k$; and, since both sets have cardinality $2^{m-1} - 1$, they are in fact equal.
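Theorem 6.1 can be tested numerically for small odd $m$. The sketch below is an illustration under stated assumptions: it represents $\mathbb{F}_8$ and $\mathbb{F}_{32}$ by the polynomials $x^3 + x + 1$ and $x^5 + x^2 + 1$, and the function name `check_theorem_6_1` is hypothetical. It tabulates the preimage sizes of $K_k(x) = x^{2^k+1} + x$, inverts the elements having a unique preimage, and compares the result with $L^\times \setminus D_k$.

```python
def gf_mul(a, b, m, poly):
    """Multiply a and b in GF(2^m) = F_2[x]/(poly); elements are bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> m:
            a ^= poly
    return r

def gf_pow(a, e, m, poly):
    """Square-and-multiply exponentiation in GF(2^m)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a, m, poly)
        a = gf_mul(a, a, m, poly)
        e >>= 1
    return r

def check_theorem_6_1(m, k, poly):
    """Return (sorted multiplicities of K_k, whether K_k^(-1) == L^x minus D_k)."""
    q = 2**m
    s = 4**k - 2**k + 1
    delta_image = {1 ^ gf_pow(x, s, m, poly) ^ gf_pow(x ^ 1, s, m, poly)
                   for x in range(q)}
    D = set(range(q)) - delta_image
    counts = [0] * q
    for x in range(q):
        counts[gf_pow(x, 2**k + 1, m, poly) ^ x] += 1     # K_k(x)
    unique = {alpha for alpha in range(q) if counts[alpha] == 1}
    inverses = {gf_pow(alpha, q - 2, m, poly) for alpha in unique}
    return sorted(set(counts)), inverses == set(range(1, q)) - D

if __name__ == "__main__":
    print(check_theorem_6_1(3, 1, 0b1011))
    print(check_theorem_6_1(5, 2, 0b100101))
```

Only the odd case of the theorem is exercised here; for even $m$ the comparison would be with $D_k$ itself.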
References

[BJL2] T. Beth, D. Jungnickel and H. Lenz, Design Theory, 2nd Edition, Cambridge University Press, Cambridge 1999.

[DKRC] R. C. Bose and D. K. Ray-Chaudhuri, On a class of error correcting binary group codes, Inform. and Control 3 (1960), 68–79.

[PASC] P. Charpin, Open problems in cyclic codes, in: Handbook of Coding Theory (V. S. Pless and W. C. Huffman, eds.), North-Holland, Amsterdam 1998, 963–1063.

[COMA] S. D. Cohen and R. W. Matthews, A Class of Exceptional Polynomials, Trans. Amer. Math. Soc. 345 (1994), 897–909.

[JFD1] J. F. Dillon, Multiplicative Difference Sets via Additive Characters, Des. Codes Cryptogr. (Assmus Memorial Issue) 17 (1999), 225–235.

[DIDO] J. F. Dillon and H. Dobbertin, New Cyclic Difference Sets with Singer Parameters, submitted to Finite Fields Appl.

[DOB1] Hans Dobbertin, Kasami Power Functions, Permutation Polynomials and Cyclic Difference Sets, in: Difference Sets, Sequences and their Correlation Properties (A. Pott, P. V. Kumar, T. Helleseth and D. Jungnickel, eds.), Kluwer, Dordrecht 1999, 133–158.

[DOB2] Hans Dobbertin, Almost Perfect Nonlinear Power Functions on GF($2^n$): The Welch Case, IEEE Trans. Inform. Theory 45 (1999), 1271–1275.

[GMWD] B. Gordon, W. H. Mills and L. R. Welch, Some New Difference Sets, Canadian J. Math. 14 (1962), 614–625.

[HOCQ] A. Hocquenghem, Codes correcteurs d'erreurs, Chiffres (Paris) 2 (1959), 147–156.

[JAWI] H. Janwa and R. M. Wilson, Hyperplane Sections of Fermat Varieties in $P^3$ in Char. 2 and Some Applications to Cyclic Codes, in: Applied Algebra, Algebraic Algorithms and Error-Correcting Codes, Proceedings AAECC-10 (G. Cohen, T. Mora and O. Moreno, eds.), Lecture Notes in Comput. Sci. 673, Springer-Verlag, Berlin 1993, 180–194.

[JMCW] Heeralal Janwa, Gary McGuire and Richard M. Wilson, Double-Error-Correcting Codes and Absolutely Irreducible Polynomials over GF(2), J. Algebra 178 (1995), 665–676.

[LIMT] R. Lidl, G. L. Mullen and G. Turnwald, Dickson Polynomials, Pitman Monographs Surveys Pure Appl. Math. 65, Longman, Harlow 1993.

[VANW] J. H. van Lint and R. M. Wilson, On the minimum distance of cyclic codes, IEEE Trans. Inform. Theory 32 (1986), 23–40.

[MACS] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes, North-Holland, New York 1978.

[MASC] A. Maschietti, Difference Sets and Hyperovals, Des. Codes Cryptogr. 14 (1998), 89–98.

[NOCY] J.-S. No, H. Chung and M.-S. Yun, Binary Pseudorandom Sequences of Period $2^m - 1$ with Ideal Autocorrelation Generated by the Polynomial $z^d + (z + 1)^d$, IEEE Trans. Inform. Theory 44 (1998), 1278–1282.

[NOGO] J.-S. No, S. Golomb, G. Gong, H.-K. Lee and P. Gaal, Binary Pseudorandom Sequences of Period $2^n - 1$ with Ideal Autocorrelation, IEEE Trans. Inform. Theory 44 (1998), 814–817.

[KNYB] K. Nyberg, Differentially uniform mappings for cryptography, in: Advances in Cryptology - EUROCRYPT '93 (T. Helleseth, ed.), Lecture Notes in Comput. Sci. 765, Springer-Verlag, Berlin 1994, 55–64.
J. F. Dillon
National Security Agency
Fort George G. Meade, MD 20755, U.S.A.
A singular direct product for bicolorable Steiner triple systems

Jeffrey H. Dinitz and Douglas R. Stinson

Abstract. A Steiner triple system has a bicoloring with $m$ color classes if the points are partitioned into $m$ subsets and the three points in every block are contained in exactly two of the color classes. In this paper we generalize the direct product theorem for bicolored Steiner triple systems given in [CDR] to a singular direct product theorem of the form $v \to 3(v - 1) + 1$. Our construction uses a generalization of the "forbidden latin squares" introduced in [CDR]. We also consider possible singular direct products of the form $v \to 3(v - w) + w$.

2000 Mathematics Subject Classification: 05B07

1. Introduction and background

Throughout this paper we use notation consistent with that found in [CD]. Let $D = (V, \mathcal{B})$ be a $(v, k, \lambda)$-design. A coloring of $D$ is a mapping $\varphi : V \to C$. The elements of $C$ are colors; if $|C| = m$, we have an $m$-coloring of $D$. For each $c \in C$, the set $\varphi^{-1}(c) = \{x : \varphi(x) = c\}$ is a color class. For an extensive survey of results on coloring designs, the reader is referred to [RC].

Here we consider a special class of colorings termed bicolorings. While a bicoloring is defined for any design, we examine only bicolorings of Steiner triple systems. A coloring $\varphi$ of $D$ is a bicoloring if for all $B \in \mathcal{B}$, $|\varphi(B)| = 2$, where $\varphi(B) = \bigcup_{v \in B} \varphi(v)$. This definition implies that in a triple system every triple has two elements in one color class and one in another class, i.e., there are no monochromatic triples nor are there any triples receiving three colors. An $m$-bicoloring is a bicoloring with $m$ color classes, and a design admitting an $m$-bicoloring is $m$-bicolorable. A design is $m$-bichromatic if it is $m$-bicolorable but not $(m - 1)$-bicolorable.

Example 1.1. A 3-bicolorable STS(13). First, construct an STS(13) by developing the base blocks $\{1, 3, 9\}$, $\{2, 5, 6\}$ mod 13. The color classes are $\{0, 1\}$, $\{2, 6, 8, 10, 11\}$, $\{3, 4, 5, 7, 9, 12\}$.

Codes and Designs
Ohio State Univ. Math. Res. Inst. Publ. 10
© Walter de Gruyter 2002
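Example 1.1 is small enough to verify directly. The Python sketch below is illustrative (the function names are chosen here): it develops the two base blocks modulo 13, checks that the 26 resulting triples cover every pair of points exactly once, and checks that each triple meets exactly two of the three stated color classes.

```python
from itertools import combinations

def develop(base_blocks, v):
    """All translates x -> x + i (mod v) of the given base blocks."""
    return [frozenset((x + i) % v for x in blk)
            for blk in base_blocks for i in range(v)]

def is_sts(blocks, v):
    """True iff the blocks are triples covering each pair of points exactly once."""
    seen = set()
    for blk in blocks:
        if len(blk) != 3:
            return False
        for pair in combinations(sorted(blk), 2):
            if pair in seen:
                return False
            seen.add(pair)
    return len(seen) == v * (v - 1) // 2

def is_bicoloring(blocks, classes):
    """True iff every block meets exactly two color classes."""
    color = {p: i for i, cls in enumerate(classes) for p in cls}
    return all(len({color[p] for p in blk}) == 2 for blk in blocks)

if __name__ == "__main__":
    blocks = develop([{1, 3, 9}, {2, 5, 6}], 13)
    classes = [{0, 1}, {2, 6, 8, 10, 11}, {3, 4, 5, 7, 9, 12}]
    print(len(blocks), is_sts(blocks, 13), is_bicoloring(blocks, classes))
```

Moving a single point between classes, for instance point 2 into the first class, already breaks the bicoloring, since the block $\{2, 4, 10\}$ then receives three colors.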
In the context of strict colorings of hypergraphs defined recently by Voloshin [Vol], a bicoloring of an STS is a strict coloring of an STS in which all triples are both edges and also co-edges. In [MiT1,MiT2], Milazzo and Tuza discuss several properties of strict colorings of Steiner triple systems. A second related topic was studied recently by Milici, Rosa and Voloshin [MRV]. In that paper the authors let $S$ be a set of "color patterns" and define a coloring of type $S$ as a coloring where every block has a color pattern from $S$. They mainly study $(v, 4, 1)$-designs in that paper.

We summarize earlier results on bicolored Steiner triple systems. An easy counting argument [Ro] establishes that there exist no nontrivial 2-colorable STS (or 2-colorable triple systems of any index $\lambda$ for $v > 4$), and hence no 2-bichromatic triple systems. In [MiT1,MiT2], Milazzo and Tuza discuss several properties of bicolorings of Steiner triple systems. In particular they prove that there is an infinite family of unbicolorable Steiner triple systems. They also prove a bound on the maximum number of colors in a bicolored Steiner triple system. Precisely, they prove that if there exists a $t$-bicolorable STS($v$) with $v \le 2^k - 1$, then $t \le k$. They also characterize those designs attaining this bound.

In [CDR] the authors concentrate on 3-bicoloring Steiner triple systems. In that paper, the following general necessary conditions are proven:

Proposition 1.2. Let $(X, \mathcal{A})$ be an $m$-bicolorable triple system TS($v, \lambda$) and assume that the $m$ color classes have sizes $c_1, c_2, \ldots, c_m$. Then

1. $\displaystyle \sum_{i=1}^{m} \binom{c_i}{2} = \binom{v}{2} \Big/ 3$.
2. There do not exist $c_i$ and $c_j$, $i \ne j$, with $c_i = c_j = 2$ (no matter what the sizes of the other color classes are).

3. At most one of the numbers $c_1, c_2, \ldots, c_m$ can be odd.

4. Let $v \equiv 1, 3 \pmod 6$. If there exists an $m$-bicolorable STS($v$) with $m$-split $(c_1, \ldots, c_k, d_1, \ldots, d_{m-k})$ (with $0 < k < m$), then the inequality

$$0 \le \sum_{i=1}^{k} \binom{c_i}{2} - \frac{1}{2} \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} c_i c_j \le \frac{1}{2} \sum_{i=1}^{k} c_i \cdot \sum_{i=1}^{m-k} d_i - \varepsilon\!\left( \sum_{i=1}^{m-k} d_i \right)$$

holds, where

$$\varepsilon(x) = \begin{cases} x/2 & \text{if } x \equiv 0, 2 \pmod 6 \\ 0 & \text{if } x \equiv 1, 3 \pmod 6 \\ (x + 2)/2 & \text{if } x \equiv 4 \pmod 6 \\ 4 & \text{if } x \equiv 5 \pmod 6. \end{cases}$$
5. If there exists a 3-bicolorable STS($v$), then any prime $p$ dividing $v$ with $p \equiv 5 \pmod 6$ must have an even power in the prime factorization of $v$.

Also in [CDR] is the following direct product theorem for 3-bicolorable STS.

Theorem 1.3. If there exists a 3-bicolorable STS($u$) and a 3-bicolorable STS($v$), then there exists a 3-bicolorable STS($uv$).

In the remainder of this paper we will be concerned with modifying this construction to obtain a singular direct product theorem for 3-bicolorable STS. We are still unable to prove the following conjecture from [CDR], yet we also believe it to be true.

Conjecture 1.4. For every $v \equiv 1, 3 \pmod 6$ satisfying condition (5) in Proposition 1.2, and for all 3-splits $(a, b, c)$ for $v$ satisfying conditions (1) and (2) of Proposition 1.2, there exists a 3-bicolorable STS($v$) with color classes of sizes $a$, $b$, and $c$.

The following theorem from [CDR] gives the current state of knowledge concerning the existence of 3-bicolorable STS($v$).

Theorem 1.5. Let $v \equiv 1, 3 \pmod 6$ and assume that in the prime factorization of $v$ no prime congruent to 5 (mod 6) appears with an odd exponent. Further assume that all prime factors $p$ congruent to 1 (mod 6) are less than 1000 and that all prime factors $p$ congruent to 5 (mod 6) satisfy $p^2 < 1000$. Then there exists a 3-bicolorable STS($v$).
2. Forbidden Latin squares

Underlying the singular direct product theorem is a special type of latin square termed a forbidden latin square. A special class of these latin squares was used in the proof of Theorem 1.3, but a more general definition is needed for the singular direct product.

Suppose $n = a + b + c = x + y + z$. Let $A$, $B$ and $C$ be disjoint sets of sizes $a$, $b$ and $c$, respectively; and let $X$, $Y$ and $Z$ be disjoint sets of sizes $x$, $y$ and $z$, respectively. A latin square with rows and columns indexed by $A \cup B \cup C$ and symbols in the set $X \cup Y \cup Z$ is called $(a, b, c; x, y, z)$-forbidden if in cell $(r, g)$ we find symbol $s$ satisfying:

$r$ in $A$ and $g$ in $A$ implies $s$ not in $X$
$r$ in $A$ and $g$ in $B$ implies $s$ not in $Z$
$r$ in $A$ and $g$ in $C$ implies $s$ not in $Y$
$r$ in $B$ and $g$ in $A$ implies $s$ not in $Z$
$r$ in $B$ and $g$ in $B$ implies $s$ not in $Y$
$r$ in $B$ and $g$ in $C$ implies $s$ not in $X$
$r$ in $C$ and $g$ in $A$ implies $s$ not in $Y$
$r$ in $C$ and $g$ in $B$ implies $s$ not in $X$
$r$ in $C$ and $g$ in $C$ implies $s$ not in $Z$.
The following gives the general picture of an $(a, b, c; x, y, z)$-forbidden latin square. The notation $\sim X$ denotes that the symbols in this region of the latin square contain no elements from the set $X$. $\sim Y$ and $\sim Z$ are defined analogously. Each region is indexed by the elements in $A$, $B$ and $C$ and this is also indicated.

$$\begin{array}{c|ccc}
 & A & B & C \\ \hline
A & \sim X & \sim Z & \sim Y \\
B & \sim Z & \sim Y & \sim X \\
C & \sim Y & \sim X & \sim Z
\end{array}$$
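We note that forbidden latin squares were defined in [CDR]. The $(a, b, c)$-forbidden latin squares from that paper are $(a, b, c; c, a, b)$-forbidden latin squares in this more general definition.

Example 2.1. A $(5, 5, 2; 6, 1, 5)$-FLS. In this example, $X = \{1, 2, 3, 4, 5, 6\}$, $Y = \{x\}$ and $Z = \{a, b, c, d, e\}$.

$$\begin{array}{ccccc|ccccc|cc}
a & e & d & c & b & 2 & 1 & 5 & 4 & x & 3 & 6 \\
c & b & a & e & d & x & 3 & 1 & 6 & 5 & 4 & 2 \\
e & d & c & b & a & 6 & x & 4 & 1 & 2 & 5 & 3 \\
b & a & e & d & c & 3 & 2 & x & 5 & 1 & 6 & 4 \\
d & c & b & a & e & 1 & 4 & 3 & x & 6 & 2 & 5 \\ \hline
2 & 1 & 5 & 4 & x & a & 6 & d & c & 3 & b & e \\
x & 3 & 1 & 6 & 5 & 4 & b & 2 & e & d & c & a \\
6 & x & 4 & 1 & 2 & e & 5 & c & 3 & a & d & b \\
3 & 2 & x & 5 & 1 & b & a & 6 & d & 4 & e & c \\
1 & 4 & 3 & x & 6 & 5 & c & b & 2 & e & a & d \\ \hline
4 & 5 & 6 & 2 & 3 & c & d & e & a & b & x & 1 \\
5 & 6 & 2 & 3 & 4 & d & e & a & b & c & 1 & x
\end{array}$$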
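The array of Example 2.1 can be checked against the definition. The Python sketch below is illustrative: it assumes the twelve lines of the example are the rows of the square (the forbidden pattern is symmetric, so reading them as columns gives the same verdict) and uses the observation that, numbering $A, B, C$ and $X, Y, Z$ as classes 0, 1, 2, a cell is forbidden exactly when its row class, column class and symbol class sum to 0 modulo 3. The function names are chosen here.

```python
ROWS = [
    "a e d c b 2 1 5 4 x 3 6",
    "c b a e d x 3 1 6 5 4 2",
    "e d c b a 6 x 4 1 2 5 3",
    "b a e d c 3 2 x 5 1 6 4",
    "d c b a e 1 4 3 x 6 2 5",
    "2 1 5 4 x a 6 d c 3 b e",
    "x 3 1 6 5 4 b 2 e d c a",
    "6 x 4 1 2 e 5 c 3 a d b",
    "3 2 x 5 1 b a 6 d 4 e c",
    "1 4 3 x 6 5 c b 2 e a d",
    "4 5 6 2 3 c d e a b x 1",
    "5 6 2 3 4 d e a b c 1 x",
]

def is_latin(square):
    """True iff every row and every column contains each symbol exactly once."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok

def is_forbidden(square, sizes, symbol_classes):
    """Check the nine forbidden conditions; sizes = (a, b, c), symbol_classes = (X, Y, Z)."""
    a, b, _ = sizes
    def line_class(i):
        return 0 if i < a else (1 if i < a + b else 2)
    sym_class = {s: t for t, cls in enumerate(symbol_classes) for s in cls}
    n = len(square)
    return all((line_class(r) + line_class(c) + sym_class[square[r][c]]) % 3 != 0
               for r in range(n) for c in range(n))

if __name__ == "__main__":
    square = [row.split() for row in ROWS]
    print(is_latin(square), is_forbidden(square, (5, 5, 2), ("123456", "x", "abcde")))
```

The sum-modulo-3 test encodes the same table as the picture above: a cell is excluded when the three classes are all equal or all different.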
In the next theorem we give some necessary conditions for the existence of forbidden latin squares.

Theorem 2.2. The following are necessary conditions for the existence of an $(a, b, c; x, y, z)$-FLS:

$$\max\{x, y, z\} \le \min\{a + b, a + c, b + c\} \tag{1}$$
$$\max\{a, b, c\} \le \min\{a + b, a + c, b + c, x + y, x + z, y + z\} \tag{2}$$
$$ax + by + cz = ab + bc + ca. \tag{3}$$

Proof. (1) and (2): Let $r \in B$. Each symbol $s \in X$ occurs in a different cell in row $r$. Also, no symbol of $X$ occurs in cell $(r, g)$ for any $g \in C$. Hence, $x \le a + b$. The other inequalities can be proven in a similar manner.

(3): Given an $(a, b, c; x, y, z)$-FLS, construct a set of $(a + b + c)^2$ (ordered) triples in the obvious way. This set of triples forms a transversal design, in which the groups are $G_1 = A \cup B \cup C$, $G_2 = A \cup B \cup C$ and $G_3 = X \cup Y \cup Z$. This transversal design is in fact bicolored using three colors, where the color classes are of size $2a + x$, $2b + y$ and $2c + z$. This means that each triple contains exactly one "pure pair". The number of triples, $(a + b + c)^2$, is therefore equal to the number of pure pairs, namely, $a^2 + b^2 + c^2 + 2(ax + by + cz)$. The result follows from algebraic simplification.

These necessary conditions are, in general, quite restrictive, but we note that in the case of the $(a, b, c; c, a, b)$-forbidden latin squares from [CDR], condition (1) alone was shown to be necessary and sufficient. (Note that, in this case, condition (3) is vacuously true and condition (2) is equivalent to condition (1).) We next construct an infinite class of forbidden latin squares that will be needed in the singular direct product construction in Section 3.

Theorem 2.3. If there is a latin square of order $a - 1$ having $b$ disjoint transversals, then there exists an $(a - 1, a - 1, b; a, b - 1, a - 1)$-FLS.

Proof. Let $X$ be a set of size $a$ where $\infty \in X$, let $Y$ be a set of size $b - 1$, and let $Z$ be a set of size $a - 1$ with $X$, $Y$ and $Z$ mutually disjoint. Let $L$ be a latin square of order $a - 1$ on symbol set $X \setminus \{\infty\}$, having $b$ disjoint transversals, denoted $T_1, \ldots, T_b$. Let $L_Z$ be an isomorphic copy of $L$ on symbol set $Z$. Next, define $L'$ to be the square obtained from $L$ by replacing the entries in the cells of the transversals $T_1, \ldots, T_b$ with the $b$ symbols from $Y \cup \{\infty\}$, respectively. Similarly, define $L_Z'$ to be the square obtained from $L_Z$ by replacing the transversals $T_1, \ldots, T_b$ from $L_Z$ with the corresponding transversals $T_1, \ldots, T_b$ from $L$. Now define $M$ to be the $a - 1$ by $b$ rectangle whose columns are the transversals $T_1, \ldots, T_b$ from $L$; and define $M_Z$ to be the $a - 1$ by $b$ rectangle whose columns are the transversals $T_1, \ldots, T_b$ from $L_Z$. Similarly, define $N$ to be the $b$ by $a - 1$ rectangle whose rows are the transversals $T_1, \ldots, T_b$ from $L$; and define $N_Z$ to be the $b$ by $a - 1$ rectangle whose rows are the transversals $T_1, \ldots, T_b$ from $L_Z$. Finally, let $L''$ be a latin square of order $b$ on symbol set $Y \cup \{\infty\}$. We construct the desired FLS, as follows:

$$\begin{array}{|c|c|c|} \hline
L_Z & L' & M \\ \hline
L' & L_Z' & M_Z \\ \hline
N & N_Z & L'' \\ \hline
\end{array}$$

It is straightforward to check that the square constructed above is indeed latin and that it satisfies the forbidden properties.

Corollary 2.4. An $(a - 1, a - 1, b; a, b - 1, a - 1)$-FLS exists for all positive integers $a > b$.

Proof. By Theorem 2.3 the required forbidden latin square exists if there exists a latin square of side $a - 1$ having $b$ disjoint transversals. Such a square can be constructed if there exists a pair of orthogonal latin squares of side $a - 1$. Hence if $a - 1 \ne 2$ or 6, the result follows. Since there exists a pair of incomplete latin squares of side 6 missing a hole of side 2, there exists a latin square of side 6 having 4 (or fewer) disjoint transversals. So for all pairs $(a, b) \notin \{(3, 1), (3, 2), (7, 5), (7, 6)\}$ with $a > b$ the result follows. For these remaining ordered pairs $(a, b)$, $(a - 1, a - 1, b; a, b - 1, a - 1)$-FLS can be found on the web page at the following URL:

http://www.emba.uvm.edu/~Dinitz/forbiddenLS.txt
It is not necessary that a > b for an (a − 1, a − 1, b; a, b − 1, a − 1)-FLS to exist. In the next example we describe a construction of such a square when a = 6 and b = 10. Example 2.5. A (5, 5, 10; 6, 9, 5)-FLS. Define X = {x1 , . . . , x5 , ∞}, Y = {y1 , . . . , y9 } and Z = {z1 , . . . , z5 }. Let U be a latin square of order five on symbol set X\{∞}; let V1 be a latin square of order five on symbol set {y1 , . . . , y5 }; let V2 be a latin square of order five on symbol set {y6 , . . . , y9 }∪{∞}; and let W be a latin square of order five on symbol set {z1 , . . . , z5 }. Then the following array is the desired FLS: V1 V2 W U
V2 U V1 W
W V1 U V2
U W V2 V1
We can adapt Stinson's hill-climbing algorithm for Steiner triple systems [St] to find latin squares. One merely lets the triples be of the form $(r, c, s)$, where the symbol $s$ occurs in row $r$ and column $c$, and requires that every row-column, row-symbol and column-symbol pair occurs exactly once. The algorithm can then be further modified to search for forbidden latin squares. We have done this and found $(a-1, a-1, b; a, b-1, a-1)$-FLS for many pairs $(a, b)$ with $a \le b \le 2a-2$. (Note that Theorem 2.2 implies that $b \le 2a-2$ if an $(a-1, a-1, b; a, b-1, a-1)$-FLS exists.) In particular, the hill-climbing algorithm found $(a-1, a-1, b; a, b-1, a-1)$-FLS for $a = 4, 5, 6$ and $7$ and all $a \le b \le 2a-2$. These squares are available from the above-mentioned web page.
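The triple formulation described above can be sketched as follows. This is a minimal illustration in the spirit of a hill-climbing heuristic, not the authors' program: the function names, the random choices and the step limit are assumptions, and the sketch searches for ordinary latin squares only, without the additional forbidden constraints.

```python
import random


def key(triple, drop):
    """The pair left after dropping one coordinate of a (row, column, symbol) triple."""
    return tuple(triple[a] for a in range(3) if a != drop)


def hill_climb_latin_square(n, seed=0, max_steps=1_000_000):
    """Randomised search for a latin square of order n; returns None on failure."""
    rng = random.Random(seed)
    # used[d] maps every covered pair (the coordinates other than d)
    # to the triple that covers it.
    used = [{}, {}, {}]
    for _ in range(max_steps):
        if len(used[0]) == n * n:
            break
        k = rng.randrange(3)          # point type: 0 = row, 1 = column, 2 = symbol
        x = rng.randrange(n)          # the point itself
        i, j = (a for a in range(3) if a != k)

        def candidate(y=None, z=None):
            t = [None, None, None]
            t[k], t[i], t[j] = x, y, z
            return tuple(t)

        # Partners y (on axis i) and z (on axis j) whose pair with x is still uncovered.
        live_i = [y for y in range(n) if key(candidate(y=y), j) not in used[j]]
        if not live_i:
            continue                  # x already lies in n triples
        live_j = [z for z in range(n) if key(candidate(z=z), i) not in used[i]]
        new = candidate(rng.choice(live_i), rng.choice(live_j))
        # If the pair (y, z) is already covered, swap the old triple out.
        old = used[k].get(key(new, k))
        if old is not None:
            for d in range(3):
                del used[d][key(old, d)]
        for d in range(3):
            used[d][key(new, d)] = new
    if len(used[0]) != n * n:
        return None
    square = [[None] * n for _ in range(n)]
    for r, c, s in used[2].values():
        square[r][c] = s
    return square


square = hill_climb_latin_square(5, seed=1)
```

Each step either adds a triple or exchanges one triple for another, so the number of triples never decreases. A search for forbidden latin squares would additionally reject candidate triples that violate the forbidden conditions.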
3. The singular direct product

We are now in a position to prove our main result, a $v \to 3(v-1)+1$ singular direct product theorem.

Theorem 3.1 (Singular Direct Product). Suppose there is a 3-bicolorable STS($v$) with split $(a, a-1, b)$, and an $(a-1, a-1, b; a, b-1, a-1)$-FLS. Then there exists a 3-bicolorable STS($3v-2$) with split $(3a-2, 2b+a-1, 2a+b-2)$.

Proof. Define sets $V_{ij}$ of elements with $i, j \in \{0, 1, 2\}$, so that $V_{ij}$ has $a-1$, $a-1$, $b$ elements for $j = 0, 1, 2$ respectively, when $0 \le i \le 1$; and $a$, $b-1$, $a-1$ elements for $j = 0, 1, 2$ respectively, when $i = 2$. The union of $V_{ij}$ for $0 \le i \le 2$ then has $3a-2$, $2a+b-3$ and $2b+a-1$ elements for $j = 0, 1, 2$, respectively. Let $\infty$ be a new point. For $i = 0, \dots, 2$, place on the union of $V_{ij}$ for $j = 0, 1, 2$ with $\{\infty\}$ an $(a, a-1, b)$-bicolored STS($v$) in which the color classes are $V_{i0}$, $V_{i1} \cup \{\infty\}$ and $V_{i2}$. Now form an $(a-1, a-1, b; a, b-1, a-1)$-FLS. Use the latin square to construct triples in the obvious way (i.e., form the transversal design from the latin square and align the row, column, and symbol classes on the corresponding $V_{0j}$'s, $V_{1j}$'s and $V_{2j}$'s). The result is a bicolorable STS($3v-2$) whose color classes have the specified sizes.

Corollary 3.2. Suppose there is a 3-bicolorable STS($v$) with split $(a, a-1, b)$, where $a > b$. Then there is a 3-bicolorable STS($3v-2$) with split $(3a-2, 2b+a-1, 2a+b-2)$.

Proof. This follows immediately from the singular direct product theorem above and Corollary 2.4.

Theorem 3.4 below will show that it is necessary that $a > b$ in order for there to exist a 3-bicolorable STS($v$) with split $(a, a-1, b)$. Hence the condition that $a > b$ in the above corollary is not needed.
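As a bookkeeping check on the sizes in Theorem 3.1, the following short computation (using only the quantities stated in the theorem and its proof) confirms that the three color classes account for all $3v-2$ points; recall that the point $\infty$ joins the classes indexed by $j = 1$.

```latex
\begin{align*}
v &= a + (a-1) + b = 2a + b - 1, \\
3v - 2 &= 6a + 3b - 5, \\
\underbrace{(3a-2)}_{j=0} + \underbrace{(2a+b-3) + 1}_{j=1 \text{ with } \infty} + \underbrace{(2b+a-1)}_{j=2} &= 6a + 3b - 5 .
\end{align*}
```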
One may ask whether there exists a singular direct product theorem of the form $v \to 3(v-1)+1$ which does not require that the original 3-bicolorable STS($v$) has color split $(a, a-1, b)$. The answer to this is no, as exhibited in the next proposition.

Proposition 3.3. In any singular direct product theorem of the form $v \to 3(v-1)+1$ it is necessary that the original 3-bicolorable STS($v$) has color split $(a, a-1, b)$ for some $a$ and $b$.

Proof. Consider a hypothetical $v \to 3(v-1)+1$ construction in which we use, WLOG, an $(a, b-1, c; b, c-1, a)$-FLS. Note that this implies that the point $\infty$ again ends up in the second color class and that the original STS($v$) has color split $(a, b, c)$. Theorem 2.2 (3) implies that
\[ a(b-1) + c(b-1) + ac = ab + (b-1)(c-1) + ac. \]
This simplifies to yield $b - 1 = a$, which is isomorphic to the construction presented above.

Since our main ingredient in the singular direct product theorem (other than the forbidden latin square) is a 3-bicolorable STS($v$) with split $(a, a-1, b)$, it is reasonable to determine the values of $a$ and $b$ for which this can exist.

Theorem 3.4. A 3-bicolorable STS($v$) with split $(a, a-1, b)$ can exist only for the following parameters:
\[ a = 3t^2 + 2t + 1 \text{ and } b = 3t^2 - t, \quad\text{or}\quad a = 3t^2 + 4t + 2 \text{ and } b = 3t^2 + t, \]
where $t$ is any integer.

Proof. From Proposition 1.2 (a) it follows that
\[ 3\left[\binom{a}{2} + \binom{a-1}{2} + \binom{b}{2}\right] = \binom{2a+b-1}{2}. \]
This simplifies to give $(a-b)^2 = 3a - 2$. Then $a - b = 3t+1$ or $3t+2$ for an integer $t$. Solving for $a$ and $b$ gives the result.
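The parametrisation in Theorem 3.4 can be cross-checked numerically. The sketch below compares a brute-force search over the counting identity used in the proof with the two closed-form families; the search bound `LIMIT` and the function names are arbitrary illustrative choices.

```python
from math import comb


def satisfies_count(a, b):
    """Counting identity from the proof of Theorem 3.4."""
    return 3 * (comb(a, 2) + comb(a - 1, 2) + comb(b, 2)) == comb(2 * a + b - 1, 2)


def families(t):
    """The two parameter families (a, b) stated in Theorem 3.4."""
    return [(3 * t * t + 2 * t + 1, 3 * t * t - t),
            (3 * t * t + 4 * t + 2, 3 * t * t + t)]


LIMIT = 60  # arbitrary search bound for this illustration

brute = {(a, b)
         for a in range(1, LIMIT + 1)
         for b in range(LIMIT + 1)
         if satisfies_count(a, b)}

closed = {(a, b)
          for t in range(-10, 11)
          for (a, b) in families(t)
          if 1 <= a <= LIMIT and 0 <= b <= LIMIT}

print(sorted(brute) == sorted(closed))
```

Within the search range the two sets coincide, and every solution satisfies $(a-b)^2 = 3a-2$, as the proof asserts.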
4. The $v \to 3(v-u)+u$ construction

The next theorem gives a general $v \to 3(v-w)+w$ construction. Let $D = (X, \mathcal{A})$ be a 3-bicolorable Steiner triple system of order $v$ with split $(a, b, c)$ which contains a subsystem $(Y, \mathcal{B})$ of order $w$ (so $Y \subset X$, $|Y| = w$ and $\mathcal{B} \subset \mathcal{A}$). Let $A$, $B$, and $C$ be the color classes of $D$ and assume that $|A \cap Y| = i$, $|B \cap Y| = j$ and $|C \cap Y| = k$. Then $D$ is said to have color split $(a, b, c; i, j, k)$.

Theorem 4.1. Suppose there is a 3-bicolorable STS($v$) with a sub-STS($w$) which has color split $(a, b, c; i, j, k)$ and a 3-bicolorable STS($v$) with a sub-STS($w$) which has color split $(a, b, c; j, k, i)$. Suppose further that there exists an $(a-i, b-j, c-k; c-i, a-j, b-k)$-FLS. Then there exists a 3-bicolorable STS($3(v-w)+w$) with split $(2a+c-2i, 2b+a-2j, 2c+b-2k)$.

Proof. Define disjoint sets $V_{pq}$ of elements with $p, q \in \{0, 1, 2\}$, so that $V_{pq}$ has $a-i$, $b-j$, $c-k$ elements for $q = 0, 1, 2$ respectively, when $p = 0$ or $p = 1$; and $c-i$, $a-j$, $b-k$ elements for $q = 0, 1, 2$ respectively, when $p = 2$. The union of $V_{pq}$ for $0 \le p \le 2$ then has $2a+c-3i$, $2b+a-3j$ and $2c+b-3k$ elements for $q = 0, 1, 2$, respectively. Further, define three more disjoint sets $I$, $J$ and $K$ with $|I| = i$, $|J| = j$ and $|K| = k$.

For $p = 0$ and $p = 1$, place on $V_{p0} \cup V_{p1} \cup V_{p2} \cup I \cup J \cup K$ an STS($v$) containing a sub-STS($w$) (where the subsystem is on the points $I \cup J \cup K$) with color split $(a, b, c; i, j, k)$ in which the color classes are $V_{p0} \cup I$, $V_{p1} \cup J$ and $V_{p2} \cup K$. Next, delete all blocks in the subsystem from both designs. For $p = 2$, place on $V_{p0} \cup V_{p1} \cup V_{p2} \cup I \cup J \cup K$ an STS($v$) containing a sub-STS($w$) (the subsystem is again on the points $I \cup J \cup K$) with color split $(c, a, b; i, j, k)$ in which the color classes are $V_{p0} \cup I$, $V_{p1} \cup J$ and $V_{p2} \cup K$. But this time do not delete any blocks in the subsystem.

Now form an $(a-i, b-j, c-k; c-i, a-j, b-k)$-FLS. Use the latin square to construct triples in the obvious way (i.e., form the transversal design from the latin square and align the row, column, and symbol classes on the corresponding $V_{0q}$'s, $V_{1q}$'s and $V_{2q}$'s).
The result is a bicolorable STS($3(v-w)+w$) whose color classes have the specified sizes.

Obviously, there will be many conditions on the parameters necessary for this construction to be used. We will not go into those here. We will, however, note that the next case for $w$, after the $w = 1$ case considered in the previous section, is of course $w = 3$. When $w = 3$, the only possible color split is $(0, 1, 2)$. We now restate the above theorem in the case $w = 3$.

Corollary 4.2. Suppose there is a 3-bicolorable STS($v$) with split $(a, b, c)$, and an $(a, b-1, c-2; c, a-1, b-2)$-FLS. Then there exists a 3-bicolorable STS($3v-6$) with split $(2a+c, 2b+a-2, 2c+b-4)$.

A necessary condition for the existence of the $(a, b-1, c-2; c, a-1, b-2)$-FLS required above is $2a + 3 = b + c$. (This follows from condition (3) of Theorem 2.2.) The first parameter situation where all the necessary conditions are satisfied is when $a = 12$, $b = 10$ and $c = 17$. This would require the existence of a 3-bicolorable STS(39) with split $(12, 10, 17)$, and a $(12, 9, 15; 17, 11, 8)$-FLS. The 3-bicolorable STS(39) with color split $(12, 10, 17)$ was found to exist in [CDR], and using the modified hill-climbing algorithm we found a $(12, 9, 15; 17, 11, 8)$-FLS. (This latin square can also be obtained from the previously mentioned web page.) Hence, in this particular parameter situation, the singular direct product can be used. We believe that there are many other cases where this construction can be used; however, we have not searched for others at this time.
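The arithmetic behind Corollary 4.2 and the example $a = 12$, $b = 10$, $c = 17$ can be checked with a few lines. Theorem 2.2 is not restated in this section, so the form of its condition (3) used below, $pq + qr + pr = px + qy + rz$ for a $(p, q, r; x, y, z)$-FLS, is an assumption inferred from the instance displayed in the proof of Proposition 3.3; the function names are likewise illustrative.

```python
def fls_condition_3(p, q, r, x, y, z):
    """Assumed form of condition (3) of Theorem 2.2 for a (p, q, r; x, y, z)-FLS."""
    return p * q + q * r + p * r == p * x + q * y + r * z


def corollary_4_2(a, b, c):
    """FLS parameters, new order and new split for the case w = 3."""
    fls = (a, b - 1, c - 2, c, a - 1, b - 2)
    new_order = 3 * (a + b + c) - 6
    new_split = (2 * a + c, 2 * b + a - 2, 2 * c + b - 4)
    return fls, new_order, new_split


fls, new_order, new_split = corollary_4_2(12, 10, 17)
print(fls, fls_condition_3(*fls), new_order, new_split)
```

Under this reading, the condition for the FLS of Corollary 4.2 reduces exactly to $2a + 3 = b + c$, in agreement with the text, and the three new class sizes sum to $3v - 6$.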
5. Conclusion and open problems

The problem of bicolorable Steiner triple systems remains open, of course. We have shown that the direct product construction from [CDR] can be modified to a singular direct product construction. The singular direct product construction depends on the existence of a generalized form of forbidden latin square, which is an interesting open problem in its own right. In particular, we ask if the necessary conditions from Theorem 2.2 are sufficient for existence of an $(a, b, c; x, y, z)$-FLS.

An obvious further generalization of forbidden latin squares is to specify possibly different partitions associated with rows, columns and symbols, i.e., an $(a, b, c; u, v, w; x, y, z)$-FLS. Necessary conditions analogous to those of Theorem 2.2 could easily be written down, and one could perhaps formulate more general versions of the singular direct product construction that would use these more general forbidden squares.

Acknowledgements

D. R. Stinson's research is supported by the Natural Sciences and Engineering Research Council of Canada through the following grants: NSERC-IRC #216431-96 and NSERC-RGPIN #203114-98.
References

[CD] C. J. Colbourn and J. H. Dinitz (eds.), The CRC Handbook of Combinatorial Designs, CRC Press, Inc., 1996.

[CDR] C. J. Colbourn, J. H. Dinitz and A. Rosa, Bicoloring Steiner Triple Systems, Electron. J. Combin. 6 (1999), #R25.

[MiT1] L. Milazzo and Zs. Tuza, Upper chromatic number of Steiner triple and quadruple systems, Discrete Math. 174 (1997), 247–259.

[MiT2] L. Milazzo and Zs. Tuza, Strict colourings for classes of Steiner triple systems, Discrete Math. 182 (1998), 233–243.

[MRV] S. Milici, A. Rosa and V. Voloshin, Colouring Steiner systems with specified block colour patterns, Discrete Math. 240 (2001), 145–160.

[Ro] A. Rosa, Steiner triple systems and their chromatic number, Acta Fac. Rer. Nat. Univ. Comen. Math. 24 (1970), 159–174.

[RC] A. Rosa and C. J. Colbourn, Colorings of block designs, in: Contemporary Design Theory: A Collection of Surveys (J. H. Dinitz and D. R. Stinson, eds.), John Wiley & Sons, 1992, 401–430.

[St] D. R. Stinson, Hill-climbing algorithms for the construction of combinatorial designs, Ann. Discrete Math. 26 (1985), 321–334.

[Vol] V. I. Voloshin, On the upper chromatic number of a hypergraph, Australas. J. Combin. 11 (1995), 25–45.
J. H. Dinitz
Mathematics and Statistics
University of Vermont
Burlington, Vermont 05405, U.S.A.

D. R. Stinson
Combinatorics and Optimization
University of Waterloo
Waterloo, Ontario N2L 3G1, Canada
On semi-regular relative difference sets in non-abelian p-groups

Dominic Elvira∗ and Yutaka Hiramine

Abstract. In this article, we study semi-regular relative difference sets (RDS's) in non-abelian $p$-groups containing a maximal cyclic subgroup of index $p$. In particular, we prove that if the modular $p$-group $M_n(p)$ has a non-trivial semi-regular RDS then the possible order of the forbidden subgroup is $p$ unless $(n, p) = (4, 2)$. Moreover, we prove that $M_n(2)$ ($n \ge 5$) and the semi-dihedral group $SD_{2^n}$ ($n \ge 4$) do not contain non-trivial semi-regular RDS's.

2000 Mathematics Subject Classification: 05B10.
1. Introduction

An $(m, u, k, \lambda)$ relative difference set (RDS) in a group $G$ of order $mu$ relative to a normal subgroup $U$ of order $u$ is a $k$-element subset $R$ of $G$ such that, for every $g \in G$ with $g \ne 1$, the number of ordered pairs $(r_1, r_2)$ with $r_1 r_2^{-1} = g$ ($r_1, r_2 \in R$) is $\lambda$ if $g \in G \setminus U$ and $0$ if $g \in U$. By this inherent property of $U$, we often call it the forbidden subgroup. If $k = u\lambda$, we call $R$ a semi-regular RDS and its parameters are given by $(u\lambda, u, u\lambda, \lambda)$. In this case, $R$ is a set of coset representatives of $G/U$. Moreover, if $u = 1$, $R$ is called a trivial semi-regular RDS. Any group $G$ itself is a trivial semi-regular RDS.

Semi-regular RDS's in abelian $p$-groups have been studied vigorously by many authors ([1], [9], [13]). In the non-abelian case, $(4t, 2, 4t, 2t)$ RDS's have been studied by N. Ito, who proved the existence of $(2^{n-1}, 2, 2^{n-1}, 2^{n-2})$ RDS's in the generalised quaternion group $Q_{2^n}$ ($n \ge 3$) and the extra-special 2-groups. He also conjectured the existence of RDS's in the dicyclic group $Q_{8t}$ for every $t \ge 1$ ([6]). B. Schmidt further strengthened Ito's conjecture by showing the existence of $(4t, 2, 4t, 2t)$ RDS's in $Q_{8t}$ with $t$ of some particular form and verified this conjecture for all $t \le 46$ ([13]).

Semi-regular RDS's in non-abelian groups have been studied by the authors also. In [4], we proved the non-existence of non-trivial semi-regular RDS's in the dihedral group $D_{2m}$ for any $m \ge 1$. We also gave an alternative proof of the existence of a semi-regular RDS in $Q_{2^n}$ ($n \ge 3$) with parameters $(2^{n-1}, 2, 2^{n-1}, 2^{n-2})$ and constructed an example.

This time, we aim to explore further the properties of RDS's in some non-abelian groups and exhibit examples when possible, to understand the situation at least in non-abelian $p$-groups. Specifically, we have been considering non-abelian $p$-groups $G$ of order $p^n$ containing a cyclic subgroup $H$ of order $p^{n-1}$. A complete classification of these groups was given in [5], namely:

(i) $M_n(p)$, the modular $p$-group of order $p^n$ with $n \ge 3$ if $p > 2$ and $n \ge 4$ if $p = 2$,
(ii) $D_{2^n}$,
(iii) $Q_{2^n}$, and
(iv) $SD_{2^n}$, the semi-dihedral group of order $2^n$ with $n \ge 4$.

The search for RDS's in the modular $p$-group $M_n(p)$ and the semi-dihedral group $SD_{2^n}$ have been open problems. In this article, we focus our attention on the groups $SD_{2^n}$ and $M_n(p)$ where $p$ is any prime. In particular, we prove that if $M_n(p)$ has a non-trivial semi-regular RDS then the possible order of the forbidden subgroup is $p$ unless $(n, p) = (4, 2)$. We also show that $M_n(2)$ ($n \ge 5$) and $SD_{2^n}$ ($n \ge 4$) do not contain non-trivial semi-regular RDS's.

∗ This author is a faculty member of Philippine Normal University (PNU), Manila on study leave at Kumamoto University under a Monbusho grant.

Codes and Designs, Ohio State Univ. Math. Res. Inst. Publ. 10
© Walter de Gruyter 2002
2. Preliminaries and terminology

In this section, known results that will be used frequently are provided. All groups and sets are assumed to be finite and the terminology applied is as in [5] and [11]. For a subset $X$ of $G$, we set $X^{-1} = \{x^{-1} \mid x \in X\}$ and throughout this article we identify a subset $X$ of $G$ with the group ring element $\sum_{x \in X} x \in \mathbb{Z}[G]$. By definition, a $k$-subset $R$ of $G$ is an $(m, u, k, \lambda)$ RDS in $G$ relative to $U$ ($\trianglelefteq G$) if and only if $R$ satisfies the equation
\[ RR^{-1} = k + \lambda(G - U) \]
in the group ring $\mathbb{Z}[G]$. In the rest of this article, we consider only non-trivial semi-regular RDS's. We also consider the forbidden subgroup $U$ to be always a normal subgroup of $G$, and by the symbol $p$ we mean a prime number.

The following result due to Elliot and Butson [3] is basic in the study of relative difference sets.

Result 2.1. Let $R$ be an $(m, u, k, \lambda)$ RDS in a group $G$ relative to a normal subgroup $U$ and let $U_1$ be a normal subgroup of $G$ of order $u_1$ contained in $U$. Set $\bar{G} = G/U_1$. Then $\bar{R}$ is an $(m, u/u_1, k, u_1\lambda)$ RDS in $\bar{G}$ relative to $\bar{U}$.

For a group $G$, we denote its exponent by $\exp(G)$.

Result 2.2. Let $G$ be an abelian group of order $p^{a+b}$. If $G$ contains a $(p^a, p^b, p^a, p^{a-b})$ RDS $R$ relative to $U$ then the following exponent bound conditions hold:

(i) $\exp(G) \le p^a$ unless $G \cong \mathbb{Z}_4$,
(ii) $\exp(G) \le p^{a+b-[a/2]}$, and
(iii) if $a$ is odd and $p > 2$, then $\exp(G) \le p^{(a+1)/2}$ and if $p = 2$, then $\exp(U) \le p^{(a+1)/2}$.

In the above result, (i) and (ii) are Corollaries 3.2 and 3.5 in [10], respectively, while (iii) is Theorem 4.2 in [8].

Definition 2.3. Let $\langle x \rangle$ be a cyclic group of order $n$. Let $S$ be a collection of elements of $\langle x \rangle$. Here we assume that $S$ can contain an element several times. Set $S = \{x^{m_1}, x^{m_2}, \dots, x^{m_r}\}$. We define $\varepsilon(S) = m_1 + m_2 + \cdots + m_r \pmod{n}$. We note that $\varepsilon(S)$ is uniquely determined modulo $n$.
3. The parameters of a semi-regular RDS in $M_n(p)$

In this section, we study semi-regular RDS's in the modular $p$-group $M_n(p)$ of order $p^n$. Using generators $x$ and $y$, this group is defined by
\[ M_n(p) = \langle x, y \mid x^{p^{n-1}} = y^p = 1,\ y^{-1}xy = x^{1+p^{n-2}} \rangle. \]
Set $G = M_n(p)$ and $z = x^{p^{n-2}}$. Then by Theorem 5.4.3 in [5], $[G, G] = \langle z \rangle$ and $Z(G) = \langle x^p \rangle$. We note that when $p = 2$, we have $n \ge 4$ as $M_3(2) \cong D_8$, and if $p > 2$, we have $n \ge 3$. Moreover, one can verify that its automorphism group is
\[ \mathrm{Aut}(G) = \{\theta_{i,j,k} \mid 0 \le i \le p^{n-1} - 1,\ i \not\equiv 0 \pmod{p},\ 0 \le j, k \le p - 1\}, \tag{1} \]
where $\theta_{i,j,k}$ is determined by $\theta_{i,j,k}(x) = x^i y^j$ and $\theta_{i,j,k}(y) = x^{kp^{n-2}} y$. Also, for every $g \in G$, we can write $g = x^i y^j$ where $0 \le i \le p^{n-1} - 1$ and $0 \le j \le p - 1$. It is an easy exercise to prove that the following group operations hold in $G$:
\begin{align*}
(x^a y^b)(x^c y^d) &= x^{a+c-bcp^{n-2}} y^{b+d}, \\
(x^a y^b)(x^c y^d)^{-1} &= x^{a-c+c(b-d)p^{n-2}} y^{b-d}, \\
(x^a y^b)^m &= x^{ma-ab(1+2+\cdots+(m-1))p^{n-2}} y^{mb}.
\end{align*}
We assume that $R$ is a $(p^a, p^b, p^a, p^{a-b})$ RDS in $G$ relative to a normal subgroup $U$. As $|G| = p^{a+b}$, we must have $n = a + b$ and $a \ge b$.

Lemma 3.1. If $R$ is a semi-regular RDS in $G$ relative to $U$ then its possible parameters are one of the following cases:

(i) $(8, 4, 8, 2)$,
(ii) $(p^2, p^2, p^2, 1)$ with $p \ge 2$,
(iii) $(p^{n-1}, p, p^{n-1}, p^{n-2})$ with $p \ge 2$.
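The multiplication rule for $M_n(p)$ stated above lends itself to a direct machine check. The following sketch encodes $x^a y^b$ as the pair `(a, b)` and verifies, for a few small cases, associativity, the defining relations, and the power formula. The encoding and function names are illustrative assumptions, not notation from the paper.

```python
from itertools import product


def modular_group(n, p):
    """Elements and multiplication of M_n(p), with x^a y^b stored as (a, b)."""
    m, q = p ** (n - 1), p ** (n - 2)
    elements = list(product(range(m), range(p)))

    def mul(g, h):
        (a, b), (c, d) = g, h
        return ((a + c - b * c * q) % m, (b + d) % p)

    return elements, mul, m, q


def power(mul, g, k):
    """g^k computed by repeated multiplication."""
    result = (0, 0)
    for _ in range(k):
        result = mul(result, g)
    return result


def check(n, p):
    elements, mul, m, q = modular_group(n, p)
    x, y, e = (1, 0), (0, 1), (0, 0)
    associative = all(mul(mul(g, h), k) == mul(g, mul(h, k))
                      for g in elements for h in elements for k in elements)
    y_inv = power(mul, y, p - 1)
    relation = mul(mul(y_inv, x), y) == power(mul, x, 1 + q)   # y^-1 x y = x^(1+q)
    orders = power(mul, x, m) == e and power(mul, y, p) == e
    power_rule = all(
        power(mul, (a, b), k) == ((k * a - a * b * (k * (k - 1) // 2) * q) % m, (k * b) % p)
        for (a, b) in elements for k in range(1, p + 2))
    non_abelian = mul(x, y) != mul(y, x)
    return associative and relation and orders and power_rule and non_abelian


print(check(3, 3), check(4, 2), check(5, 2))
```

Associativity relies on $p^{2(n-2)} \equiv 0 \pmod{p^{n-1}}$, which holds for $n \ge 3$.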
Proof. If $|U| = p$, then we have case (iii). Assume $|U| = p^b \ge p^2$. Since $\langle z \rangle = [G, G] \le Z(G) = \langle x^p \rangle$, we have $[G, G] \le U$. Set $\bar{G} = G/[G, G]$ ($\cong \mathbb{Z}_{p^{n-2}} \times \mathbb{Z}_p$). Then by Result 2.1, $\bar{R}$ is a non-trivial abelian $(p^a, p^{b-1}, p^a, p^{a-b+1})$ RDS in $\bar{G}$ relative to $\bar{U}$. Applying Result 2.2(i) to $\bar{G}$, we get $b = 2$. As $a \ge b$, we also have $a \ge 2$. By Result 2.2(ii), $a + b - 2 \le a + b - 1 - [a/2]$. Hence $a \le 3$. Thus we have $(a, b) = (2, 2)$ or $(3, 2)$. By Result 2.2(iii), $(a, b) \ne (3, 2)$ when $p > 2$. Hence $(a, b) = (2, 2)$ or $(a, b, p) = (3, 2, 2)$. Therefore we have the lemma.

We first settle case (i) of Lemma 3.1 by the following:

Lemma 3.2. There exists no $(8, 4, 8, 2)$ RDS $R$ in $M_5(2)$.

Proof. Set $G = M_5(2)$ and let $G = \langle x, y \mid x^{16} = y^2 = 1,\ y^{-1}xy = xz \rangle$ where $z = x^8$. Assume that $R$ is an $(8, 4, 8, 2)$ RDS in $M_5(2)$ relative to a normal subgroup $U$. Set $\bar{G} = G/\langle z \rangle$ ($\cong \mathbb{Z}_8 \times \mathbb{Z}_2$). Then, by an argument similar to the proof of Lemma 3.1, $\langle z \rangle \subset U$ and $\bar{R}$ is an $(8, 2, 8, 4)$ RDS in $\bar{G}$ relative to $\bar{U}$. By Theorem 4.4 of [9], $\bar{U} = \langle \bar{x}^4 \rangle$. Hence $U = \langle x^4 \rangle$. Set $H = \langle x^2 \rangle$ ($= Z(G)$) and $R = A + Bx + Cy + Dxy$, where $A, B, C, D \subset H$. Then each of $A, B, C, D$ is a set of coset representatives of $H/U$. Since $RR^{-1} = 8 + 2(G - U) = 8 + 2(H - U) + 2Hx + 2Hy + 2Hxy$, we have $2Hx = A(Bx)^{-1} + (Bx)A^{-1} + (Cy)(Dxy)^{-1} + (Dxy)(Cy)^{-1}$. Hence
\[ AB^{-1}x^{-2} + A^{-1}B + CD^{-1}x^{-2} + C^{-1}D = 2H. \tag{2} \]
Set $A = x^{4a} + x^{4b+2}$, $B = x^{4c} + x^{4d+2}$, $C = x^{4e} + x^{4f+2}$ and $D = x^{4g} + x^{4h+2}$. By (2), we have
\begin{align*}
2(x^2 + x^6 + x^{10} + x^{14}) ={}& x^{4a-4c-2} + x^{4b-4d-2} + x^{-4a+4d+2} + x^{-4b-2+4c} \\
&+ x^{4e-4g-2} + x^{4f-4h-2} + x^{-4e+4h+2} + x^{-4f-2+4g}.
\end{align*}
Thus $\varepsilon(2(x^2 + x^6 + x^{10} + x^{14})) \equiv -8 \pmod{16}$. Therefore $0 \equiv 8 \pmod{16}$, a contradiction.

In the rest of this section, we settle case (ii) of Lemma 3.1. We assume that the RDS $R$ in $G = M_n(p)$ has parameters $(p^2, p^2, p^2, 1)$. We have $M_4(p) = \langle x, y \mid x^{p^3} = y^p = 1,\ y^{-1}xy = xz \rangle$ where $z = x^{p^2}$ and $p \ge 2$. There are exactly $p + 1$ normal subgroups of $G$ of order $p^2$, namely: $\langle x^p \rangle$, $\langle z, y \rangle$, $\langle x^{ip} y \rangle$ ($1 \le i \le p - 1$). By (1), it suffices to consider only the following three cases for $U$:

(i) $U = \langle x^p \rangle \cong \mathbb{Z}_{p^2}$,
(ii) $U = \langle z, y \rangle \cong \mathbb{Z}_p \times \mathbb{Z}_p$,
(iii) $U = \langle x^p y \rangle \cong \mathbb{Z}_{p^2}$.

We define subsets $A_i$ ($0 \le i \le p - 1$) by $A_i = Ry^{-i} \cap \langle x \rangle$. Then $R = A_0 + A_1 y + \cdots + A_{p-1} y^{p-1}$. Set $RR^{-1} = B_0 + B_1 y + \cdots + B_{p-1} y^{p-1}$, where $B_0, B_1, \dots, B_{p-1} \in \mathbb{Z}[\langle x \rangle]$. Then one can check that
\[ B_0 = \sum_{0 \le i \le p-1} A_i A_i^{-1} = p^2 + \langle x^p \rangle x + \cdots + \langle x^p \rangle x^{p-1}. \tag{3} \]
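Lemma 3.3. There is no $(p^2, p^2, p^2, 1)$ RDS in $G$ relative to $U$ of type (i).

Proof. Set $S = RR^{-1} \cap \langle x^p \rangle x$. By (3), $S = \langle x^p \rangle x$. Since $R$ is a set of coset representatives of $G/U$ and $U = \langle x^p \rangle \ge \langle z \rangle$, it follows that
\[ A_i = x^{pa_{i,0}} + x^{pa_{i,1}+1} + x^{pa_{i,2}+2} + \cdots + x^{pa_{i,p-1}+p-1} \quad (a_{i,j} \in \mathbb{Z}) \]
for each $i \in \{0, 1, \dots, p-1\}$. Hence
\begin{align*}
\varepsilon(\langle x^p \rangle x) &= 1 + (1 + p) + (1 + 2p) + \cdots + (1 + (p^2 - 1)p) \\
&\equiv \sum_{0 \le i \le p-1} \Big[ \big(p(a_{i,0} - a_{i,p-1}) - (p-1)\big) + \big(p(a_{i,1} - a_{i,0}) + 1\big) + \big(p(a_{i,2} - a_{i,1}) + 1\big) + \cdots + \big(p(a_{i,p-1} - a_{i,p-2}) + 1\big) \Big] \pmod{p^3}.
\end{align*}
Thus $p^2 \equiv 0 \pmod{p^3}$, a contradiction.

Lemma 3.4. Assume $p > 2$. If $U$ is a subgroup of $G$ of type (ii) or (iii) then we may assume the following:
\[ A_0 = \sum_{0 \le j \le p-1} x^{jp + a_j p^2} + \sum_{1 \le j \le p-1} x^{m_j + b_j p^2}, \qquad A_i = \sum_{1 \le j \le p-1} x^{m_j + c_{ij} p} \quad (1 \le i \le p-1), \]
where $a_j, b_j, c_{ij}, m_j \in \mathbb{Z}$ and $m_j \equiv j \pmod{p}$ for any $i, j$.

Proof. Set $\bar{G} = G/\langle z \rangle \cong \mathbb{Z}_{p^2} \times \mathbb{Z}_p$. Then $\bar{R}$ is a $(p^2, p, p^2, p)$ RDS in $\bar{G}$ relative to $\bar{U} \cong \mathbb{Z}_p$. Here $\bar{U} = \langle \bar{y} \rangle$ or $\langle \bar{x}^p \bar{y} \rangle$. By Theorem 3.2 of [8], a translate of $\bar{R}$ has the following property:
\[ \bar{R} = \langle \bar{x}^p \rangle + g_1 H_1 + \sum_{2 \le i \le p-1} g_i \langle \bar{x}^p \bar{y}^i \rangle, \quad\text{where } H_1 = \begin{cases} \langle \bar{x}^p \bar{y} \rangle & \text{if } \bar{U} = \langle \bar{y} \rangle, \\ \langle \bar{y} \rangle & \text{if } \bar{U} = \langle \bar{x}^p \bar{y} \rangle, \end{cases} \]
and $g_i = \bar{x}^{n_i}$ ($1 \le i \le p - 1$) with $\{n_1, \dots, n_{p-1}\} = \{1, \dots, p-1\} \pmod{p}$. Hence,
\[ R = \sum_{0 \le j \le p-1} x^{jp + a_j p^2} + \sum_{1 \le i \le p-1} \sum_{0 \le j \le p-1} x^{n_i + jp + e_{i,j} p^2} y^{ij} \]
or
\[ R = \sum_{0 \le j \le p-1} x^{jp + a_j p^2} + \sum_{0 \le j \le p-1} x^{n_1 + e_{1,j} p^2} y^{j} + \sum_{2 \le i \le p-1} \sum_{0 \le j \le p-1} x^{n_i + jp + e_{i,j} p^2} y^{ij}, \]
depending on whether $\bar{U} = \langle \bar{y} \rangle$ or $\bar{U} = \langle \bar{x}^p \bar{y} \rangle$, respectively. Here $a_j, e_{i,j} \in \mathbb{Z}$. Set
\[ w_1 = \begin{cases} n_1 & \text{if } \bar{U} = \langle \bar{y} \rangle, \\ n_1 - p & \text{if } \bar{U} = \langle \bar{x}^p \bar{y} \rangle, \end{cases} \]
and $w_i = n_i$ ($2 \le i \le p - 1$). Then
\[ R = \sum_{0 \le j \le p-1} x^{jp + a_j p^2} + \sum_{1 \le i \le p-1} \sum_{0 \le j \le p-1} x^{w_i + jp + e_{i,j} p^2} y^{ij}. \]
Hence
\[ A_0 = \sum_{0 \le j \le p-1} x^{jp + a_j p^2} + \sum_{1 \le i \le p-1} x^{w_i + e_{i,0} p^2}. \tag{4} \]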
We now consider $A_t y^t$ for $t \in \{1, 2, \dots, p-1\}$. Let $s \in \{1, \dots, p-1\}$. Then there is a unique solution $(i, j)$ to the following simultaneous equations:
\[ ij \equiv t \pmod{p}, \qquad w_i + jp + e_{i,j} p^2 \equiv s \pmod{p}. \]
Thus for any $t \in \{1, 2, \dots, p-1\}$, we have
\[ A_t = \sum_{1 \le s \le p-1} x^{s + c_{t,s} p} \quad (1 \le t \le p-1) \tag{5} \]
for some $c_{t,s} \in \mathbb{Z}$. By (4) and (5), we have the lemma.

Lemma 3.5. Set $S = RR^{-1} \cap \langle x^p \rangle x$ and assume $p > 2$. Then $\varepsilon(S) \equiv p^2 \pmod{p^3}$ when the subgroup $U$ of $G$ is of type (ii) or (iii).

Proof. As $RR^{-1} = p^2 + G - U$ and $U \cap \langle x^p \rangle x = \emptyset$, we have $S = \sum_{0 \le i \le p^2-1} x^{1+ip}$, hence $\varepsilon(S) \equiv 1 \cdot p^2 + p(0 + 1 + \cdots + (p^2 - 1)) \equiv p^2 \pmod{p^3}$.

Lemma 3.6. Let $U$ be a subgroup of $G$ of type (ii) or (iii). If $p > 2$ then there exists no $(p^2, p^2, p^2, 1)$ RDS in $G$ relative to $U$.

Proof. By Lemma 3.4,
\[ A_0 = \sum_{0 \le j \le p-1} x^{u_j} + \sum_{1 \le j \le p-1} x^{v_j} \quad\text{and}\quad A_i = \sum_{1 \le j \le p-1} x^{w_{ij}} \]
for each $i \in \{1, \dots, p-1\}$, where $u_j \equiv 0 \pmod{p}$, $v_j \equiv j \pmod{p}$, and $w_{ij} \equiv j \pmod{p}$. Set $S_i = A_i A_i^{-1} \cap \langle x^p \rangle x$ for each $i \in \{0, 1, \dots, p-1\}$. Then by (3), $S = S_0 + S_1 + \cdots + S_{p-1}$, where
\[ S_0 = x^{v_1 - v_0} + x^{v_2 - v_1} + \cdots + x^{v_0 - v_{p-1}} + \sum_{0 \le i \le p-1} x^{v_1 - u_i} \]
and $S_i = x^{w_{i,2} - w_{i,1}} + x^{w_{i,3} - w_{i,2}} + \cdots + x^{w_{i,p-1} - w_{i,p-2}}$ ($1 \le i \le p - 1$). Therefore,
\[ \varepsilon(S) = pv_1 - (u_0 + \cdots + u_{p-1}) + \sum_{1 \le i \le p-1} (w_{i,p-1} - w_{i,1}) \equiv (p-1)(p-2) \equiv 2 \pmod{p}. \]
On the other hand, $\varepsilon(S) \equiv 0 \pmod{p}$ by Lemma 3.5. As $p > 2$, this is a contradiction. Thus we have the lemma.

Remark 3.7. The only case excluded by Lemma 3.6 is when $p = 2$. In this case, K. Akiyama showed that $R_0 = \{1, x^2 y, x^3 y, x^5 y\}$ is a $(4, 4, 4, 1)$ RDS in $M_4(2)$ relative to $U = \langle z, y \rangle \cong \mathbb{Z}_2 \times \mathbb{Z}_2$. One can easily check that any $(4, 4, 4, 1)$ RDS in $M_4(2)$ is a translate of $R_0$ and the corresponding projective plane is desarguesian of order 4.

By Lemmas 3.1, 3.2, 3.3, 3.6 and Remark 3.7, we obtain the following:

Proposition 3.8. Let $R$ be a non-trivial semi-regular RDS in the modular $p$-group $M_n(p)$ relative to a normal subgroup $U$. Then either $U \cong \mathbb{Z}_p$ or $(n, p) = (4, 2)$ and $U \cong \mathbb{Z}_2 \times \mathbb{Z}_2$.
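The set $R_0$ of Remark 3.7 is small enough to verify directly from the definition of an RDS. In the sketch below, $x^a y^b \in M_4(2)$ is encoded as the pair `(a, b)` and the multiplication follows the rule given at the start of Section 3; the encoding and helper names are illustrative assumptions.

```python
from collections import Counter
from itertools import product

N, P = 4, 2
M, Q = P ** (N - 1), P ** (N - 2)        # |<x>| = 8 and z = x^4


def mul(g, h):
    (a, b), (c, d) = g, h
    return ((a + c - b * c * Q) % M, (b + d) % P)


def inv(g):
    c, d = g
    return ((-c - c * d * Q) % M, (-d) % P)


def is_rds(R, U, lam):
    """Each g != 1 occurs lam times as r1 * r2^-1 if g is outside U, else 0 times."""
    counts = Counter(mul(r1, inv(r2)) for r1 in R for r2 in R if r1 != r2)
    group = product(range(M), range(P))
    return all(counts[g] == (0 if g in U else lam) for g in group if g != (0, 0))


R0 = [(0, 0), (2, 1), (3, 1), (5, 1)]    # {1, x^2 y, x^3 y, x^5 y}
U = {(0, 0), (4, 0), (0, 1), (4, 1)}     # <z, y>

print(is_rds(R0, U, 1))
```

The twelve quotients $r_1 r_2^{-1}$ with $r_1 \ne r_2$ are exactly the twelve elements of $G \setminus U$, each occurring once.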
4. Non-existence when $G \cong M_n(2)$ or $SD_{2^n}$

In this section, we show the non-existence of $(2^{n-1}, 2, 2^{n-1}, 2^{n-2})$ RDS in $M_n(2)$ and $SD_{2^n}$.

Lemma 4.1. Let $H = \langle w \rangle$ be a cyclic 2-group of order at least 4 and $z$ the unique involution in $H$. Let each of $A$ and $B$ be a set of coset representatives of $H/\langle z \rangle$. Then $AB^{-1} \ne A^{-1}Bw^c$ for any odd integer $c$.

Proof. Let $2^{m+1}$ be the order of $w$. Since each of $A$ and $B$ is a set of coset representatives of $H/\langle z \rangle$ ($z = w^{2^m}$), we can put
\[ A = \sum_{0 \le i \le 2^m - 1} w^{2^m a_i + i} \quad\text{and}\quad B = \sum_{0 \le i \le 2^m - 1} w^{2^m b_i + i}, \]
for suitable $a_i, b_j \in \{0, 1\}$. We also set $AB^{-1} = \sum_{0 \le i \le 2^{m+1} - 1} c_i w^i$. As $z = w^{2^m}$ and $B^{-1}$ is a set of coset representatives of $H/\langle z \rangle$, we have $AB^{-1}(1 + w^{2^m}) = AH = 2^m H$. Hence
\[ c_i + c_{2^m + i} = 2^m \quad (0 \le i \le 2^m - 1). \tag{6} \]
Let $S$ be the sum of exponents of the terms $1$ or $w^{2^m}$ in $AB^{-1}$ and let $T$ be the sum of exponents of the terms $w$ or $w^{2^m + 1}$ in $AB^{-1}$. Then
\[ S = \sum_{0 \le i \le 2^m - 1} (2^m a_i - 2^m b_i) = \sum_{0 \le i \le 2^m - 1} 2^m a_i - \sum_{0 \le i \le 2^m - 1} 2^m b_i \]
and
\[ T = 2^m a_0 - (2^m b_{2^m - 1} + 2^m - 1) + \sum_{1 \le i \le 2^m - 1} (2^m a_i - 2^m b_{i-1} + 1) = \sum_{0 \le i \le 2^m - 1} 2^m a_i - \sum_{0 \le i \le 2^m - 1} 2^m b_i. \]
Thus $S = T$. First assume that
\[ AB^{-1} = A^{-1}Bw. \tag{7} \]
Since
\[ A^{-1}Bw = \sum_{0 \le i \le 2^{m+1} - 1} c_i w^{2^{m+1} + 1 - i} = c_1 + c_0 w + \sum_{2 \le j \le 2^{m+1} - 1} c_{2^{m+1} + 1 - j} w^j, \]
it follows that
\[ c_0 = c_1, \qquad c_i = c_{2^{m+1} + 1 - i} \quad (2 \le i \le 2^{m+1} - 1). \tag{8} \]
On the other hand $S = 0 \cdot c_0 + 2^m c_{2^m}$ and $T = 1 \cdot c_1 + (2^m + 1) c_{2^m + 1} \equiv c_0 + (2^m + 1) c_{2^m} \pmod{2^{m+1}}$ by (8). As $S = T$, we have $c_0 + c_{2^m} \equiv 0 \pmod{2^{m+1}}$, contrary to (6).

We now assume that $AB^{-1} = A^{-1}Bw^c$ for some odd integer $c$. We consider an automorphism $\sigma$ of $H$ given by $\sigma(x) = x^d$, where $d$ is an integer such that $cd \equiv 1 \pmod{2^{m+1}}$. We note that each of $\sigma(A)$ and $\sigma(B)$ is a set of coset representatives of $H/\langle z \rangle$ and that $\sigma(A)\sigma(B)^{-1} = \sigma(A)^{-1}\sigma(B)\sigma(w^c) = \sigma(A)^{-1}\sigma(B)w$. The last equation is similar to (7) and so this is a contradiction. Therefore the lemma holds.

Lemma 4.2. Let $R$ be a $(2^{n-1}, 2, 2^{n-1}, 2^{n-2})$ RDS in $G \cong M_n(2)$ or $SD_{2^n}$ relative to $U$. Set $G = \langle x, y \mid x^{2^{n-1}} = y^2 = 1,\ y^{-1}xy = x^{2^{n-2} \pm 1} \rangle$ so that $R = A_0 + A_1 x + B_0 y + B_1 xy$, where $A_0, A_1, B_0, B_1 \subset \langle x^2 \rangle$. Then

(i) each of $A_0, A_1, B_0, B_1$ is a set of coset representatives of $\langle x^2 \rangle / \langle z \rangle$, where $z$ is the unique involution of $\langle x^2 \rangle$,
(ii) we have $B_0 B_1^{-1} = B_0^{-1} B_1 x^2 z$.

Proof. Set $H = \langle x^2 \rangle$ ($\cong \mathbb{Z}_{2^{n-2}}$). Since $U = \langle z \rangle$, (i) is obvious. We note that $y$ centralizes or inverts $H$ depending on whether $G \cong M_n(2)$ or $G \cong SD_{2^n}$.

We first assume that $G \cong M_n(2)$. Then
\begin{align*}
RR^{-1} ={}& A_0 A_0^{-1} + A_1 A_1^{-1} + B_0 B_0^{-1} + B_1 B_1^{-1} \\
&+ (A_0 A_1^{-1} x^{-2} + A_0^{-1} A_1 + B_0 B_1^{-1} x^{-2} + B_0^{-1} B_1)x \\
&+ (A_0 B_0^{-1} + A_0^{-1} B_0 + A_1 B_1^{-1} z + A_1^{-1} B_1 z)y \\
&+ (A_1 B_0^{-1} + A_0^{-1} B_1 + A_0 B_1^{-1} x^{-2} z + A_1^{-1} B_0 x^{-2} z)xy.
\end{align*}
On the other hand, $RR^{-1} = 2^{n-1} + 2^{n-2}((H - U) + Hx + Hy + Hxy)$. Hence $(A_0 A_1^{-1} x^{-2} + A_0^{-1} A_1 + B_0 B_1^{-1} x^{-2} + B_0^{-1} B_1)x = 2^{n-2} Hx$. From this we have
\[ A_0 A_1^{-1} + A_0^{-1} A_1 x^2 + B_0 B_1^{-1} + B_0^{-1} B_1 x^2 = 2^{n-2} H. \tag{9} \]
As $R^{-1}R = RR^{-1}$ by Proposition 2.8 of [7], similarly we have
\[ A_0 A_1^{-1} + A_0^{-1} A_1 x^2 + B_0 B_1^{-1} z + B_0^{-1} B_1 x^2 z = 2^{n-2} H. \tag{10} \]
By (9) and (10), we have $B_0 B_1^{-1} + B_0^{-1} B_1 x^2 = (B_0 B_1^{-1} + B_0^{-1} B_1 x^2)z$. On the other hand, $(B_0 B_1^{-1} + B_0^{-1} B_1 x^2) + (B_0 B_1^{-1} + B_0^{-1} B_1 x^2)z = (B_0 B_1^{-1} + B_0^{-1} B_1 x^2)U = 2^{n-2} H$ since each of $B_0$ and $B_1$ is a set of coset representatives of $H/U$. Thus we have $B_0 B_1^{-1} + B_0^{-1} B_1 x^2 = 2^{n-3} H$. Moreover $B_0 B_1^{-1} + B_0 B_1^{-1} z = B_0 B_1^{-1} U = 2^{n-3} H$. It follows that $B_0^{-1} B_1 x^2 = B_0 B_1^{-1} z$. Therefore we have $B_0 B_1^{-1} = B_0^{-1} B_1 x^2 z$.

We now assume that $G \cong SD_{2^n}$. Similar to the preceding argument, we have
\[ A_0 A_1^{-1} + A_0^{-1} A_1 x^2 + B_0 B_1^{-1} + B_0^{-1} B_1 x^2 = 2^{n-2} H \]
and
\[ A_0 A_1^{-1} + A_0^{-1} A_1 x^2 + B_0 B_1^{-1} z + B_0^{-1} B_1 x^2 z = 2^{n-2} H. \]
It follows that $B_0 B_1^{-1} + B_0^{-1} B_1 x^2 = (B_0 B_1^{-1} + B_0^{-1} B_1 x^2)z$. Again, we have $B_0 B_1^{-1} = B_0^{-1} B_1 x^2 z$. Therefore the lemma holds.

Proposition 4.3. There is no $(2^{n-1}, 2, 2^{n-1}, 2^{n-2})$ RDS in $M_n(2)$ or $SD_{2^n}$.
Proof. Let notations be as in Lemma 4.2 and suppose that the proposition is false. Set $w = x^2$, $H = \langle w \rangle$, $A = B_0$ and $B = B_1$. Then, as $n \ge 4$, $|H| \ge 4$. By Lemma 4.2(ii), $AB^{-1} = A^{-1}Bw^{1+2^{n-3}}$, contrary to Lemma 4.1.

By Propositions 3.8 and 4.3 and the results of section 3 in [4], we have

Theorem 4.4. Let $G$ be a non-abelian $p$-group with a maximal cyclic subgroup. If $G$ contains a non-trivial semi-regular RDS relative to a normal subgroup $U$, then one of the following holds:

(i) $G \cong Q_{2^n}$ and $U \cong \mathbb{Z}_2$,
(ii) $G \cong M_n(p)$ and $U \cong \mathbb{Z}_p$ with $p$ an odd prime,
(iii) $G \cong M_4(2)$ and $U \cong \mathbb{Z}_2 \times \mathbb{Z}_2$.
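Lemma 4.1, on which the proof of Proposition 4.3 rests, can be confirmed by exhaustive search for the smallest cyclic 2-groups. The sketch below writes $H = \langle w \rangle$ additively as the integers modulo $2^{m+1}$ and represents group ring elements by multisets of exponents; it is an illustrative check for $m = 1, 2$ only, and the function names are assumptions.

```python
from collections import Counter
from itertools import product


def coset_rep_sets(m):
    """All sets of coset representatives of H/<z>, as exponents 2^m * bit + i."""
    half = 2 ** m
    for bits in product((0, 1), repeat=half):
        yield [half * bits[i] + i for i in range(half)]


def quotient(A, B, order, shift=0):
    """Coefficients of A * B^-1 * w^shift, indexed by exponent modulo the group order."""
    return Counter((a - b + shift) % order for a in A for b in B)


def lemma_4_1_holds(m):
    order = 2 ** (m + 1)
    sets = list(coset_rep_sets(m))
    for A, B in product(sets, repeat=2):
        left = quotient(A, B, order)                        # A B^-1
        for c in range(1, order, 2):                        # odd exponents c
            if left == quotient(B, A, order, shift=c):      # A^-1 B w^c
                return False
    return True


print(lemma_4_1_holds(1), lemma_4_1_holds(2))
```

The search space grows as $2^{2^{m+1}}$ pairs, so this brute-force approach is practical only for very small $m$.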
Remark 4.5. Let $G = M_3(p)$ with $p$ an odd prime and set $G = \langle x, y \mid x^{p^2} = y^p = 1,\ y^{-1}xy = x^{1+p} \rangle$. By Dillon's construction in [2], we can check that $R = \sum_{0 \le i \le p-1} x^i \langle x^{ip} y \rangle$ is a $(p^2, p, p^2, p)$ RDS in $G$ relative to $\langle x^p \rangle$.

By Remark 4.5, we can construct a semi-regular RDS in any extra-special $p$-group (see [5] for the definition) by the following proposition.

Proposition 4.6. Let $P$ be an extra-special $p$-group of order $p^{2m+1}$ with $m \ge 1$. Then there exists a $(p^{2m}, p, p^{2m}, p^{2m-1})$ RDS in $P$ relative to $[P, P]$ ($\cong \mathbb{Z}_p$) unless $P \cong D_8$.

Proof. By Proposition 3 of [6], it suffices to consider the case when $p$ is an odd prime. Then, by Theorem 5.5.2 of [5], $P$ is isomorphic to one of the central products $M^r$ or $M^{r-1}N$, where $M = M(p)$ (see [5]) and $N = M_3(p)$ ($p > 2$). By the product construction (see [12]), $M$ has a $(p^2, p, p^2, p)$ RDS relative to $Z(M)$. Again, by the product construction and Remark 4.5, the proposition holds.

In this article, we have shown that if a semi-regular RDS $R$ is contained in $G = M_n(p)$ relative to a normal subgroup $U$ then its possible parameters are given by $(p^{n-1}, p, p^{n-1}, p^{n-2})$ and $U \cong \mathbb{Z}_p \subseteq Z(G)$ except when $G = M_4(2)$ and $U \cong \mathbb{Z}_2 \times \mathbb{Z}_2$. In addition, if $p > 2$ and $n = 3$, we have shown the existence of a semi-regular RDS in $G$ and if $p = 2$ and $n \ge 4$, we have shown the non-existence. At this point, we pose the following:

Problem. Does there exist a $(p^{n-1}, p, p^{n-1}, p^{n-2})$ RDS in $G = M_n(p)$ relative to $U \cong \mathbb{Z}_p$ when $p > 2$ and $n \ge 4$?

We note, however, that when $G \cong M_4(3)$, we have checked the non-existence of such an RDS by a computer search.
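For $p = 3$ the set $R$ of Remark 4.5 can be checked against the group ring equation $RR^{-1} = k + \lambda(G - U)$ from Section 2. The sketch below assumes the presentation of $M_3(p)$ from Section 3 with $n = 3$, encodes $x^a y^b$ as `(a, b)`, and tests only the single case $p = 3$; all names are illustrative.

```python
from collections import Counter
from itertools import product

P = 3
M, Q = P ** 2, P                          # M_3(3): x has order 9, y^-1 x y = x^(1+3)


def mul(g, h):
    (a, b), (c, d) = g, h
    return ((a + c - b * c * Q) % M, (b + d) % P)


def inv(g):
    c, d = g
    return ((-c - c * d * Q) % M, (-d) % P)


# R = sum over i of x^i <x^(ip) y>; the coset x^i <x^(ip) y> is {x^(i + k*i*p) y^k}.
R = [((i + k * i * P) % M, k) for i in range(P) for k in range(P)]
U = {(j * P, 0) for j in range(P)}        # <x^p>
G = list(product(range(M), range(P)))

lhs = Counter(mul(r, inv(s)) for r in R for s in R)      # R R^-1
rhs = Counter({g: P for g in G if g not in U})           # lambda * (G - U), lambda = p
rhs[(0, 0)] = len(R)                                      # k * 1, with k = p^2

print(lhs == rhs)
```

Here $(m, u, k, \lambda) = (9, 3, 9, 3)$, so every element outside $\langle x^3 \rangle$ must occur exactly three times among the quotients.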
References

[1] J. A. Davis and J. Jedwab, A New Family of Relative Difference Sets in 2-Groups, Des. Codes Cryptogr. 17 (1998), 305–312.

[2] J. F. Dillon, Variations on a Scheme of McFarland for Noncyclic Difference Sets, J. Combin. Theory Ser. A 40 (1985), 9–21.

[3] J. E. H. Elliot and A. T. Butson, Relative Difference Sets, Illinois J. Math. 10 (1966), 517–531.

[4] D. T. Elvira and Y. Hiramine, On Non-Abelian Semi-Regular Relative Difference Sets, in: Finite Fields and Applications (D. Jungnickel and H. Niederreiter, eds.), Proceedings of the 5th International Conference Fq(5), University of Augsburg, Germany, Springer-Verlag, Berlin–Heidelberg 2001, 122–127.

[5] D. Gorenstein, Finite Groups, Harper and Row, New York, 1968.

[6] N. Ito, Remarks on Hadamard Groups, Kyushu J. Math. 50 (1996), 83–91.

[7] D. Jungnickel, On automorphism groups of divisible designs, Canad. J. Math. 34 (1982), 257–297.

[8] S. L. Ma and A. Pott, Relative Difference Sets, Planar Functions and Generalized Hadamard Matrices, J. Algebra 175 (1995), 505–525.

[9] S. L. Ma and B. Schmidt, On $(p^a, p, p^a, p^{a-1})$-Relative Difference Sets, Des. Codes Cryptogr. 6 (1995), 57–71.

[10] A. Pott, On the structure of abelian groups admitting divisible difference sets, J. Combin. Theory Ser. A 65 (1994), 202–213.

[11] A. Pott, Finite Geometry and Character Theory, Lecture Notes in Math. 1601, Springer-Verlag, Berlin 1995.

[12] B. Schmidt, On $(p^a, p^b, p^a, p^{a-b})$ Relative Difference Sets, J. Algebraic Combin. 6 (1997), 279–297.

[13] B. Schmidt, Williamson Matrices and a Conjecture of Ito's, Des. Codes Cryptogr. 17 (1999), 61–68.

D. Elvira
Department of Mathematics, Graduate School of Science and Technology
Kumamoto University
Kurokami, Kumamoto, Japan

Y. Hiramine
Department of Mathematics, Faculty of Education
Kumamoto University
Kurokami, Kumamoto, Japan
Every λ-design on 6p + 1 points is type-1

Nick C. Fiala

Abstract. A λ-design on $v$ points is a set of $v$ distinct subsets (blocks) of a $v$-element set (points) such that any two different blocks meet in exactly λ points and not all of the blocks have the same size. Ryser's and Woodall's λ-design conjecture states that all λ-designs can be obtained from symmetric designs by a certain complementation procedure. The main result of the present paper is that the λ-design conjecture is true when $v = 6p + 1$, where $p$ is any prime number.

2000 Mathematics Subject Classification: primary 05B05; secondary 05B30.
1. Introduction Definition 1.1. Given integers λ and v satisfying 0 < λ < v, a λ-design D on v points is a pair (X, B), where X is a set of cardinality v whose elements are called points and B is a set of v distinct subsets of X whose elements are called blocks, such that (i) For all blocks A, B ∈ B, A = B, |A ∩ B| = λ, and (ii) There exist blocks A, B ∈ B with |A| = |B|. Remark 1.2. If X is a v-set and B is a set of distinct subsets of X such that any two of these subsets intersect in λ ≥ 1 elements of X, then the non-uniform Fisher inequality [Ma], [Is] says that |B| ≤ v. Thus, λ-designs are extremal set systems in this sense. λ-designs were first defined by Ryser [Ry68], [Ry70] and Woodall [Wo70]. The only known examples of λ-designs are obtained from symmetric designs by the following complementation procedure. Let (X, A) be a symmetric (v, k, μ)-design with μ = k/2 and fix a block A ∈ A. Put B = {A} ∪ {AB : B ∈ A, B = A}, where denotes the symmetric difference of sets (we refer to this procedure as complementing with respect to the block A). Then an elementary counting argument shows that (X, B) is a λ-design on v points with one block of size k and v − 1 blocks of size 2λ, where λ = k − μ. Any λ-design obtained in this manner is called a type-1 λ-design. Remark 1.3. If μ = k/2, then all of the blocks of (X, B) have the same size, violating condition (ii) of Definition 1.1. In this case, (X, B) is a symmetric design. The reason for condition (ii) is to exclude symmetric designs from our theory. Codes and Designs Ohio State Univ. Math. Res. Inst. Publ. 10
© Walter de Gruyter 2002
Example 1.4. Complementing a projective plane of order three, i.e., a symmetric (13, 4, 1)-design, with respect to a fixed line, we obtain a 3-design on 13 points with one block of size 4 and 12 blocks of size 6 (by a 3-design, we mean a λ-design with λ = 3, not a t-design with t = 3).
The λ-design conjecture of Ryser [Ry68], [Ry70] and Woodall [Wo70] states that all λ-designs are type-1. The conjecture was proven for λ = 1 by deBruijn and Erdős [BE], for λ = 2 by Ryser [Ry68], for 3 ≤ λ ≤ 9 by Bridges and Kramer [Br70], [Kr69], [BK], for λ = 10 by Seress [Se90], for λ = 14 by Tsaur [Ts], [BT], and for all remaining λ ≤ 34 by Weisz [We]. S. S. Shrikhande and Singhi [SS] proved the conjecture for prime λ and Seress [Se01] proved it when λ is twice a prime. Investigating the conjecture as a function of v rather than λ, Ionin and M. S. Shrikhande [IS96a], [IS96b] proved the conjecture for v = p + 1, 2p + 1, 3p + 1, and 4p + 1, where p is any prime, and Hein [He], [HI] proved it for v = 5p + 1, where p is a prime not congruent to 2 or 8 modulo 15. The conjecture has also been verified by computer for all v ≤ 85 [Wo71] (it is somewhat interesting that 86 = 5 · 17 + 1 and 17 ≡ 2 (mod 15) is prime, so this value of v is not covered by Hein’s result). Continuing along these lines, in the present paper we will prove the following result.
Theorem 1.5. All λ-designs on v = 6p + 1 points, p a prime, are type-1.
The method employed to prove Theorem 1.5 is a slight extension of the method of Ionin and M. S. Shrikhande developed in [IS96a] and [IS96b] and used in [He], [HI]. In fact, Ionin and M. S. Shrikhande remarked in [IS96a] that they thought that the techniques developed in that paper might work in the v = 6p + 1 case as well. However, whereas they were always able to reduce to the case of designs having at most two distinct block sizes, in this paper we will have to deal with designs potentially possessing three different block sizes.
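The complementation procedure of Example 1.4 can be checked mechanically. The following is a minimal sketch, assuming the projective plane of order three is modelled by the translates of the difference set {0, 1, 3, 9} modulo 13; this model and the helper name `complement_design` are illustrative choices, not part of the paper.

```python
from itertools import combinations

V = 13
# Lines of the projective plane of order 3, modelled (as an assumption)
# by the translates of the perfect difference set {0, 1, 3, 9} mod 13.
lines = [frozenset((i + d) % V for d in (0, 1, 3, 9)) for i in range(V)]

def complement_design(blocks, fixed):
    """Keep the fixed block A and replace every other block B by A △ B."""
    return [fixed] + [fixed ^ b for b in blocks if b != fixed]

# Sanity check of the model: any two lines meet in exactly one point.
assert all(len(b & c) == 1 for b, c in combinations(lines, 2))

design = complement_design(lines, lines[0])
intersections = {len(b & c) for b, c in combinations(design, 2)}
sizes = sorted(len(b) for b in design)

print(intersections)  # pairwise intersection sizes of the new blocks
print(sizes)          # block sizes of the new design
```

Under this model the pairwise intersections all have size 3, and the blocks have sizes 4 (once) and 6 (twelve times), as stated in Example 1.4.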
2. Preliminary results Definition 2.1. Given a λ-design D = (X, B) and a point x ∈ X, the replication number of x is the number of blocks A ∈ B which contain x. Remark 2.2. If we complement a symmetric (v, k, μ)-design with respect to a block A, then the points lying in A will all have replication number v − k + 1 in the resulting type-1 λ-design D, and the points lying outside A will all have replication number k in D. Ryser [Ry68] and Woodall [Wo70] independently proved the following theorem concerning these replication numbers.
Theorem 2.3. If D = (X, B) is a λ-design on v points, then there exist integers r > 1 and r* > 1, r ≠ r*, such that every point x ∈ X has replication number r or r* and r + r* = v + 1. In addition, the integers r and r* satisfy the equation
\[ \frac{1}{\lambda} + \sum_{A \in \mathcal{B}} \frac{1}{|A| - \lambda} = \frac{(v-1)^2}{(r-1)(r^*-1)}. \tag{1} \]
We will also need the following two theorems concerning the integers r and r ∗ . The first was stated without proof in [Wo71]. For a proof see [Se89]. Theorem 2.4. A λ-design on v points with replication numbers r and r ∗ is type-1 if and only if r(r − 1)/(v − 1) or r ∗ (r ∗ − 1)/(v − 1) is an integer. Theorem 2.5 ([IS96a]). Let D be a λ-design with replication numbers r and r ∗ and put g = gcd(r − 1, r ∗ − 1). If g = 1, 2, or 3, then D is type-1. Additionally, we will need the following three theorems concerning the validity of the λ-design conjecture for certain values of λ. Theorem 2.6 ([BE], [Ry68], [Br70], [Kr69], [BK], [Se90]). The λ-design conjecture is true for λ ≤ 10. Theorem 2.7 ([SS]). The λ-design conjecture is true for prime λ. Theorem 2.8 ([We]). The λ-design conjecture is true for λ = 12. Remark 2.9. Although it is proved in [We] that the λ-design conjecture is valid for all λ ≤ 34, we will not need the full strength of this result in the present paper. The contents of Theorems 2.6, 2.7, and 2.8 will suffice.
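As a sanity check of Theorem 2.3 and of the criterion in Theorem 2.4, the quantities can be computed for the 3-design on 13 points of Example 1.4. This is only a sketch on one example, again assuming the difference-set model {0, 1, 3, 9} mod 13 for the plane of order three; the variable names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

V, LAM = 13, 3
lines = [frozenset((i + d) % V for d in (0, 1, 3, 9)) for i in range(V)]
A = lines[0]
design = [A] + [A ^ b for b in lines[1:]]   # complement with respect to A

# Replication number of a point: the number of blocks containing it.
rep = Counter(x for block in design for x in block)
r, r_star = max(rep.values()), min(rep.values())

# Left and right sides of equation (1), in exact arithmetic.
lhs = Fraction(1, LAM) + sum(Fraction(1, len(b) - LAM) for b in design)
rhs = Fraction((V - 1) ** 2, (r - 1) * (r_star - 1))

# Criterion of Theorem 2.4: one of the two quotients is an integer.
type1_criterion = (r * (r - 1)) % (V - 1) == 0 or \
                  (r_star * (r_star - 1)) % (V - 1) == 0
```

Here the points of the fixed line have replication number v − k + 1 = 10 and all other points have replication number k = 4, in agreement with Remark 2.2.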
3. The Ionin–Shrikhande method
Let D = (X, B) be a λ-design on v points. Then Theorem 2.3 implies that every point of D has replication number r or r* for some integers r ≠ r*. Therefore, the underlying set X of our λ-design is partitioned into two subsets, E and E*, of points having replication numbers r and r*, respectively. Let |E| = e and |E*| = e*, so e + e* = v. Also, for any block A ∈ B, put τA = |A ∩ E| and τA* = |A ∩ E*|, so τA + τA* = |A|. We will frequently use the trivial inequalities 0 ≤ τA ≤ e for all A. The following simple relation among these parameters is the starting point of the Ionin–Shrikhande method developed in [IS96a] and [IS96b].
Lemma 3.1. Let D = (X, B) be a λ-design on v points with replication numbers r and r*. Then the following relation holds for all blocks A ∈ B:
\[ (r-1)(|A| - 2\tau_A) = (v-1)(|A| - \lambda - \tau_A). \tag{2} \]
Proof. Fixing a block A ∈ B, we will count in two different ways all of the pairs (x, B), where x ∈ X, B ∈ B, B ≠ A, and x ∈ A ∩ B. This gives us the equation τA(r − 1) + τA*(r* − 1) = λ(v − 1), which is easily transformed into equation (2).
Now, let g = gcd(r − 1, r* − 1). Then, since (r − 1) + (r* − 1) = v − 1 by Theorem 2.3, we also have g = gcd(r − 1, v − 1) = gcd(r* − 1, v − 1). We put
\[ q = \frac{v-1}{g}. \tag{3} \]
Then, since gcd((r − 1)/g, q) = 1, equation (2) implies that q divides |A| − 2τA for all blocks A ∈ B. Therefore, for each block A we define an integer σA by
\[ q\sigma_A = |A| - 2\tau_A. \tag{4} \]
Next, we define the quantity
\[ s = \sum_{A \in \mathcal{B}} \sigma_A. \tag{5} \]
Also, equations (2) and (4) imply that
\[ \tau_A = \lambda - \frac{r^* - 1}{g}\,\sigma_A \tag{6} \]
and
\[ \tau_A^* = \lambda + \frac{r - 1}{g}\,\sigma_A \tag{7} \]
for all A. Adding equations (6) and (7) we obtain
\[ |A| = 2\lambda + \frac{r - r^*}{g}\,\sigma_A \tag{8} \]
for all A.
Remark 3.2. Note that equation (8) implies that for any two blocks A, B ∈ B, |A| = |B| if and only if σA = σB.
The next three equations are easily verified:
\[ \sum_{A \in \mathcal{B}} |A| = er + e^* r^*, \tag{9} \]
\[ \sum_{A \in \mathcal{B}} \tau_A = er, \tag{10} \]
and
\[ \sum_{A \in \mathcal{B}} \tau_A^* = e^* r^*. \tag{11} \]
Equations (4), (9), and (10) then imply that
\[ sq = \sum_{A \in \mathcal{B}} (|A| - 2\tau_A) = e^* r^* - er = (v - e)(v - r + 1) - er, \]
which can be transformed into
\[ sq = gq(gq - e - r + 3) - (2e + r - 2). \tag{12} \]
Equation (12) then implies that q divides 2e + r − 2. Therefore, we define a positive integer m by
\[ qm = 2e + r - 2. \tag{13} \]
Similarly, equations (4), (9), and (11) imply that q divides 2e* + r* − 2. Thus, we define a positive integer m* by
\[ qm^* = 2e^* + r^* - 2. \tag{14} \]
Adding equations (13) and (14), we obtain
\[ m + m^* = 3g. \tag{15} \]
Finally, equations (12), (13), and (15) imply that
\[ s = g^2 q - g(e + r) + 3g - m. \tag{16} \]
Remark 3.3. Upon further manipulation of the above equations, we eventually arrive at
\[ (r - r^*)(m^* - m) = g[v - (4\lambda - 1)]. \tag{17} \]
Note that equation (17) and the fact that r ≠ r* imply that v = 4λ − 1 if and only if m = m*.
The next lemma establishes formulae for e and r in terms of the parameters λ, g, q, and m. They follow easily from equations (13) and (17).
Lemma 3.4 ([IS96a]). If D is a λ-design on v ≠ 4λ − 1 points, then
\[ e = \frac{g\lambda - (g - m)^2 q + g - m}{3g - 2m} \]
and
\[ r = \frac{(2g - m)(gq + 2) - 2g\lambda}{3g - 2m}. \]
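The parameters g, q, σA, s, m, and m* introduced in this section can be computed directly for a concrete design. The sketch below does so for the 3-design on 13 points of Example 1.4, assuming the difference-set model {0, 1, 3, 9} mod 13 for the plane of order three, and compares the results with equations (8), (15), (16), (17) and with the formulae of Lemma 3.4. It illustrates the definitions on one example only and is not part of the proof.

```python
from collections import Counter
from math import gcd

V, LAM = 13, 3
lines = [frozenset((i + d) % V for d in (0, 1, 3, 9)) for i in range(V)]
A0 = lines[0]
design = [A0] + [A0 ^ b for b in lines[1:]]

rep = Counter(x for block in design for x in block)
r, r_star = max(rep.values()), min(rep.values())
E = {x for x in range(V) if rep[x] == r}      # points with replication number r
e, e_star = len(E), V - len(E)

g = gcd(r - 1, r_star - 1)
q = (V - 1) // g                                            # equation (3)
sigma = [(len(b) - 2 * len(b & E)) // q for b in design]    # equation (4)
s = sum(sigma)                                              # equation (5)
m = (2 * e + r - 2) // q                                    # equation (13)
m_star = (2 * e_star + r_star - 2) // q                     # equation (14)

# Equation (8): the block size is determined by sigma_A.
eq8 = all(len(b) == 2 * LAM + (r - r_star) // g * sg
          for b, sg in zip(design, sigma))
eq15 = m + m_star == 3 * g
eq16 = s == g * g * q - g * (e + r) + 3 * g - m
eq17 = (r - r_star) * (m_star - m) == g * (V - (4 * LAM - 1))
# Lemma 3.4, cleared of denominators.
lemma34 = (e * (3 * g - 2 * m) == g * LAM - (g - m) ** 2 * q + g - m and
           r * (3 * g - 2 * m) == (2 * g - m) * (g * q + 2) - 2 * g * LAM)
```

For this design one finds g = 3, q = 4, σA = −1 for the block of size 4 and σA = 0 for the blocks of size 6, so that s = −1, m = 4, and m* = 5.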
The next result gives a way of constructing new λ-designs from old ones by complementing with respect to a fixed block. For a proof see [IS96a].
Remark 3.5. In what follows, if we complement with respect to the block A, the parameters of the new design will be denoted by λ(A), r(A), m(A), etc.
Lemma 3.6. Let D = (X, B) be a λ-design on v points with replication numbers r and r* and let A ∈ B. Put B(A) = {A} ∪ {A △ B : B ∈ B, B ≠ A}. Denote by D(A) the complemented set system (X, B(A)). Then we have
(i) If A = E or E*, then D(A) is a symmetric (v, |A|, |A| − λ)-design,
(ii) If A ≠ E and A ≠ E*, then D(A) is a λ(A)-design on v points with r(A) = r, r*(A) = r*, and m(A) = m + 2σA, where λ(A) = |A| − λ,
(iii) If A ≠ E and A ≠ E* and D is type-1, then D(A) is also type-1, and
(iv) (D(A))(A) = D.
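Parts (i), (ii), and (iv) of Lemma 3.6 can be illustrated on the 3-design of Example 1.4. The following sketch assumes the difference-set model {0, 1, 3, 9} mod 13 for the plane of order three; the helper names are hypothetical. In this example the block of size 4 coincides with one of the two replication classes, so complementing with respect to it falls under part (i).

```python
from itertools import combinations

V = 13
lines = [frozenset((i + d) % V for d in (0, 1, 3, 9)) for i in range(V)]

def complement(blocks, fixed):
    """B(A): keep the fixed block A and replace every other block B by A △ B."""
    return [fixed] + [fixed ^ b for b in blocks if b != fixed]

def pair_intersections(blocks):
    """Set of all pairwise intersection sizes of the given blocks."""
    return {len(b & c) for b, c in combinations(blocks, 2)}

D = complement(lines, lines[0])   # the 3-design of Example 1.4
E = lines[0]                      # the size-4 block, a replication class

# (i): complementing with respect to A = E returns a symmetric design.
D_E = complement(D, E)

# (ii): complementing with respect to a block A of size 6 gives a
# lambda(A)-design with lambda(A) = |A| - lambda = 6 - 3 = 3.
A = D[1]
D_A = complement(D, A)

# (iv): complementing twice with respect to the same block is the identity.
D_AA = complement(D_A, A)
```

In part (ii) the new design again has one block of size 4 and twelve blocks of size 6, since |A △ B| = |A| + |B| − 2λ for any two blocks of D.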
4. λ-designs with g = 6
We are now in a position to prove our main result. In what follows, the computer program Mathematica [Wol] was used extensively to carry out computations.
Theorem 4.1. Let D = (X, B) be a λ-design on v points with replication numbers r and r*. If g = gcd(r − 1, r* − 1) = 6, then D is type-1.
Proof. If λ ≤ 13, then Theorems 2.6, 2.7, and 2.8 imply that D is type-1. Therefore, we may assume that λ ≥ 14. By equation (3), we may write v = 6q + 1. For each integer i, let ai denote the number of blocks A ∈ B with σA = i. We will frequently use the trivial fact that ai ≥ 0 for all i. Since the number of blocks is equal to the number of points, we clearly have
\[ \sum_{i \in \mathbb{Z}} a_i = 6q + 1. \tag{18} \]
Also, equations (5), (16), and (17) and the formulae of Lemma 3.4 imply that
\[ \sum_{i \in \mathbb{Z}} i a_i = \frac{(3q + 1)(m^2 - 18m + 72) + 18\lambda}{9 - m} \tag{19} \]
if m ≠ 9.
Next, equation (4) implies that for any block A ∈ B, we have |A| = 2τA + qσA. Using this and the formulae of Lemma 3.4, equation (1) is transformed into
\[ \sum_{i \in \mathbb{Z}} \frac{(m - 9)a_i}{\lambda(m - 9) + i(2\lambda - 3q - 1)} = \frac{4(m - 9)^2 q^2}{[q(m - 6) - 2\lambda + 1][q(m - 12) + 2\lambda - 1]} - \frac{1}{\lambda} \tag{20} \]
if m ≠ 9. Now, since 6 divides r − 1, r is odd and equation (13) implies that m is odd as well. Also, equation (15) implies that m + m* = 18. Without loss of generality, we may assume that m ≤ m*. Therefore, m = 1, 3, 5, 7, or 9.
Case 1: m = 1. In this case, the formulae of Lemma 3.4 imply that e = (6λ − 25q + 5)/16, r = (33q − 6λ + 11)/8, and r* = (15q + 6λ + 5)/8. Also, equation (6) implies that τA = λ − [(5q + 2λ − 1)/16]σA for any block A ∈ B. Then the inequalities 0 ≤ τA ≤ e imply that 5 ≤ σA ≤ 16λ/(5q + 2λ − 1) for all A. Now, 6 divides r − 1 and r > 1, so r ≥ 7. This gives us the inequality q ≥ (2λ + 15)/11. Combining the last two inequalities, we obtain that σA = 5 for all A. Therefore, by Remark 3.2, all blocks have the same cardinality, contradicting the definition of a λ-design.
Case 2: m = 3. In this case, Lemma 3.4 implies that e = (2λ − 3q + 1)/4, r = (9q − 2λ + 3)/2, and r* = (3q + 2λ + 1)/2. Also, equation (6) implies that τA = λ − [(3q + 2λ − 1)/12]σA for any block A. Then the inequalities 0 ≤ τA ≤ e imply that 3 ≤ σA ≤ 12λ/(3q + 2λ − 1) for all A. Next, r ≥ 7 gives us the inequality q ≥ (2λ + 11)/9. Combining the last two inequalities, we obtain that σA = 3 or 4 for all A. Hence, ai = 0 for all i except possibly 3 and 4. Then equations (18), (19), and (20) become
\[ a_3 + a_4 = 6q + 1, \tag{21} \]
\[ 3a_3 + 4a_4 = \frac{3(9q + 2\lambda + 3)}{2}, \tag{22} \]
and
\[ \sum_{i=3}^{4} \frac{6a_i}{6\lambda - i(2\lambda - 3q - 1)} = \frac{144q^2}{(1 - 3q - 2\lambda)(2\lambda - 9q - 1)} - \frac{1}{\lambda}. \tag{23} \]
Solving equations (21) and (22) yields
\[ a_3 = \frac{21q - 6\lambda - 1}{2} \quad\text{and}\quad a_4 = \frac{3(2\lambda - 3q + 1)}{2}. \]
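The solution of the linear system (21), (22) can be confirmed with exact rational arithmetic. The sketch below is only a check of this one elimination step over a range of test values for q and λ; the function name `case2_counts` and the chosen ranges are hypothetical.

```python
from fractions import Fraction

def case2_counts(q, lam):
    """Solve a3 + a4 = 6q + 1 and 3*a3 + 4*a4 = 3(9q + 2*lam + 3)/2."""
    total = Fraction(6 * q + 1)                          # right side of (21)
    weighted = Fraction(3 * (9 * q + 2 * lam + 3), 2)    # right side of (22)
    a4 = weighted - 3 * total    # (22) minus three times (21)
    a3 = total - a4
    return a3, a4

# Compare with the closed forms stated in the text.
for q in range(1, 30):
    for lam in range(14, 40):
        a3, a4 = case2_counts(q, lam)
        assert a3 == Fraction(21 * q - 6 * lam - 1, 2)
        assert a4 == Fraction(3 * (2 * lam - 3 * q + 1), 2)
```

The values need not be non-negative integers for arbitrary q and λ; the proof uses exactly this fact, together with equation (23), to restrict the admissible pairs.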
Inserting the above expressions for a3 and a4 into equation (23) and manipulating the result, we arrive at (2λ − 3q − 1)²(2λ − 3q + 1)(21λq − 36q − 6λ² + 5λ − 4) = 0. Now, e > 0 so 2λ − 3q + 1 ≠ 0. Also, 2λ − 3q − 1 ≠ 0 since v ≠ 4λ − 1 because m ≠ 9. Therefore, we obtain that 21λq − 36q − 6λ² + 5λ − 4 = 0, which can be transformed into (21λ − 36)(7q − 2λ) = 37λ + 28. Now, 37λ + 28 > 0 and 21λ − 36 > 0, so 7q − 2λ ≥ 1. If 7q − 2λ = 1, then λ = −4, a contradiction. If 7q − 2λ = 2, then r = (2q + 5)/2 is not an integer, a contradiction. Therefore, we must have 7q − 2λ ≥ 3. Consequently, 3(21λ − 36) ≤ 37λ + 28, which implies that λ ≤ 5, a contradiction.
Case 3: m = 5. In this case, Lemma 3.4 implies that e = (6λ − q + 1)/8, r = (21q − 6λ + 7)/4, and r* = (3q + 6λ + 1)/4. Also, equation (6) implies that τA = λ − [(q + 2λ − 1)/8]σA for any block A. Then the inequalities 0 ≤ τA ≤ e imply that 1 ≤ σA ≤ 8λ/(q + 2λ − 1) for all A. Also, r ≥ 7 gives us the inequality q ≥ (2λ + 7)/7. Combining the last two inequalities, we obtain that σA = 1, 2, or 3 for all A. Hence, ai = 0 for all i except possibly 1, 2, and 3. Then equations (18), (19), and (20) become
\[ a_1 + a_2 + a_3 = 6q + 1, \tag{24} \]
\[ a_1 + 2a_2 + 3a_3 = \frac{21q + 18\lambda + 7}{4}, \tag{25} \]
and
\[ \sum_{i=1}^{3} \frac{4a_i}{4\lambda - i(2\lambda - 3q - 1)} = \frac{64q^2}{(1 - q - 2\lambda)(2\lambda - 7q - 1)} - \frac{1}{\lambda}. \tag{26} \]
Solving equations (24), (25), and (26) yields
\[ a_1 = \frac{\alpha_{13} q^3 + \alpha_{12} q^2 + \alpha_{11} q + \alpha_{10}}{8\lambda(2\lambda - 7q - 1)(2\lambda + q - 1)}, \]
where α13 = −459λ + 126, α12 = 162λ² − 447λ − 66, α11 = 204λ³ − 12λ² − 225λ − 54, and α10 = −72λ⁴ + 20λ³ + 10λ² − 21λ − 6,
\[ a_2 = \frac{\alpha_{23} q^3 + \alpha_{22} q^2 + \alpha_{21} q + \alpha_{20}}{2\lambda(2\lambda - 7q - 1)(2\lambda + q - 1)}, \]
where α23 = 51λ − 63, α22 = −324λ² + 359λ + 33, α21 = 108λ³ − 180λ² + 153λ + 27, and α20 = 36λ³ − 24λ² + 13λ + 3, and
\[ a_3 = \frac{\alpha_{33} q^3 + \alpha_{32} q^2 + \alpha_{31} q + \alpha_{30}}{8\lambda(2\lambda - 7q - 1)(2\lambda + q - 1)}, \]
where α33 = −81λ + 126, α32 = 558λ² − 757λ − 66, α31 = −444λ³ + 444λ² − 291λ − 54, and α30 = 72λ⁴ − 132λ³ + 54λ² − 23λ − 6.
Replacing q by a real variable x in the above expressions for a1, a2, and a3, we obtain three functions, a1(x), a2(x), and a3(x). Now, we already know q ≥ (2λ + 7)/7, and the inequality e ≥ 1 implies that q ≤ 6λ − 7. This implies 2λ − 7q − 1 < 0. Therefore, a1(x), a2(x), and a3(x) are continuous functions of x on the interval [(2λ + 7)/7, 6λ − 7].
Now, the function a2(x) has zeros only at −1/3, 6λ + 1, and
\[ z_{21} = \frac{6\lambda^2 - 5\lambda + 3}{17\lambda - 21}. \]
Clearly, −1/3, 6λ + 1 ∉ [(2λ + 7)/7, 6λ − 7], so a2(x) has at most one zero on this interval. However,
\[ a_2\!\left(\frac{2\lambda + 7}{7}\right) = \frac{-5(3\lambda + 14)(\lambda^2 - 14\lambda + 21)}{98\lambda} < 0 \]
and
\[ a_2(6\lambda - 7) = \frac{6(2\lambda - 3)(9\lambda - 10)}{\lambda(5\lambda - 6)} > 0. \]
Therefore, a2(x) has exactly one zero on the interval [(2λ + 7)/7, 6λ − 7] at z21. The above inequalities then imply that a2(x) is negative on the interval [(2λ + 7)/7, z21). Hence, we must have q ∈ [z21, 6λ − 7].
Next, the function a3(x) has zeros only at (2λ − 3)/9, 6λ + 1, and
\[ z_{31} = \frac{6\lambda^2 - 3\lambda + 2}{9\lambda - 14}. \]
Clearly, (2λ − 3)/9, 6λ + 1 ∉ [z21, 6λ − 7], so a3(x) has at most one zero on this interval. However,
\[ a_3(z_{21}) = \frac{3(12\lambda^2 - 13\lambda - 3)}{17\lambda - 21} > 0 \]
and
\[ a_3(6\lambda - 7) = \frac{-3(\lambda - 2)(13\lambda - 15)}{\lambda(5\lambda - 6)} < 0. \]
Therefore, a3(x) has exactly one zero on the interval [z21, 6λ − 7] at z31. The above inequalities then imply that a3(x) is negative on the interval (z31, 6λ − 7]. Hence, we must have q ∈ [z21, z31].
Now, the function a1(x) has zeros only at −(2λ + 1)/3 and
\[ z_{11}, z_{12} = \frac{78\lambda^2 - 63\lambda - 18 \mp 4\sqrt{36\lambda^4 - 252\lambda^3 - 87\lambda^2 + 108\lambda + 36}}{153\lambda - 42}. \]
Clearly, −(2λ + 1)/3 ∉ [z21, z31], so a1(x) has at most two zeros on this interval. However,
\[ a_1(z_{21}) = \frac{2(13\lambda + 3)}{17\lambda - 21} > 0, \]
\[ a_1\!\left(\frac{\lambda}{2}\right) = \frac{-(7\lambda + 2)(5\lambda^2 - 36\lambda - 12)}{16\lambda(5\lambda - 2)} < 0, \]
and
\[ a_1(z_{31}) = \frac{5(9\lambda + 2)}{9\lambda - 14} > 0, \]
where z21 < λ/2 < z31. Therefore, a1(x) has exactly two zeros on the interval [z21, z31] at z11 and z12. The above inequalities then imply that a1(x) is negative on the interval (z11, z12). Hence, we must have q ∈ [z21, z11] ∪ [z12, z31].
Next, we easily obtain the inequalities z21 > (6λ + 2)/17 and z31 < (6λ + 8)/9. Also, we have 36λ⁴ − 252λ³ − 87λ² + 108λ + 36 > (6λ² − 22λ − 46)², which implies that z11 < (54λ² + 25λ + 166)/(153λ − 42) and z12 > (102λ² − 151λ − 202)/(153λ − 42). This in turn implies that z11 < (6λ + 6)/17 and z12 > (6λ − 9)/9. Therefore, we have q ∈ ((6λ + 2)/17, (6λ + 6)/17) ∪ ((6λ − 9)/9, (6λ + 8)/9). Thus, since q is an integer, we must have q ∈ {(6λ + j)/17 : 3 ≤ j ≤ 5} ∪ {(6λ + k)/9 : −8 ≤ k ≤ 7}.
If q ∈ {(6λ + j)/17 : 3 ≤ j ≤ 5}, then r = (4q + j + 7)/4 for j = 3, 4, or 5. But, clearly r is an integer only for j = 5. Therefore, we must have q = (6λ + 5)/17. However,
a1 (
−(13λ + 8)(52λ2 − 1143λ − 234) 6λ + 5 )= 0, a1 ( ) = 2 4(3λ − 2)
and
\[ a_1(2\lambda - 5) = \frac{-(3\lambda - 7)(5\lambda^2 - 40\lambda + 78)}{2(\lambda - 3)} < 0, \]
where (2λ + 3)/5 < λ/2 < 2λ − 5. Therefore, a1(x) has exactly two zeros on the interval [(2λ + 3)/5, 2λ − 5] at z11 and z12. Then the above inequalities imply that a1(x) is negative on [(2λ + 3)/5, z11) ∪ (z12, 2λ − 5]. Hence, we must have q ∈ [z11, z12].
Next, the function a2(x) has zeros only at (λ − 1)/3 and
\[ z_{21}, z_{22} = \frac{14\lambda - 9 \mp 2\sqrt{4\lambda^2 - 28\lambda + 9}}{15}. \]
Clearly, (λ − 1)/3 ∉ [z11, z12]. Therefore, a2(x) has at most two zeros on this interval. However,
\[ a_2(z_{11}) = \frac{36\lambda + 26 + 3\sqrt{144\lambda^2 - 468\lambda + 169}}{26} > 0, \]
\[ a_2\!\left(\frac{4\lambda}{5}\right) = \frac{-(7\lambda + 5)(12\lambda^2 - 76\lambda - 45)}{10(2\lambda + 1)(6\lambda - 5)} < 0, \]
and
\[ a_2(z_{12}) = \frac{36\lambda + 26 - 3\sqrt{144\lambda^2 - 468\lambda + 169}}{26} > 0, \]
where z11 < 4λ/5 < z12. Therefore, a2(x) has exactly two zeros on the interval [z11, z12] at z21 and z22. Then the above inequalities imply that a2(x) is negative on the interval (z21, z22). Hence, we must have q ∈ [z11, z21] ∪ [z22, z12].
Suppose first that q ∈ [z22, z12]. Now, we have the inequalities 4λ² − 28λ + 9 > (2λ − 8)² and 144λ² − 468λ + 169 < (12λ − 19)². These imply that z22 > (6λ − 9)/5 and z12 < (6λ − 5)/5. Therefore, q ∈ ((6λ − 9)/5, (6λ − 5)/5). Thus, since q is an integer, we must have q = (6λ − 8)/5, (6λ − 7)/5, or (6λ − 6)/5. If q = (6λ − 8)/5, then r = (10q − 3)/2 is not an integer, a contradiction. If q = (6λ − 7)/5, then e = (2q + 3)/2 is not an integer, a contradiction. If q = (6λ − 6)/5, then r = (10q − 1)/2 is not an integer, a contradiction. Hence, we must have q ∈ [z11, z21].
Now, the function a0(x) has zeros only at
\[ z_{01}, z_{02} = \frac{66\lambda^2 - 39\lambda - 6 \mp 2\sqrt{36\lambda^4 - 324\lambda^3 + 9\lambda^2 + 36\lambda + 4}}{117\lambda + 10}. \]
Also, we have
\[ a_0(z_{11}) = \frac{3\left[2028\lambda^2 - 4596\lambda + 1638 - (169\lambda - 124)\sqrt{144\lambda^2 - 468\lambda + 169}\right]}{10\left[97\lambda - 52 - 6\sqrt{144\lambda^2 - 468\lambda + 169}\right]} > 0, \]
\[ a_0\!\left(\frac{\lambda}{2}\right) = \frac{-3\lambda^2 + 28\lambda + 4}{2(3\lambda - 2)} < 0, \]
and
\[ a_0(z_{21}) = \frac{3\left[2028\lambda^2 - 4596\lambda + 1638 + (169\lambda - 124)\sqrt{144\lambda^2 - 468\lambda + 169}\right]}{10\left[97\lambda - 52 + 6\sqrt{144\lambda^2 - 468\lambda + 169}\right]} > 0, \]
where z11 < λ/2 < z21. Therefore, a0(x) has exactly two zeros on the interval [z11, z21] at z01 and z02. Then the above inequalities imply that a0(x) is negative on the interval (z01, z02). Hence, we must have q ∈ [z11, z01] ∪ [z02, z21].
Now, we easily obtain the inequalities z11 > (6λ − 1)/13 and z21 < (2λ + 2)/3. Next, the inequality 36λ⁴ − 324λ³ + 9λ² + 36λ + 4 > (6λ² − 28λ − 80)² implies that z01 < (54λ² + 17λ + 154)/(117λ + 10) and z02 > (78λ² − 95λ − 166)/(117λ + 10). This in turn implies that z01 < (6λ + 3)/13 and z02 > (2λ − 3)/3. Therefore, we have q ∈ ((6λ − 1)/13, (6λ + 3)/13) ∪ ((2λ − 3)/3, (2λ + 2)/3). Thus, since q is an integer, we must have q ∈ {(6λ + j)/13 : 0 ≤ j ≤ 2} ∪ {(2λ + k)/3 : −2 ≤ k ≤ 1}.
If q ∈ {(6λ + j)/13 : 0 ≤ j ≤ 2}, then r = (2q + j + 5)/2 for j = 0, 1, or 2. However, clearly r is an integer only for j = 1. Therefore, q = (6λ + 1)/13, which implies e = (6q − 1)/2, which is not an integer, a contradiction. Hence, we must have q ∈ {(2λ + k)/3 : −2 ≤ k ≤ 1}. Then r = (6q + 3k + 5)/2 for k = −2, −1, 0, or 1. However, clearly r is an integer only for k = −1 and 1. If q = (2λ − 1)/3, then v = 4λ − 1, a contradiction by equation (17) since m ≠ 9. If q = (2λ + 1)/3, then a2(q) = 9/(λ − 1). Since this must be an integer, this implies that λ − 1 divides 9, or λ = 2, 4, or 10, a contradiction.
Case 5: m = 9. If there exists a block A with σA ≤ −1, then m(A) ≤ 7 and D is type-1 by cases 1, 2, 3, and 4. If there exists a block A with σA ≥ 1, then m(A) ≥ 11, so m*(A) ≤ 7 and once again D is type-1 by previous cases.
Therefore, we may assume that σA = 0 for all blocks A. By Remark 3.2, all blocks then have the same size, a contradiction. This concludes the proof of Theorem 4.1.
Corollary 4.2. All λ-designs on v = 6p + 1 points, p a prime, are type-1.
Proof. If p does not divide g, then g = 1, 2, 3, or 6 and D is type-1 by Theorems 2.5 and 4.1. If p does divide g, then p divides r − 1 and r* − 1. Without loss of generality, we may assume r > r*. Therefore, either r = 5p + 1 and r* = p + 1 or r = 4p + 1 and r* = 2p + 1.
First, suppose that r = 5p + 1 and r* = p + 1. If p = 2, then r = 11 and r* = 3, so g = gcd(10, 2) = 2 and D is type-1 by Theorem 2.5. If p = 3, then r = 16 and r* = 4, so g = gcd(15, 3) = 3 and D is type-1 by Theorem 2.5. If p ≡ 1 (mod 6), then 6 divides 5p + 1, so r(r − 1)/(v − 1) = 5(5p + 1)/6 is an integer and D is type-1 by Theorem 2.4. If p ≡ 5 (mod 6), then 6 divides p + 1, so r*(r* − 1)/(v − 1) = (p + 1)/6 is an integer and D is type-1 by Theorem 2.4.
Next, suppose that r = 4p + 1 and r* = 2p + 1. If p = 3, then r = 13 and r* = 7, so g = gcd(12, 6) = 6 and D is type-1 by Theorem 4.1. If p ≡ 1 (mod 3), then 3 divides 2p + 1, so r*(r* − 1)/(v − 1) = (2p + 1)/3 is an integer and D is type-1 by Theorem 2.4. If p ≡ 2 (mod 3), then 3 divides 4p + 1, so r(r − 1)/(v − 1) = 2(4p + 1)/3 is an integer and D is type-1 by Theorem 2.4.
Remark 4.3. The author has also proven that all λ-designs with g = 7 are type-1 [Fi00]. However, a proof attempt analogous to that of Corollary 4.2 breaks down in the v = 7p + 1 case. Consequently, the author was unable to establish whether or not all λ-designs on 7p + 1 points are type-1. A proof that all λ-designs on 8p + 1 points, where p ≡ 1 or 7 (mod 8) is prime, are type-1 will appear in a forthcoming paper [Fi02].
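The case analysis in the proof of Corollary 4.2 is pure arithmetic on the pair (r, r*), so its bookkeeping can be replayed by enumeration. The sketch below, with hypothetical helper names and an arbitrary bound of 200 on p, checks for each prime p that every pair with r + r* = 6p + 2 and r > r* > 1 satisfies either g ∈ {1, 2, 3, 6} (Theorems 2.5 and 4.1) or the integrality condition of Theorem 2.4. It confirms only that the cases are exhaustive; it does not verify the theorems themselves.

```python
from math import gcd

def is_prime(n):
    """Trial division; adequate for the small range used here."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def covered(p):
    """True if every admissible pair (r, r*) for v = 6p + 1 meets a criterion."""
    v = 6 * p + 1
    for r_star in range(2, v):
        r = v + 1 - r_star              # r + r* = v + 1 by Theorem 2.3
        if r <= r_star:                 # keep r > r* without loss of generality
            break
        g = gcd(r - 1, r_star - 1)
        small_g = g in (1, 2, 3, 6)     # Theorems 2.5 and 4.1
        integral = (r * (r - 1)) % (v - 1) == 0 or \
                   (r_star * (r_star - 1)) % (v - 1) == 0   # Theorem 2.4
        if not (small_g or integral):
            return False
    return True

primes = [p for p in range(2, 200) if is_prime(p)]
results = {p: covered(p) for p in primes}
```

Replacing 6 by 7 in `v` and adding g = 7 to the list of admissible values gives a quick way to see where the analogous argument of Remark 4.3 stops being exhaustive.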
5. Conjectures
It is the author’s great hope to one day see the λ-design conjecture completely resolved. However, this appears to be an exceedingly difficult problem. Therefore, we make the following three weaker conjectures, which are hopefully more vulnerable to attack.
Conjecture 5.1. Every λ-design has at least one block of size 2λ.
Conjecture 5.2. All λ-designs have exactly two block sizes.
Conjecture 5.3. All λ-designs with only two block sizes, one of which is 2λ, are type-1.
Obviously, the combination of Conjectures 5.1, 5.2, and 5.3 would prove the λ-design conjecture. However, each conjecture is still very interesting on its own. It has been verified by computer that all λ-designs on v ≤ 713 points with only two block sizes are type-1 [Wo71]. One could prove Conjecture 5.3 by showing that the other block size occurs only once [Wo70].
Acknowledgement The author is grateful to Ákos Seress for many helpful suggestions and support during the preparation of this paper.
References
[Br70] W. G. Bridges, Some results on λ-designs, J. Combin. Theory 8 (1970), 350–360.
[Br77] W. G. Bridges, A characterization of type-1 λ-designs, J. Combin. Theory Ser. A 22 (1977), 361–367.
[BK] W. G. Bridges and E. S. Kramer, The determination of all λ-designs with λ = 3, J. Combin. Theory 8 (1970), 343–349.
[BT] W. G. Bridges and T. Tsaur, Some structural characterizations of λ-designs, Ars Combin. 44 (1996), 129–135.
[BE] N. G. deBruijn and P. Erdős, On a combinatorial problem, Indag. Math. 10 (1948), 421–423.
[Fi00] N. C. Fiala, λ-designs with g = 7, unpublished.
[Fi02] N. C. Fiala, λ-designs on 8p + 1 points, to appear in Ars Combin.
[He] D. W. Hein, On the λ-design conjecture for v = 5p + 1 points, Ph. D. dissertation, Central Michigan Univ., 2000.
[HI] D. W. Hein and Y. J. Ionin, On the λ-design conjecture for v = 5p + 1 points, in: Codes and Designs (K. T. Arasu and Á. Seress, eds.), Ohio State Univ. Math. Res. Inst. Publ. 10, Walter de Gruyter, Berlin–New York 2002, 145–156.
[IS96a] Y. J. Ionin and M. S. Shrikhande, On the λ-design conjecture, J. Combin. Theory Ser. A 74 (1996), 100–114.
[IS96b] Y. J. Ionin and M. S. Shrikhande, λ-designs on 4p + 1 points, J. Combin. Math. Combin. Comput. 22 (1996), 135–142.
[Is] J. R. Isbell, An inequality for incidence matrices, Proc. Amer. Math. Soc. 10 (1959), 216–218.
[Kr69] E. S. Kramer, On λ-designs, Ph. D. dissertation, Univ. of Michigan, 1969.
[Kr74] E. S. Kramer, On λ-designs, J. Combin. Theory Ser. A 16 (1974), 57–75.
[Ma] K. N. Majumdar, On some theorems in combinatorics related to incomplete block designs, Ann. Math. Statist. 24 (1953), 377–389.
[Ry68] H. J. Ryser, An extension of a theorem of deBruijn and Erdős on combinatorial designs, J. Algebra 10 (1968), 246–261.
[Ry70] H. J. Ryser, New types of combinatorial designs, in: Actes Congrès Intern. Math., tome 3, 1970, 235–239.
[Se89] Á. Seress, Some characterizations of type-1 λ-designs, J. Combin. Theory Ser. A 52 (1989), 288–300.
[Se90] Á. Seress, On λ-designs with λ = 2p, in: Coding Theory and Design Theory, Part II, Design Theory (D. K. Ray-Chaudhuri, ed.), IMA Vol. Math. Appl. 21, Springer-Verlag, New York 1990, 290–303.
[Se01] Á. Seress, All lambda-designs with λ = 2p are type-1, Des. Codes Cryptogr. 22 (2001), 5–17.
[SS] S. S. Shrikhande and N. M. Singhi, On the λ-design conjecture, Util. Math. 9 (1976), 301–318.
[Ts] T. Tsaur, Variants of symmetric block designs, Ph. D. dissertation, The Univ. of Wyoming, 1993.
[We] I. Weisz, Lambda-designs with small lambda are type-1, Ph. D. dissertation, The Ohio State Univ., 1995.
[Wol] S. Wolfram, Mathematica, Addison-Wesley, Redwood City, CA, 1991.
[Wo70] D. R. Woodall, Square λ-linked designs, Proc. London Math. Soc. 20 (1970), 669–687.
[Wo71] D. R. Woodall, Square λ-linked designs: A survey, in: Combinatorial Mathematics and Its Applications, Academic Press, 1971, 349–355.

N. Fiala
Department of Mathematics
The Ohio State University
231 W 18th Avenue
Columbus, OH 43210, U.S.A.
[email protected]
An introduction to balanced network flows Christian Fremuth-Paeger and Dieter Jungnickel
Abstract. We give an introduction to the treatment of general matching problems (both in the weighted and the cardinality case) based on a network flow approach. 2000 Mathematics Subject Classification: primary 05C70; secondary 90B10, 90C35.
1. Preliminaries
A matching of a graph is a set of arcs which have no end nodes in common. A 1-factor is a matching without exposed vertices. The central problems of matching theory are the cardinality matching problem which asks for a matching of maximum cardinality, and the perfect matching problem (PMP) which asks for a minimum cost 1-factor.
Figure 1. Graphs and matchings
It is well-known that bipartite matching problems have a network flow formulation. For example, the bigraph and its matching shown in Figure 1(a) can be transformed into the flow network and the st-flow shown in Figure 2 (arcs which carry flow are drawn bold). This reduction technique is appealing for several reasons:
• It is very simple.
• It applies to bipartite f-factor and b-matching problems as well.
• It leads to very efficient matching algorithms (Dinic, SAP).
• Statements can be translated from the network flow context to matching theory.
Figure 2. Reduction of a bipartite matching problem
There are a few, much less known attempts to handle general matching problems in a similar way. We refer to the work of Tutte [Tut67], Kocay/Stone [KoSt93] and Goldberg/Karzanov [GoKa96]. Unfortunately, there is no traditional terminology, and the authors just mentioned did not work out a comprehensive theory but only certain aspects. Balanced network flows, as introduced in the next section, fill this gap in the spirit of the bipartite setting. We will present our framework, but omit the proofs and the algorithmic details which can be found in our previous papers.
2. Problem setting
Throughout this paper, we will work with digraphs N which have the following properties:
(b1) The node set V(N) splits into complementary pairs {x, x′}. Denote (x′)′ := x.
(b2) a = uv ∈ A(N) if and only if a′ := v′u′ ∈ A(N).
Up to the arc directions which flip, node complementarity defines a graph isomorphism. For this reason, a digraph which satisfies (b1) and (b2) is called skew-symmetric. A small skew-symmetric digraph can be found in Figure 3. Note that node complementarity induces a complementarity relationship of arcs and even of paths. In our example, the cycles (u, u′, v, u) and (u′, v′, u, u′) are complementary.
By a balanced flow network, we denote a skew-symmetric digraph together with two arc capacity vectors lcap, ucap which satisfy the following symmetry condition:
(b3) Capacity labels are balanced, that is, complementary arcs have the same integral label.
Figure 3. A skew-symmetric digraph
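Properties (b1) and (b2) and the induced complementarity of paths can be made concrete in a few lines. The following is a minimal sketch, not taken from the paper: nodes are strings, a trailing apostrophe marks the complementary node, and the arc set is read off from the two cycles named in the text, so the digraph of Figure 3 may contain further arcs.

```python
def comp(node):
    """Complementary node: u <-> u' (the involution of property (b1))."""
    return node[:-1] if node.endswith("'") else node + "'"

def comp_arc(arc):
    """Complementary arc of a = (u, v) is a' = (v', u') (property (b2))."""
    u, v = arc
    return (comp(v), comp(u))

def comp_path(path):
    """Complementary path: reverse the node sequence, complement each node."""
    return [comp(x) for x in reversed(path)]

# Arcs of the cycles (u, u', v, u) and (u', v', u, u'); an assumption,
# since the figure itself is not reproduced here.
arcs = {("u", "u'"), ("u'", "v"), ("v", "u"), ("u'", "v'"), ("v'", "u")}

is_skew_symmetric = all(comp_arc(a) in arcs for a in arcs)
cycle = ["u", "u'", "v", "u"]
complementary_cycle = comp_path(cycle)
```

Note that the arc (u, u′) is its own complement, which is why the two cycles share it.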
The integrality prerequisite is important and indicates that we are talking about matching problems rather than network flow problems. From the perspective of matching problems, one may also require bipartiteness:
(b4) N is bipartite with independent sets Inner(N) and Outer(N),
(b5) v ∈ Inner(N) if and only if v′ ∈ Outer(N).
The restriction to bipartite networks has no computational benefit but simplifies the theory in some circumstances. Based on balanced flow networks, one might study the following optimization problems:
• The maximum balanced flow problem (MBFP) which asks for a maximum balanced st-flow where s′ = t and lcap ≡ 0 are assumed.
• The 0-1 maximum balanced flow problem (1MBFP) where in addition ucap ≡ 1 is assumed.
• The feasible balanced circulation problem (FBCP).
• The minimum cost balanced flow problem (MCBFP) which asks for a maximum balanced st-flow f of minimum cost, where the cost is defined by c(f) := Σ_{a∈A(N)} c(a)f(a) for some vector c of non-negative arc labels.
• The minimum cost balanced circulation problem (MCBCP).
In [FrJu1], the FBCP is solved by a transformation to the maximum balanced flow problem. The reduction principle is quite analogous to the reduction proposed by Ford/Fulkerson [FoFu62] for ordinary network flows. Clearly, this principle applies to the MCBCP as well. Hence we may concentrate on the respective maximization problems.
3. Relationship with matching problems
The reduction of the cardinality matching problem to the 1MBFP is rather simple and can be sketched as follows:
(1) Split each node of G into a pair of complementary nodes.
(2) Split each edge of G into a pair of complementary arcs.
(3) Transform the resulting bipartite graph into an appropriate balanced flow network NG.
(4) Find a maximum balanced flow on NG.
(5) Transform the maximum balanced flow on NG into a maximum matching of G.
We do not formalize the construction of the network NG. An example is given in Figure 4 which shows the reduction of the non-bipartite graph in Figure 1(b).
This reduction principle works for the PMP and also for a much more general problem, the capacitated b-matching problem: Given a multigraph G and integer node labels b, this problem asks for a cost-minimal sub-multigraph G̃ of G so that every node v has degree b(v) in G̃. The reader will find it easy to transform this problem to MCBCP explicitly. We point out that there is a (polynomial) reverse reduction mechanism from the MCBCP to the capacitated b-matching problem. Hence both problems are computationally equivalent. See [FrJu6] for the details.
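Steps (1) to (3) and the correspondence between matchings and balanced flows can be sketched as follows. Since the construction of NG is not formalized in the text, everything here is an assumption modelled on the bipartite reduction: each node v yields arcs (s, v) and (v′, t), each edge uv yields arcs (u, v′) and (v, u′), all with capacity 1, and the input graph (a triangle 1-2-3 plus the edge 4-5) is a hypothetical example.

```python
def comp(node):
    """Complementary node; source and sink are paired as s' = t."""
    if node == "s":
        return "t"
    if node == "t":
        return "s"
    return node[:-1] if node.endswith("'") else node + "'"

def comp_arc(arc):
    """Complementary arc of (u, v) is (v', u')."""
    u, v = arc
    return (comp(v), comp(u))

def balanced_network(nodes, edges):
    """Unit upper capacities of an assumed network N_G (lcap = 0 throughout)."""
    ucap = {}
    for v in nodes:
        ucap[("s", v)] = 1            # source arc of node v
        ucap[(v + "'", "t")] = 1      # its complementary sink arc
    for u, v in edges:
        ucap[(u, v + "'")] = 1        # one arc per edge ...
        ucap[(v, u + "'")] = 1        # ... and its complement
    return ucap

def matching_flow(matching):
    """0-1 flow using s -> u -> v' -> t and the complementary path per edge."""
    flow = {}
    for u, v in matching:
        for arc in [("s", u), (u, v + "'"), (v + "'", "t")]:
            flow[arc] = 1
            flow[comp_arc(arc)] = 1
    return flow

nodes = ["1", "2", "3", "4", "5"]
edges = [("1", "2"), ("2", "3"), ("1", "3"), ("4", "5")]
ucap = balanced_network(nodes, edges)
balanced = all(ucap.get(comp_arc(a)) == c for a, c in ucap.items())

flow = matching_flow([("1", "2"), ("4", "5")])
value = sum(flow.get(("s", v), 0) for v in nodes)
```

In this sketch every matching edge contributes two units of flow, one along a path and one along its complement, so a matching of cardinality k corresponds to a balanced flow of value 2k.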
4. Polyhedral aspects
The polyhedron F(N) of fractional balanced circulations is defined by the constraints
(p1a) lcap(a) ≤ f(a) ∀ a ∈ A(N),
(p1b) f(a) ≤ ucap(a) ∀ a ∈ A(N),
(p2) f(a) = f(a′) ∀ a ∈ A(N),
(p3) e(v) = 0 ∀ v ∈ V(N),
where
\[ e(v) := \sum_{a \in \delta_N^-(v)} f(a) - \sum_{a \in \delta_N^+(v)} f(a) \]
denotes the flow excess at the node v, and δ⁻_N(v) denotes the arcs in N with end node v. In order to characterize the vertices of this polyhedron, we call an arc a ∈ A(N) free if lcap(a) < f(a) < ucap(a). A path in N(f) is free if all traversed arcs are free.
If the symmetry constraints (p2) were omitted, an ordinary network flow problem would result. All vertices would be integral by the unimodularity of the constraint matrix, and would not admit free cycles. Let us see what is different in our setting.
A cycle p is odd if it can be written in the form p = q ◦ r where r is the reverse of q and q is strictly simple. That means that q traverses at most one node of a complementary pair (end nodes excluded). If N is a transformed instance of some matching problem, the odd cycles in N correspond to odd length cycles [paths] in the original graph. In Figure 4, for example, the cycles (1, 2′, 3, 1′, 2, 3′, 1) and (s, 4, 5′, t, 4′, 5, s) are odd.
Figure 4. Reduction of a matching problem
Without much effort, one can show that: Theorem 4.1 ([FrJu5]). Let f be a fractional balanced flow on a balanced flow network N . Then f is a vertex of the polytope F (N ) iff every free cycle in N(f ) is odd. A pseudo-basic circulation is a half-integral circulation where the arcs with nonintegral flow form disjoint odd cycles. As the terminology suggests, the vertices of F (N ) are pseudo-basic circulations. There is a rather simple O(m2 ) procedure which turns a fractional balanced circulation into a vertex of the polytope F (N). A half-integral balanced circulation can be transformed into a pseudo-basic solution by the same idea even in O(m) time. The latter procedure forms part of the state-of-the-art algorithms for the MBFP as well as the MCBCP. The polyhedral characterization of balanced circulations utilizes the idea of “odd sets” which are defined here as follows: A skew cut is a pair (A1 , A2 ) of disjoint arc sets with lcap(A1 ) − ucap(A2 ) odd, so that A1 ! A2 is a directed cut in N which
separates the node set into self-complementary sets. All skew cuts are collected into the set O(N). In [FrJu6], we have shown that P(N), the convex hull of balanced circulations, is given by the constraints

\[
\begin{array}{lll}
\text{(p1a)} & f(a) \ge lcap(a) & \forall\, a \in A(N), \\
\text{(p1b)} & f(a) \le ucap(a) & \forall\, a \in A(N), \\
\text{(p2)} & f(a) = f(a') & \forall\, a \in A(N), \\
\text{(p3)} & e(v) = 0 & \forall\, v \in V(N), \\
\text{(p4)} & f(A_2) - f(A_1) \ge lcap(A_2) - ucap(A_1) + 1 & \forall\, (A_1, A_2) \in O(N).
\end{array}
\]
Our proof uses the reduction to b-matchings, pseudo-basic circulations, and an idea of Schrijver [Shr81]. Using LP-duality, it turns out that the symmetry constraints (p2) are redundant.
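The constraints (p1a), (p1b), (p2) and (p3) can be checked mechanically for a given flow. The following is a minimal sketch, not taken from the paper: the node naming (a trailing apostrophe for complements, s and t complementary to each other), the helper names and the small triangle instance with a return arc (t, s) are all assumptions made for illustration. The skew cut constraints (p4) are not checked.

```python
from fractions import Fraction


def comp(v):
    """Complementary node: s <-> t, and v <-> v' otherwise (naming assumption)."""
    if v in ("s", "t"):
        return "t" if v == "s" else "s"
    return v[:-1] if v.endswith("'") else v + "'"


def comp_arc(a):
    """Complementary arc of a = (u, v) is (v', u')."""
    u, v = a
    return (comp(v), comp(u))


def violated_constraints(arcs, lcap, ucap, f):
    """Return the labels among p1a, p1b, p2, p3 that the flow f violates."""
    violated = set()
    excess = {}
    for a in arcs:
        u, v = a
        if f[a] < lcap[a]:
            violated.add("p1a")
        if f[a] > ucap[a]:
            violated.add("p1b")
        if f[a] != f[comp_arc(a)]:
            violated.add("p2")
        excess[u] = excess.get(u, 0) - f[a]
        excess[v] = excess.get(v, 0) + f[a]
    if any(e != 0 for e in excess.values()):
        violated.add("p3")
    return sorted(violated)


# Hypothetical instance: the triangle {1, 2, 3} turned into a balanced network,
# with a return arc (t, s) so that st-flows become circulations.
half = Fraction(1, 2)
middle = [("1", "2'"), ("2", "1'"), ("2", "3'"),
          ("3", "2'"), ("1", "3'"), ("3", "1'")]
outer = [("s", v) for v in "123"] + [(v + "'", "t") for v in "123"]
arcs = middle + outer + [("t", "s")]
lcap = {a: 0 for a in arcs}
ucap = {a: 1 for a in arcs}
ucap[("t", "s")] = 3

# Half-integral circulation: 1/2 on every triangle arc.
f = {a: half for a in middle}
f.update({a: Fraction(1) for a in outer})
f[("t", "s")] = Fraction(3)
print(violated_constraints(arcs, lcap, ucap, f))

# Changing one complementary pair on one side only breaks (p2) and (p3).
g = dict(f)
g[("1", "2'")] = Fraction(1)
g[("1", "3'")] = Fraction(0)
print(violated_constraints(arcs, lcap, ucap, g))
```

In this sketch the flow f satisfies all four constraint families although it is not integral, which is the situation that the skew cut constraints (p4) are meant to exclude from the convex hull of balanced circulations.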
5. Optimality

Virtually all known matching algorithms depend on the idea of augmentation: Let p be a simple, not necessarily directed, path in N. The elementary flow fp supported by p is defined by

\[
f_p(a) := \begin{cases} +1 & \text{if } a \text{ is a forward arc,} \\ -1 & \text{if } a \text{ is a backward arc,} \\ \phantom{+}0 & \text{otherwise.} \end{cases}
\]

If f is a flow, and p is a directed path in the residual network N(f), the update f := f + fp is called an augmentation step. If we augment a balanced flow by a single path, we lose the symmetry requirement for this flow. Hence we must augment in complementary pairs f := f + fp + fp′. For the sake of brevity, we will use the notation χp := fp + fp′. If f as well as f + χp are feasible balanced flows, the path p is called valid (augmenting) with respect to the residual network N(f).

Theorem 5.1 ([FrJu1]). Let f, g be different balanced circulations on the balanced flow network N. Then there are valid cycles p1, p2, …, pk in N(f) such that

\[
g - f = \sum_{i=1}^{k} \chi_{p_i}.
\]
This theorem gives a primal proof for the augmenting path theorem:
Theorem 5.2 ([FrJu1]). Let f be a balanced st-flow on the balanced flow network N. Then f is a maximum balanced flow iff there is no valid st-path in N(f).

This theorem can be viewed as a generalization of the well-known results of Berge [Ber57] and Ford/Fulkerson [FoFu62]. By the decomposition theorem 5.1, we can also derive optimality criteria for weighted problems. The first statement is the primal optimality criterion:

Theorem 5.3 ([FrJu1]). Let f be a balanced circulation on the balanced flow network N. Then f is optimal iff there is no valid cycle p in N(f) with c(fp) negative.

The next statement concerns the shortest augmenting path (SAP) algorithm which applies to the min-cost balanced st-flow problem rather than the MCBCP. This algorithm finds a balanced st-flow f of given value ν with c(f) minimum, called (ν)-optimal in what follows. A series of (μ)-optimal balanced flows can be derived using the following idea:

Theorem 5.4 ([FrJu1]). Let f be a (μ)-optimal balanced st-flow on the balanced flow network N, and p a shortest valid augmenting path in N(f). Then g := f + χp is (μ + 2)-optimal balanced.

Note that there is no distinction between (μ)-optimal and extreme flows as in the graph theoretical context. The most popular algorithms for weighted matching problems are based on the primal-dual algorithm for linear programming. But this approach requires dual solutions and a reduced-costs optimality criterion. The dual variables are node potentials π and non-negative variables φ associated with the skew cuts. In accordance with [GoKa96], and in order to keep the distinction to the reduced cost labels known for the min cost flow problem, we call

\[
c_\pi^\phi(a) := c(a) + \pi(a^-) - \pi(a^+) + \sum_{(A_1, A_2) \in O(N)} \chi_{A_1, A_2}(a)\, \phi(A_1, A_2)
\]

the modified cost of the arc a. In this formula, $\chi_{A_1, A_2}$ denotes the incidence vector of the skew cut (A1, A2) which is defined by

\[
\chi_{A_1, A_2}(a) := \begin{cases} +1 & \text{if } a \in A_1, \\ -1 & \text{if } a \in A_2, \\ \phantom{+}0 & \text{otherwise.} \end{cases}
\]

One then can observe the following:

Theorem 5.5 ([FrJu6]). Let f be a balanced circulation on a balanced flow network N. Then f is optimal iff there are vectors π and φ ≥ 0 so that
(cs1) $c_\pi^\phi(a) \ge 0$, if rescap_f(a) > 0,
(cs2) φ(A1, A2) = 0, if (A1, A2) ∈ O(N) is not tight, i.e., if (A1, A2) does not satisfy (p4) with equality.
If we went into the details, the modified length labels would occur in the description of both the primal algorithm and the SAP algorithm. Let

\[
ucap^\circ(a) := \begin{cases} ucap(a), & \text{if } c_\pi^\phi(a) \le 0, \\ lcap(a), & \text{if } c_\pi^\phi(a) > 0, \end{cases}
\qquad
lcap^\circ(a) := \begin{cases} lcap(a), & \text{if } c_\pi^\phi(a) \ge 0, \\ ucap(a), & \text{if } c_\pi^\phi(a) < 0. \end{cases}
\]

By Nπ,φ, we denote the balanced flow network which is formed by V(N), A(N) and the capacity labels ucap°, lcap°. This network is called the admissible graph with respect to π, φ. If there is no confusion about the dual solution, we write N° instead of Nπ,φ. Now let π, φ be an optimal dual solution. Then Theorem 5.5 says that a balanced circulation is optimal iff it is a feasible circulation on N°. Hence it is essentially an MBFP which determines an optimal circulation from an optimal dual solution.
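The modified cost and the capacity labels of the admissible graph are simple to evaluate once π and φ are given. The sketch below is an illustration under assumptions: arcs are pairs (start node, end node), a⁻ is read as the start node and a⁺ as the end node of a (the text above does not fix this), φ is stored as a dictionary keyed by pairs of frozensets, and the numbers are made up and do not come from an actual balanced network or skew cut.

```python
def modified_cost(a, c, pi, phi):
    """c(a) + pi(a^-) - pi(a^+) + sum of chi_{A1,A2}(a) * phi(A1, A2)."""
    tail, head = a  # assumption: a^- is the start node, a^+ the end node
    total = c[a] + pi[tail] - pi[head]
    for (A1, A2), value in phi.items():
        if a in A1:
            total += value      # chi = +1 on A1
        elif a in A2:
            total -= value      # chi = -1 on A2
    return total


def admissible_capacities(a, lcap, ucap, mc):
    """Return (lcap°(a), ucap°(a)) for an arc with modified cost mc."""
    upper = ucap[a] if mc <= 0 else lcap[a]
    lower = lcap[a] if mc >= 0 else ucap[a]
    return lower, upper


# Purely numerical toy data (hypothetical).
a1, a2, a3 = ("u", "v"), ("v", "w"), ("u", "w")
c = {a1: 3, a2: 1, a3: 4}
pi = {"u": 0, "v": 5, "w": 5}
phi = {(frozenset({a1}), frozenset({a3})): 2}
lcap = {a: 0 for a in (a1, a2, a3)}
ucap = {a: 1 for a in (a1, a2, a3)}

for a in (a1, a2, a3):
    mc = modified_cost(a, c, pi, phi)
    print(a, mc, admissible_capacities(a, lcap, ucap, mc))
```

An arc with modified cost zero keeps its full capacity range, an arc with positive modified cost is fixed at its lower bound, and an arc with negative modified cost is fixed at its upper bound, which mirrors the case distinction above.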
6. Distance and tenacity labels

In what follows, we will consider path problems rather than network flow problems. Hence let N denote a skew-symmetric digraph together with a balanced capacity function cap, and a source node s. By putting ucap := cap and f, lcap := 0, the notion of valid paths applies to this new setting. The balanced network search (BNS) problem asks for valid paths joining s to the other nodes of the network. If we use a distance label d(v), this shall denote the minimum length of a valid sv-path. We refer to a path which achieves this minimum length as a d(v)-path. Let us consider an augmentation algorithm for MBFP which chooses only d(t)-paths for augmentation. The importance of shortest augmenting paths is the same as for the ordinary max flow problem:

Theorem 6.1 ([FrJu1]). Let p be a d(t)-path in N(f), put g := f + χp, and let q be a d(t)-path in N(g). Then |p| ≤ |q| holds.

That is, one can decompose the augmentation algorithm into phases where all augmenting paths have equal length. Such an augmentation algorithm is called phase-ordered. It can be shown that if an augmenting path traverses an arc a then none of the augmenting paths in the same phase may traverse the reverse arc ā. It follows that the number of augmentations per phase is O(m). For the details, see [FrJu1]. The length of an augmenting path is restricted by the number of nodes, and hence the number of phases is O(n). In some circumstances, the number of phases may be even less:
Theorem 6.2 ([FrJu1]). Let NG be the 0-1-balanced flow network associated with a graph G. Then the phase-ordered algorithm consists of O(√n) phases.

A node v is a minlevel node if d(v) ≤ d(v′), and a maxlevel node otherwise. The tenacity labels of a node v and an arc a = uv are defined by ten(v) := d(v) + d(v′) and ten(a) := d(u) + d(v′) + 1, respectively. Tenacity labels play a central role in the determination of shortest augmenting paths. While minlevel nodes are explored according to the order of their distance labels, maxlevel nodes are explored according to their tenacity labels. Figure 5 shows a balanced network which is our running example (all arcs have unit capacity). One finds that d(6) = 2, d(7′) = 4, and d(7) = d(6′) = d(5) = d(5′) = 3, so that ten(6) = 5, ten(5) = 6, ten(7) = 7. Hence 6, 7, 5, 5′ are minlevel nodes while 6′ and 7′ are maxlevel nodes. If x is a minlevel node, then any arc which reaches x on a d(t)-path is called a prop. Arcs which are neither props nor complements of props are called bridges. Up to complementarity, there are three bridges in our example which have tenacity ten(2, 2′) = 3, ten(6, 6′) = 5 and ten(7, 5) = 7 respectively.

Figure 5. A balanced network (nodes s, t, 1, …, 7 and their complements 1′, …, 7′)
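The tenacity values quoted for the running example follow directly from the definitions. As a small check, the sketch below recomputes them from the distance labels stated in the text; the dictionary layout, the apostrophe naming of complementary nodes and the function names are assumptions for illustration, and only the nodes 5, 6, 7 and their complements are covered because no other distance labels are given above.

```python
# Distance labels of the running example, as stated in the text.
d = {"6": 2, "6'": 3, "7": 3, "7'": 4, "5": 3, "5'": 3}


def comp(v):
    """Complementary node: v <-> v' (naming assumption)."""
    return v[:-1] if v.endswith("'") else v + "'"


def ten_node(v):
    """ten(v) := d(v) + d(v')."""
    return d[v] + d[comp(v)]


def ten_arc(u, v):
    """ten(a) := d(u) + d(v') + 1 for an arc a = uv."""
    return d[u] + d[comp(v)] + 1


def is_minlevel(v):
    """v is a minlevel node iff d(v) <= d(v')."""
    return d[v] <= d[comp(v)]


minlevel = sorted(v for v in d if is_minlevel(v))
maxlevel = sorted(v for v in d if not is_minlevel(v))

print(ten_node("6"), ten_node("5"), ten_node("7"))   # 5 6 7
print(minlevel, maxlevel)
print(ten_arc("6", "6'"), ten_arc("7", "5"))         # 5 7
```

Note that both 5 and 5′ count as minlevel nodes because their distance labels coincide, in agreement with the classification given in the text.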
7. Odd sets

All literature on matching theory depends on odd sets in some way. Sometimes the formal relationship between the different notions is not obvious. Apart from the skew cuts in Section 4, we introduce two more versions of odd sets: We call C := {v ∈ V(N) : ten(v) < ∞} the core, and the connected components of N[C] the nuclei of the balanced network N. The nucleus containing node x is denoted by U(x). Adopting the notation of Tutte [Tut52], an arc a = uv is accessible if there is a valid su-path p and either rescap(a) > 1 or a′ is not on p. An arc a is bicursal if a and a′ are accessible. If we restrict N to the bicursal arcs, the connected components of the resulting digraph are called the blossoms of N. The blossom containing node x is denoted by B(x).

Theorem 7.1 ([FrJu1]). Let B be a blossom not containing the source s. Then there is a prop a = ub with cap(a) = 1 which is traversed by every valid path p connecting s to a node v in B.

We refer to the node b as the base, and to the arc a as the prop of this blossom. An analogous statement can be given for nuclei. It can be shown that nuclei split into blossoms, and that blossoms and nuclei coincide for NG(f), the 1-matching case. In this case, nuclei are exactly the blossoms used by Edmonds [Edm65]. But even for the case of 2-factor problems, the notions differ (see [FrJu1] for an example). The formal relationship of blossoms and nuclei is as follows:

Theorem 7.2 ([FrJu1]). Let a be a bridge. The following statements are equivalent:
(a) ten(a) is finite.
(b) a is contained in a blossom.
(c) a is bicursal.
(d) a is contained in a nucleus.

In Figure 5, all nodes are reachable and hence in a common nucleus. On the other hand, the arc (6′, 2′) is not accessible so that two blossoms exist. The layered auxiliary network associated with N consists of the blossom bases and the nodes x with d(x) < ∞ = d(x′) which we call buds and for which we put base(x) = x. Two bases b1, b2 are joined by an arc if there is a prop uv such that b1 = base(u) and b2 = base(v). The layered auxiliary network for our running example merely consists of the nodes s and 6 which are adjacent. It is shown in Figure 6, and labelled N4 there.
8. Canonical decomposition

Before we get into the details of BNS algorithms, we briefly return to the MBFP and its dual problem: To this end, let f denote a maximum balanced flow on some balanced flow network N, let Q = [S, T] denote an arbitrary st-cut in N, and put

A(Q) := {v ∈ T : v′ ∈ S}, B(Q) := {v ∈ S : v′ ∈ T}, C(Q) := {v ∈ S : v′ ∈ S}, D(Q) := {v ∈ T : v′ ∈ T}.

In extension of the preceding definitions, the set C(Q) is called the core of Q, and the connected components of N[C(Q)] are the nuclei of Q. A nucleus U is odd if cap(U, T) is odd, and the number of odd nuclei is denoted by odd(Q). An st-cut is called minimum balanced if balcap(Q) := ucap(Q) − odd(Q), the balanced capacity of Q, is minimum among all st-cuts in N. Note that the dual feasibility set consists of ordinary edge cuts, but the objective function is not the same as in the ordinary min-cut problem.

Theorem 8.1 ([Tut67], [KoSt93], [FrJu4]). Let f be a balanced st-flow, and Q an st-cut of the balanced flow network N. Then val(f) ≤ balcap(Q) holds. Furthermore, equality holds iff f is a maximum balanced st-flow, and Q is a minimum balanced st-cut.

This statement establishes an alternative proof for the augmenting path theorem 5.2 which is in the spirit of Ford and Fulkerson [FoFu62]. Let Q(N, f) denote the st-cut separating the nodes which are s-reachable in N(f). To prove Theorem 8.1, one must show that Q(N, f) is minimum balanced and that all nuclei are odd. As a later application of this duality theorem, one can characterize Q(N, f) in terms of the other minimum balanced cuts:

Theorem 8.2 ([FrJu4]). Let N be a balanced flow network, and f a maximum balanced st-flow on N. Then v ∈ V(N) is strictly s-reachable with respect to N(f) iff v ∈ S for every minimum balanced st-cut [S, T] of N.

This shows that Q(N, f) does not depend on the special choice of the maximum balanced flow f. For this reason, Q(N) := Q(N, f) is well-defined, and called the canonical st-cut in N. With moderate effort, one can derive from this cut the well-known Gallai–Edmonds decomposition [LoPl86], and prove the respective structure theorem for maximum 1-matchings.
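The four sets A(Q), B(Q), C(Q) and D(Q) depend only on the cut and on the complementarity of nodes. The following sketch computes them for a made-up cut; the node names, the convention that s and t are complementary to each other, and the function names are assumptions for illustration. Nuclei, odd(Q) and balcap(Q) would additionally need the arcs and capacities and are not computed here.

```python
def comp(v):
    """Complementary node: s <-> t, and v <-> v' otherwise (naming assumption)."""
    if v in ("s", "t"):
        return "t" if v == "s" else "s"
    return v[:-1] if v.endswith("'") else v + "'"


def classify_cut(S, T):
    """Split the nodes of an st-cut Q = [S, T] into A(Q), B(Q), C(Q), D(Q)."""
    A = {v for v in T if comp(v) in S}
    B = {v for v in S if comp(v) in T}
    C = {v for v in S if comp(v) in S}   # core of Q
    D = {v for v in T if comp(v) in T}
    return A, B, C, D


# Hypothetical st-cut on the node set {s, t, 1, 1', 2, 2', 3, 3'}.
S = {"s", "1", "1'", "2"}
T = {"t", "2'", "3", "3'"}
A, B, C, D = classify_cut(S, T)
print(sorted(A), sorted(B), sorted(C), sorted(D))
```

By construction A(Q) consists of the complements of the nodes in B(Q), while C(Q) and D(Q) are self-complementary; the assertion-style checks one might add for a real instance follow the same pattern.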
By putting ucap := cap and f, lcap := 0, the notion of (canonical) decomposition applies to the BNS problem as well, but we rather use the term (canonical) barrier.
9. Shrinking blossoms

Blossoms as introduced in Section 7 are static in the sense that no algorithm is involved. We shall now explain how the blossoms and the layered auxiliary network are affected by the investigation of some complementary arc pair a = uv, a′ = v′u′ during the BNS procedure. Let A denote the arcs inspected by the procedure so far, and Ã := A ∪ {a, a′}. We can apply the results of Section 7 to the network N[A]. Whenever necessary, we write a subscript or a prefix to denote that we are talking about N[A] rather than N. For example, we refer to the canonical barrier of the network N[A] by the sets A_A, B_A, C_A, D_A. It is simple to check that blossoms form a nested family, that is, an (A)-blossom and an (Ã)-blossom are either disjoint or the (A)-blossom is contained in the (Ã)-blossom. To obtain more explicit statements relating N[A] to N[Ã], we have to assume that
(α1) A is self-complementary,
(α2) A does not contain (A)-acursal arcs,
(α3) every (A)-unicursal arc is an (A)-prop.
Note that these three conditions are all very natural and satisfied by all known cardinality matching algorithms. Now we can formulate the two essential operations of a BNS algorithm, namely bud generation and blossom shrinking:

Theorem 9.1 ([FrJu1]). Let d_A(u) < ∞ and d_A(v′) = ∞. Then
(a) C_Ã = C_A.
(b) B_Ã = B_A ∪ {v}.
(c) Every (Ã)-blossom is an (A)-blossom.

Theorem 9.2 ([FrJu1]). Let d_A(u), d_A(v′) < ∞. Then:
(a) D_Ã = D_A.
(b) Both u and v are in a common proper (Ã)-blossom, denoted by B_Ã(u, v).
(c) Except for B_Ã(u, v), every (Ã)-blossom is an (A)-blossom.
It has to be mentioned that neither statement depends on property (α3) at all. This property becomes important only if we want to determine the new blossom B_Ã(u, v) in Theorem 9.2. It essentially says that all necessary information can be found in the layered auxiliary network for N[A]. The main effort in computing B_Ã(u, v) is to determine an appropriate bottleneck b. This is a base node which must be traversed by every (A)-valid su-path, and by every (A)-valid sv′-path. It turns out that the new blossom contains all (A)-blossoms whose base is on a directed path from b to base(u) or base(v) in the layered auxiliary network. As an example, suppose that an arc (7, 2′) is investigated in the situation of Figure 6, network N3. A blossom shrinking operation would occur, s would be the bottleneck, and all blossoms but {6, 6′} would be shrunk. Hence the layered auxiliary network of Figure 6, network N4 would result. All known BNS algorithms satisfy the conditions (α1)–(α3). Some procedures even satisfy the condition
(α4) at most one (A)-unicursal arc with given end node v exists,
which is much more restrictive than (α3). Since then all layered auxiliary networks which occur are trees, we call such BNS algorithms tree growing. It is a rather simple procedure to find the desired bottleneck under these circumstances. Even more, one can state a generic path expansion rule and generic correctness proof for tree growing BNS algorithms. Details may be found in [FrJu1].
10. Shortest valid paths

We next describe an algorithm which determines the distance labels of a balanced network. The key idea is a series of subgraphs N1, N2, …, Nk which are grown from scratch until t becomes reachable. It turns out that t is reachable if and only if ten(t) ≤ 2k. The network Ni consists of the following arcs:
• The props whose end nodes v have distance label d(v) ≤ i.
• The complements of these props.
• The bridges a which have tenacity ten(a) ≤ 2i.
A subscript i indicates that we are talking about the network Ni. The layered networks for the subgraphs Ni of our running example are depicted in Figure 6. Let v ∈ V(N) so that d(v) ≤ d(v′). It requires considerable effort to show that

\[
d_i(v) = \begin{cases} d(v), & \text{if } d(v) \le i, \\ \infty, & \text{otherwise,} \end{cases}
\qquad
d_i(v') = \begin{cases} d(v'), & \text{if } ten(v) \le 2i, \\ \infty, & \text{otherwise.} \end{cases}
\]

The proof uses induction on i, a nested application of the general theory of Section 9 and the following statements:
Figure 6. Iterated Networks (panels N1, N2, N3, N4)
Theorem 10.1 ([FrJu1]). Let v ∈ V(N) either be a minlevel node with distance d(v) = i + 1 or a maxlevel node with tenacity ten(v) = 2i + 1. Any d(v)-path p has the following properties:
(a) p traverses only (i + 1)-arcs.
(b) p traverses exactly one arc a which is not an (i)-arc.
(c) If v is a maxlevel node, then a is a bridge and ten(a) = ten(v).
(d) p visits any (i)-nucleus at most once.
(e) Any (i)-nucleus U ≠ U(s) traversed by p before a is reached by prop_i(U).
(f) Any (i)-nucleus U traversed by p after a is left by prop_i(U)′.
(g) If both xy and y x are bridges and xy is on p, then t (x, y ) < t (y , x) holds.
11. Surface graphs

We shall describe the general ideas occurring in weighted algorithms. The notion of layered auxiliary network for non-weighted BNS corresponds to the notion of surface graphs here. Again, these networks are associated with a graph search problem rather than a network flow problem:
As introduced in [GoKa96], a fragment of a balanced network N is a pair (U, a) where U, the interior, is a self-complementary node set, and a, the prop, is an arc in N[Ū, U]. Blossoms and nuclei together with their props are natural examples of fragments. Shrinking a fragment (U, a) of a balanced network N means the following: All interior nodes and arcs are deleted from N. Instead of these, a new pair w, w′ of nodes is introduced. The arcs adjacent with U are redirected as follows:
• The new end node of a is w.
• The new start node of a′ is w′.
• The start node of other arcs in N[U, Ū] is w.
• The end node of other arcs in N[Ū, U] is w′.
This modification preserves the skew-symmetry of the network. An example is given in Figure 7 where the fragment ({2, 4, 5, 2′, 4′, 5′}, (s, 2)) of our running example in Figure 5 is shrunk.
Figure 7. A surface graph (nodes s, t, 1, 3, 6, 7, w and their complements)
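The redirection rules for shrinking a fragment can be sketched in a few lines. This is an illustration under assumptions rather than the authors' procedure: arcs are pairs (start node, end node), complements carry a trailing apostrophe, and "other arcs" are read as follows: any other arc leaving the interior now starts at w, and any other arc entering the interior now ends at w′. The small network is hypothetical and is not the running example of Figure 5, whose arcs are not listed in the text.

```python
def comp(v):
    """Complementary node: s <-> t, and v <-> v' otherwise (naming assumption)."""
    if v in ("s", "t"):
        return "t" if v == "s" else "s"
    return v[:-1] if v.endswith("'") else v + "'"


def comp_arc(a):
    """Complementary arc of a = (u, v) is (v', u')."""
    u, v = a
    return (comp(v), comp(u))


def shrink(arcs, interior, prop, w="w"):
    """Shrink the fragment (interior, prop) into the node pair w, w'."""
    w_c = comp(w)
    prop_c = comp_arc(prop)
    result = []
    for a in arcs:
        u, v = a
        if u in interior and v in interior:
            continue                      # interior arcs are deleted
        if a == prop:
            result.append((u, w))         # new end node of the prop is w
        elif a == prop_c:
            result.append((w_c, v))       # new start node of the prop's complement is w'
        elif u in interior:
            result.append((w, v))         # other arcs leaving the interior start at w
        elif v in interior:
            result.append((u, w_c))       # other arcs entering the interior end at w'
        else:
            result.append(a)              # arcs not touching the interior are kept
    return result


# Hypothetical skew-symmetric network with fragment ({2, 2', 3, 3'}, (1, 2)).
arcs = [("s", "1"), ("1'", "t"), ("1", "2"), ("2'", "1'"),
        ("2", "3'"), ("3", "2'"), ("3", "1'"), ("1", "3'")]
interior = {"2", "2'", "3", "3'"}
surface = shrink(arcs, interior, ("1", "2"))
print(surface)
```

In this toy instance the result is again closed under taking complementary arcs, which is the property the text refers to as preserving skew-symmetry.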
A shrinking family is a set S of fragments whose interiors form a nested family. In particular, the cardinality of a shrinking family is O(n). This is important since in practical algorithms all non-zero dual variables φ belong to some member of the shrinking family. The network which results from shrinking all maximal fragments of a family S is called the surface graph. Note that these shrinking operations commute. Hence we can write N / S irrespective of the special order. If no confusion about S is possible, we write N̄ instead of N / S.
The idea of the primal-dual algorithm for the MCBFP is to start with the zero flow f and the trivial dual solution π ≡ 0, φ ≡ 0. These solutions are compatible, that is, the modified length labels are non-negative for every arc with residual capacity. In fact, both solutions stay strongly compatible throughout the algorithm, that is, the dual solution is always supported by a shrinking family S. Let F0 ∈ S be an arbitrary fragment with interior U and prop a. Let S0 := {F ∈ S : F is properly nested into F0}, Ñ := N°(f) / S0, and b the end node of a in Ñ. Then rescap(a) = 1 must hold, and every node x ∈ U must be b-reachable by a valid path in Ñ[U].

Then, iteratively, a BNS is performed on the surface graph N̄°(f). This is initially the same as N°(f). If N̄°(f) admits a valid st-path, then N°(f) also admits a valid st-path, and f is augmented. Otherwise, if the core of N̄°(f) is non-trivial, the fragments corresponding to the blossoms found are added to S and shrunk immediately.

Theorem 11.1. Let U denote some blossom or nucleus of the network N̄°(f). Then there is a tight skew cut with interior U.

If the core is trivial, a dual update is performed which does not change the modified length of the interior arcs. After that operation, the BNS tree may contain some maximal fragments whose dual variables are zero and which are reached by an arc other than the prop. The dual update ends with the expansion of such fragments. If no fragments are expanded, at least one new node becomes reachable in N̄°(f). As in the 1-matching case, only O(n) shrinking, expansion and dual update operations can occur, before an augmenting path is found or the flow is shown to be maximum. In a naive algorithm, one would start a new BNS after each of these operations. One merely has to show that the solutions stay strongly dual compatible during the augmentations of f.
It is easy to see that the skew cut corresponding to a fragment F ∈ S remains tight. However, it should be possible to traverse the interior of F even in a later augmentation step. We do not go into the details here, but give an exhaustive description in [FrJu7].
12. Explicit algorithms

We conclude this introduction with several remarks about the implementation of MBFP and MCBFP algorithms, and some test results. All matching-like algorithms must handle a so-called disjoint set union (DSU) data structure. In our setting, the DSU problem asks for the base of the blossom containing a given node. The critical operations are the updates which are necessary if blossoms are merged or expanded.
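A disjoint set union structure that answers base queries can be sketched as follows. This is a generic union-find with path halving and the blossom base stored at the root of each set; the class and method names are assumptions, and the sketch covers only base finding and merging. Blossom expansion, which the weighted algorithms also need, is not supported.

```python
class BlossomDSU:
    """Union-find that reports the base of the blossom containing a node."""

    def __init__(self, nodes):
        self.parent = {v: v for v in nodes}
        self.base = {v: v for v in nodes}   # only meaningful at set roots

    def _root(self, v):
        # Path halving: every visited node is re-linked to its grandparent.
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def find_base(self, v):
        """Base of the blossom that currently contains v."""
        return self.base[self._root(v)]

    def merge(self, v, into):
        """Merge the blossom of v into the blossom of `into`, keeping its base."""
        rv, ri = self._root(v), self._root(into)
        if rv != ri:
            self.parent[rv] = ri            # base[ri] stays the base of the union


dsu = BlossomDSU(["s", "1", "2", "3", "4"])
dsu.merge("2", into="1")
dsu.merge("3", into="2")
print(dsu.find_base("3"))   # base is still "1"
dsu.merge("1", into="s")
print(dsu.find_base("3"), dsu.find_base("4"))
```

Union by rank or size is left out to keep the sketch short; adding it would require storing the base separately from the tree root, exactly as done here, since the root chosen by the balancing rule need not be the base.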
If this data structure is implemented carefully, base find operations and the merging of one blossom into another can be considered to be elementary operations. In the case of weighted problems, the additional blossom expansion operations cannot be implemented as elementary operations, but are dominated by the dual updates. Regarding the DSU process, there are no additional difficulties involved in MBFP and MCBFP algorithms compared to the traditional setting of 1-matchings.

We first discuss the problem of finding the canonical decomposition. As suggested in Section 9, every arc a must be investigated only once. The critical operation is to collect all blossom bases which are potentially shrunk by the investigation of a. One may choose a tree-growing search strategy. If distance labels are used which are an upper bound for the correct distance labels in N[A], one may trace back from base_A(u) and base_A(v) in the layered auxiliary network without tracing beyond the new blossom base. Hence the shrinking of blossoms requires O(n) time altogether. An explicit pseudo-code can be found in [FrJu2].

If we want to know the correct distance labels, we cannot choose a search strategy which works tree-like. Nevertheless, the double depth first search strategy applies which was introduced for the famous Micali/Vazirani algorithm [MiVa80], and which helps shrinking a blossom without searching the nodes outside of the new blossom. It turns out that the canonical decomposition and the distance labels can be found in O(m) time. Keeping the relevant information, an augmenting path or even a d(t)-path may be extracted in O(n) steps. However, it requires considerable care to implement path expansion procedures. In case of d(t)-paths, a lot of code and data structures are also needed.

Next let us consider the phase-ordered augmentation algorithm as suggested in Section 6.
Since only O(n) phases and O(m) augmentations per phase can occur, we conclude that the MBFP can be solved in O(nm²) time. As the Dinic algorithm and the Micali/Vazirani algorithm show, a phase-ordered algorithm may be implemented so that only one BNS per phase is needed. The relevant update operation between two augmentation steps is called topological erase, and applies in our context also. The double depth first search inspects nodes which are not on the final augmenting path, but the phase-ordered algorithm can run in O(n²m) time nevertheless. Pseudo-code for phase-ordered algorithms can be found in [FrJu3]. New complexity statements, including the idea of graph compression, are given in [FrJu8].

The MBFP may be solved by another strategy to which we refer as the cycle canceling algorithm: One first computes an ordinary integral maximum flow on the balanced flow network. This flow may be symmetrized by taking

\[
f(a) := \tfrac{1}{2}\,\bigl(f(a) + f(a')\bigr)
\]

for each arc of the network. As mentioned in Section 4, the resulting flow may be transformed into a pseudo-basic flow in O(m) time which is still maximum.
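The symmetrization step can be tried out on a small instance. The sketch below is an illustration under assumptions: the network is the balanced network of a triangle on the nodes 1, 2, 3 (a hypothetical instance, with apostrophes marking complements and s, t complementary to each other), and the integral starting flow was chosen by hand rather than computed by a max-flow routine.

```python
from fractions import Fraction


def comp(v):
    """Complementary node: s <-> t, and v <-> v' otherwise (naming assumption)."""
    if v in ("s", "t"):
        return "t" if v == "s" else "s"
    return v[:-1] if v.endswith("'") else v + "'"


def comp_arc(a):
    """Complementary arc of a = (u, v) is (v', u')."""
    u, v = a
    return (comp(v), comp(u))


def symmetrize(f):
    """Replace f(a) by (f(a) + f(a')) / 2 on every arc a."""
    return {a: Fraction(f[a] + f[comp_arc(a)], 2) for a in f}


middle = [("1", "2'"), ("2", "1'"), ("2", "3'"),
          ("3", "2'"), ("1", "3'"), ("3", "1'")]
outer = [("s", v) for v in "123"] + [(v + "'", "t") for v in "123"]

# Integral st-flow of value 3 that is not balanced: it uses 1 -> 2', 2 -> 3', 3 -> 1'.
f = {a: 0 for a in middle + outer}
for a in outer + [("1", "2'"), ("2", "3'"), ("3", "1'")]:
    f[a] = 1

g = symmetrize(f)
value = sum(g[("s", v)] for v in "123")
fractional = sorted(a for a in g if g[a].denominator != 1)
print(value, fractional)
```

The flow value is unchanged, the result is symmetric, and the arcs carrying flow 1/2 are exactly the six triangle arcs, that is, the arcs of the cycle (1, 2′, 3, 1′, 2, 3′, 1) in this instance.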
The next step essentially determines valid paths which connect two of the odd cycles. This turns f into a balanced flow and reduces the flow value of f by at most 2n units in O(nm) time. Then a suitable augmentation procedure may be used to compute a maximum balanced flow from f in O(nm) time. Hence a time bound for the complete cycle canceling algorithm is O(nm + γ(n, m)) where γ(n, m) denotes the worst case complexity of the max-flow algorithm used.

We have implemented the phase-ordered algorithm as well as the cycle canceling algorithm and tested the computation of 1-, 2- and 3-factors in sparse random graphs. The tests showed that both algorithms need roughly 1 minute for 10⁵-node problems on a contemporary PC. Note that the cycle canceling method admits the choice of an arbitrary max-flow algorithm. In our tests, the running times with the Dinic method were superior to the Push/Relabel algorithm. This apparently stems from the special shape of the transformed matching problems. But the phase-ordered algorithm and the cycle canceling method with Dinic start-up are rather similar, which may explain the running times. We did not test the MKM max-flow algorithm which is closely related to the Dinic method. It may turn out that the cycle canceling method is superior in the general setting.

The min-cost balanced flow problem can be solved by each of the strategies suggested in Section 5. A procedure which finds a min-cost augmenting path has been described by Goldberg/Karzanov [GoKa96]. However, the SAP approach seems less efficient than the PD approach since negative arc lengths occur, which causes a tremendous increase of effort. The primal approach seems valuable as a post-optimization procedure but not as a general strategy. A primal 1-matching algorithm can be found in [CuMa78]. We think that this idea applies to the setting of balanced network flows as well.
We have devised a simple primal-dual code based on Section 11 which has the following complexity: Let a phase denote the operations of the algorithm between two augmentations. As mentioned, a phase consists of O(n) dual updates each of which needs O(m) time in the naive implementation. If ν denotes the value of a maximum flow, then the primal-dual algorithm has worst-case time complexity O(νnm) and is hence not polynomial. There are well-known techniques which improve the PD algorithm to O(n²) respectively O(m log n) in the case of 1-matchings. In [GoKa96], these ideas are used for an SAP method. Hence, it seems likely that the dual updates in the PD algorithm can be improved to O(m log n) time per phase. The technique of canceling fractional cycles applies to the weighted setting also, and produces near-optimal input to the PD algorithm. This idea, which has been described for the weighted b-matching problem by Anstee [Ans87] before, leads to the only strongly polynomial MCBCP algorithms known so far. A detailed description for the framework of balanced network flows including some primal-dual code is given in [FrJu7].
Acknowledgement. This research has been funded by the German Research Council (DFG).
References
[Ans87]
R. P. Anstee, A polynomial algorithm for b-matchings: An alternative approach, Inform. Process. Lett. 24 (1987), 153–157.
[Ber57]
C. Berge, Two theorems in graph theory, Proc. Natl. Acad. Sci. USA 43 (1957), 842–844.
[CuMa78]
W. H. Cunningham and A. B. Marsh, A primal algorithm for optimum matching, Math. Program. 8 (1978), 50–72.
[Edm65]
J. Edmonds, Paths, trees and flowers, Canad. J. Math. 17 (1965), 449–467.
[FoFu62]
L. R. Ford and D. R. Fulkerson, Flows in networks, Princeton University Press, Princeton, NJ, 1962.
[FrJu1]
C. Fremuth-Paeger and D. Jungnickel, Balanced network flows (I): A unifying framework for design and analysis of matching algorithms, Networks 33 (1999), 1–28.
[FrJu2]
C. Fremuth-Paeger and D. Jungnickel, Balanced network flows (II): Simple augmentation algorithms, Networks 33 (1999), 29–41.
[FrJu3]
C. Fremuth-Paeger and D. Jungnickel, Balanced network flows (III): Strongly polynomial augmentation algorithms, Networks 33 (1999), 43–56.
[FrJu4]
C. Fremuth-Paeger and D. Jungnickel, Balanced network flows (IV): Duality and structure theory, Networks 37 (2001), 194–201.
[FrJu5]
C. Fremuth-Paeger and D. Jungnickel, Balanced network flows (V): Cycle canceling algorithms, Networks 37 (2001), 202–209.
[FrJu6]
C. Fremuth-Paeger and D. Jungnickel, Balanced network flows (VI): Polyhedral descriptions, Networks 37 (2001), 210–218.
[FrJu7]
C. Fremuth-Paeger and D. Jungnickel, Balanced network flows (VII): A primal-dual algorithm, Networks 39 (2002), 135–142.
[FrJu8]
C. Fremuth-Paeger and D. Jungnickel, Balanced network flows (VIII): A revised theory of phase ordered algorithms and the O(√n m log(n²/m)/ log n) bound for the non-bipartite cardinality matching problem, submitted to Networks.
[GoKa96]
A. Goldberg and A. V. Karzanov, Path problems in skew-symmetric graphs, Combinatorica 16 (1996), 353–382.
[KoSt93]
W. Kocay and D. Stone, Balanced network flows, Bull. Inst. Combin. Appl. 7 (1993), 17–32.
[LoPl86]
L. Lovász and M. D. Plummer, Matching theory, North-Holland, Amsterdam, 1986.
[MiVa80]
S. Micali and V. V. Vazirani, An O(√V E) algorithm for finding maximum matching in general graphs, in: Proceedings of the 21st Annual IEEE Symposium on Foundations of Computer Science, 1980, 17–27.
[Shr81]
A. Schrijver, Short proofs on the matching polyhedron, J. Combin. Theory Ser. B 34 (1983), 104–108.
[Tut52]
W. T. Tutte, The factors of graphs, Canad. J. Math. 4 (1952), 314–328.
[Tut67]
W. T. Tutte, Antisymmetrical digraphs, Canad. J. Math. 19 (1967), 1101–1117.
C. Fremuth-Paeger, D. Jungnickel
Lehrstuhl für Diskrete Mathematik, Optimierung und Operations Research
Universität Augsburg
86135 Augsburg, Germany
On the λ-design conjecture for v = 5p + 1 points

Derek W. Hein and Yury J. Ionin
Abstract. A λ-design on v points is a family of v subsets (blocks) of a v-set such that the cardinality of the intersection of any two distinct blocks is λ. Ryser and Woodall conjectured that every λ-design can be obtained by fixing a block of a symmetric design and replacing every other block by its symmetric difference with the fixed block. We prove this conjecture for λ-designs with replication numbers r and r ∗ such that (r − 1, r ∗ − 1) = 5, and, as a consequence, we prove the conjecture for v = 5p + 1, where p is a prime not congruent to 2 or 8 mod 15. 2000 Mathematics Subject Classification: primary 05B05; secondary 05B30.
1. Introduction

Let v and λ be positive integers. If X is a set of cardinality v and B is a family of subsets of X (blocks) such that |A ∩ B| = λ for any distinct A, B ∈ B, then |B| ≤ v [9]. The extremal case |B| = v can be realized if (X, B) is a symmetric (v, k, λ)-design, i.e., all elements of B are of the same cardinality k. If |B| = v but (X, B) is not a symmetric design, then (X, B) is called a λ-design on v points. The initial study of λ-designs was undertaken by H. J. Ryser [10] and D. R. Woodall [17]. They showed that for any λ-design (X, B) on v points there are integers r and r∗ (the replication numbers) such that r > 1, r∗ > 1, r + r∗ = v + 1, and every x ∈ X is contained in either r or r∗ blocks. If (X, A) is a symmetric (v, k, k − λ)-design with k ≠ 2λ, then fixing a block A ∈ A and replacing every other block B ∈ A by the symmetric difference A Δ B yields a λ-design with replication numbers k and v − k + 1. A λ-design that can be obtained in this way is called type-1. The λ-design conjecture due to H. J. Ryser and D. R. Woodall states that every λ-design is type-1.

This conjecture has been proven for λ = 1 (N. G. de Bruijn and P. Erdős [3]), λ = 2 (H. J. Ryser [10]), λ = 3 (W. G. Bridges and E. S. Kramer [2]), λ = 4 (W. G. Bridges [1]), 5 ≤ λ ≤ 9 (E. S. Kramer [7, 8]), λ = 10 (Á. Seress [12]), and any prime λ (N. M. Singhi and S. S. Shrikhande [15]). In [11], Á. Seress proved the λ-design conjecture for λ = 2p, where p is a prime. In his Ph.D. dissertation [16], I. Weisz proved the conjecture for λ ≤ 34.

Let g be the greatest common divisor of r − 1 and r∗ − 1, where r and r∗ are the replication numbers of a λ-design. In the papers [5] and [6], Y. J. Ionin and M. S. Shrikhande proved that every λ-design with g ≤ 4 is type-1, and, as a consequence, the λ-design conjecture is true for all λ-designs on p + 1, 2p + 1, 3p + 1, and 4p + 1 points, with p a prime. We also note that N. C. Fiala [4] has proved the λ-design conjecture for g = 6, and then for all λ-designs on 6p + 1 points (where p is a prime) using this technique. In this paper, we apply the technique developed by Y. J. Ionin and M. S. Shrikhande and prove the λ-design conjecture for g = 5 and then for all λ-designs on 5p + 1 points, where p is a prime not congruent to 2 or 8 mod 15.
2. Preliminaries

We begin with the definition of a λ-design.

Definition 2.1. Let v and λ be positive integers. A λ-design ℰ on v points is a pair (X, ℬ), where X is a set of elements called points and ℬ is a collection of subsets of X called blocks such that
(1) |X| = |ℬ| = v;
(2) |A ∩ B| = λ for any distinct blocks A and B in ℬ;
(3) |B| > λ for any block B in ℬ;
(4) there are blocks A and B in ℬ such that |A| ≠ |B|.

All known λ-designs can be obtained by the construction proposed by H. J. Ryser [10] and D. R. Woodall [17] and described in the following proposition. The proof of the proposition is straightforward.

Proposition 2.2. Let λ be a nonnegative integer and 𝒜 a family of subsets of a finite set X such that |B ∩ C| = λ for any distinct B, C ∈ 𝒜. Fix A ∈ 𝒜 and define ℬ = {A} ∪ {A Δ B : B ∈ 𝒜, B ≠ A}. Then |S ∩ T| = |A| − λ for any distinct S, T ∈ ℬ.

We will say that (X, ℬ) is obtained by the Ryser–Woodall complementation of (X, 𝒜) with respect to A ∈ 𝒜. If (X, 𝒜) is a symmetric (v, k, k − λ)-design with k ≠ 2λ, then (X, ℬ) is a λ-design on v points.

Definition 2.3. A λ-design is said to be type-1 if it can be obtained by a Ryser–Woodall complementation of a symmetric design.

We now quote several results on λ-designs that will be essential in the sequel. The following two theorems were proved in H. J. Ryser [10, Theorem 1.1] and D. R. Woodall [17, Theorem 2]:
On the λ-design conjecture for v = 5p + 1 points
Theorem 2.4. In any λ-design ℰ on v points there exist distinct integers r and r* (with r > 1 and r* > 1) such that any point of ℰ occurs in either r or r* blocks and r + r* = v + 1.

Theorem 2.5. Let ℰ = (X, ℬ) be a λ-design on v points with replication numbers r and r*. Then

\[ \frac{1}{\lambda} + \sum_{B \in \mathcal{B}} \frac{1}{|B| - \lambda} = \frac{(v - 1)^2}{(r - 1)(r^* - 1)}. \tag{2.1} \]
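Proposition 2.2 and Theorems 2.4 and 2.5 can be checked on a small instance. The following Python sketch is only an illustration, not part of the paper: it assumes the projective plane of order 3, realized as the translates of the perfect difference set {0, 1, 3, 9} modulo 13, as the symmetric (13, 4, 1)-design, and all variable names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

v = 13
base = {0, 1, 3, 9}  # assumed example: a perfect difference set modulo 13

# Blocks of the symmetric (13, 4, 1)-design: the 13 translates of the base block.
symmetric = [frozenset((x + i) % v for x in base) for i in range(v)]

# Ryser-Woodall complementation: keep the fixed block A and replace every
# other block B by the symmetric difference A ^ B.
fixed = symmetric[0]
blocks = [fixed] + [fixed ^ b for b in symmetric[1:]]

# Proposition 2.2: all pairwise intersections have the same size |A| - 1.
intersections = {len(s & t) for s, t in combinations(blocks, 2)}
assert len(intersections) == 1
lam = next(iter(intersections))

# Theorem 2.4: every point lies in r or r* blocks, and r + r* = v + 1.
replication = sorted({sum(x in b for b in blocks) for x in range(v)})
r, r_star = replication

# Theorem 2.5: equation (2.1), evaluated with exact rational arithmetic.
lhs = Fraction(1, lam) + sum(Fraction(1, len(b) - lam) for b in blocks)
rhs = Fraction((v - 1) ** 2, (r - 1) * (r_star - 1))

print(lam, replication, r + r_star == v + 1, lhs == rhs)
```

In this example the fixed block keeps its 4 points and the other twelve blocks have 6 points each, so the result is a λ-design with λ = 3 rather than a symmetric design.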
D. R. Woodall [18, Theorem 3] stated a theorem that implies the following result (a proof can be found in Á. Seress [13]):

Theorem 2.6. A λ-design on v points with replication numbers r and r* is type-1 if and only if r(r − 1)/(v − 1) or r*(r* − 1)/(v − 1) is a (positive) integer.

In the paper [14, Theorem 4.6], S. S. Shrikhande and N. M. Singhi proved the following result:

Theorem 2.7. Let ℰ = (X, ℬ) be a λ-design with replication numbers r and r*. Let g be the greatest common divisor of r − 1 and r* − 1. If (r − r*)/g and λ are relatively prime, then ℰ is type-1.

In the sequel, we will make use of the fact that the λ-design conjecture is proved for λ ≤ 11 ([1, 2, 3, 7, 8, 10, 12, 15]).

Theorem 2.8. Every λ-design with λ ≤ 11 is type-1.

From now on, let ℰ = (X, ℬ) be a λ-design on v points with replication numbers r and r*. Let E (respectively E*) be the set of points x ∈ X with replication number r (respectively r*). Let e = |E| and e* = |E*|, so that e + e* = v. For every A ∈ ℬ, let τ_A = |A ∩ E| and τ_A* = |A ∩ E*|, so that τ_A + τ_A* = |A|. Let g = (r − 1, r* − 1). Since (r − 1) + (r* − 1) = v − 1, we also have g = (r − 1, v − 1) = (r* − 1, v − 1). We write v = gq + 1, where q is a positive integer.

We now quote several useful relations that were derived in [5, 6]. By fixing a block B ∈ ℬ and counting pairs (x, A) with A ∈ ℬ, A ≠ B, and x ∈ A ∩ B, one obtains that τ_B(r − 1) + τ_B*(r* − 1) = (v − 1)λ, which can be rewritten as

\[ (r - 1)(|B| - 2\tau_B) = (v - 1)(|B| - \tau_B - \lambda). \tag{2.2} \]
This implies |B| ≡ 2τ_B (mod q), and we write

\[ |B| - 2\tau_B = \sigma_B q. \tag{2.3} \]

The “dual” equation |B| − 2τ_B* = σ_B* q implies σ_B* = −σ_B. From equations (2.2) and (2.3) we obtain that

\[ \tau_B = \lambda - \frac{r^* - 1}{g}\,\sigma_B. \tag{2.4} \]

The dual equation is τ_B* = λ − ((r − 1)/g)·σ_B* = λ + ((r − 1)/g)·σ_B, so that

\[ |B| = 2\lambda + \frac{r - r^*}{g}\,\sigma_B. \tag{2.5} \]
By counting in two ways and by summing both sides of equation (2.3) over all blocks B we get, writing s for the sum of σ_B over all blocks B ∈ ℬ, that

\[ sq = gq(gq - e - r + 3) - (2e + r - 2). \tag{2.6} \]

Therefore, 2e + r − 2 ≡ 0 (mod q). We define m and m* by

\[ 2e + r - 2 = mq \tag{2.7} \]

and

\[ 2e^* + r^* - 2 = m^* q, \tag{2.8} \]

respectively. By adding equations (2.7) and (2.8), we obtain that

\[ m + m^* = 3g. \tag{2.9} \]

Substituting equation (2.7) into equation (2.6) and simplifying using equation (2.9) we obtain that

\[ s = g^2 q - g(e + r) + m^*. \tag{2.10} \]

We also obtain by routine manipulations that (2r − v − 1)(2m − 3g) = g(4λ − 1 − v), which can be rewritten as

\[ (r - r^*)(m - m^*) = g(4\lambda - 1 - v). \tag{2.11} \]
We note from equation (2.11) that v ≠ 4λ − 1 implies that r − r* = g(4λ − 1 − v)/(m − m*), so that

\[ r = \frac{(2g - m)(gq + 2) - 2\lambda g}{3g - 2m}. \tag{2.12} \]

Also, from equation (2.7) we have that r = mq − 2e + 2. Substituting this expression into equation (2.12) we obtain mq − 2e + 2 = ((2g − m)(gq + 2) − 2λg)/(3g − 2m), so that

\[ e = \frac{\lambda g - q(g - m)^2 + g - m}{3g - 2m}. \tag{2.13} \]

Equations (2.12) and (2.13) also imply the “dual” equations

\[ r^* = \frac{(2g - m^*)(gq + 2) - 2\lambda g}{3g - 2m^*}, \qquad e^* = \frac{\lambda g - q(g - m^*)^2 + g - m^*}{3g - 2m^*}. \tag{2.14} \]
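The relations (2.3)–(2.5) and (2.7)–(2.13) can be sanity-checked numerically. The sketch below is an illustration under stated assumptions, not the authors' computation: it uses the type-1 λ-design obtained from the projective plane of order 3 (translates of the assumed difference set {0, 1, 3, 9} mod 13), for which v ≠ 4λ − 1, and it takes s to be the sum of the σ_B over all blocks.

```python
from fractions import Fraction
from math import gcd

v, lam = 13, 3
base = {0, 1, 3, 9}  # assumed example: perfect difference set modulo 13
sym = [frozenset((x + i) % v for x in base) for i in range(v)]
blocks = [sym[0]] + [sym[0] ^ b for b in sym[1:]]  # Ryser-Woodall complementation

count = {x: sum(x in b for b in blocks) for x in range(v)}
r, r_star = sorted(set(count.values()))       # labelling choice: r is the smaller one
E = {x for x in range(v) if count[x] == r}    # points with replication number r
e, e_star = len(E), v - len(E)

g = gcd(r - 1, r_star - 1)
q = (v - 1) // g

sigma = []
for b in blocks:
    tau = len(b & E)
    assert (len(b) - 2 * tau) % q == 0                           # |B| = 2 tau_B (mod q)
    s_b = (len(b) - 2 * tau) // q                                # (2.3)
    assert tau == lam - Fraction(r_star - 1, g) * s_b            # (2.4)
    assert len(b) == 2 * lam + Fraction(r - r_star, g) * s_b     # (2.5)
    sigma.append(s_b)

m = Fraction(2 * e + r - 2, q)                # (2.7)
m_star = Fraction(2 * e_star + r_star - 2, q) # (2.8)
s = sum(sigma)

checks = {
    "(2.9)": m + m_star == 3 * g,
    "(2.10)": s == g * g * q - g * (e + r) + m_star,
    "(2.11)": (r - r_star) * (m - m_star) == g * (4 * lam - 1 - v),
    "(2.12)": r == ((2 * g - m) * (g * q + 2) - 2 * lam * g) / (3 * g - 2 * m),
    "(2.13)": e == (lam * g - q * (g - m) ** 2 + g - m) / (3 * g - 2 * m),
}
print(g, q, m, m_star, checks)
```

Exact `Fraction` arithmetic is used so that a failed identity cannot be hidden by rounding.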
The proof of the following lemma is found in Y. J. Ionin and M. S. Shrikhande [6]:

Lemma 2.9. If a λ-design ℰ on v points has a block of cardinality v − 1, then λ = 1.

The next proposition limits the number of possible block sizes in a λ-design.

Proposition 2.10. For any block B ∈ ℬ, g − m ≤ σ_B ≤ g − 1.

Proof. From equations (2.13) and (2.14), we derive that λ − e = (g − m)(r* − 1)/g. Since τ_B = |B ∩ E| ≤ |E| = e, equation (2.4) now implies that σ_B ≥ g − m. From Lemma 2.9, |B| ≤ gq − 1. Therefore, |B| − 2τ_B ≤ gq − 1, and equation (2.3) implies that σ_B q ≤ gq − 1, so σ_B ≤ g − 1/q, i.e., σ_B ≤ g − 1.

For any block B ∈ ℬ, let ℰ(B) = (X, 𝒜), where 𝒜 is the Ryser–Woodall complementation of ℬ with respect to B. It is immediate that ℰ(B)(B) = ℰ. Therefore, if ℰ(B) is a symmetric design, then ℰ is type-1. If ℰ(B) is not a symmetric design, we will denote the corresponding values of the parameters λ, e, e*, m, m* by λ(B), e(B), e*(B), m(B), and m*(B). The following proposition is straightforward. (A proof can be found in Y. J. Ionin and M. S. Shrikhande [6].)

Proposition 2.11. Let B ∈ ℬ. Then
(1) ℰ(B) is a symmetric design if and only if B = E or B = E*;
(2) if B ≠ E and B ≠ E*, then the replication numbers of ℰ(B) are r and r*, λ(B) = |B| − λ, e(B) = e + qσ_B, and m(B) = m + 2σ_B;
(3) if ℰ(B) is type-1, then so is ℰ.
3. λ-designs with g = 5

Let ℰ = (X, ℬ) be a λ-design with replication numbers r and r* and g = (r − 1, r* − 1) = 5. For every integer i, let a_i denote the number of blocks B ∈ ℬ with σ_B = i. Then

\[ \sum_{i=5-m}^{4} a_i = 5q + 1, \tag{3.15} \]

and, from equation (2.10),

\[ \sum_{i=5-m}^{4} i\,a_i = 25q - 5(e + r) + 15 - m. \tag{3.16} \]

We also rewrite equation (2.1) as

\[ \sum_{i=5-m}^{4} \frac{5a_i}{5\lambda + i(r - r^*)} - \frac{25q^2}{(r - 1)(r^* - 1)} + \frac{1}{\lambda} = 0. \tag{3.17} \]
Theorem 3.1. Every λ-design with replication numbers r and r* such that (r − 1, r* − 1) = 5 is type-1.

Proof. Let ℰ = (X, ℬ) be a λ-design with replication numbers r and r* and (r − 1, r* − 1) = 5. Then equation (2.9) implies that m + m* = 15. Without loss of generality, we assume that m ≤ m*, i.e., 1 ≤ m ≤ 7.

Case m = 1. Proposition 2.10 implies that σ_B = 4 for all blocks B. This implies that |B| is constant for all blocks B, which contradicts the definition of a λ-design.

Case m = 2. Proposition 2.10 implies that σ_B ∈ {3, 4} for all blocks B. Equations (3.15) and (3.16) read a_3 + a_4 = 5q + 1 and 3a_3 + 4a_4 = (120q + 25λ + 48)/11. Thus, a_3 = (100q − 25λ − 4)/11 and a_4 = (25λ − 45q + 15)/11. Substituting these values into equation (3.17) and factoring, we obtain

\[ \frac{-(5q - 4\lambda + 2)^2\,(9q - 5\lambda - 3)\,\bigl(4q(5\lambda - 8) - 5\lambda^2 + 5\lambda - 4\bigr)}{(3q + 2\lambda - 1)(8q - 2\lambda + 1)(15q - \lambda + 6)(20q - 5\lambda + 8)\,\lambda} = 0. \]

If 5q − 4λ + 2 = 0, then |B| = 2λ for all blocks B. If 9q − 5λ − 3 = 0, then e = 0. Hence, 4q(5λ − 8) − 5λ² + 5λ − 4 = 0, which can be rewritten as (20q − 5λ − 3)(5λ − 8) = 44. Thus, 5λ − 8 divides 44. In particular, 5λ − 8 ≤ 44, which implies that λ ≤ 10. Hence, in this case, we are done by Theorem 2.8.
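The divisor argument that closes Case m = 2 can be replayed mechanically. The following Python sketch is an illustration only (the function name `residual` is hypothetical): it checks the polynomial identity behind the rewriting and then lists the integer pairs (q, λ) permitted by the condition that 5λ − 8 divides 44.

```python
def residual(q, lam):
    # Third factor of the numerator in Case m = 2.
    return 4 * q * (5 * lam - 8) - 5 * lam**2 + 5 * lam - 4

# Identity behind the rewriting:
# (20q - 5*lam - 3)(5*lam - 8) = 5 * residual(q, lam) + 44,
# so residual = 0 is equivalent to the product being equal to 44.
identity_holds = all(
    (20 * q - 5 * lam - 3) * (5 * lam - 8) == 5 * residual(q, lam) + 44
    for q in range(-20, 21)
    for lam in range(-20, 21)
)

# Enumerate lam with (5*lam - 8) dividing 44; the bound 5*lam - 8 <= 44
# gives lam <= 10, exactly as in the text.
solutions = []
for lam in range(1, 11):
    d = 5 * lam - 8
    if 44 % d == 0:
        cofactor = 44 // d                    # value of 20q - 5*lam - 3
        if (cofactor + 5 * lam + 3) % 20 == 0:
            solutions.append(((cofactor + 5 * lam + 3) // 20, lam))

print(identity_holds, solutions)
```

In this sketch the list `solutions` comes out empty, which is consistent with, and slightly stronger than, the bound λ ≤ 10 that the proof actually uses.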
Case m = 3. Proposition 2.10 implies that σ_B ∈ {2, 3, 4} for all blocks B. If σ_B = 4 for some block B, then 0 ≤ τ_B implies that (2q + 2λ − 1) · 4/9 ≤ λ from equation (2.4). This in turn implies that 8q − 4 ≤ λ. On the other hand, Lemma 2.9 implies that λ < v − 2 = 5q − 1, a contradiction.

Thus we have that σ_B ∈ {2, 3} for all blocks B. So, a_2 + a_3 = 5q + 1 and 2a_2 + 3a_3 = (70q + 25λ + 28)/9. These equations imply that a_2 = (65q − 25λ − 1)/9 and a_3 = (25λ − 20q + 10)/9. Substituting these values into equation (3.17) and factoring, we obtain

\[ \frac{-(5q - 4\lambda + 2)^2\,(4q - 5\lambda - 2)\,\bigl(q(13\lambda - 21) - 5\lambda^2 + 4\lambda - 3\bigr)}{3(2q + 2\lambda - 1)(5q - \lambda + 2)(7q - 2\lambda + 1)(10q + \lambda + 4)\,\lambda} = 0. \]

If 5q − 4λ + 2 = 0, then |B| = 2λ for all blocks B. If 4q − 5λ − 2 = 0, then e = 0. Hence, q(13λ − 21) − 5λ² + 4λ − 3 = 0, which can be rewritten as (13q − 5λ − 4)(13λ − 21) = λ + 123. Thus, 13λ − 21 divides λ + 123. If 13λ − 21 = λ + 123, then λ = 12. We then obtain that q = 5 and r = 23/3. Therefore, 2(13λ − 21) ≤ λ + 123, so λ ≤ 6. Hence, we are done by Theorem 2.8.

Case m = 4. Proposition 2.10 implies that σ_B ∈ {1, 2, 3, 4} for all blocks B. If there is a block B ∈ ℬ with σ_B = 4, then m*(B) = m* + 2σ_B* = 3. Therefore, ℰ(B) is type-1 or a symmetric design. Then ℰ is type-1. Hence, we may assume that σ_B ∈ {1, 2, 3} for all blocks B ∈ ℬ.

Suppose there is a block B with σ_B = 3. Then τ_B ≥ 0 and equations (2.4) and (2.14) imply λ ≥ 3q − 3. On the other hand, since r ≥ 2, we obtain from equation (2.12) that 10λ ≤ 30q − 2. Therefore, 10λ ≤ 30q − 10, and we have 3q − 3 ≤ λ ≤ 3q − 1. If λ = 3q − 1 or 3q − 2, equation (2.12) yields non-integer values of r. Thus, λ = 3q − 3. Note that if q ≤ 4, then λ ≤ 9, and we apply Theorem 2.8. Therefore, we assume that q ≥ 5. Now, equations (2.12), (2.13) and (2.14) yield r = 6, e = 2q − 2, and r* = 5q − 4, and then equation (2.5) reads |B| = 6q − 6 − (q − 2)σ_B.
Applying this equation to distinct A, B ∈ ℬ yields |A ∪ B| = |A| + |B| − λ = 9q − 9 − (q − 2)(σ_A + σ_B). Since, on the other hand, |A ∪ B| ≤ v = 5q + 1, we obtain that (q − 2)(σ_A + σ_B) ≥ 4q − 10 for any distinct blocks A and B. Since q ≥ 5, 4q − 10 > 3(q − 2), and we have σ_A + σ_B ≥ 4 for any distinct A, B ∈ ℬ. Thus, either a_1 = 1 and a_2 = 0, or a_1 = 0. If a_1 = 1 and a_2 = 0, then a_3 = 5q, and equation (3.16) is not satisfied. Therefore, a_1 = 0. Then equations (3.15) and (3.16) yield a_2 + a_3 = 5q + 1 and 2a_2 + 3a_3 = 15q − 9. Therefore, a_2 = 12 and a_3 = 5q − 11. Plugging the values of a_i, λ, r, and r* into equation (3.17), we obtain an equation in q which has no solution in integers greater than or equal to 5.

Thus, we can now assume that σ_B ∈ {1, 2} for all blocks B. So, a_1 + a_2 = 5q + 1 and a_1 + 2a_2 = (30q + 25λ + 12)/7. These equations imply that a_1 = (40q − 25λ + 2)/7 and a_2 = (25λ − 5q + 5)/7. Substituting these values into equation (3.17) and factoring, we obtain

\[ \frac{-(5q - 4\lambda + 2)^2\,(5\lambda - q + 1)\,\bigl(4q(-2\lambda + 3) + 5\lambda^2 - 3\lambda + 2\bigr)}{(5q + 3\lambda + 2)(\lambda - 10q - 4)(2\lambda - 6q - 1)(q + 2\lambda - 1)\,\lambda} = 0. \]

If 5q − 4λ + 2 = 0, then |A| = 2λ for all blocks A. If 5λ − q + 1 = 0, then e = 0. Hence, 4q(−2λ + 3) + 5λ² − 3λ + 2 = 0, which can be rewritten as (8q − 5λ − 4)(2λ − 3) = λ + 16. Thus, 2λ − 3 divides λ + 16. If 2λ − 3 = λ + 16, then λ = 19 and q = 25/2. Hence, 2(2λ − 3) ≤ λ + 16, which implies that λ ≤ 7. Hence, in this case, we are done by Theorem 2.8.

Case m = 5. Proposition 2.10 implies that σ_B ∈ {0, 1, 2, 3, 4} for all blocks B. If there is a block B with σ_B = 4 or 3, then m*(B) = 2 or 4, respectively. Therefore, we apply one of the previous cases to ℰ(B) and then apply Proposition 2.11. Thus we have that σ_B ∈ {0, 1, 2} for all blocks B.

Subcase a_0 = 0. In this case a_1 + a_2 = 5q + 1 and a_1 + 2a_2 = 25q + 10 − 5(λ + 5q − 2λ + 2) = 5λ. Thus, a_1 = 10q − 5λ + 2 and a_2 = 5λ − 5q − 1. Substituting these values into equation (3.17) yields

\[ \frac{-(5q - 4\lambda + 2)^2\,\bigl(50\lambda q^2 + 5q(-15\lambda^2 + 11\lambda + 2) + 25\lambda^3 - 22\lambda^2 + 7\lambda + 2\bigr)}{(5q + \lambda + 2)(3\lambda - 10q - 4)(2\lambda - 5q - 1)(2\lambda - 1)\,\lambda} = 0. \]

If 5q − 4λ + 2 = 0, then |B| = 2λ for all blocks B. Hence, 50λq² + 5q(−15λ² + 11λ + 2) + 25λ³ − 22λ² + 7λ + 2 = 0. For this quadratic (with respect to q) equation to have a solution in the integers, the discriminant must be a perfect square. So, we define D to be the discriminant; that is, D = 625λ⁴ − 3850λ³ + 125λ² + 700λ + 100. We notice that for λ ≥ 378, (25λ² − 77λ − 117)² < D < (25λ² − 77λ − 116)². A computer check of λ such that 2 ≤ λ ≤ 377 gives us no values of λ which yield an integral value of √D.

Subcase a_0 = 1. In this case a_1 + a_2 = 5q and a_1 + 2a_2 = 5λ. Thus, a_0 = 1, a_1 = 5(2q − λ) and a_2 = 5(λ − q). Substituting these values into equation (3.17) yields

\[ \frac{-(5q - 4\lambda + 2)^2\,\bigl(50\lambda q^2 - 5q(15\lambda^2 - 7\lambda - 4) + 25\lambda^3 - 14\lambda^2 - \lambda + 4\bigr)}{(5q + \lambda + 2)(3\lambda - 10q - 4)(2\lambda - 5q - 1)(2\lambda - 1)\,\lambda} = 0. \]

If 5q − 4λ + 2 = 0, then |B| = 2λ for all blocks B. Hence, 50λq² − 5q(15λ² − 7λ − 4) + 25λ³ − 14λ² − λ + 4 = 0. We define D to be the discriminant; that is, D = 625λ⁴ − 2450λ³ − 1575λ² + 600λ + 400. We notice that for λ ≥ 303, (25λ² − 49λ − 80)² < D < (25λ² − 49λ − 79)². A computer check of λ such that 2 ≤ λ ≤ 302 gives us that only λ ∈ {5, 8} yield an integral value of √D. Since these values do not exceed 8, we are done by Theorem 2.8.
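The computer check described for Subcase a_0 = 1 can be sketched in Python (the paper reports using Maple; this is not the authors' code, and the helper names are hypothetical). The sketch confirms that D is the discriminant of the quadratic in q, tests the squeeze between consecutive squares on a sample range, and lists the values of λ in the finite range for which D is a perfect square.

```python
from math import isqrt

def is_square(n):
    # Exact integer test; negative numbers are never perfect squares.
    return n >= 0 and isqrt(n) ** 2 == n

def disc(lam):
    # D from Subcase a_0 = 1 of Case m = 5.
    return 625 * lam**4 - 2450 * lam**3 - 1575 * lam**2 + 600 * lam + 400

def quadratic_discriminant(lam):
    # b^2 - 4ac for 50*lam*q^2 - 5(15*lam^2 - 7*lam - 4)q + (25*lam^3 - 14*lam^2 - lam + 4).
    a = 50 * lam
    b = -5 * (15 * lam**2 - 7 * lam - 4)
    c = 25 * lam**3 - 14 * lam**2 - lam + 4
    return b * b - 4 * a * c

# D agrees with the discriminant of the quadratic (checked on a sample range).
same_polynomial = all(quadratic_discriminant(l) == disc(l) for l in range(1, 200))

# For lam >= 303, D lies strictly between two consecutive squares
# (checked here only for 303 <= lam < 5000).
def squeezed(lam):
    low = 25 * lam**2 - 49 * lam - 80
    return low**2 < disc(lam) < (low + 1) ** 2

squeeze_ok = all(squeezed(l) for l in range(303, 5000))

# The finite search over 2 <= lam <= 302.
square_values = [l for l in range(2, 303) if is_square(disc(l))]
print(same_polynomial, squeeze_ok, square_values)
```

The same three steps apply to the other discriminants in this case after replacing `disc` and the bounding polynomials.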
Subcase a_2 = 0. In this case a_0 + a_1 = 5q + 1 and a_1 = 5λ. Thus, a_0 = 5q − 5λ + 1 and a_1 = 5λ. Substituting these values into equation (3.17) yields

\[ \frac{(5q - 4\lambda + 2)^2\,\bigl(5q(\lambda - 1) - 5\lambda^2 + 2\lambda - 1\bigr)}{\lambda\,(5q + \lambda + 2)(5q - 2\lambda + 1)(2\lambda - 1)} = 0. \]

If 5q − 4λ + 2 = 0, then |B| = 2λ for all blocks B. Hence, 5q(λ − 1) − 5λ² + 2λ − 1 = 0, which can be rewritten as (5q − 5λ − 3)(λ − 1) = 4. Thus, λ − 1 divides 4. This implies that λ ≤ 5, and we are done by Theorem 2.8.

Subcase a_2 = 1. In this case a_0 + a_1 = 5q and a_1 = 5λ − 2. Thus, a_0 = 5q − 5λ + 2, a_1 = 5λ − 2 and a_2 = 1. Substituting these values into equation (3.17) yields

\[ \frac{(5q - 4\lambda + 2)^2\,\bigl(50q^2(\lambda - 1) - 5q(13\lambda^2 - 15\lambda + 8) + 15\lambda^3 - 34\lambda^2 + 19\lambda - 6\bigr)}{(5q + \lambda + 2)(3\lambda - 10q - 4)(2\lambda - 5q - 1)(2\lambda - 1)\,\lambda} = 0. \]

If 5q − 4λ + 2 = 0, then |B| = 2λ for all blocks B. Hence, 50q²(λ − 1) − 5q(13λ² − 15λ + 8) + 15λ³ − 34λ² + 19λ − 6 = 0. The discriminant of this quadratic (in q) equation is equal to 25(49λ⁴ + 2λ³ + 9λ² − 40λ + 16). We define D to be 49(49λ⁴ + 2λ³ + 9λ² − 40λ + 16). We notice that for λ ≥ 41, (49λ² + λ + 4)² < D < (49λ² + λ + 5)². A computer check of λ such that 2 ≤ λ ≤ 40 gives us no value of λ that yields an integral value of √D.

Subcase a_0 ≥ 2 and a_2 ≥ 2. By assumption, there exist two blocks B_1 and B_2 such that σ_{B_1} = σ_{B_2} = 0 and two blocks C_1 and C_2 such that σ_{C_1} = σ_{C_2} = 2. For blocks B_1 and B_2, we have that τ_{B_1} = τ_{B_2} = e = λ. This means that the two blocks B_1 and B_2 must intersect entirely in E, and are disjoint in E*. Also, |B_1| = |B_2| = 2λ. Denote E* ∩ B_1 by E_1* and E* ∩ B_2 by E_2*. Note again that E_1* ∩ E_2* = ∅, and that |E_1*| = |E_2*| = λ. For blocks C_1 and C_2, we have that τ_{C_1} = τ_{C_2} = (λ + 2)/5. Since |C_1 ∩ B_1| = λ, we have |C_1 ∩ E_1*| = λ − (λ + 2)/5 = (4λ − 2)/5. Similarly, |C_2 ∩ B_1| = λ and |C_2 ∩ E_1*| = (4λ − 2)/5. Thus, |C_1 ∩ C_2 ∩ E_1*| ≥ |C_1 ∩ E_1*| + |C_2 ∩ E_1*| − |E_1*| = (3λ − 4)/5. By similar reasons, |C_1 ∩ C_2 ∩ E_2*| ≥ (3λ − 4)/5. Now, |C_1 ∩ C_2| = λ.
On the other hand, |C_1 ∩ C_2| ≥ |C_1 ∩ C_2 ∩ E_1*| + |C_1 ∩ C_2 ∩ E_2*| ≥ (6λ − 8)/5. Therefore, λ ≤ 8, and we are done by Theorem 2.8.

Case m = 6. Proposition 2.10 implies that σ_B ∈ {−1, 0, 1, 2, 3, 4} for all blocks B. If there is a block B with σ_B ≥ 2, then m*(B) ≤ 5. By the previous cases, ℰ(B) is type-1 (or a symmetric design), and we apply Proposition 2.11. If there is a block B with σ_B = −1, then m(B) = 4, and we again reduce the proof to a resolved case.

Thus we have that σ_B ∈ {0, 1} for all blocks B. So, a_0 + a_1 = 5q + 1 and a_1 = (25λ − 20q − 8)/3. Then a_0 = (35q − 25λ + 11)/3 and a_1 = (25λ − 20q − 8)/3. Substituting these values into equation (3.17) yields

\[ \frac{(5q - 4\lambda + 2)^2\,\bigl(28q^2 - 5q(11\lambda - 7) + 25\lambda^2 - 22\lambda + 7\bigr)}{3\lambda\,(5q - \lambda + 2)(4q - 2\lambda + 1)(q - 2\lambda + 1)} = 0. \]

If 5q − 4λ + 2 = 0, then |B| = 2λ for all blocks B. Hence, 28q² − 5q(11λ − 7) + 25λ² − 22λ + 7 = 0. The discriminant of this equation is equal to 9(25λ² − 154λ + 49). We define D to be 25λ² − 154λ + 49. Notice that for λ ≥ 35, (5λ − 16)² < D < (5λ − 15)². A computer check of λ such that 2 ≤ λ ≤ 34 gives us that only λ ∈ {6, 7, 15} yield an integral value of √D. If λ = 15, then q = 11, implying that r = 26 and r* = 31. Since ((r − r*)/g, λ) = 1, we are done by Theorem 2.7. Otherwise, λ ≤ 7, and we are done by Theorem 2.8.

Case m = 7. By Proposition 2.10, σ_B ∈ {−2, −1, 0, 1, 2, 3, 4} for all blocks B. Also τ_B* ≤ e* implies σ_B ≤ 3. If σ_B = 3, 2 or 1, then m*(B) = 2, 4 or 6, respectively; if σ_B = −1 or −2, then m(B) = 5 or 3, respectively. In each of these cases, we reduce the proof to a resolved case. If σ_B = 0 for all blocks B, then all the blocks are of the same cardinality, a contradiction. The proof is now complete.

Remark 3.2. We note that Maple was used extensively in the preceding proof both to factor multivariate polynomials and to check ranges of λ values to determine if relevant discriminants were perfect squares.
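The kind of range check mentioned in Remark 3.2 can be reproduced for Case m = 6 in a few lines. The Python sketch below is an illustration, not the authors' Maple session; the helper names are hypothetical. It searches 2 ≤ λ ≤ 34 for perfect-square values of D = 25λ² − 154λ + 49 and then recovers q, r and r* for λ = 15 from the quadratic and from equation (2.12) with g = 5 and m = 6.

```python
from math import gcd, isqrt

def is_square(n):
    return n >= 0 and isqrt(n) ** 2 == n

def disc(lam):
    # D from Case m = 6; the discriminant of the quadratic in q is 9 * D.
    return 25 * lam**2 - 154 * lam + 49

hits = [lam for lam in range(2, 35) if is_square(disc(lam))]

def integer_roots(lam):
    # Integer roots q of 28q^2 - 5(11*lam - 7)q + (25*lam^2 - 22*lam + 7) = 0.
    d = 9 * disc(lam)
    if not is_square(d):
        return []
    root = isqrt(d)
    b = 5 * (11 * lam - 7)
    return sorted({(b + s * root) // 56 for s in (1, -1) if (b + s * root) % 56 == 0})

lam, g, m = 15, 5, 6
(q,) = integer_roots(lam)
# Equation (2.12): r = ((2g - m)(gq + 2) - 2*lam*g) / (3g - 2m).
r = ((2 * g - m) * (g * q + 2) - 2 * lam * g) // (3 * g - 2 * m)
r_star = g * q + 2 - r                      # r + r* = v + 1 = gq + 2
coprime = gcd(abs(r - r_star) // g, lam) == 1

print(hits, q, r, r_star, coprime)          # prints [6, 7, 15] 11 26 31 True
```

For λ ≥ 35 no search is needed, since (5λ − 16)² < D < (5λ − 15)² places D strictly between consecutive squares.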
4. The λ-design conjecture for v = 5p + 1

We now examine the consequence of Theorem 3.1.

Theorem 4.1. Let p be a prime, p ≢ 2 or 8 (mod 15). Every λ-design on 5p + 1 points is type-1.

Proof. Let ℰ = (X, ℬ) be a λ-design on v = 5p + 1 points with replication numbers r and r*. Let g = (r − 1, r* − 1). Then g divides v − 1 = 5p, so g ∈ {1, 5, p, 5p}. If g = 1, then ℰ is type-1 by Y. J. Ionin and M. S. Shrikhande [6]. If g = 5, then ℰ is type-1 by Theorem 3.1. If g = 5p, then r ≥ 5p + 1, r* ≥ 5p + 1, and r + r* > v + 1, a contradiction.

Suppose g = p. Without loss of generality, we assume that r > r*. Since r* > 1, we obtain that either (i) r = 3p + 1 and r* = 2p + 1 or (ii) r = 4p + 1 and r* = p + 1.

In case (i), (r − r*)/g = 1, and Theorem 2.7 implies that ℰ is type-1. The same theorem works in case (ii), where (r − r*)/g = 3, if we assume that λ ≢ 0 (mod 3). If, in case (ii), p ≡ ±1 (mod 5), then either r(r − 1)/(v − 1) = 4(4p + 1)/5 or r*(r* − 1)/(v − 1) = (p + 1)/5 is an integer, and ℰ is type-1 by Theorem 2.6.

Suppose now that there are λ-designs on 5p + 1 points that are not type-1. Then p ≢ ±1 (mod 5). For any such design, we have r = 4p + 1, r* = p + 1, and λ ≡ 0 (mod 3). Let ℰ be such a design (for a fixed p) with the smallest value of λ. By Proposition 2.11, λ(B) = |B| − λ for B ∈ ℬ. By the choice of λ, we then have |B| ≥ 2λ for all B ∈ ℬ. By Theorem 2.8, we also have λ ≥ 12, so |B| ≥ 24 for all B ∈ ℬ. If p ≡ 0 (mod 5), i.e., p = 5, then v = 26 and Lemma 2.9 implies that all blocks B are of cardinality 24, a contradiction. Thus, p ≡ ±2 (mod 5). Since r = 4p + 1 and q = (v − 1)/g = 5, equation (2.12) simplifies to (4p + 1)(3p − 2m) = (2p − m)(5p + 2) − 2λp. Taking both sides modulo 3, we obtain that p ≡ 0 (mod 3) or p ≡ 2 (mod 3). Since p ≢ 2 or 8 (mod 15), the latter possibility is ruled out, i.e., p = 3. But then v = 16, while |B| ≥ 24 for all blocks B, a contradiction. The proof is now complete.

Acknowledgment

The authors wish to thank N. C. Fiala for pointing out an error in the original proof of the Case m = 4 of Theorem 3.1.
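The residue bookkeeping at the end of the proof of Theorem 4.1 can be confirmed by brute force. The Python sketch below is an illustration only (the function name is hypothetical): it lists the residues modulo 15 that combine p ≡ ±2 (mod 5) with p ≡ 2 (mod 3), and it determines which residues of p modulo 3 are compatible with equation (2.12) when g = p, q = 5, r = 4p + 1 and λ ≡ 0 (mod 3).

```python
# Residues modulo 15 with p = +-2 (mod 5) and p = 2 (mod 3).
excluded = sorted(p for p in range(15) if p % 5 in (2, 3) and p % 3 == 2)

def congruence_holds(p, m, lam):
    # (4p + 1)(3p - 2m) = (2p - m)(5p + 2) - 2*lam*p, compared modulo 3.
    lhs = (4 * p + 1) * (3 * p - 2 * m)
    rhs = (2 * p - m) * (5 * p + 2) - 2 * lam * p
    return (lhs - rhs) % 3 == 0

# Residues of p modulo 3 for which the congruence can hold when 3 divides lam;
# the outcome does not depend on m, so any range of m values will do.
compatible = sorted({p % 3 for p in range(30) for m in range(30)
                     if congruence_holds(p, m, lam=3)})

print(excluded, compatible)   # prints [2, 8] [0, 2]
```

The first list explains the hypothesis of the theorem: 2 and 8 are exactly the residues modulo 15 for which the final step of the argument does not go through.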
References

[1] W. G. Bridges, Some results on λ-designs, J. Combin. Theory 8 (1970), 350–360.
[2] W. G. Bridges and E. S. Kramer, The determination of all λ-designs with λ = 3, J. Combin. Theory 8 (1970), 343–349.
[3] N. G. DeBruijn and P. Erdős, On a combinatorial problem, Indag. Math. 10 (1948), 421–423.
[4] N. C. Fiala, λ-designs on 6p + 1 points, in: Codes and Designs (K. T. Arasu and Á. Seress, eds.), Ohio State Univ. Math. Res. Inst. Publ. 10, Walter de Gruyter, Berlin 2002, 109–124.
[5] Y. J. Ionin and M. S. Shrikhande, λ-designs on 4p + 1 points, J. Combin. Math. Combin. Comput. 22 (1996), 135–142.
[6] Y. J. Ionin and M. S. Shrikhande, On the λ-design conjecture, J. Combin. Theory Ser. A 74 (1996), 100–114.
[7] E. S. Kramer, On λ-designs, Ph.D. dissertation, University of Michigan, 1969.
[8] E. S. Kramer, On λ-designs, J. Combin. Theory Ser. A 16 (1974), 57–75.
[9] K. N. Majumdar, On some theorems in combinatorics related to incomplete block designs, Ann. Math. Statist. 24 (1953), 377–389.
[10] H. J. Ryser, An extension of a theorem of de Bruijn and Erdős on combinatorial designs, J. Algebra 10 (1968), 246–261.
[11] Á. Seress, All lambda-designs with λ = 2p are type-1, Des. Codes Cryptogr. 22 (2001), 5–17.
[12] Á. Seress, On λ-designs with λ = 2p, in: Coding Theory and Design Theory, Part II, Design Theory (D. K. Ray-Chaudhuri, ed.), IMA Vol. Math. Appl. 21, Springer-Verlag, New York 1990, 290–303.
[13] Á. Seress, Some characterizations of type-1 λ-designs, J. Combin. Theory Ser. A 52 (1989), 288–300.
[14] S. S. Shrikhande and N. M. Singhi, Some combinatorial problems, in: Combinatorics and Its Applications (K. S. Vijayan and N. M. Singhi, eds.), Indian Statistical Institute, Calcutta 1984, 340–349.
[15] N. M. Singhi and S. S. Shrikhande, On the λ-design conjecture, Util. Math. 9 (1976), 301–318.
[16] I. Weisz, Lambda-designs with small lambda are type-1, Ph.D. dissertation, The Ohio State University, 1995.
[17] D. R. Woodall, Square λ-linked designs, Proc. London Math. Soc. 20 (1970), 669–687.
[18] D. R. Woodall, Square λ-linked designs: A survey, in: Combinatorial Mathematics and Its Applications, Academic Press, New York/London 1971, 349–355.
D. W. Hein
Department of Mathematics and Physics
Oklahoma Panhandle State University
Goodwell, Oklahoma, 73939, U.S.A.

Y. J. Ionin
Department of Mathematics
Central Michigan University
Mt. Pleasant, Michigan, 48858, U.S.A.
On a class of twin balanced incomplete block designs

Hadi Kharaghani and Vladimir D. Tonchev

Abstract. A two-parameter family of 2-(4n², n(2n − 1), m(n − 1)) designs is constructed starting from a certain block matrix with 2n by 2m sub-matrices and a balanced generalized weighing matrix over an appropriate cyclic group. The special case n = m corresponds to a construction of symmetric 2-designs from Hadamard matrices of Bush-type described in [10]. If 2m and 2n are the orders of Hadamard matrices, the construction yields Hadamard matrices of Bush-type. Furthermore, if either 2n − 1 or 2n + 1 is a prime power, the designs can be expanded to infinitely many new designs by using known balanced generalized weighing matrices.

2000 Mathematics Subject Classification: primary 05B05; secondary 05B20, 51E15.
1. Introduction

A 2-(v, k, λ) design is a set X of v points together with a collection ℬ of k-subsets of X called blocks such that every point appears in exactly r = λ(v − 1)/(k − 1) blocks and every two points are contained in exactly λ blocks. A 2-(v, k, λ) design can be described in terms of its incidence matrix, being a v by b = vr/k (0, 1)-matrix with constant row sum equal to r, constant column sum equal to k, and constant scalar product of pairs of rows equal to λ. For more on designs see Beth, Jungnickel and Lenz [1].

A Hadamard matrix of order n is a square n by n matrix with entries ±1 whose rows are pairwise orthogonal. A Hadamard matrix H of order 4n² is regular if every row and column of H contains a constant number (2n² − n or 2n² + n) of +1's. Alternatively, replacing the −1's with zeros in H yields the incidence matrix of a symmetric 2-(4n², 2n² − n, n² − n) or 2-(4n², 2n² + n, n² + n) design.

A Bush-type Hadamard matrix [2] is a regular Hadamard matrix of order 4n² with the additional property of being a block matrix H = [H_ij], with blocks of size 2n such that H_ii = J_2n and H_ij J_2n = J_2n H_ij = 0, i ≠ j, 1 ≤ i ≤ 2n, 1 ≤ j ≤ 2n, where J_2n is the all-one 2n by 2n matrix. Bush [2] showed that the existence of a projective plane of order 2n implies the existence of a symmetric Bush-type Hadamard matrix of order 4n². Kharaghani [8] proved that the existence of a Hadamard matrix of order 4n implies the existence of a Bush-type Hadamard matrix of order 16n².
Very little seems to be known about the Bush-type Hadamard matrices of order 4n², n odd, n > 1. The only odd n values for which a Bush-type Hadamard matrix is known to exist are n = 1, n = 3, n = 5 [5] and n = 9 [6], see also [4].

A 2-(4n², n(2n − 1), m(n − 1)) design, n ≤ m, is said to be of K-type if the incidence matrix of the design is a 4n² by 4nm block matrix of block size 2n by 2m with the additional property that all the diagonal blocks are zero 2n by 2m matrices, and all the off-diagonal blocks have exactly m ones in each row and n ones in each column. Note that K-type matrices are generalized Bush-type Hadamard matrices and for n = m a K-matrix is a Bush-type Hadamard matrix.

A balanced generalized weighing matrix BGW(ν, κ, λ) over a multiplicative group G is a ν by ν matrix W = [g_ij] with entries from Ḡ = G ∪ {0} such that each row of W contains exactly κ nonzero entries, and for every a, b ∈ {1, . . . , ν}, a ≠ b, the multi-set

\[ \{\, g_{ai}\, g_{bi}^{-1} : 1 \le i \le \nu,\ g_{ai} \ne 0,\ g_{bi} \ne 0 \,\} \]

contains exactly λ/|G| copies of each element of G.

In a recent work [10, 11], Kharaghani described a construction of symmetric 2-designs from a given Bush-type Hadamard matrix and a BGW(ν, κ, λ) over a suitable cyclic group. The method generates an infinite class of symmetric designs from any given Bush-type Hadamard matrix of order 4n² such that one of the numbers 2n − 1 or 2n + 1 is a prime power, and two infinite classes of symmetric designs if 2n − 1 and 2n + 1 are both prime powers.

In this paper we will show that any given K-type 2-(4n², n(2n − 1), m(n − 1)) design, n ≤ m, gives rise to infinitely many 2-designs if either of 2n − 1 or 2n + 1 is a prime power. The resulting designs are non-symmetric if m ≠ n, and symmetric if m = n. The construction in the case m = n reduces to the one given in [10].

For a (0, ±1)-matrix K, let K = K⁺ − K⁻, where K⁺, K⁻ and K⁺ + K⁻ are (0, 1)-matrices. A (0, ±1)-matrix D is called a twin 2-(v, k, λ) design if both D⁺ and D⁻ are the incidence matrices of a symmetric 2-(v, k, λ) design. A (0, ±1)-matrix S is called a Siamese twin design sharing the entries of I, if S = I + K − L, where I, K, L and K + L are non-zero (0, 1)-matrices and both I + K and I + L are the incidence matrices of a symmetric 2-(v, k, λ) design.
2. 2-(v, k, λ) designs of K-type

We begin with a well known lemma.

Lemma 1. There is a 2-(2k, k, k − 1) design for every k for which there is a Hadamard matrix of order 4k.
Proof. Consider any Hadamard 2-(4k − 1, 2k − 1, k − 1) design related to the given Hadamard matrix of order 4k, and note that the parameters of a 2-(2k, k, k − 1) design are the parameters of a residual design of a 2-(4k − 1, 2k − 1, k − 1) design.

The Kronecker product of two matrices A = [a_ij] and B, denoted A ⊗ B, is defined, as usual, by A ⊗ B = [a_ij B].

Lemma 2. Let M be the ±1-incidence matrix of a 2-(2k, k, k − 1) design and let W be a weighing matrix W(2k, 2k − 1) with zero diagonal. Then D = W ⊗ M is a twin 2-(4k², k(2k − 1), (k − 1)(2k − 1)) design of K-type.

Proof. Let D = D⁺ − D⁻. It is easy to see that D⁺ + D⁻ = P, where P = J_{4k² × 2k(4k−2)} − I_{2k} ⊗ J_{2k × (4k−2)}. Therefore 2D⁺ = W ⊗ M + P. We can see now that D⁺(D⁺)ᵗ = k(2k − 1)I_{4k²} + (2k − 1)(k − 1)J_{4k²}.

Theorem 3. Let 4n and 4m, n ≤ m, be orders of Hadamard matrices. Then there exists a twin 2-(16n², 2n(4n − 1), 2m(2n − 1)) design of K-type.

Proof. Let K and H be two normalized Hadamard matrices of order 4n and 4m respectively. Let r_1, r_2, . . . , r_4n be the row vectors of K and r′_1, r′_2, . . . , r′_4n the top 4n row vectors of H. Let C_i = r_iᵗ r′_i, i = 1, 2, . . . , 4n. It is easy to check that:

1. C_1 = J_{4n×4m}, C_i J_4m = J_4n C_i = 0, for i = 2, . . . , 4n.
2. C_i C_jᵗ = 0, for i ≠ j, 1 ≤ i, j ≤ 4n.
3. \( \sum_{i=1}^{4n} C_i C_i^t = 16nm\, I_{4n} \) (which equals 16n² I_4n when m = n).

Now let H = circ(C_1, C_2, . . . , C_4n) be the block circulant matrix with first row C_1 C_2 . . . C_4n. It is easy to verify that the matrix M = H − I_4n ⊗ J_{4n×4m} is a twin 2-(16n², 2n(4n − 1), 2m(2n − 1)) design of K-type. See [9] for details.

Corollary 4. If 4n is the order of a Hadamard matrix, then there exists a Bush-type Hadamard matrix of order 16n².

Proof. Take n = m in the previous theorem.
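Lemma 2 can be checked directly in its smallest case k = 2. The Python sketch below is only an illustration: it takes as input the weighing matrix W(4, 3) and the ±1-incidence matrix M of a 2-(4, 2, 1) design that the paper lists in Example 11, forms D = W ⊗ M, and tests that D⁺ and D⁻ both satisfy the row-sum, column-sum and pairwise conditions of a 2-(16, 6, 3) design. The helper names `kron` and `is_2_design` are hypothetical.

```python
# W(4, 3) and M as listed in Example 11 (a minus sign there stands for -1).
W = [[0, 1, 1, 1],
     [-1, 0, 1, -1],
     [-1, -1, 0, 1],
     [-1, 1, -1, 0]]
M = [[1, 1, 1, -1, -1, -1],
     [1, -1, -1, 1, 1, -1],
     [-1, 1, -1, 1, -1, 1],
     [-1, -1, 1, -1, 1, 1]]

def kron(a, b):
    # Kronecker product [a_ij * B] of two matrices given as lists of rows.
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def is_2_design(inc, k, r, lam):
    # inc is a points-by-blocks (0, 1)-matrix.
    rows, cols = len(inc), len(inc[0])
    col_ok = all(sum(inc[i][j] for i in range(rows)) == k for j in range(cols))
    row_ok = all(sum(row) == r for row in inc)
    pair_ok = all(sum(x * y for x, y in zip(inc[i], inc[j])) == lam
                  for i in range(rows) for j in range(i + 1, rows))
    return col_ok and row_ok and pair_ok

D = kron(W, M)                                              # a 16 x 24 (0, +-1)-matrix
D_plus = [[1 if x == 1 else 0 for x in row] for row in D]   # positions of +1
D_minus = [[1 if x == -1 else 0 for x in row] for row in D] # positions of -1

# r = lam * (v - 1) / (k - 1) = 3 * 15 / 5 = 9 for a 2-(16, 6, 3) design.
result = (len(D), len(D[0]),
          is_2_design(D_plus, 6, 9, 3), is_2_design(D_minus, 6, 9, 3))
print(result)   # prints (16, 24, True, True)
```

Here the design has 16 points and 24 blocks, so it is not symmetric; the check uses only the defining counting conditions.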
3. The twin 2-designs

Let SP_2k be the set of all signed permutation matrices of order 2k. Let U be the circulant matrix of order 2k with first row (0 1 0 . . . 0) and N be the diagonal matrix of order 2k with −1 at the (1, 1)-position and 1 elsewhere on the diagonal. Then the matrix E = UN is a signed permutation matrix from SP_2k. Let G_4k = {Eⁱ ⊗ I_{4k−2} : i = 1, 2, . . . , 4k}. Then G_4k is a cyclic subgroup of SP_{2k(4k−2)}.
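That E = UN generates a cyclic group of order 4k can be confirmed for small k. The Python sketch below is an illustration with hypothetical helper names: it builds E for order n = 2k, multiplies it out, and checks that the n-th power of E is −I, so that E has order exactly 2n = 4k. Tensoring with an identity matrix, as in the definition of G_4k, does not change the order.

```python
def negacyclic_shift(n):
    # E = U N, where U is the circulant with first row (0 1 0 ... 0) and
    # N = diag(-1, 1, ..., 1); hence E[i][i+1] = 1 and E[n-1][0] = -1.
    e = [[0] * n for _ in range(n)]
    for i in range(n):
        j = (i + 1) % n
        e[i][j] = -1 if j == 0 else 1
    return e

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n, scale=1):
    return [[scale if i == j else 0 for j in range(n)] for i in range(n)]

def order_and_nth_power(n):
    # Returns the multiplicative order of E together with the matrix E^n.
    e = negacyclic_shift(n)
    power, exponent, nth_power = e, 1, None
    while power != identity(n):
        if exponent == n:
            nth_power = power
        power = matmul(power, e)
        exponent += 1
    return exponent, nth_power

orders = {k: order_and_nth_power(2 * k)[0] for k in range(1, 5)}
print(orders)   # prints {1: 4, 2: 8, 3: 12, 4: 16}
```

The identity Eⁿ = −I also shows that the elements of the group pair off as g and −g, a fact that is convenient when summing over the group.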
Lemma 5. Let q = (2k − 1)² be a prime power. Then there is a balanced generalized weighing matrix BGW(q^m + q^{m−1} + · · · + q + 1, q^m, q^m − q^{m−1}) over the cyclic group G_4k for each positive integer m.

Proof. Note that 4k is a divisor of q − 1 and apply [7], Theorem 2.2.

Theorem 6. If there is a twin 2-(4n², n(2n − 1), m(n − 1)) design of K-type, n ≤ m, and q = (2n − 1)² is a prime power, then there is a twin 2-(4n²(1 + q + q² + · · · + q^t), n(2n − 1)q^t, m(n − 1)q^t) design for each positive integer t.

Proof. Let t be a positive integer and M be a twin 2-(4n², n(2n − 1), m(n − 1)) design. Let W = [w_ij] be the balanced generalized weighing matrix BGW(ν, κ, λ) of Lemma 5 for k = n, where ν = q^t + q^{t−1} + · · · + q + 1, κ = q^t, λ = q^t − q^{t−1}. Let D = [M w_ij]. This is a twin 2-(4n²(1 + q + q² + · · · + q^t), n(2n − 1)q^t, m(n − 1)q^t) design. To see this, note that

\[ D^+ = \tfrac{1}{2}\bigl[P|w_{ij}| + M w_{ij}\bigr] \quad\text{and}\quad D^- = \tfrac{1}{2}\bigl[P|w_{ij}| - M w_{ij}\bigr], \]

where P = J_{4n²×4nm} − I_{2n} ⊗ J_{2n×2m}. We will only show that D⁻ is a 2-(4n²(1 + q + q² + · · · + q^t), n(2n − 1)q^t, m(n − 1)q^t) design. The proof for D⁺ is similar. Let 4D⁻(D⁻)ᵗ = [d_kl]. Note that every block of P is orthogonal to every block of M w_ij and thus (P|w_ij|)(M w_lk)ᵗ = (M w_lk)(P|w_ij|)ᵗ = 0 for all i, j, k and l. For l = k,

\[ d_{ll} = \sum_{i=1}^{\nu} \bigl( P|w_{li}||w_{li}|^t P^t + M w_{li} w_{li}^t M^t \bigr) = P\Bigl(\sum_{i=1}^{\nu} |w_{li}||w_{li}|^t\Bigr)P^t + M\Bigl(\sum_{i=1}^{\nu} w_{li} w_{li}^t\Bigr)M^t = \kappa\,(PP^t + MM^t) = \kappa\bigl(4m(2n - 1)\,I_{4n^2} + 4(n - 1)m\,(J_{4n^2} - I_{4n^2})\bigr). \]

For l ≠ k, the sum of the products w_ki w_liᵗ vanishes, because the elements of G_4n cancel in pairs g, −g, and we are left with

\[ d_{kl} = \sum_{i=1}^{\nu} \bigl( P|w_{ki}||w_{li}|^t P^t + M w_{ki} w_{li}^t M^t \bigr) = P\Bigl(\sum_{i=1}^{\nu} |w_{ki}||w_{li}|^t\Bigr)P^t + M\Bigl(\sum_{i=1}^{\nu} w_{ki} w_{li}^t\Bigr)M^t = P\Bigl(\frac{2\lambda}{4n}\, J_{2n} \otimes I_{2m}\Bigr)P^t = \frac{2(2n - 1)^2 \lambda}{4n}\, 2m\, J_{4n^2} = q^t (2n - 2)\, 2m\, J_{4n^2}. \]
This shows that D⁻ is a 2-design with the desired parameters.

Corollary 7. Let 4n, 4m, n ≤ m, be orders of Hadamard matrices and q = (4n − 1)² be a prime power. Then there exists a 2-(16n²(1 + q + · · · + q^t), 2n(4n − 1)q^t, 2m(2n − 1)q^t) design for each positive integer t.

Proof. This follows from the previous theorem and Theorem 3.

Example 8. Take n = 4, m = 6, and q = 49. Starting with two Hadamard matrices of orders 8 and 12, we obtain a twin 2-(64, 28, 18) design of K-type by applying Theorem 3. Using this design and a BGW(50, 49, 48) over the group G_16, we get a twin 2-(3200, 1372, 882) design.

Remark 9.
1. Let n = m in the previous theorem. Then the symmetric designs of the theorem reduce to the twin symmetric designs with parameters

\[ \nu = 16(q^m + q^{m-1} + \cdots + q + 1)n^2, \qquad \kappa = q^m(8n^2 - 2n), \qquad \lambda = q^m(4n^2 - 2n) \]

for every positive integer m. This class of symmetric designs first appeared in [10].
2. It is not known whether a 2-(4n², n(2n − 1), m(n − 1)) design of K-type exists for each pair of odd integers n and m, n ≤ m. Any such 2-design can be “blown” to infinitely many by using the method of Theorem 6.

Corollary 10. Let 4k be the order of a Hadamard matrix and assume that there is a W(2k, 2k − 1), and q = (2k − 1)² is a prime power. Then there exists a 2-(4k²(1 + q + q² + · · · + q^m), k(2k − 1)q^m, (2k − 1)(k − 1)q^m) design for each positive integer m.

Proof. The proof follows from Theorem 6 and Lemma 2.

Example 11. Starting from a skew-type Hadamard matrix of order 4 (k = 2),

\[ K = \begin{pmatrix} 1 & 1 & 1 & 1 \\ - & 1 & 1 & - \\ - & - & 1 & 1 \\ - & 1 & - & 1 \end{pmatrix}, \]

we obtain the weighing matrix W(4, 3), where

\[ W(4, 3) = \begin{pmatrix} 0 & 1 & 1 & 1 \\ - & 0 & 1 & - \\ - & - & 0 & 1 \\ - & 1 & - & 0 \end{pmatrix}. \]

We double K and normalize it to get the following twin 2-(4, 2, 1) design:

\[ M = \begin{pmatrix} 1 & 1 & 1 & - & - & - \\ 1 & - & - & 1 & 1 & - \\ - & 1 & - & 1 & - & 1 \\ - & - & 1 & - & 1 & 1 \end{pmatrix}. \]

Then D = W ⊗ M is a twin 2-(16, 6, 3) design of K-type. We now use D and a BGW(10, 9, 8) over the cyclic group G_8 to get a twin 2-(160, 54, 27) design.

Remark 12. If we assume in Theorem 6 that q = (2n + 1)² is a prime power, then we get an infinite class of Siamese twin 2-(4n²(1 + q + · · · + q^t), n(2n + 1)q^t, m(n + 1)q^t) designs sharing the entries [(I_2n ⊗ J_{2n×2m})w_ij]. A similar class of Siamese twin designs is obtained from Corollary 10. See [11] for details.

Acknowledgments

Part of the work was completed while the first author was on sabbatical leave visiting the Institute for Studies in Theoretical Physics and Mathematics, IPM, in Tehran, Iran. Hospitality and support are appreciated. Supported in part by an NSERC operating grant. The authors wish to thank the referee for careful reading of the manuscript and the constructive suggestions.
References

[1] T. Beth, D. Jungnickel, and H. Lenz, Design Theory, Second Edition, Cambridge University Press, Cambridge 1999.
[2] K. A. Bush, Unbalanced Hadamard matrices and finite projective planes of even order, J. Combin. Theory 11 (1971), 38–44.
[3] D. R. Hughes and F. C. Piper, Design Theory, Cambridge University Press, Cambridge 1985.
[4] Z. Janko, Coset enumeration in groups and constructions of symmetric designs, Combinatorics 90 (1992), 275–277.
[5] Z. Janko, H. Kharaghani, and V. D. Tonchev, Bush-type Hadamard matrices and symmetric designs, J. Combin. Des. 9 (2001), 72–78.
[6] Z. Janko, H. Kharaghani, and V. D. Tonchev, The existence of a Bush-type Hadamard matrix of order 324 and two new infinite classes of symmetric designs, Des. Codes Cryptogr. 24 (2001), 225–232.
[7] D. Jungnickel and V. D. Tonchev, Perfect codes and balanced weighing matrices, Finite Fields Appl. 5 (1999), 294–300.
[8] H. Kharaghani, New classes of weighing matrices, Ars Combin. 19 (1985), 69–72.
[9] H. Kharaghani, 2-parameter Hadamard balanced incomplete block designs, Util. Math. 27 (1985), 225–227.
[10] H. Kharaghani, On the twin designs with the Ionin-type parameters, Electron. J. Combin. 7 (2000), #R1.
[11] H. Kharaghani, On the Siamese twin designs, in: Finite Fields and Applications (D. Jungnickel and H. Niederreiter, eds.), Proceedings of the 5th International Conference Fq(5), University of Augsburg, Germany, Springer-Verlag, Berlin–Heidelberg 2001, 303–312.

H. Kharaghani, Department of Mathematics & Computer Science, University of Lethbridge, Lethbridge, Alberta, Canada, T1K 3M4

V. D. Tonchev, Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan 49931, U.S.A.
Decoding some doubly-even self-dual [32, 16, 8] codes by hand

Jon-Lark Kim and Vera Pless

Abstract. The purpose of this paper is to decode some binary doubly-even self-dual [32, 16, 8] codes by hand. We will decode C84 (or $8f_4$) in detail. Our method is the syndrome decoding method used in [G3] to decode the binary Reed–Muller code R(2, 5). At the end we also describe how to decode another doubly-even self-dual [32, 16, 8] code C83 (or $2g_{16}$) and three singly-even self-dual [32, 16, 8] codes by using the syndrome decoding method.

2000 Mathematics Subject Classification: primary 94B35; secondary 94B05.

Codes and Designs, Ohio State Univ. Math. Res. Inst. Publ. 10, © Walter de Gruyter 2002

1. Introduction

Our notation follows [MS, P2]. A linear $[n, k]$ code $C$ over GF(2) is a $k$-dimensional vector subspace of GF(2)$^n$, where GF(2) is the Galois field with two elements. The weight wt($c$) of a codeword $c \in C$ is the number of nonzero components of $c$. The minimum nonzero weight $d$ of all codewords in $C$ is called the minimum weight of $C$. An $[n, k, d]$ code is an $[n, k]$ code with minimum weight $d$. The dual code $C^\perp$ of $C$ consists of the vectors in GF(2)$^n$ orthogonal to all vectors in $C$ with respect to the usual inner product. If $C = C^\perp$, then $C$ is called a self-dual code. A self-dual binary code is called doubly-even if all codewords have weight $\equiv 0 \pmod 4$ and singly-even if some codeword has weight $\equiv 2 \pmod 4$. We note that self-dual binary codes exist only for even lengths. Moreover, doubly-even self-dual codes exist only for lengths $n \equiv 0 \pmod 8$.

It was shown [P1] that the extended binary (ternary) Golay codes can be decoded by hand by projecting these codes onto quaternary (ternary) codes of smaller length. It has been open since then whether this idea can be applied to longer binary codes. It was shown [G3] that this is possible for the binary Reed–Muller [32, 16, 8] code R(2, 5) by using the linear Hamming [8, 4, 4] code over GF(4). We remind readers that R(2, 5) is one of the 5 binary extremal doubly-even self-dual [32, 16, 8] codes [CP]. Because of the linear structure of the Hamming [8, 4, 4] code over GF(4), R(2, 5) was decoded [G3] by two methods. One, the representation method, is the analogue of the method used to decode the Golay code [P1]. The other is the syndrome decoding method. Even though [G3] suggests how other doubly-even self-dual [32, 16, 8] codes, namely C84 and C83, might be decoded by hand, specific examples were not given. In fact, decoding C84 (or C83) is much more complicated than decoding R(2, 5) because the projected codes are non-linear and do not have binary generator matrices as the Hamming code over GF(4) does.

The purpose of this paper is to decode some doubly-even self-dual [32, 16, 8] binary codes by hand, i.e., C83 (or $2g_{16}$) and C84 (or $8f_4$) in the notation of [CP]. We will decode C84 (or $8f_4$) in detail. Our method is the syndrome decoding method used in [G3] to decode the binary Reed–Muller code R(2, 5). At the end we also describe how to decode another doubly-even self-dual [32, 16, 8] code C83 (or $2g_{16}$) and the three singly-even self-dual [32, 16, 8] codes [CS] by using the syndrome decoding method.

Since C84 (or C83) is a [32, 16, 8] code, its error-correcting capability is at most 3. It is also well known [AP] that the covering radius of C84 (or C83) is 6. Our decoding scheme can not only correct up to 3 errors, but can also detect 4, 5, or 6 errors, and in addition give one or more coset representatives of cosets of weight 4, 5, or 6.
2. Preliminaries

Let GF(4) = $\{0, 1, \omega, \bar\omega\}$ be the Galois field of four elements. An additive code $C$ over GF(4) of length $n$ is an additive subgroup of GF(4)$^n$. As $C$ is a free GF(2)-module, it has size $2^k$ for some $0 \le k \le 2n$. We call $C$ an $(n, 2^k)$ code. It has a basis, as a GF(2)-module, consisting of $k$ basis vectors; a generator matrix of $C$ will be a $k \times n$ matrix with entries in GF(4) whose rows are a basis of $C$.

When we consider additive codes over GF(4), we define the symmetric bilinear dot product $\cdot$ : GF(4) $\times$ GF(4) $\to$ GF(2) by $1 \cdot \omega = \omega \cdot 1 = 1$, $1 \cdot \bar\omega = \bar\omega \cdot 1 = 1$, $\omega \cdot \bar\omega = \bar\omega \cdot \omega = 1$ and $x \cdot x = 0 \cdot x = 0$ for all $x \in$ GF(4). We now define the trace inner product of two vectors $x = (x_1 x_2 \ldots x_n)$ and $y = (y_1 y_2 \ldots y_n)$ in GF(4)$^n$ to be
\[
xy = \sum_{i=1}^{n} x_i \cdot y_i \in \mathrm{GF}(2).
\]
Note that $x_i \cdot y_i = 1$ if and only if $x_i$ and $y_i$ are nonzero distinct elements in GF(4). If $C$ is an additive code, its dual, denoted $C^\perp$, is the additive code $\{x \in \mathrm{GF}(4)^n \mid xc = 0 \text{ for all } c \in C\}$. If $C$ is an $(n, 2^k)$ code, then $C^\perp$ is an $(n, 2^{2n-k})$ code. As usual, $C$ is self-orthogonal if $C \subseteq C^\perp$ and self-dual if $C = C^\perp$. In particular, if $C$ is self-dual, $C$ is an $(n, 2^n)$ code.

The weight wt($c$) of $c \in C$ is the number of nonzero components of $c$. The minimum weight $d$ of $C$ is the smallest weight of any nonzero codeword in $C$. If $C$
is an $(n, 2^k)$ additive code of minimum weight $d$, $C$ is called an $(n, 2^k, d)$ code. $C$ is Type II if $C$ is self-dual and all codewords have even weight; it is a fact that Type II codes of length $n$ exist only if $n$ is even [G1, G2, H]. If $C$ is self-dual but some codeword has odd weight (in which case the code cannot be GF(4)-linear), the code is Type I (see Section 4.2 in [RS]).

Recently it has been shown [G1, G2, H] that there are exactly 3 even self-dual additive $(8, 2^8, 4)$ codes of length 8 over GF(4). Among these three even codes, one is the linear Hamming [8, 4, 4] code $H$ over GF(4) and the others are $C_i$ ($i = 1, 2$) over GF(4) with generator matrices $G_i$, respectively QC_8d and QC_8c in Table 1 [G1, G2]. Furthermore, it was shown [G1, G2] that the binary Reed–Muller [32, 16, 8] code R(2, 5), C84, and C83 can be obtained in the following way. Let $C_0 = H$ over GF(4). Let $C_i$ ($i = 1, 2$) be defined as above. Let $\widetilde{C}_i$ ($i = 0, 1, 2$) be the binary linear [32, 8] code obtained from $C_i$ ($i = 0, 1, 2$) by replacing each GF(4) component by a 4-tuple in GF(2)$^4$ as follows: $0 \to 0000$, $1 \to 0011$, $\omega \to 0101$, $\bar\omega \to 0110$. Let $d_4$ be the [4, 1] binary linear code $\{0000, 1111\}$. Let $(d_4^8)_0$ be the [32, 7] binary linear code consisting of all codewords of weights divisible by 8 from the [32, 8] code $d_4^8$. Finally, let $f_1$ be the [32, 1] code generated by $1000\,1000 \ldots 1000$.
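\[
\text{QC\_8c} = \begin{bmatrix}
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 1 & \omega & \omega & 0 \\
0 & 1 & 0 & 0 & 1 & \omega & \omega & 0 \\
0 & 0 & 1 & 0 & \omega & 1 & 0 & \omega \\
\omega & \omega & 0 & 0 & \omega & \omega & \omega & \omega \\
\omega & 0 & \omega & 0 & 1 & 1 & \omega & \omega \\
\omega & 0 & 0 & \omega & \omega & \omega & 1 & 1
\end{bmatrix},
\qquad
K = \begin{bmatrix}
1 & 0 & 0 & \omega & \omega & \omega & 0 & 0 \\
0 & 1 & 0 & 0 & \omega & \omega & \omega & 0 \\
0 & 0 & 1 & 0 & 0 & \omega & \omega & \omega \\
\omega & 0 & 0 & 1 & 0 & 0 & \omega & \omega \\
\omega & \omega & 0 & 0 & 1 & 0 & 0 & \omega \\
\omega & \omega & \omega & 0 & 0 & 1 & 0 & 0 \\
0 & \omega & \omega & \omega & 0 & 0 & 1 & 0 \\
0 & 0 & \omega & \omega & \omega & 0 & 0 & 1
\end{bmatrix},
\]
\[
\text{QC\_8d} = \begin{bmatrix}
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 1 & \omega & \omega & 0 \\
0 & 1 & 0 & 0 & 1 & \omega & 0 & \omega \\
0 & 0 & 1 & 0 & \omega & 1 & 0 & \omega \\
\omega & \omega & 0 & 0 & \omega & 1 & \omega & 1 \\
\omega & 0 & \omega & 0 & 1 & 0 & 1 & 0 \\
\omega & 0 & 0 & \omega & \omega & \omega & \omega & \omega
\end{bmatrix}
\]

Table 1. Generator matrices for even additive self-dual codes over GF(4)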
Lemma 2.1 ([G1, G2]). $\rho(C_i) = \widetilde{C}_i + (d_4^8)_0 + f_1$ for $i = 0, 1, 2$ produces the binary Reed–Muller [32, 16, 8] code R(2, 5), C84, and C83, respectively.
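The dot product and trace inner product defined in this section are easy to check mechanically. The following is a minimal Python sketch, not part of the original paper: it assumes the encoding 0, 1, ω, ω̄ → 0, 1, 2, 3 (so that addition in GF(4) becomes bitwise XOR), builds the cyclic generator matrix K of Table 1 from its first row, and checks that the additive code it generates has $2^8$ codewords, is self-orthogonal, and has only even weights with minimum nonzero weight 4. All function names are illustrative.

```python
from itertools import product

# Assumed encoding (not from the paper): 0, 1, w, wbar -> 0, 1, 2, 3, so that
# addition in GF(4) is bitwise XOR (wbar = 1 + w = 1 ^ 2 = 3).
W = 2

def dot(a, b):
    """Dot product GF(4) x GF(4) -> GF(2): 1 iff a and b are distinct and nonzero."""
    return int(a != 0 and b != 0 and a != b)

def trace_inner(x, y):
    """Trace inner product of two vectors over GF(4), as an element of GF(2)."""
    return sum(dot(a, b) for a, b in zip(x, y)) % 2

# Generator matrix K: the eight cyclic shifts of (1, 0, 0, w, w, w, 0, 0).
FIRST_ROW = [1, 0, 0, W, W, W, 0, 0]
K = [FIRST_ROW[-s:] + FIRST_ROW[:-s] for s in range(8)]

def span(rows):
    """All GF(2)-linear combinations of the given rows (the additive code)."""
    words = set()
    for coeffs in product((0, 1), repeat=len(rows)):
        x = [0] * len(rows[0])
        for c, row in zip(coeffs, rows):
            if c:
                x = [a ^ b for a, b in zip(x, row)]
        words.add(tuple(x))
    return words

code = span(K)
self_orthogonal = all(trace_inner(r, s) == 0 for r in K for s in K)
weights = sorted({sum(1 for a in x if a) for x in code})
print(len(code), self_orthogonal, weights)
```

Since a self-orthogonal $(8, 2^8)$ code is self-dual, the two checks together confirm that K generates an even additive self-dual code of length 8.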
3. Two doubly-even self-dual [32, 16, 8] codes

We describe how to decode the doubly-even self-dual [32, 16, 8] codes C83 (or $2g_{16}$) and C84 (or $8f_4$) in the notation of [CP] by using the syndrome decoding method. It is known [G3] that the code with generator matrix QC_8d is equivalent to a cyclic even additive self-dual code $K$ over GF(4). This cyclic code has the generator matrix $K$ in Table 1. The advantage of $K$ is that its only nonzero entries are 1 and $\omega$, which simplifies the computation of syndromes. Thus we will decode C84 (or $8f_4$) by using $K$ in detail. We index the columns of $K$ from left to right by 1 to 8. This is how we refer to the columns of $K$, i.e., column 6 is $(\omega, \omega, \omega, 0, 0, 1, 0, 0)^T$.

Before we describe how to use $K$ to decode C84 (or $8f_4$), we recall the projection of binary codes onto quaternary codes explained in [P1]. Consider a $4 \times 8$ array with zeros and ones in it. Label the four rows with the elements of GF(4): $0, 1, \omega, \bar\omega$. Recall that $\bar\omega = \omega^2$, $\bar\omega^2 = \omega$, and $\bar\omega = 1 + \omega$. If we take the (Euclidean) inner product of a column of our array with the row labels, we get an element in GF(4). In this way we have a correspondence between binary vectors of length 32 and quaternary vectors of length 8. For example, let

$v = (1, 0, 1, 1,\; 0, 0, 0, 0,\; 1, 0, 0, 1,\; 0, 0, 1, 1,\; 0, 1, 0, 1,\; 0, 0, 1, 1,\; 0, 1, 0, 1,\; 0, 1, 1, 0)$

be a binary vector of length 32. Then
\[
v = \begin{array}{c|cccccccc}
 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 \\
\omega & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
\bar\omega & 1 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\ \hline
 & 1 & 0 & \bar\omega & 1 & \omega & 1 & \omega & \bar\omega
\end{array}
\]
corresponds to (or projects onto) the quaternary vector $(1, 0, \bar\omega, 1, \omega, 1, \omega, \bar\omega)$ of length 8. Note that this correspondence is linear, i.e., if $b_i$ corresponds to $q_i$, $i = 1, 2$, then $b_1 + b_2$ corresponds to $q_1 + q_2$.

We say that a column has even (odd) parity if it contains an even (odd) number of ones. Define the parity of the top row in a similar fashion. Thus the first column of the $4 \times 8$ array of the above vector has odd parity, and the rest have even parity. The top row has even parity.

Let $C_i$ ($i = 1, 2$) be the additive codes over GF(4) with generator matrices $G_i$, respectively $K$ and QC_8c. When we prove the analogue of Lemma 1 in [G3] for C84 (C83) we need to modify its proof since $C_1$ ($C_2$) is not linear. In this case we recall that $C_1$ ($C_2$) has $2^8$ codewords. Keeping the above notation, Lemma 3.1 characterizes C84 (C83).
Lemma 3.1. The set of all binary vectors of length 32 with the following properties is (up to equivalence) C84 (C83):

(i) The parity of all the columns is the same (i.e., all even or all odd), and the parity of the top row is always even. All vectors of this form constitute a linear space.

(ii) The projection is in $C_1$ ($C_2$). All vectors with this property form a linear space.

Proof. The linearity of the set satisfying (i) and (ii) is easy. The nontrivial thing is to determine its dimension. From now on, we identify a binary vector of length 32 with a $4 \times 8$ array. First suppose that all columns of our $4 \times 8$ arrays are even and the first row is even. Then for each codeword $y$ in $C_1$ ($C_2$) there are $2^7$ arrays whose projection is $y$, as there are 2 choices for each of the first seven columns and one for the last column because of the top row parity. Since there are $2^8$ codewords in $C_1$ ($C_2$) we see that there are $2^7 \times 2^8 = 2^{15}$ arrays whose projection is in $C_1$ ($C_2$) when all columns are even. For the same reason we get $2^7 \times 2^8 = 2^{15}$ arrays when all columns are odd. Therefore the dimension of the set of vectors of length 32 satisfying (i) and (ii) is 16, as desired. Since $\rho(C_1)$ ($\rho(C_2)$) satisfies (i) and (ii) and produces C84 (C83) by Lemma 2.1, we conclude that $\rho(C_1)$ ($\rho(C_2)$) is C84 (C83). This completes the proof.

As an example, we compute the syndrome $Ke^T$ of $e = (1, 0, 0, 0, 0, 0, 0, 0)$ with respect to the trace inner product. We get $Ke^T = (0, 0, 0, 1, 1, 1, 0, 0)^T$. Likewise, if we let $e' = (\omega, 0, 0, 0, 0, 0, 0, 0)$ then $Ke'^T = (1, 0, 0, 0, 0, 0, 0, 0)^T$. Also, if we let $e'' = (\bar\omega, 0, 0, 0, 0, 0, 0, 0)$ then $Ke''^T = K(e + e')^T = Ke^T + Ke'^T = (1, 0, 0, 1, 1, 1, 0, 0)^T$, since the trace inner product is bilinear.

Now we briefly describe the idea of decoding the code C84 by using $K$, in order to make the decoding algorithm (Section 4.1) easier to follow. Let $v$ be a received vector of length 32. As before, we write it as a $4 \times 8$ array with rows indexed by $0$, $1$, $\omega$, and $\bar\omega$. Then we compute the projection onto a quaternary vector $y$ of length 8. The problem is how to find a codeword of $K$ closest to $y$. As we will see in the next section, we will have some idea of which columns are in error. After we know this, our aim is to find the correct projection in $K$. In order to do this we compute the syndrome of $y$ with respect to $K$. This is always a binary vector. We want to express this as a linear combination of the vectors in Table 2. These vectors are obtained by taking the trace inner product of all possible weight one vectors $e$ with the rows of $K$. Let the nonzero component of $e$ be $e_i$. If $e_i = 1$, $e$ is binary. If $e_i = \omega$ (or $\bar\omega$), the weight one vector $e$ has one $\omega$ (or $\bar\omega$) component. We use $e_{i,1}$, $e_{i,\omega}$, and $e_{i,\bar\omega}$ for $i = 1, \ldots, 8$ to denote the binary coefficients of the syndrome $Ke^T$ of $e^T$ with wt($e$) = 1 and $e_i = 1$, $\omega$, and $\bar\omega$ respectively.
\[
\begin{array}{c|ccc}
i & Ke^T \text{ with } e_i = 1 & Ke^T \text{ with } e_i = \omega & Ke^T \text{ with } e_i = \bar\omega \\ \hline
1 & 00011100 & 10000000 & 10011100 \\
2 & 00001110 & 01000000 & 01001110 \\
3 & 00000111 & 00100000 & 00100111 \\
4 & 10000011 & 00010000 & 10010011 \\
5 & 11000001 & 00001000 & 11001001 \\
6 & 11100000 & 00000100 & 11100100 \\
7 & 01110000 & 00000010 & 01110010 \\
8 & 00111000 & 00000001 & 00111001
\end{array}
\]

Table 2. Syndromes $Ke^T$ of $e^T$ with wt($e$) = 1 and $e_i \ne 0$ (each syndrome is written as a row, first coordinate on the left)
For example, let $y = (\bar\omega, \omega, 0, \omega, 0, 0, \omega, 0)$ be our quaternary vector of length 8 obtained by projecting a binary vector of length 32. To make the explanation simple, suppose that we know that only the first two positions of $y$ are error positions. Then we compute the following syndrome equation:
\[
Ky^T = \begin{bmatrix} 1\\1\\0\\0\\1\\1\\1\\0 \end{bmatrix}
= e_{1,1} \begin{bmatrix} 0\\0\\0\\1\\1\\1\\0\\0 \end{bmatrix}
+ e_{1,\omega} \begin{bmatrix} 1\\0\\0\\0\\0\\0\\0\\0 \end{bmatrix}
+ e_{2,1} \begin{bmatrix} 0\\0\\0\\0\\1\\1\\1\\0 \end{bmatrix}
+ e_{2,\omega} \begin{bmatrix} 0\\1\\0\\0\\0\\0\\0\\0 \end{bmatrix},
\]
where the coefficients on the right side are binary. We solve this equation to get $e_{1,\omega} = 1$, $e_{2,\omega} = 1$, $e_{1,1} = 0$, and $e_{2,1} = 1$. Thus the error vector is $e = (\omega, \bar\omega, 0, 0, 0, 0, 0, 0)$, implying that the codeword $x$ of $K$ is $x = y + e = (1, 1, 0, \omega, 0, 0, \omega, 0)$, which is, in fact, the sum of the first two rows of $K$. Then we decode our received $4 \times 8$ array according to Table 3, as we will see in the next section.
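When the error positions are known, the syndrome equation can also be solved by simply trying every GF(4) value in those positions. The following Python sketch does this for the example above; it is a brute-force illustration rather than the by-hand procedure, and the encoding (0, 1, ω, ω̄ → 0, 1, 2, 3) and the function name `errors_on` are assumptions.

```python
from itertools import product

# Assumed encoding: 0, 1, w, wbar -> 0, 1, 2, 3; addition in GF(4) is XOR.
W, WBAR = 2, 3

def dot(a, b):
    return int(a != 0 and b != 0 and a != b)

def trace_inner(x, y):
    return sum(dot(a, b) for a, b in zip(x, y)) % 2

FIRST_ROW = [1, 0, 0, W, W, W, 0, 0]
K = [FIRST_ROW[-s:] + FIRST_ROW[:-s] for s in range(8)]

def syndrome(y):
    return [trace_inner(row, y) for row in K]

def errors_on(y, positions):
    """All error vectors supported on the given 1-based positions whose
    syndrome equals the syndrome of y."""
    target = syndrome(y)
    solutions = []
    for values in product(range(4), repeat=len(positions)):
        e = [0] * 8
        for p, val in zip(positions, values):
            e[p - 1] = val
        if syndrome(e) == target:
            solutions.append(e)
    return solutions

y = [WBAR, W, 0, W, 0, 0, W, 0]
(e,) = errors_on(y, (1, 2))           # exactly one solution is expected here
x = [a ^ b for a, b in zip(y, e)]     # corrected projection
print(syndrome(y), e, x)
```

Here the unique solution is $e = (\omega, \bar\omega, 0, \ldots, 0)$, and the corrected projection is the sum of the first two rows of K, in agreement with the text.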
4. Decoding the doubly-even self-dual code C84

Now we decode the doubly-even self-dual code C84 by hand in detail by using the syndrome decoding method. We recall from Section 3 that when we decode C84, we use the even self-dual additive code $K$ with generator matrix $K$ in Table 1.

We give an explanation of Table 3, which is an analysis of the parities of the columns of a received vector. For example, let us consider Case II. Here the second column of the table denotes the parity of columns. So (7, 1) means that seven columns have one parity and one column has the other parity. The third column shows all possible errors in each subcase. For example, 3 = 2 + 1 means that there are 2 errors in one of the seven columns with one parity and 1 error in the one column with the other parity. The fourth column shows correct columns. So "(6) out of 7" means that 6 of the seven columns with one parity are correct, even though it is not decided which 6 columns these are. In Case V, there are equal numbers of even parity and odd parity columns. To simplify the table, we omit symmetric cases such as 4 = 1 + 1 + 1 + 1, 6 = 1 + 1 + 1 + 3, and 6 = 2 + 1 + 1 + 1 + 1 [both of which may need to be considered].

\begin{tabular}{llll}
Case & Parity of columns (one parity, the other) & Possible errors & Correct columns (parentheses: undecided columns) \\ \hline
I & (8, 0) & 0 errors & 8 \\
 & & 2 = 2 & (7) out of 8 \\
 & & 4 = 4 & (7) out of 8 \\
 & & 4 = 2 + 2 & (6) out of 8 \\
 & & 6 = 2 + 2 + 2 & (5) out of 8 \\ \hline
II & (7, 1) & 1 = 1 & 7 \\
 & & 3 = 3 & 7 \\
 & & 3 = 2 + 1 & (6) out of 7 \\
 & & 5 = 2 + 2 + 1 & (5) out of 7 \\ \hline
III & (6, 2) & 2 = 1 + 1 & 6 \\
 & & 4 = 1 + 3 & 6 \\
 & & 4 = 2 + 1 + 1 & (5) out of 6 \\
 & & 6 = 2 + 2 + 1 + 1 & (4) out of 6 \\
 & & 6 = 1 + 1 + 1 + 1 + 1 + 1 & 2 \\ \hline
IV & (5, 3) & 3 = 1 + 1 + 1 & 5 \\
 & & 5 = 1 + 1 + 3 & 5 \\
 & & 5 = 2 + 1 + 1 + 1 & (4) out of 5 \\
 & & 5 = 1 + 1 + 1 + 1 + 1 & 3 \\ \hline
V & (4, 4) & 4 = 1 + 1 + 1 + 1 & 4 \\
 & & 6 = 1 + 1 + 1 + 3 & 4 \\
 & & 6 = 2 + 1 + 1 + 1 + 1 & (3) out of 4
\end{tabular}

Table 3. All possible cases
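Determining which case of Table 3 applies only requires counting column parities. A minimal Python sketch follows, assuming the received vector is given as 32 bits read column by column (rows 0, 1, ω, ω̄ within each column); the function name `table3_case` is illustrative.

```python
CASE_NAMES = {0: "I", 1: "II", 2: "III", 3: "IV", 4: "V"}

def column_parities(v):
    """Parity (0 even, 1 odd) of each of the 8 columns of the 4 x 8 array."""
    return [sum(v[4 * j:4 * j + 4]) % 2 for j in range(8)]

def table3_case(v):
    """Return (case, (majority, minority) counts, 1-based minority columns)."""
    parities = column_parities(v)
    odd = sum(parities)
    minority_parity = 1 if odd <= 8 - odd else 0
    minority = [j + 1 for j, p in enumerate(parities) if p == minority_parity]
    return CASE_NAMES[len(minority)], (8 - len(minority), len(minority)), minority

# Received arrays of Examples 4.2 and 4.5 below, written column by column.
v42 = [1, 1, 1, 0,  0, 1, 1, 1,  0, 1, 0, 0,  1, 0, 1, 1,
       1, 0, 0, 0,  1, 0, 1, 0,  0, 0, 0, 1,  0, 0, 1, 0]
v45 = [1, 1, 1, 1,  1, 0, 1, 1,  0, 0, 1, 1,  1, 1, 0, 1,
       1, 0, 0, 0,  0, 1, 1, 0,  1, 0, 1, 0,  1, 0, 0, 0]
print(table3_case(v42))  # ('II', (7, 1), [6])
print(table3_case(v45))  # ('V', (4, 4), [2, 4, 5, 8])
```

In the (4, 4) case the choice of "minority" is arbitrary; the sketch reports the odd columns, and the symmetric possibility (the even columns being wrong) has to be kept in mind separately.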
4.1. Decoding algorithm

We give the outline of a maximum-likelihood syndrome decoding algorithm of C84. Maximum-likelihood means that we assume that the smallest number of errors has occurred. We have $K$ as before and the received vector $v$ as a $4 \times 8$ binary matrix.

Step 1. Compute the parities of the columns of $v$ and determine which case of Table 3 we are in.

Step 2. Compute the projection of $v$, call it $y$.

Step 3. Compute the syndrome of $y$ with respect to $K$.

Step 4. If the syndrome of $y$ is zero and we are in Case I of Table 3, we compute the parity of the top row of $v$; otherwise go to Step 5. If the parity is even, we say no errors have occurred and we stop. If it is odd, we are in case 4 = 4 and we decode by complementing any column of $v$, then stop.

Step 5. From Step 1 we know either a) explicitly which columns are incorrect [lines 1 and 2 of Cases II, III, IV, and V] or b) we have partial information about the number or location of incorrect columns. In case a) we write the syndrome as a combination of the incorrect columns using Table 2. If successful, we go to Step 6. If we find that other columns are wrong we go to case b). In case b) by trial and error we write the syndrome as a combination of known incorrect columns and possible unknown ones.

Step 6. From Step 5 we get an error vector which we add to $y$. We now know a correct projection $y'$. We can adjust $v$ so that its projection is $y'$, all its columns have the same parity, and its top row parity is even. Thus $v$ is decoded as a codeword of C84. Then we stop.

To see this algorithm more clearly we give some specific examples.
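For checking hand computations, it can help to have an independent nearest-codeword search. The Python sketch below is not the syndrome procedure of Steps 1 to 6: it swaps in a brute-force search that uses the description of the code in Lemma 3.1, running over all $2^8$ codewords of K and both column parities. The encoding (0, 1, ω, ω̄ → 0, 1, 2, 3), the column-by-column bit order, and all names are assumptions made for illustration.

```python
from itertools import product

W = 2  # assumed encoding: 0, 1, w, wbar -> 0, 1, 2, 3 (addition is XOR)
FIRST_ROW = [1, 0, 0, W, W, W, 0, 0]
K = [FIRST_ROW[-s:] + FIRST_ROW[:-s] for s in range(8)]

def codewords_of_K():
    """All 2^8 GF(2)-linear combinations of the rows of K."""
    for coeffs in product((0, 1), repeat=8):
        x = [0] * 8
        for c, row in zip(coeffs, K):
            if c:
                x = [a ^ b for a, b in zip(x, row)]
        yield x

# CHOICES[symbol, parity]: the two 4-tuples (rows 0, 1, w, wbar) with that
# projection and column parity; they are complements of each other.
CHOICES = {}
for bits in product((0, 1), repeat=4):
    symbol = (bits[1] * 1) ^ (bits[2] * 2) ^ (bits[3] * 3)
    CHOICES.setdefault((symbol, sum(bits) % 2), []).append(bits)

def distance(a, b):
    return sum(p != q for p, q in zip(a, b))

def decode(v):
    """Return (number of changed bits, a nearest codeword) for a 32-bit list v."""
    cols = [tuple(v[4 * j:4 * j + 4]) for j in range(8)]
    best = None
    for x in codewords_of_K():
        for parity in (0, 1):
            picked, extra = [], []
            for col, symbol in zip(cols, x):
                near, far = sorted(CHOICES[symbol, parity],
                                   key=lambda c: distance(col, c))
                picked.append(near)
                extra.append(distance(col, far) - distance(col, near))
            if sum(c[0] for c in picked) % 2 == 1:
                # Top row must be even: complement the column where this is cheapest.
                j = extra.index(min(extra))
                picked[j] = tuple(1 - b for b in picked[j])
            total = sum(distance(col, c) for col, c in zip(cols, picked))
            if best is None or total < best[0]:
                best = (total, [b for c in picked for b in c])
    return best

# Received array of Example 4.1 below, written column by column.
v41 = [0, 1, 1, 0,  1, 0, 1, 0,  0, 0, 0, 0,  0, 1, 0, 1,
       1, 0, 0, 1,  1, 0, 1, 0,  0, 0, 1, 1,  1, 0, 1, 0]
print(decode(v41))
```

When more than three errors have occurred the nearest codeword need not be unique, and this sketch simply returns the first one it finds; the hand method reports the same ambiguity through Table 3.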
4.2. Examples

Example 4.1 (Case I). All the columns have the same parity. In this case we know that there are zero, two, four, or six errors by Table 3. Consider the following example.
\[
v = \begin{array}{c|cccccccc}
 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
\omega & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
\bar\omega & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 \\ \hline
y = & \bar\omega & \omega & 0 & \omega & \bar\omega & \omega & 1 & \omega
\end{array}
\]
Suppose the parity of all columns is correct (Step 1). The projection of the received vector $v$ is $y = (\bar\omega, \omega, 0, \omega, \bar\omega, \omega, 1, \omega)$ (Step 2). Even though the top row is even, we are not sure if this $v$ is a codeword of C84 before we compute the syndrome of $y$ with respect to $K$ (Step 3):
\[
Ky^T = (0, 1, 1, 1, 0, 0, 0, 0)^T.
\]
If $v$ were a codeword of C84, then $Ky^T$ would be the zero vector of length 8 by Lemma 3.1 (ii) (Step 4). Thus $v$ is not a codeword. To decode $v$ we need to find a vector in $K$ closest to $y$ (Step 5 b)). We first note that the syndrome $Ke^T$ with wt($e$) = 1 and $e_7 = 1$ (see Table 2), i.e., the vector of trace inner products of the rows of $K$ with $(0, 0, 0, 0, 0, 0, 1, 0)$, is $(0, 1, 1, 1, 0, 0, 0, 0)$. Therefore we have found an error in the 7th position of $y$. Hence our error vector is $e = (0, 0, 0, 0, 0, 0, 1, 0)$, giving a vector $x = y + e = (\bar\omega, \omega, 0, \omega, \bar\omega, \omega, 0, \omega)$ in $K$. As a check, we can actually write $x$ as a linear combination of rows of $K$. That is, $x$ = 1st + 5th rows of $K$. We observe that only the 1st and 5th coordinates of $x$ are 1 or $\bar\omega$. This is true in general: whenever we want to write $x$ as a linear combination of rows of $K$, we just consider the coordinates of $x$ that are 1 or $\bar\omega$; the corresponding rows of $K$ sum to $x$. Hence we uniquely decode $v$ as follows (Step 6).
\[
v = \begin{array}{c|cccccccc}
 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
\omega & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 \\
\bar\omega & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\ \hline
x = & \bar\omega & \omega & 0 & \omega & \bar\omega & \omega & 0 & \omega
\end{array}
\]
Example 4.2 (Case II). Seven of the columns have one parity, and one of the columns has the other parity. In this case we know that there are one, three, or five errors from Table 3. If there are at most three errors, then we can uniquely decode. Otherwise we can detect five errors and decode into some codeword. We explain this case in the next example.
\[
v = \begin{array}{c|cccccccc}
 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
0 & 1 & 0 & 0 & 1 & 1 & 1 & 0 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
\omega & 1 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\
\bar\omega & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\ \hline
y = & \bar\omega & 0 & 1 & 1 & 0 & \omega & \bar\omega & \omega
\end{array}
\]
Suppose the parity of the seven odd columns is correct (Step 1). So the sixth column of this array has at least one error. No syndrome equation involving two syndromes works. Note that we have to include at least one syndrome $Ke^T$ with $e_6 = 1$, $\omega$ or $\bar\omega$ since there is an error in the sixth column. By trial and error we consider the following equation (these syndromes can be found in Table 2) (Step 5 b)):
\[
Ky^T = (0, 1, 1, 0, 1, 1, 1, 1)^T = e_{4,1}\,(1, 0, 0, 0, 0, 0, 1, 1)^T + e_{5,\omega}\,(0, 0, 0, 0, 1, 0, 0, 0)^T + e_{6,\bar\omega}\,(1, 1, 1, 0, 0, 1, 0, 0)^T.
\]
Solving, we get $e_{4,1} = 1$, $e_{5,\omega} = 1$, and $e_{6,\bar\omega} = 1$ (recall that these coefficients are binary). Thus we get an error vector $e = (0, 0, 0, 1, \omega, \bar\omega, 0, 0)$, giving $x = y + e = (\bar\omega, 0, 1, 0, \omega, 1, \bar\omega, \omega)$ in $K$. Actually $x$ = 1st + 3rd + 6th + 7th rows of $K$, as observed in Example 4.1. Hence we decode $v$ as follows (Step 6).
\[
v = \begin{array}{c|cccccccc}
 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
\omega & 1 & 1 & 0 & 1 & 1 & 1 & 0 & 1 \\
\bar\omega & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0 \\ \hline
x = & \bar\omega & 0 & 1 & 0 & \omega & 1 & \bar\omega & \omega
\end{array}
\]
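Example 4.3 (Case III). Six of the columns have one parity and two of the columns have the other parity. In this case we know that there are two, four, or six errors from Table 3. Consider the following example.
\[
v = \begin{array}{c|cccccccc}
 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
\omega & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 1 \\
\bar\omega & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 1 \\ \hline
y = & \bar\omega & 0 & 0 & \bar\omega & 1 & \omega & \bar\omega & 1
\end{array}
\]
Suppose that the parity of the six odd columns is correct (Step 1). Noting that
\[
Ky^T = (1, 0, 0, 0, 0, 0, 0, 0)^T = e_{1,\omega}\,(1, 0, 0, 0, 0, 0, 0, 0)^T,
\]
we know that more than two errors have occurred, since the sixth and seventh columns already have errors. From the above syndrome equation we get an error vector $e = (\omega, 0, 0, 0, 0, 0, 0, 0)$, giving $x = y + e = (1, 0, 0, \bar\omega, 1, \omega, \bar\omega, 1)$ in $K$ (Step 5 b)). Actually $x$ = 1st + 4th + 5th + 7th + 8th rows of $K$, these being the coordinates of $x$ equal to 1 or $\bar\omega$. Hence we decode $v$ as follows (Step 6).
\[
v = \begin{array}{c|cccccccc}
 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 \\
1 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
\omega & 0 & 0 & 1 & 0 & 1 & 1 & 1 & 1 \\
\bar\omega & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 1 \\ \hline
x = & 1 & 0 & 0 & \bar\omega & 1 & \omega & \bar\omega & 1
\end{array}
\]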
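Example 4.4 (Case IV). Five of the columns have one parity, and three of the columns have the other parity. In this case we know that there are three or five errors from Table 3. We first try the three error case. If we cannot correct three errors, then we see that five errors occurred. Consider the following example.
\[
v = \begin{array}{c|cccccccc}
 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
\omega & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 \\
\bar\omega & 1 & 0 & 0 & 1 & 1 & 1 & 0 & 1 \\ \hline
y = & \bar\omega & 1 & \bar\omega & 0 & \omega & \omega & 1 & 0
\end{array}
\]
Suppose that the parity of the five odd columns is correct (Step 1). We consider the equation (Step 5 b)):
\[
Ky^T = (1, 1, 0, 0, 1, 0, 0, 1)^T = e_{5,\bar\omega}\,(1, 1, 0, 0, 1, 0, 0, 1)^T.
\]
Thus we get an error vector $e = (0, 0, 0, 0, \bar\omega, 0, 0, 0)$, giving $x = y + e = (\bar\omega, 1, \bar\omega, 0, 1, \omega, 1, 0)$ in $K$. Actually $x$ = 1st + 2nd + 3rd + 5th + 7th rows of $K$. Hence we decode $v$ as follows (Step 6).
\[
v = \begin{array}{c|cccccccc}
 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
\omega & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 \\
\bar\omega & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ \hline
x = & \bar\omega & 1 & \bar\omega & 0 & 1 & \omega & 1 & 0
\end{array}
\]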
Example 4.5 (Case V). Four of the columns have even parity, and four of the columns have odd parity. In this case we know that there are 4 or 6 errors from Table 3. We first try 4 errors. If we cannot detect any 4 errors, then we say that 6 errors occurred. We consider an example.
\[
v = \begin{array}{c|cccccccc}
 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 \\
1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
\omega & 1 & 1 & 1 & 0 & 0 & 1 & 1 & 0 \\
\bar\omega & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ \hline
y = & 0 & 1 & 1 & \omega & 0 & \bar\omega & \omega & 0
\end{array}
\]
Suppose that the even columns are correct (Step 1). Thus we have errors in positions 2, 4, 5, and 8. So we consider the equation:
\[
\begin{aligned}
Ky^T = (1, 1, 1, 1, 1, 1, 1, 1)^T
={}& e_{2,1}\,(0, 0, 0, 0, 1, 1, 1, 0)^T + e_{2,\omega}\,(0, 1, 0, 0, 0, 0, 0, 0)^T \\
&+ e_{4,1}\,(1, 0, 0, 0, 0, 0, 1, 1)^T + e_{4,\omega}\,(0, 0, 0, 1, 0, 0, 0, 0)^T \\
&+ e_{5,1}\,(1, 1, 0, 0, 0, 0, 0, 1)^T + e_{5,\omega}\,(0, 0, 0, 0, 1, 0, 0, 0)^T \\
&+ e_{8,1}\,(0, 0, 1, 1, 1, 0, 0, 0)^T + e_{8,\omega}\,(0, 0, 0, 0, 0, 0, 0, 1)^T.
\end{aligned}
\]
Solving this equation, we have a unique solution as follows: $e_{2,1} = 1$, $e_{2,\omega} = 0$, $e_{4,1} = 0$, $e_{4,\omega} = 0$, $e_{5,1} = 1$, $e_{5,\omega} = 1$, $e_{8,1} = 1$, $e_{8,\omega} = 0$. Thus we get an error vector $e = (0, 1, 0, 0, \bar\omega, 0, 0, 1)$, giving $x = y + e = (0, 0, 1, \omega, \bar\omega, \bar\omega, \omega, 1)$ in $K$. Actually $x$ = 3rd + 5th + 6th + 8th rows of $K$. Hence we decode $v$ as follows (Step 6).
\[
v = \begin{array}{c|cccccccc}
 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
0 & 1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 \\
1 & 1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\
\omega & 1 & 1 & 1 & 0 & 0 & 1 & 1 & 1 \\
\bar\omega & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 1 \\ \hline
x = & 0 & 0 & 1 & \omega & \bar\omega & \bar\omega & \omega & 1
\end{array}
\]
Since we have decoded, we need not go further, i.e., we do not consider the other possibility that the odd columns of $v$ are correct.

Remark 4.6. For decoding the code C83 using QC_8c, we follow the same steps as above. However, the syndrome equation will be more complicated since the generator matrix QC_8c has all three nonzero elements of GF(4) as entries.

Remark 4.7. It is well known [CS] that there are three singly-even self-dual [32, 16, 8] codes. It was shown [G1, G2] that the three singly-even self-dual [32, 16, 8] codes are constructed from the extended Hamming [8, 4, 4] code $H$ over GF(4), QC_8c, and QC_8d. Let $C_i$ ($i = 1, 2, 3$) denote the even additive self-dual codes with generator matrices $G_i$, respectively $H$, QC_8c, and QC_8d. If we follow the notation in the preliminaries, then $\rho(C_i) = \widetilde{C}_i + (d_4^8)_0 + f_3$ for $i = 1, 2, 3$, where $f_3$ is the [32, 1] code generated by $1000\,1000 \ldots 1000\,0111$, produces the three singly-even self-dual [32, 16, 8] codes [G1, G2]. So we can decode the three singly-even self-dual [32, 16, 8] codes by the syndrome decoding method, with the difference that the top row parity is equal to the parity of all columns when we write a binary vector $v$ of length 32 as a $4 \times 8$ array as before.

Remark 4.8. We comment on the worst cases of the decoding algorithm. It is easy to see that the computation of Step 1 through Step 3 in Section 4.1 is the same for all possible cases in Table 3. In Step 4 we go to Step 5, stop, or decode by complementing any column of $v$. If we are in Step 5 a), then since we already know which columns are incorrect, we just solve the syndrome equation, which will involve at most 8 columns in Table 2 (see Example 4.5). However, if we are in Step 5 b), the situation is more complicated. The worst case is Case I, i.e. (8, 0), in particular the subcase 6 = 2 + 2 + 2. Since we do not know which three columns out of the 8 columns of $v$ are incorrect, we could try $\binom{8}{3} = 56$ sets of 3 columns until we get one solvable syndrome equation. Note that there are 32 vectors of weight 6 in a coset of weight 6 of C84 (also C83) [AP]. Given a set of 3 columns giving the correct syndrome equation, there are 4 vectors of weight 6 in this coset of weight 6, since complementing any two of the 3 columns gives the same projection and preserves the parities of the columns and the top row. Thus 8 sets of 3 columns will hold all of them. Thus the probability that we find a desired syndrome equation is 8/56 = 1/7. However, in the worst case we have to try 56 − 7 = 49 sets of 3 columns.
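The counts in Remark 4.8 can be reproduced with a few lines of Python. This is only an arithmetic illustration; the variable names are hypothetical, and the figures 32 (weight-6 vectors in a weight-6 coset) and 4 (weight-6 vectors per good set of three columns) are taken from the remark itself.

```python
from fractions import Fraction
from math import comb

total_sets = comb(8, 3)        # ways to choose 3 suspect columns out of 8
leaders_in_coset = 32          # weight-6 vectors in a weight-6 coset [AP]
leaders_per_set = 4            # the solution itself, or any 2 of its 3 columns complemented
good_sets = leaders_in_coset // leaders_per_set

probability = Fraction(good_sets, total_sets)
worst_case_trials = total_sets - good_sets + 1   # all bad sets first, then one good set

print(total_sets, good_sets, probability, worst_case_trials)
```

The worst case corresponds to trying every one of the 48 unsolvable sets before reaching the first solvable one, which agrees with the count 56 − 7 = 49 in the remark.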
References

[AP] E. F. Assmus and V. Pless, On the covering radius of extremal self-dual codes, IEEE Trans. Inform. Theory 29 (1983), 359–363.
[CP] J. H. Conway and V. Pless, On the enumeration of self-dual codes, J. Combin. Theory Ser. A 28 (1980), 26–53.
[CS] J. H. Conway and N. J. A. Sloane, A new upper bound on the minimal distance of self-dual codes, IEEE Trans. Inform. Theory 36 (1990), 1319–1333.
[G1] P. Gaborit, W. C. Huffman, J.-L. Kim, and V. Pless, On the classification of extremal additive codes over GF(4), in: Proceedings of the 37th Allerton Conference on Communication, Control, and Computing, UIUC, Sep. 1999, 535–544.
[G2] P. Gaborit, W. C. Huffman, J.-L. Kim, and V. Pless, On additive GF(4) codes, in: Codes and Association Schemes (A. Barg and S. Litsyn, eds.), DIMACS Ser. Discrete Math. Theoret. Comput. Sci. 56, Amer. Math. Soc., Providence, RI 2001, 135–149.
[G3] P. Gaborit, J.-L. Kim, and V. Pless, Decoding binary R(2, 5) by hand, preprint, 2000.
[H] G. Höhn, Self-dual codes over the Kleinian four group, preprint, 1996; updated version in http://xxx.lanl.gov/ (math.CO/0005266).
[MS] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes, North-Holland, New York 1977.
[P1] V. Pless, Decoding the Golay codes, IEEE Trans. Inform. Theory IT-32 (1986), 561–567.
[P2] V. Pless, Introduction to the Theory of Error-Correcting Codes, 3rd ed., John Wiley and Sons, New York 1998.
[RS] E. M. Rains and N. J. A. Sloane, Self-dual codes, in: Handbook of Coding Theory (V. S. Pless and W. C. Huffman, eds.), Elsevier, Amsterdam 1998, 177–294.

J.-L. Kim, V. Pless, University of Illinois–Chicago, Mathematics, Statistics, and Computer Science (M/C 249), 322 SEO, 851 S. Morgan, Chicago, IL 60607-7045, U.S.A.
On the maximum size of a hole in an incomplete t-wise balanced design with specified minimum block size Donald L. Kreher and Rolf S. Rees
Abstract. We derive a general upper bound on the size of a hole in an incomplete t-wise balanced design of order v and index λ, given that its minimum block size is k ≥ t + 1: if h is the size of the hole, then h ≤ (v + (k − t)(t − 2) − 1)/(k − t + 1). We then show that this bound is sharp infinitely often when t = 2 or 3, in that for each h ≥ t and each k ≥ t + 1, (t, h, k) = (3, 3, 4), there exists an ItBD meeting the bound. 2000 Mathematics Subject Classification: 05B05.
Codes and Designs, Ohio State Univ. Math. Res. Inst. Publ. 10, © Walter de Gruyter 2002

1. Introduction

A t-wise balanced design (tBD) of type t-(v, K, λ) is a pair (X, B) where X is a v-element set of points and B is a collection of subsets of X called blocks, with the property that the size of every block is in K and every t-element subset of X is contained in exactly λ blocks. If K is a set of positive integers strictly between t and v, then we say the tBD is proper.

An incomplete t-wise balanced design (ItBD) of type t-(v, h, K, λ) is a triple (X, H, B) where X is a v-element set of points, H ⊆ X is an h-element subset (called the hole), and B is a collection of subsets of X called blocks, such that every t-element subset of points is either contained in the hole or in exactly λ blocks, but not both. Thus, an ItBD of type t-(v, h, K, λ) is equivalent to a tBD of type t-(v, K ∪ {h}, λ) having a block of size h which is repeated λ times. In particular, when λ = 1, a tBD of type t-(v, K, λ) is an ItBD of type t-(v, h, K, 1) for any h ∈ K, provided of course that the tBD actually has a block of size h.

In a recent article [KR], the authors prove the following result:

Theorem 1.1 ([KR, Theorem 1.2]). Let (X, H, B) be a proper ItBD with t ≥ 2. If h = |H| ≥ t is the size of the hole in (X, H, B), then h ≤ (v − 1)/2 when t is even, while h ≤ v/2 when t is odd.

Setting λ = 1, we verify Kramer's conjecture for all t ≥ 2:
Corollary 1.2 ([KR, Corollary 1.3]). In any proper tBD (X, B) of type t-(v, K, 1), we have k ≤ (v − 1)/2 when t is even, while k ≤ v/2 when t is odd, where k is the size of any block in (X, B).

Moreover, in [KR, Theorems 3.3, 3.5, 3.8] it was shown that for every t ≥ 2 and every h ≥ t + 1, there exists an ItBD (with λ as a function of h and t) meeting the bounds of Theorem 1.1, and that any ItBD meeting this bound must have k = t + 1 as its minimum block size. This raises the question of what happens if we prescribe the minimum block size in the ItBD to be larger than t + 1. In this article we derive the following upper bound: if (X, H, B) is a proper ItBD of type t-(v, h, K, λ) with h ≥ t ≥ 2 and min K = k ≥ t + 1, then

$$h \le \frac{v + (k-t)(t-2) - 1}{k-t+1}.$$

We will show that this bound is sharp when t = 2 or 3, in that for each h ≥ t and each k ≥ t + 1, (t, h, k) ≠ (3, 3, 4), there exists an ItBD (with λ a function of h and k) meeting this bound.
2. The upper bound

In this section we prove (Theorem 2.1) the bound mentioned in Section 1. This is in fact a generalization of Lemma 1.6 in [KR], and we generalize the technique used therein to prove our result here. We will require the following terminology: if (X, B) is a tBD, then an α-parallel class of blocks in (X, B) is a subset B′ ⊆ B with the property that each point x ∈ X is contained in exactly α of the blocks in B′.

Theorem 2.1. If (X, H, B) is a proper ItBD of type t-(v, h, K, λ) with h ≥ t ≥ 2 and min K = k ≥ t + 1, then

$$h \le \frac{v + (k-t)(t-2) - 1}{k-t+1}.$$
Proof. Let S be a fixed (t − 2)-element subset of H and write H = S ∪ {x_1, x_2, x_3, ..., x_{h−t+2}}. Consider the derived design with respect to S ∪ {x_i}, where i is a fixed element of {1, 2, ..., h − t + 2}. Because H is a hole in the original ItBD, the blocks in the derived design form a λ-parallel class of blocks, each of size at least k − t + 1, on the v − h points of X \ H; call this set of blocks ℬ_i. Again because H is a hole in the original ItBD, we have ℬ_i ∩ ℬ_j = ∅ for all 1 ≤ i < j ≤ h − t + 2. (It may well be that as sets there is a block B_i ∈ ℬ_i and a block B_j ∈ ℬ_j that are equal; however, they will have arisen from distinct blocks in (X, H, B) and so are distinct as blocks.) Thus, as i ranges over {1, 2, ..., h − t + 2}, we obtain h − t + 2 λ-parallel classes of blocks, each block of size at least k − t + 1, on the v − h points of X \ H.

Now because k ≥ t + 1, we have k − t + 1 ≥ 2, so let y, y′ ∈ X \ H with y ≠ y′. The pair y, y′ cannot occur together in more than λ blocks in ℬ_S = ℬ_1 ∪ ℬ_2 ∪ ··· ∪ ℬ_{h−t+2}, for otherwise the t-element set {y, y′} ∪ S would occur in more than λ blocks in (X, H, B). Thus, by counting the points other than y in the blocks of ℬ_S which contain the fixed point y ∈ X \ H, we have

$$\lambda (k-t)(h-t+2) \le \lambda (v-h-1),$$

from which we have

$$h \le \frac{v + (k-t)(t-2) - 1}{k-t+1}.$$
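The bound of Theorem 2.1 is easy to evaluate exactly. The following is a minimal sketch in Python; the function names `hole_bound` and `max_hole_size` are illustrative choices, not notation from the paper. It uses exact rational arithmetic so that one can see when the bound is an integer, which is necessary for it to be met with equality.

```python
from fractions import Fraction


def hole_bound(v: int, k: int, t: int) -> Fraction:
    """Right-hand side of the bound in Theorem 2.1 (hypothetical helper name)."""
    if t < 2 or k < t + 1:
        raise ValueError("Theorem 2.1 assumes t >= 2 and k >= t + 1")
    return Fraction(v + (k - t) * (t - 2) - 1, k - t + 1)


def max_hole_size(v: int, k: int, t: int) -> int:
    """Largest integer h permitted by the bound."""
    bound = hole_bound(v, k, t)
    return bound.numerator // bound.denominator


if __name__ == "__main__":
    # Parameters v = 8, k = 5, t = 3 give the bound (8 + 2 - 1) / 3.
    print(hole_bound(8, 5, 3))
    # For t = 2 the bound reduces to (v - 1) / (k - 1).
    print(hole_bound(13, 4, 2))
```

For the parameter families used later in the paper, v = (k − 2)h − (k − 4) with t = 3 and v = (k − 1)h + 1 with t = 2, the function returns exactly h.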
Remark 2.2. Note that since S was a fixed but arbitrary (t − 2)-element subset of H, we have equality in Theorem 2.1 if and only if every block that intersects the hole in exactly t − 1 points has size k and among these blocks each pair of elements from X \ H is covered exactly $\lambda \binom{h}{t-2}/(t-1)$ times. This completely characterizes the case for t = 2:

Corollary 2.3. In any incomplete 2-(v, h, K, λ) design with 2 ≤ h < v and min K = k ≥ 3, we have

$$h \le \frac{v-1}{k-1},$$

with equality occurring if and only if every block has size k and intersects the hole (in exactly one point).

Corollary 2.4. In any incomplete 3-(v, h, K, λ) design with 3 ≤ h < v and min K = k ≥ 4, we have

$$h \le \frac{v+k-4}{k-2},$$

with equality occurring only if every block that intersects the hole does so in exactly two points and has size k.

Proof. From Theorem 2.1 and Remark 2.2 we need only show that when h = (v + k − 4)/(k − 2) no block intersects the hole in exactly one point. Suppose, to the contrary, that such a block B exists, and let x be the unique point in the intersection of B with the hole. Then taking the derived design through x yields an incomplete 2-(v − 1, h − 1, K − 1, λ) design with 2 ≤ h − 1 < v − 1 and min(K − 1) = k − 1 ≥ 3, with

$$h - 1 = \frac{v+k-4}{k-2} - 1 = \frac{v-2}{k-2},$$

i.e.

$$h - 1 = \frac{(v-1)-1}{(k-1)-1}.$$

But in this derived design there is a block B \ {x} that does not intersect the hole, contradicting Corollary 2.3.

Remark 2.5. With regards to Corollary 2.4, it is not necessary that every block intersects the hole in order for equality to occur. For example, let X = {a, b, c} ∪ {1, 2, 3, 4, 5}, H = {a, b, c}, and

B = {{1, 2, 3, 4, 5}, {1, 2, 3, 4, 5}, {1, 2, 3, 4, 5}} ∪ {{x, y, i, j, k} : x, y ∈ H and i, j, k ∈ X \ H}.

This is an incomplete 3-(8, 3, {5}, 6) design (meeting the bound of Corollary 2.4) with the three copies of {1, 2, 3, 4, 5} disjoint from the hole H = {a, b, c}.

We conclude this section by observing that the argument used in the proof of Corollary 2.4 can be easily generalized to show that in any incomplete t-(v, h, K, λ) design meeting the bound of Theorem 2.1 the number of blocks which intersect the hole in exactly t − 2 points is zero.
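The example of Remark 2.5 is small enough to check by direct enumeration. The sketch below builds the block collection (with the three repeated copies of {1, 2, 3, 4, 5}) and counts how often each 3-element subset is covered; the helper names are illustrative and not from the paper.

```python
from collections import Counter
from itertools import combinations

hole = ["a", "b", "c"]
outside = [1, 2, 3, 4, 5]
points = hole + outside

# Three copies of the block disjoint from the hole, plus every block made of
# two hole points and three points outside the hole.
blocks = [frozenset(outside)] * 3
blocks += [
    frozenset(pair) | frozenset(triple)
    for pair in combinations(hole, 2)
    for triple in combinations(outside, 3)
]


def triple_coverage(block_list):
    """Count, for each 3-subset, the number of blocks (with repetition) containing it."""
    cover = Counter()
    for block in block_list:
        for triple in combinations(block, 3):
            cover[frozenset(triple)] += 1
    return cover


def is_incomplete_3_design(points, hole, block_list, lam):
    """True if triples inside the hole are uncovered and all others are covered lam times."""
    cover = triple_coverage(block_list)
    hole_set = set(hole)
    for triple in combinations(points, 3):
        expected = 0 if set(triple) <= hole_set else lam
        if cover[frozenset(triple)] != expected:
            return False
    return True


if __name__ == "__main__":
    print(len(blocks), is_incomplete_3_design(points, hole, blocks, 6))
```

Note that the bound of Corollary 2.4 for v = 8 and k = 5 is (8 + 5 − 4)/(5 − 2) = 3, which equals the hole size here.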
3. Meeting the bounds for t = 2 and t = 3

In this section, we will show that the bound of Theorem 2.1 is sharp infinitely often when t = 2 or 3. We begin with t = 3, using the technique in Section 3 of [KR] to construct our designs. If Y ⊆ X, let Sym(Y) denote the symmetric group on Y.

Theorem 3.1. For each h ≥ 3 and each k ≥ 4, (h, k) ≠ (3, 4), there exists an I3BD (X, H, B) of type 3-(v, h, {k}, λ) where v = (k − 2)h − (k − 4) and

$$\lambda = \binom{v-3}{k-3}\,(k-1)!\,\binom{v-h-1}{k-1},$$

having Sym(H) × Sym(X \ H) as an automorphism group.

Proof. There are three orbits Δ_0, Δ_1, Δ_2 of 3-element subsets that need to be covered, where Δ_i is the set of 3-element subsets that intersect the hole in exactly i points. Similarly, there are three orbits Γ_0, Γ_1, Γ_2 of blocks (k-element subsets) that are available, where Γ_j is the set of all k-element subsets that intersect the hole in exactly j points. Thus,

$$|\Gamma_j| = \binom{h}{j}\binom{v-h}{k-j}.$$

Now, consider the 3 by 3 matrix M whose [i, j]-entry is

$$M[i,j] = |\{B \in \Gamma_j : T \subseteq B\}|,$$

where T is any fixed representative of Δ_i. Then the design whose existence is asserted by the statement of the theorem exists if and only if there is a non-negative integer vector $\vec{u}$ such that

$$M\vec{u} = \lambda J, \qquad J = [1, 1, 1]^T.$$

We now proceed to show that such a vector $\vec{u}$ exists. The case k = 4 is handled in [KR, Theorem 3.3], so we can henceforth assume that k ≥ 5. The matrix M is given explicitly by

$$M = \begin{bmatrix}
\binom{v-h-3}{k-3} & h\binom{v-h-3}{k-4} & \binom{h}{2}\binom{v-h-3}{k-5} \\
0 & \binom{v-h-2}{k-3} & (h-1)\binom{v-h-2}{k-4} \\
0 & 0 & \binom{v-h-1}{k-3}
\end{bmatrix}.$$
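Now observe that for i = 0, 1, 2, the sum along row i of M is

$$\binom{v-3}{k-3} - \sum_{\alpha=3-i}^{k-3} \binom{h-i}{\alpha}\binom{v-h-(3-i)}{k-3-\alpha}.$$

Hence, we first solve $M\vec{v} = \vec{w}$, where

$$\vec{w} = \left[\; \sum_{\alpha=3}^{k-3} \binom{h}{\alpha}\binom{v-h-3}{k-3-\alpha},\;\; \sum_{\alpha=2}^{k-3} \binom{h-1}{\alpha}\binom{v-h-2}{k-3-\alpha},\;\; \sum_{\alpha=1}^{k-3} \binom{h-2}{\alpha}\binom{v-h-1}{k-3-\alpha} \;\right]^T.$$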
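We see that

$$\begin{aligned}
v_2 &= \frac{1}{\binom{v-h-1}{k-3}} \sum_{\alpha=1}^{k-3} \binom{h-2}{\alpha}\binom{v-h-1}{k-3-\alpha} \\
&= \frac{1}{\binom{v-h-1}{k-3}} \left[ \binom{v-3}{k-3} - \binom{v-h-1}{k-3} \right] \\
&= \frac{\binom{v-3}{k-3}}{\binom{v-h-1}{k-3}} - 1,
\end{aligned}$$

$$\begin{aligned}
v_1 &= \frac{1}{\binom{v-h-2}{k-3}} \left[ \sum_{\alpha=2}^{k-3} \binom{h-1}{\alpha}\binom{v-h-2}{k-3-\alpha} - (h-1)\binom{v-h-2}{k-4}\, v_2 \right] \\
&= \frac{1}{\binom{v-h-2}{k-3}} \left[ \sum_{\alpha=1}^{k-3} \binom{h-1}{\alpha}\binom{v-h-2}{k-3-\alpha} - (h-1)\binom{v-h-2}{k-4} \frac{\binom{v-3}{k-3}}{\binom{v-h-1}{k-3}} \right] \\
&= \frac{1}{\binom{v-h-2}{k-3}} \left[ \binom{v-3}{k-3} - \binom{v-h-2}{k-3} - \frac{(h-1)(k-3)}{v-h-1}\binom{v-3}{k-3} \right] \\
&= -1, \qquad \text{because } v - h - 1 = (h-1)(k-3),
\end{aligned}$$

and

$$\begin{aligned}
v_0 &= \frac{1}{\binom{v-h-3}{k-3}} \left[ \sum_{\alpha=3}^{k-3} \binom{h}{\alpha}\binom{v-h-3}{k-3-\alpha} - h\binom{v-h-3}{k-4}\, v_1 - \binom{h}{2}\binom{v-h-3}{k-5}\, v_2 \right] \\
&= \frac{1}{\binom{v-h-3}{k-3}} \left[ \sum_{\alpha=1}^{k-3} \binom{h}{\alpha}\binom{v-h-3}{k-3-\alpha} - \binom{h}{2}\binom{v-h-3}{k-5} \frac{\binom{v-3}{k-3}}{\binom{v-h-1}{k-3}} \right] \\
&= \frac{1}{\binom{v-h-3}{k-3}} \left[ \binom{v-3}{k-3} - \binom{v-h-3}{k-3} - \frac{h(h-1)(k-3)(k-4)}{2(v-h-1)(v-h-2)} \binom{v-3}{k-3} \right] \\
&= \frac{\binom{v-3}{k-3}}{\binom{v-h-3}{k-3}} \left[ 1 - \frac{h(k-4)}{2(v-h-2)} \right] - 1 \\
&= \frac{\binom{v-3}{k-3}}{\binom{v-h-3}{k-3}} \cdot \frac{(h-2)(k-2)}{2(v-h-2)} - 1.
\end{aligned}$$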
Thus, we take $\vec{v} = [v_0, v_1, v_2]^T$. Now because h ≥ 3, it is easy to see that v_0 > −1, v_1 = −1, and v_2 > −1. Hence $\vec{v} + J$ is a non-negative rational vector which, by our choice of $\vec{w}$, is the unique solution to

$$M(\vec{v} + J) = \binom{v-3}{k-3} J.$$

Then setting

$$\vec{u} = (k-1)!\,\binom{v-h-1}{k-1}\,(\vec{v} + J),$$

we see that $\vec{u}$ is a non-negative integer vector for which

$$M\vec{u} = \binom{v-3}{k-3}\,(k-1)!\,\binom{v-h-1}{k-1}\, J = \lambda J$$

as desired. The result follows.

Remark 3.2. Note that in the solution vector $\vec{u}$ in the proof of Theorem 3.1, we have u_1 = 0. This means that in each design constructed by this result the orbit Γ_1 is never used. That is, there are no blocks which intersect the hole in exactly one point, as must be the case by Corollary 2.4. With regards to the parameters (h, k) = (3, 4), it is easy to show that no I3BD of type 3-(6, 3, {4}, λ) exists, for any λ > 0.

Designs meeting the bound of Theorem 2.1, for t = 2, are now easily obtained.

Theorem 3.3. For each h ≥ 2 and each k ≥ 3, (h, k) ≠ (2, 3), there exists an I2BD (X, H, B) of type 2-(v, h, {k}, λ) where v = (k − 1)h + 1 and

$$\lambda = \binom{v-2}{k-2}\, k!\, \binom{v-h-1}{k}.$$

Proof. From Theorem 3.1, there is an I3BD of type 3-(v + 1, h + 1, {k + 1}, λ), where v + 1 = ((k + 1) − 2)(h + 1) − ((k + 1) − 4) = (k − 1)h + 2 and

$$\lambda = \binom{v+1-3}{k+1-3}\,(k+1-1)!\,\binom{v+1-(h+1)-1}{k+1-1}.$$

Take the derived design through a point in the hole to get the desired I2BD.

Remark 3.4. One can of course obtain infinite classes of 2-designs with λ = 1 meeting the bound of Theorem 2.1 by starting with resolvable BIBDs with λ = 1 (e.g. one-factorizations, Kirkman Triple Systems, etc.). With regard to the parameters (h, k) = (2, 3) in Theorem 3.3, it is a simple matter to construct I2BDs of type 2-(5, 2, {3}, λ) for any even λ > 0.
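The linear system in the proof of Theorem 3.1 can be checked numerically for small parameters. The sketch below is an illustration rather than part of the original argument: it builds the 3 by 3 matrix M from the entry formula M[i, j] = C(h − i, j − i) · C(v − h − (3 − i), k − 3 − (j − i)), solves M u = λJ by exact back-substitution, and reports u. The function names are hypothetical, and the code assumes k ≥ 5 as in the proof (so that the diagonal of M is non-zero).

```python
from fractions import Fraction
from math import comb, factorial


def binom(n, r):
    """Binomial coefficient that is 0 when either argument is negative."""
    return comb(n, r) if n >= 0 and r >= 0 else 0


def orbit_matrix(h, k):
    """Return v and the upper-triangular orbit incidence matrix M of the proof."""
    v = (k - 2) * h - (k - 4)
    M = [
        [binom(h - i, j - i) * binom(v - h - (3 - i), k - 3 - (j - i)) for j in range(3)]
        for i in range(3)
    ]
    return v, M


def solve_upper_triangular(M, rhs):
    """Exact back-substitution for an upper-triangular system."""
    n = len(M)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        partial = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = Fraction(rhs[i] - partial, M[i][i])
    return x


def block_multiplicities(h, k):
    """Return (v, lambda, u) with M u = lambda * J for the parameters of Theorem 3.1."""
    v, M = orbit_matrix(h, k)
    lam = comb(v - 3, k - 3) * factorial(k - 1) * comb(v - h - 1, k - 1)
    return v, lam, solve_upper_triangular(M, [lam] * 3)


if __name__ == "__main__":
    for h, k in [(3, 5), (4, 5), (3, 6)]:
        v, lam, u = block_multiplicities(h, k)
        print(h, k, v, lam, [int(x) for x in u])
```

In every case tried the middle entry u_1 comes out as zero, in line with Remark 3.2. For (h, k) = (3, 5) the solution is a multiple of (3, 0, 1), which is the block pattern of the example in Remark 2.5.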
4. Conclusion

It would be of great interest to determine the effectiveness of the bound of Theorem 2.1 for t ≥ 4. Note that when k = t + 1, this bound reduces to h ≤ (v + t − 3)/2 (which, incidentally, is equivalent to Theorem 2.1 in [K]) and so cannot be sharp in the case t ≥ 4 (see Theorem 1.1). That is, one must restrict oneself to ItBDs with min K = k ≥ t + 2.

Acknowledgment. The authors thank Malcolm Greig for some useful suggestions for this article.

Note added in proof. The effectiveness of the bound in Theorem 2.1 has recently been investigated in [AKLR]; while much remains to be established, this bound has been shown to be asymptotically sharp for every t ≥ 4.
References

[AKLR] I. Adamczak, D. L. Kreher, A. C. H. Ling and R. S. Rees, Further results on the maximum size of a hole in an incomplete t-wise balanced design with specified minimum block size, J. Combin. Des., to appear.

[K] E. S. Kramer, Some results on t-wise balanced designs, Ars Combin. 15 (1983), 179–192.

[KR] D. L. Kreher and R. S. Rees, A hole-size bound for incomplete t-wise balanced designs, J. Combin. Des. 9 (2001), 269–284.
D. L. Kreher
Michigan Technological University
1400 Townsend Drive
Houghton, Michigan 49931-1295, U.S.A.

R. S. Rees
Department of Mathematics and Statistics
Memorial University of Newfoundland
St. John's, Newfoundland, A1C 5S7, Canada
On a family of cocyclic Hadamard matrices

Warwick de Launey
Abstract. We exhibit a large family of cocyclic Hadamard matrices. We thereby obtain a large family of maximal sized relative difference sets with central forbidden subgroup of size two. For example, we show that any group of odd square-free order p_1 p_2 ... p_n (where p_i is prime) may be embedded in a group of order 2^{n+1} (p_1 + 1)(p_2 + 1) ... (p_n + 1) p_1 p_2 ... p_n containing such a relative difference set.

2000 Mathematics Subject Classification: primary 05B20; secondary 05B10.
1. The main result

Let G be a finite group. Let Z_2 denote the multiplicative group {1, −1}. A normalised binary 2-cocycle f : G × G → Z_2 is a map satisfying the equations

$$\text{(a)}\;\; f(x,1) = f(1,y) = 1 \quad\text{and}\quad \text{(b)}\;\; f(x,y)\,f(xy,w) = f(x,yw)\,f(y,w). \tag{1.1}$$

A Hadamard matrix H of order n is an n × n (1, −1)-matrix such that H Hᵀ = nI_n. If P and Q are n × n signed permutation matrices, then PHQ is also a Hadamard matrix. The set of all such matrices is the equivalence class of H. Suppose that for some map g : G → Z_2 and some cocycle f over G, the Hadamard matrix H contains in its equivalence class a matrix of the form given on the right-hand side of equation (1.2):

$$F = [\, g(xy)\, f(x,y) \,]_{x,y\in G}. \tag{1.2}$$
Then H is said to be cocyclic over the group G with Hadamard cocycle f. Cocyclic development of combinatorial designs was introduced in [8] and [1]. We warn the reader that some authors take a more restrictive view and define to be cocyclic only those Hadamard matrices of the form given on the right-hand side of equation (1.2). That definition has its advantages, but we prefer to employ our definition because it allows us to relate several cocyclic Hadamard matrices over possibly different groups to a single combinatorial object, namely, an equivalence class of Hadamard matrices, and more importantly to characterize these cocyclic Hadamard matrices in terms of the regular subgroups of the automorphism group of any single representative of that equivalence class.

The purpose of this paper is to prove the following theorem and to discuss some of its extensions.

Theorem 1.1. If q_1, q_2, ..., q_r ≡ 1 (mod 4) and p_1, p_2, ..., p_s ≡ 3 (mod 4) are prime powers, and k_1, k_2, ..., k_r and m_1, m_2, ..., m_s are non-negative integers, then there exists a cocyclic Hadamard matrix of order

$$\prod_{i=1}^{r} 2(q_i + 1) \;\prod_{j=1}^{s} (p_j + 1) \;\prod_{i=1}^{r} q_i^{k_i} \;\prod_{j=1}^{s} p_j^{m_j}.$$
A proof of Theorem 1.1 was presented at conferences and seminars in 1993 and 1994. This paper (and [4]) contains many of the ideas in [2]. As noted at the time, Theorem 1.1 gives a large family of relative difference sets with forbidden subgroup of size two. As noted in [6], a Hadamard group, as defined by Ito [10], of order 2n exists if and only if there is a cocyclic Hadamard matrix of order n; so Theorem 1.1 also implies the existence of many Hadamard groups. We will say more about which Hadamard groups arise in the last section. The rest of this paper is organized as follows. Section 2 defines cocyclic orthogonal designs and then proves some basic results about their regular group actions. Section 3 establishes some notation and well known facts about Paley matrices. Then our treatment of Theorem 1.1 is presented over several sections. We start by proving various special cases, and then we enlarge the scope of our proof introducing as needed a number of general ideas. Our discussion has three themes: (a) heavy use is made of special properties of the Paley conference matrix; (b) the device of substituting suitable matrices into a smaller cocyclic orthogonal matrix to obtain a larger cocyclic orthogonal design is used in various ways; and finally (c) the correspondence between cocyclic Hadamard matrices and normal relative difference sets is exploited via group ring manipulations.
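To make the cocycle equations (1.1) and the matrix form (1.2) concrete, here is a small sketch. The choices are illustrative assumptions, not taken from the paper: G is the group Z_2 × Z_2 written additively, f(x, y) = (−1)^{x·y} with x·y the usual dot product, and g is identically 1. The code checks the normalisation and cocycle identities by brute force and tests whether the resulting matrix is Hadamard.

```python
from itertools import product

# Illustrative group: Z_2 x Z_2, written additively.
G = list(product((0, 1), repeat=2))
IDENTITY = (0, 0)


def mul(x, y):
    """Group operation of Z_2 x Z_2 (componentwise addition mod 2)."""
    return tuple((a + b) % 2 for a, b in zip(x, y))


def f(x, y):
    """Illustrative cocycle f(x, y) = (-1)^(x . y); an assumption for this sketch."""
    return -1 if sum(a * b for a, b in zip(x, y)) % 2 else 1


def is_normalised_cocycle(f, G, mul, e):
    """Brute-force check of conditions (a) and (b) of equation (1.1)."""
    normalised = all(f(x, e) == 1 and f(e, x) == 1 for x in G)
    cocycle = all(
        f(x, y) * f(mul(x, y), w) == f(x, mul(y, w)) * f(y, w)
        for x in G for y in G for w in G
    )
    return normalised and cocycle


def cocyclic_matrix(f, g, G, mul):
    """Matrix [g(xy) f(x, y)] indexed by G, as in equation (1.2)."""
    return [[g(mul(x, y)) * f(x, y) for y in G] for x in G]


def is_hadamard(H):
    """Check H H^T = n I for a square (1, -1)-matrix H."""
    n = len(H)
    return all(
        sum(H[i][c] * H[j][c] for c in range(n)) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )


F = cocyclic_matrix(f, lambda z: 1, G, mul)

if __name__ == "__main__":
    print(is_normalised_cocycle(f, G, mul, IDENTITY), is_hadamard(F))
    for row in F:
        print(row)
```

With these choices F is the Sylvester Hadamard matrix of order 4, up to the ordering of G.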
2. Cocyclic orthogonal designs

Let X = {x_i | i = 1, ..., k} be a set of commuting indeterminates, and let a_1, a_2, ..., a_k be a sequence of positive integers. An orthogonal design OD(n; a_1, a_2, ..., a_k) of order n and type (a_1, a_2, ..., a_k) on X is an n × n matrix D whose non-zero entries are taken from the set of signed indeterminates X± = {±x_i | i = 1, ..., k} so that

$$D D^{\top} = \Big( \sum_{i=1}^{k} a_i x_i^2 \Big) I_n.$$
The orthogonal design D is said to be cocyclic with cocycle f if there are a map h : G → X± ∪ {0} and signed permutation matrices P and Q such that

$$P D Q = [\, h(xy)\, f(x,y) \,]_{x,y\in G}.$$

We let Aut(D) denote the group of pairs of signed permutation matrices P and Q such that P D Qᵀ = D. The following 2n × 2n (0, X±)-matrix

$$E_D = \begin{bmatrix} D & -D \\ -D & D \end{bmatrix}$$

is called the expanded design of D. We let PermAut(E_D) denote the group of pairs (P, Q) of permutation matrices such that P E_D Qᵀ = E_D. If we put

$$P = Q = \begin{bmatrix} 0 & I \\ I & 0 \end{bmatrix},$$

then P E_D Qᵀ = E_D. So the ordered pair ζ = (P, Q) is in PermAut(E_D). Now

$$E_D E_D^{\top} = E_D^{\top} E_D = 2 \Big( \sum_{i=1}^{k} a_i x_i^2 \Big) \begin{bmatrix} I & -I \\ -I & I \end{bmatrix}. \tag{2.1}$$
So the inner product of any pair of distinct rows (or columns) of E_D is either 0 or −2 Σ_{i=1}^{k} a_i x_i². Now if two rows (or columns) have inner product equal to −2 Σ_{i=1}^{k} a_i x_i², then they are negations of each other. Therefore, the rows (and columns) of E_D can be uniquely partitioned into pairs containing a row (or column) and its negation, and the automorphism ζ of E_D may be characterized as the automorphism which interchanges each row with its negation and interchanges each column with its negation. Notice that any automorphism of E_D must preserve the orbits of ζ; so ζ is central in PermAut(E_D).

Definition 2.1. Let D be an OD(n; a_1, a_2, ..., a_k). A subgroup of order 2n in PermAut(E_D) is said to act regularly on E_D if it acts transitively on the rows and columns of E_D. It is said to act normally if it contains the involution ζ which interchanges each row and each column with their negations. In this case, it is said to act normally with respect to ζ.

Standard arguments imply the following lemma.

Lemma 2.2. Let D be an OD(n; a_1, a_2, ..., a_k). A group R of order 2n acts regularly on E_D if and only if E_D can be indexed over R so that for some function g : R → {0} ∪ X±,

$$E_D = [\, g(xy) \,]_{x,y\in R}. \tag{2.2}$$
Under such an action, the element a ∈ R acts on E_D by moving row x to row xa and column y to column a⁻¹y.

The properties (1.1) allow the construction of an extension group R_f on the set of ordered pairs {(x, a) | x ∈ G, a ∈ Z_2} under the operation

$$(x, a)(y, b) = (xy,\; f(x,y)\,ab). \tag{2.3}$$
Note that (1, −1) is a central involution in R_f.

Theorem 2.3. An orthogonal design D is cocyclic with cocycle f over G if and only if R_f acts normally with respect to (1, −1) on E_D.

Proof. Suppose that D is cocyclic with cocycle f over G. Then there are signed permutation matrices P and Q and a (0, X±)-function h on G such that

$$D = P\, [\, f(x,y)\, h(xy) \,]_{x,y\in G}\, Q.$$

Write P = U − V and Q = X − Y where the matrices U, V, X and Y are (0, 1)-matrices. Then the matrices

$$S = \begin{bmatrix} U & V \\ V & U \end{bmatrix} \quad\text{and}\quad T = \begin{bmatrix} X & Y \\ Y & X \end{bmatrix}$$

are permutation matrices such that

$$E_D = S \begin{bmatrix} f(x,y)h(xy) & -f(x,y)h(xy) \\ -f(x,y)h(xy) & f(x,y)h(xy) \end{bmatrix} T.$$

Putting E = [ ab h(xy) f(x, y) ]_{(x,a),(y,b)∈R_f}, we therefore have E_D = SET. Now define g : R_f → {0} ∪ X± so that g((x, a)) = a h(x). Then

$$g((x,a)(y,b)) = g((xy,\; ab f(x,y))) = ab\, f(x,y)\, h(xy),$$

and E = [ g((x, a)(y, b)) ]_{(x,a),(y,b)∈R_f}. By Lemma 2.2, the extension group R_f acts regularly on E. Indeed, the automorphism induced by (1, −1) as per Lemma 2.2 is equal to the involution ζ which maps each row and column to its negation. Since the element (1, −1) ∈ R_f induces the automorphism ζ, the action of R_f on E is normal with respect to (1, −1). Moreover, if (A, B) is the pair of permutation matrices corresponding to the action of an element of R_f on E, then the pair of permutation matrices (S A Sᵀ, Tᵀ B T) corresponds to the action of that element on E_D. Furthermore, S and T commute with the matrix

$$\begin{bmatrix} 0 & I \\ I & 0 \end{bmatrix};$$

so the element (1, −1) of R_f acts like ζ on E_D. We have therefore exhibited a normal action of R_f on E_D.

Conversely, suppose that the extension group R_f acts normally with respect to (1, −1) on E_D. By Lemma 2.2 there is a (0, X±)-function g on R_f such that

$$E_D = [\, g((x,a)(y,b)) \,]_{(x,a),(y,b)\in R_f}.$$

Moreover, since (1, −1) acts by interchanging each row (and column) with its negation, we have

$$g((x, 1)) = -g((x, -1)), \tag{2.4}$$

and for some signed permutation matrices P and Q we have

$$P D Q = [\, g((x,1)(y,1)) \,]_{(x,1),(y,1)\in R_f} = [\, g((xy, f(x,y))) \,]_{x,y\in G}.$$

Now define the (0, X±)-function h on G via the equation h(x) = g((x, 1)). Then by equation (2.4), we have g((xy, f(x, y))) = h(xy) f(x, y), and indeed P D Q = [ f(x, y) h(xy) ]_{x,y∈G}. This completes the proof.
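The extension group R_f of equation (2.3) can be built explicitly for a small cocycle. The sketch below is illustrative only: it assumes G = Z_2 × Z_2 written additively and the bilinear cocycle f(x, y) = (−1)^{x_1 y_1 + x_2 y_2 + x_1 y_2}, neither of which is taken from the paper. It verifies the group axioms by brute force and checks that (1, −1), here written ((0, 0), −1), is a central involution.

```python
from itertools import product

G = list(product((0, 1), repeat=2))  # Z_2 x Z_2, written additively


def mul(x, y):
    return tuple((a + b) % 2 for a, b in zip(x, y))


def f(x, y):
    """Illustrative bilinear cocycle (an assumption for this sketch)."""
    exponent = x[0] * y[0] + x[1] * y[1] + x[0] * y[1]
    return -1 if exponent % 2 else 1


# Elements of R_f are pairs (x, a) with x in G and a in {1, -1}.
R_f = [(x, a) for x in G for a in (1, -1)]
ONE = ((0, 0), 1)
ZETA = ((0, 0), -1)


def ext_mul(p, q):
    """The operation (x, a)(y, b) = (xy, f(x, y) a b) of equation (2.3)."""
    (x, a), (y, b) = p, q
    return (mul(x, y), f(x, y) * a * b)


def is_group(elements, op, identity):
    """Brute-force check of closure, associativity, identity and inverses."""
    closed = all(op(p, q) in elements for p in elements for q in elements)
    associative = all(
        op(op(p, q), r) == op(p, op(q, r))
        for p in elements for q in elements for r in elements
    )
    has_identity = all(op(identity, p) == p == op(p, identity) for p in elements)
    has_inverses = all(any(op(p, q) == identity for q in elements) for p in elements)
    return closed and associative and has_identity and has_inverses


def is_central_involution(z, elements, op, identity):
    return z != identity and op(z, z) == identity and all(
        op(z, p) == op(p, z) for p in elements
    )


involutions = [p for p in R_f if p != ONE and ext_mul(p, p) == ONE]

if __name__ == "__main__":
    print(is_group(R_f, ext_mul, ONE))
    print(is_central_involution(ZETA, R_f, ext_mul, ONE))
    print(involutions)
```

For this particular f the group R_f is non-abelian of order 8 with a single involution, so it is the quaternion group of order 8, the group that appears with the Williamson array in Example 5.2.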
3. Paley matrices

We relate some well known facts about the Paley conference matrix. Let V be the vector space obtained by regarding the Galois field GF(q²) as a two dimensional vector space over its subfield GF(q) of order q. Let χ be the quadratic character on the field GF(q). A Paley conference matrix is defined as follows. Let Π be a set of distinct subspace representatives for the q + 1 one dimensional subspaces of V. Let C be the (0, ±1)-matrix

$$C = [\, \chi \det(x,y) \,]_{x,y\in\Pi} \tag{3.1}$$

where det is any alternating bilinear form on V. Then C is a Paley conference matrix of order q + 1. We note that since χ(−1) = (−1)^{(q−1)/2}, we have

$$C^{\top} = (-1)^{(q-1)/2}\, C. \tag{3.2}$$

So if q ≡ 3 (mod 4), then the matrix H = I_{q+1} + C is a Hadamard matrix. In this case, H is said to be a Paley Hadamard matrix of order q + 1. We note for later use that if we let

$$\det{}^{*}(x,y) = \begin{cases} \lambda & \text{if } x = \lambda y \text{ where } \lambda \in \mathrm{GF}(q), \\ \det(x,y) & \text{otherwise,} \end{cases}$$

then H = [ χ det*(x, y) ]_{x,y∈Π}.

It can be shown that a change in the choice for the set Π of subspace representatives or a change in the choice for the bilinear form det cannot change the equivalence class of C or H. So for each order q + 1, the designs C and H are unique up to equivalence.

In the sequel it will be useful to expand the domain of det to the group GL(2, q) of invertible linear transformations of V. Fix a basis {u, v} for V; then we obtain det(au + bv, cu + dv) = (ad − bc) det(u, v). So if we write the linear transformation α ∈ GL(2, q) as the 2 × 2 matrix

$$\begin{bmatrix} e & f \\ g & h \end{bmatrix}$$

with respect to the basis {u, v}, then

$$\det(x^{\alpha}, y^{\alpha}) = \det(\alpha)\, \det(x,y), \qquad \text{where } \det(\alpha) = eh - fg.$$

We define det on GL(2, q) accordingly.
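The construction (3.1) can be tried out directly when q is an odd prime, so that GF(q) is just the integers modulo q. The sketch below makes two assumptions for illustration: the subspace representatives are (0, 1) and (1, x) for x in GF(q), and the alternating form is det((a, b), (c, d)) = ad − bc. Prime powers that are not prime would need genuine finite-field arithmetic, which this sketch does not attempt.

```python
def quadratic_character(a, q):
    """Quadratic character of GF(q) for an odd prime q (Euler's criterion)."""
    a %= q
    if a == 0:
        return 0
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1


def paley_conference_matrix(q):
    """Paley conference matrix of order q + 1 for an odd prime q."""
    reps = [(0, 1)] + [(1, x) for x in range(q)]

    def det(u, w):
        return (u[0] * w[1] - u[1] * w[0]) % q

    return [[quadratic_character(det(u, w), q) for w in reps] for u in reps]


def transpose(A):
    return [list(col) for col in zip(*A)]


def gram(A):
    """Return A A^T."""
    return [[sum(a * b for a, b in zip(r, s)) for s in A] for r in A]


def scalar_identity(c, n):
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]


def paley_hadamard_matrix(q):
    """H = I + C, intended for primes q congruent to 3 modulo 4."""
    C = paley_conference_matrix(q)
    n = q + 1
    return [[C[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for q in (3, 5, 7):
        C = paley_conference_matrix(q)
        sign = (-1) ** ((q - 1) // 2)
        symmetric_or_skew = transpose(C) == [[sign * e for e in row] for row in C]
        print(q, gram(C) == scalar_identity(q, q + 1), symmetric_or_skew)
```

The printed checks correspond to C Cᵀ = qI and to equation (3.2).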
4. A construction for prime powers p ≡ 3 (mod 4)

Let Q_{4t} denote the generalized quaternion group

$$\langle a, b \mid a^{2t} = 1,\; b^2 = a^t,\; a^b = a^{-1} \rangle.$$

This group was first discussed in the context of Hadamard matrices by Ito [9] and Yamamoto [15]. Yamada [14] describes several families of Hadamard matrices of generalized quaternion type. All these authors essentially show how a generalized quaternion group acts on any Paley Hadamard matrix. The following lemma shows how this action can be adapted to give an action on an orthogonal design.
Lemma 4.1. Let p ≡ 3 (mod 4) be a prime power. Then there exists a cocyclic OD(p + 1; 1, p) with extension group equal to Q_{2(p+1)}.

Proof. It is well known that the automorphism group of the type I Paley Hadamard matrix of order p + 1 contains the generalized quaternion group Q_{2(p+1)} as a regular subgroup [10, Example 3]. The following construction is an adaptation of that given by de Launey and Stafford [5]. Let a_1 and a_2 be commuting indeterminates. Let D be the (±a_1, ±a_2)-matrix

$$D = [\, h(xy^{-1})\, \chi \det{}^{*}(x,y) \,]_{x,y\in\Pi}$$

where the map h : GF(p²) → {a_1, a_2} satisfies

$$h(x) = \begin{cases} a_1 & \text{if } x \in \mathrm{GF}(p), \\ a_2 & \text{otherwise.} \end{cases}$$

In other words, we have taken a Paley Hadamard matrix of order p + 1 and multiplied each diagonal entry by a_1 and each off-diagonal entry by a_2. We verify that D is an OD(p + 1; 1, p) with D Dᵀ = (a_1² + p a_2²) I_{p+1}. Consider the matrix M obtained by setting a_1 = 0. That matrix is a_2 times a Paley conference matrix of order p + 1; so it is orthogonal and by equation (3.2) we have Mᵀ = −M. Therefore the entire matrix D is orthogonal as stated above.

We now exhibit a normal regular action of the group Q_{2(p+1)} on E_D. For each one dimensional subspace of V the set of its non-zero vectors may be divided into two subsets (which we call half spaces) of the form {λ²x | λ ∈ GF(p) \ {0}} where x ≠ 0 is in V. We may choose a complete set J of distinct half space representatives so that

$$E_D = [\, h(xy^{-1})\, \chi \det{}^{*}(x,y) \,]_{x,y\in J}.$$

For example if χ(ν) = −1, then we may take J = Π ∪ νΠ (with the obvious abuse of notation). Now let S = ⟨α⟩ be any Singer cycle in GL(2, p). Set

$$a = \alpha^2 \quad\text{and}\quad b = \alpha \circ \sigma,$$

where σ maps x ∈ V to x^p using the GF(p²) multiplication. The map σ is an involution which is linear over GF(p). Indeed, det(σ) = −1. Now it is well known that there is a primitive element ω ∈ V = GF(p²) such that α maps x ∈ V to ωx. So det(α) (which equals the norm ω^{p+1} of ω) is a non-square in GF(p). It follows that for xy⁻¹ ∉ GF(p)

$$\chi \det{}^{*}\big(x^{a^i b^j}, y^{a^i b^j}\big) = \chi(\det(a))^i\, \chi(\det(b))^j\, \chi(\det{}^{*}(x,y)) = \chi(\det{}^{*}(x,y)),$$

while for xy⁻¹ = λ ∈ GF(p)

$$\begin{aligned}
\chi \det{}^{*}(x^a, y^a) &= \chi \det{}^{*}(\omega^2 x, \omega^2 y) = \chi(xy^{-1}) = \chi \det{}^{*}(x,y), \\
\chi \det{}^{*}(x^b, y^b) &= \chi \det{}^{*}(\omega x^p, \omega y^p) = \chi((x/y)^p) = \chi \det{}^{*}(x,y).
\end{aligned}$$

Moreover, h(x^a/y^a) = h(x/y), and h(x^b/y^b) = h((x/y)^p) = h(x/y). Therefore the map a^i b^j : (x, y) → (x^{a^i b^j}, y^{a^i b^j}) lies inside PermAut(E_D). It is also well known that the group generated by a and b has presentation

$$\langle a, b \mid a^{(p^2-1)/2} = 1,\; b^2 = a^{(p+1)/2},\; a^b = a^p \rangle,$$

and that the group ⟨a, b⟩ acts regularly on the set V* of non-zero vectors in V. Moreover, a and b both map half spaces to half spaces; so ⟨a, b⟩ induces an action on J. Now the half spaces are of the form {ω^{2(p+1)i} x | i = 0, 1, ..., (p − 3)/2} where x ≠ 0; so the kernel of the action is Q = ⟨a^{p+1}⟩. Factoring out by this subgroup, we obtain the group

$$\langle a, b \mid a^{p+1} = 1,\; b^2 = a^{(p+1)/2},\; a^b = a^{-1} \rangle \cong Q_{2(p+1)}$$

which acts regularly on J. Finally, a^{(p+1)/2} maps x ∈ V* to ω^{p+1}x; so, since χ(ω^{p+1}) = −1, the map a^{(p+1)/2} maps each half space to the other half space in its subspace. Hence a^{(p+1)/2} induces the automorphism ζ of E_D which interchanges each row (and column) in E_D with its negation. Therefore, by Theorem 2.3, the orthogonal design D is cocyclic with an extension group isomorphic to Q_{2(p+1)}.
5. Plug-in matrices and cocyclic matrices

So far we have proved Theorem 1.1 when r = 0, s = 1 and m_1 = 0. In order to prove the theorem just for r = 0, we adapt an old method for obtaining larger orthogonal designs from smaller orthogonal designs. Given t × t matrices A_1, ..., A_r with entries in a set {±b_1, ..., ±b_s, 0}, and an m × m matrix

$$D = [\, \epsilon_{i,j}\, a_{p(i,j)} \,]_{i,j},$$

where p(i, j) ∈ {1, ..., r}, ε_{i,j} ∈ {0, ±1} and {a_1, ..., a_r} is a set of indeterminates, we may define the block matrix

$$E = [\, \epsilon_{i,j}\, A_{p(i,j)} \,]_{i,j},$$

obtained from D by replacing the indeterminates a_i by the corresponding matrices A_i. The idea is that if D is "orthogonal" in some sense, and if the matrices A_1, A_2, ..., A_r are "suitable", then the matrix E is orthogonal. Since Williamson's paper [13] this device has been a recurring theme in the study of orthogonal matrices. See Geramita and Seberry [7] for many variations on the idea. Usually, the array D is an orthogonal design; however, as in the proof of Theorem 8.1 below, we can have indeterminates appearing in D which satisfy other constraints instead of (or in addition to) the commutativity constraints needed when D is an orthogonal design.

It is natural to ask whether there are situations in which we can use this idea to inflate cocyclic orthogonal designs. We now verify that in the following circumstances (regardless of our definition of "orthogonal" and "suitable") the inflated matrix E is cocyclic. Suppose that (under some common indexing order) for some cocycle f over the finite group G we have

$$A_i = [\, g_i(xy)\, f(x,y) \,]_{x,y\in G},$$

and suppose that, for some cocycle h over the finite group H, there exists a map g : H → {±a_1, ..., ±a_r, 0} such that

$$D = [\, g(uv)\, h(u,v) \,]_{u,v\in H}.$$

Write $g(uv) = \hat{\epsilon}_{uv}\, a_{\hat{p}(uv)}$ where $\hat{p} : H \to \{1, \ldots, r\}$ and $\hat{\epsilon}_{uv} \in \{\pm 1, 0\}$. Notice that when $\hat{\epsilon}_{uv} = 0$, we have freedom in the choice for $\hat{p}(uv)$. Therefore

$$D = [\, \hat{\epsilon}_{uv}\, a_{\hat{p}(uv)}\, h(u,v) \,]_{u,v\in H}.$$

Now, indexing by products x · u in G × H, we have

$$E = [\, \hat{\epsilon}_{uv}\, g_{\hat{p}(uv)}(xy)\, f(x,y)\, h(u,v) \,]_{x\cdot u,\; y\cdot v \in G\times H}.$$

Let $\bar{g} : G \times H \to \{\pm b_1, \ldots, \pm b_s, 0\}$ be such that $\bar{g}(x \cdot u) = \hat{\epsilon}_{u}\, g_{\hat{p}(u)}(x)$, and let f × h : G × H → Z_2 be defined by the equation (f × h)(x · u, y · v) = f(x, y) h(u, v). Then

$$E = [\, \bar{g}(xy \cdot uv)\, (f \times h)(x\cdot u,\, y\cdot v) \,]_{x\cdot u,\; y\cdot v \in G\times H}.$$

It is easily checked that the map f × h is a cocycle. Indeed, R_{f×h} is the quotient group R_f R_h obtained by factoring the direct product group R_f × R_h by the central subgroup ⟨((1, −1), (1, −1))⟩ of order two.¹ We therefore have

Proposition 5.1. Let the matrices A_1, ..., A_r, D and E be defined as above; then E is cocyclic with cocycle f × h and extension group R_f R_h.

Instances of this proposition appear in several places: see for example [1, 2, 4, 6, 8]. We illustrate the ideas with the following example.

Example 5.2. Let a, b and c be commuting indeterminates. Set

$$W = \begin{bmatrix} b & b & b \\ b & b & b \\ b & b & b \end{bmatrix}, \qquad X = Y = Z = \begin{bmatrix} c & -b & b \\ b & c & -b \\ -b & b & c \end{bmatrix}.$$

The cocycle for these matrices is the trivial cocycle over Z_3; so the extension group is Z_2 × Z_3. If we "plug" these matrices into the Williamson array

$$\begin{bmatrix} W & X & Y & Z \\ X & -W & Z & -Y \\ Y & -Z & -W & X \\ Z & Y & -X & -W \end{bmatrix}$$

then we obtain an OD(12; 3, 9) E. Now the connection between the Williamson array [13] and the quaternions Q_8 has been known for several decades; so it is no surprise that (as noted in [8]) the Williamson array is cocyclic with extension group equal to Q_8. By Proposition 5.1, the orthogonal design E is cocyclic with extension group equal to Q_8 × Z_3.

¹ Alternatively, one may think of R_f R_h as a central product of R_f and R_h in which we have identified the elements (1, 1) and (1, −1) in R_f with the corresponding elements in R_h.
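Example 5.2 can be checked numerically by substituting integers for the indeterminates b and c. The sketch below is illustrative (the function names are not from the paper): it assembles the 12 × 12 matrix E from the Williamson array and verifies that E Eᵀ = (3c² + 9b²) I, which is the defining identity of an OD(12; 3, 9) in the variables c and b.

```python
def negate(A):
    return [[-e for e in row] for row in A]


def example_5_2(b, c):
    """Plug W, X, Y, Z of Example 5.2 (with numeric b, c) into the Williamson array."""
    W = [[b] * 3 for _ in range(3)]
    X = [[c, -b, b], [b, c, -b], [-b, b, c]]
    Y = Z = X
    array = [
        [W, X, Y, Z],
        [X, negate(W), Z, negate(Y)],
        [Y, negate(Z), negate(W), X],
        [Z, Y, negate(X), negate(W)],
    ]
    # Flatten the 4 x 4 array of 3 x 3 blocks into a 12 x 12 matrix.
    return [
        [entry for block in block_row for entry in block[r]]
        for block_row in array
        for r in range(3)
    ]


def gram(A):
    """Return A A^T."""
    return [[sum(x * y for x, y in zip(r, s)) for s in A] for r in A]


def scalar_identity(value, n):
    return [[value if i == j else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    b, c = 1, 2
    E = example_5_2(b, c)
    print(gram(E) == scalar_identity(3 * c * c + 9 * b * b, 12))
```

Trying several integer pairs (b, c) is a quick sanity check, though it is of course not a proof of the polynomial identity.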
6. Proof of Theorem 1.1 for r = 0

We now use Proposition 5.1 to extend Lemma 4.1. Let $E_t$ denote the group of order $t$ which is the direct product of elementary abelian groups.

Lemma 6.1. Let $p \equiv 3 \pmod 4$ be a prime power. Then, for all $k \ge 0$, there is a cocyclic OD$((p+1)p^k;\ p^k, p^{k+1})$ with extension group equal to $Q_{2(p+1)} \times E_{p^k}$.

Proof. We first construct a cocyclic OD$((p+1)p;\ p, p^2)$ with extension group equal to $Q_{2(p+1)} \times E_p$. Recall that in Lemma 4.1 we constructed an OD$(p+1;\ 1, p)$ $D$ satisfying the equation

$$DD^\top = (a_1^2 + p a_2^2) I_{p+1}.$$

We showed that this matrix is cocyclic with an extension group equal to $Q_{2(p+1)}$. Let $D_0$ denote this matrix. We now make some suitable matrices for inflating $D_0$. Recall the construction for the Paley conference matrix given in equation (3.1). If we choose the set of representatives to be $\{(0,1)\} \cup \{(1,x) \mid x \in GF(p)\}$, choose the alternating bilinear form so that $\det((a,b),(c,d)) = ad - bc$, and remove the first row and column, then we obtain the $(0, \pm 1)$-matrix $B = [\chi(y - x)]_{x,y \in GF(p)}$. This matrix is developed modulo the group $E_p$ and it satisfies the equations

$$J_p B^\top = B J_p = 0, \qquad B^\top = -B, \qquad \text{and} \qquad BB^\top = pI_p - J_p.$$

Notice that the matrices $a_2 J_p$ and $a_1 I_p + a_2 B$ satisfy the equations

$$(a_2 J_p)(a_1 I_p + a_2 B)^\top = (a_1 I_p + a_2 B)(a_2 J_p)^\top,$$

and

$$a_2 J_p (a_2 J_p)^\top + p(a_1 I_p + a_2 B)(a_1 I_p + a_2 B)^\top = (p^2 a_2^2 + p a_1^2) I_p.$$

So if we make the following substitutions into $D_0$,

$$a_1 \leftarrow a_2 J_p \qquad \text{and} \qquad a_2 \leftarrow a_1 I_p + a_2 B,$$

then we obtain a cocyclic OD$((p+1)p;\ p, p^2)$ $D_1$ with an extension group equal to $Q_{2(p+1)} \times E_p$. Iterating $k$ times we obtain a cocyclic OD$((p+1)p^k;\ p^k, p^{k+1})$ with extension group equal to $Q_{2(p+1)} \times E_{p^k}$.

We note the following corollary to Proposition 5.1.

Corollary 6.2. If the matrix $A$ is cocyclic with cocycle $f$ and $B$ is cocyclic with cocycle $h$, then the matrix $A \times B$ is cocyclic with cocycle $f \times h$ and extension group $R_f\, R_h$.

Proof. Take $r = 1$, $A_1 = A$ and $D = B$ in Proposition 5.1.

Corollary 6.3. Theorem 1.1 holds for $r = 0$. In that case we may take the extension group to be $(Q_{2(p_1+1)} \cdots Q_{2(p_s+1)}) \times E_{p_1^{m_1} \cdots p_s^{m_s}}$.

Proof. Apply Corollary 6.2 repeatedly to the material supplied by Lemma 6.1.
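The matrix identities used in the proof of Lemma 6.1 are easy to confirm numerically. The sketch below does so for the assumed example $p = 7$, with integer values standing in for the indeterminates $a_1$ and $a_2$; the helper names are hypothetical and the quadratic character is computed by Euler's criterion, which restricts this sketch to prime $p$.

```python
p = 7  # assumed example: a prime with p % 4 == 3

def chi(x):
    """Quadratic character of GF(p), computed with Euler's criterion."""
    x %= p
    if x == 0:
        return 0
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def lin(s, a, t, b):
    """Entrywise s*a + t*b."""
    return [[s * x + t * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

I = [[int(i == j) for j in range(p)] for i in range(p)]
J = [[1] * p for _ in range(p)]
O = [[0] * p for _ in range(p)]
B = [[chi(y - x) for y in range(p)] for x in range(p)]

checks = {
    "J B^T = 0": matmul(J, transpose(B)) == O,
    "B J = 0": matmul(B, J) == O,
    "B^T = -B": transpose(B) == lin(-1, B, 0, O),
    "B B^T = p I - J": matmul(B, transpose(B)) == lin(p, I, -1, J),
}

# Integer stand-ins for the indeterminates a1 and a2.
a1, a2 = 2, 3
M = lin(a2, J, 0, O)   # a2 * J
N = lin(a1, I, a2, B)  # a1 * I + a2 * B
checks["amicable"] = matmul(M, transpose(N)) == matmul(N, transpose(M))

# M M^T + p N N^T should be a scalar multiple of the identity.
S = lin(1, matmul(M, transpose(M)), p, matmul(N, transpose(N)))
scalar = S[0][0]
checks["inflated sum is scalar"] = S == lin(scalar, I, 0, O)

for name, ok in checks.items():
    print(name, ok)
print("scalar:", scalar)
```

Running the same checks with other primes $p \equiv 3 \pmod 4$ only requires changing the first line.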
7. Cocyclic Williamson-like matrices

Our next goal is to prove Theorem 1.1 for $s = 0$. To this end we introduce the idea of a cocyclic set of Williamson-like matrices.

Definition 7.1. Let $f : G \times G \to Z_2$ be a cocycle with extension group $R_f$. Suppose that the $m$ $(0, X_\pm)$-matrices $A_1, A_2, \ldots, A_m$ $(i = 1, \ldots, m)$ satisfy the following conditions.

1. Orthogonality. $A_1 A_1^\top + A_2 A_2^\top + \cdots + A_m A_m^\top = \left( \sum_{i=1}^{k} a_i x_i^2 \right) I_t$.
2. Amicability. $A_i A_j^\top = A_j A_i^\top$.
3. Cocycle Property. There are maps $g_i : G \to \{0\} \cup X_\pm$ such that (under a common row and column indexing) $A_i = [\, g_i(xy) f(x,y) \,]_{x,y \in G}$.

Then the matrices $A_1, A_2, \ldots, A_m$ comprise a cocyclic set of $m$ order $t$ type $(a_1, a_2, \ldots, a_k)$ Williamson-like$^2$ $(0, X_\pm)$-matrices with cocycle $f$ and extension group $R_f$.

Remark 7.2. When $k = 1$, we usually replace the lone indeterminate by 1, and thereby regard the matrices as being $(0, \pm 1)$-matrices. When $m = 4$, and the cocycle $f$ is trivial, we say the matrices $A_1, A_2, A_3, A_4$ are type $(a_1, a_2, \ldots, a_k)$ Williamson-like matrices developed modulo the group $G$. Such matrices have been studied by many authors.

We demonstrate the utility of these matrices in the following example which was described for $f$ trivial in [2].

Example 7.3. Suppose that the $(1, -1)$-matrices $A$, $B$, $C$ and $D$ comprise a cocyclic set of four order $t$ Williamson-like matrices with cocycle $f$. Let $a$ and $b$ be commuting indeterminates. Define the order $t$ matrices $W, X, Y, Z$ via the equations:

$$2W = a(A + B) + b(A - B), \qquad 2X = a(A - B) - b(A + B),$$
$$2Y = a(C + D) + b(C - D), \qquad 2Z = a(C - D) - b(C + D).$$

It is easily verified that $WW^\top + XX^\top + YY^\top + ZZ^\top = 2t(a^2 + b^2) I_t$, and that $W$, $X$, $Y$ and $Z$ are amicable; so the matrices $W$, $X$, $Y$ and $Z$ comprise a cocyclic set of four type $(2t, 2t)$ Williamson-like matrices with cocycle $f$. Using the Williamson array as in Example 5.2, we obtain a cocyclic OD$(4t;\ 2t, 2t)$ with extension group equal to $Q_8\, R_f$.
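The "easily verified" identity of Example 7.3 can be checked on a concrete instance. The sketch below assumes the circulant Williamson matrices of order $t = 3$ as input (with trivial cocycle) and uses integers in place of the indeterminates $a$ and $b$; these inputs and all helper names are illustrative assumptions.

```python
def circulant(first_row):
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]

def lin(terms):
    """Half of the linear combination sum(coeff * matrix); entries are even here."""
    n = len(terms[0][1])
    total = [[sum(c * m[i][j] for c, m in terms) for j in range(n)] for i in range(n)]
    return [[x // 2 for x in row] for row in total]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

t = 3
A = circulant([1, 1, 1])             # assumed Williamson matrices of order 3
B = C = D = circulant([1, -1, -1])
a, b = 2, 5                          # integer stand-ins for the indeterminates

W = lin([(a, A), (a, B), (b, A), (-b, B)])    # 2W = a(A+B) + b(A-B)
X = lin([(a, A), (-a, B), (-b, A), (-b, B)])  # 2X = a(A-B) - b(A+B)
Y = lin([(a, C), (a, D), (b, C), (-b, D)])    # 2Y = a(C+D) + b(C-D)
Z = lin([(a, C), (-a, D), (-b, C), (-b, D)])  # 2Z = a(C-D) - b(C+D)

mats = [W, X, Y, Z]
grams = [matmul(m, transpose(m)) for m in mats]
total = [[sum(g[i][j] for g in grams) for j in range(t)] for i in range(t)]
expected = [[2 * t * (a * a + b * b) if i == j else 0 for j in range(t)] for i in range(t)]

orthogonal = total == expected
amicable = all(
    matmul(m, transpose(n)) == matmul(n, transpose(m)) for m in mats for n in mats
)
print(orthogonal, amicable)
```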
8. Proof of Theorem 1.1 for s = 0

We now have the ingredients for a proof of Theorem 1.1 when $s = 0$. First we use the plug-in idea and some starting material drawn from the Paley conference matrices to prove Theorem 1.1 for $s = 0$ and $r = 1$.

Theorem 8.1. Let $q \equiv 1 \pmod 4$ be a prime power. Then, for all $k \ge 0$, there are type $(2q^k, 2q^{k+1})$ Williamson-like matrices developed modulo the group $Z_{\frac{1}{2}(q+1)} \times E_{q^k}$.

Proof. Recall the construction of Section 3. Let $\omega$ be any primitive element of $GF(q^2)$, and let $\alpha$ denote the map $x \mapsto \omega x$. If we choose our set of one-dimensional subspace representatives to be $\{1, \omega^2, \omega^4, \ldots, \omega^{q-1}\} \cup \{\omega, \omega^3, \ldots, \omega^q\}$, then the map $\alpha^2$ has two orbits on this set: namely $1, \omega^2, \omega^4, \ldots, \omega^{q-1}$ and $\omega, \omega^3, \ldots, \omega^q$. Since $\chi(-1) = 1$, and $\det(\alpha) = \omega^{q+1}$ is a non-square in $GF(q)$, we see that the Paley conference matrix of order $q + 1$ may be written in the form

$$\begin{bmatrix} A & B \\ B & -A \end{bmatrix}$$

where (a) the matrices $A = [\, \chi \det(1, \omega^{2(j-i)}) \,]_{i,j = 0,1,\ldots,(q-1)/2}$ and $B = [\, \chi \det(\omega, \omega^{2(j-i)+1}) \,]_{i,j = 0,1,\ldots,(q-1)/2}$ are circulant and symmetric, (b) the matrix $B$ has all entries equal to $\pm 1$, and (c) the matrix $A$ has all diagonal entries equal to zero and all off-diagonal entries equal to $\pm 1$. Let $a$, $b$, $c$ and $d$ be commuting indeterminates. Consider the four matrices$^3$

$$A_1 = aI + bA, \qquad A_2 = bB, \qquad A_3 = cI + dA, \qquad \text{and} \qquad A_4 = dB.$$

It is easily checked that if we put $c = a$ and $d = -b$, then we obtain four $(2, 2q)$-suitable matrices. Put $C = [\, \chi(y - x) \,]_{x,y \in GF(q)}$. Then the matrix $C$ is developed modulo the group $E_q$, and it satisfies the equations

$$J_q C^\top = C J_q = 0, \qquad C^\top = C, \qquad \text{and} \qquad CC^\top = qI_q - J_q.$$

Notice that

• the matrices $bJ_q$, $aI_q + bC$, $dJ_q$ and $cI_q + dC$ are amicable;
• we have $bJ_q (bJ_q)^\top + q(aI_q + bC)(aI_q + bC)^\top + dJ_q (dJ_q)^\top + q(cI_q + dC)(cI_q + dC)^\top = (q^2 b^2 + q^2 d^2 + q a^2 + q c^2) I_q + q(ab + cd)(C + C^\top)$;
• and if $c = a$ and $d = -b$, then $(bJ_q)(aI_q + bC) = -(dJ_q)(cI_q + dC)$.

If we make the following plug-in substitutions

$$a \leftarrow bJ_q, \qquad b \leftarrow aI_q + bC, \qquad c \leftarrow dJ_q, \qquad d \leftarrow cI_q + dC,$$

and then set $c = a$ and $d = -b$, then we obtain four $(2q, 2q^2)$-suitable matrices which are developed modulo the group $E_q \times Z_{\frac{1}{2}(q+1)}$. In general, if we begin with the matrices $A, B, C, D$, make the plug-in substitutions $k$ times, and then set $c = a$ and $d = -b$, then we obtain four $(2q^k, 2q^{k+1})$-suitable matrices which are developed modulo the group $E_{q^k} \times Z_{\frac{1}{2}(q+1)}$.

$^2$ This is an extension of the definition of Williamson matrices [13].
$^3$ $A_1, A_2, A_3, A_4$ are essentially the Williamson matrices obtained by Turyn [12].
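The properties of the matrix $C$ and the bulleted identity in the proof of Theorem 8.1 can be spot-checked numerically. The sketch below assumes the example $q = 5$ (a prime, so that arithmetic modulo $q$ models $GF(q)$) and integer values for $a, b, c, d$; the function names `lhs` and `rhs` are hypothetical labels for the two sides of the second bullet.

```python
q = 5  # assumed example: a prime with q % 4 == 1

def chi(x):
    """Quadratic character of GF(q), computed with Euler's criterion."""
    x %= q
    if x == 0:
        return 0
    return 1 if pow(x, (q - 1) // 2, q) == 1 else -1

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(*ms):
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*ms)]

def scale(s, m):
    return [[s * x for x in row] for row in m]

def gram(m):
    return matmul(m, transpose(m))

I = [[int(i == j) for j in range(q)] for i in range(q)]
J = [[1] * q for _ in range(q)]
O = [[0] * q for _ in range(q)]
C = [[chi(y - x) for y in range(q)] for x in range(q)]

c_ok = (
    matmul(J, transpose(C)) == O
    and matmul(C, J) == O
    and transpose(C) == C
    and gram(C) == add(scale(q, I), scale(-1, J))
)

def lhs(a, b, c, d):
    return add(
        gram(scale(b, J)),
        scale(q, gram(add(scale(a, I), scale(b, C)))),
        gram(scale(d, J)),
        scale(q, gram(add(scale(c, I), scale(d, C)))),
    )

def rhs(a, b, c, d):
    return add(
        scale(q * q * b * b + q * q * d * d + q * a * a + q * c * c, I),
        scale(q * (a * b + c * d), add(C, transpose(C))),
    )

identity_ok = lhs(1, 2, 3, 4) == rhs(1, 2, 3, 4)

# Setting c = a and d = -b removes the cross term, leaving a scalar matrix.
a, b = 1, 2
suitable_ok = lhs(a, b, a, -b) == scale(2 * q * a * a + 2 * q * q * b * b, I)
print(c_ok, identity_ok, suitable_ok)
```

The final check reflects the type $(2q, 2q^2)$ appearing in the proof: the coefficient of $a^2$ is $2q$ and that of $b^2$ is $2q^2$.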
Now let $A_r$ denote the group

$$A_r = \begin{cases} Z_2 & \text{if } r = 1, \\ Q_8\, Q_8 \cdots Q_8 \ (r - 1 \text{ times}) & \text{if } r > 1. \end{cases}$$

Theorem 8.2. Let $q_1, q_2, \ldots, q_r \equiv 1 \pmod 4$ be prime powers, and let $k_1, k_2, \ldots, k_r$ be non-negative integers. Then there are four cocyclic Williamson-like matrices with extension group $A_r \times Z_{\frac{1}{2}(q_1+1)} \times \cdots \times Z_{\frac{1}{2}(q_r+1)} \times E_{q_1^{k_1} \cdots q_r^{k_r}}$.

Proof. Consider first the case $r = 1$. By Theorem 8.1, we have four Williamson-like matrices developed modulo the group $Z_{\frac{1}{2}(q_1+1)} \times E_{q_1^{k_1}}$. Since these may be viewed as cocyclic Williamson-like matrices with trivial cocycle, the extension group is $Z_{\frac{1}{2}(q_1+1)} \times E_{q_1^{k_1}} \times Z_2$.

Now consider the case $r > 1$. By applying Proposition 5.1 (with $D$ equal to the Williamson array) to the Williamson-like matrices supplied by Theorem 8.1 we obtain for each $i \in \{2, 3, \ldots, r\}$ a cocyclic Hadamard matrix with extension group equal to $Q_8 \times Z_{\frac{1}{2}(q_i+1)} \times E_{q_i^{k_i}}$. By Corollary 6.2, forming the Kronecker product of these matrices yields a cocyclic Hadamard matrix with extension group $A_r \times Z_{\frac{1}{2}(q_2+1)} \times E_{q_2^{k_2}} \times Z_{\frac{1}{2}(q_3+1)} \times E_{q_3^{k_3}} \times \cdots \times Z_{\frac{1}{2}(q_r+1)} \times E_{q_r^{k_r}}$. Taking the Kronecker product of this Hadamard matrix with the four cocyclic Williamson-like matrices with extension group $Z_{\frac{1}{2}(q_1+1)} \times E_{q_1^{k_1}} \times Z_2$ completes the proof.

Notice that Theorem 8.2 implies Theorem 1.1 for $s = 0$. Let the group $B_r$ be the product $Q_8 \cdots Q_8$ with $r$ terms. We obtain

Corollary 8.3. If $q_1, q_2, \ldots, q_r \equiv 1 \pmod 4$ are prime powers, and if $k_1, k_2, \ldots, k_r$ are non-negative integers, then there exists a cocyclic Hadamard matrix of order

$$\prod_{i=1}^{r} q_i^{k_i} \prod_{i=1}^{r} 2(q_i + 1)$$

and extension group equal to $B_r \times (Z_{\frac{1}{2}(q_1+1)} \times \cdots \times Z_{\frac{1}{2}(q_r+1)}) \times (E_{q_1^{k_1} \cdots q_r^{k_r}})$.
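A short arithmetic sketch may help in reading Corollary 8.3: it computes the order of the Hadamard matrix and the order of the stated extension group, and confirms that the latter is twice the former. The function name is hypothetical, and the value $|B_r| = 2^{2r+1}$ is an assumption of this sketch, namely that the $r$ copies of $Q_8$ are amalgamated along their common centre of order 2.

```python
from math import prod

def corollary_8_3_orders(qs, ks):
    """Return (matrix order, extension group order) for prime powers qs = 1 mod 4."""
    if len(qs) != len(ks) or any(q % 4 != 1 for q in qs):
        raise ValueError("need one exponent per q, and every q congruent to 1 mod 4")
    r = len(qs)
    odd_part = prod(q ** k for q, k in zip(qs, ks))
    matrix_order = odd_part * prod(2 * (q + 1) for q in qs)
    # Assumed: |B_r| = 2^(2r+1); each Z_{(q+1)/2} has order (q+1)/2.
    group_order = 2 ** (2 * r + 1) * prod((q + 1) // 2 for q in qs) * odd_part
    return matrix_order, group_order

examples = [([5], [1]), ([5, 9], [0, 1]), ([13, 17, 5], [2, 0, 1])]
for qs, ks in examples:
    n, g = corollary_8_3_orders(qs, ks)
    print(qs, ks, n, g, g == 2 * n)
```

The function does not test that each $q_i$ is a prime power; the inputs above are chosen by hand to satisfy that hypothesis.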
9. Proof of Theorem 1.1 and some generalizations

The main theorem is now proved by combining Corollary 8.3 and Corollary 6.3 via Corollary 6.2. Note that we can prove somewhat more than Theorem 1.1. Minor changes to our argument lead to type $(n, nq)$ cocyclic orthogonal designs and type $(n, nq)$ cocyclic Williamson-like matrices.

So far we have obtained just one extension group for each order in Theorem 1.1. Our goal now is to show that the set of possible extension groups is much larger. We begin by discussing a relationship between cocyclic Hadamard matrices and normal relative difference sets. A cardinality $k$ partial transversal $D$ of an order $m$ subgroup $M$ of an order $v$ group $R$ is said to be a relative difference set in $R$ relative to the forbidden subgroup $M$ with index $\lambda = k(k-1)/(v-m)$ if each element in $R \setminus M$ is expressible as a quotient of elements of $D$ in exactly $\lambda$ ways. If $M$ is normal in $R$ we say the relative difference set $D$ is normal. We use the notation NRDS$(v, k, \lambda, m)$. If $M = 1$, then $D$ is said to be a difference set. We use the notation DS$(v, k, \lambda)$.

Given a transversal $D$ of $\langle z \rangle$, where $z$ is a central involution in a finite group $R$, and a projection homomorphism $\pi : R \to G$ with $\operatorname{Ker} \pi = \langle z \rangle$, we define a transversal map $\tau : G \to D \subset R$ via the equation $\pi \circ \tau =$ identity map on $G$ and define a cocycle $f_\tau : G \times G \to Z_2$ by the equation

$$z^{(1 - f_\tau(x,y))/2} = \tau(xy)^{-1} \tau(x) \tau(y). \qquad (9.1)$$

We will say $f_\tau$ may be obtained via $\pi$, $D$, $z$ and $R$. Throughout the remainder of this section we will suppose $1 \in D$. Let $C(z, R)$ denote the set of normalised 2-cocycles obtained via $\pi$, $D$, $z$ and $R$ for some epimorphism $\pi : R \to G$ and transversal $D$ of $\langle z \rangle$ in $R$. We will say a cocycle $f$ is a Hadamard cocycle if there is a cocyclic Hadamard matrix with cocycle $f$. The following theorem was proved in [2]. See [4] for full details. A narrower version of the result is proved by a more direct argument in [3, Theorem 2.4].

Theorem 9.1. There is a NRDS$(4t, 2t, t, 2)$ in $R$ with forbidden subgroup $\langle z \rangle$ of order 2 if and only if there is a Hadamard cocycle in $C(z, R)$.

Therefore, our main theorem implies the existence of a large family of normal relative difference sets. However, at the moment our main concern is that this correspondence allows us to generalize the Kronecker product result Corollary 6.2. We employ some ideas discussed in [4]. Let $D$ be a subset of a finite group $G$, and let the element

$$D^* = \sum_{a \in D} a - \sum_{a \notin D} a = 2D - G$$

of the integral group ring $Z(G)$ be termed the associate of $D$. Suppose that $G$ contains two subgroups $G_1$ and $G_2$ such that

$$G = G_1 G_2 \qquad \text{and} \qquad G_1 \cap G_2 = 1.$$
We shall say $G$ factors into the groups $G_1$ and $G_2$.$^4$ Dillon noted that, if $D_1$ is a DS$(4t^2, 2t^2 + t, t^2 + t)$ in $G_1$ and $D_2$ is a DS$(4s^2, 2s^2 + s, s^2 + s)$ in $G_2$, then $D_1^* D_2^*$ is the associate of a DS$(4(2st)^2, 2(2st)^2 + (2st), (2st)^2 + (2st))$ in $G$. A similar composition rule applies for normal relative difference sets NRDS$(4t, 2t, t, 2)$. Suppose that $R$ is a group of order $8t_1t_2$ containing a central involution $z$ and containing two groups $R_1$ and $R_2$ of respective orders $4t_1$ and $4t_2$, such that

$$R = R_1 R_2 \qquad \text{and} \qquad R_1 \cap R_2 = \langle z \rangle. \qquad (9.2)$$

We will say that $R = R_1 \,'_z\, R_2$, and that $R \cong S \,'_z\, T$ if there are subgroups $R_1 \cong S$ and $R_2 \cong T$ of $R$ satisfying equations (9.2). If the involution $z$ is of no interest, we omit the subscript in $'_z$. We will adopt the convention that $1 \,'\, G = G$. We note that $S\,T$ is of the form $S \,'\, T$.

Example 9.2. Write $Q_8 = \langle a, b, z \mid a^2 = b^2 = z,\ z^2 = 1,\ ba = abz \rangle$, and let $R_1 = \langle a \rangle$ and $R_2 = \langle b \rangle$; then $Q_8 = R_1 \,'_z\, R_2$; so $Q_8 \cong Z_4 \,'\, Z_4$.

The following facts are proved in [4]. The normal relative difference set NRDS$(4t, 2t, t, 2)$ $D$ with forbidden central involution $z$ has associate equal to $D(1 - z)$. Now suppose that for $i = 1, 2$, the set $D_i$ is a NRDS$(4t_i, 2t_i, t_i, 2)$ in the group $R_i$ with respect to the involution $z_i$. Then identifying $z_1$ and $z_2$ with $z$ in the central product $R = R_1 \,'_z\, R_2$, the element $D^* = D_1 D_2 (1 - z)$ of the integral group ring of $R$ is the associate of a NRDS$(8t_1t_2, 4t_1t_2, 2t_1t_2, 2)$ in $R$ with forbidden involution equal to $z$. Consequently, by Theorem 9.1, we have the following generalization of Corollary 6.2.

Theorem 9.3. Suppose a finite group $R$ equals $S \,'_z\, T$ for some subgroups $S$ and $T$ of $R$. Suppose also that there are Hadamard cocycles in $C(z, S)$ and in $C(z, T)$; then there is a Hadamard cocycle in $C(z, R)$.

We now obtain the somewhat stronger version of Theorem 1.1. As before let the group $B_r$ denote the central amalgamated product $Q_8 \cdots Q_8$ with $r$ copies of $Q_8$.

Theorem 9.4. Let $q_1, q_2, \ldots, q_r \equiv 1 \pmod 4$ and $p_1, p_2, \ldots, p_s \equiv 3 \pmod 4$ be prime powers, and let $k_1, k_2, \ldots, k_r$ and $m_1, m_2, \ldots, m_s$ be non-negative integers. Suppose the group $G$ factors into the groups $E_{p_1^{m_1}}, E_{p_2^{m_2}}, \ldots, E_{p_s^{m_s}}, E_{q_1^{k_1}}, E_{q_2^{k_2}}, \ldots, E_{q_r^{k_r}}$. Then there is a cocyclic Hadamard matrix with extension group $G \times Z_{\frac{1}{2}(q_1+1)} \times Z_{\frac{1}{2}(q_2+1)} \times \cdots \times Z_{\frac{1}{2}(q_r+1)} \times B_r\, (Q_{2(p_1+1)}\, Q_{2(p_2+1)} \cdots Q_{2(p_s+1)})$.

Proof. Suppose a finite group $M$ factors into two groups $K$ and $L$, and suppose that $A$ and $B$ are two finite groups respectively containing central involutions $w$ and $z$. Then identifying $w$ in $A$ with $z$ in $B$ we have

$$(A\, B) \times M \cong (A \times K) \,'_z\, (B \times L). \qquad (9.3)$$

$^4$ We note that $H_1 \cdot (H_2 \cdot H_3)$ will in general not be isomorphic to $(H_1 \cdot H_2) \cdot H_3$. Nevertheless, we will say that both groups factor into the groups $H_1, H_2, H_3$, and so on for "products" of more than three groups.
By Corollary 6.3, there is (for each $i = 1, \ldots, s$) a Hadamard cocycle in $C(z_i, Q_{2(p_i+1)} \times E_{p_i^{m_i}})$ where $z_i$ is the unique central involution in $Q_{2(p_i+1)} \times E_{p_i^{m_i}}$. Similarly, by Corollary 8.3, there is a Hadamard cocycle in $C(w_i, Q_8 \times Z_{\frac{1}{2}(q_i+1)} \times E_{q_i^{k_i}})$ where $w_i$ is the unique central involution in $Q_8 \times Z_{\frac{1}{2}(q_i+1)} \times E_{q_i^{k_i}}$. The theorem now follows by repeated application of equation (9.3) and Theorem 9.3.

We note that the extension group in Theorem 9.4 is a semi-direct product of a Sylow 2-subgroup and an odd order normal subgroup which is the direct product of a normal abelian subgroup and the group $G$. Moreover, the Sylow 2-subgroup contains an extraspecial subgroup $Q_8\, Q_8 \cdots Q_8$ of order $2^{2(r+s)+1}$ which in turn contains the unique central involution. As pointed out in [6], a group is the extension group of a cocyclic Hadamard matrix if and only if it is a Hadamard group as defined by Ito [10]. So Theorem 9.4 gives many Hadamard groups.

We now discuss the embedding question, "Which groups of odd order may be embedded in a Hadamard group, and how large does the Hadamard group have to be?". Now every group $G$ of odd order $g$ is soluble, and any finite order soluble group has a polycyclic presentation$^5$. Therefore, we may write $g$ as a product $p_1 \cdots p_k$ of primes $p_i$ in such a way that, for some non-negative integers $m_{i,j,\ell}, m_{i,\ell} < p_\ell$,

$$G = \langle\, y_1, \ldots, y_k \mid y_i^{p_i} = y_{i+1}^{m_{i,i+1}} y_{i+2}^{m_{i,i+2}} \cdots y_k^{m_{i,k}},\ \ y_j y_i = y_i\, y_{i+1}^{m_{i,j,i+1}} y_{i+2}^{m_{i,j,i+2}} \cdots y_k^{m_{i,j,k}} \ \text{for } j > i \text{ and } i = 1, \ldots, k \,\rangle.$$

Notice that if $g$ is square-free, then we can choose $y_i$ so that $y_i^{p_i} = 1$.$^6$ So in this case, $G$ may be constructed by forming a sequence of semi-direct products with prime order groups. We therefore have

Theorem 9.5. If $G$ is a group of order $p_1 p_2 \cdots p_s q_1 q_2 \cdots q_r$ where the primes $q_1, q_2, \ldots, q_r \equiv 1 \pmod 4$ and the primes $p_1, p_2, \ldots, p_s \equiv 3 \pmod 4$ are distinct primes, then there is a cocyclic Hadamard matrix with extension group $G \times Z_{\frac{1}{2}(q_1+1)} \times Z_{\frac{1}{2}(q_2+1)} \times \cdots \times Z_{\frac{1}{2}(q_r+1)} \times B_r\, (Q_{2(p_1+1)}\, Q_{2(p_2+1)} \cdots Q_{2(p_s+1)})$.

This implies that every group of odd squarefree order $g$ may be embedded in a Hadamard group with order somewhat less than $g^3$. We conjecture that every odd order group can be embedded similarly. We now give an argument which deals with groups which factor into abelian groups. By Dirichlet's theorem for primes in arithmetic progression, we know that any odd order cyclic group may be embedded in a Hadamard group. Indeed, by Linnik's bound on the smallest prime in an arithmetic progression, we know that there is an absolute constant $c$ such that, for any odd order $m$, there is a prime $p \equiv -1 \pmod m$ with $p < m^c$. Hence, there is an absolute constant $d$ such that we may embed $Z_m$ in a Hadamard group of order at most $2(p + 1) \le m^d$. Since any abelian group is the direct product of cyclic groups, any abelian group of odd order $a$ embeds in a Hadamard group of order at most $a^d$, and hence any group of odd order $g$ which factors into abelian groups must embed in a Hadamard group of order at most $g^d$.

$^5$ See [11, Sections 9.3, 9.4] for a discussion of the relationship between soluble groups and polycyclic groups and their polycyclic presentations.
$^6$ Proof. Every element of $G$ can be written uniquely in the form $y_1^{t_1} y_2^{t_2} \cdots y_k^{t_k}$ where $0 \le t_i < p_i$. So the groups $G_i = \langle y_i, y_{i+1}, \ldots, y_k \rangle$ have order $p_i p_{i+1} \cdots p_k$. Now by Sylow's theorem we may choose elements $x_i \in G_i$ such that $x_i^{p_i} = 1$. Since $p_i$ does not divide the order of $G_{i+1}$, we have $x_i \notin G_{i+1}$; so $G_i = \langle x_i, x_{i+1}, \ldots, x_k \rangle$. Indeed, since $G_{i+1}$ is normal in $G_i$, we may replace the elements $y_1, y_2, \ldots, y_k$ by the elements $x_1, x_2, \ldots, x_k$ and still have a polycyclic presentation.

Acknowledgement

The author would like to thank the referee for reading the original manuscript carefully and in particular for picking up several minor errors in the exposition.
References

[1] W. de Launey, On the construction of n-dimensional designs from 2-dimensional designs, Australas. J. Combin. 1 (1990), 67–81.
[2] W. de Launey, Cocyclic Hadamard matrices and relative difference sets, Ohio State Conference on Groups and Difference Sets, June 1993, Hadamard Centenary Conference, U. Wollongong, Australia, December 1993, unpublished.
[3] W. de Launey, D. L. Flannery and K. J. Horadam, Cocyclic Hadamard matrices and difference sets, Discrete Appl. Math. 102 (2000), 47–61.
[4] W. de Launey and M. J. Smith, Cocyclic orthogonal designs and the asymptotic existence of maximal size relative difference sets with forbidden subgroup size 2, J. Combin. Theory Ser. A 93 (2001), 37–92.
[5] W. de Launey and R. M. Stafford, On cocyclic weighing matrices and the regular actions of certain Paley matrices, Discrete Appl. Math. 102 (2000), 63–101.
[6] D. L. Flannery, Cocyclic Hadamard matrices and Hadamard groups are equivalent, J. Algebra 192 (1997), 749–779.
[7] A. V. Geramita and J. Seberry, Orthogonal Designs: Quadratic Forms and Hadamard Matrices, Marcel Dekker, New York 1979.
[8] K. J. Horadam and W. de Launey, Cocyclic development of designs, J. Algebraic Combin. 2 (1993), 267–290; Erratum, ibid. 3 (1994), 129.
[9] N. Ito, Note on Hadamard matrices of type Q, Studia Sci. Math. Hungar. 16 (1981), 389–393.
[10] N. Ito, On Hadamard groups, J. Algebra 168 (1994), 981–987.
[11] C. C. Sims, Computation with finitely presented groups, Cambridge University Press, Cambridge, New York, Melbourne 1994.
[12] R. J. Turyn, An infinite class of Williamson matrices, J. Combin. Theory Ser. A 12 (1972), 319–321.
[13] J. Williamson, Hadamard's determinant theorem and the sum of four squares, Duke Math. J. 11 (1944), 65–81.
[14] M. Yamada, Hadamard matrices of generalised quaternion type, Discrete Math. 87 (1991), 187–196.
[15] K. Yamamoto, On a generalised Williamson equation, Colloq. Math. Soc. János Bolyai 37 (1981), 839–850.

W. de Launey
Center for Communications Research
4320 Westerra Court, San Diego, CA 92121, U.S.A.
A mass formula for Type II codes over finite fields of characteristic two

Akihiro Munemasa

Abstract. Type II codes over finite fields of characteristic two are a generalization of doubly-even self-dual binary codes. In this paper, we characterize Type II codes as maximal totally singular subspaces with respect to a quadratic form, and give a mass formula for them.

2000 Mathematics Subject Classification: primary 11T71; secondary 05B25, 51E20, 94B05.

1. Introduction

Recently, Gaborit, Pless, Solé and Atkin [6] introduced Type II codes over GF(4). Type II codes over GF(4) share many properties in common with doubly-even self-dual binary codes. Moreover, Type II codes over GF(4) produce doubly-even self-dual binary codes when one takes the image under the Gray map defined in [6]. Some further results on Type II codes have been obtained in [1].

In this paper, we generalize the definition of Type II codes to any finite field of characteristic two by using a trace-orthogonal basis. Analogous to the Gray map for $Z_4$ or GF(4), one can define the Gray map from $GF(2^r)$ to $GF(2)^r$. The image of a Type II code over $GF(2^r)$ under the Gray map is a doubly-even self-dual binary code. Although the concept of a Type II code over $GF(2^r)$ has not been defined formally, various Type II codes were already constructed in the 1980s (see [9, 10, 11, 17, 18, 19]). Notably, the discovery, due to Pasquier [9], of a Type II code over GF(8) whose Gray map image is the extended binary Golay code seems to suggest that Type II codes are worthy of study.

In this paper, we characterize Type II codes as maximal totally singular subspaces with respect to a quadratic form. Then we determine the lengths for which a Type II code exists. Finally, we prove a mass formula, giving the total number of Type II codes over $GF(2^r)$ of length $n$. For the binary case, the formula was derived in [8, 15], and for the case $r = 2$ in [6]. The formula is used in the enumeration of Type II codes of small length and small $r$ in [2, 6].

Codes and Designs, Ohio State Univ. Math. Res. Inst. Publ. 10
© Walter de Gruyter 2002
2. Preliminaries

We denote by GF(q) a finite field with $q$ elements, where $q$ is a prime power. A code of length $n$ over GF(q) is a linear subspace of the vector space $GF(q)^n$ over GF(q). We denote by $(u, v)$ the inner product $\sum_{i=1}^{n} u_i v_i$, where $u, v \in GF(q)^n$. For a code $C$, we define $C^\perp = \{u \in GF(q)^n \mid (u, v) = 0 \text{ for all } v \in C\}$. A code $C$ is said to be self-dual if $C = C^\perp$.

Let $r$ be a positive integer, Tr the absolute trace from $GF(2^r)$ to GF(2), that is, $\operatorname{Tr}(\alpha) = \sum_{i=0}^{r-1} \alpha^{2^i}$. A trace-orthogonal basis is a basis $B = \{\alpha_1, \ldots, \alpha_r\}$ of $GF(2^r)$ over GF(2) with the property $\operatorname{Tr}(\alpha_i \alpha_j) = \delta_{ij}$ for all $i, j$ with $1 \le i, j \le r$. The existence of a trace-orthogonal basis for an arbitrary value of $r$ has been established in [7]. Let $B = \{\alpha_1, \ldots, \alpha_r\}$ be a trace-orthogonal basis of $GF(2^r)$ over GF(2), and let $\alpha = \sum_{i=1}^{r} c_i \alpha_i$ be an element of $GF(2^r)$, where $c_i \in GF(2)$. Then the weight of $\alpha$ with respect to $B$ is defined to be the number of $i$'s with $c_i = 1$, and is denoted by $\operatorname{wt}(\alpha)$, or $\operatorname{wt}_B(\alpha)$ when we need to emphasize the reference to $B$. The weight of a vector $v \in GF(2^r)^n$ with respect to $B$ is the sum of the weights of its entries, and is also denoted by $\operatorname{wt}(v)$ or $\operatorname{wt}_B(v)$. A self-dual code $C$ is said to be a Type II code with respect to the trace-orthogonal basis $B$ if $\operatorname{wt}(v) \equiv 0 \pmod 4$ for all $v \in C$. For $r = 1$, Type II codes are also called doubly-even self-dual binary codes. For $r = 2$, $B = \{\omega, \omega^2\}$ is the unique trace-orthogonal basis of GF(4), where $\omega$ is a primitive element of GF(4). Type II codes with respect to $B$ were studied in [1, 6].

The Gray map $\varphi$ with respect to a trace-orthogonal basis $B$ is a map from $GF(2^r)^n$ to $GF(2)^{rn}$ which maps a vector $v = (v^{(1)}, \ldots, v^{(n)}) \in GF(2^r)^n$ to the concatenation of the vectors $(c_{i1}, \ldots, c_{ir})$, $1 \le i \le n$, where $v^{(i)} = \sum_{j=1}^{r} c_{ij} \alpha_j$. It is easy to show $\operatorname{Tr}((u, v)) = (\varphi(u), \varphi(v))$ for $u, v \in GF(2^r)^n$. Thus, if $C$ is a self-dual code over $GF(2^r)$, then $\varphi(C)$ is a self-dual binary code.

Moreover, $\operatorname{wt}(v) = \operatorname{wt}_H(\varphi(v))$ holds for $v \in GF(2^r)^n$, where $\operatorname{wt}_H(u)$ denotes the Hamming weight of a binary vector. This means that the Gray map is an isometry. Thus, if $C$ is a Type II code with respect to a trace-orthogonal basis $B$ over $GF(2^r)$, then $\varphi(C)$ is a doubly-even self-dual binary code. For the remainder of this paper, we will consider a fixed trace-orthogonal basis, so we often omit the reference to a trace-orthogonal basis, that is, we say, for instance, Type II codes over $GF(2^r)$ instead of Type II codes with respect to $B$ over $GF(2^r)$.

Let $V$ be a vector space over GF(q). A mapping $f$ from $V$ to GF(q) is said to be a quadratic form if

$$f(\alpha v) = \alpha^2 f(v), \qquad f(u + v) = f(u) + f(v) + B_f(u, v)$$

for any $u, v \in V$ and $\alpha \in GF(q)$, where $B_f$ is a bilinear form. $B_f$ is called the bilinear form associated to $f$. The radical of $f$ is defined by

$$\operatorname{Rad} f = \{v \in V \mid f(v) = 0,\ B_f(u, v) = 0 \text{ for all } u \in V\}.$$

Note that $\operatorname{Rad} f$ is a subspace of $V$. The quadratic form $f$ is said to be nondegenerate if $\operatorname{Rad} f = 0$. In general, $f$ induces a nondegenerate quadratic form on $V / \operatorname{Rad} f$. A subspace $W$ of $V$ is said to be totally singular if $f(u) = 0$ for all $u \in W$. The Witt index of $f$ is the maximum of the dimensions of totally singular subspaces.
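The trace-orthogonal basis, weight and Gray map defined in this section can be made concrete for GF(4). The following sketch is an illustration under stated assumptions: field elements are encoded as the integers 0 to 3 (bit patterns of polynomials modulo $x^2 + x + 1$, so that $\omega = 2$ and $\omega^2 = 3$), and all function names are hypothetical. It checks that $\{\omega, \omega^2\}$ is trace-orthogonal and that $\operatorname{Tr}((u, v)) = (\varphi(u), \varphi(v))$ for all vectors of length 2.

```python
from itertools import product

OMEGA, OMEGA2 = 2, 3          # assumed encoding: bits of a polynomial mod x^2 + x + 1
BASIS = (OMEGA, OMEGA2)

def mul(a, b):
    """Multiply two elements of GF(4)."""
    result = 0
    for i in range(2):
        if (b >> i) & 1:
            result ^= a << i
    if result & 4:
        result ^= 0b111       # reduce with x^2 = x + 1
    return result

def trace(a):
    """Absolute trace GF(4) -> GF(2): a + a^2."""
    return a ^ mul(a, a)

def coords(a):
    """Coordinates of a in BASIS; valid because the basis is trace-orthogonal."""
    return tuple(trace(mul(alpha, a)) for alpha in BASIS)

def gray(v):
    return tuple(c for x in v for c in coords(x))

def inner(u, v):
    total = 0
    for x, y in zip(u, v):
        total ^= mul(x, y)
    return total

def weight(v):
    """Weight with respect to BASIS, i.e. the Hamming weight of the Gray image."""
    return sum(gray(v))

basis_ok = all(trace(mul(x, y)) == int(x == y) for x in BASIS for y in BASIS)

vectors = list(product(range(4), repeat=2))
gray_ok = all(
    trace(inner(u, v)) == sum(s * t for s, t in zip(gray(u), gray(v))) % 2
    for u in vectors for v in vectors
)
print(basis_ok, gray_ok, [weight((x,)) for x in range(4)])
```

With this basis the element 1 has weight 2, since $1 = \omega + \omega^2$, while $\omega$ and $\omega^2$ each have weight 1.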
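3. Characterization

Let $\mathbf{1}$ denote the all ones vector in $GF(2)^n$. Define $f : \langle \mathbf{1} \rangle^\perp \to GF(2)$ by $f(x) = \operatorname{wt}(x)/2 \pmod 2$. Then $f$ is a quadratic form with associated bilinear form $B_f(x, y) = \sum_{i=1}^{n} x_i y_i$ ([3, 12]). Doubly-even self-dual binary codes are precisely the subspaces of dimension $n/2$ which are totally singular with respect to $f$.

We aim to generalize this interpretation of doubly-even self-dual codes in terms of a quadratic form to Type II codes over $GF(2^r)$. Let $B = \{\alpha_1, \ldots, \alpha_r\}$ be a trace-orthogonal basis of $GF(2^r)$ over GF(2). Then we have the following.

Lemma 3.1. If $u, v \in \langle \mathbf{1} \rangle^\perp$ and $(u, v) = 0$, then

$$\frac{\operatorname{wt}(u + v)}{2} \equiv \frac{\operatorname{wt}(u)}{2} + \frac{\operatorname{wt}(v)}{2} \pmod 2.$$

Proof. Our assumption implies $(\varphi(u), \varphi(v)) = 0$ (see Wolfmann [18], Proposition 2), so

$$\frac{\operatorname{wt}_H(\varphi(u) + \varphi(v))}{2} \equiv \frac{\operatorname{wt}_H(\varphi(u))}{2} + \frac{\operatorname{wt}_H(\varphi(v))}{2} \pmod 2.$$

Since $\varphi$ is an additive isometry, the result follows.

Proposition 3.2. Let $\langle \mathbf{1} \rangle$ be the 1-dimensional subspace of $GF(2^r)^n$ spanned by $\mathbf{1}$ over $GF(2^r)$. Define $f : \langle \mathbf{1} \rangle^\perp \to GF(2^r)$ by

$$f(u) = \sum_{j=1}^{r} \frac{\operatorname{wt}(\alpha_j u)}{2}\, \alpha_j^2.$$

Then $f$ is a quadratic form with associated bilinear form $(u, v) = \sum_{i=1}^{n} u_i v_i$.

Proof. Suppose $u \in \langle \mathbf{1} \rangle^\perp$ and $\alpha \in GF(2^r)$. Write $\alpha_j \alpha = \sum_{i=1}^{r} c_{ij} \alpha_i$, $c_{ij} \in GF(2)$. Then

$$f(\alpha u) = \sum_{j=1}^{r} \frac{\operatorname{wt}\left(\sum_{i=1}^{r} c_{ij} \alpha_i u\right)}{2}\, \alpha_j^2.$$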
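Since $\alpha_i u$ $(1 \le i \le r)$ are pairwise orthogonal, Lemma 3.1 implies

$$f(\alpha u) = \sum_{j=1}^{r} \sum_{i=1}^{r} c_{ij}\, \frac{\operatorname{wt}(\alpha_i u)}{2}\, \alpha_j^2 = \sum_{i=1}^{r} \frac{\operatorname{wt}(\alpha_i u)}{2} \left( \sum_{j=1}^{r} c_{ij} \alpha_j \right)^2.$$

Since $c_{ij} = \operatorname{Tr}(\alpha_i \alpha_j \alpha)$, we have $c_{ij} = c_{ji}$, so that

$$f(\alpha u) = \sum_{i=1}^{r} \frac{\operatorname{wt}(\alpha_i u)}{2}\, (\alpha_i \alpha)^2 = \alpha^2 f(u).$$

Now,

$$f(u + v) - f(u) - f(v) = \sum_{i=1}^{r} \frac{1}{2} \big( \operatorname{wt}(\alpha_i (u + v)) - \operatorname{wt}(\alpha_i u) - \operatorname{wt}(\alpha_i v) \big) \alpha_i^2$$
$$= \sum_{i=1}^{r} \frac{1}{2} \big( \operatorname{wt}(\varphi(\alpha_i u) + \varphi(\alpha_i v)) - \operatorname{wt}(\varphi(\alpha_i u)) - \operatorname{wt}(\varphi(\alpha_i v)) \big) \alpha_i^2$$
$$= \sum_{i=1}^{r} (\varphi(\alpha_i u), \varphi(\alpha_i v))\, \alpha_i^2$$
$$= \sum_{i=1}^{r} \operatorname{Tr}((\alpha_i u, \alpha_i v))\, \alpha_i^2$$
$$= \sum_{i=1}^{r} \operatorname{Tr}((u, v) \alpha_i^2)\, \alpha_i^2.$$

Since $\{\alpha_1, \ldots, \alpha_r\}$ is a trace-orthogonal basis, so is $\{\alpha_1^2, \ldots, \alpha_r^2\}$. Thus, if we write $(u, v) = \sum_{i=1}^{r} c_i \alpha_i^2$, then $\operatorname{Tr}((u, v) \alpha_i^2) = c_i$. This implies $f(u + v) - f(u) - f(v) = (u, v)$. This completes the proof.

Proposition 3.3. Let $f$ be as in the previous proposition. A self-dual code $C \subset GF(2^r)^n$ is Type II if and only if $C$ is totally singular with respect to $f$.

Proof. Notice that $C$ is totally singular if and only if $\operatorname{wt}(\alpha_i u) \equiv 0 \pmod 4$ for all $u \in C$ and $1 \le i \le r$. Since $C$ is a $GF(2^r)$-linear subspace, this is equivalent to $\operatorname{wt}(u) \equiv 0 \pmod 4$ for all $u \in C$.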
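The two defining properties of the quadratic form in Proposition 3.2 can be checked exhaustively in a small case. The sketch below assumes $r = 2$ and $n = 4$, encodes GF(4) as the integers 0 to 3 with $\omega = 2$ and $\omega^2 = 3$, and reads the coefficient $\operatorname{wt}(\alpha_j u)/2$ modulo 2; the encoding and the function names are illustrative assumptions.

```python
from itertools import product

BASIS = (2, 3)  # assumed encoding of the trace-orthogonal basis {omega, omega^2}

def mul(a, b):
    """Multiply two elements of GF(4), encoded as polynomials mod x^2 + x + 1."""
    result = 0
    for i in range(2):
        if (b >> i) & 1:
            result ^= a << i
    if result & 4:
        result ^= 0b111
    return result

def trace(a):
    return a ^ mul(a, a)

def weight(v):
    """Sum over entries of the number of nonzero coordinates in BASIS."""
    return sum(trace(mul(alpha, x)) for x in v for alpha in BASIS)

def scal(a, v):
    return tuple(mul(a, x) for x in v)

def add(u, v):
    return tuple(x ^ y for x, y in zip(u, v))

def inner(u, v):
    total = 0
    for x, y in zip(u, v):
        total ^= mul(x, y)
    return total

def f(u):
    """f(u) = sum_j (wt(alpha_j u) / 2 mod 2) * alpha_j^2."""
    value = 0
    for alpha in BASIS:
        if (weight(scal(alpha, u)) // 2) % 2:
            value ^= mul(alpha, alpha)
    return value

n = 4
ones = (1,) * n
# The domain of f: vectors orthogonal to the all-ones vector.
domain = [u for u in product(range(4), repeat=n) if inner(u, ones) == 0]

weights_even = all(weight(u) % 2 == 0 for u in domain)
scaling_ok = all(f(scal(a, u)) == mul(mul(a, a), f(u)) for a in range(4) for u in domain)
polar_ok = all((f(add(u, v)) ^ f(u) ^ f(v)) == inner(u, v) for u in domain for v in domain)
print(len(domain), weights_even, scaling_ok, polar_ok)
```

The check that every weight on the domain is even confirms that the division by 2 in the definition of $f$ makes sense there.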
We note that the radical of $f$ is $\langle \mathbf{1} \rangle$ if $n \equiv 0 \pmod 4$, and $f$ induces a nondegenerate quadratic form on $\langle \mathbf{1} \rangle^\perp / \langle \mathbf{1} \rangle$. The Witt index of $f$ will be determined in the next section.

4. Existence of Type II codes

In this section we determine the lengths for which a Type II code over $GF(2^r)$ exists. For $r = 1$, it is well-known that a Type II binary code of length $n$ exists if and only if $8 \mid n$. For $r = 2$, the code of length 4 generated by the following matrix is a Type II code:

$$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & \omega & \omega^2 \end{bmatrix} \qquad (1)$$

where $\omega^2 + \omega + 1 = 0$. This implies (see [6]) that a Type II code of length $n$ over GF(4) exists if and only if $4 \mid n$. The following theorem determines the lengths for which a Type II code over $GF(2^r)$ exists. We note that the existence of a Type II code of length $n$ over $GF(2^r)$ is independent of the choice of a trace-orthogonal basis, and depends only on $r$ and $n$.

Theorem 4.1. Let $B$ be an arbitrary trace-orthogonal basis of $GF(2^r)$ over GF(2). A Type II code of length $n$ with respect to $B$ exists if and only if $4 \mid n$ and $8 \mid rn$.

Proof. Suppose that there exists a Type II code $C$ of length $n$ with respect to $B = \{\alpha_1, \alpha_2, \ldots, \alpha_r\}$. Since $C$ is self-dual, $C$ contains the all-ones vector. Thus $\alpha_1 \mathbf{1} \in C$. Since $\operatorname{wt}(\alpha_1 \mathbf{1}) = n$ and $C$ is Type II, $n$ is divisible by 4. The Gray map image of $C$ is a Type II binary code of length $rn$, so $8 \mid rn$.

Conversely, suppose $4 \mid n$ and $8 \mid rn$. If $r$ is odd, then $n$ is divisible by 8. Let $C$ be a Type II binary code of length $n$. The code generated by $C$ over $GF(2^r)$ is a Type II code over $GF(2^r)$. If $r$ is even, then it suffices to show the existence of a Type II code of length 4 over $GF(2^r)$. We claim that the code with generator matrix (1) over $GF(2^r)$ is a Type II code. To prove this, it suffices to show $f_B(u) = 0$, where $u = (0, 1, \omega, \omega^2)$. Since

$$\operatorname{wt}(\alpha_i \omega^2) = |\{j \mid 1 \le j \le r,\ \operatorname{Tr}(\alpha_i \alpha_j \omega^2) = 1\}|$$
$$= |\{j \mid 1 \le j \le r,\ \operatorname{Tr}(\alpha_i \alpha_j) + \operatorname{Tr}(\alpha_i \alpha_j \omega) = 1\}|$$
$$= |\{j \mid 1 \le j \le r,\ j \ne i,\ \operatorname{Tr}(\alpha_i \alpha_j \omega) = 1\}| + \delta_{\operatorname{Tr}(\alpha_i^2 \omega), 0}$$
$$= |\{j \mid 1 \le j \le r,\ \operatorname{Tr}(\alpha_i \alpha_j \omega) = 1\}| - \delta_{\operatorname{Tr}(\alpha_i^2 \omega), 1} + \delta_{\operatorname{Tr}(\alpha_i^2 \omega), 0}$$
$$= \operatorname{wt}(\alpha_i \omega) - 1 + 2\delta_{\operatorname{Tr}(\alpha_i^2 \omega), 0},$$

we have

$$f_B(u) = \sum_{i=1}^{r} \frac{1}{2} \big( \operatorname{wt}(\alpha_i) + \operatorname{wt}(\alpha_i \omega) + \operatorname{wt}(\alpha_i \omega^2) \big) \alpha_i^2$$
$$= \sum_{i=1}^{r} \big( \operatorname{wt}(\alpha_i \omega) + \delta_{\operatorname{Tr}(\alpha_i^2 \omega), 0} \big) \alpha_i^2$$
$$= \sum_{i=1}^{r} \big( \operatorname{Tr}(\alpha_i \omega) + \operatorname{Tr}(\alpha_i^2 \omega) + 1 \big) \alpha_i^2$$
$$= \sum_{i=1}^{r} \big( \operatorname{Tr}(\alpha_i^2 \omega^2) + \operatorname{Tr}(\alpha_i^2 \omega) + \operatorname{Tr}(\alpha_i^2) \big) \alpha_i^2$$
$$= \sum_{i=1}^{r} \operatorname{Tr}(\alpha_i^2 (\omega^2 + \omega + 1))\, \alpha_i^2$$
$$= 0.$$

Therefore, the claim is proved.
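For $r = 2$ the claim about generator matrix (1) can also be confirmed by direct enumeration. The sketch below lists all 16 codewords over GF(4) and checks self-duality and the weight condition with respect to the basis $\{\omega, \omega^2\}$. The integer encoding of GF(4) ($\omega = 2$, $\omega^2 = 3$) and the helper names are assumptions made for this illustration.

```python
OMEGA, OMEGA2 = 2, 3   # assumed encoding: polynomials mod x^2 + x + 1
BASIS = (OMEGA, OMEGA2)

def mul(a, b):
    """Multiply two elements of GF(4)."""
    result = 0
    for i in range(2):
        if (b >> i) & 1:
            result ^= a << i
    if result & 4:
        result ^= 0b111
    return result

def trace(a):
    return a ^ mul(a, a)

def weight(v):
    return sum(trace(mul(alpha, x)) for x in v for alpha in BASIS)

def inner(u, v):
    total = 0
    for x, y in zip(u, v):
        total ^= mul(x, y)
    return total

def combo(a, u, b, v):
    """The vector a*u + b*v over GF(4)."""
    return tuple(mul(a, x) ^ mul(b, y) for x, y in zip(u, v))

ONE = (1, 1, 1, 1)
U = (0, 1, OMEGA, OMEGA2)

code = {combo(a, ONE, b, U) for a in range(4) for b in range(4)}

# A self-orthogonal code of length 4 with 4^2 codewords equals its own dual.
self_dual = len(code) == 4 ** 2 and all(inner(x, y) == 0 for x in code for y in code)
doubly_even = all(weight(x) % 4 == 0 for x in code)
print(len(code), self_dual, doubly_even, sorted({weight(x) for x in code}))
```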
5. Mass formula

Now a mass formula for Type II codes over $GF(2^r)$ can easily be derived. The formula is nothing but the one originally found by Segre [14], Ray-Chaudhuri [13], and Feng and Dai [5].

Theorem 5.1. Suppose $4 \mid n$ and $8 \mid rn$. Then the number of Type II codes with respect to a fixed trace-orthogonal basis $B$ over $GF(2^r)$ of length $n$ is given by

$$\prod_{i=0}^{n/2-2} (2^{ri} + 1).$$

For any vector $v \in \langle \mathbf{1} \rangle^\perp \setminus \langle \mathbf{1} \rangle$ satisfying $\operatorname{wt}_B(\alpha v) \equiv 0 \pmod 4$ for all $\alpha \in GF(2^r)$, the number of Type II codes with respect to $B$ over $GF(2^r)$ containing $v$ is given by

$$\prod_{i=0}^{n/2-3} (2^{ri} + 1).$$

Proof. The number of Type II codes over $GF(2^r)$ of length $n$ is the same as the number of (maximal) totally singular subspaces of dimension $n/2 - 1$ with respect to the nondegenerate quadratic form of Witt index $n/2 - 1$ on the $(n - 2)$-dimensional space $\langle \mathbf{1} \rangle^\perp / \langle \mathbf{1} \rangle$ induced by $f$. The number of Type II codes over $GF(2^r)$ containing $v$ is the same as the number of (maximal) totally singular subspaces of dimension $n/2 - 2$ with respect to the nondegenerate quadratic form of Witt index $n/2 - 2$ on the $(n - 4)$-dimensional space $\langle \mathbf{1}, v \rangle^\perp / \langle \mathbf{1}, v \rangle$ induced by $f$. The formulas for these numbers are well-known, and can be found in Corollary 7.25 of [16], for instance.

The correspondence established in the proof of the above theorem shows that the set of all Type II codes with respect to a fixed trace-orthogonal basis over $GF(2^r)$ of length $n$ forms a distance-regular graph known as the dual polar graph $D_{n/2-1}(2^r)$ (see [4]), if we join two Type II codes by an edge when their intersection has dimension $n/2 - 1$. In particular, this graph is connected, hence all Type II codes can be obtained from a single Type II code by successively taking neighbors at most $n/2 - 1$ times.

It follows from Theorem 5.1 that there are exactly two Type II codes of length 4 over $GF(2^r)$ whenever $r$ is even. As shown in the proof of Theorem 4.1, one of them is the code with generator matrix (1). The other code is obtained from this code by applying the coordinate permutation (3, 4). Therefore, we obtain the following.

Theorem 5.2. If $r$ is even, then the code with generator matrix (1) is the unique Type II code of length 4 over $GF(2^r)$, up to permutation equivalence.

Acknowledgements

The author would like to thank Masaaki Harada for helpful discussion, and Satoshi Yoshiara for bringing the article [3] to the author's attention.
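The first product in Theorem 5.1 is simple to evaluate, and in the smallest case $r = 2$, $n = 4$ it can be compared with a brute-force count. The sketch below is an illustration under stated assumptions: GF(4) is encoded as the integers 0 to 3, the function names are hypothetical, and the search uses the fact from the proof of Theorem 4.1 that every Type II code contains the all-ones vector, so that each candidate code of length 4 is spanned by $\mathbf{1}$ and one further vector.

```python
from itertools import product
from math import prod

def mass(r, n):
    """Number of Type II codes of length n over GF(2^r), per Theorem 5.1."""
    if n % 4 != 0 or (r * n) % 8 != 0:
        raise ValueError("requires 4 | n and 8 | rn")
    return prod(2 ** (r * i) + 1 for i in range(n // 2 - 1))

BASIS = (2, 3)  # assumed encoding of {omega, omega^2} in GF(4)

def mul(a, b):
    """Multiply two elements of GF(4), encoded as polynomials mod x^2 + x + 1."""
    result = 0
    for i in range(2):
        if (b >> i) & 1:
            result ^= a << i
    if result & 4:
        result ^= 0b111
    return result

def trace(a):
    return a ^ mul(a, a)

def weight(v):
    return sum(trace(mul(alpha, x)) for x in v for alpha in BASIS)

def inner(u, v):
    total = 0
    for x, y in zip(u, v):
        total ^= mul(x, y)
    return total

def span_with_ones(v):
    """All vectors a*1 + b*v with a, b in GF(4)."""
    return frozenset(
        tuple(a ^ mul(b, x) for x in v) for a in range(4) for b in range(4)
    )

def is_type_ii(code):
    return (
        len(code) == 16
        and all(inner(x, y) == 0 for x in code for y in code)
        and all(weight(x) % 4 == 0 for x in code)
    )

codes = {c for c in map(span_with_ones, product(range(4), repeat=4)) if is_type_ii(c)}
print(len(codes), mass(2, 4), mass(1, 8))
```

The two codes found are the code with generator matrix (1) and its image under the transposition of the last two coordinates.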
References

[1] K. Betsumiya, T. A. Gulliver, M. Harada and A. Munemasa, On Type II codes over $F_4$, IEEE Trans. Inform. Theory 47 (2001), 2242–2248.
[2] K. Betsumiya, M. Harada and A. Munemasa, Type II codes over $F_{2^r}$, in: Applied Algebra, Algebraic Algorithms and Error Correcting Codes, Springer Lecture Notes in Comput. Sci. 2227, Springer-Verlag, Berlin–Heidelberg 2001, 102–111.
[3] M. Broué, Codes correcteurs d'erreurs auto-orthogonaux sur le corps à deux éléments et formes quadratiques entières définies positives à discriminant +1, Discrete Math. 17 (1977), 247–269.
[4] A. E. Brouwer, A. M. Cohen and A. Neumaier, Distance-Regular Graphs, Springer-Verlag, 1988.
[5] X.-n. Feng and Z.-d. Dai, Notes on finite geometries and the construction of PBIB designs, V, Some "Anzahl" theorems in orthogonal geometry over finite fields of characteristic 2, Sci. Sinica 13 (1964), 2005–2008.
[6] P. Gaborit, V. Pless, P. Solé and O. Atkin, Type II codes over $F_4$, preprint.
[7] A. Lempel, Matrix factorization over GF(2) and trace-orthogonal bases of $GF(2^n)$, SIAM J. Comput. 4 (1975), 175–186.
[8] F. J. MacWilliams, N. J. A. Sloane and J. G. Thompson, Good self dual codes exist, Discrete Math. 3 (1972), 153–162.
[9] G. Pasquier, The binary Golay code obtained from an extended cyclic code over $F_8$, European J. Combin. 1 (1980), 369–370.
[10] G. Pasquier, Binary images of some self-dual codes over $GF(2^m)$ with respect to trace-orthogonal basis, Discrete Math. 37 (1981), 127–129.
[11] G. Pasquier, Binary self-dual codes construction from self-dual codes over a Galois field $F_{2^m}$, in: Combinatorial mathematics, North-Holland Math. Stud. 75, North-Holland, Amsterdam, New York 1983, 519–526.
[12] E. M. Rains and N. J. A. Sloane, Self-dual codes, in: Handbook of Coding Theory (V. Pless and W. C. Huffman, eds.), North-Holland, Amsterdam 1998.
[13] D. K. Ray-Chaudhuri, Some results on quadrics in finite projective geometry based on Galois fields, Canad. J. Math. 14 (1962), 129–138.
[14] B. Segre, Le geometrie di Galois, Ann. Mat. Pura Appl. (4) 48 (1959), 1–96.
[15] J. G. Thompson, Weighted averages associated to some codes, Scripta Math. 29 (1973), 449–452.
[16] Z.-X. Wan, Geometry of Classical Groups over Finite Fields, Chartwell Bratt, 1993.
[17] J. Wolfmann, A new construction of the binary Golay code (24, 12, 8) using a group algebra over a finite field, Discrete Math. 31 (1980), 337–338.
[18] J. Wolfmann, A class of doubly even self dual binary codes, Discrete Math. 56 (1985), 299–303.
[19] J. Wolfmann, A group algebra construction of binary even self dual codes, Discrete Math. 65 (1987), 81–89.

A. Munemasa
Department of Mathematics, Kyushu University
6-10-1 Hakozaki, Higashi-ku, Fukuoka, 812-8581 Japan
A posteriori probability decoding through the discrete Fourier transform and the dual code

Erin J. Schram
Abstract. In a posteriori probability decoding algorithms, such as trellis-based soft decoding algorithms for convolutional codes and the turbo decoding algorithm for systematic parallel concatenated codes, the probability states of each symbol are usually stored as log likelihood ratios, a notation which offers fast computation and, when the symbols are bits, symmetry between the 0 and 1 states. When the symbols of the code are in a finite field GF($p^n$) or an integer ring $\mathbb{Z}/n\mathbb{Z}$, storing the probability states of each symbol as the discrete Fourier transform of those probabilities also provides symmetry. Moreover, in this DFT notation the formulas for Bayesian reestimation of the probabilities sum over the words of the dual code rather than the original code. Since the iterative decoding algorithms propagate via relations from the parity check matrix, which are words in the dual code, the DFT notation offers a tool for tracking the spread of information in that propagation.

2000 Mathematics Subject Classification: primary 94B35; secondary 94B05.
Codes and Designs, Ohio State Univ. Math. Res. Inst. Publ. 10, © Walter de Gruyter 2002

1. Introduction

In the soft decoding of a codeword, the decoding algorithm acts not on a received codeword but on a set of probabilities about the symbol at each position in the received codeword. If these probabilities are determined from the strength and static of the incoming signal, they give useful additional information for decoding the codeword. The decoding is a Bayesian reestimation of these a priori probabilities, taking into account all possible transmitted codewords, to give the a posteriori probabilities of the symbols in the received codeword. In practice, unless the code is small, any such reestimation must be conducted piecewise by taking advantage of the structure of the code, for example, linearly in a trellis decoding of a convolutional code or iteratively in a turbo decoding of a turbo code.

With piecewise Bayesian reestimation, the notation used to represent the probability values can reduce the work of performing the decoding or can simplify the analysis of the decoding algorithm. Direct use of the probabilities is convenient for proofs, though a frequent need for renormalization in that notation leads some researchers to use instead weights that are merely proportional to the probabilities. For codes that use bits as symbols, the log likelihood ratio, $L(x) = \log(\mathrm{Prob}(x = 1)/\mathrm{Prob}(x = 0))$, both simplifies many intermediate calculations and gives a pleasant symmetry in that the values associated with the two symbols differ only in sign.

The notation discussed in this paper is based on the discrete Fourier transform (DFT) of the probability values. When the symbols are bits, this DFT notation resembles the log likelihood ratio in that the two symbol states differ only by sign. The DFT notation, however, lacks the ease of calculation of the log likelihood ratio. Also, since its values are complex numbers rather than real numbers, it requires twice as much memory to store as probability notation. But DFT notation is as convenient for proofs as the probability values themselves.

Interestingly, though the Bayesian reestimation for the probabilities is based on summing probability values for all the words in the code, the Bayesian reestimation for the DFT probabilities is based on summing values for all the words in the dual of the code. This link to the dual code makes this notation natural for analyzing any soft decoding algorithm that is guided by the parity check matrix in its partial reestimations.
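The sign symmetry mentioned above can be illustrated for a single bit. The following is a minimal sketch, assuming the binary character $\chi(x) = (-1)^x$; the function names `llr` and `dft_bit` are illustrative and do not come from the paper. It also checks the elementary identity $\hat{p}(1) = p_0 - p_1 = -\tanh(L/2)$, which links the two notations.

```python
import math

def llr(p0, p1):
    """Log likelihood ratio L = log(Prob(x = 1) / Prob(x = 0))."""
    return math.log(p1 / p0)

def dft_bit(p0, p1):
    """DFT of a bit distribution under chi(x) = (-1)**x.

    Returns (p_hat(0), p_hat(1)) = (p0 + p1, p0 - p1).
    """
    return (p0 + p1, p0 - p1)

p0, p1 = 0.8, 0.2
L = llr(p0, p1)
hat = dft_bit(p0, p1)

# Exchanging the roles of the two symbols negates both L and p_hat(1).
L_swapped = llr(p1, p0)
hat_swapped = dft_bit(p1, p0)
```

For bits the DFT values happen to be real; the complex values mentioned in the text arise for larger symbol sets.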
2. Discrete Fourier transform

Let $S$ be a finite commutative ring. Let $\chi$ be a character map from the additive group of $S$ to the multiplicative group in the complex numbers with the added property that for any nonzero $x$ in $S$ there is a nonzero $y$ in $S$ such that $\chi(xy) \neq 1$. For example, if $S$ is the integer ring $\mathbb{Z}/n\mathbb{Z}$, then $\chi(x) = e^{2\pi i x/n}$ is such a map. If $S$ is GF($p^n$), then $\chi(x) = e^{2\pi i\,\mathrm{tr}(x)/p}$, where $\mathrm{tr}(x)$ is the trace of GF($p^n$) over GF($p$), is such a map. If $S$ is GF($p$)$^n$ with pointwise multiplication, then $\chi(x) = e^{(2\pi i/p)\sum_{i=1}^{n} x_i}$ is such a map. The added property on the character map is to allow the following lemma:

Lemma 2.1. For any element $a$ in $S$,
\[
\sum_{x \in S} \chi(ax) =
\begin{cases}
|S| & \text{if } a = 0,\\
0 & \text{otherwise.}
\end{cases}
\]
In other words, $\sum_{x \in S} \chi(ax) = |S|\,\delta(a)$. Also, when $X$ is an $S$-submodule of $S^n$ (i.e., a subspace if $S$ is a field), and $a$ is an element of $S^n$, we have
\[
\sum_{x \in X} \chi(a \cdot x) =
\begin{cases}
|X| & \text{if } a \cdot x = 0 \text{ for all } x \text{ in } X,\\
0 & \text{otherwise.}
\end{cases}
\]

Proof. For $a = 0$,
\[
\sum_{x \in S} \chi(ax) = \sum_{x \in S} \chi(0) = \sum_{x \in S} 1 = |S|.
\]
For $a \neq 0$, let $b$ be an element in $S$ such that $\chi(ab) \neq 1$. Then
\[
\sum_{x \in S} \chi(ax) = \sum_{x \in S} \chi(a(x + b)) = \sum_{x \in S} \chi(ax)\chi(ab) = \chi(ab) \sum_{x \in S} \chi(ax).
\]
Thus, $(1 - \chi(ab)) \sum_{x \in S} \chi(ax) = 0$ and $1 - \chi(ab) \neq 0$, so $\sum_{x \in S} \chi(ax) = 0$.

Likewise, for $a$ in $S^n$, suppose there exists $z$ in $X$ such that $a \cdot z \neq 0$. Let $a' = a \cdot z$. There is an element $b$ in $S$ such that $\chi(a'b) \neq 1$. Since $X$ is a submodule, translation by $bz$ permutes $X$, so
\[
\sum_{x \in X} \chi(a \cdot x) = \sum_{x \in X} \chi(a \cdot (x + bz)) = \chi(a \cdot (bz)) \sum_{x \in X} \chi(a \cdot x) = \chi(a'b) \sum_{x \in X} \chi(a \cdot x).
\]
This gives $(1 - \chi(a'b)) \sum_{x \in X} \chi(a \cdot x) = 0$, so $\sum_{x \in X} \chi(a \cdot x) = 0$.

If $a \cdot x = 0$ for all $x$ in $X$, then
\[
\sum_{x \in X} \chi(a \cdot x) = \sum_{x \in X} \chi(0) = |X|.
\]
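Lemma 2.1 is easy to check numerically in the simplest case. The sketch below assumes $S = \mathbb{Z}/6\mathbb{Z}$ with $\chi(x) = e^{2\pi i x/6}$; the helper names `chi` and `character_sum` are illustrative only.

```python
import cmath

def chi(x, n):
    """Character chi(x) = exp(2*pi*i*x/n) on the ring Z/nZ."""
    return cmath.exp(2j * cmath.pi * (x % n) / n)

def character_sum(a, n):
    """Sum of chi(a*x) over all x in Z/nZ."""
    return sum(chi(a * x, n) for x in range(n))

n = 6
# One sum per element a of Z/6Z; only a = 0 should give a nonzero value.
sums = [character_sum(a, n) for a in range(n)]
```

Up to floating-point error, `sums[0]` equals $|S| = 6$ and every other entry vanishes, including those for the zero divisors 2, 3 and 4.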
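Definition 2.2. Given $v : S \to \mathbb{C}$, where $\mathbb{C}$ denotes the complex numbers, the discrete Fourier transform (DFT) of $v$ is $\hat{v} : S \to \mathbb{C}$, defined by
\[
\hat{v}(x) = \sum_{y \in S} \chi(xy)\,v(y) \quad \text{for all } x \text{ in } S.
\]
We also have the inverse discrete Fourier transform,
\[
v(x) = \frac{1}{|S|} \sum_{y \in S} \chi(-xy)\,\hat{v}(y) \quad \text{for all } x \text{ in } S.
\]

Theorem 2.3. The discrete Fourier transform and the inverse discrete Fourier transform are inverses of each other.

Proof. The inverse DFT of the DFT of $v$ at $x$ is
\[
\begin{aligned}
\frac{1}{|S|} \sum_{y \in S} \chi(-xy) \sum_{z \in S} \chi(zy)\,v(z)
&= \frac{1}{|S|} \sum_{y \in S} \sum_{z \in S} \chi((z - x)y)\,v(z) \\
&= \frac{1}{|S|} \sum_{z \in S} v(z) \sum_{y \in S} \chi((z - x)y) \\
&= \frac{1}{|S|} \sum_{z \in S} v(z)\,|S|\,\delta(z - x) \quad \text{by Lemma 2.1,} \\
&= v(x).
\end{aligned}
\]
The DFT of the inverse DFT of $v$ at $x$ is
\[
\begin{aligned}
\sum_{z \in S} \chi(xz)\,\frac{1}{|S|} \sum_{y \in S} \chi(-yz)\,v(y)
&= \frac{1}{|S|} \sum_{z \in S} \sum_{y \in S} \chi((x - y)z)\,v(y) \\
&= \frac{1}{|S|} \sum_{y \in S} v(y) \sum_{z \in S} \chi((x - y)z) \\
&= \frac{1}{|S|} \sum_{y \in S} v(y)\,|S|\,\delta(x - y) \quad \text{by Lemma 2.1,} \\
&= v(x).
\end{aligned}
\]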
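The transform pair of Definition 2.2 can be sketched directly from the two sums. The example assumes $S = \mathbb{Z}/4\mathbb{Z}$ with $\chi(x) = e^{2\pi i x/4}$ and an arbitrary probability vector `v`; the function names are illustrative.

```python
import cmath

def chi(x, n):
    """Character chi(x) = exp(2*pi*i*x/n) on Z/nZ."""
    return cmath.exp(2j * cmath.pi * (x % n) / n)

def dft(v):
    """v_hat(x) = sum over y of chi(x*y) * v(y)."""
    n = len(v)
    return [sum(chi(x * y, n) * v[y] for y in range(n)) for x in range(n)]

def inverse_dft(v_hat):
    """v(x) = (1/|S|) * sum over y of chi(-x*y) * v_hat(y)."""
    n = len(v_hat)
    return [sum(chi(-x * y, n) * v_hat[y] for y in range(n)) / n
            for x in range(n)]

v = [0.5, 0.25, 0.125, 0.125]      # a probability distribution on Z/4Z
v_hat = dft(v)
v_back = inverse_dft(v_hat)        # should reproduce v
v_hat_back = dft(inverse_dft(v_hat))  # should reproduce v_hat
```

Both round trips return the starting vector up to rounding, and `v_hat[0]` equals 1 because `v` sums to 1.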
3. Bayesian reestimation

Let $S$ be a ring and $\chi$ be a character map of $S$ with properties defined as above. Consider any code $C$ of length $n$ with symbol set $S$ that is linear over $S$. Let $I$ be the set of all positions in the code, $I = \{1, 2, \ldots, n\}$.

When a codeword $c = (c_1, c_2, \ldots, c_n)$ in $C$ is transmitted over a memoryless channel as a signal, our receiver analyzes the signal to determine a set of probabilities about the symbols in the codeword rather than providing the best guess for those symbols. These a priori probabilities are based on the signal alone, at this stage incorporating no information about the code; thus, the probabilities at separate positions are independent. Let $A$ be a vector of random variables $(A_1, A_2, \ldots, A_n)$ such that for each $i$ in $I$ the random variable $A_i$ contains the probabilities from the receiver about the value of $c_i$ in the codeword $c$. Let $p(A_i, x)$ denote the probability that $c_i$ is $x$. Thus, $\sum_{x \in S} p(A_i, x) = 1$ for all positions $i$.

Incorporating knowledge of the code $C$ into the probabilities lets us reestimate the probabilities in the random variables, yielding new a posteriori probabilities. This paper will use its own notation $R(C)$ for the reestimation to emphasize which code is involved, because often we will reestimate using only part of the code.

Definition 3.1. Given probabilities $A$ that do not incorporate knowledge of the code $C$, let $R(C)A$ denote the transform of those probabilities that occurs when the transmitted codeword is required to be in $C$:
\[
\begin{aligned}
p(R(C)A_i, x) &= \mathrm{Prob}(\text{The symbol at position } i \text{ in the transmitted codeword was } x \mid C, p) \\
&= \sum_{c \in C \text{ such that } c_i = x} \mathrm{Prob}(c \text{ was transmitted} \mid p) \\
&= \frac{\sum_{c \in C \text{ such that } c_i = x} \prod_{j \in I} p(A_j, c_j)}{\sum_{c \in C} \prod_{j \in I} p(A_j, c_j)}.
\end{aligned}
\]
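For a small code the reestimation of Definition 3.1 can be computed by brute force. The sketch below assumes the binary repetition code of length 3 and made-up a priori probabilities; `bayes_reestimate` is an illustrative name, and positions are indexed from 0 in the code.

```python
from math import prod

def bayes_reestimate(code, p):
    """Brute-force Bayesian reestimation over an explicit list of codewords.

    code: list of codewords, each a tuple of symbols in {0, ..., q-1}.
    p:    p[j][s] is the a priori probability p(A_j, s).
    Returns post with post[i][x] = p(R(C)A_i, x).
    """
    n, q = len(p), len(p[0])
    # Weight of each codeword: product of the a priori symbol probabilities.
    weight = {c: prod(p[j][c[j]] for j in range(n)) for c in code}
    total = sum(weight.values())
    return [[sum(w for c, w in weight.items() if c[i] == x) / total
             for x in range(q)]
            for i in range(n)]

code = [(0, 0, 0), (1, 1, 1)]                 # repetition code (assumed example)
p = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]      # hypothetical receiver output
post = bayes_reestimate(code, p)
```

Because the repetition code forces all three symbols to agree, every position ends up with the same a posteriori distribution, here $0.162/0.19$ for symbol 0.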
We wish to clarify that the reestimation $R(C)$ acts on the entire vector $A$ and that $R(C)A_i$ denotes the $i$th random variable in $R(C)A$, not an attempt to reestimate at $A_i$ alone.

The discrete Fourier transform of the probability values in $A_i$ yields a new set of values. However, since the DFT is invertible, $A_i$ in DFT notation contains the same information as $A_i$ in probability notation, so we can use $A_i$ to denote either notation. But the individual values differ. A DFT value in $A_i$ will be denoted $\hat{p}(A_i, x)$, defined as
\[
\hat{p}(A_i, x) = \sum_{y \in S} \chi(xy)\,p(A_i, y)
\]
for all $x$ in $S$. The values of the DFT notation seem to have no interpretation in terms of actual probabilities or likelihoods of the received codeword. However, for any variable $A_i$, its first DFT value, $\hat{p}(A_i, 0)$, seems to serve a normalization role as shown in Lemma 3.2.

Lemma 3.2. For any position $i$, $\hat{p}(A_i, 0) = 1$.

Proof. $\hat{p}(A_i, 0) = \sum_{y \in S} \chi(0y)\,p(A_i, y) = \sum_{y \in S} p(A_i, y) = 1$.

We wish to find a formula for Bayesian reestimation that works with the DFT probability notation.

Definition 3.3. Let $C$ be any code of length $n$ with symbol set $S$ that is linear over $S$. Define the DFT reestimation $\hat{R}(C)$ with respect to $C$ of the random variables $A$ in DFT probability notation $\hat{p}$ by
\[
\hat{p}(\hat{R}(C)A_i, x) = \frac{\sum_{c \in C} \hat{p}(A_i, c_i + x) \prod_{j \in I \setminus \{i\}} \hat{p}(A_j, c_j)}{\sum_{c \in C} \prod_{j \in I} \hat{p}(A_j, c_j)}
\]
for any position $i$ and symbol $x$. If we define $C + v$ as the coset $\{\, c + v \mid c \in C \,\}$, then we have an alternative notation for the DFT reestimation,
\[
\hat{p}(\hat{R}(C)A_i, x) = \frac{\sum_{c \in C + xe_i} \prod_{j \in I} \hat{p}(A_j, c_j)}{\sum_{c \in C} \prod_{j \in I} \hat{p}(A_j, c_j)},
\]
where $e_i$ is the vector in $S^n$ that is 1 at the $i$th position and 0 elsewhere.

Note that, in accordance with Lemma 3.2, $\hat{p}(\hat{R}(C)A_i, 0) = 1$.

Theorem 3.4. Given the random variables $A$, the relation between Bayesian reestimation and DFT reestimation is that $R(C)A = \hat{R}(C^{\perp})A$, where $C^{\perp}$ is the dual code of $C$.
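Proof. At position $i$ and symbol $x$,
\[
\begin{aligned}
\hat{p}(R(C)A_i, x) &= \sum_{y \in S} \chi(xy)\,p(R(C)A_i, y) \\
&= \sum_{y \in S} \chi(xy)\,\frac{\sum_{c \in C \text{ with } c_i = y} \prod_{j \in I} p(A_j, c_j)}{\sum_{c \in C} \prod_{j \in I} p(A_j, c_j)} \\
&= \frac{\sum_{y \in S} \chi(xy) \sum_{c \in C \text{ with } c_i = y} \prod_{j \in I} p(A_j, c_j)}{\sum_{c \in C} \prod_{j \in I} p(A_j, c_j)} \\
&= \frac{\sum_{c \in C} \chi(xc_i) \prod_{j \in I} p(A_j, c_j)}{\sum_{c \in C} \prod_{j \in I} p(A_j, c_j)} \\
&= \frac{\sum_{c \in C} \chi(xc_i) \prod_{j \in I} \frac{1}{|S|} \sum_{z \in S} \chi(-zc_j)\,\hat{p}(A_j, z)}{\sum_{c \in C} \prod_{j \in I} \frac{1}{|S|} \sum_{z \in S} \chi(-zc_j)\,\hat{p}(A_j, z)} \\
&= \frac{\sum_{c \in C} \chi(xc_i)\,\frac{1}{|S|^n} \sum_{z \in S^n} \prod_{j \in I} \chi(-z_j c_j)\,\hat{p}(A_j, z_j)}{\sum_{c \in C} \frac{1}{|S|^n} \sum_{z \in S^n} \prod_{j \in I} \chi(-z_j c_j)\,\hat{p}(A_j, z_j)} \\
&= \frac{\sum_{z \in S^n} \prod_{j \in I} \hat{p}(A_j, z_j) \sum_{c \in C} \chi(xc_i) \prod_{j \in I} \chi(-z_j c_j)}{\sum_{z \in S^n} \prod_{j \in I} \hat{p}(A_j, z_j) \sum_{c \in C} \prod_{j \in I} \chi(-z_j c_j)} \\
&= \frac{\sum_{z \in S^n} \prod_{j \in I} \hat{p}(A_j, z_j) \sum_{c \in C} \chi(-c \cdot (z - xe_i))}{\sum_{z \in S^n} \prod_{j \in I} \hat{p}(A_j, z_j) \sum_{c \in C} \chi(-c \cdot z)},
\end{aligned}
\]
where $e_i$ is the vector in $S^n$ that is 1 in the $i$th position and 0 elsewhere. By Lemma 2.1, the sum $\sum_{c \in C} \chi(-c \cdot z)$ is zero unless $c \cdot z = 0$ for all $c$ in $C$. Thus, we can ignore all terms $\sum_{c \in C} \chi(-c \cdot z)$ except when $z$ is in $C^{\perp}$, in which case $\sum_{c \in C} \chi(-c \cdot z) = |C|$. Similarly, the sum in the numerator is nonzero only when $z - xe_i$ is in $C^{\perp}$. Hence
\[
\hat{p}(R(C)A_i, x) = \frac{\sum_{z \in C^{\perp} + xe_i} \prod_{j \in I} \hat{p}(A_j, z_j)}{\sum_{z \in C^{\perp}} \prod_{j \in I} \hat{p}(A_j, z_j)} = \hat{p}(\hat{R}(C^{\perp})A_i, x).
\]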
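Theorem 3.4 can be confirmed numerically on a toy case. The following sketch assumes binary symbols with $\chi(x) = (-1)^x$, the even-weight code of length 3 as $C$, and made-up a priori probabilities; all function names are illustrative. It compares the DFT of the brute-force Bayesian reestimation over $C$ with the DFT reestimation of Definition 3.3 over $C^{\perp}$.

```python
from itertools import product
from math import prod

def chi(x):
    """Binary character chi(x) = (-1)**x."""
    return -1 if x % 2 else 1

def dual(code, n):
    """All binary words orthogonal to every codeword of `code`."""
    return [z for z in product((0, 1), repeat=n)
            if all(sum(a * b for a, b in zip(c, z)) % 2 == 0 for c in code)]

def bayes(code, p):
    """p(R(C)A_i, x) by direct summation over the codewords."""
    n = len(p)
    w = {c: prod(p[j][c[j]] for j in range(n)) for c in code}
    total = sum(w.values())
    return [[sum(v for c, v in w.items() if c[i] == x) / total
             for x in (0, 1)] for i in range(n)]

def dft(p):
    """Position-wise DFT: p_hat(A_i, x) = sum_y chi(x*y) p(A_i, y)."""
    return [[sum(chi(x * y) * row[y] for y in (0, 1)) for x in (0, 1)]
            for row in p]

def dft_reestimate(code, p_hat):
    """DFT reestimation of Definition 3.3 over the given code."""
    n = len(p_hat)
    denom = sum(prod(p_hat[j][c[j]] for j in range(n)) for c in code)
    result = []
    for i in range(n):
        row = []
        for x in (0, 1):
            # Shift only coordinate i by x, as in p_hat(A_i, c_i + x).
            num = sum(prod(p_hat[j][(c[j] + x) % 2 if j == i else c[j]]
                           for j in range(n)) for c in code)
            row.append(num / denom)
        result.append(row)
    return result

code = [(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 1)]   # even-weight code
p = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]              # hypothetical inputs
lhs = dft(bayes(code, p))                     # DFT of R(C)A
rhs = dft_reestimate(dual(code, 3), dft(p))   # R_hat(C_perp) applied to A
```

Here the dual code has only two words, so the right-hand side sums two terms per position while the left-hand side sums four.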
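Common sense dictates that if the parity check conditions do not apply to a position on the codeword, the reestimation should not change the probabilities at that position. The following lemma proves that property for DFT notation.

Lemma 3.5. Suppose there exists a position $i$ in $I$ such that for all codewords $c$ in $C$, $c_i = 0$. Then $\hat{R}(C)A_i = A_i$.

Proof. For $x$ in $S$,
\[
\begin{aligned}
\hat{p}(\hat{R}(C)A_i, x) &= \frac{\sum_{c \in C} \hat{p}(A_i, c_i + x) \prod_{j \in I \setminus \{i\}} \hat{p}(A_j, c_j)}{\sum_{c \in C} \hat{p}(A_i, c_i) \prod_{j \in I \setminus \{i\}} \hat{p}(A_j, c_j)} \\
&= \frac{\sum_{c \in C} \hat{p}(A_i, x) \prod_{j \in I \setminus \{i\}} \hat{p}(A_j, c_j)}{\sum_{c \in C} \hat{p}(A_i, 0) \prod_{j \in I \setminus \{i\}} \hat{p}(A_j, c_j)} \quad \text{since } c_i = 0 \text{ for all } c \text{ in } C, \\
&= \frac{\hat{p}(A_i, x) \sum_{c \in C} \prod_{j \in I \setminus \{i\}} \hat{p}(A_j, c_j)}{\hat{p}(A_i, 0) \sum_{c \in C} \prod_{j \in I \setminus \{i\}} \hat{p}(A_j, c_j)} \\
&= \frac{\hat{p}(A_i, x)}{\hat{p}(A_i, 0)} \\
&= \hat{p}(A_i, x) \quad \text{since } \hat{p}(A_i, 0) = 1 \text{ by Lemma 3.2.}
\end{aligned}
\]

Conversely, the probability values at positions outside the support of the dual code do not affect the reestimation of probabilities at other positions.

Lemma 3.6. Let $C$ be a code and $A$ and $B$ be two sets of probabilities about a codeword such that $A_i = B_i$ for all positions $i$ in $\mathrm{Supp}(C)$, where $\mathrm{Supp}(C) = \{\, i \in I \mid c_i \neq 0 \text{ for some } c \in C \,\}$. Then $\hat{R}(C)A_i = \hat{R}(C)B_i$ for all positions $i$ in $\mathrm{Supp}(C)$.

Proof. For $x$ in $S$ and $i$ in $\mathrm{Supp}(C)$, we have $\mathrm{Supp}(C + xe_i) = \mathrm{Supp}(C)$ and
\[
\begin{aligned}
\hat{p}(\hat{R}(C)A_i, x) &= \frac{\sum_{c \in C + xe_i} \prod_{j \in I} \hat{p}(A_j, c_j)}{\sum_{c \in C} \prod_{j \in I} \hat{p}(A_j, c_j)} \\
&= \frac{\sum_{c \in C + xe_i} \prod_{j \in \mathrm{Supp}(C)} \hat{p}(A_j, c_j) \prod_{j \in I \setminus \mathrm{Supp}(C)} \hat{p}(A_j, c_j)}{\sum_{c \in C} \prod_{j \in \mathrm{Supp}(C)} \hat{p}(A_j, c_j) \prod_{j \in I \setminus \mathrm{Supp}(C)} \hat{p}(A_j, c_j)} \\
&= \frac{\sum_{c \in C + xe_i} \prod_{j \in \mathrm{Supp}(C)} \hat{p}(A_j, c_j) \prod_{j \in I \setminus \mathrm{Supp}(C)} \hat{p}(A_j, 0)}{\sum_{c \in C} \prod_{j \in \mathrm{Supp}(C)} \hat{p}(A_j, c_j) \prod_{j \in I \setminus \mathrm{Supp}(C)} \hat{p}(A_j, 0)} \\
&= \frac{\sum_{c \in C + xe_i} \prod_{j \in \mathrm{Supp}(C)} \hat{p}(A_j, c_j)}{\sum_{c \in C} \prod_{j \in \mathrm{Supp}(C)} \hat{p}(A_j, c_j)}
\end{aligned}
\]
by Lemma 3.2. Likewise, for the same $i$ and $x$,
\[
\hat{p}(\hat{R}(C)B_i, x) = \frac{\sum_{c \in C + xe_i} \prod_{j \in \mathrm{Supp}(C)} \hat{p}(B_j, c_j)}{\sum_{c \in C} \prod_{j \in \mathrm{Supp}(C)} \hat{p}(B_j, c_j)}.
\]
By the hypothesis, $\hat{p}(A_j, c_j) = \hat{p}(B_j, c_j)$ for all possible $c_j$ in $S$ and all positions $j$ in $\mathrm{Supp}(C)$. So $\hat{p}(\hat{R}(C)A_i, x) = \hat{p}(\hat{R}(C)B_i, x)$.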
4. Piecewise decoding and the DFT notation

Trellis decoding and iterative decoding involve decoding parts of the codeword based on partial information. The following theorems demonstrate how two partial reestimations can combine to give a complete reestimation.

Theorem 4.1. For $C$ a code, let $C_1$ and $C_2$ be subcodes of $C$ such that $C = C_1 + C_2$ and $\mathrm{Supp}(C_1) \cap \mathrm{Supp}(C_2) = \emptyset$. For any probabilities $A$, we have $\hat{R}(C)A = \hat{R}(C_2)\hat{R}(C_1)A$.

Proof. Let $S_1 = \mathrm{Supp}(C_1)$ and $S_2 = \mathrm{Supp}(C_2)$. Note that $\mathrm{Supp}(C) = S_1 \cup S_2$. By Lemmas 3.5 and 3.6, we have
\[
\hat{p}(\hat{R}(C_2)\hat{R}(C_1)A_i, x) =
\begin{cases}
\hat{p}(\hat{R}(C_1)A_i, x) & \text{for } i \text{ in } S_1,\\
\hat{p}(\hat{R}(C_2)A_i, x) & \text{for } i \text{ in } S_2,\\
\hat{p}(A_i, x) & \text{for } i \text{ not in } S_1 \cup S_2.
\end{cases}
\]
For $i$ in $S_1$,
\[
\begin{aligned}
\hat{p}(\hat{R}(C)A_i, x) &= \frac{\sum_{c \in C + xe_i} \prod_{j \in I} \hat{p}(A_j, c_j)}{\sum_{c \in C} \prod_{j \in I} \hat{p}(A_j, c_j)} \\
&= \frac{\sum_{c \in C + xe_i} \prod_{j \in S_1} \hat{p}(A_j, c_j) \prod_{j \in S_2} \hat{p}(A_j, c_j)}{\sum_{c \in C} \prod_{j \in S_1} \hat{p}(A_j, c_j) \prod_{j \in S_2} \hat{p}(A_j, c_j)}
\end{aligned}
\]
by Lemma 3.2, since $c_j = 0$ for all $j$ not in $S_1 \cup S_2$,
\[
= \frac{\sum_{b \in C_1 + xe_i} \sum_{c \in C_2} \prod_{j \in S_1} \hat{p}(A_j, b_j) \prod_{j \in S_2} \hat{p}(A_j, c_j)}{\sum_{b \in C_1} \sum_{c \in C_2} \prod_{j \in S_1} \hat{p}(A_j, b_j) \prod_{j \in S_2} \hat{p}(A_j, c_j)}
\]
since $C = C_1 + C_2$ and $C_1 \cap C_2 = \{0\}$,
\[
\begin{aligned}
&= \frac{\sum_{b \in C_1 + xe_i} \prod_{j \in S_1} \hat{p}(A_j, b_j) \sum_{c \in C_2} \prod_{j \in S_2} \hat{p}(A_j, c_j)}{\sum_{b \in C_1} \prod_{j \in S_1} \hat{p}(A_j, b_j) \sum_{c \in C_2} \prod_{j \in S_2} \hat{p}(A_j, c_j)} \\
&= \frac{\sum_{b \in C_1 + xe_i} \prod_{j \in S_1} \hat{p}(A_j, b_j)}{\sum_{b \in C_1} \prod_{j \in S_1} \hat{p}(A_j, b_j)} \\
&= \hat{p}(\hat{R}(C_1)A_i, x).
\end{aligned}
\]
For $i$ in $S_2$, we just exchange the roles of $C_1$ and $C_2$ in the above calculation to get $\hat{p}(\hat{R}(C)A_i, x) = \hat{p}(\hat{R}(C_2)A_i, x)$. For $i$ not in $S_1 \cup S_2$, $\hat{p}(\hat{R}(C)A_i, x) = \hat{p}(A_i, x)$ by Lemma 3.5.

Theorem 4.2. Let $C$ be a code with subcodes $C_1$ and $C_2$ such that $C = C_1 + C_2$ and $\mathrm{Supp}(C_1) \cap \mathrm{Supp}(C_2) = \{k\}$. Then for any position $i$ in $\mathrm{Supp}(C_2)$, $\hat{R}(C_2)\hat{R}(C_1)A_i = \hat{R}(C)A_i$.

Proof. Let $S_1 = \mathrm{Supp}(C_1) \setminus \{k\}$ and $S_2 = \mathrm{Supp}(C_2) \setminus \{k\}$. For any $i$ in $S_2 \cup \{k\}$ and $x$ in $S$,
\[
\begin{aligned}
\hat{p}(\hat{R}(C_2)\hat{R}(C_1)A_i, x) &= \frac{\sum_{c \in C_2 + xe_i} \prod_{j \in I} \hat{p}(\hat{R}(C_1)A_j, c_j)}{\sum_{c \in C_2} \prod_{j \in I} \hat{p}(\hat{R}(C_1)A_j, c_j)} \\
&= \frac{\sum_{c \in C_2 + xe_i} \hat{p}(\hat{R}(C_1)A_k, c_k) \prod_{j \in S_1} \hat{p}(\hat{R}(C_1)A_j, c_j) \prod_{j \in S_2} \hat{p}(\hat{R}(C_1)A_j, c_j)}{\sum_{c \in C_2} \hat{p}(\hat{R}(C_1)A_k, c_k) \prod_{j \in S_1} \hat{p}(\hat{R}(C_1)A_j, c_j) \prod_{j \in S_2} \hat{p}(\hat{R}(C_1)A_j, c_j)}.
\end{aligned}
\]
For $c$ in $C_2$ or $C_2 + xe_i$, we have $c_j = 0$ for all $j$ in $S_1$. Thus, by Lemma 3.2, $\prod_{j \in S_1} \hat{p}(\hat{R}(C_1)A_j, c_j) = \prod_{j \in S_1} \hat{p}(\hat{R}(C_1)A_j, 0) = 1$. For all $c$ in $C_1$ and all $j$ in $S_2$, we have $c_j = 0$, so by Lemma 3.5, $\hat{p}(\hat{R}(C_1)A_j, x) = \hat{p}(A_j, x)$. Hence
\[
\begin{aligned}
\hat{p}(\hat{R}(C_2)\hat{R}(C_1)A_i, x) &= \frac{\sum_{c \in C_2 + xe_i} \hat{p}(\hat{R}(C_1)A_k, c_k) \prod_{j \in S_2} \hat{p}(A_j, c_j)}{\sum_{c \in C_2} \hat{p}(\hat{R}(C_1)A_k, c_k) \prod_{j \in S_2} \hat{p}(A_j, c_j)} \\
&= \frac{\sum_{c \in C_2 + xe_i} \dfrac{\sum_{b \in C_1} \hat{p}(A_k, b_k + c_k) \prod_{j \in I \setminus \{k\}} \hat{p}(A_j, b_j)}{\sum_{b \in C_1} \prod_{j \in I} \hat{p}(A_j, b_j)} \prod_{j \in S_2} \hat{p}(A_j, c_j)}{\sum_{c \in C_2} \dfrac{\sum_{b \in C_1} \hat{p}(A_k, b_k + c_k) \prod_{j \in I \setminus \{k\}} \hat{p}(A_j, b_j)}{\sum_{b \in C_1} \prod_{j \in I} \hat{p}(A_j, b_j)} \prod_{j \in S_2} \hat{p}(A_j, c_j)} \\
&= \frac{\sum_{c \in C_2 + xe_i} \sum_{b \in C_1} \hat{p}(A_k, b_k + c_k) \prod_{j \in I \setminus \{k\}} \hat{p}(A_j, b_j) \prod_{j \in S_2} \hat{p}(A_j, c_j)}{\sum_{c \in C_2} \sum_{b \in C_1} \hat{p}(A_k, b_k + c_k) \prod_{j \in I \setminus \{k\}} \hat{p}(A_j, b_j) \prod_{j \in S_2} \hat{p}(A_j, c_j)} \\
&= \frac{\sum_{c \in C_2 + xe_i} \sum_{b \in C_1} \hat{p}(A_k, b_k + c_k) \prod_{j \in S_1} \hat{p}(A_j, b_j) \prod_{j \in S_2} \hat{p}(A_j, c_j)}{\sum_{c \in C_2} \sum_{b \in C_1} \hat{p}(A_k, b_k + c_k) \prod_{j \in S_1} \hat{p}(A_j, b_j) \prod_{j \in S_2} \hat{p}(A_j, c_j)}
\end{aligned}
\]
since $b_j = 0$ for all $b$ in $C_1$ and all $j$ not in $S_1 \cup \{k\}$.

$C = C_1 + C_2$, so for every codeword $b$ in $C_1$ and $c$ in $C_2$, $d = b + c$ is in $C$. Note that $d_j = b_j$ for $j$ in $S_1$, $d_j = c_j$ for $j$ in $S_2$, and $d_k = b_k + c_k$. For a given codeword $d$ in $C$, the number of $(b, c)$ pairs in $C_1 \times C_2$ such that $d = b + c$ is $|C_1 \cap C_2|$. Since the all-zero vector 0 is in $C_1 \cap C_2$, $|C_1 \cap C_2| \geq 1$. Also $C_1 + (C_2 + xe_i) = C + xe_i$. So
\[
\begin{aligned}
\hat{p}(\hat{R}(C_2)\hat{R}(C_1)A_i, x) &= \frac{|C_1 \cap C_2| \sum_{d \in C + xe_i} \hat{p}(A_k, d_k) \prod_{j \in S_1} \hat{p}(A_j, d_j) \prod_{j \in S_2} \hat{p}(A_j, d_j)}{|C_1 \cap C_2| \sum_{d \in C} \hat{p}(A_k, d_k) \prod_{j \in S_1} \hat{p}(A_j, d_j) \prod_{j \in S_2} \hat{p}(A_j, d_j)} \\
&= \hat{p}(\hat{R}(C)A_i, x).
\end{aligned}
\]

Use of the above theorem is not limited to dual codes that have only one splitting point $k$. It is possible to chain together several subcodes of the dual code. Theorem 4.2 can be generalized to reestimate over several subcodes of the dual code provided that the intersections of the supports of the subcodes contain at most single points and the subcodes are chained together in an orderly structure.

Theorem 4.3 (Chain Theorem). Let $C$ be a dual code with subcodes $C_1, C_2, \ldots, C_t$ such that $C = \sum_{i=1}^{t} C_i$. If there exist positions $k_1, k_2, \ldots, k_{t-1}$ in $I$ such that $\mathrm{Supp}(C_i) \cap \mathrm{Supp}(\sum_{j=i+1}^{t} C_j) = \{k_i\}$ for $i = 1, \ldots, t - 1$, then for any position $i$ in $\mathrm{Supp}(C_t)$, $\hat{R}(C)A_i = \hat{R}(C_t)\hat{R}(C_{t-1}) \cdots \hat{R}(C_1)A_i$.

Proof. This theorem holds for $t = 1$ trivially and for $t = 2$ by Theorem 4.2. We proceed with a proof by induction on $t$. Suppose for some positive integer $t$ the hypothesis holds for all partitions of codes into fewer than $t$ subcodes.

Let $D = \sum_{i=2}^{t} C_i$. The code $D$ partitioned into $C_2, C_3, \ldots, C_t$ satisfies the hypothesis and the partition has only $t - 1$ subcodes. So for any set of probabilities $B$, we have $\hat{R}(D)B_i = \hat{R}(C_t)\hat{R}(C_{t-1}) \cdots \hat{R}(C_2)B_i$ for all positions $i$ in $\mathrm{Supp}(C_t)$.

$\hat{R}(C_1)A$ is a set of probabilities, so we can apply the $t - 1$ case of this theorem to it. Thus, $\hat{R}(D)\hat{R}(C_1)A_i = \hat{R}(C_t) \cdots \hat{R}(C_2)\hat{R}(C_1)A_i$ for all positions $i$ in $\mathrm{Supp}(C_t)$.

$\mathrm{Supp}(C_1) \cap \mathrm{Supp}(D) = \{k_1\}$. Then by Theorem 4.2, for all positions $i$ in $\mathrm{Supp}(D)$, $\hat{R}(D)\hat{R}(C_1)A_i = \hat{R}(C)A_i$. Since $\mathrm{Supp}(C_t) \subseteq \mathrm{Supp}(D)$, our conclusion holds.

The arrangement of subcodes used in the Chain Theorem (Theorem 4.3) can be viewed as a tree.
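The Chain Theorem can be exercised on a small binary example. The sketch below assumes $\chi(x) = (-1)^x$, a length-4 code generated by three overlapping weight-2 words, and arbitrary DFT values with $\hat{p}(A_j, 0) = 1$; the names `span` and `dft_reestimate` are illustrative, and positions are indexed from 0. The three subcodes overlap pairwise in single positions, so the hypothesis of the theorem holds with $t = 3$.

```python
from itertools import product
from math import prod

def span(generators, n):
    """All GF(2)-linear combinations of the generator words."""
    words = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        words.add(tuple(sum(a * g[j] for a, g in zip(coeffs, generators)) % 2
                        for j in range(n)))
    return sorted(words)

def dft_reestimate(code, p_hat):
    """DFT reestimation of Definition 3.3 for binary symbols."""
    n = len(p_hat)
    denom = sum(prod(p_hat[j][c[j]] for j in range(n)) for c in code)
    result = []
    for i in range(n):
        row = []
        for x in (0, 1):
            num = sum(prod(p_hat[j][(c[j] + x) % 2 if j == i else c[j]]
                           for j in range(n)) for c in code)
            row.append(num / denom)
        result.append(row)
    return result

n = 4
g1, g2, g3 = (1, 1, 0, 0), (0, 1, 1, 0), (0, 0, 1, 1)
C1, C2, C3 = span([g1], n), span([g2], n), span([g3], n)
C = span([g1, g2, g3], n)            # C = C1 + C2 + C3

# Hypothetical DFT values; each row is (p_hat(A_j, 0), p_hat(A_j, 1)).
p_hat = [[1.0, 0.8], [1.0, 0.2], [1.0, -0.4], [1.0, 0.5]]

chained = dft_reestimate(C3, dft_reestimate(C2, dft_reestimate(C1, p_hat)))
full = dft_reestimate(C, p_hat)
```

The chained and full results agree at positions 2 and 3, the support of `C3`, but not at position 0, which matches the restriction to $\mathrm{Supp}(C_t)$ in the statement.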
Definition 4.4. For $C$ a code and $\mathcal{D}$ a set of subcodes of $C$ such that $C = \sum_{D \in \mathcal{D}} D$, we will call $\mathcal{D}$ a partition of the code $C$. Define the Tanner graph of $C$ with respect to the partition $\mathcal{D}$ to be a bipartite graph whose two vertex sets are the codeword positions $I$ and the partition $\mathcal{D}$ and whose edges are all pairs $(k, D)$ of a position $k$ and a subcode $D$ such that $k$ is in $\mathrm{Supp}(D)$.

Theorem 4.5. Let $C$ be a code with $\mathrm{Supp}(C) = I$, the set of all positions in the codewords. Let $\mathcal{D}$ be a partition of $C$ of size $t$ and let $D$ be a subcode in $\mathcal{D}$. The subcodes in $\mathcal{D}$ can be ordered as $C_1, C_2, \ldots, C_t$ with $C_t = D$ and $\mathrm{Supp}(C_i) \cap \mathrm{Supp}(\sum_{j=i+1}^{t} C_j) = \{k_i\}$ for some set of positions $k_1, \ldots, k_{t-1}$ in $I$ if and only if the Tanner graph of $C$ with respect to $\mathcal{D}$ is a tree.

Proof. We proceed by induction on $t$. For $t = 1$, $\mathcal{D} = \{C\}$, so the Tanner graph of $C$ with respect to $\mathcal{D}$ has edge set $\{(i, C) \mid i \in I\}$. Such a graph is always a tree, so the hypothesis holds for $t = 1$.

Suppose the hypothesis holds for $t - 1$. Let $\mathcal{D}$ be a partition of $C$ of size $t$ and let $D$ be a subcode in $\mathcal{D}$.

Suppose that the subcodes in $\mathcal{D}$ can be ordered as $C_1, C_2, \ldots, C_t$ with $C_t = D$ and $\mathrm{Supp}(C_i) \cap \mathrm{Supp}(\sum_{j=i+1}^{t} C_j) = \{k_i\}$ for some set of positions $k_1, \ldots, k_{t-1}$ in $I$. Let $C' = \sum_{j=2}^{t} C_j$, $\mathcal{D}' = \mathcal{D} \setminus \{C_1\}$, and $I' = \mathrm{Supp}(C')$. Then by the hypothesis, the Tanner graph of $C'$ with respect to $\mathcal{D}'$ is a tree. The Tanner graph of $C$ with respect to $\mathcal{D}$ differs from that graph in the addition of the subcode vertex $C_1$, the position vertices in $I \setminus I'$, the edge $(k_1, C_1)$, and the edges $(i, C_1)$ for all $i$ in $I \setminus I'$. The graph defined by the added edges is a tree and intersects the tree from the Tanner graph of $C'$ only at the single vertex $k_1$, so the combination of those two graphs is still a tree.

Suppose that the Tanner graph of $C$ with respect to $\mathcal{D}$ is a tree. Note that all the leaf vertices of the Tanner graph of $C$ are position vertices. Let $G$ be the subgraph of the Tanner graph of $C$ created by removing all leaf vertices of the Tanner graph of $C$. Let $C_1$ be a subcode in $\mathcal{D}$ that is a leaf vertex of $G$. Since $t > 1$, we can choose $C_1$ so that $C_1 \neq D$. Let $k_1$ be the one position vertex of $G$ that is adjacent to $C_1$. Let $\mathcal{D}' = \mathcal{D} \setminus \{C_1\}$, $C' = \sum_{D' \in \mathcal{D}'} D'$, and $I' = \mathrm{Supp}(C')$. Then by the hypothesis, the subcodes in $\mathcal{D}'$ can be ordered as $C'_1, C'_2, \ldots, C'_{t-1}$ with $C'_{t-1} = D$ and $\mathrm{Supp}(C'_i) \cap \mathrm{Supp}(\sum_{j=i+1}^{t-1} C'_j) = \{k'_i\}$ for some set of positions $k'_1, \ldots, k'_{t-2}$ in $I'$. Let $C_i = C'_{i-1}$ for $i = 2, \ldots, t$. Let $k_i = k'_{i-1}$ for $i = 2, \ldots, t - 1$. Then $C_t = D$ and $\mathrm{Supp}(C_i) \cap \mathrm{Supp}(\sum_{j=i+1}^{t} C_j) = \{k_i\}$ for $i = 1, \ldots, t - 1$.
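The tree condition of Theorem 4.5 is straightforward to test by machine. The sketch below builds the Tanner graph of Definition 4.4 from explicit subcodes and checks that it is connected with exactly $|V| - 1$ edges. The function names and the two example partitions are assumptions made for illustration, and, as in the theorem, every position is taken to lie in the support of some subcode.

```python
def support(code):
    """Positions where at least one codeword is nonzero."""
    return {j for c in code for j, s in enumerate(c) if s != 0}

def tanner_edges(partition):
    """Edges (position k, subcode index d) with k in Supp(partition[d])."""
    return [(k, d) for d, sub in enumerate(partition)
            for k in sorted(support(sub))]

def is_tree(num_positions, partition):
    """True if the Tanner graph is connected and has |V| - 1 edges."""
    edges = tanner_edges(partition)
    num_vertices = num_positions + len(partition)
    if len(edges) != num_vertices - 1:
        return False
    adjacency = {}
    for k, d in edges:
        adjacency.setdefault(("pos", k), []).append(("code", d))
        adjacency.setdefault(("code", d), []).append(("pos", k))
    # Depth-first search from the first subcode vertex.
    start = ("code", 0)
    seen, stack = {start}, [start]
    while stack:
        vertex = stack.pop()
        for neighbor in adjacency.get(vertex, []):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == num_vertices

zero = (0, 0, 0, 0)
chain = [[zero, (1, 1, 0, 0)], [zero, (0, 1, 1, 0)], [zero, (0, 0, 1, 1)]]
cycle = chain + [[zero, (1, 0, 0, 1)]]   # fourth subcode closes a loop

chain_is_tree = is_tree(4, chain)
cycle_is_tree = is_tree(4, cycle)
```

The chain of three subcodes gives a path-shaped Tanner graph, so it is a tree; adding the fourth subcode creates a cycle through positions 0 and 3.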
5. Trellis decoding and the DFT notation

Convolutional codes and other codes that can be decoded by trellis-based decoding algorithms [2], such as the Viterbi algorithm [6], are structured in a linear fashion to fit the trellis. This linearity lends itself to partitioning the dual code to give a Tanner graph that is mostly a linear path. Thus, we can apply the Chain Theorem (4.3) to a trellis-based code, though we need some extra tools to adapt the trellis decoding into a DFT Bayesian reestimation.

In a trellis decoding of a convolutional code, the trellis contains probabilistic estimates not only of the transmitted symbols but also of the internal states of the convolutional encoder. We can handle this in our reestimation model by treating those internal states as additional positions in the codeword about which the transmitted codeword provided no information. The initial probabilities of those states would be $p(A_i, x) = 1/|S|$ for all $x$ in $S$, which in DFT notation becomes $\hat{p}(A_i, 0) = 1$ and $\hat{p}(A_i, x) = 0$ for all nonzero $x$ in $S$.

However, in a typical convolutional code the internal state of the encoder stores several symbols in $S$. Thus, for a convolutional code that stores $m$ symbols in its internal state, the decoding algorithm stores $|S^m|$ probabilities at each step, one for each possible value of the multi-symbol internal state. Although our theorems so far handle probabilities for only single symbols, the DFT notation can be generalized to handle groupings of symbols.

Definition 5.1. Given the set $I$ of positions in the code, we wish to partition $I$ into subsets $G_1, G_2, \ldots, G_t$, so that $\bigcup_{i=1}^{t} G_i = I$ and $G_i \cap G_j = \emptyset$ for $i \neq j$. Let $\mathcal{G} = \{G_1, G_2, \ldots, G_t\}$. Then $S^n = S^{G_1} \times S^{G_2} \times \cdots \times S^{G_t}$, so let $\pi_G$ be the projection map from $S^n$ into $S^G$ for any $G$ in $\mathcal{G}$. The partition $\mathcal{G}$ regroups the random variables $A$ into the random variables $A^{\mathcal{G}} = (A^{\mathcal{G}}_{G_1}, \ldots, A^{\mathcal{G}}_{G_t})$, where $A^{\mathcal{G}}_{G}$ stores the probabilities that the symbols at the positions in set $G$ have particular values in $S^G$. For the a priori values of $A^{\mathcal{G}}_{G}$, we have $p(A^{\mathcal{G}}_{G}, x) = \prod_{i \in G} p(A_i, x_i)$ for $x$ in $S^G$. Define a DFT group probability by
\[
\hat{p}(A^{\mathcal{G}}_{G}, x) = \sum_{y \in S^G} \chi(x \cdot y)\,p(A^{\mathcal{G}}_{G}, y). \tag{1}
\]
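Equation (1) can be evaluated directly for a small group. The sketch below assumes binary symbols with $\chi(x) = (-1)^x$ and a group of two positions with made-up a priori probabilities; the function names are illustrative. It also checks two consequences that follow from the definitions: for a priori product probabilities the grouped DFT factors into the single-position DFT values, and a uniform distribution on an internal state has DFT value 1 at zero and 0 elsewhere.

```python
from itertools import product
from math import prod

def chi(x):
    """Binary character chi(x) = (-1)**x."""
    return -1 if x % 2 else 1

def dft_single(p_row):
    """DFT of one position: p_hat(A_i, x) = sum_y chi(x*y) p(A_i, y)."""
    return [sum(chi(x * y) * p_row[y] for y in (0, 1)) for x in (0, 1)]

def dft_group(p_rows):
    """Grouped DFT of equation (1) for the a priori product distribution.

    p_rows: the single-position distributions of the bits in the group G.
    Returns a dict mapping each x in S^G to p_hat(A_G, x).
    """
    m = len(p_rows)
    joint = {y: prod(p_rows[i][y[i]] for i in range(m))
             for y in product((0, 1), repeat=m)}
    return {x: sum(chi(sum(a * b for a, b in zip(x, y))) * joint[y]
                   for y in joint)
            for x in product((0, 1), repeat=m)}

group = [[0.9, 0.1], [0.6, 0.4]]             # hypothetical probabilities
hat = dft_group(group)
singles = [dft_single(row) for row in group]

# Uniform internal state: no information from the channel.
uniform_hat = dft_group([[0.5, 0.5], [0.5, 0.5]])
```

After a reestimation the grouped values generally no longer factor, which is why the grouped notation stores one value for every element of $S^G$.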
The reestimation of $A^{\mathcal{G}}$ with respect to the dual code $C$ is calculated as
\[
\hat{p}(\hat{R}(C)A^{\mathcal{G}}_{G}, x) = \frac{\sum_{c \in C} \hat{p}(A^{\mathcal{G}}_{G}, \pi_G(c) + x) \prod_{H \in \mathcal{G} \setminus \{G\}} \hat{p}(A^{\mathcal{G}}_{H}, \pi_H(c))}{\sum_{c \in C} \prod_{H \in \mathcal{G}} \hat{p}(A^{\mathcal{G}}_{H}, \pi_H(c))} \tag{2}
\]
for all $G$ in $\mathcal{G}$ and all $x$ in $S^G$.

The proofs of all theorems related to the DFT notation generalize trivially to the proofs of similar theorems in the DFT grouped notation, because Lemma 2.1 applies equally well to both multiplication in $S$ and inner products in $S^G$ for any $G$ in $\mathcal{G}$. Thus, we end up with the following generalization of Theorem 4.2:

Theorem 5.2. Let $C$ be a code with subcodes $C_1$ and $C_2$ such that $C = C_1 + C_2$ and let the position set $I$ be partitioned into disjoint sets $\mathcal{G} = \{G_1, G_2, \ldots, G_t\}$ such that $\mathrm{Supp}(C_1) \cap \mathrm{Supp}(C_2) \subseteq H$ for some set of positions $H$ in $\mathcal{G}$. Then for any set of positions $G$ in $\mathcal{G}$ that is a subset of $\mathrm{Supp}(C_2)$,
\[
\hat{R}(C_2)\hat{R}(C_1)A^{\mathcal{G}}_{G} = \hat{R}(C)A^{\mathcal{G}}_{G}.
\]

Theorem 5.3. Let $C$ be a dual code with subcodes $C_1, C_2, \ldots, C_t$ such that $C = \sum_{i=1}^{t} C_i$ and let the position set $I$ be partitioned into disjoint sets $\mathcal{G} = \{G_1, G_2, \ldots, G_t\}$. If there exist sets of positions $H_1, H_2, \ldots, H_{t-1}$ in $\mathcal{G}$ such that $\mathrm{Supp}(C_i) \cap \mathrm{Supp}(\sum_{j=i+1}^{t} C_j) \subseteq H_i$ for $i = 1, \ldots, t - 1$, then for any set of positions $G$ in $\mathcal{G}$ that is a subset of $\mathrm{Supp}(C_t)$,
\[
\hat{R}(C)A^{\mathcal{G}}_{G} = \hat{R}(C_t)\hat{R}(C_{t-1}) \cdots \hat{R}(C_1)A^{\mathcal{G}}_{G}.
\]
Theorem 4.5 can also be generalized to the grouped DFT notation, given a proper generalization of the Tanner graph. For the positions $I$ partitioned into $\mathcal{G} = \{G_1, G_2, \ldots, G_t\}$ and the code $C$ partitioned into subcodes $\mathcal{C} = \{C_1, \ldots, C_u\}$, the Tanner graph with respect to both those partitions is the bipartite graph whose vertices are $\mathcal{G} \cup \mathcal{C}$ and whose edges are all pairs $(G_i, C_j)$ for $G_i$ a set of positions in $\mathcal{G}$ and $C_j$ a subcode in $\mathcal{C}$ such that $G_i \cap \mathrm{Supp}(C_j) \neq \emptyset$.

For an example of a Bayesian reestimation of a convolutional code, consider the following binary convolutional encoder: at time $t$ this encoder inputs bit $i_t$, has the bits $i_{t-1}$ and $i_{t-2}$ stored in its internal states (with $i_0 = i_{-1} = 0$), and transmits bits $x_t$ and $y_t$ where $x_t = i_t + i_{t-1} + i_{t-2}$ and $y_t = i_t + i_{t-2}$. This encoder runs for $n$ steps, transmitting $2n$ bits: $(x_1, y_1, \ldots, x_n, y_n)$.

For our decoding, we treat the transmitted codeword as if it had $4n + 2$ positions: $x_t$ and $y_t$ for $t = 1, \ldots, n$ and $u_t$ and $v_t$ for $t = 1, \ldots, n + 1$, where $u_t = i_{t-1}$ and $v_t = i_{t-2}$ are the internal states. Let our groups of positions be $G_{2t-1} = \{u_t, v_t\}$ and $G_{2t} = \{x_t, y_t\}$ for $t = 1, \ldots, n$ and $G_{2n+1} = \{u_{n+1}, v_{n+1}\}$. Let our subcodes be $C_t$ for $t = 1, \ldots, n$, where $C_t$ is the dual subcode represented by the parity check equations $u_t + v_t + x_t + u_{t+1} = 0$, $v_t + y_t + u_{t+1} = 0$, and $u_t + v_{t+1} = 0$. Thus, the edges of the Tanner graph are $(G_1, C_1)$, $(G_2, C_1)$, $(G_3, C_1)$, $(G_3, C_2)$, $(G_4, C_2)$, $(G_5, C_2)$, $(G_5, C_3)$, $(G_6, C_3)$, $(G_7, C_3)$, $\ldots$, $(G_{2n+1}, C_n)$, so the Tanner graph is a tree.

The forward and backward algorithm for decoding a convolutional code [2], translated into the DFT grouped notation, calculates and stores the values
\[
\hat{R}(C_{t-1})\hat{R}(C_{t-2}) \cdots \hat{R}(C_1)A^{\mathcal{G}}_{G_{2t-1}}
\]
for all $t = 2, \ldots, n$ and the values
\[
\hat{R}(C_{t+1})\hat{R}(C_{t+2}) \cdots \hat{R}(C_n)A^{\mathcal{G}}_{G_{2t+1}}
\]
for all $t = 1, \ldots, n - 1$. If for a given $t$ we define $B^{\mathcal{G}}$ by
\[
\begin{aligned}
B^{\mathcal{G}}_{G_{2t-1}} &= \hat{R}(C_{t-1}) \cdots \hat{R}(C_1)A^{\mathcal{G}}_{G_{2t-1}}, \\
B^{\mathcal{G}}_{G_{2t}} &= A^{\mathcal{G}}_{G_{2t}}, \\
B^{\mathcal{G}}_{G_{2t+1}} &= \hat{R}(C_{t+1})\hat{R}(C_{t+2}) \cdots \hat{R}(C_n)A^{\mathcal{G}}_{G_{2t+1}},
\end{aligned}
\]
then by Lemma 3.6 and Theorem 4.1
\[
\hat{R}(C_t)B^{\mathcal{G}}_{G} = \hat{R}(C_t)\hat{R}(C_{t-1}) \cdots \hat{R}(C_1)\hat{R}(C_{t+1})\hat{R}(C_{t+2}) \cdots \hat{R}(C_n)A^{\mathcal{G}}_{G}
\]
for $G = G_{2t-1}, G_{2t}, G_{2t+1}$. By Theorem 5.3, the right hand side of the above equation equals $\hat{R}(C)A^{\mathcal{G}}_{G}$, giving us a full Bayesian reestimation of $A^{\mathcal{G}}$ at $G_{2t-1}$, $G_{2t}$, and $G_{2t+1}$, which we can break apart to give the a posteriori probabilities $\hat{R}(C)A$ at $i_{t-2}$, $i_{t-1}$, $i_t$, $x_t$, and $y_t$.
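The example encoder and its state positions can be simulated in a few lines. The sketch below assumes the output equations $x_t = i_t + i_{t-1} + i_{t-2}$ and $y_t = i_t + i_{t-2}$ with zero initial state, and it writes the second parity check as $v_t + y_t + u_{t+1} = 0$, which is the form those output equations imply. The function names and the input bits are illustrative.

```python
def encode(bits):
    """Run the rate-1/2 encoder and record the internal states.

    Returns (out, states) where out[t] = (x, y) for step t and
    states[t] = (u, v) = (previous input, input before that).
    """
    u, v = 0, 0                      # zero initial state
    states = [(u, v)]
    out = []
    for i in bits:
        x = (i + u + v) % 2          # x_t = i_t + i_{t-1} + i_{t-2}
        y = (i + v) % 2              # y_t = i_t + i_{t-2}
        out.append((x, y))
        u, v = i, u                  # shift register update
        states.append((u, v))
    return out, states

def checks_hold(out, states):
    """Verify the three parity checks of each subcode C_t."""
    for t, (x, y) in enumerate(out):
        u, v = states[t]
        u_next, v_next = states[t + 1]
        if (u + v + x + u_next) % 2:
            return False
        if (v + y + u_next) % 2:
            return False
        if (u + v_next) % 2:
            return False
    return True

bits = [1, 0, 1, 1, 0]
out, states = encode(bits)
ok = checks_hold(out, states)
```

Each step contributes one state group and one output group, plus a final state group, which gives the $2n + 1$ groups and $4n + 2$ positions counted in the text.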
6. Belief propagation

The turbo decoding algorithm is an unexpectedly effective iterative method of correcting errors in systematic parallel-concatenated linear codes. Robert McEliece, David MacKay, and Jung-Fu Chen demonstrated in [5] that the turbo decoding algorithm is a belief propagation algorithm. That paper modeled the systematic parallel-concatenated code as a Bayesian belief network and proved that the turbo decoding of the code mimicked Pearl's belief propagation algorithm on the corresponding Bayesian belief network.

A Bayesian belief network is a set of interdependent probabilities represented as a directed acyclic graph. In a directed graph, a node $X$ is called a parent of a node $Y$ if a directed edge goes from $X$ to $Y$. A directed acyclic graph is considered a Bayesian belief network if its nodes are random variables and the dependencies between the random variables obey the rule
\[
\mathrm{Prob}(A_k = a_k \text{ for } k = 1, \ldots, n) = \prod_{k=1}^{n} \mathrm{Prob}(A_k = a_k \mid A_j = a_j \text{ for all } A_j \text{ parent to } A_k) \tag{3}
\]
where $\{A_1, A_2, \ldots, A_n\}$ is the set of random variables and $a_1, a_2, \ldots, a_n$ is a list of possible values for those random variables.

Any code that satisfies the hypothesis of the Chain Theorem (Theorem 4.3) has a natural mapping to a directed acyclic graph. If $C$ is the dual of such a code, the Chain Theorem says $C$ has $t - 1$ positions $k_i$ and $t$ subcodes $C_i$ such that $C = \sum_{i=1}^{t} C_i$ and $\mathrm{Supp}(C_i) \cap \mathrm{Supp}(\sum_{j=i+1}^{t} C_j) = \{k_i\}$ for $i = 1, \ldots, t - 1$. Let $k_t$ be any point in $\mathrm{Supp}(C_t)$. The directed graph for the Bayesian belief network has vertices $\{A_i \mid i \in I\}$ and directed edges from $A_k$ to $A_{k_i}$ for all $i = 1, \ldots, t$ and all positions $k$ in $\mathrm{Supp}(C_i) \setminus \{k_i\}$. Since the subgraph with vertices inside any single $\mathrm{Supp}(C_i)$ is a tree and since by Theorem 4.5 the Tanner graph associated with the partition $\{C_1, \ldots, C_t\}$ is a tree, the directed graph is acyclic.

But except in the case where every dual subcode $C_i$ is generated by a single parity check relation, the directed acyclic graph is not a Bayesian belief network. When every $C_i$ is a single parity check, then
\[
\begin{aligned}
\mathrm{Prob}(A_k = a_k \text{ for } k \in \mathrm{Supp}(C_i)) = {}& \mathrm{Prob}(A_{k_i} = a_{k_i} \mid A_j = a_j \text{ for all } j \text{ in } \mathrm{Supp}(C_i) \setminus \{k_i\}) \\
& \times \prod_{k \in \mathrm{Supp}(C_i) \setminus \{k_i\}} \mathrm{Prob}(A_k = a_k).
\end{aligned} \tag{4}
\]
Since the set of parents of $A_{k_i}$ is all $A_j$ such that $j$ is in $\mathrm{Supp}(C_i) \setminus \{k_i\}$ and the other $A_k$ for $k$ in $\mathrm{Supp}(C_i)$ have no parents, this fits the probability dependency pattern for Bayesian belief networks in equation (3). However, when $C_i$ consists of two or more linearly independent codewords, then the dependencies of the random variables $A_k$ for all $k$ in $\mathrm{Supp}(C_i)$ cannot be limited to a single conditional probability for $A_{k_i}$. Unless $C_i$ can be further divided into subcodes that satisfy the Chain Theorem, the conditional probabilities are too interwoven to follow equation (3).

Nevertheless, Pearl's belief propagation algorithm can be adapted to operate on Tanner graphs rather than Bayesian belief networks. The adaptation works on the Tanner graph with respect to a partition into subcodes even when that Tanner graph is not associated with a Bayesian belief network, provided that the Tanner graph is a tree.

Pearl's algorithm uses probabilities rather than DFT notation. This is not important for the proof of its effectiveness, since one notation is readily converted into the other. However, rewriting Lemma 3.6 into probability notation will simplify some later calculations.
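Lemma 6.1. Let C be a code linear in S and let $C^\perp$ be its dual. The Bayesian reestimation R(C)A can be calculated as
\[
p(R(C)A_i, x) =
\begin{cases}
p(A_i, x) & \text{for } i \notin \mathrm{Supp}(C^\perp),\\[6pt]
\dfrac{\sum_{c \in C,\ c_i = x} \prod_{j \in \mathrm{Supp}(C^\perp)} p(A_j, c_j)}{\sum_{c \in C} \prod_{j \in \mathrm{Supp}(C^\perp)} p(A_j, c_j)} & \text{for } i \in \mathrm{Supp}(C^\perp)
\end{cases}
\]
for all i in I and x in S.

Proof. For i not in $\mathrm{Supp}(C^\perp)$, $\hat{p}(\hat{R}(C^\perp)A_i, x) = \hat{p}(A_i, x)$ for all x in S, by Lemma 3.6. Thus, $p(R(C)A_i, x) = p(\hat{R}(C^\perp)A_i, x) = p(A_i, x)$ for all x in S. If $\mathrm{Supp}(C^\perp)$ contains all positions, the theorem reduces to the usual formula for Bayesian reestimation,
\[
p(R(C)A_i, x) = \frac{\sum_{c \in C,\ c_i = x} \prod_{j \in I} p(A_j, c_j)}{\sum_{c \in C} \prod_{j \in I} p(A_j, c_j)}.
\]
Otherwise, let k be any position in $I \setminus \mathrm{Supp}(C^\perp)$. Since k is not in $\mathrm{Supp}(C^\perp)$, the vector $e_k$ is in $C = (C^\perp)^\perp$. Then
\begin{align*}
\sum_{c \in C} \prod_{j \in I} p(A_j, c_j)
&= \frac{1}{|S|} \sum_{x \in S} \sum_{c \in C} \prod_{j \in I} p(A_j, c_j) \\
&= \frac{1}{|S|} \sum_{x \in S} \sum_{c \in C} p(A_k, c_k) \prod_{j \in I \setminus \{k\}} p(A_j, c_j) \\
&= \frac{1}{|S|} \sum_{x \in S} \sum_{c \in C} p(A_k, c_k + x) \prod_{j \in I \setminus \{k\}} p(A_j, c_j) \qquad \text{since } C = C + x e_k, \\
&= \frac{1}{|S|} \sum_{c \in C} \Bigl( \sum_{x \in S} p(A_k, c_k + x) \Bigr) \prod_{j \in I \setminus \{k\}} p(A_j, c_j) \\
&= \frac{1}{|S|} \sum_{c \in C} \prod_{j \in I \setminus \{k\}} p(A_j, c_j).
\end{align*}
Repeat this for all other positions in $I \setminus \mathrm{Supp}(C^\perp)$ to get
\[
\sum_{c \in C} \prod_{j \in I} p(A_j, c_j) = |S|^{-|I \setminus \mathrm{Supp}(C^\perp)|} \sum_{c \in C} \prod_{j \in \mathrm{Supp}(C^\perp)} p(A_j, c_j).
\]
Likewise, by a similar calculation,
\[
\sum_{c \in C,\ c_i = x} \prod_{j \in I} p(A_j, c_j) = |S|^{-|I \setminus \mathrm{Supp}(C^\perp)|} \sum_{c \in C,\ c_i = x} \prod_{j \in \mathrm{Supp}(C^\perp)} p(A_j, c_j).
\]
Thus,
\[
p(R(C)A_i, x)
= \frac{|S|^{-|I \setminus \mathrm{Supp}(C^\perp)|} \sum_{c \in C,\ c_i = x} \prod_{j \in \mathrm{Supp}(C^\perp)} p(A_j, c_j)}{|S|^{-|I \setminus \mathrm{Supp}(C^\perp)|} \sum_{c \in C} \prod_{j \in \mathrm{Supp}(C^\perp)} p(A_j, c_j)}
= \frac{\sum_{c \in C,\ c_i = x} \prod_{j \in \mathrm{Supp}(C^\perp)} p(A_j, c_j)}{\sum_{c \in C} \prod_{j \in \mathrm{Supp}(C^\perp)} p(A_j, c_j)}.
\]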
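Lemma 6.1 can be checked numerically on a toy case. The following is a minimal sketch, assuming a binary alphabet, a length-4 code whose dual is generated by the single word 1110, and made-up a priori probabilities; the function names are illustrative and not from the paper. It compares the full Bayesian formula (products over all positions) with the shortened formula (products over the support of the dual code only).

```python
from fractions import Fraction
from itertools import product

def words_orthogonal_to(checks, n):
    """All binary words of length n orthogonal to every generator of the dual code."""
    return [c for c in product((0, 1), repeat=n)
            if all(sum(h[j] * c[j] for j in range(n)) % 2 == 0 for h in checks)]

def reestimate(code, p, i, x, positions):
    """Normalized sum, over codewords with c_i = x, of prod_{j in positions} p(A_j, c_j)."""
    def weight(c):
        w = Fraction(1)
        for j in positions:
            w *= p[j][c[j]]
        return w
    numerator = sum(weight(c) for c in code if c[i] == x)
    denominator = sum(weight(c) for c in code)
    return numerator / denominator

n = 4
dual_generators = [(1, 1, 1, 0)]                 # assumed toy dual code; Supp = {0, 1, 2}
code = words_orthogonal_to(dual_generators, n)   # C contains the unit vector e_3
support = [j for j in range(n) if any(h[j] for h in dual_generators)]

# Hypothetical a priori probabilities (p(A_j, 0), p(A_j, 1)).
p = [(Fraction(9, 10), Fraction(1, 10)),
     (Fraction(4, 5), Fraction(1, 5)),
     (Fraction(3, 10), Fraction(7, 10)),
     (Fraction(3, 5), Fraction(2, 5))]

# Full formula: products over every position.
full = [[reestimate(code, p, i, x, range(n)) for x in (0, 1)] for i in range(n)]
# Shortened formula of Lemma 6.1: products over Supp(C^perp), identity elsewhere.
short = [[reestimate(code, p, i, x, support) if i in support else p[i][x]
          for x in (0, 1)] for i in range(n)]
print(full == short)
```

Exact rational arithmetic is used so that the two formulas can be compared with `==` rather than with a floating-point tolerance.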
Pearl’s algorithm exploits that some reestimations can be divided out in order to undo an old reestimation and redo it with better information. The following theorem provides the basis for that undoing.
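Theorem 6.2. Let $C_1$ and $C_2$ be two codes such that $\mathrm{Supp}(C_1^\perp) \cap \mathrm{Supp}(C_2^\perp) = \{k\}$. Then there exists a real constant μ such that, for all x in S,
\[
p(R(C_1)A_k, x)\, p(R(C_2)A_k, x) = \mu\, p(A_k, x)\, p(R(C_2)R(C_1)A_k, x).
\]
Proof. Let $S_1 = \mathrm{Supp}(C_1^\perp) \setminus \{k\}$ and $S_2 = \mathrm{Supp}(C_2^\perp) \setminus \{k\}$. Then
\begin{align*}
p(R(C_1)A_k, x)\, p(R(C_2)A_k, x)
&= p(R(C_1)A_k, x)\, \frac{\sum_{c \in C_2,\ c_k = x} \prod_{j \in S_2 \cup \{k\}} p(A_j, c_j)}{\sum_{c \in C_2} \prod_{j \in S_2 \cup \{k\}} p(A_j, c_j)} \qquad \text{by Lemma 6.1,} \\
&= p(R(C_1)A_k, x)\, p(A_k, x)\, \frac{\sum_{c \in C_2,\ c_k = x} \prod_{j \in S_2} p(A_j, c_j)}{\sum_{c \in C_2} \prod_{j \in S_2 \cup \{k\}} p(A_j, c_j)} \\
&= p(R(C_1)A_k, x)\, p(A_k, x)\, \frac{\sum_{c \in C_2,\ c_k = x} \prod_{j \in S_2} p(R(C_1)A_j, c_j)}{\sum_{c \in C_2} \prod_{j \in S_2 \cup \{k\}} p(A_j, c_j)} \qquad \text{since } A_j = R(C_1)A_j \text{ for all } j \in S_2, \\
&= \frac{\sum_{c \in C_2} \prod_{j \in S_2 \cup \{k\}} p(R(C_1)A_j, c_j)}{\sum_{c \in C_2} \prod_{j \in S_2 \cup \{k\}} p(A_j, c_j)} \times p(A_k, x)\, \frac{\sum_{c \in C_2,\ c_k = x} \prod_{j \in S_2 \cup \{k\}} p(R(C_1)A_j, c_j)}{\sum_{c \in C_2} \prod_{j \in S_2 \cup \{k\}} p(R(C_1)A_j, c_j)} \\
&= \mu\, p(A_k, x)\, p(R(C_2)R(C_1)A_k, x),
\end{align*}
where
\[
\mu = \frac{\sum_{c \in C_2} \prod_{j \in S_2 \cup \{k\}} p(R(C_1)A_j, c_j)}{\sum_{c \in C_2} \prod_{j \in S_2 \cup \{k\}} p(A_j, c_j)},
\]
which does not depend on the choice of x.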
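The identity of Theorem 6.2 can likewise be spot-checked. This is a rough sketch under assumed data: binary alphabet, length 5, dual codes generated by 11100 and 00111 (so their supports meet only in position k = 2), and invented a priori probabilities. It computes the ratio of the two sides for x = 0 and x = 1 and checks that the ratio, which plays the role of μ, is the same for both symbols.

```python
from fractions import Fraction
from itertools import product

def code_from_checks(checks, n):
    """Binary words of length n orthogonal to all given dual generators."""
    return [c for c in product((0, 1), repeat=n)
            if all(sum(h[j] * c[j] for j in range(n)) % 2 == 0 for h in checks)]

def reestimate_all(code, p):
    """Bayesian reestimation R(C)A at every position, using the full product formula."""
    n = len(p)
    weights = []
    for c in code:
        w = Fraction(1)
        for j in range(n):
            w *= p[j][c[j]]
        weights.append(w)
    total = sum(weights)
    return [tuple(sum(w for c, w in zip(code, weights) if c[i] == x) / total
                  for x in (0, 1))
            for i in range(n)]

n, k = 5, 2
C1 = code_from_checks([(1, 1, 1, 0, 0)], n)   # Supp(C1^perp) = {0, 1, 2}
C2 = code_from_checks([(0, 0, 1, 1, 1)], n)   # Supp(C2^perp) = {2, 3, 4}

# Hypothetical a priori probabilities (p(A_j, 0), p(A_j, 1)).
A = [(Fraction(9, 10), Fraction(1, 10)),
     (Fraction(4, 5), Fraction(1, 5)),
     (Fraction(7, 10), Fraction(3, 10)),
     (Fraction(2, 5), Fraction(3, 5)),
     (Fraction(1, 5), Fraction(4, 5))]

R1A = reestimate_all(C1, A)
R2A = reestimate_all(C2, A)
R2R1A = reestimate_all(C2, R1A)

# Ratio of the two sides of Theorem 6.2 for each symbol x; it should not depend on x.
mu = [R1A[k][x] * R2A[k][x] / (A[k][x] * R2R1A[k][x]) for x in (0, 1)]
print(mu[0] == mu[1])
```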
To describe Pearl’s belief propagation algorithm, let us work with the dual code C partitioned into a set of subcodes $\mathcal{D}$, i.e., $C = \sum_{D \in \mathcal{D}} D$ and for distinct subcodes D and D′ in $\mathcal{D}$, $D \cap D' = \{0\}$. Let T be the Tanner graph of C with respect to $\mathcal{D}$. The position nodes of the Tanner graph store the values of the random variables at those positions, which we will denote $A_i^t$ at time t. Initially, each $A_i^0 = A_i$, the a priori probabilities. The subcode nodes of the Tanner graph store some information needed for the four-step activation of Pearl’s algorithm. To each subcode node D, we assign complex variables r(D, i, x, t) for all positions i adjacent to D and all symbols x in S. Initially, each r(D, i, x, 0) is set to one.

Pearl’s belief propagation algorithm is a message-passing algorithm. In a Bayesian belief network, each node of the network stores the messages it receives until that node is activated, at which time it processes its messages to reestimate the probabilities and sends out messages to adjacent nodes. In a Tanner graph, the two types of nodes act differently. The subcode nodes follow Pearl’s algorithm. A subcode node stores the messages it receives until activated and passes messages to adjacent nodes during activation. The position nodes are not activated and do not pass messages; instead, a position node stores the current value of its random variable $A_i^t$ and always makes that value available to adjacent nodes.

Subcode nodes are activated repeatedly, one at a time, in an order determined by which nodes are due for activation. A subcode node is due for activation if it has not been activated before or if it has received a message since the last time it was activated. Let $D_t$ denote the subcode node activated at time t, starting with t = 1 and ending when no nodes are due for activation. When a subcode node $D_t$ is activated, it undoes the most recent $\hat{R}(D_t)$-reestimation of the random variables for all positions i in $\mathrm{Supp}(D_t)$, reestimates those random variables with respect to $D_t$ again, prepares for the next undoing, and passes messages. The details of this activation are:

1. (Undoing) For each position i adjacent to $D_t$, request the value of the random variable $A_i^{t-1}$. Calculate $B_i^t$ for all x in S by $p(B_i^t, x) = r(D_t, i, x, t-1)\, p(A_i^{t-1}, x)$.
2. (Reestimating) Given the values for the random variable set $B^t$, reestimate the random variables with respect to $D_t$ and call the result $A^t$. This can be accomplished by either using the code $D_t^\perp$ to reestimate the probability values by the shortened Bayesian formula in Lemma 6.1, or by converting the random variables to DFT notation, reestimating through the DFT formula, and converting back to probability notation.
3. (Preparing for undoing) Store the values $p(B_i^t, x)/p(A_i^t, x)$ as $r(D_t, i, x, t)$ for all positions i adjacent to $D_t$ and all x in S.
4. (Passing messages) For each position i adjacent to $D_t$, store the values of the random variable $A_i^t$. If for any of those positions $A_i^t \neq A_i^{t-1}$, pass a message to all other subcode nodes adjacent to that position node that those subcode nodes are due for activation.

The probabilities $A^t$ calculated by Pearl’s algorithm are not a Bayesian reestimation of the a priori probabilities, but they can be described as a piecemeal combination of such reestimations.

Definition 6.3. We presume that Pearl’s algorithm has been applied to the Tanner graph T and that at time t the subcode $D_t$ was activated. For D a subcode node in T, define the subgraph G(D) of T to consist of the nodes D and Supp(D) and the edges (i, D) for all i in Supp(D). For each time t and position i define the subgraph $T_{i,t}$ of T recursively: $T_{i,0}$ is the empty graph, $T_{i,t}$ is $T_{i,t-1}$ whenever t > 0 and i is not adjacent to $D_t$, and $T_{i,t}$ is the union of $G(D_t)$ and all the subgraphs $T_{j,t-1}$ for j in $\mathrm{Supp}(D_t)$ whenever t > 0 and i is adjacent to $D_t$.
For T′ any subgraph of T, define D(T′) to be {0} whenever T′ is empty and $\sum D$ over all subcode nodes D of T′ whenever T′ is nonempty.

Lemma 6.4. If $T_{i,t}$ is nonempty, then it is connected and it contains the position i as a node.

Proof. For t = 0, the subgraph $T_{i,0}$ is the empty graph. We proceed by induction on time t. Assume that the hypothesis is satisfied for all times before time t. If $T_{i,t}$ is nonempty, then there exists a time s ≤ t such that $D_s$ is adjacent to position i and $T_{i,t}$ is the union of $G(D_s)$ and all the $T_{j,s-1}$ for all j adjacent to $D_s$. Note that $G(D_s)$ is connected. Since i is adjacent to $D_s$ in T, i is in $G(D_s)$, so it is a node in $T_{i,t}$. The subgraph $T_{j,s-1}$ for j adjacent to $D_s$ either is empty or is connected and contains the position j. The nodes j and $D_s$ are connected in $G(D_s)$, so $T_{j,s-1}$ is connected to $D_s$. Thus, all parts of $T_{i,t}$ are connected to $D_s$, so the graph is connected.
Definition 6.5. When the Tanner graph T is a tree, for a time t, a position node i in T, and an edge E out of i in T, define the subgraph $G_{i,t,E}$ of T by starting with $T_{i,t}$, removing the edge E, and setting $G_{i,t,E}$ to the component of the resulting graph that contains i. Likewise, define the subgraph $H_{i,t,E}$ by starting with $T_{i,t}$, removing all edges out of i except the edge E, and taking the component of the resulting graph that contains i. Thus, the union of $G_{i,t,E}$ and $H_{i,t,E}$ is $T_{i,t}$ and the intersection of $G_{i,t,E}$ and $H_{i,t,E}$ is i.

Theorem 6.6. If the Tanner graph T is a tree, then at each time t and position i, $A_i^t = \hat{R}(D(T_{i,t}))A_i$ and there exists a constant $\nu_{i,t}$ such that
\[
p(B_i^t, x) = \nu_{i,t}\, p(\hat{R}(D(G_{i,t,(i,D_t)}))A_i, x)
\]
for all x in S.

Proof. At time 0, $T_{i,0}$ is the empty graph, so $\hat{R}(D(T_{i,0}))A_i = A_i = A_i^0$ for all positions i. We proceed by induction on time t. Assume that the hypothesis is satisfied for all times before time t.

Consider the case where time t is the first time subcode node $D_t$ has been activated. Since $D_s \neq D_t$ for all s < t, no subgraph $T_{i,t-1}$ contains $D_t$. Furthermore, $r(D_t, i, x, t-1) = 1$ for all positions i adjacent to $D_t$ and all x in S, so $B^t = A^{t-1}$. By the hypothesis, $B_i^t = A_i^{t-1} = \hat{R}(D(T_{i,t-1}))A_i$ for any position i adjacent to $D_t$ in T. The subgraphs $T_{i,t-1}$ are connected by Lemma 6.4. For any distinct positions i and j both adjacent to $D_t$, the only path from i to j in T contains $D_t$. So these subgraphs do not intersect each other. Moreover, for a position i adjacent to $D_t$, $T_{i,t-1}$ is the only subgraph out of all $T_{j,t-1}$ for j adjacent to $D_t$ that contains i. Thus, $G_{i,t,(i,D_t)} = T_{i,t-1}$. So $B_i^t = \hat{R}(D(G_{i,t,(i,D_t)}))A_i$. Let $i_1, i_2, \dots, i_r$ be all positions adjacent to $D_t$ in T. The supports $\mathrm{Supp}(D(T_{i,t-1}))$ do not intersect each other, so by Theorem 4.1 for any position $i_q$,
\[
B_{i_q}^t = \hat{R}(D(T_{i_q,t-1}))A_{i_q} = \hat{R}(D(T_{i_1,t-1}))\hat{R}(D(T_{i_2,t-1})) \cdots \hat{R}(D(T_{i_r,t-1}))A_{i_q}.
\]
Hence
\[
A_i^t = \hat{R}(D_t)\hat{R}(D(T_{i_1,t-1}))\hat{R}(D(T_{i_2,t-1})) \cdots \hat{R}(D(T_{i_r,t-1}))A_i,
\]
which is the same as $\hat{R}\bigl(D_t + \sum_j D(T_{i_j,t-1})\bigr)A_i = \hat{R}(D(T_{i,t}))A_i$ by the chain Theorem 4.3.

Consider the case where time t is the second or later time in which subcode node $D_t$ was activated. Let s be the most recent previous time that node was activated. Let E be the edge $(i, D_t)$. By the hypothesis and Theorem 4.2,
\[
p(A_i^{t-1}, x) = p(\hat{R}(D(T_{i,t-1}))A_i, x) = p(\hat{R}(D(H_{i,t-1,E}))\hat{R}(D(G_{i,t-1,E}))A_i, x).
\]
By the hypothesis, Theorem 4.2, and Theorem 6.2,
\begin{align*}
r(D_t, i, x, t-1) = r(D_t, i, x, s) &= p(B_i^s, x)/p(A_i^s, x) \\
&= p(\hat{R}(D(G_{i,s,E}))A_i, x)/p(\hat{R}(D(T_{i,s}))A_i, x) \\
&= p(\hat{R}(D(G_{i,s,E}))A_i, x)/p(\hat{R}(D(G_{i,s,E}))\hat{R}(D(H_{i,s,E}))A_i, x) \\
&= \mu_{i,t}\, p(A_i, x)/p(\hat{R}(D(H_{i,s,E}))A_i, x),
\end{align*}
where $\mu_{i,t}$ is a constant that is independent of the choice of x.

We claim that $H_{i,t-1,E} = H_{i,s,E}$. Note that for any time n, the graph $T_{i,n}$ contains its predecessor $T_{i,n-1}$, so $T_{i,s}$ is a subgraph of $T_{i,t-1}$ and $H_{i,s,E}$ is a subgraph of $H_{i,t-1,E}$. Let N be any node that is in $H_{i,t-1,E}$. Let $t_1$ be the smallest time such that $T_{i,t_1}$ contains N. Thus, $T_{i,t_1-1} \neq T_{i,t_1}$, so $T_{i,t_1}$ is the union of $G(D_{t_1})$ and all $T_{j,t_1-1}$ for all j adjacent to $D_{t_1}$ in T. If N is not in $G(D_{t_1})$, then there is a $j_1$ adjacent to $D_{t_1}$ such that N is in $T_{j_1,t_1-1}$ and we can let $t_2$ be the smallest time such that $T_{j_1,t_2}$ contains N. Continuing this way, we end up with a path $(j_0, D_{t_1}, j_1, D_{t_2}, j_2, \dots, D_{t_r})$ in T with $j_0 = i$ such that $t > t_1 > t_2 > \cdots > t_r$ and N is in $G(D_{t_r})$. But since T is a tree, any path from i to N must contain $D_s$. So for some q, 1 ≤ q ≤ r, $D_{t_q} = D_s$ and $j_{q-1} = i$. Given that s is the largest time less than t such that $D_s = D_t$, we have $t_q \le s$. So $T_{i,t_q}$ is a subgraph of $T_{i,s}$. But $T_{i,t_q} = T_{j_{q-1},t_q}$, which contains N. So $T_{i,s}$ contains N and, given which branch of T that N is on, $H_{i,s,E}$ contains N. Therefore, since $H_{i,s,E}$ contains N for all nodes N in $H_{i,t-1,E}$, $H_{i,s,E} = H_{i,t-1,E}$.

For any x in S,
\begin{align*}
p(B_i^t, x) &= p(A_i^{t-1}, x)\, r(D_t, i, x, t-1) \\
&= p(\hat{R}(D(T_{i,t-1}))A_i, x)\, r(D_t, i, x, s) \\
&= \frac{p(\hat{R}(D(H_{i,t-1,E}))\hat{R}(D(G_{i,t-1,E}))A_i, x)\, \mu_{i,t}\, p(A_i, x)}{p(\hat{R}(D(H_{i,s,E}))A_i, x)} \\
&= \frac{p(\hat{R}(D(H_{i,s,E}))\hat{R}(D(G_{i,t-1,E}))A_i, x)\, \mu_{i,t}\, p(A_i, x)}{p(\hat{R}(D(H_{i,s,E}))A_i, x)} \\
&= \mu_{i,t}\, \mu'_{i,t}\, p(\hat{R}(D(G_{i,t-1,E}))A_i, x) \qquad \text{by Theorem 6.2,}
\end{align*}
where $\mu'_{i,t}$ is a constant that is independent of the choice of x. Let $\nu_{i,t} = \mu_{i,t}\mu'_{i,t}$ for all positions i adjacent to $D_t$.

By Lemma 6.1, at time t, for any position i adjacent to $D_t$, and any symbol x in S,
\begin{align*}
p(A_i^t, x) = p(R(D_t^\perp)B_i^t, x)
&= \frac{\sum_{c \in D_t^\perp,\ c_i = x} \prod_{j \in \mathrm{Supp}(D_t)} p(B_j^t, c_j)}{\sum_{c \in D_t^\perp} \prod_{j \in \mathrm{Supp}(D_t)} p(B_j^t, c_j)} \\
&= \frac{\sum_{c \in D_t^\perp,\ c_i = x} \prod_{j \in \mathrm{Supp}(D_t)} \nu_{j,t}\, p(\hat{R}(D(G_{j,t-1,(j,D_t)}))A_j, c_j)}{\sum_{c \in D_t^\perp} \prod_{j \in \mathrm{Supp}(D_t)} \nu_{j,t}\, p(\hat{R}(D(G_{j,t-1,(j,D_t)}))A_j, c_j)} \\
&= \frac{\sum_{c \in D_t^\perp,\ c_i = x} \prod_{j \in \mathrm{Supp}(D_t)} p(\hat{R}(D(G_{j,t-1,(j,D_t)}))A_j, c_j)}{\sum_{c \in D_t^\perp} \prod_{j \in \mathrm{Supp}(D_t)} p(\hat{R}(D(G_{j,t-1,(j,D_t)}))A_j, c_j)}.
\end{align*}
For all j in $\mathrm{Supp}(D_t)$, each subgraph $G_{j,t-1,(j,D_t)}$ is in a different component of the graph formed by removing $D_t$ from T. So by Lemma 3.6 for any j in $\mathrm{Supp}(D_t)$,
\[
\hat{R}(D(G_{j,t-1,(j,D_t)}))A_j = \hat{R}(D(G_{i_1,t-1,(i_1,D_t)}))\hat{R}(D(G_{i_2,t-1,(i_2,D_t)})) \circ \cdots \circ \hat{R}(D(G_{i_r,t-1,(i_r,D_t)}))A_j,
\]
where $\{i_1, i_2, \dots, i_r\} = \mathrm{Supp}(D_t)$. Thus
\[
A_i^t = \hat{R}(D_t)\hat{R}(D(G_{i_1,t-1,(i_1,D_t)}))\hat{R}(D(G_{i_2,t-1,(i_2,D_t)})) \circ \cdots \circ \hat{R}(D(G_{i_r,t-1,(i_r,D_t)}))A_i.
\]
By the chain Theorem 4.3, $A_i^t = \hat{R}(D')A_i$ where
\[
D' = D_t + \sum_{j \in \mathrm{Supp}(D_t)} D(G_{j,t-1,(j,D_t)}).
\]
We claim that $G_{i,t-1,(i,D_t)} = G_{i,t,(i,D_t)}$ for any position i adjacent to $D_t$. $T_{i,t}$ is the union of $G(D_t)$ and all $T_{j,t-1}$ for all j adjacent to $D_t$; therefore, $G_{i,t,(i,D_t)}$ is the union of all $G_{j,t-1,(i,D_t)}$ for all j adjacent to $D_t$. Let j be a position that is adjacent to $D_t$ but is not i. We have shown that $H_{j,t-1,(j,D_t)} = H_{j,s,(j,D_t)}$. Since the edge $(i, D_t)$ is farther from j than the edge $(j, D_t)$, this means that $G_{j,t-1,(i,D_t)} = G_{j,s,(i,D_t)}$. But $T_{j,s} = T_{i,s}$, so $G_{j,s,(i,D_t)} = G_{i,s,(i,D_t)}$, which is a subgraph of $G_{i,t-1,(i,D_t)}$. Thus, the union of all $G_{j,t-1,(i,D_t)}$ for all j adjacent to $D_t$ is $G_{i,t-1,(i,D_t)}$. Therefore, $T_{i,t}$ is the union of $G(D_t)$ and all $G_{j,t-1,(j,D_t)}$ for all j adjacent to $D_t$, so $A_i^t = \hat{R}(D(T_{i,t}))A_i$. Also, $p(B_i^t, x) = \nu_{i,t}\, p(\hat{R}(D(G_{i,t,(i,D_t)}))A_i, x)$ for all x in S.

For each step t in applying Pearl’s algorithm to a Tanner graph T that is a tree, if i and j are both adjacent to $D_t$, then $T_{i,t} = T_{j,t}$ and they both contain the union of $T_{i,t-1}$ and $T_{j,t-1}$. So, until all subgraphs $T_{i,t}$ are identical for all positions i, a cycle of activating all subcode nodes D will cause $T_{i,t}$ to increase for some position i. Since for any time t such that all subcode nodes have been activated already, the union of $T_{i,t}$ for all positions i contains all D in $\mathcal{D}$, this increase ends only at a time t such that $T_{i,t} = T$ for all positions i. At that time and at all later times, by Theorem 6.6, $A^t = \hat{R}(C)A$. Thus, the values of $A_i^t$ stop changing, so the subcode nodes no longer become due for activation, and Pearl’s algorithm terminates with a final result of $\hat{R}(C)A$.
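The four-step activation and the termination claim can be illustrated with a small simulation. This is only a sketch under several assumptions that are not in the paper: the alphabet is binary, every subcode is a single parity check (given by its support), the Tanner graph is the path formed by the checks {0,1,2}, {2,3,4}, {4,5,6}, the due node with the smallest index is activated first, the time index of r is dropped by overwriting the stored values, and the a priori probabilities are invented. The helper names `activate`, `pearl`, and `direct` are illustrative.

```python
from fractions import Fraction
from itertools import product

def activate(support, B):
    """Step 2: reestimate positions in `support` from B with the shortened formula of
    Lemma 6.1; the local code of a single parity check is the even-weight words."""
    words = [dict(zip(support, bits)) for bits in product((0, 1), repeat=len(support))
             if sum(bits) % 2 == 0]
    weights = []
    for c in words:
        w = Fraction(1)
        for j in support:
            w *= B[j][c[j]]
        weights.append(w)
    total = sum(weights)
    return {i: tuple(sum(w for c, w in zip(words, weights) if c[i] == x) / total
                     for x in (0, 1)) for i in support}

def pearl(subcodes, prior):
    A = list(prior)
    r = {(d, i): (Fraction(1), Fraction(1)) for d, s in enumerate(subcodes) for i in s}
    due = set(range(len(subcodes)))            # every node starts out due
    while due:
        d = min(due)                           # assumed scheduling rule
        due.remove(d)
        s = subcodes[d]
        B = {i: tuple(r[d, i][x] * A[i][x] for x in (0, 1)) for i in s}   # 1. undoing
        new = activate(s, B)                                              # 2. reestimating
        for i in s:
            r[d, i] = tuple(B[i][x] / new[i][x] for x in (0, 1))          # 3. preparing
            if new[i] != A[i]:                                            # 4. messages
                due.update(e for e, t in enumerate(subcodes) if e != d and i in t)
            A[i] = new[i]
    return A

def direct(subcodes, prior):
    """Brute-force reestimation over all words satisfying every parity check."""
    n = len(prior)
    words = [c for c in product((0, 1), repeat=n)
             if all(sum(c[j] for j in s) % 2 == 0 for s in subcodes)]
    weights = []
    for c in words:
        w = Fraction(1)
        for j in range(n):
            w *= prior[j][c[j]]
        weights.append(w)
    total = sum(weights)
    return [tuple(sum(w for c, w in zip(words, weights) if c[i] == x) / total
                  for x in (0, 1)) for i in range(n)]

subcodes = [(0, 1, 2), (2, 3, 4), (4, 5, 6)]   # Tanner graph is a path, hence a tree
prior = [(Fraction(a, 10), Fraction(10 - a, 10)) for a in (9, 8, 7, 6, 4, 3, 2)]
print(pearl(subcodes, prior) == direct(subcodes, prior))
```

Exact fractions make the stopping test in step 4 an exact comparison; with floating-point values a tolerance would be needed instead.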
7. Conclusion

The DFT notation does not appear to offer any advantage in implementing an a posteriori decoding algorithm. But it does appear to have a use in analyzing the decoding algorithm. Many soft decoding algorithms on codes with low density parity check matrices have been analyzed through their Tanner graphs or related graphs, such as the Tanner–Wiberg–Loeliger graph. These graphs have been built on relationships in the parity check matrix or in the finite-state encoder, not in the dual code. By relating the dual code directly to the a posteriori decoding algorithm, the DFT notation creates a new generalization of the Tanner graph based on the dual code. These graphs should be able to analyze codes with higher densities in the parity check matrix than the plain Tanner graph can handle. Coding theory has many theorems for finding the structure of the dual of a code. It is hoped that through the DFT notation, those theorems may be converted into tools for analyzing a posteriori decoding schemes on the code.
References

[1] C. Berrou and A. Glavieux, Near Optimal Error-Correcting Coding and Decoding: Turbocodes, IEEE Trans. Inform. Theory 44 (1996), 1261–1271.
[2] G. David Forney, Jr., On Iterative Decoding and the Two-Way Algorithm, International Symposium on Turbo Codes, Brest, France, 1997.
[3] Frank R. Kschischang and Brendan J. Frey, Iterative Decoding of Compound Codes by Probability Propagation in Graphical Models, IEEE J. Selected Areas Commun. 16 (1998), 219–230.
[4] David J. C. MacKay, Good Error-Correcting Codes Based on Very Sparse Matrices, IEEE Trans. Inform. Theory 45 (1999), 399–431.
[5] Robert J. McEliece, David J. C. MacKay, and Jung-Fu Cheng, Turbo Decoding as an Instance of Pearl’s ‘Belief Propagation’ Algorithm, IEEE J. Selected Areas Commun. 16 (1998), 140–152.
[6] A. J. Viterbi, Error bounds for convolutional codes and an asymptotically optimum decoding algorithm, IEEE Trans. Inform. Theory 13 (1967), 260–269.

E. J. Schram
S31222, HQ 3A179, Suite 6305
National Security Agency
Fort George G. Meade, MD 20755, U.S.A.
[email protected]
Subdesigns of symmetric designs Mohan S. Shrikhande
Abstract. Subdesigns of symmetric designs is the main theme of this paper. We attempt to give a comprehensive account of the results beginning with Bruck’s result on subplanes of finite projective planes and Kantor’s extension to subdesigns of symmetric designs. Some later results of Bose and S. S. Shrikhande, Haemers and M. S. Shrikhande, Baartmans and M. S. Shrikhande, Jungnickel, Cron and Mavron, and Baker are then discussed. Ionin has considered so-called strong subdesigns of symmetric designs. We discuss briefly these results which are the starting point of Ionin’s recent work on constructions of some new infinite families of symmetric designs. We end with a recent result of Jungnickel and Tonchev on decompositions of difference sets. 2000 Mathematics Subject Classification: primary 05B05; secondary 05B25.
1. Introduction

We shall mostly follow the standard terminology and notation of design theory as in Beth, Jungnickel and Lenz [5]. Let D = (V, B, I) be a finite incidence structure with point set V, block set B, and I ⊆ V × B the incidence relation. We will identify a block with its subset of incident points, take I to be the inclusion relation, and denote the (incidence) structure by D = (V, B). Let v = |V| and b = |B|. Then D is called regular if each point is on exactly r blocks and is uniform if each block has exactly k points, where r and k are fixed positive integers. All structures considered from now on will be assumed to be regular and uniform and so the relation bk = vr holds. If further b = v, then D is called symmetric. A proper substructure $D_1 = (U, A)$ of D = (V, B) has U ⊂ V and A ⊂ B. A tactical decomposition of D is a partition of V into point classes $X_1, X_2, \dots, X_s$ and a partition of B into block classes $B_1, B_2, \dots, B_t$ such that each pair $(X_i, B_j)$ is a substructure. A design D = (V, B) is a uniform structure with block size k (v > k ≥ 3) and such that any pair of distinct points occur together in exactly λ blocks. A design is then regular with replication number r and the relation λ(v − 1) = r(k − 1) also holds. Designs are also referred to as 2-(v, k, λ) designs, (v, b, r, k, λ) designs, or BIBDs. A symmetric design D has b = v. The parameters of a symmetric design are (v, k, λ) and n = k − λ is called its order. A finite projective plane P of order n is a classical example of a symmetric 2-$(n^2 + n + 1, n + 1, 1)$ design.

Subdesigns of symmetric designs is the main theme of this paper. In the literature there are some different notions of subdesigns of symmetric designs which have been considered. We attempt to give an almost comprehensive account of this topic. We begin in section 2 with Bruck’s result [12] on subplanes of finite projective planes, Kantor’s extension [32] to subdesigns of symmetric designs, and results of Bose and S. S. Shrikhande [10] on Baer subdesigns of symmetric designs. Some later results of Haemers and M. S. Shrikhande [21], Baartmans and M. S. Shrikhande [2], Jungnickel [30], Cron and Mavron [16], and Baker [3] are then discussed in section 3. Ionin [23] considers the notion of so-called strong subdesigns of symmetric designs. In section 4, we discuss briefly these results which are the starting point of Ionin’s recent work ([24], [25], [26]) on constructions of some new infinite families of symmetric designs. Some recent results on symmetric designs and a result of Jungnickel and Tonchev [31] on decompositions of difference sets are taken up in section 5. Some results discussed in the present paper have been drawn freely from the author’s very recent survey [42]. That paper treated a much broader range of combinatorial topics in design theory and had a different purpose. The present paper has a much narrower focus, namely that of subdesigns of symmetric designs.
2. Basic results on subdesigns of symmetric designs

A symmetric 2-$(v_1, k_1, \lambda)$ design $D_1 = (U, A)$ is a proper subdesign of a symmetric 2-(v, k, λ) design D = (V, B) if U ⊂ V and there exists an $A_0 \subset B$ such that $A = \{B \cap U : B \in A_0\}$. For convenience, we say that $D_1(v_1, k_1, \lambda)$ is a (symmetric) subdesign of D(v, k, λ). As is well known, a finite projective plane P of order n is an incidence structure of points and lines satisfying the following conditions:
(i) To any two distinct points p and q, there is a unique line pq incident with both of them.
(ii) To any distinct lines L and M, there is a unique common point LM on both of them.
(iii) There exists a quadrangle in P, that is, a set of four points in P, no three of which lie on a line.
(iv) There exists a line with exactly n + 1 points on it.
Let P be a finite projective plane of order n. A subplane Q of order m of P is a subset Q of points and lines which is itself a projective plane of order m relative to the incidence relation holding in P. A subplane Q of P is called a Baer subplane of P if it satisfies the following conditions:
(1) Every point of P is incident with a line of Q.
(2) Every line of P is incident with a point of Q.
R. H. Bruck [12] proved the following classical result about the possible orders of subplanes of finite projective planes:
Theorem 2.1. Let P be a finite projective plane of order n and Q a proper subplane of order m. Then the following conditions hold:
(i) $n = m^2$ if and only if Q is a Baer subplane of P.
(ii) If Q is not a Baer subplane of P, then $n \ge m^2 + m$.

The proof of Bruck’s Theorem is by a counting argument. It is not known whether (ii) can hold with equality. We note that this would imply the existence of a projective plane of order m(m + 1), which is not a prime power. Bruck’s Theorem was extended to symmetric designs by Kantor [32]:

Theorem 2.2. Let D(v, k, λ) be a symmetric design which contains a proper symmetric subdesign $D_1(v_1, k_1, \lambda)$. Then $k = \lambda v_1 - k_1 + 1$ or $k \ge \lambda v_1$. (Alternatively, $k - \lambda = (k_1 - 1)^2$ or $k - \lambda \ge k_1(k_1 - 1)$.)

A proper subdesign $D_1(v_1, k_1, \lambda)$ of a symmetric design D(v, k, λ) is called a Baer subdesign of D if $k - \lambda = (k_1 - 1)^2$ and is called a Bruck subdesign of D if $k - \lambda = k_1(k_1 - 1)$. The latter terminology is due to Baker [3], who investigated symmetric designs with Bruck subdesigns. We discuss some results of Baker on Bruck subdesigns in the next section.

Example 2.3. Let p be a prime and t ≥ 1 be an integer. The projective plane $P = PG(2, p^{2t})$ contains a Baer subplane $Q = PG(2, p^t)$.

Example 2.4. Arrange the points 1, 2, . . . , 16 in a 4 × 4 array. Corresponding to any point p form a block $B_p$ of size 6 by taking the 6 points, other than p, which occur in the same row or column as p. Then, as is well known, we get a symmetric design D(16, 6, 2). Let $D_1(4, 3, 2)$ be the subdesign formed on the points 1, 2, 3, 4 and the induced blocks on them. Then $D_1$ is a (trivial) Baer subdesign of D.

R. C. Bose and S. S. Shrikhande [10] studied Baer subdesigns of symmetric designs. They also showed that if a symmetric design $D_1(m, k_1, \lambda)$ is a subdesign of a symmetric design D(v, k, λ) with $k > k_1$, then $k \ge (k_1 - 1)^2 + \lambda$. When equality holds, they refer to $D_1$ as a Baer subdesign of D. The paper of Bose and S. S. Shrikhande [10] uses incidence matrices in its study of Baer subdesigns of symmetric designs. Let N and $N_1$ be the usual point versus block incidence matrices of D and $D_1$ respectively. For convenience, identifying the design with its incidence matrix, we take
\[
D = \begin{pmatrix} D_1 & P \\ Q & R \end{pmatrix}.
\]
Refer to R as the complementary structure of $D_1$ in D. Suppose v = mn points are partitioned into m sets (called groups) of size n each. Then, a group divisible design (GDD) $D^*$ is an arrangement of the points into b blocks of size k < v such that (i) each point occurs in r blocks, (ii) any two points in the same (different) group(s) occur together in $\lambda_1$ ($\lambda_2$) blocks. Then $D^*$ is denoted by $D^*(v, b, r, k; \lambda_1, \lambda_2; m, n)$. In case v = b, the GDD is called symmetric (SGDD) and is denoted (in [5]) by $D^*(m, n, k; \lambda_1, \lambda_2)$. One of the main results of Bose and S. S. Shrikhande [10] is the next:

Theorem 2.5. Let $D_1(m, k_1, \lambda)$ be a Baer subdesign of D(v, k, λ) with $k > k_1$. Then D contains a complementary subdesign $D_1^*$ which is a symmetric group divisible design (SGDD) $D_1^*(v^*, k^*, \lambda - 1, \lambda; m, n)$, where $v^* = mn$, $k^* = (k_1 - 1)^2 + \lambda - 1$, $m = (k_1^2 - k_1 + \lambda)/\lambda$, $n = (k_1 - 1)^2 - (k_1 - \lambda)$. Conversely, any SGDD $D_1^*$ with the above parameters can be embedded in a unique symmetric design D(v, k, λ), where $v = v^* + m$, $k = (k_1 - 1)^2 + \lambda$, containing as a complementary subdesign another symmetric design $D_1(m, k_1, \lambda)$.

The above theorem of Bose and S. S. Shrikhande generalized a result of Dembowski ([17], p. 290), who proved the result for the case λ = 1. An SGDD is said to have the dual property if its dual has the same parameters as the original SGDD. In Bose [8], structural properties of such designs were investigated and an application to Baer subdesigns was given.
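Example 2.4 is small enough to verify by direct enumeration. The sketch below builds the blocks from the 4 × 4 array, checks the symmetric design conditions, forms the induced blocks on the points 1, 2, 3, 4, and tests the Baer condition of Theorem 2.2. The helper name `is_symmetric_design` and the row-by-row numbering of the array are assumptions made for illustration.

```python
from itertools import combinations

points = range(1, 17)

def row(p):
    return (p - 1) // 4

def col(p):
    return (p - 1) % 4

# Block B_p: the 6 points other than p lying in the same row or column as p.
blocks = {p: frozenset(q for q in points
                       if q != p and (row(q) == row(p) or col(q) == col(p)))
          for p in points}

def is_symmetric_design(pts, blks, k, lam):
    """Check b = v, constant block size k, and every point pair in exactly lam blocks."""
    pts = list(pts)
    return (len(blks) == len(pts)
            and all(len(b) == k for b in blks)
            and all(sum(1 for b in blks if x in b and y in b) == lam
                    for x, y in combinations(pts, 2)))

U = {1, 2, 3, 4}                                   # first row of the array
induced = [blocks[p] & U for p in sorted(U)]       # blocks B_1..B_4 restricted to U

k, lam, k1 = 6, 2, 3
print(is_symmetric_design(points, list(blocks.values()), k, lam))   # D(16, 6, 2)
print(is_symmetric_design(U, induced, k1, lam))                     # D1(4, 3, 2)
print(k - lam == (k1 - 1) ** 2)                                     # Baer condition
```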
3. Further results on subdesigns of symmetric designs

Another notion of subdesigns of symmetric designs was considered by Haemers and M. S. Shrikhande [21]. Let D(v, k, λ) be a symmetric design containing a symmetric design $D_1(v_1, k_1, \lambda_1)$ with $k_1 < k$. Note that here λ and $\lambda_1$ may have different values, in contrast to the earlier section. We refer to $D_1$ as a subdesign of D. Working with incidence matrices, we may assume
\[
D = \begin{pmatrix} D_1 & D_2 \\ D_3 & D_4 \end{pmatrix}.
\]
Then form the rational number $x = \dfrac{v_1(k - k_1)}{v - v_1}$. Notice that x is the average row sum of $D_3$. The main tool used in the paper of Haemers and M. S. Shrikhande is the following result of Haemers [19] on interlacing of eigenvalues:
Theorem 3.1. Let A be a complex Hermitian matrix of order n which is partitioned into block matrices:
\[
A = \begin{pmatrix} A_{11} & \cdots & A_{1m} \\ \vdots & \ddots & \vdots \\ A_{m1} & \cdots & A_{mm} \end{pmatrix}
\]
such that each diagonal block $A_{ii}$ is a square matrix for 1 ≤ i ≤ m. Let B be the square matrix of size m, each element $b_{ij}$ of which equals the average row sum of the block matrix $A_{ij}$. Then the eigenvalues $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n$ of A and the eigenvalues $\beta_1 \ge \beta_2 \ge \cdots \ge \beta_m$ of B satisfy $\alpha_{n-m+i} \le \beta_i \le \alpha_i$ for all 1 ≤ i ≤ m. Moreover, if for some M, 1 ≤ M ≤ m, $\beta_i = \alpha_i$ for all 1 ≤ i ≤ M and $\beta_i = \alpha_{n-m+i}$ for all M < i ≤ m, then $A_{ij}$ has constant row and column sum for all 1 ≤ i, j ≤ m.

Using the above result, Haemers and M. S. Shrikhande [21] proved the following:

Theorem 3.2. Let $D_1(v_1, k_1, \lambda_1)$ be a symmetric subdesign of a symmetric design D(v, k, λ). Let
\[
D = \begin{pmatrix} D_1 & D_2 \\ D_3 & D_4 \end{pmatrix}.
\]
Let $x = \dfrac{v_1(k - k_1)}{v - v_1}$. Then,
(i) $k \ge (k_1 - x)^2 + \lambda$.
(ii) If $k = (k_1 - x)^2 + \lambda$, then the points [blocks] of $D_1$ and the blocks [points] not on $D_1$ form a possibly trivial block design $D_2(v_1, x, \lambda - \lambda_1)$.

Haemers’ eigenvalue technique is an important tool in algebraic graph theory and designs. We mention also the recent paper of Haemers [20]. For completeness, we give a proof of the above theorem since it is also short.

Proof. (i) Let $D_1$ and D be the incidence matrices and write
\[
D = \begin{pmatrix} D_1 & D_2 \\ D_3 & D_4 \end{pmatrix}.
\]
Then x is the average row sum of $D_3$. Form
\[
A = \begin{pmatrix} O & D \\ D^t & O \end{pmatrix}.
\]
Then,
\[
A = \begin{pmatrix} O & O & D_1 & D_2 \\ O & O & D_3 & D_4 \\ D_1^t & D_3^t & O & O \\ D_2^t & D_4^t & O & O \end{pmatrix},
\]
where $D^t$ denotes the transpose of D. Next, we construct the matrix B consisting of the average row sums of A corresponding to the given blocking. Then,
\[
B = \begin{pmatrix} 0 & 0 & k_1 & k - k_1 \\ 0 & 0 & x & k - x \\ k_1 & k - k_1 & 0 & 0 \\ x & k - x & 0 & 0 \end{pmatrix}.
\]
The eigenvalues of
\[
\begin{pmatrix} k_1 & k - k_1 \\ x & k - x \end{pmatrix}
\]
are k and $k_1 - x$. Hence the eigenvalues of B are ±k and $\pm(k_1 - x)$. The eigenvalues of A are ±k and $\pm\sqrt{k - \lambda}$. Using Haemers’s result, this gives $\sqrt{k - \lambda} \ge |k_1 - x|$, which proves (i).
(ii) From $DD^t = (k - \lambda)I + \lambda J$ and $D_1 D_1^t = (k_1 - \lambda_1)I + \lambda_1 J$, it follows that $D_2 D_2^t = ((k - k_1) - (\lambda - \lambda_1))I + (\lambda - \lambda_1)J$. On the other hand, if $k = (k_1 - x)^2 + \lambda$, then Haemers’s result shows that $D_2$ has constant column sums.

Observe that in case $\lambda_1 = \lambda$, then x ≤ 1 and we obtain from Theorem 3.2 the earlier mentioned result proved by Kantor [32] and Bose and S. S. Shrikhande [10]:

Corollary 3.3. Let $D_1(v_1, k_1, \lambda)$ be a symmetric subdesign of D(v, k, λ). Then $k \ge (k_1 - 1)^2 + \lambda$.

Haemers and M. S. Shrikhande call $D_1(v_1, k_1, \lambda_1)$ a tight subdesign of D(v, k, λ) if $k = (k_1 - x)^2 + \lambda$. We note that if $D_1(v_1, k_1, \lambda_1)$ is a tight subdesign of D(v, k, λ) then (i) k − λ is a square, (ii) the complement of $D_1$ is a tight subdesign of the complement of D and (iii) if $D_1(v_1, k_1, \lambda)$ is a tight subdesign of D(v, k, λ), then $D_1$ is a Baer subdesign of D and conversely. We give two examples, mentioned in [21], of tight subdesigns of symmetric designs. The first follows from the above observations.

Example 3.4. Let $D_1(v_1, k_1, 1)$ be a Baer subplane of a D(v, k, 1). Then $D_1^c(v_1, v_1 - k_1, v_1 - 2k_1 + 1)$, the complement of $D_1$, is a tight subdesign of $D^c(v, v - k, v - 2k + 1)$, the complement of D.

Example 3.5. Let $H_1$ be a regular Hadamard matrix of order m, m ≥ 4. This means that $H_1$ is a Hadamard matrix of order m and furthermore, $H_1 J = cJ$, where c is a constant and J is the all one matrix. Using $H_1 H_1^t = mI$ and m ≡ 0 (mod 4), it follows that $m = c^2 = 4n^2$, for some positive integer n. Then replacing respectively the 1’s (−1’s) in $H_1$ by 0 (1), we obtain the incidence matrix of a 2-$(4n^2, n(2n - 1), n(n - 1))$ symmetric design $D_1$.
Put
\[
H = \begin{pmatrix} H_1 & -H_1 & -H_1 & -H_1 \\ -H_1 & H_1 & -H_1 & -H_1 \\ -H_1 & -H_1 & H_1 & -H_1 \\ -H_1 & -H_1 & -H_1 & H_1 \end{pmatrix}.
\]
Then H is a regular Hadamard matrix of size $16n^2$ and similarly gives rise to a 2-$(16n^2, 2n(4n + 1), 2n(2n + 1))$ symmetric design D. It can be checked that $D_1$ is a tight subdesign of D. Haemers and M. S. Shrikhande obtain the following parametric classification for Baer subdesigns of symmetric designs:
Theorem 3.6. Let $D_1(v_1, k_1, \lambda)$ be a Baer subdesign of $D(v, k, \lambda)$. Then one of the following holds:
(a) $v = \lambda(\lambda^2 - 2\lambda + 2)$, $D$ has parameters $(\lambda(\lambda^2 - 2\lambda + 2),\ \lambda^2 - \lambda + 1,\ \lambda)$ and $D_1(\lambda, \lambda, \lambda)$ is the trivial subdesign.
(b) $v = \lambda^2(\lambda + 2)$, $D$ has parameters $(\lambda^2(\lambda + 2),\ \lambda(\lambda + 1),\ \lambda)$ and $D_1(\lambda + 2, \lambda + 1, \lambda)$ is the trivial subdesign.
(c) $v > \lambda^2(\lambda + 2)$.

The paper of Haemers and M. S. Shrikhande gives the following examples to show that in each case of the above theorem, there exist symmetric designs with Baer subdesigns:

Example 3.7. A symmetric design $D(\lambda(\lambda^2 - 2\lambda + 2),\ \lambda^2 - \lambda + 1,\ \lambda)$ has the parameters of the symmetric design on the points and planes of $PG(3, \lambda - 1)$, which exists for all prime powers $\lambda - 1$. The points on a given line and all the planes containing it form a Baer subdesign $D_1(\lambda, \lambda, \lambda)$.

Example 3.8. The existence of symmetric designs with parameters $(\lambda^2(\lambda + 2),\ \lambda(\lambda + 1),\ \lambda)$ is known for all prime powers $\lambda$. This is due to Ahrens and Szekeres [1]. From their construction, it is seen that $D$ has a Baer subdesign $D_1(\lambda + 2, \lambda + 1, \lambda)$, corresponding to the $\lambda + 2$ points on a line in the corresponding geometry.

Example 3.9. There exists a $D(56, 11, 2)$ design which has $D_1(7, 4, 2)$, the complement of the Fano plane, as a Baer subdesign. This has been discussed at length in [21].

Tight subdesigns of symmetric designs were later investigated by Baartmans and M. S. Shrikhande [2], whose paper was mainly concerned with the existence and construction of tight subdesigns. The following result, proved along lines essentially similar to Theorem 3.6, was the starting point of their paper:

Theorem 3.10. Let a symmetric design $D_1(v_1, k_1, \lambda_1)$ be a tight subdesign of $D(v, k, \lambda)$. Define $x = \frac{v_1(k - k_1)}{v - v_1}$. Then exactly one of the following holds:
(a) $D$ has parameters $\left(\frac{\lambda_1[(\lambda_1 - x)(\lambda_1 - x - 1) + \lambda]}{x},\ (\lambda_1 - x)^2 + \lambda,\ \lambda\right)$ and $D_1$ is the trivial design $(\lambda_1, \lambda_1, \lambda_1)$.
(b) $D$ has parameters $\left(\frac{(\lambda_1 + 2)[(\lambda_1 - x)(\lambda_1 + 1 - x) + \lambda]}{x},\ (\lambda_1 + 1 - x)^2 + \lambda,\ \lambda\right)$ and $D_1$ is the trivial subdesign with parameters $(\lambda_1 + 2, \lambda_1 + 1, \lambda_1)$.
(c) $v > \frac{(\lambda_1 + 2)[(\lambda_1 - x)(\lambda_1 + 1 - x) + \lambda]}{x}$.
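The two extremal parameter families of Theorem 3.6 can be checked numerically. The following is a minimal sketch, not taken from [21] or [2]: the function names are hypothetical, and the code only tests the counting identity $\lambda(v-1) = k(k-1)$ together with the condition $n = k - \lambda = (k_1 - x)^2$ for $x = v_1(k - k_1)/(v - v_1)$; it does not construct any design.

```python
from fractions import Fraction


def satisfies_counting_identity(v, k, lam):
    """Basic identity lam*(v - 1) = k*(k - 1) of a symmetric (v, k, lam) design."""
    return lam * (v - 1) == k * (k - 1)


def tightness_data(v, k, lam, v1, k1):
    """Return (x, n, (k1 - x)^2) with x = v1*(k - k1)/(v - v1) and n = k - lam."""
    x = Fraction(v1 * (k - k1), v - v1)
    return x, k - lam, (k1 - x) ** 2


def case_a(lam):
    """Theorem 3.6(a): parameters of D and (v1, k1) of the trivial subdesign."""
    return (lam * (lam**2 - 2 * lam + 2), lam**2 - lam + 1, lam), (lam, lam)


def case_b(lam):
    """Theorem 3.6(b): parameters of D and (v1, k1) of the trivial subdesign."""
    return (lam**2 * (lam + 2), lam * (lam + 1), lam), (lam + 2, lam + 1)


def family_is_baer(case, lam_values):
    """Check the identity, x = 1 and n = (k1 - x)^2 for every lam in lam_values."""
    for lam in lam_values:
        (v, k, l), (v1, k1) = case(lam)
        x, n, square = tightness_data(v, k, l, v1, k1)
        if not (satisfies_counting_identity(v, k, l) and x == 1 and n == square):
            return False
    return True


if __name__ == "__main__":
    print(family_is_baer(case_a, range(2, 20)))
    print(family_is_baer(case_b, range(2, 20)))
    print(tightness_data(56, 11, 2, 7, 4))  # parameters of Example 3.9
```

In both families the value of $x$ comes out as 1, which is the situation in which a tight subdesign with $\lambda_1 = \lambda$ is a Baer subdesign.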
We note that in case $\lambda_1 = \lambda$ and $D_1(v_1, k_1, \lambda_1)$ is a tight (equivalently Baer) subdesign of $D(v, k, \lambda)$, then $x = 1$ and Theorem 3.10 reduces to Theorem 3.6.

One of the methods used in [2] relied on the connection between symmetric designs and the so-called $(v, k, \lambda)$ graphs. A $(v, k, \lambda)$ graph is a strongly regular graph on $v$ vertices, having valency $k$ and such that any two distinct vertices are simultaneously adjacent to $\lambda$ other vertices. These graphs were studied by Bose and S. S. Shrikhande [11], who referred to them as $G_2(d)$ graphs. From any $(v, k, \lambda)$ graph $\Gamma$, one obtains a symmetric $(v, k, \lambda)$ design $D$ on the points of $\Gamma$, with blocks being the sets of vertices adjacent to any given vertex. To produce possible tight subdesigns, the following lemma, whose proof is obvious, was then used:

Lemma 3.11. The existence of a $(v, k, \lambda)$ graph $G$ having a clique (i.e. complete subgraph) $G_1$ of size $v_1$ implies the existence of a symmetric design $D(v, k, \lambda)$ with a subdesign $D_1(v_1, v_1 - 1, v_1 - 2)$. Let $x = \frac{v_1(k - v_1 + 1)}{v - v_1}$. Then $D_1$ is a tight subdesign of $D$ if
\[ k = \left[ v_1 - 1 - \frac{v_1(k - v_1 + 1)}{v - v_1} \right]^2 + \lambda. \]

Next some specific $(v, k, \lambda)$ graphs arising from partial geometries $(r, k, t)$ were used. Results of Bose [6] and of S. S. Shrikhande and Singh [46] were then used to prove the following (sample) result in [2]:

Theorem 3.12. The existence of a BIBD $E$ with parameters $(v, b, r, k, \lambda = 1)$ and satisfying $r = 2k + 1$ implies the existence of a symmetric design $(4k^2 - 1, 2k^2, k^2)$ with a tight subdesign $D_1(2k + 1, 2k, 2k - 1)$.

Example 3.13. For any $n \ge 2$, there exists a symmetric design $D(2^{2n} - 1, 2^{2n-1}, 2^{2n-2})$ with a tight subdesign $D_1(2^n + 1, 2^n, 2^n - 1)$. The existence of such designs $D$ is due to E. Seiden [41].

For further results on existence and constructions of tight subdesigns refer to the original paper of Baartmans and M. S. Shrikhande [2].
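Lemma 3.11 can be illustrated on a small graph. The sketch below is an assumption-laden example rather than a construction from [2]: it uses the 4×4 rook's graph (cells of a 4×4 board, adjacent when they share a row or a column), takes the neighbourhoods as blocks, and uses one row of the board as the clique. All identifiers are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

SIDE = 4
VERTICES = list(product(range(SIDE), repeat=2))


def adjacent(a, b):
    """Rook's graph: distinct cells of the board lying in the same row or column."""
    return a != b and (a[0] == b[0] or a[1] == b[1])


# Block of the design attached to vertex u: the set of neighbours of u.
BLOCKS = {u: frozenset(w for w in VERTICES if adjacent(u, w)) for u in VERTICES}


def graph_parameters():
    """Return (v, k, lam); fails if valency or common-neighbour count varies."""
    valencies = {len(b) for b in BLOCKS.values()}
    common = {len(BLOCKS[a] & BLOCKS[b]) for a, b in combinations(VERTICES, 2)}
    assert len(valencies) == 1 and len(common) == 1
    return len(VERTICES), valencies.pop(), common.pop()


v, k, lam = graph_parameters()

# A clique of size v1 = 4: the four cells of row 0.
CLIQUE = frozenset((0, j) for j in range(SIDE))
v1 = len(CLIQUE)

# Traces on the clique of the blocks indexed by clique vertices; these should
# be the blocks of the trivial design with parameters (v1, v1 - 1, v1 - 2).
inside_traces = {BLOCKS[u] & CLIQUE for u in CLIQUE}

# Every remaining block should meet the clique in exactly x points.
x = Fraction(v1 * (k - v1 + 1), v - v1)
outside_sizes = {len(BLOCKS[w] & CLIQUE) for w in VERTICES if w not in CLIQUE}

# Tightness condition of Lemma 3.11.
is_tight = k == (v1 - 1 - x) ** 2 + lam

if __name__ == "__main__":
    print((v, k, lam), x, outside_sizes, is_tight)
```

Under these assumptions the subdesign has parameters (4, 3, 2) inside a (16, 6, 2) design, so $\lambda_1 = \lambda$ and the clique also gives a Baer subdesign of the type in Theorem 3.6(b) with $\lambda = 2$.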
Bruck’s Theorem 2.1 referred to in section 1 is a classical result on subplanes of finite projective planes. The study of finite projective spaces and of special types of substructures contained in them is a central theme in the area of finite geometries. We refer to Hirschfeld [22] for a comprehensive account of this. In particular we mention the papers of Bose, Freeman and Glynn [9] and Vedder [47] on the intersections of Baer subplanes in finite projective planes; Vedder [48] on affine subplanes of projective planes; and the classical papers of Bruck and Bose [13], [14].

The paper of Cron and Mavron [16] is concerned with intersections of symmetric subdesigns of symmetric designs. Their definition of design allows possible degenerate structures. Let $D$ denote a symmetric $(v, k, \lambda)$ design, with $v > k \ge 2$, $\lambda \ge 1$. Let $D_1$ and $D_2$ be two symmetric subdesigns of $D$, which may be trivial and may have different parameters. The incidence structure whose points and blocks are those common to $D_1$ and $D_2$, with induced incidence, is denoted by $D_1 \cap D_2$. Then Cron and Mavron proved, among other results, the following:

Theorem 3.14. Let $D_1$ and $D_2$ be subdesigns of a symmetric $2$-$(v, k, \lambda)$ design $D$ with $v > k \ge 2$, $\lambda \ge 2$. Then $D_1 \cap D_2$ is either a symmetric subdesign of $D_1$ and $D_2$, or it consists of just one point and no blocks, or just one block and no points. The second case cannot occur if both $D_1$ and $D_2$ are Baer subdesigns.

Another important theme in finite geometry is that of partitioning a finite projective geometry $PG(d, q)$ into interesting substructures of the same type, for example, into spreads, Baer subgeometries, arcs, caps, etc. A classical result of this kind is the following theorem of Rao [39]:

Theorem 3.15. $PG(d, q)$ can be partitioned into subgeometries $PG(t, q)$ if and only if $t + 1$ divides $d + 1$.

The fact that the Singer cycle can be used to partition $PG(2, q^2)$ into Baer subplanes was proved much earlier by Rao [38]. Rao’s above result can be rephrased in design theoretic language as:

Theorem 3.16. The symmetric design $PG(d, q)^c$ formed by the points and complements of hyperplanes in $PG(d, q)$ can be partitioned into strongly induced subdesigns isomorphic to $PG_{t-1}(t, q)$ if and only if $t + 1$ divides $d + 1$.

(The precise definition of “strongly induced subdesign” is given in the next section.)

The paper of Cron and Mavron also deals with the question of partitioning a symmetric design into Baer subdesigns. They note that if $q$ is a prime power, the design $PG(2, q^2)$ contains a Baer subdesign $PG(2, q)$. Also the symmetric design $PG(3, q)$ contains a trivial Baer subdesign with parameters $(q + 1, q + 1, q + 1)$. Both designs can be partitioned into Baer subdesigns, and they refer to this as a Baer partition. They raised the question: which symmetric $2$-$(V, K, \lambda)$ designs admit Baer partitions into trivial $2$-$(v, k, \lambda)$ designs, i.e. those having $\lambda = k - 1 = v - 2$ or $\lambda = k = v$?
Their paper contains the following remarks: Any such design would have parameters $(\lambda^2(\lambda + 2),\ \lambda(\lambda + 1),\ \lambda)$ or $(\lambda(\lambda - 1)^2 + \lambda,\ \lambda^2 - \lambda + 1,\ \lambda)$. The first parameter set is of the Ahrens–Szekeres type. Examples are the three B(6) biplanes, all of which have a Baer partition. An example of the second class is $PG(3, 2)$, which also has a Baer partition. It is further mentioned in [16] that Mavron has shown that there exists a design of Ahrens–Szekeres type with a Baer partition when $\lambda$ is a prime power and the design has the additional property of admitting a partition into $\lambda + 2$ affine planes of order $\lambda$. Designs having this additional property have also been studied by S. S. Sane [40].

Subdesigns of symmetric designs have also been investigated by Jungnickel [30]. For convenience, we state the earlier result Theorem 3.2 of Haemers and M. S. Shrikhande in Jungnickel’s terminology: Suppose $D$ is a symmetric design with parameters $(v, k, \lambda)$ and let $D'$ be an (induced) symmetric subdesign of $D$ with parameters $(v', k', \lambda')$ having $k' < k$. Then Theorem 3.2 can be restated as: With $D$ and $D'$ as above, define $x = \frac{v'(k - k')}{v - v'}$. Then $n \ge (k' - x)^2$, where (as usual) $n = k - \lambda$ is the order of $D$. Moreover, if equality holds, then the incidence structure $D''$ consisting of the points of $D'$ together with the blocks of $D$ not in $D'$ is a 2-design with parameters $v'' = v'$, $k'' = x$, $\lambda'' = \lambda - \lambda'$. It is not immediately clear whether the condition $n = (k' - x)^2$ holds if $D''$ is a 2-design (this is in fact proved by Jungnickel). The following lemma is the starting point of Jungnickel [30]:

Lemma 3.17. Let $D'$ be a subdesign of $D$ and let $D''$ be as above. Then $D''$ is a 2-design if and only if any block in $D$ which is not in $D'$ meets $D'$ in a constant number $x$ of points. In this case, $x = \frac{v'(k - k')}{v - v'}$ and $D''$ has the parameters of Theorem 3.2.

Proof. Count all incidences $(p, B)$ with $p \in D'$ and $B$ a block of $D$ which is not in $D'$, in two ways, obtaining $(v - v')x = v'(k - k')$. Clearly, then $v'' = v'$ and $\lambda'' = \lambda - \lambda'$.

Jungnickel calls $D'$ a tight subdesign of $D$ provided $D''$ is a 2-design. If furthermore $\lambda' = \lambda$, then he also calls $D'$ a Baer subdesign of $D$. Jungnickel proves that his definitions are equivalent to those of Haemers and M. S. Shrikhande via the following result, proved by counting methods:

Theorem 3.18. Let $D'$ be a subdesign of $D$ and let $x$ be defined by $x = \frac{v'(k - k')}{v - v'}$. Then $x$ is the average number of points of $D'$ contained in a block of $D$ which is not in $D'$. Furthermore,
(a) $n \ge k'^2 - v'\lambda + x(k - k')$.
Moreover, the following conditions are equivalent:
(b) $D'$ is tight.
(c) $n = k'^2 - v'\lambda + x(k - k')$.
(d) $n = (k' - x)^2$.

It should be remarked that Jungnickel shows that the two bounds $n \ge (k' - x)^2$ and $n \ge k'^2 - v'\lambda + x(k - k')$ are in general not comparable, but do coincide in case of equality. Using a result of Wallis [49] involving affine designs and strongly regular graphs, Jungnickel obtains the following:

Theorem 3.19. Assume there exists an affine design with parameters $(v, b, r, k, \lambda)$. Then there exists a symmetric design $D((r + 1)v, kr, k\lambda)$ with a tight subdesign $D_1(r + 1, r, r - 1)$.
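To see the two bounds coincide in a tight case, one can evaluate them on the parameters produced by Theorem 3.19. The sketch below is illustrative only: it reads bound (a) of Theorem 3.18 as $n \ge k'^2 - v'\lambda + x(k - k')$, uses the affine geometries $AG(d, q)$ as the input affine designs (an assumption made for this example), and works purely with parameters. Function names are hypothetical.

```python
from fractions import Fraction


def affine_geometry(d, q):
    """Parameters (v, b, r, k, lam) of AG(d, q): points and hyperplanes."""
    r = (q**d - 1) // (q - 1)
    return q**d, q * r, r, q ** (d - 1), (q ** (d - 1) - 1) // (q - 1)


def from_affine_design(v, b, r, k, lam):
    """Parameters of D and of the subdesign D1 as stated in Theorem 3.19."""
    return ((r + 1) * v, k * r, k * lam), (r + 1, r, r - 1)


def jungnickel_quantities(v, k, lam, v1, k1):
    """Return x, the order n, and the right-hand sides of the two bounds."""
    x = Fraction(v1 * (k - k1), v - v1)
    n = k - lam
    bound_d = (k1 - x) ** 2                      # n >= (k' - x)^2
    bound_a = k1**2 - v1 * lam + x * (k - k1)    # bound (a) of Theorem 3.18
    return x, n, bound_d, bound_a


def check(d, q):
    """True if both bounds are met with equality for the AG(d, q) instance."""
    (v, k, lam), (v1, k1, _) = from_affine_design(*affine_geometry(d, q))
    x, n, bound_d, bound_a = jungnickel_quantities(v, k, lam, v1, k1)
    return lam * (v - 1) == k * (k - 1) and n == bound_d == bound_a


if __name__ == "__main__":
    print(from_affine_design(*affine_geometry(2, 3)))
    print(all(check(d, q) for d in (2, 3, 4) for q in (2, 3, 4, 5)))
```

For $AG(2, 3)$ this gives a symmetric (45, 12, 3) parameter set with subdesign parameters (5, 4, 3), and both bounds equal the order $n = 9$.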
As a consequence, he obtains:

Corollary 3.20. A symmetric design $D(v, k, \lambda)$ with a tight subdesign $D_1(c + 1, c, c - 1)$ exists in at least the following cases:
(i) $v = q^{d+1}(q^d + \cdots + q^2 + q + 2)$, $k = q^d(q^d + \cdots + q^2 + q + 1)$, $\lambda = q^d(q^{d-1} + \cdots + q^2 + q + 1)$, $c = q^d + \cdots + q^2 + q + 1$ ($q$ a prime power);
(ii) $v = 16a^2$, $k = 2a(4a - 1)$, $\lambda = 2a(2a - 1)$, $c = 4a - 1$, whenever $4a$ is the order of a Hadamard matrix.

Jungnickel then applies Theorem 3.18 under the extra assumption that $D$ and $D'$ admit regular groups of automorphisms, say $G$ and $H$, with $H$ a subgroup of $G$. Then, using the well-known relationship between difference sets and symmetric designs, he obtains:

Theorem 3.21. Let $D$ be a $(v, k, \lambda)$-difference set in $G$ and $D'$ be a subdifference set with parameters $(v', k', \lambda)$ in a subgroup $H$ of order $v'$ in $G$. Then in Theorem 3.18, inequality (a) holds, and equality holds iff (d) holds (which means geometrically that the symmetric subdesign obtained by developing the subdifference set is tight in the symmetric design developed from the difference set).

The following is another result from Jungnickel [30]. It is a difference set analogue of a result of Baartmans and M. S. Shrikhande (Proposition 2.6, [2]).

Theorem 3.22. Let $G$ be a group of order $4a^2$. Assume the existence of $2a$ subgroups $U_1, \ldots, U_{2a}$ of $G$ which have pairwise trivial intersection. Then there exists a difference set with parameters $v = 4a^2$, $k = 2a^2 - a$, $\lambda = a^2 - a$ having a tight $(2a, 2a - 1, 2a - 2)$-subdifference set.

For further results and examples about subdesigns of symmetric designs and constructions via difference sets, we refer to the original paper of Jungnickel [30].

As mentioned in section 2, we now end this section with some results of Baker [3] on Bruck subdesigns of symmetric designs. Kantor’s Theorem 2.2 asserts that if $D_0(v_0, k_0, \lambda)$ is a proper symmetric subdesign of a symmetric design $D(v, k, \lambda)$, then $k = \lambda v_0 - k_0 + 1$ or $k \ge \lambda v_0$.
If k = λv0 , then Baker refers to D0 as a Bruck subdesign of D. Baker’s paper contains some structural results of symmetric designs containing Bruck subdesigns. Two methods of constructing symmetric designs with Bruck subdesigns using spreads and packings of designs are given. A family of symmetric designs containing (trivial) Bruck subdesigns is also given. We discuss very briefly some of these results and refer to Baker’s paper for further details. For convenience, we follow the notation used in his paper. One of the initial results of Baker is the following:
Theorem 3.23. Let a symmetric design $D(v, k, \lambda) = (V, \mathcal{B})$ contain a Bruck subdesign $D_0(v_0, k_0, \lambda_0) = (V_0, \mathcal{B}_0)$. Then $D$ admits a tactical decomposition with point classes $X_0, X_1, X_2$ and block classes $B_0, B_1, B_2$ with the following parameters:
\[ (l_0, l_1, l_2) = (m_0, m_1, m_2) = (v_0,\ v_0(k_0 - 2) + 1,\ v_0(k - k_0)), \]
\[ (\beta_{ij})^t = (\gamma_{ij}) = \begin{pmatrix} k_0 & 0 & k - k_0 \\ 1 & k_0 - 1 & k \\ k_1 & k - k_1 & k - k_0 \end{pmatrix}. \]
Here $l_i$ (resp. $m_j$) denotes $|X_i|$ (resp. $|B_j|$), $\gamma_{ij}$ (resp. $\beta_{ij}$) denotes the row (resp. column) sum of the substructure $(X_i, B_j)$, and $(\beta_{ij})^t$ denotes the transpose of $(\beta_{ij})$. In the above result, $X_1$ is the set of points of $D$ not incident with any block of $D_0$, $B_1$ is the set of blocks not incident with any point of $D_0$, and $X_2$, $B_2$ are the remaining sets of points and blocks (these are non-empty as $D_0$ is a Bruck subdesign of $D$). Defining the structure $D_1 = (X_0 \cup X_1, B_0 \cup B_2)$, Baker obtains the following two results:

Corollary 3.24. If $D(v, k, \lambda)$ contains a Bruck subdesign $D_0(v_0, k_0, \lambda)$, then $D$ contains a subdesign $D_1$ which contains $D_0$, where $D_1$ is a BIBD$(v_1, k_1, \lambda)$ with $v_1 = v_0(k_0 - 1) + 1$ and $k_1 = k_0$.

Corollary 3.25. If $D(v, k, \lambda)$ contains a Bruck subdesign $D_0(v_0, k_0, \lambda)$, then $D$ contains a symmetric substructure $C$ which has $v_0(k - k_0)$ points and blocks of cardinality $k - k_0$. Further, $C$ admits a tactical decomposition having $v_0$ point classes each of size $k - k_0$ and $\beta_{ij}$ and $\gamma_{ij}$ equal to $\lambda$ or $\lambda - 1$.

In the above, $C = (X_2, B_2)$ in the earlier notation. The point classes of the tactical decomposition are the blocks of $B_0$ and the block classes are the points of $X_0$ considered as sets of blocks. The construction results of Baker are too technical to go into here and we refer to his paper. We end this section by stating a result of Baker [3]:

Theorem 3.26. If $n$ and $n - 1$ are both prime powers, then there exists a symmetric design $D(n^3 - n + 1, n^2, n)$ with a Bruck subdesign $D_0(n, n, n)$.
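The parameter arithmetic behind Theorems 3.23 and 3.26 can be checked directly. This is only a sketch on the level of parameters, with hypothetical function names: it confirms that the family $(n^3 - n + 1, n^2, n)$ satisfies the counting identity and the Bruck condition $k = \lambda v_0$, and that the class sizes $(v_0,\ v_0(k_0 - 2) + 1,\ v_0(k - k_0))$ add up to $v$, with the first two adding up to $v_1 = v_0(k_0 - 1) + 1$ as in Corollary 3.24.

```python
def bruck_class_sizes(v0, k0, k):
    """Class sizes (l0, l1, l2) of the tactical decomposition in Theorem 3.23."""
    return v0, v0 * (k0 - 2) + 1, v0 * (k - k0)


def theorem_3_26_parameters(n):
    """Parameters of D and of the trivial Bruck subdesign D0 in Theorem 3.26."""
    return (n**3 - n + 1, n**2, n), (n, n, n)


def check(n):
    """Verify the parameter relations for a given n (no design is constructed)."""
    (v, k, lam), (v0, k0, _) = theorem_3_26_parameters(n)
    l0, l1, l2 = bruck_class_sizes(v0, k0, k)
    return (
        lam * (v - 1) == k * (k - 1)       # counting identity of a symmetric design
        and k == lam * v0                  # Bruck condition
        and l0 + l1 + l2 == v              # the classes partition the point set
        and l0 + l1 == v0 * (k0 - 1) + 1   # v1 of Corollary 3.24
    )


if __name__ == "__main__":
    # n and n - 1 both prime powers: 3, 4, 5, 8, 9, 17, ...
    print([(n, theorem_3_26_parameters(n)[0], check(n)) for n in (3, 4, 5, 8, 9, 17)])
```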
4. Strong subdesigns of symmetric designs

Recently, Ionin [23] has considered a more restrictive notion of subdesigns of symmetric designs and has used this as one of the ingredients to construct some new infinite families of symmetric designs. The other main tools used in these constructions are
generalized Hadamard matrices and balanced generalized weighing matrices. The methods introduced in Ionin’s paper are, in my opinion, an important new contribution to design theory. In the remainder of this section, we shall discuss some of the results of Ionin [23].

As seen earlier, the symmetric designs $D = (V, \mathcal{B})$ containing smaller symmetric designs $D_1 = (U, \mathcal{A})$ (in the sense of Haemers and M. S. Shrikhande [21], Baartmans and M. S. Shrikhande [2], and Jungnickel [30]) are those that satisfy the conditions: $U \subset V$ and there exists a $\mathcal{B}_0 \subset \mathcal{B}$ such that $\mathcal{A} = \{B \cap U : B \in \mathcal{B}_0\}$. Ionin defines a symmetric design $D_1 = (U, \mathcal{A})$ to be a strong subdesign of a symmetric design $D = (V, \mathcal{B})$ if $U \subset V$ and $\mathcal{A} = \{B \cap U : B \in \mathcal{B},\ B \cap U \ne \emptyset\}$. We remark that in Jungnickel and Tonchev [31], Ionin’s notion is referred to as a strongly induced subdesign of a symmetric design. Following Ionin’s notation, we write $(v, k, \lambda) \subset (v', k', \lambda')$ if there exists a symmetric $2$-$(v', k', \lambda')$ design having a strong symmetric $2$-$(v, k, \lambda)$ subdesign. Ionin gives the following trivial example of a strong subdesign:

Example 4.1. $(1, 1, \mu) \subset (v, k, \lambda)$. Note that $\mu$ does not have to be an integer.

Before giving examples of symmetric designs with strong subdesigns, we recall some well known facts. The classical example of a symmetric design is $PG(d, q)$, formed by the one-dimensional and $d$-dimensional subspaces of the $(d+1)$-dimensional vector space over the finite field $GF(q)$. It has parameters
\[ \left( \frac{q^{d+1} - 1}{q - 1},\ \frac{q^d - 1}{q - 1},\ \frac{q^{d-1} - 1}{q - 1} \right). \]
Another important symmetric design, a Hadamard 2-design, has parameters $(4n - 1, 2n - 1, n - 1)$, where $4n$ is the order of a Hadamard matrix. Replacing each block of a $(v, b, r, k, \lambda)$ BIBD by its complement gives a complementary $(v, b, b - r, v - k, b - 2r + \lambda)$ BIBD. Another standard construction produces BIBDs from a symmetric design. If $D = (V, \mathcal{B})$ is a symmetric $(v, k, \lambda)$-design and $A \in \mathcal{B}$, then define $\mathcal{B}_A = \{B \cap A : B \in \mathcal{B},\ B \ne A\}$ and $\mathcal{B}^A = \{B \setminus A : B \in \mathcal{B},\ B \ne A\}$. Then $D_A = (A, \mathcal{B}_A)$ is a $(k, v - 1, k - 1, \lambda, \lambda - 1)$ BIBD called the derived design of $D$, and $D^A = (V \setminus A, \mathcal{B}^A)$ is a $(v - k, v - 1, k, k - \lambda, \lambda)$ BIBD called the residual design of $D$. Any derived design of $PG(d, q)$ is a $q$-fold multiple of $PG(d - 1, q)$. Any residual design of $PG(d, q)$ is isomorphic to the design $AG(d, q)$ which is formed by the points and hyperplanes of the $d$-dimensional vector space over $GF(q)$. Its parameters are
\[ \left( q^d,\ \frac{q(q^d - 1)}{q - 1},\ \frac{q^d - 1}{q - 1},\ q^{d-1},\ \frac{q^{d-1} - 1}{q - 1} \right). \]
The parameters $(v, b, r, k, \lambda)$ of a residual design satisfy the relation $r = k + \lambda$. Any $(v, b, r, k, \lambda)$ BIBD with $r = k + \lambda$ (or equivalently $b = v + r - 1$) is called quasi-residual. Any $(v, b, r, k, \lambda)$ BIBD with $k = \lambda + 1$ (or equivalently $v = r + 1$) is called quasi-derived. The complement of a quasi-residual design is quasi-derived and vice versa. We then obtain the following examples given in [23]:

Example 4.2. Using the familiar doubling construction yields $(4n - 1, 2n, n) \subset (8n - 1, 4n, 2n)$ whenever there is a Hadamard matrix of order $4n$.
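The smallest instance of Example 4.2, $(3, 2, 1) \subset (7, 4, 2)$, can be verified by machine. The sketch below makes several assumptions that are not in [23]: it uses Sylvester's doubling of the Hadamard matrix of order 2, takes as blocks the positions of the entries $-1$ in each non-initial row (ignoring the first column), and then checks the defining property of a strong subdesign on the point set $U = \{1, 2, 3\}$. All identifiers are hypothetical.

```python
from collections import Counter
from itertools import combinations


def sylvester(order):
    """Sylvester Hadamard matrix of the given order (a power of 2), by doubling."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-e for e in row] for row in h]
    return h


def blocks_from_hadamard(h):
    """One block per row i >= 1: the columns j >= 1 in which the entry is -1."""
    m = len(h)
    return [frozenset(j for j in range(1, m) if h[i][j] == -1) for i in range(1, m)]


def design_parameters(blocks, points):
    """Return (v, k, lam), failing if block sizes or intersections vary."""
    sizes = {len(b) for b in blocks}
    meets = {len(a & b) for a, b in combinations(blocks, 2)}
    assert len(blocks) == len(points) and len(sizes) == 1 and len(meets) == 1
    return len(points), sizes.pop(), meets.pop()


small = blocks_from_hadamard(sylvester(4))   # candidate (3, 2, 1) design
big = blocks_from_hadamard(sylvester(8))     # candidate (7, 4, 2) design
U = frozenset(range(1, 4))                   # point set of the small design

# Non-empty traces of the big blocks on U, counted with multiplicity.
traces = Counter(b & U for b in big if b & U)
is_strong_subdesign = set(traces) == set(small)

if __name__ == "__main__":
    print(design_parameters(small, U), design_parameters(big, range(1, 8)))
    print(is_strong_subdesign, sorted(traces.values()))
```

In this instance every block of the small design arises as the trace of exactly two blocks of the large one, in line with the ratios $k'/k = \lambda'/\lambda = 2$ of the two parameter sets.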
Example 4.3. As usual, let $PG(n, q)^c$ denote the complement of the symmetric design of points and hyperplanes of an $n$-dimensional projective geometry over $GF(q)$. Then $PG(n, q)^c \subset PG(n + 1, q)^c$.

The starting point of [23] is the following result:

Proposition 4.4. If $(v, k, \lambda) \subset (v', k', \lambda')$ and $v > 1$, then $k' = qk$, $\lambda' = q\lambda$, for some positive integer $q$.

Proof. Let $C = (U, \mathcal{A})$ be a strong symmetric $2$-$(v, k, \lambda)$ subdesign of a symmetric $(v', k', \lambda')$ design $D = (V, \mathcal{B})$. Let $\mathcal{A}^*$ be the multi-set $\{B \cap U : B \in \mathcal{B},\ B \cap U \ne \emptyset\}$. Then $C^* = (U, \mathcal{A}^*)$ is a quasi-symmetric $2$-$(v, k, \lambda')$ design with replication number $k'$ and with block intersection numbers $k$ and $\lambda$. Thus $C^*$ is a $q$-multiple of the symmetric $2$-$(v, k, \lambda)$ design $C$. Counting incidences $(p, B)$, where $p \in U$, $B \in \mathcal{A}^*$, in two ways gives $k'|U| = k|\mathcal{A}^*| = q|U|k$, so $k' = qk$. Next, using the basic parameter relation in $C^*$ and $C$ gives $\lambda'(v - 1) = k'(k - 1) = kq(k - 1)$ and $\lambda(v - 1) = k(k - 1)$, which implies $\lambda' = q\lambda$ if $v > 1$.

Before examining the structure induced on $V \setminus U$, we recall the following concept from Ionin and M. S. Shrikhande [28]:

Definition 4.5. Let $\lambda$ be a positive integer. An affine resolvable pairwise balanced design (ARPBD) of index $\lambda$ is a triple $T = (X, \mathcal{C}, \mathcal{R})$, where $X$ is a finite set (of points), $\mathcal{C}$ is a collection of subsets of $X$ (blocks), and $\mathcal{R}$ is a partition of $\mathcal{C}$ (resolution) satisfying the following conditions:
(i) any two points occur in exactly $\lambda$ blocks;
(ii) for any resolution class $R$, there is a positive integer $\alpha(R)$ (the replication number of $R$) such that each point occurs in $\alpha(R)$ blocks from $R$;
(iii) the cardinality of each block and the cardinality of the intersection of two distinct blocks depend only on their resolution classes (equivalently, as shown in [28], $|\mathcal{C}| = |X| + |\mathcal{R}| - 1$).

The above definition generalizes the definition of affine α-resolvable BIBD in S. S. Shrikhande and Raghavarao [44]. Using modifications of results in Bekker, Ionin, and M. S. Shrikhande [4], Ionin then shows:

Theorem 4.6. For positive integers $v > 1$, $q > 1$, if $(v, k, \lambda) \subset (v', qk, q\lambda)$, then there exists an ARPBD of index $q\lambda$ whose resolution consists of $v$ classes of cardinality $q$ and replication number $\frac{q\lambda}{k}$ and one class of cardinality $v' - qv$ and replication number $q - \frac{q\lambda}{k}$.

The following result of Ionin [23] is then immediate:

Corollary 4.7. If $(v, k, \lambda) \subset (v', qk, q\lambda)$, then $k$ divides $q\lambda$.
In [23], Ionin also shows that if a symmetric design $C$ and a “suitable” ARPBD exist, then $C$ can be embedded as a strong subdesign of a symmetric design $D$:

Theorem 4.8. Suppose that for positive integers $v > 1$, $q > 1$ there exist a symmetric $2$-$(v, k, \lambda)$ design and an ARPBD of index $q\lambda$ whose resolution consists of $v$ classes of cardinality $q$ and replication number $\frac{q\lambda}{k}$ and one class of cardinality $v' - qv$ and replication number $q - \frac{q\lambda}{k}$. Then $(v, k, \lambda) \subset (v', qk, q\lambda)$, where $v' = 1 + \frac{k(qk - 1)}{\lambda}$.

Proof. Let $(X, \mathcal{C}, \mathcal{R})$ be an ARPBD satisfying the above conditions and $(U, \mathcal{A})$ be a symmetric $2$-$(v, k, \lambda)$ design. Assume $X \cap U = \emptyset$. Let $\mathcal{R} = \{R_1, R_2, \ldots, R_{v+1}\}$, where $|R_i| = q$, $i = 1, 2, \ldots, v$, and $\mathcal{A} = \{A_1, A_2, \ldots, A_v\}$. For any $B \in \mathcal{C}$, define
\[ B^* = \begin{cases} B \cup A_i, & \text{if } B \in R_i,\ i = 1, 2, \ldots, v, \\ B, & \text{if } B \in R_{v+1}. \end{cases} \]
Put $V = X \cup U$, $\mathcal{B} = \{B^* : B \in \mathcal{C}\}$. Then it can be verified that $D = (V, \mathcal{B})$ is the required symmetric $2$-$(v', qk, q\lambda)$ design which contains $(U, \mathcal{A})$ as a strong subdesign.

Using Theorem 4.8, Ionin gives some constructions of symmetric designs from their strong symmetric subdesigns. As stated earlier, the main tools used by Ionin are generalized Hadamard matrices and balanced generalized weighing matrices.

Definition 4.9. A generalized Hadamard matrix $GH(q, s)$ over a group $G$ of order $q$ is a matrix $H = [h_{ij}]$ of order $qs$ with entries from $G$ such that for any two distinct rows $i$ and $l$ the multi-set $\{h_{lj}^{-1} h_{ij} : 1 \le j \le qs\}$ contains $s$ copies of every element of $G$.

Remark 4.10. If $q$ is a prime power and $G$ is the additive group of the field $GF(q)$, then for example, the existence of the following $GH(q, s)$ is known: (i) $GH(q, 1)$, (ii) $GH(q, q - 1)$, if $q - 1$ is also a prime power (see [5] for these and additional results).

Definition 4.11. A balanced generalized weighing matrix $BGW(v, k, \lambda)$ over a finite multiplicative group $G$ is a matrix $W = [w_{ij}]$ of order $v$ with entries from the set $G \cup \{0\}$, where $0 \notin G$, such that the following conditions hold:
(i) each row and column of $W$ contains exactly $k$ non-zero entries;
(ii) for any two distinct rows $i$ and $l$, the multi-set $\{w_{lj}^{-1} w_{ij} : 1 \le j \le v,\ w_{lj} \ne 0,\ w_{ij} \ne 0\}$ contains exactly $\frac{\lambda}{|G|}$ copies of every element of $G$.

Remark 4.12. Replacing every non-zero entry in a $BGW(v, k, \lambda)$ by 1 produces the incidence matrix of a symmetric $2$-$(v, k, \lambda)$ design. It is known [15] that a $BGW(v, k, \lambda)$ over $G$ exists for $v = \frac{q^{m+1} - 1}{q - 1}$, $k = q^m$, $\lambda = q^{m-1}(q - 1)$ and $G = Z_t$, where $q$ is a prime power, $m$ a positive integer, and $t$ a divisor of $q - 1$.
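Definitions 4.9 and 4.11 translate directly into small checkers. The sketch below is illustrative and makes two simplifying assumptions: the generalized Hadamard check is written for the additive group $Z_q$ (so $h_{lj}^{-1}h_{ij}$ becomes a difference mod $q$), and the BGW check is written for the group $\{1, -1\}$, where every element is its own inverse. The two test matrices (the multiplication table mod a prime, and a particular $4 \times 4$ matrix) are standard small examples chosen for this sketch, not examples from [23].

```python
from collections import Counter


def is_generalized_hadamard(h, q):
    """GH(q, s) property over Z_q: row differences hit every residue s times."""
    size = len(h)
    s, remainder = divmod(size, q)
    if remainder:
        return False
    for i in range(size):
        for l in range(size):
            if i != l:
                counts = Counter((h[i][j] - h[l][j]) % q for j in range(size))
                if any(counts[g] != s for g in range(q)):
                    return False
    return True


def is_bgw_over_signs(w, k, lam):
    """BGW(v, k, lam) property over the multiplicative group {1, -1}."""
    v = len(w)
    rows_ok = all(sum(e != 0 for e in row) == k for row in w)
    cols_ok = all(sum(w[i][j] != 0 for i in range(v)) == k for j in range(v))
    if not (rows_ok and cols_ok) or lam % 2:
        return False
    for i in range(v):
        for l in range(v):
            if i != l:
                # In {1, -1} we have g^{-1} = g, so w_lj^{-1} * w_ij = w_lj * w_ij.
                counts = Counter(w[l][j] * w[i][j] for j in range(v)
                                 if w[l][j] != 0 and w[i][j] != 0)
                if any(counts[g] != lam // 2 for g in (1, -1)):
                    return False
    return True


# GH(5, 1) over Z_5: the multiplication table of the integers mod 5.
GH5 = [[(i * j) % 5 for j in range(5)] for i in range(5)]

# A BGW(4, 3, 2) over {1, -1}; its support is J - I, a (4, 3, 2) design.
W = [[0, 1, 1, 1],
     [1, 0, 1, -1],
     [1, -1, 0, 1],
     [1, 1, -1, 0]]

if __name__ == "__main__":
    print(is_generalized_hadamard(GH5, 5), is_bgw_over_signs(W, 3, 2))
```

The matrix `W` has the parameters of Remark 4.12 with $q = 3$, $m = 1$ and $t = 2$.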
The discussion and proofs of the construction results of Ionin [23] are too technical to go into here, so we refer to the paper [23]. We end this section with a statement of one of Ionin’s results:

Theorem 4.13. There exist symmetric designs with parameters
\[ v = 1 + \frac{2(q + 1)\left((q + 1)^{2m} - 1\right)}{q + 2}, \qquad k = (q + 1)^{2m}, \qquad \lambda = \frac{(q + 1)^{2m-1}(q + 2)}{2}, \]
where $m$ is a positive integer and $q = 2^p - 1$ is a Mersenne prime.

We also mention the very interesting recent papers of Ionin [24]–[27]. We end this section with a conjecture due to Ionin.

Conjecture 4.14. If there exist symmetric designs $D(v, k, \lambda)$ and $D_1(v_1, qk, q\lambda)$ for a prime power $q$ with $k$ dividing $q\lambda$, then $(v, k, \lambda) \subset (v_1, qk, q\lambda)$.
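As a sanity check on the parameter family of Theorem 4.13, the short sketch below evaluates the parameters, read as $v = 1 + 2(q+1)((q+1)^{2m} - 1)/(q+2)$, $k = (q+1)^{2m}$, $\lambda = (q+1)^{2m-1}(q+2)/2$, and tests the counting identity $\lambda(v - 1) = k(k - 1)$. The function name is hypothetical, and the code checks arithmetic only; it says nothing about how the designs are built.

```python
def ionin_parameters(p, m):
    """(v, k, lam) for q = 2**p - 1 and a positive integer m, as read above."""
    q = 2**p - 1
    k = (q + 1) ** (2 * m)
    # (q + 1) is congruent to -1 mod (q + 2), so (q + 2) divides k - 1 exactly.
    v = 1 + 2 * (q + 1) * (k - 1) // (q + 2)
    # q + 1 = 2**p is even, so the division by 2 is exact.
    lam = (q + 1) ** (2 * m - 1) * (q + 2) // 2
    return v, k, lam


def satisfies_counting_identity(v, k, lam):
    return lam * (v - 1) == k * (k - 1)


if __name__ == "__main__":
    for p in (2, 3, 5):          # exponents of the Mersenne primes 3, 7, 31
        for m in (1, 2):
            params = ionin_parameters(p, m)
            print(p, m, params, satisfies_counting_identity(*params))
```

For $q = 3$ and $m = 1$ this gives the parameters (25, 16, 10).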
5. Symmetric designs and decompositions

In a series of papers starting from [23], Ionin systematically developed methods which have produced several infinite families of symmetric designs. A general technique was laid down in [24]. His basic idea was to start from a set $\mathcal{M}$ of incidence matrices of symmetric $(v, k, \lambda)$-designs, a suitable group $G$ of mappings $\mathcal{M} \to \mathcal{M}$, and a suitable balanced generalized weighing matrix (BGW matrix) $W$. Then Ionin gives sufficient conditions under which the Kronecker product $W \otimes X$, $X \in \mathcal{M}$, is the incidence matrix of a larger symmetric design. The main construction method of [24] is the following result:

Theorem 5.1. Let $v > k > \lambda \ge 0$ be integers. Let $\mathcal{M}$ be a non-empty set of matrices and $G$ be a finite group of mappings $\mathcal{M} \to \mathcal{M}$. Let $W$ be a balanced generalized weighing matrix $BGW(\omega, l, \mu)$ over $G$ with $k^2\mu = v\lambda l$. If
(i) every matrix $X \in \mathcal{M}$ is the incidence matrix of a symmetric $(v, k, \lambda)$-design,
(ii) for any $X, Y \in \mathcal{M}$ and $\sigma \in G$, $(\sigma X)(\sigma Y)^T = XY^T$, and
(iii) for any $X \in \mathcal{M}$, $\sum_{\sigma \in G} \sigma X = aJ$ for some constant $a$,
then, for any $X \in \mathcal{M}$ and any positive integer $m$, $W \otimes X$ is the incidence matrix of a symmetric $(v\omega, kl, \lambda l)$-design.

Ionin then applies Theorem 5.1 to symmetric designs which are developed from McFarland and Spence difference sets (in [24]) and then, in [25], to Davis and Jedwab and Hadamard difference sets, to obtain four infinite families of symmetric designs. Refer to the original papers of Ionin or [5] for the parameters.
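A very small instance of the Kronecker-type construction in Theorem 5.1 can be worked out by machine. Everything specific in the sketch below is an assumption made for illustration and is not taken from [24]: the group is $Z_3 = \langle g \rangle$ acting on $3 \times 3$ matrices by cyclic column shifts, $W$ is a circulant $BGW(5, 4, 3)$ over $Z_3$ with first row $(0, 1, g, g, 1)$, and $X = A$ is the matrix $J - I$ of order 3. Here $k^2\mu = 4 \cdot 3 = 12 = 3 \cdot 1 \cdot 4 = v\lambda l$, and the result should be a symmetric (15, 8, 4)-design.

```python
from collections import Counter

# Z_3 = <g> is stored additively as exponents 0, 1, 2; None marks a zero of W.
FIRST_ROW = [None, 0, 1, 1, 0]
W = [[FIRST_ROW[(j - i) % 5] for j in range(5)] for i in range(5)]

A = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]   # incidence matrix of a (3, 2, 1)-design


def act(e, x):
    """g^e maps a 3x3 matrix to the matrix with columns shifted cyclically by e."""
    return [[row[(c - e) % 3] for c in range(3)] for row in x]


def is_bgw(w, k, lam, group_order):
    """BGW(v, k, lam) property over Z_{group_order}, written additively."""
    v = len(w)
    support_ok = all(sum(e is not None for e in row) == k for row in w) and all(
        sum(w[i][j] is not None for i in range(v)) == k for j in range(v))
    if not support_ok or lam % group_order:
        return False
    for i in range(v):
        for l in range(v):
            if i != l:
                quotients = Counter((w[i][j] - w[l][j]) % group_order
                                    for j in range(v)
                                    if w[i][j] is not None and w[l][j] is not None)
                if any(quotients[g] != lam // group_order for g in range(group_order)):
                    return False
    return True


def kronecker(w, x):
    """Block matrix whose (i, j) block is act(w_ij, x), or a zero block."""
    n = len(x)
    zero = [[0] * n for _ in range(n)]
    rows = []
    for w_row in w:
        blocks = [zero if e is None else act(e, x) for e in w_row]
        rows.extend([entry for blk in blocks for entry in blk[r]] for r in range(n))
    return rows


def is_symmetric_design_matrix(m, k, lam):
    """Check M * M^T = (k - lam) * I + lam * J entry by entry."""
    n = len(m)
    return all(sum(m[a][c] * m[b][c] for c in range(n)) == (k if a == b else lam)
               for a in range(n) for b in range(n))


M = kronecker(W, A)

if __name__ == "__main__":
    print(is_bgw(W, 4, 3, 3), len(M), is_symmetric_design_matrix(M, 8, 4))
```

With these choices `act(1, A)` is the matrix called $B$ in Example 5.5, and the matrix `M` has the block pattern displayed there.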
Using a similar approach, Kharaghani [33] shows that this construction works whenever $(v, k, \lambda) = (4n^2, 2n^2 - n, n^2 - n)$, $(2n - 1)^2$ is a prime power, and there exists a Bush-type Hadamard matrix of order $4n^2$. A Bush-type Hadamard matrix is a block matrix $H = [H_{ij}]$ of order $4n^2$ with block size $2n$, $H_{ii} = J_{2n}$ and $H_{ij}J_{2n} = J_{2n}H_{ij} = O$, $i \ne j$, $1 \le i \le 2n$, $1 \le j \le 2n$.

The recent paper by Jungnickel and Tonchev [31] has connections with Ionin’s strong subdesigns discussed in the previous section. We shall briefly discuss some of the key concepts and state one of the results of this paper. Refer to [31] for most of the details and to [5] for the necessary background on difference (and relative difference) sets.

Let $G$ be a multiplicatively written finite group of order $mn$ containing a normal subgroup $N$ of order $n$. A $k$-element subset $D$ of $G$ is called an $(m, n, k, \lambda_1, \lambda_2)$ divisible difference set (DDS) in $G$ relative to $N$ if the list $\{xy^{-1} : x, y \in D\}$ contains exactly $\lambda_1$ copies of each non-identity element of $N$ and exactly $\lambda_2$ copies of each element of $G \setminus N$. If $n = 1$, then $\lambda_1$ is vacuous and $D$ becomes an (ordinary) $(m, k, \lambda_2)$-difference set in $G$. If $\lambda_1 = 0$, then $D$ is called a relative difference set in $G$ relative to $N$. A (relative) difference set is called cyclic or abelian if $G$ has the respective property.

A part of the paper of Jungnickel and Tonchev [31] deals with the following question: Under what conditions is it possible to combine a (small) difference set in a group $N$ with a suitable relative difference set (relative to $N$) in a larger group $G$ to obtain a (large) difference set in $G$? Jungnickel and Tonchev use the standard method of representing (relative) difference sets as elements of the integral group ring $ZG$. Identify a subset $A$ of $G$ with the element $\sum_{g \in A} g$ of $ZG$. Then the conditions for a difference set (relative difference set) translate into the following:

Lemma 5.2. Let $G$ be a group of order $v$, $D \in ZG$. Then $D$ is a $(v, k, \lambda)$-difference set iff $DD^{(-1)} = (k - \lambda) + \lambda G$ holds in $ZG$. Let $N$ be a normal subgroup of order $n$ and index $m$ in $G$. Then $R$ is an $(m, n, k, \lambda)$-relative difference set in $G$ relative to $N$ iff $RR^{(-1)} = k + \lambda(G \setminus N)$ holds in $ZG$.

Then use is made of the following result of Pott [36]:

Proposition 5.3. Assume the existence of both an $(\omega, v, l, \alpha)$-difference set $R$ in $G$ relative to a normal subgroup $N$ and a $(v, k, \lambda)$-difference set $S$ in $N$. If $k^2\alpha = l\lambda$, then there exists a $(v\omega, kl, \lambda l)$-difference set in $G$ of the form $SR$.

Jungnickel and Tonchev refer to the difference set $D = SR$ so obtained as being decomposable. Let $D = (V, \mathcal{B})$ be a symmetric $(v, k, \lambda)$-design and let $D_1 = (U, \mathcal{A})$ be a strong subdesign of $D$ with parameters $(v', k', \lambda')$ (Jungnickel and Tonchev refer
to $D_1$ as a strongly induced subdesign of $D$.) Jungnickel and Tonchev [31] show that decomposable difference sets and Ionin’s strongly induced symmetric subdesigns are related by proving the following:

Proposition 5.4. Let $D = SR \subseteq G$ be a difference set with a decomposition and let $\mathcal{D} = \mathrm{dev}(D)$ be the associated symmetric design. Then $\mathcal{D}$ admits a $G$-invariant partition into strongly induced subdesigns isomorphic to $\mathcal{S} = \mathrm{dev}(S)$.

They also prove the converse of the above result. For further results, we refer to their paper.

We end the paper with some remarks and examples. Let $D = (V, \mathcal{B})$ be a symmetric design on $sv$ points. Let $P = \{V_1, \ldots, V_s\}$ be a partition of $V$ into sets of cardinality $v$ and $Q = \{\mathcal{B}_1, \ldots, \mathcal{B}_s\}$ a partition of $\mathcal{B}$ into sets of cardinality $v$. For $i, j = 1, 2, \ldots, s$, form incidence structures $D^{ij} = (V_i, \mathcal{B}_j)$ (with $p \in V_i$ and $B \in \mathcal{B}_j$ being incident if and only if $p \in B$). Let $E$ be a symmetric $(v, k, \lambda)$-design. We will say that $D$ is decomposed by $E$ if each $D^{ij}$ is either isomorphic to $E$ or is a structure with empty incidence relation (i.e., no point and no block are incident).

Example 5.5. The matrices
\[ A = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix} \]
are incidence matrices of a symmetric $(3, 2, 1)$-design $E$. Then the block matrix
\[ \begin{bmatrix} O & A & B & B & A \\ A & O & A & B & B \\ B & A & O & A & B \\ B & B & A & O & A \\ A & B & B & A & O \end{bmatrix} \]
is the incidence matrix of a symmetric $(15, 8, 4)$-design $D$, which therefore is decomposed by $E$.

Example 5.6. Let $\mathcal{D}$ be the development of a difference set $D$ which admits a decomposition $D = SR$ in the sense of Jungnickel and Tonchev, where $S$ is a difference set. Then $\mathcal{D}$ is decomposed by the development of $S$.

Example 5.7. Let $G = Q_4 \times Z_2$, where $Q_4 = \{\pm 1, \pm i, \pm j, \pm k\}$ is the quaternion group, and $Z_2 = \langle x \rangle$ is the group of order 2. Then $E = \{1, i, j, k, \pm x\}$ is a $(16, 6, 2)$-difference set in $G$; let $\mathcal{E}$ be the development of $E$. One can apply Ionin’s techniques [24] to obtain, for any positive integer $m$, a symmetric design $D$ with parameters $(2 \cdot (9^{m+1} - 1),\ 6 \cdot 9^m,\ 2 \cdot 9^m)$ that is decomposed by $\mathcal{E}$.
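The difference set of Example 5.7 is small enough to verify exhaustively. In the sketch below the encoding of group elements as triples (sign, quaternion unit, exponent of $x$) and all function names are assumptions of this illustration; the code counts the quotients $ab^{-1}$ for distinct $a, b \in E$ and also checks the counting identity for the parameter family $(2(9^{m+1} - 1),\ 6 \cdot 9^m,\ 2 \cdot 9^m)$.

```python
from collections import Counter
from itertools import product

# Products of distinct imaginary units: i*j = k, j*k = i, k*i = j;
# reversing the order of the factors changes the sign.
CYCLIC = {("i", "j"): "k", ("j", "k"): "i", ("k", "i"): "j"}


def unit_product(a, b):
    """Return (sign, unit) for the product of two of the units 1, i, j, k."""
    if a == "1":
        return 1, b
    if b == "1":
        return 1, a
    if a == b:
        return -1, "1"
    if (a, b) in CYCLIC:
        return 1, CYCLIC[(a, b)]
    return -1, CYCLIC[(b, a)]


def multiply(g, h):
    """Product in Q4 x Z2; an element is (sign, unit, exponent of x)."""
    sign, unit = unit_product(g[1], h[1])
    return g[0] * h[0] * sign, unit, (g[2] + h[2]) % 2


GROUP = list(product((1, -1), "1ijk", (0, 1)))
IDENTITY = (1, "1", 0)
INVERSE = {g: next(h for h in GROUP if multiply(g, h) == IDENTITY) for g in GROUP}

# E = {1, i, j, k, x, -x}
E = [(1, "1", 0), (1, "i", 0), (1, "j", 0), (1, "k", 0), (1, "1", 1), (-1, "1", 1)]

quotients = Counter(multiply(a, INVERSE[b]) for a in E for b in E if a != b)
is_difference_set = len(quotients) == len(GROUP) - 1 and set(quotients.values()) == {2}


def family_parameters(m):
    """Parameters stated in Example 5.7 for a given m."""
    return 2 * (9 ** (m + 1) - 1), 6 * 9**m, 2 * 9**m


if __name__ == "__main__":
    print(is_difference_set)
    for m in (1, 2, 3):
        v, k, lam = family_parameters(m)
        print((v, k, lam), lam * (v - 1) == k * (k - 1))
```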
Generalized Hadamard matrices, balanced generalized weighing matrices and group rings are some of the tools used in investigations on symmetric designs and difference sets. Some of these have been used also in Ionin and M. S. Shrikhande [29] which concerns designs which admit nearly affine decompositions. We mention also the papers of Mavron [34] and McDonough and Mavron [35]. These concern symmetric designs with special substructures which in some cases are subdesigns in the more general sense. Acknowledgement. The author acknowledges support of a Central Michigan University Research Professorship Award #22183.
References

[1] R. W. Ahrens and G. Szekeres, On a combinatorial generalization of 27 lines associated with a cubic surface, J. Austral. Math. Soc. 10 (1969), 485–492.
[2] A. Baartmans and M. S. Shrikhande, Tight subdesigns of symmetric designs, Ars Combin. 12 (1981), 303–309.
[3] R. D. Baker, Symmetric designs with Bruck subdesigns, Combinatorica 2 (1982), 103–109.
[4] B. Bekker, Y. J. Ionin and M. S. Shrikhande, Embeddability and construction of affine α-resolvable pairwise balanced designs, J. Combin. Des. 6 (1998), 111–129.
[5] T. Beth, D. Jungnickel, and H. Lenz, Design Theory, Volumes I and II, Cambridge University Press, Cambridge 1999.
[6] R. C. Bose, Strongly regular graphs, partial geometries, and partially balanced designs, Pacific J. Math. 13 (1963), 389–419.
[7] R. C. Bose, A note on the resolvability of incomplete block designs, Sankhyā 6 (1942), 105–110.
[8] R. C. Bose, Symmetric group divisible designs with the dual property, J. Statist. Plann. Inference 1 (1977), 87–101.
[9] R. C. Bose, J. W. Freeman, and D. G. Glynn, On the intersection of two Baer subplanes in finite projective plane, Util. Math. 17 (1980), 65–71.
[10] R. C. Bose and S. S. Shrikhande, Baer subdesigns of symmetric balanced incomplete block designs, in: Essays in Probability and Statistics (S. Ikeda et al., eds.), Shinko Tsusho, Tokyo 1976, 1–16.
[11] R. C. Bose and S. S. Shrikhande, Graphs in which each pair of vertices is adjacent to d other vertices, Studia Sci. Math. Hungar. 5 (1970), 181–195.
[12] R. H. Bruck, Difference sets in a finite group, Trans. Amer. Math. Soc. 78 (1955), 464–481.
[13] R. H. Bruck and R. C. Bose, The construction of translation planes from projective spaces, J. Algebra 1 (1964), 85–102.
[14] R. H. Bruck and R. C. Bose, Linear representations of projective planes in projective spaces, J. Algebra 4 (1966), 117–172.
[15] C. J. Colbourn and J. H. Dinitz (eds.), The CRC Handbook of Combinatorial Designs, CRC Press, Boca Raton 1996.
[16] N. J. Cron and V. C. Mavron, On intersections of symmetric subdesigns of symmetric designs, Arch. Math. 40 (1983), 475–480.
[17] P. Dembowski, Finite Geometries, Springer-Verlag, Berlin, Heidelberg 1968.
[18] J. D. Fanning, A family of symmetric designs, Discrete Math. 146 (1995), 307–312.
[19] W. H. Haemers, A generalization of the Higman–Sims technique, Indag. Math. 40 (1978), 445–447.
[20] W. H. Haemers, Interlacing eigenvalues and graphs, Linear Algebra Appl. 227/228 (1995), 593–616.
[21] W. H. Haemers and M. S. Shrikhande, Some remarks on subdesigns of symmetric designs, J. Statist. Plann. Inference 3 (1979), 361–366.
[22] J. W. P. Hirschfeld, Projective Geometries over Finite Fields, second edition, Oxford University Press, Oxford 1998.
[23] Y. J. Ionin, Symmetric subdesigns of symmetric designs, J. Combin. Math. Combin. Comput. 29 (1999), 65–78.
[24] Y. J. Ionin, A technique for constructing symmetric designs, Des. Codes Cryptogr. 14 (1998), 147–158.
[25] Y. J. Ionin, New symmetric designs from regular Hadamard matrices, Electron. J. Combin. 5, No. 1, (1998), R1.
[26] Y. J. Ionin, Building symmetric designs with building sets, Des. Codes Cryptogr. 17 (1999), 159–175.
[27] Y. J. Ionin, Applying balanced generalized weighing matrices to construct block designs, Electron. J. Combin. 8, No. 1, (2001), R12.
[28] Y. J. Ionin and M. S. Shrikhande, Resolvable pairwise balanced designs, J. Statist. Plann. Inference 72 (1998), 393–405.
[29] Y. J. Ionin and M. S. Shrikhande, Strongly regular graphs and designs with three intersection numbers, Des. Codes Cryptogr. 21 (2000), 113–125.
[30] D. Jungnickel, On subdesigns of symmetric designs, Math. Z. 181 (1982), 383–393.
[31] D. Jungnickel and V. D. Tonchev, Decompositions of difference sets, J. Algebra 217 (1998), 21–39.
[32] W. M. Kantor, 2-transitive symmetric designs, Trans. Amer. Math. Soc. 146 (1969), 1–28.
[33] H. Kharaghani, On the twin designs with the Ionin-type parameters, Electron. J. Combin. 7, No. 7, (2000), R1.
[34] V. C. Mavron, Symmetric designs and λ-arcs, European J. Combin. 9 (1988), 507–516.
[35] T. P. McDonough and V. C. Mavron, Symmetric designs and geometroids, Combinatorica 9 (1) (1989), 51–57.
[36] A. Pott, Finite Geometry and Character Theory, Lecture Notes in Math. 1601, Springer-Verlag, Berlin 1995.
[37] D. Rajkundlia, Some techniques for constructing infinite families of BIBDs, Discrete Math. 44 (1983), 61–96.
[38] C. R. Rao, Difference sets and combinatorial arrangements derivable from finite geometries, Proc. Nat. Inst. Sci. India 12 (1946), 123–135.
[39] C. R. Rao, Cyclical generation of linear subspaces in finite geometries, in: Combinatorial Mathematics and its Applications (R. C. Bose and T. A. Dowling, eds.), University of North Carolina Press, Chapel Hill 1969, 515–535.
[40] S. S. Sane, On a class of symmetric designs, in: Combinatorics and Applications (K. S. Vijayan and N. M. Singhi, eds.), Indian Statistical Institute, Calcutta 1982, 292–302.
[41] E. Seiden, A method of construction of resolvable BIBD, Sankhyā Ser. A 25 (1963), 393–394.
[42] M. S. Shrikhande, Combinatorics of block designs, in: Recent Advances in Experimental Designs and Related Topics (S. Altan and J. Singh, eds.), Nova Science Publishers, Huntington, NY, 2001, 175–191.
[43] M. S. Shrikhande and S. S. Sane, Quasi-Symmetric Designs, London Math. Soc. Lecture Note Ser. 164, Cambridge University Press, Cambridge 1991.
[44] S. S. Shrikhande and D. Raghavarao, Affine α-resolvable incomplete block designs, in: Contributions to Statistics, Presented to Professor Mahalanobis on his 70th Birthday, Pergamon Press, New York 1964, 471–480.
[45] S. S. Shrikhande and D. Raghavarao, A method of construction of incomplete block designs, Sankhyā Ser. A 25 (1963), 399–402.
[46] S. S. Shrikhande and N. K. Singh, On a method of constructing symmetric balanced incomplete block designs, Sankhyā Ser. A 25 (1962), 25–32.
[47] K. Vedder, A note on the intersection of two Baer subplanes, Arch. Math. 37 (1981), 287–288.
[48] K. Vedder, Affine subplanes of projective planes, in: Finite Geometries and Designs (P. J. Cameron, J. W. P. Hirschfeld, and D. R. Hughes, eds.), London Math. Soc.
Lecture Note Ser. 49, Cambridge University Press, Cambridge 1981, 359–364. [49] W. D. Wallis, Construction of strongly regular graphs using affine designs, Bull. Austral. Math. Soc. 4 (1971), 41–49. M. S. Shrikhande Central Michigan University Mathematics Department Mt. Pleasant, MI 48859, U.S.A. [email protected]
Linear codes over $\mathbb{F}_2 + u\mathbb{F}_2$ and their complete weight enumerators

Irfan Siap

Abstract. In this paper we explore the complete weight enumerators of linear codes over $\mathbb{F}_2 + u\mathbb{F}_2$ with $u^2 = 1$, whose images under a Gray map are Type I and Type II binary codes. Our approach simplifies the relations between the codes over $\mathbb{F}_2 + u\mathbb{F}_2$ and their images. By establishing the connection between the 2-byte weight enumerators of binary codes and the complete weight enumerators of codes over $\mathbb{F}_2 + u\mathbb{F}_2$, we determine the ring of invariants of complete weight enumerators of Type I (I-R) and Type II (II-R) codes over $\mathbb{F}_2 + u\mathbb{F}_2$.

2000 Mathematics Subject Classification: primary 94B05; secondary 94B60.
1. Introduction

Let $R$ be a finite commutative ring with identity. An $R$-submodule of $R^n$ is called a code. The elements of a code are called codewords. The function $d : R^n \times R^n \to \mathbb{N}_0$,
\[ d(u, v) := |\{ i : u_i \neq v_i \}|, \]
where $u = (u_1, \dots, u_n)$, $v = (v_1, \dots, v_n) \in R^n$ and $\mathbb{N}_0$ is the set of nonnegative integers, is called the Hamming distance. The Hamming distance is a metric on $R^n$. The minimum Hamming distance of a code $C$ is defined by
\[ d(C) = \min_{u, v \in C,\ u \neq v} d(u, v). \]
$C$ is said to be an $(n, M, d)$-linear code if and only if $C$ is a submodule of $R^n$ of size $M$ and $d(C) = d$. Another important notion is the (Hamming) weight of a codeword $u$, which is defined by $w(u) = |\{ i : u_i \neq 0 \}|$, i.e., the number of nonzero entries of $u$. The minimum Hamming weight $w(C)$ of a code $C$ is the smallest weight among all its nonzero codewords. We observe that if $C$ is a linear code then $d(C) = w(C)$.

Codes and Designs, Ohio State Univ. Math. Res. Inst. Publ. 10
© Walter de Gruyter 2002
The inner product of two codewords $v$ and $w$ is defined by
\[ \langle v, w \rangle := \sum_{i=1}^{n} v_i w_i, \tag{1} \]
where $v = (v_1, \dots, v_n)$ and $w = (w_1, \dots, w_n)$. The dual code $C^\perp$ of an $(n, M)$ linear code $C$ is defined by
\[ C^\perp := \{ v \in R^n \mid \langle w, v \rangle = 0 \text{ for all } w \in C \}. \]
Observe that $C^\perp$ is also a submodule of $R^n$.

Commutative rings of order 4 are $\mathbb{F}_4$ (Galois field of order 4), $\mathbb{Z}_4$ (integers modulo 4), $\mathbb{F}_2[u]/(u^2 - 1)$, and $\mathbb{F}_2[u]/(u^2 - u)$. It is clear that the ring $\mathbb{F}_2[u]/(u^2 - 1)$ is isomorphic to the ring $\mathbb{F}_2[u]/(u^2)$ via the map $\psi : u \mapsto u + 1$. Most researchers have considered codes over the ring $\mathbb{F}_2 + u\mathbb{F}_2$ with $u^2 = 0$ ([DHGS], [DGHMS]). Instead we will work with codes over $\mathbb{F}_2 + u\mathbb{F}_2$ with $u^2 = 1$. The advantage of working with the latter ring is that the Gray map has a very simple and natural structure, which will become apparent later. In order to simplify notation we set
\[ R_i := \mathbb{F}_2[u]/(u^2 - i), \quad i = 0, 1, \]
and we simply write $R$ for $R_1 = \mathbb{F}_2[u]/(u^2 - 1)$.

First, we relate the definitions given in [DHGS] and [DGHMS] for codes over $R_0$ to codes over $R_1$. In [DHGS] and [DGHMS], the idea of considering codes over the ring $R_0$ is based on the similarity between the rings $\mathbb{Z}_4$ and $R_0$, with the indeterminate $u$ playing the same role as 2 in $\mathbb{Z}_4$. In the following table we define the Hamming weight and the Lee weight of the elements of $R_0$ (given in [DHGS]) and $R_1$.

Ring   Element   Hamming weight   Lee weight
R_0    0         0                0
R_1    0         0                0
R_0    1         1                1
R_1    1         1                1
R_0    u         1                2
R_1    u         1                1
R_0    1+u       1                1
R_1    1+u       1                2
Remark. The Lee weight for the elements of $R_1$ can be defined alternatively as $w_L(a + ub) = w_H(a) + w_H(b)$, where $w_L$ and $w_H$ stand for the Lee and Hamming weights, respectively.

These weights are extended to $n$-tuples coordinatewise. The Hamming and the Lee distances of two codewords $v_1$ and $v_2$ are defined as the Hamming and the Lee weights of $v_1 - v_2$, respectively. The minimum Lee weight of a code $C$ is the minimum nonzero Lee weight among its codewords, and it is denoted by Lee($C$).
In [DHGS] a linear code $C$ over $R_0$ of length $n$ is mapped to a linear binary code of length $2n$ by the Gray map $\phi_0 : R_0 \to \mathbb{F}_2^2$, $\phi_0(a + ub) = (b, a + b)$, defined on the entries of a codeword and extended naturally to $n$-tuples coordinatewise. This Gray map is an $\mathbb{F}_2$-linear isometry from ($R_0^n$, Lee distance) onto ($\mathbb{F}_2^{2n}$, Hamming distance) [DHGS].

As mentioned at the beginning, we are going to work with codes over $R_1$. We define a Gray map $\phi : R \to \mathbb{F}_2^2$ by $\phi(a + ub) = (a, b)$ for all $a + ub \in R_1$, and we extend this map coordinatewise to codewords. This map is very simple and is a natural $\mathbb{F}_2$-linear map. It is a straightforward observation that the Lee weight of a codeword in $C$ is equal to the Hamming weight of its Gray image, so this Gray map is a distance-preserving linear map. In particular, if $c = (a_1 + ub_1, a_2 + ub_2, \dots, a_n + ub_n) \in C$, then
\[ \phi(c) = (a_1, b_1, a_2, b_2, \dots, a_n, b_n). \tag{2} \]
By definition, we see that the Gray map is simply a projection map when working with codes over $R$. We define a swap map $S : \mathbb{F}_2^2 \to \mathbb{F}_2^2$ by $S(a, b) = (b, a)$ and extend it to $n$-tuples coordinatewise. The relation between the rings and the maps can be represented by the following diagram, in which the left vertical arrow is the isomorphism $\psi : u \mapsto u + 1$:
\[
\begin{array}{ccc}
(a + b) + bu \in R_0 & \xrightarrow{\ \phi_0\ } & (b, a) \in \mathbb{F}_2^2 \\
\uparrow \psi & & \uparrow S \\
a + ub \in R_1 & \xrightarrow{\ \phi\ } & (a, b) \in \mathbb{F}_2^2
\end{array}
\tag{3}
\]
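The Gray map on $R_1$ and the isometry claim can be checked by brute force for short lengths. The following is a minimal sketch, assuming that an element $a + ub$ is stored as the pair `(a, b)`; the function names are illustrative and do not come from the paper.

```python
from itertools import product

# An element a + u*b of R = F_2 + uF_2 (u^2 = 1) is stored as the pair (a, b).
R = [(0, 0), (1, 0), (0, 1), (1, 1)]  # 0, 1, u, 1+u


def lee_weight(x):
    """Lee weight on R_1: w_L(a + ub) = w_H(a) + w_H(b)."""
    a, b = x
    return a + b


def gray(word):
    """Gray map (a_1 + u b_1, ..., a_n + u b_n) -> (a_1, b_1, ..., a_n, b_n)."""
    return tuple(bit for x in word for bit in x)


def add(v, w):
    """Coordinatewise addition in R^n (characteristic 2)."""
    return tuple(((a + c) % 2, (b + d) % 2) for (a, b), (c, d) in zip(v, w))


def is_isometry(n):
    """Check that Lee distance on R^n equals Hamming distance of Gray images."""
    words = list(product(R, repeat=n))
    for v in words:
        for w in words:
            lee = sum(lee_weight(x) for x in add(v, w))
            ham = sum((s + t) % 2 for s, t in zip(gray(v), gray(w)))
            if lee != ham:
                return False
    return True


print(gray(((1, 1), (0, 1))))  # image of the word (1+u, u)
print(is_isometry(2))
```

The exhaustive check is only practical for very small $n$, since it compares all $4^n \cdot 4^n$ pairs of words.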
All maps in the diagram are $\mathbb{F}_2$-linear bijections ($\psi$ is moreover a ring isomorphism), and they preserve the weights. Here we state an important theorem obtained in [DHGS] via the Gray map on codes over $R_0$. This theorem and others given in [DHGS] can be obtained via the Gray map for codes over $R_1$.

Theorem 1.1 ([DHGS]). $C$ is self-dual if and only if its Gray image $\phi(C)$ is self-dual. The minimum Lee weight of $C$ is equal to the minimum Hamming weight of $\phi(C)$.

Self-dual binary codes are called Type II if all weights are divisible by 4, and are called Type I otherwise. Since the Lee weight of a codeword in $C$ is equal to the Hamming weight of its image, the following definition is natural:

Definition 1.2 ([DHGS]). A self-dual code $C$ is called a Type II code if all Lee weights are divisible by 4, and it is called Type I otherwise.
In [Si], the notions of Type I-R and Type II-R codes were introduced. Here we point out that Type I and Type I-R codes, as well as Type II and Type II-R codes, are equivalent. The author was motivated by the work of Gulliver and Harada [GH], who consider codes over $\mathbb{F}_3 + u\mathbb{F}_3$ with $u^2 = 1$. Here the codes over $\mathbb{F}_2 + u\mathbb{F}_2$ with $u^2 = 1$ are considered. Another important fact that relates the self-dual codes over $R$ to self-dual binary codes is the following theorem, which can be seen easily using the Gray map:

Theorem 1.3 ([DHGS]). There is a one-to-one correspondence between the self-dual codes of length $n$ over $R$ and the self-dual binary codes of length $2n$ which are invariant under the permutation $(1\ 2)(3\ 4) \cdots (2n-1\ \ 2n)$.

The complete weight enumerator of a code $C$ of length $n$ over $R$ is given by
\[ W_C^u(z_{00}, z_{01}, z_{10}, z_{11}) = \sum_{c \in C} z_{00}^{\delta_{00}(c)} z_{01}^{\delta_{01}(c)} z_{10}^{\delta_{10}(c)} z_{11}^{\delta_{11}(c)}, \tag{4} \]
where
\[ \delta_{ij}(c) = |\{ s \mid c_s = i + uj,\ 1 \le s \le n \}| \tag{5} \]
and $0 \le i, j \le 1$.

Example. We give an example of a Type II code $C_4$ over $R$ with generator matrix
\[ G := \begin{pmatrix} 1 & 0 & 1+u & u \\ 0 & 1 & u & 1+u \end{pmatrix}. \]
The complete weight enumerator of $C_4$ is
\[ W_{C_4}^u(z_{00}, z_{01}, z_{10}, z_{11}) = z_{00}^4 + z_{01}^4 + z_{10}^4 + z_{11}^4 + 2 z_{00}^2 z_{11}^2 + 2 z_{01}^2 z_{10}^2 + 8 z_{00} z_{01} z_{10} z_{11}. \tag{6} \]
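The enumerator in (6) can be reproduced by listing all 16 codewords of $C_4$. The sketch below assumes the pair encoding `(a, b)` for $a + ub$ and stores the enumerator as a dictionary from exponent vectors $(\delta_{00}, \delta_{01}, \delta_{10}, \delta_{11})$ to coefficients; the helper names are hypothetical.

```python
from collections import Counter
from itertools import product

# a + u*b in R = F_2 + uF_2 (u^2 = 1) is stored as (a, b).
R = [(0, 0), (1, 0), (0, 1), (1, 1)]
ZERO, ONE, U, ONE_PLUS_U = (0, 0), (1, 0), (0, 1), (1, 1)


def add(x, y):
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)


def mul(x, y):
    # (a + ub)(c + ud) = (ac + bd) + u(ad + bc), because u^2 = 1.
    (a, b), (c, d) = x, y
    return ((a * c + b * d) % 2, (a * d + b * c) % 2)


def span(rows):
    """All R-linear combinations of the given generator rows."""
    n = len(rows[0])
    code = set()
    for coeffs in product(R, repeat=len(rows)):
        word = [ZERO] * n
        for c, row in zip(coeffs, rows):
            word = [add(w, mul(c, r)) for w, r in zip(word, row)]
        code.add(tuple(word))
    return code


def complete_weight_enumerator(code):
    """Map (d00, d01, d10, d11) -> number of codewords with those counts."""
    cwe = Counter()
    for word in code:
        counts = Counter(word)
        key = (counts[ZERO], counts[U], counts[ONE], counts[ONE_PLUS_U])
        cwe[key] += 1
    return dict(cwe)


G = [[ONE, ZERO, ONE_PLUS_U, U],
     [ZERO, ONE, U, ONE_PLUS_U]]
C4 = span(G)
cwe_C4 = complete_weight_enumerator(C4)
for exponents, coeff in sorted(cwe_C4.items(), reverse=True):
    print(coeff, exponents)
```

Each key lists the exponents of $z_{00}, z_{01}, z_{10}, z_{11}$ in that order, so the entry `(1, 1, 1, 1): 8` corresponds to the term $8 z_{00} z_{01} z_{10} z_{11}$.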
2. The 2-byte weight enumerator of a binary code

The definition of the $r$-byte weight enumerator of a binary code and the relation between the 2-byte weight enumerator of a binary code and that of its dual are given in [WWK], where this approach is used to investigate some properties of the weight distribution for the Euclidean image of binary linear codes. Here, we will relate these weight enumerators to the complete weight enumerators of codes over $R$. In this section we will assume that the length $n$ of a codeword is always divisible by 2, i.e. $n = 2s$ for some $s$.
2.1. Byte representation of a codeword

Let $u = (u_1, u_2, \dots, u_n) \in C$ where $n = 2s$. We define
\[ \mathbf{u}_1 = (u_1, u_2) \in R^2, \quad \mathbf{u}_2 = (u_3, u_4) \in R^2, \quad \dots, \quad \mathbf{u}_s = (u_{2s-1}, u_{2s}) \in R^2. \]
In other words, $\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_s$ are the 2-segments of a codeword, starting from the first entry of the codeword. The $s$-tuple $(\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_s)$ is called the 2-byte representation of a codeword $u$ of length $n = 2s$.

Example. Let $R = \mathbb{F}_2$, $n = 6$ and $u = (1, 1, 0, 1, 1, 0) \in \mathbb{F}_2^6$. Then $\mathbf{u}_1 = (1, 1)$, $\mathbf{u}_2 = (0, 1)$, $\mathbf{u}_3 = (1, 0)$, and the 2-byte representation of $u$ is $((1, 1), (0, 1), (1, 0))$. In the notation adopted in [WWK], the 2-byte representation of $u$ is written as $(11, 01, 10)_2$. Here, we use the former representation, which is more convenient for our purposes.

2.2. The 2-byte weight enumerator

Let $C$ be a code of length $n = 2s$ over $R$. To each 2-tuple $(i_1, i_2) \in R^2$ we associate the weight function
\[ \eta_{i_1 i_2}(j_1, j_2) = \begin{cases} 1, & \text{if } (i_1, i_2) = (j_1, j_2), \\ 0, & \text{otherwise.} \end{cases} \tag{7} \]
The 2-byte weight enumerator of $C$ is given by
\[ W_C^{2b}(z_{00}, z_{01}, z_{10}, z_{11}) = \sum_{u \in C} z_{00}^{\mu_{00}(u)} z_{01}^{\mu_{01}(u)} z_{10}^{\mu_{10}(u)} z_{11}^{\mu_{11}(u)}, \tag{8} \]
where
\[ \mu_{i_1 i_2}(u) = \sum_{j=1}^{s} \eta_{i_1 i_2}(\mathbf{u}_j). \tag{9} \]
(Note that the number of variables in a 2-byte weight enumerator is equal to $m^2$, where $m$ denotes the size of the alphabet, so $m = 2$ in (8); the superscript $2b$ in $W$ stands for 2-byte.)

Example. Let $u = (1, 0, 0, 1, 1, 1, 1, 1, 1, 0) \in \mathbb{F}_2^{10}$. Then
\[ \mu_{i_1 i_2}(u) = \begin{cases} 2, & \text{if } (i_1, i_2) = (1, 0), \\ 1, & \text{if } (i_1, i_2) = (0, 1), \\ 2, & \text{if } (i_1, i_2) = (1, 1), \\ 0, & \text{otherwise.} \end{cases} \]
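The byte counts $\mu_{i_1 i_2}(u)$ in the last example can be computed mechanically. The following is a small sketch for binary vectors; the function name `byte_counts` is an illustrative choice.

```python
from collections import Counter


def byte_counts(u):
    """Return {(i1, i2): mu_{i1 i2}(u)} for a binary vector u of even length."""
    if len(u) % 2:
        raise ValueError("the length of u must be even")
    # Cut u into consecutive 2-segments (the 2-byte representation).
    segments = [(u[k], u[k + 1]) for k in range(0, len(u), 2)]
    counts = Counter(segments)
    return {(i1, i2): counts[(i1, i2)] for i1 in (0, 1) for i2 in (0, 1)}


mu = byte_counts((1, 0, 0, 1, 1, 1, 1, 1, 1, 0))
print(mu)
```

The codeword then contributes the monomial $z_{00}^{\mu_{00}} z_{01}^{\mu_{01}} z_{10}^{\mu_{10}} z_{11}^{\mu_{11}}$ to the sum in (8).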
Observe that the 2-byte weight enumerator of a code is a homogeneous multivariable polynomial of degree $s$, where $n = 2s$.

Theorem 2.1 ([WWK], The 2-byte weight enumerator). Let $C$ be a binary code of even length. Then the relation between the 2-byte weight enumerator of $C$ and that of its dual is given by
\[ W_{C^\perp}^{2b}(z_{00}, z_{01}, z_{10}, z_{11}) = \frac{1}{|C|} W_C^{2b}(z_{00} + z_{01} + z_{10} + z_{11},\ z_{00} - z_{01} + z_{10} - z_{11},\ z_{00} + z_{01} - z_{10} - z_{11},\ z_{00} - z_{01} - z_{10} + z_{11}). \tag{10} \]
In order to establish a MacWilliams-type identity between the complete weight enumerators of a code $C$ over $R$ and its dual, we are going to consider the Gray image $\phi(C)$ and its 2-byte weight enumerator. Using the MacWilliams identity (10) for 2-byte weight enumerators and the Gray map, we will obtain a MacWilliams-type identity for complete weight enumerators of codes over $R$. Using the definitions of the 2-byte and complete weight enumerators together with the definition of $\phi$, we have the following lemma:

Lemma 2.2. Let $C$ be a code over $R$ of length $n$, and let $\phi$ be the Gray map defined in (2). Then
\[ W_C^u(z_{00}, z_{01}, z_{10}, z_{11}) = W_{\phi(C)}^{2b}(z_{00}, z_{01}, z_{10}, z_{11}). \tag{11} \]

Another important observation is the following lemma:

Lemma 2.3. Let $C$ be a code over $R$. Then
\[ (\phi(C))^\perp = \phi(C^\perp). \tag{12} \]
Proof. Let $v = (c_1, d_1, \dots, c_n, d_n) \in (\phi(C))^\perp$, where $\phi$ is the Gray map. Then $\langle v, \phi(c) \rangle = 0$ for all $c \in C$. Let $c = (a_1 + ub_1, a_2 + ub_2, \dots, a_n + ub_n) \in C$. Thus,
\[ 0 = \langle v, \phi(c) \rangle = \sum_{i=1}^{n} a_i c_i + \sum_{i=1}^{n} b_i d_i. \]
Further, $uc \in C$ since $C$ is a submodule of $R^n$. This implies that
\[ 0 = \langle v, \phi(uc) \rangle = \sum_{i=1}^{n} (a_i d_i + b_i c_i). \]
Then,
\[ \langle \phi^{-1}(v), c \rangle = \langle (c_1 + ud_1, \dots, c_n + ud_n), (a_1 + ub_1, \dots, a_n + ub_n) \rangle = \sum_{i=1}^{n} (a_i c_i + b_i d_i) + u \sum_{i=1}^{n} (a_i d_i + b_i c_i) = 0 + u \cdot 0 = 0. \]
Thus, $v \in \phi(C^\perp)$, i.e. $(\phi(C))^\perp \subset \phi(C^\perp)$.

For the reverse inclusion, recall that by definition $(\phi(C))^\perp = \{ v \in \mathbb{F}_2^{2n} \mid \langle v, \phi(c) \rangle = 0 \text{ for all } c \in C \}$. Let $\phi(x) \in \phi(C^\perp)$, where $x = (a_1 + ub_1, a_2 + ub_2, \dots, a_n + ub_n) \in C^\perp$. Let $v = (c_1 + ud_1, c_2 + ud_2, \dots, c_n + ud_n)$ be an arbitrary element of $C$. Then,
\[ \langle \phi(x), \phi(v) \rangle = \langle (a_1, b_1, \dots, a_n, b_n), (c_1, d_1, \dots, c_n, d_n) \rangle = \sum_{i=1}^{n} a_i c_i + \sum_{i=1}^{n} b_i d_i = 0, \]
since $0 = \langle x, v \rangle = \sum_{i=1}^{n} (a_i c_i + b_i d_i) + u \sum_{i=1}^{n} (a_i d_i + b_i c_i)$ implies that $\sum_{i=1}^{n} (a_i c_i + b_i d_i) = 0$. Thus, $\phi(x) \in (\phi(C))^\perp$ for all $\phi(x) \in \phi(C^\perp)$, i.e. $\phi(C^\perp) \subset (\phi(C))^\perp$.
We would like to point out that the following theorem is known, since the additive group of $\mathbb{F}_2 + u\mathbb{F}_2$ is isomorphic to that of $\mathbb{F}_4$. Here, we give an alternative proof by connecting the complete weight enumerators of these codes to the 2-byte weight enumerators of binary codes.

Theorem 2.4. Let $C$ be a code of length $n$ over $R$. Then the relation between the complete weight enumerator of $C$ and that of its dual is given by
\[ W_{C^\perp}^u(z_{00}, z_{01}, z_{10}, z_{11}) = \frac{1}{|C|} W_C^u(z_{00} + z_{01} + z_{10} + z_{11},\ z_{00} - z_{01} + z_{10} - z_{11},\ z_{00} + z_{01} - z_{10} - z_{11},\ z_{00} - z_{01} - z_{10} + z_{11}). \tag{13} \]
Proof. Applying Theorem 2.1 to the Gray image $\phi(C)$, which is a binary code of length $2n$, we get
\[ W_{(\phi(C))^\perp}^{2b}(z_{00}, z_{01}, z_{10}, z_{11}) = \frac{1}{|\phi(C)|} W_{\phi(C)}^{2b}(z_{00} + z_{01} + z_{10} + z_{11},\ z_{00} - z_{01} + z_{10} - z_{11},\ z_{00} + z_{01} - z_{10} - z_{11},\ z_{00} - z_{01} - z_{10} + z_{11}). \]
By Lemma 2.3, we have
\[ W_{\phi(C^\perp)}^{2b}(z_{00}, z_{01}, z_{10}, z_{11}) = \frac{1}{|\phi(C)|} W_{\phi(C)}^{2b}(z_{00} + z_{01} + z_{10} + z_{11},\ z_{00} - z_{01} + z_{10} - z_{11},\ z_{00} + z_{01} - z_{10} - z_{11},\ z_{00} - z_{01} - z_{10} + z_{11}). \]
By Lemma 2.2 (equation (11)) we have $W_C^u(z_{00}, z_{01}, z_{10}, z_{11}) = W_{\phi(C)}^{2b}(z_{00}, z_{01}, z_{10}, z_{11})$, and likewise for $C^\perp$; moreover, since $\phi$ is injective, $|\phi(C)| = |C|$. Combining the results above, we get the identity (13).
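Identity (13) can be tested numerically on small codes by computing the dual by exhaustive search and evaluating both sides at a rational point. The sketch below assumes the pair encoding `(a, b)` for $a + ub$; all function names are hypothetical, and agreement at a few points is evidence rather than a proof.

```python
from fractions import Fraction
from itertools import product

R = [(0, 0), (1, 0), (0, 1), (1, 1)]  # a + u*b stored as (a, b), u^2 = 1
ZERO, ONE, U, ONE_PLUS_U = (0, 0), (1, 0), (0, 1), (1, 1)


def add(x, y):
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)


def mul(x, y):
    (a, b), (c, d) = x, y
    return ((a * c + b * d) % 2, (a * d + b * c) % 2)


def inner(v, w):
    total = ZERO
    for x, y in zip(v, w):
        total = add(total, mul(x, y))
    return total


def span(rows):
    n = len(rows[0])
    code = set()
    for coeffs in product(R, repeat=len(rows)):
        word = [ZERO] * n
        for c, row in zip(coeffs, rows):
            word = [add(w, mul(c, r)) for w, r in zip(word, row)]
        code.add(tuple(word))
    return code


def dual(code, n):
    """Dual code by exhaustive search over R^n (small n only)."""
    return {v for v in product(R, repeat=n)
            if all(inner(v, c) == ZERO for c in code)}


def cwe_value(code, z):
    """Evaluate W_C^u at the point z, a dict (i, j) -> value of z_ij."""
    total = Fraction(0)
    for word in code:
        term = Fraction(1)
        for x in word:
            term *= z[x]
        total += term
    return total


def macwilliams_rhs(code, z):
    """Right-hand side of (13) evaluated at z."""
    z00, z01, z10, z11 = z[(0, 0)], z[(0, 1)], z[(1, 0)], z[(1, 1)]
    t = {
        (0, 0): z00 + z01 + z10 + z11,
        (0, 1): z00 - z01 + z10 - z11,
        (1, 0): z00 + z01 - z10 - z11,
        (1, 1): z00 - z01 - z10 + z11,
    }
    return cwe_value(code, t) / len(code)


point = {(0, 0): Fraction(2), (0, 1): Fraction(3),
         (1, 0): Fraction(5), (1, 1): Fraction(7)}
C = span([[ONE, U, ZERO]])  # a small code that is not self-dual
print(cwe_value(dual(C, 3), point) == macwilliams_rhs(C, point))
```

Exact rational arithmetic via `Fraction` avoids any rounding question in the comparison.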
3. Ring of invariants of complete weight enumerators of codes over $R$

3.1. Some invariant theory. In this section we are going to classify the complete weight enumerators of codes over $R_1$ whose images are Type I and II codes. Similar classifications for various weight enumerators and extensive applications of invariant theory for several families of codes are given in [HCT]. First we review some results from invariant theory. Let $G$ be a finite group of linear transformations on $n$ (complex) variables $x_1, x_2, \dots, x_n$. In other words, $G$ is a multiplicative group of nonsingular complex $n \times n$ matrices. Let $g$ be the order of $G$ and let $I$ stand for the $n \times n$ identity matrix. Let $f(x) = f(x_1, \dots, x_n)$ and let $A = (a_{ij})$ be an $n \times n$ matrix over the complex numbers; then
\[ f(Ax) := f\Big(\dots, \sum_{j=1}^{n} a_{ij} x_j, \dots\Big), \]
i.e., $f(Ax)$ is the polynomial obtained by applying the transformation $A$ (regarding the matrix $A$ as a transformation) to $x = (x_1, \dots, x_n)$.

Definition 3.1. $f(x)$ is an invariant polynomial of $G$ if and only if $f(Ax) = f(x)$ for all $A \in G$.

By Definition 3.1 we see that if $f_1, f_2$ are invariants of $G$, then so are $f_1 + f_2$ and $f_1 f_2$. Hence, the invariants of $G$ form a ring, say $R(G)$. In order to characterize $R(G)$ it is sufficient to characterize the invariants that are homogeneous polynomials, since any invariant is a sum of homogeneous invariants.

Definition 3.2. The polynomials $f_1(x), \dots, f_m(x)$ are called algebraically dependent if there is a polynomial $P$ with complex coefficients, not all zero, such that $P(f_1(x), \dots, f_m(x)) \equiv 0$. Otherwise, $f_1(x), \dots, f_m(x)$ are algebraically independent.
Lemma 3.3 ([MS]). Any $n + 1$ polynomials in $n$ variables are algebraically dependent.

Theorem 3.4 ([Fl]). Let $f_1(x), \dots, f_m(x) \in \mathbb{C}[x]$. Then $f_1, f_2, \dots, f_m$ are algebraically independent if and only if their Jacobian is not equal to zero, i.e.
\[ \left| \frac{\partial f_i(x)}{\partial x_j} \right| \neq 0. \]

Theorem 3.5 ([MMS]). If $f(x)$ is any polynomial, then the average of $f(x)$ over the group $G$,
\[ h(x) = \frac{1}{g} \sum_{A \in G} f(Ax), \]
is an invariant of $G$.

Theorem 3.6 ([MMS]). The number of linearly independent invariants of $G$ of degree $\nu$ is the coefficient of $\lambda^\nu$ in the expansion of
\[ \Phi_G(\lambda) = \frac{1}{g} \sum_{A \in G} \frac{1}{|I - \lambda A|}, \]
where $|A|$ stands for the determinant of $A$. $\Phi_G(\lambda)$ is called the Molien series of $G$.

In this section we are going to determine the ring of invariants of the complete weight enumerators of self-dual codes over $R = \mathbb{F}_2 + u\mathbb{F}_2$ with $u^2 = 1$. By Lemma 2.2 and Lemma 2.3, this is equivalent to the determination of the ring of invariants of 2-byte weight enumerators of binary self-dual codes. We are going to investigate the ring of invariants of linear codes over $R$ whose images are Type I and II codes over $\mathbb{F}_2$. Recently, the rings of invariants of certain classes of codes over $\mathbb{F}_2 + u\mathbb{F}_2$ with $u^2 = 0$ were determined in [DHGS] and [DGHMS].
3.2. Ring of invariants of Type I codes. Let $C$ be a Type I-R code of length $n$. By definition, its Gray image $\phi(C)$, given by (2), is a Type I code of length $2n$. For brevity we will use $Z = (z_{00}, z_{01}, z_{10}, z_{11})$. Let
\[
\begin{aligned}
f_0(Z) &:= z_{00} + z_{11}, \\
f_1(Z) &:= z_{00}^2 + 2 z_{01} z_{10} + z_{11}^2, \\
f_2(Z) &:= z_{00}^2 + z_{11}^2 + z_{01}^2 + z_{10}^2, \\
f_3(Z) &:= z_{00}^4 + z_{11}^4 + z_{01}^4 + z_{10}^4 + 6 z_{00}^2 z_{11}^2 + 6 z_{01}^2 z_{10}^2.
\end{aligned}
\]
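Theorem 3.7. Let $C$ be a Type I code of length $n$. Then the 2-byte weight enumerator of $C$ satisfies
\[ W_C^{2b}(Z) \in \mathbb{C}[f_0(Z), f_1(Z), f_2(Z), f_3(Z)]. \]

Proof. First we identify a group, say $G$, which leaves the 2-byte weight enumerator of a Type I-R code invariant. By Theorem 2.4 we see that the 2-byte weight enumerator is invariant under
\[ M_1 := \frac{1}{2} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}. \tag{14} \]
Since $C = C^\perp$, Lemma 2.3 implies that the Gray image $\phi(C)$ is a binary self-dual code. Thus, the all-one codeword, say $\mathbf{1}$, is in $\phi(C)$. By the definition of the Gray map, $\mathbf{1+u} := (1 + u, 1 + u, \dots, 1 + u) \in C$. Since $C$ is linear, $\mathbf{1+u} + C = C$. (This is also proven as a proposition in [DHGS].) Adding $\mathbf{1}$ to $v \in \phi(C)$ complements every byte of $v$, so that $\mu_{00}(\mathbf{1} + v) = \mu_{11}(v)$, $\mu_{01}(\mathbf{1} + v) = \mu_{10}(v)$, and so on. So the following holds:
\[
\begin{aligned}
W_{\phi(C)}^{2b}(z_{00}, z_{01}, z_{10}, z_{11})
&= \sum_{v \in \phi(C)} z_{00}^{\mu_{00}(v)} z_{01}^{\mu_{01}(v)} z_{10}^{\mu_{10}(v)} z_{11}^{\mu_{11}(v)} \\
&= \sum_{\mathbf{1} + v \in \phi(C)} z_{00}^{\mu_{00}(\mathbf{1}+v)} z_{01}^{\mu_{01}(\mathbf{1}+v)} z_{10}^{\mu_{10}(\mathbf{1}+v)} z_{11}^{\mu_{11}(\mathbf{1}+v)} \\
&= \sum_{v \in \phi(C)} z_{00}^{\mu_{11}(v)} z_{01}^{\mu_{10}(v)} z_{10}^{\mu_{01}(v)} z_{11}^{\mu_{00}(v)} \\
&= W_{\phi(C)}^{2b}(z_{11}, z_{10}, z_{01}, z_{00}).
\end{aligned}
\]
In matrix form, the 2-byte weight enumerator of $C$ is left invariant under
\[ M_2 := \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}. \tag{15} \]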
Further, we consider the map $\sigma_u(v) = uv$, where $u \in R$ and $v \in C$. This map is injective since $\sigma_u^2(v) = v$ for all $v \in C$, because $u^2 = 1$. This implies that $uC = C$. Since multiplication by $u$ interchanges the entries $1$ and $u$ and fixes $0$ and $1 + u$, we clearly have
\[ W_C^{2b}(z_{00}, z_{01}, z_{10}, z_{11}) = W_C^{2b}(z_{00}, z_{10}, z_{01}, z_{11}). \]
Thus, the matrix
\[ M_3 := \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{16} \]
leaves the 2-byte weight enumerator invariant. Moreover, $\langle v, v \rangle = 0$ for all $v \in C$ since $C = C^\perp$. Since $\langle v, v \rangle \equiv \mu_{01}(v) + \mu_{10}(v) \pmod 2$ for all $v \in C$, we see that $\mu_{01}(v) + \mu_{10}(v) \equiv 0 \pmod 2$. Hence, the 2-byte weight enumerator is also invariant under the following matrix:
\[ M_4 := \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{17} \]
Thus, the 2-byte weight enumerator of a Type I code is invariant under the group $G$ generated by the matrices $M_1$, $M_2$, $M_3$, and $M_4$. $G$ has order 16. The Molien series of $G$ (computed in Magma) is
\[ \Phi_G(\lambda) = \frac{1}{(1 - \lambda)(1 - \lambda^2)^2 (1 - \lambda^4)}. \tag{18} \]
The Molien series of $G$ suggests looking for 4 free invariants of degrees 1, 2 (two of them) and 4. $f_0(Z)$ is the weight enumerator of the Type I code generated by $[1 + u]$, and $f_1(Z)$ is the weight enumerator of the Type I code generated by $[1\ \ u]$. $f_2(Z)$ is the weight enumerator of the Type I code 2K2 and $f_3(Z)$ is the weight enumerator of the Type I code K4, both given in [DHGS]. The Jacobian of the weight enumerators $f_0(Z)$, $f_1(Z)$, $f_2(Z)$, and $f_3(Z)$ is nonzero (the symbolic manipulations were carried out in MAPLE). Hence, by Theorem 3.4, $f_0(Z)$, $f_1(Z)$, $f_2(Z)$, and $f_3(Z)$ are algebraically independent. Therefore, we have the result.

Note. $R(G)$ is generated by the weight enumerators of Type I codes.
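The order of $G$ and the invariance of $f_0, \dots, f_3$ can be checked with exact rational arithmetic. The sketch below generates the group by closing the generators $M_1, \dots, M_4$ under multiplication and evaluates each polynomial at a few fixed rational points; the function names and the test points are assumptions made for illustration, and agreement at sample points is a sanity check rather than a proof of invariance.

```python
from fractions import Fraction

H = Fraction(1, 2)
M1 = ((H, H, H, H), (H, -H, H, -H), (H, H, -H, -H), (H, -H, -H, H))
M2 = ((0, 0, 0, 1), (0, 0, 1, 0), (0, 1, 0, 0), (1, 0, 0, 0))
M3 = ((1, 0, 0, 0), (0, 0, 1, 0), (0, 1, 0, 0), (0, 0, 0, 1))
M4 = ((1, 0, 0, 0), (0, -1, 0, 0), (0, 0, -1, 0), (0, 0, 0, 1))
GENERATORS = [M1, M2, M3, M4]


def matmul(A, B):
    return tuple(
        tuple(sum(Fraction(A[i][k]) * B[k][j] for k in range(4))
              for j in range(4))
        for i in range(4))


def apply(A, z):
    """Return A z, i.e. the substitution z_i -> sum_j a_ij z_j."""
    return tuple(sum(Fraction(A[i][j]) * z[j] for j in range(4))
                 for i in range(4))


def generate_group(gens):
    """Close the generators under multiplication (finite groups only)."""
    identity = tuple(tuple(Fraction(int(i == j)) for j in range(4))
                     for i in range(4))
    group, frontier = {identity}, [identity]
    while frontier:
        A = frontier.pop()
        for g in gens:
            B = matmul(A, g)
            if B not in group:
                group.add(B)
                frontier.append(B)
    return group


# Variables are ordered as Z = (z00, z01, z10, z11).
def f0(z00, z01, z10, z11):
    return z00 + z11


def f1(z00, z01, z10, z11):
    return z00**2 + 2 * z01 * z10 + z11**2


def f2(z00, z01, z10, z11):
    return z00**2 + z11**2 + z01**2 + z10**2


def f3(z00, z01, z10, z11):
    return (z00**4 + z11**4 + z01**4 + z10**4
            + 6 * z00**2 * z11**2 + 6 * z01**2 * z10**2)


POINTS = [(Fraction(1), Fraction(2), Fraction(3), Fraction(5)),
          (Fraction(1, 3), Fraction(-2), Fraction(7), Fraction(4)),
          (Fraction(0), Fraction(1), Fraction(-1), Fraction(2))]


def is_invariant(f, group, points=POINTS):
    return all(f(*apply(A, z)) == f(*z) for A in group for z in points)


G = generate_group(GENERATORS)
print(len(G), [is_invariant(f, G) for f in (f0, f1, f2, f3)])
```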
3.3. Ring of invariants of Type II codes.

Theorem 3.8. Let $C$ be a Type II-R code of length $n$. Then the 2-byte weight enumerator of $C$ satisfies
\[
\begin{aligned}
W_C^{2b}(Z) \in{} & \mathbb{C}[f_3(Z), g_1(Z), g_2(Z), g_3(Z)] \oplus g_4(Z)\, \mathbb{C}[f_3(Z), g_1(Z), g_2(Z), g_3(Z)] \\
& \oplus g_5(Z)\, \mathbb{C}[f_3(Z), g_1(Z), g_2(Z), g_3(Z)] \oplus W_{C_4}^u(Z)\, \mathbb{C}[f_3(Z), g_1(Z), g_2(Z), g_3(Z)],
\end{aligned}
\]
where $f_3$ is given in Section 3.2, and $W_{C_4}^u(Z)$, given in (6), is the weight enumerator of a Type II code. Further,
\[
\begin{aligned}
g_1(Z) &:= z_{00}^4 + 2 z_{11}^2 z_{00}^2 + z_{11}^4, \\
g_2(Z) &:= z_{00}^4 + z_{11}^4 + z_{01}^4 + z_{10}^4, \\
g_3(Z) &:= z_{00}^4 + z_{11}^4 + 4 z_{00} z_{11} z_{01}^2 + 4 z_{00} z_{11} z_{10}^2 + 2 z_{00}^2 z_{11}^2 + 4 z_{01}^2 z_{10}^2, \\
g_4(Z) &:= z_{00}^4 + z_{11}^4 + 2 z_{01}^2 z_{10}^2, \\
g_5(Z) &:= z_{00}^4 + z_{11}^4 + z_{01}^4 + z_{10}^4 + 2 z_{01}^2 z_{10}^2 + 2 z_{00}^2 z_{11}^2,
\end{aligned}
\]
are respectively the weight enumerators of the Type II codes with the following generator matrices:
\[
\begin{pmatrix} 1+u & 0 & 1+u & 0 \\ 0 & 1+u & 0 & 1+u \end{pmatrix}, \quad
\begin{pmatrix} 1 & 1 & 1 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 0 & 1 & 1+u \\ 0 & 1 & 1+u & 1 \end{pmatrix},
\]
\[
\begin{pmatrix} 1 & 1 & u & u \\ u & u & 1 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 1+u & 1+u & 0 & 0 \\ 1 & 1 & u & u \end{pmatrix}.
\]

Proof. It is clear that the 2-byte weight enumerator of $C$ is invariant under the group $G$, which also leaves the 2-byte weight enumerators of Type I-R codes invariant. Further, by definition, the Gray image $\phi(C)$ is a Type II code. Thus the length of $\phi(C)$, which is $2n$, is divisible by 8 [MS]. This implies that $n$ is divisible by 4. In other words,
\[ \mu_{00}(v) + \mu_{01}(v) + \mu_{10}(v) + \mu_{11}(v) = n \equiv 0 \pmod 4 \quad \text{for all } v \in C. \]
Thus, the 2-byte weight enumerator of $C$ is invariant under the following matrices:
\[
N := \begin{pmatrix} i & 0 & 0 & 0 \\ 0 & i & 0 & 0 \\ 0 & 0 & i & 0 \\ 0 & 0 & 0 & i \end{pmatrix}, \qquad
N^- := \begin{pmatrix} -i & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \\ 0 & 0 & -i & 0 \\ 0 & 0 & 0 & -i \end{pmatrix}.
\]
Thus, the 2-byte weight enumerator of $C$ is invariant under $G$ and the matrices $N$, $N^-$, which together generate a group, say $H$, of order 64. The Molien series of $H$ is
\[ \Phi_H(\lambda) = \frac{3\lambda^4 + 1}{(1 - \lambda^4)^4}. \tag{19} \]
The Molien series suggests searching for 4 free invariants of degree 4 and 3 transient invariants of degree 4. In the statement of the theorem all polynomials are weight enumerators of Type II codes, hence they are invariant. The Jacobian of the weight enumerators $f_3(Z)$, $g_1(Z)$, $g_2(Z)$ and $g_3(Z)$ is nonzero, hence they are algebraically independent.

Note. $R(H)$ is generated by the weight enumerators of Type II codes.

Acknowledgements. We would like to thank the referee for his/her helpful and valuable remarks, especially for the remarks on Theorem 3.7.
References

[DHGS] Steven T. Dougherty, M. Harada, P. Gaborit, and P. Solé, Type II codes over F2 + uF2, IEEE Trans. Inform. Theory 45 (1999), 32–45.
[DGHMS] Steven T. Dougherty, P. Gaborit, M. Harada, A. Munemasa, and P. Solé, Type IV self-dual codes over rings, IEEE Trans. Inform. Theory 45 (1999), 2345–2360.
[Fl] Leopold Flatto, Invariants of finite reflection groups, Enseign. Math. 24 (1978), 237–292.
[GH] T. Aaron Gulliver and Masaaki Harada, Codes over F3 + uF3 and improvements to the bounds on ternary linear codes, Des. Codes Cryptogr., submitted.
[MS] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error Correcting Codes, North-Holland Math. Library 16, North-Holland, Amsterdam 1996.
[MMS] F. J. MacWilliams, C. J. Mallows, and N. J. A. Sloane, Generalizations of Gleason's theorem on weight-enumerators of self-dual codes, IEEE Trans. Inform. Theory 18 (1972), 794–805.
[HCT] Vera Pless and Cary Huffman (eds.), Handbook of Coding Theory, Elsevier, Amsterdam 1998.
[Si] Irfan Siap, Generalized r-fold weight enumerators and new linear codes with better minimum distances, Ph.D. Thesis, The Ohio State University, 1999.
[WWK] Tadashi Wadayama, Koichiro Wakasugi, and Masao Kasahara, On weight distribution for Euclidean image of binary linear codes, Mem. Fac. Engrg. Design Kyoto Inst. Technol. Ser. Sci. Tech. 45 (1997), 43–54.

I. Siap
Adiyaman Egitim Fakultesi, Gaziantep University, Adiyaman, Turkey
On single-deletion-correcting codes

N. J. A. Sloane
Abstract. This paper gives a brief survey of binary single-deletion-correcting codes. The Varshamov–Tenengolts codes appear to be optimal, but many interesting unsolved problems remain. The connections with shift-register sequences also remain somewhat mysterious. 2000 Mathematics Subject Classification: primary 94B60; secondary 94A55.
1. Introduction

The possibility of packet loss on internet transmissions has renewed interest in deletion-correcting codes. (Of course there are many other applications of such codes, including magnetic recording, although in that case there are usually additional conditions that must be satisfied.) This paper considers the very simplest family of such codes, binary block codes capable of correcting single deletions. Even for these codes there remain several apparently unsolved problems.

It is surprising, but these codes do not appear to be surveyed in any of the usual references ([MS77], [PH98], etc.). This paper is a first attempt at such a survey. It will be posted on the author's home page [SL01] and will be updated as appropriate. It is hoped that the problems mentioned here will either soon be solved or will turn out to be already solved. Proofs are given of a number of results, either because the new proofs are simpler or because the original sources are hard to locate.¹

¹ and when located are sometimes poorly translated or badly photocopied!

Definition 1.1. For a vector $u \in \mathbb{F}_q^n$, let $D_e(u)$ denote the set of $e$-th order descendants, i.e. the set of vectors $v \in \mathbb{F}_q^{n-e}$ that are obtained if $e$ components are deleted from $u$. A subset $C \subseteq \mathbb{F}_q^n$ is said to be an $e$-deletion-correcting code if $D_e(u) \cap D_e(v) = \emptyset$ for all $u, v \in C$, $u \neq v$.

Our problem is to find the largest such code. In this paper we mostly consider the simplest case, $q = 2$ and $e = 1$.

The deletion distance $dd(u, v)$ between vectors $u, v \in \mathbb{F}_q^n$ is defined to be one-half of the smallest number of deletions and insertions needed to change $u$ to $v$. Then $C$ is $e$-deletion-correcting if and only if $dd(u, v) \ge e + 1$ for $u, v \in C$, $u \neq v$. (For $dd(u, v) \le e$ if and only if there is a vector $x$ that can be reached from $u$ by at most $e$ deletions and also from $v$ by at most $e$ deletions, and then $C$ cannot correct $e$ deletions.)

Consider the graph $G_n$ having a node for every vector $u \in \mathbb{F}_q^n$, with an edge joining the nodes corresponding to $u, v \in \mathbb{F}_q^n$, $u \neq v$, if and only if $v$ can be obtained from $u$ by a single deletion and insertion, i.e. if and only if $D_1(u) \cap D_1(v) \neq \emptyset$. The deletion distance $dd(u, v)$ is the length of the shortest path from $u$ to $v$ (this shows that $dd$ is indeed a metric). In particular, a single-deletion-correcting code corresponds to an independent set in $G_n$.

One can now attempt to calculate the sizes of the largest independent sets by computer. In the binary case we find that the largest single-deletion-correcting codes of lengths 1, 2, ..., 8 have sizes
\[ 1, 2, 2, 4, 6, 10, 16, \ge 30. \tag{1} \]
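For very short lengths the independent-set computation can be repeated with a naive exact search. The following sketch builds the graph $G_n$ for $q = 2$ and finds a maximum independent set by simple branching; the function names are illustrative, and this approach is only intended for small $n$ (it is not the method used for the entries in (1)).

```python
from itertools import product


def descendants(u):
    """D_1(u): all strings obtained by deleting one symbol from u."""
    return {u[:i] + u[i + 1:] for i in range(len(u))}


def conflict_graph(n):
    """Adjacency sets of G_n: u ~ v iff D_1(u) and D_1(v) intersect."""
    words = ["".join(bits) for bits in product("01", repeat=n)]
    desc = {w: descendants(w) for w in words}
    return {w: {v for v in words if v != w and desc[w] & desc[v]}
            for w in words}


def max_independent_set_size(adj):
    """Exact maximum independent set size by branching on a vertex."""
    def solve(candidates):
        if not candidates:
            return 0
        # Pick a vertex of maximum degree in the remaining graph.
        v = max(candidates, key=lambda x: len(adj[x] & candidates))
        neighbours = adj[v] & candidates
        if not neighbours:
            # No edges remain, so every remaining vertex can be taken.
            return len(candidates)
        without_v = solve(candidates - {v})
        with_v = 1 + solve(candidates - {v} - neighbours)
        return max(without_v, with_v)

    return solve(set(adj))


def largest_code_size(n):
    return max_independent_set_size(conflict_graph(n))


print([largest_code_size(n) for n in range(1, 6)])
```

The running time grows quickly with $n$, which is consistent with the remark that length 8 required more specialized tools.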
The last entry in (1) was kindly computed by my colleague David Johnson. Unfortunately $G_8$ is too large for present computers and 30 is at present only a lower bound on the size of a maximal independent set.² However, (1) turns out to be a useful hint. When one looks up this sequence in [EIS], one finds a unique matching sequence, number A16, whose initial terms $N_1, N_2, N_3, \dots$ are
\[ 1, 1, 2, 2, 4, 6, 10, 16, 30, 52, 94, 172, 316, 586, \dots \tag{2} \]
and whose $n$th term is given by
\[ N_n = \frac{1}{2n} \sum_{\text{odd } d \mid n} \phi(d)\, 2^{n/d}, \quad n \ge 1, \tag{3} \]
where the sum is over all odd divisors $d$ of $n$ and $\phi$ is the Euler totient function (sequence A10 in [EIS]). The references cited for sequence A16 indicate that it has arisen in connection with the enumeration of shift-register sequences [Go67] and tournaments [Br80]. However there was (at that time) no reference to indicate that this sequence has any connection with codes, nor was there any apparent connection between the shift-register sequences and deletion-correcting codes.

More conventional search methods, in particular, consulting some well-known papers of Levenshtein [Lev65], [Lev65a] on codes for correcting deletions, turned up many other relevant references. Some of these will be discussed further in Section 6.

The most interesting codes are those of Varshamov and Tenengolts [VT65]. In [VT65] they present a family of codes depending on a certain parameter $a$. When $a$ is taken to be 0, these codes have size $N_{n+1}$ (see (3)) and thus match (1). These codes are the subject of Section 2.

² Postscript: David Applegate has since used CPLEX's integer programming subroutines (which combine ordinary linear programming with branch-and-bound) to confirm that the largest single-deletion-correcting code of length 8 does indeed have size 30.
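Formula (3) is easy to evaluate directly. The sketch below uses a naive totient computed by counting, which is adequate for the small arguments involved; the function names are illustrative.

```python
from math import gcd


def totient(d):
    """Euler's totient function, computed by direct counting."""
    return sum(1 for k in range(1, d + 1) if gcd(k, d) == 1)


def N(n):
    """N_n = (1 / 2n) * sum over odd divisors d of n of phi(d) * 2^(n/d)."""
    total = sum(totient(d) * 2 ** (n // d)
                for d in range(1, n + 1, 2) if n % d == 0)
    # The sum is always divisible by 2n, so integer division is exact.
    assert total % (2 * n) == 0
    return total // (2 * n)


print([N(n) for n in range(1, 15)])
```

The printed list can be compared term by term with the initial terms listed in (2).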
Sections 3 and 4 will discuss the connection with shift-registers and tournaments, and Section 5 contains some general remarks about the number of descendants of a vector. The final section, Section 6, gives a brief discussion of other papers on deletion-correcting and related codes.
2. The Varshamov–Tenengolts codes

Definition 2.1. For $0 \le a \le n$, the Varshamov–Tenengolts code $VT_a(n)$ consists of all binary vectors $(x_1, \dots, x_n)$ satisfying
\[ \sum_{i=1}^{n} i x_i \equiv a \pmod{n + 1}, \tag{4} \]
where the sum is evaluated as an ordinary rational integer.

As will appear, the codes with $a = 0$ contain the most codewords. The first few such codes are
\[
\begin{aligned}
VT_0(1) &= \{0\} \\
VT_0(2) &= \{00, 11\} \\
VT_0(3) &= \{000, 101\} \\
VT_0(4) &= \{0000, 1001, 0110, 1111\} \\
VT_0(5) &= \{00000, 10001, 01010, 11011, 11100, 00111\},
\end{aligned}
\tag{5}
\]
of sizes 1, 2, 2, 4, 6, matching (1) and (2). These codes were introduced in [VT65] for correcting errors on a Z-channel (or asymmetric channel). Similar constructions have been used in [BR82] and also in [GS80] and [Kl81] to construct constant weight codes. Levenshtein [Lev65], [Lev65a] observed that the Varshamov–Tenengolts codes could be used for correcting single deletions, proving this by giving the following elegant decoding algorithm.

Decoding algorithm

• Suppose a codeword $x = (x_1, \dots, x_n) \in VT_a(n)$ is transmitted, the symbol $s$ in position $p$ is deleted, and $x' = (x'_1, \dots, x'_{n-1})$ is received. Let there be $L_0$ 0's and $L_1$ 1's to the left of $s$, and $R_0$ 0's and $R_1$ 1's to the right of $s$ (with $p = 1 + L_0 + L_1$).

• We compute the weight $w = L_1 + R_1$ of $x'$ and the new checksum $\sum_{i=1}^{n-1} i x'_i$. If $s = 0$ the new checksum is $R_1$ ($\le w$) less than it was before, and if $s = 1$ it is $p + R_1 = 1 + L_0 + L_1 + R_1 = 1 + w + L_0$ ($> w$) less than it was before. (These numbers are less than $n + 1$ so there is no ambiguity.)

• So if the deficiency in the checksum is less than or equal to $w$ we know that a 0 was deleted, and we restore it just to the left of the rightmost $R_1$ 1's. Otherwise a 1 was deleted and we restore it just to the right of the leftmost $L_0$ 0's.
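The three steps above can be sketched as follows. This is an illustrative implementation, not code from the paper: words are tuples of 0's and 1's, the function names are hypothetical, and the decoder assumes that its input really is a codeword of $VT_a(n)$ with exactly one symbol deleted.

```python
from itertools import product


def vt_checksum(x):
    """Checksum sum_i i * x_i with positions numbered from 1."""
    return sum(i * bit for i, bit in enumerate(x, start=1))


def vt_code(n, a=0):
    """All binary n-tuples whose checksum is congruent to a mod n + 1."""
    return [x for x in product((0, 1), repeat=n)
            if vt_checksum(x) % (n + 1) == a % (n + 1)]


def vt_decode(received, n, a=0):
    """Recover the codeword of VT_a(n) from a word with one deletion."""
    if len(received) != n - 1:
        raise ValueError("expected a word of length n - 1")
    w = sum(received)
    # Deficiency of the checksum; it lies in the range 0..n.
    deficiency = (a - vt_checksum(received)) % (n + 1)
    y = list(received)
    if deficiency <= w:
        # A 0 was deleted: put it back with `deficiency` 1's to its right.
        ones_to_right = deficiency
        pos = len(y)
        while ones_to_right > 0:
            pos -= 1
            if y[pos] == 1:
                ones_to_right -= 1
        y.insert(pos, 0)
    else:
        # A 1 was deleted: put it back with L0 0's to its left.
        zeros_to_left = deficiency - w - 1
        pos = 0
        while zeros_to_left > 0:
            if y[pos] == 0:
                zeros_to_left -= 1
            pos += 1
        y.insert(pos, 1)
    return tuple(y)


codeword = (1, 1, 0, 1, 1)            # checksum 12 = 0 mod 6
received = codeword[:2] + codeword[3:]  # delete the 0 in position 3
print(vt_decode(received, 5))
```

The restored symbol may not land in exactly the deleted position, but it lands within the same run of equal symbols, so the resulting word is the transmitted codeword.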
n\a    0    1    2    3    4    5    6    7    8
1      1    1
2      2    1    1
3      2    2    2    2
4      4    3    3    3    3
5      6    5    5    6    5    5
6     10    9    9    9    9    9    9
7     16   16   16   16   16   16   16   16
8     30   28   28   29   28   28   29   28   28

Table 1. Number of codewords in Varshamov–Tenengolts code $VT_a(n)$.
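The entries of Table 1 can be recomputed directly from Definition 2.1. The sketch below (the helper name `vt_sizes` is ours) counts subsets of $\{1, \ldots, n\}$ according to their sum modulo $n + 1$, which is the same as counting codewords of each $VT_a(n)$.

```python
def vt_sizes(n):
    """Return [|VT_0(n)|, |VT_1(n)|, ..., |VT_n(n)|]."""
    m = n + 1
    counts = [0] * m
    counts[0] = 1                      # the empty subset has sum 0
    for i in range(1, n + 1):          # decide whether position i carries a 1
        updated = counts[:]
        for residue, c in enumerate(counts):
            updated[(residue + i) % m] += c
        counts = updated
    return counts

for n in range(1, 9):
    print(n, vt_sizes(n))
```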
The sizes $|VT_a(n)|$ of the first few codes are shown in Table 1. (This array forms sequence A53633 in [EIS].) These numbers were studied by Varshamov [Var65] and Ginzburg [Gi67], but the following simple formula appears to be new.

Theorem 2.2.

$$|VT_a(n)| = \frac{1}{2(n+1)} \sum_{\substack{d \mid n+1 \\ d \text{ odd}}} \phi(d)\, \frac{\mu\!\left(\frac{d}{(d,a)}\right)}{\phi\!\left(\frac{d}{(d,a)}\right)}\, 2^{(n+1)/d} , \tag{6}$$
where $\mu(n)$ is the Möbius function (A8683 in [EIS]), and $(d, a) = \gcd(d, a)$.

Proof. Write $w_a(n) = |VT_a(n)|$. We will calculate $w_a(n-1)$, assuming throughout that $n \ge 1$. It follows from the definition of these codes that the generating function

$$f(z) = \sum_{a=0}^{n-1} w_a(n-1)\, z^a$$

is equal to

$$\prod_{k=1}^{n-1} (1 + z^k) \bmod z^n - 1 .$$

Let $\xi = e^{2\pi i/n}$. Then

$$f(\xi^j) = \sum_{a=0}^{n-1} w_a(n-1)\, \xi^{ja} = \prod_{k=1}^{n-1} (1 + \xi^{jk}), \quad j = 0, \ldots, n-1 .$$
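We solve this by taking an inverse discrete Fourier transform (cf. [Ko88], Chap. 97) to obtain

$$w_a(n-1) = \frac{1}{n} \sum_{j=0}^{n-1} f(\xi^j)\, \xi^{-ja} .$$

Since

$$\prod_{k=0}^{n-1} (z - \xi^k) = z^n - 1 ,$$

we can calculate $f(\xi^j)$ explicitly. An elementary calculation gives

$$f(\xi^j) = \begin{cases} 2^{g-1} & \text{if } d = n/g \text{ is odd,} \\ 0 & \text{if } d = n/g \text{ is even,} \end{cases}$$

where $g = \gcd(n, j)$. Therefore

$$w_a(n-1) = \frac{1}{2n} \sum_{\substack{d \mid n \\ d \text{ odd}}} 2^{n/d} \sum_{\substack{j=1 \\ \gcd(n,j)=n/d}}^{n} \xi^{-ja}$$

which becomes, writing $j = kn/d$,

$$= \frac{1}{2n} \sum_{\substack{d \mid n \\ d \text{ odd}}} 2^{n/d} \sum_{\substack{k=1 \\ (k,d)=1}}^{d} e^{-2\pi i k a/d} .$$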
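The innermost sum is a Ramanujan sum $c_d(a)$ ([Ap76], p. 160), which simplifies to

$$c_d(a) = \phi(d)\, \frac{\mu\!\left(\frac{d}{(d,a)}\right)}{\phi\!\left(\frac{d}{(d,a)}\right)}$$

([Ap76], p. 164).

Corollary 2.3. (i)

$$|VT_0(n)| = \frac{1}{2(n+1)} \sum_{\substack{d \mid n+1 \\ d \text{ odd}}} \phi(d)\, 2^{(n+1)/d} , \tag{7}$$

(ii)

$$|VT_1(n)| = \frac{1}{2(n+1)} \sum_{\substack{d \mid n+1 \\ d \text{ odd}}} \mu(d)\, 2^{(n+1)/d} , \tag{8}$$

(iii) For any $a$,

$$|VT_0(n)| \ge |VT_a(n)| \ge |VT_1(n)| . \tag{9}$$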
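Formula (6) and the inequalities (9) are easy to test numerically. The following sketch evaluates (6) with exact rational arithmetic and compares it with a brute-force count. The helper names (`phi`, `mobius`, `vt_size_formula`, `vt_sizes_bruteforce`) are ours, and the number-theoretic functions are implemented naively, which is adequate only for small arguments.

```python
from fractions import Fraction
from itertools import product
from math import gcd

def phi(m):
    """Euler's totient function, by direct counting."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def mobius(m):
    """Moebius function, by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0               # m has a square factor
            result = -result
        p += 1
    return -result if m > 1 else result

def vt_size_formula(n, a):
    """|VT_a(n)| according to formula (6)."""
    total = Fraction(0)
    for d in range(1, n + 2, 2):       # odd d only
        if (n + 1) % d == 0:
            e = d // gcd(d, a)         # gcd(d, 0) = d, so e = 1 when a = 0
            total += Fraction(phi(d) * mobius(e), phi(e)) * 2 ** ((n + 1) // d)
    size = total / (2 * (n + 1))
    assert size.denominator == 1
    return int(size)

def vt_sizes_bruteforce(n):
    """[|VT_0(n)|, ..., |VT_n(n)|] by enumerating all binary vectors."""
    counts = [0] * (n + 1)
    for x in product((0, 1), repeat=n):
        counts[sum(i * b for i, b in enumerate(x, start=1)) % (n + 1)] += 1
    return counts

for n in range(1, 9):
    print(n, [vt_size_formula(n, a) for a in range(n + 1)])
```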
Remark 2.4. (i) and the left-hand inequality in (iii) are due to Varshamov [Var65], and (ii) and the right-hand inequality in (iii) to Ginzburg [Gi67].

Proof. (i) and (ii) follow immediately from Theorem 2.2, as does the left-hand side of (iii) using $\mu(k) \le \phi(k)$ for all $k$. To establish the right-hand side of (iii), let $p$ be the smallest odd prime dividing both $n + 1$ and $a$ (if no such prime exists then $|VT_a(n)| = |VT_1(n)|$). The terms in the expressions for $|VT_a(n)|$ and $|VT_1(n)|$ agree for $d < p$, and at $d = p$ the term in $|VT_a(n)|$ exceeds that in $|VT_1(n)|$ by $p\,2^{(n+1)/p}$. It is easy to check that the remaining terms can never make the sum in $|VT_1(n)|$ catch up with the sum in $|VT_a(n)|$.
Optimality

It is more difficult to obtain upper bounds for deletion-correcting codes than for conventional error-correcting codes, since the disjoint balls $D_e(u)$ associated with the codewords (see Section 1) do not all have the same size. Furthermore the metric space $(\mathbb{F}_2^n, dd)$ is not an association scheme and so there is no obvious linear programming bound.

The size of $D_1(u)$ is easily seen to be equal to $r(u)$, the number of runs in $u$. Furthermore the number of vectors in $\mathbb{F}_2^n$ with $r$ runs is $2\binom{n-1}{r-1}$. (We will discuss $|D_e(u)|$ further in Section 5.)

Let $A(n, e)$ denote the size of the largest $e$-deletion-correcting binary code of length $n$, and call a code $C$ optimal if $|C| = A(n, e)$. The values of $A(n, 1)$ for $n \le 9$ were given in Section 1, and show that $VT_0(n)$ is optimal for $n \le 9$. For large $n$, the codes $VT_0(n)$ are certainly close to being optimal, since on the one hand we have

$$|VT_0(n)| \ge \frac{2^n}{n+1} , \tag{10}$$

from (9), and on the other hand we have the following result of Levenshtein:

Theorem 2.5 ([Lev65]).

$$A(n, 1) \sim \frac{2^n}{n} , \quad \text{as } n \to \infty .$$

Proof. (10) gives a lower bound. Let $C$ be an optimal code. Following Levenshtein, let $C_0$ denote the subset of $C$ consisting of the vectors $u \in C$ with

$$\frac{n}{2} - \sqrt{n \log n} \le r(u) \le \frac{n}{2} + \sqrt{n \log n}$$

and let $C_1 = C \setminus C_0$. Since the sets $D_1(u)$, $u \in C$, must be disjoint,

$$|C_0| \le \frac{2^{n-1}}{\frac{n}{2} - \sqrt{n \log n}} .$$

Furthermore,

$$|C_1| \le 2 \sum_{r=1}^{\frac{n}{2} - \sqrt{n \log n}} 2\binom{n-1}{r-1} ,$$

which is much smaller than $2^n/n$.
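The two counting facts used in this subsection, that $|D_1(u)| = r(u)$ and that exactly $2\binom{n-1}{r-1}$ vectors of length $n$ have $r$ runs, can be confirmed by exhaustive enumeration for small $n$. The sketch below uses our own helper names (`runs`, `single_deletions`, `run_census`).

```python
from collections import Counter
from itertools import product
from math import comb

def runs(u):
    """Number of runs r(u) of a non-empty binary tuple u."""
    return 1 + sum(u[i] != u[i + 1] for i in range(len(u) - 1))

def single_deletions(u):
    """The ball D_1(u): all vectors obtained from u by deleting one symbol."""
    return {u[:i] + u[i + 1:] for i in range(len(u))}

def run_census(n):
    """Map r -> number of binary vectors of length n with exactly r runs."""
    return Counter(runs(u) for u in product((0, 1), repeat=n))

n = 8
assert all(len(single_deletions(u)) == runs(u) for u in product((0, 1), repeat=n))
for r, count in sorted(run_census(n).items()):
    print(r, count, 2 * comb(n - 1, r - 1))
```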
In a later paper, Levenshtein [Lev92] defines a code $C$ to be perfect if the balls $D_e(u)$, $u \in C$, partition the set $\mathbb{F}_2^{n-e}$. In [Lev92] he proves the remarkable fact that all the codes $VT_0(n), VT_1(n), VT_2(n), \ldots$ are perfect single-deletion-correcting codes. The argument, not reproduced here, is essentially just a refinement of the decoding algorithm for these codes given above.

It is initially surprising that perfect codes of the same length can have different numbers of codewords, but this is explained by the fact that the balls $D_1(u)$ have different sizes. In view of this and the result in (9), it is tempting to make the following conjecture.

Conjecture 2.6. The codes $VT_0(n)$ are optimal for all $n$.

This is true for $n \le 9$, as already mentioned, but for larger $n$ it is possible that other, larger, perfect codes may exist, or even that larger, optimal but non-perfect codes may exist. Indeed, consider the code $\{000, 111\}$. For this code, $\sum_{u \in C} |D_1(u)| = 1 + 1 = 2 < 4$, so this is optimal but not perfect. For length 4, $\{0000, 0011, 1100, 1111\}$ contains as many codewords as $VT_0(4)$ (compare (5)), and again is optimal but not perfect. At length 6 it is possible to replace two codewords of $VT_0(6)$ by two other vectors without affecting its ability to correct single deletions: 110100 and 001011 can be replaced by 111000 and 000111. The former pair cover eight vectors of length 5, but the latter only cover four vectors of length 5, leaving four vectors uncovered. This suggests the possibility that in some larger code $VT_0(n)$ it may be possible to replace $k$ vectors by $k + 1$ vectors, which would prove that these codes are not optimal. In view of these remarks, Conjecture 2.6 does not seem especially compelling!
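The length-6 replacement described above can be checked mechanically. In the following sketch (all function names are ours, and vectors are written as strings of 0's and 1's) a code is taken to correct single deletions exactly when the balls $D_1(u)$ of its codewords are pairwise disjoint.

```python
from itertools import combinations, product

def single_deletions(u):
    """The ball D_1(u) of a binary string u."""
    return {u[:i] + u[i + 1:] for i in range(len(u))}

def vt0(n):
    """The code VT_0(n) as a set of strings."""
    words = ("".join(bits) for bits in product("01", repeat=n))
    return {w for w in words
            if sum(i for i, b in enumerate(w, start=1) if b == "1") % (n + 1) == 0}

def corrects_single_deletions(code):
    """True if the balls D_1(u), u in code, are pairwise disjoint."""
    return all(single_deletions(u).isdisjoint(single_deletions(v))
               for u, v in combinations(sorted(code), 2))

def covered(words):
    """Union of the balls D_1(w) over the given words."""
    return set().union(*(single_deletions(w) for w in words))

original = vt0(6)
removed = {"110100", "001011"}
added = {"111000", "000111"}
modified = (original - removed) | added

print(corrects_single_deletions(modified))
print(len(covered(removed)), len(covered(added)))
```

The same `corrects_single_deletions` test applies to the small examples $\{000, 111\}$ and $\{0000, 0011, 1100, 1111\}$ mentioned in the text.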
Linearity

As can be seen from (5), the codes $VT_0(n)$ are linear for $n \le 4$. They are never again linear, since, for $n \ge 5$, $VT_0(n)$ contains the vectors 1 0 0 0 . . . 0 0 1 and 1 1 0 0 . . . 1 0 0 but not their sum. In particular, even though $|VT_0(7)| = 16$, this code is not linear. One might wonder if it is possible to find a linear code that will do as well, but a computer search has shown that no such code exists.
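For small lengths the linearity claim can be tested directly: a binary code is linear exactly when it is closed under componentwise addition modulo 2. A minimal sketch, with our own helper names `vt0` and `is_linear`:

```python
from itertools import product

def vt0(n):
    """The code VT_0(n) as a set of 0/1 tuples."""
    return {x for x in product((0, 1), repeat=n)
            if sum(i * b for i, b in enumerate(x, start=1)) % (n + 1) == 0}

def is_linear(code):
    """True if the code is closed under addition over F_2."""
    words = set(code)
    return all(tuple(a ^ b for a, b in zip(u, v)) in words
               for u in words for v in words)

for n in range(1, 9):
    print(n, len(vt0(n)), is_linear(vt0(n)))
```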
On the other hand, by adapting a construction of Tenengolts [Ten76], one can modify the Varshamov–Tenengolts construction to obtain linear codes, with only a small increase in the length of the code.

Definition 2.7. Given $k \ge 1$, let

$$n = k + \left\lceil \sqrt{2k + 9/4} + 1/2 \right\rceil .$$

The linear single-deletion-correcting code $VT_0(n)$ has dimension $k$ and consists of all vectors $(x_1, \ldots, x_n) \in \mathbb{F}_2^n$, where $x_1, \ldots, x_k$ are information symbols and the $c = n - k$ check symbols $x_{k+1}, \ldots, x_n$ are chosen so that $\sum_{i=1}^{n} i x_i \equiv 0 \pmod{n+1}$.

The construction works because $c$ is just large enough so that $\binom{c+1}{2} \ge n + 1$, and so the sums $\sum_{i=k+1}^{n} i x_i$ cover $n + 1$ consecutive values modulo $n + 1$. We omit the details.

The number of check symbols in these codes is of the order of $\sqrt{2n}$, compared with $O(\log n)$ for the $VT_0(n)$ codes. So we end this section with a final question: What are the optimal linear single-deletion-correcting codes?
3. Shift register sequences

As mentioned in Section 1, the entry for sequence A16 in [EIS] indicates that these numbers also arise in the enumeration of shift register sequences [Go67]. We will show here that indeed this is the same sequence. But whether this is anything more than a coincidence remains an open question. Of course there are well-known connections between shift-register sequences and conventional error-correcting codes (cf. [MS77], Chapter 7), so there should be a deeper explanation.

The context in which sequence A16 appears in Golomb's book [Go67] is the enumeration of the (infinite) output sequences from certain types of $n$-stage binary shift registers. We consider four kinds of shift registers: the pure cycling register (or PCR), as illustrated in Fig. 1, the complemented cycling register (or CCR), the pure summing register (or PSR) and the complemented summing register (or CSR). If the shift register has $n$ cells, initially containing $x_1, x_2, \ldots, x_n$ ($x_i = 0$ or 1), then $x_1$ is appended to the output stream, symbols $x_2, \ldots, x_n$ move to the left, and the symbol

(PCR) $x_1$
(CCR) $1 + x_1$
(PSR) $x_1 + x_2 + \cdots + x_n$ or
(CSR) $1 + x_1 + x_2 + \cdots + x_n$

is fed back to the right-most cell. The problem is to determine the numbers of different possible output sequences from these registers, which we denote by $Z(n)$, $Z^*(n)$, $S(n)$ and $S^*(n)$, respectively.
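The four registers are simple to simulate. In the sketch below (the names `FEEDBACK` and `count_output_sequences` are ours) two output sequences are regarded as the same if they come from states on the same cycle of the register, so the number of output sequences is taken to be the number of cycles of the state-transition map. This relies on the fact that for all four feedback rules the map is a bijection on the $2^n$ states.

```python
from itertools import product

# Feedback rules for the four register types, acting on the state (x_1, ..., x_n).
FEEDBACK = {
    "PCR": lambda x: x[0],
    "CCR": lambda x: 1 ^ x[0],
    "PSR": lambda x: sum(x) % 2,
    "CSR": lambda x: 1 ^ (sum(x) % 2),
}

def count_output_sequences(kind, n):
    """Number of cycles of the n-stage register of the given kind."""
    feedback = FEEDBACK[kind]
    seen = set()
    cycles = 0
    for start in product((0, 1), repeat=n):
        if start in seen:
            continue
        cycles += 1
        state = start
        while state not in seen:       # walk once around the cycle through `start`
            seen.add(state)
            state = state[1:] + (feedback(state),)
    return cycles

for kind in ("PCR", "CCR", "PSR", "CSR"):
    print(kind, [count_output_sequences(kind, n) for n in range(1, 9)])
```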
[Figure 1. An $n$-stage pure cycling register, with cells $x_1, x_2, x_3, \ldots, x_n$.]
For example $S^*(5) = 6$, corresponding to the sequences

... 000001000001 ...
... 000111000111 ...
... 001011001011 ...
... 010011010011 ...
... 010101010101 ...
... 011111011111 ... ,

all having period 6 (or a divisor of 6). Table 2, based on [Go67, p. 172], shows the first few values of these functions, together with the corresponding sequence numbers from [EIS].
n            PCR Z(n)    CCR Z*(n)    PSR S(n)    CSR S*(n)
1            2           1            2           1
2            3           1            2           2
3            4           2            4           2
4            6           2            4           4
5            8           4            8           6
6            14          6            10          10
7            20          10           20          16
8            36          16           30          30
9            60          30           56          52
10           108         52           94          94
···          ···         ···          ···         ···
Sequence:    A31         A16          A13         A16

Table 2. Number of output sequences from $n$-stage shift registers of types PCR, CCR, PSR, CSR.
Explicit formulas for these functions are given in the next theorem.
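Theorem 3.1. For $n \ge 1$,

$$Z(n) = \frac{1}{n} \sum_{d \mid n} \phi(d)\, 2^{n/d} , \tag{11}$$

$$Z^*(n) = S^*(n-1) = \frac{1}{2n} \sum_{\substack{d \mid n \\ d \text{ odd}}} \phi(d)\, 2^{n/d} , \tag{12}$$

$$S(n) = \frac{1}{2(n+1)} \sum_{d \mid n+1} \phi(2d)\, 2^{(n+1)/d} . \tag{13}$$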
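Formulas (11)–(13) can be evaluated directly and compared with Table 2. In this sketch the function names are ours, `S_star(n)` is computed from the relation $Z^*(n) = S^*(n-1)$ in (12), and the totient is computed naively.

```python
from math import gcd

def phi(m):
    """Euler's totient function, by direct counting."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def divisors(m):
    return [d for d in range(1, m + 1) if m % d == 0]

def Z(n):
    """Formula (11)."""
    return sum(phi(d) * 2 ** (n // d) for d in divisors(n)) // n

def Z_star(n):
    """Formula (12): only odd divisors contribute."""
    return sum(phi(d) * 2 ** (n // d) for d in divisors(n) if d % 2 == 1) // (2 * n)

def S_star(n):
    """S*(n) = Z*(n + 1), again by (12)."""
    return Z_star(n + 1)

def S(n):
    """Formula (13)."""
    return sum(phi(2 * d) * 2 ** ((n + 1) // d) for d in divisors(n + 1)) // (2 * (n + 1))

for n in range(1, 11):
    print(n, Z(n), Z_star(n), S(n), S_star(n))
```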
Remark 3.2. Golomb proves (11) and sketches proofs of the other results. Actually (13) is due to Michael Somos (personal communication), Golomb’s version (given in (15) below) being slightly more complicated. The numbers Z(n) (sequence A31 in [EIS]) in the first column are also familiar as the number of binary irreducible polynomials of degree dividing n, and the number of n-bead necklaces formed with beads of two colors, when the necklaces may not be turned over (cf. [Be68, Chap. 4], [GR61], [MS77, Chap. 4], [St99, Problem 7.112]). Fredricksen [Fr70] shows that Z(n) − 1 is the number of 1’s in the truth table defining the lexicographically least de Bruijn cycle. Proof. Note that sequence A16 from [EIS] appears in two places in the table, for CCR registers of length n and CSR registers of length n − 1. We begin by explaining this, and thus proving that Z ∗ (n) = S ∗ (n − 1) .
(14)
Suppose for concreteness that $n = 4$. The output sequences from the four types of register are (omitting plus signs, and writing $1a$ rather than $1 + a$, etc.):

(i)    a  b  c  d  a  b  c  d  a  ...
(ii)   a  b  c  d  1a  1b  1c  1d  a  b  c  d  ...
(iii)  a  b  c  d  abcd  a  b  c  d  abcd  a  b  c  d  ...
(iv)   a  b  c  d  1abcd  a  b  c  d  1abcd  a  b  c  d  ...
In general these sequences have periods $n$, $2n$, $n + 1$ and $n + 1$, respectively. If we replace (ii) by the sums of adjacent pairs we get

ab  bc  cd  1ad  ab  bc  cd  1ad  ... ,

a CSR(3) sequence. Conversely, given a CSR(3) sequence, say

A  B  C  1ABC  A  B  C  1ABC  ... ,

of period 4, there is a unique CCR(4) sequence of period 8 corresponding to it, namely

0  A  AB  ABC  1  1A  1AB  1ABC  0  A  ... .

Applying this argument in the general case establishes (14).
In the rest of the proof we make use of Burnside's lemma (cf. [St99]), which states that the number of orbits of a finite permutation group $G$ is equal to the average number of points that are fixed by the elements of $G$.

Let us first prove (11). (This is Golomb's proof [Go67, p. 121].) We take $G$ to be the cyclic group of order $n$ generated by $\pi = (1, 2, \ldots, n)$, acting on $\mathbb{F}_2^n$. The permutation $\pi^i$ ($1 \le i \le n$) contains $\gcd(n, i)$ cycles, each of length $n/\gcd(n, i)$, and has order $n/\gcd(n, i)$. There are precisely $2^{\gcd(n,i)}$ vectors fixed by $\pi^i$, since each cycle must consist of all 0's or all 1's. Hence, by Burnside's lemma,

$$\begin{aligned}
Z(n) &= \frac{1}{n} \sum_{i=1}^{n} 2^{\gcd(n,i)} \\
&= \frac{1}{n} \sum_{k \mid n} \sum_{\substack{i=1 \\ \gcd(n,i)=k}}^{n} 2^{k} \\
&= \frac{1}{n} \sum_{k \mid n} 2^{k} \sum_{\gcd(n/k,\, i)=1} 1 \\
&= \frac{1}{n} \sum_{k \mid n} 2^{k}\, \phi\!\left(\frac{n}{k}\right) \\
&= \frac{1}{n} \sum_{d \mid n} \phi(d)\, 2^{n/d} .
\end{aligned}$$
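To establish (12), we note from (iv) that $S^*(n-1)$ is equal to the number of orbits of the same group, but now acting on binary vectors of length $n$ and odd weight. The number of odd weight vectors fixed by $\pi^i$ is $2^{\gcd(n,i)-1}$ if the cycle lengths $n/\gcd(n, i)$ are odd, and zero otherwise. Hence

$$\begin{aligned}
S^*(n-1) &= \frac{1}{n} \sum_{\substack{i=1 \\ n/\gcd(n,i) \text{ odd}}}^{n} 2^{\gcd(n,i)-1} \\
&= \frac{1}{n} \sum_{\substack{k \mid n \\ n/k \text{ odd}}} 2^{k-1}\, \phi\!\left(\frac{n}{k}\right) \\
&= \frac{1}{2n} \sum_{\substack{d \mid n \\ d \text{ odd}}} \phi(d)\, 2^{n/d} .
\end{aligned}$$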
Finally, we prove (13), by determining $S(n-1)$. The group is the same, but now (see (iii)) acting on even weight vectors. If $d = n/\gcd(n, i)$ is even there are $2^{n/d}$ fixed vectors, but if $d$ is odd only $2^{n/d-1}$ fixed vectors. Hence

$$\begin{aligned}
S(n-1) &= \frac{1}{n} \sum_{\substack{d \mid n \\ d \text{ odd}}} \phi(d)\, 2^{n/d-1} + \frac{1}{n} \sum_{\substack{d \mid n \\ d \text{ even}}} \phi(d)\, 2^{n/d} \\
&= \frac{1}{2n} \sum_{d \mid n} \phi(2d)\, 2^{n/d} ,
\end{aligned} \tag{15}$$

since $\phi(2d) = \phi(d)$ if $d$ is odd, and $\phi(2d) = 2\phi(d)$ if $d$ is even.

But a mystery still remains: is the fact that the number of codewords in $VT_0(n)$ equals $Z^*(n+1)$ just a numerical coincidence, or is there a one-to-one correspondence between the codewords and the CCR shift register sequences? (This is essentially equivalent to a research problem stated by Stanley in [St86], Chapter 1, Problem 27(c).) Furthermore, why is $|VT_1(n)|$ (sequence A48 in [EIS]) equal to the number of $(n + 1)$-bead necklaces with beads of two colors and primitive period $n + 1$, when the two colors may be interchanged but the necklaces may not be turned over (cf. [Fi58], [GR61])? This is also the number of irreducible polynomials over $\mathbb{F}_2$ of degree $n + 1$ in which the coefficient of $x^n$ is 1 [Car52], [CMRSS].
4. Locally transitive tournaments

The entry for A16 in [EIS] also indicates that this sequence arose in Brouwer's enumeration [Br80] of locally transitive tournaments. A tournament is a directed graph with one directed edge between any two nodes. It is transitive if there are no directed cycles. A locally transitive tournament is a tournament such that the subgraphs on the predecessors of a point and the successors of a point are both transitive.

Brouwer, answering a question raised by P. J. Cameron, determined the number of locally transitive tournaments on $n$ nodes. He began by calculating the first few values by computer. Then he looked up this sequence in [HIS], and found the reference to Golomb's book [Go67]. With this hint alone, and without having access to the book, he established a one-to-one correspondence between these tournaments and output sequences from shift registers of CCR type. From this he obtained the formula

$$\sum_{d \mid n} \operatorname{odd}\!\left(\frac{n}{d}\right) \frac{2^{d-1}}{d} \sum_{e \mid \frac{n}{d}} \frac{\mu(e)}{e} , \tag{16}$$

where $\operatorname{odd}(i)$ is 0 or 1 according to whether $i$ is even or odd, and $\mu$ is the Möbius function (A8683 in [EIS]). Using the identity

$$\phi(n) = \sum_{d \mid n} \mu(d)\, \frac{n}{d}$$

([Ap76], p. 26), (16) immediately reduces to (12). Again we can ask, is there a connection between locally transitive tournaments and the $VT_0(n)$ codes?
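The reduction of (16) to (12) can also be confirmed numerically. The sketch below evaluates both expressions with exact rational arithmetic; it does not enumerate tournaments, and the function names (`brouwer_formula`, `z_star`) are ours.

```python
from fractions import Fraction
from math import gcd

def divisors(m):
    return [d for d in range(1, m + 1) if m % d == 0]

def phi(m):
    """Euler's totient function, by direct counting."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def mobius(m):
    """Moebius function, by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            result = -result
        p += 1
    return -result if m > 1 else result

def brouwer_formula(n):
    """Expression (16), evaluated exactly."""
    total = Fraction(0)
    for d in divisors(n):
        if (n // d) % 2 == 1:                          # odd(n/d) = 1
            inner = sum(Fraction(mobius(e), e) for e in divisors(n // d))
            total += Fraction(2 ** (d - 1), d) * inner
    return total

def z_star(n):
    """Expression (12), evaluated exactly."""
    return Fraction(sum(phi(d) * 2 ** (n // d) for d in divisors(n) if d % 2 == 1), 2 * n)

print([int(brouwer_formula(n)) for n in range(1, 11)])
```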
5. The number of descendants of a vector

It was already mentioned in Section 2 that $|D_1(u)| = r(u)$, the number of runs in $u$. The next theorem was discovered by E. M. Rains and the author. Although this must be well-known, we have not found it in the literature.

The derivative $u' \in \mathbb{F}_2^{n-1}$ of $u = (u_1, \ldots, u_n) \in \mathbb{F}_2^n$ is given by

$$u' = (u_1 + u_2, u_2 + u_3, \ldots, u_{n-1} + u_n) .$$

Note that $wt(u') = r(u) - 1$.

Theorem 5.1.

$$|D_2(u)| = \binom{r(u) + 1}{2} - \delta , \tag{17}$$

where $\delta = 2\,wt(u') - wt(u'')$ is the deficiency of $u$.

Sketch of proof. First, suppose $u$ is a "normal" vector, meaning that all runs have length $\ge 2$, for example

u   = 0 0 0 0 1 1 1 0 0 0
u'  =  0 0 0 1 0 0 1 0 0
u'' =   0 0 1 1 0 1 1 0        (18)

Then $|D_2(u)| = \binom{r(u) + 1}{2}$ is the number of ways of choosing two things out of $r(u)$ with repetitions allowed. If the runs in $u$ have lengths $i, j, k, l, \ldots$, the runs in the shortened vector have lengths

i − 2, j, k, l, . . .
i, j − 2, k, l, . . .
i, j, k − 2, l, . . .
· · ·
i − 1, j − 1, k, l, . . .
i − 1, j, k − 1, l, . . .
· · ·        (19)

For a normal vector $wt(u'') = 2\,wt(u')$ (cf. (18)), $\delta = 0$ and (17) holds.
Next suppose that all runs in $u$ have length $\ge 2$ except for a single internal run of length 1, as in

u   = 0 0 0 0 1 0 0 0
u'  =  0 0 0 1 1 0 0
u'' =   0 0 1 0 1 0

Then $\delta = 2$, and indeed $|D_2(u)|$ is 2 less than it would be for a normal vector, since one of the possibilities in (19) vanishes and two others coalesce. The remaining cases, when there are several runs of length 1, possibly including beginning or ending runs, are left to the reader.

It is not clear how to generalize Theorem 5.1 to $k$-th order descendants. Certainly $D_3(u)$ is not simply a function of the weights of $u$, $u'$, $u''$ and $u'''$.

Theorem 5.2. Let

$$\mu_k(n) = \max_{u \in \mathbb{F}_2^n} |D_k(u)|$$

be the maximal number of $k$-th order descendants of any binary vector of length $n$. Then

$$\mu_k(n) = \sum_{i=0}^{k} \binom{n-k}{i} , \tag{20}$$

for $n \ge k + 1$. Equality is achieved just by the vectors

$$010101 \ldots \quad \text{and} \quad 101010 \ldots . \tag{21}$$
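Both Theorem 5.1 and the bound (20) lend themselves to a brute-force check for small lengths. The sketch below uses our own helper names; `descendants(u, k)` simply lists every vector obtained by deleting $k$ symbols from $u$.

```python
from itertools import combinations, product
from math import comb

def descendants(u, k):
    """The set D_k(u): all vectors obtained from u by deleting k symbols."""
    n = len(u)
    return {tuple(u[i] for i in keep) for keep in combinations(range(n), n - k)}

def derivative(u):
    """u' = (u_1 + u_2, ..., u_{n-1} + u_n) over F_2."""
    return tuple(a ^ b for a, b in zip(u, u[1:]))

def d2_formula(u):
    """Right-hand side of (17)."""
    first = derivative(u)
    second = derivative(first)
    r = 1 + sum(first)                     # wt(u') = r(u) - 1
    delta = 2 * sum(first) - sum(second)   # deficiency of u
    return comb(r + 1, 2) - delta

def max_descendants(n, k):
    """mu_k(n), by exhaustive search over all vectors of length n."""
    return max(len(descendants(u, k)) for u in product((0, 1), repeat=n))

u = (0, 0, 0, 0, 1, 0, 0, 0)               # the second example in the sketch of proof
print(len(descendants(u, 2)), d2_formula(u))
print(max_descendants(7, 2), sum(comb(7 - 2, i) for i in range(3)))
```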
According to Calabi and Hartnett [CH69], (20) is proved in an unpublished 1967 report³ of Calabi [Cal67]. The first published proof seems to have been given by Levenshtein [Lev96]. It was generalized to the nonbinary case by Hirschberg [Hir99] (see also Levenshtein [Lev99] and Hirschberg and Regnier [HR01]). It is not difficult to show that the vectors (21) achieve the bound in (20).

³ I have been unable to locate a copy of this report.

Theorem 5.3. For the two vectors 010101 . . . and 101010 . . . we have

$$|D_k(u)| = \sum_{i=0}^{k} \binom{n-k}{i} . \tag{22}$$

Proof. Let $u = 010101 \cdots \in \mathbb{F}_2^n$, let $M_{n,k}$ be the set of $k$-th order descendants of $u$, and let $m_{n,k} = |M_{n,k}|$. Then

$$M_{n,k} = 0|\bar{M}_{n-1,k} \cup M_{n-1,k-1} = 0|\bar{M}_{n-1,k} \cup 1|M_{n-2,k-1} \cup M_{n-2,k-2} , \tag{23}$$

where the bars denote binary complementation. However, the last term in (23) can be dropped because it is contained in the union of the other two terms. Since these two terms are disjoint, we have

$$m_{n,k} = m_{n-1,k} + m_{n-2,k-1} .$$

This is a disguised version of the recurrence for binomial coefficients, whose solution is given by (22).

The case $k = 2$ of (20) is a corollary of Theorem 5.1:

Corollary 5.4. For $n \ge 3$,

$$\mu_2(n) = \sum_{i=0}^{2} \binom{n-2}{i} = \frac{1}{2}(n^2 - 3n + 4) . \tag{24}$$

Proof. Let $u$ achieve $\mu_2(n)$. The result is easily verified if $r(u)$ is 1 or 2, so we assume $r(u) \ge 3$. Suppose $u$ begins with a string of $k \ge 0$ runs of length 1, followed by a run of length $\ge 2$ from position $k + 1$. We will show that the vector $u^*$ obtained by complementing $u$ from position $k + 2$ onwards satisfies $|D_2(u^*)| \ge |D_2(u)|$. By repeating this operation we eventually arrive at one of the vectors (21).

Since $|D_k(u)| = |D_k(\bar{u})|$, we may assume that the run following the initial $k$ runs of length 1 in $u$ begins $11x \ldots$. In $u^*$ this is replaced by $10\bar{x} \ldots$. Then we find that $u^*$ has $r(u) + 1$ runs, and $wt(u^{*\prime\prime}) = wt(u'') - 2 + 2x$, from which it follows using (17) that $|D_2(u^*)| - |D_2(u)| = r(u) + 2x - 3 \ge 0$, as required.
6. Related work

The history of deletion-correcting codes is closely tied up with studies of codes for correcting other classes of errors such as:

• erasures, when bits whose positions are known are deleted
• insertions of bits (rather than deletions)
• asymmetric errors, when the only errors that occur are that 1's may be changed to 0's (this is also known as a Z-channel)
• unidirectional errors: 0's may be changed to 1's or 1's to 0's, but only one type of error occurs in any particular transmission
• bit reversals: 0's may be changed to 1's or vice versa; this is the subject of classical coding theory
• transpositions: adjacent bits may be swapped
• any meaningful combination of the above.

Furthermore the alphabet may be changed from $\mathbb{F}_2$ to $\mathbb{F}_q$. This produces an extensive list of families of codes, and of course in each case one can ask for the largest codes. In this section we give a brief overview of some other relevant papers.

First, Levenshtein's papers [Lev65], [Lev65a], [Lev92], [Lev99] should be considered essential reading. Hartnett [Ha74] (see especially Calabi and Hartnett [CH69]) contains some general investigations of all the above-mentioned codes (both block codes and variable length codes) from a fairly abstract mathematical point of view.

One of the earliest papers to study deletion-correcting codes is Sellers [Se62], which combines a special separating string between blocks with a burst-error correcting code inside the blocks. Ullman [Ull66] uses a construction similar to that of Varshamov and Tenengolts, but his codes are not as efficient and also use a separating string between blocks. In [Ull67] he gives bounds on the size of codes for correcting synchronization errors.

Tenengolts [Ten84] generalizes the $VT_a(n)$ codes to larger alphabets. Nonbinary codes are also discussed in [Bo94], [Bo95], [Do85], [Ma98]. Other constructions for deletion-correcting and related codes are given by Calabi and Hartnett [CH69a], Iizuka, Kasahara and Namekawa [IKN], Kløve [Kl95] and Tanaka and Kasai [TK76].

The most recent paper on this subject is by Schulman and Zuckerman [SZ99], who present what they describe as “simple, polynomial-time encodable and decodable codes which are asymptotically good for channels allowing insertions, deletions and transpositions”. The number of errors that can be corrected is some constant fraction of the block-length $n$. The constructions are not explicit.

We conclude this section by mentioning some papers on peripherally related codes.
Codes for correcting asymmetric and unidirectional errors are discussed in [BR82], [Et91], [EO98], [WVB88] and [WVB89]. Erasure correcting codes are discussed by Alon and Luby [AL96] and Barg [Ba98]. Acknowledgements. I would like to thank Andries Brouwer, Suhas Diggavi, Vladimir Levenshtein, Andrew Odlyzko, Eric Rains and Richard Stanley for conversations about the subject of this paper; and David Applegate, David Johnson and Mauricio Resende for their help in establishing that the VT0 (n) codes are optimal for n ≤ 9 and for their (so far unsuccessful!) attempts to find better codes.
References

[AL96]
N. Alon and M. Luby, A linear time erasure resilient code with nearly optimal recovery, IEEE Trans. Inform. Theory 42 (1996), 1732–1736.
[Ap76]
T. M. Apostol, Introduction to Analytic Number Theory, Springer-Verlag, New York 1976.
[Ba98]
A. Barg, Complexity issues in coding theory, in: Handbook of Coding Theory (V. S. Pless and W. C. Huffman, eds.), North-Holland, Amsterdam 1998, 649–754.
[Be68]
E. R. Berlekamp, Algebraic Coding Theory, McGraw-Hill, New York 1968.
[BR82]
B. Bose and T. R. N. Rao, Theory of unidirectional error correcting/detecting codes, IEEE Trans. Comput. 31 (1982), 521–530.
[Bo94]
P. A. H. Bours, Construction of fixed-length insertion/deletion correcting runlength-limited codes, IEEE Trans. Inform. Theory 40 (1994), 1841–1856.
[Bo95]
P. A. H. Bours, On the construction of perfect deletion-correcting codes using design theory, Des. Codes Cryptogr. 6 (1995), 5–20.
[Br80]
A. E. Brouwer, The Enumeration of Locally Transitive Tournaments, Math. Centr. Report ZW138, Amsterdam, April 1980.
[Cal67]
L. Calabi, On the Computation of Levenshtein’s Distances, Report TM-9-0030, Parke Mathematical Laboratories, Inc., Carlisle, MA, 1967; research supported by Air Force Cambridge Research Laboratories under contracts AF19(628)-3826 and F1962867COO30.
[CH69]
L. Calabi and W. E. Hartnett, Some general results of coding theory with applications to the study of codes for the correction of synchronization errors, Inform. Control 15 (1969), 235–249; reprinted in Hartnett [Ha74].
[CH69a]
L. Calabi and W. E. Hartnett, A family of codes for the correction of substitution and synchronization errors, IEEE Trans. Inform. Theory 15 (1969), 102–106.
[Car52]
L. Carlitz, A theorem of Dickson on irreducible polynomials, Proc. Amer. Math. Soc. 3 (1952), 693–700.
[CMRSS]
K. Cattell, C. R. Miers, F. Ruskey, J. Sawada and M. Serra, The number of irreducible polynomials over GF(2) with given trace and subtrace, preprint, 2000. See http://csr.csc.uvic.ca/~fruskey/Publications/TraceSubtrace.html.
[Do85]
A. S. Dolgopolov, Nonbinary codes correcting symbol insertions, deletions and substitutions (in Russian), Problemy Peredachi Informatsii 21, No. 1, (1985), 35–39; English translation in Problems Inform. Transmission 21, No. 1, (1985).
[Et91]
T. Etzion, New lower bounds for asymmetric and unidirectional codes, IEEE Trans. Inform. Theory 37 (1991), 1696–1704.
[EO98]
T. Etzion and P. R. J. Östergård, Greedy and heuristic algorithms for codes and colorings, IEEE Trans. Inform. Theory 44 (1998), 382–388.
[Fi58]
N. J. Fine, Classes of periodic sequences, Illinois J. Math. 2 (1958), 285–302.
[Fr70]
H. Fredricksen, The lexicographically least de Bruijn cycle, J. Combin. Theory 9 (1970), 1–5.
[GR61]
E. N. Gilbert and J. Riordan, Symmetry types of periodic sequences, Illinois J. Math. 5 (1961), 657–665.
[Gi67]
B. D. Ginzburg, A number-theoretic function with an application in the theory of coding (in Russian), Problemy Kibernetiki 19 (1967), 249–252; English translation in Systems Theory Research 19 (1970), 255–259.
[Go67]
S. W. Golomb, Shift Register Sequences, Holden-Day, San Francisco 1967.
[GS80]
R. L. Graham and N. J. A. Sloane, Lower bounds for constant weight codes, IEEE Trans. Inform. Theory 26 (1980), 37–43.
[Ha74]
W. E. Hartnett (ed.), Foundations of Coding Theory, Reidel, Dordrecht 1974.
[Hir99]
D. S. Hirschberg, Bounds on the number of string subsequences, in: Combinatorial Pattern Matching (M. Crochemore and M. Paterson, eds.), Lecture Notes in Comput. Sci. 1645, Springer-Verlag, Berlin 1999, 115–122.
[HR01]
D. S. Hirschberg and M. Regnier, Tight bounds on the number of string subsequences, J. Discrete Algorithms, to appear.
[IKN]
I. Iizuka, M. Kasahara and T. Namekawa, Block codes capable of correcting both additive and timing errors, IEEE Trans. Inform. Theory 26 (1980), 393–400.
[Kl81]
T. Kløve, A lower bound for A(n, 4, w), IEEE Trans. Inform. Theory 27 (1981), 257–258.
[Kl95]
T. Kløve, Codes correcting a single insertion/deletion of a zero or a single peak-shift, IEEE Trans. Inform. Theory 41 (1995), 279–283.
[Ko88]
T. W. Körner, Fourier Analysis, Cambridge University Press, Cambridge 1988.
[Lev65]
V. I. Levenshtein, Binary codes capable of correcting deletions, insertions and reversals (in Russian), Dokl. Akad. Nauk SSSR 163, No. 4, (1965), 845–848; English translation in Soviet Physics Dokl. 10, No. 8, (1966), 707–710.
[Lev65a]
V. I. Levenshtein, Binary codes capable of correcting spurious insertions and deletions of ones (in Russian), Problemy Peredachi Informatsii 1, No. 1, (1965), 12–25; English translation in Problems of Information Transmission 1, No. 1, (1965), 8–17.
[Lev92]
V. I. Levenshtein, On perfect codes in the deletion/insertion metric (in Russian), Diskret. Mat. 3, No. 1, (1991), 3–20; English translation in Discrete Math. Appl. 2, No. 3, (1992), 241–258.
[Lev96]
V. I. Levenshtein, Reconstructing binary sequences by the minimum number of their subsequences or supersequences of a given length, in: Proceedings of Fifth Intern. Workshop on Algebr. and Combin. Coding Theory, Sozopol, Bulgaria, June 1–7, 1996, Unicorn, Shumen 1996, 176–183.
[Lev99]
V. I. Levenshtein, Efficient reconstruction of sequences from their subsequences or supersequences, preprint 1999.
[MS77]
F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes, North-Holland, Amsterdam 1977.
[Ma98]
A. Mahmoodi, Existence of perfect 3-deletion correcting codes, Des. Codes Cryptogr. 14 (1998), 81–87.
[PH98]
V. S. Pless and W. C. Huffman, Handbook of Coding Theory, North-Holland, Amsterdam 1998.
[SZ99]
L. J. Schulman and D. Zuckerman, Asymptotically good codes correcting insertions, deletions and transpositions, IEEE Trans. Inform. Theory 45 (1999), 2552–2557.
[Se62]
F. F. Sellers, Jr., Bit loss and gain correction codes, IEEE Trans. Inform. Theory 8 (1962), 35–38.
[HIS]
N. J. A. Sloane, A Handbook of Integer Sequences, Academic Press, New York 1973.
[SL01]
N. J. A. Sloane, Home page: http://www.research.att.com/~njas/.
[EIS]
N. J. A. Sloane, The On-Line Encyclopedia of Integer Sequences, published electronically at http://www.research.att.com/~njas/sequences/.
[St86]
R. P. Stanley, Enumerative Combinatorics, Vol. 1, Wadsworth, Monterey, CA, 1986.
[St99]
R. P. Stanley, Enumerative Combinatorics, Vol. 2, Cambridge University Press, Cambridge 1999.
[TK76]
E. Tanaka and T. Kasai, Synchronization and substitution error-correcting codes for the Levenshtein metric, IEEE Trans. Inform. Theory 22 (1976), 156–162.
[Ten76]
G. M. Tenengolts, Class of codes correcting bit loss and errors in the preceding bit (in Russian), Avtomatika i Telemekhanika 5 (1976), 174–179; English translation in Automation and Remote Control 37, No. 5, (1976), 797–802.
[Ten84]
G. Tenengolts, Nonbinary codes correcting single deletion or insertion, IEEE Trans. Inform. Theory 30 (1984), 766–769.
[Ull66]
J. D. Ullman, Near-optimal, single-synchronization-error-correcting code, IEEE Trans. Inform. Theory 12 (1966), 418–424.
[Ull67]
J. D. Ullman, On the capabilities of codes to correct synchronization errors, IEEE Trans. Inform. Theory 13 (1967), 95–105.
[Var65]
R. R. Varshamov, On an arithmetic function with an application in the theory of coding (in Russian), Dokl. Akad. Nauk SSSR, 161, No. 3, (1965), 540–543.
[VT65]
R. R. Varshamov and G. M. Tenengolts, Codes which correct single asymmetric errors (in Russian), Avtomatika i Telemekhanika 26, No. 2, (1965), 288–292; English translation in Automation and Remote Control 26, No. 2, (1965), 286–290.
[WVB88]
J. H. Weber, C. de Vroedt and D. E. Boekee, Bounds and constructions for binary codes of length less than 24 and asymmetric distance less than 6, IEEE Trans. Inform. Theory 34 (1988), 1321–1331.
[WVB89]
J. H. Weber, C. de Vroedt and D. E. Boekee, Bounds and constructions for codes correcting unidirectional errors, IEEE Trans. Inform. Theory 35 (1989), 797–810.

N. J. A. Sloane
AT&T Shannon Labs
180 Park Avenue, Florham Park, NJ 07932-0971, U.S.A.
[email protected]
Critical problems in finite vector spaces Zhe-Xian Wan
Abstract. Critical problems in finite vector spaces on which some classical groups act are studied. In particular, the critical exponents of finite unitary spaces and finite symplectic spaces are defined and are expressed in terms of the Anzahl theorems of the corresponding geometries. 2000 Mathematics Subject Classification: primary 05E15; secondary 05B25.
1. Introduction

The classical critical problem of finite vector spaces was formulated and studied by Crapo and Rota [1] in 1970. We begin with a review of their results.

Let $q$ be a power of a prime, $\mathbb{F}_q$ be the finite field with $q$ elements, and $\mathbb{F}_q^{(n)}$ be the $n$-dimensional row vector space over $\mathbb{F}_q$, where $n$ is a positive integer. Let $S$ be a set of non-zero vectors in $\mathbb{F}_q^{(n)}$. A subspace $P$ of $\mathbb{F}_q^{(n)}$ is said to distinguish $S$ if $P \cap S = \emptyset$. The critical exponent of $S$, denoted by $c(S, \mathbb{F}_q^{(n)})$, is defined to be the minimum non-negative integer $s$ such that there exists a subspace of dimension $n - s$ distinguishing $S$. It can also be defined as the minimum non-negative integer $s$ such that there exists an $s$-tuple of $(n-1)$-dimensional subspaces $(H_1, H_2, \ldots, H_s)$ distinguishing $S$, i.e., $(\cap_{i=1}^{s} H_i) \cap S = \emptyset$. By convention, we regard the intersection of zero $(n-1)$-dimensional subspaces to be $\mathbb{F}_q^{(n)}$. Since an $(n-s)$-dimensional subspace can be expressed as an intersection of $s$ $(n-1)$-dimensional subspaces, the equivalence of these two definitions is clear.

A map $f$ from $\mathbb{F}_q^{(n)}$ to $\mathbb{F}_q$ is called a linear map if

$$f(\lambda x + \mu y) = \lambda f(x) + \mu f(y) \quad \text{for all } x, y \in \mathbb{F}_q^{(n)} \text{ and } \lambda, \mu \in \mathbb{F}_q .$$

Denote the kernel of the linear map $f$ by $\operatorname{Ker} f$. It is clear that if $f$ is a non-zero linear map $\operatorname{Ker} f$ is an $(n-1)$-dimensional subspace of $\mathbb{F}_q^{(n)}$ and that if $f$ is the zero map $\operatorname{Ker} f = \mathbb{F}_q^{(n)}$. The critical exponent of a set of non-zero vectors $S$ in $\mathbb{F}_q^{(n)}$ can also be defined as the minimum non-negative integer $s$ such that there exists an $s$-tuple of linear maps from $\mathbb{F}_q^{(n)}$ to $\mathbb{F}_q$, $(f_1, f_2, \ldots, f_s)$, distinguishing $S$, i.e., $(\cap_{i=1}^{s} \operatorname{Ker} f_i) \cap S = \emptyset$. By convention we also regard the intersection of the kernels of a 0-tuple of linear maps to be $\mathbb{F}_q^{(n)}$. The equivalence of the third definition with the preceding two is also clear.

Codes and Designs, Ohio State Univ. Math. Res. Inst. Publ. 10, © Walter de Gruyter 2002

Let $X$ be a non-empty set of vectors of $\mathbb{F}_q^{(n)}$. Denote by $\langle X \rangle$ the subspace spanned by $X$. The rank of $X$, denoted by $r(X)$, is defined to be $\dim \langle X \rangle$. If $X = \emptyset$, we agree that $\langle X \rangle = \emptyset$ and $r(X) = 0$. In what follows we need some concepts from matroid theory and the reader may consult Welsh [5].

Based on the third definition of the critical exponent, the following theorem was obtained by Crapo and Rota [1].
Theorem 1.1. Let $S$ be a set of non-zero vectors in $\mathbb{F}_q^{(n)}$, $M(S)$ be the matroid on $S$ defined by linear independence of vectors, $L(M(S))$ be the lattice of flats of $M(S)$, $\mu$ be the Möbius function on $L(M(S))$, and $\chi(L(M(S)), \lambda)$ be the characteristic polynomial of $L(M(S))$. Then the number of $s$-tuples of linear maps from $\mathbb{F}_q^{(n)}$ to $\mathbb{F}_q$ distinguishing $S$ is
\[ q^{s(n - r(S))} \, \chi(L(M(S)), q^s). \]

Corollary 1.2. Let $S$ be a set of non-zero vectors in $\mathbb{F}_q^{(n)}$ and $M(S)$, $L(M(S))$, $\chi(L(M(S)), \lambda)$ be the same as in Theorem 1.1. Then
\[ c(S, \mathbb{F}_q^{(n)}) = \min\{\, s \mid \chi(L(M(S)), q^s) \neq 0 \,\}. \]

It follows from Corollary 1.2 that the critical exponent of a set $S$ of non-zero vectors in $\mathbb{F}_q^{(n)}$ depends only on the matroid structure of the set $S$. Moreover, we have

Theorem 1.3. Let $S$ be a set of non-zero vectors in $\mathbb{F}_q^{(n)}$. Then $c(S, \mathbb{F}_q^{(n)}) \le r(S)$.

Then, based on the first two definitions, the following theorems were obtained by Dowling [2] in 1971.

Theorem 1.4. Let $S$ be a set of non-zero vectors in $\mathbb{F}_q^{(n)}$ and $M(S)$, $L(M(S))$, $\chi(L(M(S)), \lambda)$ be the same as in Theorem 1.1. Then the number of $s$-tuples of $(n-1)$-dimensional subspaces of $\mathbb{F}_q^{(n)}$ distinguishing $S$ is
\[ \frac{1}{(q-1)^s} \sum_{j=0}^{s} (-1)^{s-j} \binom{s}{j} q^{j(n-r(S))} \, \chi(L(M(S)), q^j). \]

Corollary 1.5. Let $S$ be a set of non-zero vectors in $\mathbb{F}_q^{(n)}$ and $M(S)$, $L(M(S))$, $\chi(L(M(S)), \lambda)$ be the same as in Theorem 1.1. Then
\[ c(S, \mathbb{F}_q^{(n)}) = \min\Bigl\{\, s \Bigm| \sum_{j=0}^{s} (-1)^{s-j} \binom{s}{j} q^{j(n-r(S))} \, \chi(L(M(S)), q^j) \neq 0 \,\Bigr\}. \]

Theorem 1.6. Let $S$ be a set of non-zero vectors in $\mathbb{F}_q^{(n)}$ and $M(S)$, $L(M(S))$, $\chi(L(M(S)), \lambda)$ be the same as in Theorem 1.1. Then the number of $(n-s)$-dimensional subspaces of $\mathbb{F}_q^{(n)}$ distinguishing $S$ is
\[ \frac{1}{\prod_{j=0}^{s-1} (q^s - q^j)} \sum_{j=0}^{s} (-1)^{j} q^{\binom{j}{2}} \begin{bmatrix} s \\ j \end{bmatrix}_q q^{(s-j)(n-r(S))} \, \chi(L(M(S)), q^{s-j}). \]

Corollary 1.7. Let $S$ be a set of non-zero vectors in $\mathbb{F}_q^{(n)}$ and $M(S)$, $L(M(S))$, $\chi(L(M(S)), \lambda)$ be the same as in Theorem 1.1. Then
\[ c(S, \mathbb{F}_q^{(n)}) = \min\Bigl\{\, s \Bigm| \sum_{j=0}^{s} (-1)^{j} q^{\binom{j}{2}} \begin{bmatrix} s \\ j \end{bmatrix}_q q^{(s-j)(n-r(S))} \, \chi(L(M(S)), q^{s-j}) \neq 0 \,\Bigr\}. \]
In the present paper the critical problems in finite unitary spaces, finite symplectic spaces, etc. will be studied.
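As a small sanity check of the definitions and of Theorem 1.1 and Corollary 1.2, the following Python sketch works over $\mathbb{F}_2$ with $n = 3$. It is an illustration only, not part of the paper: the set `S`, the bitmask encoding of vectors, and all function names are assumptions chosen here. The characteristic polynomial is evaluated through the standard subset expansion $\chi(\lambda) = \sum_{A \subseteq S} (-1)^{|A|} \lambda^{r(S) - r(A)}$, which agrees with the characteristic polynomial of the lattice of flats because $S$ contains no zero vector.

```python
from itertools import combinations, product

def rank_gf2(vectors):
    """Rank over F_2 of vectors encoded as bitmask integers."""
    pivots = {}  # leading bit -> reduced vector with that leading bit
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = v
                break
            v ^= pivots[lead]  # clears the leading bit of v
    return len(pivots)

def char_poly_at(S, lam):
    """chi(lam) via the subset expansion sum_A (-1)^|A| lam^(r(S) - r(A))."""
    r = rank_gf2(S)
    return sum((-1) ** len(A) * lam ** (r - rank_gf2(A))
               for size in range(len(S) + 1)
               for A in combinations(S, size))

def count_distinguishing_tuples(S, n, s):
    """Brute-force count of s-tuples of linear maps F_2^n -> F_2 distinguishing S."""
    def in_kernel(a, v):  # the map f_a(v) = a . v over F_2
        return bin(a & v).count("1") % 2 == 0
    return sum(1 for fs in product(range(2 ** n), repeat=s)
               if not any(all(in_kernel(a, v) for a in fs) for v in S))

# Hypothetical example set: four non-zero vectors of F_2^3 (rank 3).
n, S = 3, [0b001, 0b010, 0b011, 0b100]
r = rank_gf2(S)
for s in range(4):
    brute = count_distinguishing_tuples(S, n, s)
    formula = 2 ** (s * (n - r)) * char_poly_at(S, 2 ** s)
    print(s, brute, formula)

# Corollary 1.2: the critical exponent is the least s with chi(q^s) != 0.
critical_exponent = next(s for s in range(n + 1) if char_poly_at(S, 2 ** s) != 0)
print("critical exponent:", critical_exponent)
```

For this particular `S` the brute-force count and the formula of Theorem 1.1 agree for $s = 0, 1, 2, 3$, and the critical exponent comes out as 2, consistent with the bound of Theorem 1.3.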
2. Critical problems in finite unitary spaces

Let $n$ be a positive integer and $\mathbb{F}_{q^2}^{(n)}$ be the $n$-dimensional unitary space, i.e., the $n$-dimensional vector space over $\mathbb{F}_{q^2}$ on which the unitary group $U_n(\mathbb{F}_{q^2})$ of degree $n$ over $\mathbb{F}_{q^2}$ acts.

Let $S$ be a set of non-zero vectors in $\mathbb{F}_{q^2}^{(n)}$. The unitary critical exponent of $S$, denoted by $c_{\mathrm{uni}}(S, \mathbb{F}_{q^2}^{(n)})$, is defined to be the minimum non-negative integer $s$ such that there exists an $(n-s)$-dimensional non-isotropic subspace $P$ distinguishing $S$, i.e., $P \cap S = \emptyset$. Since an $(n-s)$-dimensional non-isotropic subspace can be expressed as an intersection of $s$ $(n-1)$-dimensional non-isotropic subspaces, the unitary critical exponent of $S$ can also be defined as the minimum non-negative integer $s$ such that there exists an $s$-tuple of $(n-1)$-dimensional non-isotropic subspaces $(H_1, H_2, \ldots, H_s)$ distinguishing $S$, i.e., $(\bigcap_{i=1}^{s} H_i) \cap S = \emptyset$.

Let $X$ be a non-empty set of vectors in $\mathbb{F}_{q^2}^{(n)}$. If the subspace $U = \langle X \rangle$ spanned by $X$ is of type $(m, r)$, where $m$ is the dimension of $U$ and $r$ is the rank of $U\,{}^t\overline{U}$, then $X$ is also said to be of type $(m, r)$. By Theorem 5.7 of [4], $2r \le 2m \le n + r$. We also write $m = m(X)$ and $r = r(X)$ for simplicity. If $X = \emptyset$, we agree that $\langle X \rangle = \emptyset$ and $m(X) = r(X) = 0$.

Lemma 2.1. Let $X$ be a set of vectors of type $(m, r)$ in the $n$-dimensional unitary space $\mathbb{F}_{q^2}^{(n)}$, where $2r \le 2m \le n + r$, and let $k$ be a positive integer $\le n$. Then the number of $k$-dimensional non-isotropic subspaces containing $X$ equals
\[ q^{(n-k)(k-r)} \prod_{i=1}^{n-k} \frac{q^{k+r-2m+i} - (-1)^{k+r-2m+i}}{q^{i} - (-1)^{i}}. \]
Proof. When $X = \emptyset$, our lemma follows from Corollary 5.21 of [4]. In fact, the number of $k$-dimensional non-isotropic subspaces containing $X = \emptyset$ is the number $N(k, k; n)$ of $k$-dimensional non-isotropic subspaces, which is equal to
\[ N(k, k; n) = q^{k(n-k)} \prod_{i=1}^{k} \frac{q^{n-k+i} - (-1)^{n-k+i}}{q^{i} - (-1)^{i}} = q^{k(n-k)} \prod_{i=1}^{n-k} \frac{q^{k+i} - (-1)^{k+i}}{q^{i} - (-1)^{i}}. \]

Now we assume that $X \neq \emptyset$. Let $P$ be a $k$-dimensional non-isotropic subspace in the unitary space $\mathbb{F}_{q^2}^{(n)}$. Then $P \supseteq X$ if and only if $P \supseteq \langle X \rangle$. If $k < 2m - r$, by Theorem 5.36 of [4] the number of $k$-dimensional non-isotropic subspaces containing $X$ is 0 and the formula in Lemma 2.1 also gives 0. If $k \ge 2m - r$, our lemma follows from Theorem 5.37 of [4].

Corollary 2.2. Let $X$ be a set of vectors of type $(m, r)$ in the $n$-dimensional unitary space $\mathbb{F}_{q^2}^{(n)}$, where $2r \le 2m \le n + r$. Then the number of $(n-1)$-dimensional non-isotropic subspaces containing $X$ equals
\[ \frac{q^{n-r-1} \bigl( q^{n+r-2m} - (-1)^{n+r-2m} \bigr)}{q + 1}. \]

Theorem 2.3. Let $S$ be a set of non-zero vectors in the $n$-dimensional unitary space $\mathbb{F}_{q^2}^{(n)}$, $M(S)$ be the matroid on $S$ defined by linear independence of vectors, $L(M(S))$ be the lattice of flats of the matroid $M(S)$, and $\mu$ be the Möbius function on $L(M(S))$. Then the number of $(n-s)$-dimensional non-isotropic subspaces distinguishing $S$ equals
\[ \sum_{X \in L(M(S))} \mu(\emptyset, X) \, q^{s(n-s-r(X))} \prod_{i=1}^{s} \frac{q^{n-s+r(X)-2m(X)+i} - (-1)^{n-s+r(X)-2m(X)+i}}{q^{i} - (-1)^{i}}. \]
Proof. Let $X$ be a flat of $M(S)$ and assume that $X$ is of type $(m(X), r(X))$. Denote by $g(s, X)$ the number of $(n-s)$-dimensional non-isotropic subspaces containing $X$. By Lemma 2.1,
\[ g(s, X) = q^{s(n-s-r(X))} \prod_{i=1}^{s} \frac{q^{n-s+r(X)-2m(X)+i} - (-1)^{n-s+r(X)-2m(X)+i}}{q^{i} - (-1)^{i}}. \]
Denote by $f(s, X)$ the number of $(n-s)$-dimensional non-isotropic subspaces $P$ such that $P \cap S = X$. Then
\[ g(s, X) = \sum_{Y \in L(M(S)):\, Y \supseteq X} f(s, Y). \]
By Möbius inversion,
\[ f(s, Y) = \sum_{X \in L(M(S)):\, X \supseteq Y} \mu(Y, X) \, g(s, X). \]
For $Y = \emptyset$, $f(s, \emptyset)$ is the number of $(n-s)$-dimensional non-isotropic subspaces distinguishing $S$. The theorem is proved.

Corollary 2.4. Let $S$ be a set of non-zero vectors in the $n$-dimensional unitary space $\mathbb{F}_{q^2}^{(n)}$, and $M(S)$, $L(M(S))$, $\mu$ be as in Theorem 2.3. Then
\[ c_{\mathrm{uni}}(S, \mathbb{F}_{q^2}^{(n)}) = \min\Bigl\{\, s \Bigm| \sum_{X \in L(M(S))} \mu(\emptyset, X) \, q^{s(n-s-r(X))} \prod_{i=1}^{s} \frac{q^{n-s+r(X)-2m(X)+i} - (-1)^{n-s+r(X)-2m(X)+i}}{q^{i} - (-1)^{i}} \neq 0 \,\Bigr\}. \]
In a similar way, from Corollary 2.2 we deduce

Theorem 2.5. Let $S$ be a set of non-zero vectors in the $n$-dimensional unitary space $\mathbb{F}_{q^2}^{(n)}$. Then the number of $s$-tuples of $(n-1)$-dimensional non-isotropic subspaces distinguishing $S$ is
\[ \sum_{X \in L(M(S))} \mu(\emptyset, X) \left[ \frac{q^{n-r(X)-1} \bigl( q^{n+r(X)-2m(X)} - (-1)^{n+r(X)-2m(X)} \bigr)}{q + 1} \right]^{s}. \]

Corollary 2.6. Let $S$ be a set of non-zero vectors in the $n$-dimensional unitary space $\mathbb{F}_{q^2}^{(n)}$. Then
\[ c_{\mathrm{uni}}(S, \mathbb{F}_{q^2}^{(n)}) = \min\Bigl\{\, s \Bigm| \sum_{X \in L(M(S))} \mu(\emptyset, X) \left[ \frac{q^{n-r(X)-1} \bigl( q^{n+r(X)-2m(X)} - (-1)^{n+r(X)-2m(X)} \bigr)}{q + 1} \right]^{s} \neq 0 \,\Bigr\}. \]
It follows from Corollary 2.4 or 2.6 that the unitary critical exponent of a set $S$ of non-zero vectors in the unitary space $\mathbb{F}_{q^2}^{(n)}$ depends only on the matroid structure of $S$.

The orbit of $(n-1)$-dimensional subspaces in $\mathbb{F}_{q^2}^{(n)}$ under the general linear group $GL_n(\mathbb{F}_{q^2})$ is partitioned into two orbits under the unitary group $U_n(\mathbb{F}_{q^2})$, one of which consists of all $(n-1)$-dimensional non-isotropic subspaces and the other of all $(n-1)$-dimensional isotropic subspaces, i.e., subspaces of type $(n-1, n-2)$. The critical exponent of a set $S$ of non-zero vectors in $\mathbb{F}_{q^2}^{(n)}$ is defined to be the minimum non-negative integer $s$ such that there exists an $s$-tuple of $(n-1)$-dimensional subspaces distinguishing $S$, and the unitary critical exponent of $S$ is defined to be the minimum non-negative integer $s$ such that there exists an $s$-tuple of $(n-1)$-dimensional non-isotropic subspaces distinguishing $S$. Now we call the latter the unitary critical exponent of $S$ defined by $(n-1)$-dimensional non-isotropic subspaces and denote it by $c_{\mathrm{uni}}^{(n)}(S, \mathbb{F}_{q^2}^{(n)})$. In a similar way we define the unitary critical exponent of $S$ defined by $(n-1)$-dimensional isotropic subspaces to be the minimum non-negative integer $s$ such that there exists an $s$-tuple of $(n-1)$-dimensional isotropic subspaces distinguishing $S$ and denote it by $c_{\mathrm{uni}}^{(i)}(S, \mathbb{F}_{q^2}^{(n)})$.

Lemma 2.7. Let $X$ be a set of vectors of type $(m, r)$ in the $n$-dimensional unitary space $\mathbb{F}_{q^2}^{(n)}$, where $2r \le 2m \le n + r$. Then the number of $(n-1)$-dimensional isotropic subspaces containing $X$ is equal to
\[ N(m, r; n-1, n-2; n) = \frac{q^{2(m-r)} \prod_{j=0}^{1} \bigl( q^{n-2m+r-j} - (-1)^{n-2m+r-j} \bigr) + \bigl( q^{2(m-r)} - 1 \bigr)}{q^2 - 1}. \]

Proof. When $X = \emptyset$, the number of $(n-1)$-dimensional isotropic subspaces containing $X$ is equal to the number of $(n-1)$-dimensional isotropic subspaces, and by Theorem 5.19 of [4] the latter is equal to
\[ N(n-1, n-2; n) = \frac{(q^{n} - (-1)^{n})(q^{n-1} - (-1)^{n-1})}{q^2 - 1}. \]
By convention $m = r = 0$, and then the formula of $N(m, r; n-1, n-2; n)$ gives
\[ N(0, 0; n-1, n-2; n) = \frac{(q^{n} - (-1)^{n})(q^{n-1} - (-1)^{n-1})}{q^2 - 1}. \]
Therefore our lemma is true for the case $X = \emptyset$.

Now assume that $X \neq \emptyset$. Consider first the case $m = r$. By Theorem 5.36 of [4], when $m = n$ or $n - 1$, the number of $(n-1)$-dimensional isotropic subspaces containing $X$ is zero, and the formula of $N(m, r; n-1, n-2; n)$ also gives $N(m, r; n-1, n-2; n) = 0$. When $m \le n - 2$, by Theorem 5.37 of [4] we have the formula of $N(m, m; n-1, n-2; n)$. Then consider the case $m > r$. The formula of $N(m, r; n-1, n-2; n)$ is a special case of the formula in Theorem 5.37 of [4].

Theorem 2.8. Let $S$ be a set of non-zero vectors in the $n$-dimensional unitary space $\mathbb{F}_{q^2}^{(n)}$. Then the number of $s$-tuples of $(n-1)$-dimensional isotropic subspaces distinguishing $S$ is
\[ \sum_{X \in L(M(S))} \mu(\emptyset, X) \, N(m(X), r(X); n-1, n-2; n)^{s}. \]
Proof. Similar to the proof of Theorem 2.3.

Corollary 2.9. Let $S$ be a set of non-zero vectors in the $n$-dimensional unitary space $\mathbb{F}_{q^2}^{(n)}$. Then
\[ c_{\mathrm{uni}}^{(i)}(S, \mathbb{F}_{q^2}^{(n)}) = \min\Bigl\{\, s \Bigm| \sum_{X \in L(M(S))} \mu(\emptyset, X) \, N(m(X), r(X); n-1, n-2; n)^{s} \neq 0 \,\Bigr\}. \]

Finally, we have

Theorem 2.10. Let $S$ be a set of non-zero vectors in the $n$-dimensional unitary space $\mathbb{F}_{q^2}^{(n)}$. Then
\[ c_{\mathrm{uni}}^{(n)}(S, \mathbb{F}_{q^2}^{(n)}) \ge c(S, \mathbb{F}_{q^2}^{(n)}) \quad \text{and} \quad c_{\mathrm{uni}}^{(i)}(S, \mathbb{F}_{q^2}^{(n)}) \ge c(S, \mathbb{F}_{q^2}^{(n)}). \]

The above study of the critical problems in the finite unitary space can be carried over to the finite pseudo-symplectic and orthogonal spaces in a parallel way, but the details will not be repeated. However, for the symplectic space of dimension $2\nu$ over $\mathbb{F}_q$ the set of $(2\nu - 1)$-dimensional subspaces also forms a single orbit under the symplectic group $Sp_{2\nu}(\mathbb{F}_q)$; thus the symplectic critical exponent will be defined in a different way in the next section.
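The counting formulas of Corollary 2.2 and Lemma 2.7 can be cross-checked numerically. Since every $(n-1)$-dimensional subspace is either non-isotropic or of type $(n-1, n-2)$, the two counts should add up to the number $(q^{2(n-m)} - 1)/(q^2 - 1)$ of all hyperplanes of $\mathbb{F}_{q^2}^{(n)}$ containing an $m$-dimensional subspace. The sketch below is an illustration added here under that reading; the function names are hypothetical, and it tests only the arithmetic identity, not the underlying geometry.

```python
from fractions import Fraction

def sign(e):
    """(-1)**e as an integer, also valid for negative exponents e."""
    return -1 if e % 2 else 1

def non_isotropic_hyperplanes(q, n, m, r):
    """Corollary 2.2: (n-1)-dimensional non-isotropic subspaces containing X."""
    t = n + r - 2 * m
    return Fraction(q) ** (n - r - 1) * (q ** t - sign(t)) / (q + 1)

def isotropic_hyperplanes(q, n, m, r):
    """Lemma 2.7: N(m, r; n-1, n-2; n)."""
    t = n - 2 * m + r
    prod = Fraction(1)
    for j in (0, 1):
        prod *= Fraction(q) ** (t - j) - sign(t - j)
    e = m - r
    return (q ** (2 * e) * prod + q ** (2 * e) - 1) / (q * q - 1)

def all_hyperplanes(q, n, m):
    """Hyperplanes of an n-space over F_{q^2} containing a fixed m-dimensional subspace."""
    return Fraction(q ** (2 * (n - m)) - 1, q * q - 1)

def admissible_types(n):
    """All (m, r) with 2r <= 2m <= n + r and m <= n."""
    return [(m, r) for m in range(n + 1) for r in range(m + 1) if 2 * m <= n + r]

for q in (2, 3, 4, 5):
    for n in range(1, 7):
        for m, r in admissible_types(n):
            total = non_isotropic_hyperplanes(q, n, m, r) + isotropic_hyperplanes(q, n, m, r)
            assert total == all_hyperplanes(q, n, m), (q, n, m, r)

# Smallest interesting case: q = 2, n = 3, X empty (the Hermitian curve in PG(2, 4)).
print(non_isotropic_hyperplanes(2, 3, 0, 0), isotropic_hyperplanes(2, 3, 0, 0))
```

For $q = 2$, $n = 3$ and $X = \emptyset$ this gives 12 non-isotropic and 9 isotropic hyperplanes, which together account for the 21 lines of the projective plane over $\mathbb{F}_4$.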
3. A critical problem in finite symplectic spaces

Let $\nu$ be a positive integer,
\[ K = \begin{pmatrix} 0 & I^{(\nu)} \\ -I^{(\nu)} & 0 \end{pmatrix}, \]
$Sp_{2\nu}(\mathbb{F}_q)$ be the symplectic group of degree $2\nu$ with respect to $K$ over $\mathbb{F}_q$, and $\mathbb{F}_q^{(2\nu)}$ be the $2\nu$-dimensional symplectic space over $\mathbb{F}_q$, i.e., the $2\nu$-dimensional vector space $\mathbb{F}_q^{(2\nu)}$ on which the symplectic group $Sp_{2\nu}(\mathbb{F}_q)$ acts.

A set of vectors $X$ in $\mathbb{F}_q^{(2\nu)}$ is called an isotropic set of vectors if $u K\,{}^t v = 0$ for all $u, v \in X$. Denote by $\langle X \rangle$ the subspace spanned by $X$. Clearly, $\langle X \rangle$ is a totally isotropic subspace if $X$ is an isotropic set of vectors. The rank of a set of vectors $X$ is defined to be the dimension of $\langle X \rangle$, which is denoted by $r(X)$, and $X$ is called a rank-$r$ set. When $X = \emptyset$, then $\langle X \rangle = \emptyset$ and we agree that $r(X) = 0$.

Let $S$ be a set of non-zero vectors in the $2\nu$-dimensional symplectic space over $\mathbb{F}_q$. A totally isotropic subspace $P$ is said to distinguish $S$ if $P \cap S = \emptyset$. The symplectic critical exponent of $S$, denoted by $c_{\mathrm{symp}}(S, \mathbb{F}_q^{(2\nu)})$, is defined to be the minimum positive integer $s \le \nu + 1$ such that there exists a $(\nu + 1 - s)$-dimensional totally isotropic subspace distinguishing $S$ (cf. [3]). Since any $(\nu + 1 - s)$-dimensional totally isotropic subspace is an intersection of $s$ maximal totally isotropic subspaces, the symplectic critical exponent of $S$ can also be defined as the minimum positive integer $s$ such that there exists an $s$-tuple of maximal totally isotropic subspaces $(P_1, P_2, \ldots, P_s)$ such that $\bigcap_{i=1}^{s} P_i$ distinguishes $S$.

Let $S$ be a set of non-zero vectors in $\mathbb{F}_q^{(2\nu)}$, $M(S)$ be the matroid on $S$ defined by linear independence of vectors, and $L(M(S))$ be the lattice of flats of $M(S)$. An isotropic flat is a flat which is also an isotropic set of vectors. Clearly, subsets of an isotropic set of vectors are also isotropic. It follows that the collection of isotropic flats forms an ideal in the lattice $L(M(S))$, and this ideal will be denoted by $L_I(M(S))$.

Lemma 3.1. Let $X$ be a rank-$r$ isotropic set of vectors in the $2\nu$-dimensional symplectic space $\mathbb{F}_q^{(2\nu)}$ and $k$ be an integer satisfying $\nu \ge k \ge r$. Then the number of $k$-dimensional totally isotropic subspaces in $\mathbb{F}_q^{(2\nu)}$ containing $X$ is
\[ \prod_{i=0}^{k-r-1} \frac{q^{2\nu-2r-2i} - 1}{q^{k-r-i} - 1}. \]

Proof. When $r = 0$ our lemma follows from Corollary 3.19 of [4]. Now assume that $r > 0$. Let $P$ be a $k$-dimensional totally isotropic subspace. Then $P \supset X$ if and only if $P \supset \langle X \rangle$. Therefore our lemma is a special case of Theorem 3.38 of [4].

Corollary 3.2. The number of maximal totally isotropic subspaces in the $2\nu$-dimensional symplectic space $\mathbb{F}_q^{(2\nu)}$ containing a given rank-$r$ isotropic set of vectors is
\[ \prod_{i=0}^{\nu-r-1} \bigl( q^{\nu-r-i} + 1 \bigr). \]
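Lemma 3.1 and Corollary 3.2 are easy to test in the smallest non-trivial case $q = 2$, $\nu = 2$, where the maximal totally isotropic subspaces are planes of $\mathbb{F}_2^{(4)}$ and can be enumerated directly. The following Python sketch is an illustration added here, not taken from the paper: vectors are encoded as 4-bit integers, the helper names are hypothetical, and over $\mathbb{F}_2$ the matrix $K$ reduces to the swap of the two halves of a vector because $-I = I$.

```python
from fractions import Fraction
from itertools import combinations

def lemma_3_1(q, nu, k, r):
    """Number of k-dimensional totally isotropic subspaces containing a rank-r isotropic set."""
    count = Fraction(1)
    for i in range(k - r):
        count *= Fraction(q ** (2 * nu - 2 * r - 2 * i) - 1, q ** (k - r - i) - 1)
    return count

def corollary_3_2(q, nu, r):
    """Number of maximal totally isotropic subspaces containing a rank-r isotropic set."""
    count = 1
    for i in range(nu - r):
        count *= q ** (nu - r - i) + 1
    return count

def form(u, v):
    """u K v^T over F_2 for 4-bit vectors (nu = 2)."""
    swapped = ((v & 0b0011) << 2) | (v >> 2)  # v K^T: exchange the two halves
    return bin(u & swapped).count("1") % 2

def isotropic_planes(contains=0):
    """All totally isotropic planes of F_2^4 that contain the vector `contains`."""
    planes = set()
    for a, b in combinations(range(1, 16), 2):
        if form(a, b) == 0:
            plane = frozenset({0, a, b, a ^ b})
            if contains in plane:
                planes.add(plane)
    return planes

# r = 0: all maximal totally isotropic subspaces; r = 1: those through one vector.
print(len(isotropic_planes()), lemma_3_1(2, 2, 2, 0), corollary_3_2(2, 2, 0))
print(len(isotropic_planes(contains=1)), lemma_3_1(2, 2, 2, 1), corollary_3_2(2, 2, 1))
```

The enumeration finds 15 totally isotropic planes in total and 3 through any fixed non-zero vector, matching both formulas. Setting $k = \nu$ in Lemma 3.1 turns each factor into $(q^{2(\nu-r-i)} - 1)/(q^{\nu-r-i} - 1) = q^{\nu-r-i} + 1$, which is how Corollary 3.2 follows.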
Theorem 3.3. Let $S$ be a set of non-zero vectors in the $2\nu$-dimensional symplectic space $\mathbb{F}_q^{(2\nu)}$, $M(S)$ be the matroid on $S$ defined by linear independence of vectors, $L(M(S))$ be the lattice of flats of the matroid $M(S)$, $L_I(M(S))$ be the ideal of isotropic flats in the lattice $L(M(S))$, and $\mu$ be the Möbius function on $L(M(S))$. Then for any positive integer $s \le \nu + 1$ the number of $(\nu + 1 - s)$-dimensional totally isotropic subspaces distinguishing $S$ is equal to
\[ \sum_{\substack{X \in L_I(M(S)):\\ r(X) \le \nu + 1 - s}} \mu(\emptyset, X) \prod_{i=0}^{\nu - s - r(X)} \frac{q^{2\nu - 2r(X) - 2i} - 1}{q^{\nu + 1 - s - r(X) - i} - 1}. \]

Proof. Let $X$ be a flat of $M(S)$. Denote by $g(s, X)$ the number of $(\nu + 1 - s)$-dimensional totally isotropic subspaces containing $X$. If $X$ is isotropic and $r(X) \le \nu + 1 - s$, then by Lemma 3.1
\[ g(s, X) = \prod_{i=0}^{\nu - s - r(X)} \frac{q^{2\nu - 2r(X) - 2i} - 1}{q^{\nu + 1 - s - r(X) - i} - 1}. \]
If $X$ is not isotropic, or $X$ is isotropic but $r(X) > \nu + 1 - s$, then $g(s, X) = 0$. Denote by $f(s, X)$ the number of $(\nu + 1 - s)$-dimensional totally isotropic subspaces $P$ such that $P \cap S = X$. Then
\[ g(s, X) = \sum_{Y \in L_I(M(S)):\, Y \supseteq X} f(s, Y). \]
By Möbius inversion,
\[ f(s, Y) = \sum_{X \in L_I(M(S)):\, X \supseteq Y} \mu(Y, X) \, g(s, X) = \sum_{\substack{X \in L_I(M(S)):\, X \supseteq Y \\ \text{and } r(X) \le \nu + 1 - s}} \mu(Y, X) \prod_{i=0}^{\nu - s - r(X)} \frac{q^{2\nu - 2r(X) - 2i} - 1}{q^{\nu + 1 - s - r(X) - i} - 1}. \]
For $Y = \emptyset$, $f(s, \emptyset)$ is the number of $(\nu + 1 - s)$-dimensional totally isotropic subspaces distinguishing $S$. The theorem is proved.

Corollary 3.4. Let $S$ be a set of non-zero vectors in the $2\nu$-dimensional symplectic space $\mathbb{F}_q^{(2\nu)}$ and $M(S)$, $L(M(S))$, $L_I(M(S))$, $\mu$ be as in Theorem 3.3. Then
\[ c_{\mathrm{symp}}(S, \mathbb{F}_q^{(2\nu)}) = \min\Bigl\{\, s \Bigm| \sum_{\substack{X \in L_I(M(S)) \\ \text{with } r(X) \le \nu + 1 - s}} \mu(\emptyset, X) \prod_{i=0}^{\nu - s - r(X)} \frac{q^{2\nu - 2r(X) - 2i} - 1}{q^{\nu + 1 - s - r(X) - i} - 1} \neq 0 \,\Bigr\}. \]
The critical problem in finite symplectic spaces was first studied by Kung [3], but the formulas in Theorem 3.3 and Corollary 3.4 obtained by him are incorrect. For completeness we quote also the following results of his [3].

Theorem 3.5. Let $S$ be a set of non-zero vectors in the $2\nu$-dimensional symplectic space $\mathbb{F}_q^{(2\nu)}$. Then the number of $s$-tuples of maximal totally isotropic subspaces distinguishing $S$ is
\[ \sum_{X \in L_I(M(S))} \mu(\emptyset, X) \prod_{i=0}^{\nu - r(X) - 1} \bigl( q^{\nu - r(X) - i} + 1 \bigr)^{s}. \]

Corollary 3.6. Let $S$ be a set of non-zero vectors in the $2\nu$-dimensional symplectic space $\mathbb{F}_q^{(2\nu)}$. Then
\[ c_{\mathrm{symp}}(S, \mathbb{F}_q^{(2\nu)}) = \min\Bigl\{\, s \Bigm| \sum_{X \in L_I(M(S))} \mu(\emptyset, X) \prod_{i=0}^{\nu - r(X) - 1} \bigl( q^{\nu - r(X) - i} + 1 \bigr)^{s} \neq 0 \,\Bigr\}. \]

Theorem 3.7. Let $S$ be a set of non-zero vectors in the $2\nu$-dimensional symplectic space $\mathbb{F}_q^{(2\nu)}$. Then
\[ c_{\mathrm{symp}}(S, \mathbb{F}_q^{(2\nu)}) \le \bigl\lfloor \tfrac{1}{2} c(S, \mathbb{F}_q^{(2\nu)}) \bigr\rfloor + 1, \]
where $\lfloor \tfrac{1}{2} c(S, \mathbb{F}_q^{(2\nu)}) \rfloor$ is the largest integer $\le \tfrac{1}{2} c(S, \mathbb{F}_q^{(2\nu)})$.

The above discussion of the critical problem in the finite symplectic space can be carried over to the finite pseudo-symplectic, unitary, and orthogonal spaces in a parallel way. The details will not be repeated, but we remark that in the finite orthogonal space over a field of characteristic 2, we should use singular sets of vectors and totally singular subspaces to replace the isotropic sets of vectors and totally isotropic subspaces.
References

[1] H. H. Crapo and G.-C. Rota, On the foundations of combinatorial theory: Combinatorial geometries, preliminary edition, M.I.T. Press, Cambridge, MA, 1970.
[2] T. A. Dowling, Codes, packing and the critical problems, in: Atti del Convegno di Geometria Combinatoria e sue Applicazioni, Instituto di Matematica, Università di Perugia, Perugia 1971, 209–224.
[3] J. P. S. Kung, Pfaffian structures and critical problems in finite symplectic spaces, Ann. Comb. 1 (1997), 159–172.
[4] Z.-X. Wan, Geometry of Classical Groups over Finite Fields, Studentlitteratur, Lund/Chartwell-Bratt, Bromley 1993.
[5] D. J. A. Welsh, Matroid Theory, Academic Press, London and New York 1976.
Critical problems in finite vector spaces
Z.-X. Wan
Academy of Mathematics and System Sciences, Chinese Academy of Sciences
and
Department of Information Technology, Lund University, P.O. Box 118, SE-221 00 Lund, Sweden
[email protected]
Existence of Steiner systems that admit automorphisms with large cycles

Richard M. Wilson
Abstract. For every integer k ≥ 2, we construct infinite families of Steiner systems S(2, k, v) that have (1) an automorphism that permutes the v points in a single cycle of length v, (2) an automorphism that fixes one point and permutes the remaining v − 1 points in a single cycle, or (3) an automorphism that fixes one point and permutes the remaining points in two cycles, of lengths r = (v − 1)/(k − 1) and v − r − 1. The designs we construct with property (2) are also resolvable. 2000 Mathematics Subject Classification: primary 05B05; secondary 05B25.
1. Introduction

A Steiner system S(2, k, v) consists of a set $X$ of $v$ points and a set $\mathcal{A}$ of $k$-element subsets called blocks so that every two points are contained together in a unique block. These are also called 2-(v, k, 1) designs or (v, k, 1)-BIBDs. (We are considering only designs with index λ = 1 here, though the techniques can be used for higher indices as well.) An automorphism of such an S(2, k, v) is a permutation of the points that takes blocks to blocks. In [5], the following theorem was proved.

Theorem 1.1. For every integer k ≥ 2, there exist infinitely many cyclic Steiner systems S(2, k, v), that is, Steiner systems S(2, k, v) that admit an automorphism that permutes the points in a single cycle of length v. In fact, such cyclic S(2, k, v) exist whenever v is a prime so that v ≡ 1 (mod k(k − 1)) and v is sufficiently large with respect to k.

A simple example of such a Steiner system is the S(2, 3, 13) whose 26 blocks are the images of {1, 3, 9} and {2, 6, 5} under the powers of the permutation α = (0 1 2 3 4 5 6 7 8 9 10 11 12).
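The S(2, 3, 13) example can be verified mechanically. The sketch below is an added illustration, with function names chosen here: it develops the two base blocks {1, 3, 9} and {2, 6, 5} under x → x + 1 (mod 13), then checks that every pair of points lies in exactly one block. It also checks the equivalent difference condition, namely that the differences within the base blocks cover each non-zero residue mod 13 exactly once.

```python
from itertools import combinations

def develop(base_blocks, v):
    """Distinct images of the base blocks under powers of x -> x + 1 (mod v)."""
    return {frozenset((x + j) % v for x in b) for b in base_blocks for j in range(v)}

def pair_coverage(blocks):
    """Map each unordered pair of points to the number of blocks containing it."""
    cover = {}
    for b in blocks:
        for pair in combinations(sorted(b), 2):
            cover[pair] = cover.get(pair, 0) + 1
    return cover

v = 13
base = [{1, 3, 9}, {2, 6, 5}]
blocks = develop(base, v)
cover = pair_coverage(blocks)

# Steiner property: all v(v-1)/2 pairs are covered, each exactly once.
is_steiner = len(cover) == v * (v - 1) // 2 and set(cover.values()) == {1}

# Difference condition on the base blocks.
diffs = sorted((a - b) % v for blk in base for a in blk for b in blk if a != b)

print(len(blocks), is_steiner, diffs == list(range(1, v)))
```

The run reports 26 blocks, confirms the Steiner property, and confirms that the twelve differences are exactly 1, 2, ..., 12.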
When v ≡ 1 (mod k(k − 1)), the blocks are permuted in cycles of length v. There are also cyclic Steiner systems where v ≡ k (mod k(k − 1)). A simple example of such a Steiner system is the S(2, 3, 15) whose 35 blocks are the distinct images of {0, 1, 4}, {0, 2, 8} and {0, 5, 10} under the powers of the permutation α = (0 1 2 3 4 5 6 7 8 9 10 11 12 13 14). Here the block {0, 5, 10} lies in an orbit of five blocks under the powers of α.

The congruence class k modulo k(k − 1) includes the numbers of points of the Steiner systems consisting of the points and lines of Desarguesian finite projective spaces of odd dimensions. Here k − 1 is a power of a prime. We will prove the following theorem.

Theorem 1.2. For every integer k ≥ 2, there exist infinitely many cyclic Steiner systems S(2, k, v) with v ≡ k (mod k(k − 1)).

A 1-rotational S(2, k, v) is one that admits an automorphism that fixes one point and permutes the others in a single cycle of length v − 1. A simple example of such a Steiner system is the S(2, 3, 27) whose 117 blocks are the distinct images of {2, 6, 18}, {1, 8, 10}, {3, 4, 24}, {9, 12, 20} and {∞, 0, 13} under the powers of the permutation α = (∞)(0 1 2 3 4 . . . 25). The block {∞, 0, 13} lies in an orbit of 13 blocks.

The Steiner systems consisting of the points and lines of Desarguesian finite affine spaces are 1-rotational. Here k is a power of a prime. For some other constructions, see e.g. [1] and [2]. We will prove the following theorem.

Theorem 1.3. For every integer k ≥ 2, there exist infinitely many 1-rotational Steiner systems S(2, k, v). Indeed, there exist infinitely many 1-rotational and resolvable Steiner systems S(2, k, v).

Here resolvable means that there is a partition of the blocks $\mathcal{A}$ into sets $\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_r$, r = (v − 1)/(k − 1), each of which is a parallel class; that is, for each i = 1, 2, . . . , r, each point is contained in exactly one block of $\mathcal{A}_i$.
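Both the cyclic S(2, 3, 15) and the 1-rotational S(2, 3, 27) can be checked in the same mechanical way, and for the S(2, 3, 27) a resolution can be exhibited explicitly. In the sketch below (an added illustration; the string `"inf"` standing for ∞ and all function names are choices made here), a first parallel class is formed from the images of the five base blocks under α^0 and α^13, and the remaining classes are its images under α^1, ..., α^12.

```python
from itertools import combinations

INF = "inf"  # stands for the fixed point, written as infinity in the text

def shift(block, j, n):
    """Image of a block under alpha^j, where alpha fixes INF and maps x -> x + 1 (mod n)."""
    return frozenset(x if x == INF else (x + j) % n for x in block)

def develop(base_blocks, n):
    """Distinct images of the base blocks under all powers of alpha."""
    return {shift(b, j, n) for b in base_blocks for j in range(n)}

def is_steiner(points, blocks):
    """True iff every pair of points lies in exactly one block."""
    cover = {}
    for b in blocks:
        for pair in combinations(b, 2):
            key = frozenset(pair)
            cover[key] = cover.get(key, 0) + 1
    v = len(points)
    return len(cover) == v * (v - 1) // 2 and set(cover.values()) == {1}

# Cyclic S(2, 3, 15): the base block {0, 5, 10} has a short orbit of length 5.
s15 = develop([{0, 1, 4}, {0, 2, 8}, {0, 5, 10}], 15)
print(len(s15), is_steiner(range(15), s15))

# 1-rotational S(2, 3, 27) on the points {inf} and Z_26.
base27 = [{2, 6, 18}, {1, 8, 10}, {3, 4, 24}, {9, 12, 20}, {INF, 0, 13}]
points27 = set(range(26)) | {INF}
s27 = develop(base27, 26)
print(len(s27), is_steiner(points27, s27))

# Resolution: 13 parallel classes of 9 blocks each.
class0 = {shift(b, j, 26) for b in base27 for j in (0, 13)}
classes = [{shift(b, i, 26) for b in class0} for i in range(13)]
is_parallel = all(len(c) == 9 and set().union(*c) == points27 for c in classes)
covers_all = set().union(*classes) == s27 and sum(len(c) for c in classes) == len(s27)
print(is_parallel, covers_all)
```

The block counts come out as 35 and 117, both systems pass the pair-coverage test, and the 13 classes partition the 117 blocks of the S(2, 3, 27) into parallel classes.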
Our example of an S(2, 3, 27) above is resolvable. The images of the initial blocks under the permutations $\alpha^{0}$ and $\alpha^{13}$ give nine distinct blocks (the block {∞, 0, 13} is fixed) that form a parallel class. The images of this parallel class under the permutations $\alpha^{0}, \alpha^{1}, \ldots, \alpha^{12}$ (or under the group $\{\alpha^{0}, \alpha^{2}, \ldots, \alpha^{24}\}$) provide 13 parallel classes.

Theorem 1.4. For every integer k ≥ 2, there exist infinitely many Steiner systems S(2, k, v) that admit an automorphism that permutes the points in cycles of length 1, r, and v − r − 1, where r = (v − 1)/(k − 1).
A simple example of such a Steiner system is the S(2, 4, 13) whose 13 blocks are the distinct images of {A, 2, 3, 5}, {∞, A, 0, 4} and {A, B, C, D} under the powers of the permutation α = (∞)(A B C D)(0 1 2 3 4 5 6 7). Here the block {∞, A, 0, 4} lies in an orbit of four blocks, and the block {A, B, C, D} is fixed by all powers of α. In general, the Steiner systems consisting of the points and lines of Desarguesian finite projective spaces have the property in the statement of Theorem 1.4.

It may be worth noting that if α is an automorphism as in the statement of Theorem 1.4, then $\alpha^{r}$ is an automorphism of order k − 2 that has r + 1 fixed points. In [3], Kreher, Stinson, and Zhu prove that an automorphism of prime order p = k − 2 can have at most r + 1 fixed points. They construct such designs when k − 1 is also a prime or prime power. Our constructions produce only large examples, but no conditions on k are necessary.

In Section 2, we introduce the language of decompositions of edge-colored graphs that we wish to use. The proof of Theorem 1.2 will be given in Section 3, and the proofs of Theorems 1.3 and 1.4 will be given in Section 4. Some remarks are given in Section 5.
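The S(2, 4, 13) example of this introduction, and the remark about the power α^r, can be checked with a few lines of Python. This is an added illustration only: the permutation is stored as a dictionary, ∞ is written as the string `"inf"`, and the helper names are hypothetical.

```python
from itertools import combinations

def perm_from_cycles(cycles):
    """Build a permutation (as a dict) from a list of cycles."""
    perm = {}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a] = b
    return perm

def orbit(block, perm):
    """Distinct images of a block under the powers of perm."""
    images, current = [], frozenset(block)
    while current not in images:
        images.append(current)
        current = frozenset(perm[x] for x in current)
    return images

def power(perm, e):
    """The e-th power of a permutation given as a dict."""
    result = {x: x for x in perm}
    for _ in range(e):
        result = {x: perm[result[x]] for x in perm}
    return result

def is_steiner(points, blocks, k):
    """True iff all blocks have size k and every pair of points lies in exactly one block."""
    cover = {}
    for b in blocks:
        if len(b) != k:
            return False
        for pair in combinations(b, 2):
            key = frozenset(pair)
            cover[key] = cover.get(key, 0) + 1
    v = len(points)
    return len(cover) == v * (v - 1) // 2 and set(cover.values()) == {1}

alpha = perm_from_cycles([["inf"], ["A", "B", "C", "D"], [0, 1, 2, 3, 4, 5, 6, 7]])
base = [{"A", 2, 3, 5}, {"inf", "A", 0, 4}, {"A", "B", "C", "D"}]
orbits = [orbit(b, alpha) for b in base]
blocks = {b for o in orbits for b in o}
print([len(o) for o in orbits], len(blocks), is_steiner(alpha, blocks, 4))

# Here k = 4 and r = (v - 1)/(k - 1) = 4: alpha^r has r + 1 fixed points and order k - 2.
alpha_r = power(alpha, 4)
fixed = [x for x in alpha_r if alpha_r[x] == x]
identity = {x: x for x in alpha}
print(len(fixed), alpha_r != identity and power(alpha_r, 2) == identity)
```

The three base blocks have orbits of lengths 8, 4 and 1, giving 13 blocks in all, and α^4 fixes the five points ∞, A, B, C, D while acting as an involution on the remaining eight.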
2. Decompositions of edge-colored complete graphs

We consider finite edge-r-colored directed graphs. Here, edge-r-colored means that each edge has a color chosen from a set of r colors. We often require edge-r-colored digraphs to be simple, i.e. there are no loops and for each ordered pair (x, y) of distinct vertices, there is at most one edge directed from x to y. Undirected and/or ‘mixed’ graphs may be included: we can identify an undirected edge of some color with a pair of opposite directed edges of that color. The term graph will be used below to mean ‘edge-colored directed graph’. We require that isomorphisms between edge-r-colored digraphs preserve the colors of edges.

Let $K_n^{(r)}$ be a complete graph on n vertices with exactly one edge of color i joining any vertex x to any other vertex y for every color i in a set of r colors. The graph $K_n^{(r)}$ has a total of rn(n − 1) edges and, of course, is not simple if r > 1.

A family $\mathcal{F}$ of subgraphs of a graph K will be called a decomposition of K if every edge e ∈ E(K) belongs to exactly one member of $\mathcal{F}$. Given a family $\mathcal{G}$ of edge-r-colored digraphs, a $\mathcal{G}$-decomposition of K is a decomposition $\mathcal{F}$ such that every graph $F \in \mathcal{F}$ is isomorphic to some graph $G \in \mathcal{G}$. Often $\mathcal{G} = \{G\}$ consists of a single graph G, and we speak of a G-decomposition.

An automorphism of a decomposition $\mathcal{F}$ is a permutation α of the vertices of $K_n^{(r)}$ so that $H \in \mathcal{F}$ if and only if $\alpha(H) \in \mathcal{F}$, where here, of course, α(H) is the subgraph that contains the edge of color i joining α(x) and α(y) if and only if H contains the edge of color i joining x and y.

The following theorem is proved in [4]. The last two sentences are not explicitly included in the statement of the theorem in [4], but the claims are perfectly clear from the proof given there.

Theorem 2.1. Let $G_0$ be an edge-r-colored digraph with m edges of each of r colors. Further assume that m is even. Then $K_q^{(r)}$ admits a $G_0$-decomposition for every prime power q ≡ m + 1 (mod 2m) with q sufficiently large. If q is prime, there is a $G_0$-decomposition of $K_q^{(r)}$ that is cyclic: it admits an automorphism that permutes the points in a single cycle of length q. (In general, there exists a $G_0$-decomposition that admits the elementary abelian group of order q as a group of automorphisms acting regularly on the vertices.)
3. Proof of Theorem 1.2

Let $\Gamma$ be a group of order k. We consider edge-k-colored digraphs and take the elements of $\Gamma$ as our colors. For each mapping $f : \Gamma \to \mathbb{Z}^{+}$ (the nonnegative integers) such that $\sum_{g \in \Gamma} f(g) = k$, let G(f) denote the graph with vertex set $V(G(f)) = \bigcup_{g \in \Gamma} T_g$, where the $T_g$'s are disjoint sets with $|T_g| = f(g)$ and where for all distinct x, y ∈ V(G(f)), there is an edge from x to y of color $a^{-1}b$, where a and b are such that $x \in T_a$ and $y \in T_b$. Let $\mathcal{G}$ be the collection of all such G(f).

We claim that the existence of a $\mathcal{G}$-decomposition of $K_q^{(k)}$ implies the existence of an S(2, k, kq) that admits $\Gamma$ as a group of automorphisms acting semiregularly on the kq points and so that the orbits of $\Gamma$ provide a parallel class of blocks.

Here is the construction. Given a $\mathcal{G}$-decomposition $\mathcal{F}$ of $K_q^{(k)}$, let V be the vertex set of $K_q^{(k)}$ and take $X = V \times \Gamma$ as the point set for our design. For each graph $F \in \mathcal{F}$, we construct k blocks $A_{F,h}$, $h \in \Gamma$, as follows: Write V(F) (the vertex set of F) as $\bigcup_{g \in \Gamma} S_g$ in a way so that the edge from x to y has color $a^{-1}b$ when $x \in S_a$ and $y \in S_b$, and define
\[ A_{F,h} = \bigcup_{g \in \Gamma} \bigl( S_g \times \{hg\} \bigr). \]
For the blocks of the S(2, k, kq), we take the sets $\{x\} \times \Gamma$ as x ranges over V, together with all sets $A_{F,h}$, $h \in \Gamma$, $F \in \mathcal{F}$.

To find a block containing two points (x, a) and (y, b) with x ≠ y, let F be the unique graph in $\mathcal{F}$ containing the edge of color $a^{-1}b$ from x to y. If $x \in S_c$ and $y \in S_d$, where $V(F) = \bigcup_{g \in \Gamma} S_g$ as above, we have $c^{-1}d = a^{-1}b$ and then (x, a) and (y, b) belong to $A_{F,h}$, where $h = ac^{-1} = bd^{-1}$. This is the only block that will contain both points.

For each $b \in \Gamma$, the permutation $\alpha_b : (x, a) \mapsto (x, ba)$ is an automorphism of the Steiner system that takes $A_{F,h}$ to $A_{F,bh}$.

(Conversely, the existence of an S(2, k, kq) that admits $\Gamma$ as a group of automorphisms acting semiregularly on the kq points and so that the orbits of $\Gamma$ provide a parallel class of blocks implies the existence of a $\mathcal{G}$-decomposition of $K_q^{(k)}$. Each orbit of blocks will give us one subgraph isomorphic to a member of $\mathcal{G}$. We omit the details.)

We claim that it is possible to find a simple graph $G_0$ that is the edge-disjoint union of graphs isomorphic to members of $\mathcal{G}$ so that $G_0$ has the same number m of edges of each of the k colors. We need only use two members of $\mathcal{G}$. Let $\varepsilon$ denote the identity in $\Gamma$. Define $f_1$ and $f_2$ by $f_1(a) = 1$ for all $a \in \Gamma$, and
\[ f_2(a) = \begin{cases} k & \text{if } a = \varepsilon, \\ 0 & \text{if } a \neq \varepsilon. \end{cases} \]
Then $G(f_1)$ has k edges of each of the k − 1 non-identity colors while $G(f_2)$ has k(k − 1) edges of color $\varepsilon$. So we may take m = k(k − 1) and $G_0$ to be any simple edge-disjoint union (e.g. a vertex disjoint union) of k − 1 copies of $G(f_1)$ and one copy of $G(f_2)$.

Now take $\Gamma$ to be cyclic of order k and let c be a generator of $\Gamma$. By Theorem 2.1, there are infinitely many primes q ≡ m + 1 (mod 2m) for which there exists a $G_0$-decomposition, and hence a $\mathcal{G}$-decomposition, of $K_q^{(k)}$ with an automorphism δ that is a cyclic permutation of V. We can extend δ to an automorphism $\delta^{*} : (x, a) \mapsto (\delta(x), a)$ of the S(2, k, kq) constructed above; this automorphism also has order q. Since q and k are relatively prime, the product of $\alpha_c$ and $\delta^{*}$ is an automorphism of the S(2, k, kq) that permutes the kq points in a single cycle. This proves Theorem 1.2.
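The edge counts used to build $G_0$ in this section can be confirmed by direct enumeration. The sketch below is an added illustration under one assumption made here: the group of colors is taken to be cyclic and written additively as Z_k, so that the color $a^{-1}b$ becomes (b − a) mod k and the identity is 0. Function names are hypothetical.

```python
from collections import Counter

def color_counts(f, order):
    """Edge-color counts of G(f) for the cyclic group Z_order.

    f maps each group element g in range(order) to |T_g|; the edge from
    x in T_a to y in T_b gets color a^{-1} b, i.e. (b - a) mod order.
    """
    labels = [g for g in range(order) for _ in range(f[g])]  # group label of each vertex
    counts = Counter()
    for i, a in enumerate(labels):
        for j, b in enumerate(labels):
            if i != j:
                counts[(b - a) % order] += 1
    return counts

def g0_color_counts(k):
    """Color counts of k - 1 copies of G(f1) together with one copy of G(f2)."""
    f1 = {g: 1 for g in range(k)}
    f2 = {g: (k if g == 0 else 0) for g in range(k)}
    total = Counter()
    for _ in range(k - 1):
        total += color_counts(f1, k)
    total += color_counts(f2, k)
    return total

for k in range(2, 8):
    counts = g0_color_counts(k)
    # Every one of the k colors should occur m = k(k - 1) times.
    print(k, all(counts[c] == k * (k - 1) for c in range(k)))
```

For each k tested, every color occurs exactly k(k − 1) times, which is the value of m used in the application of Theorem 2.1; note that m is even, as that theorem requires.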
4. Proofs of Theorems 1.3 and 1.4

Let $\Gamma$ be a group of order k − 1. We consider edge-k-colored digraphs and take the elements of $\Gamma \cup \{\zeta\}$ as our colors. For each mapping $f : \Gamma \to \mathbb{Z}^{+}$ such that $\sum_{g \in \Gamma} f(g) = k$, let G(f) denote the graph with k + 1 vertices
\[ V(G(f)) = \{w\} \cup \bigcup_{g \in \Gamma} T_g, \]
where the $T_g$'s are disjoint sets with $|T_g| = f(g)$ and w is another point. For all distinct $x, y \in \bigcup_{g \in \Gamma} T_g$, there is an edge from x to y of color $a^{-1}b$, where a and b are such that $x \in T_a$ and $y \in T_b$; there is an edge of color ζ from w to each vertex in $\bigcup_{g \in \Gamma} T_g$. Let $\mathcal{G}$ be the collection of all such G(f).

We claim that the existence of a $\mathcal{G}$-decomposition of $K_q^{(k)}$ implies the existence of an S(2, k, (k − 1)q + 1) that admits $\Gamma$ as a group of automorphisms fixing one point and acting semiregularly on the remaining (k − 1)q points and so that the union of the fixed point and any orbit of $\Gamma$ is a block. Moreover, the S(2, k, (k − 1)q + 1) is resolvable in such a way that elements of $\Gamma$ fix each parallel class of blocks.

Here is the construction. Given a $\mathcal{G}$-decomposition $\mathcal{F}$ of $K_q^{(k)}$, let V be the vertex set of $K_q^{(k)}$ and take $X = \{\infty\} \cup (V \times \Gamma)$ as the point set for our design. For each graph $F \in \mathcal{F}$, we construct k − 1 blocks $A_{F,h}$, $h \in \Gamma$, as follows: Write
\[ V(F) = \{w_F\} \cup \bigcup_{g \in \Gamma} S_g \]
in a way so that the edge from x to y has color $a^{-1}b$ when $x \in S_a$ and $y \in S_b$, and edges from $w_F$ to the other vertices have color ζ, and define
\[ A_{F,h} = \bigcup_{g \in \Gamma} \bigl( S_g \times \{hg\} \bigr). \]
For the blocks of the S(2, k, (k − 1)q + 1), we take the sets $\{\infty\} \cup (\{x\} \times \Gamma)$ as x ranges over V, together with all sets $A_{F,h}$, $h \in \Gamma$, $F \in \mathcal{F}$. The same argument as in the proof of Theorem 1.2 shows that there is a unique block containing two points (x, a) and (y, b) with x ≠ y. The blocks $\{\infty\} \cup (\{x\} \times \Gamma)$ contain all other pairs of points.

We show that the constructed design is resolvable. For each z ∈ V, let $\mathcal{A}_z$ be the set of blocks consisting of the block $\{\infty\} \cup (\{z\} \times \Gamma)$ and all blocks $A_{F,h}$ where $w_F = z$. A point (x, a) with x ≠ z is contained in exactly one of these latter blocks, namely in $A_{F,h}$ where F is the unique graph in $\mathcal{F}$ that contains the edge of color ζ directed from z to x, and where $h = ag^{-1}$ if $x \in S_g$. The parallel classes $\mathcal{A}_z$, z ∈ V, partition the set of blocks.

For each $b \in \Gamma$, the permutation $\alpha_b$ that fixes ∞ and takes $(x, a) \mapsto (x, ba)$ is an automorphism of the Steiner system that takes $A_{F,h}$ to $A_{F,bh}$. Such an automorphism fixes all parallel classes $\mathcal{A}_z$.

We claim that it is possible to find a simple graph $G_0$ that is the edge-disjoint union of graphs isomorphic to members of $\mathcal{G}$ so that $G_0$ has the same number m of edges of each of the k colors. As in the proof of Theorem 1.2, we need only use two members of $\mathcal{G}$. Let $\varepsilon$ denote the identity in $\Gamma$. Define $f_1$ and $f_2$ as follows:
\[ f_1(a) = \begin{cases} 2 & \text{if } a = \varepsilon, \\ 1 & \text{if } a \neq \varepsilon, \end{cases} \qquad f_2(a) = \begin{cases} k & \text{if } a = \varepsilon, \\ 0 & \text{if } a \neq \varepsilon. \end{cases} \]
Then $G(f_1)$ has k edges of color ζ, two edges of color $\varepsilon$, and k + 1 edges of each of the k − 2 non-identity colors. And $G(f_2)$ has k edges of color ζ and k(k − 1) edges of color $\varepsilon$. So we may take m = k(k + 1) and $G_0$ to be any simple graph that is the edge-disjoint union of k copies of $G(f_1)$ and one copy of $G(f_2)$.

Now take $\Gamma$ to be cyclic of order k − 1 and let c be a generator of $\Gamma$. By Theorem 2.1, there are infinitely many primes q ≡ m + 1 (mod 2m) for which there exists a $G_0$-decomposition, and hence a $\mathcal{G}$-decomposition, of $K_q^{(k)}$ with an automorphism δ that is a cyclic permutation of V. We can extend δ to an automorphism $\delta^{*}$ of the S(2, k, (k − 1)q + 1) constructed above by defining $\delta^{*}(\infty) = \infty$ and $\delta^{*}(x, a) = (\delta(x), a)$; this automorphism also has order q. Since q and k − 1 are relatively prime, the product of $\alpha_c$ and $\delta^{*}$ is an automorphism of the S(2, k, (k − 1)q + 1) that permutes all the points other than ∞ in a single cycle of length (k − 1)q. This proves Theorem 1.3.

The values of q above are ≡ 1 (mod k(k + 1)), so as long as q is large enough, there exists, in addition to the S(2, k, (k − 1)q + 1) constructed above, a cyclic S(2, k + 1, q). This is by Theorem 1.1. We may assume the S(2, k + 1, q) has point set V and has δ as an automorphism. We can then construct an S(2, k + 1, kq + 1) with point set $V \cup X = V \cup \{\infty\} \cup (V \times \Gamma)$ and whose blocks are the blocks of the S(2, k + 1, q) together with all sets of the form {z} ∪ A with z ∈ V and $A \in \mathcal{A}_z$. The permutation whose restriction to V is δ, and whose restriction to X is the product of $\alpha_c$ and $\delta^{*}$, is an automorphism of the S(2, k + 1, kq + 1) that fixes ∞ and permutes the remaining points in cycles of length q and (k − 1)q. This proves Theorem 1.4; these Steiner systems have the properties required in the statement of Theorem 1.4, except that k has been replaced by k + 1.
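The color counts claimed for this second construction can be confirmed in the same way as for Section 3. The sketch below is an added illustration that assumes a cyclic group of order k − 1 written additively, with identity 0 and with the extra color ζ represented by the string `"zeta"`; the function names are hypothetical.

```python
from collections import Counter

ZETA = "zeta"  # the extra color on the edges leaving the special vertex w

def color_counts(f, order):
    """Edge-color counts of G(f): group colors (b - a) mod order, plus ZETA edges from w."""
    labels = [g for g in range(order) for _ in range(f[g])]  # group label of each vertex
    counts = Counter()
    for i, a in enumerate(labels):
        for j, b in enumerate(labels):
            if i != j:
                counts[(b - a) % order] += 1
    counts[ZETA] = len(labels)  # one edge of color ZETA from w to each other vertex
    return counts

def g0_color_counts(k):
    """Color counts of k copies of G(f1) together with one copy of G(f2)."""
    order = k - 1
    f1 = {g: (2 if g == 0 else 1) for g in range(order)}
    f2 = {g: (k if g == 0 else 0) for g in range(order)}
    total = Counter()
    for _ in range(k):
        total += color_counts(f1, order)
    total += color_counts(f2, order)
    return total

for k in range(2, 8):
    counts = g0_color_counts(k)
    colors = list(range(k - 1)) + [ZETA]
    # Each of the k colors should occur m = k(k + 1) times.
    print(k, all(counts[c] == k * (k + 1) for c in colors))
```

In every case tested each of the k colors occurs k(k + 1) times, so m = k(k + 1), and the primes q ≡ m + 1 (mod 2m) supplied by Theorem 2.1 are congruent to 1 modulo k(k + 1), as used in the final paragraph of the proof.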
5. Remarks

It may be conjectured that, given k, cyclic S(2, k, v) exist for all integers v that are sufficiently large with respect to k and satisfy v ≡ 1 or k (mod k(k − 1)). It seems that this would be very hard to prove with current techniques.

More can be done if both k and the group Γ (its order) are fixed. We note that the constructions of Sections 3 and 4 show that any group of order k or k − 1 (or of order a divisor of k or k − 1, since these may be subgroups of groups of order k or k − 1) may appear in the automorphism group of an S(2, k, v). In [4] it is proved that given k and a group Γ of order k − 1, there exist Steiner systems S(2, k, v) so that Γ fixes one point x_0 and acts semiregularly on the others, and such that all blocks on x_0 are fixed, whenever v = r(k − 1) + 1 is sufficiently large and r(r − 1) ≡ 0 (mod k), and, in addition, if k ≡ 3 (mod 4), then r(r − 1) ≡ 0 (mod 4). These conditions are necessary whether v is large or not.
Let Γ be a group of order m. For each mapping f : Γ → Z^+, let G(f) denote the graph with vertices V(G(f)) = ⋃_{g∈Γ} T_g, where the T_g's are disjoint sets with |T_g| = f(g). We take the elements of Γ as colors, and for all distinct x, y ∈ V(G(f)), take an edge from x to y of color a^{-1}b, where a and b are such that x ∈ T_a and y ∈ T_b. These may be called ‘coboundary colorings’ of complete digraphs. Let 𝒢 be the collection of such coboundary colorings of complete digraphs on k vertices, i.e. where f is such that Σ_{g∈Γ} f(g) = k. Then a 𝒢-decomposition of K_n^{(m)} can be seen to be equivalent to the existence of a group divisible design (GDD) with n groups of size m and block size k (and index 1) on which Γ acts semiregularly so that the orbits of Γ are the groups of the design. (This last sentence can be confusing because of the use of ‘group’ with two different meanings. The groups of the design are not algebraic structures.) It is possible to use the results and techniques of [4] to give necessary and ‘asymptotically’ sufficient conditions on n for the existence of such decompositions (and GDD's). The case m = k − 1 was alluded to above. Further details may appear elsewhere.
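The edge counts quoted in Section 4 can be checked by brute force. The following Python sketch is only an illustration under stated assumptions: it takes Γ to be the cyclic group of order m, written additively as Z_m with identity 0, and the function name `coboundary_color_counts` is hypothetical. It builds the coboundary coloring G(f) without the extra color ζ and counts directed edges of each color for the two mappings that put two vertices (respectively all k vertices) in the identity part.

```python
from collections import Counter
from itertools import permutations

def coboundary_color_counts(f):
    """Count directed edges of each color in G(f) for the cyclic group Z_m.

    f[g] is the size of the part T_g for g = 0, ..., m-1 (0 is the identity).
    An edge from x in T_a to y in T_b gets color a^{-1} b, i.e. (b - a) mod m.
    """
    m = len(f)
    # Label each vertex by the group element of the part that contains it.
    labels = [g for g in range(m) for _ in range(f[g])]
    counts = Counter()
    # Ordered pairs of distinct vertices are the directed edges.
    for i, j in permutations(range(len(labels)), 2):
        counts[(labels[j] - labels[i]) % m] += 1
    return dict(counts)

k = 5                        # block size; the group has order k - 1 = 4
f1 = [2] + [1] * (k - 2)     # two vertices in the identity part, one elsewhere
f2 = [k] + [0] * (k - 2)     # all k vertices in the identity part
print(coboundary_color_counts(f1))
print(coboundary_color_counts(f2))
```

For k = 5 the first count gives two edges of the identity color and k + 1 = 6 edges of every other color, and the second gives k(k − 1) = 20 edges of the identity color, matching the figures used to choose m = k(k + 1).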
References

[1] M. Buratti, Old and new designs via difference multisets and strong difference families, J. Combin. Des. 7 (1999), 406–425.
[2] M. Buratti and F. Zuanni, G-invariantly resolvable Steiner 2-designs which are 1-rotational over G, in: Finite geometry and combinatorics (Deinze, 1997), Bull. Belg. Math. Soc. Simon Stevin 5 (1998), 221–235.
[3] D. L. Kreher, D. R. Stinson, and L. Zhu, On the maximum number of fixed points in automorphisms of prime order of 2-(v, k, 1) designs, Ann. Comb. 1 (1997), 227–243.
[4] E. R. Lamken and R. M. Wilson, Decompositions of edge-colored complete graphs, J. Combin. Theory Ser. A 89 (2000), 149–200.
[5] R. M. Wilson, Cyclotomy and difference families in elementary abelian groups, J. Number Theory 4 (1972), 17–47.

R. M. Wilson
Department of Mathematics 253-37
California Institute of Technology
Pasadena, CA 91125, U.S.A.
[email protected]
Rainbow graphs Andrew J. Woldar ∗
Abstract. A rainbow graph is a graph Γ that can be vertex-colored so that every color is represented once, and only once, in each neighborhood Γ_1(v), v ∈ V(Γ). In this paper we discuss some properties of rainbow graphs, and show how special families of such graphs can be obtained via filtrations of free products of finite groups. We also discuss some of their applications.

2000 Mathematics Subject Classification: primary 05C25; secondary 05C35, 05C90.
∗ This research was partially supported by NSF grant DMS-9622091.

Codes and Designs, Ohio State Univ. Math. Res. Inst. Publ. 10, © Walter de Gruyter 2002

1. Introduction

A rainbow graph is a graph Γ that can be vertex-colored so that every color is represented once, and only once, in each neighborhood Γ_1(v), v ∈ V(Γ). Such a coloring π : V(Γ) → C will be called a rainbow coloring of Γ. We shall sometimes write (Γ, π) when referring to a rainbow graph with specified rainbow coloring.

Proposition 1.1. Let Γ be a rainbow graph with v vertices and e edges, and let π : V(Γ) → C be a rainbow coloring of Γ. Then the following hold.
(a) Γ is k-regular, where k is the size of the color set C.
(b) Γ contains a perfect matching.
(c) Each color c ∈ C occurs equally often in (Γ, π).
(d) k | v and k^2 | e.
(e) For any subset C′ of C the induced subgraph Γ[V′], where V′ = π^{-1}(C′), is a rainbow graph with rainbow coloring π|_{V′} : V′ → C′.

Proof. The proof of (a) is trivial. To prove (b), simply note that every vertex of Γ is in a unique monochromatic edge of (Γ, π) (i.e., an edge xy for which π(x) = π(y)). Thus the monochromatic edges of (Γ, π) form a perfect matching of Γ. Let Γ′ be the graph obtained by deleting all monochromatic edges from Γ. Then, for any pair of distinct colors c_1, c_2 ∈ C, the subgraph of Γ′ induced on the set {x ∈ V(Γ′) | π(x) ∈ {c_1, c_2}} is a matching. Thus |π^{-1}(c_1)| = |π^{-1}(c_2)| and (c) follows. From (c), v = Σ_{c∈C} |π^{-1}(c)| = k|π^{-1}(c)| and e = vk/2 = k^2|π^{-1}(c)|/2. But it follows from the proof of (b) that |π^{-1}(c)| is even, in which case (d) follows. The proof of (e) is straightforward and left to the reader.

Recall that the distance-two graph of a graph Γ is defined to be the graph Γ_2 having vertex set V(Γ_2) = V(Γ) and edge set E(Γ_2) = {xy | dist_Γ(x, y) = 2}.

Proposition 1.2. Let π be a rainbow coloring of a rainbow graph Γ. Then π induces a proper k-coloring of the vertices of Γ_2. Moreover, if Γ is triangle-free then c(Γ_2) = χ(Γ_2) = k, where c(Γ_2) and χ(Γ_2) denote the clique number and chromatic number of Γ_2, respectively.

Proof. If π(x) = π(y) with xy an edge in Γ_2, then there exists a vertex z such that xz and yz are edges in Γ. But this contradicts the fact that π is a rainbow coloring of Γ. If Γ is triangle-free, then the neighbors Γ_1(x) of x ∈ V(Γ) form a k-clique in Γ_2. But then k ≤ c(Γ_2) ≤ χ(Γ_2) ≤ k.
2. Generic construction of rainbow graphs

The following recipe for the construction of rainbow graphs is implicit in the proof of Proposition 1.1: For k and t positive integers, form k distinct copies V_1, …, V_k of the graph tK_2 (i.e., the graph with t disjoint edges). For each pair i, j with 1 ≤ i < j ≤ k, construct a perfect matching between V_i and V_j, that is, a matching each edge of which has one endpoint in V_i, the other in V_j, and which covers every vertex of both. Then the obtained graph is a rainbow graph; indeed, a rainbow coloring results simply by coloring each vertex of V_i with color i.

Although every rainbow graph arises in the manner outlined above, one is still faced with the problem of recognizing whether or not a given graph is rainbow. Here, Proposition 1.1 offers some measure of help but additional results of this type are clearly needed. Moreover, the class of rainbow graphs is quite robust and it thus becomes advantageous to be able to construct graphs from this class that have certain desired properties such as large automorphism group or large girth. (In Section 6 we shall visit situations in which such properties are application driven.)

Obviously, it is hopeless to appeal to the recipe above as part of any serious attempt to construct a graph with, say, large automorphism group. Thus we must find more refined ways to construct rainbow graphs, ways in which some underlying structure influences the properties that result. This motivates our foray into the realm of groups in later sections.
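The recipe can be tried out directly. The Python sketch below is a minimal illustration, assuming the matchings between distinct classes are perfect and chosen at random; the function names `random_rainbow_graph` and `is_rainbow_coloring` and the vertex labels (i, s) are hypothetical and introduced only for this example.

```python
import random

def random_rainbow_graph(k, t, seed=0):
    """Follow the recipe: k color classes, each a copy of tK_2, joined pairwise
    by perfect matchings. Returns (adjacency dict, coloring dict)."""
    rng = random.Random(seed)
    classes = [[(i, s) for s in range(2 * t)] for i in range(k)]
    adj = {v: set() for cls in classes for v in cls}
    color = {v: v[0] for v in adj}
    # Inside V_i: the t disjoint edges {(i,0),(i,1)}, {(i,2),(i,3)}, ...
    for cls in classes:
        for s in range(0, 2 * t, 2):
            adj[cls[s]].add(cls[s + 1])
            adj[cls[s + 1]].add(cls[s])
    # Between V_i and V_j (i < j): a random perfect matching.
    for i in range(k):
        for j in range(i + 1, k):
            shuffled = classes[j][:]
            rng.shuffle(shuffled)
            for x, y in zip(classes[i], shuffled):
                adj[x].add(y)
                adj[y].add(x)
    return adj, color

def is_rainbow_coloring(adj, color):
    """True if every neighborhood contains every color exactly once."""
    palette = sorted(set(color.values()))
    return all(sorted(color[w] for w in adj[v]) == palette for v in adj)

adj, color = random_rainbow_graph(k=4, t=3)
print(is_rainbow_coloring(adj, color))
```

With k = 4 and t = 3 the graph has v = 2tk = 24 vertices and e = vk/2 = 48 edges, so k | v and k^2 | e as Proposition 1.1(d) requires. The random matchings make plain why the recipe gives no control over symmetry or girth.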
3. Rainbow morphisms and coordinatized graphs

Let (Γ, π) and (Γ′, π′) be rainbow graphs with identical color set C. A rainbow morphism of (Γ, π) onto (Γ′, π′) is a surjective graph homomorphism η : V(Γ) → V(Γ′) which satisfies π′(η(v)) = π(v) for all v ∈ V(Γ). In this situation we call (Γ′, π′) a rainbow quotient of (Γ, π). It is clear that rainbow morphisms are local isomorphisms, i.e., for any v ∈ V(Γ) the subgraph of Γ induced on the neighbor set of v is isomorphic to the subgraph of Γ′ induced on the neighbor set of η(v).

In [11] Lazebnik and the author introduced a class of graphs which we here refer to as coordinatized. We focus on the bipartite version of such graphs because they turn out to be rainbow. We give their definition presently.

Let R be an arbitrary commutative ring, and denote by R^n the Cartesian product of n copies of R. For each 2 ≤ i ≤ n, let f_i : R^{2i−2} → R be an arbitrary function. The bipartite graph B_n = B(R; f_2, …, f_n) is defined as follows. The set of vertices V(B_n) is the disjoint union of two copies of R^n, one denoted by P_n and the other by L_n. Elements of P_n will be called points and those of L_n lines. In order to distinguish points from lines we introduce the use of parentheses and brackets: if a ∈ R^n, then (a) ∈ P_n and [a] ∈ L_n. Edges of B_n are defined by declaring point (p) = (p_1, p_2, …, p_n) and line [l] = [l_1, l_2, …, l_n] to be adjacent if and only if the following n − 1 relations on their coordinates hold:

\[
\begin{aligned}
p_2 + l_2 &= f_2(p_1, l_1)\\
p_3 + l_3 &= f_3(p_1, l_1, p_2, l_2)\\
&\;\;\vdots\\
p_n + l_n &= f_n(p_1, l_1, p_2, l_2, \dots, p_{n-1}, l_{n-1})
\end{aligned}
\tag{1}
\]
Proposition 3.1. For any n ≥ 2 and commutative ring R, the graph B_n is a rainbow graph with rainbow coloring π_n : R^n → R given by π_n : (p_1, …, p_n) ↦ p_1 and π_n : [l_1, …, l_n] ↦ l_1. Moreover, for any i < j the truncation map φ_{ij} : R^j → R^i defined by φ_{ij} : (p_1, …, p_j) ↦ (p_1, …, p_i) and φ_{ij} : [l_1, …, l_j] ↦ [l_1, …, l_i] is a rainbow morphism of (B_j, π_j) onto (B_i, π_i).

Proof. We must first verify that π_n is a rainbow coloring for B_n. By symmetry, it suffices to show that given any point (p) ∈ P_n and color α ∈ R, there is a unique neighbor of (p) which has color α, i.e., a unique line of the form [l] = [α, l_2, …, l_n] which is adjacent to (p). Given the value l_1 = α and values for p_1, …, p_n, one recursively obtains

\[
\begin{aligned}
l_2 &= f_2(p_1, \alpha) - p_2\\
l_3 &= f_3(p_1, \alpha, p_2, l_2) - p_3\\
&\;\;\vdots\\
l_n &= f_n(p_1, \alpha, p_2, l_2, \dots, p_{n-1}, l_{n-1}) - p_n,
\end{aligned}
\tag{2}
\]

and the unique α-colored neighbor [l] of (p) is thereby determined.
Clearly, φ_{ij} is a graph homomorphism since the conditions for adjacency in B_i are given by the first i − 1 functions f_2, …, f_i defining adjacency in B_j. As π_i(φ_{ij}(v)) = π_j(v) for all vertices v of B_j, the mapping φ_{ij} preserves color, so is a rainbow morphism.

While the graphs B_n form a rather diverse family of rainbow graphs, it is still difficult to construct graphs from this family which have special predetermined properties. There are, however, some properties shared by all members of the family which have been quite useful in applications. For example, the fact that, for appropriate m, the edge set of the complete graph K_m can be partitioned into edge disjoint copies of a certain polarity graph of B_n has led to results in multicolor Ramsey numbers, see Section 6.
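The recursion in the proof of Proposition 3.1 is easy to run. The Python sketch below is an illustration under stated assumptions: R is taken to be Z_m, the two functions in `fs` are a hypothetical choice made only for the example, and the names `colored_neighbor` and `adjacent` do not come from the text.

```python
def colored_neighbor(x, alpha, fs, m, x_is_point=True):
    """Unique neighbor of x with color alpha in B_n = B(Z_m; f_2, ..., f_n).

    x  : tuple (x_1, ..., x_n), coordinates of a point (or of a line).
    fs : list [f_2, ..., f_n]; f_i takes p_1, l_1, ..., p_{i-1}, l_{i-1}.
    """
    y = [alpha % m]
    for i, f in enumerate(fs, start=1):       # i is the 0-based coordinate index
        args = []
        for a, b in zip(x[:i], y[:i]):
            # Arguments are always passed in (p_j, l_j) order.
            args += [a, b] if x_is_point else [b, a]
        y.append((f(*args) - x[i]) % m)
    return tuple(y)

def adjacent(p, l, fs, m):
    """Check the relations p_i + l_i = f_i(p_1, l_1, ..., p_{i-1}, l_{i-1})."""
    for i, f in enumerate(fs, start=1):
        args = [c for pair in zip(p[:i], l[:i]) for c in pair]
        if (p[i] + l[i]) % m != f(*args) % m:
            return False
    return True

# Hypothetical functions for the example (n = 3, R = Z_5).
m = 5
fs = [lambda p1, l1: p1 * l1,
      lambda p1, l1, p2, l2: p1 * l2 + p2 * l1]

p = (2, 3, 4)
line = colored_neighbor(p, alpha=1, fs=fs, m=m)
print(line, adjacent(p, line, fs, m))   # (1, 4, 2) True
```

Dropping the last coordinate of both `p` and `line`, and the last function of `fs`, gives an adjacent pair in B_2, which is the truncation morphism φ_{23} in this small case.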
4. Group rainbow graphs and unipotent-like factorizations

Let G be a group with proper subgroups G_1 and G_2. The group incidence structure γ(G)_{G_1,G_2} is defined as follows. The set of objects of γ(G)_{G_1,G_2} is (G : G_1) ∪ (G : G_2), where (G : G_i) is the set of right cosets of G_i in G, i = 1, 2. Two cosets are defined to be incident if their set theoretic intersection is nonempty. We denote the incidence graph of γ(G)_{G_1,G_2} by Γ(G)_{G_1,G_2}. Clearly Γ(G)_{G_1,G_2} is a bipartite graph. Note also that G is a subgroup of the full automorphism group of Γ(G)_{G_1,G_2}. In fact, the action of G on the edge set of Γ(G)_{G_1,G_2} is similar to its action on (G : G_1 ∩ G_2) via right translation, that is ((G_1x, G_2x)^g)^ϕ = ((G_1x, G_2x)^ϕ)^g, where we define ϕ : E(Γ(G)_{G_1,G_2}) → (G : G_1 ∩ G_2) via ϕ : (G_1x, G_2x) ↦ (G_1 ∩ G_2)x.

Though it is not necessarily the case that the graph Γ(G)_{G_1,G_2} is a rainbow graph, we call a rainbow graph which is isomorphic to Γ(G)_{G_1,G_2} for some G, G_1, G_2 a group rainbow graph. Moreover, a rainbow quotient of a group rainbow graph which is itself a group rainbow graph will be called a group rainbow quotient.

The following definition is motivated by the behavior of unipotent subgroups of rank two groups of Lie type (see Section 5.3 of [3], especially Theorem 5.3.3). A group U is said to admit a unipotent-like factorization U = U_1U_2U_3 if, for certain subgroups U_1, U_2, U_3, the following hold:
(i) Every element u ∈ U can be uniquely expressed in the form u = u_1u_2u_3 with u_i ∈ U_i, i = 1, 2, 3.
(ii) U_3 contains [U_1, U_2] = ⟨[u_1, u_2] | u_1 ∈ U_1, u_2 ∈ U_2⟩.

Note that any group U which admits a unipotent-like factorization U = U_1U_2U_3 automatically admits the factorization U = U_2U_1U_3 as well. Thus each u ∈ U has also a unique representation of the form u = u_2′u_1′u_3′, with u_i′ ∈ U_i, i = 1, 2, 3.
In fact, it is easily seen that the two representations u = u_1u_2u_3 = u_2′u_1′u_3′ of u must be related by u_1′ = u_1, u_2′ = u_2, and u_3′ = [u_1, u_2]u_3.

We call a group which admits a unipotent-like factorization U = U_1U_2U_3 balanced provided |U_1| = |U_2|. This occurs, for example, in all rank two groups of normal Lie type (viz., groups of Lie type A_2, B_2 and G_2) wherein it is actually the case that U_1 and U_2 are each isomorphic to the additive group of some finite field (see Section 5.1 of [3]).

Consider now the group incidence structure γ(U)_{U_1,U_2} where U is balanced, and fix a bijection α : U_1 → U_2. The following may be concluded directly from the definitions.
1. For every coset U_1u there is a canonical representative u_2u_3 with u_2 ∈ U_2 and u_3 ∈ U_3. We define u_2 to be the color π(U_1u) of U_1u.
2. For every coset U_2u there is a canonical representative u_1u_3 with u_1 ∈ U_1 and u_3 ∈ U_3. We define α(u_1) to be the color π(U_2u) of U_2u.
Thus we obtain a coloring π of the vertices of Γ(U)_{U_1,U_2} for which the color set is U_2.

Proposition 4.1. Let U be balanced, with π as defined above. Then Γ(U)_{U_1,U_2} is a group rainbow graph with rainbow coloring π.

Proof. By symmetry, it suffices to show that given an arbitrary coset U_1u ∈ (U : U_1) and color g ∈ U_2, there is a unique neighbor of U_1u of color g. Let g = α(g_1), and write u = u_1u_2u_3 where u_i ∈ U_i, i = 1, 2, 3. Then U_1u = U_1u_2u_3 clearly contains the element g_1u_2u_3, which we may alternately express in the form u_2g_1[g_1, u_2]u_3. But then since [g_1, u_2]u_3 ∈ U_3, it follows that U_2g_1[g_1, u_2]u_3 is a neighbor of U_1u of color g. Now suppose there is another neighbor of U_1u of color g, say U_2g_1g_3, g_3 ∈ U_3. As U_1u ∩ U_2g_1g_3 ≠ ∅, there exist elements h_1 ∈ U_1 and h_2 ∈ U_2 such that h_1u_2u_3 = h_2g_1g_3. Re-expressing h_2g_1g_3, we obtain h_1u_2u_3 = g_1h_2[h_2, g_1]g_3 which, by uniqueness of representation, gives h_1 = g_1, u_2 = h_2, and u_3 = [h_2, g_1]g_3. But then U_2g_1g_3 = U_2g_1[g_1, u_2]u_3, and we have established uniqueness of the g-colored neighbor of U_1u.

The following result gives a methodology for constructing group rainbow quotients of group rainbow graphs. The proof is straightforward and left to the reader.

Proposition 4.2. Let U be balanced, and let F be a subgroup of U_3 which is normal in U. Let φ denote the canonical homomorphism of U onto U/F. Then φ(U) admits the unipotent-like factorization φ(U) = φ(U_1)φ(U_2)φ(U_3). Moreover, as φ(U_i) is isomorphic to U_i, i = 1, 2, φ(U) is balanced and φ induces a rainbow morphism Γ(U)_{U_1,U_2} → Γ(φ(U))_{φ(U_1),φ(U_2)}.

As U_i is isomorphic to φ(U_i) in Proposition 4.2, we shall denote the graph Γ(φ(U))_{φ(U_1),φ(U_2)} more simply as Γ(φ(U))_{U_1,U_2} or Γ(U/F)_{U_1,U_2}.
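Proposition 4.1 can be checked by brute force on a small balanced group. The Python sketch below is an illustrative choice not taken from the text: U is the group of upper unitriangular 3 × 3 matrices over Z_p, encoded as triples (a, b, c), with U_1 = {(a, 0, 0)}, U_2 = {(0, b, 0)} and U_3 = {(0, 0, c)}, and the bijection α is assumed to send (a, 0, 0) to (0, a, 0). All function names are hypothetical.

```python
from itertools import product

p = 3  # small prime for the example

def mul(x, y):
    """Product of unitriangular matrices, with (a, b, c) standing for
    [[1, a, c], [0, 1, b], [0, 0, 1]] over Z_p."""
    a, b, c = x
    a2, b2, c2 = y
    return ((a + a2) % p, (b + b2) % p, (c + c2 + a * b2) % p)

U = list(product(range(p), repeat=3))
U1 = [(a, 0, 0) for a in range(p)]
U2 = [(0, b, 0) for b in range(p)]

def right_coset(H, u):
    return frozenset(mul(h, u) for h in H)

cosets1 = {right_coset(U1, u) for u in U}   # vertices U_1 u
cosets2 = {right_coset(U2, u) for u in U}   # vertices U_2 u

# In this encoding the b-coordinate is constant on a U_1-coset and equals its
# U_2-component; the a-coordinate is constant on a U_2-coset and equals its
# U_1-component. With alpha as assumed above, colors are elements of Z_p.
def color1(coset):
    return next(iter(coset))[1]

def color2(coset):
    return next(iter(coset))[0]

def is_rainbow():
    """Cosets are adjacent when they intersect; check every neighborhood."""
    palette = list(range(p))
    for A in cosets1:
        if sorted(color2(B) for B in cosets2 if A & B) != palette:
            return False
    for B in cosets2:
        if sorted(color1(A) for A in cosets1 if A & B) != palette:
            return False
    return True

print(len(cosets1), len(cosets2), is_rainbow())   # 9 9 True
```

The check is exhaustive, so it only suits very small p, but it makes the roles of the canonical representatives concrete.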
5. Free products and filtrations

Let G_1 = ⟨A_1 | R_1⟩ and G_2 = ⟨A_2 | R_2⟩ be groups with respective generating sets A_1, A_2 and sets of relations R_1, R_2. We define the free product G_1 ∗ G_2 of G_1 and G_2 to be the group with generating set A_1 ∪ A_2 and set of relations R_1 ∪ R_2. In other words, G_1 ∗ G_2 = ⟨A_1 ∪ A_2 | R_1 ∪ R_2⟩. We may regard G_1 and G_2 as subgroups of G_1 ∗ G_2 in the obvious way. Free products constitute an important source of examples for us of groups which admit unipotent-like factorizations.

Proposition 5.1. Let G = G_1 ∗ G_2 be a free product of the nontrivial groups G_1 and G_2, and let G_3 be the subgroup of G defined by G_3 = [G_1, G_2]. Then G admits the unipotent-like factorization G = G_1G_2G_3.

Proof. The proof follows immediately from the observation that G_3 is the kernel of the natural surjective homomorphism G_1 ∗ G_2 → G_1 × G_2, together with the fact that G_1 ∩ G_3 = G_2 ∩ G_3 = 1.

Given an infinite group G, a filtration of G is an infinite descending chain G = F_0 ⊃ F_1 ⊃ F_2 ⊃ ··· of distinct normal subgroups F_i of G such that [F_i, F_j] ⊆ F_{i+j} for all i, j ≥ 1. We further call the filtration cofinite if |G : F_i| < ∞ for all i. The notion of filtration allows one to successfully construct families of group rainbow graphs which have large automorphism group and girth. In Section 6 we shall see how, following ideas of Ustimenko [17], such families can be used as powerful encryption tools.

Proposition 5.2. Let G = G_1 ∗ G_2 be a free product of two finite groups G_1 and G_2, and let G = F_0 ⊃ F_1 ⊃ F_2 ⊃ ··· be a cofinite filtration with F_1 = G_3 = [G_1, G_2]. Then Γ = Γ(G)_{G_1,G_2} is a rainbow graph of infinite order over (G_1, G_2), in fact a tree, and the graphs Γ_i = Γ(G/F_i)_{G_1,G_2} (i ≥ 1) form an infinite sequence of group rainbow quotients of Γ of increasing order and nondecreasing girth. Moreover, Γ_j is a group rainbow quotient of Γ_i for every i > j.

Proof. By Propositions 4.1 and 5.1, Γ is clearly a rainbow graph of infinite order.
As G is a free product of G_1 and G_2, Γ is acyclic; as G = ⟨G_1, G_2⟩, Γ is connected. Hence Γ is a tree. As G_3 is normal in G, the requirement that F_1 = G_3 is not exclusionary. Thus Proposition 4.2 implies that each Γ_i is a group rainbow quotient of Γ. Moreover, for i > j, Γ_j is a group rainbow quotient of Γ_i with morphism Γ_i → Γ_j induced by the canonical homomorphism G/F_i → G/F_j. The order of Γ_i is |G/F_i : G_1F_i/F_i| + |G/F_i : G_2F_i/F_i|, which clearly increases as i → ∞ since G_1 ∩ F_i = G_2 ∩ F_i = 1. Finally, as Γ_i → Γ_j maps cycles to (nondegenerate) cycles, the girth of Γ_i must be at least as large as that of Γ_j.

Remark 5.3. As G/G_3 is isomorphic to the direct product G_1 × G_2, the graph Γ_1 is complete bipartite. Thus the girth of graphs in the sequence Γ_1, Γ_2, Γ_3, … always begins at g(Γ_1) = 4, and the rate at which girth grows is governed by the nature of the filtration.
6. Some applications

In this section we provide a brief survey of results which depend on constructions related to rainbow graphs.

6.1. Multicolor Ramsey numbers. The multicolor Ramsey number r_k(C_4) is the smallest integer n for which any k-coloring of the edges of the complete graph K_n must produce a monochromatic 4-cycle. One has r_3(C_4) = 11 (see [5]) but for k ≥ 3 this is the only case for which r_k(C_4) is explicitly known. Prior to the results discussed herein, the best known lower bound for infinitely many values of k was r_k(C_4) ≥ k^2 − k + 2, see [4].

Recall from Section 3 the notion of coordinatized graphs. One particular class consists of the graphs B_2(q) where we let R vary over all finite fields GF(q) and we define the function f_2 by f_2(p_1, l_1) = p_1l_1. In [11] Lazebnik and the author prove that the mapping φ : GF(q)^2 → GF(q)^2 defined by φ : (p) ↦ [p] and φ : [l] ↦ (l) is a polarity of B_2(q). The polarity graph Γ_2(q) of B_2(q) is defined to be the graph with vertex set V(Γ_2(q)) = P (i.e., the point set of B_2(q)) and edge set E(Γ_2(q)) = {uv^φ | uv ∈ E(B_2(q)), u ∈ P, v ∈ L, u ≠ v^φ}. The authors further show that the edge set of the complete graph K_{q^2} can be partitioned into q edge disjoint copies Γ_1, …, Γ_q of Γ_2(q). It is further shown that Γ_2(q) is C_4-free, a result which is very easy to prove using the rainbow property. Thus, by assigning color i to the edges of Γ_i, they obtain a q-coloring of the edges of K_{q^2} with no monochromatic C_4, hence r_q(C_4) ≥ q^2 + 1. The authors improve this bound to r_q(C_4) ≥ q^2 + 2 by extending the aforementioned coloring to K_{q^2+1}, see [12].

6.2. Dense graphs of large girth. Let ℱ be a family of graphs. By ex(v, ℱ) we denote the greatest number of edges in a graph on v vertices which contains no subgraph isomorphic to a graph from ℱ. The best known bounds on ex(v, {C_3, C_4, …, C_{2k}}) for fixed k, 2 ≤ k ≠ 5, are the following:

\[
c_k\, v^{1+\frac{2}{3k-3+\epsilon}} \;\le\; \mathrm{ex}(v, \{C_3, C_4, \dots, C_{2k}\}) \;\le\; 90k\, v^{1+\frac{1}{k}}
\tag{3}
\]
The upper bound actually holds for all k ≥ 2 and v, and was established by Bondy and Simonovits [1] (see also [7], [17]). The lower bound holds for an infinite sequence of values of v; c_k is a positive function of k alone, and ε = 0 if k is odd and ε = 1 if k is even. It was established by Lazebnik, Ustimenko and the author in [10]. (For k = 5 a better lower bound c(v^{1+1/5}) is given by the regular generalized hexagon of Lie type B_2, see [2].)

The lower bound comes from an explicit construction of special group rainbow graphs. Initially, these graphs were defined as connected components CD(n, q) of certain coordinatized graphs D(n, q) constructed by Lazebnik and Ustimenko, see [8]. The graphs CD(n, q), however, can be obtained directly as rainbow quotients of the free product (GF(q), +) ∗ (GF(q), +), where (GF(q), +) denotes the additive group of the finite field GF(q). The proof that the girth of D(n, q) is at least n + 5 for n odd was a challenging task, accomplished by Lazebnik and Ustimenko in [8]. We should further mention that the functions f_2, f_3, …, f_n defining adjacency in D(n, q) (as a coordinatized graph) did not come about haphazardly but were deduced by investigating a group incidence structure arising from the (infinite) rank two group of affine Lie type Ã_1. The relatively simple form of the functions results from work done by Ustimenko in the area of embedding Lie geometries in their corresponding Lie algebras, see [19].

6.3. Cages. For k ≥ 2 and g ≥ 3, a (k, g)-cage is a k-regular graph of minimum order subject to having girth exactly g. The problem of determining the order ν(k, g) of a (k, g)-cage is unsolved for most pairs (k, g) and is extremely hard in the general case. By counting the number of vertices in the breadth-first-search tree of a (k, g)-graph, one easily establishes the following lower bounds for ν(k, g):

\[
\nu(k, g) \;\ge\;
\begin{cases}
\dfrac{k(k-1)^{(g-1)/2} - 2}{k-2}, & \text{for } g \text{ odd};\\[2ex]
\dfrac{2(k-1)^{g/2} - 2}{k-2}, & \text{for } g \text{ even}.
\end{cases}
\]

Finding upper bounds for ν(k, g) is a far more difficult affair. In [14] Sachs established that ν(k, g) is finite for all k and g, and shortly thereafter, Erdős and Sachs [6] gave a substantially smaller general upper bound by nonconstructive methods. Their result was slightly improved by Walther [21], [22], and later by Sauer [15]. The following upper bound is due to Sauer [15]:

\[
\nu(k, g) \;\le\;
\begin{cases}
2(k-1)^{g-2}, & \text{for } g \text{ odd and } k \ge 4;\\
4(k-1)^{g-3}, & \text{for } g \text{ even and } k \ge 4.
\end{cases}
\]

Note that these upper bounds are roughly the squares of the previously indicated lower bounds. In [9], Lazebnik, Ustimenko and the author were able to establish, via explicit construction, general upper bounds on ν(k, g) which are roughly the 3/2-power of the lower bounds. The graphs they manufactured were certain induced subgraphs of the rainbow graphs CD(n, q) (see 6.2, above) obtained through the process of
“color restriction” in accordance with Proposition 1.1(e). Their result can be stated as follows: Let g ≥ 5, and let q denote the smallest odd prime power for which k ≤ q. Then

\[
\nu(k, g) \;\le\; 2k\, q^{\frac{3}{4}g - a},
\]

where a = 4, 11/4, 7/2, 13/4 for g ≡ 0, 1, 2, 3 (mod 4), respectively.

6.4. Cryptography. In [20], Ustimenko introduced a cryptographic tool which uses rainbow graphs of sufficiently large girth to effectively encrypt messages. In his encryption scheme, both the initial and encrypted “bits” of information are vertices of a specified rainbow graph, and the key is an ordered sequence of colors. More explicitly, given the sequence (c_1, c_2, …, c_t) of colors c_i ∈ C with c_i ≠ c_{i+2}, the encryption of the initial bit v_0 is the terminal vertex v_t of a path v_0v_1…v_t which is uniquely determined by the rule: v_i is the unique c_i-colored vertex adjacent to v_{i−1}. To recover the initial information one simply applies the same rule to the encrypted information, only using the reverse sequence (c_t, c_{t−1}, …, c_1).

The utility in having a graph of large girth is to ensure that the issued key cannot be circumvented by one of shorter length and that the initial and encrypted bits never coincide. Both goals are achieved, for example, if the value of t is chosen to be strictly less than one-half girth, and in this case even knowledge of the graph and value of t is relatively useless when t is large, since there are k(k − 1)^{t−1} possible terminal vertices of paths of length t from a fixed vertex. (The condition c_i ≠ c_{i+2} prevents walks in the graph, so is also an important ingredient in achieving the desired goals.)

What makes this scheme special is that it can be implemented on a family of rainbow graphs Γ_1, Γ_2, Γ_3, … in which each Γ_i is a rainbow quotient of Γ_j for j > i. (Such a family is referred to by Ustimenko as a folder.) The group rainbow graphs which arise via a filtration of a free product are excellent candidates here, putatively because these graphs can have increasing girth as well. This allows for a potentially infinite “alphabet,” at the same time ensuring that the probability of breaking the scheme approaches zero. (Ustimenko has proved in [20] that the probability of breaking the scheme is roughly the same as guessing an initial bit at random for large graphs.) Regarding implementation of the scheme, one also has the potential to issue subsets of colors (i.e., fragments of the key) of various sizes to appropriately reflect status, or level of confidentiality, within an organization.
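The walk rule of Section 6.4 can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Ustimenko's implementation: the graph is B_2 over Z_P with f_2(p_1, l_1) = p_1l_1 for a hypothetical small prime P, whose girth is far too small for real use, and the function names are invented for the example. In this sketch the walk back uses the colors c_{t−1}, …, c_1 followed by the color of v_0, which the sketch assumes the receiver knows; that is one reading of the "reverse sequence" in the text.

```python
P = 7  # hypothetical small prime; colors are the elements of Z_7

def step(vertex, color):
    """Move to the unique neighbor of `vertex` with first coordinate `color`
    in B_2 over Z_P, where adjacency means p_2 + l_2 = p_1 * l_1."""
    side, (x1, x2) = vertex
    y1 = color % P
    y2 = (x1 * y1 - x2) % P
    return ('L' if side == 'P' else 'P', (y1, y2))

def walk(vertex, colors):
    for c in colors:
        vertex = step(vertex, c)
    return vertex

def encrypt(v0, key):
    """v_i is the unique c_i-colored neighbor of v_{i-1}; return v_t."""
    return walk(v0, key)

def decrypt(vt, key, color_of_v0):
    # v_{i-1} is the neighbor of v_i with color c_{i-1}; the final step back
    # needs the color of v_0 (assumed known to the receiver in this sketch).
    back = list(reversed(key[:-1])) + [color_of_v0]
    return walk(vt, back)

key = [3, 5]              # t = 2, below half the girth of this toy graph
v0 = ('P', (1, 4))        # a point whose color is its first coordinate, 1
vt = encrypt(v0, key)
print(vt, decrypt(vt, key, color_of_v0=1) == v0)   # ('P', (5, 2)) True
```

Each step costs a single multiplication and subtraction in Z_P, which reflects the practical appeal of coordinatized rainbow graphs here: the unique colored neighbor is computed directly from the adjacency relations rather than searched for.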
References

[1] J. A. Bondy, M. Simonovits, Cycles of even length in graphs, J. Combin. Theory Ser. B 16 (1974), 97–105.
[2] A. E. Brouwer, A. M. Cohen, A. Neumaier, Distance-Regular Graphs, Springer-Verlag, Berlin 1989.
[3] R. W. Carter, Simple Groups of Lie Type, Wiley, New York 1972.
[4] F. R. K. Chung, R. L. Graham, On multicolor Ramsey numbers for complete bipartite graphs, J. Combin. Theory Ser. B 18 (1975), 164–169.
[5] C. Clapham, The Ramsey number r(C4, C4, C4), Period. Math. Hungar. 18 (1987), 317–318.
[6] P. Erdős, H. Sachs, Reguläre Graphen gegebener Taillenweite mit minimaler Knotenzahl, Wiss. Z. Martin Luther Univ. Halle Wittenberg Math. Naturwiss. Reihe 12 (1963), 251–257.
[7] R. J. Faudree, M. Simonovits, On a class of degenerate extremal graph problems, Combinatorica 3 (1) (1983), 83–93.
[8] F. Lazebnik, V. A. Ustimenko, Explicit construction of graphs with an arbitrary large girth and of large size, Discrete Appl. Math. 60 (1995), 275–284.
[9] F. Lazebnik, V. A. Ustimenko, A. J. Woldar, New upper bounds on the order of cages, Electron. J. Combin. 14 R13 (1997), 1–11.
[10] F. Lazebnik, V. A. Ustimenko, A. J. Woldar, A new series of dense graphs of high girth, Bull. Amer. Math. Soc. 32 (1) (1995), 73–79.
[11] F. Lazebnik, A. J. Woldar, Graphs defined by systems of linear equations, to appear in J. Graph Theory.
[12] F. Lazebnik, A. J. Woldar, A new lower bound on the multicolor Ramsey numbers rk(C4) for k an Odd Prime Power, J. Combin. Theory B 79 (2000), 172–176.
[13] H. Sachs, Regular graphs with given girth and restricted circuits, J. London Math. Soc. 38 (1963), 423–429.
[14] N. Sauer, Extremaleigenschaften I and II, Sitzungsber. Abt. II Oesterr. Akad. Wiss. Math. Naturwiss. Kl. 176 (1967), 9–25; 176 (1967), 27–43.
[15] M. Simonovits, Extremal Graph Theory, in: Selected Topics in Graph Theory 2 (L. W. Beineke and R. J. Wilson, eds.), Academic Press, London 1983, 161–200.
[16] V. A. Ustimenko, On the embedding of some geometries in flag systems in Lie algebras and superalgebras, in: Root Systems, Representations and Geometries, Kiev: Inst. Math. Acad. Sci. UkrSSR (1990), 3–16.
[17] V. A. Ustimenko, Families of graphs with special arcs and cryptography, preprint.
[18] H. Walther, Eigenschaften von regulären Graphen gegebener Taillenweite und minimaler Knotenzahl, Wiss. Z. Techn. Hochsch. Ilmenau 11 (1965), 167–168.
[19] H. Walther, Über reguläre Graphen gegebener Taillenweite und minimaler Knotenzahl, Wiss. Z. Techn. Hochsch. Ilmenau 11 (1965), 93–96.

A. J. Woldar
Villanova University
Villanova, PA 19085, U.S.A.
[email protected]