LONDON MATHEMATICAL SOCIETY MONOGRAPHS NEW SERIES
Series Editors P. M. Cohn H. G. Dales
Previous volumes of the LMS Monographs were published by Academic Press, to whom all enquiries should be addressed. Volumes in the New Series will be published by Oxford University Press throughout the world.
NEW SERIES
1. Diophantine inequalities, R. C. Baker
2. The Schur multiplier, Gregory Karpilovsky
3. Existentially closed groups, Graham Higman and Elizabeth Scott
4. The asymptotic solution of linear differential systems, M. S. P. Eastham
5. The restricted Burnside problem, Michael Vaughan-Lee
6. Pluripotential theory, Maciej Klimek
7. Free Lie algebras, Christophe Reutenauer
Free Lie Algebras

Christophe Reutenauer
Université du Québec à Montréal
CLARENDON PRESS - OXFORD 1993
Oxford University Press, Walton Street, Oxford OX2 6DP
Oxford New York Toronto
Delhi Bombay Calcutta Madras Karachi
Kuala Lumpur Singapore Hong Kong Tokyo
Nairobi Dar es Salaam Cape Town
Melbourne Auckland Madrid
and associated companies in Berlin Ibadan

Oxford is a trade mark of Oxford University Press

Published in the United States by Oxford University Press Inc., New York

© Christophe Reutenauer, 1993

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press. Within the UK, exceptions are allowed in respect of any fair dealing for the purpose of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act, 1988, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms and in other countries should be sent to the Rights Department, Oxford University Press, at the address above.

A catalogue record for this book is available from the British Library
Library of Congress Cataloging in Publication Data
Reutenauer, Christophe.
Free Lie algebras / Christophe Reutenauer.
(London Mathematical Society monographs, new series, no. 7)
Includes bibliographical references and index.
1. Lie algebras. I. Title. II. Series.
QA252.3.R48 1993 512'.55--dc20 92-27318
ISBN 0-19-853679-8

Typeset by Integral Typesetting, Great Yarmouth, Norfolk
Printed in Great Britain by St Edmundsbury Press, Bury St Edmunds, Suffolk
This book is dedicated to Arthur, Victor, Émile, Eva
Preface
Lie polynomials first appeared at the turn of the century in the work of Campbell, Baker, and Hausdorff on the exponential mapping in a Lie group, which led to a result known as the Campbell–Baker–Hausdorff formula. About thirty years later, Witt showed that the Lie algebra of Lie polynomials is actually the free Lie algebra, and that its enveloping algebra is the (associative) algebra of noncommutative polynomials; he thus answered a question of Magnus (who had himself arrived at the solution) on the lower central series of the free group. Some years earlier, in 1933, P. Hall had begun commutator calculus in the free group, which led M. Hall to construct his basis of the free Lie algebra; the link between the latter and the free group is given by the work of Witt and the Magnus transformation.

In 1942 and 1944, Thrall and Brandt studied the free Lie algebra from the point of view of representation theory of the linear group, and Brandt computed the character, a formula closely related to the Witt formula. At the end of the forties, Dynkin, Specht, and Wever simultaneously discovered the characterization of Lie polynomials through the 'left to right bracketing mapping'. Some years later, Friedrichs gave his characterization of Lie polynomials; his criterion fascinated many mathematicians, who actually proved it. After that, the subject was studied by many people, often independently. Recently, it has had a new impulse, from the point of view of representation theory of the symmetric group.

As far as we know, no book exists that exclusively treats free Lie algebras. Bahturin, in his recent book on varieties of Lie algebras, devotes two chapters to free Lie algebras; Bourbaki, Jacobson, and others give an introduction to the subject. It seems to us that the theory has become so extensive, with existing results so widely scattered, as to justify the publication of a book on the subject.

The book is partly written in the spirit of Lothaire's Combinatorics on words, with emphasis on the algebraic point of view; it can be considered as a series of variations on Lyndon words; the presentation of the latter is rather indirect, so the interested reader could begin by reading the corresponding section by Lothaire.

In Chapter 0, we give without proof the Poincaré–Birkhoff–Witt theorem,
which enables us to prove that the Lie algebra of Lie polynomials is the free Lie algebra; this necessitates a basis construction (in cases where the ring of scalars is not a field), which is done through Lazard elimination.
The impatient reader may proceed directly to Chapter 1, where things really begin. We give several characterizations of Lie polynomials, introduce the shuffle product and present Hopf-algebra-like properties of free associative algebras.
Chapter 2 is devoted mainly to two results: subalgebras of free Lie algebras are free; automorphisms of free Lie algebras are products of elementary automorphisms. The related problem of characterizing free sets of Lie polynomials is also treated. In Chapter 3 we characterize exponentials of Lie series, and give several results on the Hausdorff series, after having connected it to the canonical
projections of the free associative algebra. The study of Hall bases begins in Chapter 4: we construct the Hall basis
of the free Lie algebra, and the corresponding Poincaré–Birkhoff–Witt basis of the free associative algebra. We show also that this basis construction is identical to the one arising from Lazard elimination.
Chapter 5 gives some applications of Hall sets: the Lyndon basis, which is a particular case of a Hall basis; the calculation of the dual basis in the shuffle algebra; the construction of a Hall basis compatible with the derived series of the free Lie algebra; and the order on the free monoid associated with a Hall set and the associated codes. In Chapter 6 we give some properties of the shuffle algebra: it is freely generated by Lyndon words, and has a remarkable presentation. Related to shuffles is the concept of subword. This leads to subword functions, a generalization of binomial coefficients, and the Magnus transformation of
the free group. Commutator calculus is presented, and connected with the Hall basis and the algebra of subword functions. Chapter 7 studies circular words: after giving the formulas enumerating them, we relate them to Hall sets. Two algorithms on Lyndon words are
presented, and we give a natural bijection between words on an ordered alphabet and multisets of primitive necklaces. The Lie representation of the symmetric group (or the linear group) is
considered in Chapter 8. Its character and the multiplicities of the irreducible representations are given. Almost all of them occur. Remarkable Lie elements, the Lie idempotents, are studied in the symmetric group algebra. Representations on the components of the canonical decomposition of the free associative algebra are also studied. Chapter 9 shows the close connection between the free Lie algebra and the
Solomon descent algebra of the symmetric group. The primitive idempotents of the latter represent the canonical projections, and the dimension of the corresponding quasi-ideals has an interpretation in terms of necklaces. The
action on Lie monomials of elements of the descent algebra characterizes this
algebra, which has as a natural homomorphic image the ring of symmetric functions, and is itself dual to the ring of quasisymmetric functions.
Each chapter ends with an appendix: each subsection can be thought of as an exercise, with hints or references, and gives some information on related
subjects; sometimes it is simply a review of related work.

Montréal
July 1992
C.R.
Acknowledgements
I first discovered the subject of this book in Gérard Lallement's book on semigroups and in Dominique Perrin's chapter on factorization of free monoids in Lothaire's book; thanks to Adriano Garsia's Combinatorics of the free Lie algebra and the symmetric group, this subject was given a new impulse, in the direction of algebraic combinatorics. While writing this book, I had innumerable discussions with Marco Schützenberger, who is for me the initiator of the subject, and gave me useful advice as well as some unpublished results. During this same time, Guy Melançon wrote his Ph.D., and was of considerable help, by discussions, reading and correcting the successive versions of the manuscript. Paul Cohn kindly accepted this book in the London Mathematical Society Monographs series and carefully read the whole manuscript; he also communicated some unpublished results of his Ph.D. thesis. I also thank, for many discussions and correspondence, Sheila Sundaram, André Joyal, Pierre Leroux, Xavier Viennot, Hartmut Laue, Pierre Bouchard, François Bergeron, and Nantel Bergeron. Special thanks to Daniel Krob, who carefully read the manuscript, and found many mistakes. I also want to thank the whole Department of Mathematics and Computer Science of the Université du Québec à Montréal, for excellent working conditions, and especially Dominique Chabot, Sonya Comtois, Lucie Lortie, Marlaine Grenier, and Diane Amatuzio for their typing. Finally, I was supported by a grant of NSERC (Canada) during the three years I wrote this book.
Contents

Index of notation xv

Introduction 1
0.1 The Poincaré–Birkhoff–Witt theorem
0.2 Free Lie algebras
0.3 Elimination
0.4 Appendix
0.5 Notes

1. Lie polynomials 14
1.1 Words, polynomials, and series 14
1.2 Lie polynomials 18
1.3 Characterizations of Lie polynomials 19
1.4 Shuffles 23
1.5 Duality concatenation/shuffle 26
1.6 Appendix 33
1.7 Notes 39

2. Algebraic properties 40
2.1 The weak algorithm 40
2.2 Subalgebras 44
2.3 Automorphisms 45
2.4 Free sets of Lie polynomials 49
2.5 Appendix 50
2.6 Notes 51

3. Logarithms and exponentials 52
3.1 Lie series and logarithm 52
3.2 The canonical projections 57
3.3 Coefficients of the Hausdorff series 61
3.4 Derivation and exponentiation 76
3.5 Appendix 80
3.6 Notes 82

4. Hall bases 84
4.1 Hall trees and words 84
4.2 Hall and Poincaré–Birkhoff–Witt bases 89
4.3 Hall sets and Lazard sets 98
4.4 Appendix 101
4.5 Notes 103

5. Applications of Hall sets 105
5.1 Lyndon words and basis 105
5.2 The dual basis 108
5.3 The derived series 112
5.4 Order properties of Hall sets 114
5.5 Synchronous codes 119
5.6 Appendix 124
5.7 Notes 126

6. Shuffle algebra and subwords 127
6.1 The free generating set of Lyndon words 127
6.2 Presentation of the shuffle algebra 129
6.3 Subword functions 131
6.4 The lower central series of the free group 136
6.5 Appendix 147
6.6 Notes 152

7. Circular words 154
7.1 The number of primitive necklaces 154
7.2 Hall words and primitive necklaces 158
7.3 Generation of Lyndon words 161
7.4 Factorization into Lyndon words 163
7.5 Words and multisets of primitive necklaces 166
7.6 Appendix 170
7.7 Notes 174

8. The action of the symmetric group 176
8.1 Action of the symmetric group and of the linear group 176
8.2 The character of the free Lie algebra 180
8.3 Irreducible components 185
8.4 Lie idempotents 194
8.5 Representations on the canonical decomposition 201
8.6 Appendix 206
8.7 Notes 215

9. The Solomon descent algebra 217
9.1 The descent algebra 217
9.2 Idempotents 224
9.3 Homomorphisms 233
9.4 Quasisymmetric functions and enumeration of permutations 242
9.5 Appendix 248
9.6 Notes 254

References 256
Index 267
Notation

K: commutative ring with unit
L_K(A), ℒ(A): free Lie algebra over K on the set A, 4, 18
M(A): free magma on the set A, 4
t′, t″: immediate left (right) subtree of the tree t, 5
|t|: degree of the tree t, 5
ad(c): mapping y ↦ [c, y], 7
(s, tᵖ): the tree (. . .((s, t), t), . . . , t), 10
Uₙ: the submodule of an enveloping algebra generated by nth powers of Lie elements, 13, 57
A: alphabet, set of noncommuting variables, 14
A*: free monoid on A, 15
1: empty word, neutral element in A*, 15
A⁺ = A* \ 1, 15
|w|: length of the word w in A*, 14; weight of the word w in T*, 224
|w|ₐ: number of occurrences of the letter a in w, 14
K⟨A⟩: K-algebra of noncommutative polynomials on A, free associative K-algebra on A, 15
K⟨⟨A⟩⟩: K-algebra of noncommutative formal series on A, 17
(S, P): pairing K⟨⟨A⟩⟩ × K⟨A⟩ → K, 17
(S, w), (P, w): coefficient of the word w in S, P, 15, 17
(S, 1), (P, 1): constant term of S, P, 17
δ: coproduct a ↦ a ⊗ 1 + 1 ⊗ a, 19, 52
α: principal anti-automorphism of K⟨A⟩, 19, 52
δ̃ = (id ⊗ α) ∘ δ, 19, 52
ad: K-algebra homomorphism K⟨A⟩ → End_K(K⟨A⟩), a ↦ ad(a), 19, 53
right normed bracketing, bracketing from right to left, 20, 52
shuffle product, 24
p-fold coproduct, 25
concatenation product K⟨A⟩ ⊗ K⟨A⟩ → K⟨A⟩, 27
dual coproduct, 27
shuffle product K⟨A⟩ ⊗ K⟨A⟩ → K⟨A⟩, 26
maps a polynomial to its constant term, 27
convolution product in End(K⟨A⟩), 27
p-fold dual coproduct, 31
(p₁, . . . , pₙ), 31
p-fold shuffle product K⟨A⟩^⊗p → K⟨A⟩, 31
left normed bracketing, bracketing from left to right, 41
symmetrized product, 57
Sₙ: symmetric group of order n
πₙ: canonical projection K⟨A⟩ → Uₙ, 58
π₁: canonical projection K⟨A⟩ → ℒ(A), 58
D(w): descent set of w, 62
w·σ: right action of the permutation σ on the word w, 63
sum of permutations whose descent set is contained in S, 65
anti-automorphism a ↦ ā in KSₙ, 65
f(t): foliage of the tree t, 84
H: Hall set, 84
→: rewriting rule on standard sequences, 86
h′h″: standard factorization of the Hall word h, 85
P_h: Hall polynomial, 90
mappings defined on standard sequences, 91
T(s): derivation tree of a standard sequence s, 91
S_w: dual basis, 108
derived ideal of the free Lie algebra, 112
binomial coefficient on words, 131
F(A): free group on A, 132
μ: Magnus transformation F(A) → ℤ⟨⟨A⟩⟩, a ↦ 1 + a, 134
infiltration product
Hall words of length ≤ n, 136
Hall exponent, 136
lower central series of F(A), 136
series without term of degree < n, 138
pₙ: power sum symmetric function, 156, 178
st(w): standard permutation of the word w, 167
space of homogeneous polynomials of degree n, 176
multilinear part of Kₙ, 176
characteristic of a representation, 178
characteristic of the nth Lie representation, 180
shape of a standard tableau T, 185
D(T): descent set of the standard tableau T, 185
maj(T): major index of the standard tableau T, 185
irreducible character of Sₙ, 186
value of an irreducible character of Sₙ, 188
space of homogeneous Lie polynomials of degree n, 194
sum of permutations whose descent set is S, 195
Dynkin–Specht–Wever Lie idempotent, 195
Lie idempotent of Klyachko, 196
subspace of K⟨A⟩, 201
length of the partition λ, 201
plethysm of symmetric functions
hₙ: complete homogeneous symmetric function, 202
qₙ: projection of K⟨A⟩ onto its graded component of degree n, 217
q_C: convolution product of the qₙ corresponding to the composition C, 218
convolution subalgebra generated by the qₙ, 217
graded component of the convolution subalgebra, 217
C(σ): descent composition of the permutation σ, 222
subset associated to a composition, 223
Solomon descent algebra of Sₙ, 223
projection onto Uₙ, 224
composition associated with w in T*, 224
length of w in T*, 224
algebra homomorphism tᵢ ↦ qᵢ, 225
partition associated to a composition C, a polynomial P, 227
weight of the polynomial P, 227
the partition λ is finer than μ, 227
0 Introduction
The aim of this preliminary chapter is to show that the Lie algebra of Lie polynomials, which is introduced in Chapter 1 and which is the subject of this book, is indeed the free Lie algebra. Apart from some elementary universal constructions, two results are required: the Poincaré–Birkhoff–Witt theorem (actually only its consequence, that a Lie algebra is embedded in its enveloping algebra), and the fact that the free Lie algebra is a free module.
The first result will not be proved here, as it is included in many textbooks. The second one is proved in Section 0.3.
0.1 THE POINCARÉ–BIRKHOFF–WITT THEOREM
Let K be a commutative ring with unit. A Lie algebra over K is a K-module ℒ, together with a K-bilinear mapping

ℒ × ℒ → ℒ, (x, y) ↦ [x, y],

called a Lie product or Lie bracket, satisfying the two following relations, for any x, y, z in ℒ:

[x, x] = 0,   (0.1.1)

[[x, y], z] + [[y, z], x] + [[z, x], y] = 0.   (0.1.2)

The latter identity is called the Jacobi identity. Note that (0.1.1) implies antisymmetry, i.e.

[x, y] + [y, x] = 0,   (0.1.3)

because 0 = [x + y, x + y] = [x, x] + [x, y] + [y, x] + [y, y] = [x, y] + [y, x], by (0.1.1), bilinearity, and (0.1.1) again. In view of (0.1.3), we may rewrite (0.1.2) as

[[x, y], z] = [x, [y, z]] + [[x, z], y].   (0.1.4)
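These identities are easy to check by machine. A quick numerical sketch in Python (an illustration, not part of the book) verifies (0.1.1)–(0.1.4) for the commutator bracket [x, y] = xy − yx on 3 × 3 integer matrices, the motivating example of the next paragraph:

```python
# A quick numerical check (an illustration, not part of the book): the
# commutator bracket [x, y] = xy - yx on 3x3 integer matrices satisfies
# (0.1.1)-(0.1.4) exactly.
import random

random.seed(0)

def rand_mat(n=3):
    return [[random.randint(-5, 5) for _ in range(n)] for _ in range(n)]

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[u + v for u, v in zip(r, s)] for r, s in zip(a, b)]

def sub(a, b):
    return [[u - v for u, v in zip(r, s)] for r, s in zip(a, b)]

def br(a, b):
    # the Lie bracket [a, b] = ab - ba
    return sub(mul(a, b), mul(b, a))

zero = [[0] * 3 for _ in range(3)]
x, y, z = rand_mat(), rand_mat(), rand_mat()

assert br(x, x) == zero                                         # (0.1.1)
assert add(br(x, y), br(y, x)) == zero                          # (0.1.3)
assert add(add(br(br(x, y), z), br(br(y, z), x)),
           br(br(z, x), y)) == zero                             # (0.1.2)
assert br(br(x, y), z) == add(br(x, br(y, z)), br(br(x, z), y)) # (0.1.4)
```

Since all entries are integers, the checks are exact, not approximate.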
Subalgebras of Lie algebras and homomorphisms between Lie algebras are defined as usual. Given an associative algebra 𝒜 over K, it acquires a natural structure of Lie algebra when [x, y] is defined by

[x, y] = xy − yx.

Indeed, [x, y] is bilinear, (0.1.1) is immediate and (0.1.2) is easily verified. Let ℒ be a Lie algebra, and consider a Lie algebra homomorphism from
ℒ into an associative algebra 𝒜, with its natural Lie algebra structure. Among all these algebras 𝒜, there is one which has a universal property, stated in the next result.

Proposition 0.1 Let ℒ be a Lie algebra over K. There exists an associative algebra 𝒜₀ over K and a Lie algebra homomorphism φ₀: ℒ → 𝒜₀ having the following property: for any associative algebra 𝒜 and any Lie algebra homomorphism φ: ℒ → 𝒜, there is a unique algebra homomorphism f: 𝒜₀ → 𝒜 making the diagram of Fig. 0.1 commutative. The algebra 𝒜₀ is unique up to isomorphism.

Fig. 0.1

This algebra 𝒜₀ is called the enveloping algebra of ℒ.

Proof
We first prove uniqueness up to isomorphism. Let φ₁, 𝒜₁ be another couple like φ₀, 𝒜₀. Then, using Fig. 0.2, we deduce the existence of algebra homomorphisms f: 𝒜₀ → 𝒜₁ and g: 𝒜₁ → 𝒜₀ such that f ∘ φ₀ = φ₁ and g ∘ φ₁ = φ₀. Then id ∘ φ₀ = φ₀ = g ∘ φ₁ = g ∘ f ∘ φ₀, and id, g ∘ f are both
Fig. 0.2
algebra homomorphisms. By looking at Fig. 0.3, and by uniqueness, we find that g ∘ f = id. Similarly, f ∘ g = id. This shows that 𝒜₀ and 𝒜₁ are isomorphic.

Fig. 0.3

We now prove the existence of 𝒜₀. Let T = T(ℒ) be the tensor algebra of ℒ over K, that is

T(ℒ) = ⊕_{n≥0} ℒ^⊗n.

Then T has a natural structure of associative algebra with unit. Let I be the ideal of T generated by the elements x ⊗ y − y ⊗ x − [x, y] (x, y ∈ ℒ); finally, let 𝒜₀ = T/I and φ₀: ℒ → 𝒜₀ be the composition φ₀ = p ∘ i, where i is the canonical injection ℒ → ⊕_{n≥0} ℒ^⊗n, and p the canonical surjective algebra homomorphism T → T/I. Note that since Ker p ⊇ I and by the definition of I, we have p([x, y]) = p(x ⊗ y − y ⊗ x). Hence, for any x, y in ℒ,

φ₀([x, y]) = p([x, y]) = p(x ⊗ y − y ⊗ x) = p(x)p(y) − p(y)p(x) = [p(x), p(y)] = [p ∘ i(x), p ∘ i(y)] = [φ₀(x), φ₀(y)].

Hence, φ₀ is a Lie algebra homomorphism. Next, let 𝒜 and φ be as in Proposition 0.1; then, since T is the tensor algebra of ℒ, there is a unique extension of φ to an algebra homomorphism Φ: T → 𝒜. Now, for x, y ∈ ℒ, we have Φ(x ⊗ y − y ⊗ x − [x, y]) = Φ(x)Φ(y) − Φ(y)Φ(x) − φ([x, y]) = 0, since φ is a Lie algebra homomorphism. Hence Φ vanishes on I and induces an algebra homomorphism f: 𝒜₀ = T/I → 𝒜 such that f ∘ φ₀ = φ; this f is unique, because 𝒜₀ is generated, as an algebra, by φ₀(ℒ). □

We now state, without proof, the theorem of Poincaré, Birkhoff, and Witt (see the Notes).

Theorem 0.2 Let ℒ be a Lie algebra over K which is a free K-module with ordered basis B, and let φ₀: ℒ → 𝒜₀ be the canonical Lie homomorphism into its enveloping algebra. Then the products φ₀(x₁) ⋯ φ₀(xₙ) (n ≥ 0, xᵢ ∈ B, x₁ ≥ ⋯ ≥ xₙ) form a basis of the K-module 𝒜₀.

Corollary 0.3 Let ℒ be a Lie algebra over K which is a free K-module, and let φ₀: ℒ → 𝒜₀ be the canonical Lie homomorphism. Then φ₀ is injective.

This result allows us to consider a Lie algebra as a Lie subalgebra of its enveloping algebra, especially in the case when K is a field.
0.2 FREE LIE ALGEBRAS
Let ℒ₀ be a Lie algebra over K, A a set and i: A → ℒ₀ a mapping. The Lie algebra ℒ₀ is called free on A if for any Lie algebra ℒ and any mapping f: A → ℒ, there is a unique Lie algebra homomorphism f̄: ℒ₀ → ℒ such that the diagram in Fig. 0.4 is commutative.

Fig. 0.4

Theorem 0.4 For each set A, there exists a free Lie algebra ℒ(A) on A, which is unique up to isomorphism. Moreover, ℒ(A) is naturally a graded K-module, i is injective, the component of degree 1 of ℒ(A) is the free submodule generated by A = i(A), and ℒ(A) is generated, as a Lie algebra, by A.

We denote the free Lie algebra by L_K(A) or ℒ(A), and we also say that ℒ(A) is freely generated by A. Recall that a magma is a set with a binary operation. The free magma M(A) over A may be identified with the set of binary, complete, planar, rooted trees with leaves labelled in A. Equivalently, trees may be identified with well-formed expressions over A, which are recursively defined by the following: each element of A is a well-formed expression; if t′, t″ are well-formed expressions, then t = (t′, t″) is a well-formed expression, which is identified with the tree obtained by taking a new root, with immediate left subtree t′ and immediate right subtree t″. The binary operation of M(A) is the mapping M(A) × M(A) → M(A), (t′, t″) ↦ t. We define the degree |t| of a tree t to be the number of its leaves, i.e. |t| = 1 if t ∈ A and |(t′, t″)| = |t′| + |t″|.

Proof of Theorem 0.4 (i) We prove first uniqueness of the free Lie algebra. Let i: A → ℒ₀ and j: A → ℒ₁, where ℒ₀, ℒ₁ are free on A. By the diagrams in Fig. 0.5 we deduce the existence of Lie algebra homomorphisms j̄: ℒ₀ → ℒ₁ and ī: ℒ₁ → ℒ₀, such that j̄ ∘ i = j and ī ∘ j = i.

Fig. 0.5

By Fig. 0.6 we deduce that id: ℒ₀ → ℒ₀ is the unique Lie algebra homomorphism such that id ∘ i = i. Since we have i = ī ∘ j = ī ∘ j̄ ∘ i, we therefore must have ī ∘ j̄ = id. Similarly, j̄ ∘ ī = id, which shows that ī, j̄ are isomorphisms. Hence ℒ₀ and ℒ₁ are isomorphic.

Fig. 0.6

(ii) Let D(A) be the free (noncommutative, nonassociative) K-algebra over A. One may view D(A) as the K-module freely generated by M(A), the free magma over A. Multiplication in M(A) is linearly extended to D(A). An ideal in D(A) is a submodule of D(A) which is closed under multiplication on the left or right by any element of D(A). Let I be the ideal of D(A) generated by the elements

(xy)z + (yz)x + (zx)y   (0.2.1)

and

xx,   (0.2.2)

with x, y, z ∈ D(A). Let ℒ(A) be the quotient module D(A)/I. It is immediate that ℒ(A) has a multiplication inherited from D(A), and that with this multiplication, ℒ(A) is a Lie algebra over K. Moreover, with the canonical mapping A → ℒ(A), ℒ(A) is clearly the free Lie algebra on A.

Now, with the degree defined on M(A), D(A) is a graded K-module, with D(A)₁ = ⊕_{a∈A} Ka. Since the relations (0.2.1) and (0.2.2) are homogeneous of degree ≥ 2 and since multiplication increases the degree, ℒ(A) is also a graded module with ℒ(A)₁ = ⊕_{a∈A} Ka. □
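The identification of M(A) with nested pairs, and the recursive degree |t|, can be sketched in a few lines of Python (an illustration, not part of the book; letters are strings and trees are 2-tuples):

```python
# An illustration (not part of the book): elements of the free magma M(A) as
# nested 2-tuples (binary trees with leaves in A), with the recursive degree
# |t| of Section 0.2 (the number of leaves).

def degree(t):
    # |t| = 1 for a letter, |(t', t'')| = |t'| + |t''| for a pair
    if isinstance(t, tuple):
        left, right = t
        return degree(left) + degree(right)
    return 1

a, b, c = 'a', 'b', 'c'
t = ((a, b), (a, (b, c)))    # the well-formed expression ((a, b), (a, (b, c)))
assert degree(t) == 5
assert degree(a) == 1
```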
Let 𝒜₀ be an associative algebra over K and j: A → 𝒜₀ a mapping. Then 𝒜₀ is called free on A if for any associative algebra 𝒜 and any mapping f: A → 𝒜, there exists a unique homomorphism of algebras f̄ such that the diagram in Fig. 0.7 is commutative.

Fig. 0.7

Uniqueness up to isomorphism is proved as in the case of free Lie algebras, or enveloping algebras. Existence is shown below, and a direct construction is done in Chapter 1. Let ℒ(A) be the free Lie algebra on A, i: A → ℒ(A) the corresponding mapping, and 𝒜₀ the enveloping algebra of ℒ(A) with φ₀: ℒ(A) → 𝒜₀ the corresponding Lie algebra homomorphism. Then we have a mapping j = φ₀ ∘ i: A → 𝒜₀.
Theorem 0.5 The enveloping algebra 𝒜₀ of the free Lie algebra ℒ(A) is a free associative algebra on A. The Lie algebra homomorphism φ₀: ℒ(A) → 𝒜₀ is injective, and φ₀(ℒ(A)) is the Lie subalgebra of 𝒜₀ generated by j(A).

Proof (i) Let 𝒜 and f be as shown in Fig. 0.7. Then we have Fig. 0.8, in which we show the existence of g and f̄, homomorphisms of Lie algebras and associative algebras, respectively. By the universal property of the free Lie algebra ℒ(A), we deduce existence and uniqueness of g. Then by the universal property of the enveloping algebra, we deduce the existence of
Fig. 0.8
f̄. To prove uniqueness of f̄, suppose we have the commutative diagram in Fig. 0.7. Then, as above, we find a unique Lie algebra homomorphism g such that g ∘ i = f. Since j = φ₀ ∘ i, we have f = f̄ ∘ j = f̄ ∘ φ₀ ∘ i, and by uniqueness of g, we deduce g = f̄ ∘ φ₀. Now, by the universal property of
the enveloping algebra, we deduce uniqueness of f̄.

(ii) In view of Corollary 0.3, it is enough to show that ℒ(A) is a free K-module to deduce that φ₀ is injective. This will be done independently in the next section (see Corollary 0.10). Now, the proof of Theorem 0.4 shows that ℒ(A) is generated, as a Lie algebra, by i(A). Hence, φ₀(ℒ(A)) is generated by φ₀ ∘ i(A) = j(A). □
0.3 ELIMINATION
We begin by stating a theorem which allows us to 'eliminate' one variable. Recall that a derivation of a Lie algebra is a linear endomorphism D such that D([x, y]) = [Dx, y] + [x, Dy]. If c is an element of a Lie algebra ℒ, we denote by ad(c) the linear mapping ℒ → ℒ defined by ad(c)(y) = [c, y]. By (0.1.4) and (0.1.3), ad(c) is a derivation of ℒ. Recall that A is canonically embedded in the free Lie algebra ℒ(A).

Theorem 0.6 Let c ∈ A. Then the K-module ℒ(A) is the direct sum of Kc and of a Lie subalgebra which is freely generated, as a Lie algebra, by the elements

(−ad(c))ⁿ(b), n ≥ 0, b ∈ A∖c.   (0.3.1)
Note that an element (0.3.1) is of the form [. . .[[b, c], c], . . . , c], with n occurrences of c. The reader may verify that the subalgebra of the theorem is the Lie ideal generated by A∖c.
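This rewriting of (−ad(c))ⁿ(b) as a left-normed bracket is easy to confirm numerically. A Python sketch (an illustration, not from the book) checks it for the commutator bracket on matrices:

```python
# An illustration (not from the book): with [x, y] = xy - yx on matrices,
# (-ad(c))^n(b) equals the left-normed bracket [...[[b, c], c], ..., c]
# with n occurrences of c.
import random

random.seed(1)

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def br(a, b):
    # [a, b] = ab - ba
    return [[u - v for u, v in zip(r, s)]
            for r, s in zip(mul(a, b), mul(b, a))]

c = [[random.randint(-3, 3) for _ in range(3)] for _ in range(3)]
b = [[random.randint(-3, 3) for _ in range(3)] for _ in range(3)]

n = 4
u = b
for _ in range(n):                 # apply ad(c) n times: [c, [c, ... [c, b]]]
    u = br(c, u)
sign = (-1) ** n
u = [[sign * e for e in row] for row in u]   # (-ad(c))^n(b)

v = b
for _ in range(n):                 # left-normed: [...[[b, c], c], ..., c]
    v = br(v, c)

assert u == v
```

Only antisymmetry of the bracket is used, so the two sides agree exactly.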
We begin with a lemma.

Lemma 0.7 Let ℒ be a Lie algebra.
(i) The set Der(ℒ) of derivations of ℒ is a Lie subalgebra of the algebra of linear endomorphisms of ℒ.
(ii) If ℒ′ is another Lie algebra and h: ℒ′ → Der(ℒ) is a Lie algebra homomorphism, then there is a unique Lie algebra structure on ℒ₁ = ℒ ⊕ ℒ′, extending that of ℒ and ℒ′, and such that

∀x ∈ ℒ, ∀x′ ∈ ℒ′, [x′, x] = h(x′)(x).   (0.3.2)

(iii) If ℒ is free on T, then each mapping T → ℒ extends uniquely to a derivation of ℒ.

The Lie algebra ℒ₁ is called the semidirect product of ℒ and ℒ′ with respect to the homomorphism h: ℒ′ → Der(ℒ).

Proof (i) Let D₁, D₂ be two derivations. We show that D₁ ∘ D₂ − D₂ ∘ D₁ is again a derivation. We have

Dᵢ ∘ Dⱼ([x, y]) = Dᵢ([Dⱼx, y] + [x, Dⱼy]) = [DᵢDⱼx, y] + [Dⱼx, Dᵢy] + [Dᵢx, Dⱼy] + [x, DᵢDⱼy].

This implies that

[D₁, D₂]([x, y]) = D₁ ∘ D₂([x, y]) − D₂ ∘ D₁([x, y]) = [[D₁, D₂](x), y] + [x, [D₁, D₂](y)],

hence [D₁, D₂] is a derivation.

(ii) It is clear that the Lie algebra structure on ℒ₁ is completely defined by (0.3.2), because the Lie bracket must be distributive and satisfy [u, v] = −[v, u]. Conversely, define the bracket by (0.3.2). We verify the Jacobi identity. We must show that

[[x + x′, y + y′], z + z′] + [[y + y′, z + z′], x + x′] + [[z + z′, x + x′], y + y′] = 0,

where x, y, z ∈ ℒ, x′, y′, z′ ∈ ℒ′. By multilinearity and antisymmetry, we only have to consider four cases:

(1) [[x′, y′], z′] + ⋯
(2) [[x′, y′], z] + ⋯
(3) [[x′, y], z] + ⋯
(4) [[x, y], z] + ⋯
Cases (1) and (4) are immediate consequences of the Jacobi identity in ℒ′ and ℒ. In case (2), we have, by antisymmetry:

[[x′, y′], z] + [[y′, z], x′] + [[z, x′], y′] = h([x′, y′])(z) − h(x′)(h(y′)(z)) + h(y′)(h(x′)(z)),

which is 0, because h([x′, y′]) = h(x′)h(y′) − h(y′)h(x′). In case (3), we have

[[x′, y], z] + [[y, z], x′] + [[z, x′], y] = [h(x′)(y), z] − h(x′)([y, z]) − [h(x′)(z), y],

which is 0, because h(x′) is a derivation and by antisymmetry.

(iii) We use the previous part with the following: ℒ = ℒ(T) as a K-module, with trivial Lie bracket, i.e. [x, y] = 0 for any x, y in ℒ; ℒ′ = ℒ(T) as a Lie algebra; h(x′)(x) = [x′, x] for any x′ in ℒ′ and any x in ℒ, where the Lie bracket is taken in the free Lie algebra ℒ(T); then h(x′) is a derivation of ℒ (actually any endomorphism of ℒ is a derivation of ℒ, because the Lie bracket is trivial), and x′ ↦ h(x′) is a Lie algebra homomorphism (see Section 0.4.1). So ℒ₁ = ℒ × ℒ′ gets a Lie algebra structure with [(x, x′), (y, y′)] = ([x′, y] + [x, y′], [x′, y′]), where brackets are taken in ℒ(T). Define a Lie homomorphism f: ℒ(T) → ℒ₁ by f(t) = (d(t), t), where d is the given mapping T → ℒ; f exists, because ℒ′ is free on T. We may write f(x) = (D(x), u(x)) for any x in ℒ(T). By the definition of the product in ℒ₁, u is a Lie homomorphism ℒ(T) → ℒ′. Moreover, u(t) = t for t in T. Hence, u is the identity and f(x) = (D(x), x). Then

(D([x, y]), [x, y]) = f([x, y]) = [f(x), f(y)] = [(Dx, x), (Dy, y)] = ([Dx, y] + [x, Dy], [x, y]),

which shows that D is a derivation of ℒ(T) extending d. □

Proof of Theorem 0.6 Let ℒ′ = ℒ(c), the free Lie algebra on c (which is of dimension 1), and ℒ = ℒ(T), the free Lie algebra on T = ℕ × B, with B = A∖c. By Lemma 0.7(iii), there is a unique derivation D of ℒ such that D(n, b) = −(n + 1, b), for (n, b) in ℕ × B. The linear mapping h: ℒ′ → Der(ℒ), c ↦ D is a Lie homomorphism. Hence, by Lemma 0.7(ii), we may define the Lie algebra ℒ₁ = ℒ ⊕ ℒ′, whose product extends that of ℒ and ℒ′, and such that [c, (n, b)] = h(c)((n, b)) = D((n, b)) = −(n + 1, b). Now, since ℒ(A) is free on A, there is a unique Lie algebra homomorphism ψ: ℒ(A) → ℒ₁ such that ψ(c) = c, and ψ(b) = (0, b) for b ∈ B.
Similarly, there are unique Lie homomorphisms φ′: ℒ′ → ℒ(A) and φ: ℒ → ℒ(A), such that φ′(c) = c and

L ∩ E = {t₀, t₁, . . . , tₙ},   (0.3.4)

for some n ≥ 0, such that (0.3.3) holds, and that moreover

T_{n+1} ∩ E = ∅.   (0.3.5)

Corollary 0.9
If L is a Lazard set, then ψ(L) is a basis of the K-module ℒ(A).
Proof The set ψ(L) is linearly independent: indeed, it is enough to show that it is the case for each finite subset K of L; then we can find a finite closed subset E ≠ ∅ containing K, and it is enough to show that ψ(L ∩ E) is linearly independent; this is a consequence of (0.3.4), (0.3.5), and Corollary 0.8.

Now, let P ≠ 0 be in ℒ(A); the latter is linearly generated by ψ(M(A)), hence we may find a finite nonempty subset B of A such that P is a linear combination of ψ(M(B)). Denote by d the total degree in the variables in B. Let d(P) = d, and define

E = {t ∈ M(B) | d(t) ≤ d}.

Then E is a nonempty, closed and finite subset of M(A), so we have (0.3.4), (0.3.3), and (0.3.5). Note that for each t = t₀, . . . , tₙ, d(t) ≤ d, and, hence, also d(ψ(t)) ≤ d because ψ does not increase degrees. We have, by (0.3.5), t ∈ T_{n+1} ⟹ d(t) > d. Hence, since ψ is homogeneous, each nonzero element of ψ(T_{n+1}) is homogeneous of total B-degree > d. Since the Lie bracket is homogeneous, the same holds for each homogeneous component of each element of the subalgebra generated by ψ(T_{n+1}). Hence, by Corollary 0.8 and the degree assumption on P, the latter is a linear combination of ψ(t₀), . . . , ψ(tₙ). □
Corollary 0.10 ℒ(A) is a free K-module.
Proof In Chapter 4, we shall define Hall sets, show that they exist (Proposition 4.1) and that each Hall set is a Lazard set (Theorem 4.18(i)): these two results will be proved independently of this chapter. Hence, Lazard sets exist, which implies by Corollary 0.9 that ℒ(A) is a free K-module. □
0.4 APPENDIX

0.4.1 Variants of the Jacobi identity
The Jacobi identity (0.1.2) may be rewritten, using (0.1.3):

[x, [y, z]] = [[x, y], z] + [y, [x, z]].

This means that the linear endomorphism ad(x): y ↦ [x, y] is a derivation of the Lie algebra ℒ, i.e.

ad(x)([y, z]) = [ad(x)(y), z] + [y, ad(x)(z)].

The Jacobi identity may also be written

[x, [y, z]] − [y, [x, z]] = [[x, y], z].

This is equivalent to

[ad(x), ad(y)](z) = ad([x, y])(z),

and means that x ↦ ad(x) is a Lie algebra homomorphism ℒ → End(ℒ). Another equivalent form of the Jacobi identity is

[x, [y, z]] = [[x, y], z] − [[x, z], y].

This may be used to show by induction that if a set X generates a Lie algebra, then the latter is linearly generated by the left to right bracketed elements (or left normed elements) [⋯[[x₁, x₂], x₃], . . . , xₙ], xᵢ ∈ X.
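The induction behind this remark can be carried out mechanically. The following Python sketch (an illustration, not the book's algorithm) encodes a left-normed monomial [⋯[[x₁, x₂], x₃], . . . , xₖ] as the tuple (x₁, . . . , xₖ) and rewrites a bracket of two such monomials into a linear combination of left-normed monomials, using the last identity above:

```python
# A sketch (an illustration, not the book's algorithm): rewriting brackets
# into linear combinations of left-normed elements, using the identity
# [x, [y, z]] = [[x, y], z] - [[x, z], y].  A left-normed monomial
# [...[[x1, x2], x3], ..., xk] is encoded as the tuple (x1, ..., xk).
from collections import defaultdict

def bracket(p, q):
    """Bracket [p, q] of two left-normed monomials as {monomial: coefficient}."""
    if len(q) == 1:                          # [p, letter] is again left-normed
        return {p + q: 1}
    qp, qm = q[:-1], q[-1:]                  # q = [q', q_m] with q_m a letter
    out = defaultdict(int)
    for m, c in bracket(p, qp).items():      # + [[p, q'], q_m]
        out[m + qm] += c
    for m, c in bracket(p + qm, qp).items(): # - [[p, q_m], q']
        out[m] -= c
    return dict(out)

# [b, [a, c]] = [[b, a], c] - [[b, c], a]:
inner = bracket(('b',), ('a', 'c'))
assert inner == {('b', 'a', 'c'): 1, ('b', 'c', 'a'): -1}

# a deeper nesting, [a, [b, [a, c]]], also expands into left-normed monomials:
result = defaultdict(int)
for m, c in inner.items():
    for m2, c2 in bracket(('a',), m).items():
        result[m2] += c * c2
assert all(len(m) == 4 for m in result)
```

Each recursive step shortens the right argument, so the rewriting terminates, which is exactly the induction in the text.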
0.4.2
Witt formula
Let $A$ have $q$ elements and let $a_n$ denote the dimension of the homogeneous component of degree $n$ of $\mathcal{L}(A)$ over a field $K$. Then Theorems 0.2 and 0.5 imply that

$\prod_{n \ge 1} (1 - x^n)^{-a_n} = \sum_{n \ge 0} \beta_n x^n$,

where $\beta_n$ is the dimension of the component of degree $n$ of the free associative algebra on $A$. Now, it is not difficult to see that $\beta_n = q^n$ (see Section 1.1). Then, by taking the logarithm of the previous formula and by Möbius
inversion, one obtains the formula of Witt (1937):
$a_n = \frac{1}{n} \sum_{d \mid n} \mu(d)\, q^{n/d}$.
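The Witt formula is easy to evaluate. The sketch below (our own helper names `mobius`, `witt`, `series_check`) computes $a_n$ by Möbius inversion and then confirms the generating identity above by expanding the infinite product up to a fixed degree:

```python
from math import comb

def mobius(n):
    # Möbius function mu(n) by trial division
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0        # n has a squared prime factor
            result = -result
        d += 1
    return -result if n > 1 else result

def witt(q, n):
    # a_n = (1/n) sum_{d | n} mu(d) q^(n/d)
    return sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0) // n

def series_check(q, N):
    # expand prod_{n>=1} (1 - x^n)^(-a_n) up to degree N and compare it
    # with sum_{n>=0} beta_n x^n, where beta_n = q^n
    coeffs = [1] + [0] * N
    for n in range(1, N + 1):
        a = witt(q, n)
        if a == 0:
            continue
        out = [0] * (N + 1)
        for k in range(N // n + 1):      # (1 - x^n)^(-a) = sum_k C(a+k-1, k) x^(nk)
            c = comb(a + k - 1, k)
            for i in range(N + 1 - n * k):
                out[i + n * k] += c * coeffs[i]
        coeffs = out
    return coeffs == [q ** i for i in range(N + 1)]

assert witt(2, 1) == 2 and witt(2, 6) == 9   # e.g. (64 - 8 - 4 + 2)/6 = 9
assert series_check(2, 8) and series_check(3, 6)
```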
0.4.3
Canonical decomposition
Let $\mathcal{L}$ be a Lie algebra over $K$, and $\mathcal{A}$ its enveloping algebra. We assume that $\mathcal{L}$ is a free $K$-module, hence $\mathcal{L}$ is embedded in $\mathcal{A}$. Suppose that $K$ contains $\mathbb{Q}$ and define $U_n$ as the linear span of the elements

$(x_1, \ldots, x_n) = \frac{1}{n!} \sum_{\sigma \in S_n} x_{\sigma(1)} \cdots x_{\sigma(n)}$,

for each choice of $x_1, \ldots, x_n$ in $\mathcal{L}$ ($S_n$ is the symmetric group). Then $U_0 = K$, $U_1 = \mathcal{L}$ and

$\mathcal{A} = \bigoplus_{n \ge 0} U_n$.   (0.4.1)

Indeed, denote by $V_n$ the linear span in $\mathcal{A}$ of the elements $x_1 \cdots x_p$, with $p \le n$ and $x_i \in \mathcal{L}$. Then

$x_1 \cdots x_n \equiv (x_1, \ldots, x_n) \bmod V_{n-1}$.   (0.4.2)

This is a consequence of the formula $xy = [x, y] + yx$, which implies that all products $x_{\sigma(1)} \cdots x_{\sigma(n)}$ are congruent to $x_1 \cdots x_n \bmod V_{n-1}$. Summing up, we get (0.4.2). Now, take the $x_i$ in an ordered basis $B$ of $\mathcal{L}$. Then eqn (0.4.2) gives triangular relations between the elements $x_1 \cdots x_n$ ($x_i \in B$, $x_1 \ge \cdots \ge x_n$) and the elements $(x_1, \ldots, x_n)$. Hence, Theorem 0.2 implies that the latter form a basis of $\mathcal{A}$, which implies (0.4.1). Note that Proposition 3.6 shows that $U_n$ coincides with the submodule of $\mathcal{A}$ generated by the elements $x^n$, $x \in \mathcal{L}$.
0.5
NOTES
In Sections 0.1 and 0.2, we have followed Bourbaki (1972). Note that Corollary 0.3 is true under weaker hypotheses (Cohn 1963). For the Lazard elimination process (Section 0.3), we have followed Viennot (1978). Theorem 0.6 and its corollaries are due to Lazard (1960).
1 Lie polynomials
After introducing words, noncommutative polynomials and series, we define Lie polynomials. One of the main results presents the various characterizations of Lie polynomials. This naturally leads us to define the shuffle product, and to study the duality between concatenation and shuffle product; in other words, the Hopf-algebra-like properties of the free associative algebra. Related questions are treated in the appendix, including the support of the free Lie algebra (the set of words which may appear in Lie polynomials), the free Lie $p$-algebra and the Jacobson identities, the kernel of the left-to-right bracketing mapping and a brief excursion into automata theory.
1.1
WORDS, POLYNOMIALS, AND SERIES
Let A be a set, which we call an alphabet, whose elements are letters; for the main applications, the alphabet will be finite, but it is convenient to admit
infinite alphabets. A word on the alphabet A is a finite sequence of elements of A, including the empty sequence, called the empty word. With the concatenation product, the set of all words over A gives rise to a monoid called the free monoid on A and denoted by A*; the neutral element is the empty word, denoted by 1. The set of nonempty words is denoted by A+. Each letter is itself a word, and each word w is the product of its letters,
from left to right:

$w = a_1 a_2 \cdots a_n \qquad (a_i \in A)$.
Here, n is the length of w, denoted by |w|. For each letter a, we denote by
|w|a the number of occurrences of a in the word w: it is the length of w with respect to the letter a.
A factor of a word $w$ is a word $u$ such that $w = xuy$ for some words $x$, $y$; if, moreover, $x = 1$, $u$ is called a left factor, or a prefix of $w$; a right factor (or suffix) is defined similarly. A factor $u$ of $w$ is called proper if $u \ne w$, and nontrivial if $u \ne 1$.
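Modelling words as Python strings (a convenient encoding we adopt for illustration: concatenation is string concatenation and the empty word is `""`), these notions translate directly:

```python
def factors(w):
    # all factors u of w, i.e. words with w = x u y; includes 1 = "" and w itself
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}

def prefixes(w):
    # left factors (prefixes) of w
    return {w[:i] for i in range(len(w) + 1)}

def occurrences(w, a):
    # |w|_a, the length of w with respect to the letter a
    return w.count(a)

assert prefixes("aba") == {"", "a", "ab", "aba"}
assert "ba" in factors("abab") and "aa" not in factors("abab")
assert occurrences("abab", "a") == 2 and len("abab") == 4   # |w|_a and |w|
```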
The terminology ‘free monoid’ is justified by the following universal property.
[Fig. 1.1: commutative diagram $A \to A^*$, with $f = \bar{f} \circ i$]

Proposition 1.1 For any mapping $f$ from $A$ into a monoid $M$, there is a unique extension of $f$ to a monoid homomorphism $\bar{f}\colon A^* \to M$ such that the diagram in Fig. 1.1 is commutative (where $i$ is the natural injection $A \to A^*$).

Proof If $w = a_1 a_2 \cdots a_n$ ($a_i \in A$), then clearly we must have $\bar{f}(w) = f(a_1) f(a_2) \cdots f(a_n)$. So $\bar{f}$ is unique. Moreover, it is easily verified that if $\bar{f}$ is defined in this way, then $\bar{f}$ is a monoid homomorphism and satisfies $\bar{f} \circ i = f$. □
Let $K$ be a commutative ring with unit. A noncommutative polynomial on $A$ over $K$ is a linear combination over $K$ of words on $A$. We simply say polynomial when no confusion arises. If $P$ is a polynomial, we write it as

$P = \sum_{w \in A^*} (P, w)\, w$.

Thus, $(P, w)$ is the coefficient in $P$ of the word $w$; all but a finite number of the $(P, w)$'s are zero. The set of all polynomials is denoted by $K\langle A\rangle$; it is a free $K$-module with basis $A^*$, and the concatenation product of words extends linearly to it, making $K\langle A\rangle$ an associative $K$-algebra, the free associative algebra on $A$, as the following universal property shows.

Proposition 1.2 For any mapping $f$ from $A$ into an associative $K$-algebra $\mathcal{A}$, there is a unique extension of $f$ to an algebra homomorphism $\bar{f}\colon K\langle A\rangle \to \mathcal{A}$ such that the diagram in Fig. 1.2 is commutative (where $i$ is the natural injection $A \to K\langle A\rangle$).
[Fig. 1.2: commutative diagram $A \to K\langle A\rangle$, with $f = \bar{f} \circ i$]

Proof If $\bar{f}$ is such an extension, then $\bar{f}|A^*$ is a monoid homomorphism $A^* \to \mathcal{A}$ (multiplicative structure of $\mathcal{A}$), so $\bar{f}|A^*$ is unique by Proposition 1.1. Since $A^*$ linearly generates $K\langle A\rangle$ over $K$, $\bar{f}$ must be unique. Now, denote by $g$ the extension of $f$ to a monoid homomorphism $A^* \to \mathcal{A}$ ($g$ exists by Proposition 1.1). Define $\bar{f}$ to be the linear extension of $g$ to $K\langle A\rangle$: it exists because $A^*$ is a basis of the $K$-module $K\langle A\rangle$. Given two polynomials $P = \sum_{u \in A^*} (P, u)u$, $Q = \sum_{v \in A^*} (Q, v)v$, we have $PQ = \sum_{u,v} (P, u)(Q, v)uv$, hence

$\bar{f}(PQ) = \sum_{u,v} (P, u)(Q, v)\, g(uv) = \sum_{u,v} (P, u)(Q, v)\, g(u)g(v) = \sum_u (P, u)g(u) \sum_v (Q, v)g(v) = \bar{f}(P)\bar{f}(Q)$,

which shows that $\bar{f}$ is an algebra homomorphism. Evidently, $\bar{f} \circ i = f$. □
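Proposition 1.2 can be made concrete. The sketch below (our own encoding: a polynomial is a dict mapping words to coefficients) extends a letter map $f$ into $2 \times 2$ integer matrices to the algebra homomorphism $\bar{f}$, exactly as in the proof:

```python
def mat_mul(m, n):
    # product of 2x2 matrices given as tuples of rows
    return tuple(tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def f_bar(P, f):
    # linear extension of the monoid homomorphism g: A* -> M_2(K),
    # g(a1...an) = f(a1)...f(an); the empty word goes to the identity
    identity = ((1, 0), (0, 1))
    total = ((0, 0), (0, 0))
    for word, coeff in P.items():
        m = identity
        for letter in word:
            m = mat_mul(m, f[letter])
        total = tuple(tuple(total[i][j] + coeff * m[i][j] for j in range(2))
                      for i in range(2))
    return total

f = {"a": ((0, 1), (0, 0)), "b": ((0, 0), (1, 0))}
# f_bar sends the polynomial ab - ba to f(a)f(b) - f(b)f(a) = diag(1, -1)
assert f_bar({"ab": 1, "ba": -1}, f) == ((1, 0), (0, -1))
```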
The degree of a nonzero polynomial $P$ is

$\deg(P) = \sup\{|w| : w \in A^*, (P, w) \ne 0\}$,

and $\deg(0) = -\infty$. Similarly, for a letter $a$ in $A$, the partial degree of $P$ with respect to $a$ is

$\deg_a(P) = \sup\{|w|_a : w \in A^*, (P, w) \ne 0\}$,

and $\deg_a(0) = -\infty$.
A polynomial P is homogeneous of degree n if P is a linear combination of words of length n, and finely homogeneous if P is a linear combination of words having all the same partial degrees with respect to all letters; homogeneous components and finely homogeneous components are defined as usual.
A formal series (or series) on $A$ over $K$ is an infinite formal linear combination

$S = \sum_{w \in A^*} (S, w)\, w$.

The constant term of $S$ is $(S, 1)$. The set of all series is denoted by $K\langle\langle A\rangle\rangle$. It acquires a $K$-algebra structure, with product

$(ST, w) = \sum_{w = uv} (S, u)(T, v)$.

As before, we use the word 'concatenation' when confusion with other products may arise. Note that $K\langle A\rangle$ is a subalgebra of $K\langle\langle A\rangle\rangle$. There is a natural duality between $K\langle A\rangle$ and $K\langle\langle A\rangle\rangle$. Indeed, define the pairing

$K\langle\langle A\rangle\rangle \times K\langle A\rangle \to K, \qquad (S, P) \mapsto (S, P) = \sum_{w \in A^*} (S, w)(P, w)$.

This sum is finite because $P$ is a polynomial. It is easily seen that with this pairing, $K\langle\langle A\rangle\rangle$ may be identified with the dual space of $K\langle A\rangle$. When restricted to $K\langle A\rangle$, this pairing gives a scalar product on $K\langle A\rangle$ with $A^*$ as orthonormal basis.

We need some topological remarks, but we shall not go into too much detail. Put on $K$ the discrete topology, and consider on $K\langle\langle A\rangle\rangle$ the smallest topology such that each mapping

$S \mapsto (S, w), \qquad K\langle\langle A\rangle\rangle \to K$,

is continuous. Equivalently, a fundamental system of neighbourhoods of a series $S$ is the family of subsets, indexed by finite subsets $L$ of $A^*$,

$V_L(S) = \{T \in K\langle\langle A\rangle\rangle \mid \forall w \in L, (T, w) = (S, w)\}$.

Then $K\langle\langle A\rangle\rangle$ becomes a complete topological ring, and $K\langle A\rangle$ is a dense subring of $K\langle\langle A\rangle\rangle$. This topology is sometimes called the $A$-adic topology. When $A$ is finite, it is defined by the ultrametric distance

$d(S, T) = \theta^{\omega(S - T)}$,

for some fixed $\theta$, $0 < \theta < 1$, where for any nonzero series $S$, $\omega(S)$ is the length of the shortest word $w$ such that $(S, w) \ne 0$, and $\omega(0) = +\infty$. This topology has the following nice property with respect to infinite sums:
if $(S_i)_{i \in I}$ is a family of series such that for each neighbourhood of 0, all but a finite number of these series are in this neighbourhood, then the family $(S_i)$ is summable, and its sum $S$ is defined for any word $w$ by

$(S, w) = \sum_{i \in I} (S_i, w)$.

Observe that only finitely many terms in the right-hand sum are nonzero, by hypothesis. This condition may also be expressed by saying that the family $(S_i)$ is locally finite. In particular, if a series $S$ has constant term 0 (equivalently $\omega(S) \ge 1$), then each family $(a_n S^n)_{n \ge 0}$ is summable, and one may define $\sum_{n \ge 0} a_n S^n$. In particular, $(1 - S)^{-1} = \sum_{n \ge 0} S^n$, and if $K$ contains $\mathbb{Q}$, one defines
$\exp(S) = e^S = \sum_{n \ge 0} \frac{S^n}{n!}, \qquad \log(1 + S) = \sum_{n \ge 1} \frac{(-1)^{n-1}}{n} S^n$,

and one has the usual formulas

$\log(e^S) = S, \qquad \exp(\log(1 + S)) = 1 + S$.
Similar considerations apply to infinite products, and to other rings of formal series, like the complete tensor product $K\langle\langle A\rangle\rangle \mathbin{\hat{\otimes}} K\langle\langle A\rangle\rangle$.
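Up to a fixed degree $N$, only finitely many words matter, so these limits can be computed exactly. Below is a minimal sketch (our own encoding: a truncated series is a dict mapping words to rational coefficients) of $\exp$ and $\log$ for noncommutative series, checking $\log(e^S) = S$:

```python
from fractions import Fraction
from math import factorial

N = 4  # truncation order: words of length > N are discarded

def mul(S, T):
    # concatenation product of truncated series
    out = {}
    for u, s in S.items():
        for v, t in T.items():
            if len(u) + len(v) <= N:
                out[u + v] = out.get(u + v, 0) + s * t
    return out

def exp_series(S):
    # exp(S) = sum_n S^n / n!, defined since S has constant term 0
    assert S.get("", 0) == 0
    result, power = {"": Fraction(1)}, {"": Fraction(1)}
    for n in range(1, N + 1):
        power = mul(power, S)
        for w, c in power.items():
            result[w] = result.get(w, 0) + c * Fraction(1, factorial(n))
    return result

def log_series(S):
    # log(1 + T) = sum_n (-1)^(n-1) T^n / n, where T = S - 1
    assert S.get("", 0) == 1
    T = {w: c for w, c in S.items() if w != ""}
    result, power = {}, {"": Fraction(1)}
    for n in range(1, N + 1):
        power = mul(power, T)
        for w, c in power.items():
            result[w] = result.get(w, 0) + c * Fraction((-1) ** (n - 1), n)
    return result

S = {"a": Fraction(1), "b": Fraction(2)}
back = {w: c for w, c in log_series(exp_series(S)).items() if c != 0}
assert back == S   # log(e^S) = S holds exactly in the truncated ring
```

Truncating uniformly at degree $N$ amounts to working in the quotient by words of length $> N$, so the identity holds exactly there.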
1.2
LIE POLYNOMIALS
Given two (noncommutative) polynomials $P$, $Q$ in $K\langle A\rangle$, their Lie bracket (or Lie product) is as usual defined by

$[P, Q] = PQ - QP$.

A Lie polynomial is an element of the smallest submodule of $K\langle A\rangle$ containing $A$ and closed under the Lie bracket. By Theorem 0.5 it is the free Lie algebra on $A$, so we denote it by $\mathcal{L}_K(A)$ or $\mathcal{L}(A)$. Moreover, $K\langle A\rangle$ is the enveloping algebra of $\mathcal{L}(A)$.
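With polynomials encoded as dicts mapping words to coefficients (our own encoding), Lie polynomials can be produced by iterated brackets:

```python
def mul(P, Q):
    # concatenation product in K<A>
    out = {}
    for u, a in P.items():
        for v, b in Q.items():
            out[u + v] = out.get(u + v, 0) + a * b
    return out

def bracket(P, Q):
    # [P, Q] = PQ - QP
    out = dict(mul(P, Q))
    for w, c in mul(Q, P).items():
        out[w] = out.get(w, 0) - c
    return {w: c for w, c in out.items() if c}

a, b = {"a": 1}, {"b": 1}
assert bracket(a, b) == {"ab": 1, "ba": -1}
# [a, [a, b]] = aab - 2aba + baa, a Lie polynomial of degree 3
assert bracket(a, bracket(a, b)) == {"aab": 1, "aba": -2, "baa": 1}
```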
The next result is elementary, but useful. If $L$ is another commutative ring with unit and $\varphi\colon K \to L$ a $\mathbb{Z}$-linear mapping, then we still denote by $\varphi$ the mapping $K\langle A\rangle \to L\langle A\rangle$ defined by $\varphi(P) = \sum_{w \in A^*} \varphi((P, w))\, w$. Note that if $\varphi$ is a ring homomorphism, then so is its extension to $K\langle A\rangle$.

Lemma 1.3 (i) $\mathcal{L}(A)$ is finely homogeneous, that is, if $P$ is a Lie polynomial, then each finely homogeneous component of $P$ is a Lie polynomial. (ii) With $L$ and $\varphi$ as above, $\varphi(P)$ is a Lie polynomial in $L\langle A\rangle$ for each Lie polynomial $P$ in $K\langle A\rangle$.

Proof (i) is true when $P$ is a letter. Moreover, if it is true for $P$, $Q$, and if $\alpha \in K$, then it is also true for $P + Q$, $\alpha P$ and $[P, Q] = PQ - QP$. So it is true for any Lie polynomial.
(ii) If $\varphi$ is a ring homomorphism, then so is its extension to $K\langle A\rangle$, and the result follows as in (i). In the general case, let $u\colon \mathbb{Z} \to K$, $v\colon \mathbb{Z} \to L$ be the canonical ring homomorphisms sending 1 to 1. Then each Lie polynomial $P \in \mathcal{L}_K(A)$ may be written as a finite sum

$P = \sum_i \alpha_i\, u(P_i)$,   (1.2.1)

where $\alpha_i \in K$, $P_i \in \mathcal{L}_{\mathbb{Z}}(A)$. This is clear when $P$ is a letter, and is easily extended by induction to all of $\mathcal{L}_K(A)$. Now, if $Q \in \mathbb{Z}\langle A\rangle$, then for $\alpha$ in $K$, $\varphi(\alpha\, u(Q)) = \varphi(\alpha)\, v(Q)$; indeed $Q = \sum_j n_j w_j$ ($n_j \in \mathbb{Z}$, $w_j \in A^*$), hence

$\varphi(\alpha\, u(Q)) = \varphi\Big(\sum_j n_j \alpha\, w_j\Big) = \sum_j \varphi(n_j \alpha)\, w_j = \sum_j n_j \varphi(\alpha)\, w_j = \varphi(\alpha)\, v(Q)$.

Hence, by (1.2.1), $\varphi(P) = \sum_i \varphi(\alpha_i)\, v(P_i)$, which is a Lie polynomial in $L\langle A\rangle$ by the case of ring homomorphisms applied to $v$. □

1.3

CHARACTERIZATIONS OF LIE POLYNOMIALS

Let $\delta$ be the algebra homomorphism

$\delta\colon K\langle A\rangle \to K\langle A\rangle \otimes_K K\langle A\rangle, \qquad \delta(a) = a \otimes 1 + 1 \otimes a$,
for any letter $a$. Such a homomorphism exists by Proposition 1.2. Define a linear mapping $\alpha$ from $K\langle A\rangle$ into itself, in the following way: for any word $w = a_1 \cdots a_n$ ($a_i \in A$), let $\alpha(w) = (-1)^n a_n \cdots a_1$. In other words, $\alpha$ is the anti-automorphism of $K\langle A\rangle$ which sends each letter $a$ to $-a$. In particular, $\alpha(PQ) = \alpha(Q)\alpha(P)$ for all polynomials $P$, $Q$. Now, define

$\bar{\delta} = (\mathrm{id} \otimes \alpha) \circ \delta$,

where $\mathrm{id}$ is the identity of $K\langle A\rangle$. For a polynomial $P$, denote by $\mathrm{ad}(P)$ the endomorphism $Q \mapsto [P, Q]$ of $K\langle A\rangle$, and let $\mathrm{Ad}$ be the unique algebra homomorphism $K\langle A\rangle \to \mathrm{End}(K\langle A\rangle)$ such that $\mathrm{Ad}(a) = \mathrm{ad}(a)$ for each letter $a$ (it exists by Proposition 1.2). For example,

$\mathrm{Ad}(ab)(Q) = [a, [b, Q]] = abQ - aQb - bQa + Qba$.

Let $D\colon K\langle A\rangle \to K\langle A\rangle$ be the linear mapping which sends each word $w$ of length $n$ into $nw$. It is easily verified that $D$ is a derivation of $K\langle A\rangle$, that is,

$D(PQ) = D(P)Q + PD(Q)$.
Actually, $D$ is the unique derivation of $K\langle A\rangle$ such that $D(a) = a$ for any letter $a$.
Finally, define the 'Lie bracketing from right to left', or the 'right-normed bracketing', to be the unique linear endomorphism $r$ of $K\langle A\rangle$ such that for any word $w = a_1 \cdots a_n$ of positive length, one has $r(w) = [a_1, \ldots, [a_{n-1}, a_n] \ldots]$, and $r(1) = 0$. For example, one has for $a, b, c$ in $A$:

$r(abc) = [a, [b, c]] = [a, bc - cb] = abc - acb - bca + cba$.

In the next result, the ring $K$ is assumed to be a $\mathbb{Q}$-algebra, that is, a ring containing $\mathbb{Q}$. We shall also assume that $A$ has at least two letters; the theorem is trivial in the one-letter case (except for condition (ii), which is not equivalent to the others).

Theorem 1.4
For $P$ in $K\langle A\rangle$, the following conditions are equivalent:

(i) $P$ is a Lie polynomial;
(ii) $\mathrm{ad}(P) = \mathrm{Ad}(P)$ and $(P, 1) = 0$;
(iii) $\delta(P) = P \otimes 1 + 1 \otimes P$;
(iv) $\bar{\delta}(P) = P \otimes 1 - 1 \otimes P$;
(v) $(P, 1) = 0$ and $r(P) = D(P)$.

The equivalence of (i) and (iii) is often expressed in the following way: let $A'$ be a copy of $A$, with bijection $a \mapsto a'$; let each letter in $A$ commute with each letter in $A'$. Then, denoting a polynomial $P$ by $P(a, b, \ldots)$, (iii) is
rewritten as $P(a + a', b + b', \ldots) = P(a, b, \ldots) + P(a', b', \ldots)$. Thus, a polynomial is a Lie polynomial if it is additive, in the above sense. Theorem 1.4 is still true if $K$ is a $\mathbb{Z}$-algebra without torsion (see Corollary 4.17). However, it is not true if $K$ is a field with nonzero characteristic; when the characteristic is a prime number $p$, it characterizes the free Lie $p$-algebra (see Section 2.5.2).

Lemma 1.5
Define two linear mappings

$\lambda, \mathrm{conc}\colon K\langle A\rangle \otimes K\langle A\rangle \to K\langle A\rangle$

by $\lambda(P \otimes Q) = D(P)Q$ and $\mathrm{conc}(P \otimes Q) = PQ$. Then for any polynomial $P$, one has $\lambda \circ \bar{\delta}(P) = r(P)$ and $\mathrm{conc} \circ \bar{\delta}(P) = (P, 1)$.
Proof We have $\delta(1) = 1 \otimes 1$ and $D(1) = 0$, so that $\lambda \circ \bar{\delta}(1) = 0$ and $\mathrm{conc} \circ \bar{\delta}(1) = 1$. It remains to show that for $n \ge 1$ and $a_n, \ldots, a_1$ in $A$, one has

$\lambda \circ \bar{\delta}(a_n \cdots a_1) = r(a_n \cdots a_1)$,   (1.3.2)

and

$\mathrm{conc} \circ \bar{\delta}(a_n \cdots a_1) = 0$.   (1.3.3)

We do it by induction on $n$. The case $n = 1$ is easy:

$\bar{\delta}(a_1) = (\mathrm{id} \otimes \alpha) \circ \delta(a_1) = (\mathrm{id} \otimes \alpha)(a_1 \otimes 1 + 1 \otimes a_1) = a_1 \otimes 1 - 1 \otimes a_1$,
so that

$\lambda \circ \bar{\delta}(a_1) = D(a_1) - D(1)a_1 = a_1 = r(a_1)$

and

$\mathrm{conc} \circ \bar{\delta}(a_1) = a_1 - a_1 = 0$.

Suppose that eqns (1.3.2) and (1.3.3) are true for $n \ge 1$; we prove them for $n + 1$. Let

$\delta(a_n \cdots a_1) = \sum_i P_i \otimes Q_i$.

Then $\bar{\delta}(a_n \cdots a_1) = \sum_i P_i \otimes \alpha(Q_i)$, hence we have by induction

$\sum_i D(P_i)\alpha(Q_i) = \lambda \circ \bar{\delta}(a_n \cdots a_1) = r(a_n \cdots a_1)$,   (1.3.4)

and

$\sum_i P_i \alpha(Q_i) = \mathrm{conc} \circ \bar{\delta}(a_n \cdots a_1) = 0$.   (1.3.5)

Now, we have

$\bar{\delta}(a_{n+1} \cdots a_1) = (\mathrm{id} \otimes \alpha) \circ \delta(a_{n+1} \cdots a_1) = (\mathrm{id} \otimes \alpha)(\delta(a_{n+1})\, \delta(a_n \cdots a_1)) = (\mathrm{id} \otimes \alpha)\Big((a_{n+1} \otimes 1 + 1 \otimes a_{n+1})\sum_i P_i \otimes Q_i\Big) = (\mathrm{id} \otimes \alpha)\Big(\sum_i a_{n+1} P_i \otimes Q_i + \sum_i P_i \otimes a_{n+1} Q_i\Big) = \sum_i a_{n+1} P_i \otimes \alpha(Q_i) - \sum_i P_i \otimes \alpha(Q_i)\, a_{n+1}$,

because $\alpha$ is an anti-endomorphism of the algebra $K\langle A\rangle$ and $\alpha(a_{n+1}) = -a_{n+1}$. Thus, we have

$\lambda \circ \bar{\delta}(a_{n+1} \cdots a_1) = \sum_i D(a_{n+1} P_i)\alpha(Q_i) - \sum_i D(P_i)\alpha(Q_i)\, a_{n+1} = \sum_i a_{n+1} P_i \alpha(Q_i) + \sum_i a_{n+1} D(P_i)\alpha(Q_i) - \sum_i D(P_i)\alpha(Q_i)\, a_{n+1} = 0 + a_{n+1}\, r(a_n \cdots a_1) - r(a_n \cdots a_1)\, a_{n+1} = [a_{n+1}, r(a_n \cdots a_1)] = r(a_{n+1} \cdots a_1)$,

where we have used in the second equality the fact that $D$ is a derivation such that $D(a_{n+1}) = a_{n+1}$, and (1.3.4) and (1.3.5) in the third one. Moreover, we have

$\mathrm{conc} \circ \bar{\delta}(a_{n+1} \cdots a_1) = \sum_i a_{n+1} P_i \alpha(Q_i) - \sum_i P_i \alpha(Q_i)\, a_{n+1} = 0$,

by (1.3.5). □
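Lemma 1.5 can be tested mechanically. The sketch below (our own dict encoding of $K\langle A\rangle$ and of its tensor square, with $\delta(w) = \sum_I w|I \otimes w|I^c$ over subsets $I$ of positions) checks $\lambda \circ \bar{\delta}(w) = r(w)$ and $\mathrm{conc} \circ \bar{\delta}(w) = 0$ on a sample word:

```python
from itertools import combinations

def delta(w):
    # delta(a1...an) = prod_i (a_i (x) 1 + 1 (x) a_i) = sum_I w|I (x) w|I^c
    out = {}
    for k in range(len(w) + 1):
        for I in combinations(range(len(w)), k):
            left = "".join(w[i] for i in I)
            right = "".join(w[i] for i in range(len(w)) if i not in I)
            out[(left, right)] = out.get((left, right), 0) + 1
    return out

def delta_bar(w):
    # delta_bar = (id (x) alpha) o delta, with alpha(u) = (-1)^|u| reverse(u)
    out = {}
    for (u, v), c in delta(w).items():
        out[(u, v[::-1])] = out.get((u, v[::-1]), 0) + c * (-1) ** len(v)
    return out

def lam(tensor):
    # lambda(P (x) Q) = D(P) Q, where D(w) = |w| w
    out = {}
    for (u, v), c in tensor.items():
        out[u + v] = out.get(u + v, 0) + c * len(u)
    return {x: c for x, c in out.items() if c}

def conc(tensor):
    out = {}
    for (u, v), c in tensor.items():
        out[u + v] = out.get(u + v, 0) + c
    return {x: c for x, c in out.items() if c}

def r(w):
    # right-normed bracketing: r(a1...an) = [a1, r(a2...an)], r(1) = 0
    if len(w) <= 1:
        return {w: 1} if w else {}
    out = {}
    for v, c in r(w[1:]).items():
        out[w[0] + v] = out.get(w[0] + v, 0) + c
        out[v + w[0]] = out.get(v + w[0], 0) - c
    return {x: c for x, c in out.items() if c}

assert r("abc") == {"abc": 1, "acb": -1, "bca": -1, "cba": 1}
assert lam(delta_bar("abc")) == r("abc")   # eqn (1.3.2)
assert conc(delta_bar("abc")) == {}        # eqn (1.3.3)
```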
Define a linear mapping $\mu\colon K\langle A\rangle \otimes K\langle A\rangle \to \mathrm{End}(K\langle A\rangle)$ by

$\mu(P_1 \otimes P_2)(Q) = P_1 Q P_2$.

Lemma 1.6 (i) For any polynomial $P$, one has $\mathrm{ad}(P) = \mu(P \otimes 1 - 1 \otimes P)$ and $\mathrm{Ad}(P) = \mu \circ \bar{\delta}(P)$. (ii) $\mu$ is injective if $A$ has at least two letters.
Proof (i) The first equality is immediate. The second one holds if $P$ is a letter, and so it is enough to show that $\mu \circ \bar{\delta}$ is multiplicative, because $\mathrm{Ad}$ is: then both are algebra homomorphisms which coincide on the generators, and thus are equal. Now, $\mu \circ \bar{\delta} = \mu \circ (\mathrm{id} \otimes \alpha) \circ \delta$, and $\delta$ is multiplicative. A routine verification shows that $\mu \circ (\mathrm{id} \otimes \alpha)$ is multiplicative:

$\mu \circ (\mathrm{id} \otimes \alpha)((P_1 \otimes P_2)(Q_1 \otimes Q_2))(R) = \mu \circ (\mathrm{id} \otimes \alpha)(P_1 Q_1 \otimes P_2 Q_2)(R) = \mu(P_1 Q_1 \otimes \alpha(Q_2)\alpha(P_2))(R) = P_1 Q_1 R\, \alpha(Q_2)\alpha(P_2) = \mu(P_1 \otimes \alpha(P_2))(Q_1 R\, \alpha(Q_2)) = \mu(P_1 \otimes \alpha(P_2)) \circ \mu(Q_1 \otimes \alpha(Q_2))(R) = (\mu \circ (\mathrm{id} \otimes \alpha)(P_1 \otimes P_2)) \circ (\mu \circ (\mathrm{id} \otimes \alpha)(Q_1 \otimes Q_2))(R)$.

(ii) Note that $A^* \times A^*$ is a basis of the free $K$-module $K\langle A\rangle \otimes K\langle A\rangle$.
Take an element $x \ne 0$ in this space; it may be written $x = \sum_{1 \le i \le n} {*}\; u_i \otimes v_i$, where each $*$ indicates a nonzero coefficient, and the $(u_i, v_i)$ are distinct couples of words. Let $u_1$ be of minimal length among all the $u_i$, take $N$ greater than the lengths of all these words, and let $a$, $b$ be two distinct letters in $A$. We show that $u_1 a^N b v_1$ is different from each word $u_i a^N b v_i$ for $i \ge 2$: this will imply that $\mu(x)(a^N b) \ne 0$, hence $\mu$ is injective. Suppose $u_1 a^N b v_1 = u_i a^N b v_i$. Then, by minimality of $u_1$, $u_1$ is a left factor of $u_i$: $u_i = u_1 s \Rightarrow a^N b v_1 = s a^N b v_i$. Because $N$ is big, $s$ is a left factor of $a^N$, hence $s = a^j$, which implies $a^N b v_1 = a^{j+N} b v_i$. Since $a \ne b$, we must have $j = 0$; therefore $u_i = u_1$ and $v_1 = v_i$. □

Lemma 1.7 $\alpha(P) = -P$ for any Lie polynomial $P$.
Proof This is clear for each letter, by definition of $\alpha$. Furthermore, if this equality is true for polynomials $P$, $Q$, and if $a \in K$, then it is also true for $P + Q$, $aP$ and for $[P, Q]$: indeed,

$\alpha([P, Q]) = \alpha(PQ - QP) = \alpha(Q)\alpha(P) - \alpha(P)\alpha(Q) = (-Q)(-P) - (-P)(-Q) = -[P, Q]$,

because $\alpha$ is an anti-endomorphism of the $K$-algebra $K\langle A\rangle$. □
Proof of Theorem 1.4 (i) ⇒ (ii) The set of polynomials $P$ satisfying $\mathrm{ad}(P) = \mathrm{Ad}(P)$ is a submodule of $K\langle A\rangle$ containing $A$ and closed under Lie bracket: indeed, if $\mathrm{ad}(P_i) = \mathrm{Ad}(P_i)$, $i = 1, 2$, then for any polynomial $Q$,

$\mathrm{ad}([P_1, P_2])(Q) = \mathrm{ad}(P_1 P_2 - P_2 P_1)(Q) = P_1 P_2 Q - P_2 P_1 Q - Q P_1 P_2 + Q P_2 P_1$,

and

$\mathrm{Ad}([P_1, P_2])(Q) = \mathrm{Ad}(P_1 P_2 - P_2 P_1)(Q) = (\mathrm{Ad}(P_1) \circ \mathrm{Ad}(P_2) - \mathrm{Ad}(P_2) \circ \mathrm{Ad}(P_1))(Q) = (\mathrm{ad}(P_1) \circ \mathrm{ad}(P_2) - \mathrm{ad}(P_2) \circ \mathrm{ad}(P_1))(Q) = [P_1, [P_2, Q]] - [P_2, [P_1, Q]] = P_1 P_2 Q - P_1 Q P_2 - P_2 Q P_1 + Q P_2 P_1 - P_2 P_1 Q + P_2 Q P_1 + P_1 Q P_2 - Q P_1 P_2 = P_1 P_2 Q + Q P_2 P_1 - P_2 P_1 Q - Q P_1 P_2$.

Hence, $\mathrm{ad}([P_1, P_2]) = \mathrm{Ad}([P_1, P_2])$ and the set of polynomials satisfying (ii) contains all Lie polynomials.

(ii) ⇒ (iv) We have, by hypothesis and Lemma 1.6(i), that $\mu(P \otimes 1 - 1 \otimes P) = \mu \circ \bar{\delta}(P)$. Then by Lemma 1.6(ii), we obtain $P \otimes 1 - 1 \otimes P = \bar{\delta}(P)$.

(iv) ⇒ (v) Apply the mappings $\lambda$ and $\mathrm{conc}$ of Lemma 1.5 to the equality $\bar{\delta}(P) = P \otimes 1 - 1 \otimes P$. We obtain $r(P) = D(P)$ and $(P, 1) = P - P = 0$.

(v) ⇒ (i) Write $P$ as $\sum_{n \ge 0} P_n$, where $P_n$ is the homogeneous component of degree $n$ of $P$; then $r(P) = D(P) = \sum_n n P_n$ (by definition of $D$). Thus $n P_n = r(P)_n$, the homogeneous component of degree $n$ of $r(P)$. Since $r(P)$ is evidently a Lie polynomial and since $\mathcal{L}(A)$ is homogeneous by Lemma 1.3(i), $n P_n$ is a Lie polynomial. Finally (since we may divide by $n$ in $K$), $P_n$ is a Lie polynomial, for each $n \ge 1$. To conclude, observe that $P_0 = (P, 1) = 0$.

(i) ⇒ (iii) We have shown that (i) ⇒ (iv). Hence, if $P$ is a Lie polynomial, we have $\delta(P) = (\mathrm{id} \otimes \alpha) \circ \bar{\delta}(P) = (\mathrm{id} \otimes \alpha)(P \otimes 1 - 1 \otimes P) = P \otimes 1 + 1 \otimes P$, because $\alpha$ is an involution, and by Lemma 1.7.

(iii) ⇒ (v) We have $\bar{\delta}(P) = (\mathrm{id} \otimes \alpha) \circ \delta(P) = P \otimes 1 + 1 \otimes \alpha(P)$. Applying the mappings $\lambda$ and $\mathrm{conc}$ of Lemma 1.5 to this equality we get $r(P) = D(P)$ and $(P, 1) = P + \alpha(P)$. Taking constant terms, and observing that $(\alpha(P), 1) = (P, 1)$, we have $(P, 1) = 2(P, 1)$, hence $(P, 1) = 0$. □
1.4
SHUFFLES
Let $w = a_1 \cdots a_n$ be a word of length $n$ and $I$ be a subset of $\{1, \ldots, n\}$. We denote by $w|I$ the word $a_{i_1} \cdots a_{i_k}$ if $I = \{i_1 < i_2 < \cdots < i_k\}$; in particular, $w|I$ is the empty word if $I = \emptyset$. Such a word $w|I$ is called a subword of $w$.
Note that when $\{1, \ldots, n\}$ is the disjoint union of subsets $I_1, \ldots, I_p$, then $w$ is determined by the knowledge of the $p$ words $w|I_j$ and the $p$ subsets $I_j$. Given $p$ words $u_1, \ldots, u_p$ of respective lengths $n_1, \ldots, n_p$, their shuffle product, denoted by $u_1 \shuffle \cdots \shuffle u_p$, is the polynomial

$u_1 \shuffle \cdots \shuffle u_p = \sum w(I_1, \ldots, I_p)$,

where the sum is extended to all $p$-tuples $(I_1, \ldots, I_p)$ of pairwise disjoint subsets of $\{1, \ldots, n\}$ ($n = n_1 + \cdots + n_p$) such that

$\{1, \ldots, n\} = \bigcup_{1 \le j \le p} I_j$

and $|I_j| = n_j$ for any $j = 1, \ldots, p$, and where the word $w = w(I_1, \ldots, I_p)$ is defined by $w|I_j = u_j$ for $j = 1, \ldots, p$. Note that $u_1 \shuffle \cdots \shuffle u_p$ is the sum of

$\binom{n}{n_1, \ldots, n_p} = \frac{n!}{n_1! \cdots n_p!}$

words of length $n$, so it is a homogeneous polynomial of degree $n$. In particular, if one of the $u_j$ is empty, it may be omitted without changing the shuffle product. Moreover, the product does not depend on the order of the words $u_j$, as the reader may verify.

A word appearing in the shuffle product $u_1 \shuffle \cdots \shuffle u_p$ is called a shuffle of $u_1, \ldots, u_p$. Thus, a shuffle of $u_1, \ldots, u_p$ is a word obtained by 'shuffling' together the words $u_1, \ldots, u_p$ without changing each word $u_j$. The shuffle product is then the sum of all shuffles, with multiplicities. For example, with $a, b, c \in A$, we have:

$ab \shuffle ac = abac + 2aabc + 2aacb + acab$.
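The shuffle product also satisfies a standard recursion on last letters (not spelled out here, but equivalent to the definition): $ua \shuffle vb = (u \shuffle vb)a + (ua \shuffle v)b$. A short sketch using it reproduces the example above:

```python
def shuffle(u, v):
    # recursion on last letters: (ua) sh (vb) = (u sh vb)a + (ua sh v)b
    if not u or not v:
        return {u + v: 1}   # shuffling with the empty word changes nothing
    out = {}
    for w, c in shuffle(u[:-1], v).items():
        out[w + u[-1]] = out.get(w + u[-1], 0) + c
    for w, c in shuffle(u, v[:-1]).items():
        out[w + v[-1]] = out.get(w + v[-1], 0) + c
    return out

assert shuffle("ab", "ac") == {"abac": 1, "aabc": 2, "aacb": 2, "acab": 1}
assert sum(shuffle("ab", "ac").values()) == 6   # binomial(4, 2) shuffles in all
```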
The shuffle product $\shuffle$ is extended to $K\langle A\rangle$ and $K\langle\langle A\rangle\rangle$ by bilinearity. More formally, an element of $\mathcal{A}$ is an infinite linear combination $\sum_{u,v \in A^*} \alpha_{u,v}\, u \otimes v$, and the product in $\mathcal{A}$ is defined by

$\Big(\sum_{u,v} \alpha_{u,v}\, u \otimes v\Big)\Big(\sum_{x,y} \beta_{x,y}\, x \otimes y\Big) = \sum_{u,v,x,y} \alpha_{u,v}\, \beta_{x,y}\, (u \shuffle x) \otimes (vy)$.
Note that each endomorphism $f$ of $K\langle A\rangle$ is completely described by its canonical image $\sum_{u \in A^*} u \otimes f(u)$ in $\mathcal{A}$. Note also that the identity endomorphism is mapped onto $\sum_u u \otimes u$.

Proposition 1.10 With the convolution, $\mathrm{End}(K\langle A\rangle)$ acquires an associative algebra structure, whose unit element is $\varepsilon$. The inverse in this algebra of the identity of $K\langle A\rangle$ is $\alpha$, which is an anti-automorphism of $K\langle A\rangle$ for the concatenation product, and an automorphism for the shuffle product. The canonical embedding $\mathrm{End}(K\langle A\rangle) \to \mathcal{A}$, $f \mapsto \sum_u u \otimes f(u)$, is an algebra homomorphism for this product. In other words,

$\sum_w w \otimes ((f * g)(w)) = \Big(\sum_u u \otimes f(u)\Big)\Big(\sum_v v \otimes g(v)\Big)$,   (1.5.5)

where the product is taken in $\mathcal{A}$ (shuffle on the left of $\otimes$, concatenation on the right). This proposition implies in particular that $K\langle A\rangle$ has a Hopf algebra structure, with concatenation as product, $\delta$ as coproduct, and $\alpha$ as antipode.

Proof We start with eqn (1.5.5). The right-hand side is

$\sum_{u,v} (u \shuffle v) \otimes (f(u)g(v)) = \sum_{u,v} \Big(\sum_w (w, u \shuffle v)\, w\Big) \otimes (f(u)g(v)) = \sum_w w \otimes \Big(\sum_{u,v} (w, u \shuffle v)\, f(u)g(v)\Big)$.

Hence, eqn (1.5.5) is equivalent to

$(f * g)(w) = \sum_{u,v} (w, u \shuffle v)\, f(u)g(v)$,

which is the definition of $f * g$ (see eqn (1.5.4)). Since the product in $\mathcal{A}$ is associative and since

$f \mapsto \sum_u u \otimes f(u), \qquad \mathrm{End}(K\langle A\rangle) \to \mathcal{A}$,
is injective, the convolution is also associative. Moreover, the image of $\varepsilon$ is $1 \otimes 1$, which is the neutral element of $\mathcal{A}$. Hence, $\varepsilon$ is the neutral element for the convolution. The fact that $\alpha$ is an automorphism of the shuffle algebra (and an anti-automorphism of the concatenation algebra) is clear by inspection. Now, by Lemma 1.5, we have for any polynomial $P$, $\mathrm{conc} \circ \bar{\delta}(P) = (P, 1)$, where $\bar{\delta} = (\mathrm{id} \otimes \alpha) \circ \delta$. In other words, by eqn (1.5.3), we have $\mathrm{id} * \alpha = \varepsilon$. So $\alpha$ is the right inverse of $\mathrm{id}$ for the convolution. This may be written as

$\Big(\sum_u u \otimes u\Big)\Big(\sum_v v \otimes \alpha(v)\Big) = 1 \otimes 1$.

Now, $\alpha$ is an automorphism for the shuffle, hence $\alpha \otimes \mathrm{id}$ is an automorphism of $\mathcal{A}$. Applying $\alpha \otimes \mathrm{id}$ to the last equation, one obtains similarly that $\alpha$ is a left inverse of $\mathrm{id}$, hence its two-sided inverse for the convolution. □

There is a second canonical embedding

$\mathrm{End}(K\langle A\rangle) \to \mathcal{A}, \qquad f \mapsto \sum_{v \in A^*} f(v) \otimes v$.
Furthermore, define $\delta'_p$ and $\mathrm{sh}_p$ by

$\delta'_p(w) = \sum_{w = u_1 \cdots u_p} u_1 \otimes \cdots \otimes u_p, \qquad \mathrm{sh}_p(u_1 \otimes \cdots \otimes u_p) = u_1 \shuffle \cdots \shuffle u_p$.
The mapping $\delta'_p$ is a shuffle homomorphism, and we have

$f_1 *' \cdots *' f_p = \mathrm{sh}_p \circ (f_1 \otimes \cdots \otimes f_p) \circ \delta'_p$   (1.5.8)

for any endomorphisms $f_1, \ldots, f_p$ of $K\langle A\rangle$. It may be worthwhile to note that if $f$, $f^*$ are adjoint endomorphisms, i.e. $(f(u), v) = (u, f^*(v))$ for any words $u$, $v$, then one has in $\mathcal{A}$

$\sum_u u \otimes f(u) = \sum_v f^*(v) \otimes v$.   (1.5.9)

Conversely, this equality implies that $f$ and $f^*$ are adjoint.

The foregoing results have as applications some combinatorial identities on words. By Lemma 1.5, we have the identity $\lambda \circ \bar{\delta}(P) = r(P)$ for any polynomial $P$, where $\lambda$, $\bar{\delta}$ are the linear mappings defined by: $D(w) = |w|\, w$ for any word $w$, $\lambda(P \otimes Q) = D(P)Q$, $\bar{\delta} = (\mathrm{id} \otimes \alpha) \circ \delta$, and $r(a_1 \cdots a_n) = [a_1, \ldots, [a_{n-1}, a_n] \ldots]$ for any letters $a_1, \ldots, a_n$ (Lie bracketing from right to left). We may write $\lambda = \mathrm{conc} \circ (D \otimes \mathrm{id})$, hence the above identity may be rewritten as:
$r = \mathrm{conc} \circ (D \otimes \mathrm{id}) \circ (\mathrm{id} \otimes \alpha) \circ \delta$.

Let $l\colon K\langle A\rangle \to \mathcal{L}(A)$ be the 'Lie bracketing from left to right', i.e. the linear function such that $l(1) = 0$, $l(a) = a$ for any letter $a$, and $l(Pa) = [l(P), a]$ for
any polynomial P. Then, for any polynomials P, Q
$l(Pl(Q)) = [l(P), l(Q)]$.   (1.6.5)
It is enough to check this identity when $P$, $Q$ are words; then, a few lines of computation and an induction on the length of $Q$ give the result. From (1.6.5), one deduces that

$l([l(P), l(Q)]) = [l^2(P), l(Q)] + [l(P), l^2(Q)]$.   (1.6.6)
By Jacobi's identity, each Lie polynomial is of the form $l(P)$ (see Section 0.4.1). Thus (1.6.6) implies that $l|\mathcal{L}(A)$ is a derivation of the Lie algebra, that is,

$l([P, Q]) = [l(P), Q] + [P, l(Q)]$,   (1.6.7)

for any Lie polynomials $P$ and $Q$. Now, (1.6.7) implies easily, by induction on $n$, that for each homogeneous Lie polynomial $P$ of degree $n$, one has $l(P) = nP$. This is Theorem 1.4(v).
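The conclusion $l(P) = nP$ is easy to test directly. A sketch with our dict encoding of $K\langle A\rangle$ and the left-to-right bracketing $l$:

```python
def l_word(w):
    # l(a1...an) = [...[[a1, a2], a3], ..., an]; l(1) = 0, l(a) = a
    if len(w) <= 1:
        return {w: 1} if w else {}
    out = {}
    a = w[-1]
    for v, c in l_word(w[:-1]).items():   # l(Pa) = [l(P), a] = l(P)a - a l(P)
        out[v + a] = out.get(v + a, 0) + c
        out[a + v] = out.get(a + v, 0) - c
    return {x: c for x, c in out.items() if c}

def l_poly(P):
    # linear extension of l to polynomials
    out = {}
    for w, coeff in P.items():
        for v, c in l_word(w).items():
            out[v] = out.get(v, 0) + coeff * c
    return {x: c for x, c in out.items() if c}

# P = [a, [a, b]] = aab - 2aba + baa is a homogeneous Lie polynomial, n = 3
P = {"aab": 1, "aba": -2, "baa": 1}
assert l_poly(P) == {w: 3 * c for w, c in P.items()}   # l(P) = nP
```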
1.6.7
Kernel of the left to right bracketing
The kernel of the linear mapping $l\colon K\langle A\rangle \to \mathcal{L}(A)$ is the right ideal of $K\langle A\rangle$ (concatenation algebra) generated by the polynomials

$P\, l(P), \qquad P \in K\langle A\rangle$.   (1.6.8)
This was proved by Cohn (1951). We outline a proof which works only in characteristic 0. Observe that $\operatorname{Ker} l$ is a right ideal: this is because $l(Pa) = [l(P), a]$ for any letter $a$. Now, Baker's identity (1.6.5) shows that each polynomial of the form (1.6.8) is in $\operatorname{Ker} l$. For the converse, we use the identity

$nP = \sum_{u,v \in A^*} (P, u \shuffle v)\, u\, l(v)$   (1.6.9)

for any homogeneous polynomial $P$ of degree $n$ (proof below).
Identity (1.6.9) may be rewritten

$nP = l(P) + \sum_{u \ne 1,\, v \ne 1} (P, u \shuffle v)\, u\, l(v)$,

since the terms with $u = 1$ contribute $\sum_v (P, v)\, l(v) = l(P)$, and those with $v = 1$ vanish because $l(1) = 0$. The second summation is a linear combination of the polynomials $u\, l(v) + v\, l(u)$, because the shuffle product is commutative. The latter polynomial is equal to $(u + v)\, l(u + v) - u\, l(u) - v\, l(v)$. Thus, if $P$ is in $\operatorname{Ker} l$, it is a linear combination of polynomials of the form
(1.6.8).

To prove identity (1.6.9), we use the identity $r * \mathrm{id} = D$ of Theorem 1.12. By symmetry, we have $\mathrm{id} * l = D$, which by (1.5.5) may be rewritten in the algebra $\mathcal{A}$ as

$\Big(\sum_u u \otimes u\Big)\Big(\sum_v v \otimes l(v)\Big) = \sum_w w \otimes D(w)$.
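Identity (1.6.9) itself can be verified on small examples. The sketch below (our own helper names; strings as words, dicts as polynomials) sums $(P, u \shuffle v)\, u\, l(v)$ over all $u$, $v$ of total length $n$:

```python
from itertools import product

def shuffle(u, v):
    # (ua) sh (vb) = (u sh vb)a + (ua sh v)b
    if not u or not v:
        return {u + v: 1}
    out = {}
    for w, c in shuffle(u[:-1], v).items():
        out[w + u[-1]] = out.get(w + u[-1], 0) + c
    for w, c in shuffle(u, v[:-1]).items():
        out[w + v[-1]] = out.get(w + v[-1], 0) + c
    return out

def l_word(w):
    # left-to-right bracketing: l(Pa) = [l(P), a], l(1) = 0
    if len(w) <= 1:
        return {w: 1} if w else {}
    out = {}
    for v, c in l_word(w[:-1]).items():
        out[v + w[-1]] = out.get(v + w[-1], 0) + c
        out[w[-1] + v] = out.get(w[-1] + v, 0) - c
    return {x: c for x, c in out.items() if c}

def rhs(P, n, alphabet="ab"):
    # sum over u, v with |u| + |v| = n of (P, u sh v) * u * l(v)
    total = {}
    for k in range(n + 1):
        for u in map("".join, product(alphabet, repeat=k)):
            for v in map("".join, product(alphabet, repeat=n - k)):
                c = sum(P.get(w, 0) * m for w, m in shuffle(u, v).items())
                if c:
                    for x, d in l_word(v).items():
                        total[u + x] = total.get(u + x, 0) + c * d
    return {w: c for w, c in total.items() if c}

P2 = {"ab": 1, "ba": -1}               # [a, b], homogeneous of degree 2
assert rhs(P2, 2) == {w: 2 * c for w, c in P2.items()}
P3 = {"aab": 1, "aba": -2, "baa": 1}   # [a, [a, b]], degree 3
assert rhs(P3, 3) == {w: 3 * c for w, c in P3.items()}
```

Restricting $u$, $v$ to the letters of the test polynomials is harmless, since $(P, u \shuffle v) = 0$ whenever $u$ or $v$ uses other letters.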
by (2.3.4) = C‘1[(f(a), 1) + l; 9009(1)" lf(a))]
= I; [0‘ 19(1)) 9(1)" 1f(0)) + (9(1)), 1)C‘ 19(17— 1f(a))]
by (2.3.4) = X C" 19(b)g(b' 7(a)), beB
because $g$ is proper. Equality of the extreme members means that the $(c, a)$-entry of the matrix $J(g \circ f)$ is equal to the $(c, a)$-entry in the product of $J(g)$ by $J(f)^g$. This proves (2.3.2). □
Let $V$ denote the vector space $\sum_{a \in A} Ka$. A Lie algebra automorphism $\varphi\colon \mathcal{L}(A) \to \mathcal{L}(A)$ is called elementary if either $\varphi|V$ is a linear automorphism of $V$, or if for some letter $a$, $\varphi(a) = a + P$, where $P$ is in $\mathcal{L}(A \setminus a)$, and $\varphi(b) = b$ for any letter $b \ne a$.

Note that, in the second case, $\varphi^{-1}$ is defined by $\varphi^{-1}(a) = a - P$, $\varphi^{-1}(b) = b$ for $b \ne a$; hence, if $\varphi$ is an elementary automorphism, so is $\varphi^{-1}$. We call Jacobian matrix of a Lie algebra endomorphism of $\mathcal{L}(A)$ the Jacobian matrix of its unique extension to an algebra endomorphism of $K\langle A\rangle$.

Theorem 2.7
Let $\varphi\colon \mathcal{L}(A) \to \mathcal{L}(A)$ be a Lie algebra endomorphism. The following conditions are equivalent:

(i) $\varphi$ is an automorphism;
(ii) $\varphi$ is surjective;
(iii) the Jacobian matrix of $\varphi$ is right invertible in $K\langle A\rangle^{A \times A}$;
(iv) $\varphi$ is a product of elementary automorphisms.

The equivalence of (i) and (iv) is due to Cohn (1964).

Proof We tacitly use the following fact: if a square matrix over $K\langle A\rangle$ is right or left invertible, then no column or row in this matrix is 0; this may be seen by taking the image of this matrix in $K[A]$.
(i) ⇒ (ii) is evident.

(ii) ⇒ (iii): since $\varphi$ is surjective, it has a right inverse $\psi$. Indeed, define $\psi$ for any letter $a$ by $\psi(a) = P$ for some Lie polynomial $P$ such that $\varphi(P) = a$; since $\mathcal{L}(A)$ is the free Lie algebra, $\psi$ extends uniquely to a Lie endomorphism of $\mathcal{L}(A)$. Let $f$, $g$ be the algebra endomorphisms of $K\langle A\rangle$ extending $\varphi$, $\psi$ respectively. Then $\varphi \circ \psi = \mathrm{id}$, hence $f \circ g = \mathrm{id}$. By Proposition 2.6, we obtain $J(f)\, J(g)^f = I_A$, the $A \times A$ identity matrix. Hence $J(f)$ is right invertible.

(iii) ⇒ (iv) by induction on $d(\varphi) = \sum_{a \in A} \deg(\varphi(a))$. Since no $\varphi(a)$ is zero (otherwise the Jacobian matrix $J(\varphi)$ of $\varphi$ is not right invertible) and since $\varphi(a)$ is a Lie element, we have $\deg(\varphi(a)) \ge 1$.

(iii) the linear mapping $K\langle A\rangle \to K$ which sends each word $w$ to $(S, w)$ is a homomorphism from the shuffle algebra $K\langle A\rangle$ into $K$;

(iv) for any series $T$, $\mathrm{Ad}(S)(T) = S\, T\, S^{-1}$.

Proof By Theorem 3.1, $\log(S)$ is a Lie series if and only if

$\delta(\log(S)) = \log(S) \otimes 1 + 1 \otimes \log(S)$.   (3.1.4)
Since $\delta$ is a continuous homomorphism, we have $\delta(S) = \delta(\exp(\log(S))) = \exp(\delta(\log(S)))$. Moreover, since $\log(S) \otimes 1$ and $1 \otimes \log(S)$ commute, we have the usual identity of the exponential function: $\exp(\log(S) \otimes 1 + 1 \otimes \log(S)) = \exp(\log(S) \otimes 1)\exp(1 \otimes \log(S))$. Since $T \mapsto T \otimes 1$ and $T \mapsto 1 \otimes T$ are continuous homomorphisms, this is equal to

$(\exp(\log(S)) \otimes 1)(1 \otimes \exp(\log(S))) = (S \otimes 1)(1 \otimes S) = S \otimes S$.

Finally, taking exponentials of both sides in (3.1.4), we find that $\log(S)$ is a Lie series if and only if $\delta(S) = S \otimes S$. Thus (i) and (ii) are equivalent. Now, by Proposition 1.8, we have

$\delta(S) = \sum_{u,v \in A^*} (S, u \shuffle v)\, u \otimes v$.
Since

$S \otimes S = \sum_{u,v \in A^*} (S, u)(S, v)\, u \otimes v$,

we deduce that (ii) is equivalent to

$\forall u, v \in A^*, \qquad (S, u \shuffle v) = (S, u)(S, v)$.
But this is equivalent to (iii). Let $S = e^U$ and denote by $g$ (respectively $d$) the continuous linear operator on $K\langle\langle A\rangle\rangle$ defined on any word $w$ by $g(w) = Uw$ (respectively $d(w) = wU$). Then $g$ and $d$ commute with each other, and hence

$S w S^{-1} = e^U w\, e^{-U} = e^g e^{-d}(w) = e^{g - d}(w) = e^{\mathrm{ad}(U)}(w)$.

Let $I\colon K\langle A\rangle \to K\langle A\rangle$ be the linear mapping defined by $I(w) = w$ if $w \in A^+$, $I(1) = 0$.

Lemma 3.9
Let $\varphi\colon K\langle A\rangle \to K\langle A\rangle$ be a concatenation homomorphism such that $\varphi(b)$ is a Lie polynomial for any letter $b$ in $B$. Then $\varphi$ commutes with $I^{*k}$ and $\pi_1$, for any $k \ge 0$.

Proof We have $\pi_1 = \log(\mathrm{id}) = \log(1 + I) = \sum_k (-1)^{k-1} I^{*k}/k$, so it is enough to show that $\varphi \circ I^{*k} = I^{*k} \circ \varphi$. Now, $\delta_k \circ \varphi = \varphi^{\otimes k} \circ \delta_k$ because both sides are concatenation homomorphisms and for any letter $b$:

$\delta_k \circ \varphi(b) = \varphi(b) \otimes 1 \otimes \cdots \otimes 1 + 1 \otimes \varphi(b) \otimes \cdots \otimes 1 + \cdots + 1 \otimes 1 \otimes \cdots \otimes \varphi(b)$

(by eqn (1.5.6), because $\varphi(b)$ is a Lie polynomial) $= \varphi^{\otimes k} \circ \delta_k(b)$. Moreover, $I \circ \varphi = \varphi \circ I$ (because $\varphi$ preserves constant terms), hence $I^{\otimes k} \circ \varphi^{\otimes k} = \varphi^{\otimes k} \circ I^{\otimes k}$, and $\mathrm{conc}_k \circ \varphi^{\otimes k} = \varphi \circ \mathrm{conc}_k$, $\varphi$ being a concatenation homomorphism. Finally, by eqn (1.5.7), $I^{*k} \circ \varphi = \mathrm{conc}_k \circ I^{\otimes k} \circ \delta_k \circ \varphi = \varphi \circ I^{*k}$, by putting together the previous equalities. □

Proof of Theorem 3.7
We show that

(i) $\pi_n$ restricted to $U_n$ is the identity;

(ii) any polynomial $P$ is equal to the sum $\sum_{n \ge 0} \pi_n(P)$;
(iii) the image of $\pi_n$ is contained in $U_n$; and

(iv) if $k \ne n$, then $\pi_k(U_n) = 0$.

This will imply the theorem.

(i) We have $\pi_1 = \log(\mathrm{id}) = \log(1 + I)$, hence
$\pi_1 = \sum_{k \ge 1} \frac{(-1)^{k-1}}{k}\, I^{*k}$

in $\mathrm{End}(K\langle A\rangle)$.

$\sigma(r) > \sigma(r + 1) \Rightarrow u(\sigma(r)) \ge u(\sigma(r + 1))$ (because $u$ is an increasing word) $\Rightarrow w(r) \ge w(r + 1)$ (because $w = u\sigma$) $\Rightarrow w(r) > w(r + 1)$ (by the choice of $r$) $\Leftrightarrow r \in D(w)$; similarly, $\sigma(r) \le \sigma(r + 1)$ implies $w(r) \le w(r + 1)$, hence the reverse implications also hold. In particular, since $\sigma c$ and $w$ have no descent in $I_j \setminus \{r\}$, we have $D(\sigma c) = D(w)$.
Next we claim that if $\alpha \in G$, then $D(\sigma c \alpha)$ is the disjoint union $D(w) \cup D(\alpha)$. From the previous observation, we deduce that $D(\sigma c \alpha) \cap \{q_1 + \cdots + q_j,\ 1 \le j \le m - 1\} = D(w)$. Since $\sigma c|I_j$ is increasing, and since $\alpha$ permutes $I_j$, we have for $i$ in $I_j \setminus \{q_1 + \cdots + q_j\}$: $\alpha(i) > \alpha(i + 1) \Leftrightarrow \sigma c \alpha(i) > \sigma c \alpha(i + 1)$. This proves the claim. From the latter claims, we deduce that

$E_{\sigma c \alpha}(x) = E_{\sigma c}(x)\, E_\alpha(x) = x^{d(\sigma c)} E_\alpha(x) = x^{d(w)} E_\alpha(x)$,

hence $E_{\sigma c G}(x) = x^{d(w)} E_G(x)$. Evidently, $E_G(x) = E_{q_1}(x) \cdots E_{q_m}(x)$, so that

$E_w(x) = \sum_c E_{\sigma c G}(x) = x^{d(w)} \sum_c E_G(x) = x^{d(w)}\, \frac{p!}{q_1! \cdots q_m!}\, E_{q_1}(x) \cdots E_{q_m}(x)$.
For the second identity of the lemma, note that $\sigma \in S_p$ implies $r(\sigma) + d(\sigma) = p - 1$; thus by (3.3.2)

$E_w(x, y) = y^{p-1} E_w\!\left(\frac{x}{y}\right) = y^{p-1} \left(\frac{x}{y}\right)^{d(w)} \frac{p!}{q_1! \cdots q_m!} \prod_{1 \le j \le m} E_{q_j}\!\left(\frac{x}{y}\right) = x^{d(w)} y^{r(w)}\, \frac{p!}{q_1! \cdots q_m!} \prod_{1 \le j \le m} E_{q_j}(x, y)$,

because $r(w) = m - 1 - d(w)$ and $p = q_1 + \cdots + q_m$. □
The right action of the symmetric group $S_p$ on the words of length $p$ extends linearly to a right action of the group algebra $\mathbb{Q}[S_p]$ on the linear span of the words of length $p$. It is convenient to extend it to all of $\mathbb{Q}\langle A\rangle$ by the formula: $w\sigma = 0$ if $\sigma \in S_p$ and $w$ is a word of length $\ne p$. If $S$ is a subset of $\{1, \ldots, p - 1\}$, we denote by $D_{\subseteq S}$ the sum, in $\mathbb{Q}[S_p]$, of the permutations whose descent set is contained in $S$. In the next lemma, we identify each permutation in $S_p$ with the corresponding word on $\{1, \ldots, p\}$. Recall that the convolution product $*$ has been defined in Section 1.5.

Lemma 3.13 Let $p_1, \ldots, p_k$ ($k \ge 1$) be positive integers of sum $p$ and $S = \{p_1, p_1 + p_2, \ldots, p_1 + \cdots + p_{k-1}\}$ the corresponding subset of $\{1, \ldots, p - 1\}$. Factorize the word $1\,2 \ldots p$ as $u_1 \cdots u_k$ with $|u_i| = p_i$, for each $i$.

(i) One has $D_{\subseteq S} = \theta(u_1 \shuffle \cdots \shuffle u_k)$, where $\theta$ is the linear involution of $\mathbb{Q}[S_p]$ sending each permutation onto its inverse.

(ii) Let $q_n$ denote the linear endomorphism of $\mathbb{Q}\langle A\rangle$ such that $q_n(\sum_i P_i) = P_n$, for each polynomial $P = \sum_i P_i$ written as the sum of its homogeneous components. Then, for each polynomial $P$ of degree $p$, one has

$P D_{\subseteq S} = (q_{p_1} * \cdots * q_{p_k})(P)$.

The right action of $\mathbb{Q}[S_p]$ defined previously leaves the canonical scalar product on $\mathbb{Q}\langle A\rangle$ invariant (this scalar product has $A^*$ as orthonormal basis; see Section 1.1). This implies that for any words $u$, $v$ of length $p$ and any permutation $\sigma$ in $S_p$, one has $(u, v\sigma) = (u\sigma^{-1}, v)$. Thus, if $x$ is in $\mathbb{Q}[S_p]$, we obtain

$(u, vx) = (u\,\theta(x), v)$,   (3.3.5)

which means that the adjoint of the linear endomorphism of $\mathbb{Q}\langle A\rangle\colon v \mapsto vx$ is $u \mapsto u\,\theta(x)$.
Proof (i) A permutation σ appears in the sum D_{⊆S} if and only if its descent set is contained in S, that is,

σ(i) > σ(i + 1) ⟹ i ∈ S,

for any i in {1, ..., p − 1}. Moreover, a permutation α appears in u_1 ⧢ ⋯ ⧢ u_k if and only if for any i in {1, ..., p − 1}\S, the digit i + 1 appears at the right of i in the word α(1) ... α(p), that is,

i ∉ S ⟹ α^{−1}(i) < α^{−1}(i + 1),

or equivalently

α^{−1}(i) > α^{−1}(i + 1) ⟹ i ∈ S.

This implies the lemma, since the sums D_{⊆S} and u_1 ⧢ ⋯ ⧢ u_k are multiplicity-free.

(ii) The adjoint of the linear mapping P ↦ P D_{⊆S} is by (i) and (3.3.5) the linear mapping P ↦ P(u_1 ⧢ ⋯ ⧢ u_k). The adjoint of the mapping q_{p_1} * ⋯ * q_{p_k} is the mapping q_{p_1} *' ⋯ *' q_{p_k}, where *' is the convolution product defined in Section 1.5; indeed, the adjoints of conc and δ are respectively δ' and sh (Proposition 1.9), so that the adjoint of f * g = conc ∘ (f ⊗ g) ∘ δ is sh ∘ (f* ⊗ g*) ∘ δ', where f*, g* denote the adjoints of f, g; hence, the above assertion is implied by the fact that q_n is self-adjoint. Thus, it is enough to show that for any word w, one has

w(u_1 ⧢ ⋯ ⧢ u_k) = (q_{p_1} *' ⋯ *' q_{p_k})(w).   (3.3.6)

Suppose that |w| = p. Let w = v_1 ... v_k be the factorization such that |v_i| = p_i. Then, by definition of the right action of S_p, one has w(u_1 ⧢ ⋯ ⧢ u_k) = v_1 ⧢ ⋯ ⧢ v_k. On the other hand, by eqn (1.5.8), we have

(q_{p_1} *' ⋯ *' q_{p_k})(w) = sh_k ∘ (q_{p_1} ⊗ ⋯ ⊗ q_{p_k}) ∘ δ'_k(w) = sh_k ∘ (q_{p_1} ⊗ ⋯ ⊗ q_{p_k}) ( Σ_{w = w_1 ... w_k} w_1 ⊗ ⋯ ⊗ w_k ) = sh_k(v_1 ⊗ ⋯ ⊗ v_k) = v_1 ⧢ ⋯ ⧢ v_k,

since q_{p_i}(w_i) = 0 unless |w_i| = p_i; this proves (3.3.6). ∎
where we have extended the linear function s: Q[x] → Q coefficient-wise to s: Q[x]((t_1, ..., t_m)) → Q((t_1, ..., t_m)). This is by (3.3.8)

s( x^d (1 + x)^r ∏_{1≤j≤m} Σ_{q_j ≥ 1} E_{q_j}(x, x + 1) t_j^{q_j}/q_j! ) = s( x^d (1 + x)^r ∏_{1≤j≤m} (e^{t_j} − 1)/(1 − x(e^{t_j} − 1)) ).

Now, we have, for some elements N_i in the field of fractions of Q((t_1, ..., t_m)):

x^d (1 + x)^r ∏_{1≤j≤m} 1/(1 − x(e^{t_j} − 1)) = Σ_{1≤i≤m} N_i/(1 − x(e^{t_i} − 1)),

because the left-hand side is a rational fraction in x with degree of numerator equal to d + r = m − 1, which is smaller than the degree of the denominator, equal to m, and because the latter is a polynomial in x with simple roots.
To compute N_i, multiply both sides by 1 − x(e^{t_i} − 1) and put x = 1/(e^{t_i} − 1), hence 1 + x = e^{t_i}/(e^{t_i} − 1):

N_i = (e^{t_i} − 1)^{−d} (e^{t_i}/(e^{t_i} − 1))^r ∏_{j≠i} (e^{t_i} − 1)/(e^{t_i} − e^{t_j}) = e^{r t_i} / ∏_{j≠i} (e^{t_i} − e^{t_j}),

because d + r = m − 1. Now, observe that for T in Q[x]((t_1, ..., t_m)), s(T) = −h(T)|_{x=−1}, where h is the Q((t_1, ..., t_m))-linear operator of Q[x]((t_1, ..., t_m)) which maps x^k onto x^{k+1}/(k + 1). Hence, h is an integration operator.
... h_i'' ≥ h_{i+1}, ..., h_n (by (4.1.1)); finally, for j = 1, ..., i − 1, either h_j is a letter or h_j'' ≥ h_i'' (by the choice of i) > h_i', by (4.1.2). This shows that t' is standard. Furthermore, (h_i', h_i'') is a legal rise of t', because h_i' < h_i'' by (4.1.2) and h_i'' ≥ h_{i+1}, ..., h_n by (4.1.5), s being standard. Hence, t' → s. By induction, we find a sequence of letters t such that t →* t'. Hence, t →* s.

(iii) If s is not decreasing, then s has at least one rise. Let i be the right-most rise. Then ... s →* u, t →* u for some sequence u. Since s, t have no rises, this is only possible if s = t. ∎
We call Hall word the foliage of a Hall tree.

Corollary 4.5 Each Hall word is the foliage of a unique Hall tree.

Thus, we may identify Hall trees and words. We call Hall set in A* the image under f of a Hall set in M(A), with the corresponding total order.

Example 4.6 The Hall words corresponding to the trees of Example 4.2 are

a^2ba, a^2ba^2, a^2bab, a^2b, a, a^3b^2, a^2b^3, abab^2, ab, ab^3, ab^4, a^2b^2, ab^2, b.

Corollary 4.7 Each word has a unique decreasing factorization into Hall words.

If h is a Hall word, let t be the unique Hall tree such that h = f(t). If h is not a letter, then t = (t', t''); let h' = f(t'), h'' = f(t''); then h = h'h'', and we call this factorization of h its standard factorization. For later reference, we note several inequalities on Hall words, which are immediate consequences of the definitions and of (4.1.1), (4.1.2), and (4.1.3): if h is a Hall word with standard factorization h = h'h'', then

h' < h'',   (4.1.9)

and

h < h''.   (4.1.10)

Now, let k be another Hall word with h < k. Then

hk is a Hall word with standard factorization hk if and only if either h ∈ A, or h'' ≥ k.   (4.1.11)
4 Hall bases

4.2 HALL AND POINCARÉ–BIRKHOFF–WITT BASES

Let K be a commutative ring with unit and consider a fixed Hall set. We define for each Hall word h a Lie polynomial P_h in ℒ_K(A): if a is a letter, then P_a = a; if h is a Hall word of length ≥ 2 with standard factorization h = h'h'', define P_h = [P_{h'}, P_{h''}]. It is clear by induction that each P_h is a homogeneous Lie polynomial of degree equal to the length of h; furthermore, P_h has the same partial degree with respect to each letter as h.

Example 4.8 The Lie polynomials of the Hall set of Examples 4.2 and 4.6 are
P_{a^2ba} = [[a, [a, b]], a] = 3a^2ba - 3aba^2 + ba^3 - a^3b,
P_{a^2ba^2} = [[[a, [a, b]], a], a] = 6a^2ba^2 - 4aba^3 + ba^4 - 4a^3ba + a^4b,
P_{a^2bab} = [[a, [a, b]], [a, b]] = a^2bab - 3aba^2b + 2ba^3b - a^2b^2a + 4ababa - 3ba^2ba - ab^2a^2 + baba^2,
P_{a^2b} = [a, [a, b]] = a^2b - 2aba + ba^2,
P_a = a,
P_{a^3b^2} = [a, [a, [[a, b], b]]] = a^3b^2 - 2a^2bab + 4ababa - ab^2a^2 - a^2b^2a - 2baba^2 + b^2a^3,
P_{a^2b^3} = [a, [[[a, b], b], b]] = a^2b^3 - 3abab^2 + 3ab^2ab - 2ab^3a + 3bab^2a - 3b^2aba + b^3a^2,
P_{abab^2} = [[a, b], [[a, b], b]] = abab^2 - 3ab^2ab + 2ab^3a - ba^2b^2 + 4babab - 3bab^2a - b^2a^2b + b^2aba,
P_{ab} = [a, b] = ab - ba,
P_{ab^3} = [[[a, b], b], b] = ab^3 - 3bab^2 + 3b^2ab - b^3a,
P_{ab^4} = [[[[a, b], b], b], b] = ab^4 - 4bab^3 + 6b^2ab^2 - 4b^3ab + b^4a,
P_{a^2b^2} = [a, [[a, b], b]] = a^2b^2 - 2abab + 2baba - b^2a^2,
P_{ab^2} = [[a, b], b] = ab^2 - 2bab + b^2a,
P_b = b.

The previous example shows a fact which is clear from the definition: in order to compute P_h, one has simply to interpret in the tree t corresponding to h each node as a Lie bracketing. We call the polynomials P_h the Hall
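These expansions are mechanical and easy to reproduce. The sketch below (Python; illustrative, not from the book) represents an element of K⟨A⟩ as a dictionary mapping each word to its coefficient, and expands P_t for a tree t given as a nested pair of letters.

```python
def conc(P, Q):
    """Concatenation product of two noncommutative polynomials."""
    R = {}
    for u, a in P.items():
        for v, b in Q.items():
            R[u + v] = R.get(u + v, 0) + a * b
    return {w: c for w, c in R.items() if c}

def add(P, Q, sign=1):
    """P + sign * Q, dropping zero coefficients."""
    R = dict(P)
    for w, c in Q.items():
        R[w] = R.get(w, 0) + sign * c
    return {w: c for w, c in R.items() if c}

def bracket(P, Q):
    """Lie bracket [P, Q] = PQ - QP."""
    return add(conc(P, Q), conc(Q, P), -1)

def poly(t):
    """P_t for a tree t: a letter (string) or a pair (t', t'')."""
    if isinstance(t, str):
        return {t: 1}
    return bracket(poly(t[0]), poly(t[1]))

# P_{ab^2} = [[a, b], b] = ab^2 - 2bab + b^2a
assert poly((('a', 'b'), 'b')) == {'abb': 1, 'bab': -2, 'bba': 1}
# P_{a^2b} = [a, [a, b]] = a^2b - 2aba + ba^2
assert poly(('a', ('a', 'b'))) == {'aab': 1, 'aba': -2, 'baa': 1}
```

The binomial coefficients of P_{ab^4} in the example above come out of the same recursion, since each right bracketing with b acts like a finite difference.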
polynomials. These polynomials form a basis of the free Lie algebra, as the next result indicates. Recall that the free associative algebra K⟨A⟩ is a free K-module having the canonical basis A*.

Theorem 4.9 (i) The Hall polynomials form a basis of the free Lie algebra (viewed as a K-module).
(ii) The decreasing products of Hall polynomials

P_{h_1} ⋯ P_{h_n},  h_i ∈ H, h_1 ≥ ⋯ ≥ h_n,   (4.2.1)

form a basis of the free associative algebra (viewed as a K-module).

The second part of the theorem is the Poincaré–Birkhoff–Witt theorem applied to the basis (P_h)_{h∈H} of ℒ(A). We shall not use the latter theorem here and actually prove (ii) first and then deduce (i). The proof is constructive and allows us (i) to express each Lie polynomial in the basis of Hall polynomials, without computing these, and (ii) to express each polynomial in the basis of decreasing products of Hall polynomials, again without computing these products.

Consider again a standard sequence of Hall trees (or words)

s = (h_1, ..., h_n),   (4.2.2)

with a legal rise i. We define

λ_i(s) = (h_1, ..., h_{i−1}, h_i h_{i+1}, h_{i+2}, ..., h_n),   (4.2.3)

which is a standard sequence (see (4.1.7)); we also define

ρ_i(s) = (h_1, ..., h_{i−1}, h_{i+1}, h_i, h_{i+2}, ..., h_n),   (4.2.4)

obtained by interchanging h_i and h_{i+1} in s. Observe that ρ_i(s) is a standard sequence: indeed, either h_{i+1} is a letter, or h_{i+1}'' > h_{i+1} (by (4.1.1)) > h_i, because i is a rise. Observe also that ρ_i(s) has one inversion less than s.

Let s be a standard sequence. Define a derivation tree T(s) of s to be a labelled rooted tree with the following properties: if s is decreasing, then T(s) is reduced to its root, labelled s; if not, T(s) is the tree with root labelled s, with left and right immediate subtrees T(s') and T(s''), where s' = λ_i(s), s'' = ρ_i(s) for some legal rise i of s (e.g. the right-most rise, cf. proof of Theorem 4.3(i-ii)) and where T(s'), T(s'') are derivation trees of s', s'' respectively. Observe that T(s) always exists, and is finite.

Example 4.10 With the Hall set of Example 4.2, a derivation tree associated with s = (a, b, b, b) is shown in Fig. 4.4.
Observe that in a derivation tree, the leaves are labelled by decreasing sequences. For s as in (4.2.2), define P(s) = P_{h_1} ⋯ P_{h_n}.

Lemma 4.11 For each standard sequence s, P(s) is the sum of all P(t), for t a leaf in a fixed derivation tree of s.

Example 4.12 From Example 4.10 and Lemma 4.11 we deduce

abbb = P_{abbb} + 3 P_b P_{abb} + 3 P_b^2 P_{ab} + P_b^3 P_a.
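The identity of Example 4.12 can be verified by direct expansion in K⟨A⟩. A minimal check (Python; illustrative, with polynomials again stored as word-to-coefficient dictionaries):

```python
def conc(P, Q):
    """Concatenation product of two noncommutative polynomials."""
    R = {}
    for u, a in P.items():
        for v, b in Q.items():
            R[u + v] = R.get(u + v, 0) + a * b
    return R

def lin(*terms):
    """Linear combination of (coefficient, polynomial) pairs."""
    R = {}
    for c, P in terms:
        for w, a in P.items():
            R[w] = R.get(w, 0) + c * a
    return {w: a for w, a in R.items() if a}

a, b = {'a': 1}, {'b': 1}
ab   = lin((1, conc(a, b)), (-1, conc(b, a)))        # P_ab
abb  = lin((1, conc(ab, b)), (-1, conc(b, ab)))      # P_{ab^2}
abbb = lin((1, conc(abb, b)), (-1, conc(b, abb)))    # P_{ab^3}

# abbb = P_{ab^3} + 3 P_b P_{ab^2} + 3 P_b^2 P_{ab} + P_b^3 P_a
rhs = lin((1, abbb),
          (3, conc(b, abb)),
          (3, conc(conc(b, b), ab)),
          (1, conc(conc(conc(b, b), b), a)))
assert rhs == {'abbb': 1}
```

All the lower terms cancel, leaving exactly the word abbb, as Lemma 4.11 predicts.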
Fig. 4.4

Proof The lemma is a consequence of the definitions (4.2.3) and (4.2.4) of λ_i(s) and ρ_i(s), of that of T(s) and P(s), and of the identity in K⟨A⟩:

P_{h_i} P_{h_{i+1}} = [P_{h_i}, P_{h_{i+1}}] + P_{h_{i+1}} P_{h_i} = P_{h_i h_{i+1}} + P_{h_{i+1}} P_{h_i}. ∎
Proof of Theorem 4.9 If w = a_1 ... a_n (a_i ∈ A), then s = (a_1, ..., a_n) is a standard sequence, hence w = P(s) is, by Lemma 4.11, a sum of polynomials of the form (4.2.1).

We have now to show that the polynomials (4.2.1) are linearly independent. For this, we may assume that the alphabet is finite. Note that, in this case, the K-module of homogeneous polynomials of degree d admits the basis A^d. Since P_h is a homogeneous polynomial of degree |h|, Corollary 4.7 gives a canonical bijection between the polynomials (4.2.1) which are of degree d and the words of length d. Hence, these polynomials form a basis: indeed, if M is a free K-module of rank n, then each family of n elements which generates M is a basis of M.

A particular case of (ii) is that the Hall polynomials are linearly independent. Hence, it remains to show that ℒ(A) is linearly generated by the Hall polynomials. We may suppose that the alphabet A is finite. Since each letter is a Hall polynomial, it suffices to prove the following claim (here and in the sequel, h = h'h'' denotes the standard factorization of a Hall word h).

For any two Hall polynomials P_h, P_k, their Lie bracket [P_h, P_k] is a linear combination over Z of Hall polynomials P_l with |l| = |h| + |k| and l'' ≤ sup(h, k).

This will be shown by induction on the couple (|h| + |k|, sup(h, k)), where these couples are lexicographically ordered: (d, k) < (d_1, k_1) if either d < d_1 or d = d_1 and k < k_1. As there are only a finite number of Hall words of bounded length, the induction is correct. We may suppose that h < k, because [P_h, P_k] = -[P_k, P_h], and [P_h, P_h] = 0. If h is a letter, or if h = h'h'' with h'' ≥ k, then by (4.1.11), hk is a Hall word and hk is its standard factorization. Thus [P_h, P_k] = P_{hk} and we have (hk)'' = k ≤ sup(h, k), because h < k. So we
may assume that h = h'h'' and h'' < k. By (4.1.10), we have also h < h'', hence

h < h'' < k.   (4.2.5)

Using the Jacobi identity

[[P, Q], R] = [[P, R], Q] + [P, [Q, R]],

we obtain

[P_h, P_k] = [[P_{h'}, P_{h''}], P_k] = [[P_{h'}, P_k], P_{h''}] + [P_{h'}, [P_{h''}, P_k]].

Since |h'| + |k| and |h''| + |k| are both strictly less than |h| + |k|, the induction hypothesis implies that

[P_{h'}, P_k] = Σ_i α_i P_{u_i},   [P_{h''}, P_k] = Σ_j β_j P_{v_j},

for some integers α_i, β_j and Hall words u_i, v_j such that

|u_i| = |h'| + |k|,  |v_j| = |h''| + |k|,  u_i'' ≤ sup(h', k) and v_j'' ≤ sup(h'', k).

Note that by (4.1.9) h' < h'', hence by (4.2.5), the two previous inequalities imply

u_i'' ≤ k,  v_j'' ≤ k.   (4.2.6)

Thus we obtain

[P_h, P_k] = Σ_i α_i [P_{u_i}, P_{h''}] + Σ_j β_j [P_{h'}, P_{v_j}].   (4.2.7)

We have |u_i| + |h''| = |h'| + |k| + |h''| = |h| + |k|, and sup(u_i, h'') < k = sup(h, k) by (4.1.10), (4.2.6) and (4.2.5). Hence [P_{u_i}, P_{h''}] is by induction a Z-linear combination of Hall polynomials P_l with l'' ≤ sup(u_i, h'') < sup(h, k), and |l| = |u_i| + |h''| = |h| + |k|. Similarly, we have |h'| + |v_j| = |h'| + |h''| + |k| = |h| + |k| and sup(h', v_j) < k = sup(h, k) because of (4.1.9), (4.2.5), (4.1.10) and (4.2.6). By induction we deduce that [P_{h'}, P_{v_j}] is a Z-linear combination of Hall polynomials P_l with |l| = |h'| + |v_j| = |h| + |k| and l'' ≤ sup(h', v_j) < sup(h, k). With (4.2.7), the previous discussion proves the claim. ∎

It is worthwhile to write down explicitly the algorithm underlying the proof of Theorem 4.9(i). For this, let us denote by P_t the Lie polynomial obtained by interpreting in the tree t each node as a Lie bracketing; formally, P_a = a if a ∈ A, and P_t = [P_{t'}, P_{t''}] if t = (t', t''). The algorithm takes as input a linear combination of trees Σ α_t t and gives as output a linear combination
of Hall trees Σ β_h h such that Σ α_t P_t = Σ β_h P_h.

Fig. 4.5

Consider the linear combination Σ α_t t. If all the trees involved are Hall trees, then we are done. Hence, take some tree t which is not a Hall tree, and consider a subtree s = (s', s'') of t which is not a Hall tree and such that s', s'' are Hall trees (each leaf is in A, so is a Hall tree, hence s exists).

If s' > s'', then replace s by (s'', s') in t and α_t by −α_t.   (4.2.8)

If s' = s'', then remove t from the linear combination.   (4.2.9)

So, we may assume that s' < s''. If s' is in A, or s' = (x, y) with y ≥ s'', then s is a Hall tree, which was excluded. Thus y < s'', and

replace t by the sum of the two trees t_1 and t_2 obtained by replacing s = ((x, y), s'') respectively by ((x, s''), y) and (x, (y, s'')).   (4.2.10)

Then go back to the beginning. The fact that this algorithm stops and gives the desired output is implicit in the proof of Theorem 4.9(i). We illustrate the algorithm by an example.

Example 4.13 We take the Hall set described in Example 4.2. The input linear combination is the tree of Fig. 4.5. After step (4.2.8) applied to the subtree whose root is circled, we obtain the output given in Fig. 4.6 (the coefficient is indicated at the root). After step (4.2.8) again, we obtain the output given in Fig. 4.7. Now, step (4.2.10) gives the linear combination of Fig. 4.8. Finally, step (4.2.8) again gives the final linear combination of Hall trees (these are not indicated in Example 4.2, but are Hall trees by definition of the latter); see Fig. 4.9. As another example, take as input the tree in Fig. 4.10. After step (4.2.10), we obtain the linear combination in Fig. 4.11. Step (4.2.9) removes the first tree, and the output is the remaining one, which is a Hall tree.
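The Hall set of Example 4.2 is not reproduced in this chunk, but the algorithm is easy to implement once a Hall set is fixed. The sketch below (Python; illustrative, not from the book) uses the Lyndon Hall set of Section 5.1, comparing trees through their foliages in alphabetical order, and normalizes brackets of Hall trees by steps (4.2.8), (4.2.9), and (4.2.10).

```python
def foliage(t):
    """The word obtained by reading the leaves of t."""
    return t if isinstance(t, str) else foliage(t[0]) + foliage(t[1])

def merge(P, Q):
    R = dict(P)
    for t, c in Q.items():
        R[t] = R.get(t, 0) + c
    return {t: c for t, c in R.items() if c}

def pair(l, r):
    """[P_l, P_r] for two Hall trees l, r, as a combination of Hall trees."""
    fl, fr = foliage(l), foliage(r)
    if fl == fr:                            # step (4.2.9): bracket vanishes
        return {}
    if fl > fr:                             # step (4.2.8): swap, change sign
        return {t: -c for t, c in pair(r, l).items()}
    if isinstance(l, str) or foliage(l[1]) >= fr:
        return {(l, r): 1}                  # (l, r) is a Hall tree, by (4.1.11)
    x, y = l                                # step (4.2.10): Jacobi rewriting
    return merge(brk(pair(x, r), {y: 1}), brk({x: 1}, pair(y, r)))

def brk(P, Q):
    """Bracket of two linear combinations of Hall trees."""
    R = {}
    for l, c1 in P.items():
        for r, c2 in Q.items():
            R = merge(R, {t: c1 * c2 * c for t, c in pair(l, r).items()})
    return R

def hallify(t):
    """Express P_t in the basis of Hall polynomials (Lyndon Hall set)."""
    if isinstance(t, str):
        return {t: 1}
    return brk(hallify(t[0]), hallify(t[1]))

# [b, [a, b]] = -[[a, b], b] = -P_{ab^2}
assert hallify(('b', ('a', 'b'))) == {(('a', 'b'), 'b'): -1}
# Jacobi rewriting: [[a, b], c] = [[a, c], b] + [a, [b, c]]
assert hallify((('a', 'b'), 'c')) == {(('a', 'c'), 'b'): 1, ('a', ('b', 'c')): 1}
```

Termination of the mutual recursion is exactly the induction of the proof of Theorem 4.9(i), on (|h| + |k|, sup(h, k)).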
Fig. 4.6

Fig. 4.7

Fig. 4.8
A closer look at the proof of Theorem 4.9 shows that the Lie algebra of Lie polynomials is the free Lie algebra, independently of the Poincaré–Birkhoff–Witt theorem and of the proofs in Chapter 0. Indeed, it shows that each relation between the elements P_t (t in the free magma on A) may be deduced from the defining relations of Lie algebras: distributivity of the bracket, antisymmetry, and Jacobi identity.
Fig. 4.9

Fig. 4.10

Fig. 4.11
Corollary 4.14 Let A be finite with q elements. The number of Hall words of length n, and the dimension of the space of homogeneous Lie polynomials of degree n, are equal to

(1/n) Σ_{d|n} μ(d) q^{n/d},

where μ is the Möbius function.
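Corollary 4.14 is easy to check numerically. The sketch below (Python; illustrative, not from the book) compares the formula with a brute-force count of binary Lyndon words, which by Theorem 5.1 also form a Hall set, so their number in each length must match the formula.

```python
from itertools import product

def mobius(n):
    """Möbius function mu(n), by trial factorization."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:       # square factor: mu(n) = 0
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result

def witt(q, n):
    """(1/n) sum_{d|n} mu(d) q^(n/d)."""
    return sum(mobius(d) * q ** (n // d)
               for d in range(1, n + 1) if n % d == 0) // n

def is_lyndon(w):
    """w is smaller than all its nontrivial proper right factors."""
    return all(w < w[i:] for i in range(1, len(w)))

for n in range(1, 9):
    count = sum(1 for w in product('ab', repeat=n) if is_lyndon(''.join(w)))
    assert count == witt(2, n)
# witt(2, n) for n = 1..8: 2, 1, 2, 3, 6, 9, 18, 30
```

For q = 2 this reproduces the familiar dimensions of the homogeneous components of the free Lie algebra on two generators.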
Proof Call α_n the number of Hall words of length n. By Corollary 4.7, we have the following identity of generating series:

1/(1 − qt) = ∏_{k≥1} 1/(1 − t^k)^{α_k}.   (4.2.11)

Indeed, Corollary 4.7 implies that one has in Z⟨⟨A⟩⟩

1/(1 − Σ_{a∈A} a) = Σ_{w∈A*} w = ∏_h 1/(1 − h),

where the product is taken over all Hall words h in decreasing order; eqn (4.2.11) follows by applying the homomorphism Z⟨⟨A⟩⟩ → Z[[t]] which sends each letter onto t. Take the logarithmic derivative of (4.2.11) and multiply by t:

qt/(1 − qt) = Σ_{k≥1} k α_k t^k/(1 − t^k).

Hence,

Σ_{n≥1} q^n t^n = Σ_{k,l≥1} k α_k t^{kl} = Σ_{n≥1} t^n Σ_{k|n} k α_k,

so that q^n = Σ_{k|n} k α_k, and the formula follows by Möbius inversion. ∎

... x ∈ T_{i+1} for some i ≤ k, x = (t_i^p l) for some p ≥ 1 and l ∈ T_i\t_i.   (4.3.3)
Suppose now that, for k = 0, ..., n, one has t_k ∈ T_k. Then

t_k ∉ T_{k+1}, ..., T_{n+1}.   (4.3.4)

This is actually a consequence of Corollary 0.8. A different proof is the following: observe that T_{k+1} ⊆ M(T_k), the submagma generated by T_k. Hence, T_{k+1} ⊆ M(T_1) for k ≥ 0. Now, each tree in T_1 either has t_0 as a proper subtree, or is in A\t_0, so a fortiori in M(A\t_0); thus, the same holds for each tree in M(T_1), which implies t_0 ∉ M(T_1), hence t_0 ∉ T_{k+1}, for any k ≥ 0. This proves (4.3.4) for k = 0. For k ≥ 1, one proceeds inductively, by noticing first that M(T_1) is canonically isomorphic to the free magma generated by T_1, and similarly for each M(T_i).

Suppose now that L is a Lazard set and E a closed subset of M(A) such that eqns (0.3.3), (0.3.4), and (0.3.5) hold. Then we have

T_k ∩ E ⊆ L  for k = 0, ..., n.   (4.3.5)
Indeed, if t ∈ T_k ∩ E and if t ∉ L, then t ≠ t_0, ..., t_n by (0.3.4), so that t ∈ T_{n+1} by (4.3.2). But this contradicts (0.3.5).

Proof of Theorem 4.18 (i) Let E be a finite, nonempty and closed subset of M(A). We show by induction on |E| that for each Hall set H, H satisfies conditions (0.3.3)–(0.3.5) defining a Lazard set (with H in place of L). Denote by A' the finite alphabet of letters which actually appear in E. Let c = max(A') and X = {(ac^n) | a ∈ A'\c, n ≥ 0}. Let H be a Hall set in M(A). Observe that H ∩ M(A') is a Hall set in M(A'). Then by Lemma 4.19 applied to M(A') and its Hall set H ∩ M(A'), H' = H ∩ M(X) is a Hall set in M(X), where M(X) is identified with a submagma of M(A).
The set E' = E ∩ M(X) is a finite closed subset of M(X), of smaller cardinality than E, because c ∈ E\M(X). If E' = ∅, then each letter b appearing in E is not in M(X), hence equal to c: this shows that E involves only the letter c, and E ∩ H = {c}; since c ∈ A and X ∩ E = ∅, conditions (0.3.3)–(0.3.5) defining a Lazard set are satisfied in this case. If E' ≠ ∅, then by induction on |E|, we conclude that for some n ≥ 1, H' ∩ E' = {t_1 > ⋯ > t_n}, with t_1 ∈ X = T'_1, t_i ∈ T'_i = {(t_{i−1}^p t) | p ≥ 0, t ∈ T'_{i−1}\t_{i−1}} for i = 2, ..., n, and T'_{n+1} ∩ E' = ∅. Let t_0 = c ∈ A, T_0 = A and, for i = 1, ..., n + 1, T_i = {(t_{i−1}^p t) | p ≥ 0, t ∈ T_{i−1}\t_{i−1}}. Then T'_i ⊆ T_i for i = 1, ..., n, and an easy induction on i shows that if t ∈ T_i\T'_i, then t involves some letter in A\A'. Hence, we have T_{n+1} ∩ E = T'_{n+1} ∩ E (because E ...

... h''. Denote by r(t) the immediate right subtree of t. Then, for some p ≥ 1, we have r^p(h) = a ∈ A. If a < h, then by (4.1.2) and (4.1.3), (a, h) ∈ H. If on the contrary a ≥ h, then for some m in {1, ..., p}, we have r^m(h) > h and r^{m−1}(h) < h (because r^p(h) > h and h'' = r(h) < h); then, by (4.1.2) and (4.1.3), we have (r^{m−1}(h), h) ∈ H (because (r^{m−1}(h))'' = r^m(h) > h). In both cases, we find a nontrivial right factor v of f(h) such that vf(h) ∈ f(H). Hence, by (i), the word f(h)f(h) has a factorization f(h_1) ⋯ f(h_n), h_i ∈ H, h_1 ≥ ⋯ ≥ h_n, with |h_n| ≥ |v| + |h| > |h|. This contradicts the uniqueness in (ii).
4.4.2 Elimination in free partially commutative Lie algebras

Let θ be a subset of A × A, not intersecting the diagonal. The free partially commutative monoid M(A, θ) is defined as the quotient of A* by the congruence ≡ generated by the relations ab ≡ ba for (a, b) ∈ θ. Similarly, the free partially commutative Lie algebra ℒ(A, θ) is the quotient of ℒ(A) by the Lie ideal I generated by the Lie polynomials [a, b], for (a, b) ∈ θ. The monoid M(A, θ) and the Lie algebra ℒ(A, θ) are clearly free, each in the appropriate category. The following result, due to Duchamp and Krob (1992a), extends the Lazard elimination method (Theorem 0.6).

For m in M(A, θ), denote by T_A(m) the set of letters b in A such that m ∈ M(A, θ)b. Note that if (a, b) ∈ θ, then we have for any Lie polynomial: ad(a) ∘ ad(b)(P) ≡ ad(b) ∘ ad(a)(P) mod I. Indeed, by Jacobi's identity, ad is a Lie homomorphism, so that ad(a) ∘ ad(b) − ad(b) ∘ ad(a) = ad([a, b]). From this we deduce that u ≡ v implies ad(u)(P) ≡ ad(v)(P) mod I. Thus ad(x) is a well-defined endomorphism of ℒ(A, θ), for any x in M(A, θ). With these notations, the elimination result is the following. Let C ⊆ A be such that θ ∩ (B × B) = ∅, where B = A\C. Then the K-module ℒ(A, θ) is the direct sum of ℒ(C, θ ∩ C × C) and of the Lie ideal J generated by B. Moreover, J is freely generated, as a Lie algebra, by the elements ad(m)(b), where m is any element of the submonoid of M(A, θ) generated by C, and b in B is such that {b} = T_A(mb). For this and related results, see Schmidt (1990), Duchamp and Krob (1992b, c, 1991a, b, 1992a), and Duchamp (1989).
4.4.3 Another rewriting rule

There is another rewriting rule for sequences of Hall words, which plays the same role as the relation → of Section 4.1. It is due to Schützenberger (1958, 1986), works only for Hall sets H with the further property

h < h',   (4.4.1)

for any h = (h', h'') in H, but has the advantage of being local. Instead of standard sequences, one considers sequences (4.1.4) with the property that for i = 1, ..., n − 1, h_i < h_{i+1} implies (h_i, h_{i+1}) ∈ H. Instead of legal rises, one considers rises (h_i, h_{i+1}) such that either i + 1 = n, or h_{i+1} ≥ h_{i+2}. Then Theorem 4.3 holds for this new rewriting rule. Note that property (4.4.1) also appears in Theorem 5.16(vi).
4.4.4 Free Lie superalgebra

Let A be an alphabet with a mapping χ: A × A → K, where K is a field of characteristic 0, such that χ(a, b)χ(b, a) = 1 for a, b in A. For finely homogeneous polynomials P, Q, one defines

χ(P, Q) = ∏_{a,b∈A} χ(a, b)^{deg_a(P) deg_b(Q)}.

Then one defines a bilinear operation on K⟨A⟩ by

[P, Q]_χ = PQ − χ(P, Q)QP.

Denote by ℒ_χ(A) the smallest subspace of K⟨A⟩ containing A and which is closed under this operation. The previous construction is due to Ree (1960), who gives a generalization of the Friedrichs, Dynkin–Specht–Wever and Poincaré–Birkhoff–Witt theorems and the Witt formula. The construction of Hall bases extends to ℒ_χ(A) as follows (Melançon 1991). Let H be a Hall set in M(A). Let χ(t_1, t_2) = χ(f(t_1), f(t_2)), for any trees t_1, t_2 in M(A). Let H_− = {h ∈ H | χ(h, h) = −1}, and define

H_χ = H ∪ {(h, h) | h ∈ H_−}.

Extend the order of H to H_χ by h > (h, h) > k if h ∈ H_−, k ∈ H and h > k. Then each word has a unique factorization

f(h_1) ⋯ f(h_n),  h_i ∈ H_χ,  h_1 ≥ ⋯ ≥ h_n,

where each h in H_− appears at most once. Defining P_h, for h in H_χ, as in Section 4.2, the set of polynomials P_{h_1} ⋯ P_{h_n}, with the same condition as above, forms a basis of K⟨A⟩. Moreover, the P_h, h ∈ H_χ, form a basis of ℒ_χ(A). See also Mikhalev (1986).
4.5 NOTES
The story of Hall bases of the free Lie algebra is not a simple one. They appear in a paper of M. Hall (1950), with condition (4.1.1) replaced by the stronger condition that the order be compatible with the degree. However, similar constructions of 'basic commutators' in the free group had already been done by P. Hall (1933) and Magnus (1937). M. Hall's construction was generalized by Meier-Wunderli (1951), Witt (1956), and Schützenberger (1958), by weakening his degree condition. The present condition (4.1.1) was shown to be sufficient to give bases of the free Lie algebra by Viennot (1978); this condition was also known to Shirshov (1962) and Michel (1974). Actually, Viennot shows that this condition is in some sense optimal (see Section 4.4.1). Condition (4.1.1) is so general that it includes the Lyndon basis (which is a basis of the free Lie algebra constructed by Viennot (1978) and Lothaire (1983) by following the lines of the commutator calculus of Chen et al. (1958)) and the Shirshov basis (1958); this was not the case with the original bases of M. Hall.

We warn the reader that there are four symmetries which may change the presentation of Hall bases: one can reverse the words and also reverse the order. For instance, our presentation is compatible with that of Viennot (1978) after these two reversals. We have chosen this presentation to make it compatible with the Lyndon basis of Lothaire (1983).

Theorem 4.3 is due to Melançon (1992), who extended a method of Schützenberger (1958), itself related to the collecting process of P. Hall (1933); see also M. Hall (1959). It allows one to quickly prove Corollaries 4.4, 4.5, and 4.7, which were known to all the previous authors. The proof of Theorem 4.9(ii) follows Melançon and Reutenauer (1989), and that of part (i) of this theorem, together with the underlying algorithm, follows Schützenberger (1986). The assertion on the Lie polynomials in Corollary 4.14 is due to Witt (1937) and Corollary 4.15 is due to Schützenberger (1958). The fact that the bases obtained by Lazard are the same as the generalized Hall bases (Theorem 4.18) is due to Viennot (1978).

Other bases of the free Lie algebra were constructed by Kukin (1978) and Blessenohl and Laue (1990b), the latter by purely group theoretic methods. See also Corollary 8.20. A formula similar to that of Corollary 4.14 appears in Witt (1956) in the case of the free Lie p-algebra.
5 Applications of Hall sets
We begin by introducing Lyndon words and the Lyndon basis of the free Lie algebra; this is a particular Hall set, with special properties, such as triangularity of the change of basis. The dual basis of the Poincaré–Birkhoff–Witt basis associated to a Hall basis may be computed recursively, by using the shuffle product; this is done in Section 5.2. In the next section a special Hall basis is constructed, which is compatible with the derived series of the
free Lie algebra. The special properties of Lyndon words, with respect to the alphabetical order, are also true for Hall words once the appropriate order on words is defined: to obtain it, one factorizes each word, and then orders
sequences of Hall words alphabetically (Section 5.4). In the final section, we show how Hall sets lead to the construction of variable-length codes, with various synchronization properties.
5.1 LYNDON WORDS AND BASIS

Let A be a totally ordered alphabet. We order the free monoid A* with the alphabetical order, that is, u < v if and only if either v = ux for some nonempty word x, or u = xau', v = xbv' for some words x, u', v' and some letters a, b with a < b. Observe that this order on words is simply the order in which they would appear in some dictionary. A Lyndon word on A* is a nonempty word which is smaller than all its nontrivial proper right factors; in other words, w is a Lyndon word if w ≠ 1 and if for each factorization w = uv with u, v ≠ 1, one has w < v.

Theorem 5.1 The set of Lyndon words, ordered alphabetically, is a Hall set. The corresponding Hall basis has the following triangularity property: for each word w = l_1 ⋯ l_n written as a decreasing product of Lyndon words, the polynomial P_w = P_{l_1} ⋯ P_{l_n} is equal to w plus a Z-linear combination of greater words.
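Both the definition of Lyndon words and the decreasing factorization of Corollary 4.7, specialized to the Lyndon Hall set, are easy to experiment with. The sketch below (Python; illustrative, not from the book) tests the definition directly, and computes the unique nonincreasing factorization into Lyndon words using Duval's algorithm, a standard method not described in the text.

```python
def is_lyndon(w):
    """A nonempty word smaller than all its nontrivial proper right factors."""
    return len(w) > 0 and all(w < w[i:] for i in range(1, len(w)))

def lyndon_factorization(s):
    """Duval's algorithm: the unique nonincreasing factorization of s
    into Lyndon words (Corollary 4.7 for the Lyndon Hall set)."""
    factors, i, n = [], 0, len(s)
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors

assert is_lyndon('aabab') and not is_lyndon('abaab')
assert lyndon_factorization('abbab') == ['abb', 'ab']
assert lyndon_factorization('bbaba') == ['b', 'b', 'ab', 'a']
```

Each factor is Lyndon and the sequence of factors is nonincreasing, matching the uniqueness statement of Corollary 4.7.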
We need some properties of the alphabetical order, gathered in the next lemma.
Lemma 5.2 (i) If u < v and u is not a prefix of v, then ux < vy for all words x, y.
(ii) If u < v < uw, then v = uv' for some word v' such that v' < w.

Proof (i) This is an immediate consequence of the definition of the order. For (ii), imagine the words u, v, uw written in a dictionary, and the assertion immediately follows. ∎

Proof of Theorem 5.1 (a) Define the standard factorization of each word w of length ≥ 2 to be the factorization w = uv, where v is the smallest nontrivial proper right factor of w for the alphabetical order. Then define recursively for each nonempty word w a tree t(w) in M(A) by t(a) = a if a
is a letter, and t(w) = (t(u), t(v)) if w = uv is its standard factorization. Evidently, the foliage of t(w) is w, so w ↦ t(w) is injective and the set {t(w) | w ∈ A*} inherits the alphabetical order. We show that the set {t(w) | w Lyndon} is a Hall set. In view of eqns (4.1.9)–(4.1.11), it is enough to prove the following two assertions.

If w is a Lyndon word with standard factorization uv, then u, v are Lyndon words with u < v, w < v, and either u is a letter, or the standard factorization of u is xy with y ≥ v.   (5.1.1)

If u, v are Lyndon words with u < v, then uv is a Lyndon word.   (5.1.2)

Let us prove (5.1.1): we have u < uv = w < v, the first inequality by definition of the order, and the second because w is Lyndon. So u < v, w < v. Moreover, v is smaller than all its nontrivial proper right factors, because these are nontrivial proper right factors of w and v is by definition the smallest among them. Hence v is a Lyndon word. Let y be any nontrivial proper right factor of u. Then yv > v, because v is the smallest nontrivial proper right factor of w. If we had y < v, then Lemma 5.2(ii) and the inequalities y < v < yv would imply v = yv' for some word v' with v' < v, a contradiction (because v' would be a proper nontrivial right factor of w = uv = uyv'). Hence, y ≥ v. This shows at the same time that u is a Lyndon word (because u < v ≤ y), and that if u = xy is its standard factorization, then y ≥ v. Hence, (5.1.1) is proved.

In order to prove (5.1.2), let s be a proper nontrivial right factor of uv. Then we have three cases.
(i) s is longer than v, hence s = u'v for some nontrivial proper right factor u' of u. Since u is Lyndon, we have u < u', and since u is not a prefix of u', we have by Lemma 5.2(i), uv < u'v = s.

(ii) s = v: if u is a prefix of v, then v = uv'; since v is Lyndon, we have v < v', hence uv < uv' = v; on the other hand, if u is not a prefix of v, then from u < v and Lemma 5.2(i), we deduce uv < v.

(iii) s is shorter than v: then s is a nontrivial proper right factor of v, hence v < s (because v is Lyndon); since by (ii) uv < v, we deduce uv < s.

This proves that uv < s, hence also (5.1.2), and we conclude that the set of Lyndon words is a Hall set.

(b) Note that if l is a Lyndon word written as a nontrivial product l = uv, then by definition l < v; since v < vu, we have l < vu.
We show first that for each Lyndon word l, the corresponding Lie polynomial P_l is equal to l plus a Z-linear combination of greater words of the same length as l. This is clear for l = a ∈ A, because P_a = a. If |l| ≥ 2, let l = uv be its standard factorization; then u, v are Lyndon words and by induction

P_u = u + Σ_{x>u} α_x x,   P_v = v + Σ_{y>v} β_y y.

Then, by definition of P_l, we have

P_l = [P_u, P_v] = P_u P_v − P_v P_u
    = uv + Σ_{y>v} β_y uy + Σ_{x>u} α_x xv + Σ_{x>u, y>v} α_x β_y xy − vu − Σ_{x>u} α_x vx − Σ_{y>v} β_y yu − Σ_{x>u, y>v} α_x β_y yx.

Then, the assertion follows from the inequalities l = uv < vu and u ...

Write s ⇒ t if either t = λ_i(s) or t = ρ_i(s) for some i; see eqns (4.2.3) and (4.2.4). Denote by ⇒* the transitive closure of ⇒. Derivation trees are defined in Section 4.2.

Lemma 5.4 (i) Let s be the standard sequence (h_1, ..., h_n) with n ≥ 2 and suppose that h_1 ≥ h_i for i = 2, ..., n. Then for any standard sequence t with s ⇒* t, t is of length at least 2.
(ii) Let s be the standard sequence (h_1, ..., h_n) with h_2 ≥ ⋯ ≥ h_n. If h_1 ⋯ h_n is a Hall word, then there is exactly one chain s ⇒ ⋯ ⇒ (h_1 ⋯ h_n). If s ⇒* t and t ≠ (h_1 ⋯ h_n), then t is of length at least 2.

Proof (i) By assumption, for any rise h_i < h_{i+1} in s, one has i ≥ 2. This implies that λ_i(s) and ρ_i(s) are of length at least 2. Moreover, they satisfy the same condition as s, that is, their first element is greater than or equal to the others. This is clear for ρ_i(s), and for λ_i(s), note that by eqns (4.1.7) and (4.1.10), h_{i+1} > h_i h_{i+1}. Hence, (i) follows by induction on the length of the chain s ⇒* t.

(ii) If n = 1, there is nothing to prove. Suppose n ≥ 2. If h_1 ≥ h_2, then s is decreasing and so there is no nontrivial chain starting from s. Moreover, h_1 ⋯ h_n is not a Hall word, by Corollary 4.7. So we may assume that h_1 < h_2. This is surely the only rise, so that λ_1(s) = (h_1h_2, h_3, ..., h_n) and ρ_1(s) = (h_2, h_1, h_3, ..., h_n). Note that h_2 is the greatest element of ρ_1(s), so by (i), ρ_1(s) ⇒* t implies that t is of length at least 2. Moreover, λ_1(s) is shorter than s and satisfies the same hypothesis. So we may conclude by induction. ∎
Proof of Theorem 5.3 (i) This is clear because P_1 = 1 and the other P_w are homogeneous of degree ≥ 1.

(ii) Let h = av be a Hall word with first letter a. Since S_h has no constant term, the equality S_h = aS_v is equivalent to saying that for any word u and any letter b, (S_h, bu) = δ_{a,b}(S_v, u). We have by (5.2.1)

u = Σ_w (S_w, u) P_w.

Hence,

bu = Σ_w (S_w, u) bP_w.   (5.2.3)

Choose a word w and write it as a decreasing product of Hall words, w = h_1 ⋯ h_n. Then the sequence s = (b, h_1, ..., h_n) is standard, and by Lemma 4.11

bP_w = bP_{h_1} ⋯ P_{h_n} = P(s) = Σ_t α_t P(t),
where the summation is over all decreasing sequences t of Hall words and where α_t is the number of chains s ⇒* t in a fixed derivation tree of s. By Lemma 5.4(ii), each t is of length ≥ 2, except when bw = bh_1 ⋯ h_n is a Hall word, in which case there is exactly one chain from s to (bw). When we put this into (5.2.3), we obtain

bu = Σ_{w : bw Hall word} (S_w, u) P_{bw} + sum of decreasing products of length ≥ 2 of Hall polynomials.

Hence, the coefficient of P_h = P_{av} in this sum is equal to 0 if a ≠ b, and to (S_v, u) if a = b. In other words, (S_h, bu) = δ_{a,b}(S_v, u), as required.

(iii) Note that, by definition of the dual basis, we have (S_w, P_u) = δ_{u,w}. In particular, if w is a Hall word and u is not, then (S_w, P_u) = 0.
Write w = w₁···wᵢ as a decreasing product of Hall words; hence i = i₁ + ··· + iₖ. We evaluate (S_{w₁} ш ··· ш S_{wᵢ}, P_u), which is equal to (S_{w₁} ⊗ ··· ⊗ S_{wᵢ}, δᵢ(P_u)) by Proposition 1.8. By (1.5.6),

δᵢ(P) = P⊗1⊗···⊗1 + 1⊗P⊗···⊗1 + ··· + 1⊗1⊗···⊗P,   (5.2.4)

for each Lie polynomial P. Write u = u₁···uₙ as a decreasing product of Hall words. Then P_u = P_{u₁}···P_{uₙ}. Now, δᵢ is a homomorphism and each P_{uⱼ} is a Lie polynomial. Hence,

δᵢ(P_u) = δᵢ(P_{u₁}) ··· δᵢ(P_{uₙ}).   (5.2.5)

By inspection of (5.2.4) and (5.2.5), we find that δᵢ(P_u) is a sum of terms of the form Q₁ ⊗ ··· ⊗ Qᵢ; hence (S_{w₁} ш ··· ш S_{wᵢ}, δᵢ(P_u)) is a sum of terms of the form (S_{w₁}, Q₁) ··· (S_{wᵢ}, Qᵢ). If i > n, then in each term at least one Qⱼ is equal to 1; hence, since (S_{wⱼ}, 1) = 0, we have (S_{w₁} ш ··· ш S_{wᵢ}, P_u) = 0. If, on the contrary, i < n, then in each term at least one Qⱼ is a decreasing product P_v = P_{u_{j₁}} ··· P_{u_{jᵣ}} with r ≥ 2; hence, since (S_{wⱼ}, P_v) = 0, we also have (S_{w₁} ш ··· ш S_{wᵢ}, P_u) = 0. In the remaining case i = n, we obtain, because (S_{wⱼ}, 1) = 0,

(S_{w₁} ш ··· ш S_{wₙ}, P_u) = Σ_{σ∈Sₙ} (S_{w₁}, P_{u_{σ(1)}}) ··· (S_{wₙ}, P_{u_{σ(n)}}) = Σ_{σ∈Sₙ} δ_{w₁,u_{σ(1)}} ··· δ_{wₙ,u_{σ(n)}}.

If w ≠ u, then (w₁, ..., wₙ) ≠ (u₁, ..., uₙ), and since both sequences are decreasing, the right-hand side vanishes. If w = u, then by Corollary 4.7, (w₁, ..., wₙ) = (u₁, ..., uₙ); since (w₁, ..., wₙ) = (h₁, ..., h₁, ..., hₖ, ..., hₖ), each hⱼ repeated iⱼ times, the right-hand side is equal to the number of
5.2
The dual basis
111
permutations fixing the previous sequence, that is, i₁!···iₖ!. Finally,

(1/(i₁!···iₖ!)) (S_{h₁}^{ш i₁} ш ··· ш S_{hₖ}^{ш iₖ}, P_u) = δ_{u,w},

which proves (iii), by definition of the dual basis. □
Corollary 5.5 The shuffle algebra ℚ⟨A⟩ is a free commutative algebra over the set of the S_h (h a Hall word).

Proof The families (S_w) and (P_w) are dual bases. Hence, besides eqn (5.2.1), we also have the dual relation, for any polynomial Q,

Q = Σ_{w∈A*} (P_w, Q) S_w.

This shows that the polynomials S_w form a basis of the space ℚ⟨A⟩. Because of Theorem 5.3(iii), this implies that the polynomials S_h (h a Hall word) form a free generating set of the shuffle algebra ℚ⟨A⟩. □
In the next result, we again use the algebra 𝒜 introduced in Section 1.5.

Corollary 5.6 The following identity holds in the algebra 𝒜:

Σ_{w∈A*} w ⊗ w = Π_{h∈H} exp(S_h ⊗ P_h),

where the product has to be taken in decreasing order.

This result could also be stated in the algebra End(ℚ⟨A⟩) with the convolution product ∗ (see Section 1.5): it gives a formula for the identity as a product of exponentials of elementary endomorphisms.

Proof The right-hand side is

Π_{h∈H} ( Σ_{i≥0} (1/i!) S_h^{ш i} ⊗ P_h^i ) = Σ_{h₁>···>hₖ; i₁,...,iₖ≥1} (1/(i₁!···iₖ!)) S_{h₁}^{ш i₁} ш ··· ш S_{hₖ}^{ш iₖ} ⊗ P_{h₁}^{i₁} ··· P_{hₖ}^{iₖ}.

Because of the definition of P_w and Theorem 5.3(iii), this is

Σ_{w∈A*} S_w ⊗ P_w,
which is equal to

Σ_{u∈A*} u ⊗ u

by (5.2.1). □

5.3 THE DERIVED SERIES
Define a sequence of subspaces of ℒ(A), called the derived series, in the following way: ℒ⁽⁰⁾ = ℒ(A), ℒ⁽ⁿ⁺¹⁾ = [ℒ⁽ⁿ⁾, ℒ⁽ⁿ⁾]. The latter means that ℒ⁽ⁿ⁺¹⁾ is the subspace of ℒ(A) generated by the polynomials [P, Q] for P, Q in ℒ⁽ⁿ⁾. Each subspace ℒ⁽ⁿ⁾ is a Lie ideal of ℒ(A), and in particular ℒ⁽ⁿ⁺¹⁾ ⊆ ℒ⁽ⁿ⁾; this is an easy consequence of the Jacobi identity, left as an exercise.

We define a particular Hall set H, which will be shown to be compatible with the derived series. Define H₀ = A and order it totally; now define recursively Hₙ₊₁ as the set of trees of the form

h = (···((h₁, h₂), h₃), ···, hₖ),   (5.3.1)

where k ≥ 2 and where h₁, ..., hₖ ∈ Hₙ with

h₁ < h₂ ≥ h₃ ≥ ··· ≥ hₖ.   (5.3.2)

Now order Hₙ₊₁ totally. Finally, let H = ⋃_{n≥0} Hₙ and extend the order of the Hₙ to H by the condition

h ∈ Hₘ, k ∈ Hₙ, m < n ⇒ h > k.   (5.3.3)

Symbolically, this may be written as H₀ > H₁ > ··· > Hₙ > Hₙ₊₁ > ···. Equation (5.3.1) is illustrated in Fig. 5.1.
Theorem 5.7 The set H is a Hall set. For each n ≥ 0, the set of Hall polynomials {P_w | w ∈ ⋃_{p≥n} Hₚ} is a basis of ℒ⁽ⁿ⁾.

Proof (a) In order to prove the first assertion, we have only to verify conditions (4.1.1)–(4.1.3).

Suppose h ∈ H is of the form (h′, h″). Then h is in Hₙ₊₁ for some n ≥ 0, and is of the form (5.3.1). Then h″ = hₖ is in Hₙ, so h < h″ by (5.3.3), and (4.1.1) holds.
Fig. 5.1
If k ≥ 3, then h′ = (···(h₁, h₂), ···, hₖ₋₁) is in Hₙ₊₁. Hence, h′ < h″ by (5.3.3). Moreover, h′ = (x, y) with y = hₖ₋₁, and by (5.3.2), y ≥ h″. So (4.1.2) and (4.1.3) hold in this case.

On the other hand, if k = 2, then h′ ∈ Hₙ and h′ = h₁ < h₂ = h″ by (5.3.2), and (4.1.2) holds. Moreover, if h′ is not a letter, then n ≥ 1 and h′ = (x, y) must also be of the form (5.3.1), so that y is in Hₙ₋₁. Hence, y > h″ by (5.3.3), and (4.1.3) holds.

Suppose now that (4.1.2) and (4.1.3) hold for h = (h′, h″). We have to show that h is of the form (5.3.1) and that (5.3.2) holds. If h′, h″ are in the same Hₙ, then h is of the form (5.3.1), and (5.3.2) holds with k = 2, because h′ < h″ by (4.1.2). Otherwise, because of (4.1.2) and (5.3.3), we have h′ ∈ Hₙ₊₁, h″ ∈ Hₚ with n + 1 > p. Then h′ is of the form (5.3.1) and condition (5.3.2) holds. Moreover, writing h′ = (x, y), we have y ≥ h″ by (4.1.3), and y = hₖ by (5.3.1). Since hₖ is in Hₙ and h″ in Hₚ, we deduce from (5.3.3) that n ≤ p. This, together with p < n + 1, shows that p = n. Then we set h″ = hₖ₊₁, so h is of the form (5.3.1), and condition (5.3.2) holds (with k + 1 instead of k).

(b) Define the level l(t) of a tree t inductively by l(t) = 0 if t ∈ A and, otherwise, writing t = (t′, t″),

l(t) = 1 + l(t′)  if l(t′) = l(t″);
l(t) = sup(l(t′), l(t″))  if l(t′) ≠ l(t″).
The following facts are easy to verify:

if s is a subtree of t, then l(s) ≤ l(t);   (5.3.4)

if t₁ is obtained from t by replacing its subtree s by s₁, and if l(s) ≤ l(s₁), then l(t) ≤ l(t₁).   (5.3.5)

We claim that Hₙ is the set of trees in H which are of level n. Indeed, note first that, by definition of the level, a tree which is not reduced to a single letter is of level ≥ 1. Hence, H₀ is the set of Hall trees of level 0. Arguing by induction, let h be in Hₙ₊₁, hence of the form (5.3.1); then h₁, ..., hₖ are in Hₙ, hence of level n by induction; an immediate induction on k (starting from k = 2) and the definition of the level show that h is of level n + 1. This proves the claim.

Note that the second assertion of the theorem is true for n = 0, by Theorem 4.9(i). The general case will follow by induction if we prove the following: let h, k be Hall trees of level ≥ n; then [P_h, P_k] is a linear combination of Hall polynomials Pₜ with t of level ≥ n + 1.

Note that under the previous hypothesis the tree (h, k) is of level ≥ n + 1. Thus, in view of the algorithm stated after the proof of Theorem 4.9, it is enough to show that each step of this algorithm, when applied to a tree t, produces only trees of level ≥ l(t). This is clear for steps (4.2.8) and (4.2.9). For step (4.2.10), we have s = (s′, s″), where s′ = (x, y) and s″ are Hall trees and x < y < s″, s′ < s″. By the claim and (5.3.3), we have l(x) ≥ l(y) ≥ l(s″) and l(s′) ≥ l(s″). If we had l(s′) = l(s″), then we would deduce l(s′) = l(s″) ≤ l(y) ≤ l(x) ≤ l(s′) (the latter inequality by (5.3.4)); hence equality holds everywhere; since s′ = (x, y), this would imply, by the definition of the level, l(s′) = l(x) + 1 > l(x), a contradiction. Thus we have l(s′) > l(s″), which implies l(s) = l(s′), by definition of the level. Now, by (5.3.4) and (5.3.5), l(((x, s″), y)) ≥ l((x, y)) = l(s′) = l(s), and l((x, (y, s″))) ≥ l((x, y)) = l(s′) = l(s). Hence, the trees t₁ and t₂ of step (4.2.10), obtained by replacing the subtree s of t by ((x, s″), y) and (x, (y, s″)), respectively, are by (5.3.5) of level ≥ l(t). This proves the theorem. □
5.4 ORDER PROPERTIES OF HALL SETS
Let H be a Hall set in A*. According to Corollary 4.7, each word w in A* has a unique factorization w = h₁···hₙ, hᵢ ∈ H, h₁ ≥ ··· ≥ hₙ. We use this bijection between A* and the set of decreasing sequences of Hall words to define a total order on A*, which extends the order on Hall words, and which we therefore still denote by ≤. This order is obtained by carrying over to A* the alphabetical order on the sequences of Hall words. This means that if z is another word in A*, factorized as z = k₁···kₘ, kᵢ ∈ H, k₁ ≥ ··· ≥ kₘ, then
Fig. 5.2
w < z if and only if either n < m and hᵢ = kᵢ for i = 1, ..., n, or hⱼ < kⱼ for the smallest index j with hⱼ ≠ kⱼ.   (5.4.1)

[…] Then for each nonempty word m in M, there exists p such that mᵖ is synchronizing. If K ≠ H, X is complete.
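For the Hall set of Lyndon words, the order just defined is directly computable: factorize each word into its decreasing product of Lyndon words by Duval's algorithm, and compare the two sequences alphabetically. As recalled in the Notes (Section 5.7), Melançon proved that in the Lyndon case this order coincides with the alphabetical order on A*; the sketch below checks this on all short binary words. The function names are ours.

```python
from itertools import product

def lyndon_factorization(w):
    """Duval's algorithm: the unique decreasing factorization of w
    into Lyndon words (Corollary 4.7 for the Lyndon Hall set)."""
    i, n, out = 0, len(w), []
    while i < n:
        j, k = i + 1, i
        while j < n and w[k] <= w[j]:
            k = i if w[k] < w[j] else k + 1
            j += 1
        while i <= k:
            out.append(w[i:i + j - k])
            i += j - k
    return out

def order_541(w, z):
    """True iff w < z in the order (5.4.1): compare the decreasing
    sequences of Lyndon factors alphabetically."""
    return lyndon_factorization(w) < lyndon_factorization(z)

words = [''.join(p) for r in range(5) for p in product('ab', repeat=r)]
# in the Lyndon case, the order (5.4.1) agrees with the alphabetical order on A*
assert all(order_541(w, z) == (w < z) for w in words for z in words)
```

For instance, bab factorizes as b·ab, and the sequence (b, ab) is compared entry by entry with that of any other word.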
Note that, in view of (5.4.1), M is by definition the set of words of the form

h₁···hₙ,  n ≥ 0,  hᵢ ∈ H∖K,  h₁ ≥ ··· ≥ hₙ.   (5.5.5)

Assertion (i) of the theorem is that M contains all the words h₁···hₙ with hᵢ ∈ H∖K. Note that Lemma 4.19 is a particular case of the theorem.

Proof (i) By definition (5.4.1) of the order ≤ in A*, the empty word is in M. Let w, z be two elements of M: then, by (5.4.1), w = h₁···hₙ, z = k₁···kₘ with h₁, ..., hₙ, k₁, ..., kₘ ∈ H, h₁ ≥ ··· ≥ hₙ, k₁ ≥ ··· ≥ kₘ; moreover, h₁, k₁ < k for any k in K, by definition of M. Then wz = h₁···hₙk₁···kₘ, and by Lemma 5.10 we have

wz = l₁···lₚ
5.5
Synchronous codes
121
for some standard sequence (l₁, ..., lₚ) with max(lᵢ) < k for any k in K. Hence, by Theorem 4.3(iii) and Lemma 5.11(i), we have wz = m₁···mₗ with mᵢ ∈ H, m₁ ≥ ··· ≥ mₗ and m₁ < k for any k in K. Thus, by (5.4.1), we deduce wz < k for any k in K, i.e. wz ∈ M.

(ii) By Lemma 5.12, for each element h in X and each nontrivial proper right factor v of h, one has v ≥ h″; since h″ ∈ K and since K is by hypothesis upwards closed in H, we cannot have v ∈ H∖K, and a fortiori not v ∈ X. Hence, X is a suffix code. It generates M: indeed, M is generated by H∖K, and for h in H∖K, either h ∈ X, or h″ ∈ H∖K, hence h′ ∈ H∖K (because h′ < h″ by (4.1.9) and K is upwards closed), and by induction h′, h″ ∈ X*, implying h = h′h″ ∈ X*.

(iii) Let h ∈ H∖(K ∪ X). Then h″ ∉ K, which by (4.1.9) implies that h′ ∉ K, because K is upwards closed. So h′, h″ ∈ H∖K and h′ < h″. If h′ ∉ X, then h′ ∉ A and (h′)″ ≥ h″ by (4.1.11). Conversely, let h, k ∈ H∖K with h < k and either h ∈ X or h″ ≥ k. We show that hk ∈ H∖K. In the first case, either h ∈ A, and then hk ∈ H because of (4.1.11), or h ∈ X∖A, hence h″ ∈ K, implying h″ > k (because k ∉ K, and K is upwards closed), thus hk ∈ H by (4.1.11). In the second case, (4.1.11) gives directly hk ∈ H. Moreover, hk < k by (4.1.10), so that hk ∉ K, and finally hk ∈ H∖K. To conclude that H∖K is a Hall set in X*, note that X is contained in H∖K, and H∖K is contained in M = X*, by definition of M.

(iv) Let S be as in the theorem. Then each word w in A* has a unique decomposition w = sx₁···xₙ, s ∈ S, n ≥ 0, xᵢ ∈ X. Indeed, this is a consequence of Corollary 4.7, of the definition (5.5.5) of M, and of the fact, proved above, that M is freely generated by X.
If X is complete, then, by uniqueness of (5.5.3), we must necessarily have that S is the set of proper right factors of words in X. Conversely, if each word in S is the right factor of some word in X, then each word in A* is the right factor of some word in X*, because of the representation sx₁···xₙ above. Hence, X is a complete suffix code. If K is finite, then X is complete because of (v) and the fact that M ≠ {1}: indeed, |A| ≥ 2, hence H is infinite and ∅ ≠ H∖K ⊆ M.

(v) Let K = {kₚ > ··· > k₁}. We show by induction on i that, for any nonempty m in M, the word m kᵢ^{jᵢ} ··· k₁^{j₁} (j₁, ..., jᵢ ≥ 1) is in M. This is enough, in view of (5.5.5) and Corollary 4.7. We may write m = h₁···hₙ with h₁ ≥ ··· ≥ hₙ, hᵢ ∈ H∖K, n ≥ 1. Then, in view of Lemma 5.10, there exists a standard sequence of Hall words (l₁, ..., lₚ) such that w = m kᵢ^{jᵢ} ··· k₁^{j₁} = l₁···lₚ with kᵢ = max(lⱼ) ≠ l₁. Hence, Theorem 4.3(iii) and Lemma 5.11(iii) imply that w = p₁···p_q for Hall words p₁, ..., p_q with p₁ ≥ ··· ≥ p_q and p₁ < kᵢ. Thus we conclude, by induction on i, that w is in M, as desired.

(vi) If H has the stated property, then so has the Hall set described in (iii). Let m be in M, m ≠ 1. We show that, for p = |m|, mᵖ is synchronizing, by induction on |m|.
If m is a letter, then for any word w = h₁···hₙ, written as a decreasing product of Hall words, the sequence s = (m, h₁, ..., hₙ) is standard. Then either m ≥ h₁, and by (5.5.5) mw ∈ M, or m < h₁ and s → (mh₁, h₂, ..., hₙ); since mh₁ < m by hypothesis, we have mh₁ ∈ H∖K, and we deduce from Theorem 4.3(iii) and Lemma 5.11(i) that mw ∈ M.

Now, let m be of length ≥ 2, and let c be the greatest letter in m. If c ∈ H∖K, then each letter of m is in H∖K; hence m = m′b, m′ ∈ M, b ∈ A∖K. By the case |m| = 1, we have bw ∈ M, hence mw = m′bw ∈ M, for any word w.

Suppose now that c ∈ K. Let K′ = {h ∈ H | h ≥ c} and M′ = {w ∈ A* | ∀k ∈ K′, w < k}; then K′ ⊆ K, M′ ⊇ M, and by (i) and (ii), M′ is a submonoid of A*, generated by a suffix code X′; moreover, H∖K′ is a Hall set in the free monoid X′* by (iii), K∖K′ is an upwards closed subset of H∖K′, and M = {w ∈ X′* | ∀k ∈ K∖K′, w < k}. Hence, M is obtained from X′ in exactly the same way as M is from A.

Write m = h₁···hₙ with hᵢ ∈ H, h₁ ≥ ··· ≥ hₙ. Then by (5.5.5) we have hᵢ ∈ H∖K, hence hᵢ ∈ H∖K′. Thus m ∈ M′ = X′*, by (5.5.5). The X′-length of m is strictly less than its A-length; otherwise each letter of m would be in X′ ⊆ M′, in particular c, a contradiction. Hence, by induction and the previous remarks, we conclude that mᵖ⁻¹X′* ⊆ M. Observe that, since c is the greatest letter in m, m does not begin with c: otherwise its decreasing factorization into Hall words begins with c, and m ≥ c, hence m ∉ M′, a contradiction. Hence m = bm′, b < c, thus b ∈ M′. By the case |m| = 1, we have bA* ⊆ M′, hence mA* ⊆ M′ = X′*. This shows that mᵖA* ⊆ M.

If K ≠ H, then M ≠ {1}, and the synchronizing property clearly implies that X is complete. □
Let X be a set of words of equal length n. Then, evidently, X is a suffix code. We say that X is comma-free if, whenever a word x of X is a factor of a message (i.e. of a word in X*), the latter may be cut into two submessages at x; formally, this means that

∀x ∈ X, ∀u, v ∈ A*,  uxv ∈ X* ⇒ u, v ∈ X*.   (5.5.6)

Note that if X contains two words x, y with x = uv, y = vu for some nonempty words u, v in A*, then X is not comma-free. Indeed, the word uvuv is in X* and contains the inner factor y, but u, v are not in X*. This implies that the cardinality of X does not exceed the number of primitive conjugation classes of words of length n. Hence,

|X| ≤ (1/n) Σ_{d|n} μ(d) q^{n/d},

where q = |A| (see Theorem 7.1). The next result shows that this bound may be achieved when n is odd.

Theorem 5.17 Let n be odd and q = |A|. Then there exists a comma-free code consisting of words of length n and of cardinality (1/n) Σ_{d|n} μ(d) q^{n/d}.
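The number (1/n) Σ_{d|n} μ(d) q^{n/d} also counts the Lyndon words of length n, so it can be checked by brute force on a binary alphabet. A sketch, with all helper names ours:

```python
from itertools import product

def mobius(n):
    """Moebius function mu(n), by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:      # square factor
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result

def is_lyndon(w):
    """w is Lyndon iff it is nonempty and smaller than all its
    nontrivial proper right factors."""
    return len(w) > 0 and all(w < w[i:] for i in range(1, len(w)))

def necklace_count(q, n):
    """(1/n) * sum over d | n of mu(d) * q^(n/d)."""
    return sum(mobius(d) * q ** (n // d)
               for d in range(1, n + 1) if n % d == 0) // n

for n in range(1, 8):
    lyndon = [w for w in map(''.join, product('ab', repeat=n)) if is_lyndon(w)]
    assert len(lyndon) == necklace_count(2, n)
```

For q = 2 the counts are 2, 1, 2, 3, 6, 9, 18 for n = 1, ..., 7.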
The result is not true when n is even; see Berstel and Perrin (1985, p. 346). To prove the theorem, we construct a special Hall set H, and show that the set of words of length n in H is a comma-free code. This is enough, because the number in the theorem is equal to the number of Hall words of length n (Corollary 4.14).

Recall that M(A) denotes the free magma. We say that t in M(A) (respectively A*) is even (respectively odd) if |t| is.

Lemma 5.18 There exists a Hall set H such that, for any h, k in H:
(i) h even, k odd implies h < k;
(ii) h, k odd and |h| > |k| implies h < k.

Proof Let N = {t ∈ M(A) | t even, or t odd and t″ odd}. Define a binary relation ⊏ on N by s ⊏ t if either s is even and t odd, or s, t have the same parity and |s| > |t|. Then the reflexive and transitive closure of ⊏ is a partial order on N. Extend it to a total order ≤ on N. Note that for h, k in N one has (i) and (ii), and that h < h″ for each h in N∖A. Define recursively a subset H of N by: A ⊆ H, and for any t = (t′, t″) in N∖A, t is in H if and only if t′, t″ ∈ H, t′ < t″, and either t′ ∈ A or (t′)″ ≥ t″. Note that h, k ∈ N and h < k implies (h, k) ∈ N. Then H is a Hall set with the desired properties. □
Proof of Theorem 5.17 Let H be the Hall set of Lemma 5.18, and let K = {h ∈ H | h odd}. Then K is upwards closed in H, thus satisfies the hypothesis of Theorem 5.16, and M = {w ∈ A* | ∀k ∈ K, w < k} is a submonoid of A*. Note that M is, by (5.5.5), generated by the even Hall words. By Corollary 4.7, each word w in A* has a unique representation w = k₁···kₙm, with kᵢ ∈ K, m ∈ M, k₁ ≥ ··· ≥ kₙ. Moreover, if w is a right factor of some word k in K, then by Lemma 5.9 we have w = k₁···kₙ with kᵢ ∈ H and k₁ ≥ ··· ≥ kₙ ≥ k″; by (4.1.10), k″ > k, so that each kᵢ is in K, because K is upwards closed.

We claim that each left factor of a word in H is in M ∪ KM. Suppose the claim is proved, and suppose that the set of Hall words of length n is not a comma-free code. This means, by (5.5.6), that there are three such Hall words, say h, k, l, and a factorization hk = ulv for nonempty words u, v. Hence, l = w₁w₂ with h = uw₁ and k = w₂v. Since l is odd, one of w₁ and w₂ is even, and so there exists an even word w which is both a right factor and a left factor of some word in K. Thus, by a previous remark and the claim, we have w = k₁···kₙ ∈ M ∪ KM, with kᵢ ∈ K, k₁ ≥ ··· ≥ kₙ. Since w is even, we must have w ∈ M, which contradicts the uniqueness of Corollary 4.7, in view of (5.5.5).
To prove the claim, we first prove that, for Hall words h, k, one has

h, k odd, h < k ⇒ hk ∈ M,   (5.5.7)

and

h even, k odd ⇒ hk ∈ KM.   (5.5.8)

Let us prove (5.5.7): if h ∈ A or h″ ≥ k, then by (4.1.11), hk is a Hall word, evidently even, so that hk ∈ M. If h″ < k, then, since h″ is odd (because h″ > h, and K is upwards closed), we have by induction h″k ∈ M; but h′ is even, so that h′ ∈ M, and hk = h′(h″k) ∈ M, the latter being a submonoid.

We prove (5.5.8): we have h < k by Lemma 5.18(i). If h ∈ A or h″ ≥ k, then hk ∈ H by (4.1.11), and hk ∈ K ⊆ KM. Otherwise h″ < k: either h″ is odd, so that h′ is too, and, by (5.5.7), hk = h′(h″k) ∈ KM; or h″ is even, so that h′ is too, and, by induction on |h|, h″k = k₁m with k₁ ∈ K, m ∈ M. Then, by induction again, h′k₁ ∈ KM, so that hk = h′h″k = h′k₁m ∈ KMm ⊆ KM.

We now prove the claim: let w be a left factor of a Hall word h. If w is a left factor of h′, we are done by induction. Otherwise w = h′w′, where w′ is a left factor of h″. If w′ = h″, we are done, because then w ∈ H ⊆ M ∪ KM; so we may suppose w′ ≠ h″. By induction, we have w′ = m or km (m ∈ M, k ∈ K). Thus w = h′m or h′km. If h′ is even, w belongs to M ∪ KMm by (5.5.8), hence to M ∪ KM. If h′ is odd, then, since h′ < h″ by (4.1.9), h″ is odd too; moreover, k is shorter than h″ (because w′ ≠ h″), so that by Lemma 5.18(ii), h″ < k, hence h′ < k; thus w ∈ KM ∪ M by (5.5.7). This proves the claim. □
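Comma-freeness (5.5.6) is easy to test mechanically: since the words of X have equal length n, a factor of length n of a message spans at most two consecutive codewords, so it suffices to inspect messages made of two codewords. A sketch; the code X = {aab, abb}, which attains the bound 2 = (2³ − 2)/3 for n = 3, q = 2, is our example, not necessarily the one produced by the proof's Hall set:

```python
def is_comma_free(X):
    """Check (5.5.6) for a set X of words of equal length n,
    by inspecting all two-codeword messages."""
    n = len(next(iter(X)))
    assert all(len(x) == n for x in X)
    for x in X:
        for y in X:
            m = x + y
            # an "inner" occurrence is one not at a cut point 0 or n
            if any(m[i:i + n] in X for i in range(1, n)):
                return False
    return True

assert is_comma_free({"aab", "abb"})
# aab = uv and baa = vu with u = "aa", v = "b": not comma-free (see above)
assert not is_comma_free({"aab", "baa"})
```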
5.6 APPENDIX

5.6.1 Long products of Lyndon words

There exists a function N(q, k) such that, for any totally ordered alphabet A with q elements, any word w in A* of length at least N(q, k) has a factorization w = u l₁···lₖ v, where each lᵢ is a Lyndon word and l₁ ≥ ··· ≥ lₖ (Reutenauer 1986a). For the proof, one introduces the code B = a(A∖a)*, where a is the smallest letter in A. Then B may be considered as a totally ordered alphabet. The free monoid B* is embedded in A*, and one shows that: (i) the alphabetical order in B* is the restriction to B* of that of A*; (ii) each Lyndon word in B* is a Lyndon word in A*. Then the existence of N(q, k) is proved by lexicographical induction on (k, q).

The previous result has as a consequence a theorem of Shirshov, which itself has applications in rings with polynomial identities (Shirshov 1957; see also Lothaire 1983, Chapter 7). It has been extended to the Viennot factorizations by Varricchio (1990). It is not known if a similar result holds for Hall words.
5.6.2 Multilinear Lie polynomials
Let A = {a₁, ..., aₙ} and call a polynomial P multilinear if it is a linear combination of words a_{σ(1)}···a_{σ(n)}, σ ∈ Sₙ. Denote by Γₙ the space of multilinear Lie polynomials. Then Γₙ is of dimension (n − 1)! and admits as a basis the set of polynomials

[···[[a_{σ(1)}, a_{σ(2)}], a_{σ(3)}], ···, a_{σ(n)}],  σ ∈ Sₙ, σ(1) = 1.

Indeed, an inductive use of Jacobi's identity shows that these polynomials generate Γₙ. Moreover, they are linearly independent, because the above polynomial is the only one involving the word a_{σ(1)}···a_{σ(n)}. Alternatively, one may use the fact that the number of multilinear Lyndon words is (n − 1)!.

Let A as above be naturally ordered, consider the set of Lyndon words as a Hall set, and let P_w denote the corresponding Hall polynomial (as in Theorem 5.1). Then one has the following identity (Melançon and Reutenauer 1989):

a₁···aₙ = Σ_{σ∈Sₙ} P_{a_{σ(1)}···a_{σ(n)}}.
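The identity can be verified directly in the free associative algebra, representing noncommutative polynomials as dictionaries from words to coefficients: P_w is obtained by bracketing each Lyndon word through its standard factorization (w = uv, with v the smallest proper suffix of w) and multiplying along the decreasing Lyndon factorization. A sketch for n = 3; all function names are ours:

```python
from itertools import permutations

def lyndon_factorization(w):
    # Duval's algorithm: decreasing factorization into Lyndon words
    i, n, out = 0, len(w), []
    while i < n:
        j, k = i + 1, i
        while j < n and w[k] <= w[j]:
            k = i if w[k] < w[j] else k + 1
            j += 1
        while i <= k:
            out.append(w[i:i + j - k])
            i += j - k
    return out

def mul(p, q):
    # concatenation product of noncommutative polynomials
    r = {}
    for u, a in p.items():
        for v, b in q.items():
            r[u + v] = r.get(u + v, 0) + a * b
    return {u: c for u, c in r.items() if c}

def bracket(p, q):
    r = dict(mul(p, q))
    for u, c in mul(q, p).items():
        r[u] = r.get(u, 0) - c
    return {u: c for u, c in r.items() if c}

def P(w):
    """Hall polynomial P_w for the Lyndon Hall set."""
    f = lyndon_factorization(w)
    if len(f) > 1:                       # decreasing product of Lyndon factors
        out = {"": 1}
        for l in f:
            out = mul(out, P(l))
        return out
    if len(w) == 1:
        return {w: 1}
    v = min(w[i:] for i in range(1, len(w)))   # smallest proper suffix
    return bracket(P(w[:len(w) - len(v)]), P(v))

# check a_1 a_2 a_3 = sum over sigma of P_{a_sigma(1) a_sigma(2) a_sigma(3)}
total = {}
for p in permutations("abc"):
    for u, c in P("".join(p)).items():
        total[u] = total.get(u, 0) + c
assert {u: c for u, c in total.items() if c} == {"abc": 1}
```

For n = 2 the identity reads ab = P_{ab} + P_{ba} = [a, b] + ba.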
5.6.3 Another approach to Hall sets
Example 5.15 is directly related to the Lazard elimination process (see Section 0.3). It allows a quite different approach to Hall sets. Let H be a Hall set in A*. If A has a greatest element z, then by Theorem 5.16(iii) the set H ∩ B* is a Hall set in B*, with B = (A∖z)z*. Similarly, if (azⁿ) denotes the tree (···(a, z), ···, z), then the homomorphism of magmas h: M(B) → M(A), sending azⁿ ∈ B onto (azⁿ), is injective, and the set h⁻¹(H) is a Hall set in M(B). This allows us to prove all the results on Hall trees and words.

Let us sketch, for instance, the proof of Corollary 4.4. Let w be a word in A* and let z be the greatest letter in w. We may replace A by a finite alphabet with greatest element z. Let B be as above. Then we may write w = zⁿx₁···xₖ with xᵢ ∈ B. The word x₁···xₖ in B* is strictly shorter (as a word on the alphabet B) than w; hence we conclude by induction that w has a decreasing factorization into foliages of Hall trees. To prove uniqueness, one notes that n is necessarily unique (it is the number of z's at the beginning of w, because each x in B begins with a letter distinct from z); then one uses induction by passing to the word x₁···xₖ. To prove Theorem 4.9, one proves first that, with the previous notation, one has the isomorphism of K-modules

K[z] ⊗ K⟨B⟩ → K⟨A⟩,  zⁿ ⊗ x₁···xₖ ↦ zⁿ P_{x₁} ··· P_{xₖ}  (xᵢ ∈ B),

where, for x = azʳ, Pₓ is the Lie polynomial [···[[a, z], z], ···, z], with r z's. Compare this with Theorem 0.6.
The factorization A* = z*((A∖z)z*)* is a special case of a bisection. Viennot (1978) gives similar isomorphisms in the case of general bisections (see also Lothaire 1983, Proposition 5.3.11) and, more generally, for left regular factorizations of the free monoid. In particular, ℒ(a, b) has a canonical decomposition as the (module) direct sum of the Lie subalgebras ℒᵣ (r ∈ ℚ₊ ∪ {∞}), where ℒᵣ is the space generated by the homogeneous Lie polynomials P such that degₐ(P)/deg_b(P) = r (see also Viennot 1974).
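The decomposition underlying the bisection A* = z*((A∖z)z*)* is easy to compute: strip the leading run of z's, then cut after each maximal run of z's. A minimal sketch of the (unique) decomposition; the function name and the choice of alphabet {a, b, z}, with z greatest, are ours:

```python
def bisection(w, z="z"):
    """Decompose w uniquely as z^n x_1 ... x_k with each x_i in
    B = (A \\ z) z*: one letter != z followed by a run of z's."""
    n = 0
    while n < len(w) and w[n] == z:
        n += 1
    blocks, i = [], n
    while i < len(w):
        j = i + 1
        while j < len(w) and w[j] == z:
            j += 1
        blocks.append(w[i:j])
        i = j
    return n, blocks

n, blocks = bisection("zzazbzza")
assert (n, blocks) == (2, ["az", "bzz", "a"])
assert "z" * n + "".join(blocks) == "zzazbzza"   # reconstruction = uniqueness
```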
5.7 NOTES
Lyndon words appear in the work of Lyndon (1954, 1955a), and they are used by Chen et al. (1958) to construct basic commutators of the free group; Viennot (1978) and Lothaire (1983) also construct the Lyndon basis of the free Lie algebra; an equivalent basis had been constructed by Shirshov (1958). It was Viennot who showed that the Lyndon basis is a particular Hall basis, once Hall sets have been properly generalized (see Section 4.5). Theorem 5.3 is due to Schützenberger (1958), with improvements from Melançon and Reutenauer (1989) and Melançon (1991). Note that condition (iii) holds in any enveloping algebra, as the proof shows. Theorem 5.7 is from Reutenauer (1990). The assertion on the length in Theorem 5.13(i) is due to Viennot (1978), and Theorem 5.13(ii) was proved by Duval (1983) in the case of Lyndon words. All other results of Section 5.4 are due to Melançon (1992), who also proved that the order obtained in (5.4.1) in the case of Lyndon words is the alphabetical order. Theorem 5.16 follows an idea of Schützenberger (1958), and part (vi) is especially due to him (1986). Theorem 5.17 is due to Eastman (1965); the proof by Scholtz (1969) of this result consists in constructing a special Hall set, so we have included it in this book (see also the book on codes by Berstel and Perrin (1985, Theorem 5.3.8)). The problem of factorizing matrices of the form 1 − a − b − ··· also leads to the construction of special Hall sets, by Good (1971), who calls them standard lists.
6 Shuffle algebra and subwords
The shuffle algebra is a free commutative algebra over the set of Lyndon words; this result is presented in Section 6.1, together with a precise identity on the shuffle product of Lyndon words, which implies that there is actually a canonical structure of algebra of divided powers. In Section 6.2 a remarkable presentation of the shuffle algebra is given; the generators are the nonempty words, and the relations the nontrivial shuffle products. In Section 6.3 we introduce subword functions on the free group, the Magnus transformation of the free group, the algebra structure on the module of subword functions, and the fact that this algebra is generated by the particular subword functions corresponding to Lyndon words. The main tools are the concept of representative, or recognizable, functions on the free group, and the infiltration product of Chen, Fox and Lyndon. Section 6.4 presents the commutator calculus of P. Hall, and its generalizations. There are many results involving the lower central series of the free group, the Magnus transformation, and the algebra of subword functions.
6.1 THE FREE GENERATING SET OF LYNDON WORDS
Recall that the shuffle product ш was defined in Section 1.4, and that ℚ⟨A⟩ with this product is a free commutative algebra (Corollary 5.5). We show here that the set of Lyndon words is a free generating set of the shuffle algebra.

Let A be totally ordered and put on A* the alphabetical order. Recall that a Lyndon word is a word w in A⁺ such that

∀u, v ∈ A⁺,  w = uv ⇒ w < v

(see Section 5.1). This means that w is smaller than all its nontrivial proper right factors.

Recall that each word w in A* has a unique decreasing factorization into Lyndon words; this is a consequence of Theorem 5.1 (the set of Lyndon words is a Hall set) and of Corollary 4.7 (each word has a unique decreasing factorization into Hall words). We assume that K is a ℚ-algebra.

Theorem 6.1 (i) The shuffle algebra K⟨A⟩ is freely generated by the Lyndon words.

(ii) For each word w, written as a decreasing product of Lyndon words, w = l₁^{i₁}···lₖ^{iₖ}
(l₁ > ··· > lₖ; i₁, ..., iₖ ≥ 1), one has

(1/(i₁!···iₖ!)) l₁^{ш i₁} ш ··· ш lₖ^{ш iₖ} = w + Σ_{u<w} αᵤ u,

for certain coefficients αᵤ.

[…] (l₁ > ··· > lₖ; i₁, ..., iₖ ≥ 1). Since w is not Lyndon, we have i₁ + ··· + iₖ ≥ 2. Let w = xy, with x = l₁: this is a nontrivial factorization of w. Then each shuffle u of x and y appears in the polynomial l₁^{ш i₁} ш ··· ш lₖ^{ш iₖ}. By Theorem 6.1(iii), this implies that u ≤ w. □
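Theorem 6.1(ii) can be checked on a small instance: for w = abab = (ab)², one finds (1/2!)(ab ш ab) = abab + 2·aabb, where aabb < abab alphabetically. A sketch; the recursion and names are ours, and Fraction keeps the arithmetic exact:

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def shuffle(u, v):
    """Shuffle product u ш v as a dict word -> coefficient, via the
    recursion (ua) ш (vb) = ((u ш vb))a + ((ua ш v))b."""
    if not u:
        return {v: 1}
    if not v:
        return {u: 1}
    out = {}
    for w, c in shuffle(u[:-1], v).items():
        out[w + u[-1]] = out.get(w + u[-1], 0) + c
    for w, c in shuffle(u, v[:-1]).items():
        out[w + v[-1]] = out.get(w + v[-1], 0) + c
    return out

# (1/2!) ab ш ab = abab + 2 aabb, and every term is <= w = abab
p = {w: Fraction(c, 2) for w, c in shuffle("ab", "ab").items()}
assert p == {"abab": 1, "aabb": 2}
assert all(u <= "abab" for u in p)
```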
6.2 PRESENTATION OF THE SHUFFLE ALGEBRA
Here, K is still a ℚ-algebra. Define, for each nonempty word w in A*, a variable x_w. Denote by X the set of these variables, and consider the algebra of commutative polynomials K[X]. We have a linear mapping

ψ: K⟨A⟩ → K[X],  w ↦ x_w (w ≠ 1),  1 ↦ 0.

Let 𝒥 be the ideal of K[X] generated by the polynomials

ψ(u ш v),  u, v ∈ A⁺.

We shall see that the shuffle algebra K⟨A⟩ is isomorphic with K[X]/𝒥. Actually, the isomorphism may be precisely described, by the use of the logarithm.

As in Section 1.5, consider the complete tensor product

𝒜 = K⟨⟨A⟩⟩ ⊗̂ K⟨⟨A⟩⟩,

with the shuffle product on the left of ⊗ and the concatenation product on the right.
Then, as in Section 3.2, consider the element log(Σ_{u∈A*} u ⊗ u); we saw there that

log( Σ_{u∈A*} u ⊗ u ) = Σ_w w ⊗ π₁(w),

where π₁ is a degree-preserving linear endomorphism of K⟨A⟩ whose image is in ℒ(A) (Lemma 3.8). The adjoint endomorphism π₁* of π₁ is completely defined by the equality

Σ_{w∈A*} π₁*(w) ⊗ w = log( Σ_{u∈A*} u ⊗ u );   (6.2.1)

see (1.5.9).
Theorem 6.3 Considering K⟨A⟩ with its shuffle structure, let f: K[X] → K⟨A⟩ be the K-algebra homomorphism defined by f(x_w) = π₁*(w), for any nonempty word w. Then Ker(f) = 𝒥, f is surjective, and K⟨A⟩ ≅ K[X]/𝒥.

Proof Let u, v be nonempty words. Then by Theorem 3.1(iv) we have (π₁(w), u ш v) = 0 for any word w, because π₁(w) is a Lie element. Hence, by duality, (π₁*(u ш v), w) = 0, which shows that π₁*(u ш v) = 0. Observe that π₁* = f ∘ ψ; hence, we deduce f ∘ ψ(u ш v) = 0, and 𝒥 ⊆ Ker f.

For the opposite inclusion, define L = K[X]/𝒥 and let ν: K[X] → L be the canonical projection. We show that the series

Σ_{w∈A*} ν∘ψ(w) w = Σ_{w∈A*} ν(x_w) w   (6.2.2)
is a Lie series in L. Indeed, from Lemma 1.5, we have the identity μ ∘ δ̄(P) = r(P), for any polynomial P. We need the following facts: r(P) is a Lie polynomial, δ̄ = (id ⊗ α) ∘ δ, α(1) = 1, δ(P) = Σ_{u,v∈A*} (P, u ш v) u ⊗ v, μ(u ⊗ v) = |u| uv, and all these mappings are linear, homogeneous, and degree-preserving; see Section 1.3 and Proposition 1.8. The above identity means that, for any word w,

Σ_{u,v} (w, u ш v) |u| u α(v) = r(w).

In the sum, separate the term corresponding to v = 1, u = w. Thus we obtain

|w| w = r(w) − Σ_{u,v≠1} (w, u ш v) |u| u α(v),

since the terms with u = 1 vanish, because |1| = 0. The series (6.2.2) is therefore equal to

Σ_{w≠1} ν∘ψ(w) |w|⁻¹ ( r(w) − Σ_{u,v≠1} (w, u ш v) |u| u α(v) ) = Σ_{w≠1} ν∘ψ(w) |w|⁻¹ r(w) − T,

where T is

T = Σ_{u,v,w≠1} ν∘ψ(w) |w|⁻¹ (w, u ш v) |u| u α(v) = Σ_{u,v≠1} |u| u α(v) |uv|⁻¹ ν∘ψ( Σ_{w≠1} (w, u ш v) w ),

because (w, u ш v) ≠ 0 ⇒ |w| = |uv|. Now, the inner summation is equal to u ш v, so that, by definition of 𝒥 and ν, the series T is equal to 0. This shows that (6.2.2) is a Lie series. Hence, by Theorem 3.2(iii), its exponential is
6.3
Subword functions
131
defined by a shuflle homomorphism [3: K (A) —> L, that is,
Z v(xw)w=log< Z B(u)u>. waél
116A“
Applying the homomorphism [3 ® id: u 69 v l—> [3(a) v, a! —> L a”
p ’
the ordinary binomial coefficient. A function A* —> Z of the form w |—->
(w) u
,
is called a subword function. Denoting by 4* the characteristic series of A*,
i.e. 4* = 205“ 1), it is easy to verify that the subword functions are defined
by the shuffle product:

u ш A* = Σ_{w∈A*} (w ¦ u) w.

Similarly, a simple verification shows that, if w = a₁a₂···aₙ (n ≥ 0, aᵢ ∈ A), then

(1 + a₁)(1 + a₂)···(1 + aₙ) = Σ_{u∈A*} (w ¦ u) u.
We call Magnus transformation the homomorphism M from A* into the multiplicative monoid ℤ⟨⟨A⟩⟩, defined by M(a) = 1 + a for any letter a in A. Then we have M(w) = Σ_{u∈A*} (w ¦ u) u, for any word w.

Actually, let F(A) denote the free group generated by A; it contains A* as a submonoid. Since the series 1 + a are invertible in ℤ⟨⟨A⟩⟩, the Magnus transformation may be extended to a group homomorphism, still denoted by M:

M: F(A) → ℤ⟨⟨A⟩⟩,  a ↦ 1 + a.

For an element g of the group F(A) and a word u in A*, we denote by (g ¦ u) the coefficient of u in M(g). Thus

M(g) = Σ_{u∈A*} (g ¦ u) u.   (6.3.2)
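On A*, the coefficient of u in M(w) — the number of occurrences of u as a subword of w — can be computed either by expanding (1 + a₁)···(1 + aₙ), or by the classical subsequence-counting recursion over prefixes. A sketch comparing the two; function names are ours:

```python
def subword_count(w, u):
    """Number of occurrences of u as a subword of w, by dynamic
    programming: c[j] counts occurrences of u[:j] in the prefix read."""
    c = [1] + [0] * len(u)
    for a in w:
        for j in range(len(u), 0, -1):   # backwards, so each a is used once
            if u[j - 1] == a:
                c[j] += c[j - 1]
    return c[len(u)]

def magnus(w):
    """M(w) = (1 + a_1)...(1 + a_n), as a dict word -> coefficient."""
    m = {"": 1}
    for a in w:
        new = dict(m)                    # the '1' part of (1 + a)
        for u, c in m.items():
            new[u + a] = new.get(u + a, 0) + c
        m = new
    return m

m = magnus("aba")
assert m["a"] == 2 and m["ab"] == 1 and m["ba"] == 1 and m["aa"] == 1
assert all(c == subword_count("aba", u) for u, c in m.items())
```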
We still call a subword function a function F(A) → ℤ of the form g ↦ (g ¦ u). We shall see that these functions have close connections with Lyndon words and the free Lie algebra.

We call space of subword functions the subspace, over ℚ, spanned by all the subword functions on F(A). If α, β are two functions F(A) → ℚ, their (pointwise) product is the function

αβ: F(A) → ℚ,  (αβ)(g) = α(g)β(g).

This is the way the ℚ-algebra of functions on F(A) is defined. We suppose that A is totally ordered and that A* gets the alphabetical order; Lyndon words are defined in Section 5.1 (see also Section 6.1).

Theorem 6.4 The space of subword functions on the free group is a subalgebra of the ℚ-algebra of functions on the free group. It is generated by the particular subword functions g ↦ (g ¦ l), l a Lyndon word.

We say that a function α: F(A) → ℚ is representative (or recognizable) if there exists a finite-dimensional vector space E over ℚ, a right action of F(A) on E, a vector e in E, and a linear function f on E such that, for any g in F(A),
α(g) = f(eg).   (6.3.3)

Lemma 6.6 (i) Representative functions form a subalgebra of the ℚ-algebra of functions on F(A).
(ii) If a representative function vanishes on A*, then it is the zero function.
(iii) Each subword function is representative.
Proof
(i) If $\alpha', \alpha''$ are as in (6.3.3), then $(\alpha' + \alpha'')(g) = f(eg)$, where $E = E' \oplus E''$ (with the action $(x' + x'')g = x'g + x''g$), $e = e' + e''$, and $f(x' + x'') = f'(x') + f''(x'')$. Moreover, we have $(\alpha'\alpha'')(g) = f(eg)$, with $E = E' \otimes E''$ (with action $(x' \otimes x'')g = (x'g) \otimes (x''g)$), $e = e' \otimes e''$, and $f(x' \otimes x'') = f'(x')f''(x'')$.
(ii) Each element $g$ in $F(A)$ may be written $g = u_0 v_1^{-1} u_1 \cdots u_{k-1} v_k^{-1} u_k$, with $k \ge 0$, $u_i, v_i \in A^*$. We prove that $\alpha(g) = 0$ by induction on $k$. If $k = 0$, it is true by assumption. Let $k \ge 1$; let $n$ be the dimension of $E$, where $E, f, e$ are as above. By the Cayley–Hamilton theorem applied to the endomorphism $x \mapsto x v_1^{-1}$ of $E$, we have, for some rational numbers $r_1, \ldots, r_n$,
$$x v_1^{-n} = r_1 x v_1^{-n+1} + r_2 x v_1^{-n+2} + \cdots + r_n x,$$
for any vector $x$ in $E$. With $x = e u_0 v_1^{n-1}$, this is
$$e u_0 v_1^{-1} = r_1 e u_0 + r_2 e u_0 v_1 + \cdots + r_n e u_0 v_1^{n-1}.$$
Multiplying on the right by $u_1 \cdots v_k^{-1} u_k$ and taking the image under $f$, we obtain
$$\alpha(g) = r_1 \alpha(u_0 u_1 \cdots v_k^{-1} u_k) + r_2 \alpha(u_0 v_1 u_1 \cdots v_k^{-1} u_k) + \cdots + r_n \alpha(u_0 v_1^{n-1} u_1 \cdots v_k^{-1} u_k).$$
By induction on $k$, the right-hand side is 0. Thus $\alpha(g) = 0$.
(iii) Let $\alpha(g) = \binom{g}{u}$ for some word $u$ in $A^*$. Let $E$ be the finite-dimensional subspace of $\mathbb{Q}\langle A\rangle$ spanned by the set $P$ of words which are left factors of $u$. Let $\nu\colon \mathbb{Q}\langle\langle A\rangle\rangle \to E$ be defined by $\nu(S) = \sum_{v \in P} (S, v)v$. Note that $\nu(ST) = \nu(\nu(S)T)$: indeed, $\nu(ST) = \sum_{v \in P} (ST, v)v = \sum_{v \in P} \sum_{v = xy} (S, x)(T, y)v$. Since $xy \in P \Rightarrow x \in P$, this is equal to $\sum_{v \in P} \sum_{v = xy} (\nu(S), x)(T, y)v = \sum_{v \in P} (\nu(S)T, v)v = \nu(\nu(S)T)$.
Define a right action of $F(A)$ on $E$ by the formula $Xg = \nu(X M(g))$ for any $X$ in $E$, $g$ in $F(A)$. This is indeed an action, because
$$(Xg)h = \nu(\nu(XM(g))M(h)) = \nu(XM(g)M(h)) = \nu(XM(gh)) = X(gh).$$
Let $e = 1 \in E$, and $f\colon E \to \mathbb{Q}$, $X \mapsto (X, u)$. Then by (6.3.2), $\alpha(g) = (M(g), u)$. Since $u$ is in $P$, this is equal to $(\nu(M(g)), u) = (\nu(eM(g)), u) = f(eg)$. Hence, $\alpha$ is a representative function. □
Recall the notation $w|_I$, for a word $w$ of length $n$ and a subset $I$ of $[n] = \{1, \ldots, n\}$ (see Section 1.4). Given $p$ words $u_1, \ldots, u_p$ of respective lengths $n_1, \ldots, n_p$, their infiltration product, denoted by $u_1 \uparrow \cdots \uparrow u_p$, is the polynomial
$$u_1 \uparrow \cdots \uparrow u_p = \sum w(I_1, \ldots, I_p),$$
where the sum is extended over all $n \le n_1 + \cdots + n_p$ and all $p$-tuples of subsets of $[n]$ such that $[n] = \bigcup_j I_j$, $|I_j| = n_j$ for $j = 1, \ldots, p$, and where $w = w(I_1, \ldots, I_p)$ is defined by $w|_{I_j} = u_j$, for $j = 1, \ldots, p$. The infiltration differs from the shuffle in that there may be overlappings between the $u_j$ when they appear as subwords of $w$ (we do not require the $I_j$ to be pairwise disjoint). We call infiltration of $u_1, \ldots, u_p$ a word appearing in their infiltration product. Each shuffle of $u_1, \ldots, u_p$ is an infiltration, with the same multiplicity, and each infiltration of $u_1, \ldots, u_p$ is either a shuffle, or of length $< n_1 + \cdots + n_p$.

[…] the function $F(A) \to \mathbb{Q}$ defined by
$$\alpha(g) = \binom{g}{x}\binom{g}{y} - \sum_{u \in A^*} (x \uparrow y, u)\binom{g}{u}.$$
By Lemma 6.6(i) and (iii), it is a representative function. It vanishes on $A^*$ by Lemma 6.7. So, by Lemma 6.6(ii), it is the zero function. In other words, (6.3.4) holds for any element $g$ of the free group $F(A)$.
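On words, identity (6.3.4) can be checked mechanically. A sketch (illustrative Python; the infiltration of two words is computed by the recursion $au \uparrow bv = a(u \uparrow bv) + b(au \uparrow v) + \delta_{a,b}\, a(u \uparrow v)$, which agrees with the definition above for two factors):

```python
from functools import lru_cache
from collections import Counter

@lru_cache(maxsize=None)
def infiltration(x, y):
    # the polynomial x up-arrow y, as a Counter word -> coefficient
    if not x:
        return Counter({y: 1})
    if not y:
        return Counter({x: 1})
    result = Counter()
    for w, c in infiltration(x[1:], y).items():
        result[x[0] + w] += c
    for w, c in infiltration(x, y[1:]).items():
        result[y[0] + w] += c
    if x[0] == y[0]:
        for w, c in infiltration(x[1:], y[1:]).items():
            result[x[0] + w] += c
    return result

def subword_coeff(w, u):
    counts = [1] + [0] * len(u)
    for c in w:
        for i in range(len(u) - 1, -1, -1):
            if u[i] == c:
                counts[i + 1] += counts[i]
    return counts[-1]

# identity (6.3.4) on words: C(w,x) C(w,y) = sum_u (x^y, u) C(w,u)
for w in ["aabab", "abbba"]:
    for x, y in [("a", "a"), ("ab", "a"), ("ab", "ba")]:
        lhs = subword_coeff(w, x) * subword_coeff(w, y)
        rhs = sum(c * subword_coeff(w, u) for u, c in infiltration(x, y).items())
        assert lhs == rhs
```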
This shows that the space of subword functions is closed under pointwise product, and proves the first assertion of the theorem. We show now, by induction, that each function $\binom{g}{w}$ is in the subalgebra $M$ generated by the functions $\binom{g}{l}$, $l$ Lyndon word. There is nothing to prove if $w$ is a Lyndon word, or if $w = 1$ (because $\binom{g}{1} = 1$). Suppose that $w$ is not a Lyndon word. Then, by Corollary 6.2, there exists a nontrivial factorization $w = xy$ such that each shuffle $u$ of $x$ and $y$ is $\le w$. Let $k = (x \mathbin{⧢} y, w) > 0$. Then, by the first part of the proof,
$$\binom{g}{x}\binom{g}{y} = k\binom{g}{w} + \sum_{u < w} (x \uparrow y, u)\binom{g}{u}.$$

[…] To simplify, write $h_i = h$, $h_{i+1} = k$. We have several cases: the simplest one is when $|hk| > N$; then
$h^{\pm 1}, k^{\pm 1}$ is replaced by $k^{\pm 1}, h^{\pm 1}$. $\qquad$ (6.4.5)

For the remaining cases, we suppose $|hk| \le N$. Then

$h, k$ is replaced by $k, h, hk$; $\qquad$ (6.4.6)

$h^{-1}, k$ is replaced by $k, (hk)^{-1}, h^{-1}$. $\qquad$ (6.4.7)

For the remaining cases, let $e$ (respectively $o$) be the greatest even (respectively odd) integer such that $|hk^e|$ (respectively $|hk^o|$) is $\le N$. Then

$h, k^{-1}$ is replaced by $k^{-1}, h, hk^2, \ldots, hk^e, (hk^o)^{-1}, \ldots, (hk)^{-1}$; $\qquad$ (6.4.8)

$h^{-1}, k^{-1}$ is replaced by $k^{-1}, hk, \ldots, hk^o, (hk^e)^{-1}, \ldots, (hk^2)^{-1}, h^{-1}$. $\qquad$ (6.4.9)
Lemma 6.9
(i) The sequence $s'$ is standard.
(ii) There is no infinite chain $s_0 \to s_1 \to \cdots \to s_n \to \cdots$.

Proof. We have $h < k$, because $h, k$ is a rise of $s$. Moreover, if $h$ is not in $A$, then $h'' \ge k$, because $s$ is standard. Also, by (4.1.10), $k'' > k$, hence $k'' > h$. By (4.1.11), $hk$ is a Hall word, and inductively, $hk^r$ is a Hall word for each $r \ge 1$, $(hk^r)'' = k$, and $hk^r < k$ by (4.1.10). We use these facts without reference in the sequel of this proof.
The sequence $s$ is the concatenation of the three sequences $u$, $(h^{\pm 1}, k^{\pm 1})$ and $v$; the sequence $s'$ is the concatenation of $u$, $x$, and $v$, where $x$ is one of the sequences replacing $(h^{\pm 1}, k^{\pm 1})$ and given by eqns (6.4.5)–(6.4.9). Note
that if $w$ is a word such that $w^{\pm 1}$ appears in $x$, then
$$w = h, k, \text{ or } hk^r; \quad k \ge w; \text{ and if } w \notin A,\ w'' \ge k. \qquad (6.4.10)$$
(i) In order to show that $s'$ is standard, we have to verify that if $l, m$ are two words in $s'$ with $l \notin A$ and $m$ at the right of $l$ in $s'$, then $l'' \ge m$. This is clear if $l$ and $m$ are in the sequences $u$ or $v$, because $s$ is standard. So we have three cases:
(a) $l$ is in $u$, $m$ is in $x$: then $l'' \ge k$ because $s$ is standard, hence $l'' \ge m$ by (6.4.10);
(b) $l, m$ are both in $x$: then $l'' \ge k \ge m$ by (6.4.10);
(c) $l$ is in $x$ and $m$ is in $v$: then $k \ge m$ by (6.4.4) because $h, k$ is a legal rise, hence $l'' \ge m$ by (6.4.10).
(ii) We may assume that the alphabet $A$ is finite. Then $H_{\le N}$ is finite. Let $E = \{(h, k) \mid h, k \in H_{\le N},\ h < k\}$. Then $E$ is finite. Order $E$ by $(h_1, k_1) > (h_2, k_2)$ if either $k_1 < k_2$, or $k_1 = k_2$ and $h_1 >_{\deg} h_2$, where $h_1 >_{\deg} h_2$ means either $|h_1| > |h_2|$, or $|h_1| = |h_2|$ and $h_1 > h_2$. Now order $\mathbb{N}^E$ lexicographically: then there is no infinite strictly decreasing chain in $\mathbb{N}^E$. We show the existence of a mapping $v$ from the set of standard sequences into $\mathbb{N}^E$ such that $s \to s'$ implies $v(s) > v(s')$. This will prove (ii).
Define $v(s)_{(h,k)}$ to be the number of subsequences $(h, k)$ in $s$; in other words, it is the number of inversions $(h, k)$ in $s$ (note that $h < k$ because $(h, k) \in E$). In view of eqns (6.4.5)–(6.4.9), we have $v(s')_{(h,k)} = v(s)_{(h,k)} - 1$. Suppose that $l < m$: we show that for each inversion $(l, m)$ in $s'$, not already in $s$, we have $(l, m) > (h, k)$. This will imply $v(s') < v(s)$. We have three cases.
(a) $l$ is in $u$, $m$ is in $x$: then, since the inversion was not in $s$, we have by (6.4.10) $m = hk^r$, $r \ge 1$; thus $m < k$ and $(l, m) > (h, k)$.
(b) $l, m$ are both in $x$: then, since in each replacing sequence (6.4.5)–(6.4.9), $k$ appears only at the beginning, we have $m \ne k$, hence $m < k$ by (6.4.10). Thus $(l, m) > (h, k)$.
(c) $l$ is in $x$ and $m$ is in $v$: since $h, k$ is a legal rise of $s$, we have by (6.4.4) $k \ge m$. If $m < k$, then $(l, m) > (h, k)$. If $m = k$, then $l < k$, hence by (6.4.10) $l = hk^r$, $r \ge 0$; moreover, $r \ge 1$, otherwise the inversion $(l, m)$ is already in $s$; hence $|l| > |h|$, which implies $l >_{\deg} h$ and finally $(l, m) > (h, k)$. □
Recall that in Section 4.2 we defined a homogeneous Lie polynomial $P_h$ of degree $|h|$ for each Hall word $h$, and that the family $(P_h)_{h \in H}$ forms a basis of the free Lie algebra. If $S$ is a series in $\mathbb{Z}\langle\langle A\rangle\rangle$ and $P$ a polynomial in $\mathbb{Z}\langle A\rangle$, we write
$$S = P + O(A^{n+1})$$
to express the fact that $S - P$ is a series having no term of degree $\le n$.
6.4    The lower central series of the free group

Lemma 6.10
(i) For $a, b$ in $A$, one has
$$(1+a)^{-1}(1+b)^{-1}(1+a)(1+b) = 1 + \sum_{i,j \ge 0} (-1)^{i+j} a^i b^j [a, b]. \qquad (6.4.11)$$
(ii) For each Hall word $h$ of length $n$, one has
$$M((h)) = 1 + P_h + O(A^{n+1}).$$
(iii) If $g \in F_N(A)$, then $M(g) = 1 + O(A^N)$.
(iv) For $g \in F_N(A)$, let $M(g) = 1 + P(g) + O(A^{N+1})$. Then $g \mapsto P(g)$ is a homomorphism from $F_N(A)$ into the additive group of homogeneous Lie polynomials of degree $N$ over $\mathbb{Z}$.

Proof
(i) The left-hand side of (6.4.11) is equal to
$$\Big(\sum_{i,j \ge 0} (-1)^{i+j} a^i b^j\Big)(1 + b + a + ab) = \sum_{i,j \ge 0} (-1)^{i+j} a^i b^j + \sum_{i,j \ge 0} (-1)^{i+j} a^i b^{j+1} + \sum_{i,j \ge 0} (-1)^{i+j} a^i b^j a + \sum_{i,j \ge 0} (-1)^{i+j} a^i b^j ab.$$
The sum of the first two summations is equal to $\sum_{i \ge 0} (-1)^i a^i$. The third summation may be rewritten $\sum_{i \ge 0} (-1)^i a^{i+1} + \sum_{i,j \ge 0} (-1)^{i+j+1} a^i b^j ba$. Hence, the whole sum is
$$\sum_{i \ge 0} (-1)^i a^i + \sum_{i \ge 0} (-1)^i a^{i+1} + \sum_{i,j \ge 0} (-1)^{i+j+1} a^i b^j ba + \sum_{i,j \ge 0} (-1)^{i+j} a^i b^j ab = 1 + \sum_{i,j \ge 0} (-1)^{i+j} a^i b^j (ab - ba),$$
which is as required.
(ii) Let $h = h'h''$ be a Hall word of length $n \ge 2$, written in standard factorization. Then, by induction, we have $M((h')) = 1 + P_{h'} + O(A^{n'+1})$ and $M((h'')) = 1 + P_{h''} + O(A^{n''+1})$, where $n' = |h'|$, $n'' = |h''|$ and $n = n' + n''$. Then,
$$M((h)) = M(((h'), (h''))) = M((h')^{-1}(h'')^{-1}(h')(h'')) = M((h'))^{-1}M((h''))^{-1}M((h'))M((h''))$$
$$= (1 + P_{h'} + O(A^{n'+1}))^{-1}(1 + P_{h''} + O(A^{n''+1}))^{-1}(1 + P_{h'} + O(A^{n'+1}))(1 + P_{h''} + O(A^{n''+1})).$$
By (6.4.11), this is equal to
$$1 + \sum_{i,j \ge 0} (-1)^{i+j} (P_{h'} + O(A^{n'+1}))^i (P_{h''} + O(A^{n''+1}))^j \, [P_{h'} + O(A^{n'+1}),\ P_{h''} + O(A^{n''+1})].$$
Observe that the term corresponding to $i, j$ involves only words of length $\ge in' + jn'' + n' + n''$. Hence, for $i$ or $j \ge 1$, this term is $O(A^{n+1})$. This implies that
$$M((h)) = 1 + [P_{h'} + O(A^{n'+1}),\ P_{h''} + O(A^{n''+1})] + O(A^{n+1}) = 1 + [P_{h'}, P_{h''}] + O(A^{n+1}) = 1 + P_h + O(A^{n+1}),$$
by definition of $P_h$ (Section 4.2).
(iii) This is clear for $N = 1$. For the general case, take $g \in F_i$, $h \in F_j$ with $i + j \ge N$. Then, induction and eqn (6.4.11) show that $M((g, h)) = 1 + O(A^N)$. Moreover, if $M(g_1)$, $M(g_2)$ are both of the form $1 + O(A^N)$, then so is $M(g_1 g_2^{-1})$. This proves (iii) by definition of $F_N$.
(iv) This is the consequence of a straightforward computation. □
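Identity (6.4.11) can be verified degree by degree with truncated noncommutative series. A sketch (illustrative Python; the truncation degree 5 is an arbitrary choice, and series are dictionaries mapping words over $\{a, b\}$ to integer coefficients):

```python
D = 5  # truncation degree

def mul(S, T):
    # product of truncated series (dicts: word -> coefficient)
    out = {}
    for u, c in S.items():
        for v, d in T.items():
            if len(u) + len(v) <= D:
                out[u + v] = out.get(u + v, 0) + c * d
    return {w: c for w, c in out.items() if c}

def inverse(S):
    # inverse of S = 1 + X as the geometric series sum_k (-X)^k, truncated
    X = {w: -c for w, c in S.items() if w}
    result, term = {"": 1}, {"": 1}
    for _ in range(D):
        term = mul(term, X)
        for w, c in term.items():
            result[w] = result.get(w, 0) + c
    return {w: c for w, c in result.items() if c}

A, B = {"": 1, "a": 1}, {"": 1, "b": 1}
lhs = mul(mul(mul(inverse(A), inverse(B)), A), B)

# right-hand side of (6.4.11): 1 + sum (-1)^(i+j) a^i b^j (ab - ba)
rhs = {"": 1}
for i in range(D):
    for j in range(D):
        for tail, sign in (("ab", 1), ("ba", -1)):
            w = "a" * i + "b" * j + tail
            if len(w) <= D:
                rhs[w] = rhs.get(w, 0) + (-1) ** (i + j) * sign
rhs = {w: c for w, c in rhs.items() if c}

assert lhs == rhs
assert lhs["ab"] == 1 and lhs["ba"] == -1  # degree-2 part is the bracket [a, b]
```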
Proof of Theorem 6.8. (i) Observe that if a standard sequence is not decreasing, then it has a legal rise, e.g. the right-most rise. This implies by Lemma 6.9 that each standard sequence may be rewritten, using the binary relation $\to$, into a decreasing sequence. In particular, this is the case for any sequence of letters, with exponents $\pm 1$. Now, define for the sequence $s$ in (6.4.3),
$$(s) = (h_1)^{\varepsilon_1} \cdots (h_n)^{\varepsilon_n}.$$
We show below that
$$\text{if } s \to s', \text{ then } (s) \equiv (s') \bmod F_{N+1}(A). \qquad (6.4.12)$$
This will imply the existence of the expansion (6.4.2). Eqn (6.4.12) is obvious by definition of $F_{N+1}$ when (6.4.5) is applied, because
$$uv = vu(u, v), \qquad (6.4.13)$$
hence $(h)^{\pm 1}(k)^{\pm 1} \equiv (k)^{\pm 1}(h)^{\pm 1} \bmod F_{N+1}$. When (6.4.6) is applied, (6.4.12) holds too, by (6.4.13). We also have
$$v(u, v)^{-1}u^{-1} = vv^{-1}u^{-1}vuu^{-1} = u^{-1}v,$$
which implies that (6.4.12) holds when (6.4.7) is applied. Now, by (6.4.1),
$$1 = (u, vv^{-1}) = (u, v^{-1})(u, v)((u, v), v^{-1}),$$
hence
$$(u, v^{-1}) = ((u, v), v^{-1})^{-1}(u, v)^{-1}. \qquad (6.4.14)$$
Writing $(uv^r)$ for $(\ldots((u, v), v), \ldots, v)$, we obtain from this identity, applied to $(uv^r)$ and $v$ instead of $u$ and $v$,
$$((uv^r), v^{-1}) = ((uv^{r+1}), v^{-1})^{-1}(uv^{r+1})^{-1}.$$
Hence, by (6.4.14) again,
$$(u, v^{-1}) = ((uv), v^{-1})^{-1}(u, v)^{-1} = (uv^2)((uv^2), v^{-1})(uv)^{-1} = (uv^2)((uv^3), v^{-1})^{-1}(uv^3)^{-1}(uv)^{-1}$$
$$= (uv^2)(uv^4)((uv^4), v^{-1})(uv^3)^{-1}(uv)^{-1} = \cdots = (uv^2)(uv^4) \cdots (uv^{2n})((uv^{2n}), v^{-1})(uv^{2n-1})^{-1} \cdots (uv^3)^{-1}(uv)^{-1}. \qquad (6.4.15)$$
Thus, by (6.4.13),
$$uv^{-1} = v^{-1}u(uv^2)(uv^4) \cdots (uv^{2n})((uv^{2n}), v^{-1})(uv^{2n-1})^{-1} \cdots (uv^3)^{-1}(uv)^{-1}.$$
The latter identity shows that (6.4.12) still holds when (6.4.8) is applied. For (6.4.9), one argues similarly, using the identity
$$u^{-1}v^{-1} = v^{-1}(u, v^{-1})^{-1}u^{-1} = v^{-1}(uv)(uv^3) \cdots (uv^{2n-1})((uv^{2n}), v^{-1})^{-1}(uv^{2n})^{-1} \cdots (uv^4)^{-1}(uv^2)^{-1}u^{-1},$$
by (6.4.15).
(ii) We prove at the same time uniqueness of the exponents $n_h(g)$, and the fact that $n_h$ belongs to the space of subword functions. This will be done by induction: assume the result is true for $N - 1$ ($N \ge 1$). We know that an expansion (6.4.2) exists. By definition, $F_{N+1}$ is contained in $F_N$, so that (6.4.2) implies
$$g \equiv \prod_{h \in H_{\le N-1}} (h)^{n_h(g)} \bmod F_N.$$
By induction on $N$, we know that the exponents $n_h(g)$ are unique and that the functions $g \mapsto n_h(g)$ belong to the space of subword functions, for $|h| \le N - 1$. Apply the Magnus transformation $M$ to (6.4.2), using the identity (Lemma 6.10(ii))
$$M((h)) = 1 + P_h + T_h,$$
where $T_h = O(A^{|h|+1})$. By Lemma 6.10(iii), the images under $M$ of both members of (6.4.2) coincide up to words of length $N$; let us express this with the symbol $\equiv_N$:
$$M(g) \equiv_N \prod_{h \in H_{\le N}} M((h))^{n_h} = \prod_{h \in H_{\le N}} (1 + P_h + T_h)^{n_h} = \prod_{h \in H_{\le N}} \sum_{i \ge 0} \binom{n_h}{i}(P_h + T_h)^i$$
$$= \sum_{k \ge 0} \sum \binom{n_{h_1}}{i_1} \cdots \binom{n_{h_k}}{i_k} (P_{h_1} + T_{h_1})^{i_1} \cdots (P_{h_k} + T_{h_k})^{i_k},$$
where the second sum is over all $h_1 > \cdots > h_k$ Hall words of length $\le N$, and integers $i_1, \ldots, i_k \ge 1$. Denote by $M(g)_N$ the homogeneous part of degree $N$ of $M(g)$. We obtain
$$M(g)_N = \sum_{|h| = N} n_h P_h + \sum_{k \ge 0} \sum \binom{n_{h_1}}{i_1} \cdots \binom{n_{h_k}}{i_k} \, *, \qquad (6.4.16)$$
where the second sum is subject to the further condition that $|h_1|, \ldots, |h_k| < N$, and where $*$ is a polynomial of degree $N$ depending solely on $h_1, \ldots, h_k$ and $i_1, \ldots, i_k$. We know by Theorem 4.9(i) that the polynomials $P_h$ are linearly independent. Fix $h_0$ of length $N$; then there exists a homogeneous polynomial $Q$ of degree $N$ such that $(P_{h_0}, Q) = 1$ and $(P_h, Q) = 0$ for any other $h$ of length $N$. Take the scalar product of $Q$ with the last identity. We obtain that $n_{h_0}$ is equal to $(M(g), Q)$ plus a linear combination of products $n_{h_1} \cdots n_{h_j}$, with $|h_i| < N$. Observe that in the previous computation, only $M(g)$ and the exponents $n_h$ depend on $g$: this proves uniqueness of the functions $n_h$. Moreover, by (6.3.2), induction, and Theorem 6.4, we deduce that $n_{h_0}$ belongs to the space of subword functions.
(iii) Equations (6.4.16) and (6.3.2) show that the subword functions belong to the algebra generated by the functions $n_h$. Hence, the latter functions generate the algebra of subword functions. They generate it freely, because for any finite subset $H' \subseteq H$, one can find, by (6.4.2), an element $g$ in $F(A)$ such that $n_h(g)$ takes, for $h$ in $H'$, arbitrary values in $\mathbb{Z}$. □
Remark 6.11. In practical computations, it is useful to add to eqns (6.4.5)–(6.4.9) the rule
$$h, h^{-1} \text{ or } h^{-1}, h \text{ is deleted in } s. \qquad (6.4.17)$$
Indeed, Lemma 6.9 still holds, as is easily verified (for (ii) one has to add to the vector v(s) the length of s as a new component, at the extreme right).
Moreover, it is clear that rule (6.4.17) does not change (s), with the notations of the proof of Theorem 6.8.
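Rules (6.4.5)–(6.4.9), together with (6.4.17), are mechanical enough to implement directly; Example 6.12 below carries out such a computation by hand. The sketch below (illustrative Python, not the book's notation; the Hall order of Example 6.12 is hard-coded as a list, Hall words are represented as plain strings so that $hk$ is string concatenation, and the right-most rise, which is always legal, is contracted at each step):

```python
N = 4
# Hall order of Example 6.12 below, greatest first: b > ab^2 > a^2b^2 > ab^3 > ab > a > a^2b
order = ["b", "abb", "aabb", "abbb", "ab", "a", "aab"]
rank = {w: i for i, w in enumerate(order)}  # smaller index = greater Hall word

def rewrite(seq):
    # seq: list of (Hall word, exponent +1/-1); applies (6.4.5)-(6.4.9) and (6.4.17)
    while True:
        cancelled = False
        for i in range(len(seq) - 1):        # rule (6.4.17): delete h, h^{-1}
            if seq[i][0] == seq[i + 1][0] and seq[i][1] == -seq[i + 1][1]:
                seq, cancelled = seq[:i] + seq[i + 2:], True
                break
        if cancelled:
            continue
        rises = [i for i in range(len(seq) - 1)
                 if rank[seq[i][0]] > rank[seq[i + 1][0]]]  # positions with h_i < h_{i+1}
        if not rises:
            return seq
        pos = rises[-1]                      # right-most rise
        (h, eh), (k, ek) = seq[pos], seq[pos + 1]
        if len(h + k) > N:                                   # (6.4.5)
            x = [(k, ek), (h, eh)]
        elif (eh, ek) == (1, 1):                             # (6.4.6)
            x = [(k, 1), (h, 1), (h + k, 1)]
        elif (eh, ek) == (-1, 1):                            # (6.4.7)
            x = [(k, 1), (h + k, -1), (h, -1)]
        else:
            e = max(j for j in range(0, N + 1, 2) if len(h) + j * len(k) <= N)
            o = max(j for j in range(1, N + 1, 2) if len(h) + j * len(k) <= N)
            if eh == 1:                                      # (6.4.8)
                x = ([(k, -1), (h, 1)]
                     + [(h + k * j, 1) for j in range(2, e + 1, 2)]
                     + [(h + k * j, -1) for j in range(o, 0, -2)])
            else:                                            # (6.4.9)
                x = ([(k, -1)]
                     + [(h + k * j, 1) for j in range(1, o + 1, 2)]
                     + [(h + k * j, -1) for j in range(e, 1, -2)]
                     + [(h, -1)])
        seq = seq[:pos] + x + seq[pos + 2:]

# g = b a b^{-1}, as in Example 6.12; the result is the decreasing sequence
result = rewrite([("b", 1), ("a", 1), ("b", -1)])
assert result == [("abb", 1), ("aabb", 1), ("abbb", -1),
                  ("ab", -1), ("a", 1), ("aab", -1)]
```

The final assertion reproduces the congruence obtained in Example 6.12: $bab^{-1} \equiv (ab^2)(a^2b^2)(ab^3)^{-1}(ab)^{-1}a(a^2b)^{-1} \bmod F_5$.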
Example 6.12. Take the Hall set of Example 4.6, $g = bab^{-1}$, and $N = 4$. We
only need to know the inequalities $b > ab^2 > a^2b^2 > ab^3 > ab > a > a^2b$. Then, by using at each step the right-most rise, we have
$$(b, a, b^{-1}) \to (b, b^{-1}, a, ab^2, (ab^3)^{-1}, (ab)^{-1}) \quad \text{by (6.4.8)}$$
$$\to (a, ab^2, (ab^3)^{-1}, (ab)^{-1}) \quad \text{by (6.4.17)}$$
$$\to (ab^2, a, a^2b^2, (ab^3)^{-1}, (ab)^{-1}) \quad \text{by (6.4.6)}$$
$$\to (ab^2, a^2b^2, (ab^3)^{-1}, a, (ab)^{-1}) \quad \text{by (6.4.5)}$$
$$\to (ab^2, a^2b^2, (ab^3)^{-1}, (ab)^{-1}, a, (a^2b)^{-1}) \quad \text{by (6.4.8)}$$
Hence, we have
$$bab^{-1} \equiv (ab^2)(a^2b^2)(ab^3)^{-1}(ab)^{-1}a(a^2b)^{-1} \bmod F_5.$$

Corollary 6.13. (We use the notation of Theorem 6.8.) There exist nonnegative integers $k_{h,u}$ ($h \in H$, $u \in A^*$, $1 \le |u| \le |h|$) such that for any $g$ in $F(A)$
$$n_h(g) = \sum_u k_{h,u} \binom{g}{u}. \qquad (6.4.18)$$
Proof. In view of Theorem 6.8 and Lemma 6.6 it is enough to prove (6.4.18) when $g$ is a word in $A^*$. Observe that when dealing with standard sequences in $A^*$, the only rule of the rewriting system $\to$ which is used is
$$h, k \text{ is replaced by } \begin{cases} k, h, hk & \text{if } |hk| \le N, \\ k, h & \text{if } |hk| > N. \end{cases}$$
We shall use a modified version of this rewriting system, which works on labelled standard sequences, i.e. sequences of the form
$$S = ((h_1, E_1), \ldots, (h_n, E_n)), \qquad (6.4.19)$$
where $s = (h_1, \ldots, h_n)$ is a standard sequence and each $E_i$ is a subset of $\mathbb{N}$. If $h_i, h_{i+1}$ is a legal rise of $s$, then we define
$$S' = (\ldots, (h_{i-1}, E_{i-1}), (h_{i+1}, E_{i+1}), (h_i, E_i), (h_i h_{i+1}, E_i \cup E_{i+1}), (h_{i+2}, E_{i+2}), \ldots), \qquad (6.4.20)$$
where the term with $h_i h_{i+1}$ has to be omitted if this word has length $> N$. Then $S'$ is still a labelled standard sequence, and we write $S \to S'$. It is easy to verify that $\to$ is confluent, because legal rises do not overlap (cf. proof of Theorem 4.3(i)). Then one shows that the reflexive and transitive closure $\Rightarrow$ of $\to$ is also confluent (cf. proof of Theorem 4.3(i)) and that there is no infinite chain $S_0 \to S_1 \to S_2 \to \cdots$ (cf. Lemma 6.9(ii)). Thus, for any $S$, there is
a unique final $S'$, i.e. a sequence such that $S \Rightarrow S'$ and that $S' \to S''$ for no sequence $S''$. We write $S' = f(S)$. Now, let $u = a_1 \cdots a_n \in A^+$ ($a_i \in A$, $n \ge 1$) and
$$S = ((a_1, \{i_1\}), (a_2, \{i_2\}), \ldots, (a_n, \{i_n\})),$$
where $i_1, \ldots, i_n$ are distinct numbers. With $f(S)$ given by (6.4.19), let
$$k_{h,u} = |\{i \mid 1 \le i \le n,\ h_i = h \text{ and } E_i = \{i_1, \ldots, i_n\}\}|.$$
Observe that $k_{h,u}$ is well defined, i.e. does not depend on the sequence $i_1, \ldots, i_n$ of distinct numbers. Observe also that, by definition of $f(S)$ and of $\to$, we have $|E_i| \le |h_i|$, so that $k_{h,u} \ne 0$ implies $|u| = n \le |h|$.
We show that (6.4.18) holds. For a sequence $S$ as in (6.4.19) and $E \subseteq \mathbb{N}$, define $S|E$ to be the sequence obtained by keeping only those $i$ with $E_i \subseteq E$. We claim that if $S \Rightarrow T$, then $S|E \Rightarrow T|E$. Indeed, we may suppose $S \to S'$, that $h_i, h_{i+1}$ is a legal rise of $S$ and that $S'$ is given by (6.4.20). Then either $E_i$ and $E_{i+1}$ are both contained in $E$, so that $E_i \cup E_{i+1}$ is too, hence $S|E \to S'|E$; or one of $E_i$ or $E_{i+1}$ is not contained in $E$, so that neither is $E_i \cup E_{i+1}$, and $S|E = S'|E$.
From the claim, we deduce that for any word $w = a_1 \cdots a_n$, $E \subseteq \{1, 2, \ldots, n\}$ and $S = ((a_1, \{1\}), \ldots, (a_n, \{n\}))$, we have $f(S|E) = f(S)|E$, because the underlying standard sequence of $f(S)$ is decreasing, hence so is that of $f(S)|E$. Recall the notation $w|E$, defined in Section 1.4. Then we have
$$n_h(w) = \text{number of } (h, E) \text{ in } f(S), \text{ with } E \subseteq \{1, \ldots, n\}$$
$$= \sum_{u \in A^*} \sum_{w|E = u} \text{number of } (h, E) \text{ in } f(S)|E = \sum_{u \in A^*} \sum_{w|E = u} \text{number of } (h, E) \text{ in } f(S|E) = \sum_{u \in A^*} k_{h,u} \binom{w}{u}. \qquad □$$
Corollary 6.14. (i) For $g \in F(A)$ and $h \in H$, the number $n_h(g^n)$ is a linear combination over $\mathbb{Z}$ of the binomial coefficients $\binom{n}{i}$, $1 \le i \le |h|$.
(ii) For $g_1, g_2 \in F(A)$ and $h \in H$, the number $n_h(g_1 g_2^{-1})$ is a polynomial over $\mathbb{Q}$ in the numbers $n_{h_1}(g_1)$, $n_{h_2}(g_2)$, $h_1, h_2 \in H$.

Proof. (i) We have, by Corollary 6.13,
$$n_h(g^n) = \sum_{1 \le |u| \le |h|} k_{h,u} \binom{g^n}{u}.$$
Let $M(g) = 1 + T$ with $(T, 1) = 0$. Then $M(g^n) = \sum_{i \ge 0} \binom{n}{i} T^i$ and […]

[…] $P(g)$ of Lemma 6.10(iv) with $N = n$. Now, the dual $\mathcal{L}_n^*$ of $\mathcal{L}_n$ is freely generated by the linear functions $P \mapsto \alpha_l$, where $\alpha_l$ is the coefficient of $P$ when expressed in the basis $(P_l)$, $l$ a Lyndon word of length $n$. Since $P_l = l + $ greater words (Theorem 5.1), we deduce by triangularity that $\mathcal{L}_n^*$ is also generated by the functions $P \mapsto (P, l)$. To conclude, we observe that $(M(g), l) = \binom{g}{l}$, by definition of the subword functions. □
Corollary 6.18. For any finite set $L'$ of Lyndon words and any sequence $(\alpha_l)_{l \in L'}$ of integers, there exists an element $g$ in $F(A)$ such that $\binom{g}{l} = \alpha_l$ for all $l \in L'$.

Proof. Let $n$ be the maximum length of the elements in $L'$. By induction on $n$, there exists an element $g_1$ in $F(A)$ such that $\binom{g_1}{l} = \alpha_l$ for any $l$ in $L'$ of length $< n$. […]

For each word $w$ and $t \in [0, T]$, define the iterated integral $\int_0^t dw$ recursively by $\int_0^t dw = 1$ if $w$ is the empty word and, if $w = ua_i$, then $\int_0^t dw = \int_0^t \big(\int_0^s du\big)\, u_i(s)\, ds$. Note that this definition agrees with the definition in Section 3.1 when $a_i(t) = \int_0^t u_i(s)\, ds$.
Let $S \in \mathbb{R}\langle\langle A\rangle\rangle$ be such that $|(S, w)| \le C\, |w|!\, r^{|w|}$ for some constants $C$ and $r$. Define
$$y(t) = \sum_{w \in A^*} (S, w) \int_0^t dw, \qquad (6.5.1)$$
6.5    Appendix
which is a convergent series. Then the functional $(u_1, \ldots, u_m) \mapsto y$ is called a causal analytic functional, with generating series $S$. The product of two such functionals corresponds to the shuffle product of their generating series (Fliess 1981). The proof is similar to that of Corollary 3.5, using integration by parts.
Among these functionals, there is the special class of those which correspond to a differential system of the form
$$\dot q(t) = \sum_{i=1}^{m} A_i(q)\, u_i(t), \qquad y(t) = h(q), \qquad (6.5.2)$$
where $q(t)$ belongs to an analytic variety $Q$ over $\mathbb{R}$, and where the vector fields $A_1, \ldots, A_m$ and the function $h\colon Q \to \mathbb{R}$ are analytic in a neighbourhood of $q(0)$. The corresponding generating series $S$ is of finite Lie rank, i.e. the vector space $\{P \circ S \mid P \in \mathcal{L}(A)\}$ is finite dimensional, where $P \circ S = \sum_{w \in A^*} (S, wP)\, w$, i.e. $S \mapsto P \circ S$ is the adjoint of the right multiplication by $P$ (Fliess 1983). The finiteness of the Lie rank is equivalent to the following condition: $S$ belongs to a finitely generated shuffle subalgebra of $\mathbb{R}\langle\langle A\rangle\rangle$, closed under the operations $T \mapsto T \circ P$ ($P \in \mathbb{R}\langle A\rangle$) and closed in the $A$-adic topology (Reutenauer 1985a).
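The correspondence between pointwise products of causal functionals and shuffle products of generating series can be observed numerically on the iterated integrals themselves. A sketch (illustrative Python, with two arbitrarily chosen smooth controls, checking the simplest instance of the shuffle identity, $\int da \cdot \int db = \int d(ab) + \int d(ba)$, up to discretization error):

```python
import math

T, n = 1.0, 2000
dt = T / n
ts = [i * dt for i in range(n + 1)]
controls = {"a": [math.cos(3 * s) for s in ts],   # u_a(t), arbitrary choice
            "b": [math.exp(-s) for s in ts]}      # u_b(t), arbitrary choice

def iterated_integral(w):
    # I(empty) = 1; I(u a_i)(t) = integral_0^t I(u)(s) u_i(s) ds  (trapezoid rule)
    y = [1.0] * (n + 1)
    for c in w:
        integrand = [y[i] * controls[c][i] for i in range(n + 1)]
        acc = [0.0]
        for i in range(n):
            acc.append(acc[-1] + (integrand[i] + integrand[i + 1]) * dt / 2)
        y = acc
    return y

Ia, Ib = iterated_integral("a"), iterated_integral("b")
Iab, Iba = iterated_integral("ab"), iterated_integral("ba")
err = max(abs(Ia[i] * Ib[i] - Iab[i] - Iba[i]) for i in range(n + 1))
assert err < 1e-5  # equality up to discretization error
```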
6.5.5    Differential algebra

A special case of system (6.5.2) is the case where $Q$ is a finite-dimensional vector space, and $A_i$, $h$ are linear. Such a system is called bilinear in control theory. It corresponds to the case where the generating series $S$ is recognizable (see Section 1.6.8).
Let $A = \{a_1, \ldots, a_m\}$ and let $\mathbb{R}\{u\} = \mathbb{R}\{u_1, \ldots, u_m\}$ be the $\mathbb{R}$-algebra of differential polynomials, i.e. the algebra of (commutative) polynomials in the variables $u_1, \ldots, u_m$ and their formal derivatives $u_1', u_1'', \ldots$. Consider the algebra $M = \mathbb{R}\langle\langle A\rangle\rangle \otimes \mathbb{R}\{u\}$, with the shuffle structure on $\mathbb{R}\langle\langle A\rangle\rangle$. It is isomorphic with $\mathbb{R}\{u\}\langle\langle A\rangle\rangle$, with its shuffle structure, which is isomorphic with an algebra of formal power series in (infinitely many if $m \ge 2$) commutative variables, by Theorem 6.1(i). In particular, $M$ is without zero divisors, and we may form its field of fractions $K$. The algebra $M$ (hence the field $K$) becomes a differential ring if one defines as derivation the unique derivation $D$ extending that of $\mathbb{R}\{u\}$, and which is defined on $\mathbb{R}\langle\langle A\rangle\rangle$ by
$$D(S) = \sum_{i=1}^{m} (S a_i^{-1}) \otimes u_i, \qquad (6.5.3)$$
where $S a^{-1} = \sum_{w \in A^*} (S, wa)\, w$. Observe that eqn (6.5.3) is motivated by the
functional interpretation (6.5.1): there, the derivative of $y$ is given by
$$\dot y(t) = \sum_{i=1}^{m} \Big(\sum_{w \in A^*} (S, wa_i) \int_0^t dw\Big)\, u_i(t).$$

[…] Say that a subset $L$ of $F(A)$ is recognized by a group $G$ if, for some homomorphism $\varphi\colon F(A) \to G$, one has $L = \varphi^{-1}\varphi(L)$. The family of subsets of $F(A)$ which are recognized by finite $p$-groups (respectively nilpotent groups) is equal to the boolean algebra generated by the particular subsets
$$\{g \in F(A) \mid \tbinom{g}{u} \equiv i \bmod p\} \qquad (u \in A^*,\ i \ge 0)$$
(respectively the particular subsets
$$\{g \in F(A) \mid \tbinom{g}{u} \equiv i \bmod n\} \qquad (u \in A^*,\ n \ge 1,\ i \ge 0)).$$
This may be shown by using the Magnus transformation, considered with coefficients in $\mathbb{Z}/p\mathbb{Z}$ (respectively $\mathbb{Z}/n\mathbb{Z}$), and mod $O(A^{d+1})$ for suitable $d$, to show one inclusion. For the other, use Theorem 6.8, Corollary 6.13, and the following result, proved similarly to Lemma 6.6: the function
$$w \mapsto \binom{\binom{w}{u}}{k}$$
is an $\mathbb{N}$-linear combination of subword functions $\binom{w}{v}$. This result allows one to reduce the knowledge of $n = \binom{w}{u} \bmod p^k$ to that of $\binom{n}{p^i} \bmod p$, because
$$\binom{n}{p^i} \equiv n_i \bmod p,$$
if $n = \sum n_i p^i$ is the $p$-adic expansion of $n$ (the latter congruence is obtained by expanding $(1 + x)^n$ in characteristic $p$).
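The congruence $\binom{n}{p^i} \equiv n_i \bmod p$ (a special case of Lucas' congruence) is easy to test exhaustively for small values; a quick check in Python:

```python
from math import comb

def p_adic_digits(n, p):
    # digits n_0, n_1, ... of the p-adic expansion n = sum n_i p^i
    digits = []
    while n:
        digits.append(n % p)
        n //= p
    return digits

for p in (2, 3, 5):
    for n in range(1, 300):
        for i, digit in enumerate(p_adic_digits(n, p)):
            assert comb(n, p ** i) % p == digit
```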
6.5.8    Quotients of the lower central series and free Lie algebra

Let $(F_n)_{n \ge 1}$ be the lower central series of the free group $F$ over $A$ and consider the set $\mathrm{gr}(F) = \bigoplus_{n \ge 1} F_n/F_{n+1}$. Then $\mathrm{gr}(F)$ carries a natural structure of graded Lie algebra over $\mathbb{Z}$. Indeed, $F_n/F_{n+1}$ is an abelian group, hence a $\mathbb{Z}$-module. Now, let $x \in F_n/F_{n+1}$, $y \in F_p/F_{p+1}$, respectively represented by $f \in F_n$, $g \in F_p$. Then $(f, g) \in F_{n+p}$, and formula (6.4.1) shows that the class mod $F_{n+p+1}$ of $(f, g)$ depends only on the class of $g$ mod $F_{p+1}$, i.e. on $y$. A symmetric argument shows that $(f, g)$ depends only on $x$. Hence we have a well-defined mapping
$$(F_n/F_{n+1}) \times (F_p/F_{p+1}) \to F_{n+p}/F_{n+p+1}, \qquad (x, y) \mapsto [x, y].$$
This mapping is $\mathbb{Z}$-bilinear, in view of (6.4.1). Extend this mapping to $\mathrm{gr}(F)$ by bilinearity. Since $(f, f) = 1$ and $(g, f) = (f, g)^{-1}$, we have $[x, x] = 0$ for any $x$ in $\mathrm{gr}(F)$. Now, the Jacobi identity is a consequence of the following identity, where $f^g$ denotes $g^{-1}fg$:
$$(f^g, (g, h))\,(g^h, (h, f))\,(h^f, (f, g)) = 1.$$
From this, one may deduce that the mapping $g \mapsto P(g)$ of Lemma 6.10(iv) induces an isomorphism from $\mathrm{gr}(F)$ onto $\mathcal{L}_{\mathbb{Z}}(A)$, and give another proof of Corollary 6.15. In particular, $\mathrm{gr}(F)$ is the free Lie algebra over $\mathbb{Z}$; see Serre (1965). This method is actually the original proof of Witt (1937); see also Lazard (1954) and Bourbaki (1972).

6.5.9    Image of the Magnus transformation on $A^*$
The following result (Ochsenschläger 1981; see also Lothaire 1983, Theorem 6.3.22) characterizes the image of the restriction of the Magnus transformation to $A^*$: a polynomial $P$ in $\mathbb{N}\langle A\rangle$ is in $M(A^*)$ if and only if for any words $x, y$ one has
$$(P, x)(P, y) = \sum_w (x \uparrow y, w)(P, w).$$
In contrast to Corollary 6.20, no closure is needed here.
6.6
NOTES
Theorem 6.1 is due to Radford (1979), who proved combinatorially the triangular identity in the statement. The first assertion was also proved by Perrin and Viennot (1981). The proof given here follows Melançon and Reutenauer (1989). It is not true in general that any Hall set freely generates the shuffle algebra. Corollary 6.2 is due to Chen et al. (1958). Theorem 6.3 and its proof are due to Ree (1958); note the unusual role played by the logarithm in this proof (as Ree observes). We have borrowed the terminology 'subword function' and 'binomial coefficient' from Eilenberg (1976); the extension of this terminology to the free group is done via the Magnus transformation. Another approach is to take the free differential calculus of Fox (1953). Theorem 6.4 is due to Chen et al. (1958); they introduced the infiltration product and proved Lemma 6.7. Lemma 6.6 follows Melançon and Reutenauer (1993). See Chapter VI of Lothaire (1983) for more on subwords.
The commutator calculus of Section 6.4 has its origin in a paper of P. Hall (1933); he essentially proved eqn (6.4.2) of Theorem 6.8, by the use of his 'collecting process'; this process generates particular Hall sets: those where the order is compatible with the length (cf. the discussion in Section 4.5). See also P. Hall (1957) and M. Hall (1950, 1959, Chapter 11), where uniqueness of the exponents in Theorem 6.8 is also proved, for these particular Hall sets. Here, we work with the general Hall sets, as generalized by Viennot (see Section 4.5); the corresponding group commutator calculus was developed
by Melançon (1991), and Melançon and Reutenauer (1993); Gorchakov (1969) already gives some of the results (see also Ward 1969). The algorithm presented here is a generalization of the collecting process of P. Hall, with ideas of M. Hall (1959), Schützenberger (1958), and Melançon and Reutenauer (1989), where standard sequences of Lyndon words are introduced.
Lemma 6.10 is due to Magnus (1937). Corollary 6.13 is due to Thérien (1983), in the case of the particular Hall sets (see above) and when $g$ is a word in $A^*$; its generalization to general Hall sets, and to the free group, follows Melançon and Reutenauer (1993). Corollary 6.14(i), which is an immediate consequence of Corollary 6.13, is related to an identity of P. Hall (1933; see also Magnus et al. 1976, Theorem 5.13B). Corollary 6.15 is due to Magnus (1937) and Witt (1937); they also proved the existence of the canonical isomorphism between $F_n/F_{n+1}$ and the $\mathbb{Z}$-module of homogeneous Lie polynomials of degree $n$ (Corollary 6.16). Corollaries 6.17–6.20 are all due to Chen et al. (1958).
For applications to the Burnside problem and the theory of $p$-groups, see Magnus et al. (1976, Chapter 5) and M. Hall (1959, Chapters 12, 18).
7 Circular words
There are many links between the free Lie algebra and circular words; the most immediate is the equality of the homogeneous dimension of the former, given by the Witt formula, and of the number of primitive necklaces. The aim of this chapter is to study circular words as an end in itself.
We begin by computing the number of primitive necklaces, and of necklaces. Then, we describe the bijection between primitive necklaces and Hall words. In the next two sections two efficient algorithms are described: the first generates Lyndon words up to a given length, and the other computes the factorization into Lyndon words of a given word. The decreasing factorization of a word into Hall words provides a bijection between words and multisets of primitive necklaces; this bijection depends of course on the chosen Hall set. In the final section we give another bijection, which leaves invariant the associated permutation, and which has applications in the study of the various symmetric functions related to the free Lie algebra.
7.1
THE NUMBER OF PRIMITIVE NECKLACES
We say that two words $u, v$ in $A^*$ are conjugate if for some words $x, y$ one has $u = xy$ and $v = yx$. The relation '$u$ and $v$ are conjugate' is an equivalence relation, called conjugation. An equivalence class is called a conjugacy class, a circular word, or a necklace. Geometrically speaking, a necklace is a regular $n$-gon in an oriented plane whose vertices are labelled in $A$; two such $n$-gons are considered as identical if they may be superposed by applying a rotation, a translation and a homothety (see Fig. 7.1).
A necklace is called primitive if no nontrivial rotation leaves it invariant; a word is called primitive if its conjugacy class is a primitive necklace. More generally, a necklace always has a smallest period $d$, dividing its length $n$; in this case the corresponding conjugacy class has $d$ elements, and is of the form $C = \{u_1^{n/d}, \ldots, u_d^{n/d}\}$, where $\{u_1, \ldots, u_d\}$ is a primitive conjugacy class. We say in this case that each word $w$ in $C$ has period $d$ and exponent $n/d$ (see Fig. 7.2).
ababb, babba, abbab, bbaba, babab
Fig. 7.1    The conjugacy class of ababb.

abbabb = (abb)², bbabba = (bba)², babbab = (bab)²
Fig. 7.2    The conjugacy class of abbabb (period 3, exponent 2).

Theorem 7.1
Let $A = \{a_1, \ldots, a_q\}$ be an alphabet with $q$ elements. The number of primitive necklaces of length $n$ is
$$\frac{1}{n} \sum_{d \mid n} \mu(d)\, q^{n/d}. \qquad (7.1.1)$$
The number of primitive necklaces having $n_i$ occurrences of the letter $a_i$ ($i = 1, \ldots, q$), where $n = n_1 + \cdots + n_q$, is
$$\frac{1}{n} \sum_{d \mid \gcd(n_1, \ldots, n_q)} \mu(d)\, \frac{(n/d)!}{(n_1/d)! \cdots (n_q/d)!}.$$

[…]
$$= \frac{1}{n} \sum_{d \mid n} \varphi(d)\, p_d(x_1, \ldots, x_q)^{n/d},$$
because $p_f(x_1^e, \ldots, x_q^e) = p_{ef}(x_1, \ldots, x_q)$ and $\varphi(d) = \sum_{e \mid d} e\, \mu(d/e)$, as is well known. In order to obtain the other two formulas, we follow the proof of Theorem 7.1. □
7.2    HALL WORDS AND PRIMITIVE NECKLACES

Conjugacy of words is an equivalence relation, which preserves primitivity and periodicity (see Section 7.1). Observe that if a set $H$ of words is a set of representatives of the primitive conjugacy classes, then the set
$$\{h^n \mid h \in H,\ n \ge 1\}$$
is a set of representatives of all the conjugacy classes of positive length. The following result is a particular case of a theorem of Schützenberger (1965).

Theorem 7.4. Let $H$ be a subset of $A^*$, with a total order […]

[…] $|t| \ge |p| \ge |t|$, and we would have $p = t$, which is not true because of the previous inequality $p < t$. Hence, we can use Lemma 5.2(i) to deduce that $pc < t$. □
Corollary 7.12. Let $u$ be a Lyndon word which is not the greatest letter of the alphabet, $p$ a nonempty prefix of $u$, and $k$ an integer. Then $S(u^k p)$ is a Lyndon word.
Proof. Let $z$ be the greatest letter in $A$. Then $p$ does not begin with $z$: indeed, otherwise $u = zu'$ and $u \ge z \ge$ last letter of $u$; since $u$ is a Lyndon word, this implies that $u$ is equal to its last letter, hence $u = z$, against the assumption. In particular, $p$ is not a power of $z$. Hence, we have $p = p_1 a z^i$, where $a \in A \setminus \{z\}$, and by (7.3.1), $S(p) = p_1 b$, where $b$ is the letter after $a$ in $A$. We have $u = p_1 s_1$ for some word $s_1$ beginning with $a$. Hence, $s_1 < b$, and by Lemma 7.11, we have that $p_1 b$ is a Lyndon word. Now, by (7.3.1) again, we have
$$S(u^k p) = S(u^k p_1 a z^i) = u^k p_1 b.$$
Since $p_1 a$ is a prefix of $u$, we have $u < p_1 b$, and we conclude by using Lemma 7.9. □
Proof of Theorem 7.8. Let $u$ be a Lyndon word of length $\le N$, $u \ne z$, and $w = l_N(u)$. Then by definition $w$ is the smallest Lyndon word in the set $\{x \in A^* \mid x > u,\ |x| \le N\}$. By Lemma 7.10, we have $w > D_N(u)$. Hence, $w$ is the smallest Lyndon word in the set $\{x \in A^* \mid x > D_N(u),\ |x| \le N\}$. But $S(D_N(u))$ is the smallest word in this set. Since $D_N(u) = u^k p$ for some integer $k$ and some nonempty prefix $p$ of $u$, we know by Corollary 7.12 that $S(D_N(u))$ is a Lyndon word. Hence, $S(D_N(u)) = w = l_N(u)$. □
Example 7.13. $N = 9$, $A = \{a < b\}$. Then $u = aabbb$ is a Lyndon word. One has $l_9(u) = S \circ D_9(u) = S(aabbbaabb) = aabbbab$.
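The successor map $l_N = S \circ D_N$ of Theorem 7.8 yields a generation algorithm for all Lyndon words of length at most $N$; a sketch (illustrative Python, due in substance to Duval; letters are represented as $0, \ldots, k-1$):

```python
def lyndon_words(k, N):
    # all Lyndon words of length <= N over 0..k-1, in increasing alphabetical order
    w = [0]
    while w:
        yield tuple(w)
        w = [w[i % len(w)] for i in range(N)]   # D_N: prefix of w^infinity of length N
        while w and w[-1] == k - 1:             # S: strip trailing greatest letters,
            w.pop()
        if w:
            w[-1] += 1                          # then increase the last letter

words = ["".join("ab"[c] for c in w) for w in lyndon_words(2, 3)]
assert words == ["a", "aab", "ab", "abb", "b"]

# Example 7.13: the successor of aabbb for N = 9 is aabbbab
u = [0, 0, 1, 1, 1]                             # aabbb
d9 = [u[i % 5] for i in range(9)]               # D_9(u) = aabbbaabb
while d9[-1] == 1:
    d9.pop()
d9[-1] += 1                                     # S(D_9(u))
assert "".join("ab"[c] for c in d9) == "aabbbab"
```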
7.4    FACTORIZATION INTO LYNDON WORDS

As in the previous section, $L$ denotes the set of Lyndon words relative to the alphabetical order on $A^*$, where $A$ is a totally ordered finite alphabet. By Theorem 5.1, $L$ is a particular Hall set; hence, by Corollary 4.7, each word $w$ in $A^*$ has a unique decreasing factorization
$$w = l_1 \cdots l_n, \qquad l_i \in L,\ l_1 \ge \cdots \ge l_n. \qquad (7.4.1)$$
Existence and uniqueness of the factorization (7.4.1) may be proved directly: indeed, each word has a factorization into Lyndon words (e.g. w is the product of its letters). Now, take such a factorization, with a minimal number of factors. Since for Lyndon words k, l, k < l implies kl ∈ L (Lemma 7.9), this minimal factorization must be decreasing. This proves the existence
of factorization (7.4.1). Uniqueness is a consequence of the following result. Lemma 7.14
For a factorization of the form (7.4.1) of the word w, the following properties hold:
(i) l_n is the smallest nontrivial suffix of w;
(ii) l_n is the longest suffix of w which is a Lyndon word;
(iii) l_1 is the longest prefix of w which is a Lyndon word.
7 Circular words
Proof Let s be a nontrivial suffix of w. Then s = l_i′ l_{i+1} … l_n, where l_i′ is a nonempty suffix of l_i and 1 ≤ i ≤ n.
(i) l_i is a Lyndon word, hence we have l_i ≤ l_i′ ≤ l_i′ l_{i+1} … l_n = s, and l_n is the smallest nontrivial suffix of w, because l_n ≤ l_i.
(ii) Suppose that s is longer than l_n. Then i < n. Arguing as in (i), we deduce l_n < s, which shows that s has a nontrivial suffix smaller than itself, and is therefore not a Lyndon word.
(iii) Let p be a prefix of w, strictly longer than l_1. Then p = l_1 … l_{j−1} l_j′, where l_j′ is a nonempty prefix of l_j and 2 ≤ j ≤ n. Using (7.4.1), we deduce l_j′ ≤ l_j ≤ l_1 < l_1 … l_{j−1} l_j′ = p, which shows that p is not a Lyndon word. □

In order to find the Lyndon factorization of a word, one may apply the algorithm described in the proof of Corollary 4.4. However, there is a much more efficient algorithm due to Duval (1978, 1983), which we describe now.
For this purpose, call a sesquipower of a word u any word of the form u^k p, for some integer k ≥ 0 and some prefix p of u, which we may assume to be proper (i.e. p ≠ u), without loss of generality. Such a sesquipower is called nontrivial if k ≥ 1. Denote by S the set of nontrivial sesquipowers of Lyndon words. An element u of S always has a representation

u = l^k p,  l ∈ L,  p a proper prefix of l,  k ≥ 1.   (7.4.2)
For a given element u of S, the representation (7.4.2) is unique: indeed, let p = h_1 … h_q be the decreasing factorization of p into Lyndon words; then h_1 ≤ p < l, hence the decreasing factorization of u into Lyndon words is l^k h_1 … h_q, and we conclude by uniqueness of this factorization. In the sequel of this section, it should be understood that when we deal with elements of S, we deal actually with their unique representation (7.4.2).
For a word u in S, having the representation (7.4.2), we may write l = pas for some letter a and some word s. We denote a = δ(u). Now, define a binary relation on the set S × A*, denoted by →, and defined for any u in S, b in A, and v in A* by

(u, bv) → (ub, v)  if δ(u) ≤ b.

This is well defined, because δ(u) ≤ b implies ub ∈ S: indeed, let u = l^k p as in (7.4.2) and l = pas, thus a = δ(u). Then either a = b, hence ub = (pas)^k pa is clearly in S; or a < b, and then as < b, hence by Lemma 7.11, pb is a Lyndon word with l < pb, which implies by Lemma 7.9 that ub = l^k pb is a Lyndon word, hence in S.
Denote by →* the reflexive and transitive closure of →. This is clearly a partial order on S × A*, and we say that x ∈ S × A* is maximal if for no y in S × A* one has x → y. It is clear that for each x in S × A*, there is a unique maximal y in S × A* such that x →* y. The factorization algorithm is described in the next theorem. Note that
the decreasing factorization into Lyndon words of a word w may be written

w = l_1^{k_1} … l_r^{k_r},  l_1 > ⋯ > l_r,  k_1, …, k_r ≥ 1.   (7.4.3)
Theorem 7.15 Let w be a word, factorized as in (7.4.3), c its first letter, with w = cw′, and let (u, v) be the unique maximal element in S × A* (where u = l^k p is as in (7.4.2)) such that (c, w′) →* (u, v). Then l_1 = l and k_1 = k. In other words, the rewriting system → allows us to compute the power of the first Lyndon word in the factorization of w; then, one continues with pv instead of w, and so on. Example 7.16
w = abbabbababb, a < b. For each u in S, written as in (7.4.2), we put the letter δ(u) in bold face. Then we have:

(a, bbabbababb) → (ab, babbababb) → (abb, abbababb) → ((abb)a, bbababb) → ((abb)ab, bababb) → ((abb)², ababb) → ((abb)²a, babb) → ((abb)²ab, abb).

The latter is maximal, because b > a. So w = (abb)²s, and we continue with s = pv = ababb:

(a, babb) → (ab, abb) → ((ab)a, bb) → ((ab)², b) → (ababb, 1).

Hence, s is a Lyndon word, and the factorization of w is (abb)²(ababb).

Proof Since (c, w′) →* (u, v) we have w = cw′ = uv = l^k pv. Let l = pas, a ∈ A. Let pv = h_1 … h_q be the decreasing factorization into Lyndon words of pv.
Then either h_1 is a prefix of p, hence h_1 ≤ p < l; or p is a proper prefix of h_1: h_1 = pbh′, where b is the first letter of v; then, since (u, v) is maximal, we must have a > b, hence h_1 = pbh′ < pas = l. In both cases, h_1 < l, which shows that the decreasing factorization into Lyndon words of w is l^k h_1 … h_q and that l^k = l_1^{k_1}, h_1 … h_q = l_2^{k_2} … l_r^{k_r}. □
The proof shows that this algorithm is linear in time: more precisely, to factorize w, one needs at most 2|w| comparisons between letters of A.
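Duval's algorithm can be sketched in Python as follows (a standard formulation of the algorithm, not the book's notation; the inner loop maintains the current sesquipower by comparing each new letter with the letter playing the role of δ(u)):

```python
def lyndon_factorization(w):
    """Chen-Fox-Lyndon factorization w = l1 l2 ... ln with l1 >= l2 >= ... >= ln,
    each li a Lyndon word, computed by Duval's linear-time algorithm."""
    factors = []
    i, n = 0, len(w)
    while i < n:
        j, k = i + 1, i                 # w[k] plays the role of delta(u); w[j] is the next letter
        while j < n and w[k] <= w[j]:
            k = i if w[k] < w[j] else k + 1   # restart the period, or advance inside it
            j += 1
        while i <= k:                   # emit the Lyndon word of length j - k as often as it repeats
            factors.append(w[i:i + j - k])
            i += j - k
    return factors

print(lyndon_factorization("abbabbababb"))  # ['abb', 'abb', 'ababb'], as in Example 7.16
```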
7.4.1
Applications
(a) We know by Corollary 7.5 that each primitive word is conjugate to a unique Lyndon word. To find it, it is enough to factorize ww into Lyndon words, and to extract from this factorization a Lyndon word of length |w|. Indeed, such a word will clearly be conjugate to w. Moreover, it exists: indeed, let l be the unique Lyndon word conjugate to w. Then w = xy, yx = l. Hence, ww = xly. If x or y is empty, we are done because ww = ll. So we may suppose that x, y are ≠ 1. The last factor in the Lyndon factorization of x is a suffix of x, hence of l, so greater than l. The first factor in the Lyndon factorization of y is a prefix of y, hence of l, so smaller than l. Thus, the Lyndon
factorization of xly is obtained by concatenating that of x, l, and that of y: hence, l appears in the Lyndon factorization of ww = xly.
(b) Each Lyndon word l of length ≥ 2 has a standard factorization l = l′l″ (see Section 4.1), which according to the proof of Theorem 5.1 is given by: l″ is the smallest nontrivial proper suffix of l. In order to find l′, l″, let l = aw, a ∈ A, and let w = l_1 … l_n be the Lyndon factorization of w. Then l″ = l_n. This is an immediate consequence of Lemma 7.14(i). Hence, one can quickly compute the standard factorization of any Lyndon word w, and by iterating this process, its associated tree t(w) (cf. the proof of Theorem 5.1).
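Both applications reduce to the factorization algorithm; a sketch (the `lyndon_factorization` sketch given earlier is repeated here so that the fragment is self-contained):

```python
def lyndon_factorization(w):
    factors, i, n = [], 0, len(w)
    while i < n:
        j, k = i + 1, i
        while j < n and w[k] <= w[j]:
            k = i if w[k] < w[j] else k + 1
            j += 1
        while i <= k:
            factors.append(w[i:i + j - k])
            i += j - k
    return factors

def lyndon_conjugate(w):
    """Application (a): the unique Lyndon word conjugate to a primitive word w,
    read off as the factor of length |w| in the factorization of ww."""
    for f in lyndon_factorization(w + w):
        if len(f) == len(w):
            return f

def standard_factorization(l):
    """Application (b): l = l'l'' with l'' the smallest nontrivial proper suffix
    of l, obtained as the last Lyndon factor of w, where l = aw."""
    lpp = lyndon_factorization(l[1:])[-1]
    return l[:-len(lpp)], lpp

print(lyndon_conjugate("bcaab"))        # aabbc, the Lyndon rotation of bcaab
print(standard_factorization("aabab"))  # ('aab', 'ab')
```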
7.5
WORDS AND MULTISETS OF PRIMITIVE NECKLACES
Recall that, for a set E, a multiset of elements of E is a mapping M: E → ℕ. For e in E, M(e) is the multiplicity of e in M. The multiset is finite if its cardinality, i.e. Σ_{e∈E} M(e), is < ∞. Given a subset H of A* which satisfies the hypothesis of Theorem 7.4, there is an evident bijection, given by (7.2.1), between the words of A* and the finite multisets of elements of H. This is true for example for any Hall set H, or for the set H = L of Lyndon words (cf. Corollary 4.7 and Theorem 5.1).
Moreover, Corollary 7.5 shows that there is a bijection between H and the set of primitive necklaces. Hence, we have the following result, where the evaluation (respectively length) of a multiset is the product (respectively sum) of the evaluations (respectively lengths) of its components, with multiplicities.
Theorem 7.17 Given a Hall set on A* (especially the set of Lyndon words), there is a canonical evaluation-preserving bijection between the three following sets. (i) The set of words of length n. (ii) The set of multisets of length n of Hall words. (iii) The set of multisets of length n of primitive necklaces. We give now another bijection between words and multisets of primitive
necklaces, which has better invariance properties than the previous ones. This bijection will be useful in the study of the various symmetric functions related to the free Lie algebra. Let A be a totally ordered alphabet. Let w = a_1 … a_n in A* (a_i ∈ A). Let [n] = {1, …, n} and define a function δ_w: [n] → A × [n] by δ_w(i) = (a_i, i). Evidently, δ_w is injective. Order A × [n] with the lexicographic order. Then the condition δ_w(i) < δ_w(j) defines a total order on [n]. Note that this condition is equivalent to (a_i < a_j) or (a_i = a_j and i < j).

… φ: A* → M(A, θ) the canonical morphism, and let st(m) = max(φ^{−1}(m)) for any m in M; then m ≤ p if st(m) ≤ st(p).
An element m of M is called a Lyndon element if for any nontrivial factorization m = pq, one has m < q. The properties of Lyndon elements in M are quite similar to those of Lyndon words in A*. For instance, m is Lyndon if and only if m is primitive (i.e. m cannot be written m = pq, where p and q commute and p, q ≠ 1) and is the smallest element in its conjugacy class (conjugation is the equivalence relation ~ on M generated by the relations pq ~ qp, p, q ∈ M). A technical lemma, which is not obviously equivalent to the latter, is that m is Lyndon if and only if m < qp for any nontrivial factorization m = pq.
Define I_A(m) to be the set of a in A such that m ∈ aM. A pyramid is an element m such that |I_A(m)| = 1. A pyramid m is admissible if I_A(m) consists of the smallest letter appearing in m. Each Lyndon element is an admissible pyramid. The set of admissible pyramids m such that I_A(m) = {a}, for a fixed a in A, is a free monoid M_a, and an admissible pyramid m ∈ M_a is a Lyndon element in M if and only if it is a Lyndon word in the free monoid M_a. Each element of M has a unique factorization into a decreasing product of Lyndon elements. If m is Lyndon, not in A, then it has a unique nontrivial factorization m = pq, where q is chosen minimal for the total order ≤.
Then, to compute the b_n, one takes the logarithmic derivative and applies Möbius inversion, as in the proof of Corollary 4.14. The same computation, using the method of Witt (1937), gives the dimension of the space of homogeneous elements of L(A, θ); see Duchamp and Krob (1992c).
7.6.2
Irreducible polynomials over a finite field
Let F be the field with q elements. Then the number α_n of irreducible monic polynomials in F[x] of degree n is equal to (7.1.1), a formula which was known to Gauss. Indeed, F[x] is a unique factorization domain and there are q^n monic polynomials of degree n, so that

Σ_{n≥0} q^n t^n = Π_{n≥1} (1 − t^n)^{−α_n}.
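The necklace formula (7.1.1), α_n = (1/n) Σ_{d|n} μ(d) q^{n/d}, can be checked numerically; a sketch, which also verifies the equivalent identity q^n = Σ_{d|n} d·α_d expressing unique factorization:

```python
def mobius(n):
    """Moebius function, by trial division."""
    result, p, m = 1, 2, n
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0           # square factor
            result = -result
        p += 1
    if m > 1:
        result = -result
    return result

def alpha(n, q):
    """Number of monic irreducible polynomials of degree n over F_q:
    (1/n) * sum over d | n of mu(d) * q^(n/d)."""
    return sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0) // n

# every monic polynomial of degree n factors uniquely into irreducibles,
# which forces q^n = sum over d | n of d * alpha(d, q)
q = 5
for n in range(1, 8):
    assert q ** n == sum(d * alpha(d, q) for d in range(1, n + 1) if n % d == 0)
print(alpha(2, 2), alpha(3, 2), alpha(4, 2))  # 1 2 3
```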
So one proceeds as in the proof of Corollary 4.14. This shows that the number α_n is equal to the dimension of the space of homogeneous Lie polynomials of degree n: this was noted by Witt (1937). A direct bijection between primitive necklaces of length n over F and the set of irreducible polynomials of degree n in F[x] may be described as follows: let K be the field with q^n elements; it is a vector space of dimension n over F. There exists in K an element θ such that the set {θ, θ^q, …, θ^{q^{n−1}}} is a linear basis of K over F: such a basis is called a normal basis, and always exists (see Lidl and Niederreiter 1983, Theorem 2.35). With each word w = α_0 … α_{n−1} of length n on the alphabet F, associate the element β of K
given by β = α_0 θ + α_1 θ^q + ⋯ + α_{n−1} θ^{q^{n−1}}. It is easily shown that to conjugate words w, w′ correspond conjugate elements β, β′ in the field extension K/F, and that w ↦ β is a bijection. Hence, to a primitive conjugation class corresponds a conjugation class of cardinality n in K; to the latter corresponds a unique irreducible polynomial of degree n in F[x]. This gives the desired bijection. Another bijection, using, instead of a normal basis, a generator of the cyclic group K∖0, is given in Golomb (1967).

7.6.3
Determinant of a sum of matrices
Given a square matrix x over a commutative ring, define the functions Λ_n(x) by

det(1 − tx) = 1 + Σ_{n≥1} (−1)^n t^n Λ_n(x),

where t is a commuting indeterminate. Note that Λ_n(xy) = Λ_n(yx) for any matrices x, y. Let x_1, …, x_k be square matrices of the same size; we consider also {x_1, …, x_k} as an alphabet, to simplify notation. For each primitive necklace ν = (x_{i_1} … x_{i_r}), the matrix function Λ_n(ν) = Λ_n(x_{i_1} … x_{i_r}) is well defined. If M is a multiset of primitive necklaces, let Λ(M) be the matrix function Π_ν Λ_{M(ν)}(ν), where M(ν) is the multiplicity of ν in the multiset M. Let sgn(M) be the sign of M, that is the product of the signs of the necklaces in M, where the sign of (x_{i_1} … x_{i_k}) is (−1)^{k−1}. Then the following formula holds:

Λ_n(x_1 + ⋯ + x_k) = Σ_M sgn(M) Λ(M),   (7.6.1)

where the sum is extended to all multisets of primitive necklaces M of length n over the alphabet {x_1, …, x_k} (see Amitsur 1980; Reutenauer and Schützenberger 1987). For example, Λ_3(x + y) is the sum of eight terms, given in Fig. 7.7. For the proof, one uses the identity …
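Working from the definitions (not from Fig. 7.7, which is not reproduced here), the eight multisets of primitive necklaces of length 3 over {x, y} are {x,x,x}, {x,x,y}, {x,y,y}, {y,y,y}, {(xy),x}, {(xy),y}, {(xxy)}, {(xyy)}, so that (7.6.1) reads Λ₃(x+y) = Λ₃(x) + Λ₂(x)Λ₁(y) + Λ₁(x)Λ₂(y) + Λ₃(y) − Λ₁(xy)Λ₁(x) − Λ₁(xy)Λ₁(y) + Λ₁(x²y) + Λ₁(xy²). A numerical sanity check for 3×3 integer matrices (a sketch):

```python
import random

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(3)] for i in range(3)]

def tr(x):        # Lambda_1 = trace
    return x[0][0] + x[1][1] + x[2][2]

def lam2(x):      # Lambda_2 = sum of principal 2x2 minors
    return sum(x[i][i] * x[j][j] - x[i][j] * x[j][i]
               for i in range(3) for j in range(i + 1, 3))

def det(x):       # Lambda_3 = determinant, for a 3x3 matrix
    return (x[0][0] * (x[1][1] * x[2][2] - x[1][2] * x[2][1])
          - x[0][1] * (x[1][0] * x[2][2] - x[1][2] * x[2][0])
          + x[0][2] * (x[1][0] * x[2][1] - x[1][1] * x[2][0]))

random.seed(1)
for _ in range(20):
    x = [[random.randint(-4, 4) for _ in range(3)] for _ in range(3)]
    y = [[random.randint(-4, 4) for _ in range(3)] for _ in range(3)]
    rhs = (det(x) + lam2(x) * tr(y) + tr(x) * lam2(y) + det(y)   # multisets of one-letter necklaces
           - tr(mul(x, y)) * (tr(x) + tr(y))                     # {(xy), x}, {(xy), y}: sign -1
           + tr(mul(mul(x, x), y)) + tr(mul(x, mul(y, y))))      # (xxy), (xyy): sign +1
    assert det(add(x, y)) == rhs
print("Lambda_3(x + y) identity verified on random 3x3 matrices")
```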
Lemma 8.5 The element

Σ_{n>i_p>⋯>i_2>i_1≥1} c_{i_p} ⋯ c_{i_2} c_{i_1} − (−1)^p c_n^p

is a linear combination of shuffle products u ⧢ v, u, v ∈ A+. The following example contains essentially the proof of this lemma.
Example 8.6
n = 5, p = 2.

c_5² − c_4c_3 − c_4c_2 − c_4c_1 − c_3c_2 − c_3c_1 − c_2c_1
= 34512 − 34215 − 32415 − 23415 − 32145 − 23145 − 21345
= 34512 − (34 ⧢ 21)5
= 34512 − 345 ⧢ 21 + (345 ⧢ 2)1
= 34512 − 345 ⧢ 21 + 3451 ⧢ 2 − 34512
= −345 ⧢ 21 + 3451 ⧢ 2.
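The computation in Example 8.6 can be replayed mechanically; a sketch, with shuffles computed as multisets of interleavings:

```python
from itertools import combinations
from collections import Counter

def shuffle(u, v):
    """Multiset of interleavings of u and v (the shuffle product on words)."""
    out = Counter()
    n = len(u) + len(v)
    for pos in combinations(range(n), len(u)):
        pos, iu, iv, w = set(pos), iter(u), iter(v), []
        for i in range(n):
            w.append(next(iu) if i in pos else next(iv))
        out["".join(w)] += 1
    return out

def clean(c):
    """Drop zero coefficients so that signed multisets compare correctly."""
    return {w: k for w, k in c.items() if k != 0}

# signed word multiset of c_5^2 - c_4c_3 - c_4c_2 - c_4c_1 - c_3c_2 - c_3c_1 - c_2c_1
lhs = Counter({"34512": 1})
for w in ["34215", "32415", "23415", "32145", "23145", "21345"]:
    lhs[w] -= 1

rhs = Counter()
for w, c in shuffle("345", "21").items():
    rhs[w] -= c                  # - 345 shuffle 21
for w, c in shuffle("3451", "2").items():
    rhs[w] += c                  # + 3451 shuffle 2
assert clean(lhs) == clean(rhs)
print("Example 8.6 identity checked")
```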
We have used the dual identity of (1.4.2), defining the shuffle product.

Proof We write P ≡ Q to express the fact that P − Q is a linear combination of u ⧢ v, u, v ∈ A+. Then we have, for any words u, v and letters a, b,

(u ⧢ vb)a ≡ −(ua ⧢ v)b,   (8.2.3)

by (1.4.2) and symmetry.
Let n > i_p > ⋯ > i_2 > i_1 ≥ 1. Then a straightforward induction, which is left to the reader, shows that the permutation c_{i_p} ⋯ c_{i_2} c_{i_1} is of the form u_p p ⋯ u_2 2 u_1 1 v n, where the word u_j has length i_{p−j+1} − i_{p−j} − 1 (with i_0 = 0), v has length n − i_p − 1, and u_p ⋯ u_1 v = (p + 1)(p + 2) ⋯ (n − 1). Hence, these permutations are all distinct. Since they are (n−1 choose p) in number, and since they all appear in the polynomial [(p + 1)(p + 2) ⋯ (n − 1) ⧢ (p ⋯ 21)]n, which itself has
8 The action of the symmetric group
(";1) terms, we deduce that P:
2
Cip...C
izcil
n>ip>~-->i2>i121
=[(p+ l)(p+2)...(n—1)u_I(p...21)]n. Now, an iterative application of (8.2.3) shows that
'_=_—(1)"(p+1)(p+2)...(n—1)n12...p=(—1)"c,{’.
I]
Second proof of Theorem 8.3 (a) Let G_n denote the intersection of E_n with the space generated by the elements u ⧢ v, u, v ∈ A+, and F_n the space of Lie polynomials in E_n. By Theorem 3.1, G_n and F_n are the orthogonal space each of another, for the scalar product which admits A^n as an orthonormal basis. This scalar product is invariant under the left action of S_n on E_n, hence the action of S_n on F_n is equivalent to the action of S_n on E_n/G_n. We compute the character of the latter. For this, we may take K = ℂ.
(b) Let ζ be a primitive nth root of unity and define

e = (1/n) Σ_{k=0}^{n−1} ζ^{−k} c_n^k.
A simple computation shows that e is an idempotent. The left action of S_n on the left ideal ℂS_n e has a character χ, whose characteristic is, by Lemma 8.4(ii),

ch(χ) = (1/n) Σ_{k=0}^{n−1} ζ^{−k} p_{τ(k)} = (1/n) Σ_{d|n} Σ_{k: gcd(k,n)=n/d} ζ^{−k} p_{τ(k)},

where τ(k) denotes the cycle type of c_n^k. Observe that the cycle type of c_n^k depends only on gcd(k, n) = n/d, and is equal to d^{n/d}. Moreover, gcd(k, n) = n/d is equivalent to: ζ^{−k} is a primitive dth root of unity. Thus

ch(χ) = (1/n) Σ_{d|n} p_d^{n/d} ( Σ_{ξ a primitive dth root of 1} ξ ) = (1/n) Σ_{d|n} μ(d) p_d^{n/d},   (8.2.4)
as is well known.
(c) Denote by ≡ the equality mod G_n in E_n (identified with ℂS_n). We have, by Lemma 8.5,

e = (1/n) Σ_{p=0}^{n−1} ζ^{−p} c_n^p ≡ (1/n) Σ_{p=0}^{n−1} (−1)^p ζ^{−p} Σ_{n>i_p>⋯>i_2>i_1≥1} c_{i_p} ⋯ c_{i_2} c_{i_1} = (1/n)(1 − ζ^{−1}c_{n−1}) ⋯ (1 − ζ^{−1}c_2)(1 − ζ^{−1}c_1).
Call α this latter element. Then α is invertible in ℂS_n, because for p = 1, …, n − 1,

0 ≠ ζ^p − 1 = ζ^p − c_p^p = (ζ − c_p)(ζ^{p−1} + ζ^{p−2}c_p + ⋯ + c_p^{p−1}),

hence ζ − c_p is invertible, as is 1 − ζ^{−1}c_p. Since G_n is invariant under the left action of S_n, we obtain ℂS_n e ≡ ℂS_n α = ℂS_n mod G_n. Observe that ℂS_n e and E_n/G_n both have dimension (n − 1)! (Lemma 8.4(ii), orthogonality of G_n and F_n, and Section 5.6.2). Hence, the restriction to ℂS_n e of the canonical mapping E_n → E_n/G_n is a linear isomorphism. This shows that the left action of S_n on E_n/G_n is equivalent to that on ℂS_n e, and concludes the proof. □

Corollary 8.7
Let σ be an n-cycle in S_n, and ρ: ⟨σ⟩ → ℂ a faithful representation of the subgroup generated by σ. Then the representation induced by ρ to S_n is equivalent to the Lie representation of degree n.

Proof Let ω = ρ(σ). Then ω is a primitive nth root of unity, and the representation ρ is equivalent to the representation of ⟨σ⟩ on the (left) ideal Kf of K⟨σ⟩, with f = (1/n) Σ_{k=0}^{n−1} ω^{−k} σ^k, because σf = ωf.
Now, by definition of the induction, the representation of S_n obtained by inducing ρ is equivalent to the representation of S_n on the left ideal KS_n f. With the notation of part (b) of the second proof of Theorem 8.3, we have that e and f are conjugate idempotents. Hence, the characters of the corresponding representations are the same, and their common characteristic is given by (8.2.4). This concludes the proof, by Theorem 8.3. □
8.3
IRREDUCIBLE COMPONENTS
Recall that for a standard tableau T of shape λ(T) = λ, where λ = (λ_1, …, λ_k) is a partition of n, a descent in T is an index i in {1, …, n − 1} such that i + 1 is located in a lower row than i in T (in the English way of depicting tableaux, i.e. rows increase in length from bottom to top). The descent set of T is the set of descents of T, denoted by D(T), and the major index of T is the number

maj(T) = Σ_{i∈D(T)} i.
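With a tableau encoded as a list of rows, the descent set and major index just defined can be computed directly; a minimal sketch (row indices grow downwards, matching the English convention used here):

```python
def descents(rows):
    """Descent set of a standard tableau given as a list of rows (top row first):
    i is a descent when i + 1 lies in a strictly lower row than i."""
    row_of = {entry: r for r, row in enumerate(rows) for entry in row}
    n = sum(len(row) for row in rows)
    return {i for i in range(1, n) if row_of[i + 1] > row_of[i]}

def maj(rows):
    return sum(descents(rows))

T = [[1, 2, 4], [3, 6], [5], [7]]    # the tableau of shape (3, 2, 1, 1) discussed in the text
print(sorted(descents(T)), maj(T))   # [2, 4, 6] 12
```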
For example, for the tableau T below, we have λ(T) = (3, 2, 1, 1), D(T) = {2, 4, 6} and maj(T) = 12.

1 2 4
3 6
5
7

Recall also that the irreducible representations of the symmetric group S_n are in one-to-one correspondence with the partitions of n, and that the
characteristic of the character χ^λ corresponding to the partition λ is the Schur function s_λ (see Macdonald 1979). The Lie representation has a special link with the representation of S_n on a quotient ring of K[x_1, …, x_n], which we study first. The action of S_n on K[x_1, …, x_n] is given by σP(x_1, …, x_n) = P(x_{σ1}, …, x_{σn}), for any polynomial P in K[x_1, …, x_n] and any permutation σ in S_n. Denote by A(x_1, …, x_n) the fixed subring of this action, i.e. the ring of symmetric polynomials, and by I the ideal of K[x_1, …, x_n] generated by the symmetric polynomials without constant term. Let R = K[x_1, …, x_n]/I.
Since I is invariant under the action of S_n, R inherits the action of S_n. Moreover, R inherits the graduation of K[x_1, …, x_n]:

R = ⊕_{i≥0} R_i.
Theorem 8.8 The multiplicity of the irreducible character χ^μ of S_n in its representation on R_i is equal to the number of standard tableaux of shape μ and major index equal to i.

Proof (a) It is well known that K[x_1, …, x_n] is a free A(x_1, …, x_n)-module (see Bourbaki 1981b, Chapter IV, Section 6, Theorem 1). We show that there is a K-linear isomorphism

R ⊗_K A(x) → K[x].   (8.3.1)
Indeed, let (P_j)_{j∈J} be a basis of K[x] over A(x). We show first that the (P_j) form a basis over K of K[x] mod I. This is because each polynomial P may be written P = Σ_j P_j Q_j for some symmetric polynomials Q_j; hence, with α_j = constant term of Q_j, we have P ≡ Σ_j α_j P_j mod I. Now, suppose that Σ_j α_j P_j ≡ 0 mod I; then Σ_j α_j P_j = Σ_k Q_k R_k for some symmetric polynomials Q_k without constant term, and some polynomials R_k; the latter may be written
R_k = Σ_j P_j S_{kj} for some symmetric polynomials S_{kj}; hence Σ_j α_j P_j = Σ_j P_j (Σ_k Q_k S_{kj}), which implies α_j = 0, because the Q_k are without constant term and (P_j) is an A(x)-basis. Since R = K[x]/I, we may define (8.3.1) by (P_j mod I) ⊗ Q ↦ P_j Q, and it is indeed a K-linear isomorphism. This shows that for any choice of the basis (P_j), the latter mapping is an isomorphism. Since I is invariant
under the action of S_n, and since this action is homogeneous, we may find a homogeneous subspace of K[x] which is invariant under this action and which is complementary to I. Take a homogeneous basis of this subspace: then (8.3.1) preserves the grading and the action of S_n.
(b) The isomorphism (8.3.1) preserves the grading and the action of S_n. For a graded S_n-module M = ⊕M_i, where each M_i is of finite dimension, and for σ in S_n, let us call generating series of the character of σ on M the series
Σ_{i≥0} χ_i(σ) q^i,

where χ_i is the character of S_n on M_i. Then, the tensor product corresponds to the product of generating series. We apply this observation to the isomorphism (8.3.1). The generating series of σ on A(x) is Π_{i=1}^{n} (1 − q^i)^{−1}, because A(x) is freely generated by the n elementary symmetric functions e_1, …, e_n, of degrees 1, …, n, as is well known.
(c) We compute the generating series of the character of σ on K[x_1, …, x_n].
It is equal to Σ_{d≥0} q^d × (number of monomials x^p = x_1^{p_1} ⋯ x_n^{p_n} left fixed by σ and of degree d). The action of σ on x^p is x_{σ1}^{p_1} ⋯ x_{σn}^{p_n}; hence this monomial is fixed by σ if and only if for any i and j in the same cycle of σ, one has p_i = p_j; hence, the fixed monomials of σ are in one-to-one correspondence with the mappings f: {cycles of σ} → ℕ, and the degree of the monomial is the sum

Σ_{c cycle of σ} f(c) × length(c).
If σ is of cycle type λ = λ_1 ⋯ λ_k, we deduce that the generating series of σ is

Σ_{r_1,…,r_k≥0} q^{r_1λ_1+⋯+r_kλ_k} = Π_{i=1}^{k} (1 − q^{λ_i})^{−1}.

From (8.3.1), we thus have that the generating series of the character of σ on R is

Π_{i=1}^{n} (1 − q^i) / Π_{i=1}^{k} (1 − q^{λ_i}).   (8.3.2)
In particular, for σ = id, we obtain

Σ_{i≥0} (dim R_i) q^i = (1 + q)(1 + q + q²) ⋯ (1 + q + ⋯ + q^{n−1}),

which shows that R_i = 0 for i > n(n−1)/2.
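The coefficients of this product can be computed by polynomial multiplication; a sketch checking that they sum to n! (the dimension of R) and vanish beyond degree n(n−1)/2 (these are the well-known Mahonian numbers, which also count permutations by major index):

```python
from math import factorial, comb

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def hilbert_R(n):
    """Coefficients of (1+q)(1+q+q^2)...(1+q+...+q^(n-1)) = sum_i (dim R_i) q^i."""
    h = [1]
    for k in range(2, n + 1):
        h = poly_mul(h, [1] * k)   # factor 1 + q + ... + q^(k-1)
    return h

n = 5
h = hilbert_R(n)
assert sum(h) == factorial(n)      # total dimension of R is n!
assert len(h) - 1 == comb(n, 2)    # R_i = 0 for i > n(n-1)/2
print(h)
```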
which shows that R, = O for i > (3). (d) Let ,3. be a partition of n. Then the symmetric function p,1L has the following expansion in terms of Schur functions: p3, = Z qlia
u
where xi is the value of the irreducible character )5” at a permutation of cycle type i, and where the sum is over all partitions u of n; see Macdonald (1979, Chapter 1, (7.8)). Taking the value of these symmetric functions at 1, q,
q², …, and using the identities

p_λ(1, q, q², …) = Π_{i=1}^{k} p_{λ_i}(1, q, q², …) = Π_{i=1}^{k} (1 − q^{λ_i})^{−1}

and

s_μ(1, q, q², …) = Σ_T q^{maj(T)} / Π_{i=1}^{n} (1 − q^i),
(where the sum is over all standard tableaux of shape μ; see Macdonald (1979)), we find that (8.3.2) is equal to

Σ_μ Σ_{T: λ(T)=μ} q^{maj(T)} χ^μ_λ.   (8.3.3)
Let n_{iμ} denote the multiplicity of the character χ^μ of S_n in its representation on R_i. Then (8.3.2) is equal to

Σ_{i≥0} q^i Σ_μ n_{iμ} χ^μ_λ.
By linear independence of the irreducible characters, the n_{iμ} are completely determined by this equality. Hence, comparing (8.3.3) and the previous expression, we get n_{iμ} = number of standard tableaux of shape μ and major index i. □

Theorem 8.9
Let i be in the range 0 ≤ i ≤ n − 1, c an n-cycle in S_n, C the subgroup generated by c, and ω a primitive nth root of unity. The representation of S_n on ⊕_{p≡i mod n} R_p is equivalent to the representation induced from the 1-dimensional representation of C given by c ↦ ω^i. In particular, it depends, up to equivalence, only on the subgroup of ℤ/nℤ generated by i mod n. Using Corollary 8.7 and Theorem 8.8, we obtain the beautiful combinatorial interpretation of the multiplicities of the Lie representation.
Corollary 8.10 Let i, n be relatively prime integers. The multiplicity of the character χ^μ of S_n in the Lie representation of degree n is equal to the number of standard tableaux of shape μ and of major index congruent to i mod n.

We need to consider again, for each partition λ = 1^{a_1} 2^{a_2} ⋯ n^{a_n} of n, the polynomial in q

π_λ(q) = Π_{i=1}^{n} (1 − q^i) / Π_i (1 − q^i)^{a_i}.   (8.3.4)

This is indeed a polynomial because …

… n = qd, with q = p^r, p a prime not dividing d. Since c is an n-cycle, c^d has d cycles, each of length q; call them σ_1, …, σ_d. The group C acts
transitively on the set Σ = {σ_1, …, σ_d} by conjugation, and c^d acts as the identity on Σ; since q and d are relatively prime, c^q generates C modulo the subgroup generated by c^d, and we conclude that c^q cyclically permutes, by conjugation, the elements of Σ. We thus may assume that

σ_{i+1} = c^{−qi} σ_1 c^{qi},  i = 0, …, d − 1.   (8.3.7)
Let G be the commutative subgroup of S_n generated by σ_1, …, σ_d. The restriction V_λ|G of V_λ to G splits into a direct sum of one-dimensional representations of G. Since V_λ is faithful on G and since c^d is of order q = p^r, one of them is faithful on the subgroup of G generated by c^d. Let χ be the character of this representation, and v a basis of this representation; hence αv = χ(α)v for any α in G. In particular, we have σ_i v = χ(σ_i)v, where χ(σ_i) = ζ_i is a qth root of unity, and χ(c^d) = χ(σ_1 ⋯ σ_d) = ζ_1 ⋯ ζ_d = ζ is a primitive qth root of unity.
(f) For α in S_n such that α^{−1}Gα ⊆ G, denote by χ^α the one-dimensional character of G given by χ^α(σ) = χ(α^{−1}σα). Suppose first that ζ_1, …, ζ_d are
not all equal. Then we can find a permutation i_1, …, i_d of 1, …, d such that the cyclic permutations of the sequence (ζ_{i_1}, …, ζ_{i_d}) are pairwise distinct. Let α in S_n be such that α^{−1}σ_k α = σ_{i_k} for k = 1, …, d. Then χ^α(σ_k) = χ(α^{−1}σ_k α) = χ(σ_{i_k}) = ζ_{i_k}. Thus, replacing χ by χ^α, we may assume that the cyclic permutations of (ζ_1, …, ζ_d) are pairwise distinct. Since these d sequences coincide with the d sequences …

… σ(n) > σ(1). In other words, d(σ) is the number of descents in σ, viewed cyclically. We claim that for σ in S_n:
maj(σc) ≡ maj(σ) − d(σ) mod n,  maj(cσ) ≡ maj(σ) − 1 mod n,
d(σc) = d(σ). Suppose the claim is true. Then

κ_n ε_n = (1/n²) Σ_{σ∈S_n} Σ_{i=0}^{n−1} ω^{maj(σ)−i} σc^i
       = (1/n²) Σ_{σ,i} ω^{maj(σc^i)+i(d(σ)−1)} σc^i
       = (1/n²) Σ_{σ,i} ω^{maj(σc^i)} ω^{i(d(σc^i)−1)} σc^i
       = (1/n²) Σ_{σ∈S_n} ω^{maj(σ)} σ Σ_{i=0}^{n−1} (ω^{d(σ)−1})^i.

Now, for an nth root of unity ρ, one has Σ_{i=0}^{n−1} ρ^i = 0, except when ρ = 1, in which case it is n. Moreover, d(σ) = 1 if and only if σ = c^j for some j, and d(σ) ∈ {1, …, n}. Since maj(c^j) = n − j, for j ∈ {1, …, n}, we obtain

κ_n ε_n = (1/n) Σ_{j=1}^{n} ω^{−j} c^j = ε_n.
Furthermore,

ε_n κ_n = (1/n²) Σ_{σ∈S_n} Σ_{i=0}^{n−1} ω^{maj(σ)−i} c^i σ = (1/n²) Σ_{σ,i} ω^{maj(c^iσ)} c^i σ = (1/n) Σ_{σ∈S_n} ω^{maj(σ)} σ = κ_n.

Now, κ_n is idempotent, because κ_n κ_n = κ_n ε_n κ_n = ε_n κ_n = κ_n, and similarly for ε_n.
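The three parts of the claim can be verified by brute force; a sketch over S₅, realizing σc as left rotation of the one-line word and cσ as adding 1 mod n to each value, the conventions used in the proof of the claim:

```python
from itertools import permutations

def D(a):                       # descent set of the word a
    return {i for i in range(1, len(a)) if a[i - 1] > a[i]}

def maj(a):
    return sum(D(a))

def d(a):                       # cyclic descent number: descents of a read cyclically
    return len(D(a)) + (1 if a[-1] > a[0] else 0)

n = 5
for a in permutations(range(1, n + 1)):
    ac = a[1:] + a[:1]                    # a1 a2 ... an -> a2 ... an a1
    ca = tuple(x % n + 1 for x in a)      # each value increased by 1 mod n
    assert (maj(ac) - (maj(a) - d(a))) % n == 0
    assert (maj(ca) - (maj(a) - 1)) % n == 0
    assert d(ac) == d(a)
print("claim verified for S_5")
```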
8.4 Lie idempotents

It remains to prove the claim. Let σ = a_1 … a_n. Then σc = a_2 … a_n a_1, which shows that

D(σc) = {i − 1 | i ∈ D(σ), i ≥ 2} ∪ E,

where E = ∅ or {n − 1}, depending on whether a_n < a_1 or a_n > a_1. Thus

maj(σc) = Σ_{i∈D(σ), i≥2} (i − 1) + (n − 1)|E| = Σ_{i∈D(σ)} (i − 1) + (n − 1)|E|
        ≡ maj(σ) − |D(σ)| − |E| mod n
        = maj(σ) − d(σ).

Also, cσ = (a_1 + 1) … (a_n + 1), where digits are taken mod n. If a_n = n, then maj(cσ) = maj(σ) + n − 1 ≡ maj(σ) − 1 mod n. Otherwise, σ = … i n j … and cσ = … (i + 1) 1 (j + 1) …, and the descent n j in σ is replaced by the descent (i + 1) 1 in cσ; thus maj(cσ) = maj(σ) − 1. The last equality of the claim is immediate. □

For S ⊆ {1, …, n − 1}, denote by maj(S) the sum of its elements.

Proof of Theorem 8.17
Let e be any Lie idempotent. We show that eκ_n = e. This will show that the space KS_n e is contained in the space KS_n κ_n. Now, κ_n is idempotent by Lemma 8.19, hence the dimension of KS_n κ_n is (n − 1)!, by Lemma 8.4(ii). Similarly, the dimension of KS_n e is also (n − 1)!, hence the two spaces are equal and κ_n is a Lie idempotent. We have, by definition of κ_n, by Lemma 8.18, and by the fact that e is a Lie element:

eκ_n = e · (1/n) Σ_{S⊆{1,…,n−1}} ω^{maj(S)} …

… Σ_{n,k} f_{n,k} t^n … = exp( … )

(Hanlon 1990). This may be established by using Theorem 8.23, which implies that

… (1/d) μ(d) p_d^{n/d} … = (Σ_i h_i t^i) ∘ p …
Proof We have, because δ_k is a concatenation homomorphism,

δ_k ∘ conc(P_1 ⊗ ⋯ ⊗ P_p) = δ_k(P_1 ⋯ P_p) = δ_k(P_1) ⋯ δ_k(P_p) = …

… ⊗N, P_1 ⊗ ⋯ ⊗ P_N ↦ P_{σ1} ⊗ ⋯ ⊗ P_{σN}.

Lemma 9.8 … = conc_N ∘ σ.

Proof This is clear. □
9.1 The descent algebra

Lemma 9.9
For endomorphisms f_1, …, f_N of K⟨A⟩ and σ in S_N, one has

σ ∘ (f_1 ⊗ ⋯ ⊗ f_N) = (f_{σ1} ⊗ ⋯ ⊗ f_{σN}) ∘ σ.

Proof Indeed,

σ ∘ (f_1 ⊗ ⋯ ⊗ f_N)(P_1 ⊗ ⋯ ⊗ P_N) = σ(f_1(P_1) ⊗ ⋯ ⊗ f_N(P_N)) = f_{σ1}(P_{σ1}) ⊗ ⋯ ⊗ f_{σN}(P_{σN}),

and

(f_{σ1} ⊗ ⋯ ⊗ f_{σN}) ∘ σ(P_1 ⊗ ⋯ ⊗ P_N) = (f_{σ1} ⊗ ⋯ ⊗ f_{σN})(P_{σ1} ⊗ ⋯ ⊗ P_{σN}) = f_{σ1}(P_{σ1}) ⊗ ⋯ ⊗ f_{σN}(P_{σN}). □

Lemma 9.10 For σ in S_N, one has σ ∘ δ_N = δ_N.

Proof By Proposition 1.8, we have

σ ∘ δ_N(P) = σ( …
… > 0 such that:
(i) The linear mapping φ_n: Γ_n → A_n defined by φ_n = Σ_{λ⊢n} φ_λ p_λ/z_λ is a homomorphism from Γ_n onto A_n with the inner tensor product, Ker φ_n is the radical of Γ_n, and φ_n(π_λ) = p_λ/z_λ.
(ii) For any polynomial P with |λ| = n, and any element f in Γ_n, one has

f(P) ≡ φ_λ(f) P mod …   (9.3.2)
… = δ_s ∘ Θ(T). This implies that δ_s|γ(E) is a linear isomorphism γ(E) → … We conclude that ζ_s|C∘γ(E) is a linear isomorphism C ∘ γ(E) → Γ, and in particular ζ_s is surjective.
(b) Observe that a finely homogeneous Lie polynomial P of weight 1 (respectively 2) must be a scalar multiple of t_1 (respectively t_2). Thus, for s = 1, 2, the subspace E is equal to

⊕_{s∈λ, s∈μ} W_{λ,μ}

by definition (9.2.6) of W_{λ,μ}. Hence, we obtain by Theorem 9.19

C ∘ γ(E) = C(E) = ⊕_{s∈λ, s∈μ} …

Since the π_λ are orthogonal idempotents, the subspace C(E) is therefore a subalgebra of Γ under composition, with neutral element Σ_{s∈λ} π_λ. We have seen that ζ_s|C(E): C(E) → Γ is a linear isomorphism. Since ζ_s is a homomorphism for composition by Theorem 9.34, ζ_s|C(E) is an isomorphism of algebras, and ζ_s has a right inverse. □
9 The Solomon descent algebra
9.4 QUASISYMMETRIC FUNCTIONS AND ENUMERATION OF PERMUTATIONS
Let X be a totally ordered infinite set, which will serve as an alphabet, and
also as a set of commuting variables. A formal power series F in ℤ[[X]] is called a quasisymmetric function if for any x_1, …, x_n, y_1, …, y_n in X, with x_1 < ⋯ < x_n, y_1 < ⋯ < y_n, and any positive integers k_1, …, k_n, the coefficients of x_1^{k_1} ⋯ x_n^{k_n} and y_1^{k_1} ⋯ y_n^{k_n} in F are equal. We denote by QSym the ring of quasisymmetric functions, and by QSym_n the ℤ-module of homogeneous quasisymmetric functions of degree n. If C = (i_1, …, i_k) is a composition of n, we define M_C by

M_C = Σ_{x_1<⋯<x_k} x_1^{i_1} ⋯ x_k^{i_k}.   (9.4.1)
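Truncated to finitely many variables, M_C can be expanded mechanically; a sketch (the encoding of monomials as tuples of (variable, exponent) pairs is an illustrative choice, not the book's notation):

```python
from itertools import combinations

def M(composition, variables):
    """Truncation of the monomial quasisymmetric function M_C to a finite
    ordered sequence of variables: sum over x_1 < ... < x_k of
    x_1^{i_1} ... x_k^{i_k}, each monomial a tuple of (variable, exponent)."""
    k = len(composition)
    # combinations() preserves the input order, so each choice is increasing
    return [tuple(zip(chosen, composition)) for chosen in combinations(variables, k)]

# M_{(2,1)} in variables a < b < c has three terms: a^2 b, a^2 c, b^2 c
print(M((2, 1), "abc"))
```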